Running and Fine‑Tuning Large Language Models Locally with Ollama, Docker, and Cloud Resources
The author chronicles the challenges and solutions of running large language models locally using Ollama, experimenting with cloud GPUs on Google Colab, managing Python dependencies through Docker, and ultimately fine‑tuning a small Qwen model, providing a practical guide for AI enthusiasts.
Previously I used Ollama to run large language models locally (see the article "AI LLM Tool Ollama Architecture and Dialogue Processing Flow Analysis"). This time I wanted to try more advanced operations, such as fine‑tuning.
My idea was that, since a ready‑made large model exists, I could gather a domain‑specific dataset and "add some material" to the model, eventually obtaining a model optimized for that domain.
However, I quickly discovered that fine‑tuning is not that simple; the model must first be runnable via code. Thus this article was born, documenting my "simple" attempt to run an AI model and the many problems that followed.
Cloud Environment or Local?
It is well known that running AI models is best done with a GPU. I don't have one, so I turned to cloud resources. Both Google Colab and Kaggle Notebooks are attractive because they offer free GPU time; I chose Colab, hoping for abundant resources.
Reality hit hard: free‑user GPU slots are scarce and allocated by luck. Nevertheless, Colab remains useful despite the limitation.
Because I could not obtain better resources and the environment had restrictions, I decided to return to my modest local setup.
Python Environment: A Headache
Entering the AI field inevitably means using Python, which brings a slew of version and dependency‑management issues. I tried tools such as Conda, pipenv, pipx, and poetry, but even installing PyTorch via the modern package manager poetry failed, which was deeply frustrating.
The solution was to abandon virtual‑environment tools and use Docker . By mounting the code directory into a clean Docker container, I obtained an isolated environment that works smoothly.
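As a sketch of this setup (the image tag and paths here are illustrative choices, not the author's exact configuration), mounting the project into a plain Python image looks roughly like this:

```shell
# Start a clean, disposable container with the code directory mounted in.
# Image tag and paths are examples; any recent Python image would do.
docker run -it --rm \
  -v "$PWD":/workspace \
  -w /workspace \
  python:3.11-slim bash
```

The `--rm` flag keeps the host clean by discarding the container on exit, which is exactly why the dependency‑persistence trick in the next step becomes necessary.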
To avoid re‑downloading dependencies after a container restart, there were two options: build a large base image (which I wanted to avoid) or persist the packages in the project directory, similar to node_modules in Node.js, and then point PYTHONPATH at that location.
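The second option can be sketched as follows; the `vendor/` directory name is my illustrative choice, not from the original article:

```shell
# Install dependencies into a project-local directory (like node_modules)
# so they live on the mounted volume and survive container restarts.
pip install --target=./vendor torch transformers

# Point the interpreter at the vendored packages.
export PYTHONPATH=/workspace/vendor:$PYTHONPATH
```

Because `vendor/` sits inside the mounted project directory, it persists on the host even though the container itself is thrown away.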
With VSCode’s Remote Development extension, the development environment became stable and hassle‑free.
Model Selection: There’s One for You
After the environment was ready, I needed to pick a model (enter Hugging Face). I first tried the popular LLaMA, but access requires approval, and my request was denied, likely due to regional restrictions.
Undeterred, I switched to another model family and chose Qwen. Like LLaMA, Qwen is released in a range of parameter sizes; to avoid overloading my computer, I selected the small Qwen/Qwen2.5-0.5B model.
Running a simple "Hello, world" test worked, though even this tiny model took more than ten minutes to respond on my machine. Still, it was encouraging to get a real response, and it was faster than many HR bots.
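A minimal version of that "Hello, world" test can be sketched with the Transformers library; the prompt and generation settings below are my illustrative choices, and the first run also downloads the model weights, which adds to the wait:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the small base model from Hugging Face (downloaded on first use).
model_name = "Qwen/Qwen2.5-0.5B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# Tokenize a simple prompt and generate a continuation on the CPU.
inputs = tokenizer("Hello, world", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50)

# Decode the generated token IDs back into text.
text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(text)
```

Note that Qwen2.5-0.5B is a base (non-instruct) model, so it continues the prompt as plain text rather than answering it like a chat assistant.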
(Follow me for ad‑free, technology‑focused content; I welcome discussion.)
References:
https://huggingface.co/
https://huggingface.co/Qwen/Qwen2.5-0.5B