
Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications

The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, and wrapping the model in a web UI that queries a ChromaDB vector store populated with crawled web data; it then illustrates security‑focused use cases and forecasts domain‑specific LLM adoption.

Tencent Cloud Developer

This article presents a step‑by‑step guide for customizing large language models (LLMs) locally using LLaMA‑Factory, and then extending the fine‑tuned model with network‑enabled capabilities.

1. LLaMA‑Factory Local Model Fine‑Tuning

Environment preparation: T4 GPU (8 GB VRAM), 36 GB RAM, Linux‑based OS.

Model download (DeepSeek‑R1‑Distill‑Qwen‑1.5B):

pip install -U "huggingface_hub[cli]"   # the huggingface-cli tool ships with huggingface_hub
export HF_ENDPOINT=https://hf-mirror.com   # optional China mirror
huggingface-cli download --resume-download deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --local-dir /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B

LLaMA‑Factory installation:

git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .

Training command (LoRA fine‑tuning):

llamafactory-cli train \
    --stage sft \
    --do_train True \
    --model_name_or_path /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B \
    --finetuning_type lora \
    --template deepseek3 \
    --learning_rate 0.0002 \
    --num_train_epochs 10.0 \
    --per_device_train_batch_size 2 \
    --gradient_accumulation_steps 8 \
    --lr_scheduler_type cosine \
    --output_dir saves/DeepSeek-R1-1.5B-Distill/lora/train_2025-03-02-15-57-16 \
    --fp16 True \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout 0 \
    --lora_target all

Model merging (base model + LoRA adapter):

llamafactory-cli export cust/merge_deepseekr1_lora_sft.yaml
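The export config itself is not shown in the article. A minimal sketch of what `cust/merge_deepseekr1_lora_sft.yaml` might contain, following the merge examples shipped with LLaMA‑Factory (the adapter path is taken from the training command above; the export directory is an assumption):

```yaml
### model: base checkpoint plus the LoRA adapter produced by training
model_name_or_path: /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B
adapter_name_or_path: saves/DeepSeek-R1-1.5B-Distill/lora/train_2025-03-02-15-57-16
template: deepseek3
finetuning_type: lora

### export: where to write the merged full-weight model
export_dir: /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B-merged
export_size: 2              # shard size in GB
export_legacy_format: false
```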

2. Local Model Network Function Development

Overall architecture: a web UI calls the locally deployed LLM, retrieves relevant documents from a ChromaDB vector store, and returns context‑aware answers.

LLM call wrapper (Python example):

import ollama  # the merged model is assumed to be served locally via Ollama

# SYSTEM_PROMPT and MODEL_NAME are module-level constants defined elsewhere
def call_llm(prompt: str, with_context: bool = True, context: str | None = None):
    messages = [{"role": "system", "content": SYSTEM_PROMPT}]
    # Optionally prepend retrieved context to the user question
    user = f"Context: {context}\n\nQuestion: {prompt}" if (with_context and context) else prompt
    messages.append({"role": "user", "content": user})
    # stream=True yields response chunks as they are generated
    return ollama.chat(model=MODEL_NAME, messages=messages, stream=True)

Vector database initialization (ChromaDB with nomic‑embed‑text embeddings):

import chromadb

client = chromadb.PersistentClient(path="./web-search-llm-db")
collection = client.get_or_create_collection(
    name="docs", metadata={"hnsw:space": "cosine"}  # cosine distance for retrieval
)

URL normalization (standardize IDs for vector storage):

def normalize_url(url: str) -> str:
    # Strip the scheme and www prefix, then replace separators so the
    # URL becomes a stable, filesystem-safe document ID
    url = url.replace("https://", "").replace("www.", "")
    return url.replace("/", "_").replace("-", "_").replace(".", "_")

Data ingestion: split crawled pages into 400‑character chunks with 100‑character overlap, then store the text, source URL, and normalized ID in ChromaDB.
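The chunking step above can be sketched as a small helper. The 400/100 figures come from the article; the function name is mine:

```python
def chunk_text(text: str, chunk_size: int = 400, overlap: int = 100) -> list[str]:
    """Split text into fixed-size chunks with a sliding-window overlap."""
    step = chunk_size - overlap  # advance 300 characters per chunk
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):  # last window reached the end
            break
    return chunks
```

Each chunk can then be written to the collection with an ID derived from `normalize_url` plus the chunk index, keeping the source URL in the metadata so answers can cite their origin.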

Web crawling (async crawler with BM25 filtering and robots.txt compliance):

async def crawl_webpages(urls: list[str], prompt: str) -> CrawlResult:
    # fetch pages, filter with BM25, respect robots.txt, extract text via UnstructuredMarkdownLoader
    pass
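The BM25 relevance filter mentioned above can be illustrated with a minimal pure‑Python scorer. This is a sketch of the standard Okapi BM25 formula, not the article's exact filter; in practice a library implementation would be used:

```python
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score each document against the query with Okapi BM25."""
    tokenized = [d.lower().split() for d in docs]
    avgdl = sum(len(d) for d in tokenized) / len(tokenized)
    n = len(docs)
    df = Counter()                      # document frequency per term
    for d in tokenized:
        df.update(set(d))
    scores = []
    for d in tokenized:
        tf = Counter(d)                 # term frequency within this document
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            )
        scores.append(score)
    return scores
```

Crawled chunks whose score against the user prompt falls below a threshold can be dropped before ingestion, keeping the vector store focused on relevant material.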

Search integration (DuckDuckGo API, video‑site exclusion, robots.txt double‑check):

def get_web_urls(search_term: str, num_results: int = 10) -> list[str]:
    # query DuckDuckGo, filter out youtube.com, vimeo.com, etc.
    pass
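The video‑site exclusion step can be written as a small standalone filter. The article names youtube.com and vimeo.com explicitly; the other hosts and the function name are my additions for illustration, and the actual search call (e.g. via a DuckDuckGo client library) is left out:

```python
from urllib.parse import urlparse

# Hosts excluded because video pages carry little extractable text;
# only youtube.com and vimeo.com are named in the article.
EXCLUDED_HOSTS = ("youtube.com", "vimeo.com", "tiktok.com", "dailymotion.com")

def filter_video_sites(urls: list[str]) -> list[str]:
    """Drop search results whose hostname matches a known video platform."""
    kept = []
    for url in urls:
        host = urlparse(url).netloc.lower().removeprefix("www.")
        if not any(host == h or host.endswith("." + h) for h in EXCLUDED_HOSTS):
            kept.append(url)
    return kept
```

Matching on the parsed hostname (including subdomains) rather than substring search avoids accidentally dropping pages that merely mention a video site in their path.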

The article demonstrates the end‑to‑end workflow: fine‑tune, merge, launch the web UI, query the model, and observe the customized responses.

3. Business Scenario Exploration

Three security‑related use cases are discussed:

APK virus detection – using LLMs to spot malicious intent in app code and permissions.

URL security detection – AI‑driven analysis of phishing page language, certificate info, and navigation traces.

APP privacy inspection – comparing declared privacy policies with actual data‑access behavior, risk scoring, and anomaly detection.

4. Future Outlook and Practical Recommendations

The author predicts a shift toward vertical, domain‑specific LLMs, emphasizes the importance of data assets, application innovation, and solution integration as new competitive moats, and advocates for T‑shaped skill development (deep technical expertise + domain knowledge).

Overall, the guide provides a comprehensive, reproducible pipeline for turning a generic open‑source LLM into a customized, network‑aware AI assistant suitable for security‑oriented product scenarios.

Tags: AI, LLM, Fine-tuning, Security, LLaMA-Factory, VectorDB, WebCrawling
Written by

Tencent Cloud Developer

Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.
