Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications
The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, and then wrapping the model in a web UI that queries a ChromaDB vector store populated with crawled web data, illustrating security‑focused use cases and forecasting domain‑specific LLM adoption.
This article presents a step‑by‑step guide for customizing large language models (LLMs) locally using LLaMA‑Factory, and then extending the fine‑tuned model with network‑enabled capabilities.
1. LLaMA‑Factory Local Model Fine‑Tuning
Environment preparation: GPU T4 (8 GB), 36 GB RAM, Linux‑based OS.
Model download (DeepSeek‑R1‑Distill‑Qwen‑1.5B):
pip install -U "huggingface_hub[cli]"
export HF_ENDPOINT=https://hf-mirror.com # optional China mirror
huggingface-cli download --resume-download deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B --local-dir /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B
LLaMA‑Factory installation:
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e .
Training command (LoRA fine‑tuning):
llamafactory-cli train \
--stage sft \
--do_train True \
--model_name_or_path /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B \
--finetuning_type lora \
--template deepseek3 \
--learning_rate 0.0002 \
--num_train_epochs 10.0 \
--per_device_train_batch_size 2 \
--gradient_accumulation_steps 8 \
--lr_scheduler_type cosine \
--output_dir saves/DeepSeek-R1-1.5B-Distill/lora/train_2025-03-02-15-57-16 \
--fp16 True \
--lora_rank 8 \
--lora_alpha 16 \
--lora_dropout 0 \
--lora_target all
Model merging (base model + LoRA adapter):
llamafactory-cli export cust/merge_deepseekr1_lora_sft.yaml
2. Local Model Network Function Development
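The export step reads a YAML config. A plausible sketch of `cust/merge_deepseekr1_lora_sft.yaml`, assuming LLaMA‑Factory's standard export keys (the adapter path and export directory below are placeholders, not taken from the article):

```yaml
### model: base weights plus the LoRA adapter produced by training
model_name_or_path: /data/llm/models/DeepSeek-R1-Distill-Qwen-1.5B
adapter_name_or_path: saves/DeepSeek-R1-1.5B-Distill/lora/train_2025-03-02-15-57-16
template: deepseek3
finetuning_type: lora

### export: write merged full weights, sharded into ~2 GB files
export_dir: /data/llm/models/DeepSeek-R1-1.5B-merged
export_size: 2
export_legacy_format: false
```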
Overall architecture: a web UI (built around the LLaMA‑Factory model) calls a local LLM, retrieves relevant documents from a ChromaDB vector store, and returns context‑aware answers.
LLM call wrapper (Python example):
import ollama

def call_llm(prompt: str, with_context: bool = True, context: str | None = None):
    # Build the message list, optionally prepending a system prompt and retrieved context
    messages = [{"role": "system", "content": "Answer using the provided context."}]
    content = f"Context:\n{context}\n\nQuestion: {prompt}" if with_context and context else prompt
    messages.append({"role": "user", "content": content})
    # Stream the response via ollama.chat(stream=True); the model tag here is a placeholder
    for chunk in ollama.chat(model="deepseek-r1:1.5b", messages=messages, stream=True):
        yield chunk["message"]["content"]
Vector database initialization (ChromaDB with nomic‑embed‑text embeddings):
client = chromadb.PersistentClient(path="./web-search-llm-db")
collection = client.get_or_create_collection(name="docs", metadata={"hnsw:space": "cosine"})
URL normalization (standardize IDs for vector storage):
def normalize_url(url):
url = url.replace("https://", "").replace("www.", "")
    return url.replace("/", "_").replace("-", "_").replace(".", "_")
Data ingestion: split crawled pages into 400‑character chunks with 100‑character overlap, and store the text, source URL, and normalized ID in ChromaDB.
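The ingestion step can be sketched as follows. `chunk_text` follows the 400/100 parameters stated above, while `ingest_page`, its `embed` argument, and the metadata layout are illustrative assumptions (with Ollama, `embed` could be `lambda t: ollama.embeddings(model="nomic-embed-text", prompt=t)["embedding"]`):

```python
def chunk_text(text: str, size: int = 400, overlap: int = 100) -> list[str]:
    # Fixed-size character chunks; the overlap preserves context across chunk boundaries
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def ingest_page(collection, url: str, text: str, embed) -> None:
    # `collection` is the ChromaDB collection created earlier; `embed` maps text to a vector
    for idx, chunk in enumerate(chunk_text(text)):
        collection.add(
            ids=[f"{normalize_url(url)}_{idx}"],  # normalize_url as defined above
            embeddings=[embed(chunk)],
            documents=[chunk],
            metadatas=[{"source": url}],
        )
```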
Web crawling (async crawler with BM25 filtering and robots.txt compliance):
async def crawl_webpages(urls: list[str], prompt: str) -> CrawlResult:
    # Fetch pages concurrently, rank and filter content against the prompt with BM25,
    # respect robots.txt, and extract text via UnstructuredMarkdownLoader
    pass
Search integration (DuckDuckGo API, video‑site exclusion, robots.txt double‑check):
from duckduckgo_search import DDGS

def get_web_urls(search_term: str, num_results: int = 10) -> list[str]:
    # Query DuckDuckGo, then drop video sites (youtube.com, vimeo.com) that yield little crawlable text;
    # robots.txt compliance is re-checked at crawl time
    results = DDGS().text(search_term, max_results=num_results)
    return [r["href"] for r in results if not any(d in r["href"] for d in ("youtube.com", "vimeo.com"))]
The article demonstrates the end‑to‑end workflow: fine‑tune, merge, launch the web UI, query the model, and observe the customized responses.
3. Business Scenario Exploration
Three security‑related use cases are discussed:
APK virus detection – using LLMs to spot malicious intent in app code and permissions.
URL security detection – AI‑driven analysis of phishing page language, certificate info, and navigation traces.
APP privacy inspection – comparing declared privacy policies with actual data‑access behavior, risk scoring, and anomaly detection.
4. Future Outlook and Practical Recommendations
The author predicts a shift toward vertical, domain‑specific LLMs, emphasizes the importance of data assets, application innovation, and solution integration as new competitive moats, and advocates for T‑shaped skill development (deep technical expertise + domain knowledge).
Overall, the guide provides a comprehensive, reproducible pipeline for turning a generic open‑source LLM into a customized, network‑aware AI assistant suitable for security‑oriented product scenarios.
Tencent Cloud Developer
Official Tencent Cloud community account that brings together developers, shares practical tech insights, and fosters an influential tech exchange community.