
Fine‑Tuning Qwen‑14B Large Language Model: A Complete Guide

This article provides a comprehensive tutorial on fine‑tuning the Qwen‑14B large language model, covering the motivation, fine‑tuning concepts, step‑by‑step workflow, required code, DeepSpeed training parameters, testing scripts, and deployment using FastChat and the 360AI platform.

360 Smart Cloud

Introduction: With the rise of ChatGPT and other large language models (LLMs) such as OpenAI GPT, Meta LLaMA, Alibaba Tongyi Qianwen, and Baidu Wenxin, it has become clear that generic models often give overly broad answers in specific scenarios, which motivates fine‑tuning them for a target domain.

What is fine‑tuning: Fine‑tuning adapts a pre‑trained model to a target task or domain by further training on task‑specific data, leveraging the model’s general knowledge while specializing it.

Why fine‑tune: The main reasons are transfer learning (reusing the knowledge acquired during pre‑training), data scarcity (a small task‑specific dataset is enough to specialize a model), and computational savings (fine‑tuning costs far less than training from scratch).

Typical fine‑tuning workflow: data preparation, model selection, hyper‑parameter setting, training, evaluation, and deployment.

Case study – fine‑tuning Qwen‑14B: The article walks through environment setup (4 × A100 GPUs), the choice of model (Qwen‑14B), the framework (360AI platform with a FastChat API), the training data format (JSONL with a “conversations” field), and the DeepSpeed command used for training, including its full parameter list.
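The article does not reproduce the exact schema, but a single record in the commonly used “conversations” JSONL format looks roughly like the sketch below; the field names (`id`, `from`, `value`) and role labels are assumptions based on typical Qwen/FastChat fine‑tuning data, not taken from the source.

```python
import json

# One training sample in the assumed "conversations" JSONL format:
# each line of the data file is a standalone JSON object holding a list of chat turns.
sample = {
    "id": "identity_0",
    "conversations": [
        {"from": "user", "value": "你好啊,介绍下你自己"},
        {"from": "assistant", "value": "你好!我是经过微调的 Qwen-14B 助手。"},
    ],
}

# Write one record per line (JSONL); ensure_ascii=False keeps Chinese text readable.
line = json.dumps(sample, ensure_ascii=False)
print(line)

# Round-trip check: every line must parse back to the same object.
assert json.loads(line) == sample
```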

# $DATA is the data path
# $MODEL is the model path
deepspeed finetune_merge.py \
    --report_to "none" \
    --data_path $DATA \
    --lazy_preprocess False \
    --model_name_or_path $MODEL \
    --output_dir /hboxdir/output \
    --model_max_length 2048 \
    --num_train_epochs 24 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 1 \
    --save_strategy epoch \
    --save_total_limit 2 \
    --learning_rate 1e-5 \
    --lr_scheduler_type "cosine" \
    --adam_beta1 0.9 \
    --adam_beta2 0.95 \
    --adam_epsilon 1e-8 \
    --max_grad_norm 1.0 \
    --weight_decay 0.1 \
    --warmup_ratio 0.01 \
    --logging_steps 1 \
    --gradient_checkpointing True \
    --deepspeed "ds_config_zero3.json" \
    --bf16 True \
    --tf32 True
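The command references `ds_config_zero3.json` without reproducing it. Below is a minimal ZeRO stage‑3 sketch in the style of the Hugging Face/DeepSpeed integration, where `"auto"` lets the Trainer fill in values from the command‑line flags above; treat it as an assumed starting point, not the exact config used in the article.

```json
{
  "bf16": { "enabled": "auto" },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "stage3_gather_16bit_weights_on_model_save": true
  },
  "gradient_accumulation_steps": "auto",
  "gradient_clipping": "auto",
  "train_batch_size": "auto",
  "train_micro_batch_size_per_gpu": "auto"
}
```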

Testing the fine‑tuned model: Example Python code using Hugging Face Transformers to load the checkpoint and generate a response.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "/models/qwen-14b"

# Qwen's custom modeling code requires trust_remote_code=True.
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir, device_map="auto", trust_remote_code=True
).eval()

# Encode the prompt and move it to the same device as the model.
inputs = tokenizer("你好啊,介绍下你自己", return_tensors="pt").to(model.device)
pred = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(pred.cpu()[0], skip_special_tokens=True))

Deployment with FastChat: Steps to launch the controller, model worker, OpenAI‑compatible API server, and a curl example to query the service.

python -m fastchat.serve.controller --host 0.0.0.0 --port 21001
python -m fastchat.serve.model_worker --model-path /models/qwen-14b/ --host 0.0.0.0
python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8000
curl http://{{HOST}}:8000/v1/chat/completions -H "Content-Type: application/json" -d '{"model": "qwen-14b", "messages": [{"role": "user", "content": "你是谁"}]}'
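The same request can be issued from Python. The sketch below only builds and prints the OpenAI‑compatible payload used by the curl example; the commented‑out `requests` call shows how it would be posted to a running server (host and port must match your deployment).

```python
import json

# OpenAI-compatible chat payload for the FastChat API server started above.
payload = {
    "model": "qwen-14b",
    "messages": [{"role": "user", "content": "你是谁"}],  # "Who are you?"
}
body = json.dumps(payload, ensure_ascii=False)
print(body)

# To actually send it (requires the API server from the previous step):
# import requests
# resp = requests.post(
#     "http://{{HOST}}:8000/v1/chat/completions",
#     headers={"Content-Type": "application/json"},
#     data=body.encode("utf-8"),
# )
# print(resp.json()["choices"][0]["message"]["content"])
```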

360AI platform usage: Data upload, parameter configuration, resource allocation, and model serving screenshots are described.

References: Links to related articles on LLaMA‑2 self‑recognition fine‑tuning and Tongyi Qianwen fine‑tuning.

Tags: large language models · LLM fine-tuning · DeepSpeed · AI model deployment · FastChat · Qwen-14B
Written by 360 Smart Cloud

Official service account of 360 Smart Cloud, dedicated to building a high-quality, secure, highly available, convenient, and stable one‑stop cloud service platform.
