MOSS 003: Open‑Source Large Language Model Development, Training Data, and Plugin‑Enabled Deployment
The article details the evolution of the open‑source MOSS series—from OpenChat 001 to MOSS 003—covering data collection, fine‑tuning procedures, multilingual capabilities, plugin architecture, example code for inference, and upcoming releases, providing a comprehensive technical overview for AI practitioners.
The post introduces the MOSS family of open‑source large language models, starting with the early internal prototype OpenChat 001, which was built by expanding ~400k dialogue pairs using self‑instruction techniques and fine‑tuned on a 16B CodeGen base.
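The self-instruction expansion can be sketched as a loop that few-shot-prompts a model with existing instructions and asks it to continue with a new one. This is a minimal illustration only, not the team's actual pipeline: the seed list, prompt wording, and the `generate` callback (standing in for a call to the base model) are all assumptions.

```python
import random

# Hypothetical seed tasks; the real pipeline started from human-written seeds.
SEED_INSTRUCTIONS = [
    "Summarize the following paragraph in one sentence.",
    "Translate this sentence into French.",
    "Write a Python function that reverses a string.",
]

def build_selfinstruct_prompt(seeds, n_examples=2):
    """Assemble a few-shot prompt asking the model to invent a new instruction."""
    shots = random.sample(seeds, k=n_examples)
    numbered = "\n".join(f"{i + 1}. {s}" for i, s in enumerate(shots))
    # The trailing number invites the model to continue with a fresh instruction.
    return f"Here are some task instructions:\n{numbered}\n{n_examples + 1}."

def expand_pool(seeds, generate, rounds=10):
    """Grow the instruction pool by repeatedly sampling the generator."""
    pool = list(seeds)
    for _ in range(rounds):
        candidate = generate(build_selfinstruct_prompt(pool)).strip()
        if candidate and candidate not in pool:  # crude exact-match dedup
            pool.append(candidate)
    return pool
```

In the real pipeline each accepted instruction would also be answered by the model, yielding the instruction-response pairs used for fine-tuning; filtering was presumably stricter than the exact-match dedup shown here.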
OpenChat 001 already demonstrated instruction‑following, multi‑turn dialogue, and surprising cross‑language alignment despite being trained on almost no Chinese data.
Following OpenChat 001, the team released MOSS 002, adding ~30B Chinese tokens and over 1.16M bilingual helpfulness, honesty, and harmlessness dialogues (available on HuggingFace). Engineering work on inference acceleration, model deployment, and front‑end/back‑end integration was also completed, and a closed beta began on February 21.
MOSS 003 further scales pre‑training to 100B Chinese tokens (total 700B tokens, including ~300B code) and incorporates ~1.1M real‑world user dialogues plus ~300k plugin‑enhanced conversations covering search, image generation, calculators, and equation solving. A small subset of this data is publicly released.
The model suite uploaded to HuggingFace includes:
moss-moon-003-base – the base language model with extensive Chinese knowledge.
moss-moon-003-sft – a dialogue‑fine‑tuned model with initial helpfulness, honesty, and harmlessness.
moss-moon-003-sft-plugin – a plugin‑enhanced version capable of invoking at least four external tools.
Interaction with MOSS can be done in a few Python lines:
from transformers import AutoTokenizer, AutoModelForCausalLM

# trust_remote_code is required because MOSS ships custom model code alongside its weights.
tokenizer = AutoTokenizer.from_pretrained("fnlp/moss-moon-003-sft", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("fnlp/moss-moon-003-sft", trust_remote_code=True).half()  # fp16 weights; move to GPU with .cuda() if available
model.eval()
meta_instruction = "You are an AI assistant whose name is MOSS. ..."
query = meta_instruction + "<|Human|>: 你好\n<|MOSS|>:"  # "你好" means "Hello"
inputs = tokenizer(query, return_tensors="pt")
outputs = model.generate(**inputs, do_sample=True, temperature=0.7, top_p=0.8, repetition_penalty=1.1, max_new_tokens=128)
response = tokenizer.decode(outputs[0])
print(response[len(query)+2:])
For plugin calls, MOSS first generates <|Inner Thoughts|> and <|Commands|>, executes the indicated API, inserts the result into <|Results|>, and then performs a second inference pass to produce the final <|MOSS|> reply. The web UI surfaces these inner thoughts via a small light-bulb icon.
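The two-pass plugin protocol can be sketched with a mocked model. This is a schematic sketch, not MOSS's actual implementation: the `TOOLS` table, the command regex, and the `model_generate` callback are all assumptions standing in for the real server-side plugin dispatch.

```python
import re

# Hypothetical toolbox; the real MOSS plugins (search, image generation,
# calculator, equation solving) run server-side.
TOOLS = {
    "Calculator": lambda expr: str(eval(expr, {"__builtins__": {}})),
}

def run_plugin_turn(model_generate, query):
    """Two-pass plugin protocol: pass 1 emits <|Inner Thoughts|> and <|Commands|>,
    we execute the command, splice its output into <|Results|>, and pass 2
    produces the final <|MOSS|> reply."""
    # Pass 1: the model decides whether a tool is needed.
    first = model_generate(query)
    m = re.search(r'<\|Commands\|>:\s*(\w+)\("(.+?)"\)', first)
    if m is None:
        return first  # no tool call; reply directly
    tool, arg = m.group(1), m.group(2)
    result = TOOLS[tool](arg)
    # Pass 2: feed the tool output back so the model can answer with it.
    second_input = f"{query}{first}\n<|Results|>: {result}\n<|MOSS|>:"
    return model_generate(second_input)
```

A stub generator makes the control flow visible: the first call returns a command string, the second call (which sees <|Results|> in its input) returns the final answer.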
Future work includes releasing quantized Int‑4/8 models, expanding the full fine‑tuning dataset, and improving plugin reliability. The team also open‑sources front‑end and back‑end code repositories for community experimentation.