Tagged articles
2 articles
Machine Learning Algorithms & Natural Language Processing
May 28, 2026 · Artificial Intelligence

Synthesizing Agentic Factual SFT/Mid‑train Data: Query Filtering, Trajectory Generation, and Tool Usage

The article outlines a practical pipeline for creating agentic factual SFT and mid‑train datasets, covering how to define training goals, filter and classify queries, label processing tags, format trajectory samples, differentiate SFT from mid‑train data, and avoid common pitfalls when generating evidence‑driven AI training data.

SFT · agentic AI · data synthesis
10 min read
Machine Learning Algorithms & Natural Language Processing
May 17, 2026 · Artificial Intelligence

How to Build Agentic Factual SFT and Mid‑Train Datasets: Query Selection, Trajectory Generation, and Tool Usage

This article outlines a systematic approach to creating agentic factual SFT and mid‑train data. It covers defining training goals, filtering queries, two‑layer classification and labeling, the trajectory format, the differences between mid‑train and SFT data, a practical synthesis pipeline, and common pitfalls to avoid.

SFT · agentic AI · data synthesis
11 min read