Wu Shixiong's Large Model Academy
Author

We continuously share large‑model know‑how to help you master the core skills (LLMs, RAG, fine‑tuning, deployment) and go from zero to job offer. The content is tailored for career‑switchers, autumn campus‑recruitment candidates, and anyone seeking a stable large‑model position.

109 Articles · 0 Likes · 109 Views · 0 Comments
Recent Articles

Wu Shixiong's Large Model Academy
Nov 16, 2025 · Artificial Intelligence

How to Slash RAG First‑Token Latency: Practical Engineering Strategies

This guide breaks down the three layers of a RAG pipeline—embedding, vector retrieval, and system architecture—and provides concrete engineering tactics such as batch embedding, async concurrency, caching, ANN indexing, partitioning, connection pooling, and async pipelines to dramatically reduce Time‑to‑First‑Token latency.

Async Pipeline · Embedding · RAG
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 15, 2025 · Artificial Intelligence

How to Build Robust Function Call Training Data for LLM Agents

This article explains why function call capabilities in large language model agents require dedicated training, outlines the four core abilities to teach, describes the structure and sources of effective training data, and compares lightweight LoRA fine‑tuning with full supervised fine‑tuning approaches.

Agent Systems · Data Generation · LLM training
0 likes · 11 min read
Wu Shixiong's Large Model Academy
Nov 14, 2025 · Artificial Intelligence

How to Engineer Reliable Function Calls for LLM Agents: An End‑to‑End Framework

This article explains why function‑call accuracy is critical for LLM agents, identifies four common failure causes, and presents a systematic, five‑step engineering framework—including dynamic routing, chain‑of‑thought planning, result validation, memory injection, and log‑driven optimization—backed by concrete examples and quantitative improvements.

Function Calling · Interview preparation · LLM
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 12, 2025 · Artificial Intelligence

Agent Memory Modules Explained: Short‑Term vs Long‑Term Strategies for LLM Agents

This article breaks down the memory systems behind LLM‑based agents, explaining why persistent memory is needed, the differences between short‑term context buffers and long‑term vector stores, practical implementation choices, maintenance strategies, and how to articulate these concepts effectively in technical interviews.

Agent · LLM · retrieval
0 likes · 14 min read
Wu Shixiong's Large Model Academy
Nov 6, 2025 · Artificial Intelligence

How to Optimize RAG Knowledge Base Construction: Parsing, Chunking, and Retrieval

This article explains why building a high‑quality RAG knowledge base is critical, outlines offline parsing techniques for multi‑format documents, presents semantic chunking strategies that preserve structure and context, and shows how to answer interview questions with a robust, production‑ready pipeline.

AI Interview · Chunking · Knowledge Base
0 likes · 8 min read
Wu Shixiong's Large Model Academy
Nov 5, 2025 · Artificial Intelligence

Why Production-Ready RAG Is Ten Times Harder Than a Simple Demo

Building a Retrieval‑Augmented Generation (RAG) system may be straightforward in code, but turning that demo into a reliable, accurate, and scalable production service involves challenges across data preparation, vector retrieval, query rewriting, generation control, and system integration.

AI · LLM · Prompt Engineering
0 likes · 8 min read
Wu Shixiong's Large Model Academy
Nov 4, 2025 · Artificial Intelligence

Why Financial RAG Fails and How to Solve Its Core Challenges

This article explains why Retrieval‑Augmented Generation (RAG) projects in the financial sector often underperform, highlighting data‑structure complexities, document‑parsing hurdles, chunking strategies, compliance constraints, evaluation metrics, and engineering requirements, and offers practical solutions and code examples.

Chunking · Compliance · Engineering
0 likes · 10 min read
Wu Shixiong's Large Model Academy
Nov 1, 2025 · Artificial Intelligence

Turn a Basic RAG Demo into a High‑Impact Interview Project

This guide shows how to evolve a simple Retrieval‑Augmented Generation prototype into a production‑grade system by strengthening data ingestion, optimizing retrieval with hybrid and reranking techniques, adding query rewriting, long‑context handling, reinforcement learning, and multimodal support, so candidates can demonstrate real engineering depth in interviews.

AI · LLM · RAG
0 likes · 7 min read