Tagged articles
2079 articles
Page 12 of 21
Volcano Engine Developer Services
Aug 21, 2025 · Artificial Intelligence

Why Prompt Engineering Isn’t Enough: The Rise of Context Engineering and RAG

Since last year, the debate over “Prompt Engineering” has split into two camps: practitioners who favor “Context Engineering” for building scalable agent systems, and scholars who treat Prompt Engineering as a broad umbrella term. The debate highlights the need to dynamically construct and manage context for reliable, extensible AI applications.

AI agents · LLM · Prompt Engineering
0 likes · 33 min read
Alibaba Cloud Native
Aug 21, 2025 · Cloud Native

How Higress AI Gateway Optimizes LLM Load Balancing with Global, Prefix, and GPU‑Aware Algorithms

This article explains why traditional load‑balancing methods fall short for large language model services and introduces Higress AI Gateway's three specialized algorithms—global minimum‑request, prefix‑matching, and GPU‑aware load balancing—detailing their design, Redis‑based implementation, deployment steps, and performance gains.

GPU · LLM · Redis
0 likes · 11 min read
Alibaba Cloud Developer
Aug 21, 2025 · Artificial Intelligence

Why Your AI Defect Deduplication Returns Mixed Data and How to Fix It

This article details the challenges of building an AI‑powered defect deduplication system using Retrieval‑Augmented Generation, explains why LLMs produce composite (spliced) results, diagnoses the root cause as information loss in the RAG pipeline, and presents a step‑by‑step solution that restores atomicity of records for reliable duplicate detection.

AI debugging · Knowledge Base · LLM
0 likes · 14 min read
JD Tech
Aug 20, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy with J‑Schema, Iterative DPO, and Self‑Consistency

This article examines the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, applies iterative DPO training and self-consistency voting, and demonstrates how these techniques raise execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD benchmark · Iterative DPO · J-Schema
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Aug 20, 2025 · Artificial Intelligence

How DeepSearch Elevates RAG: From RAG 1.0 to a Multi‑Agent AI Search Engine

This article explains how the LLM edition of Alibaba Cloud OpenSearch evolved from RAG 1.0 to RAG 2.0, introducing the DeepSearch multi‑agent architecture that combines offline data processing, online query handling, planning, clarification, search, and summarization agents to deliver more accurate and complex AI‑driven answers.

AI Search · DeepSearch · LLM
0 likes · 10 min read
Volcano Engine Developer Services
Aug 20, 2025 · Artificial Intelligence

What Is Vibe Coding? Exploring the AI‑Driven Programming Paradigm

Vibe Coding, a new AI‑centric programming paradigm introduced by Andrej Karpathy, replaces traditional code‑centric development with natural‑language‑driven interactions, enabling developers to act as product‑focused guides while large language models generate code. The article discusses tools, workflows, benefits, challenges, and future trends.

AI coding · LLM · Vibe Coding
0 likes · 26 min read
Data Party THU
Aug 20, 2025 · Artificial Intelligence

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

This article surveys recent large‑scale corpus rewriting techniques for LLM pre‑training, covering K2’s token‑utilization strategies, domain‑specific methods like SwallowMath/Code, reStructured pretraining, the WRAP pipeline, Nemotron‑CC filtering, Pro‑X noise removal, and the MAGA multi‑style expansion, while highlighting challenges, experimental findings, and open research questions.

LLM · corpus rewriting · data synthesis
0 likes · 20 min read
Instant Consumer Technology Team
Aug 19, 2025 · Artificial Intelligence

Mastering Document Chunking for RAG: Strategies, Code & Best Practices

This article explores why proper document chunking is crucial for Retrieval‑Augmented Generation, explains core concepts like context windows and signal‑to‑noise, compares various chunking strategies—from simple fixed‑size splits to semantic and hybrid approaches—and provides practical Python code examples to help you build more effective RAG pipelines.

LLM · RAG · Text Splitting
0 likes · 24 min read
Volcano Engine Developer Services
Aug 19, 2025 · Artificial Intelligence

How to Strengthen LLM System Prompts for Safer AI Agents

This guide explains how to reinforce system prompts for AI agents by optimizing their content and structure, using active defense, role‑based, and format constraints, providing practical examples, measurement methods, and experimental results that demonstrate up to 90% reduction in unsafe behavior.

AI safety · LLM · System Prompt
0 likes · 13 min read
Data Party THU
Aug 19, 2025 · Artificial Intelligence

Why RL Fine‑Tuning Fails to Extend LLM Reasoning Limits: Entropy Collapse Explained

This article examines how reinforcement learning fine‑tuning influences large language model reasoning, revealing that RL primarily amplifies pre‑trained capabilities, suffers from entropy collapse, and fails to push the model’s reasoning boundary, supported by extensive experiments on scaling laws, entropy analysis, and mitigation techniques.

LLM · RL · RLVR
0 likes · 24 min read
Tencent Cloud Developer
Aug 19, 2025 · Artificial Intelligence

Demystifying LLMs: From Transformers to Agents, Prompts, and Function Calling

This article explains the fundamentals of large language models, covering transformer self‑attention, prompt engineering, API usage with temperature and tool parameters, function calling, agent architectures, the Model Context Protocol (MCP), Agent‑to‑Agent (A2A) communication, and future AI programming roles.

A2A · AI agents · Function Calling
0 likes · 11 min read
Kuaishou Tech
Aug 18, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO Optimization

The Klear‑Reasoner model, built on Qwen3‑8B‑Base and powered by the novel Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm, surpasses same‑size open‑source baselines on challenging math (AIME) and code (LiveCodeBench) benchmarks, while revealing key insights on data quality, reward design, and clipping strategies for large‑language‑model reasoning.

GPPO · LLM · code reasoning
0 likes · 11 min read
Qborfy AI
Aug 16, 2025 · Artificial Intelligence

Mastering LLM Tokens: How They Work, Cost, and Choose the Right Model

This article explains what tokens are in large language models, how they are counted and priced, compares tokenization methods across major models, and provides practical guidelines and code examples for optimizing token usage and selecting the appropriate model for different scenarios.

AI · LLM · Model selection
0 likes · 8 min read
DaTaobao Tech
Aug 15, 2025 · Mobile Development

How to Eliminate Text Lag in iOS LLM Chat Apps with Smart Buffering and Typewriter Animation

This article explains how to eliminate stuttering text output in iOS chat applications powered by local LLMs running on the MNN framework. The fix is a three‑layer optimization (smart stream buffering, UI update throttling with batch processing, and a typewriter‑style animation) that achieves smooth, near‑online responsiveness.

C++ · LLM · MNN
0 likes · 16 min read
Baobao Algorithm Notes
Aug 15, 2025 · Artificial Intelligence

Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training

This article systematically adapts classic deep reinforcement‑learning techniques—such as multi‑step returns, TD(λ)/GAE, V‑trace corrections, uncertainty‑aware weighting, safety constraints, distribution‑robust optimization, and value‑guided decoding—to improve large language model training and inference, providing concrete formulas, implementation tips, and empirical results.

Deep RL · GAE · LLM
0 likes · 17 min read
Alibaba Cloud Developer
Aug 15, 2025 · Artificial Intelligence

Mastering AI Agents: Prompt Engineering, Workflows, and RAG Strategies

This article systematically explains how to build reliable, high‑performance AI agents by focusing on the core components—LLM, prompts, workflows, RAG, and tools—while covering prompt engineering techniques, DSL‑based workflow design, vector‑database knowledge bases, security against prompt injection, and practical project planning.

AI Agent · LLM · RAG
0 likes · 15 min read
Tencent Technical Engineering
Aug 14, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Causes, Risks, and Multi‑Dimensional Solutions

This article systematically examines the root causes of hallucinations in large language models, evaluates their pros and cons, and presents a comprehensive set of optimization techniques—including prompt engineering, RAG, sampling tweaks, supervised fine‑tuning, LoRA, RLHF, chain‑of‑thought reasoning, and agent/workflow designs—to build more reliable and trustworthy AI applications.

AI · LLM · LoRA
0 likes · 29 min read
Data Party THU
Aug 14, 2025 · Artificial Intelligence

How FilterLLM Turns One LLM Pass into Billion‑User Cold‑Start Recommendations

The article analyzes the FilterLLM approach, which augments a frozen LLM with billions of learnable user tokens to predict a full‑user interaction probability distribution in a single forward pass, dramatically speeding up cold‑start recommendation while preserving recommendation quality across multiple benchmarks.

AI · FilterLLM · LLM
0 likes · 8 min read
Huolala Tech
Aug 14, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Natural Language to SQL for Intelligent Data Queries

This article explores how large language models break the natural‑language‑to‑SQL barrier, outlines the challenges of NLP‑driven data retrieval, compares Text2SQL and Text2DSL approaches, and proposes a unified data service and metric platform to power enterprise‑grade ChatBI solutions.

AI · ChatBI · Data Engineering
0 likes · 22 min read
JD Cloud Developers
Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article presents a comprehensive study on improving Text-to-SQL performance by introducing J‑Schema for structured schema representation, applying iterative Direct Preference Optimization (DPO) training, and leveraging self‑consistency voting mechanisms, achieving up to a 12% accuracy gain on the BIRD benchmark.

Database QA · Iterative DPO · J-Schema
0 likes · 10 min read
JD Retail Technology
Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article surveys the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, details an iterative DPO training pipeline with hyper‑parameter tuning, and demonstrates how self‑consistency voting boosts execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD dataset · Iterative DPO · LLM
0 likes · 14 min read
Youzan Coder
Aug 13, 2025 · Artificial Intelligence

Understanding AI Agents: Core Modules, Planning Strategies, and Evaluation

This article explains what an AI agent is, outlines its four core modules—perception, memory, planning, and action—describes the role of large language models, compares software development generations, discusses memory implementations, planning methods like ReAct and Plan‑and‑Solve, and covers evaluation, cost analysis, and differences between agents and workflows.

AI · LLM · Memory
0 likes · 15 min read
Zhongtong Tech
Aug 13, 2025 · Artificial Intelligence

Unlock Seamless AI‑Tool Interaction with the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open‑source interface that standardizes how large language models interact with external data sources and tools, offering a USB‑C‑like universal connector for AI applications, with built‑in session management, security, and flexible HTTP/SSE transport for seamless real‑world integration.

AI Integration · Data Security · LLM
0 likes · 7 min read
Data Party THU
Aug 12, 2025 · Artificial Intelligence

Unlocking Chain-of-Thought: How AI Reasoning Boosts Accuracy Across Domains

Chain‑of‑Thought (CoT) enables large language models to solve complex tasks by breaking problems into sequential reasoning steps, improving accuracy in mathematics, commonsense, code generation, business strategy, and medical diagnosis. The article covers CoT's principles, advantages, challenges, and future prospects.

LLM · chain-of-thought · machine learning
0 likes · 13 min read
Qborfy AI
Aug 12, 2025 · Artificial Intelligence

What Powers Large Language Models? A Deep Dive into LLM Architecture and Scaling

This article explains how massive Transformer‑based large language models compress text data into mathematical representations, why scale, self‑attention, and training paradigms enable emergent general intelligence, and walks through tokenization, embedding, multi‑layer attention, architecture choices, energy costs, and hallucination mitigation.

AI · Embedding · LLM
0 likes · 6 min read
Huolala Tech
Aug 12, 2025 · Information Security

Can AI Boost Traditional SAST to Detect Complex Logic Bugs?

This article explores a hybrid approach that combines traditional static application security testing (SAST) with large language models (LLM) to automatically detect business‑logic vulnerabilities, detailing the methodology, implementation stages, experimental results, and the challenges of integrating AI into code security analysis.

AI · LLM · SAST
0 likes · 15 min read
Liangxu Linux
Aug 11, 2025 · Artificial Intelligence

Four Must‑Try Open‑Source AI Tools: Gemini CLI, XiaoZhi Bot, AI Hub, GPT‑Pilot

This article introduces four notable open‑source AI projects—Google's Gemini CLI, the voice‑interactive XiaoZhi chatbot, the comprehensive AI Engineering Hub, and the GPT‑Pilot programming companion—detailing their key features, generous free quotas, star counts, supported hardware, and providing direct GitHub repository links for each.

AI · Chatbot · Gemini CLI
0 likes · 5 min read
Alibaba Cloud Developer
Aug 11, 2025 · Artificial Intelligence

How Fine‑Tuning Large Models Solves Code Upgrade Challenges and Boosts Stable Module Matching

This article details an innovative approach that uses large‑model supervised fine‑tuning to overcome the instability of code RAG and code agents during open‑source repository upgrades, addressing domain‑specific terminology, code style differences, and improving recall, accuracy, and deployment efficiency.

AI agents · LLM · RAG
0 likes · 11 min read
Data Party THU
Aug 11, 2025 · Artificial Intelligence

What Sets the Latest LLMs Apart? A Deep Dive into V3, OLMo, Gemma, Mistral, Llama 4 and More

This article systematically compares the architectures of recent large language models—including DeepSeek V3/R1, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen 3, SmolLM 3 and Kimi 2—highlighting innovations such as MLA, MoE, post‑norm, sliding‑window attention, NoPE and optimizer choices, with diagrams and code examples to illustrate their impact on efficiency and performance.

LLM · MLA · MoE
0 likes · 12 min read
AI Large Model Application Practice
Aug 11, 2025 · Artificial Intelligence

How to Build an LLM-Powered Smart Resume Screening System

This article presents a detailed design and implementation of an LLM‑based intelligent resume matching system that combines semantic vector retrieval, structured rule filtering, multi‑dimensional weighted scoring, and natural‑language interaction to create a fast, quantifiable, and explainable hiring pipeline.

AI Recruitment · LLM · RAG
0 likes · 18 min read
Wuming AI
Aug 11, 2025 · Industry Insights

Why LLMs Overthink and How Developers Can Control Inference Depth

Developers notice that large language models often enter an "overthinking" mode that slows down simple coding tasks, prompting calls for adjustable inference depth controls so models can switch between quick checks and deep analysis based on task risk level.

AI usability · Developer Experience · LLM
0 likes · 5 min read
Data Party THU
Aug 10, 2025 · Artificial Intelligence

Can Evolutionary Algorithms Auto-Design Training-Free Vision-Language Model Adaptations?

This study introduces EvoVLMA, an evolutionary vision-language model adaptation framework that automatically searches training-free VLM adaptation algorithms using a two-stage LLM-guided evolution, demonstrating superior performance (such as a 1.91% accuracy gain on 8-shot image classification) and releasing the code publicly.

Evolutionary Algorithms · LLM · Model Adaptation
0 likes · 5 min read
Data Party THU
Aug 10, 2025 · Artificial Intelligence

Can LLMs Predict Multiple Tokens at Once? A Deep Dive into Multi‑Token Generation

This article evaluates whether autoregressive large language models can generate several tokens in a single inference step. It describes a mask‑based multi‑token prediction framework and gated LoRA adaptation, reports experimental results on Tulu‑3‑8B showing up to 5.2× speedup, and discusses implications for future research.

AI efficiency · LLM · Multi-token generation
0 likes · 13 min read
Sohu Smart Platform Tech Team
Aug 9, 2025 · Artificial Intelligence

Deploying Large Language Models Offline on Mobile Devices: A Practical Guide

This article explains the challenges of running large language models on mobile devices, reviews recent industry efforts, and provides a step‑by‑step guide—including code snippets—for integrating a distilled GPT‑2 model with Sohu's Hybrid AI Engine using TensorFlow Lite and Keras‑NLP for on‑device inference.

Hybrid AI · Keras · LLM
0 likes · 10 min read
Alibaba Cloud Big Data AI Platform
Aug 8, 2025 · Artificial Intelligence

Can GitOps Power Low‑Cost LLM Agents? A Hands‑On Exploration

This article examines how the Manus sandbox and CodeAct mechanisms inspire a GitOps‑based approach to building LLM agents, detailing the design of planner and executor components, workflow steps, advantages such as RAG and observability, and the potential for low‑cost, scalable intelligent agent development.

AI agents · GitOps · Intelligent agents
0 likes · 12 min read
Tencent Technical Engineering
Aug 8, 2025 · Artificial Intelligence

Which Open‑Source Deep‑Research Agent Framework Is Best? A Comprehensive Comparison

This article systematically compares major open‑source deep‑research agent frameworks—including DeerFlow, SmolAgents, LangChainAI, SkyworkAI, and Researcher—detailing their architectures, best practices, and commercial alternatives, to help developers and users choose the most suitable tool for automated research workflows.

AI automation · LLM · deep research
0 likes · 27 min read
Tencent Cloud Developer
Aug 8, 2025 · Artificial Intelligence

Mastering AI Agents: A Practical Guide to Building Effective Workflows and Tools

This comprehensive guide explains when to use AI agents, presents core design patterns such as prompt chains, routing, parallelization, orchestrator‑worker and eval‑optimize loops, and offers concrete implementation advice and tool‑prompt engineering techniques for building reliable, high‑quality agent systems.

LLM · Prompt Engineering · tool engineering
0 likes · 24 min read
Amap Tech
Aug 7, 2025 · Artificial Intelligence

Boosting Codebase Upgrades with Code RAG and Agent‑Driven Fine‑Tuning

This article describes how the Amap (Gaode) terminal team tackled large‑scale repository upgrades by building a code‑RAG and code‑Agent tool, addressing recall and stability issues, then fine‑tuning a small LLM (Qwen3‑4B) with LoRA and custom datasets to achieve reliable, low‑cost, on‑device code‑query performance.

Code Agent · Knowledge Graph · LLM
0 likes · 11 min read
Architect's Alchemy Furnace
Aug 4, 2025 · Artificial Intelligence

How RAG and Long‑Term Memory Turn AI into a Truly Remembering Assistant

This article explains how Retrieval‑Augmented Generation (RAG) and long‑term memory systems like MenoBase enable large language models to overcome short‑term memory limits, dynamically retrieve up‑to‑date knowledge, and personalize interactions, with practical Dify implementation steps and real‑world use cases across industries.

AI · Dify · Knowledge Base
0 likes · 18 min read
Baidu Maps Tech Team
Jul 31, 2025 · Artificial Intelligence

How Baidu’s AI Voice Assistant Turns Speech into Precise Navigation Commands

This article explains how Baidu Maps’ AI voice assistant converts spoken commands into precise navigation actions by detailing the speech‑to‑text pipeline, intent parsing, template and generative approaches, tool‑calling mechanisms, memory and reflection capabilities, and future directions for intelligent agents.

AI · Intent Parsing · LLM
0 likes · 14 min read
Data Party THU
Jul 31, 2025 · Industry Insights

How mini‑SWE‑agent Solves 65% of SWE‑bench Bugs with Only 100 Lines of Code

The mini‑SWE‑agent, a lightweight open‑source software‑engineering AI built by the original SWE‑bench team, achieves about 65% bug‑fix success on the SWE‑bench benchmark using roughly 100 lines of Python, thanks to its minimal dependencies, shell‑based execution, linear history, and support for various container environments, offering a fast, extensible alternative to the full‑featured SWE‑agent.

AI Agent · LLM · Open Source
0 likes · 8 min read
Alibaba Cloud Developer
Jul 31, 2025 · Artificial Intelligence

Why Post‑Training Matters: Scaling Laws, Fine‑Tuning, and RL Strategies for LLMs

This article explores the importance of post‑training for large language models, explains scaling laws for pre‑ and post‑training, details common fine‑tuning methods (full, PEFT, LoRA), outlines alignment techniques such as RLHF, DPO, PPO, and presents practical workflows using Llama 3 and DeepSeek‑R1, while also discussing test‑time reasoning optimizations.

LLM · RLHF · alignment
0 likes · 19 min read
AsiaInfo Technology: New Tech Exploration
Jul 30, 2025 · Artificial Intelligence

How MCP‑RAG Overcomes Prompt Inflation for Massive LLM Service Calls

This article analyzes the prompt‑inflation bottleneck that arises when large language models (LLMs) must handle thousands of Model Context Protocol (MCP) services, and introduces the MCP‑RAG architecture—a retrieval‑augmented generation solution that builds a metadata knowledge base and intelligent retrieval layer to enable precise, efficient MCP service discovery at scale.

AI · LLM · MCP
0 likes · 21 min read
Ops Development Stories
Jul 29, 2025 · Artificial Intelligence

Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents

This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.

LLM · LangGraph · RAG
0 likes · 64 min read
Alibaba Cloud Developer
Jul 29, 2025 · Artificial Intelligence

How to Transform Chaotic AI Prompts into Robust System Designs

This article examines the pitfalls of rule‑heavy prompt engineering, introduces a systematic four‑layer architecture for AI prompts, outlines six practical compilation principles, and demonstrates how to rewrite a tangled prompt into a clear, maintainable, and scalable system blueprint.

AI Architecture · LLM · Prompt Engineering
0 likes · 84 min read
Data Thinking Notes
Jul 27, 2025 · Databases

How Dify Turns Natural Language into SQL: Building Scalable Text2SQL Apps

This article explains how Text2SQL technology converts natural language queries into executable SQL using large language models, and demonstrates how the open‑source Dify platform’s visual workflow and component‑based development dramatically lower the barrier for building, validating, and deploying secure, low‑code Text2SQL applications.

AI · Dify · LLM
0 likes · 13 min read
Architecture and Beyond
Jul 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Secret to Powerful AI Agents

This article explains how AI agents work through perception, planning, and action, describes the four supporting systems—memory, tools, safety, and evaluation—and shows how the evolution from prompt engineering to context engineering, with strategies like selective saving, retrieval, compression, and modularization, addresses the core challenges of managing large‑scale context for reliable, efficient agent performance.

AI agents · Context Engineering · LLM
0 likes · 17 min read
Full-Stack Cultivation Path
Jul 26, 2025 · Artificial Intelligence

Step-by-Step Local Deployment Guide for Coze Studio: Launch Your Low-Code AI Agent Development

This article provides a comprehensive, hands‑on tutorial for installing Ollama, Docker, and the open‑source Coze Studio on a local machine, configuring various LLM services such as Qwen 3, DeepSeek‑V3, and OpenRouter, and running the platform via Docker Compose to create and test AI agents.

Coze Studio · LLM · Local Deployment
0 likes · 7 min read
DaTaobao Tech
Jul 23, 2025 · Artificial Intelligence

How Alibaba’s New Distributed Agent Framework Solves 2C AI Challenges

Alibaba introduces the ali‑langengine‑dflow framework, a hybrid distributed‑agent architecture that moves core intelligence to the cloud while keeping execution reachable on heterogeneous client devices, addressing data‑isolation, latency and security issues of existing cloud‑VM and local‑agent solutions for consumer‑facing (2C) internet services.

AI · Distributed Systems · LLM
0 likes · 21 min read
FunTester
Jul 23, 2025 · Artificial Intelligence

Mastering Prompt Iteration: A Step‑by‑Step Guide to Effective LLM Collaboration

This article explains why a perfect answer from a large language model requires iterative prompt design, outlines a six‑step spiral loop for refining prompts, and offers practical tips such as starting with a minimal prompt, focusing on one improvement at a time, and preserving version history.

Artificial Intelligence · Best Practices · Iterative Design
0 likes · 5 min read
Go Programming World
Jul 23, 2025 · Artificial Intelligence

Directing Code with AI: How Vibe Coding Turns Natural Language into Software

Vibe Coding, introduced by Andrej Karpathy in 2025, lets developers describe software goals in natural language while large language models generate the code. This article explains how it reshapes the developer’s role, outlines the workflow, and discusses the tools, risks, and future prospects of this AI‑driven programming paradigm.

AI-driven development · LLM · Vibe Coding
0 likes · 6 min read
Code Mala Tang
Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Markdown
0 likes · 12 min read
DaTaobao Tech
Jul 18, 2025 · Artificial Intelligence

Build a Minimal Java ReAct Agent in 200 Lines: A Hands‑On Tutorial

This tutorial walks you through constructing a lightweight ReAct agent using Java, explaining the Thought‑Action‑Observation loop, providing a 200‑line code example, and demonstrating a real‑world approval workflow with prompts, tool definitions, and step‑by‑step interaction logs.

Java · LLM · Prompt Engineering
0 likes · 21 min read
Architect's Alchemy Furnace
Jul 17, 2025 · Artificial Intelligence

Explore the Ultimate Open-Source LLM Catalog: Models, Tools, and Resources

This article compiles a comprehensive, up‑to‑date inventory of open‑source large language models from Chinese and international organizations, detailing each model’s architecture, parameter count, multilingual capabilities, deployment requirements, and associated tools, offering a valuable reference for AI researchers and developers.

AI · LLM · Large Language Model
0 likes · 50 min read
Tencent Advertising Technology
Jul 17, 2025 · Artificial Intelligence

LEADRE: Knowledge‑Enhanced LLMs Supercharge Display Ad Recommendations

The paper introduces LEADRE, a multi‑faceted knowledge‑enhanced large language model‑driven display advertisement recommender that tackles user interest modeling, knowledge alignment, and low‑latency deployment, achieving significant GMV gains in Tencent’s ad platforms through innovative prompt engineering, semantic alignment, and TensorRT‑accelerated inference.

Knowledge Alignment · LLM · Prompt Engineering
0 likes · 16 min read
Tech Freedom Circle
Jul 17, 2025 · Artificial Intelligence

DeepSeek V3 Architecture Deep Dive: MoE, MLA, DualPipe, FP8 Mixed Precision & Multi‑Token Prediction

This article provides a detailed technical analysis of DeepSeek‑V3, covering its MoE architecture, the novel Multi‑head Latent Attention (MLA) mechanism, the DualPipe pipeline‑parallel algorithm, mixed‑precision FP8 training, and the Multi‑Token Prediction (MTP) inference improvements that together boost performance and efficiency.

DeepSeek · DualPipe · FP8
0 likes · 44 min read
Alimama Tech
Jul 17, 2025 · Artificial Intelligence

How to Build a High‑Scoring AI Werewolf Agent: Strategies, Prompt Engineering, and Code

This article details the author's experience designing a top‑performing AI Werewolf agent for the Taotian Group's AI Werewolf Challenge, covering game rules, core challenges, prompt engineering, caching, concurrent requests, model selection, reinforcement‑learning‑style tuning, and tactical strategies for each role, with code examples.

AI Agent · LLM · Prompt Engineering
0 likes · 25 min read
DataFunSummit
Jul 16, 2025 · Artificial Intelligence

How Tencent Cloud ES Powers RAG with Hybrid Search and Massive Vector Optimizations

This article explores how Tencent Cloud Elasticsearch combines decades of text search expertise with cutting‑edge vector retrieval and large language models to deliver a one‑stop Retrieval‑Augmented Generation solution, detailing the underlying models, hybrid search architecture, performance tricks, and real‑world case studies.

Elasticsearch · Hybrid Search · LLM
0 likes · 24 min read
Volcano Engine Developer Services
Jul 16, 2025 · Information Security

Securing the Model Context Protocol (MCP): Volcano Engine’s End‑to‑End Approach

This article explains how Volcano Engine safeguards the Model Context Protocol (MCP) throughout its lifecycle, detailing MCP fundamentals, core components, a step‑by‑step interaction example, seven major security risks, official design principles, and a comprehensive security architecture covering admission control, native design, and runtime protection.

LLM · MCP · Model Context Protocol
0 likes · 21 min read
DaTaobao Tech
Jul 16, 2025 · Artificial Intelligence

From GPT‑4 to Agentic AI: How LLM Architecture Evolved (2023‑2025)

Since GPT‑4’s 2023 debut, large language models have shifted from sheer scale to efficiency‑driven designs (MoE, MLA, and new attention mechanisms), advanced reasoning with chain‑of‑thought, and agentic tool use. The article examines how this shift is reshaping benchmarks, commercial strategies, and the future of AI.

Efficiency · LLM · Model Scaling
0 likes · 24 min read
AntTech
Jul 16, 2025 · Artificial Intelligence

Can AI Auditors Match Human Experts? Inside RepoAudit’s LLM‑Powered Code Review

The EXPRESS Workshop at ISSTA 2025, hosted by Ant Group, featured a keynote by Purdue’s Prof. Zhang on an LLM‑driven “Human‑like AI Auditor” called RepoAudit, which demonstrated high‑accuracy automated code review, uncovering dozens of real bugs and hundreds of zero‑day vulnerabilities across major open‑source projects.

AI · LLM · RepoAudit
0 likes · 6 min read
IT Services Circle
Jul 16, 2025 · Artificial Intelligence

How a Simple Colon Can Trick Top LLMs – The Master‑RM Fix

A recent study reveals that tiny symbols like colons or generic reasoning prefixes can cause large language models used as reward judges to issue false‑positive rewards, but an enhanced reward model called Master‑RM, trained with adversarial data, eliminates this vulnerability across multiple LLMs and languages.

AI safety · LLM · Master-RM
0 likes · 10 min read
Alibaba Cloud Developer
Jul 15, 2025 · Information Security

Boost Web Vulnerability Scanning with LLM‑Powered MCP Server Automation

This article explores how large language models can be integrated with MCP Server and Burp Suite to automate web application vulnerability detection, detailing environment setup, workflow steps, code snippets, challenges such as token limits and payload formatting, and the advantages and limitations of the approach.

Automated Vulnerability Scanning · Burp Suite · Kotlin
0 likes · 12 min read
Tencent Cloud Developer
Jul 15, 2025 · Artificial Intelligence

How RAG Evolved: From Naive to Agentic – A Complete Guide

This article systematically outlines the evolution of Retrieval‑Augmented Generation (RAG) from its naive three‑step pipeline to advanced, modular, and agentic architectures, highlighting each generation's motivations, core features, advantages, drawbacks, and practical implementation details for large language model applications.

Agentic RAG · Artificial Intelligence · LLM
0 likes · 20 min read
Tencent Technical Engineering
Jul 14, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Core Concepts and How They Interact

This article provides a concise overview of the latest AI concepts—including AIGC, Retrieval‑Augmented Generation, Function‑Calling models, intelligent agents, and the Model Context Protocol—explaining their principles, differences, and how they can be combined to build more powerful AI applications for developers outside the AI field.

AIGC · Function Calling · LLM
0 likes · 15 min read
Architect's Alchemy Furnace
Jul 12, 2025 · Artificial Intelligence

Why GraphRAG Is the Future of Retrieval‑Augmented Generation

This article explains how GraphRAG combines knowledge graphs with retrieval‑augmented generation to overcome the limitations of vector‑only RAG, delivering higher accuracy, better explainability, easier development, and stronger governance for generative AI applications across various domains.

AI · GraphRAG · Knowledge Graph
0 likes · 23 min read
AI Frontier Lectures
Jul 11, 2025 · Artificial Intelligence

Can LLMs ‘Squint’ to Recognize Hidden Faces? A Comparative Test

The article evaluates several large language models—including ChatGPT, Gemini, Grok, Qwen, and o3‑Pro—on a visual illusion that requires squinting to identify the Mona Lisa, revealing varied success rates, reasoning differences, and insights into model capabilities and limitations.

LLM · Prompt Engineering · model comparison
0 likes · 6 min read
Qborfy AI
Jul 11, 2025 · Artificial Intelligence

Building a Dynamic Agent Workflow with LangGraph: A Step‑by‑Step Guide

This tutorial walks through creating a full‑featured LLM Agent workflow using LangGraph, covering goal definition, task decomposition, execution nodes, state updates, re‑planning logic, and user feedback, while comparing ReAct and Reflexion approaches and providing complete Python code examples.

Agent workflow · LLM · LangChain
0 likes · 11 min read
Tech Freedom Circle
Jul 11, 2025 · Artificial Intelligence

The Three Core Protocols of AI Agents 2.0: MCP, A2A, and AG‑UI

This article explains the three foundational protocols—MCP for tool access, A2A for inter‑agent communication, and AG‑UI for Agent‑UI interaction—detailing their origins, technical roles, example implementations, and how they together form the communication backbone of modern AI applications.

A2A · AG-UI · AI Agent
0 likes · 18 min read
Fun with Large Models
Jul 10, 2025 · Artificial Intelligence

Grok 4: The ‘Problem‑Solving Champion’ That Falters in Real‑World Use – Detailed Evaluation

The article reviews Grok 4’s flashy launch and claimed first‑principles advantage, then presents benchmark results—showing strong reasoning, multimodal and agent scores but disappointing coding performance versus DeepSeek‑R1—concluding that the model’s real‑world capabilities fall short of its hype.

Grok4 · LLM · agent
0 likes · 11 min read
Tencent Cloud Developer
Jul 10, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Essential AI Concepts for Developers

This article provides a concise, developer‑focused overview of emerging AI concepts—including AIGC, multimodal models, Retrieval‑Augmented Generation, intelligent agents, Function‑Calling, and the Model Context Protocol (MCP)—explaining their core principles, differences, and how they interrelate to enable advanced AI applications.

AI · AIGC · Function Calling
0 likes · 16 min read
Instant Consumer Technology Team
Jul 9, 2025 · Artificial Intelligence

How Easy Dataset Automates High‑Quality LLM Fine‑Tuning Data from Unstructured Docs

The article introduces Easy Dataset, a GUI‑driven framework that transforms heterogeneous documents into high‑quality, persona‑driven fine‑tuning data for large language models, details its architecture, core contributions, experimental validation on financial QA, and compares it with existing data‑synthesis tools.

Artificial Intelligence · GUI · LLM
0 likes · 12 min read
Alimama Tech
Jul 9, 2025 · Artificial Intelligence

How to Make LLMs Recognize and Resolve Their Own Uncertainty

This article introduces ConfuseBench, a benchmark that classifies LLM uncertainty into document‑missing, ability‑limited, and ambiguous types, and presents methods—including retrieval, chain‑of‑thought, and clarification—to detect and actively resolve uncertainty, improving answer quality across diverse tasks.

Benchmark · Clarification · Inquiry
0 likes · 17 min read
AntTech
Jul 9, 2025 · Artificial Intelligence

How KAG-Thinker Boosts Structured Reasoning in Large Language Models

The KAG-Thinker model, a collaborative effort by Ant Group, Zhejiang University, and Tongji University, introduces a hierarchical "breadth splitting + depth solving" framework that enhances logical stability, knowledge utilization, and retrieval robustness for complex multi‑hop reasoning tasks across general and specialized domains.

AI · KAG-Thinker · Knowledge retrieval
0 likes · 10 min read
High Availability Architecture
Jul 9, 2025 · Artificial Intelligence

How LLMs Evolved from GPT‑4 to Agentic AI: Trends, Techniques, and Future Directions

This article analyzes the rapid evolution of large language models from the GPT‑4 era through efficiency‑focused sparsity and attention innovations, to inference‑time reasoning and tool‑using agents, highlighting key architectures, benchmark breakthroughs, competitive strategies, and emerging research directions toward embodied AI.

Efficiency · LLM · Reasoning
0 likes · 24 min read
Alibaba Cloud Big Data AI Platform
Jul 8, 2025 · Artificial Intelligence

How Video Retrieval‑Augmented Generation Transforms Multimodal AI Search

This article explains the end‑to‑end implementation of Video RAG in OpenSearch LLM, covering offline parsing, key‑frame extraction, audio transcription, slice creation, multimodal vectorization, hybrid indexing, and online query processing while addressing challenges like recall performance and long‑video efficiency.

ASR · Key Frame Extraction · LLM
0 likes · 10 min read
Alibaba Cloud Developer
Jul 8, 2025 · Artificial Intelligence

From GPT‑4 to Thinking Models: How LLM Architecture Evolved After 2023

This article traces the evolution of large language models from the GPT‑4 era through 2024‑2025, highlighting the shift from pure scaling to efficiency‑focused architectures, the rise of reasoning‑centric "thinking" models, and the emergence of agentic capabilities that enable tool use and real‑world interaction.

LLM · Reasoning · Transformer
0 likes · 27 min read
Instant Consumer Technology Team
Jul 4, 2025 · Artificial Intelligence

How AI Agents Boost Development: Inside the ReAct Framework & Prompt Engineering

This article explains how AI agents, using the ReAct framework, enable a human‑machine pair‑programming workflow, details the reasoning‑acting‑observation loop, showcases practical Python examples with smolagents and DeepSeek, and provides prompt‑engineering guidelines for effective tool‑calling.

AI Agent · LLM · Prompt Engineering
0 likes · 19 min read
DaTaobao Tech
Jul 4, 2025 · Artificial Intelligence

How Taobao Live’s AI Digital Humans Transform E‑Commerce: Architecture, Algorithms, and Engineering Insights

This article details the end‑to‑end design of Taobao Live's AI digital human system, covering six core components: LLM‑driven content creation, interactive dialogue, TTS voice synthesis, visual synchronization, audio‑video engineering, and a scalable backend. It also discusses product evolution, automation challenges, and the future roadmap.

AI · LLM · TTS
0 likes · 19 min read
macrozheng
Jul 4, 2025 · Artificial Intelligence

Build Java LLM Applications with LangChain4j: A Hands‑On Guide

This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to use the LangChain framework (including its Java implementation, LangChain4j) to build AI‑driven applications with memory management, retrieval, and chaining, illustrated by practical code examples.

AI · Embedding · Java
0 likes · 17 min read
DataFunTalk
Jul 3, 2025 · Artificial Intelligence

How Vivo’s Blue Heart XiaoV Leverages LLMs to Transform Conversational Recommendations

In an interview with Vivo AI engineer Liang Tianan, the article explores the challenges of post‑Q&A recommendation, the integration of large language models into recall, ranking and evaluation pipelines, and the engineering trade‑offs required to deliver high‑quality, diverse suggestions on mobile devices.

Evaluation · LLM · Model Compression
0 likes · 15 min read