Tagged articles
DataFunSummit
Sep 27, 2025 · Artificial Intelligence

Bridging the Gap: Enforcing Discipline in AI Agents for Reliable Performance

This article examines the challenges of building production‑grade AI agents—such as context drift, knowledge leakage, and fragile state handling—and presents a disciplined architecture that combines code locks, attention anchors, and Redis‑backed state management to turn a prototype travel planner into a robust, industrial‑strength system.

AI Agent · LLM · code architecture
14 min read

DataFunTalk
Sep 27, 2025 · Artificial Intelligence

How Bilibili Uses LLMs to Diagnose Big Data Platform Issues

This article explains how Bilibili leverages a large‑language‑model‑driven assistant to diagnose and resolve failures and slowdowns in its massive big‑data platform, detailing the platform’s five‑layer architecture, common task issues, and the need for intelligent troubleshooting tools.

AI Assistant · Big Data · Bilibili
5 min read

Architecture and Beyond
Sep 27, 2025 · Artificial Intelligence

Mastering AI Agent Tool Management: OpenManus, Gemini CLI & Shopify Sidekick

This article explains how AI agents work, examines OpenManus’s comprehensive tool framework, reviews Gemini CLI’s minimalist tool scheduling and error handling, and discusses Shopify Sidekick’s scaling challenges and Just‑in‑Time instruction strategy, offering practical guidance for building robust, production‑ready agentic systems.

AI Agents · Error Handling · Just-in-Time
15 min read

Tech Freedom Circle
Sep 27, 2025 · Artificial Intelligence

What Is an AI‑Native Application and How to Design One?

The article explains the concept of AI‑native applications, distinguishes them from AI‑plugin extensions, outlines their core principles (model‑first design, data flywheel, event‑driven agents, multimodal semantics, and continuous learning), and provides a seven‑step practical guide with code examples for building an AI‑native app.

AI Assistant · AI-native · Data Flywheel
23 min read

Raymond Ops
Sep 26, 2025 · Artificial Intelligence

How to Build and Deploy a Dify LLM Application Platform on CentOS

This comprehensive guide walks you through the fundamentals of Dify, an open‑source LLM application platform, its key features and use cases, and provides step‑by‑step instructions for preparing the environment, installing Docker and Docker‑Compose, and deploying Dify on a CentOS 7.9 server.

AI platform · Dify · Docker
13 min read

Bilibili Tech
Sep 26, 2025 · Artificial Intelligence

How RAG Transforms Natural Language Queries into Accurate SQL for Business Users

This article explains how Retrieval‑Augmented Generation (RAG) combines large language models with vector databases to let non‑technical staff query massive membership data using plain language, detailing the workflow, technical architecture, optimization challenges, and real‑world impact on data‑driven decision making.

AI · Data Platform · LLM
17 min read

Tech Freedom Circle
Sep 25, 2025 · Operations

RAGFlow Link Tracing: GPS‑Style Observability for LLM‑Powered Applications

The article explains why RAGFlow needs end‑to‑end link tracing, introduces OpenTelemetry’s core concepts, shows how custom tracing utilities are implemented in Python, describes the layered architecture, provides concrete Docker and YAML configurations, and offers best‑practice guidelines for performance monitoring and fault diagnosis.

Distributed Systems · LLM · Observability
24 min read

Tech Freedom Circle
Sep 25, 2025 · Artificial Intelligence

RAGFlow Primer Part 1: Introduction and Concept Deep Dive

This article provides a comprehensive technical overview of RAGFlow, an industrial‑grade Retrieval‑Augmented Generation platform, detailing its architecture and core components such as DeepDoc, intelligent chunking, embedding integration, multi‑stage retrieval, and agent workflow, and contrasting it with the shortcomings of traditional RAG.

Agent workflow · DeepDoc · Intelligent Chunking
32 min read

Tech Freedom Circle
Sep 25, 2025 · Artificial Intelligence

RAGFlow Deep Dive: Data Parsing and Knowledge Graph Construction

This article examines RAGFlow's end‑to‑end pipeline for turning diverse documents into structured knowledge, detailing the TaskExecutor factory, the DeepDoc layout‑aware parser, chunking strategies, embedding and storage mechanisms, and the GraphRAG‑based knowledge‑graph extraction that together enable high‑precision retrieval and reasoning.

Chunking · Data Parsing · DeepDoc
15 min read

BirdNest Tech Talk
Sep 25, 2025 · Artificial Intelligence

How to Install and Configure LangChain for LLM Development

This guide walks you through installing the LangChain library, adding model‑specific packages, verifying the setup with a Python script, configuring API keys via environment variables or a .env file, and preparing to use OpenAI‑compatible models such as DeepSeek or Qwen.

API keys · Environment · Installation
8 min read

BirdNest Tech Talk
Sep 25, 2025 · Artificial Intelligence

Mastering LangChain: A Hands‑On Guide to Building LLM Applications

This repository offers a comprehensive, step‑by‑step LangChain tutorial series that walks developers through installation, the LangChain Expression Language, streaming, parallel execution, callbacks, serialization, model customization, prompt templates, memory, multimodal support, and advanced tools like LangGraph and LangSmith, enabling the creation of sophisticated AI applications.

AI development · LLM · LangChain
9 min read

AI2ML AI to Machine Learning
Sep 24, 2025 · Artificial Intelligence

Key Points for Evaluating AI Agents

The article explains how Coze's Compass introduces a flexible evaluation system for AI agents, outlines a four‑dimensional submodule assessment (planning, tool use, self‑reflection, memory), and details specific testing criteria and challenges for web, scientific, dialogue, and programming agents.

AI Agents · Benchmarking · Coze
6 min read

DataFunSummit
Sep 24, 2025 · Artificial Intelligence

Taming LLM Hallucinations: Strategies and Solutions from 360

This article explores the problem of large‑model hallucinations, explains its definitions and classifications, analyzes root causes in data, algorithms and inference, and presents detection methods and practical mitigation techniques such as RAG, decoding strategies, and model‑enhancement approaches, illustrated with real‑world 360 use cases and future research directions.

AI safety · LLM · RAG
22 min read

Huolala Tech
Sep 24, 2025 · Artificial Intelligence

How CID-GraphRAG Boosts Multi‑Turn AI Customer Service with Dual‑Layer Retrieval

The article introduces CID-GraphRAG, a novel framework that combines intent‑driven graphs with semantic similarity search to improve multi‑turn intelligent customer service, detailing its architecture, dual‑layer retrieval mechanism, evaluation against baseline models, and future research directions.

AI · Customer Service · Dialogue Systems
14 min read

AI Large Model Application Practice
Sep 23, 2025 · Artificial Intelligence

How MindsDB Turns Any Data Source into an AI‑Powered Query Engine

This article walks through installing MindsDB and configuring its unified data access layer, then demonstrates how to query across relational databases, files, and vector stores while injecting AI models (traditional ML, LLMs, and embedding models) directly into SQL for intelligent data retrieval and analysis.

AI data integration · LLM · MindsDB
16 min read

Rare Earth Juejin Tech Community
Sep 23, 2025 · Artificial Intelligence

How Front‑End Developers Can Build Powerful AI Agents with LangChain.js

This article guides front‑end developers through the evolution of AI agents—from early chatbots to modern multimodal agents—covering LLM fundamentals, prompt engineering, LangChain.js workflow creation, Retrieval‑Augmented Generation, model context protocols, and future multi‑agent technologies.

AI Agent · Frontend Development · LLM
11 min read

Baobao Algorithm Notes
Sep 22, 2025 · Artificial Intelligence

How to Add Special Tokens to LLMs Without Losing Performance

This guide explains why naïvely adding special tokens during supervised fine‑tuning can destabilize a large language model, and provides step‑by‑step strategies—including tokenizer updates, embedding resizing, smart initialization, and LoRA‑based PEFT—to integrate new tokens while preserving the model's original capabilities.

LLM · LoRA · special tokens
9 min read

Bighead's Algorithm Notes
Sep 21, 2025 · Artificial Intelligence

FinKario: Event‑Enhanced Financial Knowledge Graphs Boost A‑Share Sharpe Ratio to 4.9

This article reviews the FinKario paper, which introduces an event‑augmented financial knowledge graph and a two‑stage RAG retrieval strategy that together enable real‑time knowledge updates and efficient integration of long‑form research reports, yielding a Sharpe ratio of 4.9 and outperforming baseline LLMs and institutional strategies in back‑testing.

FinKario · LLM · RAG
10 min read

AntTech
Sep 19, 2025 · Artificial Intelligence

How Reinforcement Learning Cuts Hallucinations in Large Language Models: Ant Insurance’s Proven Approach

Ant Insurance’s tech team leveraged reinforcement learning, focused data selection, and a multi‑dimensional reward system to dramatically reduce hallucinations in LLMs, achieving top‑rank performance on the HHEM leaderboard and robust improvements across instruction‑following and reasoning‑enhanced models.

Hallucination Control · LLM · LLM-as-judge
6 min read

DataFunSummit
Sep 19, 2025 · Artificial Intelligence

How Tencent Leverages LLMs: RAG, GraphRAG, and Agents in Real‑World Apps

This article examines Tencent's large language model deployments across diverse business scenarios, detailing core use cases such as content generation, intelligent customer service, and role‑play, and explains the underlying technologies—Supervised Fine‑Tuning, Retrieval‑Augmented Generation, and intelligent agents—that enable these applications.

AI · LLM · RAG
4 min read

JD Tech
Sep 18, 2025 · Artificial Intelligence

How I Turned a General LLM into a Precise E‑commerce Risk Detector

The article recounts how a risk‑control algorithm engineer progressively refined a generic large language model through four stages of prompt engineering—role‑playing, business knowledge injection, deeper analysis, and a double‑hypothesis decision framework—to transform it into a precise e‑commerce fraud detection expert.

AI · LLM · Prompt Engineering
12 min read

DataFunSummit
Sep 18, 2025 · Artificial Intelligence

Boosting LLM Function Call: Data, Training, and Agent Optimization Strategies

This presentation by Yao Yitong of China Telecom AI Research Institute explains why Function Call is essential for LLM deployment, outlines data‑centric and training‑centric optimization methods, discusses common pitfalls and reward‑function design for reinforcement learning, and showcases practical Agent application patterns for real‑world tasks.

LLM · Training Optimization · agent
36 min read

AI Cyberspace
Sep 18, 2025 · Artificial Intelligence

LangChain vs LangGraph vs LangSmith: Which AI Framework Fits Your Needs?

This article compares LangChain, LangGraph, and LangSmith—three complementary frameworks for building LLM-powered applications—explaining their distinct architectures, use cases, and features, and also introduces related concepts such as RAG, MCP, A2A protocols, hierarchical memory systems, context engineering, and knowledge graphs to guide developers in selecting and integrating the appropriate tools.

Context Engineering · LLM · LangChain
21 min read

Zhuanzhuan Tech
Sep 17, 2025 · Artificial Intelligence

LLM‑Powered Intent Understanding, RAG QA, and Knowledge Base Maintenance for Recycling

This article details how Zhuanzhuan leverages large language models to enhance on‑site device inspection through a three‑stage pipeline—intent understanding, retrieval‑augmented generation QA, and automated knowledge‑base upkeep—highlighting technical innovations, workflow integration, and the resulting operational benefits.

AI · Intent Understanding · Knowledge Base
14 min read

Ops Development & AI Practice
Sep 16, 2025 · Artificial Intelligence

Why the “Bash Only” Benchmark Is the Toughest Test for AI Code Agents

This article examines the design philosophy behind the “Bash Only” category of the SWE‑bench benchmark, explaining how its minimal‑agent approach isolates LLM reasoning by restricting interactions to a plain Bash shell, making it a rigorous, reproducible test of true software‑engineering intelligence.

AI evaluation · Bash Only · Benchmark
7 min read

AntTech
Sep 16, 2025 · Information Security

Cutting-Edge Privacy Tech Unveiled: Gibbon, Panther & PromeFuzz at ACM CCS 2025

At the ACM CCS 2025 live paper showcase, three groundbreaking studies—Gibbon’s fast secure two‑party GBDT training, Panther’s efficient private approximate nearest‑neighbor search on a single server, and PromeFuzz’s knowledge‑driven LLM approach to fuzzing harness generation—are presented, highlighting significant performance and security advances.

LLM · MPC · approximate nearest neighbor
8 min read

Baidu Geek Talk
Sep 15, 2025 · Artificial Intelligence

How Baidu’s AI Navigation Turns Voice Commands into Precise Actions

This article explains how Baidu Map’s AI navigation system converts spoken queries into accurate map instructions by combining speech recognition, intent parsing, large‑language‑model reasoning, tool calling, and memory‑reflection techniques, showcasing the underlying technologies that enable instant, context‑aware responses.

AI · LLM · Map Services
13 min read

Data Party THU
Sep 15, 2025 · Artificial Intelligence

Agentic RL: Transforming LLMs into Autonomous Decision‑Making Agents

This survey formalizes the shift from preference‑based reinforcement fine‑tuning to Agentic Reinforcement Learning, defines Agentic RL via MDP/POMDP abstractions, proposes a dual taxonomy of capabilities and task domains, compiles over 500 recent works, and outlines open challenges for scalable, robust AI agents.

AI Agents · LLM · POMDP
12 min read

Data Party THU
Sep 15, 2025 · Artificial Intelligence

Why Merge SFT and RL? Exploring Unified Fine‑Tuning Strategies for LLMs

This article examines the necessity of integrating Supervised Fine‑Tuning (SFT) with Reinforcement Learning (RL) for large language models, surveys alternating, sample‑reuse, simultaneous, and hint‑guided fusion methods, presents the underlying loss functions, and discusses practical trade‑offs such as entropy collapse and importance‑sampling corrections.

AI · LLM · RL
14 min read

Alibaba Cloud Developer
Sep 15, 2025 · Artificial Intelligence

Why MCP Isn't a Magic AI Upgrade: Deep Dive into Its Architecture, Host Role, and Real Costs

This article debunks common misconceptions about the Model Context Protocol (MCP), explains its client‑host‑server (CHS) architecture, shows how the Host drives AI decisions while Server and Client remain model‑agnostic, compares MCP with Function Calling, analyzes SDK source code, evaluates practical trade‑offs, and outlines the true engineering value and costs of using MCP in AI applications.

AI Engineering · Function Calling · LLM
35 min read

Data Thinking Notes
Sep 14, 2025 · Artificial Intelligence

How to Build a Robust Tool Integration Module for AI Agents

This article explains the architecture, core components, and step‑by‑step implementation of a tool usage module that enables AI agents to standardize, select, execute, and transform external tools, illustrated with a sales data analysis case and detailed code snippets.

AI Agent · LLM · Tool Integration
9 min read

Bighead's Algorithm Notes
Sep 14, 2025 · Artificial Intelligence

How MM‑DREX Uses Multimodal LLMs for Dynamic Expert Routing in Financial Trading

The article reviews the MM‑DREX framework, which tackles the non‑stationarity of financial markets by modeling trading as a POMDP, employing a vision‑language model‑driven dynamic router to allocate four heterogeneous experts, and demonstrates superior returns, Sharpe ratios, and drawdown control across stocks, futures, and crypto compared with 15 strong baselines.

Dynamic Routing · LLM · POMDP
13 min read

JavaEdge
Sep 14, 2025 · Artificial Intelligence

Exploring Hugging Face AI Sheets: No‑Code LLM‑Powered Data Manipulation

Hugging Face AI Sheets lets users employ large language models through a spreadsheet‑like interface to clean, transform, enrich, and generate datasets without writing code, offering both zero‑shot dataset creation and import‑based bulk processing, with optional self‑hosting via Docker for privacy‑sensitive workflows.

AI Sheets · Docker deployment · Hugging Face
5 min read

AI Algorithm Path
Sep 14, 2025 · Artificial Intelligence

Qwen3-Next: Achieving Unmatched Training and Inference Cost‑Effectiveness

Alibaba's Qwen team unveils Qwen3-Next, a mixture‑of‑experts LLM with 80B total parameters but only about 3B active per token, delivering training costs under one‑tenth of comparable dense models and more than ten‑fold inference throughput for long contexts, while matching or surpassing larger models on benchmark tasks.

AI · Benchmark · LLM
9 min read

Data Party THU
Sep 14, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Uncovering the Root Causes and Practical Fixes

The article analyzes why large language models frequently generate confidently wrong answers, attributing hallucinations to statistical inevitability, data scarcity, and limited model expressiveness, and shows how RLHF exacerbates the problem by rewarding guesses, then proposes confidence‑threshold and "I don't know" strategies to mitigate it.

AISafety · ConfidenceThreshold · LLM
6 min read

Bighead's Algorithm Notes
Sep 13, 2025 · Artificial Intelligence

Paper Summary: Recent AI Advances in Financial Time-Series (Sep 6‑12, 2025)

This article summarizes four recent AI research papers that explore zero‑shot PDE extrapolation with text‑trained LLMs, causal hidden‑state interventions for rare financial events, tabular reformulation of graph node classification, and a multimodal model for financial time‑series forecasting, detailing their methods, experiments, and key findings.

LLM · Time-series · causal intervention
10 min read

DataFunSummit
Sep 13, 2025 · Artificial Intelligence

How Pinterest Scaled LLM Data Pipelines with Ray: Boosting Throughput and Cutting Costs

This article details how Pinterest’s senior staff engineer Dr. Luo leveraged the open‑source Ray framework to overcome LLM data‑preprocessing bottlenecks, describing the system’s architecture, key features such as map_batches, Carry‑Over Columns and Accumulators, and the dramatic performance and cost improvements achieved.

Distributed computing · LLM · Performance Optimization
12 min read

Architecture and Beyond
Sep 12, 2025 · Artificial Intelligence

How Gemini CLI and Claude Code Achieve Context Isolation for AI Agents

This article examines the context isolation strategies employed by Gemini CLI and Claude Code in AI agents, detailing why isolation is essential, the multi‑layer memory architecture, tool execution pipelines, concurrency controls, session management, and practical recommendations for building robust, cost‑effective agent systems.

AI Agents · Claude Code · Gemini CLI
15 min read

Volcano Engine Developer Services
Sep 11, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Causes, Types, and Mitigation Strategies

This article examines the growing problem of hallucinations in large language models, outlining their causes across the model lifecycle, classifying four main hallucination types, and presenting both retrieval‑augmented generation and detection techniques—white‑box and black‑box—to reduce factual errors in critical applications.

AI safety · LLM · hallucination
15 min read

Data Party THU
Sep 11, 2025 · Artificial Intelligence

How ComRAG Revolutionizes Real‑Time Community QA with Dynamic Vector Stores

ComRAG tackles the static‑knowledge gaps, uneven QA quality, and storage explosion of community question‑answer platforms by integrating a static documentation vector store with dual dynamic CQA stores managed via a centroid‑based memory, delivering higher accuracy, lower latency, and scalable storage for industrial retrieval‑augmented generation.

Artificial Intelligence · Community QA · Dynamic Retrieval
7 min read

Instant Consumer Technology Team
Sep 11, 2025 · Artificial Intelligence

How REFRAG Cuts LLM Decoding Time by 30×: A New Efficient RAG Framework

REFRAG (REpresentation For RAG) introduces a novel decoding framework that compresses, senses, and expands context using precomputed chunk embeddings, achieving up to 30.85× faster first-token generation and 16× larger context windows without sacrificing perplexity, as validated across diverse long‑context tasks.

LLM · RAG · chunk embeddings
18 min read

DataFunTalk
Sep 11, 2025 · Artificial Intelligence

How Google's AI Is Transforming Scientific Code Development

Google researchers have built a breakthrough AI system that uses large language models and tree‑search to automatically write, rewrite, and optimize scientific computing code, delivering solutions that surpass human experts across biology, epidemiology, remote sensing, neuroscience, time‑series analysis, and computational mathematics.

AI · Cross‑Domain Innovation · LLM
6 min read

AsiaInfo Technology: New Tech Exploration
Sep 11, 2025 · Artificial Intelligence

How AST Boosts LLM‑Powered Code Question Answering: Theory, Practice, and Future Directions

This article explores how abstract syntax trees (AST) can enrich large language model (LLM) based code question‑answering by providing precise structural context, detailing LLM strengths and limits, describing AST‑LLM collaboration, RAG integration, cutting‑edge models, practical tooling, challenges, standardisation efforts, and future research avenues.

AST · LLM · RAG
30 min read

Instant Consumer Technology Team
Sep 10, 2025 · Artificial Intelligence

Why AI Agents Fail and How Parlant Ensures Reliable, Controllable Bots

This article explains why most AI agents underperform due to low problem‑resolution rates, critiques traditional prompting methods, and introduces the Parlant framework with conditional rule activation, dual protection, and state‑machine architecture, followed by a complete implementation example and best‑practice guidance.

AI Agents · Customer Service Automation · LLM
12 min read

Bighead's Algorithm Notes
Sep 9, 2025 · Artificial Intelligence

How EFS Leverages Large Language Models for Sparse Portfolio Optimization

The paper introduces the Evolutionary Factor Search (EFS) framework, which uses large language models to automatically generate and evolve alpha factors, turning sparse portfolio selection into an LLM‑guided top‑m ranking task, and demonstrates superior performance on multiple Fama‑French benchmarks and real‑world market datasets.

Alpha Factors · Evolutionary Algorithms · Factor Search
11 min read

DataFunSummit
Sep 8, 2025 · Artificial Intelligence

How Ant Group’s Ragent Redefines LLM‑Based AI Agents on Ray

This article introduces Ant Group’s new Ray‑based distributed agent framework Ragent, outlines its background and motivation, and details the four core modules—Profile, Memory, Planning, and Action—that together enable sophisticated LLM‑driven AI agents for large‑scale applications.

AI Agents · Ant Group · Distributed Systems
4 min read

Data Party THU
Sep 8, 2025 · Artificial Intelligence

Why Small Language Models Will Dominate Agentic AI by 2025

By 2025, Agentic AI is shifting from massive LLMs to cost‑effective Small Language Models (SLMs), driven by their comparable performance, lower latency, and dramatically reduced inference and fine‑tuning costs, as detailed through market data, model benchmarks, migration steps, and real‑world case studies.

AI · LLM · Model Migration
6 min read

21CTO
Sep 8, 2025 · Artificial Intelligence

Alibaba Unveils Qwen3‑Max‑Preview: First Trillion‑Parameter LLM and What It Means

Alibaba introduced the Qwen3‑Max‑Preview model, a trillion‑parameter LLM that boosts multilingual understanding, complex instruction handling, and tool use while cutting hallucinations, offers competitive benchmark scores, supports 262K context, and comes with tiered token‑based pricing that may limit broader adoption.

AI · Alibaba · LLM
5 min read

AsiaInfo Technology: New Tech Exploration
Sep 8, 2025 · Artificial Intelligence

Unlocking Precise Code Q&A: How ASTs Power AI-Driven Development

With software systems growing ever more complex, traditional text‑based code search falls short; this article explains how abstract syntax trees (AST) provide deeper structural understanding, improve query precision, enable advanced features like control‑flow analysis and knowledge‑graph construction, and outlines a full architecture for building AI‑enhanced code question‑answering systems.

AST · LLM · code question answering
33 min read

JD Tech Talk
Sep 8, 2025 · Artificial Intelligence

How I Turned a Generic LLM into a Precise E‑Commerce Risk Detector

The article recounts how a risk‑control algorithm engineer progressively refined a generic large language model through four stages of prompt engineering—defining roles, dimensions, structured I/O, business rules, behavior fingerprints, and a dual‑hypothesis decision framework—to transform it into a precise e‑commerce fraud detection expert.

AI · LLM · Prompt Engineering
10 min read

JD Cloud Developers
Sep 8, 2025 · Artificial Intelligence

Turn a Generic LLM into an E‑Commerce Risk Detector with Prompt Engineering

In this detailed case study, a risk‑control algorithm engineer explains how he progressively refined prompts for a large language model—starting from a basic role‑playing instruction, adding business‑specific exemption rules, structuring input/output, and finally implementing a dual‑hypothesis decision framework—to transform the model into a reliable e‑commerce fraud detection expert.

AI · E-commerce Fraud · LLM
10 min read

Data Thinking Notes
Sep 7, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: How LLMs Use Retrieval and Planning to Stay Smart

This article explains the core architecture of AI agents powered by large language models, detailing how planning, short‑term and long‑term memory, and tool integration work together through vector databases, retrieval‑augmented generation, and summarization to enable stateful, intelligent interactions across multiple sessions.

AI Agent · LLM · Memory
10 min read

Architecture and Beyond
Sep 6, 2025 · Artificial Intelligence

How AI Agents Manage Context: Compression Strategies from Manus, Claude Code, and Gemini CLI

This article examines the context explosion problem in AI agents and compares three distinct compression approaches—Manus's never‑lose philosophy, Claude Code's aggressive 92% threshold with eight‑section summaries, and Gemini CLI's balanced 70% trigger with curated history—highlighting their trade‑offs in performance, cost, and reliability.

AI · Agent Design · LLM
0 likes · 19 min read
IT Services Circle
Sep 5, 2025 · Artificial Intelligence

10 Must‑Know Tencent AI Interview Topics: Overfitting, Dropout, Transformers & Beyond

This article compiles the ten core questions from a Tencent algorithm interview, covering overfitting, regularization, generalization error, dropout, residual connections, attention, embeddings, BART vs BERT, instruction‑tuning data, LLM hallucination, and why GANs collapse more than diffusion models, with concise explanations and interview‑ready tips.

GAN · LLM · Regularization
0 likes · 22 min read
DataFunTalk
Sep 5, 2025 · Artificial Intelligence

Inside Ant Group’s Ragent: Building Scalable AI Agents on Ray

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and details the four essential modules—Profile, Memory, Planning, and Action—that power large‑language‑model agents in large‑scale AI serving.

AI Agents · Ant Group · Distributed Systems
0 likes · 5 min read
Instant Consumer Technology Team
Sep 5, 2025 · Artificial Intelligence

How Context Engineering Transforms Dify Agents: Boost Efficiency by 10×

This article explains how Context Engineering (CE) extends Prompt Engineering by integrating seven core elements—system prompts, user input, short‑term memory, long‑term memory, retrieval, tools, and structured output—using the open‑source Dify platform to build dynamic, multimodal agents that cut inference costs tenfold and raise complex‑task success rates by 40%.

AI Agent Development · Dify · LLM
0 likes · 16 min read
Alibaba Cloud Developer
Sep 5, 2025 · Artificial Intelligence

How Browser-Use Leverages LLMs to Transform Browser Automation

This article explores Browser-Use, an AI‑driven browser automation framework that combines large language models, visual perception, and DOM analysis to enable intelligent, multi‑step web tasks such as registration, price comparison, form filling, and monitoring, while detailing its architecture, historical context, core modules, and future challenges.

AI Agents · LLM · LangChain
0 likes · 26 min read
Data Party THU
Sep 4, 2025 · Artificial Intelligence

How MXFP4 Quantization Lets a 120‑Billion‑Parameter LLM Run on a Single 80GB GPU

This article analyzes the memory bottleneck of massive language models, explains the mathematical modeling of memory requirements, evaluates traditional sharding limits, and details how GPT‑OSS’s MXFP4 quantization combined with Mixture‑of‑Experts reduces memory, bandwidth, and compute demands enough to fit a 120‑billion‑parameter model onto an 80 GB GPU with minimal accuracy loss.

FP4 · LLM · MXFP4
0 likes · 11 min read
Data Party THU
Sep 4, 2025 · Artificial Intelligence

Unraveling PPO Variants: From GRPO to DAPO and GSPO – A Deep Dive

This article provides a comprehensive technical analysis of PPO‑based reinforcement learning methods for large language models, detailing the evolution from the original PPO algorithm through GRPO, DAPO, and GSPO, and explaining their motivations, mathematical formulations, advantages, and practical challenges such as entropy collapse and importance‑sampling variance.

DAPO · GRPO · GSPO
0 likes · 30 min read
Tencent Cloud Developer
Sep 4, 2025 · Artificial Intelligence

Why Youtu-Agent Sets a New Standard for Open‑Source AI Agents

Youtu-Agent, an open‑source agent framework released by Tencent Youtu Lab, combines minimalist design with high performance, delivers strong benchmark results without training or proprietary models, and offers flexible, cost‑effective, automated agent generation for researchers, developers, and AI enthusiasts.

AI Agents · Benchmark · Framework
0 likes · 12 min read
Alibaba Cloud Developer
Sep 4, 2025 · Artificial Intelligence

Why Context Engineering Is the New Frontier in LLM Development

This article explores the rise of Context Engineering as an essential discipline for large language models, comparing it to Prompt Engineering, detailing its definition, classifications, common pitfalls such as poisoning and distraction, and presenting best‑practice strategies and an LLM‑OS analogy for building robust AI agents.

LLM · LLM OS · memory management
0 likes · 27 min read
Aikesheng Open Source Community
Sep 4, 2025 · Artificial Intelligence

How GPT‑5, DeepSeek‑V3.1 and SQLShift Stack Up in the August 2025 SQL LLM Benchmark

The August 2025 SCALE benchmark evaluates new AI models—including the GPT‑5 family, DeepSeek‑V3.1, and the SQLShift tool—across SQL understanding, optimization, and dialect conversion, revealing distinct strengths, weaknesses, and the growing advantage of specialized tools over generic large language models.

AI · Benchmark · DeepSeek
0 likes · 15 min read
Sohu Tech Products
Sep 3, 2025 · Artificial Intelligence

How GRPO Revolutionizes RLHF for Large Language Models

This article explains the motivation, mathematical foundations, implementation details, advantages, experimental results, and future directions of Group Relative Policy Optimization (GRPO), a novel reinforcement‑learning algorithm that replaces PPO’s value network with efficient group‑wise relative evaluation for large language models.

Artificial Intelligence · GRPO · LLM
0 likes · 17 min read
DataFunSummit
Sep 3, 2025 · Artificial Intelligence

Demystifying MCP: A Simple Guide to Building LLM Tool Integration Servers

This article explains the Model Context Protocol (MCP), its three‑layer architecture, and its core advantages, then walks step by step through developing an MCP server in TypeScript (with Python and C++ examples), showing how LLMs can invoke tools for tasks like Unreal Engine code analysis.

LLM · Python · TypeScript
0 likes · 16 min read
37 Interactive Technology Team
Sep 3, 2025 · Artificial Intelligence

How AI is Revolutionizing Web Scraping: Tools, Techniques, and Best Practices

Discover how AI, especially large language models, transforms traditional web scraping by introducing semantic understanding, dynamic adaptability, and automated extraction, with in‑depth reviews of emerging tools like Crawl4AI and Browser‑use, practical code examples, best‑practice guidelines, and deployment tips for modern data collection.

AI · Browser Use · Crawl4AI
0 likes · 17 min read
Baobao Algorithm Notes
Sep 3, 2025 · Artificial Intelligence

How Atom-Searcher Boosts LLM Reasoning with Atomic Thought Rewards

Atom-Searcher introduces an atomic‑thought reinforcement‑learning framework that decomposes complex reasoning into fine‑grained units, uses a Reasoning Reward Model to assign step‑wise rewards, dynamically balances process and result incentives, and achieves state‑of‑the‑art performance on multiple LLM benchmarks.

Agentic Research · Atomic Thought · LLM
0 likes · 12 min read
Cognitive Technology Team
Sep 3, 2025 · Artificial Intelligence

How to Build AI Agents that Auto‑Generate Helm Charts: Strategies, Pitfalls, and Best Practices

This article chronicles the author's hands‑on journey of designing AI agents to automatically generate Helm charts for open‑source applications, exploring agent role definition, behavior paradigms like ReAct and plan‑and‑execute, prompt engineering challenges, structured workflows, multi‑agent collaboration, and practical lessons for reliable, production‑grade automation.

AI Agents · Agent Frameworks · Helm chart automation
0 likes · 29 min read
Architects Research Society
Sep 2, 2025 · Artificial Intelligence

What Really Sets True Agentic AI Apart from Pseudo‑Agent Systems?

The article contrasts pseudo‑agent AI—such as simple LLM chatbots, RPA scripts, and RAG systems—with genuine agentic AI architectures that combine large language models, orchestrators, memory stores, tool‑calling, planning modules, and multi‑agent collaboration, highlighting key capabilities like autonomous planning, feedback loops, and dynamic tool coordination.

Autonomous Planning · LLM · Orchestrator
0 likes · 3 min read
DataFunSummit
Sep 2, 2025 · Artificial Intelligence

How Ant Group’s Ray‑Powered Ragent Redefines LLM‑Based AI Agents

This article introduces Ant Group’s Ray‑based distributed agent framework Ragent, outlines its background, motivation, and design, and breaks down the four essential modules—Profile, Memory, Planning, and Action—that enable large‑language‑model agents to operate in real‑world scenarios.

Ant Group · Distributed Systems · LLM
0 likes · 5 min read
Coder Circle
Sep 2, 2025 · Artificial Intelligence

Unlocking the New Era of AI Development: Exploring Spring AI Core Classes

This article walks through Spring AI’s three core classes—Message, Prompt, and ChatModel—explaining their roles, showing concrete code examples for constructing messages, building prompts, and invoking a large language model via a REST controller, and provides a complete demo repository.

ChatModel · Java · LLM
0 likes · 3 min read
Data Party THU
Sep 1, 2025 · Artificial Intelligence

Why Intermediate Tokens Make LLMs Reason Better: Insights from Denny Zhou

The article analyzes Denny Zhou's Stanford CS25 lecture on large language model reasoning, explaining how intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation together unlock powerful reasoning capabilities beyond traditional greedy decoding.

AI research · LLM · Prompt Engineering
0 likes · 17 min read
Bighead's Algorithm Notes
Aug 31, 2025 · Artificial Intelligence

Paper Review: AlphaEval – A Comprehensive, Efficient Framework for Evaluating Alpha Mining

AlphaEval is a unified, parallelizable evaluation framework that assesses Alpha mining models across predictive ability, time stability, market‑perturbation robustness, financial logic, and diversity without backtesting, matching full backtest results while offering higher efficiency and open‑source reproducibility.

Alpha Mining · Evaluation Framework · LLM
0 likes · 10 min read
JD Retail Technology
Aug 29, 2025 · Artificial Intelligence

Turning a General LLM into an E‑commerce Risk‑Detection Expert: A Step‑by‑Step Prompt Engineering Guide

The article recounts how a risk‑control algorithm engineer transformed a generic large language model into a specialized e‑commerce fraud detector by iteratively designing prompts, injecting business rules, structuring I/O, and introducing a dual‑hypothesis decision framework to achieve accurate, automated risk analysis.

Artificial Intelligence · LLM · Prompt Engineering
0 likes · 11 min read
Bighead's Algorithm Notes
Aug 28, 2025 · Artificial Intelligence

Key AI-Driven Quantitative Finance Papers from KDD2025

This article summarizes recent AI research on quantitative finance, covering AlphaAgent's LLM-driven alpha mining, UMI's multi‑level irrationality factors, PDU's progressive dependency learning for stock ranking, SSPT's stock‑specific pretraining transformer, and Enhancer's distribution‑aware meta‑learning framework, all of which demonstrate improved stock prediction and resistance to alpha decay.

Alpha Mining · Financial AI · LLM
0 likes · 9 min read
IT Services Circle
Aug 28, 2025 · Artificial Intelligence

Why DeepSeek V3.1 Keeps Spitting the ‘Extreme’ Token and How to Fix It

Developers using DeepSeek V3.1's API have reported that the model intermittently inserts the Chinese character “极” (or its variants) into generated code, a bug that spreads across multiple platforms and threatens high‑precision code generation, prompting community workarounds and speculation about its root causes.

AI model bug · DeepSeek · LLM
0 likes · 6 min read
Aikesheng Open Source Community
Aug 28, 2025 · Artificial Intelligence

How Does DeepSeek‑V3.1 Perform on Professional SQL Tasks? A Detailed Benchmark

This report objectively evaluates DeepSeek‑V3.1 on professional‑grade SQL tasks, presenting its balanced strengths in understanding, optimization, and dialect conversion, highlighting its top scores in syntax error detection and Chinese database conversion while exposing weaknesses in execution‑plan analysis and large‑SQL transformations.

Artificial Intelligence · DeepSeek · LLM
0 likes · 8 min read
Fun with Large Models
Aug 28, 2025 · Artificial Intelligence

A Deep Dive into LangGraph: Understanding the New Graph‑Based AI Agent Framework

The article compares LangGraph with LangChain, explains why a graph‑based architecture offers greater flexibility than linear chains, and outlines LangGraph’s three‑layer core architecture and its ecosystem tools, including LangSmith, LangGraph Studio, the CLI, and Agent Chat UI, while noting its reliance on LangChain and the need for a VPN when using the CLI.

AI Agents · Graph Workflow · LLM
0 likes · 11 min read
Alibaba Cloud Native
Aug 27, 2025 · Artificial Intelligence

How LoongSuite Enables Full‑Stack Observability for LLM Applications

The article explains the rapid evolution of the AI application ecosystem, outlines the challenges of end‑to‑end observability for large‑language‑model services, and details how the open‑source LoongSuite suite—through non‑intrusive instrumentation for Python and Go agents and tight integration with the Dify platform—provides comprehensive, cloud‑native monitoring, tracing, and metric collection across the entire AI stack.

AI · Dify · Instrumentation
0 likes · 19 min read
Wuming AI
Aug 26, 2025 · Artificial Intelligence

A Layered Overview of Agentic AI: From LLM Foundations to Multi‑Agent Systems

This article presents a hierarchical breakdown of Agentic AI, detailing the foundational large language models, the capabilities of AI agents, the coordination mechanisms of multi‑agent systems, and the supporting infrastructure needed for reliability, scalability, and security.

AI Agents · LLM · Observability
0 likes · 5 min read
DataFunSummit
Aug 25, 2025 · Artificial Intelligence

Building Xiaomi’s Vertical Domain QA Agent: From RAG to Real‑World Deployment

This article explains how Xiaomi designed and deployed a vertical‑domain question‑answering assistant for product and car queries, covering business background, a four‑module RAG‑plus‑LLM architecture, knowledge‑base construction, custom chunking strategies, dynamic signal handling, and the challenges overcome to achieve reliable real‑time voice interactions.

Agent Architecture · LLM · RAG
0 likes · 22 min read
DataFunSummit
Aug 24, 2025 · Artificial Intelligence

Unlocking LLM Efficiency: Asymmetry, Token Compression, and Quantization Insights

This article examines the core mechanisms of large language models, revealing asymmetric token behaviors, novel token‑compression techniques, scaling‑law theory, and mixed‑precision quantization methods that together boost inference efficiency while dramatically reducing model size.

Artificial Intelligence · LLM · token compression
0 likes · 26 min read
Data Party THU
Aug 22, 2025 · Artificial Intelligence

How BAML Turns a 25% Success Rate into 99%+ for Knowledge‑Graph Extraction with Small LLMs

This article presents a systematic study of extracting knowledge graphs from unstructured news articles using small quantized LLMs, exposing the brittleness of LangChain's JSON‑based pipelines, evaluating prompt‑engineering fixes, and introducing the BAML framework whose fuzzy parsing and concise schema raise extraction success from roughly 25% to over 99% on a 344‑document benchmark.

BAML · GraphRAG · LLM
0 likes · 33 min read
Ctrip Technology
Aug 22, 2025 · Artificial Intelligence

How AI Can Auto‑Generate Test Cases from PRDs and Cut Design Time by Up to 70%

This article explains how an AIGC‑driven solution uses large language models, prompt engineering, and a layered architecture built on Flask and LangChain to automatically transform product requirement documents into structured, BDD‑style test cases, achieving 89% adoption and up to 70% time reduction.

AI testing · AIGC · Flask
0 likes · 9 min read
Data Thinking Notes
Aug 21, 2025 · Artificial Intelligence

Why Intermediate Tokens Matter: Denny Zhou’s Deep Insights into LLM Reasoning

This article distills Denny Zhou’s Stanford CS25 lecture, explaining how large language models achieve reasoning through intermediate token generation, chain‑of‑thought prompting, self‑consistency, reinforcement‑learning fine‑tuning, and answer aggregation, while highlighting theoretical foundations and practical breakthroughs.

LLM · Reasoning · chain-of-thought
0 likes · 18 min read
21CTO
Aug 21, 2025 · Artificial Intelligence

Why Most AI Agent Projects Fail and How to Benchmark Their Capabilities

The article analyzes why AI agent initiatives often flop compared to traditional software, explains the fundamental differences in development approaches, and introduces a three‑step Agent Capability Benchmark Testing framework with concrete evaluation criteria and a practical weekly‑report agent example.

AI Agents · LLM · Prompt Engineering
0 likes · 12 min read