Tagged articles

2079 articles

Aug 21, 2025 · Artificial Intelligence

Why Prompt Engineering Isn’t Enough: The Rise of Context Engineering and RAG

Since last year, the “Prompt Engineering” debate has split into two camps: practitioners who favor “Context Engineering” for building scalable agent systems, and scholars who treat Prompt Engineering as a broad umbrella term. The split highlights the need to dynamically construct and manage context for reliable, extensible AI applications.

AI agents · LLM · Prompt Engineering

0 likes · 33 min read

Alibaba Cloud Native

Aug 21, 2025 · Cloud Native

How Higress AI Gateway Optimizes LLM Load Balancing with Global, Prefix, and GPU‑Aware Algorithms

This article explains why traditional load‑balancing methods fall short for large language model services and introduces Higress AI Gateway's three specialized algorithms—global minimum‑request, prefix‑matching, and GPU‑aware load balancing—detailing their design, Redis‑based implementation, deployment steps, and performance gains.

GPU · LLM · Redis

0 likes · 11 min read

Instant Consumer Technology Team

Aug 21, 2025 · Artificial Intelligence

How Data‑Juicer Supercharges LLM Training with High‑Quality Multimodal Data

Data‑Juicer is an open‑source, one‑stop multimodal data processing system that provides fine‑grained operators, scalable pipelines, and ready‑made recipes to deliver high‑quality, diverse, and model‑friendly data for large language model pre‑training, fine‑tuning, and multimodal applications.

AI · LLM · Open Source

0 likes · 22 min read

Alibaba Cloud Developer

Aug 21, 2025 · Artificial Intelligence

Why Your AI Defect Deduplication Returns Mixed Data and How to Fix It

This article details the challenges of building an AI‑powered defect deduplication system using Retrieval‑Augmented Generation, explains why LLMs produce composite (spliced) results, diagnoses the root cause as information loss in the RAG pipeline, and presents a step‑by‑step solution that restores atomicity of records for reliable duplicate detection.

AI debugging · Knowledge Base · LLM

0 likes · 14 min read

JD Tech

Aug 20, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy with J‑Schema, Iterative DPO, and Self‑Consistency

This article examines the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, applies iterative DPO training and self-consistency voting, and demonstrates how these techniques raise execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD benchmark · Iterative DPO · J-Schema

0 likes · 11 min read

Alibaba Cloud Big Data AI Platform

Aug 20, 2025 · Artificial Intelligence

How DeepSearch Elevates RAG: From RAG 1.0 to a Multi‑Agent AI Search Engine

This article explains how the LLM edition of Alibaba Cloud OpenSearch evolved from RAG 1.0 to RAG 2.0, introducing the DeepSearch multi‑agent architecture, which combines offline data processing, online query handling, and planning, clarification, search, and summarization agents to deliver more accurate answers to complex queries.

AI Search · DeepSearch · LLM

0 likes · 10 min read

Volcano Engine Developer Services

Aug 20, 2025 · Artificial Intelligence

What Is Vibe Coding? Exploring the AI‑Driven Programming Paradigm

Vibe Coding, an AI‑centric programming paradigm introduced by Andrej Karpathy, replaces traditional code‑centric development with natural‑language‑driven interaction: developers act as product‑focused guides while large language models generate the code. This article covers its tools, workflows, benefits, challenges, and future trends.

AI coding · LLM · Vibe Coding

0 likes · 26 min read

Data Party THU

Aug 20, 2025 · Artificial Intelligence

How Large-Scale Corpus Rewriting is Shaping LLM Training: A Deep Dive into K2, WRAP, and Beyond

This article surveys recent large‑scale corpus rewriting techniques for LLM pre‑training, covering K2’s token‑utilization strategies, domain‑specific methods like SwallowMath/Code, reStructured pretraining, the WRAP pipeline, Nemotron‑CC filtering, Pro‑X noise removal, and the MAGA multi‑style expansion, while highlighting challenges, experimental findings, and open research questions.

LLM · corpus rewriting · data synthesis

0 likes · 20 min read

Rare Earth Juejin Tech Community

Aug 20, 2025 · Artificial Intelligence

How to Build an AI-Powered Code Review System with Node.js and LLMs

This article walks through designing and implementing an AI code review tool using Node.js, GitLab webhooks, and large language models, covering prompt engineering, diff augmentation, token management, response parsing, and automated comment posting to streamline the review process.

AI · Code Review · GitLab

0 likes · 25 min read

Instant Consumer Technology Team

Aug 19, 2025 · Artificial Intelligence

Mastering Document Chunking for RAG: Strategies, Code & Best Practices

This article explores why proper document chunking is crucial for Retrieval‑Augmented Generation, explains core concepts like context windows and signal‑to‑noise, compares various chunking strategies—from simple fixed‑size splits to semantic and hybrid approaches—and provides practical Python code examples to help you build more effective RAG pipelines.

LLM · RAG · Text Splitting

0 likes · 24 min read

Volcano Engine Developer Services

Aug 19, 2025 · Artificial Intelligence

How to Strengthen LLM System Prompts for Safer AI Agents

This guide explains how to reinforce system prompts for AI agents by optimizing their content and structure, using active defense, role‑based, and format constraints, providing practical examples, measurement methods, and experimental results that demonstrate up to 90% reduction in unsafe behavior.

AI safety · LLM · System Prompt

0 likes · 13 min read

Data Party THU

Aug 19, 2025 · Artificial Intelligence

Why RL Fine‑Tuning Fails to Extend LLM Reasoning Limits: Entropy Collapse Explained

This article examines how reinforcement learning fine‑tuning influences large language model reasoning, revealing that RL primarily amplifies pre‑trained capabilities, suffers from entropy collapse, and fails to push the model’s reasoning boundary, supported by extensive experiments on scaling laws, entropy analysis, and mitigation techniques.

LLM · RL · RLVR

0 likes · 24 min read

Tencent Cloud Developer

Aug 19, 2025 · Artificial Intelligence

Demystifying LLMs: From Transformers to Agents, Prompts, and Function Calling

This article explains the fundamentals of large language models, covering transformer self‑attention, prompt engineering, API usage with temperature and tool parameters, function calling, agent architectures, the Model Context Protocol (MCP), Agent‑to‑Agent (A2A) communication, and future AI programming roles.

A2A · AI agents · Function Calling

0 likes · 11 min read

Kuaishou Tech

Aug 18, 2025 · Artificial Intelligence

How Klear-Reasoner Achieves SOTA Math & Code Reasoning with GPPO Optimization

The Klear‑Reasoner model, built on Qwen3‑8B‑Base and powered by the novel Gradient‑Preserving Clipping Policy Optimization (GPPO) algorithm, surpasses same‑size open‑source baselines on challenging math (AIME) and code (LiveCodeBench) benchmarks, while revealing key insights on data quality, reward design, and clipping strategies for large‑language‑model reasoning.

GPPO · LLM · code reasoning

0 likes · 11 min read

Data Party THU

Aug 17, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Unpacking the Probabilistic Roots and Fixes

Large language models often generate confident but false statements—a phenomenon called hallucination—because they predict the next token based on statistical patterns rather than factual understanding, and this article explains the underlying mechanisms and practical mitigation strategies.

Knowledge Distillation · LLM · RLHF

0 likes · 11 min read

Qborfy AI

Aug 16, 2025 · Artificial Intelligence

Mastering LLM Tokens: How They Work, Cost, and Choose the Right Model

This article explains what tokens are in large language models, how they are counted and priced, compares tokenization methods across major models, and provides practical guidelines and code examples for optimizing token usage and selecting the appropriate model for different scenarios.

AI · LLM · Model selection

0 likes · 8 min read

DaTaobao Tech

Aug 15, 2025 · Mobile Development

How to Eliminate Text Lag in iOS LLM Chat Apps with Smart Buffering and Typewriter Animation

This article explains how to eliminate stuttered text output in iOS chat applications powered by local LLMs using the MNN framework, by introducing a three‑layer optimization—smart stream buffering, UI update throttling with batch processing, and a typewriter‑style animation—to achieve smooth, near‑online responsiveness.

C++ · LLM · MNN

0 likes · 16 min read

Instant Consumer Technology Team

Aug 15, 2025 · Artificial Intelligence

Why Building Enterprise AI Agents Feels Like Building a Distributed Brain

An engineer recounts the hard‑earned lessons from moving beyond RAG to enterprise‑level AI agents, exposing three critical challenges—scheduling, memory management, and tool integration—and proposes architectural patterns that turn fragile prototypes into robust, observable, and secure AI systems.

AI agents · Enterprise AI · LLM

0 likes · 9 min read

Baobao Algorithm Notes

Aug 15, 2025 · Artificial Intelligence

Unlocking LLM Performance: Classic Deep RL Tricks Reimagined for Modern Training

This article systematically adapts classic deep reinforcement‑learning techniques—such as multi‑step returns, TD(λ)/GAE, V‑trace corrections, uncertainty‑aware weighting, safety constraints, distribution‑robust optimization, and value‑guided decoding—to improve large language model training and inference, providing concrete formulas, implementation tips, and empirical results.

Deep RL · GAE · LLM

0 likes · 17 min read

Alibaba Cloud Developer

Aug 15, 2025 · Artificial Intelligence

Mastering AI Agents: Prompt Engineering, Workflows, and RAG Strategies

This article systematically explains how to build reliable, high‑performance AI agents by focusing on the core components—LLM, prompts, workflows, RAG, and tools—while covering prompt engineering techniques, DSL‑based workflow design, vector‑database knowledge bases, security against prompt injection, and practical project planning.

AI Agent · LLM · RAG

0 likes · 15 min read

Tencent Technical Engineering

Aug 14, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate? Causes, Risks, and Multi‑Dimensional Solutions

This article systematically examines the root causes of hallucinations in large language models, evaluates their pros and cons, and presents a comprehensive set of optimization techniques—including prompt engineering, RAG, sampling tweaks, supervised fine‑tuning, LoRA, RLHF, chain‑of‑thought reasoning, and agent/workflow designs—to build more reliable and trustworthy AI applications.

AI · LLM · LoRA

0 likes · 29 min read

Data Party THU

Aug 14, 2025 · Artificial Intelligence

How FilterLLM Turns One LLM Pass into Billion‑User Cold‑Start Recommendations

The article analyzes the FilterLLM approach, which augments a frozen LLM with billions of learnable user tokens to predict a full‑user interaction probability distribution in a single forward pass, dramatically speeding up cold‑start recommendation while preserving recommendation quality across multiple benchmarks.

AI · FilterLLM · LLM

0 likes · 8 min read

Huolala Tech

Aug 14, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Natural Language to SQL for Intelligent Data Queries

This article explores how large language models break the natural‑language‑to‑SQL barrier, outlines the challenges of NLP‑driven data retrieval, compares Text2SQL and Text2DSL approaches, and proposes a unified data service and metric platform to power enterprise‑grade ChatBI solutions.

AI · ChatBI · Data Engineering

0 likes · 22 min read

JD Cloud Developers

Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article presents a comprehensive study on improving Text-to-SQL performance by introducing J‑Schema for structured schema representation, applying iterative Direct Preference Optimization (DPO) training, and leveraging self‑consistency voting mechanisms, achieving up to a 12% accuracy gain on the BIRD benchmark.

Database QA · Iterative DPO · J-Schema

0 likes · 10 min read

JD Retail Technology

Aug 14, 2025 · Artificial Intelligence

Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article surveys the evolution of Text-to-SQL, introduces the J‑Schema representation and chain-of-thought prompting, details an iterative DPO training pipeline with hyper‑parameter tuning, and demonstrates how self‑consistency voting boosts execution accuracy on the BIRD benchmark from 56.6% to 69.2%.

BIRD dataset · Iterative DPO · LLM

0 likes · 14 min read

Youzan Coder

Aug 13, 2025 · Artificial Intelligence

Understanding AI Agents: Core Modules, Planning Strategies, and Evaluation

This article explains what an AI agent is, outlines its four core modules—perception, memory, planning, and action—describes the role of large language models, compares software development generations, discusses memory implementations, planning methods like ReAct and Plan‑and‑Solve, and covers evaluation, cost analysis, and differences between agents and workflows.

AI · LLM · Memory

0 likes · 15 min read

Zhongtong Tech

Aug 13, 2025 · Artificial Intelligence

Unlock Seamless AI‑Tool Interaction with the Model Context Protocol (MCP)

The Model Context Protocol (MCP) is an open‑source interface that standardizes how large language models interact with external data sources and tools, offering a USB‑C‑like universal connector for AI applications, with built‑in session management, security, and flexible HTTP/SSE transport for seamless real‑world integration.

AI Integration · Data Security · LLM

0 likes · 7 min read

Data Party THU

Aug 12, 2025 · Artificial Intelligence

How SWE‑Swiss Enables a 32B Model to Match Larger LLMs on Software Engineering Tasks

Researchers from Peking University, ByteDance Seed, and Hong Kong University present SWE‑Swiss, a 32‑billion‑parameter model that, through a two‑stage training recipe and enhanced self‑consistency, achieves 60.2% accuracy on SWE‑bench Verified, matching larger models while remaining fully open‑source.

LLM · SWE‑Swiss

0 likes · 8 min read

Data Party THU

Aug 12, 2025 · Artificial Intelligence

Unlocking Chain-of-Thought: How AI Reasoning Boosts Accuracy Across Domains

Chain‑of‑Thought (CoT) enables large language models to solve complex tasks by breaking problems into sequential reasoning steps, improving accuracy in mathematics, commonsense, code generation, business strategy, and medical diagnosis, while highlighting its principles, advantages, challenges, and future prospects.

LLM · chain-of-thought · machine learning

0 likes · 13 min read

Qborfy AI

Aug 12, 2025 · Artificial Intelligence

What Powers Large Language Models? A Deep Dive into LLM Architecture and Scaling

This article explains how massive Transformer‑based large language models compress text data into mathematical representations, why scale, self‑attention, and training paradigms enable emergent general intelligence, and walks through tokenization, embedding, multi‑layer attention, architecture choices, energy costs, and hallucination mitigation.

AI · Embedding · LLM

0 likes · 6 min read

Huolala Tech

Aug 12, 2025 · Information Security

Can AI Boost Traditional SAST to Detect Complex Logic Bugs?

This article explores a hybrid approach that combines traditional static application security testing (SAST) with large language models (LLM) to automatically detect business‑logic vulnerabilities, detailing the methodology, implementation stages, experimental results, and the challenges of integrating AI into code security analysis.

AI · LLM · SAST

0 likes · 15 min read

Liangxu Linux

Aug 11, 2025 · Artificial Intelligence

Four Must‑Try Open‑Source AI Tools: Gemini CLI, XiaoZhi Bot, AI Hub, GPT‑Pilot

This article introduces four notable open‑source AI projects—Google's Gemini CLI, the voice‑interactive XiaoZhi chatbot, the comprehensive AI Engineering Hub, and the GPT‑Pilot programming companion—detailing their key features, generous free quotas, star counts, supported hardware, and providing direct GitHub repository links for each.

AI · Chatbot · Gemini CLI

0 likes · 5 min read

Alibaba Cloud Developer

Aug 11, 2025 · Artificial Intelligence

How Fine‑Tuning Large Models Solves Code Upgrade Challenges and Boosts Stable Module Matching

This article details an innovative approach that uses large‑model supervised fine‑tuning to overcome the instability of code RAG and code agents during open‑source repository upgrades, addressing domain‑specific terminology, code style differences, and improving recall, accuracy, and deployment efficiency.

AI agents · LLM · RAG

0 likes · 11 min read

Data Party THU

Aug 11, 2025 · Artificial Intelligence

What Sets the Latest LLMs Apart? A Deep Dive into V3, OLMo, Gemma, Mistral, Llama 4 and More

This article systematically compares the architectures of recent large language models—including DeepSeek V3/R1, OLMo 2, Gemma 3, Mistral Small 3.1, Llama 4, Qwen 3, SmolLM 3 and Kimi 2—highlighting innovations such as MLA, MoE, post‑norm, sliding‑window attention, NoPE and optimizer choices, with diagrams and code examples to illustrate their impact on efficiency and performance.

LLM · MLA · MoE

0 likes · 12 min read

AI Large Model Application Practice

Aug 11, 2025 · Artificial Intelligence

How to Build an LLM-Powered Smart Resume Screening System

This article presents a detailed design and implementation of an LLM‑based intelligent resume matching system that combines semantic vector retrieval, structured rule filtering, multi‑dimensional weighted scoring, and natural‑language interaction to create a fast, quantifiable, and explainable hiring pipeline.

AI Recruitment · LLM · RAG

0 likes · 18 min read

Wuming AI

Aug 11, 2025 · Industry Insights

Why LLMs Overthink and How Developers Can Control Inference Depth

Developers notice that large language models often enter an "overthinking" mode that slows down simple coding tasks, prompting calls for adjustable inference depth controls so models can switch between quick checks and deep analysis based on task risk level.

AI usability · Developer Experience · LLM

0 likes · 5 min read

Data Party THU

Aug 10, 2025 · Artificial Intelligence

Can Evolutionary Algorithms Auto-Design Training-Free Vision-Language Model Adaptations?

This study introduces EvoVLMA, an evolutionary vision-language model adaptation framework that automatically searches for training-free VLM adaptation algorithms using two-stage LLM-guided evolution. It reports superior performance, such as a 1.91% accuracy gain on 8-shot image classification, and the code is publicly released.

Evolutionary Algorithms · LLM · Model Adaptation

0 likes · 5 min read

Data Party THU

Aug 10, 2025 · Artificial Intelligence

Can LLMs Predict Multiple Tokens at Once? A Deep Dive into Multi‑Token Generation

This article evaluates whether autoregressive large language models can generate several tokens in a single inference step, describing a mask‑based multi‑token prediction framework, gated LoRA adaptation, experimental results on Tulu‑3‑8B showing up to 5.2× speedup, and discusses implications for future research.

AI efficiency · LLM · Multi-token generation

0 likes · 13 min read

Sohu Smart Platform Tech Team

Aug 9, 2025 · Artificial Intelligence

Deploying Large Language Models Offline on Mobile Devices: A Practical Guide

This article explains the challenges of running large language models on mobile devices, reviews recent industry efforts, and provides a step‑by‑step guide—including code snippets—for integrating a distilled GPT‑2 model with Sohu's Hybrid AI Engine using TensorFlow Lite and Keras‑NLP for on‑device inference.

Hybrid AI · Keras · LLM

0 likes · 10 min read

Alibaba Cloud Big Data AI Platform

Aug 8, 2025 · Artificial Intelligence

Can GitOps Power Low‑Cost LLM Agents? A Hands‑On Exploration

This article examines how the Manus sandbox and CodeAct mechanisms inspire a GitOps‑based approach to building LLM agents, detailing the design of planner and executor components, workflow steps, advantages such as RAG and observability, and the potential for low‑cost, scalable intelligent agent development.

AI agents · GitOps · Intelligent agents

0 likes · 12 min read

Tencent Technical Engineering

Aug 8, 2025 · Artificial Intelligence

Which Open‑Source Deep‑Research Agent Framework Is Best? A Comprehensive Comparison

This article systematically compares major open‑source deep‑research agent frameworks—including DeerFlow, SmolAgents, LangChainAI, SkyworkAI, and Researcher—detailing their architectures, best practices, and commercial alternatives, to help developers and users choose the most suitable tool for automated research workflows.

AI automation · LLM · deep research

0 likes · 27 min read

Tencent Cloud Developer

Aug 8, 2025 · Artificial Intelligence

Mastering AI Agents: A Practical Guide to Building Effective Workflows and Tools

This comprehensive guide explains when to use AI agents, presents core design patterns such as prompt chains, routing, parallelization, orchestrator‑worker and eval‑optimize loops, and offers concrete implementation advice and tool‑prompt engineering techniques for building reliable, high‑quality agent systems.

LLM · Prompt Engineering · tool engineering

0 likes · 24 min read

Amap Tech

Aug 7, 2025 · Artificial Intelligence

Boosting Codebase Upgrades with Code RAG and Agent‑Driven Fine‑Tuning

This article describes how the Gaode terminal team tackled large‑scale repository upgrades by building a code‑RAG and code‑Agent tool, addressing recall and stability issues, then fine‑tuning a small LLM (Qwen3‑4B) with LoRA and custom datasets to achieve reliable, low‑cost, on‑device code‑query performance.

Code Agent · Knowledge Graph · LLM

0 likes · 11 min read

Full-Stack Cultivation Path

Aug 7, 2025 · Artificial Intelligence

Getting Started with MCP: From Core Concepts to Building Server and Client

This article explains why the Model Context Protocol (MCP) is needed for LLMs, describes its client‑server architecture, data and transport layers, and provides step‑by‑step Python examples for creating both an MCP server and a client using FastMCP and low‑level APIs.

LLM · MCP · Model Context Protocol

0 likes · 18 min read

Alibaba Cloud Big Data AI Platform

Aug 6, 2025 · Artificial Intelligence

How to Build an Enterprise‑Grade Vector Search Q&A Bot with Milvus and n8n

This article explains how to combine Alibaba Cloud Milvus, a high‑performance vector database, with the low‑code workflow platform n8n to create an enterprise‑level, domain‑specific intelligent Q&A system, covering challenges, architecture, setup, workflow configuration, and verification steps.

LLM · Milvus · n8n

0 likes · 18 min read

Architect's Alchemy Furnace

Aug 4, 2025 · Artificial Intelligence

How RAG and Long‑Term Memory Turn AI into a Truly Remembering Assistant

This article explains how Retrieval‑Augmented Generation (RAG) and long‑term memory systems like MenoBase enable large language models to overcome short‑term memory limits, dynamically retrieve up‑to‑date knowledge, and personalize interactions, with practical Dify implementation steps and real‑world use cases across industries.

AI · Dify · Knowledge Base

0 likes · 18 min read

Go Programming World

Aug 1, 2025 · Fundamentals

Should Go Developers Learn Python? Avoid the Syntax‑Translation Pitfall

In the AI‑driven era, Go programmers face a dilemma about mastering Python, and this article explains when Python is essential, how to prevent naïve code translation, and provides concrete examples of writing truly Pythonic code versus direct Go‑style conversions.

AI · Code Translation · Go

0 likes · 7 min read

Ctrip Technology

Aug 1, 2025 · Artificial Intelligence

How Semantic Search Transforms Hotel Booking: From Entity Recognition to Vector Retrieval

This article explores how Ctrip leverages advanced AI techniques—including named entity recognition, entity linking, large language models, and vector search—to replace traditional keyword queries with semantic search, dramatically improving hotel recommendation accuracy and user experience.

AI · LLM · Vector Retrieval

0 likes · 14 min read

Baidu Maps Tech Team

Jul 31, 2025 · Artificial Intelligence

How Baidu’s AI Voice Assistant Turns Speech into Precise Navigation Commands

This article explains how Baidu Map’s AI voice assistant converts spoken commands into precise navigation actions by detailing the speech‑to‑text pipeline, intent parsing, template and generative approaches, tool‑calling mechanisms, memory and reflection capabilities, and future directions for intelligent agents.

AI · Intent Parsing · LLM

0 likes · 14 min read

Data Party THU

Jul 31, 2025 · Industry Insights

How mini‑SWE‑agent Solves 65% of SWE‑bench Bugs with Only 100 Lines of Code

The mini‑SWE‑agent, a lightweight open‑source software‑engineering AI built by the original SWE‑bench team, achieves about 65% bug‑fix success on the SWE‑bench benchmark using roughly 100 lines of Python, thanks to its minimal dependencies, shell‑based execution, linear history, and support for various container environments, offering a fast, extensible alternative to the full‑featured SWE‑agent.

AI Agent · LLM · Open Source

0 likes · 8 min read

Alibaba Cloud Developer

Jul 31, 2025 · Artificial Intelligence

Why Post‑Training Matters: Scaling Laws, Fine‑Tuning, and RL Strategies for LLMs

This article explores the importance of post‑training for large language models, explains scaling laws for pre‑ and post‑training, details common fine‑tuning methods (full, PEFT, LoRA), outlines alignment techniques such as RLHF, DPO, PPO, and presents practical workflows using Llama 3 and DeepSeek‑R1, while also discussing test‑time reasoning optimizations.

LLM · RLHF · alignment

0 likes · 19 min read

Data Party THU

Jul 30, 2025 · Artificial Intelligence

When Metrics Mislead: Uncovering Simpson’s, Accuracy, and Goodhart Paradoxes in LLMs

The article examines three classic paradoxes—Simpson’s paradox, the accuracy paradox, and Goodhart’s law—showing how they arise in business intelligence and large language model contexts, and offers practical guidelines to detect and mitigate their misleading effects on data‑driven decisions.

Goodhart's law · LLM · Metrics

0 likes · 12 min read

AsiaInfo Technology: New Tech Exploration

Jul 30, 2025 · Artificial Intelligence

How MCP‑RAG Overcomes Prompt Inflation for Massive LLM Service Calls

This article analyzes the prompt‑inflation bottleneck that arises when large language models (LLMs) must handle thousands of Model Context Protocol (MCP) services, and introduces the MCP‑RAG architecture—a retrieval‑augmented generation solution that builds a metadata knowledge base and intelligent retrieval layer to enable precise, efficient MCP service discovery at scale.

AI · LLM · MCP

0 likes · 21 min read

Java Architecture Diary

Jul 30, 2025 · Artificial Intelligence

What’s New in LangChain4j 1.2.0? Key AI Features and Enhancements

LangChain4j 1.2.0 introduces a suite of stable modules, advanced inference and thinking capabilities, streaming tool calls, and extensive AI service enhancements, offering developers finer control, lower latency, and richer debugging for LLM‑driven applications.

AI · Java · LLM

0 likes · 7 min read

Ops Development Stories

Jul 29, 2025 · Artificial Intelligence

Master AI Agents with LangGraph: Build Adaptive RAG, Translation, and ReAct Agents

This comprehensive guide explains what an AI Agent is, its core capabilities and design patterns, and walks through step‑by‑step implementations of RAG, Translation, and ReAct agents using LangGraph, complete with code samples, workflow diagrams, and practical tips for building personal ops knowledge‑base agents.

LLM · LangGraph · RAG

0 likes · 64 min read

Ops Development & AI Practice

Jul 29, 2025 · Artificial Intelligence

Building a Retrieval‑Augmented Generation QA Bot to Keep LLMs Up‑to‑Date

This article explains how to create a RAG‑based intelligent QA system that fetches the latest documentation (e.g., PlantUML) before querying Gemini, detailing knowledge‑base creation, embedding, vector store management, LangChain integration, and deployment tips.

AI Assistant · Embedding · Gemini

0 likes · 8 min read

Alibaba Cloud Developer

Jul 29, 2025 · Artificial Intelligence

How to Transform Chaotic AI Prompts into Robust System Designs

This article examines the pitfalls of rule‑heavy prompt engineering, introduces a systematic four‑layer architecture for AI prompts, outlines six practical compilation principles, and demonstrates how to rewrite a tangled prompt into a clear, maintainable, and scalable system blueprint.

AI Architecture · LLM · Prompt Engineering

0 likes · 84 min read

Data Thinking Notes

Jul 27, 2025 · Databases

How Dify Turns Natural Language into SQL: Building Scalable Text2SQL Apps

This article explains how Text2SQL technology converts natural language queries into executable SQL using large language models, and demonstrates how the open‑source Dify platform’s visual workflow and component‑based development dramatically lower the barrier for building, validating, and deploying secure, low‑code Text2SQL applications.

AI · Dify · LLM

0 likes · 13 min read

Architecture and Beyond

Jul 27, 2025 · Artificial Intelligence

Why Context Engineering Is the Secret to Powerful AI Agents

This article explains how AI agents work through perception, planning, and action, describes the four supporting systems—memory, tools, safety, and evaluation—and shows how the evolution from prompt engineering to context engineering, with strategies like selective saving, retrieval, compression, and modularization, addresses the core challenges of managing large‑scale context for reliable, efficient agent performance.

AI agents · Context Engineering · LLM

0 likes · 17 min read

Full-Stack Cultivation Path

Jul 26, 2025 · Artificial Intelligence

Step-by-Step Local Deployment Guide for Coze Studio: Launch Your Low-Code AI Agent Development

This article provides a comprehensive, hands‑on tutorial for installing Ollama, Docker, and the open‑source Coze Studio on a local machine, configuring various LLM services such as Qwen 3, DeepSeek‑V3, and OpenRouter, and running the platform via Docker Compose to create and test AI agents.

Coze Studio · LLM · Local Deployment

0 likes · 7 min read

DaTaobao Tech

Jul 23, 2025 · Artificial Intelligence

How Alibaba’s New Distributed Agent Framework Solves 2C AI Challenges

Alibaba introduces the ali‑langengine‑dflow framework, a hybrid distributed‑agent architecture that moves core intelligence to the cloud while keeping execution reachable on heterogeneous client devices, addressing the data‑isolation, latency, and security issues of existing cloud‑VM and local‑agent solutions for consumer‑facing (2C) internet services.

AI · Distributed Systems · LLM

0 likes · 21 min read

FunTester

Jul 23, 2025 · Artificial Intelligence

Mastering Prompt Iteration: A Step‑by‑Step Guide to Effective LLM Collaboration

This article explains why a perfect answer from a large language model requires iterative prompt design, outlines a six‑step spiral loop for refining prompts, and offers practical tips such as starting with a minimal prompt, focusing on one improvement at a time, and preserving version history.

Artificial Intelligence · Best Practices · Iterative Design

0 likes · 5 min read

Go Programming World

Jul 23, 2025 · Artificial Intelligence

Directing Code with AI: How Vibe Coding Turns Natural Language into Software

Vibe Coding, a term introduced by Andrej Karpathy in 2025, lets developers describe software goals in natural language while large language models generate the code. This article covers how it reshapes the developer’s role, outlines the workflow, and discusses the tools, risks, and future prospects of this AI‑driven programming paradigm.

AI-driven development · LLM · Vibe Coding

0 likes · 6 min read

Code Mala Tang

Jul 22, 2025 · Artificial Intelligence

Convert Any PDF to Clean Markdown with a Local LLM (Gemma 3)

Learn how to transform any PDF—including scanned documents—into well‑structured Markdown using a local LLM (Gemma 3 via Ollama), Python, PyMuPDF and Pillow, without cloud APIs or API keys, by converting pages to images, prompting the model, and saving the output.

Gemma · LLM · Markdown

0 likes · 12 min read

DaTaobao Tech

Jul 18, 2025 · Artificial Intelligence

Build a Minimal Java ReAct Agent in 200 Lines: A Hands‑On Tutorial

This tutorial walks you through constructing a lightweight ReAct agent using Java, explaining the Thought‑Action‑Observation loop, providing a 200‑line code example, and demonstrating a real‑world approval workflow with prompts, tool definitions, and step‑by‑step interaction logs.

Java · LLM · Prompt Engineering

0 likes · 21 min read

Architect's Alchemy Furnace

Jul 17, 2025 · Artificial Intelligence

Explore the Ultimate Open-Source LLM Catalog: Models, Tools, and Resources

This article compiles a comprehensive, up‑to‑date inventory of open‑source large language models from Chinese and international organizations, detailing each model’s architecture, parameter count, multilingual capabilities, deployment requirements, and associated tools, offering a valuable reference for AI researchers and developers.

AI · LLM · Large Language Model

0 likes · 50 min read

Tencent Advertising Technology

Jul 17, 2025 · Artificial Intelligence

LEADRE: Knowledge‑Enhanced LLMs Supercharge Display Ad Recommendations

The paper introduces LEADRE, a multi‑faceted knowledge‑enhanced large language model‑driven display advertisement recommender that tackles user interest modeling, knowledge alignment, and low‑latency deployment, achieving significant GMV gains in Tencent’s ad platforms through innovative prompt engineering, semantic alignment, and TensorRT‑accelerated inference.

Knowledge Alignment · LLM · Prompt Engineering

0 likes · 16 min read

Tech Freedom Circle

Jul 17, 2025 · Artificial Intelligence

DeepSeek V3 Architecture Deep Dive: MoE, MLA, DualPipe, FP8 Mixed Precision & Multi‑Token Prediction

This article provides a detailed technical analysis of DeepSeek‑V3, covering its MoE architecture, the novel Multi‑head Latent Attention (MLA) mechanism, the DualPipe pipeline‑parallel algorithm, mixed‑precision FP8 training, and the Multi‑Token Prediction (MTP) inference improvements that together boost performance and efficiency.

DeepSeek · DualPipe · FP8

0 likes · 44 min read

Alimama Tech

Jul 17, 2025 · Artificial Intelligence

How to Build a High‑Scoring AI Werewolf Agent: Strategies, Prompt Engineering, and Code

This article details the author's experience designing a top‑performing AI Werewolf agent for the Taotian Group's AI Werewolf Challenge, covering game rules, core challenges, prompt engineering, caching, concurrent requests, model selection, reinforcement‑learning‑style tuning, and tactical strategies for each role, with code examples.

AI Agent · LLM · Prompt Engineering

0 likes · 25 min read

Alibaba Cloud Big Data AI Platform

Jul 16, 2025 · Artificial Intelligence

Master Post-Training: Fine-Tune LLMs with SFT, DPO, and GRPO on Alibaba PAI

This article explains post‑training concepts, compares SFT, DPO, and GRPO fine‑tuning methods, and provides step‑by‑step guidance for using Alibaba Cloud's PAI platform—including Model Gallery and DSW—to fine‑tune large language models with code examples and practical tips.

DPO · GRPO · LLM

0 likes · 14 min read

DataFunSummit

Jul 16, 2025 · Artificial Intelligence

How Tencent Cloud ES Powers RAG with Hybrid Search and Massive Vector Optimizations

This article explores how Tencent Cloud Elasticsearch combines decades of text search expertise with cutting‑edge vector retrieval and large language models to deliver a one‑stop Retrieval‑Augmented Generation solution, detailing the underlying models, hybrid search architecture, performance tricks, and real‑world case studies.

Elasticsearch · Hybrid Search · LLM

0 likes · 24 min read

Volcano Engine Developer Services

Jul 16, 2025 · Information Security

Securing the Model Context Protocol (MCP): Volcano Engine’s End‑to‑End Approach

This article explains how Volcano Engine safeguards the Model Context Protocol (MCP) throughout its lifecycle, detailing MCP fundamentals, core components, a step‑by‑step interaction example, seven major security risks, official design principles, and a comprehensive security architecture covering admission control, native design, and runtime protection.

LLM · MCP · Model Context Protocol

0 likes · 21 min read

Instant Consumer Technology Team

Jul 16, 2025 · Artificial Intelligence

How to Build a Text‑to‑Video Workflow in Dify Using LLMs

This guide walks you through creating a Dify workflow that turns user prompts into videos by chaining LLM‑generated descriptions with a Text‑to‑Video model, covering workflow types, system variables, model setup, node configuration, plugin installation, and final testing steps.

AI · Dify · LLM

0 likes · 14 min read

DaTaobao Tech

Jul 16, 2025 · Artificial Intelligence

From GPT‑4 to Agentic AI: How LLM Architecture Evolved (2023‑2025)

Since GPT‑4’s 2023 debut, large language models have shifted from sheer scale to efficiency‑driven designs, advanced reasoning with chain‑of‑thought, and agentic tool use, as illustrated by MoE, MLA, and new attention mechanisms, reshaping benchmarks, commercial strategies, and the future of AI.

Efficiency · LLM · Model Scaling

0 likes · 24 min read

AntTech

Jul 16, 2025 · Artificial Intelligence

Can AI Auditors Match Human Experts? Inside RepoAudit’s LLM‑Powered Code Review

The EXPRESS Workshop at ISSTA 2025, hosted by Ant Group, featured a keynote by Purdue’s Prof. Zhang on an LLM‑driven “Human‑like AI Auditor” called RepoAudit, which demonstrated high‑accuracy automated code review, uncovering dozens of real bugs and hundreds of zero‑day vulnerabilities across major open‑source projects.

AI · LLM · RepoAudit

0 likes · 6 min read

IT Services Circle

Jul 16, 2025 · Artificial Intelligence

How a Simple Colon Can Trick Top LLMs – The Master‑RM Fix

A recent study reveals that tiny symbols like colons or generic reasoning prefixes can cause large language models used as reward judges to issue false‑positive rewards, but an enhanced reward model called Master‑RM, trained with adversarial data, eliminates this vulnerability across multiple LLMs and languages.

AI safety · LLM · Master-RM

0 likes · 10 min read

Architects' Tech Alliance

Jul 15, 2025 · Artificial Intelligence

Why High‑Bandwidth Memory (HBM) Is Critical for Modern AI and How It Works

This article explains what high‑bandwidth memory (HBM) is, outlines its brief history, compares it with DDR, LPDDR and GDDR, describes why large language models and generative AI drive its demand, and reviews its architecture, PCB requirements, market status, and future outlook.

AI hardware · Generative AI · HBM

0 likes · 3 min read

Alibaba Cloud Developer

Jul 15, 2025 · Information Security

Boost Web Vulnerability Scanning with LLM‑Powered MCP Server Automation

This article explores how large language models can be integrated with MCP Server and Burp Suite to automate web application vulnerability detection, detailing environment setup, workflow steps, code snippets, challenges such as token limits and payload formatting, and the advantages and limitations of the approach.

Automated Vulnerability Scanning · Burp Suite · Kotlin

0 likes · 12 min read

Tencent Cloud Developer

Jul 15, 2025 · Artificial Intelligence

How RAG Evolved: From Naive to Agentic – A Complete Guide

This article systematically outlines the evolution of Retrieval‑Augmented Generation (RAG) from its naive three‑step pipeline to advanced, modular, and agentic architectures, highlighting each generation's motivations, core features, advantages, drawbacks, and practical implementation details for large language model applications.

Agentic RAG · Artificial Intelligence · LLM

0 likes · 20 min read

Fun with Large Models

Jul 15, 2025 · Artificial Intelligence

Getting Started with LangChain & LangGraph: Core Concepts of AI Agents

This article introduces AI Agents and explains why LangChain is the leading framework, detailing its core concepts, three‑layer architecture, key features, comparison with other agent frameworks, and showcasing popular projects built with LangChain and LangGraph.

AI Agent · LLM · LangChain

0 likes · 10 min read

Tencent Technical Engineering

Jul 14, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Core Concepts and How They Interact

This article provides a concise overview of the latest AI concepts—including AIGC, Retrieval‑Augmented Generation, Function‑Calling models, intelligent agents, and the Model Context Protocol—explaining their principles, differences, and how they can be combined to build more powerful AI applications for developers outside the AI field.

AIGC · Function Calling · LLM

0 likes · 15 min read

Architect's Alchemy Furnace

Jul 12, 2025 · Artificial Intelligence

Why GraphRAG Is the Future of Retrieval‑Augmented Generation

This article explains how GraphRAG combines knowledge graphs with retrieval‑augmented generation to overcome the limitations of vector‑only RAG, delivering higher accuracy, better explainability, easier development, and stronger governance for generative AI applications across various domains.

AI · GraphRAG · Knowledge Graph

0 likes · 23 min read

AI Frontier Lectures

Jul 11, 2025 · Artificial Intelligence

Can LLMs ‘Squint’ to Recognize Hidden Faces? A Comparative Test

The article evaluates several large language models—including ChatGPT, Gemini, Grok, Qwen, and o3‑Pro—on a visual illusion that requires squinting to identify the Mona Lisa, revealing varied success rates, reasoning differences, and insights into model capabilities and limitations.

LLM · Prompt Engineering · model comparison

0 likes · 6 min read

Instant Consumer Technology Team

Jul 11, 2025 · Artificial Intelligence

Boost LLM Performance with Prompt‑Optimizer: Open‑Source Prompt Tuning Made Easy

Prompt‑Optimizer is an open‑source tool that uses AI models to automatically refine and compare prompts, offering multi‑model support, security features, and cross‑platform access, while providing step‑by‑step Docker deployment instructions for developers and prompt engineers.

AI tools · LLM · Prompt Engineering

0 likes · 7 min read

Qborfy AI

Jul 11, 2025 · Artificial Intelligence

Building a Dynamic Agent Workflow with LangGraph: A Step‑by‑Step Guide

This tutorial walks through creating a full‑featured LLM Agent workflow using LangGraph, covering goal definition, task decomposition, execution nodes, state updates, re‑planning logic, and user feedback, while comparing ReAct and Reflexion approaches and providing complete Python code examples.

Agent workflow · LLM · LangChain

0 likes · 11 min read

Tech Freedom Circle

Jul 11, 2025 · Artificial Intelligence

The Three Core Protocols of AI Agents 2.0: MCP, A2A, and AG‑UI

This article explains the three foundational protocols—MCP for tool access, A2A for inter‑agent communication, and AG‑UI for Agent‑UI interaction—detailing their origins, technical roles, example implementations, and how they together form the communication backbone of modern AI applications.

A2A · AG-UI · AI Agent

0 likes · 18 min read

Fun with Large Models

Jul 10, 2025 · Artificial Intelligence

Grok 4: The ‘Problem‑Solving Champion’ That Falters in Real‑World Use – Detailed Evaluation

The article reviews Grok 4’s flashy launch and claimed first‑principles advantage, then presents benchmark results—showing strong reasoning, multimodal and agent scores but disappointing coding performance versus DeepSeek‑R1—concluding that the model’s real‑world capabilities fall short of its hype.

Grok4 · LLM · agent

0 likes · 11 min read

Instant Consumer Technology Team

Jul 10, 2025 · Artificial Intelligence

How LLMs and Vector Search Power Real-Time Icon Recommendations

This article explains a system that combines large language models with multimodal vector retrieval to automatically understand user intent and instantly recommend the most relevant icons, detailing the workflow, semantic vectorization, offline indexing, online inference, and evaluation methods.

CLIP · HNSW · LLM

0 likes · 13 min read

Tencent Cloud Developer

Jul 10, 2025 · Artificial Intelligence

Demystifying AIGC, Agents, and MCP: Essential AI Concepts for Developers

This article provides a concise, developer‑focused overview of emerging AI concepts—including AIGC, multimodal models, Retrieval‑Augmented Generation, intelligent agents, Function‑Calling, and the Model Context Protocol (MCP)—explaining their core principles, differences, and how they interrelate to enable advanced AI applications.

AI · AIGC · Function Calling

0 likes · 16 min read

Instant Consumer Technology Team

Jul 9, 2025 · Artificial Intelligence

How Easy Dataset Automates High‑Quality LLM Fine‑Tuning Data from Unstructured Docs

The article introduces Easy Dataset, a GUI‑driven framework that transforms heterogeneous documents into high‑quality, persona‑driven fine‑tuning data for large language models, details its architecture, core contributions, experimental validation on financial QA, and compares it with existing data‑synthesis tools.

Artificial Intelligence · GUI · LLM

0 likes · 12 min read

Alimama Tech

Jul 9, 2025 · Artificial Intelligence

How to Make LLMs Recognize and Resolve Their Own Uncertainty

This article introduces ConfuseBench, a benchmark that classifies LLM uncertainty into document‑missing, ability‑limited, and ambiguous types, and presents methods—including retrieval, chain‑of‑thought, and clarification—to detect and actively resolve uncertainty, improving answer quality across diverse tasks.

Benchmark · Clarification · Inquiry

0 likes · 17 min read

AntTech

Jul 9, 2025 · Artificial Intelligence

How KAG-Thinker Boosts Structured Reasoning in Large Language Models

The KAG-Thinker model, a collaborative effort by Ant Group, Zhejiang University, and Tongji University, introduces a hierarchical "breadth splitting + depth solving" framework that enhances logical stability, knowledge utilization, and retrieval robustness for complex multi‑hop reasoning tasks across general and specialized domains.

AI · KAG-Thinker · Knowledge retrieval

0 likes · 10 min read

High Availability Architecture

Jul 9, 2025 · Artificial Intelligence

How LLMs Evolved from GPT‑4 to Agentic AI: Trends, Techniques, and Future Directions

This article analyzes the rapid evolution of large language models from the GPT‑4 era through efficiency‑focused sparsity and attention innovations, to inference‑time reasoning and tool‑using agents, highlighting key architectures, benchmark breakthroughs, competitive strategies, and emerging research directions toward embodied AI.

Efficiency · LLM · Reasoning

0 likes · 24 min read

Alibaba Cloud Big Data AI Platform

Jul 8, 2025 · Artificial Intelligence

How Video Retrieval‑Augmented Generation Transforms Multimodal AI Search

This article explains the end‑to‑end implementation of Video RAG in OpenSearch LLM, covering offline parsing, key‑frame extraction, audio transcription, slice creation, multimodal vectorization, hybrid indexing, and online query processing while addressing challenges like recall performance and long‑video efficiency.

ASR · Key Frame Extraction · LLM

0 likes · 10 min read

Alibaba Cloud Developer

Jul 8, 2025 · Artificial Intelligence

From GPT‑4 to Thinking Models: How LLM Architecture Evolved After 2023

This article traces the evolution of large language models from the GPT‑4 era through 2024‑2025, highlighting the shift from pure scaling to efficiency‑focused architectures, the rise of reasoning‑centric "thinking" models, and the emergence of agentic capabilities that enable tool use and real‑world interaction.

LLM · Reasoning · Transformer

0 likes · 27 min read

Instant Consumer Technology Team

Jul 4, 2025 · Artificial Intelligence

How AI Agents Boost Development: Inside the ReAct Framework & Prompt Engineering

This article explains how AI agents, using the ReAct framework, enable a human‑machine pair‑programming workflow, details the reasoning‑acting‑observation loop, showcases practical Python examples with smolagents and DeepSeek, and provides prompt‑engineering guidelines for effective tool‑calling.

AI Agent · LLM · Prompt Engineering

0 likes · 19 min read

DaTaobao Tech

Jul 4, 2025 · Artificial Intelligence

How Taobao Live’s AI Digital Humans Transform E‑Commerce: Architecture, Algorithms, and Engineering Insights

This article details the end‑to‑end design of Taobao Live's AI digital human system, covering six core components such as LLM‑driven content creation, interactive dialogue, TTS voice synthesis, visual synchronization, audio‑video engineering, and a scalable backend, while also discussing product evolution, automation challenges, and future roadmap.

AI · LLM · TTS

0 likes · 19 min read

macrozheng

Jul 4, 2025 · Artificial Intelligence

Build Java LLM Applications with LangChain4j: A Hands‑On Guide

This tutorial walks through the fundamentals of large language models, prompt engineering, and word embeddings, then shows how to use the LangChain framework (including its Java implementation, LangChain4j) to build AI‑driven applications with memory management, retrieval, and chaining, illustrated by practical code examples.

AI · Embedding · Java

0 likes · 17 min read

Alipay Experience Technology

Jul 3, 2025 · Artificial Intelligence

How MCP Transforms Agent Development: From Complex Tools to Plug‑and‑Play

This talk explains the Model Context Protocol (MCP), how it simplifies agent tool integration by replacing numerous custom interfaces with a single standardized protocol, and details its adoption, architecture, security, and future directions within Ant Group's ecosystem.

AI · LLM · MCP

0 likes · 21 min read

DataFunTalk

Jul 3, 2025 · Artificial Intelligence

How Vivo’s Blue Heart XiaoV Leverages LLMs to Transform Conversational Recommendations

In an interview with Vivo AI engineer Liang Tianan, the article explores the challenges of post‑Q&A recommendation, the integration of large language models into recall, ranking and evaluation pipelines, and the engineering trade‑offs required to deliver high‑quality, diverse suggestions on mobile devices.

Evaluation · LLM · Model Compression

0 likes · 15 min read
