Tagged articles
2079 articles
DaTaobao Tech
Jul 2, 2025 · Artificial Intelligence

How AI Powers 24/7 Digital Human Live Streams: Architecture, Challenges, and Innovations

This article presents a comprehensive overview of the AI‑driven digital‑human live‑streaming solution used by Taobao, detailing six core components—including LLM‑based content generation and interaction, TTS, visual driving, audio‑video engineering, and backend services—while sharing architectural diagrams, cost‑reduction strategies, productization insights, and future directions.

AI · LLM · TTS
0 likes · 8 min read
Cognitive Technology Team
Jul 1, 2025 · Artificial Intelligence

How We Built a Live‑Streaming TTS Engine: From Data Pipelines to AI Voice Generation

This article presents a comprehensive practice summary of building an intelligent digital‑human system, covering six core modules—LLM content generation, LLM interaction, TTS synthesis, visual driving, audio‑video engineering, and backend services—while detailing data collection, signal processing, ASR annotation, speaker clustering, model optimization (V1‑V4), evaluation metrics, and future research directions.

AI voice · Audio Processing · LLM
0 likes · 23 min read
Go Programming World
Jul 1, 2025 · Artificial Intelligence

What Is the Model Context Protocol (MCP) and How It’s Shaping AI Development

Model Context Protocol (MCP), an open-source standard from Anthropic, standardizes how large language models interact with external tools and data sources, introducing a client‑server architecture with hosts, clients, and servers, and promises to simplify AI application development compared to traditional function‑calling approaches.

AI · LLM · client-server
0 likes · 5 min read
JavaEdge
Jun 30, 2025 · Artificial Intelligence

How GPULlama3.java Brings GPU‑Accelerated Llama 3 to Pure Java

GPULlama3.java, released by the University of Manchester's Beehive Lab, is the first native Java implementation of Llama 3 that leverages TornadoVM to automatically accelerate inference on GPUs without writing CUDA or native code, supporting NVIDIA, Intel and Apple Silicon back‑ends and modern Java 21 features.

AI · GPU Acceleration · Java
0 likes · 7 min read
Alibaba Cloud Big Data AI Platform
Jun 30, 2025 · Artificial Intelligence

Unlocking Small LLM Power: Variable‑Length Chain Distillation with DistilQwen‑ThoughtY

This article introduces a variable‑length chain‑of‑thought distillation technique built on Alibaba Cloud PAI’s EasyDistill toolkit, presents the high‑quality OmniThought‑0528 dataset, details the training of the DistilQwen‑ThoughtY 4B/8B/32B models, and provides code and usage examples for researchers and practitioners.

LLM · chain-of-thought · dataset
0 likes · 15 min read
DaTaobao Tech
Jun 30, 2025 · Artificial Intelligence

One‑Click AI Digital Human for Live Commerce: LLM, Lip Sync & Real‑Time Tech

This article outlines the end‑to‑end architecture and practical solutions behind creating intelligent digital humans for live commerce, covering LLM‑driven content generation, real‑time lip‑sync, image‑driven avatar creation, automated material review, lightweight model training, and a roadmap toward fully automated, high‑performance virtual presenters.

AI · LLM · Model Compression
0 likes · 19 min read
Qborfy AI
Jun 28, 2025 · Artificial Intelligence

Mastering LangGraph: Build Stateful, Looping LLM Agents with Python

This tutorial walks through the limitations of linear LangChain workflows, introduces LangGraph’s state‑node‑edge architecture, and provides step‑by‑step code examples—including a Hello‑World tool, conditional branching, multi‑turn conversation handling, and graph visualization—so readers can construct robust, persistent LLM agents.

LLM · LangChain · LangGraph
0 likes · 9 min read
MaGe Linux Operations
Jun 28, 2025 · Artificial Intelligence

Master Dify: From Local Deployment to Advanced AI Workflows in 2025

This guide walks you through installing and configuring Dify, an open‑source LLM application platform, on your local machine using Docker, integrating it with Ollama for custom models, and exploring its core features such as chat assistants, agents, workflows, and tool extensions, all illustrated with step‑by‑step screenshots and code snippets.

AI workflow · Dify · Docker
0 likes · 12 min read
Fighter's World
Jun 28, 2025 · Artificial Intelligence

What Is the Generator‑Verifier Gap and Why It Matters for LLM Reasoning

The article explains the Generator‑Verifier Gap (GVG)—the asymmetry where verifying a solution is far cheaper than generating it—covers its origin, its impact on test‑time scaling for large language models, reinforcement‑learning approaches, and how the concept can shape agent architectures and AI product strategy.

Agent Architecture · Generator-Verifier Gap · LLM
0 likes · 21 min read
AI Algorithm Path
Jun 28, 2025 · Artificial Intelligence

Implementing Greedy and Beam Decoding for Large Language Models from Scratch

This article walks through the mechanics of greedy search and beam search in large language models, demonstrates both methods with GPT‑2 on the prompt "I have a dream", visualizes the decoding trees, compares their scores, and discusses the trade‑offs between efficiency and output quality.

Beam Search · GPT-2 · Greedy Search
0 likes · 16 min read
360 Zhihui Cloud Developer
Jun 27, 2025 · Operations

How AI‑Powered Ops‑Nexus Transforms Intelligent Operations for 100k+ Servers

This article details the design, technology choices, functional modules, core implementation, performance optimizations, and future roadmap of Ops‑Nexus, an AI‑driven intelligent operations platform that streamlines alarm analysis, log processing, and host health checks for large‑scale monitoring environments.

AI Ops · Intelligent Operations · LLM
0 likes · 12 min read
AI Algorithm Path
Jun 26, 2025 · Artificial Intelligence

The 10 Essential Components of a Retrieval‑Augmented Generation (RAG) System

This guide breaks down the ten core building blocks of a production‑ready RAG pipeline—from input handling and vector stores to prompt engineering, LLM inference, observability, and evaluation—showing why each piece matters, common pitfalls, and practical best‑practice recommendations.

LLM · Observability · Prompt Engineering
0 likes · 9 min read
Java Architecture Diary
Jun 25, 2025 · Artificial Intelligence

Build a Text‑to‑SQL Chatbot with Spring AI and DeepSeek LLM

This tutorial walks through creating a natural‑language‑to‑SQL chatbot using Spring AI, configuring a MySQL school database with Flyway, defining system prompts for a DeepSeek LLM, implementing service beans and a REST API, and interacting with the bot via curl commands.

Chatbot · DeepSeek · Java
0 likes · 15 min read
Continuous Delivery 2.0
Jun 25, 2025 · Artificial Intelligence

How Model Context Protocol Turns LLMs into Plug‑and‑Play AI Assistants

The Model Context Protocol (MCP) is an open, standardized adapter that lets large language models seamlessly connect to tools, data sources, and workflows, offering plug‑and‑play intelligence, cross‑platform compatibility, security, and modular extensibility for building real‑world AI applications.

AI Integration · LLM · Model Context Protocol
0 likes · 11 min read
AntTech
Jun 23, 2025 · Artificial Intelligence

Can AI Auditors Ensure Reliable Software? Highlights from EXPRESS 2025 at ISSTA

The EXPRESS 2025 workshop at ISSTA in Norway will showcase AI‑driven code auditing, present cutting‑edge research on trustworthy software systems, and invite researchers and practitioners to discuss transparency, reliability, and security challenges in modern software engineering.

AI auditing · ISSTA 2025 · LLM
0 likes · 5 min read
Alibaba Cloud Native
Jun 23, 2025 · Artificial Intelligence

From If/Else to Goal‑Oriented Agents: How LLMs Are Shaping Software 3.0

The article reflects on Andrej Karpathy’s AI Startup School talk, outlining the evolution from traditional if‑else programming (Software 1.0) through data‑driven models (Software 2.0) to goal‑oriented natural‑language agents (Software 3.0), and examines LLMs as operating‑system‑like infrastructure, prompting, and engineering challenges.

LLM · software evolution
0 likes · 5 min read
Architecture & Thinking
Jun 23, 2025 · Artificial Intelligence

Building AI Assistants with Eino: A Go Framework for Large‑Model Applications

This article introduces Eino, an open‑source Golang framework for large‑model AI applications, explains its core capabilities, walks through creating a simple AI assistant with message templates and chat model integration, and demonstrates how to extend the system with tools and a modular architecture for future expansion.

AI Assistant · Eino · Framework
0 likes · 17 min read
DataFunSummit
Jun 22, 2025 · Artificial Intelligence

How Vivo’s BlueHeart AI Assistant Optimizes Post‑Conversation Recommendations with LLMs

In a detailed interview, Vivo AI engineer Liang Tianan explains how the BlueHeart Small V assistant leverages large language models, multi‑stage recall, ranking, and reward‑model fine‑tuning (SFT/DPO) to generate high‑quality, diverse post‑dialogue recommendation items while balancing latency, cost, and evaluation challenges.

DPO · LLM · SFT
0 likes · 15 min read
Tech Freedom Circle
Jun 21, 2025 · Artificial Intelligence

How MCP + LLM + Agent Architecture Becomes the AI Agent’s Neural Hub and New Infrastructure

The article explains the Model Context Protocol (MCP) as a zero‑code bridge that lets large language models seamlessly access databases, external APIs, and execute code, detailing its benefits for developers and everyday users, its core components, step‑by‑step workflow, real‑world examples, and how it outperforms traditional APIs in modern AI agent systems.

AI Agent · LLM · Model Context Protocol
0 likes · 37 min read
Spring Full-Stack Practical Cases
Jun 21, 2025 · Artificial Intelligence

Master AI Agent Workflows with Spring Boot 3: From Chains to Orchestrators

This article introduces the fundamentals of augmented large language model agents, explains six workflow patterns—including chain, parallel, routing, orchestrator‑workers, evaluator‑optimizer, and autonomous agents—and provides complete Spring Boot 3 code examples, configuration, and test results for each pattern.

Backend · Java · LLM
0 likes · 15 min read
Fighter's World
Jun 21, 2025 · Artificial Intelligence

Speculating Devin’s Context Engineering Architecture: How Long‑Horizon Agents Preserve Complete Context

The article analyzes why context engineering is crucial for multi‑agent AI systems, illustrates the fragility caused by fragmented context with a Flappy Bird analogy, and proposes three detailed speculative components—a compression‑to‑structure pipeline, a hybrid layered memory architecture, and a context‑aware coordination mechanism—culminating in a unified reference design for long‑horizon agents.

Agent Coordination · Compression Pipeline · Context Engineering
0 likes · 22 min read
Alibaba Cloud Developer
Jun 20, 2025 · Artificial Intelligence

How to Build High‑Availability AI Agents: Challenges, Strategies, and Real‑World Insights

This article explores the evolving concept of AI agents, debates their definitions, outlines four major deployment challenges—including prompt instability, planning balance, domain knowledge integration, and response speed—and presents practical strategies such as prompt engineering, workflow design, multi‑agent architectures, and model optimization to build reliable, high‑availability agents.

AI Agent · Agentic Systems · LLM
0 likes · 32 min read
Alibaba Cloud Developer
Jun 19, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Empowers LLMs?

The article introduces Model Context Protocol (MCP), explains its architecture of Host, Client, and Server, describes its components—Resources, Tools, Prompts—and demonstrates practical integration with IDE plugins to extend LLM capabilities such as real‑time ticket queries, highlighting its significance for AI development.

AI Integration · AI tooling · Function Calling
0 likes · 11 min read
Sohu Tech Products
Jun 18, 2025 · Backend Development

How LLMs Transform Traffic Replay Testing for Backend Services

This article walks through the challenges of traditional traffic replay, explains the design and benefits of a conventional replay system, and then details how integrating large language models can automate data preparation, script generation, and validation to make backend testing more accurate, scalable, and efficient.

Backend testing · LLM · service reliability
0 likes · 18 min read
DataFunTalk
Jun 18, 2025 · Artificial Intelligence

Can LLMs Really Beat Human Olympiad Programmers? Insights from LiveCodeBench Pro

This article examines the LiveCodeBench Pro benchmark, revealing that while large language models achieve impressive scores on knowledge‑ and logic‑heavy coding problems, they still fall short of human experts on high‑difficulty, observation‑intensive tasks, especially without external tool support.

AI evaluation · Benchmark · LLM
0 likes · 11 min read
AIWalker
Jun 18, 2025 · Artificial Intelligence

Six New Directions for Large Language Models

Large language models are booming, and this article highlights six cutting‑edge research directions—LLM‑plus synthetic data, reward modeling, inference techniques, LLM‑as‑a‑Judge, safety alignment, and long‑context handling—each illustrated with recent papers, experimental results, and links to code repositories.

LLM · Reward Modeling · Safety Alignment
0 likes · 9 min read
Aikesheng Open Source Community
Jun 17, 2025 · Artificial Intelligence

Introducing SCALE: An Open‑Source Benchmark Redefining LLM SQL Capabilities

This article presents SCALE, a community‑driven, open‑source benchmark that expands beyond simple Text‑to‑SQL accuracy to evaluate large language models on performance, dialect conversion, and deep SQL understanding, offering developers, researchers, and CTOs a realistic measure of AI‑assisted database tasks.

AI · Benchmark · Evaluation
0 likes · 10 min read
Tencent Technical Engineering
Jun 16, 2025 · Artificial Intelligence

Mastering RAG and AI Agents: Practical Tips, Code Samples, and Evaluation Strategies

This comprehensive guide walks you through the fundamentals of Retrieval‑Augmented Generation (RAG) and AI agents, explains their inner workings, shares optimization tricks, provides ready‑to‑run code snippets, and demonstrates how to evaluate performance with metrics such as recall, faithfulness, and answer relevance.

AI Agents · Evaluation · LLM
0 likes · 36 min read
AsiaInfo Technology: New Tech Exploration
Jun 16, 2025 · Artificial Intelligence

How LangGraph Implements Shared Memory for Multi‑Agent Systems: Techniques, Tools, and Future Directions

This article examines the theory and practice of shared memory in multi‑agent systems, tracing its evolution from classic blackboard models to modern solutions like Mem0.ai, Open Memory, and A‑MEM, and provides concrete design patterns, integration strategies, and future research directions for LangGraph users.

AI memory · Distributed Systems · LLM
0 likes · 37 min read
ITPUB
Jun 15, 2025 · Artificial Intelligence

How to Build a High‑Performance Enterprise RAG System with Model Context Protocol (MCP)

This article presents a step‑by‑step guide for constructing a scalable enterprise Retrieval‑Augmented Generation (RAG) solution using the Model Context Protocol (MCP), covering architecture comparison, system design, Milvus‑backed knowledge store, Python client implementation, deployment scripts, code examples, and best‑practice recommendations.

KnowledgeBase · LLM · Milvus
0 likes · 22 min read
Fighter's World
Jun 14, 2025 · Artificial Intelligence

How Can LLMs Learn to “Think” in Complex Industry Scenarios?

The article analyzes how large language models can acquire true reasoning abilities for hard‑to‑score industry tasks by combining Chain‑of‑Thought prompting with reinforcement learning, addressing vague reward signals, reward hacking, and loyalty, and proposing a toolbox of reward engineering, synthetic data, hierarchical RL and multi‑agent collaboration.

LLM · Reward Modeling · chain-of-thought
0 likes · 22 min read
Alibaba Cloud Big Data AI Platform
Jun 13, 2025 · Artificial Intelligence

How EasyDistill Cuts LLM Costs: Mastering DistilQwen-ThoughtX on Alibaba Cloud

EasyDistill, an open-source framework from Alibaba Cloud PAI, streamlines knowledge distillation for large language models, introducing the DistilQwen-ThoughtX series with variable-length chain-of-thought reasoning, and provides comprehensive best-practice guidance for training, fine-tuning, evaluation, compression, and deployment via the PAI-ModelGallery.

AI inference · Knowledge Distillation · LLM
0 likes · 12 min read
Instant Consumer Technology Team
Jun 12, 2025 · Artificial Intelligence

How to Build a Production-Ready RAG System with Qwen3 Embedding and Reranker Models

This guide walks through using Alibaba's new Qwen3-Embedding and Qwen3-Reranker models to build a two‑stage Retrieval‑Augmented Generation pipeline with Milvus, covering environment setup, data ingestion, vector indexing, reranking, and LLM‑driven answer generation, demonstrating production‑grade performance across multilingual queries.

Embedding · LLM · Milvus
0 likes · 19 min read
Alibaba Cloud Developer
Jun 11, 2025 · Artificial Intelligence

From Chat to Autonomous Agents: Architecture, ReAct, Prompt Engineering

This article chronicles the evolution from simple chat interactions to sophisticated autonomous agents, detailing stages of LLM development, ReAct reasoning, memory management, tool integration, and practical implementation using the browser-use project, while offering prompt design insights and future directions for AI agents.

AI Agent · LLM · Memory
0 likes · 30 min read
Architecture & Thinking
Jun 11, 2025 · Artificial Intelligence

Accelerate LLM App Development with Eino: A Go Framework Walkthrough

Eino is an open‑source Golang framework for building large‑model applications, offering reusable components, robust orchestration, clean APIs, best‑practice templates, and full‑cycle DevOps tools, with code examples for both Ollama and OpenAI modes, plus streaming and normal output options.

AI development · Framework · Go
0 likes · 10 min read
Instant Consumer Technology Team
Jun 10, 2025 · Artificial Intelligence

Unlocking AI Agent Integration with Model Context Protocol (MCP): A Complete Guide

This article explains how the Model Context Protocol (MCP) standardizes AI agent communication with external tools, outlines its benefits, describes its core components, showcases open‑source implementations, and provides step‑by‑step Python examples for building MCP servers and clients.

Function Calling · LLM · Model Context Protocol
0 likes · 22 min read
Alibaba Cloud Developer
Jun 10, 2025 · Artificial Intelligence

How AI Application Architectures Evolve: From Simple LLM Calls to Guardrails, Routing, and Agents

This article traces the evolution of AI application architectures—from the earliest minimal user‑LLM interaction to advanced designs featuring context enhancement, input/output guardrails, intent routing, model gateways, caching strategies, agent capabilities, monitoring, and inference performance optimizations—providing practical insights and references for developers.

AI Architecture · Caching · Inference Optimization
0 likes · 21 min read
DataFunSummit
Jun 8, 2025 · Artificial Intelligence

Mastering LLM Applications: Practical Agent Design and Implementation Strategies

This comprehensive guide explores the core implementation paths for large language model (LLM) applications, focusing on agent design, workflow orchestration, tool integration, memory management, multi‑agent architectures, and future trends, providing actionable methodologies and real‑world examples for practitioners.

AI Agent · Agent Design · LLM
0 likes · 25 min read
dbaplus Community
Jun 7, 2025 · Artificial Intelligence

How Large Language Models Are Transforming Data Warehousing: Real-World Experiments and Lessons

The article shares practical experiences using LLM‑powered tools such as Cursor and DeepSeek in data‑warehouse workflows, covering assisted coding, automated metric extraction, self‑service analysis, documentation generation, their benefits, limitations, and the broader impact on data engineering roles.

AI automation · Business Intelligence · Data Warehouse
0 likes · 9 min read
AI2ML AI to Machine Learning
Jun 6, 2025 · Artificial Intelligence

Tackling the Top Challenges of Retrieval‑Augmented Generation (RAG)

The article enumerates common pitfalls of Retrieval‑Augmented Generation—such as missing content, low‑rank document misses, context limits, format errors, incomplete answers, scalability bottlenecks, complex PDF extraction, data‑quality issues, domain adaptation gaps, hallucinations, and feedback‑loop deficiencies—and offers concrete mitigation strategies ranging from data cleaning and prompt design to hybrid search, hierarchical retrieval, document compression, and automated evaluation.

Data Quality · Hybrid Search · LLM
0 likes · 9 min read
Youzan Coder
Jun 6, 2025 · Artificial Intelligence

How AI Agents Turn Manual Data Retrieval into Fully Automated Insights

This article examines the challenges of manual data extraction in data‑driven enterprises, explains why large language models alone fall short, and details how the Cursor‑Agent framework automates end‑to‑end querying, knowledge‑base integration, and result validation to become a self‑sufficient "data master" for both technical and non‑technical users.

AI Agent · Cursor Agent · Data Automation
0 likes · 26 min read
DaTaobao Tech
Jun 6, 2025 · Artificial Intelligence

Redefining Business Core Assets in the LLM Era: Agent Evolution & Collaboration

This article examines how the rise of large language models reshapes core business assets, defines agents and tools, explores multi‑agent collaboration patterns, task allocation and conflict resolution mechanisms, and evaluates the MCP protocol and engineering requirements for building scalable, flexible agent platforms.

Agent Architecture · LLM · MCP protocol
0 likes · 9 min read
JavaEdge
Jun 5, 2025 · Artificial Intelligence

How Amazon’s Strands Agents SDK Simplifies Building AI Agents

Amazon’s newly open‑source Strands Agents SDK lets developers create AI agents with minimal code by defining prompts, tools, and models, offering a lightweight, production‑ready framework that supports multiple model providers, observability, multi‑agent collaboration, and extensible tooling via dedicated packages.

AI Agents · Amazon · LLM
0 likes · 7 min read
Didi Tech
Jun 5, 2025 · Artificial Intelligence

Unlocking Modern AI Application Architecture: From RAG to Agents and MCP

This article surveys the evolution of AI applications, explains large language model fundamentals, outlines architectural challenges, and introduces three core patterns—Retrieval‑Augmented Generation (RAG), autonomous Agents, and Model Context Protocol (MCP)—while providing practical LangChain code snippets and integration guidance.

AI · LLM · LangChain
0 likes · 28 min read
AI Frontier Lectures
Jun 5, 2025 · Artificial Intelligence

Bridging Thought Leaps: How CoT‑Bridge Boosts LLM Reasoning Accuracy

This paper introduces the Thought Leap Bridge task and the CoT‑Bridge model, which detect and fill missing intermediate steps in chain‑of‑thought reasoning, dramatically improving large language model performance on mathematical and logical benchmarks and enhancing downstream distillation and reinforcement‑learning pipelines.

Chain-of-Thought · CoT-Bridge · LLM
0 likes · 8 min read
AI Algorithm Path
Jun 4, 2025 · Artificial Intelligence

Why LLMs Hallucinate and How to Mitigate the Problem

The article explains that hallucinations in large language models stem mainly from the supervised fine‑tuning stage, illustrates the issue with concrete examples, and presents mitigation techniques such as knowledge‑probing data generation and web‑search tool integration using special tokens.

LLM · Meta · OpenAssistant
0 likes · 12 min read
Architect's Alchemy Furnace
Jun 4, 2025 · Artificial Intelligence

What Is an AI Engineer? Roles, Skills, and the Future of LLM‑Powered Systems

This article examines the evolving role of the AI engineer, contrasting it with AI researchers, ML engineers, and software engineers, outlines essential skills such as prompt engineering, MLOps, and data integration, and predicts how AI engineering will become a pivotal, high‑demand discipline in the coming years.

AI Engineering · AI Systems · Agentic RAG
0 likes · 17 min read
Xiaohongshu Tech REDtech
Jun 4, 2025 · Artificial Intelligence

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

MarkerGen introduces a novel, plug‑and‑play framework that decomposes length‑controllable text generation into four sub‑abilities—identifying, counting, planning, and aligning—integrates external tokenizers and dynamic markers, and achieves significantly lower length errors and higher quality across diverse models, tasks, and languages.

LLM · Length-Controlled Generation · MarkerGen
0 likes · 14 min read
DaTaobao Tech
Jun 4, 2025 · Artificial Intelligence

Understanding Large Language Model Architecture, Parameters, Memory, Storage, and Fine‑Tuning Techniques

This article provides a comprehensive overview of large language models (LLMs), covering their transformer architecture, parameter counts, GPU memory and storage requirements, and detailed fine‑tuning methods such as prompt engineering, data construction, LoRA, PEFT, RLHF, and DPO, along with practical deployment and inference acceleration strategies.

DPO · LLM · LoRA
0 likes · 17 min read
ITFLY8 Architecture Home
Jun 2, 2025 · Artificial Intelligence

Choosing the Right LLM AI Agent Protocol: A Four‑Category Guide

This article provides a systematic overview of existing LLM AI Agent communication protocols, categorizing them into four major types, detailing their functions, benefits, and use‑cases, and compares four representative protocols—MCP, A2A, ANP, and Agora—through a concrete travel‑planning scenario.

AI Agent · Communication Protocol · LLM
0 likes · 11 min read
Fighter's World
Jun 2, 2025 · Artificial Intelligence

Why Is Context King for Large Language Models?

This article provides a comprehensive technical analysis of LLM context, covering its definition, types, tokenization, window‑size evolution, diminishing returns, management techniques such as RAG, CoT, memory‑as‑a‑service, and future challenges like multimodal fusion, privacy, and autonomous agent memory.

Agent Memory · Context Management · LLM
0 likes · 48 min read
JavaEdge
May 30, 2025 · Artificial Intelligence

How to Build a Deep Research Workflow in Dify Using AI Agents

This guide explains how to construct a deep research workflow in Dify that leverages AI agents, loop variables, and structured outputs to automatically explore complex topics, gather sources, and synthesize comprehensive reports with proper citations.

AI workflow · Dify · LLM
0 likes · 9 min read
Instant Consumer Technology Team
May 30, 2025 · Artificial Intelligence

Why Streamable HTTP Is Replacing SSE in AI Communication: An MCP Protocol Deep Dive

This article explains how the Model Context Protocol (MCP) standardizes AI‑assistant communication, compares the traditional Server‑Sent Events (SSE) transport with the newer Streamable HTTP mechanism, and provides step‑by‑step code examples for building both MCP servers and clients that leverage Streamable HTTP for bidirectional, session‑aware data exchange.

AI · LLM · Streamable HTTP
0 likes · 22 min read
Alibaba Cloud Developer
May 29, 2025 · Artificial Intelligence

Build a Minimal Large Language Model from Scratch with Python and PyTorch

This tutorial walks through creating a simple bigram language model in pure Python, refactoring it into a PyTorch implementation, and explains core concepts such as tokenization, embedding layers, loss functions, gradient descent, training loops, and text generation, preparing you for building a full GPT model.

Bigram · LLM · LanguageModel
0 likes · 31 min read
Alibaba Cloud Big Data AI Platform
May 29, 2025 · Artificial Intelligence

How OmniThought Enables Adaptive Reasoning Chains for Better LLM Performance

This article introduces the OmniThought dataset, which annotates over two million chain‑of‑thought reasoning steps with Reasoning Verbosity and Cognitive Difficulty scores, and explains how these metrics guide the training of DistilQwen‑ThoughtX models that adapt chain length to task difficulty, achieving superior performance compared to existing distilled LLMs.

CoT · LLM · Reasoning
0 likes · 16 min read
Tencent Technical Engineering
May 28, 2025 · Artificial Intelligence

A Beginner-friendly Overview of LLMs, Transformers, Prompts, Function Calling, MCP and Agents

This article provides a concise, easy-to-understand introduction to large language models, the transformer architecture, prompt engineering, temperature settings, function calling, the Model Context Protocol (MCP), agent communication (A2A), and future AI programming trends, using simple analogies and illustrative examples.

AI · Function Calling · LLM
0 likes · 11 min read
Alibaba Cloud Developer
May 28, 2025 · Artificial Intelligence

Unlocking LLM Fine‑Tuning: From Architecture to LoRA, DPO and Deployment

This article provides a comprehensive guide to large language model fine‑tuning, covering model architecture, parameter and memory calculations, prompt engineering, data construction, LoRA and PEFT techniques, reinforcement learning methods such as DPO, and practical deployment workflows on internal platforms.

Fine‑Tuning · LLM · LoRA
0 likes · 21 min read
JavaEdge
May 27, 2025 · Artificial Intelligence

Boost LLM App Performance: Master Parallel Workflows in Dify v0.8.0

Version 0.8.0 of Dify introduces parallel workflow capabilities, allowing multiple branches to run concurrently, which dramatically reduces latency for complex LLM tasks; the guide explains how to create simple, nested, iterative, and conditional parallel branches, with step‑by‑step instructions and visual examples.

Dify · LLM · parallel processing
0 likes · 8 min read
Instant Consumer Technology Team
May 27, 2025 · Artificial Intelligence

How to Build a Text‑to‑SQL Assistant: From Prompt Tricks to Enterprise‑Ready Solutions

This comprehensive guide explains the Text2SQL concept, showcases real‑world scenarios, and compares three implementation architectures (a simple prompt‑based method, a LangChain‑based pipeline, and an enterprise‑grade Vanna solution), while providing practical tips, security measures, and advanced enhancements for deploying robust natural‑language‑to‑SQL systems.

Database · LLM · Text2SQL
0 likes · 26 min read
Architecture & Thinking
May 25, 2025 · Artificial Intelligence

Which AI Workflow Platform Wins? A Deep Dive into n8n, Dify, and Coze

This article compares three leading AI workflow tools—n8n, Dify, and Coze—by examining their origins, technical architectures, core advantages, typical use cases, real‑world case studies, and future deployment trends, helping developers and businesses choose the right "intelligent assistant" for their needs.

AI · LLM · Low‑code
0 likes · 11 min read
Youzan Coder
May 23, 2025 · Artificial Intelligence

How LLMs Supercharge SaaS Alert Monitoring: An AI‑Powered Workflow

This article explains how a SaaS company leveraged large language models to automatically ingest, enrich, and analyze stability alerts, turning noisy notifications into actionable insights through configurable pipelines, Feishu integration, and a streamlined AI workflow that boosts incident response speed and reduces manual effort.

AI · Alert Monitoring · LLM
0 likes · 6 min read
Volcano Engine Developer Services
May 22, 2025 · Artificial Intelligence

How LLMs Can Automate Ticket Escalation: Inside ByteBrain’s TickIt System

This article introduces TickIt, a ByteBrain system that leverages large language models to automatically identify and escalate critical Oncall tickets, detailing its multi‑class escalation, deduplication, and category‑guided fine‑tuning modules, experimental results, and the operational impact on cloud services.

Incident Management · LLM · Oncall analysis
0 likes · 13 min read
JD Tech Talk
May 22, 2025 · Artificial Intelligence

From Academic Research to Industrial Anti‑Fraud: Leveraging LLMs, Reinforcement Learning, and Model Distillation for Advertising Risk Detection

The article recounts Xiaoting’s journey from a PhD research background to leading JD.com’s ad‑fraud detection, detailing how large language models, reinforcement learning, and model distillation were applied to identify hidden address codes, reduce false‑positive rates to 0.3%, and balance accuracy with real‑time performance in a high‑traffic e‑commerce environment.

AI · Ad Fraud · Advertising
0 likes · 11 min read
Sohu Tech Products
May 21, 2025 · Artificial Intelligence

Beyond LLM Limits: Function Calling, MCP, and A2A Compared

The article examines the inherent knowledge cutoff of large language models, introduces function calling, Model Context Protocol (MCP), and Agent‑to‑Agent (A2A) as solutions for real‑time data access, compares their architectures, communication patterns, and use cases, and discusses their respective strengths and drawbacks.

A2A · AI protocols · Function Calling
0 likes · 17 min read
Alibaba Cloud Developer
May 21, 2025 · Artificial Intelligence

How to Seamlessly Integrate MCP Protocol with Spring AI for Powerful LLM Tool Calls

This article explains the challenges of integrating diverse tools without MCP, then demonstrates step‑by‑step how to configure Spring‑AI and the native MCP SDK to call LLMs, register tools, handle SSE and stdio services, and troubleshoot common issues, providing code snippets and best‑practice recommendations.

AI tool integration · Backend Development · Java
0 likes · 16 min read
DeWu Technology
May 19, 2025 · Artificial Intelligence

AI-Powered Automated Test Case Generation: Design, Implementation, and Future Plans

This article presents a comprehensive AI-driven solution for automatically generating functional test cases, detailing the AI background, design scheme, core components such as PRD parsing, test‑point generation, test‑case creation, knowledge‑base construction, implementation results, and future development directions.

AI · Knowledge Base · LLM
0 likes · 7 min read
Huawei Cloud Developer Alliance
May 19, 2025 · Artificial Intelligence

What Is AI MCP and How It Revolutionizes Model Integration?

AI MCP (Model Context Protocol) is an open protocol that standardizes communication between large language model applications and external data sources or tools, offering pre‑installed services, fast registration, ecosystem openness, and automatic discovery within Huawei Cloud ModelArts Studio, while eliminating the need for per‑API integration code.

AI · Huawei · LLM
0 likes · 7 min read
Youzan Coder
May 16, 2025 · Artificial Intelligence

Intelligent Address Recognition: AI‑Assisted Hybrid Solution and Prompt Engineering

This article describes how a hybrid architecture that combines third‑party address‑recognition APIs with large‑language‑model (LLM) processing, along with carefully engineered prompts and a TSV output format, dramatically improves address parsing accuracy and latency in a retail checkout scenario.

AI · Hybrid Architecture · LLM
0 likes · 12 min read
Alibaba Cloud Developer
May 16, 2025 · Artificial Intelligence

Designing Robust MCP Servers for Alibaba Cloud Observability 2.0 – Lessons & Best Practices

This article explains the Model Context Protocol (MCP), its components, and how to integrate MCP servers with Alibaba Cloud Observability 2.0, offering practical design experiences, tool simplification tips, default parameter strategies, output size control, and future AI‑driven observability insights.

LLM · Observability · mcp
0 likes · 17 min read
AI Large Model Application Practice
May 16, 2025 · Artificial Intelligence

Why Residual Connections Keep Deep Neural Networks Stable

This article explains why residual connections are essential in deep neural networks, describing the problems of network degradation and gradient vanishing, how shortcut paths add the input to the layer output, the requirement of matching dimensions, and the resulting stability for training large language models.

LLM · Residual Connections · gradient flow
0 likes · 7 min read
Instant Consumer Technology Team
May 15, 2025 · Artificial Intelligence

Unlocking Agentic AI: How Agent Workflows Transform Intelligent Automation

This article demystifies AI agents and agentic workflows, explaining their core components—LLMs, tools, and memory—while detailing planning, tool‑use, and reflection patterns, comparing agentic, non‑agentic, and traditional workflows, and exploring real‑world applications, advantages, and limitations.

AI Agents · LLM · agentic workflows
0 likes · 21 min read
Alibaba Cloud Big Data AI Platform
May 15, 2025 · Artificial Intelligence

How to Build a Qwen3‑Powered ChatBI Agent with PAI‑LangStudio and Hologres

This guide walks you through creating a ChatBI intelligent agent by integrating Alibaba's Qwen3 large language model with PAI‑LangStudio, configuring the Model Context Protocol (MCP) server, and connecting to Hologres real‑time data warehouse, covering setup, deployment, and verification steps for enterprise data analysis.

ChatBI · Data Analysis · Hologres
0 likes · 11 min read
StarRocks
May 13, 2025 · Artificial Intelligence

How StarRocks MCP Server Enables LLMs to Query Databases Without Custom Plugins

StarRocks MCP Server provides a universal adapter that lets large language models like Claude, OpenAI, and Gemini execute SQL queries directly against StarRocks, simplifying data Q&A, intelligent analysis, and automated reporting by eliminating the need for bespoke plugins or complex prompt engineering.

AI Agents · LLM · StarRocks
0 likes · 14 min read
Tencent Cloud Developer
May 13, 2025 · Artificial Intelligence

Function Calling and Model Context Protocol (MCP): Bridging Large Language Models with Real‑World Systems

The article reviews the shortcomings of traditional large language models, explains how function calling extends LLMs beyond pure text, introduces the Model Context Protocol (MCP) as a standardized USB‑C‑like interface for AI tools, and demonstrates a Python MCP example that integrates LLMs with Tencent Advertising APIs.

AI Integration · API · Function Calling
0 likes · 16 min read
Tencent Technical Engineering
May 12, 2025 · Artificial Intelligence

Comprehensive Summary and Expansion of Andrej Karpathy’s 7‑Hour LLM Lecture

This article provides a detailed Chinese‑to‑English summary of Andrej Karpathy’s 7‑hour LLM tutorial, covering chat process analysis, tokenization, pre‑training data pipelines, model architecture, training strategies, post‑training fine‑tuning, reinforcement learning, chain‑of‑thought reasoning, and current industry applications.

AI · LLM · model architecture
0 likes · 25 min read
AI Algorithm Path
May 9, 2025 · Artificial Intelligence

A Visual Guide to Mixture of Experts (MoE) Architecture in Large Language Models

This article explains the Mixture of Experts (MoE) technique used in modern LLMs, detailing its core components—experts and router—comparing dense and sparse layers, describing load‑balancing, expert capacity, and routing strategies, and showcasing real‑world examples such as Switch Transformer, Vision‑MoE, and Mixtral 8x7B.

Expert Capacity · LLM · Mixture of Experts
0 likes · 15 min read
phodal
May 9, 2025 · Artificial Intelligence

Why Pre‑Generated Context Is the Key to Faster, More Accurate AI Code Retrieval

The article examines how pre‑generating structured context for codebases can overcome the uncertainty and quality issues of traditional Retrieval‑Augmented Generation, outlines the technical and business challenges of RAG, compares existing code‑search tools, and introduces AutoDev’s Context Worker as a practical solution.

AI · LLM · RAG
0 likes · 11 min read
Bilibili Tech
May 9, 2025 · Artificial Intelligence

How an AI Gateway Scales LLM Services: Architecture, Auth, Quotas, and Load Balancing

This article explains the design of an AI gateway that centralizes LLM access, detailing its background, overall architecture, authentication, quota management, multi‑model routing, load‑balancing strategies, multi‑tenant isolation, observability features, and the supported API protocols for enterprise integration.

AI gateway · Authentication · LLM
0 likes · 17 min read