Tagged articles

2079 articles

Jul 3, 2025 · Artificial Intelligence

Boosting LLM Function Call Capabilities: From Data Construction to RLHF Optimization

On July 12, 2025, the DataFun Summit will feature a technical session where China Telecom AI Research Institute engineer Yao Yitong presents a deep dive into enhancing large language model Function Call abilities through systematic data and training optimizations, offering practical insights for AI practitioners.

AI · LLM · RLHF

0 likes · 4 min read

DaTaobao Tech

Jul 2, 2025 · Artificial Intelligence

How AI Powers 24/7 Digital Human Live Streams: Architecture, Challenges, and Innovations

This article presents a comprehensive overview of the AI‑driven digital‑human live‑streaming solution used by Taobao, detailing six core components—including LLM‑based content generation and interaction, TTS, visual driving, audio‑video engineering, and backend services—while sharing architectural diagrams, cost‑reduction strategies, productization insights, and future directions.

AI · LLM · TTS

0 likes · 8 min read

Cognitive Technology Team

Jul 1, 2025 · Artificial Intelligence

How We Built a Live‑Streaming TTS Engine: From Data Pipelines to AI Voice Generation

This article presents a comprehensive practice summary of building an intelligent digital‑human system, covering six core modules—LLM content generation, LLM interaction, TTS synthesis, visual driving, audio‑video engineering, and backend services—while detailing data collection, signal processing, ASR annotation, speaker clustering, model optimization (V1‑V4), evaluation metrics, and future research directions.

AI voice · Audio Processing · LLM

0 likes · 23 min read

Go Programming World

Jul 1, 2025 · Artificial Intelligence

What Is the Model Context Protocol (MCP) and How It’s Shaping AI Development

Model Context Protocol (MCP), an open-source standard from Anthropic, standardizes how large language models interact with external tools and data sources, introducing a client‑server architecture with hosts, clients, and servers, and promises to simplify AI application development compared to traditional function‑calling approaches.

AI · LLM · client-server

0 likes · 5 min read

JavaEdge

Jun 30, 2025 · Artificial Intelligence

How GPULlama3.java Brings GPU‑Accelerated Llama 3 to Pure Java

GPULlama3.java, released by the University of Manchester's Beehive Lab, is the first native Java implementation of Llama 3 that leverages TornadoVM to automatically accelerate inference on GPUs without writing CUDA or native code, supporting NVIDIA, Intel and Apple Silicon back‑ends and modern Java 21 features.

AI · GPU Acceleration · Java

0 likes · 7 min read

Alibaba Cloud Big Data AI Platform

Jun 30, 2025 · Artificial Intelligence

Unlocking Small LLM Power: Variable‑Length Chain Distillation with DistillQwen‑ThoughtY

This article introduces a variable‑length chain‑of‑thought distillation technique built on Alibaba Cloud PAI’s EasyDistill toolkit, presents the high‑quality OmniThought‑0528 dataset, details the training of the DistillQwen‑ThoughtY 4B/8B/32B models, and provides code and usage examples for researchers and practitioners.

LLM · chain-of-thought · dataset

0 likes · 15 min read

DaTaobao Tech

Jun 30, 2025 · Artificial Intelligence

One‑Click AI Digital Human for Live Commerce: LLM, Lip Sync & Real‑Time Tech

This article outlines the end‑to‑end architecture and practical solutions behind creating intelligent digital humans for live commerce, covering LLM‑driven content generation, real‑time lip‑sync, image‑driven avatar creation, automated material review, lightweight model training, and a roadmap toward fully automated, high‑performance virtual presenters.

AI · LLM · Model Compression

0 likes · 19 min read

Qborfy AI

Jun 28, 2025 · Artificial Intelligence

Mastering LangGraph: Build Stateful, Looping LLM Agents with Python

This tutorial walks through the limitations of linear LangChain workflows, introduces LangGraph’s state‑node‑edge architecture, and provides step‑by‑step code examples—including a Hello‑World tool, conditional branching, multi‑turn conversation handling, and graph visualization—so readers can construct robust, persistent LLM agents.

LLM · LangChain · LangGraph

0 likes · 9 min read

MaGe Linux Operations

Jun 28, 2025 · Artificial Intelligence

Master Dify: From Local Deployment to Advanced AI Workflows in 2025

This guide walks you through installing and configuring Dify, an open‑source LLM application platform, on your local machine using Docker, integrating it with Ollama for custom models, and exploring its core features such as chat assistants, agents, workflows, and tool extensions, all illustrated with step‑by‑step screenshots and code snippets.

AI workflow · Dify · Docker

0 likes · 12 min read

Fighter's World

Jun 28, 2025 · Artificial Intelligence

What Is the Generator‑Verifier Gap and Why It Matters for LLM Reasoning

The article explains the Generator‑Verifier Gap (GVG)—the asymmetry where verifying a solution is far cheaper than generating it—covers its origin, its impact on test‑time scaling for large language models, reinforcement‑learning approaches, and how the concept can shape agent architectures and AI product strategy.

Agent Architecture · Generator-Verifier Gap · LLM

0 likes · 21 min read

AI Algorithm Path

Jun 28, 2025 · Artificial Intelligence

Implementing Greedy and Beam Decoding for Large Language Models from Scratch

This article walks through the mechanics of greedy search and beam search in large language models, demonstrates both methods with GPT‑2 on the prompt "I have a dream", visualizes the decoding trees, compares their scores, and discusses the trade‑offs between efficiency and output quality.

Beam Search · GPT-2 · Greedy Search

0 likes · 16 min read

JavaEdge

Jun 27, 2025 · Artificial Intelligence

Why Inference Engines Are Essential for Deploying Large Language Models in Production

The article explains what inference engines are, why they are needed beyond raw Python scripts, and outlines best practices such as model quantization, batching, and parallelism, while comparing popular open‑source and commercial options for production AI workloads.

AI deployment · Batching · Inference Engine

0 likes · 14 min read

360 Zhihui Cloud Developer

Jun 27, 2025 · Operations

How AI‑Powered Ops‑Nexus Transforms Intelligent Operations for 100k+ Servers

This article details the design, technology choices, functional modules, core implementation, performance optimizations, and future roadmap of Ops‑Nexus, an AI‑driven intelligent operations platform that streamlines alarm analysis, log processing, and host health checks for large‑scale monitoring environments.

AI Ops · Intelligent Operations · LLM

0 likes · 12 min read

Fun with Large Models

Jun 27, 2025 · Artificial Intelligence

Boost Answer Accuracy: Detailed GraphRAG Retrieval Steps with Knowledge Graphs

This article walks through GraphRAG’s retrieval phase, showing how knowledge‑graph entities, relationships, and community reports are assembled into a query context, comparing local and global modes with traditional RAG, and illustrating the process with a concrete “Age of Big Data” example.

GraphRAG · Knowledge Graph · LLM

0 likes · 14 min read

AI Algorithm Path

Jun 26, 2025 · Artificial Intelligence

The 10 Essential Components of a Retrieval‑Augmented Generation (RAG) System

This guide breaks down the ten core building blocks of a production‑ready RAG pipeline—from input handling and vector stores to prompt engineering, LLM inference, observability, and evaluation—showing why each piece matters, common pitfalls, and practical best‑practice recommendations.

LLM · Observability · Prompt Engineering

0 likes · 9 min read

Java Architecture Diary

Jun 25, 2025 · Artificial Intelligence

Build a Text‑to‑SQL Chatbot with Spring AI and DeepSeek LLM

This tutorial walks through creating a natural‑language‑to‑SQL chatbot using Spring AI, configuring a MySQL school database with Flyway, defining system prompts for a DeepSeek LLM, implementing service beans and a REST API, and interacting with the bot via curl commands.

Chatbot · DeepSeek · Java

0 likes · 15 min read

Continuous Delivery 2.0

Jun 25, 2025 · Artificial Intelligence

How Model Context Protocol Turns LLMs into Plug‑and‑Play AI Assistants

The Model Context Protocol (MCP) is an open, standardized adapter that lets large language models seamlessly connect to tools, data sources, and workflows, offering plug‑and‑play intelligence, cross‑platform compatibility, security, and modular extensibility for building real‑world AI applications.

AI Integration · LLM · Model Context Protocol

0 likes · 11 min read

AntTech

Jun 23, 2025 · Artificial Intelligence

Can AI Auditors Ensure Reliable Software? Highlights from EXPRESS 2025 at ISSTA

The EXPRESS 2025 workshop at ISSTA in Norway will showcase AI‑driven code auditing, present cutting‑edge research on trustworthy software systems, and invite researchers and practitioners to discuss transparency, reliability, and security challenges in modern software engineering.

AI auditing · ISSTA 2025 · LLM

0 likes · 5 min read

Alibaba Cloud Native

Jun 23, 2025 · Artificial Intelligence

From If/Else to Goal‑Oriented Agents: How LLMs Are Shaping Software 3.0

The article reflects on Andrej Karpathy’s AI Startup School talk, outlining the evolution from traditional if‑else programming (Software 1.0) through data‑driven models (Software 2.0) to goal‑oriented natural‑language agents (Software 3.0), and examines LLMs as operating‑system‑like infrastructure, prompting, and engineering challenges.

LLM · software evolution

0 likes · 5 min read

DaTaobao Tech

Jun 23, 2025 · Artificial Intelligence

How We Built a High‑Accuracy AI‑Powered Digital Human Script Engine for Live Commerce

This article details the end‑to‑end AI pipeline for creating intelligent digital humans in live streaming, covering LLM‑driven script generation, multimodal data integration, error‑prone number handling, DPO fine‑tuning, experimental results, and future directions for more human‑like presentations.

AI · LLM · Script Generation

0 likes · 35 min read

Architecture & Thinking

Jun 23, 2025 · Artificial Intelligence

Building AI Assistants with Eino: A Go Framework for Large‑Model Applications

This article introduces Eino, an open‑source Golang framework for large‑model AI applications, explains its core capabilities, walks through creating a simple AI assistant with message templates and chat model integration, and demonstrates how to extend the system with tools and a modular architecture for future expansion.

AI Assistant · Eino · Framework

0 likes · 17 min read

DataFunSummit

Jun 22, 2025 · Artificial Intelligence

How Vivo’s BlueHeart AI Assistant Optimizes Post‑Conversation Recommendations with LLMs

In a detailed interview, Vivo AI engineer Liang Tianan explains how the BlueHeart Small V assistant leverages large language models, multi‑stage recall, ranking, and reward‑model fine‑tuning (SFT/DPO) to generate high‑quality, diverse post‑dialogue recommendation items while balancing latency, cost, and evaluation challenges.

DPO · LLM · SFT

0 likes · 15 min read

Tech Freedom Circle

Jun 21, 2025 · Artificial Intelligence

How MCP + LLM + Agent Architecture Becomes the AI Agent’s Neural Hub and New Infrastructure

The article explains the Model Context Protocol (MCP) as a zero‑code bridge that lets large language models seamlessly access databases, external APIs, and execute code, detailing its benefits for developers and everyday users, its core components, step‑by‑step workflow, real‑world examples, and how it outperforms traditional APIs in modern AI agent systems.

AI Agent · LLM · Model Context Protocol

0 likes · 37 min read

Spring Full-Stack Practical Cases

Jun 21, 2025 · Artificial Intelligence

Master AI Agent Workflows with Spring Boot 3: From Chains to Orchestrators

This article introduces the fundamentals of augmented large language model agents, explains six workflow patterns—including chain, parallel, routing, orchestrator‑workers, evaluator‑optimizer, and autonomous agents—and provides complete Spring Boot 3 code examples, configuration, and test results for each pattern.

Backend · Java · LLM

0 likes · 15 min read

Fighter's World

Jun 21, 2025 · Artificial Intelligence

Speculating Devin’s Context Engineering Architecture: How Long‑Horizon Agents Preserve Complete Context

The article analyzes why context engineering is crucial for multi‑agent AI systems, illustrates the fragility caused by fragmented context with a Flappy Bird analogy, and proposes three detailed speculative components—a compression‑to‑structure pipeline, a hybrid layered memory architecture, and a context‑aware coordination mechanism—culminating in a unified reference design for long‑horizon agents.

Agent Coordination · Compression Pipeline · Context Engineering

0 likes · 22 min read

Alibaba Cloud Developer

Jun 20, 2025 · Artificial Intelligence

How to Build High‑Availability AI Agents: Challenges, Strategies, and Real‑World Insights

This article explores the evolving concept of AI agents, debates their definitions, outlines four major deployment challenges—including prompt instability, planning balance, domain knowledge integration, and response speed—and presents practical strategies such as prompt engineering, workflow design, multi‑agent architectures, and model optimization to build reliable, high‑availability agents.

AI Agent · Agentic Systems · LLM

0 likes · 32 min read

dbaplus Community

Jun 19, 2025 · Artificial Intelligence

How Constrained Decoding Guarantees 100% Correct SQL from Large Language Models

This article explains how constrained decoding, built on context‑free grammars, Jinja templates, and the XGrammar engine, can enforce strict SQL syntax and custom business rules during LLM generation, enabling reliable, production‑grade NL‑to‑SQL services.

CFG · Jinja · LLM

0 likes · 37 min read

Alibaba Cloud Developer

Jun 19, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Empowers LLMs?

The article introduces Model Context Protocol (MCP), explains its architecture of Host, Client, and Server, describes its components—Resources, Tools, Prompts—and demonstrates practical integration with IDE plugins to extend LLM capabilities such as real‑time ticket queries, highlighting its significance for AI development.

AI Integration · AI tooling · Function Calling

0 likes · 11 min read

Sohu Tech Products

Jun 18, 2025 · Backend Development

How LLMs Transform Traffic Replay Testing for Backend Services

This article walks through the challenges of traditional traffic replay, explains the design and benefits of a conventional replay system, and then details how integrating large language models can automate data preparation, script generation, and validation to make backend testing more accurate, scalable, and efficient.

Backend testing · LLM · service reliability

0 likes · 18 min read

DataFunTalk

Jun 18, 2025 · Artificial Intelligence

Can LLMs Really Beat Human Olympiad Programmers? Insights from LiveCodeBench Pro

This article examines the LiveCodeBench Pro benchmark, revealing that while large language models achieve impressive scores on knowledge‑ and logic‑heavy coding problems, they still fall short of human experts on high‑difficulty, observation‑intensive tasks, especially without external tool support.

AI evaluation · Benchmark · LLM

0 likes · 11 min read

AIWalker

Jun 18, 2025 · Artificial Intelligence

Six New Directions for Large Language Models

Large language models are booming, and this article highlights six cutting‑edge research directions—LLM‑plus synthetic data, reward modeling, inference techniques, LLM‑as‑a‑Judge, safety alignment, and long‑context handling—each illustrated with recent papers, experimental results, and links to code repositories.

LLM · Reward Modeling · Safety Alignment

0 likes · 9 min read

Aikesheng Open Source Community

Jun 17, 2025 · Artificial Intelligence

Introducing SCALE: An Open‑Source Benchmark Redefining LLM SQL Capabilities

This article presents SCALE, a community‑driven, open‑source benchmark that expands beyond simple Text‑to‑SQL accuracy to evaluate large language models on performance, dialect conversion, and deep SQL understanding, offering developers, researchers, and CTOs a realistic measure of AI‑assisted database tasks.

AI · Benchmark · Evaluation

0 likes · 10 min read

Tencent Technical Engineering

Jun 16, 2025 · Artificial Intelligence

Mastering RAG and AI Agents: Practical Tips, Code Samples, and Evaluation Strategies

This comprehensive guide walks you through the fundamentals of Retrieval‑Augmented Generation (RAG) and AI agents, explains their inner workings, shares optimization tricks, provides ready‑to‑run code snippets, and demonstrates how to evaluate performance with metrics such as recall, faithfulness, and answer relevance.

AI Agents · Evaluation · LLM

0 likes · 36 min read

AsiaInfo Technology: New Tech Exploration

Jun 16, 2025 · Artificial Intelligence

How LangGraph Implements Shared Memory for Multi‑Agent Systems: Techniques, Tools, and Future Directions

This article examines the theory and practice of shared memory in multi‑agent systems, tracing its evolution from classic blackboard models to modern solutions like Mem0.ai, Open Memory, and A‑MEM, and provides concrete design patterns, integration strategies, and future research directions for LangGraph users.

AI memory · Distributed Systems · LLM

0 likes · 37 min read

Network Intelligence Research Center (NIRC)

Jun 15, 2025 · Cloud Native

How MicroOps Enables Easy Deployment and Management of Virtual Networks on Kubernetes

The article details MicroOps' virtual network feature on Kubernetes, covering manual and intent‑driven deployment, topology visualization and editing, node types, monitoring with Prometheus and Fluentd, chaos injection via ChaosMesh and VN_Chaos, and upcoming alarm and self‑healing modules.

Fluentd · Kubernetes · LLM

0 likes · 6 min read

ITPUB

Jun 15, 2025 · Artificial Intelligence

How to Build a High‑Performance Enterprise RAG System with Model Context Protocol (MCP)

This article presents a step‑by‑step guide for constructing a scalable enterprise Retrieval‑Augmented Generation (RAG) solution using the Model Context Protocol (MCP), covering architecture comparison, system design, Milvus‑backed knowledge store, Python client implementation, deployment scripts, code examples, and best‑practice recommendations.

KnowledgeBase · LLM · Milvus

0 likes · 22 min read

Fighter's World

Jun 14, 2025 · Artificial Intelligence

How Can LLMs Learn to “Think” in Complex Industry Scenarios?

The article analyzes how large language models can acquire true reasoning abilities for hard‑to‑score industry tasks by combining Chain‑of‑Thought prompting with reinforcement learning, addressing vague reward signals, reward hacking, and faithfulness, and proposing a toolbox of reward engineering, synthetic data, hierarchical RL and multi‑agent collaboration.

LLM · Reward Modeling · chain-of-thought

0 likes · 22 min read

Alibaba Cloud Big Data AI Platform

Jun 13, 2025 · Artificial Intelligence

How EasyDistill Cuts LLM Costs: Mastering DistilQwen-ThoughtX on Alibaba Cloud

EasyDistill, an open-source framework from Alibaba Cloud PAI, streamlines knowledge distillation for large language models, introducing the DistilQwen-ThoughtX series with variable-length chain-of-thought reasoning, and provides comprehensive best-practice guidance for training, fine-tuning, evaluation, compression, and deployment via the PAI-ModelGallery.

AI inference · Knowledge Distillation · LLM

0 likes · 12 min read

Instant Consumer Technology Team

Jun 12, 2025 · Artificial Intelligence

How to Build a Production-Ready RAG System with Qwen3 Embedding and Reranker Models

This guide walks through using Alibaba's new Qwen3-Embedding and Qwen3-Reranker models to build a two‑stage Retrieval‑Augmented Generation pipeline with Milvus, covering environment setup, data ingestion, vector indexing, reranking, and LLM‑driven answer generation, demonstrating production‑grade performance across multilingual queries.

Embedding · LLM · Milvus

0 likes · 19 min read

Alibaba Cloud Developer

Jun 11, 2025 · Artificial Intelligence

From Chat to Autonomous Agents: Architecture, ReAct, Prompt Engineering

This article chronicles the evolution from simple chat interactions to sophisticated autonomous agents, detailing stages of LLM development, ReAct reasoning, memory management, tool integration, and practical implementation using the browser-use project, while offering prompt design insights and future directions for AI agents.

AI Agent · LLM · Memory

0 likes · 30 min read

Architecture & Thinking

Jun 11, 2025 · Artificial Intelligence

Accelerate LLM App Development with Eino: A Go Framework Walkthrough

Eino is an open‑source Golang framework for building large‑model applications, offering reusable components, robust orchestration, clean APIs, best‑practice templates, and full‑cycle DevOps tools, with code examples for both Ollama and OpenAI modes, plus streaming and normal output options.

AI development · Framework · Go

0 likes · 10 min read

Instant Consumer Technology Team

Jun 10, 2025 · Artificial Intelligence

Unlocking AI Agent Integration with Model Context Protocol (MCP): A Complete Guide

This article explains how the Model Context Protocol (MCP) standardizes AI agent communication with external tools, outlines its benefits, describes its core components, showcases open‑source implementations, and provides step‑by‑step Python examples for building MCP servers and clients.

Function Calling · LLM · Model Context Protocol

0 likes · 22 min read

Alibaba Cloud Developer

Jun 10, 2025 · Artificial Intelligence

How AI Application Architectures Evolve: From Simple LLM Calls to Guardrails, Routing, and Agents

This article traces the evolution of AI application architectures—from the earliest minimal user‑LLM interaction to advanced designs featuring context enhancement, input/output guardrails, intent routing, model gateways, caching strategies, agent capabilities, monitoring, and inference performance optimizations—providing practical insights and references for developers.

AI Architecture · Caching · Inference Optimization

0 likes · 21 min read

DataFunSummit

Jun 8, 2025 · Artificial Intelligence

Mastering LLM Applications: Practical Agent Design and Implementation Strategies

This comprehensive guide explores the core implementation paths for large language model (LLM) applications, focusing on agent design, workflow orchestration, tool integration, memory management, multi‑agent architectures, and future trends, providing actionable methodologies and real‑world examples for practitioners.

AI Agent · Agent Design · LLM

0 likes · 25 min read

dbaplus Community

Jun 7, 2025 · Artificial Intelligence

How Large Language Models Are Transforming Data Warehousing: Real-World Experiments and Lessons

The article shares practical experiences using large language models such as Cursor and DeepSeek in data‑warehouse workflows, covering assisted coding, automated metric extraction, self‑service analysis, documentation generation, their benefits, limitations, and the broader impact on data engineering roles.

AI automation · Business Intelligence · Data Warehouse

0 likes · 9 min read

AI2ML AI to Machine Learning

Jun 6, 2025 · Artificial Intelligence

Tackling the Top Challenges of Retrieval‑Augmented Generation (RAG)

The article enumerates common pitfalls of Retrieval‑Augmented Generation—such as missing content, low‑rank document misses, context limits, format errors, incomplete answers, scalability bottlenecks, complex PDF extraction, data‑quality issues, domain adaptation gaps, hallucinations, and feedback‑loop deficiencies—and offers concrete mitigation strategies ranging from data cleaning and prompt design to hybrid search, hierarchical retrieval, document compression, and automated evaluation.

Data Quality · Hybrid Search · LLM

0 likes · 9 min read

Youzan Coder

Jun 6, 2025 · Artificial Intelligence

How AI Agents Turn Manual Data Retrieval into Fully Automated Insights

This article examines the challenges of manual data extraction in data‑driven enterprises, explains why large language models alone fall short, and details how the Cursor‑Agent framework automates end‑to‑end querying, knowledge‑base integration, and result validation to become a self‑sufficient "data master" for both technical and non‑technical users.

AI Agent · Cursor Agent · Data Automation

0 likes · 26 min read

DaTaobao Tech

Jun 6, 2025 · Artificial Intelligence

Redefining Business Core Assets in the LLM Era: Agent Evolution & Collaboration

This article examines how the rise of large language models reshapes core business assets, defines agents and tools, explores multi‑agent collaboration patterns, task allocation and conflict resolution mechanisms, and evaluates the MCP protocol and engineering requirements for building scalable, flexible agent platforms.

Agent Architecture · LLM · MCP protocol

0 likes · 9 min read

JavaEdge

Jun 5, 2025 · Artificial Intelligence

How Amazon’s Strands Agents SDK Simplifies Building AI Agents

Amazon’s newly open‑sourced Strands Agents SDK lets developers create AI agents with minimal code by defining prompts, tools, and models, offering a lightweight, production‑ready framework that supports multiple model providers, observability, multi‑agent collaboration, and extensible tooling via dedicated packages.

AI Agents · Amazon · LLM

0 likes · 7 min read

Didi Tech

Jun 5, 2025 · Artificial Intelligence

Unlocking Modern AI Application Architecture: From RAG to Agents and MCP

This article surveys the evolution of AI applications, explains large language model fundamentals, outlines architectural challenges, and introduces three core patterns—Retrieval‑Augmented Generation (RAG), autonomous Agents, and Model Context Protocol (MCP)—while providing practical LangChain code snippets and integration guidance.

AI · LLM · LangChain

0 likes · 28 min read

AI Frontier Lectures

Jun 5, 2025 · Artificial Intelligence

Bridging Thought Leaps: How CoT‑Bridge Boosts LLM Reasoning Accuracy

This paper introduces the Thought Leap Bridge task and the CoT‑Bridge model, which detect and fill missing intermediate steps in chain‑of‑thought reasoning, dramatically improving large language model performance on mathematical and logical benchmarks and enhancing downstream distillation and reinforcement‑learning pipelines.

Chain-of-Thought · CoT-Bridge · LLM

0 likes · 8 min read

AI Algorithm Path

Jun 4, 2025 · Artificial Intelligence

Why LLMs Hallucinate and How to Mitigate the Problem

The article explains that hallucinations in large language models stem mainly from the supervised fine‑tuning stage, illustrates the issue with concrete examples, and presents mitigation techniques such as knowledge‑probing data generation and web‑search tool integration using special tokens.

LLM · Meta · OpenAssistant

0 likes · 12 min read

Architect's Alchemy Furnace

Jun 4, 2025 · Artificial Intelligence

What Is an AI Engineer? Roles, Skills, and the Future of LLM‑Powered Systems

This article examines the evolving role of the AI engineer, contrasting it with AI researchers, ML engineers, and software engineers, outlines essential skills such as prompt engineering, MLOps, and data integration, and predicts how AI engineering will become a pivotal, high‑demand discipline in the coming years.

AI Engineering · AI Systems · Agentic RAG

0 likes · 17 min read

Baobao Algorithm Notes

Jun 4, 2025 · Artificial Intelligence

Do Recent LLM‑RL Papers Overstate Their Gains? A Critical Review

This article critically examines seven high‑profile reinforcement‑learning papers for large language models, exposing flawed baseline evaluations, unrealistic settings, and modest actual improvements despite bold claims of dramatic performance gains.

AI research · LLM · baseline evaluation

0 likes · 8 min read

Xiaohongshu Tech REDtech

Jun 4, 2025 · Artificial Intelligence

From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MARKERGEN

MarkerGen introduces a novel, plug‑and‑play framework that decomposes length‑controllable text generation into four sub‑abilities—identifying, counting, planning, and aligning—integrates external tokenizers and dynamic markers, and achieves significantly lower length errors and higher quality across diverse models, tasks, and languages.

LLM · Length-Controlled Generation · MarkerGen

0 likes · 14 min read

DaTaobao Tech

Jun 4, 2025 · Artificial Intelligence

Understanding Large Language Model Architecture, Parameters, Memory, Storage, and Fine‑Tuning Techniques

This article provides a comprehensive overview of large language models (LLMs), covering their transformer architecture, parameter counts, GPU memory and storage requirements, and detailed fine‑tuning methods such as prompt engineering, data construction, LoRA, PEFT, RLHF, and DPO, along with practical deployment and inference acceleration strategies.

DPO · LLM · LoRA

0 likes · 17 min read

AI Frontier Lectures

Jun 3, 2025 · Artificial Intelligence

Master LLM Engineering: Model Conversion, Parallel Inference, and Channel‑Loss Techniques

This article outlines essential LLM engineering skills, including scripts for converting various model checkpoints to Llama format, customizing modeling files for advanced features, building a multi‑GPU inference class, and adding channel‑aware loss tracking to fine‑tuning pipelines.

Flash Attention · LLM · Training Optimization

0 likes · 6 min read

Alibaba Cloud Observability

Jun 3, 2025 · Artificial Intelligence

How to Build an MCP Server for AI-Powered Observability: 6 Practical Design Tips

Discover how to design and implement an MCP Server that integrates AI-driven observability, covering essential components, best practices, code examples, and real-world lessons learned to enable natural language interaction with monitoring data and streamline system analysis.

AI · LLM · Server

0 likes · 16 min read

ITFLY8 Architecture Home

Jun 2, 2025 · Artificial Intelligence

Choosing the Right LLM AI Agent Protocol: A Four‑Category Guide

This article systematically surveys existing LLM AI Agent communication protocols, groups them into four major categories, details their functions, benefits, and use‑cases, and compares four representative protocols (MCP, A2A, ANP, and Agora) through a concrete travel‑planning scenario.

AI Agent · Communication Protocol · LLM

0 likes · 11 min read

Fighter's World

Jun 2, 2025 · Artificial Intelligence

Why Is Context King for Large Language Models?

This article provides a comprehensive technical analysis of LLM context, covering its definition, types, tokenization, window‑size evolution, diminishing returns, management techniques such as RAG, CoT, memory‑as‑a‑service, and future challenges like multimodal fusion, privacy, and autonomous agent memory.

Agent Memory · Context Management · LLM

0 likes · 48 min read

Radish, Keep Going!

Jun 2, 2025 · Databases

Human‑Crafted XOR Trick Beats LLMs in Detecting Redis Vector Set Bugs

The author recounts fixing a complex Redis Vector Sets bug, explores how human creativity outperforms LLMs in devising efficient data‑consistency checks, and shares experimental ideas—including XOR accumulators and MurmurHash—to detect non‑mutual links in large HNSW graphs.

Data Consistency · LLM · Redis

0 likes · 8 min read

JavaEdge

May 30, 2025 · Artificial Intelligence

How to Build a Deep Research Workflow in Dify Using AI Agents

This guide explains how to construct a deep research workflow in Dify that leverages AI agents, loop variables, and structured outputs to automatically explore complex topics, gather sources, and synthesize comprehensive reports with proper citations.

AI workflow · Dify · LLM

0 likes · 9 min read

Instant Consumer Technology Team

May 30, 2025 · Artificial Intelligence

Why Streamable HTTP Is Replacing SSE in AI Communication: An MCP Protocol Deep Dive

This article explains how the Model Context Protocol (MCP) standardizes AI‑assistant communication, compares the traditional Server‑Sent Events (SSE) transport with the newer Streamable HTTP mechanism, and provides step‑by‑step code examples for building both MCP servers and clients that leverage Streamable HTTP for bidirectional, session‑aware data exchange.

AI · LLM · Streamable HTTP

0 likes · 22 min read

Alibaba Cloud Developer

May 29, 2025 · Artificial Intelligence

Build a Minimal Large Language Model from Scratch with Python and PyTorch

This tutorial walks through creating a simple bigram language model in pure Python, refactoring it into a PyTorch implementation, and explains core concepts such as tokenization, embedding layers, loss functions, gradient descent, training loops, and text generation, preparing you for building a full GPT model.

Bigram · LLM · LanguageModel

0 likes · 31 min read

Alibaba Cloud Big Data AI Platform

May 29, 2025 · Artificial Intelligence

How OmniThought Enables Adaptive Reasoning Chains for Better LLM Performance

This article introduces the OmniThought dataset, which annotates over two million chain‑of‑thought reasoning steps with Reasoning Verbosity and Cognitive Difficulty scores, and explains how these metrics guide the training of DistilQwen‑ThoughtX models that adapt chain length to task difficulty, achieving superior performance compared to existing distilled LLMs.

CoT · LLM · Reasoning

0 likes · 16 min read

Tencent Technical Engineering

May 28, 2025 · Artificial Intelligence

A Beginner-friendly Overview of LLMs, Transformers, Prompts, Function Calling, MCP and Agents

This article provides a concise, easy-to-understand introduction to large language models, the transformer architecture, prompt engineering, temperature settings, function calling, the Model Context Protocol (MCP), agent communication (A2A), and future AI programming trends, using simple analogies and illustrative examples.

AI · Function Calling · LLM

0 likes · 11 min read

Alibaba Cloud Developer

May 28, 2025 · Artificial Intelligence

Unlocking LLM Fine‑Tuning: From Architecture to LoRA, DPO and Deployment

This article provides a comprehensive guide to large language model fine‑tuning, covering model architecture, parameter and memory calculations, prompt engineering, data construction, LoRA and PEFT techniques, reinforcement learning methods such as DPO, and practical deployment workflows on internal platforms.

Fine‑Tuning · LLM · LoRA

0 likes · 21 min read

AI Large Model Application Practice

May 28, 2025 · Artificial Intelligence

Mastering Human-in-the-Loop for Enterprise LLM Agents with LangGraph

This article explains why Human-in-the-Loop (HITL) is essential for enterprise LLM agents, outlines common HITL patterns, and shows how LangGraph’s interrupt, resume, and checkpoint mechanisms can be used to build reliable, auditable workflows with tool‑call control and practical code examples.

AI · Human-in-the-Loop · LLM

0 likes · 14 min read

JavaEdge

May 27, 2025 · Artificial Intelligence

Boost LLM App Performance: Master Parallel Workflows in Dify v0.8.0

Version 0.8.0 of Dify introduces parallel workflow capabilities, allowing multiple branches to run concurrently, which dramatically reduces latency for complex LLM tasks; the guide explains how to create simple, nested, iterative, and conditional parallel branches, with step‑by‑step instructions and visual examples.

Dify · LLM · parallel processing

0 likes · 8 min read

Instant Consumer Technology Team

May 27, 2025 · Artificial Intelligence

How to Build a Text‑to‑SQL Assistant: From Prompt Tricks to Enterprise‑Ready Solutions

This comprehensive guide explains the Text2SQL concept, showcases real‑world scenarios, compares three implementation architectures—including a simple prompt‑based method, a LangChain‑based pipeline, and an enterprise‑grade Vanna solution—while providing practical tips, security measures, and advanced enhancements for deploying robust natural‑language‑to‑SQL systems.

Database · LLM · Text2SQL

0 likes · 26 min read

Baobao Algorithm Notes

May 26, 2025 · Artificial Intelligence

Why Do Reasoning LLMs Lose Instruction-Following Ability? A Deep Dive into Recent Findings

This article compares two recent papers that investigate why large reasoning models such as Llama and Qwen show degraded instruction‑following performance when using chain‑of‑thought prompting, analyzing attention patterns, training effects, and proposed mitigation strategies.

Evaluation · LLM · attention

0 likes · 11 min read

Architecture & Thinking

May 25, 2025 · Artificial Intelligence

Which AI Workflow Platform Wins? A Deep Dive into n8n, Dify, and Coze

This article compares three leading AI workflow tools—n8n, Dify, and Coze—by examining their origins, technical architectures, core advantages, typical use cases, real‑world case studies, and future deployment trends, helping developers and businesses choose the right "intelligent assistant" for their needs.

AI · LLM · Low‑code

0 likes · 11 min read

AI Frontier Lectures

May 24, 2025 · Artificial Intelligence

When Chain‑of‑Thought Backfires: Why More Reasoning Can Hurt LLM Accuracy

A recent study from Harvard, Amazon and NYU shows that chain‑of‑thought (CoT) prompting can significantly reduce large language models' ability to follow strict instructions; the study introduces a new "constraint attention" metric and proposes four mitigation strategies to restore performance.

Chain-of-Thought · LLM · Prompt Engineering

0 likes · 11 min read

Architect

May 23, 2025 · Artificial Intelligence

How We Won the RAG Challenge: Multi‑Router & Dynamic Knowledge Base Techniques Revealed

This article details the end‑to‑end design, parsing tricks, vector database setup, retrieval strategies, prompt engineering, and LLM reranking that powered the winning solution in a company‑annual‑report question‑answering competition.

FAISS · LLM · PDF parsing

0 likes · 37 min read

Youzan Coder

May 23, 2025 · Artificial Intelligence

How LLMs Supercharge SaaS Alert Monitoring: An AI‑Powered Workflow

This article explains how a SaaS company leveraged large language models to automatically ingest, enrich, and analyze stability alerts, turning noisy notifications into actionable insights through configurable pipelines, Feishu integration, and a streamlined AI workflow that boosts incident response speed and reduces manual effort.

AI · Alert Monitoring · LLM

0 likes · 6 min read

Instant Consumer Technology Team

May 22, 2025 · Artificial Intelligence

Build a Weather‑Query AI Service with FastMCP: Step‑by‑Step Python Guide

This tutorial walks you through creating a FastMCP‑based weather‑query server in Python, registering it as an LLM‑callable tool, and building a matching Python client that connects via stdio, handles tool calls, and provides an interactive chat loop for AI‑driven queries.

LLM · Tool Integration · fastmcp

0 likes · 18 min read

Volcano Engine Developer Services

May 22, 2025 · Artificial Intelligence

How LLMs Can Automate Ticket Escalation: Inside ByteBrain’s TickIt System

This article introduces TickIt, a ByteBrain system that leverages large language models to automatically identify and escalate critical Oncall tickets, detailing its multi‑class escalation, deduplication, and category‑guided fine‑tuning modules, experimental results, and the operational impact on cloud services.

Incident Management · LLM · Oncall analysis

0 likes · 13 min read

JD Tech Talk

May 22, 2025 · Artificial Intelligence

From Academic Research to Industrial Anti‑Fraud: Leveraging LLMs, Reinforcement Learning, and Model Distillation for Advertising Risk Detection

The article recounts Xiaoting’s journey from a PhD research background to leading JD.com’s ad‑fraud detection, detailing how large language models, reinforcement learning, and model distillation were applied to identify hidden address codes, reduce false‑positive rates to 0.3%, and balance accuracy with real‑time performance in a high‑traffic e‑commerce environment.

AI · Ad Fraud · Advertising

0 likes · 11 min read

JD Cloud Developers

May 22, 2025 · Artificial Intelligence

How AI and LLMs Power JD’s Real-Time Advertising Anti‑Fraud System

This article recounts a JD researcher’s journey from academic data‑mining competitions to building an AI‑driven, LLM‑enhanced anti‑fraud platform that balances detection accuracy, computational cost, and business value in large‑scale e‑commerce advertising.

AI · LLM · advertising fraud

0 likes · 11 min read

Sohu Tech Products

May 21, 2025 · Artificial Intelligence

Beyond LLM Limits: Function Calling, MCP, and A2A Compared

The article examines the inherent knowledge cutoff of large language models, introduces function calling, Model Context Protocol (MCP), and Agent‑to‑Agent (A2A) as solutions for real‑time data access, compares their architectures, communication patterns, and use cases, and discusses their respective strengths and drawbacks.

A2A · AI protocols · Function Calling

0 likes · 17 min read

Alibaba Cloud Developer

May 21, 2025 · Artificial Intelligence

How to Seamlessly Integrate MCP Protocol with Spring AI for Powerful LLM Tool Calls

This article explains the challenges of integrating diverse tools without MCP, then demonstrates step‑by‑step how to configure Spring‑AI and the native MCP SDK to call LLMs, register tools, handle SSE and stdio services, and troubleshoot common issues, providing code snippets and best‑practice recommendations.

AI tool integration · Backend Development · Java

0 likes · 16 min read

DeWu Technology

May 19, 2025 · Artificial Intelligence

AI-Powered Automated Test Case Generation: Design, Implementation, and Future Plans

This article presents a comprehensive AI-driven solution for automatically generating functional test cases, detailing the AI background, design scheme, core components such as PRD parsing, test‑point generation, test‑case creation, knowledge‑base construction, implementation results, and future development directions.

AI · Knowledge Base · LLM

0 likes · 7 min read

Huawei Cloud Developer Alliance

May 19, 2025 · Artificial Intelligence

What Is AI MCP and How It Revolutionizes Model Integration?

AI MCP (Model Context Protocol) is an open protocol that standardizes communication between large language model applications and external data sources or tools, offering pre‑installed services, fast registration, ecosystem openness, and automatic discovery within Huawei Cloud ModelArts Studio, while eliminating the need for per‑API integration code.

AI · Huawei · LLM

0 likes · 7 min read

Youzan Coder

May 16, 2025 · Artificial Intelligence

Intelligent Address Recognition: AI‑Assisted Hybrid Solution and Prompt Engineering

This article describes how a hybrid architecture that combines third‑party address‑recognition APIs with large‑language‑model (LLM) processing, along with carefully engineered prompts and a TSV output format, dramatically improves address parsing accuracy and latency in a retail checkout scenario.

AI · Hybrid Architecture · LLM

0 likes · 12 min read

Baobao Algorithm Notes

May 16, 2025 · Artificial Intelligence

Why Multi‑Turn LLM Evaluation Fails and How a User‑Simulator Can Fix It

The article explains that large language models lose up to 35% performance in multi‑turn conversations, critiques static single‑turn evaluation methods, and proposes a dynamic user‑simulator with loss‑masking techniques to generate realistic test turns and improve assessment reliability.

AI testing · LLM · RLHF

0 likes · 6 min read

Alibaba Cloud Developer

May 16, 2025 · Artificial Intelligence

Designing Robust MCP Servers for Alibaba Cloud Observability 2.0 – Lessons & Best Practices

This article explains the Model Context Protocol (MCP), its components, and how to integrate MCP servers with Alibaba Cloud Observability 2.0, offering practical design experiences, tool simplification tips, default parameter strategies, output size control, and future AI‑driven observability insights.

LLM · Observability · mcp

0 likes · 17 min read

AI Large Model Application Practice

May 16, 2025 · Artificial Intelligence

Why Residual Connections Keep Deep Neural Networks Stable

This article explains why residual connections are essential in deep neural networks, describing the problems of network degradation and gradient vanishing, how shortcut paths add the input to the layer output, the requirement of matching dimensions, and the resulting stability for training large language models.

LLM · Residual Connections · gradient flow

0 likes · 7 min read

Instant Consumer Technology Team

May 15, 2025 · Artificial Intelligence

Unlocking Agentic AI: How Agent Workflows Transform Intelligent Automation

This article demystifies AI agents and agentic workflows, explaining their core components—LLMs, tools, and memory—while detailing planning, tool‑use, and reflection patterns, comparing agentic, non‑agentic, and traditional workflows, and exploring real‑world applications, advantages, and limitations.

AI Agents · LLM · agentic workflows

0 likes · 21 min read

Alibaba Cloud Big Data AI Platform

May 15, 2025 · Artificial Intelligence

How to Build a Qwen3‑Powered ChatBI Agent with PAI‑LangStudio and Hologres

This guide walks you through creating a ChatBI intelligent agent by integrating Alibaba's Qwen3 large language model with PAI‑LangStudio, configuring the Model Context Protocol (MCP) server, and connecting to Hologres real‑time data warehouse, covering setup, deployment, and verification steps for enterprise data analysis.

ChatBI · Data Analysis · Hologres

0 likes · 11 min read

Java Architecture Diary

May 15, 2025 · Artificial Intelligence

What’s New in LangChain4j 1.0.0? A Deep Dive into Java AI SDK Features

LangChain4j 1.0.0 brings official OpenAI SDK support, GitHub Models integration, expanded database and vector store compatibility, customizable HTTP clients, and clear migration steps for renamed interfaces and streaming methods, marking a major milestone for Java AI development.

AI SDK · LLM · LangChain4j

0 likes · 7 min read

Kuaishou Tech

May 14, 2025 · Artificial Intelligence

StableReinforce and R1-Reward: Enhancing Multimodal Reward Models with Reinforcement Learning

This article presents StableReinforce and the R1-Reward model, demonstrating how reinforcement learning techniques can stabilize training and significantly improve the performance of multimodal reward models for large language models across several benchmarks.

AI · LLM · R1-Reward

0 likes · 15 min read

StarRocks

May 13, 2025 · Artificial Intelligence

How StarRocks MCP Server Enables LLMs to Query Databases Without Custom Plugins

StarRocks MCP Server provides a universal adapter that lets large language models like Claude, OpenAI, and Gemini execute SQL queries directly against StarRocks, simplifying data Q&A, intelligent analysis, and automated reporting by eliminating the need for bespoke plugins or complex prompt engineering.

AI Agents · LLM · StarRocks

0 likes · 14 min read

Tencent Cloud Developer

May 13, 2025 · Artificial Intelligence

Function Calling and Model Context Protocol (MCP): Bridging Large Language Models with Real‑World Systems

The article reviews the shortcomings of traditional large language models, explains how function calling extends LLMs beyond pure text, introduces the Model Context Protocol (MCP) as a standardized USB‑C‑like interface for AI tools, and demonstrates a Python MCP example that integrates LLMs with Tencent Advertising APIs.

AI Integration · API · Function Calling

0 likes · 16 min read

Tencent Technical Engineering

May 12, 2025 · Artificial Intelligence

Comprehensive Summary and Expansion of Andrej Karpathy’s 7‑Hour LLM Lecture

This article provides a detailed Chinese‑to‑English summary of Andrej Karpathy’s 7‑hour LLM tutorial, covering chat process analysis, tokenization, pre‑training data pipelines, model architecture, training strategies, post‑training fine‑tuning, reinforcement learning, chain‑of‑thought reasoning, and current industry applications.

AI · LLM · model architecture

0 likes · 25 min read

Java Tech Enthusiast

May 12, 2025 · Artificial Intelligence

Chain‑of‑Recursive‑Thoughts (CoRT): Boosting LLM Reasoning with Recursive Self‑Critique

The article introduces Chain‑of‑Recursive‑Thoughts (CoRT), explains how recursive self‑evaluation enhances large language model reasoning, outlines its workflow, shares GitHub resources, compares it with existing CoT methods, and reports experimental results using Mistral 3.1 24B.

AI · Chain-of-Recursive-Thoughts · CoRT

0 likes · 6 min read

Network Intelligence Research Center (NIRC)

May 12, 2025 · Artificial Intelligence

Hands‑On Experience with Amap MCP: Setup, Features, and Real‑World Use Cases

This article walks through Amap's Model Context Protocol (MCP) service, explaining its purpose, installation steps, configuration in the Cursor client, and practical examples such as travel planning and location‑based queries, while also evaluating its strengths and current limitations.

AI Integration · Amap · Cursor

0 likes · 8 min read

Data Thinking Notes

May 11, 2025 · Artificial Intelligence

How to Build Effective LLM Agents: Design Principles and Practical Workflows

This article outlines Anthropic's yearly insights on constructing large‑language‑model agents, explaining their definitions, when to employ them, recommended frameworks, modular building blocks, common workflow patterns, and real‑world application scenarios for developers.

AI Agents · Agent Architecture · LLM

0 likes · 14 min read

AI Algorithm Path

May 9, 2025 · Artificial Intelligence

A Visual Guide to Mixture of Experts (MoE) Architecture in Large Language Models

This article explains the Mixture of Experts (MoE) technique used in modern LLMs, detailing its core components—experts and router—comparing dense and sparse layers, describing load‑balancing, expert capacity, and routing strategies, and showcasing real‑world examples such as Switch Transformer, Vision‑MoE, and Mixtral 8x7B.

Expert Capacity · LLM · Mixture of Experts

0 likes · 15 min read

phodal

May 9, 2025 · Artificial Intelligence

Why Pre‑Generated Context Is the Key to Faster, More Accurate AI Code Retrieval

The article examines how pre‑generating structured context for codebases can overcome the uncertainty and quality issues of traditional Retrieval‑Augmented Generation, outlines the technical and business challenges of RAG, compares existing code‑search tools, and introduces AutoDev’s Context Worker as a practical solution.

AI · LLM · RAG

0 likes · 11 min read

Bilibili Tech

May 9, 2025 · Artificial Intelligence

How an AI Gateway Scales LLM Services: Architecture, Auth, Quotas, and Load Balancing

This article explains the design of an AI gateway that centralizes LLM access, detailing its background, overall architecture, authentication, quota management, multi‑model routing, load‑balancing strategies, multi‑tenant isolation, observability features, and the supported API protocols for enterprise integration.

AI gateway · Authentication · LLM

0 likes · 17 min read
