Tagged articles
2079 articles
G7 EasyFlow Tech Circle
May 9, 2025 · Artificial Intelligence

How LLMs + Python Are Redefining Data Analysis: A Practical Guide

This article explains how large language models combined with Python's data‑science ecosystem can automate metadata extraction, data cleaning, and analysis tasks—illustrated with a step‑by‑step Titanic passenger dataset case study, complete prompts, code snippets, and best‑practice recommendations.

Data Analysis · Data cleaning · LLM
0 likes · 18 min read
Youzan Coder
May 8, 2025 · Artificial Intelligence

Building and Optimizing a Store Smart Assistant with Aily: Architecture, Workflow, and Practical Lessons

The article details how Youzan’s Store Smart Assistant was built on the Feishu Aily platform, describing why Aily was chosen, the three‑stage development process, deep system integration, practical tips for knowledge‑base management and model stability, and the resulting efficiency gains such as handling 80% of routine queries.

AI Assistant · Aily platform · Knowledge Base
0 likes · 24 min read
Architect's Alchemy Furnace
May 7, 2025 · Artificial Intelligence

Which LLM Inference Engine Reigns Supreme? A Deep Dive into Transformers, vLLM, Llama.cpp, SGLang, MLX and Ollama

This article provides a comprehensive comparison of seven popular large‑language‑model inference engines—Transformers, vLLM, Llama.cpp, SGLang, MLX, Ollama and others—detailing their core features, performance characteristics, hardware compatibility, concurrency support, and ideal use‑cases, plus practical installation guidance for Xinference.

LLM · MLX · SGLang
0 likes · 17 min read
Alibaba Cloud Developer
May 7, 2025 · Artificial Intelligence

What Is an AI Agent? Understanding the Shift from Chatbots to Intelligent Automation

This article explores the concept of AI agents, contrasts them with traditional software and chatbots, outlines their core components, workflow, and the technological and market forces driving their evolution, and provides practical guidance for improving agent performance and choosing between workflow and LLM approaches.

AI Agent · LLM · Prompt Engineering
0 likes · 24 min read
JD Tech
May 6, 2025 · Artificial Intelligence

One4All Generative Recommendation Framework for CPS Advertising

This article reviews recent advances in applying large language models to CPS advertising recommendation, outlines business requirements and core technical challenges, proposes an extensible multi‑task generative framework with explicit intent perception and multi‑objective optimization, and presents offline and online performance gains along with future research directions.

CPS advertising · Generative Models · LLM
0 likes · 13 min read
AI Large Model Application Practice
May 6, 2025 · Artificial Intelligence

How to Build an Agentic RAG System from Scratch Using MCP Architecture

This article walks through the design and full implementation of an Agentic Retrieval‑Augmented Generation (RAG) system built on the MCP standard, covering the conceptual fusion of MCP and RAG, server‑side tool creation with LlamaIndex, client‑side agent construction with LangGraph, configuration files, caching strategies, code examples, and an end‑to‑end demonstration.

Agentic RAG · LLM · LangGraph
0 likes · 15 min read
Data Thinking Notes
May 5, 2025 · Artificial Intelligence

How MCP’s Text2SQL Service Turns Natural Language into Powerful Database Queries

This article explores the MCP platform’s data service capabilities, detailing its core components—Resources, Prompts, and Tools—and demonstrates how its Text2SQL feature enables natural‑language queries to retrieve table schemas, perform data sampling, and execute complex relational analyses across multiple database tables.

AI · Data Integration · Database
0 likes · 7 min read
21CTO
May 3, 2025 · Artificial Intelligence

Meet Mellum: JetBrains’ Purpose‑Built Code Completion LLM Now Open‑Source

JetBrains has released its purpose‑built code‑completion large language model, Mellum, as an open‑source project on Hugging Face, highlighting its focus on specialized code‑completion tasks, low runtime costs, support for many programming languages, and its potential for AI/ML researchers and educators.

AI · LLM · code completion
0 likes · 4 min read
AI Algorithm Path
May 3, 2025 · Artificial Intelligence

DeepSeek Prover V2: Pioneering the Next Era of AI‑Driven Formal Math Reasoning

DeepSeek‑Prover‑V2, an open‑source LLM specialized for Lean 4, bridges intuitive high‑level reasoning and strict formal verification through sub‑goal decomposition, dual operation modes, and a novel cold‑start data pipeline, achieving state‑of‑the‑art results on MiniF2F, PutnamBench and CombiBench while highlighting trade‑offs in inference cost and model scalability.

AI mathematics · DeepSeek Prover V2 · LLM
0 likes · 18 min read
Baobao Algorithm Notes
May 2, 2025 · Artificial Intelligence

Do Reinforcement Learning Techniques Really Boost LLM Reasoning? A Deep Dive into Recent Models

This article analyzes whether reinforcement learning enhances large language model reasoning, compares findings from DeepSeek-Math, a Tsinghua‑Shanghai Jiao‑Tong paper, and Qwen3, and outlines practical training pipelines—including Seed‑Thinking‑v1.5, DeepSeek‑R1, Kimi‑K1.5, and Qwen3—that aim to endow LLMs with robust reasoning capabilities.

Artificial Intelligence · LLM · Reasoning
0 likes · 12 min read
AI Algorithm Path
May 1, 2025 · Artificial Intelligence

Uncovering the Secrets of LLM Inference Optimization

This article dissects the major bottlenecks of large‑language‑model serving—prefill vs. decode, sparsity, memory bandwidth, KV‑cache growth—and walks through concrete engineering tricks such as paged attention, radix‑tree KV caches, compressed attention, speculative decoding, FlexGen weight scheduling, FastServe queuing, plus a runnable vLLM code snippet.

FastServe · FlexGen · Inference Optimization
0 likes · 18 min read
Architecture & Thinking
Apr 30, 2025 · Artificial Intelligence

Unlocking AI Integration: How the Model Context Protocol (MCP) Bridges LLMs with External Tools

This article introduces the Model Context Protocol (MCP) released by Anthropic, explains its core features and client‑server architecture, walks through building a Go‑based MCP server and client with time, weather, and schedule tools, demonstrates testing with MCP Inspector, and highlights MCP's advantages and typical AI application scenarios.

AI Integration · Go · LLM
0 likes · 22 min read
Tencent Cloud Developer
Apr 29, 2025 · Artificial Intelligence

Comparative Analysis of MCP and A2A Protocols for AI Agent Coordination

The article compares Google’s A2A coordination protocol with Anthropic’s Model Context Protocol, showing through a financial‑report case study that A2A enables deeper LLM‑driven interactions while MCP provides tool‑wrapper services, evaluates three integration paths, discusses SDK, latency and cost challenges, and predicts A2A could become the dominant orchestration layer for AI agents.

A2A · AI Agents · LLM
0 likes · 23 min read
Data Thinking Notes
Apr 27, 2025 · Artificial Intelligence

Step‑by‑Step MCP Demo: Build Server and Claude/DeepSeek Clients

This guide walks developers through creating a complete MCP application, covering the workflow, server setup with Python, debugging tools, and client implementation using both Claude and DeepSeek models, complete with code snippets, environment configuration, and testing procedures to demonstrate end‑to‑end LLM tool integration.

Claude · DeepSeek · LLM
0 likes · 10 min read
Baobao Algorithm Notes
Apr 27, 2025 · Artificial Intelligence

How DeepSeek R1T‑Chimera Cuts Tokens by 40% Without Fine‑Tuning

The DeepSeek‑R1T‑Chimera model merges DeepSeek‑R1 reasoning with the V3‑0324 architecture, reusing most V3 weights and swapping in only the R1 routing experts. It reportedly matches R1's intelligence while reducing output tokens by about 40% and running faster, all without any fine‑tuning or distillation.

Artificial Intelligence · DeepSeek · LLM
0 likes · 5 min read
Youzan Coder
Apr 25, 2025 · Artificial Intelligence

AI-Powered Code Review System: Design, Implementation, and Lessons Learned

The team built a low‑cost AI‑powered code‑review assistant that injects line‑level comments into GitLab merge requests, using LLMs via Feishu. Iterating quickly through MVP and optimization phases, the system reached 64 integrations and 150+ daily comments, with user feedback driving prompt refinement. The article argues that it delivers high ROI for small‑to‑medium teams and outlines future IDE and rule‑based extensions.

AI · Code Review · GitLab
0 likes · 17 min read
Alibaba Cloud Developer
Apr 25, 2025 · Artificial Intelligence

Unlocking AI Agents: Theory, Design Patterns, and Hands‑On Experiments

This article combines theoretical analysis and practical case studies to systematically explore the core components, design patterns, and future directions of AI agents, detailing the implementation of OpenManus, custom memory and planning modules, experimental evaluations, and insights for improving agent reliability and scalability.

AI Agent · LLM · Memory
0 likes · 31 min read
JavaEdge
Apr 24, 2025 · Artificial Intelligence

How to Customize HTTP Clients for LangChain4j LLM Integration in Java

This guide explains how LangChain4j modules let you replace the default HTTP client used to call LLM provider APIs, showing two out‑of‑the‑box implementations (JdkHttpClient and SpringRestClient) and providing step‑by‑step code examples for custom JDK and Spring RestClient configurations.

HTTP client · Java · LLM
0 likes · 4 min read
Alimama Tech
Apr 23, 2025 · Artificial Intelligence

Explainable LLM-driven Multi-dimensional Distillation for E-Commerce Relevance Learning

The paper introduces an explainable LLM framework (ELLM‑rele) that uses chain‑of‑thought reasoning and a multi‑dimensional knowledge distillation pipeline to compress large‑model relevance judgments into lightweight student models, achieving superior offline relevance scores and online click‑through and conversion improvements in Taobao’s search advertising.

Knowledge Distillation · LLM · chain-of-thought
0 likes · 17 min read
AI Algorithm Path
Apr 22, 2025 · Artificial Intelligence

Understanding LLM Quantization: GPTQ, QAT, AWQ, GGUF, and GGML Explained

The article walks through the fundamentals of large‑language‑model quantization, presenting a concrete int8 example, detailed explanations of GPTQ, GGUF/GGML, QAT, and AWQ methods, and provides step‑by‑step code snippets, formulas, calibration procedures, and performance observations for each technique.

AWQ · GGML · GGUF
0 likes · 15 min read
Volcano Engine Developer Services
Apr 22, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Transforms LLM Applications

Model Context Protocol (MCP) is an open standard that standardizes how large language models interact with external tools and data, enabling seamless function calls, simplifying prompt engineering, and allowing developers to build modular AI applications without handling low‑level integration details.

AI Integration · Function Calling · LLM
0 likes · 16 min read
Tencent Cloud Developer
Apr 22, 2025 · Industry Insights

Can Vibe Coding Revolutionize Software Development? A Deep Dive into AI‑Driven Programming

Vibe Coding, introduced by AI expert Andrej Karpathy in 2025, lets developers describe functionality in natural language and rely on large language models to generate code, shifting the programmer’s role to guiding AI, boosting productivity, lowering entry barriers, and reshaping software development practices.

AI programming · LLM · Vibe Coding
0 likes · 16 min read
DaTaobao Tech
Apr 21, 2025 · Artificial Intelligence

How MNN LLM Delivers Fast, Stable On‑Device LLM Inference for Android, iOS, and Desktop

In response to DeepSeek R1 server instability, the open‑source MNN LLM framework offers local, mobile‑friendly deployment with model quantization and hardware‑specific optimizations, improving inference speed, stability, and download reliability across Android, iOS, and desktop platforms while supporting multimodal inputs.

Android · LLM · MNN
0 likes · 11 min read
Nightwalker Tech
Apr 21, 2025 · Artificial Intelligence

Turning AI into a Reliable Engineering Partner: Methodology, Rules, and Practices

This article outlines a comprehensive methodology for integrating AI—particularly large language models—into software development workflows by establishing knowledge‑base templates, rule systems, multi‑model collaboration, context management, and task decomposition to transform AI from a whimsical code generator into a trustworthy engineering partner.

AI · LLM · Prompt Engineering
0 likes · 16 min read
AI Algorithm Path
Apr 20, 2025 · Artificial Intelligence

Boosting Visual Reasoning in VLMs with Reinforcement Learning

The article analyzes how reinforcement learning, which transformed LLM reasoning in DeepSeek, can be applied to visual‑language models to overcome the limitations of traditional chain‑of‑thought prompting and supervised fine‑tuning, presenting concrete reward designs, training pipelines, and a critical assessment of their strengths and weaknesses.

LLM · RL training · chain-of-thought
0 likes · 10 min read
DataFunTalk
Apr 19, 2025 · Artificial Intelligence

Microsoft Research's Open‑Source Native 1‑Bit LLM BitNet b1.58 2B4T: Design, Performance, and Deployment

Microsoft Research released BitNet b1.58 2B4T, the first open‑source native 1‑bit large language model with 2 billion parameters, 1.58‑bit effective precision and a 0.4 GB footprint, achieving performance comparable to full‑precision models of similar size while enabling efficient CPU and GPU inference for edge AI applications.

1-bit quantization · CPU inference · LLM
0 likes · 10 min read
Fun with Large Models
Apr 18, 2025 · Artificial Intelligence

How RAG Works: From Data Prep to LLM Generation Explained

This article breaks down Retrieval‑Augmented Generation (RAG) into its three core stages—data preparation, data retrieval, and LLM generation—showing how document chunking, embedding, vector databases, similarity search, and optional re‑ranking combine to let large language models produce more accurate, knowledge‑grounded answers.

Embedding · LLM · RAG
0 likes · 9 min read
Data Thinking Notes
Apr 17, 2025 · Artificial Intelligence

How Dify Accelerates Generative AI App Development with Low‑Code and Modular Design

Dify is an open‑source LLM application platform that blends BaaS and LLMOps, offering low‑code development, modular components, extensive model support, and advanced retrieval features, while also detailing its current limitations and recent enhancements such as MySQL integration and Elasticsearch‑based RAG capabilities.

AI · Elasticsearch · LLM
0 likes · 7 min read
AI Frontier Lectures
Apr 17, 2025 · Artificial Intelligence

Why Reinforcement Learning Fails to Boost Small LLM Reasoning: A Deep Dive

This article analyzes a recent study on language‑model reasoning, revealing that reinforcement learning often brings little or no improvement, while evaluation variance caused by seeds, hardware, and decoding settings can dramatically affect benchmark results, and supervised fine‑tuning emerges as a more reliable path.

LLM · Reproducibility · reinforcement learning
0 likes · 12 min read
21CTO
Apr 17, 2025 · Artificial Intelligence

How AI Will Revolutionize Software Development in 2025

This article explores how context‑aware AI, on‑premise model training, autonomous agents, and new metrics for AI impact will reshape software development, boost productivity, improve code quality, and give forward‑looking enterprises a decisive market advantage.

AI · LLM · code quality
0 likes · 8 min read
Java Captain
Apr 17, 2025 · Artificial Intelligence

Demonstrating the Full Lifecycle of Model Context Protocol (MCP) with Tool Calls

This article explains how the Model Context Protocol (MCP) enables large language models to retrieve up‑to‑date external information through standardized tool calls, illustrating the complete end‑to‑end workflow with Python code for the MCP server, client, and host, and discussing its advantages for building AI agents.

AI Agent · LLM · Python
0 likes · 21 min read
Alibaba Cloud Infrastructure
Apr 16, 2025 · Artificial Intelligence

Optimizing Multi‑Node Distributed LLM Inference with ACK Gateway and vLLM

This article presents a step‑by‑step guide for deploying and optimizing large‑language‑model inference across multiple GPU‑enabled nodes using ACK Gateway with Inference Extension, vLLM’s tensor‑ and pipeline‑parallel techniques, and Kubernetes resources such as LeaderWorkerSet, PVCs, and custom routing policies, followed by performance benchmarking and analysis.

ACK Gateway · Kubernetes · LLM
0 likes · 19 min read
Java Architecture Diary
Apr 16, 2025 · Artificial Intelligence

Mastering Prompt Engineering with Spring AI: Patterns and Practical Java Examples

An in‑depth guide shows how to configure Spring AI for various LLM providers, tune model parameters such as temperature and max tokens, and apply a range of prompt‑engineering patterns—including zero‑shot, few‑shot, chain‑of‑thought, self‑consistency, role‑based and automatic prompting—using concise Java code examples.

ChatOptions · LLM · Spring AI
0 likes · 18 min read
Ops Development & AI Practice
Apr 15, 2025 · Frontend Development

How to Build an AI‑Powered VS Code Extension in Minutes

This guide walks you through the VS Code extension architecture and provides a step‑by‑step example that creates a simple AI text‑explanation plugin, covering preparation, project scaffolding, command registration, API integration, debugging, and best‑practice security tips.

AI Integration · Extension Development · LLM
0 likes · 12 min read
Baobao Algorithm Notes
Apr 15, 2025 · Industry Insights

Why GLM‑Z1‑AirX Hits 150‑200 TPS: A Deep Dive into LLM Speed Benchmarking

The article examines the slowdown caused by long‑chain‑of‑thought LLMs, presents a Python benchmarking script, compares token‑per‑second performance of several models—including the ultra‑fast GLM‑Z1‑AirX—and demonstrates a real‑time anti‑fraud use case that benefits from sub‑second response times.

Benchmark · GLM-Z1-AirX · LLM
0 likes · 13 min read
DeWu Technology
Apr 14, 2025 · Artificial Intelligence

Overview of Recent Large Language Model Quantization Techniques

The article surveys modern post‑training quantization approaches for large language models, detailing weight‑only and activation‑aware methods such as GPTQ, AWQ, HQQ, SmoothQuant, QuIP, QuaRot, SpinQuant, QQQ, QoQ, and FP8, and compares their precision levels, algorithmic steps, accuracy‑throughput trade‑offs, and implementation considerations for efficient inference.

AI · LLM · Model Compression
0 likes · 32 min read
Open Source Tech Hub
Apr 14, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Turns AI Into a Universal Interface?

This article explains the Model Context Protocol (MCP) – an open, consensus‑based standard that lets large language models seamlessly interact with external tools and data, describes its architecture, why it’s needed, how models choose tools, and provides a step‑by‑step Python server implementation with code examples.

LLM · Tool Calling · mcp
0 likes · 22 min read
Ops Development & AI Practice
Apr 10, 2025 · Artificial Intelligence

Debugging LLM Model Context Protocol Servers Made Easy with MCP Inspector

MCP Inspector is a GUI-based debugger for Model Context Protocol (MCP) servers that lets developers visualize tool registrations, prompt templates, resources, and real-time interactions. It also provides commands to launch, control, and troubleshoot LLM applications, streamlining development and reducing debugging friction.

LLM · MCP Inspector · Model Context Protocol
0 likes · 8 min read
AI Algorithm Path
Apr 10, 2025 · Artificial Intelligence

Beginner-Friendly Guide to Understanding Large Language Models

This article walks readers through the fundamentals of large language models, covering what tokens are, how tokenization works, the conversion of tokens to numeric IDs, the transformer architecture—including positional encoding, self‑attention, feed‑forward networks and softmax—and explains how these components enable next‑token prediction.

Artificial Intelligence · Embedding · LLM
0 likes · 9 min read
Spring Full-Stack Practical Cases
Apr 10, 2025 · Artificial Intelligence

Build a RAG-Powered Knowledge Base with Spring Boot, Milvus, and Ollama

This guide walks through creating a Retrieval‑Augmented Generation (RAG) system using Spring Boot 3.4.2, Milvus vector database, and the bge‑m3 embedding model via Ollama, covering environment setup, dependency configuration, vector store operations, and integration with a large language model to deliver refined, similarity‑based answers.

Embedding · LLM · Milvus
0 likes · 11 min read
Alibaba Cloud Big Data AI Platform
Apr 10, 2025 · Artificial Intelligence

Building a Pet Hospital AI Assistant with RAG and LLMs

This article walks through the motivation, core concepts of Retrieval‑Augmented Generation, and a step‑by‑step guide to constructing a pet‑hospital AI assistant on Alibaba Cloud using LLMs, vector databases, and automated pipelines, complete with code examples and practical tips.

AI Assistant · Alibaba Cloud · LLM
0 likes · 18 min read
Beijing SF i-TECH City Technology Team
Apr 7, 2025 · Artificial Intelligence

LLM Application in Text Information Detection and Extraction: A Case Study of Blue-Collar Recruitment Data Processing

This article explores the application of large language models (LLMs) to text information detection and extraction, focusing on blue-collar recruitment data processing. It details how prompt engineering, RAG enhancement, and model fine-tuning are used to improve data cleaning efficiency and accuracy.

AI applications · LLM · Prompt Engineering
0 likes · 31 min read
JD Cloud Developers
Apr 7, 2025 · Artificial Intelligence

Why Bigger Prompts Fail: Modular Strategies for Building Efficient AI Agents

This article explains why overloading prompts and tools harms AI‑Agent performance, and offers practical modular design, intent‑driven instruction splitting, and efficient context management strategies such as curated function‑call tools and dynamic RAG to reduce token costs, improve response speed, and avoid hallucinations.

AI Agent · LLM · Modular Design
0 likes · 13 min read
AI Large Model Application Practice
Apr 7, 2025 · Artificial Intelligence

8 Leading LLM Agent Frameworks and How to Plug In MCP Server

This article surveys eight popular large‑language‑model (LLM) agent development frameworks—OpenAI Agents SDK, LangGraph, LlamaIndex, AutoGen, Pydantic AI, SmolAgents, Camel, and CrewAI—explaining each’s key features and providing concrete Python code to integrate the MCP Server for tool access.

LLM · Python · agents
0 likes · 15 min read
AI Frontier Lectures
Apr 6, 2025 · Artificial Intelligence

Can Multi‑Round Thinking Boost LLM Accuracy Without Extra Training?

A new study from the a‑m‑team introduces “Think Twice”, a test‑time multi‑round reasoning technique that, without additional training or model changes, repeatedly prompts large language models to self‑correct, yielding notable accuracy gains across benchmarks such as AIME, MATH‑500, GPQA‑Diamond and LiveCodeBench, while also producing shorter, more confident answers.

Artificial Intelligence · LLM · Multi-round reasoning
0 likes · 6 min read
21CTO
Apr 5, 2025 · Artificial Intelligence

AI Platform Highlights: Amazon Nova, Solo.io MCP, Kong Gateway, and More

Developers can stay current with recent AI advancements as Anthropic introduces Claude’s educational mode, Amazon launches the Nova model hub and Act SDK, Solo.io unveils the MCP Gateway for AI tool integration, Kong updates its AI Gateway to curb hallucinations, env0 releases Cloud Analyst, CodeSignal adds AI skill assessments, and Zencoder offers new AI coding and testing agents.

AI · AI Platforms · Cloud Computing
0 likes · 8 min read
Ops Development & AI Practice
Apr 5, 2025 · Artificial Intelligence

Why Do LLMs Follow Instructions So Well? Unpacking the Secrets

This article explains the concept of instruction‑following in large language models, compares early and modern LLMs, details the training techniques that enable it, highlights its importance, offers practical prompting tips, and discusses current challenges and future directions.

AI · LLM · Prompt Engineering
0 likes · 10 min read
AI Frontier Lectures
Apr 4, 2025 · Artificial Intelligence

Why Test‑Time Scaling Is Revolutionizing LLM Reasoning in 2025

This article surveys the latest research on large language model reasoning, highlighting test‑time scaling methods, chain‑of‑thought variants, and novel inference‑time techniques that boost performance while exposing trade‑offs, costs, and future directions for AI developers.

AI · LLM · Test-Time Scaling
0 likes · 26 min read
Alimama Tech
Apr 3, 2025 · Artificial Intelligence

UQABench: A Personalized QA Benchmark for Evaluating User Embeddings in LLM‑Driven Recommendation Systems

UQABench introduces the first benchmark for assessing high‑density user embeddings that serve as soft prompts in LLM‑driven recommendation, featuring a three‑stage pre‑train‑align‑evaluate pipeline, seven personalized QA tasks, and findings that transformer encoders, side‑information, simple linear adapters, and larger models markedly improve accuracy while cutting input tokens to about five percent.

AI · Benchmark · LLM
0 likes · 12 min read
ByteDance Cloud Native
Apr 3, 2025 · Operations

How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability

This article explains the challenges of observability in distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and provides step‑by‑step integration guides for Kitex, Hertz, and Eino frameworks, including code samples, data reporting methods, and advanced monitoring features such as RED metrics, LLM‑specific indicators, service topology, and future roadmap.

APM · APMPlus · CloudWeGo
0 likes · 13 min read
MaGe Linux Operations
Apr 3, 2025 · Artificial Intelligence

How to Build and Deploy a Dify LLM Application Platform on CentOS

This guide explains what Dify is, outlines its key features and application scenarios, and provides step‑by‑step instructions for preparing the environment, installing Docker and Docker‑Compose, and deploying Dify on a CentOS 7.9 system, including verification of a successful setup.

AI platform · Dify · Docker
0 likes · 9 min read
BirdNest Tech Talk
Apr 3, 2025 · Artificial Intelligence

How Genspark’s Super Agent Outperforms OpenAI and Manus in GAIA Benchmarks

Genspark’s newly released Super Agent, built on a Mixture‑of‑Agents architecture that combines eight specialized LLMs and over 80 tools, claims to autonomously plan, execute, and integrate external services across tasks such as travel planning and video summarization, and reportedly surpasses OpenAI and Manus in the GAIA benchmark while offering instant access without an invitation code.

AI Agent · GAIA benchmark · LLM
0 likes · 4 min read
Big Data Technology & Architecture
Apr 3, 2025 · Artificial Intelligence

Understanding Model Context Protocol (MCP), Retrieval-Augmented Generation (RAG), and Vector Databases for LLM Integration

This article explains the Model Context Protocol (MCP) as a standard for LLM‑data integration, describes Retrieval‑Augmented Generation (RAG) techniques to reduce hallucinations, and introduces vector databases like Milvus that store high‑dimensional embeddings for efficient AI retrieval tasks.

LLM · Milvus · RAG
0 likes · 7 min read
DevOps
Apr 2, 2025 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG): Concepts, Evolution, and Types

This article explains Retrieval‑Augmented Generation (RAG), its role in mitigating large language model knowledge cutoff and hallucination, outlines the evolution from naive to advanced, modular, graph, and agentic RAG, and discusses future directions such as intelligent and multi‑modal RAG systems.

Artificial Intelligence · Knowledge retrieval · LLM
0 likes · 10 min read
AntTech
Apr 2, 2025 · Artificial Intelligence

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.

Attention Re-weighting · LLM · PEAR
0 likes · 6 min read
JD Retail Technology
Apr 2, 2025 · Artificial Intelligence

One4All: A Scalable Multi‑Task Generative Recommendation Framework for CPS Advertising

The paper introduces One4All, a scalable multi‑task generative recommendation framework for CPS advertising that combines few‑shot intent prompting, a Rewards‑in‑Context multi‑objective optimization, and an online model‑selection strategy, delivering 2‑3× offline HitRate/NDCG gains and notable online CTR, CVR, and commission improvements.

Advertising · LLM · large language models
0 likes · 14 min read
AI Algorithm Path
Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training · DeepSeek · LLM
0 likes · 5 min read
Huolala Tech
Apr 1, 2025 · Frontend Development

How Frontend Teams Can Leverage LLMs for Real‑Time Compliance Checks

This article explains how frontend developers can use large language models to detect and prevent marketing content violations in WeChat mini‑programs, covering pain‑point discovery, LLM‑driven compliance architecture, prompt optimization, model selection, testing methods, and seamless frontend integration with Feishu notifications.

AI · LLM · Prompt Engineering
0 likes · 10 min read
Efficient Ops
Mar 31, 2025 · Artificial Intelligence

How the Model Context Protocol (MCP) Is Revolutionizing AI Operations

The Model Context Protocol (MCP) lets large language models safely and directly access diverse data sources and tools, breaking data silos and enabling seamless AI‑driven automation across development, operations, and multi‑agent workflows.

AI Integration · LLM · Model Context Protocol
0 likes · 5 min read
Architect
Mar 31, 2025 · Artificial Intelligence

A Comprehensive Study of Failure Modes in Large‑Language‑Model Based Multi‑Agent Systems

This paper presents a systematic investigation of failure patterns in LLM‑driven multi‑agent systems, introducing a 14‑type taxonomy (MASFT) derived from over 150 annotated dialogues, evaluating it with an LLM‑as‑a‑judge pipeline, and exploring modest intervention strategies while releasing all data and tools for future research.

AI · LLM · agentic
0 likes · 29 min read
Architect
Mar 29, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build Powerful LLM Apps: Prompt Engineering, RAG, and AI Agents Explained

This article guides developers without an AI background through the fundamentals of building large‑language‑model applications, covering prompt engineering, multi‑turn interaction, function calling, retrieval‑augmented generation, vector databases, code assistants, and the MCP protocol for AI agents.

AI Agent · Embedding · Function Calling
0 likes · 51 min read
Qborfy AI
Mar 29, 2025 · Artificial Intelligence

Mastering LangChain: Build LLM Apps with Chains, Agents, and Vector Stores

This tutorial walks through the limitations of simple prompt usage, introduces LangChain as a framework for building full‑featured LLM applications, explains its core concepts and components, and provides step‑by‑step code examples for installing, configuring, and running a basic LangChain demo.

AI Application · LLM · LangChain
0 likes · 11 min read
Architect's Alchemy Furnace
Mar 27, 2025 · Artificial Intelligence

Xinference vs Ollama: Which Open‑Source LLM Engine Fits Your Needs?

This article provides a comprehensive side‑by‑side comparison of the open‑source LLM serving tools Xinference and Ollama, examining their core goals, architecture, model support, deployment options, performance, ecosystem integration, typical use cases, future roadmap, and guidance on selecting the right solution for enterprise or personal projects.

LLM · Local Deployment · Model Serving
0 likes · 7 min read
JavaEdge
Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

Artificial Intelligence · Deep Learning · LLM
0 likes · 8 min read
DevOps
Mar 26, 2025 · Artificial Intelligence

Introducing Model Context Protocol (MCP): An Open Standard for LLM Integration with Data Sources and Tools

The article explains Anthropic's open Model Context Protocol (MCP), detailing its client‑server architecture, resource and prompt definitions, tool discovery and execution, sampling workflow, security features, and provides a complete Python example that demonstrates building, running, and testing an MCP server and client for real‑time data retrieval.

AI Integration · LLM · Python
0 likes · 12 min read
Architect
Mar 26, 2025 · Artificial Intelligence

Agent Memory Mechanisms and Dify Knowledge Base Segmentation & Retrieval Details

This article explains the fundamentals of AI agent memory—including short‑term, long‑term, and working memory types and their storage designs—and then details Dify's knowledge‑base segmentation modes, indexing strategies, and retrieval configurations for effective RAG applications.

Agent Memory · Dify · Knowledge Base
0 likes · 14 min read
DaTaobao Tech
Mar 26, 2025 · Artificial Intelligence

Overview of Retrieval-Augmented Generation (RAG) and Related AI Technologies

The article surveys Retrieval‑Augmented Generation (RAG) as a solution to large language model limits—such as outdated knowledge, hallucinations, and security risks—by integrating vector‑database retrieval with LLM generation, and discusses related tools, multi‑agent frameworks, prompt engineering, fine‑tuning methods, and emerging optimization trends.

AI applications · LLM · Prompt Engineering
0 likes · 29 min read
ELab Team
Mar 26, 2025 · Artificial Intelligence

Uncovering LLM Blind Spots in AI Coding: Common Pitfalls and Solutions

Large language models often struggle with coding tasks, failing to stop when encountering obstacles, ignoring black‑box testing principles, and making unnecessary refactors; this article examines those blind spots, offers practical examples, and suggests strategies such as preparatory refactoring, stateless tools, and careful prompting to improve AI‑assisted development.

AI coding · Best Practices · Debugging
0 likes · 59 min read
Network Intelligence Research Center (NIRC)
Mar 26, 2025 · Artificial Intelligence

Enable Traditional LLMs to Use DeepSeek’s Multi‑Head Latent Attention Without Retraining

The paper introduces MHA2MLA, a data‑efficient fine‑tuning framework that converts pre‑trained multi‑head attention LLMs to DeepSeek’s Multi‑Head Latent Attention architecture, achieving up to 92% KV‑cache compression with less than 0.5% performance loss on long‑context tasks.

LLM · Low-Rank Approximation · Model Compression
0 likes · 8 min read
Programmer DD
Mar 25, 2025 · Artificial Intelligence

How to Build an MCP Client‑Server with Spring AI for LLM‑Powered Apps

This article demonstrates how to implement the Model Context Protocol (MCP) using Spring AI, covering the creation of MCP hosts, clients, and servers, configuring dependencies, integrating Claude, adding Brave Search and filesystem tools, and building a functional chatbot that leverages external data sources through standardized LLM interfaces.

LLM · Model Context Protocol · ai-integration
0 likes · 15 min read
21CTO
Mar 25, 2025 · Artificial Intelligence

Which LLM Is Best for Coding? Speed, Hallucination, and Context Compared

This article breaks down major large language models, defining key comparison metrics such as speed, hallucination rate, and context window, then evaluates each model with benchmarks like HumanEval+, ChatBot Arena, and Aider to help you choose the most suitable LLM for your coding tasks.

AI · Benchmark · LLM
0 likes · 10 min read
Open Source Tech Hub
Mar 24, 2025 · Artificial Intelligence

Break Data Silos for LLMs with Model Context Protocol (MCP) – PHP SDK Guide

This article explains the data‑isolation problem facing large language models, introduces the Model Context Protocol (MCP) as a standard bridge to external data sources, and provides a step‑by‑step PHP SDK tutorial—including installation, server and client code, and optional advanced logging—to help developers integrate AI models securely and efficiently.

Backend Development · LLM · Model Context Protocol
0 likes · 13 min read
AI Algorithm Path
Mar 24, 2025 · Artificial Intelligence

How to Use Pydantic for Structured LLM Output

The article explains why LLM responses can be inconsistent, introduces Pydantic as a way to define custom output schemas, and walks through concrete examples—both with OpenAI and Ollama models—showing how to build a LangChain pipeline that parses responses into structured data.

LLM · LangChain · Ollama
0 likes · 7 min read