Tagged articles

2079 articles

May 9, 2025 · Artificial Intelligence

How LLMs + Python Are Redefining Data Analysis: A Practical Guide

This article explains how large language models combined with Python's data‑science ecosystem can automate metadata extraction, data cleaning, and analysis tasks—illustrated with a step‑by‑step Titanic passenger dataset case study, complete prompts, code snippets, and best‑practice recommendations.

Data Analysis · Data cleaning · LLM

0 likes · 18 min read

Volcano Engine Developer Services

May 8, 2025 · Artificial Intelligence

Connect Your Self‑Hosted LLM to Volcengine Edge Gateway in 4 Simple Steps

This step‑by‑step tutorial explains how to add a self‑deployed large language model to Volcengine's Edge Large Model Gateway, configure a secure calling channel, bind it to a gateway access key, and integrate the provided sample code for seamless API access.

API integration · LLM · Model Deployment

0 likes · 9 min read

Youzan Coder

May 8, 2025 · Artificial Intelligence

Building and Optimizing a Store Smart Assistant with Aily: Architecture, Workflow, and Practical Lessons

The article details how Youzan’s Store Smart Assistant was built on the Feishu Aily platform, describing why Aily was chosen, the three‑stage development process, deep system integration, practical tips for knowledge‑base management and model stability, and the resulting efficiency gains such as handling 80% of routine queries.

AI Assistant · Aily platform · Knowledge Base

0 likes · 24 min read

Architect's Alchemy Furnace

May 7, 2025 · Artificial Intelligence

Which LLM Inference Engine Reigns Supreme? A Deep Dive into Transformers, vLLM, Llama.cpp, SGLang, MLX and Ollama

This article provides a comprehensive comparison of seven popular large‑language‑model inference engines—Transformers, vLLM, Llama.cpp, SGLang, MLX, Ollama and others—detailing their core features, performance characteristics, hardware compatibility, concurrency support, and ideal use‑cases, plus practical installation guidance for Xinference.

LLM · MLX · SGLang

0 likes · 17 min read

Architect

May 7, 2025 · Artificial Intelligence

RAG-MCP: Mitigating Prompt Bloat in LLM Tool Selection via Retrieval‑Augmented Generation

The article reviews the RAG-MCP framework, which combines Retrieval‑Augmented Generation with Model Context Protocol to reduce prompt bloat and improve tool‑selection accuracy for large language models by first retrieving the most relevant tools before feeding them to the LLM.

LLM · Prompt Bloat · RAG-MCP

0 likes · 11 min read

Alibaba Cloud Developer

May 7, 2025 · Artificial Intelligence

What Is an AI Agent? Understanding the Shift from Chatbots to Intelligent Automation

This article explores the concept of AI agents, contrasting them with traditional software and chatbots, outlines their core components, workflow, and the technological and market forces driving their evolution, and provides practical guidance for improving agent performance and choosing between workflow and LLM approaches.

AI Agent · LLM · Prompt Engineering

0 likes · 24 min read

JD Tech

May 6, 2025 · Artificial Intelligence

One4All Generative Recommendation Framework for CPS Advertising

This article reviews recent advances in applying large language models to CPS advertising recommendation, outlines business requirements and core technical challenges, proposes an extensible multi‑task generative framework with explicit intent perception and multi‑objective optimization, and presents offline and online performance gains along with future research directions.

CPS advertising · Generative Models · LLM

0 likes · 13 min read

AI Large Model Application Practice

May 6, 2025 · Artificial Intelligence

How to Build an Agentic RAG System from Scratch Using MCP Architecture

This article walks through the design and full implementation of an Agentic Retrieval‑Augmented Generation (RAG) system built on the MCP standard, covering the conceptual fusion of MCP and RAG, server‑side tool creation with LlamaIndex, client‑side agent construction with LangGraph, configuration files, caching strategies, code examples, and an end‑to‑end demonstration.

Agentic RAG · LLM · LangGraph

0 likes · 15 min read

Alibaba Cloud Developer

May 6, 2025 · Cloud Computing

Build a Custom Alibaba Cloud OpenAPI MCP Server with Just 10 Lines of Python

This guide explains how to create a lightweight, extensible Alibaba Cloud OpenAPI MCP Server using only ten lines of Python, covering MCP fundamentals, tool registration via OpenAPI metadata, configuration, code examples, and client setup to enable seamless LLM integration with external services.

Alibaba Cloud · LLM · OpenAPI

0 likes · 9 min read

Data Thinking Notes

May 5, 2025 · Artificial Intelligence

How MCP’s Text2SQL Service Turns Natural Language into Powerful Database Queries

This article explores the MCP platform’s data service capabilities, detailing its core components—Resources, Prompts, and Tools—and demonstrates how its Text2SQL feature enables natural‑language queries to retrieve table schemas, perform data sampling, and execute complex relational analyses across multiple database tables.

AI · Data Integration · Database

0 likes · 7 min read

21CTO

May 3, 2025 · Artificial Intelligence

Meet Mellum: JetBrains’ Purpose‑Built Code Completion LLM Now Open‑Source

JetBrains has released its purpose‑built code‑completion large language model, Mellum, as an open‑source project on Hugging Face, highlighting its focus on specialized code‑completion tasks, low runtime costs, support for many programming languages, and its potential for AI/ML researchers and educators.

AI · LLM · code completion

0 likes · 4 min read

AI Algorithm Path

May 3, 2025 · Artificial Intelligence

DeepSeek Prover V2: Pioneering the Next Era of AI‑Driven Formal Math Reasoning

DeepSeek‑Prover‑V2, an open‑source LLM specialized for Lean 4, bridges intuitive high‑level reasoning and strict formal verification through sub‑goal decomposition, dual operation modes, and a novel cold‑start data pipeline, achieving state‑of‑the‑art results on MiniF2F, PutnamBench and CombiBench while highlighting trade‑offs in inference cost and model scalability.

AI mathematics · DeepSeek Prover V2 · LLM

0 likes · 18 min read

Baobao Algorithm Notes

May 2, 2025 · Artificial Intelligence

Do Reinforcement Learning Techniques Really Boost LLM Reasoning? A Deep Dive into Recent Models

This article analyzes whether reinforcement learning enhances large language model reasoning, compares findings from DeepSeek-Math, a Tsinghua‑Shanghai Jiao‑Tong paper, and Qwen3, and outlines practical training pipelines—including Seed‑Thinking‑v1.5, DeepSeek‑R1, Kimi‑K1.5, and Qwen3—that aim to endow LLMs with robust reasoning capabilities.

Artificial Intelligence · LLM · Reasoning

0 likes · 12 min read

AI Algorithm Path

May 1, 2025 · Artificial Intelligence

Uncovering the Secrets of LLM Inference Optimization

This article dissects the major bottlenecks of large‑language‑model serving—prefill vs. decode, sparsity, memory bandwidth, KV‑cache growth—and walks through concrete engineering tricks such as paged attention, radix‑tree KV caches, compressed attention, speculative decoding, FlexGen weight scheduling, FastServe queuing, plus a runnable vLLM code snippet.

FastServe · FlexGen · Inference Optimization

0 likes · 18 min read

Architecture & Thinking

Apr 30, 2025 · Artificial Intelligence

Unlocking AI Integration: How the Model Context Protocol (MCP) Bridges LLMs with External Tools

This article introduces the Model Context Protocol (MCP) released by Anthropic, explains its core features and client‑server architecture, walks through building a Go‑based MCP server and client with time, weather, and schedule tools, demonstrates testing with MCP Inspector, and highlights MCP's advantages and typical AI application scenarios.

AI Integration · Go · LLM

0 likes · 22 min read

Tencent Cloud Developer

Apr 29, 2025 · Artificial Intelligence

Comparative Analysis of MCP and A2A Protocols for AI Agent Coordination

The article compares Google’s A2A coordination protocol with Anthropic’s Model Context Protocol, showing through a financial‑report case study that A2A enables deeper LLM‑driven interactions while MCP provides tool‑wrapper services, evaluates three integration paths, discusses SDK, latency and cost challenges, and predicts A2A could become the dominant orchestration layer for AI agents.

A2A · AI Agents · LLM

0 likes · 23 min read

Data Thinking Notes

Apr 27, 2025 · Artificial Intelligence

Step‑by‑Step MCP Demo: Build Server and Claude/DeepSeek Clients

This guide walks developers through creating a complete MCP application, covering the workflow, server setup with Python, debugging tools, and client implementation using both Claude and DeepSeek models, complete with code snippets, environment configuration, and testing procedures to demonstrate end‑to‑end LLM tool integration.

Claude · DeepSeek · LLM

0 likes · 10 min read

Baobao Algorithm Notes

Apr 27, 2025 · Artificial Intelligence

How DeepSeek R1T‑Chimera Cuts Tokens by 40% Without Fine‑Tuning

The DeepSeek-R1T-Chimera model merges DeepSeek-R1's reasoning with the V3-0324 architecture: it reuses most V3 weights and swaps in only the R1 routed experts. The result reportedly matches R1's intelligence while producing about 40% fewer output tokens and running faster, all without any fine-tuning or distillation.

Artificial Intelligence · DeepSeek · LLM

0 likes · 5 min read

Baobao Algorithm Notes

Apr 27, 2025 · Artificial Intelligence

How Model Fusion Cut LLM Chain‑of‑Thought Length by 40% Without Fine‑Tuning

A small tech firm, tngtech, released an open-source merged model called DeepSeek-R1T-Chimera that combines R1's reasoning with V3-0324 without fine-tuning, distillation, or prompts, matching R1's intelligence while reducing token output by 40% and speeding up inference.

Artificial Intelligence · DeepSeek · LLM

0 likes · 4 min read

Youzan Coder

Apr 25, 2025 · Artificial Intelligence

AI-Powered Code Review System: Design, Implementation, and Lessons Learned

The team built a low-cost AI-powered code-review assistant that uses LLMs via Feishu to inject line-level comments into GitLab merge requests. After rapid MVP and optimization phases, it reached 64 integrations and 150+ comments per day, with prompts refined from user feedback. The article argues that the approach offers high ROI for small-to-medium teams and outlines future IDE and rule-based extensions.

AI · Code Review · GitLab

0 likes · 17 min read

Alibaba Cloud Developer

Apr 25, 2025 · Artificial Intelligence

Unlocking AI Agents: Theory, Design Patterns, and Hands‑On Experiments

This article combines theoretical analysis and practical case studies to systematically explore the core components, design patterns, and future directions of AI agents, detailing the implementation of OpenManus, custom memory and planning modules, experimental evaluations, and insights for improving agent reliability and scalability.

AI Agent · LLM · Memory

0 likes · 31 min read

JavaEdge

Apr 24, 2025 · Artificial Intelligence

How to Customize HTTP Clients for LangChain4j LLM Integration in Java

This guide explains how LangChain4j modules let you replace the default HTTP client used to call LLM provider APIs, showing two out‑of‑the‑box implementations (JdkHttpClient and SpringRestClient) and providing step‑by‑step code examples for custom JDK and Spring RestClient configurations.

HTTP client · Java · LLM

0 likes · 4 min read

Alimama Tech

Apr 23, 2025 · Artificial Intelligence

Explainable LLM-driven Multi-dimensional Distillation for E-Commerce Relevance Learning

The paper introduces an explainable LLM framework (ELLM‑rele) that uses chain‑of‑thought reasoning and a multi‑dimensional knowledge distillation pipeline to compress large‑model relevance judgments into lightweight student models, achieving superior offline relevance scores and online click‑through and conversion improvements in Taobao’s search advertising.

Knowledge Distillation · LLM · chain-of-thought

0 likes · 17 min read

AI Algorithm Path

Apr 22, 2025 · Artificial Intelligence

Understanding LLM Quantization: GPTQ, QAT, AWQ, GGUF, and GGML Explained

The article walks through the fundamentals of large‑language‑model quantization, presenting a concrete int8 example, detailed explanations of GPTQ, GGUF/GGML, QAT, and AWQ methods, and provides step‑by‑step code snippets, formulas, calibration procedures, and performance observations for each technique.

AWQ · GGML · GGUF

0 likes · 15 min read

Volcano Engine Developer Services

Apr 22, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Transforms LLM Applications

Model Context Protocol (MCP) is an open standard that standardizes how large language models interact with external tools and data, enabling seamless function calls, simplifying prompt engineering, and allowing developers to build modular AI applications without handling low‑level integration details.

AI Integration · Function Calling · LLM

0 likes · 16 min read

Tencent Cloud Developer

Apr 22, 2025 · Industry Insights

Can Vibe Coding Revolutionize Software Development? A Deep Dive into AI‑Driven Programming

Vibe Coding, introduced by AI expert Andrej Karpathy in 2025, lets developers describe functionality in natural language and rely on large language models to generate code, shifting the programmer’s role to guiding AI, boosting productivity, lowering entry barriers, and reshaping software development practices.

AI programming · LLM · Vibe Coding

0 likes · 16 min read

Alibaba Cloud Developer

Apr 21, 2025 · Artificial Intelligence

One-Click Deploy an SSE MCP Server on Alibaba Cloud with Serverless Devs CLI

This guide walks you through the Model Context Protocol (MCP) basics, the pain points of manual setup, and how to use Serverless Devs CLI to initialize, develop, package, deploy, and test a native SSE MCP Server on Alibaba Cloud Function Compute, complete with sample code and client testing options.

Alibaba Cloud · CLI · LLM

0 likes · 10 min read

DaTaobao Tech

Apr 21, 2025 · Artificial Intelligence

How MNN LLM Delivers Fast, Stable On‑Device LLM Inference for Android, iOS, and Desktop

In response to DeepSeek R1 server instability, the open-source MNN LLM framework offers local, mobile-friendly deployment with model quantization and hardware-specific optimizations, improving inference speed, stability, and download reliability across Android, iOS, and desktop platforms while supporting multimodal inputs.

Android · LLM · MNN

0 likes · 11 min read

Nightwalker Tech

Apr 21, 2025 · Artificial Intelligence

Turning AI into a Reliable Engineering Partner: Methodology, Rules, and Practices

This article outlines a comprehensive methodology for integrating AI—particularly large language models—into software development workflows by establishing knowledge‑base templates, rule systems, multi‑model collaboration, context management, and task decomposition to transform AI from a whimsical code generator into a trustworthy engineering partner.

AI · LLM · Prompt Engineering

0 likes · 16 min read

AI Algorithm Path

Apr 20, 2025 · Artificial Intelligence

Boosting Visual Reasoning in VLMs with Reinforcement Learning

The article analyzes how reinforcement learning, which transformed LLM reasoning in DeepSeek, can be applied to visual‑language models to overcome the limitations of traditional chain‑of‑thought prompting and supervised fine‑tuning, presenting concrete reward designs, training pipelines, and a critical assessment of their strengths and weaknesses.

LLM · RL training · chain-of-thought

0 likes · 10 min read

DataFunTalk

Apr 19, 2025 · Artificial Intelligence

Microsoft Research's Open‑Source Native 1‑Bit LLM BitNet b1.58 2B4T: Design, Performance, and Deployment

Microsoft Research released BitNet b1.58 2B4T, the first open-source native 1-bit large language model, with 2 billion parameters, 1.58-bit effective precision and a 0.4 GB footprint. It reportedly matches the performance of comparable full-precision models while enabling efficient CPU and GPU inference for edge AI applications.

1-bit quantization · CPU inference · LLM

0 likes · 10 min read

Fun with Large Models

Apr 18, 2025 · Artificial Intelligence

How RAG Works: From Data Prep to LLM Generation Explained

This article breaks down Retrieval‑Augmented Generation (RAG) into its three core stages—data preparation, data retrieval, and LLM generation—showing how document chunking, embedding, vector databases, similarity search, and optional re‑ranking combine to let large language models produce more accurate, knowledge‑grounded answers.

Embedding · LLM · RAG

0 likes · 9 min read

Data Thinking Notes

Apr 17, 2025 · Artificial Intelligence

How Dify Accelerates Generative AI App Development with Low‑Code and Modular Design

Dify is an open‑source LLM application platform that blends BaaS and LLMOps, offering low‑code development, modular components, extensive model support, and advanced retrieval features, while also detailing its current limitations and recent enhancements such as MySQL integration and Elasticsearch‑based RAG capabilities.

AI · Elasticsearch · LLM

0 likes · 7 min read

AI Frontier Lectures

Apr 17, 2025 · Artificial Intelligence

Why Reinforcement Learning Fails to Boost Small LLM Reasoning: A Deep Dive

This article analyzes a recent study on language‑model reasoning, revealing that reinforcement learning often brings little or no improvement, while evaluation variance caused by seeds, hardware, and decoding settings can dramatically affect benchmark results, and supervised fine‑tuning emerges as a more reliable path.

LLM · Reproducibility · reinforcement learning

0 likes · 12 min read

Tencent Cloud Middleware

Apr 17, 2025 · Operations

Boost RocketMQ Ops with LLM‑Powered Natural‑Language Queries via GraphQL

By integrating large language models, Chatbox, MCP, and GraphQL, the TDMQ RocketMQ team enables operators to retrieve cluster, topic, and message data across heterogeneous sources using a single natural‑language query, dramatically simplifying diagnostics and reducing manual query effort.

Chatbox · GraphQL · LLM

0 likes · 9 min read

21CTO

Apr 17, 2025 · Artificial Intelligence

How AI Will Revolutionize Software Development in 2025

This article explores how context‑aware AI, on‑premise model training, autonomous agents, and new metrics for AI impact will reshape software development, boost productivity, improve code quality, and give forward‑looking enterprises a decisive market advantage.

AI · LLM · code quality

0 likes · 8 min read

Java Captain

Apr 17, 2025 · Artificial Intelligence

Demonstrating the Full Lifecycle of Model Context Protocol (MCP) with Tool Calls

This article explains how the Model Context Protocol (MCP) enables large language models to retrieve up‑to‑date external information through standardized tool calls, illustrating the complete end‑to‑end workflow with Python code for the MCP server, client, and host, and discussing its advantages for building AI agents.

AI Agent · LLM · Python

0 likes · 21 min read

Alibaba Cloud Infrastructure

Apr 16, 2025 · Artificial Intelligence

Optimizing Multi‑Node Distributed LLM Inference with ACK Gateway and vLLM

This article presents a step‑by‑step guide for deploying and optimizing large‑language‑model inference across multiple GPU‑enabled nodes using ACK Gateway with Inference Extension, vLLM’s tensor‑ and pipeline‑parallel techniques, and Kubernetes resources such as LeaderWorkerSet, PVCs, and custom routing policies, followed by performance benchmarking and analysis.

ACK Gateway · Kubernetes · LLM

0 likes · 19 min read

Java Architecture Diary

Apr 16, 2025 · Artificial Intelligence

Mastering Prompt Engineering with Spring AI: Patterns and Practical Java Examples

An in‑depth guide shows how to configure Spring AI for various LLM providers, tune model parameters such as temperature and max tokens, and apply a range of prompt‑engineering patterns—including zero‑shot, few‑shot, chain‑of‑thought, self‑consistency, role‑based and automatic prompting—using concise Java code examples.

ChatOptions · LLM · Spring AI

0 likes · 18 min read

Ops Development & AI Practice

Apr 15, 2025 · Frontend Development

How to Build an AI‑Powered VS Code Extension in Minutes

This guide walks you through the VS Code extension architecture and provides a step‑by‑step example that creates a simple AI text‑explanation plugin, covering preparation, project scaffolding, command registration, API integration, debugging, and best‑practice security tips.

AI Integration · Extension Development · LLM

0 likes · 12 min read

Baobao Algorithm Notes

Apr 15, 2025 · Industry Insights

Why GLM‑Z1‑AirX Hits 150‑200 TPS: A Deep Dive into LLM Speed Benchmarking

The article examines the slowdown caused by long‑chain‑of‑thought LLMs, presents a Python benchmarking script, compares token‑per‑second performance of several models—including the ultra‑fast GLM‑Z1‑AirX—and demonstrates a real‑time anti‑fraud use case that benefits from sub‑second response times.

Benchmark · GLM-Z1-AirX · LLM

0 likes · 13 min read

DeWu Technology

Apr 14, 2025 · Artificial Intelligence

Overview of Recent Large Language Model Quantization Techniques

The article surveys modern post‑training quantization approaches for large language models, detailing weight‑only and activation‑aware methods such as GPTQ, AWQ, HQQ, SmoothQuant, QuIP, QuaRot, SpinQuant, QQQ, QoQ, and FP8, and compares their precision levels, algorithmic steps, accuracy‑throughput trade‑offs, and implementation considerations for efficient inference.

AI · LLM · Model Compression

0 likes · 32 min read

Open Source Tech Hub

Apr 14, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and How It Turns AI Into a Universal Interface?

This article explains the Model Context Protocol (MCP) – an open, consensus‑based standard that lets large language models seamlessly interact with external tools and data, describes its architecture, why it’s needed, how models choose tools, and provides a step‑by‑step Python server implementation with code examples.

LLM · Tool Calling · mcp

0 likes · 22 min read

Java Architecture Diary

Apr 14, 2025 · Artificial Intelligence

How to Empower LLMs with a Private SearXNG Search Engine for Real‑Time Knowledge

This guide explains why large language models need private search capabilities, outlines the benefits of a self‑hosted SearXNG engine, provides step‑by‑step Docker deployment, and demonstrates Java integration using LangChain4j for both basic queries and retrieval‑augmented generation (RAG).

Docker · LLM · LangChain4j

0 likes · 6 min read

Alibaba Cloud Developer

Apr 14, 2025 · Artificial Intelligence

What Is Model Context Protocol (MCP) and Why It’s the New USB‑C for AI Agents

This article explains the Model Context Protocol (MCP), its architecture, why it’s needed for seamless AI‑tool interaction, how it differs from traditional function calls, and provides a step‑by‑step guide with Python code to build, test, and debug an MCP server.

LLM · Model Context Protocol · Python

0 likes · 21 min read

Data Thinking Notes

Apr 13, 2025 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation Knowledge Base with DeepSeek and RAGFlow

This guide walks you through the fundamentals of Retrieval‑Augmented Generation, introduces the open‑source RAGFlow framework, details installation steps, shows how to integrate DeepSeek LLMs, and explores practical application scenarios such as intelligent customer service and enterprise document QA.

AI · DeepSeek · LLM

0 likes · 11 min read

Ops Development & AI Practice

Apr 13, 2025 · Industry Insights

MarkItDown vs Docling: Which Open‑Source Tool Wins for LLM‑Ready Markdown?

This article provides an in‑depth comparison of Microsoft’s MarkItDown and IBM‑backed Docling, evaluating their supported formats, output options, performance, community backing, and ideal use cases to help developers choose the right tool for AI‑driven document processing.

LLM · Markdown · PDF processing

0 likes · 8 min read

Ops Development & AI Practice

Apr 10, 2025 · Artificial Intelligence

Debugging LLM Model Context Protocol Servers Made Easy with MCP Inspector

This article introduces MCP Inspector, a GUI-based debugger for Model Context Protocol (MCP) servers that lets developers visualize tool registrations, prompt templates, resources, and real-time interactions. It also covers the commands used to launch, control, and troubleshoot LLM applications, streamlining development and reducing debugging friction.

LLM · MCP Inspector · Model Context Protocol

0 likes · 8 min read

AI Algorithm Path

Apr 10, 2025 · Artificial Intelligence

Beginner-Friendly Guide to Understanding Large Language Models

This article walks readers through the fundamentals of large language models, covering what tokens are, how tokenization works, the conversion of tokens to numeric IDs, the transformer architecture—including positional encoding, self‑attention, feed‑forward networks and softmax—and explains how these components enable next‑token prediction.

Artificial Intelligence · Embedding · LLM

0 likes · 9 min read

Spring Full-Stack Practical Cases

Apr 10, 2025 · Artificial Intelligence

Build a RAG-Powered Knowledge Base with Spring Boot, Milvus, and Ollama

This guide walks through creating a Retrieval‑Augmented Generation (RAG) system using Spring Boot 3.4.2, Milvus vector database, and the bge‑m3 embedding model via Ollama, covering environment setup, dependency configuration, vector store operations, and integration with a large language model to deliver refined, similarity‑based answers.

Embedding · LLM · Milvus

0 likes · 11 min read

Alibaba Cloud Big Data AI Platform

Apr 10, 2025 · Artificial Intelligence

Building a Pet Hospital AI Assistant with RAG and LLMs

This article walks through the motivation, core concepts of Retrieval‑Augmented Generation, and a step‑by‑step guide to constructing a pet‑hospital AI assistant on Alibaba Cloud using LLMs, vector databases, and automated pipelines, complete with code examples and practical tips.

AI Assistant · Alibaba Cloud · LLM

0 likes · 18 min read

DataFunTalk

Apr 9, 2025 · Artificial Intelligence

The Origin of Large Language Models: A Historical Investigation of ULMFiT and Early LLMs

This article examines the historical roots of large language models, highlighting Jeremy Howard's ULMFiT as a pioneering work, its influence on GPT-1, and subsequent debates about which model qualifies as the first true LLM, supported by citations and expert commentary.

AI history · GPT-1 · LLM

0 likes · 7 min read

Alibaba Cloud Native

Apr 8, 2025 · Cloud Native

How to Connect Large Language Models to Grafana Using MCP (Model Context Protocol)

This guide demonstrates building a Model Context Protocol (MCP) server in Python to enable large language models to query Grafana dashboards, retrieve folder lists, and return real‑time visualizations, covering installation, server definition, tool creation, and integration with Cherry Studio.

Grafana · LLM · Prompt

0 likes · 12 min read

Full-Stack Cultivation Path

Apr 7, 2025 · Frontend Development

Introducing Midscene.js: An AI‑Powered UI Automation Framework with Deep‑Think Capability

Midscene.js, the Web Infra team's AI × UI automation framework, adds Instant Actions for faster, more reliable UI operations and a Deep‑Think option that improves element localization by focusing LLM searches, with concrete code examples and model compatibility notes.

AI automation · Deep Think · Instant Actions

0 likes · 5 min read

Beijing SF i-TECH City Technology Team

Apr 7, 2025 · Artificial Intelligence

LLM Application in Text Information Detection and Extraction: A Case Study of Blue-Collar Recruitment Data Processing

This article explores the application of large language models (LLMs) to text information detection and extraction, focusing on blue-collar recruitment data processing. It details how prompt engineering, RAG enhancement, and model fine-tuning are used to improve data-cleaning efficiency and accuracy.

AI applications · LLM · Prompt Engineering

0 likes · 31 min read

JD Tech Talk

Apr 7, 2025 · Artificial Intelligence

Best Practices for Building AI Agents: Prompt Design, Tool Management, and Context Optimization

This article explains how to develop robust AI agents by breaking down large prompts, selecting appropriate tools, managing context efficiently, and applying modular design principles to reduce token costs, avoid hallucinations, and improve overall performance and reliability.

AI Agent · Best Practices · LLM

0 likes · 11 min read

JD Cloud Developers

Apr 7, 2025 · Artificial Intelligence

Why Bigger Prompts Fail: Modular Strategies for Building Efficient AI Agents

This article explains why overloading prompts and tools harms AI‑Agent performance, and offers practical modular design, intent‑driven instruction splitting, and efficient context management strategies such as curated function‑call tools and dynamic RAG to reduce token costs, improve response speed, and avoid hallucinations.

AI Agent · LLM · Modular Design

0 likes · 13 min read

AI Large Model Application Practice

Apr 7, 2025 · Artificial Intelligence

8 Leading LLM Agent Frameworks and How to Plug In MCP Server

This article surveys eight popular large‑language‑model (LLM) agent development frameworks—OpenAI Agents SDK, LangGraph, LlamaIndex, AutoGen, Pydantic AI, SmolAgents, Camel, and CrewAI—explaining each’s key features and providing concrete Python code to integrate the MCP Server for tool access.

LLM · Python · agents

0 likes · 15 min read

Ops Development & AI Practice

Apr 6, 2025 · Artificial Intelligence

How to Inspect Local LLM Specs with Ollama’s ‘show’ Command

This guide explains how to use the Ollama ‘show’ command to retrieve detailed specifications of locally stored large language models, covering architecture, parameters, context length, embedding size, quantization, capabilities, and licensing information for informed model selection.

AI tools · LLM · Ollama

0 likes · 4 min read

AI Frontier Lectures

Apr 6, 2025 · Artificial Intelligence

Can Multi‑Round Thinking Boost LLM Accuracy Without Extra Training?

A new study from the a‑m‑team introduces “Think Twice”, a test‑time multi‑round reasoning technique that, without additional training or model changes, repeatedly prompts large language models to self‑correct, yielding notable accuracy gains across benchmarks such as AIME, MATH‑500, GPQA‑Diamond and LiveCodeBench, while also producing shorter, more confident answers.

Artificial Intelligence · LLM · Multi-round reasoning

0 likes · 6 min read

21CTO

Apr 5, 2025 · Artificial Intelligence

AI Platform Highlights: Amazon Nova, Solo.io MCP, Kong Gateway, and More

Developers can stay current with recent AI advancements as Anthropic introduces Claude’s educational mode, Amazon launches the Nova model hub and Act SDK, Solo.io unveils the MCP Gateway for AI tool integration, Kong updates its AI Gateway to curb hallucinations, env0 releases Cloud Analyst, CodeSignal adds AI skill assessments, and Zencoder offers new AI coding and testing agents.

AI · AI Platforms · Cloud Computing

0 likes · 8 min read

Ops Development & AI Practice

Apr 5, 2025 · Artificial Intelligence

Why Do LLMs Follow Instructions So Well? Unpacking the Secrets

This article explains the concept of instruction‑following in large language models, compares early and modern LLMs, details the training techniques that enable it, highlights its importance, offers practical prompting tips, and discusses current challenges and future directions.

AI · LLM · Prompt Engineering

0 likes · 10 min read

Ops Development & AI Practice

Apr 5, 2025 · Artificial Intelligence

How Tool-Specific Tokens Empower LLMs to Interact with the Real World

This article explains the concept of tool-specific tokens for large language models, detailing how they enable efficient, reliable tool calls, the implementation steps, advantages over JSON, practical advice, comparisons, challenges, and future directions for AI agents.

LLM · Tool Calling · model fine-tuning

0 likes · 10 min read

AI Frontier Lectures

Apr 4, 2025 · Artificial Intelligence

Why Test‑Time Scaling Is Revolutionizing LLM Reasoning in 2025

This article surveys the latest research on large language model reasoning, highlighting test‑time scaling methods, chain‑of‑thought variants, and novel inference‑time techniques that boost performance while exposing trade‑offs, costs, and future directions for AI developers.

AI · LLM · Test-Time Scaling

0 likes · 26 min read

Ops Development & AI Practice

Apr 4, 2025 · Artificial Intelligence

Decoding LLM Endpoint Features: Quantization, Tokens, and Tool Support Explained

This article breaks down the key endpoint features of large language models—such as quantization, max token limits, streaming cancellation, tool support, and reasoning ability—explaining what each term means, why it matters, and how to choose models wisely for different applications.

AI model evaluation · Endpoint Features · LLM

0 likes · 11 min read

Ops Development & AI Practice

Apr 3, 2025 · Artificial Intelligence

What Powers LLMs? Unpacking Transformers, Architectures, and Context Windows

This article explains the core Transformer architecture behind large language models, compares encoder‑decoder and decoder‑only designs, and dives into the crucial concept of the context window, including its limits, examples, and ongoing research to extend it.

AI Architecture · LLM · Transformer

0 likes · 10 min read

Alimama Tech

Apr 3, 2025 · Artificial Intelligence

UQABench: A Personalized QA Benchmark for Evaluating User Embeddings in LLM‑Driven Recommendation Systems

UQABench introduces the first benchmark for assessing high‑density user embeddings that serve as soft prompts in LLM‑driven recommendation, featuring a three‑stage pre‑train‑align‑evaluate pipeline, seven personalized QA tasks, and findings that transformer encoders, side‑information, simple linear adapters, and larger models markedly improve accuracy while cutting input tokens to about five percent.

AI · Benchmark · LLM

0 likes · 12 min read

ByteDance Cloud Native

Apr 3, 2025 · Operations

How to Seamlessly Integrate CloudWeGo with APMPlus for Full‑Stack Observability

This article explains the challenges of observability in distributed microservice and LLM architectures, introduces CloudWeGo and APMPlus, and provides step‑by‑step integration guides for Kitex, Hertz, and Eino frameworks, including code samples, data reporting methods, and advanced monitoring features such as RED metrics, LLM‑specific indicators, service topology, and future roadmap.

APM · APMPlus · CloudWeGo

0 likes · 13 min read

MaGe Linux Operations

Apr 3, 2025 · Artificial Intelligence

How to Build and Deploy a Dify LLM Application Platform on CentOS

This guide explains what Dify is, outlines its key features and application scenarios, and provides step‑by‑step instructions for preparing the environment, installing Docker and Docker‑Compose, and deploying Dify on a CentOS 7.9 system, including verification of a successful setup.

AI platform · Dify · Docker

0 likes · 9 min read

BirdNest Tech Talk

Apr 3, 2025 · Artificial Intelligence

How Genspark’s Super Agent Outperforms OpenAI and Manus in GAIA Benchmarks

Genspark’s newly released Super Agent, built on a Mixture‑of‑Agents architecture that combines eight specialized LLMs and over 80 tools, claims to autonomously plan, execute, and integrate external services across tasks such as travel planning and video summarization, and reportedly surpasses OpenAI and Manus in the GAIA benchmark while offering instant access without an invitation code.

AI Agent · GAIA benchmark · LLM

0 likes · 4 min read

Big Data Technology & Architecture

Apr 3, 2025 · Artificial Intelligence

Understanding Model Context Protocol (MCP), Retrieval-Augmented Generation (RAG), and Vector Databases for LLM Integration

This article explains the Model Context Protocol (MCP) as a standard for LLM‑data integration, describes Retrieval‑Augmented Generation (RAG) techniques to reduce hallucinations, and introduces vector databases like Milvus that store high‑dimensional embeddings for efficient AI retrieval tasks.

LLM · Milvus · RAG

0 likes · 7 min read

DevOps

Apr 2, 2025 · Artificial Intelligence

Understanding Retrieval-Augmented Generation (RAG): Concepts, Evolution, and Types

This article explains Retrieval‑Augmented Generation (RAG) and its role in mitigating knowledge cutoffs and hallucinations in large language models, outlines the evolution from naive to advanced, modular, graph, and agentic RAG, and discusses future directions such as intelligent and multi‑modal RAG systems.

Artificial Intelligence · Knowledge retrieval · LLM

0 likes · 10 min read

AntTech

Apr 2, 2025 · Artificial Intelligence

PEAR: Position-Embedding-Agnostic Attention Re-weighting Enhances Retrieval-Augmented Generation with Zero Inference Overhead

The PEAR framework introduces a position‑embedding‑agnostic attention re‑weighting method that detects and suppresses detrimental attention heads in large language models, dramatically improving retrieval‑augmented generation performance without adding any inference overhead, as demonstrated on multiple RAG benchmarks and LLM families.

Attention Re-weighting · LLM · PEAR

0 likes · 6 min read

JD Retail Technology

Apr 2, 2025 · Artificial Intelligence

One4All: A Scalable Multi‑Task Generative Recommendation Framework for CPS Advertising

The paper introduces One4All, a scalable multi‑task generative recommendation framework for CPS advertising that combines few‑shot intent prompting, a Rewards‑in‑Context multi‑objective optimization, and an online model‑selection strategy, delivering 2‑3× offline HitRate/NDCG gains and notable online CTR, CVR, and commission improvements.

Advertising · LLM · large language models

0 likes · 14 min read

AI Algorithm Path

Apr 2, 2025 · Artificial Intelligence

Master the Three Essential LLM Training Stages for 2025

The article breaks down the three core stages of large‑language‑model training—pre‑training, supervised fine‑tuning, and RLHF—explaining their purpose, methods, and concrete examples while noting DeepSeek‑R1’s recent breakthrough and its implications for AI development.

AI training · DeepSeek · LLM

0 likes · 5 min read

Huolala Tech

Apr 1, 2025 · Frontend Development

How Frontend Teams Can Leverage LLMs for Real‑Time Compliance Checks

This article explains how frontend developers can use large language models to detect and prevent marketing content violations in WeChat mini‑programs, covering pain‑point discovery, LLM‑driven compliance architecture, prompt optimization, model selection, testing methods, and seamless frontend integration with Feishu notifications.

AI · LLM · Prompt Engineering

0 likes · 10 min read

Code Mala Tang

Mar 31, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Function Calling with Mistral, Llama, and Qwen

This tutorial explains how large language models can use function calling to access real‑time data, walks through setting up a Flask endpoint, demonstrates integration with Mistral Small, Llama 3.2‑1B, and Qwen models, and provides complete Python code examples for end‑to‑end execution.

API · Function Calling · LLM

0 likes · 10 min read

Efficient Ops

Mar 31, 2025 · Artificial Intelligence

How the Model Context Protocol (MCP) Is Revolutionizing AI Operations

The Model Context Protocol (MCP) lets large language models safely and directly access diverse data sources and tools, breaking data silos and enabling seamless AI‑driven automation across development, operations, and multi‑agent workflows.

AI Integration · LLM · Model Context Protocol

0 likes · 5 min read

Architect's Alchemy Furnace

Mar 31, 2025 · Artificial Intelligence

How to Deploy and Run Large Language Models with Xinference: A Step‑by‑Step Guide

Xinference is a powerful distributed inference framework that enables quick deployment and efficient serving of open‑source large language models via Docker or source installation, offering Web UI, CLI, and API interfaces with detailed setup, model launching, and Chatbox integration instructions.

API · Docker · LLM

0 likes · 11 min read

Architect

Mar 31, 2025 · Artificial Intelligence

A Comprehensive Study of Failure Modes in Large‑Language‑Model Based Multi‑Agent Systems

This paper presents a systematic investigation of failure patterns in LLM‑driven multi‑agent systems, introducing a 14‑type taxonomy (MASFT) derived from over 150 annotated dialogues, evaluating it with an LLM‑as‑a‑judge pipeline, and exploring modest intervention strategies while releasing all data and tools for future research.

AI · LLM · agentic

0 likes · 29 min read

Baobao Algorithm Notes

Mar 30, 2025 · Artificial Intelligence

Why Scaling, Data, and Infra Matter More Than Reward Design in R1 Replication

The article analyses two months of community attempts to reproduce DeepSeek R1, highlighting that model scaling, high‑quality data, robust training infrastructure, and careful hyper‑parameter tuning outweigh pure reward‑based tricks, and it outlines common pitfalls and future research directions.

DeepSeek · LLM · RLHF

0 likes · 13 min read

Rare Earth Juejin Tech Community

Mar 30, 2025 · Backend Development

Implementing Model Context Protocol (MCP) with SSE and HTTP in SpringBoot

This article explains the Model Context Protocol (MCP) for seamless LLM integration, describes its background, presents a sequence diagram of its architecture, and provides step‑by‑step Java SpringBoot code for SSE streaming, HTTP POST handling, and annotation‑based tool registration.

Backend · Java · LLM

0 likes · 11 min read

Architect

Mar 29, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build Powerful LLM Apps: Prompt Engineering, RAG, and AI Agents Explained

This article guides developers without an AI background through the fundamentals of building large‑language‑model applications, covering prompt engineering, multi‑turn interaction, function calling, retrieval‑augmented generation, vector databases, code assistants, and the Model Context Protocol (MCP) for AI agents.

AI Agent · Embedding · Function Calling

0 likes · 51 min read

Qborfy AI

Mar 29, 2025 · Artificial Intelligence

Mastering LangChain: Build LLM Apps with Chains, Agents, and Vector Stores

This tutorial walks through the limitations of simple prompt usage, introduces LangChain as a framework for building full‑featured LLM applications, explains its core concepts and components, and provides step‑by‑step code examples for installing, configuring, and running a basic LangChain demo.

AI Application · LLM · LangChain

0 likes · 11 min read

DevOps

Mar 27, 2025 · Artificial Intelligence

From Personal AI Tools to Industry Platforms: A Multi-Level Framework for AI Application Development

The article outlines a hierarchical model for AI application development, from basic user tools through personal assistants, SOP platforms, industry tools, and base models, emphasizing the importance of industry know‑how, data quality, and engineering to overcome model limitations and drive practical AI adoption.

AI · LLM · SOP

0 likes · 24 min read

Architect's Alchemy Furnace

Mar 27, 2025 · Artificial Intelligence

Xinference vs Ollama: Which Open‑Source LLM Engine Fits Your Needs?

This article provides a comprehensive side‑by‑side comparison of the open‑source LLM serving tools Xinference and Ollama, examining their core goals, architecture, model support, deployment options, performance, ecosystem integration, typical use cases, future roadmap, and guidance on selecting the right solution for enterprise or personal projects.

LLM · Local Deployment · Model Serving

0 likes · 7 min read

JavaEdge

Mar 27, 2025 · Artificial Intelligence

Can a Single LLM Both See and Reason? Exploring Visual Reasoning Models (VRM)

This article examines the limitations of current vision‑language and reasoning models, proposes a visual reasoning model (VRM) that can process images and perform deep logical inference, and discusses architecture, training methods, reinforcement‑learning reward designs, and practical challenges.

Artificial Intelligence · Deep Learning · LLM

0 likes · 8 min read

AI Large Model Application Practice

Mar 27, 2025 · Artificial Intelligence

Mastering AutoGen 0.4: Build Multi‑Agent Tools with Python and MCP

This article walks through the major changes in Microsoft AutoGen 0.4, explains its layered modular architecture and event‑driven multi‑agent design, details the built‑in Tools types, and provides step‑by‑step Python code for creating a Tools Agent and integrating it with an MCP server.

AutoGen · LLM · Python

0 likes · 9 min read

Baobao Algorithm Notes

Mar 27, 2025 · Artificial Intelligence

Why a Robust Training Pipeline Beats Fancy LLM Tricks – Lessons from DAPO

The article analyzes the DAPO technical report, showing how dynamic‑sampling pipelines and token‑level loss handling in SFT and RL training outperform ad‑hoc algorithm tricks, and compares the training dynamics of reinforce_baseline and GRPO with concrete code examples.

Dynamic Sampling · GRPO · LLM

0 likes · 8 min read

DevOps

Mar 26, 2025 · Artificial Intelligence

Introducing Model Context Protocol (MCP): An Open Standard for LLM Integration with Data Sources and Tools

The article explains Anthropic's open Model Context Protocol (MCP), detailing its client‑server architecture, resource and prompt definitions, tool discovery and execution, sampling workflow, security features, and provides a complete Python example that demonstrates building, running, and testing an MCP server and client for real‑time data retrieval.

AI Integration · LLM · Python

0 likes · 12 min read

Architect

Mar 26, 2025 · Artificial Intelligence

Agent Memory Mechanisms and Dify Knowledge Base Segmentation & Retrieval Details

This article explains the fundamentals of AI agent memory—including short‑term, long‑term, and working memory types and their storage designs—and then details Dify's knowledge‑base segmentation modes, indexing strategies, and retrieval configurations for effective RAG applications.

Agent Memory · Dify · Knowledge Base

0 likes · 14 min read

Architecture Digest

Mar 26, 2025 · Artificial Intelligence

Getting Started with LangChain in Java: Building Large Language Model Applications

This tutorial introduces the fundamentals of LangChain, explains large language models, prompt engineering, word embeddings, and demonstrates how to use the Java implementation LangChain4j with Maven dependencies, model I/O, memory, retrieval, chains, and agents to build sophisticated LLM‑driven applications.

AI · Java · LLM

0 likes · 18 min read

DaTaobao Tech

Mar 26, 2025 · Artificial Intelligence

Overview of Retrieval-Augmented Generation (RAG) and Related AI Technologies

The article surveys Retrieval‑Augmented Generation (RAG) as a solution to large language model limits—such as outdated knowledge, hallucinations, and security risks—by integrating vector‑database retrieval with LLM generation, and discusses related tools, multi‑agent frameworks, prompt engineering, fine‑tuning methods, and emerging optimization trends.

AI applications · LLM · Prompt Engineering

0 likes · 29 min read

ELab Team

Mar 26, 2025 · Artificial Intelligence

Uncovering LLM Blind Spots in AI Coding: Common Pitfalls and Solutions

Large language models often struggle with coding tasks, failing to stop when encountering obstacles, ignoring black‑box testing principles, and making unnecessary refactors; this article examines those blind spots, offers practical examples, and suggests strategies such as preparatory refactoring, stateless tools, and careful prompting to improve AI‑assisted development.

AI coding · Best Practices · Debugging

0 likes · 59 min read

Network Intelligence Research Center (NIRC)

Mar 26, 2025 · Artificial Intelligence

Enable Traditional LLMs to Use DeepSeek’s Multi‑Head Latent Attention Without Retraining

The paper introduces MHA2MLA, a data‑efficient fine‑tuning framework that converts pre‑trained multi‑head attention LLMs to DeepSeek’s Multi‑Head Latent Attention architecture, achieving up to 92% KV‑cache compression with less than 0.5% performance loss on long‑context tasks.

LLM · Low-Rank Approximation · Model Compression

0 likes · 8 min read

Programmer DD

Mar 25, 2025 · Artificial Intelligence

How to Build an MCP Client‑Server with Spring AI for LLM‑Powered Apps

This article demonstrates how to implement the Model Context Protocol (MCP) using Spring AI, covering the creation of MCP hosts, clients, and servers, configuring dependencies, integrating Claude, adding Brave Search and filesystem tools, and building a functional chatbot that leverages external data sources through standardized LLM interfaces.

LLM · Model Context Protocol · ai-integration

0 likes · 15 min read

21CTO

Mar 25, 2025 · Artificial Intelligence

Which LLM Is Best for Coding? Speed, Hallucination, and Context Compared

This article breaks down major large language models, defining key comparison metrics such as speed, hallucination rate, and context window, then evaluates each model with benchmarks like HumanEval+, ChatBot Arena, and Aider to help you choose the most suitable LLM for your coding tasks.

AI · Benchmark · LLM

0 likes · 10 min read

Open Source Tech Hub

Mar 24, 2025 · Artificial Intelligence

Break Data Silos for LLMs with Model Context Protocol (MCP) – PHP SDK Guide

This article explains the data‑isolation problem facing large language models, introduces the Model Context Protocol (MCP) as a standard bridge to external data sources, and provides a step‑by‑step PHP SDK tutorial—including installation, server and client code, and optional advanced logging—to help developers integrate AI models securely and efficiently.

Backend Development · LLM · Model Context Protocol

0 likes · 13 min read

AI Algorithm Path

Mar 24, 2025 · Artificial Intelligence

How to Use Pydantic for Structured LLM Output

The article explains why LLM responses can be inconsistent, introduces Pydantic as a way to define custom output schemas, and walks through concrete examples—both with OpenAI and Ollama models—showing how to build a LangChain pipeline that parses responses into structured data.

LLM · LangChain · Ollama

0 likes · 7 min read

JavaEdge

Mar 24, 2025 · Artificial Intelligence

Why Large Language Models Still Struggle with Complex Reasoning – Challenges and Solutions

This article examines the fundamental reasoning limitations of large language models, illustrates real‑world failure cases, and outlines current research directions such as better datasets, chain‑of‑thought prompting, external verification, and specialized solvers to improve their logical capabilities.

AI · LLM · Reasoning

0 likes · 8 min read
