Tagged articles
2079 articles
PaperAgent
Jan 17, 2026 · Artificial Intelligence

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

This article explains how representing multi‑component scientific knowledge as hyperedges, rather than traditional triples, enables large language models to traverse complex material interactions, reduce hallucinations, and generate verifiable experimental designs, demonstrated through a large hypergraph built from thousands of scaffold papers.

AI reasoning · Hypergraph · LLM
7 min read
macrozheng
Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

AI · LLM · RAG
7 min read
php Courses
Jan 16, 2026 · Artificial Intelligence

From Coding to Validation: How AI Is Redefining the Developer’s Role

The rise of large language models has shifted software development from manual coding to AI‑generated drafts, making verification, security, and business alignment the core responsibilities of modern engineers, and outlining the skills, workflows, and challenges needed to thrive in this new paradigm.

AI · LLM · code generation
11 min read
Ops Development & AI Practice
Jan 15, 2026 · Artificial Intelligence

Why Rapid Experimentation Beats Token‑Saving in LLM Development

The article explains how AI development with large language models differs from traditional software engineering, why developers feel abstract and uncertain, and offers actionable strategies—such as micro‑prototyping, tiered model usage, simple evaluation sheets, and embracing throwaway code—to accelerate learning despite token costs.

LLM · Rapid Prototyping · Token Management
7 min read
PaperAgent
Jan 15, 2026 · Artificial Intelligence

How GAG Enables Zero‑Retrieval, Single‑Token Private Knowledge Injection in LLMs

The article presents GAG, a third‑generation framework that injects proprietary domain knowledge into frozen large language models using a single token, eliminating retrieval, avoiding base model updates, and maintaining constant inference budget while delivering strong performance on private QA and public benchmarks.

AI alignment · GAG · LLM
8 min read
HyperAI Super Neural
Jan 15, 2026 · Artificial Intelligence

97% Accuracy: MOFSeq‑LMM Uses LLMs to Efficiently Predict MOF Synthesizability

A joint Princeton and Colorado School of Mines team introduced MOFSeq‑LMM, a large‑language‑model‑based framework that leverages a million‑scale MOF dataset and a novel string representation to predict free energy with MAE 0.789 kJ/mol and synthesizability with 97% F1, dramatically accelerating high‑throughput MOF screening.

LLM · MOFs · Materials Informatics
15 min read
Sohu Tech Products
Jan 14, 2026 · Artificial Intelligence

Build a Zero‑Cost Open‑Source RAG Smart Document Q&A System from Scratch

This guide walks through building an open‑source Retrieval‑Augmented Generation (RAG) system that indexes local files with Everything, uses hybrid BM25‑vector search via Elasticsearch, and answers questions with a local LLM, covering architecture, core techniques, deployment steps, performance tweaks, and common pitfalls.

Elasticsearch · LLM · Open Source
11 min read
Alibaba Cloud Developer
Jan 14, 2026 · Artificial Intelligence

How DataAgent Turns AI into a Virtual Data Analyst for Enterprise Insights

DataAgent, built on Spring AI Alibaba, tackles the "last mile" of AI data analysis by combining deterministic workflow orchestration with large‑model reasoning, offering human‑in‑the‑loop feedback, dynamic prompt configuration, hybrid retrieval, containerized Python execution, streaming SSE, multi‑model scheduling, multi‑source connectivity, and secure API‑key management to deliver instant, insight‑rich reports for business users.

AI · Analytics · DataAgent
11 min read
Network Intelligence Research Center (NIRC)
Jan 14, 2026 · Artificial Intelligence

From Black‑Box Guessing to Quantitative Deconstruction: Unveiling the Mystery Inside Large Language Models

At EMNLP 2025, the BUPT NIRC team presented a paper that introduces the ARR metric to quantitatively separate latent reasoning from factual shortcuts in LLMs, using Logit Lens and Attention Knockout to reveal distinct internal pathways; the article also shares the team's conference experience.

ARR metric · Attention Knockout · EMNLP2025
6 min read
Data Party THU
Jan 13, 2026 · Artificial Intelligence

How Engram’s ‘Lookup‑Compute Separation’ Boosts LLM Performance

DeepSeek’s newly open‑sourced Engram module introduces a scalable lookup‑based memory that separates knowledge retrieval from computation, enabling O(1) deterministic access and significantly improving large language model performance on knowledge‑heavy, reasoning, code, and math tasks without extra FLOPs.

@lookup · LLM · MoE
10 min read
AI Tech Publishing
Jan 12, 2026 · Artificial Intelligence

Ralph Loop: Engineering Continuous Iteration for AI Agents

Ralph Loop introduces an externalized iterative loop that forces AI agents to keep working until objective completion criteria are met, dramatically extending effective runtime from hours to a full day or more and shifting human‑agent collaboration from frequent supervision to efficient delegation.

AI Agent · Iterative Automation · LLM
17 min read
Design Hub
Jan 12, 2026 · Artificial Intelligence

Visual AI Prompt Editor Eliminates ‘Spell’ Anxiety, Tweaks Like Ordering Food

The article introduces a visual AI prompt editor that transforms lengthy, complex prompt strings into modular, editable Chinese sections, demonstrating the workflow with two examples—converting a “California girl” portrait to an Asian style and re‑imagining a cinematic skyscraper scene—while detailing step‑by‑step usage and JSON export options.

AI prompt engineering · JSON export · LLM
11 min read
Bighead's Algorithm Notes
Jan 11, 2026 · Artificial Intelligence

FinRpt: A Multi‑Agent Framework for Automatic Generation and Evaluation of Stock Research Reports

FinRpt introduces a novel multi‑agent pipeline that builds a high‑quality equity research report (ERR) dataset from six financial data sources, defines a comprehensive 11‑metric evaluation suite, and demonstrates that supervised‑fine‑tuned and reinforcement‑learned LLM agents significantly outperform single LLM baselines in both accuracy and efficiency.

FinRpt · Financial NLP · LLM
14 min read
Architect's Alchemy Furnace
Jan 10, 2026 · Artificial Intelligence

Build and Test a Multi‑Agent AI System with MetaGPT

This guide walks through the MetaGPT framework—explaining its multi‑agent architecture, core concepts, predefined roles, team setup, environment preparation, installation, configuration, and troubleshooting steps—so you can quickly build, run, and validate a collaborative AI software‑company simulation.

AI Agents · LLM · MetaGPT
14 min read
AI Engineering
Jan 10, 2026 · Artificial Intelligence

Teaching LLMs to Manage Memory Autonomously, Dropping Manual Rules

Alibaba's new AgeMem framework turns long‑term and short‑term memory management for large language model agents into a learnable reinforcement‑learning task, replacing handcrafted rules with a three‑stage training process and achieving significant benchmark gains.

AgeMem · Benchmark · GRPO
9 min read
JD Tech Talk
Jan 9, 2026 · Artificial Intelligence

How JoyCode Agent Scored 74.6% Pass@1 on SWE‑bench Verified with a Patch‑Test Co‑generation Loop

JoyCode Agent leverages a patch‑test co‑generation and iterative validation framework to achieve a 74.6% Pass@1 score on the SWE‑bench Verified benchmark, reducing resource consumption by 30‑50% and introducing a closed‑loop multi‑agent pipeline that integrates testing, patch generation, trajectory compression, similarity retrieval, and decision arbitration.

AI · LLM · SWE-bench
41 min read
PaperAgent
Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG
7 min read
Meituan Technology Team
Jan 8, 2026 · Artificial Intelligence

Must‑Read AAAI 2026 Papers: Efficient Reasoning, Annealing, Multimodal Diffusion & More

This article curates eight AAAI 2026 papers authored by the Meituan research team, covering verifiable stepwise rewards for LLM reasoning, annealing strategies in large‑scale training, process reward models, competence‑difficulty sampling, high‑fidelity visual text rendering, counterfactual fusion, compress‑then‑rank reranking, and cross‑modal quantization for generative recommendation, with direct PDF links for each work.

AAAI2026 · Counterfactual · LLM
14 min read
Kuaishou Tech
Jan 8, 2026 · Artificial Intelligence

Top 12 Kuaishou Papers Accepted at AAAI 2026: Breakthroughs in Recommendation, Video Generation, and LLM Research

Kuaishou secured 12 papers at AAAI 2026, covering advances in search and recommendation systems, multi‑camera video generation, multimodal understanding, generative model fundamentals, video large language models, experimental design, and LLM latent‑space reasoning, with three papers highlighted as oral presentations.

AI · LLM · diffusion
22 min read
Alibaba Cloud Developer
Jan 8, 2026 · Artificial Intelligence

How to Build Human‑In‑The‑Loop (HITL) Capabilities into ReactAgent

This article explains how to integrate a Human‑In‑The‑Loop (HITL) mechanism into ReactAgent, detailing the motivation, design of interaction, tool description, XML‑based UI rendering, Redis‑driven waiting loop, and the broader architectural parallels with design patterns and other agent frameworks.

Design Patterns · HITL · Human-in-the-Loop
14 min read
AndroidPub
Jan 8, 2026 · Artificial Intelligence

Unlocking Anthropic’s Agent Skill: Build Reusable AI Task Assistants in 3 Steps

This article explains Anthropic’s open‑standard Agent Skill, how it serves as a reusable task specification for Claude, walks through creating a skill with metadata, instructions, and advanced Reference/Script features, and compares Skill with MCP to help developers choose the right tool.

AI automation · Agent Skill · Anthropic
11 min read
Sohu Tech Products
Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

AI · Knowledge Base · LLM
14 min read
DaTaobao Tech
Jan 7, 2026 · Artificial Intelligence

5 Design Patterns to Control LLM Output in Generative AI Applications

The article presents five design patterns—Logits Masking, Grammar, Style Transfer, Reverse Neutralization, and Content Optimization—for steering the output of generative AI models, compares their suitable scenarios, advantages, drawbacks, and anti‑patterns, and provides concrete implementation steps, code snippets, and flowcharts to help developers reliably enforce style, format, and compliance constraints.

Generative AI · LLM · Prompt Engineering
20 min read
Tencent Cloud Developer
Jan 7, 2026 · Artificial Intelligence

How Context Engineering Powers the Next Generation of AI Agents

Transitioning from simple chatbots to sophisticated agents, this article explains how expanding context becomes a core variable, detailing the evolution from prompt engineering to context engineering, the challenges of managing growing context, and practical solutions like structured context, tool integration, and the MCP framework for reliable AI systems.

LLM · Reliability · Tool Integration
20 min read
Wuming AI
Jan 6, 2026 · Artificial Intelligence

Top LLM Leaderboards Explained: How to Choose the Right Model

This article surveys the most popular large‑language‑model leaderboards—including lmarena, Artificial Analysis, SuperCLUE, and llm‑stats—detailing their evaluation methods, coverage areas, URLs, and practical usage tips, while warning readers that rankings are only a reference and real‑world performance may vary.

AI benchmarking · Artificial Intelligence · LLM
5 min read
Bighead's Algorithm Notes
Jan 6, 2026 · Artificial Intelligence

FinRS: A Risk‑Sensitive Trading Framework for Real‑World Financial Markets

FinRS integrates hierarchical market analysis, dual decision agents, and multi‑time‑scale reward feedback to enable risk‑aware multi‑stage trading, achieving higher cumulative returns, better Sharpe ratios, and lower maximum drawdowns than existing LLM‑based and reinforcement‑learning baselines across diverse stocks.

FinRS · LLM · financial markets
14 min read
PMTalk Product Manager Community
Jan 6, 2026 · Industry Insights

Strategic Comparison of Dify, n8n, and ComfyUI for AI Applications and Automation

This article provides a multi‑dimensional strategic analysis of three representative AI‑focused platforms—Dify, n8n, and ComfyUI—examining their product positioning, architecture, interaction models, commercialization strategies, and agent capabilities, and offers concrete recommendations for product managers on choosing the right tool based on ease of use, control, scalability, and total cost of ownership.

AI Platforms · LLM · Open Source
35 min read
PaperAgent
Jan 6, 2026 · Artificial Intelligence

How Ontology‑Driven GraphRAG Eliminates Noise in AI Knowledge Graphs

This article examines the shortcomings of naïve GraphRAG implementations on clinical data and explains how an ontology‑driven, zero‑noise GraphRAG architecture can create self‑improving, conflict‑free knowledge graphs for AI applications.

AI · Data Quality · GraphRAG
3 min read
PaperAgent
Jan 5, 2026 · Artificial Intelligence

How QuCo‑RAG Replaces Model Confidence with Objective Evidence to Cut Hallucinations

QuCo‑RAG introduces a dynamic retrieval‑augmented generation framework that quantifies uncertainty using pre‑training corpus statistics, replacing unreliable model confidence with objective frequency and co‑occurrence evidence, achieving millisecond‑level hallucination detection, superior multi‑hop QA performance, and cross‑model transferability across various LLMs.

Dynamic Retrieval · LLM · Retrieval-Augmented Generation
9 min read
AI Insight Log
Jan 4, 2026 · Artificial Intelligence

Agent Skills for Context Engineering: 4K Stars, Powering Cursor & Codex

The open‑source ‘Agent Skills for Context Engineering’ project, which amassed over 4,100 stars in a week, demonstrates why managing a model’s attention budget—through foundational, operational, and development‑methodology skills—is essential as context windows grow, and provides platform‑agnostic instructions for Claude Code, Cursor and other AI tools.

Agent Skills · Claude Code · Context Engineering
7 min read
Bighead's Algorithm Notes
Jan 4, 2026 · Artificial Intelligence

How VTA Combines Large‑Model Reasoning for Precise and Explainable Stock Time‑Series Forecasting

The VTA framework integrates large language model reasoning with textual annotation of technical indicators, employs a Time‑GRPO reinforcement‑learning objective and multi‑stage joint conditional training, and achieves state‑of‑the‑art accuracy and expert‑rated interpretability on US, Chinese and European stock datasets.

LLM · Time-series · VTA
19 min read
AI Insight Log
Jan 4, 2026 · Artificial Intelligence

How Playwright + AI Powers a Fully Automated Xianyu Treasure Hunt

The article examines the open‑source ai‑goofish‑monitor project, which combines Playwright‑driven browsing with large‑language‑model analysis to continuously scan Xianyu listings, filter out junk, and highlight high‑quality items, while also discussing its AI‑generated code, benefits, limitations, and security risks.

AI · LLM · Playwright
7 min read
PaperAgent
Jan 4, 2026 · Artificial Intelligence

How Sophia’s System 3 Turns LLM Agents into Persistent Learners

The article presents Sophia, a System 3‑enabled persistent agent framework that adds a meta‑cognitive layer to LLM‑based agents, enabling identity continuity, self‑scheduled learning, real‑time self‑checks, and autonomous task generation, and validates its benefits through a 24‑hour continuous‑run experiment.

AI Agents · LLM · System architecture
7 min read
Architect
Jan 3, 2026 · Artificial Intelligence

Unlocking AI Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys the emerging field of AI agent memory, presenting a three‑dimensional taxonomy of memory forms, detailing functional categories such as factual, experiential, and working memory, and outlining dynamic processes of formation, evolution, and retrieval, while also highlighting benchmarks, open‑source frameworks, and future research directions.

AI Agents · Agentic Systems · LLM
7 min read
NetEase LeiHuo Testing Center
Jan 2, 2026 · Artificial Intelligence

From ChatGPT to LLM‑Native: Building Intelligent AI Agents and Workflows with LangChain

The article explains why traditional chat‑based AI tools are limited to advice, introduces next‑generation LLM‑native applications that can understand, plan, and act, and provides a step‑by‑step guide on designing AI workflows, autonomous agents, hybrid architectures, and the Model Context Protocol (MCP) using LangChain.

AI Agents · LLM · LangChain
36 min read
IT Services Circle
Jan 2, 2026 · Artificial Intelligence

Top Open‑Source NotebookLM Alternatives: AI‑Powered Docs, Podcasts & Research Tools

This article surveys the most popular open‑source replacements for Google NotebookLM, detailing each project's star count, supported AI models, multimodal input capabilities, Docker deployment options, and unique features such as multi‑speaker podcast generation, semantic search, and collaborative knowledge‑base integration.

AI · Docker · LLM
8 min read
AI Architecture Hub
Dec 31, 2025 · Artificial Intelligence

Why LangGraph Is the Next‑Generation Framework for LLM Agent Orchestration

This article explains the motivation behind LangGraph, walks through a quick start, details its core syntax and state management, demonstrates conditional branching, parallel execution, tool integration, multi‑agent orchestration, and real‑time monitoring, and finally discusses future directions for the framework.

LLM · LangGraph · Parallel Execution
32 min read
Data Party THU
Dec 29, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: A Deep Dive into Forms, Functions, and Dynamics

This article reviews the survey "Memory in the Age of AI Agents," presenting a comprehensive taxonomy that classifies agent memory by its forms, functions, and dynamic mechanisms, and explores future directions such as generative memory, reinforcement‑learning‑driven management, multimodal storage, and trustworthy handling.

AI Agents · Agent Architecture · Future AI
14 min read
Alibaba Cloud Developer
Dec 29, 2025 · Artificial Intelligence

How Alibaba’s Tair KVCache Manager Revolutionizes Enterprise‑Level LLM Cache Management

This article details the architecture and implementation of Tair KVCache Manager, an enterprise‑grade service that centralises KVCache metadata, decouples inference engines from storage, provides elastic scaling, multi‑tenant isolation, high availability, and performance‑optimised cache management for large‑scale LLM inference workloads.

Cache Management · KVCache · LLM
28 min read
MaGe Linux Operations
Dec 27, 2025 · Artificial Intelligence

How to Deploy and Optimize Enterprise‑Scale LLM Inference Services: A Practical Guide

This guide walks you through deploying large language models such as ChatGLM and Llama in production, covering environment setup, model quantization, dynamic batching, service configuration, Nginx load balancing, monitoring, troubleshooting, and best‑practice recommendations for high‑performance, cost‑effective AI inference.

GPU · LLM · Performance tuning
48 min read
AI Architecture Hub
Dec 27, 2025 · Artificial Intelligence

How GraphRAG Turns Knowledge Graphs into Smarter Retrieval for LLMs

GraphRAG extends traditional Retrieval‑Augmented Generation by building a knowledge graph from documents, extracting entities and relationships, performing community detection, and supporting both local and global searches, offering detailed step‑by‑step guidance, code examples, configuration tips, and a comparison with classic RAG approaches.

GraphRAG · Knowledge Graph · LLM
28 min read
Alibaba Cloud Native
Dec 27, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: Short‑Term vs Long‑Term Strategies and Framework Integration

This article explains how AI agents overcome context window limits by using memory systems, distinguishes short‑term (session) and long‑term (cross‑session) memory, compares implementations in Google ADK, LangChain and AgentScope, and outlines context‑engineering techniques, core components, challenges, and emerging trends.

AI memory · Agent Frameworks · Context Engineering
20 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How AutoContextMemory Cuts LLM Costs by 70% in Long Conversations

This article explains the challenges of token explosion in long‑running AI agent dialogues and introduces AutoContextMemory, a Java component that automatically compresses, offloads, and summarizes conversation history to dramatically reduce token usage, speed up responses, and preserve critical information.

AgentScope · Context Management · LLM
12 min read
360 Tech Engineering
Dec 26, 2025 · Artificial Intelligence

15 Chunking Strategies to Supercharge Retrieval‑Augmented Generation

This article presents fifteen practical chunking techniques—ranging from line‑by‑line and fixed‑size chunking to semantic and hierarchical methods—explaining their principles, ideal use‑cases, concrete input examples, chunk outputs, and key advantages or cautions for improving Retrieval‑Augmented Generation with large language models.

AI · Chunking · Data Retrieval
28 min read
Alibaba Cloud Developer
Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG
15 min read
Architect
Dec 25, 2025 · Artificial Intelligence

How GraphRAG Boosts Retrieval Accuracy with Knowledge Graphs – A Complete Guide

This article explains why traditional RAG suffers from hallucinations, introduces GraphRAG’s knowledge‑graph‑based approach, walks through its indexing and query pipelines—including text splitting, entity‑relation extraction, graph construction, community detection, and local vs. global retrieval—provides practical setup commands, Neo4j visualization steps, and compares its performance with classic RAG.

Embedding · GraphRAG · Knowledge Graph
27 min read
360 Tech Engineering
Dec 25, 2025 · Artificial Intelligence

Why LangChain 1.0 Makes AI Agent Development Faster, Safer, and More Scalable

LangChain 1.0 replaces fragmented agent code with a production‑ready framework that unifies model outputs, simplifies tool integration, introduces content_blocks for consistent response handling, and adds a middleware system for privacy, summarization, and human‑in‑the‑loop safety, dramatically improving developer efficiency and reliability.

LLM · LangChain · Python
13 min read
AI Architecture Hub
Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM
9 min read
Zhuanzhuan Tech
Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM
11 min read
Baidu Geek Talk
Dec 24, 2025 · Artificial Intelligence

Context Parallelism Slashes TTFT by 80% for 128K-Token LLMs

The article explains how Baidu’s Baige team integrated a Context Parallelism strategy into DeepSeek V3.2, detailing the DSA architecture, the limitations of traditional tensor and sequence parallelism, and how CP distributes computation and memory across GPUs to achieve up to an 80% reduction in time‑to‑first‑token (TTFT) latency for ultra‑long 128K‑token contexts.

Context Parallelism · DeepSeek · LLM
9 min read
Tencent Technical Engineering
Dec 24, 2025 · Artificial Intelligence

Build a Mini LLM from Scratch: Step‑by‑Step Guide to Tokenizer, Attention, and Transformer

This article walks through constructing a small large‑language model from the ground up, covering model architecture, tokenization methods, BPE vocabulary building, embedding, positional encoding, attention mechanisms, multi‑head attention, transformer blocks, training pipelines, inference, and sampling strategies, all with runnable Python code.

Deep Learning · LLM · Python
34 min read
Baidu Intelligent Cloud Tech Hub
Dec 24, 2025 · Artificial Intelligence

How Context Parallelism Slashes LLM First‑Token Latency by 80% for 128K Tokens

The article explains how the newly merged Context Parallelism (CP) technique in SGLang, combined with DeepSeek V3.2's Sparse Attention architecture, reduces first‑token latency by up to 80% and alleviates memory pressure for ultra‑long 128K‑token sequences, detailing both algorithmic innovations and engineering solutions.

AI Infrastructure · Context Parallelism · LLM
0 likes · 10 min read
Bighead's Algorithm Notes
Dec 23, 2025 · Artificial Intelligence

How H3M‑SSMoEs Combines Hypergraph Multimodal Learning and LLM Reasoning to Predict Stock Direction

The paper introduces H3M‑SSMoEs, a framework that combines a multi‑context hypergraph for fine‑grained spatio‑temporal dynamics, a frozen Llama‑3.2‑1B LLM adapter, and a style‑structured mixture of experts to jointly model stock relationships, multimodal semantics, and market regimes, achieving superior accuracy and investment returns on DJIA, NASDAQ‑100, and S&P‑100 benchmarks.

Financial AI · Hypergraph · LLM
0 likes · 14 min read
Alibaba Cloud Infrastructure
Dec 22, 2025 · Artificial Intelligence

Boost LLM Inference with KV‑Cache‑Aware Routing on Alibaba Cloud ACK GIE

This article explains why KV‑Cache hit rate is critical for large‑model inference, describes vLLM's automatic prefix caching, outlines the distributed cache challenges, and provides a step‑by‑step guide to deploying Alibaba Cloud ACK Gateway with Inference Extension's precise‑mode prefix‑cache‑aware routing, backed by benchmark results.

Alibaba Cloud · KV Cache · Kubernetes
0 likes · 18 min read
AsiaInfo Technology: New Tech Exploration
Dec 22, 2025 · Artificial Intelligence

How Advanced RAG Techniques Are Redefining Enterprise Knowledge Services

This article examines four cutting‑edge Retrieval‑Augmented Generation frameworks—Adaptive RAG, Agentic RAG, OG‑RAG, and OAG—detailing their definitions, core mechanisms, performance gains, and practical selection guidance for complex enterprise scenarios, while highlighting future research directions.

Enterprise Knowledge · LLM · Ontology
0 likes · 21 min read
JD Tech
Dec 22, 2025 · Artificial Intelligence

Build Flexible Multi‑Agent Systems Like LEGO with OxyGent – New Features Unveiled

The OxyGent 1.0.8 release introduces multimodal messaging, fine‑grained control, MCP reconnection, and front‑end streaming, while detailing its stateless AOP architecture, execution lifecycle, four data scopes, real‑world use cases, community feedback, and a step‑by‑step tutorial for rapid adoption.

AI · Framework · LLM
0 likes · 11 min read
Alibaba Cloud Developer
Dec 22, 2025 · Artificial Intelligence

Turning Real‑Time Hotspot Detection into AI‑Powered E‑Commerce Recommendations

Traditional recommendation systems lag behind fast‑moving external trends, missing the freshness and surprise users crave. This article details an end‑to‑end AI pipeline that perceives, understands, and reacts to hotspots within hours, automatically generating high‑quality product selections and continuously optimizing through feedback loops.

AI recommendation · LLM · automation
0 likes · 25 min read
Architect's Alchemy Furnace
Dec 21, 2025 · Artificial Intelligence

Deploy and Explore Open WebUI: A Feature‑Rich Self‑Hosted AI Platform

Open WebUI is a self‑hosted, extensible AI platform that runs fully offline, supports multiple LLM back‑ends such as Ollama and OpenAI‑compatible APIs, offers built‑in RAG, role‑based access, multi‑model chat, markdown/LaTeX, image generation, and provides detailed Docker, pip, and Kubernetes installation guides with ready‑to‑run commands.

AI platform · Docker · LLM
0 likes · 11 min read
Advanced AI Application Practice
Dec 20, 2025 · Artificial Intelligence

Master System, User, Assistant Roles to Get Precise AI Testing Answers from LLMs

This article explains how the System, User, and Assistant roles in large-language-model chat APIs shape response quality, demonstrates their impact with concrete Python code examples, compares outcomes with and without System prompts, and offers practical tips for crafting effective prompts to achieve concise, relevant AI testing guidance.

AI testing · Assistant Role · LLM
0 likes · 14 min read
Design Hub
Dec 20, 2025 · Artificial Intelligence

Must-Read: K's 2025 AI Review – 6 Paradigm Shifts Reshaping Our World

The article reviews six 2025 paradigm shifts in large language models—from the rise of verifiable‑reward reinforcement learning and the emergence of AI "ghosts" to new "Cursor for X" middle layers, local agents like Claude Code, Vibe Coding that lets users program by conversation, and visual interaction driven by Gemini Nano Banana—highlighting their technical impact and design implications.

AI Agents · LLM · RLVR
0 likes · 12 min read
PaperAgent
Dec 20, 2025 · Industry Insights

What 2025 Tells Us About the Future of Large Language Models

The 2025 LLM year‑in‑review highlights paradigm shifts such as RLVR training, uneven “saw‑tooth” intelligence, the rise of Cursor‑style applications, Claude Code agents running locally, Vibe Coding, and the Nano Banana GUI revolution, concluding that current models only exploit about 10 % of their potential.

AI Agents · LLM · Nano Banana
0 likes · 10 min read
Bighead's Algorithm Notes
Dec 19, 2025 · Artificial Intelligence

Quantitative Finance Paper Digest: Dec 13‑19 2025 Highlights

This digest presents recent arXiv papers (Dec 13‑19 2025) on AI‑driven quantitative finance, covering LLM‑based portfolio recommendation, reinforcement‑learning deep hedging, hybrid SV‑LSTM volatility forecasting, dynamic stacking ensembles, GA‑optimized SVR forecasting, and interpretable deep learning asset pricing, each with abstracts and key findings.

Deep Learning · LLM · Quantitative Finance
0 likes · 16 min read
Alibaba Cloud Native
Dec 19, 2025 · Artificial Intelligence

What Enterprises Are Learning from the State of Agent Engineering Report

The recent LangChain "State of Agent Engineering" report, combined with data from the AI‑Native Application Architecture whitepaper, reveals rapid production adoption of AI agents, persistent quality challenges, widespread observability, multi‑model strategies, and evolving evaluation practices across organizations of all sizes.

AI Agents · Evaluation · LLM
0 likes · 10 min read
Bilibili Tech
Dec 19, 2025 · Artificial Intelligence

SABER: Switchable and Balanced Training for Efficient LLM Reasoning

SABER introduces a reinforcement‑learning framework that lets large language models dynamically switch among four token‑budgeted reasoning modes, dramatically cutting inference length while preserving or improving accuracy across math, code, and logic tasks.

Budgeted Computation · Efficient Reasoning · LLM
0 likes · 13 min read
PaperAgent
Dec 18, 2025 · Artificial Intelligence

Can Ontology‑Aware KG‑RAG Double Table QA Performance on Industrial Standards?

This article presents an ontology‑aware knowledge‑graph RAG framework that transforms complex, hierarchical industrial standard documents into a graph of sections, atomic propositions, and refined triples, achieving nearly double F1 scores on table‑based QA tasks and robust performance on long documents.

Knowledge Graph · LLM · Ontology
0 likes · 6 min read
21CTO
Dec 17, 2025 · Artificial Intelligence

Can a New Language Make LLMs Write Code with 100% Accuracy? Meet Sui

Japanese data scientist Takato Honda introduces Sui, an open‑source programming language designed to eliminate syntax and spelling errors and to let large language models generate code with claimed 100% accuracy, offering token‑efficiency optimizations for AI‑assisted programming.

AI · LLM · Open Source
0 likes · 4 min read
PaperAgent
Dec 17, 2025 · Artificial Intelligence

Unlocking Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys over 200 recent papers on AI agent memory, introducing a three‑dimensional framework of form, function, and dynamics, classifying memory into token‑level, parametric, and latent types, outlining their roles, lifecycle operations, benchmark datasets, open‑source frameworks, and seven emerging research directions.

AI Agents · LLM · Survey
0 likes · 6 min read
Architects' Tech Alliance
Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM
0 likes · 15 min read
PaperAgent
Dec 16, 2025 · Artificial Intelligence

Open Notebook: The Open‑Source, Privacy‑First Alternative to Google NotebookLM

Open Notebook is a fully local, open‑source AI notebook that rivals Google NotebookLM by supporting more than 16 LLM providers, handling multimodal content, and enabling advanced multi‑speaker podcast generation while giving users complete data sovereignty and flexible deployment options.

AI Notebook · LLM · Open Source
0 likes · 4 min read
Fighter's World
Dec 16, 2025 · Artificial Intelligence

Boosting Large Language Model Domain Expertise with Claude Skills

The article analyzes why generic LLMs struggle with domain‑specific reasoning, critiques fine‑tuning, RAG and prompt engineering, and presents Claude Skills—using progressive disclosure, Pydantic validation, and state‑machine control—to encode expert constraints as executable rules, illustrated with finance compliance and legal reasoning case studies and backed by Anthropic research.

Claude · Domain-specific · LLM
0 likes · 20 min read
JakartaEE China Community
Dec 16, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System with LangChain4j and Ollama 3

This guide explains why Retrieval‑Augmented Generation matters, outlines the core LangChain4j and Ollama 3 components, and provides a complete Java example—including Maven setup, document ingestion, embedding creation, similarity search, prompt construction, and response generation—to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j
0 likes · 9 min read
PaperAgent
Dec 16, 2025 · Artificial Intelligence

Do LLMs Have Emotional Chains? Unveiling the Chain‑of‑Affective Across 8 Model Families

This article analyzes recent research by East China Normal University and Fudan University on whether eight major LLM families exhibit a systematic “Chain-of-Affective,” revealing how internal emotional structures influence model outputs, multi‑agent interactions, and user experience, and offering practical guidelines for mitigating emotional loops in AI systems.

AI safety · Benchmark · Chain-of-Affective
0 likes · 8 min read
Qborfy AI
Dec 16, 2025 · Artificial Intelligence

Mastering AI Function Calling: Turn LLMs into Actionable Assistants

Function Calling lets large language models invoke external tools or APIs during a conversation, transforming them from passive responders into proactive assistants; this guide explains the concept, workflow, and practical implementations with weather, parallel queries, and stock price examples using OpenAI’s Python SDK.

AI Function Calling · Chatbot · LLM
0 likes · 9 min read
Alibaba Cloud Developer
Dec 16, 2025 · Artificial Intelligence

How We Built an AI‑Powered Data Agent to Automate Data Retrieval at Scale

This article details the design and implementation of Matra, an AI‑driven data assistant for a large e‑commerce platform, covering the challenges of legacy data assets, knowledge‑base construction, GraphRAG integration, multi‑stage agent frameworks, practical results, and future plans for continuous improvement.

AI · Data Engineering · Data Retrieval
0 likes · 22 min read
AI Large Model Application Practice
Dec 16, 2025 · Artificial Intelligence

Recreating NotebookLM’s PPT Generation with a Low‑Code Workflow

This guide shows how to use the open‑source BISHENG low‑code platform, ByteDance’s Seed‑1.6 and Seedream‑4.5 models, and a custom MCP server to build a workflow that uploads documents, performs RAG, generates structured PPT outlines with LLMs, creates page images via text‑to‑image models, and assembles a downloadable PDF, all while incorporating human‑in‑the‑loop controls.

BISHENG · HITL · LLM
0 likes · 17 min read
Old Meng AI Explorer
Dec 15, 2025 · Artificial Intelligence

Unlock Multi‑Model AI Decision Power with LLM Council – A Hands‑On Guide

LLM Council, an open‑source platform created by former OpenAI researcher Andrej Karpathy, lets users query top LLMs such as GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5 and Grok 4 simultaneously, has the models anonymously peer‑review one another's answers, and synthesizes a final report, dramatically improving accuracy for research, tech selection and learning while remaining easy to install and run locally.

AI tool · LLM · Open-source
0 likes · 11 min read
Architect
Dec 15, 2025 · Artificial Intelligence

Demystifying LLM Architecture: From Transformers to Modern MoE Designs

This comprehensive guide explains the fundamentals of large language model (LLM) architectures, covering the original Transformer, tokenization, embeddings, positional encoding, attention mechanisms, feed‑forward networks, layer stacking, a step‑by‑step translation example, and the latest open‑source and hybrid LLM designs shaping the field.

Embedding · LLM · MoE
0 likes · 41 min read
Baidu Intelligent Cloud Tech Hub
Dec 15, 2025 · Artificial Intelligence

Baidu Baige’s Breakthrough: Orchestrating Giant LLM Inference with Silent Instances

The article details Baidu Baige’s next‑generation distributed inference platform for trillion‑parameter LLMs, explaining how automated orchestration, the FedDeployment abstraction, SplitService unified view, Adaptive HPA predictive scaling, Silent Instances for second‑level activation, and the Staggered Batched Scheduler eliminate scaling limits, reduce TTFT by 30‑40%, boost throughput by up to 20%, and achieve cost‑effective, elastic AI compute.

Autoscaling · Kubernetes · LLM
0 likes · 23 min read
Wu Shixiong's Large Model Academy
Dec 15, 2025 · Artificial Intelligence

Mastering Text2SQL: From Schema Design to Secure Multi‑Step LLM Pipelines

This article explains how Text2SQL works by teaching LLMs to understand a closed‑world database schema, constructing tightly constrained prompts, validating generated SQL, handling execution errors, and using a second LLM call to translate results into natural language, while highlighting common pitfalls and engineering best practices.

LLM · SQL Validation · Text2SQL
0 likes · 9 min read
Network Intelligence Research Center (NIRC)
Dec 15, 2025 · Artificial Intelligence

Turning LLM-Generated Network Configurations into Verified, Safe Updates with Artanis

The paper introduces Artanis, an intent‑based network configuration update framework that combines large‑language‑model generation with a verification‑feedback loop and reinforcement‑learning optimization, addressing hallucination‑induced errors and ensuring safe, policy‑compliant deployments across diverse network scales.

Configuration Management · Intent-based Networking · LLM
0 likes · 9 min read
Architect's Alchemy Furnace
Dec 13, 2025 · Artificial Intelligence

Explore 100+ Open‑Source LLM Apps and How to Run Them Locally

This guide presents a curated collection of over a hundred open‑source large language model applications—including AI agents, RAG pipelines, and domain‑specific tools—explains their categories, showcases example projects, and provides step‑by‑step instructions to clone and run them on your own machine.

AI Agents · GitHub · LLM
0 likes · 8 min read