Tagged articles

2079 articles

Jan 17, 2026 · Artificial Intelligence

Hypergraphs Turn LLMs into Reliable Material Discovery Agents

This article explains how representing multi‑component scientific knowledge as hyperedges, rather than traditional triples, enables large language models to traverse complex material interactions, reduce hallucinations, and generate verifiable experimental designs, demonstrated through a large hypergraph built from thousands of scaffold papers.

AI reasoning · Hypergraph · LLM

0 likes · 7 min read

AI Engineering

Jan 17, 2026 · Artificial Intelligence

Can Tiny LLMs Compute Accurately? WorldModel‑Qwen Inference‑Time WASM Execution

The article details how the small Qwen‑0.6B model was adapted to generate and run WebAssembly code during inference, achieving deterministic calculations and revealing both the promise and current limitations of integrating world‑model reasoning into tiny LLMs.

LLM · Qwen-0.6B · WASM execution

0 likes · 5 min read

macrozheng

Jan 16, 2026 · Artificial Intelligence

Unlock Seamless Document Search with WeKnora: An Open‑Source LLM Retrieval Framework

WeKnora is an open‑source Tencent framework that combines large language models with retrieval‑augmented generation to enable fast, accurate semantic search and question answering across heterogeneous documents such as PDFs, Word files, and images, offering a modular, extensible architecture and easy Docker‑based deployment.

AI · LLM · RAG

0 likes · 7 min read

php Courses

Jan 16, 2026 · Artificial Intelligence

From Coding to Validation: How AI Is Redefining the Developer’s Role

The rise of large language models has shifted software development from manual coding to AI‑generated drafts, making verification, security, and business alignment the core responsibilities of modern engineers, and outlining the skills, workflows, and challenges needed to thrive in this new paradigm.

AI · LLM · code generation

0 likes · 11 min read

Ops Development & AI Practice

Jan 15, 2026 · Artificial Intelligence

Why Rapid Experimentation Beats Token‑Saving in LLM Development

The article explains how AI development with large language models differs from traditional software engineering, why developers feel abstract and uncertain, and offers actionable strategies—such as micro‑prototyping, tiered model usage, simple evaluation sheets, and embracing throwaway code—to accelerate learning despite token costs.

LLM · Rapid Prototyping · Token Management

0 likes · 7 min read

Tencent Tech

Jan 15, 2026 · Artificial Intelligence

How TCAR Redefines Enterprise Multi‑Agent Routing with Reason‑First Decision Making

The article explains how Tencent Cloud's open‑source TCAR router, a 4‑billion‑parameter model, tackles the limitations of traditional single‑label routers by first reasoning and then selecting agents, enabling cross‑domain, conflict‑aware, and adaptable task coordination in enterprise AI systems.

AI · LLM · Open Source

0 likes · 7 min read

PaperAgent

Jan 15, 2026 · Artificial Intelligence

How GAG Enables Zero‑Retrieval, Single‑Token Private Knowledge Injection in LLMs

The article presents GAG, a third‑generation framework that injects proprietary domain knowledge into frozen large language models using a single token, eliminating retrieval, avoiding base model updates, and maintaining constant inference budget while delivering strong performance on private QA and public benchmarks.

AI alignment · GAG · LLM

0 likes · 8 min read

HyperAI Super Neural

Jan 15, 2026 · Artificial Intelligence

97% Accuracy: MOFSeq‑LMM Uses LLMs to Efficiently Predict MOF Synthesizability

A joint Princeton and Colorado School of Mines team introduced MOFSeq‑LMM, a large‑language‑model‑based framework that leverages a million‑scale MOF dataset and a novel string representation to predict free energy with an MAE of 0.789 kJ/mol and synthesizability with a 97% F1 score, dramatically accelerating high‑throughput MOF screening.

LLM · MOFs · Materials Informatics

0 likes · 15 min read

AI Large Model Application Practice

Jan 15, 2026 · Artificial Intelligence

Why Transformers Need Positional Embeddings and How They Work

This article explains the order‑blindness of Transformer self‑attention, why naïvely adding raw position indices harms semantics, and walks through sinusoidal, learnable, and rotary positional encodings together with PI and YaRN techniques for extending sequence length.

AI · Deep Learning · LLM

0 likes · 12 min read

Sohu Tech Products

Jan 14, 2026 · Artificial Intelligence

Build a Zero‑Cost Open‑Source RAG Smart Document Q&A System from Scratch

This guide walks through building an open‑source Retrieval‑Augmented Generation (RAG) system that indexes local files with Everything, uses hybrid BM25‑vector search via Elasticsearch, and answers questions with a local LLM, covering architecture, core techniques, deployment steps, performance tweaks, and common pitfalls.

Elasticsearch · LLM · Open Source

0 likes · 11 min read

Aikesheng Open Source Community

Jan 14, 2026 · Artificial Intelligence

NL2SQL Datasets REEF & text2SQL4PM: Causal Analysis Meets Process Mining

This article introduces two recent NL2SQL benchmark datasets—REEF, a synthetic e‑commerce database for end‑to‑end causal analysis, and text2SQL4PM, a bilingual process‑mining dataset—detailing their construction, evaluation results, and research implications for large language models.

Causal Analysis · LLM · NL2SQL

0 likes · 8 min read

Alibaba Cloud Developer

Jan 14, 2026 · Artificial Intelligence

How DataAgent Turns AI into a Virtual Data Analyst for Enterprise Insights

DataAgent, built on Spring AI Alibaba, tackles the "last mile" of AI data analysis by combining deterministic workflow orchestration with large‑model reasoning, offering human‑in‑the‑loop feedback, dynamic prompt configuration, hybrid retrieval, containerized Python execution, streaming SSE, multi‑model scheduling, multi‑source connectivity, and secure API‑key management to deliver instant, insight‑rich reports for business users.

AI · Analytics · DataAgent

0 likes · 11 min read

PMTalk Product Manager Community

Jan 14, 2026 · Product Management

From Docs to Evals: Essential AI Skills for Modern Product Managers

AI product managers are shifting from static PRDs to dynamic evaluation frameworks—Evals—that define product quality through automated tests, golden conversations, and LLM judges, enabling continuous iteration, error-driven requirement discovery, and architecture decisions in complex AI systems.

AI · Evaluation · LLM

0 likes · 7 min read

Network Intelligence Research Center (NIRC)

Jan 14, 2026 · Artificial Intelligence

From Black‑Box Guessing to Quantitative Deconstruction: Unveiling the Mystery Inside Large Language Models

At EMNLP 2025, the BUPT NIRC team presented a paper that introduces the ARR metric to quantitatively separate latent reasoning from factual shortcuts in LLMs, using Logit Lens and Attention Knockout to reveal distinct internal pathways; the article also shares the team's conference experience.

ARR metric · Attention Knockout · EMNLP2025

0 likes · 6 min read

Data Party THU

Jan 13, 2026 · Artificial Intelligence

How Engram’s ‘Lookup‑Compute Separation’ Boosts LLM Performance

DeepSeek’s newly open‑sourced Engram module introduces a scalable lookup‑based memory that separates knowledge retrieval from computation, enabling O(1) deterministic access and significantly improving large language model performance on knowledge‑heavy, reasoning, code, and math tasks without extra FLOPs.

@lookup · LLM · MoE

0 likes · 10 min read

AI Insight Log

Jan 12, 2026 · Artificial Intelligence

Goodbye H100: How DeepSeek’s Engram Uses CPU Memory to Scale LLM Knowledge Bases

DeepSeek’s Engram architecture adds a deterministic dictionary lookup to Transformers, storing massive N‑gram tables in cheap CPU DRAM, which reduces GPU memory use and boosts both knowledge‑heavy and reasoning benchmarks while keeping the inference latency overhead under 3%.

CPU memory · Deterministic Lookup · Engram

0 likes · 7 min read

AI Tech Publishing

Jan 12, 2026 · Artificial Intelligence

Ralph Loop: Engineering Continuous Iteration for AI Agents

Ralph Loop introduces an externalized iterative loop that forces AI agents to keep working until objective completion criteria are met, dramatically extending effective runtime from hours to a full day or more and shifting human‑agent collaboration from frequent supervision to efficient delegation.

AI Agent · Iterative Automation · LLM

0 likes · 17 min read

Design Hub

Jan 12, 2026 · Artificial Intelligence

Visual AI Prompt Editor Eliminates ‘Spell’ Anxiety, Tweaks Like Ordering Food

The article introduces a visual AI prompt editor that transforms lengthy, complex prompt strings into modular, editable Chinese sections, demonstrating the workflow with two examples—converting a “California girl” portrait to an Asian style and re‑imagining a cinematic skyscraper scene—while detailing step‑by‑step usage and JSON export options.

AI prompt engineering · JSON export · LLM

0 likes · 11 min read

Bighead's Algorithm Notes

Jan 11, 2026 · Artificial Intelligence

FinRpt: A Multi‑Agent Framework for Automatic Generation and Evaluation of Stock Research Reports

FinRpt introduces a novel multi‑agent pipeline that builds a high‑quality equity research report (ERR) dataset from six financial data sources, defines a comprehensive 11‑metric evaluation suite, and demonstrates that supervised‑fine‑tuned and reinforcement‑learned LLM agents significantly outperform single‑LLM baselines in both accuracy and efficiency.

FinRpt · Financial NLP · LLM

0 likes · 14 min read

Architect's Alchemy Furnace

Jan 10, 2026 · Artificial Intelligence

Build and Test a Multi‑Agent AI System with MetaGPT

This guide walks through the MetaGPT framework—explaining its multi‑agent architecture, core concepts, predefined roles, team setup, environment preparation, installation, configuration, and troubleshooting steps—so you can quickly build, run, and validate a collaborative AI software‑company simulation.

AI Agents · LLM · MetaGPT

0 likes · 14 min read

AI Engineering

Jan 10, 2026 · Artificial Intelligence

Teaching LLMs to Manage Memory Autonomously, Dropping Manual Rules

Alibaba's new AgeMem framework turns long‑term and short‑term memory management for large language model agents into a learnable reinforcement‑learning task, replacing handcrafted rules with a three‑stage training process and achieving significant benchmark gains.

AgeMem · Benchmark · GRPO

0 likes · 9 min read

JD Tech Talk

Jan 9, 2026 · Artificial Intelligence

How JoyCode Agent Scored 74.6% Pass@1 on SWE‑bench Verified with a Patch‑Test Co‑generation Loop

JoyCode Agent leverages a patch‑test co‑generation and iterative validation framework to achieve a 74.6% Pass@1 score on the SWE‑bench Verified benchmark, reducing resource consumption by 30‑50% and introducing a closed‑loop multi‑agent pipeline that integrates testing, patch generation, trajectory compression, similarity retrieval, and decision arbitration.

AI · LLM · SWE-bench

0 likes · 41 min read

PaperAgent

Jan 9, 2026 · Artificial Intelligence

Why Traditional RAG Breaks the Chain and How SentGraph Fixes It

The article explains why traditional retrieval‑augmented generation fails in multi‑hop scenarios due to overly large chunks, introduces SentGraph’s sentence‑level graph that trims retrieval units and encodes logical relations, details offline construction and online inference steps, and shows experimental gains and remaining limitations.

LLM · Multi-hop QA · RAG

0 likes · 7 min read

AI Insight Log

Jan 9, 2026 · Industry Insights

Did AI Doom Tailwind? 75% Layoffs and a Founder’s Threat to Developers

The article analyzes how the rise of AI coding tools led Tailwind CSS founder Adam Wathan to reject a community PR adding an llms.txt file, trigger a 75% staff cut, and expose the collapse of the open‑source‑plus‑services business model in the AI era.

AI · Business Model · Developer Tools

0 likes · 7 min read

Meituan Technology Team

Jan 8, 2026 · Artificial Intelligence

Must‑Read AAAI 2026 Papers: Efficient Reasoning, Annealing, Multimodal Diffusion & More

This article curates eight AAAI 2026 papers authored by the Meituan research team, covering verifiable stepwise rewards for LLM reasoning, annealing strategies in large‑scale training, process reward models, competence‑difficulty sampling, high‑fidelity visual text rendering, counterfactual fusion, compress‑then‑rank reranking, and cross‑modal quantization for generative recommendation, with direct PDF links for each work.

AAAI2026 · Counterfactual · LLM

0 likes · 14 min read

Kuaishou Tech

Jan 8, 2026 · Artificial Intelligence

Top 12 Kuaishou Papers Accepted at AAAI 2026: Breakthroughs in Recommendation, Video Generation, and LLM Research

Kuaishou secured 12 papers at AAAI 2026, covering advances in search and recommendation systems, multi‑camera video generation, multimodal understanding, generative model fundamentals, video large language models, experimental design, and LLM latent‑space reasoning, with three papers highlighted as oral presentations.

AI · LLM · diffusion

0 likes · 22 min read

Alibaba Cloud Developer

Jan 8, 2026 · Artificial Intelligence

How to Build Human‑In‑The‑Loop (HITL) Capabilities into ReactAgent

This article explains how to integrate a Human‑In‑The‑Loop (HITL) mechanism into ReactAgent, detailing the motivation, design of interaction, tool description, XML‑based UI rendering, Redis‑driven waiting loop, and the broader architectural parallels with design patterns and other agent frameworks.

Design Patterns · HITL · Human-in-the-Loop

0 likes · 14 min read

AndroidPub

Jan 8, 2026 · Artificial Intelligence

Unlocking Anthropic’s Agent Skill: Build Reusable AI Task Assistants in 3 Steps

This article explains Anthropic’s open‑standard Agent Skill, how it serves as a reusable task specification for Claude, walks through creating a skill with metadata, instructions, and advanced Reference/Script features, and compares Skill with MCP to help developers choose the right tool.

AI automation · Agent Skill · Anthropic

0 likes · 11 min read

Sohu Tech Products

Jan 7, 2026 · Artificial Intelligence

Master Retrieval-Augmented Generation (RAG): Concepts, Benefits, Implementation

This article explains Retrieval‑Augmented Generation (RAG), its dual‑stage architecture that combines parametric LLM knowledge with external non‑parametric data, outlines its technical evolution, discusses why it outperforms pure LLMs, and provides a step‑by‑step guide with toolchain choices, evaluation metrics, and future challenges.

AI · Knowledge Base · LLM

0 likes · 14 min read

21CTO

Jan 7, 2026 · Fundamentals

Can LLMs Build a Garbage‑Collector‑Free System Language? Inside Steve Klabnik’s Rue Project

Steve Klabnik, a veteran of Rust and Ruby on Rails, explores a new system programming language called Rue that aims for memory safety without garbage collection, leveraging Anthropic’s Claude AI for rapid development and discussing its design trade‑offs, progress, and future prospects.

Claude · LLM · Language Design

0 likes · 8 min read

DaTaobao Tech

Jan 7, 2026 · Artificial Intelligence

5 Design Patterns to Control LLM Output in Generative AI Applications

The article presents five design patterns—Logits Masking, Grammar, Style Transfer, Reverse Neutralization, and Content Optimization—for steering the output of generative AI models, compares their suitable scenarios, advantages, drawbacks, and anti‑patterns, and provides concrete implementation steps, code snippets, and flowcharts to help developers reliably enforce style, format, and compliance constraints.

Generative AI · LLM · Prompt Engineering

0 likes · 20 min read

Tencent Cloud Developer

Jan 7, 2026 · Artificial Intelligence

How Context Engineering Powers the Next Generation of AI Agents

As AI systems move from simple chatbots to sophisticated agents, context becomes a core variable. This article details the evolution from prompt engineering to context engineering, the challenges of managing a growing context, and practical solutions such as structured context, tool integration, and the MCP framework for reliable AI systems.

LLM · Reliability · Tool Integration

0 likes · 20 min read

Wuming AI

Jan 6, 2026 · Artificial Intelligence

Top LLM Leaderboards Explained: How to Choose the Right Model

This article surveys the most popular large‑language‑model leaderboards—including lmarena, Artificial Analysis, SuperCLUE, and llm‑stats—detailing their evaluation methods, coverage areas, URLs, and practical usage tips, while warning readers that rankings are only a reference and real‑world performance may vary.

AI benchmarking · Artificial Intelligence · LLM

0 likes · 5 min read

Bighead's Algorithm Notes

Jan 6, 2026 · Artificial Intelligence

FinRS: A Risk‑Sensitive Trading Framework for Real‑World Financial Markets

FinRS integrates hierarchical market analysis, dual decision agents, and multi‑time‑scale reward feedback to enable risk‑aware multi‑stage trading, achieving higher cumulative returns, better Sharpe ratios, and lower maximum drawdowns than existing LLM‑based and reinforcement‑learning baselines across diverse stocks.

FinRS · LLM · financial markets

0 likes · 14 min read

PMTalk Product Manager Community

Jan 6, 2026 · Industry Insights

Strategic Comparison of Dify, n8n, and ComfyUI for AI Applications and Automation

This article provides a multi‑dimensional strategic analysis of three representative AI‑focused platforms—Dify, n8n, and ComfyUI—examining their product positioning, architecture, interaction models, commercialization strategies, and agent capabilities, and offers concrete recommendations for product managers on choosing the right tool based on ease of use, control, scalability, and total cost of ownership.

AI Platforms · LLM · Open Source

0 likes · 35 min read

macrozheng

Jan 6, 2026 · Artificial Intelligence

Getting Started with AgentScope Java: Build Multi‑Agent LLM Applications Quickly

This guide introduces AgentScope, a multi‑agent framework for Java that brings ReAct reasoning, tool calling, memory management, RAG, and serverless capabilities to LLM‑powered applications, and provides step‑by‑step code examples for basic and advanced usage.

AgentScope · LLM · Multi‑Agent

0 likes · 12 min read

PaperAgent

Jan 6, 2026 · Artificial Intelligence

How Ontology‑Driven GraphRAG Eliminates Noise in AI Knowledge Graphs

This article examines the shortcomings of naïve GraphRAG implementations on clinical data and explains how an ontology‑driven, zero‑noise GraphRAG architecture can create self‑improving, conflict‑free knowledge graphs for AI applications.

AI · Data Quality · GraphRAG

0 likes · 3 min read

AI Insight Log

Jan 5, 2026 · Artificial Intelligence

Free Access to NVIDIA GLM‑4.7 and Minimax‑M2.1 with a Step‑by‑Step NIM Tutorial

This guide shows how to obtain a free NVIDIA NIM API key, verify a Chinese phone number, and call the hidden GLM‑4.7 and Minimax‑M2.1 large‑language models using provided Python or curl snippets, all without owning a GPU.

API · GLM-4.7 · LLM

0 likes · 5 min read

PaperAgent

Jan 5, 2026 · Artificial Intelligence

How QuCo‑RAG Replaces Model Confidence with Objective Evidence to Cut Hallucinations

QuCo‑RAG introduces a dynamic retrieval‑augmented generation framework that quantifies uncertainty using pre‑training corpus statistics, replacing unreliable model confidence with objective frequency and co‑occurrence evidence, achieving millisecond‑level hallucination detection, superior multi‑hop QA performance, and cross‑model transferability across various LLMs.

Dynamic Retrieval · LLM · Retrieval-Augmented Generation

0 likes · 9 min read

AI Insight Log

Jan 4, 2026 · Artificial Intelligence

Agent Skills for Context Engineering: 4K Stars, Powering Cursor & Codex

The open‑source ‘Agent Skills for Context Engineering’ project, which amassed over 4,100 stars in a week, demonstrates why managing a model’s attention budget—through foundational, operational, and development‑methodology skills—is essential as context windows grow, and provides platform‑agnostic instructions for Claude Code, Cursor and other AI tools.

Agent Skills · Claude Code · Context Engineering

0 likes · 7 min read

Bighead's Algorithm Notes

Jan 4, 2026 · Artificial Intelligence

How VTA Combines Large‑Model Reasoning for Precise and Explainable Stock Time‑Series Forecasting

The VTA framework integrates large language model reasoning with textual annotation of technical indicators, employs a Time‑GRPO reinforcement‑learning objective and multi‑stage joint conditional training, and achieves state‑of‑the‑art accuracy and expert‑rated interpretability on US, Chinese and European stock datasets.

LLM · Time-series · VTA

0 likes · 19 min read

AI Insight Log

Jan 4, 2026 · Artificial Intelligence

How Playwright + AI Powers a Fully Automated Xianyu Treasure Hunt

The article examines the open‑source ai‑goofish‑monitor project, which combines Playwright‑driven browsing with large‑language‑model analysis to continuously scan Xianyu listings, filter out junk, and highlight high‑quality items, while also discussing its AI‑generated code, benefits, limitations, and security risks.

AI · LLM · Playwright

0 likes · 7 min read

PaperAgent

Jan 4, 2026 · Artificial Intelligence

How Sophia’s System 3 Turns LLM Agents into Persistent Learners

The article presents Sophia, a System 3‑enabled persistent agent framework that adds a meta‑cognitive layer to LLM‑based agents, enabling identity continuity, self‑scheduled learning, real‑time self‑checks, and autonomous task generation, and validates its benefits through a 24‑hour continuous‑run experiment.

AI Agents · LLM · System architecture

0 likes · 7 min read

Architect

Jan 3, 2026 · Artificial Intelligence

Unlocking AI Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys the emerging field of AI agent memory, presenting a three‑dimensional taxonomy of memory forms, detailing functional categories such as factual, experiential, and working memory, and outlining dynamic processes of formation, evolution, and retrieval, while also highlighting benchmarks, open‑source frameworks, and future research directions.

AI Agents · Agentic Systems · LLM

0 likes · 7 min read

AI Architecture Hub

Jan 2, 2026 · Artificial Intelligence

How Manifold-Constrained Hyper-Connections Boost LLM Performance with Minimal Overhead

DeepSeek's new mHC architecture projects residual connections onto a manifold, adding just 6.7% to training cost for 27B models while delivering significant stability and downstream performance gains over traditional residual and hyper‑connection designs.

Deep Learning · LLM · Manifold Optimization

0 likes · 13 min read

NetEase LeiHuo Testing Center

Jan 2, 2026 · Artificial Intelligence

From ChatGPT to LLM‑Native: Building Intelligent AI Agents and Workflows with LangChain

The article explains why traditional chat‑based AI tools are limited to advice, introduces next‑generation LLM‑native applications that can understand, plan, and act, and provides a step‑by‑step guide on designing AI workflows, autonomous agents, hybrid architectures, and the Model Context Protocol (MCP) using LangChain.

AI Agents · LLM · LangChain

0 likes · 36 min read

IT Services Circle

Jan 2, 2026 · Artificial Intelligence

Top Open‑Source NotebookLM Alternatives: AI‑Powered Docs, Podcasts & Research Tools

This article surveys the most popular open‑source replacements for Google NotebookLM, detailing each project's star count, supported AI models, multimodal input capabilities, Docker deployment options, and unique features such as multi‑speaker podcast generation, semantic search, and collaborative knowledge‑base integration.

AI · Docker · LLM

0 likes · 8 min read

Code Mala Tang

Dec 31, 2025 · Artificial Intelligence

Can TOON Replace JSON for LLMs? A Token‑Efficient Data Format Explained

The article introduces Token‑Oriented Object Notation (TOON), a compact alternative to JSON designed for large language models, and demonstrates how its reduced syntax cuts token usage by up to 60%, speeds up parsing, and remains human‑readable.

AI · LLM · data format

0 likes · 7 min read

AI Architecture Hub

Dec 31, 2025 · Artificial Intelligence

Why LangGraph Is the Next‑Generation Framework for LLM Agent Orchestration

This article explains the motivation behind LangGraph, walks through a quick start, details its core syntax and state management, demonstrates conditional branching, parallel execution, tool integration, multi‑agent orchestration, and real‑time monitoring, and finally discusses future directions for the framework.

LLM · LangGraph · Parallel Execution

0 likes · 32 min read

Architect's Alchemy Furnace

Dec 30, 2025 · Artificial Intelligence

Run AgenticSeek Locally: Complete Guide to a Private AI Assistant

This guide walks you through installing, configuring, and running AgenticSeek—a fully local, privacy‑focused AI assistant—by setting up prerequisites, cloning the repository, adjusting environment files, launching Docker services or CLI mode, and troubleshooting common issues.

AgenticSeek · Docker · LLM

0 likes · 21 min read

Aikesheng Open Source Community

Dec 30, 2025 · Databases

Year-in-Review: Open-Source SQL LLM Benchmark, SQLE Updates, and Top DB Articles

This community roundup reviews the 2025 release of the SCALE open‑source LLM‑SQL benchmark, SQLE platform updates, curated video playlists, a curated list of the year's ten best database articles, and provides reference links for further exploration.

Benchmark · Database · LLM

0 likes · 10 min read

Data Party THU

Dec 29, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: A Deep Dive into Forms, Functions, and Dynamics

This article reviews the survey "Memory in the Age of AI Agents," presenting a comprehensive taxonomy that classifies agent memory by its forms, functions, and dynamic mechanisms, and explores future directions such as generative memory, reinforcement‑learning‑driven management, multimodal storage, and trustworthy handling.

AI Agents · Agent Architecture · Future AI

0 likes · 14 min read

Alibaba Cloud Developer

Dec 29, 2025 · Artificial Intelligence

How Alibaba’s Tair KVCache Manager Revolutionizes Enterprise‑Level LLM Cache Management

This article details the architecture and implementation of Tair KVCache Manager, an enterprise‑grade service that centralises KVCache metadata, decouples inference engines from storage, provides elastic scaling, multi‑tenant isolation, high availability, and performance‑optimised cache management for large‑scale LLM inference workloads.

Cache Management · KVCache · LLM

0 likes · 28 min read

AI Large Model Application Practice

Dec 29, 2025 · Artificial Intelligence

Integrating Anthropic‑Style Skills into LangChain DeepAgents: A Step‑by‑Step Guide

This article explains how to bring Anthropic's Skills concept into the open‑source LangChain DeepAgents framework by detailing the discovery, system‑prompt injection, progressive loading, and execution phases, and provides a complete code‑driven example using a web‑research Skill.

Agent Skills · DeepAgents · LLM

0 likes · 14 min read

MaGe Linux Operations

Dec 27, 2025 · Artificial Intelligence

How to Deploy and Optimize Enterprise‑Scale LLM Inference Services: A Practical Guide

This guide walks you through deploying large language models such as ChatGLM and Llama in production, covering environment setup, model quantization, dynamic batching, service configuration, Nginx load balancing, monitoring, troubleshooting, and best‑practice recommendations for high‑performance, cost‑effective AI inference.

GPU · LLM · Performance tuning

0 likes · 48 min read

AI Architecture Hub

Dec 27, 2025 · Artificial Intelligence

How GraphRAG Turns Knowledge Graphs into Smarter Retrieval for LLMs

GraphRAG extends traditional Retrieval‑Augmented Generation by building a knowledge graph from documents, extracting entities and relationships, performing community detection, and supporting both local and global searches, offering detailed step‑by‑step guidance, code examples, configuration tips, and a comparison with classic RAG approaches.

GraphRAG · Knowledge Graph · LLM

0 likes · 28 min read

Alibaba Cloud Native

Dec 27, 2025 · Artificial Intelligence

Unlocking AI Agent Memory: Short‑Term vs Long‑Term Strategies and Framework Integration

This article explains how AI agents overcome context window limits by using memory systems, distinguishes short‑term (session) and long‑term (cross‑session) memory, compares implementations in Google ADK, LangChain and AgentScope, and outlines context‑engineering techniques, core components, challenges, and emerging trends.

AI memory · Agent Frameworks · Context Engineering

0 likes · 20 min read

Alibaba Cloud Developer

Dec 26, 2025 · Artificial Intelligence

How AutoContextMemory Cuts LLM Costs by 70% in Long Conversations

This article explains the challenges of token explosion in long‑running AI agent dialogues and introduces AutoContextMemory, a Java component that automatically compresses, offloads, and summarizes conversation history to dramatically reduce token usage, speed up responses, and preserve critical information.

AgentScope · Context Management · LLM

0 likes · 12 min read

360 Tech Engineering

Dec 26, 2025 · Artificial Intelligence

15 Chunking Strategies to Supercharge Retrieval‑Augmented Generation

This article presents fifteen practical chunking techniques—ranging from line‑by‑line and fixed‑size chunking to semantic and hierarchical methods—explaining their principles, ideal use‑cases, concrete input examples, chunk outputs, and key advantages or cautions for improving Retrieval‑Augmented Generation with large language models.

AI · Chunking · Data Retrieval

0 likes · 28 min read

Alibaba Cloud Developer

Dec 26, 2025 · Artificial Intelligence

How to Build a Fully Automated Knowledge‑Extraction Pipeline for AI Agents with Python

This article presents a complete end‑to‑end pipeline that automatically extracts, generalizes, incrementally updates, and vector‑syncs knowledge from diverse sources such as tickets, documents, and SQL code, turning the traditionally labor‑intensive knowledge‑base construction for agents into a low‑effort, continuously maintainable Python‑driven solution.

LLM · Python · RAG

0 likes · 15 min read

Architect

Dec 25, 2025 · Artificial Intelligence

How GraphRAG Boosts Retrieval Accuracy with Knowledge Graphs – A Complete Guide

This article explains why traditional RAG suffers from hallucinations, introduces GraphRAG’s knowledge‑graph‑based approach, walks through its indexing and query pipelines—including text splitting, entity‑relation extraction, graph construction, community detection, and local vs. global retrieval—provides practical setup commands, Neo4j visualization steps, and compares its performance with classic RAG.

Embedding · GraphRAG · Knowledge Graph

0 likes · 27 min read

360 Tech Engineering

Dec 25, 2025 · Artificial Intelligence

Why LangChain 1.0 Makes AI Agent Development Faster, Safer, and More Scalable

LangChain 1.0 replaces fragmented agent code with a production‑ready framework that unifies model outputs, simplifies tool integration, introduces content_blocks for consistent response handling, and adds a middleware system for privacy, summarization, and human‑in‑the‑loop safety, dramatically improving developer efficiency and reliability.

LLM · LangChain · Python

0 likes · 13 min read

AI Architecture Hub

Dec 24, 2025 · Artificial Intelligence

From LLMs to Autonomous Agents: The Three Evolution Stages of AI

This article explains the three evolutionary stages of AI—from large language models that generate text, through workflow‑enhanced systems using retrieval‑augmented generation, to fully autonomous agents capable of self‑directed decision‑making—while detailing the four core technologies that power each stage.

AI evolution · Embedding · LLM

0 likes · 9 min read

Zhuanzhuan Tech

Dec 24, 2025 · Artificial Intelligence

Building an ASR+LLM+Vector Knowledge Base for Precise Video Ad Category Detection

This article presents a layered ASR‑LLM‑vector‑knowledge‑base pipeline that cleans speech transcripts, semantically repairs text, performs hierarchical exact and fuzzy matching, and iteratively refines mappings to accurately identify product categories in video advertisements, while detailing module functions, technical choices, and LLM parameter tuning.

ASR · Knowledge Base · LLM

0 likes · 11 min read

Baidu Geek Talk

Dec 24, 2025 · Artificial Intelligence

Context Parallelism Slashes TTFT by 80% for 128K-Token LLMs

The article explains how Baidu’s Baige team integrated a Context Parallelism (CP) strategy into DeepSeek V3.2, detailing the DSA architecture, the limitations of traditional tensor and sequence parallelism, and how CP distributes computation and memory across GPUs to achieve up to an 80% reduction in time‑to‑first‑token (TTFT) latency for ultra‑long 128K‑token contexts.

Context Parallelism · DeepSeek · LLM

0 likes · 9 min read

Tencent Technical Engineering

Dec 24, 2025 · Artificial Intelligence

Build a Mini LLM from Scratch: Step‑by‑Step Guide to Tokenizer, Attention, and Transformer

This article walks through constructing a small large‑language model from the ground up, covering model architecture, tokenization methods, BPE vocabulary building, embedding, positional encoding, attention mechanisms, multi‑head attention, transformer blocks, training pipelines, inference, and sampling strategies, all with runnable Python code.

Deep Learning · LLM · Python

0 likes · 34 min read

Baidu Intelligent Cloud Tech Hub

Dec 24, 2025 · Artificial Intelligence

How Context Parallelism Slashes LLM First‑Token Latency by 80% for 128K Tokens

The article explains how the newly merged Context Parallelism (CP) technique in SGLang, combined with DeepSeek V3.2's Sparse Attention architecture, reduces first‑token latency by up to 80% and alleviates memory pressure for ultra‑long 128K‑token sequences, detailing both algorithmic innovations and engineering solutions.

AI Infrastructure · Context Parallelism · LLM

0 likes · 10 min read

Bighead's Algorithm Notes

Dec 23, 2025 · Artificial Intelligence

How H3M‑SSMoEs Combines Hypergraph Multimodal Learning and LLM Reasoning to Predict Stock Direction

The paper introduces H3M‑SSMoEs, a framework that integrates a multi‑context hypergraph for fine‑grained spatio‑temporal dynamics with a frozen Llama‑3.2‑1B LLM adapter, and a style‑structured expert mixture to jointly model stock relationships, multimodal semantics, and market regimes, achieving superior accuracy and investment returns on DJIA, NASDAQ‑100, and S&P‑100 benchmarks.

Financial AI · Hypergraph · LLM

0 likes · 14 min read

Alibaba Cloud Infrastructure

Dec 22, 2025 · Artificial Intelligence

Boost LLM Inference with KV‑Cache‑Aware Routing on Alibaba Cloud ACK GIE

This article explains why KV‑Cache hit rate is critical for large‑model inference, describes vLLM's automatic prefix caching, outlines the distributed cache challenges, and provides a step‑by‑step guide to deploying Alibaba Cloud ACK Gateway with Inference Extension's precise‑mode prefix‑cache‑aware routing, backed by benchmark results.

Alibaba Cloud · KV Cache · Kubernetes

0 likes · 18 min read

AsiaInfo Technology: New Tech Exploration

Dec 22, 2025 · Artificial Intelligence

How Advanced RAG Techniques Are Redefining Enterprise Knowledge Services

This article examines four cutting‑edge Retrieval‑Augmented Generation frameworks—Adaptive RAG, Agentic RAG, OG‑RAG, and OAG—detailing their definitions, core mechanisms, performance gains, and practical selection guidance for complex enterprise scenarios, while highlighting future research directions.

Enterprise Knowledge · LLM · Ontology

0 likes · 21 min read

JD Tech

Dec 22, 2025 · Artificial Intelligence

Build Flexible Multi‑Agent Systems Like LEGO with OxyGent – New Features Unveiled

The OxyGent 1.0.8 release introduces multimodal messaging, fine‑grained control, MCP reconnection, and front‑end streaming, while detailing its stateless AOP architecture, execution lifecycle, four data scopes, real‑world use cases, community feedback, and a step‑by‑step tutorial for rapid adoption.

AI · Framework · LLM

0 likes · 11 min read

Alibaba Cloud Developer

Dec 22, 2025 · Artificial Intelligence

Turning Real‑Time Hotspot Detection into AI‑Powered E‑Commerce Recommendations

Traditional recommendation systems lag behind fast‑moving external trends, missing the freshness and surprise users crave. This article details an end‑to‑end AI pipeline that perceives, understands, and reacts to hotspots within hours, automatically generating high‑quality product selections and continuously optimizing through feedback loops.

AI recommendation · LLM · automation

0 likes · 25 min read

AndroidPub

Dec 22, 2025 · Mobile Development

Boost Android Development with MCP: Secure, Transparent Automation for Your Toolchain

This article explains how the Model Context Protocol (MCP) enables controllable, auditable automation of Android development tasks—bridging large language models with local tools like Gradle, ADB, and emulators—to improve efficiency, safety, and workflow integration.

Android · LLM · mcp

0 likes · 15 min read

Architect's Alchemy Furnace

Dec 21, 2025 · Artificial Intelligence

Deploy and Explore Open WebUI: A Feature‑Rich Self‑Hosted AI Platform

Open WebUI is a self‑hosted, extensible AI platform that runs fully offline, supports multiple LLM back‑ends such as Ollama and OpenAI‑compatible APIs, offers built‑in RAG, role‑based access, multi‑model chat, markdown/LaTeX, image generation, and provides detailed Docker, pip, and Kubernetes installation guides with ready‑to‑run commands.

AI platform · Docker · LLM

0 likes · 11 min read

Advanced AI Application Practice

Dec 20, 2025 · Artificial Intelligence

Master System, User, Assistant Roles to Get Precise AI Testing Answers from LLMs

This article explains how the System, User, and Assistant roles in large-language-model chat APIs shape response quality, demonstrates their impact with concrete Python code examples, compares outcomes with and without System prompts, and offers practical tips for crafting effective prompts to achieve concise, relevant AI testing guidance.

AI testing · Assistant Role · LLM

0 likes · 14 min read

Design Hub

Dec 20, 2025 · Artificial Intelligence

Must-Read: K's 2025 AI Review – 6 Paradigm Shifts Reshaping Our World

The article reviews six 2025 paradigm shifts in large language models—from the rise of verifiable‑reward reinforcement learning and the emergence of AI "ghosts" to new "Cursor for X" middle layers, local agents like Claude Code, Vibe Coding that lets users program by conversation, and visual interaction driven by Gemini Nano Banana—highlighting their technical impact and design implications.

AI Agents · LLM · RLVR

0 likes · 12 min read

PaperAgent

Dec 20, 2025 · Industry Insights

What 2025 Tells Us About the Future of Large Language Models

The 2025 LLM year‑in‑review highlights paradigm shifts such as RLVR training, uneven “jagged” intelligence, the rise of Cursor‑style applications, Claude Code agents running locally, Vibe Coding, and the Nano Banana GUI revolution, concluding that current models only exploit about 10% of their potential.

AI Agents · LLM · Nano Banana

0 likes · 10 min read

Bighead's Algorithm Notes

Dec 19, 2025 · Artificial Intelligence

Quantitative Finance Paper Digest: Dec 13‑19 2025 Highlights

This digest presents recent arXiv papers (Dec 13‑19 2025) on AI‑driven quantitative finance, covering LLM‑based portfolio recommendation, reinforcement‑learning deep hedging, hybrid SV‑LSTM volatility forecasting, dynamic stacking ensembles, GA‑optimized SVR forecasting, and interpretable deep learning asset pricing, each with abstracts and key findings.

Deep Learning · LLM · Quantitative Finance

0 likes · 16 min read

Alibaba Cloud Native

Dec 19, 2025 · Artificial Intelligence

What Enterprises Are Learning from the State of Agent Engineering Report

The recent LangChain "State of Agent Engineering" report, combined with data from the AI‑Native Application Architecture whitepaper, reveals rapid production adoption of AI agents, persistent quality challenges, widespread observability, multi‑model strategies, and evolving evaluation practices across organizations of all sizes.

AI Agents · Evaluation · LLM

0 likes · 10 min read

Bilibili Tech

Dec 19, 2025 · Artificial Intelligence

SABER: Switchable and Balanced Training for Efficient LLM Reasoning

SABER introduces a reinforcement‑learning framework that lets large language models dynamically switch among four token‑budgeted reasoning modes, dramatically cutting inference length while preserving or improving accuracy across math, code, and logic tasks.

Budgeted Computation · Efficient Reasoning · LLM

0 likes · 13 min read

Bighead's Algorithm Notes

Dec 18, 2025 · Artificial Intelligence

How 3S‑Trader’s Multi‑Agent Framework Optimizes Multi‑Stock Portfolios

The article reviews the 3S‑Trader framework, a training‑free multi‑LLM system that uses scoring, strategy, and selection modules to construct weekly stock portfolios, and shows that it outperforms rule‑based and deep‑learning baselines on DJIA and sector datasets with strong risk‑adjusted returns.

DJIA · Financial AI · GPT-4o

0 likes · 12 min read

Wu Shixiong's Large Model Academy

Dec 18, 2025 · Artificial Intelligence

Why Text2SQL Must Be Integrated into AI Agents – An Interviewer's Guide

The article explains how Text2SQL should be treated as a read‑only tool within an AI Agent, covering its role in function calls, dynamic schema pruning, ambiguity handling, SQL safety checks, result validation, semantic caching, and logging to build a production‑grade system.

AI Agent · LLM · SQL Safety

0 likes · 11 min read

PaperAgent

Dec 18, 2025 · Artificial Intelligence

Can Ontology‑Aware KG‑RAG Double Table QA Performance on Industrial Standards?

This article presents an ontology‑aware knowledge‑graph RAG framework that transforms complex, hierarchical industrial standard documents into a graph of sections, atomic propositions, and refined triples, achieving nearly double F1 scores on table‑based QA tasks and robust performance on long documents.

Knowledge Graph · LLM · Ontology

0 likes · 6 min read

Wu Shixiong's Large Model Academy

Dec 17, 2025 · Artificial Intelligence

How Should Text2SQL Fit Inside an Agent System? Practical Guide for Interviews

This article explains the proper role of Text2SQL within an Agent architecture, detailing its placement as a tool, function‑call implementation, decision logic for invocation, multi‑turn handling, failure management, and how to clearly present these concepts in technical interviews.

AI · LLM · Text2SQL

0 likes · 9 min read

21CTO

Dec 17, 2025 · Artificial Intelligence

Can a New Language Make LLMs Write Code with 100% Accuracy? Meet Sui

Japanese data scientist Takato Honda introduces Sui, an open‑source programming language designed to eliminate syntax and spelling errors and to let large language models generate code with claimed 100% accuracy, offering token‑efficiency optimizations for AI‑assisted programming.

AI · LLM · Open Source

0 likes · 4 min read

PaperAgent

Dec 17, 2025 · Artificial Intelligence

Unlocking Agent Memory: A Comprehensive Survey of Forms, Functions, and Dynamics

This article surveys over 200 recent papers on AI agent memory, introducing a three‑dimensional framework of form, function, and dynamics, classifying memory into token‑level, parametric, and latent types, outlining their roles, lifecycle operations, benchmark datasets, open‑source frameworks, and seven emerging research directions.

AI Agents · LLM · Survey

0 likes · 6 min read

Architects' Tech Alliance

Dec 17, 2025 · Artificial Intelligence

Mastering Retrieval‑Augmented Generation: From Theory to Scalable Deployment

This guide explains how Retrieval‑Augmented Generation (RAG) overcomes LLM knowledge staleness, hallucination, and domain‑adaptation challenges by combining external knowledge bases with real‑time retrieval, and provides detailed architecture, optimization techniques, engineering practices, monitoring, cost‑control, and future trends for building production‑grade RAG systems.

AI · Cloudflare · LLM

0 likes · 15 min read

PaperAgent

Dec 16, 2025 · Artificial Intelligence

Open Notebook: The Open‑Source, Privacy‑First Alternative to Google Notebook LM

Open Notebook is a fully local, open‑source AI notebook that rivals Google NotebookLM by supporting over 16 LLM providers, handling multimodal content, and enabling advanced multi‑speaker podcast generation while giving users complete data sovereignty and flexible deployment options.

AI Notebook · LLM · Open Source

0 likes · 4 min read

Fighter's World

Dec 16, 2025 · Artificial Intelligence

Boosting Large Language Model Domain Expertise with Claude Skills

The article analyzes why generic LLMs struggle with domain‑specific reasoning, critiques fine‑tuning, RAG and prompt engineering, and presents Claude Skills—using progressive disclosure, Pydantic validation, and state‑machine control—to encode expert constraints as executable rules, illustrated with finance compliance and legal reasoning case studies and backed by Anthropic research.

Claude · Domain-specific · LLM

0 likes · 20 min read

JakartaEE China Community

Dec 16, 2025 · Artificial Intelligence

Build a Retrieval‑Augmented Generation (RAG) System with Langchain4j and Ollama 3

This guide walks through the importance of Retrieval‑Augmented Generation, outlines the core Langchain4j and Ollama 3 components, and provides a complete Java example—including Maven setup, document ingestion, embedding creation, similarity search, prompt construction, and response generation—to demonstrate a functional RAG pipeline.

Embedding · LLM · LangChain4j

0 likes · 9 min read

PaperAgent

Dec 16, 2025 · Artificial Intelligence

Do LLMs Have Emotional Chains? Unveiling the Chain‑of‑Affective Across 8 Model Families

This article analyzes recent research by East China Normal University and Fudan University on whether eight major LLM families exhibit a systematic “Chain-of-Affective,” revealing how internal emotional structures influence model outputs, multi‑agent interactions, and user experience, and offering practical guidelines for mitigating emotional loops in AI systems.

AI safety · Benchmark · Chain-of-Affective

0 likes · 8 min read

Qborfy AI

Dec 16, 2025 · Artificial Intelligence

Mastering AI Function Calling: Turn LLMs into Actionable Assistants

Function Calling lets large language models invoke external tools or APIs during a conversation, transforming them from passive responders into proactive assistants; this guide explains the concept, workflow, and practical implementations with weather, parallel queries, and stock price examples using OpenAI’s Python SDK.

AI Function Calling · Chatbot · LLM

0 likes · 9 min read

Alibaba Cloud Developer

Dec 16, 2025 · Artificial Intelligence

How We Built an AI‑Powered Data Agent to Automate Data Retrieval at Scale

This article details the design and implementation of Matra, an AI‑driven data assistant for a large e‑commerce platform, covering the challenges of legacy data assets, knowledge‑base construction, GraphRAG integration, multi‑stage agent frameworks, practical results, and future plans for continuous improvement.

AI · Data Engineering · Data Retrieval

0 likes · 22 min read

AI Large Model Application Practice

Dec 16, 2025 · Artificial Intelligence

Recreating NotebookLM’s PPT Generation with a Low‑Code Workflow

This guide shows how to use the open‑source BISHENG low‑code platform, ByteDance’s Seed‑1.6 and Seedream‑4.5 models, and a custom MCP server to build a workflow that uploads documents, performs RAG, generates structured PPT outlines with LLMs, creates page images via text‑to‑image models, and assembles a downloadable PDF, all while incorporating human‑in‑the‑loop controls.

BISHENG · HITL · LLM

0 likes · 17 min read

Old Meng AI Explorer

Dec 15, 2025 · Artificial Intelligence

Unlock Multi‑Model AI Decision Power with LLM Council – A Hands‑On Guide

LLM Council, an open‑source platform created by former OpenAI researcher Andrej Karpathy, lets users simultaneously query top LLMs such as GPT‑5.1, Gemini 3 Pro, Claude Sonnet 4.5 and Grok 4, anonymously peer‑review their answers, and synthesize a final report, dramatically improving accuracy for research, tech selection and learning while remaining easy to install and run locally.

AI tool · LLM · Open-source

0 likes · 11 min read

Architect

Dec 15, 2025 · Artificial Intelligence

Demystifying LLM Architecture: From Transformers to Modern MoE Designs

This comprehensive guide explains the fundamentals of large language model (LLM) architectures, covering the original Transformer, tokenization, embeddings, positional encoding, attention mechanisms, feed‑forward networks, layer stacking, a step‑by‑step translation example, and the latest open‑source and hybrid LLM designs shaping the field.

Embedding · LLM · MoE

0 likes · 41 min read

Baidu Intelligent Cloud Tech Hub

Dec 15, 2025 · Artificial Intelligence

Baidu Baige’s Breakthrough: Orchestrating Giant LLM Inference with Silent Instances

The article details Baidu Baige’s next‑generation distributed inference platform for trillion‑parameter LLMs. It explains how automated orchestration, the FedDeployment abstraction, the SplitService unified view, Adaptive HPA predictive scaling, Silent Instances that activate within seconds, and the Staggered Batched Scheduler remove scaling limits, reduce TTFT by 30‑40%, boost throughput by up to 20%, and deliver cost‑effective, elastic AI compute.

Autoscaling · Kubernetes · LLM

0 likes · 23 min read

Wu Shixiong's Large Model Academy

Dec 15, 2025 · Artificial Intelligence

Mastering Text2SQL: From Schema Design to Secure Multi‑Step LLM Pipelines

This article explains how Text2SQL works by teaching LLMs to understand a closed‑world database schema, constructing tightly constrained prompts, validating generated SQL, handling execution errors, and using a second LLM call to translate results into natural language, while highlighting common pitfalls and engineering best practices.

LLM · SQL Validation · Text2SQL

0 likes · 9 min read

Network Intelligence Research Center (NIRC)

Dec 15, 2025 · Artificial Intelligence

Turning LLM-Generated Network Configurations into Verified, Safe Updates with Artanis

The paper introduces Artanis, an intent‑based network configuration update framework that combines large‑language‑model generation with a verification‑feedback loop and reinforcement‑learning optimization, addressing hallucination‑induced errors and ensuring safe, policy‑compliant deployments across diverse network scales.

Configuration Management · Intent-based Networking · LLM

0 likes · 9 min read

Architect's Alchemy Furnace

Dec 13, 2025 · Artificial Intelligence

Explore 100+ Open‑Source LLM Apps and How to Run Them Locally

This guide presents a curated collection of over a hundred open‑source large language model applications—including AI agents, RAG pipelines, and domain‑specific tools—explains their categories, showcases example projects, and provides step‑by‑step instructions to clone and run them on your own machine.

AI Agents · GitHub · LLM

0 likes · 8 min read
