Tagged articles

919 articles

May 30, 2026 · Artificial Intelligence

Is RAG Dead? How Grep Is Making a Comeback in LLM‑Powered Code Search

This article investigates the claim that Retrieval‑Augmented Generation (RAG) is obsolete by dissecting Claude Code’s grep‑driven search architecture, benchmarking its performance against traditional vector‑based retrieval, comparing it with Cursor and OpenAI Codex, and analyzing the trade‑offs of multi‑round agentic search.

Claude Code · Code search · Cursor

0 likes · 36 min read

Old Zhang's AI Learning

May 30, 2026 · Artificial Intelligence

Turning Technical Books into Claude Code Skills: Unlocking Internal Documentation as Reusable Skills

The article introduces the open‑source "book-to-skill" tool that compiles PDFs or EPUBs into Claude Code skills, explains its on‑demand loading architecture, compares it with raw PDF retrieval and RAG, and provides detailed implementation steps, performance numbers, and practical usage guidelines.

AI · Claude · RAG

0 likes · 12 min read

AI Engineer Programming

May 30, 2026 · Artificial Intelligence

Should You Pre‑filter or Post‑filter in RAG Vector Search?

The article examines RAG vector retrieval filtering strategies, comparing pre‑filtering (filter before vector search) and post‑filtering (filter after ANN search), and introduces single‑stage filtering, discussing their principles, trade‑offs, suitable scenarios, and architectural implications for accuracy and performance.

ANN · RAG · metadata filtering

0 likes · 15 min read

Digital Planet

May 29, 2026 · Industry Insights

5 Essential Skills Data Professionals Must Master in 2026

In the AI‑driven era of 2026, data professionals need to focus on five high‑impact capabilities—data governance, practical large‑model usage, MLOps, data storytelling, and AI compliance—to stay indispensable, with each skill backed by industry reports, job growth data, and concrete learning pathways.

2026 Trends · AI Compliance · AI Skills

0 likes · 13 min read

AI Engineer Programming

May 29, 2026 · Artificial Intelligence

How to Build a Reliable RAG Test Dataset

The article explains why a structured test set is essential for Retrieval‑Augmented Generation systems, outlines failure modes, describes layered evaluation of retrieval and generation, details infrastructure like chunk IDs and manifests, and provides a complete annotation pipeline with cold‑start and adversarial strategies.

LLM · RAG · adversarial

0 likes · 24 min read

AI Large-Model Wave and Transformation Guide

May 28, 2026 · Artificial Intelligence

Why AI Agent Architecture Mirrors 50 Years of OS Design

The article maps classic operating‑system concepts—processes, system calls, caching, file‑system mounting, and scheduling—to AI agents, showing how these analogies explain challenges like context sharing, tool permissions, token limits, knowledge‑base mounting, and orchestrated execution, and proposes a concrete multi‑layer design framework.

AI agents · Agent Architecture · Context Management

0 likes · 10 min read

AI Engineer Programming

May 28, 2026 · Artificial Intelligence

Claude Code Best Practices and Getting Started Guide for Large Codebases

This guide explains how Claude Code can be deployed in massive monorepos, legacy systems, and distributed repositories, detailing navigation methods, the limits of RAG, the benefits of agentic search, and a five‑layer support system—including CLAUDE.md, hooks, skills, plugins, and MCP servers—to help teams of thousands achieve reliable AI‑assisted coding.

AI coding · CLAUDE.md · Claude Code

0 likes · 18 min read

Su San Talks Tech

May 27, 2026 · Artificial Intelligence

Why Switch from Hand‑Written HTTP Calls to Spring AI for Large‑Model Integration?

The article analyzes the drawbacks of manually coding HTTP calls to large language models—hard‑coded keys, fragile request construction, missing retries, and poor observability—and demonstrates how Spring AI’s layered abstraction, unified configuration, built‑in resilience, function calling, RAG support, and seamless Spring ecosystem integration solve these problems for production‑grade Java applications.

Function Calling · Java · LLM

0 likes · 24 min read

AI Engineer Programming

May 27, 2026 · Artificial Intelligence

MMR for RAG: Low-Cost Chunk Limits Balance Relevance and Diversity

When a long document is split into many highly similar chunks, vector‑based top‑k retrieval tends to return multiple pieces from the same source, causing document dominance; applying a per‑document chunk limit together with Maximal Marginal Relevance (MMR) re‑ranking introduces diversity while preserving relevance, offering a low‑cost way to improve RAG answer quality.

Chunking · DPP · Diversity

0 likes · 17 min read

PaperAgent

May 26, 2026 · Artificial Intelligence

Why External Retrieval in RAG Is Redundant: Insights from NVIDIA’s INTRA Paper

The INTRA paper shows that using a decoder’s cross‑attention as an internal retrieval mechanism eliminates the need for a separate retriever, achieving state‑of‑the‑art multihop QA performance with only 164 K trainable parameters and shared pre‑encoded representations.

INTRA · RAG · attention

0 likes · 8 min read

Su San Talks Tech

May 25, 2026 · Artificial Intelligence

Mastering RAG: Chunking, Embeddings, BM25 & Multi‑Index Retrieval in Python

This tutorial explains Retrieval‑Augmented Generation (RAG) from fundamentals to a full pipeline, covering text chunking strategies, VoyageAI embeddings, vector‑store implementation, BM25 lexical search, and a multi‑index retriever that fuses semantic and lexical results with Reciprocal Rank Fusion.

BM25 · Chunking · Embeddings

0 likes · 48 min read

AgentGuide

May 24, 2026 · Artificial Intelligence

Comprehensive AI Agent Interview Guide: From Core Concepts to Engineering Implementation

This curated collection gathers AI Agent interview questions covering fundamentals, tokenization, skill design, RAG, MCP, memory systems, evaluation methods, and practical engineering pathways, offering a complete navigation resource for backend engineers transitioning to AI roles.

AI agent · Agent Evaluation · Interview Questions

0 likes · 3 min read

DataFunTalk

May 24, 2026 · Artificial Intelligence

Engineering and Algorithm Innovations for RAG Engines in Office Scenarios

The article analyzes the challenges of deploying large language models in enterprise settings and presents a modular Retrieval‑Augmented Generation (RAG) solution that combines document parsing, multi‑turn query rewriting, hybrid vector‑plus‑BM25 retrieval, two‑stage ranking (RRF, ColBERT, cross‑encoder) and knowledge‑filtered prompt engineering to achieve more comprehensive search, better ranking and more accurate answers.

Document Parsing · Hybrid Retrieval · Knowledge Filtering

0 likes · 22 min read

SuanNi

May 23, 2026 · Artificial Intelligence

Deploy the Open-Source ChatLaw Legal LLM on the SuanWang Platform

This article introduces ChatLaw, an open‑source legal large language model trained on 936,727 real cases, explains its high‑dimensional embedding ChatLaw‑Text2Vec for fast knowledge alignment, and provides a step‑by‑step guide to deploy it on the SuanWang cloud platform using Python and MLU resources.

ChatLaw · Deployment · Embedding

0 likes · 3 min read

James' Growth Diary

May 23, 2026 · Artificial Intelligence

Choosing the Right Retrieval Strategy: Full‑Text vs Vector vs Graph Search

This article breaks down the underlying logic, ideal scenarios, benchmark data, decision trees, and real‑world case studies for full‑text (BM25), vector, and graph retrieval, showing why hybrid approaches dominate production while each technique has distinct strengths and trade‑offs.

Full-Text Search · Hybrid Search · RAG

0 likes · 25 min read

AI Large-Model Wave and Transformation Guide

May 22, 2026 · Artificial Intelligence

Can Agentic Search Replace Traditional RAG? A Deep Dive into Their Differences

The article explains agentic search as an LLM‑driven, multi‑step retrieval process, contrasts it with traditional RAG pipelines, provides concrete examples, discusses when each approach is appropriate, and argues that agentic search will augment rather than fully replace RAG.

AI · LLM · RAG

0 likes · 7 min read

AI Algorithm Path

May 21, 2026 · Artificial Intelligence

Essential Ranking Techniques Every RAG Engineer Must Know

This article explains why ranking is the decisive factor behind successful Retrieval‑Augmented Generation (RAG) pipelines, walks through pointwise, pairwise, and listwise learning‑to‑rank paradigms, details key algorithms such as LambdaMART, compares cross‑encoders with bi‑encoders, and provides practical guidance on metrics, production‑grade rerankers, model fine‑tuning, and framework integration.

Bi-Encoder · Cross-Encoder · LLM

0 likes · 22 min read

DataFunSummit

May 21, 2026 · Artificial Intelligence

Designing Next‑Gen Recommendation and Search with Intelligent Agent Architecture

The article reviews a collection of technical chapters that analyze how multi‑agent AI architectures, large‑language‑model‑enhanced recommendation pipelines, generative ranking for ads, and Elasticsearch‑based vector RAG are applied to build next‑generation recommendation and search systems, citing concrete designs, performance numbers and real‑world deployments.

AI agents · Elasticsearch · Generative Ranking

0 likes · 6 min read

James' Growth Diary

May 21, 2026 · Databases

Building a Neo4j Knowledge Graph: Entity Modeling, Cypher Queries, and LangChain Integration

This article walks through why graph databases excel at multi‑hop queries, compares Neo4j with relational and vector stores, explains core concepts of nodes, relationships and properties, shows Docker setup, demonstrates six common Cypher patterns, integrates LangChain for LLM‑generated queries, and shares production‑grade modeling tips and pitfalls.

Cypher · Graph Database · LangChain

0 likes · 19 min read

大转转FE

May 21, 2026 · Artificial Intelligence

Why AI Buzzwords Multiply Faster Than My Hair Falls

The article maps three generations of AI engineering—Prompt Engineering, Context Engineering, and Harness Engineering—explaining their core capabilities, key terms like LLM, RAG, Agent, and evaluation methods, while offering practical tips, pitfalls, and a concise three‑question checklist to stay grounded amid the rapid influx of new AI jargon.

AI · Agent · Harness

0 likes · 19 min read

AI Engineer Programming

May 21, 2026 · Artificial Intelligence

RAG with Multimodal Inputs vs LLM + Toolchains: Handling Non‑Text Data

The article analyzes how large language models process only tokenized text, compares the traditional LLM‑plus‑toolchain pipeline with emerging multimodal models, evaluates their cost, speed, controllability, and hallucination risks, and proposes a hybrid architecture that matches each approach to specific document scenarios.

LLM · Multimodal · RAG

0 likes · 16 min read

James' Growth Diary

May 20, 2026 · Artificial Intelligence

Boosting RAG Retrieval Quality with Cohere Rerank and Cross‑Encoder

After achieving high recall with hybrid Elasticsearch and vector search, the article shows how inserting a reranker—either Cohere's cloud API or a local Cross‑Encoder—compresses the top‑20 candidates to the most relevant three to five, dramatically improving answer accuracy, cutting token costs, and detailing a dual‑track implementation for production and development environments.

Cohere · Cross-Encoder · LangChain

0 likes · 22 min read

Spring Full-Stack Practical Cases

May 20, 2026 · Artificial Intelligence

RAG vs. LLM Wiki vs. GBrain: Which Architecture Best Powers Agent Memory?

The article analyzes why AI agents forget, then compares three memory architectures—RAG, LLM Wiki, and GBrain—detailing their strengths, weaknesses, scalability, latency, compounding knowledge, and autonomy, and offers guidance on choosing the right approach for different use cases.

AI Architecture · Agent Memory · GBrain

0 likes · 20 min read

Tech Minimalism

May 20, 2026 · Artificial Intelligence

How Karpathy’s Markdown Wiki Redefines LLM Knowledge Management

The article examines the LLM Wiki concept introduced by Karpathy, explaining how a Markdown‑based wiki maintained outside the LLM context can persist and evolve model understanding, compares it with RAG, note‑taking tools and traditional knowledge bases, and outlines architectural components, risks, and practical guidelines.

AI · Knowledge Base · LLM

0 likes · 14 min read

AI Engineer Programming

May 20, 2026 · Artificial Intelligence

Why Chunk‑Based RAG Fails and How IdeaBlocks Improve Retrieval

The article argues that the common assumption that text chunks are the proper knowledge unit in RAG pipelines is flawed, leading to versioning, metadata, and redundancy problems, and demonstrates that replacing chunks with structured IdeaBlocks dramatically reduces corpus size, token usage, and improves vector relevance.

IdeaBlock · LLM · RAG

0 likes · 10 min read

dbaplus Community

May 19, 2026 · Artificial Intelligence

From RAG to GraphRAG: How Huolala Raised Metadata Retrieval Accuracy from 56% to 78%

The article details Huolala's transition from a basic Retrieval‑Augmented Generation (RAG) system to a GraphRAG architecture, explaining the challenges of traditional RAG, the design of offline and online stages, multi‑index hybrid search, concrete performance metrics (accuracy up to 78%, knowledge recall 91%, Top‑K 90%, MRR 0.73), and future plans such as stronger hybrid retrieval, reranking, and Agentic RAG.

AI · GraphRAG · Hybrid Search

0 likes · 15 min read

AI Large-Model Wave and Transformation Guide

May 19, 2026 · Industry Insights

Which Company Will Shape the Future of Enterprise AI: Anthropic or Palantir?

The article compares Anthropic's lightweight, knowledge‑externalizing AI approach with Palantir's heavyweight data‑semantic and governance platform, arguing that Chinese B‑end firms should initially adopt Anthropic‑style quick‑value layers and later integrate Palantir‑style controls to build a sustainable enterprise AI operation layer.

AI Ops · Anthropic · China B2B

0 likes · 10 min read

DataFunSummit

May 18, 2026 · Artificial Intelligence

How Palantir’s Ontology‑Based Semantic Network Drove 85% Growth and Zero Churn

Palantir’s Q1 2026 revenue jumped 85% while many AI firms saw valuations collapse, and the company attributes its success to replacing cheap‑token LLM wrappers with a deep ontology‑driven semantic network that secures high‑risk AI deployments, creates a durable moat, and delivers unprecedented net‑retention.

AI infrastructure · Palantir · RAG

0 likes · 10 min read

AgentGuide

May 18, 2026 · Artificial Intelligence

AI Agent Essentials: Tokens, Skills, RAG, MCP, SDD & Harness Engineering

The article explains AI Agents as LLM‑based entities with planning, memory, and tool‑use capabilities, covering model pre‑training, fine‑tuning, hallucinations, the Model Context Protocol (MCP), tokenization, Retrieval‑Augmented Generation (RAG), multi‑layer memory, Skill packaging, ReAct reasoning‑action loops, self‑reflection, Harness engineering, and Spec‑Driven Development (SDD).

AI agent · Harness Engineering · LLM

0 likes · 9 min read

dbaplus Community

May 17, 2026 · Artificial Intelligence

Why Grep Is Replacing Vector Indexes: RAG Isn’t Dead, It’s Evolving

The article dissects Claude Code’s LLM‑driven Grep search, showing how multi‑round tool calls replace static vector‑based RAG, presents ripgrep performance benchmarks, compares Claude Code with Cursor and Codex, and argues that zero‑index search is optimal for local code bases while larger projects still need indexing.

Claude Code · Code search · Grep

0 likes · 36 min read

IT Services Circle

May 17, 2026 · Artificial Intelligence

60 Essential AI Terms Every Programmer Should Master

This article walks programmers through 60 core AI concepts—from the basics of large language models and tokens to advanced topics like prompt engineering, retrieval‑augmented generation, fine‑tuning, and inference optimization—organized into progressive skill levels and illustrated with concrete examples and code snippets.

AI · Inference Optimization · RAG

0 likes · 25 min read

Tech Minimalism

May 16, 2026 · Artificial Intelligence

One‑page guide to the three RAG architectures: Classic, Graph, and Agentic

The article explains why plain large language models cannot answer internal company questions, introduces Retrieval‑Augmented Generation (RAG) as a solution, and compares three RAG variants—Classic, Graph, and Agentic—detailing their workflows, strengths, limitations, and how to choose the right one for a given problem.

Agentic RAG · Classic RAG · Graph RAG

0 likes · 17 min read

AI Engineer Programming

May 16, 2026 · Artificial Intelligence

How to Boost RAG Retrieval Quality: Real‑World Cost‑Benefit Analysis

This article examines practical ways to improve Retrieval‑Augmented Generation (RAG) retrieval quality—covering vector database choices, data chunking, embedding models, query expansion, and re‑ranking—while weighing performance gains against operational costs through multiple real‑world case studies.

LLM · RAG · Re‑ranking

0 likes · 16 min read

Su San Talks Tech

May 15, 2026 · Artificial Intelligence

Understanding Rerank in Retrieval‑Augmented Generation (RAG)

The article explains why a reranking step is essential in RAG pipelines, describes how it refines the initial vector‑search results, compares mainstream rerank techniques, discusses practical engineering choices such as candidate set size and model selection, and outlines how to evaluate and tune rerank performance.

Cross-Encoder · LLM · Model selection

0 likes · 15 min read

AI Engineer Programming

May 15, 2026 · Artificial Intelligence

Hybrid Retrieval in RAG: Combining BM25 Precision with Dense Vector Semantics

The article examines why pure vector retrieval in RAG lacks lexical precision and traceable relevance scores, explains BM25's strengths, and presents hybrid retrieval architectures—including RRF and linear combination fusion—as well as the trade‑offs of externalizing the fusion process.

BM25 · Hybrid Search · RAG

0 likes · 9 min read

DeepHub IMBA

May 14, 2026 · Artificial Intelligence

How HyDE Transforms RAG Retrieval from Keyword Matching to Intent Understanding

The article explains how Hypothetical Document Embeddings (HyDE) improve Retrieval‑Augmented Generation by generating a synthetic answer before vector search, allowing the system to embed richer semantic intent rather than relying on shallow keyword similarity, and provides a step‑by‑step implementation using LangChain.

HyDE · LLM · LangChain

0 likes · 6 min read

AntData

May 14, 2026 · Artificial Intelligence

How RAG‑Powered DB‑GPT Enables Intelligent Marine‑Environment Queries with Text2SQL

The article presents a private‑deployed DB‑GPT solution that combines Retrieval‑Augmented Generation (RAG) and Text2SQL to address low utilization of unstructured marine‑environment knowledge, cross‑source data querying difficulties, and security concerns, detailing technical selection, implementation steps, and performance gains that reduce query time from 30 minutes to 1‑3 minutes.

AI · DB-GPT · Knowledge retrieval

0 likes · 13 min read

AI Engineer Programming

May 14, 2026 · Artificial Intelligence

RAG Retrieval: Comparing Bi-encoder and Cross-encoder Architectures

The article reviews the three‑step RAG pipeline, explains why retrieval quality hinges on fast, accurate semantic matching, contrasts Bi-encoder’s offline vector indexing and speed with Cross-encoder’s token‑level interaction and higher precision, and discusses hybrid solutions such as ColBERT and LLM rerankers with practical engineering guidelines.

Bi-Encoder · ColBERT · Cross-Encoder

0 likes · 10 min read

ITPUB

May 13, 2026 · Databases

Is the Hype Around Vector Databases a Pseudo‑Demand in the AI Era?

The article questions whether dedicated vector databases are truly needed for AI applications, examining market hype, the rapid emergence of many vector‑DB products, real‑world examples like PostgreSQL pgvector and major vendor integrations, and the hidden costs of data fragmentation and operational complexity.

AI · PostgreSQL · RAG

0 likes · 15 min read

DataFunSummit

May 13, 2026 · Artificial Intelligence

From RAG to Ontology: Palantir’s Semantic Network Drives 85% Growth and Zero Churn

Amid rapidly commoditized large‑model capabilities, Palantir achieved an 85% YoY revenue surge and zero churn by replacing generic RAG approaches with a deep enterprise ontology that unifies business semantics, creating a durable infrastructure moat while other AI firms see valuation collapse.

AI infrastructure · Palantir · RAG

0 likes · 11 min read

Machine Heart

May 13, 2026 · Artificial Intelligence

From 0 to 193 Logins in 88 Days: Evidence‑Driven AI Empowers 5 Million Chinese Doctors

Facing overwhelming patient loads and unreliable AI hallucinations, Chinese doctors turned to a new medical AI that combines low‑hallucination retrieval‑augmented generation, PICO‑GRADE evidence structuring, reward‑based model alignment and expert‑in‑the‑loop feedback, delivering clinically vetted answers in seconds and gaining 193 logins within 88 days.

AI · RAG · clinical-decision-support

0 likes · 16 min read

ITPUB

May 12, 2026 · Industry Insights

Why Pinecone Is Dismantling Its Own RAG Paradigm

In May 2026 Pinecone announced the end of its Retrieval‑Augmented Generation (RAG) approach, unveiling the Nexus knowledge engine and KnowQL query language to address the structural inefficiencies of RAG for AI agents, and positioning this shift as a strategic industry‑wide pivot.

AI agents · KnowQL · Knowledge Compilation

0 likes · 8 min read

DataFunSummit

May 12, 2026 · Artificial Intelligence

15 Critical Questions on Why Enterprise AI Agents Need Business Ontology

The article analyzes why large language models and RAG alone cannot meet enterprise AI needs, argues that a business ontology provides essential semantic grounding for agents, outlines ontology construction methods, demonstrates hybrid search improvements, and shares real‑world case studies showing dramatic efficiency gains.

AI agents · Hybrid Search · RAG

0 likes · 16 min read

Architecture Digest

May 12, 2026 · Artificial Intelligence

Tencent Open‑Sources WeKnora: An AI‑Powered Document Understanding Framework

WeKnora, Tencent's newly open‑source framework built on the IMA kernel, combines LLM and RAG to parse unstructured PDFs, Word files and scans with over 300% speed improvement and 89% top‑10 retrieval precision, offering modular deployment, secure private‑cloud options, and seamless integration with vector databases and the WeChat ecosystem.

Knowledge Base · LLM · Open Source

0 likes · 8 min read

DeepHub IMBA

May 11, 2026 · Artificial Intelligence

2026 RAG Selection Guide: How to Choose Between Vector, Graph, and Vectorless

This article compares traditional Vector RAG, GraphRAG, and the newer Vectorless RAG, explains why Vector RAG fails on relational and structured queries, presents benchmark results, outlines each architecture's strengths and costs, and offers a decision framework and Adaptive RAG routing strategy for production systems.

Adaptive Retrieval · GraphRAG · LLM

0 likes · 13 min read

IT Services Circle

May 11, 2026 · Artificial Intelligence

Can Claude’s Code Generation Replace Agent Memory Systems? Understanding CLAUDE.md, Memory, and RAG

The article explains why large language model agents need dedicated memory systems to overcome limited context windows, outlines short‑term and long‑term memory architectures, storage forms, functional categories, lifecycle operations, control‑policy research, compares leading products, and presents best‑practice engineering guidelines for building scalable, privacy‑aware agent memory pipelines.

Agent Memory · Control Policy · Long-term Memory

0 likes · 25 min read

James' Growth Diary

May 11, 2026 · Artificial Intelligence

Mastering RAG Evaluation: Recall@K, MRR, NDCG, and RAGAS Explained

This article breaks down RAG evaluation into a two‑layer framework, explains the four core metrics—Recall@K, MRR, NDCG, and the four RAGAS scores—shows how to implement them with LangChain.js, highlights common pitfalls, and offers scenario‑specific metric combinations for reliable performance monitoring.

LangChain · MRR · NDCG

0 likes · 20 min read

Smart Workplace Lab

May 10, 2026 · Artificial Intelligence

When Your Internal AI Is Fed Bad Data, How to Fix It?

The article recounts a real incident where an AI‑generated SOP cited outdated policy because a knowledge base was overloaded with unchecked historical documents, then outlines a step‑by‑step protocol—including corpus cleaning, version locking, and isolation zones—to prevent data contamination and ensure reliable AI outputs.

AI · Data cleaning · Knowledge Base

0 likes · 7 min read

James' Growth Diary

May 10, 2026 · Artificial Intelligence

Syncing Vectors with Changing Documents: Add, Update, Delete Made Simple

This article walks through why keeping a vector store consistent with a mutable knowledge base is challenging, explains the three failure points, introduces hash‑based incremental syncing, shows idempotent add, proper update and soft‑delete workflows, covers embedding model upgrades, and presents a production‑grade event‑driven architecture with common pitfalls and remedies.

Hash Deduplication · LangChain · RAG

0 likes · 17 min read

DataFunTalk

May 10, 2026 · Artificial Intelligence

Exploring Multimodal GraphRAG: Combining Document Intelligence, Knowledge Graphs, and Large Models

This article presents a detailed technical walkthrough of multimodal GraphRAG, covering document‑intelligence parsing pipelines, multimodal graph index construction, knowledge‑graph‑driven chunk linking, recent research progress, performance trade‑offs, and practical recommendations for deploying RAG solutions.

Document Intelligence · GraphRAG · Multimodal Retrieval

0 likes · 23 min read

IT Services Circle

May 9, 2026 · Artificial Intelligence

How to Choose Between LangChain and LlamaIndex: Core Use‑Case Comparison for Agent Development

The article analyzes the design philosophies, key components, strengths, and weaknesses of LangChain and LlamaIndex, explains their distinct core scenarios—complex multi‑step agent orchestration versus private‑data RAG—and shows how they can be combined in real projects while outlining emerging ecosystem trends.

Agent · LLM · LangChain

0 likes · 13 min read

java1234

May 9, 2026 · Artificial Intelligence

Claude‑mem: Persistent Memory for Claude Code – Architecture, Token Savings, Quick Install

Claude‑mem adds automatic capture, compression, and retrieval of high‑value coding context to Claude Code, reducing token usage with a three‑stage retrieval pipeline, offering a single‑command install, cross‑tool compatibility, and configurable privacy controls.

AI assistant · Claude Code · RAG

0 likes · 11 min read

AI Engineer Programming

May 9, 2026 · Artificial Intelligence

Why PDF Parsing Is Hard for RAG and Which Mainstream Solutions Work

The article examines the intrinsic challenges of extracting structured text from PDFs for Retrieval‑Augmented Generation—such as missing reading order, table reconstruction, font encoding, and scanned images—and compares lightweight libraries, AI‑enhanced frameworks, commercial APIs, and visual language models as practical solutions.

AI frameworks · OCR · PDF parsing

0 likes · 23 min read

AI Step-by-Step

May 8, 2026 · Artificial Intelligence

How LLM Wiki Transforms Personal Agent Knowledge Management

LLM Wiki, proposed by Andrej Karpathy, replaces repetitive RAG retrieval for personal agents with a three‑layer markdown‑based knowledge base that separates raw sources, curated wiki pages, and schema constraints, enabling durable, auditable memory, structured updates, health checks, and a hybrid Wiki‑RAG workflow.

AI · Knowledge Base · LLM Wiki

0 likes · 17 min read

AI Engineer Programming

May 8, 2026 · Artificial Intelligence

Is Non-Vector RAG the Next Generation of Retrieval‑Augmented Generation?

The article analyses the relevance and accuracy shortcomings of traditional vector‑based RAG, explains how non‑vector approaches like PageIndex let LLMs navigate document trees for relevance classification and auditability, and evaluates their complexity, latency, metadata risks, and suitable use cases compared with hybrid retrieval.

Hybrid Retrieval · LLM · RAG

0 likes · 8 min read

Architect's Guide

May 7, 2026 · Artificial Intelligence

Spring AI 2.0 vs LangChain4j: Which Should You Choose?

The article provides a side‑by‑side analysis of Spring AI 2.0 and LangChain4j, comparing their goals, version alignment, programming models, RAG and agent capabilities, ecosystem integration, learning curve, and operational considerations to help Java teams decide which library best fits their project constraints.

AI agents · Java · LLM integration

0 likes · 11 min read

Lao Guo's Learning Space

May 6, 2026 · Artificial Intelligence

Why Your RAG Keeps Missing the Mark: Enterprise‑Level Pitfall Guide

This article examines why Retrieval‑Augmented Generation systems that work in demos often fail in production, detailing common pitfalls—from chunking and vector‑database selection to hybrid retrieval and re‑ranking—and offers concrete strategies, configuration tips, and a decision tree to build reliable enterprise‑grade RAG solutions.

Chunking · Hybrid Retrieval · RAG

0 likes · 12 min read

Old Zhang's AI Learning

May 6, 2026 · Artificial Intelligence

Solving RAG’s Biggest Pain Point: Introducing the Open‑Source CocoIndex

The article argues that the biggest pain point for RAG and agent contexts is stale data rather than chunking or reranking, and introduces CocoIndex, a Rust‑based incremental engine with a declarative Python API that offers fresh, delta‑processed context, automatic schema evolution, and production‑grade features, demonstrated through PDF‑to‑Markdown pipelines and a podcast knowledge‑graph case study.

Python · RAG · Rust

0 likes · 13 min read

AI Engineer Programming

May 6, 2026 · Artificial Intelligence

How to Evaluate and Choose Embedding Models for RAG Systems

This article explains why embedding models are the foundation of RAG pipelines, outlines concrete evaluation metrics such as MTEB v2 scores, latency, throughput and cost, compares a range of commercial and open‑source models, and discusses emerging trends like multimodal and long‑context embeddings.

MTEB · Model selection · Multimodal

0 likes · 13 min read

Su San Talks Tech

May 6, 2026 · Information Security

What Is Prompt Injection? Attack Vectors and Defense Strategies

The article explains that prompt injection is a new LLM security threat in which attackers blur the line between instruction and data, outlines direct and indirect injection techniques (including command overriding, role‑play jailbreaks, encoding obfuscation, and multi‑turn attacks), and proposes a defense‑in‑depth framework with input filtering, prompt design, output validation, least‑privilege architecture, and specialized safeguards for RAG and agent scenarios.

AI safety · Agent · Defense in Depth

0 likes · 15 min read

java1234

May 5, 2026 · Artificial Intelligence

Spring AI 2.0: New Video Tutorial Series Empowers Java Developers with AI

The author announces a refreshed Spring AI 2.0 video tutorial series and provides a detailed overview of the framework’s design goals, provider‑agnostic API, full‑type model support, Spring integration, enterprise value, typical use cases, and a comparison with competing Java AI libraries.

AI Framework · Java · LangChain4j

0 likes · 7 min read

AI Engineering

May 4, 2026 · Artificial Intelligence

Why the Big‑Model Race Is Over: Where Real Value Lies in AI Infrastructure

The article argues that the competition over which large language model will dominate is outdated, explaining that true value now comes from building multi‑model routing, context engineering, standardized tool protocols, intelligent orchestration, and robust evaluation layers that turn models into reliable AI infrastructure.

AI infrastructure · MCP · Model routing

0 likes · 6 min read

DataFunTalk

May 4, 2026 · Artificial Intelligence

Engineering and Algorithm Innovations for RAG Engines in Office Applications

This article analyzes the challenges and practical solutions of building a Retrieval‑Augmented Generation (RAG) system for office scenarios, covering background issues, modular architecture, offline and online pipelines, hybrid retrieval, ranking models, knowledge filtering, prompt design, and two‑stage generation techniques.

AI · Document Parsing · Hybrid Retrieval

0 likes · 22 min read

PMTalk Product Manager Community

May 4, 2026 · Product Management

2026 AI Product Manager: The Essential Capability Model

By 2026, AI product managers must shift from merely using models to delivering stable, valuable results, mastering seven core abilities—demand judgment, evaluation-driven iteration, context design, RAG strategy, agent orchestration, solution planning, and rapid Vibe Coding—to close the loop between business needs and AI capabilities.

AI product management · Agent Design · Context Engineering

0 likes · 13 min read

AI Engineer Programming

May 4, 2026 · Artificial Intelligence

RAG in the Long-Context Era: Challenges, Benchmarks, and Context Engineering

The article analyzes how expanding LLM context windows to millions of tokens reshape Retrieval‑Augmented Generation, detailing chunking trade‑offs, embedding retrieval limits, attention U‑shaped distribution, benchmark results, and the emerging practice of Context Engineering for optimal end‑to‑end pipelines.

Embedding Retrieval · LLM · RAG

0 likes · 10 min read

AI Architect Hub

May 3, 2026 · Artificial Intelligence

Choosing the Right Vector Database: Milvus, Chroma, Weaviate, Qdrant, FAISS Compared

This article compares five popular vector databases (Chroma, Milvus, Weaviate, Qdrant, and FAISS), detailing their positioning, strengths, weaknesses, suitable scenarios, a selection‑dimension matrix, common pitfalls, code implementations for a unified RAG pipeline, best‑practice recommendations, and thought questions to guide engineers in choosing and migrating vector stores.

Chroma · FAISS · Milvus

0 likes · 23 min read

DataFunSummit

May 3, 2026 · Artificial Intelligence

From Flawed to Production-Ready: Deep Dive into Building Enterprise-Grade RAG Systems

The article analyzes why early RAG deployments often fall short, dissects the most common technical pain points—from document parsing to vector overload—and presents a systematic roadmap that includes hybrid search, reranking, GraphRAG, Agentic RAG, model selection, scalability tricks, and security controls for robust B‑side production.

Agentic RAG · GraphRAG · Hybrid Search

0 likes · 20 min read

Spring Full-Stack Practical Cases

May 3, 2026 · Artificial Intelligence

9 Advanced Retrieval‑Augmented Generation (RAG) Architectures Explained

This article introduces Retrieval‑Augmented Generation (RAG) and systematically details nine distinct RAG architectures—standard, conversational with memory, corrective (CRAG), adaptive, self‑RAG, fusion, HyDE, agentic, and Graph RAG—highlighting their workflows, real‑world examples, advantages, and trade‑offs.

AI Architecture · GraphRAG · LLM

0 likes · 17 min read

AI Engineer Programming

May 2, 2026 · Artificial Intelligence

From Demo to Production: How to Evaluate RAG Effectively

This guide outlines a comprehensive RAG evaluation framework covering failure modes, multi‑layer metrics, test‑set construction, open‑source tools, CI/CD quality gates, production monitoring, and special considerations for agentic RAG to ensure reliable, trustworthy retrieval‑augmented generation systems.

AI · LLM · Metrics

0 likes · 18 min read

DataFunSummit

May 1, 2026 · Artificial Intelligence

How Agentic Architectures Power the Next‑Gen Recommendation and Search Systems

This article summarizes a technical ebook that analyzes the evolution of recommendation and search systems—from deep‑learning models to large‑language‑model agents—detailing multi‑agent RAG architectures, Huawei’s KAR knowledge adapters, Baidu’s generative ranking (GRAB), Elasticsearch vector search, and performance results such as a 1.5% AUC lift and GPU‑accelerated throughput gains.

Elasticsearch · Generative Ranking · Multi-Agent Architecture

0 likes · 6 min read

AI Engineer Programming

May 1, 2026 · Artificial Intelligence

From Naive Retrieval to Knowledge Runtime: The Full Evolution of RAG

The article traces the evolution of Retrieval‑Augmented Generation from its 2020 Naive baseline through Advanced, Modular, Graph, and Agentic generations, detailing architectural shifts, optimization techniques, self‑correction mechanisms, and future challenges such as long‑context handling and multimodal retrieval.

LLM · RAG · agentic

0 likes · 14 min read

Alibaba Cloud Big Data AI Platform

May 1, 2026 · Artificial Intelligence

Zero Deployment, Zero Ops: Alibaba Cloud Milvus Embedding Service Makes Vectorization Plug‑and‑Play

The article explains how Alibaba Cloud's Milvus Embedding Service eliminates the need for self‑hosted embedding models by integrating model inference, vector generation and Milvus indexing into a managed pipeline, dramatically reducing deployment complexity, operational overhead, and time‑to‑value for semantic search, RAG and multimodal retrieval use cases.

Alibaba Cloud · Embedding · Milvus

0 likes · 19 min read

DeepHub IMBA

Apr 30, 2026 · Artificial Intelligence

Why Real RAG Systems Need Both BM25 and Vector Search

The article analyzes how BM25 excels at exact token matching while vector embeddings capture semantic intent, explains their distinct failure modes, and shows that a hybrid retriever—combined with metadata filtering, proper chunking, and reciprocal rank fusion—delivers the most reliable results for RAG pipelines.

BM25 · Embedding · Hybrid Retrieval

0 likes · 17 min read

AI Architect Hub

Apr 30, 2026 · Artificial Intelligence

How AI Understands Your Queries: Core Techniques of Semantic Vector Search

The article explains why traditional keyword search often fails when user questions differ from knowledge‑base wording, introduces semantic search that matches queries and documents via vector similarity, details query understanding and rewriting techniques, lists common pitfalls, provides a full Python implementation, and shares best‑practice recommendations.

AI · Python · RAG

0 likes · 16 min read

DataFunSummit

Apr 30, 2026 · Industry Insights

Why Palantir’s Edge Isn’t Unique – Chinese Enterprises Can Replicate Its Methodology

A panel of industry experts dissected Palantir’s rapid growth, revealing that its advantage lies in a systematic ontology‑driven methodology rather than exclusive technology, and argued that Chinese firms can adopt the same approach if they first resolve data governance, semantic consistency, and management challenges.

AI agents · Capability vs Competency · Palantir

0 likes · 26 min read

MeowKitty Programming

Apr 29, 2026 · Artificial Intelligence

10 Must‑Try Open‑Source AI Projects for Java Developers: RAG, Agents, Knowledge Bases, and Text‑to‑SQL

This article curates ten open‑source AI projects on Gitee that Java developers can use to learn RAG pipelines, AI agents, knowledge‑base construction, Text‑to‑SQL, workflow orchestration, and multi‑model integration, offering concrete use cases, learning goals, and guidance on selecting a learning path.

AI · Java · LangChain4j

0 likes · 13 min read

Machine Heart

Apr 29, 2026 · Artificial Intelligence

Doc‑V*: Reading Only 5 Pages Beats RAG on 80‑Page Docs – 10 Key Insights

Doc‑V* introduces a dynamic, thumbnail‑driven approach that lets a model decide which pages to read, achieving a 49.7% improvement over RAG variants on multi‑page document QA benchmarks without larger models or longer context windows, and demonstrates how strategic evidence acquisition outperforms naïve full‑document reading.

AI · RAG · document understanding

0 likes · 10 min read

Kuaishou Tech

Apr 29, 2026 · Operations

Boosting Oncall Interception from 15% to 55%: KOncall’s AI‑Driven Evolution at Kuaishou

Kuaishou’s R&D efficiency team built the KOncall intelligent on‑call platform, integrating LLM‑based retrieval‑augmented generation, Redis Pub/Sub streaming, OCR multimodal parsing, FAQ knowledge ops, and custom reranking, which raised automated query interception from 15% to 55% and processed over 116,000 requests, turning on‑call from a bottleneck into a capability starter.

AI Operations · Incident Management · LLM

0 likes · 26 min read

MaGe Linux Operations

Apr 28, 2026 · Artificial Intelligence

Why Your RAG Performance Is Poor: Common Issues and Optimization Strategies

This article systematically analyzes why Retrieval‑Augmented Generation pipelines often underperform—covering embedding model selection, chunking strategies, hybrid retrieval, reranking, context window waste, evaluation metrics, and a detailed troubleshooting checklist—while providing concrete code examples and best‑practice recommendations for engineers.

Chunking · Embedding · Hybrid Retrieval

0 likes · 19 min read

360 Tech Engineering

Apr 28, 2026 · Artificial Intelligence

How 360 AI Institute Boosted Airline Translation Accuracy from 70% to 96%

The 360 AI Research Institute tackled the zero‑tolerance translation demands of airline maintenance by building a specialized parallel corpus and applying RAG‑enhanced, SFT‑fine‑tuned, and RL‑reinforced models, raising Chinese‑to‑English translation accuracy from 70% to 96% and enabling a one‑month rollout.

AI translation · RAG · SFT

0 likes · 5 min read

AI Illustrated Series

Apr 28, 2026 · Artificial Intelligence

Comprehensive Interview Guide: LangChain & LangGraph Frameworks

This article provides a detailed, question‑and‑answer style walkthrough of LangChain and LangGraph, covering their core concepts, components, workflow patterns, memory mechanisms, LCEL syntax, graph construction, conditional edges, loops, multi‑agent collaboration, persistence, and a comparison with LlamaIndex, offering concrete code examples and practical insights for AI interview preparation.

AI Framework · Agent · LCEL

0 likes · 32 min read

Node.js Tech Stack

Apr 28, 2026 · Artificial Intelligence

Turn Your Article Collection into an LLM‑Powered Wiki with a Single Skill

This article walks through using the youdaonote‑llm‑wiki skill to automatically ingest a set of Markdown articles into a cloud‑synced Youdao Note knowledge base, generate structured Wiki pages, perform cross‑document queries with citations, and keep the repository up‑to‑date, while comparing it to Karpathy's original script‑based approach.

AI agents · LLM Wiki · RAG

0 likes · 14 min read

Ray's Galactic Tech

Apr 27, 2026 · Backend Development

Java Engineer’s Complete Guide to Enterprise LLM Apps: LLM, Agent, RAG & Skill

This article walks Java engineers through building production‑grade enterprise AI assistants, explaining the roles of LLM, RAG, Agent and Skill, detailing a layered architecture, best‑practice code samples, deployment strategies, observability, security and cost‑control considerations.

Agent · Java · LLM

0 likes · 37 min read

AI Illustrated Series

Apr 27, 2026 · Artificial Intelligence

Comprehensive RAG Interview Q&A: 22 In-Depth Questions and Answers

This extensive interview guide covers 22 core RAG questions, detailing the definition, workflow, embedding selection, vector database choices, retrieval optimization, multi‑turn handling, context compression, evaluation metrics, knowledge‑graph integration, operational challenges, Agentic and hybrid RAG, document update strategies, similarity algorithms, and hallucination mitigation, providing concrete examples and practical advice for AI interview preparation.

AI Interview · Embedding · Knowledge retrieval

0 likes · 29 min read

SuanNi

Apr 27, 2026 · Artificial Intelligence

Agent Skills Explained: Definition, Structure, and Engineering Practices

This article breaks down the official Anthropic definition of Agent Skills, shows how they are simple file‑system‑based, composable units stored in SKILL.md, scripts, references and assets, and explains the three‑layer progressive‑disclosure loading model, discovery, selection, execution, composition patterns, security, version‑control integration and evaluation practices.

AI · Agent Skills · Composable

0 likes · 14 min read

Architect's Tech Stack

Apr 27, 2026 · Artificial Intelligence

Can Your RAG System Pass the Demo and Remain Accurate Across 5,000 Documents?

The article dissects a tough interview question about building a production‑grade Retrieval‑Augmented Generation (RAG) system that not only works in a demo but also delivers stable, correct answers over a knowledge base of 5,000 documents, covering chunking, hybrid retrieval, intent routing, constrained generation, evaluation metrics, and operational safeguards.

Hybrid Retrieval · Intent Routing · RAG

0 likes · 15 min read

Data Party THU

Apr 27, 2026 · Artificial Intelligence

Three Overlooked Failure Points in RAG Pipelines and How to Build a Feedback Loop

The article analyzes silent failures in Retrieval‑Augmented Generation pipelines, identifies three gaps—retrieval relevance, LLM confidence masking uncertainty, and missing fault signals—and presents a practical feedback‑loop architecture with relevance gating, post‑generation evaluation, session tracing, and user‑signal logging to make production RAG systems trustworthy.

LLM · Observability · RAG

0 likes · 13 min read

Lao Guo's Learning Space

Apr 27, 2026 · Artificial Intelligence

Build a Private Knowledge Base from Scratch with DeepSeek V4 and AnythingLLM

This guide walks you through creating a fully local, zero‑cloud RAG knowledge base using DeepSeek V4, AnythingLLM, and the BGE‑M3 embedding model, covering component choices, step‑by‑step installation, advanced tuning, troubleshooting, use‑case scenarios, and cost estimation.

AnythingLLM · BGE‑M3 · DeepSeek V4

0 likes · 18 min read

Wu Shixiong's Large Model Academy

Apr 27, 2026 · Artificial Intelligence

Can Your RAG Pass the Demo? Scaling to 5,000 Docs for Reliable Answers

The article walks through the practical challenges of turning a RAG demo into a production system for 5,000 insurance documents, covering knowledge‑base chunking, embedding model selection, recall‑threshold tuning, hybrid vector‑BM25 retrieval, intent‑aware query routing, prompt constraints, confidence scoring, and operational scaling, with concrete metrics and code examples.

Embedding · Hybrid Retrieval · RAG

0 likes · 16 min read

Java Web Project

Apr 27, 2026 · Artificial Intelligence

DeepSeek V4 Meets Claude Code: A Cost‑Effective Leap in Open‑Source LLM Performance

DeepSeek V4 preview, released quietly on April 24, offers two models with 1 M token context and pricing 1/16 of Claude Opus, achieving near‑par performance on SWE‑bench and LiveCodeBench, while integration with Claude Code enables rapid project understanding, bug detection, refactoring, testing and documentation, saving days of work for under ¥6.

Agentic Coding · Claude Code · Code Refactoring

0 likes · 15 min read

The Dominant Programmer

Apr 27, 2026 · Artificial Intelligence

Building a Private Document Vector Search with SpringBoot, LangChain4j, and Ollama RAG

This guide walks through why Retrieval‑Augmented Generation (RAG) is needed for large language models, explains the three‑step indexing and query workflow, details LangChain4j’s core components, and provides a complete SpringBoot example—including Maven setup, configuration, service code, and troubleshooting—to create a private document‑vector search system powered by Ollama.

Embedding · LangChain4j · Ollama

0 likes · 13 min read

James' Growth Diary

Apr 26, 2026 · Databases

Vector Database Fundamentals: Embedding, Similarity Search, and Index Structures Explained in One Go

This article walks through the complete workflow of turning split text into high‑dimensional vectors, choosing the right embedding model, selecting an appropriate similarity metric, comparing index structures such as Flat, IVF, HNSW and PQ, and finally picking a vector database and integrating it with LangChain.js for production‑grade RAG pipelines.

Embeddings · LangChain · RAG

0 likes · 25 min read

DataFunTalk

Apr 26, 2026 · Artificial Intelligence

Building an Enterprise‑Grade RAG 2.0 System: Architecture, Challenges, and Best Practices

This article analyses the practical construction of an enterprise‑level Retrieval‑Augmented Generation (RAG) 2.0 system, covering background issues of large models, a modular architecture, layered offline/online pipelines, hybrid retrieval, ranking strategies, prompt engineering, and deployment insights drawn from China Mobile’s production experience.

Hybrid Retrieval · RAG · Ranking Models

0 likes · 22 min read

AI Illustrated Series

Apr 26, 2026 · Artificial Intelligence

Build Your First LangChain Agent: A Hands‑On Framework Tutorial

This article walks through a practical, step‑by‑step construction of a LangChain agent—from basic concepts and a simple weather‑query agent to a more complex market‑research agent, adding memory and RAG capabilities, and finally comparing LangChain with LangGraph.

AI agent · LangChain · Memory

0 likes · 15 min read

AI Architect Hub

Apr 26, 2026 · Artificial Intelligence

Embedding Explained: How Vectorization Turns Text into Numbers for RAG

This article walks through why traditional keyword matching fails for RAG, explains the evolution from one‑hot encoding to Word2Vec and BERT, details sentence‑level embeddings and similarity metrics, compares leading Chinese and multilingual embedding models using the C‑MTEB benchmark, and provides practical LangChain code, deployment tips, and common pitfalls.

Chinese NLP · Embedding · LangChain

0 likes · 18 min read

The Dominant Programmer

Apr 25, 2026 · Backend Development

Integrating LangChain4j with Spring Boot for Fast AI Conversations on Alibaba Baichuan

This guide walks through using the SpringAIAlibaba framework to integrate Alibaba Baichuan with Spring Boot via LangChain4j, explains core concepts, compares LangChain4j to Spring AI and OpenAI, and provides step‑by‑step dependency setup, environment configuration, code examples, and a simple browser test.

AI chat · Agent · Alibaba Baichuan

0 likes · 11 min read

AI Architect Hub

Apr 25, 2026 · Artificial Intelligence

How to Feed Massive Documents to a RAG System: Mastering the Art of Text Chunking

This article explains why proper text chunking is critical for Retrieval‑Augmented Generation, illustrates common pitfalls with real‑world examples, compares four chunking strategies (fixed length, recursive, structure‑aware, and code‑aware), and provides practical guidelines for chunk size, overlap, metadata handling, and a production‑ready pipeline.

AI Retrieval · LangChain · RAG

0 likes · 21 min read

Architecture and Beyond

Apr 25, 2026 · Artificial Intelligence

Practical Insights on Recent AI Engineering Deployments

The article examines how large language models function as probabilistic components within deterministic software, discusses fault‑tolerance limits for viable AI use cases, and offers detailed engineering guidance on RAG pipelines, tool‑calling determinism, agent fragility, testing, monitoring, and privacy‑conscious deployment in finance.

AI Engineering · Agent Architecture · LLM

0 likes · 14 min read

Geek Labs

Apr 25, 2026 · Artificial Intelligence

Boost AI Workflow: Personal Knowledge Base with llm_wiki and Evolving Agents

Unlike typical RAG that discards knowledge after each query, the open‑source tools llm_wiki and SkillClaw let you continuously compile a personal knowledge base and evolve AI agents by incrementally storing documents and session‑derived skills, complete with multi‑step processing, community‑tested benchmarks, and cross‑platform support.

AI agents · Knowledge Base · LLM Wiki

0 likes · 7 min read

Ray's Galactic Tech

Apr 24, 2026 · Backend Development

From Bottlenecks to a High‑Concurrency Medical Assistant with LangChain4j

This guide details how to design and implement a production‑grade, high‑concurrency medical AI assistant using LangChain4j, Spring Boot, Redis, and Kubernetes, covering architecture, RAG‑enhanced retrieval, controlled tool invocation, guardrails, idempotent transactions, scaling strategies and observability to ensure reliable, compliant patient interactions.

LangChain4j · RAG · Spring Boot

0 likes · 33 min read
