Tagged articles
919 articles
ITPUB
May 30, 2026 · Artificial Intelligence

Is RAG Dead? How Grep Is Making a Comeback in LLM‑Powered Code Search

This article investigates the claim that Retrieval‑Augmented Generation (RAG) is obsolete by dissecting Claude Code’s grep‑driven search architecture, benchmarking its performance against traditional vector‑based retrieval, comparing it with Cursor and OpenAI Codex, and analyzing the trade‑offs of multi‑round agentic search.

Claude Code · Code search · Cursor
0 likes · 36 min read
AI Engineer Programming
May 30, 2026 · Artificial Intelligence

Should You Pre‑filter or Post‑filter in RAG Vector Search?

The article examines RAG vector retrieval filtering strategies, comparing pre‑filtering (filter before vector search) and post‑filtering (filter after ANN search), and introduces single‑stage filtering, discussing their principles, trade‑offs, suitable scenarios, and architectural implications for accuracy and performance.

ANN · RAG · metadata filtering
0 likes · 15 min read
Digital Planet
May 29, 2026 · Industry Insights

5 Essential Skills Data Professionals Must Master in 2026

In the AI‑driven era of 2026, data professionals need to focus on five high‑impact capabilities—data governance, practical large‑model usage, MLOps, data storytelling, and AI compliance—to stay indispensable, with each skill backed by industry reports, job growth data, and concrete learning pathways.

2026 Trends · AI Compliance · AI Skills
0 likes · 13 min read
AI Engineer Programming
May 29, 2026 · Artificial Intelligence

How to Build a Reliable RAG Test Dataset

The article explains why a structured test set is essential for Retrieval‑Augmented Generation systems, outlines failure modes, describes layered evaluation of retrieval and generation, details infrastructure like chunk IDs and manifests, and provides a complete annotation pipeline with cold‑start and adversarial strategies.

LLM · RAG · adversarial
0 likes · 24 min read
AI Large-Model Wave and Transformation Guide
May 28, 2026 · Artificial Intelligence

Why AI Agent Architecture Mirrors 50 Years of OS Design

The article maps classic operating‑system concepts—processes, system calls, caching, file‑system mounting, and scheduling—to AI agents, showing how these analogies explain challenges like context sharing, tool permissions, token limits, knowledge‑base mounting, and orchestrated execution, and proposes a concrete multi‑layer design framework.

AI agents · Agent Architecture · Context Management
0 likes · 10 min read
AI Engineer Programming
May 28, 2026 · Artificial Intelligence

Claude Code Best Practices and Getting Started Guide for Large Codebases

This guide explains how Claude Code can be deployed in massive monorepos, legacy systems, and distributed repositories, detailing navigation methods, the limits of RAG, the benefits of agentic search, and a five‑layer support system—including CLAUDE.md, hooks, skills, plugins, and MCP servers—to help teams of thousands achieve reliable AI‑assisted coding.

AI coding · CLAUDE.md · Claude Code
0 likes · 18 min read
Su San Talks Tech
May 27, 2026 · Artificial Intelligence

Why Switch from Hand‑Written HTTP Calls to Spring AI for Large‑Model Integration?

The article analyzes the drawbacks of manually coding HTTP calls to large language models—hard‑coded keys, fragile request construction, missing retries, and poor observability—and demonstrates how Spring AI’s layered abstraction, unified configuration, built‑in resilience, function calling, RAG support, and seamless Spring ecosystem integration solve these problems for production‑grade Java applications.

Function Calling · Java · LLM
0 likes · 24 min read
AI Engineer Programming
May 27, 2026 · Artificial Intelligence

MMR for RAG: Low-Cost Chunk Limits Balance Relevance and Diversity

When a long document is split into many highly similar chunks, vector‑based top‑k retrieval tends to return multiple pieces from the same source, causing document dominance; applying a per‑document chunk limit together with Maximal Marginal Relevance (MMR) re‑ranking introduces diversity while preserving relevance, offering a low‑cost way to improve RAG answer quality.
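A minimal Python sketch of the idea described above, assuming chunks are given as `(chunk_id, doc_id, vector)` tuples. The function names, the `lam` trade-off weight, and the `max_per_doc` cap are illustrative choices, not the article's implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_with_doc_cap(query_vec, chunks, top_k=3, lam=0.7, max_per_doc=2):
    """Greedy MMR selection that also caps how many chunks one document may supply.

    chunks: list of (chunk_id, doc_id, vector) tuples (hypothetical layout).
    lam: 1.0 means pure relevance, lower values penalise redundancy more.
    """
    selected = []
    per_doc = {}
    candidates = list(chunks)
    while candidates and len(selected) < top_k:
        best, best_score = None, -math.inf
        for cand in candidates:
            if per_doc.get(cand[1], 0) >= max_per_doc:
                continue  # this document already reached its chunk limit
            relevance = cosine(query_vec, cand[2])
            redundancy = max(
                (cosine(cand[2], chosen[2]) for chosen in selected), default=0.0
            )
            score = lam * relevance - (1.0 - lam) * redundancy
            if score > best_score:
                best, best_score = cand, score
        if best is None:
            break  # every remaining candidate is blocked by the cap
        selected.append(best)
        per_doc[best[1]] = per_doc.get(best[1], 0) + 1
        candidates.remove(best)
    return [chunk_id for chunk_id, _, _ in selected]


# Hypothetical data: three near-duplicate chunks from docA, one chunk from docB.
CHUNKS = [
    ("a1", "docA", [1.0, 0.0]),
    ("a2", "docA", [0.99, 0.1]),
    ("a3", "docA", [0.98, 0.15]),
    ("b1", "docB", [0.6, 0.8]),
]
capped = mmr_with_doc_cap([1.0, 0.0], CHUNKS, top_k=2, max_per_doc=1)
print(capped)
```

With `max_per_doc=1` the second slot must come from a different document, so the single docB chunk is returned even though the docA near-duplicates score higher on raw similarity.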

Chunking · DPP · Diversity
0 likes · 17 min read
Su San Talks Tech
May 25, 2026 · Artificial Intelligence

Mastering RAG: Chunking, Embeddings, BM25 & Multi‑Index Retrieval in Python

This tutorial explains Retrieval‑Augmented Generation (RAG) from fundamentals to a full pipeline, covering text chunking strategies, VoyageAI embeddings, vector‑store implementation, BM25 lexical search, and a multi‑index retriever that fuses semantic and lexical results with Reciprocal Rank Fusion.
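The fusion step can be sketched in a few lines of Python. This is an illustrative version rather than the tutorial's code: the function name is hypothetical, and the constant `k=60` is a commonly used default assumed here.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with ranks starting at 1. Ties are broken by document ID for determinism.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a semantic index and a BM25 index.
semantic = ["d1", "d2", "d3"]
lexical = ["d3", "d1", "d4"]
fused = reciprocal_rank_fusion([semantic, lexical])
print(fused)
```

Because RRF uses only ranks, the semantic and lexical scores never need to be normalised onto a common scale.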

BM25 · Chunking · Embeddings
0 likes · 48 min read
DataFunTalk
May 24, 2026 · Artificial Intelligence

Engineering and Algorithm Innovations for RAG Engines in Office Scenarios

The article analyzes the challenges of deploying large language models in enterprise settings and presents a modular Retrieval‑Augmented Generation (RAG) solution that combines document parsing, multi‑turn query rewriting, hybrid vector‑plus‑BM25 retrieval, two‑stage ranking (RRF, ColBERT, cross‑encoder) and knowledge‑filtered prompt engineering to achieve more comprehensive search, better ranking and more accurate answers.

Document Parsing · Hybrid Retrieval · Knowledge Filtering
0 likes · 22 min read
SuanNi
May 23, 2026 · Artificial Intelligence

Deploy the Open-Source ChatLaw Legal LLM on the SuanWang Platform

This article introduces ChatLaw, an open‑source legal large language model trained on 936,727 real cases, explains its high‑dimensional embedding ChatLaw‑Text2Vec for fast knowledge alignment, and provides a step‑by‑step guide to deploy it on the SuanWang cloud platform using Python and MLU resources.

ChatLaw · Deployment · Embedding
0 likes · 3 min read
James' Growth Diary
May 23, 2026 · Artificial Intelligence

Choosing the Right Retrieval Strategy: Full‑Text vs Vector vs Graph Search

This article breaks down the underlying logic, ideal scenarios, benchmark data, decision trees, and real‑world case studies for full‑text (BM25), vector, and graph retrieval, showing why hybrid approaches dominate production while each technique has distinct strengths and trade‑offs.

Full-Text Search · Hybrid Search · RAG
0 likes · 25 min read
AI Algorithm Path
May 21, 2026 · Artificial Intelligence

Essential Ranking Techniques Every RAG Engineer Must Know

This article explains why ranking is the decisive factor behind successful Retrieval‑Augmented Generation (RAG) pipelines, walks through pointwise, pairwise, and listwise learning‑to‑rank paradigms, details key algorithms such as LambdaMART, compares cross‑encoders with bi‑encoders, and provides practical guidance on metrics, production‑grade rerankers, model fine‑tuning, and framework integration.

Bi-Encoder · Cross-Encoder · LLM
0 likes · 22 min read
DataFunSummit
May 21, 2026 · Artificial Intelligence

Designing Next‑Gen Recommendation and Search with Intelligent Agent Architecture

The article reviews a collection of technical chapters that analyze how multi‑agent AI architectures, large‑language‑model‑enhanced recommendation pipelines, generative ranking for ads, and Elasticsearch‑based vector RAG are applied to build next‑generation recommendation and search systems, citing concrete designs, performance numbers and real‑world deployments.

AI agents · Elasticsearch · Generative Ranking
0 likes · 6 min read
James' Growth Diary
May 21, 2026 · Databases

Building a Neo4j Knowledge Graph: Entity Modeling, Cypher Queries, and LangChain Integration

This article walks through why graph databases excel at multi‑hop queries, compares Neo4j with relational and vector stores, explains core concepts of nodes, relationships and properties, shows Docker setup, demonstrates six common Cypher patterns, integrates LangChain for LLM‑generated queries, and shares production‑grade modeling tips and pitfalls.

Cypher · Graph Database · LangChain
0 likes · 19 min read
大转转FE
May 21, 2026 · Artificial Intelligence

Why AI Buzzwords Multiply Faster Than My Hair Falls

The article maps three generations of AI engineering—Prompt Engineering, Context Engineering, and Harness Engineering—explaining their core capabilities, key terms like LLM, RAG, Agent, and evaluation methods, while offering practical tips, pitfalls, and a concise three‑question checklist to stay grounded amid the rapid influx of new AI jargon.

AI · Agent · Harness
0 likes · 19 min read
AI Engineer Programming
May 21, 2026 · Artificial Intelligence

RAG with Multimodal Inputs vs LLM + Toolchains: Handling Non‑Text Data

The article analyzes how large language models process only tokenized text, compares the traditional LLM‑plus‑toolchain pipeline with emerging multimodal models, evaluates their cost, speed, controllability, and hallucination risks, and proposes a hybrid architecture that matches each approach to specific document scenarios.

LLM · Multimodal · RAG
0 likes · 16 min read
James' Growth Diary
May 20, 2026 · Artificial Intelligence

Boosting RAG Retrieval Quality with Cohere Rerank and Cross‑Encoder

After achieving high recall with hybrid Elasticsearch and vector search, the article shows how inserting a reranker—either Cohere's cloud API or a local Cross‑Encoder—compresses the top‑20 candidates to the most relevant three to five, dramatically improving answer accuracy, cutting token costs, and detailing a dual‑track implementation for production and development environments.

Cohere · Cross-Encoder · LangChain
0 likes · 22 min read
Tech Minimalism
May 20, 2026 · Artificial Intelligence

How Karpathy’s Markdown Wiki Redefines LLM Knowledge Management

The article examines the LLM Wiki concept introduced by Karpathy, explaining how a Markdown‑based wiki maintained outside the LLM context can persist and evolve model understanding, compares it with RAG, note‑taking tools and traditional knowledge bases, and outlines architectural components, risks, and practical guidelines.

AI · Knowledge Base · LLM
0 likes · 14 min read
AI Engineer Programming
May 20, 2026 · Artificial Intelligence

Why Chunk‑Based RAG Fails and How IdeaBlocks Improve Retrieval

The article argues that the common assumption that text chunks are the proper knowledge unit in RAG pipelines is flawed, leading to versioning, metadata, and redundancy problems, and demonstrates that replacing chunks with structured IdeaBlocks dramatically reduces corpus size, token usage, and improves vector relevance.

IdeaBlock · LLM · RAG
0 likes · 10 min read
dbaplus Community
May 19, 2026 · Artificial Intelligence

From RAG to GraphRAG: How Huolala Raised Metadata Retrieval Accuracy from 56% to 78%

The article details Huolala's transition from a basic Retrieval‑Augmented Generation (RAG) system to a GraphRAG architecture, explaining the challenges of traditional RAG, the design of offline and online stages, multi‑index hybrid search, concrete performance metrics (accuracy up to 78%, knowledge recall 91%, Top‑K 90%, MRR 0.73), and future plans such as stronger hybrid retrieval, reranking, and Agentic RAG.

AI · GraphRAG · Hybrid Search
0 likes · 15 min read
AI Large-Model Wave and Transformation Guide
May 19, 2026 · Industry Insights

Which Company Will Shape the Future of Enterprise AI: Anthropic or Palantir?

The article compares Anthropic's lightweight, knowledge‑externalizing AI approach with Palantir's heavyweight data‑semantic and governance platform, arguing that Chinese B‑end firms should initially adopt Anthropic‑style quick‑value layers and later integrate Palantir‑style controls to build a sustainable enterprise AI operation layer.

AI Ops · Anthropic · China B2B
0 likes · 10 min read
DataFunSummit
May 18, 2026 · Artificial Intelligence

How Palantir’s Ontology‑Based Semantic Network Drove 85% Growth and Zero Churn

Palantir’s Q1 2026 revenue jumped 85% while many AI firms saw valuations collapse, and the company attributes its success to replacing cheap‑token LLM wrappers with a deep ontology‑driven semantic network that secures high‑risk AI deployments, creates a durable moat, and delivers unprecedented net‑retention.

AI infrastructure · Palantir · RAG
0 likes · 10 min read
AgentGuide
May 18, 2026 · Artificial Intelligence

AI Agent Essentials: Tokens, Skills, RAG, MCP, SDD & Harness Engineering

The article explains AI Agents as LLM‑based entities with planning, memory, and tool‑use capabilities, covering model pre‑training, fine‑tuning, hallucinations, the Model Context Protocol (MCP), tokenization, Retrieval‑Augmented Generation (RAG), multi‑layer memory, Skill packaging, ReAct reasoning‑action loops, self‑reflection, Harness engineering, and Spec‑Driven Development (SDD).

AI agent · Harness Engineering · LLM
0 likes · 9 min read
dbaplus Community
May 17, 2026 · Artificial Intelligence

Why Grep Is Replacing Vector Indexes: RAG Isn’t Dead, It’s Evolving

The article dissects Claude Code’s LLM‑driven Grep search, showing how multi‑round tool calls replace static vector‑based RAG, presents ripgrep performance benchmarks, compares Claude Code with Cursor and Codex, and argues that zero‑index search is optimal for local code bases while larger projects still need indexing.

Claude Code · Code search · Grep
0 likes · 36 min read
IT Services Circle
May 17, 2026 · Artificial Intelligence

60 Essential AI Terms Every Programmer Should Master

This article walks programmers through 60 core AI concepts—from the basics of large language models and tokens to advanced topics like prompt engineering, retrieval‑augmented generation, fine‑tuning, and inference optimization—organized into progressive skill levels and illustrated with concrete examples and code snippets.

AI · Inference Optimization · RAG
0 likes · 25 min read
Tech Minimalism
May 16, 2026 · Artificial Intelligence

One‑Page Guide to the Three RAG Architectures: Classic, Graph, and Agentic

The article explains why plain large language models cannot answer internal company questions, introduces Retrieval‑Augmented Generation (RAG) as a solution, and compares three RAG variants—Classic, Graph, and Agentic—detailing their workflows, strengths, limitations, and how to choose the right one for a given problem.

Agentic RAG · Classic RAG · Graph RAG
0 likes · 17 min read
AI Engineer Programming
May 16, 2026 · Artificial Intelligence

How to Boost RAG Retrieval Quality: Real‑World Cost‑Benefit Analysis

This article examines practical ways to improve Retrieval‑Augmented Generation (RAG) retrieval quality—covering vector database choices, data chunking, embedding models, query expansion, and re‑ranking—while weighing performance gains against operational costs through multiple real‑world case studies.

LLM · RAG · Re‑ranking
0 likes · 16 min read
Su San Talks Tech
May 15, 2026 · Artificial Intelligence

Understanding Rerank in Retrieval‑Augmented Generation (RAG)

The article explains why a reranking step is essential in RAG pipelines, describes how it refines the initial vector‑search results, compares mainstream rerank techniques, discusses practical engineering choices such as candidate set size and model selection, and outlines how to evaluate and tune rerank performance.

Cross-Encoder · LLM · Model selection
0 likes · 15 min read
DeepHub IMBA
May 14, 2026 · Artificial Intelligence

How HyDE Transforms RAG Retrieval from Keyword Matching to Intent Understanding

The article explains how Hypothetical Document Embeddings (HyDE) improve Retrieval‑Augmented Generation by generating a synthetic answer before vector search, allowing the system to embed richer semantic intent rather than relying on shallow keyword similarity, and provides a step‑by‑step implementation using LangChain.

HyDE · LLM · LangChain
0 likes · 6 min read
AntData
May 14, 2026 · Artificial Intelligence

How RAG‑Powered DB‑GPT Enables Intelligent Marine‑Environment Queries with Text2SQL

The article presents a private‑deployed DB‑GPT solution that combines Retrieval‑Augmented Generation (RAG) and Text2SQL to address low utilization of unstructured marine‑environment knowledge, cross‑source data querying difficulties, and security concerns, detailing technical selection, implementation steps, and performance gains that reduce query time from 30 minutes to 1‑3 minutes.

AI · DB-GPT · Knowledge retrieval
0 likes · 13 min read
AI Engineer Programming
May 14, 2026 · Artificial Intelligence

RAG Retrieval: Comparing Bi-encoder and Cross-encoder Architectures

The article reviews the three‑step RAG pipeline, explains why retrieval quality hinges on fast, accurate semantic matching, contrasts Bi-encoder’s offline vector indexing and speed with Cross-encoder’s token‑level interaction and higher precision, and discusses hybrid solutions such as ColBERT and LLM rerankers with practical engineering guidelines.

Bi-Encoder · ColBERT · Cross-Encoder
0 likes · 10 min read
ITPUB
May 13, 2026 · Databases

Is the Hype Around Vector Databases a Pseudo‑Demand in the AI Era?

The article questions whether dedicated vector databases are truly needed for AI applications, examining market hype, the rapid emergence of many vector‑DB products, real‑world examples like PostgreSQL pgvector and major vendor integrations, and the hidden costs of data fragmentation and operational complexity.

AI · PostgreSQL · RAG
0 likes · 15 min read
Machine Heart
May 13, 2026 · Artificial Intelligence

From 0 to 193 Logins in 88 Days: Evidence‑Driven AI Empowers 5 Million Chinese Doctors

Facing overwhelming patient loads and unreliable AI hallucinations, Chinese doctors turned to a new medical AI that combines low‑hallucination retrieval‑augmented generation, PICO‑GRADE evidence structuring, reward‑based model alignment and expert‑in‑the‑loop feedback, delivering clinically vetted answers in seconds and gaining 193 logins within 88 days.

AI · RAG · clinical-decision-support
0 likes · 16 min read
ITPUB
May 12, 2026 · Industry Insights

Why Pinecone Is Dismantling Its Own RAG Paradigm

In May 2026 Pinecone announced the end of its Retrieval‑Augmented Generation (RAG) approach, unveiling the Nexus knowledge engine and KnowQL query language to address the structural inefficiencies of RAG for AI agents, and positioning this shift as a strategic industry‑wide pivot.

AI agents · KnowQL · Knowledge Compilation
0 likes · 8 min read
DataFunSummit
May 12, 2026 · Artificial Intelligence

15 Critical Questions on Why Enterprise AI Agents Need Business Ontology

The article analyzes why large language models and RAG alone cannot meet enterprise AI needs, argues that a business ontology provides essential semantic grounding for agents, outlines ontology construction methods, demonstrates hybrid search improvements, and shares real‑world case studies showing dramatic efficiency gains.

AI agents · Hybrid Search · RAG
0 likes · 16 min read
Architecture Digest
May 12, 2026 · Artificial Intelligence

Tencent Open‑Sources WeKnora: An AI‑Powered Document Understanding Framework

WeKnora, Tencent's newly open‑source framework built on the IMA kernel, combines LLM and RAG to parse unstructured PDFs, Word files and scans with over 300% speed improvement and 89% top‑10 retrieval precision, offering modular deployment, secure private‑cloud options, and seamless integration with vector databases and the WeChat ecosystem.

Knowledge Base · LLM · Open Source
0 likes · 8 min read
DeepHub IMBA
May 11, 2026 · Artificial Intelligence

2026 RAG Selection Guide: How to Choose Between Vector, Graph, and Vectorless

This article compares traditional Vector RAG, GraphRAG, and the newer Vectorless RAG, explains why Vector RAG fails on relational and structured queries, presents benchmark results, outlines each architecture's strengths and costs, and offers a decision framework and Adaptive RAG routing strategy for production systems.

Adaptive Retrieval · GraphRAG · LLM
0 likes · 13 min read
IT Services Circle
May 11, 2026 · Artificial Intelligence

Can Claude’s Code Generation Replace Agent Memory Systems? Understanding CLAUDE.md, Memory, and RAG

The article explains why large language model agents need dedicated memory systems to overcome limited context windows, outlines short‑term and long‑term memory architectures, storage forms, functional categories, lifecycle operations, control‑policy research, compares leading products, and presents best‑practice engineering guidelines for building scalable, privacy‑aware agent memory pipelines.

Agent Memory · Control Policy · Long-term Memory
0 likes · 25 min read
James' Growth Diary
May 11, 2026 · Artificial Intelligence

Mastering RAG Evaluation: Recall@K, MRR, NDCG, and RAGAS Explained

This article breaks down RAG evaluation into a two‑layer framework, explains the four core metrics—Recall@K, MRR, NDCG, and the four RAGAS scores—shows how to implement them with LangChain.js, highlights common pitfalls, and offers scenario‑specific metric combinations for reliable performance monitoring.
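For reference, the three retrieval metrics can be written out directly. The article implements them with LangChain.js. The version below is a plain-Python sketch with hypothetical function names, using the standard definitions and a log2 discount for NDCG.

```python
import math


def recall_at_k(retrieved, relevant, k):
    """Fraction of the relevant documents that appear in the top-k results."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    return len(set(retrieved[:k]) & relevant) / len(relevant)


def reciprocal_rank(retrieved, relevant):
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(retrieved, start=1):
        if doc_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs):
    """Average reciprocal rank over (retrieved, relevant) pairs, one per query."""
    if not runs:
        return 0.0
    return sum(reciprocal_rank(r, rel) for r, rel in runs) / len(runs)


def ndcg_at_k(retrieved, gains, k):
    """NDCG@k with graded relevance; gains maps doc_id to a relevance grade."""
    dcg = sum(
        gains.get(doc_id, 0) / math.log2(rank + 1)
        for rank, doc_id in enumerate(retrieved[:k], start=1)
    )
    ideal = sorted(gains.values(), reverse=True)[:k]
    idcg = sum(g / math.log2(rank + 1) for rank, g in enumerate(ideal, start=1))
    return dcg / idcg if idcg else 0.0


# Hypothetical single-query example.
retrieved = ["a", "b", "c"]
print(recall_at_k(retrieved, {"b", "d"}, k=2))
print(reciprocal_rank(retrieved, {"b"}))
print(ndcg_at_k(retrieved, {"a": 1, "b": 1}, k=2))
```

Recall@K ignores ordering inside the top K, MRR looks only at the first relevant hit, and NDCG rewards placing higher-graded documents earlier, which is why the three are usually reported together.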

LangChain · MRR · NDCG
0 likes · 20 min read
Smart Workplace Lab
May 10, 2026 · Artificial Intelligence

When Your Internal AI Is Fed Bad Data, How to Fix It?

The article recounts a real incident where an AI‑generated SOP cited outdated policy because a knowledge base was overloaded with unchecked historical documents, then outlines a step‑by‑step protocol—including corpus cleaning, version locking, and isolation zones—to prevent data contamination and ensure reliable AI outputs.

AI · Data cleaning · Knowledge Base
0 likes · 7 min read
James' Growth Diary
May 10, 2026 · Artificial Intelligence

Syncing Vectors with Changing Documents: Add, Update, Delete Made Simple

This article walks through why keeping a vector store consistent with a mutable knowledge base is challenging, explains the three failure points, introduces hash‑based incremental syncing, shows idempotent add, proper update and soft‑delete workflows, covers embedding model upgrades, and presents a production‑grade event‑driven architecture with common pitfalls and remedies.
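A minimal sketch of hash-based incremental syncing, assuming the vector store keeps one content hash per document ID from the last sync. The `plan_sync` helper and its return layout are hypothetical and only illustrate the comparison step, not the embedding or write path.

```python
import hashlib


def content_hash(text):
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def plan_sync(documents, indexed_hashes):
    """Decide which documents need embedding work.

    documents: {doc_id: current text} from the knowledge base.
    indexed_hashes: {doc_id: hash recorded when the vectors were last written}.
    """
    to_add, to_update, unchanged = [], [], []
    for doc_id, text in documents.items():
        digest = content_hash(text)
        if doc_id not in indexed_hashes:
            to_add.append(doc_id)
        elif indexed_hashes[doc_id] != digest:
            to_update.append(doc_id)
        else:
            unchanged.append(doc_id)  # identical content, skip re-embedding
    to_delete = [doc_id for doc_id in indexed_hashes if doc_id not in documents]
    return {
        "add": sorted(to_add),
        "update": sorted(to_update),
        "delete": sorted(to_delete),
        "unchanged": sorted(unchanged),
    }


# Hypothetical state: "faq" was edited, "old" was removed, "new" was created.
indexed = {
    "faq": content_hash("v1 of the FAQ"),
    "guide": content_hash("setup guide"),
    "old": content_hash("retired page"),
}
current = {"faq": "v2 of the FAQ", "guide": "setup guide", "new": "release notes"}
plan = plan_sync(current, indexed)
print(plan)
```

Running the same plan twice against an up-to-date index yields only `unchanged` entries, which is what makes the add step idempotent.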

Hash Deduplication · LangChain · RAG
0 likes · 17 min read
DataFunTalk
May 10, 2026 · Artificial Intelligence

Exploring Multimodal GraphRAG: Combining Document Intelligence, Knowledge Graphs, and Large Models

This article presents a detailed technical walkthrough of multimodal GraphRAG, covering document‑intelligence parsing pipelines, multimodal graph index construction, knowledge‑graph‑driven chunk linking, recent research progress, performance trade‑offs, and practical recommendations for deploying RAG solutions.

Document Intelligence · GraphRAG · Multimodal Retrieval
0 likes · 23 min read
IT Services Circle
May 9, 2026 · Artificial Intelligence

How to Choose Between LangChain and LlamaIndex: Core Use‑Case Comparison for Agent Development

The article analyzes the design philosophies, key components, strengths, and weaknesses of LangChain and LlamaIndex, explains their distinct core scenarios—complex multi‑step agent orchestration versus private‑data RAG—and shows how they can be combined in real projects while outlining emerging ecosystem trends.

Agent · LLM · LangChain
0 likes · 13 min read
AI Engineer Programming
May 9, 2026 · Artificial Intelligence

Why PDF Parsing Is Hard for RAG and Which Mainstream Solutions Work

The article examines the intrinsic challenges of extracting structured text from PDFs for Retrieval‑Augmented Generation—such as missing reading order, table reconstruction, font encoding, and scanned images—and compares lightweight libraries, AI‑enhanced frameworks, commercial APIs, and visual language models as practical solutions.

AI frameworks · OCR · PDF parsing
0 likes · 23 min read
AI Step-by-Step
May 8, 2026 · Artificial Intelligence

How LLM Wiki Transforms Personal Agent Knowledge Management

LLM Wiki, proposed by Andrej Karpathy, replaces repetitive RAG retrieval for personal agents with a three‑layer markdown‑based knowledge base that separates raw sources, curated wiki pages, and schema constraints, enabling durable, auditable memory, structured updates, health checks, and a hybrid Wiki‑RAG workflow.

AI · Knowledge Base · LLM Wiki
0 likes · 17 min read
AI Engineer Programming
May 8, 2026 · Artificial Intelligence

Is Non-Vector RAG the Next Generation of Retrieval‑Augmented Generation?

The article analyses the relevance and accuracy shortcomings of traditional vector‑based RAG, explains how non‑vector approaches like PageIndex let LLMs navigate document trees for relevance classification and auditability, and evaluates their complexity, latency, metadata risks, and suitable use cases compared with hybrid retrieval.

Hybrid Retrieval · LLM · RAG
0 likes · 8 min read
Architect's Guide
May 7, 2026 · Artificial Intelligence

Spring AI 2.0 vs LangChain4j: Which Should You Choose?

The article provides a side‑by‑side analysis of Spring AI 2.0 and LangChain4j, comparing their goals, version alignment, programming models, RAG and agent capabilities, ecosystem integration, learning curve, and operational considerations to help Java teams decide which library best fits their project constraints.

AI agents · Java · LLM integration
0 likes · 11 min read
Lao Guo's Learning Space
May 6, 2026 · Artificial Intelligence

Why Your RAG Keeps Missing the Mark: Enterprise‑Level Pitfall Guide

This article examines why Retrieval‑Augmented Generation systems that work in demos often fail in production, detailing common pitfalls—from chunking and vector‑database selection to hybrid retrieval and re‑ranking—and offers concrete strategies, configuration tips, and a decision tree to build reliable enterprise‑grade RAG solutions.

Chunking · Hybrid Retrieval · RAG
0 likes · 12 min read
Old Zhang's AI Learning
May 6, 2026 · Artificial Intelligence

Solving RAG’s Biggest Pain Point: Introducing the Open‑Source CocoIndex

The article argues that the main pain point for RAG and agent contexts is stale data rather than chunking or reranking, and introduces CocoIndex, a Rust‑based incremental engine with a declarative Python API that offers fresh, delta‑processed context, automatic schema evolution, and production‑grade features, demonstrated through PDF‑to‑Markdown pipelines and a podcast knowledge‑graph case study.

Python · RAG · Rust
0 likes · 13 min read
AI Engineer Programming
May 6, 2026 · Artificial Intelligence

How to Evaluate and Choose Embedding Models for RAG Systems

This article explains why embedding models are the foundation of RAG pipelines, outlines concrete evaluation metrics such as MTEB v2 scores, latency, throughput and cost, compares a range of commercial and open‑source models, and discusses emerging trends like multimodal and long‑context embeddings.

MTEB · Model selection · Multimodal
0 likes · 13 min read
Su San Talks Tech
May 6, 2026 · Information Security

What Is Prompt Injection? Attack Vectors and Defense Strategies

The article explains that prompt injection is a new LLM security threat in which attackers blur the line between instruction and data, outlines direct and indirect injection techniques (command overriding, role‑play jailbreaks, encoding obfuscation, and multi‑turn attacks), and proposes a defense‑in‑depth framework with input filtering, prompt design, output validation, least‑privilege architecture, and specialized safeguards for RAG and agent scenarios.

AI safety · Agent · Defense in Depth
0 likes · 15 min read
java1234
May 5, 2026 · Artificial Intelligence

Spring AI 2.0: New Video Tutorial Series Empowers Java Developers with AI

The author announces a refreshed Spring AI 2.0 video tutorial series and provides a detailed overview of the framework’s design goals, provider‑agnostic API, full‑type model support, Spring integration, enterprise value, typical use cases, and a comparison with competing Java AI libraries.

AI Framework · Java · LangChain4j
0 likes · 7 min read
AI Engineering
May 4, 2026 · Artificial Intelligence

Why the Big‑Model Race Is Over: Where Real Value Lies in AI Infrastructure

The article argues that the competition over which large language model will dominate is outdated, explaining that true value now comes from building multi‑model routing, context engineering, standardized tool protocols, intelligent orchestration, and robust evaluation layers that turn models into reliable AI infrastructure.

AI infrastructure · MCP · Model routing
0 likes · 6 min read
DataFunTalk
May 4, 2026 · Artificial Intelligence

Engineering and Algorithm Innovations for RAG Engines in Office Applications

This article analyzes the challenges and practical solutions of building a Retrieval‑Augmented Generation (RAG) system for office scenarios, covering background issues, modular architecture, offline and online pipelines, hybrid retrieval, ranking models, knowledge filtering, prompt design, and two‑stage generation techniques.

AI · Document Parsing · Hybrid Retrieval
0 likes · 22 min read
PMTalk Product Manager Community
May 4, 2026 · Product Management

2026 AI Product Manager: The Essential Capability Model

By 2026, AI product managers must shift from merely using models to delivering stable, valuable results, mastering seven core abilities—demand judgment, evaluation-driven iteration, context design, RAG strategy, agent orchestration, solution planning, and rapid Vibe Coding—to close the loop between business needs and AI capabilities.

AI product management · Agent Design · Context Engineering
0 likes · 13 min read
AI Engineer Programming
May 4, 2026 · Artificial Intelligence

RAG in the Long-Context Era: Challenges, Benchmarks, and Context Engineering

The article analyzes how expanding LLM context windows to millions of tokens reshapes Retrieval‑Augmented Generation, detailing chunking trade‑offs, embedding retrieval limits, the U‑shaped attention distribution, benchmark results, and the emerging practice of Context Engineering for optimal end‑to‑end pipelines.

Embedding Retrieval · LLM · RAG
0 likes · 10 min read
AI Architect Hub
May 3, 2026 · Artificial Intelligence

Choosing the Right Vector Database: Milvus, Chroma, Weaviate, Qdrant, FAISS Compared

This article compares five popular vector databases—Chroma, Milvus, Weaviate, Qdrant, and FAISS—detailing their positions, strengths, weaknesses, suitable scenarios, a selection‑dimension matrix, common pitfalls, code implementations for a unified RAG pipeline, best‑practice recommendations, and thought questions to guide engineers in choosing and migrating vector stores.

Chroma · FAISS · Milvus
0 likes · 23 min read
DataFunSummit
May 3, 2026 · Artificial Intelligence

From Flawed to Production-Ready: Deep Dive into Building Enterprise-Grade RAG Systems

The article analyzes why early RAG deployments often fall short, dissects the most common technical pain points—from document parsing to vector overload—and presents a systematic roadmap that includes hybrid search, reranking, GraphRAG, Agentic RAG, model selection, scalability tricks, and security controls for robust B‑side production.

Agentic RAG · GraphRAG · Hybrid Search
0 likes · 20 min read
Spring Full-Stack Practical Cases
May 3, 2026 · Artificial Intelligence

9 Advanced Retrieval‑Augmented Generation (RAG) Architectures Explained

This article introduces Retrieval‑Augmented Generation (RAG) and systematically details nine distinct RAG architectures—standard, conversational with memory, corrective (CRAG), adaptive, self‑RAG, fusion, HyDE, agentic, and Graph RAG—highlighting their workflows, real‑world examples, advantages, and trade‑offs.

AI Architecture · GraphRAG · LLM
0 likes · 17 min read
AI Engineer Programming
May 2, 2026 · Artificial Intelligence

From Demo to Production: How to Evaluate RAG Effectively

This guide outlines a comprehensive RAG evaluation framework covering failure modes, multi‑layer metrics, test‑set construction, open‑source tools, CI/CD quality gates, production monitoring, and special considerations for agentic RAG to ensure reliable, trustworthy retrieval‑augmented generation systems.

AI · LLM · Metrics
0 likes · 18 min read
DataFunSummit
May 1, 2026 · Artificial Intelligence

How Agentic Architectures Power the Next‑Gen Recommendation and Search Systems

This article summarizes a technical ebook that analyzes the evolution of recommendation and search systems—from deep‑learning models to large‑language‑model agents—detailing multi‑agent RAG architectures, Huawei’s KAR knowledge adapters, Baidu’s generative ranking (GRAB), Elasticsearch vector search, and performance results such as a 1.5% AUC lift and GPU‑accelerated throughput gains.

Elasticsearch · Generative Ranking · Multi-Agent Architecture
0 likes · 6 min read
AI Engineer Programming
May 1, 2026 · Artificial Intelligence

From Naive Retrieval to Knowledge Runtime: The Full Evolution of RAG

The article traces the evolution of Retrieval‑Augmented Generation from its 2020 Naive baseline through Advanced, Modular, Graph, and Agentic generations, detailing architectural shifts, optimization techniques, self‑correction mechanisms, and future challenges such as long‑context handling and multimodal retrieval.

LLM · RAG · agentic
0 likes · 14 min read
Alibaba Cloud Big Data AI Platform
May 1, 2026 · Artificial Intelligence

Zero Deployment, Zero Ops: Alibaba Cloud Milvus Embedding Service Makes Vectorization Plug‑and‑Play

The article explains how Alibaba Cloud's Milvus Embedding Service eliminates the need for self‑hosted embedding models by integrating model inference, vector generation and Milvus indexing into a managed pipeline, dramatically reducing deployment complexity, operational overhead, and time‑to‑value for semantic search, RAG and multimodal retrieval use cases.

Alibaba Cloud · Embedding · Milvus
0 likes · 19 min read
DeepHub IMBA
Apr 30, 2026 · Artificial Intelligence

Why Real RAG Systems Need Both BM25 and Vector Search

The article analyzes how BM25 excels at exact token matching while vector embeddings capture semantic intent, explains their distinct failure modes, and shows that a hybrid retriever—combined with metadata filtering, proper chunking, and reciprocal rank fusion—delivers the most reliable results for RAG pipelines.

BM25 · Embedding · Hybrid Retrieval
0 likes · 17 min read
AI Architect Hub
Apr 30, 2026 · Artificial Intelligence

How AI Understands Your Queries: Core Techniques of Semantic Vector Search

The article explains why traditional keyword search often fails when user questions differ from knowledge‑base wording, introduces semantic search that matches queries and documents via vector similarity, details query understanding and rewriting techniques, lists common pitfalls, provides a full Python implementation, and shares best‑practice recommendations.

AI · Python · RAG
0 likes · 16 min read
DataFunSummit
Apr 30, 2026 · Industry Insights

Why Palantir’s Edge Isn’t Unique – Chinese Enterprises Can Replicate Its Methodology

A panel of industry experts dissected Palantir’s rapid growth, revealing that its advantage lies in a systematic ontology‑driven methodology rather than exclusive technology, and argued that Chinese firms can adopt the same approach if they first resolve data governance, semantic consistency, and management challenges.

AI agents · Capability vs Competency · Palantir
0 likes · 26 min read
MeowKitty Programming
Apr 29, 2026 · Artificial Intelligence

10 Must‑Try Open‑Source AI Projects for Java Developers: RAG, Agents, Knowledge Bases, and Text‑to‑SQL

This article curates ten open‑source AI projects on Gitee that Java developers can use to learn RAG pipelines, AI agents, knowledge‑base construction, Text‑to‑SQL, workflow orchestration, and multi‑model integration, offering concrete use cases, learning goals, and guidance on selecting a learning path.

AI · Java · LangChain4j
0 likes · 13 min read
Machine Heart
Apr 29, 2026 · Artificial Intelligence

Doc‑V*: Reading Only 5 Pages Beats RAG on 80‑Page Docs – 10 Key Insights

Doc‑V* introduces a dynamic, thumbnail‑driven approach that lets a model decide which pages to read, achieving a 49.7% improvement over RAG variants on multi‑page document QA benchmarks without larger models or longer context windows, and demonstrates how strategic evidence acquisition outperforms naïve full‑document reading.

AI · RAG · document understanding
0 likes · 10 min read
Kuaishou Tech
Apr 29, 2026 · Operations

Boosting Oncall Interception from 15% to 55%: KOncall’s AI‑Driven Evolution at Kuaishou

Kuaishou’s R&D efficiency team built the KOncall intelligent on‑call platform, integrating LLM‑based retrieval‑augmented generation, Redis Pub/Sub streaming, OCR multimodal parsing, FAQ knowledge ops, and custom reranking, which raised automated query interception from 15% to 55% and processed over 116,000 requests, turning on‑call from a bottleneck into a capability starter.

AI Operations · Incident Management · LLM
0 likes · 26 min read
MaGe Linux Operations
Apr 28, 2026 · Artificial Intelligence

Why Your RAG Performance Is Poor: Common Issues and Optimization Strategies

This article systematically analyzes why Retrieval‑Augmented Generation pipelines often underperform—covering embedding model selection, chunking strategies, hybrid retrieval, reranking, context window waste, evaluation metrics, and a detailed troubleshooting checklist—while providing concrete code examples and best‑practice recommendations for engineers.

Chunking · Embedding · Hybrid Retrieval
0 likes · 19 min read
360 Tech Engineering
Apr 28, 2026 · Artificial Intelligence

How 360 AI Institute Boosted Airline Translation Accuracy from 70% to 96%

The 360 AI Research Institute tackled the zero‑tolerance translation demands of airline maintenance by building a specialized parallel corpus and applying RAG‑enhanced, SFT‑fine‑tuned, and RL‑reinforced models, raising Chinese‑to‑English translation accuracy from 70% to 96% and enabling a one‑month rollout.

AI translation · RAG · SFT
0 likes · 5 min read
AI Illustrated Series
Apr 28, 2026 · Artificial Intelligence

Comprehensive Interview Guide: LangChain & LangGraph Frameworks

This article provides a detailed, question‑and‑answer style walkthrough of LangChain and LangGraph, covering their core concepts, components, workflow patterns, memory mechanisms, LCEL syntax, graph construction, conditional edges, loops, multi‑agent collaboration, persistence, and a comparison with LlamaIndex, offering concrete code examples and practical insights for AI interview preparation.

AI Framework · Agent · LCEL
0 likes · 32 min read
Node.js Tech Stack
Apr 28, 2026 · Artificial Intelligence

Turn Your Article Collection into an LLM‑Powered Wiki with a Single Skill

This article walks through using the youdaonote‑llm‑wiki skill to automatically ingest a set of Markdown articles into a cloud‑synced Youdao Note knowledge base, generate structured Wiki pages, perform cross‑document queries with citations, and keep the repository up‑to‑date, while comparing it to Karpathy's original script‑based approach.

AI agents · LLM Wiki · RAG
0 likes · 14 min read
AI Illustrated Series
Apr 27, 2026 · Artificial Intelligence

Comprehensive RAG Interview Q&A: 22 In-Depth Questions and Answers

This extensive interview guide covers 22 core RAG questions, detailing the definition, workflow, embedding selection, vector database choices, retrieval optimization, multi‑turn handling, context compression, evaluation metrics, knowledge‑graph integration, operational challenges, Agentic and hybrid RAG, document update strategies, similarity algorithms, and hallucination mitigation, providing concrete examples and practical advice for AI interview preparation.

AI Interview · Embedding · Knowledge retrieval
0 likes · 29 min read
SuanNi
Apr 27, 2026 · Artificial Intelligence

Agent Skills Explained: Definition, Structure, and Engineering Practices

This article breaks down the official Anthropic definition of Agent Skills, shows how they are simple file‑system‑based, composable units stored in SKILL.md, scripts, references and assets, and explains the three‑layer progressive‑disclosure loading model, discovery, selection, execution, composition patterns, security, version‑control integration and evaluation practices.

AI · Agent Skills · Composable
0 likes · 14 min read
Architect's Tech Stack
Apr 27, 2026 · Artificial Intelligence

Can Your RAG System Pass the Demo and Remain Accurate Across 5,000 Documents?

The article dissects a tough interview question about building a production‑grade Retrieval‑Augmented Generation (RAG) system that not only works in a demo but also delivers stable, correct answers over a knowledge base of 5,000 documents, covering chunking, hybrid retrieval, intent routing, constrained generation, evaluation metrics, and operational safeguards.

Hybrid Retrieval · Intent Routing · RAG
0 likes · 15 min read
Data Party THU
Apr 27, 2026 · Artificial Intelligence

Three Overlooked Failure Points in RAG Pipelines and How to Build a Feedback Loop

The article analyzes silent failures in Retrieval‑Augmented Generation pipelines, identifies three gaps—retrieval relevance, LLM confidence masking uncertainty, and missing fault signals—and presents a practical feedback‑loop architecture with relevance gating, post‑generation evaluation, session tracing, and user‑signal logging to make production RAG systems trustworthy.

LLM · Observability · RAG
0 likes · 13 min read
Wu Shixiong's Large Model Academy
Apr 27, 2026 · Artificial Intelligence

Can Your RAG Pass the Demo? Scaling to 5,000 Docs for Reliable Answers

The article walks through the practical challenges of turning a RAG demo into a production system for 5,000 insurance documents, covering knowledge‑base chunking, embedding model selection, recall‑threshold tuning, hybrid vector‑BM25 retrieval, intent‑aware query routing, prompt constraints, confidence scoring, and operational scaling, with concrete metrics and code examples.

Embedding · Hybrid Retrieval · RAG
0 likes · 16 min read
Java Web Project
Apr 27, 2026 · Artificial Intelligence

DeepSeek V4 Meets Claude Code: A Cost‑Effective Leap in Open‑Source LLM Performance

The DeepSeek V4 preview, released quietly on April 24, offers two models with a 1 M‑token context and pricing at 1/16 that of Claude Opus, achieving near‑parity performance on SWE‑bench and LiveCodeBench, while integration with Claude Code enables rapid project understanding, bug detection, refactoring, testing and documentation, saving days of work for under ¥6.

Agentic Coding · Claude Code · Code Refactoring
0 likes · 15 min read
The Dominant Programmer
Apr 27, 2026 · Artificial Intelligence

Building a Private Document Vector Search with SpringBoot, LangChain4j, and Ollama RAG

This guide walks through why Retrieval‑Augmented Generation (RAG) is needed for large language models, explains the three‑step indexing and query workflow, details LangChain4j’s core components, and provides a complete SpringBoot example—including Maven setup, configuration, service code, and troubleshooting—to create a private document‑vector search system powered by Ollama.

Embedding · LangChain4j · Ollama
0 likes · 13 min read
James' Growth Diary
Apr 26, 2026 · Databases

Vector Database Fundamentals: Embedding, Similarity Search, and Index Structures Explained in One Go

This article walks through the complete workflow of turning split text into high‑dimensional vectors, choosing the right embedding model, selecting an appropriate similarity metric, comparing index structures such as Flat, IVF, HNSW and PQ, and finally picking a vector database and integrating it with LangChain.js for production‑grade RAG pipelines.

Embeddings · LangChain · RAG
0 likes · 25 min read
DataFunTalk
Apr 26, 2026 · Artificial Intelligence

Building an Enterprise‑Grade RAG 2.0 System: Architecture, Challenges, and Best Practices

This article analyses the practical construction of an enterprise‑level Retrieval‑Augmented Generation (RAG) 2.0 system, covering background issues of large models, a modular architecture, layered offline/online pipelines, hybrid retrieval, ranking strategies, prompt engineering, and deployment insights drawn from China Mobile’s production experience.

Hybrid Retrieval · RAG · Ranking Models
0 likes · 22 min read
AI Illustrated Series
Apr 26, 2026 · Artificial Intelligence

Build Your First LangChain Agent: A Hands‑On Framework Tutorial

This article walks through a practical, step‑by‑step construction of a LangChain agent—from basic concepts and a simple weather‑query agent to a more complex market‑research agent, adding memory and RAG capabilities, and finally comparing LangChain with LangGraph.

AI agent · LangChain · Memory
0 likes · 15 min read
AI Architect Hub
Apr 26, 2026 · Artificial Intelligence

Embedding Explained: How Vectorization Turns Text into Numbers for RAG

This article walks through why traditional keyword matching fails for RAG, explains the evolution from one‑hot encoding to Word2Vec and BERT, details sentence‑level embeddings and similarity metrics, compares leading Chinese and multilingual embedding models using the C‑MTEB benchmark, and provides practical LangChain code, deployment tips, and common pitfalls.

Chinese NLP · Embedding · LangChain
0 likes · 18 min read
The Dominant Programmer
Apr 25, 2026 · Backend Development

Integrating LangChain4j with Spring Boot for Fast AI Conversations on Alibaba Baichuan

This guide walks through using the SpringAIAlibaba framework to integrate Alibaba Baichuan with Spring Boot via LangChain4j, explains core concepts, compares LangChain4j to Spring AI and OpenAI, and provides step‑by‑step dependency setup, environment configuration, code examples, and a simple browser test.

AI chat · Agent · Alibaba Baichuan
0 likes · 11 min read
AI Architect Hub
Apr 25, 2026 · Artificial Intelligence

How to Feed Massive Documents to an RAG System: Mastering the Art of Text Chunking

This article explains why proper text chunking is critical for Retrieval‑Augmented Generation, illustrates common pitfalls with real‑world examples, compares four chunking strategies (fixed length, recursive, structure‑aware, and code‑aware), and provides practical guidelines for chunk size, overlap, metadata handling, and a production‑ready pipeline.

AI Retrieval · LangChain · RAG
0 likes · 21 min read
Architecture and Beyond
Apr 25, 2026 · Artificial Intelligence

Practical Insights on Recent AI Engineering Deployments

The article examines how large language models function as probabilistic components within deterministic software, discusses fault‑tolerance limits for viable AI use cases, and offers detailed engineering guidance on RAG pipelines, tool‑calling determinism, agent fragility, testing, monitoring, and privacy‑conscious deployment in finance.

AI Engineering · Agent Architecture · LLM
0 likes · 14 min read
Geek Labs
Apr 25, 2026 · Artificial Intelligence

Boost AI Workflow: Personal Knowledge Base with llm_wiki and Evolving Agents

Unlike typical RAG that discards knowledge after each query, the open‑source tools llm_wiki and SkillClaw let you continuously compile a personal knowledge base and evolve AI agents by incrementally storing documents and session‑derived skills, complete with multi‑step processing, community‑tested benchmarks, and cross‑platform support.

AI agents · Knowledge Base · LLM Wiki
0 likes · 7 min read
Ray's Galactic Tech
Apr 24, 2026 · Backend Development

From Bottlenecks to a High‑Concurrency Medical Assistant with LangChain4j

This guide details how to design and implement a production‑grade, high‑concurrency medical AI assistant using LangChain4j, Spring Boot, Redis, and Kubernetes, covering architecture, RAG‑enhanced retrieval, controlled tool invocation, guardrails, idempotent transactions, scaling strategies and observability to ensure reliable, compliant patient interactions.

LangChain4j · RAG · Spring Boot
0 likes · 33 min read