Tagged articles

920 articles

Mar 6, 2024 · Artificial Intelligence

What the New “Generative AI Act Two” Reveals About the Next AI Wave

Sequoia Capital’s “Generative AI Act Two” report highlights a shift from hype‑driven model releases to user‑centric, end‑to‑end solutions, emphasizing the rise of foundational models as components, the importance of developer tools, emerging RAG and fine‑tuning techniques, and the evolving competitive landscape.

AI Market · Foundational models · RAG

0 likes · 6 min read

JD Retail Technology

Mar 4, 2024 · Artificial Intelligence

How JD Retail Integrates LLMs with SFT, RAG, and AI Agents for Real-World Impact

This article examines JD Retail's end‑to‑end large language model framework that combines supervised fine‑tuning, retrieval‑augmented generation, and ReAct‑based AI agents to overcome retail‑specific challenges, improve model accuracy, reduce hallucinations, and enable autonomous multi‑step business workflows.

AI agent · Artificial Intelligence · Industry Insights

0 likes · 20 min read

AI Large Model Application Practice

Mar 1, 2024 · Artificial Intelligence

Why LangGraph Is Needed: Extending LangChain with Loops and Fine‑Grained Agent Control

LangGraph, introduced in LangChain 0.1, addresses the limitations of simple Chains by adding loop capabilities and detailed control over Agent execution, enabling complex multi‑Agent, RAG, and self‑repair scenarios through a state‑graph architecture.

LCEL · LangChain · LangGraph

0 likes · 12 min read

Alibaba Cloud Big Data AI Platform

Feb 27, 2024 · Artificial Intelligence

Build a Knowledge‑Enhanced LLM Chatbot with Alibaba Cloud PAI: A Step‑by‑Step RAG Guide

This comprehensive guide walks AI developers through building a Retrieval‑Augmented Generation (RAG) chatbot on Alibaba Cloud PAI, covering architecture, vector store setup, model deployment, knowledge ingestion, multi‑modal retrieval, fusion, re‑ranking, prompt design, and end‑to‑end configuration with code examples.

Alibaba Cloud · Chatbot · LLM

0 likes · 26 min read

Rare Earth Juejin Tech Community

Feb 25, 2024 · Artificial Intelligence

Pinecone Vector Database and Embedding Model Summary from DeepLearning.AI’s AI Course

This article reviews the author’s hands‑on experience with Pinecone’s serverless vector database and with several embedding and generation models (all‑MiniLM‑L6‑v2, text‑embedding‑ada‑002, clip‑ViT‑B‑32, and GPT‑3.5‑turbo‑instruct), and demonstrates how they are applied to semantic search, RAG, recommendation, hybrid search, and facial similarity tasks using Python code examples.

AI · Pinecone · Python

0 likes · 9 min read

AI Large Model Application Practice

Feb 23, 2024 · Artificial Intelligence

How to Build a Text‑to‑SQL Chatbot with Vanna’s Open‑Source RAG Framework

This guide explains Vanna, an open‑source Python RAG framework for Text2SQL, covering its core concepts, RAG‑based architecture, step‑by‑step model training, code examples for customization, and how to deploy a conversational database chatbot with a Flask web UI.

Chatbot · Database · LLM

0 likes · 11 min read

Cloud Native Technology Community

Feb 8, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts LLM Accuracy and Trust

Retrieval‑augmented generation (RAG) enhances large language models by fetching up‑to‑date, authoritative information from external sources, addressing hallucinations, outdated knowledge, and lack of citations, while offering cost‑effective implementation, improved relevance, user trust, and greater developer control through vector databases, semantic search, and prompt engineering.

AI · RAG · large language models

0 likes · 10 min read

Baobao Algorithm Notes

Feb 4, 2024 · Industry Insights

Balancing Fun, Utility, and Slow Thinking: The Future of AI Agents

In this talk, the speaker examines the dual goals of AI agents—being entertaining and useful—while introducing the concepts of fast and slow thinking, multimodal perception, long‑term memory, retrieval‑augmented generation, and tool integration as essential steps toward building truly valuable digital companions.

AI Agents · Future AI · Long-term Memory

0 likes · 18 min read

Rare Earth Juejin Tech Community

Jan 31, 2024 · Artificial Intelligence

Advanced RAG with Semi‑Structured Data Using LangChain, Unstructured, and ChromaDB

This tutorial demonstrates how to build an advanced Retrieval‑Augmented Generation (RAG) system for semi‑structured PDF data by leveraging LangChain, the unstructured library, ChromaDB vector store, and OpenAI models, covering installation, PDF partitioning, element classification, summarization, and query execution.

AI · ChromaDB · LangChain

0 likes · 11 min read

Data Thinking Notes

Jan 7, 2024 · Artificial Intelligence

Boost Text2SQL Accuracy with Retrieval‑Augmented Generation and LangChain

This article explains how Retrieval‑Augmented Generation (RAG) can improve LLM‑based Text2SQL conversion, covering RAG fundamentals, LangChain implementation steps, practical enhancements for SQL agents, and future directions for integrating domain knowledge.

AI Agents · LLM · LangChain

0 likes · 16 min read

DaTaobao Tech

Dec 27, 2023 · Artificial Intelligence

Deploying a Private LLM Knowledge Base on a MacBook

The guide walks through installing and quantizing the open‑source ChatGLM3‑6B model and the m3e‑base embedder on a MacBook, wrapping them with a FastAPI OpenAI‑compatible service, routing requests through a One‑API gateway, storing metadata in MongoDB and vectors in PostgreSQL pgvector, deploying FastGPT for RAG, ingesting data, and demonstrating 5‑7 second response times, while outlining future improvements.

ChatGLM3 · Deployment · FastAPI

0 likes · 23 min read

AI Large Model Application Practice

Dec 12, 2023 · Artificial Intelligence

Boost Enterprise LLM Performance: Solving Common RAG Challenges

This article explains Retrieval‑Augmented Generation for enterprise LLMs, outlines four production‑grade problems, and presents practical solutions such as parent‑child chunking, multi‑vector and multi‑query retrieval, and context‑aware question refinement with concrete prompts and workflow diagrams.

LLM · RAG

0 likes · 13 min read

Baobao Algorithm Notes

Dec 6, 2023 · Artificial Intelligence

How to Systematically Fix Bad Cases in Large Language Models

The article outlines a structured approach to identifying, categorizing, evaluating impact, and repairing undesirable responses from large language models, covering both model‑level interventions across training stages and practical inference‑time techniques such as parameter tuning, prompt engineering, RAG, and pre/post‑processing safeguards.

RAG · bad case remediation · inference tuning

0 likes · 9 min read

DataFunTalk

Nov 17, 2023 · Databases

Cost as the Primary Driver of Vector Database Industry Development

Vector databases are gaining traction because they cut the costs of storage, learning, and scaling, as well as the cost of working around large‑model limitations. They do so through semantic similarity search, RAG‑based prompt optimization, efficient high‑dimensional indexing, and cloud‑native architectures. The article argues that this makes them essential for modern AI applications, though it is written in a promotional context.

AI · Big Data · RAG

0 likes · 11 min read

Architect

Nov 8, 2023 · Artificial Intelligence

AI Agents Unleashed: From Assistants API to Multi‑Agent Frameworks

The article dissects the rise of AI agents—from OpenAI's Assistants API and multimodal perception‑brain‑action pipelines to retrieval‑augmented generation, tool‑use strategies, single‑ and multi‑agent deployments, and emerging frameworks like AutoGen—while highlighting concrete examples, benchmark results, and current limitations.

AI Agents · Assistants API · Embodied AI

0 likes · 38 min read

AI Large Model Application Practice

Oct 18, 2023 · Artificial Intelligence

How to Extract and Embed Tables and Images from PDFs for Multimodal RAG

This article explains a practical approach to parsing PDFs containing text, tables, and images, using the open‑source Unstructured library and the LLaVA model, then embedding each modality into a vector store with multi‑vector retrieval to enable accurate semantic search in private‑knowledge RAG pipelines, with optional LangChain integration.

Embeddings · LLM · LangChain

0 likes · 12 min read

dbaplus Community

Oct 14, 2023 · Artificial Intelligence

Demystifying Retrieval‑Augmented Generation: From Theory to Working Chatbot

This guide explains the Retrieval‑Augmented Generation (RAG) technique, detailing how user queries are matched to private knowledge bases, how relevant passages are retrieved, and how large language models use those passages to generate context‑aware answers, complete with code examples and practical tips.

Chatbot · Embedding · LLM

0 likes · 19 min read

phodal

Sep 24, 2023 · Artificial Intelligence

Designing a JVM‑Based LLM Framework: Insights from Chocolate Factory

This article explores the design principles, architectural decisions, and practical code examples behind the Chocolate Factory framework, a JVM‑centric LLM development platform inspired by LangChain, LlamaIndex, Spring AI, and PromptFlow, highlighting SDK construction, RAG workflows, and prompt engineering challenges.

AI development · Framework · JVM

0 likes · 11 min read

phodal

Sep 3, 2023 · Artificial Intelligence

Engineering LLM Applications: Architecture, Prompt Modeling, and Multi‑Language Strategies

This article shares practical insights from months of building LLM proof‑of‑concepts, covering language‑agnostic architectures, FFI integration, prompt engineering, RAG patterns, DSL design, and four core architectural principles for scalable AI applications.

AI Architecture · DSL · FFI

0 likes · 13 min read

Java High-Performance Architecture

Aug 18, 2023 · Databases

Redis 7.2 Unified Release: Boost AI, Vector Search, and Real‑Time Functions

Redis 7.2, the first Unified Redis Release, introduces AI‑ready vector indexing, hybrid semantic search, scalable RAG support, server‑side Triggers and Functions, enhanced geospatial queries, and a preview of high‑performance searchable indexes, while expanding client library support and integrating Redis Data Integration for seamless enterprise data pipelines.

AI · Database · RAG

0 likes · 8 min read