Tagged articles
2077 articles
AI Large Model Application Practice
Dec 2, 2024 · Artificial Intelligence

Master CrewAI: Build Multi‑Agent Systems Quickly with Flows and a Full Demo

This article introduces CrewAI, a high‑level Python framework for constructing multi‑agent systems, explains its core concepts such as Crew, Agent, Tool, Task and Process, walks through a complete demo with code, evaluates its strengths and limitations, and showcases the new Flows feature for more flexible workflow orchestration.

AI Framework, CrewAI, Flows
0 likes · 15 min read
JavaEdge
Dec 1, 2024 · Artificial Intelligence

Exploring the Limits and Benchmarks of Qwen’s QwQ‑32B‑Preview AI Model

QwQ‑32B‑Preview, an experimental AI model from the Qwen team, showcases strong reasoning in math and programming while facing challenges like language switching, inference loops, safety concerns, and variable capabilities across domains, with benchmark scores ranging from 50% to over 90% on tests such as GPQA, AIME, MATH‑500, and LiveCodeBench.

AI Benchmark, LLM, Machine Learning
0 likes · 7 min read
AsiaInfo Technology: New Tech Exploration
Nov 29, 2024 · Artificial Intelligence

How GraphRAG Transforms Global QA with Structured Retrieval

This article examines GraphRAG—a graph‑enhanced Retrieval‑Augmented Generation approach—detailing its core concepts, the practical challenges of deploying it in enterprise settings, and the engineering solutions and future directions that enable more accurate, efficient, and explainable global question‑answering systems.

Global QA, GraphRAG, LLM
0 likes · 16 min read
Alibaba Cloud Developer
Nov 28, 2024 · Artificial Intelligence

Understanding Tokenizers and Embeddings in Large Language Models

This article introduces the core concepts of tokenizers and embeddings in large language models, explains how they convert text into numeric IDs and dense vectors, compares different tokenization strategies, and provides practical JavaScript and TensorFlow.js code examples for beginners.

AI fundamentals, JavaScript, LLM
0 likes · 10 min read
Sohu Tech Products
Nov 27, 2024 · Artificial Intelligence

RAG Technology and Practical Application in Multi-Modal Query: Using Chinese-CLIP and Redis Search

The article explains how Retrieval‑Augmented Generation (RAG) outperforms direct LLM inference by enabling real‑time knowledge updates and lower costs, and demonstrates a practical multi‑modal RAG pipeline that uses Chinese‑CLIP for vector encoding, various chunking strategies, and Redis Search for fast vector storage and retrieval.

Chinese-CLIP, Chunking, LLM
0 likes · 17 min read
Alibaba Cloud Big Data AI Platform
Nov 27, 2024 · Artificial Intelligence

How to Train, Evaluate, and Deploy Qwen2.5-Coder on Alibaba Cloud PAI‑QuickStart

This guide walks developers through the entire lifecycle of Qwen2.5‑Coder—covering model sizes, training token expansion, resource requirements, fine‑tuning with SFT/DPO, evaluation on custom and public datasets, and one‑click deployment and compression on Alibaba Cloud's PAI‑QuickStart platform.

Deployment, LLM, PAI-QuickStart
0 likes · 15 min read
Baobao Algorithm Notes
Nov 24, 2024 · Artificial Intelligence

How Marco‑o1 Merges Chain‑of‑Thought Fine‑Tuning with Monte‑Carlo Tree Search for Superior Reasoning

The article introduces Marco‑o1, an open‑source LLM that enhances complex reasoning by fine‑tuning on Chain‑of‑Thought data, integrating Monte‑Carlo Tree Search, introducing mini‑step actions and a reflection mechanism, and evaluates its performance on multilingual math and translation benchmarks.

Artificial Intelligence, LLM, Monte Carlo Tree Search
0 likes · 15 min read
System Architect Go
Nov 24, 2024 · Artificial Intelligence

Building a Web Voice Chatbot with Whisper, llama.cpp, and LLM

This article demonstrates how to build a web‑based voice chatbot by integrating Whisper speech‑to‑text, llama.cpp LLM inference, and WebSocket communication, detailing both the frontend JavaScript implementation and the Python FastAPI backend, along with Docker deployment and example code.

FastAPI, JavaScript, LLM
0 likes · 10 min read
DaTaobao Tech
Nov 20, 2024 · Mobile Development

MNN-Transformer: Efficient On‑Device Large Language and Diffusion Model Deployment

MNN‑Transformer provides an end‑to‑end framework that enables large language and diffusion models to run efficiently on modern smartphones by exporting, quantizing (including dynamic int4/int8 and KV cache compression) and executing via a plugin‑engine runtime, achieving up to 35 tokens/s decoding and 2‑3× faster image generation compared with existing on‑device solutions.

LLM, MNN, Quantization
0 likes · 15 min read
System Architect Go
Nov 19, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG) System Overview and Implementation with LangChain, Redis, and llama.cpp

This article explains the concept, architecture, and step‑by‑step implementation of Retrieval Augmented Generation (RAG), covering indexing, retrieval & generation processes, a practical LangChain‑Redis‑llama.cpp example on Kubernetes, code snippets, test results, challenges, and references.

AI, Embedding, LLM
0 likes · 6 min read
dbaplus Community
Nov 16, 2024 · Artificial Intelligence

Are LLM Frameworks Overhyped? A Critical Look at RAG and Reusability

The article critiques LLM frameworks, comparing them to early ORM tools, explains how Retrieval Augmented Generation works, warns against premature optimization, and advises developers to favor simple, visible practices over complex, abstracted frameworks for better control and understanding.

AI, LLM, ModelEvaluation
0 likes · 7 min read
Alibaba Cloud Developer
Nov 15, 2024 · Frontend Development

How to Build Real‑Time LLM Streaming in the Browser with Fetch

This article explains the mechanism of HTTP API streaming for large language models and shows step‑by‑step how front‑end developers can use the Fetch API, readable streams, and incremental UI updates to deliver real‑time, progressive results while handling errors and connection interruptions.

Front-end, HTTP streaming, JavaScript
0 likes · 9 min read
Linux Kernel Journey
Nov 14, 2024 · Artificial Intelligence

Deep Dive: How DeepFlow Collects Business Metrics for Large‑Model Services

This article explains how China Mobile built a hybrid‑cloud production environment for its customer‑service LLM, using eBPF and WebAssembly plugins from DeepFlow to achieve zero‑intrusion observability, automatically capture full‑stack topology, application/network metrics, and key LLM business indicators such as TTFT, TPOT, and token throughput.

DeepFlow, Grafana, LLM
0 likes · 19 min read
Baobao Algorithm Notes
Nov 14, 2024 · Artificial Intelligence

How I Built a 1B‑Parameter Chinese LLM on a Single A100: Lessons Learned

This article details the end‑to‑end process of pre‑training, fine‑tuning, and evaluating a 1‑billion‑parameter Chinese LLM named Steel‑LLM on limited hardware, covering data collection, pipeline design, training framework choices, architectural tweaks, performance results, and practical lessons for resource‑constrained developers.

LLM, Training Optimization, data pipeline
0 likes · 18 min read
Aikesheng Open Source Community
Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL, with a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBA, Fault Diagnosis, Knowledge Base
0 likes · 10 min read
JD Cloud Developers
Nov 11, 2024 · Artificial Intelligence

Mastering Prompt Engineering: History, Techniques, and Real-World Applications

This article explains what Prompt Engineering is, traces its evolution from early NLP commands to modern adaptive and multimodal prompting, details core techniques such as Zero‑shot, Chain‑of‑Thought, and Auto‑CoT along with methods for reducing hallucinations, and showcases a logistics case study using various prompting strategies.

AI, LLM, Prompt Engineering
0 likes · 26 min read
Fighter's World
Nov 11, 2024 · Artificial Intelligence

How CoCounsel’s $650M Acquisition Reveals Key Design Principles for LLM‑Powered Legal Tools

The article examines how Casetext’s CoCounsel, an AI‑driven legal assistant acquired by Thomson Reuters for $650 million, achieved rapid growth by prioritizing accuracy, workflow integration, user‑centered design, security, and continuous improvement, and distills the critical challenges and success factors for building LLM‑native products in low‑tolerance B2B environments.

AI ethics, B2B SaaS, LLM
0 likes · 11 min read
Alibaba Cloud Observability
Nov 8, 2024 · Cloud Native

Enable Python Probe for LLM Observability on Alibaba Cloud ACK

This guide explains how to integrate Alibaba Cloud's Python probe into a Kubernetes (ACK) environment to monitor large language model (LLM) applications, covering prerequisites, installation steps, Dockerfile modifications, resource permissions, and sample Python code for both server and client components.

ARMS, Cloud Native, Docker
0 likes · 16 min read
CSS Magic
Nov 8, 2024 · Artificial Intelligence

LLM Application Development Tips (3): Exploring LLM API Inputs and Outputs

This article explains how to configure key OpenAI chat completion parameters—such as temperature, top_p, streaming, response format, and tool selection—and walks through the structure of the API's JSON response, highlighting fields like id, model, choices, finish_reason, and usage for better control and cost estimation.

AI agents, API parameters, JSON response
0 likes · 8 min read
Alimama Tech
Nov 6, 2024 · Artificial Intelligence

How AI Generates Synchronized Video Narrations for E‑Commerce

This article presents the research behind Synchronized Video Storytelling, introducing the E‑SyncVidStory dataset, the VideoNarrator multimodal architecture, and extensive experiments that demonstrate high‑quality, product‑aware video narration generation for e‑commerce applications.

LLM, dataset, e‑commerce
0 likes · 12 min read
37 Interactive Technology Team
Nov 4, 2024 · Artificial Intelligence

Developing RAG and Agent Applications with LangChain: A Case Study of an AI Assistant for Activity Components

The article outlines a step‑by‑step methodology for creating Retrieval‑Augmented Generation and custom Agent applications with LangChain, illustrated by an AI assistant for activity components that evolves from a rapid Dify prototype to a LangChain‑based RAG system and finally a hand‑crafted ReAct‑style agent, detailing LCEL chain composition, vector‑search integration, model performance trade‑offs, and a unified routing layer.

AI Assistant, Cloud-native, Data Warehouse
0 likes · 6 min read
Baobao Algorithm Notes
Nov 4, 2024 · Artificial Intelligence

Uncovering 16 Limits of AI Search Engines and 16 Design Recommendations

A user study with 21 participants reveals sixteen critical limitations of generative AI search engines, maps them to eight quantitative metrics, proposes sixteen design recommendations, and evaluates You.com, Perplexity and BingChat against this framework to highlight current performance gaps.

AI Search, LLM, Metrics
0 likes · 12 min read
CSS Magic
Nov 1, 2024 · Artificial Intelligence

Refining System Prompts for LLMs: Practical Tips for Batch Automation

This article explains how to automate batch document processing with LLM APIs by mastering the messages parameter, defining system, user, and assistant roles, and iteratively polishing system prompts through scripts or OpenAI's GPTs editor and Playground interfaces.

ChatGPT, LLM, OpenAI API
0 likes · 7 min read
DataFunTalk
Oct 31, 2024 · Artificial Intelligence

Tencent OlaChat: An LLM‑Powered Intelligent Business Intelligence Platform – Architecture, Capabilities, and Practice

This article presents the evolution from traditional to intelligent BI, explores how large language models enable natural‑language data analysis, details the OlaChat platform’s architecture, metadata‑enhanced retrieval methods, Text2SQL pipeline, multi‑turn dialogue system, and shares practical deployment insights and Q&A.

Intelligent Analytics, LLM, Metadata Retrieval
0 likes · 20 min read
NewBeeNLP
Oct 31, 2024 · Artificial Intelligence

How o1 Is Redefining LLM Engineering and What It Means for AI Professionals

The article examines OpenAI's o1 model, highlighting its unprecedented scientific capabilities, its shift from a chat toy to a high‑value tool, the potential impact on algorithm engineers, and the technical directions (RLHF, MCTS, PPO, PRM) that practitioners should master to stay relevant.

AI, LLM, model analysis
0 likes · 8 min read
DaTaobao Tech
Oct 30, 2024 · Artificial Intelligence

Understanding OpenAI o1: Chain‑of‑Thought, Scaling Laws, and Training Strategies

The article explains how OpenAI’s o1 model leverages chain‑of‑thought prompting, dual‑system cognitive theory, and new scaling laws—pre‑training on code/math and post‑training reinforcement with step‑wise reward models—to achieve superior reasoning, safety, and performance over GPT‑4, heralding a shift toward models that learn to think.

LLM, Safety, chain-of-thought
0 likes · 42 min read
Baobao Algorithm Notes
Oct 29, 2024 · Industry Insights

Inside Perplexity AI: How RAG Powers the Next‑Gen Search Engine

In this interview, Perplexity AI CEO Aravind Srinivas explains the company’s retrieval‑augmented generation architecture, multi‑model strategy, vector‑database use, competitive positioning against Google, monetization plans, and future product road‑map, offering a deep industry perspective on AI‑driven search.

AI startup, LLM, Perplexity AI
0 likes · 38 min read
CSS Magic
Oct 29, 2024 · Artificial Intelligence

LLM Application Development Tips (1): How to Choose the Right Model

With a growing array of overseas and domestic LLM APIs in 2024, this guide explains how to pick the right model—starting with a top‑tier option like GPT‑4o for feasibility testing, then moving to cost‑effective or Chinese alternatives, while weighing price, inference speed, context window, API compatibility, and rate limits.

API compatibility, Chinese LLM, GPT-4o
0 likes · 8 min read
DevOps
Oct 27, 2024 · Artificial Intelligence

Best Practices for Building Efficient Retrieval‑Augmented Generation (RAG) Systems

This article reviews Wang et al.'s 2024 research on Retrieval‑Augmented Generation, outlining optimal practices such as query classification, chunk sizing, hybrid metadata search, embedding selection, vector databases, query transformation, reranking, document repacking, summarization, fine‑tuning, and multimodal retrieval to guide developers in constructing high‑performance RAG pipelines.

LLM, Query Classification, RAG
0 likes · 11 min read
Alibaba Cloud Native
Oct 26, 2024 · Artificial Intelligence

Build a Real‑Time Semantic Search with EventBridge, DashVector, and FunctionCompute

This tutorial walks through constructing a zero‑to‑one RAG pipeline that ingests OSS text files via EventBridge, transforms them into embeddings with DashScope, stores vectors in DashVector, and performs semantic search using FunctionCompute and a Qwen‑Turbo LLM, complete with code samples and configuration steps.

DashVector, Embedding, EventBridge
0 likes · 10 min read
System Architect Go
Oct 25, 2024 · Artificial Intelligence

Designing and Extending a Self‑Built ChatGPT System: Architecture, Session Management, and Scaling Strategies

This article explains how to construct a ChatGPT‑like conversational system by detailing the core dialogue flow, adding session and history management with a database, defining REST APIs, and exploring extensions such as caching, elastic scaling, and production‑ready deployment considerations.

ChatGPT, LLM, Scalability
0 likes · 7 min read
Baobao Algorithm Notes
Oct 25, 2024 · Artificial Intelligence

How to Use Importance Sampling for Effective Continue Pretraining of LLMs

Continued pretraining (CP) sits between pretraining and SFT and injects domain knowledge, but it is prone to catastrophic forgetting. This article explores using importance sampling to balance general and domain data, discusses data selection and annealing strategies, and offers practical tips for mitigating forgetting while strengthening specialized capabilities.

Catastrophic Forgetting, Continue Pretraining, Importance Sampling
0 likes · 8 min read
System Architect Go
Oct 24, 2024 · Artificial Intelligence

How to Fine‑Tune Translation Models on Kubernetes Docs with LoRA

This article walks through the complete process of fine‑tuning both domain‑specific and large‑language translation models on Kubernetes documentation, covering data preparation, model selection, training configurations, the differences between Seq2Seq and CausalLM, and how LoRA can dramatically reduce resource usage while improving performance.

AI, LLM, LoRA
0 likes · 7 min read
21CTO
Oct 23, 2024 · Artificial Intelligence

IBM Unveils Granite 3.0 LLMs: Open‑Source, Secure, and Cost‑Effective AI Models

IBM introduced the Granite 3.0 series, an open‑source family of large language models that combine cutting‑edge performance with enhanced security, multi‑language support, and cost‑efficiency, while offering a variety of base, instruct, and specialist variants for enterprise use.

AI models, Granite, IBM
0 likes · 4 min read
DaTaobao Tech
Oct 23, 2024 · Artificial Intelligence

Retrieval-Augmented Generation (RAG): Principles, Applications, Limitations and Challenges

Retrieval-Augmented Generation (RAG) combines a retriever that fetches relevant external documents and a generator that uses them, improving LLM accuracy, relevance, privacy, and up-to-date information, but faces challenges such as retrieval latency, computational cost, chunking strategies, embedding selection, and system integration complexity.

AI, Knowledge retrieval, LLM
0 likes · 13 min read
Baidu Geek Talk
Oct 23, 2024 · Artificial Intelligence

Integrating Yuan 2.0 Large Model with PaddleNLP: Overview, Usage Steps, and Interaction Examples

The open‑source Yuan 2.0 large model is fully integrated into Baidu’s PaddleNLP, offering quick inference for tasks like code generation, translation, and reasoning, along with efficient distributed training and fine‑tuning features such as Zero Padding optimization, enabling developers to easily deploy and customize the model via simple setup steps and example interactions.

AI, Java, LLM
0 likes · 10 min read
Alibaba Cloud Big Data AI Platform
Oct 22, 2024 · Artificial Intelligence

How Alibaba Cloud Optimizes Enterprise RAG: Key Techniques for AI Search

At the 2024 Alibaba Cloud Yunqi Conference, senior AI Search expert Xing Shaomin detailed the enterprise‑grade Retrieval‑Augmented Generation (RAG) pipeline, covering critical link architecture, effectiveness, performance, and cost optimizations, as well as practical applications, vector store enhancements, LLM agents, and deployment strategies.

AI Search, Enterprise AI, LLM
0 likes · 16 min read
DataFunSummit
Oct 18, 2024 · Artificial Intelligence

Building Efficient RAG Applications with a Small Team: Insights from PingCAP AI Lab

This article details how PingCAP's three‑person AI Lab leveraged Retrieval‑Augmented Generation (RAG) techniques—including basic RAG, fine‑tuned embeddings, re‑ranking, graph RAG, and agent‑based RAG—to create scalable, multilingual document‑question answering services while addressing large‑scale documentation challenges, model limitations, and user feedback loops.

Embedding, LLM, RAG
0 likes · 14 min read
NewBeeNLP
Oct 16, 2024 · Artificial Intelligence

Unlocking Long-Sequence LLMs: Position Embeddings, Scaling, and Efficient Attention

This article reviews recent advances in training and inference for long‑sequence large language models, comparing ALiBi and RoPE position embeddings, exploring RoPE scaling techniques, analyzing attention optimizations, and outlining practical data, evaluation, and system frameworks for scalable LLM deployment.

Flash Attention, LLM, RoPE
0 likes · 14 min read
Baobao Algorithm Notes
Oct 16, 2024 · Artificial Intelligence

How the DB3 Team Won the Meta CRAG RAG Challenge: Prompts, Retrieval, and LoRA Fine‑Tuning

This article analyzes the Meta Comprehensive RAG (CRAG) benchmark, detailing its three tasks, evaluation metrics, and the champion DB3 team's end‑to‑end solution that combines data preprocessing, dual‑stage retrieval, prompt engineering, LoRA‑based fine‑tuning, and public data augmentation to achieve top scores across all tasks.

Knowledge Graph, LLM, LoRA
0 likes · 17 min read
Baobao Algorithm Notes
Oct 13, 2024 · Artificial Intelligence

Can Hierarchical LLMs Transform Sequential Recommendation? A Deep Dive

This article provides a comprehensive analysis of the HLLM paper, detailing its hierarchical LLM architecture for item and user modeling, the training objectives, fusion strategies, extensive offline and online experiments, scaling behavior, ablation studies, and practical deployment insights in large‑scale recommendation systems.

Industrial Deployment, LLM, Scaling Law
0 likes · 12 min read
JD Tech
Oct 13, 2024 · Artificial Intelligence

Building a Simple Local AI Question‑Answer System with Java, LangChain4J, Ollama, and ChromaDB

This article guides readers through the concepts of large language models, embeddings, vector databases, and Retrieval‑Augmented Generation, then demonstrates step‑by‑step how to set up Ollama, install a local Chroma vector store, configure Maven dependencies, and write Java code using LangChain4J to build and test a functional AI Q&A application.

AI, Java, LLM
0 likes · 22 min read
AntTech
Oct 12, 2024 · Artificial Intelligence

Observations from ISSTA 2024: Conference Highlights, Awarded Papers, Keynotes, and In‑Depth Reviews

The article reports on ISSTA 2024, the 33rd edition of the conference, held in Vienna, summarizing its acceptance statistics, highlighting the Impact Paper Award and Distinguished Papers, detailing keynotes on large‑language‑model‑driven software quality, and providing extensive reviews of selected research works ranging from fuzzing and program repair to database query simplification and AI‑oriented code generation.

ISSTA2024, LLM, ProgramRepair
0 likes · 29 min read
21CTO
Oct 10, 2024 · Artificial Intelligence

5 Practical AI Projects to Build Your Skills with Python

This article presents five hands‑on AI project ideas—from resume optimization to multimodal search—complete with step‑by‑step instructions, required Python libraries, and code snippets, helping beginners and intermediate developers quickly build valuable AI applications.

AI, LLM, Python
0 likes · 12 min read
JD Tech Talk
Oct 8, 2024 · Artificial Intelligence

Building a Retrieval‑Augmented Generation (RAG) System with Rust and Qdrant

This article explains how to construct a Retrieval‑Augmented Generation pipeline in Rust, covering knowledge‑base creation with Qdrant, model loading and embedding using the candle library, data ingestion, and integration of a Rust‑based inference service based on mistral.rs, while also discussing resource usage and common pitfalls.

AI, Embedding, LLM
0 likes · 16 min read
Baobao Algorithm Notes
Oct 7, 2024 · Artificial Intelligence

Mastering LLM Supervised Fine‑Tuning: Practical Tips, Data Strategies, and Debugging

This article provides a comprehensive, experience‑driven guide to supervised fine‑tuning (SFT) of large language models, covering special tokens, latency considerations, data diversity and production, training frameworks and hyper‑parameters, over‑/under‑fitting diagnostics, and evaluation metrics such as helpfulness, honesty, and harmlessness.

AI, Data Engineering, LLM
0 likes · 40 min read
Fighter's World
Sep 30, 2024 · Artificial Intelligence

Exploring Google NotebookLM: Use Cases, Interaction Experience, and Key Insights

The author reviews Google NotebookLM, describing how it aids deep paper reading, encourages users to keep chatting through guided prompts, maintains conversation coherence through self‑play insights, highlights the audio‑overview feature, and reflects on AI concepts such as the "bitter lesson" and the limits of self‑play in open scenarios.

AI research, Audio Generation, Google
0 likes · 22 min read
21CTO
Sep 30, 2024 · Artificial Intelligence

How LLM‑Powered IDEs Can Cut Your Coding Time in Half

Using an LLM-powered IDE, the author built a full‑stack weekend project without writing a single line of code, discovering faster development cycles, new debugging habits, and the strengths and limits of AI assistants compared to traditional Google searches.

AI coding, Debugging, LLM
0 likes · 10 min read
JD Cloud Developers
Sep 29, 2024 · Artificial Intelligence

Build a Local AI Q&A System with Java, Ollama, and LangChain4J

This article walks through building a local AI question‑answer system using Java, Ollama, LangChain4J, embeddings, and a Chroma vector database, covering LLM fundamentals, embedding techniques, RAG architecture, setup steps, Maven dependencies, and sample code to retrieve and answer queries.

AI, Embedding, Java
0 likes · 19 min read
Architect
Sep 26, 2024 · Artificial Intelligence

Decoding OpenAI o1: How RL‑LLM Fusion Powers Next‑Gen Reasoning

This article provides a detailed technical analysis of OpenAI’s o1 model, exploring its enhanced logical reasoning, the likely use of reinforcement learning with hidden chain‑of‑thought generation, multi‑model architecture, training data pipelines, reward modeling, and how these innovations could reshape AI safety and scaling strategies.

AI safety, LLM, OpenAI o1
0 likes · 43 min read
Huolala Tech
Sep 26, 2024 · Artificial Intelligence

How LLM-Powered AI Assistants Transform Logistics Operations

This article details Huolala's exploration of large‑language‑model (LLM) based AI assistants across multiple business scenarios, describing their architecture, implementation challenges, prompt engineering techniques, and the progressive stages from professional assistants to multi‑agent systems that drive efficiency and innovation in logistics.

AI Assistant, LLM, Prompt Engineering
0 likes · 12 min read
Baobao Algorithm Notes
Sep 25, 2024 · Industry Insights

Decoding OpenAI o1: How RL and LLM Fuse to Power Hidden Chain‑of‑Thought

This article analytically reconstructs OpenAI o1’s architecture, training pipeline, and inference workflow, exploring its reinforcement‑learning‑enhanced hidden chain‑of‑thought, multi‑model composition, scaling laws, reward modeling, and potential implications for future AI safety and small‑model strategies.

AI safety, Hidden COT, LLM
0 likes · 43 min read
ByteDance Data Platform
Sep 25, 2024 · Artificial Intelligence

How LLMs Power the “Find Data Assistant” for Smarter Data Retrieval

This article explains how the Volcano Engine DataLeap team leveraged large‑language models to build the “Find Data Assistant”, detailing its design, challenges, embedding‑and‑reranker enhancements, LLM‑driven semantic search, mixing architecture, and practical lessons for improving data asset management and retrieval.

Data Asset Management, Data Retrieval, Embedding
0 likes · 17 min read
JavaEdge
Sep 24, 2024 · Artificial Intelligence

Mastering RAG with LangChain4j: From Simple Setup to Advanced Retrieval‑Augmented Generation

This article explains how to extend large language models with domain‑specific knowledge using Retrieval‑Augmented Generation (RAG) in LangChain4j, covering the concepts of RAG, its indexing and retrieval stages, simple RAG setup, detailed API usage, and advanced customization options such as query transformers and content injectors.

Embedding, Java, LLM
0 likes · 24 min read
AsiaInfo Technology: New Tech Exploration
Sep 23, 2024 · Artificial Intelligence

How Large Language Models Power Multi‑Turn Dialogue for Smart Marketing

This article presents a comprehensive technical analysis of using large language models to build a task‑oriented multi‑turn dialogue system for intelligent marketing, detailing architecture, intent detection, slot extraction, prompt design, dialogue management, practical experience, and future research directions.

LLM, intelligent marketing, intent recognition
0 likes · 21 min read
Alibaba Cloud Developer
Sep 23, 2024 · Artificial Intelligence

Boosting Aviator Script Development with AI—No Model Training Required

This article details an engineering‑focused practice that uses large language models, RAG, prompt engineering, and reranking to automatically generate, review, and refine Aviator scripts for decision‑center policies without any model pre‑training, offering practical insights and code examples for developers.

AI Code Generation, Aviator script, LLM
0 likes · 29 min read
JavaEdge
Sep 21, 2024 · Artificial Intelligence

Understanding LLM API Types and Usage in LangChain4j

This article explains the different low‑level LLM API types in LangChain4j, including LanguageModel, ChatLanguageModel, and other model interfaces, and shows how to create and combine ChatMessage objects for multi‑turn conversations.

AI API, ChatLanguageModel, ChatMessage
0 likes · 8 min read
DataFunSummit
Sep 21, 2024 · Artificial Intelligence

DataLeap "Find Data Assistant": Leveraging Large Language Models for Data Asset Retrieval and Management

This article details how the DataLeap team applied large language model technology to build the "Find Data Assistant" platform, addressing the challenges of locating and using massive data assets through a hybrid retrieval architecture, enhanced embedding, reranking, mixed ranking, and answer summarization, while sharing practical lessons and future directions.

Data Asset Management, Data Retrieval, Embedding
0 likes · 17 min read
Senior Brother's Insights
Sep 19, 2024 · Artificial Intelligence

Rule Engines vs AI Models: Choosing the Right Approach for Product Logic

The article compares traditional rule‑engine architectures with AI‑driven models, explains their differing characteristics, outlines when deterministic rule matching is preferable over flexible AI inference, and recommends practical technologies such as Drools for rule‑based solutions and LLM‑based RAG/Agent frameworks for AI‑centric scenarios.

AI, Drools, LLM
0 likes · 9 min read
JavaEdge
Sep 19, 2024 · Artificial Intelligence

Unlock Java LLM Power: A Deep Dive into LangChain4j Features and Architecture

LangChain4j streamlines the integration of large language models into Java applications by offering a standardized API, extensive support for over a dozen LLM providers and vector stores, a rich toolbox for RAG, chat memory, and tool calling, plus two abstraction layers that cater to both low‑level control and high‑level convenience.

AI, Java, LLM
0 likes · 10 min read
DevOps
Sep 13, 2024 · Artificial Intelligence

15 Advanced Retrieval‑Augmented Generation (RAG) Techniques for Production‑Ready AI Solutions

The article outlines fifteen advanced Retrieval‑Augmented Generation (RAG) techniques, from hierarchical indexing and context caching to multimodal alignment and microservice orchestration, and explains how they help turn AI prototypes into scalable, reliable production systems. It also highlights common pitfalls and closes with a call to action.

AI production, LLM, RAG
0 likes · 8 min read
Code Mala Tang
Sep 12, 2024 · Artificial Intelligence

Unlocking LangChain.js: The Swiss Army Knife for LLM Applications

This article introduces LangChain.js, explains its origins, core concepts such as chats, templates, tools, and chains, demonstrates practical JavaScript code examples, and explores the LangChain Expression Language (LCEL) for building flexible, conditional AI workflows.

AI workflow, LCEL, LLM
0 likes · 17 min read
Code Mala Tang
Sep 12, 2024 · Artificial Intelligence

Boost LLM Accuracy with Retrieval‑Augmented Generation Using LangChain.js

This article explains the core concepts of Retrieval‑Augmented Generation (RAG), walks through its implementation steps with LangChain.js—including text chunking, embedding, storage, retrieval, and generation—and showcases practical use cases, challenges, and best practices for building reliable AI‑powered applications.

AI applications, Embedding, LLM
0 likes · 16 min read
DataFunTalk
Sep 12, 2024 · Artificial Intelligence

MetaGPT: Advances in Multi‑Agent Collaboration and Agent Capability Enhancement

This article reviews MetaGPT, an open‑source multi‑agent framework that integrates human‑engineered SOPs into LLM‑based agents to improve software generation, data interpretation, and simulation tasks, highlighting its rapid community growth, experimental successes, tool integration strategies, and future research directions.

Agent Collaboration, LLM, MetaGPT
0 likes · 20 min read
Baobao Algorithm Notes
Sep 10, 2024 · Artificial Intelligence

Do LLMs Silence Human Voices? Unveiling the ‘Spiral of Silence’ in Retrieval‑Augmented Generation

This article reviews the ACL 2024 paper that investigates how large language model‑generated text influences retrieval‑augmented generation pipelines, revealing short‑term retrieval gains but a long‑term “spiral of silence” that marginalizes human‑generated content and homogenizes open‑domain QA results.

AI Impact, LLM, Open Domain QA
0 likes · 9 min read
Code Mala Tang
Sep 7, 2024 · Artificial Intelligence

Unlocking LangChain.js: The Swiss Army Knife for LLM Applications

This article introduces LangChain.js, its core concepts such as chats, templates, tools, and chains, demonstrates how to use LCEL for flexible workflow composition, and shows practical JavaScript code examples for building AI-powered applications with large language models.

AI workflow, LCEL, LLM
0 likes · 17 min read
iKang Technology Team
Sep 5, 2024 · Artificial Intelligence

What Is LangChain? Overview, Core Advantages, Components, and Use Cases

LangChain is a modular framework that streamlines integration of large language models by providing unified model interfaces, prompt optimization, memory handling, indexing, chains, and agents, enabling developers to quickly build and deploy sophisticated NLP applications such as text generation, information extraction, and dynamic tool‑driven workflows across various industries.

AI Framework, Chains, LLM
0 likes · 6 min read
Full-Stack Cultivation Path
Sep 4, 2024 · Artificial Intelligence

Hot Open-Source RAG Tool for Document Chat: GraphRAG, Multimodal QA & Complex Reasoning

This article introduces Kotaemon, an open‑source Retrieval‑Augmented Generation platform that lets users chat with their documents, offering a self‑hosted web UI, support for local and API LLMs, hybrid retrieval, multimodal question answering, GraphRAG indexing, and advanced reasoning capabilities, along with step‑by‑step installation via App or Docker.

GraphRAG, LLM, Multimodal QA
0 likes · 6 min read
AI Large Model Application Practice
Sep 4, 2024 · Artificial Intelligence

When to Use GraphRAG vs. Traditional RAG and How to Combine Them

This article compares GraphRAG with traditional RAG across seven dimensions—suitable scenarios, knowledge representation, retrieval, comprehensive queries, hidden‑relationship understanding, scalability, and performance‑cost trade‑offs—explains how they can be fused, and offers guidance on selecting the right approach for complex data‑driven applications.

Artificial Intelligence, GraphRAG, LLM
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Sep 2, 2024 · Artificial Intelligence

Turning PDFs and Word Docs into Searchable Knowledge for RAG Systems

This article explains why generic large language models struggle with domain‑specific data, introduces Retrieval‑Augmented Generation (RAG) as a solution, compares Word and PDF formats, outlines document‑parsing pipelines, reviews open‑source PDF tools, and presents Alibaba Cloud's rule‑based parsing architecture with performance results.

AI, Document Parsing, LLM
0 likes · 13 min read
Data Thinking Notes
Sep 1, 2024 · Artificial Intelligence

Master LLMs: Basics, Prompt Engineering, RAG, Agents & Multimodal AI

This article provides a comprehensive overview of large language models, covering their fundamental concepts, historical milestones, parameter scaling, prompt engineering techniques, retrieval‑augmented generation, autonomous agents, and multimodal model applications, illustrating how these technologies reshape AI capabilities across domains.

AI agents, LLM, Prompt Engineering
0 likes · 22 min read
DataFunSummit
Aug 29, 2024 · Artificial Intelligence

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end‑to‑end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM‑driven dialogue, knowledge‑augmented retrieval, long‑context handling, safety measures, multimodal expression (voice and facial animation), and system‑level performance optimizations for real‑time deployment.

AI, LLM, NPC
0 likes · 18 min read