Tagged articles

2077 articles


Dec 2, 2024 · Artificial Intelligence

Master CrewAI: Build Multi‑Agent Systems Quickly with Flows and a Full Demo

This article introduces CrewAI, a high‑level Python framework for constructing multi‑agent systems, explains its core concepts such as Crew, Agent, Tool, Task and Process, walks through a complete demo with code, evaluates its strengths and limitations, and showcases the new Flows feature for more flexible workflow orchestration.

AI Framework · CrewAI · Flows

0 likes · 15 min read


JavaEdge

Dec 1, 2024 · Artificial Intelligence

Exploring the Limits and Benchmarks of Qwen’s QwQ‑32B‑Preview AI Model

QwQ‑32B‑Preview, an experimental AI model from the Qwen team, showcases strong reasoning in math and programming while facing challenges like language switching, inference loops, safety concerns, and variable capabilities across domains, with benchmark scores ranging from 50% to over 90% on tests such as GPQA, AIME, MATH‑500, and LiveCodeBench.

AI Benchmark · LLM · Machine Learning

0 likes · 7 min read


AsiaInfo Technology: New Tech Exploration

Nov 29, 2024 · Artificial Intelligence

How GraphRAG Transforms Global QA with Structured Retrieval

This article examines GraphRAG—a graph‑enhanced Retrieval‑Augmented Generation approach—detailing its core concepts, the practical challenges of deploying it in enterprise settings, and the engineering solutions and future directions that enable more accurate, efficient, and explainable global question‑answering systems.

Global QA · GraphRAG · LLM

0 likes · 16 min read


Alibaba Cloud Developer

Nov 28, 2024 · Artificial Intelligence

Understanding Tokenizers and Embeddings in Large Language Models

This article introduces the core concepts of tokenizers and embeddings in large language models, explains how they convert text into numeric IDs and dense vectors, compares different tokenization strategies, and provides practical JavaScript and TensorFlow.js code examples for beginners.

AI fundamentals · JavaScript · LLM

0 likes · 10 min read


Sohu Tech Products

Nov 27, 2024 · Artificial Intelligence

RAG Technology and Practical Application in Multi-Modal Query: Using Chinese-CLIP and Redis Search

The article explains how Retrieval‑Augmented Generation (RAG) outperforms direct LLM inference by enabling real‑time knowledge updates and lower costs, and demonstrates a practical multi‑modal RAG pipeline that uses Chinese‑CLIP for vector encoding, various chunking strategies, and Redis Search for fast vector storage and retrieval.

Chinese-CLIP · Chunking · LLM

0 likes · 17 min read


NewBeeNLP

Nov 27, 2024 · Artificial Intelligence

How Can Large Language Models Extend Their Context Window? A Deep Dive into Position Encoding

This article reviews the principles of absolute and relative positional encodings, explains why window extrapolation is crucial for large language models, analyzes current extrapolation methods, evaluates their performance, and answers common questions about extending LLM context windows.

LLM · Positional Encoding · RoPE

0 likes · 14 min read


Alibaba Cloud Big Data AI Platform

Nov 27, 2024 · Artificial Intelligence

How to Train, Evaluate, and Deploy Qwen2.5-Coder on Alibaba Cloud PAI‑QuickStart

This guide walks developers through the entire lifecycle of Qwen2.5‑Coder—covering model sizes, training token expansion, resource requirements, fine‑tuning with SFT/DPO, evaluation on custom and public datasets, and one‑click deployment and compression on Alibaba Cloud's PAI‑QuickStart platform.

Deployment · LLM · PAI-QuickStart

0 likes · 15 min read


AI Large Model Application Practice

Nov 25, 2024 · Artificial Intelligence

Building Multi‑Agent Systems with LangGraph: A Step‑by‑Step Guide

This article walks through implementing a multi‑agent workflow with LangGraph, compares it with the lightweight Swarm framework, details the code for defining models, tools, agents, and graph structures, covers testing, and evaluates the framework's strengths, limitations, and suitable use cases.

AI · LLM · LangGraph

0 likes · 10 min read


Baobao Algorithm Notes

Nov 24, 2024 · Artificial Intelligence

How Marco‑o1 Merges Chain‑of‑Thought Fine‑Tuning with Monte‑Carlo Tree Search for Superior Reasoning

The article introduces Marco‑o1, an open‑source LLM that enhances complex reasoning by fine‑tuning on Chain‑of‑Thought data, integrating Monte‑Carlo Tree Search, introducing mini‑step actions and a reflection mechanism, and evaluates its performance on multilingual math and translation benchmarks.

Artificial Intelligence · LLM · Monte Carlo Tree Search

0 likes · 15 min read


System Architect Go

Nov 24, 2024 · Artificial Intelligence

Building a Web Voice Chatbot with Whisper, llama.cpp, and LLM

This article demonstrates how to build a web‑based voice chatbot by integrating Whisper speech‑to‑text, llama.cpp LLM inference, and WebSocket communication, detailing both the frontend JavaScript implementation and the Python FastAPI backend, along with Docker deployment and example code.

FastAPI · JavaScript · LLM

0 likes · 10 min read


ZhongAn Tech Team

Nov 24, 2024 · Artificial Intelligence

Weekly AI Digest – Issue 3: Agentic AI, Riemann Hypothesis Rumors, AI Search Trends, and Real‑time Voice Interaction

This issue reviews the rise of Agentic AI and upcoming computer agents, debunks a viral claim about Grok‑3 proving the Riemann hypothesis, analyzes Gartner’s AI search forecasts, and highlights OpenAI’s Realtime API for ultra‑low‑latency voice interactions.

AI Search · LLM · Realtime API

0 likes · 10 min read


DaTaobao Tech

Nov 20, 2024 · Mobile Development

MNN-Transformer: Efficient On‑Device Large Language and Diffusion Model Deployment

MNN‑Transformer provides an end‑to‑end framework that enables large language and diffusion models to run efficiently on modern smartphones by exporting, quantizing (including dynamic int4/int8 and KV cache compression) and executing via a plugin‑engine runtime, achieving up to 35 tokens/s decoding and 2‑3× faster image generation compared with existing on‑device solutions.

LLM · MNN · Quantization

0 likes · 15 min read


System Architect Go

Nov 19, 2024 · Artificial Intelligence

Retrieval Augmented Generation (RAG) System Overview and Implementation with LangChain, Redis, and llama.cpp

This article explains the concept, architecture, and step‑by‑step implementation of Retrieval Augmented Generation (RAG), covering indexing, retrieval & generation processes, a practical LangChain‑Redis‑llama.cpp example on Kubernetes, code snippets, test results, challenges, and references.

AI · Embedding · LLM

0 likes · 6 min read


dbaplus Community

Nov 16, 2024 · Artificial Intelligence

Are LLM Frameworks Overhyped? A Critical Look at RAG and Reusability

The article critiques LLM frameworks, comparing them to early ORM tools, explains how Retrieval Augmented Generation works, warns against premature optimization, and advises developers to favor simple, visible practices over complex, abstracted frameworks for better control and understanding.

AI · LLM · ModelEvaluation

0 likes · 7 min read


Alibaba Cloud Developer

Nov 15, 2024 · Frontend Development

How to Build Real‑Time LLM Streaming in the Browser with Fetch

This article explains the mechanism of HTTP API streaming for large language models and shows step‑by‑step how front‑end developers can use the Fetch API, readable streams, and incremental UI updates to deliver real‑time, progressive results while handling errors and connection interruptions.

Front-end · HTTP streaming · JavaScript

0 likes · 9 min read


Linux Kernel Journey

Nov 14, 2024 · Artificial Intelligence

Deep Dive: How DeepFlow Collects Business Metrics for Large‑Model Services

This article explains how China Mobile built a hybrid‑cloud production environment for its customer‑service LLM, using eBPF and WebAssembly plugins from DeepFlow to achieve zero‑intrusion observability, automatically capture full‑stack topology, application/network metrics, and key LLM business indicators such as TTFT, TPOT, and token throughput.

DeepFlow · Grafana · LLM

0 likes · 19 min read


Baobao Algorithm Notes

Nov 14, 2024 · Artificial Intelligence

How I Built a 1B‑Parameter Chinese LLM on a Single A100: Lessons Learned

This article details the end‑to‑end process of pre‑training, fine‑tuning, and evaluating a 1‑billion‑parameter Chinese LLM named Steel‑LLM on limited hardware, covering data collection, pipeline design, training framework choices, architectural tweaks, performance results, and practical lessons for resource‑constrained developers.

LLM · Training Optimization · data pipeline

0 likes · 18 min read


Alibaba Cloud Developer

Nov 14, 2024 · Artificial Intelligence

Building a High‑Accuracy Automotive Maintenance Q&A System with Multi‑Agent LLMs

This article details how to design, implement, and evaluate a complex‑table intelligent Q&A solution for automotive maintenance using large language models, RAG pipelines, multi‑agent architectures, prompt engineering, and Alibaba Cloud services, achieving up to 93.8% accuracy.

Automotive · LLM · Prompt Engineering

0 likes · 31 min read


AI Large Model Application Practice

Nov 13, 2024 · Artificial Intelligence

Exploring OpenAI Swarm: A Minimalist Multi‑Agent Orchestration Framework

This article introduces the concept of multi‑agent systems, compares five popular orchestration frameworks, and provides a step‑by‑step tutorial for building and testing a simple supervision‑based workflow using OpenAI's experimental Swarm library, complete with code snippets and performance observations.

LLM · OpenAI · Python

0 likes · 12 min read


Aikesheng Open Source Community

Nov 12, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Large Language Models

ChatDBA is a conversational AI system built by Shanghai Aikesheng that employs large language models and Retrieval‑Augmented Generation to help database administrators diagnose faults, learn domain knowledge, and generate or optimize SQL, with a redesigned architecture that addresses early‑stage shortcomings and outlines future enhancements.

ChatDBA · Fault Diagnosis · Knowledge Base

0 likes · 10 min read


Alibaba Cloud Developer

Nov 12, 2024 · Artificial Intelligence

How Multi‑Agent LLMs Can Auto‑Optimize E‑Commerce Product Titles

This article explains how large language models and rule‑based multi‑agent pipelines are used to automatically generate and select high‑impact keywords for e‑commerce product titles, improving search exposure without extra advertising costs.

AI · LLM · e‑commerce

0 likes · 19 min read


Aikesheng Open Source Community

Nov 11, 2024 · Databases

ChatDBA: An AI‑Powered Intelligent Assistant for Database Fault Diagnosis and Management

ChatDBA is an AI‑driven conversational system developed by Shanghai Aikesheng that assists DBAs with fault diagnosis, knowledge learning, SQL generation and optimization by leveraging large language models, RAG architecture, and advanced retrieval and document‑processing techniques.

ChatDBA · Fault Diagnosis · LLM

0 likes · 10 min read


JD Cloud Developers

Nov 11, 2024 · Artificial Intelligence

Mastering Prompt Engineering: History, Techniques, and Real-World Applications

This article explains what Prompt Engineering is, traces its evolution from early NLP commands to modern adaptive and multimodal prompting, details core techniques such as Zero‑shot, Chain‑of‑Thought, Auto‑CoT, and reduction of hallucinations, and showcases a logistics case study using various prompting strategies.

AI · LLM · Prompt Engineering

0 likes · 26 min read


Fighter's World

Nov 11, 2024 · Artificial Intelligence

How CoCounsel’s $650M Acquisition Reveals Key Design Principles for LLM‑Powered Legal Tools

The article examines how Casetext’s CoCounsel, an AI‑driven legal assistant acquired by Thomson Reuters for $650 million, achieved rapid growth by prioritizing accuracy, workflow integration, user‑centered design, security, and continuous improvement, and distills the critical challenges and success factors for building LLM‑native products in low‑tolerance B2B environments.

AI ethics · B2B SaaS · LLM

0 likes · 11 min read


Alibaba Cloud Observability

Nov 8, 2024 · Cloud Native

Enable Python Probe for LLM Observability on Alibaba Cloud ACK

This guide explains how to integrate Alibaba Cloud's Python probe into a Kubernetes (ACK) environment to monitor large language model (LLM) applications, covering prerequisites, installation steps, Dockerfile modifications, resource permissions, and sample Python code for both server and client components.

ARMS · Cloud Native · Docker

0 likes · 16 min read


AI Large Model Application Practice

Nov 8, 2024 · Artificial Intelligence

How to Build a Multimodal Embedding RAG with Cohere and LlamaIndex

This guide explains how to overcome the limitations of text‑only embeddings for enterprise AI search by using a multimodal embedding model to index and retrieve both text and images, detailing the full workflow, code examples, and performance benefits.

Cohere · LLM · LlamaIndex

0 likes · 13 min read


CSS Magic

Nov 8, 2024 · Artificial Intelligence

LLM Application Development Tips (3): Exploring LLM API Inputs and Outputs

This article explains how to configure key OpenAI chat completion parameters—such as temperature, top_p, streaming, response format, and tool selection—and walks through the structure of the API's JSON response, highlighting fields like id, model, choices, finish_reason, and usage for better control and cost estimation.

AI agents · API parameters · JSON response

0 likes · 8 min read


Alimama Tech

Nov 6, 2024 · Artificial Intelligence

How AI Generates Synchronized Video Narrations for E‑Commerce

This article presents the research behind Synchronized Video Storytelling, introducing the E‑SyncVidStory dataset, the VideoNarrator multimodal architecture, and extensive experiments that demonstrate high‑quality, product‑aware video narration generation for e‑commerce applications.

LLM · dataset · e‑commerce

0 likes · 12 min read


37 Interactive Technology Team

Nov 4, 2024 · Artificial Intelligence

Developing RAG and Agent Applications with LangChain: A Case Study of an AI Assistant for Activity Components

The article outlines a step‑by‑step methodology for creating Retrieval‑Augmented Generation and custom Agent applications with LangChain, illustrated by an AI assistant for activity components that evolves from a rapid Dify prototype to a LangChain‑based RAG system and finally a hand‑crafted ReAct‑style agent, detailing LCEL chain composition, vector‑search integration, model performance trade‑offs, and a unified routing layer.

AI Assistant · Cloud-native · Data Warehouse

0 likes · 6 min read


Baobao Algorithm Notes

Nov 4, 2024 · Artificial Intelligence

Uncovering 16 Limits of AI Search Engines and 16 Design Recommendations

A user study with 21 participants reveals sixteen critical limitations of generative AI search engines, maps them to eight quantitative metrics, proposes sixteen design recommendations, and evaluates You.com, Perplexity, and Bing Chat against this framework to highlight current performance gaps.

AI Search · LLM · Metrics

0 likes · 12 min read


CSS Magic

Nov 1, 2024 · Artificial Intelligence

Refining System Prompts for LLMs: Practical Tips for Batch Automation

This article explains how to automate batch document processing with LLM APIs by mastering the messages parameter, defining system, user, and assistant roles, and iteratively polishing system prompts through scripts or OpenAI's GPTs editor and Playground interfaces.

ChatGPT · LLM · OpenAI API

0 likes · 7 min read


DataFunTalk

Oct 31, 2024 · Artificial Intelligence

Tencent OlaChat: An LLM‑Powered Intelligent Business Intelligence Platform – Architecture, Capabilities, and Practice

This article presents the evolution from traditional to intelligent BI, explores how large language models enable natural‑language data analysis, details the OlaChat platform’s architecture, metadata‑enhanced retrieval methods, Text2SQL pipeline, multi‑turn dialogue system, and shares practical deployment insights and Q&A.

Intelligent Analytics · LLM · Metadata Retrieval

0 likes · 20 min read


NewBeeNLP

Oct 31, 2024 · Artificial Intelligence

How o1 Is Redefining LLM Engineering and What It Means for AI Professionals

The article examines OpenAI's o1 model, highlighting its unprecedented scientific capabilities, its shift from a chat toy to a high‑value tool, the potential impact on algorithm engineers, and the technical directions (RLHF, MCTS, PPO, PRM) that practitioners should master to stay relevant.

AI · LLM · model analysis

0 likes · 8 min read


Alibaba Cloud Developer

Oct 31, 2024 · Artificial Intelligence

How to Guarantee 100% Structured JSON Output from Large Language Models

This article explains why LLMs often fail to produce strict JSON, reviews existing solutions, and presents a three‑stage strategy—prompt engineering, dynamic constrained decoding, and post‑processing—to achieve reliable structured JSON output for automated pipelines.

AI · JSON · LLM

0 likes · 11 min read


DaTaobao Tech

Oct 30, 2024 · Artificial Intelligence

Understanding OpenAI o1: Chain‑of‑Thought, Scaling Laws, and Training Strategies

The article explains how OpenAI’s o1 model leverages chain‑of‑thought prompting, dual‑system cognitive theory, and new scaling laws—pre‑training on code/math and post‑training reinforcement with step‑wise reward models—to achieve superior reasoning, safety, and performance over GPT‑4, heralding a shift toward models that learn to think.

LLM · Safety · chain-of-thought

0 likes · 42 min read


Baobao Algorithm Notes

Oct 29, 2024 · Artificial Intelligence

Reproducing OpenAI o1: Steiner Model’s Reasoning, Training, and Evaluation

This report details the design, data synthesis, three‑stage training pipeline, and benchmark evaluation of the open‑source Steiner reasoning model, which aims to emulate OpenAI o1’s inference‑time scaling while highlighting current performance gaps and future research challenges.

Inference Scaling · LLM · Open-source AI

0 likes · 14 min read


Baobao Algorithm Notes

Oct 29, 2024 · Industry Insights

Inside Perplexity AI: How RAG Powers the Next‑Gen Search Engine

In this interview, Perplexity AI CEO Aravind Srinivas explains the company’s retrieval‑augmented generation architecture, multi‑model strategy, vector‑database use, competitive positioning against Google, monetization plans, and future product road‑map, offering a deep industry perspective on AI‑driven search.

AI startup · LLM · Perplexity AI

0 likes · 38 min read


CSS Magic

Oct 29, 2024 · Artificial Intelligence

LLM Application Development Tips (1): How to Choose the Right Model

With a growing array of overseas and domestic LLM APIs in 2024, this guide explains how to pick the right model—starting with a top‑tier option like GPT‑4o for feasibility testing, then moving to cost‑effective or Chinese alternatives, while weighing price, inference speed, context window, API compatibility, and rate limits.

API compatibility · Chinese LLM · GPT-4o

0 likes · 8 min read


DevOps

Oct 27, 2024 · Artificial Intelligence

Best Practices for Building Efficient Retrieval‑Augmented Generation (RAG) Systems

This article reviews Wang et al.'s 2024 research on Retrieval‑Augmented Generation, outlining optimal practices such as query classification, chunk sizing, hybrid metadata search, embedding selection, vector databases, query transformation, reranking, document repacking, summarization, fine‑tuning, and multimodal retrieval to guide developers in constructing high‑performance RAG pipelines.

LLM · Query Classification · RAG

0 likes · 11 min read


Alibaba Cloud Native

Oct 26, 2024 · Artificial Intelligence

Build a Real‑Time Semantic Search with EventBridge, DashVector, and FunctionCompute

This tutorial walks through constructing a zero‑to‑one RAG pipeline that ingests OSS text files via EventBridge, transforms them into embeddings with DashScope, stores vectors in DashVector, and performs semantic search using FunctionCompute and a Qwen‑Turbo LLM, complete with code samples and configuration steps.

DashVector · Embedding · EventBridge

0 likes · 10 min read


System Architect Go

Oct 25, 2024 · Artificial Intelligence

Designing and Extending a Self‑Built ChatGPT System: Architecture, Session Management, and Scaling Strategies

This article explains how to construct a ChatGPT‑like conversational system by detailing the core dialogue flow, adding session and history management with a database, defining REST APIs, and exploring extensions such as caching, elastic scaling, and production‑ready deployment considerations.

ChatGPT · LLM · Scalability

0 likes · 7 min read


Baobao Algorithm Notes

Oct 25, 2024 · Artificial Intelligence

How Simhash and Minhash Power LLM Data Deduplication: Theory and Spark Code

This article explains document‑level, paragraph‑level, and sentence‑level deduplication for large‑scale LLM pre‑training, introduces the Simhash and Minhash algorithms with step‑by‑step Python examples, and shows how to implement efficient LSH‑based deduplication using Spark.

LLM · Minhash · Python

0 likes · 29 min read


Baobao Algorithm Notes

Oct 25, 2024 · Artificial Intelligence

How to Use Importance Sampling for Effective Continue Pretraining of LLMs

Continuing pretraining (CP) bridges pretraining and SFT to inject domain knowledge, but faces catastrophic forgetting; this article explores leveraging importance sampling to balance common and domain data, discusses data selection, annealing strategies, and practical tips for mitigating forgetting while enhancing specialized capabilities.

Catastrophic Forgetting · Continue Pretraining · Importance Sampling

0 likes · 8 min read


System Architect Go

Oct 24, 2024 · Artificial Intelligence

How to Fine‑Tune Translation Models on Kubernetes Docs with LoRA

This article walks through the complete process of fine‑tuning both domain‑specific and large‑language translation models on Kubernetes documentation, covering data preparation, model selection, training configurations, the differences between Seq2Seq and CausalLM, and how LoRA can dramatically reduce resource usage while improving performance.

AI · LLM · LoRA

0 likes · 7 min read


21CTO

Oct 23, 2024 · Artificial Intelligence

IBM Unveils Granite 3.0 LLMs: Open‑Source, Secure, and Cost‑Effective AI Models

IBM introduced the Granite 3.0 series, an open‑source family of large language models that combine cutting‑edge performance with enhanced security, multi‑language support, and cost‑efficiency, while offering a variety of base, instruct, and specialist variants for enterprise use.

AI models · Granite · IBM

0 likes · 4 min read


DaTaobao Tech

Oct 23, 2024 · Artificial Intelligence

Retrieval-Augmented Generation (RAG): Principles, Applications, Limitations and Challenges

Retrieval-Augmented Generation (RAG) combines a retriever that fetches relevant external documents and a generator that uses them, improving LLM accuracy, relevance, privacy, and up-to-date information, but faces challenges such as retrieval latency, computational cost, chunking strategies, embedding selection, and system integration complexity.

AI · Knowledge retrieval · LLM

0 likes · 13 min read


Baidu Geek Talk

Oct 23, 2024 · Artificial Intelligence

Integrating Yuan 2.0 Large Model with PaddleNLP: Overview, Usage Steps, and Interaction Examples

The open‑source Yuan 2.0 large model is fully integrated into Baidu’s PaddleNLP, offering quick inference for tasks like code generation, translation, and reasoning, along with efficient distributed training and fine‑tuning features such as Zero Padding optimization, enabling developers to easily deploy and customize the model via simple setup steps and example interactions.

AI · Java · LLM

0 likes · 10 min read


Alibaba Cloud Big Data AI Platform

Oct 22, 2024 · Artificial Intelligence

How Alibaba Cloud Optimizes Enterprise RAG: Key Techniques for AI Search

At the 2024 Alibaba Cloud Yunqi (Apsara) Conference, senior AI Search expert Xing Shaomin detailed the enterprise‑grade Retrieval‑Augmented Generation (RAG) pipeline, covering the architecture of its key stages, optimizations for effectiveness, performance, and cost, and practical applications, vector store enhancements, LLM agents, and deployment strategies.

AI Search · Enterprise AI · LLM

0 likes · 16 min read


AI Large Model Application Practice

Oct 21, 2024 · Artificial Intelligence

Building Personalized Long‑Term Memory for AI Agents with Mem0 and LangGraph

This tutorial explains why AI agents need durable, personalized long‑term memory, introduces the open‑source Mem0 solution, shows how Mem0 works with LLMs and vector stores, and provides step‑by‑step code to integrate Mem0 into a LangGraph workflow for adaptive, user‑specific interactions.

AI memory · LLM · LangGraph

0 likes · 11 min read


DataFunSummit

Oct 18, 2024 · Artificial Intelligence

Building Efficient RAG Applications with a Small Team: Insights from PingCAP AI Lab

This article details how PingCAP's three‑person AI Lab leveraged Retrieval‑Augmented Generation (RAG) techniques—including basic RAG, fine‑tuned embeddings, re‑ranking, graph RAG, and agent‑based RAG—to create scalable, multilingual document‑question answering services while addressing large‑scale documentation challenges, model limitations, and user feedback loops.

Embedding · LLM · RAG

0 likes · 14 min read


JavaEdge

Oct 18, 2024 · Artificial Intelligence

Designing Scalable Multi‑Agent Systems with LangGraph: Architectures, Communication, and Code Samples

This article explains why large‑language‑model agents become hard to manage, outlines the benefits of modular multi‑agent designs, compares several connection architectures, and provides concrete LangGraph code for supervisor‑based, tool‑calling, and custom workflow patterns.

LLM · LangGraph · Python

0 likes · 12 min read


System Architect Go

Oct 17, 2024 · Artificial Intelligence

Running and Fine‑Tuning Large Language Models Locally with Ollama, Docker, and Cloud Resources

The author chronicles the challenges and solutions of running large language models locally using Ollama, experimenting with cloud GPUs on Google Colab, managing Python dependencies through Docker, and ultimately fine‑tuning a small Qwen model, providing a practical guide for AI enthusiasts.

Docker · Google Colab · LLM

0 likes · 6 min read


NewBeeNLP

Oct 16, 2024 · Artificial Intelligence

Unlocking Long-Sequence LLMs: Position Embeddings, Scaling, and Efficient Attention

This article reviews recent advances in training and inference for long‑sequence large language models, comparing ALiBi and RoPE position embeddings, exploring RoPE scaling techniques, analyzing attention optimizations, and outlining practical data, evaluation, and system frameworks for scalable LLM deployment.

Flash Attention · LLM · RoPE

0 likes · 14 min read


Baobao Algorithm Notes

Oct 16, 2024 · Artificial Intelligence

How the DB3 Team Won the Meta CRAG RAG Challenge: Prompts, Retrieval, and LoRA Fine‑Tuning

This article analyzes the Meta Comprehensive RAG (CRAG) benchmark, detailing its three tasks, evaluation metrics, and the champion DB3 team's end‑to‑end solution that combines data preprocessing, dual‑stage retrieval, prompt engineering, LoRA‑based fine‑tuning, and public data augmentation to achieve top scores across all tasks.

Knowledge Graph · LLM · LoRA

0 likes · 17 min read


System Architect Go

Oct 15, 2024 · Artificial Intelligence

Overview of Ollama: Architecture, Storage Structure, and Dialogue Process

This article provides a comprehensive overview of Ollama, a lightweight tool for running large language models, detailing its client‑server architecture, local storage layout, and the step‑by‑step workflow of user interactions with the model.

AI tools · LLM · Ollama

0 likes · 7 min read

CSS Magic

Oct 14, 2024 · Artificial Intelligence

How OpenAI’s o1 Models Impact Developers: Performance, Limits, Cost, and Prompting

The article evaluates OpenAI’s o1 series—o1‑preview, o1‑mini and the upcoming full model—by comparing their complex reasoning strength, slower inference speed, higher pricing, API restrictions, and prompting best practices, helping developers decide when to adopt them.

API · LLM · OpenAI

0 likes · 13 min read

Baobao Algorithm Notes

Oct 13, 2024 · Artificial Intelligence

Can Hierarchical LLMs Transform Sequential Recommendation? A Deep Dive

This article provides a comprehensive analysis of the HLLM paper, detailing its hierarchical LLM architecture for item and user modeling, the training objectives, fusion strategies, extensive offline and online experiments, scaling behavior, ablation studies, and practical deployment insights in large‑scale recommendation systems.

Industrial Deployment · LLM · Scaling Law

0 likes · 12 min read

JD Tech

Oct 13, 2024 · Artificial Intelligence

Building a Simple Local AI Question‑Answer System with Java, LangChain4J, Ollama, and ChromaDB

This article guides readers through the concepts of large language models, embeddings, vector databases, and Retrieval‑Augmented Generation, then demonstrates step‑by‑step how to set up Ollama, install a local Chroma vector store, configure Maven dependencies, and write Java code using LangChain4J to build and test a functional AI Q&A application.

AI · Java · LLM

0 likes · 22 min read

AntTech

Oct 12, 2024 · Artificial Intelligence

Observations from ISSTA 2024: Conference Highlights, Awarded Papers, Keynotes, and In‑Depth Reviews

The article reports on the 33rd ISSTA 2024 conference in Vienna, summarizing its acceptance statistics, highlighting the Impact Paper Award and Distinguished Papers, detailing keynotes on large‑language‑model‑driven software quality, and providing extensive reviews of selected research works ranging from fuzzing and program repair to database query simplification and AI‑oriented code generation.

ISSTA2024 · LLM · ProgramRepair

0 likes · 29 min read

21CTO

Oct 10, 2024 · Artificial Intelligence

5 Practical AI Projects to Build Your Skills with Python

This article presents five hands‑on AI project ideas—from resume optimization to multimodal search—complete with step‑by‑step instructions, required Python libraries, and code snippets, helping beginners and intermediate developers quickly build valuable AI applications.

AI · LLM · Python

0 likes · 12 min read

JD Tech Talk

Oct 8, 2024 · Artificial Intelligence

Building a Retrieval‑Augmented Generation (RAG) System with Rust and Qdrant

This article explains how to construct a Retrieval‑Augmented Generation pipeline in Rust, covering knowledge‑base creation with Qdrant, model loading and embedding using the candle library, data ingestion, and integration of a Rust‑based inference service based on mistral.rs, while also discussing resource usage and common pitfalls.

AI · Embedding · LLM

0 likes · 16 min read

JD Cloud Developers

Oct 8, 2024 · Artificial Intelligence

How to Build a Rust-Powered Retrieval‑Augmented Generation (RAG) System from Scratch

This article explains how to construct a Retrieval‑Augmented Generation pipeline in Rust, covering knowledge‑base creation with Qdrant, model loading and embedding generation using candle, and integrating a Rust‑based inference service to answer queries with up‑to‑date external data.

Embedding · LLM · Qdrant

0 likes · 17 min read

Baobao Algorithm Notes

Oct 7, 2024 · Artificial Intelligence

Mastering LLM Supervised Fine‑Tuning: Practical Tips, Data Strategies, and Debugging

This article provides a comprehensive, experience‑driven guide to supervised fine‑tuning (SFT) of large language models, covering special tokens, latency considerations, data diversity and production, training frameworks and hyper‑parameters, over‑/under‑fitting diagnostics, and evaluation metrics such as helpfulness, honesty, and harmlessness.

AI · Data Engineering · LLM

0 likes · 40 min read

Baobao Algorithm Notes

Oct 7, 2024 · Artificial Intelligence

Decoding OpenAI’s o1: How RL and Process‑Supervised Reward Models Might Power the Next LLM

The author speculates on OpenAI’s o1 architecture, proposing that it relies on reinforcement learning guided by a generalizable, process‑supervised reward model, and outlines data collection, multi‑model generation, and training tweaks needed to realize such a system.

AI research · LLM · RLHF

0 likes · 8 min read

Fighter's World

Sep 30, 2024 · Artificial Intelligence

Exploring Google NotebookLM: Use Cases, Interaction Experience, and Key Insights

The author reviews Google NotebookLM, describing how it aids deep paper reading, encourages continued conversation through guided prompts, and keeps conversations coherent through self‑play insights; the review also highlights the audio‑overview feature and reflects on AI concepts such as the "bitter lesson" and the limits of self‑play in open scenarios.

AI research · Audio Generation · Google

0 likes · 22 min read

21CTO

Sep 30, 2024 · Artificial Intelligence

How LLM‑Powered IDEs Can Cut Your Coding Time in Half

Using an LLM-powered IDE, the author built a full‑stack weekend project without writing a single line of code, discovering faster development cycles, new debugging habits, and the strengths and limits of AI assistants compared to traditional Google searches.

AI coding · Debugging · LLM

0 likes · 10 min read

JD Tech Talk

Sep 29, 2024 · Artificial Intelligence

Building a Simple Local AI Question‑Answer System with Java, LangChain, Ollama, and ChromaDB

This article explains how to set up a lightweight local AI Q&A system using Java, LangChain (and LangChain4J), Ollama for LLM inference, embedding techniques, and a vector database (ChromaDB), covering core concepts, environment preparation, Maven dependencies, and sample code.

AI · LLM · LangChain

0 likes · 21 min read

JD Cloud Developers

Sep 29, 2024 · Artificial Intelligence

Build a Local AI Q&A System with Java, Ollama, and LangChain4J

This article walks through building a local AI question‑answer system using Java, Ollama, LangChain4J, embeddings, and a Chroma vector database, covering LLM fundamentals, embedding techniques, RAG architecture, setup steps, Maven dependencies, and sample code to retrieve and answer queries.

AI · Embedding · Java

0 likes · 19 min read

Architect

Sep 26, 2024 · Artificial Intelligence

Decoding OpenAI o1: How RL‑LLM Fusion Powers Next‑Gen Reasoning

This article provides a detailed technical analysis of OpenAI’s o1 model, exploring its enhanced logical reasoning, the likely use of reinforcement learning with hidden chain‑of‑thought generation, multi‑model architecture, training data pipelines, reward modeling, and how these innovations could reshape AI safety and scaling strategies.

AI safety · LLM · OpenAI o1

0 likes · 43 min read

Huolala Tech

Sep 26, 2024 · Artificial Intelligence

How LLM-Powered AI Assistants Transform Logistics Operations

This article details Huolala's exploration of large‑language‑model (LLM) based AI assistants across multiple business scenarios, describing their architecture, implementation challenges, prompt engineering techniques, and the progressive stages from professional assistants to multi‑agent systems that drive efficiency and innovation in logistics.

AI Assistant · LLM · Prompt Engineering

0 likes · 12 min read

Baobao Algorithm Notes

Sep 25, 2024 · Industry Insights

Decoding OpenAI o1: How RL and LLM Fuse to Power Hidden Chain‑of‑Thought

This article analytically reconstructs OpenAI o1’s architecture, training pipeline, and inference workflow, exploring its reinforcement‑learning‑enhanced hidden chain‑of‑thought, multi‑model composition, scaling laws, reward modeling, and potential implications for future AI safety and small‑model strategies.

AI safety · Hidden COT · LLM

0 likes · 43 min read

ByteDance Data Platform

Sep 25, 2024 · Artificial Intelligence

How LLMs Power the “Find Data Assistant” for Smarter Data Retrieval

This article explains how the Volcano Engine DataLeap team leveraged large‑language models to build the “Find Data Assistant”, detailing its design, challenges, embedding‑and‑reranker enhancements, LLM‑driven semantic search, mixing architecture, and practical lessons for improving data asset management and retrieval.

Data Asset Management · Data Retrieval · Embedding

0 likes · 17 min read

JavaEdge

Sep 24, 2024 · Artificial Intelligence

Mastering RAG with LangChain4j: From Simple Setup to Advanced Retrieval‑Augmented Generation

This article explains how to extend large language models with domain‑specific knowledge using Retrieval‑Augmented Generation (RAG) in LangChain4j, covering the concepts of RAG, its indexing and retrieval stages, simple RAG setup, detailed API usage, and advanced customization options such as query transformers and content injectors.

Embedding · Java · LLM

0 likes · 24 min read

AsiaInfo Technology: New Tech Exploration

Sep 23, 2024 · Artificial Intelligence

How Large Language Models Power Multi‑Turn Dialogue for Smart Marketing

This article presents a comprehensive technical analysis of using large language models to build a task‑oriented multi‑turn dialogue system for intelligent marketing, detailing architecture, intent detection, slot extraction, prompt design, dialogue management, practical experience, and future research directions.

LLM · intelligent marketing · intent recognition

0 likes · 21 min read

NewBeeNLP

Sep 23, 2024 · Artificial Intelligence

Why Post‑Training Is Redefining LLMs: DPO vs PPO, Synthetic Data, and Scaling Strategies

This article analyzes recent post‑training trends in large language models, comparing DPO and PPO, examining the scarcity of open‑source preference data, the iterative training process, the rise of synthetic data pipelines, and emerging methods for improving math and reasoning capabilities.

DPO · LLM · PPO

0 likes · 12 min read

Alibaba Cloud Developer

Sep 23, 2024 · Artificial Intelligence

Boosting Aviator Script Development with AI—No Model Training Required

This article details an engineering‑focused practice that uses large language models, RAG, prompt engineering, and reranking to automatically generate, review, and refine Aviator scripts for decision‑center policies without any model pre‑training, offering practical insights and code examples for developers.

AI Code Generation · Aviator script · LLM

0 likes · 29 min read

JavaEdge

Sep 21, 2024 · Artificial Intelligence

Understanding LLM API Types and Usage in LangChain4j

This article explains the different low‑level LLM API types in LangChain4j, including LanguageModel, ChatLanguageModel, and other model interfaces, and shows how to create and combine ChatMessage objects for multi‑turn conversations.

AI API · ChatLanguageModel · ChatMessage

0 likes · 8 min read

DataFunSummit

Sep 21, 2024 · Artificial Intelligence

DataLeap "Find Data Assistant": Leveraging Large Language Models for Data Asset Retrieval and Management

This article details how the DataLeap team applied large language model technology to build the "Find Data Assistant" platform, addressing the challenges of locating and using massive data assets through a hybrid retrieval architecture, enhanced embedding, reranking, mixed ranking, and answer summarization, while sharing practical lessons and future directions.

Data Asset Management · Data Retrieval · Embedding

0 likes · 17 min read

Senior Brother's Insights

Sep 19, 2024 · Artificial Intelligence

Rule Engines vs AI Models: Choosing the Right Approach for Product Logic

The article compares traditional rule‑engine architectures with AI‑driven models, explains their differing characteristics, outlines when deterministic rule matching is preferable over flexible AI inference, and recommends practical technologies such as Drools for rule‑based solutions and LLM‑based RAG/Agent frameworks for AI‑centric scenarios.

AI · Drools · LLM

0 likes · 9 min read

JavaEdge

Sep 19, 2024 · Artificial Intelligence

Unlock Java LLM Power: A Deep Dive into LangChain4j Features and Architecture

LangChain4j streamlines the integration of large language models into Java applications by offering a standardized API, extensive support for over a dozen LLM providers and vector stores, a rich toolbox for RAG, chat memory, and tool calling, plus two abstraction layers that cater to both low‑level control and high‑level convenience.

AI · Java · LLM

0 likes · 10 min read

Continuous Delivery 2.0

Sep 19, 2024 · Artificial Intelligence

Applying Large Language Models for Automated Test Case Generation at KooJiaLe

This article describes how KooJiaLe, a leading 3D design company, built an AI‑powered platform that uses large language models to automate test case generation, detailing its workflow, generation modes, editing features, export options, optimization efforts, results, and remaining challenges.

AI · LLM · R&D

0 likes · 8 min read

DevOps

Sep 13, 2024 · Artificial Intelligence

15 Advanced Retrieval‑Augmented Generation (RAG) Techniques for Production‑Ready AI Solutions

The article outlines fifteen advanced Retrieval‑Augmented Generation (RAG) techniques—from hierarchical indexing and context caching to multimodal alignment and microservice orchestration—explaining how they help transform AI prototypes into scalable, reliable production systems while highlighting common pitfalls and a concluding call to action.

AI production · LLM · RAG

0 likes · 8 min read

Code Mala Tang

Sep 12, 2024 · Artificial Intelligence

Unlocking LangChain.js: The Swiss Army Knife for LLM Applications

This article introduces LangChain.js, explains its origins and core concepts such as chats, templates, tools, and chains, demonstrates practical JavaScript code examples, and explores the LangChain Expression Language (LCEL) for building flexible, conditional AI workflows.

AI workflow · LCEL · LLM

0 likes · 17 min read

Code Mala Tang

Sep 12, 2024 · Artificial Intelligence

Boost LLM Accuracy with Retrieval‑Augmented Generation Using LangChain.js

This article explains the core concepts of Retrieval‑Augmented Generation (RAG), walks through its implementation steps with LangChain.js—including text chunking, embedding, storage, retrieval, and generation—and showcases practical use cases, challenges, and best practices for building reliable AI‑powered applications.

AI applications · Embedding · LLM

0 likes · 16 min read

DataFunTalk

Sep 12, 2024 · Artificial Intelligence

MetaGPT: Advances in Multi‑Agent Collaboration and Agent Capability Enhancement

This article reviews MetaGPT, an open‑source multi‑agent framework that integrates human‑engineered SOPs into LLM‑based agents to improve software generation, data interpretation, and simulation tasks, highlighting its rapid community growth, experimental successes, tool integration strategies, and future research directions.

Agent Collaboration · LLM · MetaGPT

0 likes · 20 min read

Alibaba Cloud Big Data AI Platform

Sep 11, 2024 · Artificial Intelligence

How to Build a Retrieval‑Augmented Generation (RAG) System with OpenSearch LLM and Dify

Learn step‑by‑step how to integrate OpenSearch LLM’s intelligent Q&A edition with the Dify large‑model platform to create a robust Retrieval‑Augmented Generation (RAG) system, covering architecture, workflow setup, API authentication, result parsing, and practical code examples.

AI · Dify · LLM

0 likes · 7 min read

Cloud Native Technology Community

Sep 10, 2024 · Industry Insights

What Makes Cloudflare AI Gateway Stand Out? A Deep Dive into AI API Gateway Features

This article analyzes the emerging AI Gateway market, compares major products such as Kong, Gloo, Higress, Portkey, and OneAPI, and provides a detailed technical review of Cloudflare AI Gateway’s architecture, capabilities, advantages, limitations, and practical usage for LLM integration.

AI gateway · Cloudflare · LLM

0 likes · 9 min read

Baobao Algorithm Notes

Sep 10, 2024 · Artificial Intelligence

Do LLMs Silence Human Voices? Unveiling the ‘Spiral of Silence’ in Retrieval‑Augmented Generation

This article reviews the ACL 2024 paper that investigates how large language model‑generated text influences retrieval‑augmented generation pipelines, revealing short‑term retrieval gains but a long‑term “spiral of silence” that marginalizes human‑generated content and homogenizes open‑domain QA results.

AI Impact · LLM · Open Domain QA

0 likes · 9 min read

Code Mala Tang

Sep 7, 2024 · Artificial Intelligence

Unlocking LangChain.js: The Swiss Army Knife for LLM Applications

This article introduces LangChain.js, its core concepts such as chats, templates, tools, and chains, demonstrates how to use LCEL for flexible workflow composition, and shows practical JavaScript code examples for building AI-powered applications with large language models.

AI workflow · LCEL · LLM

0 likes · 17 min read

Architect

Sep 6, 2024 · Artificial Intelligence

Combining Geo‑IP and Prompt Engineering with Higress AI Gateway: Implementation and Usage

This article explains Prompt Engineering, introduces the AI Gateway concept, and demonstrates how to integrate a Geo‑IP plugin with an AI prompt‑modifying plugin in Higress using Go and Wasm, providing configuration examples, implementation details, and sample request‑response scenarios.

AI · Geo-IP · Higress

0 likes · 11 min read

Architect's Alchemy Furnace

Sep 6, 2024 · Artificial Intelligence

Exploring LLM Application Architectures: From AI Embedded to Multi‑Agent Systems

This article examines the typical business and technical architectures for large language model applications, covering AI Embedded, Copilot, and Agent modes, single‑ and multi‑agent systems, core frameworks, and guidance on selecting appropriate technical routes.

AI agents · LLM · RAG

0 likes · 11 min read

iKang Technology Team

Sep 5, 2024 · Artificial Intelligence

What Is LangChain? Overview, Core Advantages, Components, and Use Cases

LangChain is a modular framework that streamlines integration of large language models by providing unified model interfaces, prompt optimization, memory handling, indexing, chains, and agents, enabling developers to quickly build and deploy sophisticated NLP applications such as text generation, information extraction, and dynamic tool‑driven workflows across various industries.

AI Framework · Chains · LLM

0 likes · 6 min read

Alibaba Cloud Native

Sep 4, 2024 · Cloud Native

Deploy NVIDIA NIM LLM Inference on Alibaba Cloud ACK with Auto‑Scaling and Monitoring

This guide walks you through deploying NVIDIA NIM for LLM inference on Alibaba Cloud ACK, integrating the Cloud Native AI Suite, configuring KServe, setting up Prometheus and Grafana monitoring, and implementing custom autoscaling based on request queue metrics.

ACK · Autoscaling · Grafana

0 likes · 15 min read

Full-Stack Cultivation Path

Sep 4, 2024 · Artificial Intelligence

Hot Open-Source RAG Tool for Document Chat: GraphRAG, Multimodal QA & Complex Reasoning

This article introduces Kotaemon, an open‑source Retrieval‑Augmented Generation platform that lets users chat with their documents, offering a self‑hosted web UI, support for local and API LLMs, hybrid retrieval, multimodal question answering, GraphRAG indexing, and advanced reasoning capabilities, along with step‑by‑step installation via App or Docker.

GraphRAG · LLM · Multimodal QA

0 likes · 6 min read

AI Large Model Application Practice

Sep 4, 2024 · Artificial Intelligence

When to Use GraphRAG vs. Traditional RAG and How to Combine Them

This article compares GraphRAG with traditional RAG across seven dimensions—suitable scenarios, knowledge representation, retrieval, comprehensive queries, hidden‑relationship understanding, scalability, and performance‑cost trade‑offs—explains how they can be fused, and offers guidance on selecting the right approach for complex data‑driven applications.

Artificial Intelligence · GraphRAG · LLM

0 likes · 13 min read

Alibaba Cloud Big Data AI Platform

Sep 2, 2024 · Artificial Intelligence

Turning PDFs and Word Docs into Searchable Knowledge for RAG Systems

This article explains why generic large language models struggle with domain‑specific data, introduces Retrieval‑Augmented Generation (RAG) as a solution, compares Word and PDF formats, outlines document‑parsing pipelines, reviews open‑source PDF tools, and presents Alibaba Cloud's rule‑based parsing architecture with performance results.

AI · Document Parsing · LLM

0 likes · 13 min read

Data Thinking Notes

Sep 1, 2024 · Artificial Intelligence

Master LLMs: Basics, Prompt Engineering, RAG, Agents & Multimodal AI

This article provides a comprehensive overview of large language models, covering their fundamental concepts, historical milestones, parameter scaling, prompt engineering techniques, retrieval‑augmented generation, autonomous agents, and multimodal model applications, illustrating how these technologies reshape AI capabilities across domains.

AI agents · LLM · Prompt Engineering

0 likes · 22 min read

Alibaba Cloud Big Data AI Platform

Aug 30, 2024 · Artificial Intelligence

Boost LLM Performance: Data Augmentation & Distillation with Qwen2

This guide explains how to reduce the computational cost of large language models by preparing instruction data, optionally augmenting or refining it, deploying teacher and student models on PAI, and performing distillation training with detailed hyper‑parameter settings and sample Python scripts.

AI · Deployment · LLM

0 likes · 21 min read

DataFunSummit

Aug 29, 2024 · Artificial Intelligence

Intelligent NPC Practices in Tencent Games: Multi‑Modal LLM Solutions and System Optimizations

This article details Tencent Games' end‑to‑end approach to building intelligent NPCs, covering the opportunities brought by AI, the practical implementation of multimodal LLM‑driven dialogue, knowledge‑augmented retrieval, long‑context handling, safety measures, multimodal expression (voice and facial animation), and system‑level performance optimizations for real‑time deployment.

AI · LLM · NPC

0 likes · 18 min read

Baobao Algorithm Notes

Aug 27, 2024 · Industry Insights

What Real‑World LLM Researchers Face: Scaling Limits, Data Bottlenecks, and Deployment Challenges

The author shares a candid account of recent large‑model experiments, highlighting why most labs struggle to exceed 100 B parameters, how data and hardware constraints shape model iteration, and the practical engineering, safety, and multimodal challenges that dictate real‑world LLM deployment.

AI Industry · AI scaling · LLM

0 likes · 6 min read
