Tagged articles
2079 articles
Alibaba Cloud Developer
Mar 24, 2025 · Artificial Intelligence

Why LLM Internet Search Fails and How to Fix It: A Deep Dive into Qwen, Doubao, and DeepSeek

This article analyses the shortcomings of large‑model internet search—such as unverifiable sources, fabricated content, and poor instruction compliance—by comparing Qwen‑max, Doubao‑1.5‑pro‑256k, and DeepSeek‑v3, and proposes prompt engineering, post‑processing, and custom tool improvements to boost reliability.

AI · Evaluation · LLM
0 likes · 22 min read
Ma Wei Says
Mar 24, 2025 · Artificial Intelligence

Master BGE Multilingual Embeddings: Models, Installation, and Quick Usage

Explore the BGE (BAAI General Embedding) family—including v1, v1.5, M3, Multilingual Gemma2, and EN‑ICL—detailing their multilingual capabilities, model variants, token limits, optimal use cases, and step‑by‑step installation and Python usage instructions with code examples for embedding generation and similarity scoring.

Embedding · LLM · Python
0 likes · 8 min read
Alibaba Cloud Big Data AI Platform
Mar 24, 2025 · Artificial Intelligence

How to Build a Real‑Time Data Analysis Agent with LLMs, Hologres, and MCP

This article explains the challenges LLMs face in data analysis, introduces the Model Context Protocol (MCP) as a standard bridge, and provides a step‑by‑step guide to integrate Hologres, MCP, and large language models—using Claude Desktop as an example—to create a fast, multi‑source data‑analysis agent.

AI Agent · Data Analysis · Hologres
0 likes · 11 min read
Architect
Mar 23, 2025 · Artificial Intelligence

The Future of AI Agents: From Prompt‑Driven Workflows to Model‑as‑Product and Reinforcement‑Learning‑Powered Agents

The article argues that the next wave of AI agents will shift from brittle, prompt‑driven workflows like Manus to truly autonomous, model‑centric agents trained with reinforcement learning and reasoning, exemplified by OpenAI's DeepResearch and Anthropic's Claude 3.7 Sonnet, while the API‑driven market model collapses.

AI Agents · Claude · DeepResearch
0 likes · 28 min read
Baobao Algorithm Notes
Mar 23, 2025 · Artificial Intelligence

Why Future AI Agents Must Evolve Beyond Prompt‑Driven Workflows

The article argues that the next generation of AI agents should focus on improving the model itself through reinforcement learning and reasoning rather than relying on pre‑designed prompt‑driven workflows, highlighting industry trends, technical challenges, and the shift toward treating models as products.

DeepSearch · LLM · model as product
0 likes · 29 min read
Architect
Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it often suffers from retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes these failure modes and offers solutions for each.

AI reliability · LLM · RAG
0 likes · 32 min read
Cognitive Technology Team
Mar 22, 2025 · Artificial Intelligence

Three Stages of Developing Large Language Models and Practical Guidance

The article outlines the three development phases of large language models—building, pre‑training, and fine‑tuning—describes usage options, highlights key factors such as data scale, architecture, training processes, and evaluation, and offers practical advice for cost‑effective development.

LLM · Large Language Model · Model Development
0 likes · 3 min read
Baobao Algorithm Notes
Mar 21, 2025 · Artificial Intelligence

Unlocking LLM Reasoning: A Deep Dive into Post‑Training Techniques

This article provides a comprehensive technical overview of large language model post‑training, covering fine‑tuning methods (full, parameter‑efficient, LoRA families, prompt tuning), domain‑adaptive tuning, reinforcement‑learning reward modeling, process vs. outcome rewards, inference‑enhancement strategies, dynamic compute allocation, verifier‑augmented reasoning, current challenges, and emerging research directions such as meta‑cognition, physical reasoning, and swarm intelligence.

LLM · meta-cognition · post-training
0 likes · 21 min read
Meituan Technology Team
Mar 20, 2025 · Artificial Intelligence

Meituan Tech Team's Selected Papers on Large Language Models and AI (2024-2025)

The article compiles Meituan’s recent 2024‑2025 research on large language models, presenting a diverse set of papers that explore transformer enhancements, scaling laws, safety optimization, instruction fine‑tuning, temporal decay learning, code generation, agent refinement, cost‑efficient MoE inference, quantization, fast parallel inference, speculative decoding, multilingual speech, vision‑language models, evaluation benchmarks, and jailbreak robustness.

AI · LLM · Meituan
0 likes · 4 min read
Sohu Tech Products
Mar 19, 2025 · Artificial Intelligence

How to Recreate a Translation Agent with LangGraph and LLMs

This guide demonstrates building a steerable LLM‑based translation workflow using LangGraph, covering the initial translation, model‑generated reflection suggestions, and final improvement steps with full Python code examples and a complete execution result.

AI · LLM · LangGraph
0 likes · 34 min read
Ops Development & AI Practice
Mar 19, 2025 · Artificial Intelligence

How Integrating LLMs with the Model Context Protocol Could Transform AI Workflows

Integrating large language models with the open‑standard Model Context Protocol enables direct access to file systems, databases, and APIs, unlocking use cases such as automated file management, intelligent data analysis, personalized content generation, and task automation, while also raising security, privacy, and maturity challenges for future AI‑human collaboration.

Data Security · LLM · cross-domain
0 likes · 10 min read
Ops Development & AI Practice
Mar 19, 2025 · Artificial Intelligence

Can Cache‑Augmented Generation Outperform RAG? A Deep Dive into LLM Efficiency

Cache‑augmented generation (CAG) preloads documents into LLM context using KV caches to eliminate retrieval latency, offering faster inference for static knowledge bases, while RAG remains more flexible for dynamic or large corpora; this article compares their definitions, performance, implementation steps, and future prospects.

CAG · Cache Augmentation · Inference Optimization
0 likes · 11 min read
DaTaobao Tech
Mar 19, 2025 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Principles, Challenges, and Implementation Techniques

Retrieval‑augmented generation (RAG) enhances large language models by integrating a preprocessing pipeline—cleaning, chunking, embedding, and vector storage—with a query‑driven retrieval and prompt‑injection workflow, leveraging vector databases, multi‑stage recall, advanced prompting, and comprehensive evaluation metrics to mitigate knowledge cut‑off, hallucinations, and security issues.

Evaluation · LLM · RAG
0 likes · 27 min read
Tencent Cloud Developer
Mar 19, 2025 · Artificial Intelligence

Inside Tencent Hunyuan Turbo S: Speed, Cost, and Hybrid Mamba Transformer Explained

Tencent's new Hunyuan Turbo S model delivers a 44% faster response time and dramatically lower token costs through a hybrid Mamba‑Transformer architecture that merges linear attention with full attention. The article also offers insights into fast‑thinking versus slow‑thinking LLM designs, MoE scaling laws, low‑precision training effects, and long‑short chain fusion techniques.

AIArchitecture · HybridMamba · LLM
0 likes · 14 min read
Alibaba Cloud Infrastructure
Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Kubernetes
0 likes · 25 min read
JD Tech Talk
Mar 18, 2025 · Artificial Intelligence

Generative Recommendation for CPS Advertising: Intent Sensing, Multi‑Objective Optimization, and the One4All Framework

This article surveys recent advances in generative recommendation for CPS advertising, detailing explicit intent‑aware controllable product recommendation, multi‑objective optimization techniques based on reward‑in‑context and DPO, and the scalable One4All framework that unifies behavior and language modeling across diverse ad scenarios.

CPS advertising · Generative Recommendation · LLM
0 likes · 14 min read
JD Cloud Developers
Mar 18, 2025 · Artificial Intelligence

How Generative LLMs Are Transforming CPS Advertising Recommendations

Since large language models have excelled in NLP, researchers are now enhancing CPS advertising recommendation systems by integrating generative LLMs for explicit intent perception, multi‑objective optimization, and a unified One4All framework, achieving significant offline and online performance gains across click‑through, conversion, and revenue metrics.

CPS advertising · Generative Recommendation · LLM
0 likes · 19 min read
AI Algorithm Path
Mar 17, 2025 · Artificial Intelligence

Agentic AI vs Generative AI: Key Differences and Comparative Analysis

The article defines Agentic AI as autonomous, goal‑directed systems that can act and learn from experience, contrasts it with Generative AI’s passive, single‑step content generation, and illustrates the practical advantage of Agentic workflows through Andrew Ng’s HumanEval benchmark where a step‑wise approach outperforms zero‑shot prompting even for older models.

AI autonomy · Benchmark · Generative AI
0 likes · 10 min read
Infra Learning Club
Mar 17, 2025 · Artificial Intelligence

Testing OpenManus with DeepSeek: A Hands‑On Evaluation

The author walks through installing OpenManus, configuring it to use DeepSeek (and an Ollama‑based vision model), runs a sample financial data query, and reports that the system is slow, sometimes inaccurate, and still requires further optimization.

AI Agents · DeepSeek · LLM
0 likes · 5 min read
Ops Development & AI Practice
Mar 17, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Open WebUI

Open WebUI offers a user‑friendly, open‑source web interface that simplifies interaction with large language models, supporting multiple back‑ends, offline operation, and extensible plugins, making AI experimentation accessible for developers, researchers, and enthusiasts alike.

AI · LLM · Model Management
0 likes · 4 min read
Alibaba Cloud Infrastructure
Mar 17, 2025 · Cloud Native

Boost LLM Inference with ACK Gateway AI Extension: A Step‑by‑Step Guide

This guide demonstrates how to deploy the QwQ‑32B large language model on an Alibaba Cloud ACK cluster, configure OSS storage, enable the ACK Gateway with AI Extension, set up InferencePool and InferenceModel resources, and benchmark intelligent routing versus standard gateway routing, revealing latency and throughput improvements.

ACK Gateway · AI Extension · Kubernetes
0 likes · 16 min read
Cognitive Technology Team
Mar 17, 2025 · Artificial Intelligence

Leveraging Large Language Models to Optimize Traditional Machine Learning Pipelines

Large language models can assist and enhance each stage of traditional machine learning—including sample generation, data cleaning, feature engineering, model selection, hyper‑parameter tuning, and workflow automation—by generating synthetic data, refining features, selecting models, and orchestrating pipelines, though challenges such as bias, privacy, and noise remain.

Data Generation · Feature Engineering · LLM
0 likes · 11 min read
Ops Development & AI Practice
Mar 16, 2025 · Artificial Intelligence

How Function Calling Helps LLMs Overcome Hallucinations

This article explains how LLM function calling works, from defining external functions to processing API responses, and demonstrates a Python example using OpenAI's GPT‑4o to fetch real‑time weather, showing how the technique mitigates hallucinations and expands practical AI applications.

AI · Function Calling · Hallucination Mitigation
0 likes · 8 min read
Architect
Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG
0 likes · 11 min read
AI Algorithm Path
Mar 15, 2025 · Artificial Intelligence

Why the Industry Is Shifting From AI Agents to Agentic Workflows

The article explains that low accuracy and security risks of current AI agents—evidenced by a Claude AI Agent achieving only 14% of human performance and an average success rate of about 20%—are driving a move toward agentic workflows, which offer observable, auditable, and data‑synthesizing pipelines that dramatically improve enterprise productivity.

AI Agents · LLM · Observability
0 likes · 7 min read
DataFunSummit
Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes Zhihu's machine‑learning platform lead Wang Xin's presentation on the ZhiLight large‑model inference framework, covering model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · LLM · Open‑source
0 likes · 13 min read
Baidu Geek Talk
Mar 12, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Semantic Embeddings: Models, Methods, and Trends

This article reviews how large language models (LLMs) enhance semantic text embeddings by comparing traditional methods with LLM‑based approaches, detailing synthetic data generation, backbone model designs, key model families, experimental results on the MTEB benchmark, and future research challenges.

LLM · contrastive learning · model comparison
0 likes · 30 min read
DaTaobao Tech
Mar 12, 2025 · Artificial Intelligence

Multimodal Automatic Layout Generation for E-commerce

The project develops a multimodal automatic layout generation system for e‑commerce by fine‑tuning the qwen‑vl‑7b vision‑language model with LoRA on poster and Taobao image‑layout data. It employs diffusion‑based image generation and coordinate‑prediction methods to produce structured layouts that power poster, marketing‑image, and video‑cover creation with over 90% adoption, and it explores multi‑image, style‑aware, and iterative‑refinement extensions.

LLM · diffusion · e‑commerce
0 likes · 12 min read
NewBeeNLP
Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts
0 likes · 18 min read
Tencent Cloud Developer
Mar 11, 2025 · Artificial Intelligence

Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications

The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, then wrapping the model in a web UI that queries a ChromaDB vector store via crawled web data, illustrating security‑focused use cases and forecasting domain‑specific LLM adoption.

AI · LLM · LLaMA-Factory
0 likes · 17 min read
Architect
Mar 10, 2025 · Artificial Intelligence

What Makes DeepSeek’s New Architecture a Game‑Changer? Inside MLA, GRPO, and MoE Innovations

This article analyzes DeepSeek’s latest large‑model breakthroughs, covering the MLA attention compression, GRPO alignment algorithm, MoE load‑balancing redesign, multi‑stage training pipelines, reinforcement‑learning tricks, and performance comparisons with GPT‑4o‑Mini and Llama 3.1, highlighting both strengths and remaining challenges.

AI training · DeepSeek · GRPO
0 likes · 19 min read
AI Algorithm Path
Mar 10, 2025 · Artificial Intelligence

How Much GPU Memory Does an LLM Service Really Need?

This article explains a simple formula for estimating the GPU VRAM required to serve large language models, demonstrates the calculation with a 7‑billion‑parameter example, clarifies why a 20% safety buffer is needed, and offers practical strategies such as quantization, CPU offload, and multi‑GPU parallelism to reduce memory usage.
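
The summary does not reproduce the article's formula, so the following is only a minimal sketch under an assumed rule of thumb: weight memory is the parameter count times the bytes per parameter, multiplied by 1.2 to cover the 20% buffer mentioned above. The function name `estimate_vram_gb` and its parameters are hypothetical.

```python
def estimate_vram_gb(params_billion: float,
                     bits_per_param: int = 16,
                     overhead: float = 1.2) -> float:
    """Rough serving-memory estimate in GB (assumed rule of thumb).

    params_billion: model size in billions of parameters.
    bits_per_param: precision of the stored weights (16 for FP16, 4 for INT4).
    overhead:       multiplier for the safety buffer (1.2 = 20% extra).
    """
    bytes_per_param = bits_per_param / 8
    # 1e9 parameters * bytes per parameter / 1e9 bytes per GB = GB
    return params_billion * bytes_per_param * overhead


if __name__ == "__main__":
    print(f"7B @ 16-bit: {estimate_vram_gb(7):.1f} GB")
    print(f"7B @ 4-bit:  {estimate_vram_gb(7, bits_per_param=4):.1f} GB")
```

Under these assumptions a 7‑billion‑parameter model needs roughly 16.8 GB at 16‑bit precision and about 4.2 GB after 4‑bit quantization, which illustrates why quantization is listed among the memory‑saving strategies.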

GPU Memory · LLM · Model Quantization
0 likes · 6 min read
Tencent Technical Engineering
Mar 10, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build LLM Apps: Prompt Engineering, RAG, and Function Calling Explained

This guide shows non‑AI developers how to create large‑model applications by mastering prompt engineering, multi‑turn interactions, Retrieval‑Augmented Generation, function calling, and AI‑Agent integration, with practical code examples, tool design patterns, and deployment tips.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
Baobao Algorithm Notes
Mar 10, 2025 · Artificial Intelligence

Why DeepSeek V3’s FP8 Training Beats Traditional Schemes: A Deep Dive

This article provides a detailed technical analysis of FP8 training, comparing Nvidia’s TransformerEngine approach with DeepSeek V3’s novel scheme, and examines how block‑wise scaling, high‑precision accumulation, and vector length and correlation affect quantization error and signal‑to‑noise ratio in large‑language‑model training.

DeepSeek · FP8 · LLM
0 likes · 20 min read
phodal
Mar 10, 2025 · Artificial Intelligence

How AutoDev Bridge Uses LLMs to Accelerate Legacy System Migration

AutoDev Bridge combines large‑model reasoning, C4 architecture analysis, AST‑based business logic extraction, and IDE‑integrated tooling to automate the migration of legacy systems, reducing manual effort and migration risk while highlighting the unique advantages of modern AI agents.

AI · Code Translation · LLM
0 likes · 7 min read
DevOps
Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the MCP protocol for building AI agents, all aimed at non‑AI specialists.

AI Agent · Embedding · Function Calling
0 likes · 48 min read
Alibaba Cloud Infrastructure
Mar 9, 2025 · Cloud Computing

Deploy QwQ-32B LLM Inference on Alibaba Cloud ACS with vLLM: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud Container Compute Service (ACS) to provision GPU resources, prepare the QwQ-32B model, configure persistent storage, deploy the model with vLLM, set up OpenWebUI, verify the service, and optionally benchmark its performance, all with detailed commands and YAML examples.

ACS · Alibaba Cloud · Benchmark
0 likes · 17 min read
AI Frontier Lectures
Mar 9, 2025 · Industry Insights

Why the Model Is Becoming the Product: AI Market Trends and Risks

The article argues that AI models are evolving into standalone products, examines scaling limits, integration challenges, reinforcement‑learning economics, and investment dynamics, and warns that reliance on large‑lab APIs may jeopardize future profitability for integrators.

AI · IndustryInsights · LLM
0 likes · 15 min read
Alibaba Cloud Big Data AI Platform
Mar 7, 2025 · Artificial Intelligence

How QwQ-32B Outperforms OpenAI o1-mini and Deploys in One Click on Alibaba Cloud

Alibaba Cloud's newly released QwQ-32B model delivers benchmark‑level performance rivaling top open‑source LLMs, integrates agent capabilities, and can be deployed with a single click through the PAI‑Model Gallery, offering a cost‑effective solution for developers seeking advanced AI inference.

AI Benchmark · Alibaba Cloud · LLM
0 likes · 5 min read
dbaplus Community
Mar 7, 2025 · Artificial Intelligence

Master Prompt Engineering: Frameworks, Strategies, and Real‑World Examples for Large Language Models

This comprehensive guide explains what prompts are, outlines essential prompt components and multiple engineering frameworks, presents practical strategies for crafting clear and structured prompts, addresses model limitations such as hallucinations, and showcases a wide range of advanced prompting techniques with code examples.

AI · LLM · Prompt Engineering
0 likes · 29 min read
DevOps
Mar 6, 2025 · Artificial Intelligence

Building Multi-Model Chat Agents with Dify: Integrating DeepSeek‑R1 and Gemini

This article explains how to create a high‑performance multi‑model chat agent on the Dify platform by combining DeepSeek‑R1 for reasoning and Gemini for answer generation, covering the underlying principles, configuration steps, API integration, performance benchmarks, and practical deployment guidance.

API integration · Chatbot · DeepSeek
0 likes · 12 min read
Cognitive Technology Team
Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

LLM · Open Source · RAG
0 likes · 18 min read
AI Algorithm Path
Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.
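
As a rough illustration of the three parameters named above, here is a minimal sketch using only the Python standard library. The function names and the toy logits are hypothetical and are not taken from the article; real inference libraries apply the same ideas to full vocabulary distributions.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn logits into probabilities; a lower temperature sharpens the peak."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def _renormalize(probs, keep):
    """Zero out every index not in `keep` and rescale the rest to sum to 1."""
    keep_set = set(keep)
    total = sum(probs[i] for i in keep_set)
    return [p / total if i in keep_set else 0.0 for i, p in enumerate(probs)]


def top_k_filter(probs, k):
    """Keep only the k most probable tokens."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return _renormalize(probs, order[:k])


def top_p_filter(probs, p):
    """Keep the smallest set of top tokens whose cumulative probability reaches p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cumulative = [], 0.0
    for i in order:
        keep.append(i)
        cumulative += probs[i]
        if cumulative >= p:
            break
    return _renormalize(probs, keep)


if __name__ == "__main__":
    logits = [2.0, 1.0, 0.5, 0.1]  # toy scores for four candidate tokens
    for t in (0.5, 1.0, 2.0):
        print(t, [round(x, 3) for x in softmax_with_temperature(logits, t)])
    probs = softmax_with_temperature(logits)
    print("top-k=2:", [round(x, 3) for x in top_k_filter(probs, 2)])
    print("top-p=0.9:", [round(x, 3) for x in top_p_filter(probs, 0.9)])
```

Greedy search corresponds to always taking the single highest‑probability token; the filters above instead narrow the candidate pool before a token is drawn at random from what remains.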

Beam Search · Greedy Search · LLM
0 likes · 9 min read
Tencent Cloud Developer
Mar 4, 2025 · Artificial Intelligence

A Practical Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling and AI Agents

The guide teaches non‑AI developers how to build practical LLM‑powered applications by mastering prompt engineering, function calling, retrieval‑augmented generation, and AI agents, and introduces the Model Context Protocol for seamless tool integration, offering a clear learning path to leverage large language models without deep theory.

AI Agent · Function Calling · LLM
0 likes · 48 min read
Architect
Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use‑cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods—including inference‑time scaling, pure RL, SFT + RL, and distillation—and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling
0 likes · 27 min read
Code Mala Tang
Mar 3, 2025 · Artificial Intelligence

Unlock AI’s Full Potential with Structured Prompt Decorators

Prompt Decorators are structured prefixes that standardize and enhance AI responses, addressing common challenges like vague prompts, inconsistent answers, and lack of reasoning by guiding the model to produce clear, logical, and well‑organized outputs across various use cases.

AI · LLM · Prompt Engineering
0 likes · 23 min read
Fighter's World
Mar 3, 2025 · Artificial Intelligence

How OpenAI’s Deep Research Is Sparking a Wave of LLM‑Powered Search Experiments

The article explains what Deep Research agents are, walks through a concrete example of investigating the $6 million training cost controversy of DeepSeek V3, details the multi‑step plan‑edit‑execute workflow, and discusses broader implications for AI efficiency, market dynamics, and product design.

AI Agents · LLM · cost efficiency
0 likes · 10 min read
AI Large Model Application Practice
Mar 3, 2025 · Artificial Intelligence

Can DeepSeek‑R1 Unlock True “Deep Thinking” for Enterprise RAG?

This article examines how swapping in DeepSeek‑R1 enhances Retrieval‑Augmented Generation with deeper reasoning, outlines its benefits and pitfalls—including slower inference, higher compute costs, and hallucinations—provides a simple hallucination test, and proposes an Agentic RAG research assistant to balance accuracy and creativity.

AI reasoning · DeepSeek · LLM
0 likes · 10 min read
JD Retail Technology
Mar 1, 2025 · Industry Insights

How JD Retail’s AI Assistant Uses Multimodal LLMs to Boost E‑Commerce

JD Retail’s AI assistant combines a Master‑Sub agent framework, ReAct paradigm, multimodal integration and MoE architecture to improve sales forecasting, pricing, and recommendation accuracy, while the team’s collaborative culture and open talent pathways illustrate how cutting‑edge AI is applied in real‑world e‑commerce.

AI · JD Retail · LLM
0 likes · 8 min read
Code Mala Tang
Mar 1, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate and How Can We Fix It?

This article explains why large language models produce plausible‑looking but false information, traces the problem to the supervised fine‑tuning stage, and outlines mitigation techniques such as knowledge interrogation, RLHF, and tool‑augmented search to reduce hallucinations.

LLM · RLHF · hallucination
0 likes · 12 min read
AntTech
Mar 1, 2025 · Artificial Intelligence

ScaleOT: Privacy‑Utility‑Scalable Offsite‑Tuning with Dynamic LayerReplace and Selective Rank Compression

The ScaleOT framework introduces a privacy‑preserving offsite‑tuning pipeline for large language models that combines importance‑aware dynamic layer replacement with selective rank compression, enabling flexible model compression, near‑lossless fine‑tuning, and strong privacy guarantees across diverse downstream tasks.

Adapter · LLM · Model Compression
0 likes · 16 min read
Cognitive Technology Team
Feb 28, 2025 · Artificial Intelligence

Design and High‑Availability Architecture of Alibaba LangEngine AI Application Framework

This article introduces Alibaba's LangEngine, a pure Java AI application framework, detailing its high‑availability gateway architecture, communication protocols, streaming and non‑streaming output, multi‑level metadata caching, asynchronous and serverless designs, and future open‑source roadmap, offering practical guidance for building robust AI services.

AI Framework, LLM, LangEngine
0 likes · 11 min read
AI Large Model Application Practice
Feb 28, 2025 · Artificial Intelligence

How Self-Attention Powers LLMs: A Step‑by‑Step Deep Dive

This article explains the self‑attention mechanism behind large language models, detailing why static word importance fails, how queries, keys, and values are generated, how attention scores are computed, scaled, softmaxed, and used to produce context‑aware word vectors, while noting computational costs.

AI, LLM, Self-Attention
0 likes · 9 min read
JavaEdge
Feb 27, 2025 · Artificial Intelligence

How to Quickly Build a DeepSeek‑Powered Knowledge Base on Tencent Cloud

This guide walks through deploying the full‑feature DeepSeek V3+R1 model on Tencent Cloud, configuring a smart knowledge‑base application, importing documentation, enabling internet search, tuning retrieval parameters, and publishing the app for public use, all without writing code.

AI, DeepSeek, Knowledge Base
0 likes · 6 min read
Baidu Tech Salon
Feb 26, 2025 · Artificial Intelligence

Graph‑Engine‑Driven Workflow for Building Intelligent Agents

The article presents a graph‑engine‑driven workflow platform that lets developers assemble, orchestrate, and execute intelligent LLM‑based agents with low‑code visual design, fine‑grained path control, hierarchical sub‑flows, and event‑driven hooks, addressing perception, reasoning, planning, and scalability challenges while surpassing existing frameworks.

Data Decoupling, Intelligent agents, LLM
0 likes · 19 min read
Alibaba Cloud Big Data AI Platform
Feb 25, 2025 · Artificial Intelligence

Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio

This step‑by‑step guide shows how to assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, the DeepSeek large language model, and PAI LangStudio, covering instance creation, data upload, model deployment, connection setup, flow design, and service invocation.

AI Tutorial, DeepSeek, LLM
0 likes · 9 min read
Alibaba Cloud Big Data AI Platform
Feb 25, 2025 · Artificial Intelligence

How DistilQwen2.5 Boosts LLM Efficiency with Dual‑Stage Knowledge Distillation

This article introduces DistilQwen2.5, a lightweight LLM series built on Qwen2.5 that uses a novel two‑layer distillation framework, instruction‑data optimization, and parameter‑fusion techniques to achieve higher performance while drastically reducing computational cost and deployment overhead.

Efficient Inference, Knowledge Distillation, LLM
0 likes · 26 min read
DataFunSummit
Feb 25, 2025 · Artificial Intelligence

Collecting High-Quality LLM Training Data and Custom Model Training Guide

This article explains what constitutes high‑quality LLM training data, why large datasets are essential, outlines the step‑by‑step process for collecting, preprocessing, and fine‑tuning models, and highlights the best data sources—including web content, books, code repositories, and news—while noting available free datasets.

AI, Data Collection, LLM
0 likes · 9 min read
Code Mala Tang
Feb 25, 2025 · Artificial Intelligence

How Resources, Tools, and Prompts Power LLM Super‑Agents

This article explains how the Resources data hub, Tools capability engine, and Prompts interaction templates work together to create a secure, extensible workflow that enables large language models to ingest data, execute tasks, and generate structured outputs.

AI workflow, Artificial Intelligence, LLM
0 likes · 5 min read
CSS Magic
Feb 25, 2025 · Artificial Intelligence

Two Simple Ways to Access DeepSeek API for Free

This guide shows how to obtain free DeepSeek API access through GitHub Models and SiliconFlow, detailing the required API base URL, key, and model name, how to register, create keys, verify usage with a web chat tool, and compare model choices and platform limits.

API, DeepSeek, Free access
0 likes · 7 min read
Baidu Geek Talk
Feb 24, 2025 · Artificial Intelligence

Using a Graph Engine to Drive Workflow for Intelligent Agents

By leveraging mature graph‑engine technology, the article shows how visual, low‑code workflow orchestration can give intelligent LLM‑based agents fine‑grained path control, reusable functions, hierarchical sub‑flows, and robust error handling, turning complex business tasks into modular, scalable processes adopted by hundreds of thousands of developers.

AI Agents, LLM, Low‑code
0 likes · 18 min read
Architecture Digest
Feb 24, 2025 · Artificial Intelligence

MoBA: Mixture of Block Attention for Long‑Context Large Language Models

The article introduces MoBA, a Mixture‑of‑Block‑Attention mechanism that applies Mixture‑of‑Experts principles to transformer attention, enabling efficient long‑context processing for large language models while maintaining performance comparable to full attention through sparse, trainable block selection and seamless switching.

LLM, Mixture of Experts, MoBA
0 likes · 12 min read
AI Large Model Application Practice
Feb 24, 2025 · Artificial Intelligence

How Web Agents Combine LLMs and Browser Automation to Perform Real‑World Tasks

This article explains what Web Agents are, their ReAct‑style reasoning loop, key implementation technologies such as observation parsing, multimodal models, and browser control tools like Selenium and Playwright, and demonstrates building a DeepSeek‑powered Web Agent with the Browser‑use framework, including code samples and performance insights.

DeepSeek, LLM, Playwright
0 likes · 11 min read
Java Web Project
Feb 23, 2025 · Artificial Intelligence

Build Your First AI Chatbot with Spring Boot and DeepSeek LLM

This guide walks you through creating a Spring Boot project, configuring DeepSeek's large language model via SiliconFlow, setting up OpenAI‑compatible parameters, and implementing a REST controller that returns weather forecasts using the model, complete with step‑by‑step code snippets, configuration files, and deployment instructions.

AI, Chatbot, DeepSeek
0 likes · 7 min read
Ma Wei Says
Feb 23, 2025 · Artificial Intelligence

How Microsoft’s PIKE‑RAG Builds Knowledge‑Driven AI Across Four Stages

The article explains Microsoft’s open‑source PIKE‑RAG system, detailing its four progressive stages—from knowledge‑base construction to creative multi‑agent reasoning—while describing the underlying modules, chunking strategies, multi‑granularity retrieval, and code snippets that enable specialized domain understanding and inference.

AI Retrieval, Knowledge Graph, LLM
0 likes · 11 min read
Architecture and Beyond
Feb 22, 2025 · Artificial Intelligence

Understanding Retrieval‑Augmented Generation (RAG) and Its Role in Enhancing Large Language Models

The article explains how Retrieval‑Augmented Generation mitigates key limitations of large language models (stale knowledge, hallucination, no access to private data, untraceable output, limited long‑text handling, and data‑security concerns) by combining external retrieval, augmentation, and generation to provide up‑to‑date, reliable, and secure AI responses.

AI, Knowledge augmentation, LLM
0 likes · 15 min read
Infra Learning Club
Feb 21, 2025 · Artificial Intelligence

5 Must‑Try Open‑Source AI Projects You Can Start Using Today

This article introduces five open‑source AI tools—a PPT generator, an LLM app development platform, a cloud‑agnostic AI runner, a curated collection of LLM applications, and a one‑click HD video creator—detailing their key features, usage links, and sample configurations.

AI, Dify, LLM
0 likes · 8 min read
Ma Wei Says
Feb 21, 2025 · Artificial Intelligence

How PIKE‑RAG Boosts Retrieval‑Augmented Generation for Industrial AI

PIKE‑RAG, a Retrieval‑Augmented Generation framework from Microsoft Research, tackles knowledge source diversity, one‑size‑fits‑all limitations, and LLMs' lack of domain expertise by building multi‑layer heterogeneous graphs, task‑driven modular pipelines, and a staged L0‑L4 system for more accurate industrial AI responses.

AI, KnowledgeGraph, LLM
0 likes · 5 min read
Architect
Feb 20, 2025 · Artificial Intelligence

Why Long CoT and In‑Context RL Are the Next Frontier for LLMs

The article analyses recent breakthroughs such as OpenAI's o1, Long CoT, and test‑time search, arguing that enabling LLMs to perform self‑critique and reinforcement learning with long output sequences is essential for future AI performance, while warning against overly structured workflows.

AI research, In‑Context RL, LLM
0 likes · 12 min read
JD Tech Talk
Feb 20, 2025 · Artificial Intelligence

Multi‑Agent Architecture for an E‑Commerce Business Assistant: Design, Planning, Evaluation, and Sample Generation

The document describes the evolution, design principles, key technologies, online inference workflow, evaluation methods, and sample‑generation techniques of a large‑language‑model‑based multi‑agent system that powers a 24/7 e‑commerce merchant assistant, highlighting its benefits, challenges, and future work.

AI Planning, LLM, Reward Model
0 likes · 21 min read
JD Cloud Developers
Feb 20, 2025 · Artificial Intelligence

How Multi‑Agent ReAct Architecture Boosts E‑Commerce AI Assistants

This article explains the evolution of multi‑agent systems for e‑commerce assistants, detailing the ReAct‑based planning framework, hierarchical master‑sub agent collaboration, evaluation methods, and sample‑generation techniques that together improve accuracy, efficiency, and scalability of AI‑driven merchant services.

AI Planning, Agent Architecture, LLM
0 likes · 23 min read
Alibaba Cloud Developer
Feb 20, 2025 · Artificial Intelligence

How LLMs Power Real-Time Interactive 3D Worlds in Unreal Engine

This article explains how large language models are integrated with Unreal Engine to enable natural‑language‑driven 3D model search, manipulation, and scene understanding, detailing metadata extraction, vision‑language labeling, RAG‑based retrieval, and function‑call translation for interactive virtual environments.

3D interaction, LLM, RAG
0 likes · 21 min read
Architects' Tech Alliance
Feb 18, 2025 · Artificial Intelligence

How to Distill DeepSeek LLMs into Lightweight Models for Local Deployment

This article explains DeepSeek's knowledge‑distillation approach for compressing large language models into small, efficient student models, details step‑by‑step local deployment requirements, performance optimizations, and highlights the cost, privacy, and application benefits of running the distilled model on‑premise.

AI inference, DeepSeek, Knowledge Distillation
0 likes · 10 min read
Big Data Tech Team
Feb 18, 2025 · Artificial Intelligence

How DeepSeek Trains and Optimizes Its LLMs: From Pre‑training to Reasoning Models

This article breaks down DeepSeek's LLM training pipeline, explaining the massive pre‑training phase, instruction fine‑tuning, reinforcement‑learning‑from‑human‑feedback, and the distinct roles of its V3 instruction model and R1 reasoning model, while also highlighting performance metrics and current limitations.

DeepSeek, LLM, RLHF
0 likes · 8 min read
Java One
Feb 17, 2025 · Artificial Intelligence

How to Get Free Access to DeepSeek R1 Across Major Cloud Platforms

This guide walks you through using DeepSeek R1 via the official website or popular third‑party cloud services, compares free token quotas, explains token accounting, and provides step‑by‑step instructions for configuring API access and AI clients such as Chatbox, Cherry Studio, and Dify.

AI client, API, DeepSeek
0 likes · 11 min read