Tagged articles

2079 articles

Mar 24, 2025 · Artificial Intelligence

Boost LLM Evaluation with Semantic Enrichment and Vector Search

This article explains how semantic enrichment, vector and hybrid search, and clustering techniques can be applied to large language model logs to evaluate inputs and outputs, improve compliance auditing, and enhance model iteration across various business scenarios.

AI · Evaluation · LLM

0 likes · 12 min read

Alibaba Cloud Developer

Mar 24, 2025 · Artificial Intelligence

Why LLM Internet Search Fails and How to Fix It: A Deep Dive into Qwen, Doubao, and DeepSeek

This article analyses the shortcomings of large‑model internet search—such as unverifiable sources, fabricated content, and poor instruction compliance—by comparing Qwen‑max, Doubao‑1.5‑pro‑256k, and DeepSeek‑v3, and proposes prompt engineering, post‑processing, and custom tool improvements to boost reliability.

AI · Evaluation · LLM

0 likes · 22 min read

AI Large Model Application Practice

Mar 24, 2025 · Artificial Intelligence

How to Build a Multimodal RAG Pipeline for PPT Documents with Vision LLMs

This article explains a step‑by‑step implementation of a multimodal Retrieval‑Augmented Generation system that parses PPT/PDF files, extracts rich text and images with vision models, indexes them in a vector store, and generates answers that combine markdown and relevant slide screenshots.

LLM · Python · RAG

0 likes · 9 min read

Ma Wei Says

Mar 24, 2025 · Artificial Intelligence

Master BGE Multilingual Embeddings: Models, Installation, and Quick Usage

Explore the BGE (BAAI General Embedding) family—including v1, v1.5, M3, Multilingual Gemma2, and EN‑ICL—detailing their multilingual capabilities, model variants, token limits, optimal use cases, and step‑by‑step installation and Python usage instructions with code examples for embedding generation and similarity scoring.

Embedding · LLM · Python

0 likes · 8 min read

Alibaba Cloud Big Data AI Platform

Mar 24, 2025 · Artificial Intelligence

How to Build a Real‑Time Data Analysis Agent with LLMs, Hologres, and MCP

This article explains the challenges LLMs face in data analysis, introduces the Model Context Protocol (MCP) as a standard bridge, and provides a step‑by‑step guide to integrate Hologres, MCP, and large language models—using Claude Desktop as an example—to create a fast, multi‑source data‑analysis agent.

AI Agent · Data Analysis · Hologres

0 likes · 11 min read

Architect

Mar 23, 2025 · Artificial Intelligence

The Future of AI Agents: From Prompt‑Driven Workflows to Model‑as‑Product and Reinforcement‑Learning‑Powered Agents

The article argues that the next wave of AI agents will shift from brittle, prompt‑driven workflows like Manus to truly autonomous, model‑centric agents trained with reinforcement learning and reasoning, exemplified by OpenAI's DeepResearch and Anthropic's Claude 3.7 Sonnet, while the API‑driven market model collapses.

AI Agents · Claude · DeepResearch

0 likes · 28 min read

Baobao Algorithm Notes

Mar 23, 2025 · Artificial Intelligence

Why Future AI Agents Must Evolve Beyond Prompt‑Driven Workflows

The article argues that the next generation of AI agents should focus on improving the model itself through reinforcement learning and reasoning rather than relying on pre‑designed prompt‑driven workflows, highlighting industry trends, technical challenges, and the shift toward treating models as products.

DeepSearch · LLM · model as product

0 likes · 29 min read

Architect

Mar 22, 2025 · Artificial Intelligence

Understanding and Mitigating Failures in Retrieval‑Augmented Generation (RAG) Systems

Retrieval‑augmented generation (RAG) combines external knowledge retrieval with large language models to improve answer accuracy, but it often suffers from retrieval mismatches, algorithmic flaws, chunking issues, embedding biases, inefficiencies, generation errors, reasoning limits, formatting problems, system‑level failures, and high resource costs. This article analyzes these failure modes and offers solutions.

AI reliability · LLM · RAG

0 likes · 32 min read

Cognitive Technology Team

Mar 22, 2025 · Artificial Intelligence

Three Stages of Developing Large Language Models and Practical Guidance

The article outlines the three development phases of large language models—building, pre‑training, and fine‑tuning—describes usage options, highlights key factors such as data scale, architecture, training processes, and evaluation, and offers practical advice for cost‑effective development.

LLM · Large Language Model · Model Development

0 likes · 3 min read

Baobao Algorithm Notes

Mar 21, 2025 · Artificial Intelligence

Unlocking LLM Reasoning: A Deep Dive into Post‑Training Techniques

This article provides a comprehensive technical overview of large language model post‑training, covering fine‑tuning methods (full, parameter‑efficient, LoRA families, prompt tuning), domain‑adaptive tuning, reinforcement‑learning reward modeling, process vs. outcome rewards, inference‑enhancement strategies, dynamic compute allocation, verifier‑augmented reasoning, current challenges, and emerging research directions such as meta‑cognition, physical reasoning, and swarm intelligence.

LLM · meta-cognition · post-training

0 likes · 21 min read

Meituan Technology Team

Mar 20, 2025 · Artificial Intelligence

Meituan Tech Team's Selected Papers on Large Language Models and AI (2024-2025)

The article compiles Meituan’s recent 2024‑2025 research on large language models, presenting a diverse set of papers that explore transformer enhancements, scaling laws, safety optimization, instruction fine‑tuning, temporal decay learning, code generation, agent refinement, cost‑efficient MoE inference, quantization, fast parallel inference, speculative decoding, multilingual speech, vision‑language models, evaluation benchmarks, and jailbreak robustness.

AI · LLM · Meituan

0 likes · 4 min read

AI Large Model Application Practice

Mar 20, 2025 · Artificial Intelligence

Mastering Model Context Protocol (MCP): Build AI Agents with LlamaIndex & LangGraph

This guide explains the Model Context Protocol (MCP), its architecture, and how to create and debug MCP servers and clients in Python, then shows how to integrate third‑party MCP servers with LlamaIndex or LangGraph to quickly build powerful LLM agents.

LLM · LangGraph · LlamaIndex

0 likes · 12 min read

Sohu Tech Products

Mar 19, 2025 · Artificial Intelligence

How to Recreate a Translation Agent with LangGraph and LLMs

This guide demonstrates building a steerable LLM‑based translation workflow using LangGraph, covering the initial translation, model‑generated reflection suggestions, and final improvement steps with full Python code examples and a complete execution result.

AI · LLM · LangGraph

0 likes · 34 min read

Ops Development & AI Practice

Mar 19, 2025 · Artificial Intelligence

How Integrating LLMs with the Model Context Protocol Could Transform AI Workflows

Integrating large language models with the open‑standard Model Context Protocol enables direct access to file systems, databases, and APIs, unlocking use cases such as automated file management, intelligent data analysis, personalized content generation, and task automation, while also raising security, privacy, and maturity challenges for future AI‑human collaboration.

Data Security · LLM · cross-domain

0 likes · 10 min read

Ops Development & AI Practice

Mar 19, 2025 · Artificial Intelligence

Can Cache‑Augmented Generation Outperform RAG? A Deep Dive into LLM Efficiency

Cache‑augmented generation (CAG) preloads documents into LLM context using KV caches to eliminate retrieval latency, offering faster inference for static knowledge bases, while RAG remains more flexible for dynamic or large corpora; this article compares their definitions, performance, implementation steps, and future prospects.

CAG · Cache Augmentation · Inference Optimization

0 likes · 11 min read

DaTaobao Tech

Mar 19, 2025 · Artificial Intelligence

Retrieval Augmented Generation (RAG): Principles, Challenges, and Implementation Techniques

Retrieval‑augmented generation (RAG) enhances large language models by integrating a preprocessing pipeline—cleaning, chunking, embedding, and vector storage—with a query‑driven retrieval and prompt‑injection workflow, leveraging vector databases, multi‑stage recall, advanced prompting, and comprehensive evaluation metrics to mitigate knowledge cut‑off, hallucinations, and security issues.

Evaluation · LLM · RAG

0 likes · 27 min read

Tencent Cloud Developer

Mar 19, 2025 · Artificial Intelligence

Inside Tencent Hunyuan Turbo S: Speed, Cost, and Hybrid Mamba Transformer Explained

Tencent's new Hunyuan Turbo S model offers a 44% faster response time and dramatically lower token costs, and it uses a hybrid Mamba‑Transformer architecture that merges linear attention with full attention. The article also covers fast‑thinking versus slow‑thinking LLM designs, MoE scaling laws, low‑precision training effects, and long‑short chain fusion techniques.

AI · Architecture · HybridMamba · LLM

0 likes · 14 min read

Alibaba Cloud Infrastructure

Mar 18, 2025 · Cloud Native

Gray Release of LoRA and Base Models Using ACK Gateway with AI Extension on Kubernetes

This guide explains how to deploy large language model inference services on a GPU-enabled Kubernetes cluster, configure ACK Gateway with AI Extension for intelligent routing and load balancing, and perform gray releases for both LoRA fine‑tuned models and base models such as QwQ‑32B and DeepSeek‑R1, including step‑by‑step commands and validation procedures.

ACK Gateway · AI inference · Kubernetes

0 likes · 25 min read

JD Tech Talk

Mar 18, 2025 · Artificial Intelligence

Generative Recommendation for CPS Advertising: Intent Sensing, Multi‑Objective Optimization, and the One4All Framework

This article surveys recent advances in generative recommendation for CPS advertising, detailing explicit intent‑aware controllable product recommendation, multi‑objective optimization techniques based on reward‑in‑context and DPO, and the scalable One4All framework that unifies behavior and language modeling across diverse ad scenarios.

CPS advertising · Generative Recommendation · LLM

0 likes · 14 min read

JD Cloud Developers

Mar 18, 2025 · Artificial Intelligence

How Generative LLMs Are Transforming CPS Advertising Recommendations

Since large language models have excelled in NLP, researchers are now enhancing CPS advertising recommendation systems by integrating generative LLMs for explicit intent perception, multi‑objective optimization, and a unified One4All framework, achieving significant offline and online performance gains across click‑through, conversion, and revenue metrics.

CPS advertising · Generative Recommendation · LLM

0 likes · 19 min read

AI Large Model Application Practice

Mar 18, 2025 · Artificial Intelligence

Master OpenAI’s New Agents SDK: 10 Core Concepts with a Complete Example

This guide walks you through OpenAI's open‑source Agents SDK, explaining ten essential concepts—from model configuration and agent creation to runners, tools, context handling, guardrails, handoffs, structured output, tracing, and orchestration—while providing runnable Python code and visual demos.

LLM · OpenAI Agents · Python

0 likes · 17 min read

AI Algorithm Path

Mar 17, 2025 · Artificial Intelligence

Agentic AI vs Generative AI: Key Differences and Comparative Analysis

The article defines Agentic AI as autonomous, goal‑directed systems that can act and learn from experience, contrasts it with Generative AI’s passive, single‑step content generation, and illustrates the practical advantage of agentic workflows with HumanEval benchmark results shared by Andrew Ng, in which a step‑wise approach outperforms zero‑shot prompting even for older models.

AI autonomy · Benchmark · Generative AI

0 likes · 10 min read

Infra Learning Club

Mar 17, 2025 · Artificial Intelligence

Testing OpenManus with DeepSeek: A Hands‑On Evaluation

The author walks through installing OpenManus, configuring it to use DeepSeek (and an Ollama‑based vision model), runs a sample financial data query, and reports that the system is slow, sometimes inaccurate, and still requires further optimization.

AI Agents · DeepSeek · LLM

0 likes · 5 min read

Ops Development & AI Practice

Mar 17, 2025 · Artificial Intelligence

Unlocking LLM Power: A Hands‑On Guide to Open WebUI

Open WebUI offers a user‑friendly, open‑source web interface that simplifies interaction with large language models, supporting multiple back‑ends, offline operation, and extensible plugins, making AI experimentation accessible for developers, researchers, and enthusiasts alike.

AI · LLM · Model Management

0 likes · 4 min read

Alibaba Cloud Native

Mar 17, 2025 · Cloud Native

How to Deploy DeepSeek as an Enterprise AI Assistant on DingTalk Using Alibaba Cloud

This guide walks you through deploying the DeepSeek large‑language model on Alibaba Cloud PAI, integrating it with DingTalk via the Magic Wand AI platform, and configuring multi‑model routing, authentication, rate limiting, content safety, caching, web‑search, and observability using the Cloud Native API Gateway.

AI · Alibaba Cloud · DingTalk

0 likes · 15 min read

Alibaba Cloud Infrastructure

Mar 17, 2025 · Cloud Native

Boost LLM Inference with ACK Gateway AI Extension: A Step‑by‑Step Guide

This guide demonstrates how to deploy the QwQ‑32B large language model on an Alibaba Cloud ACK cluster, configure OSS storage, enable the ACK Gateway with AI Extension, set up InferencePool and InferenceModel resources, and benchmark intelligent routing versus standard gateway routing, revealing latency and throughput improvements.

ACK Gateway · AI Extension · Kubernetes

0 likes · 16 min read

Cognitive Technology Team

Mar 17, 2025 · Artificial Intelligence

Leveraging Large Language Models to Optimize Traditional Machine Learning Pipelines

Large language models can assist and enhance each stage of traditional machine learning—including sample generation, data cleaning, feature engineering, model selection, hyper‑parameter tuning, and workflow automation—by generating synthetic data, refining features, selecting models, and orchestrating pipelines, though challenges such as bias, privacy, and noise remain.

Data Generation · Feature Engineering · LLM

0 likes · 11 min read

Spring Full-Stack Practical Cases

Mar 17, 2025 · Backend Development

Generate SQL with Spring AI: LLM‑Powered Queries in Spring Boot 3

This article demonstrates how to use Spring AI with a large language model to automatically generate and execute SELECT SQL statements in a Spring Boot 3 application, covering dependency setup, configuration files, prompt templates, controller implementation, and testing with example scripts.

LLM · SQL Generation · Spring AI

0 likes · 9 min read

Ops Development & AI Practice

Mar 16, 2025 · Artificial Intelligence

How Function Calling Helps LLMs Overcome Hallucinations

This article explains how LLM function calling works, from defining external functions to processing API responses, and demonstrates a Python example using OpenAI's GPT‑4o to fetch real‑time weather, showing how the technique mitigates hallucinations and expands practical AI applications.

AI · Function Calling · Hallucination Mitigation

0 likes · 8 min read

Architect

Mar 15, 2025 · Artificial Intelligence

Why Building Your Own RAG System Is a Costly Mistake

The article explains that developing a custom Retrieval‑Augmented Generation (RAG) solution incurs hidden infrastructure, personnel, and security costs, leads to operational overload and budget overruns, and is rarely justified compared to purchasing a proven vendor solution.

AI · LLM · RAG

0 likes · 11 min read

AI Algorithm Path

Mar 15, 2025 · Artificial Intelligence

Why the Industry Is Shifting From AI Agents to Agentic Workflows

The article explains that low accuracy and security risks of current AI agents—evidenced by a Claude AI Agent achieving only 14% of human performance and an average success rate of about 20%—are driving a move toward agentic workflows, which offer observable, auditable, and data‑synthesizing pipelines that dramatically improve enterprise productivity.

AI Agents · LLM · Observability

0 likes · 7 min read

DataFunSummit

Mar 14, 2025 · Artificial Intelligence

Insights from Zhihu's ZhiLight Large‑Model Inference Framework: Architecture, Parallelism, and Performance Optimizations

The article summarizes Zhihu's machine‑learning platform lead Wang Xin's presentation on the ZhiLight large‑model inference framework, covering model execution mechanisms, GPU workload analysis, pipeline and tensor parallelism, GPU architecture evolution, open‑source engine comparisons, ZhiLight's compute‑communication overlap and quantization optimizations, benchmark results, supported models, and future directions.

GPU · LLM · Open‑source

0 likes · 13 min read

AI Large Model Application Practice

Mar 14, 2025 · Artificial Intelligence

Why Softmax Is the Secret Behind LLM Probabilities and Creative Generation

This article explains how the Softmax function converts raw neural‑network scores into a proper probability distribution, why this conversion is essential for training and inference in large language models, and how the temperature parameter shapes the model's creativity and diversity.

LLM · Language Models · Softmax

0 likes · 9 min read

Baidu Geek Talk

Mar 12, 2025 · Artificial Intelligence

How LLMs Are Revolutionizing Semantic Embeddings: Models, Methods, and Trends

This article reviews how large language models (LLMs) enhance semantic text embeddings by comparing traditional methods with LLM‑based approaches, detailing synthetic data generation, backbone model designs, key model families, experimental results on the MTEB benchmark, and future research challenges.

LLM · contrastive learning · model comparison

0 likes · 30 min read

Alibaba Cloud Developer

Mar 12, 2025 · Artificial Intelligence

Deploy Alibaba Cloud’s QwQ-32B LLM: Benchmarks, Agent Features, and One‑Click Setup

This guide introduces Alibaba Cloud’s open‑source QwQ-32B large language model, highlights its superior benchmark performance over competing models, explains its integrated agent capabilities, and provides step‑by‑step instructions for one‑click deployment via the PAI‑Model Gallery.

Alibaba Cloud · LLM · Model Deployment

0 likes · 7 min read

DaTaobao Tech

Mar 12, 2025 · Artificial Intelligence

Multimodal Automatic Layout Generation for E-commerce

The project develops a multimodal automatic layout generation system for e‑commerce by fine‑tuning the qwen‑vl‑7b vision‑language model with LoRA on poster and Taobao image‑layout data. It employs diffusion‑based image generation and coordinate‑prediction methods to produce structured layouts that power poster, marketing image, and video‑cover creation with over 90% adoption, and it explores multi‑image, style‑aware, and iterative refinement extensions.

LLM · diffusion · e‑commerce

0 likes · 12 min read

Cognitive Technology Team

Mar 11, 2025 · Artificial Intelligence

Deploying DeepSeek R1:7b Model Locally with Ollama and Building AI Applications Using Dify

This tutorial explains how to set up Ollama for CPU or GPU environments, run the DeepSeek R1:7b large language model, and use the open‑source Dify platform to create and deploy a custom AI application, providing step‑by‑step commands and configuration details.

AI · DeepSeek · Dify

0 likes · 8 min read

NewBeeNLP

Mar 11, 2025 · Artificial Intelligence

How DeepSeek’s New Architecture Redefines LLM Efficiency and Performance

This article analyzes DeepSeek’s recent breakthroughs—including the Multi‑Head Latent Attention (MLA), Group Relative Policy Optimization (GRPO), and a refined Mixture‑of‑Experts design—along with its three‑stage training pipeline, RL‑only R1‑Zero variant, and benchmark comparisons against GPT‑4o‑Mini and Llama 3.1, highlighting both gains and remaining challenges.

DeepSeek · LLM · Mixture of Experts

0 likes · 18 min read

Tencent Cloud Developer

Mar 11, 2025 · Artificial Intelligence

Fine‑Tuning Local LLaMA‑Factory Models and Building Networked AI Applications

The article walks through preparing a GPU‑enabled environment, downloading and LoRA‑fine‑tuning a DeepSeek model with LLaMA‑Factory, merging the adapter, then wrapping the model in a web UI that queries a ChromaDB vector store via crawled web data, illustrating security‑focused use cases and forecasting domain‑specific LLM adoption.

AI · LLM · LLaMA-Factory

0 likes · 17 min read

Architect

Mar 10, 2025 · Artificial Intelligence

What Makes DeepSeek’s New Architecture a Game‑Changer? Inside MLA, GRPO, and MoE Innovations

This article analyzes DeepSeek’s latest large‑model breakthroughs, covering the MLA attention compression, GRPO alignment algorithm, MoE load‑balancing redesign, multi‑stage training pipelines, reinforcement‑learning tricks, and performance comparisons with GPT‑4o‑Mini and Llama 3.1, highlighting both strengths and remaining challenges.

AI training · DeepSeek · GRPO

0 likes · 19 min read

AI Algorithm Path

Mar 10, 2025 · Artificial Intelligence

How Much GPU Memory Does an LLM Service Really Need?

This article explains a simple formula for estimating the GPU VRAM required to serve large language models, demonstrates the calculation with a 7‑billion‑parameter example, clarifies why a 20% safety buffer is needed, and offers practical strategies such as quantization, CPU offload, and multi‑GPU parallelism to reduce memory usage.

GPU Memory · LLM · Model Quantization

0 likes · 6 min read

Tencent Technical Engineering

Mar 10, 2025 · Artificial Intelligence

How Non‑AI Developers Can Build LLM Apps: Prompt Engineering, RAG, and Function Calling Explained

This guide shows non‑AI developers how to create large‑model applications by mastering prompt engineering, multi‑turn interactions, Retrieval‑Augmented Generation, function calling, and AI‑Agent integration, with practical code examples, tool design patterns, and deployment tips.

AI Agent · Embedding · Function Calling

0 likes · 48 min read

Baobao Algorithm Notes

Mar 10, 2025 · Artificial Intelligence

Why DeepSeek V3’s FP8 Training Beats Traditional Schemes: A Deep Dive

This article provides a detailed technical analysis of FP8 training, comparing Nvidia’s TransformerEngine approach with DeepSeek V3’s novel scheme, and examines how block‑wise scaling, high‑precision accumulation, and vector length and correlation affect quantization error and signal‑to‑noise ratio in large‑language‑model training.

DeepSeek · FP8 · LLM

0 likes · 20 min read

phodal

Mar 10, 2025 · Artificial Intelligence

How AutoDev Bridge Uses LLMs to Accelerate Legacy System Migration

AutoDev Bridge combines large‑model reasoning, C4 architecture analysis, AST‑based business logic extraction, and IDE‑integrated tooling to automate the migration of legacy systems, reducing manual effort and migration risk while highlighting the unique advantages of modern AI agents.

AI · Code Translation · LLM

0 likes · 7 min read

Java Architecture Diary

Mar 10, 2025 · Artificial Intelligence

Simplify Java AI Integration with Spring AI Custom Annotations and AI Services

AI Services, inspired by Spring Data JPA and Retrofit, offers a declarative Java API that abstracts LLM interactions, supporting input formatting, output parsing, chat memory, function calling, and RAG, with detailed examples using LangChain4j, custom Spring AI annotations, AOP aspects, and controller integration.

AI services · AOP · LLM

0 likes · 7 min read

DevOps

Mar 9, 2025 · Artificial Intelligence

A Beginner's Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling, and AI Agents

This article provides a comprehensive introduction to developing large language model (LLM) applications, covering prompt engineering, zero‑ and few‑shot techniques, function calling, retrieval‑augmented generation (RAG) with embedding and vector databases, code assistants, and the MCP protocol for building AI agents, all aimed at non‑AI specialists.

AI Agent · Embedding · Function Calling

0 likes · 48 min read

Architects' Tech Alliance

Mar 9, 2025 · Industry Insights

How DeepSeek’s LLMs Slash Training Costs and Reshape China’s Compute Landscape

DeepSeek’s three‑model LLM lineup—V3, R1‑Zero and R1—delivers high performance while cutting training expenses to under $600k, a fraction of the $0.6‑1 billion typical for comparable models, signaling a major shift in China’s AI compute demand and supply chain dynamics.

AI compute · China · DeepSeek

0 likes · 3 min read

Alibaba Cloud Infrastructure

Mar 9, 2025 · Cloud Computing

Deploy QwQ-32B LLM Inference on Alibaba Cloud ACS with vLLM: Step‑by‑Step Guide

This guide walks you through using Alibaba Cloud Container Compute Service (ACS) to provision GPU resources, prepare the QwQ-32B model, configure persistent storage, deploy the model with vLLM, set up OpenWebUI, verify the service, and optionally benchmark its performance, all with detailed commands and YAML examples.

ACS · Alibaba Cloud · Benchmark

0 likes · 17 min read

AI Frontier Lectures

Mar 9, 2025 · Industry Insights

Why the Model Is Becoming the Product: AI Market Trends and Risks

The article argues that AI models are evolving into standalone products, examines scaling limits, integration challenges, reinforcement‑learning economics, and investment dynamics, and warns that reliance on large‑lab APIs may jeopardize future profitability for integrators.

AI · IndustryInsights · LLM

0 likes · 15 min read

Alibaba Cloud Infrastructure

Mar 8, 2025 · Artificial Intelligence

Deploying QwQ-32B LLM with vLLM on Alibaba Cloud ACK and Configuring Intelligent Routing

This guide explains how to deploy the QwQ-32B large language model using vLLM on an Alibaba Cloud ACK Kubernetes cluster, configure storage, set up OpenWebUI, enable ACK Gateway with AI Extension for intelligent routing, and benchmark the inference service performance.

ACK · Benchmark · Kubernetes

0 likes · 17 min read

AI Product Manager Community

Mar 8, 2025 · Artificial Intelligence

Deploy OpenManus Locally and Let It Generate a Complete WeChat Mini‑Program

This article walks through installing OpenManus locally using Python 3.12, cloning its GitHub repository, configuring DeepSeek LLM credentials, launching the service, and prompting the agent to generate a full WeChat mini‑program, while sharing observations on performance, token cost, and limitations.

AI Agent · DeepSeek · LLM

0 likes · 5 min read

Qunhe Technology Quality Tech

Mar 7, 2025 · Artificial Intelligence

How AI is Revolutionizing Software Testing: 2025 Roadmap and Real-World Successes

The Qunhe Technology Quality team outlines a 2025 strategy that leverages advanced AI models, a user-friendly AI testing platform, and AI‑driven automation to boost test efficiency, streamline workflows, and promote AI adoption across the testing organization.

AI · Efficiency · LLM

0 likes · 14 min read

Alibaba Cloud Big Data AI Platform

Mar 7, 2025 · Artificial Intelligence

How QwQ-32B Outperforms OpenAI o1-mini and Deploys in One Click on Alibaba Cloud

Alibaba Cloud's newly released QwQ-32B model delivers benchmark‑level performance rivaling top open‑source LLMs, integrates agent capabilities, and can be deployed with a single click through the PAI‑Model Gallery, offering a cost‑effective solution for developers seeking advanced AI inference.

AI Benchmark · Alibaba Cloud · LLM

0 likes · 5 min read

dbaplus Community

Mar 7, 2025 · Artificial Intelligence

Master Prompt Engineering: Frameworks, Strategies, and Real‑World Examples for Large Language Models

This comprehensive guide explains what prompts are, outlines essential prompt components and multiple engineering frameworks, presents practical strategies for crafting clear and structured prompts, addresses model limitations such as hallucinations, and showcases a wide range of advanced prompting techniques with code examples.

AI · LLM · Prompt Engineering

0 likes · 29 min read

DevOps

Mar 6, 2025 · Artificial Intelligence

Building Multi-Model Chat Agents with Dify: Integrating DeepSeek‑R1 and Gemini

This article explains how to create a high‑performance multi‑model chat agent on the Dify platform by combining DeepSeek‑R1 for reasoning and Gemini for answer generation, covering the underlying principles, configuration steps, API integration, performance benchmarks, and practical deployment guidance.

API integration · Chatbot · DeepSeek

0 likes · 12 min read

Cognitive Technology Team

Mar 5, 2025 · Artificial Intelligence

Comparative Analysis of Java AI Frameworks: LangChain4j, Spring AI, and Agent-Flex

This article examines three leading Java AI frameworks—LangChain4j, Spring AI, and Agent-Flex—by comparing their architectures, core capabilities, and ideal use‑cases, helping developers choose the most suitable solution for enterprise, domestic, or rapid‑prototype projects.

AI · Agent-Flex · LLM

0 likes · 5 min read

Cognitive Technology Team

Mar 4, 2025 · Artificial Intelligence

Deep Searcher: An Open‑Source Agentic RAG Framework for Enterprise‑Level Search and Knowledge Retrieval

The article introduces Deep Searcher, an open‑source Agentic Retrieval‑Augmented Generation system that combines large language models, Milvus vector databases, and multi‑step reasoning to deliver enterprise‑grade search, reporting, and complex query capabilities, and compares its performance against traditional RAG and Graph RAG approaches.

LLM · Open Source · RAG

0 likes · 18 min read

AI Algorithm Path

Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM

0 likes · 9 min read

Alibaba Cloud Developer

Mar 4, 2025 · Artificial Intelligence

Build a Smart Knowledge Base with DeepSeek R1 and Alibaba Cloud Low‑Code

This tutorial guides you through creating an AI‑powered, customizable knowledge space by integrating DeepSeek R1 via Alibaba Cloud Bailian's Model‑as‑a‑Service with the low‑code Mobinext platform, covering setup, configuration, deployment, and future expansion for multi‑tenant use.

AI · Alibaba Cloud · DeepSeek

0 likes · 12 min read

Tencent Cloud Developer

Mar 4, 2025 · Artificial Intelligence

A Practical Guide to Building Large Language Model Applications: Prompt Engineering, Retrieval‑Augmented Generation, Function Calling and AI Agents

The guide teaches non‑AI developers how to build practical LLM‑powered applications by mastering prompt engineering, function calling, retrieval‑augmented generation, and AI agents, and introduces the Model Context Protocol for seamless tool integration, offering a clear learning path to leverage large language models without deep theory.

AI Agent · Function Calling · LLM

0 likes · 48 min read

Architect

Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use‑cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods—including inference‑time scaling, pure RL, SFT + RL, and distillation—and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling

0 likes · 27 min read

Code Mala Tang

Mar 3, 2025 · Artificial Intelligence

Unlock AI’s Full Potential with Structured Prompt Decorators

Prompt Decorators are structured prefixes that standardize and enhance AI responses, addressing common challenges like vague prompts, inconsistent answers, and lack of reasoning by guiding the model to produce clear, logical, and well‑organized outputs across various use cases.

AI · LLM · Prompt Engineering

0 likes · 23 min read

Fighter's World

Mar 3, 2025 · Artificial Intelligence

How OpenAI’s Deep Research Is Sparking a Wave of LLM‑Powered Search Experiments

The article explains what Deep Research agents are, walks through a concrete example of investigating the $6 million training cost controversy of DeepSeek V3, details the multi‑step plan‑edit‑execute workflow, and discusses broader implications for AI efficiency, market dynamics, and product design.

AI Agents · LLM · cost efficiency

0 likes · 10 min read

AI Large Model Application Practice

Mar 3, 2025 · Artificial Intelligence

Can DeepSeek‑R1 Unlock True “Deep Thinking” for Enterprise RAG?

This article examines how swapping in DeepSeek‑R1 enhances Retrieval‑Augmented Generation with deeper reasoning, outlines its benefits and pitfalls—including slower inference, higher compute costs, and hallucinations—provides a simple hallucination test, and proposes an Agentic RAG research assistant to balance accuracy and creativity.

AI reasoning · DeepSeek · LLM

0 likes · 10 min read

Java Architect Essentials

Mar 2, 2025 · Artificial Intelligence

Zero‑Code Local Deployment of DeepSeek LLM on Consumer GPUs Using Ollama

This guide explains why DeepSeek is a compelling GPT‑4‑level alternative, provides hardware recommendations for various model sizes, and walks through a three‑step Windows deployment using Ollama, including installation, environment configuration, model download, performance tuning, and common troubleshooting tips.

AI · DeepSeek · GPU

0 likes · 8 min read

JD Retail Technology

Mar 1, 2025 · Industry Insights

How JD Retail’s AI Assistant Uses Multimodal LLMs to Boost E‑Commerce

JD Retail’s AI assistant combines a Master‑Sub agent framework, ReAct paradigm, multimodal integration and MoE architecture to improve sales forecasting, pricing, and recommendation accuracy, while the team’s collaborative culture and open talent pathways illustrate how cutting‑edge AI is applied in real‑world e‑commerce.

AI · JD Retail · LLM

0 likes · 8 min read

Code Mala Tang

Mar 1, 2025 · Artificial Intelligence

Why Do Large Language Models Hallucinate and How Can We Fix It?

This article explains why large language models produce plausible‑looking but false information, traces the problem to the supervised fine‑tuning stage, and outlines mitigation techniques such as knowledge interrogation, RLHF, and tool‑augmented search to reduce hallucinations.

LLM · RLHF · hallucination

0 likes · 12 min read

AntTech

Mar 1, 2025 · Artificial Intelligence

ScaleOT: Privacy‑Utility‑Scalable Offsite‑Tuning with Dynamic LayerReplace and Selective Rank Compression

The ScaleOT framework introduces a privacy‑preserving offsite‑tuning pipeline for large language models that combines importance‑aware dynamic layer replacement with selective rank compression, enabling flexible model compression, near‑lossless fine‑tuning, and strong privacy guarantees across diverse downstream tasks.

Adapter · LLM · Model Compression

0 likes · 16 min read

Cognitive Technology Team

Feb 28, 2025 · Artificial Intelligence

Design and High‑Availability Architecture of Alibaba LangEngine AI Application Framework

This article introduces Alibaba's LangEngine, a pure Java AI application framework, detailing its high‑availability gateway architecture, communication protocols, streaming and non‑streaming output, multi‑level metadata caching, asynchronous and serverless designs, and future open‑source roadmap, offering practical guidance for building robust AI services.

AI Framework · LLM · LangEngine

0 likes · 11 min read

AI Large Model Application Practice

Feb 28, 2025 · Artificial Intelligence

How Self-Attention Powers LLMs: A Step‑by‑Step Deep Dive

This article explains the self‑attention mechanism behind large language models, detailing why static word importance fails, how queries, keys, and values are generated, how attention scores are computed, scaled, softmaxed, and used to produce context‑aware word vectors, while noting computational costs.

AI · LLM · Self-Attention

0 likes · 9 min read

JavaEdge

Feb 27, 2025 · Artificial Intelligence

How to Quickly Build a DeepSeek‑Powered Knowledge Base on Tencent Cloud

This guide walks through deploying the full‑feature DeepSeek V3+R1 model on Tencent Cloud, configuring a smart knowledge‑base application, importing documentation, enabling internet search, tuning retrieval parameters, and publishing the app for public use, all without writing code.

AI · DeepSeek · Knowledge Base

0 likes · 6 min read

AI Algorithm Path

Feb 26, 2025 · Artificial Intelligence

Anthropic Unveils Claude 3.7 Sonnet: The World’s First Hybrid Reasoning Model

Anthropic’s Claude 3.7 Sonnet introduces a hybrid reasoning LLM with an extended thinking mode, output of up to 128K tokens, improved coding abilities, lower refusal rates, and strong benchmark results, while being accessible via web, mobile apps and API under tiered pricing.

AI coding · Anthropic · Claude 3.7 Sonnet

0 likes · 10 min read

Baidu Tech Salon

Feb 26, 2025 · Artificial Intelligence

Graph‑Engine‑Driven Workflow for Building Intelligent Agents

The article presents a graph‑engine‑driven workflow platform that lets developers assemble, orchestrate, and execute intelligent LLM‑based agents with low‑code visual design, fine‑grained path control, hierarchical sub‑flows, and event‑driven hooks, addressing perception, reasoning, planning, and scalability challenges while surpassing existing frameworks.

Data Decoupling · Intelligent agents · LLM

0 likes · 19 min read

Alibaba Cloud Big Data AI Platform

Feb 25, 2025 · Artificial Intelligence

Build a RAG‑Powered Smart Q&A Assistant with Milvus, DeepSeek, and PAI LangStudio

This step‑by‑step guide shows how to assemble a Retrieval‑Augmented Generation (RAG) system using Alibaba Cloud Milvus vector search, the DeepSeek large language model, and PAI LangStudio, covering instance creation, data upload, model deployment, connection setup, flow design, and service invocation.

AI Tutorial · DeepSeek · LLM

0 likes · 9 min read

Alibaba Cloud Big Data AI Platform

Feb 25, 2025 · Artificial Intelligence

How DistilQwen2.5 Boosts LLM Efficiency with Dual‑Stage Knowledge Distillation

This article introduces DistilQwen2.5, a lightweight LLM series built on Qwen2.5 that uses a novel two‑layer distillation framework, instruction‑data optimization, and parameter‑fusion techniques to achieve higher performance while drastically reducing computational cost and deployment overhead.

Efficient Inference · Knowledge Distillation · LLM

0 likes · 26 min read

DataFunSummit

Feb 25, 2025 · Artificial Intelligence

Collecting High-Quality LLM Training Data and Custom Model Training Guide

This article explains what constitutes high‑quality LLM training data, why large datasets are essential, outlines the step‑by‑step process for collecting, preprocessing, and fine‑tuning models, and highlights the best data sources—including web content, books, code repositories, and news—while noting available free datasets.

AI · Data Collection · LLM

0 likes · 9 min read

Code Mala Tang

Feb 25, 2025 · Artificial Intelligence

How Resources, Tools, and Prompts Power LLM Super‑Agents

This article explains how the Resources data hub, Tools capability engine, and Prompts interaction templates work together to create a secure, extensible workflow that enables large language models to ingest data, execute tasks, and generate structured outputs.

AI workflow · Artificial Intelligence · LLM

0 likes · 5 min read

CSS Magic

Feb 25, 2025 · Artificial Intelligence

Two Simple Ways to Access DeepSeek API for Free

This guide shows how to obtain free DeepSeek API access through GitHub Models and SiliconFlow, detailing the required API base URL, key, and model name, how to register, create keys, verify usage with a web chat tool, and compare model choices and platform limits.

API · DeepSeek · Free access

0 likes · 7 min read

Alibaba Cloud Native

Feb 24, 2025 · Cloud Native

Build a Real‑Time AI Search‑Enabled Q&A System with Higress and DeepSeek

This guide shows how open‑source LLMs like DeepSeek can power cost‑effective intelligent Q&A services, and how the cloud‑native Higress API gateway adds real‑time web search, routing, security, and observability to create a production‑grade solution in just a few steps.

DeepSeek · Higress · LLM

0 likes · 6 min read

Baidu Geek Talk

Feb 24, 2025 · Artificial Intelligence

Using a Graph Engine to Drive Workflow for Intelligent Agents

By leveraging mature graph‑engine technology, the article shows how visual, low‑code workflow orchestration can give intelligent LLM‑based agents fine‑grained path control, reusable functions, hierarchical sub‑flows, and robust error handling, turning complex business tasks into modular, scalable processes adopted by hundreds of thousands of developers.

AI Agents · LLM · Low‑code

0 likes · 18 min read

Alibaba Cloud Observability

Feb 24, 2025 · Backend Development

Build a Cloud‑Native AI Chatbot with Spring AI Alibaba and ARMS Observability

This tutorial walks you through creating a Java‑based AI chat agent using Spring AI Alibaba, integrating Alibaba Cloud's large language model, adding function‑calling for weather queries, and enabling full observability with ARMS in a cloud‑native deployment.

ARMS · LLM · Spring AI

0 likes · 10 min read

Selected Java Interview Questions

Feb 24, 2025 · Artificial Intelligence

Deploying Ollama on Windows and Linux and Integrating with Spring Boot

This guide explains how to download, install, and configure Ollama on Windows and Linux, set up environment variables, select a DeepSeek model, and call the Ollama API from a Spring Boot application with example code snippets.

API · DeepSeek · LLM

0 likes · 6 min read

Architecture Digest

Feb 24, 2025 · Artificial Intelligence

MoBA: Mixture of Block Attention for Long‑Context Large Language Models

The article introduces MoBA, a Mixture‑of‑Block‑Attention mechanism that applies Mixture‑of‑Experts principles to transformer attention, enabling efficient long‑context processing for large language models while maintaining performance comparable to full attention through sparse, trainable block selection and seamless switching.

LLM · Mixture of Experts · MoBA

0 likes · 12 min read

AI Large Model Application Practice

Feb 24, 2025 · Artificial Intelligence

How Web Agents Combine LLMs and Browser Automation to Perform Real‑World Tasks

This article explains what Web Agents are, their ReAct‑style reasoning loop, key implementation technologies such as observation parsing, multimodal models, and browser control tools like Selenium and Playwright, and demonstrates building a DeepSeek‑powered Web Agent with the Browser‑use framework, including code samples and performance insights.

DeepSeek · LLM · Playwright

0 likes · 11 min read

Java Architecture Diary

Feb 24, 2025 · Artificial Intelligence

Run Large Language Models Directly in Java with Jlama – Quick Start Guide

This article introduces Jlama, an open‑source Java LLM inference engine, outlines its key features, provides step‑by‑step CLI and Maven integration instructions, shows code examples, run logs, and special setup notes for using large language models efficiently within Java applications.

AI · Jlama · LLM

0 likes · 6 min read

Alibaba Cloud Developer

Feb 24, 2025 · Artificial Intelligence

How to Build a Local Chatbot with Web Search Using DeepSeek, Ollama, and Dify

Learn how to create a locally hosted chatbot powered by DeepSeek R1 32B, using Ollama and Docker, integrate Dify for model management, and add web‑search capability through SearXNG, covering environment setup, search logic, content extraction, testing, and optimization tips.

Chatbot · DeepSeek · Dify

0 likes · 10 min read

Java Web Project

Feb 23, 2025 · Artificial Intelligence

Build Your First AI Chatbot with Spring Boot and DeepSeek LLM

This guide walks you through creating a Spring Boot project, configuring DeepSeek's large language model via SiliconFlow, setting up OpenAI‑compatible parameters, and implementing a REST controller that returns weather forecasts using the model, complete with step‑by‑step code snippets, configuration files, and deployment instructions.

AI · Chatbot · DeepSeek

0 likes · 7 min read

Ma Wei Says

Feb 23, 2025 · Artificial Intelligence

How Microsoft’s PIKE‑RAG Builds Knowledge‑Driven AI Across Four Stages

The article explains Microsoft’s open‑source PIKE‑RAG system, detailing its four progressive stages—from knowledge‑base construction to creative multi‑agent reasoning—while describing the underlying modules, chunking strategies, multi‑granularity retrieval, and code snippets that enable specialized domain understanding and inference.

AI Retrieval · Knowledge Graph · LLM

0 likes · 11 min read

Architecture and Beyond

Feb 22, 2025 · Artificial Intelligence

Understanding Retrieval‑Augmented Generation (RAG) and Its Role in Enhancing Large Language Models

The article explains how the inherent knowledge‑staleness, hallucination, lack of private data, non‑traceable output, limited long‑text handling, and data‑security concerns of large language models can be mitigated by Retrieval‑Augmented Generation, which combines external retrieval, augmentation, and generation to provide up‑to‑date, reliable, and secure AI responses.

AI · Knowledge augmentation · LLM

0 likes · 15 min read

Infra Learning Club

Feb 21, 2025 · Artificial Intelligence

5 Must‑Try Open‑Source AI Projects You Can Start Using Today

This article introduces five open‑source AI tools—a PPT generator, an LLM app development platform, a cloud‑agnostic AI runner, a curated collection of LLM applications, and a one‑click HD video creator—detailing their key features, usage links, and sample configurations.

AI · Dify · LLM

0 likes · 8 min read

Ma Wei Says

Feb 21, 2025 · Artificial Intelligence

How PIKE‑RAG Boosts Retrieval‑Augmented Generation for Industrial AI

PIKE‑RAG, a Retrieval‑Augmented Generation framework from Microsoft Research, tackles knowledge source diversity, one‑size‑fits‑all limitations, and LLMs' lack of domain expertise by building multi‑layer heterogeneous graphs, task‑driven modular pipelines, and a staged L0‑L4 system for more accurate industrial AI responses.

AI · KnowledgeGraph · LLM

0 likes · 5 min read

Architect

Feb 20, 2025 · Artificial Intelligence

Why Long CoT and In‑Context RL Are the Next Frontier for LLMs

The article analyses recent breakthroughs such as OpenAI's o1, Long CoT, and test‑time search, arguing that enabling LLMs to perform self‑critique and reinforcement learning with long output sequences is essential for future AI performance, while warning against overly structured workflows.

AI research · In‑Context RL · LLM

0 likes · 12 min read

Alibaba Cloud Infrastructure

Feb 20, 2025 · Artificial Intelligence

Deploying DeepSeek‑R1 Large Language Model on Knative with GPU A10

This guide explains how to deploy the DeepSeek‑R1 large language model on a Knative platform using an A10 GPU, covering preparation, service creation with appropriate annotations, YAML configuration, verification via curl, custom domain setup, and optional personal AI assistant deployment.

AI · DeepSeek · GPU

0 likes · 8 min read

JD Tech Talk

Feb 20, 2025 · Artificial Intelligence

Multi‑Agent Architecture for an E‑Commerce Business Assistant: Design, Planning, Evaluation, and Sample Generation

The document describes the evolution, design principles, key technologies, online inference workflow, evaluation methods, and sample‑generation techniques of a large‑language‑model‑based multi‑agent system that powers a 24/7 e‑commerce merchant assistant, highlighting its benefits, challenges, and future work.

AI Planning · LLM · Reward Model

0 likes · 21 min read

JD Cloud Developers

Feb 20, 2025 · Artificial Intelligence

How Multi‑Agent ReAct Architecture Boosts E‑Commerce AI Assistants

This article explains the evolution of multi‑agent systems for e‑commerce assistants, detailing the ReAct‑based planning framework, hierarchical master‑sub agent collaboration, evaluation methods, and sample‑generation techniques that together improve accuracy, efficiency, and scalability of AI‑driven merchant services.

AI Planning · Agent Architecture · LLM

0 likes · 23 min read

Alibaba Cloud Developer

Feb 20, 2025 · Artificial Intelligence

How LLMs Power Real-Time Interactive 3D Worlds in Unreal Engine

This article explains how large language models are integrated with Unreal Engine to enable natural‑language‑driven 3D model search, manipulation, and scene understanding, detailing metadata extraction, vision‑language labeling, RAG‑based retrieval, and function‑call translation for interactive virtual environments.

3D interaction · LLM · RAG

0 likes · 21 min read

Architects' Tech Alliance

Feb 18, 2025 · Artificial Intelligence

How to Distill DeepSeek LLMs into Lightweight Models for Local Deployment

This article explains DeepSeek's knowledge‑distillation approach for compressing large language models into small, efficient student models, details step‑by‑step local deployment requirements, performance optimizations, and highlights the cost, privacy, and application benefits of running the distilled model on‑premise.

AI inference · DeepSeek · Knowledge Distillation

0 likes · 10 min read

Java Backend Technology

Feb 18, 2025 · Artificial Intelligence

Boost Java AI Apps with DeepSeek4j: Full Chain Support & Reactive Streaming

DeepSeek4j 1.4 brings a Java‑native integration framework that fully preserves DeepSeek's chain‑of‑thought capabilities, adds reactive streaming, and offers a one‑line Spring Boot starter, enabling developers to quickly embed the model with simple configuration and rich debugging tools.

AI Integration · DeepSeek · LLM

0 likes · 5 min read

Big Data Tech Team

Feb 18, 2025 · Artificial Intelligence

How DeepSeek Trains and Optimizes Its LLMs: From Pre‑training to Reasoning Models

This article breaks down DeepSeek's LLM training pipeline, explaining the massive pre‑training phase, instruction fine‑tuning, reinforcement‑learning‑from‑human‑feedback, and the distinct roles of its V3 instruction model and R1 reasoning model, while also highlighting performance metrics and current limitations.

DeepSeek · LLM · RLHF

0 likes · 8 min read

Java One

Feb 17, 2025 · Artificial Intelligence

How to Get Free Access to DeepSeek R1 Across Major Cloud Platforms

This guide walks you through using DeepSeek R1 via the official website or popular third‑party cloud services, compares free token quotas, explains token accounting, and provides step‑by‑step instructions for configuring API access and AI clients such as Chatbox, Cherry Studio, and Dify.

AI client · API · DeepSeek

0 likes · 11 min read
