Tagged articles

675 articles

Mar 15, 2025 · Artificial Intelligence

What Makes Google’s New Gemma 3 Model a Game‑Changer for AI Developers?

Google’s Gemma 3, a lightweight open‑source model with up to 27 billion parameters, offers multimodal input, 128K token context, and broad language support, outperforming leading rivals on single‑GPU benchmarks and providing flexible deployment options for developers and researchers alike.

AI model · Gemma 3 · Google AI

0 likes · 9 min read

Alibaba Cloud Big Data AI Platform

Mar 12, 2025 · Artificial Intelligence

Deploy, Fine‑Tune, and Compress DistilQwen2.5 on Alibaba Cloud PAI – A Complete Guide

This article walks through the full workflow for using Alibaba Cloud's open‑source DistilQwen2.5 models on the PAI platform, covering environment setup, model deployment, fine‑tuning with SFT and DPO, evaluation, and model compression for resource‑constrained scenarios.

DistilQwen2.5 · Large Language Model · PAI

0 likes · 13 min read

Architects' Tech Alliance

Mar 10, 2025 · Industry Insights

How AI Agents Are Redefining the Future of Intelligent Computing

This article provides a comprehensive analysis of AI agents, covering their historical origins, three‑layer technology stack, market size forecasts, evolution from training to inference, interaction modes, core modules, and the full industry chain from infrastructure providers to downstream applications.

AI Agent · AI Market · Agent Architecture

0 likes · 13 min read

CSS Magic

Mar 10, 2025 · Artificial Intelligence

Three Advanced Ways to Harness DeepSeek for Everyone

The article outlines three practical approaches to get the most out of DeepSeek—using it as a conversational assistant, integrating its API to power AI tools such as the Chrome immersive‑translation plugin, and leveraging it for AI‑assisted programming—while comparing the V3 and R1 models and offering concrete configuration steps.

AI programming · AI translation · API integration

0 likes · 8 min read

Top Architect

Mar 9, 2025 · Artificial Intelligence

Alibaba Unveils Qwen QwQ-32B: A Compact Open‑Source LLM Rivaling DeepSeek

Alibaba has released the open‑source Qwen QwQ‑32B model, a 32‑billion‑parameter LLM that matches DeepSeek‑R1's performance while being deployable on consumer‑grade GPUs, and the announcement is accompanied by extensive promotional offers for AI‑related products and services.

AI Benchmark · Alibaba · Large Language Model

0 likes · 7 min read

ZhongAn Tech Team

Mar 8, 2025 · Artificial Intelligence

Weekly AI Rumors Issue 15: Manus AI Agent Launch, GPT‑4.5 Evaluation, and LightThinker Technique

This issue reviews the hype around China’s Manus AI Agent and its invitation‑code controversy, critiques OpenAI’s GPT‑4.5 performance versus DeepSeek, showcases industry solutions using AI agents, and introduces the LightThinker method for dynamically compressing LLM inference chains to boost efficiency.

AI Agent · AI Market · GPT-4.5

0 likes · 15 min read

Java Tech Enthusiast

Mar 8, 2025 · Artificial Intelligence

QwQ-32B Large Language Model Overview and Performance

Alibaba’s new QwQ‑32B large‑language model, with 32 billion parameters, delivers performance comparable to or surpassing the 671‑billion‑parameter DeepSeek‑R1 across math, coding, and general benchmarks, and is available via HuggingFace, ModelScope, and a DashScope API demo with example Python code.

AI Benchmark · Large Language Model · Python API

0 likes · 5 min read

AI Product Manager Community

Mar 7, 2025 · Artificial Intelligence

Function Calls vs ReAct: Core Concepts, Implementation, and Real‑World Use Cases

This article explains the technical principles behind Function Call and ReAct in large language models, provides code samples, compares their strengths and limitations, and illustrates each approach with practical scenarios such as smart customer service and financial analysis assistants.

AI Tool Use · Large Language Model · Prompt Engineering

0 likes · 9 min read

ByteDance Cloud Native

Mar 7, 2025 · Artificial Intelligence

How to Deploy the QwQ-32B Large Language Model on Volcengine Cloud in Minutes

This guide walks you through the end‑to‑end process of deploying the open‑source QwQ‑32B inference model on Volcengine's cloud platform, covering GPU ECS selection, VKE cluster creation, continuous delivery CP setup, vLLM service launch, and API gateway exposure.

GPU ECS · Large Language Model · QwQ-32B

0 likes · 8 min read

Java Architecture Diary

Mar 7, 2025 · Artificial Intelligence

Boost Inference Efficiency with QwQ-32B: Benchmarks, Resource Savings, and Java Integration

QwQ-32B, Alibaba’s new inference‑optimized large language model built on the Qwen2.5 architecture, outperforms DeepSeek‑R1 across math reasoning, code generation, and safety benchmarks while requiring only 24 GB of VRAM. The article provides detailed performance data, resource‑efficiency analysis, and step‑by‑step Java and Ollama integration instructions.

Function Calling · Inference Optimization · Java integration

0 likes · 7 min read

AI Product Manager Community

Mar 6, 2025 · Artificial Intelligence

Why Alibaba’s QwQ‑32B Rivals 670B Models with Just 32B Parameters

Alibaba’s newly released 32‑billion‑parameter QwQ‑32B model matches the performance of 670‑billion‑parameter rivals like DeepSeek‑R1, integrates agent‑based reasoning, runs on consumer hardware, and has sparked strong open‑source community adoption, as shown by benchmark results and download statistics.

Agent · Alibaba · Large Language Model

0 likes · 6 min read

Programmer DD

Mar 6, 2025 · Artificial Intelligence

Discover QwQ-32B: A 32B LLM Matching 671B DeepSeek‑R1 Performance

The QwQ-32B model, released by Alibaba Cloud, delivers DeepSeek‑R1‑level results with only 32 billion parameters, offers integrated agent capabilities, is open‑source under Apache 2.0, and can be quickly deployed locally via Ollama or integrated into Java applications using Spring AI.

AI inference · Large Language Model · Model Deployment

0 likes · 4 min read

Baobao Algorithm Notes

Mar 6, 2025 · Artificial Intelligence

Alibaba Unveils QwQ-32B: A 32‑Billion‑Parameter Inference Model with Agent Capabilities

Alibaba has open‑sourced its new QwQ‑32B inference model, a 32.5‑billion‑parameter transformer that rivals top models like DeepSeek‑R1 and o1‑mini, features integrated agent abilities for tool use and critical thinking, and offers a low inference barrier with extensive technical specifications and RL‑based training details.

Alibaba · Large Language Model · Transformer

0 likes · 4 min read

Baobao Algorithm Notes

Mar 5, 2025 · Artificial Intelligence

Why My 0.5B LLM’s Reasoning Collapsed During RLHF on Logic Puzzles

The author experiments with reinforcement‑learning‑from‑human‑feedback on a 0.5B Qwen instruct model using Logic‑RL and Open‑R1, discovers that reward mis‑design and curriculum learning cause the model to produce overly short or incorrect reasoning chains on knight‑and‑knave puzzles, and analyses the underlying causes.

Artificial Intelligence · Curriculum Learning · Large Language Model

0 likes · 11 min read

Open Source Linux

Mar 5, 2025 · Artificial Intelligence

How DeepSeek‑R1 Redefines Prompt Engineering and Real‑World AI Deployment

The article analyzes DeepSeek‑R1’s low‑cost inference architecture, Chinese language optimizations, novel prompt‑engineering techniques, and the practical challenges of deploying large domestic models, offering insights into vertical AI applications and the evolving open‑source ecosystem in China.

AI deployment · DeepSeek · Large Language Model

0 likes · 8 min read

Architects' Tech Alliance

Feb 28, 2025 · Artificial Intelligence

DeepSeek V3 & R1: How Their Training Costs Compare to Llama 3.1

The article analyzes DeepSeek’s latest V3 conversational model and R1 inference model, detailing their MoE architecture and training on H800 GPUs at a reported cost of about $5.58 million, comparing compute expenses to Meta’s Llama 3.1, and showing that their API pricing is roughly one‑tenth of GPT‑4o for dialogue and one‑twentieth of OpenAI o1 for inference.

AI model analysis · DeepSeek · Large Language Model

0 likes · 4 min read

IT Architects Alliance

Feb 26, 2025 · Artificial Intelligence

DeepSeek Large Model: Core Architecture, Key Technologies, and Training Strategies

The article provides an in‑depth overview of DeepSeek’s large language model, detailing its mixture‑of‑experts and Transformer foundations, novel attention mechanisms, load‑balancing, multi‑token prediction, FP8 mixed‑precision training, and various training regimes such as knowledge distillation and reinforcement learning.

DeepSeek · FP8 · Knowledge Distillation

0 likes · 18 min read

Tencent Technical Engineering

Feb 26, 2025 · Artificial Intelligence

Engineers' Perspectives on DeepSeek: Technical Innovations and Implications

Thirteen engineers praise DeepSeek’s open‑source, reinforcement‑learning‑driven architecture—using FP8 storage and SFT‑free training—to deliver GPT‑4‑level reasoning at one‑twentieth the cost, enabling single‑GPU deployment, lowering barriers for academia and startups, and prompting notable market reactions that could democratize advanced AI.

AI cost reduction · DeepSeek · FP8

0 likes · 9 min read

Architecture & Thinking

Feb 26, 2025 · Artificial Intelligence

Unlocking DeepSeek: A Comprehensive Guide to China’s Cutting-Edge AI Chat Model

This article provides an in‑depth overview of DeepSeek, covering its core multimodal and multilingual features, long‑context capabilities, domain optimizations, security, main functions, diverse application scenarios, and practical usage via web interface or API integration.

AI chatbot · Artificial Intelligence · DeepSeek

0 likes · 6 min read

Architects' Tech Alliance

Feb 25, 2025 · Artificial Intelligence

What Makes DeepSeek‑R1 a Game‑Changer in AIGC? Insights from Peking University

This article summarizes a Peking University lecture on DeepSeek‑R1, detailing its core concepts, advantages, and historical significance, then explains the underlying mechanisms of large‑model AI and AIGC tools, and finally offers practical guidance for selecting and efficiently applying AI solutions.

AI model analysis · AIGC · DeepSeek

0 likes · 5 min read

Ma Wei Says

Feb 25, 2025 · Artificial Intelligence

What Is GraphRAG? A Deep Dive into Next‑Gen Retrieval‑Augmented Generation and Open‑Source Implementations

GraphRAG, the next generation of Retrieval‑Augmented Generation, combines large language models, knowledge graphs, and graph databases to overcome traditional RAG’s knowledge gaps, hallucinations, and context limitations, and the article reviews its architecture, core modules, a recent 2025 paper, and six notable open‑source implementations.

Artificial Intelligence · GraphRAG · Large Language Model

0 likes · 9 min read

AI Algorithm Path

Feb 22, 2025 · Artificial Intelligence

Elon Musk Unveils Grok 3, Claiming the World’s Most Powerful AI Model

The article details the launch of Grok 3 by Elon Musk’s xAI, highlighting its massive GPU infrastructure, benchmark dominance over GPT‑4o, multiple model variants, pricing for Premium+ users, upcoming API and voice features, and the team’s plan to open‑source Grok 2 once the new model stabilises.

AI Benchmark · AI pricing · Elon Musk

0 likes · 6 min read

Selected Java Interview Questions

Feb 21, 2025 · Artificial Intelligence

Integrating DeepSeek Large Model with Spring AI: A Step-by-Step Guide

This article explains how to integrate DeepSeek's large language models into a Spring AI application, covering model selection, API key configuration, URL setup, dependency inclusion, and providing complete Java code examples for both synchronous and streaming chat interactions.

Backend Integration · DeepSeek · Java

0 likes · 5 min read

Top Architect

Feb 20, 2025 · Artificial Intelligence

Deploying DeepSeek R1 671B Model Locally with Ollama and Dynamic Quantization

This guide explains how to download, quantize, and run the full‑size 671‑billion‑parameter DeepSeek R1 model on local hardware using Ollama, covering model selection, hardware requirements, step‑by‑step deployment commands, optional web UI setup, performance observations, and practical recommendations.

AI · DeepSeek · Dynamic Quantization

0 likes · 16 min read

Practical DevOps Architecture

Feb 20, 2025 · Artificial Intelligence

Training MiniDeepSeek V3+R1 from Scratch: Full-Scale Large Model Technical Practice for 2025

This tutorial series provides a step‑by‑step technical guide to training, deploying, and fine‑tuning the MiniDeepSeek V3+R1 large language model, covering model performance, open‑source details, API usage, parameter explanation, multi‑turn chatbot construction, function calling, integration with Open WebUI, GraphRAG, Swarm, and various deployment and optimization techniques.

AI · Large Language Model · MiniDeepSeek

0 likes · 4 min read

Tencent Technical Engineering

Feb 19, 2025 · Artificial Intelligence

Reproduction and Analysis of DeepSeek R1/R1‑zero Reinforcement Learning Experiments

This note surveys four open‑source reproductions of DeepSeek R1/R1‑zero reinforcement‑learning pipelines, re‑implements their training on math and logic datasets using Qwen‑based models, shows that format‑plus‑accuracy rewards improve long‑chain reasoning though stability and scaling remain challenges, and outlines future directions for large‑scale RL and business deployment.

DeepSeek-R1 · Large Language Model · long chain of thought

0 likes · 39 min read

Java Tech Enthusiast

Feb 19, 2025 · Artificial Intelligence

xAI's Grok 3 Model: Benchmarks, Reasoning, and Industry Reactions

Elon Musk’s xAI introduced the Grok 3 family, trained on roughly 200,000 GPUs and offered in standard, mini, and Reasoning versions. xAI claims top performance on math, science, and coding benchmarks, ahead of Google Gemini, DeepSeek V3, Claude, and OpenAI GPT‑4o. Pricing starts at $30 per month, and the release has drawn both praise for its speed and criticism for lingering hallucinations and ethical sensitivities.

AI · DeepSearch · Grok3

0 likes · 16 min read

Alibaba Cloud Big Data AI Platform

Feb 19, 2025 · Artificial Intelligence

Build a DeepSeek AI Assistant with PAI‑RAG: Internet Search & Enterprise Knowledge Base

This guide walks you through using Alibaba Cloud's PAI‑RAG platform to deploy a DeepSeek large‑language‑model assistant that combines real‑time web search with an enterprise knowledge‑base, covering deployment, network‑search configuration, testing, and advanced enterprise features.

AI Assistant · DeepSeek · Enterprise Knowledge Base

0 likes · 10 min read

Architecture Digest

Feb 18, 2025 · Artificial Intelligence

Integrating DeepSeek Large Model with Spring AI: A Step‑by‑Step Guide

This article explains how to obtain a DeepSeek API key, configure Spring AI with the appropriate base URL and model, and provides Java code examples for both synchronous and streaming chat interactions using the DeepSeek large‑language model.

API integration · Chatbot · DeepSeek

0 likes · 5 min read

Full-Stack DevOps & Kubernetes

Feb 18, 2025 · Cloud Native

Deploy Massive LLMs on Kubernetes: Step‑by‑Step Guide for Ollama and DeepSeek‑R1

This guide explains how to deploy Ollama and serve large models such as DeepSeek‑R1 on a Kubernetes 1.30 cluster, covering hardware requirements, PVC and deployment manifests, service exposure, image pulling, verification steps, API access, and monitoring with Prometheus and Grafana.

AI · DeepSeek · Kubernetes

0 likes · 12 min read

JD Retail Technology

Feb 18, 2025 · Artificial Intelligence

Engineering Practices of JD Advertising Agent: JDZunTong Intelligent Assistant

JD’s advertising R&D team created the JDZunTong Intelligent Assistant by engineering a modular Agent platform that combines advanced Retrieval‑Augmented Generation (RAG 1.0 → 2.0) and Function‑Call capabilities, a visual designer, custom tool registration, and a native Python workflow engine to deliver intelligent customer service, data queries, and ad creation for merchants.

AI · Agent · JD Advertising

0 likes · 18 min read

Goodme Frontend Team

Feb 17, 2025 · Backend Development

How Plug Revolutionizes API Capture and Mocking with AI‑Powered Automation

This article introduces Plug, a unified front‑end tool that combines non‑intrusive interface capture, flexible mocking, and large‑model assistance to streamline API development for both mini‑programs and PC, while addressing HTTPS proxy challenges and performance considerations.

API mocking · Backend Development · Interface Capture

0 likes · 15 min read

AIWalker

Feb 16, 2025 · Artificial Intelligence

VARGPT: A Unified Autoregressive Architecture for Multimodal Understanding and Generation

VARGPT is a novel multimodal large language model that unifies visual understanding and autoregressive image generation within a single architecture, extending LLaVA with next‑token and next‑scale prediction, trained through three staged data‑curated phases and achieving superior performance on numerous vision‑language benchmarks.

AI research · Large Language Model · Multimodal

0 likes · 20 min read

Ops Development & AI Practice

Feb 16, 2025 · Artificial Intelligence

Why FlashAttention Supercharges Qwen Models: A Technical Deep Dive

This article explains the FlashAttention algorithm, its memory‑efficient tiling and recomputation techniques, and how enabling the flash_attn flag dramatically speeds up Qwen‑series large models while outlining hardware, software requirements and potential trade‑offs.

FlashAttention · GPU optimization · Large Language Model

0 likes · 8 min read

Code Ape Tech Column

Feb 14, 2025 · Artificial Intelligence

Integrating DeepSeek Large Model with Spring AI: A Step‑by‑Step Guide

This article explains how to integrate DeepSeek's large language models—both the chat‑oriented deepseek‑chat and the reasoning‑focused deepseek‑reasoner—into a Spring AI application, covering API key setup, base‑URL configuration, model selection, and providing full code examples for dependency, configuration, and a simple chat controller.

AI · Chatbot · DeepSeek

0 likes · 6 min read

JD Cloud Developers

Feb 13, 2025 · Artificial Intelligence

Unlocking DeepSeek R1: Concepts, Training Secrets, and Real-World Experiments

This article demystifies DeepSeek R1 by explaining key concepts such as online search integration and the R1 model, detailing its two‑phase training pipeline, core techniques like iterative data enhancement, and showcases practical reproductions, benchmark tests, and deployment examples for AI developers.

DeepSeek · Knowledge Distillation · Large Language Model

0 likes · 12 min read

Tencent Cloud Developer

Feb 13, 2025 · Artificial Intelligence

Build an AI Super App with DeepSeek and Tencent Cloud Code Assistant in Minutes

This guide walks you through configuring Tencent Cloud AI Code Assistant to use DeepSeek models—either via the DeepSeek public API or a locally‑deployed Ollama instance—covering prerequisites, step‑by‑step setup, required hardware, and command‑line examples.

AI code assistant · DeepSeek · Large Language Model

0 likes · 7 min read

AI Algorithm Path

Feb 12, 2025 · Artificial Intelligence

Essential DeepSeek‑R1 Reading List: Papers Behind the 2025 Hottest LLM

This article compiles a curated reading list of foundational and recent research papers—from the original Transformer to chain‑of‑thought, mixture‑of‑experts, and reinforcement‑learning studies—that together explain the breakthroughs behind DeepSeek‑R1 and guide readers through the technical evolution of modern large language models.

DeepSeek · Large Language Model · Mixture of Experts

0 likes · 15 min read

Architects' Tech Alliance

Feb 12, 2025 · Artificial Intelligence

DeepSeek‑V3 Training Efficiency, Knowledge Distillation, and the Risks of Synthetic Data

The article examines DeepSeek‑V3’s low‑cost training using 2048 H800 GPUs, explains how knowledge distillation and high‑quality data improve efficiency, discusses expert concerns about training on AI‑generated content, and outlines the limitations and ceiling effect of distillation techniques.

AI Training Efficiency · AI safety · DeepSeek-V3

0 likes · 7 min read

Bilibili Tech

Feb 11, 2025 · Artificial Intelligence

Building a Scalable AI Agent for Code Review: Practices, Architecture, and Challenges

The article outlines how to build a scalable, modular AI code‑review agent using LangChain, detailing stages from naive prompting to advanced prompt engineering, architecture with six core modules, strategies to curb hallucinations, improve reliability, performance, and human‑AI collaboration, and future RAG integration.

AI Agent · Code Review · LangChain

0 likes · 22 min read

AI2ML AI to Machine Learning

Feb 10, 2025 · Artificial Intelligence

Eight Ways Enterprises Can Leverage DeepSeek

The article outlines eight distinct enterprise strategies for adopting DeepSeek, categorizing them by model maturity, available data types, and specific business challenges, and maps these approaches onto four capability tiers—from basic compliance requirements to advanced multimodal, low‑cost solutions.

AI agents · DeepSeek · Enterprise AI

0 likes · 3 min read

DataFunSummit

Feb 10, 2025 · Artificial Intelligence

Intelligent Decision-Making Large Model ORLM: Research, Training Challenges, Commercialization, and Future Directions

This article presents the ORLM intelligent decision‑making large model, detailing how real‑world decision problems are formalized and solved, the training difficulties and data synthesis methods, the transition from academic research to commercial platforms, and future technical improvement plans.

AI · Decision Modeling · Large Language Model

0 likes · 10 min read

Code Mala Tang

Feb 10, 2025 · Artificial Intelligence

How Much Does It Really Cost to Run a Full‑Scale DeepSeek AI Locally?

This article breaks down the hardware and software expenses required to deploy a complete DeepSeek large‑language model on‑premises, revealing a total cost of roughly $110,000 and explaining why such an investment is prohibitive for most individual developers but may be justified for well‑funded research or corporate projects.

DeepSeek · Deployment · GPU

0 likes · 4 min read

Big Data Tech Team

Feb 9, 2025 · Artificial Intelligence

7 Proven Prompt Techniques to Unlock DeepSeek’s Full Potential

This guide presents seven practical prompt engineering tricks—ranging from precise requirement definition and contextual background provision to step‑by‑step decomposition, keyword tagging, iterative follow‑ups, tone/style adjustments, and model switching—that dramatically improve the relevance and quality of DeepSeek’s responses for work, learning, and creative tasks.

AI productivity · Artificial Intelligence · DeepSeek

0 likes · 6 min read

AIWalker

Feb 8, 2025 · Artificial Intelligence

Introducing Ola: A Full‑Modal Language Model from Tsinghua & Tencent that Unifies Image, Video, and Audio Understanding

The article presents Ola, an open‑source full‑modal LLM that uses progressive modality alignment to jointly process text, images, video, and audio, and demonstrates competitive performance across image, video, and audio benchmarks, surpassing many specialized models.

Large Language Model · Multimodal · Ola

0 likes · 22 min read

IT Architects Alliance

Feb 8, 2025 · Artificial Intelligence

Inside DeepSeek: How Its Innovative Architecture Redefines AI Performance

This article examines DeepSeek's advanced Transformer‑based architecture, dynamic routing, MoE system, multi‑stage training, efficient inference, multimodal capabilities, real‑world applications, technical challenges, and future prospects, providing a comprehensive technical analysis of the model's strengths and limitations.

AI Architecture · DeepSeek · Large Language Model

0 likes · 15 min read

Architect

Feb 7, 2025 · Industry Insights

Can DeepSeek’s Native Chinese LLM Transform Enterprise AI and Organizational Design?

The article evaluates DeepSeek‑R1’s strong reasoning, high performance, native Chinese training and low cost, then explores how such large language models can reshape B2C and B2B services, propose a new “intelligent data store” architecture, and outline comprehensive organizational and strategic changes enterprises must adopt to thrive in the AI era.

AI strategy · DeepSeek · Enterprise AI

0 likes · 16 min read

Alibaba Cloud Developer

Feb 7, 2025 · Artificial Intelligence

Why DeepSeek V3 Achieves Low Training Costs: Inside Its AI Innovations

This article provides a comprehensive analysis of DeepSeek's large‑language‑model technology, covering the company's background, model capabilities, remarkably low training and inference costs, and the core architectural and algorithmic innovations such as MoE, MLA attention, FP8 mixed‑precision, and the DualPipe pipeline that enable efficient large‑scale AI deployment.

AI Architecture · DeepSeek · FP8 training

0 likes · 19 min read

Java One

Feb 6, 2025 · Artificial Intelligence

Deploy DeepSeek‑R1 Locally on Your Laptop in Just 3 Minutes

This step‑by‑step guide shows non‑technical users how to install Ollama, pull the desired DeepSeek‑R1 model version, run it from the terminal, and optionally connect the free Chatbox desktop client for a visual chat interface, all without external network dependencies.

AI model · Chatbox · DeepSeek

0 likes · 6 min read

Cognitive Technology Team

Feb 6, 2025 · Artificial Intelligence

DeepSeek Model Guide: 10 Practical Tips and Usage Techniques

This article presents ten detailed techniques for effectively using DeepSeek's large language models—including mode selection, model comparisons, knowledge updates, prompt engineering, RAG, file uploads, API access, and open‑source resources—while offering concrete examples and code snippets for each feature.

AI API · DeepSeek · Large Language Model

0 likes · 12 min read

Tencent Cloud Developer

Feb 3, 2025 · Artificial Intelligence

DeepSeek's Emergence: Implications for AI, Enterprise Digital Transformation, and Future Software Development

DeepSeek’s debut marks a watershed for China’s AI, offering low‑cost, Chinese‑native reasoning that outperforms foreign models and prompting enterprises to restructure development around demand‑engineering, AI‑assisted low‑code, intelligent data stores, and a shift from “how to code” to “why to code” across a three‑phase transformation roadmap.

AI strategy · DeepSeek · Enterprise AI

0 likes · 15 min read

21CTO

Jan 31, 2025 · Artificial Intelligence

How DeepSeek‑R1 Is Redefining Open‑Source AI and Challenging OpenAI’s O1

DeepSeek‑R1, an open‑source inference model released under the MIT license, matches or surpasses OpenAI’s O1 on math, coding, and reasoning benchmarks, offers multiple scaled versions, runs at lightning speed, and is rapidly adopted worldwide, signaling a shift toward more accessible, high‑performance AI.

DeepSeek-R1 · Large Language Model · benchmark

0 likes · 9 min read

DataFunTalk

Jan 29, 2025 · Artificial Intelligence

ChatBI: NetEase AI‑Powered Business Intelligence Platform – Architecture, Technology, and Real‑World Applications

This article introduces ChatBI, NetEase’s AI‑driven BI solution that combines large‑model capabilities with traditional data analytics, detailing its product features, AI‑enabled opportunities and challenges, the underlying NL2SQL model, technical architecture, performance optimizations such as materialized views, open APIs, and several enterprise deployment cases.

AI · BI · Data Analytics

0 likes · 21 min read

DataFunTalk

Jan 27, 2025 · Artificial Intelligence

Improving AI Agent Planning and Reasoning: Challenges and Practical Solutions

The article examines current limitations of AI agents in planning and complex reasoning, critiques existing methods like COT/TOT and ReAct, and proposes practical strategies—including combined COT‑Reflection approaches, structured memory algorithms, and white‑box interaction designs—to enhance agent performance within the DataFun knowledge map framework.

AI Agent · CoT · Large Language Model

0 likes · 3 min read

DataFunTalk

Jan 26, 2025 · Artificial Intelligence

58.com’s LingXi Large Language Model Platform: Development, Deployment, and Performance Optimizations

Since the launch of ChatGPT, 58.com has built a Model‑as‑a‑Service platform called LingXi that trains and serves domain‑specific large language models, supports over a hundred internal scenarios with daily inference exceeding ten million calls, and continuously improves performance through quantization, GPU optimization, model miniaturization, and advanced AI applications such as interview assistants, voice agents, and RAG‑enabled agents.

AI Platform · AI applications · Inference Optimization

0 likes · 9 min read

AI Code to Success

Jan 26, 2025 · Industry Insights

How DeepSeek‑R1 Is Challenging OpenAI’s o1 and Shaping the AI Landscape

DeepSeek‑R1 achieved a 1357‑point Arena score, ranking third overall and tying OpenAI o1 for first in StyleCtrl, while its open‑source MIT‑licensed release—including distilled variants—and low‑cost API service aim to democratize advanced AI inference for developers worldwide.

AI competition · Arena benchmark · DeepSeek

0 likes · 5 min read

DataFunSummit

Jan 25, 2025 · Artificial Intelligence

AI-Driven Next-Generation Sales: Project Overview, Core Technologies, System Deployment, and Future Outlook

This article explores how AI transforms next‑generation sales by detailing project background and goals, core technologies such as efficient sample generation, model training and evaluation, system deployment impact, practical case studies, challenges, solutions, and future directions across multiple industries.

AI · Large Language Model · Sales Automation

0 likes · 25 min read

Kuaishou Tech

Jan 24, 2025 · Artificial Intelligence

KwaiCoder-23BA4-v1: An Efficient Large Code Generation Model via Pruning, Knowledge Distillation, and Granular Upcycling

KwaiCoder-23BA4-v1 is a 23B wide MoE code‑completion model that achieves state‑of‑the‑art performance on HumanEval, BigCodeBench and Fill‑in‑Middle benchmarks by using high‑quality data, a cost‑effective training pipeline that combines model pruning, knowledge distillation and fine‑grained merging, and extensive ablation studies.

AI · Knowledge Distillation · Large Language Model

0 likes · 10 min read

Baobao Algorithm Notes

Jan 22, 2025 · Artificial Intelligence

Can RL‑Only Training Make LLMs Beat OpenAI‑o1? Inside DeepSeek‑R1’s Architecture and Results

DeepSeek‑R1’s open‑source series demonstrates that reinforcement‑learning‑only training can match top‑tier models like OpenAI‑o1, while a small amount of SFT further improves readability; the article dissects its technical report, training pipeline, reward design, distillation strategy, benchmark outcomes, and remaining challenges.

DeepSeek · Large Language Model · Supervised Fine‑Tuning

0 likes · 11 min read

Baidu Tech Salon

Jan 21, 2025 · Artificial Intelligence

How AI Is Transforming Legal Research: Inside the YuanDian WenDa Smart Q&A Engine

Faced with billions of legal documents and the shortcomings of keyword search, Chinese legal professionals are turning to the AI‑powered YuanDian WenDa engine, which leverages Baidu's Wenxin model, a structured legal database, and prompt‑engineering to deliver trustworthy, citation‑rich answers and rapid research reports.

AI · Knowledge Graph · Large Language Model

0 likes · 10 min read

AIWalker

Jan 18, 2025 · Artificial Intelligence

How InternLM 3.0 Achieves High Performance with Just 4T Tokens of Training Data

Shanghai AI Laboratory’s InternLM 3.0 upgrade demonstrates that a refined dataset of 4 trillion (4T) tokens can lift a large language model’s performance beyond that of open‑source peers trained on 18 trillion tokens, cutting training cost by over 75% while merging regular dialogue with deep reasoning capabilities.

AI evaluation · InternLM · Large Language Model

0 likes · 9 min read

AIWalker

Jan 17, 2025 · Artificial Intelligence

InternLM 3.0: Boosting Model Performance with Only 4T Tokens of Training Data

Shanghai AI Laboratory’s InternLM 3.0 upgrade demonstrates that refining data quality, measured as intelligence per token, can replace massive datasets, achieving higher reasoning and dialogue capabilities with just 4 trillion (4T) tokens and cutting training cost by over 75% while approaching GPT‑4‑level performance.

AI research · InternLM · Large Language Model

0 likes · 9 min read

AIWalker

Jan 16, 2025 · Artificial Intelligence

How InternLM 3.0 Achieves High Performance with Just 4T Tokens of Training Data

InternLM 3.0 (InternLM‑3) upgrades the Shusheng‑PuYu model by refining data to boost "thinking density", using only 4 trillion (4T) tokens to surpass peer open‑source models, cutting training cost by over 75% while merging ordinary dialogue with deep reasoning capabilities.

InternLM · Large Language Model · data efficiency

0 likes · 9 min read

Alibaba Cloud Native

Jan 13, 2025 · Cloud Native

Build a Serverless AI Summarization Assistant with Alibaba Cloud Function Compute and Bailian

This guide explains how to use Alibaba Cloud Function Compute together with the Bailian large‑model platform to create a highly available, cloud‑native AI summarization service that automatically extracts key information from massive documents.

AI summarization · Alibaba Cloud · Cloud Native

0 likes · 8 min read

Infra Learning Club

Jan 12, 2025 · Artificial Intelligence

How to Connect a XiaoAI Speaker to a Large Language Model

This guide walks through preparing a XiaoAI speaker, selecting a free LLM service, creating an API key, installing Docker, running the MiGPT server, and configuring the speaker to query the chosen large language model.

Large Language Model · MiGPT · SiliconFlow

0 likes · 6 min read

JD Cloud Developers

Jan 9, 2025 · Artificial Intelligence

Boost Your Java Apps with LangChain4j: A Hands‑On RAG Guide

This article walks Java developers through the fundamentals of Retrieval‑Augmented Generation (RAG), explains the LangChain4j framework, compares large‑model development with traditional Java coding, and provides step‑by‑step code examples for environment setup, document splitting, embedding, vector‑store operations, and LLM interaction.

Embedding · Java · LangChain4j

0 likes · 34 min read

Alibaba Cloud Developer

Jan 9, 2025 · Artificial Intelligence

Unlocking Large Model Power: From Semantic Vectors to Real‑World Business Applications

This article explores large‑model capabilities through semantic‑vector theory, outlines business‑scenario focus, presents practical case studies such as AI customer‑service bots, and details prompt‑engineering techniques and optimization workflows to help practitioners effectively apply foundation models in real‑world tasks.

Business Application · Large Language Model · semantic vectors

0 likes · 36 min read

Baobao Algorithm Notes

Jan 7, 2025 · Artificial Intelligence

How Efficient Is DeepSeek V3? Calculating Its MFU Around 37%

This article derives DeepSeek V3's training Model FLOPs Utilization (MFU) using publicly available data, showing an MFU of roughly 37%—about a 60% improvement over V2—and provides detailed formulas, parameter settings, and a reproducible Python script.

AI Performance · DeepSeek · Large Language Model

0 likes · 8 min read

Alibaba Cloud Developer

Jan 2, 2025 · Operations

Mastering Error and Latency Diagnosis for Online Applications

This article presents a systematic root‑cause diagnosis framework for online applications, covering how to identify and resolve both error ("wrong") and performance ("slow") problems using trace links, associated data, high‑quality observability, and large‑language‑model‑driven intelligence.

Large Language Model · Root Cause Analysis · Trace Analysis

0 likes · 12 min read

DataFunTalk

Dec 27, 2024 · Artificial Intelligence

Designing Enterprise Business Analysis Agents with Large Language Models

This article explains how large‑model capabilities combined with metric and tag platforms can be used to build intelligent data‑analysis products for enterprises, covering challenges, solution routes such as NLP2SQL, NLP2API, NLP2Python, agent design, planning, and future outlooks.

AI Agent · Data Analysis · Enterprise Analytics

0 likes · 21 min read

NewBeeNLP

Dec 23, 2024 · Artificial Intelligence

What’s New in Qwen2.5? A Deep Dive into the Latest LLM Advances

The Qwen2.5 Technical Report introduces a new series of large language models with up to 72 B parameters, expanded pre‑training data to 18 trillion tokens, advanced supervised fine‑tuning and reinforcement learning pipelines, and demonstrates strong performance across comprehension, reasoning, coding, and long‑context tasks.

LLM · Large Language Model · Qwen2.5

0 likes · 5 min read

iQIYI Technical Product Team

Dec 19, 2024 · Artificial Intelligence

Project BaixiaoSheng: An AI‑Powered Project Management Assistant – iQIYI Case Study

Project BaixiaoSheng, iQIYI’s AI‑powered project management assistant unveiled at the 13th TOP 100 Global Software Case Study Summit, uses a Retrieval‑Augmented Generation framework with static knowledge Q&A, dynamic data consulting, and scenario‑assistant automation to cut context‑switching, streamline data flow, and boost cross‑system efficiency, while future plans target fine‑tuned LLMs, multi‑model fusion, and AI‑agent orchestration.

AI · Knowledge Base · Large Language Model

0 likes · 11 min read

Alimama Tech

Dec 11, 2024 · Artificial Intelligence

Engineering Architecture of Alibaba's AI Digital Employee "AI XiaoWan"

Alibaba’s AI digital employee “AI XiaoWan” uses a native multi‑agent architecture: a Controller Agent interprets intent, plans tasks, and orchestrates execution, while an Executable Agent performs domain‑specific operations. The agents communicate via a standardized Agent Communication Protocol and draw on a centralized Tool Center, a retrieval‑augmented knowledge base, and a data‑flywheel feedback loop, so the system improves continuously and evolves toward memory‑based reasoning and self‑learning.

AI · Knowledge Base · Large Language Model

0 likes · 14 min read

DataFunTalk

Dec 10, 2024 · Artificial Intelligence

Tencent Large Language Model Applications: RAG, GraphRAG, and Agent Technologies

This article explores Tencent's large language model deployments across various business scenarios, detailing the principles and practical implementations of Retrieval‑Augmented Generation (RAG), GraphRAG for role‑playing, and Agent technologies, while also covering model fine‑tuning, knowledge‑base construction, and evaluation methods.

AI applications · Agent · GraphRAG

0 likes · 15 min read

AI Large Model Application Practice

Dec 9, 2024 · Artificial Intelligence

How GUI Agents Use Large Models to Automate Any Desktop Task

This article explains why GUI agents are needed, defines their multimodal capabilities, walks through a high‑level automation scenario, details the architecture of large‑model‑driven GUI agents, highlights recent open‑source projects, and compares them with traditional RPA solutions.

AI automation · GUI Agent · Human-Computer Interaction

0 likes · 10 min read

DataFunSummit

Dec 7, 2024 · Artificial Intelligence

Technical Practices of Tencent's Intelligent BI System: Architecture, Model Fine‑Tuning, and Agent Design

This article details Tencent's shift from traditional BI to an AI‑driven intelligent BI platform, describing the challenges of architecture, large‑language‑model integration, and data integration, and presenting the OlaChat framework, unified orchestration, atomic agents, DSL conversion, monitoring, and future roadmap.

AI · Data Analysis · Intelligent BI

0 likes · 22 min read

Tencent Cloud Developer

Dec 5, 2024 · Industry Insights

Why Most RAG Projects Fail and How Tencent’s LeXiang AI Assistant Overcomes Them

The article analyses the rapid growth of Retrieval‑Augmented Generation (RAG) in enterprises, explains why self‑built RAG solutions often collapse under cost and maintenance pressures, and demonstrates how Tencent LeXiang AI Assistant addresses these issues through a robust knowledge‑management core, extensive industry experience, scalable resources, and advanced multimodal capabilities.

AI Assistant · Enterprise AI · Large Language Model

0 likes · 16 min read

Baidu Tech Salon

Nov 29, 2024 · Artificial Intelligence

How AI‑Powered “WenZhi” Transforms Job Matching with Baidu’s ERNIE Model

Faced with overloaded job listings and low offer rates, a group of students built “WenZhi,” an AI‑driven job‑matching app that leverages Baidu’s ERNIE SDK, generative recommendation, and workflow orchestration to deliver personalized role suggestions and interview advice within minutes.

AI · Career Technology · ERNIE SDK

0 likes · 7 min read

Tencent Cloud Developer

Nov 27, 2024 · Artificial Intelligence

Tencent Cloud AI Code Assistant: Product Evolution, Architecture, and Technical Implementation

Tencent Cloud AI Code Assistant has evolved from token‑level IDE completions to LLM‑driven multi‑modal coding and chat features, employing a dual‑loop R&D system, Hunyuan‑based code models, and sophisticated trigger, prompt, stop, and display strategies to deliver context‑aware, secure, and efficient code generation within IDE and review environments.

AB testing · AI code assistant · AST analysis

0 likes · 15 min read

Meituan Technology Team

Nov 21, 2024 · Frontend Development

AutoConsis: Automated UI Consistency Detection for Mobile Apps Using Multimodal AI

AutoConsis is a research‑driven, AI‑powered workflow that automatically detects UI content inconsistencies across mobile app pages by combining target region recognition, OCR‑based extraction, and large language model reasoning, achieving low cost, high generalization, and high confidence as demonstrated on Meituan's large‑scale marketing scenarios.

CLIP · ICSE 2024 · Large Language Model

0 likes · 15 min read

Rare Earth Juejin Tech Community

Nov 20, 2024 · Artificial Intelligence

Resolving 02_DocQA.py Errors and Using LangChain to Call Large Models Locally

This guide explains how to fix the ArkNotFoundError in the 02_DocQA.py script by configuring a Doubao‑embedding endpoint, setting up a Conda environment with the latest LangChain packages, and provides step‑by‑step code examples for invoking both Zhipu glm‑4 and Volcano large language models via LangChain.

Embedding · Environment setup · LangChain

0 likes · 9 min read

Baidu Tech Salon

Nov 19, 2024 · Artificial Intelligence

Baidu's Wenxin AI Agent Technology Wins Leading Science and Technology Award at 2024 World Internet Conference

At the 2024 World Internet Conference in Wuzhen, Baidu’s Wenxin AI Agent technology earned the Leading Science and Technology Award, marking its second consecutive win and highlighting the system’s brain‑inspired “System 2” architecture that enhances large‑model reasoning, accelerates diverse applications, and drives significant social and economic value.

AI · AI Agent · Award

0 likes · 6 min read

Baidu Tech Salon

Nov 14, 2024 · Artificial Intelligence

How Baidu’s Wenxin Model Hit 430 Million Users and What Its New Tech Means for AI

At Baidu World 2024, CTO Wang Haifeng revealed that Wenxin Yiyan has reached 430 million users, detailed the model’s retrieval‑augmented and multimodal generation breakthroughs, showcased intelligent‑agent‑driven coding tools, and highlighted expanding AI applications across education, sports, and industry.

AI · Intelligent agents · Large Language Model

0 likes · 7 min read

Architects' Tech Alliance

Nov 12, 2024 · Artificial Intelligence

How Retrieval‑Augmented Generation Boosts Enterprise AI with Intel Optimizations

This article explains the fundamentals of Retrieval‑Augmented Generation (RAG), its four‑step workflow, architecture, and how Intel’s hardware and software optimizations—including vector search, quantized embeddings, and advanced inference extensions—enhance performance, security, and scalability for enterprise LLM applications.

AI inference · Embedding Quantization · Intel Optimization

0 likes · 14 min read

DataFunSummit

Nov 8, 2024 · Artificial Intelligence

ChatDBA: An AI‑Powered Database Fault Diagnosis Assistant Using Retrieval‑Augmented Generation

ChatDBA, developed by Shanghai Aikesheng, is an AI-driven database operation assistant that leverages large language models and Retrieval‑Augmented Generation to provide fault diagnosis, knowledge learning, SQL generation and optimization, addressing challenges such as vague outputs, complex troubleshooting logic, and memory management through a structured architecture and multi‑modal retrieval strategies.

AI · Database · Fault Diagnosis

0 likes · 10 min read

Tencent Cloud Developer

Nov 6, 2024 · Artificial Intelligence

Overview of Tencent Hunyuan Large and 3D Generation Model Open‑Source Release

Tencent has open‑sourced its 389‑billion‑parameter Hunyuan Large Mixture‑of‑Experts model—featuring 52 B active parameters, 256 K token context, novel routing, KV‑cache compression, and advanced training optimizations that beat leading open‑source models—and its first text‑to‑3D/image‑to‑3D Hunyuan 3D Generation model, both downloadable via GitHub, Hugging Face, and Tencent Cloud.

3D generation · AI research · Large Language Model

0 likes · 9 min read

DataFunSummit

Oct 27, 2024 · Artificial Intelligence

How Siemens Harnesses Generative AI to Build the Enterprise Knowledge Chatbot “XiaoYu”

This article describes Siemens' journey in applying generative AI and Retrieval‑Augmented Generation to create an internal knowledge chatbot, detailing the business challenges, technical architecture, data integration, multi‑modal capabilities, deployment outcomes, and strategic lessons for enterprise AI adoption.

AI chatbot · Enterprise Knowledge Management · Large Language Model

0 likes · 21 min read

DataFunSummit

Oct 24, 2024 · Big Data

Bilibili’s Large Language Model‑Based Intelligent Assistant for the Big Data Platform: Architecture, Principles, and Deployment

This article details Bilibili’s implementation of a large‑language‑model‑driven intelligent assistant for its massive big‑data platform, covering background, problem analysis, architectural design, knowledge‑base construction, precision and recall challenges, deployment across offline and real‑time Spark/Flink diagnostics, and future outlooks.

Agent · Big Data · Flink

0 likes · 23 min read

DataFunSummit

Oct 21, 2024 · Artificial Intelligence

Retrieval‑Augmented Generation (RAG) for Office Applications: Architecture, Challenges, and Practical Practices

This article introduces Retrieval‑Augmented Generation (RAG) as a solution to the hallucination, freshness, and data‑privacy issues of large language models, details its modular architecture, explains the layered system design and hybrid retrieval pipeline, and shares the practical challenges and engineering tricks encountered when deploying RAG in enterprise office scenarios.

AI · Hybrid Retrieval · Large Language Model

0 likes · 19 min read

Baidu Tech Salon

Oct 17, 2024 · Artificial Intelligence

How to Deploy Yuan 2.0 LLM with PaddleNLP: A Step‑by‑Step Guide

This article explains how the open‑source Yuan 2.0 large language model is fully integrated with Baidu’s PaddleNLP, covering its capabilities, fine‑tuning optimizations, step‑by‑step deployment instructions, interaction examples, and training/finetuning results with loss‑curve visualizations.

AI · Large Language Model · PaddleNLP

0 likes · 10 min read

DataFunTalk

Oct 11, 2024 · Artificial Intelligence

ChatBI: Leveraging Large Language Models for Intelligent Business Intelligence at Ximalaya

This article details Ximalaya’s ChatBI project, describing how large language models are integrated into a BI platform to improve data accessibility, reduce development effort, and enhance query accuracy through prompt engineering, RAG, fine‑tuning, and multi‑agent architectures.

AI · Data Platform · Large Language Model

0 likes · 10 min read

Java Tech Enthusiast

Oct 10, 2024 · Artificial Intelligence

Google Rehires AI Pioneer Noam Shazeer for Gemini Development

Google has signed a $2.7 billion agreement to rehire AI pioneer Noam Shazeer—co‑author of the seminal “Attention is All You Need” paper and creator of the Meena chatbot—bringing him back from his Character.AI venture to serve as vice president overseeing the Gemini generative‑AI project alongside DeepMind leaders, thereby bolstering Google’s competitive edge in the field.

AI · Character AI · Gemini

0 likes · 8 min read

Zhihu Tech Column

Oct 10, 2024 · Artificial Intelligence

Massive Multi-Label Text Classification via Semantic Retrieval and Large AI Model

This article presents a method for massive multi-label text classification on Zhihu content by combining a semantic retrieval model with a proprietary large AI model, detailing the challenges of large label spaces, model architecture, loss optimization, and experimental results showing significant accuracy gains.

BGE · Large Language Model · multi-label classification

0 likes · 16 min read

58 Tech

Sep 23, 2024 · Artificial Intelligence

Enhancing Commercial Search with Knowledge Graphs and Large‑Model Techniques

This article describes how a commercial search platform iteratively upgrades its system by structuring business knowledge into a knowledge graph, applying multi‑stage entity extraction (CRF, Electra‑CRF, GLM‑3, OCR), and leveraging large language models to improve relevance, user experience, and revenue.

AI · Knowledge Graph · Large Language Model

0 likes · 14 min read

Data Thinking Notes

Sep 13, 2024 · Artificial Intelligence

How OpenAI’s o1 Series Redefines Complex Reasoning and AI Safety

OpenAI’s new o1 series, including o1‑preview and o1‑mini, leverages reinforcement‑learning‑based chain‑of‑thought reasoning to achieve superior performance on academic exams, coding contests, and safety benchmarks, offering faster, cost‑effective options while advancing AI alignment and human‑preference evaluation.

AI safety · Large Language Model · OpenAI

0 likes · 15 min read

MaGe Linux Operations

Sep 13, 2024 · Artificial Intelligence

Can OpenAI’s New o1 Model Reach Human‑Level Reasoning?

OpenAI’s newly released o1 series introduces a reinforcement‑learning‑trained LLM that generates long chain‑of‑thought reasoning, achieving top‑50% scores on IOI contests, high rankings on Codeforces and AIME, and dramatically outperforming GPT‑4o across scientific and mathematical tasks.

AI reasoning · Artificial Intelligence · Large Language Model

0 likes · 8 min read

Qunhe Technology Quality Tech

Sep 10, 2024 · Artificial Intelligence

Boost Test Case Creation with AI: How a Multi‑Model Platform Cuts Effort by 80%

An AI-driven test case generation platform at KuJiaLe leverages multiple large language models, offering three input methods, online editing, and dual export options, while addressing stability, length limits, and security challenges to improve testing efficiency and achieve over 80% success rate.

AI testing · Automation · Large Language Model

0 likes · 10 min read

Xiaohongshu Tech REDtech

Sep 2, 2024 · Artificial Intelligence

How AIGC Transforms Advertising Material Creation on Xiaohongshu

This article analyzes how large‑model AIGC reshapes the production, evaluation, and deployment of advertising creatives on Xiaohongshu, detailing the business motivations, technical pipeline, controllable generation, reward‑model filtering, and experimental results that balance commercial efficiency with community tone.

AIGC · Advertising · Large Language Model

0 likes · 14 min read

Volcano Engine Developer Services

Aug 29, 2024 · Artificial Intelligence

Building a Multi‑Model AI Bot: Design, Prompt Tricks, and Lessons Learned

This article details the creation of a multi‑model AI chatbot, covering its core features, workflow, prompt role configuration, parameter tuning, anti‑reverse‑engineering measures, competitive landscape, and reflective insights for developers building large‑model applications.

AI bot · Large Language Model · Parameter Tuning

0 likes · 12 min read

Baobao Algorithm Notes

Aug 27, 2024 · Artificial Intelligence

Unlock Free GLM-4-Flash API: Step-by-Step Guide, Code Samples, and Logic Puzzle Test

This article explores the free GLM-4-Flash API from Zhipu AI, detailing its lightweight architecture, performance specs, a logic‑puzzle demonstration, and provides a comprehensive step‑by‑step tutorial—including data upload, model fine‑tuning, deployment commands and example code for building a LangChain‑based knowledge‑base retrieval system.

AI deployment · Free API · GLM-4-Flash

0 likes · 11 min read
