Tagged articles

1070 articles


Feb 10, 2025 · Artificial Intelligence

How DeepSeek R1 Uses Large‑Scale Reinforcement Learning to Replicate OpenAI o1

This article examines DeepSeek R1’s large‑scale reinforcement‑learning approach, its training pipeline that combines rule‑based scaling and deep‑reasoning SFT data, and why its open‑source, low‑cost replication of OpenAI o1 marks a pivotal step toward more efficient, democratized AI models.

AI efficiency · DeepSeek · Large Language Models

0 likes · 18 min read


DevOps

Feb 9, 2025 · Artificial Intelligence

DeepSeek’s Impact on the Large Model Ecosystem and the Resurgence of AI PCs

The article examines DeepSeek’s rapid rise, its open‑source R1 model and distilled variants, the resurgence of AI PCs, hardware support from Nvidia, AMD and others, and how this ecosystem is reshaping personal AI experiences and the broader large‑model landscape.

AI PC · DeepSeek · Hardware

0 likes · 11 min read


AI Algorithm Path

Feb 9, 2025 · Artificial Intelligence

Understanding Multi-Token Prediction in DeepSeek‑R1 Architecture

This article dissects the Multi‑Token Prediction (MTP) technique used in DeepSeek‑R1, contrasting it with traditional next‑token prediction, detailing Meta’s MTP design, DeepSeek’s adapted architecture, loss weighting, and why MTP is applied only during training to boost efficiency and model capability.

DeepSeek · Large Language Models · MTP

0 likes · 9 min read

Architect

Feb 9, 2025 · Artificial Intelligence

How DeepSeek’s Model Distillation Boosts AI Efficiency and Performance

This article provides an in‑depth analysis of DeepSeek’s model distillation technology, covering its definition, core principles, innovative strategies, architecture design, training optimizations, benchmark results, efficiency gains, and the remaining challenges of applying distillation to large language models and multimodal data.

AI efficiency · DeepSeek · Knowledge Transfer

0 likes · 16 min read


Architects' Tech Alliance

Feb 9, 2025 · Artificial Intelligence

How DeepSeek R1 Replicates OpenAI o1 Using Large‑Scale Reinforcement Learning

The article provides an in‑depth technical analysis of DeepSeek R1, explaining how it reproduces OpenAI o1's reasoning abilities through rule‑based large‑scale reinforcement learning, mixed SFT data, and efficient scaling, while discussing its broader impact on AI model development and capability density trends.

AI Industry · Capability Density · DeepSeek

0 likes · 19 min read


AI2ML AI to Machine Learning

Feb 8, 2025 · Artificial Intelligence

Analyzing DeepSeek R1 Inference Projects: Source Code, Cold‑Start, and Scaling Techniques

This article examines DeepSeek R1’s three breakthroughs, its low‑cost optimizations that bypass CUDA, and the resulting impact on the AI ecosystem, then provides a detailed technical review of seven open‑source reproductions, including Open‑R1, Tiny‑Zero, SimpleScaling‑S1, and simpleRL‑reason, covering their architectures, reinforcement‑learning pipelines, and code implementations.

DeepSeek · Inference Scaling · Large Language Models

0 likes · 10 min read


Huawei Cloud Developer Alliance

Feb 8, 2025 · Artificial Intelligence

Why DeepSeek V3 and R1 Are Redefining Low‑Cost AI: Architecture, Training Tricks, and Industry Impact

This article analyses DeepSeek's V3 and R1 models, explaining how their innovative MoE architecture, Multi‑Head Latent Attention, low‑cost training strategies, and distributed‑training optimizations deliver high‑performance large language models while reducing GPU/NPU demand and sparking industry excitement.

AI inference · DeepSeek · Large Language Models

0 likes · 16 min read


IT Services Circle

Feb 7, 2025 · Artificial Intelligence

Building Low‑Cost AI Clusters with Old Phones Using Exo and Open WebUI

This article introduces Exo, an open‑source platform that lets you turn idle smartphones, tablets, and laptops into a distributed AI cluster capable of running large language models, and shows how Open WebUI provides a user‑friendly interface for deploying private AI assistants.

AI clustering · Exo · Large Language Models

0 likes · 6 min read


Java Captain

Feb 7, 2025 · Artificial Intelligence

DeepSeek: Disruptive Innovations in Large Language Model Architecture, Efficiency, and Ecosystem

DeepSeek reshapes the AI landscape by replacing brute‑force compute scaling with algorithmic breakthroughs such as a novel MoE architecture, memory compression, active‑learning data pipelines, and open‑source tooling, delivering dramatically lower training and inference costs while enabling edge deployment and a vibrant developer ecosystem.

Algorithmic Efficiency · DeepSeek · Large Language Models

0 likes · 11 min read


Tencent Cloud Developer

Feb 6, 2025 · Artificial Intelligence

DeepSeek V Series: Technical Overview of Scaling Laws, Grouped Query Attention, and Mixture‑of‑Experts

The article reviews DeepSeek’s V‑series papers, explaining how scaling‑law insights, Grouped Query Attention, a depth‑first design, auxiliary‑loss‑free load balancing, multi‑token prediction and Multi‑Head Latent Attention together enable economical mixture‑of‑experts LLMs that rival closed‑source models while cutting compute and hardware costs.

DeepSeek · Grouped Query Attention · Large Language Models

0 likes · 13 min read


Alibaba Cloud Developer

Feb 5, 2025 · Artificial Intelligence

10 Common Prompt Engineering Mistakes and How to Overcome Them

This article lists ten common misconceptions about prompt engineering, explains why each is flawed, and offers practical insights and strategies—such as using the CO‑STAR framework, tailoring prompts to specific models, keeping prompts concise, and continuously testing and refining—to help readers communicate effectively with large language models.

AI misconceptions · LLM · Large Language Models

0 likes · 10 min read


Architect

Feb 3, 2025 · Artificial Intelligence

How DeepSeek‑R1 Uses Pure Reinforcement Learning to Match OpenAI’s o1

This article presents DeepSeek‑R1 and DeepSeek‑R1‑Zero, two next‑generation LLMs trained with pure reinforcement learning and multi‑stage fine‑tuning, details their GRPO training framework, model‑distillation pipeline, open‑source release, and evaluation results that rival OpenAI’s o1‑1217 across reasoning, knowledge, and coding benchmarks.

DeepSeek · LLM evaluation · Large Language Models

0 likes · 10 min read


Cognitive Technology Team

Feb 3, 2025 · Artificial Intelligence

DeepSeek R1 Introduces Group Relative Policy Optimization for Advanced Reasoning in Large Language Models

DeepSeek AI’s new open‑source model DeepSeek‑R1 leverages a novel Group Relative Policy Optimization (GRPO) reinforcement‑learning framework and multi‑stage training to dramatically boost complex reasoning performance, achieving AIME 2024 Pass@1 scores comparable to OpenAI’s o1 model.

AI · DeepSeek · GRPO

0 likes · 4 min read


DataFunSummit

Jan 31, 2025 · Artificial Intelligence

LLMOps: Building a Prompt‑Driven Engine for AI Operations

This article presents the concept of LLMOps—applying large language models to AIOps—by analyzing prompt challenges, introducing the LogPrompt engine for log analysis, describing a prompt‑learning data flywheel with CoachLM optimization, reporting experimental results, and outlining future multi‑modal directions.

CoachLM · Data Flywheel · LLMOps

0 likes · 16 min read


Alibaba Cloud Native

Jan 27, 2025 · Frontend Development

How Large Language Models Can Supercharge Frontend Development: Practical Insights

This article explores how large language models can be leveraged to automate and accelerate frontend development tasks, covering prompt engineering, repo‑level code generation, quality factors, hallucination mitigation, knowledge‑base integration, and practical strategies for improving developer productivity.

AI · Frontend · Knowledge Base

0 likes · 22 min read


JD Cloud Developers

Jan 26, 2025 · Operations

How Large Language Models Are Transforming Modern IT Operations

This article traces the evolution of IT operations from manual tasks to automation, AIOps, and ChatOps, and explains how large language models boost efficiency, enable intelligent assistants, automated diagnosis, and smart log analysis for more reliable, automated Ops workflows.

ChatOps · Large Language Models · aiops

0 likes · 7 min read


ByteDance Web Infra

Jan 22, 2025 · Artificial Intelligence

Introducing UI‑TARS: A Native GUI Agent Model Integrated with Midscene.js for Multimodal UI Automation

The article presents UI‑TARS, a native GUI‑agent model that combines multimodal large‑language models with the open‑source Midscene.js framework to enable more accurate, token‑efficient, and privacy‑preserving UI automation, while discussing its architecture, advantages, limitations, and integration steps.

GUI Agent · Large Language Models · Midscene.js

0 likes · 11 min read


Bilibili Tech

Jan 21, 2025 · Artificial Intelligence

Accelerating Large Model Inference: Challenges and Multi‑Level Optimization Strategies

The article outlines how exploding LLM sizes create compute, memory, and latency bottlenecks and proposes a full‑stack solution (operator fusion, high‑performance libraries, quantization, speculative decoding, sharding, continuous batching, PagedAttention, and specialized frameworks like MindIE‑LLM) to dramatically boost inference throughput and reduce latency, while highlighting future ultra‑low‑bit and heterogeneous hardware directions.

Inference Acceleration · Large Language Models · Operator fusion

0 likes · 21 min read


Fighter's World

Jan 10, 2025 · Artificial Intelligence

How to Escape the Demo Dilemma: A Three‑Stage Leap for B2B Large‑Model Deployment

The article analyzes why B2B large‑model projects often stall at demo, prototype, or POC stages and proposes a three‑level value‑lift framework—model domain intelligence, business‑process smart density, and pervasive seamless interaction—to turn demos into real‑world impact.

AI value ladder · AI-native · B2B AI

0 likes · 13 min read


Baidu Tech Salon

Jan 8, 2025 · Artificial Intelligence

Evolution of Video Search Ranking Architecture Toward an End‑to‑End Large‑Model Framework

The paper describes transforming a tightly coupled, multi‑stage video search ranking pipeline into a modular, end‑to‑end large‑model architecture that decouples recall, employs a graph‑engine parallel framework and elastic compute allocation, thereby boosting performance, flexibility, personalization and lowering long‑term operational costs.

End-to-End · Large Language Models · System Optimization

0 likes · 10 min read


ZhongAn Tech Team

Jan 5, 2025 · Artificial Intelligence

Weekly AI Roundup Issue 9: OpenAI Vision, LeCun Interview, ByteDance HLLM, and DeepSeek‑V3 Highlights

This issue presents a curated overview of recent AI developments, including Sam Altman's 2025 technology vision poll, LeCun's interview on future AI directions, ByteDance's hierarchical large language model for recommendation, and the performance and cost advantages of the open‑source DeepSeek‑V3 model.

AI · ByteDance · DeepSeek

0 likes · 10 min read


DataFunTalk

Jan 1, 2025 · Artificial Intelligence

Applying Large Language Models to Financial Risk Control at Akulaku

This article details Akulaku’s deployment of large language models across multimodal financial risk‑control scenarios—covering business background, a three‑module intelligent‑agent architecture, concrete tool‑ and planning‑enhancement case studies, and future outlook—demonstrating how LLMs boost efficiency, reduce labeling effort, and enable copilot‑style assistance.

Agent Architecture · KYC verification · Large Language Models

0 likes · 15 min read


DataFunSummit

Dec 31, 2024 · Artificial Intelligence

How Momo Leverages Large Model Technology to Transform Business and R&D Processes

This article explains how Momo utilizes large language model technologies to revamp its AI application paradigm, achieve efficient inference through quantization and prefix caching, build a workflow‑based model platform, and outline future plans for framework optimization and multimodal support.

AI Platform · Inference Optimization · Large Language Models

0 likes · 16 min read


Xiaohongshu Tech REDtech

Dec 26, 2024 · Artificial Intelligence

Instruction Embedding: Latent Representations of Instructions for Task Identification

The paper introduces Instruction Embedding—a task‑focused text representation learned on the new Instruction Embedding Benchmark—and shows that Prompt‑based Instruction Embedding (PIE) outperforms standard embeddings in clustering, similarity, and downstream tasks such as data selection, in‑context example retrieval, test‑set compression, and task‑correlation analysis.

Large Language Models · contrastive learning · fine-tuning

0 likes · 15 min read


DeWu Technology

Dec 25, 2024 · Artificial Intelligence

AI-Powered Intelligent Coding: Product Evolution, Technical Advances, and Future Outlook

AI‑powered coding tools, from JetBrains’ free IDEs to VS Code‑based tools like Cursor and end‑to‑end web platforms, are rapidly evolving, offering code completion, AI‑driven Q&A, multi‑file editing, and chat interfaces. Advances in context handling, caching, LLM fine‑tuning, and speculative decoding promise faster, more integrated development workflows and a future where IDEs become chat‑centric assistants that streamline debugging, deployment, and junior developer support.

AI coding · IDE integration · Intelligent code completion

0 likes · 18 min read


Architects' Tech Alliance

Dec 23, 2024 · Artificial Intelligence

Why High‑Quality, Massive, Diverse Data Fuels AI Breakthroughs

The article explains how breakthroughs in artificial intelligence depend on high‑quality, large‑scale, and diverse training data, outlines the data‑centric AI movement, details a six‑step workflow for building datasets, and surveys the data industry ecosystem supporting large language model development.

AI data · Data‑Centric AI · Large Language Models

0 likes · 7 min read


Fighter's World

Dec 21, 2024 · Artificial Intelligence

Is Pre‑training Coming to an End? Evaluating Data Sufficiency

The article examines Ilya Sutskever’s claim that pre‑training will end, argues that scaling laws still hold and data is not yet a bottleneck, highlights the scarcity of high‑quality frontier data, and explains why the industry is shifting toward inference‑time compute (o1) as a more sustainable path for large language models.

AI trends · Data Wall · Inference‑time Compute

0 likes · 13 min read


Data Thinking Notes

Dec 18, 2024 · Artificial Intelligence

Mastering Prompt Engineering: Advanced Techniques from OpenAI, Anthropic, and Google

This article provides a comprehensive guide to modern prompt engineering, covering foundational principles, detailed techniques such as role‑playing, delimiters, step‑by‑step instructions, and advanced strategies like chain‑of‑thought, reflection, and external tool integration, with real‑world examples from major AI providers and a practical Img2Code case study.

AI best practices · LLM Development · Large Language Models

0 likes · 24 min read


Baidu Geek Talk

Dec 16, 2024 · Artificial Intelligence

AIAPI: Baidu's AI-Native Retrieval System for Large Language Model Applications

AIAPI, Baidu’s AI‑native retrieval platform for large language models, tackles hallucination, slow domain updates, and output opacity by delivering authoritative, timely, full‑content data through a dual‑channel architecture that combines traditional search and RAG. It employs reusable ranking, graph‑enhanced data layers, dynamic caching that cuts storage by 70%, and QueryPlan‑based QoS, achieving markedly higher retrieval quality and a 34% speed gain with Wenxin 4.0.

AI-Native Systems · AIAPI · Large Language Models

0 likes · 12 min read


JD Tech

Dec 14, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical and Semantic ID Approaches

This article presents a comprehensive study of generative retrieval for large‑scale e‑commerce search, comparing lexical‑based and Semantic‑ID‑based methods, introducing a Query‑to‑MultiSpan framework, analyzing the sand‑glass distribution problem in residual quantization, and proposing heuristic and adaptive solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval

0 likes · 20 min read


Alibaba Cloud Big Data AI Platform

Dec 12, 2024 · Artificial Intelligence

How PertEval Reveals the Real Knowledge Limits of Large Language Models

At NeurIPS 2024, Alibaba Cloud's PAI team presented the Spotlight paper PertEval, which introduces knowledge‑invariant perturbations to expose the true knowledge capacity of LLMs, critiques over‑optimistic static benchmarks, and showcases responsible AI solutions and platform demos for enterprise use.

Alibaba Cloud · Large Language Models · NeurIPS 2024

0 likes · 6 min read


Tencent Tech

Dec 11, 2024 · Artificial Intelligence

Inside Tencent LeYong AI: Solving Enterprise RAG with Knowledge, Engineering & Algorithms

This article explores how Tencent's LeYong AI assistant leverages Retrieval‑Augmented Generation to empower enterprise knowledge retrieval, detailing three capability dimensions—knowledge management, engineering, and algorithmic—along with eight sub‑areas such as knowledge boundaries, quality, permissions, multimodal handling, long‑context span, and complex reasoning.

AI assistants · Enterprise AI · Large Language Models

0 likes · 18 min read


AntTech

Dec 11, 2024 · Artificial Intelligence

Ant Group’s Selected NeurIPS 2024 Papers: Summaries and Highlights

This article presents a curated overview of fifteen Ant Group research papers accepted at NeurIPS 2024, covering topics such as large language models, knowledge graphs, recommendation systems, privacy-preserving inference, and multimodal learning, with abstracts, paper types, links, and key contributions highlighted.

Ant Group · Artificial Intelligence · Large Language Models

0 likes · 32 min read


DevOps

Dec 10, 2024 · Artificial Intelligence

Key Generative AI Trends to Watch in 2024

The article outlines the major 2024 generative AI trends—including realistic expectations, multimodal models, smaller open‑source LLMs, GPU shortages, easier model optimization, custom local pipelines, stronger virtual agents, regulatory and ethical challenges, and the rise of shadow AI—while explaining their technical and business implications.

AI governance · Large Language Models

0 likes · 17 min read


AntTech

Dec 10, 2024 · Artificial Intelligence

Three Representative Ant Group Papers at NeurIPS 2024

Ant Group will showcase three flagship papers at NeurIPS 2024—AMOR for adaptable modular knowledge agents, PaRO for efficient data‑parallel training of large language models, and LLMDFA for code data‑flow analysis using LLMs—highlighting novel methods, experimental results, and upcoming live discussions.

Ant Group · Artificial Intelligence · Dataflow Analysis

0 likes · 5 min read


AsiaInfo Technology: New Tech Exploration

Dec 9, 2024 · Artificial Intelligence

How Programming Large Models Transform Repository‑Level Code Completion

This article examines how programming large models combined with code knowledge graphs can overcome the limited context of traditional code‑completion tools, detailing key techniques, trigger strategies, context acquisition methods, model fine‑tuning practices, current challenges, and future research directions for intelligent, repository‑wide code suggestions.

AI programming · Knowledge Graph · Large Language Models

0 likes · 14 min read


JD Retail Technology

Dec 9, 2024 · Artificial Intelligence

Generative Retrieval for E‑commerce Search: Lexical‑Based and Semantic‑ID Approaches

This article presents a comprehensive study of generative retrieval in large‑scale e‑commerce search, detailing lexical‑based and SemanticID‑based methods, their challenges such as long‑tail distribution and token length, experimental evaluations, the discovered "sandglass" effect, and proposed solutions to improve recall and efficiency.

AI · E-commerce Search · Generative Retrieval

0 likes · 20 min read


ZhongAn Tech Team

Dec 8, 2024 · Artificial Intelligence

Weekly AI Digest Issue 5: Voice Interaction Trends, End‑to‑End vs. Chain Integration, and Enterprise Solutions

This issue examines the growing importance of voice interaction in AI, highlights Justin Uberti’s move to OpenAI and the launch of GPT‑4o, compares end‑to‑end large‑model and chain‑integration approaches, and offers practical enterprise deployment scenarios for both weak and strong voice‑based interactions.

AI · Chain Integration · End-to-End

0 likes · 14 min read


Fighter's World

Dec 7, 2024 · Artificial Intelligence

Does Scaling Law Still Hold? Analyzing OpenAI’s 12‑Day Mini Releases and the Future of GPT‑5

The article examines OpenAI’s 12‑day mini‑series, the emergence of o1 and Reinforcement Fine‑Tuning, and uses Epoch AI’s 2024 report to evaluate four critical constraints—power, chip capacity, data scarcity, and latency—that determine whether AI scaling laws can sustain the compute needed for a GPT‑5‑scale model by 2030.

AI scaling · Large Language Models · Latency

0 likes · 11 min read


Baobao Algorithm Notes

Dec 7, 2024 · Artificial Intelligence

What Is Reinforcement Fine-Tuning (RFT) and How Does It Supercharge LLMs?

Reinforcement Fine-Tuning (RFT) combines supervised fine‑tuning with reinforcement learning to teach large language models to reason more effectively, using separate training and validation datasets, graders, and PPO optimization, and has shown superior performance on tasks like gene prediction and math reasoning compared to standard SFT.

AI · Large Language Models · Machine Learning

0 likes · 8 min read


NewBeeNLP

Dec 3, 2024 · Artificial Intelligence

Can LLMs Self‑Correct Their Answers? Exploring Reward Models, Loss Functions, and Training Dynamics

The article reflects on open‑source LLMs like Qwen2 and Llama 3.1, questioning whether models should self‑review answers, how hidden states might signal uncertainty, the role of loss‑function design, scaling laws, and the trade‑offs between PPO and DPO in alignment.

Large Language Models · Reward Model · Scaling Law

0 likes · 9 min read


NewBeeNLP

Dec 2, 2024 · Artificial Intelligence

What Are Today’s Unified Generation-and-Understanding Multimodal Model Architectures?

This article surveys current unified generation-and-understanding multimodal large-model architectures, compares LLM-centric and LLM-plus-diffusion designs, extracts common insights, details large-scale training tricks from models like Emu3, Chameleon and Janus, and outlines open research directions for visual encoders.

Large Language Models · Multimodal · diffusion

0 likes · 5 min read


ZhongAn Tech Team

Dec 1, 2024 · Artificial Intelligence

AI Weekly Digest Issue 4: Market Insights, Industry Solutions, and Emerging Technologies

The fourth AI weekly newsletter reviews recent industry news—including Jensen Huang's robot era vision and Tesla's Optimus plans—introduces Claude's new style‑customization feature, explores AI‑enhanced input methods, and evaluates DeepSeek's R1‑Lite model performance on complex reasoning tasks.

AI · AI applications · Claude

0 likes · 10 min read


AntTech

Nov 29, 2024 · Artificial Intelligence

AI Industry Trends in 2024: From Global Slowdown to Chinese Market Acceleration

In 2024, despite a global slowdown in generative AI hype, China's AI market accelerates with rapid application deployments, emerging industries like embodied intelligence and autonomous driving, and a maturing ecosystem that shifts AI from hype to tangible industrial impact.

Artificial Intelligence · China · Large Language Models

0 likes · 11 min read


AI Large Model Application Practice

Nov 29, 2024 · Artificial Intelligence

Understanding RAG: How Retrieval‑Augmented Generation Reduces Large‑Model Hallucinations

This article explains the hallucination problem of large language models, introduces Retrieval‑Augmented Generation (RAG) as a solution, compares RAG with model fine‑tuning, and outlines basic RAG architecture and workflow for practical applications.

Hallucination Mitigation · Large Language Models · RAG

0 likes · 10 min read


Ximalaya Technology Team

Nov 29, 2024 · Artificial Intelligence

Applying Large Language Models for AIGC Advertising: Content Generation, Multimodal Understanding, and Creative Optimization at Ximalaya

Ximalaya leverages large language models and AI‑generated content to automate ad creative production, multimodal semantic understanding, and creative selection, slashing image costs to 0.2 CNY, boosting CTR by up to 3.5 %, improving revenue and eCPM by over 2 %, and expanding material diversity fivefold.

AIGC · Large Language Models · creative optimization

0 likes · 21 min read


Alibaba Cloud Developer

Nov 28, 2024 · Artificial Intelligence

Mooncake: Open-Source KVCache-Centric Architecture Boosting Large-Model Inference

Mooncake, an open-source KVCache-centric inference architecture co-developed by Alibaba Cloud and Tsinghua University's MADSys lab, dramatically improves large-model throughput and reduces cost by decoupling resources, standardizing cache pooling, and integrating with frameworks like vLLM, sparking broad industry interest.

AI infrastructure · KVCache · Large Language Models

0 likes · 4 min read


Kuaishou Large Model

Nov 22, 2024 · Artificial Intelligence

Boost LLM Training on Massive Clusters with DP/TP Overlap and Context Parallelism

This article details a comprehensive set of techniques—including data‑ and tensor‑parallel overlap, context‑parallelism, activation rematerialization, and a performance‑driven cost model—that dramatically improve large‑language‑model training efficiency on ultra‑large GPU clusters while preserving model quality.

Large Language Models · Parallelism · Performance Modeling

0 likes · 28 min read


HyperAI Super Neural

Nov 20, 2024 · Artificial Intelligence

From Computer Vision to Medical AI: Prof. Xie's Work Hits Nature, NeurIPS, CVPR

Professor Xie's team at Shanghai Jiao Tong University reports rapid progress in AI for Science, detailing multimodal medical AI models, large open datasets, language and vision‑language models, and knowledge‑enhanced representations that outperform existing baselines across multiple benchmarks.

Knowledge Graphs · Large Language Models · Open Datasets

0 likes · 14 min read


DataFunSummit

Nov 18, 2024 · Artificial Intelligence

Intelligent Data Analysis: Agent Architecture Combined with Semantic Layer for Product Implementation

This article explores how large‑model technologies can address data analysis challenges by introducing an Agent‑based architecture integrated with a semantic layer, detailing design principles, optimization paths, technical implementation, real‑world retail case studies, product design considerations, and future directions for intelligent analytics.

AI · Agent Architecture · Data Analytics

0 likes · 22 min read


Alibaba Cloud Developer

Nov 18, 2024 · Artificial Intelligence

Solving Knowledge Challenges in Retrieval‑Augmented Generation: Practical Optimizations

This article shares a half‑year of hands‑on experience with Retrieval‑Augmented Generation, analyzing why simple RAG setups often feel unintelligent, identifying three core knowledge issues, and presenting concrete optimization strategies—including chunking, knowledge expansion, and tag‑based conflict resolution—to improve retrieval and generation performance in low‑resource environments.

AI · Large Language Models · RAG

0 likes · 25 min read


ZhongAn Tech Team

Nov 16, 2024 · Artificial Intelligence

Weekly AI Digest Issue 2: Video Generation, Large Models, AGI, and LoRA Fine‑Tuning

This weekly AI roundup discusses emerging video generation tools like PixelDance and Vidu 1.5, debates on scaling limits of large models, AGI geopolitical considerations, and a MIT study comparing LoRA with full fine‑tuning for domain adaptation.

AGI · AI · Large Language Models

0 likes · 8 min read

NewBeeNLP

Nov 14, 2024 · Artificial Intelligence

What’s Trending in Recommendation Systems at KDD 2024? A Comprehensive Paper Overview

The 30th SIGKDD conference in Barcelona received 2,046 research paper submissions with a 20% acceptance rate, and this article compiles the 59 recommendation‑system papers, covering large‑model recommenders, graph‑based methods, sequential models, fairness, privacy, advertising, debiasing, reinforcement learning and more, for researchers exploring the latest academic advances.

Fairness · KDD2024 · Large Language Models

0 likes · 15 min read

Tencent Docs Tech Team

Nov 13, 2024 · Artificial Intelligence

Technical Architecture and Practices of the AI Document Assistant

This article explores the challenges large language models bring to efficiency tools, outlines the AI document assistant's technical thinking and architecture, and details both application‑side and model‑side practices such as retrieval‑augmented generation, intent recognition, and code‑driven table handling, concluding with key lessons.

AI · AI Architecture · Document Automation

0 likes · 16 min read

JD Tech Talk

Nov 11, 2024 · Artificial Intelligence

Prompt Engineering: Concepts, Evolution, Techniques, and a Logistics Application Case

This article explains what Prompt Engineering is, traces its development from early command‑based interactions to modern adaptive and multimodal prompting, details various prompting techniques such as zero‑shot, few‑shot, Chain‑of‑Thought, hallucination‑reduction methods, and demonstrates their practical use in a JD Logistics SKU piece‑type classification case with code examples.

AI prompting · LLM applications · Large Language Models

0 likes · 26 min read

DataFunSummit

Nov 9, 2024 · Artificial Intelligence

GraphRAG: Using Graph Structures to Enhance Retrieval‑Augmented Generation – Challenges, Methods, and Product Deployments

This article introduces GraphRAG, explains the limitations of traditional RAG, outlines four major challenges (fine‑grained retrieval, global context, similarity vs relevance, and macro‑level reasoning), describes GraphRAG’s graph‑based retrieval strategies, showcases comparative experiments, and presents NebulaGraph’s GenAI Suite and RAG products along with future research directions.

AI · GraphRAG · Large Language Models

0 likes · 16 min read

Baobao Algorithm Notes

Nov 7, 2024 · Artificial Intelligence

Demystifying FlashAttention: A Minimalist Derivation of the Algorithm

This article presents a concise, step‑by‑step derivation of FlashAttention, explaining the prerequisite linear‑algebra concepts, the softmax simplifications, and the parallel computation workflow—including the LSE‑enhanced version—so readers can grasp the algorithm’s elegance without heavy mathematics.

Algorithm Derivation · FlashAttention · Large Language Models

0 likes · 8 min read

NewBeeNLP

Nov 7, 2024 · Artificial Intelligence

Tackling Large Model Hallucinations: Causes, Detection, and Mitigation Strategies

This article provides a comprehensive analysis of large language model hallucinations, detailing their definitions, classifications, root causes, detection techniques, and a wide range of mitigation approaches—including RAG pipelines, decoding strategies, and model‑enhancement methods—to improve reliability and safety in real‑world AI applications.

AI safety · Large Language Models · Prompt Engineering

0 likes · 22 min read

DataFunSummit

Nov 6, 2024 · Artificial Intelligence

Applying AIGC to Transform Insurance Marketing at Ant Group

This article explains how Ant Group’s insurance marketing team leverages Artificial Intelligence‑generated content (AIGC) to create personalized marketing materials, automate recommendation workflows, and produce video scripts, thereby improving efficiency, compliance, and user engagement in the insurance sector.

AIGC · Artificial Intelligence · Insurance Marketing

0 likes · 9 min read

Fighter's World

Nov 1, 2024 · Artificial Intelligence

How Fiercely Competitive Is the Large‑Model Landscape? Insights from the State of AI Report 2024

The State of AI Report 2024 reveals converging capabilities among open and closed LLMs, a shift toward inference compute, benchmark and data contamination challenges, rising synthetic‑data risks, booming robotics research, Nvidia's hardware dominance, and a mix of accurate and missed predictions for the coming year.

AI Industry · AI hardware · Large Language Models

0 likes · 15 min read

Infra Learning Club

Oct 31, 2024 · Industry Insights

Top AI Startups to Watch in 2024: 10 Leading and 6 Emerging Companies

The article surveys the most funded and influential AI startups of 2024, profiling ten large‑scale companies such as OpenAI, Anthropic, and Scale AI, and highlighting six promising newcomers, while detailing their products, CEOs, valuations, recent milestones, and industry impact.

2024 · AI Industry · AI startups

0 likes · 11 min read

Infra Learning Club

Oct 31, 2024 · Artificial Intelligence

What Is a Token in Large Language Models?

The article explains that a token is the unit processed by large language models, describes three common tokenizer methods—word‑level, character‑level, and sub‑word level—with English and Chinese examples, discusses their advantages and limitations, and shows how OpenAI’s tokenizer varies across model versions.

Large Language Models · NLP · Token

0 likes · 5 min read

AntTech

Oct 29, 2024 · Artificial Intelligence

Three Ant Group Papers Featured at EMNLP 2024: Dynamic Transformers, Plug‑and‑Play Visual Reasoner, and Efficient Fine‑Tuning of Large Language Models

This announcement introduces three Ant Group papers accepted at EMNLP 2024—Mixture‑of‑Modules for dynamic Transformer assembly, a plug‑and‑play visual reasoning framework built via data synthesis, and a layer‑wise importance‑aware efficient fine‑tuning method for large language models—highlighting their innovations and upcoming live presentations.

AI research · EMNLP 2024 · Large Language Models

0 likes · 6 min read

Alibaba Cloud Infrastructure

Oct 28, 2024 · Artificial Intelligence

How AI Is Redefining the Enterprise CIO Role – Insights from Alibaba Cloud’s CIO

In a detailed interview, Alibaba Cloud’s CIO Jiang Linquan discusses how rapid AI advancements—from large language models to multimodal and reasoning systems—are reshaping CIO responsibilities, accelerating enterprise information system intelligence, and driving new strategies for knowledge bases, customer service, and cross‑departmental adoption.

AI · CIO · Customer Service

0 likes · 14 min read

Fighter's World

Oct 26, 2024 · Artificial Intelligence

Key Considerations for Deploying Large Language Models in Cloud Services

The article reflects on Alibaba Cloud's large‑model deployments, outlines four service scenarios, examines three fundamental questions about foundation models, and offers a prioritized roadmap—including prompt engineering, RAG, and organizational changes—to effectively bring LLMs to production.

AI deployment · Alibaba Cloud · Cloud Services

0 likes · 8 min read

AntTech

Oct 15, 2024 · Artificial Intelligence

AI Large Model Technology Exploration and Application Forum (CNCC2024)

The AI Large Model Technology Exploration and Application Forum, held on October 24‑26, 2024 in Hengdian, Zhejiang, gathers leading experts from Ant Group, universities and research institutes to discuss challenges, knowledge enhancement, data infrastructure, diffusion models, multimodal and medical large models through a series of keynote talks and panel sessions.

AI · Large Language Models · conference

0 likes · 12 min read

Tencent Advertising Technology

Oct 14, 2024 · Artificial Intelligence

Generative Retrieval Based on Yuan Large Model: Implementation and Practice in Tencent Advertising

This paper presents the implementation and practice of generative retrieval based on the Yuan large model in Tencent Advertising, addressing three key challenges: capturing user intent, aligning the model with the advertising domain, and designing a high-performance platform under ROI constraints.

Generative Retrieval · High Performance Computing · Large Language Models

0 likes · 17 min read

360 Zhihui Cloud Developer

Oct 11, 2024 · Artificial Intelligence

How 360 Built a Thousand‑GPU AI Supercomputer with Kubernetes and Advanced Scheduling

This article details the design and implementation of 360’s AI Computing Center, covering server selection, network topology, Kubernetes scheduling, training and inference acceleration, and the AI platform’s core, visualization, and fault‑tolerance capabilities for large‑scale AI workloads.

AI infrastructure · GPU cluster · Kubernetes

0 likes · 22 min read

NewBeeNLP

Oct 11, 2024 · Artificial Intelligence

Inside Llama 3: Training, Architecture, and Performance Secrets

An extensive review of Meta’s Llama 3 model breaks down its pre‑training data pipeline, scaling laws, architectural tweaks like GQA and RoPE, post‑training methods such as SFT, DPO, and reward modeling, and evaluates benchmark results, offering practical insights for researchers and engineers building large language models.

Benchmarking · Large Language Models · Llama 3

0 likes · 32 min read

Baobao Algorithm Notes

Oct 10, 2024 · Artificial Intelligence

How MCTS Powers Inference in OpenAI’s o1: A Deep Dive with rStar

This article explains how the inference component of OpenAI’s o1 model can be implemented using Monte‑Carlo Tree Search, detailing the action space, rollout process, UCT scoring, and best‑path selection, with a concrete walkthrough of Microsoft’s open‑source rStar code.

Large Language Models · MCTS · OpenAI o1

0 likes · 26 min read

Architect

Oct 7, 2024 · Artificial Intelligence

Master Prompt Engineering: A Universal Framework for Building Effective LLM Prompts

This article presents a systematic, four‑part Prompt engineering framework—role definition, problem description, goal setting, and requirement specification—augmented with RAG, few‑shot examples, memory handling, and model‑parameter tuning, enabling developers to craft high‑quality prompts for large language models across diverse tasks.

Large Language Models · Model Parameters · Prompt Engineering

0 likes · 28 min read

DataFunSummit

Oct 2, 2024 · Artificial Intelligence

NVIDIA’s Solutions for Large Language Models: NeMo Framework, TensorRT‑LLM, and Retrieval‑Augmented Generation

This article explains NVIDIA’s end‑to‑end stack for large language models, covering the NeMo Framework for data processing, training, and deployment, the open‑source TensorRT‑LLM inference accelerator, and the Retrieval‑Augmented Generation (RAG) technique that enriches model outputs with external knowledge.

Large Language Models · NVIDIA · NeMo

0 likes · 17 min read

Architect

Sep 28, 2024 · Artificial Intelligence

How Does OpenAI’s o1 Model Leverage Self‑Play RL and New Scaling Laws?

The article provides an in‑depth technical analysis of OpenAI’s multimodal o1 model, explaining its self‑play reinforcement‑learning pipeline, the novel train‑time and test‑time compute scaling laws, its long‑think reasoning abilities demonstrated through a cipher example, and speculative architectures for generator‑verifier systems.

Large Language Models · OpenAI · RL scaling

0 likes · 35 min read

Tencent Cloud Developer

Sep 27, 2024 · Artificial Intelligence

A Comprehensive Prompt Engineering Framework: Universal Templates, RAG, Few‑Shot, Memory, and Automated Optimization

The article presents a universal four‑part prompt template—role, problem description, goal, and requirements—augmented with role definitions, RAG‑based knowledge retrieval, few‑shot examples, memory handling, temperature/top‑p tuning, and automated optimization techniques such as APE, APO, and OPRO, enabling developers to reliably craft high‑quality prompts for LLMs.

AI Prompt Optimization · Large Language Models · Prompt Engineering

0 likes · 26 min read

AntData

Sep 26, 2024 · Artificial Intelligence

DB-GPT: Open-Source AI-Native Data Application Development Framework

DB‑GPT is an open‑source AI‑native data‑application framework that provides multi‑model management, Text‑to‑SQL optimization, RAG, multi‑agent collaboration, and intelligent workflow orchestration, enabling developers to build scalable large‑model database applications, with proven enterprise adoption, community growth, and academic publications.

AI · Data Engineering · Large Language Models

0 likes · 6 min read

DataFunTalk

Sep 23, 2024 · Artificial Intelligence

Comprehensive Guide to Selecting, Adapting, and Deploying Large Language Models for Enterprise Applications

This article provides an in‑depth, step‑by‑step guide on how enterprises can choose between open‑source and closed‑source large language models, adapt them through incremental pre‑training, instruction fine‑tuning, and reinforcement learning, and finally deploy them across front‑office, middle‑office, and back‑office scenarios to drive digital transformation.

Enterprise AI · Large Language Models · RLHF

0 likes · 28 min read

Refining Core Development Skills

Sep 21, 2024 · Artificial Intelligence

Using GLM-4-Plus Large Model API: Features, Code Samples, and Practical Application Scenarios

This article introduces the rapid rise of large language models, highlights the advantages of the GLM-4-Plus model—including superior language understanding, long‑text handling, and enhanced reasoning—explains how to obtain API credentials, demonstrates request parameters and curl examples, and showcases diverse real‑world use cases such as code generation, social‑media copy, travel planning, and interview question creation.

API Usage · GLM-4-Plus · Large Language Models

0 likes · 17 min read

Kuaishou Tech

Sep 20, 2024 · Artificial Intelligence

Building an LLM-Based Agent Platform for Enterprise Commercialization: Strategies, Architecture, and Practical Insights

This article details the strategic development and technical architecture of SalesCopilot, an LLM-driven agent platform designed for enterprise commercialization, highlighting the implementation of RAG and agent technologies, addressing practical challenges, and sharing key insights for building scalable AI applications.

AI agents · AI evaluation · Enterprise AI

0 likes · 15 min read

NewBeeNLP

Sep 20, 2024 · Industry Insights

Why Large Language Models Still Matter: Insights into Industry Trends and Research Directions

The author reflects on the shifting mental state and market sentiment of 2024, noting the waning hype for AI applications, the importance of research over sheer scale, and the evolving role of multimodal LLMs and scientific scaling laws in shaping the future of AI.

AI funding · AI industry trends · Large Language Models

0 likes · 7 min read

Baidu Geek Talk

Sep 18, 2024 · Industry Insights

How Baidu’s Large‑Model ‘Yuanji’ Is Transforming Traffic Policing in China

The article examines Baidu Cloud’s large‑model‑powered digital police assistant “Yuanji,” detailing its deployment in Shijiazhuang, its voice‑enabled 24/7 Q&A capabilities, performance metrics, broader city rollouts, and the strategic AI partnership reshaping smart traffic management.

Artificial Intelligence · Baidu Cloud · Digital Police

0 likes · 9 min read

Huawei Cloud Developer Alliance

Sep 18, 2024 · Artificial Intelligence

How Distributed Training Powers Massive Language Models: Concepts, Strategies, and Code

This article explains why single‑machine resources are insufficient for training ever‑larger language models, introduces the fundamentals of distributed training systems, details various parallel strategies such as data, model, pipeline, and hybrid parallelism, and provides practical PyTorch code and memory‑optimization techniques to accelerate large‑scale model training.

GPU · Large Language Models · Parallelism

0 likes · 29 min read

DataFunSummit

Sep 15, 2024 · Artificial Intelligence

AgentUniverse: A Multi‑Agent Framework for Financial Scenarios

This article presents Ant Group's agentUniverse framework, detailing its multi‑agent collaborative mechanisms, architectural design, and real‑world financial applications such as AI assistants, ESG analysis, and automated report generation, while addressing challenges of information‑dense, knowledge‑rich, and decision‑critical finance domains.

AI Framework · Financial AI · Large Language Models

0 likes · 12 min read

Meituan Technology Team

Sep 12, 2024 · Artificial Intelligence

How BlackPearl Dominated All Three KDD 2024 OAG‑Challenge Tracks with Large‑Model Techniques

The BlackPearl team leveraged large‑model strategies—including iterative self‑refinement, train‑time difficulty increase, test‑time augmentation, grafting‑learning, and boosting—to dominate the WhoIsWho‑IND, PST, and AQA tracks of the KDD 2024 OAG‑Challenge Cup, surpassing traditional feature‑engineered, GNN, and BERT baselines.

AQA · Academic Graph Mining · KDD 2024

0 likes · 21 min read

Baidu Geek Talk

Sep 11, 2024 · Databases

Why Vector Databases Are the Next Big Thing in AI: A Deep Dive into RAG and Baidu’s VectorDB

This article examines the 70‑year evolution of databases, explains how large‑model AI drives the rise of vector databases and Retrieval‑Augmented Generation (RAG), outlines the four‑stage RAG workflow, compares Baidu’s self‑built VectorDB with open‑source alternatives, and showcases real‑world deployments that highlight performance, scalability, and enterprise benefits.

AI · Database Architecture · Industry Insights

0 likes · 16 min read

DataFunSummit

Sep 5, 2024 · Artificial Intelligence

NVIDIA’s End‑to‑End Solutions for Large Language Models: NeMo Framework, TensorRT‑LLM, and Retrieval‑Augmented Generation

This article introduces NVIDIA’s comprehensive solutions for large language models, covering the NeMo Framework’s full‑stack development pipeline, the open‑source TensorRT‑LLM inference accelerator, and Retrieval‑Augmented Generation techniques, while detailing data preprocessing, distributed training, model fine‑tuning, deployment, and performance optimizations.

Large Language Models · NVIDIA · NeMo Framework

0 likes · 16 min read

Baidu Geek Talk

Sep 2, 2024 · Industry Insights

How an R&D Data Platform Leverages Large Language Models to Accelerate Issue Diagnosis

The article explains how the R&D data middle platform integrates large language models to automate data collection, real‑time monitoring, intelligent analysis, and rapid root‑cause identification for online issues, detailing the architecture, wide‑table modeling, generative BI, attribution algorithms, RAG enhancements, and future optimization plans.

Data Platform · Large Language Models · Retrieval-Augmented Generation

0 likes · 37 min read

DataFunTalk

Sep 2, 2024 · Artificial Intelligence

Exploring Graph Foundation Models: Concepts, Techniques, and Future Directions

This article introduces graph foundation models, explains their relationship with large language models, reviews recent advances in graph neural networks and representation learning, presents the authors' own research on PT‑HGNN, Specformer and GraphTranslator, and discusses challenges, future research directions, and a Q&A session.

Large Language Models · Machine Learning · foundation models

0 likes · 23 min read

NewBeeNLP

Sep 2, 2024 · Artificial Intelligence

Boosting Large Language Model Math Reasoning: Mixed Instructions, Synthetic Data, and Training Optimizations

This article presents a comprehensive technical walkthrough on enhancing large language model mathematical reasoning by reviewing model architectures, introducing mixed CoT‑PoT instructions, generating and filtering synthetic data, and applying multi‑stage training optimizations such as RFT, PPO, and DPO, with detailed experimental results and Q&A insights.

AI · Large Language Models · Reward Model

0 likes · 17 min read

DataFunTalk

Sep 1, 2024 · Artificial Intelligence

Building Multi‑Scenario AI Assistants with Large Models at Huolala

Huolala, a logistics technology company, shares how it leverages large language models to create personal and office AI assistants across dozens of real‑world scenarios, detailing the underlying platform, prompt engineering, multimodal capabilities, multi‑agent coordination, and the resulting business empowerment.

AI assistants · Large Language Models · logistics AI

0 likes · 13 min read

Baobao Algorithm Notes

Aug 29, 2024 · Artificial Intelligence

Why RLHF Is Essential: The Limits of SFT and the Power of Reward Modeling

The article analyzes why Reinforcement Learning from Human Feedback (RLHF) cannot be replaced by Supervised Fine‑Tuning (SFT), highlighting SFT's lack of negative feedback, its one‑directional attention limitation, and how RLHF's reward models provide crucial safety and performance improvements for large language models.

AI alignment · Large Language Models · RLHF

0 likes · 9 min read

Continuous Delivery 2.0

Aug 29, 2024 · Artificial Intelligence

The Post‑AI Era: How Large Language Models Will Transform Software Development

Matt Welsh argues that despite fifty years of programming advances, humans remain poor at coding, and in the post‑AI era large language models will reshape software engineering, prompting a shift toward prompt engineering, new team roles, and a revival of fundamental engineering practices.

AI · Large Language Models · Prompt Engineering

0 likes · 6 min read

Efficient Ops

Aug 28, 2024 · Artificial Intelligence

How Large Language Models Are Revolutionizing Banking Regulatory Interpretation

This article explores how AI-powered large language models enable Chinese commercial banks to automate, accurately match, and predict regulatory requirements, detailing new use‑cases, a prompt‑engineering framework, and the resulting efficiency and risk‑reduction benefits for the financial sector.

AI · Large Language Models · Prompt Engineering

0 likes · 7 min read

AntTech

Aug 28, 2024 · Artificial Intelligence

Ant Group’s Selected Papers at KDD2024: Abstracts and Highlights

The article presents a curated collection of Ant Group's research papers accepted at KDD2024, summarizing each paper's title, type, link, source, relevant fields, and abstract, covering topics such as graph mining, large language models, fraud detection, recommendation systems, and multimodal medical AI.

AI research · Ant Group · KDD2024

0 likes · 31 min read

ByteDance Data Platform

Aug 27, 2024 · Artificial Intelligence

AI-Driven BI: Achieving Zero-Barrier Data Access and Smart Insights

This article traces the evolution of business intelligence platforms from early report‑centric tools to modern AI‑enhanced, search‑driven solutions, detailing the architectural layers, high‑performance data analysis design, multi‑level aggregation, hot‑cold data tiering, and large‑model applications that enable zero‑threshold data consumption and intelligent insights.

Artificial Intelligence · Data Analytics · High Performance Computing

0 likes · 18 min read

Baidu Geek Talk

Aug 26, 2024 · Artificial Intelligence

RLHF Performance Optimization: PPO Algorithm Acceleration Techniques

The article presents three RLHF‑PPO acceleration techniques (TRT‑LLM‑based text generation speedups, selective activation recomputation with sequence parallelism for dynamic memory reduction, and overlapping pipeline stages for system‑level parallelism) and reports a 350% throughput boost on a 10B model using 16 A100 GPUs.

GPU optimization · Large Language Models · PPO optimization

0 likes · 16 min read

Java High-Performance Architecture

Aug 25, 2024 · Artificial Intelligence

Can AI Ace the Gaokao Math Test? Surprising Results from Six Top LLMs

A recent evaluation had six leading large‑language‑model products (GPT‑4o, GLM‑4, Wenxin 4.0, Doubao, Baichuan 4, and Qwen‑2.5) answer the first 14 objective questions of the new Gaokao mathematics I paper, revealing that only GLM‑4 surpassed the 60% passing threshold while the others performed far below expectations.

AI · GLM-4 · Gaokao

0 likes · 7 min read

21CTO

Aug 24, 2024 · Artificial Intelligence

How GenAI Is Revolutionizing Legacy Code Reverse‑Engineering and Developer Productivity

This article explores how generative AI tools like ChatGPT, GitHub Copilot, and Amazon Q are enabling developers to rapidly reverse‑engineer, document, and modernize legacy codebases, reduce technical debt, and transform software development practices worldwide.

AI tools · GenAI · Large Language Models

0 likes · 12 min read

DataFunTalk

Aug 24, 2024 · Artificial Intelligence

Improving the Mathematical Reasoning Ability of Large Language Models: Overview, Mixed Instructions, Synthetic Data, and Training Optimization

This article presents a comprehensive approach to enhancing large language models' mathematical reasoning by reviewing model architectures, introducing mixed CoT‑PoT instructions, generating and filtering synthetic data, and applying multi‑stage training optimizations such as RFT, PPO, and DPO, with detailed experimental results and Q&A.

AI · Large Language Models · Reward Model

0 likes · 16 min read

DataFunSummit

Aug 23, 2024 · Artificial Intelligence

Applying Large Language Models to Automotive Industrialization: Practices and Experiences

This presentation outlines the development of ChatGPT, the underlying principles of large language models, and how they empower new industrialization in automotive manufacturing, detailing practical implementations, agent architectures, data and model closed loops, and case studies such as intelligent inspection and G8D agents.

Agent Architecture · Automotive · ChatGPT

0 likes · 13 min read

Alibaba Cloud Developer

Aug 23, 2024 · Artificial Intelligence

Mastering Prompt Engineering: Advanced Techniques from Top AI Labs

This comprehensive guide examines cutting‑edge prompt‑engineering strategies—covering clear instruction design, role‑playing, separators, step‑by‑step workflows, external tools, systematic testing, and case studies from Anthropic, Google, and practical Img2Code applications—to help developers achieve more accurate and powerful interactions with large language models.

AI development · Best Practices · Large Language Models

0 likes · 21 min read
