Tag

model optimization

20 articles collected around this technical thread.

Kuaishou Tech
Jun 4, 2025 · Artificial Intelligence

KwaiCoder-AutoThink-preview: An Automatic‑Thinking Large Model Enhanced with Step‑SRPO Reinforcement Learning

The KwaiPilot team released the KwaiCoder‑AutoThink‑preview model, which introduces a novel automatic‑thinking training paradigm and a process‑supervised reinforcement‑learning method called Step‑SRPO, enabling the model to dynamically switch between thinking and non‑thinking modes, reduce inference cost, and achieve up to 20‑point gains on code and math benchmarks while handling large‑scale codebases.

AI research · automatic thinking · code generation
12 min read
JD Tech
May 26, 2025 · Artificial Intelligence

Solving Technical Challenges at JD Retail: Multi‑Reward Models, LLM‑Based Query Expansion, Model Pruning, and Reinforcement Learning

This article details how JD Retail's young algorithm engineers tackled a series of AI engineering problems—including advertising image quality assessment with multi‑reward models, large‑language‑model‑driven query expansion, FFT‑and‑RDP‑based model pruning, and agent‑centric reinforcement learning—while sharing practical growth insights and code snippets.

AI · Computer Vision · large language models
15 min read
ZhongAn Tech Team
Apr 28, 2025 · Artificial Intelligence

Weekly Tech Overview: Major AI Model Updates, Industry Funding, and Expert Perspectives on AI Agents and Consciousness

This weekly technology digest highlights significant advancements in artificial intelligence, including OpenAI's GPT-4o upgrades, Tencent's Hunyuan 3D v2.5 release, and major funding rounds for xAI and Manus, alongside expert discussions on the future evolution of AI agent networks and the theoretical possibility of machine consciousness.

AI agents · AI funding · Artificial Intelligence
7 min read
JD Tech
Mar 19, 2025 · Artificial Intelligence

JD Retail's End‑to‑End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Real‑World Applications

This article details JD Retail's AI engine that seamlessly supports both GPU and domestic NPU hardware, describing its heterogeneous cluster architecture, unified training and inference APIs, performance optimizations, extensive model coverage, and multiple production use cases across e‑commerce, logistics, and intelligent assistance.

AI Engine · GPU · JD Retail
20 min read
JD Retail Technology
Mar 4, 2025 · Artificial Intelligence

JD Retail End-to-End AI Engine Compatible with GPU and Domestic NPU: Architecture, Optimization, and Applications

JD Retail’s Nine‑Number Algorithm Platform delivers an end‑to‑end AI engine that unifies GPU and domestic NPU resources across a thousand‑card cluster, offering zero‑cost model migration, optimized training and inference pipelines, support for over 40 LLM and multimodal models, and proven business‑level performance that reduces dependence on overseas chips.

AI · GPU · Inference
19 min read
DaTaobao Tech
Feb 21, 2025 · Artificial Intelligence

AI-Powered Face Swapping for the Spring Festival Gala: System Design and Deployment

This article details the design and deployment of an AI‑driven face‑swap platform for the 2025 CCTV Spring Festival Gala, featuring a dual‑model SDXL pipeline with ControlNet and LoRA fine‑tuning, plus optimized preprocessing and GPU‑specific acceleration that achieve sub‑3‑second latency at over 10,000 QPS. With scaling, throttling, and multi‑region load balancing, the platform ultimately served ten million users and generated hundreds of millions of personalized gala images.

AI Engineering · AIGC · Spring Festival Gala
28 min read
Tencent Technical Engineering
Feb 14, 2025 · Artificial Intelligence

Technical Overview of DeepSeek Series Models and Innovations

The DeepSeek series introduces a refined Mixture‑of‑Experts architecture with fine‑grained expert partitioning, shared experts, and learnable load‑balancing, alongside innovations such as Group Relative Policy Optimization, Multi‑Head Latent Attention, Multi‑Token Prediction, mixed‑precision FP8 training, and the R1/R1‑Zero models that use Long‑CoT reasoning, reinforcement‑learning pipelines, and distillation to achieve OpenAI‑comparable performance at lower cost.

AI · DeepSeek · Mixture of Experts
25 min read
Tencent Cloud Developer
Feb 6, 2025 · Artificial Intelligence

DeepSeek V Series: Technical Overview of Scaling Laws, Grouped Query Attention, and Mixture‑of‑Experts

The article reviews DeepSeek’s V‑series papers, explaining how scaling‑law insights, Grouped Query Attention, a depth‑first design, loss‑free load balancing, multi‑token prediction and Multi‑Head Latent Attention together enable economical mixture‑of‑experts LLMs that rival closed‑source models while cutting compute and hardware costs.

DeepSeek · Grouped Query Attention · Mixture of Experts
13 min read
DevOps
Jan 25, 2025 · Artificial Intelligence

DeepSeek R1: An Open‑Source Large Model Matching OpenAI’s o1 at a Fraction of the Cost

DeepSeek’s newly released R1 model delivers performance comparable to OpenAI’s o1 while cutting inference costs by 90‑95%, leveraging innovative MLA and MoE architectures, low‑cost hardware training, an open‑source strategy, and a youthful, flat‑structured team that challenges the AI industry’s high‑spending model.

AI Startup · Artificial Intelligence · Cost‑Efficient Training
12 min read
DataFunSummit
Jan 6, 2025 · Artificial Intelligence

Efficient Large‑Model Training with LLaMA‑Factory: Overview, Techniques, and Applications

This article explains how to train large language models efficiently using LLaMA‑Factory, covering low‑resource training challenges, memory‑saving optimizations for parameters, gradients and activations, framework features, quick‑start guidance, performance tuning, real‑world case studies, and a detailed Q&A.

AI · DeepSpeed · LLaMA-Factory
10 min read
DevOps
Dec 10, 2024 · Artificial Intelligence

Key Generative AI Trends to Watch in 2024

The article outlines the major 2024 generative AI trends—including realistic expectations, multimodal models, smaller open‑source LLMs, GPU shortages, easier model optimization, custom local pipelines, stronger virtual agents, regulatory and ethical challenges, and the rise of shadow AI—while explaining their technical and business implications.

AI Trends · AI governance · generative AI
17 min read
DevOps
Dec 8, 2024 · Artificial Intelligence

Understanding Fine-Tuning in Machine Learning: Concepts, Importance, Steps, and Applications

This article explains fine‑tuning in machine learning, covering its definition, why it matters, the role of pre‑trained models, detailed step‑by‑step procedures, advantages, and diverse applications such as NLP, computer vision, speech and finance, with practical examples like face recognition and object detection.

AI applications · Fine-tuning · Pretraining
16 min read
Model Perspective
Dec 5, 2024 · Artificial Intelligence

Choosing the Right Activation Function: Pros, Cons, and Best Practices

Activation functions are crucial for neural networks, providing non‑linearity, normalization, and gradient flow; this article reviews common functions such as Sigmoid, Tanh, ReLU, Leaky ReLU, ELU, Noisy ReLU, Softmax, and Swish, comparing their characteristics, advantages, and drawbacks, and offering guidance for selecting the appropriate one.

activation functions · deep learning · machine learning
10 min read
Zhuanzhuan Tech
Oct 24, 2024 · Artificial Intelligence

Pre‑Ranking in Recommendation Systems: Model and Sample Optimization Practices at Zhuanzhuan Home Page

This article reviews the role of pre‑ranking in multi‑stage recommendation pipelines, compares dual‑tower and fully‑connected DNN models, discusses negative and positive sample selection strategies, and presents Zhuanzhuan's practical improvements in model architecture and traffic‑pool allocation to boost precision and diversity.

dual‑tower · model optimization · pre‑ranking
16 min read
Tencent Advertising Technology
Oct 14, 2024 · Artificial Intelligence

Generative Retrieval Based on Yuan Large Model: Implementation and Practice in Tencent Advertising

This article presents the implementation and practice of generative retrieval based on the Yuan large model in Tencent Advertising, addressing three key challenges: capturing user intent, aligning the model with the advertising domain, and designing a high‑performance platform under ROI constraints.

Advertising Technology · High Performance Computing · Recommendation systems
17 min read
DataFunSummit
Oct 3, 2024 · Artificial Intelligence

A Survey of Multimodal Recommendation Systems: From Background to Future Directions

This article reviews the latest academic advances in multimodal recommendation systems, covering background, system workflow, modal encoders, feature interaction (connection, fusion, filtering), feature enhancement, model optimization, and future research challenges.

AI · feature enhancement · feature interaction
18 min read
iQIYI Technical Product Team
Jul 26, 2024 · Artificial Intelligence

Optimizing Advertising Feature Evaluation Process with the Opal Machine Learning Platform

By migrating iQIYI’s advertising feature‑evaluation workflow to the Opal machine‑learning platform, the team replaced a manual, engineer‑heavy process with a unified, automated pipeline that cut evaluation cycles from five days to 1.5 days, tripling iteration speed while lowering barriers and improving consistency for future feature optimization.

Big Data · Feature Evaluation · Opal Platform
6 min read
Kuaishou Tech
Jul 17, 2024 · Artificial Intelligence

Key Technical Innovations in Kuaishou’s “Kuaiyi” Large Model and Its Real-World Applications

The article details Kuaishou’s development of the 175B “Kuaiyi” multimodal large model, presenting eight novel technical innovations—from Temporal Scaling Law and MiLe Loss to MoE‑enhanced reward modeling—and describes how these advances enable high‑performance AI services such as the AI Xiao Kuai chatbot across diverse real‑world scenarios.

AI applications · large language model · model optimization
12 min read
360 Smart Cloud
Jul 4, 2024 · Artificial Intelligence

Optimizing Mixture-of-Experts (MoE) Training with the QLM Framework

This article introduces the background and challenges of large language model training, explains the Mixture-of-Experts (MoE) architecture, and details several optimization techniques implemented in the QLM framework, including fine-grained and shared experts, top‑k gating, token distribution, expert parallelism, and grouped GEMM, to improve training efficiency and performance.

AI · Mixture of Experts · QLM
10 min read
Bilibili Tech
Jun 14, 2024 · Artificial Intelligence

Technical Report on the Index-1.9B Series: Model Variants, Pre‑training Optimizations, and Alignment Experiments

The report presents the open‑source Index‑1.9B family—base, pure, chat, and character variants—detailing benchmark results, pre‑training optimizations such as a normalized LM‑Head and deeper‑slim architectures, the importance of modest instruction data, alignment via SFT/DPO, role‑play enhancements with RAG, and acknowledges remaining safety and factual limitations.

Instruction Tuning · LLM · Pretraining
15 min read