Tagged articles

8 articles


Jan 18, 2026 · Artificial Intelligence

OptScale: Probabilistic Optimal Stopping for Inference‑Time Scaling

OptScale introduces a probabilistic framework that determines the optimal number of inference samples needed to meet a target accuracy with a confidence guarantee, dramatically reducing token usage while maintaining or improving performance across various large‑language‑model benchmarks.

Inference Scaling · Optimal Stopping · probabilistic modeling

12 min read


Bighead's Algorithm Notes

Dec 4, 2025 · Artificial Intelligence

Paper Review: RETuning Boosts Large‑Model Stock Trend Prediction Reasoning

This article analyzes the RETuning framework, which addresses LLMs' bias toward analyst opinions and lack of evidence weighting in stock movement prediction through a two‑stage pipeline of cold‑start fine‑tuning and reinforcement learning. Evaluated on the large Fin‑2024 dataset, it demonstrates significant F1 gains, inference‑time scaling, and out‑of‑distribution robustness.

Fin-2024 · GRPO · Inference Scaling

12 min read


AIWalker

Mar 15, 2025 · Artificial Intelligence

How SANA 1.5 Lets Small Models Reach New Text‑to‑Image SOTA

SANA 1.5 introduces an efficient model‑growth pipeline, depth‑pruning, and inference‑time scaling that reuse a 1.6 B‑parameter foundation to train a 4.8 B model with 8× lower memory, 60 % less training time, and GenEval scores that rival or surpass much larger diffusion models.

Inference Scaling · Model Scaling · diffusion

17 min read


Architect

Mar 3, 2025 · Artificial Intelligence

Unlocking Reasoning LLMs: Methods, DeepSeek R1 Insights, and Cost‑Effective Strategies

This article examines how to build and improve reasoning‑capable large language models, explains the definition and use cases of reasoning models, details DeepSeek‑R1’s training pipeline, compares four key enhancement methods (inference‑time scaling, pure RL, SFT + RL, and distillation), and offers budget‑friendly advice.

AI research · DeepSeek · Inference Scaling

27 min read


DataFunTalk

Feb 16, 2025 · Artificial Intelligence

Understanding Reasoning LLMs: DeepSeek R1 Variants, Inference‑Time Scaling, and Training Strategies

This article explains what reasoning language models are, outlines their strengths and weaknesses, and details DeepSeek R1's three variants and their training pipelines, including pure reinforcement learning, SFT + RL, and distillation. It also discusses inference‑time scaling techniques and related research such as Sky‑T1 and TinyZero.

DeepSeek · Inference Scaling · model distillation

16 min read


Baobao Algorithm Notes

Feb 13, 2025 · Artificial Intelligence

How to Build and Improve Reasoning LLMs: Methods, Trade‑offs, and DeepSeek Insights

This article explains what reasoning language models are, when they are needed, and reviews four main techniques (inference‑time scaling, pure reinforcement learning, combined SFT + RL, and distillation), illustrated with DeepSeek‑R1’s development, cost analysis, and low‑budget alternatives.

AI research · DeepSeek · Inference Scaling

27 min read


AI2ML AI to Machine Learning

Feb 8, 2025 · Artificial Intelligence

Analyzing DeepSeek R1 Inference Projects: Source Code, Cold‑Start, and Scaling Techniques

This article examines DeepSeek R1’s three breakthroughs, its low‑cost optimizations that bypass CUDA, and the resulting impact on the AI ecosystem, then provides a detailed technical review of seven open‑source reproductions, including Open‑R1, Tiny‑Zero, SimpleScaling‑S1, and simpleRL‑reason, covering their architectures, reinforcement‑learning pipelines, and code implementations.

DeepSeek · Inference Scaling · Open Source

10 min read


Baobao Algorithm Notes

Oct 29, 2024 · Artificial Intelligence

Reproducing OpenAI o1: Steiner Model’s Reasoning, Training, and Evaluation

This report details the design, data synthesis, three‑stage training pipeline, and benchmark evaluation of the open‑source Steiner reasoning model, which aims to emulate OpenAI o1’s inference‑time scaling while highlighting current performance gaps and future research challenges.

Inference Scaling · LLM · benchmark evaluation

14 min read
