Tag

sampling

Model Perspective
Mar 20, 2025 · Big Data

How to Sample Effectively in the Big Data Era: Methods and Best Practices

This article explores essential sampling strategies for big‑data environments—including simple random, reservoir, stratified, oversampling, undersampling, and weighted sampling—detailing their principles, algorithmic steps, advantages, drawbacks, and suitable application scenarios to help analysts choose the right method.

big data · data analysis · oversampling
8 min read

JD Tech Talk
Dec 27, 2024 · Backend Development

Log Sampling and Cross‑Thread Propagation in High‑Throughput Java Services

The article examines the performance impact of excessive logging in large‑scale Java systems and proposes request‑level sampling with cross‑thread identifier propagation, offering practical component‑based solutions, implementation considerations, and a concrete code example for backend developers.

Java · backend · logging
7 min read

DataFunTalk
Jul 22, 2024 · Fundamentals

A/B Testing and Causal Inference: Evolution of Sampling, Metric Evaluation, and Statistical Inference

The article reviews the development of online A/B testing, covering sampling and traffic‑splitting techniques, metric computation improvements, statistical inference advances, and current challenges such as interference, real‑time inference, and large‑scale metric computation, while referencing recent research papers.

A/B testing · Metric Evaluation · causal inference
10 min read

JD Tech Talk
Jul 9, 2024 · Artificial Intelligence

Getting Started with AI Image Generation Using Stable Diffusion for Promotional Posters

This guide introduces the fundamentals of AI image generation with Stable Diffusion, covering three main usage methods, the Draw Things desktop app, model types, samplers, prompts, and post‑processing techniques to create high‑quality promotional graphics for events like the 618 sale.

AI art · DrawThings · Image Generation
11 min read

DaTaobao Tech
May 27, 2024 · Artificial Intelligence

Sampling Strategies for Large Language Models: Greedy, Beam, Top‑K, Top‑p, and Temperature

The article explains how greedy search, beam search, Top‑K, Top‑p (nucleus) sampling, and temperature each shape large language model generation, comparing their effects on repetition, diversity, and creativity, and provides concise TensorFlow‑based code examples illustrating these inference‑time strategies.

AI · Generation · LLM
15 min read

FunTester
Sep 1, 2023 · Operations

Observability in the Cloud‑Native Era: Data Collection Strategies and Sampling Techniques

The article explains how cloud‑native observability systems gather massive telemetry from infrastructure, containers, middleware and services, compares direct push and file‑based collection approaches, and details head, tail and local sampling methods to optimize data completeness and performance.

Distributed Tracing · Observability · Performance Optimization
10 min read

DevOps Cloud Academy
Aug 29, 2023 · Cloud Native

Observability and Data Collection Strategies in Cloud‑Native Environments

The article explains that while observability is not new, cloud‑native systems have driven rapid development of observability platforms, detailing data collection architectures, direct push versus file‑based approaches, and various sampling techniques (head, tail, and local sampling) to balance completeness, real‑time reporting, and performance impact.

Microservices · Observability · cloud native
11 min read

Baidu Geek Talk
Aug 21, 2023 · Artificial Intelligence

Decoding Strategies for Generative Models: Top‑k, Top‑p, Contrastive Search, Beam Search, and Sampling

The article explains how generative models use deterministic methods like greedy and beam search and stochastic techniques such as top‑k, top‑p, contrastive search and sampling, describing their mechanisms, temperature control, repetition penalties, and practical trade‑offs for balancing fluency, diversity and coherence.

AI · beam search · contrastive search
9 min read

vivo Internet Technology
Aug 2, 2023 · Game Development

Pre‑Experiment User Stratification Model for Improving AB Test Uniformity in Vivo Game Center

The paper introduces a pre‑experiment user stratification model that uses covariate‑balancing algorithms to create separate strata for distribution and revenue metrics, ensuring equal user allocation in Vivo game‑center AB tests, which reduces metric variance, improves gray‑release effectiveness, and saves significant investigation effort.

AB testing · data analysis · experiment design
14 min read

DataFunSummit
Jul 23, 2023 · Big Data

Optimizing OLAP Performance with ADM, Cube Pre‑aggregation and Sampling at Ant Group

This article explains how Ant Group tackles the performance challenges of large‑scale OLAP tables by using ADM to reduce data volume, employing Cube pre‑aggregation for reporting, and applying statistical sampling for exploratory analysis, detailing the processes, metrics, and architectural designs involved.

ADM · Cube · OLAP
15 min read

Architect
Jul 1, 2023 · Artificial Intelligence

Comprehensive Guide to Text Generation Decoding Strategies with HuggingFace Transformers

This tutorial explores various text generation decoding methods—including greedy search, beam search, top‑k/top‑p sampling, sample‑and‑rank, and group beam search—explaining their principles, providing detailed Python code examples, and comparing their use in modern large language models.

HuggingFace · beam search · decoding strategies
59 min read

Tencent Cloud Developer
Jun 1, 2023 · Artificial Intelligence

A Comprehensive Guide to Decoding Strategies for Text Generation with HuggingFace Transformers

This guide thoroughly explains the major decoding strategies for neural text generation in HuggingFace Transformers—including greedy, beam, diverse beam, sampling, top‑k, top‑p, sample‑and‑rank, beam sampling, and group beam search—detailing their principles, Python implementations with LogitsProcessor components, workflow diagrams, comparative analysis, and references to original research.

HuggingFace · Natural Language Processing · beam search
60 min read

DataFunTalk
Jan 8, 2023 · Big Data

ByteDance Event‑Tracking Data Cost Governance Practices

This article describes ByteDance's comprehensive approach to managing the massive volume of event‑tracking data, detailing the background, cost‑reduction strategies, experience review, future plans, and a Q&A session that together illustrate how systematic data governance can dramatically cut storage and processing expenses.

ByteDance · big data · data cost optimization
18 min read

ByteDance SYS Tech
Jan 6, 2023 · Fundamentals

How ByteDance Scaled Profile‑Guided Optimization to Boost CPU Efficiency

This article explains ByteDance's large‑scale adoption of profile‑guided optimization (PGO), covering its principles, instrumentation and sampling methods, the automated platform built for data collection and compilation, and the resulting performance gains across dozens of critical services.

ByteDance · Instrumentation · PGO
12 min read

Model Perspective
Nov 9, 2022 · Fundamentals

Understanding Markov Chains: From Basics to Convergence and Sampling

This article explains the fundamentals of Markov chains, illustrates their transition matrix with a market example, demonstrates convergence through Python code, and outlines how to use the stationary distribution for sampling in Monte Carlo simulations.

Markov Chain · Stochastic Process · convergence
9 min read

Model Perspective
Oct 4, 2022 · Artificial Intelligence

How Metropolis-Hastings Improves MCMC Sampling Efficiency

This article explains the detailed‑balance condition for Markov chains, shows why finding a transition matrix for a given stationary distribution is hard, and demonstrates how Metropolis‑Hastings modifies MCMC to achieve higher acceptance rates with a concrete Python example.

MCMC · Markov Chain · Metropolis-Hastings
9 min read

Model Perspective
Oct 2, 2022 · Fundamentals

Why Do Markov Chains Always Converge? A Hands‑On Exploration

This article explains the basic definition of Markov chains, illustrates a stock‑market example with transition matrices, demonstrates convergence through Python simulations, and shows how the steady‑state distribution enables sampling for Monte Carlo methods.

Markov Chain · Python · convergence
10 min read

Model Perspective
Sep 28, 2022 · Artificial Intelligence

How Monte Carlo Sampling Powers AI: From Basics to Acceptance-Rejection

This article introduces Monte Carlo methods, explains how random sampling approximates integrals, discusses uniform and non‑uniform probability distributions, and details acceptance‑rejection sampling as a technique for generating samples from complex distributions, laying the groundwork for understanding Markov Chain Monte Carlo in AI.

Acceptance-Rejection · Artificial Intelligence · MCMC
8 min read

Model Perspective
Sep 23, 2022 · Fundamentals

Mastering Monte Carlo: From Acceptance-Rejection to Gibbs Sampling in Python

This article explains the motivations behind Monte Carlo methods, introduces acceptance-rejection sampling, details Markov Chain Monte Carlo concepts, and walks through Metropolis-Hastings and Gibbs sampling algorithms with Python implementations, highlighting their use in high‑dimensional probability distribution sampling.

Algorithms · MCMC · Python
18 min read

Model Perspective
Sep 21, 2022 · Fundamentals

Unlocking Monte Carlo Sampling: From Basics to Acceptance‑Rejection in AI

Monte Carlo methods, originally a gambling-inspired random simulation technique, provide a versatile way to approximate integrals and sums, and by using acceptance‑rejection sampling they enable drawing samples from complex probability distributions, a key step toward effective Markov Chain Monte Carlo algorithms in machine learning and AI.

Acceptance-Rejection · MCMC · Probability Distribution
7 min read