Tagged articles

41 articles

Dec 24, 2025 · Backend Development

Mastering OpenTelemetry: From Setup to Advanced Sampling and Production‑Ready Practices

This guide walks through the fundamentals of OpenTelemetry, covering component architecture, environment setup, and SDK and Collector configuration for Java, Go, and Kubernetes, then dives into common pitfalls, performance tuning, security hardening, high‑availability deployment, and advanced tail‑based sampling strategies.

Collector · Kubernetes · Observability

0 likes · 27 min read

Model Perspective

Mar 20, 2025 · Big Data

How to Sample Effectively in the Big Data Era: Methods and Best Practices

This article explores essential sampling strategies for big‑data environments—including simple random, reservoir, stratified, oversampling, undersampling, and weighted sampling—detailing their principles, algorithmic steps, advantages, drawbacks, and suitable application scenarios to help analysts choose the right method.

Big Data · Sampling · oversampling

0 likes · 8 min read

AI Algorithm Path

Mar 4, 2025 · Artificial Intelligence

How to Control LLM Output Using Temperature, Top‑K, and Top‑P

The article explains how sampling parameters—Temperature, Top‑k, and Top‑p—shape the output of large language models, comparing greedy and beam search, illustrating probability changes with concrete examples, and offering practical guidance on adjusting these settings for different tasks.

Beam Search · Greedy Search · LLM

0 likes · 9 min read

AI Algorithm Path

Feb 19, 2025 · Artificial Intelligence

How Temperature Shapes Output in Large Language Models

The article explains the Temperature hyper‑parameter in large language models, shows how it modifies the softmax distribution, provides a Python visualisation script, and demonstrates through experiments that higher values increase creativity while lower values make outputs more deterministic.

Python · Sampling · Softmax

0 likes · 5 min read

JD Tech Talk

Dec 27, 2024 · Backend Development

Log Sampling and Cross‑Thread Propagation in High‑Throughput Java Services

The article examines the performance impact of excessive logging in large‑scale Java systems and proposes request‑level sampling with cross‑thread identifier propagation, offering practical component‑based solutions, implementation considerations, and a concrete code example for backend developers.

Java · Sampling · ThreadLocal

0 likes · 7 min read

DataFunTalk

Jul 22, 2024 · Fundamentals

A/B Testing and Causal Inference: Evolution of Sampling, Metric Evaluation, and Statistical Inference

The article reviews the development of online A/B testing, covering sampling and traffic‑splitting techniques, metric computation improvements, statistical inference advances, and current challenges such as interference, real‑time inference, and large‑scale metric computation, while referencing recent research papers.

A/B testing · Metric Evaluation · Sampling

0 likes · 10 min read

JD Tech Talk

Jul 9, 2024 · Artificial Intelligence

Getting Started with AI Image Generation Using Stable Diffusion for Promotional Posters

This guide introduces the fundamentals of AI image generation with Stable Diffusion, covering three main usage methods, the Draw Things desktop app, model types, samplers, prompts, and post‑processing techniques to create high‑quality promotional graphics for events like the 618 sale.

AI art · DrawThings · LoRA

0 likes · 11 min read

NewBeeNLP

Jun 28, 2024 · Artificial Intelligence

Why Large Language Models Aren’t Magic: Understanding Compression and Prompt Engineering

This article demystifies large language models by comparing them to classic compression algorithms, explains how they compress massive data into compact parameters, explores their ability to learn abstract patterns, and provides practical insights into prompt engineering, sampling strategies, and multi‑step agent architectures for real‑world applications.

Agent Architecture · LLM · Model Compression

0 likes · 19 min read

DaTaobao Tech

May 27, 2024 · Artificial Intelligence

Sampling Strategies for Large Language Models: Greedy, Beam, Top‑K, Top‑p, and Temperature

The article explains how greedy search, beam search, Top‑K, Top‑p (nucleus) sampling, and temperature each shape large language model generation, comparing their effects on repetition, diversity, and creativity, and provides concise TensorFlow‑based code examples illustrating these inference‑time strategies.

AI · LLM · Python

0 likes · 15 min read

Open Source Tech Hub

Dec 31, 2023 · Backend Development

Profile PHP Applications with Reli: Sampling Tracing and Flamegraph Guide

This guide explains how to install and use Reli, a sampling profiler and VM state inspector written in PHP, to detect bottlenecks, trace execution, locate memory issues, and generate flamegraphs for PHP processes without modifying the target program.

PHP · Performance · Profiling

0 likes · 16 min read

Huawei Cloud Developer Alliance

Nov 30, 2023 · Artificial Intelligence

Mastering LLM Text Generation: Decoding Methods Explained

This review of the recent MindSpore NLP public class walks through the fundamentals of large language model text generation, detailing deterministic decoding such as greedy and beam search, stochastic sampling techniques like temperature, top‑k and top‑p, and advanced methods including constrained beam, contrastive, and assisted search, with illustrative examples.

Beam Search · Greedy Search · LLM

0 likes · 5 min read

FunTester

Sep 1, 2023 · Operations

Observability in the Cloud‑Native Era: Data Collection Strategies and Sampling Techniques

The article explains how cloud‑native observability systems gather massive telemetry from infrastructure, containers, middleware and services, compares direct push and file‑based collection approaches, and details head, tail and local sampling methods to optimize data completeness and performance.

Observability · Performance Optimization · Sampling

0 likes · 10 min read

DevOps Cloud Academy

Aug 29, 2023 · Cloud Native

Observability and Data Collection Strategies in Cloud‑Native Environments

The article explains that while observability is not new, cloud‑native systems have driven the rapid development of observability platforms. It details data collection architectures, direct push versus file‑based approaches, and various sampling techniques (head, tail, and local sampling) that balance completeness, real‑time reporting, and performance impact.

Performance · Sampling · cloud-native

0 likes · 11 min read

Baidu Geek Talk

Aug 21, 2023 · Artificial Intelligence

Decoding Strategies for Generative Models: Top‑k, Top‑p, Contrastive Search, Beam Search, and Sampling

The article explains how generative models use deterministic methods like greedy and beam search and stochastic techniques such as top‑k, top‑p, contrastive search and sampling, describing their mechanisms, temperature control, repetition penalties, and practical trade‑offs for balancing fluency, diversity and coherence.

AI · Beam Search · Sampling

0 likes · 9 min read

vivo Internet Technology

Aug 2, 2023 · Game Development

Pre‑Experiment User Stratification Model for Improving AB Test Uniformity in Vivo Game Center

The paper introduces a pre‑experiment user stratification model that uses covariate‑balancing algorithms to create separate strata for distribution and revenue metrics, ensuring equal user allocation in Vivo game‑center AB tests, which reduces metric variance, improves gray‑release effectiveness, and saves significant investigation effort.

AB testing · Game Analytics · Sampling

0 likes · 14 min read

DataFunSummit

Jul 23, 2023 · Big Data

Optimizing OLAP Performance with ADM, Cube Pre‑aggregation and Sampling at Ant Group

This article explains how Ant Group tackles the performance challenges of large‑scale OLAP tables by using ADM to reduce data volume, employing Cube pre‑aggregation for reporting, and applying statistical sampling for exploratory analysis, detailing the processes, metrics, and architectural designs involved.

ADM · Cube · Data Warehouse

0 likes · 15 min read

Architect

Jul 1, 2023 · Artificial Intelligence

Comprehensive Guide to Text Generation Decoding Strategies with HuggingFace Transformers

This tutorial explores various text generation decoding methods—including greedy search, beam search, top‑k/top‑p sampling, sample‑and‑rank, and group beam search—explaining their principles, providing detailed Python code examples, and comparing their use in modern large language models.

Beam Search · Greedy Search · Sampling

0 likes · 59 min read

Tencent Cloud Developer

Jun 1, 2023 · Artificial Intelligence

A Comprehensive Guide to Decoding Strategies for Text Generation with HuggingFace Transformers

This guide thoroughly explains the major decoding strategies for neural text generation in HuggingFace Transformers—including greedy, beam, diverse beam, sampling, top‑k, top‑p, sample‑and‑rank, beam sampling, and group beam search—detailing their principles, Python implementations with LogitsProcessor components, workflow diagrams, comparative analysis, and references to original research.

Beam Search · Sampling · Text Generation

0 likes · 60 min read

DataFunTalk

Jan 8, 2023 · Big Data

ByteDance Event‑Tracking Data Cost Governance Practices

This article describes ByteDance's comprehensive approach to managing the massive volume of event‑tracking (埋点) data, detailing the background, cost‑reduction strategies, experience review, future plans, and a Q&A session that together illustrate how systematic data governance can dramatically cut storage and processing expenses.

Big Data · ByteDance · Sampling

0 likes · 18 min read

ByteDance SYS Tech

Jan 6, 2023 · Fundamentals

How ByteDance Scaled Profile‑Guided Optimization to Boost CPU Efficiency

This article explains ByteDance's large‑scale adoption of profile‑guided optimization (PGO), covering its principles, instrumentation and sampling methods, the automated platform built for data collection and compilation, and the resulting performance gains across dozens of critical services.

ByteDance · Compiler Optimization · Instrumentation

0 likes · 12 min read

Model Perspective

Nov 9, 2022 · Fundamentals

Understanding Markov Chains: From Basics to Convergence and Sampling

This article explains the fundamentals of Markov chains, illustrates their transition matrix with a market example, demonstrates convergence through Python code, and outlines how to use the stationary distribution for sampling in Monte Carlo simulations.

Markov chain · Monte Carlo · Sampling

0 likes · 9 min read

Model Perspective

Oct 4, 2022 · Artificial Intelligence

How Metropolis-Hastings Improves MCMC Sampling Efficiency

This article explains the detailed‑balance condition for Markov chains, shows why finding a transition matrix for a given stationary distribution is hard, and demonstrates how Metropolis‑Hastings modifies MCMC to achieve higher acceptance rates with a concrete Python example.

MCMC · Markov chain · Metropolis-Hastings

0 likes · 9 min read

Model Perspective

Sep 28, 2022 · Artificial Intelligence

How Monte Carlo Sampling Powers AI: From Basics to Acceptance-Rejection

This article introduces Monte Carlo methods, explains how random sampling approximates integrals, discusses uniform and non‑uniform probability distributions, and details acceptance‑rejection sampling as a technique for generating samples from complex distributions, laying the groundwork for understanding Markov Chain Monte Carlo in AI.

Acceptance-Rejection · Artificial Intelligence · MCMC

0 likes · 8 min read

Model Perspective

Sep 23, 2022 · Fundamentals

Mastering Monte Carlo: From Acceptance-Rejection to Gibbs Sampling in Python

This article explains the motivations behind Monte Carlo methods, introduces acceptance-rejection sampling, details Markov Chain Monte Carlo concepts, and walks through Metropolis-Hastings and Gibbs sampling algorithms with Python implementations, highlighting their use in high‑dimensional probability distribution sampling.

MCMC · Monte Carlo · Python

0 likes · 18 min read

Model Perspective

Sep 21, 2022 · Fundamentals

Unlocking Monte Carlo Sampling: From Basics to Acceptance‑Rejection in AI

Monte Carlo methods, originally a gambling-inspired random simulation technique, provide a versatile way to approximate integrals and sums, and by using acceptance‑rejection sampling they enable drawing samples from complex probability distributions, a key step toward effective Markov Chain Monte Carlo algorithms in machine learning and AI.

Acceptance-Rejection · MCMC · Monte Carlo

0 likes · 7 min read

Bilibili Tech

Sep 20, 2022 · Fundamentals

Common Color Representation Methods and Image/Video Fundamentals

The article explains common color models such as grayscale, RGB and YUV, describes image fundamentals like resolution and aspect ratio, outlines typical storage formats (RGB, YUV420P, NV12/NV21) and their bit‑depth considerations, and introduces video basics including frame rate, compression stages and HDR mapping.

RGB · Sampling · color representation

0 likes · 21 min read

Model Perspective

Jun 1, 2022 · Fundamentals

How the Central Limit Theorem Turns Any Distribution Into a Normal Curve

This article intuitively demonstrates the Central Limit Theorem using uniform and Beta distributions, showing how sample means converge to a normal shape as sample size grows, and provides the formal statistical statement and its significance for inference.

Sampling · central limit theorem · normal distribution

0 likes · 5 min read

Model Perspective

Jun 1, 2022 · Fundamentals

How the Central Limit Theorem Powers Confidence Intervals and Sample Estimates

This article explains the Central Limit Theorem, distinguishes standard deviation from standard error, illustrates the 3‑σ rule, and shows how confidence levels, significance levels, and interval estimation combine to derive reliable confidence intervals for large‑sample population mean estimates.

Sampling · central limit theorem · confidence interval

0 likes · 9 min read

Tencent Cloud Developer

Dec 1, 2021 · Backend Development

From Dapper to Modern Distributed Tracing: Concepts, Algorithms, and Practices

The article traces the evolution of distributed tracing from Google’s Dapper paper through early research, Pinpoint and X‑Trace, to modern open‑source tools like Zipkin, Jaeger and SkyWalking, explaining metadata propagation, asynchronous reporting, classic nested and convolution algorithms, and practical implementation details for non‑intrusive, scalable tracing.

Dapper · Sampling · Trace Propagation

0 likes · 14 min read

Didi Tech

May 19, 2021 · Artificial Intelligence

Applying Epsilon‑Greedy Bandit Algorithm for Content Delivery Optimization at DiDi

DiDi integrated an epsilon‑greedy bandit algorithm with its CMS to optimize ad placement across 600 slots, using quality scores, traffic sampling, and a drag‑and‑drop UI. The change raised CTR from 1.35% to 13.43% and unique visitors by 686%, demonstrating data‑driven growth beyond simple A/B testing.

Content Optimization · Epsilon-Greedy · Sampling

0 likes · 10 min read

New Oriental Technology

May 17, 2021 · Fundamentals

Live Streaming Process Model: Capture, Sampling, Encoding, and Audio Channel Technologies

This article explains the live streaming workflow, detailing audio and video capture, digital sampling rates and bit depths, various sound channel configurations from mono to immersive formats, and common audio encoding methods such as PCM, AAC, MP3, and FLAC.

Audio Processing · Sampling · audio encoding

0 likes · 22 min read

dbaplus Community

Jul 8, 2019 · Big Data

How to Use ClickHouse Sampling and Materialized Views for Real‑Time Monitoring of Billion‑Scale Ad Traffic

This article explains how to handle high‑volume advertising monitoring by storing raw request logs in ClickHouse, enabling sampling and materialized views, and using TP999 metrics, aggregating tables, and Grafana queries to achieve fast, flexible, and low‑impact real‑time analytics on billions of events.

ClickHouse · Monitoring · Sampling

0 likes · 10 min read

DataFunTalk

Jul 5, 2019 · Artificial Intelligence

Lead Quality Prediction for Real Estate: Data, Modeling, and Interpretability

This article presents a case study on building and deploying a lead‑quality classification model for a high‑value, low‑frequency real‑estate platform, covering business context, data challenges, sampling strategies, feature engineering, model selection, tuning, evaluation metrics, interpretability analysis, and observed performance improvements.

Machine Learning · Real Estate · Sampling

0 likes · 14 min read

Hulu Beijing

Mar 8, 2018 · Artificial Intelligence

Master Common Sampling Techniques: Inverse Transform, Rejection, Importance & MCMC

This article explains the core ideas and step-by-step procedures of widely used sampling methods—including inverse transform, rejection, importance, and Markov Chain Monte Carlo techniques such as Metropolis‑Hastings and Gibbs—highlighting their mathematical foundations, practical implementations, and when each method is appropriate.

Importance Sampling · MCMC · Monte Carlo

0 likes · 11 min read

Hulu Beijing

Dec 26, 2017 · Fundamentals

How to Sample a Gaussian Distribution: Methods, Algorithms, and Performance

This article explains why Gaussian (normal) distribution sampling is essential, describes the mathematical transformation from a standard normal, and compares several practical algorithms—including inverse transform, Box‑Muller, Marsaglia polar, rejection sampling, and Ziggurat—highlighting their implementation steps and efficiency considerations.

Box-Muller · Gaussian · Marsaglia

0 likes · 8 min read

Hulu Beijing

Nov 21, 2017 · Artificial Intelligence

How to Tackle Imbalanced Datasets with Sampling Techniques

Sampling transforms complex distributions into manageable data points, and mastering methods like random oversampling, undersampling, SMOTE, and its variants is essential for handling imbalanced binary classification problems in machine learning, ensuring models achieve balanced accuracy and recall across classes.

SMOTE · Sampling · imbalanced data

0 likes · 8 min read

Beike Product & Technology

Jul 16, 2017 · Industry Insights

How Lianjia Built LTrace: A Low‑Overhead, Scalable Distributed Tracing Platform

This article explains how Lianjia designed and implemented LTrace, a zero‑intrusion, high‑performance distributed tracing system that captures full request chains across heterogeneous services, supports multi‑language environments, offers flexible sampling, and enables rapid fault isolation and performance optimization.

Observability · Sampling · architecture

0 likes · 12 min read

Didi Tech

Jul 10, 2017 · Fundamentals

Statistical Foundations for A/B Testing: Populations, Samples, Confidence Intervals, and the Central Limit Theorem

This article explains the essential statistical concepts—populations, samples, sampling error, confidence intervals, the Central Limit Theorem, and normal distribution—that underpin A/B testing, showing how they enable reliable hypothesis evaluation, accurate impact prediction, and data‑driven decision making for product experiments.

A/B testing · Sampling · central limit theorem

0 likes · 14 min read

ITFLY8 Architecture Home

Dec 8, 2016 · Operations

Designing Effective End-to-End Tracing Systems for Distributed Services

This article surveys the design of end‑to‑end tracing systems for large distributed services, explaining core use cases, tracing approaches, metadata propagation, sampling strategies, visualization techniques, and recommended design choices to improve debugging, performance analysis, and resource attribution.

Sampling · System Design · distributed tracing

0 likes · 44 min read

360 Quality & Efficiency

Sep 18, 2016 · Mobile Development

Performance Testing Metrics and Sampling Strategy for Android Apps

The article outlines a comprehensive set of Android app performance metrics, device coverage, a non‑root sampling strategy using dumpsys commands, shell‑based data collection, and Python‑driven HTML reporting, providing practical guidance and reference implementations for mobile developers.

Android · Sampling · performance testing

0 likes · 4 min read

Art of Distributed System Architecture Design

Aug 2, 2015 · Artificial Intelligence

Designing Machine Learning Models for Fraud Detection: Sampling, Feature Engineering, and Evaluation

This article explains how Airbnb's Trust & Safety team builds machine‑learning models to detect fraudulent behavior, covering problem definition, role‑based sampling, feature design techniques such as normalization and CP‑coding, and the trade‑offs between precision and recall in model evaluation.

AI · Sampling · feature engineering

0 likes · 10 min read
