Tag

statistical modeling

1 view collected around this technical thread.

DataFunSummit
Jul 26, 2024 · Big Data

Understanding Power Law Distributions in Content Ecosystems: Data Science Insights and Applications

This article explores how power‑law and other heavy‑tailed distributions appear in content ecosystems, explains their statistical foundations, discusses why they are common, and presents data‑driven strategies—including integer programming, graph‑based creator analysis, and causal inference—to optimize content production, recommendation, and settlement policies.

Content Ecosystem · Power Law · big-data
0 likes · 18 min read
Model Perspective
Jul 12, 2024 · Fundamentals

Why Lognormal Distribution Is Key to Modeling Rainfall and Financial Data

The lognormal distribution, in which a variable's logarithm is normally distributed, is strictly positive and right‑skewed, which suits phenomena such as rainfall, river flow, asset prices, and biological sizes. This article explains its definition and properties and works through a practical rainfall‑modeling case study.

Finance · environment · lognormal distribution
0 likes · 5 min read
Model Perspective
Apr 28, 2024 · Fundamentals

Why Simple Linear Regression Falls Short and How Hierarchical Models Solve It

Linear regression often fails to capture nested data structures. Hierarchical (multilevel) linear models address this limitation by modeling both within‑group and between‑group variation, which allows a more nuanced analysis of how factors such as school type affect student performance and extends to fields such as ecology and health.

educational statistics · hierarchical linear model · multilevel regression
0 likes · 11 min read
Model Perspective
Sep 17, 2023 · Fundamentals

Why Correlation Isn’t Causation: Methods to Reveal True Relationships in Data

This article explains the difference between correlation and causation, illustrates common misconceptions with real‑world examples, and introduces statistical tools such as randomized experiments, instrumental variables, propensity score matching, and difference‑in‑differences that help researchers uncover genuine causal effects in mathematical modeling.

causal inference · causality · correlation
0 likes · 9 min read
Test Development Learning Exchange
Aug 19, 2023 · Artificial Intelligence

Regression Analysis Methods and Code Examples for Various Business Scenarios

This article provides comprehensive regression analysis methods and Python code examples for various business scenarios including e-commerce, market research, healthcare, finance, social media, HR, education, hospitality, marketing, and logistics.

Machine Learning · OLS regression · Poisson regression
0 likes · 6 min read
Python Programming Learning Circle
May 26, 2023 · Fundamentals

Introduction to Statsmodels: Installation, Data Loading, and Basic Statistical Analysis with Python

This article introduces the Python Statsmodels library and explains its key features, such as linear regression, GLM, time‑series, and robust methods. It shows how to install it and load data with pandas, then walks through descriptive statistics, visualization, hypothesis testing, and simple and multiple linear regression examples.

Python · data-analysis · regression
0 likes · 6 min read
DataFunSummit
May 8, 2023 · Fundamentals

Understanding Data Distributions: Normal vs. Power Law in Content Ecosystems

This article explores how data in content ecosystems is distributed, contrasting the classic normal distribution with heavy‑tailed power‑law patterns. It explains why power laws appear so frequently, discusses their statistical properties and risks, and presents practical optimization and causal‑inference methods applied to creator incentives and platform strategies.

Content Ecosystem · Optimization · Power Law
0 likes · 20 min read
Model Perspective
Mar 31, 2023 · Big Data

How to Model Used Sailboat Prices and Rethink the Future of the Olympics

These COMAP MCM problem statements challenge teams to develop statistical models for pricing used sailboats using a large 2023 dataset and to propose innovative strategies for the Olympic Games, evaluating regional effects, data sources, and policy recommendations for sustainable hosting.

Data Modeling · Olympics · price analysis
0 likes · 10 min read
Model Perspective
Dec 4, 2022 · Fundamentals

How Logistic Regression Predicts Titanic Survival: A Step-by-Step R Guide

This article explains logistic regression for binary outcomes, demonstrates its implementation in R with the TitanicSurvival dataset, and interprets the model coefficients showing how gender, age, and passenger class significantly affect survival probability.

R · Titanic dataset · binary classification
0 likes · 5 min read
Model Perspective
Dec 3, 2022 · Fundamentals

How to Perform Multiple Linear Regression in R with the Birthweight Dataset

This article explains the theory of multiple linear regression, demonstrates how to fit such a model in R using the birthwt dataset with the lm() function, and covers interpreting the output, reading diagnostic plots, and handling categorical variables.

R · birthweight dataset · lm function
0 likes · 6 min read
Model Perspective
Nov 8, 2022 · Fundamentals

Mastering Multiple Linear Regression: Theory, Estimation, and Prediction

This article explains the fundamentals of multiple linear regression, covering model formulation, least‑squares estimation of coefficients, statistical tests for significance, and how to use the fitted equation for accurate predictions and confidence intervals.

least squares · multiple linear regression · prediction
0 likes · 5 min read
Model Perspective
Sep 14, 2022 · Fundamentals

Mastering Grouped and Dummy Variable Regression: Weighted Models Explained

This article explains how regression can handle grouped (aggregated) data using weighted least squares, illustrates the impact of heteroskedasticity, and shows how dummy variables encode categorical factors for flexible, non‑parametric modeling of treatment effects.

dummy variables · grouped data · heteroskedasticity
0 likes · 12 min read
Model Perspective
Sep 8, 2022 · Fundamentals

How Monte Carlo Simulation Optimizes Part Parameter Design and Reduces Losses

This article explains how to design part calibration values and tolerances for a product composed of seven components, models the relationship between component parameters and product quality, and uses a Monte Carlo simulation in Python to estimate the average loss per product, illustrating the trade‑off between quality loss and manufacturing cost.

Parameter Design · Python · manufacturing
0 likes · 5 min read
Model Perspective
Aug 2, 2022 · Fundamentals

How ARMA Models Enable Accurate Time Series Forecasting

This article explains the recursive forecasting formulas for ARMA and MA(q) time‑series models, showing how forecasts depend only on past observations, how model invertibility ensures stability, and how estimated parameters are used in practical prediction.

ARMA · MA(q) · forecasting
0 likes · 2 min read
Model Perspective
Jul 20, 2022 · Fundamentals

Unlocking Multiple Linear Regression: Theory, Estimation, and Prediction

This article explains the fundamentals of multiple linear regression, covering model formulation, least‑squares estimation of coefficients, hypothesis testing of the regression equation, and how to use the fitted model for point and interval predictions.

hypothesis testing · least squares · multiple regression
0 likes · 5 min read
Model Perspective
Jul 12, 2022 · Fundamentals

How Simple Linear Regression Uncovers Hidden Relationships in Data

This article explains the theory and practice of simple linear regression, covering deterministic vs. stochastic relationships, the least‑squares estimation of coefficients, goodness‑of‑fit measures such as R², hypothesis testing for linearity, and a real‑world case linking wine consumption to heart‑disease mortality.

R-squared · hypothesis testing · least squares
0 likes · 8 min read
Model Perspective
Jul 9, 2022 · Fundamentals

Unlocking Multiple Linear Regression: Theory, Estimation, and Prediction

This article explains the fundamentals of multiple linear regression, covering model formulation, least‑squares estimation of coefficients, statistical tests for significance, and how to use the fitted equation for point and interval predictions.

hypothesis testing · least squares · multiple regression
0 likes · 4 min read
DataFunTalk
Jan 2, 2022 · Fundamentals

Survival Analysis for User Churn: Concepts, Data Preparation, and Quantitative Modeling

This article introduces survival analysis, explains how to model user churn by defining purchase and cancellation times as birth and death events, describes data formatting, presents descriptive Kaplan‑Meier results, and shows how Cox regression quantifies the impact of factors such as membership and activity on user survival.

Cox regression · Data Analysis · statistical modeling
0 likes · 7 min read
Python Programming Learning Circle
Nov 8, 2021 · Fundamentals

Time Series Analysis with Python: Complete ARIMA Modeling Workflow

This tutorial walks through the full Python-based ARIMA modeling process for time‑series analysis, covering data loading, stationarity and white‑noise tests, model order selection, parameter estimation, diagnostic checks, and future forecasting with detailed code examples.

ARIMA · Data Analysis · statistical modeling
0 likes · 10 min read
DeWu Technology
Mar 4, 2021 · Fundamentals

Dominance Analysis for Attribution in Data Analytics

The article argues that attributing a metric decline requires a quantitative approach and introduces Dominance Analysis, an econometric technique that decomposes a regression's R² into variable-specific contributions by fitting all subset models and averaging each variable's marginal contribution, so that factors can be ranked. It also provides a Python implementation using the dominance‑analysis package, illustrated on the Boston Housing dataset.

Attribution · Data Analytics · Python
0 likes · 7 min read