Time Series Analysis and ARIMA Modeling Practice with Python
This article introduces time series fundamentals, classification, and challenges for internet businesses, then provides a step‑by‑step Python tutorial on ARIMA modeling—including data loading, stationarity testing, differencing, ACF/PACF analysis, AIC‑based order selection, model training, prediction, error evaluation, exogenous variable integration, and diagnostic checks.
Time series analysis is an important branch of statistics that studies patterns over time to forecast future values, applicable to stock prices, sales, rainfall, and other domains.
Time series can be classified by stationarity, indicator type, and time attribute, with period indicators being additive and point indicators non‑additive.
For internet companies, business volume forecasting faces challenges such as periodic effects, holidays, regional differences, inventory constraints, and external factors.
The article focuses on ARIMA modeling, introducing ARMA components (AR(p) and MA(q)) and how differencing transforms a non‑stationary series into a stationary one for ARMA application.
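As a quick illustration of the AR and MA components (a hedged sketch; the coefficients below are arbitrary, not from the article), statsmodels can simulate an ARMA process directly:

```python
import numpy as np
from statsmodels.tsa.arima_process import ArmaProcess

# Lag polynomials include the lag-0 coefficient of 1; AR signs are negated
ar = np.array([1, -0.6])  # x_t = 0.6 * x_{t-1} + ...
ma = np.array([1, 0.4])   # ... + e_t + 0.4 * e_{t-1}

process = ArmaProcess(ar, ma)
print(process.isstationary)  # True: the AR root lies outside the unit circle

sample = process.generate_sample(nsample=200)
print(sample.shape)  # (200,)
```

Simulated draws like this are a convenient way to see what a stationary ARMA series looks like before fitting one to real data.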
Practical steps using Python:
Step 1 – Load data:
import pandas as pd

# Load the GBK-encoded CSV, index by date, and cast the count column to float
df = pd.read_csv('testdata.csv', encoding='gbk', index_col='ddate')
df.index = pd.to_datetime(df.index)
df['cnt'] = df['cnt'].astype(float)

Step 2 – Test stationarity with the Augmented Dickey–Fuller test:
from statsmodels.tsa.stattools import adfuller

def test_stationarity(timeseries):
    # The second element of the result is the p-value; p < 0.05 suggests stationarity
    dftest = adfuller(timeseries, autolag='AIC')
    return dftest[1]

Step 3 – Difference the series to achieve stationarity, then re-run the ADF test.
Step 4 – Plot ACF and PACF to get initial hints for p and q.
Step 5 – Determine optimal (p,q) by minimizing AIC over a grid:
from statsmodels.tsa.arima.model import ARIMA  # current statsmodels API

pmax = qmax = 8  # assumed search bounds, wide enough to cover the chosen orders
for p in range(1, pmax + 1):
    for q in range(1, qmax + 1):
        try:
            model = ARIMA(endog=df['cnt'], order=(p, 1, q))
            results = model.fit()  # the legacy fit(disp=-1) argument was removed in current statsmodels
            print('ARIMA p:{} q:{} - AIC:{}'.format(p, q, results.aic))
        except Exception:
            continue  # some orders fail to converge; skip them

The minimum AIC suggests p=7 and q=7 for the first-order differenced series.
Step 6 – Train the ARIMA(7,1,7) model, generate predictions, and compute error rate (≈8.58%).
Step 7 – Incorporate exogenous variables (e.g., holidays, week identifiers) to improve accuracy, reducing error to about 1.77%.
Step 8 – Model diagnostics using residual QQ‑plot and Durbin‑Watson test (value ≈1.99) confirm normality and lack of autocorrelation.
Conclusion: Proper data preprocessing, model selection, and inclusion of relevant external factors are crucial for reliable time‑series forecasting.
Ctrip Technology
Official Ctrip Technology account, sharing and discussing growth.