
Time Series Analysis and ARIMA Modeling Practice with Python

This article introduces time series fundamentals, classification, and challenges for internet businesses, then provides a step‑by‑step Python tutorial on ARIMA modeling—including data loading, stationarity testing, differencing, ACF/PACF analysis, AIC‑based order selection, model training, prediction, error evaluation, exogenous variable integration, and diagnostic checks.

Ctrip Technology

Time series analysis is an important branch of statistics that studies patterns over time to forecast future values, applicable to stock prices, sales, rainfall, and other domains.

Time series can be classified by stationarity, indicator type, and time attribute. Period indicators, which accumulate over an interval, are additive across periods; point indicators, which record a level at a single instant, are not.
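The additive versus non‑additive distinction matters as soon as a series is aggregated to a coarser frequency. The sketch below is illustrative only: the column names `orders` and `inventory` and all values are hypothetical, not taken from the article's data.

```python
import pandas as pd

# Hypothetical daily data: 'orders' is a period indicator (a daily total),
# 'inventory' is a point indicator (a level observed at the end of each day).
idx = pd.date_range("2024-01-01", periods=14, freq="D")
daily = pd.DataFrame(
    {"orders": [10] * 14, "inventory": list(range(100, 114))},
    index=idx,
)

# Period indicators can be summed across days; for point indicators a
# snapshot (here the last observation of each week) is taken instead.
weekly = daily.resample("W").agg({"orders": "sum", "inventory": "last"})
print(weekly)
```

Summing `inventory` across days would produce a number with no business meaning, which is why the two columns use different aggregations.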

For internet companies, business volume forecasting faces challenges such as periodic effects, holidays, regional differences, inventory constraints, and external factors.

The article focuses on ARIMA modeling, introducing ARMA components (AR(p) and MA(q)) and how differencing transforms a non‑stationary series into a stationary one for ARMA application.
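For reference, the standard textbook form of these models can be written as follows. The notation (c, φ, θ, ε, B, W) is assumed here and does not come from the original article.

```latex
% ARMA(p, q) for a stationary series X_t with white-noise errors \varepsilon_t
X_t = c + \sum_{i=1}^{p} \phi_i X_{t-i} + \varepsilon_t + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}

% ARIMA(p, d, q): the ARMA model is applied to the d-times differenced series W_t
W_t = (1 - B)^d Y_t, \qquad B Y_t = Y_{t-1}
```

With d = 1 the differenced series is simply W_t = Y_t - Y_{t-1}.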

Practical steps using Python:

Step 1 – Load data:

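import pandas as pd

# Read the CSV with the date column as the index, then parse it as datetimes
df = pd.read_csv('testdata.csv', encoding='gbk', index_col='ddate')
df.index = pd.to_datetime(df.index)
df['cnt'] = df['cnt'].astype(float)  # target series as float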

Step 2 – Test stationarity with Augmented Dickey‑Fuller:

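from statsmodels.tsa.stattools import adfuller

def test_stationarity(timeseries):
    # Return the ADF p-value; a value below 0.05 rejects the unit-root
    # null hypothesis, so the series can be treated as stationary.
    dftest = adfuller(timeseries, autolag='AIC')
    return dftest[1]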

Step 3 – Differencing to achieve stationarity and re‑test.

Step 4 – Plot ACF and PACF to get initial hints for p and q.

Step 5 – Determine optimal (p,q) by minimizing AIC over a grid:

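from statsmodels.tsa.arima.model import ARIMA

pmax, qmax = 8, 8  # assumed search bounds; the original does not state them
for p in range(1, pmax + 1):
    for q in range(1, qmax + 1):
        try:
            # d=1: the model differences the series once internally
            model = ARIMA(endog=df['cnt'], order=(p, 1, q))
            results = model.fit()  # the current ARIMA.fit() has no disp argument
            print('ARIMA p:{} q:{} - AIC:{}'.format(p, q, results.aic))
        except Exception as exc:
            # Report orders that fail to fit instead of silently skipping them
            print('ARIMA p:{} q:{} failed: {}'.format(p, q, exc))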

The minimum AIC suggests p=7, q=7 for a first‑order differenced series.

Step 6 – Train the ARIMA(7,1,7) model, generate predictions, and compute error rate (≈8.58%).

Step 7 – Incorporate exogenous variables (e.g., holidays, week identifiers) to improve accuracy, reducing error to about 1.77%.

Step 8 – Model diagnostics: a residual QQ‑plot close to the reference line suggests approximately normal residuals, and a Durbin‑Watson statistic of ≈1.99 (close to 2) indicates no first‑order autocorrelation in the residuals.
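Both checks can be sketched numerically. The residuals below are synthetic white noise standing in for `results.resid`, and `scipy.stats.probplot` is used here to obtain the QQ-plot points without drawing a figure; it is a substitute for whatever plotting call the article used.

```python
import numpy as np
from scipy import stats
from statsmodels.stats.stattools import durbin_watson

# Synthetic white-noise residuals standing in for results.resid (assumption).
rng = np.random.default_rng(5)
resid = rng.normal(size=500)

# Durbin-Watson ranges from 0 to 4; values near 2 indicate no
# first-order autocorrelation.
dw = durbin_watson(resid)

# probplot returns the QQ-plot points and a least-squares line through them;
# r near 1 means the points lie close to a straight line, as expected for
# normally distributed residuals.
(theoretical_q, ordered_resid), (slope, intercept, r) = stats.probplot(resid, dist="norm")
print(f"Durbin-Watson: {dw:.2f}, QQ correlation: {r:.3f}")
```

Durbin‑Watson only looks at lag 1, so a residual ACF plot or a Ljung‑Box test is a useful complement when higher-order autocorrelation is a concern.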

Conclusion: Proper data preprocessing, model selection, and inclusion of relevant external factors are crucial for reliable time‑series forecasting.
