Fundamentals 5 min read

How to Build and Forecast an ARMA Model in Python Using Sunspot Data

Learn to load the sunspots.csv dataset, fit an ARMA(4,2) model with statsmodels, evaluate it using AIC/BIC, visualize original and predicted values, and forecast the 1989 sunspot count, all with step‑by‑step Python code and plots.

Model Perspective

Aug 2, 2022

How to Build and Forecast an ARMA Model in Python Using Sunspot Data

ARMA Model Solution in Python

Using the sunspots.csv file, we build an appropriate ARMA model to predict the number of sunspots in 1989. The statsmodels.api.tsa.ARMA() function is used for fitting.

Code

import pandas as pd, numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt
plt.rc('font',family='SimHei'); plt.rc('font',size=16)

d=pd.read_csv('sunspots.csv',usecols=['counts'])
md=sm.tsa.ARMA(d,(9,1)).fit()
years=np.arange(1700,1989)  # known observation years

dhat=md.predict()
plt.plot(years[-20:],d.values[-20:],'o-k')
plt.plot(years[-20:],dhat.values[-20:],'P--')
plt.legend(('Original observations','Predicted values'))
plt.show()

dnext=md.predict(d.shape[0],d.shape[0])
print(dnext)  # show next period prediction

Below is a complete modeling procedure.

Step 1: Plot the original data series to confirm stationarity.

Step 2: Use AIC and BIC criteria to select ARMA(4,2), fit the model with Python, and obtain residuals and their distribution.

Step 3: Using the fitted model, predict the 1989 sunspot count as 139, and compare with the original data.

Code

import pandas as pd, numpy as np
import statsmodels.api as sm
import matplotlib.pylab as plt
from statsmodels.graphics.tsaplots import plot_acf, plot_pacf

plt.rc('axes',unicode_minus=False)
plt.rc('font',family='SimHei'); plt.rc('font',size=16)

d=pd.read_csv('sunspots.csv'); dd=d['counts']
years=d['year'].values.astype(int)

plt.plot(years,dd.values,'-*'); plt.figure()
ax1=plt.subplot(121); plot_acf(dd,ax=ax1,title='Autocorrelation')
ax2=plt.subplot(122); plot_pacf(dd,ax=ax2,title='Partial Autocorrelation')

for i in range(1,6):
    for j in range(1,6):
        md=sm.tsa.ARMA(dd,(i,j)).fit()
        print([i,j,md.aic,md.bic])

zmd=sm.tsa.ARMA(dd,(4,2)).fit()
print(zmd.summary())  # display model information

residuals = pd.DataFrame(zmd.resid)
fig, ax = plt.subplots(1,2)
residuals.plot(title="Residuals", ax=ax[0])
residuals.plot(kind='kde', title='Density', ax=ax[1])
plt.legend(''); plt.ylabel('')

dhat=zmd.predict(); plt.figure()
plt.plot(years[-20:],dd.values[-20:],'o-k')
plt.plot(years[-20:],dhat.values[-20:],'P--')
plt.legend(('Original observations','Predicted values'))

dnext=zmd.predict(d.shape[0],d.shape[0])
print(dnext)  # show next period prediction
plt.show()

Original Source

Signed-in readers can open the original source through BestHub's protected redirect.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.

Python Statsmodels ARMA

Written by

Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".

0 followers

Reader feedback

How this landed with the community

Rate this article

Was this worth your time?

Discussion

0 Comments

Thoughtful readers leave field notes, pushback, and hard-won operational detail here.