
Understanding RNNs and LSTM: Theory and Python Keras Implementation

This article explains the fundamentals of Recurrent Neural Networks and Long Short‑Term Memory units, their gating mechanisms, and demonstrates a practical Python Keras example that predicts future PM2.5 concentrations using an LSTM model.

Model Perspective

Recurrent Neural Network

A Recurrent Neural Network (RNN) is a neural network in which neurons form cyclic connections, allowing the network to process sequential data.

The basic idea introduces the concept of time steps, treating a sequence as a series of inputs and outputs. Unlike a feed-forward network, each neuron receives input both from the previous layer and from its own state at the previous time step, enabling information to flow over time.
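This recurrence can be sketched in a few lines of NumPy. The weights below are random placeholders for illustration only; the point is that the hidden state hₜ depends on both the current input xₜ and the previous state hₜ₋₁:

```python
import numpy as np

rng = np.random.default_rng(0)
input_dim, hidden_dim = 3, 5
W_x = rng.normal(size=(hidden_dim, input_dim))   # input-to-hidden weights
W_h = rng.normal(size=(hidden_dim, hidden_dim))  # hidden-to-hidden (recurrent) weights
b = np.zeros(hidden_dim)

def rnn_step(x_t, h_prev):
    """One RNN time step: h_t = tanh(W_x x_t + W_h h_{t-1} + b)."""
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# unroll the same step function over a sequence of 4 time steps
h = np.zeros(hidden_dim)
for x_t in rng.normal(size=(4, input_dim)):
    h = rnn_step(x_t, h)
print(h.shape)  # the final hidden state summarizes the whole sequence
```

Because the same weights are reused at every step, the network can, in principle, carry information from early inputs forward to the end of the sequence.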

Because RNNs have memory, they can remember past states and generate outputs based on both past and current inputs, making them widely used in natural language processing, speech recognition, image captioning, time‑series forecasting, etc. Variants such as Long Short‑Term Memory (LSTM) and Gated Recurrent Unit (GRU) further improve sequence handling.

Long Short‑Term Memory (LSTM)

LSTM is a variant of RNN designed to handle long‑sequence data and mitigate the gradient‑vanishing problem.

Standard RNNs have a single hidden state passed forward at each time step, which limits modeling of long‑range dependencies. LSTM introduces three gates—input, forget, and output—that regulate information flow based on the current input and previous hidden state, allowing better long‑term dependency modeling and preventing gradient decay.

The core idea is a long‑term memory cell controlled by these gates, enabling the network to store and retrieve information over many time steps.

The mathematical formulation involves the input xₜ, hidden state hₜ, cell state cₜ, and gate activations iₜ, fₜ, oₜ, each with its own weight matrix and bias.
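Written out, the standard LSTM update equations (with σ the sigmoid function and ⊙ element-wise multiplication) are:

fₜ = σ(W_f · [hₜ₋₁, xₜ] + b_f)    (forget gate)
iₜ = σ(W_i · [hₜ₋₁, xₜ] + b_i)    (input gate)
c̃ₜ = tanh(W_c · [hₜ₋₁, xₜ] + b_c)    (candidate cell state)
cₜ = fₜ ⊙ cₜ₋₁ + iₜ ⊙ c̃ₜ    (cell state update)
oₜ = σ(W_o · [hₜ₋₁, xₜ] + b_o)    (output gate)
hₜ = oₜ ⊙ tanh(cₜ)    (hidden state)

Because the cell state cₜ is updated additively rather than through repeated matrix multiplication, gradients can flow over many time steps without decaying, which is what mitigates the vanishing-gradient problem.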

Python Implementation

Using a sample historical dataset, the article builds an LSTM model with Keras to predict future PM2.5 concentrations.

<code>import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense

# construct PM2.5 data
data = np.array([90, 85, 82, 88, 87, 80, 75, 72, 78, 85, 88, 90, 93, 92, 89, 87, 82, 75, 70, 68, 70, 72, 75, 80, 85])
data = np.reshape(data, (-1, 1))

# normalize
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(data)

# create input-output pairs
look_back = 4
X, y = [], []
for i in range(len(data) - look_back):
    X.append(data[i:(i+look_back), 0])
    y.append(data[i+look_back, 0])
X = np.array(X)
y = np.array(y)

# reshape for LSTM
X = np.reshape(X, (X.shape[0], X.shape[1], 1))

# build model
model = Sequential()
model.add(LSTM(4, input_shape=(look_back, 1)))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')

# train
model.fit(X, y, epochs=100, batch_size=1, verbose=2)

# predict future PM2.5
future_data = np.array([70, 72, 75, 80])
future_data = np.reshape(future_data, (-1, 1))
future_data = scaler.transform(future_data)
future_data = np.reshape(future_data, (1, look_back, 1))
future_pm25 = model.predict(future_data)
future_pm25 = scaler.inverse_transform(future_pm25)
print('Future PM2.5 prediction:', future_pm25)
</code>

The result is:

Future PM2.5 prediction: [[78.97241]]
Tags: Python, deep learning, time series forecasting, Keras, LSTM, RNN
Written by Model Perspective

Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".
