Unlocking Hidden Markov Models: Theory, Algorithms, and Python Implementations
This article explains Hidden Markov Models: their core idea and basic elements, the three fundamental problems together with the forward, Viterbi, and Baum‑Welch algorithms that solve them, an illustrative weather example, a detailed Python implementation using hmmlearn, and a real‑world earthquake case study highlighting practical implementation steps.
A Hidden Markov Model (HMM) is a statistical model that describes a Markov process with hidden states. Its core idea is that the system's states cannot be observed directly; instead, each state generates observable outputs.
Basic Elements
An HMM consists of the following elements:
Set of hidden states and observation symbols.
Initial state distribution, defining the probability of starting in each state.
State transition probability matrix, defining probabilities of moving between states.
Observation (emission) probability matrix, defining the probability of each observation given a state.
Three Fundamental Problems
Applying HMM involves three basic problems:
Evaluation: compute the probability of an observation sequence given a model (solved by the forward‑backward algorithm).
Decoding: find the most likely state sequence for a given observation sequence (solved by the Viterbi algorithm).
Learning: adjust model parameters to maximize the likelihood of the observations (solved by the Baum‑Welch algorithm, a special case of EM).
Forward Algorithm
The forward algorithm is a dynamic‑programming method that computes the probability of an observation sequence. It defines a forward variable α_t(i), the joint probability of observing the first t symbols and being in state i at time t; the sequence probability is then obtained by summing α_T(i) over all states.
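To make the recursion concrete, here is a minimal NumPy sketch (the `forward` helper and its matrix conventions are illustrative, not a library function), applied to the Rainy/Sunny weather model described later in this article:

```python
import numpy as np

def forward(pi, A, B, obs):
    """Forward algorithm: P(obs | model) via dynamic programming.

    pi  : (N,) initial state distribution
    A   : (N, N) transitions, A[i, j] = P(state j at t+1 | state i at t)
    B   : (N, M) emissions, B[i, k] = P(symbol k | state i)
    obs : sequence of observation symbol indices
    """
    alpha = pi * B[:, obs[0]]            # initialization: alpha_1(i)
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]    # induction: alpha_{t+1}(j)
    return alpha.sum()                   # termination: sum_i alpha_T(i)

# Weather model: states (Rainy, Sunny), symbols (walk, shop, clean)
pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])
print(forward(pi, A, B, [0, 1, 2]))      # P(walk, shop, clean)
```

Each induction step costs O(N²), so the whole pass is O(N²·T), versus the O(N^T) of naive enumeration over all state paths.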
Viterbi Algorithm
The Viterbi algorithm also uses dynamic programming but aims to find the most probable hidden state path. It maintains a path variable and back‑pointers to reconstruct the optimal state sequence.
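The path variable and back‑pointer bookkeeping can be sketched in NumPy (the `viterbi` helper below is illustrative, not a library call), again using the Rainy/Sunny weather model described later in this article:

```python
import numpy as np

def viterbi(pi, A, B, obs):
    """Most probable hidden state path for obs (max-product DP)."""
    T, N = len(obs), len(pi)
    delta = pi * B[:, obs[0]]                 # best-path probability per state
    back = np.zeros((T, N), dtype=int)        # back-pointers
    for t in range(1, T):
        trans = delta[:, None] * A            # trans[i, j] = delta[i] * A[i, j]
        back[t] = trans.argmax(axis=0)        # best predecessor of each state
        delta = trans.max(axis=0) * B[:, obs[t]]
    state = int(delta.argmax())               # best final state
    path = [state]
    for t in range(T - 1, 0, -1):             # walk the back-pointers
        state = int(back[t, state])
        path.append(state)
    return path[::-1]

pi = np.array([0.6, 0.4])                     # Rainy, Sunny
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])
path = viterbi(pi, A, B, [0, 1, 2])           # walk, shop, clean
print([["Rainy", "Sunny"][i] for i in path])
```

The structure mirrors the forward algorithm, but with max in place of sum, which is why the two share the same O(N²·T) cost.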
Baum‑Welch Algorithm
The Baum‑Welch algorithm is an Expectation‑Maximization (EM) procedure that iteratively re‑estimates the model parameters so as to maximize the likelihood of the observed data. Thanks to these three algorithms, HMMs are widely used in speech recognition, natural language processing, bioinformatics, and other fields.
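A single EM iteration can be sketched in NumPy as follows (an illustrative one‑step helper under this article's discrete‑HMM conventions, not production code; in practice the step is repeated until the likelihood stops improving):

```python
import numpy as np

def baum_welch_step(pi, A, B, obs):
    """One Baum-Welch (EM) re-estimation step for a discrete HMM."""
    obs = np.asarray(obs)
    T, N = len(obs), len(pi)
    # E-step: forward and backward variables
    alpha = np.zeros((T, N))
    beta = np.zeros((T, N))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]]
    beta[-1] = 1.0
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    likelihood = alpha[-1].sum()
    gamma = alpha * beta / likelihood                 # P(state i at time t | obs)
    xi = (alpha[:-1, :, None] * A[None, :, :] *
          (B[:, obs[1:]].T * beta[1:])[:, None, :]) / likelihood  # P(i -> j at t | obs)
    # M-step: re-estimate parameters from the expected counts
    pi_new = gamma[0]
    A_new = xi.sum(axis=0) / gamma[:-1].sum(axis=0)[:, None]
    B_new = np.stack([gamma[obs == k].sum(axis=0) for k in range(B.shape[1])],
                     axis=1) / gamma.sum(axis=0)[:, None]
    return pi_new, A_new, B_new, likelihood

pi = np.array([0.6, 0.4])
A = np.array([[0.7, 0.3], [0.4, 0.6]])
B = np.array([[0.1, 0.4, 0.5], [0.6, 0.3, 0.1]])
obs = [0, 1, 2, 0, 2, 1]
pi2, A2, B2, lik = baum_welch_step(pi, A, B, obs)
```

By the EM guarantee, repeating this step never decreases the likelihood of the observation sequence, though it may converge to a local rather than global optimum.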
Illustrative Weather Example
Consider a simple HMM in which the hidden states are the weather (Rainy, Sunny) and the observations are a person's daily activity (walk, shop, clean):
Hidden States: Rainy, Sunny.
Observations: walk, shop, clean.
Start probabilities: Rainy = 0.6, Sunny = 0.4.
Transition probabilities: Rainy→Rainy = 0.7, Rainy→Sunny = 0.3; Sunny→Rainy = 0.4, Sunny→Sunny = 0.6.
Emission probabilities: Rainy→walk = 0.1, shop = 0.4, clean = 0.5; Sunny→walk = 0.6, shop = 0.3, clean = 0.1.
Python Implementation
<code>from hmmlearn import hmm
import numpy as np
# Define model parameters
states = ["Sunny", "Rainy"]
observations = ["Short-sleeved", "Long-sleeved"]
initial_probabilities = np.array([0.9, 0.1])
transition_probabilities = np.array([
    [0.8, 0.2],   # Sunny -> Sunny, Sunny -> Rainy
    [0.3, 0.7]    # Rainy -> Sunny, Rainy -> Rainy
])
emission_probabilities = np.array([
    [0.7, 0.3],   # Sunny: Short-sleeved, Long-sleeved
    [0.4, 0.6]    # Rainy: Short-sleeved, Long-sleeved
])
# Create and configure the HMM (n_trials=1: each sample is a one-hot count vector)
model = hmm.MultinomialHMM(n_components=2, n_trials=1)
model.startprob_ = initial_probabilities
model.transmat_ = transition_probabilities
model.emissionprob_ = emission_probabilities
# Observation sequence: Short‑sleeved, Long‑sleeved, Long‑sleeved
obs_seq = np.array([
    [1, 0],  # Short-sleeved
    [0, 1],  # Long-sleeved
    [0, 1]   # Long-sleeved
])
# Decode the most likely hidden state sequence using Viterbi
logprob, state_seq = model.decode(obs_seq)
print("Observation sequence:", [observations[np.argmax(o)] for o in obs_seq])
print("Most likely weather sequence:", [states[i] for i in state_seq])
</code>The output shows that the most likely weather sequence for the three observations is Sunny → Sunny → Sunny.
hmmlearn Library Overview
The hmmlearn library implements various HMMs, including:
hmm.CategoricalHMM – discrete emissions.
hmm.GaussianHMM – Gaussian emissions.
hmm.GMMHMM – Gaussian‑mixture emissions.
hmm.MultinomialHMM – multinomial emissions.
hmm.PoissonHMM – Poisson emissions.
vhmm.VariationalCategoricalHMM – variational inference for categorical emissions.
vhmm.VariationalGaussianHMM – variational inference for Gaussian emissions.
Models are built by passing parameters to the constructor and can generate samples with sample(). Training uses the EM algorithm; the iteration limit is set by n_iter, and convergence is monitored via the monitor_ attribute.
Earthquake Case Study
The following example fits a Poisson HMM to yearly earthquake counts (1900‑2006). The data are first visualized, and models with different numbers of hidden states are then trained.
<code>import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import poisson
from hmmlearn import hmm
earthquakes = np.array([...]) # data omitted for brevity
fig, ax = plt.subplots()
ax.plot(earthquakes, ".-", ms=6, mfc="orange", alpha=0.7)
ax.set_xticks(range(0, earthquakes.size, 10))
ax.set_xticklabels(range(1906, 2007, 10))
ax.set_xlabel('Year')
ax.set_ylabel('Count')
fig.show()
</code>Several Poisson HMMs with 1‑4 components are trained, each with ten random initializations. The best model is selected by the highest log‑likelihood.
<code>scores = []
models = []
for n_components in range(1, 5):
    for idx in range(10):
        model = hmm.PoissonHMM(n_components=n_components, random_state=idx, n_iter=10)
        model.fit(earthquakes[:, None])
        models.append(model)
        scores.append(model.score(earthquakes[:, None]))
        print(f'Converged: {model.monitor_.converged}\tScore: {scores[-1]}')

best_model = models[np.argmax(scores)]
print(f'The best model had a score of {max(scores)} and {best_model.n_components} components')
</code>The Viterbi algorithm predicts the most likely hidden state sequence, and the results are visualized alongside the original earthquake counts.
These examples demonstrate how HMMs can be applied to real‑world sequential data such as weather patterns and seismic activity.
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".