
A Comprehensive Guide to Dimensionality Reduction Algorithms with Python Implementations

This article introduces eleven classic dimensionality reduction techniques—including PCA, LDA, MDS, LLE, and t‑SNE—explains their principles, advantages, and limitations, and provides complete Python code examples and resources for each method, making it a valuable guide for beginners in machine learning and data mining.

Python Programming Learning Circle

Implementations of dimensionality reduction algorithms are scattered across the web, often without source code. This article presents a GitHub project that implements 11 classic dimensionality reduction algorithms in Python, including PCA, LDA, MDS, LLE, and t‑SNE, together with documentation and visualized results, making it well suited to machine‑learning beginners and newcomers to data mining.

Why Perform Dimensionality Reduction?

Dimensionality reduction replaces a \(D\)-dimensional vector \(X\) with a \(d\)-dimensional vector \(Z\) (where \(d < D\)) while preserving the useful information. Real‑world datasets often have hundreds or thousands of dimensions, leading to the "curse of dimensionality"; reducing dimensions makes data easier to use, reduces redundancy among variables, lowers computational cost, and removes noise. It is widely applied in text processing, face recognition, image classification, and natural language processing.

Principles of Dimensionality Reduction

High‑dimensional data are usually sparse; reduction removes redundant, invalid, or duplicate information. For example, a 1024×1024 image with a 50×50 non‑zero center can discard the zero‑valued regions. Classical reduction methods are divided into linear and nonlinear approaches; nonlinear methods further split into kernel‑based and eigen‑value‑based techniques.

Linear methods: PCA, ICA, LDA, LFA, LPP (linear representation of LE)

Non‑linear kernel‑based: KPCA, KICA, KDA

Non‑linear eigen‑value‑based (manifold learning): ISOMAP, LLE, LE, LTSA, MVU
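The sparse-image example above can be sketched in a few lines of NumPy (the array sizes follow the text; the cropping logic is a minimal illustration, not part of the original repository):

```python
import numpy as np

# A hypothetical 1024x1024 image whose only non-zero content
# sits in a 50x50 block at the center, as described above.
img = np.zeros((1024, 1024))
img[487:537, 487:537] = 1.0

# Keep only the rows and columns that contain non-zero pixels.
rows = np.any(img != 0, axis=1)
cols = np.any(img != 0, axis=0)
cropped = img[np.ix_(rows, cols)]

print(img.size)      # 1048576 values stored
print(cropped.size)  # 2500 values carry all the information
```

Discarding the all-zero regions keeps the informative 50×50 block while shrinking storage by a factor of more than 400.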

Principal Component Analysis (PCA) Algorithm

PCA is a fundamental unsupervised linear reduction method that projects data onto directions of maximum variance (or minimum reconstruction error). Proposed by Karl Pearson in 1901, it follows the maximum variance theory or minimum error theory.

The algorithm steps are:

1. Input: data matrix \(X_{m\times n}\).
2. Center the data: \(X_{new}=X - \text{mean}(X)\).
3. Compute the covariance matrix of \(X_{new}\).
4. Obtain the eigenvalues and eigenvectors of the covariance matrix.
5. Sort the eigenvalues in descending order and select the top \(k\) eigenvectors to form the matrix \(W_{n\times k}\).
6. Project the data: \(X_{new}W\) yields the reduced dataset.

The minimum‑error perspective seeks the linear projection that minimizes average projection cost.
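The six steps can be condensed into a short NumPy sketch (the helper name `pca_reduce` is illustrative, not from the repository):

```python
import numpy as np

def pca_reduce(X, k):
    """Condensed PCA: center, covariance, eigendecomposition, sort, project."""
    X_centered = X - X.mean(axis=0)          # step 2: center the data
    cov = np.cov(X_centered, rowvar=False)   # step 3: covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)   # step 4: eigh, since cov is symmetric
    order = np.argsort(eigvals)[::-1]        # step 5: sort descending
    W = eigvecs[:, order[:k]]                #         keep top-k eigenvectors
    return X_centered @ W                    # step 6: project

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
Z = pca_reduce(X, 2)
print(Z.shape)  # (100, 2)
```

Because the eigenvalues are sorted in descending order, the first output column always carries at least as much variance as the second.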

Detailed steps can be found in "From Scratch Implementing PCA" (https://blog.csdn.net/u013719780/article/details/78352262).

PCA Code Implementation (Python)

from sklearn import datasets
import matplotlib.pyplot as plt
import numpy as np
# %matplotlib inline  # uncomment when running inside a Jupyter notebook

def shuffle_data(X, y, seed=None):
    if seed is not None:
        np.random.seed(seed)
    idx = np.arange(X.shape[0])
    np.random.shuffle(idx)
    return X[idx], y[idx]

# Normalize dataset X
def normalize(X, axis=-1, p=2):
    lp_norm = np.atleast_1d(np.linalg.norm(X, p, axis))
    lp_norm[lp_norm == 0] = 1
    return X / np.expand_dims(lp_norm, axis)

# Standardize dataset X
def standardize(X):
    X_std = np.zeros(X.shape)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    for col in range(np.shape(X)[1]):
        if std[col]:
            X_std[:, col] = (X[:, col] - mean[col]) / std[col]
    return X_std

# Split dataset into train and test sets
def train_test_split(X, y, test_size=0.2, shuffle=True, seed=None):
    if shuffle:
        X, y = shuffle_data(X, y, seed)
    n_train_samples = int(X.shape[0] * (1 - test_size))
    x_train, x_test = X[:n_train_samples], X[n_train_samples:]
    y_train, y_test = y[:n_train_samples], y[n_train_samples:]
    return x_train, x_test, y_train, y_test

# Compute covariance matrix of X
def calculate_covariance_matrix(X, Y=np.empty((0, 0))):
    # Default to the auto-covariance of X when Y is not supplied
    if Y.size == 0:
        Y = X
    n_samples = np.shape(X)[0]
    covariance_matrix = (1 / (n_samples-1)) * (X - X.mean(axis=0)).T.dot(Y - Y.mean(axis=0))
    return np.array(covariance_matrix, dtype=float)

# Compute variance of each column of X
def calculate_variance(X):
    n_samples = np.shape(X)[0]
    # Use n - 1 for consistency with calculate_covariance_matrix
    variance = (1 / (n_samples - 1)) * np.diag((X - X.mean(axis=0)).T.dot(X - X.mean(axis=0)))
    return variance

# Compute standard deviation of each column of X
def calculate_std_dev(X):
    std_dev = np.sqrt(calculate_variance(X))
    return std_dev

# Compute correlation matrix
def calculate_correlation_matrix(X, Y=np.empty((0, 0))):
    # Default to the auto-correlation of X when Y is not supplied
    if Y.size == 0:
        Y = X
    covariance_matrix = calculate_covariance_matrix(X, Y)
    std_dev_X = np.expand_dims(calculate_std_dev(X), 1)
    std_dev_y = np.expand_dims(calculate_std_dev(Y), 1)
    correlation_matrix = np.divide(covariance_matrix, std_dev_X.dot(std_dev_y.T))
    return np.array(correlation_matrix, dtype=float)

class PCA():
    """Principal Component Analysis (unsupervised)"""
    def __init__(self, k=2):
        self.eigen_values = None
        self.eigen_vectors = None
        self.k = k  # number of principal components to keep
    def transform(self, X):
        covariance = calculate_covariance_matrix(X)
        # The covariance matrix is symmetric, so eigh is the appropriate solver
        self.eigen_values, self.eigen_vectors = np.linalg.eigh(covariance)
        # Sort eigenvalues in descending order and keep the top k eigenvectors
        idx = self.eigen_values.argsort()[::-1]
        eigenvectors = self.eigen_vectors[:, idx][:, :self.k]
        # Project the centered data onto the principal axes
        X_transformed = (X - X.mean(axis=0)).dot(eigenvectors)
        return X_transformed

def main():
    data = datasets.load_iris()
    X = data.data
    y = data.target
    X_trans = PCA().transform(X)
    x1 = X_trans[:, 0]
    x2 = X_trans[:, 1]
    cmap = plt.get_cmap('viridis')
    colors = [cmap(i) for i in np.linspace(0, 1, len(np.unique(y)))]
    class_distr = []
    for i, l in enumerate(np.unique(y)):
        _x1 = x1[y == l]
        _x2 = x2[y == l]
        class_distr.append(plt.scatter(_x1, _x2, color=colors[i]))
    plt.legend(class_distr, data.target_names, loc=1)
    plt.xlabel('Principal Component 1')
    plt.ylabel('Principal Component 2')
    plt.show()

if __name__ == "__main__":
    main()

The resulting plot shows the two‑dimensional representation of the Iris dataset. While PCA is widely used, it has limitations: it captures only linear (second‑order) correlations, and it assumes the principal directions are orthogonal, so it performs poorly when the important structure in the data is nonlinear or non‑orthogonal.

Other Dimensionality Reduction Algorithms and Resources

KPCA (Kernel PCA) : Extends PCA by applying a kernel function before computing the covariance matrix, enabling nonlinear mappings. Resources and code: https://github.com/heucoder/dimensionality_reduction_alo_codes/blob/master/codes/PCA/KPCA.py
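A minimal sketch of the idea using scikit-learn's `KernelPCA` (not the repository's own implementation); the concentric-circles dataset is a standard demonstration because no linear projection can separate it:

```python
from sklearn.datasets import make_circles
from sklearn.decomposition import KernelPCA

# Two concentric circles: linearly inseparable, a classic KPCA demo.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# The RBF kernel maps the data implicitly before the PCA step.
kpca = KernelPCA(n_components=2, kernel='rbf', gamma=10)
X_kpca = kpca.fit_transform(X)
print(X_kpca.shape)  # (200, 2)
```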

LDA (Linear Discriminant Analysis) : Maximizes between‑class variance while minimizing within‑class variance, useful for classification tasks. Resources and code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/LDA
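A quick sketch with scikit-learn's `LinearDiscriminantAnalysis` (the repository provides its own version); note that LDA, unlike PCA, is supervised and uses the class labels:

```python
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)

# LDA can produce at most (number of classes - 1) components;
# Iris has 3 classes, so 2 components is the maximum.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)
print(X_lda.shape)  # (150, 2)
```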

MDS (Multidimensional Scaling) : Preserves pairwise distances when projecting to lower dimensions. Both iterative and non‑iterative implementations are provided. Code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/MDS
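The iterative variant can be sketched with scikit-learn's `MDS`, which minimizes the "stress" between original and embedded pairwise distances via the SMACOF algorithm (a stand-in for the repository's code):

```python
from sklearn.datasets import load_iris
from sklearn.manifold import MDS

X, _ = load_iris(return_X_y=True)

# Metric MDS searches iteratively for a 2-D layout whose pairwise
# distances match the original ones as closely as possible.
mds = MDS(n_components=2, random_state=0)
X_mds = mds.fit_transform(X)
print(X_mds.shape)  # (150, 2)
```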

ISOMAP : Improves MDS by constructing a neighborhood graph to preserve geodesic distances on manifolds. Code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/ISOMAP
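A sketch with scikit-learn's `Isomap` on the swiss roll, where geodesic distance along the manifold differs sharply from straight-line distance (again a stand-in for the repository's implementation):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

# The swiss roll is a 2-D sheet curled up in 3-D.
X, _ = make_swiss_roll(n_samples=500, random_state=0)

# Isomap builds a 10-nearest-neighbor graph and approximates geodesic
# distances by shortest paths in that graph before applying MDS.
iso = Isomap(n_neighbors=10, n_components=2)
X_iso = iso.fit_transform(X)
print(X_iso.shape)  # (500, 2)
```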

LLE (Locally Linear Embedding) : Reconstructs each point from its neighbors and preserves these local linear relationships in the low‑dimensional embedding. Code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/LLE
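A minimal sketch via scikit-learn's `LocallyLinearEmbedding` (the repository ships its own version):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=500, random_state=0)

# Each point is reconstructed as a weighted sum of its 10 nearest
# neighbors; the embedding preserves those reconstruction weights.
lle = LocallyLinearEmbedding(n_neighbors=10, n_components=2, random_state=0)
X_lle = lle.fit_transform(X)
print(X_lle.shape)  # (500, 2)
```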

t‑SNE : Non‑linear technique especially suited for visualizing high‑dimensional data in 2‑D or 3‑D. Code and examples are linked in the article.
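A small sketch with scikit-learn's `TSNE` on a subset of the handwritten-digits dataset (the subsample size of 500 is only to keep the run fast):

```python
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# 8x8 digit images, i.e. 64-dimensional points.
X, y = load_digits(return_X_y=True)
X, y = X[:500], y[:500]

# perplexity balances attention between local and global structure;
# values between 5 and 50 are typical.
tsne = TSNE(n_components=2, perplexity=30, random_state=0)
X_tsne = tsne.fit_transform(X)
print(X_tsne.shape)  # (500, 2)
```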

LE (Laplacian Eigenmaps) : Similar to LLE, it builds a graph based on local neighborhoods and solves a generalized eigenproblem to obtain the embedding. Code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/LE
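In scikit-learn, Laplacian Eigenmaps is available as `SpectralEmbedding`; a minimal sketch (standing in for the repository's code):

```python
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import SpectralEmbedding

X, _ = make_swiss_roll(n_samples=500, random_state=0)

# SpectralEmbedding builds a nearest-neighbor graph and embeds the data
# using eigenvectors of the graph Laplacian, as Laplacian Eigenmaps does.
le = SpectralEmbedding(n_components=2, n_neighbors=10, random_state=0)
X_le = le.fit_transform(X)
print(X_le.shape)  # (500, 2)
```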

LPP (Locality Preserving Projections) : Linear counterpart of LE that seeks a projection matrix preserving local structure. Code: https://github.com/heucoder/dimensionality_reduction_alo_codes/tree/master/codes/LPP
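scikit-learn has no LPP class, so here is a hand-rolled NumPy/SciPy sketch of the generalized eigenproblem it solves (the helper name `lpp` and the parameters `t` and `reg` are illustrative assumptions, not from the repository):

```python
import numpy as np
from scipy.spatial.distance import cdist
from scipy.linalg import eigh

def lpp(X, k=2, n_neighbors=10, t=1.0, reg=1e-6):
    """Locality Preserving Projections: solve X^T L X a = lam X^T D X a
    and keep the eigenvectors with the smallest eigenvalues."""
    n = X.shape[0]
    dist = cdist(X, X)
    # Heat-kernel affinities restricted to k-nearest-neighbor pairs.
    W = np.exp(-dist**2 / t)
    knn = np.argsort(dist, axis=1)[:, 1:n_neighbors + 1]
    mask = np.zeros((n, n), dtype=bool)
    rows = np.repeat(np.arange(n), n_neighbors)
    mask[rows, knn.ravel()] = True
    mask |= mask.T                      # symmetrize the neighbor graph
    W = np.where(mask, W, 0.0)
    D = np.diag(W.sum(axis=1))
    L = D - W                           # graph Laplacian
    A = X.T @ L @ X
    B = X.T @ D @ X + reg * np.eye(X.shape[1])  # ridge keeps B positive definite
    eigvals, eigvecs = eigh(A, B)       # generalized eigenproblem, ascending
    return X @ eigvecs[:, :k]

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
Z = lpp(X, k=2)
print(Z.shape)  # (200, 2)
```

Because LPP learns an explicit projection matrix, new points can be mapped by the same `X @ eigvecs[:, :k]` product, unlike LE, which embeds only the training set.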

The project author, Heucoder, is a master's student at Harbin Institute of Technology, active in the internet community, and maintains the GitHub repository https://github.com/heucoder/dimensionality_reduction_alo_codes where all the above code can be accessed.

Written by Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
