
Python Implementation of DBSCAN and KMeans for Point Cloud Clustering and Tracking with Hungarian Matching

This article presents a Python project that reads point‑cloud data from CSV files, applies DBSCAN and KMeans clustering, extracts cluster features, and uses the Hungarian algorithm to match clusters across frames for tracking, complete with full source code and result visualization.


The project demonstrates how to process point‑cloud data collected by a sensor, where each frame contains a variable number of (x, y, z) points, by performing unsupervised clustering with DBSCAN and KMeans and then tracking clusters over time using Hungarian matching.

Data is stored in a CSV file with the columns Frame #, X, Y, and Z; each row represents one point belonging to a specific frame, and frame numbers increase down the file as new frames are appended.
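The article's example rows are not reproduced here, so the snippet below uses invented values that match the described schema, loaded from an in-memory string in place of the real file (whose name is not given):

```python
import io

import pandas as pd

# Hypothetical sample rows following the article's schema: Frame #, X, Y, Z.
csv_text = """Frame #,X,Y,Z
1,0.12,-0.45,1.03
1,0.15,-0.40,1.01
2,0.13,-0.44,1.05
2,0.18,-0.39,0.98
"""

# In the real project this would be pd.read_csv('<your file>.csv').
data = pd.read_csv(io.StringIO(csv_text))
print(data['Frame #'].unique())  # frames present in the sample
```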

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler


def adaption_frame(data, frame_start, frame_end, num_threshold=1000):
    """Accumulate points from consecutive frames until num_threshold points are collected."""
    data_x = np.empty(0)
    data_y = np.empty(0)
    data_z = np.empty(0)
    for target_frame in range(frame_start, frame_end):
        table_data = data[data['Frame #'] == target_frame]
        data_x = np.concatenate((data_x, table_data['X'].values))
        data_y = np.concatenate((data_y, table_data['Y'].values))
        data_z = np.concatenate((data_z, table_data['Z'].values))
        if data_x.shape[0] > num_threshold:
            break
    return data_x, data_y, data_z


def valid_data(data_x, data_y, data_z):
    """Keep only points inside the [-5, 5] cube on every axis."""
    condition = ((data_x >= -5) & (data_x <= 5) &
                 (data_y >= -5) & (data_y <= 5) &
                 (data_z >= -5) & (data_z <= 5))
    return data_x[condition], data_y[condition], data_z[condition]


def draw_data_origin(data_x, data_y, data_z, frame=1):
    """Scatter-plot the raw point cloud in 3D."""
    fig = plt.figure()
    ax = fig.add_subplot(111, projection='3d')
    ax.scatter(data_x, data_y, data_z, s=0.1)
    ax.set_xlabel('X')
    ax.set_ylabel('Y')
    ax.set_zlabel('Z')
    ax.set_title(f'Point Cloud at Frame {frame}')
    plt.show()


def dbscan(data_x, data_y, data_z):
    """Standardize the points and run DBSCAN; noise points are labeled -1."""
    data_input = np.column_stack((data_x, data_y, data_z))
    data_scaled = StandardScaler().fit_transform(data_input)
    model = DBSCAN(eps=0.3, min_samples=5)
    labels = model.fit_predict(data_scaled)
    num_clusters = len(set(labels)) - (1 if -1 in labels else 0)
    return num_clusters, labels
```
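A minimal, self-contained sketch of the same standardize-then-DBSCAN step on synthetic data (the two blobs below are invented for illustration and mirror what the article's dbscan() helper does internally):

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Two well-separated synthetic blobs inside the [-5, 5] valid region.
blob_a = rng.normal(loc=(-2.0, -2.0, -1.0), scale=0.1, size=(100, 3))
blob_b = rng.normal(loc=(2.0, 2.0, 1.0), scale=0.1, size=(100, 3))
points = np.vstack((blob_a, blob_b))

# Same steps as the article's dbscan() helper: standardize, then cluster.
scaled = StandardScaler().fit_transform(points)
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(scaled)
num_clusters = len(set(labels)) - (1 if -1 in labels else 0)
print(num_clusters)  # → 2
```

Note that eps=0.3 is applied in standardized space, so the right value depends on how spread out the scaled data is, not on the raw sensor units.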

```python
from sklearn.cluster import KMeans
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist


def cluster_kmeans(value, data_x, data_y, data_z):
    """Fit KMeans with a fixed number of clusters (value)."""
    points = np.column_stack((data_x, data_y, data_z))
    return KMeans(n_clusters=value, random_state=0).fit(points)


def extract_feature(K, labels_order, data_x, data_y, data_z):
    """Compute per-cluster features: centroid, point count, and cluster index.

    labels_order[i] is expected to be an index array selecting the points
    assigned to cluster i.
    """
    features = []
    for i in range(K):
        cluster_mean = np.array([np.mean(data_x[labels_order[i]]),
                                 np.mean(data_y[labels_order[i]]),
                                 np.mean(data_z[labels_order[i]])])
        cluster_points_size = labels_order[i].shape[0]
        features.append([cluster_mean, cluster_points_size, i])
    return features


def hungarian_match(features_last, features_now):
    """Associate clusters across frames by minimizing a combined cost of
    centroid distance (weighted x10) and point-count difference."""
    centers_last = np.array([f[0] for f in features_last])
    counts_last = np.array([f[1] for f in features_last])
    centers_now = np.array([f[0] for f in features_now])
    counts_now = np.array([f[1] for f in features_now])
    distance_matrix = cdist(centers_last, centers_now)
    cost_matrix = np.abs(counts_last[:, np.newaxis] - counts_now) + distance_matrix * 10
    row_ind, col_ind = linear_sum_assignment(cost_matrix)
    return [(features_last[r], features_now[c]) for r, c in zip(row_ind, col_ind)]
```
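To see what the matching produces, here is a self-contained sketch with two hand-made feature lists (each entry is [centroid, point count, cluster index], as in extract_feature); the cost below is the same as in hungarian_match, so each previous-frame cluster should pair with the current-frame cluster whose centroid moved least:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Invented features: [centroid, point count, cluster index].
features_last = [
    [np.array([0.0, 0.0, 0.0]), 100, 0],
    [np.array([3.0, 3.0, 0.0]), 80, 1],
]
features_now = [
    [np.array([3.1, 3.0, 0.1]), 82, 0],   # last cluster 1, slightly moved
    [np.array([0.1, -0.1, 0.0]), 98, 1],  # last cluster 0, slightly moved
]

# Same cost as hungarian_match(): count difference + 10x centroid distance.
centers_last = np.array([f[0] for f in features_last])
counts_last = np.array([f[1] for f in features_last])
centers_now = np.array([f[0] for f in features_now])
counts_now = np.array([f[1] for f in features_now])
cost = np.abs(counts_last[:, np.newaxis] - counts_now) + cdist(centers_last, centers_now) * 10
row_ind, col_ind = linear_sum_assignment(cost)
matches = [(features_last[r][2], features_now[c][2]) for r, c in zip(row_ind, col_ind)]
print(matches)  # → [(0, 1), (1, 0)]
```

The weight of 10 on distance means a 1-unit centroid shift costs as much as a 10-point change in cluster size; tuning this ratio trades off position stability against size stability.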

The main script reads the CSV and iteratively loads frames, filters out-of-range points, and runs DBSCAN to obtain an initial cluster count. It then switches to KMeans with that fixed number of clusters, extracts each cluster's mean position and size, applies the Hungarian algorithm to associate clusters between consecutive frames, and finally visualizes the tracked clusters in a 3D plot.
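The main script itself is not reproduced above; the following is a hedged, self-contained sketch of how the pieces could be wired together, using a synthetic DataFrame in place of the real CSV (column names follow the article's schema, and the blob positions are invented):

```python
import numpy as np
import pandas as pd
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)

def make_frame(frame_id, centers):
    """Build synthetic rows for one frame: a small blob around each center."""
    rows = []
    for cx, cy, cz in centers:
        for x, y, z in rng.normal(loc=(cx, cy, cz), scale=0.05, size=(50, 3)):
            rows.append({'Frame #': frame_id, 'X': x, 'Y': y, 'Z': z})
    return rows

# Two frames; each blob drifts slightly between them.
data = pd.DataFrame(make_frame(1, [(-2, -2, 0), (2, 2, 1)]) +
                    make_frame(2, [(-2, -1.9, 0), (2.1, 2, 1)]))

def frame_features(frame_id, k=2):
    """Cluster one frame with KMeans; return [centroid, count, index] per cluster."""
    points = data[data['Frame #'] == frame_id][['X', 'Y', 'Z']].values
    labels = KMeans(n_clusters=k, random_state=0, n_init=10).fit_predict(points)
    return [[points[labels == i].mean(axis=0), int((labels == i).sum()), i]
            for i in range(k)]

f1, f2 = frame_features(1), frame_features(2)

# Hungarian association with the article's cost: count diff + 10x centroid distance.
c1 = np.array([f[0] for f in f1]); n1 = np.array([f[1] for f in f1])
c2 = np.array([f[0] for f in f2]); n2 = np.array([f[1] for f in f2])
cost = np.abs(n1[:, np.newaxis] - n2) + cdist(c1, c2) * 10
row_ind, col_ind = linear_sum_assignment(cost)
for r, c in zip(row_ind, col_ind):
    print(f'frame-1 cluster {r} -> frame-2 cluster {c}')
```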

Result images show the matched clusters (e.g., red and green) over time, illustrating how the algorithm aligns point‑cloud groups across frames for tracking purposes.

Tags: Clustering, Data Processing, Point Cloud, DBSCAN, Hungarian Algorithm, KMeans
Written by

Python Programming Learning Circle

A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.
