Python Implementation of DBSCAN and KMeans for Point Cloud Clustering and Tracking with Hungarian Matching
This article presents a Python project that reads point‑cloud data from CSV files, applies DBSCAN and KMeans clustering, extracts cluster features, and uses the Hungarian algorithm to match clusters across frames for tracking, complete with full source code and result visualization.
The project demonstrates how to process point‑cloud data collected by a sensor, where each frame contains a variable number of (x, y, z) points, by performing unsupervised clustering with DBSCAN and KMeans and then tracking clusters over time using Hungarian matching.
Data is stored in a CSV file with the columns Frame #, X, Y, and Z; each row is one point belonging to the frame named in its first column. In the example rows, frame numbers increase down the file and the points of each frame are listed one per row.
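To make the layout concrete, here is a minimal sketch that parses a few rows in this format and counts the points per frame. The sample values and the helper names `points_per_frame` and `frame_points` are made up for illustration and are not part of the original project.

```python
import io

import pandas as pd

# Hypothetical sample in the described layout; the numbers are invented.
SAMPLE_CSV = """Frame #,X,Y,Z
1,0.12,1.05,0.33
1,0.15,1.01,0.30
1,-2.40,0.77,1.10
2,0.14,1.08,0.31
2,-2.35,0.80,1.12
"""


def points_per_frame(csv_text):
    """Return a Series mapping each frame number to its point count."""
    df = pd.read_csv(io.StringIO(csv_text))
    return df.groupby("Frame #").size()


def frame_points(csv_text, frame):
    """Return the (n, 3) array of X, Y, Z values for one frame."""
    df = pd.read_csv(io.StringIO(csv_text))
    return df.loc[df["Frame #"] == frame, ["X", "Y", "Z"]].to_numpy()


counts = points_per_frame(SAMPLE_CSV)
print(counts)
print(frame_points(SAMPLE_CSV, 2))
```

Because the number of points varies from frame to frame, selecting rows by frame number (rather than assuming a fixed block size) is the safer way to slice the file.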
    import numpy as np
    import pandas as pd
    import matplotlib.pyplot as plt
    from sklearn.cluster import DBSCAN
    from sklearn.preprocessing import StandardScaler


    def adaption_frame(data, frame_start, frame_end, num_threshold=1000):
        # Accumulate points frame by frame until more than num_threshold are collected.
        data_x = np.empty(0)
        data_y = np.empty(0)
        data_z = np.empty(0)
        for target_frame in range(frame_start, frame_end):
            table_data = data[data['Frame #'] == target_frame]
            data_x = np.concatenate((data_x, table_data['X'].values), axis=0)
            data_y = np.concatenate((data_y, table_data['Y'].values), axis=0)
            data_z = np.concatenate((data_z, table_data['Z'].values), axis=0)
            if data_x.shape[0] > num_threshold:
                break
        return data_x, data_y, data_z


    def valid_data(data_x, data_y, data_z):
        # Keep only points inside the [-5, 5] box on every axis.
        condition = (
            (data_x >= -5) & (data_x <= 5)
            & (data_y >= -5) & (data_y <= 5)
            & (data_z >= -5) & (data_z <= 5)
        )
        return data_x[condition], data_y[condition], data_z[condition]


    def draw_data_origin(data_x, data_y, data_z):
        # 3D scatter plot of the raw points.
        fig = plt.figure()
        ax = fig.add_subplot(111, projection='3d')
        ax.scatter(data_x, data_y, data_z, s=0.1)
        ax.set_xlabel('X')
        ax.set_ylabel('Y')
        ax.set_zlabel('Z')
        ax.set_title(f'Point Cloud at Frame {1}')
        plt.show()


    def dbscan(data_x, data_y, data_z):
        # Standardize the coordinates first, so eps is measured in scaled units.
        data_input = np.column_stack((data_x, data_y, data_z))
        data_scaled = StandardScaler().fit_transform(data_input)
        model = DBSCAN(eps=0.3, min_samples=5)
        labels = model.fit_predict(data_scaled)
        # The label -1 marks noise and is not counted as a cluster.
        num_clusters = len(set(labels)) - (1 if -1 in labels else 0)
        return num_clusters, labels
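As an illustration of how the DBSCAN step counts clusters, the sketch below runs the same scaling and the same `eps=0.3`, `min_samples=5` settings on synthetic data. The two blobs and the single outlier are invented test data, not sensor output.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for one batch of points: two tight blobs and one stray point.
rng = np.random.default_rng(0)
blob_a = np.array([0.0, 0.0, 0.0]) + 0.03 * rng.standard_normal((50, 3))
blob_b = np.array([3.0, 3.0, 3.0]) + 0.03 * rng.standard_normal((50, 3))
outlier = np.array([[-4.0, 4.0, 0.0]])
points = np.vstack((blob_a, blob_b, outlier))

# Same preprocessing and parameters as the dbscan() function in the article.
scaled = StandardScaler().fit_transform(points)
labels = DBSCAN(eps=0.3, min_samples=5).fit_predict(scaled)

# Noise points carry the label -1 and are excluded from the cluster count.
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
n_noise = int(np.sum(labels == -1))
print(n_clusters, n_noise)
```

With blobs this well separated, the run should report two clusters and one noise point. Note that `eps` applies to the standardized coordinates, so its effective size in the original units changes with the spread of each batch.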
    import numpy as np
    from sklearn.cluster import KMeans
    from scipy.optimize import linear_sum_assignment
    from scipy.spatial.distance import cdist


    def cluster_kmeans(value, data_x, data_y, data_z):
        # Fit KMeans with `value` clusters on the stacked (x, y, z) points.
        points = np.hstack((data_x.reshape(-1, 1), data_y.reshape(-1, 1), data_z.reshape(-1, 1)))
        kmeans = KMeans(n_clusters=value, random_state=0).fit(points)
        return kmeans


    def extract_feature(K, labels_order, data_x, data_y, data_z):
        # labels_order[i] holds the indices of the points assigned to cluster i.
        features = []
        for i in range(K):
            x_mean = np.mean(data_x[labels_order[i]])
            y_mean = np.mean(data_y[labels_order[i]])
            z_mean = np.mean(data_z[labels_order[i]])
            cluster_mean = np.hstack((x_mean, y_mean, z_mean))
            cluster_points_size = labels_order[i].shape[0]
            # Each feature is [centroid, point count, cluster index].
            features.append([cluster_mean, cluster_points_size, i])
        return features


    def hungarian_match(features_last, features_now):
        centers_last = np.array([f[0] for f in features_last])
        counts_last = np.array([f[1] for f in features_last])
        centers_now = np.array([f[0] for f in features_now])
        counts_now = np.array([f[1] for f in features_now])
        # Cost = difference in point count + 10 x centroid distance.
        distance_matrix = cdist(centers_last, centers_now)
        cost_matrix = np.abs(counts_last[:, np.newaxis] - counts_now) + distance_matrix * 10
        # Solve the assignment that minimizes the total cost.
        row_ind, col_ind = linear_sum_assignment(cost_matrix)
        matches = [(features_last[r], features_now[c]) for r, c in zip(row_ind, col_ind)]
        return matches
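A small worked example may help show what the cost matrix in `hungarian_match` does. The centroids and point counts below are invented; the second frame lists the same two clusters in swapped order, which is the situation the matching is meant to resolve.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist

# Hypothetical features of two clusters in the previous frame.
centers_last = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]])
counts_last = np.array([100, 40])

# The same two clusters one frame later, listed in the opposite order.
centers_now = np.array([[3.1, 3.0, 3.0], [0.1, 0.0, 0.0]])
counts_now = np.array([42, 98])

# Same cost as in the article: count difference plus 10 x centroid distance.
cost_matrix = np.abs(counts_last[:, np.newaxis] - counts_now) + 10 * cdist(centers_last, centers_now)

# Hungarian matching: pick one column per row so the summed cost is minimal.
row_ind, col_ind = linear_sum_assignment(cost_matrix)
for r, c in zip(row_ind, col_ind):
    print(f"last[{r}] -> now[{c}] (cost {cost_matrix[r, c]:.1f})")
```

Here the off-diagonal pairings cost about 3 each, while the diagonal ones cost over 100, so the solver maps `last[0]` to `now[1]` and `last[1]` to `now[0]`. The factor of 10 sets how strongly centroid distance is weighted against the point-count difference; it is a tuning choice rather than a fixed rule.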
The main script reads the CSV, loads frames iteratively, and discards points outside the [-5, 5] range on any axis. It runs DBSCAN to obtain an initial cluster count, then switches to KMeans with a fixed number of clusters and extracts each cluster's mean position and size. Finally, it applies the Hungarian algorithm to associate clusters between consecutive frames and visualizes the tracked clusters in a 3D plot.
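The flow described above can be sketched roughly as follows. This is not the author's main script: it uses synthetic frames instead of the CSV, takes the cluster count `k` as given rather than deriving it from DBSCAN, skips plotting, and the names `make_frame`, `cluster_features`, and `track` are hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from scipy.spatial.distance import cdist
from sklearn.cluster import KMeans


def make_frame(rng, centers, n_per_cluster=60, spread=0.05):
    """Synthetic stand-in for one sensor frame (invented data)."""
    return np.vstack([c + spread * rng.standard_normal((n_per_cluster, 3)) for c in centers])


def cluster_features(points, k):
    """Run KMeans and return the centroid and point count of each cluster."""
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(points)
    counts = np.bincount(km.labels_, minlength=k)
    return km.cluster_centers_, counts


def track(frames, k):
    """Give every cluster in every frame a track ID that persists over time."""
    history = []
    prev_centers = prev_counts = prev_ids = None
    for points in frames:
        # Keep only points inside the [-5, 5] box, as valid_data does.
        inside = np.all(np.abs(points) <= 5, axis=1)
        centers, counts = cluster_features(points[inside], k)
        if prev_centers is None:
            ids = np.arange(k)  # first frame: start a new track per cluster
        else:
            # Same cost as hungarian_match: count difference + 10 x distance.
            cost = np.abs(prev_counts[:, None] - counts) + 10 * cdist(prev_centers, centers)
            rows, cols = linear_sum_assignment(cost)
            ids = np.empty(k, dtype=int)
            ids[cols] = prev_ids[rows]  # inherit the ID of the matched cluster
        history.append({int(i): c for i, c in zip(ids, centers)})
        prev_centers, prev_counts, prev_ids = centers, counts, ids
    return history


rng = np.random.default_rng(0)
frames = [
    make_frame(rng, [np.array([0.0, 0.0, 0.0]), np.array([3.0, 3.0, 3.0])]),
    make_frame(rng, [np.array([3.2, 3.0, 3.0]), np.array([0.2, 0.0, 0.0])]),
]
history = track(frames, k=2)
for frame_no, tracks in enumerate(history):
    print(frame_no, {tid: np.round(c, 2) for tid, c in tracks.items()})
```

KMeans numbers its clusters arbitrarily in each frame, so the raw labels cannot be compared across frames. Carrying the track ID through the assignment is what keeps each cluster's identity stable from one frame to the next.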
Result images show the matched clusters (e.g., red and green) over time, illustrating how the algorithm aligns point‑cloud groups across frames for tracking purposes.
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.