Comprehensive List of Aggregation Functions and Custom Feature Engineering Utilities for Python
This article presents a detailed collection of built‑in pandas aggregation methods and numerous custom Python functions for time‑series feature engineering, offering beginners practical tools to enhance data preprocessing and model performance in machine‑learning projects.
Recently, readers asked about effective feature engineering techniques for beginners, especially aggregation operations that can improve early results in competitions.
The article lists built‑in pandas aggregation functions such as mean(), sum(), size(), count(), std(), var(), sem(), first(), last(), nth(), min(), and max(), and mentions other important aggregation utilities.
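As a minimal sketch of how several of these built‑in aggregations can be applied per group, the example below uses a small made‑up table; the column names customer_id and amount are hypothetical and do not come from the article. It also shows the difference between size (counts all rows) and count (counts non‑missing values only).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: one missing value for customer "b".
df = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b", "b"],
    "amount": [10.0, 20.0, 30.0, 5.0, np.nan, 15.0],
})

# Built-in aggregations can be requested by name in a single agg() call.
summary = df.groupby("customer_id")["amount"].agg(
    ["mean", "sum", "size", "count", "std", "min", "max", "first", "last"]
)
print(summary)
```

For customer "b", size reports 3 rows while count reports 2, because the missing value is skipped by count and by the numeric statistics.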
It also provides a collection of custom Python functions for time‑series feature extraction, including statistical measures (median, variation_coefficient, variance, skewness, kurtosis, standard_deviation, large_standard_deviation), drawdown and drawup calculations, peak detection, count and ratio metrics, and many helper utilities such as count_above, count_below, number_peaks, mean_abs_change, root_mean_square, abs_energy, and others.
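As a rough illustration of what a few of the helper utilities named above might look like, here is a minimal sketch. These definitions are assumptions inferred from the function names, not the article's own code; in particular, count_above is assumed here to take a threshold argument and return a raw count rather than a ratio.

```python
import numpy as np

def mean_abs_change(x):
    """Mean of absolute differences between consecutive values (assumed definition)."""
    x = np.asarray(x, dtype=float)
    if x.size < 2:
        return np.nan  # no consecutive pair to compare
    return np.mean(np.abs(np.diff(x)))

def root_mean_square(x):
    """Square root of the mean of squared values (assumed definition)."""
    x = np.asarray(x, dtype=float)
    if x.size == 0:
        return np.nan
    return np.sqrt(np.mean(np.square(x)))

def abs_energy(x):
    """Sum of squared values (assumed definition)."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(x, x))

def count_above(x, threshold=0.0):
    """Number of values strictly greater than threshold (assumed signature)."""
    x = np.asarray(x, dtype=float)
    return int(np.sum(x > threshold))

values = [1.0, 2.0, 4.0]
print(mean_abs_change(values), root_mean_square(values),
      abs_energy(values), count_above(values, threshold=1.0))
```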
All functions are presented in raw code format, preserving their definitions for direct reuse:
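import numpy as np

def median(x):
    # Middle value of the series
    return np.median(x)

def variation_coefficient(x):
    # Standard deviation relative to the mean; undefined when the mean is zero
    mean = np.mean(x)
    if mean != 0:
        return np.std(x) / mean
    else:
        return np.nan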
# ... (additional function definitions omitted for brevity) ...
Additionally, the article shows how to group these functions into logical categories for feature generation, e.g.,
base_stats = ['mean', 'sum', 'size', 'count', 'std', 'first', 'last', 'min', 'max', median, skewness, kurtosis]
higher_order_stats = [abs_energy, root_mean_square, sum_values, realized_volatility, realized_abs_skew, realized_skew, realized_vol_skew, realized_quarticity]
# ... (other groupings) ...
Reference: https://www.kaggle.com/code/lucasmorin/amex-feature-engineering-2-aggreg-functions
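Lists that mix built‑in aggregation names (strings) with custom callables, like the groupings above, can be passed directly to a pandas groupby agg() call. The sketch below assumes a hypothetical table with customer_id and amount columns, uses shortened versions of the lists, and re‑defines two helpers so that it runs on its own; the abs_energy body is an assumed definition, not taken from the article.

```python
import numpy as np
import pandas as pd

def median(x):
    return np.median(x)

def abs_energy(x):
    # Assumed definition: sum of squared values
    return float(np.sum(np.square(x)))

# Shortened, illustrative versions of the groupings
base_stats = ["mean", "sum", "min", "max", median]
higher_order_stats = [abs_energy]

# Hypothetical toy data
df = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "amount": [1.0, 2.0, 4.0, 3.0, 5.0],
})

# Strings resolve to built-in aggregations; callables are applied per group
# and their __name__ becomes the column name.
features = df.groupby("customer_id")["amount"].agg(base_stats + higher_order_stats)

# Prefix with the source column so names stay unique across many columns.
features.columns = [f"amount_{name}" for name in features.columns]
print(features)
```

Each custom function in a single list needs a distinct name, since pandas uses the function name as the output column label.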
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.