
Comprehensive List of Aggregation Functions and Custom Feature Engineering Utilities for Python

This article presents a detailed collection of built‑in pandas aggregation methods and numerous custom Python functions for time‑series feature engineering, offering beginners practical tools to enhance data preprocessing and model performance in machine‑learning projects.

Python Programming Learning Circle

Readers recently asked about effective feature engineering techniques for beginners, especially aggregation operations that can improve results early in a competition.

The article lists built‑in pandas aggregation functions such as mean(), sum(), size(), count(), std(), var(), sem(), first(), last(), nth(), min(), and max(), and mentions other important aggregation utilities.
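As a minimal sketch of how these built‑ins are typically used, the example below applies them to a grouped column. The DataFrame, its column names (customer_id, amount), and its values are made up for illustration and do not come from the original article.

```python
import pandas as pd

# Hypothetical toy data: two customers with a few observations each.
df = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "amount": [10.0, 20.0, 30.0, 5.0, 15.0],
})

grouped = df.groupby("customer_id")["amount"]

# String names are dispatched to the matching built-in GroupBy methods.
features = grouped.agg(
    ["mean", "sum", "size", "count", "std", "var", "sem",
     "first", "last", "min", "max"]
)

# nth() needs a position argument, so it is called directly.
# Here it picks the second observation (position 1) of each group.
second_obs = grouped.nth(1)

print(features)
```

Each string in the list becomes one output column, so features has one row per customer and eleven columns.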

It also provides a collection of custom Python functions for time‑series feature extraction, including statistical measures (median, variation_coefficient, variance, skewness, kurtosis, standard_deviation, large_standard_deviation), drawdown and drawup calculations, peak detection, count and ratio metrics, and many helper utilities such as count_above, count_below, number_peaks, mean_abs_change, root_mean_square, abs_energy, and others.

All functions are presented in raw code format, preserving their definitions for direct reuse:

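import numpy as np

def median(x):
    # Median of the values in x.
    return np.median(x)

def variation_coefficient(x):
    # Standard deviation divided by the mean; NaN when the mean is zero.
    mean = np.mean(x)
    if mean != 0:
        return np.std(x) / mean
    else:
        return np.nan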

# ... (additional function definitions omitted for brevity) ...
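To give a feel for the helper utilities named above, here is a hedged sketch of three of them: mean_abs_change, root_mean_square, and abs_energy. These bodies are assumptions based on the usual mathematical meaning of each name. They are not copied from the article, whose own definitions are omitted and may differ in detail.

```python
import numpy as np

def mean_abs_change(x):
    # Assumed definition: average absolute difference between
    # consecutive values of the series.
    x = np.asarray(x, dtype=float)
    return np.mean(np.abs(np.diff(x)))

def root_mean_square(x):
    # Assumed definition: square root of the mean of squared values.
    x = np.asarray(x, dtype=float)
    return np.sqrt(np.mean(np.square(x)))

def abs_energy(x):
    # Assumed definition: sum of squared values.
    x = np.asarray(x, dtype=float)
    return np.sum(np.square(x))

series = [1, 2, 4, 7]
print(mean_abs_change(series))   # 2.0
print(abs_energy(series))        # 70.0
print(root_mean_square(series))
```

Each function takes a one-dimensional sequence and returns a single number, which is the shape an aggregation function needs.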

Additionally, the article shows how to group these functions into logical categories for feature generation. The lists mix string names for pandas built‑ins with the custom function objects, for example:

base_stats = ['mean', 'sum', 'size', 'count', 'std', 'first', 'last', 'min', 'max',
              median, skewness, kurtosis]
higher_order_stats = [abs_energy, root_mean_square, sum_values, realized_volatility,
                      realized_abs_skew, realized_skew, realized_vol_skew, realized_quarticity]
# ... (other groupings) ...
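One plausible way to use such groupings is to concatenate the lists and pass them to a pandas groupby aggregation. The sketch below assumes that usage. The toy DataFrame, its column names, the shortened lists, and the body of root_mean_square are illustrative assumptions, not the article's code.

```python
import numpy as np
import pandas as pd

def median(x):
    return np.median(x)

def root_mean_square(x):
    # Assumed definition: square root of the mean of squared values.
    return np.sqrt(np.mean(np.square(x)))

# Shortened, hypothetical versions of the groupings: strings select
# pandas built-ins, bare names are the custom callables defined above.
base_stats = ["mean", "min", "max", median]
higher_order_stats = [root_mean_square]

# Hypothetical toy data.
df = pd.DataFrame({
    "customer_id": ["a", "a", "a", "b", "b"],
    "amount": [10.0, 20.0, 30.0, 5.0, 15.0],
})

# One output column per entry; custom callables are labelled by
# their function name.
features = df.groupby("customer_id")["amount"].agg(base_stats + higher_order_stats)
print(features)
```

Keeping the groupings as plain lists makes it easy to switch a whole family of features on or off when comparing models.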

Reference: https://www.kaggle.com/code/lucasmorin/amex-feature-engineering-2-aggreg-functions

Tags: feature engineering, data science, time series, pandas, aggregation functions