
Top 10 New Features in Scikit‑learn 0.24

The article reviews the most important additions in scikit‑learn 0.24, including faster hyper‑parameter search methods, ICE plots, histogram‑based boosting improvements, new feature‑selection tools, polynomial‑feature approximations, a semi‑supervised classifier, MAPE metric, enhanced OneHotEncoder and OrdinalEncoder handling, and a more flexible RFE interface.

Python Programming Learning Circle

Since its first release in 2007, scikit‑learn has become a cornerstone Python library for machine learning, offering classification, regression, dimensionality reduction, clustering, feature extraction, data preprocessing, and model evaluation.

Its strengths include comprehensive documentation, a consistent and widely adopted API, a large collection of algorithms (including wrappers for LIBSVM and LIBLINEAR), and many built‑in datasets that save users time.

Version 0.24, released in 2021, introduces several noteworthy features:

1. Faster hyper‑parameter selection

HalvingGridSearchCV and HalvingRandomSearchCV mirror the interfaces of GridSearchCV and RandomizedSearchCV but use successive halving: candidates compete tournament‑style on growing resource budgets, and only the best advance to the next round, dramatically reducing computational cost. Use them when the search space is large or model training is slow; otherwise, stick with GridSearchCV.

Import the experimental classes before use:

from sklearn.experimental import enable_halving_search_cv
from sklearn.model_selection import HalvingGridSearchCV, HalvingRandomSearchCV
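A minimal usage sketch, assuming a synthetic dataset and an illustrative RandomForestClassifier grid (the data, grid values, and factor=2 are example choices, not from the original article):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.experimental import enable_halving_search_cv  # noqa: F401 (enables the class)
from sklearn.model_selection import HalvingGridSearchCV

X, y = make_classification(n_samples=400, n_features=10, random_state=0)
param_grid = {"max_depth": [3, 5, None], "n_estimators": [10, 50]}

# factor=2: each successive-halving round keeps roughly the best half
# of the remaining candidates and gives them more samples to train on
search = HalvingGridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    factor=2,
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)
```

The fitted object exposes the same attributes as GridSearchCV (best_params_, best_score_, cv_results_), so it can usually be swapped in with no other code changes.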

2. ICE plots

Partial dependence plots (PDP) were introduced in 0.23; version 0.24 adds Individual Conditional Expectation (ICE) plots, which display the dependence of the prediction on a feature for each individual sample. Use kind='individual' in plot_partial_dependence to draw ICE curves, or kind='both' to overlay PDP and ICE.
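The same kind parameter is available on the underlying partial_dependence function, which returns the raw per‑sample curves that an ICE plot draws. A small sketch on synthetic regression data (the dataset and model are illustrative assumptions):

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.inspection import partial_dependence

X, y = make_regression(n_samples=100, n_features=4, random_state=0)
model = RandomForestRegressor(random_state=0).fit(X, y)

# kind="individual" returns one curve per sample -- the data behind an ICE plot
ice = partial_dependence(model, X, features=[0], kind="individual")
print(ice["individual"].shape)  # (n_outputs, n_samples, n_grid_points)
```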

3. Histogram‑based boosting improvements

Inspired by LightGBM, HistGradientBoostingRegressor and HistGradientBoostingClassifier now accept a categorical_features argument, allowing direct handling of categorical data without one‑hot encoding, reducing training time and often improving performance. Missing values are also natively supported.

from sklearn.ensemble import HistGradientBoostingRegressor

# Boolean mask: the first feature is categorical, the second is numeric
model = HistGradientBoostingRegressor(
    categorical_features=[True, False]
)

4. Forward feature selection

The SequentialFeatureSelector performs forward selection by iteratively adding the most valuable feature until a stopping criterion is met, without requiring the underlying estimator to expose coef_ or feature_importances_. Because each candidate feature is evaluated with cross‑validation, it can be slower than RFE.
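A short sketch on the iris dataset, using a KNeighborsClassifier as the scoring estimator (both are illustrative choices; note that KNN has neither coef_ nor feature_importances_, which is exactly the case this selector handles):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Greedy forward selection: add one feature at a time until 2 are kept
sfs = SequentialFeatureSelector(
    KNeighborsClassifier(), n_features_to_select=2, direction="forward"
)
sfs.fit(X, y)
print(sfs.get_support())  # boolean mask over the 4 original features
```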

5. Fast approximation of polynomial features

The new PolynomialCountSketch estimator from the kernel_approximation module provides a memory‑ and time‑efficient alternative to PolynomialFeatures, generating a fixed number of sketch features (100 by default) that approximate high‑order interactions.
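Unlike PolynomialFeatures, whose output width explodes combinatorially with the input dimension, the sketch output width is fixed by n_components. A minimal sketch with assumed synthetic data:

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import PolynomialCountSketch

X, y = make_classification(n_samples=200, n_features=20, random_state=0)

# Approximate a degree-2 polynomial kernel with a fixed 100-column sketch
pcs = PolynomialCountSketch(degree=2, n_components=100, random_state=0)
Xt = pcs.fit_transform(X)
print(Xt.shape)  # (200, 100)
```

The transformed features are typically fed to a fast linear model (e.g. SGDClassifier) as a cheap stand‑in for an exact polynomial kernel SVM.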

6. SelfTrainingClassifier for semi‑supervised learning

This meta‑classifier wraps any supervised classifier that can output class probabilities, allowing it to learn from unlabeled data. Unlabeled samples must be marked with -1 in the target vector.
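A sketch of the workflow, hiding most iris labels behind -1 to simulate unlabeled data (the dataset, the 70% masking rate, and the SVC base estimator are illustrative assumptions; SVC needs probability=True so it can emit the class probabilities self‑training relies on):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.semi_supervised import SelfTrainingClassifier
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Hide ~70% of the labels: unlabeled samples are marked with -1
rng = np.random.default_rng(0)
y_partial = y.copy()
y_partial[rng.random(len(y)) < 0.7] = -1

model = SelfTrainingClassifier(SVC(probability=True, random_state=0))
model.fit(X, y_partial)
print(model.score(X, y))
```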

7. Mean Absolute Percentage Error (MAPE)

The new mean_absolute_percentage_error function provides a scale‑independent regression metric whose values are comparable across different problems, complementing R‑squared.
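The metric averages |error| / |true value| over the samples, so it is unaffected by the scale of the target. A worked example with made‑up values:

```python
from sklearn.metrics import mean_absolute_percentage_error

y_true = [100.0, 200.0, 300.0]
y_pred = [110.0, 180.0, 300.0]

# Per-sample relative errors: 10/100, 20/200, 0/300 -> mean of (0.1, 0.1, 0.0)
mape = mean_absolute_percentage_error(y_true, y_pred)
print(mape)  # 0.0666...
```

Note that the result is a fraction, not a percentage: 0.0666 means roughly 6.7%.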

8. OneHotEncoder supports missing values

When handle_unknown='ignore' and the training data contain np.nan, the encoder creates an extra column to represent missing values.
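A tiny sketch with a made‑up color column: the missing value gets its own indicator column alongside the observed categories.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# One categorical column with a missing entry
X = np.array([["red"], ["blue"], [np.nan]], dtype=object)

enc = OneHotEncoder(handle_unknown="ignore")
Xt = enc.fit_transform(X).toarray()
print(Xt.shape)  # 3 columns: "blue", "red", and one for the missing value
```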

9. OrdinalEncoder can handle unseen categories in test data

Set handle_unknown='use_encoded_value' together with an unknown_value (an integer not used in the training encoding, or np.nan) to safely encode categories that appear only in the test set.
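A minimal sketch with made‑up animal categories, where a value unseen during fitting appears at transform time:

```python
import numpy as np
from sklearn.preprocessing import OrdinalEncoder

X_train = np.array([["cat"], ["dog"]], dtype=object)
X_test = np.array([["cat"], ["fish"]], dtype=object)  # "fish" was never seen

# Unseen categories are mapped to the sentinel -1 instead of raising an error
enc = OrdinalEncoder(handle_unknown="use_encoded_value", unknown_value=-1)
enc.fit(X_train)
print(enc.transform(X_test))  # "cat" -> 0.0, unseen "fish" -> -1.0
```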

10. RFE accepts a proportion of features to retain

Passing a float between 0 and 1 to n_features_to_select lets Recursive Feature Elimination keep a specified percentage of the original features, simplifying programmatic feature reduction.
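A short sketch with an assumed 20‑feature synthetic dataset, keeping 25% of the features (i.e. 5 of 20):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=100, n_features=20, random_state=0)

# A float n_features_to_select is read as a fraction of the input features
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=0.25)
rfe.fit(X, y)
print(rfe.n_features_)  # 5
```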

The original article was published on Towards Data Science.

Tags: machine learning, Python, model evaluation, data preprocessing, feature selection, scikit-learn