A Survey of Python Libraries for Hyperparameter Optimization, Feature Selection, Model Explainability, and Rapid Machine Learning Development
This article introduces several Python libraries—including Optuna, ITMO_FS, shap‑hypertune, PyCaret, floWeaver, Gradio, Terality, and torch‑handle—that simplify hyperparameter tuning, feature selection, model explainability, visualization, and low‑code ML workflows, providing code examples and key advantages for each tool.
Optuna is an open-source hyperparameter optimization framework that automatically searches for the best hyperparameters of a machine-learning model using a Bayesian optimization algorithm called the Tree-structured Parzen Estimator (TPE). It offers a more efficient alternative to sklearn's GridSearchCV and works with any ML library, such as TensorFlow, Keras, or PyTorch.
ITMO_FS is a feature‑selection library for ML models that provides six categories of algorithms (supervised filters, unsupervised filters, wrappers, hybrid, embedded, and ensemble) and helps avoid over‑fitting by reducing the number of features; a typical usage example is shown below.
<code>>>> from sklearn.datasets import make_classification
>>> from sklearn.linear_model import SGDClassifier
>>> from ITMO_FS.embedded import MOS
>>> X, y = make_classification(n_samples=300, n_features=10, random_state=0, n_informative=2)
>>> sel = MOS()
>>> trX = sel.fit_transform(X, y, smote=False)
>>> cl1 = SGDClassifier()
>>> cl1.fit(X, y)
>>> cl1.score(X, y)
0.9033333333333333
>>> cl2 = SGDClassifier()
>>> cl2.fit(trX, y)
>>> cl2.score(trX, y)
0.9433333333333334</code>
shap-hypertune combines SHAP (SHapley Additive exPlanations) model explainability with hyperparameter tuning, allowing simultaneous selection of informative features and optimal hyperparameters via grid, random, or Bayesian search, though it currently supports only gradient-boosting models.
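To illustrate the idea of searching over features and hyperparameters at the same time, here is a conceptual sketch built with scikit-learn rather than shap-hypertune's own API (which uses SHAP importances and gradient-boosting models); the pipeline, parameter grid, and dataset are all illustrative.

```python
# Conceptual sketch: joint feature selection + hyperparameter search.
# This uses plain scikit-learn, NOT shap-hypertune's actual API.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

X, y = make_classification(n_samples=300, n_features=10,
                           n_informative=2, random_state=0)

# Step 1 drops weak features by importance; step 2 fits the final model.
pipe = Pipeline([
    ("select", SelectFromModel(GradientBoostingClassifier(random_state=0))),
    ("model", GradientBoostingClassifier(random_state=0)),
])

# The grid searches selection threshold and model size together.
grid = {"select__threshold": ["mean", "median"],
        "model__n_estimators": [50, 100]}
search = GridSearchCV(pipe, grid, cv=3).fit(X, y)
print(search.best_params_)
```

shap-hypertune packages this same pattern, but ranks features by SHAP values instead of impurity-based importances.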
PyCaret is a low‑code, open‑source Python library that automates the entire ML workflow—including data loading, preprocessing, model comparison, creation of interactive apps, API generation and Docker packaging—with just a few lines of code, as illustrated in the examples.
<code># load dataset
from pycaret.datasets import get_data
diabetes = get_data('diabetes')
# init setup
from pycaret.classification import *
clf1 = setup(data = diabetes, target = 'Class variable')
# compare models
best = compare_models()
</code>
floWeaver generates Sankey diagrams from flow-type datasets, useful for visualising conversion funnels, marketing journeys, or budget allocations; the input format is "source × target × value", and a single line of code creates the diagram.
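The "source × target × value" flow table can be sketched with pandas (the column names and funnel numbers below are illustrative); floweaver would then render this table as a Sankey diagram.

```python
# Sketch of the flow-format table floWeaver consumes: one row per flow.
import pandas as pd

flows = pd.DataFrame({
    "source": ["visit",  "visit",  "signup"],
    "target": ["signup", "bounce", "purchase"],
    "value":  [120,      380,      45],
})

# Each row is one flow: 120 visitors converted to signup, 380 bounced,
# and 45 of the signups went on to purchase.
total_out_of_visit = flows.loc[flows["source"] == "visit", "value"].sum()
print(total_out_of_visit)
```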
Gradio provides an easy way to build front‑end interfaces for ML models by specifying input types, functions and outputs, enabling rapid prototyping and free hosting on Hugging Face, making model interaction far simpler than building a Flask app.
Terality offers a Pandas‑compatible API that compiles operations to Spark, delivering 10‑100× speed‑ups, parallel execution, and off‑loading of computation to the cloud, though the free tier limits usage to 1 TB per month.
torch‑handle abstracts repetitive PyTorch training code, allowing users to define a model, dataset, optimizer and scheduler in a few lines and run training sessions automatically, with built‑in reporting and TensorBoard integration.
<code>from collections import OrderedDict

import torch
from torchhandle.workflow import BaseContext


class Net(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.layer = torch.nn.Sequential(OrderedDict([
            ('l1', torch.nn.Linear(10, 20)),
            ('a1', torch.nn.ReLU()),
            ('l2', torch.nn.Linear(20, 10)),
            ('a2', torch.nn.ReLU()),
            ('l3', torch.nn.Linear(10, 1))
        ]))

    def forward(self, x):
        return self.layer(x)


num_samples, num_features = int(1e4), int(1e1)
# Target shaped (N, 1) to match the model output and avoid MSELoss broadcasting.
X, Y = torch.rand(num_samples, num_features), torch.rand(num_samples, 1)
dataset = torch.utils.data.TensorDataset(X, Y)
trn_loader = torch.utils.data.DataLoader(dataset, batch_size=64, num_workers=0, shuffle=True)
loaders = {"train": trn_loader, "valid": trn_loader}
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# Model, loss, optimizer, and scheduler are declared as config dicts;
# per-parameter learning rates override the optimizer default.
model = {"fn": Net}
criterion = {"fn": torch.nn.MSELoss}
optimizer = {"fn": torch.optim.Adam,
             "args": {"lr": 0.1},
             "params": {"layer.l1.weight": {"lr": 0.01},
                        "layer.l1.bias": {"lr": 0.02}}}
scheduler = {"fn": torch.optim.lr_scheduler.StepLR,
             "args": {"step_size": 2, "gamma": 0.9}}

c = BaseContext(model=model,
                criterion=criterion,
                optimizer=optimizer,
                scheduler=scheduler,
                context_tag="ex01")
train = c.make_train_session(device, dataloader=loaders)
train.train(epochs=10)</code>
Python Programming Learning Circle
A global community of Chinese Python developers offering technical articles, columns, original video tutorials, and problem sets. Topics include web full‑stack development, web scraping, data analysis, natural language processing, image processing, machine learning, automated testing, DevOps automation, and big data.