Introduction to Lasso Regression with scikit-learn
This article provides a comprehensive guide to Lasso regression, covering its theoretical background, scikit-learn API parameters, step‑by‑step Python implementation, cross‑validation for hyper‑parameter tuning, visualization of predictions, and a discussion of its advantages over ridge regression.
The article introduces Lasso regression as an L1‑regularized linear model, explains its role in feature selection and over‑fitting mitigation, and contrasts it with ridge regression (L2 regularization).
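To make the L1-vs-L2 contrast concrete, here is a minimal sketch on synthetic data (not the article's dataset): with the same regularization strength, Lasso drives the coefficients of uninformative features exactly to zero, while Ridge only shrinks them toward zero.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic data: 10 features, only the first 3 are truly informative
rng = np.random.RandomState(0)
X = rng.randn(100, 10)
true_coef = np.zeros(10)
true_coef[:3] = [3.0, -2.0, 1.5]
y = X @ true_coef + 0.1 * rng.randn(100)

lasso = Lasso(alpha=0.1).fit(X, y)
ridge = Ridge(alpha=0.1).fit(X, y)

# Lasso zeroes out noise features; Ridge keeps them small but non-zero
n_zero_lasso = int(np.sum(lasso.coef_ == 0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0))
print(n_zero_lasso, n_zero_ridge)
```

This exact-zero behavior is what makes Lasso usable as an embedded feature-selection method.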
It then presents the key scikit-learn Lasso class parameters in a concise table: alpha (regularization strength), max_iter (maximum number of iterations), and warm_start (reuse the previous solution as the starting point for the next fit).
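The three parameters can be seen together in a short sketch (synthetic data, not the article's CSV): warm_start=True lets each refit start from the previous solution, which is a common pattern when sweeping alpha from large to small.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.RandomState(0)
X = rng.randn(50, 5)
y = 2.0 * X[:, 0] + 0.1 * rng.randn(50)

# warm_start=True reuses the previous coefficients as initialization,
# so successive fits along a decreasing alpha path converge faster
model = Lasso(alpha=1.0, max_iter=10000, warm_start=True)
for alpha in [1.0, 0.1, 0.01]:
    model.set_params(alpha=alpha)
    model.fit(X, y)
    print(alpha, int(np.sum(model.coef_ != 0)))
```

Larger alpha produces a sparser model; as alpha shrinks, more coefficients become non-zero.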
Code example 1: Importing libraries

```python
# coding=utf-8
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# train_test_split now lives in sklearn.model_selection;
# the old sklearn.cross_validation module has been removed
from sklearn.model_selection import train_test_split
from sklearn.linear_model import Lasso, LassoCV
```
Code example 2: Loading the data and splitting it into train and test sets

```python
data = pd.read_csv('D:\\hua.cao\\python\\20180606\\Folds5x2_pp.csv')
X = data[['AT', 'V', 'AP', 'RH']]
Y = data['PE']  # 1-D target avoids a column-vector warning from fit()
X_TRAIN, X_TEST, Y_TRAIN, Y_TEST = train_test_split(X, Y, random_state=1)
```
Code example 3: Training a Lasso model

```python
lasso = Lasso(alpha=0.01)
lasso.fit(X_TRAIN, Y_TRAIN)
```
Code example 4: Making predictions

```python
Y_PRED = lasso.predict(X_TEST)
```
Code example 5: Inspecting the fitted model

```python
print(lasso.coef_)
print(lasso.intercept_)
```
From the printed coefficients and intercept, the fitted regression equation is PE = -1.9687·AT - 0.2393·V + 0.0566·AP - 0.1586·RH + 460.35.
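An equation string like the one above can be assembled directly from the fitted coef_ and intercept_ attributes. The sketch below uses synthetic stand-in data (the article's CSV is not bundled here), reusing the article's feature names AT, V, AP, RH for illustration.

```python
import numpy as np
from sklearn.linear_model import Lasso

# Synthetic stand-in with coefficients roughly like the article's result
rng = np.random.RandomState(1)
feature_names = ['AT', 'V', 'AP', 'RH']
X = rng.randn(200, 4)
y = X @ np.array([-1.97, -0.24, 0.06, -0.16]) + 460.0 + 0.1 * rng.randn(200)

lasso = Lasso(alpha=0.01).fit(X, y)

# Build "PE = c1*AT c2*V ... + intercept" from the fitted attributes
terms = ' '.join(f'{c:+.4f}*{n}' for c, n in zip(lasso.coef_, feature_names))
equation = f'PE = {terms} {lasso.intercept_:+.2f}'
print(equation)
```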
Code example 6: Cross-validation with LassoCV

```python
X1 = data[['AT', 'V', 'AP', 'RH']]
Y1 = data['PE']
lassocv = LassoCV()
lassocv.fit(X1, Y1)
print(lassocv.alpha_)              # optimal regularization strength
print(lassocv.coef_)
print(np.sum(lassocv.coef_ != 0))  # number of non-zero coefficients
```
This block automatically selects the optimal alpha and reports the number of non‑zero features.
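LassoCV also accepts an explicit alpha grid and fold count rather than the defaults. A minimal sketch on synthetic data (assumed values, not from the article):

```python
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.RandomState(0)
X = rng.randn(100, 8)
y = X[:, 0] - 2.0 * X[:, 1] + 0.1 * rng.randn(100)

# 50 candidate alphas from 1e-3 to 10, scored with 5-fold cross-validation
lassocv = LassoCV(alphas=np.logspace(-3, 1, 50), cv=5).fit(X, y)
print(lassocv.alpha_)              # alpha that minimized mean CV error
print(np.sum(lassocv.coef_ != 0))  # features surviving at that alpha
```

Tightening the grid around the selected alpha_ is a cheap way to refine the choice.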
Code example 7: Visualizing predictions

```python
fig, ax = plt.subplots()
ax.scatter(Y_TEST, Y_PRED)
ax.plot([Y_TEST.min(), Y_TEST.max()], [Y_TEST.min(), Y_TEST.max()], 'k--', lw=4)
ax.set_xlabel('Measured')
ax.set_ylabel('Predicted')
plt.show()
```
A scatter plot with a diagonal reference line illustrates how closely the predicted values match the measured ones.
The article concludes by summarizing that Lasso regression shrinks some coefficients to zero, enhancing model generalization and reducing computational cost compared to ridge regression, and it references further reading on norm regularization and compressed sensing.
Qunar Tech Salon
Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.