Mastering sklearn.svm: Parameters, Grid Search, and Real-World Examples
This in‑depth guide to sklearn.svm explains SVM classification and regression, details key parameters such as C and the kernel type, shows how to tune hyperparameters with GridSearchCV, and provides complete Python code examples for iris classification and California housing price prediction.
sklearn.svm module
Support vector machine (SVM) classification and regression problems can be solved in Python with scikit-learn's sklearn.svm module.
In SVM classification models, two important parameters need to be chosen: the soft‑margin penalty coefficient C and the kernel type. Kernels include linear, polynomial, radial basis function, and sigmoid. GridSearchCV can be used to search for optimal parameter combinations by providing a dictionary where keys are parameter names and values are lists of candidate values.
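For instance, a grid over C and the kernel type is just a plain dictionary whose keys match the estimator's parameter names (a minimal sketch on synthetic data; the dataset and candidate values here are illustrative, not recommendations):

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Synthetic two-class data (illustrative only)
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Keys are SVC parameter names; values are lists of candidates
param_grid = {'C': [0.1, 1, 10], 'kernel': ['linear', 'rbf']}
search = GridSearchCV(SVC(gamma='scale'), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)   # best combination found
print(search.best_score_)    # mean cross-validated accuracy of that combination
```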
The sklearn.svm classes LinearSVC and SVC implement SVM classification. The basic syntax and parameter meanings of SVC are:
C : penalty parameter C of the error term, default 1.
kernel : kernel type ('linear', 'poly', 'rbf', 'sigmoid', 'precomputed'), default 'rbf'.
degree : degree of the polynomial kernel (used only when kernel='poly'), default 3.
gamma : kernel coefficient for 'rbf', 'poly' and 'sigmoid'.
coef0 : independent term in kernel function for 'poly' and 'sigmoid'.
shrinking : whether to use the shrinking heuristic, default True.
probability : whether to enable probability estimates, default False.
tol : tolerance for stopping criteria, default 0.001.
cache_size : size of the kernel cache (MB), default 200.
class_weight : set the parameter C of class i to class_weight[i]*C.
verbose : enable verbose output, default False.
decision_function_shape : return a one‑vs‑rest ('ovr') or one‑vs‑one ('ovo') decision function, default 'ovr'.
random_state : seed of the pseudo‑random number generator.
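Putting a few of these parameters together, a typical instantiation might look like the following (the parameter values are illustrative only, not recommendations):

```python
from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# RBF-kernel classifier; probability=True slows fitting but
# enables predict_proba, class_weight='balanced' rescales C per class
clf = SVC(C=10, kernel='rbf', gamma='scale',
          probability=True, class_weight='balanced', random_state=0)
clf.fit(X, y)

# Each row of predict_proba sums to 1 across the three iris classes
print(clf.predict_proba(X[:2]))
```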
Example: Iris Classification
<code>from sklearn import datasets, svm, metrics
from sklearn.model_selection import GridSearchCV
import numpy as np
iris = datasets.load_iris()
x = iris.data; y = iris.target
parameters = {'kernel': ('linear','rbf'), 'C':[1,10,15]}
svc = svm.SVC(gamma='scale')
clf = GridSearchCV(svc, parameters, cv=5) # 5‑fold cross‑validation
clf.fit(x, y)
print("Best parameters:", clf.best_params_)
print("score:", clf.score(x, y))
yh = clf.predict(x); print(yh) # classification results
print("Accuracy:", metrics.accuracy_score(y, yh))
print("Misclassified samples:", np.where(yh != y)[0] + 1)
</code>Using a linear SVM directly:
<code>from sklearn import datasets, svm
iris = datasets.load_iris()
x = iris.data; y = iris.target
clf = svm.LinearSVC(C=1, max_iter=10000)
clf.fit(x, y); yh = clf.predict(x); print(yh)
print("Accuracy:", clf.score(x, y))
</code>Example: California Housing Prices
The LinearSVR and SVR classes in sklearn.svm implement support vector regression. The basic syntax of SVR and its parameters are:
<code>SVR(kernel='rbf', degree=3, gamma='scale', coef0=0.0, tol=0.001,
    C=1.0, epsilon=0.1, shrinking=True, cache_size=200,
    verbose=False, max_iter=-1)
</code>Here epsilon defines the width of the epsilon‑insensitive loss tube: training residuals smaller than epsilon incur no penalty. It defaults to 0.0 in LinearSVR and 0.1 in SVR. The other parameters have the same meaning as in SVC.
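The role of epsilon can be illustrated on a small one‑dimensional problem: points whose residual falls inside the tube contribute no loss and do not become support vectors, so widening the tube usually shrinks the support set (a toy sketch with synthetic data):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X = np.linspace(0, 4, 80).reshape(-1, 1)
y = np.sin(X).ravel() + 0.1 * rng.standard_normal(80)

# Same model, two tube widths
narrow = SVR(kernel='rbf', gamma='scale', epsilon=0.01).fit(X, y)
wide = SVR(kernel='rbf', gamma='scale', epsilon=0.5).fit(X, y)

# A wider insensitive tube leaves more points inside it -> fewer support vectors
print(len(narrow.support_), len(wide.support_))
```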
<code>from sklearn import datasets
from sklearn.svm import SVR
import numpy as np
np.random.seed(123)
house = datasets.fetch_california_housing()
x = house.data; y = house.target
model = SVR(gamma='auto')
print(model)
model.fit(x, y)  # fitting SVR on all 20,640 samples can take a while
pred_y = model.predict(x)
</code>References
Si Shoukui, Sun Xijing, Python Mathematical Experiments and Modeling (Python数学实验与建模)
Model Perspective
Insights, knowledge, and enjoyment from a mathematical modeling researcher and educator. Hosted by Haihua Wang, a modeling instructor and author of "Clever Use of Chat for Mathematical Modeling", "Modeling: The Mathematics of Thinking", "Mathematical Modeling Practice: A Hands‑On Guide to Competitions", and co‑author of "Mathematical Modeling: Teaching Design and Cases".