Common Machine Learning Algorithms for Data Prediction with Python Code Examples
This article introduces ten widely used machine learning algorithms for data prediction, explains their core concepts, and provides complete Python code snippets using scikit‑learn and related libraries to help readers implement regression, classification, and time‑series forecasting tasks.
Using machine learning algorithms for data prediction is a common task in data analysis. The examples below use tiny toy datasets purely to keep the code short; real projects need substantially more data for the train/test splits and accuracy numbers to be meaningful.
1. Linear Regression
Explanation: Linear regression is a basic regression algorithm suitable for predicting continuous variables.
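Before the scikit‑learn example, it can help to see what "fitting" means here: ordinary least squares solves for the coefficients directly. A minimal sketch with NumPy, using a tiny synthetic dataset assumed for illustration:

```python
# Conceptual sketch: ordinary least squares, solved as a least-squares
# problem (equivalent to the normal equation w = (X^T X)^{-1} X^T y).
import numpy as np

X = np.array([[1.0], [2.0], [3.0], [4.0]])   # toy feature (assumed)
y = np.array([3.0, 6.0, 9.0, 12.0])          # toy target: y = 3x exactly

# Add an intercept column of ones.
Xb = np.hstack([np.ones((len(X), 1)), X])

# lstsq is numerically more stable than inverting X^T X explicitly.
coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
print("intercept:", coef[0], "slope:", coef[1])
```

Because the toy target is exactly linear, the recovered slope is 3 and the intercept 0; scikit‑learn's `LinearRegression` below computes the same solution.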
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Assume feature matrix X and target y
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [3, 6, 9, 12]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = LinearRegression()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print("Linear Regression MSE:", mse)

2. Logistic Regression
Explanation: Logistic regression is a common classification algorithm for binary problems.
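The "logistic" part is the sigmoid function, which squashes a linear score into a probability for the positive class. A quick illustrative sketch (the scores below are arbitrary example values):

```python
# Logistic regression models P(y=1 | x) = sigmoid(w·x + b).
import math

def sigmoid(z):
    # Maps any real score into (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

# Large positive scores map near 1, large negative scores near 0,
# and a score of 0 maps to exactly 0.5.
print(sigmoid(4.0), sigmoid(-4.0), sigmoid(0.0))
```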
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and binary target y
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = LogisticRegression()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Logistic Regression Accuracy:", accuracy)

3. Decision Tree
Explanation: Decision trees are tree‑structured models for classification and regression, easy to interpret.
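The interpretability claim can be demonstrated directly: scikit‑learn can print the learned if/then rules. A small sketch on the same toy data used throughout this article (feature names `f0`/`f1` are made up for display):

```python
# Print the learned decision rules of a fitted tree.
from sklearn.tree import DecisionTreeClassifier, export_text

X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)
# export_text renders the tree as human-readable threshold rules.
print(export_text(tree, feature_names=["f0", "f1"]))
```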
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = DecisionTreeClassifier()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Decision Tree Accuracy:", accuracy)

4. Random Forest
Explanation: Random forest is an ensemble learning method that combines multiple decision trees to improve prediction accuracy.
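The "combines multiple trees" point is visible in the API: `n_estimators` sets how many trees are grown, and each fitted tree can be inspected individually. A sketch on toy data (the value 50 is illustrative, not a recommendation):

```python
# A random forest averages the votes of many bootstrapped trees.
from sklearn.ensemble import RandomForestClassifier

X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
# The individual trees live in forest.estimators_.
print("number of trees:", len(forest.estimators_))
```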
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = RandomForestClassifier()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Random Forest Accuracy:", accuracy)

5. Support Vector Machine
Explanation: SVM is a powerful classifier that maximizes the margin between classes, suitable for linear and non‑linear problems.
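One practical caveat worth knowing: SVMs are sensitive to feature scale, so standardizing features before the classifier is a common pattern. A sketch using a pipeline (this is a supplement to the example below, not part of it):

```python
# Standardize features, then fit an RBF-kernel SVM, in one pipeline.
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]

clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, y)
# The pipeline applies the same scaling automatically at predict time.
print(clf.predict([[1, 2], [4, 8]]))
```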
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = SVC()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("SVM Accuracy:", accuracy)

6. K‑Nearest Neighbors
Explanation: K‑NN classifies a sample based on the majority label of its K nearest neighbors.
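The majority-vote idea is simple enough to write by hand, which makes the algorithm concrete before reaching for `KNeighborsClassifier`. A sketch with k=3 on the toy data (the query points are made up):

```python
# Hand-rolled k-NN: find the k closest training points, take the
# majority label.
from collections import Counter
import math

train_X = [(1, 2), (2, 4), (3, 6), (4, 8)]
train_y = [0, 0, 1, 1]

def knn_predict(x, k=3):
    # Sort training points by Euclidean distance to the query point.
    dists = sorted(
        (math.dist(x, p), label) for p, label in zip(train_X, train_y)
    )
    # Majority vote among the k nearest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

print(knn_predict((1.5, 3.0)))
```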
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = KNeighborsClassifier()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("KNN Accuracy:", accuracy)

7. Naive Bayes
Explanation: Naive Bayes applies Bayes' theorem with the assumption of feature independence for classification.
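The underlying rule is just Bayes' theorem: the posterior is proportional to likelihood times prior. A tiny arithmetic sketch with made-up numbers, to show what the classifier computes per class:

```python
# P(class | x) ∝ P(x | class) * P(class); all numbers are illustrative.
prior = {0: 0.5, 1: 0.5}        # assumed class priors
likelihood = {0: 0.2, 1: 0.6}   # assumed P(x | class) for an observed x

# Unnormalized posteriors, then normalize so they sum to 1.
unnorm = {c: likelihood[c] * prior[c] for c in prior}
total = sum(unnorm.values())
posterior = {c: unnorm[c] / total for c in unnorm}
print(posterior)
```

GaussianNB in the example below does the same computation, with each per-feature likelihood modeled as a Gaussian.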
from sklearn.naive_bayes import GaussianNB
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = GaussianNB()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Naive Bayes Accuracy:", accuracy)

8. Neural Network (MLP)
Explanation: A multilayer perceptron (MLP) is a feedforward neural network suited to a wide range of classification and regression tasks.
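Like SVMs, MLPs work much better on scaled inputs, and in practice you usually set the hidden-layer size and iteration budget explicitly rather than relying on defaults. A sketch (the specific values here are illustrative; `lbfgs` is chosen only because it converges quickly on tiny data):

```python
# Scale features, then fit a small MLP with explicit hyperparameters.
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]

mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(16,), solver="lbfgs",
                  max_iter=2000, random_state=0),
)
mlp.fit(X, y)
print("training accuracy:", mlp.score(X, y))
```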
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = MLPClassifier()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("Neural Network Accuracy:", accuracy)

9. XGBoost
Explanation: XGBoost is a gradient‑boosted tree algorithm that combines many decision trees to improve performance.
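If the `xgboost` package is not installed, the same gradient-boosting idea, trees fitted sequentially, each correcting the errors of the previous ensemble, can be tried with scikit‑learn's built-in `GradientBoostingClassifier`. This is a stand-in for illustration, not XGBoost itself, and the hyperparameter values are arbitrary:

```python
# Gradient boosting with scikit-learn as an XGBoost-style stand-in.
from sklearn.ensemble import GradientBoostingClassifier

X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]

gb = GradientBoostingClassifier(n_estimators=50, learning_rate=0.1,
                                random_state=0).fit(X, y)
print(gb.predict([[1, 2], [4, 8]]))
```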
from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
# Assume feature matrix X and target y for classification
X = [[1, 2], [2, 4], [3, 6], [4, 8]]
y = [0, 0, 1, 1]
# Split data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Create and fit model
model = XGBClassifier()
model.fit(X_train, y_train)
# Predict and evaluate
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print("XGBoost Accuracy:", accuracy)

10. Time‑Series Forecasting (ARIMA)
Explanation: Time‑series prediction models future values from temporal dependencies in the data. ARIMA combines three pieces: the series' own past values (AR), differencing to remove trend (I), and past forecast errors (MA).
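The "I" in ARIMA(1, 1, 1) means the series is differenced once before the AR and MA terms are fitted. A quick sketch of first differencing with pandas, on a short synthetic trending series assumed for illustration:

```python
# First differencing turns a trending level series into period-to-period
# changes, which is what the AR/MA terms of ARIMA(p, 1, q) model.
import pandas as pd

series = pd.Series([10.0, 12.0, 15.0, 19.0, 24.0])  # synthetic trend
diffed = series.diff().dropna()  # change from one step to the next
print(diffed.tolist())
```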
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA
# Assume a CSV file with columns "date" and "value"
data = pd.read_csv("data.csv")
data["date"] = pd.to_datetime(data["date"])
data.set_index("date", inplace=True)
# Build and fit ARIMA model
model = ARIMA(data["value"], order=(1, 1, 1))
model_fit = model.fit()
# Forecast the next 10 points (predict with start=len(data), end=len(data)+10 would return 11 values)
future_values = model_fit.forecast(steps=10)
print("Future 10 predictions:", future_values)

These examples cover common data‑prediction algorithms and tasks, allowing readers to choose suitable models for their specific analysis needs.
Test Development Learning Exchange