Linear Regression
Linear regression is the 'Hello World' of machine learning. It's simple yet powerful, and understanding it deeply provides intuition for more complex algorithms. It models the relationship between a dependent variable and one or more independent variables.
Definition
Linear regression is a supervised learning algorithm that predicts a continuous output variable (y) based on input features (X) by fitting a linear equation: y = wx + b, where w is the weight (slope) and b is the bias (intercept).
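As a quick sketch of what "fitting a linear equation" means in practice, here is a least-squares fit on tiny made-up data using NumPy (the numbers are purely illustrative):

import numpy as np

x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([2.1, 4.9, 8.2, 10.9])  # roughly y = 3x + 2

# Stack [x, 1] so the least-squares solution is [w, b]
A = np.column_stack([x, np.ones_like(x)])
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)
print(w, b)  # close to 3 and 2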
Key Concepts
Hypothesis Function
The linear equation h(x) = wx + b that maps inputs to predictions. In multiple regression: h(x) = w₁x₁ + w₂x₂ + ... + wₙxₙ + b
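A minimal sketch of the hypothesis as one vectorized NumPy operation (the arrays are made up for illustration):

import numpy as np

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])  # 3 samples, 2 features
w = np.array([0.5, -0.25])                           # weights w₁, w₂
b = 1.0                                              # bias

h = X @ w + b  # computes w₁x₁ + w₂x₂ + b for every row at once
print(h)       # [1.  1.5 2. ]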
Loss/Cost Function
Mean Squared Error (MSE) measures how far predictions ŷ are from the actual values: J = (1/n)Σᵢ(yᵢ - ŷᵢ)²
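The formula translates directly to one line of NumPy; a tiny example with made-up values:

import numpy as np

y_true = np.array([3.0, 5.0, 7.0])
y_pred = np.array([2.5, 5.0, 8.0])

mse = np.mean((y_true - y_pred) ** 2)  # (0.25 + 0 + 1) / 3
print(mse)  # ≈ 0.4167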
Gradient Descent
Optimization algorithm that iteratively adjusts weights to minimize the loss function by moving in the direction of steepest descent.
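For MSE, the chain rule gives the gradients ∂J/∂w = -(2/n)Σᵢ xᵢ(yᵢ - ŷᵢ) and ∂J/∂b = -(2/n)Σᵢ(yᵢ - ŷᵢ). Each iteration then updates w ← w - α·∂J/∂w and b ← b - α·∂J/∂b, where α is the learning rate. These are exactly the dw and db computed in the code example below.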
Learning Rate
Hyperparameter controlling the step size in gradient descent. Too large and updates overshoot the minimum (or diverge entirely); too small and convergence is slow.
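A toy illustration of both failure modes, minimizing the one-parameter loss J(w) = w² (gradient 2w) starting from w = 1; the loss and the rates are made up purely to show the behavior:

# Gradient descent on J(w) = w², starting at w = 1.0
for lr in (0.01, 0.1, 1.1):
    w = 1.0
    for _ in range(10):
        w -= lr * 2 * w
    print(lr, w)  # 0.01 crawls toward 0, 0.1 converges quickly, 1.1 diverges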
R² Score
Coefficient of determination measuring how well the model explains the variance in the target. A score of 1.0 is a perfect fit; 0 means the model does no better than always predicting the mean (negative values are possible for models worse than that).
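A small sketch of the computation (made-up numbers):

import numpy as np

y = np.array([1.0, 2.0, 3.0, 4.0])
y_pred = np.array([1.1, 1.9, 3.2, 3.8])

ss_res = np.sum((y - y_pred) ** 2)      # residual sum of squares
ss_tot = np.sum((y - np.mean(y)) ** 2)  # total sum of squares
print(1 - ss_res / ss_tot)              # 0.98: the model explains 98% of the variance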
Real-World Applications
House Price Prediction
Real Estate: Zillow's Zestimate uses regression models considering square footage, location, bedrooms, and more to predict home values.
Sales Forecasting
Retail: Retailers predict future sales based on historical data, marketing spend, seasonality, and economic indicators.
Ad Spending Optimization
Marketing: Companies use regression to understand the ROI of different marketing channels and optimize budget allocation.
Code Example
import numpy as np
class LinearRegression:
    """
    Linear Regression from scratch using Gradient Descent
    """

    def __init__(self, learning_rate=0.01, n_iterations=1000):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None
        self.loss_history = []

    def fit(self, X, y):
        """Train the model using gradient descent"""
        n_samples, n_features = X.shape

        # Initialize parameters
        self.weights = np.zeros(n_features)
        self.bias = 0

        # Gradient descent
        for i in range(self.n_iterations):
            # Forward pass: predictions
            y_pred = np.dot(X, self.weights) + self.bias

            # Compute loss (MSE)
            loss = np.mean((y - y_pred) ** 2)
            self.loss_history.append(loss)

            # Compute gradients
            dw = -(2 / n_samples) * np.dot(X.T, (y - y_pred))
            db = -(2 / n_samples) * np.sum(y - y_pred)

            # Update parameters
            self.weights -= self.learning_rate * dw
            self.bias -= self.learning_rate * db

            if i % 100 == 0:
                print(f"Iteration {i}, Loss: {loss:.4f}")

        return self

    def predict(self, X):
        """Make predictions"""
        return np.dot(X, self.weights) + self.bias

    def r2_score(self, X, y):
        """Calculate R² score"""
        y_pred = self.predict(X)
        ss_res = np.sum((y - y_pred) ** 2)
        ss_tot = np.sum((y - np.mean(y)) ** 2)
        return 1 - (ss_res / ss_tot)
# Example usage
np.random.seed(42)
# Generate synthetic data: y = 3x + 2 + noise
X = 2 * np.random.rand(100, 1)
y = 3 * X.squeeze() + 2 + np.random.randn(100) * 0.5
# Train model
model = LinearRegression(learning_rate=0.1, n_iterations=500)
model.fit(X, y)
print(f"\nLearned weight: {model.weights[0]:.4f} (true: 3)")
print(f"Learned bias: {model.bias:.4f} (true: 2)")
print(f"R² Score: {model.r2_score(X, y):.4f}")
# Predict new values
X_new = np.array([[0], [1], [2]])
predictions = model.predict(X_new)
print(f"\nPredictions for [0, 1, 2]: {predictions}")This implementation shows the core of linear regression: initialize parameters, compute predictions, calculate loss, compute gradients, and update parameters. This same pattern applies to neural networks - just with more complex architectures!
Practice Problems
1. Implement linear regression with regularization (L1/L2)
2. Use linear regression on a real dataset (Boston Housing, or California Housing since Boston was removed from recent scikit-learn versions)
3. Compare your implementation with scikit-learn's LinearRegression (see the sketch after this list)
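For problem 3, a minimal comparison sketch against scikit-learn (assuming scikit-learn is installed; it reuses the synthetic data from the code example above):

import numpy as np
from sklearn.linear_model import LinearRegression as SKLinearRegression

np.random.seed(42)
X = 2 * np.random.rand(100, 1)
y = 3 * X.squeeze() + 2 + np.random.randn(100) * 0.5

sk_model = SKLinearRegression().fit(X, y)
print(sk_model.coef_, sk_model.intercept_)  # should land near w = 3, b = 2
print(sk_model.score(X, y))                 # R², comparable to the scratch version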
Summary
Linear regression is foundational to ML. The concepts of loss functions, gradient descent, and optimization apply to nearly all ML algorithms. Master this, and you'll have intuition for understanding complex models.