The Perceptron
The perceptron is the simplest neural network - a single neuron that can learn to classify linearly separable data. Understanding it is key to understanding how neural networks learn.

Definition

A perceptron is a single-layer neural network that takes multiple inputs, applies weights and a bias, sums them up, and passes the result through an activation function to produce an output.
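
Concretely, a single forward pass just computes a weighted sum and thresholds it. Below is a minimal sketch with made-up example values (the full training loop appears in the code example later in this lesson):

python
import numpy as np

x = np.array([1.0, 0.0])   # example inputs (made up for illustration)
w = np.array([0.5, -0.3])  # example weights
b = -0.2                   # example bias

linear = np.dot(w, x) + b          # weighted sum plus bias: 0.5 - 0.2 = 0.3
output = 1 if linear >= 0 else 0   # step activation
print(output)                      # -> 1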

Key Concepts

Neuron/Node

The basic unit that receives inputs, applies weights, sums them with a bias, and outputs through an activation function.

Weights

Parameters that determine the importance of each input. Learning adjusts these weights.

Bias

An additional parameter that shifts the activation function, allowing the neuron to fit data better.

Activation Function

A function that introduces non-linearity. The classic perceptron uses a step function; modern networks typically use sigmoid or ReLU.
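
These activations differ mainly in smoothness. A minimal sketch comparing the three on a few sample values (using NumPy, as in the code example below):

python
import numpy as np

def step(x):
    # Hard threshold: the classic perceptron activation
    return np.where(x >= 0, 1, 0)

def sigmoid(x):
    # Smooth, differentiable squashing function
    return 1 / (1 + np.exp(-x))

def relu(x):
    # Rectified linear unit, common in modern networks
    return np.maximum(0, x)

z = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])
print("step:   ", step(z))                  # [0 0 1 1 1]
print("sigmoid:", np.round(sigmoid(z), 3))  # [0.119 0.378 0.5 0.622 0.881]
print("relu:   ", relu(z))                  # [0. 0. 0. 0.5 2.]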

Real-World Applications

Early Spam Filters

Technology

Simple perceptrons were among the first ML approaches to email spam classification.

Pattern Recognition Research

Research

Frank Rosenblatt's perceptron (1958) sparked neural network research in the 1950s and 60s, work that eventually led to modern deep learning.

Code Example

python
import numpy as np

class Perceptron:
    """
    Simple Perceptron implementation
    """
    def __init__(self, learning_rate=0.1, n_iterations=100):
        self.learning_rate = learning_rate
        self.n_iterations = n_iterations
        self.weights = None
        self.bias = None

    def activation(self, x):
        """Step function"""
        return np.where(x >= 0, 1, 0)

    def fit(self, X, y):
        """Train the perceptron"""
        n_samples, n_features = X.shape

        # Initialize weights and bias
        self.weights = np.zeros(n_features)
        self.bias = 0

        for iteration in range(self.n_iterations):
            errors = 0
            for xi, yi in zip(X, y):
                # Forward pass
                linear_output = np.dot(xi, self.weights) + self.bias
                prediction = self.activation(linear_output)

                # Update rule (Perceptron learning rule)
                update = self.learning_rate * (yi - prediction)
                self.weights += update * xi
                self.bias += update

                errors += int(update != 0)

            if errors == 0:
                print(f"Converged at iteration {iteration}")
                break

        return self

    def predict(self, X):
        """Make predictions"""
        linear_output = np.dot(X, self.weights) + self.bias
        return self.activation(linear_output)

# Example: AND gate
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y = np.array([0, 0, 0, 1])  # AND

perceptron = Perceptron()
perceptron.fit(X, y)

print(f"Weights: {perceptron.weights}")
print(f"Bias: {perceptron.bias}")
print(f"Predictions: {perceptron.predict(X)}")
print(f"Expected:    {y}")

The perceptron updates weights only when it makes an error. This simple learning rule is the ancestor of backpropagation used in modern neural networks.
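
Written out, the update applied after each training example is (η is the learning rate, ŷ the prediction):

w ← w + η (y − ŷ) x
b ← b + η (y − ŷ)

When the prediction is correct, y − ŷ = 0 and the weights stay put; when it is wrong, the weights move toward classifying that example correctly.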

Practice Problems

1. Implement OR and NAND gates with a perceptron
2. Show why a single perceptron cannot learn XOR
3. Visualize the decision boundary learned by the perceptron

Summary

The perceptron introduced key concepts: weights, bias, activation, and learning through error correction. While limited to linearly separable problems, it laid the foundation for multi-layer neural networks.
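
One quick way to see the linear-separability limit is to train the Perceptron class from the code example above on XOR: it never reaches zero errors, and the final predictions are wrong for at least one input. A minimal sketch:

python
import numpy as np

# XOR is not linearly separable, so no single perceptron can fit it
X_xor = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
y_xor = np.array([0, 1, 1, 0])

p = Perceptron(n_iterations=100)  # Perceptron class defined above
p.fit(X_xor, y_xor)               # never prints "Converged"

print(f"Predictions: {p.predict(X_xor)}")  # at least one label is wrong
print(f"Expected:    {y_xor}")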