Multilayer Perceptron

Nov 3, 2024

Multilayer Perceptron (MLP)

An MLP is a type of Artificial Neural Network that consists of multiple layers of neurons. These layers allow the network to learn complex functions and patterns in data by performing computations in a series of steps. Here’s how it works, explained in key points:

1. Multiple Layers for Complex Functions

Concept: In a Multilayer Perceptron, neurons are grouped into multiple layers. Unlike a single-layer perceptron, which has only an input and an output layer and can learn only linearly separable patterns, an MLP has one or more hidden layers between them. These additional layers allow the network to learn more intricate patterns and relationships within the data.
Real-Life Example: Imagine a chef preparing a complex recipe that requires several stages (preparing ingredients, cooking, garnishing). If each stage represents a “layer” in the process, then the final dish is the result of all these layers working together. Similarly, in an MLP, each layer adds complexity to the function the network can perform, allowing it to learn from intricate patterns within the data.

2. Layers and Computation

Concept: Neurons in an MLP are organized into layers: an input layer, one or more hidden layers, and an output layer. Each layer contributes to the computation that transforms the input data step-by-step until it reaches the final output.
Real-Life Example: Think of a production line in a factory where each station performs a different part of the assembly process. Each layer in the MLP is like a station on this line, where each performs some operation on the data, passing it along to the next layer. The final product is the outcome of each station’s work, much like the final prediction or classification in the MLP.

3. Input, Hidden, and Output Layers

Concept: An MLP starts with an input layer (which takes in data), followed by hidden layers (which perform intermediate calculations), and ends with an output layer (which provides the final result).
Input Layer: This layer receives raw data, such as features in a dataset, which are fed into the network.
Hidden Layers: These layers do the “heavy lifting” by learning patterns, relationships, and transformations in the data. Each hidden layer receives the output from the previous layer, applies a function to it, and then passes it on to the next layer.
Output Layer: This layer provides the final output, such as a prediction, classification, or value.
Real-Life Example: Imagine translating a document from one language to another. First, you might identify the sentences (input layer). Then, each hidden layer could represent steps in the translation process, such as breaking down grammar, translating vocabulary, and reconstructing sentences. The final output layer would produce the fully translated document.
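To make the three kinds of layers concrete, here is a minimal sketch that counts how many adjustable numbers a fully connected network holds. It assumes every neuron connects to every neuron in the next layer, with one weight per connection and one bias per receiving neuron (weights and biases are covered in Step 2 below). The function name `count_parameters` and the layer sizes are illustrative choices, not something taken from a specific library.

```python
def count_parameters(layer_sizes):
    """Count weights and biases in a fully connected network.

    layer_sizes lists the neurons per layer, from input to output.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out  # one weight per connection between the two layers
        total += n_out         # one bias per neuron in the receiving layer
    return total


# Hypothetical architecture: 784 inputs, hidden layers of 128 and 64, 10 outputs
sizes = [784, 128, 64, 10]
print(count_parameters(sizes))  # 109386
```

Even this modest architecture has over a hundred thousand parameters, almost all of them sitting in the connections into the hidden layers.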

4. Passing Outputs to the Next Layer

Concept: In an MLP, the output of neurons in one layer serves as the input for neurons in the next layer. This chain of outputs being fed forward through each layer enables the MLP to “build upon” its understanding with each step.
Real-Life Example: Think of a relay race where each runner passes the baton to the next. Here, each runner (neuron) contributes to the team’s progress. Similarly, each neuron layer in the MLP builds upon the work of the previous layer, leading to a more refined and accurate outcome by the end.

5. Hidden Layers as the Main Computation Center

Concept: Hidden layers are where most of the learning and computation happen. They help the MLP capture complex patterns in data, which might not be obvious at the input level alone.
Real-Life Example: Think of the hidden layers like a chef’s kitchen, where all the complex work (chopping, mixing, cooking) happens. By the time the dish reaches the customer (output layer), it’s fully prepared and ready to serve. Similarly, the hidden layers process and refine the data, enabling the network to produce an accurate result at the output layer.

Here’s a step-by-step breakdown of a Multilayer Perceptron (MLP), explaining each part of the process clearly.

Step 1: Input Layer — Receiving Data

The input layer is where the raw data enters the network. Each node (neuron) in this layer represents a feature of the data. If we’re classifying handwritten digits, for example, each pixel in the image is an input to this layer.

Example: Think of this step as gathering ingredients for a recipe. Each ingredient (data feature) represents an individual item you need before starting.
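As a rough illustration of the handwritten-digit case, the sketch below turns a 2D image into the flat list of features an input layer expects. The 28x28 image size and the random pixel values are assumptions made for the example; a real dataset would supply actual digit images.

```python
import numpy as np

# A hypothetical 28x28 grayscale image with pixel intensities from 0 to 255.
# Random values stand in for a real handwritten digit.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28))

# Flatten the 2D grid into a 1D feature vector: one input neuron per pixel.
features = image.reshape(-1)

# Scale intensities to the range [0, 1], a common (optional) preprocessing step.
features = features / 255.0

print(features.shape)  # (784,)
```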

Step 2: Weights and Biases — Adjusting the Importance of Inputs

After entering the network, each input is multiplied by a weight. Weights adjust the significance of each input, emphasizing or de-emphasizing certain features. Each neuron then sums its weighted inputs and adds a bias, a constant offset that shifts the result regardless of the input values. Weights and biases are adjusted during training to improve the network’s predictions.

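Example: Imagine a job interview where skills, experience, and attitude are evaluated. Skills might be weighted higher than experience, with attitude weighted lowest. These “weights” represent the importance of each factor in the hiring decision. The bias is like the interviewer’s baseline inclination before looking at any candidate: it shifts the final score up or down independently of the individual factors.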
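The interview analogy can be sketched as the computation a single neuron performs. Here all three factors are treated as weighted inputs and the bias as a constant offset; the scores, weights, and bias value are made-up numbers chosen only for illustration.

```python
import numpy as np

# Hypothetical scores for one candidate: skills, experience, attitude (0 to 10)
x = np.array([8.0, 5.0, 7.0])

# Illustrative weights: skills count the most, attitude the least
w = np.array([0.6, 0.3, 0.1])

# Bias: a constant offset that does not depend on the inputs
b = -1.0

# Weighted sum of a single neuron: z = w . x + b
z = np.dot(w, x) + b
print(z)
```

During training, the values in `w` and `b` are what the network adjusts.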

Step 3: Hidden Layers — Processing with Nonlinear Functions

The MLP contains one or more hidden layers between the input and output layers. These hidden layers perform most of the network’s computations. Each hidden layer neuron receives the outputs of the neurons in the previous layer, computes their weighted sum plus a bias, applies an activation function to that sum (such as ReLU, sigmoid, or tanh), and then sends the result to the next layer. The activation function is what allows the network to learn complex, nonlinear patterns in data; without it, stacked layers would collapse into a single linear transformation.

Example: Think of a complex recipe with multiple preparation steps, like marinating, cooking, and garnishing. Each hidden layer adds complexity, combining basic ingredients and flavors to produce a final dish. The steps in each layer prepare the data for the next, resulting in a richer outcome.
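To see why the activation function matters, the small sketch below compares two stacked layers with and without ReLU in between. The weight values are hand-picked for the example, and biases are left out to keep the arithmetic easy to follow.

```python
import numpy as np


def relu(z):
    """Rectified linear unit: keeps positive values, replaces negatives with 0."""
    return np.maximum(0.0, z)


x = np.array([1.0, -2.0])        # one input with two features
W1 = np.eye(2)                   # input -> hidden weights (identity, for clarity)
W2 = np.array([[1.0], [1.0]])    # hidden -> output weights

# Without an activation, two layers are equivalent to one linear map.
linear_stack = (x @ W1) @ W2
collapsed = x @ (W1 @ W2)
print(linear_stack, collapsed)   # [-1.] [-1.]

# With ReLU between the layers, the negative hidden value is zeroed out,
# so the result can no longer be produced by the single collapsed matrix.
nonlinear_stack = relu(x @ W1) @ W2
print(nonlinear_stack)           # [1.]
```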

Step 4: Forward Propagation — Moving Data Through the Layers

The network processes data layer by layer, starting from the input layer, moving through each hidden layer, and finally reaching the output layer. This step-by-step process is known as forward propagation, where each layer’s output becomes the input for the next.

Example: Imagine an assembly line where each station adds something new to a product. As the product moves along the line, it is refined and improved, eventually becoming a finished product at the end of the line.
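Forward propagation can be sketched as a short loop over the layers. This is a minimal illustration rather than a full implementation: it assumes ReLU on the hidden layers, no activation on the output layer, and randomly initialized weights. The helper name `forward` and the layer sizes are choices made for this example.

```python
import numpy as np


def relu(z):
    return np.maximum(0.0, z)


def forward(x, layers):
    """Pass x through each (W, b) pair in order, applying ReLU on hidden layers only."""
    a = x
    for i, (W, b) in enumerate(layers):
        z = a @ W + b                      # weighted sum plus bias
        is_last = i == len(layers) - 1
        a = z if is_last else relu(z)      # this layer's output feeds the next layer
    return a


rng = np.random.default_rng(42)
sizes = [4, 5, 3, 3]  # 4 inputs, hidden layers of 5 and 3 neurons, 3 outputs
layers = [
    (rng.normal(size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x = rng.normal(size=4)
output = forward(x, layers)
print(output.shape)  # (3,)
```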

Step 5: Output Layer — Producing the Final Prediction

After the data has moved through all layers, it reaches the output layer. This layer contains the final result of the computation, such as a classification or prediction. For example, if we’re classifying handwritten digits, the output layer would indicate the predicted digit (0–9) based on the probabilities calculated for each possible outcome.

Example: In a quiz competition, each round brings contestants closer to the final answer. The last round (output layer) determines the winner based on accumulated knowledge and performance from previous rounds. Similarly, the output layer gives the network’s final answer based on all previous layers’ computations.
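For classification, a common way to turn the output layer's raw scores into probabilities is the softmax function, after which the class with the highest probability is taken as the prediction. The sketch below assumes a softmax output and uses made-up scores for the ten digits.

```python
import numpy as np


def softmax(z):
    """Convert raw scores into probabilities that sum to 1."""
    shifted = z - np.max(z)  # subtracting the max improves numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()


# Hypothetical raw output-layer scores for the digits 0 through 9
scores = np.array([0.1, 0.3, 2.5, 0.2, 0.0, -1.0, 0.4, 4.0, 0.3, 0.1])

probs = softmax(scores)
predicted_digit = int(np.argmax(probs))
print(predicted_digit)  # 7
```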

Step 6: Training the Network — Adjusting Weights and Biases

Training is the process where the network learns from data by adjusting weights and biases. During training, the network makes predictions on the data, compares them to the actual answers (ground truth), and measures the difference (error) with a loss function. Backpropagation then works backward from the output layer to compute how much each weight and bias contributed to that error, and an optimizer such as gradient descent uses those values to update them. This cycle repeats many times to improve accuracy.

Example: Imagine a coach training an athlete by giving feedback after each practice session. The athlete adjusts their technique based on the coach’s feedback, gradually improving their performance over time. Similarly, backpropagation adjusts weights and biases, allowing the MLP to “learn” and produce more accurate predictions.
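The training loop can be sketched from scratch on a tiny problem. The example below is an illustration under several assumptions not taken from the article: the XOR dataset, one hidden layer of 4 tanh neurons, a sigmoid output with cross-entropy loss, and plain gradient descent with a learning rate of 0.5.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


# XOR inputs and targets: a pattern a network without a hidden layer cannot fit
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)  # input -> hidden
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)  # hidden -> output
lr = 0.5
losses = []

for step in range(2000):
    # Forward pass: compute predictions
    h = np.tanh(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)

    # Loss: mean binary cross-entropy between predictions and ground truth
    p_safe = np.clip(p, 1e-12, 1 - 1e-12)
    loss = -np.mean(y * np.log(p_safe) + (1 - y) * np.log(1 - p_safe))
    losses.append(loss)

    # Backward pass: propagate the error from the output back to the input
    dz2 = (p - y) / len(X)        # gradient at the output pre-activation
    dW2 = h.T @ dz2
    db2 = dz2.sum(axis=0)
    dh = dz2 @ W2.T               # error passed back to the hidden layer
    dz1 = dh * (1 - h ** 2)       # derivative of tanh is 1 - tanh^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Gradient descent update: nudge each parameter against its gradient
    W1 -= lr * dW1
    b1 -= lr * db1
    W2 -= lr * dW2
    b2 -= lr * db2

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Libraries such as scikit-learn run this same predict, measure, and adjust cycle internally, with more refined optimizers.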

Step 7: Using the Trained Model for Predictions

Once the MLP is trained, it can make predictions on new data. You feed new inputs into the network, it processes them through the layers, and outputs a prediction. A well-trained MLP should be able to accurately classify or predict outcomes on data it hasn’t seen before.

Example: Once an athlete has mastered a technique, they can perform well in actual competitions, not just practice. Similarly, a trained MLP can predict accurately on new data, not just the training data.

Summary

Input Layer receives raw data.
Weights and Biases adjust input importance.
Hidden Layers process data with nonlinear functions.
Forward Propagation moves data through the network.
Output Layer provides the final prediction.
Training involves adjusting weights to reduce errors.
Using the Model enables accurate predictions on new data.

This layered process allows the MLP to capture complex relationships in data, making it effective for tasks like image recognition, sentiment analysis, medical diagnosis, and more.

Here’s a simple Python program to create and train a Multilayer Perceptron (MLP) using scikit-learn (imported as sklearn), a popular machine learning library in Python. This example uses an MLP for a simple classification task on the Iris dataset, which classifies flowers into three species based on four features: sepal length, sepal width, petal length, and petal width.

Requirements

To run this code, you’ll need the scikit-learn library. You can install it using:

pip install scikit-learn

Code: Simple MLP with Scikit-Learn

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

# Step 1: Load the Iris dataset
data = load_iris()
X = data.data  # Features (sepal length, sepal width, petal length, petal width)
y = data.target  # Target (flower species)

# Step 2: Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

# Step 3: Initialize the MLP model
# Here, we define an MLP with 2 hidden layers containing 5 and 3 neurons respectively
mlp = MLPClassifier(hidden_layer_sizes=(5, 3), max_iter=1000, random_state=42)

# Step 4: Train the MLP model on the training data
mlp.fit(X_train, y_train)

# Step 5: Make predictions on the test set
y_pred = mlp.predict(X_test)

# Step 6: Calculate accuracy
accuracy = accuracy_score(y_test, y_pred)

# Display results
print(f"Predicted species: {y_pred}")
print(f"Actual species: {y_test}")
print(f"Model Accuracy: {accuracy * 100:.2f}%")

Explanation of Each Step

Load the Data: We load the Iris dataset, which is included in sklearn. It contains measurements of different flower species.
Split the Data: We divide the dataset into training and testing sets, holding out 30% of the samples for testing (test_size=0.3).
Initialize the MLP: We set up an MLP with two hidden layers containing 5 and 3 neurons, respectively. We specify max_iter=1000, up from scikit-learn’s default of 200, to allow more iterations for convergence.
Train the Model: The MLP learns the patterns in the training data.
Make Predictions: We predict the species on the test data.
Calculate Accuracy: We compare predicted values with actual values to get the accuracy.
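To connect the trained model back to the layer concepts, the following sketch retrains the same configuration and then inspects the learned weight matrices through the `coefs_` attribute, and asks for class probabilities on one new sample through `predict_proba`. The measurements in `new_flower` are hypothetical values chosen for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

data = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    data.data, data.target, test_size=0.3, random_state=42
)
mlp = MLPClassifier(hidden_layer_sizes=(5, 3), max_iter=1000, random_state=42)
mlp.fit(X_train, y_train)

# One weight matrix per pair of consecutive layers:
# 4 inputs -> 5 hidden -> 3 hidden -> 3 outputs
layer_shapes = [w.shape for w in mlp.coefs_]
print(layer_shapes)  # [(4, 5), (5, 3), (3, 3)]

# Hypothetical new measurements in cm:
# sepal length, sepal width, petal length, petal width
new_flower = [[5.0, 3.4, 1.5, 0.2]]
proba = mlp.predict_proba(new_flower)
predicted_name = data.target_names[mlp.predict(new_flower)[0]]

print(f"Predicted species: {predicted_name}")
print(f"Class probabilities: {proba.round(3)}")
```

The shapes show how the 4 input features fan out to 5 neurons, narrow to 3, and end in 3 outputs, one per species.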

Output

This program will output the predicted species, actual species, and the accuracy of the model.

In summary, a Multilayer Perceptron is like a team of specialized stages working together, where each layer refines the data in steps until the final answer is produced. It is these multiple layers and hidden computations that make MLPs powerful for solving complex problems like image recognition, text processing, and prediction tasks.