CNN Backpropagation, Mathematically Explained

AI, But Simple Issue #41

Hello from the AI, but simple team! If you enjoy our content, consider supporting us so we can keep doing what we do.

Our newsletter is no longer sustainable to run at no cost, so we’re relying on different measures to cover operational expenses. Thanks again for reading!


This week’s issue extends a past issue where we walked step by step through the math of a convolutional neural network’s (CNN) forward pass. If you want to check it out, you can find it below.

In this issue, we’ll focus on the backward pass and CNN backpropagation, a topic that confuses many learners.

Review of Backpropagation

Backpropagation is the process where we compute the gradients of the loss function with respect to each model parameter (weights and biases).

These gradients tell us how to update each parameter in order to minimize the loss using gradient descent. This is essentially how a neural network “learns” and gradually performs better by iterating through a dataset.
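In symbols (using θ for a generic parameter and η for the learning rate, notation we’re choosing here for illustration since the issue doesn’t fix symbols), one gradient descent update looks like:

```latex
\theta \leftarrow \theta - \eta \, \frac{\partial L}{\partial \theta}
```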

The key tool in backpropagation is the chain rule, which allows us to decompose the gradient of a composite function into a product of derivatives.
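For instance, using the symbols from the forward pass below: if the loss L depends on a layer’s pre-activation output O, which in turn depends on a weight W, the chain rule factors the gradient into local pieces:

```latex
\frac{\partial L}{\partial W} = \frac{\partial L}{\partial O} \cdot \frac{\partial O}{\partial W}
```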

In CNNs, backpropagation is repeated layer by layer from the output layer back to the input, helping the network “learn” by updating the model parameters.
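Sketching this layer-by-layer repetition with illustrative notation (superscripts index layers, with O^(1), …, O^(n) the layer outputs), the gradient reaching an early layer’s weights is a product of the local derivatives collected on the way back:

```latex
\frac{\partial L}{\partial W^{(1)}}
= \frac{\partial L}{\partial O^{(n)}}
  \cdot \frac{\partial O^{(n)}}{\partial O^{(n-1)}}
  \cdots
  \frac{\partial O^{(2)}}{\partial O^{(1)}}
  \cdot \frac{\partial O^{(1)}}{\partial W^{(1)}}
```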


Before continuing, make sure you understand the basics of a CNN, such as how a convolution works and what hyperparameters like stride and padding mean. Some basic calculus will also come in handy.

Let’s go through the mathematical process for a convolutional layer, step by step.

Forward Pass

To understand the backward pass and backpropagation, we first need to understand the forward pass of a convolutional layer. Let’s say we have a 3×3 input matrix (X) and a 3×3 convolutional filter (W).

Using the input matrix (X) and filter (W), we obtain the output feature map by performing a convolution operation between them.

  • We’ll perform a convolution with padding 1 and a stride of 1, resulting in a 3×3 feature map.

After obtaining the feature map, we compute the pre-activation output (O) by adding a bias constant (b) to every element in the feature map.
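Here’s a minimal NumPy sketch of this forward pass. The values of X, W, and b are made up for illustration, and, like most deep learning libraries, we implement the “convolution” as a cross-correlation (no filter flipping):

```python
import numpy as np

# Illustrative values for the 3x3 input, 3x3 filter, and bias constant
X = np.arange(9, dtype=float).reshape(3, 3)   # 3x3 input matrix
W = np.ones((3, 3)) / 9.0                     # 3x3 convolutional filter
b = 0.5                                       # bias constant

# Pad the input with a border of zeros (padding = 1)
X_pad = np.pad(X, pad_width=1)

# Slide the filter over the padded input with stride 1
out_h = X_pad.shape[0] - W.shape[0] + 1       # (3 + 2*1) - 3 + 1 = 3
out_w = X_pad.shape[1] - W.shape[1] + 1
feature_map = np.zeros((out_h, out_w))
for i in range(out_h):
    for j in range(out_w):
        window = X_pad[i:i + 3, j:j + 3]      # 3x3 receptive field
        feature_map[i, j] = np.sum(window * W)

# Pre-activation output: add the bias to every element
O = feature_map + b
print(O.shape)  # (3, 3), matching the input size as promised above
```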
