Neural Networks in PyTorch
AI, But Simple Issue #11
Quick note, this week’s letter is one of our first tutorials with code! Please feel free to experiment further with the base code provided.
Also, some knowledge of neural networks and how they work would be useful for this issue. If you need to catch up on that part, feel free to check our archive, particularly issue 1, issue 2, issue 4 and issue 8!
In past issues, we’ve gone over the theory and concepts behind neural networks, but we haven’t discussed how to implement them in practice.
So today, we’re going to build an ANN without any convolutional layers (the kind used in CNNs) to classify some images.
Convolutional layers are shown above
Now, if you know anything about CNNs, you’ll know that they work great for image-processing tasks.
To demonstrate the power of simple neural networks, we’ll take on the challenge of making an ANN image classifier with acceptable accuracy.
We’re going to implement the ANN in Python using PyTorch, along with some helper libraries like NumPy and Matplotlib.
Disclaimer: This issue will require some knowledge of Python. If you have no knowledge but still want to follow along, please consult a beginner Python tutorial before continuing.
But why choose PyTorch for ANNs?
PyTorch’s intuitive syntax and flexibility make it a favorite among researchers and developers. It’s also backed by a large community with plenty of support, which is one of the reasons it’s so popular.
We’re going to train our model on the well-known Fashion-MNIST dataset, which is a more challenging benchmark than the original MNIST digits dataset.
The Fashion-MNIST dataset consists of 60,000 training images and 10,000 testing images.
Each image in the dataset is a 28x28 pixel grayscale image.
Every image in the dataset belongs to one of ten classes of items:
Label | Description |
---|---|
0 | T-shirt/top |
1 | Pants |
2 | Pullover |
3 | Dress |
4 | Coat |
5 | Sandal |
6 | Shirt |
7 | Sneaker |
8 | Bag |
9 | Ankle boot |
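Since the dataset stores labels as integers, it can help to keep a small lookup from label to description when printing predictions or titling plots. Here is a minimal sketch that follows the table above; the names `CLASS_NAMES` and `label_to_name` are made up for illustration. (torchvision's dataset object also exposes a `classes` attribute, where label 1 is named "Trouser".)

```python
# Class descriptions in label order, following the table above
CLASS_NAMES = [
    "T-shirt/top",  # 0
    "Pants",        # 1
    "Pullover",     # 2
    "Dress",        # 3
    "Coat",         # 4
    "Sandal",       # 5
    "Shirt",        # 6
    "Sneaker",      # 7
    "Bag",          # 8
    "Ankle boot",   # 9
]


def label_to_name(label: int) -> str:
    """Hypothetical helper: map an integer label (0-9) to its description."""
    return CLASS_NAMES[label]


print(label_to_name(9))  # Ankle boot
```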
The idea is to construct a neural network using a decent architecture and enough layers to learn patterns, hoping it will be able to classify images effectively.
By pairing fully connected layers with the ReLU activation function and gradually decreasing the number of neurons, we hope to achieve good performance on this image classification dataset.
Keep in mind that the model we will be creating won’t be state-of-the-art or too complicated, so its performance won’t be the absolute best.
We want to go for an architecture like the one shown above.
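To make the idea concrete, here is a rough sketch of what "fully connected layers with ReLU and a shrinking number of neurons" could look like in PyTorch. The layer sizes (256, 128, 64) and the name `sketch_model` are assumptions picked for illustration, not necessarily the ones used in the final model. Only the input size (28 x 28 = 784 pixels) and the output size (10 classes) come from the dataset itself.

```python
import torch
import torch.nn as nn

# Hypothetical layer sizes, chosen only to illustrate the "funnel" shape
sketch_model = nn.Sequential(
    nn.Flatten(),          # (N, 1, 28, 28) -> (N, 784)
    nn.Linear(784, 256),
    nn.ReLU(),
    nn.Linear(256, 128),
    nn.ReLU(),
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 10),     # one raw score (logit) per class
)

# Push a random batch of 4 fake images through to check the shapes line up
dummy_batch = torch.randn(4, 1, 28, 28)
logits = sketch_model(dummy_batch)
print(logits.shape)  # torch.Size([4, 10])
```

The final layer has no activation here because loss functions such as `nn.CrossEntropyLoss` expect raw scores.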
Let’s start with the imports and download the dataset.
import numpy as np # python numerical library
import matplotlib.pyplot as plt # graphing library
from collections import OrderedDict
import torch # pytorch
import torch.nn as nn # neural network
import torch.optim as optim # optimizers
from torchvision import datasets, transforms # data
# Download training and testing data
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train = datasets.FashionMNIST('F_MNIST_data', download=True, train=True, transform=transform)
test = datasets.FashionMNIST('F_MNIST_data', download=True, train=False, transform=transform)
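If the transform looks a bit cryptic: `ToTensor()` rescales 8-bit pixel values from the range 0 to 255 down to 0 to 1, and `Normalize((0.5,), (0.5,))` then subtracts 0.5 and divides by 0.5, which moves everything into the range -1 to 1. The arithmetic can be sketched with plain NumPy on a few made-up pixel values (this only mirrors the math, it is not how torchvision implements it internally):

```python
import numpy as np

# Three example pixel intensities: black, mid-gray, white
raw_pixels = np.array([0, 128, 255], dtype=np.uint8)

# Step 1 (what ToTensor does to the values): scale to [0, 1]
scaled = raw_pixels.astype(np.float32) / 255.0

# Step 2 (what Normalize((0.5,), (0.5,)) does): (x - mean) / std
normalized = (scaled - 0.5) / 0.5

print(normalized)  # black maps to -1, white maps to 1, mid-gray lands near 0
```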
Then, let’s split our training dataset into 80% training and 20% validation.
As a reminder, a validation set is used to evaluate performance during training and to guide hyperparameter tuning.
# split train set into training (80%) and validation set (20%)
train_num = len(train)
indices = list(range(train_num))
np.random.shuffle(indices)
split = int(np.floor(0.2 * train_num))
val_idx, train_idx = indices[:split], indices[split:]
print(len(val_idx), len(train_idx))  # 12000 validation samples, 48000 training samples
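When splitting by index like this, it's worth confirming that the two index lists don't overlap and that together they cover the whole training set. Below is a small self-contained sketch of the same 80/20 idea with those checks added. It uses a seeded NumPy generator so the split is repeatable; the seed value and the variable names are arbitrary choices for this example.

```python
import numpy as np

num_samples = 60_000                      # size of the Fashion-MNIST training set
rng = np.random.default_rng(seed=0)       # arbitrary seed, only for repeatability
shuffled = rng.permutation(num_samples)   # shuffled copy of 0..59999

num_val = int(np.floor(0.2 * num_samples))
val_indices, train_indices = shuffled[:num_val], shuffled[num_val:]

# Sanity checks: no index appears in both sets, and nothing is lost
no_overlap = np.intersect1d(val_indices, train_indices).size == 0
full_coverage = len(val_indices) + len(train_indices) == num_samples

print(len(val_indices), len(train_indices))  # 12000 48000
print(no_overlap, full_coverage)             # True True
```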
When we use PyTorch, we usually use data loaders, which handle batching, shuffling, and iterating over our data.
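As a quick illustration of what a data loader does, here is a minimal sketch that runs on a small synthetic dataset so it works without any download. It uses `SubsetRandomSampler` to draw batches only from a given list of indices, which is one common way to pair a loader with an index-based split. The batch size of 16, the fake data, and all variable names are assumptions made for this example.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.sampler import SubsetRandomSampler

# Synthetic stand-in: 100 fake 1x28x28 "images" with random labels 0-9
fake_images = torch.randn(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))
fake_dataset = TensorDataset(fake_images, fake_labels)

# Pretend split: first 20 indices for validation, remaining 80 for training
val_ids, train_ids = list(range(20)), list(range(20, 100))

# Each loader only ever sees the indices its sampler was given
train_loader = DataLoader(fake_dataset, batch_size=16,
                          sampler=SubsetRandomSampler(train_ids))
val_loader = DataLoader(fake_dataset, batch_size=16,
                        sampler=SubsetRandomSampler(val_ids))

# Pull one batch to see what the loader hands back
images, labels = next(iter(train_loader))
print(images.shape, labels.shape)
print(len(train_loader), len(val_loader))  # number of batches: 5 2
```

The last batch of the validation loader holds only 4 samples (20 is not a multiple of 16), which is why `len(val_loader)` is 2.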