Deep Unsupervised Learning: Autoencoders

AI, But Simple Issue #16

Autoencoders

Autoencoders are a specific type of feedforward neural network in which the model learns to encode the input and then reconstruct a similar output from that encoding.

They compress the input into a lower-dimensional representation and then reconstruct the output from it. The goal is to learn a compressed, yet informative, representation of the input data.

They are one of the foundational techniques of deep unsupervised learning (learning without labels).

  • They are mostly used as a form of dimensionality reduction (making the data easier to visualize, process, or use in other machine learning tasks).

  • Some other common applications are anomaly detection, denoising, and generation.

  • Other networks used in deep unsupervised learning include GANs, clustering networks, and more.

Technically, autoencoders can also be considered “self-supervised,” since they generate their own labels: the training target is simply the input itself.

Let’s think back to the main ideas of autoencoders.

On a very basic level, the autoencoder is simply tasked with reproducing its input from the code. When the code is smaller than the input, we call this version the “undercomplete” autoencoder.

This way, we can learn the intricacies of the data. The autoencoder will try to capture the most important features or patterns in the data by compressing it into a lower-dimensional form and then reconstructing it back.

A loss function is used to train the model; it simply measures the difference between the original input (the target) and the reconstructed output.
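
As a concrete illustration, here is a minimal training-step sketch, assuming PyTorch, a mean-squared-error reconstruction loss, and a toy single-hidden-layer model standing in for a real autoencoder:

```python
import torch
import torch.nn as nn

# A toy stand-in for an autoencoder: encode to a 32-dimensional code,
# then decode back to the original 784 dimensions.
toy_autoencoder = nn.Sequential(
    nn.Linear(784, 32), nn.ReLU(),     # encoder -> code
    nn.Linear(32, 784), nn.Sigmoid(),  # decoder -> reconstruction
)

x = torch.rand(64, 784)                # a batch of flattened 28x28 inputs
reconstruction = toy_autoencoder(x)

# The "label" is the input itself: the loss measures how far the
# reconstruction is from the original data point.
loss = nn.functional.mse_loss(reconstruction, x)
loss.backward()                        # gradients for an optimizer step
print(loss.item())
```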

If you know anything about dimensionality reduction in machine learning, you might have heard of PCA or ICA; these methods perform linear transformations of the data.

Autoencoders, in contrast, can perform non-linear dimensionality reduction at a larger scale, which makes them useful in their own niche.
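
For comparison, here is a small sketch of linear dimensionality reduction with PCA in scikit-learn (the random data and the choice of 32 components are purely illustrative); an autoencoder with non-linear activations can learn a more flexible mapping than this linear projection:

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.rand(1000, 784)         # toy data standing in for real inputs
pca = PCA(n_components=32)            # 32 components, analogous to a code size of 32
codes = pca.fit_transform(X)          # "encode": linear projection to 32 dimensions
X_hat = pca.inverse_transform(codes)  # "decode": linear reconstruction
print(codes.shape, X_hat.shape)       # (1000, 32) (1000, 784)
```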

Autoencoder Architecture

Let’s go over the architecture of an undercomplete autoencoder, which forms the basic structure of many autoencoder variants.

An autoencoder consists of 3 components: the encoder, the code/bottleneck, and the decoder.

The encoder compresses the input and produces the code; the decoder then reconstructs the input using only the code.

Let’s explore the details of these components more deeply.

Both the encoder and decoder are fully-connected feedforward neural networks, essentially just ANNs.

The code is a single layer with the dimensionality of our choice. The number of nodes in the code layer (called the code size) is a hyperparameter that we set before training the autoencoder.

  • It is a module that contains the compressed knowledge representations and is the most important part of the network.

  • The compressed lower-dimensional version of the input is called the latent-space representation.

Note that the decoder architecture is the mirror image of the encoder. This is not a requirement, but it’s typically the case.

  • The only requirement is that the input and the output must be the same size.
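
Putting the three components together, here is a minimal undercomplete autoencoder sketch in PyTorch; the layer sizes, activations, and the 784-dimensional input (a flattened 28×28 image) are illustrative assumptions rather than a prescribed recipe:

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self, input_size: int = 784, code_size: int = 32):
        super().__init__()
        # Encoder: fully-connected layers that shrink toward the code
        self.encoder = nn.Sequential(
            nn.Linear(input_size, 128), nn.ReLU(),
            nn.Linear(128, code_size),             # the code / bottleneck layer
        )
        # Decoder: mirror image of the encoder, expanding back to input_size
        self.decoder = nn.Sequential(
            nn.Linear(code_size, 128), nn.ReLU(),
            nn.Linear(128, input_size), nn.Sigmoid(),
        )

    def forward(self, x):
        code = self.encoder(x)         # latent-space representation
        return self.decoder(code)      # reconstruction, same size as x

model = Autoencoder()
x = torch.rand(16, 784)
assert model(x).shape == x.shape       # input and output must be the same size
```

Here the decoder simply mirrors the encoder’s layer sizes in reverse, and the final layer restores the original input dimension so the reconstruction can be compared directly to the input.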

Let’s talk about the hyperparameters used in autoencoders; specifically, there are 4 main ones.

The first hyperparameter is the code size. This determines the number of nodes in the middle layer, and a smaller size results in more compression.
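
As a rough illustration (assuming a flattened 28×28 input of 784 values), the degree of compression at the bottleneck scales with the ratio of input size to code size:

```python
input_size = 784                      # e.g. a flattened 28x28 image
for code_size in (8, 32, 128):        # candidate code sizes (illustrative)
    ratio = input_size / code_size    # smaller code -> more compression
    print(f"code size {code_size}: ~{ratio:.0f}x compression")
```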
