Autoencoders | Deep Learning Tutorial - Learn with VOKS

Autoencoders


1️⃣ What is an Autoencoder?

An Autoencoder is a neural network that learns to compress data and then reconstruct it.

It learns:

Input → Compressed Representation → Reconstructed Output

In simple words:

It tries to copy the input to the output — but through a smaller hidden layer.

That “smaller hidden layer” forces the model to learn important features.


2️⃣ Why Do We Need Autoencoders?

They help in:

  • Dimensionality reduction (like PCA but nonlinear)
  • Noise removal (denoising)
  • Feature extraction
  • Anomaly detection
  • Image compression

Autoencoders became prominent in deep learning research, notably through Hinton and Salakhutdinov's 2006 work on reducing the dimensionality of data with neural networks.

3️⃣ Autoencoder Architecture

It has 3 main parts:


| Component | Function |
|-----------|----------|
| Encoder   | Compress input into latent space |
| Bottleneck (Latent Space) | Compressed representation |
| Decoder   | Reconstruct input from latent space |

Visual structure:


Input → Encoder → Bottleneck → Decoder → Output

Important:

Output size = Input size

But bottleneck size < input size.


4️⃣ Mathematical Intuition

Let input be:

x

Encoder:

z = f(x)

Decoder:

x̂ = g(z)

Goal:

Minimize reconstruction error:

Loss = || x − x̂ ||²

The network learns parameters that make output close to input.
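This encode/decode loop can be sketched in plain NumPy, using random (untrained) linear maps as stand-ins for f and g:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear encoder/decoder: 4-dim input -> 2-dim latent -> 4-dim output
W_enc = rng.standard_normal((4, 2))   # encoder weights (untrained)
W_dec = rng.standard_normal((2, 4))   # decoder weights (untrained)

x = rng.standard_normal(4)   # input vector

z = x @ W_enc        # z = f(x): compress into latent space
x_hat = z @ W_dec    # x̂ = g(z): reconstruct from latent space

loss = np.sum((x - x_hat) ** 2)   # squared reconstruction error || x − x̂ ||²
```

Training would adjust W_enc and W_dec to drive this loss down; a real autoencoder also adds nonlinear activations between layers.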


5️⃣ Types of Autoencoders

🔹Vanilla Autoencoder

Basic encoder + decoder using dense layers.


🔹Sparse Autoencoder

Adds sparsity constraint.

Encourages most neurons to be zero.

Used for feature learning.
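A sparse autoencoder can be sketched in Keras by attaching an L1 activity penalty to the hidden layer; the layer sizes and penalty weight below are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Sparse autoencoder sketch: the L1 activity penalty on the bottleneck
# pushes most latent activations toward zero
inp = layers.Input(shape=(784,))
h = layers.Dense(64, activation='relu',
                 activity_regularizer=regularizers.l1(1e-5))(inp)  # sparsity constraint
out = layers.Dense(784, activation='sigmoid')(h)

sparse_ae = models.Model(inp, out)
sparse_ae.compile(optimizer='adam', loss='binary_crossentropy')
```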


🔹Denoising Autoencoder

Input is corrupted with noise.

Target is clean version.

Learns robust representations.
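The corruption step can be sketched as follows; the noise level of 0.3 and the random arrays standing in for images are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Denoising setup: corrupt the input with Gaussian noise, keep the clean
# version as the training target
x_clean = rng.random((100, 784)).astype('float32')   # stand-in for normalized images
x_noisy = x_clean + 0.3 * rng.standard_normal((100, 784)).astype('float32')
x_noisy = np.clip(x_noisy, 0.0, 1.0)                 # keep pixel values in [0, 1]

# Training pairs would then be: model.fit(x_noisy, x_clean, ...)
```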


🔹Convolutional Autoencoder

Uses CNN layers instead of dense layers.

Best for images.
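A minimal convolutional autoencoder sketch for 28x28 grayscale images (the filter counts are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(28, 28, 1))

# Encoder: convolutions + downsampling
x = layers.Conv2D(16, 3, activation='relu', padding='same')(inp)
x = layers.MaxPooling2D(2)(x)                      # 28x28 -> 14x14
x = layers.Conv2D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D(2)(x)                # 14x14 -> 7x7

# Decoder: convolutions + upsampling back to 28x28
x = layers.Conv2D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling2D(2)(x)                      # 7x7 -> 14x14
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)                      # 14x14 -> 28x28
out = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(x)

conv_ae = models.Model(inp, out)
conv_ae.compile(optimizer='adam', loss='binary_crossentropy')
```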


🔹Variational Autoencoder (VAE)

More advanced and probabilistic.

Instead of encoding to fixed vector, it learns:

Mean (μ) and Variance (σ)

Then samples:

z = μ + σ * ε

Where ε ~ Normal(0,1)

This makes it generative.
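The sampling step above (the "reparameterization trick") can be sketched in NumPy; the μ and σ values here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# In a real VAE the encoder network outputs mu and sigma; here they are
# fixed illustrative values
mu = np.array([0.5, -1.0])     # mean predicted by the encoder
sigma = np.array([0.1, 0.2])   # standard deviation predicted by the encoder

eps = rng.standard_normal(2)   # ε ~ Normal(0, 1)
z = mu + sigma * eps           # z = μ + σ * ε, differentiable w.r.t. μ and σ

# z is then passed to the decoder; gradients flow through mu and sigma
```

Because the randomness is isolated in ε, gradients can propagate through μ and σ during training, which is what makes the sampling step learnable.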

VAEs, introduced by Kingma and Welling in 2013, are a cornerstone of modern generative modeling research.

6️⃣ Loss Functions


| Task Type | Loss Function |
|------------|--------------|
| Image Reconstruction | MSE (Mean Squared Error) |
| Binary Image | Binary Crossentropy |
| VAE | Reconstruction Loss + KL Divergence |
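For the VAE row, here is a sketch of the combined loss, using the closed-form KL divergence between N(μ, σ²) and N(0, 1); all values are illustrative:

```python
import numpy as np

# VAE loss sketch: reconstruction term plus KL divergence
x = np.array([0.8, 0.2, 0.5])       # input
x_hat = np.array([0.7, 0.3, 0.4])   # reconstruction
mu = np.array([0.1, -0.2])          # latent mean
sigma = np.array([0.9, 1.1])        # latent std dev

recon = np.sum((x - x_hat) ** 2)                              # MSE reconstruction term
kl = -0.5 * np.sum(1 + np.log(sigma**2) - mu**2 - sigma**2)   # KL(N(μ,σ²) || N(0,1))

vae_loss = recon + kl
```

The KL term regularizes the latent space toward a standard normal, which is what lets a trained VAE generate new samples by decoding z drawn from N(0, 1).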

7️⃣ Applications of Autoencoders


| Application | Description |
|------------|-------------|
| Dimensionality Reduction | Compress high-dimension data |
| Anomaly Detection | Detect unusual patterns |
| Image Denoising | Remove noise from images |
| Data Compression | Reduce storage |
| Feature Learning | Extract useful features |
| Generative Models | Create new data (VAE) |
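For anomaly detection, the usual recipe is to threshold the per-sample reconstruction error: a trained autoencoder reconstructs normal data well, so a large error flags an anomaly. A sketch with simulated reconstructions (the data and threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate reconstructions instead of running a real model: four samples
# reconstruct well, one reconstructs badly
x = rng.random((5, 784)).astype('float32')
x_hat = x + 0.01 * rng.standard_normal((5, 784)).astype('float32')  # good reconstructions
x_hat[0] += 0.5                                                      # one bad reconstruction

errors = np.mean((x - x_hat) ** 2, axis=1)   # per-sample reconstruction error
threshold = 0.05                              # illustrative cutoff
anomalies = errors > threshold                # True where the error is too large
```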

8️⃣ Example: Simple Autoencoder in Python

We’ll compress and reconstruct MNIST digit images.

MNIST dataset contains handwritten digits (0–9).


🔹 Install Required Library


pip install tensorflow matplotlib

🔹 Python Implementation


import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
import numpy as np

# Load MNIST dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()

# Normalize data
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flatten images (28x28 → 784)
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# Encoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
bottleneck = layers.Dense(32, activation='relu')(encoded)

# Decoder
decoded = layers.Dense(64, activation='relu')(bottleneck)
decoded = layers.Dense(128, activation='relu')(decoded)
output = layers.Dense(784, activation='sigmoid')(decoded)

# Autoencoder Model
autoencoder = models.Model(input_img, output)

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

# Reconstruct images
decoded_imgs = autoencoder.predict(x_test)

# Display original vs reconstructed
n = 5
plt.figure(figsize=(10,4))
for i in range(n):
    # Original
    plt.subplot(2,n,i+1)
    plt.imshow(x_test[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

    # Reconstructed
    plt.subplot(2,n,i+1+n)
    plt.imshow(decoded_imgs[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

plt.show()

9️⃣ What Happens in This Code?

  1. Load MNIST digits
  2. Flatten images
  3. Encoder compresses 784 → 32
  4. Decoder reconstructs 32 → 784
  5. Model learns to minimize reconstruction loss
  6. Output image resembles original

The bottleneck layer (32 neurons) is the compressed representation.
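To use those 32-dimensional codes directly, you can build an encoder-only model that stops at the bottleneck. A self-contained sketch with the same layer sizes as the example (random data stands in for MNIST):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Same architecture as the example above
inp = layers.Input(shape=(784,))
h = layers.Dense(128, activation='relu')(inp)
h = layers.Dense(64, activation='relu')(h)
bottleneck = layers.Dense(32, activation='relu')(h)
h = layers.Dense(64, activation='relu')(bottleneck)
h = layers.Dense(128, activation='relu')(h)
out = layers.Dense(784, activation='sigmoid')(h)

autoenc = models.Model(inp, out)                # full autoencoder
encoder = models.Model(inp, bottleneck)         # encoder only, shares weights

# Map inputs straight to their 32-dim compressed codes
codes = encoder.predict(np.random.rand(8, 784).astype('float32'))
```

Because the two models share layers, training the autoencoder also trains the encoder, which can then be used on its own for feature extraction.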


🔟 Key Differences: Autoencoder vs PCA

| PCA | Autoencoder |
|-----|-------------|
| Linear method | Nonlinear |
| No deep learning | Neural network based |
| Fast | Requires training |
| Limited complexity | Can model complex patterns |
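The PCA side of this comparison can be sketched with plain NumPy via SVD: the same compress/reconstruct loop as an autoencoder, but with purely linear maps (the data here is random and illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.random((200, 784)).astype('float32')   # stand-in for flattened images
X_centered = X - X.mean(axis=0)

# Principal components come from the SVD of the centered data
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
k = 32                                   # same code size as the bottleneck above
Z = X_centered @ Vt[:k].T                # "encode": project onto top-k components
X_hat = Z @ Vt[:k] + X.mean(axis=0)      # "decode": linear reconstruction
```

An autoencoder with linear activations and MSE loss learns essentially this same subspace; the nonlinear activations are what let it go beyond PCA.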

FULL COMPILATION OF ALL CODE

Example Code:
# Install:
# pip install tensorflow matplotlib

import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
import numpy as np

# Load MNIST dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()

# Normalize
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flatten
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# Encoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
bottleneck = layers.Dense(32, activation='relu')(encoded)

# Decoder
decoded = layers.Dense(64, activation='relu')(bottleneck)
decoded = layers.Dense(128, activation='relu')(decoded)
output = layers.Dense(784, activation='sigmoid')(decoded)

# Model
autoencoder = models.Model(input_img, output)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

# Predict
decoded_imgs = autoencoder.predict(x_test)

# Display
n = 5
plt.figure(figsize=(10,4))
for i in range(n):
    plt.subplot(2,n,i+1)
    plt.imshow(x_test[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

    plt.subplot(2,n,i+1+n)
    plt.imshow(decoded_imgs[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

plt.show()