Autoencoders | Deep Learning Tutorial - Learn with VOKS

Autoencoders


1️⃣ What is an Autoencoder?

An Autoencoder is a neural network that learns to compress data and then reconstruct it.

It learns:

Input → Compressed Representation → Reconstructed Output

In simple words:

It tries to copy the input to the output — but through a smaller hidden layer.

That “smaller hidden layer” forces the model to learn important features.


2️⃣ Why Do We Need Autoencoders?

They help in:

  • Dimensionality reduction (like PCA but nonlinear)
  • Noise removal (denoising)
  • Feature extraction
  • Anomaly detection
  • Image compression

Autoencoders became prominent in deep learning research, notably through Hinton and Salakhutdinov's 2006 work on reducing the dimensionality of data with neural networks.

3️⃣ Autoencoder Architecture

It has 3 main parts:


| Component | Function |
|-----------|----------|
| Encoder   | Compress input into latent space |
| Bottleneck (Latent Space) | Compressed representation |
| Decoder   | Reconstruct input from latent space |

Visual structure:


Input → Encoder → Bottleneck → Decoder → Output

Important:

Output size = Input size

But bottleneck size < input size.


4️⃣ Mathematical Intuition

Let input be:

x

Encoder:

z = f(x)

Decoder:

x̂ = g(z)

Goal:

Minimize reconstruction error:

Loss = || x − x̂ ||²

The network learns parameters that make output close to input.
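This encode/decode loop can be sketched in plain NumPy, using random (untrained) linear maps as stand-ins for f and g:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear encoder/decoder: 4-dim input -> 2-dim latent -> 4-dim output
W_enc = rng.standard_normal((4, 2))   # encoder weights (untrained)
W_dec = rng.standard_normal((2, 4))   # decoder weights (untrained)

x = rng.standard_normal(4)   # input vector

z = x @ W_enc        # z = f(x): compress into latent space
x_hat = z @ W_dec    # x̂ = g(z): reconstruct from latent space

loss = np.sum((x - x_hat) ** 2)   # squared reconstruction error || x − x̂ ||²
```

Training would adjust W_enc and W_dec to drive this loss down; a real autoencoder also adds nonlinear activations between layers.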


5️⃣ Types of Autoencoders

🔹Vanilla Autoencoder

Basic encoder + decoder using dense layers.


🔹Sparse Autoencoder

Adds sparsity constraint.

Encourages most neurons to be zero.

Used for feature learning.
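A sparse autoencoder can be sketched in Keras by attaching an L1 activity penalty to the hidden layer; the layer sizes and penalty weight below are illustrative:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

# Sparse autoencoder sketch: the L1 activity penalty on the bottleneck
# pushes most latent activations toward zero
inp = layers.Input(shape=(784,))
h = layers.Dense(64, activation='relu',
                 activity_regularizer=regularizers.l1(1e-5))(inp)  # sparsity constraint
out = layers.Dense(784, activation='sigmoid')(h)

sparse_ae = models.Model(inp, out)
sparse_ae.compile(optimizer='adam', loss='binary_crossentropy')
```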


🔹Denoising Autoencoder

Input is corrupted with noise.

Target is clean version.

Learns robust representations.
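The corruption step can be sketched as follows; the noise level of 0.3 and the random arrays standing in for images are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Denoising setup: corrupt the input with Gaussian noise, keep the clean
# version as the training target
x_clean = rng.random((100, 784)).astype('float32')   # stand-in for normalized images
x_noisy = x_clean + 0.3 * rng.standard_normal((100, 784)).astype('float32')
x_noisy = np.clip(x_noisy, 0.0, 1.0)                 # keep pixel values in [0, 1]

# Training pairs would then be: model.fit(x_noisy, x_clean, ...)
```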


🔹Convolutional Autoencoder

Uses CNN layers instead of dense layers.

Best for images.
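A minimal convolutional autoencoder sketch for 28x28 grayscale images (the filter counts are illustrative):

```python
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(28, 28, 1))

# Encoder: convolutions + downsampling
x = layers.Conv2D(16, 3, activation='relu', padding='same')(inp)
x = layers.MaxPooling2D(2)(x)                      # 28x28 -> 14x14
x = layers.Conv2D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D(2)(x)                # 14x14 -> 7x7

# Decoder: convolutions + upsampling back to 28x28
x = layers.Conv2D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling2D(2)(x)                      # 7x7 -> 14x14
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)                      # 14x14 -> 28x28
out = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(x)

conv_ae = models.Model(inp, out)
conv_ae.compile(optimizer='adam', loss='binary_crossentropy')
```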


🔹Variational Autoencoder (VAE)

More advanced and probabilistic.

Instead of encoding to fixed vector, it learns:

Mean (μ) and Variance (σ)

Then samples:

z = μ + σ * ε

Where ε ~ Normal(0,1)

This makes it generative.
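The sampling step above (the "reparameterization trick") can be sketched in NumPy; the μ and σ values here are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# In a real VAE the encoder network outputs mu and sigma; here they are
# fixed illustrative values
mu = np.array([0.5, -1.0])     # mean predicted by the encoder
sigma = np.array([0.1, 0.2])   # standard deviation predicted by the encoder

eps = rng.standard_normal(2)   # ε ~ Normal(0, 1)
z = mu + sigma * eps           # z = μ + σ * ε, differentiable w.r.t. μ and σ

# z is then passed to the decoder; gradients flow through mu and sigma
```

Because the randomness is isolated in ε, gradients can propagate through μ and σ during training, which is what makes the sampling step learnable.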

VAEs, introduced by Kingma and Welling in 2013, are a cornerstone of modern generative modeling research.

6️⃣ Loss Functions


| Task Type | Loss Function |
|------------|--------------|
| Image Reconstruction | MSE (Mean Squared Error) |
| Binary Image | Binary Crossentropy |
| VAE | Reconstruction Loss + KL Divergence |
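For the VAE row, here is a sketch of the combined loss, using the closed-form KL divergence between N(μ, σ²) and N(0, 1); all values are illustrative:

```python
import numpy as np

# VAE loss sketch: reconstruction term plus KL divergence
x = np.array([0.8, 0.2, 0.5])       # input
x_hat = np.array([0.7, 0.3, 0.4])   # reconstruction
mu = np.array([0.1, -0.2])          # latent mean
sigma = np.array([0.9, 1.1])        # latent std dev

recon = np.sum((x - x_hat) ** 2)                              # MSE reconstruction term
kl = -0.5 * np.sum(1 + np.log(sigma**2) - mu**2 - sigma**2)   # KL(N(μ,σ²) || N(0,1))

vae_loss = recon + kl
```

The KL term regularizes the latent space toward a standard normal, which is what lets a trained VAE generate new samples by decoding z drawn from N(0, 1).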

7️⃣ Applications of Autoencoders


| Application | Description |
|------------|-------------|
| Dimensionality Reduction | Compress high-dimension data |
| Anomaly Detection | Detect unusual patterns |
| Image Denoising | Remove noise from images |
| Data Compression | Reduce storage |
| Feature Learning | Extract useful features |
| Generative Models | Create new data (VAE) |
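For anomaly detection, the usual recipe is to threshold the per-sample reconstruction error: a trained autoencoder reconstructs normal data well, so a large error flags an anomaly. A sketch with simulated reconstructions (the data and threshold are illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulate reconstructions instead of running a real model: four samples
# reconstruct well, one reconstructs badly
x = rng.random((5, 784)).astype('float32')
x_hat = x + 0.01 * rng.standard_normal((5, 784)).astype('float32')  # good reconstructions
x_hat[0] += 0.5                                                      # one bad reconstruction

errors = np.mean((x - x_hat) ** 2, axis=1)   # per-sample reconstruction error
threshold = 0.05                              # illustrative cutoff
anomalies = errors > threshold                # True where the error is too large
```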

8️⃣ Example: Simple Autoencoder in Python

We’ll compress and reconstruct MNIST digit images.

MNIST dataset contains handwritten digits (0–9).


🔹 Install Required Library


pip install tensorflow matplotlib

🔹 Python Implementation


import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
import numpy as np

# Load MNIST dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()

# Normalize data
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flatten images (28x28 → 784)
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# Encoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
bottleneck = layers.Dense(32, activation='relu')(encoded)

# Decoder
decoded = layers.Dense(64, activation='relu')(bottleneck)
decoded = layers.Dense(128, activation='relu')(decoded)
output = layers.Dense(784, activation='sigmoid')(decoded)

# Autoencoder Model
autoencoder = models.Model(input_img, output)

autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

# Reconstruct images
decoded_imgs = autoencoder.predict(x_test)

# Display original vs reconstructed
n = 5
plt.figure(figsize=(10,4))
for i in range(n):
    # Original
    plt.subplot(2,n,i+1)
    plt.imshow(x_test[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

    # Reconstructed
    plt.subplot(2,n,i+1+n)
    plt.imshow(decoded_imgs[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

plt.show()

9️⃣ What Happens in This Code?

  1. Load MNIST digits
  2. Flatten images
  3. Encoder compresses 784 → 32
  4. Decoder reconstructs 32 → 784
  5. Model learns to minimize reconstruction loss
  6. Output image resembles original

The bottleneck layer (32 neurons) is the compressed representation.
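To use those 32-dimensional codes directly, you can build an encoder-only model that stops at the bottleneck. A self-contained sketch with the same layer sizes as the example (random data stands in for MNIST):

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras import layers, models

# Same architecture as the example above
inp = layers.Input(shape=(784,))
h = layers.Dense(128, activation='relu')(inp)
h = layers.Dense(64, activation='relu')(h)
bottleneck = layers.Dense(32, activation='relu')(h)
h = layers.Dense(64, activation='relu')(bottleneck)
h = layers.Dense(128, activation='relu')(h)
out = layers.Dense(784, activation='sigmoid')(h)

autoenc = models.Model(inp, out)                # full autoencoder
encoder = models.Model(inp, bottleneck)         # encoder only, shares weights

# Map inputs straight to their 32-dim compressed codes
codes = encoder.predict(np.random.rand(8, 784).astype('float32'))
```

Because the two models share layers, training the autoencoder also trains the encoder, which can then be used on its own for feature extraction.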


🔟 Key Differences: Autoencoder vs PCA

| PCA | Autoencoder |
|-----|-------------|
| Linear method | Nonlinear |
| No deep learning | Neural network based |
| Fast | Requires training |
| Limited complexity | Can model complex patterns |
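The PCA side of this comparison can be sketched with plain NumPy via SVD: the same compress/reconstruct loop as an autoencoder, but with purely linear maps (the data here is random and illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

X = rng.random((200, 784)).astype('float32')   # stand-in for flattened images
X_centered = X - X.mean(axis=0)

# Principal components come from the SVD of the centered data
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
k = 32                                   # same code size as the bottleneck above
Z = X_centered @ Vt[:k].T                # "encode": project onto top-k components
X_hat = Z @ Vt[:k] + X.mean(axis=0)      # "decode": linear reconstruction
```

An autoencoder with linear activations and MSE loss learns essentially this same subspace; the nonlinear activations are what let it go beyond PCA.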

FULL COMPILATION OF ALL CODE

Example Code:
# Install:
# pip install tensorflow matplotlib

import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt
import numpy as np

# Load MNIST dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()

# Normalize
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flatten
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# Encoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
bottleneck = layers.Dense(32, activation='relu')(encoded)

# Decoder
decoded = layers.Dense(64, activation='relu')(bottleneck)
decoded = layers.Dense(128, activation='relu')(decoded)
output = layers.Dense(784, activation='sigmoid')(decoded)

# Model
autoencoder = models.Model(input_img, output)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

# Predict
decoded_imgs = autoencoder.predict(x_test)

# Display
n = 5
plt.figure(figsize=(10,4))
for i in range(n):
    plt.subplot(2,n,i+1)
    plt.imshow(x_test[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

    plt.subplot(2,n,i+1+n)
    plt.imshow(decoded_imgs[i].reshape(28,28))
    plt.gray()
    plt.axis('off')

plt.show()