1️⃣ What is an Autoencoder?
An Autoencoder is a neural network that learns to compress data and then reconstruct it.
It learns:
Input → Compressed Representation → Reconstructed Output
In simple words:
It tries to copy the input to the output — but through a smaller hidden layer.
That “smaller hidden layer” forces the model to learn important features.
2️⃣ Why Do We Need Autoencoders?
They are useful for:
- Dimensionality reduction
- Image denoising
- Anomaly detection
- Feature learning

Autoencoders became popular with the rise of deep learning research.
3️⃣ Autoencoder Architecture
It has 3 main parts:
| Component | Function |
|-----------|----------|
| Encoder | Compresses the input into the latent space |
| Bottleneck (Latent Space) | Holds the compressed representation |
| Decoder | Reconstructs the input from the latent space |
Visual structure:
Input → Encoder → Bottleneck → Decoder → Output
Important:
The output has the same dimensionality as the input, but the bottleneck is strictly smaller.
4️⃣ Mathematical Intuition
Let input be:
x
Encoder:
z = f(x)
Decoder:
x̂ = g(z)
Goal:
Minimize the reconstruction error:
Loss = || x − x̂ ||²
The network learns parameters that make output close to input.
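To make the notation concrete, here is a toy linear sketch in NumPy. The weight matrices are random placeholders standing in for learned parameters, and the 784/32 dimensions match the MNIST example later in this tutorial:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(784)                          # input vector x

# Hypothetical (untrained) encoder and decoder weights
W_enc = rng.standard_normal((32, 784)) * 0.01
W_dec = rng.standard_normal((784, 32)) * 0.01

z = W_enc @ x            # z = f(x): compressed 32-dim code
x_hat = W_dec @ z        # x̂ = g(z): 784-dim reconstruction

# Reconstruction error the network would minimize during training
loss = np.mean((x - x_hat) ** 2)
```

Training adjusts `W_enc` and `W_dec` (plus biases and nonlinearities in a real network) so that `loss` shrinks.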
5️⃣ Types of Autoencoders
🔹Vanilla Autoencoder
Basic encoder + decoder using dense layers.
🔹Sparse Autoencoder
Adds sparsity constraint.
Encourages most neurons to be zero.
Used for feature learning.
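A minimal Keras sketch of the idea, assuming the same 784-dimensional flattened input used later in this tutorial. The `activity_regularizer` adds an L1 penalty on the bottleneck activations, which pushes most of them toward zero; the penalty weight `1e-5` is an illustrative choice:

```python
import tensorflow as tf
from tensorflow.keras import layers, models, regularizers

inp = layers.Input(shape=(784,))
# L1 activity penalty encourages a sparse bottleneck code
encoded = layers.Dense(32, activation='relu',
                       activity_regularizer=regularizers.l1(1e-5))(inp)
decoded = layers.Dense(784, activation='sigmoid')(encoded)

sparse_ae = models.Model(inp, decoded)
sparse_ae.compile(optimizer='adam', loss='binary_crossentropy')
```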
🔹Denoising Autoencoder
Input is corrupted with noise.
Target is clean version.
Learns robust representations.
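The corruption step can be sketched in NumPy as follows; the `noise_factor` of 0.3 and the random stand-in images are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)
x_clean = rng.random((4, 784)).astype('float32')  # stand-in for flattened images

# Corrupt the input with Gaussian noise, then clip back to the valid [0, 1] range
noise_factor = 0.3
x_noisy = x_clean + noise_factor * rng.standard_normal(x_clean.shape)
x_noisy = np.clip(x_noisy, 0.0, 1.0)
```

The model is then trained with the noisy images as input and the clean images as target, e.g. `autoencoder.fit(x_noisy, x_clean, ...)`.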
🔹Convolutional Autoencoder
Uses CNN layers instead of dense layers.
Best for images.
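One possible convolutional layout for 28×28 grayscale images such as MNIST; the filter counts (16 and 8) are illustrative, and the decoder mirrors the encoder with upsampling:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

inp = layers.Input(shape=(28, 28, 1))

# Encoder: conv + pooling halves the spatial size twice (28 → 14 → 7)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(inp)
x = layers.MaxPooling2D(2, padding='same')(x)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(x)
encoded = layers.MaxPooling2D(2, padding='same')(x)       # 7x7x8 bottleneck

# Decoder: conv + upsampling restores the spatial size (7 → 14 → 28)
x = layers.Conv2D(8, 3, activation='relu', padding='same')(encoded)
x = layers.UpSampling2D(2)(x)
x = layers.Conv2D(16, 3, activation='relu', padding='same')(x)
x = layers.UpSampling2D(2)(x)
out = layers.Conv2D(1, 3, activation='sigmoid', padding='same')(x)

conv_autoencoder = models.Model(inp, out)
conv_autoencoder.compile(optimizer='adam', loss='binary_crossentropy')
```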
🔹Variational Autoencoder (VAE)
A more advanced, probabilistic variant.
Instead of encoding the input to a single fixed vector, the encoder outputs the parameters of a distribution:
Mean (μ) and standard deviation (σ)
Then samples:
z = μ + σ * ε
Where ε ~ Normal(0,1)
This makes it generative.
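The sampling step above is known as the reparameterization trick. A NumPy sketch; the μ and log-variance values are made-up placeholders for what a trained encoder would output (encoders usually predict log σ² for numerical stability):

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([0.5, -1.0])          # encoder's predicted mean (example values)
log_var = np.array([0.0, 0.2])      # encoder's predicted log-variance

sigma = np.exp(0.5 * log_var)       # recover σ from log σ²
eps = rng.standard_normal(mu.shape) # ε ~ Normal(0, 1)
z = mu + sigma * eps                # reparameterized sample
```

Because the randomness lives entirely in ε, gradients can flow through μ and σ during training.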
VAEs strongly influenced modern generative AI research.
6️⃣ Loss Functions
| Task Type | Loss Function |
|-----------|---------------|
| Image reconstruction | MSE (Mean Squared Error) |
| Binary image | Binary cross-entropy |
| VAE | Reconstruction loss + KL divergence |
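For the VAE row, the combined loss can be sketched as follows, using MSE for the reconstruction term and the closed-form KL divergence between N(μ, σ²) and the standard normal N(0, 1):

```python
import numpy as np

def vae_loss(x, x_hat, mu, log_var):
    # Reconstruction term: mean squared error between input and output
    recon = np.mean((x - x_hat) ** 2)
    # KL divergence from N(mu, sigma^2) to N(0, 1), summed over latent dims
    kl = -0.5 * np.sum(1 + log_var - mu**2 - np.exp(log_var))
    return recon + kl
```

When the reconstruction is perfect and the latent distribution already matches N(0, 1), both terms vanish and the loss is zero.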
7️⃣ Applications of Autoencoders
| Application | Description |
|-------------|-------------|
| Dimensionality reduction | Compress high-dimensional data |
| Anomaly detection | Detect unusual patterns |
| Image denoising | Remove noise from images |
| Data compression | Reduce storage |
| Feature learning | Extract useful features |
| Generative models | Create new data (VAE) |
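For anomaly detection, the usual recipe is to train the autoencoder on normal data only and flag samples whose reconstruction error is unusually high. A minimal sketch; the demo arrays and threshold are illustrative:

```python
import numpy as np

def flag_anomalies(x, x_hat, threshold):
    # Per-sample mean squared reconstruction error
    errors = np.mean((x - x_hat) ** 2, axis=1)
    return errors > threshold

# Demo: the first sample reconstructs perfectly, the second does not
x = np.zeros((2, 4))
x_hat = np.vstack([np.zeros(4), np.ones(4)])
flags = flag_anomalies(x, x_hat, threshold=0.5)  # → [False, True]
```

In practice the threshold is chosen from the error distribution on held-out normal data (e.g. a high percentile).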
8️⃣ Example: Simple Autoencoder in Python
We’ll compress and reconstruct MNIST digit images.
MNIST dataset contains handwritten digits (0–9).
🔹 Install Required Library
```shell
pip install tensorflow matplotlib
```
🔹 Python Implementation
```python
import tensorflow as tf
from tensorflow.keras import layers, models
import matplotlib.pyplot as plt

# Load MNIST dataset
(x_train, _), (x_test, _) = tf.keras.datasets.mnist.load_data()

# Normalize pixel values to [0, 1]
x_train = x_train.astype('float32') / 255.
x_test = x_test.astype('float32') / 255.

# Flatten images (28x28 → 784)
x_train = x_train.reshape((len(x_train), 784))
x_test = x_test.reshape((len(x_test), 784))

# Encoder
input_img = layers.Input(shape=(784,))
encoded = layers.Dense(128, activation='relu')(input_img)
encoded = layers.Dense(64, activation='relu')(encoded)
bottleneck = layers.Dense(32, activation='relu')(encoded)

# Decoder
decoded = layers.Dense(64, activation='relu')(bottleneck)
decoded = layers.Dense(128, activation='relu')(decoded)
output = layers.Dense(784, activation='sigmoid')(decoded)

# Autoencoder model
autoencoder = models.Model(input_img, output)
autoencoder.compile(optimizer='adam', loss='binary_crossentropy')

# Train: input and target are both x_train
autoencoder.fit(x_train, x_train,
                epochs=10,
                batch_size=256,
                shuffle=True,
                validation_data=(x_test, x_test))

# Reconstruct test images
decoded_imgs = autoencoder.predict(x_test)

# Display original vs reconstructed
n = 5
plt.figure(figsize=(10, 4))
for i in range(n):
    # Original
    plt.subplot(2, n, i + 1)
    plt.imshow(x_test[i].reshape(28, 28))
    plt.gray()
    plt.axis('off')
    # Reconstructed
    plt.subplot(2, n, i + 1 + n)
    plt.imshow(decoded_imgs[i].reshape(28, 28))
    plt.gray()
    plt.axis('off')
plt.show()
```
9️⃣ What Happens in This Code?
The encoder compresses each 784-pixel image through 128 and 64 units down to the 32-neuron bottleneck, which holds the compressed representation; the decoder then expands it back to 784 pixels to reconstruct the image.
🔟 Key Differences: Autoencoder vs PCA
| PCA | Autoencoder |
|-----|-------------|
| Linear method | Nonlinear |
| No deep learning | Neural network based |
| Fast | Requires training |
| Limited complexity | Can model complex patterns |
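The "linear method" row can be made concrete: PCA is exactly a linear encoder/decoder pair built from the top-k right singular vectors of the centered data. A NumPy sketch with illustrative data and k:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((100, 20))            # 100 samples, 20 features (illustrative)
X_mean = X.mean(axis=0)
Xc = X - X_mean                      # center the data

# SVD gives the principal directions in the rows of Vt
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 5
X_proj = Xc @ Vt[:k].T               # linear "encoder": project to k dims
X_rec = X_proj @ Vt[:k] + X_mean     # linear "decoder": reconstruct

pca_error = np.mean((X - X_rec) ** 2)
```

An autoencoder with nonlinear activations can, in principle, beat this reconstruction error at the same bottleneck size, because it is not restricted to linear projections.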