Challenge

Frameworks like Hugging Face hide the 'magic'. I wanted to peel back the layers and build everything myself.

Solution

A transparent library of model implementations focused on readability and mathematical correctness.

Project Layout

backprop/
  model.py
  train.py
  data.py
shared/
  config.py
model.py
import torch
import torch.nn as nn

class MLP(nn.Module):
    """Multi-layer perceptron for regression."""

    def __init__(self, input_dim: int, hidden_dims: list[int], dropout: float):
        super().__init__()
        layers = []
        in_dim = input_dim
        # Stack a Linear -> ReLU -> Dropout block per hidden width.
        for hidden_dim in hidden_dims:
            layers.append(nn.Linear(in_dim, hidden_dim))
            layers.append(nn.ReLU())
            layers.append(nn.Dropout(p=dropout))
            in_dim = hidden_dim
        # Single output unit for scalar regression.
        layers.append(nn.Linear(in_dim, 1))
        self.network = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.network(x)

def initialize_weights(model: MLP, strategy: str):
    for module in model.modules():
        if isinstance(module, nn.Linear):
            if strategy == "kaiming":
                # Fan-in scaling tuned for ReLU keeps activation variance stable.
                nn.init.kaiming_normal_(module.weight, nonlinearity="relu")
            elif strategy == "xavier":
                nn.init.xavier_uniform_(module.weight)
            elif strategy == "normal":
                nn.init.normal_(module.weight, mean=0.0, std=0.01)
            else:
                raise ValueError(f"Unknown init strategy: {strategy!r}")
            if module.bias is not None:
                nn.init.zeros_(module.bias)
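
A quick usage sketch. The hidden widths here are hypothetical, not necessarily the configuration behind the 18,305-parameter figure; only the input width of 8 (California Housing's feature count) is fixed.

model = MLP(input_dim=8, hidden_dims=[128, 64], dropout=0.1)
initialize_weights(model, "kaiming")
print(sum(p.numel() for p in model.parameters()))  # total parameter count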

MLP & Weight Init

My first dive into building training loops from scratch to see how initialization affects convergence.

Learner Insight

I learned that Kaiming init is critical for ReLU networks: with a poorly scaled init, activations shrink layer by layer and gradients die early in training.
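
A minimal sketch of how to check this, using the MLP and initialize_weights from model.py above: push a random batch through the untrained network and compare the activation scale under each strategy (the layer sizes here are illustrative).

import torch
from backprop.model import MLP, initialize_weights  # module path per the layout above

torch.manual_seed(0)
x = torch.randn(256, 8)  # random batch with California Housing's 8 features

for strategy in ("normal", "kaiming"):
    model = MLP(input_dim=8, hidden_dims=[64, 64, 64, 64], dropout=0.0).eval()
    initialize_weights(model, strategy)
    with torch.no_grad():
        h = model.network[:-1](x)  # activations just before the output layer
    print(strategy, h.std().item())
# normal(0, 0.01) shrinks the activations toward zero within a few layers,
# so the gradients flowing back are vanishingly small; kaiming keeps them ~O(1).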

Project Architecture

Dataset: California Housing
Parameters: 18,305
Training time: < 5 min
Core Concept: Weight Initialization
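
For context, a minimal sketch of what the data.py loader could look like, using scikit-learn's built-in dataset; the split ratio and standardization are my assumptions, not confirmed details of the project.

import torch
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def load_data(test_size=0.2, seed=42):
    X, y = fetch_california_housing(return_X_y=True)
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=test_size, random_state=seed
    )
    # Fit the scaler on training data only to avoid leakage into validation.
    scaler = StandardScaler().fit(X_train)
    as_tensor = lambda a: torch.as_tensor(a, dtype=torch.float32)
    return (
        as_tensor(scaler.transform(X_train)), as_tensor(y_train).unsqueeze(1),
        as_tensor(scaler.transform(X_val)), as_tensor(y_val).unsqueeze(1),
    )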

Core Algorithms

  • AdamW
  • Cosine LR w/ Warmup
  • Kaiming Init
  • Gradient Accumulation
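
A minimal sketch of how these four pieces could fit together in a training step; the hyperparameters and step counts are illustrative, not the project's actual settings.

import math
import torch

def train(model, loader, total_steps=2000, warmup_steps=100, accum_steps=4):
    optimizer = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=0.01)

    def lr_lambda(step):
        # Linear warmup to the base LR, then cosine decay to zero.
        if step < warmup_steps:
            return step / max(1, warmup_steps)
        progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
        return 0.5 * (1.0 + math.cos(math.pi * progress))

    scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)
    loss_fn = torch.nn.MSELoss()
    model.train()

    step = 0
    while step < total_steps:
        for i, (x, y) in enumerate(loader):
            # Gradient accumulation: sum gradients over accum_steps
            # micro-batches before each optimizer update.
            loss = loss_fn(model(x), y) / accum_steps
            loss.backward()
            if (i + 1) % accum_steps == 0:
                optimizer.step()
                scheduler.step()
                optimizer.zero_grad()
                step += 1
                if step >= total_steps:
                    break

Dividing each micro-batch loss by accum_steps keeps the accumulated gradient equal to the gradient of one larger averaged batch.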