
Minimal Transformer in PyTorch

A self-contained transformer encoder built from scratch in PyTorch: multi-head scaled dot-product attention, positional encoding, and a feed-forward sublayer — under 150 lines with annotated shapes at every step.

Tags: machine-learning, transformers, nlp, deep-learning

A transformer encoder in under 150 lines of PyTorch — every tensor shape printed so you can follow the data through attention.

Source code

# Minimal Transformer in PyTorch

Code from the article [Understanding Transformers: The Architecture Behind Modern AI](https://rishisharma.in/articles/transformers-explained).

Requires Python 3.8+ and PyTorch.

```sh
pip install torch
python transformer.py
```

## Contents

- `transformer.py` — self-contained transformer encoder: multi-head attention, positional encoding, feed-forward sublayer — shapes printed at each step
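
For orientation, the three pieces listed above can be sketched as follows. This is an illustrative sketch, not the contents of `transformer.py`: the names `sinusoidal_positions`, `MultiHeadSelfAttention`, and `EncoderLayer`, the fused query/key/value projection, the ReLU activation, and the post-norm residual layout are all assumptions made for the example. It also assumes an even `d_model`.

```python
import math

import torch
import torch.nn as nn


def sinusoidal_positions(seq_len: int, d_model: int) -> torch.Tensor:
    """Fixed sine/cosine positional encoding, shape (seq_len, d_model). Assumes even d_model."""
    position = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    div_term = torch.exp(
        torch.arange(0, d_model, 2, dtype=torch.float32) * (-math.log(10000.0) / d_model)
    )  # (d_model / 2,)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(position * div_term)  # even feature indices
    pe[:, 1::2] = torch.cos(position * div_term)  # odd feature indices
    return pe


class MultiHeadSelfAttention(nn.Module):
    """Multi-head scaled dot-product self-attention (hypothetical layout)."""

    def __init__(self, d_model: int, num_heads: int) -> None:
        super().__init__()
        if d_model % num_heads != 0:
            raise ValueError("d_model must be divisible by num_heads")
        self.num_heads = num_heads
        self.d_head = d_model // num_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.out = nn.Linear(d_model, d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        batch, seq_len, d_model = x.shape
        qkv = self.qkv(x)  # (batch, seq_len, 3 * d_model)
        qkv = qkv.view(batch, seq_len, 3, self.num_heads, self.d_head)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)  # each (batch, heads, seq_len, d_head)
        scores = q @ k.transpose(-2, -1) / math.sqrt(self.d_head)  # (batch, heads, seq_len, seq_len)
        weights = scores.softmax(dim=-1)  # each row sums to 1 over the keys
        context = weights @ v  # (batch, heads, seq_len, d_head)
        context = context.transpose(1, 2).reshape(batch, seq_len, d_model)
        return self.out(context)  # (batch, seq_len, d_model)


class EncoderLayer(nn.Module):
    """Attention sublayer followed by a feed-forward sublayer (post-norm is an assumption)."""

    def __init__(self, d_model: int, num_heads: int, d_ff: int) -> None:
        super().__init__()
        self.attn = MultiHeadSelfAttention(d_model, num_heads)
        self.ff = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.ReLU(), nn.Linear(d_ff, d_model)
        )
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.norm1(x + self.attn(x))  # residual around attention
        x = self.norm2(x + self.ff(x))  # residual around feed-forward
        return x


torch.manual_seed(0)
x = torch.randn(2, 5, 16)  # (batch=2, seq_len=5, d_model=16)
x = x + sinusoidal_positions(5, 16)  # broadcasts over the batch dimension
layer = EncoderLayer(d_model=16, num_heads=4, d_ff=32)
y = layer(x)
print(y.shape)  # torch.Size([2, 5, 16])
```

The layer maps a `(batch, seq_len, d_model)` tensor to one of the same shape, so layers of this form can be stacked without reshaping in between.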
