Loom

A decoder-only transformer (a small GPT), built and trained from scratch in PyTorch. The goal is understanding every part, not competing with frontier models. You write the core; this scaffold gives you the structure, the plumbing, and a roadmap.

Setup

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

The build, in order

Each step is one concept. The plumbing (data, config, batching, device selection) is already written so you can focus on the model and the loop.

1. Data: run python data/prepare.py. It downloads ~1 MB of Shakespeare, builds a char-level vocab, and writes train.bin, val.bin, and meta.pkl. Char-level means no tokenizer is needed yet.
2. Attention: implement CausalSelfAttention.forward in model.py
3. MLP: implement MLP.forward
4. Block: wire attention + MLP with residuals in Block.forward
5. Full model: implement GPT.forward (embeddings -> blocks -> loss)
6. Sampling: implement GPT.generate
7. Train: fill in the loop in train.py, then run python train.py
8. Generate: finish sample.py, then run python sample.py
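
The idea behind step 2 can be seen without PyTorch. The sketch below is a dependency-free illustration, not the code that belongs in model.py: it takes a square matrix of raw attention scores, hides future positions, and applies a row-wise softmax. The function name causal_attention_weights and the example scores are made up for illustration; in CausalSelfAttention.forward the same masking would be done with tensor operations.

```python
import math

def causal_attention_weights(scores):
    """Row-wise softmax over scores[i][j], restricted to j <= i.

    scores is a T x T list of lists of raw query-key dot products
    (hypothetical input; a real model computes these from Q and K).
    """
    T = len(scores)
    weights = []
    for i in range(T):
        # Position i may only attend to positions 0..i (no peeking ahead).
        visible = scores[i][: i + 1]
        # Subtract the max before exponentiating for numerical stability.
        m = max(visible)
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        # Future positions get a weight of exactly 0.
        row = [e / total for e in exps] + [0.0] * (T - i - 1)
        weights.append(row)
    return weights

# Made-up scores for a sequence of three positions.
scores = [
    [0.5, 2.0, -1.0],
    [1.0, 1.0, 3.0],
    [0.0, 0.0, 0.0],
]
weights = causal_attention_weights(scores)
for row in weights:
    print([round(w, 3) for w in row])
```

Row 0 can only see itself, so all of its weight lands on position 0. Row 1 splits its weight evenly between its two equal visible scores and ignores the larger score at the future position.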

Once that works end-to-end, move on to milestone 2: implement real BPE (byte-pair encoding) in tokenizer.py and switch from characters to subword tokens.
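
The core of BPE training is a short loop: count adjacent token pairs, merge the most frequent pair into a new token, and repeat. The following is a minimal sketch under stated assumptions: it starts from raw UTF-8 bytes, and the helper names (count_pairs, merge_pair, train_bpe) are hypothetical rather than taken from tokenizer.py.

```python
from collections import Counter

def count_pairs(ids):
    """Count occurrences of each adjacent pair of token ids."""
    return Counter(zip(ids, ids[1:]))

def merge_pair(ids, pair, new_id):
    """Replace every non-overlapping occurrence of pair with new_id."""
    out = []
    i = 0
    while i < len(ids):
        if i + 1 < len(ids) and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

def train_bpe(text, num_merges):
    """Learn up to num_merges merges, starting from the 256 byte values."""
    ids = list(text.encode("utf-8"))
    merges = {}  # (left_id, right_id) -> new_id
    for step in range(num_merges):
        pairs = count_pairs(ids)
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break  # nothing repeats, so further merges save nothing
        new_id = 256 + step
        ids = merge_pair(ids, pair, new_id)
        merges[pair] = new_id
    return ids, merges

ids, merges = train_bpe("aaabdaaabac", num_merges=3)
print(ids)
print(merges)
```

On this toy string, three merges shrink the 11-byte input to 5 tokens. A full tokenizer would also need encode and decode functions that replay the learned merges, which this sketch leaves out.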

Expectations

A ~10M-parameter model on Shakespeare trains in minutes on a GPU, or in an hour or so on a CPU. It won't be smart, but it will learn to produce text that looks like Shakespeare: character names, line breaks, archaic phrasing. That "it went from noise to structure" moment is the whole point.
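
To see where a figure like ~10M comes from, the parameter count of a GPT-style model can be estimated from its hyperparameters. The values below (6 layers, 384-dimensional embeddings, a 65-character vocab, a 256-token context) are assumptions chosen for illustration, not values read from config.py. The formula also assumes biases on every linear layer and LayerNorm, a 4x MLP expansion, and an output head whose weights are tied to the token embedding.

```python
def gpt_param_count(n_layer, n_embd, vocab_size, block_size):
    """Estimate parameters for a GPT-style model (assumptions in the text above)."""
    d = n_embd
    tok_emb = vocab_size * d           # token embedding (shared with output head)
    pos_emb = block_size * d           # learned position embedding
    layer_norm = 2 * d                 # scale + bias
    attn = (d * 3 * d + 3 * d) + (d * d + d)      # qkv projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)   # expand to 4d, project back to d
    block = 2 * layer_norm + attn + mlp
    return tok_emb + pos_emb + n_layer * block + layer_norm  # + final LayerNorm

# Hypothetical hyperparameters, not taken from config.py.
total = gpt_param_count(n_layer=6, n_embd=384, vocab_size=65, block_size=256)
print(f"{total:,} parameters ({total / 1e6:.2f}M)")
```

With these assumed values the estimate comes to about 10.8M parameters, almost all of it in the transformer blocks; the embeddings contribute only around 0.1M.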

Layout

config.py        all hyperparameters in one place
model.py         the transformer  <- you implement this
train.py         training loop    <- you implement the loop
sample.py        generate from a checkpoint
tokenizer.py     BPE (milestone 2)
data/prepare.py  download + encode corpus (done)
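
The "encode corpus" part of data/prepare.py rests on a simple idea: a char-level vocab is the sorted set of distinct characters plus two lookup tables. This is a minimal sketch that assumes nothing about how the real script is written; the names stoi, itos, encode, and decode are illustrative, and the sample text stands in for the downloaded corpus.

```python
text = "To be, or not to be"  # stand-in for the real corpus

# The vocab is every distinct character, in a stable sorted order.
chars = sorted(set(text))
vocab_size = len(chars)

# Lookup tables: character -> integer id, and back again.
stoi = {ch: i for i, ch in enumerate(chars)}
itos = {i: ch for ch, i in stoi.items()}

def encode(s):
    """Map a string to a list of integer ids, one per character."""
    return [stoi[ch] for ch in s]

def decode(ids):
    """Map a list of integer ids back to a string."""
    return "".join(itos[i] for i in ids)

ids = encode(text)
print(vocab_size, ids[:5])
assert decode(ids) == text  # encoding is lossless
```

Because every character maps to exactly one id, encoding and decoding round-trip exactly. The cost is long sequences, which is what the move to BPE in milestone 2 addresses.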
