sigfault-byte/myGPT

Small-Scale Language Model Experiments (Rousseau-style corpus)

Overview

This repository contains a series of quick experiments training small GPT-style language models from scratch on a limited corpus (growing from ~2MB to ~10MB) of 18th-century French texts (Rousseau, later supplemented with contemporaries).

The goal is not to achieve production-quality text generation, but to explore:

- the impact of dataset size
- tokenizer choice (char vs BPE)
- model capacity vs data scale
- training dynamics (loss vs sample quality)

Observations

Observations for most runs can be found in `runs/experimentID/info.md`.
