Tiny Shakespeare GPT

A 1.83M-parameter, character-level, decoder-only transformer trained on Tiny Shakespeare. It uses the same architecture as GPT-2, scaled down.
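
To illustrate what "character-level" means here, the following is a minimal sketch of a character tokenizer, assuming the vocabulary is simply the sorted set of unique characters in the training text. The helper names (`build_vocab`, `encode`, `decode`) and the sample string are hypothetical and may not match this project's actual code.

```python
# Minimal character-level tokenizer sketch.
# Assumption: the vocabulary is the sorted set of unique characters in the corpus.
# All function names here are illustrative, not taken from this repository.

def build_vocab(text):
    """Return (stoi, itos) mappings between characters and integer ids."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(s, stoi):
    """Map a string to a list of integer token ids, one per character."""
    return [stoi[ch] for ch in s]


def decode(ids, itos):
    """Map a list of integer token ids back to a string."""
    return "".join(itos[i] for i in ids)


# Stand-in for the Tiny Shakespeare corpus; the real text would be read from a file.
sample = "To be, or not to be"
stoi, itos = build_vocab(sample)
ids = encode("to be", stoi)
assert decode(ids, itos) == "to be"
```

With this scheme the vocabulary size equals the number of distinct characters in the corpus, so the embedding and output layers stay small, which suits a model of this size.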