Domain-Specialized NanoGPT
A hands-on project training NanoGPT (character-level GPT) models on three tiny corpora (Shakespeare, Wikipedia, and a Math textbook) to study how domain-specific data reshapes generation. I built an evaluation and interpretability workflow covering zero-shot vs few-shot transfer, softmax confidence dynamics, and gradient-based token attribution (Grad-CAM style).
- GitHub: TODO (paste repo link)
- Demo: TODO (optional)
Notes
What I built
- Domain training (3 models): Implemented a reusable dataset/vocabulary pipeline and trained NanoGPT checkpoints (`iter_*.pt`, `final.pt`) for Shakespeare/Wikipedia/Math (pipeline sketch below).
- Zero-shot cross-domain evaluation: Evaluated each trained model on all other corpora (loss + controlled prompt generations) to quantify distribution-shift effects (evaluation sketch below).
- Few-shot transfer: Fine-tuned from early/mid/late checkpoints on a target corpus and tracked loss to compare adaptation speed vs. pretraining maturity (fine-tuning sketch below).
- Softmax inspection: Tracked top-token probabilities and entropy across checkpoints to analyze confidence sharpening and overconfidence trends (confidence-tracking sketch below).
- Grad-CAM-style interpretability: Used embedding-gradient norms to estimate per-input-token importance for next-token predictions; exported results to CSV and generated plots across prompts/checkpoints (attribution sketch below).
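
The sketches below approximate each of these steps. They are illustrative rather than the exact repo code: helper names, module paths, file names, and hyperparameters are assumptions. First, a minimal character-level vocabulary/dataset pipeline of the kind the domain-training step relies on:

```python
import torch

def build_vocab(text: str):
    # Character-level vocabulary: one id per distinct character in the corpus.
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos

def encode(text: str, stoi: dict) -> torch.Tensor:
    return torch.tensor([stoi[c] for c in text], dtype=torch.long)

def get_batch(data: torch.Tensor, block_size: int, batch_size: int):
    # Random contiguous windows; targets are inputs shifted right by one token.
    ix = torch.randint(len(data) - block_size, (batch_size,))
    x = torch.stack([data[i:i + block_size] for i in ix])
    y = torch.stack([data[i + 1:i + 1 + block_size] for i in ix])
    return x, y
```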
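
Zero-shot cross-domain evaluation can then be a nested loop: score every checkpoint's average loss on every corpus, with each corpus encoded using that model's own vocabulary. The `model(x) -> logits` signature and the `load_model` helper are assumptions, not the repo's API.

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def eval_loss(model, data, block_size=256, batch_size=64, iters=50):
    # Average cross-entropy over a fixed number of random batches.
    model.eval()
    losses = []
    for _ in range(iters):
        x, y = get_batch(data, block_size, batch_size)
        logits = model(x)  # assumed to return (B, T, vocab_size)
        losses.append(F.cross_entropy(
            logits.view(-1, logits.size(-1)), y.view(-1)).item())
    return sum(losses) / len(losses)

# for ckpt in ("shakespeare/final.pt", "wiki/final.pt", "math/final.pt"):
#     model = load_model(ckpt)            # hypothetical loader
#     for name, data in corpora.items():  # corpora encoded with this model's vocab
#         print(ckpt, name, eval_loss(model, data))
```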
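
Few-shot transfer reuses the same batching: resume from an early/mid/late source-domain checkpoint and fine-tune briefly on the target corpus while logging the loss curve. The step count and learning rate here are placeholders.

```python
import torch
import torch.nn.functional as F

def finetune(model, target_data, steps=200, lr=3e-4, block_size=256, batch_size=32):
    # Short fine-tuning run; the returned loss history is what gets compared
    # across early/mid/late starting checkpoints.
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    history = []
    model.train()
    for _ in range(steps):
        x, y = get_batch(target_data, block_size, batch_size)
        logits = model(x)  # assumed to return (B, T, vocab_size)
        loss = F.cross_entropy(logits.view(-1, logits.size(-1)), y.view(-1))
        opt.zero_grad(set_to_none=True)
        loss.backward()
        opt.step()
        history.append(loss.item())
    return history
```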
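
For the softmax inspection, a fixed probe prompt is run through each checkpoint and the next-token distribution is summarized by its top probability and its entropy H = -Σ p_i log p_i (lower entropy = sharper, more confident predictions):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def confidence_stats(model, prompt_ids: torch.Tensor):
    # prompt_ids: (1, T) tensor of token ids for a fixed probe prompt.
    logits = model(prompt_ids)                # assumed (1, T, vocab_size)
    probs = F.softmax(logits[0, -1], dim=-1)  # next-token distribution
    top_p, top_id = probs.max(dim=-1)
    entropy = -(probs * torch.log(probs + 1e-12)).sum()
    return top_p.item(), top_id.item(), entropy.item()
```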
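
Finally, the Grad-CAM-style attribution: backpropagate the predicted token's log-probability to the input embeddings and read off per-token gradient norms as importance scores. This hook-based version only assumes the model exposes its token-embedding module (e.g. `transformer.wte` in nanoGPT-style code); the forward signature is again an assumption.

```python
import torch
import torch.nn.functional as F

def token_attribution(model, prompt_ids: torch.Tensor, embedding_layer):
    # Capture the embedding output so its gradient can be read after backward.
    captured = {}

    def hook(module, inputs, output):
        output.retain_grad()
        captured["emb"] = output

    handle = embedding_layer.register_forward_hook(hook)
    model.zero_grad(set_to_none=True)
    logits = model(prompt_ids)                # assumed (1, T, vocab_size)
    handle.remove()

    log_probs = F.log_softmax(logits[0, -1], dim=-1)
    log_probs[log_probs.argmax()].backward()  # gradient of the top prediction

    # Per-input-token importance = L2 norm of the gradient on its embedding.
    return captured["emb"].grad[0].norm(dim=-1)  # shape (T,)
```

These per-token scores are what get exported to CSV and plotted across prompts and checkpoints.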
Key takeaways
- Domain data strongly changes next-token “classification” behavior (both loss and generations).
- Few-shot adaptation is sensitive to how far training has progressed before transfer.
- Softmax entropy provides a practical signal for confidence/saturation over training.
- Gradient-based attribution can reveal how token importance shifts across domains and checkpoints.
