Fine-tuning LLMs to solve and explain UC Berkeley CS61A problems using reinforcement learning with test verification.
Status: Just getting started
Build an LLM that can:
- Solve CS61A programming problems (Python, Scheme)
- Explain solutions step-by-step
- Provide hints without giving away full answers
Planned data sources:
- Code problems with test verification
- Discussion/lab worksheet problems
- Lecture transcript Q&A
- Video explanations (multimodal; planned for later)
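As a rough illustration, the text-based items above could share one record type. This is only a sketch: `SourceType`, `Example`, and every field name are hypothetical and not part of this repository or of the Tinker SDK.

```python
from dataclasses import dataclass, field
from enum import Enum


class SourceType(Enum):
    """Hypothetical labels for where a training example came from."""
    CODE_PROBLEM = "code_problem"
    WORKSHEET = "worksheet"
    LECTURE_QA = "lecture_qa"


@dataclass
class Example:
    """One training example; all field names are illustrative assumptions."""
    source: SourceType
    prompt: str
    reference_answer: str = ""
    explanation: str = ""
    language: str = "python"  # "python" or "scheme"
    tests: list[str] = field(default_factory=list)

    def is_verifiable(self) -> bool:
        # Only code problems that ship with tests can yield a test-based reward.
        return self.source is SourceType.CODE_PROBLEM and bool(self.tests)
```

Keeping `tests` on the record makes it easy to route verifiable examples to RL and the rest to SFT.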
Using the Tinker API for fine-tuning with:
- RL with test verification: the reward signal comes from whether generated code passes the problem's unit tests
- SFT on explanations: supervised training on (problem, solution, explanation) triples
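The test-based reward can be sketched as follows. This is a minimal illustration, not the project's implementation and not a Tinker API: `unit_test_reward`, its parameters, and the choice of "fraction of tests passed" as the reward are all assumptions. It uses only the Python standard library and handles Python solutions only.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def unit_test_reward(solution_code: str, test_cases: list[str],
                     timeout_s: float = 5.0) -> float:
    """Return the fraction of test cases that pass (hypothetical reward shape).

    Each test case is a snippet of Python (e.g. an assert statement) that is
    appended to the candidate solution and run in a fresh interpreter.
    """
    if not test_cases:
        return 0.0
    passed = 0
    with tempfile.TemporaryDirectory() as tmp:
        for i, test in enumerate(test_cases):
            script = Path(tmp) / f"case_{i}.py"
            script.write_text(solution_code + "\n\n" + test + "\n",
                              encoding="utf-8")
            try:
                result = subprocess.run(
                    [sys.executable, str(script)],
                    capture_output=True,
                    timeout=timeout_s,
                )
            except subprocess.TimeoutExpired:
                continue  # an infinite loop or slow solution counts as a failure
            if result.returncode == 0:
                passed += 1
    return passed / len(test_cases)


if __name__ == "__main__":
    candidate = "def square(x):\n    return x * x\n"
    tests = ["assert square(3) == 9", "assert square(2) == 5"]
    print(unit_test_reward(candidate, tests))
```

A subprocess with a timeout is not a sandbox; model-generated code should be run in an isolated environment before this is used for real training.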
Requirements:
- Python ≥3.11
- Tinker SDK
- PyTorch
- HuggingFace Transformers
See TINKER_OVERVIEW.md for Tinker API documentation.
License: Apache 2.0