| 1-2 | Foundations: deep learning, computational graphs, autodiff, and ML frameworks |
| 3 | GPUs, CUDA, and collective communication |
| 4 | Graph and memory optimizations |
| 4 | Guest lecture (TBD): GPU and hardware trends |
| 5 | Parallelism in ML systems |
| 6 | Transformers, LLMs, and diffusion models |
| 6 | Guest lecture (TBD): LLM systems |
| 7 | LLM training: FlashAttention and related efficiency techniques |
| 8 | LLM inference and serving: paged attention, continuous batching, and speculative decoding |
| 9 | Guest lecture (TBD): agents |
| 9 | Scaling laws, test-time compute, and reasoning |
| 10 | LLM + X: RAG, search, multimodality, and related directions |
| 10 | Guest lecture (TBD): TBD |
| 10 | Final exam review |
| 11 | Final exam |