AI Summary Series
After the modest success of the CS336 Series, I decided to expand this idea to other courses that are either popular with a wider audience or personally interesting to me. The motivation remains the same: to explore whether we can learn a course faster and (hopefully) better with the help of LLMs.
At the moment, the following courses are covered:
- CS336 Series: Language Modeling from Scratch by Prof. Percy Liang and the Stanford CS336 team.
- Stanford CS236 - Deep Generative Models: Deep Generative Models by Prof. Stefano Ermon and the team.
- MIT 6.S184 - Flow Matching and Diffusion Models: Flow Matching and Diffusion Models by Peter Holderrieth, Ezra Erives, and the MIT team.
- Karpathy’s Series: The famous series by Andrej Karpathy on building LLMs from scratch.
- Agentic AI Series: A collection of lectures and tutorials on building agentic AI systems (unfortunately, I couldn’t find the original source of this series).
Disclaimers:
- All credits go to the original authors and instructors.
- This should not be seen as a substitute for the official materials. Watching the lectures directly will always give you the best learning experience.
- The actual content is produced by LLMs with very little human intervention. While I believe that today’s flagship reasoning LLMs can often summarize better than the average human, they are still not perfect. There may be mistakes, hallucinations, or missing insights—especially since I intentionally avoid adding too much of my own interpretation. So please keep these limitations in mind and take the content with a grain of salt.
CS336 - Language Modeling from Scratch
- CS336 Lecture 1: Overview and Tokenization
- CS336 Lecture 2: PyTorch, resource accounting
- CS336 Lecture 3: Architectures, hyperparameters
- CS336 Lecture 4: Mixture of experts
- CS336 Lecture 5: GPUs
- CS336 Lecture 6: Kernels, Triton
- CS336 Lecture 7: Parallelism
- CS336 Lecture 8: Parallelism - Part 2
- CS336 Lecture 9: Scaling laws
- CS336 Lecture 10: Inference
- CS336 Lecture 11: Scaling laws - Part 2
- CS336 Lecture 12: Evaluation
- CS336 Lecture 13: Data
- CS336 Lecture 14: Data - Part 2
- CS336 Lecture 15: Alignment - SFT/RLHF
- CS336 Lecture 16: Alignment - RL
- CS336 Lecture 17: Alignment - RL - Part 2
Karpathy Zero to Hero
- Karpathy Series - Building Micrograd
- Karpathy Series - Building Makemore
- Karpathy Series - Building Makemore Part 2 - MLP
- Karpathy Series - Building Makemore Part 3 - Activations, Gradients, BatchNorm
- Karpathy Series - Building Makemore Part 4 - Becoming a Backprop Ninja
- Karpathy Series - Building Makemore Part 5 - Wavenet
- Karpathy Series - Intro to Large Language Models
- Karpathy Series - Deep Dive into LLMs like ChatGPT
- Karpathy Series - How I use LLMs
- Karpathy Series - Let’s reproduce GPT-2
- Karpathy Series - Let’s build the GPT Tokenizer
- Karpathy Series - Let’s build GPT from scratch
Agentic AI
- Agent 01 - Intro to Large Language Models by Andrej Karpathy
- Agent 02 - Building Large Language Models by Stanford CS229
- Agent 03 - Stanford Webinar - Agentic AI - A Progression of Language Model Usage
- Agent 04 - Building and Evaluating AI Agents by Sayash Kapoor (AI Snake Oil)
- Agent 05 - How We Build Effective Agents by Barry Zhang (Anthropic)
- Agent 06 - Building Agents with MCP by Mahesh Murag (Anthropic)
- Agent 07 - Building an Agent from Scratch by AI Engineer
- Agent 08 - Building Game Simulation Agents
Stanford CS236 - Deep Generative Models
- Lecture 1 - Introduction
- Lecture 2 - Background
- Lecture 3 - Autoregressive Models
- Lecture 4 - Maximum Likelihood Learning
- Lecture 5 - VAEs
- Lecture 6 - VAEs - Part 2
- Lecture 7 - Normalizing Flows
- Lecture 8 - Normalizing Flows - Part 2
- Lecture 9 - GANs
- Lecture 10 - GANs - Part 2
- Lecture 11 - Energy-based Models
- Lecture 12 - Energy-based Models - Part 2
- Lecture 13 - Score-based Models
- Lecture 14 - Score-based Models - Part 2
- Lecture 15 - Evaluation of Generative Models
- Lecture 16 - Score Based Diffusion Models
- Lecture 17 - Discrete Latent Variable Models
- Lecture 18 - Diffusion Models for Discrete Data
MIT 6.S184 - Flow Matching and Diffusion Models
- MIT 6.S184 Lecture 1 - Generative AI with SDEs
- MIT 6.S184 Lecture 2 - Constructing a Training Target
- MIT 6.S184 Lecture 3 - Training Flow and Diffusion Models
- MIT 6.S184 Lecture 4 - Building an Image Generator
- MIT 6.S184 Lecture 5 - Diffusion for Robotics
- MIT 6.S184 Lecture 6 - Diffusion for Protein Generation
Pipeline
The pipeline is designed to be as automated as possible. It is composed of the following steps:
- Get the transcript of the lecture using a YouTube transcript tool. I use the free tier from tactiq.io (https://tactiq.io/tools/youtube-transcript). (Update on Dec 17, 2025: I found that tactiq.io actually uses the YouTube Transcript API under the hood to get the transcript, so I switched to using that directly, making the pipeline fully automated with the YouTube URL as the only input.)
- Use the OpenAI API (gpt-5-mini model) to summarize the transcript, outputting a list of key discussions with detailed explanations and the timestamps associated with each discussion.
- Use the OpenAI API (gpt-5-mini model) to refine the detailed explanations into a blog-post-friendly format (highlighting, itemizing, etc.).
- With the timestamps and the YouTube video URL, capture the relevant frames from the lecture video to illustrate the key discussions. This is done with the `youtube-dl` tool to download the video (only the specific segments around each timestamp) and the `ffmpeg` tool to extract the frames.
- Put all the content and frames together and generate the blog post following the al-folio blog post format.
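The frame-capture step above can be sketched as follows. This is a minimal illustration, not the actual pipeline code: it assumes timestamps come in `HH:MM:SS` (or `MM:SS`) form, uses `yt-dlp` (the maintained fork of `youtube-dl`, whose `--download-sections` flag supports segment-only downloads) and `ffmpeg`'s `fps` filter, and the function names and padding default are my own invention.

```python
import shlex


def ts_to_seconds(ts: str) -> int:
    """Convert a 'HH:MM:SS' or 'MM:SS' timestamp into seconds."""
    secs = 0
    for part in ts.split(":"):
        secs = secs * 60 + int(part)
    return secs


def frame_capture_commands(url: str, ts: str, out_prefix: str, pad: int = 5):
    """Build the two shell commands for one timestamp:
    1) download only the segment around `ts` (yt-dlp --download-sections),
    2) extract one frame per second from that clip with ffmpeg.
    Returns (download_cmd, extract_cmd) as strings.
    """
    center = ts_to_seconds(ts)
    start, end = max(center - pad, 0), center + pad
    clip = f"{out_prefix}.mp4"
    section = f"*{start}-{end}"  # yt-dlp section spec, in seconds
    download = (
        f"yt-dlp --download-sections {shlex.quote(section)} "
        f"-o {shlex.quote(clip)} {shlex.quote(url)}"
    )
    extract = (
        f"ffmpeg -i {shlex.quote(clip)} -vf fps=1 "
        f"{shlex.quote(out_prefix)}_%03d.png"
    )
    return download, extract
```

For example, `frame_capture_commands("https://youtu.be/abc", "01:23:45", "lec1")` yields a `yt-dlp` command that downloads only the ten seconds around 1:23:45 and an `ffmpeg` command that dumps `lec1_001.png`, `lec1_002.png`, and so on; in the real pipeline these would be run via `subprocess` for each timestamp returned by the summarization step.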
Enjoy Reading This Article?
Here are some more articles you might like to read next: