Reviewing Daniel Han’s guest lecture + Backprop Ninja
Daniel’s Triton Kernels lecture
I wanted to briefly review Daniel’s lecture from last week on Triton kernels from first principles: simplifying the derivatives of LLM blocks and using calculus tricks to write custom backprop kernels in Triton rather than letting autograd do its magic. A minimal sketch of the kind of simplification involved is below.
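To be clear, this is my own toy illustration rather than one of Daniel’s kernels: the textbook example of this kind of analytic simplification is fused softmax + cross-entropy, where the chain rule collapses to “probabilities minus one-hot”, so a hand-written backward never needs the full softmax Jacobian.

```python
import torch
import torch.nn.functional as F

# Analytic shortcut used in fused cross-entropy kernels:
# for L = cross_entropy(logits, target) (softmax inside), the chain rule gives
# dL/dlogits = softmax(logits) - one_hot(target), divided by batch size for 'mean' reduction.
logits = torch.randn(4, 10, requires_grad=True)
target = torch.randint(0, 10, (4,))

loss = F.cross_entropy(logits, target)   # default reduction='mean'
loss.backward()

probs = F.softmax(logits, dim=-1)
manual_grad = (probs - F.one_hot(target, 10).float()) / logits.shape[0]

print(torch.allclose(logits.grad, manual_grad, atol=1e-6))  # True
```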
Backprop Ninja
This reminded me of the Backprop Ninja lecture by Andrej Karpathy in his NN Zero To Hero series on YouTube. So of course, just for practice and to get a stronger grasp on backprop, I did the exercises and implemented the entire backward pass by hand (well, by Apple Pencil + iPad). As a first-principles kind of guy, there’s nothing better than going this low: understanding where the \(\mathbf{W}^\top\) comes from in the derivatives by writing out all the dot products, differentiating the loss w.r.t. the inputs/weights, and seeing the pattern emerge.
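For the \(\mathbf{W}^\top\) pattern specifically, here’s a small sketch of my own (not from the exercise set): for a linear layer \(y = xW + b\), writing out the dot products and differentiating gives \(\partial L/\partial x = (\partial L/\partial y)\,W^\top\) and \(\partial L/\partial W = x^\top (\partial L/\partial y)\), which you can check against autograd.

```python
import torch

# Hand-derived gradients of a linear layer y = x @ W + b, checked against autograd.
# Differentiating the dot products element by element is exactly where the
# transposes come from: dL/dx = dL/dy @ W.T and dL/dW = x.T @ dL/dy.
x = torch.randn(8, 3, requires_grad=True)
W = torch.randn(3, 5, requires_grad=True)
b = torch.randn(5, requires_grad=True)

y = x @ W + b
loss = y.pow(2).sum()          # any scalar loss works for the check
loss.backward()

dy = 2 * y                     # dL/dy for this particular loss
dx = dy @ W.T                  # dL/dx: the W^T pattern
dW = x.T @ dy                  # dL/dW
db = dy.sum(0)                 # dL/db: sum over the batch dimension

print(torch.allclose(x.grad, dx), torch.allclose(W.grad, dW), torch.allclose(b.grad, db))
```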
I went through the playlist 2 years ago already, in my first wave of interest in ML, and it’s been good getting back in the game. I’m feeling nostalgic; I miss the good parts about living in New York and being locked in.
Misc
I saw the leaks for the upcoming Meta Connect keynote about the new Meta x {Ray-Ban, Oakley} glasses, and I can’t wait for us to enter the era of good AR glasses and embedded AI assistants! This reminds me of a theory I saw on X that the iPhone 17 Air serves several very interesting purposes for Apple: first as a “fashion” device that pulls that customer base away from the Pros, but also as a crash test in miniaturization, one that will finance the progress and experiments toward packing this kind of tech into Apple’s own AR glasses, or even using the thinness for foldable phones.
Good time to be optimistic about tech.