Well, that’s disappointing (not the paper, it’s awesome, but my free time)
Still too busy to code math / ML …
Built (compiled) our new product’s Yocto Linux image from scratch in Devcontainers on macOS
Writing notes about negative log-likelihood and reading about LoRA on Connectionism
Today I took Saturday to catch up on Thu and Fri’s lessons, which I missed because of how busy I’ve been
Two super interesting guest lectures from Scratch To Scale, on Async TP and Distributed Inference
Learning about Pipeline Parallelism and Tensor Parallelism
Fixing prod issues at Kivala, working on a design overhaul of the product + progress on ZeRO-2 and 3
Went through the new DataLoader (Sharding) Workshop for S2S and searched for more techniques to make my ZeRO implementation better
Writing the backbone for ZeRO-2 and an instructive call with Tunji Ruwase from Snowflake introducing Arctic Sequence Length Training as part of S2S
Implemented DP and ZeRO-1 from scratch in a Modal notebook followed by two guest lectures, DTensor/DeviceMesh and Parallel Processing, as part of S2S
Wrote about expectation/variance for random vectors and covariance matrix, followed by the S2S course on ZeRO-3 and HSDP
Reviewing Daniel Han’s Triton kernel lecture and going through backprop ninja (again) by Andrej Karpathy
Read Thinking Machines’ first blog article + Cursor’s article on online RL training for Cursor Tab
Wrote some notes on eigendecomposition and the Jacobian matrix
A guest speaker session with Daniel Han on writing Triton kernels (S2S), a light mode for this website and new linear algebra notes
Evening guest speaker: FP8 Training with Phuc (S2S)
Evening study session for Scratch To Scale (S2S)
A superb Marimo workshop with Vincent Warmerdam (S2S) and binge-watching Essence of Calculus (3b1b)
An enlightening workshop to learn Ray with Robert Nishihara of Anyscale
Finishing the preliminaries for D2L and a quick evening study session on DDP with torch.distributed primitives
Questions surrounding distributed training at FAIR
Learning about distributed training on GPUs (S2S) and preliminaries for D2L
Learning more about Linear Transformations, Eigen{vectors, values} and Diagonalization
(Re-)Developing intuition around Linear Algebra
Today I decided I would convert my old Next.js blog to Notebooks + Quarto and document my learning process