Show HN: UNet diffusion model in pure CUDA

Hi HN! Inspired by Andrej Karpathy's llm.c (https://ift.tt/Zcy3uw6), I wrote a full diffusion model training loop in CUDA. I learnt a lot about CUDA from Simon Boehm's matmul blog post (https://ift.tt/1zJNwUH). There is still a lot of room for optimization: the model currently runs at 45% of the speed of PyTorch with torch.compile. I'm curious to hear any thoughts or CUDA tips for convolutions.

https://ift.tt/cXWUCdZ

June 28, 2024 at 10:00PM