Simplifying Equivariant GPU Kernels through Tile-based Programming
Author(s)
Kotak, Mit
Advisor
Smidt, Tess E.
Abstract
E(3)-equivariant neural networks have demonstrated success across a wide range of 3D modeling tasks. Until recently, they were bottlenecked by their high memory and wall-time requirements. In this thesis we first provide an overview of recent GPU kernel efforts by both academia and industry that address this issue. These approaches trade off performance for engineering complexity, while still being algorithmically bottlenecked at 10% GPU utilization. We instead trade off engineering complexity for performance. This not only lowers the barrier to GPU programming but also builds an abstraction layer to reason about future algorithmic innovations that can improve GPU utilization. Our kernel 𝐵3, based on the tiling- optimizations in just 100 lines of PyTorch-like code. We explore the performance-simplicity tradeoff with two case studies and demonstrate the practicality of our kernel workflow through downstream integration with a production model. We hope this work serves as inspiration to broaden and deepen existing equivariant kernel efforts.
Date issued
2025-09
Department
Massachusetts Institute of Technology. Center for Computational Science and Engineering
Publisher
Massachusetts Institute of Technology