Simplifying Equivariant GPU Kernels through Tile-based Programming

Author(s)

Kotak, Mit

Advisor

Smidt, Tess E.

Terms of use

Attribution-NonCommercial-NoDerivatives 4.0 International (CC BY-NC-ND 4.0). Copyright retained by author(s). https://creativecommons.org/licenses/by-nc-nd/4.0/

Abstract

E(3)-equivariant neural networks have demonstrated success across a wide range of 3D modeling tasks. Until recently, they were bottlenecked by their high memory and wall-time requirements. In this thesis, we first provide an overview of recent GPU kernel efforts by both academia and industry that address this issue. These approaches trade off performance for engineering complexity, while still being algorithmically bottlenecked at 10% GPU utilization. We instead trade off engineering complexity for performance. This not only lowers the barrier to GPU programming but also builds an abstraction layer for reasoning about future algorithmic innovations that can improve GPU utilization. Our kernel 𝐵3, based on the tiling- optimizations in just 100 lines of PyTorch-like code. We explore the performance-simplicity tradeoff with two case studies and demonstrate the practicality of our kernel workflow through downstream integration with a production model. We hope this work serves as inspiration to broaden and deepen existing equivariant kernel efforts.

Date issued

2025-09

URI

https://hdl.handle.net/1721.1/164822

Department

Massachusetts Institute of Technology. Center for Computational Science and Engineering

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses