Symmetries in Neural Network Functions and Parameters
Author(s)
Lim, Derek
Advisor
Jegelka, Stefanie
Abstract
Modern neural networks are large, complex objects, which can be difficult to study and work with. In this thesis, I analyze and improve neural networks from the perspective of symmetries, with particular focus on function symmetries and parameter symmetries. Function symmetries are transformations of the input that lead to predictable changes in the output, which can be enforced in neural network architectures to improve performance on data with symmetry structures. Parameter symmetries are transformations of parameters that leave the underlying neural network function unchanged, and they have impacts on various empirical phenomena in neural networks. In Part I of this thesis, I focus on function symmetries, and develop new methods and analysis techniques for equivariant neural networks that have function symmetries baked into their architectures. I apply these techniques primarily to eigenvector-valued data, resulting in the first provably expressive neural network architectures that respect the symmetries of eigenvector data. In Part II, I focus on parameter symmetries, and analyze their impact on various empirical phenomena of neural networks, as well as their impact on the open-weight ecosystem of models with publicly shared parameters. In Part III, I consider both function and parameter symmetries to construct metanetworks: models that take in the parameters of other neural networks as input. Since the inputs to metanetworks are parameters, I develop metanetworks that are invariant or equivariant to the parameter symmetries of the input networks. All in all, my work shows that accounting for function and parameter symmetries is both theoretically and empirically beneficial across diverse types of data, learning tasks, neural network architectures, and other parts of the deep learning pipeline.
Date issued
2025-09
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology