Publications
Recent Submissions
- Compositional Sparsity of Learnable Functions
(Center for Brains, Minds and Machines (CBMM), 2024-02-08) Neural networks have demonstrated impressive success in various domains, raising the question of what fundamental principles underlie the effectiveness of the best AI systems and quite possibly of human intelligence. This ...
- The Janus effects of SGD vs GD: high noise and low rank
(2023-12-21) It was always obvious that SGD has higher fluctuations at convergence than GD. It has also been often reported that SGD in deep RELU networks has a low-rank bias in the weight matrices. A recent theoretical analysis linked ...
- A Homogeneous Transformer Architecture
(Center for Brains, Minds and Machines (CBMM), 2023-09-18) While the Transformer architecture has made a substantial impact in the field of machine learning, it is unclear what purpose each component serves in the overall architecture. Heterogeneous nonlinear circuits such as ...