Formation of Representations in Neural Networks

Ziyin, Liu; Chuang, Isaac; Galanti, Tomer; Poggio, Tomaso

dc.contributor.author	Ziyin, Liu
dc.contributor.author	Chuang, Isaac
dc.contributor.author	Galanti, Tomer
dc.contributor.author	Poggio, Tomaso
dc.date.accessioned	2024-10-08T14:32:03Z
dc.date.available	2024-10-08T14:32:03Z
dc.date.issued	2024-10-07
dc.identifier.uri	https://hdl.handle.net/1721.1/157132
dc.description.abstract	Understanding neural representations will help open the black box of neural networks and advance our scientific understanding of modern AI systems. However, how complex, structured, and transferable representations emerge in modern neural networks has remained a mystery. Building on previous results, we propose the Canonical Representation Hypothesis (CRH), which posits a set of six alignment relations to universally govern the formation of representations in most hidden layers of a neural network. Under the CRH, the latent representations (R), weights (W), and neuron gradients (G) become mutually aligned during training. This alignment implies that neural networks naturally learn compact representations, where neurons and weights are invariant to task-irrelevant transformations. We then show that the breaking of CRH leads to the emergence of reciprocal power-law relations between R, W, and G, which we refer to as the Polynomial Alignment Hypothesis (PAH). We present a minimal-assumption theory demonstrating that the balance between gradient noise and regularization is crucial for the emergence of the canonical representation. The CRH and PAH lead to an exciting possibility of unifying key deep learning phenomena, including neural collapse and the neural feature ansatz, in a single framework.	en_US
dc.description.sponsorship	This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.	en_US
dc.publisher	Center for Brains, Minds and Machines (CBMM)	en_US
dc.relation.ispartofseries	CBMM Memo;150
dc.title	Formation of Representations in Neural Networks	en_US
dc.type	Article	en_US
dc.type	Technical Report	en_US
dc.type	Working Paper	en_US

Files in this item

Name: CBMM-Memo-150.pdf
Size: 4.029Mb
Format: PDF

This item appears in the following Collection(s)

CBMM Memo Series
