Show simple item record

dc.contributor.author        Poggio, Tomaso
dc.date.accessioned          2022-10-11T16:06:10Z
dc.date.available            2022-10-11T16:06:10Z
dc.date.issued               2022-10-10
dc.identifier.uri            https://hdl.handle.net/1721.1/145776
dc.description.abstract      The main claim of this perspective is that compositional sparsity of the target function, which corresponds to the task to be learned, is the key principle underlying machine learning. I prove that, under smoothness restrictions on the constituent functions, sparsity of the compositional target function naturally leads to sparse deep networks for approximation, optimization, and generalization. This is the case for most CNNs in current use, in which the known sparse graph of the target function is reflected in the sparse connectivity of the network. When the graph of the target function is unknown, I conjecture that transformers are able to implement a flexible version of sparsity (selecting which input tokens interact in the MLP layer) through their self-attention layers. Surprisingly, the assumption of compositional sparsity of the target function is not restrictive in practice, since for computable functions with Lipschitz continuous derivatives compositional sparsity is equivalent to efficient computability, that is, computability in polynomial time.    en_US
dc.description.sponsorship   This material is based upon work supported by the Center for Brains, Minds and Machines (CBMM), funded by NSF STC award CCF-1231216.    en_US
dc.publisher                 Center for Brains, Minds and Machines (CBMM)    en_US
dc.relation.ispartofseries   CBMM Memo;138
dc.title                     Compositional Sparsity: a framework for ML    en_US
dc.type                      Article    en_US
dc.type                      Technical Report    en_US
dc.type                      Working Paper    en_US
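
As a minimal illustration of the compositional sparsity described in the abstract (this example is not taken from the memo itself; the function names h_{ij} are chosen here for exposition), consider a target function of eight variables built from constituent functions that each depend on only two inputs:

\[
f(x_1,\dots,x_8) = h_3\bigl(\,h_{21}\bigl(h_{11}(x_1,x_2),\,h_{12}(x_3,x_4)\bigr),\;
h_{22}\bigl(h_{13}(x_5,x_6),\,h_{14}(x_7,x_8)\bigr)\bigr)
\]

Each constituent function takes only two arguments, so the graph of f is a sparse binary tree. In the sense of the abstract, a deep network whose connectivity mirrors this tree reflects the sparse graph of the target function, which is the structure the memo argues underlies approximation, optimization, and generalization in sparse deep networks.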


