    • Deep Convolutional Networks are Hierarchical Kernel Machines 

      Anselmi, Fabio; Rosasco, Lorenzo; Tan, Cheston; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-08-05)
      We extend i-theory to incorporate not only pooling but also rectifying nonlinearities in an extended HW module (eHW) designed for supervised learning. The two operations roughly correspond to invariance and selectivity, ...
    • A Deep Representation for Invariance And Music Classification 

      Zhang, Chiyuan; Evangelopoulos, Georgios; Voinea, Stephen; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-03-17)
      Representations in the auditory cortex might be based on mechanisms similar to those in the visual ventral stream: modules for building invariance to transformations and multiple layers for compositionality and selectivity. In this ...
    • For interpolating kernel machines, the minimum norm ERM solution is the most stable 

      Rangamani, Akshay; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2020-06-22)
      We study the average CVloo stability of kernel ridge-less regression and derive corresponding risk bounds. We show that the interpolating solution with minimum norm has the best CVloo stability, which in turn is controlled ...
    • Holographic Embeddings of Knowledge Graphs 

      Nickel, Maximilian; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-11-16)
      Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs. In this work, we propose holographic embeddings (HolE) to learn ...
    • I-theory on depth vs width: hierarchical function composition 

      Poggio, Tomaso; Anselmi, Fabio; Rosasco, Lorenzo (Center for Brains, Minds and Machines (CBMM), 2015-12-29)
      Deep learning networks with convolution, pooling and subsampling are a special case of hierarchical architectures, which can be represented by trees (such as binary trees). Hierarchical as well as shallow networks can ...
    • The Janus effects of SGD vs GD: high noise and low rank 

      Xu, Mengjia; Galanti, Tomer; Rangamani, Akshay; Rosasco, Lorenzo; Poggio, Tomaso (2023-12-21)
      It was always obvious that SGD has higher fluctuations at convergence than GD. It has also often been reported that SGD in deep ReLU networks has a low-rank bias in the weight matrices. A recent theoretical analysis linked ...
    • Learning An Invariant Speech Representation 

      Evangelopoulos, Georgios; Voinea, Stephen; Zhang, Chiyuan; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2014-06-15)
      Recognition of speech, and in particular the ability to generalize and learn from small sets of labelled examples like humans do, depends on an appropriate representation of the acoustic input. We formulate the problem of ...
    • Notes on Hierarchical Splines, DCLNs and i-theory 

      Poggio, Tomaso; Rosasco, Lorenzo; Shashua, Amnon; Cohen, Nadav; Anselmi, Fabio (Center for Brains, Minds and Machines (CBMM), 2015-09-29)
      We define an extension of classical additive splines for multivariate function approximation that we call hierarchical splines. We show that the case of hierarchical, additive, piece-wise linear splines includes present-day ...
    • On Invariance and Selectivity in Representation Learning 

      Anselmi, Fabio; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2015-03-23)
      We discuss data representations which can be learned automatically from data, are invariant to transformations, and at the same time selective, in the sense that two points have the same representation only if they are one ...
    • Symmetry Regularization 

      Anselmi, Fabio; Evangelopoulos, Georgios; Rosasco, Lorenzo; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2017-05-26)
      The properties of a representation, such as smoothness, adaptability, generality, equivariance/invariance, depend on restrictions imposed during learning. In this paper, we propose using data symmetries, in the sense of ...
    • Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality? 

      Poggio, Tomaso; Mhaskar, Hrushikesh; Rosasco, Lorenzo; Miranda, Brando; Liao, Qianli (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-11-23)
      [formerly titled "Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: a Review"] The paper reviews and extends an emerging body of theoretical results on deep learning including the ...
    • Theory of Deep Learning III: explaining the non-overfitting puzzle 

      Poggio, Tomaso; Kawaguchi, Kenji; Liao, Qianli; Miranda, Brando; Rosasco, Lorenzo; et al. (arXiv, 2017-12-30)
      THIS MEMO IS REPLACED BY CBMM MEMO 90. A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly ...