For interpolating kernel machines, the minimum norm ERM solution is the most stable
(Center for Brains, Minds and Machines (CBMM), 2020-06-22)
We study the average CVloo stability of kernel ridgeless regression and derive corresponding risk bounds. We show that the interpolating solution with minimum norm has the best CVloo stability, which in turn is controlled ...
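As a concrete illustration of the object studied here, a minimal numpy sketch of the minimum-norm interpolating (ridgeless) kernel solution; the RBF kernel, data, and sizes are assumptions for the sketch, not taken from the memo:

```python
import numpy as np

def rbf_kernel(X, Z, gamma=1.0):
    # Gaussian (RBF) kernel matrix from pairwise squared distances.
    d2 = ((X[:, None, :] - Z[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 5))     # n = 20 training points in 5 dims
y = rng.normal(size=20)

# Ridgeless (zero-regularization) solution: f(x) = k(x, X) @ K^+ y.
# Using the pseudoinverse K^+ selects the minimum-norm interpolant.
K = rbf_kernel(X, X)
alpha = np.linalg.pinv(K) @ y

print(np.allclose(rbf_kernel(X, X) @ alpha, y))  # True: exact interpolation
```

With an invertible kernel matrix this solution fits the training data exactly; the memo's claim concerns the CVloo stability of precisely this minimum-norm interpolant.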
Dreaming with ARC
(Center for Brains, Minds and Machines (CBMM), 2020-11-23)
Current machine learning algorithms are highly specialized to whatever task they are meant to do, e.g. playing chess, picking up objects, or object recognition. How can we extend this to a system that could solve a ...
Hierarchically Local Tasks and Deep Convolutional Networks
(Center for Brains, Minds and Machines (CBMM), 2020-06-24)
The main success stories of deep learning, starting with ImageNet, depend on convolutional networks, which on certain tasks perform significantly better than traditional shallow classifiers, such as support vector machines. ...
Biologically Inspired Mechanisms for Adversarial Robustness
(Center for Brains, Minds and Machines (CBMM), 2020-06-23)
A convolutional neural network strongly robust to adversarial perturbations at reasonable computational and performance cost has not yet been demonstrated. The primate visual ventral stream seems to be robust to small ...
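For context on what an adversarial perturbation is, a minimal FGSM-style sketch on a toy linear model; this illustrates the kind of attack the memo defends against, not the memo's biologically inspired mechanisms:

```python
import numpy as np

rng = np.random.default_rng(3)
d = 100
w = rng.normal(size=d)            # toy linear classifier, score = w @ x
x = rng.normal(size=d)
label = np.sign(w @ x)            # treat the current prediction as the label

# FGSM-style step: move each coordinate by eps against the sign of the
# gradient. The score shifts by eps * ||w||_1, which for small eps is
# often enough to flip the prediction while x_adv stays close to x.
eps = 0.2
x_adv = x - eps * label * np.sign(w)

print(np.sign(w @ x), np.sign(w @ x_adv))
```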
Stable Foundations for Learning: a foundational framework for learning theory in both the classical and modern regime.
(Center for Brains, Minds and Machines (CBMM), 2020-03-25)
We consider here the class of supervised learning algorithms known as Empirical Risk Minimization (ERM). The classical theory by Vapnik and others characterizes universal consistency of ERM in the classical regime in which ...
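For reference, a standard way to write the ERM estimator (a textbook definition, not text from the memo): given a hypothesis space H and a loss V, ERM returns the minimizer of the empirical risk over the n training samples.

```latex
\hat{f}_n = \operatorname*{arg\,min}_{f \in \mathcal{H}}
    \frac{1}{n} \sum_{i=1}^{n} V\bigl(f(x_i), y_i\bigr)
```

Universal consistency then asks that the expected risk of \hat{f}_n approach the best risk achievable in \mathcal{H} as n grows, for every data distribution.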
Implicit dynamic regularization in deep networks
(Center for Brains, Minds and Machines (CBMM), 2020-08-17)
Square loss has been observed to perform well in classification tasks, at least as well as cross-entropy. However, a theoretical justification is lacking. Here we develop a theoretical analysis for the square loss that also ...
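A toy sketch of what training a classifier with the square loss means in practice: regress one-hot targets with gradient descent, then classify by argmax. The linear model and synthetic data here are assumptions for illustration, not the memo's setup:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d, c = 200, 10, 3
X = rng.normal(size=(n, d))
labels = rng.integers(0, c, size=n)
Y = np.eye(c)[labels]             # one-hot targets for the square loss

W = np.zeros((d, c))
lr = 0.1
for _ in range(500):
    err = X @ W - Y               # square loss: ||X W - Y||^2 / n
    W -= lr * (2.0 / n) * X.T @ err

acc = (np.argmax(X @ W, axis=1) == labels).mean()
print(f"train accuracy: {acc:.2f}")
```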
Loss landscape: SGD can have a better view than GD
(Center for Brains, Minds and Machines (CBMM), 2020-07-01)
Consider a loss function $L = \sum_{i=1}^{n} \ell_i^2$ with $\ell_i = f(x_i) - y_i$, where $f(x)$ is a deep feedforward network with $R$ layers, no bias terms and scalar output. Assume the network is overparametrized, that is, $d \gg n$, where $d$ is the ...
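A minimal numpy sketch of the setup the abstract defines: an overparametrized feedforward net with no bias terms and scalar output, evaluated under the summed squared loss. The layer widths, activation, and data are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d_in = 5, 3                    # few samples: interpolation regime
X = rng.normal(size=(n, d_in))
y = rng.normal(size=n)

# Feedforward net, no bias terms, scalar output; widths chosen so the
# total parameter count d is much larger than n (here 2700 vs 5).
widths = [d_in, 50, 50, 1]
Ws = [rng.normal(size=(a, b)) / np.sqrt(a)
      for a, b in zip(widths[:-1], widths[1:])]
d = sum(W.size for W in Ws)

def f(x):
    for W in Ws[:-1]:
        x = np.maximum(x @ W, 0.0)   # ReLU hidden layers
    return (x @ Ws[-1]).ravel()      # scalar output per sample

# L = sum_i l_i^2 with l_i = f(x_i) - y_i, as in the abstract.
l = f(X) - y
print(d, np.sum(l ** 2))
```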