    • An analysis of training and generalization errors in shallow and deep networks
      Mhaskar, Hrushikesh; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv.org, 2018-02-20)
      An open problem around deep networks is the apparent absence of over-fitting despite large over-parametrization, which allows perfect fitting of the training data. In this paper, we explain this phenomenon when each unit ...
    • Deep vs. shallow networks: An approximation theory perspective
      Mhaskar, Hrushikesh; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-08-12)
      The paper briefly reviews several recent results on hierarchical architectures for learning from examples that may formally explain the conditions under which Deep Convolutional Neural Networks perform much better in ...
    • Learning Real and Boolean Functions: When Is Deep Better Than Shallow
      Mhaskar, Hrushikesh; Liao, Qianli; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-03-08)
      We describe computational tasks - especially in vision - that correspond to compositional/hierarchical functions. While the universal approximation property holds both for hierarchical and shallow networks, we prove that ...
    • Theory I: Why and When Can Deep Networks Avoid the Curse of Dimensionality?
      Poggio, Tomaso; Mhaskar, Hrushikesh; Rosasco, Lorenzo; Miranda, Brando; Liao, Qianli (Center for Brains, Minds and Machines (CBMM), arXiv, 2016-11-23)
      [formerly titled "Why and When Can Deep – but Not Shallow – Networks Avoid the Curse of Dimensionality: a Review"] The paper reviews and extends an emerging body of theoretical results on deep learning including the ...
    • Theory of Deep Learning III: explaining the non-overfitting puzzle
      Poggio, Tomaso; Kawaguchi, Kenji; Liao, Qianli; Miranda, Brando; Rosasco, Lorenzo; et al. (arXiv, 2017-12-30)
      THIS MEMO IS REPLACED BY CBMM MEMO 90. A main puzzle of deep networks revolves around the absence of overfitting despite overparametrization and despite the large capacity demonstrated by zero training error on randomly ...