Browsing CBMM Memo Series by Author "Xu, Mengjia"
The Janus effects of SGD vs GD: high noise and low rank
Xu, Mengjia; Galanti, Tomer; Rangamani, Akshay; Rosasco, Lorenzo; Poggio, Tomaso (2023-12-21)
It was always obvious that SGD has higher fluctuations at convergence than GD. It has also been often reported that SGD in deep ReLU networks has a low-rank bias in the weight matrices. A recent theoretical analysis linked ...
Norm-Based Generalization Bounds for Compositionally Sparse Neural Network
Galanti, Tomer; Xu, Mengjia; Galanti, Liane; Poggio, Tomaso (Center for Brains, Minds and Machines (CBMM), 2023-02-14)
In this paper, we investigate the Rademacher complexity of deep sparse neural networks, where each neuron receives a small number of inputs. We prove generalization bounds for multilayered sparse ReLU neural networks, ...