Emergent outlier subspaces in high-dimensional stochastic gradient descent
Reza Gheissari, Northwestern University
E18-304
Abstract: It has been empirically observed that the spectrum of neural network Hessians after training has a bulk concentrated near zero and a few outlier eigenvalues. Moreover, the eigenspaces associated…