Views Navigation

Event Views Navigation

Saddle-to-saddle dynamics in diagonal linear networks

Nicolas Flammarion (EPFL)
E18-304

Abstract: When training neural networks with gradient methods and small weight initialisation, peculiar learning curves are observed: the training initially shows minimal progress, which is then followed by a sudden transition where a new "feature" is rapidly learned. This pattern is commonly known as incremental learning. In this talk, I will demonstrate that we can comprehensively understand this phenomenon within the context of a simplified network architecture. In this setting, we can establish that the gradient flow trajectory transitions from one saddle point of the training…

Find out more »


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764