Saddle-to-saddle dynamics in diagonal linear networks

Nicolas Flammarion (EPFL)
E18-304

Abstract: When training neural networks with gradient methods and small weight initialisation, peculiar learning curves are observed: training initially makes minimal progress and is then followed by a sudden transition in which a new "feature" is rapidly learned. This pattern is commonly known as incremental learning. In this talk, I will demonstrate that we can comprehensively understand this phenomenon within the context of a simplified network architecture. In this setting, we can establish that the gradient flow trajectory transitions from one saddle point of the training…
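To make the phenomenon concrete, here is a minimal sketch (my own illustration, not code from the talk) of incremental learning in a two-layer diagonal linear network, where the regression vector is parameterised as beta = u * v and trained by gradient descent from a small initialisation of scale alpha. All problem sizes and hyperparameters below are illustrative assumptions; the printed loss plateaus and then drops as the coordinates of beta activate one at a time.

# Sketch: incremental learning in a diagonal linear network beta = u * v
# (illustrative assumptions throughout; not the speaker's code).
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 20
X = rng.standard_normal((n, d))
beta_star = np.zeros(d)
beta_star[:3] = [3.0, 2.0, 1.0]          # sparse ground-truth regressor
y = X @ beta_star

alpha = 1e-4                              # small initialisation scale
u = alpha * np.ones(d)
v = alpha * np.ones(d)
lr = 1e-3

for step in range(20_000):
    beta = u * v
    grad = X.T @ (X @ beta - y) / n       # gradient of 0.5 * MSE w.r.t. beta
    # Chain rule through the product parameterisation beta = u * v:
    u, v = u - lr * grad * v, v - lr * grad * u
    if step % 2_000 == 0:
        loss = 0.5 * np.mean((X @ (u * v) - y) ** 2)
        print(f"step {step:6d}  beta[:5] = {np.round(u * v, 3)[:5]}  loss = {loss:.4f}")

Running this shows long plateaus punctuated by abrupt drops: each drop corresponds to the trajectory leaving one saddle of the training loss as a new coordinate of beta becomes active.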


The discrete Schrödinger bridge, and the ensuing chaos

Zaid Harchaoui (University of Washington)
E18-304

Abstract: In the 1930s, Schrödinger studied a thought experiment about a hot gas in which a cloud of particles evolves in time from an initial distribution to another one, possibly quite different from the initial one. He posed the problem of determining the most likely evolution among the many possible ones, a problem now known as the Schrödinger bridge problem. Later, in the 1980s, H. Föllmer framed the problem as an entropy-regularized variational problem. The Schrödinger problem underlies a number…
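As a concrete illustration (a sketch of the standard entropic-regularization formulation, not necessarily the one used in the talk), the static discrete Schrödinger bridge seeks, among all couplings of two discrete marginals mu and nu, the one closest in relative entropy to a Gibbs kernel K = exp(-C / eps). Sinkhorn's fixed-point iteration solves it; the cost matrix, regularization level eps, and marginals below are illustrative assumptions.

# Sketch: static discrete Schrödinger bridge via Sinkhorn iterations
# (illustrative assumptions throughout; not the speaker's code).
import numpy as np

rng = np.random.default_rng(1)
m = 5
mu = np.full(m, 1.0 / m)                  # initial distribution
nu = rng.random(m); nu /= nu.sum()        # terminal distribution
C = (np.arange(m)[:, None] - np.arange(m)[None, :]) ** 2.0  # squared cost
eps = 1.0                                  # entropic regularization level (assumed)
K = np.exp(-C / eps)                       # Gibbs reference kernel

a = np.ones(m)
b = np.ones(m)
for _ in range(500):                       # Sinkhorn fixed-point iterations
    a = mu / (K @ b)                       # enforce the row marginal mu
    b = nu / (K.T @ a)                     # enforce the column marginal nu

P = a[:, None] * K * b[None, :]            # the bridge coupling diag(a) K diag(b)
print("row marginal error:", np.abs(P.sum(axis=1) - mu).max())
print("col marginal error:", np.abs(P.sum(axis=0) - nu).max())

The resulting coupling P is the entropy-regularized optimal transport plan between mu and nu, i.e. the most likely joint evolution relative to the reference kernel, in Föllmer's variational sense.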


