BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//MIT Statistics and Data Science Center - ECPv5.14.2.1//NONSGML v1.0//EN
CALSCALE:GREGORIAN
METHOD:PUBLISH
X-WR-CALNAME:MIT Statistics and Data Science Center
X-ORIGINAL-URL:https://stat.mit.edu
X-WR-CALDESC:Events for MIT Statistics and Data Science Center
REFRESH-INTERVAL;VALUE=DURATION:PT1H
X-Robots-Tag:noindex
X-PUBLISHED-TTL:PT1H
BEGIN:VTIMEZONE
TZID:America/New_York
BEGIN:DAYLIGHT
TZOFFSETFROM:-0500
TZOFFSETTO:-0400
TZNAME:EDT
DTSTART:20210314T020000
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:-0400
TZOFFSETTO:-0500
TZNAME:EST
DTSTART:20211107T020000
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTART;TZID=America/New_York:20210226T110000
DTEND;TZID=America/New_York:20210226T120000
DTSTAMP:20220528T122630Z
CREATED:20210112T204853Z
LAST-MODIFIED:20210217T144957Z
UID:4477-1614337200-1614340800@stat.mit.edu
SUMMARY:Self-regularizing Property of Nonparametric Maximum Likelihood Estimator in Mixture Models
DESCRIPTION:Abstract: Introduced by Kiefer and Wolfowitz (1956)\, the nonparametric maximum likelihood estimator (NPMLE) is a widely used methodology for learning mixture models and empirical Bayes estimation. Sidestepping the non-convexity of the mixture likelihood\, the NPMLE estimates the mixing distribution by maximizing the total likelihood over the space of probability measures\, which can be viewed as an extreme form of overparameterization. \nIn this work we discover a surprising property of the NPMLE solution. Consider\, for example\, a Gaussian mixture model on the real line with a subgaussian mixing distribution. Leveraging complex-analytic techniques\, we show that with high probability the NPMLE based on a sample of size n has O(log n) atoms (mass points)\, significantly improving the deterministic upper bound of n due to Lindsay (1983). Notably\, any such Gaussian mixture is statistically indistinguishable from a finite one with O(log n) components (and this is tight for certain mixtures). Thus\, absent any explicit form of model selection\, the NPMLE automatically chooses the right model complexity\, a property we term self-regularization. Extensions to other exponential families are given. As a statistical application\, we show that this structural property can be harnessed to bootstrap the existing Hellinger risk bound for the (parametric) MLE for finite Gaussian mixtures to the NPMLE for general Gaussian mixtures\, recovering a result of Zhang (2009). Time permitting\, we will discuss connections to approaching the optimal regret in empirical Bayes. This is based on joint work with Yihong Wu (Yale). \n– \nBio: Yury Polyanskiy is an Associate Professor of Electrical Engineering and Computer Science and a member of IDSS and LIDS at MIT. He received an M.S. degree in applied mathematics and physics from the Moscow Institute of Physics and Technology\, Moscow\, Russia\, in 2005\, and a Ph.D. degree in electrical engineering from Princeton University\, Princeton\, NJ\, in 2010. His research interests span information theory\, statistical learning\, error-correcting codes\, wireless communication\, and fault tolerance. Dr. Polyanskiy won the 2020 IEEE Information Theory Society James Massey Award\, the 2013 NSF CAREER Award\, and the 2011 IEEE Information Theory Society Paper Award.
URL:https://stat.mit.edu/calendar/polyanskiy/
LOCATION:online
CATEGORIES:Stochastics and Statistics Seminar
END:VEVENT
END:VCALENDAR