Asymptotics of learning on dependent and structured random objects

Morgane Austern (Harvard University)
E18-304

Abstract: Classical statistical inference relies on numerous tools from probability theory to study the properties of estimators. However, these same tools are often inadequate for modern machine learning problems, which frequently involve structured data (e.g., networks) or complicated dependence structures (e.g., dependent random matrices). In this talk, we extend universal limit theorems beyond the classical setting. First, we consider distributionally "structured" and dependent random objects, i.e., random objects whose distributions are invariant under the action of an amenable group. We…

Characterizing the Type 1-Type 2 Error Trade-off for SLOPE

Cynthia Rush (Columbia University)
E18-304

Abstract: Sorted L1 regularization has been incorporated into many methods for solving high-dimensional statistical estimation problems, including the SLOPE estimator in linear regression. In this talk, we study how this relatively new regularization technique improves variable selection by characterizing the optimal SLOPE trade-off between the false discovery proportion (FDP) and true positive proportion (TPP) or, equivalently, between measures of type I and type II error. Additionally, we show that on any problem instance, SLOPE with a certain regularization sequence outperforms…
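
For readers new to sorted L1 regularization, the sketch below illustrates the SLOPE penalty itself: the coefficients, sorted by absolute value, are matched against a non-increasing regularization sequence. The Benjamini-Hochberg-style sequence used here is one common choice from the SLOPE literature, not necessarily the sequence studied in the talk.

```python
import numpy as np
from scipy.stats import norm

def slope_penalty(beta, lam):
    """Sorted-L1 (SLOPE) penalty: sum_i lam_i * |beta|_(i), where
    |beta|_(1) >= ... >= |beta|_(p) are the entries of beta sorted
    by absolute value and lam is a non-increasing sequence."""
    abs_desc = np.sort(np.abs(beta))[::-1]      # |beta| in decreasing order
    lam_desc = np.sort(lam)[::-1]               # enforce non-increasing lam
    return float(lam_desc @ abs_desc)

# Illustrative Benjamini-Hochberg-style sequence (a common choice):
p, q = 5, 0.1                                   # q: nominal level, assumed
lam = norm.ppf(1 - q * np.arange(1, p + 1) / (2 * p))
print(slope_penalty(np.array([3.0, -1.0, 0.5, 0.0, 2.0]), lam))
```

With all lam_i equal, the penalty reduces to the ordinary Lasso's L1 norm; the sorted weights are what drive the FDP/TPP trade-off described above.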

Precise high-dimensional asymptotics for AdaBoost via max-margins & min-norm interpolants

Pragya Sur (Harvard University)
E18-304

Abstract: This talk will introduce a precise high-dimensional asymptotic theory for AdaBoost on separable data, taking both statistical and computational perspectives. We will consider the common modern setting where the number of features p and the sample size n are both large and comparable, and in particular, look at scenarios where the data is asymptotically separable. Under a class of statistical models, we will provide an (asymptotically) exact analysis of the max-min-L1-margin and the min-L1-norm interpolant. In turn, this will…
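
For orientation on the objects named in the abstract: on separable data, the max-L1-margin problem can be rescaled into a linear program whose optimal value is the reciprocal of the margin. A minimal scipy sketch under that standard reformulation; it illustrates the optimization objects only, not AdaBoost itself or the talk's asymptotic analysis.

```python
import numpy as np
from scipy.optimize import linprog

def max_l1_margin(X, y):
    """Solve min ||theta||_1 s.t. y_i <x_i, theta> >= 1 (a rescaled
    form of the max-L1-margin problem) by splitting theta = u - v
    with u, v >= 0. Returns (theta, margin = 1 / ||theta||_1)."""
    n, p = X.shape
    c = np.ones(2 * p)                 # objective: sum(u) + sum(v) = ||theta||_1
    G = -(y[:, None] * X)              # encodes -y_i x_i^T theta <= -1
    res = linprog(c, A_ub=np.hstack([G, -G]), b_ub=-np.ones(n),
                  bounds=(0, None), method="highs")
    theta = res.x[:p] - res.x[p:]
    return theta, 1.0 / np.abs(theta).sum()

# Toy separable data in the overparametrized regime (p > n, illustrative):
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 50))
y = np.sign(X[:, 0] + 0.1 * rng.normal(size=20))
print(max_l1_margin(X, y)[1])
```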

The Geometry of Particle Collisions: Hidden in Plain Sight

Jesse Thaler (MIT)
E18-304

Abstract: Since the 1960s, particle physicists have developed a variety of data analysis strategies for comparing experimental measurements to theoretical predictions. Despite their numerous successes, these techniques can seem esoteric and ad hoc, even to practitioners in the field. In this talk, I explain how many particle physics analysis tools have a natural geometric interpretation in an emergent "space" of collider events induced by the Wasserstein metric. This in turn suggests new analysis strategies to interpret generic…
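
In the particle physics literature this Wasserstein construction is often called the Energy Mover's Distance, with events treated as energy-weighted point clouds. The toy sketch below computes a balanced discrete optimal transport by linear programming; the variable names and the omission of the total-energy-difference term used for real, unbalanced events are simplifying assumptions.

```python
import numpy as np
from scipy.optimize import linprog
from scipy.spatial.distance import cdist

def toy_emd(pts_a, e_a, pts_b, e_b):
    """Wasserstein-1 distance between two 'events', each an
    energy-weighted point cloud; energies are normalized to unit
    total weight (balanced transport only)."""
    wa, wb = e_a / e_a.sum(), e_b / e_b.sum()
    C = cdist(pts_a, pts_b)                     # pairwise ground distances
    n, m = C.shape
    A_eq, b_eq = [], []
    for i in range(n):                          # flow out of each a-particle
        row = np.zeros(n * m); row[i * m:(i + 1) * m] = 1
        A_eq.append(row); b_eq.append(wa[i])
    for j in range(m):                          # flow into each b-particle
        col = np.zeros(n * m); col[j::m] = 1
        A_eq.append(col); b_eq.append(wb[j])
    res = linprog(C.ravel(), A_eq=np.array(A_eq), b_eq=np.array(b_eq),
                  bounds=(0, None), method="highs")
    return res.fun

# Two toy events: particles at (rapidity, phi) with energies
a, ea = np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([2.0, 1.0])
b, eb = np.array([[0.0, 0.1], [1.2, 1.0]]), np.array([1.0, 2.0])
print(toy_emd(a, ea, b, eb))
```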

The Brownian transport map

Dan Mikulincer (MIT)
E18-304

Abstract: The existence of a transport map from the standard Gaussian leads to succinct representations for potentially complicated measures. Inspired by results from optimal transport, we introduce the Brownian transport map, which pushes forward the Wiener measure to a target measure in a finite-dimensional Euclidean space. Using tools from Itô and Malliavin calculus, we show that the map is Lipschitz in several cases of interest. Specifically, our results apply when the target measure satisfies one of the following: - more log-concave than the Gaussian, recovering…
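
For orientation, one standard construction in this area (and, as far as I can tell, the one underlying the Brownian transport map, though the talk may differ in details) is the Föllmer process. Writing \(\gamma\) for the standard Gaussian on \(\mathbb{R}^d\), \(f = d\mu/d\gamma\) for the target density, and \((P_t)\) for the heat semigroup:

```latex
% Sketch (assumed construction): the Follmer SDE transports Wiener
% measure onto the target mu; W denotes a standard Brownian motion.
\[
  \mathrm{d}X_t = \mathrm{d}W_t
    + \nabla \log \bigl(P_{1-t} f\bigr)(X_t)\,\mathrm{d}t,
  \qquad X_0 = 0, \qquad X_1 \sim \mu,
\]
% so W \mapsto X_1(W) pushes the Wiener measure forward to mu; the
% Lipschitz results above concern this map from Wiener space to R^d.
```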

On the power of Lenstra-Lenstra-Lovasz in noiseless inference

Ilias Zadik (MIT)
E18-304

Abstract: In this talk, we discuss a new polynomial-time algorithmic framework for inference problems, based on the celebrated Lenstra-Lenstra-Lovasz lattice basis reduction algorithm. Perhaps surprisingly, this framework successfully bypasses multiple suggested notions of “computational hardness for inference” in various noiseless settings. Such settings include 1) sparse regression, where the Overlap Gap Property holds and low-degree methods fail, 2) phase retrieval, where Approximate Message Passing fails, and 3) Gaussian clustering, where the SoS…
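
To give a feel for the primitive behind this framework: LLL finds short lattice vectors in polynomial time, which in noiseless settings lets one recover exact integer relations hidden in real-valued data. A toy sketch, assuming the fpylll bindings are installed; the embedding and the scale parameter are illustrative choices, not the talk's construction.

```python
import math
from fpylll import IntegerMatrix, LLL   # pip install fpylll (assumed)

def integer_relation(xs, scale=10**12):
    """Look for small integers c with sum_i c_i * xs[i] ~ 0 by
    LLL-reducing the basis [I | round(scale * x)]: a short reduced
    vector has small c_i and a near-zero last coordinate."""
    n = len(xs)
    rows = []
    for i, x in enumerate(xs):
        row = [0] * n + [int(round(scale * x))]
        row[i] = 1
        rows.append(row)
    B = IntegerMatrix.from_matrix(rows)
    LLL.reduction(B)                          # reduces the basis in place
    return [B[0, i] for i in range(n)]        # first reduced row, up to sign

# 1*1 + 2*sqrt(2) - 1*(1 + 2*sqrt(2)) = 0, so expect +/-[1, 2, -1]:
print(integer_relation([1.0, math.sqrt(2), 1.0 + 2.0 * math.sqrt(2)]))
```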

Optimal testing for calibration of predictive models

Edgar Dobriban (University of Pennsylvania)
E18-304

Abstract: The prediction accuracy of machine learning methods is steadily increasing, but the calibration of their uncertainty predictions poses a significant challenge. Numerous works focus on obtaining well-calibrated predictive models, but less is known about reliably assessing model calibration. This limits our ability to know when algorithms for improving calibration have a real effect, and when their improvements are merely artifacts due to random noise in finite datasets. In this work, we consider the problem of detecting mis-calibration of…
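
As a baseline for what detecting mis-calibration means operationally, here is a classical binned chi-square check in the Hosmer-Lemeshow spirit; it is a simple reference point, not the optimal test developed in the talk.

```python
import numpy as np
from scipy.stats import chi2

def binned_calibration_test(probs, labels, n_bins=10):
    """Binned chi-square check of calibration: within each predicted-
    probability bin, compare the observed positive count to the count
    a calibrated model would imply, standardize, and sum (normal
    approximation within bins; assumes some bins are populated)."""
    bins = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    stat, dof = 0.0, 0
    for b in range(n_bins):
        mask = bins == b
        if mask.sum() < 5:                    # skip sparse bins
            continue
        expected = probs[mask].sum()          # positives under calibration
        var = (probs[mask] * (1.0 - probs[mask])).sum()
        if var > 0:
            stat += (labels[mask].sum() - expected) ** 2 / var
            dof += 1
    return stat, chi2.sf(stat, dof)           # approximate p-value

# Toy mis-calibrated model: labels ignore the reported probabilities.
rng = np.random.default_rng(0)
probs = rng.uniform(0.0, 1.0, 5000)
labels = rng.binomial(1, 0.5, 5000)
print(binned_calibration_test(probs, labels))  # tiny p-value: reject
```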

Inference on Winners

Isaiah Andrews (Harvard University)
E18-304

Abstract: Many empirical questions concern target parameters selected through optimization. For example, researchers may be interested in the effectiveness of the best policy found in a randomized trial, or the best-performing investment strategy based on historical data. Such settings give rise to a winner's curse, where conventional estimates are biased and conventional confidence intervals are unreliable. This paper develops optimal confidence intervals and median-unbiased estimators that are valid conditional on the target selected and so overcome this winner's curse. If…
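
The winner's curse described here is easy to see in simulation: even when every candidate's true effect is exactly zero, the conventional estimate reported for the data-selected winner is biased upward. A minimal Monte Carlo illustration (not the paper's corrected intervals):

```python
import numpy as np

rng = np.random.default_rng(0)
K, n_sims = 10, 100_000                # 10 candidate policies, all with
mu = np.zeros(K)                       # true effect exactly zero
est = rng.normal(mu, 1.0, size=(n_sims, K))   # noisy effect estimates
naive = est.max(axis=1)                # conventional estimate for the winner
print(naive.mean())                    # ~1.54, far above the true value 0
```

Conditioning on which candidate was selected, as the paper's median-unbiased estimators do, is what removes this selection bias.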

MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764