Stochastics and Statistics Seminar

Views Navigation

Event Views Navigation

The discrete Schrödinger bridge, and the ensuing chaos

Zaid Harchaoui (University of Washington)
E18-304

Abstract: Schrödinger studied in the 1930s a thought experiment about hot gas in which a cloud of particles evolves in time from an initial distribution to another one, possibly quite different from the initial one. He posed the problem of determining the most likely evolution among the many possible ones, a problem now known as the Schrödinger bridge problem. H. Föllmer later in the 1980s framed the problem as an entropy regularized variational problem. The Schrödinger problem underlies a number…

Find out more »

Empirical methods for macroeconomic policy analysis

Christian Wolf, MIT
E18-304

Abstract: We show that, in a general family of linearized structural macroeconomic models, the counterfactual evolution of the economy under alternative policy rules is fully pinned down by two empirically estimable objects: (i) reduced-form projections with respect to a large information set; and (ii) the causal effects of policy shocks on macroeconomic aggregates. Under our assumptions, the derived counterfactuals are fully robust to the Lucas critique. Building on these insights, we discuss how to leverage the classical ``VAR'' approach to policy…

Find out more »

Efficient Algorithms for Semirandom Planted CSPs at the Refutation Threshold

Pravesh Kothari, Princeton University
E18-304

Abstract: We present an efficient algorithm to solve semi-random planted instances of any Boolean constraint satisfaction problem (CSP). The semi-random model is a hybrid between worst-case and average-case input models, where the input is generated by (1) choosing an arbitrary planted assignment x∗, (2) choosing an arbitrary clause structure, and (3) choosing literal negations for each clause from an arbitrary distribution "shifted by x∗" so that x∗ satisfies each constraint. For an n variable semi-random planted instance of a k-arity…

Find out more »

Entropic optimal transport: limit theorems and algorithms

Kengo Kato, Cornell University
E18-304

Abstract: In this talk, I will discuss my recent work on entropic optimal transport (EOT). In the first part, I will discuss limit theorems for EOT maps, dual potentials, and the Sinkhorn divergence. The key technical tool we use is a first and second-order Hadamard differentiability analysis of EOT potentials with respect to the marginals, from which the limit theorems, bootstrap consistency, and asymptotic efficiency of the empirical estimators follow. The second part concerns the entropic Gromov-Wasserstein (EGW) distance, which…

Find out more »

On Provably Learning Sparse High-Dimensional Functions

Joan Bruna, New York University
E18-304

Abstract: Neural Networks are hailed for their ability to discover useful low-dimensional 'features' out of complex high-dimensional data, yet such ability remains mostly hand-wavy. Over the recent years, the class of sparse (or 'multi-index') functions has emerged as a model with both practical motivations and a rich mathematical structure, enabling a quantitative theory of 'feature learning'. In this talk, I will present recent progress on this front, by describing (i) the ability of gradient-descent algorithms to efficiently learn the multi-index class over Gaussian data, and (ii) the tight Statistical-Query…

Find out more »

Efficient Algorithms for Locally Private Estimation with Optimal Accuracy Guarantees

Vitaly Feldman, Apple ML Research
E18-304

Abstract: Locally Differentially Private (LDP) reports are commonly used for collection of statistics and machine learning in the federated setting with an untrusted server. We study the efficiency of two basic tasks, frequency estimation and vector mean estimation, using LDP reports. Existing algorithms for these problems that achieve the lowest error are neither communication nor computation efficient in the high-dimensional regime. In this talk I’ll describe new efficient LDP algorithms for these tasks that achieve the optimal error (up to…

Find out more »

Confinement of Unimodal Probability Distributions and an FKG-Gaussian Correlation Inequality

Mark Sellke, Harvard University
E18-304

Abstract: While unimodal probability distributions are well understood in dimension 1, the same cannot be said in high dimension without imposing stronger conditions such as log-concavity. I will explain a new approach to proving confinement (e.g. variance upper bounds) for high-dimensional unimodal distributions which are not log-concave, based on an extension of Royen's celebrated Gaussian correlation inequality. We will see how it yields new localization results for Ginzberg-Landau random surfaces, a well-studied family of continuous-variable graphical models, with very general…

Find out more »

Optimal nonparametric capture-recapture methods for estimating population size

Edward Kennedy, Carnegie Mellon University
E18-304

Abstract: Estimation of population size using incomplete lists has a long history across many biological and social sciences. For example, human rights groups often construct partial lists of victims of armed conflicts, to estimate the total number of victims. Earlier statistical methods for this setup often use parametric assumptions, or rely on suboptimal plug-in-type nonparametric estimators; but both approaches can lead to substantial bias, the former via model misspecification and the latter via smoothing. Under an identifying assumption that two lists…

Find out more »

Lattices and the Hardness of Statistical Problems

Vinod Vaikuntanathan (MIT)
E18-304

Abstract: I will describe recent results that (a) show nearly optimal hardness of learning Gaussian mixtures, and (b) give evidence of average-case hardness of sparse linear regression w.r.t. all efficient algorithms, assuming the worst-case hardness of lattice problems. The talk is based on the following papers with Aparna Gupte and Neekon Vafa. https://arxiv.org/pdf/2204.02550.pdf https://arxiv.org/pdf/2402.14645.pdf Bio: Vinod Vaikuntanathan is a professor of computer science at MIT and the chief cryptographer at Duality Technologies. His research is in the foundations of cryptography…

Find out more »

Emergent outlier subspaces in high-dimensional stochastic gradient descent

Reza Gheissari, Northwestern University
E18-304

Abstract:  It has been empirically observed that the spectrum of neural network Hessians after training have a bulk concentrated near zero, and a few outlier eigenvalues. Moreover, the eigenspaces associated to these outliers have been associated to a low-dimensional subspace in which most of the training occurs, and this implicit low-dimensional structure has been used as a heuristic for the success of high-dimensional classification. We will describe recent rigorous results in this direction for the Hessian spectrum over the course…

Find out more »


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764