On Provably Learning Sparse High-Dimensional Functions
Abstract: Neural networks are hailed for their ability to discover useful low-dimensional 'features' in complex high-dimensional data, yet a rigorous account of this ability has remained elusive. In recent years, the class of sparse (or 'multi-index') functions has emerged as a model with both practical motivations and a rich mathematical structure, enabling a quantitative theory of 'feature learning'. In this talk, I will present recent progress on this front, by describing (i) the ability of gradient-descent algorithms to efficiently learn the multi-index class over Gaussian data, and (ii) the tight Statistical-Query…
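For context (this definition is not part of the abstract, but is the standard formulation of the model), a multi-index function on R^d depends only on k ≪ d unknown directions:

```latex
f(x) \;=\; g\!\left(\langle u_1, x\rangle, \ldots, \langle u_k, x\rangle\right),
\qquad x \sim \mathcal{N}(0, I_d),
```

where g : R^k → R is a link function and u_1, …, u_k are unknown orthonormal directions; k = 1 recovers the single-index model. 'Feature learning' here means recovering the span of the u_i from data.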
Efficient Algorithms for Locally Private Estimation with Optimal Accuracy Guarantees
Abstract: Locally Differentially Private (LDP) reports are commonly used for the collection of statistics and for machine learning in the federated setting with an untrusted server. We study the efficiency of two basic tasks, frequency estimation and vector mean estimation, using LDP reports. Existing algorithms for these problems that achieve the lowest error are neither communication- nor computation-efficient in the high-dimensional regime. In this talk, I'll describe new efficient LDP algorithms for these tasks that achieve the optimal error (up to…
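The optimal-error algorithms from the talk are not reproduced here; as a point of reference, the sketch below shows the classical k-ary randomized response mechanism for frequency estimation, together with its unbiased debiasing step (all names and parameter choices are ours, and this baseline is exactly the kind of scheme whose high-dimensional costs the talk's algorithms improve on).

```python
import numpy as np

def krr_report(v, k, eps, rng):
    """k-ary randomized response: report the true item v in {0,...,k-1}
    with probability p, otherwise a uniformly random other item (eps-LDP)."""
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    if rng.random() < p:
        return v
    other = rng.integers(k - 1)            # uniform over the k-1 wrong items
    return other if other < v else other + 1

def estimate_frequencies(reports, k, eps):
    """Unbiased frequency estimates from randomized-response reports."""
    n = len(reports)
    p = np.exp(eps) / (np.exp(eps) + k - 1)
    q = 1.0 / (np.exp(eps) + k - 1)        # prob. of reporting a fixed wrong item
    counts = np.bincount(reports, minlength=k)
    return (counts / n - q) / (p - q)

rng = np.random.default_rng(0)
true_items = rng.integers(5, size=100_000)      # each user holds one of k=5 items
reports = np.array([krr_report(v, 5, 1.0, rng) for v in true_items])
print(estimate_frequencies(reports, 5, 1.0))    # each estimate close to 0.2
```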
Confinement of Unimodal Probability Distributions and an FKG-Gaussian Correlation Inequality
Abstract: While unimodal probability distributions are well understood in dimension 1, the same cannot be said in high dimension without imposing stronger conditions such as log-concavity. I will explain a new approach to proving confinement (e.g. variance upper bounds) for high-dimensional unimodal distributions which are not log-concave, based on an extension of Royen's celebrated Gaussian correlation inequality. We will see how it yields new localization results for Ginzburg-Landau random surfaces, a well-studied family of continuous-variable graphical models, with very general…
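For reference, Royen's Gaussian correlation inequality, whose extension drives the results above, states that for any centered Gaussian measure γ on R^n and any convex sets K, L ⊆ R^n that are symmetric about the origin,

```latex
\gamma(K \cap L) \;\ge\; \gamma(K)\,\gamma(L).
```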
Estimation of Functionals of High-Dimensional and Infinite-Dimensional Parameters of Statistical Models
The mini-course will meet on Monday, April 1st and Wednesday, April 3rd, from 1:30–3:00pm. This mini-course deals with a circle of problems related to the estimation of real-valued functionals of high-dimensional and infinite-dimensional parameters of statistical models. In such problems, it is of interest to estimate one-dimensional features of a high-dimensional parameter, represented by nonlinear functionals of a certain degree of smoothness defined on the parameter space. The functionals of interest can often be estimated at faster convergence rates than the…
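As one canonical instance of this setup (our illustration; the course may use a different running example), consider the Gaussian shift model:

```latex
X \sim \mathcal{N}\!\left(\theta, \tfrac{1}{n} I_d\right), \qquad \theta \in \mathbb{R}^d,
\qquad \text{goal: estimate } f(\theta) \text{ for } f \in C^{s}.
```

The naive plug-in estimator f(X) carries a bias that grows with the dimension d, and bias corrections of increasing order become available as the smoothness s increases, which is how smoother functionals admit faster rates; in this model, results of Koltchinskii and coauthors show that when d ≍ n^α with α < 1, the parametric n^{-1/2} rate is achievable when s > 1/(1-α).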
Optimal nonparametric capture-recapture methods for estimating population size
Abstract: Estimation of population size from incomplete lists has a long history across the biological and social sciences. For example, human rights groups often construct partial lists of victims of armed conflicts in order to estimate the total number of victims. Earlier statistical methods for this setup often rely on parametric assumptions or on suboptimal plug-in-type nonparametric estimators; both approaches can lead to substantial bias, the former via model misspecification and the latter via smoothing. Under an identifying assumption that two lists…
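For orientation (our addition, not the talk's method), the classical two-list baseline that nonparametric approaches generalize is the Lincoln–Petersen estimator:

```latex
\widehat{N} \;=\; \frac{n_1\, n_2}{m},
```

where n_1 and n_2 are the two list sizes and m is the number of individuals appearing on both lists. It is valid when list memberships are independent, a strong assumption that modern identifying conditions, such as the one alluded to above, are designed to relax.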
SDSCon 2024
Lattices and the Hardness of Statistical Problems
Abstract: I will describe recent results that (a) show nearly optimal hardness of learning Gaussian mixtures, and (b) give evidence of average-case hardness of sparse linear regression w.r.t. all efficient algorithms, assuming the worst-case hardness of lattice problems. The talk is based on the following papers with Aparna Gupte and Neekon Vafa. https://arxiv.org/pdf/2204.02550.pdf https://arxiv.org/pdf/2402.14645.pdf Bio: Vinod Vaikuntanathan is a professor of computer science at MIT and the chief cryptographer at Duality Technologies. His research is in the foundations of cryptography…
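For readers unfamiliar with the underlying assumption, a standard way to instantiate "worst-case hardness of lattice problems" in such reductions is via the Learning With Errors (LWE) problem (our paraphrase of the usual formulation):

```latex
\text{Given } \left(A,\; b = A s + e \bmod q\right), \quad A \in \mathbb{Z}_q^{m \times n}
\text{ uniform},\; s \in \mathbb{Z}_q^{n},\; e \text{ a short noise vector},
\quad \text{find } s,
```

which Regev showed is at least as hard as worst-case approximate lattice problems.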
Emergent outlier subspaces in high-dimensional stochastic gradient descent
Abstract: It has been empirically observed that the spectrum of a neural network's Hessian after training has a bulk concentrated near zero and a few outlier eigenvalues. Moreover, the eigenspaces of these outliers have been linked to a low-dimensional subspace in which most of the training occurs, and this implicit low-dimensional structure has been used as a heuristic explanation for the success of high-dimensional classification. We will describe recent rigorous results in this direction for the Hessian spectrum over the course…
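The talk's results concern trained neural networks; as a hedged toy illustration (model, data, and all parameters are our choices), the sketch below computes the exact Hessian of multinomial logistic regression on a k-class Gaussian mixture, a small-scale setting where the same qualitative bulk-plus-outliers picture tends to appear.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n = 20, 3, 2000
# Synthetic k-class Gaussian mixture data
means = rng.normal(size=(k, d)) * 2.0
y = rng.integers(k, size=n)
X = means[y] + rng.normal(size=(n, d))

W = rng.normal(size=(k, d)) * 0.1           # softmax-regression weights

def hessian(W, X):
    """Exact cross-entropy Hessian w.r.t. vec(W): the average over samples of
    (diag(p) - p p^T) kron (x x^T), where p is the softmax probability vector."""
    H = np.zeros((k * d, k * d))
    P = np.exp(X @ W.T)
    P /= P.sum(axis=1, keepdims=True)       # softmax probabilities, shape (n, k)
    for p, x in zip(P, X):
        H += np.kron(np.diag(p) - np.outer(p, p), np.outer(x, x))
    return H / len(X)

eigs = np.linalg.eigvalsh(hessian(W, X))
print(eigs[-5:])   # a few large outliers typically separate from the near-zero bulk
```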
Consensus-based optimization and sampling
Abstract: Particle methods provide a powerful paradigm for solving complex global optimization problems, leading to highly parallelizable algorithms. Despite their widespread and growing adoption, the theory underpinning their behavior has been based mainly on meta-heuristics. In application settings involving black-box procedures, or where gradients are too costly to obtain, one relies on derivative-free approaches instead. This talk will focus on two recent techniques: consensus-based optimization and consensus-based sampling. We explain how these methods can be used for the following two goals: (i)…
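As a minimal numerical sketch of the consensus-based optimization dynamics (the parameter values and test objective are our choices, not the speaker's), each particle drifts toward a Gibbs-weighted consensus point and diffuses with noise proportional to its distance from it:

```python
import numpy as np

def cbo_minimize(f, d, n_particles=200, steps=2000, dt=0.01,
                 lam=1.0, sigma=0.7, alpha=50.0, seed=0):
    """Consensus-based optimization: particles contract toward a softmax-
    weighted mean (consensus point) while anisotropic noise sustains exploration."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n_particles, d)) * 3.0
    for _ in range(steps):
        fx = f(X)
        w = np.exp(-alpha * (fx - fx.min()))        # Gibbs weights (shifted for stability)
        m = (w[:, None] * X).sum(axis=0) / w.sum()  # consensus point
        X += -lam * (X - m) * dt \
             + sigma * (X - m) * rng.normal(size=X.shape) * np.sqrt(dt)
    return m

# Nonconvex test objective (Rastrigin), global minimizer at the origin
rastrigin = lambda X: np.sum(X**2 - 10.0 * np.cos(2.0 * np.pi * X) + 10.0, axis=-1)
print(cbo_minimize(rastrigin, d=5))   # with these settings, typically near zero
```

Note that the method only ever evaluates f, never its gradient, which is what makes it suitable for the black-box and derivative-free settings mentioned above.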