Evaluating a black-box algorithm: stability, risk, and model comparisons

Rina Foygel Barber, University of Chicago
E18-304

Abstract: When we run a complex algorithm on real data, it is standard to use a holdout set, or a cross-validation strategy, to evaluate its behavior and performance. When we do so, are we learning information about the algorithm itself, or only about the particular fitted model(s) that this particular data set produced? In this talk, we will establish fundamental hardness results on the problem of empirically evaluating properties of a black-box algorithm, such as its stability and its average…

Statistical Inference with Limited Memory

Ofer Shayevitz, Tel Aviv University
E18-304

Abstract: In statistical inference problems, we are typically given a limited number of samples from some underlying distribution, and we wish to estimate some property of that distribution, under a given measure of risk. We are usually interested in characterizing and achieving the best possible risk as a function of the number of available samples. Thus, it is often implicitly assumed that samples are co-located, and that communication bandwidth as well as computational power are not a bottleneck, essentially making the number…

Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection

Jing Lei, Carnegie Mellon University
E18-304

Abstract: We study the problem of finding the index of the minimum value of a vector from noisy observations. This problem is relevant in population/policy comparison, discrete maximum likelihood, and model selection. By integrating concepts and tools from cross-validation and differential privacy, we develop a test statistic that is asymptotically normal even in high-dimensional settings, and allows for arbitrarily many ties in the population mean vector. The key technical ingredient is a central limit theorem for globally dependent data characterized…

Deep Learning Methods for Public Health Prediction

Alexander Rodríguez, University of Michigan
E18-304

Abstract: Epidemic prediction is an essential tool for public health decision-making and strategic planning. Despite its importance, our ability to model the spread of epidemics remains limited, largely due to the complexity of social and pathogen dynamics. With the increasing availability of real-time multimodal data and advances in deep learning, a new opportunity has emerged to capture and exploit previously unobservable facets of the spatiotemporal dynamics of epidemics. Toward realizing the potential of AI in public health, my work addresses multiple challenges in this domain,…

On Statistical Inference in Observational Studies

Rajarshi Mukherjee, Harvard University
E18-304

Abstract: In this talk, we will focus on drawing inferences for average treatment effect type quantities arising in the context of many observational studies. In the first part of the talk, we will try to understand the problem's subtleties in low-dimensional nonparametric settings and discuss the potential usefulness of higher-order semiparametric theory to paint a detailed picture. In the second part of the talk, we will consider high-dimensional aspects of the question and discuss different regimes and associated subtleties that arise…


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764