Stochastics and Statistics Seminar

Next Generation Missing Data Methodology

Eric Tchetgen Tchetgen (Harvard University)
32-141

Missing data is a reality of the empirical sciences and can rarely be prevented entirely. It is often assumed that incomplete data are missing completely at random (MCAR) or missing at random (MAR). When the data are neither MCAR nor MAR, missingness is said to be not missing at random (NMAR). Under MAR, there are two main approaches to inference: likelihood/Bayesian inference, e.g., the EM algorithm or multiple imputation (MI), and semiparametric approaches such as inverse probability weighting (IPW). In several important settings, likelihood-based inferences suffer the difficulty of…
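
A minimal sketch of the IPW idea under MAR, on simulated data with a known probability of being observed (in practice that probability would itself be estimated, e.g., by logistic regression; all numbers here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Fully observed covariate X; outcome Y will be missing at random given X.
x = rng.normal(size=n)
y = 2.0 + x + rng.normal(size=n)            # true mean of Y is 2.0

# Missingness depends only on the observed X (the MAR assumption).
p_obs = 1.0 / (1.0 + np.exp(-(0.5 + x)))    # P(R = 1 | X), known here by construction
r = rng.binomial(1, p_obs)                  # R = 1 means Y is observed

naive = y[r == 1].mean()                    # complete-case mean: biased, over-represents large X
ipw = np.sum(r * y / p_obs) / np.sum(r / p_obs)  # Hajek-style IPW estimate

print(f"true mean 2.0 | complete-case {naive:.3f} | IPW {ipw:.3f}")
```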

Efficient Optimal Strategies for Universal Prediction

Peter Bartlett (UC Berkeley)
32-141

In game-theoretic formulations of prediction problems, a strategy makes a decision, observes an outcome and pays a loss. The aim is to minimize the regret, which is the amount by which the total loss incurred exceeds the total loss of the best decision in hindsight. This talk will focus on the minimax optimal strategy, which minimizes the regret, in three settings: prediction with log loss (a formulation of sequential probability density estimation that is closely related to sequential compression, coding,…
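
A small sketch of the regret framework for log loss, using a uniform Bayes mixture over two fixed Bernoulli "experts" (an illustrative strategy whose regret is bounded by the log of the number of experts; not the minimax strategies analyzed in the talk):

```python
import numpy as np

rng = np.random.default_rng(1)
T = 500
experts = np.array([0.3, 0.7])              # each expert always predicts this P(y = 1)
outcomes = rng.binomial(1, 0.7, size=T)

def log_loss(p, y):
    return -np.log(p if y == 1 else 1.0 - p)

log_weights = np.zeros(len(experts))        # log posterior weights, uniform prior
mixture_loss = 0.0
for y in outcomes:
    w = np.exp(log_weights - log_weights.max())
    w /= w.sum()
    p = float(w @ experts)                  # mixture's predicted P(y = 1)
    mixture_loss += log_loss(p, y)          # pay the loss, then update the weights
    log_weights += np.where(y == 1, np.log(experts), np.log(1 - experts))

best = min(sum(log_loss(p, y) for y in outcomes) for p in experts)
print(f"regret {mixture_loss - best:.3f} <= log(#experts) = {np.log(len(experts)):.3f}")
```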

Principal Components Analysis in Light of the Spiked Model

David Donoho (Stanford)

Principal components is a true workhorse of science and technology, applied everywhere from radio frequency signal processing to financial econometrics, genomics, and social network analysis. In this talk, I will review some of these applications and then describe the challenge posed by modern 'big data asymptotics' where there are roughly as many dimensions as observations; this setting has seemed in the past full of mysteries. Over the last ten years random matrix theory has developed a host of new tools…
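
A minimal simulation of the single-spike setting (illustrative parameters; the predicted location of the top sample eigenvalue is the standard random-matrix result, valid above the detection threshold):

```python
import numpy as np

rng = np.random.default_rng(2)
p, n = 1000, 1000                      # "big data asymptotics": dimension ~ sample size
gamma = p / n
ell = 4.0                              # spike: population top eigenvalue is 1 + ell

u = np.zeros(p)
u[0] = 1.0
# Rows have covariance I + ell * u u^T.
X = rng.normal(size=(n, p)) + np.sqrt(ell) * rng.normal(size=(n, 1)) * u

top = np.linalg.eigvalsh(X.T @ X / n)[-1]

# Above the threshold (ell > sqrt(gamma)), the top sample eigenvalue
# converges to (1 + ell) * (1 + gamma / ell), not to the population spike.
pred = (1 + ell) * (1 + gamma / ell)
print(f"population spike {1 + ell:.2f} | sample top {top:.2f} | RMT prediction {pred:.2f}")
```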

Large Average Submatrices of a Gaussian Random Matrix: Landscapes and Local Optima

Andrew Nobel (UNC)

The problem of finding large average submatrices of a real-valued matrix arises in the exploratory analysis of data from disciplines as diverse as genomics and social sciences. This talk will present several new theoretical results concerning large average submatrices of an n x n Gaussian random matrix that are motivated in part by previous work on biomedical applications. We will begin by considering the average and distribution of the k x k submatrix having largest average value (the global maximum),…
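
A sketch of one natural local-search procedure for this problem (alternating best responses over row and column sets with random restarts; a simplified illustration, not necessarily the algorithm analyzed in the talk):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 200, 5
A = rng.normal(size=(n, n))

def local_search(A, k, rng):
    """Alternate between the k best columns for the current rows and the
    k best rows for the current columns until a local optimum is reached."""
    rows = rng.choice(A.shape[0], k, replace=False)
    for _ in range(100):                                    # safety cap on iterations
        cols = np.argsort(A[rows].sum(axis=0))[-k:]         # best k columns for these rows
        new_rows = np.argsort(A[:, cols].sum(axis=1))[-k:]  # best k rows for these columns
        if set(new_rows) == set(rows):
            break
        rows = new_rows
    return rows, cols, A[np.ix_(rows, cols)].mean()

best = max((local_search(A, k, rng) for _ in range(20)), key=lambda t: t[2])
print(f"locally optimal {k}x{k} average over 20 restarts: {best[2]:.3f}")
```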

Incremental Methods for Additive Convex Cost Optimization

32-123

Motivated by machine learning problems over large data sets and distributed optimization over networks, we consider the problem of minimizing the sum of a large number of convex component functions. We study incremental gradient methods for solving such problems, which process component functions sequentially one at a time. We first consider deterministic cyclic incremental gradient methods (that process the component functions in a cycle) and provide new convergence rate results under some assumptions. We then consider a randomized incremental gradient…
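
A toy sketch contrasting a cyclic pass with a random-reshuffling pass on a least-squares sum (step size and problem are illustrative choices, not the conditions from the talk):

```python
import numpy as np

rng = np.random.default_rng(4)
m, d = 500, 10
A = rng.normal(size=(m, d))
b = A @ rng.normal(size=d) + 0.1 * rng.normal(size=m)
x_opt = np.linalg.lstsq(A, b, rcond=None)[0]         # minimizer of the full sum

def incremental(order_fn, passes=50, step=0.001):
    """Process component functions f_i(x) = (a_i^T x - b_i)^2 / 2 one at a time."""
    x = np.zeros(d)
    for _ in range(passes):
        for i in order_fn():
            x = x - step * (A[i] @ x - b[i]) * A[i]  # gradient step on component i
    return x

for name, order in [("cyclic", lambda: range(m)),
                    ("reshuffled", lambda: rng.permutation(m))]:
    x = incremental(order)
    print(f"{name}: distance to optimum {np.linalg.norm(x - x_opt):.4f}")
```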

On Shape Constrained Estimation

Shape constraints such as monotonicity, convexity, and log-concavity are naturally motivated in many applications, and can offer attractive alternatives to more traditional smoothness constraints in nonparametric estimation. In this talk we present some recent results on shape constrained estimation in high and low dimensions. First, we show how shape constrained additive models can be used to select variables in a sparse convex regression function. In contrast, additive models generally fail for variable selection under smoothness constraints. Next, we introduce graph-structured…
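
As a concrete one-dimensional instance, a minimal pool-adjacent-violators sketch for monotone (nondecreasing) regression, which requires no smoothness tuning parameter:

```python
import numpy as np

rng = np.random.default_rng(5)
x = np.linspace(0, 1, 100)
y = np.exp(2 * x) + 0.5 * rng.normal(size=x.size)    # noisy increasing signal

def pava(y):
    """Least-squares fit under a monotone constraint via pool adjacent violators."""
    sums = list(y.astype(float))                     # block sums
    sizes = [1] * len(y)                             # block sizes
    i = 0
    while i < len(sums) - 1:
        if sums[i] / sizes[i] > sums[i + 1] / sizes[i + 1]:
            sums[i] += sums.pop(i + 1)               # merge the violating pair of blocks
            sizes[i] += sizes.pop(i + 1)
            i = max(i - 1, 0)                        # a merge can create a new violation
        else:
            i += 1
    return np.concatenate([np.full(m, s / m) for s, m in zip(sums, sizes)])

fit = pava(y)
print("fit is monotone:", bool(np.all(np.diff(fit) >= 0)))
```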

On Complex Supervised Learning Problems, and On Ranking and Choice Models

Shivani Agarwal (Indian Institute of Science/Radcliffe)
32-123

While simple supervised learning problems like binary classification and regression are fairly well understood, increasingly, many applications involve more complex learning problems: more complex label and prediction spaces, more complex loss structures, or both. The first part of the talk will discuss recent advances in our understanding of such problems, including the notion of convex calibration dimension of a loss function, unified approaches for designing convex calibrated surrogates for arbitrary losses, and connections between supervised learning and property elicitation. The…
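
A minimal example of the calibration idea in the simplest case (the convex logistic surrogate for binary 0-1 loss, fit by gradient descent on simulated data; the talk concerns far more general label spaces and losses):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 2000
x = rng.uniform(-2, 2, size=n)
p = 1 / (1 + np.exp(-3 * x))                 # P(Y = +1 | X = x)
y = 2 * rng.binomial(1, p) - 1               # labels in {-1, +1}

# Minimize the logistic surrogate log(1 + exp(-y f(x))) over f(x) = a x + b.
a, b = 0.0, 0.0
for _ in range(2000):
    g = -y / (1 + np.exp(y * (a * x + b)))   # derivative of the surrogate w.r.t. the margin
    a -= 0.1 * np.mean(g * x)
    b -= 0.1 * np.mean(g)

# Calibration: the sign of the surrogate minimizer recovers the Bayes classifier sign(2p - 1).
agree = np.mean(np.sign(a * x + b) == np.sign(2 * p - 1))
print(f"agreement with the Bayes classifier: {agree:.3f}")
```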

Pairwise Comparison Models for High-Dimensional Ranking

Martin Wainwright (UC Berkeley)
32-123

Data in the form of pairwise comparisons between a collection of n items arises in many settings, including voting schemes, tournament play, and online search rankings. We study a flexible non-parametric model for pairwise comparisons, under which the probabilities of outcomes are required only to satisfy a natural form of stochastic transitivity (SST). The SST class includes a large family of classical parametric models as special cases, among them the Bradley-Terry-Luce and Thurstone models, but is substantially richer. We provide…
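
A small sketch that builds a Bradley-Terry-Luce comparison matrix and verifies the SST condition by brute force over triples (toy scale; it checks the definition directly, not the estimators from the talk):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 6
w = rng.uniform(0.5, 2.0, size=n)                 # BTL item "skills"

# Bradley-Terry-Luce probabilities: P[i, j] = P(item i beats item j).
P = w[:, None] / (w[:, None] + w[None, :])
np.fill_diagonal(P, 0.5)

def is_sst(P, tol=1e-12):
    """Strong stochastic transitivity: P[i,j] >= 1/2 and P[j,k] >= 1/2
    must imply P[i,k] >= max(P[i,j], P[j,k])."""
    m = len(P)
    for i in range(m):
        for j in range(m):
            for k in range(m):
                if P[i, j] >= 0.5 and P[j, k] >= 0.5:
                    if P[i, k] < max(P[i, j], P[j, k]) - tol:
                        return False
    return True

print("BTL matrix satisfies SST:", is_sst(P))
```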

Sub-Gaussian Mean Estimators

Roberto Oliveira (IMPA)
32-123

We discuss the possibilities and limitations of estimating the mean of a real-valued random variable from independent and identically distributed observations, from a non-asymptotic point of view. In particular, we define estimators with sub-Gaussian behavior even for certain heavy-tailed distributions. We also prove various impossibility results for mean estimators. These results appear in http://arxiv.org/abs/1509.05845, to appear in the Annals of Statistics. (Joint work with L. Devroye, M. Lerasle, and G. Lugosi.)
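
A minimal sketch of one classical estimator of this kind, median-of-means, on simulated heavy-tailed data (the block count and the Pareto-type distribution are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(8)
n, trials = 1000, 2000

def median_of_means(x, k):
    """Split into k blocks, average each block, return the median of the block means."""
    return np.median([b.mean() for b in np.array_split(x, k)])

# Lomax/Pareto data with tail index 2.1: finite variance, very heavy right tail.
a = 2.1
mu = 1 / (a - 1)                               # true mean of this distribution
errs_mean, errs_mom = [], []
for _ in range(trials):
    x = rng.pareto(a, size=n)
    errs_mean.append(abs(x.mean() - mu))
    errs_mom.append(abs(median_of_means(x, k=20) - mu))

# Compare the tails of the two estimators' error distributions.
print(f"empirical mean  99% error quantile: {np.quantile(errs_mean, 0.99):.4f}")
print(f"median-of-means 99% error quantile: {np.quantile(errs_mom, 0.99):.4f}")
```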

Double Machine Learning: Improved Point and Interval Estimation of Treatment and Causal Parameters

Most supervised machine learning (ML) methods are explicitly designed to solve prediction problems very well. Achieving this goal does not imply that these methods automatically deliver good estimators of causal parameters. Examples of such parameters include individual regression coefficients, average treatment effects, average lifts, and demand or supply elasticities. In fact, estimates of such causal parameters obtained via naively plugging ML estimators into estimating equations for such parameters can behave very poorly, for example, by formally having inferior rates of…
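
A minimal sketch of the cross-fitting idea in a partially linear model Y = theta*D + g(X) + noise, using scikit-learn random forests as stand-in ML learners for the nuisance functions (all modeling choices here are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(9)
n = 4000
X = rng.normal(size=(n, 5))
g = np.sin(X[:, 0]) + X[:, 1] ** 2                        # nonlinear confounding
D = 0.5 * X[:, 0] + np.cos(X[:, 1]) + rng.normal(size=n)  # treatment depends on X
theta = 1.0                                               # causal parameter of interest
Y = theta * D + g + rng.normal(size=n)

# Cross-fitting: nuisances are fit on one fold, residuals formed on the other.
idx = rng.permutation(n)
folds = [idx[: n // 2], idx[n // 2:]]
estimates = []
for train, test in [(folds[0], folds[1]), (folds[1], folds[0])]:
    mhat = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], D[train])
    lhat = RandomForestRegressor(n_estimators=100, random_state=0).fit(X[train], Y[train])
    v = D[test] - mhat.predict(X[test])                   # residualized treatment
    u = Y[test] - lhat.predict(X[test])                   # residualized outcome
    estimates.append((v @ u) / (v @ v))                   # orthogonalized "partialling out" estimate

print(f"double ML estimate of theta (true value 1.0): {np.mean(estimates):.3f}")
```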

