Stochastics and Statistics Seminar

Causal Matrix Completion

October 1, 2021 @ 11:00 am - 12:00 pm

Devavrat Shah (MIT)


Abstract: Matrix completion is the study of recovering an underlying matrix from a sparse subset of noisy observations. Traditionally, it is assumed that the entries of the matrix are “missing completely at random” (MCAR), i.e., each entry is revealed at random, independent of everything else, with uniform probability. This is likely unrealistic due to the presence of “latent confounders”, i.e., unobserved factors that determine both the entries of the underlying matrix and the missingness pattern in the observed matrix.
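
To make the contrast concrete, here is a minimal simulation sketch. It is an illustration only and is not taken from the talk: the rank-1 matrix, the logistic reveal probability, the noise level, and the function name `simulate` are all assumptions chosen for simplicity. It compares the average of the observed entries under MCAR sampling with the average when a latent row factor drives both the entry values and the chance that an entry is revealed.

```python
import numpy as np


def simulate(n_rows=2000, n_cols=50, seed=0):
    """Compare the naive observed-entry mean under MCAR and confounded sampling.

    Hypothetical toy model: none of these modeling choices come from the talk.
    """
    rng = np.random.default_rng(seed)

    # Latent factors; the underlying matrix is rank 1 with mean close to 0.
    u = rng.normal(size=n_rows)             # latent row factor (the confounder)
    v = rng.uniform(0.5, 1.5, size=n_cols)  # latent column factor
    A = np.outer(u, v)                      # underlying matrix
    Y = A + 0.1 * rng.normal(size=A.shape)  # noisy version of every entry

    # MCAR: each entry is revealed independently with the same probability.
    mask_mcar = rng.random(A.shape) < 0.3

    # Confounded: the same latent factor u also sets the reveal probability,
    # so rows with large entries are observed more often.
    p_row = 1.0 / (1.0 + np.exp(-2.0 * u))
    mask_conf = rng.random(A.shape) < p_row[:, None]

    return {
        "true_mean": float(A.mean()),
        "mcar_mean": float(Y[mask_mcar].mean()),
        "confounded_mean": float(Y[mask_conf].mean()),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

In this toy setting the MCAR average stays close to the true mean, while the confounded average is pulled upward, because the entries that get revealed are disproportionately the large ones.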

In general, these confounders yield “missing not at random” (MNAR) data, which can severely impact any inference procedure that does not correct for this bias. We develop a formal causal model for matrix completion with MNAR data through the language of potential outcomes, and provide identification arguments for the causal estimands of interest. We design a procedure, which we call “synthetic nearest neighbors” (SNN), to estimate these causal estimands. We prove finite-sample consistency and asymptotic normality of our estimator. Our analysis also leads to new theoretical results for the matrix completion literature. In particular, we establish entry-wise, i.e., max-norm, finite-sample consistency and asymptotic normality results for matrix completion with MNAR data. As a special case, this also provides entry-wise bounds for matrix completion with MCAR data. Across simulated and real data, we demonstrate the efficacy of our proposed estimator.

This is based on joint work with Anish Agarwal (MIT), Munther Dahleh (MIT), and Dennis Shen (UC Berkeley).

Bio: Devavrat Shah is the Andrew (1956) and Erna Viterbi Professor in the Department of Electrical Engineering and Computer Science at MIT. He is a member of LIDS and the ORC, and the Faculty Director of the MicroMasters in Statistics and Data Science program at IDSS. His research focuses on the theory of large complex networks, including network algorithms, stochastic networks, network information theory, and large-scale statistical inference.

© MIT Statistics + Data Science Center | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764