A Causal Exposure Response Function with Local Adjustment for Confounding: A study of the health effects of long-term exposure to low levels of fine particulate matter

Francesca Dominici (Harvard University)
E18-304

Abstract: In the last two decades, ambient levels of air pollution have declined substantially. Yet, as mandated by the Clean Air Act, we must continue to address the following question: is exposure to levels of air pollution that are well below the National Ambient Air Quality Standards (NAAQS) harmful to human health? Furthermore, the highly contentious nature of environmental regulation necessitates casting this question within a causal inference framework. Several parametric and semi-parametric regression modeling approaches have been used to…
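
As a rough illustration of the estimand involved, the following Python sketch estimates an exposure-response function with a generalized propensity score adjustment in the style of Hirano and Imbens. The simulated data, the normal exposure model, and the quadratic outcome model are illustrative assumptions, not the method of the talk.

    import numpy as np

    rng = np.random.default_rng(0)

    # Simulated data: confounder c drives both exposure a and outcome y.
    n = 5000
    c = rng.normal(size=n)
    a = 0.8 * c + rng.normal(size=n)                          # exposure
    y = np.sin(a) + 0.5 * c + rng.normal(scale=0.3, size=n)   # outcome

    # Generalized propensity score: fitted normal density of a given c.
    b1, b0 = np.polyfit(c, a, 1)
    resid = a - (b1 * c + b0)
    s2 = resid.var()
    gps = np.exp(-resid**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)

    # Outcome regression on exposure and GPS (quadratic, for illustration).
    design = lambda a_, g_: np.column_stack(
        [np.ones_like(a_), a_, a_**2, g_, g_**2, a_ * g_])
    coef, *_ = np.linalg.lstsq(design(a, gps), y, rcond=None)

    # Exposure-response function: for each exposure level a0, average the
    # predicted outcome over the GPS evaluated at a0 for every unit.
    for a0 in (-1.0, 0.0, 1.0):
        r0 = a0 - (b1 * c + b0)
        g0 = np.exp(-r0**2 / (2 * s2)) / np.sqrt(2 * np.pi * s2)
        erf = (design(np.full(n, a0), g0) @ coef).mean()
        print(f"ERF({a0:+.1f}) ~ {erf:.3f}")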

Stability of a Fluid Model for Fair Bandwidth Sharing with General File Size Distributions

Ruth J Williams (University of California, San Diego)
E18-304

Abstract: Massoulie and Roberts introduced a stochastic model for a data communication network where file sizes are generally distributed and the network operates under a fair bandwidth sharing policy.  It has been a standing problem to prove stability of this general model when the average load on the system is less than the network's capacity. A crucial step in an approach to this problem is to prove stability of an associated measure-valued fluid model. We shall describe prior work on this question done under various strong assumptions and…
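
For context on the policy being modeled, here is a minimal sketch (a hypothetical toy instance, solved with cvxpy) of a proportionally fair bandwidth allocation on the classic two-link linear network; proportional fairness is one example of the fair sharing policies studied for this model.

    import numpy as np
    import cvxpy as cp

    # Toy linear network: two unit-capacity links, three routes.
    # Route 0 crosses both links; routes 1 and 2 each use one link.
    A = np.array([[1.0, 1.0, 0.0],    # link 0 is used by routes 0 and 1
                  [1.0, 0.0, 1.0]])   # link 1 is used by routes 0 and 2
    capacity = np.array([1.0, 1.0])
    n_files = np.array([2.0, 1.0, 1.0])  # files in progress per route

    # Proportional fairness: maximize the file-weighted sum of log rates
    # subject to the link capacity constraints.
    x = cp.Variable(3, pos=True)   # total bandwidth allocated to each route
    problem = cp.Problem(cp.Maximize(n_files @ cp.log(x)),
                         [A @ x <= capacity])
    problem.solve()
    print(np.round(x.value, 3))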

Understanding Machine Learning with Statistical Physics

Lenka Zdeborová (Institute of Theoretical Physics, CNRS)
E18-304

Abstract: The affinity between statistical physics and machine learning has a long history; this is reflected even in the machine learning terminology, which is in part adopted from physics. Current theoretical challenges and open questions about deep learning and statistical learning call for a unified account of the following three ingredients: (a) the dynamics of the learning algorithm, (b) the architecture of the neural networks, and (c) the structure of the data. Most existing theories do not take into account all of…

Artificial Bayesian Monte Carlo Integration: A Practical Resolution to the Bayesian (Normalizing Constant) Paradox

Xiao-Li Meng (Harvard University)
E18-304

Abstract: Advances in Markov chain Monte Carlo in the past 30 years have made Bayesian analysis a routine practice. However, there is virtually no practice of performing Monte Carlo integration from the Bayesian perspective; indeed, this problem has earned the “paradox” label in the context of computing normalizing constants (Wasserman, 2013). We first use the modeling-what-we-ignore idea of Kong et al. (2003) to explain that the crux of the paradox is not with the likelihood theory, which is essentially the same…
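
As background on the computational problem, here is a minimal sketch of estimating a normalizing constant by importance sampling, the baseline against which more refined methods are measured; the target and proposal are illustrative choices, not the resolution proposed in the talk.

    import numpy as np

    rng = np.random.default_rng(1)

    # Unnormalized target q(x) = exp(-x^2 / 2); the true normalizing
    # constant is Z = sqrt(2 * pi).
    q = lambda x: np.exp(-0.5 * x**2)

    # Draw from a heavier-tailed normal proposal and average q / proposal.
    sigma = 2.0
    x = rng.normal(scale=sigma, size=100_000)
    p = np.exp(-0.5 * (x / sigma)**2) / (sigma * np.sqrt(2 * np.pi))
    z_hat = np.mean(q(x) / p)
    print(z_hat, np.sqrt(2 * np.pi))   # estimate vs. truth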

SDP Relaxation for Learning Discrete Structures: Optimal Rates, Hidden Integrality, and Semirandom Robustness

Yudong Chen (Cornell)
E18-304

Abstract: We consider problems of learning discrete structures from network data in statistical settings. Popular examples include various block models, Z2 synchronization, and mixture models. Semidefinite programming (SDP) relaxation has emerged as a versatile and robust approach to these problems. We show that despite being a relaxation, SDP achieves the optimal Bayes error rate in terms of distance to the target solution. Moreover, SDP relaxation is provably robust under the so-called semirandom model, which frustrates many existing algorithms. Our…
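
To make the Z2 synchronization example concrete, here is a small self-contained sketch (with hypothetical parameter choices, solved with cvxpy): relax the rank-one matrix z z^T to a positive semidefinite matrix with unit diagonal, maximize its correlation with the observation, and round with the leading eigenvector.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(0)

    # Z2 synchronization: recover hidden signs z from Y = z z^T + noise.
    n = 30
    z = rng.choice([-1.0, 1.0], size=n)
    W = rng.normal(size=(n, n))
    Y = np.outer(z, z) + (W + W.T) / np.sqrt(2)

    # SDP relaxation: replace the rank-one constraint X = z z^T by X >> 0.
    X = cp.Variable((n, n), symmetric=True)
    problem = cp.Problem(cp.Maximize(cp.trace(Y @ X)),
                         [X >> 0, cp.diag(X) == 1])
    problem.solve()

    # Round: the sign of the leading eigenvector estimates z up to a flip.
    eigvals, eigvecs = np.linalg.eigh(X.value)
    z_hat = np.sign(eigvecs[:, -1])
    print(abs(z_hat @ z) / n)   # agreement with the truth (1.0 = exact)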

One-shot Information Theory via Poisson Processes

Cheuk Ting Li (UC Berkeley)
E18-304

Abstract: In information theory, coding theorems are usually proved in the asymptotic regime where the blocklength tends to infinity. While there are techniques for finite-blocklength analysis, they are often more complex than their asymptotic counterparts. In this talk, we study the use of Poisson processes in proving coding theorems, which not only yields sharp one-shot and finite-blocklength results, but also gives significantly shorter proofs than conventional asymptotic techniques in some settings. Instead of using fixed-size random codebooks, we…
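
One compact ingredient behind this line of work is the Poisson functional representation: mark the arrival times T_1 < T_2 < ... of a rate-one Poisson process with i.i.d. draws X_i from a proposal Q; then the mark minimizing T_i / (dP/dQ)(X_i) is an exact draw from P. The sketch below checks this numerically; the normal target and proposal and the finite truncation of the process are illustrative choices.

    import numpy as np

    rng = np.random.default_rng(0)

    # Target P = N(1, 0.5^2), proposal Q = N(0, 1), and their density ratio.
    ratio = lambda x: (np.exp(-0.5 * ((x - 1.0) / 0.5)**2) / 0.5
                       / np.exp(-0.5 * x**2))

    def poisson_functional_sample(n_points=5000):
        t = np.cumsum(rng.exponential(size=n_points))  # Poisson arrival times
        x = rng.normal(size=n_points)                  # marks drawn from Q
        # The density ratio is bounded here, so a finite truncation of the
        # process contains the minimizer with overwhelming probability.
        return x[np.argmin(t / ratio(x))]

    samples = np.array([poisson_functional_sample() for _ in range(2000)])
    print(samples.mean(), samples.std())   # should be close to 1 and 0.5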

Accurate Simulation-Based Parametric Inference in High Dimensional Settings

Maria-Pia Victoria-Feser (University of Geneva)
E18-304

Abstract: Accurate estimation and inference in finite samples is important for decision making in many experimental and social fields, especially when the available data are complex: they may include mixed types of measurements, be dependent in several ways, or contain missing data and outliers. Indeed, the more complex the data (and hence the models), the less accurate asymptotic theory results are in finite samples. This is in particular the case, for example, with logistic regression, possibly also with random effects…
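
As a minimal sketch of the general idea of simulation-based bias correction (an iterative bootstrap on a deliberately simple toy model, not anything from the talk): the plug-in variance estimator with denominator n is biased downward in small samples, and the iteration adjusts the parameter until estimates computed on data simulated from it match the estimate computed on the observed data.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy model: X_1..X_n ~ N(0, theta). The initial estimator (denominator
    # n rather than n - 1) is biased downward when n is small.
    n = 10
    pi_hat = lambda x: np.mean((x - x.mean())**2)

    x_obs = rng.normal(scale=np.sqrt(2.0), size=n)
    pi_obs = pi_hat(x_obs)

    # Iterative bootstrap: nudge theta until the average estimate on data
    # simulated from theta matches the estimate on the observed data.
    theta = pi_obs
    for _ in range(30):
        sims = [pi_hat(rng.normal(scale=np.sqrt(theta), size=n))
                for _ in range(2000)]
        theta += pi_obs - np.mean(sims)
    print(pi_obs, theta)   # biased plug-in estimate vs. corrected estimate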

Esther Williams in the Harold Holt Memorial Swimming Pool: Some Thoughts on Complexity

Daniel Simpson (University of Toronto)
E18-304

IDS.190 – Topics in Bayesian Modeling and Computation

Abstract: As data becomes more complex and computational modelling becomes more powerful, we rapidly find ourselves beyond the scope of traditional statistical theory. As we venture beyond the traditional thunderdome, we need to think about how to cope with this additional complexity in our model building. In this talk, I will talk about a few techniques that are useful when specifying prior distributions and building Bayesian models…

Towards Robust Statistical Learning Theory

Stanislav Minsker (USC)
E18-304

Abstract: Real-world data typically do not fit statistical models or satisfy the assumptions underlying the theory exactly; hence, reducing the number and strictness of these assumptions helps to lessen the gap between the “mathematical” world and the “real” world. The concept of robustness, in particular robustness to outliers, plays a central role in understanding this gap. The goal of the talk is to introduce the principles, and robust algorithms based on these principles, that can be applied in the general framework…
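
One concrete instance of a robust principle of the kind referred to here is median-of-means estimation, shown below as a generic illustration rather than as the talk's specific algorithm.

    import numpy as np

    rng = np.random.default_rng(0)

    def median_of_means(x, n_blocks=10):
        # Split the (shuffled) sample into blocks, average within each
        # block, and return the median of the block means.
        blocks = np.array_split(rng.permutation(x), n_blocks)
        return np.median([b.mean() for b in blocks])

    # Heavy-tailed data plus two gross outliers: the empirical mean is
    # dragged far from 0, while median-of-means barely moves.
    x = np.concatenate([rng.standard_t(df=2.0, size=1000), [500.0, -900.0]])
    print(x.mean(), median_of_means(x))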

Markov Chain Monte Carlo Methods and Some Attempts at Parallelizing Them

Pierre E. Jacob (Harvard University)
E18-304

IDS.190 – Topics in Bayesian Modeling and Computation

Abstract: MCMC methods yield approximations that converge to quantities of interest in the limit of the number of iterations. This iterative asymptotic justification is not ideal: it stands at odds with current trends in computing hardware. Namely, it would often be computationally preferable to run many short chains in parallel, but such an approach is flawed because of the so-called "burn-in" bias. This talk will first describe that issue and some known…
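
A minimal numerical illustration of the burn-in bias at issue (the target, starting point, and chain lengths are arbitrary choices, and the sketch shows the problem rather than any remedy): averaging the final states of many short parallel chains retains a bias toward the initialization that a single long chain with a discarded burn-in does not.

    import numpy as np

    rng = np.random.default_rng(0)

    def mh_chain(n_steps, x0=10.0):
        # Random-walk Metropolis targeting N(0, 1), started far in the tail.
        x, path = x0, []
        for _ in range(n_steps):
            proposal = x + rng.normal()
            if np.log(rng.uniform()) < 0.5 * (x**2 - proposal**2):
                x = proposal
            path.append(x)
        return np.array(path)

    # Same total budget: one long chain vs. many short parallel chains.
    long_run = mh_chain(2000)
    finals = np.array([mh_chain(10)[-1] for _ in range(200)])

    print(long_run[500:].mean())   # long chain, burn-in discarded: near 0
    print(finals.mean())           # short chains: biased toward x0 = 10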


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764