Artificial Bayesian Monte Carlo Integration: A Practical Resolution to the Bayesian (Normalizing Constant) Paradox

Xiao-Li Meng (Harvard University)
E18-304

Abstract: Advances in Markov chain Monte Carlo in the past 30 years have made Bayesian analysis a routine practice. However, there is virtually no practice of performing Monte Carlo integration from the Bayesian perspective; indeed, this problem has earned the “paradox” label in the context of computing normalizing constants (Wasserman, 2013). We first use the modeling-what-we-ignore idea of Kong et al. (2003) to explain that the crux of the paradox is not with the likelihood theory, which is essentially the same…
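To make the normalizing-constant problem concrete, here is a minimal importance-sampling sketch for estimating the normalizing constant of a toy one-dimensional unnormalized density. It is only a generic Monte Carlo illustration of the quantity in question, not the Bayesian treatment developed in the talk, and the target and proposal are arbitrary choices.

```python
import numpy as np
from scipy import stats

# Unnormalized target q(x) = exp(-x**4 / 2); its normalizing constant
# Z = integral of q(x) dx has no closed form, so estimate it by
# importance sampling from a tractable standard-normal proposal p.
def q(x):
    return np.exp(-x**4 / 2.0)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=100_000)      # draws from the proposal p
weights = q(x) / stats.norm.pdf(x)          # importance weights q(x) / p(x)

# Z ~= E_p[q(X) / p(X)], estimated by the sample mean of the weights.
z_hat = weights.mean()
z_se = weights.std(ddof=1) / np.sqrt(len(weights))
print(f"estimated normalizing constant: {z_hat:.4f} +/- {z_se:.4f}")
```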


Understanding Machine Learning with Statistical Physics

Lenka Zdeborová (Institute of Theoretical Physics, CNRS)
E18-304

Abstract: The affinity between statistical physics and machine learning has a long history; this is reflected even in the machine learning terminology, which is in part adopted from physics. Current theoretical challenges and open questions about deep learning and statistical learning call for a unified account of the following three ingredients: (a) the dynamics of the learning algorithm, (b) the architecture of the neural networks, and (c) the structure of the data. Most existing theories do not take into account all of…


Stability of a Fluid Model for Fair Bandwidth Sharing with General File Size Distributions

Ruth J Williams (University of California, San Diego)
E18-304

Abstract: Massoulié and Roberts introduced a stochastic model for a data communication network in which file sizes are generally distributed and the network operates under a fair bandwidth-sharing policy. It has been a long-standing problem to prove stability of this general model when the average load on the system is less than the network's capacity. A crucial step in an approach to this problem is to prove stability of an associated measure-valued fluid model. We shall describe prior work on this question done under various strong assumptions and…
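As a toy counterpart to the model in the abstract, the following sketch simulates a single link whose capacity is shared equally among all files in progress (egalitarian processor sharing) with generally distributed file sizes, and records the number of files in the system when the offered load is below capacity. It is only an illustrative simulation with arbitrary parameters, not the measure-valued fluid model analyzed in the talk.

```python
import numpy as np

rng = np.random.default_rng(1)

capacity = 1.0        # link capacity (units of work per unit time)
arrival_rate = 0.7    # Poisson arrival rate; the lognormal sizes below have
                      # mean 1, so the offered load is 0.7 < capacity
T = 50_000.0          # simulation horizon

t = 0.0
remaining = []        # remaining sizes of files currently in the system
next_arrival = rng.exponential(1.0 / arrival_rate)
counts = []           # number of files in the system, recorded at event times

while t < T:
    n = len(remaining)
    # Under equal sharing each file is served at rate capacity / n, so the
    # next departure occurs after min(remaining) * n / capacity time units.
    next_departure = min(remaining) * n / capacity if n else np.inf
    dt = min(next_arrival - t, next_departure)
    if n:
        remaining = [r - dt * capacity / n for r in remaining]
        remaining = [r for r in remaining if r > 1e-12]
    t += dt
    if t >= next_arrival - 1e-12:
        remaining.append(rng.lognormal(mean=-0.5, sigma=1.0))  # mean-1 file size
        next_arrival = t + rng.exponential(1.0 / arrival_rate)
    counts.append(len(remaining))

print("average number of files in system:", np.mean(counts))
```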


A Causal Exposure Response Function with Local Adjustment for Confounding: A study of the health effects of long-term exposure to low levels of fine particulate matter

Francesca Dominici (Harvard University)
E18-304

Abstract: In the last two decades, ambient levels of air pollution have declined substantially. Yet, as mandated by the Clean Air Act, we must continue to address the following question: is exposure to levels of air pollution that are well below the National Ambient Air Quality Standards (NAAQS) harmful to human health? Furthermore, the highly contentious nature surrounding environmental regulations necessitates casting this question within a causal inference framework. Several parametric and semi-parametric regression modeling approaches have been used to…


Automated Data Summarization for Scalability in Bayesian Inference

Tamara Broderick (MIT)
E18-304

Abstract: Many algorithms take prohibitively long to run on modern, large data sets. But even in complex data sets, many data points may be at least partially redundant for some task of interest. So one might instead construct and use a weighted subset of the data (called a “coreset”) that is much smaller than the original dataset. Typically running algorithms on a much smaller data set will take much less computing time, but it remains to understand whether the output…
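To fix ideas, the following sketch builds the simplest kind of weighted data summary: a uniform subsample whose weights rescale it to stand in for the full data set, compared against the full-data log-likelihood of a logistic regression model. It is a generic illustration of the coreset idea on arbitrary synthetic data, not the automated construction described in the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic logistic-regression data set (n points, d features).
n, d = 100_000, 5
X = rng.normal(size=(n, d))
theta_true = rng.normal(size=d)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-X @ theta_true)))

def log_likelihood(theta, X, y, weights=None):
    """Weighted log-likelihood sum_i w_i * log p(y_i | x_i, theta)."""
    if weights is None:
        weights = np.ones(len(y))
    logits = X @ theta
    return np.sum(weights * (y * logits - np.log1p(np.exp(logits))))

# Simple uniform-subsampling "coreset": m points, each with weight n / m,
# so the weighted sum is an unbiased estimate of the full-data sum.
m = 1_000
idx = rng.choice(n, size=m, replace=False)
w = np.full(m, n / m)

theta = rng.normal(size=d)   # an arbitrary parameter value to compare at
print(f"full-data log-likelihood: {log_likelihood(theta, X, y):.1f}")
print(f"coreset approximation:    {log_likelihood(theta, X[idx], y[idx], w):.1f}")
```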


Flexible Perturbation Models for Robustness to Misspecification

Jeffrey Miller (Harvard University)
E18-304

Abstract: In many applications, there are natural statistical models with interpretable parameters that provide insight into questions of interest. While useful, these models are almost always wrong in the sense that they only approximate the true data generating process. In some cases, it is important to account for this model error when quantifying uncertainty in the parameters. We propose to model the distribution of the observed data as a perturbation of an idealized model of interest by using a nonparametric…
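As a toy illustration of the misspecification issue the abstract describes (not the nonparametric perturbation methodology proposed in the talk), the sketch below fits an idealized Gaussian model to data generated from a slightly perturbed distribution and shows that the nominal uncertainty understates the spread actually present in the data; all distributional choices are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)

# "Idealized model": N(mu, 1).  The data are a small perturbation of it:
# 95% from N(0, 1), 5% from a wide contaminating component.
n = 5_000
ideal = rng.normal(0.0, 1.0, size=n)
contam = rng.normal(0.0, 10.0, size=n)
data = np.where(rng.random(n) < 0.05, contam, ideal)

# Under the idealized model the MLE of mu is the sample mean, and its
# nominal standard error assumes the model variance of 1 is correct.
mu_hat = data.mean()
nominal_se = 1.0 / np.sqrt(n)
empirical_se = data.std(ddof=1) / np.sqrt(n)
print(f"mu_hat = {mu_hat:.3f}")
print(f"nominal SE under the idealized model: {nominal_se:.4f}")
print(f"SE acknowledging the extra spread:    {empirical_se:.4f}")
```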


Inferring the Evolutionary History of Tumors

Simon Tavaré (Columbia University)
E18-304

Abstract: Bulk sequencing of tumor DNA is a popular strategy for uncovering information about the spectrum of mutations arising in the tumor, and is often supplemented by multi-region sequencing, which provides a view of tumor heterogeneity. The statistical issues arise from the fact that bulk sequencing makes the determination of sub-clonal frequencies, and other quantities of interest, difficult. In this talk I will discuss this problem, beginning with its setting in population genetics. The data provide an estimate of the…


The Statistical Finite Element Method

Mark Girolami (University of Cambridge)
E18-304

Abstract: The finite element method (FEM) is one of the great triumphs of modern-day applied mathematics, numerical analysis and software development. Every area of the sciences and engineering has been positively impacted by the ability to model and study complex physical and natural systems described by systems of partial differential equations (PDE) via the FEM. In parallel, recent developments in sensor, measurement, and signalling technologies enable the phenomenological study of systems as diverse as protein signalling in the…
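For readers who have not seen the FEM before, the following sketch assembles the classic piecewise-linear finite element system for the one-dimensional Poisson problem -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0. It is a textbook illustration of the deterministic FEM only, not the statistical FEM construction presented in the talk.

```python
import numpy as np

# 1D Poisson problem -u''(x) = f(x) on (0, 1), u(0) = u(1) = 0,
# discretized with piecewise-linear ("hat") elements on a uniform mesh.
# For f = 1 the exact solution is u(x) = x(1 - x)/2.
n_elements = 32
h = 1.0 / n_elements
nodes = np.linspace(0.0, 1.0, n_elements + 1)

# Stiffness matrix and load vector over the interior nodes.
n_int = n_elements - 1
K = np.zeros((n_int, n_int))
F = np.zeros(n_int)
for i in range(n_int):
    K[i, i] = 2.0 / h
    if i + 1 < n_int:
        K[i, i + 1] = K[i + 1, i] = -1.0 / h
    F[i] = h * 1.0          # integral of f = 1 against the i-th hat function

u_int = np.linalg.solve(K, F)
u = np.concatenate(([0.0], u_int, [0.0]))   # reattach the boundary values

exact = nodes * (1.0 - nodes) / 2.0
print("max nodal error:", np.max(np.abs(u - exact)))
```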


Gaussian Differential Privacy, with Applications to Deep Learning

Weijie Su (University of Pennsylvania)
E18-304

Abstract: Privacy-preserving data analysis has been put on a firm mathematical foundation since the introduction of differential privacy (DP) in 2006. This privacy definition, however, has some well-known weaknesses: notably, it does not tightly handle composition. This weakness has inspired several recent relaxations of differential privacy based on the Rényi divergences. We propose an alternative relaxation we term "f-DP", which has a number of nice properties and avoids some of the difficulties associated with divergence-based relaxations. First, f-DP preserves…
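For readers new to f-DP, the following sketch evaluates the trade-off function of the Gaussian mechanism, G_mu(alpha) = Phi(Phi^{-1}(1 - alpha) - mu), which is the central object in Gaussian differential privacy, with mu = sensitivity / noise scale. It is a generic illustration based on the published definition of Gaussian DP, not code from the talk.

```python
import numpy as np
from scipy.stats import norm

def gaussian_tradeoff(alpha, mu):
    """Trade-off function G_mu of mu-Gaussian DP: the smallest achievable
    type II error at type I error alpha when testing N(0,1) vs N(mu,1)."""
    return norm.cdf(norm.ppf(1.0 - alpha) - mu)

# Gaussian mechanism: adding N(0, sigma^2) noise to a statistic with
# sensitivity Delta satisfies mu-GDP with mu = Delta / sigma.
delta_sensitivity = 1.0
sigma = 2.0
mu = delta_sensitivity / sigma

for a in np.linspace(0.0, 1.0, 11):
    print(f"type I error {a:.1f} -> minimal type II error {gaussian_tradeoff(a, mu):.3f}")
```

One convenient property of this family is that the composition of mu_1- and mu_2-GDP mechanisms is again Gaussian DP with parameter sqrt(mu_1^2 + mu_2^2), which speaks directly to the composition weakness mentioned in the abstract.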



MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764