IDS.190 Topics in Bayesian Modeling and Computation


  • Probabilistic Inference and Learning with Stein’s Method

    On November 6, 2019 at 4:00 pm till 5:00 pm
    Lester Mackey (Microsoft Research)
    37-212

    IDS.190 – Topics in Bayesian Modeling and Computation 

    **PLEASE NOTE ROOM CHANGE TO BUILDING 37-212 FOR THE WEEKS OF 10/30 AND 11/6** 

    Stein’s method is a powerful tool from probability theory for bounding the distance between probability distributions. In this talk, I’ll describe how this tool, designed to prove central limit theorems, can be adapted to assess and improve the quality of practical inference procedures. I’ll highlight applications to Markov chain sampler selection, goodness-of-fit testing, variational inference, and nonconvex optimization, and close with several opportunities for future work.
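
    As a concrete illustration of the kind of quantity this line of work produces, the sketch below computes a kernel Stein discrepancy for a sample against a known target. The target (a standard Gaussian, whose score is simply -x) and the inverse multiquadric base kernel are illustrative choices, not details taken from the talk:

    ```python
    # A rough sketch of a kernel Stein discrepancy (KSD) estimate. Assumptions
    # (illustrative, not from the talk): the target is N(0, I), so its score is
    # s(x) = -x, and the base kernel is the inverse multiquadric
    # k(x, y) = (c^2 + ||x - y||^2)^(-1/2).
    import numpy as np

    def ksd_gaussian_target(x, c=1.0):
        """V-statistic estimate of KSD^2 for samples x of shape (n, d) against N(0, I)."""
        n, d = x.shape
        score = -x                                   # score of N(0, I): grad log p(x) = -x
        diff = x[:, None, :] - x[None, :, :]         # pairwise differences, shape (n, n, d)
        r2 = np.sum(diff ** 2, axis=-1)              # squared pairwise distances
        u = c ** 2 + r2
        k = u ** (-0.5)                              # base kernel values
        grad_x_k = -diff * u[..., None] ** (-1.5)    # gradient of k in its first argument
        grad_y_k = diff * u[..., None] ** (-1.5)     # gradient of k in its second argument
        trace_term = d * u ** (-1.5) - 3.0 * r2 * u ** (-2.5)
        k0 = (np.einsum('id,jd,ij->ij', score, score, k)      # Stein kernel k_0(x_i, x_j)
              + np.einsum('id,ijd->ij', score, grad_y_k)
              + np.einsum('jd,ijd->ij', score, grad_x_k)
              + trace_term)
        return k0.mean()

    rng = np.random.default_rng(0)
    good = rng.normal(size=(300, 2))                 # sample roughly from the target
    biased = rng.normal(loc=0.7, size=(300, 2))      # shifted sample; should score worse
    print(ksd_gaussian_target(good), ksd_gaussian_target(biased))
    ```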

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/ 

    **Meetings are open to any interested researcher.

  • Automated Data Summarization for Scalability in Bayesian Inference

    On September 11, 2019 at 4:00 pm till 5:00 pm
    Tamara Broderick (MIT)
    E18-304

    IDS.190 – Topics in Bayesian Modeling and Computation

    Many algorithms take prohibitively long to run on modern, large datasets. But even in complex data sets, many data points may be at least partially redundant for some task of interest. So one might instead construct and use a weighted subset of the data (called a coreset) that is much smaller than the original dataset. Typically running algorithms on a much smaller data set will take much less computing time, but it remains to understand whether the output can be widely useful. (1) In particular, can running an analysis on a smaller coreset yield answers close to those from running on the full data set? (2) And can useful coresets be constructed automatically for new analyses, with minimal extra work from the user? We answer in the affirmative for a wide variety of problems in Bayesian inference. We demonstrate how to construct Bayesian coresets as an automatic, practical pre-processing step. We prove that our method provides geometric decay in relevant approximation error as a function of coreset size. Empirical analysis shows that our method reduces approximation error by orders of magnitude relative to uniform random subsampling of data. Though we focus on Bayesian methods here, we also show that our construction can be applied in other domains.
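
    For intuition only, the toy sketch below shows the basic object under discussion: a small weighted subset whose weighted log-likelihood stands in for the full-data log-likelihood. It uses uniform subsampling with weights n/m and an illustrative conjugate Gaussian-mean model; the constructions in the talk choose the subset and weights far more carefully, which is what yields the error guarantees described above:

    ```python
    # A toy illustration of the coreset idea (illustrative, not the talk's algorithm):
    # stand in for the full-data log-likelihood with a weighted log-likelihood over a
    # small subset. Here the subset is a uniform subsample with weights n/m, i.e. the
    # naive baseline that the talk's constructions improve upon.
    import numpy as np

    rng = np.random.default_rng(1)
    n, m = 100_000, 200
    data = rng.normal(loc=2.0, scale=1.0, size=n)     # toy dataset from N(mu, 1), mu = 2

    idx = rng.choice(n, size=m, replace=False)
    coreset, weights = data[idx], np.full(m, n / m)   # weighted subset standing in for the data

    def log_lik(mu, x, w):
        """Weighted Gaussian log-likelihood (unit variance), up to an additive constant."""
        return -0.5 * np.sum(w * (x - mu) ** 2)

    def posterior(x, w, prior_var=100.0):
        """Conjugate N(0, prior_var) prior on mu; returns posterior mean and variance."""
        prec = 1.0 / prior_var + np.sum(w)
        return np.sum(w * x) / prec, 1.0 / prec

    print("log-lik at mu=2  :", log_lik(2.0, data, np.ones(n)), "vs", log_lik(2.0, coreset, weights))
    print("full posterior   :", posterior(data, np.ones(n)))
    print("coreset posterior:", posterior(coreset, weights))
    ```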

    Tamara Broderick is an Associate Professor in EECS at MIT.

    **Meetings are open to any interested researcher.

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

  • The Statistical Finite Element Method

    On December 11, 2019 at 4:00 pm till 5:00 pm
    Mark Girolami (University of Cambridge)
    E18-304

    The finite element method (FEM) is one of the great triumphs of modern-day applied mathematics, numerical analysis and software development. Every area of the sciences and engineering has been positively impacted by the ability to model and study complex physical and natural systems described by systems of partial differential equations (PDE) via the FEM.

    In parallel, recent developments in sensor, measurement, and signalling technologies enable the phenomenological study of systems as diverse as protein signalling in the cell, turbulent combustion in jet engines, and plastic deformation in bridges.

    The connection between sensor data and FEM is currently restricted to data assimilation for solving inverse problems or the calibration of PDE-based models. This, however, places unwarranted faith in the fidelity of the underlying mathematical description of the actual system under study.

    If one concedes that there is ‘missing physics’ or mis-specification between generative reality and the mathematical abstraction defining the FEM, then a framework to systematically characterise and propagate this uncertainty in the FEM is required.

    This talk will present a formal statistical construction of the FEM which systematically blends the mathematical description with observational data, and will provide both small- and large-scale examples, from 3D-printed structures to working rail bridges currently operated by Network Rail in the United Kingdom.
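
    To make the idea of blending a mathematical description with observational data concrete, the sketch below is a minimal Gaussian illustration (not the speaker's statistical FEM construction): a 1D Poisson problem discretized with linear finite elements, an uncertain forcing treated as Gaussian, and the resulting FEM prior on the solution conditioned on a few noisy sensor readings:

    ```python
    # An illustrative Gaussian sketch (not the speaker's statFEM formulation):
    # -u'' = f on (0, 1) with u(0) = u(1) = 0, discretized with linear finite
    # elements; the forcing f is uncertain, so the FEM solution has a Gaussian
    # prior, which is then conditioned on noisy sensor readings.
    import numpy as np

    n = 50                                        # number of interior nodes
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)
    K = (np.diag(2 * np.ones(n)) - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h       # stiffness matrix for linear elements

    sigma_f = 0.5                                 # prior: nodal forcing f ~ N(1, sigma_f^2 I)
    Kinv = np.linalg.inv(K)
    m_u = Kinv @ (h * np.ones(n))                 # prior mean of the FEM solution (load b = h f)
    C_u = (h * sigma_f) ** 2 * Kinv @ Kinv.T      # prior covariance of the FEM solution

    rng = np.random.default_rng(3)
    u_true = Kinv @ (h * (1.0 + 0.8 * np.sin(2 * np.pi * x)))   # "reality" the prior mean misses
    obs_idx = np.arange(4, n, 5)                  # a handful of sensor locations
    sigma_y = 1e-3
    y = u_true[obs_idx] + sigma_y * rng.normal(size=obs_idx.size)

    H = np.eye(n)[obs_idx]                        # observation operator (point sensors at nodes)
    S = H @ C_u @ H.T + sigma_y ** 2 * np.eye(obs_idx.size)
    gain = C_u @ H.T @ np.linalg.inv(S)
    m_post = m_u + gain @ (y - H @ m_u)           # Gaussian conditioning on the sensor data
    C_post = C_u - gain @ H @ C_u

    print("max |prior mean - truth|    :", np.abs(m_u - u_true).max())
    print("max |posterior mean - truth|:", np.abs(m_post - u_true).max())
    print("typical posterior std dev   :", np.sqrt(np.diag(C_post)).mean())
    ```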

    Mark Girolami is a computational statistician with ten years of experience as a Chartered Engineer within IBM. In March 2019 he was elected to the Sir Kirby Laing Professorship of Civil Engineering (1965) within the Department of Engineering at the University of Cambridge, where he also holds the Royal Academy of Engineering Research Chair in Data Centric Engineering. Girolami takes up the Sir Kirby Laing Chair upon the retirement of Professor Lord Robert Mair. Professor Girolami is a fellow of Christ’s College, Cambridge.

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes.  For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Meetings are open to any interested researcher.

  • Flexible Perturbation Models for Robustness to Misspecification

    On December 4, 2019 at 4:00 pm till 5:00 pm
    Jeffrey Miller (Harvard University)
    E18-304

    Abstract:
    In many applications, there are natural statistical models with interpretable parameters that provide insight into questions of interest. While useful, these models are almost always wrong in the sense that they only approximate the true data generating process. In some cases, it is important to account for this model error when quantifying uncertainty in the parameters. We propose to model the distribution of the observed data as a perturbation of an idealized model of interest by using a nonparametric mixture model in which the base distribution is the idealized model. This provides robustness to small departures from the idealized model and, further, enables uncertainty quantification regarding the model error itself. Inference can easily be performed using existing methods for the idealized model in combination with standard methods for mixture models. Remarkably, inference can be even more computationally efficient than in the idealized model alone, because similar points are grouped into clusters that are treated as individual points from the idealized model. We demonstrate with simulations and an application to flow cytometry.
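
    As a heavily simplified, finite stand-in for this idea (the proposal above uses a nonparametric mixture whose base distribution is the idealized model; the two-component EM below is only an illustrative toy), one can treat departures from an idealized N(theta, 1) model as an extra, wider mixture component and estimate both the parameter of interest and the perturbation fraction:

    ```python
    # A toy, finite stand-in (not the nonparametric mixture proposed above): model
    # the data as (1 - eps) * N(theta, 1) + eps * N(theta, tau^2), where N(theta, 1)
    # is the idealized model and the wide second component absorbs departures from
    # it. Fit by EM, then read off theta and the estimated perturbation fraction.
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(2)
    y = np.concatenate([rng.normal(0.0, 1.0, 900),   # mostly from the idealized model
                        rng.normal(4.0, 1.0, 100)])  # plus a 10% departure

    tau = 3.0                                        # fixed width of the perturbation component
    theta, eps = np.mean(y), 0.1
    for _ in range(200):
        dens_ideal = norm.pdf(y, theta, 1.0)
        dens_pert = norm.pdf(y, theta, tau)
        r = eps * dens_pert / ((1 - eps) * dens_ideal + eps * dens_pert)   # E-step
        eps = r.mean()                                                     # M-step
        theta = np.sum((1 - r) * y + r * y / tau ** 2) / np.sum((1 - r) + r / tau ** 2)

    print(f"theta = {theta:.3f}, perturbation fraction eps = {eps:.3f}")
    print(f"plain sample mean (ignoring the perturbation) = {y.mean():.3f}")
    ```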

    For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. And the meetings are open to any interested researcher.   Talks will be followed by 30 minutes of tea/snacks and informal discussion.**

  • A Causal Exposure Response Function with Local Adjustment for Confounding: A study of the health effects of long-term exposure to low levels of fine particulate matter

    On November 20, 2019 at 4:00 pm till 5:00 pm
    Francesca Dominici (Harvard University)
    E18-304

    Abstract:

    In the last two decades, ambient levels of air pollution have declined substantially. Yet, as mandated by the Clean Air Act, we must continue to address the following question: is exposure to levels of air pollution that are well below the National Ambient Air Quality Standards (NAAQS) harmful to human health? Furthermore, the highly contentious nature surrounding environmental regulations necessitates casting this question within a causal inference framework. Several parametric and semi-parametric regression modeling approaches have been used to estimate the exposure-response (ER) curve relating long-term exposure to air pollution to various health outcomes. However, most of these approaches are not formulated in the context of a potential outcome framework for causal inference, adjust for the same set of potential confounders across all levels of exposure, and do not account for model uncertainty regarding covariate selection and the shape of the ER. In this paper, we introduce a Bayesian framework for the estimation of a causal ER curve called LERCA (Local Exposure Response Confounding Adjustment). LERCA allows for: a) different confounders and different strengths of confounding at different exposure levels; and b) model uncertainty regarding confounder selection and the shape of the ER. LERCA also provides a principled way of assessing the observed covariates’ confounding importance at different exposure levels, giving environmental researchers important information regarding the set of variables to measure and adjust for in regression models. Using simulation studies, we show that state-of-the-art approaches perform poorly in estimating the ER curve in the presence of local confounding. Lastly, LERCA is used on a large data set that includes health, weather, demographic, and pollution information for 5,362 zip codes and for the years 2011-2013.

    Dr. Francesca Dominici is Professor of Biostatistics at the Harvard T.H. Chan School of Public Health and Co-Director of the Data Science Initiative at Harvard University. She was recruited to the Harvard Chan School as a tenured Professor of Biostatistics in 2009. She was appointed Associate Dean of Information Technology in 2011 and Senior Associate Dean for Research in 2013.


    For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. And the meetings are open to any interested researcher.   Talks will be followed by 30 minutes of tea/snacks and informal discussion.**

  • Artificial Bayesian Monte Carlo Integration: A Practical Resolution to the Bayesian (Normalizing Constant) Paradox

    On November 13, 2019 at 4:00 pm till 5:00 pm
    Xiao-Li Meng (Harvard University)
    E18-304

    Advances in Markov chain Monte Carlo in the past 30 years have made Bayesian analysis a routine practice. However, there is virtually no practice of performing Monte Carlo integration from the Bayesian perspective; indeed, this problem has earned the “paradox” label in the context of computing normalizing constants (Wasserman, 2013). We first use the modeling-what-we-ignore idea of Kong et al. (2003) to explain that the crux of the paradox is not with the likelihood theory, which is essentially the same as for standard non-parametric probability/density estimation (Vardi, 1985), though, by using group theory, it provides a richer framework for modeling the trade-off between statistical efficiency and computational efficiency. But there is a real Bayesian paradox: Bayesian analysis cannot be applied exactly for solving Bayesian computation, because to perform the exact Bayesian Monte Carlo integration would require more computation than needed to solve the original Monte Carlo problem. We then show that there is a practical resolution to this paradox using the profile likelihood obtained in Kong et al. (2006) and that this approximation is second-order valid asymptotically. We also investigate a more computationally efficient approximation via an artificial likelihood of Geyer (1994). This artificial likelihood approach is only first-order valid, but there is a computationally trivial adjustment to recover its second-order validity. We demonstrate empirically the efficiency of these approximated Bayesian estimators, compared to the usual frequentist-based Monte Carlo estimators, such as bridge sampling estimators (Meng and Wong, 1996).

    [This is a joint work with Masatoshi Uehara.]
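
    For readers unfamiliar with the bridge sampling estimators cited below (Meng and Wong, 1996), here is a toy sketch of the iterative estimator of a ratio of normalizing constants; the two unnormalized densities are illustrative Gaussians, so the true ratio is known to be 0.5:

    ```python
    # A toy sketch of the iterative bridge sampling estimator of Meng and Wong (1996)
    # for the ratio of two normalizing constants. Both unnormalized densities are
    # Gaussian (illustrative choices), so the true ratio Z1/Z2 is known: 0.5.
    import numpy as np

    rng = np.random.default_rng(4)

    def q1(x):  # unnormalized N(0, 1): true normalizer sqrt(2*pi)
        return np.exp(-0.5 * x ** 2)

    def q2(x):  # unnormalized N(1, 2^2): true normalizer 2*sqrt(2*pi)
        return np.exp(-0.5 * (x - 1.0) ** 2 / 4.0)

    n1 = n2 = 5000
    x1 = rng.normal(0.0, 1.0, n1)            # draws from p1
    x2 = rng.normal(1.0, 2.0, n2)            # draws from p2
    l1 = q1(x1) / q2(x1)                     # density ratios evaluated on each sample
    l2 = q1(x2) / q2(x2)
    s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)

    r = 1.0                                  # initial guess for Z1 / Z2
    for _ in range(100):                     # fixed-point iteration for the optimal bridge
        num = np.mean(l2 / (s1 * l2 + s2 * r))
        den = np.mean(1.0 / (s1 * l1 + s2 * r))
        r = num / den

    print("bridge sampling estimate of Z1/Z2:", r, "(truth: 0.5)")
    ```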

    Wasserman, L. (2013) All of Statistics: A Concise Course in Statistical Inference.  Springer Science & Business Media. Also see https://normaldeviate.wordpress.com/2012/10/05/the-normalizing-constant-paradox/

    Kong, A., P. McCullagh, X.-L. Meng, D. Nicolae, and Z. Tan (2003). A theory of statistical models for Monte Carlo integration (with Discussions). J. R. Statist. Soc. B 65, 585-604. http://stat.harvard.edu/XLM/JRoyStatSoc/JRoyStatSocB65-3_585-618_2003.pdf

    Vardi, Y. (1985). Empirical distributions in selection bias models. Ann. Statist. 13 (1), 178-203.  https://projecteuclid.org/download/pdf_1/euclid.aos/1176346585

    Kong, A., P. McCullagh, X.-L. Meng, and D. Nicolae (2006). Further explorations of likelihood theory for Monte Carlo integration. In Advances in Statistical Modeling and Inference: Essays in Honor of Kjell A. Doksum (Ed: V. Nair), 563-592. World Scientific Press.  http://www.stat.harvard.edu/XLM/books/kmmn.pdf

    Geyer, C. J. (1994). Estimating normalizing constants and reweighting mixtures in Markov chain Monte Carlo. Technical Report 568, School of Statistics, University of Minnesota, Minneapolis. https://scholar.google.com/scholar?cluster=6307665497304333587&hl=en&as_sdt=0,22

    Meng, X.-L. and Wong, W.H. (1996). Simulating ratios of normalizing constants via a simple identity: A theoretical exploration. Statistica Sinica 6, 831-860. http://stat.harvard.edu/XLM/StatSin/StatSin6-4_831-860_1996.pdf

    Xiao-Li Meng, the Whipple V. N. Jones Professor of Statistics, and the Founding Editor-in-Chief of Harvard Data Science Review, is well known for his depth and breadth in research, his innovation and passion in pedagogy, his vision and effectiveness in administration, as well as for his engaging and entertaining style as a speaker and writer. Meng was named the best statistician under the age of 40 by COPSS (Committee of Presidents of Statistical Societies) in 2001, and he is the recipient of numerous awards and honors for his more than 150 publications in at least a dozen theoretical and methodological areas, as well as in areas of pedagogy and professional development. He has delivered more than 400 research presentations and public speeches on these topics, and he is the author of “The XL-Files,” a thought-provoking and entertaining column in the IMS (Institute of Mathematical Statistics) Bulletin. His interests range from the theoretical foundations of statistical inferences (e.g., the interplay among Bayesian, Fiducial, and frequentist perspectives; frameworks for multi-source, multi-phase and multi-resolution inferences) to statistical methods and computation (e.g., posterior predictive p-value; EM algorithm; Markov chain Monte Carlo; bridge and path sampling) to applications in natural, social, and medical sciences and engineering (e.g., complex statistical modeling in astronomy and astrophysics, assessing disparity in mental health services, and quantifying statistical information in genetic studies). Meng received his BS in mathematics from Fudan University in 1982 and his PhD in statistics from Harvard in 1990. He was on the faculty of the University of Chicago from 1991 to 2001 before returning to Harvard, where he served as the Chair of the Department of Statistics (2004-2012) and the Dean of the Graduate School of Arts and Sciences (2012-2017).

    For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. And the meetings are open to any interested researcher.   Talks will be followed by 30 minutes of tea/snacks and informal discussion.**

  • Using Bagged Posteriors for Robust Inference

    On October 30, 2019 at 4:00 pm till 5:00 pm
    Jonathan Huggins (Boston University)
    37-212

    IDS.190 – Topics in Bayesian Modeling and Computation

    **PLEASE NOTE ROOM CHANGE TO BUILDING 37-212 FOR THE WEEKS OF 10/30 AND 11/6**

    Speaker:  

    Jonathan Huggins (Boston University)

    Abstract:

    Standard Bayesian inference is known to be sensitive to misspecification between the model and the data-generating mechanism, leading to unreliable uncertainty quantification and poor predictive performance. However, finding generally applicable and computationally feasible methods for robust Bayesian inference under misspecification has proven to be a difficult challenge. An intriguing approach is to use bagging on the Bayesian posterior (“BayesBag”); that is, to use the average of posterior distributions conditioned on bootstrapped datasets. In this talk, I describe the statistical behavior of BayesBag, propose a model–data mismatch index for diagnosing model misspecification using BayesBag, and empirically validate our BayesBag methodology on synthetic and real-world data. We find that in the presence of significant misspecification, BayesBag yields more reproducible inferences, has better predictive accuracy, and selects correct models more often than the standard Bayesian posterior; meanwhile, when the model is correctly specified, BayesBag produces superior or equally good results for parameter inference and prediction, while being slightly more conservative for model selection. Overall, our results demonstrate that BayesBag combines the attractive modeling features of standard Bayesian inference with the distributional robustness properties of frequentist methods.
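
    The core construction is simple enough to sketch. The toy example below (illustrative, not one of the talk's experiments) bootstraps the dataset B times, computes the closed-form conjugate posterior for each bootstrapped dataset, and pools draws from those posteriors; under misspecification the bagged posterior is typically wider than the standard one:

    ```python
    # A minimal sketch of BayesBag on an illustrative misspecified model: the data
    # are heavy-tailed but modeled as N(mean, 1) with a N(0, 10^2) prior on the
    # mean, so each posterior is available in closed form.
    import numpy as np

    rng = np.random.default_rng(5)
    data = rng.standard_t(df=2, size=200) + 1.0      # heavy-tailed data centered at 1.0
    prior_var = 100.0

    def posterior_params(x):
        """Conjugate posterior for the mean under a N(mean, 1) likelihood and N(0, prior_var) prior."""
        prec = 1.0 / prior_var + len(x)
        return np.sum(x) / prec, 1.0 / prec

    # Standard Bayes: one posterior conditioned on the observed data.
    m_std, v_std = posterior_params(data)

    # BayesBag: pool draws from posteriors conditioned on bootstrapped datasets.
    B, draws_per_boot = 100, 200
    bagged = []
    for _ in range(B):
        boot = rng.choice(data, size=len(data), replace=True)
        m_b, v_b = posterior_params(boot)
        bagged.append(rng.normal(m_b, np.sqrt(v_b), size=draws_per_boot))
    bagged = np.concatenate(bagged)

    print(f"standard posterior: mean {m_std:.3f}, sd {np.sqrt(v_std):.3f}")
    print(f"BayesBag posterior: mean {bagged.mean():.3f}, sd {bagged.std():.3f}")
    ```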

    Bio:

    Jonathan Huggins will formally join the Mathematics & Statistics faculty of Boston University in January 2020 as an Assistant Professor, coming from Harvard University, where he has been a postdoctoral fellow in biostatistics.

    *Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes.  For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Meetings are open to any interested researcher.

  • Esther Williams in the Harold Holt Memorial Swimming Pool: Some Thoughts on Complexity

    On October 23, 2019 at 4:00 pm till 5:00 pm
    Daniel Simpson (University of Toronto)
    E18-304

    IDS.190 – Topics in Bayesian Modeling and Computation

    Speaker:

    Daniel Simpson (University of Toronto)

    Abstract:

    As data becomes more complex and computational modelling becomes more powerful, we rapidly find ourselves beyond the scope of traditional statistical theory. As we venture beyond the traditional thunderdome, we need to think about how to cope with this additional complexity in our model building.  In this talk, I will talk about a few techniques that are useful when specifying prior distributions and building Bayesian models for complex data.

    Bio:

    Daniel Simpson is an Assistant Professor at the University of Toronto’s Department of Statistical Sciences.

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes.  For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Meetings are open to any interested researcher.

  • Markov Chain Monte Carlo Methods and Some Attempts at Parallelizing Them

    On October 16, 2019 at 4:00 pm till 5:00 pm
    Pierre E. Jacob (Harvard University)
    E18-304

    IDS.190 – Topics in Bayesian Modeling and Computation

    MCMC methods yield approximations that converge to quantities of interest in the limit of the number of iterations. This iterative asymptotic justification is not ideal: it stands at odds with current trends in computing hardware. Namely, it would often be computationally preferable to run many short chains in parallel, but such an approach is flawed because of the so-called “burn-in” bias. This talk will first describe that issue and some known resolutions, including regeneration techniques and sequential Monte Carlo samplers. Then I will describe a recent proposal, joint work with John O’Leary, Yves Atchadé and others, that removes the burn-in bias entirely. In a nutshell, the proposed unbiased estimators are constructed from pairs of chains that are generated over a random, finite number of iterations. Furthermore, their variances and costs can be made arbitrarily close to those of standard MCMC estimators, if desired. The proposed method is described in https://arxiv.org/abs/1708.03625 and code in R is available to reproduce the experiments at https://github.com/pierrejacob/unbiasedmcmc.
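
    A compact sketch of the two-chain construction is given below, for a 1D standard normal target with random-walk Metropolis-Hastings and a rejection-based maximal coupling of the proposals. This is an illustrative reduction of the idea; the paper and the repository linked above contain the actual implementations, including the time-averaged estimators used in practice:

    ```python
    # An illustrative 1D sketch of unbiased MCMC with coupled chains: run a pair of
    # Metropolis-Hastings chains whose proposals are maximally coupled, record the
    # meeting time, and add a bias-correction term built from both chains.
    import numpy as np

    rng = np.random.default_rng(6)
    sigma = 1.0                                    # random-walk proposal standard deviation

    def log_pi(x):
        return -0.5 * x ** 2                       # standard normal target, unnormalized

    def normal_logpdf(x, mu):
        return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma) - 0.5 * np.log(2 * np.pi)

    def max_coupling(mu1, mu2):
        """Rejection-based maximal coupling of N(mu1, sigma^2) and N(mu2, sigma^2)."""
        x = rng.normal(mu1, sigma)
        if np.log(rng.uniform()) + normal_logpdf(x, mu1) <= normal_logpdf(x, mu2):
            return x, x                            # proposals coincide
        while True:
            y = rng.normal(mu2, sigma)
            if np.log(rng.uniform()) + normal_logpdf(y, mu2) > normal_logpdf(y, mu1):
                return x, y

    def coupled_mh_step(x, y):
        xp, yp = max_coupling(x, y)
        log_u = np.log(rng.uniform())              # a common uniform keeps met chains together
        x_new = xp if log_u < log_pi(xp) - log_pi(x) else x
        y_new = yp if log_u < log_pi(yp) - log_pi(y) else y
        return x_new, y_new

    def unbiased_estimate(h, k=20):
        """One draw of H_k = h(X_k) + sum_{t=k+1}^{tau-1} (h(X_t) - h(Y_{t-1}))."""
        x0 = rng.normal(3.0, 1.0)                  # deliberately poor initialization
        y0 = rng.normal(3.0, 1.0)
        xp = rng.normal(x0, sigma)                 # advance the X chain one step ahead of Y
        x1 = xp if np.log(rng.uniform()) < log_pi(xp) - log_pi(x0) else x0
        xs, ys = [x0, x1], [y0]
        tau, t = None, 1
        while tau is None or len(xs) <= k:         # run until the chains meet and X_k exists
            x_next, y_next = coupled_mh_step(xs[t], ys[t - 1])
            xs.append(x_next)
            ys.append(y_next)
            if tau is None and x_next == y_next:
                tau = t + 1                        # meeting time: first t with X_t == Y_{t-1}
            t += 1
        return h(xs[k]) + sum(h(xs[s]) - h(ys[s - 1]) for s in range(k + 1, tau))

    estimates = [unbiased_estimate(lambda z: z) for _ in range(500)]
    print("average of unbiased estimates of E[X] (truth 0):", np.mean(estimates))
    ```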

    Pierre E. Jacob is an Associate Professor of Statistics at Harvard University.  He develops methods for statistical inference, e.g. to run Monte Carlo methods on parallel computers, to compare models, to estimate latent variables, and to deal with intractable likelihood functions.

    For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. And the meetings are open to any interested researcher.   Talks will be followed by 30 minutes of tea/snacks and informal discussion.**

  • Probabilistic Programming and Artificial Intelligence

    On October 9, 2019 at 4:00 pm till 5:00 pm
    Vikash Mansinghka (MIT)
    E18-304

    IDS.190 – Topics in Bayesian Modeling and Computation

    Abstract:

    Probabilistic programming is an emerging field at the intersection of programming languages, probability theory, and artificial intelligence. This talk will show how to use recently developed probabilistic programming languages to build systems for robust 3D computer vision, without requiring any labeled training data; for automatic modeling of complex real-world time series; and for machine-assisted analysis of experimental data that is too small and/or messy for standard approaches from machine learning and statistics.

    This talk will use these applications to illustrate recent technical innovations in probabilistic programming that formalize and unify modeling approaches from multiple eras of AI, including generative models, neural networks, symbolic programs, causal Bayesian networks, and hierarchical Bayesian modeling. Specifically, it will present languages in which models are represented using executable code, and in which inference is programmable using novel constructs for Monte Carlo, optimization-based, and neural inference. It will also present techniques for Bayesian learning of probabilistic program structure and parameters from real-world data. Finally, this talk will review challenges and research opportunities in the development and use of general-purpose probabilistic programming languages that are performant enough and flexible enough for real-world AI engineering.
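
    For flavor only, the snippet below shows what it looks like to write a model as executable code and delegate inference to a generic engine, using PyMC, a widely used Python probabilistic programming language. PyMC is an illustrative stand-in; the talk concerns the MIT group's own languages and the richer programmable-inference constructs described above:

    ```python
    # An illustrative probabilistic program in PyMC (a stand-in, not the systems
    # discussed in the talk): the model is ordinary executable code, and a
    # general-purpose engine performs the inference.
    import numpy as np
    import pymc as pm

    rng = np.random.default_rng(7)
    data = rng.normal(loc=1.5, scale=2.0, size=100)    # toy observations

    with pm.Model():
        mu = pm.Normal("mu", mu=0.0, sigma=10.0)       # prior on the mean
        sigma = pm.HalfNormal("sigma", sigma=5.0)      # prior on the noise scale
        pm.Normal("obs", mu=mu, sigma=sigma, observed=data)
        idata = pm.sample(1000, tune=1000, chains=2, random_seed=7)

    print(idata.posterior["mu"].mean().item(), idata.posterior["sigma"].mean().item())
    ```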

    Biography:

    Vikash Mansinghka is a Principal Research Scientist at MIT, where he leads the MIT Probabilistic Computing Project. Vikash holds S.B. degrees in Mathematics and in Computer Science from MIT, as well as an M.Eng. in Computer Science and a PhD in Computation. He also held graduate fellowships from the National Science Foundation and MIT’s Lincoln Laboratory. His PhD dissertation on natively probabilistic computation won the MIT George M. Sprowls dissertation award in computer science, and his research on the Picture probabilistic programming language won an award at CVPR. He co-founded two VC-backed startups, Prior Knowledge (acquired by Salesforce in 2012) and Empirical Systems (acquired by Tableau in 2018), and has consulted on probabilistic programming for leading companies in the semiconductor, biopharma, IT services, and banking sectors. He served on DARPA’s Information Science and Technology advisory board from 2010 to 2012, currently serves on the editorial boards of the Journal of Machine Learning Research and the journal Statistics and Computing, and co-founded the International Conference on Probabilistic Programming.


    For more information and an up-to-date schedule, please see https://stellar.mit.edu/S/course/IDS/fa19/IDS.190/

    **Taking IDS.190 satisfies the seminar requirement for students in MIT’s Interdisciplinary Doctoral Program in Statistics (IDPS), but formal registration is open to any graduate student who can register for MIT classes. And the meetings are open to any interested researcher.   Talks will be followed by 30 minutes of tea/snacks and informal discussion.**
