Stochastics and Statistics Seminar

Probabilistic factorizations of big tables and networks

Speaker Name: David Dunson (Duke)

Date: March 17, 2017

Time: 11:00am

Location: E18-304

Abstract:

It is common to collect high-dimensional data that are structured as a multiway array or tensor; examples include multivariate categorical data that are organized as a contingency table, sequential data on nucleotides or animal vocalizations, and neuroscience data on brain networks. In each of these cases, there is interest in doing inference on the joint probability distribution of the data and on interpretable functionals of this probability distribution. The goal is to avoid restrictive parametric assumptions, enable both statistical and computational scaling to high-dimensional, low-sample-size cases, and maintain a (hopefully accurate) characterization of uncertainty. In this talk, the focus is on probabilistic factorizations and Bayesian inference algorithms relying on Markov chain Monte Carlo (MCMC) sampling. Novel classes of factorizations are proposed, practical and theoretical properties are discussed, scalable algorithms are developed, and a variety of applications are considered.

Speaker Bio:

David Dunson is Arts and Sciences Distinguished Professor of Statistical Science, Mathematics and ECE at Duke University. His research focuses on developing and applying innovative probabilistic modeling approaches and data science methods for high-dimensional and complex data, with a particular emphasis on neuroscience, genomics, ecology and other scientific applications. His work emphasizes Bayesian methods, latent structure learning, geometric data analysis, and computationally efficient sampling algorithms, with a focus on developing broad, practically useful new approaches for improving data analysis while maintaining theoretical guarantees. Dr. Dunson has won numerous awards, including the COPSS Presidents' Award, given annually to a single statistician internationally for outstanding contributions, and a gold medal from the Environmental Protection Agency for outstanding service in risk assessment. He is highly cited, with an h-index of 58 and over 30,000 total citations.