Computing partition functions by interpolation

Alexander Barvinok (University of Michigan)
E18-304

Abstract: Partition functions are multivariate polynomials with a great many monomials that enumerate combinatorial structures of a particular type, and their efficient computation (approximation) is of interest in combinatorics, statistics, physics and computational complexity. I’ll present a general principle: the partition function can be efficiently approximated in a domain if it has no complex zeros in a slightly larger domain. I’ll illustrate it on the examples of the permanent of a matrix, the independence polynomial of a graph and, time permitting, the graph homomorphism partition…
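
The interpolation idea can be seen on a tiny example: compute the independence polynomial of a small graph by brute force, take a low-order Taylor expansion of its logarithm at 0, and exponentiate. This is only a toy sketch of the principle, not Barvinok's actual algorithm (which controls the truncation error via the zero-free region); the 4-vertex path and the evaluation point x = 0.2 are arbitrary choices for illustration.

```python
import math
from itertools import combinations

def independence_poly(n, edges):
    """Coefficient a_k = number of independent sets of size k (brute force)."""
    edge_set = {frozenset(e) for e in edges}
    coeffs = [0] * (n + 1)
    for k in range(n + 1):
        for S in combinations(range(n), k):
            if all(frozenset(p) not in edge_set for p in combinations(S, 2)):
                coeffs[k] += 1
    return coeffs

def log_series(a, m):
    """First m Taylor coefficients of log I(x) from the coefficients a of I (a[0] == 1)."""
    c = [0.0] * (m + 1)
    for k in range(1, m + 1):
        c[k] = a[k] - sum((k - j) * c[k - j] * a[j] for j in range(1, k)) / k
    return c

# 4-vertex path 0-1-2-3: I(x) = 1 + 4x + 3x^2
a = independence_poly(4, [(0, 1), (1, 2), (2, 3)])
x = 0.2
exact = sum(ak * x**k for k, ak in enumerate(a))
c = log_series(a, 2)                       # log I(x) = 4x - 5x^2 + O(x^3)
approx = math.exp(sum(ck * x**k for k, ck in enumerate(c)))
print(a, exact, round(approx, 3))
```

Already at degree 2 the truncated expansion of log I recovers I(0.2) to within a few percent; absence of zeros near the evaluation point is what makes such low-order truncations accurate in general.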

Robust Statistics, Revisited

Ankur Moitra (MIT)

Abstract: Starting from the seminal works of Tukey (1960) and Huber (1964), the field of robust statistics asks: are there estimators that provably work in the presence of noise? The trouble is that all known provably robust estimators are also hard to compute in high dimensions. Here, we study a basic problem in robust statistics, posed in various forms in the above works. Given corrupted samples from a high-dimensional Gaussian, are there efficient algorithms to accurately estimate its parameters? We give the…
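
As a one-dimensional warm-up to the robustness question (not the high-dimensional algorithms of the talk), the following sketch shows the classical gap: with a 10% fraction of gross outliers, the empirical mean is dragged far from the true mean while the median barely moves. The sample size, corruption level, and outlier value are arbitrary choices.

```python
import random
import statistics

random.seed(0)

# n clean samples from N(0, 1), then an eps-fraction replaced by an adversary.
n, eps = 10_000, 0.1
data = [random.gauss(0.0, 1.0) for _ in range(n)]
k = int(eps * n)
data[:k] = [100.0] * k                    # gross outliers, all placed at 100

sample_mean = sum(data) / n               # dragged far from the true mean 0
sample_median = statistics.median(data)   # stays close to 0
print(sample_mean, sample_median)
```

The median fixes dimension one, but naive coordinate-wise extensions lose accuracy in high dimensions, which is exactly the tension between statistical and computational efficiency the abstract refers to.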

Probabilistic factorizations of big tables and networks

David Dunson (Duke)

Abstract: It is common to collect high-dimensional data that are structured as a multiway array or tensor; examples include multivariate categorical data that are organized as a contingency table, sequential data on nucleotides or animal vocalizations, and neuroscience data on brain networks. In each of these cases, there is interest in doing inference on the joint probability distribution of the data and on interpretable functionals of this probability distribution. The goal is to avoid restrictive parametric assumptions, enable both statistical and…
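
A minimal sketch of the kind of probabilistic factorization involved, assuming the simplest latent class model for a two-way contingency table: P(i, j) = Σ_h π_h A_h(i) B_h(j), fit by EM. This is only a toy illustration (the talk concerns far more general Bayesian tensor factorizations); the table, the number of latent classes, and the iteration count are arbitrary.

```python
import math
import random

random.seed(1)

# Toy 3x3 contingency table of counts for two categorical variables.
table = [[30, 5, 5],
         [5, 30, 5],
         [5, 5, 30]]
I, J, H = 3, 3, 2      # H latent classes: P(i, j) = sum_h pi[h] * A[h][i] * B[h][j]

def normalize(v):
    s = sum(v)
    return [x / s for x in v]

pi = [1.0 / H] * H
A = [normalize([random.random() for _ in range(I)]) for _ in range(H)]
B = [normalize([random.random() for _ in range(J)]) for _ in range(H)]

def loglik():
    return sum(table[i][j] * math.log(sum(pi[h] * A[h][i] * B[h][j] for h in range(H)))
               for i in range(I) for j in range(J))

prev = loglik()
for _ in range(100):                       # EM iterations
    new_pi = [0.0] * H
    new_A = [[0.0] * I for _ in range(H)]
    new_B = [[0.0] * J for _ in range(H)]
    for i in range(I):
        for j in range(J):
            # E-step: responsibility of each latent class for cell (i, j)
            r = normalize([pi[h] * A[h][i] * B[h][j] for h in range(H)])
            for h in range(H):
                m = table[i][j] * r[h]
                new_pi[h] += m
                new_A[h][i] += m
                new_B[h][j] += m
    # M-step: renormalize the responsibility-weighted counts
    pi = normalize(new_pi)
    A = [normalize(a) for a in new_A]
    B = [normalize(b) for b in new_B]
cur = loglik()
print(round(prev, 2), round(cur, 2))
```

The factorization yields a valid joint probability distribution at every step, and functionals of it (marginals, dependence measures) come for free, which is the sense in which the factorization is "probabilistic" rather than purely algebraic.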

Jagers-Nerman stable age distribution theory, change point detection and power of two choices in evolving networks

Shankar Bhamidi (UNC)
E18-304

Abstract: (i) Change point detection for networks: We consider the preferential attachment model. We formulate and study the regime where the network transitions from one evolutionary scheme to another. In the large network limit we derive asymptotics for various functionals of the network including degree distribution and maximal degree. We study functional central limit theorems for the evolution of the degree distribution which feed into proving consistency of a proposed estimator of the change point. (ii) Power of choice and network…
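
A quick simulation of the preferential attachment model referenced above, assuming its simplest tree version: each arriving vertex attaches to one existing vertex chosen with probability proportional to its current degree. It illustrates the hub formation (maximal degree far above the average degree of ≈ 2) that the asymptotics in the talk make precise; the network size is an arbitrary choice.

```python
import random

random.seed(0)

def preferential_attachment(n):
    """Grow a tree on n vertices: each new vertex attaches to an existing
    vertex chosen with probability proportional to its current degree."""
    endpoints = [0, 1]               # multiset of edge endpoints; start with edge 0-1
    degree = {0: 1, 1: 1}
    for v in range(2, n):
        u = random.choice(endpoints)  # uniform over endpoints = degree-proportional
        degree[u] += 1
        degree[v] = 1
        endpoints += [u, v]
    return degree

deg = preferential_attachment(5000)
max_deg = max(deg.values())
avg_deg = sum(deg.values()) / len(deg)
print(max_deg, round(avg_deg, 3))
```

A change point in this setting would correspond to switching the attachment rule partway through the growth; detecting that switch from a single snapshot of the final network is the estimation problem described in part (i).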

Sample-optimal inference, computational thresholds, and the methods of moments

David Steurer (Cornell)

Abstract: We propose an efficient meta-algorithm for Bayesian inference problems based on low-degree polynomials, semidefinite programming, and tensor decomposition. The algorithm is inspired by recent lower bound constructions for sum-of-squares and related to the method of moments. Our focus is on sample complexity bounds that are as tight as possible (up to additive lower-order terms) and often achieve statistical thresholds or conjectured computational thresholds. Our algorithm recovers the best known bounds for partial recovery in the stochastic block model, a…

Active learning with seed examples and search queries

Daniel Hsu (Columbia)

Abstract: Active learning is a framework for supervised learning that explicitly models, and permits one to control and optimize, the costs of labeling data. The hope is that by carefully selecting which examples to label in an adaptive manner, the number of labels required to learn an accurate classifier is substantially reduced. However, in many learning settings (e.g., when some classes are rare), it is difficult to identify which examples are most informative to label, and existing active learning algorithms are prone to labeling uninformative examples. Based…
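
The label savings that active learning aims for can be seen in the classic one-dimensional toy case, assuming noiseless labels from a threshold classifier on sorted data: binary search recovers the threshold with O(log n) label queries instead of labeling all n points. The talk's setting (seed examples, search queries, rare classes) is much richer; `true_threshold` below is a hypothetical ground truth for the sketch.

```python
def learn_threshold(xs, label):
    """Find the smallest x in sorted xs with label 1 by binary search,
    counting how many label queries are spent."""
    lo, hi = 0, len(xs) - 1
    queries = 0
    while lo < hi:
        mid = (lo + hi) // 2
        queries += 1
        if label(xs[mid]) == 0:
            lo = mid + 1          # threshold lies to the right
        else:
            hi = mid              # threshold is at mid or to the left
    return xs[lo], queries

xs = list(range(1000))
true_threshold = 371              # hypothetical ground truth
found, q = learn_threshold(xs, lambda x: int(x >= true_threshold))
print(found, q)
```

With label noise or rare positive classes this adaptive scheme breaks down, which is precisely where the seed examples and search queries of the title come in.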

SDSCon 2017 – Statistics and Data Science Center Conference

As part of the MIT Institute for Data, Systems, and Society (IDSS), the Statistics and Data Science Center (SDSC) is an MIT-wide focal point for advancing academic programs and research activities in statistics and data science. SDSCon will be a celebration and community-building event for those interested in statistics. Discussions will cover applications of statistics and data science across a wide range of fields and approaches.

Testing properties of distributions over big domains

Ronitt Rubinfeld (MIT)

Abstract: We describe an emerging research direction regarding the complexity of testing global properties of discrete distributions, when given access to only a few samples from the distribution. Such properties might include testing if two distributions have small statistical distance, testing various independence properties, testing whether a distribution has a specific shape (such as monotone decreasing, k-modal, k-histogram, monotone hazard rate,...), and approximating the entropy. We describe bounds for such testing problems whose sample complexities are sublinear in the size…
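
One concrete instance of such a tester is the classical collision-based statistic, sketched below assuming i.i.d. samples: the fraction of colliding sample pairs estimates Σ_x p(x)², which is minimized (at 1/n) by the uniform distribution, so a skewed distribution is detectable from far fewer samples than the domain size. The domain size, sample count, and skew used here are arbitrary illustration choices, not the optimal sample complexities from the literature.

```python
import random
from collections import Counter

random.seed(0)

def collision_rate(samples):
    """Fraction of ordered sample pairs that collide; an unbiased
    estimate of sum_x p(x)^2, minimized by the uniform distribution."""
    m = len(samples)
    coll = sum(c * (c - 1) for c in Counter(samples).values())
    return coll / (m * (m - 1))

n, m = 100_000, 5_000                 # domain size n, only m << n samples
uniform = [random.randrange(n) for _ in range(m)]
skewed = [random.randrange(n // 10) for _ in range(m)]   # 10% of the domain

r_u, r_s = collision_rate(uniform), collision_rate(skewed)
print(r_u, r_s)
```

With 5,000 samples from a domain of size 100,000, the skewed distribution produces roughly ten times the collision rate of the uniform one, so the two are distinguished with a sublinear number of samples.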

Some related phase transitions in phylogenetics and social network analysis

Sebastien Roch (Wisconsin)

Abstract: Spin systems on trees have found applications ranging from the reconstruction of phylogenies to the analysis of networks with community structure. A key feature of such processes is the interplay between the growth of the tree and the decay of correlations along it. How the resulting threshold phenomena impact estimation depends on the problem considered. I will illustrate this on two recent results: 1) the critical threshold of ancestral sequence reconstruction by maximum parsimony on general phylogenies and 2)…

MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764