Shotgun Assembly of Graphs

Elchanan Mossel (MIT)
E18-304

We will present some results and some open problems related to shotgun assembly of graphs for random generating models. Shotgun assembly of graphs is the problem of recovering a random graph or a randomly labelled graph from small pieces. This problem generalizes the theoretically elegant and practically important problem of shotgun assembly of DNA sequences. The general problem of shotgun assembly presents novel problems in random graphs, percolation, and random constraint satisfaction problems. Based on joint works with Nathan Ross, with…

Sparse PCA via covariance thresholding

Yash Deshpande (Microsoft Research)
E18-304

Abstract: In sparse principal components analysis (PCA), the task is to infer a sparse, low-rank matrix from noisy observations. Johnstone and Lu proposed the popular “spiked covariance” model, wherein the population distribution is equivariant with the exception of a single direction, called the spike. Assuming that the spike direction is sparse in some basis, they also proposed a simple scheme to estimate its support based on the diagonal entries of the sample covariance. Indeed, later information-theoretic analysis demonstrated that the…

Non-classical Berry-Esseen inequality and accuracy of the weighted bootstrap

Mayya Zhilova (Georgia Tech)
E18-304

Abstract: In this talk, we will study higher-order accuracy of the weighted bootstrap procedure for estimation of a distribution of a sum of independent random vectors with bounded fourth moments, on the set of all Euclidean balls. Our approach is based on a Berry-Esseen type inequality which extends the classical normal approximation bound. These results justify, in a non-asymptotic setting, that the weighted bootstrap can outperform Gaussian (or chi-squared) approximation in accuracy w.r.t. dimension and sample size. In addition, the presented results lead…

Slope meets Lasso in sparse linear regression

Pierre Bellec (Rutgers)
E18-304

Abstract: We will present results in sparse linear regression on two convex regularized estimators, the Lasso and the recently introduced Slope estimator, in the high-dimensional setting where the number of covariates p is larger than the number of observations n. The estimation and prediction performance of these estimators will be presented, as well as a comparative study of the assumptions on the design matrix.  https://arxiv.org/pdf/1605.08651.pdf Biography: I am an Assistant Professor of statistics at Rutgers, the State University of New Jersey. I obtained my PhD…

Causal Discovery in Systems with Feedback Cycles

Frederick Eberhardt (Caltech)
E18-304

Abstract: While causal relations are generally considered to be anti-symmetric, we often find that over time there are feedback systems such that a variable can have a causal effect on itself. Such "cyclic" causal systems pose significant challenges for causal analysis, both in terms of the appropriate representation of the system under investigation, and for the development of algorithms that attempt to infer as much as possible about the underlying causal system from statistical data. This talk will aim to provide some theoretical insights about…

Estimating the number of connected components of large graphs based on subgraph sampling

Yihong Wu (Yale)
E18-304

Abstract: Learning properties of large graphs from samples is an important problem in statistical network analysis, dating back to the early work of Goodman and Frank. We revisit the problem formulated by Frank (1978) of estimating the number of connected components in a graph of N vertices based on the subgraph sampling model, where we observe the subgraph induced by n vertices drawn uniformly at random. The key question is whether it is possible to achieve accurate estimation, i.e., vanishing normalized mean-square error,…

Computing partition functions by interpolation

Alexander Barvinok (University of Michigan)
E18-304

Abstract: Partition functions are just multivariate polynomials with a great many monomials enumerating combinatorial structures of a particular type, and their efficient computation (approximation) is of interest for combinatorics, statistics, physics and computational complexity. I’ll present a general principle: the partition function can be efficiently approximated in a domain if it has no complex zeros in a slightly larger domain, and illustrate it on the examples of the permanent of a matrix, the independence polynomial of a graph and, time permitting, the graph homomorphism partition…

Robust Statistics, Revisited

Ankur Moitra (MIT)

Starting from the seminal works of Tukey (1960) and Huber (1964), the field of robust statistics asks: Are there estimators that provably work in the presence of noise? The trouble is that all known provably robust estimators are also hard to compute in high dimensions. Here, we study a basic problem in robust statistics, posed in various forms in the above works. Given corrupted samples from a high-dimensional Gaussian, are there efficient algorithms to accurately estimate its parameters? We give the…

Probabilistic factorizations of big tables and networks

David Dunson (Duke)

Abstract: It is common to collect high-dimensional data that are structured as a multiway array or tensor; examples include multivariate categorical data that are organized as a contingency table, sequential data on nucleotides or animal vocalizations, and neuroscience data on brain networks. In each of these cases, there is interest in doing inference on the joint probability distribution of the data and on interpretable functionals of this probability distribution. The goal is to avoid restrictive parametric assumptions, enable both statistical and…

Jagers-Nerman stable age distribution theory, change point detection and power of two choices in evolving networks

Shankar Bhamidi (UNC)
E18-304

Abstract: (i) Change point detection for networks: We consider the preferential attachment model. We formulate and study the regime where the network transitions from one evolutionary scheme to another. In the large network limit we derive asymptotics for various functionals of the network including degree distribution and maximal degree. We study functional central limit theorems for the evolution of the degree distribution which feed into proving consistency of a proposed estimator of the change point. (ii) Power of choice and network…


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764