Genome-wide association, phenotype prediction, and population structure: a review and some open problems

Alex Bloemendal (Broad Institute)
E18-304

Abstract: I will give a broad overview of human genetic variation, polygenic traits, association studies, heritability estimation and risk prediction. I will focus on the dual correlation structures of linkage disequilibrium and population structure, discussing how these both confound and enable the various analyses we perform. I will highlight an important open problem on the failure of polygenic risk prediction to generalize across diverse ancestries.

Biography: Alex Bloemendal is a computational scientist at the Broad Institute of MIT and Harvard…

Connections between structured estimation and weak submodularity

Sahand Negahban (Yale University)
E18-304

Abstract: Many modern statistical estimation problems rely on imposing additional structure in order to reduce the statistical complexity and provide interpretability. Unfortunately, these structures are often combinatorial in nature and result in computationally challenging problems. In parallel, the combinatorial optimization community has invested significant effort in developing algorithms that can approximately solve such optimization problems in a computationally efficient manner. The focus of this talk is to expand upon ideas that arise in combinatorial optimization and connect those algorithms and…

Variable selection using presence-only data with applications to biochemistry

Garvesh Raskutti (University of Wisconsin)
E18-304

Abstract: In a number of problems, we are presented with positive and unlabelled data, referred to as presence-only responses. The application I present today involves studying the relationship between protein sequence and function; presence-only data arise because for many experiments it is impossible to obtain a large set of negative (non-functional) sequences. Furthermore, if the number of variables is large and the goal is variable selection (as in this case), a number of statistical and computational challenges arise due…

User-friendly guarantees for the Langevin Monte Carlo

Arnak Dalalyan (ENSAE-CREST)
E18-304

Abstract: In this talk, I will revisit the recently established theoretical guarantees for the convergence of the Langevin Monte Carlo algorithm for sampling from a smooth and (strongly) log-concave density. I will discuss the existing results when the accuracy of sampling is measured in the Wasserstein distance and provide further insights on the relations between the Langevin Monte Carlo for sampling, on the one hand, and gradient descent for optimization, on the other. I will also present non-asymptotic guarantees for the accuracy…

Optimization’s Implicit Gift to Learning: Understanding Optimization Bias as a Key to Generalization

Nathan Srebro-Bartom (TTI-Chicago)
E18-304

Abstract: It is becoming increasingly clear that the implicit regularization afforded by optimization algorithms plays a central role in machine learning, especially when using large, deep neural networks. We have a good understanding of the implicit regularization afforded by stochastic approximation algorithms, such as SGD, and as I will review, we understand and can characterize the implicit bias of different algorithms, and can design algorithms with specific biases. But in this talk I will focus on implicit biases of…

One and two sided composite-composite tests in Gaussian mixture models

Alexandra Carpentier (Otto von Guericke Universitaet)
E18-304

Abstract: Finding an efficient test for a testing problem is often linked to the problem of estimating a given function of the data. When this function is not smooth, it is necessary to approximate it cleverly in order to build good tests. In this talk, we will discuss two specific testing problems in Gaussian mixture models. In both, the aim is to test the proportion of null means. The aforementioned link between sharp approximation rates of non-smooth objects and minimax testing…

Statistical estimation under group actions: The Sample Complexity of Multi-Reference Alignment

Afonso Bandeira (NYU)
E18-304

Abstract: Many problems in signal/image processing and computer vision amount to estimating a signal, image, or three-dimensional structure/scene from corrupted measurements. Particularly challenging forms of measurement corruption are latent transformations of the underlying signal to be recovered. Many such transformations can be described as a group acting on the object to be recovered. Examples include the Simultaneous Localization and Mapping (SLaM) problem in Robotics and Computer Vision, where pictures of a scene are obtained from different positions and orientations;…

When Inference is tractable

David Sontag (MIT)
E18-304

Abstract: A key capability of artificial intelligence will be the ability to reason about abstract concepts and draw inferences. Where data is limited, probabilistic inference in graphical models provides a powerful framework for performing such reasoning, and can even be used as modules within deep architectures. But when is probabilistic inference computationally tractable? I will present recent theoretical results that substantially broaden the class of provably tractable models by exploiting model stability (Lang, Sontag, Vijayaraghavan, AISTATS ’18), structure in…

Statistical theory for deep neural networks with ReLU activation function

Johannes Schmidt-Hieber (Leiden)
E18-304

Abstract: The universal approximation theorem states that neural networks are capable of approximating any continuous function up to a small error that depends on the size of the network. The expressive power of a network does not, however, guarantee that deep networks perform well on data. For that, control of the statistical estimation risk is needed. In the talk, we derive statistical theory for fitting deep neural networks to data generated from the multivariate nonparametric regression model. It is shown…


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764