On the statistical cost of score matching

Andrej Risteski, Carnegie Mellon University
E18-304

Abstract: Energy-based models are a recent class of probabilistic generative models wherein the distribution being learned is parametrized up to a constant of proportionality (i.e. a partition function). Fitting such models using maximum likelihood (i.e. finding the parameters which maximize the probability of the observed data) is computationally challenging, as evaluating the partition function involves a high dimensional integral. Thus, newer incarnations of this paradigm instead train other losses which obviate the need to evaluate partition functions. Prominent examples include score matching (in which we fit…

Spectral pseudorandomness and the clique number of the Paley graph

Dmitriy (Tim) Kunisky, Yale University
E18-304

Abstract: The Paley graph is a classical number-theoretic construction of a graph that is believed to behave "pseudorandomly" in many regards. Accurately bounding the clique number of the Paley graph is a long-standing open problem in number theory, with applications to several other questions about the statistics of finite fields. I will present recent results studying the application of convex optimization and spectral graph theory to this problem, which involve understanding the extent to which the Paley graph is "spectrally…

WiDS Cambridge 2023

Microsoft NERD Center

WiDS Cambridge is a hybrid one-day technical conference that will feature an all-female lineup of speakers from academia and industry, who will talk about the latest data science research in a number of domains.

Spectral Independence: A New Tool to Analyze Markov Chains

Kuikui Liu, University of Washington
E18-304

Abstract: Sampling from high-dimensional probability distributions is a fundamental and challenging problem encountered throughout science and engineering. One of the most popular approaches to tackle such problems is the Markov chain Monte Carlo (MCMC) paradigm. While MCMC algorithms are often simple to implement and widely used in practice, analyzing the rate of convergence to stationarity, i.e. the "mixing time", remains a challenging problem in many settings. I will describe a new technique based on pairwise correlations called "spectral independence", which has been…

Geometric EDA for Random Objects

Paromita Dubey, University of Southern California
E18-304

Abstract: In this talk I will propose new tools for the exploratory data analysis of data objects taking values in a general separable metric space. First, I will introduce depth profiles, where the depth profile of a point ω in the metric space refers to the distribution of the distances between ω and the data objects. I will describe how depth profiles can be harnessed to define transport ranks, which capture the centrality of each element in the metric space with respect to the…

Variational methods in reinforcement learning

Martin Wainwright, MIT
E18-304

Abstract: Reinforcement learning is the study of models and procedures for optimal sequential decision-making under uncertainty. At its heart lies the Bellman optimality operator, whose unique fixed point specifies an optimal policy and value function. In this talk, we discuss two classes of variational methods that can be used to obtain approximate solutions with accompanying error guarantees. For policy evaluation problems based on on-line data, we present Krylov-Bellman boosting, which combines ideas from Krylov methods with non-parametric boosting. For policy optimization problems based on…

James-Stein for eigenvectors: reducing the optimization bias in Markowitz portfolios

Lisa Goldberg, UC Berkeley

Abstract: We identify and reduce bias in the leading sample eigenvector of a high-dimensional covariance matrix of correlated variables. Our analysis illuminates how error in an estimated covariance matrix corrupts optimization. It may be applicable in finance, machine learning and genomics.

Biography: Lisa Goldberg is Head of Research at Aperio and Managing Director at BlackRock. She is Professor of the Practice of Economics at University of California, Berkeley, where she co-directs the Center for Data Analysis in Risk, an industry…

Free Discontinuity Design (joint w/ David van Dijcke)

Florian Gunsilius, University of Michigan
E18-304

Abstract: Regression discontinuity design (RDD) is a quasi-experimental impact evaluation method ubiquitous in the social and applied health sciences. It aims to estimate average treatment effects of policy interventions by exploiting jumps in outcomes induced by cut-off assignment rules. Here, we establish a correspondence between the RDD setting and free discontinuity problems, in particular the celebrated Mumford-Shah model in image segmentation. The Mumford-Shah model is non-convex and hence admits local solutions in general. We circumvent this issue by relying on…


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764