online

February 2018

Data Science and Big Data Analytics: Making Data-Driven Decisions

February 5, 2018
online

The seven-week course launches February 5, 2018. This course was developed by over ten MIT faculty members at IDSS. It is specially designed for data scientists, business analysts, engineers, and technical managers looking to learn the latest theories and strategies to harness data.

September 2019

Data Science and Big Data Analytics: Making Data-Driven Decisions

September 30, 2019
online

The seven-week course launches September 30, 2019. This course was developed by over ten MIT faculty members at IDSS. It is specially designed for data scientists, business analysts, engineers, and technical managers looking to learn the latest theories and strategies to harness data.

April 2020

Matrix Concentration for Products

Jonathan Niles-Weed (New York University)

April 10 @ 11:00 am - 12:00 pm
online

Abstract: We develop nonasymptotic concentration bounds for products of independent random matrices. Such products arise in the study of stochastic algorithms, linear dynamical systems, and random walks on groups. Our bounds exactly match those available for scalar random variables and continue the program, initiated by Ahlswede-Winter and Tropp, of extending familiar concentration bounds to the noncommutative setting. Our proof technique relies on geometric properties of the Schatten trace class. Joint work with D. Huang, J. A. Tropp, and R. Ward.…

On Using Graph Distances to Estimate Euclidean and Related Distances

Ery Arias-Castro (University of California, San Diego)

April 17 @ 11:00 am - 12:00 pm
online

Abstract: Graph distances have proven quite useful in machine learning and statistics, particularly in the estimation of Euclidean or geodesic distances. The talk will include a partial review of the literature, and then present more recent developments on the estimation of curvature-constrained distances on a surface, as well as on the estimation of Euclidean distances based on an unweighted and noisy neighborhood graph.

About the Speaker: Ery Arias-Castro received his Ph.D. in Statistics from Stanford University in 2004. He then took…

How to Trap a Gradient Flow

Sébastien Bubeck (Microsoft Research)

April 24 @ 11:00 am - 12:00 pm
online

Abstract: In 1993, Stephen A. Vavasis proved that in any finite dimension, there exists a faster method than gradient descent to find stationary points of smooth non-convex functions. In dimension 2 he proved that 1/eps gradient queries are enough, and that 1/sqrt(eps) queries are necessary. We close this gap by providing an algorithm based on a new local-to-global phenomenon for smooth non-convex functions. Some higher dimensional results will also be discussed. I will also present an extension of the 1/sqrt(eps)…

May 2020

Naive Feature Selection: Sparsity in Naive Bayes

Alexandre d'Aspremont (ENS, CNRS)

May 1 @ 11:00 am - 12:00 pm
online

Abstract: Due to its linear complexity, naive Bayes classification remains an attractive supervised learning method, especially in very large-scale settings. We propose a sparse version of naive Bayes, which can be used for feature selection. This leads to a combinatorial maximum-likelihood problem, for which we provide an exact solution in the case of binary data, or a bound in the multinomial case. We prove that our bound becomes tight as the marginal contribution of additional features decreases. Both binary and…

Data Science and Big Data Analytics: Making Data-Driven Decisions

May 4
online

Developed by 11 MIT faculty members at IDSS, this seven-week course is specially designed for data scientists, business analysts, engineers and technical managers looking to learn strategies to harness data. Offered by MIT xPRO. Course begins May 4, 2020.

The Ethical Algorithm

Michael Kearns (University of Pennsylvania)

May 19 @ 4:00 pm - 5:00 pm
online

Abstract: Many recent mainstream media articles and popular books have raised alarms over anti-social algorithmic behavior, especially regarding machine learning and artificial intelligence. The concerns include leaks of sensitive personal data by predictive models, algorithmic discrimination as a side-effect of machine learning, and inscrutable decisions made by complex models. While standard and legitimate responses to these phenomena include calls for stronger and better laws and regulations, researchers in machine learning, statistics and related areas are also…

August 2020

SES & IDPS Dissertation Defense – Rui Sun

Rui Sun

August 19 @ 1:00 pm - 3:00 pm
online

Title: Online Learning and Optimization in Operations Management

Abstract: In this thesis, we study online learning and optimization problems in operations management, where decisions must be made under incomplete information and operational constraints in a dynamic environment. We first consider an online matching problem in which a central platform needs to match a number of limited resources to different groups of users that arrive sequentially over time. The platform does not know the reward of each matching option…

September 2020

Stein’s method for multivariate continuous distributions and applications

Gesine Reinert, University of Oxford

September 11 @ 11:00 am - 12:00 pm
online

Abstract: Stein’s method is a key method for assessing distributional distance, mainly for one-dimensional distributions. In this talk we provide a general approach to Stein’s method for multivariate continuous distributions. Among the applications we consider is the Wasserstein distance between two continuous probability distributions under the assumption of existence of a Poincaré constant. This is joint work with Guillaume Mijoule (INRIA Paris) and Yvik Swan (Liege).

Bio: Gesine Reinert is a Research Professor of the Department of Statistics and…

Causal Inference and Overparameterized Autoencoders in the Light of Drug Repurposing for SARS-CoV-2

Caroline Uhler, MIT

September 18 @ 11:00 am - 12:00 pm
online

Abstract: Massive data collection holds the promise of a better understanding of complex phenomena and ultimately, of better decisions. An exciting opportunity in this regard stems from the growing availability of perturbation/intervention data (drugs, knockouts, overexpression, etc.) in biology. In order to obtain mechanistic insights from such data, a major challenge is the development of a framework that integrates observational and interventional data and allows predicting the effect of yet unseen interventions or transporting the effect of interventions…

Separating Estimation from Decision Making in Contextual Bandits

Dylan Foster, MIT

September 25 @ 11:00 am - 12:00 pm
online

Abstract: The contextual bandit is a sequential decision making problem in which a learner repeatedly selects an action (e.g., a news article to display) in response to a context (e.g., a user’s profile) and receives a reward, but only for the action they selected. Beyond the classic explore-exploit tradeoff, a fundamental challenge in contextual bandits is to develop algorithms that can leverage flexible function approximation to model similarity between contexts, yet have computational requirements comparable to classical supervised learning tasks…

October 2020

Bayesian inverse problems, Gaussian processes, and partial differential equations

Richard Nickl - University of Cambridge

October 2 @ 11:00 am - 12:00 pm
online

Abstract: The Bayesian approach to inverse problems has become very popular in the last decade after seminal work by Andrew Stuart (2010) and collaborators. Particularly in non-linear applications with PDEs and when using Gaussian process priors, this can leverage powerful MCMC methodology to tackle difficult high-dimensional and non-convex inference problems. Little is known in terms of rigorous performance guarantees for such algorithms. After laying out the main ideas behind Bayesian inversion, we will discuss recent progress providing both statistical and…

On Estimating the Mean of a Random Vector

Gábor Lugosi, Pompeu Fabra University

October 9 @ 11:00 am - 12:00 pm
online

Abstract: One of the most basic problems in statistics is the estimation of the mean of a random vector, based on independent observations. This problem has received renewed attention in the last few years, both from statistical and computational points of view. In this talk we review some recent results on the statistical performance of mean estimators that allow heavy tails and adversarial contamination in the data. The basic punchline is that one can construct estimators that, under minimal conditions,…

Data driven variational models for solving inverse problems

Carola-Bibiane Schönlieb - University of Cambridge

October 16 @ 11:00 am - 12:00 pm
online

Abstract: In this talk we discuss the idea of data-driven regularisers for inverse imaging problems. We are in particular interested in the combination of mathematical models and purely data-driven approaches, getting the best from both worlds. In this context we will make a journey from “shallow” learning for computing optimal parameters for variational regularisation models by bilevel optimization to the investigation of different approaches that use deep neural networks for solving inverse imaging problems.

Bio: Carola-Bibiane Schönlieb is Professor of…

Statistical Aspects of Wasserstein Distributionally Robust Optimization Estimators

Jose Blanchet - Stanford University

October 23 @ 11:00 am - 12:00 pm
online

Abstract: Wasserstein-based distributionally robust optimization problems are formulated as min-max games in which a statistician chooses a parameter to minimize an expected loss against an adversary (say, nature) that wishes to maximize the loss by choosing an appropriate probability model within a certain non-parametric class. Recently, these formulations have been studied in the setting in which the non-parametric class chosen by nature is defined as a Wasserstein-distance neighborhood around the empirical measure. It turns out that by appropriately choosing the…

November 2020

Valid hypothesis testing after hierarchical clustering

Daniela Witten - University of Washington

November 6 @ 11:00 am - 12:00 pm
online

Abstract: As datasets continue to grow in size, in many settings the focus of data collection has shifted away from testing pre-specified hypotheses, and towards hypothesis generation. Researchers are often interested in performing an exploratory data analysis in order to generate hypotheses, and then testing those hypotheses on the same data; I will refer to this as 'double dipping'. Unfortunately, double dipping can lead to highly inflated Type I errors. In this talk, I will consider the special case of hierarchical…

Sharp Thresholds for Random Subspaces, and Applications

Mary Wootters - Stanford University

November 13 @ 11:00 am - 12:00 pm
online

Abstract: What combinatorial properties are likely to be satisfied by a random subspace over a finite field? For example, is it likely that not too many points lie in any Hamming ball? What about any cube? We show that there is a sharp threshold on the dimension of the subspace at which the answers to these questions change from "extremely likely" to "extremely unlikely," and moreover we give a simple characterization of this threshold for different properties. Our motivation comes…

Perfect Simulation for Feynman-Kac Models using Ensemble Rejection Sampling

Arnaud Doucet - University of Oxford

November 20 @ 11:00 am - 12:00 pm
online

Abstract: I will introduce Ensemble Rejection Sampling, a scheme for perfect simulation of a class of Feynman-Kac models. In particular, this scheme allows us to sample exactly from the posterior distribution of the latent states of a class of non-linear non-Gaussian state-space models and from the distribution of a class of conditioned random walks. Ensemble Rejection Sampling relies on a high-dimensional proposal distribution built using ensembles of state samples and dynamic programming. Although this algorithm can be interpreted as a…
