Stochastics and Statistics Seminar

Causal Inference on Outcomes Learned from Text

Jann Spiess, Stanford University
E18-304

Abstract: (with Iman Modarressi and Amar Venugopal; arxiv.org/abs/2503.00725) We propose a machine-learning tool that yields causal inference on text in randomized trials. Based on a simple econometric framework in which text may capture outcomes of interest, our procedure addresses three questions: First, is the text affected by the treatment? Second, which outcomes is the effect on? And third, how complete is our description of causal effects? To answer all three questions, our approach uses large language models (LLMs) that suggest systematic…

Same Root Different Leaves: Time Series and Cross-Sectional Methods in Panel Data

Dennis Shen, University of Southern California
E18-304

Abstract: One dominant approach to evaluate the causal effect of a treatment is through panel data analysis, whereby the behaviors of multiple units are observed over time. The information across time and units motivates two general approaches: (i) horizontal regression (i.e., unconfoundedness), which exploits time series patterns, and (ii) vertical regression (e.g., synthetic controls), which exploits cross-sectional patterns. Conventional wisdom often considers the two approaches to be different. We establish this position to be partly false for estimation but generally…

How should we do linear regression?

Richard Samworth, University of Cambridge
E18-304

Abstract: In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence.…


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764