Stochastics and Statistics Seminar

How should we do linear regression?

April 25 @ 11:00 am - 12:00 pm

Richard Samworth, University of Cambridge

E18-304

Abstract: In the context of linear regression, we construct a data-driven convex loss function with respect to which empirical risk minimisation yields optimal asymptotic variance in the downstream estimation of the regression coefficients. Our semiparametric approach targets the best decreasing approximation of the derivative of the log-density of the noise distribution. At the population level, this fitting process is a nonparametric extension of score matching, corresponding to a log-concave projection of the noise distribution with respect to the Fisher divergence. The procedure is computationally efficient, and we prove that it attains the minimal asymptotic covariance among all convex M-estimators. As an example of a non-log-concave setting, for Cauchy errors, the optimal convex loss function is Huber-like, and our procedure yields an asymptotic efficiency greater than 0.87 relative to the oracle maximum likelihood estimator of the regression coefficients that uses knowledge of this error distribution; in this sense, we obtain robustness without sacrificing much efficiency.
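
For readers unfamiliar with convex M-estimation, the following is a minimal illustrative sketch and not the speaker's procedure. It fits a linear model under Cauchy noise using a fixed Huber loss (a standard convex loss, standing in for the data-driven loss described in the abstract) and ordinary least squares for comparison. The function names `huber_irls` and `simulate`, the threshold `delta=1.345`, the sample size, and the coefficient values are all assumptions chosen for illustration.

```python
import numpy as np


def huber_irls(X, y, delta=1.345, n_iter=200, tol=1e-10):
    """Huber M-estimator via iteratively reweighted least squares.

    Assumes the noise scale is roughly 1, so no scale estimate is used
    (an assumption made only to keep this sketch short).
    """
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        r = y - X @ beta
        # Huber weights: 1 for small residuals, delta/|r| for large ones.
        w = np.minimum(1.0, delta / np.maximum(np.abs(r), 1e-12))
        XtW = X.T * w  # scales each observation by its weight
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta


def simulate(n=2000, seed=0):
    """Simulate y = X beta + Cauchy noise; return the truth and two fits."""
    rng = np.random.default_rng(seed)
    beta_true = np.array([1.0, -2.0, 0.5])  # hypothetical coefficients
    X = rng.standard_normal((n, 3))
    y = X @ beta_true + rng.standard_cauchy(n)
    beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]  # squared-error loss
    beta_huber = huber_irls(X, y)                    # convex, robust loss
    return beta_true, beta_ols, beta_huber


if __name__ == "__main__":
    beta_true, beta_ols, beta_huber = simulate()
    print("OLS error:  ", np.linalg.norm(beta_ols - beta_true))
    print("Huber error:", np.linalg.norm(beta_huber - beta_true))
```

Because Cauchy noise has no finite variance, the least-squares fit is typically unstable, whereas a convex loss with bounded derivative keeps the estimate close to the truth. The approach in the abstract goes further by learning the convex loss from the data rather than fixing it in advance.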

Bio: Richard Samworth obtained his PhD in Statistics from the University of Cambridge in 2004, and has remained in Cambridge since, becoming a full professor in 2013 and the Professor of Statistical Science in 2017. His main research interests are in high-dimensional and nonparametric statistics; he has developed methods and theory for shape-constrained inference, missing data, subgroup selection, data perturbation techniques (subsampling, the bootstrap, random projections, knockoffs), changepoint estimation and independence testing, amongst others. Richard currently holds a European Research Council Advanced Grant. He received the COPSS Presidents’ Award in 2018, was elected a Fellow of the Royal Society in 2021 and served as co-editor of the Annals of Statistics (2019-2021).


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764