Stochastics and Statistics Seminar
Optimal nonparametric capture-recapture methods for estimating population size
April 5, 2024 @ 11:00 am - 12:00 pm
Edward Kennedy, Carnegie Mellon University
E18-304
Abstract: Estimation of population size using incomplete lists has a long history across many biological and social sciences. For example, human rights groups often construct partial lists of victims of armed conflicts in order to estimate the total number of victims. Earlier statistical methods for this setup often use parametric assumptions or rely on suboptimal plug-in-type nonparametric estimators, but both approaches can lead to substantial bias, the former via model misspecification and the latter via smoothing. Under an identifying assumption that two lists are conditionally independent given measured covariates, we make several contributions. First, we derive the nonparametric efficiency bound for estimating the capture probability, which indicates the best possible performance of any estimator and sheds light on the statistical limits of capture-recapture methods. Then we present a new estimator that has a double robustness property new to capture-recapture and is near-optimal in a nonasymptotic sense under relatively mild nonparametric conditions. Next, we give a confidence interval construction method for total population size from generic capture probability estimators, and prove nonasymptotic near-validity. Finally, we apply these methods to estimate the number of killings and disappearances in Peru during its internal armed conflict between 1980 and 2000.
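For readers new to the two-list setup, the relationship between capture probability and population size can be illustrated with the classical Lincoln-Petersen estimator. This is only a minimal sketch: it assumes the two lists are marginally independent with no covariates, which is a stronger assumption than the conditional independence used in the talk, and it is not the doubly robust estimator the abstract describes. The function name and the toy ID lists are hypothetical.

```python
def lincoln_petersen(list_a, list_b):
    """Classical two-list estimate of population size and capture probability.

    Assumes the two lists are independent (no covariate adjustment), which is
    a simplification of the conditional-independence setting in the abstract.
    """
    a, b = set(list_a), set(list_b)
    n1, n2 = len(a), len(b)      # number of individuals on each list
    m = len(a & b)               # individuals appearing on both lists
    if m == 0:
        raise ValueError("The lists do not overlap; the estimate is undefined.")
    n_observed = len(a | b)      # individuals appearing on at least one list
    n_hat = n1 * n2 / m          # estimated total population size
    capture_prob = n_observed / n_hat  # estimated chance of being on any list
    return n_hat, capture_prob


# Toy data (hypothetical): 60 IDs on list A, 50 on list B, 20 on both.
list_a = range(0, 60)
list_b = range(40, 90)
n_hat, capture_prob = lincoln_petersen(list_a, list_b)
print(n_hat, capture_prob)
```

In this toy case 90 distinct individuals are observed and the estimated capture probability is 0.6, so the estimated population size is 90 / 0.6 = 150. The same relationship (observed count divided by estimated capture probability) is what makes an estimate of the capture probability translate into an estimate of total population size.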
Paper links: journal, arxiv
Bio: Edward Kennedy is an associate professor of Statistics & Data Science at Carnegie Mellon University. He joined the department after graduating with a PhD in biostatistics from the University of Pennsylvania. Edward’s methodological interests lie at the intersection of causal inference, machine learning, and nonparametric theory, especially in settings involving high-dimensional and otherwise complex data. His applied research focuses on problems in criminal justice, health services, medicine, and public policy. Edward is a recipient of the NSF CAREER award, the David P. Byar Young Investigator award, and the Thomas Ten Have Award for exceptional research in causal inference.