Stochastics and Statistics Seminar

Geometric EDA for Random Objects

October 14 @ 11:00 am - 12:00 pm

Paromita Dubey, University of Southern California


In this talk I will propose new tools for the exploratory data analysis of data objects taking values in a general separable metric space. First, I will introduce depth profiles, where the depth profile of a point ω in the metric space refers to the distribution of the distances between ω and the data objects. I will describe how depth profiles can be harnessed to define transport ranks, which capture the centrality of each element in the metric space with respect to the data cloud. Next, I will discuss the properties of transport ranks and show how they can be an effective device for detecting and visualizing patterns in samples of random objects. Together with practical illustrations, I will establish theoretical guarantees for the estimation of the depth profiles and the transport ranks for a wide class of metric spaces. Finally, I will describe a new two-sample test geared towards populations of random objects that utilizes the depth profiles corresponding to the data objects. I will demonstrate the efficacy of this new approach on distributional data comprising a sample of age-at-death distributions for various countries, on compositional data through energy usage for the U.S. states, and on neuroimaging network data. This talk is based on joint work with Yaqing Chen and Hans-Georg Müller.
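As a rough illustration of the depth-profile definition in the abstract, the sketch below computes an empirical depth profile for a point ω: the sorted distances from ω to each data object, together with the empirical distribution function of those distances. This is not the speaker's implementation. The function names `depth_profile` and `profile_cdf`, the Euclidean metric, and the toy data are all assumptions made for illustration; any metric on the object space could be passed in instead. Transport ranks are not sketched here because the abstract does not define them.

```python
import numpy as np


def depth_profile(omega, data, metric):
    """Sorted distances from omega to every data object.

    The empirical distribution of these distances is the (sample)
    depth profile of omega described in the abstract.
    """
    return np.sort([metric(omega, x) for x in data])


def profile_cdf(profile, t):
    """Fraction of data objects within distance t of omega."""
    return np.searchsorted(profile, t, side="right") / len(profile)


def euclidean(a, b):
    # Illustrative metric only; the talk concerns general metric spaces.
    return float(np.linalg.norm(np.asarray(a) - np.asarray(b)))


# Hypothetical toy data cloud in the plane.
data = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (3.0, 4.0)]
profile = depth_profile((0.0, 0.0), data, euclidean)

print(profile)                    # distances 0, 1, 1, 5
print(profile_cdf(profile, 1.0))  # 0.75
```

A point whose distances to the data objects concentrate near zero sits close to much of the data cloud, while a point whose distances are mostly large sits far from it.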

Paromita Dubey has been an Assistant Professor in the Data Sciences and Operations department at the USC Marshall School of Business since 2021. Her research centers on developing novel statistical frameworks for non-Euclidean data, such as distribution-valued and network-valued data. She is also working on challenges in the analysis of dynamic, time-evolving data, particularly in non-Euclidean and high-dimensional settings. Aside from theoretical challenges, she enjoys working collaboratively to develop statistical frameworks for application-oriented challenges in population genetics, environmental sciences and social sciences.


© MIT Statistics + Data Science Center | 77 Massachusetts Avenue | Cambridge, MA 02139-4307 | 617-253-1764