Stochastics and Statistics Seminar

Optimal Adaptivity of Signed-Polygon Statistics for Network Testing (Tracy Ke, Harvard University)

May 3, 2019 @ 11:00 am - 12:00 pm

Tracy Ke (Harvard University)

E18-304

Abstract:
Given a symmetric social network, we are interested in testing whether it has only one community or multiple communities. The desired tests should (a) accommodate severe degree heterogeneity, (b) accommodate mixed memberships, (c) have a tractable null distribution, and (d) adapt automatically to different levels of sparsity and achieve the optimal detection boundary. Finding such a test is a challenging problem.

We propose the Signed Polygon as a class of new tests. Fix m ≥ 3. For each m-gon in the network, we define a score using the centralized adjacency matrix. The sum of these scores is the m-th order Signed Polygon statistic. The Signed Triangle (SgnT) and the Signed Quadrilateral (SgnQ) are special cases of the Signed Polygon. We show that both the SgnT and SgnQ tests satisfy all of the requirements (a)-(d). In particular, they work well in both the very sparse and the less sparse cases. Our proposed tests compare favorably with existing tests. For example, the EZ test (Gao and Lafferty, 2017) and the GC test (Jin et al., 2018) behave unsatisfactorily in the less sparse case and do not achieve the optimal phase diagram. Also, many existing tests assume no degree heterogeneity or no mixed memberships, so they behave unsatisfactorily in our settings.
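
As a rough illustration of the construction described above, the following Python sketch sums, over all ordered m-tuples of distinct nodes, the product of centered adjacency entries around the cycle. The abstract does not state the centering formula; the form used here, A* = A - ηηᵀ with η = A1 / sqrt(1ᵀA1), is an assumption based on the linked preprint and should be checked against it. The function names `centered_adjacency` and `signed_polygon` are hypothetical, the enumeration is brute force (cost grows like n^m), and the result is the raw sum only, without the normalization needed to turn it into a test statistic.

```python
from itertools import permutations


def centered_adjacency(adj):
    """Return A* = A - eta eta^T, with eta = A 1 / sqrt(1^T A 1).

    ASSUMPTION: this centering is not given in the abstract; it is one
    plausible reading of "centralized adjacency matrix".
    `adj` is a symmetric 0/1 matrix given as a list of lists.
    """
    n = len(adj)
    degrees = [sum(row) for row in adj]
    total = sum(degrees)
    if total == 0:
        raise ValueError("the network has no edges")
    # eta_i * eta_j = d_i * d_j / (1^T A 1)
    return [
        [adj[i][j] - degrees[i] * degrees[j] / total for j in range(n)]
        for i in range(n)
    ]


def signed_polygon(adj, m):
    """Raw m-th order signed-polygon sum (hypothetical helper).

    Sums the product of centered entries around every ordered cycle
    i1 -> i2 -> ... -> im -> i1 on m distinct nodes.
    """
    if m < 3:
        raise ValueError("m must be at least 3")
    centered = centered_adjacency(adj)
    n = len(adj)
    total = 0.0
    for cycle in permutations(range(n), m):  # ordered tuples of distinct nodes
        score = 1.0
        for k in range(m):
            score *= centered[cycle[k]][cycle[(k + 1) % m]]
        total += score
    return total
```

Under these assumptions, `signed_polygon(adj, 3)` corresponds to the raw Signed Triangle sum and `signed_polygon(adj, 4)` to the raw Signed Quadrilateral sum. The brute-force loop is only practical for very small networks.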

The analysis of the SgnT and SgnQ tests is delicate and tedious, since the proof has to cover a whole range of sparsity levels and (almost) arbitrary degree heterogeneity.

Joint work with Jiashun Jin and Shengming Luo. (arXiv preprint: https://arxiv.org/abs/1904.09532)

Biography:
Tracy Ke is an Assistant Professor of Statistics at Harvard University. Dr. Ke received her PhD from Princeton University in 2014. She was an Assistant Professor of Statistics at the University of Chicago from 2014 to 2018. Her recent research focuses on unsupervised learning problems, including spectral clustering, network community detection, topic modeling, and nonnegative matrix factorization. Her work aims to find statistically optimal methods when the signals are extremely weak and the data contain severe heterogeneity. Her other research interests include large-scale sparse inference and random matrix theory.


MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307
617-253-1764