Stochastics and Statistics Seminar

Expansion of Biological Pathways by Integrative Genomics

November 13, 2015 @ 11:00 am

Jun Liu (Harvard University)


The number of publicly available gene expression datasets has been growing dramatically. Various methods have been proposed to predict gene co-expression by integrating these datasets. These methods assume that the genes in the query gene set are homogeneously correlated, and they ignore gene-specific correlation tendencies, background intra-experimental correlations, and quality variations across experiments. We propose a two-step algorithm called CLIC (CLustering by Inferred Co-expression), based on a coherent Bayesian model, to overcome these limitations. CLIC first employs a Bayesian partition model with feature selection to partition the gene set into disjoint co-expression modules (CEMs), simultaneously assigning a posterior probability of selection to each dataset. In the second step, CLIC expands each CEM by scanning the whole reference genome for candidate genes that are not in the input gene set but are co-expressed with the genes in this CEM. CLIC is capable of integrating thousands of gene expression datasets to achieve much higher co-expression prediction accuracy than traditional co-expression methods. Application of CLIC to ~1000 annotated human pathways and ~6000 poorly characterized human genes reveals new components of some well-studied pathways and provides strong functional predictions for some poorly characterized genes. We validated the predicted association between protein C7orf55 and ATP synthase assembly using CRISPR knock-out assays.
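
The two-step structure described in the abstract (partition the query set into modules while weighting datasets, then expand each module genome-wide) can be illustrated with a deliberately simplified sketch. This is not CLIC itself: the Bayesian partition model is replaced here by thresholded connected components, and the posterior selection probability of each dataset is replaced by a heuristic weight (query-set coherence above the background correlation). All function names, the 0.5 threshold, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def dataset_weights(corrs, query):
    """Heuristic stand-in for per-dataset selection probabilities:
    mean correlation among query genes minus the dataset's background mean."""
    q = np.asarray(query)
    raw = []
    for c in corrs:
        background = c[~np.eye(c.shape[0], dtype=bool)].mean()
        sub = c[np.ix_(q, q)]
        within = sub[~np.eye(len(q), dtype=bool)].mean()
        raw.append(max(within - background, 0.0))
    w = np.array(raw)
    if w.sum() == 0:          # no dataset looks informative: fall back to uniform
        w = np.ones_like(w)
    return w / w.sum()

def partition_modules(avg_corr, query, threshold=0.5, min_size=2):
    """Step 1 (simplified): connected components of the thresholded
    co-expression graph restricted to the query genes."""
    unseen = set(query)
    modules = []
    for g in sorted(query):
        if g not in unseen:
            continue
        unseen.remove(g)
        stack, module = [g], []
        while stack:
            cur = stack.pop()
            module.append(cur)
            linked = [h for h in unseen if avg_corr[cur, h] > threshold]
            for h in linked:
                unseen.remove(h)
                stack.append(h)
        if len(module) >= min_size:
            modules.append(sorted(module))
    return modules

def expand_module(avg_corr, module, query, threshold=0.5):
    """Step 2 (simplified): scan all non-query genes and keep those whose
    mean weighted correlation with the module exceeds the threshold."""
    in_query = set(query)
    return [g for g in range(avg_corr.shape[0])
            if g not in in_query and avg_corr[g, module].mean() > threshold]

def clic_like(datasets, query, threshold=0.5):
    """Run both simplified steps on a list of (genes x samples) matrices."""
    corrs = [np.corrcoef(x) for x in datasets]   # rows are genes
    w = dataset_weights(corrs, query)
    avg = sum(wi * c for wi, c in zip(w, corrs))
    modules = partition_modules(avg, query, threshold)
    return [(m, expand_module(avg, m, query, threshold)) for m in modules], w

def make_toy_datasets(seed=0, n_samples=200):
    """Toy data (hypothetical): two informative datasets in which genes 0-3
    and genes 4-7 each share a latent factor, plus one pure-noise dataset."""
    rng = np.random.default_rng(seed)
    datasets = []
    for informative in (True, True, False):
        x = rng.normal(size=(12, n_samples))
        if informative:
            x[0:4] = rng.normal(size=n_samples) + 0.3 * x[0:4]
            x[4:8] = rng.normal(size=n_samples) + 0.3 * x[4:8]
        datasets.append(x)
    return datasets

if __name__ == "__main__":
    result, weights = clic_like(make_toy_datasets(), query=[0, 1, 2, 4, 5, 6])
    for module, added in result:
        print("module", module, "expanded with", added)
    print("dataset weights", np.round(weights, 3))
```

In this toy setup the query mixes genes from two underlying groups, so the partition step is what keeps the expansion of one module from being diluted by unrelated query genes, and the noise dataset contributes little to the weighted average.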

MIT Statistics + Data Science Center
Massachusetts Institute of Technology
77 Massachusetts Avenue
Cambridge, MA 02139-4307