Coexpression analysis of human genes across many microarray data sets


Homin K. Lee1, Amy K. Hsu1,2, Jon Sajdak1, Jie Qin1, Paul Pavlidis1,3*

Genome Research 14, 1085-1094

1 Columbia Genome Center
2 College of Physicians and Surgeons
3 Department of Biomedical Informatics
Columbia University

pavlidis@dbmi.columbia.edu

Abstract

We present a large-scale analysis of mRNA coexpression based on 60 large human data sets containing a total of 3960 microarrays. We sought pairs of genes that were reliably coexpressed (based on the correlation of their expression profiles) in multiple data sets, establishing a network of 8805 genes connected by 220649 "coexpression links" that are observed in at least 3 data sets. Many of these links are confirmed in both Affymetrix oligonucleotide and spotted cDNA microarray data sets. Confirmed positive correlations between genes were much more common than confirmed negative correlations. We show that confirmation of coexpression in multiple data sets is correlated with functional relatedness, and show how clustering analysis of the network can reveal functionally coherent groups of genes. Our findings demonstrate how the large body of accumulated microarray data can be exploited to increase the reliability of inferences about gene function.
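
The idea of a "coexpression link" confirmed in multiple data sets can be illustrated with a minimal sketch. It is not the authors' pipeline: the fixed correlation cutoff `min_abs_r`, the function names, and the toy expression values are all assumptions made for illustration, and the abstract does not state how a correlation was judged reliable. The sketch computes the Pearson correlation for every gene pair within each data set, counts how many data sets support each signed link, and keeps links seen in at least 3 data sets.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # a flat profile carries no correlation information
    return cov / (sx * sy)


def confirmed_links(datasets, min_abs_r=0.9, min_support=3):
    """Return {(gene_a, gene_b, sign): support} for links seen in
    at least min_support data sets.

    datasets: list of dicts mapping gene name -> expression profile.
    min_abs_r is a hypothetical cutoff, not the paper's criterion.
    """
    support = Counter()
    for profiles in datasets:
        for a, b in combinations(sorted(profiles), 2):
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= min_abs_r:
                sign = "+" if r > 0 else "-"
                support[(a, b, sign)] += 1  # one vote per data set
    return {link: n for link, n in support.items() if n >= min_support}


# Toy data (invented): genes A and B track each other in all three
# data sets, while C correlates with them only in the third.
toy = [
    {"A": [1, 2, 3, 4], "B": [2, 4, 6, 8], "C": [4, 1, 3, 2]},
    {"A": [1, 3, 2, 4], "B": [1, 3, 2, 4], "C": [1, 1, 2, 0]},
    {"A": [2, 4, 6, 8], "B": [1, 2, 3, 5], "C": [2, 4, 6, 8]},
]

links = confirmed_links(toy)
print(links)  # {('A', 'B', '+'): 3}
```

Keeping the sign in the link key means a pair that is positively correlated in one data set and negatively correlated in another does not count as confirmed, which is in keeping with the abstract's separate treatment of positive and negative correlations.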


Last modified: Wed Jun 2 10:31:09 EDT 2004