Author:
Kuang Junyao, Buchon Nicolas, Michel Kristin, Scoglio Caterina
Abstract
Gene co-expression networks can be used to infer gene regulation and attribute gene function to biological processes. Different high-throughput technologies, including one- and two-channel microarrays and RNA-sequencing, allow the expression of thousands of genes to be measured simultaneously, but these methodologies provide results that cannot be directly compared. Thus, it is difficult to analyze co-expression relations between genes, especially when there are missing values arising for experimental reasons. Networks are a helpful tool for studying gene co-expression, where nodes represent genes and edges represent co-expression of pairs of genes. In this paper, we propose a method for constructing a gene co-expression network for the Anopheles gambiae transcriptome from 257 unique studies obtained with different methodologies and experimental designs. We introduce the sliding threshold approach to select node pairs with high Pearson correlation coefficients. The robustness of the method was verified by comparing edge weight distributions under random removal of conditions. We examine the properties of the constructed network, including node degree distribution, coreness, and community structure. The network core is largely composed of genes that encode components of the mitochondrial respiratory chain and the ribosome, while different communities are enriched for genes involved in distinct biological processes. This suggests that the overall network structure is driven to maximize the integration of essential cellular functions, possibly allowing the flexibility to add novel functions.
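As a rough illustration of the general idea in the abstract (nodes are genes, edges join gene pairs with high Pearson correlation, and some measurements are missing), the following minimal Python sketch may help. It is not the authors' method: the abstract does not give the details of the sliding threshold approach, so this sketch substitutes a single fixed cutoff. The function names, the `threshold` and `min_overlap` parameters, and the toy expression values are all assumptions made for illustration.

```python
import math

def pearson_pairwise(x, y, min_overlap=3):
    """Pearson correlation over conditions where both genes have a value.

    Missing values are encoded as None (an assumption of this sketch).
    Returns None if fewer than `min_overlap` conditions are shared or if
    either gene is constant on the shared conditions.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n < min_overlap:
        return None
    mx = sum(a for a, _ in pairs) / n
    my = sum(b for _, b in pairs) / n
    sxy = sum((a - mx) * (b - my) for a, b in pairs)
    sxx = sum((a - mx) ** 2 for a, _ in pairs)
    syy = sum((b - my) ** 2 for _, b in pairs)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)

def build_network(expr, threshold=0.9, min_overlap=3):
    """Return {(gene_i, gene_j): r} for pairs with r >= threshold.

    A fixed cutoff stands in for the paper's sliding threshold, whose
    definition is not available in the abstract.
    """
    genes = sorted(expr)
    edges = {}
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            r = pearson_pairwise(expr[g], expr[h], min_overlap)
            if r is not None and r >= threshold:
                edges[(g, h)] = r
    return edges

def degrees(edges):
    """Node degree: number of edges incident to each gene."""
    deg = {}
    for g, h in edges:
        deg[g] = deg.get(g, 0) + 1
        deg[h] = deg.get(h, 0) + 1
    return deg

# Hypothetical toy data: four genes measured under five conditions.
expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, None],
    "geneB": [2.0, 4.1, 6.0, 8.2, 10.0],
    "geneC": [4.0, 3.0, None, 1.0, 0.5],
    "geneD": [1.0, None, None, None, 2.0],
}
edges = build_network(expr)
deg = degrees(edges)
print(edges)
print(deg)
```

In this toy example only geneA and geneB are linked. geneD shares too few conditions with any other gene to be scored, which shows one simple way missing values can be handled. Whether negatively correlated pairs should also form edges is a design choice that this sketch leaves out.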
Publisher
Cold Spring Harbor Laboratory