Network depth affects inference of gene sets from bacterial transcriptomes using denoising autoencoders

Published:2024-01-01 Issue:1 Volume:4
ISSN:2635-0041
Container-title:Bioinformatics Advances
language:en

Author:

Kion-Crosby Willow¹²,Barquist Lars¹²³

Affiliation:

1. Helmholtz Institute for RNA-based Infection Research (HIRI)/Helmholtz Centre for Infection Research (HZI), 97080 Würzburg, Germany

2. Faculty of Medicine, University of Würzburg, 97080 Würzburg, Germany

3. Department of Biology, University of Toronto, Mississauga, ON L5L 1C6, Canada

Abstract

Summary

The increasing number of publicly available bacterial gene expression data sets provides an unprecedented resource for the study of gene regulation in diverse conditions, but emphasizes the need for self-supervised methods for the automated generation of new hypotheses. One approach for inferring coordinated regulation from bacterial expression data is through neural networks known as denoising autoencoders (DAEs) which encode large datasets in a reduced bottleneck layer. We have generalized this application of DAEs to include deep networks and explore the effects of network architecture on gene set inference using deep learning. We developed a DAE-based pipeline to extract gene sets from transcriptomic data in Escherichia coli, validate our method by comparing inferred gene sets with known pathways, and have used this pipeline to explore how the choice of network architecture impacts gene set recovery. We find that increasing network depth leads the DAEs to explain gene expression in terms of fewer, more concisely defined gene sets, and that adjusting the width results in a tradeoff between generalizability and biological inference. Finally, leveraging our understanding of the impact of DAE architecture, we apply our pipeline to an independent uropathogenic E. coli dataset to identify genes uniquely induced during human colonization.

Availability and implementation

https://github.com/BarquistLab/DAE_architecture_exploration
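
The architectural terms used in the abstract (depth, width, bottleneck, denoising) can be illustrated with a minimal NumPy sketch. This is not the authors' pipeline, which is available at the repository linked in the abstract. The helper names (`layer_sizes`, `init_params`, `corrupt`, `forward`, `denoising_loss`), the geometric spacing of hidden widths, the tanh activation, and the masking noise are all assumptions chosen for illustration only.

```python
import numpy as np

def layer_sizes(n_genes, bottleneck, depth):
    """Layer widths for a symmetric autoencoder with `depth` encoder layers.

    Hidden widths are spaced geometrically between the input size and the
    bottleneck. This spacing is an illustrative assumption, not the paper's.
    """
    enc = [int(round(w)) for w in np.geomspace(n_genes, bottleneck, depth + 1)]
    enc[0], enc[-1] = n_genes, bottleneck
    # Mirror the encoder (without repeating the bottleneck) to get the decoder.
    return enc + enc[-2::-1]

def init_params(sizes, rng):
    """Random weights and zero biases for each consecutive pair of layers."""
    return [(rng.normal(0.0, np.sqrt(1.0 / m), size=(m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def corrupt(x, drop_prob, rng):
    """Masking noise: set each input value to zero with probability drop_prob."""
    return x * (rng.random(x.shape) >= drop_prob)

def forward(params, x):
    """Pass x through all layers; tanh on hidden layers, linear output."""
    h = x
    for i, (W, b) in enumerate(params):
        h = h @ W + b
        if i < len(params) - 1:
            h = np.tanh(h)
    return h

def denoising_loss(params, x_clean, drop_prob, rng):
    """Mean squared error between the clean input and the reconstruction
    computed from a corrupted copy of that input (the denoising objective)."""
    recon = forward(params, corrupt(x_clean, drop_prob, rng))
    return float(np.mean((recon - x_clean) ** 2))

# Toy data standing in for an expression matrix (samples x genes).
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 200))
for depth in (1, 2, 3):
    sizes = layer_sizes(200, 16, depth)
    params = init_params(sizes, rng)
    print(depth, sizes, denoising_loss(params, x, 0.1, rng))
```

In this sketch, raising `depth` adds hidden layers between the input and the bottleneck, and changing `bottleneck` adjusts one notion of width. Training is omitted, so the printed losses only confirm that the shapes line up and say nothing about the depth and width effects reported in the abstract.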

Funder

Bavarian State Ministry for Science and the Arts

Publisher

Oxford University Press (OUP)

Link

https://academic.oup.com/bioinformaticsadvances/advance-article-pdf/doi/10.1093/bioadv/vbae066/57452311/vbae066.pdf
