A large-scale phylogeny-guided analysis of pseudogenes in Pseudomonas aeruginosa bacterium

Author:

Cohen Nimrod 1, Veksler-Lublinsky Isana 1

Affiliation:

1. Department of Software and Information Systems Engineering, Faculty of Engineering, Ben-Gurion University of the Negev, Beer-Sheva, Israel

Abstract

Pseudogenes, once considered "junk DNA" based on the incorrect assumption that the absence of full coding potential means a complete lack of functionality, have recently become a subject of significant interest in the scientific community. Concurrently, it is widely assumed that bacterial genomes are compact and have a high density of coding genes with little room for non-coding genes, including pseudogenes. A key aspect of genome annotation is the correct identification of genes and the distinction between coding genes and pseudogenes, as it directly impacts functional and comparative genomics studies. In this study, we analyzed the genomic data of 4,699 strains of the bacterium Pseudomonas aeruginosa (P. aeruginosa), which exhibit high variability in the number of annotated pseudogenes. In particular, we looked for correlations between the number of pseudogenes and other genomic and meta-features of the strains. We identified clusters of orthologous genes and pseudogenes and compared cluster size distributions and length homogeneity within clusters. We then mapped and examined orthology relationships between genes and pseudogenes. Additionally, we generated a phylogenetic tree of the strains and found that phylogenetically related strains are more homogeneous in the number of pseudogenes and share a significant number of pseudogenes. Finally, we delved into clusters of orthologous genes and pseudogenes and quantified their phylogenetic neighborhood, classifying pseudogenes into evolutionarily preserved pseudogenes, mis-annotated pseudogenes, or pseudogenes formed by failed horizontal transfer events. This in-depth study provides important insights that can be incorporated into pseudogene annotation pipelines in the future.

IMPORTANCE Accurate annotation of genes and pseudogenes is vital for comparative genomics analysis. Recent studies have shown that bacterial pseudogenes have an important role in regulatory processes and can provide insight into the evolutionary history of homologous genes or the genome as a whole. Because pseudogenes are by nature non-functional genes, there is no commonly accepted definition of a pseudogene, which makes it difficult to verify annotations experimentally and to resolve discrepancies among different annotation techniques. Our study presents an in-depth analysis of annotated genes and pseudogenes and offers insights that can be incorporated into improved pseudogene annotation pipelines in the future.
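The abstract mentions comparing length homogeneity within clusters of orthologous genes and pseudogenes but does not state which measure was used. As a minimal sketch, assuming the coefficient of variation of member sequence lengths as the homogeneity measure, the comparison could look like the following. The function name `length_cv`, the cluster identifiers, and the toy lengths are all hypothetical and are not taken from the study.

```python
from statistics import mean, pstdev


def length_cv(lengths):
    """Coefficient of variation of sequence lengths in one cluster.

    0.0 means all members have the same length; larger values mean
    a less homogeneous cluster. This measure is an assumption, not
    necessarily the one used in the study.
    """
    m = mean(lengths)
    return pstdev(lengths) / m if m else 0.0


# Hypothetical toy data: cluster id -> member sequence lengths (bp).
clusters = {
    "cluster_genes": [900, 903, 897, 900],
    "cluster_pseudogenes": [900, 450, 610, 880],
}

# One homogeneity score per cluster.
cv_by_cluster = {cid: length_cv(lens) for cid, lens in clusters.items()}

for cid, cv in cv_by_cluster.items():
    print(f"{cid}: CV = {cv:.3f}")
```

In this toy input the pseudogene cluster has the larger coefficient of variation because its members include truncated sequences; real clusters would of course need to be derived from the annotated genomes.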

Publisher

American Society for Microbiology

Subject

Infectious Diseases, Cell Biology, Microbiology (medical), Genetics, General Immunology and Microbiology, Ecology, Physiology
