Abstract
Interspecific gene comparisons are the keystones for many areas of biological research and are especially important for the translation of knowledge from model organisms to economically important species. Currently they are hampered by the low resolution of methods based on sequence analysis and by the complex evolutionary history of eukaryotic genes. This is especially critical for plants, whose genomes are shaped by multiple whole genome duplications and subsequent gene loss. This requires the development of new methods for comparing the functions of genes in different species. Here, we report ISEEML (Interspecific Similarity of Expression Evaluated using Machine Learning)–a novel machine learning-based algorithm for interspecific gene classification. In contrast to previous studies focused on sequence similarity, our algorithm focuses on functional similarity inferred from the comparison of gene expression profiles. We propose novel metrics for expression pattern similarity–expression score (ES)–that is suitable for species with differing morphologies. As a proof of concept, we compare detailed transcriptome maps of Arabidopsis thaliana, the model species, Zea mays (maize) and Fagopyrum esculentum (common buckwheat), which are species that represent distant clades within flowering plants. The classifier resulted in an AUC of 0.91; under the ES threshold of 0.5, the specificity was 94%, and sensitivity was 72%.
Funder
Russian Science Foundation
Institute for Information Transmission Problems
Ministry of Science and Higher Education
Publisher
Public Library of Science (PLoS)
Subject
Computational Theory and Mathematics,Cellular and Molecular Neuroscience,Genetics,Molecular Biology,Ecology,Modeling and Simulation,Ecology, Evolution, Behavior and Systematics
Reference55 articles.
1. Orthologs, Paralogs, and Evolutionary Genomics;EV Koonin;Annu Rev Genet,2005
2. The Plant Orthology Browser: An Orthology and Gene-Order Visualizer for Plant Comparative Genomics;D Tulpan;Plant Genome,2017
3. Standardized benchmarking in the quest for orthologs;Quest for Orthologs consortium;Nat Methods,2016
4. The multiple fates of gene duplications: Deletion, hypofunctionalization, subfunctionalization, neofunctionalization, dosage balance constraints, and neutral variation;JA Birchler;Plant Cell,2022
5. The Role of Lineage-Specific Gene Family Expansion in the Evolution of Eukaryotes;O Lespinet;Genome Res,2002
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献