Big data in genomic research for big questions with examples from covid-19 and other zoonoses

Published:2022-12-16 Issue:1 Volume:134
ISSN:1365-2672
Container-title:Journal of Applied Microbiology
language:en

Author:

Wassenaar Trudy M¹, Ussery David W², Rosel Adriana Cabal³

Affiliation:

1. Molecular Microbiology and Genomics Consultants, Tannenstrasse 7, 55576 Zotzenheim, Germany

2. Department of Biomedical Informatics, University of Arkansas for Medical Sciences, 4301 W Markham St, Little Rock, AR 72205, USA

3. Institute for Medical Microbiology and Hygiene, Division for Public Health, Austrian Agency for Health and Food Safety, Währingerstrasse 25a, 1096, Vienna, Austria

Abstract

Omics research inevitably involves the collection and analysis of big data, which can only be handled by automated approaches. Here we point out that the analysis of big data in the field of genomics dictates certain requirements, such as specialized software, quality control of input data, and simplification for visualization of the results. The latter results in a loss of information, as is exemplified for phylogenetic trees. Clear communication of big data analyses can be enhanced by novel visualization strategies. The interpretation of findings is sometimes hampered when dedicated analytical tools are not fully understood by microbiologists, while the researchers performing these analyses may not have a full overview of the biology of the microbes under study. These issues are illustrated here, using SARS-CoV-2 and Salmonella enterica as zoonotic examples. Whereas in scientific communications jargon should be avoided or explained, nomenclature to group similar organisms and distinguish these from more distant relatives is not only essential, but also influences the interpretation of results. Unfortunately, changes in taxonomically accepted names are now so frequent that they hamper rather than assist research, as is illustrated with difficulties of microbiome studies. Nomenclature to group viral isolates, as is done for SARS-CoV-2, is also not without difficulties. Some weaknesses in current omics research stem from poor quality of data or biased databases, and problems can be magnified by machine learning approaches. Moreover, the overall opus of scientific publications can now be considered “big data”, as is illustrated by the avalanche of COVID-19-related publications. The peer-review model of scientific publishing is only barely coping with this novel situation, resulting in retractions and the publication of bogus works.
The avalanche of scientific publications that originated from the current pandemic can obstruct literature searches, and this will unfortunately continue over time.

Funder

NIH

National Science Foundation

Arkansas Research Alliance

Publisher

Oxford University Press (OUP)

Subject

Applied Microbiology and Biotechnology, General Medicine, Biotechnology

Link

https://academic.oup.com/jambio/article-pdf/134/1/lxac055/49095044/lxac055.pdf

References: 67 articles.

1. Mash-based analyses of Escherichia coli genomes reveal 14 distinct phylogroups;Abram;Commun Biol,2021

2. Forest and trees: exploring bacterial virulence with Genome-wide association studies and machine learning;Allen;Trends Microbiol,2021

3. Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020;Alm;Euro Surveill,2020

4. Applications of machine learning to the problem of antimicrobial resistance: an emerging model for translational research;Anahtar;J Clin Microbiol,2021

5. Efficient computation of Faith’s phylogenetic diversity with applications in characterizing microbiomes;Armstrong;Genome Res,2021

Cited by 1 article.

1. CREATION OF A NATIONAL DATABASE OF GENOMIC INFORMATION IN UZBEKISTAN;ԴԱՏԱԿԱՆ ՓՈՐՁԱՔՆՆՈՒԹՅԱՆ ԵՎ ՔՐԵԱԳԻՏՈՒԹՅԱՆ ՀԱՅԿԱԿԱՆ ՀԱՆԴԵՍ;2023