Abstract
AbstractThis review article describes the current status of data archiving and computational infrastructure in the field of genomic medicine, focusing primarily on the situation in Japan. I begin by introducing the status of supercomputer operations in Japan, where a high-performance computing infrastructure (HPCI) is operated to meet the diverse computational needs of science in general. Since this HPCI consists of supercomputers of various architectures located across the nation connected via a high-speed network, including supercomputers specialized in genome science, the status of its response to the explosive increase in genomic data, including the International Nucleotide Sequence Database Collaboration (INSDC) data archive, is explored. Separately, since it is clear that the use of commercial cloud computing environments needs to be promoted, both in light of the rapid increase in computing demands and to support international data sharing and international data analysis projects, I explain how the Japanese government has established a series of guidelines for the use of cloud computing based on its cybersecurity strategy and has begun to build a government cloud for government agencies. I will also carefully consider several other issues of user concern. Finally, I will show how Japan’s major cloud computing infrastructure is currently evolving toward a multicloud and hybrid cloud configuration.
Funder
Japan Agency for Medical Research and Development
Publisher
Springer Science and Business Media LLC
Subject
Genetics,Molecular Biology,Biochemistry
Reference54 articles.
1. Van der Auwera, G. & O’Connor, B. Genomics in the Cloud. (O’Reilly Mediam, 2020).
2. NVIDIA Clara Parabricks, https://www.nvidia.com/en-us/clara/genomics/, Accessed 11 December 2022.
3. McKenna, A. et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 20, 1297–1303 (2010).
4. DePristo, M. A. et al. A framework for variation discovery and genotyping using next-generation DNA sequencing data. Nat. Genet. 43, 491–498 (2011).
5. Zhao, S., Agafonov, O., Azab, A., Stokowy, T. & Hovig, E. Accuracy and efficiency of germline variant calling pipelines for human genome data. bioRxiv https://www.biorxiv.org/content/10.1101/2020.03.27.011767v1 (2020).