Cloud-based introduction to BASH programming for biologists

Author:

Wilkins Owen M (1,2,3), Campbell Ross (4), Yosufzai Zelaikha (5), Doe Valena (6), Soucy Shannon M (1,2,3)

Affiliation:

1. Dartmouth College Genomic Data Science Core, Center for Quantitative Biology (COBRE), 1 Medical Center Drive, Lebanon, NH 03766, United States

2. Dartmouth College Department of Biomedical Data Science, Geisel School of Medicine, 1 Medical Center Drive, Lebanon, NH 03766, United States

3. Dartmouth Health Dartmouth Cancer Center, Geisel School of Medicine, 1 Medical Center Drive, Lebanon, NH 03766, United States

4. National Institutes of Health, 9000 Rockville Pike, Bethesda, MD 20892, United States

5. Health Data and AI, Deloitte Consulting LLP, 1919 N Lynn St, Suite 1500, Arlington, VA 22209, United States

6. Google Cloud, 1900 Reston Metro Plaza, Reston, VA 20190, United States

Abstract

This manuscript describes the development of a resource module that is part of a learning platform named ‘NIGMS Sandbox for Cloud-based Learning’, https://github.com/NIGMS/NIGMS-Sandbox. The overall genesis of the Sandbox is described in the editorial authored by the National Institute of General Medical Sciences, NIGMS Sandbox: A Learning Platform toward Democratizing Cloud Computing for Biomedical Research, at the beginning of this supplement. This module delivers learning materials introducing the utility of the BASH (Bourne Again Shell) programming language for genomic data analysis in an interactive format that uses appropriate cloud resources for data access and analyses. The next-generation sequencing revolution has generated massive amounts of novel biological data from a multitude of platforms that survey an ever-growing list of genomic modalities. These data require significant downstream computational and statistical analyses to yield meaningful biological insights. However, the skills required to generate these data differ greatly from those required to analyze them. Bench scientists who generate next-generation sequencing data often lack the training required to analyze these datasets and rely on support from bioinformatics specialists. Dedicated computational training is required to empower biologists in genomic data analysis; however, learning to use a command line interface efficiently is a significant barrier to learning common analytical tools. Cloud platforms have the potential to democratize access to the technical tools and computational resources necessary to work with modern sequencing data, providing an effective framework for bioinformatics education. This module aims to provide an interactive platform that gradually builds the technical skills and knowledge needed to interact with genomics data on the command line in the Cloud.
The sandbox format of this module enables users to move through the material at their own pace and test their grasp of the material with knowledge self-checks before building on that material in the next sub-module.

Funder

National Institute of General Medical Sciences

Publisher

Oxford University Press (OUP)

References (13 articles)

1. Lei. NIGMS Sandbox: A Learning Platform toward Democratizing Cloud Computing for Biomedical Research. Brief Bioinform.

2. Berger. Navigating bottlenecks and trade-offs in genomic data analysis. Nat Rev Genet, 2023.

3. Goodwin. Coming of age: ten years of next-generation sequencing technologies. Nat Rev Genet, 2016.

4. Nurk. The complete sequence of a human genome. Science, 2022.

5. Wu. A new coronavirus associated with human respiratory disease in China. Nature, 2020.
