Author:
Hleap Jose Sergio,Cristescu Melania E.,Steinke Dirk
Abstract
AbstractSummaryAmplicons to Global Gene (A2G2) is a Python wrapper that uses MAFFT and an “Amplicon to Gene” strategy to align very large numbers of sequences while improving alignment accuracy. It is specially developed to deal with conserved genes, where traditional aligners introduce a significant amount of gaps. A2G2 leverages the add sequences option of MAFFT to align the sequences to a global reference gene and a local reference region. Both of these references can be consensus sequences of trusted sources. Efficient parallelization of these tasks allows A2G2 to align a very large number of sequences (> 500K) in a reasonable amount of time. A2G2 can be imported in Python for easier integration with other software, or can be run via command line.AvailabilityA2G2 is implemented in Python 3 (3.6) and depends on MAFFT availability. Other package requirements can be found in the requirements.txt file at https://github.com/jshleap/A2G. A2G2 is also available via PyPi (https://pypi.org/project/A2G). It is licensed under the LGPLv3.Supplementary informationSupplementary material is available at github as jupyter notebook.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献