Abstract
The progress and utility of synthetic biology are currently hindered by the lengthy process of studying literature and replicating poorly documented work. Reconstructing crucial design information through post-hoc curation is noisy and error-prone. To combat this, author participation during the curation process is crucial. To encourage author participation without overburdening authors, an ML-assisted curation tool called SeqImprove has been developed. Using named entity recognition, named entity normalization, and sequence matching, SeqImprove creates machine-readable sequence data and metadata annotations, which authors can then review and edit before submitting a final sequence file. SeqImprove makes it easier for authors to submit FAIR (findable, accessible, interoperable, and reusable) sequence data.
Publisher
Cold Spring Harbor Laboratory