Abstract
Data curation practices of the Crystallography Open Database (COD) are described, with particular focus on formal validation using the Crystallographic Information Framework (CIF). The cif_validate program, capable of validating CIF files against both DDL1 and DDLm dictionaries, is presented and used to process the entirety of the COD. Validation results collected from over 450 000 CIF files are shown to be a useful resource both in the data maintenance process and in the development of the underlying ontologies. A set of programs intended to aid dictionary migration from DDL1 to DDLm is also presented.
Funder
Research Council of Lithuania
Publisher
International Union of Crystallography (IUCr)
Subject
General Biochemistry, Genetics and Molecular Biology
Cited by
233 articles.