Author:
Ruusmann Villu, Sild Sulev, Maran Uko
Abstract
Background
Research efforts in the field of descriptive and predictive Quantitative Structure-Activity Relationships or Quantitative Structure–Property Relationships produce around one thousand scientific publications annually. Materials and results are communicated mainly through printed media. Printed media in their present form have obvious limitations when it comes to effectively representing mathematical models, including complex and non-linear ones, and large bodies of associated numerical chemical data. They do not support secondary information extraction or reuse, while in silico studies pose additional requirements for accessibility, transparency and reproducibility of the research. This gap can and should be bridged by introducing domain-specific digital data exchange standards and tools. The current publication presents a formal specification of a data organization and archival format for quantitative structure-activity relationships, called the QSAR DataBank (QsarDB for short, or QDB for shorter).
Results
The article describes the QsarDB data schema, which formalizes QSAR concepts (objects and the relationships between them), and the QsarDB data format, which formalizes their presentation for computer systems. The utility and benefits of QsarDB have been thoroughly tested by solving everyday QSAR and predictive modeling problems, with examples in the field of predictive toxicology, and the format can be applied to a wide variety of other endpoints. The work is accompanied by an open source reference implementation and tools.
Conclusions
The proposed open data, open source, and open standards design is open to public and proprietary extensions on many levels. Selected use cases exemplify the benefits of the proposed QsarDB data format. General ideas for future development are discussed.
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences, Computer Graphics and Computer-Aided Design, Physical and Theoretical Chemistry, Computer Science Applications
Cited by
41 articles.