Author:
Yabuuchi Hiroaki, Fujiwara Makiko, Shigemoto Akihiko, Hayashi Kazuhito, Nomura Yuhei, Nakashima Mayumi, Ogusu Takeshi, Mori Megumi, Tokumoto Shin-ichi, Miyai Kazuyuki
Abstract
Plants are valuable resources for drug discovery because they produce diverse bioactive compounds. However, this chemical diversity makes it difficult to predict the biological activity of plant extracts with conventional chemometric methods. In this research, we propose a new computational model that integrates chemical composition data with a structure-based chemical ontology. For model validation, two training datasets were prepared from the literature on antibacterial essential oils to classify oils as active or inactive. Random forest classifiers constructed from the data showed improved prediction performance in both test datasets. Prior feature selection using a hierarchical information criterion further improved the performance. Furthermore, an antibacterial assay using a standard strain of Staphylococcus aureus revealed that the classifier correctly predicted the activity of commercially available oils with an accuracy of 83% (10/12). The results of this study indicate that machine learning of chemical composition data integrated with chemical ontology can be a highly efficient approach for exploring bioactive plant extracts.
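The integration step described in the abstract can be illustrated with a minimal sketch. It assumes that integration means adding the content of each compound to every ontology class the compound belongs to, which the abstract does not spell out. The ontology mapping, the compound percentages, and the function name `ontology_features` are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical ontology: each compound maps to its structure-based classes.
# Class names and memberships are illustrative only.
ONTOLOGY = {
    "thymol": ["monoterpenoid", "phenol"],
    "carvacrol": ["monoterpenoid", "phenol"],
    "limonene": ["monoterpenoid"],
}


def ontology_features(composition, ontology=ONTOLOGY):
    """Return compound-level and class-level content (percent) for one oil."""
    features = defaultdict(float)
    for compound, percent in composition.items():
        # Keep the original compound-level feature.
        features[compound] += percent
        # Add the same content to every ontology class of the compound.
        for chem_class in ontology.get(compound, []):
            features["class:" + chem_class] += percent
    return dict(features)


# Illustrative composition of a single essential oil (percent of total).
oil = {"thymol": 40.0, "carvacrol": 5.0, "limonene": 20.0}
features = ontology_features(oil)
```

Feature vectors built this way for each oil could then be passed to a classifier such as a random forest, so that oils sharing few individual compounds can still share class-level features.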
Funder
Kayamori Foundation of Informational Science Advancement
Publisher
Springer Science and Business Media LLC