A Comparison of Clustering and Prediction Methods for Identifying Key Chemical–Biological Features Affecting Bioreactor Performance

Published: 2019-09-10 Volume: 7 Issue: 9 Page: 614
ISSN: 2227-9717
Journal: Processes
Language: en

Author:

Tsai Yiting, Baldwin Susan A., Siang Lim C., Gopaluni Bhushan

Abstract

Chemical–biological systems, such as bioreactors, contain stochastic and non-linear interactions which are difficult to characterize. The highly complex interactions between microbial species and communities may not be sufficiently captured using first-principles, stationary, or low-dimensional models. This paper compares and contrasts multiple data analysis strategies, which include three predictive models (random forests, support vector machines, and neural networks), three clustering models (hierarchical, Gaussian mixtures, and Dirichlet mixtures), and two feature selection approaches (mean decrease in accuracy and its conditional variant). These methods not only predict the bioreactor outcome with sufficient accuracy, but the important features correlated with said outcome are also identified. The novelty of this work lies in the extensive exploration and critique of a wide arsenal of methods instead of single methods, as observed in many papers of similar nature. The results show that random forest models predict the test set outcomes with the highest accuracy. The identified contributory features include process features which agree with domain knowledge, as well as several different biomarker operational taxonomic units (OTUs). The results reinforce the notion that both chemical and biological features significantly affect bioreactor performance. However, they also indicate that the quality of the biological features can be improved by considering non-clustering methods, which may better represent the true behaviour within the OTU communities.
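
To make the "mean decrease in accuracy" idea from the abstract concrete, here is a minimal sketch of permutation-based feature importance. It is not the authors' implementation: the data are synthetic, the function names are hypothetical, and a fixed threshold rule stands in for the random forest so that the example needs only the Python standard library. The conditional variant mentioned in the abstract is not shown.

```python
import random


def accuracy(predict, X, y):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(row) == label for row, label in zip(X, y)) / len(y)


def mean_decrease_in_accuracy(predict, X, y, n_repeats=20, seed=0):
    """Average drop in accuracy when each feature column is shuffled in turn.

    A large drop suggests the model relies on that feature; a drop near
    zero suggests the feature contributes little to the predictions.
    """
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column untouched.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data (an assumption for illustration): the label depends only
# on feature 0, while feature 1 is pure noise.
data_rng = random.Random(42)
X = [[data_rng.random(), data_rng.random()] for _ in range(200)]
y = [int(row[0] > 0.5) for row in X]


def predict(row):
    """Hypothetical stand-in classifier; any fitted model's predict would do."""
    return int(row[0] > 0.5)


importances = mean_decrease_in_accuracy(predict, X, y)
print(importances)
```

Shuffling feature 0 breaks the link between inputs and labels, so its score is clearly positive, whereas shuffling the noise feature leaves every prediction unchanged and its score is exactly zero. With a real model, the same function would be called on held-out data with that model's prediction function.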

Publisher

MDPI AG

Subject

Process Chemistry and Technology,Chemical Engineering (miscellaneous),Bioengineering

Link

https://www.mdpi.com/2227-9717/7/9/614/pdf

References: 53 articles.

1. Bagging predictors

2. Support-vector networks

3. Random forests for classification in ecology

4. Support vector machines for speaker and language recognition

5. Learning representations by back-propagating errors

Cited by 2 articles.

1. How does a reservoir affect the fish assemblage structure in a Mediterranean River (Turkey)?;River Research and Applications;2022-06-23

2. Online deep neural network-based feedback control of a Lutein bioprocess;Journal of Process Control;2021-02