Affiliation:
1. CSIC-Universidad Politecnica de Valencia, Spain
Abstract
Data mining technology is increasingly employed in new industrial processes, which require automatic analysis of data and related results in order to reach conclusions quickly. However, for some applications, complete automation may not be appropriate. Unlike traditional data mining contexts, which deal with voluminous amounts of data, some domains are characterized by a scarcity of data, owing to the cost and time involved in conducting simulations or setting up experimental apparatus for data collection. In such domains, it is therefore prudent to balance the speed gained through automation against the utility of the generated data. The authors review the active learning methodology and present a new one that successively generates new samples in order to reach an improved final estimate of the entire search space investigated, drawing on the knowledge accumulated iteratively through sample selection and the corresponding results. The methodology is shown to be of great interest for applications such as high-throughput materials science, and especially heterogeneous catalysis, where chemists have no prior knowledge with which to direct and guide the exploration.