A Method to Extract Feature Variables Contributed in Nonlinear Machine Learning Prediction

Published:2020-02 Issue:01 Volume:59 Page:001-008
ISSN:0026-1270
Container-title:Methods of Information in Medicine
language:en
Short-container-title:Methods Inf Med

Author:

Suzuki Mayumi¹, Shibahara Takuma¹, Muragaki Yoshihiro²

Affiliation:

1. Hitachi, Ltd. Research and Development Group, Tokyo, Japan

2. Faculty of Advanced Techno-Surgery, Institute of Advanced Biomedical Engineering and Science, Graduate School of Medicine, Department of Neurosurgery, Neurological Institute, Tokyo Women’s Medical University, Tokyo, Japan

Abstract

Background: Although new machine learning methods such as support vector machines and deep neural networks have improved prediction accuracy, they produce nonlinear models and thus lack the ability to explain the basis of their predictions. Improving their explanatory capabilities would increase the reliability of their predictions.

Objective: Our objective was to develop a factor analysis technique that can present the feature variables used in making predictions, even in nonlinear machine learning models.

Methods: The factor analysis technique consisted of two techniques: a backward analysis technique and a factor extraction technique. The factor extraction technique we developed extracts feature variables from the posterior probability distribution of a machine learning model, which is calculated by the backward analysis technique.

Results: In an evaluation using gene expression data from prostate tumor patients and healthy subjects, the prediction accuracy of a deep neural network model was approximately 5% better than that of a support vector machine model. The rate of concordance between the feature variables extracted in an earlier report using Jensen–Shannon divergence and those extracted in this report using backward elimination using the Hilbert–Schmidt independence criterion was 40% for the top five variables, 40% for the top 10, and 49% for the top 100.

Conclusion: The results showed that models can be evaluated from different viewpoints by using different factor extraction techniques. In the future, we hope to use this technique to verify the characteristics of the features extracted by the factor extraction technique, and to perform clinical studies using the genes we extracted in this experiment.
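The rate of concordance reported in the Results can be illustrated with a minimal sketch. It assumes that concordance means the fraction of the top-k variables shared by two rankings; the abstract does not state the exact definition used in the paper. The function name `top_k_concordance` and the feature names are hypothetical and are not data from the study.

```python
def top_k_concordance(ranking_a, ranking_b, k):
    """Return the fraction of the top-k items shared by two rankings.

    Assumed definition: |top_k(a) & top_k(b)| / k. The paper may define
    concordance differently.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    top_a = set(ranking_a[:k])
    top_b = set(ranking_b[:k])
    return len(top_a & top_b) / k


# Hypothetical feature rankings, most important first (not data from the study).
ranking_method_1 = ["g1", "g2", "g3", "g4", "g5", "g6"]
ranking_method_2 = ["g2", "g7", "g1", "g8", "g9", "g3"]

# Two of the top five features (g1 and g2) appear in both rankings.
concordance_top5 = top_k_concordance(ranking_method_1, ranking_method_2, 5)
print(f"Top-5 concordance: {concordance_top5:.0%}")
```

With these made-up rankings, two of the five top features overlap, so the sketch reports 40%. Order within the top k is ignored under this definition.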

Publisher

Georg Thieme Verlag KG

Subject

Health Information Management, Advanced and Specialized Nursing, Health Informatics

Link

http://www.thieme-connect.de/products/ejournals/pdf/10.1055/s-0040-1701615.pdf

References (12 articles)

1. Divergence measures based on the Shannon entropy;J Lin;IEEE Trans Inf Theory,1991

2. Exchange Monte Carlo method and application to spin glass simulations;K Hukushima;J Phys Soc Jpn,1996

3. Gene expression correlates of clinical prostate cancer behavior;D Singh;Cancer Cell,2002

4. Random search for hyper-parameter optimization;J Bergstra;J Mach Learn Res,2012

Cited by 2 articles.

1. Prediction of mustard yield using different machine learning techniques: a case study of Rajasthan, India;International Journal of Biometeorology;2023-01-31

2. An Explainable Knowledge-Based System Using Subjective Preferences and Objective Data for Ranking Decision Alternatives;Methods of Information in Medicine;2022-09