Affiliation:
1. Graduate Institute of Statistics and Information Science, National Changhua University of Education, No. 1, Jin-De Road, Changhua City 500207, Taiwan
Abstract
As datasets in statistical research grow in complexity and dimensionality, traditional methods for identifying interactions become harder to apply because of their restrictive model assumptions. Logic regression has emerged as an effective alternative that models Boolean combinations of binary explanatory variables. However, the simulated annealing search commonly used to fit logic regression models can be unstable. This study introduces the BLogic algorithm, which runs simulated annealing multiple times on a dataset and synthesizes the results via Bayesian model combination. The algorithm not only predicts response variables from binary explanatory variables but also computes a score for prime implicants, which helps identify key variables and their interactions in the data. In simulations with identical parameters, conventional logic regression fitted with a single run of simulated annealing loses predictive and interpretative capability once the ratio of explanatory variables to sample size exceeds 10. In contrast, the BLogic algorithm remains effective until this ratio approaches 50, which indicates greater resilience in high-dimensional settings, especially the large p, small n problem. We also demonstrate the practical performance of the BLogic algorithm on real-world data from the UK10K Project.
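The idea of fitting a Boolean model by simulated annealing several times and then pooling the runs can be sketched in a few lines of Python. This is only an illustrative toy under stated assumptions, not the BLogic implementation: the function names (`predict`, `error_rate`, `anneal`, `combination_weights`, `ensemble_predict`), the model form (an OR of two AND terms with at most two literals each), the cooling schedule, and the synthetic data are all hypothetical. In particular, the weighting rule below (weights proportional to exp(-n × training error)) is a simple stand-in for Bayesian model combination, whose details the abstract does not give.

```python
import math
import random

MAX_LITERALS = 2  # assumed cap on literals per AND term (toy choice)

def predict(model, row):
    """Evaluate an OR of AND terms; a literal is (variable index, negated)."""
    return int(any(all(row[i] != neg for i, neg in term) for term in model))

def error_rate(model, X, y):
    """Fraction of rows that the model misclassifies."""
    return sum(predict(model, r) != t for r, t in zip(X, y)) / len(y)

def random_literal(p, rng):
    return (rng.randrange(p), rng.random() < 0.5)

def neighbour(model, p, rng):
    """Copy the model and drop, add, or replace one literal in one term."""
    new = [list(term) for term in model]
    term = rng.choice(new)
    if len(term) > 1 and rng.random() < 0.3:
        term.pop(rng.randrange(len(term)))
    elif len(term) < MAX_LITERALS and rng.random() < 0.5:
        term.append(random_literal(p, rng))
    else:
        term[rng.randrange(len(term))] = random_literal(p, rng)
    return new

def anneal(X, y, p, seed, steps=600, t0=0.5, t1=0.005):
    """One simulated annealing run; returns (best model, its training error)."""
    rng = random.Random(seed)
    model = [[random_literal(p, rng)], [random_literal(p, rng)]]
    err = error_rate(model, X, y)
    best, best_err = model, err
    for k in range(steps):
        temp = t0 * (t1 / t0) ** (k / steps)  # geometric cooling (assumed)
        cand = neighbour(model, p, rng)
        cand_err = error_rate(cand, X, y)
        # Always accept improvements; accept worse moves with Metropolis probability.
        if cand_err <= err or rng.random() < math.exp((err - cand_err) / temp):
            model, err = cand, cand_err
            if err < best_err:
                best, best_err = model, err
    return best, best_err

def combination_weights(runs, n):
    """Stand-in weighting: proportional to exp(-n * error), normalized to sum to 1."""
    min_err = min(e for _, e in runs)
    raw = [math.exp(-n * (e - min_err)) for _, e in runs]
    total = sum(raw)
    return [w / total for w in raw]

def ensemble_predict(runs, weights, row):
    """Weighted vote over the models found by the individual runs."""
    vote = sum(w * predict(m, row) for (m, _), w in zip(runs, weights))
    return int(vote >= 0.5)

# Synthetic data (assumption): y = (x0 AND x1) OR x2, with 3 irrelevant variables.
data_rng = random.Random(0)
X = [[data_rng.randint(0, 1) for _ in range(6)] for _ in range(120)]
y = [int((r[0] and r[1]) or r[2]) for r in X]

runs = [anneal(X, y, p=6, seed=s) for s in range(5)]
weights = combination_weights(runs, len(y))
ensemble_err = sum(ensemble_predict(runs, weights, r) != t for r, t in zip(X, y)) / len(y)
print("per-run training errors:", [round(e, 3) for _, e in runs])
print("ensemble training error:", round(ensemble_err, 3))
```

Individual runs can end in different models depending on the seed, which is the instability the abstract refers to; pooling several runs reduces the dependence on any single search path. A faithful implementation would replace the weighting rule and the toy model space with those defined in the paper.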
Funder
National Science and Technology Council, Taiwan
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)