Abstract
Logistic regression has demonstrated its utility in classifying binary-labeled datasets through the maximum likelihood approach. However, in numerous biological and clinical contexts, the aim is often to determine coefficients that yield the highest sensitivity at a pre-specified specificity, or vice versa. Because maximum likelihood does not target this objective directly, the application of logistic regression is limited in such settings. To address this, we have developed an improved regression framework, SMAGS, for binary classification that, for a given specificity, finds the linear decision rule that yields the maximum sensitivity. Furthermore, we employed the method for feature selection to find the features that best serve the sensitivity maximization goal. We compared our method with standard logistic regression by applying both to real clinical data as well as synthetic data. In the real application data (a colorectal cancer dataset), we found a 14% improvement in sensitivity at 98.5% specificity.
Availability and implementation
Software is made available in Python (https://github.com/smahmoodghasemi/SMAGS).
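The objective stated in the abstract, a linear decision rule with maximum sensitivity at a fixed specificity, can be illustrated with a minimal sketch. This is not the SMAGS implementation from the repository: the function names `sensitivity_at_specificity` and `fit_max_sensitivity`, the random-search optimizer, and its step size and iteration count are all assumptions made for illustration. For any coefficient vector, the threshold is set from the scores of the negative class so that the target specificity is met on the training data, and the search then keeps the direction with the highest resulting sensitivity.

```python
import numpy as np


def sensitivity_at_specificity(w, X, y, specificity=0.985):
    """Sensitivity of the rule `X @ w > t` (hypothetical helper).

    The threshold t is the smallest negative-class score such that at least
    `specificity` of the negatives score at or below it.
    """
    scores = X @ w
    neg = np.sort(scores[y == 0])
    # Smallest index k with (k + 1) / n_neg >= specificity.
    k = int(np.ceil(specificity * neg.size)) - 1
    k = min(max(k, 0), neg.size - 1)
    t = neg[k]
    sens = float(np.mean(scores[y == 1] > t))
    return sens, t


def fit_max_sensitivity(X, y, specificity=0.985, n_iter=2000, seed=0):
    """Random search over unit-norm directions (an assumed optimizer,
    chosen for brevity; it is not the optimizer used by SMAGS)."""
    rng = np.random.default_rng(seed)
    # Start from the difference of class means, a cheap initial direction.
    w_best = X[y == 1].mean(axis=0) - X[y == 0].mean(axis=0)
    w_best /= np.linalg.norm(w_best)
    best, t_best = sensitivity_at_specificity(w_best, X, y, specificity)
    for _ in range(n_iter):
        # Perturb the current best direction and renormalize.
        w = w_best + 0.3 * rng.standard_normal(X.shape[1])
        w /= np.linalg.norm(w)
        sens, t = sensitivity_at_specificity(w, X, y, specificity)
        if sens > best:  # keep only strict improvements
            w_best, best, t_best = w, sens, t
    return w_best, t_best, best
```

Calling `fit_max_sensitivity(X, y, specificity=0.985)` on a feature matrix `X` and 0/1 label vector `y` returns the direction, the threshold, and the training sensitivity. The sensitivity is a step function of the coefficients, so gradient-based optimizers do not apply directly; the derivative-free search above is one simple way around that, and the reported sensitivity is an in-sample value that would need held-out validation.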
Publisher
Cold Spring Harbor Laboratory