Abstract
Leprosy is a dermatoneurological disease that can cause irreversible nerve damage. Besides mimicking various rheumatological, neurological, and dermatological diseases, leprosy is underdiagnosed because many health professionals lack training in recognizing it. The World Health Organization established active search for new leprosy cases as one of the four pillars of its global leprosy strategy, which aims to detect cases early, before visible disabilities occur. The Leprosy Suspicion Questionnaire (LSQ) was created as a screening tool to actively detect new cases; it comprises 14 simple yes/no questions that can be answered with the help of a health professional or by patients themselves. During its development, it was observed that combinations of marked questions were related to the detection of new cases. To improve its performance and expand its use, we developed MaLeSQs, a machine learning tool whose output is either LSQ Positive, when the subject should be referred for further clinical evaluation, or LSQ Negative, when the subject presents no evidence that justifies further evaluation for leprosy. To obtain an efficient product, we trained four classifiers with different learning paradigms: Support Vector Machine, Logistic Regression, Random Forest, and XGBoost. We compared them based on sensitivity, specificity, positive predictive value, negative predictive value, and area under the ROC curve. After training, the Support Vector Machine was the classifier with the most balanced metrics, and it was chosen as MaLeSQs. Using Shapley values, we evaluated variable importance and found that nerve symptoms were important for differentiating subjects who potentially had leprosy from those who did not. The results highlight the potential of machine learning algorithms to help improve health care coverage and strengthen leprosy control strategies.
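
As an illustration of the comparison metrics named above, the following minimal sketch computes sensitivity, specificity, positive predictive value, and negative predictive value from binary screening outcomes. The function name `screening_metrics` and the example labels are hypothetical and are not taken from the study; the area under the ROC curve is omitted because it requires continuous classifier scores rather than binary outputs.

```python
def screening_metrics(y_true, y_pred):
    """Compute screening metrics from binary labels.

    y_true: 1 if the subject is a confirmed case, 0 otherwise.
    y_pred: 1 for a positive screening output, 0 for a negative one.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives

    def ratio(num, den):
        # Return NaN when a metric is undefined (empty denominator).
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # detected cases / all cases
        "specificity": ratio(tn, tn + fp),  # cleared non-cases / all non-cases
        "ppv": ratio(tp, tp + fp),          # true cases among positive outputs
        "npv": ratio(tn, tn + fn),          # non-cases among negative outputs
    }


# Hypothetical example data, for illustration only (not from the study).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = screening_metrics(y_true, y_pred)
```

For a screening tool, sensitivity and negative predictive value are usually read together: a missed case (false negative) delays diagnosis, while a false positive only leads to an additional clinical evaluation.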