BACKGROUND
HIV self-testing (HIVST) has been rapidly scaled up, and additional strategies can further expand testing uptake. In secondary distribution, people (indexes) apply for multiple kits and pass them on to people (alters) in their social networks. However, identifying key influencers for secondary distribution is difficult.
OBJECTIVE
This study aimed to develop an innovative ensemble machine learning approach to identify key influencers among Chinese men who have sex with men (MSM) for HIVST secondary distribution.
METHODS
We defined three types of key influencers: 1) key distributors who can distribute more kits; 2) key promoters who can contribute to finding first-time testing alters; 3) key detectors who can help to find positive alters. Four machine learning models (logistic regression, support vector machine, decision tree, random forest) were trained to identify key influencers. An ensemble learning algorithm was adopted to combine these four models. Simulation experiments were run to validate our approach.
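The combination step can be illustrated with a minimal sketch. The abstract does not state which ensemble algorithm was used, so the example below assumes simple soft voting (averaging the predicted probabilities of the four models and applying a cut-off); the function name `soft_vote`, the 0.5 threshold, and the probabilities are all hypothetical and are not taken from the study.

```python
from statistics import mean


def soft_vote(prob_by_model, threshold=0.5):
    """Average each index's predicted probability across models, then threshold.

    prob_by_model maps a model name to a list of probabilities, one per index.
    Returns the averaged probabilities and a list of booleans
    (True means the index is flagged as a key influencer).
    """
    model_names = list(prob_by_model)
    n_indexes = len(prob_by_model[model_names[0]])
    if any(len(prob_by_model[m]) != n_indexes for m in model_names):
        raise ValueError("every model must score the same indexes")
    averaged = [
        mean(prob_by_model[m][i] for m in model_names) for i in range(n_indexes)
    ]
    flagged = [p >= threshold for p in averaged]
    return averaged, flagged


# Hypothetical probabilities for three indexes (illustrative values only).
probs = {
    "logistic_regression": [0.9, 0.4, 0.2],
    "support_vector_machine": [0.8, 0.6, 0.1],
    "decision_tree": [1.0, 0.0, 0.0],
    "random_forest": [0.7, 0.6, 0.3],
}
avg, is_key = soft_vote(probs)
```

In this sketch, one such ensemble would be fitted per influencer type (key distributor, key promoter, key detector), each with its own outcome label.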
RESULTS
A total of 309 indexes distributed kits to 269 alters. Our approach outperformed human identification (based on self-reported scale cut-offs), exceeding it in accuracy by an average of 11·0%. It could distribute 18·2% (95% CI: 9·9% to 26·5%) more kits and find 13·6% (95% CI: 1·9% to 25·3%) more first-time testing alters and 12·0% (95% CI: -14·7% to 38·7%) more positive-testing alters. Our approach could also increase simulated intervention efficiency by 17·7% (95% CI: -3·5% to 38·8%) compared with human identification.
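As a reading aid, the short sketch below takes the point estimates and 95% confidence intervals reported above and checks which intervals exclude zero. The dictionary keys and the helper `excludes_zero` are illustrative names, not part of the study, and the check is a simple interval reading rather than a re-analysis.

```python
# Reported improvements over human identification, in percent:
# (point estimate, lower 95% CI bound, upper 95% CI bound)
reported = {
    "kits distributed": (18.2, 9.9, 26.5),
    "first-time testing alters": (13.6, 1.9, 25.3),
    "positive-testing alters": (12.0, -14.7, 38.7),
    "simulated intervention efficiency": (17.7, -3.5, 38.8),
}


def excludes_zero(lower, upper):
    """Return True when the whole interval lies on one side of zero."""
    return lower > 0 or upper < 0


summary = {
    name: excludes_zero(lower, upper)
    for name, (_, lower, upper) in reported.items()
}

for name, clear in summary.items():
    status = "excludes zero" if clear else "includes zero"
    print(f"{name}: 95% CI {status}")
```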
CONCLUSIONS
We built machine learning models to identify key influencers among Chinese MSM who were more likely to engage in HIVST secondary distribution.
CLINICALTRIAL
This study was a secondary modeling analysis of a randomized controlled trial (RCT) registered with the Chinese Clinical Trial Registry (ChiCTR), registration number ChiCTR1900025433.