Abstract
Background
Updating systematic reviews is often a time-consuming process that involves considerable human effort, and it is therefore not carried out as often as it should be. Our aim was to explore the potential of machine learning methods to reduce the human workload and, in particular, to gauge the performance of deep learning methods against more established machine learning methods.
Methods
We used three available reviews of diagnostic test studies as the data basis. To identify relevant publications, we used typical text preprocessing methods. The reference standard for the evaluation was the human-consensus-based binary classification (inclusion, exclusion). To evaluate the models, we generated various scenarios using a grid of combinations of data preprocessing steps. Furthermore, we evaluated each machine learning approach with an approach-specific predefined grid of tuning parameters, using the Brier score as the metric.
Results
The best performance was obtained with an ensemble method for two of the reviews and with a deep learning approach for the third. Yet the final performance of the approaches depends strongly on data preparation. Overall, machine learning methods provided reasonable classification.
Conclusion
It seems possible to reduce the human workload in updating systematic reviews by using machine learning methods. Yet, as the influence of data preprocessing on the final performance seems to be at least as important as the choice of the specific machine learning approach, users should not blindly expect good performance just from using approaches from a popular class, such as deep learning.
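For readers unfamiliar with the Brier score used as the evaluation metric, the following is a minimal sketch of how it could be computed for a binary inclusion/exclusion classifier. The function name, labels, and probabilities are hypothetical illustrations and are not taken from the study; the sketch only shows the standard definition (mean squared difference between predicted probability and the 0/1 reference label), where lower values indicate better probabilistic predictions.

```python
def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probabilities and 0/1 labels.

    y_true: reference labels (1 = inclusion, 0 = exclusion)
    y_prob: predicted probabilities of inclusion, one per publication
    """
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if not y_true:
        raise ValueError("at least one observation is required")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical example data, not from the reviews used in the study.
labels = [1, 0, 0, 1]          # human-consensus decisions
probs = [0.9, 0.2, 0.1, 0.6]   # model-predicted inclusion probabilities
score = brier_score(labels, probs)  # approximately 0.055
```

A perfect classifier that assigns probability 1 to every included and 0 to every excluded publication would score 0, while always predicting 0.5 would score 0.25.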
Publisher
Cold Spring Harbor Laboratory
Cited by
1 article.