Abstract
The development of models to predict sensitivity to anticancer drugs is an area of significant interest, given the diverse responses to treatment among patients and the considerable expense and time involved in anticancer drug development. Leveraging “omic” data and anticancer response information from the Cancer Cell Line Encyclopedia, we propose a novel approach utilizing multitask learning to enhance prediction accuracy and inference. We extended a multitask learning framework called the Data Shared Lasso to develop the Data Shared Elastic Net. This enabled the construction of tissue-specific models with information sharing while maintaining the attractive properties of Elastic Net regression. By employing this approach, we observed improvements in prediction accuracy compared to single-task Elastic Net models, particularly for cell lines displaying high sensitivity to treatment. Furthermore, the Data Shared Elastic Net facilitated the identification of predictors for anticancer drug sensitivity within specific tissue types, shedding light on cellular pathways targeted by these drugs across tissues. We also investigated the impact of data leakage on modeling outcomes from previous studies, which led to underestimating prediction error and erroneous inferences.
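The idea of tissue-specific models with information sharing can be illustrated with a minimal sketch. It assumes the usual Data Shared Lasso construction, in which each coefficient vector is split into a shared part plus a per-group offset by augmenting the design matrix, and it simply swaps the lasso solver for scikit-learn's `ElasticNet`. The abstract does not state the exact penalty used by the Data Shared Elastic Net, so the function names (`build_shared_design`, `fit_data_shared_enet`), the group weights `r`, and the hyperparameter values below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from sklearn.linear_model import ElasticNet


def build_shared_design(X, groups, r=None):
    """Return Z = [X | per-group blocks] plus the sorted group labels.

    Rows from group g carry X / r[g] in that group's block and zeros in
    every other block (hypothetical weights r default to 1).
    """
    X = np.asarray(X, dtype=float)
    groups = np.asarray(groups)
    labels = np.unique(groups)
    n, p = X.shape
    Z = np.zeros((n, p * (1 + len(labels))))
    Z[:, :p] = X  # shared block: used by every sample
    for k, g in enumerate(labels):
        rows = groups == g
        weight = 1.0 if r is None else r[g]
        Z[rows, p * (k + 1):p * (k + 2)] = X[rows] / weight
    return Z, labels


def fit_data_shared_enet(X, y, groups, alpha=0.1, l1_ratio=0.5, r=None):
    """Fit one Elastic Net on the augmented design (an assumed formulation)."""
    Z, labels = build_shared_design(X, groups, r)
    model = ElasticNet(alpha=alpha, l1_ratio=l1_ratio, max_iter=10000)
    model.fit(Z, y)
    p = np.asarray(X).shape[1]
    shared = model.coef_[:p]
    per_group = {}
    for k, g in enumerate(labels):
        weight = 1.0 if r is None else r[g]
        offset = model.coef_[p * (k + 1):p * (k + 2)] / weight
        # Effective coefficients for group g: shared part plus its offset.
        per_group[g] = shared + offset
    return model, shared, per_group


# Synthetic usage: two "tissues" share feature 0; feature 1 matters only in "lung".
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5))
tissue = np.array(["lung"] * 100 + ["skin"] * 100)
y_demo = 2.0 * X_demo[:, 0] + np.where(tissue == "lung", 1.5 * X_demo[:, 1], 0.0)
y_demo = y_demo + rng.normal(scale=0.1, size=200)
_, shared_coef, tissue_coef = fit_data_shared_enet(X_demo, y_demo, tissue)
print("shared:", np.round(shared_coef, 2))
for name, coef in tissue_coef.items():
    print(name, np.round(coef, 2))
```

In this construction a predictor that is selected only in one group's offset block can be read as tissue-specific, while one selected in the shared block acts across tissues. Note that hyperparameter tuning and any feature filtering would need to happen inside the cross-validation folds to avoid the kind of data leakage the abstract warns about.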
Publisher
Cold Spring Harbor Laboratory