Multiple-instance learning of somatic mutations for the classification of tumour type and the prediction of microsatellite status

Published: 2023-11-02 Issue: 1 Volume: 8 Page: 57-67
ISSN: 2157-846X
Container-title: Nature Biomedical Engineering
Language: en
Short-container-title: Nat. Biomed. Eng.

Author:

Anaya Jordan, Sidhom John-William, Mahmood Faisal, Baras Alexander S.

Abstract

Large-scale genomic data are well suited to analysis by deep learning algorithms. However, for many genomic datasets, labels are at the level of the sample rather than for individual genomic measures. Machine learning models leveraging these datasets generate predictions by using statically encoded measures that are then aggregated at the sample level. Here we show that a single weakly supervised end-to-end multiple-instance-learning model with multi-headed attention can be trained to encode and aggregate the local sequence context or genomic position of somatic mutations, hence allowing for the modelling of the importance of individual measures for sample-level classification and thus providing enhanced explainability. The model solves synthetic tasks that conventional models fail at, and achieves best-in-class performance for the classification of tumour type and for predicting microsatellite status. By improving the performance of tasks that require aggregate information from genomic datasets, multiple-instance deep learning may generate biological insight.
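
The aggregation step the abstract describes can be illustrated with a minimal sketch of multi-headed attention pooling over the instances (mutations) of one sample. This is not the authors' implementation: the function name `attention_mil_pool`, the linear scoring, the softmax normalisation over instances, and all shapes and random inputs are assumptions chosen only to show how per-instance attention weights produce both a sample-level vector and a per-mutation importance score.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the chosen axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_mil_pool(instances, w_attn):
    """Hypothetical attention pooling for one sample (bag).

    instances: (n_instances, d) array of encoded mutations.
    w_attn:    (d, n_heads) array of attention parameters (assumed linear scoring).
    Returns the pooled sample vector, shape (n_heads * d,), and the
    attention weights, shape (n_instances, n_heads).
    """
    scores = instances @ w_attn          # one score per instance per head
    weights = softmax(scores, axis=0)    # normalise over instances within each head
    pooled = weights.T @ instances       # (n_heads, d): weighted sum per head
    return pooled.reshape(-1), weights

# Illustrative random inputs; real encodings would come from a trained encoder.
rng = np.random.default_rng(0)
n_mutations, d, n_heads = 6, 4, 2
encoded = rng.normal(size=(n_mutations, d))
w_attn = rng.normal(size=(d, n_heads))

sample_vec, weights = attention_mil_pool(encoded, w_attn)
```

In a sketch like this, `sample_vec` would feed a sample-level classifier, while each column of `weights` indicates how much one head attends to each mutation, which is the kind of per-instance importance the abstract links to explainability.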

Publisher

Springer Science and Business Media LLC

Subject

Computer Science Applications, Biomedical Engineering, Medicine (miscellaneous), Bioengineering, Biotechnology

Link

https://www.nature.com/articles/s41551-023-01120-3.pdf

Reference: 38 articles.

1. Ching, T. et al. Opportunities and obstacles for deep learning in biology and medicine. J. R. Soc. Interface 15, 20170387 (2018).

2. Zhou, J. & Troyanskaya, O. G. Predicting effects of noncoding variants with deep learning-based sequence model. Nat. Methods 12, 931–934 (2015).

3. Routhier, E. & Mozziconacci, J. Genomics enters the deep learning era. PeerJ 10, e13613 (2022).

4. Altman, N. S. & Krzywinski, M. The curse(s) of dimensionality. Nat. Methods 15, 399–400 (2018).

5. Elmarakeby, H. A. et al. Biologically informed deep neural network for prostate cancer discovery. Nature 598, 348–352 (2021).

Cited by 3 articles.

1. Lynch Syndrome and Somatic Mismatch Repair Variants in Pancreas Cancer;JAMA Oncology;2024-09-05

2. A guide to artificial intelligence for cancer researchers;Nature Reviews Cancer;2024-05-16

3. Machine learning enabled prediction of digital biomarkers from whole slide histopathology images;2024-01-08