Author:
Vaisband Marc, Schubert Maria, Gassner Franz Josef, Geisberger Roland, Greil Richard, Zaborsky Nadja, Hasenauer Jan
Abstract
Accurate somatic variant calling from next-generation sequencing data is one of the most important tasks in personalised cancer therapy. The sophistication of the available technologies is ever-increasing, yet manual candidate refinement is still a necessary step in state-of-the-art processing pipelines. This limits reproducibility and introduces a bottleneck with respect to scalability. We demonstrate that the validation of genetic variants can be improved using a machine learning approach based on a Convolutional Neural Network, trained using existing human annotation. In contrast to existing approaches, we introduce a way in which contextual data from sequencing tracks can be included in the automated assessment. A rigorous evaluation shows that the resulting model is robust and performs on par with trained researchers following a published standard operating procedure.
Funder
Salzburger Landesregierung
Austrian Science Fund
Deutsche Forschungsgemeinschaft
Rheinische Friedrich-Wilhelms-Universität Bonn
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology