Abstract
Identifying cancer driver genes and their interrelations is critical to understanding the mechanisms of cancer progression. In this paper, we introduce ToMExO, a probabilistic method for inferring cancer driver genes and how they affect each other, using cross-sectional data from cohorts of tumors. We model cancer progression dynamics using a tree with sets of driver genes in the nodes. This model explains the temporal order among driver mutations and their mutual exclusivity patterns. We introduce a dynamic programming procedure for the likelihood calculation and build an MCMC inference algorithm. Together with our engineered MCMC moves, our efficient likelihood calculations enable us to work with datasets containing hundreds of genes and thousands of tumors. We demonstrate our method’s performance on several synthetic datasets covering various scenarios for cancer progression dynamics. We then present analyses of several biological datasets using ToMExO and validate the results using a set of method-independent metrics.
Publisher
Cold Spring Harbor Laboratory