Abstract
Gene fusions are important cancer drivers and drug targets, but are difficult to reliably identify with short-read RNA-sequencing. Long-read RNA sequencing data are more likely to span a fusion breakpoint and provide more sequence context around the breakpoint. This allows for more reliable identification of gene fusions and for detecting alternative splicing in gene fusions. Alternative splicing of fusions has been shown to be a mechanism for drug resistance and altered levels of oncogenicity. We have created FLAIR-fusion, a computational tool to identify gene fusions and their isoforms from long-read RNA-sequencing data. FLAIR-fusion can detect simulated fusions and their isoforms with high precision and recall even with error-prone reads. It can also reliably call known fusions in multiple cancer cell lines, and the library preparation method has no consistent effect on the number of total or previously validated fusions detected across cell lines. To demonstrate potential clinical utility, we ran FLAIR-fusion on amplicon sequencing from multiple tumor samples and cell lines and detected alternative splicing in the previously validated fusion PIWIL4-GUCYA2, which could have implications in the treatment of lung cancers with this mutation. We also detect fusion isoforms from long-read sequencing in chronic lymphocytic leukemias with and without a splicing factor mutation, SF3B1 K700E, and find that up to 10% of gene fusions had more than one unique isoform. Our results demonstrate that gene fusion isoforms can be effectively detected from long-read RNA-sequencing and are important in the characterization of the full complexity of cancer transcriptomes.
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.