TSAFinder: exhaustive tumor-specific antigen detection with RNAseq

Author:

Sharpnack Michael F (1), Johnson Travis S (1), Chalkley Robert (2), Han Zhi (3), Carbone David (1), Huang Kun (3), He Kai (1)

Affiliation:

1. Department of Internal Medicine, Comprehensive Cancer Center, The Ohio State University, Columbus, OH 43210, USA

2. Department of Pharmaceutical Chemistry, University of California San Francisco, San Francisco, CA 94158, USA

3. Department of Biostatistics and Health Data Science, Indiana University School of Medicine, Indianapolis, IN 46202, USA

Abstract

Motivation: Tumor-specific antigen (TSA) identification in human cancer predicts response to immunotherapy and provides targets for cancer vaccine and adoptive T-cell therapies with curative potential. TSAs that are highly expressed at the RNA level are more likely to be presented on major histocompatibility complex (MHC)-I, so direct measurement of the RNA expression of peptides would allow for generalized prediction of TSAs. Human leukocyte antigen (HLA)-I genotypes were predicted with seq2HLA. RNA sequencing (RNAseq) fastq files were translated into all possible peptides of length 8–11, and peptides with high expression in the tumor samples and low expression in the control samples were tested for their MHC-I binding potential with netMHCpan-4.0.

Results: A novel pipeline for TSA prediction from RNAseq was used to predict all possible unique peptides of length 8–11 on previously published murine and human lung and lymphoma tumors, and was validated on matched tumor and control lung adenocarcinoma (LUAD) samples. We show that neoantigens predicted by exomeSeq are typically poorly expressed at the RNA level, and that a fraction of them are expressed in matched normal samples. TSAs presented in the proteomics data have higher RNA abundance and lower MHC-I binding percentile, and these attributes are used to discover high-confidence TSAs within the validation cohort. Finally, a subset of these high-confidence TSAs is expressed in a majority of LUAD tumors and represents attractive vaccine targets.

Availability and implementation: The datasets were derived from sources in the public domain. TSAFinder is open-source software written in Python and R. It is licensed under CC-BY-NC-SA and can be downloaded at https://github.com/RNAseqTSA.

Supplementary information: Supplementary data are available at Bioinformatics online.
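The peptide enumeration step described in the abstract can be illustrated with a minimal Python sketch. This is not the TSAFinder implementation: the six-frame translation, the per-read counting scheme, the function names and the `min_tumor` and `max_control` thresholds are all assumptions made for illustration, and the HLA typing (seq2HLA) and MHC-I binding prediction (netMHCpan-4.0) steps are not covered.

```python
from collections import Counter
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(seq):
    """Translate a DNA string codon by codon; unrecognised codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(read):
    """Yield the translation of all six reading frames (assumed, not confirmed)."""
    reverse_complement = read.translate(COMPLEMENT)[::-1]
    for strand in (read, reverse_complement):
        for offset in range(3):
            yield translate(strand[offset:])


def peptides_from_read(read, min_len=8, max_len=11):
    """Return the set of peptides of length min_len..max_len encoded by one read."""
    found = set()
    for protein in six_frames(read.upper()):
        # Stop codons ('*') split the translation into open segments.
        for segment in protein.split("*"):
            for k in range(min_len, max_len + 1):
                for i in range(len(segment) - k + 1):
                    peptide = segment[i:i + k]
                    if "X" not in peptide:  # skip peptides spanning ambiguous bases
                        found.add(peptide)
    return found


def count_peptides(reads):
    """Count, for each peptide, the number of reads that encode it."""
    counts = Counter()
    for read in reads:
        counts.update(peptides_from_read(read))
    return counts


def tumor_specific(tumor_counts, control_counts, min_tumor=2, max_control=0):
    """Keep peptides with high tumor and low control support (hypothetical cutoffs)."""
    return {
        peptide: n
        for peptide, n in tumor_counts.items()
        if n >= min_tumor and control_counts.get(peptide, 0) <= max_control
    }
```

For example, `tumor_specific(count_peptides(tumor_reads), count_peptides(control_reads))` would return the candidate peptides that a downstream binding predictor could then score. Counting reads rather than normalised expression is a simplification; a real analysis would need to account for sequencing depth.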

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

