Reference genome and annotation updates lead to contradictory prognostic predictions in gene expression signatures: a case study of resected stage I lung adenocarcinoma

Author:

Zhang Zheyang, Zhang Sainan, Li Xin, Zhao Zhangxiang, Chen Changjing, Zhang Juxuan, Li Mengyue, Wei Zixin, Jiang Wenbin, Pan Bo, Li Ying, Liu Yixin, Cao Yingyue, Zhao Wenyuan, Gu Yunyan, Yu Yan, Meng Qingwei, Qi Lishuang

Abstract

RNA-sequencing enables accurate and low-cost transcriptome-wide detection. However, expression estimates vary as reference genomes and gene annotations are updated, confounding existing expression-based prognostic signatures. Herein, a prognostic 9-gene pair signature (9-GPS) was applied to 197 patients with stage I lung adenocarcinoma, using the previous and latest data from The Cancer Genome Atlas (TCGA), which were processed with different reference genomes and annotations. For 9-GPS, 6.6% of patients exhibited discordant risk classifications between the two TCGA versions. Similar results were observed for other prognostic signatures, including IRGPI, 15-gene and ORACLE. We found that conflicting annotations for gene length and overlap were the major cause of the discordant risk classifications. Therefore, we constructed a prognostic 40-GPS based on genes that were stable across GENCODE v20–v30 and validated it using public data of 471 stage I samples (log-rank P < 0.0010). Risk classification remained stable in RNA-sequencing data processed with the newest GENCODE v32 versus GENCODE v20–v30. Specifically, 40-GPS could predict survival for 30 stage I samples with formalin-fixed paraffin-embedded tissues (log-rank P = 0.0177). In conclusion, this method overcomes the vulnerability of existing prognostic signatures to reference genome and annotation updates. 40-GPS may offer individualized clinical applications due to its prognostic accuracy and classification stability.
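The discordance measure reported above can be illustrated with a minimal sketch. The abstract does not spell out the classification rule, so the rule used here is an assumption: each gene pair casts a high-risk vote when the first gene is expressed above the second, and a patient is called high risk when at least half of the pairs vote that way. The gene names, expression values, and function names are hypothetical and are not taken from the study.

```python
from typing import Dict, List, Tuple

GenePair = Tuple[str, str]
Expression = Dict[str, float]


def classify(expr: Expression, pairs: List[GenePair], threshold: float = 0.5) -> str:
    """Assumed voting rule: a pair (a, b) votes high risk when expr[a] > expr[b]."""
    votes = sum(1 for a, b in pairs if expr[a] > expr[b])
    return "high" if votes >= threshold * len(pairs) else "low"


def discordance_rate(
    old: Dict[str, Expression],
    new: Dict[str, Expression],
    pairs: List[GenePair],
) -> float:
    """Fraction of shared patients whose risk group differs between two data versions."""
    patients = old.keys() & new.keys()
    discordant = sum(
        1 for p in patients if classify(old[p], pairs) != classify(new[p], pairs)
    )
    return discordant / len(patients)


# Hypothetical three-pair signature and toy expression values.
pairs = [("G1", "G2"), ("G3", "G4"), ("G5", "G6")]

old_version = {
    "P1": {"G1": 5.0, "G2": 3.0, "G3": 2.0, "G4": 4.0, "G5": 7.0, "G6": 1.0},
    "P2": {"G1": 1.0, "G2": 6.0, "G3": 2.0, "G4": 8.0, "G5": 3.0, "G6": 9.0},
}
# Same patients after reprocessing; G1 in P1 is re-estimated lower,
# for example because its annotated gene length changed.
new_version = {
    "P1": {"G1": 2.5, "G2": 3.0, "G3": 2.0, "G4": 4.0, "G5": 7.0, "G6": 1.0},
    "P2": {"G1": 1.0, "G2": 6.0, "G3": 2.0, "G4": 8.0, "G5": 3.0, "G6": 9.0},
}

rate = discordance_rate(old_version, new_version, pairs)
print(f"Discordant classifications: {rate:.1%}")
```

In this toy case a single re-estimated gene flips one pair's ordering, which is enough to move patient P1 across the decision boundary, while patient P2 keeps the same classification in both versions.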

Funder

National Natural Science Foundation of China

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

