DeepEBV: a deep learning model to predict Epstein–Barr virus (EBV) integration sites

Author:

Liang Jiuxing (1,2), Cui Zifeng (3), Wu Canbiao (1), Yu Yao (4,5), Tian Rui (6), Xie Hongxian (7), Jin Zhuang (3), Fan Weiwen (3), Xie Weiling (3), Huang Zhaoyue (3), Xu Wei (3), Zhu Jingjing (3), You Zeshan (3), Guo Xiaofang (8), Qiu Xiaofan (1), Ye Jiahao (1,9), Lang Bin (10), Li Mengyuan (3), Tan Songwei (11), Hu Zheng (3,12)

Affiliation:

1. Key Laboratory of Brain, Cognition and Education Sciences, Ministry of Education, Guangzhou, China

2. Institute for Brain Research and Rehabilitation, South China Normal University, Guangzhou 510631, China

3. Department of Gynaecological Oncology, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510080, China

4. Department of Urology, The First Medical Center of Chinese PLA General Hospital, Beijing 100853, China

5. School of Medicine, Nankai University, Tianjin 300071, China

6. Center for Translational Medicine, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou, Guangdong 510080, China

7. Generulor Company Bio-X Lab, Guangzhou 510006, Guangdong, China

8. Department of Medical Oncology of the Eastern Hospital, The First Affiliated Hospital, Sun Yat-sen University, Guangzhou 510700, China

9. School of Computer Science, South China Normal University, Guangzhou 510631, China

10. School of Health Sciences and Sports, Macao Polytechnic Institute, Macao, China

11. School of Pharmacy, Tongji Medical College, Huazhong University of Science and Technology, Wuhan 430030, China

12. Department of Obstetrics and Gynaecology, Tongji Hospital, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei 430030, China

Abstract

Motivation: Epstein–Barr virus (EBV) is one of the most prevalent DNA oncogenic viruses. The integration of EBV into the host genome has been reported to play an important role in cancer development. The preference of EBV integration showed strong dependence on the local genomic environment, which enables the prediction of EBV integration sites.

Results: An attention-based deep learning model, DeepEBV, was developed to predict EBV integration sites by learning local genomic features automatically. First, DeepEBV was trained and tested using the data from the dsVIS database. The results showed that DeepEBV with EBV integration sequences plus Repeat peaks and 2-fold data augmentation performed the best on the training dataset. Furthermore, the performance of the model was validated in an independent dataset. In addition, the motifs of DNA-binding proteins could influence the selection preference of viral insertional mutagenesis. Furthermore, the results showed that DeepEBV can predict EBV integration hotspot genes accurately. In summary, DeepEBV is a robust, accurate and explainable deep learning model, providing novel insights into EBV integration preferences and mechanisms.

Availability and implementation: DeepEBV is available as open-source software and can be downloaded from https://github.com/JiuxingLiang/DeepEBV.git.

Supplementary information: Supplementary data are available at Bioinformatics online.
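The abstract does not describe the network architecture, so the following is only an illustrative sketch of the general idea behind attention over a local genomic window, not DeepEBV's implementation. It one-hot encodes a DNA sequence and pools positions with softmax attention weights. The function names `one_hot` and `attention_pool`, the example sequence, and the random (untrained) scoring vector `w` are all assumptions made for illustration.

```python
import numpy as np

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as a (len, 4) matrix; unknown bases such as N become all-zero rows."""
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        j = BASES.find(base)
        if j >= 0:
            out[i, j] = 1.0
    return out

def attention_pool(features, w):
    """Softmax attention over positions; returns the pooled vector and per-position weights."""
    scores = features @ w                 # one scalar score per position
    scores = scores - scores.max()        # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()
    pooled = weights @ features           # weighted average of position features
    return pooled, weights

# Hypothetical input: a short window; w is random here, whereas a real model would learn it.
rng = np.random.default_rng(0)
x = one_hot("ACGTNACGTA")
w = rng.normal(size=4).astype(np.float32)
pooled, weights = attention_pool(x, w)
```

In a trained model, the per-position `weights` are what make such an approach inspectable: positions with high weight indicate which parts of the window contributed most to the prediction.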

Funder

National Science and Technology Major Project

Ministry of Science and Technology of China

National Natural Science Foundation of China

Guangzhou Science and Technology Programme

National Ten Thousands Plan for Young Top Talents

Key-Area Research and Development Program of Guangdong Province

General Program of Natural Science Foundation of Guangdong Province of China

National Postdoctoral Program for Innovative Talent

China Postdoctoral Science Foundation

Guangdong Basic and Applied Basic Research Foundation

Characteristic Innovation Research Project of University Teachers

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
