Platelet-Based Liquid Biopsies through the Lens of Machine Learning

Author:

Cygert Sebastian (1, 2), Pastuszak Krzysztof (3, 4, 5), Górski Franciszek (1), Sieczczyński Michał (1), Juszczyk Piotr (1), Rutkowski Antoni (1), Lewalski Sebastian (1), Różański Robert (6), Jopek Maksym Albin (4, 5), Jassem Jacek (7), Czyżewski Andrzej (1), Wurdinger Thomas (8), Best Myron G. (8), Żaczek Anna J. (4), Supernat Anna (4, 5)

Affiliation:

1. Department of Multimedia Systems, Faculty of Electronics, Telecommunication and Informatics, Gdansk University of Technology, 80-233 Gdańsk, Poland

2. Ideas NCBR, 00-801 Warsaw, Poland

3. Department of Algorithms and System Modeling, Faculty of Electronics, Telecommunication and Informatics, Gdansk University of Technology, 80-233 Gdańsk, Poland

4. Laboratory of Translational Oncology, Intercollegiate Faculty of Biotechnology, Medical University of Gdańsk, 80-210 Gdańsk, Poland

5. Center of Biostatistics and Bioinformatics, Medical University of Gdańsk, 80-210 Gdańsk, Poland

6. Independent Researcher, 80-211 Gdańsk, Poland

7. Department of Oncology and Radiotherapy, Medical University of Gdańsk, 80-210 Gdańsk, Poland

8. Department of Neurosurgery, Amsterdam University Medical Center, 1081 Amsterdam, The Netherlands

Abstract

Liquid biopsies offer minimally invasive diagnosis and monitoring of cancer. This biosource is often analyzed by sequencing, which generates highly complex data that can be processed with machine learning tools. Nevertheless, validating the clinical applications of such methods is challenging. It requires: (a) using data from many patients; (b) checking for potential bias in sample collection; and (c) making the model interpretable. In this work, we used RNA sequencing data from tumor-educated platelets (TEPs) to perform binary classification (cancer vs. no cancer). First, we compiled a large-scale dataset with more than a thousand donors. We then evaluated classifier performance using different convolutional neural networks (CNNs) and boosting methods, obtaining an area under the curve of 0.96. Next, we identified different clusters of splice variants using expert knowledge from the Kyoto Encyclopedia of Genes and Genomes (KEGG). Employing boosting algorithms, we identified the features with the highest predictive power. Finally, we tested the robustness of the models using test data from previously unseen hospitals. Notably, we did not observe any decrease in model performance. Our work demonstrates the great potential of TEP data for cancer patient classification and opens an avenue for more in-depth cancer diagnostics.
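
The evaluation idea in the abstract (score a binary classifier by area under the ROC curve, using test samples from hospitals kept out of training) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `roc_auc` and `split_by_hospital`, the hospital labels, and the toy scores are all hypothetical, and the scores merely stand in for the output of some trained classifier.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive sample
    outscores a random negative one (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def split_by_hospital(hospitals, held_out):
    """Return (train_idx, test_idx) so that no sample from a held-out
    hospital ends up in the training set."""
    train_idx, test_idx = [], []
    for i, hospital in enumerate(hospitals):
        (test_idx if hospital in held_out else train_idx).append(i)
    return train_idx, test_idx


# Made-up toy data: hospital of origin, true label (1 = cancer), classifier score.
hospitals = ["A", "A", "B", "B", "C", "C", "C", "C"]
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.35, 0.1]

# Hold out hospital "C" entirely and evaluate only on its samples.
train_idx, test_idx = split_by_hospital(hospitals, held_out={"C"})
test_auc = roc_auc([labels[i] for i in test_idx], [scores[i] for i in test_idx])
print(f"held-out hospital AUC: {test_auc:.2f}")
```

Splitting by hospital rather than by sample is one way to probe collection-site bias: if a model has learned site-specific artifacts, its AUC on the held-out site would be expected to drop.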

Funder

Electronics, Telecommunications and Informatics Faculty, Gdansk University of Technology

European Regional Development Fund

National Science Centre

Medical University of Gdansk

National Center for Research and Development

Publisher

MDPI AG

Subject

Cancer Research, Oncology
