Principal Component Analysis of Alternative Splicing Profiles Revealed by Long-Read ONT Sequencing in Human Liver Tissue and Hepatocyte-Derived HepG2 and Huh7 Cell Lines
Published:2023-10-24
Issue:21
Volume:24
Page:15502
ISSN:1422-0067
Container-title:International Journal of Molecular Sciences
language:en
Short-container-title:IJMS
Author:
Sarygina Elizaveta (1), Kozlova Anna (1), Deinichenko Kseniia (1), Radko Sergey (1), Ptitsyn Konstantin (1), Khmeleva Svetlana (1), Kurbatov Leonid K. (1), Spirin Pavel (2), Prassolov Vladimir S. (2), Ilgisonis Ekaterina (1), Lisitsa Andrey (1), Ponomarenko Elena (1)
Affiliation:
1. Institute of Biomedical Chemistry, Pogodinskaya Street 10, 119121 Moscow, Russia
2. Department of Cancer Cell Biology, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Vavilova 32, 119991 Moscow, Russia
Abstract
Long-read RNA sequencing developed by Oxford Nanopore Technologies (ONT) provides direct quantification of transcript isoforms. This makes the number of transcript isoforms per gene an intrinsically suitable metric for alternative splicing (AS) profiling with this type of RNA sequencing. Using this simple metric, with principal component analysis (PCA) as a tool to visualize the high-dimensional transcriptomic data, we were able to group biospecimens of normal human liver tissue and hepatocyte-derived malignant HepG2 and Huh7 cells into clear clusters in 2D space. In the transcriptome-wide analysis, the clustering was observed regardless of whether all genes were included or only those expressed in all biospecimens tested. However, for a particular set of genes known as pharmacogenes, which are involved in drug metabolism, the clustering worsened dramatically when only genes expressed in all biospecimens were included. Based on the PCA data, the subsets of genes contributing most to the grouping of biospecimens into clusters were selected and subjected to gene ontology analysis, which identified the top 20 biological processes; among these, translation and processes related to its regulation dominate. The suggested metric can be a useful addition to existing metrics for describing AS profiles, especially in transcriptome studies with long-read sequencing.
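The abstract gives no implementation details, so the following is only a minimal sketch of the general idea: build a samples-by-genes matrix of isoform counts, optionally keep only genes detected in every sample, project the samples onto two principal components, and rank genes by the magnitude of their loadings. The gene names, sample labels, and count values are invented for illustration and are not data from the study. Centering without variance scaling is an assumption, and the helper names are hypothetical.

```python
import numpy as np

def pca_isoform_counts(counts, n_components=2):
    """PCA via SVD on a (n_samples, n_genes) matrix of isoform counts.

    Entry [i, j] is the number of distinct transcript isoforms detected
    for gene j in sample i. Columns are mean-centered but not scaled
    (an assumption; the abstract does not state the preprocessing).
    """
    X = np.asarray(counts, dtype=float)
    Xc = X - X.mean(axis=0)                      # center each gene
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]   # sample coordinates
    loadings = Vt[:n_components]                      # gene weights per PC
    explained = (S ** 2) / np.sum(S ** 2)             # variance fractions
    return scores, loadings, explained[:n_components]

def expressed_in_all(counts):
    """Boolean mask of genes with at least one isoform in every sample."""
    return np.all(np.asarray(counts) > 0, axis=0)

def top_contributing_genes(loadings, gene_names, component=0, k=3):
    """Names of the k genes with the largest absolute loading on one PC."""
    order = np.argsort(-np.abs(loadings[component]))[:k]
    return [gene_names[i] for i in order]

# Toy, made-up data: rows are biospecimens, columns are genes.
genes = ["GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"]
samples = ["liver_1", "liver_2", "HepG2_1", "HepG2_2", "Huh7_1", "Huh7_2"]
counts = np.array([
    [6, 2, 1, 0, 3],
    [6, 2, 1, 1, 3],
    [2, 4, 1, 2, 3],
    [2, 4, 1, 2, 3],
    [1, 2, 1, 2, 3],
    [1, 2, 1, 2, 3],
])

# All genes versus only genes detected in every biospecimen.
scores, loadings, explained = pca_isoform_counts(counts)
mask = expressed_in_all(counts)
scores_common, _, _ = pca_isoform_counts(counts[:, mask])

top_genes = top_contributing_genes(loadings, genes, component=0, k=3)

for name, (pc1, pc2) in zip(samples, scores):
    print(f"{name}: PC1={pc1:.2f}, PC2={pc2:.2f}")
print("Top PC1 genes:", top_genes)
```

Plotting the two columns of `scores` (or `scores_common`) against each other gives the kind of 2D view in which clusters of biospecimens can be inspected. The genes returned by `top_contributing_genes` would be the candidates passed on to a gene ontology tool.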
Funder
Ministry of Education and Science of the Russian Federation
Subject
Inorganic Chemistry, Organic Chemistry, Physical and Theoretical Chemistry, Computer Science Applications, Spectroscopy, Molecular Biology, General Medicine, Catalysis
Cited by
2 articles.