Affiliation:
1. Navarrabiomed, Complejo Hospitalario de Navarra (CHN), Universidad Pública de Navarra (UPNA), IdiSNA, Pamplona 31001, Spain
2. Biological and Environmental Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
3. Algorithmic Dynamic Lab, Department of Oncology and Pathology, Center for Molecular Medicine, Karolinska Institute, Stockholm 17177, Sweden
4. Division of Hemato‐Oncology, Center for Applied Medical Research CIMA, Cancer Center University of Navarra (CCUN), Navarra Institute for Health Research (IDISNA), CIBERONC, Pamplona 31008, Spain
5. Department of Hematology, Clinica Universidad de Navarra, CIBERONC, Pamplona 31008, Spain
6. Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia
Abstract
There is a need for tools that integrate single‐cell multi‐omic data while addressing several integrative challenges simultaneously. To this end, we designed LIBRA, a deep‐learning‐based tool that performs competitively in both “integration” and “prediction” tasks on single‐cell multi‐omics data. Furthermore, when assessing the predictive power across data modalities, LIBRA outperforms existing tools. LIBRA and its adaptive scheme, aLIBRA, allow automatic fine‐tuning with limited user effort. Additionally, aLIBRA allows experienced users to implement custom configurations. The LIBRA toolbox is freely available as R and Python libraries.
Background: Single‐cell multi‐omics technologies enable a deep, system‐level understanding of the biology of cells and tissues. However, an integrative and possibly systems‐based analysis capturing the different modalities is challenging. In response, bioinformatics and machine learning methodologies are being developed for multi‐omics single‐cell analysis. It is unclear whether current tools can address the dual aspect of modality integration and prediction across modalities without requiring extensive parameter fine‐tuning.
Methods: We designed LIBRA, a neural‐network‐based framework, to learn translation between paired multi‐omics profiles so that a shared latent space is constructed. Additionally, we implemented a variation, aLIBRA, that allows automatic fine‐tuning by identifying parameter combinations that optimize both the integrative and predictive tasks. All model parameters and evaluation metrics are made available to users with minimal user iteration. Furthermore, aLIBRA allows experienced users to implement custom configurations. The LIBRA toolbox is freely available as R and Python libraries at GitHub (TranslationalBioinformaticsUnit/LIBRA).
Results: LIBRA was evaluated on eight multi‐omic single‐cell data‐sets, including three combinations of omics. We observed that LIBRA is a state‐of‐the‐art tool when evaluating the ability to increase cell‐type (clustering) resolution in the integrated latent space. Furthermore, when assessing the predictive power across data modalities, such as predicting chromatin accessibility from gene expression, LIBRA outperforms existing tools. As expected, adaptive parameter optimization (aLIBRA) significantly boosted the performance of learning predictive models from paired data‐sets.
Conclusion: LIBRA is a versatile tool that performs competitively in both “integration” and “prediction” tasks based on single‐cell multi‐omics data. LIBRA is a data‐driven, robust platform that includes an adaptive learning scheme.
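The adaptive idea described for aLIBRA (trying parameter combinations and keeping the one that does well on both the integrative and the predictive task) can be illustrated with a minimal sketch. This is not the LIBRA implementation or its API: the function names, the parameter names (`latent_dim`, `n_layers`), the toy scoring function, and the rank-sum selection rule are all assumptions made for illustration.

```python
from itertools import product


def rank(values, higher_is_better):
    """Return a rank per value (0 = best); ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=higher_is_better)
    ranks = [0] * len(values)
    for position, index in enumerate(order):
        ranks[index] = position
    return ranks


def grid_search(param_grid, evaluate):
    """Evaluate every parameter combination and return the best one.

    `evaluate` is a hypothetical user-supplied callable that would train a
    model with the given parameters and return a pair
    (integration_score, prediction_error).
    """
    names = sorted(param_grid)
    candidates, integration, errors = [], [], []
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score, error = evaluate(params)
        candidates.append(params)
        integration.append(score)
        errors.append(error)

    # Assumed selection rule: rank candidates on each task and sum the ranks,
    # so that neither task dominates because of its numeric scale.
    integration_ranks = rank(integration, higher_is_better=True)
    error_ranks = rank(errors, higher_is_better=False)
    totals = [a + b for a, b in zip(integration_ranks, error_ranks)]
    best_index = min(range(len(candidates)), key=lambda i: totals[i])
    return candidates[best_index]


def toy_evaluate(params):
    """Stand-in for model training: returns made-up scores, not real results."""
    integration_score = -abs(params["latent_dim"] - 10)
    prediction_error = (abs(params["n_layers"] - 2)
                        + 0.1 * abs(params["latent_dim"] - 10))
    return integration_score, prediction_error


best = grid_search({"latent_dim": [5, 10, 50], "n_layers": [1, 2, 4]},
                   toy_evaluate)
print(best)
```

In practice, `toy_evaluate` would be replaced by a routine that trains the network on paired profiles and measures clustering quality in the latent space together with cross-modality prediction error; how the two objectives are combined is a design choice, and the rank sum used here is only one option.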
Subject
Applied Mathematics; Computer Science Applications; Biochemistry, Genetics and Molecular Biology (miscellaneous); Modeling and Simulation