Composite measurements and molecular compressed sensing for highly efficient transcriptomics

Published: 2017-01-02

Author:

Cleary Brian, Cong Le, Lander Eric S., Regev Aviv

Abstract

RNA profiling is an excellent phenotype of cellular responses and tissue states, but can be costly to generate at the massive scale required for studies of regulatory circuits, genetic states or perturbation screens. Here, we draw on a series of advances over the last decade in the field of mathematics to establish a rigorous link between biological structure, data compressibility, and efficient data acquisition. We propose that very few random composite measurements – in which gene abundances are combined in a random linear combination – are needed to approximate the high-dimensional similarity between any pair of gene abundance profiles. We then show how finding latent, sparse representations of gene expression data would enable us to “decompress” a small number of random composite measurements and recover high-dimensional gene expression levels that were not measured (unobserved). We present a new algorithm for finding sparse, modular structure, which improves the ability to interpret samples in terms of small numbers of active modules, and show that the modular structure we find is sufficient to recover gene expression profiles from composite measurements (with ~100-fold fewer composite measurements than genes). Moreover, the knowledge that sparse, modular structures exist allows us to recover expression profiles from composite measurements, even without access to any training data. Finally, we present a proof-of-concept experiment for making composite measurements in the laboratory, involving the measurement of linear combinations of RNA abundances. Altogether, our results suggest new compressive modalities in experimental biology that can form a foundation for massive scaling in high-throughput measurements, while also offering new insights into the interpretation of high-dimensional data.
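
The claim that a small number of random composite measurements can approximate the similarity between expression profiles can be illustrated with a minimal sketch. It assumes Gaussian random weights and purely synthetic data; the sizes, variable names, and the helper `pairwise_distances` are hypothetical and chosen only for illustration. This is not the authors' algorithm or laboratory protocol, only a generic random-projection example of the idea the abstract describes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_genes = 5000         # dimension of each expression profile
n_samples = 20         # number of profiles
n_measurements = 500   # number of composite measurements (10-fold fewer than genes)

# Synthetic non-negative "abundances" (an assumption; not real expression data).
X = rng.gamma(shape=2.0, scale=1.0, size=(n_genes, n_samples))

# Each row of A defines one composite measurement: a random linear
# combination of gene abundances. The 1/sqrt(m) scaling keeps squared
# norms unchanged in expectation.
A = rng.standard_normal((n_measurements, n_genes)) / np.sqrt(n_measurements)

# Composite measurements, shape (n_measurements, n_samples).
Y = A @ X


def pairwise_distances(M):
    """Euclidean distances between all pairs of columns of M."""
    sq = np.sum(M ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (M.T @ M)
    return np.sqrt(np.maximum(d2, 0.0))


D_full = pairwise_distances(X)   # distances in gene space
D_comp = pairwise_distances(Y)   # distances in composite-measurement space

# Compare each distinct pair of samples once.
iu = np.triu_indices(n_samples, k=1)
relative_error = np.abs(D_comp[iu] - D_full[iu]) / D_full[iu]

print("median relative error:", np.median(relative_error))
print("max relative error:", relative_error.max())
```

In this sketch the distances computed from the composite measurements stay close to the distances computed from the full profiles, and the distortion shrinks as `n_measurements` grows. Recovering the unobserved gene-level values would additionally require a sparse representation of the data, which this example does not attempt.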

Publisher

Cold Spring Harbor Laboratory

Cited by 6 articles.

1. Analysis of higher order interactions quantifies co-ordination in the epigenome and reveals novel biological relationships in Kabuki syndrome; 2024-03-13

2. Enter the Matrix: Factorization Uncovers Knowledge from Omics; Trends in Genetics; 2018-10

3. Logarithmic molecular sampling for next-generation sequencing; 2018-09-18

4. Zero-preserving imputation of scRNA-seq data using low-rank approximation; 2018-08-22

5. Decomposing cell identity for transfer learning across cellular measurements, platforms, tissues, and species; 2018-08-20