BELMM: Bayesian model selection and random walk smoothing in time-series clustering

Published:2023-11-01 Issue:11 Volume:39 Page:
ISSN:1367-4811
Container-title:Bioinformatics
language:en

Author:

Sarala Olli¹, Pyhäjärvi Tanja², Sillanpää Mikko J¹

Affiliation:

1. Research Unit of Mathematical Sciences, University of Oulu, FI-90014 Oulu, Finland

2. Department of Forest Sciences, University of Helsinki, FI-00014 Helsinki, Finland

Abstract

Motivation: Due to advances in measuring technology, many new phenotype, gene expression, and other omics time-course datasets are now commonly available. Cluster analysis may provide useful information about the structure of such data.

Results: In this work, we propose BELMM (Bayesian Estimation of Latent Mixture Models): a flexible framework for analysing, clustering, and modelling time-series data in a Bayesian setting. The framework is built on mixture modelling. First, the mean curves of the mixture components are assumed to follow random walk smoothing priors. Second, we choose the most plausible model and the number of mixture components using reversible-jump Markov chain Monte Carlo. Last, we assign the individual time series to clusters based on their similarity to the cluster-specific trend curves determined by the latent random walk processes. We demonstrate the use of fast and slow implementations of our approach on both simulated and real time-series data using the widely available software R, Stan, and CU-MSDSp.

Availability and implementation: The French mortality dataset is available at http://www.mortality.org, and the Drosophila melanogaster embryogenesis gene expression data at https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE121160. Details on our simulated datasets are available in the Supplementary Material, and R scripts and a detailed tutorial are on GitHub at https://github.com/ollisa/BELMM. The software CU-MSDSp is available on GitHub at https://github.com/jtchavisIII/CU-MSDSp.
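As a rough illustration of two ideas named in the abstract, the following Python sketch draws a mean curve from a first-order Gaussian random walk and assigns a series to the mixture component it most resembles. It is not the authors' implementation (which uses R, Stan, and CU-MSDSp) and it omits all inference, including the reversible-jump step that selects the number of components. The function names, the Gaussian noise model, and the fixed `noise_sd`, `weights`, and mean curves are assumptions made only for this example.

```python
import math
import random


def random_walk_curve(n_points, step_sd, start=0.0, rng=None):
    """Draw a curve from a first-order Gaussian random walk:
    mu[t] = mu[t-1] + e[t], with e[t] ~ N(0, step_sd^2).
    A smaller step_sd gives a smoother curve (illustrative prior only)."""
    rng = rng or random.Random()
    curve = [start]
    for _ in range(n_points - 1):
        curve.append(curve[-1] + rng.gauss(0.0, step_sd))
    return curve


def gaussian_loglik(series, mean_curve, noise_sd):
    """Log-likelihood of a series under independent Gaussian noise
    around a given mean curve (assumed noise model)."""
    const = -0.5 * math.log(2.0 * math.pi * noise_sd ** 2)
    return sum(
        const - 0.5 * ((y - m) / noise_sd) ** 2
        for y, m in zip(series, mean_curve)
    )


def assign_cluster(series, mean_curves, weights, noise_sd):
    """Return the index of the component with the highest membership score,
    log(weight) + log-likelihood, treating curves and weights as known."""
    scores = [
        math.log(w) + gaussian_loglik(series, m, noise_sd)
        for w, m in zip(weights, mean_curves)
    ]
    return max(range(len(scores)), key=scores.__getitem__)


# Hypothetical example: two fixed trend curves and one observed series.
flat_curve = [0.0, 0.0, 0.0, 0.0]
rising_curve = [1.0, 2.0, 3.0, 4.0]
observed = [0.9, 2.1, 2.8, 4.2]
label = assign_cluster(observed, [flat_curve, rising_curve], [0.5, 0.5], noise_sd=0.5)
```

In a full Bayesian treatment the mean curves, weights, and noise level would be sampled rather than fixed, and the cluster labels would be summarised over posterior draws.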

Funder

Academy of Finland R’Life program funding

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability

Link

https://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btad686/53842744/btad686.pdf

References: 43 articles.

1. Bayesian Computation with R

2. Quantifying post-transcriptional regulation in the development of Drosophila melanogaster;Becker;Nat Commun,2018

3. Reversible jump, birth-and-death and more general continuous time Markov chain Monte Carlo samplers;Cappé;J R Stat Soc Ser B (Stat Methodol),2003

4. NbClust: an R package for determining the relevant number of clusters in a data set;Charrad;J Stat Soft,2014