Expanding the boundaries of local similarity analysis

Published:2013-01 Issue:S1 Volume:14
ISSN:1471-2164
Container-title:BMC Genomics
language:en
Short-container-title:BMC Genomics

Author:

Durno W Evan,Hanson Niels W,Konwar Kishori M,Hallam Steven J

Abstract

Background: Pairwise comparison of time series data for both local and time-lagged relationships is a computationally challenging problem relevant to many fields of inquiry. The Local Similarity Analysis (LSA) statistic identifies the existence of local and lagged relationships, but determining significance through a p-value has been algorithmically cumbersome due to an intensive permutation test, shuffling rows and columns and repeatedly calculating the statistic. Furthermore, this p-value is calculated with the assumption of normality, a statistical luxury dissociated from most real-world datasets.

Results: To improve the performance of LSA on big datasets, an asymptotic upper bound on the p-value calculation was derived without the assumption of normality. This change in the bound calculation markedly improved computational speed from O(pm²n) to O(m²n), where p is the number of permutations in a permutation test, m is the number of time series, and n is the length of each time series. The bounding process is implemented as a computationally efficient software package, FAST LSA, written in C and optimized for threading on multi-core computers, improving its practical computation time. We computationally compare our approach to previous implementations of LSA, demonstrate broad applicability by analyzing time series data from public health, microbial ecology, and social media, and visualize resulting networks using the Cytoscape software.

Conclusions: The FAST LSA software package expands the boundaries of LSA, allowing analysis on datasets with millions of co-varying time series. Mapping metadata onto force-directed graphs derived from FAST LSA allows investigators to view correlated cliques and explore previously unrecognized network relationships. The software is freely available for download at: http://www.cmde.science.ubc.ca/hallam/fastLSA/.
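To make the cost of the permutation test concrete, the following is a minimal Python sketch of a local similarity score and a permutation p-value for a single pair of series. It is not the FAST LSA code (which the abstract says is written in C) and does not implement the asymptotic bound. The dynamic-programming recurrence follows the commonly described formulation of LSA and is an assumption here, as are the function names `lsa_score` and `permutation_p_value`, the `max_delay` parameter, and the simplification of shuffling only one series. Inputs are assumed to be already normalized.

```python
import random


def lsa_score(x, y, max_delay):
    """Local similarity score of two equal-length series (illustrative sketch)."""
    n = len(x)
    if n != len(y):
        raise ValueError("series must have equal length")
    # pos[i][j] / neg[i][j]: best running sum of products (positive or negated)
    # over a local alignment ending at x[i-1], y[j-1].
    pos = [[0.0] * (n + 1) for _ in range(n + 1)]
    neg = [[0.0] * (n + 1) for _ in range(n + 1)]
    best = 0.0
    for i in range(1, n + 1):
        # Only consider pairs whose time offset is within max_delay.
        for j in range(max(1, i - max_delay), min(n, i + max_delay) + 1):
            prod = x[i - 1] * y[j - 1]
            pos[i][j] = max(0.0, pos[i - 1][j - 1] + prod)
            neg[i][j] = max(0.0, neg[i - 1][j - 1] - prod)
            best = max(best, pos[i][j], neg[i][j])
    return best / n


def permutation_p_value(x, y, max_delay, permutations=200, seed=0):
    """Estimate a p-value by recomputing the score on shuffled copies of x."""
    rng = random.Random(seed)
    observed = lsa_score(x, y, max_delay)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if lsa_score(shuffled, y, max_delay) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (permutations + 1)
```

In this sketch each call to `permutation_p_value` recomputes the score `permutations` times, and it would be called once for each of the roughly m² pairs of series. That repeated work is the factor p in O(pm²n), which the abstract says the asymptotic bound removes.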

Publisher

Springer Science and Business Media LLC

Subject

Genetics,Biotechnology

Link

https://link.springer.com/content/pdf/10.1186/1471-2164-14-S1-S3.pdf

References: 20 articles.

1. Lynch C: Big data: How do your data grow? Nature. 2008, 455 (7209): 28-29. 10.1038/455028a.

2. Bell G, Hey T, Szalay A: Computer science. Beyond the data deluge. Science. 2009, 323 (5919): 1297-1298. 10.1126/science.1170411.

3. Schadt EE, Linderman MD, Sorenson J, Lee L, Nolan GP: Computational solutions to large-scale data management and analysis. Nature Reviews Genetics. 2010, 11 (9): 647-657. 10.1038/nrg2857.

4. Ranjard L, Poly F, Lata JC, Mougel C, Thioulouse J, Nazaret S: Characterization of bacterial and fungal soil communities by automated ribosomal intergenic spacer analysis fingerprints: biological and methodological variability. Applied and Environmental Microbiology. 2001, 67 (10): 4479-4487. 10.1128/AEM.67.10.4479-4487.2001.

5. Mooy BASV, Devol AH, Keil RG: Relationship between bacterial community structure, light, and carbon cycling in the eastern subarctic North Pacific. Limnology and Oceanography. 2004, 1056-1062.

Cited by 25 articles.

1. Identifying local associations in biological time series: algorithms, statistical significance, and applications;Briefings in Bioinformatics;2023-09-22

2. Selection pressure on the rhizosphere microbiome can alter nitrogen use efficiency and seed yield in Brassica rapa;Communications Biology;2022-09-14

3. Abundance and diversity of antibiotic resistance genes and bacterial communities in the western Pacific and Southern Oceans;Science of The Total Environment;2022-05

4. Network-based approaches for the investigation of microbial community structure and function using metagenomics-based data;Future Microbiology;2022-05

5. Oceanographic setting influences the prokaryotic community and metabolome in deep-sea sponges;Scientific Reports;2022-03-01