The maximum clique enumeration problem: algorithms, applications, and implementations

Published: 2012-06 Issue: S10 Volume: 13
ISSN: 1471-2105
Container-title: BMC Bioinformatics
Language: en
Short-container-title: BMC Bioinformatics

Author:

Eblen John D, Phillips Charles A, Rogers Gary L, Langston Michael A

Abstract

Background: The maximum clique enumeration (MCE) problem asks that we identify all maximum cliques in a finite, simple graph. MCE is closely related to two other well-known and widely studied problems: the maximum clique optimization problem, which asks us to determine the size of a largest clique, and the maximal clique enumeration problem, which asks that we compile a listing of all maximal cliques. Naturally, these three problems are NP-hard, given that they subsume the classic version of the NP-complete clique decision problem. MCE can be solved in principle with standard enumeration methods due to Bron, Kerbosch, Kose and others. Unfortunately, these techniques are ill-suited to graphs encountered in our applications. We must solve MCE on instances deeply seeded in data mining and computational biology, where high-throughput data capture often creates graphs of extreme size and density. MCE can also be solved in principle using more modern algorithms based in part on vertex cover and the theory of fixed-parameter tractability (FPT). While FPT is an improvement, these algorithms too can fail to scale sufficiently well as the sizes and densities of our datasets grow.

Results: An extensive testbed of benchmark graphs is created using publicly available transcriptomic datasets from the Gene Expression Omnibus (GEO). Empirical testing reveals crucial but latent features of such high-throughput biological data. In turn, it is shown that these features distinguish real data from random data intended to reproduce salient topological features. In particular, with real data there tends to be an unusually high degree of maximum clique overlap. Armed with this knowledge, novel decomposition strategies are tuned to the data and coupled with the best FPT MCE implementations.

Conclusions: Several algorithmic improvements to MCE are made which progressively decrease the runtime on graphs in the testbed. Frequently the final runtime improvement is several orders of magnitude. As a result, instances which were once prohibitively time-consuming to solve are brought into the domain of realistic feasibility.
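To make the three problems named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes the textbook enumerate-then-filter approach: list all maximal cliques with Bron-Kerbosch (with pivoting), then keep only the largest ones. The helper names `build_graph`, `maximal_cliques` and `maximum_cliques`, and the toy edge list, are hypothetical and chosen only for illustration. The paper's FPT and decomposition strategies are not reproduced here.

```python
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[int, Set[int]]  # adjacency sets of a finite, simple, undirected graph


def build_graph(edges: Iterable[Tuple[int, int]]) -> Graph:
    """Build adjacency sets from an edge list (hypothetical helper)."""
    graph: Graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def maximal_cliques(graph: Graph) -> List[Set[int]]:
    """Maximal clique enumeration via Bron-Kerbosch with pivoting."""
    cliques: List[Set[int]] = []

    def expand(r: Set[int], p: Set[int], x: Set[int]) -> None:
        # r: current clique, p: candidates that extend r, x: already explored
        if not p and not x:
            cliques.append(set(r))  # r cannot be extended, so it is maximal
            return
        # Pivot on the vertex of p | x with the most neighbours in p;
        # only non-neighbours of the pivot need to be branched on.
        pivot = max(p | x, key=lambda v: len(graph[v] & p))
        for v in list(p - graph[pivot]):
            expand(r | {v}, p & graph[v], x & graph[v])
            p.remove(v)
            x.add(v)

    expand(set(), set(graph), set())
    return cliques


def maximum_cliques(graph: Graph) -> List[Set[int]]:
    """MCE by filtering: keep only the maximal cliques of largest size."""
    cliques = maximal_cliques(graph)
    best = max(len(c) for c in cliques)  # maximum clique optimization answer
    return [c for c in cliques if len(c) == best]


# Toy graph: two triangles sharing vertex 2, plus a pendant edge 4-5.
toy = build_graph([(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4), (4, 5)])
all_maximal = maximal_cliques(toy)   # three maximal cliques
all_maximum = maximum_cliques(toy)   # the two triangles, which overlap at vertex 2
```

In the toy graph, the edge 4-5 is a maximal clique but not a maximum one, which is the distinction between the two enumeration problems. Filtering every maximal clique is exactly the cost the abstract describes as ill-suited to large, dense graphs, so this sketch is useful only for small instances.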

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/1471-2105-13-S10-S5.pdf

References (29 articles)

1. Garey MR, Johnson DS: Computers and Intractability: A Guide to the Theory of NP-Completeness. 1979, WH Freeman & Co.

2. Palla G, Derényi I, Farkas I, Vicsek T: Uncovering the overlapping community structure of complex networks in nature and society. Nature. 2005, 435: 814-818. 10.1038/nature03607.

3. Chesler EJ, Langston MA: Combinatorial Genetic Regulatory Network Analysis Tools for High Throughput Transcriptomic Data. RECOMB Satellite Workshop on Systems Biology and Regulatory Genomics. 2005

4. Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin NE, Langston MA, Hogenesch JB, Threadgill DW, Manly KF, Williams RW: Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nature Genetics. 2005, 37: 233-242. 10.1038/ng1518.

5. Eblen JD, Gerling IC, Saxton AM, Wu J, Snoddy JR, Langston MA: Graph Algorithms for Integrated Biological Analysis, with Applications to Type 1 Diabetes Data. Clustering Challenges in Biological Networks, World Scientific. 2008, 207-222.

Cited by 30 articles.

1. AFCMiner: Finding Absolute Fair Cliques From Attributed Social Networks for Responsible Computational Social Systems;IEEE Transactions on Computational Social Systems;2023-12

2. Learning fine-grained search space pruning and heuristics for combinatorial optimization;Journal of Heuristics;2023-05-08

3. Data Processing of Product Ion Spectra: Quality Improvement by Averaging Multiple Similar Spectra of Small Molecules;Mass Spectrometry;2022-12-15

4. Modeling Physical Interaction and Understanding Peer Group Learning Dynamics: Graph Analytics Approach Perspective;Mathematics;2022-04-24

5. Identifying protein function and functional links based on large-scale co-occurrence patterns;PLOS ONE;2022-03-03