Algorithms and Complexity on Indexing Founder Graphs
-
Published:2022-07-28
-
ISSN:0178-4617
-
Container-title:Algorithmica
-
language:en
-
Short-container-title:Algorithmica
Author:
Equi Massimo, Norri Tuukka, Alanko Jarno, Cazaux Bastien, Tomescu Alexandru I., Mäkinen Veli
Abstract
We study the problem of matching a string in a labeled graph. Previous research has shown that unless the Orthogonal Vectors Hypothesis (OVH) is false, one cannot solve this problem in strongly sub-quadratic time, nor index the graph in polynomial time to answer queries efficiently (Equi et al. ICALP 2019, SOFSEM 2021). These conditional lower bounds cover even deterministic graphs with a binary alphabet, but there naturally also exist graph classes that are easy to index: for example, Wheeler graphs (Gagie et al. Theor. Comp. Sci. 2017) cover graphs admitting a Burrows-Wheeler transform-based indexing scheme. However, it is NP-complete to recognize whether a graph is a Wheeler graph (Gibney, Thankachan, ESA 2019). We propose an approach to alleviate the construction bottleneck of Wheeler graphs. Rather than starting from an arbitrary graph, we study graphs induced from multiple sequence alignments (MSAs). Elastic degenerate strings (Bernardini et al. SPIRE 2017, ICALP 2019) can be seen as such graphs, and we introduce here their generalization: elastic founder graphs. We first prove that even such induced graphs are hard to index under OVH. Then we introduce two subclasses, repeat-free and semi-repeat-free graphs, that are easy to index. We give a linear time algorithm to construct a repeat-free (non-elastic) founder graph from a gapless MSA, and (parameterized) near-linear time algorithms to construct a semi-repeat-free (repeat-free, respectively) elastic founder graph from a general MSA. Finally, we show that repeat-free founder graphs admit a reduction to Wheeler graphs in polynomial time.
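As an illustration of the problem the abstract starts from, the following is a minimal sketch of the straightforward quadratic-time baseline for matching a string in a labeled graph. It is not the paper's indexing method. It assumes a simplified setting in which every node carries a single character, and the names `occurs_in_graph`, `labels`, and `edges` are hypothetical, chosen only for this example.

```python
from typing import Dict, List


def occurs_in_graph(
    labels: Dict[int, str],
    edges: Dict[int, List[int]],
    pattern: str,
) -> bool:
    """Return True if some path in the graph spells `pattern`.

    Assumption for this sketch: each node is labeled with exactly one
    character, and `edges[v]` lists the out-neighbors of node v.
    The running time is O(len(pattern) * number of edges), i.e. the
    quadratic baseline, with no index built over the graph.
    """
    if not pattern:
        return True  # the empty string matches trivially

    # Nodes where a match of the first character can end.
    frontier = {v for v, c in labels.items() if c == pattern[0]}

    # Extend partial matches one pattern character at a time.
    for ch in pattern[1:]:
        frontier = {
            w
            for v in frontier
            for w in edges.get(v, [])
            if labels[w] == ch
        }
        if not frontier:
            return False  # no path can spell the current prefix

    return bool(frontier)


# Small example graph: A -> C -> T and A -> G -> T.
example_labels = {0: "A", 1: "C", 2: "G", 3: "T"}
example_edges = {0: [1, 2], 1: [3], 2: [3]}
```

With this example graph, `occurs_in_graph(example_labels, example_edges, "AGT")` follows the path 0, 2, 3. The lower bounds cited in the abstract concern how far this kind of pattern-length times graph-size scan can be improved.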
Funder
Luonnontieteiden ja Tekniikan Tutkimuksen Toimikunta, H2020 European Research Council
Publisher
Springer Science and Business Media LLC
Subject
Applied Mathematics, Computer Science Applications, General Computer Science
Reference
50 articles.
1. Mäkinen, V., Cazaux, B., Equi, M., Norri, T., Tomescu, A.I.: Linear time construction of indexable founder block graphs. In: Kingsford, C., Pisanti, N. (eds.) 20th International Workshop on Algorithms in Bioinformatics, WABI 2020, September 7-9, 2020, Pisa, Italy (Virtual Conference). LIPIcs, vol. 172, pp. 7:1–7:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2020). https://doi.org/10.4230/LIPIcs.WABI.2020.7
2. Equi, M., Norri, T., Alanko, J., Cazaux, B., Tomescu, A.I., Mäkinen, V.: Algorithms and complexity on indexing elastic founder graphs. In: Ahn, H., Sadakane, K. (eds.) 32nd International Symposium on Algorithms and Computation, ISAAC 2021, December 6-8, 2021, Fukuoka, Japan. LIPIcs, vol. 212, pp. 20:1–20:18. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, Dagstuhl, Germany (2021). https://doi.org/10.4230/LIPIcs.ISAAC.2021.20
3. Maier, D.: The complexity of some problems on subsequences and supersequences. J. ACM 25(2), 322–336 (1978). https://doi.org/10.1145/322063.322075
4. Chatzou, M., Magis, C., Chang, J.-M., Kemena, C., Bussotti, G., Erb, I., Notredame, C.: Multiple sequence alignment modeling: methods and applications. Briefings in Bioinformatics 17(6), 1009–1023 (2015)
5. Mäkinen, V., Navarro, G., Sirén, J., Välimäki, N.: Storage and retrieval of highly repetitive sequence collections. Journal of Computational Biology 17(3), 281–308 (2010)
Cited by
3 articles.
1. Finding maximal exact matches in graphs; Algorithms for Molecular Biology; 2024-03-11
2. Elastic founder graphs improved and enhanced; Theoretical Computer Science; 2024-01
3. On the Complexity of String Matching for Graphs; ACM Transactions on Algorithms; 2023-04-12