A Linear-Time Algorithm for Seeds Computation

Published: 2020-04-30 Issue: 2 Volume: 16 Page: 1-23
ISSN: 1549-6325
Container-title: ACM Transactions on Algorithms
Language: en
Short-container-title: ACM Trans. Algorithms

Author:

Kociumaka Tomasz¹, Kubica Marcin², Radoszewski Jakub², Rytter Wojciech², Waleń Tomasz²

Affiliation:

1. University of Warsaw and Bar-Ilan University, Ramat-Gan, Israel

2. University of Warsaw, Banacha, Warsaw, Poland

Abstract

A seed in a word is a relaxed version of a period in which the occurrences of the repeating subword may overlap. Our first contribution is a linear-time algorithm computing a linear-size representation of all seeds of a word (the number of seeds might be quadratic). In particular, one can easily derive the shortest seed and the number of seeds from our representation. Thus, we solve an open problem stated in a survey by Smyth from 2000 and improve upon a previous O(n log n)-time algorithm by Iliopoulos et al. from 1996. Our approach is based on combinatorial relations between seeds and subword complexity (used here for the first time in the context of seeds). In previous papers, compact representations of seeds consisted of two independent parts operating on the suffix tree of the input word and the suffix tree of its reverse, respectively. Our second contribution is a novel and significantly simpler representation of all seeds that avoids dealing with the suffix tree of the reversed word. This result is also of independent interest from a combinatorial point of view. A preliminary version of this work, with a much more complex algorithm constructing a representation of seeds on two suffix trees, was presented at the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA’12).
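
To make the notion of a seed concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper and uses no suffix trees. It assumes the standard definition from the literature (a word s is a seed of w if s occurs in w and w can be covered by occurrences of s, where occurrences may overlap and may overhang either end of w), which the abstract only paraphrases. The function names `is_seed`, `all_seeds`, and `shortest_seed` are hypothetical and chosen for illustration.

```python
def is_seed(s: str, w: str) -> bool:
    """Naive check (assumed standard definition): s is a factor of w and
    every position of w is covered by a full or overhanging occurrence of s."""
    n, m = len(w), len(s)
    if m == 0 or s not in w:
        return False
    covered = [False] * n
    # Mark positions covered by full occurrences of s inside w.
    i = w.find(s)
    while i != -1:
        for j in range(i, i + m):
            covered[j] = True
        i = w.find(s, i + 1)
    # Mark positions covered by occurrences that overhang an end of w.
    for k in range(1, m):
        if w[:k] == s[m - k:]:      # a suffix of s matches a prefix of w
            for j in range(k):
                covered[j] = True
        if w[n - k:] == s[:k]:      # a prefix of s matches a suffix of w
            for j in range(n - k, n):
                covered[j] = True
    return all(covered)


def all_seeds(w: str) -> list[str]:
    """Enumerate every seed of w by testing each distinct factor (slow)."""
    n = len(w)
    factors = {w[i:j] for i in range(n) for j in range(i + 1, n + 1)}
    seeds = [s for s in factors if is_seed(s, w)]
    return sorted(seeds, key=lambda s: (len(s), s))


def shortest_seed(w: str) -> str:
    """Return the shortest seed of a non-empty word (ties broken
    lexicographically); w itself is always a seed."""
    return all_seeds(w)[0]


if __name__ == "__main__":
    w = "baabaab"
    print(is_seed("aba", w))   # "aba" covers w only with overhangs at both ends
    print(shortest_seed(w))
    print(len(all_seeds(w)))
```

For the word "baabaab", the factor "aba" is a seed but not a cover: its single full occurrence sits in the middle, and the two ends are covered only by overhanging occurrences. This enumeration tests each factor separately and takes polynomial but far from linear time, which is the cost the linear-size representation described in the abstract avoids.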

Funder

ISF

BSF

ERC grant MPM

Foundation for Polish Science project “Algorithms for text processing with errors and uncertainties”

EU's Horizon 2020 Research and Innovation Programme

European Union under the European Regional Development Fund

Publisher

Association for Computing Machinery (ACM)

Subject

Mathematics (miscellaneous)
