Efficient Data Structures for Range Shortest Unique Substring Queries

Published:2020-10-30 Issue:11 Volume:13 Page:276
ISSN:1999-4893
Container-title:Algorithms
language:en
Short-container-title:Algorithms

Author:

Abedin Paniz, Ganguly Arnab, Pissis Solon P., Thankachan Sharma V.

Abstract

Let T[1,n] be a string of length n and T[i,j] be the substring of T starting at position i and ending at position j. A substring T[i,j] of T is a repeat if it occurs more than once in T; otherwise, it is a unique substring of T. Repeats and unique substrings are of great interest in computational biology and information retrieval. Given string T as input, the Shortest Unique Substring problem is to find a shortest substring of T that does not occur elsewhere in T. In this paper, we introduce the range variant of this problem, which we call the Range Shortest Unique Substring problem. The task is to construct a data structure over T answering the following type of online queries efficiently. Given a range [α,β], return a shortest substring T[i,j] of T with exactly one occurrence in [α,β]. We present an O(n log n)-word data structure with O(log_w n) query time, where w = Ω(log n) is the word size. Our construction is based on a non-trivial reduction allowing us to apply a recently introduced optimal geometric data structure [Chan et al., ICALP 2018]. Additionally, we present an O(n)-word data structure with O(√n log^ε n) query time, where ε > 0 is an arbitrarily small constant. The latter data structure relies heavily on another geometric data structure [Nekrich and Navarro, SWAT 2012].
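To make the query in the abstract concrete, here is a minimal brute-force sketch in Python. It is not the paper's data structure and makes no attempt at the stated bounds. It assumes that an occurrence counts as being "in [α,β]" when its starting position lies in that range, which the abstract does not spell out. The function names `count_starts_in_range` and `naive_range_sus` are hypothetical, chosen for illustration only. Positions are 1-based, as in the abstract.

```python
def count_starts_in_range(text, pattern, alpha, beta):
    """Count occurrences of pattern whose 1-based start lies in [alpha, beta].

    Assumption: an occurrence is "in the range" if it starts there.
    """
    m = len(pattern)
    count = 0
    for start in range(alpha, beta + 1):
        # A slice running past the end of text is shorter than pattern,
        # so it can never compare equal.
        if text[start - 1:start - 1 + m] == pattern:
            count += 1
    return count


def naive_range_sus(text, alpha, beta):
    """Return 1-based (i, j) of a shortest substring text[i..j] with exactly
    one occurrence starting in [alpha, beta], by exhaustive search."""
    n = len(text)
    # Try lengths in increasing order, so the first hit is a shortest one.
    for length in range(1, n - alpha + 2):
        last_start = min(beta, n - length + 1)
        for start in range(alpha, last_start + 1):
            candidate = text[start - 1:start - 1 + length]
            if count_starts_in_range(text, candidate, alpha, beta) == 1:
                return (start, start + length - 1)
    return None  # not reached for a valid range 1 <= alpha <= beta <= n
```

For example, under this assumption `naive_range_sus("abab", 1, 4)` reports the substring "ba" at positions (2, 3), since "a", "b" and "ab" each start twice in the range. Each query rescans the text for every candidate, which is the per-query cost the data structures in the paper are designed to avoid.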

Publisher

MDPI AG

Subject

Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science

Link

https://www.mdpi.com/1999-4893/13/11/276/pdf

References: 42 articles.

1. Applied Combinatorics on Words;Lothaire,2005

2. REPuter: the manifold applications of repeat analysis on a genomic scale

3. On shortest unique substring queries

4. A repetition based measure for verification of text collections and for text categorization

Cited by 3 articles.

1. Finding top-k longest palindromes in substrings;Theoretical Computer Science;2023-11

2. Internal Longest Palindrome Queries in Optimal Time;WALCOM: Algorithms and Computation;2023

3. Internal shortest absent word queries in constant time and linear space;Theoretical Computer Science;2022-06