Affiliation:
1. Bar-Ilan University and Johns Hopkins University, Israel
2. Bar-Ilan University, Israel
3. Shenkar College and CRI, University of Haifa, Haifa, Israel
Abstract
Assume that a natural cyclic phenomenon has been measured, but the data is corrupted by errors. The type of corruption is application-dependent and may be caused by measurement errors or by natural features of the phenomenon. We assume that an appropriate metric exists, which measures the amount of corruption experienced. This article studies the problem of recovering the correct cycle from data corrupted by various error models, formally defined as the period recovery problem. Specifically, we define a metric property which we call pseudolocality and study the period recovery problem under pseudolocal metrics. Examples of pseudolocal metrics are the Hamming distance, the swap distance, and the interchange (or Cayley) distance. We show that for pseudolocal metrics, periodicity is a powerful property that allows detecting the original cycle and correcting the data, under suitable conditions. Some surprising features of our algorithm are that we can efficiently identify the period in the corrupted data, up to a number of possibilities logarithmic in the length of the data string, even for metrics whose calculation is NP-hard. For the Hamming metric, we can reconstruct the corrupted data in near-linear time even for unbounded alphabets. This result is achieved using the property of separation in the self-convolution vector and Reed-Solomon codes. Finally, we employ our techniques beyond the scope of pseudolocal metrics and give a recovery algorithm for the non-pseudolocal Levenshtein edit metric.
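To make the problem statement concrete for the Hamming metric, the following is a minimal brute-force sketch, not the article's near-linear algorithm (which relies on the self-convolution vector and Reed-Solomon codes). It assumes the corrupted string is given as a Python string and that an error budget is known. The function name `hamming_period_candidates` and the parameter `max_errors` are hypothetical, introduced only for illustration. For each candidate period length, it takes a majority vote over the positions that share a residue and counts the mismatches.

```python
from collections import Counter


def hamming_period_candidates(s, max_errors):
    """Return (length, period, errors) for every candidate period length
    p <= len(s) // 2 whose best-fitting periodic string is within
    max_errors Hamming errors of s. Quadratic time; illustration only."""
    n = len(s)
    candidates = []
    for p in range(1, n // 2 + 1):
        period = []
        errors = 0
        for r in range(p):
            # All positions congruent to r modulo p should hold one symbol.
            column = s[r::p]
            symbol, count = Counter(column).most_common(1)[0]
            period.append(symbol)
            # Every position that disagrees with the majority is an error.
            errors += len(column) - count
        if errors <= max_errors:
            candidates.append((p, "".join(period), errors))
    return candidates


# "abc" repeated four times with one substituted symbol at index 8.
print(hamming_period_candidates("abcabcabxabc", 1))
# → [(3, 'abc', 1), (6, 'abcabc', 1)]
```

The sketch returns several candidates rather than a single answer because any multiple of a valid period length fits the data at least as well, so the shortest candidate is the natural choice here.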
Funder
National Science Foundation
Israel Science Foundation
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)
Cited by
11 articles.
1. On suffix tree detection;Theoretical Computer Science;2024-10
2. On Suffix Tree Detection;String Processing and Information Retrieval;2023
3. Multidimensional Period Recovery;Algorithmica;2022-02-14
4. Multidimensional Period Recovery;String Processing and Information Retrieval;2020
5. Approximate cover of strings;Theoretical Computer Science;2019-11