Affiliation:
1. Université Paris Diderot
2. University of Chile
Abstract
Self-indexes are able to represent a text asymptotically within the information-theoretic lower bound under the k-th order entropy model and offer access to any text substring and indexed pattern searches. Their time complexities are not optimal, however; in particular, they are always multiplied by a factor that depends on the alphabet size. In this article, we achieve, for the first time, full alphabet independence in the time complexities of self-indexes while retaining space optimality. We also obtain some relevant byproducts.
Funder
Agence Nationale de la Recherche
Millennium Institute for Cell Dynamics and Biotechnology
Publisher
Association for Computing Machinery (ACM)
Subject
Mathematics (miscellaneous)
Reference
47 articles.
1. Fast text searching for regular expressions or automaton searching on tries
2. R. Baeza-Yates and B. Ribeiro-Neto. 2011. Modern Information Retrieval (2nd ed.). Addison-Wesley.
Cited by
39 articles.
1. Near-Optimal Search Time in δ-Optimal Space, and Vice Versa;Algorithmica;2023-11-06
2. Collapsing the Hierarchy of Compressed Data Structures: Suffix Arrays in Optimal Compressed Space;2023 IEEE 64th Annual Symposium on Foundations of Computer Science (FOCS);2023-11-06
3. String Indexing with Compressed Patterns;ACM Transactions on Algorithms;2023-09-26
4. PTHash: Revisiting FCH Minimal Perfect Hashing;Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval;2021-07-11
5. Range Majorities and Minorities in Arrays;Algorithmica;2021-03-19