Affiliation:
1. Helsinki Institute of Information Technology, Department of Computer Science, University of Helsinki, Finland
Abstract
For decades the Lempel-Ziv (LZ77) factorization has been a cornerstone of data compression and string processing algorithms, and uses for it are still being uncovered. For example, LZ77 is central to several recent text indexing data structures designed to search highly repetitive collections. However, in many applications computation of the factorization remains a bottleneck in practice. In this article, we describe a number of simple and fast LZ77 factorization algorithms, which consistently outperform all previous methods in practice, use less memory, and still offer strong worst-case performance guarantees. A common feature of the new algorithms is that they compute longest common prefix information in a lazy fashion, with the degree of laziness in preprocessing characterizing different algorithms.
Publisher
Association for Computing Machinery (ACM)
Subject
Theoretical Computer Science
Cited by
4 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献