Abstract
We describe an algorithm computing an optimal prefix free code for n unsorted positive weights in time within O ( n ( 1 + lg α ) ) ⊆ O ( n lg n ) , where the alternation α ∈ [ 1 . . n − 1 ] approximates the minimal amount of sorting required by the computation. This asymptotical complexity is within a constant factor of the optimal in the algebraic decision tree computational model, in the worst case over all instances of size n and alternation α . Such results refine the state of the art complexity of Θ ( n lg n ) in the worst case over instances of size n in the same computational model, a landmark in compression and coding since 1952. Beside the new analysis technique, such improvement is obtained by combining a new algorithm, inspired by van Leeuwen’s algorithm to compute optimal prefix free codes from sorted weights (known since 1976), with a relatively minor extension of Karp et al.’s deferred data structure to partially sort a multiset accordingly to the queries performed on it (known since 1988). Preliminary experimental results on text compression by words show α to be polynomially smaller than n, which suggests improvements by at most a constant multiplicative factor in the running time for such applications.
Subject
Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献
1. Lazy Search Trees;2020 IEEE 61st Annual Symposium on Foundations of Computer Science (FOCS);2020-11
2. Editorial: Special Issue on Data Compression Algorithms and Their Applications;Algorithms;2020-01-17