A faster FPRAS for #NFA

Published:2024-05-10 Issue:2 Volume:2 Page:1-22
ISSN:2836-6573
Container-title:Proceedings of the ACM on Management of Data
language:en
Short-container-title:Proc. ACM Manag. Data

Author:

Meel Kuldeep S.¹, Chakraborty Sourav², Mathur Umang³

Affiliation:

1. University of Toronto, Toronto, ON, Canada

2. Indian Statistical Institute, Kolkata, India

3. National University of Singapore, Singapore, Singapore

Abstract

Given a non-deterministic finite automaton (NFA) A with m states and a natural number n (presented in unary), the #NFA problem asks to determine the size of the set L(A,n) of words of length n accepted by A. While the corresponding decision problem of checking the emptiness of L(A,n) is solvable in polynomial time, the #NFA problem is known to be #P-hard. Recently, the long-standing open question of whether there is an FPRAS (fully polynomial time randomized approximation scheme) for #NFA was resolved by Arenas, Croquevielle, Jayaram, and Riveros in [ACJR19]. The authors demonstrated the existence of an FPRAS with a time complexity of $\tilde{O}(m^{17} n^{17} \cdot \frac{1}{\varepsilon^{14}} \cdot \log(1/\delta))$, for a given tolerance ε and confidence parameter δ. Given the prohibitively high time complexity in terms of each of the input parameters, and considering the widespread application of approximate counting (and sampling) in various tasks in Computer Science, a natural question arises: is there a faster FPRAS for #NFA that can pave the way for the practical implementation of approximate #NFA tools? In this work, we answer this question in the positive. We demonstrate that significant improvements in time complexity are achievable, and propose an FPRAS for #NFA that is more efficient in terms of both time and sample complexity. A key ingredient in the FPRAS due to Arenas, Croquevielle, Jayaram, and Riveros [ACJR19] is inter-reducibility of sampling and counting, which necessitates a closer look at the more informative measure: the number of samples maintained for each pair of state q and length $i \le n$. In particular, the scheme of [ACJR19] maintains $O(m^7 n^7 / \varepsilon^7)$ samples per pair of state and length. In the FPRAS we propose, we systematically reduce the number of samples required for each state to be only poly-logarithmically dependent on m, with significantly less dependence on n and ε, maintaining only $\tilde{O}(n^4 / \varepsilon^2)$ samples per state.
Consequently, our FPRAS runs in time $\tilde{O}((m^2 n^{10} + m^3 n^6) \cdot \frac{1}{\varepsilon^4} \cdot \log^2(1/\delta))$. The FPRAS and its analysis use several novel insights. First, our FPRAS maintains a weaker invariant about the quality of the estimate of the number of samples for each state q and length $i \le n$. Second, our FPRAS requires the distribution of the maintained samples to be close to the uniform distribution only in total variation distance (instead of maximum norm). We believe our insights may lead to further reductions in time complexity and thus open up a promising avenue for future work towards the practical implementation of tools for approximate #NFA.
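
For intuition about the problem statement only, the quantity |L(A,n)| can be computed exactly on small inputs by tracking, for each word length, the set of states reachable on each word. The sketch below is not the FPRAS from the paper; it is a brute-force baseline whose running time can be exponential in m, which is consistent with the #P-hardness noted above. The function name, the dictionary encoding of the transition relation, and the example automaton are all assumptions made for illustration.

```python
from collections import defaultdict


def count_accepted_words(alphabet, delta, initial, accepting, n):
    """Exactly count the words of length n accepted by an NFA.

    delta maps (state, symbol) to a set of successor states.
    Worst-case exponential in the number of states.
    """
    accepting = frozenset(accepting)
    # counts[S] = number of distinct words of the current length whose
    # set of reachable states is exactly S. Each word has one such set,
    # so no word is counted twice even if it has many accepting runs.
    counts = {frozenset([initial]): 1}
    for _ in range(n):
        nxt = defaultdict(int)
        for subset, c in counts.items():
            for a in alphabet:
                succ = frozenset(
                    t for q in subset for t in delta.get((q, a), ())
                )
                if succ:  # drop words with no run at all
                    nxt[succ] += c
        counts = nxt
    return sum(c for subset, c in counts.items() if subset & accepting)


# Hypothetical example: words over {0,1} containing at least one '1'.
ALPHABET = "01"
DELTA = {
    ("q0", "0"): {"q0"},
    ("q0", "1"): {"q0", "q1"},  # non-deterministic choice
    ("q1", "0"): {"q1"},
    ("q1", "1"): {"q1"},
}

if __name__ == "__main__":
    print(count_accepted_words(ALPHABET, DELTA, "q0", {"q1"}, 3))
```

For this example automaton the accepted words of length n are all binary strings except the all-zero string, so the count is 2^n - 1. An FPRAS replaces this exact computation with a randomized estimate that is within a multiplicative (1 + ε) factor with probability at least 1 - δ.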

Funder

Ministry of Education - Singapore

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3651613

References (18 articles)

1. Foundations of Modern Query Languages for Graph Databases

2. Efficient Logspace Classes for Enumeration, Counting, and Uniform Generation

3. #NFA Admits an FPRAS: Efficient Enumeration, Counting, and Uniform Generation for Logspace Classes

4. String analysis for side channels with segmented oracles