Parikh’s Theorem Made Symbolic

Author:

Hague Matthew1ORCID,Jeż Artur2ORCID,Lin Anthony W.3ORCID

Affiliation:

1. Royal Holloway, University of London, Egham, UK

2. University of Wroclaw, Wroclaw, Poland

3. University of Kaiserslautern-Landau, Kaiserslautern, Germany / Max-Planck Institute for Software Systems (MPI-SWS), Kaiserslautern, Germany

Abstract

Parikh’s Theorem is a fundamental result in automata theory with numerous applications in computer science. These include software verification (e.g. infinite-state verification, string constraints, and theory of arrays), verification of cryptographic protocols (e.g. using Horn clauses modulo equational theories) and database querying (e.g. evaluating path-queries in graph databases), among others. Parikh’s Theorem states that the letter-counting abstraction of a language recognized by finite automata or context-free grammars is definable in Linear Integer Arithmetic (a.k.a. Presburger Arithmetic). In fact, there is a linear-time algorithm computing existential Presburger formulas capturing such abstractions, which enables an efficient analysis via SMT-solvers. Unfortunately, real-world applications typically require large alphabets (e.g. Unicode, containing a ‍million of characters) — which are well-known to be not amenable to explicit treatment of the alphabets — or even worse infinite alphabets. Symbolic automata have proven in the last decade to be an effective algorithmic framework for handling large finite or even infinite alphabets. A symbolic automaton employs an effective boolean algebra, which offers a symbolic representation of character sets (i.e. in terms of predicates) and often lends itself to an exponentially more succinct representation of a language. Instead of letter-counting, Parikh’s Theorem for symbolic automata amounts to counting the number of times different predicates are satisfied by an input sequence. Unfortunately, naively applying Parikh’s Theorem from classical automata theory to symbolic automata yields existential Presburger formulas of exponential size. In this paper, we provide a new construction for Parikh’s Theorem for symbolic automata and grammars, which avoids this exponential blowup: our algorithm computes an existential formula in polynomial-time over (quantifier-free) Presburger and the base theory. In fact, our algorithm extends to the model of parametric symbolic grammars, which are one of the most expressive models of languages over infinite alphabets. We have implemented our algorithm and show it can be used to solve string constraints that are difficult to solve by existing solvers.

Funder

European Research Council

Narodowe Centrum Nauki

Publisher

Association for Computing Machinery (ACM)

Subject

Safety, Risk, Reliability and Quality,Software

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3