Affiliation:
1. Department of Computer and Information Science, University of Pennsylvania
Abstract
We propose regular expression pattern matching as a core feature for programming languages for manipulating XML (and similar tree-structured data formats). We extend conventional pattern-matching facilities with regular expression operators such as repetition (*), alternation (|), etc., that can match arbitrarily long sequences of subtrees, allowing a compact pattern to extract data from the middle of a complex sequence. We show how to check standard notions of exhaustiveness and redundancy for these patterns.
Regular expression patterns are intended to be used in languages whose type systems are also based on the regular expression types. To avoid excessive type annotations, we develop a type inference scheme that propagates type constraints to pattern variables from the surrounding context. The type inference algorithm translates types and patterns into regular tree automata and then works in terms of standard closure operations (union, intersection, and difference) on tree automata. The main technical challenge is dealing with the interaction of repetition and alternation patterns with the first-match policy, which gives rise to subtleties concerning both the termination and the precision of the analysis. We address these issues by introducing a data structure representing closure operations lazily.
Publisher
Association for Computing Machinery (ACM)
Subject
Computer Graphics and Computer-Aided Design, Software
Cited by
16 articles.
1. Set-theoretic Types for Erlang;Proceedings of the 34th Symposium on Implementation and Application of Functional Languages;2022-08-31
2. SFJ: An Implementation of Semantic Featherweight Java;Lecture Notes in Computer Science;2020
3. Compositional Dataflow Circuits;ACM Transactions on Embedded Computing Systems;2019-01-31
4. Semantic Subtyping for Objects and Classes;The Computer Journal;2016-12-05
5. Reusing metadata across components, applications, and languages;Science of Computer Programming;2015-02