Authors:
Kari Lila, Konstantinidis Stavros, Kopecki Steffen, Yang Meng
Abstract
The concept of edit distance and its variants has applications in many areas such as computational linguistics, bioinformatics, and synchronization error detection in data communications. Here, we revisit the problem of computing the inner edit distance of a regular language given via a Nondeterministic Finite Automaton (NFA). This problem relates to the inherent maximal error-detecting capability of the language in question. We present two efficient algorithms for solving this problem, both of which execute in time $O(r^2 n^2 d)$, where $r$ is the cardinality of the alphabet involved, $n$ is the number of transitions in the given NFA, and $d$ is the computed edit distance. We have implemented one of the two algorithms and present here a set of performance tests. The correctness of the algorithms is based on the connection between word distances and error detection and the fact that nondeterministic transducers can be used to represent the errors (resp., edit operations) involved in error-detection (resp., in word distances).
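To make the quantity in the abstract concrete, the following minimal sketch computes the edit distance between two words with the standard dynamic-programming recurrence, and then the inner edit distance of a small finite language by brute force over all pairs of distinct words. It is an illustration of the definition only, not either of the paper's NFA-based algorithms; the function names `edit_distance` and `inner_edit_distance` and the sample language are assumptions chosen for this example.

```python
from itertools import combinations


def edit_distance(u: str, v: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning u into v."""
    # prev[j] holds the distance between the processed prefix of u and v[:j].
    prev = list(range(len(v) + 1))
    for i, a in enumerate(u, start=1):
        curr = [i]  # distance from u[:i] to the empty prefix of v
        for j, b in enumerate(v, start=1):
            curr.append(min(
                prev[j] + 1,             # delete a from u
                curr[j - 1] + 1,         # insert b into u
                prev[j - 1] + (a != b),  # substitute a by b (free if equal)
            ))
        prev = curr
    return prev[-1]


def inner_edit_distance(words) -> int:
    """Smallest edit distance between two distinct words of a FINITE language.

    Brute force over all pairs; hypothetical helper for illustration only.
    """
    distinct = set(words)
    if len(distinct) < 2:
        raise ValueError("need at least two distinct words")
    return min(edit_distance(u, v) for u, v in combinations(distinct, 2))


if __name__ == "__main__":
    language = {"0000", "0011", "1100", "1111"}  # sample language (assumed)
    print(edit_distance("kitten", "sitting"))
    print(inner_edit_distance(language))
```

The pairwise approach only works for finite languages and takes time quadratic in the number of words, which is why a regular language given as an NFA calls for automaton-based methods such as those the abstract describes.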
Subject
Computational Mathematics, Computational Theory and Mathematics, Numerical Analysis, Theoretical Computer Science