Enumerating Regular Languages with Bounded Delay

Antoine Amarilli; Mikaël Monet

doi:10.4230/LIPIcs.STACS.2023.8

Conference paper, Year: 2023


Antoine Amarilli

Role: Author
PersonId: 4934
IdHAL: a3nm
ORCID: 0000-0002-7977-4441
IdRef: 195286677

Data, Intelligence and Graphs

Département Informatique et Réseaux

Institut Polytechnique de Paris

Mikaël Monet

Role: Author
PersonId: 1085321

Linking Dynamic Data

Abstract

We study the task, for a given language $L$, of enumerating the (generally infinite) sequence of its words, without repetitions, while bounding the delay between two consecutive words. To allow for delay bounds that do not depend on the current word length, we assume a model where we produce each word by editing the preceding word with a small edit script, rather than writing out the word from scratch. In particular, this witnesses that the language is orderable, i.e., we can write its words as an infinite sequence such that the Levenshtein edit distance between any two consecutive words is bounded by a value that depends only on the language. For instance, $(a+b)^*$ is orderable (with a variant of the Gray code), but $a^* + b^*$ is not. We characterize which regular languages are enumerable in this sense, and show that this can be decided in PTIME given a deterministic finite automaton (DFA) for the language. In fact, we show that, given a DFA $A$, we can compute in PTIME automata $A_1, \ldots, A_t$ such that $L(A)$ is partitioned as $L(A_1) \sqcup \ldots \sqcup L(A_t)$ and every $L(A_i)$ is orderable in this sense. Further, we show that the value of $t$ obtained is optimal, i.e., we cannot partition $L(A)$ into fewer than $t$ orderable languages. In the case where $L(A)$ is orderable (i.e., $t=1$), we show that the ordering can be produced by a bounded-delay algorithm: specifically, the algorithm runs in a suitable pointer machine model, and produces a sequence of bounded-length edit scripts to visit the words of $L(A)$ without repetitions, with bounded delay (exponential in $|A|$) between consecutive scripts. In fact, we show that we can achieve this while only allowing the edit operations push and pop at the beginning and end of the word, which implies that the word can be maintained in a double-ended queue.
By contrast, when fixing the distance bound $d$ between consecutive words and the number of classes of the partition, it is NP-hard in the input DFA $A$ to decide whether $L(A)$ is orderable in this sense, already for finite languages. Finally, we study the model where push-pop edits are only allowed at the end of the word, corresponding to a case where the word is maintained on a stack. We show that these operations are strictly weaker and that the slender languages are precisely those that can be partitioned into finitely many languages that are orderable in this sense. For the slender languages, we can again characterize the minimal number of languages in the partition, and achieve bounded-delay enumeration.
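
The orderability example from the abstract can be illustrated with a small Python sketch. It is not the construction from the paper: it is one possible Gray-code variant, chosen here for illustration, that lists the words of $(a+b)^*$ length by length, alternating the direction of the reflected Gray code so that the jump from one length to the next is a single insertion. The function names (gray_words, enumerate_all_words, levenshtein) are hypothetical. The sketch only checks the distance bound on a finite prefix and makes no attempt at bounded delay or at restricting edits to the ends of the word.

```python
from itertools import islice


def gray_words(k):
    """Reflected Gray code over {a, b}: all words of length k,
    ordered so that neighbours differ in exactly one letter."""
    if k == 0:
        return [""]
    prev = gray_words(k - 1)
    return ["a" + w for w in prev] + ["b" + w for w in reversed(prev)]


def enumerate_all_words():
    """Yield every word of (a+b)* exactly once, by increasing length.

    Assumption of this sketch: odd lengths are listed forward and even
    lengths backward, so each length ends on a word that is one
    insertion away from the first word of the next length.
    """
    k = 0
    while True:
        block = gray_words(k)
        yield from (block if k % 2 == 1 else reversed(block))
        k += 1


def levenshtein(u, v):
    """Standard dynamic-programming Levenshtein edit distance."""
    row = list(range(len(v) + 1))
    for i, cu in enumerate(u, 1):
        new = [i]
        for j, cv in enumerate(v, 1):
            new.append(min(row[j] + 1,                # delete cu
                           new[j - 1] + 1,            # insert cv
                           row[j - 1] + (cu != cv)))  # substitute
        row = new
    return row[-1]


# All 2^7 - 1 = 127 words of length at most 6, in the order above.
words = list(islice(enumerate_all_words(), 2**7 - 1))
max_gap = max(levenshtein(u, v) for u, v in zip(words, words[1:]))

# For contrast: in a^* + b^*, moving between the two branches costs
# as much as the longer word, so no fixed bound can hold.
cross_gap = levenshtein("a" * 5, "b" * 3)
```

On this prefix, every pair of consecutive words is at edit distance 1. For $a^* + b^*$, the distance between $a^n$ and $b^m$ is $\max(n, m)$: any ordering has to switch between the two branches infinitely often, and only finitely many words are short, which is the intuition for why that language is not orderable.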

Keywords

Regular language; Constant-delay enumeration; Edit distance; Theory of computation → Formal languages and automata theory

Domains

Computer Science [cs]

Main file

enumerating.pdf (880.61 KB)


https://hal.science/hal-03940590

Submitted on: Monday, 16 January 2023, 11:20:03

Last modified on: Wednesday, 24 January 2024, 09:54:24

Long-term archiving on: Monday, 17 April 2023, 18:54:15

Dates and versions

hal-03940590, version 1 (16-01-2023)

License

Attribution

Identifiers

HAL Id: hal-03940590, version 1
arXiv: 2209.14878
DOI: 10.4230/LIPIcs.STACS.2023.8

Cite

Antoine Amarilli, Mikaël Monet. Enumerating Regular Languages with Bounded Delay. STACS, Mar 2023, Hamburg, Germany. ⟨10.4230/LIPIcs.STACS.2023.8⟩. ⟨hal-03940590⟩


Collections

INSTITUT-TELECOM CNRS INRIA PARISTECH CRISTAL INRIA2 CRISTAL-LINKS UNIV-LILLE LTCI INFRES DIG IP_PARIS ANR

