Chapter 10 in Handbook of language theory, G.
Rozenberg et A. Salomaa (éd.), 1997
PostScript file compressed with
gzip, PDF
file
Résumé. Ce chapitre présente une
synthèse de ce qu'on appelle souvent la théorie
algébrique des automates. Cette théorie parle de langages,
d'automates et de semigroupes et a des connections avec la logique, les
circuits booléens, la dynamique symbolique et la topologie.
Abstract. This chapter gives an overview on what is often called the
algebraic theory of finite automata. It deals with languages, automata and
semigroups, and has connections with model theory in logic, boolean
circuits, symbolic dynamics and topology.
Introduction : This chapter gives an overview on what is often called the
algebraic theory of finite automata. It deals with languages, automata and
semigroups, and has connections with model theory in logic, boolean
circuits, symbolic dynamics and topology.
Kleene's theorem is usually considered as the foundation of this theory. It
shows that the class of recognizable languages (i.e. recognized by
finite automata), coincides with the class of rational languages,
which are given by rational expressions. The definition of the syntactic
monoid, a monoid canonically attached to each language, was first given in
an early paper of Rabin and Scott, where the notion is credited to Myhill.
It was shown in particular that a language is recognizable if and only if
its syntactic monoid is finite. However, the first classification results
were rather related to automata. A break-through was realized by
Schützenberger in the mid sixties. Schützenberger made a non
trivial use of the syntactic monoid to characterize an important subclass
of the rational languages, the star-free languages.
Schützenberger's theorem states that a language is star-free if and
only if its syntactic monoid is finite and aperiodic.
This theorem had a considerable influence on the theory. Two other
important "syntactic" characterizations were obtained in the early
seventies: Simon proved that a language is piecewise testable if and only
if its syntactic monoid is finite and J-trivial and Brzozowski-Simon and
independently, McNaughton characterized the locally testable languages.
These successes settled the power of the semigroup approach, but it was
Eilenberg who discovered the appropriate framework to formulate this type
of results.
A variety of finite monoids is a class of monoids closed under taking
submonoids, quotients and finite direct product. Eilenberg's theorem
states that varieties of finite monoids are in one to one correspondence
with certain classes of recognizable languages, the varieties of languages.
For instance, the rational languages correspond to the variety of all
finite monoids, the star-free languages correspond to the variety of finite
aperiodic monoids, and the piecewise testable languages correspond to the
variety of finite J-trivial monoids. Numerous similar results have been
established during the past fifteen years and, for this reason, the theory
of finite automata is now intimately related to the theory of finite
monoids.
It is time to mention a sensitive feature of this theory, the role of the
empty word. Indeed, there are two possible definitions for a language. A
first definition consists in defining a language on the alphabet A
as a subset of the free monoid A^{*}. In this case a language may
contain the empty word. In the second definition, a language is defined as
a subset of the free semigroup, which excludes the empty word. This subtle
distinction has dramatic consequences on the full theory. First, one has
to distinguish between *-varieties (the first case) and +-varieties of
languages (the latter case). Next, with the latter definition, monoids
have to be replaced by semigroups and Eilenberg's theorem gives a one to
one correspondence between varieties of finite semigroups and +-varieties
of languages. Although it might seem quite annoying to have two such
parallel theories, this distinction proved to be necessary. For instance,
the locally testable languages form a +-variety, which correspond to
locally idempotent and commutative semigroups. But no characterization of
the locally testable languages is known in terms of syntactic monoids.
An extension of the the notion of syntactic semigroup (or monoid) was
recently proposed. The key idea is to define a partial order on syntactic
semigroups, leading to the notion of syntactic ordered semigroup. The
resulting extension of Eilenberg's variety theory permits to treat classes
of languages that are not necessarily closed under complement, a major
difference with the original theory. We have adopted this new point of
view throughout this chapter. For this reason, even our definition of
recognizable languages may seem unfamiliar to the reader.
The theory has now developed into many directions and has generated a
rapidly growing literature.The aim of this chapter is to provide the
reader with an overview of the main results. As these results are nowadays
intimately related with non commutative algebra, a certain amount of
semigroup theory had to be introduced, but we tried to favor the main ideas
rather than the technical developments. Some important topics had to be
skipped and are briefly mentioned in the last section. Due to the lack of
place, no proofs are given, but numerous examples should help the reader to
catch the spirit of the more abstract statements. The references listed at
the end of the chapter are far from being exhaustive. However, most of the
references should be reached by the standard recursive process of tracing
the bibliography of the papers cited in the references.
The chapter is organized as follows. The amount of semigroup theory that
is necessary to state precisely the results of this chapter is introduced
in Section 2. The basic concepts of recognizable set and syntactic ordered
semigroups are introduced in Section 3. The variety theorem is stated in
Section 4 and examples follow in Section 5. Some algebraic tools are
presented in Section 6. Sections 7 and 8 are devoted to the study of the
concatenation product and its variants. Connections with the theory of
codes are discussed in Section 9. Section 10 gives an overview on the
operators on recognizable languages. Various extensions are briefly
reviewed in Section 11.
PostScript file compressed with
gzip, PDF
file