Texts and Monographs in Computer Science
Editor
David Gries
Advisory Board
F. L. Bauer
J. J. Horning
R. Reddy
D. C. Tsichritzis
W. M. Waite
The AKM Series in
Theoretical Computer Science
A Subseries of Texts and Monographs in Computer Science
A Basis for Theoretical Computer Science
by M. A. Arbib, A. J. Kfoury, and R. N. Moll
A Programming Approach to Computability
by A. J. Kfoury, R. N. Moll, and M. A. Arbib
An Introduction to Formal Language Theory
by R. N. Moll, M. A. Arbib, and A. J. Kfoury
Algebraic Approaches to Program Semantics
by E. G. Manes and M. A. Arbib
Algebraic Approaches to Program Semantics
Ernest G. Manes
Michael A. Arbib
Springer-Verlag
New York Berlin Heidelberg
London Paris Tokyo
Ernest G. Manes
Department of Mathematics and
Statistics
University of Massachusetts
Amherst, Massachusetts 01003
U.S.A.
Michael A. Arbib
Departments of Computer Science,
Neurobiology and Physiology
University of Southern California
Los Angeles, California 90089
U.S.A.
Series Editor
David Gries
Department of Computer Science
Cornell University
Upson Hall
Ithaca, New York 14853
U.S.A.
Library of Congress Cataloging in Publication Data
Manes, Ernest G., 1943-
Algebraic approaches to program semantics.
(Texts and monographs in computer science)
Includes index.
1. Programming languages (Electronic computers)--Semantics. 2. Algebra. I. Arbib, Michael A.
II. Title. III. Series.
1986
005.13'1
86-6563
QA76.7.M34
© 1986 by Springer-Verlag New York Inc.
All rights reserved. No part of this book may be translated or reproduced in any form without
written permission from Springer-Verlag, 175 Fifth Avenue, New York, New York 10010, U.S.A.
The use of general descriptive names, trade names, trademarks, etc. in this publication, even
if the former are not especially identified, is not to be taken as a sign that such names, as
understood by the Trade Marks and Merchandise Marks Act, may accordingly be used freely by
anyone.
Typeset by Asco Trade Typesetting Ltd., Hong Kong.
9 8 7 6 5 4 3 2 1
ISBN-13: 978-1-4612-9377-4
DOI: 10.1007/978-1-4612-4962-7
e-ISBN-13: 978-1-4612-4962-7
To
Bernadette and Prue
Preface
In the 1930s, mathematical logicians studied the notion of "effective computability" using such notions as recursive functions, λ-calculus, and Turing
machines. The 1940s saw the construction of the first electronic computers,
and the next 20 years saw the evolution of higher-level programming languages
in which programs could be written in a convenient fashion independent
(thanks to compilers and interpreters) of the architecture of any specific
machine. The development of such languages led in turn to the general
analysis of questions of syntax, structuring strings of symbols which could
count as legal programs, and semantics, determining the "meaning" of a
program, for example, as the function it computes in transforming input data
to output results. An important approach to semantics, pioneered by Floyd,
Hoare, and Wirth, is called assertion semantics: given a specification of which
assertions (preconditions) on input data should guarantee that the results
satisfy desired assertions (postconditions) on output data, one seeks a logical
proof that the program satisfies its specification. An alternative approach,
pioneered by Scott and Strachey, is called denotational semantics: it offers
algebraic techniques for characterizing the denotation of (i.e., the function
computed by) a program; the properties of the program can then be checked
by direct comparison of the denotation with the specification.
This book is an introduction to denotational semantics. More specifically,
we introduce the reader to two approaches to denotational semantics: the
order semantics of Scott and Strachey and our own partially additive semantics.
Moreover, we show how each approach may be applied both to the specification of the semantics of programs, including recursive programs, and to the
specification of new data types from old. There has been a growing acceptance
that category theory, a branch of abstract algebra, provides a perspicuous
general setting for all these topics, and for many other algebraic approaches
to program semantics as well. Thus, an important aim of this book is to
interweave the study of semantics with a completely self-contained introduction to a useful core of category theory, fully motivated by basic concepts of
computer science.
Computer science seeks to provide a scientific basis for the study of information processing, algorithms, and the design and programming of computers. The past four decades have witnessed major advances in programming
methodology, which allow immense programs to be designed with increasing
speed and reduced error, and in the development of mathematical techniques
to allow the rigorous specification of program, process, and machine. The
present volume is one of a series, the AKM Series in Theoretical Computer
Science, designed to make key mathematical developments in computer
science readily accessible to undergraduate and beginning graduate students.
The book is essentially self-contained: what little background is required may
be found in the AKM volume A Basis for Theoretical Computer Science.
However, this book is more algebraic than other books in the AKM
Series, and as such may prove somewhat heavier going, at least for American
students, since the American curriculum in theoretical computer science, as
distinct from the European curriculum, stresses combinatorial methods over
algebraic methods.
The book is organized in three parts: Part 1 presents the denotational
semantics of control, that is, the way in which the denotation of a program
can be obtained from the denotation of the pieces from which it is composed.
The approach is motivated by analysis of a fragment of Pascal, a functional
programming fragment, and a consideration of nondeterministic semantics.
Basic notions of category theory include those of product and coproduct.
Chapter 3 presents the elements of partially additive semantics, including
a denotational semantics of iteration and a new theory of guards ("test
functions") which provides a bridge between denotational semantics and the
assertion semantics presented in Chapter 4.
Part 2 extends the theory of Part 1 by showing how the Kleene sequence
yields a denotation for the computation given by a recursive program.
Chapter 6 then introduces domains as the setting for the order semantics
of recursion, while Chapter 8 provides the partially additive semantics of
recursion. Chapter 7, on canonical fixed points, provides a unified setting for
both approaches, as well as for the study of fixed points in metric spaces in
Chapter 9.
Part 3 extends the theory to data types. The crucial tools are provided
by the following notions from category theory, which are introduced in
Chapters 10 and 11: functors, fixed points of functors, and co-continuous and
continuous functors. We motivate these with a discussion of how a generalized
Kleene sequence can provide the denotation of a recursive specification of a
data type. In Chapter 12, we consider parametric specification of data types,
analyzing arrays, stacks, queues, and our functional programming fragment
in the process. We devote Chapter 13 to the order semantics of data types.
Finally, Chapter 14 gives a brief introduction to describing data types using
operations and equations, and extends the earlier theory of functorial fixed
points to include these ideas. As a result, the reader is not limited to any one
algebraic approach to program semantics, but rather is given the tools to
tailor the formal semantics to the need of different applications.
The book grew out of our research in partially additive semantics, which
was in turn based on our general investigation of "category theory applied to
computation and control." We thank the National Science Foundation for
its support of this research. This volume represents an attempt to place the
material in the perspective of other approaches to denotational semantics,
and to render the common algebraic tools as accessible as possible. We thank
our many colleagues in both America and Europe for all they taught us in the
course of this research, and for their comments on an earlier draft of the
book. It is with regret that we note that limitations of space make it impossible
to address all the topics raised in this correspondence within the compass of
an introductory text. Finally, we thank Gwyn Mitchell and Kathy Adamczyk
for their typing of the draft of this manuscript; and Ms. Adamczyk for helping
with research on the notes for Chapter 5.
Amherst, Massachusetts
ERNEST G. MANES
MICHAEL A. ARBIB
Contents
Part 1 Denotational Semantics of Control
CHAPTER 1
An Introduction to Denotational Semantics
1.1 Syntax and Semantics
1.2 A Simple Fragment of Pascal
1.3 A Functional Programming Fragment
1.4 Multifunctions
1.5 A Preview of Partially Additive Semantics
CHAPTER 2
An Introduction to Category Theory
2.1 The Definition of a Category
2.2 Isomorphism, Duality, and Zero Objects
2.3 Products and Coproducts
CHAPTER 3
Partially Additive Semantics
3.1 Partial Addition
3.2 Partially Additive Categories and Iteration
3.3 The Boolean Algebra of Guards
CHAPTER 4
Assertion Semantics
4.1 Assertions and Preconditions
4.2 Partial Correctness
4.3 Total Correctness
Part 2 Semantics of Recursion
CHAPTER 5
Recursive Specifications
5.1 The Kleene Sequence
5.2 The Pattern-of-Calls Expansion
5.3 Iteration Recursively
CHAPTER 6
Order Semantics of Recursion
6.1 Domains
6.2 Fixed Point Theorems
6.3 Recursive Specification in FPF
6.4 Fixed Points and Formal Languages
CHAPTER 7
Canonical Fixed Points
CHAPTER 8
Partially Additive Semantics of Recursion
8.1 PAR Schemes
8.2 The Canonical Fixed Point for PAR Schemes
8.3 Additive Domains
8.4 Proving Correctness
8.5 Power Series and Products
CHAPTER 9
Fixed Points in Metric Spaces
9.1 Contractions on Complete Metric Spaces
9.2 Differential Equations
9.3 Metrics on Trees
9.4 Context-Free Languages as Metric Fixed Points
Part 3 Data Types
CHAPTER 10
Functors
10.1 Data Types Lead to Functors
10.2 Fixed Points of Functors
CHAPTER 11
Recursive Specification of Data Types
11.1 From Least Upper Bounds to Least Fixed Points
11.2 Co-continuous Functors
11.3 Continuous Functors and Greatest Fixed Points
CHAPTER 12
Parametric Specification
12.1 Arrays
12.2 Stacks and Queues
12.3 A Functional Programming Fragment Revisited
CHAPTER 13
Order Semantics of Data Types
13.1 Introduction
13.2 Constructions with Domains
13.3 Cartesian-Closed Categories
13.4 Solving Function Space Equations
CHAPTER 14
Equational Specification
14.1 Initial Algebras
14.2 Sur-reflections
Epilogue
Author Index
Subject Index
PART 1
DENOTATIONAL SEMANTICS OF
CONTROL
CHAPTER 1
An Introduction to Denotational
Semantics
1.1 Syntax and Semantics
1.2 A Simple Fragment of Pascal
1.3 A Functional Programming Fragment
1.4 Multifunctions
1.5 A Preview of Partially Additive Semantics
1.1 Syntax and Semantics
To specify a programming language we must specify its syntax and semantics. The syntax of a programming language specifies which strings of symbols constitute valid programs. A formal description of the syntax typically
involves a precise specification of the alphabet of allowable symbols and a
finite set of rules delineating how symbols may be grouped into expressions,
instructions, and programs. Most compilers for programming languages are
implemented with syntax checking whereby the first stage in compiling a
program is to check its text to see if it is syntactically valid. In practice, syntax
must be described at two levels, for a human user through programming
manuals and as a syntax-checking algorithm within a compiler or interpreter.
"Semantics" is a technical word for "meaning." A semantics for a programming language explains what programs in that language mean. In more
mathematical terms, semantics is a function whose input is a syntactically
valid program and whose output is a description of the function computed by
the program.
There are different approaches to semantics. We briefly introduce three:
operational semantics, denotational semantics, and assertion semantics. We
will give an example of an operational semantics in the next section. Assertion semantics will be further considered in Chapter 4. Denotational semantics is a major concern of this book.
Operational semantics is the most intuitive for beginners with some programming experience, being the form of semantics described in most programming manuals. To provide an operational semantics for a programming
language, one invents an "abstract computer" and describes how programs
"run" on this computer. Usually, the semantics prescribes how the syntactic
form of a program is to be interpreted as a (data-dependent) sequence of
instructions. Input data are then transformed as the program is run in sequence, instruction by instruction, branching and looping back on the basis
of tests on current values of data.
In contrast to operational semantics, which traces all intermediate states
in a computation, denotational semantics focuses on input/output behavior
and ignores the intermediate states. Operational semantics provides more
information on how to implement a programming language as long as the
implementation environment resembles that of the abstract computer. For
example, an operational semantics in which every computation is described
as a serial sequence of state changes would be somewhat at odds with an
implementation on a pipeline architecture which maximizes parallel computation. An objective of denotational semantics is to avoid worry about
details of implementation.
A challenge posed by denotational semantics is to invent mathematical
frameworks permitting the description of repetitive programming constructs
(i.e., "loops") without explicit reference to intermediate states. The "partially
additive semantics" of Section 1.5 introduces a power-series representation
for computed functions which, in part, expresses programming constructs
in terms of operations that manipulate power series. Other approaches to
denotational semantics, to be discussed in Part 2, use partially ordered sets
and metric spaces for their mathematical underpinnings.
Before discussing assertion semantics we must first introduce assertions.
An assertion is a statement about the program state which is either true or
false. As an example, consider the (hopefully transparent) program 1.
1
INPUTS: X
OUTPUTS: Y
{X ≥ 0}
BEGIN
(a block of code representing an
algorithm for Y := √X)
END
{X = Y*Y}.
The assertions are shown enclosed by braces, { and }. They are not part of
the program, but assert what properties should hold true when the assertion
is encountered in executing the program. A program is correct if indeed the
satisfaction of all initial assertions about the input data guarantees the truth
of all assertions encountered later on.
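The role played by the braces in program 1 can be imitated with executable checks. The following Python sketch is only an illustration under stated assumptions: the name checked_sqrt, the use of floating-point arithmetic, and the tolerance used in the postcondition are our own choices and are not part of the notation of this section.

```python
import math

def checked_sqrt(x: float) -> float:
    """Compute Y := sqrt(X) with the two assertions of program 1 made executable."""
    # Precondition {X >= 0}: must hold of the input data.
    assert x >= 0, "precondition {X >= 0} violated"
    y = math.sqrt(x)
    # Postcondition {X = Y*Y}: checked up to a tolerance, because floating-point
    # square roots are inexact (the tolerance is an assumption of this sketch).
    assert math.isclose(x, y * y, rel_tol=1e-9, abs_tol=1e-12), \
        "postcondition {X = Y*Y} violated"
    return y
```

A call such as checked_sqrt(9.0) passes both checks, whereas checked_sqrt(-1.0) fails at the precondition before any computation takes place.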
One could attempt to design a programming language with assertions in
mind. All built-in functions would come with associated assertions and for
each programming construct there would be rules explaining how to find
suitable assertions for the overall construct from the pieces of the construct
and their assertions. Ideally, every program would automatically be strewn
with assertions with the following beneficial effects. The assertions would
usefully document the program, and it would be possible to write software
that could automatically scan the assertions to detect bugs and check for
correctness.
In the next section we introduce a small fragment of Pascal giving a formal
syntax and an operational semantics. In Section 1.3, however, we introduce a
functional programming fragment that makes no use of identifiers or assignment statements. Here, the concept of "state" (which in Section 1.2 means the
values stored by the identifiers) would require major overhaul before one
could give an operational semantics or an assertion semantics. It is hard to
create general semantic theories devoid of built-in assumptions about the
programming languages to which they apply!
1.2 A Simple Fragment of Pascal
In this section we describe an abbreviated version of Pascal. Although this
limited version has full computing power with regard to functions whose
inputs and outputs are natural numbers, this is a tangential point; the main
objective of this section is to illustrate how to present a formal syntax as well
as an operational semantics for a simple programming language. The reader
should observe that the level of precision of the operational semantics is such
that it becomes fairly clear how to write a compiler or interpreter for the
Pascal fragment, so that we accomplish more than an exercise in formalizing
what we already knew.
The complete syntax of our Pascal fragment is given in Table 1. Here, the
colons, commas, and periods are not among the 64 symbols in the alphabet.
Parentheses are used liberally to ensure that there is exactly one way to
derive an expression, test, or statement using the building rules and beginning with those which are given outright. We do not give a formal proof of
this here, but encourage the reader to explore this (see Exercise 1). Three
examples of expressions are
((a + 5)*2),    572,    (cat + (dog + mouse)),
whereas, according to our rules,
a+5
is not an expression. An example of a statement is shown in 2.
Table 1 The Syntax of a Pascal Fragment
Alphabet of Symbols
Digits: 0, 1, ... ,9
Letters: a, b, ... , z
Boolean Truth Values: T, F
Parentheses: ( , )
Boolean Connectives: ¬, ∨, ∧
Comparisons: =, ≠, <, ≤, >, ≥
Arithmetic Functions: +, -, *, ÷
Statement Constructors: :=, ;, begin, end, if, then, else, while, do, repeat, until
The set of expressions is defined by:
Given Outright: Any nonempty string of digits (called a numeral), a letter
followed by a (possibly empty) string of digits and letters
(called an identifier).
Building Rules: If D, E are expressions so are (D + E), (D - E), (D * E), (D ÷ E).
The set of tests is defined by:
Given Outright: T and F;
D = E, D ≠ E, D < E, D ≤ E, D > E, D ≥ E for any two
expressions D, E.
Building Rules: If B, C are tests so are (¬B), (B ∨ C), (B ∧ C).
The set of statements is defined by:
Given Outright: I := E if I is an identifier and E is an expression.
Building Rules: If S₁, ..., Sₙ are statements (n ≥ 0) so is begin S₁; ...; Sₙ end.
If B is a test and R, S are statements, so are
(if B then R else S)
(while B do S)
(repeat S until B).
2
begin a := 5; (while (a > 0 ∧ a ≠ 6) do a := a - 1) end
Note that begin, while, do, and end are single symbols in the chosen alphabet
and that there is no space symbol in the alphabet. Normally, one displays a
statement so as to be more readable by humans, for example, as in 3:
3
begin
    a := 5;
    (while (a > 0 ∧ a ≠ 6) do a := a - 1)
end
This is harmless since we obtain 2 from 3 by ignoring the aspects (in this case
the vertical arrangement and the spaces) which are not expressible in the
formal syntax.
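The rules for expressions translate almost directly into a recognizer. The Python sketch below is a minimal illustration, not a full parser for the fragment: the names parse_expr and is_expression are hypothetical, spaces are discarded before checking (as in the passage from 3 to 2), and the input is read character by character, so multi-letter symbols such as begin are not distinguished from identifiers.

```python
import re

NUMERAL = re.compile(r"[0-9]+")              # nonempty string of digits
IDENTIFIER = re.compile(r"[a-z][a-z0-9]*")   # letter, then digits and letters
OPERATORS = "+-*÷"

def parse_expr(s, i=0):
    """Return the index just past one expression starting at s[i], or None."""
    if i < len(s) and s[i] == "(":
        # Building rule: ( D op E ), with both parentheses mandatory.
        j = parse_expr(s, i + 1)
        if j is None or j >= len(s) or s[j] not in OPERATORS:
            return None
        k = parse_expr(s, j + 1)
        if k is None or k >= len(s) or s[k] != ")":
            return None
        return k + 1
    # Given outright: a numeral or an identifier.
    m = NUMERAL.match(s, i) or IDENTIFIER.match(s, i)
    return m.end() if m else None

def is_expression(text):
    """True if the whole text (spaces ignored) is a single expression."""
    s = text.replace(" ", "")
    return parse_expr(s) == len(s)
```

With this sketch, is_expression accepts ((a + 5)*2), 572, and (cat + (dog + mouse)), and rejects a+5 because the enclosing parentheses are missing.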
We assume that the reader already has a good idea of what the semantics
of our fragment should be. (For example, the algorithm described by 2 always
terminates with identifier a storing the value 0.) A formal operational semantics is as follows.
We imagine an abstract computer with one memory location set aside for
each identifier. Each location stores a single value, where a value is either a
natural number or the symbol ⊥ meaning "as yet undefined." At any time,
only finitely many locations store a number. The effect of executing a statement is to assign numerical values to identifiers by evaluating numerical
expressions according to an algorithm controlled by tests and conditional
and repetitive constructs. (Here we ignore overflow: our numerical operations +, -, *, ÷ for addition, subtraction, multiplication, and division
compute exact integer values no matter how large.) The only thing that can
"go wrong" is that we might attempt to evaluate an expression containing
identifiers for which no numerical values have been assigned. When this
happens we wish to abort the computation and so we create a special abort
state ω. Every other state is a normal state, which we define to be a function σ
from the set of all identifiers to the set of all values, with the requirement that
σ(I) ≠ ⊥ for only finitely many identifiers I. The initial state is the function τ
which assigns ⊥ to each identifier.
The operational semantics of a statement S will be defined as a computation sequence of states beginning with the initial state τ and taking one of
the forms 4a, 4b, or 4c:
4a    τ, σ₁, ..., σₙ, ω    (n ≥ 0, all σᵢ ≠ ω);
4b    τ, σ₁, ..., σₙ, ...    (all σᵢ ≠ ω);
4c    τ, σ₁, ..., σₙ    (n ≥ 0, all σᵢ ≠ ω).
In 4a, computation aborts.
In 4b, computation is nonterminating.
In 4c, the computation terminates in a normal state σₙ.
We now turn to the details of how to associate a definite sequence of states
to a statement. Here the description of Table 1 provides a guide. (We substitute the more mathematical terms "basis step" for "given outright" and
"inductive step" for "building rules" from now on.) We must first assign
appropriate values to expressions and tests (a process that depends on the
state).
5 The value [σ, E] of expression E in normal state σ is defined inductively as
follows.
Basis Step: If E is a numeral, [σ, E] is the usual base-10 natural number
value of E (with leading zeros ignored).
If E is an identifier, [σ, E] = σ(E).
Inductive Step: If either [σ, D] = ⊥ or [σ, E] = ⊥, then
[σ, (D + E)] = [σ, (D - E)] = [σ, (D * E)] = [σ, (D ÷ E)] = ⊥.
Else
[σ, (D + E)] = [σ, D] + [σ, E]
[σ, (D - E)] = [σ, D] ∸ [σ, E]
[σ, (D * E)] = [σ, D][σ, E]
[σ, (D ÷ E)] = [σ, D] div [σ, E]
where the operations on the right-hand sides
are the expected natural-number arithmetic operations so that x ∸ y means
the maximum of 0 and x - y and x div y is the largest integer ≤ x/y, that
is, the unique integer q with x = qy + r, where the remainder r satisfies
0 ≤ r < y.
(Here we have relied on the earlier-stated fact that there is only one way to
decouple an expression; if there were more than one way the above rules
might assign values to expressions ambiguously.)
To illustrate how 5 is used, suppose that σ(a) = 3. Then
[σ, ((a + 5)*2)] = [σ, (a + 5)] [σ, 2]
= ([σ, a] + [σ, 5]) [σ, 2]
= (3 + 5)(2) = 16.
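The inductive definition of the value of an expression can be sketched as a short recursive function. The Python representation here is our own assumption: numerals are ints, identifiers are strings, a compound expression is a tuple (op, D, E), a normal state is a dict that omits undefined identifiers, and None stands for "undefined." Division by zero is not addressed by the definition; in this sketch it simply raises Python's ZeroDivisionError.

```python
BOTTOM = None  # stands for the "as yet undefined" value

def monus(x, y):
    """Natural-number subtraction: the maximum of 0 and x - y."""
    return max(0, x - y)

def value(state, expr):
    """Value of expr in a normal state (a dict from identifiers to ints)."""
    if isinstance(expr, int):            # basis step: numeral
        return expr
    if isinstance(expr, str):            # basis step: identifier
        return state.get(expr, BOTTOM)
    op, d, e = expr                      # inductive step: (op, D, E)
    x, y = value(state, d), value(state, e)
    if x is BOTTOM or y is BOTTOM:       # undefinedness propagates
        return BOTTOM
    if op == "+":
        return x + y
    if op == "-":
        return monus(x, y)
    if op == "*":
        return x * y
    if op == "div":
        return x // y                    # ZeroDivisionError if y == 0 (not covered by the text)
    raise ValueError(f"unknown operator: {op}")
```

For the state in which a holds 3, value({"a": 3}, ("*", ("+", "a", 5), 2)) reproduces the computation above and yields 16; in the initial state, where a is undefined, the same expression evaluates to None.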
Tests are evaluated in a similar way:
6 The truth value [σ, B] of test B in normal state σ is defined inductively as
follows.
Basis Step:
[σ, T] = T,    [σ, F] = F.
[σ, D = E] is ⊥ if either of [σ, D] or [σ, E] is ⊥, else is T or F
accordingly as [σ, D] = [σ, E], [σ, D] ≠ [σ, E].
[σ, D ≠ E], [σ, D < E], [σ, D ≤ E], [σ, D > E], and [σ, D ≥ E] are defined
similarly.
Inductive Step: Let ¬ (not), ∨ (or), ∧ (and) have their usual meanings on
the Boolean truth values T, F (T for "true," F for "false") so that, for example,
¬T = F, ¬F = T, F ∧ T = F, and so on. Then
[σ, (¬B)] is ⊥ if [σ, B] is ⊥, else is ¬[σ, B].
[σ, (B ∨ C)] is ⊥ if either of [σ, B] or [σ, C] is ⊥, else is [σ, B] ∨ [σ, C].
[σ, (B ∧ C)] is ⊥ if either of [σ, B] or [σ, C] is ⊥, else is [σ, B] ∧ [σ, C].
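One consequence of this definition is that the connectives are strict: a compound test is undefined as soon as either component is undefined, even when the other component alone would settle the answer. The Python sketch below illustrates this under our own conventions (True and False for T and F, None for "undefined"; the function names are hypothetical). It differs deliberately from Python's built-in or and and, which short-circuit.

```python
def strict_not(b):
    """Negation: undefined if b is undefined."""
    return None if b is None else (not b)

def strict_or(b, c):
    """Disjunction: undefined if either argument is undefined."""
    if b is None or c is None:
        return None
    return b or c

def strict_and(b, c):
    """Conjunction: undefined if either argument is undefined."""
    if b is None or c is None:
        return None
    return b and c
```

Here strict_or(True, None) is None, whereas the Python expression True or None evaluates to True; under the definition above, a test with an undefined component always yields the undefined value.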
As a prelude to defining the semantics of statements, we aid the reader's
intuition with flowschemes for the programming constructs in Table 7.
Table 7 Flowschemes for Programming Constructs
Assignment Statement: I := E
Composition: begin S₁; ...; Sₙ end
Conditional: (if B then R else S)
Repetitive Constructs: (while B do S) and (repeat S until B)
[Flowscheme diagrams not reproduced.]
The principal semantic definition is: For any normal state σ the computation
sequence of S starting at σ is a state sequence ⟨σ, S⟩ of one of the three forms
8a, 8b, 8c,
8a    σ, σ₁, ..., σₙ, ω    (n ≥ 0, all σᵢ ≠ ω);
8b    σ, σ₁, ..., σₙ, ...    (all σᵢ ≠ ω);
8c    σ, σ₁, ..., σₙ    (n ≥ 0, all σᵢ ≠ ω)
(with interpretations similar to those of 4a, 4b, 4c) defined inductively as
follows.
9 Basis Step
⟨σ, I := E⟩ = σ, ω     if [σ, E] = ⊥;
⟨σ, I := E⟩ = σ, σ₁    else,
where
σ₁(J) = σ(J)      if J ≠ I;
σ₁(J) = [σ, E]    if J = I.
This is the expected meaning. Identifier I is assigned the value obtained
by evaluating E, as long as this is possible, and other identifiers are left
unchanged.
Inductive Step
10 Composition. Define ⟨σ, begin end⟩ = σ and define ⟨σ, begin S₁ end⟩ =
⟨σ, S₁⟩. Proceeding inductively on the number of statements, assume that
⟨σ, begin S₂; ...; Sₖ₊₁ end⟩ has been defined for every normal state σ and
every k statements S₂, ..., Sₖ₊₁. Then ⟨σ, begin S₁; ...; Sₖ₊₁ end⟩ is defined as
follows. It is defined to be ⟨σ, S₁⟩ if "S₁ fails to terminate normally starting
at σ," that is, if ⟨σ, S₁⟩ has one of the forms 8a, 8b. Otherwise, ⟨σ, S₁⟩ =
σ, σ₁, ..., σₙ as in 8c so we define ⟨σ, begin S₁; ...; Sₖ₊₁ end⟩ to be the sequence
σ, σ₁, ..., σₙ₋₁, ⟨σₙ, begin S₂; ...; Sₖ₊₁ end⟩.
In short, we form the sequence obtained if each Sᵢ₊₁ begins where the
previous Sᵢ leaves off, save that this cannot continue if computation aborts
or one of the Sᵢ did not terminate.
11 Conditional.
⟨σ, (if B then R else S)⟩ =
σ, ω       if [σ, B] = ⊥;
⟨σ, R⟩     if [σ, B] = T;
⟨σ, S⟩     if [σ, B] = F.
Repetitive constructs. The computation sequence ⟨σ, (while B do S)⟩ is given
by 12.
12
⟨σ, (while B do S)⟩ =
σ, ω       if [σ, B] = ⊥;
σ          if [σ, B] = F;
⟨σ, S⟩     if [σ, B] = T and ⟨σ, S⟩ has one of the forms 8a, 8b;
σ, σ₁, ..., σₙ₋₁, ⟨σₙ, (while B do S)⟩     if [σ, B] = T and ⟨σ, S⟩ has the form σ, σ₁, ..., σₙ of 8c.
This sequence may, of course, fail to terminate. We leave it to the reader to
formulate a similar definition for ⟨σ, (repeat S until B)⟩.
13 The computation sequence ⟨S⟩ of the statement S is ⟨τ, S⟩, where τ is the
initial state mapping each identifier to ⊥. The operational semantics of our
Pascal fragment is complete.
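The definitions of this section are precise enough to be transcribed into a small interpreter. The Python sketch below is one possible transcription under assumptions of our own: statements are tuples such as ("assign", I, E), ("begin", [S1, ...]), ("if", B, R, S), and ("while", B, S); a normal state is a dict that omits undefined identifiers; the string "omega" stands for the abort state; and computation sequences are produced lazily by a generator so that nonterminating sequences can still be inspected. The repeat construct is omitted, and a while loop whose body contributes no new state would spin without yielding.

```python
import operator
from itertools import islice

OMEGA = "omega"  # the abort state
ARITH = {"+": operator.add, "-": lambda x, y: max(0, x - y),
         "*": operator.mul, "div": operator.floordiv}
COMPARE = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
           "<=": operator.le, ">": operator.gt, ">=": operator.ge}

def value(state, e):
    """Value of an expression: an int, or None for 'undefined'."""
    if isinstance(e, int):
        return e
    if isinstance(e, str):
        return state.get(e)
    op, d, e2 = e
    x, y = value(state, d), value(state, e2)
    return None if x is None or y is None else ARITH[op](x, y)

def truth(state, b):
    """Truth value of a test: True, False, or None for 'undefined' (strict)."""
    if isinstance(b, bool):
        return b
    if b[0] == "not":
        t = truth(state, b[1])
        return None if t is None else not t
    if b[0] in ("or", "and"):
        s, t = truth(state, b[1]), truth(state, b[2])
        if s is None or t is None:
            return None
        return (s or t) if b[0] == "or" else (s and t)
    x, y = value(state, b[1]), value(state, b[2])
    return None if x is None or y is None else COMPARE[b[0]](x, y)

def rest(state, s):
    """The computation sequence of s from state, without its first entry."""
    it = seq(state, s)
    next(it)  # every sequence begins with the starting state
    return it

def seq(state, s):
    """Lazily yield the computation sequence of s from a normal state."""
    kind = s[0]
    if kind == "if":
        _, b, r, t = s
        c = truth(state, b)
        if c is None:
            yield state
            yield OMEGA
        else:
            yield from seq(state, r if c else t)
        return
    yield state
    if kind == "assign":
        _, ident, e = s
        v = value(state, e)
        yield OMEGA if v is None else {**state, ident: v}
    elif kind == "begin":
        cur = state
        for part in s[1]:              # each part starts where the last ended
            for cur in rest(cur, part):
                yield cur
                if cur is OMEGA:
                    return
    elif kind == "while":
        _, b, body = s
        cur = state
        while True:
            c = truth(cur, b)
            if c is None:
                yield OMEGA
                return
            if not c:
                return
            for cur in rest(cur, body):
                yield cur
                if cur is OMEGA:
                    return

# Statement 2 of this section, run from the initial state (the empty dict).
program_2 = ("begin", [
    ("assign", "a", 5),
    ("while", ("and", (">", "a", 0), ("!=", "a", 6)),
     ("assign", "a", ("-", "a", 1))),
])
trace = list(seq({}, program_2))

# A nonterminating sequence can only be sampled, here by its first 4 states.
loop_forever = ("while", True, ("assign", "a", 1))
prefix = list(islice(seq({}, loop_forever), 4))
```

The list trace ends in the state in which a stores 0, in agreement with the remark made about statement 2 earlier in this section. Using a generator rather than a list is a design choice of the sketch: it lets sequences of the nonterminating form be represented at all.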
EXERCISES FOR SECTION 1.2
1. Give several examples of ambiguities that would arise if the operational semantics
of the syntax of Table 1 were modified to delete some of the parentheses. Why were
we able to get away without parentheses in the tests D = E, ..., D ≥ E? Why were
parentheses not needed to enclose begin S₁; ...; Sₙ end?
2. Give an inductive definition of the set of numerals which excludes numerals with
leading zeros.
3. Provide an algorithm for ⟨σ, (repeat S until B)⟩ similar to that of 12.
4. Two statements R, S are semantically equivalent if ⟨σ, R⟩ = ⟨σ, S⟩ for all normal
states σ. Let B be any test and let R, S be any statements. Show that any two of the
following three statements are semantically equivalent:
begin (while B do R); S end.
(if B then begin (while B do R); S end else S).
begin (while B do (if B then R else S)); S end.
5. Let m be a natural number ≥ 1. Find a polynomial p(n) with natural-number
coefficients such that an m-symbol alphabet has p(n) words of length ≤ n. (Do
not forget the empty word of length 0 which is an important input in practice:
"carriage return.")
1.3 A Functional Programming Fragment
In his provocative 1977 Turing Award Lecture, John Backus expressed concern that many programming languages were syntactically fat and unwieldy
but semantically lean and inexpressive. In reaction, he proposed a new class
of languages, the functional programming languages, in which a "program" is
a symbolic input/output function whose inputs are not given names: there are
no identifiers, assignments, or references of any kind to intermediate storage
and hence there are no side effects (such as clashes between local and global
variable identifiers) to concern the programmer. In this section we present a
simple functional programming fragment whose principal data structures are
trees similar to the "s-expressions" of the programming language LISP but
many of whose function constructors are patterned after those emphasized
by Backus. Because we delay introduction of repetitive constructs into this
fragment until our later discussion of recursion in Chapter 5, the version of
this section temporarily fails to have full computing power.
We shall call our language FPF for "Functional Programming Fragment."
The syntax of FPF is given in Table 1. Here the colons and periods are not
among the 32 symbols of the alphabet.
The reader need not feel uneasy if Table 1 fails to explain how FPF works,
since that is the job of semantics: syntax has no meaning!
Table 1 The Syntax of FPF
Alphabet of Symbols
Digits: 0 1 ... 9
Parentheses: ( ) ⟨ ⟩
Atomic Functions: id head tail + - * ÷ num =
Function Constructors: ≡ ∘ if then else [ ] α /
The set DTN of dynamic trees of numerals (DTNs for short) is defined by:
Basis Step: A numeral (i.e., a nonempty string of digits) is a DTN.
Inductive Step: If t₁, ..., tₖ are DTNs (k ≥ 0) then ⟨t₁, ..., tₖ⟩ is a DTN.
The set of functions is defined by:
Basis Step: An atomic function symbol is a function.
If t is a DTN then ≡t is a function.
Inductive Step: If f₁, ..., fₖ are functions (k ≥ 1) then so are (fₖ ∘ ... ∘ f₁)
and [f₁, ..., fₖ].
If p, f, g are functions then so is (if p then f else g).
If f is a function then so are (αf) and (/f).
We will give a denotational semantics for FPF. We begin by discussing
DTN, whose inductive definition is given in Table 1. This is the set of DTNs,
that is, dynamic trees of numerals (note the different typeface for the set and
for the generic name of elements of the set). This set includes lists of numerals,
namely, the DTNs of form ⟨n₁, ..., nₖ⟩, where the nᵢ are numerals. The case
k = 0 gives the empty list ⟨ ⟩ as a DTN. Similarly, we can have a list of lists
such as ⟨⟨5, 17⟩, ⟨ ⟩, ⟨035⟩⟩, the list whose first entry is the list ⟨5, 17⟩,
whose second entry is the empty list, and whose third entry is the length-1
list consisting of the numeral 035. Other examples are less homogeneous,
for example, ⟨05, ⟨⟨ ⟩, 0, 3⟩⟩. An m × n matrix of numerals (aᵢⱼ), usually
visualized as a rectangular array with the numeral aᵢⱼ in row i and column j,
may conveniently be coded as the DTN
2    ⟨⟨a₁₁, ..., a₁ₙ⟩, ..., ⟨aₘ₁, ..., aₘₙ⟩⟩.
The input to a matrix multiplication algorithm may then be coded as a
length-2 list whose entries are matrices as in 2. These examples suggest the
ease with which DTNs model complex inputs and outputs.
Each DTN has a unique derivation tree describing how to build it using
the basis and inductive steps in the definition of DTN of Table 1. For
example, ⟨1, ⟨⟨0, 10⟩, ⟨ ⟩⟩⟩ has derivation tree
      •
     / \
    1   •
       / \
      •   •
     / \
    0   10
where each node (= dark circle) indicates a list whose entries are the subtrees branching from that node (read in left-to-right order). A node without
branches thus indicates the empty list. The node at the root (= top) of the
(upside down) tree indicates the list represented by the whole tree. It is clear
that such derivation trees are in natural one-to-one correspondence with the
elements of DTN and, indeed, that the list notation is just a convenient way
to code such a tree as a string. This explains the term dynamic tree of
numerals. "Dynamic" is in the same sense as in the term "dynamic array" in
Pascal, meaning that the lengths and shapes of DTNs are not prespecified in
a "declaration."
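The inductive definition of DTN corresponds closely to nested lists. The Python sketch below is an illustration under our own conventions: a numeral is a nonempty string of digits (kept as a string so that leading zeros, as in 035, are preserved), a list of DTNs is a Python list, and the names is_dtn and show are hypothetical. ASCII angle brackets are used when a DTN is written out as a string.

```python
DIGITS = "0123456789"

def is_dtn(t):
    """Basis step: a numeral. Inductive step: a list (possibly empty) of DTNs."""
    if isinstance(t, str):
        return t != "" and all(c in DIGITS for c in t)
    return isinstance(t, list) and all(is_dtn(u) for u in t)

def show(t):
    """Code the tree as a string in list notation."""
    if isinstance(t, str):
        return t
    return "<" + ",".join(show(u) for u in t) + ">"

# A list of lists: first entry a list of two numerals, second entry the
# empty list, third entry a length-1 list.
example = [["5", "17"], [], ["035"]]
```

No lengths or shapes are declared in advance: is_dtn checks only that every leaf is a numeral and every node is a list, which is the sense in which such trees are "dynamic."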
In our denotational semantics of FPF, the semantics of each syntactic
function will be such as to transform inputs in DTN to outputs which are
again in DTN. We pause briefly to note what kind of function constitutes
such a transformation:
3 Definitions. Let X, Y be sets. A partial function from X to Y is specified by
providing a subset A of X and a function mapping each element of A to a
unique element of Y. We say X is the domain, Y is the codomain and A is
the domain of definition. (Other authors use "domain" for our "domain of
definition." Our terminology follows the conventions of category theory as
discussed in the next chapter; see Definition 2.1.1.)
Our most common notation will be to assign a symbolic name such as f
to a partial function. We write "let X --f--> Y be a partial function" to mean
f is a partial function from X to Y. We may also write f: X → Y in place of
X --f--> Y. In either case, we use f(x) for the value assigned by f to each x in
its domain of definition, which we denote by DD(f). If x ∈ X but x ∉ DD(f)
we say "f(x) is undefined."
4 The set of all partial functions from X to Y will be written Pfn(X, Y). The
1 An Introduction to Denotational Semantics
"partial" in partial function means "partially defined." Paradoxically, an important special case of a partial function X --f--> Y occurs when DD(f) = X.
This is just a function from X to Y. For emphasis, we call such f a total function from X to Y.
5 We relate this to program semantics in general before returning to DTN.
Let X be an input set and let Y be an output set. A given algorithm with
input x in X may fail to terminate. Let A be the subset of X consisting of
those x for which the algorithm terminates if x is the input. The denotational
semantics of the algorithm is the partial function f: X → Y with DD(f) = A,
and where f(x) is the output at termination when x is the input. (In 1.4.5
below we will consider a computation environment in which 5 requires
modification.)
6 The set of all total functions from X to Y will be written Tot(X, Y).
If f ∈ Pfn(X, Y), g ∈ Pfn(Y, Z) arise as in 5 we may think of f, g as the
computations of subalgorithms which can be chained together setting the
output of f as the input to g to produce a net output in Z from an input in
X. The formal operation involved is as follows.
7 Definition. For f ∈ Pfn(X, Y), g ∈ Pfn(Y, Z) their composition gf ∈ Pfn(X, Z)
is defined by
    DD(gf) = {x ∈ X | x ∈ DD(f), f(x) ∈ DD(g)},
    (gf)(x) = g(f(x))    for x ∈ DD(gf).
Note that gf is total when f and g are.
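As an informal illustration of Definition 7, the following Python sketch models a partial function with a finite domain of definition as a dict whose keys form the domain of definition. The dict representation and the names `dd` and `compose` are assumptions made for this example only.

```python
# A partial function with finite domain of definition is modeled as a dict:
# the keys are DD(f); any x that is not a key is an input where f is undefined.

def dd(f):
    """Domain of definition DD(f)."""
    return set(f)

def compose(g, f):
    """Composition gf: defined at x exactly when x is in DD(f) and f(x) is in DD(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}

f = {1: "a", 2: "b", 3: "c"}   # DD(f) = {1, 2, 3}
g = {"a": 10, "c": 30}         # DD(g) = {"a", "c"}
gf = compose(g, f)
print(sorted(dd(gf)))          # the inputs 1 and 3
print(gf[3])
```

Here 2 drops out of DD(gf) because f(2) = "b" lies outside DD(g), which is exactly the second condition in the definition.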
The functions studied in first-semester calculus are partial functions
from the set of reals to itself (e.g., DD(1/x) = {x | x ≠ 0}, DD(arcsin x) =
{x | -1 ≤ x ≤ 1}, etc.). The "chain rule" refers to the composition of 7, being
a rule for the derivative of gf. Composition of functions is sometimes called
"chaining" because the output of one function is the input to the next, creating a chain of two links. Longer chains arise in 12 below.
We turn now to the semantics for FPF by associating a partial function in
Pfn(DTN, DTN) to each syntactic function. To keep the notation as simple as
possible we will denote the semantics of a function f by f: and so we will
write f: t for the value f: assigns to the DTN t. Thus, the presence of the
colon (which is not in the alphabet of Table 1) indicates semantics.
In describing a specific partial function f, if a formula for f : t is given for
t of a particular form without further comment our convention is that f: is
not defined for other t. Sometimes, of course, DD(f) is sufficiently complicated for a more careful description to be necessary.
We begin with the basis step functions in Table 1.
8 id: is the identity function, id: t = t for all t ∈ DTN. head returns the first
element of a list and tail drops the first element of a list as follows:
    head: <t1, ..., tk> = t1              (k ≥ 1),
    tail: <t1, ..., tk> = <t2, ..., tk>   (k ≥ 1).
Thus, we cannot make heads or tails of the empty list or numerals.
9 The arithmetic functions +, -, *, and ÷ require an input of form <m, n>
where m, n are numerals. The meaning of the operations is then the same as
in Pascal as described in 1.2.5. Thus,
    +: <m, n> = m + n,
    -: <m, n> = m ∸ n,
    *: <m, n> = mn,
    ÷: <m, n> = m div n,
where, on the right-hand sides, the numerals m, n represent numbers in the
usual base-10 way and the numerical results are represented as numerals
without leading zeros.
10 The numeral function num is defined by
    num: t = << >>   if t is a numeral,
             < >     else.
Similarly, the equality function = takes an input of the form <t, u> where t,
u ∈ DTN are arbitrary and
    =: <t, u> is << >>   if t = u,
                 < >     if t ≠ u.
Here we have coded the truth values as DTNs by representing T as << >> and
F as < >. This is analogous to the trick used in set theory (mathematicians
sometimes adopt the view that all of mathematics may be derived from set
theory) to define natural numbers in terms of sets wherein 0 is defined as the
empty set ∅, 1 is defined as the one-element set {∅}, 2 is defined as the
two-element set {0, 1} = {∅, {∅}}, and n = {0, ..., n - 1} in general. Using
lists instead of sets, that is, by substituting < for { and > for }, the same
constructions are available in DTN. We could have used the numerals 0 and
1 for F and T but it seemed more desirable to use a convention that would
apply to dynamic trees of objects other than numerals. In fact, our convention is analogous to that used in the programming language LISP, where the
empty list NIL is used for the truth value F and any other values may be
interpreted as T.
We now provide the semantics for the basis step in the set of functions
defined in Table 1.
11 For each DTN t, ==t: is the total function which is constantly t, that is,
    ==t: u is t    for any DTN u.
To continue our description of the semantics of FPF we examine the
constructions of the inductive step for functions in Table 1.
12 If f1, ..., fk are functions (k ≥ 1), (fk ∘ ··· ∘ f1): is the k-fold composition of
7, being essentially the same as the pseudo-Pascal f1; ...; fk, that is,
    (fk ∘ ··· ∘ f1): t = fk: (··· (f2: (f1: t)) ···).
(Such is defined, of course, only when all the intermediate steps are defined.)
The next construction applies k functions in parallel and combines the results
in a single list.
13 If f1, ..., fk are functions (k ≥ 1), then DD([f1, ..., fk]:) = DD(f1:) ∩ ··· ∩ DD(fk:) and
    [f1, ..., fk]: t = <f1: t, ..., fk: t>.
This construction is a major tool in building lists.
14 For p, f, g functions,
    (if p then f else g): t is
        f: t        if p: t ≠ < >,
        g: t        if p: t = < >,
        undefined   if p: t is undefined.
Thus, our device for viewing function p as a test is to consider p: t false if it is
our coding < > for false, true if it is defined but not false, and undefined else.
The notation above is understood to mean that (if p then f else g): t is undefined if p: t = < > but g: t is undefined or if p: t is defined and ≠ < > but
f: t is undefined.
15 The symbol α is the apply-to-all operator. If f is a function, (αf): is "f:
applied to all entries in the input list." Specifically, an input to (αf): must have
the form <t1, ..., tk> with k ≥ 1 and each ti ∈ DD(f:) and then
    (αf): <t1, ..., tk> = <f: t1, ..., f: tk>.
16 The symbol / is the insertion operator. If f is a function then (/f):
<t1, t2, t3>, for example, will be defined as f: <t1, f: <t2, t3>>. Equivalently,
using infix notation t f u instead of f: <t, u>, (/f): <t1, t2, t3> = t1 f (t2 f t3).
Similarly, (/f): <t1, t2, t3, t4> will be t1 f (t2 f (t3 f t4)). Thus, / treats f as a
function of two variables and extends it to a function on any number of
variables by "inserting" it between the variables. The formal definition is as
follows. The input must have the form <t1, ..., tk> (k ≥ 0), that is, it cannot
be a numeral. We use induction on k.
    (/f): < > = < >,
    (/f): <t1> = t1,
    (/f): <t1, ..., tk+1> = f: <t1, (/f): <t2, ..., tk+1>>.
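The function-forming constructions of 12 through 16 can be sketched in Python. This is a minimal illustration under stated assumptions, not an implementation of FPF: numerals are modeled as Python ints, lists as tuples, and an undefined result as a raised exception. All names (`Undefined`, `compose`, `construct`, `cond`, `apply_to_all`, `insert`) are hypothetical.

```python
class Undefined(Exception):
    """Raised when a partial function is applied outside its domain of definition."""

F, T = (), ((),)   # the codings of false (empty list) and true (list holding the empty list)

def is_list(t):
    return isinstance(t, tuple)

def head(t):
    if is_list(t) and len(t) >= 1:
        return t[0]
    raise Undefined

def tail(t):
    if is_list(t) and len(t) >= 1:
        return t[1:]
    raise Undefined

def plus(t):
    if is_list(t) and len(t) == 2 and all(isinstance(m, int) for m in t):
        return t[0] + t[1]
    raise Undefined

def compose(*fs):
    """k-fold composition: the rightmost function is applied first."""
    def run(t):
        for f in reversed(fs):
            t = f(t)
        return t
    return run

def construct(*fs):
    """Apply every function to the same input and collect the results in one list."""
    return lambda t: tuple(f(t) for f in fs)

def cond(p, f, g):
    """Conditional: any defined value of p other than the false coding counts as true."""
    return lambda t: g(t) if p(t) == F else f(t)

def apply_to_all(f):
    """Apply-to-all: defined on lists with at least one entry."""
    def run(t):
        if is_list(t) and len(t) >= 1:
            return tuple(f(x) for x in t)
        raise Undefined
    return run

def insert(f):
    """Insertion: empty list to empty list, one-entry list to its entry, else fold from the right."""
    def run(t):
        if not is_list(t):
            raise Undefined
        if len(t) == 0:
            return ()
        if len(t) == 1:
            return t[0]
        return f((t[0], run(t[1:])))
    return run

print(insert(plus)((1, 2, 3)))                  # 1 + (2 + 3)
print(apply_to_all(head)(((1, 2), (3, 4))))     # first entry of every row
```

Because undefinedness is an exception, it propagates through every construction automatically, which matches the convention that a composite is defined only when all intermediate steps are defined.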
This completes the description of the syntax and semantics of FPF. Since
the reader may have had very little prior experience with functional languages,
we will write some FPF functions to illustrate some of the concepts. Additional examples using recursion will be given in Section 5.1, but we shall be
able to achieve quite a bit without any repetitive constructs. Indeed, it is possible to write an FPF function to multiply two square matrices and this is
done in 26 below.
We begin by introducing "abbreviations" which amount to "subprograms."
17 We introduce the symbol =abb. If f is a syntactic function then
    x =abb f,
read "x is an abbreviation for f," is an informal declaration that any occurrence of x may be literally replaced by the string f.
We begin with some abbreviations which produce functions to manipulate
lists and matrices.
18 For any function f and n ≥ 0, f^n is the abbreviation defined by
    f^0 =abb id,
    f^1 =abb f,
    f^n =abb (f ∘ ··· ∘ f)    (n times for n > 1).
For i ≥ 1 we have the following abbreviations:
19  pr_i =abb (head ∘ tail^(i-1)), the ith projection function.
20  col_i =abb (α pr_i), the ith column function.
21  transp_n =abb [col_1, ..., col_n], the n-column transpose function.
Thus, transp_3 is an abbreviation for the FPF function
    [(α(head ∘ id)), (α(head ∘ tail)), (α(head ∘ tail ∘ tail))].
The reader may easily check that pr_i: <t1, ..., tn> is ti for i ≤ n but undefined
for i > n so that pr_i selects the ith entry of a list, that col_i returns the ith
column of a matrix
    col_i: <<a_11, ..., a_1n>, ..., <a_m1, ..., a_mn>> is
        <a_1i, ..., a_mi>   if i ≤ n,
        undefined           if i > n,
and that
    transp_n: <<a_11, ..., a_1n>, ..., <a_m1, ..., a_mn>>
        = <<a_11, ..., a_m1>, ..., <a_1n, ..., a_mn>>
produces the transpose of an n-column matrix.
In 26 below we define an FPF function to multiply two n x n matrices. We
present a strategy whereby a sequence of subfunctions will be composed to
produce the desired function. We begin with the input at step 0 and the result
at step i will be the output of the ith subfunction and the input to the (i + l)th
subfunction. For C a matrix, we use the notations C_i and C^j for the ith row
and jth column of C, and thus C_i^j for its ij component.
Step 0: <A, B>, A, B the input n x n matrices coded as in 2.
Step 1: <<A_1, B>, ..., <A_n, B>>.
Step 2: <<<A_1, B^1>, ..., <A_1, B^n>>, ..., <<A_n, B^1>, ..., <A_n, B^n>>>.
Step 3: Replace each <A_i, B^j> with its dot product
    dot: <A_i, B^j> = Σ (A_i^k B_k^j | 1 ≤ k ≤ n)
so that the result in Step 3 is indeed the matrix product desired.
We implement these subfunctions as follows.
The ith row of a matrix is just the ith column of its transpose and this
leads to
22
Since
pr2: <A,B)
=
B,
a function to transform step 0 to step 1 is
In the same vein, if
then
    g: <A_i, B> = <<A_i, B^1>, ..., <A_i, B^n>>
so that
24  f12 =abb (αg)
transforms step 1 to step 2. A powerful use of insertion and apply-to-all is
25  dot_n =abb (/+)(α*)transp_n
whose semantics for two length-n lists of numerals is
    dot_n: <<p1, ..., pn>, <q1, ..., qn>> = p1q1 + ··· + pnqn.
(In detail,
    transp_n: <<p1, ..., pn>, <q1, ..., qn>> = <<p1, q1>, ..., <pn, qn>>
    (α*): <<p1, q1>, ..., <pn, qn>> = <p1q1, ..., pnqn>
    (/+): <p1q1, ..., pnqn> = p1q1 + (/+): <p2q2, ..., pnqn>
                            = p1q1 + p2q2 + (/+): <p3q3, ..., pnqn>
                            = ··· = p1q1 + ··· + pnqn.)
Thus, step 2 to step 3 is achieved by (α(α dot_n)). Chaining these steps together,
we obtain an FPF abbreviation to multiply two n x n matrices in
26  matmultiply_n =abb ((α(α dot_n)) ∘ f12 ∘ f01)
for f01 as in 23, f12 as in 24, and dot_n as in 25.
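The step-by-step strategy for matrix multiplication can be traced in Python. The sketch below is only an illustration under stated assumptions: matrices are tuples of row tuples, numerals are ints, and steps 1 and 2 are written directly as comprehensions rather than through the abbreviations of 23 and 24. The names `pr`, `col`, `transp`, `insert_plus`, `dot`, and `matmultiply` are hypothetical, and out-of-range projections raise a Python `IndexError` instead of being "undefined".

```python
def pr(i):
    """ith projection: the ith entry of a list (1-based)."""
    return lambda t: t[i - 1]

def col(i):
    """ith column of a matrix coded as a list of rows (apply-to-all of the ith projection)."""
    return lambda m: tuple(pr(i)(row) for row in m)

def transp(n):
    """Transpose of an n-column matrix: the list of its n columns."""
    return lambda m: tuple(col(i)(m) for i in range(1, n + 1))

def insert_plus(t):
    """Insertion of + on a nonempty list of numbers, folding from the right."""
    return t[0] if len(t) == 1 else t[0] + insert_plus(t[1:])

def dot(n):
    """Dot product: transpose the pair of lists, multiply entrywise, then insert +."""
    def run(pair):
        products = tuple(p * q for (p, q) in transp(n)(pair))
        return insert_plus(products)
    return run

def matmultiply(n):
    def run(ab):
        a, b = ab                                    # step 0: the pair of matrices
        step1 = tuple((row, b) for row in a)         # step 1: each row of A paired with B
        step2 = tuple(                               # step 2: each row of A paired with each column of B
            tuple((row, col(j)(m)) for j in range(1, n + 1))
            for (row, m) in step1
        )
        return tuple(tuple(dot(n)(p) for p in line) for line in step2)   # step 3
    return run

a = ((1, 2), (3, 4))
b = ((5, 6), (7, 8))
print(matmultiply(2)((a, b)))
```

No loop over a changing state appears in the function bodies beyond the comprehensions that play the role of apply-to-all, which mirrors the remark that the FPF version needs no repetitive construct.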
We conclude the section with a few additional abbreviations and encourage
the reader to work the exercises.
Boolean operations may be derived as follows. Preserving the conventions
of 10, coin the notations
27  F for < >,   T for << >>,   false =abb ==F,   true =abb ==T.
We may then define
28  (¬p) =abb (if p then false else true),
    (¬p): t is
        T           if p: t = F,
        F           if p: t ≠ F,
        undefined   if p: t is undefined.
29  (p ∨ q) =abb (if p then (true ∘ q) else (if q then true else false)),
    (p ∨ q): t is
        T           if p: t ≠ F or q: t ≠ F,
        F           if p: t = F = q: t,
        undefined   if p: t or q: t is undefined,
30  (p ∧ q) =abb (¬((¬p) ∨ (¬q))),
    ≠ =abb (¬ =).
The numerical relations are then introduced as follows:
31  ≤ =abb (= ∘ [-, ==0]),
    > =abb (¬ ≤),
    < =abb (≤ ∧ ≠),
    ≥ =abb (> ∨ =).
For example,
    ≤: <t, u> is
        T           if t, u are numerals and t ≤ u,
        F           if t, u are numerals and t > u,
        undefined   else.
EXERCISES FOR SECTION 1.3
1. Draw the derivation tree of the matrix of 2 for the case m = 3, n = 2.
2. Let X be an m-element set and let Y be an n-element set. Show that Tot(X, Y) has
n^m elements and that Pfn(X, Y) has (n + 1)^m elements.
3. Recall that a total function f: X → Y is injective if whenever x1, x2 are distinct
elements of X then f(x1) ≠ f(x2) in Y. Given f ∈ Tot(X, Y), g ∈ Tot(Y, Z), prove
(a) that gf is injective if f, g are injective;
(b) that f is injective if gf is injective;
(c) that there are P(n, r) injective functions from an r-element set to an n-element
set, where P(n, r) = n(n - 1) ··· (n - r + 1) is the number of ways to select r
things from n if the selection is made without repetition.
4. Recall that a total function f: X → Y is surjective if for every y in Y there exists at
least one x in X with f(x) = y. Given f ∈ Tot(X, Y), g ∈ Tot(Y, Z), prove
(a) that gf is surjective if f, g are surjective;
(b) that g is surjective if gf is surjective.
5. Describe (tail^2 ∘ head): as a partial function.
6. Write an FPF function for the everywhere undefined function ⊥ defined by
DD(⊥:) = ∅.
7. If A, B are m x n matrices, their sum is the m x n matrix A + B with ij entry
a_ij + b_ij if A, B, respectively, have ij entry a_ij, b_ij. Write an FPF function +_{m,n} to
add two m x n matrices.
8. Write an FPF function =_{m,n} whose input is a list of two m x n matrices, such that
    =_{m,n}: <A, B> is
        T           if A, B are equal matrices,
        undefined   else.
Say that two FPF functions f, g are semantically equivalent if f: = g: in
Pfn(DTN, DTN).
9. Show that (if p then true else false) is semantically equivalent to (¬(¬p)). Give
necessary and sufficient conditions on p for p and (¬(¬p)) to be semantically
equivalent.
1.4 Multifunctions
10. Prove that (p v q) and (q v p) are semantically equivalent. Prove, however, that
if
(p 0 q)
=abb
(if p then true else (if q then true else false))
then (p 0 q) and (q 0 p) are not semantically equivalent.
11. Prove that (p ∨ q) and (¬((¬p) ∧ (¬q))) are semantically equivalent.
12. Write an FPF function f to compute the number of occurrences of the least value
in a nonempty list of numbers. [Hint: a possible strategy is
Step 0: <n1, ..., nk>.
Step 1: <t1, ..., tk> where ti = <<ni, ni, ..., ni>, <n1, ..., nk>>.
Step 2: <i1, ..., ik> where ij is 1 if nj ≤ nt for all t, and ij = 0 else.
Step 3: sum all ij.]
1.4 Multifunctions
Since denotational semantics is to assign an input/output meaning to each
program, it is reasonable to consider possible general forms for input/output
"functions." In the Pascal fragment of Section 1.2 inputs and outputs were
assignments of natural numbers to identifiers whereas they were DTNs for
the functional programming fragment of Section 1.3. In Part 3 we shall be
concerned with the theory of data types which addresses the question of how
inputs and outputs can be structured (e.g., "DTN structure"). But even if we
bypass this issue for the time being, allowing the inputs and outputs to have
no particular structure, we may nonetheless wish to consider more general
things than partial functions for input/output descriptions.
In this section we introduce "multifunctions." We make no claim that
partial functions and multifunctions exhaust all reasonable possibilities.
Rather, we introduce the notion of a "category" in Chapter 2 as a candidate
for a truly general framework. The common properties of partial functions
and multifunctions studied in this section will help to motivate later work
with categories.
A total function is "single-valued" in the sense that exactly one output f(x)
results for each input x. Similarly, a partial function is "at-most-one-valued."
More generally, multifunctions obtain by allowing f(x) to be any set of
outputs, including the empty set. For an example, consider an anthropological data base for a population P in which it is possible to retrieve the names
of the children (also in P) of any person in P. The "children" multifunction f
then assigns to each p in P the set f(p) of all children of p. The formal
definition of a multifunction is as follows.
1 Definitions. Let X, Y be sets. A multifunction from X to Y is a total function from X to the set of subsets of Y. The set of all multifunctions from X to
Y will be denoted Mfn(X, Y).
In set theory it is customary to call the set of subsets of Y the power set of
Y which leads to the following standard:
2 Notation. If Y is a set, 𝒫(Y) denotes the set of subsets of Y.
We then have, by Definition 1,
3   Mfn(X, Y) = Tot(X, 𝒫(Y)).
Why should Definition 1 be useful, then, if multifunctions are just a special
case of total functions? The reason lies in considering how we want to chain
multifunctions together. For example, a grandchild is just a child of a child
so that if f E Mfn(P, P) is the "children" multifunction as above, one intuitively expects to obtain the "grandchildren" multifunction by an appropriate
composition of f with itself. Considering f as a total function from P to 𝒫(P)
and trying to compose f with itself as in 1.3.7 does not work because the
value of the output f(p) does not have the right form to be an input to f.
What we need is the following definition.
4 Definition. For f ∈ Mfn(X, Y), g ∈ Mfn(Y, Z), their composition gf ∈
Mfn(X, Z) is defined by
    (gf)(x) = {z ∈ Z | there exists y ∈ f(x) with z ∈ g(y)}.
Indeed, it is immediate that if f E Mfn(P, P) is the "children" multifunction
then ff E Mfn(P, P) is the "grandchildren" multifunction we desired.
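The "children" example can be made concrete with a small Python sketch. A multifunction on a finite set is modeled here as a dict sending every input to a set of outputs; the population and the names `children` and `compose` are invented for illustration.

```python
def compose(g, f):
    """Multifunction composition: z is in (gf)(x) when z is in g(y) for some y in f(x)."""
    return {x: {z for y in ys for z in g.get(y, set())} for x, ys in f.items()}

# A total function from the population to sets of people; childless people map to the empty set.
children = {
    "ann": {"bob", "cy"},
    "bob": {"dee"},
    "cy": {"eve", "fay"},
    "dee": set(),
    "eve": set(),
    "fay": set(),
}

grandchildren = compose(children, children)
print(sorted(grandchildren["ann"]))   # ['dee', 'eve', 'fay']
```

Note that the empty set is a perfectly good output here: "bob" has a child but no grandchildren, so the composite returns the empty set for him.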
Multifunctions are suitable input/output functions for the following parallel computation scenario which generalizes that of 1.3.5.
5 Let X be an input set and let Y be an output set. Beginning with an input
x in X, a given algorithm simultaneously initiates a set of noninteracting
computations. Some of these may not terminate and those that do may halt
at different times. The denotational semantics of the algorithm is the multifunction f E Mfn(X, Y) which assigns to x the set f(x) of all outputs in Y
resulting from some terminating computation initiated by input x.
One might, for example, add atomic multifunctions to the functional programming fragment of Section 1.3 and give a multifunction denotational
semantics based on 5 rather than 1.3.3. See Exercise 3. In such a situation we
would need a multifunction semantics for the FPF (fk ∘ ··· ∘ f1). Similarly, in
attempting a multifunction semantics for Pascal we would need to assign a
meaning to begin f1; ...; fk end. While the composition operation of 4 is the
natural candidate, a technical issue is raised. Up to now we have viewed the
chaining together of, say, three functions in the following way:
For multifunctions, should this mean h(gf) or (hg)f? Fortunately, it makes
no difference.
6 Proposition (Associative Law for Multifunction Composition). If f ∈
Mfn(W, X), g ∈ Mfn(X, Y), h ∈ Mfn(Y, Z) then h(gf) = (hg)f ∈ Mfn(W, Z).
PROOF. Let z ∈ (h(gf))(w). Then there exists y ∈ (gf)(w) with z ∈ h(y). But then
there exists x ∈ f(w) with y ∈ g(x). By the definition of hg, z ∈ (hg)(x) and so
z ∈ ((hg)f)(w). So far, we have shown that (h(gf))(w) is a subset of ((hg)f)(w)
for all w ∈ W. To complete the proof, let z ∈ ((hg)f)(w) and show z ∈ (h(gf))(w).
There exists x ∈ f(w) with z ∈ (hg)(x). Thus, there exists y ∈ g(x) with z ∈ h(y).
By the definition of gf, y ∈ (gf)(w) and then z ∈ (h(gf))(w). □
Proposition 6 allows us to write the equal multifunctions h(gf) and (hg)f
simply as hgf. In fact, the proof has shown
7   (hgf)(w) = {z ∈ Z: there exists x ∈ f(w) and then y ∈ g(x) with z ∈ h(y)}.
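A quick empirical check of the associative law on small finite sets may help build intuition; it is of course no substitute for the proof. The dict-of-sets representation, the fixed random seed, and the helper `random_mfn` are assumptions of this sketch.

```python
import random

def compose(g, f):
    """Multifunction composition on dicts that send each input to a set of outputs."""
    return {x: {z for y in ys for z in g[y]} for x, ys in f.items()}

def random_mfn(xs, ys, rng):
    """A random multifunction from xs to ys: each x receives a random subset of ys."""
    return {x: {y for y in ys if rng.random() < 0.5} for x in xs}

rng = random.Random(0)
W, X, Y, Z = range(3), range(4), range(3), range(5)
for _ in range(100):
    f, g, h = random_mfn(W, X, rng), random_mfn(X, Y, rng), random_mfn(Y, Z, rng)
    left = compose(h, compose(g, f))       # h(gf)
    right = compose(compose(h, g), f)      # (hg)f
    # the explicit description of hgf: chase w through f, then g, then h
    direct = {w: {z for x in f[w] for y in g[x] for z in h[y]} for w in W}
    assert left == right == direct
print("associativity held on all samples")
```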
Repeated use of the associative law guarantees that parentheses can be
avoided for chains of all lengths. While we will not give a formal proof here,
the following example indicates the general idea.
8 Example. e((d(cb))a) = (ed)(c(ba)) by five uses of 6 as follows:
    e((d(cb))a) = e(((dc)b)a)     (as d(cb) = (dc)b)
                = e((dc)(ba))     (as ((dc)b)a = (dc)(ba))
                = (e(dc))(ba)
                = ((ed)c)(ba)
                = (ed)(c(ba)).
Thus, both compositions in 8 could be written edcba. See Exercise 5.
We conclude this section by showing that partial functions (and so total
functions) may be thought of as special cases of multifunctions.
9 Definition. For each f ∈ Pfn(X, Y) define f′ ∈ Mfn(X, Y) by
    f′(x) = {f(x)}   if x ∈ DD(f),
            ∅        else.
Such f′ is closely associated to f. For example, f can be completely
deduced from f′ because DD(f) = {x ∈ X | f′(x) ≠ ∅} and for x ∈ DD(f),
f(x) is the unique element of f′(x). A multifunction g has the form f′ if and
only if g(x) has at most one element for all x. Furthermore, the compositions
of 4 and 1.3.7 respect each other as is shown in the next result.
10 Proposition. Let f ∈ Pfn(X, Y), g ∈ Pfn(Y, Z) and let gf ∈ Pfn(X, Z) be
the composition of 1.3.7. Let g′f′ ∈ Mfn(X, Z) be the composition of 4. Then
(gf)′ = g′f′.
PROOF.
    (g′f′)(x) = {z | there exists y ∈ f′(x) with z ∈ g′(y)}
              = {z | x ∈ DD(f) and f(x) ∈ DD(g) and z = g(f(x))}
              = {g(f(x))} if x ∈ DD(f) and f(x) ∈ DD(g), and ∅ else
              = (gf)′(x). □
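The passage from a partial function to its associated multifunction (Definition 9) and the compatibility of the two compositions (Proposition 10) can be checked on a small example. In this sketch partial functions are dicts keyed by their domain of definition and multifunctions are dicts of sets; the names `lift`, `compose_pfn`, and `compose_mfn` are hypothetical.

```python
def lift(f, X):
    """The multifunction associated to the partial function f on the input set X."""
    return {x: ({f[x]} if x in f else set()) for x in X}

def compose_pfn(g, f):
    """Partial function composition: defined where f is defined and its value is in DD(g)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def compose_mfn(g, f):
    """Multifunction composition on dicts of sets."""
    return {x: {z for y in ys for z in g[y]} for x, ys in f.items()}

X, Y = {1, 2, 3, 4}, {"a", "b", "c"}
f = {1: "a", 2: "b", 3: "c"}      # undefined at 4
g = {"a": 10, "c": 30}            # undefined at "b"

lhs = lift(compose_pfn(g, f), X)              # lift the composite partial function
rhs = compose_mfn(lift(g, Y), lift(f, X))     # compose the lifted multifunctions
print(lhs == rhs)
```

The inputs 2 and 4 both land on the empty set, but for different reasons: at 4 the first function is undefined, while at 2 the first function's output falls outside the domain of definition of the second.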
The import of 9, 10 is that "partial functions are multifunctions," that is,
blurring the distinction between f and f′ is unlikely to cause imprecision. Usually,
one writes f′ simply as f. Thus, if f ∈ Pfn(X, Y), g ∈ Mfn(Y, Z) we would write
gf without comment for the more precise gf′ ∈ Mfn(X, Z). One mild warning
is in order, however, relating to 1.3.3. If a known programming statement
computes f we would expect to be able to write the statement
    if f′(x) = ∅ then g(x) else h(x)
in, say, Pascal. This would compute h(x) if computation of f(x) halts, but
would be undefined rather than returning g(x) if f(x) does not halt, that is, if
x ∉ DD(f). In short, f′(x) = ∅ in 1.3.5 should be interpreted not as a
returned value but as nontermination. A similar interpretation applies to
f(x) = ∅ in 5. On the other hand, there are circumstances such as the
"children" multifunction where ∅ is a reasonable returned value. In a semantic environment where a possibly nonterminating algorithm has the empty
set as a possible returned value, multifunctions may not provide the correct
type of function. See Exercise 2.1.10.
11 Proposition (Associative Law for Partial Function Composition). If
f ∈ Pfn(W, X), g ∈ Pfn(X, Y), h ∈ Pfn(Y, Z) then, with respect to the composition
of 1.3.7, h(gf) = (hg)f ∈ Pfn(W, Z).
PROOF. Using 6 and 10 we have
    (h(gf))′ = h′(gf)′ = h′(g′f′) = (h′g′)f′ = (hg)′f′ = ((hg)f)′,
so that h(gf) = (hg)f. □
EXERCISES FOR SECTION 1.4
1. Show that 𝒫(Y) has 2^n elements if Y has n elements. Show that Mfn(X, Y) has 2^(mn)
elements if X has m elements and Y has n elements.
2. A relation from X to Y is a subset of X x Y where X x Y is the set of all ordered
pairs (x, y) with x ∈ X and y ∈ Y. It is standard to write xRy as a synonym for
(x, y) ∈ R and to say "x, y are R-related" if xRy. Denote the set of relations from X
to Y as Rel(X, Y). Three examples of relations are ≤ ∈ Rel(N, N) where n ≤ m has
its usual meaning, R ∈ Rel(R, N) if R is the set of real numbers and if xRn means
x^2 = n, and S ∈ Rel(P, W) if pSw means w is the sister of p where P is a set of people
and W is a set of women.
Prove that "a relation is the same thing as a multifunction." More precisely, for
each f ∈ Mfn(X, Y) define f* ∈ Rel(X, Y) by
    xf*y if and only if y ∈ f(x).
Prove that f ↦ f* establishes a bijective (= injective and surjective) function from
Mfn(X, Y) to Rel(X, Y).
3. It is easy to extend the semantics of FPF of Section 1.3 to multifunctions in
Mfn(DTN, DTN). For function constructors, define (fk ∘ ··· ∘ f1): using 4 (and 6):
    [f1, ..., fk]: t = {<u1, ..., uk> | ui ∈ fi: t for all i},
    (if p then f else g): t = A ∪ B, where
        A = f: t if p: t - {< >} ≠ ∅, and A = ∅ else,
        B = g: t if < > ∈ p: t, and B = ∅ else,
    (αf): <t1, ..., tk> = {<u1, ..., uk> | ui ∈ f: ti for all i},
    (/f): < > = {< >},
    (/f): <t1> = {t1},
    (/f): <t1, ..., tk+1> = {f: <t1, u> | u ∈ (/f): <t2, ..., tk+1>}.
Consider the atomic functions of Table 1.3.1 as multifunctions as in 9.
(a) If f: denotes the partial function semantics of f as in Section 1.3 show that the
multifunction semantics is just (f:)′.
Because of (a) we must extend the atomic functions to include multifunctions
which are not partial functions in order for multifunction semantics to be
interesting. Many possibilities might be considered. Here we explore two, the
first being the "index generator" or "iota" function of the programming language
APL and the second being a proper multifunction.
Extend the syntax of Table 1.3.1 by adding ι (lower case Greek iota) and
foreach to the atomic functions. Their semantics is as follows. The input to ι is
a numeral n and
    ι: n = <1, 2, ..., n>
whereas for m, n ≥ 1,
    foreach: <<t1, ..., tm>, <u1, ..., un>> = {<ti, uj> | 1 ≤ i ≤ m, 1 ≤ j ≤ n}.
(b) If g =abb ((α*) ∘ transp_2 ∘ [ι, ι]) show that
    g: n = <1, 4, 9, ..., n^2>
for each numeral n.
(c) Show that
    (+ ∘ foreach ∘ [g, g]): n = {p^2 + q^2: 1 ≤ p, q ≤ n}
and use this to write a function to test if an input number n has the form p^2 + q^2
for p, q natural numbers.
(d) Write a similar function to test if a number n has the form p^2 + q^2 + r^3 with p,
q, r natural numbers.
4. Generalize 7 to n multifunctions.
5. A complete list of all ways to parenthesize a chain of four functions using a binary
composition is
((dc)b)a, (d(cb))a, (dc)(ba), d((cb)a), d(c(ba)).
As in 8, show that all five are the same if the composition satisfies the associative
law.
1.5 A Preview of Partially Additive Semantics
In this section we consider partial functions and multifunctions as frameworks for denotational semantics without reference to any particular
programming language. Basic constructions such as chaining, conditional
testing, and looping are described at the function level. The term "partially
additive" refers to a kind of sum operation which can be defined on the sets
Pfn(X, Y), Mfn(X, Y) (and more generally in Section 3.2).
To fix the context we must choose just one of "partial function" or "multifunction," that is, we must specify the "semantic category" in the sense of the
following definition (which will be generalized in the next chapter).
1 Definition. The semantic category is either Pfn (for partial functions) or
Mfn (for multifunctions). We adopt the noncommittal notation SC(X, Y)
to mean "Pfn(X, Y) if the semantic category is Pfn and Mfn(X, Y) if the
semantic category is Mfn."
2 Notation. We will use all the notations
    f: X → Y,    X --f--> Y,    [Flowscheme: a box labeled f with input line X and output line Y]
as synonyms for f ∈ SC(X, Y). These may appear geometrically reoriented in
diagrams, for example, right-to-left, vertically, diagonally, and so on. The last
notation is "flowscheme" notation.
The important operation of iterated composition has already been introduced (in 1.4.4, 1.4.6-8 for Mfn, 1.3.7, 1.4.11 for Pfn). If fi ∈ SC(X_{i-1}, X_i)
for i = 1, ..., n, suitable flowscheme notation for the composition fn ··· f1 ∈
SC(X0, Xn) is
3   [Flowscheme: boxes f1, ..., fn connected in series, with input line X0 and output line Xn]
The labeled arrow notation X --f--> Y is useful in "commutative diagrams"
such as
4   [Diagram: arrows fi: X_{i-1} → X_i for i = 1, ..., 5, together with g: X0 → X3, h: X2 → X4, and f: X0 → X5]
in which our convention is the following:
5 In a diagram such as 4, if two paths of arrows begin at the same place and
end at the same place then, unless the contrary is indicated, the compositions
of these paths are asserted to be equal. To emphasize this assertion we say
"the diagram commutes."
Thus, in 4, g = f3f2f1 ∈ SC(X0, X3), h = f4f3 ∈ SC(X2, X4), hf2f1 = f4g ∈
SC(X0, X4), f = f5f4f3f2f1 ∈ SC(X0, X5), and so on. Notation such as
6   [Diagram: a triangle with arrows f: X → Y, g: Y → Z, and h: X → Z, marked as not asserted to commute]
could be used to indicate that h is not necessarily the same as gf ∈ SC(X, Z).
The identity function id: DTN → DTN was introduced with FPF in
1.3.8. More generally, we have the following:
7 Definition. For each set X, the identity function of X, id_X: X → X is the
total function defined by id_X(x) = x. This function is in Pfn(X, X) and so may
be considered in Mfn(X, X) as in 1.4.9-10, so that always id_X ∈ SC(X, X).
We clearly have the following:
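8   For f ∈ SC(X, Y), id_Y f = f = f id_X.
We may express this by a commutative diagram:
[Diagram: arrows id_X: X → X, f: X → Y, and id_Y: Y → Y, in which every path from X to Y composes to f]
Alternatively, inventing the "through box"
[Flowscheme: a line from X to X passing straight through an empty box]
as flowscheme notation for id_X, 8 may be expressed in flowscheme terms
by
[Flowscheme: three equal flowschemes from X to Y, namely the through box on X followed by f, the box f alone, and f followed by the through box on Y]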
We now introduce the fundamental operation of sum, first for Mfn and
then for Pfn.
9 Definitions. Let X, Y be sets. Let I be a set and for each i ∈ I let fi ∈
Mfn(X, Y) (we say (fi | i ∈ I) is an I-indexed family in Mfn(X, Y)). Then
the sum Σ(fi | i ∈ I) (alternatively written Σ_{i∈I} fi) is the multifunction in
Mfn(X, Y) defined by
    (Σ_{i∈I} fi)(x) = ∪_{i∈I} fi(x) = {y ∈ Y | y ∈ fi(x) for some i ∈ I}.
Hence, for one-element families (meaning that I has one element) Σ(f) = f
and, in case I is empty, the sum maps x to the empty set for all x in X (see
Exercise 1).
If I = {1, 2, ..., n} with n ≥ 2 so that the family (fi | i ∈ I) has the form
(f1, ..., fn), we write f1 + ··· + fn as a synonym for Σ(fi | i ∈ I). In general, we
may write Σfi instead of Σ(fi | i ∈ I) when I is clear from context.
An intuitive flowscheme notation for summing is exemplified by the
following.
10  f + g is written
[Flowscheme: the input line fans out to the boxes f and g, whose output lines merge into one]
and similarly for other families (fi | i ∈ I).
This notation conveys the idea of 9 since an output from (f + g)(x) is an
output from either of f(x) or g(x).
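For finite sets the sum of multifunctions is just a pointwise union, which the following Python sketch makes explicit. The dict-of-sets representation and the name `mfn_sum` are assumptions for illustration; the input set X is passed explicitly so that the empty family still yields a multifunction on X.

```python
def mfn_sum(family, X):
    """Sum of a family of multifunctions on X: the pointwise union of their values."""
    return {x: set().union(*(f[x] for f in family)) for x in X}

X = {0, 1, 2}
f = {0: {"a"}, 1: set(), 2: {"b"}}
g = {0: {"c"}, 1: {"a"}, 2: {"b"}}

print(mfn_sum([f, g], X))   # an output of (f + g)(x) is an output of f(x) or of g(x)
print(mfn_sum([], X))       # the empty sum sends every x to the empty set
```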
We next seek a suitable sum operation for partial functions. It is easy to
see that when each fi in 9 is a partial function (i.e., a multifunction which
happens to be a partial function; recall 1.4.9-10) then Σfi need not be. In
the case of 10, let f, g be partial functions and x be such that f(x) and g(x) are
defined and different. Then the sum f + g maps x to the set {f(x), g(x)} and
so is not a partial function. To better understand what needs to be fixed,
imagine the "fanout" in 10,
[Flowscheme: the fanout portion of 10, where the input line splits into branches]
as controlled by a test such as "if f is defined go left; if g is defined go right."
For multifunctions, such a test can pass the input down both lines simultaneously. For partial functions we demand that such a test choose at most
one alternative and define 10 only when DD(f) ∩ DD(g) = ∅. We have
motivated the following.
11 Let X, Y be sets and let (fi | i ∈ I) be an I-indexed family in Pfn(X, Y).
Then (fi | i ∈ I) is summable in Pfn(X, Y) if for all i, j ∈ I with i ≠ j, DD(fi) ∩
DD(fj) = ∅. In that case, Σfi = Σ(fi | i ∈ I) in Pfn(X, Y) is defined by
    DD(Σfi) = ∪_{i∈I} DD(fi),
    (Σfi)(x) = fj(x)       if there exists j with x ∈ DD(fj),
               undefined   else.
Note that we do not require that I be finite. The following is an immediate
result:
12 If (fi | i ∈ I) is summable, (Σfi)′ = Σ(fi′), where ( )′ is defined in 1.4.9 and
the latter sum is that of 9. Thus, the Pfn sum, when it exists, specializes the
Mfn sum.
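The summability condition and the sum of partial functions can be sketched for finite families as follows. Partial functions are dicts keyed by their domain of definition; the names `summable`, `pfn_sum`, `lift`, and `mfn_sum` and the sample functions are assumptions of this illustration, and only finite families are handled even though the definition allows infinite ones.

```python
def summable(family):
    """True when the domains of definition in the family are pairwise disjoint."""
    seen = set()
    for f in family:
        if seen & set(f):
            return False
        seen |= set(f)
    return True

def pfn_sum(family):
    """Sum of a summable family: defined on the union of the domains of definition."""
    if not summable(family):
        raise ValueError("family is not summable")
    total = {}
    for f in family:
        total.update(f)
    return total

def lift(f, X):
    """The multifunction associated to a partial function on the input set X."""
    return {x: ({f[x]} if x in f else set()) for x in X}

def mfn_sum(family, X):
    """Sum of multifunctions on X: pointwise union."""
    return {x: set().union(*(g[x] for g in family)) for x in X}

X = set(range(10))
even = {n: n // 2 for n in range(0, 10, 2)}        # defined on even inputs only
odd = {n: 3 * n + 1 for n in range(1, 10, 2)}      # defined on odd inputs only

total = pfn_sum([even, odd])
print(total[4], total[3])
print(summable([even, even]))                      # overlapping domains: not summable
# lifting the partial function sum agrees with summing the lifted multifunctions
print(lift(total, X) == mfn_sum([lift(even, X), lift(odd, X)], X))
```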
In particular, we have for one-element families
13  Σ(f) = f
and for empty families
14  Σ∅ = 0,
where 0: A → B denotes the everywhere undefined partial function characterized by DD(0) = ∅. It is obvious that we may extend a summable family by
adding any number of 0's or we may delete any number of 0's which are
already there without affecting either the summability of the family or the
value of the sum. It is for this reason that, in this context, we prefer the
notation 0 instead of the alternate notation ⊥ introduced in Exercise 1.3.6.
Our operation of sum, then, differs from ordinary numerical addition in
two fundamental respects:
(a) It is not always defined. Indeed, for any f ∈ Pfn(X, Y) with DD(f) ≠ ∅,
f + f is never defined. The "partial" in "partially additive" refers to this
property-addition (= sum) is only partially defined.
(b) There are many infinite families whose sum is defined.
We remark that even finite sums such as 10 cannot be implemented in an
unrestricted way. It is well known from computability theory that given two
programs which compute partial functions f, g there is no way to decide,
in general, if DD(f) ∩ DD(g) = ∅, and this makes it hard to imagine a
suitable approach to compute f + g for arbitrary f, g (see Exercise 4 for an
unsuccessful attempt). There remains the option to restrict the use of sum to
"provably disjoint" families, and this will in fact be what happens when we
give a partially additive semantics for iteration in Section 3.3 (see also 27
below).
We turn to some properties of sum, beginning with the following one.
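15 Proposition (Distributive Law of Composition over Sums in Mfn). Let
$f \in \mathbf{Mfn}(W, X)$, let $(g_i \mid i \in I)$ be a family in $\mathbf{Mfn}(X, Y)$, and let $h \in \mathbf{Mfn}(Y, Z)$.
Then
\[
\Bigl(\sum g_i\Bigr) f = \sum (g_i f) \in \mathbf{Mfn}(W, Y), \qquad
h \Bigl(\sum g_i\Bigr) = \sum (h g_i) \in \mathbf{Mfn}(X, Z).
\]
PROOF.
\begin{align*}
y \in \bigl((\textstyle\sum g_i) f\bigr)(w)
&\iff \text{there exists } x \in f(w),\ y \in (\textstyle\sum g_i)(x) \\
&\iff \text{there exists } x \in X,\ i \in I \text{ with } x \in f(w) \text{ and } y \in g_i(x) \\
&\iff \text{there exists } i \in I \text{ with } y \in (g_i f)(w) \\
&\iff y \in \bigl(\textstyle\sum (g_i f)\bigr)(w). \\[4pt]
z \in \bigl(h (\textstyle\sum g_i)\bigr)(x)
&\iff \text{there exists } y \in (\textstyle\sum g_i)(x) \text{ with } z \in h(y) \\
&\iff \text{there exists } i \in I,\ y \in g_i(x) \text{ with } z \in h(y) \\
&\iff \text{there exists } i \in I \text{ with } z \in (h g_i)(x) \\
&\iff z \in \bigl(\textstyle\sum (h g_i)\bigr)(x). \qquad \Box
\end{align*}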
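The two distributive laws for multifunctions can be checked mechanically on a small example. This is only an illustrative sketch: we assume a multifunction is modelled as a dict sending each point to a Python set of values, and the helper names `compose` and `mfn_sum` are hypothetical.

```python
def compose(g, f):
    """(g f)(w) is the union of g(x) over all x in f(w)."""
    return {w: set().union(*(g.get(x, set()) for x in xs)) for w, xs in f.items()}


def mfn_sum(family, domain):
    """Pointwise union of a family of multifunctions on the given domain."""
    return {x: set().union(*(g.get(x, set()) for g in family)) for x in domain}


W, X = {0, 1}, {"a", "b"}
f = {0: {"a", "b"}, 1: set()}                 # f in Mfn(W, X)
g1 = {"a": {1}, "b": {2}}                     # g1, g2 in Mfn(X, Y)
g2 = {"a": {2, 3}, "b": set()}
h = {1: {"p"}, 2: set(), 3: {"p", "q"}}       # h in Mfn(Y, Z)

left_first = compose(mfn_sum([g1, g2], X), f)
right_first = mfn_sum([compose(g1, f), compose(g2, f)], W)
left_second = compose(h, mfn_sum([g1, g2], X))
right_second = mfn_sum([compose(h, g1), compose(h, g2)], X)

print(left_first == right_first, left_second == right_second)
```

A finite check of this kind is no substitute for the proof, but it makes the roles of $f$, $g_i$, and $h$ in the two equations concrete.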
16 Corollary (Distributive Law of Composition over Sums in Pfn). Let $f \in
\mathbf{Pfn}(W, X)$, let $(g_i \mid i \in I)$ be a summable family in $\mathbf{Pfn}(X, Y)$, and let $h \in \mathbf{Pfn}(Y, Z)$.
Then $(g_i f \mid i \in I)$ and $(h g_i \mid i \in I)$ are summable and
\[
\Bigl(\sum g_i\Bigr) f = \sum (g_i f) \in \mathbf{Pfn}(W, Y), \qquad
h \Bigl(\sum g_i\Bigr) = \sum (h g_i) \in \mathbf{Pfn}(X, Z).
\]
PROOF. If $w \in DD(g_i f) \cap DD(g_j f)$ then $f(w) \in DD(g_i) \cap DD(g_j)$, so $i = j$. If
$x \in DD(h g_i) \cap DD(h g_j)$ then $x \in DD(g_i) \cap DD(g_j)$, so $i = j$. Then equality of
the sums follows from 15 in view of 12. $\Box$
Proposition 15 and Corollary 16 are valid when I is empty, yielding the
following:
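17 For $f \in \mathbf{SC}(X, Y)$ and for all sets $W$, $Z$ we have the commutative diagram
\[
W \xrightarrow{\;0\;} X \xrightarrow{\;f\;} Y \xrightarrow{\;0\;} Z, \qquad
f\,0 = 0: W \to Y, \qquad 0\,f = 0: X \to Z,
\]
where the four 0's are the appropriate empty sums of 14. It follows that any
composition $f_n \cdots f_1$ is 0 if any of the $f_i$ is 0.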
A useful result about the existence of sums is the following:
18 Proposition. Let $(f_i \mid i \in I)$ be a summable family in $\mathbf{Pfn}(X, Y)$. Then:
(a) If $J \subset I$, $(f_i \mid i \in J)$ is summable in $\mathbf{Pfn}(X, Y)$.
(b) If $(g_i \mid i \in I)$ is a similarly indexed family (not necessarily summable) in
$\mathbf{Pfn}(Y, Z)$, then $(g_i f_i \mid i \in I)$ is a summable family in $\mathbf{Pfn}(X, Z)$.
PROOF. That (a) holds is obvious. For (b), if $a \in DD(g_i f_i) \cap DD(g_j f_j)$ then
$a \in DD(f_i) \cap DD(f_j)$, so $i = j$. $\Box$
In the balance of this section we emphasize the use of sums to define
programming constructs.
19 Definition. If $A$ is a subset of $X$, the inclusion function of $A$ is $\mathrm{inc}_A \in
\mathbf{Pfn}(X, X)$ defined by
\[
DD(\mathrm{inc}_A) = A, \qquad \mathrm{inc}_A(x) = x.
\]
Thus, $\mathrm{inc}_\emptyset = 0$ is the everywhere undefined function $X \to X$ and $\mathrm{inc}_X = \mathrm{id}_X$.
As usual, we consider $\mathrm{inc}_A \in \mathbf{Mfn}(X, X)$ as well, as in 1.4.9-10.
20 Definition. If $p \in \mathbf{Pfn}(X, X)$ is an inclusion function (so that $p = \mathrm{inc}_A$ for
$A = DD(p)$), we say $p$ is a guard function and for $f \in \mathbf{SC}(X, Y)$ we introduce
the notation $p \to f$ for $fp$. The meaning of $p \to f$ is "if $p$ is true then execute $f$
else the result is undefined" where to say $p(x)$ is true means $x \in DD(p)$. Thus
$p$ "guards" entry to $f$. Such $p \to f$ is called a guarded command.
21 Definition. For $n \geq 1$, an n-way test on $X$ is $(p_1, \dots, p_n)$ with each $p_i$ an
inclusion function in $\mathbf{Pfn}(X, X)$ and $DD(p_i) \cap DD(p_j) = \emptyset$ if $i \neq j$.
22 Definition. Let $(p_1, \dots, p_n)$ be an n-way test on $X$ and let $f_1, \dots, f_n \in
\mathbf{SC}(X, Y)$. Then a natural generalization of the case statement in Pascal is
\[
\mathbf{case}\ (p_1, \dots, p_n)\ \mathbf{of}\ (f_1, \dots, f_n) = f_1 p_1 + \cdots + f_n p_n
\]
with flowscheme
[flowscheme: an n-way branch on $p_1, \dots, p_n$ leading into $f_1, \dots, f_n$]
The sum is defined by Proposition 18.
A related construction in multifunction semantics is the following:
23 Definition. Let $p_1, \dots, p_n$ be guard functions in $\mathbf{Pfn}(X, X)$ and let $f_1, \dots,
f_n \in \mathbf{Mfn}(X, Y)$. Then the alternative construct is
\[
\mathbf{if}\ p_1 \to f_1 \,\square\, \cdots \,\square\, p_n \to f_n\ \mathbf{fi}
= f_1 p_1 + \cdots + f_n p_n \in \mathbf{Mfn}(X, Y).
\]
We emphasize that the guards here are not required to have disjoint
domains. The intended meaning is "pick any $i$ for which the guard $p_i$ is true
and execute $f_i$." The flowscheme is the same as in 22.
The Pascal if-then-else construction is a special case of 22 as follows.
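24 Definition. Let $A$ be a subset of $X$. Define $A'$ to be the complement of $A$,
that is, $A' = \{x \in X \mid x \notin A\}$. Then $(\mathrm{inc}_A, \mathrm{inc}_{A'})$ is a two-way test on $X$. For $f$,
$g \in \mathbf{SC}(X, Y)$ define
\[
\mathbf{if}\ A\ \mathbf{then}\ f\ \mathbf{else}\ g = f\,\mathrm{inc}_A + g\,\mathrm{inc}_{A'}
\]
in $\mathbf{SC}(X, Y)$. Two suitable flowschemes are
[two flowschemes from $X$ to $Y$: a test on $A$ whose T branch passes through $f$ and whose F branch passes through $g$]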
The sum operation and composition lead to a calculus to manipulate
functions. We begin with two basic properties of inclusion functions whose
proof is obvious and follow this with an example that simplifies a compound
conditional statement.
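25 Proposition. Let $A$, $B$ be subsets of $X$. Then:
(a) $\mathrm{inc}_A\,\mathrm{inc}_B = \mathrm{inc}_{A \cap B} = \mathrm{inc}_B\,\mathrm{inc}_A$.
(b) If $A \cap B = \emptyset$, $\mathrm{inc}_A + \mathrm{inc}_B$ exists and is $\mathrm{inc}_{A \cup B}$.
26 Example. If $A, B \subset X$, $f, g, h \in \mathbf{SC}(X, Y)$ then
\begin{align*}
&\mathbf{if}\ A\ \mathbf{then}\ (\mathbf{if}\ B\ \mathbf{then}\ f\ \mathbf{else}\ g)\ \mathbf{else}\ (\mathbf{if}\ A'\ \mathbf{then}\ f\ \mathbf{else}\ h) \\
&\quad = (f\,\mathrm{inc}_B + g\,\mathrm{inc}_{B'})\,\mathrm{inc}_A + (f\,\mathrm{inc}_{A'} + h\,\mathrm{inc}_A)\,\mathrm{inc}_{A'} && \text{(since } A'' = A\text{)} \\
&\quad = f\,\mathrm{inc}_B\,\mathrm{inc}_A + g\,\mathrm{inc}_{B'}\,\mathrm{inc}_A + f\,\mathrm{inc}_{A'}\,\mathrm{inc}_{A'} + h\,\mathrm{inc}_A\,\mathrm{inc}_{A'} && \text{(by 15)} \\
&\quad = f\,\mathrm{inc}_{A \cap B} + g\,\mathrm{inc}_{A \cap B'} + f\,\mathrm{inc}_{A'} && \text{(by 25)} \\
&\quad = f(\mathrm{inc}_{A \cap B} + \mathrm{inc}_{A'}) + g\,\mathrm{inc}_{A \cap B'} && \text{(by 15 since the sum in parentheses is defined by 25)} \\
&\quad = f\,\mathrm{inc}_{(A \cap B) \cup A'} + g\,\mathrm{inc}_{A \cap B'} && \text{(by 25)} \\
&\quad = f\,\mathrm{inc}_{(A \cap B')'} + g\,\mathrm{inc}_{A \cap B'} \\
&\quad = \mathbf{if}\ A \cap B'\ \mathbf{then}\ g\ \mathbf{else}\ f.
\end{align*}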
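The simplification of Example 26 can be replayed on a finite set. The sketch below assumes partial functions are modelled as Python dicts; the helpers `inc`, `compose`, `pfn_sum`, and `if_then_else` are illustrative names of ours that follow Definitions 19 and 24 directly.

```python
def inc(A):
    """Inclusion function of A: defined exactly on A, where it is the identity."""
    return {x: x for x in A}


def compose(g, f):
    """g after f for partial functions modelled as dicts."""
    return {x: g[f[x]] for x in f if f[x] in g}


def pfn_sum(*family):
    """Sum of partial functions with pairwise disjoint domains of definition."""
    total = {}
    for f in family:
        if total.keys() & f.keys():
            raise ValueError("family is not summable")
        total.update(f)
    return total


def if_then_else(A, X, f, g):
    """if A then f else g = f inc_A + g inc_A', with A' the complement of A in X."""
    return pfn_sum(compose(f, inc(A)), compose(g, inc(X - A)))


X = set(range(8))
A, B = {0, 1, 2, 3}, {2, 3, 4, 5}
f = {x: ("f", x) for x in X}
g = {x: ("g", x) for x in X if x != 1}   # g is genuinely partial
h = {x: ("h", x) for x in X}

lhs = if_then_else(A, X, if_then_else(B, X, f, g), if_then_else(X - A, X, f, h))
rhs = if_then_else(A & (X - B), X, g, f)
print(lhs == rhs)
```

Note that $h$ never contributes to the left-hand side, which matches the vanishing of the term $h\,\mathrm{inc}_A\,\mathrm{inc}_{A'}$ in the derivation.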
Repetitive constructs may be defined using infinite sums. For example,
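27 Definition. For $A \subset X$, $f \in \mathbf{SC}(X, X)$ define
\[
\mathbf{while}\ A\ \mathbf{do}\ f = \sum_{n=0}^{\infty} \mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n \in \mathbf{SC}(X, X)
\]
(where for $g \in \mathbf{SC}(Y, Y)$, $g^n$ is defined by $g^0 = \mathrm{id}_Y$, $g^{n+1} = g^n g$) with one summand for each number $n$ of traversals of the loop in the flowscheme
[flowscheme: a test on $A$; the T branch passes through $f$ and returns to the test, the F branch exits to $X$]
That the sum exists when the semantic category is Pfn is clear from the fact
that
28
\[
DD\bigl(\mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n\bigr)
= \{x \in X \mid x, f(x), \dots, f^{n-1}(x) \in A,\ f^n(x) \notin A\},
\]
which ensures that $x \in DD(\mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n)$ for at most one $n$.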
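On a finite set the sum defining the while loop can be computed term by term. The following sketch assumes partial functions are dicts and truncates the sum after $|X|$ traversals; this truncation is our assumption, justified for finite $X$ because a computation that has not left $A$ after $|X|$ steps has revisited a state and so never exits. The function names are illustrative.

```python
def inc(A):
    return {x: x for x in A}


def compose(g, f):
    """g after f for partial functions modelled as dicts."""
    return {x: g[f[x]] for x in f if f[x] in g}


def pfn_sum(family):
    total = {}
    for f in family:
        if total.keys() & f.keys():
            raise ValueError("family is not summable")
        total.update(f)
    return total


def while_do(A, f, X):
    """Sum over n of inc_A' (f inc_A)^n, truncated at n = |X| (finite X only)."""
    body = compose(f, inc(A))
    power = inc(X)                       # (f inc_A)^0 = id_X
    summands = []
    for _ in range(len(X) + 1):
        summands.append(compose(inc(X - A), power))
        power = compose(power, body)     # g^(n+1) = g^n g
    return pfn_sum(summands)             # summable by 28


X = set(range(6))
A = {0, 1, 2, 3}
f = {0: 1, 1: 0, 2: 4, 3: 2, 4: 4, 5: 5}
result = while_do(A, f, X)
print(result)
```

Here the points 0 and 1 cycle inside $A$ forever, so the loop is undefined on them, while 3 exits after two traversals and 2 after one.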
29 Definition. For $A \subset X$, $f \in \mathbf{SC}(X, X)$ define
\[
\mathbf{repeat}\ f\ \mathbf{until}\ A = (\mathbf{while}\ A'\ \mathbf{do}\ f)\, f.
\]
It is easy to use the laws for manipulating sums to deduce a formula like that
of 27 (see Exercise 7).
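A companion to the multivalued alternative construct of 23 is the multivalued repetitive construct
\[
\mathbf{do}\ p_1 \to f_1 \,\square\, \cdots \,\square\, p_n \to f_n\ \mathbf{od}
\]
which is intended to mean "pick any $i$ for which guard $p_i$ is true and execute
$f_i$; repeat until no such $i$ exists and then exit." Since the choice of $i$ is multivalued, many successful computation paths of varying length are possible. A
suitable formal definition of the semantics is the following:
30 Definition. Let $p_1, \dots, p_n$ be guard functions in $\mathbf{Pfn}(X, X)$ and let $f_1, \dots,
f_n \in \mathbf{Mfn}(X, X)$. Then the multivalued repetitive construct is
\[
\mathbf{do}\ p_1 \to f_1 \,\square\, \cdots \,\square\, p_n \to f_n\ \mathbf{od}
= \mathbf{while}\ A\ \mathbf{do}\ \mathbf{if}\ p_1 \to f_1 \,\square\, \cdots \,\square\, p_n \to f_n\ \mathbf{fi},
\]
where if $p_i = \mathrm{inc}_{A_i}$ then $A = A_1 \cup \cdots \cup A_n$. This may be expressed as an
infinite sum (see Exercise 8).
31 Example. For $A \subset X$, $f \in \mathbf{SC}(X, X)$, $g \in \mathbf{SC}(X, Y)$ we have the identity
[two flowschemes: the loop "while $A$ do $f$" followed by $g$, and a test on $A$ whose T branch enters that loop followed by $g$ and whose F branch goes directly through $g$ to $Y$]
that is,
\[
g\,(\mathbf{while}\ A\ \mathbf{do}\ f) = \mathbf{if}\ A\ \mathbf{then}\ g\,(\mathbf{while}\ A\ \mathbf{do}\ f)\ \mathbf{else}\ g.
\]
A formal proof is as follows, where we make use of 15, 16, and 25.
\begin{align*}
&\mathbf{if}\ A\ \mathbf{then}\ g\,(\mathbf{while}\ A\ \mathbf{do}\ f)\ \mathbf{else}\ g \\
&\quad = \Bigl(g \sum_{n=0}^{\infty} \mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n\Bigr)\,\mathrm{inc}_A + g\,\mathrm{inc}_{A'} \\
&\quad = \sum_{n=1}^{\infty} g\,\mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n + g\,\mathrm{inc}_{A'}
&& \text{(since } \mathrm{inc}_A\,\mathrm{inc}_A = \mathrm{inc}_A \text{ while } \mathrm{inc}_{A'}\,\mathrm{inc}_A = 0\text{)} \\
&\quad = g \Bigl(\sum_{n=0}^{\infty} \mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n\Bigr)
&& \text{(since the sum in parentheses is defined)} \\
&\quad = g\,(\mathbf{while}\ A\ \mathbf{do}\ f).
\end{align*}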
EXERCISES FOR SECTION 1.5
1. If $(A_i : i \in J)$ is a family of subsets of $Y$ with union $A$ then $x \in A$ if and only if there
exists $i \in J$ with $x \in A_i$. Conclude that $A$ is empty if $J$ is empty.
2. Show that every multifunction is a sum (in Mfn) of partial functions.
3. Give an example of $f, g, h \in \mathbf{Pfn}(X, Y)$ such that $f + g$, $f + h$, $g + h$ are defined but
$f + g + h$ is not defined.
4. Given programs to compute $f, g \in \mathbf{Pfn}(X, Y)$ one can design an operating system
to run both programs simultaneously (e.g., by interleaving steps). Explain why
this approach cannot be made to compute $f + g$.
5. Give an example of $f_1, f_2 \in \mathbf{Pfn}(X, Y)$, $g_1, g_2 \in \mathbf{Pfn}(W, X)$ such that $f_1 + f_2$ is
defined but $f_1 g_1 + f_2 g_2$ does not exist.
6. Give a proof of 25.
7. Using the laws for manipulating sums give a careful proof that
\[
\mathbf{repeat}\ f\ \mathbf{until}\ A = \sum_{n=0}^{\infty} (\mathrm{inc}_A f)(\mathrm{inc}_{A'} f)^n.
\]
8. Give a careful proof that
\[
\mathbf{do}\ p_1 \to f_1 \,\square\, \cdots \,\square\, p_n \to f_n\ \mathbf{od}
= \sum_{k=0}^{\infty} \mathrm{inc}_{A'} (f_1 p_1 + \cdots + f_n p_n)^k,
\]
where $A = DD(p_1) \cup \cdots \cup DD(p_n)$.
9. Although no mathematically distinguished function in $\mathbf{SC}(X, X)$ suggests "abort"
we may declare a particular element $a \in X$ to be the "abort value" and treat the
total function $\mathrm{abort}(x) = a$ as the abort function in $\mathbf{SC}(X, X)$. Give a modified
form of the alternative construct of 23 which aborts if none of the guards is true.
Similarly, give a modified form of the multivalued repetitive construct of 30 which
aborts if none of the guards is true initially.
10. As in Example 31, draw flowschemes and give a proof of
g (while A do f) = g (while A do (if A then f else g)).
11. Draw flowschemes and give a proof of
while A do f = while A do (while A do f).
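12. Let $X$, $\bar{X}$ be sets and let $A \subset X$, $\bar{A} \subset \bar{X}$, $f \in \mathbf{SC}(X, X)$, $h, k \in \mathbf{SC}(X, \bar{X})$, $\bar{f}, \bar{g} \in
\mathbf{SC}(\bar{X}, \bar{X})$. Assume that $k$ is such that
\[
\mathbf{if}\ A\ \mathbf{then}\ hf\ \mathbf{else}\ k
= (\mathbf{if}\ \bar{A}\ \mathbf{then}\ \bar{f}\ \mathbf{else}\ \bar{g})\, h \in \mathbf{SC}(X, \bar{X}).
\]
Show that
\[
k\,(\mathbf{while}\ A\ \mathbf{do}\ f) = \bar{g}\,(\mathbf{while}\ \bar{A}\ \mathbf{do}\ \bar{f})\, h \in \mathbf{SC}(X, \bar{X}).
\]
Draw flowschemes for the hypothesis and the conclusion.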
Notes and References for Chapter 1
For an operational semantics of Pascal see K. Jensen and N. Wirth, PASCAL Users
Manual and Report, Springer-Verlag, 1974. Denotational semantics of programming
languages stems from the work of D. S. Scott and C. Strachey. See J. Stoy, Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory,
MIT Press, 1979, for a textbook account. Assertion semantics was introduced in
R. Floyd, "Assigning meanings to programs," in Mathematical Aspects of Computer
Science, American Mathematical Society, 1967, pp. 19-32, and C. A. R. Hoare "An
axiomatic basis for computer programming," Communications of the Association for
Computing Machinery, 12, 1969, pp. 576-580, 583. A textbook account is given by
S. Alagic and M. A. Arbib, The Design of Well-Structured and Correct Programs,
Springer-Verlag, 1977.
Backus' Turing Award Lecture is published in Communications of the Association
for Computing Machinery, 21, 1978, pp. 613-641.
For computability theory based on a Pascal fragment see A. J. Kfoury, R. N. Moll,
and M. A. Arbib, A Programming Approach to Computability, Springer-Verlag, 1982.
Computable functions were first defined in various equivalent ways in the 1930s
before the computer age. As such, the idea of developing a theory of computable
functions without reference to a programming language as suggested by Section 1.5
is not at all new. What is different about the modern approach is an emphasis
on constructions that seem likely candidates for use in defining the semantics of a
programming language.
Partially additive semantics was introduced by the present authors in two papers,
Journal of Algebra, 62, 1980, pp. 203-227 and Journal of the Association for Computing Machinery, 29, 1982, pp. 577-602. The second of these cites Karp (1959),
De Bakker and Meertens (1975), and De Roever (1976) for applying the Mfn sum of
1.5.9 to aspects of semantics.
For a formal proof that the associative law implies that all n-chains, regardless of
parenthesization, compose equally see N. Jacobson, Lectures in Abstract Algebra,
Van Nostrand, 1951, pp. 20-21.
The alternative construction and the multivalued repetitive construct are set forth
in the book by E. W. Dijkstra, A Discipline of Programming, Prentice-Hall, 1977. While
he requires these constructions to have the abort features of our Exercise 9, in fact
his constructions coincide with those we have given because his abort function is
indistinguishable from nontermination.
CHAPTER 2
An Introduction to Category Theory
2.1 The Definition of a Category
2.2 Isomorphism, Duality, and Zero Objects
2.3 Products and Coproducts
Going beyond the partial functions and multifunctions already considered,
one might invent other useful notions of the input/output function from X to
Y. In addition to the need to consider X, Y as "data structures," there
are theoretical approaches to semantics in which all X, Y must carry
further structure. Rather than embark on the misguided task of presenting an
exhaustive list of present and future possibilities, we introduce categories as a
framework for semantics which possess so little structure that most models of
semantics can be represented this way. Surprisingly, what structure remains
can be extensively developed and there is a great deal to say.
Category theory per se is tangential to this book. We discuss only a few
topics which bear directly on our analysis of the "semantic category." In
Section 2.1 we introduce the notion of a category which provides the bare
bones of abstraction of the semantics of composition. Section 2.2 introduces
the useful organizing principle of duality and relates it to isomorphisms and
to initial and terminal objects. Isomorphism is self-dual and initial is dual to
terminal. The uniqueness of initial objects has important instantiations in
semantics, such as the uniqueness of a sequence defined by simple recursion.
Zero objects are simultaneously initial and terminal and generalize the empty
set in Pfn. To round out this introduction to category theory we present, in
Section 2.3, the notion of product and the dual concept of coproduct which
both find frequent applications throughout the book.
With this, we have all the category theory needed for our study of program
semantics in Chapter 3. Further category theory is developed in Chapter 4 as
motivated by the issues raised by attempting to describe assertion semantics
in a semantic category.
When we turn to the study of data types in Part 3 we shall need to call
on further concepts from category theory-functors, limits, and algebraic
theories.
The concepts in this chapter are quite abstract and may seem so even
to readers with experience in pure mathematics. We encourage patience!
Familiarity with the language will grow and the approach should come to
seem increasingly natural with the applications to semantics in subsequent
chapters.
2.1 The Definition of a Category
A "category" is an abstraction of "sets and functions between them." In a
category sets become "objects," abstract things with no internal structure.
There are sets in the theory, however, namely, for each two objects X, Y there
is a set of "morphisms" from X to Y. These morphisms compose in an
associative way, and there are identity morphisms. The motivating examples
for us are 2 and 3 below. Here, then, is the precise definition:
1 Definition. A category $\mathbf{C}$ is given by data (i), (ii), (iii) subject to axioms (a),
(b), (c) as follows.
Datum i. A collection $\mathrm{ob}(\mathbf{C})$ of C-objects $X, Y, Z, \dots$.
Datum ii. For each ordered pair of objects $(X, Y)$ a set $\mathbf{C}(X, Y)$ of C-morphisms from $X$ to $Y$. We use the term map as a synonym for
morphism.
Axiom a. The sets $\mathbf{C}(X, Y)$ are disjoint: if $\mathbf{C}(X, Y) \cap \mathbf{C}(X', Y') \neq \emptyset$, then
$X = X'$ and $Y = Y'$.
We will rarely say $f \in \mathbf{C}(X, Y)$, introducing instead the following two synonymous notations: $f: X \to Y$, $X \xrightarrow{f} Y$. Here $X$
is called the domain of $f$ and $Y$ is the codomain of $f$. Axiom a
guarantees that this definition makes sense, that is, there will
never be any ambiguity concerning the domain or codomain of
a morphism.
Datum iii. A composition operator $\circ$ assigning to each ordered pair of
morphisms $(f, g)$ of form $f: X \to Y$, $g: Y \to Z$ (i.e., the codomain of
$f$ coincides with the domain of $g$) a third morphism $g \circ f: X \to Z$
whose domain is that of $f$ and whose codomain is that of $g$.
Axiom b. Composition is associative, that is, given $f: X \to Y$, $g: Y \to Z$,
$h: Z \to W$, $(h \circ g) \circ f = h \circ (g \circ f): X \to W$.
Axiom c. For each object $X$ there exists an identity morphism $\mathrm{id}_X: X \to X$
with domain and codomain $X$ with the property that for
each morphism $f: Y \to X$, $\mathrm{id}_X \circ f = f$ and for each morphism
$g: X \to Z$, $g \circ \mathrm{id}_X = g$.
This completes the definition. We observe at once that the $\mathrm{id}_X$ of axiom c
is unique. For suppose also that $u: X \to X$ satisfies $u \circ f = f$ for all $f: Y \to X$
and $g \circ u = g$ for all $g: X \to Z$. Regarding $u$ as an $f$ for $\mathrm{id}_X$, $\mathrm{id}_X \circ u = u$.
Regarding $\mathrm{id}_X$ as a $g$ for $u$, $\mathrm{id}_X \circ u = \mathrm{id}_X$. Thus, $u = \mathrm{id}_X$. Hence, $\mathrm{id}_X$ is well
named as the identity morphism of $X$.
As is usual for mathematical structures generally, a host of alternate
notations may prove useful. Thus, composition might be denoted $g * f$
instead of $g \circ f$ for some categories. Since composition is the basic operation
of category theory we shall most often write composition with no symbol at
all, as $gf$. We shall almost always stick to $\mathrm{id}_X$ for the identity morphism of $X$.
Even in our first examples, different categories may share the same objects
and even the same morphisms. In such situations different arrows such
as $f: X \rightharpoonup Y$ may be used and alternate notation for composition may be
essential.
2 Example. Set, the category of sets and total functions. Here objects are sets,
a morphism $f: X \to Y$ is a total function from $X$ to $Y$, composition is the
usual one, $(gf)(x) = g(f(x))$, and $\mathrm{id}_X(x) = x$.
3 Example. Pfn, the category of sets and partial functions. Here objects are
sets but a morphism $f: X \to Y$ is a partial function from $X$ to $Y$. Composition
is as in 1.3.7. The identity (total) function still provides $\mathrm{id}_X$. Note that $\mathbf{Pfn}(X, Y)$
in the sense of definition 1 is exactly $\mathbf{Pfn}(X, Y)$ as in 1.3.4.
4 Example. Mfn, the category of sets and multivalued functions. Here objects
are sets, $\mathbf{Mfn}(X, Y)$ is as in 1.4.3 with composition given by 1.4.4, and
$\mathrm{id}_X(x) = \{x\}$.
5 Example. ANMfn, the category of sets and multivalued functions with
"all or nothing" composition. In this example, objects are sets and $\mathbf{ANMfn}(X, Y)
= \mathbf{Mfn}(X, Y)$ but composition $gf: X \to Z$ for $f: X \to Y$, $g: Y \to Z$ is
defined by
\[
(gf)(x) =
\begin{cases}
\emptyset & \text{if } g(y) = \emptyset \text{ for some } y \in f(x), \\
\{z \in Z : \text{there exists } y \in f(x) \text{ with } z \in g(y)\} & \text{else.}
\end{cases}
\]
(This is "all or nothing" in the sense that scenario 1.4.5 has been modified so
that no output is defined if any computation fails to terminate.) The identity
morphism $\mathrm{id}_X$ is the same as in Mfn. Thus, the only difference between
ANMfn and Mfn is composition.
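The contrast between the two compositions is easy to see on a small example. In this sketch a multifunction is assumed to be a dict sending each point to a Python set of outputs; `compose_mfn` and `compose_anmfn` are illustrative names of ours.

```python
def compose_mfn(g, f):
    """Composition in Mfn: collect every z reachable through some y in f(x)."""
    return {x: set().union(*(g[y] for y in ys)) for x, ys in f.items()}


def compose_anmfn(g, f):
    """'All or nothing' composition: no output if g(y) is empty for some y in f(x)."""
    return {
        x: set() if any(not g[y] for y in ys) else set().union(*(g[y] for y in ys))
        for x, ys in f.items()
    }


f = {0: {"a", "b"}, 1: {"a"}, 2: set()}
g = {"a": {10}, "b": set()}

print(compose_mfn(g, f))
print(compose_anmfn(g, f))
```

The two results differ only at the point 0: one of the intermediate values, "b", leads to no output under $g$, so the all-or-nothing composite discards the output 10 that the Mfn composite retains.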
Examples 2-5 are categories. For all but ANMfn, axiom b has been
established in Section 1.4; we leave the modification of properties 1.4.6 to
ANMfn as an exercise. Axiom c is routine. Axiom a holds by definition: we
consider the domain and codomain as part of the definition of a function. In
the student's likely first encounter with functions, elementary calculus, axiom
a is not made explicit. Formulas such as $x^2$ are confused with functions and
one speaks one moment of "$x^2$ for $-1 \leq x \leq 10$" and the next moment of "$x^2$
for $2 \leq x \leq 3$." According to our conventions these are different functions.
This is reasonable since these functions have different properties; for example,
the second is monotone increasing where the first is not.
We again avoid a formal proof that repeated use of the associative
law axiom b establishes that all n-fold compositions are equal regardless of
parenthesization and so can be written without parentheses as $f_n \cdots f_1$.
(Example 1.4.8 clearly goes through in any category.) The commutative
designation such as 1.5.4 is useful in any category.
Thus, in the diagram
[diagram: two paths from $X$ to $Y$, one through $A$ and $B$ via $f$, $g$, $h$ and one through $D$ via $a$, $b$]
we understand that "$ba = hgf$" is asserted and we may emphasize this
assertion by saying "the diagram commutes."
When one regards a category as "the semantic category" generalizing 1.5.1
(with 3, 4, and 5 being examples), the flowscheme notation of 1.5.2, clearly
a workable synonym for $f: X \to Y$ in any category, is useful. In practice,
however, many other types of category arise. Experience dictates that virtually any class of structures can be made the objects of a category in a
"natural" way. Some of the possibilities are explored in the exercises. We turn
now to examples of categories that are useful in this book but not necessarily
as "semantic" categories.
6 Definitions. A partially ordered set, poset for short, is a pair $(P, \leq)$ where $P$
is a set and $\leq$ is a binary relation on $P$ which is a partial order on $P$. This is
defined to mean that the following three axioms hold for all $x, y, z \in P$.
Reflexivity: $x \leq x$.
Transitivity: if $x \leq y$ and $y \leq z$ then $x \leq z$.
Antisymmetry: if $x \leq y$ and $y \leq x$ then $x = y$.
We emphasize that the symbol $\leq$ has no a priori meaning. Any relation
satisfying the three axioms is a partial order, and many different partial
orders may be of interest on one set.
While other symbols could be used (for example, $xRy$ instead of $x \leq y$),
the $\leq$ symbol gives rise to the following associated definitions. In a poset
$(P, \leq)$ say that
$x < y$ if $x \leq y$ but $x \neq y$,
$x \geq y$ if $y \leq x$,
$x > y$ if $y < x$,
$x \nleq y$ if it is false that $x \leq y$ (warning: not equivalent to $x > y$; see the
Hasse diagram below).
$x \nless y$, $x \ngeq y$, $x \ngtr y$ are defined similarly. It is not so clear how to obtain
similar conventions with the symbol $R$.
A useful device for drawing finite posets galore is the Hasse diagram, an
example of which is
[Hasse diagram on the nodes $a$, $b$, $c$, $d$, $e$]
Here $P$ is the set of nodes (= dark circles); $P = \{a, b, c, d, e\}$ in this example.
The partial order is defined by $x \leq y$ if and only if $x = y$ or $x$ is below $y$ and
there exists an upward path from $x$ to $y$. It is easy to see that $(P, \leq)$ is always
a poset. In the above example $a \leq b$, $a \leq d$, while $b$, $c$ are incomparable
because $b \nleq c$ and $c \nleq b$.
A totally ordered set is a partially ordered set $(P, \leq)$ in which every two
elements are comparable: given $x$, $y$ at least one of $x \leq y$ or $y \leq x$ holds.
The term partially ordered set refers to the possibility that incomparable
pairs may exist.
Posets are fundamental structures arising frequently in mathematics and
theoretical computer science. They play several roles in this book. Here are
some examples of posets:
7 Example. If $\mathbf{N} = \{0, 1, 2, \dots\}$ is the set of natural numbers and $\leq$ has its
usual meaning, $(\mathbf{N}, \leq)$ is a totally ordered set.
8 Example. If $Y$ is any set and $\mathcal{P}(Y)$ is the set of subsets of $Y$ (1.4.2) then
$(\mathcal{P}(Y), \subset)$ is a poset where $A \subset B$ is the usual subset inclusion. Note that we
may have
[figure: two sets $A$ and $B$, neither contained in the other]
that is, $A, B \in \mathcal{P}(Y)$ but neither $A \subset B$ nor $B \subset A$ holds. Thus, if $Y$ has two or
more elements, $(\mathcal{P}(Y), \subset)$ is not totally ordered.
9 Example. For any two sets $X$, $Y$ and $f, g \in \mathbf{Pfn}(X, Y)$ define $f \leq g$ to mean
$g$ extends $f$, that is, "if $f(x)$ is defined then $g(x)$ is also defined and then
$g(x) = f(x)$." Then $(\mathbf{Pfn}(X, Y), \leq)$ is a poset which is not totally ordered. This
example is important in Section 5.1.
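The extension ordering of Example 9 can be verified exhaustively when $X$ and $Y$ are small. This is a sketch under the assumption that a partial function is a Python dict; the name `extends` and the particular two-element sets are ours.

```python
from itertools import product


def extends(f, g):
    """f <= g: wherever f is defined, g is defined with the same value."""
    return f.items() <= g.items()


X, Y = [0, 1], ["a", "b"]
# All partial functions X -> Y: each point is undefined (None) or sent to a value.
pfns = [
    {x: y for x, y in zip(X, choice) if y is not None}
    for choice in product([None] + Y, repeat=len(X))
]

reflexive = all(extends(f, f) for f in pfns)
transitive = all(
    extends(f, h)
    for f in pfns for g in pfns for h in pfns
    if extends(f, g) and extends(g, h)
)
antisymmetric = all(
    f == g for f in pfns for g in pfns if extends(f, g) and extends(g, f)
)
total = all(extends(f, g) or extends(g, f) for f in pfns for g in pfns)

print(reflexive, transitive, antisymmetric, total)
```

The three poset axioms hold, while totality fails: for instance, the functions sending 0 to "a" and 0 to "b" are incomparable, since neither extends the other.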
Partially ordered sets form a category:
10 Example. Define Poset to be the category whose objects are posets and
with $\mathbf{Poset}((P, \leq), (P_1, \leq_1))$ the set of all total functions $f: P \to P_1$ which are
monotone in the sense that
\[
\text{if } x_1 \leq x_2 \text{ then } f(x_1) \leq_1 f(x_2).
\]
Composition and identity morphisms are as in Set.
The reader should check that Poset does then satisfy the category axioms.
We next introduce another important mathematical structure.
11 Definition. A monoid is a triple $(M, \circ, e)$ where $M$ is a set, $\circ: M \times M \to M$
is a function, and $e \in M$, all subject to the axioms
$\circ$ is associative: $(x \circ y) \circ z = x \circ (y \circ z)$ for all $x, y, z$ in $M$.
$e$ is the identity: $e \circ x = x \circ e = x$ for all $x$ in $M$.
As for categories, the composition of $x_1, \dots, x_n$ is written without parentheses
as $x_1 \circ \cdots \circ x_n$.
12 Example. For any category $\mathbf{C}$ and any object $X \in \mathrm{ob}(\mathbf{C})$, the set $\mathbf{C}(X, X)$
of all morphisms of $X$ to itself forms a monoid under composition, with
identity $\mathrm{id}_X$.
13 Example. An example of a monoid familiar from formal language
theory is $(X^*, \mathrm{conc}, \Lambda)$, where $X^*$ is the set of all finite strings $(x_1, \dots, x_m)$,
$m \geq 0$, with each $x_i$ in the given "alphabet" $X$. Here conc is the operation
of concatenation, $\mathrm{conc}((x_1, \dots, x_m), (y_1, \dots, y_n)) = (x_1, \dots, x_m, y_1, \dots, y_n)$, and
$\Lambda = (\ )$ is the empty string ($= (x_1, \dots, x_m)$ with $m = 0$).
14 Example. The category Mon has monoids as objects, monoid homomorphisms as morphisms. Here, given two monoids $(M, \circ, e)$ and $(M', *, e')$,
we say that a function $f: M \to M'$ is a monoid homomorphism $f: (M, \circ, e) \to (M', *, e')$ if $f(e) = e'$, while $f(x \circ y) = f(x) * f(y)$ for all $x, y$ in $M$. We define
composition and identity as for functions. The reader should check that Mon
does indeed satisfy the category axioms.
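A familiar instance of a monoid homomorphism is the length function from strings to the natural numbers under addition. The sketch below is an illustration of ours, not an example from the text: strings are modelled as Python tuples, and the names `conc`, `LAMBDA`, `length`, `shifted`, and `is_homomorphism` are hypothetical. The check is run on a finite sample of words only.

```python
def conc(u, v):
    """Concatenation of strings modelled as tuples."""
    return u + v


LAMBDA = ()                      # the empty string


def length(w):
    """Candidate homomorphism (X*, conc, LAMBDA) -> (N, +, 0)."""
    return len(w)


def shifted(w):
    """Not a homomorphism: it fails f(e) = e'."""
    return len(w) + 1


def is_homomorphism(f, words):
    """Check f(e) = e' and f(x o y) = f(x) * f(y) on a finite sample of words."""
    preserves_identity = f(LAMBDA) == 0
    preserves_product = all(f(conc(u, v)) == f(u) + f(v) for u in words for v in words)
    return preserves_identity and preserves_product


words = [(), ("a",), ("a", "b"), ("b", "b", "a")]
print(is_homomorphism(length, words))
print(is_homomorphism(shifted, words))
```

Both conditions of Example 14 are needed: `shifted` already fails on the identity, and it fails the product condition as well.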
15 Definition. Let $\mathbf{C}$ be any category and let $\mathcal{D}$ be any subclass of $\mathrm{ob}(\mathbf{C})$.
Define a category $\mathbf{D}$ by $\mathrm{ob}(\mathbf{D}) = \mathcal{D}$, $\mathbf{D}(X, Y) = \mathbf{C}(X, Y)$ for each $X$, $Y$ in $\mathcal{D}$
with composition and identities the same as in $\mathbf{C}$. A routine check shows $\mathbf{D}$
is a category. We call it the full subcategory induced by $\mathcal{D}$. ("Full" refers to the
fact that all C-morphisms between objects in $\mathcal{D}$ have been retained.)
Since no restrictions have been imposed on $\mathcal{D}$, full subcategories give rise
to a rich supply of new categories. Even more generally:
16 Definition. Let $\mathbf{C}$ be a category. A subcategory $\mathbf{D}$ of $\mathbf{C}$ is given by a
subclass $\mathrm{ob}(\mathbf{D})$ of $\mathrm{ob}(\mathbf{C})$ and, for each $X$, $Y$ in $\mathrm{ob}(\mathbf{D})$, a subset $\mathbf{D}(X, Y)$ of
$\mathbf{C}(X, Y)$ subject to the axioms that $\mathrm{id}_X \in \mathbf{D}(X, X)$ and, whenever $f \in \mathbf{D}(X, Y)$,
$g \in \mathbf{D}(Y, Z)$ then $gf \in \mathbf{C}(X, Z)$ is, in fact, in $\mathbf{D}(X, Z)$.
It is obvious that such $\mathbf{D}$, with the composition inherited from $\mathbf{C}$, satisfies
axioms a, b, c of the definition of a category. Thus, a subcategory is a
category in its own right.
Clearly, a subcategory $\mathbf{D}$ of $\mathbf{C}$ is a full subcategory if and only if $\mathbf{D}(X, Y) =
\mathbf{C}(X, Y)$ for all $X, Y \in \mathrm{ob}(\mathbf{D})$.
17 Example. Set is a (nonfull) subcategory of Pfn since $\mathrm{ob}(\mathbf{Set}) = \mathrm{ob}(\mathbf{Pfn})$,
$\mathbf{Set}(X, Y) \subset \mathbf{Pfn}(X, Y)$, $\mathrm{id}_X \in \mathbf{Pfn}(X, X)$ is the total identity function and if $f$, $g$
are composable total functions their composition $gf$ as partial functions is
their composition as total functions.
EXERCISES FOR SECTION 2.1
1. Find a formula analogous to 1.4.7 for the composition of three multifunctions in
ANMfn. Do the same for n multifunctions.
2. Repeat Exercise 1.4.5 in any category.
3. The category Vect has real vector spaces as objects, linear maps as morphisms,
composition, and identity morphisms at the level Set. Verify that this is a
category.
4. Let $\mathbf{C}$ be any category and let $X$ be any object of $\mathbf{C}$. Define a category $\mathbf{D}$ as
follows.
A D-object is a C-morphism of form $f: A \to X$.
A D-morphism from $f: A \to X$ to $f_1: A_1 \to X$ is a C-morphism $g: A \to A_1$ such
that the following diagram commutes:
[diagram: triangle on $A$, $A_1$, $X$ asserting $f_1 g = f$]
Define composition and identity morphisms as in $\mathbf{C}$. Verify that $\mathbf{D}$ is a category.
It is called the category of C-objects over $X$.
5. Let $\mathbf{C}$ be any category. Let
$f: X \to Z$, $g: Y \to Z$
be C-morphisms. Define a category $\mathbf{D}$ as follows.
A D-object is $(S, t, u)$ where $t: S \to X$, $u: S \to Y$ are C-morphisms such that
[diagram: commutative square asserting $ft = gu$]
A D-morphism from $(S, t, u)$ to $(S_1, t_1, u_1)$ is a C-morphism $\alpha: S \to S_1$ such that
[diagram: commutative triangles asserting $t_1 \alpha = t$ and $u_1 \alpha = u$]
Define composition and identity morphisms as in $\mathbf{C}$. Verify that $\mathbf{D}$ is a category.
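6. Let $\mathbf{C}$ be any category. Let $\mathbf{D}$ be the category of commutative squares of $\mathbf{C}$ defined
as follows.
A D-object is a commutative square $(A, B, C, D, r, s, t, u)$:
[diagram: commutative square on $A$, $B$, $C$, $D$ with edges $r$, $s$, $t$, $u$]
A D-morphism from $(A, B, C, D, r, s, t, u)$ to $(A_1, B_1, C_1, D_1, r_1, s_1, t_1, u_1)$ is a 4-tuple
$(\alpha, \beta, \gamma, \delta)$ where $\alpha: A \to A_1$, $\beta: B \to B_1$, $\gamma: C \to C_1$, $\delta: D \to D_1$ such that the
following "commutative cube" obtains:
[diagram: commutative cube joining the two squares by $\alpha$, $\beta$, $\gamma$, $\delta$]
Define composition as in $\mathbf{C}$, that is,
\[
(\alpha', \beta', \gamma', \delta')(\alpha, \beta, \gamma, \delta) = (\alpha' \alpha, \beta' \beta, \gamma' \gamma, \delta' \delta),
\]
and similarly let $(\mathrm{id}_A, \mathrm{id}_B, \mathrm{id}_C, \mathrm{id}_D)$ be the identity morphism. Verify that $\mathbf{D}$ is a
category.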
7. Working this exercise will give the reader a head start on later work in data types.
Let $(P, \leq)$ be a poset. Define a category $\mathbf{C}_{(P, \leq)}$ whose objects are the elements
of $P$ and whose morphisms are the assertions that $x \leq y$. More formally,
\[
\mathbf{C}_{(P, \leq)}(x, y) =
\begin{cases}
\{x \to y\} & \text{if } x \leq y, \\
\emptyset & \text{else.}
\end{cases}
\]
Thus, there is a morphism $x \to y$ (and it is then unique) just in case $x \leq y$. Define
composition by $(y \to z)(x \to y) = x \to z$ and define $\mathrm{id}_x = x \to x$.
Verify that $\mathbf{C}_{(P, \leq)}$ is a category. How do the poset axioms correspond to the
category axioms here?
8. Let $X$ be any set. Show that $x \leq y$ if $x = y$ is a partial order on $X$, the discrete
ordering. $\mathbf{C}_{(X, =)}$ as in Exercise 7 is called the discrete category on $X$.
9. Let $X^*$ be as in 13. Show that the "prefix ordering" $w \leq v$ if $w$ is a prefix of $v$, that
is, $v = \mathrm{conc}(w, u)$ for some $u$, is a partial order on $X^*$.
10. Recalling the discussion following 1.4.10, it is useful to have a semantic category
$\mathbf{Mfn}_\infty$ for which the objects are sets but such that
\[
\mathbf{Mfn}_\infty(X, Y) = \mathbf{Tot}(X, \mathcal{P}(Y) \cup \{\infty\})
\]
where $\infty$ is any object not in $\mathcal{P}(Y)$. When $f \in \mathbf{C}(X, Y)$, $f(x) = \emptyset$ means "termination with the empty set of values," whereas $f(x) = \infty$ means "nontermination,"
thereby extending the scenario of 1.4.5. Capture the spirit of this extended
scenario by providing a definition for composition and identities that make $\mathbf{Mfn}_\infty$
a category.
11. Let $(M, \circ, e)$ be any monoid. We think of elements of $M$ as "degrees of reliability"
with $e$ being "totally reliable." A partial function with reliability from $X$ to $Y$ is a
partial function $f: X \to M \times Y$, where $M \times Y$ is the set of all ordered pairs
$(a, y)$ with $a \in M$, $y \in Y$. If $f(x) = (a, y)$ we say "$f(x) = y$ with reliability $a$."
(i) An intuitive choice of monoid is $M = [0, 1]$, the interval of all real numbers
$a$, $0 \leq a \leq 1$ with $a \circ b$ ordinary numerical multiplication and $e = 1$. Verify
that this is a monoid. (Here $f(x) = (a, y)$ might be interpreted "$f(x) = y$ with
probability $a$.")
(ii) For any monoid $(M, \circ, e)$ verify that the following defines a category.
Objects: sets.
Morphism $X \to Y$ = partial function $X \to M \times Y$.
Identity morphism $X \to X$: $DD(\mathrm{id}_X) = X$, $\mathrm{id}_X(x) = (e, x)$.
Composition: given $f: X \to M \times Y$, $g: Y \to M \times Z$, "if $f(x) = y$ with
reliability $a$ and $g(y) = z$ with reliability $b$ then $gf(x) = z$ with reliability $ba$,"
that is, $gf(x)$ is defined if and only if $f(x) = (a, y)$ is defined and $g(y) = (b, z)$
is defined and then $gf(x) = (ba, z)$.
The resulting category is denoted $\mathbf{FwR}_{(M, \circ, e)}$.
(iii) Given $f \in \mathbf{Pfn}(X, Y)$ define $f^{\wedge} \in \mathbf{FwR}_{(M, \circ, e)}(X, Y)$ by
\[
DD(f^{\wedge}) = DD(f), \qquad f^{\wedge}(x) = (e, f(x)).
\]
A morphism of form $f^{\wedge}$ is said to be reliable. Prove that $(gf)^{\wedge} = g^{\wedge} f^{\wedge}$ and the
reliable morphisms constitute a subcategory of $\mathbf{FwR}_{(M, \circ, e)}$.
2.2 Isomorphism, Duality, and Zero Objects
This section introduces the fundamental equivalence relation of category
theory, isomorphism. Also discussed are duality, initial and terminal objects,
as well as zero objects which are simultaneously initial and terminal. The
Cartesian product of two sets, the set of words on a given alphabet, and the
principle of simple recursion are all manifestations of initial or terminal
objects.
Isomorphisms
All constructions in a category must ultimately be described entirely in
the language of objects, morphisms, composition, and identities. Our first
definition in this language is that of "isomorphism."
1 Definition. A morphism $f: X \to Y$ in a category $\mathbf{C}$ is an isomorphism if
there exists $g: Y \to X$ with $gf = \mathrm{id}_X$, $fg = \mathrm{id}_Y$ or, in terms of a commutative
diagram,
[diagram: $X \xrightarrow{f} Y \xrightarrow{g} X \xrightarrow{f} Y$, with the composites $gf$ and $fg$ equal to $\mathrm{id}_X$ and $\mathrm{id}_Y$]
Such $g$, if it exists, is unique, since if also $hf = \mathrm{id}_X$, $fh = \mathrm{id}_Y$ then
\[
g = g\,\mathrm{id}_Y = g(fh) = (gf)h = \mathrm{id}_X h = h.
\]
This proof uses the full force of
axioms b and c in the definition of a category. Such $g$, then, is called the
inverse of $f$ and is written $f^{-1}$.
2 Example. In Set $f: X \to Y$ is an isomorphism if and only if $f$ is bijective,
that is, $f$ is one-one and onto. To see this, first suppose $f$ is an isomorphism.
If $f(x) = f(x')$ then $x = f^{-1}(f(x)) = f^{-1}(f(x')) = x'$, which proves $f$ is one-one (injective). If $y$ is any element of $Y$, $y = f(f^{-1}(y))$, so $f$ is onto (surjective).
Conversely, let $f$ be injective and surjective. If $y$ is any element of $Y$ there
exists a unique element of $X$, call it $g(y)$, which $f$ maps to $y$. Thus, $f(g(y)) = y$.
Since, in particular, $f(g(f(x))) = f(x)$ and $f$ is injective, $g(f(x)) = x$.
3 Example. In Pfn, $f: X \to Y$ is an isomorphism if and only if $f$ is a
total function which is bijective. By the first example it is obvious that a
bijective total function is an isomorphism. Conversely, let $f: X \to Y$ be an
isomorphism. Then $f^{-1} f = \mathrm{id}_X$ so that $X = DD(\mathrm{id}_X) = DD(f^{-1} f) \subset DD(f)$
which implies $f$ is a total function. Similarly, $f f^{-1} = \mathrm{id}_Y$ implies $f^{-1}$ is total,
so $f$ is an isomorphism in Set.
4 Example. In Mfn, $f: X \to Y$ is an isomorphism if and only if $f$ is a
total function which is bijective. As in Example 3, one way is clear.
Conversely, let $f: X \to Y$ be an isomorphism. First observe that if $y \in f(x)$
then $x \in f^{-1}(y)$ since $f f^{-1}(y) = \{y\}$ implies that $f^{-1}(y) \neq \emptyset$ and, if $x' \in f^{-1}(y)$
then $x' \in f^{-1} f(x) = \{x\}$ so that $x' = x$. Then, if $y, y' \in f(x)$ then $x \in f^{-1}(y)$ and
$y' \in f(x)$ so $y' \in f f^{-1}(y) = \{y\}$ and $y' = y$. This proves $f$ is a partial function.
Symmetrically, $f^{-1}$ is a partial function. Now use the preceding example.
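For finite sets the characterization of isomorphisms in Set can be made concrete. The sketch below assumes a total function $f: X \to Y$ is given as a dict whose key set is $X$, together with its codomain `Y`; the function name `inverse` is ours. It either builds the inverse $g$ or reports that none exists.

```python
def inverse(f, Y):
    """Return g with gf = id_X and fg = id_Y, or None if f is not bijective."""
    g = {}
    for x, y in f.items():
        if y in g:
            return None          # two points share a value: f is not injective
        g[y] = x
    if set(g) != set(Y):
        return None              # some y is never hit: f is not surjective
    return g


f = {0: "a", 1: "b", 2: "c"}
g = inverse(f, {"a", "b", "c"})
print(g)
print(all(g[f[x]] == x for x in f), all(f[g[y]] == y for y in g))

print(inverse({0: "a", 1: "a"}, {"a"}))          # not injective
print(inverse({0: "a"}, {"a", "b"}))             # not surjective
```

The two early exits correspond exactly to the two halves of the argument in Example 2: failure of injectivity and failure of surjectivity.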
2 An Introduction to Category Theory
5 Definition. Two objects X, Y in a category C are isomorphic if there exists
an isomorphism f: X → Y. This is written X ≅ Y.
6 Observation. Isomorphism is an equivalence relation on ob(C).
PROOF. id_X: X → X is an isomorphism with id_X^{-1} = id_X so that X ≅ X, and
isomorphism is reflexive. If f: X → Y is an isomorphism, so is f^{-1}: Y → X so
that isomorphism is symmetric. To see that transitivity holds, if f: X → Y and
g: Y → Z are isomorphisms, then (gf)(f^{-1}g^{-1}) = g(ff^{-1})g^{-1} = gg^{-1} = id_Z
and (f^{-1}g^{-1})(gf) = id_X similarly so that gf is an isomorphism (and
(gf)^{-1} = f^{-1}g^{-1}). □
As a rule, definitions and constructions in category theory (beginning with
7 below; see Theorem 8) are not unique but are "unique up to isomorphism."
Thus, a major aspect of the philosophy of category theory is that "isomorphism"
formalizes "abstractly the same."
Each theorem of category theory has a "dual theorem" whose proof is an
automatic consequence of the original, obtained by "reversing the arrows."
Before giving the general notion of duality, we explore the motivating duality
of initial and terminal objects.
7 Definition. An object A in a category C is initial if for every object X there
exists exactly one morphism from A to X. We denote this unique morphism
by !: A → X.
The next result, simple as its proof may be, is one of the most fundamental
in category theory because it turns out that many important constructs can
be shown to be equivalent to initial objects in suitable categories.
8 Theorem. If A, B are both initial objects in a category C then !: A → B is
an isomorphism. Thus, if C has an initial object it is unique up to a unique
isomorphism.
PROOF. As any two morphisms from A to A are equal, similarly B to B, the
following diagram commutes:
[Diagram: A --!--> B --!--> A = id_A and B --!--> A --!--> B = id_B.]
□
9 Example. The empty set ∅ is initial in Pfn with !: ∅ → X the totally
undefined function. Since this function is total (else it would be undefined on
some element of ∅), ∅ is also initial in Set. !: ∅ → X is not only the unique
partial function but also the unique multifunction (by 1.4.3) so ∅ is again the
initial object of Mfn and ANMfn.
2.2 Isomorphism, Duality, and Zero Objects
For each construction defined in a general category C, the dual construction
is the construction obtained by "reversing all arrows." An initial object is
one admitting unique morphisms from itself, so the dual concept should be
an object admitting unique morphisms to itself and such is aptly called a
"terminal object."
10 Definition. An object A in a category C is terminal if for each object X of
C there exists exactly one C-morphism from X to A. The unique C-morphism
from X to A will be denoted ¡: X → A.
As another exercise in the language of duality, consider the notion of an
isomorphism. We saw that f: X → Y is an isomorphism just in case there is
a g: Y → X such that the following commutes:
[Diagram: X --f--> Y --g--> X = id_X and Y --g--> X --f--> Y = id_Y.]
If we reverse all the arrows, we say that f: Y → X is "the dual of
an isomorphism" just in case there is a g: X → Y such that the following
commutes:
[Diagram: the preceding diagram with every arrow reversed, so that X --g--> Y --f--> X = id_X and Y --f--> X --g--> Y = id_Y.]
But this just says that f is an isomorphism, so the concept of isomorphism is
self-dual. With this observation, the dual of Theorem 8 is the following:
11 Theorem. If A and B are both terminal objects, then ¡: A → B is an
isomorphism.
PROOF. Once the concept of duality is understood, no proof is needed: just
reverse all arrows in the proof of Theorem 8. □
We are now going to place the concept of duality on a more formal
footing, by regarding a diagram in the category C with all its arrows reversed
as being identical to a diagram in the "opposite category" C^op. Here the
abstraction of our general definition of a category begins to show its power.
A function f: X → Y from a set X to a set Y is certainly not to be considered
as a function from Y to X, but there is nothing to prevent us using
the "arrow-reversed notation" f: Y --< X for f (using a distinctive new
arrowhead) and calling this a morphism from Y to X in the new category
Set^op. Here is the general definition.
12 Definition. Let C be a category. The dual or opposite category of C is the
category C^op defined as follows:
ob(C^op) = ob(C),
C^op(X, Y) = C(Y, X).
Taking C as the "primary" category whose arrows we write in the normal
way f: X → Y, we write the same morphism in C^op as f: Y --< X.
If f ∈ C^op(X, Y) and g ∈ C^op(Y, Z) their composition g * f in C^op(X, Z) is
obtained by taking the composition f ∘ g of g ∈ C(Z, Y) and f ∈ C(Y, X) in C:
X --f--< Y --g--< Z = X --(g * f)--< Z = X --(f ∘ g)--< Z in C^op,
where
Z --g--> Y --f--> X = Z --(f ∘ g)--> X in C.
Axioms a, b, and c for C^op follow easily from their correspondents in
C. The identity morphisms of C^op coincide with those of C. Moreover,
rephrasing our earlier observation that isomorphism is self-dual, f ∈ C(X, Y)
is an isomorphism in C if and only if the same f considered in C^op is an
isomorphism in C^op.
PROOF. The diagrams (in which equally labeled commutative diagrams are
equal statements in their respective categories)
[Diagram in C: triangles (A) and (B) formed by f: X → Y, its inverse Y → X, and the identities id_X, id_Y.]
[Diagram in C^op: the same triangles (A) and (B), with each arrow reversed and drawn as --<.]
establish the assertion about isomorphisms. □
Clearly, C = (C^op)^op. There is nothing special about being of the form C^op:
C^op ranges over all categories as C does. When C is a "concrete" category
such as Set there is no guarantee that C^op will likewise have such a
representation. Set^op is "more abstract" than Set.
We now see that our earlier definition, for each construction in a general
category C, that the dual construction is that obtained by "reversing all
arrows" becomes:
13 Definition. Given a construction A in C, the dual construction is that
obtained by performing the construction in C^op, and then interpreting the
construction in C. We will often refer to the dual of the A construct as the
co-A construct.
Thus, isomorphism = co-isomorphism, co-initial = terminal, and
co-terminal = initial. With this, let us use C^op in spelling out a full proof of
Theorem 11: By definition, A, B are initial objects in C^op. By Theorem 8,
!: B --< A is an isomorphism in C^op. But we have already shown that
f: Y --< X is an isomorphism in C^op if and only if f: X → Y is an
isomorphism in C. Thus, the unique morphism A → B (which we choose to
call ¡ rather than !) is an isomorphism in C.
14 Example. In Set, a terminal object is a one-element set. Hence, Set has
many different terminal objects but all are isomorphic (as they must be
by Theorem 11). Thus, while the "abstract theory of initial objects" and
the "abstract theory of terminal objects" should be regarded as the same
(whatever we can state and prove about initial objects in C is automatically
stated and proved, dually, for terminal objects in C^op and C^op ranges over all
categories as C does), in a particular example such as Set initial objects and
terminal objects behave differently.
15 Example. Let D be the full subcategory of Set whose objects are sets with
two or more elements. While the initial object ∅ of Set is not in D this does
not in itself prove that D has no initial object (see Exercise 2(d)). In fact, if A
is any object of D there are at least two morphisms A → A. This proves that
no object of D is either initial or terminal.
Before introducing zero objects we consider the more general concept of
zero morphisms which abstract "totally undefined" morphisms in Pfn.
16 Definition. Let C be any category, and let 0_XY ∈ C(X, Y) be given for each
X, Y. Say that (0_XY) is a family of zero morphisms if for every f: W → X,
g: Y → Z we have
g 0_XY f = 0_WZ,
that is, the composite W --f--> X --0_XY--> Y --g--> Z is 0_WZ: W → Z.
On taking f or g equal to the identity, we see that this amounts to saying
"any composition which has a zero factor is itself zero." Set does not have a
family of zero morphisms because Set(X, ∅) is empty whenever X ≠ ∅.
When a family of zero morphisms exists, however, it is unique since if (0_XY),
(Z_XY) are both families of zero morphisms,
Z_XY = id_Y Z_XY (0_XX id_X) = (id_Y Z_XY) 0_XX id_X = 0_XY.
We often write 0: X → Y for 0_XY: X → Y if no confusion would arise.
17 Example. In Pfn, Mfn, and ANMfn, the totally undefined functions
0_XY = X --¡--> ∅ --!--> Y
yield a family of zero morphisms.
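To see the zero-morphism law at work in Pfn on small examples, one may model partial functions as Python dictionaries. This is only a sketch under that assumed encoding; the names `compose` and `ZERO` are ours.

```python
def compose(g, f):
    """Composite gf of partial functions stored as dicts (composition in Pfn).

    (gf)(x) is defined exactly when f(x) is defined and g is defined at f(x).
    """
    return {x: g[y] for x, y in f.items() if y in g}


ZERO = {}   # the totally undefined partial function, playing the role of every 0_XY

f = {1: "a", 2: "b"}      # a partial function W -> X
g = {"p": True}           # a partial function Y -> Z

# "Any composition which has a zero factor is itself zero."
assert compose(ZERO, f) == ZERO
assert compose(g, ZERO) == ZERO
assert compose(g, compose(ZERO, f)) == ZERO
```

Because the empty dictionary does not record its source and target sets, one Python value stands in for the whole family (0_XY); in the category itself these are distinct morphisms.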
This example motivates the next definition and proposition.
18 Definition. A zero object in a category C is an object that is both initial
and terminal. We denote a zero object by 0, the same symbol as for an initial
object. Though arbitrary, the convention is standard.
19 Proposition. A category with a zero object has zero morphisms. In a
category with zero morphisms, each initial object is also a zero object and
each terminal object is also a zero object.
PROOF. For the first statement, let 0 be a zero object and define
0_XY = X --¡--> 0 --!--> Y.
Then, for f: W → X and g: Y → Z, the diagram
[Diagram: W --f--> X --0_XY--> Y --g--> Z together with ¡: W → 0, ¡: X → 0, !: 0 → Y, and !: 0 → Z.]
commutes so (0_XY) is a family of zero morphisms.
For the second statement, let zero morphisms exist and let 0 be an initial
object. There exists at least one morphism X → 0, namely, 0. As 0 is initial,
0: 0 → 0 = id_0: 0 → 0 so if f: X → 0 is arbitrary we have
f = id_0 f = 0f = 0.
This shows 0 is terminal. That a terminal object is initial is simply the dual
statement. □
20 Example. Pfn, Mfn, and ANMfn have ∅ as zero object. The construction
in Example 17 follows the proof of 19.
In Pfn, we have said f: X → Y is total iff DD(f) = X, that is, f(x) is
defined for every x ∈ X. In the same spirit, say that f: X → Y in Mfn or
ANMfn is total if f(x) ≠ ∅ for all x ∈ X. It is easy to prove (work Exercise 7!)
that these definitions are unified by the following abstract one.
21 Definition. Let C be a category with zero morphisms. Say that f: X → Y
is total if, whenever t: T → X, we have that t ≠ 0 implies ft ≠ 0.
The following results, obviously true in Pfn, Mfn, and ANMfn, hold for
total morphisms in general.
22 Proposition. Let C be a category with zero morphisms and let f: X → Y,
g: Y → Z. Then
(i) If f, g are total, so is gf: X → Z.
(ii) If gf is total, so is f.
PROOF.
(i) If t ≠ 0 then ft ≠ 0 so g(ft) = (gf)t ≠ 0.
(ii) If t ≠ 0 then (gf)t ≠ 0 so g(ft) ≠ 0. Since g0 = 0, ft ≠ 0. □
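Definition 21 and Proposition 22 can be spot-checked in Pfn with dictionaries as partial functions. The sketch below rests on two assumptions of ours: the dictionary encoding, and testing only morphisms t out of a one-element set, which suffices in Pfn (compare Exercise 7). The names `compose` and `is_total` are illustrative.

```python
def compose(g, f):
    """Composite gf of partial functions stored as dicts."""
    return {x: g[y] for x, y in f.items() if y in g}


def is_total(f, X):
    """Definition 21 in Pfn, tested only on points t: {*} -> X."""
    points = [{"*": x} for x in X]          # the nonzero t with one-element source
    return all(compose(f, t) != {} for t in points)


X, Y = {1, 2}, {"a", "b"}
f = {1: "a", 2: "b"}        # total X -> Y
g = {"a": 0, "b": 1}        # total Y -> {0, 1}
h = {1: "a"}                # X -> Y, undefined at 2

assert is_total(f, X) and is_total(g, Y)
assert is_total(compose(g, f), X)          # Proposition 22(i)
assert not is_total(h, X)
assert not is_total(compose(g, h), X)      # agrees with 22(ii): gh total would force h total
```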
Simple Recursion
We conclude this section by showing how sequences inductively defined by
simple recursion are the unique morphism ! from the initial object in an
appropriate category. This is a foretaste of the principle that constructions in
a category which are unique up to isomorphism are instantiations of initial
objects.
By a sequence in a set X we mean a function N → X. We may define the
sequence g: N → N, n ↦ 2^n inductively by the definition
g(0) = 1,
g(n + 1) = 2 * g(n) for n ≥ 0.
This is a specific case of the following general notion.
23 Definition. We say that the sequence g: N → X is defined from x_0 ∈ X and
f: X → X by simple recursion if g satisfies the recursive definition
Basis Step: g(0) = x_0.
Induction Step: g(n + 1) = f(g(n)) for n ∈ N.
In the above example, x_0 = 1 and f(x) = 2 * x. It is clear that g is defined
uniquely by the above scheme. We shall take up a general discussion of
recursive definitions in Chapter 5. Here our task is to show that the g of 23 is
really an example of the unique map ! induced from an initial object. First,
we note that an element x_0 ∈ X can also be written as the function 1 → X that
sends the unique element of the one-element set to x_0. We shall also call the
map x_0. The basis step of 23 can then be rewritten as the commutative
diagram
24
[Diagram: 1 --0--> N --g--> X = x_0: 1 → X.]
(Here, 0 is not a zero morphism as in 16, but the map whose value is 0 in N;
we are, in fact, in Set which does not have zero morphisms.) Again, if we
let s: N → N denote the successor function n ↦ n + 1, the induction step is
equivalent to the commutative diagram
25
[Diagram: the square with s: N → N on top, f: X → X below, and g: N → X on both sides, so that gs = fg.]
More generally, then, we have the following:
26 The Principle of Simple Recursion. For each x_0: 1 → X and f: X → X
there exists a unique g: N → X such that the following diagram commutes:
[Diagram: diagrams 24 and 25 combined, so that g0 = x_0 and gs = fg.]
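The map g promised by the principle can be computed by iterating f. The following Python sketch is an illustration only; the name `simple_recursion` is our own, and the two assertions check the equations expressed by diagrams 24 and 25 on an initial segment of N.

```python
def simple_recursion(x0, f):
    """Return the sequence g: N -> X with g(0) = x0 and g(n + 1) = f(g(n))."""
    def g(n):
        value = x0
        for _ in range(n):      # apply f exactly n times to x0
            value = f(value)
        return value
    return g


g = simple_recursion(1, lambda x: 2 * x)    # x_0 = 1, f(x) = 2 * x
s = lambda n: n + 1                         # successor function on N

assert g(0) == 1                                       # basis step: g0 = x_0
assert all(g(s(n)) == 2 * g(n) for n in range(20))     # induction step: gs = fg
assert [g(n) for n in range(6)] == [1, 2, 4, 8, 16, 32]
```

A finite check of this kind illustrates existence but of course does not prove uniqueness, which follows by induction on n.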
This leads us to consider the following category:
27 The category of simple recursion data Srd has as objects the triples
(X, x_0, f), where X is a set, x_0 ∈ X, and f: X → X is a total function. A
morphism ψ: (X, x_0, f) → (Y, y_0, h) is a total function ψ: X → Y for which
the diagram
[Diagram: x_0: 1 → X, y_0: 1 → Y, f: X → X, h: Y → Y, and ψ: X → Y, with ψ x_0 = y_0 and ψf = hψ.]
commutes, that is, ψ(x_0) = y_0, while ψ(f(x)) = h(ψ(x)) for each x in X.
We must yet specify composition and identities and verify that Srd is a
category. But, given this, we can note immediately that the principle of simple
recursion, 26, is equivalent to the statement that "(N, 0, s) is initial in Srd."
Returning to the definition of Srd, composition is defined to be the usual
composition ψ' ∘ ψ of total functions. That this is well defined is best seen
from "diagram pasting:"
[Diagram: the morphism diagram for ψ: (X, x_0, f) → (Y, y_0, h) pasted above the one for ψ': (Y, y_0, h) → (Z, z_0, k).]
For example, k(ψ'ψ) = (ψ'ψ)f because k(ψ'ψ) = (kψ')ψ = (ψ'h)ψ = ψ'(hψ) =
ψ'(ψf) = (ψ'ψ)f. Axiom b of 2.1.1 is obvious since the composition of total
functions is associative. The identities for axiom c are the obvious ones:
[Diagram: id_X as a morphism (X, x_0, f) → (X, x_0, f), with id_X(x_0) = x_0 and id_X f = f id_X.]
Now claim: if ψ: (X, x_0, f) → (Y, y_0, h) is an Srd-morphism, ψ is an
isomorphism in Srd if and only if ψ is bijective. On the one hand, if ψ is
an isomorphism then there exists φ: (Y, y_0, h) → (X, x_0, f) with ψ ∘ φ = id_Y,
φ ∘ ψ = id_X, so ψ is bijective. Conversely, if ψ is bijective then there exists a
function φ: Y → X with ψ ∘ φ = id_Y, φ ∘ ψ = id_X. Is such φ a morphism
(Y, y_0, h) → (X, x_0, f)? Consider the diagram below; the ?'s indicate places
where commutativity is yet to be proved.
[Diagram: the morphism diagram for ψ pasted above the candidate diagram for φ, with ? marking φ y_0 = x_0 and φh = fφ.]
Well, reading from the diagram above, (φh)ψ = φ(hψ) = φ(ψf) = (φψ)f =
id_X f = f = f ∘ id_X = f(φψ) = (fφ)ψ. Thus, φh = (φh)(ψψ^{-1}) = ((φh)ψ)ψ^{-1} =
((fφ)ψ)ψ^{-1} = (fφ)(ψψ^{-1}) = fφ.
Similarly, φ(y_0) = φ(ψ(x_0)) = (φψ)(x_0) = id_X(x_0) = x_0.
We reiterate the desire that objects in a category should be isomorphic
just in case they are "abstractly the same." This works out well for the
category of recursion data above where if ψ: (X, x_0, f) → (Y, y_0, g) is an
isomorphism, the bijection ψ transports x_0 to y_0 and f to g: thinking of ψ as
a "relabelling," the abstract structure is "the same." When designing new
categories, one of the aesthetic criteria to keep in mind is that this technical
sense of isomorphism should relate to intuitive ones. For example, if when
defining the category of simple recursion data we dropped the requirement
that ψ(x_0) = y_0 and ψf = gψ, we would get a category whose isomorphisms
were bijections, but here if s: N → N is s(n) = n + 1 whereas for z: N → N,
z(n) = 0, (N, 0, s) and (N, 0, z) would be isomorphic, which is not desirable.
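The two conditions on an Srd-morphism are easy to test when the carrier set is finite. The sketch below uses hypothetical finite recursion data (counting modulo 4 and modulo 2) in place of N, and the name `is_srd_morphism` is ours; it also shows a bijection that fails to be a morphism, in the spirit of the remark above.

```python
def is_srd_morphism(psi, X, x0, f, y0, h):
    """Check psi(x0) == y0 and psi(f(x)) == h(psi(x)) for every x in the finite set X."""
    return psi(x0) == y0 and all(psi(f(x)) == h(psi(x)) for x in X)


X = range(4)
f = lambda x: (x + 1) % 4        # (X, 0, f): counting modulo 4
h = lambda y: (y + 1) % 2        # ({0, 1}, 0, h): counting modulo 2
psi = lambda x: x % 2            # candidate morphism X -> {0, 1}

assert is_srd_morphism(psi, X, 0, f, 0, h)

# A bijection that ignores the structure: the identity from (X, 0, f) to (X, 0, z),
# where z is constantly 0, is bijective but not an Srd-morphism.
z = lambda x: 0
identity = lambda x: x
assert not is_srd_morphism(identity, X, 0, f, 0, z)
```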
EXERCISES FOR SECTION 2.2
1. Show that in ANMfn isomorphisms are total bijections.
2. Let (P, ≤) be a poset and let C_(P,≤) be the category of Exercise 2.1.7.
(a) Prove that isomorphic objects of C_(P,≤) are equal.
(b) A least element of (P, ≤) is p ∈ P such that p ≤ x for all x ∈ P. Give a direct
proof that if (P, ≤) has a least element, it is unique.
(c) Give an alternate proof of (b) using Theorem 8 by showing that a least
element of (P, ≤) is an initial object of C_(P,≤).
(d) Let P = {a, b, c} be the poset with a ≤ b ≤ c. Let D be the full subcategory
with objects b, c of C_(P,≤). In comparison to Example 15, show that D has an
initial object, but one different from that of C_(P,≤).
3. Show that a morphism in Mon is an isomorphism if and only if it is bijective.
4. Show that a morphism f: (P, ≤) → (P', ≤') in Poset is an isomorphism if and
only if f is bijective and f(x) ≤' f(y) implies x ≤ y. Show that if (P, ≤) is discrete
(see Exercise 2.1.8) and (P, ≤') is not then id_P: (P, ≤) → (P, ≤') is bijective
and monotone but not an isomorphism.
5. Prove that an object Z of C is a zero object in C if and only if it is a zero object in
C^op. Prove that C has zero morphisms if and only if C^op has zero morphisms.
6. Prove that any full subcategory of a category with zero morphisms has zero
morphisms. This fails for arbitrary subcategories (see Example 2.1.17). Use this
construction to give an example of a category with zero morphisms but with no
zero object.
7. Show that the total morphisms of Pfn in the sense of Definition 21 are exactly
those f which are total functions as in 1.3.6. Similarly, show that the total
morphisms of Mfn and ANMfn of 21 are those f with f(x) ≠ ∅ for all x.
8. For your category Mfn"" of Exercise 2.1.10, do zero morphisms exist? If so,
characterize the total morphisms.
9. In any category with zero morphisms, prove that every isomorphism is total.
10. In Pfn, we call t: DD(f) → X with t(x) = x the totalizer of f.
(i) Verify that this is the special case in Pfn for the following general definition:
Let C be any category with zero morphisms. Given f: X → Y, a morphism of
form t: T → X is a totalizer of f if ft is total and if whenever u: U → X is such
that fu is total then there exists unique α with tα = u as shown:
[Diagram: α: U → T, t: T → X, u: U → X, and f: X → Y, with tα = u.]
(ii) Using Exercise 9 show that if f is an isomorphism then id_X is a totalizer of f.
(iii) Prove that if t: T → X and u: U → X are both totalizers of f then the unique
α with tα = u is an isomorphism. Give both a direct proof and a proof based
on Theorem 11 obtained by representing (T, t) and (U, u) as objects in a
suitable category C_f.
In Pfn, (i) and (ii) suggest that the important semantic notion of the domain of
definition of a morphism can be described in category-theoretic terms. Mfn and
ANMfn also possess a reasonable notion of domain of definition for f: X → Y,
namely, DD(f) = {x ∈ X: f(x) ≠ ∅}. Regrettably, it may be shown that not every
f has a totalizer in these categories.
11. Let (M, ∘, e) be a monoid. An element a ∈ M is invertible if there exists b ∈ M
with ab = e = ba. A monoid in which every element is invertible is called a
group.
(i) Let R be the set of real numbers. Show that (R, +, 0) is a group.
(ii) Imitate the proof of Definition 1 to show that if a is invertible then the b with
ab = e = ba is unique. We call b the inverse of a and write b = a^{-1}.
(iii) In fact, show that your proof in (ii) is a special case of that of
Definition 1 by associating to (M, ∘, e) the following category C_(M,∘,e).
There is only one object, call it α. The set of morphisms α → α is M with
id_α = e and ∘ for composition. Verify that C_(M,∘,e) is a category whose
isomorphisms are the invertible elements.
12. Let (M, ∘, e) be a monoid and consider the category FwR_(M,∘,e) of Exercise 2.1.11.
(i) Show that f: X → M × Y is an isomorphism from X to Y in FwR_(M,∘,e) if
and only if
(a) DD(f) = X.
(b) For all y ∈ Y there exists unique x ∈ X such that f(x) has form (a, y).
(c) For all x ∈ X, if f(x) = (a, y), a is invertible (as in Exercise 11).
(ii) Show that the empty set is a zero object of FwR_(M,∘,e).
2.3 Products and Coproducts
In this section we show that two constructions of set theory which play
an important role in program semantics-Cartesian products and disjoint
union-can be described in category-theoretic terms and so generalize to a
wide class of categories. While the original constructions seem unrelated,
their category-theoretic descriptions are seen to be dual. Cartesian products
abstract to products in a category whereas disjoint unions abstract to
coproducts, it being common to indicate duality by the prefix "co" as
discussed earlier in 2.2.13. Coproducts are an important structural aspect of
the partially additive categories of the next chapter.
We begin by describing Cartesian products of sets. The term "Cartesian"
honors the mathematician René Descartes who developed plane analytic
geometry whereby the plane is represented as the set of all ordered pairs (x, y)
with x, y in the set R of real numbers. Thus, the plane is R × R where, in
general, for any two sets X, Y their Cartesian product is the set
X × Y = {(x, y) | x ∈ X, y ∈ Y}
of all ordered pairs with x ∈ X, y ∈ Y. If a program fragment had two variables
x and y taking values in the sets X and Y, respectively, then X × Y would
comprise all possible values which could be taken by the two variables
taken together. Turning to a formal analysis, we offer the following precise
description of an ordered pair (x, y) ∈ X × Y which, if somewhat pedantic, is
useful for generalizing to the product of infinitely many sets. If we invent a
convenient two-element set, say {i, j}, an ordered pair in X × Y amounts to
a total function f: {i, j} → X ∪ Y with f(i) ∈ X, f(j) ∈ Y. The relationship
between (x, y) and f is that f(i) = x, f(j) = y, and this formula defines (x, y)
in terms of f and f in terms of (x, y) and, indeed, establishes a bijective
correspondence between the f and the (x, y). We could then regard X × Y as
the set of all functions f: {i, j} → X ∪ Y with f(i) ∈ X, f(j) ∈ Y. This leads to
the following general definition.
1 Definition. Let (X_i | i ∈ I) be any family of sets. Their Cartesian product is the
set of all functions
f: I → ∪_{i∈I} X_i
such that f(i) ∈ X_i for each i ∈ I. We denote this set of functions by
∏_{i∈I} X_i or ∏(X_i | i ∈ I).
Often family notation is used so we write (x_i | i ∈ I) instead of f, where f(i) = x_i.
This directly generalizes the motivating comments above where I has two
elements.
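For a finite family of finite sets, the Cartesian product in this functional sense can be enumerated mechanically. In the Python sketch below, each function f is encoded as a dictionary from indices to values; the name `cartesian_product` and the sample family are assumptions made for illustration.

```python
from itertools import product


def cartesian_product(family):
    """All functions f on the index set with f(i) in X_i, each encoded as a dict.

    `family` maps each index i to a finite set X_i.
    """
    indices = list(family)
    return [dict(zip(indices, choice))
            for choice in product(*(family[i] for i in indices))]


family = {"i": {0, 1}, "j": {"a", "b", "c"}}
P = cartesian_product(family)

assert len(P) == 2 * 3                                    # one function per ordered pair
assert all(f["i"] in family["i"] and f["j"] in family["j"] for f in P)
assert cartesian_product({}) == [{}]                      # empty family: exactly one function
```

With I = {i, j} each dictionary corresponds to exactly one ordered pair (f(i), f(j)), which is the bijective correspondence described before the definition.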
2 Example. If I = {p, a, s} and if X_p is a set of persons, X_a is a set of age
values, say {16, 17, ..., 80}, and X_s = {male, female}, then ∏_{i∈I} X_i is a
suitable value object for a data base "record" for a person's age and sex.
We next consider unions. Note that if X_1 ≅ X_2 are isomorphic in Set and
if, similarly, Y_1 ≅ Y_2, it is not necessarily true that X_1 ∪ Y_1 ≅ X_2 ∪ Y_2. For
example, let X_1 = {a, a'}, X_2 = {a, b}, Y_1 = {a, b, e}, and Y_2 = {c, d, e}. Then
X_1 ∪ Y_1 = {a, a', b, e} has four elements whereas X_2 ∪ Y_2 = {a, b, c, d, e} has
five. Since any two sets in bijective correspondence are "abstractly the same,"
according to the philosophy of category theory we seek a notion of union
that respects isomorphism better than ordinary union. A solution is given in
terms of "disjoint unions" wherein, given a family (X_i | i ∈ I), the elements of X_i
are "painted color i" before taking an ordinary union. The more precise
definition makes use of ordered pairs and is as follows.
3 Definition. Let (X_i | i ∈ I) be any family of sets. Their disjoint union is the set
{(x, i) | i ∈ I, x ∈ X_i}
and is denoted
∐_{i∈I} X_i or ∐(X_i | i ∈ I).
(The choice of the upside-down Cartesian product symbol anticipates the
yet-to-be established category-theoretic duality between Cartesian product
and disjoint union.)
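The contrast between ordinary and disjoint union is easily checked on small sets. In this Python sketch the name `disjoint_union` is ours, and the element names are placeholders chosen only so that the sets have the sizes used in the discussion of unions above.

```python
def disjoint_union(family):
    """The set of pairs (x, i) with i an index and x in X_i."""
    return {(x, i) for i, X_i in family.items() for x in X_i}


X1, Y1 = {"a", "a'"}, {"a", "b", "e"}
X2, Y2 = {"a", "b"}, {"c", "d", "e"}

# Ordinary union does not respect isomorphism: equal sizes in, different sizes out.
assert len(X1) == len(X2) and len(Y1) == len(Y2)
assert len(X1 | Y1) == 4 and len(X2 | Y2) == 5

# The disjoint union always has |X| + |Y| elements.
assert len(disjoint_union({1: X1, 2: Y1})) == 5
assert len(disjoint_union({1: X2, 2: Y2})) == 5
```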
Note that
∐_{i∈I} X_i = ∪_{i∈I} (X_i × {i})
is the ordinary union of the "colored" sets X_i × {i} in which an element (x, i)
is "x painted color i." The union is disjoint because
(X_i × {i}) ∩ (X_j × {j}) = ∅ if i ≠ j
even if X_i = X_j. Disjoint unions also occur in semantics:
4 Example. The disjoint union has a very natural application in describing
the exit value of a multiexit program. Here, a value like (y, i) would be
interpreted as "execution of the program terminates by taking exit i with
value y." For example, given f_1, ..., f_n: X → Y in Pfn, consider the case
statement of 1.5.22,
case (p_1, ..., p_n) of (f_1, ..., f_n)
with flowscheme
[Flowscheme for the case statement; a dashed line marks off its left portion.]
For I = {1, ..., n}, the semantics of the portion to the left of the dashed line is
g: X → ∐_{i∈I} Y
(here the notation ∐_{i∈I} Y means ∐_{i∈I} Y_i with each Y_i = Y) where
g(x) = (f_i(x), i) if p_i(x) is defined,
g(x) is undefined else.
This discussion will be completed in 25 below.
We turn now to a description of Cartesian products that uses only
category-theoretic language in Set. Our starting point is a given family
(X_i | i ∈ I) of sets. Our Definition 1 is in terms of elements; we must think more
in terms of how to use morphisms in Set to characterize when a set X is to be
isomorphic to ∏_{i∈I} X_i. To this end consider a family of morphisms of the
form (f_i: Y → X_i | i ∈ I). For each y ∈ Y, (f_i(y) | i ∈ I) is an element of ∏_{i∈I} X_i and
so should correspond to a definite element of X, call it f(y). In this way there
is a bijective correspondence between morphisms Y → X and families of
morphisms Y → X_i. Moreover, when X = ∏_{i∈I} X_i this correspondence is
easily described in terms of commutative diagrams. To begin, we need the
following definition:
5 Definition. For (X_i | i ∈ I) a family of sets and j ∈ I, the jth projection function
is
pr_j: ∏_{i∈I} X_i → X_j, pr_j(x_i | i ∈ I) = x_j.
We then observe that the relationship between f and the f_i is precisely
6
[Diagram: f: Y → ∏_{i∈I} X_i and pr_j: ∏_{i∈I} X_i → X_j with pr_j f = f_j (for all j ∈ I).]
In terms of elements, 6 asserts that
f(y) = (f_i(y) | i ∈ I)
which is what we expected. We have motivated the following definition.
7 Definition. Let C be any category and let (X_i | i ∈ I) be a family of objects of
C. A product of (X_i | i ∈ I) is (P, (pr_i | i ∈ I)), where P is an object of C and for
each i ∈ I, pr_i: P → X_i is a C-morphism, all subject to the following property:
given (Y, (f_i | i ∈ I)) with Y a C-object and f_i: Y → X_i C-morphisms there exists
unique f: Y → P such that pr_i f = f_i for all i ∈ I as shown:
[Diagram: f: Y → P, pr_i: P → X_i, and f_i: Y → X_i with pr_i f = f_i.]
The pr_i are called projection morphisms.
8 Example. In Set, P = ∏_{i∈I} X_i, with pr_i as in 5, is a product of (X_i | i ∈ I).
This was established in the discussion motivating 7.
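The product property in Set can be mirrored in Python by encoding an element of the product as a dictionary indexed by I. This is a sketch under that assumed encoding; the helper names `pairing` and `pr` and the sample functions are ours.

```python
def pairing(fs):
    """Given a dict of functions f_i: Y -> X_i, return f with f(y) = (f_i(y) | i in I)."""
    return lambda y: {i: f_i(y) for i, f_i in fs.items()}


def pr(i):
    """The ith projection: send a family (x_i | i in I), stored as a dict, to x_i."""
    return lambda family: family[i]


fs = {"len": len, "upper": str.upper}      # two functions out of Y = strings
f = pairing(fs)

for y in ["ab", "xyz", ""]:
    for i in fs:
        assert pr(i)(f(y)) == fs[i](y)     # pr_i f = f_i
```

Uniqueness is visible in the encoding: any f' with pr_i f' = f_i for all i must return, at y, the dictionary whose ith entry is f_i(y).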
9 Proposition. In any category C, products are unique up to a unique
isomorphism, that is, if (P, (pr_i)), (P', (pr_i')) are both products of (X_i | i ∈ I), the
unique α
[Diagram: α: P → P' with pr_i' α = pr_i for all i ∈ I.]
is an isomorphism.
PROOF. For given (X_i | i ∈ I), let D be the category whose objects are all (Y, (f_i))
with f_i: Y → X_i, whose morphisms h: (Y, (f_i)) → (Z, (g_i)) are defined to be
C-morphisms h: Y → Z such that
g_i h = f_i (all i ∈ I)
with composition and identities as in C. That D is a category is routinely
verified. By Definition 7, a product of (X_i | i ∈ I) is the same thing as a terminal
object of D. The desired result now follows from Theorem 2.2.11. □
10 Proposition. In any category, a product of the empty family is the same
concept as a terminal object whereas if I = {i} has one element, id: X_i → X_i
is a product.
PROOF. For the first statement, consider 7 with I = ∅. A product is an object
P equipped with the empty family (pr_i), that is, P is an object with no
further structure. Continuing, the property satisfied by P is that if Y is any
object with no further structure then there exists unique f: Y → P with no
further conditions. This is then the same as the definition of a terminal object,
2.2.10. The second statement is immediate from the diagram
[Diagram: id: X_i → X_i and f_i: Y → X_i, with f_i itself the unique morphism satisfying id f_i = f_i.]
□
11 Definition and Notations. In any category, if (P, (pr_i | i ∈ I)) is a product of
(X_i | i ∈ I) we use the notations
∏_{i∈I} X_i or ∏(X_i | i ∈ I) or ∏ X_i
for P, the third notation being a convenient shorthand if I is understood from
context. By 9 such ∏ X_i is unique only up to isomorphism, but in category
theory that is unique enough. Thus, in Set we are modifying the notation of
1 to now refer to any set isomorphic to the specific model for the product
which we there called "the Cartesian product." See Exercise 1.
The unique map f of 7 depends only on the f_i and deserves a notation to
indicate this. We thus write such f as [f_i | i ∈ I] or simply as [f_i] when I is
understood. Thus, in our new notation,
[Diagram: [f_i | i ∈ I]: Y → ∏ X_i and pr_j: ∏ X_i → X_j with pr_j [f_i | i ∈ I] = f_j.]
When I is finite with at least two elements, infix × is a convenient symbol
to indicate products. Thus, if I = {1, ..., n}, n ≥ 2,
X_1 × ... × X_n
is a synonym for ∏ X_i. Similarly, we write
X × Y × Z
instead of the more cumbersome ∏(X_i | i ∈ I), I = {1, 2, 3}, X_1 = X, X_2 = Y,
X_3 = Z. Corresponding notations exist for the unique induced map, for
example,
[Diagram: [f_1, ..., f_n]: Y → X_1 × ... × X_n with pr_i [f_1, ..., f_n] = f_i.]
and
[Diagram: the product X × Y with projections pr_X: X × Y → X and pr_Y: X × Y → Y, together with the induced map into X × Y.]
A product of (X_i | i ∈ I) when all X_i = X are equal is called a power (of I
copies of X) and we write X^I instead of ∏_{i∈I} X_i.
We say that a category has products if every family of objects has a
product. Similarly, a category has finite products if every finite family (i.e., a
family (X;!iEI) with I finite) has a product. By Proposition 10 any category
which has finite products has a terminal object.
12 Example. Set has products, as we have already discussed. If C is the
full subcategory of Set of all finite sets, then C has finite products since, for
X_1, ..., X_n ∈ C, the Set product X_1 × ... × X_n is again in C (see Exercise 11)
and so, with the same pr_i and [f_1, ..., f_n] constructions as in Set, acts as a
product in C. If (X_i | i ∈ I) is an infinite family in C for which each X_i has at
least two elements, no product of this family exists in C. (See Exercise 12.)
13 Example. Two one-element sets do not have a product in ANMfn. To see
this, suppose
{a} <--pr_1-- X --pr_2--> {b}
were a product diagram in ANMfn. Let 1 be any one-element set. By
the product property there exists a unique subset S: 1 → X of X such that
pr_1 S = ∅, pr_2 S = {b}. Since pr_2 S ≠ ∅, S ≠ ∅. As pr_1 S = ∅, there exists
x ∈ S with pr_1(x) = ∅. It follows from the definition of composition in
ANMfn that for X: 1 → X, pr_1 X = ∅. Similarly, pr_2 X = ∅. Since also
pr_1 ∅ = ∅ = pr_2 ∅ we must have X = ∅ which we have already seen is not
so, the desired contradiction.
Using the prefix "co" to signal duality, we define coproducts as dual to
products as follows:
14 Definition. In any category C, given a family (X_i | i ∈ I) of objects, a
coproduct of (X_i | i ∈ I) is (C, (in_i | i ∈ I)), where C is a C-object and for each
i ∈ I, in_i: X_i → C is a C-morphism, such that (C, (in_i | i ∈ I)) is a product in C^op.
This makes sense since in C^op the in_i have the form
in_i: C --< X_i.
Given f_i: X_i → Y in C we have
[Diagram in C: in_i: X_i → C, f_i: X_i → Y, and α: C → Y with α in_i = f_i.]
[Diagram in C^op: the same triangle with every arrow reversed and drawn as --<.]
The morphism in_i is called the ith coproduct injection. The unique α with
α in_i = f_i will be denoted (f_i | i ∈ I).
Notations similar to those for products are useful for coproducts. If
(C, (in_i | i ∈ I)) is a coproduct of (X_i | i ∈ I), we write
∐_{i∈I} X_i or ∐(X_i | i ∈ I) or ∐ X_i
as a synonym for C. We indicate finite coproducts with infix + as in
X + Y + Z.
When all X_i = X the coproduct is a copower (of I copies of X), which we write I · X.
C has coproducts if every family of C-objects has a coproduct. C has finite
coproducts if every finite family of C-objects has a coproduct.
The following propositions are dual to 9 and 10 and so require no further
proof whatsoever!
15 Proposition. In any category, coproducts are unique up to a unique
isomorphism, that is, if (C, (in_i)), (C', (in_i')) are both coproducts of (X_i | i ∈ I), the
unique α
[Diagram: α: C → C' with α in_i = in_i' for all i ∈ I.]
is an isomorphism.
16 Proposition. In any category, a coproduct of the empty family is the same
concept as an initial object whereas if I = {i} has one element, id: X_i → X_i is a
coproduct.
We now make good our promise that disjoint unions provide coproducts
in Set, resolving a possible notational ambiguity between 3 and 14.
17 Example. Set has coproducts. Given a family (X_i | i ∈ I) of sets, let C =
{(x, i): i ∈ I, x ∈ X_i} be the disjoint union of 3 and define injections in_i: X_i → C
by in_i(x) = (x, i). Given f_i: X_i → Y we have
[Diagram: in_i: X_i → C, f_i: X_i → Y, and f: C → Y with f in_i = f_i.]
where f(x, i) = f_i(x). The above diagram commutes because for i ∈ I, x ∈ X_i,
f in_i(x) = f(x, i) = f_i(x). Such f is unique since if g in_i = f_i for all i then for
any (x, i) ∈ C, g(x, i) = g in_i(x) = f_i(x) = f(x, i).
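Dually to the product, the induced map out of a disjoint union dispatches on the "color" of its argument. The Python sketch below encodes an element of the coproduct as a pair (x, i); the names `inj` and `copairing` and the sample functions are illustrative assumptions.

```python
def inj(i):
    """The ith coproduct injection in_i(x) = (x, i)."""
    return lambda x: (x, i)


def copairing(fs):
    """Given functions f_i: X_i -> Y, return f on the disjoint union with f(x, i) = f_i(x)."""
    return lambda tagged: fs[tagged[1]](tagged[0])


fs = {"int": lambda n: n + 1, "str": len}   # f_int: integers -> N, f_str: strings -> N
f = copairing(fs)

assert f(inj("int")(41)) == 42              # f in_i = f_i on the "int" summand
assert f(inj("str")("abc")) == 3            # f in_i = f_i on the "str" summand
```

The tag i carried by each element is what lets f choose the right f_i, which is why ordinary union would not serve here.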
18 Example. As discussed in 1.3.4 and 1.4.9, a total function f: X → Y may
also be regarded as a partial function X → Y or as a multifunction X → Y.
We now observe that if in_i: X_i → C is a coproduct in Set, then the same
functions considered in Pfn, Mfn, or ANMfn are again coproducts in those
categories. Thus, Pfn, Mfn, and ANMfn have coproducts and these are
constructed as disjoint unions.
To see this, first observe the following:
19 If f: X → Y is a total function and g: Y → Z is a multifunction so that g is
a total function Y → 𝒫(Z), then the Mfn and the ANMfn compositions gf are
given as the total function composition
X --f--> Y --g--> 𝒫(Z).
This is immediate from 1.4.4 and 2.1.5. Continuing with the discussion, if
f_i: X_i → 𝒫(Y) are total functions, it follows from the coproduct property in
Set that there exists a unique total function f: C → 𝒫(Y) with
[Diagram: in_i: X_i → C, f_i: X_i → 𝒫(Y), and f: C → 𝒫(Y).]
where f in_i = f_i in Set. But by 19, as in_i is a total function, this is equivalent
to f in_i = f_i in Mfn and ANMfn and so it is clear that (C, (in_i)) is a coproduct
in Mfn and ANMfn. Finally, construct C = {(x, i): i ∈ I, x ∈ X_i} as the disjoint
union with in_i(x) = (x, i) so that f(x, i) = f_i(x) as discussed in 17. Clearly, if
each f_i(x) has at most one element of Y this holds for each f(x, i) so f is a
partial function if each f_i is. But then by Proposition 1.4.10, it is clear that
(C, (in_i)) is a coproduct in Pfn.
In the balance of this section we offer a few examples that demonstrate the
relevance of coproducts to semantics.
As we remarked following 2.1.5, the flowscheme notation of 1.5.2 provides
useful instruction in the "semantic category." The notations of 1.5.3 and 1.5.8
present composition and identities this way. We now extend flowscheme
notation to finite coproducts, using the disjoint union construction of coproducts common to Pfn, Mfn, and ANMfn, as discussed in Example 18,
for motivation.
Let C be any category with finite coproducts. We consider flowschemes
whose atomic components are morphisms of the form f: X --+ Xl + ... + Xn
with flowscheme
[Flowscheme for f not reproduced.]
[Table 20. A Loop-Free Flowscheme (figure not reproduced; its output lines are D, C, and G).]
Intuitively, f first applies an n-way test and, if the ith test is successful, transforms an input from X into an output in $X_i$. Recall Example 4.
An example of sufficient complexity to illustrate the ideas is shown in Table 20. The atomic morphisms there are $f: X \to A + B$, $g: B \to E + C + F$, $h: A \to C + D$, $t: E \to D$, and $u: F \to G$. In general, however, we should consider "multi-input, multi-output" morphisms of the form $f: X_1 + \cdots + X_m \to Y_1 + \cdots + Y_n$ with flowscheme
21 [Flowscheme not reproduced: a box f with m input lines and n output lines.]
For example, consider Table 20 just after f executes; the remainder of the
computation has the form A + B --+ D + C + G.
What we wish to do is to describe Table 20 as a morphism X --+ D + C + G
in C given f, g, h, t, u. This can be done for this and a larger class of similar
flowschemes using the operations of composition, "parallel construction,"
and "line tying." We will not formalize the general class of flowschemes
involved, but we will describe the last two operations rigorously and apply
them to give a semantics to Table 20.
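22 Definition. Let $n \ge 2$, $f_i: X_i \to Y_i$, $i = 1, \ldots, n$, in a category with finite coproducts. Then the parallel construction
\[ f_1 \,\|\, \cdots \,\|\, f_n : X_1 + \cdots + X_n \to Y_1 + \cdots + Y_n \]
with flowscheme notation
[Flowscheme not reproduced.]
is defined by requiring $(f_1 \,\|\, \cdots \,\|\, f_n)\,\mathrm{in}_i = \mathrm{in}_i f_i$ for each i. That is,
\[ f_1 \,\|\, \cdots \,\|\, f_n = \langle \mathrm{in}_1 f_1, \ldots, \mathrm{in}_n f_n \rangle. \]
In Pfn, Mfn, and ANMfn, $(f_1 \,\|\, \cdots \,\|\, f_n)(x, i) = f_i(x)$, which is aptly described by the flowscheme notation.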
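23 Definition. Before defining "line tying" we give an example. The flowscheme
[Flowscheme not reproduced: input lines $X_2, X_1, X_1, X_2, X_3$ are tied to output lines $X_1, X_2, X_3$.]
describes the morphism
\[ X_2 + X_1 + X_1 + X_2 + X_3 \xrightarrow{\ \langle \mathrm{in}_2, \mathrm{in}_1, \mathrm{in}_1, \mathrm{in}_2, \mathrm{in}_3 \rangle\ } X_1 + X_2 + X_3 \]
where the $\mathrm{in}_j$ are the injections of $X_1 + X_2 + X_3$. In general, given objects $X_1, \ldots, X_n$ ($n \ge 1$) and an ordered list $(i_1, \ldots, i_k)$ with $i_j \in \{1, \ldots, n\}$ (repetitions allowed), the line tying morphism
\[ LT: X_{i_1} + \cdots + X_{i_k} \to X_1 + \cdots + X_n \]
is defined by
\[ LT\,\mathrm{in}_j = \mathrm{in}_{i_j} \qquad (j = 1, \ldots, k) \]
where $\mathrm{in}_j$ injects $X_{i_j}$ as the jth term of $X_{i_1} + \cdots + X_{i_k}$ while $\mathrm{in}_{i_j}$ injects $X_{i_j}$ as the $i_j$th term of $X_1 + \cdots + X_n$.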
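For readers who want to experiment, here is a small sketch of the parallel construction and of line tying on disjoint unions of sets. It is an illustration under stated assumptions rather than the authors' formalism: elements of a coproduct are modelled as tagged pairs `(x, i)` with 1-based tags, `None` stands for "undefined" in Pfn, and the names `parallel` and `line_tying` are hypothetical.

```python
def parallel(*fs):
    """f_1 || ... || f_n: apply f_i on the i-th summand and keep the tag i."""
    def par(tagged):
        x, i = tagged
        y = fs[i - 1](x)
        return None if y is None else (y, i)  # None models "undefined"
    return par


def line_tying(indices):
    """LT for the ordered list (i_1, ..., i_k): LT(in_j(x)) = in_{i_j}(x)."""
    def lt(tagged):
        x, j = tagged
        return (x, indices[j - 1])
    return lt


# The example of Definition 23: X2 + X1 + X1 + X2 + X3 -> X1 + X2 + X3.
lt = line_tying([2, 1, 1, 2, 3])
print(lt(("a", 1)))  # first input line carries X2
print(lt(("b", 3)))  # third input line carries X1

par = parallel(lambda n: n + 1, lambda s: s.upper())
print(par((1, 1)), par(("ab", 2)))
```

Note that line tying never inspects the element itself; it only renames the tag, which matches the picture of rerouting wires.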
The semantics of Table 20 arises by viewing the flowscheme as in Table 24.
[Table 24. A Resolution of Table 20 (figure not reproduced; it redraws Table 20 in successive stages built from f, g, h, t, u using parallel construction and line tying).]
25 Example. For $g: X \to Y + \cdots + Y$ as in Example 4, the case statement should have flowscheme
[Flowscheme not reproduced: the box g with its output lines tied into a single line Y.]
and so is $\sigma g$, where $\sigma: Y + \cdots + Y \to Y$ is the line tying morphism defined by $\sigma\,\mathrm{in}_i = \mathrm{id}_Y$ for each i. (Recall from Proposition 16 that $\mathrm{id}_Y: Y \to Y$ is a coproduct of one copy of Y.)
EXERCISES FOR SECTION 2.3
1. For the Cartesian product construction of sets show that $s: X \times Y \to Y \times X$, $s(x, y) = (y, x)$ is bijective, and that $a: (X \times Y) \times Z \to X \times (Y \times Z)$, $a((x, y), z) = (x, (y, z))$ is bijective. Thus, $X \times Y \cong Y \times X$ and $X \times (Y \times Z) \cong (X \times Y) \times Z$ in Set. Discuss how $X \times Y$ and $Y \times X$ and also $X \times (Y \times Z)$ and $(X \times Y) \times Z$ are genuinely different from the point of view of implementing records (cf. Example 2).
2. Let C be a category with finite products.
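(i) Define the "coordinate switching morphism" $s: X \times Y \to Y \times X$ as the unique morphism with $\mathrm{pr}_1 s = \mathrm{pr}_2$ and $\mathrm{pr}_2 s = \mathrm{pr}_1$ [diagram not reproduced], and show that for C = Set, $s(x, y) = (y, x)$ as in Exercise 1. In general, show that s is an isomorphism.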
(ii) If 1 is a terminal object, show that pr 1: X x 1 ---+ X is an isomorphism.
(iii) Define $a: (X \times Y) \times Z \to X \times (Y \times Z)$ by
[Diagram not reproduced: it specifies the composites of a with the projections $\mathrm{pr}_1: X \times (Y \times Z) \to X$ and $\mathrm{pr}_2: X \times (Y \times Z) \to Y \times Z$.]
For C = Set show that $a((x, y), z) = (x, (y, z))$ as in Exercise 1. In general, show that a is an isomorphism. [Hint: Define $a^{-1}$ in a similar way and show $aa^{-1}$ and $a^{-1}a$ are identities by composing with projections and using the uniqueness of morphisms induced into a product. For beginners, this is a somewhat involved exercise.]
(iv) State the dual of (i), (ii), and (iii) for coproducts. Describe s: X + Y ----+ Y + X
and a: (X + Y) + Z ----+ X + (Y + Z) when C = Set.
3. Show that any category with a terminal object and such that each two objects has a product possesses all finite products. [Hint: Use induction. Given $X_1, \ldots, X_{n+1}$, if $X_1 \times \cdots \times X_n$ exists show that $(X_1 \times \cdots \times X_n) \times X_{n+1}$ is a model of $X_1 \times \cdots \times X_{n+1}$.] State the dual result for coproducts.
4. In any category, if $\mathrm{pr}_i: P \to X_i$ is a product of $(X_i \mid i \in I)$ and if $f: Q \to P$ is an isomorphism, show that $\mathrm{pr}_i f: Q \to X_i$ is also a product.
5. (i) Let $X_1$, $X_2$ be vector spaces and let $X_1 \times X_2$ be the Cartesian product set with the projection functions $\mathrm{pr}_i: X_1 \times X_2 \to X_i$ of 5. Prove that there exists a unique vector space structure on $X_1 \times X_2$ rendering $\mathrm{pr}_1$, $\mathrm{pr}_2$ linear and then show that this constructs the product of $X_1$, $X_2$ in the category Vect of Exercise 2.1.3. Now define $\mathrm{in}_i: X_i \to X_1 \times X_2$ by $\mathrm{in}_1(x) = (x, 0)$, $\mathrm{in}_2(y) = (0, y)$. Show that each $\mathrm{in}_i$ is linear and, in fact, that
\[ X_1 \xrightarrow{\ \mathrm{in}_1\ } X_1 \times X_2 \xleftarrow{\ \mathrm{in}_2\ } X_2 \]
is a coproduct in Vect. [Hint: $\langle f_1, f_2 \rangle (x_1, x_2) = f_1(x_1) + f_2(x_2)$.] Thus, Vect has the curious property that the same object underlies both the (finite) product and coproduct construction.
(ii) In case I is infinite, show that the product vector space $(\prod X_i, \mathrm{pr}_i)$ has underlying set $\{(x_i \mid i \in I)\}$ while the coproduct vector space $(\coprod X_i, \mathrm{in}_i)$ has underlying set $\{(x_i \mid i \in I) : \text{only finitely many } x_i \text{ are nonzero}\}$.
6. Let C be any category with the property that for each two objects X, Y there exists a bijection $C(X, Y) \to C(Y, X)$, $f \mapsto f^*$, with the properties
\[ \mathrm{id}_X^* = \mathrm{id}_X; \qquad \text{for } f: X \to Y,\ g: Y \to Z,\ (gf)^* = f^* g^*. \]
Show that $\mathrm{pr}_i: P \to X_i$ is a product in C if and only if $\mathrm{pr}_i^*: X_i \to P$ is a coproduct in C.
7. Show that in Mfn, the disjoint union $C = \{(x, i) : i \in I,\ x \in X_i\}$ with projections
\[ \mathrm{pr}_i(x, j) = \begin{cases} \{x\} & \text{if } i = j \\ \emptyset & \text{if } i \ne j \end{cases} \]
is a product in Mfn. Hence, the same object underlies the product and coproduct. [Hint: Use Exercise 6 with $x \in f^*(y) \Leftrightarrow y \in f(x)$.]
8. In this exercise let $X \times Y$, $\prod X_i$, and so on refer to the Cartesian product construction in Set and let $+$, $\coprod$ refer to disjoint union. Then Pfn has products as follows:
(i) The terminal object (in fact the zero object) is the empty set.
(ii) For two sets X, Y, verify that the Pfn product is $(X \times Y) + X + Y$ with $\mathrm{pr}_1: (X \times Y) + X + Y \to X$ defined by $DD(\mathrm{pr}_1) = (X \times Y) + X$, $\mathrm{pr}_1(x, y) = x$, $\mathrm{pr}_1(x) = x$, and $\mathrm{pr}_2: (X \times Y) + X + Y \to Y$ similarly.
(iii) For an arbitrary family $(X_i \mid i \in I)$ define $X_J = \prod_{i \in J} X_i$ for each nonempty subset J of I. Define $P = \coprod (X_J \mid \emptyset \ne J \subset I)$. Define appropriate projections with respect to which P is the product of $(X_i \mid i \in I)$ in Pfn.
9. Express the following flowschemes in a category with finite coproducts.
(i) [Flowscheme not reproduced.]
(ii) [Flowscheme not reproduced.]
10. In this exercise let C be any category with products. We consider a simplified
model of operational semantics suggested by sets and total functions. A more
complex model is really required. The point of this exercise is to provide some
practice with products. We limit the discussion to the semantics of assignment for
the Pascal fragment of Table 1.2.1.
Let V be a "value" object of C equipped with three "constants" $0, 1, \bot: 1 \to V$ (where 1 is a terminal object) for "zero," "one," "undefined," and four operations $+, -, \times, \div: V \times V \to V$.
(i) Use the product $V \times V$ and Proposition 10 to describe the "successor morphism" $s: V \to V$ which, in Set, is $s(v) = v + 1$.
Let I be the set of all identifiers as in Table 1.2.1. Define the power $V^I$ as the "object of states." In Set an element of $V^I$ is a function $I \to V$ from identifiers to values. The evaluation of an expression E takes the form of a morphism $[E]: V^I \to V$ defined inductively as follows (see Table 1.2.1).
For each numeral n, with ordinary numerical value N, define
\[ [n] = V^I \longrightarrow 1 \xrightarrow{\ 0\ } V \xrightarrow{\ s^N\ } V, \]
where $s^N$ is the N-fold composition of the successor morphism with itself ($s^0 = \mathrm{id}_V$, $s^1 = s$, $s^{k+1} = (s^k)s$).
For each identifier $\alpha$,
\[ [\alpha] = V^I \xrightarrow{\ \mathrm{pr}_\alpha\ } V. \]
Assuming $[E]$, $[F]$ have been defined then
\[ [E + F] = V^I \xrightarrow{\ \langle [E], [F] \rangle\ } V \times V \xrightarrow{\ +\ } V \]
and $[E - F]$, $[E \times F]$, and $[E \div F]$ are defined similarly.
(ii) Use the product property of $V^I$ to define the semantics of assigning expression E to identifier $\alpha$ as a "state-transition morphism"
\[ [\alpha := E]: V^I \to V^I \]
which in Set would be
\[ [\alpha := E](v_i) = (w_i), \qquad w_j = \begin{cases} v_j & \text{if } j \ne \alpha \\ [E](v_i) & \text{if } j = \alpha. \end{cases} \]
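To make the Set reading of this exercise concrete, the sketch below interprets expressions and assignment over states represented as Python dictionaries. It is only an illustration under simplifying assumptions: values are ordinary integers, the "undefined" constant is ignored, and the function names `numeral`, `ident`, `binop`, and `assign` are hypothetical.

```python
import operator


def numeral(N):
    """[n] = V^I -> 1 -> V -> V: ignore the state, start at 0, apply s N times."""
    def sem(state):
        v = 0
        for _ in range(N):
            v = v + 1  # the successor morphism s
        return v
    return sem


def ident(alpha):
    """[alpha] is the projection pr_alpha : V^I -> V."""
    return lambda state: state[alpha]


def binop(op, E, F):
    """[E op F]: pair the two evaluations, then apply the operation."""
    return lambda state: op(E(state), F(state))


def assign(alpha, E):
    """[alpha := E] : V^I -> V^I, changing only the alpha component."""
    def step(state):
        new_state = dict(state)       # w_j = v_j for j != alpha
        new_state[alpha] = E(state)   # w_alpha = [E](v)
        return new_state
    return step


# x := y + 2, run on a sample state.
step = assign("x", binop(operator.add, ident("y"), numeral(2)))
before = {"x": 0, "y": 5}
print(step(before), before)
```

The right-hand side is evaluated in the old state and the result is stored in a fresh state, which mirrors the componentwise definition of the state-transition morphism.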
11. In Set, if $X_1, \ldots, X_n$ are finite and if $X_i$ has $k_i$ elements, show that $X_1 \times \cdots \times X_n$ has $k_1 \cdots k_n$ elements and so is again finite. Also, show $X_1 + \cdots + X_n$ has $k_1 + \cdots + k_n$ elements.
12. Let $X_0, X_1, X_2, \ldots$ be an infinite sequence of sets and let $x_i \ne y_i$ be distinct elements of $X_i$. For each subset A of N define $z_A \in \prod X_i$ by
\[ (z_A)_i = \begin{cases} x_i & \text{if } i \in A \\ y_i & \text{if } i \notin A \end{cases} \]
and show that $A \mapsto z_A$ is injective. This proves that $\prod X_i$ is infinite (indeed, for those readers familiar with basic facts about infinite sets, it shows $\prod X_i$ is uncountable since N has uncountably many subsets).
13. Let $(M, \circ, e)$ be any monoid. If $\mathrm{in}_i: X_i \to X$ is a coproduct in Pfn, show that $\mathrm{in}_i: X_i \to X$ (in the notation of Exercise 2.1.11(iii)) is a coproduct in $\mathrm{FwR}_{(M, \circ, e)}$. Thus, $\mathrm{FwR}_{(M, \circ, e)}$ has coproducts.
Notes and References for Chapter 2
Much has been written about category theory. For an elementary introduction on the
level of this chapter see
M. A. Arbib and E. G. Manes, Arrows, Structures, and Functors: The Categorical
Imperative, Academic Press, 1975.
Many useful texts go into more detail. The reader may wish to consult
J. Adámek, Theory of Mathematical Structures, D. Reidel, Dordrecht, 1983.
P. Freyd, Abelian Categories, Harper & Row, 1964.
H. Herrlich and G. E. Strecker, Category Theory, 2nd ed., Heldermann Verlag, Berlin,
1979.
S. Mac Lane, Categories for the Working Mathematician, Springer-Verlag, 1972.
B. Mitchell, Theory of Categories, Academic Press, 1965.
The Principle of Simple Recursion was formulated by F. W. Lawvere in the 1960s
and used in a category-theoretic approach to the foundations of mathematics. For
discussion and further references see
R. Goldblatt, Topoi, The Categorial Analysis of Logic, Revised Edition, North-Holland, 1984.
The observation that a wide class of loop-free flowschemes can be expressed in any
category with finite coproducts is due to C. C. Elgot in lectures given at Stevens
Institute of Technology, July 1975.
The isomorphisms $X \times (Y \times Z) \cong (X \times Y) \times Z$, $X \times Y \cong Y \times X$ of Exercise 2
imply that the product of n objects, in any order and with any parenthesization, is of
the form $X_1 \times \cdots \times X_n$ up to isomorphism. This result is more involved than meets
the eye. For an introduction to so-called problems of "coherence" see the book by
Mac Lane cited above.
CHAPTER 3
Partially Additive Semantics
3.1 Partial Addition
3.2 Partially Additive Categories and Iteration
3.3 The Boolean Algebra of Guards
In Section 1.5 we introduced guard functions and finite and infinite sums of
partial functions and multifunctions and used these as tools to describe
guarded commands and various conditional and repetitive constructs. All
this was in Pfn or Mfn. The task of this chapter is to develop axioms on a
general semantic category to make similar constructions possible in a wider
context and to clarify the algebraic processes of manipulation and simplification as partly demonstrated by Example 1.5.31.
Section 3.1 formally axiomatizes the addition operation previously introduced for Pfn(X, Y) and Mfn(X, Y). In Section 3.2, Pfn and Mfn are generalized to "partially additive" categories C in which C(X, Y) has a partial addition as in Section 3.1 subject to appropriate axioms that relate the category
structure to the addition operations. The study initiated in Section 1.5 may
be pursued further in any partially additive category.
All axioms on a partially additive category deal with summability and
behavior of sums. It is rather surprising, then, that if X is an object in a
partially additive category C, no further axioms are needed to describe a
subset Guard(X) of C(X, X) to play the role of the "guard functions." As
proved in Section 3.3, Guard(X) is a Boolean algebra under the partial order
$p \le q$ if $pq = p$.
3.1 Partial Addition
In Section 1.5 we introduced a sum operation in Mfn(X, Y) and in Pfn(X, Y).
In this section we present "partially additive monoids" $(M, \Sigma)$ wherein M is
any set (abstracting M = Mfn(X, Y) or M = Pfn(X, Y)) and $\Sigma$ is a suitable
sum operation on M.
To begin, let us pause to consider the ordinary "sum of $x_1, \ldots, x_n$." The underpinning for this is a binary operation $x + y$ which is associative ($x + (y + z) = (x + y) + z$) and commutative ($x + y = y + x$). The associativity allows us to write $x_1 + \cdots + x_n$ without parentheses (for the same reasons as discussed following 1.4.7), whereas the commutativity makes the order in which $x_1, \ldots, x_n$ are listed immaterial. For example, $y + x + z + w = w + y + x + z = w + x + y + z$ obtains from two uses of commutativity.
Sums such as those of 1.5.9 and 1.5.11 cannot be derived from a binary operation because we need infinite sums such as the formula for while-do in 1.5.27. The approach we shall adopt is to define sums of families of various size directly. Let M be a fixed set. If I is a set, an I-indexed family in M is a function $x: I \to M$. Family notation for this function is $x = (x_i \mid i \in I)$. Here we have written $x_i$ instead of $x(i)$. Whereas the function notation $x: I \to M$ makes both the domain and codomain explicit, the family notation suppresses the codomain. Family notation is thus convenient when the codomain remains fixed.
The empty family is the unique $\emptyset$-indexed family (unique because $\emptyset$ is initial in Set). Two families $x: I \to M$, $y: K \to M$ are equivalent if there exists a bijection $\psi: I \to K$ with $y\psi = x$; in family notation, $y_{\psi(i)} = x_i$ for all $i \in I$. A subfamily of $(x_i \mid i \in I)$ is $(x_i \mid i \in J)$ where $J \subset I$. An I-indexed family is countable if I is finite or denumerably infinite.
The partial addition to be axiomatized will assign an element $\Sigma(x_i \mid i \in I)$ of M to certain I-indexed families in M. Since the semantic notions we wish to capture involve no uncountable sums, we will deal exclusively with countable families. One axiom will be an appropriate expression of the irrelevancy of breaking up a sum into subsums, for example, $x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = x_3 + (x_1 + x_6 + x_2) + (x_4 + x_5)$. If $I = \{1, \ldots, 6\}$, $I_a = \{3\}$, $I_b = \{1, 6, 2\}$, $I_c = \{4, 5\}$, and $J = \{a, b, c\}$ we may write this result as
\[ \Sigma(x_i \mid i \in I) = \Sigma\bigl(\Sigma(x_i \mid i \in I_j) \mid j \in J\bigr). \tag{1} \]
Here $(I_j \mid j \in J)$ is a partition of I, that is, if $j \ne k$ then $I_j \cap I_k = \emptyset$ and also $I = \bigcup (I_j \mid j \in J)$. We pointedly remark that in our definition of partition, $I_j = \emptyset$ is allowed for any number of j (even denumerably many j) as long as the partition properties just given hold.
We are now ready for the formal definition.
2 Definition. A partially additive monoid is a pair $(M, \Sigma)$, where M is a nonempty set and $\Sigma$ is a partial function which maps countable families in M to elements of M (and we say that $(x_i \mid i \in I)$ is summable if $\Sigma(x_i \mid i \in I)$ is defined) subject to the following three axioms:
Partition-Associativity Axiom. If $(x_i \mid i \in I)$ is a countable family and if $(I_j \mid j \in J)$ is a partition of I with J countable, then $(x_i \mid i \in I)$ is summable if and only if $(x_i \mid i \in I_j)$ is summable for every $j \in J$ and $(\Sigma(x_i \mid i \in I_j) \mid j \in J)$ is summable. In that case,
\[ \Sigma(x_i \mid i \in I) = \Sigma\bigl(\Sigma(x_i \mid i \in I_j) \mid j \in J\bigr). \]
Unary Sum Axiom. Any family $(x_i \mid i \in I)$ in which I has one element is summable and $\Sigma(x_i \mid i \in I) = x_j$ if $I = \{j\}$.
Limit Axiom. If $(x_i \mid i \in I)$ is a countable family and if the subfamily $(x_i \mid i \in F)$ is summable for every finite subset F of I then $(x_i \mid i \in I)$ is summable.
3 Example. For any two sets X, Y the set Pfn(X, Y) with the $\Sigma$ of 1.5.11 is a partially additive monoid. The details are easily verified. We point out that the limit axiom is true here in a stronger form: if $(f_i \mid i \in I)$ is a countable family then it is summable if $f_i + f_j$ exists for each $i \ne j$ in I.
4 Example. For any two sets X, Y, Pfn(X, Y) can be considered as a partially additive monoid in a different way from that in Example 3. Say that a countable family $(f_i \mid i \in I)$ is overlap summable if whenever $x \in DD(f_i) \cap DD(f_j)$ then $f_i(x) = f_j(x)$, and then define $\Sigma'(f_i \mid i \in I)(x)$ to equal any $f_i(x)$ which is defined for that x. Then $\Sigma' = \Sigma$ on disjoint families but $\Sigma'$ is more often defined. The verification of the axioms is routine.
5 Example. For each two sets X, Y the set Mfn(X, Y) has a partially additive monoid structure in which every countable family is summable: define $\Sigma(f_i \mid i \in I)(a) = \bigcup (f_i(a) \mid i \in I)$ as in 1.5.9.
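The three sums of Examples 3 to 5 can be compared on small finite families. The sketch below is an illustration only and rests on assumptions: partial functions are modelled as Python dictionaries, multifunctions as functions returning sets, `None` signals "not summable", only finite families are handled, and the sum of 1.5.11 (not reproduced here) is taken to be defined exactly when the domains are pairwise disjoint, as Example 3 suggests. The function names are hypothetical.

```python
def disjoint_sum(fs):
    """Sum in the style of Example 3: defined only for pairwise disjoint domains."""
    total = {}
    for f in fs:
        if total.keys() & f.keys():
            return None  # two summands are defined at a common point
        total.update(f)
    return total


def overlap_sum(fs):
    """Overlap sum of Example 4: summands may share points if they agree there."""
    total = {}
    for f in fs:
        for x, y in f.items():
            if x in total and total[x] != y:
                return None  # disagreement on a common point
            total[x] = y
    return total


def mfn_sum(fs, a):
    """Sum of Example 5 evaluated at a: the union of the sets f_i(a)."""
    out = set()
    for f in fs:
        out |= f(a)
    return out


print(disjoint_sum([{1: "a"}, {2: "b"}]))
print(disjoint_sum([{1: "a"}, {1: "a"}]))          # not summable
print(overlap_sum([{1: "a"}, {1: "a", 2: "b"}]))   # summable for the overlap sum
print(mfn_sum([lambda a: {a}, lambda a: {a + 1}], 1))
```

In this model the sum of the empty family is the empty dictionary, that is, the totally undefined function, which is distinct from the "not summable" marker `None`.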
Let $(M, \Sigma)$ be a partially additive monoid. An alternative notation for $\Sigma(x_i \mid i \in I)$ is $x_{i_1} + x_{i_2} + x_{i_3} + \cdots$ if $I = \{i_1, i_2, i_3, \ldots\}$. The notation gives preference to a particular enumeration of I but the partition-associativity axiom and the unary sum axiom may be used to prove that equivalent families have the same sum: if $\psi: I \to I$ is any bijection then $x_{\psi(i_1)} + x_{\psi(i_2)} + x_{\psi(i_3)} + \cdots$ exists and equals $x_{i_1} + x_{i_2} + x_{i_3} + \cdots$ [just consider the partition $(\{\psi(j)\} \mid j \in I)$]. The full strength of this equivalence property is used to justify notations such as $x + y + z$ for $x, y, z \in M$. Such exists if and only if $\Sigma(x_i \mid i \in I)$ exists with I a three-element set and $\{x_1, x_2, x_3\} = \{x, y, z\}$, but subject to this it does not matter how I and the $x_i$ are chosen. Thus, for example, the unary sum axiom may be written $\Sigma x = x$ for all $x \in M$.
Pfn(X, Y), as in Example 3, has a special element, the totally undefined function 0, which acts like a zero in that if $x_{i_1} + x_{i_2} + x_{i_3} + \cdots$ exists then so does the sum with 0's arbitrarily interspersed, for example, $x_{i_1} + 0 + x_{i_2} + 0 + 0 + x_{i_3} + x_{i_4} + \cdots$. The remarks made above make it clear that order is never important in a sum so that, in effect, the zero property of 0 is expressed by saying that whenever $(x_i \mid i \in I)$ is summable then $\Sigma(x_i \mid i \in I) + 0 + 0 + 0 + \cdots$ (for any countable number of 0's) exists and equals $\Sigma(x_i \mid i \in I)$. The following result establishes such a 0 for any partially additive monoid.
6 Theorem. Let $(M, \Sigma)$ be any partially additive monoid and let $!: \emptyset \to M$ be the empty family in M. Then $0 = \Sigma\,!$ exists. Furthermore, if $(x_i \mid i \in I)$ is any summable family, if J is any countable set disjoint from I, and if $x_i = 0$ for $i \in J$ then $\Sigma(x_i \mid i \in I \cup J)$ exists and equals $\Sigma(x_i \mid i \in I)$.
PROOF. First observe a principle of interest in its own right:
7 Any subfamily of a summable family is summable.
The proof of 7 is immediate from the partition-associativity axiom since if $(x_i \mid i \in I)$ is summable and $K \subset I$ define $J = \{1, 2\}$, $I_1 = K$, $I_2 = I - K$ so that $(I_j \mid j \in J)$ partitions I and so $(x_i \mid i \in K) = (x_i \mid i \in I_1)$ is summable.
Since the empty family is a subfamily of any family, the proof that $\Sigma\,!$ exists is clear once some family is seen to be summable. This is clear from the unary sum axiom since M is nonempty.
Now let $(x_i \mid i \in I)$ be summable, let J be a countable set disjoint from I and let $x_i = 0$ for $i \in J$. For $j \in I \cup J$ define
\[ I_j = \begin{cases} \{j\} & \text{if } j \in I \\ \emptyset & \text{if } j \in J. \end{cases} \]
Then $(I_j \mid j \in I \cup J)$ partitions I. By the partition-associativity axiom $\Sigma(\Sigma(x_i \mid i \in I_j) \mid j \in I \cup J)$ exists and equals $\Sigma(x_i \mid i \in I)$. But $\Sigma(x_i \mid i \in I_j)$ is just $x_j$ if $j \in I$ (by the unary sum axiom) and is 0 if $j \in J$, so this equality is just the statement of the theorem. □
Ordinary numerical addition for integers includes situations such as
(-1) + 1 = O. There are no such additive inverses for partially additive
monoids: if X + y = 0 then necessarily x = y = 0, as follows.
8 Proposition. Let $(M, \Sigma)$ be a partially additive monoid, and let $\Sigma(x_i \mid i \in I) = 0$. Then each $x_i = 0$.
PROOF. By partition associativity, for each $i \in I$, $x_i + \Sigma(x_j \mid j \ne i) = \Sigma(x_i \mid i \in I) = 0$, so it suffices to prove that $x = 0$ if $x + y = 0$. This follows from partition associativity by
\[ \begin{aligned} x &= x + 0 + 0 + 0 + \cdots \\ &= x + (y + x) + (y + x) + (y + x) + \cdots \\ &= (x + y) + (x + y) + (x + y) + \cdots \\ &= 0. \end{aligned} \]
□
EXERCISES FOR SECTION 3.1
1. Let $(M, \Sigma)$ be any partially additive monoid. Let $\infty$ be any new element not already in M and set $\hat{M} = M \cup \{\infty\}$. For countable families $(x_i \mid i \in I)$ in $\hat{M}$ define
\[ \hat{\Sigma}\, x_i = \begin{cases} \Sigma\, x_i & \text{if each } x_i \in M \text{ and } \Sigma\, x_i \text{ exists} \\ \infty & \text{else.} \end{cases} \]
Show that $(\hat{M}, \hat{\Sigma})$ is a partially additive monoid. This demonstrates that any partially additive monoid may be extended to one in which every countable family has a sum by adding a single element.
2. Consider the construction of Exercise 1 for Pfn(X, Y) with the overlap sum of Example 4. Interpret $\infty$ as an "overdefined element."
3. Let $(M, +, 0)$ be a monoid, as in Definition 2.1.11, such that $x = y = 0$ whenever $x + y = 0$. Say that a countable family $(x_i \mid i \in I)$ is summable if $\{i \mid x_i \ne 0\}$ is finite, in which case $\Sigma x_i$ is defined to be the ordinary sum of the nonzero $x_i$, the empty sum being defined as 0. Show that $(M, \Sigma)$ is a partially additive monoid. Where do you need the assumption that $x + y = 0$ implies $x = y = 0$?
4. Let M be any set and let $0 \in M$ be any element. For any countable family $(x_i \mid i \in I)$ define $\Sigma x_i = 0$. Show that $(M, \Sigma)$ is a partially additive monoid if and only if $M = \{0\}$.
3.2 Partially Additive Categories and Iteration
In this section we introduce an axiomatic class of semantic categories which
plays an important role in this book. The object is to equip a category C with
partially additive monoid structures on each of the sets C(X, Y) and impose
axioms so that the constructions of Section 1.5 for Pfn and Mfn can be done
in C.
Proposition 1.5.15 and Corollary 1.5.16, the distributive law of composition over sum in Mfn and Pfn, immediately motivate the following definition:
1 Definition. Let C be a category. A partially additive structure $\Sigma$ on C is an assignment of a partially additive monoid structure $\Sigma_{X,Y}$ making $(C(X, Y), \Sigma_{X,Y})$ a partially additive monoid for each pair X, Y of C-objects, subject to the distributivity of composition over sum, that is, for all $f: W \to X$, $h: Y \to Z$ and for all summable families $(g_i \mid i \in I)$ in C(X, Y), $(g_i f \mid i \in I)$ and $(h g_i \mid i \in I)$ are also summable and
\[ \Bigl(\Sigma_{X,Y}\, g_i\Bigr) f = \Sigma_{W,Y}\, (g_i f), \qquad h \Bigl(\Sigma_{X,Y}\, g_i\Bigr) = \Sigma_{X,Z}\, (h g_i). \]
We do not exclude the case $I = \emptyset$. See the proof of Proposition 4 below. Henceforth, we will usually write $\Sigma g_i$ for the more tedious $\Sigma_{X,Y}\, g_i$ (as we did in Section 1.5) unless special emphasis is required.
2 Examples. As already observed, 1.5.9 gives Mfn a partially additive structure whereas 1.5.11 provides Pfn with a partially additive structure. It is easily verified that the overlap sum of Example 3.1.4 is an alternate partially additive structure on Pfn. Hence, a category may have more than one partially additive structure.
3 Example. The sum of 1.5.9 is not a partially additive structure on ANMfn. Let $w \in W$, $f_1, f_2: W \to X$, and $g: X \to Y$ be such that $f_i(w) = \{x_i\}$ with $g(x_1) = \emptyset$ and $g(x_2) \ne \emptyset$. Then $(g f_1 + g f_2)(w) = \emptyset \cup g(x_2) \ne \emptyset$, whereas $(f_1 + f_2)(w) = \{x_1, x_2\}$ so that $(g(f_1 + f_2))(w) = \emptyset$.
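The failure of distributivity described in Example 3 can be reproduced mechanically. The sketch below is hedged: the composition rule used for ANMfn (the composite is empty as soon as g is empty at some point of f(w), and is otherwise the union of the g(x)) is an assumption inferred from the computation in the example, since the defining passage 2.1.5 is not reproduced here; the names `compose_anm` and `msum` are hypothetical.

```python
def compose_anm(g, f):
    """Assumed ANMfn composite: all-or-nothing on the points of f(w)."""
    def gf(w):
        images = [g(x) for x in f(w)]
        if any(len(s) == 0 for s in images):
            return set()
        return set().union(*images)
    return gf


def msum(*fs):
    """Pointwise union of multifunctions, as in the sum of 1.5.9."""
    return lambda w: set().union(*(f(w) for f in fs))


f1 = lambda w: {"x1"}
f2 = lambda w: {"x2"}
g = lambda x: set() if x == "x1" else {"y"}

lhs = compose_anm(g, msum(f1, f2))("w")                    # (g(f1 + f2))(w)
rhs = msum(compose_anm(g, f1), compose_anm(g, f2))("w")    # (g f1 + g f2)(w)
print(lhs, rhs)
```

The two sides differ, so under the assumed composition rule composition does not distribute over this sum.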
4 Proposition. A category admitting a partially additive structure has zero
morphisms.
PROOF. Let $\Sigma$ be a partially additive structure on C and let $0_{XY} \in C(X, Y)$ be the $\Sigma$-sum of the empty family in C(X, Y) as in 3.1.6. Given $f: W \to X$, if $(g_i \mid i \in \emptyset)$ is the empty family in C(X, Y), then $(g_i f \mid i \in \emptyset)$ is the empty family in C(W, Y) so that
\[ 0_{XY} f = \bigl(\Sigma(g_i \mid i \in \emptyset)\bigr) f = \Sigma(g_i f \mid i \in \emptyset) = 0_{WY}. \]
Similarly, for $g: Y \to Z$, $g\,0_{WY} = 0_{WZ}$. We thus have
\[ g\,0_{XY}\,f = 0_{WZ} \]
as required by Definition 2.2.16. □
5 Notation. Paralleling 1.5.10, if $(f_i \mid i \in I)$ is a summable family in C(X, Y) in a category C with partially additive structure $\Sigma$, the following flowscheme notation is introduced for $\Sigma f_i$.
[Flowscheme not reproduced: input line X, output line Y.]
Given a category with partially additive structure, what axioms should be
imposed to justify the intuitions suggested by our flowscheme notations?
First of all we should require the existence of sufficiently many coproducts so
as to have available the theory of 2.3.20-25. For reasons of technical hindsight we will in fact require that every countable family of objects has a
coproduct. We shall then require two axioms, both of which guarantee that
certain families are summable. (Hence, in any situation such as Mfn where all
families are summable, these axioms will necessarily hold.)
We begin with some general definitions:
6 Definition. Let C be a category with countable coproducts and zero morphisms, and let $(X_i \mid i \in I)$ be a countable family of C-objects.
(i) For $J \subset I$, the quasi projection $PR_J: \coprod (X_i \mid i \in I) \to \coprod (X_i \mid i \in J)$ is defined by
\[ PR_J\,\mathrm{in}_i = \begin{cases} \mathrm{in}_i & i \in J \\ 0 & i \notin J. \end{cases} \]
If $J = \{j\}$ we write $PR_j$.
(ii) The diagonal-injection $\Delta: \coprod (X_i \mid i \in I) \to I \cdot \coprod (X_i \mid i \in I)$ is defined by
\[ \Delta\,\mathrm{in}_i = \mathrm{in}_i\,\mathrm{in}_i \]
(recall that $I \cdot X$ is the coproduct of I copies of X; see Definition 2.3.14).
(iii) When all $X_i = X$, so that $\coprod X_i = I \cdot X$, we have the special line tying morphism $\sigma: I \cdot X \to X$ defined by
\[ \sigma\,\mathrm{in}_i = \mathrm{id}_X. \]
For flowschemes, we use $I = \{1, 2\}$ as an example, the general case being amply suggested by these.
[Flowschemes for $\Delta$ and $\sigma$ not reproduced.]
In the flowscheme for $\Delta$ we are implicitly using the natural isomorphism $(X_1 + X_2) + (X_1 + X_2) \cong X_1 + X_2 + X_1 + X_2$. See Exercise 1, and the comment on coherence in the notes to Chapter 2.
The reader should now pause to work Exercise 2.
The first axiom on a partially additive category, called the "compatible sum axiom" in Definition 11 below, is motivated by the following flowscheme identity induced by $f: X \to Y + Y$:
7 [Flowscheme identity not reproduced.]
Using the maps $\sigma: Y + Y \to Y$, $PR_1, PR_2: Y + Y \to Y$ of 6, 7 asserts that
\[ \sigma f = (PR_1 f) + (PR_2 f). \tag{8} \]
It is clear how to generalize 7 and 8 for $f: X \to I \cdot Y$ for any countable I. For our axiom we will ask for a weaker statement than 8, which we state as an observation about Pfn.
9 Observation. Given $f: X \to I \cdot Y$ in Pfn with I countable, $(PR_i f \mid i \in I)$ is summable. Indeed this holds with either 1.5.11 or the overlap sum of 3.1.4 since if $(PR_i f)(x)$ is defined, $f(x)$ has the form $(y, i)$, so that $DD(PR_i f) \cap DD(PR_j f) = \emptyset$ if $i \ne j$. Of course, 9 holds in Mfn with the sum of 1.5.9 because all families are summable. □
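Observation 9 and identity 8 can be checked on a small instance in Pfn. The following sketch makes several assumptions for illustration: partial functions are Python dictionaries, an element of the copower of Y is a tagged pair `(y, i)`, the sum is the disjoint-domain sum, and the helper names `pr_after`, `sigma_after`, and `disjoint_sum` are hypothetical.

```python
def pr_after(i, f):
    """PR_i f: keep x exactly when f(x) lies in the i-th copy of Y."""
    return {x: y for x, (y, tag) in f.items() if tag == i}


def sigma_after(f):
    """sigma f: forget which copy of Y the value lies in."""
    return {x: y for x, (y, _tag) in f.items()}


def disjoint_sum(fs):
    """Disjoint-domain sum of partial functions; None if not summable."""
    total = {}
    for g in fs:
        if total.keys() & g.keys():
            return None
        total.update(g)
    return total


# Hypothetical f : X -> I.Y with X = {0, 1, 2}, I = {1, 2}.
f = {0: ("a", 1), 1: ("b", 2), 2: ("a", 2)}
parts = [pr_after(i, f) for i in (1, 2)]

print(parts)                                   # the family (PR_i f)
print(disjoint_sum(parts) == sigma_after(f))   # identity 8 on this instance
```

Because each f(x) carries exactly one tag, the domains of the PR_i f cannot overlap, which is the content of the observation.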
The second axiom on a partially additive category, called the "untying axiom" in Definition 11 below, is suggested by the flowscheme
[Flowscheme not reproduced: input X, output Y.]
for $f + g$. It seems reasonable to allow the output lines to be "untied" to yield
[Flowscheme not reproduced: input X, two separate output lines Y.]
Now, at least in Pfn, the effect of untying is to say that $x \in DD(f)$ gets mapped to one copy of Y and $x \in DD(g)$ gets mapped to a disjoint copy of Y so that either gets mapped into $Y + Y$. The injection maps
\[ Y \xrightarrow{\ \mathrm{in}_1\ } Y + Y \xleftarrow{\ \mathrm{in}_2\ } Y \]
distinguish one copy from the other. From this perspective, the untied flowchart is
[Flowscheme not reproduced: input X, output $Y + Y$.]
which now has the same form as the original and represents $\mathrm{in}_1 f + \mathrm{in}_2 g$. We have the following:
10 Observation. In Pfn, if $f, g: X \to Y$ have disjoint domains of definition then $\mathrm{in}_1 f, \mathrm{in}_2 g: X \to Y + Y$ do also. The proof is obvious.
We have motivated the central definition of this section:
11 Definition. A partially additive category is a category C which has countable coproducts and a partially additive structure $\Sigma$ which satisfies the following two axioms:
(i) Compatible Sum Axiom. If $(f_i \mid i \in I)$ is a countable family in C(X, Y) and if there exists $f: X \to I \cdot Y$ with $PR_i f = f_i$ for each $i \in I$ (we say the $f_i$ are compatible) then $(f_i \mid i \in I)$ is summable.
(ii) Untying Axiom. If $f, g: X \to Y$ are summable then $\mathrm{in}_1 f, \mathrm{in}_2 g: X \to Y + Y$ are also summable.
12 Example. Pfn is partially additive with $\Sigma$ as in 1.5.11. This was established in 1.5.16, 2.3.18, 9, and 10.
13 Example. Mfn is partially additive with $\Sigma$ as in 1.5.9. See 1.5.15 and 2.3.18.
The terminology of 11 is a bit off-kilter at the current stage because it appears possible that whether or not a category "is" partially additive depends on which partially additive structure $\Sigma$ is under consideration. We now prove, however, that if C with $\Sigma$ is a partially additive category then the converse of the compatible sum axiom holds so that summability is completely determined by the categorical structure of coproducts and (necessarily unique) zero maps, and further that the sum is determined by the category itself. Thus, a category can be partially additive for at most one $\Sigma$. This is Theorem 18 below. En route we need some preliminary results.
To begin, consider the map $\Delta: X_1 + X_2 \to (X_1 + X_2) + (X_1 + X_2)$ of 6. Identifying $(X_1 + X_2) + (X_1 + X_2)$ with $X_1 + X_2 + X_1 + X_2$ as discussed following 6, we have the flowschemes
[Flowschemes for $\Delta$ and $\sigma$ not reproduced.]
whose composition $\sigma\Delta$ should surely be $\mathrm{id}_{X_1 + X_2}$. A rigorous proof of the general statement is easily established with no appeal to the assertion about parenthesization of coproducts:
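14 Proposition. In any category with countable coproducts and zero morphisms, let $(X_i \mid i \in I)$ be a countable family of objects and let
\[ \coprod_{i \in I} X_i \xrightarrow{\ \Delta\ } I \cdot \coprod_{i \in I} X_i, \qquad I \cdot \coprod_{i \in I} X_i \xrightarrow{\ \sigma\ } \coprod_{i \in I} X_i \]
be the morphisms of 6. Then $\sigma\Delta = \mathrm{id}$.
PROOF. We have $\sigma\Delta\,\mathrm{in}_j = \sigma\,\mathrm{in}_j\,\mathrm{in}_j = \mathrm{id}\,\mathrm{in}_j$. Thus, $\sigma\Delta = \mathrm{id}$ when preceded by $\mathrm{in}_j$ for each j in I, and hence the two are equal by the uniqueness property in the definition of coproducts. □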
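The next result will find frequent application. We use the morphisms of 6 without comment from now on.
15 Proposition. Given $f: X \to \coprod (Y_i \mid i \in I)$ in a partially additive category, there exists a unique family $f_i: X \to Y_i$ with $f = \Sigma\,\mathrm{in}_i f_i$, namely, $f_i = PR_i f$.
PROOF. The family $(\mathrm{in}_i f_i)$ is compatible, because $\Delta f: X \to I \cdot \coprod Y_i$ satisfies the compatibility condition of 11(i):
\[ PR_j \circ (\Delta f) = \mathrm{in}_j f_j. \]
To see this, first note that the square
\[ PR_j \circ \Delta = \mathrm{in}_j \circ PR_j \]
commutes for each j, since the two paths are obviously equal (using 6) when preceded by $\mathrm{in}_i$ for each i in I. Thus,
\[ PR_j \circ \Delta \circ f = \mathrm{in}_j \circ PR_j \circ f = \mathrm{in}_j \circ f_j \]
as was claimed. Hence, $\Sigma\,\mathrm{in}_j f_j$ is defined, by the compatible sum axiom.
Noting that $PR_j \circ \mathrm{id} = PR_j$, we see that the $PR_j: I \cdot \coprod Y_i \to \coprod Y_i$ are also summable by the compatible sum axiom. But if $\Sigma PR_j$ exists, then the untying axiom tells us that $\Sigma\,\mathrm{in}_j PR_j: I \cdot \coprod Y_i \to I \cdot \coprod Y_i$ also exists. We have
\[ \Bigl(\Sigma_j\, \mathrm{in}_j PR_j\Bigr)\,\mathrm{in}_i = \Sigma_j\, \mathrm{in}_j PR_j\,\mathrm{in}_i = \mathrm{in}_i \]
since $PR_j\,\mathrm{in}_i = 0$ for $i \ne j$, and zero summands "drop out." This establishes
\[ \Sigma_j\, \mathrm{in}_j PR_j = \mathrm{id} \]
since these morphisms are equal when preceded by $\mathrm{in}_i$ for each i in I.
As $\sigma\Delta = \mathrm{id}$, by 14, we have that
\[ \begin{aligned} f = \sigma\Delta f = \sigma(\mathrm{id})\Delta f &= \sigma\Bigl(\Sigma\, \mathrm{in}_j PR_j\Bigr)\Delta f \\ &= \Sigma\, (\sigma\,\mathrm{in}_j)(PR_j \Delta) f = \Sigma\, (\mathrm{id})(\mathrm{in}_j PR_j) f \\ &= \Sigma\, \mathrm{in}_j (PR_j f). \end{aligned} \]
Thus, $f = \Sigma\,\mathrm{in}_j f_j$, as was to be shown. For uniqueness, note that if also $f = \Sigma\,\mathrm{in}_j g_j$, then
\[ PR_i f = \Sigma_j\, PR_i\,\mathrm{in}_j\, g_j = g_i. \]
□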
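In Pfn the decomposition of Proposition 15 amounts to splitting a partial function into a coproduct by the tag of its values and then reassembling it. The sketch below illustrates this on one hypothetical instance; dictionaries model partial functions, tagged pairs `(y, i)` model elements of the coproduct of the Y_i, and the names `decompose` and `recombine` are assumptions made for illustration.

```python
def decompose(f, index_set):
    """Return {i: f_i} with f_i = PR_i f, the part of f landing in Y_i."""
    return {i: {x: y for x, (y, tag) in f.items() if tag == i}
            for i in index_set}


def recombine(parts):
    """Disjoint-domain sum of the in_i f_i; None if two summands share a point."""
    total = {}
    for i, f_i in parts.items():
        for x, y in f_i.items():
            if x in total:
                return None
            total[x] = (y, i)  # in_i applied to f_i(x)
    return total


# Hypothetical f : X -> Y_1 + Y_2 + Y_3 with X = {"a", "b", "c"}.
f = {"a": (10, 1), "b": (20, 3), "c": (30, 1)}
parts = decompose(f, {1, 2, 3})

print(parts)
print(recombine(parts) == f)
```

The component for index 2 is the totally undefined function, which plays the role of a zero summand that "drops out" of the sum.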
16 Corollary. In a partially additive category, for each countable family $(X_i \mid i \in I)$ of objects, $\Sigma\,\mathrm{in}_i PR_i$ exists and
\[ \Sigma\,\mathrm{in}_i PR_i = \mathrm{id}: \coprod X_i \to \coprod X_i. \]
In Example 3 we observed that the obvious $\Sigma$ fails to be a partially additive structure on ANMfn. In fact no $\Sigma$ makes ANMfn a partially additive category:
17 Example. ANMfn is not partially additive. Coproducts exist as disjoint unions as discussed in 2.3.18. Let X be any nonempty set. Then the injections $\mathrm{in}_i: X \to X + X$ and quasiprojections $PR_i: X + X \to X$ are determined by the category and do not depend on any sum operation. If $t: \{y\} \to X + X$ satisfies $t(y) = X + X$ then $PR_i t = 0: \{y\} \to X$ for $i = 1, 2$ since each $PR_i$ maps some element of $X + X$ to $\emptyset$. Thus, if ANMfn were partially additive we would have $\mathrm{in}_1 PR_1 t + \mathrm{in}_2 PR_2 t = \mathrm{in}_1 0 + \mathrm{in}_2 0 = 0 + 0 = 0$ whereas by 16 $(\mathrm{in}_1 PR_1 + \mathrm{in}_2 PR_2)t = t \ne 0$, and this violates the distributivity of composition over sum.
We next see that a category can be partially additive in at most one way.
In particular, Pfn is partially additive for the disjoint sum but not for the
overlap sum.
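18 Theorem. The addition operation of a partially additive category is unique
as follows: if C is a partially additive category, then a family $(f_i \mid i \in I)$ in C(X, Y)
is summable if and only if it is compatible. In that case, the $f: X \to I \cdot Y$ with
$\mathrm{PR}_i f = f_i$ is unique, and
19    \[ \sum f_i = \sigma f. \]
PROOF. If $\sum f_i$ exists, then $f = \sum \mathrm{in}_i f_i$ exists by the untying axiom. Then
\[ \mathrm{PR}_j f = \sum (\mathrm{PR}_j \mathrm{in}_i f_i \mid i \in I) = f_j. \]
This shows that summable families are compatible.
It is immediate from 15 that if $\mathrm{PR}_i f = \mathrm{PR}_i g$ for each i, then f = g. Moreover, if $\mathrm{PR}_i f = f_i$ then
\[ \sigma f = \sigma \sum \mathrm{in}_i f_i = \sum \sigma\,\mathrm{in}_i f_i = \sum f_i. \] □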
Note that the formula 19 for the sum established 8 in general. The next
result generalizes 1.5.18(b).
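20 Corollary. Given $f_i: X \to Y$, $g_i: Y \to Z$ in a partially additive category, if $\sum f_i$
exists, then so too does $\sum g_i f_i$.
PROOF. If $(f_i \mid i \in I)$ is summable then it is compatible, by 15; that is, there exists
$f: X \to I \cdot Y$ with $f = \sum \mathrm{in}_i f_i$. Now define $g: I \cdot Y \to Z$ by $g \circ \mathrm{in}_i = g_i$.
Then
\[ g f = g \sum \mathrm{in}_i f_i = \sum g\,\mathrm{in}_i f_i = \sum g_i f_i. \] □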
Definitions such as if A then f else g in 1.5.24 and while A do f in 1.5.27 for
Pfn and Mfn cannot be stated in a partially additive category without first
generalizing the guard functions incA of 1.5.20, as will be done in the next
section. Surprisingly, however, we need not wait to define conditional and
iterative constructions which, as will be seen in the next section, do generalize
the if-then-else and while-do constructions.
Following 2.3.4 and 2.3.25 we have the following:
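21 Definition. In any category with finite coproducts, the generalized conditional of a morphism of form $t: X \to Y + Y$ is $\sigma t: X \to Y$ with $\sigma: Y + Y \to Y$ as
in 6, that is, $\sigma\,\mathrm{in}_1 = \mathrm{id}_Y = \sigma\,\mathrm{in}_2$. An appropriate flowscheme is
[Flowscheme for the generalized conditional; figure not recoverable.]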
22 Examples. In Pfn, given $A \subset X$ and $f, g: X \to Y$, if A then f else g = $\sigma t$
provided $t: X \to Y + Y$ is defined by
\[ DD(t) = (A \cap DD(f)) \cup (A' \cap DD(g)), \qquad
   t(x) = \begin{cases} (f(x), 1) & x \in A \\ (g(x), 2) & x \notin A. \end{cases} \]
A similar construction works in Mfn.
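As an illustration of Example 22, the following sketch models partial functions between finite sets as Python dicts (a missing key means "undefined") and forms if A then f else g as $\sigma t$. The helper names (`compose`, `codiagonal`, `branch`) and the sample data are assumptions made for this illustration; they are not notation from the text.

```python
# Partial functions between finite sets are modeled as dicts;
# a missing key means "undefined at that argument".

def compose(g, f):
    """g after f: defined at x exactly when f(x) and g(f(x)) are both defined."""
    return {x: g[f[x]] for x in f if f[x] in g}

def codiagonal(Y):
    """sigma: Y + Y -> Y forgets the tag, so sigma in_1 = id_Y = sigma in_2."""
    return {(y, tag): y for y in Y for tag in (1, 2)}

def branch(A, f, g, X):
    """t: X -> Y + Y with t(x) = (f(x), 1) on A and t(x) = (g(x), 2) off A."""
    t = {}
    for x in X:
        if x in A and x in f:
            t[x] = (f[x], 1)
        elif x not in A and x in g:
            t[x] = (g[x], 2)
    return t

X = range(6)
Y = range(-5, 7)                       # large enough to hold every output below
A = {x for x in X if x % 2 == 0}       # the guard set: even numbers
f = {x: x + 1 for x in X}              # total on X
g = {x: -x for x in X if x != 3}       # undefined at 3

t = branch(A, f, g, X)
if_then_else = compose(codiagonal(Y), t)
print(if_then_else)
```

The result is undefined at 3 because 3 lies outside A and g is undefined there, matching the formula for DD(t).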
There is a partially additive version of 1.5.24 for the generalized conditional:
23 Proposition. Given $t: X \to Y + Y$ in a partially additive category, if
$t = \mathrm{in}_1 t_1 + \mathrm{in}_2 t_2$ as in 15, the generalized conditional is given by
\[ \sigma t = t_1 + t_2. \]
The reader should observe that the flowscheme form of Proposition 23 is
precisely 7.
In the same vein, a morphism of form $f: X \to X + Y$ induces a
"generalized while-do" which we call the iterate of f and denote by a dagger
superscript $f^\dagger: X \to Y$. Intuitively, $f^\dagger$ is "repeat f until in Y." The formal
definition results from the following theorem which makes crucial use of the
limit axiom in Definition 3.1.2.
24 Theorem. Given $f: X \to X + Y$ in a partially additive category, write
$f = \mathrm{in}_1 f_1 + \mathrm{in}_2 f_2$ for unique $f_1: X \to X$, $f_2: X \to Y$ as in Proposition 15.
Then the sum
\[ f^\dagger = \sum_{n=0}^{\infty} f_2 f_1^n : X \to Y \]
exists.
We call it the iterate of f and denote it with the flowscheme
[Flowscheme for the iterate $f^\dagger$; figure not recoverable.]
PROOF. For any $g: X \to Y$ the sum $g f_1 + f_2$ exists. To see this, consider
\[ X \xrightarrow{\;f\;} X + Y \xrightarrow{\;\langle g, \mathrm{id}_Y \rangle\;} Y \]
and use the fact that $f = \mathrm{in}_1 f_1 + \mathrm{in}_2 f_2$ to derive
\[ \langle g, \mathrm{id}_Y \rangle f = \langle g, \mathrm{id}_Y \rangle (\mathrm{in}_1 f_1 + \mathrm{in}_2 f_2)
   = \langle g, \mathrm{id}_Y \rangle \mathrm{in}_1 f_1 + \langle g, \mathrm{id}_Y \rangle \mathrm{in}_2 f_2
   = g f_1 + f_2. \]
Setting $g = f_2$, $f_2 f_1 + f_2$ exists. Proceeding inductively, if
$g = f_2 f_1^n + f_2 f_1^{n-1} + \cdots + f_2 f_1 + f_2$ exists, then so does
\[ (f_2 f_1^n + f_2 f_1^{n-1} + \cdots + f_2 f_1 + f_2) f_1 + f_2 = f_2 f_1^{n+1} + f_2 f_1^n + \cdots + f_2 f_1 + f_2. \]
Now let F be any finite subset of $I = \{0, 1, 2, \ldots\}$. There exists n with
$F \subset \{0, \ldots, n\}$. As $(f_2 f_1^i : 0 \le i \le n)$ is summable and as any subfamily of a
summable family is summable by 3.1.7, $(f_2 f_1^i : i \in F)$ is summable. That $f^\dagger$
exists then follows from the limit axiom. □
25 Example. In Pfn, for $f: X \to X + Y$, if $x, f(x), f(f(x)), \ldots, f^{n-1}(x)$ are
defined and in X and then $f^n(x)$ is defined and in Y, $f^\dagger(x) = f^n(x)$. Such n is
clearly unique if it exists. If no such n exists $f^\dagger(x)$ is undefined. For $A \subset X$,
$f: X \to X$, while A do f is $g^\dagger$ if $g: X \to X + X$ is defined by
\[ g(x) = \begin{cases} (f(x), 1) & \text{if } x \in DD(f) \cap A \\ (x, 2) & \text{if } x \notin A. \end{cases} \]
In this case
\[ g^\dagger = \sum_{n=0}^{\infty} \mathrm{inc}_{A'} (f\,\mathrm{inc}_A)^n \]
which is just the formula for while A do f of 1.5.27.
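The description of the iterate in Pfn can be tried out on finite sets. The sketch below is only an illustration under stated assumptions: partial functions are dicts, an element of X + Y is a pair (value, tag) with tag 1 for X and tag 2 for Y, and the `max_steps` cutoff is an assumed stand-in for "no such n exists" (a program cannot in general detect nontermination). The names `iterate` and `while_do` are hypothetical.

```python
def iterate(f, X, max_steps=1000):
    """Iterate of f: X -> X + Y, where f maps x to (value, tag).

    Tag 1 means "still in X, apply f again"; tag 2 means "exit into Y".
    """
    result = {}
    for x in X:
        current = x
        for _ in range(max_steps):
            if current not in f:       # f undefined here: the iterate is undefined at x
                break
            value, tag = f[current]
            if tag == 2:               # landed in Y after finitely many steps
                result[x] = value
                break
            current = value            # landed in X: keep going
    return result

def while_do(A, f, X, max_steps=1000):
    """while A do f, built as the iterate of the g of Example 25."""
    g = {}
    for x in X:
        if x in A:
            if x in f:
                g[x] = (f[x], 1)
        else:
            g[x] = (x, 2)
    return iterate(g, X, max_steps)

X = range(10)
A = {x for x in X if x % 3 != 0}       # loop while x is not a multiple of 3
f = {x: x - 1 for x in X if x != 5}    # decrement, undefined at 5

countdown = while_do(A, f, X)
forever = while_do(set(X), {x: x for x in X}, X)   # guard never fails
print(countdown, forever)
```

At 5 the loop body is undefined while the guard holds, so the loop is undefined there; with a guard that never fails the iterate is the empty partial function.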
In Chapter 8 we will discuss the semantics of recursion in a partially
additive category. Our more immediate objectives are to introduce guards so
as to deal directly with conditional and repetitive constructs in the form they
were introduced in Section 1.5 and to discuss assertion semantics and proof
rules in this context. This program is begun in the next section and continued
in Chapter 4.
EXERCISES FOR SECTION 3.2
1. Construct an isomorphism $(W + X) + (Y + Z) \cong W + X + Y + Z$ in any category with finite coproducts. [Hint: define four injections rendering $(W + X) + (Y + Z)$ a model of $W + X + Y + Z$.]
2. Give explicit descriptions of $\mathrm{PR}_j$, $\Delta$, and $\sigma$ in Pfn and convince yourself that the
flowschemes of Definition 6 are aptly chosen. Observe that the same partial
functions considered in Mfn provide these constructions in Mfn.
3. Prove that Set is not partially additive.
4. Prove that Vect is not partially additive. [Hint: All finite families are compatible
and this forces (f + g)(x) = f(x) + g(x).]
5. Let C be a partially additive category and let Y be an object of C such that for all
X and for all $f, g \in C(X, Y)$, f + g is defined. Prove that
\[ Y \xleftarrow{\;\mathrm{PR}_1\;} Y + Y \xrightarrow{\;\mathrm{PR}_2\;} Y \]
is a product in C.
6. Show that for $h: X \to X + Y$ in Pfn, there exist $A \subset X$, $f: X \to X$, and $g: X \to Y$
such that $h^\dagger = g \circ {}$(while A do f).
7. Show that while A do f in Mfn as in 1.5.27 has form $h^\dagger$ but that, in contrast to Pfn
in Exercise 6, there exists $h: X \to X + Y$ in Mfn such that $h^\dagger$ is not of the form
$g \circ {}$(while A do f). Thus, the iterate is truly more general than while-do.
8. Show that in both Pfn and Mfn, repeat f until A has the form $h^\dagger$.
9. Show that the following holds in any partially additive category.
[Flowscheme identity; figure not recoverable.]
10. Show that the following holds in any partially additive category.
[Flowscheme identity; figure not recoverable.]
11. For any monoid (M, 0, e), show that the category FwR(M.o,e) of Exercises 2.1.11,
2.2.12, 2.3.13 is partially additive. [Hint: Pfn(M,o,e) (X, Y) = Pfn(X, M x Y) is a
partially additive monoid with the sum of 1.5.11.]
3.3 The Boolean Algebra of Guards
Pascal allows a statement of the form
    if x > 0 and not x = 5 then $S_1$ else $S_2$.
If $S_1$, $S_2$ are to be semantically interpreted as morphisms $X \to Y$ in a category
C, how can "P and not Q" be interpreted, where P is "x > 0" and Q is
"x = 5"? In categories such as Pfn where objects are sets, "propositions"
such as P, Q are tantamount to subsets. For example, "x > 0" may be identified with the subset of all x for which x > O. Under this identification, the
logical operations and, or, and not correspond to intersection, union, and
complement in the set of subsets of the set of possible values of x. Equivalently, they may be represented by guard functions as in 1.5.20. In this
section, we first explore the set of subsets of a set, showing that its poset
structure determines its logical structure as a "Boolean algebra." We then
show that for an object X in a partially additive category C, there is a subset
Guard(X) of C(X, X) of "guard morphisms" which forms a Boolean algebra.
This is a pleasant surprise since the axioms defining a partially additive
category were not designed with this in mind-Guard(X) comes for free!
We begin by abstracting familiar operations in the poset $(\mathcal{P}(X), \subset)$ of subsets of X (2.1.8) to an arbitrary poset.
1 Definition. Let $(P, \le)$ be a poset. A least element of $(P, \le)$ is an element
$0 \in P$ such that $0 \le x$ for all $x \in P$. At most one least element exists since if z
is also a least element then $0 \le z$ and $z \le 0$ so that z = 0. A greatest element
of $(P, \le)$ is an element $1 \in P$ such that $x \le 1$ for all $x \in P$. Again, antisymmetry yields that there is at most one greatest element. (It is reasonable to
have chosen the same notations as for initial and terminal objects in a
category; see Exercise 1.)
2 Example. $(\mathcal{P}(X), \subset)$ has $\emptyset$ as least element and has X as greatest element.
"Intersection" and "union" generalize to posets by observing that the
intersection of two sets is the largest set contained in both of them whereas
the union of two sets is the smallest set containing both of them. Formally,
we have the following:
3 Definition. Let $(P, \le)$ be a poset with $x, y \in P$. An element $w \in P$ is the
infimum (or greatest lower bound or meet) of x, y if
(i) $w \le x$ and $w \le y$; and
(ii) whenever $a \le x$ and $a \le y$, then $a \le w$.
Such w is unique when it exists since if u also satisfies (i) and (ii) then $w \le u$
and $u \le w$. We write the infimum of x, y as
\[ x \wedge y \]
when it exists.
A supremum (or least upper bound or join) of x, y is an element z of P
satisfying
(iii) $x \le z$ and $y \le z$; and
(iv) whenever $x \le a$ and $y \le a$, then $z \le a$.
Again, such z is unique if it exists, in which case we write it as
\[ x \vee y. \]
A poset in which $x \wedge y$ exists for all x, y is called a meet-semilattice. A
poset in which $x \vee y$ exists for all x, y is called a join-semilattice. A lattice is
a poset in which both $x \wedge y$ and $x \vee y$ exist for all x, y.
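For a finite poset, Definition 3 can be checked by brute force. The sketch below is an illustration only: the function names and the two sample posets (sets of positive integers ordered by divisibility) are assumptions chosen for the example. It returns `None` when no infimum or supremum exists.

```python
def meet(P, leq, x, y):
    """Greatest lower bound of x, y in the finite poset (P, leq), or None."""
    lower = [w for w in P if leq(w, x) and leq(w, y)]      # condition (i)
    for w in lower:
        if all(leq(a, w) for a in lower):                  # condition (ii)
            return w
    return None

def join(P, leq, x, y):
    """Least upper bound of x, y in the finite poset (P, leq), or None."""
    upper = [z for z in P if leq(x, z) and leq(y, z)]      # condition (iii)
    for z in upper:
        if all(leq(z, a) for a in upper):                  # condition (iv)
            return z
    return None

def divides(a, b):
    return b % a == 0

P = [1, 2, 3, 4, 6, 12]      # divisors of 12: a lattice under divisibility
Q = [2, 3, 12, 18]           # 2 and 3 have upper bounds here, but no least one

print(meet(P, divides, 4, 6), join(P, divides, 4, 6))
print(meet(Q, divides, 2, 3), join(Q, divides, 2, 3))
```

In Q the elements 12 and 18 are both upper bounds of 2 and 3, yet neither divides the other, so the supremum does not exist; Q is a poset that is not a join-semilattice.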
4 Example. $(\mathcal{P}(X), \subset)$ is a lattice with $A \wedge B = A \cap B$ and $A \vee B = A \cup B$.
5 Proposition. In a meet-semilattice,
(i) $x \wedge y = y \wedge x$.
(ii) If $x \le x_1$, $y \le y_1$ then $x \wedge y \le x_1 \wedge y_1$.
(iii) $x \wedge (y \wedge z) = (x \wedge y) \wedge z$.
In a join-semilattice,
(iv) $x \vee y = y \vee x$.
(v) If $x \le x_1$, $y \le y_1$ then $x \vee y \le x_1 \vee y_1$.
(vi) $x \vee (y \vee z) = (x \vee y) \vee z$.
PROOF. That (i) holds is obvious since the order in which x, y are listed is
immaterial in Definition 3. To prove (ii) and (iii) we use the axioms on $\le$ and
3 as follows. If $x \le x_1$ and $y \le y_1$ then $x \wedge y \le x$ and $x \le x_1$ so $x \wedge y \le x_1$;
similarly, $x \wedge y \le y_1$. Thus, $x \wedge y \le x_1 \wedge y_1$ by 3 (ii).
The proof of (iii) is longer. As $x \le x$ and $y \wedge z \le y$ it follows from
(ii) already proved that $x \wedge (y \wedge z) \le x \wedge y$. Furthermore, $x \wedge (y \wedge z) \le y \wedge z \le z$
so $x \wedge (y \wedge z) \le z$. Thus, $x \wedge (y \wedge z) \le (x \wedge y) \wedge z$. For the
reverse inequality, $(x \wedge y) \wedge z \le x \wedge y \le x$ and, as $x \wedge y \le y$ and $z \le z$,
$(x \wedge y) \wedge z \le y \wedge z$ so $(x \wedge y) \wedge z \le x \wedge (y \wedge z)$. By antisymmetry, (iii)
follows.
The proof of (iv), (v), and (vi) for join-semilattices is similar. □
Because of (iii) and (vi) in 5 we may use parentheses-free notation
\[ x_1 \wedge \cdots \wedge x_n, \qquad x_1 \vee \cdots \vee x_n \]
for n-fold meets and joins.
6 Proposition. Let $(P, \le)$ be a poset and let $x \in P$. Then:
(i) $x \wedge x$, $x \vee x$ exist and $x \wedge x = x = x \vee x$.
(ii) If the least element 0 of $(P, \le)$ exists, $0 \wedge x$, $0 \vee x$ exist and $0 \wedge x = 0$,
$0 \vee x = x$.
(iii) If the greatest element 1 of $(P, \le)$ exists, $1 \wedge x$, $1 \vee x$ exist and $1 \wedge x = x$,
$1 \vee x = 1$.
PROOF. We prove the statements involving joins, leaving the remaining results
for the reader since the proofs are similar.
For (i), $x \le x$ and $x \le x$, and if $x \le a$ and $x \le a$, then $x \le a$. Thus, x
satisfies the requirements for $x \vee x$ in 3. As $0 \le x$ and $x \le x$ and if $0 \le a$ and
$x \le a$ then $x \le a$, $0 \vee x = x$. Finally, $1 \le 1$ and $x \le 1$ and if $1 \le a$ and $x \le a$
then 1 = a by antisymmetry since certainly $a \le 1$ so $1 \vee x = 1$. □
If A is a subset of X, the only subset S of X satisfying $A \cap S = \emptyset$,
$A \cup S = X$ is $S = A' = X - A$, the complement of A. This suggests the
following.
7 Definition. Let $(P, \le)$ be a poset with least element 0 and greatest element
1. Let $x \in P$. A complement of x is an element x' for which $x \wedge x'$, $x \vee x'$ exist
and
\[ x \wedge x' = 0, \qquad x \vee x' = 1. \]
8 Example. Consider the poset with Hasse diagram
[Hasse diagram: greatest element 1 at the top, least element 0 at the bottom, and three elements a, b, c each lying directly between 0 and 1.]
Here each of a, b, c has two complements.
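The claim of Example 8 can be checked mechanically. The sketch below assumes that the Hasse diagram is the five-element "diamond" in which a, b, c are pairwise incomparable and sit between 0 and 1; that reading of the figure, together with all function names, is an assumption of this illustration.

```python
ELEMENTS = ["0", "a", "b", "c", "1"]

def leq(x, y):
    """Order of the assumed diamond: 0 is below everything, 1 is above everything."""
    return x == y or x == "0" or y == "1"

def meet(x, y):
    lower = [w for w in ELEMENTS if leq(w, x) and leq(w, y)]
    return next(w for w in lower if all(leq(a, w) for a in lower))

def join(x, y):
    upper = [z for z in ELEMENTS if leq(x, z) and leq(y, z)]
    return next(z for z in upper if all(leq(z, a) for a in upper))

def complements(x):
    """All y with x meet y = 0 and x join y = 1 (Definition 7)."""
    return [y for y in ELEMENTS if meet(x, y) == "0" and join(x, y) == "1"]

for x in ELEMENTS:
    print(x, complements(x))

# Distributivity fails in this lattice, so non-unique complements are possible.
lhs = meet("a", join("b", "c"))
rhs = join(meet("a", "b"), meet("a", "c"))
print(lhs, rhs)
```

Under this assumption the two sides of the distributive law differ for x = a, y = b, z = c, which is why uniqueness of complements can fail here.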
We would like complements to be unique. A key idea is the following.
9 Definition. A lattice is distributive if
\[ x \wedge (y \vee z) = (x \wedge y) \vee (x \wedge z) \]
for all x, y, z in the lattice.
10 Proposition. In a distributive lattice with least and greatest elements, each
element has at most one complement.
PROOF. Suppose
\[ x \wedge y = 0 = x \wedge z, \qquad x \vee y = 1 = x \vee z. \]
Then, making free use of Propositions 5 and 6,
\[ y = y \wedge 1 = y \wedge (x \vee z) = (y \wedge x) \vee (y \wedge z) = y \wedge z \]
so that $y \le z$. Similarly, $z \le y$. Thus, y = z. □
11 Example. $(\mathcal{P}(X), \subset)$ is a distributive lattice. If $x \in A \cap (B \cup C)$ then $x \in A$
and either $x \in B$ (hence $x \in A \cap B$) or $x \in C$ (hence $x \in A \cap C$) so $x \in (A \cap B) \cup (A \cap C)$.
And, if $x \in (A \cap B) \cup (A \cap C)$, then either $x \in A \cap B \subset A \cap (B \cup C)$ or
$x \in A \cap C \subset A \cap (B \cup C)$.
It follows that $(\mathcal{P}(X), \subset)$ is a Boolean algebra which is defined as follows:
12 Definition. A Boolean algebra is a distributive lattice with least and greatest elements in which every element x has a (necessarily unique by 10) complement x'.
While the axioms on a Boolean algebra as an abstraction of $(\mathcal{P}(X), \subset)$
have been well motivated, it is not clear that enough axioms have been
imposed. The reader's confidence that this is in fact so will be strengthened
by working Exercise 10.
We emphasize that all operations involved ($x \wedge y$, $x \vee y$, 0, 1, x') are
defined in terms of the partial order and if they exist they do so uniquely. A
Boolean algebra is a type of poset.
We now turn our attention to the problem of finding a subset of C(X, X)
of "guard morphisms" which forms a Boolean algebra. For intuition, consider the way in which a subset A of X corresponds to the partial function
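$\mathrm{inc}_A: X \to X$ of 1.5.19. We note that $\mathrm{inc}_A$ inherits from A the following
properties:
\[ \mathrm{inc}_A + \mathrm{inc}_{A'} = 1, \qquad \mathrm{inc}_A \cdot \mathrm{inc}_{A'} = 0 = \mathrm{inc}_{A'} \cdot \mathrm{inc}_A, \]
where we now write 1 for the identity function $\mathrm{id}_X$.
This motivates the following definition: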
13 Definition. For X an object of a partially additive category C, Guard(X)
is the subset of C(X, X) comprising all morphisms p for which there exists p'
such that
\[ p + p' \text{ exists and } p + p' = 1, \qquad pp' = 0 = p'p, \]
where we take $1 = \mathrm{id}_X$. Elements of Guard(X) are called guards on X.
14 Example. For both of the partially additive categories Pfn, Mfn, Guard(X)
consists of all the inclusion functions $\mathrm{inc}_A$ of 1.5.19. First consider Mfn. The
equation $p + p' = \mathrm{id}_X$ for $p, p' \in$ Mfn(X, X) yields $p(x) \cup p'(x) = \{x\}$. If both
$p(x) = \{x\}$ and $p'(x) = \{x\}$ then $p(p'(x)) = \{x\}$ which contradicts $p(p'(y)) = \emptyset$
for all y. Thus, exactly one of p(x), p'(x) can equal $\{x\}$. Setting
$A = \{x \in X \mid p(x) = \{x\}\}$ we see $p = \mathrm{inc}_A$. Conversely, if $p = \mathrm{inc}_A$, set $p' = \mathrm{inc}_{A'}$. The
proof for Pfn is essentially the same.
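For Pfn on a small set, the claim can be confirmed by exhaustive search. The following sketch is an illustration under stated assumptions: X is a three-element set, partial functions are dicts, and the sum in Pfn is taken to be the union of partial functions with disjoint domains. All names are hypothetical.

```python
from itertools import product

X = (0, 1, 2)

def all_partial_functions(X):
    """Every partial function X -> X, as a dict; None marks 'undefined'."""
    for values in product((None,) + tuple(X), repeat=len(X)):
        yield {x: v for x, v in zip(X, values) if v is not None}

def compose(g, f):
    """g after f."""
    return {x: g[f[x]] for x in f if f[x] in g}

def disjoint_sum(f, g):
    """Sum in Pfn: defined only when the domains are disjoint."""
    if set(f) & set(g):
        return None
    return {**f, **g}

identity = {x: x for x in X}

def is_guard(p):
    """Definition 13: some p' has p + p' = 1 and p p' = 0 = p' p."""
    return any(
        disjoint_sum(p, q) == identity and compose(p, q) == {} and compose(q, p) == {}
        for q in all_partial_functions(X)
    )

guards = [p for p in all_partial_functions(X) if is_guard(p)]
print(len(guards), guards)
```

Of the 64 partial functions on X exactly 8 are guards, one inclusion function for each subset of X.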
The object of this section, then, is to show that for each partially additive
C, Guard(X) has a poset structure with respect to which it is a Boolean
algebra in such a way that Guard(X) for Pfn(X,X) and Mfn(X, X) have the
usual Boolean operations on subsets.
In what follows, we leave implicit a partially additive category C with
respect to which Guard(X) is formed for some object X. We begin with the
following:
15 Proposition. For p in Guard(X), the p' in the equations
\[ p + p' = 1, \qquad pp' = 0 = p'p \]
is unique. Furthermore, $p'' = p$, $0' = 1$, and $1' = 0$.
PROOF. In spirit, this is much like Proposition 10. If also
\[ p + q = 1, \qquad pq = 0 = qp, \]
then
\[ q = q1 = q(p + p') = qp + qp' = 0 + qp' = qp' \]
so that
\[ p' = (p + q)p' = pp' + qp' = qp' = q. \]
That $p'' = p$ is immediate from the symmetry of p and p' in the defining
equations. That $0' = 1$, $1' = 0$ is clear from
\[ 0 + 1 = 1, \qquad 0 \cdot 1 = 0 = 1 \cdot 0. \] □
We next introduce the "sum-ordering" relation which, while not necessarily
antisymmetric on all C(X, X) in general, always makes Guard(X) a poset, as
shown in Theorem 20.
16 Definition. The sum-ordering relation on C(X, X) is defined by
\[ f \le g \text{ if there exists } h \text{ such that } g = f + h. \]
Hence, in any partially additive category, we have $p \le 1 = \mathrm{id}_X$ for each
guard p.
17 Examples. For Pfn(X, X), $\le$ is the extension ordering of Example 2.1.9.
If g extends f define $DD(h) = \{x \in DD(g) \mid x \notin DD(f)\}$ and define h(x) = g(x)
to get g = f + h. That g extends f if g has the form f + h is obvious.
18 Example. For Mfn(X, X), $f \le g$ if and only if $f(x) \subset g(x)$ for all x. We
leave this as an exercise.
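The argument of Example 17 can be checked exhaustively on a two-element set. The sketch below is an illustration only; it assumes that the sum in Pfn is the union of partial functions with disjoint domains, and the names `sum_leq`, `extends`, and `witness` are hypothetical.

```python
from itertools import product

X = (0, 1)

def all_partial_functions(X):
    for values in product((None,) + tuple(X), repeat=len(X)):
        yield {x: v for x, v in zip(X, values) if v is not None}

def disjoint_sum(f, g):
    """Sum in Pfn: defined only when the domains are disjoint."""
    if set(f) & set(g):
        return None
    return {**f, **g}

def sum_leq(f, g):
    """Definition 16: f <= g if g = f + h for some h."""
    return any(disjoint_sum(f, h) == g for h in all_partial_functions(X))

def extends(g, f):
    """g is defined wherever f is, and agrees with f there."""
    return all(x in g and g[x] == f[x] for x in f)

def witness(f, g):
    """The h of Example 17: g restricted to DD(g) minus DD(f)."""
    return {x: g[x] for x in g if x not in f}

functions = list(all_partial_functions(X))
orderings_agree = all(sum_leq(f, g) == extends(g, f) for f in functions for g in functions)
witnesses_work = all(
    disjoint_sum(f, witness(f, g)) == g
    for f in functions for g in functions if extends(g, f)
)
print(orderings_agree, witnesses_work)
```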
19 Proposition. The sum-ordering $\le$ on C(X, X) satisfies the following
properties:
(i) $\le$ is reflexive and transitive.
(ii) If $f \le g$ then for any t, u, $tf \le tg$ and $fu \le gu$.
(iii) If p is a guard and $f \le p$ then
\[ pf = f = fp, \qquad p'f = 0 = fp'. \]
(iv) If p is a guard and $f \le 1$ then pf = fp.
(v) For p, q guards, pp = p and pq = qp.
Before reading the proof, readers should hone their intuition by checking
that (i) through (v) do indeed hold in Pfn(X, X).
PROOF. (i) As f = f + 0, $f \le f$ so $\le$ is reflexive. If g = f + w, h = g + v then
by partition associativity, h = g + v = (f + w) + v = f + (w + v). Thus, if
$f \le g$ and $g \le h$, $f \le h$ so $\le$ is transitive.
(ii) If g = f + h, tg = t(f + h) = tf + w for w = th and gu = fu + v for
v = hu.
(iii) Write p = f + h. Then 0 = pp' = (f + h)p' = fp' + hp' so fp' = 0 by
Proposition 3.1.8. But then f = f(p + p') = fp + 0 = fp. That p'f = 0 and
f = pf is similar.
(iv) Applying (ii) to $f \le 1$, $pf \le p1 = p$ and similarly $fp \le p$. By two uses
of (iii) we have
\[ pf = (pf)p = p(fp) = fp. \]
(v) From $p \le p$, pp = p is immediate from (iii). Since $p \le 1$, it follows from
(iv) that pq = qp. □
20 Theorem. Consider Guard(X) with the sum-ordering $\le$. Then for any
guards p, q
\[ p \le q \iff pq = p. \]
Furthermore, $(\mathrm{Guard}(X), \le)$ is a poset.
PROOF. Let p, q be guards. If pq = p then
\[ q = (p + p')q = pq + p'q = p + h \]
for h = p'q, so $p \le q$. Conversely, if $p \le q$ then pq = p by 19 (iii).
We know $\le$ is reflexive and transitive from 19 (i). To prove antisymmetry, we note that if $p \le q$ and $q \le p$ then, using 19 (v),
\[ p = pq = qp = q. \] □
The following proposition prepares the way to prove that $(\mathrm{Guard}(X), \le)$
is a Boolean algebra.
21 Proposition. For brevity, write G for Guard(X). Let $p, q \in G$. Then:
(i) Also $pq \in G$ with (pq)' = pq' + p'q + p'q'.
(ii) The infimum of p, q in $(G, \le)$ exists and is pq.
(iii) $p \le q$ if and only if $q' \le p'$.
PROOF. (i) We have
\[ 1 = 1 \cdot 1 = (p + p')(q + q') = p(q + q') + p'(q + q') = pq + (pq' + p'q + p'q'). \]
Making free use of 19 (v),
\[ pq(pq' + p'q + p'q') = pqq' + pp'q + pp'qq' = p0 + 0q + 00 = 0. \]
Similarly,
\[ (pq' + p'q + p'q')pq = 0 \]
so by virtue of the defining equations in 13, pq is in G with (pq)' = pq' + p'q + p'q'.
(ii) That pq is in G has just been established. By 19 and 20, (pq)p =
ppq = pq shows $pq \le p$ and $pq \le q$ similarly. If also $r \le p$ and $r \le q$ then
rp = r = rq so r(pq) = (rp)q = rq = r and $r \le pq$. By Definition 3, $pq = p \wedge q$
in $(G, \le)$.
(iii) If $p \le q$ then p = pq so by (i)
\[ p' = (pq)' = pq' + p'q + p'q' = (p + p')q' + p'q = q' + h, \qquad h = p'q. \]
So $q' \le p'$. (We caution the reader that if ac + bc exists (a + b)c may not; in
the above, pq' + p'q' = (p + p')q' is valid because we know p + p' exists.)
Conversely, if $q' \le p'$ then by the result already proved, $p'' \le q''$. So, recalling
15, $p \le q$. □
We can now establish the main result of the section.
22 Theorem. For G = Guard(X) for X an object of a partially additive category C, with sum-ordering $\le$, $(G, \le)$ is a Boolean algebra. Furthermore,
(i) the empty sum 0 is the least element of $(G, \le)$;
(ii) $1 = \mathrm{id}_X$ is the greatest element of $(G, \le)$;
(iii) the infimum operation is $p \wedge q = pq$;
(iv) the Boolean algebra complement of p coincides with the guard complement p'; and
(v) the supremum operation is given by any of
\[ p \vee q = pq' + p'q + pq = pq' + q = p + p'q. \]
PROOF. Since 0p = 0, p1 = p, (i) and (ii) are clear, and (iii) has already been
shown in 21. For the moment let p' denote the guard complement. Given p,
q, it follows from 21 (iii) that (p'q')' is the supremum of p, q as follows. As
$p'q' \le p'$, $p \le (p'q')'$. Similarly, $q \le (p'q')'$. If also $p \le t$, $q \le t$ then $t' \le p'$,
$t' \le q'$ so $t' \le p'q'$; hence, $(p'q')' \le t$. Using 21 (i) this shows
\[ p \vee q = (p'q')' = p'q + pq' + pq, \]
which has alternate forms pq' + (p' + p)q = pq' + q and p + p'q similarly.
We then have
\[ p \vee p' = p + p'p' = p + p' = 1, \qquad p \wedge p' = pp' = 0 = p'p = p' \wedge p. \]
So p' is also the lattice complement. Finally, in accordance with Definition
12, we now must prove the distributive law of 9. Indeed,
\[ p \wedge (q \vee r) = p(q + q'r) = pq + pq'r, \]
whereas
\[ (p \wedge q) \vee (p \wedge r) = pq + (pq)'(pr) = pq + (pq' + p'q + p'q')(pr)
   = pq + pq'r + p'pqr + p'pq'r = pq + pq'r. \] □
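In Pfn the formulas of Theorem 22 can be compared directly with the usual operations on subsets. The sketch below is an illustration under stated assumptions: X has three elements, a guard is represented by the dict of $\mathrm{inc}_A$, and sums are unions of partial functions with disjoint domains. The names are hypothetical.

```python
from itertools import combinations

X = frozenset({0, 1, 2})

def inc(A):
    """The guard inc_A: the identity restricted to A."""
    return {x: x for x in A}

def compose(g, f):
    """g after f."""
    return {x: g[f[x]] for x in f if f[x] in g}

def add(*summands):
    """Sum in Pfn; raises if the family is not summable (domains overlap)."""
    total = {}
    for f in summands:
        if set(total) & set(f):
            raise ValueError("not summable in Pfn: domains overlap")
        total.update(f)
    return total

def complement(p):
    return inc(X - set(p))

def meet(p, q):
    """p meet q = pq."""
    return compose(p, q)

def join(p, q):
    """p join q = pq' + p'q + pq."""
    return add(compose(p, complement(q)), compose(complement(p), q), compose(p, q))

subsets = [frozenset(c) for r in range(len(X) + 1) for c in combinations(sorted(X), r)]

meets_are_intersections = all(
    meet(inc(A), inc(B)) == inc(A & B) for A in subsets for B in subsets
)
joins_are_unions = all(
    join(inc(A), inc(B)) == inc(A | B) for A in subsets for B in subsets
)
distributive = all(
    meet(inc(A), join(inc(B), inc(C))) == join(meet(inc(A), inc(B)), meet(inc(A), inc(C)))
    for A in subsets for B in subsets for C in subsets
)
print(meets_are_intersections, joins_are_unions, distributive)
```

The three summands in `join` have pairwise disjoint domains, so `add` never raises here; this mirrors the fact that the sum in (v) always exists for guards.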
We have now justified the following definitions which extend those of
Section 1.5 to an arbitrary partially additive category.
23 Definitions. Let C be a partially additive category. An n-way test on X is
a summable n-tuple $(p_1, \ldots, p_n)$ with each $p_i \in \mathrm{Guard}(X)$. If $f_1, \ldots, f_n: X \to Y$
and $(p_1, \ldots, p_n)$ is an n-way test, we define
\[ \text{case } (p_1, \ldots, p_n) \text{ of } (f_1, \ldots, f_n) = f_1 p_1 + \cdots + f_n p_n. \]
This sum exists by Corollary 3.2.20. This recaptures both the case statement
of 1.5.22 for C = Pfn and the alternative construct 1.5.23 if C = Mfn.
An important special case occurs for $p \in \mathrm{Guard}(X)$, $f, g: X \to Y$: if p then f
else g = fp + gp'.
For $p \in \mathrm{Guard}(X)$, $f: X \to X$ we also have
\[ \text{while } p \text{ do } f = \sum_{n=0}^{\infty} p'(fp)^n, \qquad
   \text{repeat } f \text{ until } p = \sum_{n=0}^{\infty} (pf)(p'f)^n, \]
generalizing 1.5.27 and 1.5.29 for C = Pfn or Mfn.
For the repetitive construct of 1.5.30 let $(p_1, \ldots, p_n)$ be an n-way test on X
and let $f_1, \ldots, f_n: X \to X$. Then do $p_1 \to f_1 \;\square\; \cdots \;\square\; p_n \to f_n$ od = while
$p_1 \vee \cdots \vee p_n$ do $f_1 p_1 + \cdots + f_n p_n$.
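To see the sum formulas at work in Pfn, the sketch below evaluates if p then f else g as fp + gp' and approximates while p do f by a partial sum of the terms p'(fp)^n. It is an illustration under stated assumptions: X is finite, partial functions are dicts, guards are inclusion functions, and the number of terms is chosen large enough for the partial sums to stabilize on this X (the infinite sum itself rests on the limit axiom). All names and data are hypothetical.

```python
X = range(8)

def compose(g, f):
    """g after f."""
    return {x: g[f[x]] for x in f if f[x] in g}

def add(f, g):
    """Sum in Pfn: union of partial functions with disjoint domains."""
    if set(f) & set(g):
        raise ValueError("not summable in Pfn: domains overlap")
    return {**f, **g}

def inc(A):
    return {x: x for x in A}

def if_then_else(p, p_prime, f, g):
    """f p + g p'."""
    return add(compose(f, p), compose(g, p_prime))

def while_do(p, p_prime, f, terms):
    """Partial sum of p'(fp)^n for n = 0, ..., terms - 1."""
    fp = compose(f, p)
    power = inc(X)                     # (fp)^0 is the identity
    total = {}
    for _ in range(terms):
        total = add(total, compose(p_prime, power))
        power = compose(fp, power)     # (fp)^(n+1) = fp after (fp)^n
    return total

p = inc(x for x in X if x >= 3)        # guard for "x >= 3"
p_prime = inc(x for x in X if x < 3)   # its complement
f = {x: x - 2 for x in X if x >= 2}    # subtract 2, where the result stays in X
g = {x: 10 * x for x in X}

branch = if_then_else(p, p_prime, f, g)
loop = while_do(p, p_prime, f, terms=len(X))
print(branch)
print(loop)
```

Each x exits the loop after exactly one number n of passes, so the terms p'(fp)^n have pairwise disjoint domains and `add` never raises.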
EXERCISES FOR SECTION 3.3
1. For the category $C_{(P, \le)}$ of Exercise 2.1.7, show that a least element of $(P, \le)$ is the
same thing as an initial object of $C_{(P, \le)}$. Then invoke duality to prove that a
greatest element of $(P, \le)$ is a terminal object of $C_{(P, \le)}$.
2. In the context of Definition 3, show that the uniqueness of the supremum follows
by duality. Also invoke duality for the proofs of (iv), (v), and (vi) in Proposition 5
and the "remaining results" in the proof of Proposition 6.
3. Show that "any subset of a poset is a poset." More precisely, show that if $(P, \le)$
is a poset and $P_0 \subset P$ then $(P_0, \le_0)$ is a poset if $x \le_0 y$ means that $x \le y$. Also
prove that $(P_0, \le_0)$ is totally ordered if $(P, \le)$ is.
4. Let $(P, \le)$ be a poset. By Exercise 3, if $A \subset P$, A is itself a poset so it is meaningful
to discuss the least element or the greatest element of A. Let $x, y \in P$. A lower
bound of x, y is $z \in P$ such that $z \le x$ and $z \le y$. Let LB(x, y) be the set of lower
bounds of x, y. Similarly, let $UB(x, y) = \{z \mid x \le z \text{ and } y \le z\}$ be the set of upper
bounds of x, y. Show that the greatest lower bound $x \wedge y$ is literally the greatest
element of LB(x, y) in the sense that one exists if and only if the other does and
then they are equal. Similarly, show that $x \vee y$ is the least element of UB(x, y).
5. By Propositions 5 and 6, if $(P, \le)$ is a meet-semilattice with greatest element 1,
then $(P, \wedge, 1)$ is a monoid and two special properties hold:
Commutativity: $x \wedge y = y \wedge x$.
Idempotency: $x \wedge x = x$.
Conversely, show that if $(P, \circ, e)$ is any monoid in which commutativity and
idempotency hold ($x \circ y = y \circ x$, $x \circ x = x$) then $(P, \le)$ is a meet-semilattice
with greatest element if $x \le y$ is defined to mean $x \circ y = x$. In fact, show that
these constructions establish a bijection between meet-semilattice-with-greatest-element structures and monoid structures on P. This summarizes by saying "a
meet-semilattice with greatest element is the same thing as a commutative idempotent monoid."
6. In the poset with Hasse diagram
[Hasse diagram containing x and y; figure not recoverable.]
show that $x \vee y$ does not exist, even though there do exist z with $x \le z$, $y \le z$.
7. In any lattice, prove the absorptive laws:
\[ x \vee (x \wedge y) = x, \qquad x \wedge (x \vee y) = x, \]
for all x, y.
8. In any distributive lattice prove that
\[ x \vee (y \wedge z) = (x \vee y) \wedge (x \vee z). \]
Why does this not follow from duality?
9. Let P = {1, 2, 3, ...}. $(P, \le)$ is totally ordered if $n \le m$ means that m is numerically
larger than n. Another important partial ordering on P is $n \mid m$ if n divides m, that
is, m = an for some integer a. Verify that $(P, \mid)$ is a lattice with
\[ n \wedge m = \text{greatest common divisor of } n, m, \]
\[ n \vee m = \text{least common multiple of } n, m, \]
\[ 1 = \text{least element (defying standard notation!)}, \]
which is not totally ordered and has no greatest element.
10. Verify the following laws in any Boolean algebra:
(i) $x'' = x$.
(ii) $x \le y$ if and only if $y' \le x'$.
(iii) (De Morgan's Laws):
\[ (x \vee y)' = x' \wedge y', \qquad (x \wedge y)' = x' \vee y'. \]
(iv) Use induction to show $(x_1 \vee \cdots \vee x_n)' = x_1' \wedge \cdots \wedge x_n'$.
11. Let $(P, \le)$ be a Boolean algebra. For $p, x, y \in P$ define
\[ \text{if } p \text{ then } x \text{ else } y = (p \wedge x) \vee (p' \wedge y). \]
Verify the following:
(i) p' = if p then 0 else 1;
(ii) $p \vee q$ = if p then 1 else q.
12. Let $f: X \to Y$ in a partially additive category. Define $\bar{f}$, if it exists, to be the least
element of $\{p \in \mathrm{Guard}(X) \mid fp = f\}$. In Pfn, show that $\bar{f}$ exists and is $\mathrm{inc}_A$ for
A = DD(f). Similarly, in Mfn, prove $\bar{f} = \mathrm{inc}_A$ for $A = \{x \in X \mid f(x) \ne \emptyset\}$. Hence,
$\bar{f}$ is a candidate for a general notion of "domain of definition" for morphisms in a
partially additive category.
13. If C is a partially additive category, the sum-ordering on C(X, Y) is $f \le g$ if
g = f + h for some h. By the same proof as that of 19 (i), $\le$ is reflexive and
transitive.
(i) Show that $(C(X, Y), \le)$ is a poset if C is Pfn or Mfn.
(ii) In general, define the extension-ordering $\sqsubseteq$ on C(X, Y) by
\[ f \sqsubseteq g \quad \text{if } f = gp \text{ for some } p \in \mathrm{Guard}(X). \]
Show that $(C(X, Y), \sqsubseteq)$ is a poset.
(iii) For $f, g \in C(X, Y)$ show that $f \sqsubseteq g \Rightarrow f \le g$.
(iv) Show that $f \sqsubseteq g \iff f \le g$ in Pfn but give an example in Mfn with $f \le g$ but
not $f \sqsubseteq g$. [Hint: For the latter, let X have one element.]
14. A partially additive semiring is $(R, \sum, \circ, 1)$, such that $(R, \sum)$ is a partially additive
monoid, $(R, \circ, 1)$ is a monoid (we write pq rather than $p \circ q$), and the following
distributive laws hold: if $(q_i \mid i \in J)$ is summable in R then for each $p, r \in R$, $(q_i p \mid i \in J)$
and $(r q_i \mid i \in J)$ are also summable and
\[ \Bigl(\sum q_i\Bigr) p = \sum (q_i p), \qquad r \Bigl(\sum q_i\Bigr) = \sum (r q_i). \]
The empty sum is not excluded, that is, 0p = 0 = p0.
Show that if C is any partially additive category with partially additive structure $\sum$ then for every object X, $(C(X, X), \sum_{X,X}, \circ, \mathrm{id}_X)$ is a partially additive semiring, where $\circ$ denotes C-composition.
15. Let $(R, \sum, \circ, 1)$ be a partially additive semiring as in Exercise 14. Define the
sum-ordering $\le$ on R as in Exercise 13. Verify that the center C of $(R, \sum, \circ, 1)$
defined by
\[ C = \{p \in R \mid \text{there exists } p' \in R \text{ with } p + p' = 1,\ pp' = 0 = p'p\} \]
is a Boolean algebra with order $\le$. [Hint: Check that all results culminating with
Theorem 22 go through unchanged.]
16. Refer to Exercises 14 and 15 for terminology. The unit interval of a partially
additive semiring consists of all x with $x \le 1$. The center is always a subset of
the unit interval and they are equal in Pfn(X, X) and Mfn(X, X). The following
develops an example with a trivial center but a large unit interval. Let a < b be
real numbers, let [a, b] denote the closed interval $\{x \mid a \le x \le b\}$, and let R be the
set of all functions $f: [a, b] \to [a, b]$ which are monotone, that is, if $x \le y$ then
$f(x) \le f(y)$. (Thus, a function is monotone if and only if its graph is never decreasing.) We assume the reader to be familiar with the fact that every subset of [a, b]
has a supremum.
(i) Show that $(R, \sum, \circ, 1)$ is a partially additive semiring if $(\sum f_i)(x) = \bigvee f_i(x)$, the
supremum of the $f_i(x)$, $(f \circ g)(x) = f(x) \wedge g(x)$ = minimum of f(x), g(x), and 1
is the identity function 1(x) = x. Show also that the empty sum 0 is the
function 0(x) = 0.
(ii) Show that the center of $(R, \sum, \circ, 1)$ is {0, 1}.
(iii) Show that the unit interval of R is infinite.
17. Let (M, 0, e) be any monoid. For the partially additive category FwR(M,o,e) of
Exercise 3.2.11 show that Guard(X) is the set of all incA (using the notation of
Exercise 2.1.11) with A a subset of X.
18. In any partially additive category, show that for $p \in \mathrm{Guard}(X)$, $f: X \to X$, both
while p do f and repeat f until p have form $h^\dagger$ for appropriate $h: X \to X + X$.
19. Let V be a vector space and let P be the set of all subspaces of V. By Exercise 3,
$(P, \subset)$ is a poset if $\subset$ denotes subset inclusion. The zero subspace {0} is the least
element of P and V itself is the greatest element.
(i) Prove that $A \cap B$ is a subspace for $A, B \in P$ and conclude that $A \cap B$ is the
infimum.
(ii) While $A \cup B$ need not be a subspace if A, B are, $A \vee B$ exists and is the linear
span of $A \cup B$. Verify this.
It follows that $(P, \subset)$ is a lattice, the lattice of subspaces of V.
Notes and References for Chapter 3
Partially additive monoids and categories were introduced by the authors in
"Partially-additive categories and flow-diagram semantics," Journal of Algebra, 62,
1980, pp. 203-227. Exercise 3.1.1 is due to M. E. Steenstrup.
The idea of the iteration as a construction assigning a morphism of the form
$f^\dagger: X \to Y$ to one of the form $f: X \to X + Y$ is due to C. C. Elgot, "Monadic
computation and iterative algebraic theories," in Proceedings of Logic Colloquium '73
(H. E. Rose and J. C. Shepherdson, Eds.), North-Holland, Amsterdam, 1975.
Other mathematical structures have subsets which are Boolean algebras. For any
distributive lattice with least and greatest elements, the subset of elements which have
a complement is a Boolean algebra. We leave it as an exercise to verify that this is a
special case of Exercise 3.3.15 (ignore the limit axiom which plays no role, define
summable = all but finitely many are the least element, sum = supremum, composition = infimum). For those familiar with a little ring theory, another well-known
result is that in a commutative ring the subset of all p with pp = p is a Boolean
algebra. While this is not formally a consequence of Exercise 15, it is interesting to
note that if p + p' = 1, pp' = 0 = p'p in a ring then, as p' = 1 - p, p(1 - p) = 0 so
that pp = p. Hence, one suspects there is a common thread.
Much has been written about Boolean algebras. The reader may enjoy the earlier
sections of P. R. Halmos, Lectures on Boolean Algebras, Van Nostrand, 1967.
The material on Guard(X) and, more generally, on the center of a partially additive semiring is adapted from E. G. Manes and D. B. Benson, "The inverse semigroup
of a sum-ordered semiring," Semigroup Forum, 31, 1985, pp. 129-152.
CHAPTER 4
Assertion Semantics
4.1 Assertions and Preconditions
4.2 Partial Correctness
4.3 Total Correctness
In the introductory Section 4.1 we informally define partial correctness assertions and notions relating to weakest preconditions with the Pascal fragment
of Section 1.2 in mind. Here, we state a number of well-known properties
and proof rules whose truth is intuitively evident.
In keeping with the spirit of this book we must break with current custom
in expositions of assertion semantics by emphasizing the underlying mathematical framework without choosing any specific programming language. In
particular, the "program state" upon which the informal definitions of Section 4.1 are built is not available in the general setting. We show that the
theory of guards of Section 3.3 allows us to generalize a number of properties
of partial correctness. We then introduce the notion of "kernel-domain decomposition" and show in Sections 4.2 and 4.3 that the remaining concepts
and results of Section 4.1 can then be formulated and established in any
partially additive category in which each morphism has a kernel-domain
decomposition.
4.1 Assertions and Preconditions
In this book, our stress is on denotational semantics: given a program S, we
associate with it a denotation which is a morphism f: X -+ Y that relates the
state before the computation to the state (or states) after the computation.
However, another approach to program semantics emphasizes the preconditions which must be met before a program is used, and the postconditions
which can be guaranteed to hold thereafter. For example, a program G to
compute the greatest common divisor of two numbers might only work if
both numbers are positive. The precondition might thus be stated as $x = x_0 > 0$, $y = y_0 > 0$, where $x$ and $y$ are variables, and $x_0$ and $y_0$ are specific
values. On exit, we might not care about the final values of the variables $x$
and $y$, but want to assert that $z$ holds the desired result $\gcd(x_0, y_0)$ of the
computation. We could write this as
$$\{x = x_0 > 0,\ y = y_0 > 0\}\ G\ \{z = \gcd(x_0, y_0)\}. \tag{1}$$
Note that in this formulation, "all bets are off" as to how G will perform if
the precondition is not met. In fact, in this "assertion semantics" that goes
back to Floyd and Hoare, 1 is even weaker than our exposition so far sounds,
for it is to be interpreted as a partial correctness specification, asserting only
that "if the precondition is met and if G thereafter halts, then the postcondition will be met." A total correctness specification would include the
stronger claim that the precondition guarantees the eventual termination of
G's computation.
In general, the precondition and postcondition need not specify more than
a few variables used in the program; the idea is that to check that a program
is correct, we often need only check the processing of a few key variables. For
an account of the practicality of this approach, describing a methodology
whereby programs and their specifications are developed together with (possibly informal) correctness proofs in a process of stepwise refinement, see the
text by Alagic and Arbib cited in the Chapter 1 notes. Here our task is to
reconcile our denotational semantics with the use of assertion semantics, based on the use of assertions as preconditions and postconditions in
program specifications, a methodology especially associated with R. Floyd,
C. A. R. Hoare, and E. Dijkstra. In the rest of this section, we provide a semiformal introduction to assertion semantics based on the Pascal fragment of
Section 1.2. We then embed it in partially additive semantics in the following
two sections where no specific language is in the picture.
2 Definition. A partial correctness specification is a structure of the form
$\{\alpha\}\ S\ \{\beta\}$ where $\alpha$ and $\beta$ are tests (the precondition and postcondition, respectively) and $S$ is a statement of the programming language. This is regarded as
asserting that "If the program state initially satisfies $\alpha$ and if execution of $S$
terminates, then the program state upon termination satisfies $\beta$."
3 Definition. The weakest liberal precondition operator, $\mathrm{wlp}(S, \beta)$, where $S$ is a
statement and $\beta$ is a test, defines a new precondition: "$\mathrm{wlp}(S, \beta)$ is satisfied by
any initial state with the property that, if $S$ terminates, it does so in a state
satisfying $\beta$." Thus, $\mathrm{wlp}(S, \beta)$ also holds for all initial states from which $S$ does
not terminate. Letting $P \Leftrightarrow Q$ be logical equivalence (i.e., $(P \Rightarrow Q) \wedge (Q \Rightarrow P)$: $P$
is true if and only if $Q$ is true) we clearly have the following:
4 Observation. $\{\alpha\}\ S\ \{\beta\} \Leftrightarrow (\alpha \Rightarrow \mathrm{wlp}(S, \beta))$.
By contrast with these partial correctness assertions (the word "liberal" in
3 is the sign of this partialness), total correctness assertions insist that $S$ halts.
5 Definition. The weakest precondition operator $\mathrm{wp}(S, \beta)$, where $S$ is a statement and $\beta$ is a test, defines a new precondition: "$\mathrm{wp}(S, \beta)$ is true of an initial
state from which $S$ terminates and does so in a state satisfying $\beta$."
The total correctness version of 2 is then
$$\alpha \Rightarrow \mathrm{wp}(S, \beta) \tag{6}$$
which asserts that precondition $\alpha$ guarantees that $S$ will terminate and will do
so in a state satisfying $\beta$.
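For a finite state space, Definitions 3 and 5 can be tried out directly. The following is a minimal sketch, assuming that a deterministic statement is modelled as a Python dict on states (a missing key stands for nontermination) and that a test is the set of states satisfying it. The names `wlp`, `wp`, `holds` and the sample statement `S` are illustrative and not part of the text.

```python
def wlp(S, beta, states):
    """States from which S, if it terminates, ends in a state satisfying beta."""
    return {s for s in states if s not in S or S[s] in beta}

def wp(S, beta, states):
    """States from which S terminates in a state satisfying beta."""
    return {s for s in states if s in S and S[s] in beta}

def holds(alpha, S, beta):
    """Partial correctness {alpha} S {beta}."""
    return all(S[s] in beta for s in alpha if s in S)

states = set(range(6))
S = {0: 1, 1: 2, 2: 3, 4: 5}   # hypothetical statement; diverges from 3 and 5
beta = {2, 3}
alpha = {1, 2, 3}

print(sorted(wlp(S, beta, states)))
print(sorted(wp(S, beta, states)))
print(holds(alpha, S, beta), alpha <= wlp(S, beta, states))
```

In this model Observation 4 says that `holds(alpha, S, beta)` agrees with the inclusion `alpha <= wlp(S, beta, states)`. States 3 and 5, from which `S` does not terminate, belong to `wlp` but not to `wp`.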
The ultimate objective is to provide useful assertions of the form 2 and 5
when S describes a (perhaps complex) algorithm. While we make no attempt
to be complete, we mention the following rules which have been found in
practice to be important in tailoring assertions to statements and we encourage the reader to work the exercises.
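7 Proof Rule. If $\alpha \Rightarrow \alpha_1$, $\{\alpha_1\}\ S\ \{\beta_1\}$ and $\beta_1 \Rightarrow \beta$ then $\{\alpha\}\ S\ \{\beta\}$.
8 Proof Rule (Composition Rule). If $\{\alpha\}\ R\ \{\beta\}$, $\{\beta\}\ S\ \{\gamma\}$ then $\{\alpha\}$ begin R; S
end $\{\gamma\}$.
9 Proof Rule (Conditional Rule). If $\{\alpha \wedge B\}\ R\ \{\beta\}$ and $\{\alpha \wedge \neg B\}\ S\ \{\beta\}$ then
$\{\alpha\}$ if B then R else S $\{\beta\}$.
10 Proof Rule (Iteration Rule). If $\{\alpha \wedge B\}\ S\ \{\alpha\}$ then
$\{\alpha\}$ while B do S $\{\alpha \wedge \neg B\}$.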
For example, Proof Rule 9 makes sense because, if we have precondition $\alpha$
satisfied before executing if B then R else S, then (since tests do not change
the value of variables), we will have that precondition $\alpha \wedge B$ holds if we take
the R path, while precondition $\alpha \wedge \neg B$ will hold if we take the S path. In
either case, we are guaranteed that $\beta$ will hold if and when the computation
terminates.
In Proof Rule 10, $\alpha$ is what Floyd calls a loop invariant. It is a property of
the program state that remains unchanged no matter how many times we go
round the loop of while B do S, as long as it holds when we first enter the
loop.
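As an illustration of Proof Rule 10 (a sketch chosen for this purpose, not taken from the text), the loop below computes quotient and remainder by repeated subtraction. The invariant $\alpha$ is `a == q*b + r and r >= 0`, the test B is `r >= b`, and the function name `divide` and the use of Python `assert` statements to check the assertions at run time are assumptions of the example.

```python
def divide(a, b):
    """Quotient and remainder of a >= 0 by b > 0 via repeated subtraction."""
    assert a >= 0 and b > 0                  # precondition
    q, r = 0, a
    assert a == q * b + r and r >= 0         # invariant holds on entry
    while r >= b:                            # test B
        q, r = q + 1, r - b
        assert a == q * b + r and r >= 0     # invariant preserved by the body
    assert a == q * b + r and 0 <= r < b     # invariant and not B on exit
    return q, r

print(divide(17, 5))
```

The final assertion is exactly the postcondition $\alpha \wedge \neg B$ promised by the rule; termination is a separate matter that the rule does not address.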
In the next section, we shall see how to interpret partial correctness
specifications $\{\alpha\}\ S\ \{\beta\}$ in partially additive semantics and then rigorously
prove the above proof rules in that setting. We shall also prove that the
weakest liberal precondition operator satisfies analogues of the following
properties.
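11 Property. $\mathrm{wlp}(\text{begin } S_1; S_2 \text{ end}, \beta) = \mathrm{wlp}(S_1, \mathrm{wlp}(S_2, \beta))$.
12 Property. $\mathrm{wlp}(S, \mathit{true}) = \mathit{true}$.
13 Property. $\mathrm{wlp}(S, \beta_1 \wedge \beta_2) = \mathrm{wlp}(S, \beta_1) \wedge \mathrm{wlp}(S, \beta_2)$.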
The weakest precondition operator (total correctness) satisfies the following properties.
14 Property. $\mathrm{wp}(S, \mathit{false}) = \mathit{false}$.
15 Property. $\mathrm{wp}(S, \beta_1 \wedge \beta_2) = \mathrm{wp}(S, \beta_1) \wedge \mathrm{wp}(S, \beta_2)$.
The composition law expressed by
16 Property. $\mathrm{wp}(\text{begin } S_1; S_2 \text{ end}, \beta) = \mathrm{wp}(S_1, \mathrm{wp}(S_2, \beta))$
is true for deterministic semantics (i.e., the semantics of $S_1$, $S_2$ are partial
functions) but becomes problematic in the nondeterministic case. We refer
the reader to Exercise 6. A mathematically precise insight into 16 is provided
in Theorem 4.3.8 below.
EXERCISES FOR SECTION 4.1
1. Verify that
$\{x = x_0 \ge 0\}$
begin y := 0;
  while x > 1 do
    begin x := x - 2;
      y := y + 1
    end
end
$\{y = x_0 \text{ div } 2\}$
by first verifying that
$\{x = x_0 > 1,\ y = y_0\}$ begin x := x - 2; y := y + 1 end $\{x = x_0 - 2 \ge 0,\ y = y_0 + 1\}$
and then using Proof Rules 7 through 10.
2. Verify that $\{\mathit{false}\}\ S\ \{\beta\}$ is true for any $S$ and $\beta$.
3. Prove wp(n := n*n, {n > 0}) = true (applied to integers).
4. For odd(n), the predicate which is true of an integer just in case it is odd, verify that
wp(while $\neg$odd(n) do n := n div 2, odd(n)) = true.
5. Use the semantic equivalence
repeat S until B = begin S; while $\neg$B do S end
and 8 and 10 to infer a suitable proof rule for repeat S until B.
6. Let the semantics of $S_1$, $S_2$ be multifunctions. Show that 16 fails if the semantics of
begin $S_1$; $S_2$ end is defined using the composition of Mfn but holds if the ANMfn
composition 2.1.5 is used.
4.2 Partial Correctness
For the balance of this chapter we work in a partially additive category C.
No additional axioms are required to give a definition capturing a suitable
interpretation of $\{\alpha\}\ S\ \{\beta\}$ and to formulate and prove the corresponding
Proof Rules 7 through 10 of Section 4.1. Thereafter, kernel-domain decompositions will be introduced to define a suitable weakest liberal precondition
operator.
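Our formulation of $\{\alpha\}\ S\ \{\beta\}$ will rest on our theory of guards developed
in Section 3.3. In Pfn, we may associate $S$ with its denotation $f: X \to Y$, $\alpha$
with a total function $X \to \{\mathit{true}, \mathit{false}\}$, and $\beta$ with a total function $Y \to \{\mathit{true}, \mathit{false}\}$. But we might just as well associate $\alpha$ with the guard $\mathrm{inc}_A: X \to X$
where $A = \{x \mid \alpha(x) = \mathit{true}\}$, and $\beta$ with the guard $\mathrm{inc}_B: Y \to Y$ where $B = \{y \mid \beta(y) = \mathit{true}\}$. We
note that $\{\alpha\}\ S\ \{\beta\}$ can then be reexpressed in either of two equivalent ways:
$$\mathrm{inc}_B \cdot f \cdot \mathrm{inc}_A = f \cdot \mathrm{inc}_A \tag{1}$$
which says that if $\alpha(x)$ is true ($\mathrm{inc}_A(x)$ is defined) and $f(x)$ is defined, then
$\beta(f(x))$ is true ($\mathrm{inc}_B(f(x)) = f(x)$). Or,
$$\mathrm{inc}_{B'} \cdot f \cdot \mathrm{inc}_A = 0 \tag{2}$$
which says that if $\alpha(x)$ is true and $f(x)$ is defined, then it is not the case that
$\beta(f(x)) = \mathit{false}$.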
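The two formulations can be compared on a small instance of Pfn. This is a sketch under the assumption that a partial function is represented by a Python dict whose keys form its domain of definition, so that the zero morphism is the empty dict; the helpers `then` and `inc` and the sample data are hypothetical names introduced for the illustration.

```python
def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inc(A):
    """Guard inc_A: the partial identity defined exactly on A."""
    return {x: x for x in A}

Y = {"a", "b", "c"}
f = {1: "a", 2: "b", 4: "c"}          # f: X -> Y, undefined at 3
A, B = {1, 2, 3}, {"a", "b"}

lhs = then(then(inc(A), f), inc(B))         # inc_B . f . inc_A
rhs = then(inc(A), f)                       # f . inc_A
zero = then(then(inc(A), f), inc(Y - B))    # inc_B' . f . inc_A
print(lhs == rhs, zero == {})
```

Enlarging `A` to include 4, which `f` sends outside `B`, makes both formulations fail together.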
To generalize this to an arbitrary partially additive category C, the reader
should recall the definition of Guard(X) from Section 3.3 and our proof that
Guard(X) was a Boolean algebra. We first make the following observation:
3 Observation. Given $f: X \to Y$ in C, $p \in \mathrm{Guard}(X)$, $q \in \mathrm{Guard}(Y)$,
$$qfp = fp \iff q'fp = 0.$$
PROOF. If $qfp = fp$, then $q'fp = q'qfp = 0$. If $q'fp = 0$, then $fp = (q + q')fp = qfp$. □
It is then clear that the following generalizes the informal semantics we offered for
$\{\alpha\}\ S\ \{\beta\}$ in Section 4.1.
4 Definition. Given $f: X \to Y$ in C, $p \in \mathrm{Guard}(X)$, $q \in \mathrm{Guard}(Y)$, we write
$$\{p\}\, f\, \{q\}$$
if either of the equivalent conditions $qfp = fp$ or $q'fp = 0$ holds.
We now state and prove analogues of Proof Rules 4.1.7-10.
5 Proposition. Given $p \le p_1 \in \mathrm{Guard}(X)$, $q_1 \le q \in \mathrm{Guard}(Y)$, and $f: X \to Y$, if
$\{p_1\}\, f\, \{q_1\}$ then $\{p\}\, f\, \{q\}$.
PROOF. We have $p_1 p = p_1 \wedge p = p$ and $q_1 \vee q = q$. By De Morgan's Law
(Exercise 3.3.10), $q' = q' q_1'$. Thus,
$$q'fp = q' q_1' f p_1 p = q'\, 0\, p = 0.$$ □
6 Proposition. The composition rule holds. If $\{p\}\, f\, \{q\}$ and $\{q\}\, g\, \{r\}$ with
$f: X \to Y$, $g: Y \to Z$ then $\{p\}\, gf\, \{r\}$.
PROOF. $rgfp = rg(qfp) = (rgq)fp = gqfp = gfp$. □
7 Proposition. The conditional rule holds. If $\{p \wedge q\}\, f\, \{r\}$ and $\{p \wedge q'\}\, g\, \{r\}$
then $\{p\}$ if $q$ then $f$ else $g$ $\{r\}$.
PROOF. We recall from 3.3.23 that for $f, g: X \to Y$, $q \in \mathrm{Guard}(X)$,
$$\text{if } q \text{ then } f \text{ else } g = fq + gq'.$$
Since $rfqp = fqp$, $rgq'p = gq'p$ are given,
$$r(fq + gq')p = rfqp + rgq'p = fqp + gq'p = (fq + gq')p.$$ □
8 Proposition. The iteration rule holds. Given $p, q \in \mathrm{Guard}(X)$, $f: X \to X$, if
$\{p \wedge q\}\, f\, \{q\}$ then $\{q\}$ while $p$ do $f$ $\{p' \wedge q\}$.
PROOF. Recall from 3.3.23 that
$$\text{while } p \text{ do } f = \sum_{n=0}^{\infty} p'(fp)^n.$$
We are given $q'fqp = 0$. It suffices to prove that $(p'q)'p'(fp)^n q = 0$ for all
$n \ge 0$. Using De Morgan's Law and the formula for supremum of 3.3.22,
$(p'q)' = p \vee q' = p + p'q'$, so
$$(p'q)'p'(fp)^n q = (p + p'q')(p'(fp)^n q) = p'q'(fp)^n q.$$
For $n = 0$, $p'q'q = 0$. As $q'fp = q'fp(q + q') = q'fpq'$, for $n > 0$, we have
$$p'q'(fp)^n q = p'(q'fp)(fp)^{n-1} q = \cdots = p'q'(fpq')^n q$$
which ends in $q'q$ and so is 0. □
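The series for while $p$ do $f$ can be evaluated term by term in Pfn when $X$ is finite. The sketch below assumes the dict representation of partial functions (sum of partial functions with disjoint domains is dict union) and cuts the series off after `max_terms` summands; `then`, `inc`, `while_do` and the sample loop are names invented for this illustration.

```python
def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inc(A):
    """Guard inc_A: the partial identity defined exactly on A."""
    return {x: x for x in A}

def while_do(P, f, X, max_terms=100):
    """Sum of the terms p'(fp)^n for n = 0, 1, ..., with p = inc_P on X."""
    p, p_neg = inc(P), inc(set(X) - set(P))
    result, power = {}, inc(X)                  # power holds (fp)^n, starting at n = 0
    for _ in range(max_terms):
        term = then(power, p_neg)               # p'(fp)^n
        assert not (set(term) & set(result))    # summands have disjoint domains
        result.update(term)
        power = then(power, then(p, f))         # (fp)^(n+1)
        if not power:                           # all later terms are zero
            break
    return result

X = set(range(10))
P = {x for x in X if x > 1}                     # test: x > 1
f = {x: x - 2 for x in X if x >= 2}             # body: x := x - 2
w = while_do(P, f, X)
print(w)
```

With $q$ the guard of the even numbers, the body preserves $q$ on $p \wedge q$, and the computed loop sends every even input to 0, which satisfies $p' \wedge q$ as Proposition 8 predicts.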
To define the weakest liberal precondition operator we introduce kernel-domain decompositions. The idea is very intuitive. For $f: X \to Y$ in Pfn, $X$
decomposes as the disjoint union of two subsets $K$, $D$ where $D = DD(f)$ and
$K = D'$. Thus, if $i: K \to X$, $i(x) = x$ and $j: D \to X$, $j(x) = x$ are the inclusion
functions, we have that
$$K \xrightarrow{\ i\ } X \xleftarrow{\ j\ } D$$
is a coproduct diagram such that $fi = 0$ whereas $fj$ is total. Noting that total
morphisms were defined in any category with zero morphisms (2.2.21 asserted that $f$ is total if $t \ne 0$ always implies that $ft \ne 0$) we have motivated the
following definition.
9 Definition. For $f: X \to Y$ in the partially additive category C, a kernel-domain decomposition of $f$ is $(K, i, D, j)$ such that
(i) $K \xrightarrow{\ i\ } X \xleftarrow{\ j\ } D$ is a coproduct.
(ii) $fi = 0$.
(iii) $fj$ is total.
C has kernel-domain decompositions if every morphism has a kernel-domain
decomposition.
10 Example. Pfn and Mfn have kernel-domain decompositions. For $f \in$ Pfn
set $D = DD(f)$, $K = \{x \mid f(x) \text{ is not defined}\}$ as discussed above. Similarly, in
Mfn, $D = \{x \mid f(x) \ne \emptyset\}$, $K = \{x \mid f(x) = \emptyset\}$.
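For Pfn the decomposition of Example 10 is easy to compute on a finite set. The following sketch assumes partial functions are Python dicts keyed by their domain of definition; `kernel_domain`, `then` and the sample `f` are hypothetical names for the illustration.

```python
def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def kernel_domain(f, X):
    """(K, i, D, j) for a partial function f defined on part of the finite set X."""
    D = set(f)                    # DD(f)
    K = set(X) - D                # where f is not defined
    i = {x: x for x in K}         # inclusion K -> X
    j = {x: x for x in D}         # inclusion D -> X
    return K, i, D, j

X = {1, 2, 3, 4}
f = {1: "a", 2: "b", 4: "c"}      # undefined at 3
K, i, D, j = kernel_domain(f, X)
print(K, then(i, f), set(then(j, f)) == D)
```

Here `then(i, f)` is the empty dict, i.e. $fi = 0$, and `then(j, f)` is defined on all of `D`, i.e. $fj$ is total.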
While not obvious from the definition, two kernel-domain decompositions
of a morphism are unique up to isomorphism. A proof is outlined in Exercises 3, 8, 10, and 11.
The motivation for "domain" in "kernel-domain" is clear. We have chosen
"kernel" because K indeed does act as a kernel in the sense of algebra and
category theory. See Exercises 3-7.
For the balance of this chapter we assume that our partially additive
category C has kernel-domain decompositions.
To proceed further, we introduce some shorthand for Pfn. For $f: X \to Y$,
set
$$K(f) = \{x \in X \mid f(x) \text{ is not defined}\}$$
and define two useful guards by
$$d(f) = \mathrm{inc}_{DD(f)}, \qquad k(f) = \mathrm{inc}_{K(f)}.$$
We can use $k$ to define wlp in Pfn. For, given $f: X \to Y$ and $q \in \mathrm{Guard}(Y)$, we
would like to define $\mathrm{wlp}(f, q)$ to be the guard $\mathrm{inc}_A$ where
$$A = \{x \mid f(x), \text{ if defined, satisfies } q\} = \{x \mid q'f(x) \text{ is not defined}\}.$$
We thus have $\mathrm{wlp}(f, q) = k(q'f)$.
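The formula $\mathrm{wlp}(f, q) = k(q'f)$ can be evaluated literally in Pfn. The sketch below again assumes the dict representation of partial functions; the helper names and the sample data are illustrative only.

```python
def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inc(A):
    """Guard inc_A: the partial identity defined exactly on A."""
    return {x: x for x in A}

def k(f, X):
    """Guard k(f) = inc_K(f), where K(f) is the subset of X on which f is undefined."""
    return inc(set(X) - set(f))

def wlp(f, B, X, Y):
    """wlp(f, q) = k(q'f) for the guard q = inc_B on Y."""
    q_neg = inc(set(Y) - set(B))
    return k(then(f, q_neg), X)

X, Y = {1, 2, 3, 4}, {"a", "b", "c"}
f = {1: "a", 2: "b", 4: "c"}          # undefined at 3
print(sorted(wlp(f, {"a", "b"}, X, Y)))
```

The point 3, where `f` is undefined, belongs to the resulting guard, as the word "liberal" requires.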
Our next task then is to show how to define $d(f)$ and $k(f)$ in any partially
additive category with kernel-domain decompositions.
11 Definition. A kernel-domain system for $f: X \to Y$ is
$$K \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} D$$
where $(K, i, D, j)$ is a kernel-domain decomposition of $f$ and $P$ and $Q$ are the
quasi projections of 3.2.6:
$$Pi = \mathrm{id}_K, \quad Pj = 0, \quad Qi = 0, \quad Qj = \mathrm{id}_D.$$
By 3.2.16 we have
$$iP + jQ = \mathrm{id}_X.$$
Since $(iP)(jQ) = i(Pj)Q = 0$ and $(jQ)(iP) = 0$ similarly, it follows from
3.3.13 that $iP, jQ \in \mathrm{Guard}(X)$ and $iP = (jQ)'$.
We write $d(f)$ for $jQ$ and $k(f)$ for $iP$. (These guards do not depend on the
choice of kernel-domain system as is proved in Theorem 13 below.) The
weakest liberal precondition operator is then defined in terms of $k(f)$ in the
expected way:
12 Definition. Given $f: X \to Y$ and $q \in \mathrm{Guard}(Y)$ define $\mathrm{wlp}(f, q) \in \mathrm{Guard}(X)$
by
$$\mathrm{wlp}(f, q) = k(q'f).$$
13 Theorem. Let
$$K \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} D$$
be a kernel-domain system for $f: X \to Y$. Then $r = iP$ is the only guard
$r \in \mathrm{Guard}(X)$ satisfying the conditions
(i) $fr = 0$, and
(ii) if $h: W \to X$ is such that $fh = 0$ then $rh = h$.
(Hence $k(f) = iP$ in Definition 11 depends only on $f$ and not on the particular kernel-domain system since (i) and (ii) are solely in terms of $f$
and, similarly, $d(f)$ depends only on $f$ since it was observed in 11 that
$d(f) = (k(f))'$.)
PROOF. We first show $r = iP$ satisfies (i) and (ii). As $fi = 0$ by the definition
of a kernel-domain decomposition 9, $fiP = 0$. For (ii), if $fh = 0$ then as
$iP + jQ = \mathrm{id}_X$ we have
$$0 = fh = f(iP + jQ)h = fiPh + fjQh$$
so that $fjQh = 0$ by 3.1.8. By the definition of a kernel-domain decomposition, $fj$ is total so that $Qh = 0$. But then
$$h = (iP + jQ)h = iPh$$
as desired.
For the uniqueness statement, let $r$ satisfy (i) and (ii) and define
$$A = \{t \in \mathrm{Guard}(X) : ft = 0\}.$$
Then as $r$ satisfies (i), $r \in A$. Furthermore, if $t \in A$ then setting $h = t$ in (ii),
$rt = t$, that is, $t \le r$ in the Boolean algebra $\mathrm{Guard}(X)$. This shows that $r$ is the
greatest element of $A$, and a greatest element is unique. □
The above proof leads to an order-theoretic characterization of $\mathrm{wlp}(f, q) \in \mathrm{Guard}(X)$:
14 Corollary. For $f: X \to Y$, $q \in \mathrm{Guard}(Y)$, the sets
$$A = \{t \in \mathrm{Guard}(X) : q'ft = 0\}, \qquad B = \{t \in \mathrm{Guard}(X) : qft = ft\}$$
are equal and have $\mathrm{wlp}(f, q)$ as greatest element.
PROOF. If $q'ft = 0$, $ft = (q + q')ft = qft$ whereas, conversely, if $qft = ft$ then
$q'ft = q'(qft) = (q'q)ft = 0$. Hence, $A = B$. That $\mathrm{wlp}(f, q)$ is the greatest element of $A$ is immediate from Definition 12 and the proof of Theorem 13 (with
$q'f$ instead of $f$). □
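On a small set, Corollary 14 can be checked by brute force over all guards. The sketch assumes that in Pfn the guards on $X$ are exactly the partial identities $\mathrm{inc}_T$ for subsets $T$ of $X$, and that partial functions are dicts; all names and data below are illustrative.

```python
from itertools import combinations

def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inc(A):
    """Guard inc_A: the partial identity defined exactly on A."""
    return {x: x for x in A}

def subsets(X):
    xs = sorted(X)
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

X, Y = {1, 2, 3, 4}, {"a", "b", "c"}
f = {1: "a", 2: "b", 4: "c"}          # hypothetical f, undefined at 3
Bq = {"a", "b"}                        # q = inc_Bq

def f_after(T):
    return then(inc(T), f)             # the composite ft for t = inc_T

A_set = [T for T in subsets(X) if then(f_after(T), inc(Y - Bq)) == {}]
B_set = [T for T in subsets(X) if then(f_after(T), inc(Bq)) == f_after(T)]
greatest = max(A_set, key=len)
print(A_set == B_set, sorted(greatest))
```

The largest member of the family corresponds to $\mathrm{wlp}(f, q)$, and every other member is contained in it.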
We are now able to establish the fundamental properties of wlp, analogous to 4.1.4 and 11-13.
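15 Proposition. If $f: X \to Y$, $p \in \mathrm{Guard}(X)$, $q \in \mathrm{Guard}(Y)$ then $\{p\}\, f\, \{q\}$ if and
only if $p \le \mathrm{wlp}(f, q)$.
PROOF. Let $\bar{p} = \mathrm{wlp}(f, q)$. If $\{p\}\, f\, \{q\}$ then $qfp = fp$ so by 14 $p \le \bar{p}$. Conversely, if $p \le \bar{p}$, $qfp = qf(\bar{p}p) = (qf\bar{p})p = (f\bar{p})p = f(\bar{p}p) = fp$ so $\{p\}\, f\, \{q\}$. □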
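16 Proposition. Given $f: X \to Y$, $g: Y \to Z$, $r \in \mathrm{Guard}(Z)$,
$$\mathrm{wlp}(gf, r) = \mathrm{wlp}(f, \mathrm{wlp}(g, r)).$$
PROOF. Let $\bar{q} = \mathrm{wlp}(g, r)$, $\bar{p} = \mathrm{wlp}(f, \bar{q})$. By 14 we need only show that $\bar{p}$ is the
greatest element of $A = \{p \in \mathrm{Guard}(X) \mid rgfp = gfp\}$. To see that $\bar{p} \in A$,
$$rgf\bar{p} = rg(\bar{q}f\bar{p}) = (rg\bar{q})f\bar{p} = (g\bar{q})f\bar{p} = g(\bar{q}f\bar{p}) = gf\bar{p}.$$
If $p \in A$, then $rg(fp) = gfp$ so by 13 (ii) (with $r'g$ for $f$ and $fp$ for $h$) $\bar{q}fp = fp$.
But then by 14, $p \le \bar{p}$. □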
17 Proposition. For $f: X \to Y$, $\mathrm{wlp}(f, 1) = 1$ (more precisely, $\mathrm{wlp}(f, \mathrm{id}_Y) = \mathrm{id}_X$).
PROOF. $\mathrm{id}_X$ is the greatest element of $\{p \in \mathrm{Guard}(X) : \mathrm{id}_Y fp = fp\}$. □
18 Proposition. For $f: X \to Y$, $q_1, q_2 \in \mathrm{Guard}(Y)$,
$$\mathrm{wlp}(f, q_1 \wedge q_2) = \mathrm{wlp}(f, q_1) \wedge \mathrm{wlp}(f, q_2).$$
PROOF. Let $\bar{p}_i = \mathrm{wlp}(f, q_i)$. We must show $\bar{p}_1\bar{p}_2$ is the greatest element of
$$A = \{p \in \mathrm{Guard}(X) : q_1 q_2 fp = fp\}.$$
To see $\bar{p}_1\bar{p}_2 \in A$,
$$q_1 q_2 f \bar{p}_1 \bar{p}_2 = q_2 (q_1 f \bar{p}_1) \bar{p}_2 = q_2 (f \bar{p}_1) \bar{p}_2 = (q_2 f \bar{p}_2) \bar{p}_1 = f \bar{p}_1 \bar{p}_2.$$
Now let $p \in A$. Then $(q_1 q_2)'fp = 0$. By De Morgan's Law and 3.3.22, $(q_1 q_2)' = q_1' \vee q_2' = q_1' + q_1 q_2'$. Thus, by 3.1.8, $q_1'fp = 0$ and so, by 3, $q_1 fp = fp$.
Thus, $p \le \bar{p}_1$ by 14. Symmetrically, $p \le \bar{p}_2$ so $p \le \bar{p}_1 \wedge \bar{p}_2$. □
EXERCISES FOR SECTION 4.2
1. Establish your proof rule of Exercise 4.1.5 in a partially additive category.
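2. In a partially additive category, a diagram
$$A \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} B$$
is a direct sum system if
$$Pi = \mathrm{id}_A, \quad Qj = \mathrm{id}_B, \quad Qi = 0, \quad Pj = 0, \quad iP + jQ = \mathrm{id}_X.$$
Thus, if $X$ is the coproduct of $A$, $B$ with injections $i$, $j$ and $P$, $Q$ are the quasi
projections, a direct sum system results. Show, conversely, that given a direct sum
system as above
$$A \xrightarrow{\ i\ } X \xleftarrow{\ j\ } B$$
is a coproduct. [Hint: Use 3.2.16.]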
3. In any category with zero morphisms, $(K, i)$ is a kernel of $f: X \to Y$ if $i: K \to X$
satisfies the following:
(i) $fi = 0$.
(ii) If $t: T \to X$ satisfies $ft = 0$, there exists unique $a: T \to K$ with $ia = t$.
[Diagram: $i: K \to X$, $f: X \to Y$, $t: T \to X$, and $a: T \to K$.]
Given $f$, show that any two kernels of $f$ are isomorphic.
4. For $f: X \to Y$ in Pfn, show that $\mathrm{inc}_K: K \to X$ is a kernel of $f$ if $K = \{x \in X \mid f(x) \text{ is not defined}\}$.
5. For $f: X \to Y$ in Mfn, show that $\mathrm{inc}_K: K \to X$ is a kernel of $f$ if $K = \{x \in X \mid f(x) = \emptyset\}$.
6. For $f: X \to Y$ in Mon, show that $\mathrm{inc}_K: K \to X$ is a kernel of $f$ if $K = \{x \in X \mid f(x) = e\}$, where $e$ denotes the unit of $Y$.
7. For $f: X \to Y$ in Vect show that $\mathrm{inc}_K: K \to X$ is a kernel of $f$ if $K = \{x \in X \mid f(x) = 0\}$ is the null space of $f$.
8. For a direct sum system as in Exercise 2, show that $j: B \to X$ is a kernel of $P$ and
that $i: A \to X$ is a kernel of $Q$.
9. Let C be a nonempty category with zero morphisms. Prove that C has a zero
object if and only if every total morphism has a kernel. In particular, by Exercises
4-7, Pfn, Mfn, Mon, and Vect have zero objects.
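10. Let
$$A \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} B, \qquad \bar{A} \underset{\bar{P}}{\overset{\bar{i}}{\rightleftarrows}} X \underset{\bar{Q}}{\overset{\bar{j}}{\leftrightarrows}} \bar{B}$$
be direct sum systems in a partially additive category as in Exercise 2. Assume
that there exists an isomorphism $\alpha$,
[Diagram: triangle involving $\alpha$, $A$, $\bar{A}$ and $X$.]
Show that there exists an isomorphism $\beta$
[Diagram: triangle involving $\beta$, $B$, $\bar{B}$ and $X$.]
[Hint: First argue that
$$A \underset{\alpha\bar{P}}{\overset{i}{\rightleftarrows}} X \underset{\bar{Q}}{\overset{\bar{j}}{\leftrightarrows}} \bar{B}$$
is a direct sum system. Using $i\alpha\bar{P} + \bar{j}\bar{Q} = \mathrm{id}_X$ show $P \le \alpha\bar{P}$ and similarly
show $\alpha\bar{P} \le P$. Now use Exercises 8 and 3.]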
11. In a partially additive category, show that if $(K, i, D, j)$ is a kernel-domain decomposition of $f$ then $i: K \to X$ is a kernel of $f$. [Hint: The proof is implicit in the
construction of $\beta$ in the proof of Theorem 4.3.8 below.]
Conclude, using Exercises 10 and 3, that any two kernel-domain decompositions of $f$ are isomorphic.
12. Given $f_1, \ldots, f_n: X \to X$, $p_1, \ldots, p_n \in \mathrm{Guard}(X)$ in a partially additive category,
define
to mean
$$f_1 p_1 + \cdots + f_n p_n$$
(we assume that the sum exists). If $r \in \mathrm{Guard}(X)$ is such that
$$\{r \wedge p_i\}\, f_i\, \{r\}$$
for all $i$, prove that
$$\{r\}\ \mathrm{DO}\ \{p_1' \wedge \cdots \wedge p_n' \wedge r\}.$$
13. Let $V$ be a "value" set with at least two elements. Let $\mathrm{Pfn}_V$ be the category whose
objects are sets and whose morphisms are given by
$$\mathrm{Pfn}_V(X, Y) = \mathrm{Pfn}(X \times V, Y \times V)$$
with $X \times V$ and so forth the Cartesian product of sets, and with composition and
identities as in Pfn. For $f \in \mathrm{Pfn}_V(X, Y)$, think of $X$ as a set of input lines, $Y$ as a set
of output lines, and interpret $f(x, d) = (y, e)$ as "input value $d$ on line $x$ results in
output value $e$ on line $y$." Prove that $\mathrm{Pfn}_V$ is partially additive but show that if $v \ne v_1 \in V$ and $x \ne x_1 \in X$ then if $p \in \mathrm{Pfn}_V(X, X)$ is defined by $DD(p) = \{(v, x), (v_1, x_1)\}$
with $p(v, x) = (v, x)$, $p(v_1, x_1) = (v_1, x_1)$ then $p \in \mathrm{Guard}(X)$ but does not have a
kernel. Conclude that $\mathrm{Pfn}_V$ does not have kernel-domain decompositions.
14. Let $(M, \circ, e)$ be a monoid. Show that the partially additive category $\mathrm{FwR}_{(M, \circ, e)}$
of Exercise 3.3.17 has kernel-domain decompositions. Explain the following
slogan for this category: "All asserted truth is reliable."
4.3 Total Correctness
The weakest precondition $\mathrm{wp}(S, \beta)$ of 4.1.5 strengthens the liberal precondition $\mathrm{wlp}(S, \beta)$ of 4.1.3 by guaranteeing that computation of $S$ will terminate.
This shows that the relationship between wlp and wp in Pfn should be
$$\mathrm{wp}(f, q) = d(f) \wedge \mathrm{wlp}(f, q),$$
where $d(f) = \mathrm{inc}_{DD(f)}$, that is, "$\mathrm{wp}(f, q)$ is true of an initial state providing $f$
is defined and $\mathrm{wlp}(f, q)$ is true." To elevate the theory of weakest precondition to a partially additive category with kernel-domain decompositions,
then, we recall our definition 4.2.11 of $d(f) \in \mathrm{Guard}(X)$ for each $f: X \to Y$. We
then prove that wp satisfies the analogues of 4.1.14-15. We are also able to
characterize when the composition theorem 4.1.16 should hold. We recall
that if $f: X \to Y$ has kernel-domain system
$$K \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} D$$
then $d(f) \in \mathrm{Guard}(X)$ is defined as $jQ$.
1 Definition. For $f: X \to Y$, $q \in \mathrm{Guard}(Y)$, $\mathrm{wp}(f, q) \in \mathrm{Guard}(X)$ is defined by
$$\mathrm{wp}(f, q) = d(f) \wedge \mathrm{wlp}(f, q).$$
2 Example. In Pfn, $d(f) = \mathrm{inc}_{DD(f)}$; if $r = \mathrm{inc}_R$, $\mathrm{wp}(f, r) = \mathrm{inc}_S$ where
$S = \{x \in X \mid f(x) \text{ is defined and } f(x) \in R\}$.
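Example 2 can be reproduced by computing the meet $d(f) \wedge \mathrm{wlp}(f, r)$ directly. The sketch assumes the dict representation of partial functions, under which the meet of two guards is their composite; helper names and data are illustrative.

```python
def then(f, g):
    """Partial function 'first f, then g' (the composite gf of the text)."""
    return {x: g[f[x]] for x in f if f[x] in g}

def inc(A):
    """Guard inc_A: the partial identity defined exactly on A."""
    return {x: x for x in A}

def d(f):
    """Guard d(f) = inc_DD(f)."""
    return inc(f)

def wlp(f, R, X, Y):
    """wlp(f, r) = k(r'f) for r = inc_R."""
    return inc(set(X) - set(then(f, inc(set(Y) - set(R)))))

def wp(f, R, X, Y):
    """wp(f, r) = d(f) meet wlp(f, r); the meet of guards is composition."""
    return then(d(f), wlp(f, R, X, Y))

X, Y = {1, 2, 3, 4}, {"a", "b", "c"}
f = {1: "a", 2: "b", 4: "c"}
R = {"a", "b"}
S = {x for x in X if x in f and f[x] in R}
print(wp(f, R, X, Y) == inc(S))
```

The same functions give `wp(f, set(), X, Y) == {}` and `wp(f, Y, X, Y) == d(f)`, the Pfn instances of Propositions 3 and 5(i) below.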
The analogues of 4.1.14-15 follow quickly.
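3 Proposition. For any $f: X \to Y$, $\mathrm{wp}(f, 0) = 0$.
PROOF. Let
$$K \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} D$$
be a kernel-domain system for $f$. As $f = 0'f$, by Definition 4.2.12, $\mathrm{wlp}(f, 0) = k(0'f) = k(f) = iP$. Since $d(f) = jQ$, we have $d(f) = \mathrm{wlp}(f, 0)'$, by 4.2.11. Thus,
$$\mathrm{wp}(f, 0) = d(f) \wedge \mathrm{wlp}(f, 0) = \mathrm{wlp}(f, 0)' \wedge \mathrm{wlp}(f, 0) = 0.$$ □
4 Proposition. Given $f: X \to Y$ and $q_1, q_2 \in \mathrm{Guard}(Y)$,
$$\mathrm{wp}(f, q_1 \wedge q_2) = \mathrm{wp}(f, q_1) \wedge \mathrm{wp}(f, q_2).$$
PROOF. Recall 4.2.18 that $\mathrm{wlp}(f, q_1 \wedge q_2) = \mathrm{wlp}(f, q_1) \wedge \mathrm{wlp}(f, q_2)$, and the
fact that $a \wedge b \wedge c = (a \wedge b) \wedge (a \wedge c)$ in any lattice. Thus,
$$\begin{aligned}
\mathrm{wp}(f, q_1 \wedge q_2) &= d(f) \wedge \mathrm{wlp}(f, q_1 \wedge q_2) \\
&= d(f) \wedge \mathrm{wlp}(f, q_1) \wedge \mathrm{wlp}(f, q_2) \\
&= (d(f) \wedge \mathrm{wlp}(f, q_1)) \wedge (d(f) \wedge \mathrm{wlp}(f, q_2)) \\
&= \mathrm{wp}(f, q_1) \wedge \mathrm{wp}(f, q_2).
\end{aligned}$$ □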
We record the following basic facts about d(f):
5 Proposition. For $f: X \to Y$,
(i) $\mathrm{wp}(f, \mathrm{id}_Y) = d(f)$;
(ii) $f$ is total if and only if $d(f) = \mathrm{id}_X$.
PROOF. (i) Recalling 4.2.17, $\mathrm{wp}(f, 1) = d(f) \wedge \mathrm{wlp}(f, 1) = d(f) \wedge 1 = d(f)$.
(ii) Let $p = \mathrm{wlp}(f, 0)$. Thus, $fp = 0$ by 4.2.13(i). If $f$ is total, $p = 0$. But
$d(f) = p'$ as was pointed out in the proof of Proposition 3, so $d(f) = 1$.
Conversely, if $d(f) = 1$, $p = 0$. Hence, if $ft = 0$ it follows from 4.2.13(ii) for $p$
that $t = pt = 0$, so $f$ is total. □
The reader who has worked Exercise 4.1.6 will realize that, in ANMfn, the
composition theorem $\mathrm{wp}(gf, q) = \mathrm{wp}(f, \mathrm{wp}(g, q))$ holds because when $g(f(x))$
is defined, $g(y)$ is defined for every $y \in f(x)$. This can be restated in a way more
suitable for generalization, namely, as follows: Let $i = \mathrm{inc}_{DD(g)}$. If $u$ is such
that $gu$ is total,
[Diagram: $i: DD(g) \to Y$, $g: Y \to Z$, $u: U \to Y$, and $a: U \to DD(g)$.]
then there exists unique $a$ with $ia = u$ as shown. This obviously is equivalent
to the assertion that $u(x) \subset DD(g)$ for each $x \in U$, so this is indeed equivalent
to the original principle, having used a set $U$ of $x$'s instead of just one.
This suggests the following definition.
6 Definition. For $g: Y \to Z$ in any category with zero morphisms, a totalizer
of $g$ is $(T, i)$ where $i: T \to Y$ satisfies
(i) $gi$ is total;
(ii) if $u: U \to Y$ is such that $gu$ is total, there exists unique $a: U \to T$ with
$ia = u$.
Totalizers are unique up to isomorphism. See Exercise 2.
7 Examples. By the discussion above, every morphism $g$ in ANMfn has
$\mathrm{inc}_{DD(g)}$ as totalizer; similarly, in Pfn. It is clear that this construction does not
provide a totalizer in Mfn.
The following theorem then shows that wp(gf, q) = wp(f, wp(g, q)) in Pfn
but not in Mfn. The theorem does not apply to ANMfn which is not partially
additive (but see the end of chapter notes).
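A two-step counterexample makes the failure in Mfn concrete. The sketch assumes that a multifunction is a Python dict from inputs to sets of outputs, that composition in Mfn takes the union of $g(y)$ over $y \in f(x)$, and that $\mathrm{wp}(f, Q)$ in Mfn unwinds, via Definition 1 and Example 4.2.10, to the inputs $x$ with $f(x)$ nonempty and contained in $Q$. That reading, and the names below, are assumptions of the example.

```python
def compose(g, f):
    """Mfn composite gf: x maps to the union of g(y) over y in f(x)."""
    return {x: set().union(*(g.get(y, set()) for y in ys)) for x, ys in f.items()}

def wp(f, Q):
    """Inputs x with f(x) nonempty and f(x) contained in Q."""
    return {x for x, ys in f.items() if ys and ys <= Q}

f = {0: {1, 2}}             # total: f(0) is nonempty
g = {1: {3}, 2: set()}      # g(2) is empty, so 2 lies outside DD(g)
Q = {3}

lhs = wp(compose(g, f), Q)  # wp(gf, Q)
rhs = wp(f, wp(g, Q))       # wp(f, wp(g, Q))
print(lhs, rhs)
```

The composite forgets that one branch of `f` leads to a point where `g` is undefined, so the left-hand side accepts the input 0 while the right-hand side rejects it.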
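8 Theorem. Let $g: Y \to Z$ and let
$$K \underset{P}{\overset{i}{\rightleftarrows}} Y \underset{Q}{\overset{j}{\leftrightarrows}} D$$
be a kernel-domain system for $g$. Then the following two conditions are
equivalent.
1. For all total $f: X \to Y$ and all $q \in \mathrm{Guard}(Z)$,
$$\mathrm{wp}(gf, q) = \mathrm{wp}(f, \mathrm{wp}(g, q)).$$
2. $j: D \to Y$ is a totalizer of $g$.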
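PROOF. 1 ⇒ 2. That $gj$ is total is given (Definition 4.2.9). Now suppose
$u: U \to Y$ is such that $gu$ is total.
[Diagram: $j: D \to Y$, $g: Y \to Z$, $u: U \to Y$, and $a: U \to D$.]
If $ja = u$ then $a = \mathrm{id}_D a = (Qj)a = Q(ja) = Qu$ so $a$ is unique if it exists. We
must show $jQu = u$. Now $jQ = d(g)$ by definition. Also $u$ is total by 2.2.22
so that $d(u) = \mathrm{id}_U$ by 5(ii). Thus,
$$\begin{aligned}
\mathrm{wlp}(u, jQ) &= \mathrm{id}_U \wedge \mathrm{wlp}(u, d(g)) \\
&= d(u) \wedge \mathrm{wlp}(u, d(g) \wedge \mathrm{wlp}(g, \mathrm{id}_Z)) && \text{(by 4.2.17)} \\
&= \mathrm{wp}(u, \mathrm{wp}(g, \mathrm{id}_Z)) \\
&= \mathrm{wp}(gu, \mathrm{id}_Z) && \text{(by hypothesis, as $u$ is total)} \\
&= d(gu) \wedge \mathrm{wlp}(gu, \mathrm{id}_Z) \\
&= \mathrm{id}_U && \text{(by 5(ii) and 4.2.17)}
\end{aligned}$$
But then by 4.2.13(i), $jQu = jQu\,\mathrm{id}_U = u\,\mathrm{id}_U = u$.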
2 ⇒ 1. Given total $f: X \to Y$, let
$$K_1 \underset{P_1}{\overset{i_1}{\rightleftarrows}} X \underset{Q_1}{\overset{j_1}{\leftrightarrows}} D_1$$
be a kernel-domain system for $gf$. Consider
[Diagram: commuting squares with $\beta: K_1 \to K$ and $\gamma: D_1 \to D$ satisfying $f i_1 = i\beta$ and $f j_1 = j\gamma$.]
Such $\beta$, $\gamma$ exist as follows. Since $0 = (gf)i_1 = g\,\mathrm{id}_Y f i_1 = g(iP + jQ)f i_1$, it follows from 3.1.8 that $gjQf i_1 = 0$. As $gj$ is total, $Qf i_1 = 0$. But then if $\beta$ is
defined as $Pf i_1$,
$$f i_1 = \mathrm{id}_Y f i_1 = (iP + jQ)f i_1 = iPf i_1 = i\beta.$$
To construct $\gamma$ simply use the hypothesis that $j: D \to Y$ is a totalizer of $g$, since
$gf j_1$ is total.
We then observe the following:
(i) $(iPf)j_1 = 0$. For $iPf j_1 = iPj\gamma$ and $Pj = 0$.
(ii) $(iPf)i_1$ is total. For if $iPf i_1 t = 0$ then
$$0 = iPf i_1 t = iPi\beta t = i\beta t \ (\text{as } Pi = \mathrm{id}_K) = f i_1 t,$$
hence, as $f$ is total, $i_1 t = 0$ and then $t = P_1 i_1 t = P_1 0 = 0$.
But then (i) and (ii) assert that $(D_1, j_1, K_1, i_1)$ is a kernel-domain decomposition of $iPf$. Since $iP = (jQ)'$ in $\mathrm{Guard}(Y)$ it follows from Definition 4.2.12
that
$$\mathrm{wlp}(f, jQ) = j_1 Q_1.$$
Since $d(g) = jQ$, $d(gf) = j_1 Q_1$ by Definition 4.2.11, this translates to
$$d(gf) = \mathrm{wlp}(f, d(g)). \tag{9}$$
Hence, for $q \in \mathrm{Guard}(Z)$,
$$\begin{aligned}
\mathrm{wp}(gf, q) &= d(gf) \wedge \mathrm{wlp}(gf, q) \\
&= \mathrm{wlp}(f, d(g)) \wedge \mathrm{wlp}(f, \mathrm{wlp}(g, q)) && \text{(by 9 and 4.2.16)} \\
&= \mathrm{wlp}(f, d(g) \wedge \mathrm{wlp}(g, q)) && \text{(by 4.2.18)} \\
&= d(f) \wedge \mathrm{wlp}(f, \mathrm{wp}(g, q)) && \text{(by 5, as $f$ is total)} \\
&= \mathrm{wp}(f, \mathrm{wp}(g, q)).
\end{aligned}$$ □
EXERCISES FOR SECTION 4.3
1. Let $X$, $Y$ be objects in Pfn. A guard transformer from $Y$ to $X$ is a function
$$T: \mathrm{Guard}(Y) \to \mathrm{Guard}(X)$$
satisfying the axioms
$$T(0) = 0, \qquad T(p \wedge q) = T(p) \wedge T(q), \qquad T\Bigl(\bigvee p_i\Bigr) = \bigvee T(p_i)$$
for all families $(p_i)$ where, if $p_i = \mathrm{inc}_{A_i}$, $\bigvee p_i = \mathrm{inc}_A$ for $A = \bigcup A_i$.
(i) Show that for all $f: X \to Y$ in Pfn that
$T(q) = \mathrm{wp}(f, q)$ is a guard transformer from $Y$ to $X$.
(ii) Show that if $T$ is a guard transformer from $Y$ to $X$ then $T(q) = \mathrm{wp}(f, q)$ for
some $f: X \to Y$. [Hint: $f(x) = y$ if $x \in T\{y\}$ and $f(x)$ is otherwise undefined;
you must show, among other things, that such $f$ is a well-defined partial
function.]
(iii) Show that the constructions of (i) and (ii) establish a bijection between
Pfn$(X, Y)$ and guard transformers from $Y$ to $X$. This underlies the idea,
carried out in some expository treatments, that a programming construct can
be given semantics by specifying its guard transformer.
2. Fix $g: Y \to Z$ in a category with zero morphisms. Define the morphisms of a
category whose objects are $(U, u)$ with $u: U \to Y$ for which $gu$ is total in such a way
that terminal objects are totalizers of $g$. Conclude that any two totalizers of $g$ are
isomorphic.
3. Prove that ANMfn has kernel-domain decompositions. (Warning: Since ANMfn is
not partially additive, the theory of kernel-domain systems developed in the text
does not necessarily apply.)
4. For $f: X \to Y$, prove that $d(f)$ is the least element of $\{p \in \mathrm{Guard}(X) \mid fp = f\}$.
5. For $f: X \to Y$, $g: Y \to Z$ prove that $d(gf) \le d(f)$.
6. For $f, g: X \to Y$ such that $f + g$ exists prove that $d(f + g) = d(f) \vee d(g)$. [Hint:
Use De Morgan's Law, 3.1.8, and Exercise 4.2.11.]
7. Let
$$K \underset{P}{\overset{i}{\rightleftarrows}} X \underset{Q}{\overset{j}{\leftrightarrows}} D, \qquad K_1 \underset{P_1}{\overset{i_1}{\rightleftarrows}} X \underset{Q_1}{\overset{j_1}{\leftrightarrows}} D_1$$
be kernel-domain systems for $f, f_1: X \to Y$. Prove that $d(f) \le d(f_1)$ if and only if
there exists $\alpha$ as shown:
[Diagram: $\alpha: D \to D_1$ with $j_1 \alpha = j$.]
Prove that such $\alpha$ is unique when it exists. [Hint: Use Exercise 4.2.8.]
8. Let $f, f_1: X \to Y$ with kernel-domain systems as in Exercise 7. Prove that $f \sqsubseteq f_1$
in the extension ordering of Exercise 3.3.13 if and only if $d(f) \le d(f_1)$ and "$f_1$
agrees with $f$ when both are defined," that is,
[Diagram: the square $f_1 j_1 \alpha = f j$, with $\alpha: D \to D_1$, $j_1: D_1 \to X$, $j: D \to X$, and $f, f_1: X \to Y$.]
commutes, where $\alpha$ is as in Exercise 7.
9. Let pEGuard(X). Show that p = d(p). [Hint: Use Exercise 4.]
10. Show that wlp(idx,p) = p = wp(idx,p) for all pEGuard(X). [Hint: Use Exercise
9.]
11. Say that $f: X \to Y$ is deterministic if for all $q_1, q_2 \in \mathrm{Guard}(Y)$,
$$\mathrm{wlp}(f, q_1 \vee q_2) = \mathrm{wlp}(f, q_1) \vee \mathrm{wlp}(f, q_2).$$
(i) Prove that the deterministic morphisms form a subcategory.
(ii) Show that in Pfn every morphism is deterministic whereas in Mfn the
deterministic morphisms constitute the subcategory Pfn. This motivates the
terminology.
Notes and References for Chapter 4
The early papers in assertion semantics and the text of Alagic and Arbib were cited in
the notes to Chapter 1. For additional expository accounts see:
E. W. Dijkstra, A Discipline of Programming, Prentice-Hall, 1976.
D. Gries, The Science of Programming, Springer-Verlag, 1983.
The theory of Sections 2 and 3 is adapted from
E. G. Manes, "Assertion semantics in a control category," Theoretical Computer
Science, to appear. It is proved there that a third equivalent condition for theorem
4.3.8 is that every morphism is deterministic as defined in Exercise 4.3.11.
Homological algebra, an area of abstract algebra with ties to algebraic topology,
emphasizes the theory of "abelian categories." See the books of Freyd and Mitchell
cited in the Chapter 2 notes. In an abelian category C (Vect is an example of one),
there are zero morphisms, while finite products and finite coproducts share a common
object and the construction is called a direct sum and is characterized by the direct
sum systems of Exercise 4.2.2.
In the paper by Manes cited above a more general theory of weakest precondition
is given which goes beyond partially additive categories. A generalization of Theorem
4.3.8 establishes that $\mathrm{wp}(gf, Q) = \mathrm{wp}(f, \mathrm{wp}(g, Q))$ in ANMfn. The interpretation of
$\mathrm{wp}(f, Q)$ in ANMfn is the one intended by Dijkstra in the book cited above, namely,
$$\mathrm{wp}(f, Q) = \{x \mid f(x) \ne \emptyset \text{ and every } y \in f(x) \text{ is in } Q\}.$$
As we mentioned in Exercise 4.3.1, many authors adopt the view that it is natural
to define programming constructions in terms of their effects on the weakest precondition operator. We regard the assumptions on Pfn as too specialized to be
adaptable to more general semantic categories, however. Counterexamples appear in
the paper of Manes cited above.
PART 2
SEMANTICS OF RECURSION
CHAPTER 5
Recursive Specifications
5.1 The Kleene Sequence
5.2 The Pattern-of-Calls Expansion
5.3 Iteration Recursively
A recursive specification "defines a function in terms of itself." Recursive
definitions occur commonly in the mathematical literature including that
prior to the computer age. Here, the art of separating out "improper" recursive definitions was regarded as but one of the many skills necessary to write
correct mathematics. But modern computer languages allow recursive specification to be expressed directly. Since the implementation of a programming
language must respond to any recursive program, no matter how ill-conceived, we must pay attention to the mathematical question of what an
"arbitrary" recursive specification should mean.
We open Section 5.1 with some examples of recursive specification to
demonstrate that the "desired" denotational semantics is not always clear
and that there are several strategies for an operational semantics. A detailed
discussion would be too long and we primarily focus on an informal treatment of "all-call" operational semantics. In this and the following three
chapters we establish that for a recursive specification of a partial function
$X \to Y$ there is a mathematical way to describe the specification in terms of a
total function $\psi$: Pfn$(X, Y) \to$ Pfn$(X, Y)$ whose all-call semantics coincides
with the "Kleene semantics" of $\psi$. A wide class of recursively defined functions find their expected semantics with this approach.
The pattern-of-calls expansion of Section 5.2 develops a partially additive
form of Kleene semantics which capitalizes on the formal power series calculus available in the partially additive category Pfn. A formal proof that the
semantics are the same must wait for Chapter 8.
Section 5.3 expresses iteration recursively as is always done, say, in LISP.
We extend the usual theory by adjoining concepts from partially additive
semantics.
5.1 The Kleene Sequence
A simple example of a recursive specification for a partial function $f: \mathbf{N} \to \mathbf{N}$
is
$$f(n) = \begin{cases} 5 & \text{if } n = 0 \\ f(n - 1) & \text{else.} \end{cases} \tag{1}$$
This specification is recursive because 1 is not a closed formula, but rather a
definition of a function $f$ in terms of itself. Regarding 1 as an equation, we
may compute
$$f(3) = f(3 - 1) = f(2) = f(2 - 1) = f(1) = f(1 - 1) = f(0) = 5.$$
Indeed, it is quite clear that every solution f of equation 1 is total (else there
exists a smallest n > 0 with f(n) not defined, hence, f(n - 1) is defined which
contradicts f(n) = f(n - 1)) and so f(n) = 5 for all n is the unique solution.
Alternatively, we may regard 1 as an algorithm which calls itself. Thus, to
compute f(3) we would first call f(3), then f(2), then f(1), and then f(0) which
terminates with final value 5; then f(1) (previously suspended) is 5; then f(2)
is 5; and, at last, f(3) is 5.
The specification of 1 appears to have only one solution and it seems not
to matter here whether it is considered an equation or an algorithm. In
general, however, recursive specifications may admit more than one equational solution, and more than one algorithmic solution depending on
"calling strategy," and not every algorithmic solution is an equational one.
In this section we present a number of basic examples to illustrate the
complexity of even disarmingly concise recursive definitions for functions of
the form $N \times \cdots \times N \to N$. We then give a more mathematical formulation
of recursive specification and define the "Kleene sequence" of a specification
in precise mathematical terms to capture the idea of the sequence of successive algorithmic calls using the "all-call" strategy. Chapters 5-8 deal, in a
large part, with alternative algebraic approaches to the semantics of the
Kleene sequence.
Our second example of a recursive definition is the familiar factorial
function:
2 Example. The recursive specification
fact(0) = 1
fact(n) = n · fact(n - 1)
of the factorial function is a perfectly sound mathematical definition. The
usual function f(n) = n! is the only equational or algorithmic solution as
illustrated by the following computation.
fact(3) = 3 · fact(2) = 3 · 2 · fact(1) = 3 · 2 · 1 · fact(0) = 3 · 2 · 1 · 1 = 6.
The next example is somewhat more complicated.
3 Example. The function a(m, n) known as Ackermann's function is defined
by
$$a(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ a(m - 1, 1) & \text{if } m \neq 0,\ n = 0 \\ a(m - 1, a(m, n - 1)) & \text{else.} \end{cases}$$
Thus,
$$a(1, 1) = a(0, a(1, 0)) = a(1, 0) + 1 = a(0, 1) + 1 = (1 + 1) + 1 = 3.$$
Although this may not be obvious, there is only one equational solution and
it is total. But here there is possible ambiguity from the algorithmic point
of view. In simplifying a(0, a(1, 0)) we chose to "call" the "outermost" a to
get a(1, 0) + 1. We could have called the "innermost" a instead yielding
a(0, a(0, 1)). Although the ultimate result is the same in this case, we have
made the point that calling strategy is not unique.
Ackermann's function is quite interesting. For even very small m, n the
computation of a(m, n) is very lengthy. We invite the reader to experiment.
See the end of chapter notes to see why it is unlikely that any "closed
formula" exists: the recursive definition is almost certainly the most convenient description.
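For readers who wish to experiment, the specification of Example 3 can be transcribed directly. The following is a minimal sketch in Python (our choice of language, not the book's); the name `ackermann`, the memoization, and the raised recursion limit are conveniences we have assumed. Python evaluates arguments before calling, so this sketch follows the "innermost" strategy.

```python
import sys
from functools import lru_cache

sys.setrecursionlimit(10_000)  # the recursion is deep even for small arguments


@lru_cache(maxsize=None)
def ackermann(m: int, n: int) -> int:
    """Ackermann's function as specified in Example 3."""
    if m == 0:
        return n + 1
    if n == 0:
        return ackermann(m - 1, 1)
    # The inner call is evaluated before the outer one ("innermost" strategy).
    return ackermann(m - 1, ackermann(m, n - 1))


print(ackermann(1, 1))  # 3, agreeing with the hand computation of a(1, 1)
```

Even with memoization, arguments much beyond m = 3 quickly exhaust time and stack, which is one way to appreciate how lengthy the computation is.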
The next example shows how easily the equational and algorithmic approaches can be made to give different results.
4 Example. Consider the recursive "definition"
$$f(n) = f(n + 1).$$
If we use this as an algorithm we get the sequence of calls
$$f(n) \to f(n + 1) \to f(n + 2) \to \cdots$$
which fails to terminate, so that f is the everywhere-undefined function.
On the other hand, if we regard the above as an equation, then while the
everywhere-undefined function is still a solution, so is every constant total
function.
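The equational reading of Example 4 can be checked mechanically on a finite sample. In the sketch below (Python; all names are ours) a partial function is modelled as a Python function that returns `None` where it is undefined, and `psi` sends h to the function n ↦ h(n + 1). Agreement on a sample is of course only evidence, not a proof.

```python
def psi(h):
    """Right-hand side of f(n) = f(n + 1) with h substituted for f."""
    return lambda n: h(n + 1)


def bottom(n):
    return None  # the everywhere-undefined function


def const7(n):
    return 7  # one of the constant total functions


def identity(n):
    return n


def agree(f, g, sample=range(50)):
    """Do f and g agree on the sample (None standing for 'undefined')?"""
    return all(f(n) == g(n) for n in sample)


print(agree(psi(bottom), bottom))      # True: the undefined function solves the equation
print(agree(psi(const7), const7))      # True: so does a constant function
print(agree(psi(identity), identity))  # False: the identity does not
```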
The following example, despite the simplicity of its description, defies
analysis at the time of this writing.
5 Example. Define f(n) recursively by
$$f(n) = \begin{cases} 1 & \text{if } n = 0 \text{ or } n = 1 \\ f(3n + 1) & \text{if } n \text{ is odd},\ n > 1 \\ f(n/2) & \text{else.} \end{cases}$$
Computing equationally, we have
f(3) = f(10) = f(5) = f(16) = f(8) = f(4) = f(2) = f(1) = 1.
Similarly,
f(7) = f(22) = f(11) = f(34) = f(17) = f(52) = f(26)
= f(13) = f(40) = f(20) = f(10) = f(5) = (as above) 1.
It is clear that if f(x) is defined then f(x) = 1. It is an unsolved problem of
number theory whether or not f is total.
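The chains of calls in Example 5 are easy to generate by machine. The sketch below (Python; the name `calls` and the budget `max_calls` are our own assumptions) records the successive arguments and gives up after a fixed number of steps, since termination for every n is exactly what is not known.

```python
def calls(n, max_calls=10_000):
    """Arguments visited when f(n) of Example 5 is unfolded; None if the budget runs out."""
    trace = [n]
    while n > 1:
        if len(trace) > max_calls:
            return None  # no verdict: the budget was too small (or f(n) is undefined)
        n = 3 * n + 1 if n % 2 == 1 else n // 2
        trace.append(n)
    return trace


print(calls(3))  # [3, 10, 5, 16, 8, 4, 2, 1]
```

A returned trace ending in 1 shows that f is defined at n with value 1; a result of `None` says nothing either way.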
We conclude our initial list of examples with a straightforward example in
which different calling strategies yield different results.
6 Example. Define $f: N \times N \to N$ recursively by
$$f(m, n) = \begin{cases} 18 & \text{if } m = 0 \\ f(m - 1, f(1, 0)) & \text{else.} \end{cases}$$
We now compute f(1, 0) algorithmically with two different calling strategies.
The dot underneath indicates the call to be made.
(i) "Call the leftmost f"
$$\underset{\cdot}{f}(1, 0) \to \underset{\cdot}{f}(0, f(1, 0)) \to 18.$$
(ii) "Call the rightmost f"
$$\underset{\cdot}{f}(1, 0) \to f(0, \underset{\cdot}{f}(1, 0)) \to f(0, f(0, \underset{\cdot}{f}(1, 0))) \to \cdots.$$
Here the computation is nonterminating.
Our interpretation of (i) is that we check if m = 0 without attempting to
verify that n has a value in N.
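The two strategies of Example 6 can be imitated in an ordinary programming language. In the sketch below (Python; every name is ours) strategy (i) is modelled by passing the second argument as an unevaluated thunk, and strategy (ii) by evaluating the inner call first. A `fuel` counter, which is our own device, stands in for nontermination so that the experiment halts.

```python
class OutOfFuel(Exception):
    """Raised when the call budget is exhausted (our stand-in for nontermination)."""


def f_leftmost(m, n_thunk):
    """Strategy (i): n is passed unevaluated and is never forced when m = 0."""
    if m == 0:
        return 18
    return f_leftmost(m - 1, lambda: f_leftmost(1, lambda: 0))


def f_rightmost(m, n, fuel):
    """Strategy (ii): the inner call f(1, 0) is evaluated before the outer one."""
    if fuel == 0:
        raise OutOfFuel
    if m == 0:
        return 18
    inner = f_rightmost(1, 0, fuel - 1)
    return f_rightmost(m - 1, inner, fuel - 1)


print(f_leftmost(1, lambda: 0))  # 18
try:
    f_rightmost(1, 0, fuel=100)
except OutOfFuel:
    print("strategy (ii) exhausted its budget")
```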
We hope these few examples as well as those to follow present sufficient
evidence for the subtlety of the problem of assigning semantics to recursive specifications. As discussed earlier we shall limit our investigation to
one algorithmic strategy which amounts to "calling all occurrences of f
simultaneously."
One approach to a rigorous definition would be to adopt a particular
programming language, define the syntax of recursive call, and then adapt
the "substitute-for-all" concept. In a Pascal-type language we would expect
to deal with the issues of "local variables," "parameter passing," and operational semantics generally. This seems quite different from the recursive definitions written in a functional programming language (we shall extend FPF
to include recursion in Section 6.3).
In keeping with the spirit of this book, we will avoid an approach tied to
the specifics of a single programming language and describe recursive specifications entirely in mathematical terms. Just as the requirement of syntactic
validity limits the allowable specifications in a specific language, axioms will
be imposed to prevent "arbitrary" specifications and these would allow us to
prove theorems to guarantee that the assigned semantics exists and has useful
properties. While we will never formally define "all-call" semantics, this idea
underlies the "Kleene semantics" to be given in 16 below.
We now motivate our approach and follow with the desired formal definitions. Let X, Y be sets. A recursive specification of a function $f \in \mathrm{Pfn}(X, Y)$
takes the general form:
7 "For each x E X, f(x) depends on x and on f as follows .... "
But let us look at 7 in a somewhat different way, phrasing it as follows:
"Given x and certain values of f, we may combine them to form the value
f(x)."
We may abstract away from the individual values of x to say simply
8 "Given any function g there is a way to manipulate it that returns another
function, call it $\psi(g)$. The function f that we are seeking to define, then, is
such that $f = \psi(f)$."
Clearly, the $\psi$ mentioned in 8 is a total function $g \mapsto \psi(g)$ of the form
$$\psi: \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y). \tag{9}$$
10 Example. The function $\psi: \mathrm{Pfn}(N^2, N) \to \mathrm{Pfn}(N^2, N)$, corresponding to
the specification of the Ackermann function in Example 3, is defined by
$$\psi(h)(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ h(m - 1, 1) & \text{if } m \neq 0,\ n = 0 \\ h(m - 1, h(m, n - 1)) & \text{else.} \end{cases} \tag{11}$$
Thus, for example, if h is the total function h(m, n) = m + n, $\psi(h)$ is the total
function
$$\psi(h)(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ m & \text{if } m \neq 0,\ n = 0 \\ 2m + n - 2 & \text{else.} \end{cases}$$
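The passage from a function h to the function ψ(h) is exactly a higher-order function. As a small check of Example 10, the following sketch (Python; the names `psi`, `h`, and `closed_form` are ours) builds ψ(h) for h(m, n) = m + n and compares it with the three-case formula on a finite grid.

```python
def psi(h):
    """The functional of 11: the Ackermann body with every call replaced by h."""
    def psi_h(m, n):
        if m == 0:
            return n + 1
        if n == 0:
            return h(m - 1, 1)
        return h(m - 1, h(m, n - 1))
    return psi_h


def h(m, n):
    return m + n


def closed_form(m, n):
    """The three-case formula claimed for psi(h) when h(m, n) = m + n."""
    if m == 0:
        return n + 1
    if n == 0:
        return m
    return 2 * m + n - 2


grid = [(m, n) for m in range(20) for n in range(20)]
print(all(psi(h)(m, n) == closed_form(m, n) for m, n in grid))  # True
```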
Continuing with our motivation, we now say, mathematically, that
$\psi: \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y)$ is the recursive specification. This is justified by
expressing the concepts we need in terms of $\psi$ as follows. Firstly, the equational solutions are exactly those $h \in \mathrm{Pfn}(X, Y)$ with
$$\psi(h) = h. \tag{12}$$
In general, if A is any set and $\alpha: A \to A$ is any total function, a fixed point of
$\alpha$ is an $a \in A$ with $\alpha(a) = a$. This terminology is natural since such a is left fixed
by $\alpha$. In particular, the solutions of 12 are called "fixed point solutions." This
is the usual terminology in the literature and we henceforth use it instead of
the synonym "equational solution."
How is $\psi$ related to "all-call" algorithmic solutions? Here, it pays to think
syntactically. Imagine $\psi(h)(x)$ as a formula in h and x (as has been true in our
examples so far) and interpret each h as a "call" so that $\psi(h)$ is a formula for
the "first level of call." Then $\psi(\psi(h))$ represents the second level of call in the
"all-call" strategy because by definition of $\psi(\psi(h))$, $\psi(h)$ is substituted for each
h. Defining $\psi^{n+1}(h) = \psi(\psi^n(h))$ as usual, we see $\psi^n(h)$ is the expression for the
nth level of call.
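For example, the factorial function corresponds to the map $\psi$ for which
$$\psi(h)(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \cdot h(n - 1) & \text{if } n > 0. \end{cases}$$
Thus,
$$\psi^2(h)(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \cdot \psi(h)(n - 1) & \text{if } n > 0 \end{cases} = \begin{cases} 1 & \text{if } n = 0 \\ 1 & \text{if } n = 1 \\ n \cdot (n - 1) \cdot h(n - 2) & \text{if } n > 1. \end{cases}$$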
We think of $\psi^n(h)$ as having an "exit part" and a "call further part." The "call
further" part collects all terms with an h, and the "exit part" obtains by
ignoring all future calls. In the context of factorial as above, the exit parts are
described by
13
exit part for $\psi^1$: if n = 0 then 1 else undefined;
exit part for $\psi^2$: if n = 0 or 1 then 1 else undefined.
A mathematically precise approach which intuitively "causes future calls to
be ignored" is to substitute the everywhere-undefined function $\bot \in \mathrm{Pfn}(X, Y)$
for h. Thus, $\psi^n(\bot)$ is our candidate for the partial function corresponding to
the "all-call" algorithmic solution after n levels of call. The reader should
verify that, in 13, $\psi^i(\bot)$ are, indeed, the exit parts for $\psi^i$. Since the substitution
procedure creating $\psi^{n+1}(h)$ from $\psi^n(h)$ should not disturb existing exit terms
but may create new exit terms, we expect that $\psi^{n+1}(\bot)$ is an extension of
$\psi^n(\bot)$ in the sense of Definition 2.1.9.
The motivations just given have glossed over numerous technical points
and our discussion cannot be considered mathematically rigorous. Experience dictates that these ideas, nonetheless, conform to many examples of
recursive specifications and this leads to the following mathematical
definitions.
14 Definitions. Let X, Y be sets. The everywhere-undefined function in
Pfn(X, Y), often called 0 earlier, will synonymously be called $\bot$. We recall
from 2.1.9 that Pfn(X, Y) is a poset under the extension ordering whereby
$f \le g$ means $DD(f) \subset DD(g)$ and $g(x) = f(x)$ for $x \in DD(f)$.
A recursive specification on Pfn(X, Y) is a total function $\psi: \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y)$ such that
$$\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots, \tag{15}$$
that is, $\psi^n(\bot) \le \psi^{n+1}(\bot)$ for all $n \ge 1$ where $\psi^n$ means the n-fold composition
of $\psi$ with itself. When 15 holds, the sequence
$$\psi(\bot),\ \psi^2(\bot),\ \psi^3(\bot),\ \ldots$$
is called the Kleene sequence of $\psi$. The Kleene semantics of $\psi$ is then
$f_\psi \in \mathrm{Pfn}(X, Y)$ defined by
$$DD(f_\psi) = \bigcup_{k=1}^{\infty} DD(\psi^k(\bot)) \tag{16}$$
while
$$f_\psi(x) = \psi^k(\bot)(x) \quad \text{for any } k \text{ with } x \in DD(\psi^k(\bot)).$$
Because of 15, it does not matter which k is used in the definition of $f_\psi(x)$, that
is, $\psi^n(\bot)(x) = \psi^m(\bot)(x)$ if $x \in DD(\psi^n(\bot)) \cap DD(\psi^m(\bot))$.
We say that $\psi^k(\bot)$ is the kth-approximant of the Kleene semantics $f_\psi$.
We pause to test these definitions on some of our earlier examples.
17 Example. The recursive definition of factorial in 2 has specification
$\psi: \mathrm{Pfn}(N, N) \to \mathrm{Pfn}(N, N)$,
$$\psi(h)(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \cdot h(n - 1) & \text{else.} \end{cases}$$
It is routinely computed that $DD(\psi^k(\bot)) = \{0, \ldots, k - 1\}$ with $\psi^k(\bot)(n) = n!$.
Thus, the Kleene semantics of $\psi$ is the total factorial function since $f_\psi(n) = \psi^{n+1}(\bot)(n) = n!$.
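The approximants of Definitions 14 can be computed literally by iterating the functional on the everywhere-undefined function. The sketch below is in Python and rests on conventions of our own: a partial function is a Python function returning `None` where undefined, and an undefined argument makes the result undefined. The names `bottom`, `approximant`, and `domain` are ours.

```python
def bottom(*args):
    return None  # the everywhere-undefined function


def psi_fact(h):
    """Specification of factorial from Example 17."""
    def psi_h(n):
        if n == 0:
            return 1
        prev = h(n - 1)
        return None if prev is None else n * prev  # undefined call gives undefined result
    return psi_h


def approximant(psi, k):
    """The kth approximant psi^k(bottom), for k >= 1."""
    g = bottom
    for _ in range(k):
        g = psi(g)
    return g


def domain(g, sample):
    """Points of the sample at which the partial function g is defined."""
    return [x for x in sample if g(x) is not None]


print(domain(approximant(psi_fact, 4), range(10)))  # [0, 1, 2, 3]
print(approximant(psi_fact, 6)(5))                  # 120
```

Reading off the value at n from the (n + 1)st approximant is precisely the recipe for the Kleene semantics given in 16.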
18 Example. The specification for Example 4 is $\psi: \mathrm{Pfn}(N, N) \to \mathrm{Pfn}(N, N)$,
$$\psi(h)(n) = h(n + 1).$$
Since $\psi(\bot) = \bot$, $\psi^k(\bot) = \bot$ for all k so that the Kleene semantics of $\psi$ is $\bot$.
19 Example. In the context of Example 6, the "all-call" strategy for f(1, 1)
produces
$$\underset{\cdot}{f}(1, 1) \to \underset{\cdot}{f}(0, \underset{\cdot}{f}(1, 0)) \to \cdots$$
whose "simultaneous" meaning is perhaps not so clear for reasons discussed
in 6. We invite the reader to struggle with it. What is the Kleene semantics
here? Well, the specification $\psi: \mathrm{Pfn}(N \times N, N) \to \mathrm{Pfn}(N \times N, N)$ is given
by
$$\psi(h)(m, n) = \begin{cases} 18 & \text{if } m = 0 \\ h(m - 1, h(1, 0)) & \text{else;} \end{cases}$$
$$\psi^2(\bot)(m, n) = \begin{cases} 18 & \text{if } m = 0 \\ \psi(\bot)(m - 1, \psi(\bot)(1, 0)) & \text{else.} \end{cases}$$
But $\psi(\bot)(1, 0)$ is undefined so that $\psi(\bot)(m - 1, \psi(\bot)(1, 0))$ is undefined even
if m = 1. Thus, $\psi^2(\bot) = \psi(\bot)$ and the Kleene sequence is
$$\psi(\bot),\ \psi(\bot),\ \psi(\bot),\ \ldots$$
with Kleene semantics $f_\psi = \psi(\bot)$. Thus, $DD(f_\psi) = \{(m, n) : m = 0\}$, $f_\psi(m, n) = 18$.
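The stabilization of the Kleene sequence in Example 19 can be observed numerically. The sketch below (Python; names and the `None`-for-undefined convention are ours) treats an undefined argument as making the whole call undefined, which is the reading used in the example, and compares successive approximants on a finite grid.

```python
def bottom(m, n):
    return None  # the everywhere-undefined function


def psi(h):
    """Specification of Example 19; an undefined inner call makes the result undefined."""
    def psi_h(m, n):
        if m == 0:
            return 18
        inner = h(1, 0)
        if inner is None:
            return None
        return h(m - 1, inner)
    return psi_h


def approximant(k):
    """psi^k(bottom), for k >= 1."""
    g = bottom
    for _ in range(k):
        g = psi(g)
    return g


grid = [(m, n) for m in range(6) for n in range(6)]
first = approximant(1)
stable = all(approximant(k)(m, n) == first(m, n) for k in range(2, 6) for m, n in grid)
print(stable)                                             # True
print(sorted({m for m, n in grid if first(m, n) == 18}))  # [0]
```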
We conclude with an example that underscores many of the points developed in this section.
20 Example. Define f recursively by
$$f(n) = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n > 0 \text{ and } f(n - 1) > 1 \\ 2 & \text{if } n > 0 \text{ and } f(n - 1) = 1 \\ 3 & \text{else.} \end{cases}$$
This is surely a reasonable mathematical specification since f(n + 1) is
defined solely in terms of what happens to f(n), and since f(O) is given.
Computing algorithmically,
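f(0) = 0
f(1) = 3 as f(0) = 0
f(2) = 1 as f(1) = 3
so that f is the total function
$$f(n) = \begin{cases} 0 & n = 0 \\ 1 & n = 2, 4, 6, \ldots \\ 2 & n = 3, 5, 7, 9, \ldots \\ 3 & n = 1. \end{cases} \tag{21}$$
Here
$$\psi(h)(n) = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n > 0 \text{ and } h(n - 1) > 1 \\ 2 & \text{if } n > 0 \text{ and } h(n - 1) = 1 \\ 3 & \text{else.} \end{cases} \tag{22}$$
But direct calculation shows
$$\psi(\bot)(n) = \begin{cases} 0 & \text{if } n = 0 \\ 3 & \text{else,} \end{cases} \qquad \psi^2(\bot)(n) = \begin{cases} 0 & \text{if } n = 0 \\ 3 & \text{if } n = 1 \\ 1 & \text{else,} \end{cases}$$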
which violates condition 15, namely, that $\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$. The
situation is resolved informally by asserting that it is illegal to use the halting
of the procedure itself as a test (since "else" meant "h(n - 1) = 0 or h(n - 1)
is undefined"). Formally, we say that the $\psi$ of 22 is illegal because 15 fails.
Experience dictates that reasonable recursive specifications can be restructured so that the Kleene semantics provides the intended semantics. In this
case the desired specification is
$$\hat{\psi}(h)(n) = \begin{cases} 0 & \text{if } n = 0 \\ 1 & \text{if } n > 0 \text{ and } h(n - 1) > 1 \\ 2 & \text{if } n > 0 \text{ and } h(n - 1) = 1 \\ 3 & \text{if } n > 0 \text{ and } h(n - 1) = 0. \end{cases} \tag{23}$$
The initial computation of f goes through the same way because the case that
f(n - 1) is undefined never arises. However, now
$$\hat{\psi}(\bot)(n) = \begin{cases} 0 & n = 0 \\ \bot & \text{else,} \end{cases} \qquad \hat{\psi}^2(\bot)(n) = \begin{cases} 0 & n = 0 \\ 3 & n = 1 \\ \bot & \text{else,} \end{cases} \qquad \hat{\psi}^3(\bot)(n) = \begin{cases} 0 & n = 0 \\ 3 & n = 1 \\ 1 & n = 2 \\ \bot & \text{else,} \end{cases}$$
and the Kleene semantics of 23 is just the intended semantics 21.
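The failure of condition 15 for the specification 22, and its repair in 23, can both be tested by machine. The following sketch is in Python with names of our own choosing (`psi`, `psi_hat`, `power`, `extends`); `None` models "undefined", and the only difference between the two functionals is what happens when the call h(n - 1) is undefined.

```python
def bottom(n):
    return None  # the everywhere-undefined function


def psi(h):
    """Specification 22: the 'else' branch also fires when h(n - 1) is undefined."""
    def psi_h(n):
        if n == 0:
            return 0
        prev = h(n - 1)
        if prev is not None and prev > 1:
            return 1
        if prev is not None and prev == 1:
            return 2
        return 3
    return psi_h


def psi_hat(h):
    """Specification 23: every branch with n > 0 requires h(n - 1) to be defined."""
    def psi_h(n):
        if n == 0:
            return 0
        prev = h(n - 1)
        if prev is None:
            return None
        if prev > 1:
            return 1
        if prev == 1:
            return 2
        return 3 if prev == 0 else None
    return psi_h


def power(functional, k):
    """functional^k(bottom), for k >= 1."""
    g = bottom
    for _ in range(k):
        g = functional(g)
    return g


def extends(small, big, sample=range(12)):
    """Extension ordering, checked on a finite sample."""
    return all(small(x) is None or small(x) == big(x) for x in sample)


print(extends(power(psi, 1), power(psi, 2)))          # False: 15 fails for 22
print(extends(power(psi_hat, 1), power(psi_hat, 2)))  # True
print([power(psi_hat, 5)(n) for n in range(7)])       # [0, 3, 1, 2, 1, None, None]
```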
EXERCISES FOR SECTION 5.1
1. Let a(m, n) be Ackermann's function. Compute a(2, 2).
2. The Fibonacci function is defined recursively by
f(n) = if $n \le 1$ then 1 else f(n - 1) + f(n - 2).
Compute f(n) for n = 0, ... , 8. Verify that the Kleene semantics coincides with the
unique fixed point solution.
3. For f as in Example 5 verify that f(19) = 1.
4. Repeat the analysis of Example 4 for the following:
(i) f(n) = f(n).
(ii) f(n) = f(f(n + 1)).
5. In Example 6, which of the two solutions, if any, is a fixed point solution?
6. Use the initial object property in the principle of simple recursion to prove that,
given $f: N \to N$, $x_0 \in N$,
g(n) = if n = 0 then $x_0$ else f(g(n - 1))
has exactly one fixed point solution for g.
7. The class of primitive recursive functions is the class of total functions of the form
$N^k \to N$ defined inductively as follows.
Basis Step: For each k > 0, $1 \le i \le k$, $pr_i(n_1, \ldots, n_k) = n_i$ is primitive recursive
$N^k \to N$.
succ: $N \to N$, succ(n) = n + 1 is primitive recursive.
zero: $N \to N$, zero(n) = 0 is primitive recursive.
Pred: $N \to N$,
$$Pred(n) = \begin{cases} 0 & \text{if } n = 0 \\ n - 1 & \text{else} \end{cases}$$
is primitive recursive.
Inductive step. If m, k > 0, $g_1, \ldots, g_m: N^k \to N$ are primitive recursive and
$h: N^m \to N$ is primitive recursive then $f: N^k \to N$ is primitive recursive where
$$f(n_1, \ldots, n_k) = h(g_1(n_1, \ldots, n_k), \ldots, g_m(n_1, \ldots, n_k)).$$
(Primitive recursion) If $g: N^k \to N$ is primitive recursive and if $h: N^{k+2} \to N$ is
primitive recursive then $f: N^{k+1} \to N$ is primitive recursive where f is defined
recursively by
$$f(n_1, \ldots, n_{k+1}) = \text{if } n_{k+1} = 0 \text{ then } g(n_1, \ldots, n_k) \text{ else } h(n_1, \ldots, n_{k+1}, f(n_1, \ldots, n_{k+1} - 1)). \tag{A}$$
(i) Show that (A) has exactly one equational solution for each fixed total g, h and
show that this solution is total.
(ii) Show that simple recursion (2.2.23) is a special case of primitive recursion.
(iii) Prove that the following functions are primitive recursive:
f(m, n) = m + n.
f(m, n) = mn.
$$f(m, n) = \begin{cases} 0 & \text{if } m < n \\ m - n & \text{else.} \end{cases}$$
f(m, n) = if m = n then 1 else 0.
8. Let $\psi(h)$ = if n = 0 then 0 else 1 + h(h(n - 1)). Prove that the Kleene semantics is
the total identity function $f_\psi(n) = n$. [Hint: show $\psi^k(h)$ = if n < k then n else
$1 + h^{2k}(n - 1)$ by induction on k.]
9. Let $\psi(h)$ = if n > 100 then n - 10 else h(h(n + 11)). Show that the Kleene semantics is the "91-function"
$f_\psi(n)$ = if n > 100 then n - 10 else 91.
[Hint: For $2 \le k \le 102$ show
$$\psi^k(\bot)(n) = \begin{cases} n - 10 & \text{if } n > 100 \\ 91 & \text{if } n \in \{100, 99, \ldots, 102 - k\} \\ \text{undefined} & \text{else} \end{cases}$$
and that $\psi^{102+m}(\bot) = \psi^{102}(\bot)$ for all m > 0.]
10. We claimed that the discussion motivating Definitions 14 "glossed over numerous
technical points." Debate the following.
(i) If two different syntactic formulas $\psi(h)(x)$ describe the same function $h \mapsto \psi(h)$
the results of "all-call" substitution will compute the same function in both
cases.
(ii) All-call substitution always computes a partial function.
(iii) In any specification one can always separate the "exit part" from the "call
further part."
11. Let $\psi: \mathrm{Pfn}(N \times N, N) \to \mathrm{Pfn}(N \times N, N)$ be the specification
$\psi(h)(x, y)$ = if x = y then y + 1 else h(x, h(x - 1, y + 1)).
Define
$$f(x, y) = \begin{cases} x + 1 & \text{if } x \ge y \text{ and } x - y \text{ is even} \\ \text{undefined} & \text{else.} \end{cases}$$
Show that f is a fixed point solution of $\psi$. Show that the total function g(x, y) = x + 1
is another fixed point solution of $\psi$.
5.2 The Pattern-of-Calls Expansion
In the previous section we considered recursive specifications $\psi: \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y)$ which were arbitrary subject to the requirement
that $\psi^n(\bot) \le \psi^{n+1}(\bot)$. There is no guarantee that the class of all such $\psi$ is not
much larger than the class suggested by the motivating examples, and it is
reasonable to consider further axioms to narrow the gap. In this section we
regard Pfn as a partially additive category and focus on those $\psi$ which are
"power series" of the form
$$\psi(a) = H_0 + H_1(a) + H_2(a, a) + \cdots + H_n(a, \ldots, a) + \cdots \tag{1}$$
where, roughly speaking, $H_n$ is the part of the specification involving n calls.
An axiomatic treatment in arbitrary partially additive categories is the subject of Chapter 8. This section, which motivates the later work, is limited to
a few examples of power-series specifications in Pfn (for which all but the first
two of these terms are 0), whose Kleene semantics is represented as a sum
which we call the "pattern-of-calls expansion" because there is one term for
each possible pattern of call as we would expect from our earlier motivations
in Section 5.1 regarding the "all-call" strategy.
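We begin by examining the iterate $f^\dagger: X \to Y$ of a partial function
$f: X \to X + Y$. As defined in Theorem 3.2.24,
$$f^\dagger = \sum_{n=0}^{\infty} f_2 f_1^n,$$
where $f_1 = PR_1 f$, $f_2 = PR_2 f$ as in 3.2.15.
Intuitively, the following flowchart identity holds:
2 [Flowchart identity for $f^\dagger$; diagram not reproduced.]
Writing f as $in_1 f_1 + in_2 f_2$, 2 takes the form
3 [Flowscheme from X to Y; diagram not reproduced.]
Recalling from 3.2.5 that the flowscheme for $g + h: A \to B$ is
[Flowscheme for $g + h$; diagram not reproduced.]
3 states that
$$f^\dagger = f^\dagger f_1 + f_2. \tag{4}$$
That 4 actually holds is seen by
$$f^\dagger f_1 + f_2 = \sum_{n=0}^{\infty} f_2 f_1^n f_1 + f_2 = \sum_{n=1}^{\infty} f_2 f_1^n + f_2 = \sum_{n=0}^{\infty} f_2 f_1^n = f^\dagger.$$
This suggests that we might define $f^\dagger$ recursively by the specification
$\psi: \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y)$ given by
5 [Flowscheme for $\psi(a) = f_2 + a f_1$: the upper path applies $f_1$ and then calls a; the lower path applies $f_2$ and exits. Diagram not reproduced.]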
As a first step toward the general definition of power-series maps and the
pattern-of-calls expansion, we examine this recursive specification associated
with the iterate in more detail. We distinguish two situations in 5. If the upper
path is taken, we have a 1-substitution path which transforms a to the partial
function
$$H_1(a) = a f_1 \tag{6}$$
given the partial function a substituted for the single call along that path. If
the lower path is taken, the partial function returned is
$$H_0 = f_2 \tag{7}$$
which takes no arguments since we have a 0-substitution path.
We may then write
$$\psi(a) = H_0 + H_1(a). \tag{8}$$
Let H denote the pair $(H_0, H_1)$. We associate with this a set of partial functions from X to Y which we name PC(H) (for pattern of calls of H), defined
inductively as follows:
9
$H_0 \in PC(H)$.
If $a \in PC(H)$, then $H_1(a) \in PC(H)$.
In the inductive step, think of $H_1(a)$ as the 1-substitution path, where a
computation with interpretation a is substituted for the single call. This
clearly implies that the elements of PC(H) are precisely those partial functions which can be diagrammed as
$$H_1(H_1(\cdots H_1(H_0) \cdots)) \qquad (n \ge 0 \text{ occurrences of } H_1)$$
which yields the partial function $H_1^n(H_0) = f_2 f_1^n$, corresponding to n calls of
the 1-substitution path followed by a final call of the 0-substitution path. The
semantics
$$f^\dagger = \sum_{n=0}^{\infty} f_2 f_1^n = \text{the sum of all partial functions in } PC(H)$$
for the iterate may then indeed be termed the pattern-of-calls expansion for
the specification $\psi$ of 5.
The Kleene semantics of 5 is easily computed. From $\psi(a) = f_2 + a f_1$ we
have
$$\psi(\bot) = f_2,$$
$$\psi^2(\bot) = f_2 + f_2 f_1,$$
$$\psi^3(\bot) = f_2 + (f_2 + f_2 f_1) f_1 = f_2 + f_2 f_1 + f_2 f_1^2,$$
and an easy induction establishes
$$\psi^k(\bot) = \sum_{n=0}^{k-1} f_2 f_1^n,$$
so that
$$\psi^1(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$$
and the Kleene semantics $f^\dagger$ coincides with the pattern-of-calls expansion.
Whereas the Kleene semantics approximates $f^\dagger$ with increasingly larger
functions (relative to the extension ordering), the pattern-of-calls expansion
sums the "smallest components" of $f^\dagger$.
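The agreement between the Kleene approximants of $\psi(a) = f_2 + a f_1$ and the partial sums of $f_2 f_1^n$ can be checked on a small instance. In the sketch below (Python) a map $f: X \to X + Y$ is encoded, by our own convention, as a function returning `(1, x')` for the X summand and `(2, y)` for the Y summand; the names `psi`, `kleene`, `partial_sum` and the sample f are all hypothetical.

```python
def bottom(x):
    return None  # the everywhere-undefined function


def psi(f):
    """psi(a) = f2 + a f1 for f: X -> X + Y, encoded as f(x) = (1, x') or (2, y)."""
    def step(a):
        def psi_a(x):
            tag, value = f(x)
            return value if tag == 2 else a(value)
        return psi_a
    return step


def kleene(f, k):
    """psi^k(bottom), built by k applications of psi."""
    g = bottom
    for _ in range(k):
        g = psi(f)(g)
    return g


def partial_sum(f, k):
    """The sum of f2 f1^n over n < k: exit after at most k - 1 trips through f1."""
    def g(x):
        for _ in range(k):
            tag, value = f(x)
            if tag == 2:
                return value
            x = value
        return None
    return g


def f(x):
    # Hypothetical example: keep subtracting 3, then exit with the remainder.
    return (1, x - 3) if x >= 3 else (2, x)


sample = range(30)
print(all(kleene(f, k)(x) == partial_sum(f, k)(x) for k in range(1, 8) for x in sample))  # True
```

For this f the iterate computes the remainder modulo 3, and the kth approximant is defined exactly on the arguments below 3k.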
We now give several further examples of recursive definitions of the form
$\psi(a) = H_0 + H_1(a)$, before moving on to "nonlinear" definitions of the form
$\psi(a) = H_0 + H_1(a) + H_2(a, a)$ which motivates the general power-series map
$$\psi(a) = \sum_{n \ge 0} H_n(\underbrace{a, \ldots, a}_{n \text{ times}})$$
to be introduced in Chapter 8.
10 Example. The function $g: N \to N$ for which $DD(g) = \{n \mid n > 5\}$,
g(n) = n,
may be recursively defined by the specification $\psi(a)$ given by
[Flowscheme with a single test having exits T and F; diagram not reproduced.]
If $p: N \to N + N$ corresponds to the test (x > 5?) by
$$p(x) = \begin{cases} (1, x) & \text{if } x > 5 \\ (2, x) & \text{else,} \end{cases}$$
then we see that
$$\psi(a) = p_T + a \cdot p_F$$
which decomposes into $H_0 = p_T$ which contains 0 occurrences of the variable
a, and $H_1 = a \cdot p_F$ which contains 1 occurrence of a.
11 Example. Consider 5.1.4. Here
$$\psi(a)(n) = a(n + 1)$$
or
$$\psi(a) = a \cdot succ$$
where succ: $N \to N$, $n \mapsto n + 1$ is the successor function. In this case, $H_0 = \bot$, and $H_1(a) = a \cdot succ$.
12 Example. Consider the recursive definition
$$f(n) = \begin{cases} 0 & \text{if } n = 0 \\ f(n - 1) + 1 & \text{else.} \end{cases}$$
Introducing the predecessor function pred: $N \to N$ defined by
$$pred(n) = \begin{cases} 0 & \text{if } n = 0 \\ n - 1 & \text{if } n > 0, \end{cases}$$
we see that the corresponding $\psi$ is
$$\psi(a) = 0 \cdot p_T + succ \cdot a \cdot pred \cdot p_F,$$
where 0 is the function constantly 0, and $p: N \to N + N$ corresponds to the test
(n = 0?). Here $H_0 = 0 \cdot p_T$ and $H_1(a) = succ \cdot a \cdot pred \cdot p_F$.
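The partially additive form in Example 12 translates almost symbol for symbol into code. The sketch below (Python) is built on assumptions of our own: partial functions return `None` where undefined, `compose` applies its arguments right to left as juxtaposition does in the text, `psum` is the partial sum and insists on disjoint domains, and the guards `p_T`, `p_F` are partial identities.

```python
def compose(*fs):
    """Right-to-left composition of partial functions; None means 'undefined'."""
    def h(x):
        for f in reversed(fs):
            x = f(x)
            if x is None:
                return None
        return x
    return h


def psum(f, g):
    """Partial sum f + g; the summands are required to have disjoint domains."""
    def h(x):
        fx, gx = f(x), g(x)
        assert fx is None or gx is None, "domains overlap"
        return gx if fx is None else fx
    return h


def p_T(n):
    return n if n == 0 else None  # guard for the test (n = 0?)


def p_F(n):
    return n if n != 0 else None  # complementary guard


def zero(n):
    return 0


def succ(n):
    return n + 1


def pred(n):
    return 0 if n == 0 else n - 1


def bottom(n):
    return None


def psi(a):
    """psi(a) = 0 p_T + succ a pred p_F, as in Example 12."""
    return psum(compose(zero, p_T), compose(succ, a, pred, p_F))


def approximant(k):
    g = bottom
    for _ in range(k):
        g = psi(g)
    return g


print([approximant(5)(n) for n in range(7)])  # [0, 1, 2, 3, 4, None, None]
```

The approximants are restrictions of the identity to ever larger initial segments of N, in line with Exercise 5 of this section.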
We now consider a "nonlinear" recursive definition of a partial function
$f: N \to N$ by
$$f(x) = \text{if } p(x) \text{ then } f(f(g(x))) \text{ else } x. \tag{13}$$
This corresponds to the specification
$$\psi(a) = p_F + a^2 g p_T.$$
(Here squaring refers to composition of partial functions.) Corresponding to
8, we may rewrite this $\psi$ as
$$\psi(a) = H_0 + H_2(a, a), \tag{14}$$
where $H_0 = p_F$ corresponds to the 0-substitution path, while $H_2(a_1, a_2) = a_2 a_1 g p_T$ corresponds to the 2-substitution path, with $a_1$ being the partial
function substituted for the first call along the path, while $a_2$ is the partial
function substituted for the second call along the path. It is only when the
same substitution is made at both places (as in checking the fixed point
equation 14) that we force the two arguments of $H_2$ to be equal.
We now define the set PC(H) of patterns of calls for the $H = (H_0, H_2)$ of
14 to be the set of partial functions $N \to N$ given by the inductive definition
15
$H_0 \in PC(H)$.
If $a_1$ and $a_2$ are in PC(H), then $H_2(a_1, a_2) \in PC(H)$.
We shall now see that
16 PC(H) is a family of partial functions with disjoint domains whose sum,
$$e_H = \sum (a : a \in PC(H)),$$
which we call the pattern-of-calls expansion for $\psi$, coincides with the Kleene
semantics of 13.
The idea is that PC(H) exhausts all possible patterns of calls. For example,
consider the pattern of calls represented by the tree
$$t_2 = H_2(H_2(H_0, H_0), H_0)$$
(a root $H_2$ whose first argument is the subtree $H_2(H_0, H_0)$ and whose second argument is the leaf $H_0$), which evaluates to
$$p_F (p_F p_F g p_T) g p_T = p_F (g p_T)^2$$
on noting that $p_F^2 = p_F$.
Here $H_0$ corresponds to the 0-substitution path. Next,
$$t_1 = H_2(H_0, H_0)$$
corresponds to the 2-substitution path with each call being to $H_0$; while the
overall pattern $t_2$ corresponds to the 2-substitution path with the first call
being to the pattern $t_1$ while the second call is to $H_0$.
The Kleene sequence of 14 takes the form
$$\psi^0(\bot) = \bot$$
$$\psi^1(\bot) = H_0$$
$$\psi^2(\bot) = H_0 + H_2(H_0, H_0)$$
$$\psi^3(\bot) = H_0 + H_2(H_0, H_0) + H_2(H_0, t_1) + H_2(t_1, H_0) + H_2(t_1, t_1), \quad \text{where } t_1 = H_2(H_0, H_0).$$
It is clear that any two functions in PC(H) appear as terms in some $\psi^k(\bot)$
and so have disjoint domains, and hence $e_H$ in 16 exists and coincides with
the Kleene semantics.
17 Example. To illustrate the utility of the pattern-of-calls expansion we use
it to prove that the Kleene semantics of the specification corresponding to 13
reduces to the semantics of while p do g. To analyze $e_H$ we recall that
$$H_0 = p_F, \quad \text{while} \quad H_2(a_1, a_2) = a_2 a_1 g p_T.$$
It is thus clear that the leftmost term in every $a \in PC(H)$ is $p_F$, while the
rightmost term is $p_T$ unless $a = H_0$. Thus,
$$H_2(a_1, a_2) = (\cdots p_T)(p_F \cdots) g p_T = 0 \quad \text{unless } a_2 = H_0,$$
since $p_T p_F = 0$, the nowhere-defined partial function.
We see, then, that the only patterns-of-calls which can make a nonzero
contribution to $e_H$ are of the form
$$t_n = H_2(H_2(\cdots H_2(H_0, H_0) \cdots, H_0), H_0) \qquad (n \ge 0 \text{ occurrences of } H_2).$$
Now $t_0 = H_0 = p_F$, while $t_{n+1} = H_2(t_n, p_F) = p_F t_n g p_T$, and we see by induction that $t_n = p_F (g p_T)^n$ since $p_F p_F = p_F$. Thus,
$$e_H = \sum_{n=0}^{\infty} t_n = \sum_{n=0}^{\infty} p_F (g p_T)^n$$
which we recognize as the iterative fixed point solution of 4 corresponding to
$f_1 = g p_T$ and $f_2 = p_F$, so that $e_H$ does indeed equal the semantics of while p
do g (which is $f^\dagger$ for $f = in_T g p_T + in_F p_F$).
In the next example we consider a simultaneous recursive definition of two
functions $t, u \in \mathrm{Pfn}(X, X)$.
18 Example. Consider the simultaneous recursive specification
[Flowschemes for t and u; diagram not reproduced.]
Here Definition 5.1.14 must be generalized to $\psi: A \to A$ where A is the set
Pfn(X, X) × Pfn(X, X) of pairs of partial functions. This A inherits a partial
addition from Pfn(X, X) by adding separately in each component. Decomposing f into $f_1$ and $f_2$ and g into $g_1$ and $g_2$ as before, we then rewrite 18 as
$$t = u t f_1 + f_2, \qquad u = t^2 g_1 + k g_2.$$
We are thus led to define $H_0 \in A$ and $H_2: A^2 \to A$ by
$$H_0 = (f_2, k g_2), \tag{19}$$
$$H_2((a_{t1}, a_{u1}), (a_{t2}, a_{u2})) = (a_{u2} a_{t1} f_1,\ a_{t2} a_{t1} g_1).$$
$H_2$ requires explanation. $H_2$ deals with the paths in the flowscheme 18 which
make two calls. If $v \in \{t, u\}$, $j \in \{1, 2\}$, the argument $a_{vj}$ refers to the substitution of v for the jth call. To see that this is the correct idea, it is essentially
necessary to adapt the process of motivation as in 5.1.9-13 to appreciate the
relevance of the Kleene semantics (defined via the obvious generalization of
5.1.14) of the specification
$$\psi(a, b) = H_0 + H_2((a, b), (a, b)). \tag{20}$$
The first two terms of the Kleene sequence are
$$\psi(\bot, \bot) = H_0 = (f_2, k g_2);$$
$$\psi^2(\bot, \bot) = \psi(f_2, k g_2) = H_0 + H_2((f_2, k g_2), (f_2, k g_2)) = (f_2, k g_2) + (k g_2 f_2 f_1,\ f_2 f_2 g_1) = (f_2 + k g_2 f_2 f_1,\ k g_2 + f_2^2 g_1).$$
We leave it as an exercise for the reader to verify that if the flowschemes for
t and for u are substituted for each t and each u in 18 then the exit paths with
no calls of t, u in the resulting expanded flowschemes are exactly the components of $\psi(\bot, \bot)$ above.
For $H = (H_0, H_2)$, PC(H) is defined by
21
$H_0 \in PC(H)$.
If $(a_{t1}, a_{u1}), (a_{t2}, a_{u2}) \in PC(H)$, then $H_2((a_{t1}, a_{u1}), (a_{t2}, a_{u2})) \in PC(H)$.
While not obvious at this stage, the Kleene semantics and pattern-of-calls
expansion exist and coincide. This follows from the theory in Chapter 8.
22 Example. The recursive specification of Ackermann's function, defined
as in 5.1.3 has "power-series" representation. Here, $\psi: \mathrm{Pfn}(N^2, N) \to \mathrm{Pfn}(N^2, N)$ is given by
[Flowscheme with two tests and three exits computing $(m, n) \mapsto n + 1$, $(m, n) \mapsto a(m - 1, 1)$, and $(m, n) \mapsto a(m - 1, a(m, n - 1))$; diagram not reproduced.]
Then $\psi(a) = H_0 + H_1(a) + H_2(a, a)$ where
$H_0(m, n)$ = if m = 0 then n + 1 else undefined;
$H_1(a)(m, n)$ = if m > 0 and n = 0 then a(m - 1, 1) else undefined;
$H_2(a_1, a_2)(m, n)$ = if m, n > 0 then $a_2(m - 1, a_1(m, n - 1))$ else undefined.
EXERCISES FOR SECTION 5.2
1. Show that the pattern-of-calls expansion for
$\psi(h)$ = if n = 0 then 5 else h(n - 1)
as in 5.1.1 is the total function f(n) = 5.
2. Obtain the pattern-of-calls expansion for the specification corresponding to
Example 5.1.2 and prove that it is the usual factorial function.
3. For $f \in \mathrm{Pfn}(X, X)$, $p \in \mathrm{Guard}(X)$ a suitable specification for while p do f is
$$\psi(h) = h f p + p'.$$
Obtain the Kleene semantics and pattern-of-calls expansion and show they are
equal. Discuss why this semantics is intuitively correct.
4. Repeat Exercise 3 for repeat f until p.
5. Show that the pattern-of-calls expansion in Example 12 is the identity function
$n \mapsto n$.
6. Consider the specification
$\psi(h)$ = if $n \le 1$ then 1 else h(n - 1) + h(n - 2)
corresponding to the Fibonacci function of Exercise 5.1.2. There are two candidates for a power-series specification, namely,
(i) $\psi(h) = H_0 + H_1(h)$
and
(ii) $\psi(h) = H_0 + H_2(h, h)$,
where
$H_0$ = if $n \le 1$ then 1 else undefined;
$H_1(h) = h(n - 1) + h(n - 2)$;
$H_2(h_1, h_2) = h_1(n - 1) + h_2(n - 2)$.
Show that the pattern-of-calls expansion for both (i) and (ii) coincide with the
Fibonacci function. [Warning: Do not confuse + in Pfn(N, N) with + in N.]
7. Consider the power-series specification
$\psi(h) = H_0 + H_2(h, h)$,
where
$H_0$ = if n = 0 then 0 else undefined,
corresponding to Exercise 5.1.8. Show that the pattern-of-calls expansion is the
identity function. [Hint: As in the analysis of 13, consider trees. Show that for any
subtree t, the tree with root $H_2$, left subtree $H_0$, and right subtree t
is 1 or undefined accordingly as t evaluates to 0 or not.]
5.3 Iteration Recursively
Iteration can only be expressed recursively in the programming language
LISP. We have already studied "recursive iteration" in Section 5.2, where we
saw that $f^\dagger$ is defined by 5.2.5, $\psi(a) = f_2 + a f_1$. In the present section, we
extend this concept by describing an iterative flowscheme with a set of simultaneous recursive equations. We then use partially additive semantics in Pfn
and algebra to rewrite and solve these equations. This one example provides
methods to be further applied in the exercises.
Consider
1 [Flowscheme from X to X with tests p, q, r (each with exits T and F), operations a and b, and intermediate points f, g, h; diagram not reproduced.]
for $a, b: X \to X$ in Pfn, $p, q, r: X \to X + X$ with $p = in_1 p_T + in_2 p_F$ with
$p_T \in \mathrm{Guard}(X)$, $p_F = p_T'$ and $q = in_1 q_T + in_2 q_F$, $r = in_1 r_T + in_2 r_F$ similarly.
Define $f, g, h \in \mathrm{Pfn}(X, X)$ to be the functions computed by starting at the
indicated point in 1 and proceeding to the exit. Thus, the semantics of 1 is f.
The reason for introducing g, h lies in the fact that they allow the following
simultaneous recursive equations to describe 1:
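$$\begin{aligned} f &= \text{if } p \text{ then } a \text{ else } g; \\ g &= \text{if } q \text{ then } g \text{ else } h; \\ h &= \text{if } r \text{ then } b \text{ else } f. \end{aligned} \tag{2}$$
We now express the right-hand side of 2 in partially additive form, as
$$f = g p_F + a p_T. \tag{3}$$
$$g = g q_T + h q_F. \tag{4}$$
$$h = f r_F + b r_T. \tag{5}$$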
It requires the theory of Chapters 6-8 to clarify what we intend "the
correct solution" of 2 or 3-5 to be. Thus, the algebraic analysis we now give,
while highly suggestive, must be justified later. Substituting 5, and then 3, in
4 we obtain
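$$\begin{aligned} g &= g q_T + (f r_F + b r_T) q_F \\ &= g q_T + (g p_F + a p_T) r_F q_F + b r_T q_F \\ &= g(q_T + p_F r_F q_F) + (a p_T r_F q_F + b r_T q_F), \end{aligned} \tag{6}$$
which we recognize as the same form as 5.2.4. This prompts us to regard g as
$f^\dagger$ for $f = in_1(q_T + p_F r_F q_F) + in_2(a p_T r_F q_F + b r_T q_F)$. Accepting this,
$$g = \sum_{n=0}^{\infty} (a p_T r_F q_F + b r_T q_F)(q_T + p_F r_F q_F)^n.$$
But note that since $q_F q_T = 0$, an expression of the form
$$t q_F (q_T + u)^n$$
for $n \ge 1$ and $u \in \mathrm{Guard}(X)$ simplifies as follows:
$$\begin{aligned} t q_F (q_T + u)^n &= t q_F (q_T + u)(q_T + u)^{n-1} \\ &= t q_F u (q_T + u)(q_T + u)^{n-2} \\ &= t u q_F (q_T + u)(q_T + u)^{n-2} && \text{(since guards commute, by 3.3.19(v))} \\ &= \cdots = t u^n q_F = t q_F u^n && \text{(3.3.19(v) again).} \end{aligned}$$
Thus,
$$g = \sum_{n=0}^{\infty} (a p_T r_F q_F + b r_T q_F)(p_F r_F q_F)^n \tag{7}$$
and so g is given by
8 [Flowscheme for g from X to X in terms of s and c; diagram not reproduced.]
where
$$s = \text{not } p \text{ and not } q \text{ and not } r; \qquad c = a p_T r_F q_F + b r_T q_F. \tag{9}$$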
Substituting in 3, the desired semantics is then also that of
10 [Simplified flowscheme; diagram not reproduced.]
The reader may easily verify by inspection that 10 is equivalent to 1. Hence,
mechanical algebraic manipulation has simplified the original flowscheme!
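The claimed equivalence can also be tested by brute force. The sketch below (Python) simulates the flowscheme as it is described by the simultaneous equations 2, with a step budget `fuel` of our own devising standing in for nontermination, and compares it with a closed form that we read off from 3 and 7 under the assumption that the tests never change the state: f returns a(x) when p holds, returns b(x) when p and q fail and r holds, and is undefined otherwise. All function names and the encoding of the eight truth assignments in the bits of x are hypothetical.

```python
def run_flowscheme(p, q, r, a, b, x, fuel=100):
    """Simulate equations 2 starting at f; None if the step budget runs out."""
    state = "f"
    for _ in range(fuel):
        if state == "f":
            if p(x):
                return a(x)
            state = "g"
        elif state == "g":
            state = "g" if q(x) else "h"
        else:  # state == "h"
            if r(x):
                return b(x)
            state = "f"
    return None


def simplified(p, q, r, a, b, x):
    """Closed form assumed here: a if p; b if not p, not q, and r; otherwise undefined."""
    if p(x):
        return a(x)
    if not q(x) and r(x):
        return b(x)
    return None


# The three low bits of x encode the truth values of p, q, r.
p = lambda x: bool(x & 1)
q = lambda x: bool(x & 2)
r = lambda x: bool(x & 4)
a = lambda x: ("a", x)
b = lambda x: ("b", x)

print(all(run_flowscheme(p, q, r, a, b, x) == simplified(p, q, r, a, b, x) for x in range(8)))  # True
```

Because nothing on a looping path alters the state, a budget of 100 steps is ample here; with state-changing loop bodies a finite budget could only give evidence, not a decision.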
EXERCISES FOR SECTION 5.3
1. Use the methods of this section to simplify
[Flowscheme from X to X; diagram not reproduced.]
to
[Flowscheme from X to X; diagram not reproduced.]
for appropriate s, c.
2. Show that
[Flowscheme; diagram not reproduced.]
simplifies to while p do a:
[Flowscheme; diagram not reproduced.]
3. A matrix over Pfn(X, X) is an m × n array $A = [a_{ij}]$, where i = 1, ... , m, j = 1, ... , n,
and each $a_{ij} \in \mathrm{Pfn}(X, X)$ subject to the requirement that for each i, t, u with $t \neq u$,
$$DD(a_{it}) \cap DD(a_{iu}) = \emptyset.$$
If $A = [a_{ij}]$ is an m × n matrix and if $B = [b_{jk}]$ is an n × p matrix define an m × p
array BA by a formula familiar from linear algebra, namely,
$$c_{ik} = \sum_{j=1}^{n} b_{jk} a_{ij},$$
where $\sum$ is the usual partially additive sum of Pfn(X, X) and $b_{jk} a_{ij}$ refers to
composition.
(i) Show that the sum for $c_{ik}$ exists.
(ii) Show that BA is a matrix, that is, if $t \neq u$, $DD(c_{it}) \cap DD(c_{iu}) = \emptyset$.
(iii) Show that C(BA) = (CB)A for m × n A, n × p B, and p × q C.
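4. In the context of 1-5 and the previous exercise, let
$A = \begin{bmatrix} 0 & p_F & 0 \\ 0 & q_T & q_F \\ r_F & 0 & 0 \end{bmatrix}, \qquad B = \begin{bmatrix} a p_T \\ 0 \\ b r_T \end{bmatrix}.$
(As usual, we represent a matrix such as $A = [a_{ij}]$ as a rectangular array with $a_{ij}$ in
row $i$ and column $j$.)
(i) Show that
$A : B = \begin{bmatrix} 0 & p_F & 0 & a p_T \\ 0 & q_T & q_F & 0 \\ r_F & 0 & 0 & b r_T \end{bmatrix}$
is a matrix.
If $C = [c_{ij}]$ and $D = [d_{ij}]$ are $n \times n$ matrices over Pfn(X, X) we say $C + D$
is defined if $c_{ij} + d_{ij}$ exists for all $i, j$ and if $C + D$ with $ij$ entry $c_{ij} + d_{ij}$ is again
a matrix.
(ii) Writing
$F = \begin{bmatrix} f \\ g \\ h \end{bmatrix},$
show that $FA + B$ is defined and that 3, 4, and 5 are equivalent to
$F = FA + B.$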
For a continuation of the previous two exercises see Exercise 6.2.11.
Notes and References for Chapter 5
"Recursive function theory" has been studied since the 1930s. The class of recursive
functions is defined by extending the inductive part of the definition of the primitive
recursive functions in Exercise 5.1.7 to include the construction of minimization:
if $f(n_1, \ldots, n_{k+1})$ is recursive then $g$ is recursive if
$g(n_1, \ldots, n_k)$ is the least $m$ with $f(n_1, \ldots, n_k, m) = 0$
but with $f(n_1, \ldots, n_k, n)$ defined for $0 \le n \le m$.
Since there may be no $m$ with $f(n_1, \ldots, n_k, m) = 0$, this construction produces partial
functions which need not be total. The class of recursive functions is known to
coincide with the class of functions of the form $\mathbf{N}^k \to \mathbf{N}$ that can be computed in a
programming language such as Fortran, Pascal, LISP, Ada, ....
In 1926, David Hilbert asked whether every recursive function that was total was
necessarily primitive recursive. Ackermann's function of Example 5.1.3 provides a
counterexample as was shown by W. Ackermann in a paper (written in German) in
Mathematische Annalen, 99, 1928, pp. 118-133. After proving that a(m, n) always
halts, Ackermann proved it could not be primitive recursive by arguing that the
number of computations required for a(m, n) was larger than that required for any
primitive recursive function. The reader may wish to check that a(3, 3) = 61, although
here a computer might be helpful. We do not suggest that the reader should attempt
to verify the following extraordinary fact:
$a(4, 4) = 2^{2^{2^{2^{16}}}} - 3.$
This was verified by some poor abused secretary who was handed a crate of pencils,
two truck loads of paper, and told to figure it out; no hurry, next week will be fine.
(The number a(4, 4) is vastly in excess of the number of hydrogen atoms that could fit
in a cube having a side the diameter of the Milky Way Galaxy at a density of 1 ton
per cubic inch.)
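The check of a(3, 3) = 61 suggested above can be done with a few lines of Python. This is only a sketch: Example 5.1.3 is not reproduced here, so the code assumes that a is the usual two-argument Ackermann function with a(0, n) = n + 1, a(m, 0) = a(m - 1, 1), and a(m, n) = a(m - 1, a(m, n - 1)). The name `ackermann` and the explicit stack (used to avoid deep recursion) are choices made for this illustration.

```python
def ackermann(m, n):
    """Two-argument Ackermann function (assumed form), evaluated with an explicit stack."""
    stack = [m]  # pending first arguments; n always holds the current second argument
    while stack:
        m = stack.pop()
        if m == 0:
            n = n + 1                # a(0, n) = n + 1
        elif n == 0:
            stack.append(m - 1)      # a(m, 0) = a(m - 1, 1)
            n = 1
        else:
            stack.append(m - 1)      # outer call a(m - 1, ...) waits for the inner result
            stack.append(m)          # inner call a(m, n - 1) is evaluated first
            n = n - 1
    return n


print(ackermann(3, 3))
```

Calling `ackermann(4, 4)` is not advisable; the value displayed above is far beyond anything this loop could reach.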
For the state-of-the-art on the unsolved problem of 5.1.5 see J. C. Lagarias,
"The 3x + 1 problem and its generalizations," American Mathematical Monthly, 92,
January 1985, pp. 3-22.
For details on "computation strategies" see Z. Manna, Mathematical Theory of
Computation, McGraw-Hill, 1974, Section 5-2.
The reader may have been surprised at the omission of a number of terms associated with calling strategy such as "call-by-value," "call-by-name," "call-by-reference,"
and others. These terms refer to details of operational semantics and are usually
meaningful only in the context of a specific programming language, so they apply at
the level which we do not intend to address in this book.
While Definitions 5.1.14-16 are implicit in the work of S. C. Kleene in recursive
function theory, their emphasis in the context of the semantics of programming
languages is due to D. S. Scott. See his article, "The lattice of flow diagrams," in
E. Engeler (ed.), Symposium on Semantics of Algorithmic Languages, Lecture Notes in
Mathematics Vol. 188, Springer-Verlag, 1971, pp. 311-366. For an exposition of
Kleene's approach see Chapter 11 of H. Rogers, Jr., Theory of Recursive Functions and
Effective Computability, McGraw-Hill, 1967. (Incidentally, Chapter 12 of Rogers'
book discusses "Recursively enumerable sets as a lattice.")
Section 5.2 is adapted from the authors' paper, "The pattern-of-calls expansion is
the canonical fixpoint for recursive definitions," Journal of the Association for Computing Machinery, 29, 1982, pp. 577-602.
The idea of expressing iteration recursively as in 5.3.2 was emphasized by John
McCarthy, the inventor of LISP. See his paper, "Recursive functions of symbolic
expressions and their computation by machine," Communications of the Association
for Computing Machinery, 3, 1960, pp. 184-195.
CHAPTER 6
Order Semantics of Recursion
6.1 Domains
6.2 Fixed Point Theorems
6.3 Recursive Specification in FPF
6.4 Fixed Points and Formal Languages
The task of the next three chapters is to better understand the abstract
principles underlying the ideas associated with recursive definitions of partial
functions as presented in Chapter 5 and to thereby elevate the theory to a
wide class of semantic categories.
This chapter focuses on order semantics of recursion: the use of the
theory of posets to provide a framework to formulate recursion and study its
properties. We already used the extension ordering 2.1.9 in our requirement
5.1.15 that a recursive specification $\psi$: Pfn(X, Y) $\to$ Pfn(X, Y) must satisfy
$\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$
This condition ensures that the formula for the Kleene semantics $f_\psi$ for $\psi$
of 5.1.16 defines a partial function and the motivating remarks preceding
Definition 5.1.14 support that it is a natural condition.
For most recursive specifications in Pfn, $f_\psi$ may be characterized as the
least element of the set of fixed point solutions of $\psi$, an abstract property
expressed in terms of the poset structure of Pfn(X, Y) rather than as a specific
formula such as 5.1.16 which applies only to partial functions. In Section 6.1
we show that the poset Pfn(X, Y) belongs to a general class of posets called
"domains" and in Section 6.2 we prove that each "continuous" specification
$\psi: D \to D$ with $D$ a domain has a least fixed point solution which is, moreover,
given by a formula which generalizes 5.1.16. This is a satisfactory theory
because, in practice, recursive specifications are continuous.
One is not forced to look to more general semantic categories than Pfn to
motivate the need for a more general theory, for the simultaneous recursion
of 5.2.18 required an ad hoc generalization of the Definitions 5.1.14 of Kleene
semantics. Simultaneous recursion is included in the theory of Section 6.2.
This is then applied in Section 6.3 to extend the functional programming
fragment of Section 1.3 to simultaneous recursion definitions in Pfn.
Recursive definitions are not limited to programming languages and find
many uses in mathematics and computer science. In Section 6.4 we apply the
theory of Sections 6.1 and 6.2 to solve for the language generated by a
context-free grammar.
6.1 Domains
The desire to define a Kleene semantics formula such as 5.1.16 in more
general posets than Pfn(X, Y) leads to the notion of "domain." These are
defined in this section and a few examples are given.
Posets were introduced in 2.1.6 and infima and suprema for two-element
families were discussed in 3.3.3. We begin by extending these definitions to
arbitrary families.
1 Definition. Let $(P, \le)$ be a poset and let $S$ be any subset of $P$. An upper
bound of $S$ is an element $x$ of $P$ (not necessarily in $S$) satisfying "for all $s \in S$,
$s \le x$." Let UB(S) denote the set of all upper bounds. Note that UB($\emptyset$) = $P$
(since if $x \in P$, the condition "for all $s \in \emptyset$, $s \le x$" is vacuously true). The
supremum or least upper bound of $S$, denoted LUB(S) or $\bigvee S$, is the least
element of UB(S); thus, it may not exist but is unique if it does as shown in
3.3.1.
A poset $(P, \le)$ is complete if LUB(S) exists for every subset $S \subset P$. A
complete poset has a least element, namely, LUB($\emptyset$). We shall often use the
symbol $\bot$ for the least element of a poset.
Dually, a lower bound of $S$ is an element $x$ of $P$ (not necessarily in $S$) such
that $x \le s$ for all $s \in S$. Denote the set of all lower bounds of $S$ by LB(S). The
infimum or greatest lower bound of $S$, if it exists, is the greatest element of
LB(S) and is denoted GLB(S) or $\bigwedge S$.
Since UB($\emptyset$) = $P$, LUB($\emptyset$) is the same concept as the least element $\bot$ of
$(P, \le)$. Dually, GLB($\emptyset$) is the greatest element.
It is immediate from the definitions in 3.3.3 that if $S = \{x, y\}$, $\bigwedge S$ is the
same concept as $x \wedge y$ whereas $\bigvee S = x \vee y$.
2 Example. Let $(\mathbf{N}, \le)$ be the poset of the natural numbers with the usual
numerical ordering. Then any nonempty subset of $\mathbf{N}$ has a least element. This
amounts to the principle of mathematical induction, that is, if property $P_n$ is
not true for all $n$ then $\{n : P_n \text{ is false}\}$ is nonempty and so has a least element
$n_0$: assuming $P_0$ is true, $n_0 > 0$, and $P_{n_0 - 1}$ is defined. As $P_{n_0 - 1}$ is true this
contradicts the "induction hypothesis" that $P_{n_0 - 1} \Rightarrow P_{n_0}$. It follows that any
nonempty subset of $(\mathbf{N}, \le)$ which has at least one upper bound must have a
least upper bound. On the other hand, no infinite subset has any upper
bounds.
3 Example. The poset $(\mathcal{P}(X), \subset)$ of 2.1.8 of all subsets of $X$ is a complete
poset. The greatest lower bound of a family $\mathcal{S}$ of subsets of $X$ (which must
exist; see Exercise 6) is its intersection
$\bigcap \mathcal{S} = \{x \in X \mid x \in S \text{ for all } S \in \mathcal{S}\},$
whereas the least upper bound of $\mathcal{S}$ is its union
$\bigcup \mathcal{S} = \{x \in X \mid x \in S \text{ for at least one } S \in \mathcal{S}\}.$
The poset (Pfn(X, Y), $\le$) with the extension ordering of 2.1.9 is not
complete. Indeed, if $f, g \in$ Pfn(X, Y) have an $x \in DD(f) \cap DD(g)$ such that
$f(x) \ne g(x)$, then there are no upper bounds of $\{f, g\}$ since if $f \le h$ and $g \le h$
then $f(x) = h(x) = g(x)$ which is not so. This provides half of the proof of the
following:
4 Observation. For $f, g \in$ Pfn(X, Y), $f, g$ have an upper bound if and only if
$f(x) = g(x)$ for all $x \in DD(f) \cap DD(g)$.
One way was just observed and, conversely, if $f(x) = g(x)$ for all
$x \in DD(f) \cap DD(g)$ then $f \le h$, $g \le h$ for $h$ defined by
$DD(h) = DD(f) \cup DD(g),$
$h(x) = \begin{cases} f(x) & x \in DD(f) \\ g(x) & x \in DD(g). \end{cases}$
The following is then immediate.
5 Observation. For the partially additive category Pfn, if $(f_i \mid i \in I)$ is summable
in Pfn(X, Y), $\bigvee f_i$ exists in the extension ordering and
$\bigvee f_i = \sum f_i.$
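Observation 4 can be illustrated on finite partial functions. The following sketch represents an element of Pfn(X, Y) as a Python dict whose keys form the domain of definition; the names `extends` and `upper_bound` are hypothetical helpers introduced only for this illustration.

```python
def extends(f, g):
    """Extension ordering: f <= g iff DD(f) is contained in DD(g) and g agrees with f there."""
    return all(x in g and g[x] == f[x] for x in f)


def upper_bound(f, g):
    """Return the function h of Observation 4, or None when f and g clash on a shared argument."""
    for x in f.keys() & g.keys():
        if f[x] != g[x]:
            return None          # f(x) != g(x): {f, g} has no upper bound
    h = dict(f)
    h.update(g)                  # DD(h) = DD(f) union DD(g)
    return h


f = {0: 1, 1: 2}
g = {1: 2, 2: 3}
h = upper_bound(f, g)
print(h, extends(f, h), extends(g, h))
print(upper_bound({0: 1}, {0: 2}))
```

When the two domains are disjoint, the dict returned is also the sum of the two partial functions, which is the situation of Observation 5.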
6 Example. For any sets $X$, $Y$, (Mfn(X, Y), $\le$) with $f \le g$ if $f(x) \subset g(x)$ for all
$x \in X$ is a complete poset. We leave the poset axioms as an exercise for the
reader. Given $\{f_i \mid i \in I\} \subset$ Mfn(X, Y),
$(\bigvee f_i)(x) = \bigcup_{i \in I} f_i(x)$
defines the least upper bound, the least element being defined by
$\bot(x) = \emptyset$
(corresponding to the case $I = \emptyset$).
We now ask what property of the extension ordering on Pfn(X, Y) allows
a construction such as the formula for Kleene semantics 5.1.16. We have
already seen in 4 that certain suprema exist. Indeed, further suprema exist in
that Pfn(X, Y) is a domain as now defined.
7 Definition. A poset $(P, \le)$ is a domain if it has a least element and if
whenever $(x_n : n = 1, 2, 3, \ldots)$ is an ascending chain in $P$ (which means
$x_n \le x_{n+1}$ for all $n$) then LUB$\{x_n\}$ exists.
8 Example. (Pfn(X, Y), $\le$) with the extension ordering
$f \le g$ if $DD(f) \subset DD(g)$ and $g(x) = f(x)$ for $x \in DD(f)$
of 2.1.9 is a domain. Indeed, if
$f_1 \le f_2 \le f_3 \le \cdots$
define $\bigvee f_i$ by
9   $DD(\bigvee f_i) = \bigcup_{i=1}^{\infty} DD(f_i),$
    $(\bigvee f_i)(x) = f_k(x)$, any $k$ with $x \in DD(f_k)$.
Formula 9 is well defined because if $x \in DD(f_j) \cap DD(f_k)$ then either $f_j \le f_k$
or $f_k \le f_j$ (since $m \le n$ implies that $f_m \le f_n$) so that $f_j(x) = f_k(x)$. It is then
obvious that each $f_j$ is $\le \bigvee f_i$. Furthermore, if $f_j \le g$ for all $j$ then $DD(f_j) \subset
DD(g)$ so that $DD(\bigvee f_i) \subset DD(g)$ and $(\bigvee f_i)(x) = f_k(x)$ (for some $k$) $= g(x)$ so
that $\bigvee f_i \le g$. This shows that 9 indeed defines the least upper bound.
In the context of Example 8, we see that the Kleene sequence of 5.1.15
10   $\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$
is indeed an ascending chain in Pfn(X, Y) and that its Kleene semantics
$DD(f_\psi) = \bigcup_{k=1}^{\infty} DD(\psi^k(\bot)),$
$f_\psi(x) = \psi^k(\bot)(x)$, any $k$ with $x \in DD(\psi^k(\bot))$,
as in 5.1.16 is exactly an instance of 9 so that we now see that the Kleene
semantics satisfies
11   $f_\psi = \bigvee_{k=1}^{\infty} \psi^k(\bot).$
12 Example. (Mfn(X, Y), $\le$) as in 6 is a domain. This is obvious since any
complete poset is a domain.
In the next section we will introduce a general definition of a recursive
specification as a suitable function $\psi: D \to D$ on an arbitrary domain $D$
generalizing $D$ = Pfn(X, Y). In this context, the following intuition is useful.
13   $f \le g$ for $f, g \in D$ means "$g$ has at least as much information as $f$."
This intuition applies to 10 in the context of the examples of Section 5.1
since
$\psi^k(\bot) \le \psi^{k+1}(\bot)$
corresponds to the fact that after $k + 1$ substitutions of $\psi$ (in the "all-call
scenario") at least as many and possibly more exits can occur as could for $k$
substitutions. Thus, 10 is a "sequence of approximations" and 11 asserts that
the Kleene semantics is the "limit of the approximating sequence."
Hasse diagram notation as in 2.1.6 can be useful even for infinite posets.
For example, $(\mathbf{N}, \le)$ as in 2 has Hasse diagram
[Hasse diagram of $(\mathbf{N}, \le)$; diagram not reproduced]
The "flattest" Hasse diagram would be
14   • • •
with one dot for each element in some set $X$, and this describes the discretely
ordered poset $(X, =)$ on $X$ where, as usual, $x = y$ means $x$, $y$ are equal. For
any set $X$ this is a poset as is easily verified. This is not a domain since there
is no least element. This problem may be fixed by adjoining one:
15   [Hasse diagram of 14 with a least element $\bot$ adjoined; diagram not reproduced]
This is indeed a domain since every ascending chain has one of the three
forms
$\bot \le \bot \le \bot \le \cdots,$
$x \le x \le x \le \cdots,$
$\bot \le \bot \le \cdots \le \bot \le x \le x \le x \le \cdots,$
which have suprema $\bot$, $x$, and $x$, respectively. The formal definition is easily
given as follows where the superscript $\flat$ is the flat symbol from musical
notation.
16 Definition. Let $X$ be any set and let $\bot$ be a new object not an element of
$X$. The flat domain of $X$ is the poset
$X^\flat = (X \cup \{\bot\}, \le),$
where $x \le y$ means "$x = \bot$ or $x = y$."
The Hasse diagram of $X^\flat$ is indeed as in 15 and the reason that $X^\flat$ is a
domain was given above.
A domain $D$ is flat if $D = X^\flat$, where $X = \{x \in D : x \ne \bot\}$.
We conclude the section by introducing product domains. The ad hoc discussion of simultaneous recursion in 5.2.18 required a specification of the
form
Pfn(X, X) $\times$ Pfn(X, X) $\to$ Pfn(X, X) $\times$ Pfn(X, X).
Since our general model for a specification will be a function of the form
$D \to D$ for $D$ a domain and since $E$ = Pfn(X, X) is a domain, we have motivated the idea that $E \times E$ should be a domain whenever $E$ is. More general,
but quite natural, examples of simultaneous recursion will require that
$D_1 \times \cdots \times D_n$ be a domain when the $D_i$ are. The requisite definition is easily
given as follows:
17 Definition. Let $(D_1, \le_1), \ldots, (D_n, \le_n)$ be domains, $n > 0$. Then their product
domain $(D, \le)$ is defined as follows:
$D = D_1 \times \cdots \times D_n$ (the product set, 2.3.1, 2.3.11),
$(x_1, \ldots, x_n) \le (y_1, \ldots, y_n)$ if $x_i \le_i y_i$ for all $i \in \{1, \ldots, n\}$.
It is routine to show that $(D, \le)$ is a poset. If
$(x_{11}, \ldots, x_{1n}) \le (x_{21}, \ldots, x_{2n}) \le (x_{31}, \ldots, x_{3n}) \le \cdots$
is an ascending chain in $(D, \le)$ then for each $i \in \{1, \ldots, n\}$,
$x_{1i} \le_i x_{2i} \le_i x_{3i} \le_i \cdots$
is an ascending chain in $(D_i, \le_i)$, and so has supremum $x_i$ and it is clear that
$(x_1, \ldots, x_n)$ is the supremum in $(D, \le)$ of the original chain. The least element
of $(D, \le)$ is $(\bot_1, \ldots, \bot_n)$, where $\bot_i$ is the least element of $(D_i, \le_i)$. Thus, $(D, \le)$
is indeed a domain.
EXERCISES FOR SECTION 6.1
1. Explain the use of the term "dually" in Definition 1 (cf. Exercises 3.3.1 and 3.3.2).
2. In any poset, show that for one-element subsets $S = \{s\}$, $\bigvee S$ and $\bigwedge S$ exist and
coincide with $s$. This generalizes 3.3.6(i).
3. Let $S$, $T$ be subsets of a poset with $S \subset T$ and assume $\bigvee S$, $\bigvee T$ exist. Show
that $\bigvee S \le \bigvee T$. State the dual result for infima (hence, of course, no proof is
necessary).
4. Let $(P, \le)$ be a poset. Show that the least element is the same concept as $\bigwedge P$.
What is the dual statement?
5. Extend Proposition 3.3.5(iii, vi) by proving
$x \wedge (y \wedge z) = \mathrm{GLB}\{x, y, z\} = (x \wedge y) \wedge z$
in any meet-semilattice and
$x \vee (y \vee z) = \mathrm{LUB}\{x, y, z\} = (x \vee y) \vee z$
in any join-semilattice. Included is the assertion that GLB$\{x, y, z\}$ (respectively,
LUB$\{x, y, z\}$) exists. [Hint: Cut the work in half, using duality.]
6. Prove that every subset of a complete poset has an infimum. [Hint: $\bigwedge S =
\bigvee \mathrm{LB}(S)$.]
7. Let $(P, \le)$ be a poset. A subset $S$ of $P$ is consistent if each two elements of $S$ have
an upper bound in $P$. $(P, \le)$ is consistently complete if $(P, \le)$ is a meet-semilattice
with least element in which every consistent subset has a least upper bound.
(i) Show that every complete poset is consistently complete. [Hint: Use
Exercise 6.]
(ii) In Pfn(X, Y), with the extension ordering of 2.1.9, show that "overlap
summable" and "consistent" coincide. Conclude that Pfn(X, Y) is consistently
complete.
(iii) In Pfn(X, Y), show that "disjoint-domain-summable" coincides with "consistent and $f_i \wedge f_j = \bot$ if $i \ne j$."
(iv) Show that $(\mathbf{N}, \le)$ as in Example 2 is not consistently complete.
8. Let $(D, \le)$ be a domain, let $x_1 \le x_2 \le x_3 \le \cdots$ be an ascending chain, and let
$k > 1$. Show that
$\bigvee \{x_1, x_2, x_3, \ldots\} = \bigvee \{x_k, x_{k+1}, x_{k+2}, \ldots\}.$
9. Let $D = X^\flat$, where $X$ has one element. Show that $D \times D$ is not flat. [Hint: Draw
the Hasse diagram.]
10. Let $(P, \le)$ be the poset of all nonzero real numbers with the usual numerical
ordering. Let $S = \{x \in P \mid x < 0\}$. Show that UB(S) is infinite but has no least
element. Conclude that $(P, \le)$ is not consistently complete (as defined in
Exercise 7).
6.2 Fixed Point Theorems
The previous part of this chapter has motivated the following generalizations of Definitions 5.1.14:
1 Definitions. Let $D$ be a domain. A recursive specification on $D$ is a total
function $\psi: D \to D$ such that
2   $\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$
When 2 holds, the sequence
$\psi(\bot), \psi^2(\bot), \psi^3(\bot), \ldots$
is the Kleene sequence of $\psi$. The Kleene semantics of $\psi$ is then $f_\psi \in D$ defined by
3   $f_\psi = \bigvee_{n=1}^{\infty} \psi^n(\bot).$
This supremum exists by the definition of a domain and generalizes 5.1.16 by
6.1.11.
In this section we investigate conditions on a poset $(P, \le)$ and function
$\psi: P \to P$ that guarantee the existence of fixed points $f \in P$ with $\psi(f) = f$. The
most important result is Theorem 13 below which asserts that if $D$ is a
domain and $\psi: D \to D$ is continuous, $f_\psi$ of 3 is the least fixed point of $\psi$. We
begin with the following:
4 Observation. If $(D, \le)$ is a domain and $\psi: (D, \le) \to (D, \le)$ is monotone as
defined in 2.1.10 then $\psi$ is a recursive specification. To prove this, first observe
that $\bot \le f$ is true for any $f$ in $D$ since $\bot$ is the least element, so $\bot \le \psi(\bot)$ must
hold. By monotonicity, $\psi(\bot) \le \psi^2(\bot)$, $\psi^2(\bot) \le \psi^3(\bot)$, ....
The definition of least fixed point is formally given as follows.
5 Definition. Let $(P, \le)$ be a poset and let $\psi: P \to P$ be a total function. A
fixed point of $\psi$ is an element $f$ of $P$ satisfying $\psi(f) = f$. The least fixed point
of $\psi$ (if it exists) is the least element of the set of fixed points of $\psi$.
We claimed in Example 5.1.2 that the recursive specification of the factorial function has only one fixed point solution. The next example provides
a more careful verification.
6 Example. Consider the recursive specification $\psi$: Pfn(N, N) $\to$ Pfn(N, N)
for the factorial function
$\psi(h)(n) = \begin{cases} 1 & \text{if } n = 0 \\ n \cdot h(n - 1) & \text{else.} \end{cases}$
Then $DD(\psi(h)) = \{0\} \cup \{n : n - 1 \in DD(h)\}$ so that $g \le h$ implies $DD(\psi(g)) \subset
DD(\psi(h))$, and an easy induction argument then establishes that $\psi(g) \le \psi(h)$.
Hence, $\psi$ is monotone. Let $f(n) = n!$. Clearly, $\psi(f) = f$ so $f$ is a fixed point.
Suppose $\psi(h) = h$. Then $h(0) = \psi(h)(0) = 1 = f(0)$. Now assume $h(n) = f(n)$
for $n = 0, \ldots, k$. Then $h(k + 1) = \psi(h)(k + 1) = (k + 1) \cdot \psi(h)(k) = (k + 1) \cdot
f(k) = (k + 1) k! = f(k + 1)$. Thus, $f$ is the only fixed point of $\psi$, and so is the
least fixed point of $\psi$.
7 Example. Let $\psi$ be the recursive specification corresponding to Example
5.1.4,
$\psi(h)(n) = h(n + 1).$
This is obviously monotone. In 5.1.4 we showed that the set of fixed points is
the set of total constant functions together with $\bot$, and so $\bot$ is the least fixed
point.
Because these examples suggest a connection between the semantics of a
specification and the least fixed point, we shall prove two general theorems
concerning the existence of fixed points. The first is:
8 Theorem. Let $(P, \le)$ be a poset and let $\psi: (P, \le) \to (P, \le)$ be monotone.
Then if
9   $f = \bigvee \{h : h \le \psi(h)\}$
exists, it is a fixed point of $\psi$.
PROOF. Set $H = \{h : h \le \psi(h)\}$ so that $f = \mathrm{LUB}(H)$. For any $h \in H$ we have
$h \le \psi(h)$ whereas $\psi(h) \le \psi(f)$ since $\psi$ is monotone and $f \in \mathrm{UB}(H)$ so, by
transitivity, $h \le \psi(f)$. As $f = \mathrm{LUB}(H)$,
10   $f \le \psi(f).$
As $\psi$ is monotone, 10 yields $\psi(f) \le \psi(\psi(f))$ so that $\psi(f) \in H$. But then
as $f \in \mathrm{UB}(H)$, $\psi(f) \le f$ which together with 10 and antisymmetry yields
$f = \psi(f)$.   □
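Theorem 8 can be tried out on a small complete poset. The sketch below works in the poset of all subsets of a five-element set and computes formula 9 by brute force. The map `psi`, the tables `SUCC` and `GOOD`, and the helper names are all hypothetical choices for this illustration; `psi` keeps the "good" points whose successor already lies in the given set, which makes it monotone.

```python
from itertools import combinations

SUCC = {0: 1, 1: 0, 2: 3, 3: 4, 4: 4}     # hypothetical successor table
GOOD = frozenset({0, 1, 2, 3})            # hypothetical set of "good" points
UNIVERSE = frozenset(SUCC)


def psi(a):
    """Monotone map on subsets: good points whose successor lies in a."""
    return frozenset(x for x in GOOD if SUCC[x] in a)


def subsets(universe):
    """Enumerate every subset of a finite set."""
    items = sorted(universe)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def theorem8_fixed_point(psi, universe):
    """Formula 9: the union (supremum) of all h with h contained in psi(h)."""
    post_fixed = [h for h in subsets(universe) if h <= psi(h)]
    return frozenset().union(*post_fixed)


f = theorem8_fixed_point(psi, UNIVERSE)
print(sorted(f), psi(f) == f)
```

In this instance the empty set is also a fixed point of `psi`, so the fixed point produced by formula 9 is not the least one.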
11 Example. Let $D$ be any domain and let $\psi: D \to D$ be the identity function
$\psi(d) = d$. Then $\psi$ is monotone. The Kleene semantics is $f_\psi = \bot$ since
$\psi^n(\bot) = \bot$ for all $n$. This is a fixed point solution and is surely the least fixed
point being the least element of $D$. Since the supremum of 9,
$f = \bigvee D,$
is the greatest element of $D$ this may not exist and if it does will produce a
different fixed point solution as long as $D$ has at least two elements.
Theorem 8 has useful applications in mathematics (see Exercise 3) but is
not very useful in semantics. We now introduce continuous functions which
lead to the more useful Theorem 13.
12 Definitions. Let $(D, \le)$ and $(E, \le')$ be domains. A monotone map
$\psi: (D, \le) \to (E, \le')$ is continuous if it preserves least upper bounds of ascending chains. That is, whenever $f_0 \le f_1 \le f_2 \le \cdots$ is an ascending chain
with $f = \bigvee (f_n)$ then $\psi(f_0) \le' \psi(f_1) \le' \psi(f_2) \le' \cdots$ (which is automatically
an ascending chain because $\psi$ is monotone) has $\psi(f) = \bigvee (\psi(f_n))$. Thus,
$\psi(\bigvee (f_n)) = \bigvee (\psi(f_n))$.
We are now ready to prove the Kleene fixed point theorem:
13 Theorem. Let $(D, \le)$ be a domain and let $\psi: (D, \le) \to (D, \le)$ be continuous.
Then the Kleene semantics $f_\psi$ (as in 3) is the least fixed point of $\psi$.
PROOF. If $f_0 \le f_1 \le f_2 \le \cdots$ is any ascending chain, so is $f_1 \le f_2 \le f_3 \le \cdots$
and both have exactly the same set of upper bounds and so must have the
same least upper bound (both being least elements of the same set). It follows
that the least upper bound $f$ of $\bot \le \psi(\bot) \le \psi^2(\bot) \le \cdots$ must also be the
least upper bound of $\psi(\bot) \le \psi^2(\bot) \le \psi^3(\bot) \le \cdots$. But the latter is exactly
$\psi(f)$ by the continuity of $\psi$. It follows that $\psi(f) = f$ and $f$ is a fixed point of
$\psi$. Now let $\psi(g) = g$ be an arbitrary fixed point. As $\bot \le g$ and $\psi$ is monotone,
$\psi(\bot) \le \psi(g) = g$. Similarly, $\psi^2(\bot) \le \psi(g) = g$, $\psi^3(\bot) \le \psi(g) = g$, ... so
$\psi^n(\bot) \le g$ for all $n$. Thus, $g$ is an upper bound of $\{\psi^n(\bot)\}$ and, as $f$ is the least
upper bound of this set, $f \le g$.   □
In certain situations, an intuitive guess $f$ for the value of $f_\psi$ may be hard to
verify using the formula of 3, whereas it might be relatively easy to show
directly that $f$ is the least fixed point of $\psi$. Thus, $f = f_\psi$ by Theorem 13.
Example 6 illustrates this, providing we show $\psi$ there is continuous. This is
done next.
14 Example. We saw in Example 6 that the factorial function is the least
fixed point of the specification
$\psi(h)(n) = \begin{cases} 1 & n = 0 \\ n \cdot h(n - 1) & \text{else.} \end{cases}$
Such $\psi$ is continuous as follows. Monotonicity was already observed above.
If $h = \bigvee h_k$ with $h_0 \le h_1 \le h_2 \le \cdots$ then $DD(\psi(h)) = \{0\} \cup \{n : n - 1 \in DD(h_k)$
for some $k\}$. $\psi(h)(0) = 1 = \psi(h_k)(0)$ for any $k$. For $n > 0$, $\psi(h)(n) =
n \cdot h_k(n - 1)$ for any $k$ with $n - 1 \in DD(h_k)$. Thus, $\psi(h) = \bigvee (\psi(h_k))$. This
shows that Example 6 is an instance of Theorem 13 and proves that the
factorial function is the Kleene semantics.
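The Kleene sequence of the factorial specification can be computed directly when partial functions are stored as finite tables. In this sketch a partial function in Pfn(N, N) is a Python dict, the empty dict plays the role of $\bot$, and the names `psi` and `kleene_iterate` are hypothetical helpers for the illustration.

```python
def psi(h):
    """One application of the factorial specification to the partial function h."""
    result = {0: 1}                        # psi(h)(0) = 1, whatever h is
    for m, value in h.items():
        result[m + 1] = (m + 1) * value    # psi(h)(n) = n * h(n - 1) when n - 1 is in DD(h)
    return result


def kleene_iterate(k):
    """Return psi^k(bottom), starting from the nowhere-defined function."""
    h = {}
    for _ in range(k):
        h = psi(h)
    return h


for k in range(1, 6):
    print(k, kleene_iterate(k))
```

Each iterate extends the previous one by a single argument, so the supremum of the chain is defined on all of N and agrees with n! there.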
15 Example. The specification
$\psi(h)(n) = \begin{cases} 0 & \text{if } n = 0 \\ \ldots & \text{if } n > 0 \text{ but } h(n - 1) > 1 \\ \ldots & \text{if } n > 0 \text{ and } h(n - 1) = 1 \\ \ldots & \text{if } n > 0 \text{ and } h(n - 1) = 0 \end{cases}$
of 5.1.23 is easily seen to be continuous. Setting $f = \psi(f)$ we get
$f(0) = 0$
$f(1) = 3$ as $f(0) = 0$
$f(2) = 1$ as $f(1) = 3$
thus revealing this earlier calculation in 5.1.20 as a proof that $\psi$ has a unique
fixed point. The specification of 5.1.20 was seen not to be monotone so
Theorem 13 does not apply, but the same total function is the unique fixed
point.
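16 Example. Not every monotone map $\psi: D \to D$ is continuous. For an
example with $D$ = Pfn(N, N) see Exercise 6. A simple abstract example is as
follows. Let $(D, \le)$ be the domain $\mathbf{N} \cup \{\infty, \infty + 1\}$, where $\infty$, $\infty + 1$ are new
objects not in $\mathbf{N}$, with
$0 < 1 < 2 < \cdots < n < \cdots < \infty < \infty + 1.$
This is a complete poset with
$\bigvee A = \begin{cases} 0 & \text{if } A = \emptyset \\ \text{the greatest element in } A & \text{if } A \text{ is finite, nonempty} \\ \infty & \text{if } A \text{ is infinite, } \infty + 1 \notin A \\ \infty + 1 & \text{if } \infty + 1 \in A \end{cases}$
and is a domain in particular. Define $\psi: D \to D$ by
$\psi(x) = \begin{cases} x + 1 & \text{if } x \in \mathbf{N} \\ \infty + 1 & \text{if } x = \infty \text{ or } x = \infty + 1. \end{cases}$
Then $\psi$ is easily seen to be monotone but if $x_n = n$,
$\psi(\bigvee x_n) = \psi(\infty) = \infty + 1,$
whereas
$\bigvee \psi(x_n) = \bigvee x_{n+1} = \infty$
so $\psi$ is not continuous. Note that
$f_\psi = \bigvee_{n=0}^{\infty} (n + 1) = \infty$
is not a fixed point of $\psi$.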
We close the section by showing how the theory includes simultaneous
recursion.
17 Example. The recursive specification
g(n) = if n = 0 then h(n, n) else 5
h(m, n) = if m = 0 then 0 else g(n)
is formalized by the specification
$\psi$: Pfn(N, N) $\times$ Pfn(N$^2$, N) $\to$ Pfn(N, N) $\times$ Pfn(N$^2$, N),
$\psi_1(t, u)(n)$ = if n = 0 then u(n, n) else 5,
$\psi_2(t, u)(m, n)$ = if m = 0 then 0 else t(n),
where we use the product domain construction of 6.1.17, and the notation
$\psi^k(t, u) = (\psi_1^k(t, u), \psi_2^k(t, u)).$
It is routine to check that $\psi$ is continuous. The first three terms of the Kleene
sequence are as follows:
$\psi_1(\bot, \bot)(n)$ = if n > 0 then 5 else undefined,
$\psi_2(\bot, \bot)(m, n)$ = if m = 0 then 0 else undefined,
$\psi_1^2(\bot, \bot)(n) = \begin{cases} 5 & \text{if } n > 0 \\ 0 & \text{if } n = 0, \end{cases}$
$\psi_2^2(\bot, \bot)(m, n) = \begin{cases} 0 & \text{if } m = 0 \\ 5 & \text{if } m, n > 0 \\ \text{undefined} & \text{else,} \end{cases}$
$\psi_1^3(\bot, \bot)(n) = \begin{cases} 5 & \text{if } n > 0 \\ 0 & \text{if } n = 0, \end{cases}$
$\psi_2^3(\bot, \bot)(m, n) = \begin{cases} 0 & \text{if } m = 0 \text{ or } n = 0 \\ 5 & \text{if } m, n > 0. \end{cases}$
Since $\psi^3$ consists of total functions, $\psi^3 = \psi^4 = \psi^5 = \cdots$ and coincides with
$f_\psi$. Thus, the Kleene semantics is
$f_\psi$ = (if n = 0 then 0 else 5, if m, n > 0 then 5 else 0).
If $f$ were another fixed point solution then $f_\psi \le f$, $f_\psi$ being the least fixed point,
whereas $f_\psi$ consists of total functions so that $f_\psi = f$. This shows $f_\psi$ is the
unique fixed point solution.
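The three terms of the Kleene sequence in Example 17 can be reproduced mechanically. The sketch below is an illustration under one loud assumption: arguments are restricted to the finite window 0, ..., N - 1 so that partial functions fit in dicts (the specification never looks outside the window, so the restriction does not change the values shown). The names `N`, `psi`, and `kleene_sequence` are hypothetical.

```python
N = 4                      # assumed window size; arguments range over 0 .. N - 1
WINDOW = range(N)


def psi(t, u):
    """One step of the simultaneous specification; t and u are dicts (partial functions)."""
    new_t = {}
    for n in WINDOW:
        if n == 0:
            if (n, n) in u:                # psi_1(t, u)(0) = u(0, 0), when defined
                new_t[n] = u[(n, n)]
        else:
            new_t[n] = 5                   # psi_1(t, u)(n) = 5 for n > 0
    new_u = {}
    for m in WINDOW:
        for n in WINDOW:
            if m == 0:
                new_u[(m, n)] = 0          # psi_2(t, u)(0, n) = 0
            elif n in t:
                new_u[(m, n)] = t[n]       # psi_2(t, u)(m, n) = t(n), when defined
    return new_t, new_u


def kleene_sequence(steps):
    """Return [psi(bottom), psi^2(bottom), ...] for the pair (bottom, bottom)."""
    t, u = {}, {}
    terms = []
    for _ in range(steps):
        t, u = psi(t, u)
        terms.append((t, u))
    return terms


terms = kleene_sequence(4)
print(terms[2][0])
print(terms[2] == terms[3])
```

The third and fourth terms coincide, which is the stabilization used in the example to read off the Kleene semantics.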
EXERCISES FOR SECTION 6.2
1. Complete the argument of Example 6 by providing the omitted induction
argument.
2. Let $X$, $Y$ be sets and let $f: X \to Y$ be a total function. Show that the following
functions are continuous.
(i) $f_*: (\mathcal{P}(X), \subset) \to (\mathcal{P}(Y), \subset)$,   $f_*(A) = \{y \in Y \mid y = f(a) \text{ for some } a \in A\}$.
(ii) $f^*: (\mathcal{P}(Y), \subset) \to (\mathcal{P}(X), \subset)$,   $f^*(B) = \{x \in X \mid f(x) \in B\}$.
3. The Cantor-Schroeder-Bernstein theorem of set theory asserts that if $f: X \to Y$,
$g: Y \to X$ are total injective functions then there exists an isomorphism $h: X \to Y$.
This is quite an amazing result. For example, if $X$, $Y$ are the indicated subsets of
the plane
[figure of the subsets X and Y of the plane; not reproduced]
then it is obvious that there exists an injective function $f: X \to Y$ (create a "photographically reduced" copy of $X$ inside one of the shaded rectangles in $Y$ to define
$f$) and, similarly, there exists an injective function $g: Y \to X$ but it is much less
obvious that there exists a bijective function $h: X \to Y$.
Prove the Cantor-Schroeder-Bernstein theorem by using the following
outline.
(i) Given $f$, $g$ define $\psi: (\mathcal{P}(X), \subset) \to (\mathcal{P}(X), \subset)$ by
$\psi(A) = X - g_*(Y - f_*(A)),$
where $f_*$, $g_*$ are as in Exercise 2 and $X - A$ means $\{x \in X \mid x \notin A\}$.
(ii) By Theorem 8, $\psi$ has a fixed point $S$, $\psi(S) = S$. Thus, $X - S =
g_*(Y - f_*(S))$.
(iii) Show that $h: X \to Y$ is well defined and bijective if
$h(x) = \begin{cases} f(x) & \text{if } x \in S \\ \text{the unique } y,\ g(y) = x & \text{else.} \end{cases}$
4. Show that the fixed point in Theorem 8 is, in fact, the greatest fixed point of $\psi$.
Conclude, using duality and Exercise 6.1.6, that
$\bigvee \{g : g \le h \text{ for all } h \text{ with } \psi(h) \le h\},$
if it exists, is the least fixed point of $\psi$.
5. Show that "domains and continuous functions" with composition and identities
at the Set level is a category for which the construction of 6.1.17, with the usual
Set-level projections, provides finite products. What is the terminal object? Prove
that isomorphisms in this category are the isomorphisms of po sets.
6. Define $f_k \in$ Pfn(N, N) by
$f_k(n)$ = if n is even and $\le 2k$ then n else undefined
and define
$\psi$: Pfn(N, N) $\to$ Pfn(N, N)
by
$\psi(h) = \begin{cases} f_{m+1} & \text{if } DD(h) \text{ is finite and } m \text{ is the largest } k \text{ with } f_k \le h \\ \mathrm{id}_{\mathbf{N}} & \text{else.} \end{cases}$
Show that $\psi$ is monotone but that $f_k$ is an ascending chain with $\psi(\bigvee f_k) \ne \bigvee (\psi(f_k))$, so that $\psi$ is not continuous. Show that the Kleene semantics of $\psi$ is not
a fixed point of $\psi$.
7. Investigate the Kleene semantics of
g(n) = if n > 0 then h(n - 1, n) else 5,
h(m, n) = if m = 0 then 0 else g(n + 1).
8. Use Theorem 13 to do Exercise 5.1.8.
9. Use Theorem 13 to show that the function $f$ provides the Kleene semantics in
Exercise 5.1.11.
10. Investigate the Kleene semantics of
f(n) = if n = 0 then 1 else n - g(f(n - 1)),
g(n) = if n = 0 then 0 else n - f(g(n - 1)).
Show that the corresponding specification is continuous.
Exercise 11 continues to develop iterative programs from the matrix point of
view, building on Exercises 5.3.3-4.
11. An abstract iterative program with n loops over Pfn(X, X) is $(A, B, n)$, where $A$ is
$n \times n$ and $B$ is $n \times 1$ such that $A : B$ is an $n \times (n + 1)$ matrix over Pfn(X, X). Such
describes an algorithm to compute a partial function $f_i \in$ Pfn(X, X) by the obvious
generalization of the following flow diagram for $n = 3$ (where $A = [a_{ij}]$, $B = [b_i]$).
[Flow diagram for n = 3: row i has one input line, labeled $f_i$, and the functions $a_{i1}$, $a_{i2}$, $a_{i3}$, $b_i$; diagram not reproduced]
In general, the functions in row $i$ have disjoint domains and at most one of these
can act on an input value, so each row has only one input line. Furthermore, $a_{ij}$
feeds back to row $j$ and $b_i$ exits. The function $f_i$ is the function computed if entry
is in row $i$.
(i) Show that $(A, B, 3)$ of Exercise 5.3.4 describes 5.3.1.
(ii) For general $(A, B, n)$ if $F$ is an $n \times 1$ "unknown," show that $FA + B$ is defined
and that if $F = [f_i]$ is "the solution," that is, if $f_i$ is the computed function
beginning in row $i$, then
$F = FA + B.$
(iii) Using the product domain $D = (\mathrm{Pfn}(X, X))^n$, show that $\psi: D \to D$,
$\psi(F) = FA + B$ is continuous, and that
$f_\psi = \sum_{m=0}^{\infty} B A^m$
(including the assertion that this sum exists; $BA^0$ means $B$).
(iv) Argue that "the solution" $F$ obtained by "running the flowscheme" of $(A, B, n)$
is $f_\psi$. [Hint: Prove by induction that if $f_i(x)$ is obtained in $m$ loops then
$f_i(x) = g_i(x)$, where $g_i$ is the $i$th entry of $B + BA + \cdots + BA^m$.]
6.3 Recursive Specification in FPF
In this section we extend the functional programming fragment FPF of
Section 1.3 to allow recursive specification and, in particular, iterative constructs. Our approach is very straightforward. We extend the syntax to include FPF function expressions with function variables and use these to
define simultaneous recursions. The semantics is defined as the Kleene
semantics using Theorem 6.2.13. This illustrates the use of order semantics
in providing a formal semantics for a programming language. The many
examples of functions that can now be defined in FPF make the point that
the Kleene semantics is a reasonable one even if, as discussed in Section 5.1,
there are other approaches.
We begin by summarizing the syntax of recursion in FPF in Table 1. (A
final complete syntax for FPF appears in Table 24 on p. 165).
The reader should reread Section 1.3 at this point to regain familiarity
with FPF and its notations. We begin with an example.
Table 1 Syntax of Recursive Definition in FPF
In addition to Table 1.3.1:
New Alphabet Symbols
    Letters: A B ... Z
    Recursive Definition Symbol: <=
A function variable is a nonempty string of letters.
The set of recursive expressions (REs for short) is defined by
    Basis Step: A function is an RE.
                A function variable is an RE.
    Inductive Step: Same as for functions in Table 1.3.1.
A recursive specification is
    $(G_1 <= r_1, G_2 <= r_2, \ldots, G_n <= r_n),$
where $n > 0$, $G_1, \ldots, G_n$ are distinct function variables and $r_1, \ldots, r_n$ are REs such that
Var($r_i$) (as defined in 7) $\subset \{G_1, \ldots, G_n\}$ for $1 \le i \le n$.
A recursive definition of a function to sum all the numerals on a tree (ignoring empty subtrees) is
2   sum(n) = n   (if n is a numeral),
    sum⟨ ⟩ = 0,
    sum⟨t1, ..., tk⟩, k > 0,
        = t1 + sum⟨t2, ..., tk⟩                   if t1 is a numeral
        = sum⟨t2, ..., tk⟩                        if t1 = ⟨ ⟩
        = sum⟨head t1, tail t1, ⟨t2, ..., tk⟩⟩     else.
For example,
    sum⟨1, ⟨2, 3⟩⟩ = 1 + sum⟨⟨2, 3⟩⟩
                   = 1 + sum⟨2, ⟨3⟩, ⟨ ⟩⟩
                   = 3 + sum⟨⟨3⟩, ⟨ ⟩⟩
                   = 3 + sum⟨3, ⟨ ⟩, ⟨⟨ ⟩⟩⟩
                   = 6 + sum⟨⟨ ⟩, ⟨⟨ ⟩⟩⟩
                   = 6 + sum⟨⟨⟨ ⟩⟩⟩
                   = 6 + sum⟨⟨ ⟩, ⟨ ⟩, ⟨ ⟩⟩
                   = 6 + sum⟨⟨ ⟩, ⟨ ⟩⟩
                   = 6 + sum⟨⟨ ⟩⟩
                   = 6 + sum⟨ ⟩
                   = 6 + 0 = 6.
For purposes of illustration, we code 2 in FPF. The following abbreviation is useful:
3 For u ∈ DTN, eq_u =abb (= ∘ [id, ≡u]). The semantics is
    eq_u: t = T   if t = u,
              F   if t ≠ u.
We then have
4 Example. An FPF recursive specification corresponding to 2 is
    SUM <= (if num then id else (if eq⟨ ⟩ then ≡0 else
           (if (num ∘ head) then (+ ∘ [head, (SUM ∘ tail)]) else
           (if (eq⟨ ⟩ ∘ head) then SUM ∘ tail else
           SUM ∘ [head ∘ head, tail ∘ head, tail])))).
4 has the form G1 <= r 1 with G1 = SUM and r 1 everything to the right of
the <=. The general specification in Table 1, with n > 1, will be used for
simultaneous recursion. Before giving an example of this type, we define the
abbreviation.
5   Pair num =abb (if (eq⟨ ⟩ ∨ num) then F else
                 (if ((num ∘ head) ∧ (num ∘ tail)) then T else F))
with semantics
    Pair num: t = T   if t = ⟨m, n⟩ with m, n numerals,
                  F   else.
6 Example. An FPF coding of the simultaneous recursion of Example 6.2.17 is (G1 <= r1, G2 <= r2), where
    r1 =abb if (eq_0 ∧ num) then (G2 ∘ [id, id]) else ≡5,
    r2 =abb if ((eq_0 ∘ pr1) ∧ Pair num) then ≡0 else G1.
These examples help make it clear what we are trying to do and we turn
now to a formal semantics.
7 Definition. If r is an RE, Var(r) denotes the set of function variables occurring in r.
For example, if
    r = (if (SUM ∘ [G, +]) then (G ∘ [H, ≡0]) else H),
then Var(r) = {SUM, G, H}. The need for the condition "Var(ri) ⊂ {G1, ..., Gn}" in Table 1 should now be clear, since we cannot "call" a specification unless we can substitute for each function variable.
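Definition 7 and the closure condition of Table 1 are easy to compute. The Python sketch below is illustrative only and rests on an encoding of our own: atomic functions are lowercase or symbolic strings, function variables are strings of capital letters, and compound REs are tuples whose first entry is a hypothetical tag ("comp", "constr", "if", "alpha", "insert", or "const").

```python
def var(r):
    """Return Var(r), the set of function variables occurring in the RE r."""
    if isinstance(r, str):
        # A function variable is a nonempty string of (capital) letters;
        # every other string is taken to be an atomic function.
        return {r} if (r.isalpha() and r.isupper()) else set()
    tag, *args = r
    if tag == "const":            # a constant function (≡t) has no variables
        return set()
    result = set()
    for a in args:                # union over the immediate subexpressions
        result |= var(a)
    return result


def is_closed(spec):
    """Check Var(ri) ⊂ {G1, ..., Gn} for a specification given as a dict."""
    return all(var(r) <= set(spec) for r in spec.values())


# r = (if (SUM ∘ [G, +]) then (G ∘ [H, ≡0]) else H)
r = ("if",
     ("comp", "SUM", ("constr", "G", "+")),
     ("comp", "G", ("constr", "H", ("const", 0))),
     "H")
print(var(r))
print(is_closed({"SUM": r}))   # G and H are not defined, so this is not closed
```

A specification passes is_closed exactly when every function variable it mentions has a defining clause, which is the condition needed before the specification can be "called."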
For each finite nonempty list G1, ..., Gn of function variables, define
    RE(G1, ..., Gn) = {r | r is an RE, Var(r) ⊂ {G1, ..., Gn}}.
Note that RE(G1, ..., Gn) has an inductive definition, namely, that of Table 1 save that in the basis step we may take only one of G1, ..., Gn as a function variable. Also, recall from Section 1.3 that the derivation tree of each RE is unique because of our liberal use of parentheses. This enables us to define properties of elements of RE(G1, ..., Gn) using induction.
8 Fixed Notation. For the balance of this section
D = Pfn(DTN, DTN).
Thus, D is a domain. For n > 0,
$$D^n = D \times \cdots \times D \quad (n \text{ times})$$
is a domain as in 6.1.17.
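9 Definition. Let $G_1, \ldots, G_n$ ($n > 0$) be a list of function variables. Intuitively, for $r \in \mathrm{RE}(G_1, \ldots, G_n)$, if we substitute a function $h_i \in D$ for $G_i$ the result is a function in $D$ (e.g., in 6, $r_2 \in D$ if $G_1 \in D$). We now define this rigorously.
Given $r \in \mathrm{RE}(G_1, \ldots, G_n)$, its evaluation
$$\hat{r}\colon D^n \to D$$
is defined inductively as follows.
10 If $r = f$ is a function, $\hat{f}(h_1, \ldots, h_n)\colon t = f\colon t$ for all $t \in \mathrm{DTN}$.
11 If $r = G_i$, $\hat{G}_i = \mathrm{pr}_i$, that is, $\hat{G}_i(h_1, \ldots, h_n) = h_i$.
12 If $k \ge 1$, $r_1, \ldots, r_k \in \mathrm{RE}(G_1, \ldots, G_n)$ then
$$(r_k \circ \cdots \circ r_1)^{\wedge} = D^n \xrightarrow{[[\hat{r}_1, \ldots, \hat{r}_k]]} D^k \xrightarrow{\mathrm{comp}_k} D,$$
where we use the double-bracket notation
13 $$[[f_1, \ldots, f_k]]\colon X \to Y$$
for the total function induced by the product property since the notation $[f_1, \ldots, f_k]$ of 2.3.11 has a different meaning in FPF, and
14 $\mathrm{comp}_k(h_1, \ldots, h_k) = h_k \circ \cdots \circ h_1$ is the k-fold composition
$$\mathrm{DTN} \xrightarrow{h_1} \mathrm{DTN} \xrightarrow{h_2} \cdots \xrightarrow{h_k} \mathrm{DTN}$$
of 1.3.12.
15 If $k \ge 1$, $r_1, \ldots, r_k \in \mathrm{RE}(G_1, \ldots, G_n)$ then
$$[r_1, \ldots, r_k]^{\wedge} = D^n \xrightarrow{[[\hat{r}_1, \ldots, \hat{r}_k]]} D^k \xrightarrow{\mathrm{constr}_k} D,$$
where
16 $\mathrm{constr}_k(h_1, \ldots, h_k) = [h_1, \ldots, h_k]$ as in 1.3.13.
17 If $r_1, r_2, r_3 \in \mathrm{RE}(G_1, \ldots, G_n)$ then
$$(\text{if } r_1 \text{ then } r_2 \text{ else } r_3)^{\wedge} = D^n \xrightarrow{[[\hat{r}_1, \hat{r}_2, \hat{r}_3]]} D^3 \xrightarrow{\text{if-then-else}} D,$$
with
18 if-then-else$(h_1, h_2, h_3)$ = if $h_1$ then $h_2$ else $h_3$ as in 1.3.14.
19 If $r \in \mathrm{RE}(G_1, \ldots, G_n)$ then
$$(\alpha r)^{\wedge} = D^n \xrightarrow{\hat{r}} D \xrightarrow{\alpha} D$$
with $\alpha$ as in 1.3.15.
20 If $r \in \mathrm{RE}(G_1, \ldots, G_n)$ then
$$(/r)^{\wedge} = D^n \xrightarrow{\hat{r}} D \xrightarrow{/} D$$
with / as in 1.3.16.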
This completes Definition 9. We then have the following:
21 Definition. Let p = (G1 <= r1, ..., Gn <= rn) be a recursive specification. Then the evaluation $\psi_p$ of p is the function
$$\psi_p = D^n \xrightarrow{[[\hat{r}_1, \ldots, \hat{r}_n]]} D^n.$$
22 Theorem. For any recursive specification p, $\psi_p\colon D^n \to D^n$ is continuous.
While a proof of Theorem 22 is fundamentally straightforward, there are many things to prove owing to the length of the inductive definition 9-21. We have preferred to relegate the details to Exercises 4-11. In Chapter 8 we will similarly establish (but again leaving the details as an exercise) that $\psi_p$ in 21 is a "power-series map." Since we shall prove that every power-series map is continuous, this would provide an alternative proof of 22.
We can now give the desired definition of the semantics of recursion:
23 Let p = (G1 <= r1, ..., Gn <= rn) be a recursive specification. By Theorems 22 and 6.2.13, $\psi_p\colon D^n \to D^n$ has a least fixed point $(f_1, \ldots, f_n)$. We define
$$p\colon t = f_1\colon t,$$
that is, the semantics p: of p is the first component of the least fixed point of $\psi_p$.
Our definition ignores the fact that sometimes in a simultaneous recursion defining n functions we want all n functions, not just the first. We do this so that the semantics of any FPF program is a function DTN → DTN, which makes for neat mathematical bookkeeping. It is easy to prove that if p = (G1 <= r1, ..., Gn <= rn) and $\psi_p$ has least fixed point (h1, ..., hn) then if, say, p' is (G3 <= r3, G2 <= r2, G1 <= r1, G4 <= r4, ..., Gn <= rn), then the least fixed point of $\psi_{p'}$ will be (h3, h2, h1, h4, ..., hn), so that h3 = p': is obtainable, if somewhat tediously, by Definition 23.
Tables 1.3.1 and 1 were of temporary status as needed to give a clear
discussion without introducing too many new ideas at once. Table 24 gives
the syntax of FPF in its final form and requires a simultaneous recursion to
define functions and REs.
Table 24 Complete Syntax of FPF with Recursion
Alphabet of Symbols
Digits: 0 1 ... 9
Letters: A B ... Z
Parentheses: ( )
Atomic Functions: id head tail + - * ÷ = num
Function Constructors: ≡ ∘ if then else [ ] α /
Recursive Definition Symbol: <=
DTNs are defined by: A numeral (= nonempty string of digits) is a DTN whereas ⟨t1, ..., tk⟩ is a DTN if t1, ..., tk (k ≥ 0) are.
The sets of functions and of recursive expressions (= REs) are simultaneously defined (together with Var(r) for each RE r) as follows:
(i) An atomic function is a function.
(ii) For each DTN t, ≡t is a function.
(iii) Each function f is an RE and Var(f) = ∅.
(iv) Each function variable G (= nonempty string of letters) is an RE and Var(G) = {G}.
(v) If f1, ..., fk (k ≥ 1), p, f, g are functions so are (fk ∘ ... ∘ f1), [f1, ..., fk], (if p then f else g), (αf), and (/f).
(vi) If r1, ..., rk (k ≥ 1), Λ, Φ, Ψ are REs so are (rk ∘ ... ∘ r1), [r1, ..., rk], (if Λ then Φ else Ψ), (αΦ), and (/Φ). Moreover, Var(rk ∘ ... ∘ r1) = Var([r1, ..., rk]) = Var(r1) ∪ ... ∪ Var(rk), Var(if Λ then Φ else Ψ) = Var(Λ) ∪ Var(Φ) ∪ Var(Ψ), and Var(αΦ) = Var(/Φ) = Var(Φ).
(vii) If G1, ..., Gn are distinct function variables and r1, ..., rn are REs with Var(ri) ⊂ {G1, ..., Gn} for all i then (G1 <= r1, ..., Gn <= rn) is a function.
Table 24 is reasonably concise given that it is a complete syntax for a fairly
powerful programming language. The semantics is exactly as discussed earlier
and 23 provides the semantics of 24 (vii). The point of the simultaneous
definition of functions and REs is that a function defined by recursion can be
used to build new functions and REs as we would expect.
We round out this section with some examples of FPF functions. We
begin with a basic iterative construct.
25 Definition. If p, f are functions,
    while p do f =abb G <= if p then (G ∘ f) else id
By 10, 11, and 17 the evaluation ψ: D → D of 21 is
    ψ(h) = if p then hf else id
with Kleene sequence
$$\psi(\bot) = \text{if } p \text{ then } \bot \text{ else id}, \qquad t \mapsto \begin{cases} t & \text{if } p(t) = F \\ \text{undefined} & \text{else,} \end{cases}$$
$$\psi^2(\bot) = \text{if } p \text{ then } \psi(\bot)f \text{ else id}, \qquad t \mapsto \begin{cases} f(t) & \text{if } p(t) = T,\ p(f(t)) = F \\ t & \text{if } p(t) = F \\ \text{undefined} & \text{else.} \end{cases}$$
We leave it as an exercise to show $\bigvee \psi^n(\bot)$ has the expected interpretation.
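The Kleene sequence of 25 can be explored numerically. The Python sketch below is an illustration under assumptions of our own: partial functions are modeled as Python callables that return None where they are undefined, p is taken to be total, and the names bottom, psi, and approximant are hypothetical.

```python
def bottom(t):
    """The everywhere-undefined function; None stands for 'undefined'."""
    return None


def psi(p, f):
    """Return the map h |-> (if p then h∘f else id) of Definition 25."""
    def step(h):
        def new_h(t):
            if not p(t):
                return t              # the 'else id' branch
            u = f(t)
            return None if u is None else h(u)
        return new_h
    return step


def approximant(p, f, n):
    """Compute the nth term psi^n(bottom) of the Kleene sequence."""
    h = bottom
    step = psi(p, f)
    for _ in range(n):
        h = step(h)
    return h


p = lambda t: t < 3       # loop test
f = lambda t: t + 1       # loop body
for n in range(5):
    print(n, [approximant(p, f, n)(t) for t in range(5)])
```

In this model the nth approximant is defined at t exactly when the loop started at t halts after fewer than n passes through the body, which is one way to phrase the "expected interpretation" of the supremum.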
In our remaining examples we will relax the formal syntax to conform
to more usual programming style, using vertical structure, indenting,
and fewer parentheses to enhance readability. The recursive specification
(G1 <= r1, ..., Gn <= rn) will be written vertically
    G1 <= r1;
    ...
    Gn <= rn
Note that this includes abbreviations as a special case, since ri will be substituted for Gi. Indeed, ri is not required to have any function variables, and Gi <= ri is a "pure abbreviation" in this case. This recaptures some of the advantages of identifiers but, even so, no values are actually "stored" so there is no danger of "side effects."
We illustrate with a square root function. When a function is to be used
later, we violate syntax by giving a symbolic name that is not a string of
capital letters.
26 Example. The square root function, √n = largest m with m² ≤ n, may be coded in FPF as follows
    √ <= if ¬num then ⊥ else
         if eq_0 ∨ eq_1 then id else
         pred ∘ pr2 ∘ G ∘ [id, ≡1, ≡1];
    G <= while ≤ ∘ [pr3, pr1] do
27  ⊥ <= ÷ ∘ [≡1, ≡0];
28  succ <= if ¬num then ⊥ else
            + ∘ [id, ≡1];
29  pred <= if ¬num then ⊥ else
            if eq_0 then ≡0 else − ∘ [id, ≡1].
Here there are five function variables, only G used recursively. Clearly, ⊥ is the everywhere-undefined function, succ(n) = n + 1, and pred(n) = if n = 0 then 0 else n − 1.
For a sample computation,
    √28 = pred ∘ pr2 ∘ G⟨28, 1, 1⟩.
But G⟨28, 1, 1⟩ = ⟨28, 6, 36⟩ via
    ⟨28, 1, 1⟩ ↦ ⟨28, 2, 4⟩ ↦ ⟨28, 3, 9⟩ ↦ ⟨28, 4, 16⟩ ↦ ⟨28, 5, 25⟩ ↦ ⟨28, 6, 36⟩
so
    √28 = pred 6 = 5.
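The sample computation can be replayed in Python. This sketch is a paraphrase rather than a transcription of the FPF code: the loop test k² ≤ n follows the test in the definition of G, while the loop body (advance k by one and recompute its square) is an assumption of ours inferred from the trace ⟨28, 1, 1⟩ ↦ ⟨28, 2, 4⟩ ↦ ... above. The names G_trace and isqrt are hypothetical.

```python
def pred(n):
    """pred(n) = if n = 0 then 0 else n - 1, as in 29."""
    return 0 if n == 0 else n - 1


def G_trace(n):
    """Iterate triples (n, k, k*k) from (n, 1, 1) while k*k <= n.

    The update rule is assumed from the sample trace in the text.
    """
    state = (n, 1, 1)
    trace = [state]
    while state[2] <= state[0]:
        k = state[1] + 1
        state = (n, k, k * k)
        trace.append(state)
    return trace


def isqrt(n):
    """Largest m with m*m <= n, mirroring Example 26."""
    if n in (0, 1):                   # the eq_0 ∨ eq_1 branch returns its argument
        return n
    return pred(G_trace(n)[-1][1])    # pred ∘ pr2 ∘ G


print(G_trace(28))
print(isqrt(28))
```

The loop overshoots by one (it stops at the first k with k² > n), which is why the definition finishes with pred.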
30 Example. The function size⟨t1, ..., tk⟩ = k is defined by
    size <= if num then ⊥ else
            if eq⟨ ⟩ then ≡0 else succ ∘ size ∘ tail.
A sample computation is
    size⟨3, ⟨⟨ ⟩, 4⟩⟩ = 1 + size⟨⟨⟨ ⟩, 4⟩⟩
                      = 2 + size⟨ ⟩ = 2.
31 Example. The reverse function is defined by
    rev <= if num ∨ eq⟨ ⟩ then id else
           join ∘ [rev ∘ tail, head].
Thus,
    rev⟨⟨5, ⟨ ⟩⟩, ⟨7, 2⟩, 4⟩ = join⟨rev⟨⟨7, 2⟩, 4⟩, ⟨5, ⟨ ⟩⟩⟩
        = join⟨join⟨rev⟨4⟩, ⟨7, 2⟩⟩, ⟨5, ⟨ ⟩⟩⟩
        = join⟨join⟨join⟨rev⟨ ⟩, 4⟩, ⟨7, 2⟩⟩, ⟨5, ⟨ ⟩⟩⟩
        = join⟨join⟨join⟨⟨ ⟩, 4⟩, ⟨7, 2⟩⟩, ⟨5, ⟨ ⟩⟩⟩
        = join⟨join⟨⟨4⟩, ⟨7, 2⟩⟩, ⟨5, ⟨ ⟩⟩⟩
        = join⟨⟨4, ⟨7, 2⟩⟩, ⟨5, ⟨ ⟩⟩⟩
        = ⟨4, ⟨7, 2⟩, ⟨5, ⟨ ⟩⟩⟩.
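Examples 30 and 31 translate directly into a conventional language. In the Python sketch below, which is offered only as an illustration, numerals are ints, trees are tuples, None stands for "undefined," and join is assumed (from the sample computation above) to append its second argument to the end of its first.

```python
def size(t):
    """size<t1, ..., tk> = k; undefined (None) on numerals, as in Example 30."""
    if isinstance(t, int):
        return None
    return 0 if t == () else 1 + size(t[1:])


def join(pair):
    """Assumed behavior of join: <<t1, ..., tk>, u> |-> <t1, ..., tk, u>."""
    xs, x = pair
    return xs + (x,)


def rev(t):
    """Reverse the top level of a tree, as in Example 31."""
    if isinstance(t, int) or t == ():
        return t
    return join((rev(t[1:]), t[0]))   # join ∘ [rev ∘ tail, head]


print(size((3, ((), 4))))
print(rev(((5, ()), (7, 2), 4)))
```

Note that rev reverses only the top level: the subtrees ⟨7, 2⟩ and ⟨5, ⟨ ⟩⟩ are moved but not themselves reversed.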
32 Example. last: ⟨t1, ..., tk⟩ = tk is defined by
    last =abb head ∘ rev.
EXERCISES FOR SECTION 6.3
1. The function sum of 2 is not the same as /+. Explain why.
2. Modify 4 so that the semantics multiplies all numerals on a tree, ignoring empty
subtrees.
3. Use Definition 9 to compute the evaluation of r ∈ RE(G1, G2) if
    r = ((if p then (f ∘ [(αG1), id]) else tail) ∘ G2 ∘ [G1, /G2]).
Exercises 4-11 lead to a proof of Theorem 22.
4. If D, E, F are domains and $f\colon D \to E$, $g\colon E \to F$ are continuous, prove that $gf\colon D \to F$ is continuous.
5. If D, E are domains and if $f\colon D \to E$ is constant, $f(d) = e_0$ for all d, then f is continuous.
6. If D, E are domains and if $d_1 \le d_2 \le d_3 \le \cdots$ in D, $e_1 \le e_2 \le e_3 \le \cdots$ in E, show that $\bigvee \{(d_n, e_m) \mid n, m \in \mathbf{N}\}$ exists in $D \times E$ and coincides with $(\bigvee d_n, \bigvee e_m)$.
7. Let $D_1, \ldots, D_n$ ($n \ge 2$) be domains, let F be a domain, and let $f\colon D_1 \times \cdots \times D_n \to F$ be a total function. Say that f is separately continuous if for each $i \in \{1, \ldots, n\}$ and for each fixed choice of $d_j \in D_j$ ($j \ne i$) the function $g\colon D_i \to F$, $g(x) = f(d_1, \ldots, d_{i-1}, x, d_{i+1}, \ldots, d_n)$ is continuous. Show that f is continuous if and only if f is separately continuous. [Hint: That continuous implies separately continuous is easy. For the converse, use Exercise 6 for n = 2 and then use induction capitalizing on the poset isomorphism $D_1 \times \cdots \times D_{n+1} \to (D_1 \times \cdots \times D_n) \times D_{n+1}$.]
8. Let $D_1, \ldots, D_n$, E be domains. Show that $\mathrm{pr}_i\colon D_1 \times \cdots \times D_n \to D_i$ is continuous and show that if $f_i\colon E \to D_i$ are continuous then $[[f_1, \ldots, f_n]]\colon E \to D_1 \times \cdots \times D_n$ is continuous. (Compare Exercise 6.2.5.)
9. Show that composition, $(f_1, \ldots, f_k) \mapsto f_k \circ \cdots \circ f_1$, is a continuous function $\mathrm{Pfn}(X_0, X_1) \times \cdots \times \mathrm{Pfn}(X_{k-1}, X_k) \to \mathrm{Pfn}(X_0, X_k)$. [Hint: By Exercise 7, this reduces to showing $g \mapsto hgf$ is continuous.]
10. Write a recursive definition of /f in FPF (given fixed FPF function f) without using the symbol /.
11. Complete the proof of Theorem 22 by verifying that the functions of 16, 18, 19,
and 20 are continuous. (For a shortcut on 20 use Exercise 10.)
12. Show that the semantics of
    F <= if num ∨ eq_0 then id else αF
is the identity function.
13. Write an FPF function list such that list: t = ⟨n1, ..., nk⟩, where n1, ..., nk are the numerals occurring in t in left-to-right order. Thus,
    list: ⟨⟨ ⟩⟩ = list: ⟨ ⟩ = ⟨ ⟩,
    list: ⟨⟨1, ⟨⟨2⟩, ⟨ ⟩⟩, ⟨3, 4, 5⟩⟩⟩ = ⟨1, 2, 3, 4, 5⟩.
14. Write an FPF function iota with iota: n = ⟨1, 2, ..., n⟩.
6.4 Fixed Points and Formal Languages
In this section we briefly recall the formal definition of a context-free grammar G on the alphabet (set of terminal symbols) X, and the usual definition of the language $L(G) \subset X^*$ that G generates. ($X^*$ is the set of finite strings over X, including the empty string $\Lambda$.) We give a fixed point definition of L(G) by showing that G induces a continuous function
$$\psi_G\colon (2^{X^*})^n \to (2^{X^*})^n$$
which maps n-tuples of languages (subsets of $X^*$) to n-tuples of languages, where n is the number of nonterminal symbols in G, and that if $(L_1, \ldots, L_n)$ is the least fixed point of $\psi_G$, then $L_1 = L(G)$.
1 Definition. A context-free grammar G over the alphabet X with set $V = \{v_1, v_2, \ldots, v_n\}$ (with no $v_i \in X$) of nonterminals and specified start symbol $v_1$ is characterized by a set P of productions
$$P \subset V \times (V \cup X)^*.$$
We write $G = (X, V, v_1, P)$, and rewrite $(v, w)$ in P in the synonymous form $v \to w$. The use of the production $v \to w$ is to allow modification of a word u by replacing any occurrence of the letter v in u with the word w. We now characterize the language generated by the grammar as the set L(G) of terminal strings (members of $X^*$) which can be derived from the start symbol by a finite number of applications of the productions of P.
2 Definition. We write $w_1 \Rightarrow w_2$ and say $w_1$ directly derives $w_2$ (with respect to G), if there exists a production $v \to w$ of P, and strings $w'$, $w''$ in $(V \cup X)^*$ such that
$$w_1 = w'vw'' \quad \text{and} \quad w_2 = w'ww''.$$
We write $w_1 \Rightarrow^* w_n$ and say $w_1$ derives $w_n$, if $w_1 = w_n$, or there exist $w_2, \ldots, w_{n-1}$ such that $w_1 \Rightarrow w_2, \ldots, w_i \Rightarrow w_{i+1}, \ldots, w_{n-1} \Rightarrow w_n$.
We then define the language generated by G to be the set
$$L(G) = \{w \mid w \in X^* \text{ and } v_1 \Rightarrow^* w\}$$
of all terminal strings derivable from the start symbol.
3 Example. Let $X = \{a, b\}$, set $V = \{v_1, v_2, v_3\}$, and let P comprise the productions $v_1 \to v_2$, $v_1 \to v_3$, $v_2 \to av_2b$, $v_2 \to ab$, $v_3 \to bv_3a$, $v_3 \to ba$. Two typical derivations of terminal strings from $v_1$ are:
$$v_1 \Rightarrow v_2 \Rightarrow av_2b \Rightarrow^* aaav_2bbb \Rightarrow aaaabbbb = a^4b^4,$$
$$v_1 \Rightarrow v_3 \Rightarrow bv_3a \Rightarrow bbv_3aa \Rightarrow bbbaaa = b^3a^3.$$
For this simple example it is clear by inspection that
$$L(G) = \{a^nb^n \mid n \ge 1\} \cup \{b^na^n \mid n \ge 1\}.$$
To prepare the way for the general fixed point theory, we now show how this L(G) can be obtained from the least fixed point of a suitable operator $\psi\colon (2^{X^*})^3 \to (2^{X^*})^3$.
We start by rewriting the productions for G by "adding" all the productions with the same left-hand side:
4 $$v_1 \to v_2 + v_3, \qquad v_2 \to av_2b + ab, \qquad v_3 \to bv_3a + ba.$$
We now replace the nonterminal symbols $v_1$, $v_2$, and $v_3$ by variables $V_1$, $V_2$, and $V_3$ which take values in the domain $(2^{X^*}, \subset)$ of languages over X, and regard 4 as defining the sought-for function $\psi\colon (2^{X^*})^3 \to (2^{X^*})^3$:
5 $$\psi(V_1, V_2, V_3) = (V_2 + V_3,\ aV_2b + ab,\ bV_3a + ba),$$
where the previously formal + now denotes union, and $bV_3a + ba$ is shorthand for $\{bwa \mid w \in V_3\} \cup \{ba\}$, and so on.
Let us now apply Theorem 6.2.13 to determine the least fixed point of the $\psi$ of 5. (That $\psi$ is continuous will be proved below.) We compute the Kleene sequence of $\psi$ as follows:
$$\psi^0(\bot) = (\emptyset, \emptyset, \emptyset),$$
$$\psi^1(\bot) = \psi(\emptyset, \emptyset, \emptyset) = (\emptyset, \{ab\}, \{ba\}),$$
$$\psi^2(\bot) = \psi(\emptyset, \{ab\}, \{ba\}) = (\{ab, ba\}, \{a^2b^2, ab\}, \{b^2a^2, ba\}).$$
It is then easy to prove, by induction on m, that
6 $$\psi^m(\bot) = (\{a^jb^j \mid 1 \le j < m\} \cup \{b^ja^j \mid 1 \le j < m\},\ \{a^jb^j \mid 1 \le j \le m\},\ \{b^ja^j \mid 1 \le j \le m\}).$$
Thus, the sequence $\psi^m(\bot)$ is indeed an ascending chain, and
7 $$\bigvee_{m \ge 0} \psi^m(\bot) = (L(G),\ \{a^jb^j \mid j \ge 1\},\ \{b^ja^j \mid j \ge 1\}).$$
We see that L(G) is indeed the first component of the least fixed point of $\psi$. More generally, we see that for k = 1, 2, 3,
8 the kth component of $\bigvee_{m \ge 0} \psi^m(\bot)$ is $\{w \mid w \in X^* \text{ and } v_k \Rightarrow^* w\}$.
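The first few terms of this Kleene sequence, and formula 6, can be checked by direct computation. The Python sketch below is illustrative; it reads the productions of Example 3 as set equations (the map psi) and represents languages as finite Python sets of strings. The names psi, kleene, and formula_6 are ours.

```python
def psi(V1, V2, V3):
    """One application of the operator induced by the grammar of Example 3."""
    return (V2 | V3,
            {"a" + w + "b" for w in V2} | {"ab"},
            {"b" + w + "a" for w in V3} | {"ba"})


def kleene(m):
    """Return psi^m(bottom), where bottom is the triple of empty languages."""
    state = (set(), set(), set())
    for _ in range(m):
        state = psi(*state)
    return state


def formula_6(m):
    """The closed form claimed in 6 for psi^m(bottom)."""
    ab = lambda j: "a" * j + "b" * j
    ba = lambda j: "b" * j + "a" * j
    return ({ab(j) for j in range(1, m)} | {ba(j) for j in range(1, m)},
            {ab(j) for j in range(1, m + 1)},
            {ba(j) for j in range(1, m + 1)})


for m in range(4):
    print(m, kleene(m))
```

Every term of the sequence is a finite language; only the supremum (the union over all m) yields the infinite language L(G).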
But if we look at 6, we can read off even more information than given by 8. Let $\Rightarrow^j$ be the jth power of the relation $\Rightarrow$:
$$w \Rightarrow^j w'$$
(w derives w' in j steps) just in case there exists a sequence $w_1, w_2, \ldots, w_j$ such that $w \Rightarrow w_1 \Rightarrow w_2 \Rightarrow \cdots \Rightarrow w_{j-1} \Rightarrow w_j = w'$ (while $w \Rightarrow^0 w'$ just means that $w = w'$).
Then the reader should be able to show that 6 yields
9 the kth component of $\psi^m(\bot) = \{w \mid w \in X^* \text{ and } v_k \Rightarrow^j w \text{ for some } j \le m\}$.
However, the form of 9 is misleading, as our next example shows:
10 Example. Consider the grammar with one variable and with productions summarized in the form
$$v_1 \to a + av_1a + v_1bbv_1.$$
Then, analogously to 5 this induces $\psi\colon 2^{X^*} \to 2^{X^*}$ with
$$\psi(V_1) = a + aV_1a + V_1bbV_1.$$
Then
$$\psi^0(\bot) = \emptyset, \qquad \psi^1(\bot) = \{a\}, \qquad \psi^2(\bot) = \{aaa, abba\} \cup \{a\}.$$
But $\psi^2(\bot)$ does not satisfy 9 since the shortest derivation of abba is
$$v_1 \Rightarrow v_1bbv_1 \Rightarrow abbv_1 \Rightarrow abba$$
which takes 3 (rather than ≤ 2) steps. However, if we look at the derivation tree for abba, we see that it is of height 2. In other terms, if we allow parallel replacement (all variables in a string may be replaced in a single step), then we can indeed derive abba in 2 steps:
$$v_1 \Rightarrow_p v_1bbv_1 \Rightarrow_p abba.$$
This is reminiscent of the "all-call" semantics of Section 5.1 in that all variables are replaced on each "cycle," but is nondeterministic in that any one of a whole set of productions may be chosen in replacing each occurrence of a variable.
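The mismatch between the Kleene sequence and sequential derivation length in Example 10 can be confirmed by brute force. In the Python sketch below, which is only an illustration, the single nonterminal is written as the character V, languages are finite sets of strings, and the helper names psi and shortest_derivation are hypothetical.

```python
PRODUCTIONS = ["a", "aVa", "VbbV"]   # right-hand sides for the nonterminal V


def psi(L):
    """The operator of Example 10: psi(L) = {a} ∪ aLa ∪ L bb L."""
    return ({"a"}
            | {"a" + w + "a" for w in L}
            | {u + "bb" + v for u in L for v in L})


def shortest_derivation(target):
    """Length of the shortest one-variable-at-a-time derivation of target.

    Breadth-first search over sentential forms; forms longer than the
    target are pruned because no production shortens a string.
    Returns None if target is not derivable.
    """
    frontier, steps = {"V"}, 0
    while target not in frontier:
        nxt = set()
        for w in frontier:
            for i, c in enumerate(w):
                if c == "V":
                    for rhs in PRODUCTIONS:
                        new = w[:i] + rhs + w[i + 1:]
                        if len(new) <= len(target):
                            nxt.add(new)
        frontier, steps = nxt, steps + 1
        if not frontier:
            return None
    return steps


print(psi(psi(set())))
print(shortest_derivation("abba"))
```

The string abba already appears after two applications of psi although its shortest sequential derivation has three steps, which is the discrepancy that parallel derivation is introduced to repair.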
We now give the general definition of $\Rightarrow_p$:
11 Definition. For any grammar G and strings w, w' in $(V \cup X)^*$, we say w parallel-derives w' and write $w \Rightarrow_p w'$ just in case we can write
(a) $$w = w_{10}v_{11}w_{11} \cdots v_{1k}w_{1k}$$
for some $k \ge 1$ with each $w_{1j}$ in $X^*$ and each $v_{1j}$ in V, and there exist productions $v_{1j} \to w_{2j}$ in P such that
$$w' = w_{10}w_{21}w_{11} \cdots w_{2k}w_{1k},$$
or
(b) w is terminal, and w = w'.
While 9 does not hold for our present example, it can be shown that $\psi$ does satisfy
12 the kth component of $\psi^m(\bot) = \{w \mid w \in X^* \text{ and } v_k \Rightarrow_p^m w\}$.
But with 12 at our disposal, we have no trouble in providing the general
theory.
13 Theorem. Let $G = (X, V, v_1, P)$ be a context-free grammar, with $V = \{v_1, \ldots, v_n\}$. Let
$$\psi_G\colon (2^{X^*})^n \to (2^{X^*})^n$$
be the function obtained from P in the manner exemplified by 4 and 5. Then $\psi_G$ is continuous, and L(G) equals the first component of the least fixed point of $\psi_G$.
PROOF.
(i) A typical component of $\psi_G(V_1, \ldots, V_n)$ will look like
14 $$w_{10}V_{11}w_{11} \cdots V_{1r}w_{1r} + \cdots$$
where each w is from $X^*$ and each V is from $\{V_1, V_2, \ldots, V_n\}$. But it is clear that such functions are continuous (Exercise 1), and so $\psi_G$ is continuous.
(ii) We need to verify the assertion that
15 the kth component of $\psi_G^m(\bot) = \{w \mid w \in X^* \text{ and } v_k \Rightarrow_p^m w\}$.
Denote the right-hand side of 15 by $L(v_k, m)$ for convenience. Since $v_k \Rightarrow_p^0 w$ iff $w = v_k$, it is clear that
$$L(v_k, 0) = \emptyset \quad \text{for } 1 \le k \le n$$
and so 15 holds for the basis step, m = 0.
For the induction step, suppose that
$$\psi_G^M(\bot) = (L(v_1, M), \ldots, L(v_n, M))$$
for some M. We must now prove that 15 holds for m = M + 1. Suppose, then, that 14 defines the kth component of $\psi_G(V_1, \ldots, V_n)$. Then if r > 0, all the terms of
$$w_{10}L(v_{11}, M)w_{11} \cdots L(v_{1r}, M)w_{1r}$$
are in $L(v_k, M + 1)$ by clause (a) of the definition of $\Rightarrow_p$, while if r = 0, the terms are in $L(v_k, M + 1)$ by clause (b); similarly, for the other terms of 14. Conversely, a string w belongs to $L(v_k, M + 1)$ just in case there is a production like $v_k \to w_{10}v_{11} \cdots v_{1r}w_{1r}$ and strings $w'_{1u}$ in $L(v_{1u}, M)$, for $1 \le u \le r$, such that $w = w_{10}w'_{11} \cdots w'_{1r}w_{1r}$.
Putting all this together, we conclude that
$$\psi_G^{M+1}(\bot) = (L(v_1, M + 1), \ldots, L(v_n, M + 1))$$
and thus, by induction, that 15 holds for all m.
(iii) Combining (i) and (ii), we have that the least fixed point of $\psi_G$ satisfies
$$\bigvee_{m \ge 0} \psi_G^m(\bot) = (\{w \mid w \in X^* \text{ and } v_k \Rightarrow_p^* w\})_{1 \le k \le n}.$$
In particular, the first component of the least fixed point equals L(G). □
EXERCISES FOR SECTION 6.4
1. We consider maps $(2^{X^*})^n \to 2^{X^*}$. Prove that $\psi_1 + \psi_2$ is continuous if $\psi_1$, $\psi_2$ are, where
$$(\psi_1 + \psi_2)(Y_1, \ldots, Y_n) = \psi_1(Y_1, \ldots, Y_n) \cup \psi_2(Y_1, \ldots, Y_n).$$
Using Exercises 6.3.4-8, verify the assertion in part (i) of the proof of Theorem 13.
2. Let $A, B \in 2^{X^*}$ and define $\psi\colon 2^{X^*} \to 2^{X^*}$ by $\psi(Y) = AY + B$, where $AY = \{wv \mid w \in A, v \in Y\}$.
(i) Show that $A^*B$ (where $A^* = \{\Lambda\} \cup A \cup A^2 \cup \cdots$) is the least fixed point of $\psi$.
(ii) Show that if $\Lambda \notin A$, $A^*B$ is the only fixed point of $\psi$. [Hint: If $Y \not\subset A^*B$ there exists $y \in Y$ of shortest length with $y \notin A^*B$; show this is impossible if $\Lambda \notin A$.]
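The following definitions are needed in Exercises 3-7. An m × n matrix of languages is
$$[A_{ij}] = \begin{bmatrix} A_{11} & \cdots & A_{1n} \\ \vdots & & \vdots \\ A_{m1} & \cdots & A_{mn} \end{bmatrix}$$
with each $A_{ij} \in 2^{X^*}$. Mimicking the formula for matrix multiplication in linear algebra define
$$Y + Z = Y \cup Z, \qquad YZ = \{vw : v \in Y,\ w \in Z\}$$
for $Y, Z \in 2^{X^*}$ and then define matrix multiplication $[A_{ij}][B_{jk}] = [C_{ik}]$ for m × n $[A_{ij}]$ and n × p $[B_{jk}]$, yielding m × p $[C_{ik}]$ by
$$C_{ik} = A_{i1}B_{1k} + \cdots + A_{in}B_{nk}.$$
Define matrix sum for m × n $[A_{ij}]$, $[B_{ij}]$ by
$$[A_{ij}] + [B_{ij}] = [C_{ij}], \qquad C_{ij} = A_{ij} + B_{ij}.$$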
3. Verify
(i) $[A_{ij}]([B_{jk}][C_{kl}]) = ([A_{ij}][B_{jk}])[C_{kl}]$;
(ii) $([A_{ij}] + [B_{ij}])[C_{jk}] = ([A_{ij}][C_{jk}]) + ([B_{ij}][C_{jk}])$.
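4. If $A_{ij}, B_i \in 2^{X^*}$ verify that the simultaneous system
$$\begin{aligned} Y_1 &= A_{11}Y_1 + \cdots + A_{1n}Y_n + B_1 \\ &\ \ \vdots \\ Y_n &= A_{n1}Y_1 + \cdots + A_{nn}Y_n + B_n \end{aligned}$$
has matrix form
$$[Y_j] = [A_{ij}][Y_j] + [B_i]$$
and least fixed point $[A_{ij}]^*[B_i]$, where
$$[A_{ij}]^* = I + [A_{ij}] + [A_{ij}]^2 + \cdots,$$
$$I_{ij} = \begin{cases} \{\Lambda\} & \text{if } i = j \\ \emptyset & \text{if } i \ne j \end{cases}$$
is the identity matrix and $[A_{ij}]^n$ is the matrix product of $[A_{ij}]$ with itself n times.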
5. Verify that the n × n identity matrix $[I_{ij}]$ of Exercise 4 satisfies
$$[I_{ij}][A_{jk}] = [A_{ik}] \quad \text{for } n \times m\ [A_{jk}],$$
$$[B_{ij}][I_{jk}] = [B_{ik}] \quad \text{for } p \times n\ [B_{ij}].$$
6. Verify that the system of Exercise 4 has exactly one fixed point solution if $\Lambda \notin A_{ij}$ for all i, j. [Hint: Use induction; for n = 1 use Exercise 2.]
7. In this exercise we assume the reader knows what is meant by the language L recognized by the finite-state automaton with state graph
[State graph of the automaton: figure not reproduced.]
Quick review: $L = \{a_1 \cdots a_n \mid$ there exists a path with edge labels $a_1, \ldots, a_n$ from the initial state $q_1$ to a final state (exclusively $q_2$, here)$\}$. Thus, $x^2y^2x \in L$ but no word beginning with y is in L. Solve for L using Exercises 4 and 6. [Hint: Let $Y_i$ be the language recognized if $q_i$ were initial, so we seek $L = Y_1$. The second equation is
$$Y_2 = xY_2 + \{x, y\}Y_3 + \{\Lambda\}.$$
To solve, use substitution and Exercise 2.]
Notes and References for Chapter 6
The systematic use of domains (whose order relation abstracts "approximation") and continuous functions in computation is due to D. S. Scott ["Lattice theory, data types and semantics," in R. Rustin (ed.), Formal Semantics of Programming Languages, Prentice-Hall, 1970]. Theorem 6.2.13 was proved earlier by S. C. Kleene [Introduction to Metamathematics, Van Nostrand, 1952].
Theorem 6.2.8 is due to A. Tarski ["A lattice-theoretical fixpoint theorem and its applications," Pacific Journal of Mathematics, 5, 1955, pp. 285-309] although the special case $(P, \le) = (\mathcal{P}(X), \subset)$ as in 2.1.8 was proved by B. Knaster in 1928.
We thank Irene Guessarian for Example 6.2.16.
The use of recursive equations to find the language recognized by an automaton as in Exercise 6.4.7 is due to D. N. Arden ["Delayed logic and finite-state machines," in Theory of Computing Machine Design, University of Michigan Press, Ann Arbor, 1960, pp. 1-35].
CHAPTER 7
Canonical Fixed Points
The previous chapter considered a number of situations in which an object of semantic interest arises as the least fixed point of a continuous map $\psi\colon (D, \le) \to (D, \le)$ of some domain $(D, \le)$. So far, the domain structure is but a technical device to distinguish the least fixed point from the other fixed points.
This suggests the more general question: given $\psi\colon D \to D$ without assuming D a domain or $\psi$ continuous, what additional requirements on D and $\psi$ give $\psi$ a distinguished fixed point? But in fact this question may be misguided since it examines one D and one $\psi$ in isolation. The remarkable aspect of the least fixed point $\bigvee(\psi^n(\bot))$ is that this formula is the same for all continuous $\psi$!
This brief chapter introduces "canonical fixed points" as a precise method of assigning fixed points to particular classes of $\psi$'s in a uniform way. We establish a criterion for the existence of a unique canonical fixed point. For domains and continuous $\psi$, the least fixed point is the unique canonical fixed point. In Section 8.2 we show that partially additive monoids equipped with power-series maps again have a unique canonical fixed point, namely, the pattern-of-calls expansion of Section 5.2.
We begin by abstracting a "fixed point situation" without involving specific
structures such as domains. To this end we will introduce a category of
"recursion schemes" according to the following definition.
1 Definition. A category of recursion schemes is a category $\mathcal{A}$ of the following type.
2 Each object of $\mathcal{A}$ has the form $\mathbf{A} = (A, \sigma, \psi)$, where A is a set, $\sigma$ is an additional structure on A (what sort depends on the particular $\mathcal{A}$) and $\psi\colon A \to A$ is a total function, possibly subject to constraints involving $\sigma$.
3 Each morphism $f\colon (A, \sigma, \psi) \to (A', \sigma', \psi')$ of $\mathcal{A}$ is a total function $f\colon A \to A'$ such that $\psi'f = f\psi$:
$$\begin{array}{ccc} A & \xrightarrow{\ f\ } & A' \\ {\scriptstyle \psi}\downarrow & & \downarrow{\scriptstyle \psi'} \\ A & \xrightarrow{\ f\ } & A' \end{array}$$
(although not every such function need be a morphism).
4 Composition in $\mathcal{A}$ is ordinary composition of total functions.
5 The identity morphisms are the identity functions.
The recursive specifications of 6.2.1 are recursion schemes in which $\sigma$ is the order relation:
6 Example. Let an object of $\mathcal{A}$ be $(D, \le, \psi)$, where $(D, \le)$ is a domain and $\psi\colon (D, \le) \to (D, \le)$ is a recursive specification, that is, $\psi^n(\bot) \le \psi^{n+1}(\bot)$ for all n. Let a morphism $f\colon (D, \le, \psi) \to (D', \le', \psi')$ be a strict map $(D, \le) \to (D', \le')$ (where we say f is strict if f is continuous and $f(\bot) = \bot$) such that $\psi'f = f\psi$. Composition and identities are, of course, defined by 4 and 5 above. Such $\mathcal{A}$ is a category of recursion schemes.
7 Observation. If $\mathcal{A}$ is a category of recursion schemes, any morphism in $\mathcal{A}$ preserves fixed points: If $\psi(a) = a$, then $\psi'(f(a)) = f(\psi(a)) = f(a)$.
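Observation 7 uses nothing beyond the commuting square of 3, so it can be tested on any small example. The Python sketch below uses a toy pair of recursion schemes of our own devising (the sets, maps, and names are assumptions chosen for illustration, with the extra structure σ ignored).

```python
A = range(4)                  # underlying set {0, 1, 2, 3} of the first scheme


def psi(a):
    """A total function on A: count upward and stick at 3."""
    return min(a + 1, 3)


def psi_prime(b):
    """A total function on A' = {0, 1}: count upward and stick at 1."""
    return min(b + 1, 1)


def f(a):
    """A candidate morphism A -> A'."""
    return min(a, 1)


# The square commutes: psi'(f(a)) = f(psi(a)) for every a in A.
commutes = all(psi_prime(f(a)) == f(psi(a)) for a in A)

# Hence f carries each fixed point of psi to a fixed point of psi'.
fixed_points = [a for a in A if psi(a) == a]
images_fixed = all(psi_prime(f(a)) == f(a) for a in fixed_points)

print(commutes, fixed_points, images_fixed)
```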
The philosophy embodied by much work in formal semantics at the time of this writing is to provide the $\psi$'s with enough structure so that any $\mathbf{A}$ has a distinguished fixed point. Here, in 2, the emphasis is on $\sigma$ and morphisms are ignored. We place this in a different perspective by introducing the idea of canonical fixed point which is explicitly based on the structure of $\mathcal{A}$-morphisms.
8 Definition. A canonical fixed point $\alpha$ for a category of recursion schemes $\mathcal{A}$ is an assignment of a fixed point $\alpha_{\mathbf{A}} = \psi(\alpha_{\mathbf{A}})$ to each object $\mathbf{A}$ in such a way that for every $f\colon \mathbf{A} \to \mathbf{A}'$ we have
$$f(\alpha_{\mathbf{A}}) = \alpha_{\mathbf{A}'}.$$
The next theorem, the main result of this section, shows that the nature of canonical fixed points in $\mathcal{A}$ is completely understood when $\mathcal{A}$ has an initial object.
9 Canonical Fixed Point Theorem. Let $\mathcal{A}$ be a category of recursion schemes with initial object $\hat{\mathbf{A}}$. Then there is a bijective correspondence between fixed points of $\hat{\mathbf{A}}$ and canonical fixed points of $\mathcal{A}$. In particular, if $\hat{\mathbf{A}}$ has a unique fixed point then $\mathcal{A}$ has a unique canonical fixed point.
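PROOF. Write $\hat{\mathbf{A}} = (\hat{A}, \hat{\sigma}, \hat{\psi})$. Let $\hat{\psi}\hat{a} = \hat{a}$ be a fixed point of $\hat{\mathbf{A}}$. Define $\alpha_{\mathbf{A}} = {!}(\hat{a})$, where $!\colon \hat{\mathbf{A}} \to \mathbf{A}$ is the unique $\mathcal{A}$-morphism. Because $!$ is a morphism, $\psi\,{!} = {!}\,\hat{\psi}$, and we have
$$\psi(\alpha_{\mathbf{A}}) = \psi\,{!}(\hat{a}) = {!}\,\hat{\psi}(\hat{a}) = {!}\,\hat{a} = \alpha_{\mathbf{A}}$$
so $\alpha_{\mathbf{A}}$ is a fixed point of $\mathbf{A}$. If $f\colon \mathbf{A} \to \mathbf{A}'$ then, by the uniqueness of morphisms out of $\hat{\mathbf{A}}$, the composite of $!\colon \hat{\mathbf{A}} \to \mathbf{A}$ with f is the unique morphism $!\colon \hat{\mathbf{A}} \to \mathbf{A}'$, so that
$$f(\alpha_{\mathbf{A}}) = f\,{!}(\hat{a}) = {!}(\hat{a}) = \alpha_{\mathbf{A}'}$$
and $\alpha$ is a canonical fixed point. It is obvious that every canonical fixed point arises this way because if $\alpha$ is any canonical fixed point, $\alpha_{\mathbf{A}} = {!}(\hat{a})$ if $\hat{a} = \alpha_{\hat{\mathbf{A}}}$. A special case of this is $\mathbf{A} = \hat{\mathbf{A}}$ showing that $\hat{a} = \alpha_{\hat{\mathbf{A}}}$ since $!\colon \hat{\mathbf{A}} \to \hat{\mathbf{A}}$ is the identity map, and this means that different fixed points $\hat{a}$ of $\hat{\mathbf{A}}$ give rise to distinct canonical fixed points. □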
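We now apply this result to show that the Kleene semantics of 6.2.3 is canonical.
10 Theorem. For the category of recursive specifications of Example 6, the Kleene semantics $\bigvee_{n \ge 0} \psi^n(\bot)$ is the unique canonical fixed point.
PROOF. Let $(\bar{N}, \le)$ be the partially ordered set of natural numbers with adjoined greatest element $\infty$, $\bar{N} = N + \{\infty\}$, and let $s(n) = n + 1$, $s(\infty) = \infty$. Then s has unique fixed point $\infty$. Moreover, $\bar{\mathbf{N}} = (\bar{N}, \le, s)$ is initial with the unique homomorphism $!\colon \bar{\mathbf{N}} \to \mathbf{A}$ defined by
$$!(n) = {!}(s^n 0) = \psi^n\,{!}(0) = \psi^n(\bot)$$
while
$$!(\infty) = {!}\Bigl(\bigvee_n n\Bigr) = \bigvee_n {!}(n) = \bigvee_n \psi^n(\bot).$$
Hence, these recursion schemes have a unique canonical fixed point given by
$$\alpha_{\mathbf{A}} = {!}(\infty) = \bigvee_{n \ge 0} \psi^n(\bot).$$ □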
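The map ! in the proof of Theorem 10 can be tabulated for a small concrete recursion scheme. In the Python sketch below, everything specific is an assumption made for illustration: the domain is the set of subsets of {0, ..., 4} ordered by inclusion, ψ is a hypothetical monotone map on it, and ∞ is modeled by float("inf").

```python
INF = float("inf")   # stands for the adjoined greatest element of N-bar
TOP = 5              # the toy domain consists of subsets of {0, ..., TOP - 1}


def psi(S):
    """A monotone map on subsets: add 0 and the in-range successors of S."""
    return frozenset({0}) | frozenset(x + 1 for x in S if x + 1 < TOP)


def s(n):
    """Successor on N-bar; note that INF + 1 == INF in floating point."""
    return n + 1


def bang(n):
    """The map ! : n |-> psi^n(bottom), with !(INF) the supremum of the chain."""
    S = frozenset()                  # bottom is the empty set
    if n == INF:
        while psi(S) != S:           # the chain stabilizes in this finite domain
            S = psi(S)
        return S
    for _ in range(n):
        S = psi(S)
    return S


# ! is a morphism of recursion schemes: psi(!(n)) == !(s(n)) for all n.
checks = [psi(bang(n)) == bang(s(n)) for n in list(range(10)) + [INF]]
print(all(checks), sorted(bang(INF)))
```

Since ∞ is the only fixed point of s, its image under ! is forced, and in this example it is the least fixed point {0, ..., 4} of ψ.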
EXERCISES FOR CHAPTER 7
1. Let C be a partially additive category. In 3.2.24 we showed that if $f\colon X \to X + Y$ then for the unique $f_1\colon X \to X$, $f_2\colon X \to Y$ with $f = \mathrm{in}_1 f_1 + \mathrm{in}_2 f_2$, the iterate $f^\dagger\colon X \to Y$ of f is given by
$$f^\dagger = \sum_{m=0}^{\infty} f_2 f_1^m.$$
In this exercise we show that $f^\dagger$ arises as a canonical fixed point for the following category $\mathcal{A}$ of recursion schemes.
Objects: $(C(X, Y), f, \psi_f)$ where $f\colon X \to X + Y$, $\psi_f(g) = gf_1 + f_2$.
Morphisms: $\Gamma\colon (C(X, Y), f, \psi_f) \to (C(\hat{X}, \hat{Y}), \hat{f}, \psi_{\hat{f}})$ is a function $\Gamma\colon C(X, Y) \to C(\hat{X}, \hat{Y})$ satisfying the following three properties.
(i) Whenever $(g_i)$ is summable in $C(X, Y)$ then $(\Gamma g_i)$ is summable in $C(\hat{X}, \hat{Y})$ and $\Gamma(\sum g_i) = \sum(\Gamma g_i)$.
(ii) $\Gamma(f_2) = \hat{f}_2$.
(iii) $\Gamma(gf_1) = (\Gamma(g))\hat{f}_1$ for all $g \in C(X, Y)$.
Show that $\mathcal{A}$ is a category of recursion schemes and that $f^\dagger$ is a canonical fixed point.
2. Create a version of Exercise 1 in which objects have the form $(\mathrm{Pfn}(X, X)^n, (A, B), \psi_{A,B})$ for which
$$f_{A,B} = \sum_{m=0}^{\infty} BA^m,$$
as in Exercise 6.2.11, is a canonical fixed point.
3. Create a category of recursion schemes with objects of the form
$$((2^{X^*})^n, G, \psi_G)$$
with G a context-free grammar with n nonterminals so that L(G) is a canonical fixed point.
Notes and References for Chapter 7
The canonical fixed point theorem is due to the authors: "The pattern-of-calls expansion is the canonical fixed point for recursive definitions," Journal of the Association
for Computing Machinery, 29,1982, pp. 557-602.
CHAPTER 8
Partially Additive Semantics of Recursion
8.1 PAR Schemes
8.2 The Canonical Fixed Point for PAR Schemes
8.3 Additive Domains
8.4 Proving Correctness
8.5 Power Series and Products
In Section 5.2, we used partially additive semantics in Pfn to describe a number of examples of recursive specification as "power-series" maps
$$\psi(h) = H_0 + H_1(h) + H_2(h, h) + \cdots$$
in which the Kleene semantics could alternatively be given by the pattern-of-calls expansion.
In Sections 8.1 and 8.2 we define recursive specifications and their pattern-of-calls expansion on general partially additive monoids, and will show that the pattern-of-calls expansion, like the Kleene semantics, may be regarded as a unique canonical fixed point in the sense of Chapter 7.
In Section 8.3 we define ordered partially additive categories in which it can be shown that each power-series specification is also continuous and that, moreover, the Kleene semantics and pattern-of-calls expansion coincide. It follows that both order semantics and partially additive semantics apply to important semantic categories.
In Section 8.4 we briefly illustrate some rules for correctness which use both ordered and partially additive semantics.
Section 8.5 caps the theory of the first two sections to provide tools needed to define the semantics of recursive specification in a programming language using partially additive ideas.
8.1 PAR Schemes
Generalizing $(M, \Sigma) = \mathbf{Pfn}(X, Y)$, we seek to formulate recursive definitions in a partially additive monoid $(M, \Sigma)$. We have seen examples in Section 5.2 in which recursive specifications $\psi: M \to M$ can take the form $\psi(h) = \sum_m H_m(h, \ldots, h)$ for suitable maps $H_m: M^m \to M$. In these motivating examples, $H_m(h_1, \ldots, h_m)$ was defined as follows: For each distinct m-substitution path in a recursive call, replace the jth occurrence of a variable by $h_j$ ($1 \le j \le m$) and compose the partial functions along the path; then sum over all these paths.
There are, in fact, more general ways of combining m functions than composing, as was seen in the definition of recursive specification for FPF in Section 6.3 where, for example, $H_m = \mathrm{constr}_m$ as in 6.3.16 is important. The goal of this section is to give an abstract definition of suitable $H_m$ to introduce a general theory of partially additive recursive specifications. (Some of this theory is postponed to Section 8.5.) The main property required of $H_m$ is m-additivity, as is now defined.
1 Definition. Let $(M_1, \Sigma_1), \ldots, (M_m, \Sigma_m), (M, \Sigma)$ be partially additive monoids. A function $L: M_1 \to M$ is additive $(M_1, \Sigma_1) \to (M, \Sigma)$ if for all summable families $(h_i)$ in $(M_1, \Sigma_1)$, $(L h_i)$ is summable in $(M, \Sigma)$ and
$$L\Big(\sum\nolimits_1 h_i\Big) = \sum L h_i.$$
More generally, a function $L: M_1 \times \cdots \times M_m \to M$ is m-additive if whenever all but the jth variable is fixed the resulting function $M_j \to M$ is additive, that is,
$$L\Big(h_1, \ldots, h_{j-1}, \sum\nolimits_j c_i, h_{j+1}, \ldots, h_m\Big) = \sum_i L(h_1, \ldots, h_{j-1}, c_i, h_{j+1}, \ldots, h_m)$$
for all j, summable families $(c_i)$, and all choices of fixed $h_t \in M_t$ ($t \ne j$).
Obviously, 1-additive is the same as additive. For completeness we define a 0-additive map $1 \to (M, \Sigma)$ to be any element of M. (The one-element set 1, the empty product (2.3.10) $M_1 \times \cdots \times M_m$ when m = 0, should be thought of as the trivial partially additive monoid whose only element is 0.)
2 Observations. Any composition of additive maps is additive. The identity map $\mathrm{id}_M: (M, \Sigma) \to (M, \Sigma)$ is additive. If L is additive then L(0) = 0.
To see this, if $L: (M, \Sigma) \to (M', \Sigma')$, $L': (M', \Sigma') \to (M'', \Sigma'')$ are additive, then
$$L'L\Big(\sum h_i\Big) = L'\Big(\sum\nolimits' L h_i\Big) = \sum\nolimits'' L'(L h_i) = \sum\nolimits'' (L'L) h_i$$
shows $L'L$ is additive. This argument may be iterated to see that a finite composition of additive maps is additive. That the identity map is additive is obvious, and an additive map preserves 0 by definition since 0 is the empty sum.
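The additivity condition can be checked concretely in Pfn. The following is a minimal sketch, assuming partial functions on integers are stored as finite Python dictionaries and that a family is summable in Pfn exactly when its domains are pairwise disjoint (the sum then being the union of the graphs). The names `psum`, `compose` and `post` are hypothetical helpers introduced only for this illustration.

```python
from typing import Dict, Iterable, Optional

PartialFn = Dict[int, int]  # a partial function, stored as its finite graph


def psum(family: Iterable[PartialFn]) -> Optional[PartialFn]:
    """Sum in Pfn (assumed convention): defined only for disjoint domains."""
    total: PartialFn = {}
    for f in family:
        if total.keys() & f.keys():
            return None  # domains overlap, so the family is not summable
        total.update(f)
    return total


def compose(g: PartialFn, f: PartialFn) -> PartialFn:
    """The composite g f (first f, then g), defined where both steps are."""
    return {x: g[y] for x, y in f.items() if y in g}


def post(g: PartialFn):
    """The map L(h) = g h, which should be additive in h."""
    return lambda h: compose(g, h)


g = {1: 10, 2: 20, 3: 30}
family = [{0: 1}, {5: 2}, {7: 9}]   # pairwise disjoint domains
L = post(g)

lhs = L(psum(family))               # L applied to the sum
rhs = psum(L(h) for h in family)    # sum of the L(h_i)
```

Here both sides come out as the same dictionary, and `L({})` is the empty dictionary, matching the observation that an additive map preserves 0.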
3 Example. Let $m \ge 2$ and let $X_0, \ldots, X_m$ be objects in a partially additive category C. Then the composition map
$$C(X_0, X_1) \times \cdots \times C(X_{m-1}, X_m) \to C(X_0, X_m), \qquad (h_1, \ldots, h_m) \mapsto h_m \cdots h_1$$
is m-additive. For by the axioms on a partially additive category, specifically 3.2.1, we have (writing $\sum$ for $\sum_j$ to avoid notational confusion)
$$\begin{aligned}
L\Big(h_1, \ldots, h_{j-1}, \sum c_i, h_{j+1}, \ldots, h_m\Big) &= (h_m \cdots h_{j+1})\Big[\Big(\sum c_i\Big)(h_{j-1} \cdots h_1)\Big] \\
&= (h_m \cdots h_{j+1}) \sum c_i (h_{j-1} \cdots h_1) \\
&= \sum_i (h_m \cdots h_{j+1})\, c_i\, h_{j-1} \cdots h_1.
\end{aligned}$$
4 Proposition. Let $(M_1, \Sigma_1), \ldots, (M_m, \Sigma_m), (M, \Sigma)$ be partially additive monoids and let $D_t: M_1 \times \cdots \times M_m \to M$ ($t \in T$) be a family of m-additive maps such that the sum
$$D(h_1, \ldots, h_m) = \sum_t D_t(h_1, \ldots, h_m)$$
is defined for all $h_1, \ldots, h_m$. Then D is m-additive.
PROOF. For fixed $h_1, \ldots, h_{j-1}, h_{j+1}, \ldots, h_m$ let
$$L_t(c) = D_t(h_1, \ldots, h_{j-1}, c, h_{j+1}, \ldots, h_m), \qquad L(c) = D(h_1, \ldots, h_{j-1}, c, h_{j+1}, \ldots, h_m).$$
We are given that each $L_t$ is additive and we must prove that L is additive. But clearly $L(c) = \sum_t L_t(c)$. Hence, all reduces to showing that a sum L of additive maps $L_t$ is additive. Indeed, if $(c_i \mid i \in I)$ is summable (we write $\sum c_i$ instead of $\sum_j c_i$ to avoid notational confusion) we have
$$\begin{aligned}
L\Big(\sum (c_i \mid i \in I)\Big) &= \sum \Big(L_t\Big(\sum (c_i \mid i \in I)\Big) \,\Big|\, t \in T\Big) \\
&= \sum \Big(\sum (L_t c_i \mid i \in I) \,\Big|\, t \in T\Big) && (L_t \text{ is additive}) \\
&= \sum \Big(\sum (L_t c_i \mid t \in T) \,\Big|\, i \in I\Big) && \text{(partition associativity)} \\
&= \sum (L c_i \mid i \in I). && \Box
\end{aligned}$$
From 3 and 4 it follows that if Hm arises from summing composition paths
as in our motivating examples in Section 5.2, then Hm is m-additive.
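5 Example. As in Section 6.3, let $D = \mathbf{Pfn}(DTN, DTN)$ and for $m \ge 1$ consider
$$\mathrm{constr}_m: D^m \to D$$
so that
$$\mathrm{constr}_m(h_1, \ldots, h_m): \langle t_1, \ldots, t_m \rangle \mapsto [h_1 t_1, \ldots, h_m t_m].$$
Then $\mathrm{constr}_m$ is m-additive. This is because both
$$\mathrm{constr}_m\Big(h_1, \ldots, h_{j-1}, \sum c_i, h_{j+1}, \ldots, h_m\Big)\langle t_1, \ldots, t_m \rangle$$
and
$$\sum \big(\mathrm{constr}_m(h_1, \ldots, h_{j-1}, c_i, h_{j+1}, \ldots, h_m) \mid i \in I\big): \langle t_1, \ldots, t_m \rangle$$
mean $[h_1 t_1, \ldots, h_{j-1} t_{j-1}, c_i t_j, h_{j+1} t_{j+1}, \ldots, h_m t_m]$ for the unique (if any) i with $t_j \in DD(c_i)$.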
We have set the stage for the main definition of this section:
6 Definition. A partially additive recursive scheme, PAR scheme for short, is $(M, \Sigma, H)$, where $(M, \Sigma)$ is a partially additive monoid and $H = (H_m : m = 0, 1, 2, \ldots)$, where $H_m: M^m \to M$ is m-additive for all m = 0, 1, 2, ... subject to the requirement that for each $x \in M$, $\sum (H_m(x, \ldots, x) \mid m = 0, 1, 2, \ldots)$ exists so that the function
$$\psi_H: M \to M, \qquad x \mapsto \sum_{m=0}^{\infty} H_m(x, \ldots, x) \tag{7}$$
is defined.
Such $\psi_H$ is a "power-series" map (but the formal definition is postponed to 8.5.1). A "polynomial" is the case where $H_m = 0$ for $m \ge m_0$ for some $m_0$.
In practice, a recursive specification arises in terms of a function $\psi: M \to M$. In order semantics, M is a poset and it is a matter of showing $\psi$ is continuous. The PAR-scheme approach has a new complication, namely, that once M is a partially additive monoid it is necessary to find H with $\psi = \psi_H$ as in 7. It may not be obvious how to find such H, and we saw in Exercise 5.2.6 that H need not be unique. On the other hand, the partially additive approach has advantages. The $H_m$ used in a PAR scheme relate directly to the constructions used to build recursive specifications in practice. The pattern-of-calls semantics of the next section will provide a semantics in M for each PAR scheme $(M, \Sigma, H)$ in the form of a sum whose terms deal with individual computation paths at a finer level than the Kleene approximants $\psi_H^n(\bot)$.
The examples of Section 5.2 yielded specifications described by polynomials of degree at most 2. The following example is a nonpolynomial PAR scheme for a specification to compute the determinant of a square matrix.
8 Example. The following recursive algorithm computes the determinant of an n x n matrix by cofactor expansion along the first column.
0. Define function DET with input matrix MAT, output number Z, and additional local variables I and N.
1. Let N be the number of rows in MAT.
2. If N = 1 go to END.
3. Z := 0; I := 0.
4. LOOP: I := I + 1; if updated I > N, exit.
5. If $a_{ij}$ is the i-j entry of MAT and if $B_{ij}$ denotes the submatrix of MAT obtained by deleting row i and column j,
   $Z := Z + (-1)^{I+1} a_{I1}\, \mathrm{DET}(B_{I1})$
6. Go to LOOP.
7. END: Z := $a_{11}$.
To emphasize the practical reality of this algorithm, we give an APL
program which implements it line-by-line. (The reader need not be familiar
with APL since the original description is equivalent.)
∇ Z ← DET MAT; I; N
[1] N ← (⍴MAT)[1]
[2] → (N = 1)/END
[3] Z ← I ← 0
[4] LOOP: → (N < I ← I + 1)/0
[5] Z ← Z + ((¯1)*1 + I) × MAT[I; 1] × DET MAT[(I ≠ ⍳N)/⍳N; 1 + ⍳(N - 1)]
[6] → LOOP
[7] END: Z ← MAT[1; 1]
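For readers who prefer a more widely used notation, the numbered steps can also be sketched in Python. This is an illustrative transcription rather than part of the original text; it assumes the matrix is given as a list of equal-length rows, and the name `det` is chosen here only for the example.

```python
def det(mat):
    """Determinant by cofactor expansion along the first column."""
    n = len(mat)                       # step 1: N is the number of rows
    if n == 1:                         # step 2: a 1 x 1 matrix goes to END
        return mat[0][0]               # step 7: Z := a_11
    z = 0                              # step 3
    for i in range(1, n + 1):          # steps 4 and 6: I runs from 1 to N
        # B_{I1}: delete row I and the first column
        minor = [row[1:] for r, row in enumerate(mat, start=1) if r != i]
        # step 5: Z := Z + (-1)^(I+1) * a_{I1} * DET(B_{I1})
        z += (-1) ** (i + 1) * mat[i - 1][0] * det(minor)
    return z
```

For instance, `det([[2, 3], [4, 5]])` evaluates 2·5 - 4·3.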
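The desired function is an element of the partially additive monoid $(M, \Sigma)$ with M the set of all partial functions from the set of all square matrices with real entries to the set of reals. In defining $H_n$ below, we must specify, for each $m_1, \ldots, m_n$, what $H_n(m_1, \ldots, m_n)$ is as an element of M. We do that by exhibiting the number it returns when given a matrix MAT as input. Then define
$H_0 \in M$ by $H_0$ = if MAT = $[a_{11}]$ is 1 x 1 then $a_{11}$ else undefined;
$H_1: M \to M$ is the always-undefined function;
$H_2: M^2 \to M$ by $H_2(m_1, m_2)$ = if MAT = $\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$ is 2 x 2
   then $a_{11} m_1([a_{22}]) - a_{21} m_2([a_{12}])$
   else undefined;
$H_3: M^3 \to M$ by $H_3(m_1, m_2, m_3)$ = if MAT = $\begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$ is 3 x 3
   then $a_{11} m_1\!\left(\begin{bmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{bmatrix}\right) - a_{21} m_2\!\left(\begin{bmatrix} a_{12} & a_{13} \\ a_{32} & a_{33} \end{bmatrix}\right) + a_{31} m_3\!\left(\begin{bmatrix} a_{12} & a_{13} \\ a_{22} & a_{23} \end{bmatrix}\right)$
   else undefined.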
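Similarly, $H_n(m_1, \ldots, m_n)$ is defined to yield a result only for a MAT that is n x n. Then $\psi: M \to M$, defined by
$$\psi(m) = \sum_{n \ge 0} H_n(m, \ldots, m),$$
is the sought recursive specification. The desired semantics is equally given by the Kleene semantics or the pattern-of-calls expansion to be defined in the next section. In fact, the least fixed point is total, and so is the only fixed point, and we thus use the fixed point equation
$$\mathrm{DET} = \psi(\mathrm{DET})$$
to compute the determinant of a 2 x 2 matrix.
$$\begin{aligned}
\mathrm{DET}\!\left(\begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}\right) &= \psi(\mathrm{DET})\!\left(\begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}\right) \\
&= H_2(\mathrm{DET}, \mathrm{DET})\!\left(\begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}\right) \\
&= 2\,\mathrm{DET}([5]) - 4\,\mathrm{DET}([3]) \\
&= 2\,\psi(\mathrm{DET})([5]) - 4\,\psi(\mathrm{DET})([3]) \\
&= 2 H_0([5]) - 4 H_0([3]) \\
&= 2 \cdot 5 - 4 \cdot 3 = -2.
\end{aligned}$$

EXERCISES FOR SECTION 8.1
1. Use the algorithm of 8 to compute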
DET[~ o 1]
1 0
3 4
2. In this exercise we briefly discuss polynomial maps on vector spaces to indicate the
analogy with the partially additive "polynomials" defined after 7. Say that a function $f: V^m \to V$, V a vector space, is m-linear if when all but one of the m variables are fixed with arbitrary elements of V, the resulting map $V \to V$ is linear. Thus, 1-linear is linear. We define 0-linear to mean constant.
(i) Let R be the one-dimensional vector space of reals. Show that $f: R \to R$ is linear if and only if there exists a constant b with f(x) = bx. [Hint: b = f(1).]
(ii) Show that $f: R \to R$ has form $f(x) = cx^2$ with c constant if and only if there exists 2-linear $H_2: R^2 \to R$ with $f(x) = H_2(x, x)$. [Hint: $c = H_2(1, 1)$, $H_2(t, u) = ctu$.]
(iii) Show that $f: R \to R$ has form $dx^n$ with d constant if and only if there exists n-linear $H_n: R^n \to R$ with $f(x) = H_n(x, \ldots, x)$.
It follows that the familiar polynomial function $p: R \to R$, $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is just
$$p(x) = H_0 + H_1(x) + H_2(x, x) + \cdots + H_n(x, \ldots, x)$$
with $H_n: R^n \to R$ n-linear. The latter generalizes immediately to define polynomials $p: V \to V$ in arbitrary vector spaces V. For example,
(iv) Let $V = R^2$ be the Cartesian plane, a two-dimensional vector space. Define $p: V \to V$ by
$$p(x, y) = (x^2 + 2xy - y,\; y^2 + 10 - 2x);$$
show that p(x, y) has the form $H_0 + H_1(x, y) + H_2((x, y), (x, y))$ with $H_n: V^n \to V$ n-linear, n = 0, 1, 2.
3. Let $j_1, \ldots, j_k \ge 0$ and let $H_t: N_{t1} \times \cdots \times N_{tj_t} \to N_t$ be $j_t$-additive for $1 \le t \le k$. Let $L: N_1 \times \cdots \times N_k \to N$ be k-additive. Define $m = j_1 + \cdots + j_k$. Show that
$$M: N_{11} \times \cdots \times N_{1j_1} \times N_{21} \times \cdots \times N_{kj_k} \to N,$$
$$M(h_{11}, \ldots, h_{kj_k}) = L(H_1(h_{11}, \ldots, h_{1j_1}), \ldots, H_k(h_{k1}, \ldots, h_{kj_k}))$$
is m-additive. [Hint: Despite the cumbersome notation, show that if all but one of the variables is fixed, the resulting function of one variable is the composition of two additive maps.]
4. Let $(M, \Sigma) = \mathbf{Pfn}(DTN, DTN)$. Show that the map if-then-else: $M^3 \to M$ is not 3-additive. (Further discussion to resolve this situation is given in 8.5.15; no fair peeking.)
8.2 The Canonical Fixed Point for PAR Schemes
The PAR schemes of 8.1.6 are the objects of a category of recursion schemes
(7•• 1) whose canonical fixed point exists uniquely as an application of the
canonical fixed point theorem 7..9 and coincides with the pattern-of-calls
expansion 5. In addition to being a useful result for later work, this establishes
that the pattern-of-calls expansion is always a fixed point solution.
We begin by making PAR schemes into a category.
1 Definition. Let $(M, \Sigma, H)$, $(M', \Sigma', H')$ be PAR schemes. A homomorphism $\varphi: (M, \Sigma, H) \to (M', \Sigma', H')$ of PAR schemes is an additive map $\varphi: (M, \Sigma) \to (M', \Sigma')$ which also satisfies
$$H'_n(\varphi m_1, \ldots, \varphi m_n) = \varphi H_n(m_1, \ldots, m_n)$$
for all $n \ge 0$ and $m_1, \ldots, m_n \in M$. It is obvious (using 8.1.2) that the composition of homomorphisms is a homomorphism and that the identity function is a homomorphism $(M, \Sigma, H) \to (M, \Sigma, H)$.
To conform to Definition 7..1, the category of partially additive recursive schemes, call it $\mathcal{P}$, has
Objects: $(M, a, \psi)$ where $a = (\Sigma, H)$ with $(M, \Sigma, H)$ a PAR scheme and $\psi = \psi_H$ as in 8.1.7.
Morphisms: Homomorphisms $\varphi$ as above.
By remarks already made, this is clearly a category. We will treat a PAR scheme $(M, \Sigma, H)$ as the object $(M, (\Sigma, H), \psi_H)$ of $\mathcal{P}$ without comment in the sequel.
According to Definition 7..1, to ensure that $\mathcal{P}$ is a category of recursion schemes we must show for any homomorphism $\varphi: (M, \Sigma, H) \to (M', \Sigma', H')$ that the square $\varphi \psi = \psi' \varphi$ (with $\psi = \psi_H$, $\psi' = \psi_{H'}$) commutes. This is verified by
$$\begin{aligned}
\varphi \psi h &= \varphi \sum H_n(h, \ldots, h) \\
&= \sum\nolimits' \varphi H_n(h, \ldots, h) && \text{since } \varphi \text{ is additive} \\
&= \sum\nolimits' H'_n(\varphi h, \ldots, \varphi h) && \text{since } \varphi \text{ is a homomorphism} \\
&= \psi' \varphi h.
\end{aligned}$$
To apply Theorem 7..9 to $\mathcal{P}$ we will need to construct an initial object. This requires considerable discussion. We begin by considering trees with n-branch nodes labeled $\omega_n$.
[Figure: example trees with n-branch nodes labeled $\omega_n$ and leaves labeled $\omega_0$.]
Each such tree can then be thought of as the abstract specification of a pattern of calls. Given a PAR scheme H, we interpret such a tree by evaluating from leaf to root, replacing $\omega_n$ with $H_n$ as we go. We formalize with the following:
2 Definition. The abstract syntax for PAR-scheme semantics is the set $e$ of all trees defined inductively as follows:
Basis Step: $\omega_0 \in e$.
Induction Step: If $t_1, \ldots, t_n \in e$, $n \ge 1$, then the tree with root $\omega_n$ and immediate subtrees $t_1, \ldots, t_n$ (in that order) is in $e$.
In short, $e$ consists of all finite-depth finitely branching trees with a node labeled $\omega_k$ if there are k branches from that node. In particular, each leaf is labeled $\omega_0$.
Sometimes linear notation for elements of $e$ is more useful. We achieve this by writing $\omega_n[t_1, \ldots, t_n]$ for the tree with root $\omega_n$ and immediate subtrees $t_1, \ldots, t_n$, and $\omega_0$ for the one-node tree, so that the first two of the trees above have the linear form
$$\omega_1[\omega_1[\omega_1[\omega_0]]], \qquad \omega_2[\omega_0, \omega_3[\omega_0, \omega_0, \omega_1[\omega_0]]].$$
The trees in $e$ abstractly represent the result of all possible iterated substitutions or patterns of calls, starting with $\omega_0$ which represents $H_0$. Our hope is to continue to develop the theory at a level of abstraction which relieves us from keeping track of special path structure when it plays no role. Such details will, of course, be necessary when analyzing specific examples.
3 Definition. Given a PAR scheme $(M, \Sigma, H)$, the interpretation $s_H$ of the tree s in $e$ is the element of M obtained by "running H on s."
Basis Step: $(\omega_0)_H = H_0$.
Induction Step: $(\omega_n[t_1, \ldots, t_n])_H = H_n((t_1)_H, \ldots, (t_n)_H)$.
For example, the determinant of a 3 x 3 matrix $[a_{ij}]$ is $s_H([a_{ij}])$ for $(M, \Sigma, H)$ as in Example 8.1.8 where
$$s = \omega_3[\omega_2[\omega_0, \omega_0], \omega_2[\omega_0, \omega_0], \omega_2[\omega_0, \omega_0]]$$
because a 3 x 3 determinant via cofactor expansion reduces to three 2 x 2 determinants each of which in turn reduces to two 1 x 1 determinants. Not every $s \in e$ corresponds to a possible pattern of calls in this example and for such s, $s_H = 0$. For example, consider
$$s = \omega_2[\omega_2[\omega_0, \omega_0], \omega_0].$$
Here
$$s_H = H_2((\omega_2[\omega_0, \omega_0])_H, (\omega_0)_H) = H_2(H_2(H_0, H_0), H_0).$$
By definition of $H_2$, this is defined only for 2 x 2 input $[a_{ij}]$ to be
$$a_{11}\,(H_2(H_0, H_0)([a_{22}])) - a_{21} H_0([a_{12}]).$$
But this is undefined since $H_2(H_0, H_0)$ is undefined on a 1 x 1 matrix.
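The two interpretations just discussed can be replayed mechanically. The sketch below is illustrative only and rests on several assumptions that are not in the text: a tree $\omega_n[t_1, \ldots, t_n]$ is encoded as the Python tuple of its subtrees (so $\omega_0$ is the empty tuple), an element of M is a Python function returning `None` where the partial function is undefined, and `H` follows the cofactor pattern of Example 8.1.8 for every n. The names `H`, `minor` and `interpret` are hypothetical.

```python
def minor(mat, i):
    """Delete row i (0-based) and the first column."""
    return [row[1:] for r, row in enumerate(mat) if r != i]


def H(n, *ms):
    """H_n(m_1, ..., m_n) as a partial function on square matrices."""
    def result(mat):
        size = len(mat)
        if n == 0:
            return mat[0][0] if size == 1 else None   # H_0
        if n == 1 or size != n:
            return None          # H_1 is nowhere defined; H_n needs n x n input
        total = 0
        for i, m in enumerate(ms):
            value = m(minor(mat, i))
            if value is None:
                return None      # undefined as soon as one call is undefined
            total += (-1) ** i * mat[i][0] * value
        return total
    return result


def interpret(tree):
    """s_H: evaluate from leaf to root, replacing omega_n by H_n."""
    return H(len(tree), *(interpret(t) for t in tree))


w0 = ()                 # omega_0
w2 = (w0, w0)           # omega_2[omega_0, omega_0]
s3 = (w2, w2, w2)       # the 3 x 3 pattern of calls
bad = (w2, w0)          # omega_2[omega_2[omega_0, omega_0], omega_0]
```

With this encoding, `interpret(s3)` computes 3 x 3 determinants and is undefined on matrices of any other size, while `interpret(bad)` returns `None` on every input, in line with $s_H = 0$ above.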
We need one technical result.
4 Lemma. Let $(M, \Sigma)$ be a partially additive monoid. Given summable families $(a_i)$, $(b_j)$ in M and 2-additive $H_2: M^2 \to M$, $\sum_{i,j} H_2(a_i, b_j)$ exists (the sum is countable; see Exercise 7) and
$$\sum_{i,j} H_2(a_i, b_j) = H_2\Big(\sum a_i, \sum b_j\Big).$$
More generally, given an m-additive map $H_m: M^m \to M$ and m summable families $(a^1_{i_1}), \ldots, (a^m_{i_m})$, the following equation holds (including the assertion that the right-hand side exists):
$$H_m\Big(\sum_{i_1} a^1_{i_1}, \ldots, \sum_{i_m} a^m_{i_m}\Big) = \sum_{i_1, \ldots, i_m} H_m(a^1_{i_1}, \ldots, a^m_{i_m}).$$
PROOF IDEA. For example,
$$\begin{aligned}
H_2(a + b, c + d + e) &= H_2(a, c + d + e) + H_2(b, c + d + e) \\
&= H_2(a, c) + H_2(a, d) + H_2(a, e) + H_2(b, c) + H_2(b, d) + H_2(b, e).
\end{aligned}$$
The general proof is similar. □
We are now ready to define the pattern-of-calls expansion.
5 Definition. If $(M, \Sigma, H)$ is a PAR scheme, its pattern-of-calls expansion $e_H$ is given by
$$e_H = \sum (s_H : s \in e),$$
where $s_H$ is the interpretation of s as in 3.
Of course, we need to be sure this sum exists. This is always so by the
following theorem:
6 Theorem. For any PAR scheme $(M, \Sigma, H)$ the family $(s_H : s \in e)$ is summable so that the pattern-of-calls expansion $e_H$ exists.
PROOF. By basic set theory the sum is indeed a countable one (see Exercises 6-10). By the limit axiom for partially additive monoids (3.1.2), we may deduce that the sum $\sum (s_H : s \in e)$ exists if we can show that every finite subfamily is summable. Since any subfamily of a summable family is summable by 3.1.7, it suffices to show that there exists an ascending sequence of subsets of $e$
$$S_0 \subset S_1 \subset S_2 \subset \cdots$$
such that $\bigcup_{k \ge 0} S_k = e$, and each $\sum (s_H : s \in S_k)$ exists. Define $S_0 = \{\omega_0\}$ and $S_{k+1} = \{\omega_n[t_1, \ldots, t_n] : n \ge 0, \text{ each } t_i \text{ is in } S_k\}$. Then certainly $e = \bigcup_{k \ge 0} S_k$. Also, $S_k \subset S_{k+1}$ since $\omega_0 \in S_1$, and for the inductive step, if $t \in S_k$ then either $t = \omega_0$ and $t \in S_{k+1}$ or $t = \omega_n[s_1, \ldots, s_n]$ with n > 0, $s_j \in S_{k-1}$ so that $s_j \in S_k$ by the induction assumption and, hence, $t \in S_{k+1}$. It only remains to show that $\sum (s_H : s \in S_k)$ exists. For k = 0 use the unary sum axiom. Given that $\sum (s_H : s \in S_k)$ exists, we may deduce that
$$\sum (s_H : s \in S_{k+1}) = \sum \big(H_n((t_1)_H, \ldots, (t_n)_H) : n \ge 0, \text{ each } t_i \in S_k\big)$$
exists, being just $\psi_H\big(\sum (s_H : s \in S_k)\big)$, by Lemma 4. □
We are now ready to return to our goal of showing that the category $\mathcal{P}$ of PAR schemes of 1 has an initial object.
7 Definition. The initial PAR scheme $(A, \hat{\Sigma}, \hat{H})$ is defined by
A = the set of all subsets of the abstract syntax $e$ of 2;
$\hat{\Sigma}(S_i : i \in I)$ is defined when the sets $S_i$ are disjoint ($i \ne j$ implies $S_i \cap S_j = \emptyset$) and is then $\bigcup (S_i : i \in I)$;
$\hat{H}_n: A^n \to A$, $(S_1, \ldots, S_n) \mapsto \{\omega_n[t_1, \ldots, t_n] : t_i \in S_i\}$.
For example,
$$\hat{H}_2(\{\omega_0, \omega_2[\omega_0, \omega_0]\}, \{\omega_1[\omega_0]\}) = \{\omega_2[\omega_0, \omega_1[\omega_0]],\; \omega_2[\omega_2[\omega_0, \omega_0], \omega_1[\omega_0]]\}.$$
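The operations of the initial scheme are easy to experiment with. The following is a small sketch, assuming trees are encoded as nested Python tuples (the tree with root $\omega_n$ is the tuple of its n subtrees, so $\omega_0$ is the empty tuple) and subsets of the abstract syntax are finite Python sets; `H_hat` and `sum_hat` are hypothetical names for the maps $H_n$ and the sum of Definition 7.

```python
from itertools import product


def H_hat(n, *sets):
    """(S_1, ..., S_n) -> { omega_n[t_1, ..., t_n] : t_i in S_i }."""
    if len(sets) != n:
        raise ValueError("H_hat(n, ...) expects exactly n sets of trees")
    # In the tuple encoding, omega_n[t_1, ..., t_n] is just (t_1, ..., t_n).
    return frozenset(product(*sets))


def sum_hat(family):
    """Disjoint union; returns None when the family is not summable."""
    total = set()
    for s in family:
        if total & set(s):
            return None
        total |= set(s)
    return frozenset(total)


w0 = ()            # omega_0
w1 = (w0,)         # omega_1[omega_0]
w2 = (w0, w0)      # omega_2[omega_0, omega_0]

example = H_hat(2, {w0, w2}, {w1})
```

Here `example` contains exactly the two trees displayed above, and `H_hat(0)` is the one-element set containing `w0`. Trees produced by different n have different tuple lengths at the root, so the values of `H_hat(n, S, ..., S)` for distinct n are disjoint.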
There is work to do if we are to show this is indeed the desired initial object.
We begin with the following:
8 Proposition. $(A, \hat{\Sigma}, \hat{H})$ is a PAR scheme.
PROOF. It is obvious that $(A, \hat{\Sigma})$ is a partially additive monoid. We must show $\hat{H}_n$ is additive in each variable. For notational convenience we show additivity in the first variable. Let $S_2, \ldots, S_n \in A$ be fixed and consider a summable family $(T_i : i \in I)$. Then an element of $\hat{H}_n(T_i, S_2, \ldots, S_n)$ has form $\omega_n[t, s_2, \ldots, s_n]$ with $t \in T_i$, $s_j \in S_j$. Since $T_i \cap T_j = \emptyset$ if $i \ne j$, $\hat{H}_n(T_i, S_2, \ldots, S_n) \cap \hat{H}_n(T_j, S_2, \ldots, S_n) = \emptyset$. It is then clear that
$$\hat{H}_n\Big(\hat{\sum_i} T_i, S_2, \ldots, S_n\Big) = \hat{\sum_i} \hat{H}_n(T_i, S_2, \ldots, S_n).$$
Finally, we must show that for each $S \in A$,
$$\psi_{\hat{H}}(S) = \hat{\sum_{n \ge 0}} \hat{H}_n(S, \ldots, S)$$
is defined, that is, that $\hat{H}_n(S, \ldots, S) \cap \hat{H}_m(S, \ldots, S) = \emptyset$ if $m \ne n$. This is clear since each tree in $\hat{H}_n(S, \ldots, S)$ has root $\omega_n$. □
9 Proposition. $(A, \hat{\Sigma}, \hat{H})$ has a unique fixed point, namely, $e$.
PROOF. The fixed point equation is
$$S = \hat{\sum_{n \ge 0}} \hat{H}_n(S, \ldots, S)$$
and is certainly satisfied by $e$. To see that there is no other fixed point, note that the equation implies that $\{\omega_0\} \subset S$, and it then follows inductively that any $\omega$-tree is in S, so $S = e$. □
This sets the stage for a major objective:
10 Theorem. $(A, \hat{\Sigma}, \hat{H})$ is an initial object of the category $\mathcal{P}$ of PAR schemes. For each PAR scheme $(M, \Sigma, H)$ the unique homomorphism $!: (A, \hat{\Sigma}, \hat{H}) \to (M, \Sigma, H)$ is defined by
$$!(S) = \sum (s_H \mid s \in S) \tag{11}$$
for each subset S of $e$ (i.e., each S in A). We call $!$ the canonical homomorphism to $(M, \Sigma, H)$.
PROOF. By Theorem 6 and 3.1.7, the sum $\sum (s_H \mid s \in S)$ is defined for each subset S of $e$. Thus, $!(S)$ is well defined. To show that $!$ is a homomorphism we proceed as follows.
(i) Let $S_1, \ldots, S_n$ be n subsets of $e$. Then
$$\begin{aligned}
!\,\hat{H}_n(S_1, \ldots, S_n) &= \sum \big((\omega_n[t_1, \ldots, t_n])_H \mid t_i \in S_i\big) \\
&= \sum \big(H_n((t_1)_H, \ldots, (t_n)_H) \mid t_i \in S_i\big) \\
&= H_n\Big(\sum ((t_1)_H \mid t_1 \in S_1), \ldots, \sum ((t_n)_H \mid t_n \in S_n)\Big) && \text{by n-additivity of } H_n \text{ and 4} \\
&= H_n(!(S_1), \ldots, !(S_n)).
\end{aligned}$$
(ii) Let the $S_i$ be disjoint subsets of $e$ so that $\hat{\sum}_i S_i$ is defined. Then
$$\begin{aligned}
!\Big(\hat{\sum_i} S_i\Big) &= \sum_i \sum (s_H : s \in S_i) && \text{by partition associativity, since the } S_i \text{ are disjoint} \\
&= \sum_i\, !(S_i)
\end{aligned}$$
so that $!$ is additive.
It only remains to show that $!$ is unique. But suppose that $\varphi$ is any homomorphism $(A, \hat{\Sigma}, \hat{H}) \to (M, \Sigma, H)$. Then S is the sum of its one-element subsets and
$$\varphi(S) = \sum (\varphi(\{s\}) \mid s \in S)$$
for each $S \subset e$, by additivity, while $\varphi(\{\omega_0\}) = H_0$ and $\varphi(\{\omega_n[t_1, \ldots, t_n]\}) = H_n(\varphi(\{t_1\}), \ldots, \varphi(\{t_n\}))$ for any n $\Omega$-trees $t_i$. But these last two equations together imply that $\varphi(\{s\}) = s_H$ for each s in $e$, and so $\varphi(S) = \sum (s_H : s \in S) = {!}(S)$. Hence $!$ is unique. □
We then have our main theorem on PAR schemes:
12 The Canonical Expansion Theorem. The assignment
$$(M, \Sigma, H) \mapsto e_H = \sum (s_H \mid s \in e)$$
of the pattern-of-calls expansion to each PAR scheme provides the unique canonical fixed point for the category of PAR schemes and their homomorphisms.
PROOF. The Canonical Fixed Point Theorem 7..9 tells us that if there is an initial PAR scheme $(A, \hat{\Sigma}, \hat{H})$, with $!$ the unique scheme homomorphism to $(M, \Sigma, H)$, and if this initial scheme has a unique fixed point $a_0$, then there is a unique canonical fixed point, namely, that given by $(M, \Sigma, H) \mapsto {!}(a_0)$. Applying this to the present circumstances, we have that $a_0 = e$ by 9, while $!(S) = \sum (s_H : s \in S)$ by 11. □
EXERCISES FOR SECTION 8.2
1. Prove in detail the following claims left to the reader in Definition 1:
(i) $\mathrm{id}_M: (M, \Sigma, H) \to (M, \Sigma, H)$ is a homomorphism.
(ii) If $\varphi: (M, \Sigma, H) \to (M', \Sigma', H')$, $\varphi': (M', \Sigma', H') \to (M'', \Sigma'', H'')$ are homomorphisms, so is $\varphi'\varphi: (M, \Sigma, H) \to (M'', \Sigma'', H'')$.
2. Verify in detail the claim made in Proposition 8 that $(A, \hat{\Sigma})$ is a partially additive monoid.
3. Let X be a set and consider the set $L_X = 2^{X^*}$ of languages on X.
(i) Show that $(L_X, \Sigma, \cdot, 1)$ is a partially additive semiring (Exercise 3.3.14) if $\Sigma$ is union, $\cdot$ is setwise concatenation
$$AB = \{wv \mid w \in A, v \in B\},$$
and $1 = \{\Lambda\}$ with $\Lambda$ the empty string.
(ii) Show that $(L_X, \Sigma, H)$ with $H_0 = 1$, $H_1(S) = aSb$ (which we write for the more tedious $\{a\}S\{b\}$), other $H_m$ identically 0 is a PAR scheme with
$$\psi_H(S) = 1 + aSb$$
and
$$e_H = \{a^n b^n \mid n = 0, 1, 2, \ldots\}.$$
Observe that $e_H = L(G)$ for G the grammar:
$S \to \Lambda$
$S \to aSb$.
4. Generalizing the example of $s_3$ for the determinant algorithm of 8.1.8 as discussed following 3, describe $s_m \in e$ so that $(s_m)_H$ evaluates determinants of m x m matrices.
5. Use 4 to expand $H_3(a + b, c + d, e + f + g)$.
In this section we have considered sums of the form
$$\sum_{i_1, \ldots, i_m} b_{i_1, \ldots, i_m}$$
which denotes a sum of the form
$$\sum (b_i \mid i \in I_1 \times \cdots \times I_m)$$
with each $I_t$ countable. Hence, it is essential to know that a finite product of countable sets is countable, since we have only considered summing countable families in a partially additive monoid. Exercises 6-10 review the necessary set theory culminating with the verification that the pattern-of-calls expansion is a countable sum.
6. Show that there exists a bijection $\alpha: N \times N \to N$. [Hint: $\alpha(0,0) = 0$, $\alpha(0,1) = 1$, $\alpha(1,0) = 2$, $\alpha(0,2) = 3$, $\alpha(1,1) = 4$, ....]
7. Let $I = \{i_1, i_2, i_3, \ldots\}$, $J = \{j_1, j_2, j_3, \ldots\}$ be countable. Show that $I \times J$ is countable. [Hint: For $\alpha$ as in Exercise 6, show that $f: I \times J \to N$, $f(i_m, j_n) = \alpha(m, n)$ is bijective; use a subset argument if one of I, J is finite.]
8. Show that any finite product of countable sets is countable. [Hint: Use Exercise 7 and induction.]
9. Let $(I_j \mid j \in J)$ be a family of sets with each $I_j$ countable and J countable. Show that $\bigcup I_j$ is countable. [Hint: As $I_j$ is countable there exists an injective function $f_j: I_j \to N$. Write $J = \{j_1, j_2, j_3, \ldots\}$ and for $x \in \bigcup I_j$ let $j(x) = k$ for the smallest k with $x \in I_{j_k}$. Define $f: \bigcup I_j \to N \times N$ by $f(x) = (f_{j_k}(x), j(x))$ where $k = j(x)$. For $\alpha$ as in Exercise 6, prove that $\alpha f: \bigcup I_j \to N$ is injective.]
10. Use Exercise 9 to prove $e$ is countable. [Hint: Use the sets $S_k$ of the proof of 6; show that each $S_k$ is countable.] Hence, $e_H$ is a countable sum.
11. Show that ft as in Exercise 7..1 is an instance of the pattern-of-calls expansion.
12. Show that the semantics $\sum BA^n$ of an abstract iterative program as in Exercises 6.2.11 and 7..2 is an instance of the pattern-of-calls expansion.
8.3 Additive Domains
In this section we define ordered partially additive categories which include, as far as we know, all semantic categories of interest which are partially additive. Such a category C has the property that each C(X, Y) is a domain under the sum-ordering
$$f \le g \text{ if } g = f + h \text{ for some } h$$
(cf. 3.3.16) and for each PAR scheme $(M, \Sigma, H)$, $\psi_H$ is continuous so that the scheme has both its Kleene semantics and its pattern-of-calls expansion, and we prove these are equal by demonstrating that they are both canonical fixed points in a situation where the canonical fixed point theorem 7..9 guarantees that the canonical fixed point is unique.
1 Definition. Let $(M, \Sigma)$ be a partially additive monoid. The sum-ordering on $(M, \Sigma)$ is the relation
$$a \le b \text{ if } b = a + h \text{ for some } h.$$
This relation is always reflexive and transitive. It is reflexive, $a \le a$, because a = a + 0, and it is transitive, in that $a \le b$ and $b \le c$ implies $a \le c$, because if b = a + h, c = b + k then c = (a + h) + k = a + (h + k). We say $(M, \Sigma)$ is a sum-ordered partially additive monoid if the sum-ordering is antisymmetric, a = b if $a \le b$ and $b \le a$, so that $(M, \le)$ is a poset.
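2 Example. $(M, \Sigma)$ is not sum-ordered if $M = \{0, 1, \bot, \infty\}$ and $\Sigma$ is defined by
$$\sum (a_n \mid n \in I) = \begin{cases}
\infty & \text{if some } a_n = \infty \text{ or } a_n \ne \bot \text{ infinitely often} \\
\bot & \text{if all } a_n = \bot \text{ or } I \text{ is empty} \\
0 & \text{if no } a_n = \infty,\ \{n : a_n \ne \bot\} \text{ is finite and nonempty, and the number of } n \text{ with } a_n = 1 \text{ is even} \\
1 & \text{if no } a_n = \infty,\ \{n : a_n \ne \bot\} \text{ is finite and nonempty, and the number of } n \text{ with } a_n = 1 \text{ is odd.}
\end{cases}$$
Such $(M, \Sigma)$ is a partially additive monoid with $\Sigma$ totally defined and $\bot$ as additive zero. Since 0 + 1 = 1, 1 + 1 = 0 we have $0 \le 1$ and $1 \le 0$ even though $0 \ne 1$.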
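The failure of antisymmetry can be confirmed by brute force. The sketch below restricts the sum of Example 2 to finite families, so the "infinitely often" clause never applies. The string constants `BOT` and `INF` stand for the elements $\bot$ and $\infty$, and `total_sum` and `leq` are hypothetical helper names.

```python
BOT, INF = "bot", "inf"          # stand-ins for the elements bottom and infinity
CARRIER = (0, 1, BOT, INF)


def total_sum(family):
    """The sum of Example 2, restricted to finite families."""
    items = list(family)
    if INF in items:
        return INF
    proper = [a for a in items if a != BOT]
    if not proper:
        return BOT               # all entries are bottom, or the family is empty
    ones = sum(1 for a in proper if a == 1)
    return 0 if ones % 2 == 0 else 1


def leq(a, b):
    """Sum-ordering: a <= b if b = a + h for some h in the carrier."""
    return any(total_sum([a, h]) == b for h in CARRIER)
```

Evaluating `leq(0, 1)` and `leq(1, 0)` gives `True` in both cases (take h = 1 each time), so the sum-ordering relates the distinct elements 0 and 1 in both directions.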
3 Definition. An additive domain (see 8 below) is a sum-ordered partially additive monoid $(M, \Sigma)$ satisfying the additional property that whenever $(a_i \mid i \in I)$ is summable and $b \in M$ has the property that $\sum (a_i \mid i \in F) \le b$ for each finite subset F of I then also $\sum (a_i \mid i \in I) \le b$. An ordered partially additive category is a partially additive category C for which each partially additive monoid C(X, Y) is an additive domain.
4 Example. Not every sum-ordered partially additive monoid $(M, \Sigma)$ is an additive domain. Set $M = \{0, 1, \infty\}$ with
$$\sum (a_n \mid n \in I) = \begin{cases}
\infty & \text{if some } a_n = \infty \text{ or } \{n \mid a_n = 1\} \text{ is infinite} \\
1 & \text{if no } a_n = \infty \text{ and } \{n \mid a_n = 1\} \text{ is finite and nonempty} \\
0 & \text{if all } a_n = 0 \text{ or } I \text{ is empty.}
\end{cases}$$
Partition associativity is easily verified. This is a sum-ordered partially additive monoid with $0 < 1 < \infty$. However, if $a_k = 1$ for k = 0, 1, 2, ... then $\sum (a_k \mid k \in F) \le 1$ for each finite subset F of N even though $\sum (a_k \mid k \in N) = \infty \not\le 1$. Thus, $(M, \Sigma)$ is not an additive domain.
The counterexamples 2 and 4 seem rather artificial. We now show that Pfn
and Mfn are ordered and investigate further examples in the exercises.
5 Example. Pfn is an ordered partially additive category. To prove that Pfn(X, Y) is sum-ordered it is enough to observe the following.
6 Example. Any partially additive monoid in which c + c is not defined unless c = 0 is sum-ordered. For if $a \le b$, $b \le a$ then b = a + h, a = b + k so a = b + k = a + (h + k) = a + (h + k) + (h + k) so that h + k = 0; hence, h = 0 = k and a = b.
Indeed the sum-ordering on Pfn(X, Y) is just the extension ordering of 2.1.9 as is clear from the definitions. Hence, if $\sum a_i$ exists, $\sum a_i = \bigvee a_i$ (see 6.1.5) so that the axiom in 3 is obviously true.
7 Example. Mfn is an ordered partially additive category. Since the sum-ordering is just the usual ordering, $f \le g$ if $f(x) \subset g(x)$ for all $x \in X$, on Mfn(X, Y), and $\sum a_i = \bigcup a_i$, the axiom of 3 is clear.
We now establish some theory for arbitrary additive domains beginning
by showing that these are domains.
8 Theorem. An additive domain $(M, \Sigma)$ is a domain under the sum-ordering $\le$.
PROOF. By definition, $(M, \le)$ is a poset. 0 is the least element since a = 0 + a for all a. Now let
$$a_0 \le a_1 \le a_2 \le \cdots$$
be an ascending chain. Then there exist $x_k$ (k = 0, 1, 2, ...) with
$$a_{k+1} = a_k + x_k \qquad (k \ge 0). \tag{9}$$
It follows that
$$a_{k+1} = a_0 + \sum_{i=0}^{k} x_i \qquad (k \ge 0) \tag{10}$$
since this is clear for k = 0 and the inductive step is
$$a_{k+2} = a_{k+1} + x_{k+1} = a_0 + \sum_{i=0}^{k+1} x_i.$$
But then
$$a = a_0 + \sum_{i=0}^{\infty} x_i \tag{11}$$
exists by the limit axiom since every finite subsum is a subsum of a sum of form 10. We will show $a = \bigvee a_k$.
That each $a_k \le a$ is clear from 10 since
$$a = a_{k+1} + \sum_{i=k+1}^{\infty} x_i.$$
Now suppose $a_k \le b$ for all k. Thus, by 10, every finite subsum of 11 is $\le b$ so that $a \le b$ by Definition 3. □
We next prove two important results that relate additive constructions to
the continuity of morphisms.
12 Theorem. Any sum of continuous maps is continuous, that is, if $(M, \Sigma)$, $(M', \Sigma')$ are additive domains, if $h_n: (M, \le) \to (M', \le)$ is continuous for $n \in I$, and if
$$h(a) = \sum (h_n(a) \mid n \in I)$$
is defined for all $a \in M$ then h is continuous.
PROOF. If $(a_k)$ is an ascending chain with supremum a in $(M, \Sigma)$, it follows from the continuity of each $h_n$ and 10 and 11 that there exist $x_{n,i}$ with
$$h_n(a_{k+1}) = h_n(a_0) + \sum_{i=0}^{k} x_{n,i}, \qquad h_n(a) = h_n(a_0) + \sum_{i=0}^{\infty} x_{n,i}.$$
Thus,
$$\begin{aligned}
h(a_{k+1}) &= \sum_{n \in I} h_n(a_0) + \sum_{n \in I} \sum_{i=0}^{k} x_{n,i} && \text{(partition associativity)} \\
&= h(a_0) + \sum_{i=0}^{k} x_i,
\end{aligned}$$
where we define $x_i$ to be the subsum of $\sum_{n \in I} \sum_{i=0}^{k} x_{n,i}$ given by
$$x_i = \sum_{n \in I} x_{n,i}.$$
Hence, by the proof of Theorem 8,
$$\bigvee h(a_k) = h(a_0) + \sum_{i=0}^{\infty} x_i.$$
But then
$$\begin{aligned}
h(a) &= \sum_{n \in I} h_n(a_0) + \sum_{n \in I} \sum_{i=0}^{\infty} x_{n,i} \\
&= h(a_0) + \sum_{i=0}^{\infty} x_i = \bigvee h(a_k)
\end{aligned}$$
as desired. □
This leads to the following theorem:
13 Theorem. Let $(M, \Sigma, H)$ be a PAR scheme for which $(M, \Sigma)$ is an additive domain. Then $\psi_H: (M, \le) \to (M, \le)$ is continuous if $\le$ is the sum-ordering.
PROOF. By definition, $\psi_H(x) = \sum H_m(x, \ldots, x)$ with each $H_m$ m-additive. By Theorem 12 it suffices to show that $g_m(a) = H_m(a, \ldots, a)$ is continuous. For m = 0 this amounts to observing that a constant map is continuous, which is clear. Now assume $m \ge 1$. Let $a = \bigvee a_k$ in $(M, \le)$ so that, with minor notational changes in 10 and 11, there exist $x_k$ with
$$a_k = a_0 + \sum_{i=1}^{k} x_i \quad (k \ge 1), \qquad a = a_0 + \sum_{i=1}^{\infty} x_i. \tag{14}$$
To prove $g_m$ continuous we must find $y_k$ with
$$g_m(a_k) = g_m(a_0) + \sum_{t=1}^{k} y_t \quad (k \ge 1), \qquad g_m(a) = g_m(a_0) + \sum_{k=1}^{\infty} y_k. \tag{15}$$
First, observe that 10 takes the form
$$a_k = \sum_{i=0}^{k} x_i \tag{16}$$
if $x_0$ is defined to be $a_0$. To discover $y_k$, evaluate $g_m(a_k)$ using 16 and invoking 8.2.4. For example,
$$\begin{aligned}
g_m(a_1) &= H_m(x_0 + x_1, \ldots, x_0 + x_1) \\
&= H_m(x_0, \ldots, x_0) + \sum \big(H_m(x_{i_1}, \ldots, x_{i_m}) \mid (i_1, \ldots, i_m) \in I_1\big) \\
&= g_m(a_0) + y_1,
\end{aligned} \tag{17}$$
where $y_1$ is the $I_1$-indexed sum, $I_1$ being the set of all $(i_1, \ldots, i_m)$ with $0 \le i_j \le 1$ and at least one $i_j = 1$. It is then not hard to guess that we should try
$$y_k = \sum \big(H_m(x_{i_1}, \ldots, x_{i_m}) \mid (i_1, \ldots, i_m) \in I_k\big) \tag{18}$$
with
$$I_k = \{(i_1, \ldots, i_m) \mid 0 \le i_j \le k, \text{ at least one } i_j = k\}.$$
The existence of the sum in 18 is clear since, using 16 and 8.2.6, it is a subsum of the expansion of $\psi_H(a_k)$.
We turn now to showing that the $y_k$ of 18 satisfy 15. We first show $g_m(a_k) = g_m(a_{k-1}) + y_k$. The case k = 1 was handled in 17. Proceeding inductively, assume $g_m(a_n) = g_m(a_{n-1}) + y_n$ holds for all $1 \le n \le k$. By 10,
$$g_m(a_k) = g_m(a_0) + \sum_{t=1}^{k} y_t.$$
We then have
$$\begin{aligned}
g_m(a_{k+1}) &= H_m\Big(\sum_{u=0}^{k+1} x_u, \ldots, \sum_{u=0}^{k+1} x_u\Big) \\
&= g_m(a_0) + \sum \big(H_m(x_{i_1}, \ldots, x_{i_m}) \mid 0 \le i_j \le k + 1, \text{ not all } i_j = 0\big) \\
&= g_m(a_0) + \sum \big(H_m(x_{i_1}, \ldots, x_{i_m}) \mid 0 \le i_j \le k, \text{ not all } i_j = 0\big) \\
&\qquad + \sum \big(H_m(x_{i_1}, \ldots, x_{i_m}) \mid (i_1, \ldots, i_m) \in I_{k+1}\big) \\
&= \Big(g_m(a_0) + \sum_{t=1}^{k} y_t\Big) + y_{k+1}.
\end{aligned}$$
Finally, noting $a = \sum_{k=0}^{\infty} x_k$ as $a_0 = x_0$,
$$g_m(a) = H_m\Big(\sum_{k=0}^{\infty} x_k, \ldots, \sum_{k=0}^{\infty} x_k\Big) = g_m(a_0) + \sum_{k=1}^{\infty} y_k. \qquad \Box$$
We conclude the section with a general result which when applied to
recursive specifications on Pfn(X, Y) guarantees that the pattern-of-calls expansion always gives the Kleene semantics.
19 Theorem. Let $(M, \Sigma, H)$ be a PAR scheme (8.1.6) with $(M, \Sigma)$ an additive domain (as in 3). Then the pattern-of-calls semantics
$$e_H = \sum_{s \in e} s_H \tag{20}$$
of 8.2.5 coincides with the Kleene semantics
$$\bigvee_{n=0}^{\infty} \psi_H^n(0) \tag{21}$$
of 6.2.3.
PROOF. Let $\mathcal{P}$ be the category of PAR schemes of 8.2.1 and let $\mathcal{P}_{\le}$ be the full subcategory of all $(M, \Sigma, H)$ in $\mathcal{P}$ with $(M, \Sigma)$ an additive domain. Now the initial object $(A, \hat{\Sigma}, \hat{H})$ of $\mathcal{P}$ of 8.2.7-10 is clearly in $\mathcal{P}_{\le}$ since the sum-ordering is inclusion of subsets of $e$ whereas $\hat{\Sigma}$ is union. Thus, $(A, \hat{\Sigma}, \hat{H})$ is the initial object of $\mathcal{P}_{\le}$ so, by the same proof as in Section 8.2, based on Theorem 7..9, 20 provides $\mathcal{P}_{\le}$ with a unique canonical fixed point.
But 21 is a fixed point on $(M, \Sigma, H)$ by Theorem 6.2.13 since, by Theorem 13, $\psi_H$ is continuous. Furthermore, 21 is a canonical fixed point of $\mathcal{P}_{\le}$ since for each morphism $\varphi: (M, \Sigma, H) \to (M', \Sigma', H')$,
$$\begin{aligned}
\varphi\Big(\bigvee_{n=0}^{\infty} \psi_H^n(0)\Big) &= \bigvee_{n=0}^{\infty} \varphi \psi_H^n(0) && \text{(Theorem 13)} \\
&= \bigvee_{n=0}^{\infty} \psi_{H'}^n \varphi(0) && \text{(8.2.1: } \varphi \psi_H = \psi_{H'} \varphi) \\
&= \bigvee_{n=0}^{\infty} \psi_{H'}^n(0),
\end{aligned}$$
where $\varphi(0) = 0$ because, since $\varphi$ is additive, it preserves all sums including the empty one.
Since 20 is the unique canonical fixed point, whereas 21 is a canonical fixed point, 20 and 21 coincide. □
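The coincidence of the two semantics can be watched on the language scheme of Exercise 8.2.3, with $\psi_H(S) = 1 + aSb$. The sketch below is illustrative only: it assumes, without checking the hypotheses of Theorem 19, that this scheme lives in an additive domain whose sum-ordering is inclusion of languages, it represents languages as finite Python sets of strings, and it uses the fact that only the unary chains $\omega_1[\cdots \omega_1[\omega_0] \cdots]$ have a nonempty interpretation because every $H_m$ with $m \ge 2$ is identically 0. The function names are hypothetical.

```python
def psi(S):
    """psi_H(S) = 1 + aSb, with 1 the language containing only the empty string."""
    return {""} | {"a" + w + "b" for w in S}


def kleene(n):
    """The n-th Kleene approximant psi_H^n(0), starting from the empty language."""
    S = set()
    for _ in range(n):
        S = psi(S)
    return S


def chain_term(k):
    """Interpretation s_H of the chain with k unary nodes above omega_0."""
    S = {""}                                  # (omega_0)_H = H_0 = 1
    for _ in range(k):
        S = {"a" + w + "b" for w in S}        # apply H_1
    return S


def expansion(n):
    """Partial pattern-of-calls sum over the chains with fewer than n unary nodes."""
    out = set()
    for k in range(n):
        out |= chain_term(k)
    return out
```

For each n, `kleene(n)` and `expansion(n)` are the same finite language consisting of the strings $a^k b^k$ with k < n, so the two ascending families have the same union.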
EXERCISES FOR SECTION 8.3
1. Show that the partially additive category
FwR(M,o,e)
of Exercise 3.2.11 is ordered.
2. Show that the partially additive category Pfn y of Exercise 4.2.13 is ordered.
3. Show that $\psi\colon \mathrm{Pfn}(\mathbf{N}, \mathbf{N}) \to \mathrm{Pfn}(\mathbf{N}, \mathbf{N})$,
$\psi(h) = $ if $n = 0$ then $0$ else $h^n(n - 1)$,
is a power series specification with $e_H(n) = 0$ for all $n$. [Hint: $DD(H_n(a_1, \dots, a_n))$ is
a one-element set; $\psi$ is not a polynomial.]
4. Show that if $X$ is the poset with Hasse diagram

and $(a_i)$ is summable if $\bigvee a_i$ exists with $\sum a_i = \bigvee a_i$, then $(X, \Sigma)$ is not a partially
additive monoid.
5. Let $(P, \le)$ be a consistently complete poset (Exercise 6.1.7) and say that $(a_i)$ is
summable if $(a_i)$ is consistent with $\sum a_i = \bigvee a_i$. Show that $(P, \Sigma)$ is an additive
domain whose sum-ordering coincides with the original one.
8.4 Proving Correctness
In this brief section we specialize to the additive domain Pfn(X, Y), establishing and illustrating some proof rules for specifications
1    $\psi_H\colon \mathrm{Pfn}(X, Y) \to \mathrm{Pfn}(X, Y)$
for a PAR scheme $(\mathrm{Pfn}(X, Y), \Sigma, H)$ which use both the ordered and the
partially additive structure on Pfn(X, Y).
We use the notations of previous sections of this chapter and 1 without
further comment.
2 Tree Induction Rule (Partial Correctness). Let $g \in \mathrm{Pfn}(X, Y)$. To prove
$e_H \le g$ it is necessary and sufficient to prove that for all $n \ge 0$ and $s_1, \dots, s_n \in e$
with $(s_i)_H \le g$ for $i = 1, \dots, n$ we have $(w[s_1, \dots, s_n])_H \le g$.
PROOF. If $e_H \le g$ then $s_H \le e_H \le g$ for all $s \in e$, for $s = w[s_1, \dots, s_n]$ in particular. Conversely, setting $n = 0$ yields $w_H \le g$ so that, by induction, $s_H \le g$
for all $s \in e$. Using the special fact 6.1.9 (see Exercise 1) that sum and supremum coincide in Pfn(X, Y),
\[ e_H = \sum_{s \in e} s_H = \bigvee_{s \in e} s_H \le g. \]
□
Note that the tree induction rule requires a guess g for the semantics to
be given first. To be useful, g should be in "closed form." It seems unlikely
that the tree induction rule would contribute useful information, say, about
Ackermann's function.
3 Disjointness Lemma. If $x \in X$, then $s_H(x)$ is defined for at most one $s \in e$.
4 Termination Lemma. If for each $x$ in $X$ there exists $s \in e$ with $s_H(x)$ defined,
then $e_H$ is total.
5 Example (Partial Correctness by Tree Induction). The "91 function" given
by the recursive definition
f(x) := if x > 100 then x - 10 else f(f(x + 11))
was analyzed in Exercise 5.1.9 where ordered semantics was used to show
that the Kleene semantics is
g(x) := if x > 100 then x - 10 else 91.
We here illustrate the tree induction rule 2 by showing that if f(x) is defined
then f(x) = g(x). In other words, we must prove f:S; g. This is a partial
correctness proof since we say nothing about those x not in DD(f). The
partially additive fixed point equation on Pfn(N, N) is $f = \psi(f) = H_0 + H_2(f, f)$, where
$H_0 = $ if $x > 100$ then $x - 10$ else undefined,
$H_2(s, t) = stu$ where $u = $ if $x \le 100$ then $x + 11$ else undefined.
We must apply 2 for $n = 0$ or $n = 2$.
For $n = 0$, $H_0 \le g$ is clear. For $n = 2$, we assume $s, t$ in PC(H) satisfy
$s, t \le g$ and show that $stu \le g$, that is, we show that if $stu(x)$ is defined, then
$g(x) = stu(x)$. Now, if $stu(x)$ is defined then both $u(x)$ and $tu(x)$ are defined
and we have
$u(x) = x + 11$ and $x \le 100$;
and, because $t \le g$,
\[ tu(x) = \begin{cases} (x + 11) - 10 = x + 1 & \text{if } x + 11 > 100 \\ 91 & \text{else.} \end{cases} \]
Because $s \le g$,
$stu(x) = (x + 1) - 10$ and $x + 1 > 100$, or $stu(x) = 91$,
that is,
$stu(x) = x - 9$ and $x = 100$, or $stu(x) = 91$,
that is,
$stu(x) = 91$.
Thus, $stu \le g$, as was to be shown, and hence $f \le g$.
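As an informal sanity check on the conclusion of Example 5, the recursive definition can be run directly and compared with the closed form g. The sketch below is not part of the proof: it assumes ordinary call-by-value evaluation, tests only a finite range of natural numbers, and the name `mismatches` is chosen here for illustration.

```python
def f(x: int) -> int:
    """The recursive definition of Example 5, evaluated by plain recursion."""
    return x - 10 if x > 100 else f(f(x + 11))


def g(x: int) -> int:
    """The closed form guessed for the semantics of f."""
    return x - 10 if x > 100 else 91


# Arguments in the tested range on which the two definitions disagree.
mismatches = [x for x in range(0, 301) if f(x) != g(x)]
print(mismatches)
```

An empty list of mismatches is consistent with f ≤ g on the tested range; the tree induction argument is what establishes it for every x at which f is defined.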
6 Example (Total Correctness by Exhaustion). Consider again $f$, $g$, $h$, $H_0$, $u$
as in 5. We will show directly that $g = \sum(s_H : s \in e)$ so that, in fact, $f(x) = g(x)$
for all $x$. This is now a total correctness proof, since we characterize the
behavior of $f$ for all $x$. We must thus apply 4, showing that for each $x$ there
exists $s \in e$ with $s_H(x) = g(x)$.
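Case 1. $x > 100$. Then $H_0(x) = g(x)$.
Case 2. $90 \le x \le 100$. To begin, observe
\begin{align*}
H_0u(x) &= H_0(\text{if } x \le 100 \text{ then } x + 11 \text{ else } \bot) \\
&= \text{if } x \le 100 \wedge x + 11 > 100 \text{ then } x + 11 - 10 \text{ else } \bot \\
&= \text{if } 90 \le x \le 100 \text{ then } x + 1 \text{ else } \bot.
\end{align*}
Next, claim that if $t_k = H_0(H_0u)^k$ then
\[ t_k(x) = \text{if } x = 101 - k \text{ then } 91 \text{ else } \bot \]
for $1 \le k \le 11$. For $k = 1$,
\begin{align*}
t_1(x) = H_0(H_0u)(x) &= H_0(\text{if } 90 \le x \le 100 \text{ then } x + 1 \text{ else } \bot) \\
&= \text{if } 90 \le x \le 100 \wedge x + 1 > 100 \text{ then } x + 1 - 10 \text{ else } \bot \\
&= \text{if } x = 100 \text{ then } 91.
\end{align*}
For the inductive step,
\begin{align*}
t_{k+1}(x) = t_k(H_0u)(x) &= t_k(\text{if } 90 \le x \le 100 \text{ then } x + 1 \text{ else } \bot) \\
&= \text{if } 90 \le x \le 100 \wedge x + 1 = 101 - k \text{ then } 91 \text{ else } \bot \\
&= \text{if } x + 1 = 101 - k \text{ then } 91 \text{ else } \bot \qquad (\text{as } 1 \le k \le 11) \\
&= \text{if } x = 101 - (k + 1) \text{ then } 91 \text{ else } \bot.
\end{align*}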
L
But tk = sf! if
=
and for $90 \le x \le 100$, $g(x) = t_{101-x}(x)$.
Case 3. $0 \le x < 90$. For any such $x$ there exists a unique $a$ with $1 \le a \le 9$,
$x + 11a \le 100$, $x + 11(a + 1) > 100$. It is clear that
\[ u^a(x) = x + 11a. \]
As $90 \le x + 11a \le 100$, $1 \le k \le 11$ for $x + 11a = 101 - k$, we have
\[ t_k u^a(x) = 91. \]
But as $t_{10}(91) = 91$,
\[ t_{10}^m t_k u^a(x) = 91 \]
for any $m$. But $t_{10}^a t_k u^a$ has form $s_H$ since
\[ t_{10}^a t_k u^a = H_2(t_{10}, H_2(t_{10}, \dots, H_2(t_{10}, t_k) \dots)) \qquad (a\ H_2\text{'s}). \]
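The claims of Example 6 about $t_k = H_0(H_0u)^k$ can be checked mechanically on a finite range. The following is a sketch under stated assumptions, not part of the text's argument: partial functions on N are modelled as Python functions returning `None` where undefined, and the helper names `compose`, `t` and `case3_witness` are introduced here for illustration only.

```python
from typing import Callable, Optional

Partial = Callable[[int], Optional[int]]


def h0(x: int) -> Optional[int]:
    """H_0 = if x > 100 then x - 10 else undefined."""
    return x - 10 if x > 100 else None


def u(x: int) -> Optional[int]:
    """u = if x <= 100 then x + 11 else undefined."""
    return x + 11 if x <= 100 else None


def compose(*fs: Partial) -> Partial:
    """compose(f, g, h)(x) = f(g(h(x))); undefined once any stage is undefined."""
    def run(x: int) -> Optional[int]:
        value: Optional[int] = x
        for fn in reversed(fs):
            if value is None:
                return None
            value = fn(value)
        return value
    return run


def t(k: int) -> Partial:
    """t_k = H_0 (H_0 u)^k."""
    return compose(h0, *([compose(h0, u)] * k))


def case3_witness(x: int) -> Partial:
    """For 0 <= x < 90, build t_10^a t_k u^a with a and k chosen as in Case 3."""
    a = (100 - x) // 11          # the unique a with x + 11a <= 100 < x + 11(a + 1)
    k = 101 - (x + 11 * a)       # so that x + 11a = 101 - k
    return compose(*([t(10)] * a), t(k), *([u] * a))


print(t(3)(98), t(3)(97), case3_witness(0)(0))
```

Here `t(k)` should be defined exactly at `101 - k`, with value 91, and each Case 3 witness should return 91 at its own argument.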
EXERCISES FOR SECTION 8.4
1. Show that the tree induction rule 2 generalizes straightforwardly to any ordered
partially additive category. Specifically relate the "special fact" discussed in 2 to the
definition of an additive domain.
2. Consider the power series $H\colon \mathrm{Pfn}(\mathbf{N}, \mathbf{N}) \to \mathrm{Pfn}(\mathbf{N}, \mathbf{N})$ of Exercise 5.2.7 for the
recursive specification
$\psi_H(h) = $ if $n = 0$ then $0$ else $1 + h(h(n - 1))$.
Let $f$ be the pattern-of-calls expansion of $H$. Use the tree induction rule 2 to show
$f \le \mathrm{id}_{\mathbf{N}}$. Then use the termination lemma 4 to conclude $f = \mathrm{id}_{\mathbf{N}}$.
3. In 5.2.17 it was shown that the pattern-of-calls expansion $f$ for the fixed point
equation
$f(x) = $ if $p(x)$ then $f(f(g(x)))$ else $x$
is
$f(x) = $ while $p(x)$ do $g(x)$.
Establish that $f \le$ while $p$ do $g$ using the tree induction rule 2.
4. Establish the results of 6.2.17 using the proof rules of this section.
5. Establish the result of Exercise 8.3.3 by using the proof rules of this section.
8.5 Power Series and Products
In this final section we address some technical problems that would arise
naturally in using PAR schemes to define the formal semantics of recursion
for a programming language. The results established here are adequate to
provide the semantics of FPF in Pfn as is established in Exercises 6-11.
The proof of the corresponding results for Kleene semantics is a good deal
simpler. Nonetheless, the PAR-scheme approach does provide a tighter setting because Theorems 8.3.13 and 8.3.19 guarantee that PAR-scheme semantics is Kleene semantics whereas no converse result is known at the current
time.
The first new idea is that of a "power-series" scheme which is a PAR
scheme with an additional property. All the PAR schemes arising, say, in
FPF are power-series schemes. The initial PAR scheme of 8.2.7 is a power-series scheme so that the canonical expansion Theorem 8.2.12 goes through
restricted to power-series schemes. Hence, power-series schemes are not
unduly restrictive.
A "power-series map" is $\psi_H\colon M \to M$ for $(M, \Sigma, H)$ a power-series scheme.
In FPF we would surely require for the map constr$_m$ of 8.1.5 that if $\psi_{H_1}, \dots,
\psi_{H_m}$ are power-series maps $D \to D$ then so is constr$_m(\psi_{H_1}, \dots, \psi_{H_m})\colon D \to D$. In
general, we will define a "strongly m-additive" map to be one which converts
m power-series-map inputs to a power-series output, and we will establish a
workable criterion to prove that an m-additive map is strongly so. In practice,
it appears that programming language constructors are strongly m-additive
maps.
We note that a PAR scheme $(M, \Sigma, H)$ for which all countable families are
summable is necessarily a power-series scheme and then all $m$-additive maps
$M^m \to M$ are necessarily strongly $m$-additive. Thus, the technical issues addressed by power-series schemes and strongly $m$-additive maps have to do
with summability.
The third concept needed is a suitable product of partially additive
monoids analogous to the product domain of 6.1.17. This is easily given and
is useful for simultaneous recursions such as 5.2.18. But a new use for products arises. For domains $D_1, \dots, D_m, D$ a function $F\colon D_1 \times \cdots \times D_m \to D$ is
continuous (viewing the product $D_1 \times \cdots \times D_m$ as a single domain) if and
only if it is "m-continuous" (= separately continuous, see Exercise 6.3.7)
which explains why "m-continuity" was not a needed concept earlier. The
corresponding result for partially additive monoids is false. Thus, if, say, $M_1$,
$M_2$, $M_3$, $M$ are partially additive monoids a considerable number of distinct
possibilities arise for a function $f\colon M_1 \times M_2 \times M_3 \to M$. Not only may
it be additive or 3-additive but it may, say, be 2-additive considered as
$M_1 \times (M_2 \times M_3) \to M$. This third possibility is exactly what happens
for the if-then-else map of 6.3.18, as we show below in 15.
To motivate power-series schemes, recall the opening remarks of Section
8.1. In a PAR scheme $(M, \Sigma, H)$ we think of $H_m(h_1, \dots, h_m)$ as the sum of
all composition $m$-substitution paths with $h_j$ replacing the $j$th occurrence
of a function variable. Thus, while we have heretofore required only that
$\sum_m H_m(x, \dots, x)$ exists for all $x$, it would be just as reasonable to require the
existence of
\[ \sum_m H_m(x_{m1}, \dots, x_{mm}) \]
no matter how $x_{11}$; $x_{21}, x_{22}$; $x_{31}, x_{32}, x_{33}$; $\dots$ are chosen in $M$. For if $j \ne k$,
the $j$-substitution paths are guarded from overlap with the $k$-substitution
paths regardless of which functions replace the variables.
This stronger assumption is the basis of the definitions to which we now
turn.
1 Definitions. Let $(M, \Sigma)$, $(M', \Sigma')$ be partially additive monoids. A power
series $(M, \Sigma) \to (M', \Sigma')$ is a family $H = (H_m \mid m \ge 0)$ of $m$-additive maps
$H_m\colon M^m \to M'$ such that for all $m \ge 0$, the sum
2    $\sum_{0 \le j \le m} H_j(-)$
exists, regardless of how the $j$-arguments for each $H_j$ (indicated by $(-)$) are
chosen.
The power-series map of $H$ is then
3    $\psi_H\colon M \to M'$, $\quad \psi_H(x) = \sum_{m \ge 0} H_m(x, \dots, x)$,
which sum necessarily exists by the limit axiom on $(M', \Sigma')$ since each finite
subsum is a subsum of a sum of form 2.
A power series $H$ is a polynomial if $H_m = 0$ except for finitely many $m$. In
this case the largest $m$ with $H_m \ne 0$ is the degree. If $H$ is a polynomial, $\psi_H$ is
the polynomial map of $(H_m)$.
4 A power-series scheme is $(M, \Sigma, H)$, where $H\colon (M, \Sigma) \to (M, \Sigma)$ is a
power series, and for such $H$, $\psi_H\colon M \to M$ is called a power-series recursive
specification.
The reader should pause to spot-check that our previous examples of PAR
schemes are all power-series schemes.
The promised definition of strong m-additivity is then as follows.
5 Definition. Let $(M', \Sigma')$, $(M_1, \Sigma_1), \dots, (M_m, \Sigma_m)$, $(M, \Sigma)$ be partially additive
monoids and let $L\colon M_1 \times \cdots \times M_m \to M$. Then $L$ is strongly $m$-additive if $L$
is $m$-additive and if whenever $H_t = (H_{tj} \mid j \ge 0)$ are power series $(M', \Sigma') \to (M_t, \Sigma_t)$ $(1 \le t \le m)$ then there exists a power series $H\colon (M', \Sigma') \to (M, \Sigma)$
with
6    $\psi_H(h') = L(\psi_{H_1}(h'), \dots, \psi_{H_m}(h'))$.
We know of no "natural" examples of m-additive maps which are not
strongly m-additive; although we conjecture that counterexamples do exist.
The fact that one is hard to find is a good sign for the power-series approach.
We now turn to develop a criterion (8 below) which makes it possible to
prove that many m-additive maps are strongly m-additive.
The following is a mild generalization of Lemma 8.2.4, so the proof is
omitted.
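7 Lemma. Let $(M_1, \Sigma_1), \dots, (M_m, \Sigma_m)$, $(M, \Sigma)$ be partially additive monoids
and let $L\colon M_1 \times \cdots \times M_m \to M$ be $m$-additive. Then whenever $(h_{ij} \mid j \in J_i)$ is a
summable family in $(M_i, \Sigma_i)$, $(L(h_{1j_1}, \dots, h_{mj_m}) \mid j_i \in J_i)$ is a summable family in
$(M, \Sigma)$ and
\[ L\Bigl(\sum h_{1j_1}, \dots, \sum h_{mj_m}\Bigr) = \sum L(h_{1j_1}, \dots, h_{mj_m}) \]
(where we write $\Sigma$ for $\Sigma_i$ to avoid notational confusion).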
8 Theorem. Let $(M_1, \Sigma_1), \dots, (M_m, \Sigma_m)$, $(M, \Sigma)$ be partially additive monoids
and let $L\colon M_1 \times \cdots \times M_m \to M$ be $m$-additive. Suppose that whenever
$H_t = (H_{tj} \mid j \ge 0)$ is a power series $(M', \Sigma') \to (M_t, \Sigma_t)$ $(1 \le t \le m)$ then for
all $k \ge 0$, the sum
9    $\sum_{j_1 + \cdots + j_m \le k} L(H_{1j_1}(-), \dots, H_{mj_m}(-))$
exists, regardless of how the arguments, indicated by $(-)$, are chosen. (Different
arguments may be chosen in different terms.) Then $L$ is strongly $m$-additive.
PROOF. Define
10    $D_k(-) = \sum_{j_1 + \cdots + j_m = k} L(H_{1j_1}(-), \dots, H_{mj_m}(-))$.
This sum exists, being a subsum of a sum of form 9. $D = (D_k)$ is a power series
because
\[ \sum_{0 \le j \le k} D_j(-) = \sum_{0 \le j \le k}\ \sum_{j_1 + \cdots + j_m = j} L(H_{1j_1}(-), \dots, H_{mj_m}(-)) = \sum_{j_1 + \cdots + j_m \le k} L(H_{1j_1}(-), \dots, H_{mj_m}(-)) \]
is a sum of form 9. Finally,
\begin{align*}
L(\psi_{H_1}(h'), \dots, \psi_{H_m}(h')) &= L\Bigl(\sum_{j_1 \ge 0} H_{1j_1}(h', \dots, h'), \dots, \sum_{j_m \ge 0} H_{mj_m}(h', \dots, h')\Bigr) \\
&= \sum_{j_1 + \cdots + j_m \ge 0} L(H_{1j_1}(h', \dots, h'), \dots, H_{mj_m}(h', \dots, h')) && \text{(by 7)} \\
&= \sum_{k \ge 0} D_k(h', \dots, h') = \psi_D(h').
\end{align*}
□
11 Corollary. If in $(M, \Sigma)$ every countable family is summable, every $m$-additive map $M_1 \times \cdots \times M_m \to M$ is strongly $m$-additive.
12 Example. Let C be a partially additive category such as Pfn or Mfn in
which any family in which each two-element subfamily is summable is itself
summable. Let $(h_1, \dots, h_m) \mapsto h_m \cdots h_1$ be the composition map of 8.1.3.
This is already known to be $m$-additive. It is in fact strongly $m$-additive. To
check 9, we must verify that
\[ \sum_{j_1 + \cdots + j_m \le k} f_{j_m} \cdots f_{j_1} \]
exists where $f_{j_t}$ has the form $H_{tj_t}(-)$. It suffices to show that $f_{j_m} \cdots f_{j_1} + f_{u_m} \cdots f_{u_1}$
is defined whenever $(j_1, \dots, j_m) \ne (u_1, \dots, u_m)$. Let $t$ be least with
$f_{j_t} \ne f_{u_t}$. Then $f_{j_{t-1}} \cdots f_{j_1} = g = f_{u_{t-1}} \cdots f_{u_1}$ ($= \mathrm{id}_{X_0}$ if $t = 1$). Then $f_{j_t} + f_{u_t}$
exists because $H_t$ is a power series. By 3.2.1 and 3.2.20
\[ (f_{j_m} \cdots f_{j_{t+1}}) f_{j_t} g + (f_{u_m} \cdots f_{u_{t+1}}) f_{u_t} g \]
exists, as desired.
The hypothesis on C is not necessary. See Exercise 5.
The promised definition of the product of partially additive monoids is as
follows.
13 Definition. Let $(M_1, \Sigma_1), \dots, (M_m, \Sigma_m)$ be partially additive monoids,
$m > 0$. Then their product partially additive monoid $(M, \Sigma)$ is defined as
follows:
$M = M_1 \times \cdots \times M_m$
(the product set, 2.3.1, 2.3.11).
A family $((h_{1i}, \dots, h_{mi}) \mid i \in I)$ is summable in $(M, \Sigma)$ if for each $j \in \{1, \dots, m\}$,
$(h_{ji} \mid i \in I)$ is summable in $(M_j, \Sigma_j)$ and then
\[ \sum (h_{1i}, \dots, h_{mi}) = \Bigl(\sum\nolimits_1 (h_{1i}), \dots, \sum\nolimits_m (h_{mi})\Bigr), \]
that is, "sum independently on each coordinate."
To see that $(M, \Sigma)$ satisfies the limit axiom, suppose that every finite
subfamily of $((h_{1i}, \dots, h_{mi}) \mid i \in I)$ is summable. Then if $F$ is any finite subset of
$I$, $\sum((h_{1i}, \dots, h_{mi}) \mid i \in F)$ exists so that for any $j$, $\sum_j (h_{ji} \mid i \in F)$ exists. As $F$ is
arbitrary and $(M_j, \Sigma_j)$ satisfies the limit axiom, $(h_{ji} \mid i \in I)$ is summable. But
then, as $j$ is arbitrary, $((h_{1i}, \dots, h_{mi}) \mid i \in I)$ is summable.
The remainder of the verification that $(M, \Sigma)$ is a partially additive monoid
is similar and is left as an exercise.
We have already warned the reader that there are numerous additivity
possibilities for a map L: M1 x ... x Mm ----+ M. We now explore some
examples.
14 Example. The 2-additive composition map
$\mathrm{Pfn}(X, Y) \times \mathrm{Pfn}(Y, Z) \to \mathrm{Pfn}(X, Z)$
of 8.1.3 is not additive since it is false that
\[ (g + g')(f + f') = gf + g'f' \]
as additivity with respect to the product monoid structure would require.
Indeed, by 2-additivity we have
\begin{align*}
(g + g')(f + f') &= g(f + f') + g'(f + f') \\
&= gf + gf' + g'f + g'f'
\end{align*}
which differs from $gf + g'f'$ if $gf' \ne 0$ or $g'f \ne 0$.
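A small finite instance makes the failure of additivity in Example 14 concrete. This is only an illustrative sketch: partial functions are modelled as Python dictionaries, the sum is taken to be the union of functions with disjoint domains (returning `None` when the sum is undefined), and the particular functions `f`, `f2`, `g`, `g2` are invented for the example.

```python
def psum(*fs):
    """Sum in Pfn: union of partial functions with disjoint domains, else None."""
    out = {}
    for fn in fs:
        if out.keys() & fn.keys():
            return None  # domains overlap, so the sum is undefined
        out.update(fn)
    return out


def comp(g, f):
    """The composite gf: apply f first, then g."""
    return {x: g[y] for x, y in f.items() if y in g}


# Hypothetical partial functions on the integers; f2 and g2 play f' and g'.
f, f2 = {0: 0}, {1: 1}
g, g2 = {1: 5}, {0: 7}

lhs = comp(psum(g, g2), psum(f, f2))
four_terms = psum(comp(g, f), comp(g, f2), comp(g2, f), comp(g2, f2))
two_terms = psum(comp(g, f), comp(g2, f2))
print(lhs, four_terms, two_terms)
```

In this instance the cross terms gf' and g'f carry all of the composite, so dropping them changes the result.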
15 Example. The map if-then-else of 6.3.18 is 2-additive considered as
$D \times D^2 \to D$, where $D^2 = D \times D$ is the product partially additive monoid.
This amounts to
16    if $f$ then $g_1 + g_2$ else $h_1 + h_2$ = (if $f$ then $g_1$ else $h_1$) + (if $f$ then $g_2$ else $h_2$)
and
17    if $(f_1 + f_2)$ then $g$ else $h$ = (if $f_1$ then $g$ else $h$) + (if $f_2$ then $g$ else $h$).
To see that 16 holds, observe that if $f$ is true both sides will apply whichever
of $g_1$, $g_2$ (if either) is defined and similarly if $f$ is false. That 17 holds is obvious
since at most one of $f_1$, $f_2$ is defined.
If if-then-else were 3-additive $D^3 \to D$ then
if $f$ then $g_1 + g_2$ else $h$ = (if $f$ then $g_1$ else $h$) + (if $f$ then $g_2$ else $h$)
would hold. This cannot be true since if $f$ is false and $h$ is defined there is
domain overlap on the right-hand side so that the sum is not defined.
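The behaviour of if-then-else under sums can also be tried out on a finite model. The sketch below makes several assumptions that are not in the text: partial functions are dictionaries, a test `f` is a dictionary into booleans, `psum` returns `None` when domains overlap, and all the sample functions are made up for illustration.

```python
def psum(*fs):
    """Sum in Pfn: union of partial functions with disjoint domains, else None."""
    out = {}
    for fn in fs:
        if out.keys() & fn.keys():
            return None  # overlapping domains: sum undefined
        out.update(fn)
    return out


def ite(f, g, h):
    """if f then g else h, for a partial test f (dict from points to bool)."""
    out = {}
    for x, truth in f.items():
        branch = g if truth else h
        if x in branch:
            out[x] = branch[x]
    return out


f = {0: True, 1: False, 2: True, 3: False}
f1, f2 = {0: True, 1: False}, {2: True, 3: False}
g1, g2 = {0: "a"}, {2: "b"}
h1, h2 = {1: "c"}, {3: "d"}
g, h = psum(g1, g2), psum(h1, h2)

eq16 = ite(f, psum(g1, g2), psum(h1, h2)) == psum(ite(f, g1, h1), ite(f, g2, h2))
eq17 = ite(psum(f1, f2), g, h) == psum(ite(f1, g, h), ite(f2, g, h))
three_additive_rhs = psum(ite(f, g1, h1), ite(f, g2, h1))
print(eq16, eq17, three_additive_rhs)
```

The last sum is undefined because both summands use the same else-branch at the point 1, which is the domain overlap that rules out 3-additivity.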
EXERCISES FOR SECTION 8.5
1. If $(M, \Sigma) = (M_1, \Sigma_1) \times \cdots \times (M_m, \Sigma_m)$ as in 13 and
$f\colon M \to N_1 \times \cdots \times N_m$
where $(N_1, \Sigma'_1), \dots, (N_m, \Sigma'_m)$ are partially additive monoids, show that $f$ is
$m$-additive if and only if each $f_i$ is $m$-additive.
2. Show that 5.2.19 provides a power-series scheme whose polynomial map on the
product partially additive monoid $\mathrm{Pfn}(X, X) \times \mathrm{Pfn}(X, X)$ is the recursive specification for the simultaneous recursion of 5.2.18.
3. Show that for additive domains, the product of 13 provides the domain product
of 6.1.17.
4. Let $X = \{\cdot, -, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$ and let $(M, \Sigma)$ be the product partially additive monoid $(L_X, \Sigma)^3$ for $L_X$ as in Exercise 8.2.3. Define a suitable power series $H$
so that $\psi_H\colon (L_X, \Sigma)^3 \to (L_X, \Sigma)^3$ is such that
(i) The fixed point equation $(E, N, D) = \psi_H(E, N, D)$ is
$E = \cdot EE + {-E} + N$,
$N = D + ND$,
$D = \{0, 1, 2, 3, 4, 5, 6, 7, 8, 9\}$,
that is, "a digit is one of 0, ..., 9, a number is a digit or a number followed by
a digit, an expression is a number, $-e$ or $\cdot e_1 e_2$ where $e$, $e_1$, $e_2$ are expressions."
(ii) $e_H = (\bar{E}, \bar{N}, \bar{D})$, where $\bar{D} = \{0, \dots, 9\}$, $\bar{N}$ = nonempty strings in $\bar{D}^*$, and $\bar{E}$ is the
set of prefix form arithmetic expressions in binary $\cdot$ and unary $-$ (prefix form
means $e_1 \cdot (e_2 - (e_3 \cdot e_4))$ is written $\cdot\, e_1 - e_2 \cdot e_3 e_4$). [Hint: $H$ is a second-degree
polynomial.]
5. Generalize Example 12 to any partially additive category. [Hint: For $m = 2$, using
the notations of 12, show that
\[ f_0(f_0 + \cdots + f_k) + \cdots + f_{k-1}(f_0 + f_1) + f_k f_0 \]
exists. Then use induction on $m$.]
Exercises 6-11 outline a proof that each FPF recursive specification $D^n \to D^n$,
with $D = \mathrm{Pfn}(DTN, DTN)$ and $D^n$ the product partially additive monoid, is
a power-series specification. We follow the inductive definition of 6.3.10-21.
6. Show that $(h_1, \dots, h_n) \mapsto h_i$ is additive $D^n \to D$ for each $i$.
7. Show that constr$_m\colon D^m \to D$ is strongly $m$-additive.
8. Show that if $\psi_H, \psi_L, \psi_M\colon D^n \to D$ are power-series maps then so is if $\psi_H$ then $\psi_L$
else $\psi_M$. [Hint: if $\psi_H$ then $\psi_L$ else $\psi_M$ equals the $\psi_N$ whose $m$th term, $m \ge 0$, is given
by

and $h \in D^n$ is vector notation $h = (h_1, \dots, h_n)$.]
9. If $\psi_H\colon D^n \to D$ is a power-series map show that
$(\alpha\psi_H(h))\langle t_1, \dots, t_k\rangle = \langle \psi_H(h)t_1, \dots, \psi_H(h)t_k\rangle$
is a power-series map. [Hint: Consider

where $\mathrm{pr}_i\langle t_1, \dots, t_k\rangle = t_i$.]
10. Using Exercise 1, show that if $\psi_j\colon D^n \to D$ is a power-series map for $1 \le j \le k$ then
$\psi\colon D^n \to D^k$, $\psi(h) = (\psi_1(h), \dots, \psi_k(h))$ is also a power-series map.
11. Using the above exercises, Exercise 6.3.11, and the theory of this section, conclude
that $\psi_P\colon D^n \to D^n$ of 6.3.21 is a power-series map.
It follows from Theorem 8.3.19 above that the Kleene semantics of the $\psi_P$
(based on Exercises 6.3.4-10) and power-series semantics produce the same
semantics for FPF in Pfn.
12. Show that power-series semantics and Kleene semantics for FPF in Mfn (Exercise
1.4.3) coincide. [Hint: The proofs used for Pfn are often trivialized because every
countable family in Mfn(X, Y) is summable.]
13. Show that the initial PAR scheme of 8.2.7 is a power-series scheme. [Hint: The
original proof that $\mu_n \cap \mu_m = \emptyset$ if $m \ne n$ works for arbitrary arguments.] Conclude that the canonical fixed point theorem 7.•9 applies to power-series schemes.
[Hint: Review the proof of 8.2.12.]
Notes and References for Chapter 8
The theory of this chapter is due to the authors. The material of Sections 1-4 is
adapted from their paper on the pattern-of-calls expansion as cited in the notes for
Chapter 7. Section 5 is new.
We thank Dana Scott for Exercise 8.3.5. A flawed version was attributed to him in
the pattern-of-calls paper; the errors were ours, not his.
CHAPTER 9
Fixed Points in Metric Spaces
9.1 Contractions on Complete Metric Spaces
9.2 Differential Equations
9.3 Metrics on Trees
9.4 Context-Free Languages as Metric Fixed Points
A metric space is a set equipped with a function which assigns a numerical
"distance" to each pair of points. These objects arise naturally in many areas
of mathematics. In Section 9.1 we prove a classical result due (essentially) to
S. Banach that under appropriate hypotheses, an endomorphism $\psi$ of a
nonempty metric space has a unique fixed point $x$; indeed $x$ is the limit (in
the sense of ever decreasing distance) of the sequence $x_0, \psi x_0, \psi^2 x_0, \dots$ with
any starting $x_0$. To illustrate the scope of this theorem we devote each of the
remaining sections to an application. Section 9.2 establishes that an initial-value problem of ordinary differential equations has a unique solution. Section 9.3 introduces a tree-syntax for recursive specification in which repeated
execution produces an infinite (but nonrecursive) syntax tree. The independence of the resulting infinite tree from "calling strategy" follows from the
uniqueness assertion in the Banach theorem. Finally, we show in Section 9.4
that the language defined by a context-free grammar often arises as the
unique fixed point of the Banach theorem.
The advantage of the Banach-theorem approach is that fixed points are
unique. This also limits the scope of applicability since we have seen many
fixed point equations in semantics (such as 5.1.4) which have more than one
fixed point.
9.1 Contractions on Complete Metric Spaces
A "metric space" is a set equipped with a specific notion of "distance" between its elements. The concept of "limit," familiar for numbers from a
calculus course, extends readily to metric spaces since "approaching x" can
be expressed in numerical terms by "distance to x approaches 0." If a sequence $x_1, x_2, x_3, \dots$ approaches $x$ then in particular $x_n$, $x_m$ will approach
each other as $m$, $n$ get large; a sequence with this latter property is called
"Cauchy." In a "complete" metric space, every Cauchy sequence is required
to approach a limit. The main result of this section is that in a complete
metric space $X$ every function $\psi\colon X \to X$ which "shrinks distances" in the
suitably precise sense of 19 has a unique fixed point.
1 Definition. Let $[0, \infty)$ denote the set of all real numbers $\ge 0$. A metric on a
set $X$ is a function $d\colon X \times X \to [0, \infty)$ which satisfies the following four
axioms for all $x, y, z \in X$:
(i) (Symmetry axiom) $d(x, y) = d(y, x)$.
(ii) (Triangle inequality) $d(x, y) + d(y, z) \ge d(x, z)$.
(iii) $d(x, x) = 0$.
(iv) If $d(x, y) = 0$ then $x = y$.
A metric space is a pair $(X, d)$ where $X$ is a set and $d$ is a metric on $X$.
In a metric space, $d(x, y)$ is called "the distance between $x$ and $y$."
2 Example. Let X be the Earth's surface. The "neutrino metric" on X is
d(x, y) = length of a straight line segment connecting x, y. (Neutrinos are
subatomic particles which are so small and inert that they typically pass right
through the Earth without interacting with any of its atoms.) The well-known
"great-circle" metric used by airplanes is e(x, y) = length of any great-circle
arc connecting x and y "the short way."
3 Example. Let $X$ be the Euclidean plane $\mathbf{R} \times \mathbf{R}$ of all ordered pairs $(x, y)$
with $x$, $y$ real numbers. The Euclidean metric on $X$ is the usual distance
function
\[ d((x_1, x_2), (y_1, y_2)) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}. \]
The Manhattan metric on $X$,
\[ ((x_1, x_2), (y_1, y_2)) \mapsto |x_1 - y_1| + |x_2 - y_2|, \]
is another metric, one which reflects the distance along the streets
of a city with east-west and north-south streets.
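The two plane metrics of Example 3 are easy to experiment with. The sketch below is illustrative only: it takes the Manhattan distance to be the usual coordinate-wise formula $|x_1 - y_1| + |x_2 - y_2|$, and the checker `satisfies_axioms` (a name chosen here) only tests the four axioms of Definition 1 on a finite sample of points, which is evidence rather than a proof.

```python
import itertools
import math


def euclidean(p, q):
    """Straight-line distance in the plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


def manhattan(p, q):
    """Distance along axis-parallel streets."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def satisfies_axioms(d, points, tol=1e-9):
    """Check symmetry, triangle inequality, d(x,x)=0 and separation on a sample."""
    for x, y, z in itertools.product(points, repeat=3):
        if abs(d(x, y) - d(y, x)) > tol:
            return False
        if d(x, y) + d(y, z) < d(x, z) - tol:
            return False
        if d(x, x) > tol:
            return False
        if d(x, y) <= tol and x != y:
            return False
    return True


sample = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
print(euclidean((0, 0), (3, 4)), manhattan((0, 0), (3, 4)))
print(satisfies_axioms(euclidean, sample), satisfies_axioms(manhattan, sample))
```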
Examples 2 and 3 are of a "geometric" character with the elements of X
being "points." While such examples motivate much of the terminology used
in metric spaces, they are by no means the only important examples. Two
distinctly different types are provided by 4 and 6 below.
4 Example. Let $t_0 < t_1$ be real numbers and let $[t_0, t_1] = \{t \mid t_0 \le t \le t_1\}$.
Define $X$ to be the set of all continuous functions $x\colon [t_0, t_1] \to \mathbf{R}$. By the
maximum value theorem, for each $x, y \in X$ there exists a number $t$ in $[t_0, t_1]$
maximizing $|x(t) - y(t)|$. Call this maximum value the distance $d(x, y)$ between $x$ and $y$, that is, define
\[ d(x, y) = \operatorname{Max}_{t_0 \le t \le t_1} |x(t) - y(t)|. \]
It is not hard to prove $(X, d)$ is a metric space. This example is studied
further in Section 2 in connection with differential equations. Note that
complex objects, namely, functions, are treated as mere "points" by this
abstraction.
5 Definition. A non-Archimedean metric on $X$ is a function $d\colon X \times X \to [0, \infty)$ satisfying (i), (iii), and (iv) of 1 as well as
\[ \operatorname{Max}(d(x, y), d(y, z)) \ge d(x, z). \]
Since $d(x, y) + d(y, z) \ge \operatorname{Max}(d(x, y), d(y, z))$ it is obvious that a non-Archimedean metric is a metric. None of the examples so far is non-Archimedean.
The following proposition provides a non-Archimedean metric which we
apply to the theory of context-free languages in Section 4.
6 Proposition. Let $X^+$ be the set of nonempty words on the alphabet $X$. Given
two languages $L, M \subset X^+$, we say they differ on $w$ if $w$ is in one language but
not the other. We let $l(L, M)$ be the length of the shortest word on which $L$
and $M$ differ, so that $l(L, M) = \infty$ if and only if $L = M$. The map $d\colon 2^{X^+} \times
2^{X^+} \to [0, \infty)$ defined by
\[ d(L, M) = 2^{-l(L, M)} \]
is a non-Archimedean metric on $2^{X^+}$.
PROOF. (i) $d(L, M) = 0$ if and only if $l(L, M) = \infty$ if and only if $L = M$.
(ii) Symmetry is obvious.
(iii) Given three languages $L$, $M$, and $N$, if $L$, $N$ differ on $w$ of length
$l(L, N)$ then, say, $w \in L$, $w \notin N$. If $w \in M$ then $l(M, N) \le l(L, N)$, else $w \notin M$ and
$l(L, M) \le l(L, N)$. We therefore have
\[ l(L, N) \ge \operatorname{Min}(l(L, M), l(M, N)). \]
Thus,
\[ d(L, N) \le \operatorname{Max}(d(L, M), d(M, N)), \]
as desired.
□
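The metric of Proposition 6 can be computed directly for finite languages. The following sketch assumes languages are represented as finite Python sets of nonempty strings (the full power set of $X^+$ of course also contains infinite languages, which this does not model); the function names are chosen here for illustration.

```python
import itertools
import math


def shortest_difference(L, M):
    """l(L, M): length of the shortest word in exactly one of L, M (inf if L == M)."""
    differing = L ^ M
    return min(map(len, differing)) if differing else math.inf


def d(L, M):
    """d(L, M) = 2 ** (-l(L, M)), with d = 0 when the languages coincide."""
    length = shortest_difference(L, M)
    return 0.0 if length == math.inf else 2.0 ** (-length)


def is_non_archimedean(languages):
    """Check Max(d(L,M), d(M,N)) >= d(L,N) on every triple from a finite sample."""
    return all(
        max(d(L, M), d(M, N)) >= d(L, N)
        for L, M, N in itertools.product(languages, repeat=3)
    )


sample = [frozenset(s) for s in (
    {"a"}, {"a", "ab"}, {"a", "ba"}, {"b", "ab"}, {"a", "ab", "aba"}, {"aa", "bb"},
)]
print(d(frozenset({"a", "ab"}), frozenset({"a", "ba"})), is_non_archimedean(sample))
```

Two languages are close in this metric exactly when they agree on all short words, which is why approximating a language by its words up to a given length gives a convergent sequence.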
7 Example. We saw in Exercise 3.3.3 that "any subset of a poset is a poset."
The same principle holds for metric spaces. If $(X, d)$ is a metric space and $A$
is any subset of $X$ then $d_A\colon A \times A \to [0, \infty)$ defined by restricting $d$ to
$A \times A$, that is,
\[ d_A(x, y) = d(x, y) \quad \text{for } x, y \in A \]
is a metric. That axioms 1 (i)-(iv) hold is obvious.
Usually, we would just write $(A, d)$ rather than $(A, d_A)$ since this is rarely
confusing. We say $(A, d)$ is a metric subspace of $(X, d)$.
8 Definition. Let $(X, d)$ be a metric space, let $(x_n \mid n = 1, 2, 3, \dots)$ be a sequence
of elements of $X$, and let $x \in X$. Then $x$ is a limit of $(x_n)$, in symbols
$\lim(x_n) = x$ or $x_n \to x$,
if the sequence $d(x, x_n)$ of real numbers approaches 0. (Some readers may
have seen rigorous "epsilon-delta" definitions for limits of real numbers whereas others may have seen only intuitive definitions. We prefer to avoid further
discussion of this here, our main point being that whatever the reader knows
about limits for real numbers extends easily to arbitrary metric spaces.)
9 Proposition. Limits in a metric space $(X, d)$ are unique, that is, if $x_n \to x$ and
$x_n \to y$ then $x = y$.
PROOF. As $d(x, y) \le d(x, x_n) + d(y, x_n)$ for any $n$ by the axioms of 1, $d(x, y) \to
0$. As $d(x, y)$ is independent of $n$, $d(x, y) = 0$. By 1, again, $x = y$.
□
Thus, while a sequence (x n) may have no limit, if a limit exists it is the limit
and we denote it by lim(xn).
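10 Example. Let $f\colon [0, 1] \to \mathbf{R}$, $f(x) = e^x$. As is well known from calculus,
\[ e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{3!} + \cdots. \]
If $f_n\colon [0, 1] \to \mathbf{R}$ is defined by
\[ f_n(x) = 1 + x + \frac{x^2}{2} + \cdots + \frac{x^n}{n!} \]
then $f_n \to f$ in the metric space of Example 4.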
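The convergence in Example 10 can be observed numerically. This is a rough sketch under stated assumptions: $f_n$ is taken to be the $n$th Taylor partial sum of $e^x$, the maximum over $[0, 1]$ is approximated on a finite grid (so the computed value can only underestimate the true distance), and the comparison bound $e/(n+1)!$ comes from the Lagrange form of the remainder.

```python
import math


def partial_sum(n, x):
    """f_n(x) = 1 + x + x^2/2! + ... + x^n/n!."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))


def sup_distance(n, grid_size=1000):
    """Grid approximation of Max |e^x - f_n(x)| over 0 <= x <= 1."""
    grid = [i / grid_size for i in range(grid_size + 1)]
    return max(abs(math.exp(x) - partial_sum(n, x)) for x in grid)


distances = [sup_distance(n) for n in range(11)]
for n, dist in enumerate(distances):
    print(n, dist, math.e / math.factorial(n + 1))
```

The printed distances shrink rapidly and stay below the remainder bound, which is the quantitative content of $f_n \to f$.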
11 Example. In the metric space $(2^{X^+}, d)$ of 6 with $X = \{a\}$ let $L_n = \{a, a^2,
\dots, a^n\}$. Then $X^+ = \lim(L_n)$ because $d(X^+, L_n) = 2^{-(n+1)}$.
12 Example. Let $(X, d)$ be a metric space. If $X$ possesses two distinct elements
$y$, $z$ then there exists a sequence $(x_n)$ with no limit. Define
\[ x_n = \begin{cases} y & \text{if } n \text{ is odd} \\ z & \text{if } n \text{ is even.} \end{cases} \]
To see that $x_n \to x$ is impossible for all $x$, observe that if $x_n \to x$ then $d(x_n, x_m) \le d(x_n, x) +
d(x, x_m)$ so that as $n$, $m$ get large $d(x_n, x_m) \to 0$ which says that $d(y, z) = 0$ (as
we can have $y = x_n$, $z = x_m$ for arbitrarily large $n$, $m$). This contradicts the
axioms of 1 as $y \ne z$.
Example 12 shows that it is unreasonable to expect all sequences to have
a limit. A sequence which has a limit must at least be "Cauchy." This is
explained in the next definition and proposition.
13 Definition. Let $(x_n)$ be a sequence in a metric space $(X, d)$. Then $(x_n)$ is
Cauchy if $d(x_m, x_n) \to 0$ as $n$, $m$ get large.
14 Proposition. In any metric space, a sequence with a limit must be Cauchy.
PROOF. The reasoning is the same as in 12. If $d(x, x_n) \to 0$ then $d(x_n, x_m) \le
d(x, x_n) + d(x, x_m) \to 0$ also.
□
Our overall objective is to develop conditions on a metric space $(X, d)$ and
a total function $\psi\colon X \to X$ which guarantee that $\psi$ has a distinguished fixed
point. It is useful to contrast this situation with that for domains and continuous maps. There we "started" with $\bot$ and applied $\psi$ successively to get
the sequence $\bot, \psi(\bot), \psi^2(\bot), \dots$. Monotonicity of $\psi$ forced this sequence
to be an ascending chain. The definition of "domain" provided a "limit" $x =
\bigvee \psi^n(\bot)$ to this sequence and continuity was used to prove $\psi(x) = x$. In a
metric space there is no natural "$\bot$" to start with, so let us start with any $x_0$.
We may then form the sequence
\[ x_0, \psi(x_0), \psi^2(x_0), \dots. \]
This sequence will hopefully have a limit $x$ (so the sequence itself must at least
be Cauchy by 14) and hopefully for this $x$ we can prove $\psi(x) = x$. We will in
fact find conditions on $(X, d)$ and $\psi$ to carry out this program in such a way
that $\psi$ has a unique fixed point (so that any starting $x_0$ may be used). While
this property is striking, it is also clear that "metric semantics" will not
work for recursive specifications with multiple fixed points (e.g., 5.1.4). The
uniqueness property stems from the following proposition.
15 Proposition. Let $(X, d)$ be a metric space and let $\psi\colon X \to X$ satisfy
16    $d(\psi(x), \psi(y)) < d(x, y)$ whenever $x \ne y$.
Then $\psi$ has at most one fixed point.
PROOF. Let $\psi(x) = x$, $\psi(y) = y$. Then if $x \ne y$, $d(x, y) = d(\psi(x), \psi(y)) < d(x, y)$
and this is impossible.
□
17 Example. A map satisfying 16 need not have any fixed points. For example, let $\mathbf{R}$ have its usual metric $d(x, y) = |x - y|$ and let $(X, d)$ be the metric
subspace as in 7 with $X$ the subset $\{x_1, x_2, x_3, \dots\}$ with $x_n = 1/n$. Define
$\psi\colon X \to X$ by $\psi(x_n) = x_{n+1}$. That 16 holds is easily verified, but clearly $\psi$ has
no fixed points.
A related example holds for the subset $Y = \{y_1, y_2, y_3, \dots\}$ of $\mathbf{R}$ with
$y_n = n - 1/n$. Here, $\varphi\colon Y \to Y$, $\varphi(y_n) = y_{n+1}$ again satisfies 16 and has no fixed
points but whereas $x_1, \psi(x_1), \psi^2(x_1), \dots$ was a Cauchy sequence, $y_1, \varphi(y_1),
\varphi^2(y_1), \dots$ is not.
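Both maps of Example 17 can be probed with exact rational arithmetic. The sketch below is an illustration on finitely many indices only; the helper `ratio` (a name introduced here) computes the quotient $d(\text{next}(p), \text{next}(q)) / d(p, q)$ for the shift map on each sequence.

```python
from fractions import Fraction


def x(n):
    """x_n = 1/n."""
    return Fraction(1, n)


def y(n):
    """y_n = n - 1/n."""
    return n - Fraction(1, n)


def ratio(point, n, m):
    """Distance after one shift step divided by the distance before, for indices n != m."""
    return abs(point(n + 1) - point(m + 1)) / abs(point(n) - point(m))


pairs = [(n, m) for n in range(1, 41) for m in range(n + 1, 41)]
print(max(ratio(x, n, m) for n, m in pairs), max(ratio(y, n, m) for n, m in pairs))
print(float(ratio(x, 1000, 1001)), float(ratio(y, 1000, 1001)))
```

Every sampled ratio is strictly below 1, so condition 16 holds on the sample, yet the ratios creep toward 1, so no single constant $K < 1$ bounds them. Consecutive $y_n$ stay more than 1 apart, which is why the second orbit is not Cauchy.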
The following two definitions patch the holes created by Example 17 and
lead to our main result, Theorem 20.
18 Definition. A metric space is complete if every Cauchy sequence has a
limit.
As is shown in the exercises and in the balance of the chapter, while it may
take work to establish it, many examples of complete metric spaces exist. It is
an issue of foundations to establish that the real line is complete; see the end
of chapter notes.
19 Definition. Let $(X, d)$ be a metric space. A map $\psi\colon X \to X$ is a contraction
if there exists $0 \le K < 1$ with
\[ d(\psi(x), \psi(y)) \le K\, d(x, y) \quad \text{for all } x \ne y. \]
As $K < 1$, a contraction must satisfy 16. The reason 16 is a weaker condition
is that the quotient
\[ \frac{d(\psi(x), \psi(y))}{d(x, y)}, \]
while $< 1$, may climb arbitrarily close to 1 (as happens for the $\varphi$ of 17).
20 Theorem (Banach). Let $(X, d)$ be a complete metric space, let $x_0 \in X$ be
arbitrary, and let $\psi\colon X \to X$ be a contraction. Then $\psi$ has a unique fixed point,
namely,
21    $\lim(\psi^n(x_0))$.
PROOF. We must prove that the limit exists, and that it is a fixed point.
Uniqueness will then follow by 15. To prove 21 is a fixed point, we first
observe the following:
22 Lemma. For any two metric spaces $(X_1, d_1)$, $(X_2, d_2)$, if $f\colon X_1 \to X_2$ satisfies
23    $d_2(f(x), f(y)) \le d_1(x, y)$ for all $x, y \in X_1$,
then whenever $x = \lim(x_n)$ in $(X_1, d_1)$, $\lim(f x_n)$ exists in $(X_2, d_2)$ and coincides
with $f(x)$. Thus, $f(\lim(x_n)) = \lim(f(x_n))$.
□
Returning to the proof of 20, if the limit $x$ of 21 exists then applying the lemma with $f = \psi$ yields
$\psi(x) = \psi(\lim(\psi^n(x_0))) = \lim(\psi(\psi^n(x_0))) = \lim(\psi^{n+1}(x_0)) = x$
(since it is obvious that whenever $\lim(y_1, y_2, y_3, \ldots) = y$ then also $\lim(y_2, y_3, y_4, \ldots) = y$). Thus, we need only show the limit of 21 exists. As $(X, d)$ is complete, this is equivalent to showing that $(\psi^n(x_0))$ is a Cauchy sequence. To see this, note that for any $m < n$,
$d(\psi^m x_0, \psi^n x_0) = d(\psi^m x_0, \psi^m(\psi^{n-m} x_0)) \le K^m\, d(x_0, \psi^{n-m} x_0).$
By repeated application of the triangle inequality
$d(x_0, \psi^{n-m} x_0) \le d(x_0, \psi x_0) + d(\psi x_0, \psi^2 x_0) + \cdots + d(\psi^{n-m-1} x_0, \psi^{n-m} x_0) \le (1 + K + \cdots + K^{n-m-1})\, d(x_0, \psi x_0).$
But since $K < 1$, we know that $\sum_{i \ge 0} K^i$ converges to the limit $1/(1 - K)$. Thus,
$d(x_0, \psi^{n-m} x_0) \le \dfrac{1}{1 - K}\, d(x_0, \psi x_0)$
and so
$d(\psi^m x_0, \psi^n x_0) \le \dfrac{K^m}{1 - K}\, d(x_0, \psi x_0).$
Since $K < 1$, we can make the right-hand side as small as we please simply by requiring that $m$ exceed some sufficiently large $N$. But then $(\psi^m x_0 \mid m \ge 1)$ is Cauchy, and we are done.  □
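The proof of Theorem 20 is constructive: iterate $\psi$ from any starting point and stop when successive terms are close. The following is a minimal sketch of that iteration, not part of the original text. The map `psi` (with $K = 1/2$ on the real line), the helper name `iterate_to_fixed_point`, and the tolerance are all illustrative assumptions.

```python
def iterate_to_fixed_point(psi, x0, dist, tol=1e-12, max_steps=10_000):
    """Iterate x0, psi(x0), psi(psi(x0)), ... until successive terms are within tol."""
    x = x0
    for step in range(1, max_steps + 1):
        nxt = psi(x)
        if dist(x, nxt) <= tol:
            return nxt, step
        x = nxt
    raise RuntimeError("no convergence within max_steps")


# Hypothetical contraction on the real line with K = 1/2; its fixed point is 2.
def psi(x):
    return 0.5 * x + 1.0


def dist(x, y):
    return abs(x - y)


# Two different starting points x0 lead to the same fixed point.
fixed_a, steps_a = iterate_to_fixed_point(psi, 0.0, dist)
fixed_b, steps_b = iterate_to_fixed_point(psi, 100.0, dist)
```

Both runs approach the same value, as uniqueness requires; the number of steps depends on the starting point through the bound $K^m\, d(x_0, \psi x_0)/(1 - K)$.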
We observe that Theorem 20 is an instance of the canonical fixed point theorem of Chapter 7, applied to the category whose objects are all $(X, d, \psi)$ with $\psi$ a contraction $(X, d) \to (X, d)$ and whose morphisms $f: (X, d, \psi) \to (X', d', \psi')$ are total functions $f: X \to X'$ satisfying $d'(fx, fy) \le d(x, y)$ and $\psi' f = f \psi$. The initial object is the object whose set has one element, from which the unique morphism to $(X, d, \psi)$ maps the single element to the unique fixed point of $(X, d, \psi)$. Proving that this is an initial object requires Theorem 20, but the canonical fixed point theorem then establishes that morphisms preserve the unique fixed points. See Exercise 10.
EXERCISES FOR SECTION 9.1
1. Prove that the metric in 4 satisfies the axioms of 1. [Warning: in the triangle inequality, if $d(x, y) = |x(t) - y(t)|$, $d(y, z) = |y(u) - z(u)|$ you cannot assume $t = u$; show $|x(t) - z(t)| \le d(x, y) + d(y, z)$ for all $t$.] Give a specific example of functions $x, y, z$ with $d(x, z) > \mathrm{Max}(d(x, y), d(y, z))$ to show that this metric fails to be non-Archimedean. [Hint: Consider parallel curves.]
9.1 Contractions on Complete Metric Spaces
2. For any set $X$ define $d: X \times X \to [0, \infty)$ by
$d(x, y) = \begin{cases} 1 & x \ne y \\ 0 & x = y. \end{cases}$
Show that $d$ is a non-Archimedean metric. It is called the discrete metric on $X$. In this metric space, prove that if $(x_n)$ is Cauchy then there exists $N$ such that $x_n = x_m$ if $n, m \ge N$. Conclude that $(X, d)$ is complete.
3. Prove that $f_n \to f$ in 10. [Hint: Look up Taylor's Theorem.]
4. Prove that every nonempty metric space $(X, d)$ admits a map $\psi: X \to X$ with $d(\psi(x), \psi(y)) \le d(x, y)$ such that $\psi$ has a fixed point but is not a contraction. [Hint: Very easy!]
5. Let $(A, d)$ be a metric subspace of $(X, d)$ as in 7. Say that $A$ is closed if whenever $a_n \to x$ in $(X, d)$ with each $a_n \in A$ then also $x \in A$, that is, "A is closed under limits." Prove that a closed metric subspace of a complete metric space is again complete. Give an example to show that "closed" is necessary.
6. In the Euclidean plane, the set of all points equidistant from $x$, $y$ ($x \ne y$) is the perpendicular bisector of the segment connecting $x$, $y$ and, in particular, is a set of zero area. Prove, however, that if the segment connecting $x$, $y$ has slope 1 then the set of all points Manhattan-equidistant from $x$, $y$ has infinite area.
7. A metric space (X, d) is finite if X is finite. Prove that every finite metric space is
complete.
8. Let $(X, d)$ be the Euclidean plane. The map $\psi(x, y) = (\tfrac{1}{2}x, \tfrac{1}{2}y)$ is a contraction whose unique fixed point is $(0, 0)$. Let $(A, d)$ be a nonempty metric subspace as in 7 such that $(0, 0) \notin A$. By 20, if $\psi$ maps $A$ into $A$, $(A, d)$ is not complete. Give an example of such an $A$ and show precisely where $A$ fails to be complete. Similarly, give an example of a complete $(A, d)$ with $(0, 0) \notin A$ and show that for some $a \in A$, $\psi(a) \notin A$.
9. In Exercise 2.1.7 we saw that "a poset is a category." A similar result holds for metric spaces. For a metric space $(X, d)$ define a category $C_{(X,d)}$ as follows:
$\mathrm{ob}(C_{(X,d)}) = X$,   $C_{(X,d)}(x, y) = \{t \in [0, \infty) : d(x, y) \le t\}$,
that is, we have a morphism $x \xrightarrow{t} y$ just in case $d(x, y) \le t$. Define composition by
$(x \xrightarrow{t} y \xrightarrow{u} z) = (x \xrightarrow{t+u} z)$
and define identities by $x \xrightarrow{0} x$.
(i) Verify $C_{(X,d)}$ is a category and that $(X, d) = (Y, e)$ if $C_{(X,d)} = C_{(Y,e)}$.
(ii) Show that $C_{(X,d)}$ is literally equal to its dual category $C_{(X,d)}^{op}$.
(iii) Show that for any $x, y \in C_{(X,d)}$ the product $x \times y$ does not exist. [Hint: if $x \xleftarrow{a} p \xrightarrow{b} y$ were a product, consider
[Diagram: a second cone over $x$ and $y$ together with a mediating morphism $v$ into $p$]
and show such $v$ cannot exist.]
10. Without relying on the canonical fixed point theorem, prove that if $\psi: (X, d) \to (X, d)$, $\psi': (X', d') \to (X', d')$ are contractions with unique fixed points $x_0$, $x_0'$ and if $f: X \to X'$ is a total function with $d'(fx, fy) \le d(x, y)$ and $\psi' f = f \psi$ then $f x_0 = x_0'$.
9.2 Differential Equations
The contraction theorem 20 has many applications. In this section we sketch
a proof of a standard application to the solution of differential equations.
While the material is not needed in later sections and is not directly connected with program semantics, the mathematics used is similar in many
respects and we felt it to be a good idea to give the reader the option of
making the comparison.
While the theory holds in great generality, we shall work out the one-dimensional case for simplicity. We are given a function $f: [t_0, t_1] \times \mathbf{R} \to \mathbf{R}$, and seek to find a trajectory $x: [t_0, t_1] \to \mathbf{R}$ which specifies the value $x(t)$ for each $t$ with $t_0 \le t \le t_1$ in such a way as to satisfy the differential equation
1    $\dot{x}(t) = f(t, x(t))$
and the initial condition
2    $x(t_0) = x_0$
for some specified $x_0 \in \mathbf{R}$. We emphasize that this classical problem is a recursive definition of $x$, since the derivative of $x$ is defined in terms of the $x$ which is to be found.
Assuming sufficient continuity, we may integrate both sides of 1 to obtain
$\int_{t_0}^{t} \dot{x}(s)\,ds = \int_{t_0}^{t} f(s, x(s))\,ds$
or
$x(t) - x(t_0) = \int_{t_0}^{t} f(s, x(s))\,ds.$
Using the specified value $x(t_0) = x_0$, we see that we may replace our initial-value problem 1 and 2 by the integral equation
3    $x(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\,ds,$
which is a fixed point equation. In more detail, let us use $X$ to denote the space of all continuous functions $[t_0, t_1] \to \mathbf{R}$, which includes the sought-for trajectory $x$. Consider the integral operator $\psi: X \to X$, where for each $x \in X$, that is, $x: [t_0, t_1] \to \mathbf{R}$, the value $\psi x: [t_0, t_1] \to \mathbf{R}$ is defined for each $t$ in $[t_0, t_1]$ by
4    $(\psi x)(t) = x_0 + \int_{t_0}^{t} f(s, x(s))\,ds.$
We see that $x$ is a solution of the integral equation 3 if and only if $x$ is a fixed point of the $\psi$ defined by 4: $x(t) = (\psi x)(t)$ for all $t$ in $[t_0, t_1]$.
All that remains is to show how to consider X as a complete metric space,
and to give conditions on f which allow us to apply the contraction lemma
to guarantee that t/I has a unique fixed point and thus, for such f, conclude
that our initial-value problem has a unique solution. Indeed, let d be the
metric on X introduced in 9.1.4. We leave it to the reader (Exercise 1) to
prove that (X, d) is complete. We next consider an appropriate condition
on f.
5 Definition. We say that $f: [t_0, t_1] \times \mathbf{R} \to \mathbf{R}$ satisfies a Lipschitz condition on $x$ uniformly with respect to $t$ in $[t_0, t_1]$ if there exists a number $K > 0$ such that
$|f(t, x) - f(t, y)| \le K\,|x - y|$
for all $x, y$ in $\mathbf{R}$ and $t$ in $[t_0, t_1]$.
Note the (at first surprising) fact that we do not require K < 1 as in the
Banach Theorem. This will be important when the proof of our concluding
theorem takes a surprising turn below.
6 Theorem. If $f: [t_0, t_1] \times \mathbf{R} \to \mathbf{R}$ satisfies a Lipschitz condition with constant $K > 0$, then the corresponding operator $\psi: X \to X$ of 4 has a unique fixed point.
PROOF. We have to compare $d(\psi x, \psi y)$ with $d(x, y)$. In the calculations below, each Max ranges over $t$ in $[t_0, t_1]$.
$d(\psi x, \psi y) = \mathrm{Max}\, |\psi x(t) - \psi y(t)|$
$= \mathrm{Max}\, \left| \int_{t_0}^{t} (f(s, x(s)) - f(s, y(s)))\,ds \right|$
$\le \mathrm{Max} \left( \int_{t_0}^{t} |f(s, x(s)) - f(s, y(s))|\,ds \right)$
$\le \mathrm{Max} \left( \int_{t_0}^{t} K\,|x(s) - y(s)|\,ds \right)$
$\le \mathrm{Max}(K \cdot d(x, y) \cdot (t - t_0))$    since $|x(s) - y(s)| \le d(x, y)$
$= K (t_1 - t_0) \cdot d(x, y).$
We seem to be in trouble! There is no reason to expect that $K \cdot (t_1 - t_0) < 1$, and so we cannot conclude that $\psi$ is a contraction. However, all is not lost: we shall see that there is some $n \ge 1$ for which $\psi^n$ is a contraction. Just as the $n$-fold integral of the function constantly 1 is $t^n/n!$, so do we have
7    $d(\psi^n x, \psi^n y) \le \dfrac{[K \cdot (t_1 - t_0)]^n}{n!}\, d(x, y),$
as is readily proved by induction on $n$. But we can choose $n$ so large that
$[K \cdot (t_1 - t_0)]^n < n!$
and conclude that, for this $n$, $\psi^n$ is a contraction. But then $\psi^n$ has a unique fixed point, $x$ in $X$, with $\psi^n x = x$. Consider, now, that
$\psi^n(\psi x) = \psi(\psi^n x) = \psi x.$
This says that $\psi x$ is also a fixed point of $\psi^n$. Since $x$ was the unique fixed point of $\psi^n$, it follows that $\psi x = x$. To see that $x$ is the unique fixed point of $\psi$, just note that any fixed point of $\psi$ is also a fixed point of $\psi^n$.  □
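The iteration behind Theorem 6 can be carried out exactly when $f$ is simple. The sketch below is an illustration rather than part of the original development. It assumes $f(t, x) = 3x$, $t_0 = 0$, and $x_0 = 2$, and represents each iterate as a polynomial in $t$ with rational coefficients. The names `picard_step` and `picard_iterates` are hypothetical.

```python
from fractions import Fraction
from math import factorial


def picard_step(coeffs, x0, rate):
    """Apply (psi x)(t) = x0 + integral from 0 to t of rate * x(s) ds.

    coeffs[k] is the coefficient of t**k in the polynomial x.
    """
    # Integrating rate * c * s**k from 0 to t gives rate * c * t**(k+1) / (k+1).
    integrated = [Fraction(rate) * c / (k + 1) for k, c in enumerate(coeffs)]
    return [Fraction(x0)] + integrated


def picard_iterates(x0, rate, n):
    """Return psi^n applied to the constant function x0."""
    x = [Fraction(x0)]
    for _ in range(n):
        x = picard_step(x, x0, rate)
    return x


approx = picard_iterates(x0=2, rate=3, n=6)
# Taylor coefficients of 2 * exp(3t) up to degree 6, for comparison.
taylor = [Fraction(2 * 3**k, factorial(k)) for k in range(7)]
```

Under these assumptions the sixth iterate coincides with the degree-6 Taylor polynomial of $2e^{3t}$, so the iterates approach the solution claimed in Exercise 2 below.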
EXERCISES FOR SECTION 9.2
1. Prove that (X, d) of 9.1.4 is complete.
2. The unique solution $x: [0, \infty) \to \mathbf{R}$ satisfying
$\dot{x}(t) = 3x(t)$,   $x(0) = 2$
is $x(t) = 2e^{3t}$. Verify this using the theory of this section by showing that $f(t, x) = 3x$ satisfies a Lipschitz condition and that $2e^{3t}$ is a fixed point of $\psi$. [Hint: The latter, recall, just says that $2e^{3t}$ solves the differential equation.]
3. Repeat Exercise 2 for
$\dot{x}(t) = 2t\,x(t)$,   $x(0) = 5$.
9.3 Metrics on Trees
Informally, the specification W(x) = while p(x) do f(x) unwraps to the recursive specification
W(x) = if p(x) then W(f(x)) else x
as may be seen from the flowchart equivalence
[Figure: flowchart equivalence between W and a test p whose F exit leaves the program and whose other exit leads to W.]
Of course, we can substitute the right-hand expression for W in this expression and obtain a larger flowchart, and this process could be continued
getting larger and larger flowcharts. Each successive flowchart may evaluate
more and more arguments without calling W itself. This suggests that we
think of the semantics of such a recursive specification as being the semantics
of the limit of the corresponding sequence of flowcharts. In this section we
formalize this concept, but replacing the informal treatment of flowcharts by
a formal treatment of trees.
More specifically, we consider recursive specification at a purely syntactic
level in the form of a finite tree which "calls itself on specified arguments."
"Execution" is a process of iterated substitution which produces a sequence
of ever deeper finite trees which, in the limit, defines an infinite tree which
represents the specification in nonrecursive form at the syntactic level. We
show in this section that this infinite tree arises as the unique fixed point of
the specification by applying the Banach fixed point theorem 9.1.20. This is
the starting point of a number of theories discussed further in the notes to
this chapter.
So as not to obscure the underlying simplicity of the concepts involved we
will avoid being too formal. The interested reader may consult the references
in the end of chapter notes to learn more about the formal theory of infinite
trees.
1 Example. A "tree-specification" for W = while p(x) do f(x) is
2
      W            ifp
      |     =    /  |  \
      x         x   W   x
                    |
                    f
                    |
                    x
The interpretation we have in mind is
3    $\mathrm{ifp}(x, y, z) = \begin{cases} y & \text{if } p(x) \text{ is true} \\ z & \text{if } p(x) \text{ is false.} \end{cases}$
Here the "formal argument" is specified on the left-hand side as x whereas the argument of W on the right-hand side is the tree for f(x). Hence, a single "execution" of 2 substitutes f(x) for each x on the right-hand side to give
            ifp
          /  |  \
         x  ifp  x
          /  |  \
         f   W   f
         |   |   |
         x   f   x
             |
             f
             |
             x
which, when semantically interpreted from the root down, gives
"if p(x) then (if p(f(x)) then W(f(f(x))) else f(x)) else x."
Repeated call leads to the infinite tree
4    ifp(x,
         ifp(f(x),
             ifp(f²(x), ..., f²(x)),
             f(x)),
         x)
where f²(x) denotes the tree with f above f above x, and so forth.
5 Example. A "tree-specification" for the factorial function
FACT(x) = if x = 0 then 1 else x * FACT(pred(x))
is
6
     FACT          ifzero
      |     =     /   |   \
      x          x   one   times
                           /   \
                          x    FACT
                                |
                               pred
                                |
                                x
Here, ifzero is as in 3 for p the test for equality to zero, one(x) = 1, and pred
is the predecessor function. As the argument to FACT is pred(x) on the
right-hand side of 6, the resulting infinite tree by repeated "execution" is
7    ifzero(x, one,
            times(x,
                  ifzero(pred(x), one,
                         times(pred(x),
                               ifzero(pred(pred(x)), one,
                                      times(pred(pred(x)), ...))))))
The reader should check that "semantic interpretation from the root down"
is correct.
8 Example. To illustrate how multiple arguments and calls can be handled,
consider the two-variable Ackermann function of 5.1.3:
9    $a(m, n) = \begin{cases} n + 1 & \text{if } m = 0 \\ a(m - 1, 1) & \text{if } m \ne 0,\ n = 0 \\ a(m - 1, a(m, n - 1)) & \text{else.} \end{cases}$
A suitable "tree-specification" is
Though perhaps not obvious, if "all" substitutions are made, the resulting
"ACK-free" infinite tree is independent of "substitution strategy." It is crucial
here that the root ifzero of the right-hand side of 10 is not ACK. This
guarantees that we can force the depth of all ACKs to be arbitrarily large by
repeated substitutions. Theorem 21 below provides a formal proof.
To check their understanding, readers should now pause to verify the
following:
11 The depth-9 tree obtained from
          ACK
         /    \
      pred    ACK
       |     /    \
       m    m    pred
                  |
                  n
by substituting 10 in both ACKs is independent of which one is substituted first.
Our definition of trees will be informal. The examples above give the
general idea. Node labels come from a fixed set and are either formal
arguments x, y, z, ... or given function symbols f, g, ifzero, ... or function-variable symbols FACT, ACK, .... Each function symbol has a definite arity in {0, 1, 2, ...} (we call a 0-ary symbol constant, a 1-ary symbol unary, and a 2-ary symbol binary).
Trees are drawn upside down with a root at the top and leaves at the
bottom. Trees used in specifications shall have a unique root and each leaf
shall be either a formal argument or a constant. (We consider subtrees in 13
with more arbitrary leaves, however.)
In 7, the root is ifzero and there are infinitely many leaves, each either x or
one. The given function symbols are one, pred, times, and ifzero of arities 0,
1, 2, and 3. In 6, FACT is a unary function-variable symbol.
Properties 12 and 14 below for trees seem intuitively evident. Any formal
definitions that yield these properties will suffice for the needs of the theory
below.
12 Property. Each node in a tree connects to the root by a unique upward
path of finite length. The number of nodes above the given node in this path
is called the depth of the node. If t is finite, the depth of t is its maximum node
depth.
Thus, in the tree of 6, FACT has depth 2, ifzero has depth 0, and the tree
itself has depth 4. In the infinite tree of 7 times occurs with depths 1, 3, 5, ....
13 Definition. For any tree $t$ and integer $n \ge 0$, define $t[n]$ to be the subtree of all nodes of depth $\le n$. Thus, if $t$ is the tree of 6,
t[0] =   ifzero

t[1] =      ifzero
          /   |   \
         x   one   times

t[2] =      ifzero
          /   |   \
         x   one   times
                   /   \
                  x    FACT
and $t = t[4] = t[5] = \cdots$.
14 Property. If $t_0, t_1, t_2, \ldots$ is a sequence of finite trees such that $t_n$ has depth $n$ and such that $t_n[m] = t_m$ whenever $m \le n$, then there exists an (obviously infinite) tree $t$ with $t[n] = t_n$ for all $n$.
The tree $t$ is unique in 14 since 12 implies that for any two trees $t$, $u$, $t = u$ if and only if $t[n] = u[n]$ for all $n$.
We now have the following:
15 Theorem. Let $T$ be a set of trees each of which satisfies 12 and which satisfies 14. Define $d: T \times T \to [0, \infty)$ by
16    $d(t, u) = \begin{cases} 0 & \text{if } t = u \\ 2^{-k(t,u)} & \text{if } t \ne u, \end{cases}$
where
17    for $t \ne u$, $k(t, u)$ is the least $k$ with $t[k] \ne u[k]$.
Then $d$ is a non-Archimedean metric and $(T, d)$ is complete.
PROOF. That $d(t, u) = d(u, t)$ is obvious and $d(t, u) = 0$ if and only if $t = u$ by 12. To see $d(t, v) \le \mathrm{Max}(d(t, u), d(u, v))$ for all $t, u, v$ observe that if $k = \mathrm{Min}(k(t, u), k(u, v))$ then for $p < k$, $t[p] = u[p] = v[p]$ so that $k \le k(t, v)$, hence $d(t, v) \le 2^{-k}$. But $2^{-k}$ is one of $d(u, t)$, $d(u, v)$, so we have shown that $d$ is a non-Archimedean metric.
Now let $t_1, t_2, t_3, \ldots$ be a Cauchy sequence. For each $n$, it follows from the definition of a Cauchy sequence that $d(t_r, t_s) < 2^{-n}$ whenever $r$, $s$ are suitably large. In view of 12 and 13, this means that there exists an integer $N_n$ depending only on $n$ so that
18    $t_r[n] = t_s[n]$ whenever $r, s \ge N_n$, for each $n$.
Define
19    $u_n = t_r[n]$ for any $r \ge N_n$.
Then $u_n$ is a finite tree of depth at most $n$. If $m \le n$ choose any $r \ge N_n$ and $\ge N_m$ so that
$u_n = t_r[n]$,   $u_m = t_r[m]$,
hence,
$u_n[m] = (t_r[n])[m] = t_r[m] = u_m$
(it being obvious that for any tree $t$, $(t[n])[m] = t[m]$ if $m \le n$). Hence, the sequence $(u_n)$ of 19 satisfies the hypotheses of 14 so that there exists a tree $u$ with
20    $u[n] = u_n$ for all $n$.
To finish the proof we must show that $d(u, t_r) \to 0$ as $r$ gets large, thereby establishing completeness by showing that the arbitrary Cauchy sequence $(t_n)$ has $u$ as limit. But this is easy as follows. For any fixed $n$, $u[n] = u_n = t_r[n]$ for all $r \ge N_n$ by 20 and 19. Thus, for $r \ge N_n$, $k(u, t_r) \ge n$ and $d(u, t_r) \le 2^{-n}$. Since $n$ is arbitrary (so that $2^{-n}$ may be arbitrarily small) this says $d(u, t_r) \to 0$ as $r \to \infty$.  □
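For finite trees, the truncation $t[n]$ of 13 and the metric of 16 and 17 are easy to compute. The following sketch is illustrative only and rests on an assumption not in the text: a tree is encoded as a pair `(label, children)` with `children` a tuple. The helper names `truncate`, `depth`, and `distance` are hypothetical.

```python
def truncate(tree, n):
    """t[n]: keep only the nodes of depth <= n."""
    label, children = tree
    if n == 0:
        return (label, ())
    return (label, tuple(truncate(c, n - 1) for c in children))


def depth(tree):
    """Maximum node depth of a finite tree."""
    _, children = tree
    return 0 if not children else 1 + max(depth(c) for c in children)


def distance(t, u):
    """d(t, u) = 0 if t == u, else 2**(-k) for the least k with t[k] != u[k]."""
    if t == u:
        return 0.0
    k = 0
    while truncate(t, k) == truncate(u, k):
        k += 1
    return 2.0 ** (-k)


x, one = ("x", ()), ("one", ())
# Tree 6: ifzero(x, one, times(x, FACT(pred(x))))
t6 = ("ifzero", (x, one, ("times", (x, ("FACT", (("pred", (x,)),))))))
# Two hypothetical variants that first differ from t6 at depth 3 and depth 1.
u = ("ifzero", (x, one, ("times", (x, ("FACT", (x,))))))
v = ("ifzero", (x, one, ("plus", (x, x))))
```

With these three trees one can check the non-Archimedean inequality directly: `distance(t6, u)` is no larger than the maximum of `distance(t6, v)` and `distance(v, u)`.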
We next show that a wide class of recursive tree-specifications are contractions. (Sets of such specifications are discussed in Exercises 4 and 5.)
21 Theorem. For fixed sets of given function symbols, $n$-ary function-variable symbol $F$ and formal arguments $x_1, \ldots, x_n$ ($n \ge 1$), let $T$ be the set of all trees whose leaves are formal arguments or nullary given functions. We consider $T$ a complete metric space as in Theorem 15. Let $t$ be a finite tree in $T$ in which $F$ appears but not at the root. Let $\psi: T \to T$ be any function for which $\psi(u)$ is obtained from $u$ by substituting $t$ for one or more $F$ with arguments respected, that is, each leaf $x_i$ in $t$ is replaced with the $i$th-branch subtree of the $F$. Then $\psi$ is a contraction.
PROOF. In the notation of 17, for $u, v \in T$ it is clear that $k(\psi u, \psi v) \ge p + k(u, v)$ if $p$ is the length of the shortest path from an $F$ to the root in $t$. As $p > 0$,
$\dfrac{d(\psi u, \psi v)}{d(u, v)} = \dfrac{2^{-k(\psi u, \psi v)}}{2^{-k(u, v)}} \le \dfrac{2^{-(p + k(u, v))}}{2^{-k(u, v)}} = 2^{-p} < 1.$  □
According to the proof of the Banach theorem 9.1.20, the unique fixed point of the $\psi$ of Theorem 21 is obtained by iterating $\psi$ on any starting tree (which might as well be called "F"). The reader who has worked through 11, let alone its infinite iteration, should appreciate how the theory of contractions on metric spaces has provided a powerful test to prove uniqueness!
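To see the contraction at work, one can iterate the substitution for the tree-specification 2 of Example 1 and measure the distance between successive trees. This is a sketch under assumptions: trees are encoded as pairs `(label, children)`, the single formal argument is the leaf labelled `"x"`, and every occurrence of W is substituted at each step. The names `instantiate`, `psi`, and `BODY` are hypothetical.

```python
def truncate(tree, n):
    """t[n]: keep only the nodes of depth <= n."""
    label, children = tree
    if n == 0:
        return (label, ())
    return (label, tuple(truncate(c, n - 1) for c in children))


def distance(t, u):
    """2**(-k) for the least k with t[k] != u[k]; 0 if the trees are equal."""
    if t == u:
        return 0.0
    k = 0
    while truncate(t, k) == truncate(u, k):
        k += 1
    return 2.0 ** (-k)


def instantiate(tree, arg):
    """Replace every leaf x in tree by the subtree arg."""
    label, children = tree
    if label == "x" and not children:
        return arg
    return (label, tuple(instantiate(c, arg) for c in children))


X = ("x", ())
# Right-hand side of 2: ifp(x, W(f(x)), x)
BODY = ("ifp", (X, ("W", (("f", (X,)),)), X))


def psi(u):
    """Substitute BODY for every W in u, respecting the argument of each W."""
    label, children = u
    new_children = tuple(psi(c) for c in children)
    if label == "W":
        return instantiate(BODY, new_children[0])
    return (label, new_children)


# Start from the tree W(x) and unfold four times.
approximants = [("W", (X,))]
for _ in range(4):
    approximants.append(psi(approximants[-1]))
gaps = [distance(a, b) for a, b in zip(approximants, approximants[1:])]
```

Here W sits at depth 1 in the right-hand side, so $p = 1$ and each gap is half the previous one, in line with the factor $2^{-p}$ of the proof.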
We conclude this section with a brief comparison with earlier discussions of recursive call. Consideration of Example 1 should suffice. Semantically, ifp in 3 and f take the forms
$\mathrm{ifp} \in \mathrm{Pfn}(X^3, X)$,   $f \in \mathrm{Pfn}(X, X)$
for an appropriate set $X$. At this level, 2 may be represented by
22    $\psi: \mathrm{Pfn}(X, X) \to \mathrm{Pfn}(X, X)$,   $\psi(W)(x) = \mathrm{ifp}(x, W(f(x)), x)$.
It is readily verified that $\psi$ is continuous and that its Kleene sequence
23    $\psi(\bot) = \mathrm{ifp}(x, \bot, x)$ = if not p(x) then x else $\bot$,
      $\psi^2(\bot) = \mathrm{ifp}(x, \mathrm{ifp}(fx, \bot, fx), x)$ = if not p(x) then x else if not p(f(x)) then f(x) else $\bot$,
      $\psi^3(\bot) = \mathrm{ifp}(x, \mathrm{ifp}(fx, \mathrm{ifp}(f^2x, \bot, f^2x), fx), x)$,
has the intended semantics while p(x) do f(x) as its least upper bound. A similar analysis obtains replacing Pfn with Mfn. The main idea of this section is that trees capture the process at a syntactic level which unifies an entire class of semantic interpretations.
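The Kleene sequence 23 can be imitated directly with higher-order functions. In the sketch below, which is an illustration and not the book's formalism, a partial function is a Python function that returns `None` where it is undefined (so `None` is assumed never to be a genuine value), and the test `p` and the function `f` are hypothetical choices.

```python
def bottom(x):
    """The nowhere-defined partial function; None stands in for 'undefined'."""
    return None


def make_psi(p, f):
    """Build the operator psi(W)(x) = ifp(x, W(f(x)), x) for given p and f."""
    def psi(W):
        def unfolded(x):
            # Call W on f(x) when p(x) holds, otherwise return x unchanged.
            return W(f(x)) if p(x) else x
        return unfolded
    return psi


def kleene(psi, n):
    """Return psi^n(bottom), the n-th term of the Kleene sequence."""
    W = bottom
    for _ in range(n):
        W = psi(W)
    return W


# Hypothetical instance: while x < 10 do x := x + 3.
p = lambda x: x < 10
f = lambda x: x + 3
psi = make_psi(p, f)
results = [kleene(psi, n)(0) for n in range(7)]
```

Starting from 0 the loop body runs four times, so the approximants are undefined at 0 until the fifth, after which they all agree with the while loop.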
In broadest terms, a possible comparison between syntax and semantics would take the following form:
24 (i) Choose a semantic category, C.
(ii) Interpret the given function symbols f, g, ... as morphisms in C.
(iii) Impose axioms on C to allow the interpretation of any tree as a C-morphism.
(iv) For $\psi: T \to T$ as in Theorem 21 induce a corresponding $\psi_0: C(X, X) \to C(X, X)$.
(v) Prove that the interpretation of the unique fixed point of $\psi$ is an appropriate canonical fixed point of $\psi_0$.
The example just discussed has these features, that is, the interpretation of the infinite tree 4 is also while p(x) do f(x).
The literature cited at the end of the chapter certainly carries out this
program for, at least, C = Pfn and C = Mfn. We have resisted developing 24
further and leave this as an open-ended research problem for the interested
reader.
EXERCISES FOR SECTION 9.3
1. Express Example 5.2.12 as a tree-specification and obtain the infinite tree of call.
Does it clarify that the semantics is the identity function as was seen in Exercise
5.2.5?
2. Consider the infinite tree of call for appropriate tree-specification of the Fibonacci
functions of Exercises 5.1.2 and 5.2.6. Is there any ambiguity?
3. Express 5.2.13 as a tree-specification. How clear is it that the infinite tree of call is
semantically equivalent to that for while p do g, as was shown in 5.2.17?
4. Let $(X_1, d_1), \ldots, (X_n, d_n)$ be metric spaces. Let $X = X_1 \times \cdots \times X_n$. Show that $(X, d)$ is a metric space if
$d((x_i), (y_i)) = \mathrm{Max}(d_i(x_i, y_i))$
and that $(X, d)$ is complete if each $(X_i, d_i)$ is.
5. Use Exercise 4 to state and prove a generalization of Theorem 21 for simultaneously recursive tree-specifications. [Hint: the ith metric space is the set of all
trees as in 21, but the number of formal arguments may depend on i.] Test your
result on the specification of Example 5.2.15.
6. Find appropriate forms of 22 and 23 for Example 5.
9.4 Context-Free Languages as Metric Fixed Points
In Section 6.4, we associated with each context-free grammar G a function
$\psi_G: (2^{X^*})^n \to (2^{X^*})^n$
which maps n-tuples of languages to n-tuples of languages. We saw that $(2^{X^*})^n$ could be given a domain structure with respect to which $\psi_G$ was continuous, and that L(G) was then the first component of the least fixed point of $\psi_G$.
In this section, we restrict G slightly by requiring that G be Λ-free, that is, that no production be of the form $v \to \Lambda$. (It is well known that for any context-free grammar G there exists a Λ-free context-free grammar G' such that $L(G') = L(G) - \{\Lambda\}$.) Recall that $X^+ = X^* - \{\Lambda\}$. We may then regard $\psi_G$ as a map $(2^{X^+})^n \to (2^{X^+})^n$. We give $(2^{X^+})^n$ the structure of a metric space and show that $\psi_G$ is a contraction. It thus has a unique fixed point, which must then equal the least fixed point, and so has L(G) as its first component.
We begin with the following:
1 Theorem. The metric space $(2^{X^+}, d)$ of 9.1.6 is complete.
PROOF. Let $L_1, L_2, L_3, \ldots$ be a Cauchy sequence. For each $k$ there exists $N_k$ such that
$d(L_m, L_n) < 2^{-k}$ for $m, n > N_k$.
But this just says that $L_m$ and $L_n$ do not differ on words of length $\le k$ for $m, n > N_k$. We can thus define the language $L$ by stipulating that
$w \in L$ if $w \in L_m$ for any (and thus all) $m > N_{|w|}$,
where, recall, $|w|$ is the length of $w$. It is then clear that $(L_m \mid m \ge 1)$ converges to $L$.  □
It is seen in Exercise 9.3.4 that whenever $(X_1, d_1), \ldots, (X_n, d_n)$ are complete metric spaces then $(X, d)$ is again a complete metric space if $X = X_1 \times \cdots \times X_n$ and $d((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \mathrm{Max}_{1 \le i \le n}\, d_i(x_i, y_i)$. It then follows that $(2^{X^+})^n$ is a complete metric space with the metric
$e((V_1, \ldots, V_n), (W_1, \ldots, W_n)) = \mathrm{Max}_{1 \le i \le n}\, d(V_i, W_i).$
To see that $\psi_G$ has a unique fixed point for Λ-free G, we need simply verify that $\psi_G$ is a contraction with respect to this metric.
2 Theorem. Let G be a Λ-free context-free grammar, and let $\psi_G$ be the functional $(2^{X^+})^n \to (2^{X^+})^n$ derived from G essentially as in Section 6.4. Then $\psi_G$ is a contraction with respect to the metric $e$ on $(2^{X^+})^n$.
PROOF. $e(V, W) = \mathrm{Max}_{1 \le i \le n}\, d(V_i, W_i) = 2^{-|w|}$, where $|w|$ is the length of $w$, the shortest word on which any $V_i$ differs from the corresponding $W_i$. Let $w'$ be the shortest word on which any $\psi_G(V)_j$ differs from the corresponding $\psi_G(W)_j$. Then, for $\psi_G$ to be a contraction, we require that $|w'| > |w|$, no matter what the choice of distinct $V$ and $W$, for then
$e(\psi_G(V), \psi_G(W)) = 2^{-|w'|} \le \tfrac{1}{2} \cdot 2^{-|w|} = \tfrac{1}{2}\, e(V, W).$
The shortest word on which $w_{l0} V_{l1} w_{l1} \cdots V_{lr} w_{lr}$ differs from $w_{l0} W_{l1} w_{l1} \cdots W_{lr} w_{lr}$ is clearly longer (if it exists) than the shortest word in which any of the $V_j$'s and $W_j$'s recurring therein differ, unless each $w_{lj} = \Lambda$. But then we are only in trouble if one of the $V_{lj}$ contains $\Lambda$. But we have forbidden this, and so we are done.  □
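A small experiment illustrates Theorem 2 for a single nonterminal. The sketch below makes several assumptions that are not in the text: the grammar is the hypothetical Λ-free grammar S → aSb | ab, so that $\psi_G(V) = \{ab\} + aVb$; languages are cut off at a fixed word length `BOUND` so that every set stays finite; and `distance` follows the description of $d$ used in the proof, namely $2^{-|w|}$ for a shortest word $w$ lying in exactly one of the two languages.

```python
def restrict(lang, n):
    """Keep only the words of length <= n."""
    return frozenset(w for w in lang if len(w) <= n)


def distance(L, M):
    """2**(-|w|) for a shortest word w in exactly one of L, M; 0 if L == M."""
    diff = L ^ M
    if not diff:
        return 0.0
    return 2.0 ** (-min(len(w) for w in diff))


BOUND = 8  # hypothetical cutoff on word length


def psi_G(V):
    """psi_G(V) = {ab} + aVb for the assumed grammar S -> aSb | ab."""
    return restrict({"ab"} | {"a" + w + "b" for w in V}, BOUND)


def iterate(V, n):
    """Apply psi_G to V a total of n times."""
    for _ in range(n):
        V = psi_G(V)
    return V


start_empty = frozenset()
start_other = frozenset({"b", "ba", "abab"})
limit_1 = iterate(start_empty, 5)
limit_2 = iterate(start_other, 5)
```

Up to the cutoff, both starting languages are driven to the same set of words $a^k b^k$, which is what uniqueness of the fixed point predicts; the empty starting language gives the Kleene sequence.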
EXERCISES FOR SECTION 9.4
1. Let G be the grammar
Construct a Λ-free grammar G' with $L(G') = L(G) - \{\Lambda\}$.
2. Corresponding to the grammar G of Exercise 1 define
$\psi_G: 2^{X^+} \to 2^{X^+}$, $X = \{a\}$, by $\psi_G(V) = \Lambda + aV$.
Show that $\psi_G$ is a contraction even though G is not Λ-free.
3. Study the proof of Theorem 2 to give an example of a grammar which is not Λ-free for which $\psi_G$ is not a contraction.
4. In the context of Example 6.4.3, define $\psi_G$ and compute $\psi_G^n(V_1, V_2, V_3)$ for n = 1, 2, 3, for arbitrary $V_1$, $V_2$, $V_3$. Convince yourself that as n gets large the sequence converges to the same languages as the Kleene sequence of 6.4.6.
Notes and References for Chapter 9
To prove that many metric spaces, including those of Examples 9.1.3-4 are complete,
it is necessary to know that the set R of real numbers with the usual metric d(x, y) =
$|y - x|$ is complete. This is equivalent to the "least upper bound axiom" about R as a
totally ordered set which asserts that every nonempty set A of reals which has an
upper bound has a least upper bound. An "algorithm" to prove that the least upper
bound axiom is true would construct LUB(A) as follows. As A has an upper bound
there exists a smallest integer n with a < n + 1 for all a ∈ A. To continue, choose decimal digits $d_1, d_2, d_3, \ldots$ so that $d_1$ is as small as possible with $a < n + .d_1 + .1$ for all a, $d_2$ is as small as possible with $a < n + .d_1 d_2 + .01$ for all a, and so on. The desired least upper bound is $n + .d_1 d_2 d_3 \ldots$. This is not a constructive algorithm because, say, to choose $d_1$ requires "searching" through all of A even though A may
be infinite. Ultimately, then, one must look harder at the foundations of the real
numbers-and this is why the least upper bound axiom is an axiom. To underscore
the overall importance of this issue, we refer the reader to a rigorous calculus text for
proofs that each of the theorems in the list below requires the statements beneath it:
Taylor's Theorem
|
if df/dx = 0 then f is constant
|
the mean value theorem
|
the maximum value theorem
|
R satisfies the least upper bound axiom
For more about the Manhattan metric see E. Krause, Taxicab Geometry, Addison-Wesley, 1975, who calls it the taxicab metric.
A more leisurely and more general exposition of the material in Section 9.2 may be
found in L. Padulo and M. A. Arbib, System Theory, Saunders, 1974.
The theory of Section 3 was introduced by M. Nivat in the early 1970s, building on the ideas of M. Schützenberger. Many others contributed later. For references we refer the reader to S. L. Bloom, "All solutions of a system of recursion equations in infinite trees and other contraction theories," Journal of Computer and System Sciences, 27, 1983, pp. 225-255 (which also carries out the program of 9.3.24 for many categories) and to I. Guessarian, Algebraic Semantics, Lecture Notes in Computer Science, Vol. 99, Springer-Verlag, 1981. We note that Guessarian's approach uses order semantics with the Kleene sequence, rather than the Banach fixed point theorem. (Associate a function $f(t)$ with each tree $t$, setting all variables to $\bot$. Given the sequence $t_1, t_2, t_3, \ldots$ of trees with functions $f(t_1), f(t_2), \ldots$ one shows that $f(t_1) \le f(t_2) \le \cdots$ provides the desired Kleene sequence.)
For a more rigorous treatment of infinite trees, see the book by Guessarian just cited as well as C. C. Elgot, S. L. Bloom, and R. Tindell, "On the algebraic structure of rooted trees," Journal of Computer and System Sciences, 16, 1978, pp. 362-399.
Our treatment in Section 4 is motivated by the general theory of formal power series in noncommuting variables as initiated by M. Schützenberger in such papers as, "On a theorem of Jungen," Proceedings of the American Mathematical Society, 13, 1962, pp. 885-890. A useful exposition and further references are given in Chapter 9 of G. Lallement, Semigroups and Combinatorial Applications, John Wiley & Sons, 1979.
PART 3
DATA TYPES
CHAPTER 10
Functors
10.1 Data Types Lead to Functors
10.2 Fixed Points of Functors
Modern programming languages employ a variety of data structures which
are (at least) sets of "structured elements" equipped with functions that allow
combining, manipulating, and querying of such elements. A very rich supply
of data types may be built-either directly or recursively-from finite sets
using finite products and coproducts. Thus, if A is a character alphabet,
$A \times A \times A$ is "length 3 character arrays." If $pr_i: A \times A \times A \to A$ is the $i$th projection, $pr_i(w)$ evaluates the $i$th coordinate of $w$, providing the semantics of what would be written as w[i] in many programming languages. A precise definition of length 3 character arrays should specify exactly which maps such as the $pr_i$ should belong to the structure. For a recursive example, define "natural number" by
0 is a natural number
if n is a natural number, n0 is a natural number
which is captured by the recursive specification
N ::= {0} + N × {0},
where we represent each number n by a string of zeros of length n + 1. A
central concern will be to provide precise semantics to such specifications.
We build new data types, then, by applying such building operations to
old data types. But the old data types are not necessarily just sets-it may be
advantageous to remember some of their associated maps in the building
process. As such, the building operations we use have to act on sets and on
maps. The appropriate notion here is "functor."
In the first section we motivate how data types lead to functors. The second section introduces the least and greatest fixed points of an endofunctor and presents examples of data types which may be described as such fixed points.
10.1 Data Types Lead to Functors
Consider the program flowcharted in Figure 1. This program receives a
natural number n ≥ 1 as input and outputs the nth prime, 2 being the first.
This program employs lists of primes as data structures. At any time, PL
holds the list of primes found so far, while NP holds the odd integer which is
a candidate for the next prime. This next prime is found as the first odd
number not yet on the list PL which is not divisible by any of the primes
already on the list.
In this section we explore how the need for lists of integers is related to the
notion of functor.
Let N be the set of integers. (Ultimately, we will discover which properties of N we need and shall "derive these from scratch.") The set $N^+$ of (nonempty) lists of integers has as elements all finite tuples $\langle n_1, \ldots, n_k \rangle$ with each $n_i \in N$ and $k > 0$. The program of Figure 1 makes explicit use of the following functions:
1
which we use to treat the length 4 list $\langle 2, 3, 5, 7 \rangle$ as dynamic: its length may change.
2    $\langle m, \langle n_1, \ldots, n_k \rangle \rangle \mapsto \begin{cases} n_m & \text{if } 1 \le m \le k \\ \bot & \text{else,} \end{cases}$
which we use to extract one entry at a time from PL for comparison purposes.
3    $\langle n_1, \ldots, n_k \rangle \mapsto n_k,$
which we use to test the end of the list.
4    $\langle \langle n_1, \ldots, n_k \rangle, m \rangle \mapsto \langle n_1, \ldots, n_k, m \rangle,$
which we use to add a new element at the end of the list.
5    $\langle n_1, \ldots, n_k \rangle \mapsto k,$
which we use to find the length of the list.
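For readers who prefer code, the maps 2 to 5 can be sketched as ordinary functions on nonempty tuples. This is only an informal illustration: the function names are hypothetical, and `None` is assumed to stand in for the undefined value $\bot$.

```python
def entry(m, lst):
    """Map 2: the m-th entry of lst (counting from 1), or None if out of range."""
    return lst[m - 1] if 1 <= m <= len(lst) else None


def last(lst):
    """Map 3: the final entry of a nonempty list."""
    return lst[-1]


def append(lst, m):
    """Map 4: the list lst with m added at the end."""
    return lst + (m,)


def length(lst):
    """Map 5: the number of entries of lst."""
    return len(lst)


PL = (2, 3, 5, 7)          # the prime list after four primes have been found
PL_next = append(PL, 11)   # the list grows when the next prime is found
```

Tuples are used here because they are immutable, which matches the reading of 4 as a function producing a new list rather than an update in place.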
Figure 1 A flowchart for finding the nth prime: PL is "prime list" and NP is "next prime candidate."
Let us now attempt a recursive definition of $N^+$ which will also, ultimately, let us define the maps 1-5 (in 10.2.16 and 11.1.14 below). Define L (hopefully $= N^+$) recursively by
6    L ::= N + L × N.
What does this mean? Let us create an analogy with our earlier work in order semantics and write
7    $\psi(L) = N + L \times N.$
The general form of such $\psi$ is as a "mapping" from the category Set to itself.
category C(P,:s;); and let the assertions that x ~ y become the morphisms
x -+ y of C(P,:s;)' In other words, given objects x and y of C(p,::;) there is at
most one morphism x -+ y, and there is such a morphism if and only if x ~ y
in P. It can then be checked that morphisms compose appropriately (x -+ y
and y -+ z yield x -+ z by transitivity) and identity morphisms are defined
(x -+ x exists by reflexivity). In what follows, we will gain much motivation by
lifting posets from (P, ~) to C(P,::;) and then studying an analogous concept
that is available in every category C.
As a first example, the least element $\bot$ of a poset ($\bot \le x$ for all x in P) generalizes to the initial object 0 for categories. (There is a unique arrow $0 \to X$ for every object X in C.) The ascending chain
$\bot \le \psi(\bot) \le \psi^2(\bot) \le \cdots$
generalizes to a suitable diagram of morphisms
8    $0 \to \psi(0) \to \psi^2(0) \to \cdots$
We shall see in Chapter 11 that this diagram has a "colimit," unique up to isomorphism, which generalizes the least upper bound $\bigvee(\psi^n(\bot))$. It will also be seen in Chapter 11 that the Kleene fixed point theorem 6.2.13 generalizes and that the $\psi$ of 7 is "continuous" and so has a "least fixed point" which, moreover, is $N^+$. For the present, we proceed intuitively and anticipate these later results.
Regard 7 as asserting "a list is a natural number or a list followed by a
natural number." Thus, a natural number is a list, and stage 1 yields:
\[ N = \psi(\emptyset) \subset L. \]
At the $(n + 1)$st stage, a list is either in $N$ or is $\langle l, n\rangle$, where $l$ is created at the
$n$th stage. Clearly, then, $\psi^n(\emptyset)$ represents the lists created after $n$ stages. Now
observe the following:
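9 Distributive Law of Set Theory.
\[ (A_1 + \cdots + A_k) \times B = (A_1 \times B) + \cdots + (A_k \times B) \qquad (k \ge 2). \]
Indeed, a typical element of the left side is of the form $\langle\langle i, a\rangle, b\rangle$ with $a \in A_i$,
$b \in B$ whereas a typical element of the right side is $\langle i, \langle a, b\rangle\rangle$ with $a \in A_i$, $b \in B$
so that $\langle\langle i, a\rangle, b\rangle \mapsto \langle i, \langle a, b\rangle\rangle$ is the desired isomorphism. Using 9, we
compute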
10
\[ \begin{aligned}
\psi(\emptyset) &= N \\
\psi^2(\emptyset) &= N + N \times N = N + N^2 \\
\psi^3(\emptyset) &= N + (N + N \times N) \times N = N + [(N \times N) + (N \times N \times N)] = N + N^2 + N^3 \\
\psi^4(\emptyset) &= N + (N + N^2 + N^3) \times N = N + N^2 + N^3 + N^4 \\
\psi^n(\emptyset) &= N + N^2 + \cdots + N^n.
\end{aligned} \]
If 7 is interpreted via 8 as saying that lists are exactly those objects that arise
from natural numbers by finite application of $\psi$, then the set $L$ we seek is the
union of the $\psi^n(\emptyset)$ which, up to isomorphism, is indeed $N^+$. The derivation
of the maps 1, ..., 5 we postpone for later sections when it will have become
more clear which tools we can use.
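The stages $\psi^n(\emptyset)$ can be computed explicitly once $N$ is replaced by a finite stand-in. The sketch below is only an illustration: the two-element set `N`, the tags 0 and 1 for the two summands of the coproduct, and the helper `flatten` are all assumptions made for the example.

```python
def psi(L, N):
    """psi(L) = N + L x N, with tags 0 and 1 marking the two summands."""
    return {(0, n) for n in N} | {(1, (l, n)) for l in L for n in N}

def flatten(x):
    """Read a tagged element as the list of numbers it encodes."""
    tag, body = x
    if tag == 0:
        return (body,)
    l, n = body
    return flatten(l) + (n,)

N = {0, 1}      # assumption: a finite stand-in for the natural numbers
stage = set()   # psi^0 of the empty set is empty
sizes = []
for _ in range(4):
    stage = psi(stage, N)
    sizes.append(len(stage))

# After four stages we expect the lists of length 1 to 4 over N.
lists = {flatten(x) for x in stage}
```

With $|N| = 2$ the sizes come out as $2$, $2 + 4$, $2 + 4 + 8$, and $2 + 4 + 8 + 16$, in agreement with $\psi^n(\emptyset) = N + N^2 + \cdots + N^n$.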
We next consider the following construction.
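11 Whenever $f: A \to B$ is a total function, there is an induced total function
$\psi(A) \to \psi(B)$ which we shall call $\psi(f)$ defined by
\[ \psi(f): N + A \times N \to N + B \times N, \qquad x \mapsto \begin{cases} x & \text{if } x \in N \\ (f(a), n) & \text{if } x = (a, n) \in A \times N. \end{cases} \]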
This construction is a generalization of the notion of monotone map, namely,
one for which $x \le y$ implies $\psi(x) \le \psi(y)$. When a poset is viewed as a
category, we write this as saying that $\psi$ is monotone if $x \to y$ implies
$\psi(x) \to \psi(y)$. The difference in 11 is that there are many functions
$\psi(A) \to \psi(B)$ so a specific one had to be singled out. The definition of the
Kleene sequence
generalizes immediately to 11 to yield the diagram
12
\[ \emptyset \xrightarrow{\ !\ } \psi(\emptyset) = N \xrightarrow{\ \psi(!)\ } \psi^2(\emptyset) = N + N \times N \xrightarrow{\ \psi^2(!)\ } \cdots, \]
which is essentially the series of inclusions
\[ \emptyset \subset N \subset N + N^2 \subset \cdots \subset N + N^2 + \cdots + N^n \subset \cdots, \]
since it is easily checked that the composition
\[ N + N^2 + \cdots + N^n \xrightarrow{\ \alpha\ } \psi^n(\emptyset) \xrightarrow{\ \psi^n(!)\ } \psi^{n+1}(\emptyset) \xrightarrow{\ \beta\ } N + \cdots + N^{n+1}, \]
where $\alpha$, $\beta$ are the isomorphisms of 10, is just the inclusion map. Up to
isomorphism, $\psi^n(!)$ is the inclusion map. In a general category with an initial
object, a construction such as 11 will give rise to a diagram like 12. It is time
then for the formal definition.
13 Definition. Let $\mathbf{C}$, $\mathbf{D}$ be categories. A functor $\psi: \mathbf{C} \to \mathbf{D}$ is given by the
following data and axioms.
Datum i. For each object $C$ in $\mathbf{C}$, $\psi$ specifies an object $\psi C$ of $\mathbf{D}$.
Datum ii. For each morphism $f: C_1 \to C_2$ in $\mathbf{C}$, $\psi$ specifies a morphism
$\psi f: \psi C_1 \to \psi C_2$ of $\mathbf{D}$.
Axiom a. If $f: C_1 \to C_2$, $g: C_2 \to C_3$ in $\mathbf{C}$, so that $\psi f: \psi C_1 \to \psi C_2$,
$\psi g: \psi C_2 \to \psi C_3$, $\psi(gf): \psi C_1 \to \psi C_3$ in $\mathbf{D}$, then $\psi(gf) = (\psi g)(\psi f)$.
Axiom b. For each object $C$ of $\mathbf{C}$, $\psi\,\mathrm{id}_C = \mathrm{id}_{\psi C}$.
Axioms a and b are called the functoriality axioms.
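For the construction of 11 the two axioms can be spot-checked on sample elements. The following Python sketch is not a proof, and it rests on assumptions of our own: elements of $N + A \times N$ are tagged with 0 or 1, and the maps `f` and `g` are arbitrary examples.

```python
def psi_map(f):
    """psi(f): N + A x N -> N + B x N; fixes the summand N, applies f inside A x N."""
    def lifted(x):
        tag, body = x
        if tag == 0:          # x lies in the summand N
            return x
        a, n = body           # x = (a, n) lies in the summand A x N
        return (1, (f(a), n))
    return lifted

def compose(g, f):
    """Ordinary composition gf of functions."""
    return lambda x: g(f(x))

f = lambda a: a + 1   # sample f: A -> B, chosen only for illustration
g = lambda b: 2 * b   # sample g: B -> C, chosen only for illustration
samples = [(0, 7), (1, (3, 5)), (1, (0, 0))]

# Axiom a: psi(gf) = (psi g)(psi f) on each sample element.
axiom_a = all(
    psi_map(compose(g, f))(x) == compose(psi_map(g), psi_map(f))(x)
    for x in samples
)
# Axiom b: psi(id) = id on each sample element.
axiom_b = all(psi_map(lambda a: a)(x) == x for x in samples)
```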
We saw in 2.1.14 that a monoid homomorphism $f: (M, *, e) \to (M', *', e')$
is a map $f: M \to M'$ with the additional properties
\[ f(m_1 * m_2) = f(m_1) *' f(m_2), \qquad f(e) = e', \]
the first of which generalizes to axiom a, while the second generalizes
to axiom b. To make this generalization more explicit, associate with the
monoid $(M, *, e)$ the category $\mathbf{C}_{(M,*,e)}$ which has only one object, call it
$X$, while $M$ is the set of morphisms $X \to X$, with composition
$X \xrightarrow{m_1} X \xrightarrow{m_2} X = X \xrightarrow{m_2 * m_1} X$. It can be easily checked that a map $f: M \to M'$
satisfies the conditions for a homomorphism $(M, *, e) \to (M', *', e')$ if and
only if it satisfies the axioms for a functor $\mathbf{C}_{(M,*,e)} \to \mathbf{C}_{(M',*',e')}$.
A functor is thus a "homomorphism of categories," being the obvious
notion of "structure-preserving mapping" between categories. The notion is
central in category theory generally. An endofunctor is a functor of form
C --+ C, mapping a category C to itself, and it is this sort of functor that arises
in recursive specification.
Axioms a and b are both assertions that two appropriately constructed
morphisms with common domain and codomain are equal. This is automatically true in any poset since there is at most one morphism between any
two objects when a poset is viewed as a category. The following is then
obvious:
14 When posets are regarded as categories, "monotone map" is the same
concept as "functor."
In the discussion following 2.2.6, we stated that a major aspect of the
philosophy of category theory is that "isomorphism" formalizes "abstractly
the same." Thus, if $\psi$ is to be regarded as an "abstract construction" (which
constructs $\psi(A)$ from $A$) then at the very least $\psi(A)$ and $\psi(B)$ should be
isomorphic whenever $A$ and $B$ are. This is a consequence of axioms a and b
as follows.
15 Theorem. If $\psi: \mathbf{C} \to \mathbf{D}$ is a functor and if $f: C_1 \to C_2$ is an isomorphism in $\mathbf{C}$,
then $\psi f: \psi C_1 \to \psi C_2$ is an isomorphism in $\mathbf{D}$.
PROOF. Let $g = f^{-1}$. Since $gf = \mathrm{id}_{C_1}$, axioms a and b yield $(\psi g)(\psi f) =
\psi(gf) = \psi(\mathrm{id}) = \mathrm{id}$. Similarly, $(\psi f)(\psi g) = \mathrm{id}$. Thus, $\psi g = (\psi f)^{-1}$. □
We close this section with a number of examples of functors as well as
some tools to construct new such functors from old ones. Unprovided verifications are easy exercises for the reader.
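16 Constant Functors. Let $\mathbf{C}$, $\mathbf{D}$ be categories and let $D$ be an object of $\mathbf{D}$.
Define a functor
\[ \hat{D}: \mathbf{C} \to \mathbf{D}, \qquad (f: C_1 \to C_2) \mapsto (\mathrm{id}_D: D \to D) \]
(the notation is shorthand for $\hat{D}(C) = D$ while $\hat{D}(f) = \mathrm{id}_D$ for every $f: C_1 \to
C_2$). $\hat{D}$ is a functor called the functor constantly $D$. A functor of form $\hat{D}$ is
called a constant functor.
17 Identity Functors. If $\mathbf{C}$ is any category, the identity functor of $\mathbf{C}$ is given
by
\[ \mathbf{C} \xrightarrow{\ \mathrm{id}_{\mathbf{C}}\ } \mathbf{C}. \]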
18 Products of Functors. Let $\mathbf{D}$ be a category with finite products. Given
$f_i: D_i \to E_i$ in $\mathbf{D}$, define $f_1 \times \cdots \times f_n$ to be the unique morphism as shown
($i = 1, \dots, n$).
Now let $\mathbf{C}$ be any category, and let $\psi_1, \dots, \psi_n$ be functors $\mathbf{C} \to \mathbf{D}$. Define
a functor $\psi_1 \times \cdots \times \psi_n: \mathbf{C} \to \mathbf{D}$ by
\[ (f: C_1 \to C_2) \mapsto \bigl(\psi_1 f \times \cdots \times \psi_n f: \psi_1 C_1 \times \cdots \times \psi_n C_1 \to \psi_1 C_2 \times \cdots \times \psi_n C_2\bigr). \]
That functoriality follows from that of each $\psi_i$ is a straightforward exercise
on Cartesian products. (But note that we must choose a definite product
$D_1 \times \cdots \times D_n$ in $\mathbf{D}$, at least whenever $D_i = \psi_i C$.)
We may now present the dual constructions involving coproducts.
19 Coproducts of Functors. Let $\mathbf{D}$ be a category with finite coproducts.
Given $f_i: D_i \to E_i$ in $\mathbf{D}$, define $f_1 + \cdots + f_n$ to be the unique morphism as
shown:
20
($i = 1, \dots, n$).
(Recall that, working in the context of flow schemes in 2.3.22, we called this
the parallel construction and denoted it by $f_1 \parallel \cdots \parallel f_n$.) In Set, we would
have $(f_1 + \cdots + f_n)((d_i, i)) = (f_i(d_i), i)$. In the dual category $\mathbf{D}^{op}$ this is
just $f_1 \times \cdots \times f_n: E_1 \times \cdots \times E_n \to D_1 \times \cdots \times D_n$. It is easily seen that
$f_1 + \cdots + f_n$ is the identity when each $f_i$ is an identity and, if $g_i: E_i \to F_i$, then
$(g_1 + \cdots + g_n)(f_1 + \cdots + f_n) = (g_1 f_1) + \cdots + (g_n f_n)$ together with the dual
results for $\times$. Thus, dualizing 18, given functors $\psi_1, \dots, \psi_n: \mathbf{C} \to \mathbf{D}$ from any
category $\mathbf{C}$, we may define $\psi_1 + \cdots + \psi_n: \mathbf{C} \to \mathbf{D}$ by
\[ (f: C_1 \to C_2) \mapsto \bigl(\psi_1 f + \cdots + \psi_n f: \psi_1 C_1 + \cdots + \psi_n C_1 \to \psi_1 C_2 + \cdots + \psi_n C_2\bigr) \]
and it is functorial. (Warning: Do not confuse $+$ with partially additive sum!)
21 Composition of Functors. Let $\psi_1: \mathbf{C} \to \mathbf{D}$, $\psi_2: \mathbf{D} \to \mathbf{E}$ be arbitrary functors.
Their composition
\[ \psi_2\psi_1: \mathbf{C} \to \mathbf{E} \]
is defined by $(\psi_2\psi_1)(C) = \psi_2(\psi_1(C))$, $(\psi_2\psi_1)(f) = \psi_2(\psi_1(f))$. Functoriality
of $\psi_2\psi_1$ is routinely established.
In algebra, we speak of a polynomial as a function of the form $a_n x^n +
a_{n-1} x^{n-1} + \cdots + a_1 x + a_0$. It is built up from constants and $x$ using repeated
addition and multiplication. We may think of x as a notation for the identity
function. Then we can generalize the notion of polynomial to obtain the
following:
22 Polynomial Functors. Let $\mathbf{C}$ be a category with finite products and
coproducts. A polynomial functor $\mathbf{C} \to \mathbf{C}$ is any functor which can be constructed from constant functors or the identity functor through the use of
the product, coproduct, or composition operations of 18, 19, or 21. More
formally, we have the following:
Basis Step: The identity functor $\mathbf{C} \to \mathbf{C}$ and any constant functor $\mathbf{C} \to \mathbf{C}$
are polynomial functors $\mathbf{C} \to \mathbf{C}$.
Induction Step: If $\psi_1$ and $\psi_2$ are polynomial functors $\mathbf{C} \to \mathbf{C}$, so too are
$\psi_2\psi_1$, $\psi_2 + \psi_1$, and $\psi_2 \times \psi_1$.
23 Example. $\psi: \mathbf{Set} \to \mathbf{Set}$ with $\psi(L) = N + L \times N$ as in 7 is a polynomial
functor. Define $\psi_1 = \mathrm{id} \times N$, $\psi_2 = N + \mathrm{id}$. Then, $\psi = \psi_2\psi_1$:
\[ \psi_2\psi_1(L) = \psi_2(L \times N) = N + L \times N. \]
When a distributive law such as 9 holds in the category $\mathbf{C}$, a polynomial
functor $\psi: \mathbf{C} \to \mathbf{C}$ may be put into the canonical form
\[ \psi(A) \cong C_0 + (C_1 \times A) + (C_2 \times A^2) + \cdots + (C_n \times A^n), \]
but a formal proof requires a formal definition of "isomorphism of functors"
$\cong$. See Exercises 5 and 7.
We also offer some examples of functors Set ---+ Set which are, at least
apparently, not polynomial.
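24 Example. Define $\psi: \mathbf{Set} \to \mathbf{Set}$ by $\psi A = 2^A$, $\psi f: \psi A \to \psi B$, $S \mapsto
\{f(x): x \in S\}$. Similarly, $\Gamma: \mathbf{Set} \to \mathbf{Set}$, $\Gamma A = \{S: S \text{ is a finite subset of } A\}$,
$\Gamma f: \Gamma A \to \Gamma B$, $S \mapsto \{f(x): x \in S\}$. See Exercise 10.
25 Example. Two functors can be the same on objects yet differ on
morphisms. Define $\psi_1: \mathbf{Set} \to \mathbf{Set}$ by $\psi_1 A = 2^A$, but $\psi_1 f: \psi_1 A \to \psi_1 B$,
$S \mapsto \{y \in B: f^{-1}(y) \subset S\}$.
26 Example. Define $(-)^+: \mathbf{Set} \to \mathbf{Set}$ by $A^+ = \{\langle a_1, \dots, a_k\rangle : k \ge 1,\ a_i \in A\}$,
the set of finite nonempty lists in $A$, and define $f^+: A^+ \to B^+$ by
$f^+\langle a_1, \dots, a_k\rangle = \langle f(a_1), \dots, f(a_k)\rangle$.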
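The three actions on morphisms in Examples 24 to 26 can be compared on a small case. The Python sketch below is illustrative only: the sets `A` and `B`, the map `parity`, and the function names are assumptions introduced for the example, and finite `frozenset` values stand in for subsets.

```python
def direct_image(f):
    """Example 24: S |-> {f(x) : x in S}."""
    return lambda S: frozenset(f(x) for x in S)

def universal_image(f, A, B):
    """Example 25: S |-> {y in B : the preimage of y under f is contained in S}."""
    def lifted(S):
        return frozenset(
            y for y in B if all(x in S for x in A if f(x) == y)
        )
    return lifted

def list_map(f):
    """Example 26: f+ applies f to each entry of a nonempty list."""
    return lambda xs: tuple(f(x) for x in xs)

A = {0, 1, 2}
B = {"even", "odd", "unused"}
parity = lambda x: "even" if x % 2 == 0 else "odd"
S = frozenset({0, 1})

image_24 = direct_image(parity)(S)
image_25 = universal_image(parity, A, B)(S)
```

On the subset $S = \{0, 1\}$ the two power set constructions disagree: the first yields both parities, while the second drops "even" (since 2 is not in $S$) and keeps "unused" (whose preimage is empty).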
EXERCISES FOR SECTION 10.1
1. Construct an isomorphism $N^+ \cong N + N^+ \times N$. It is in this sense that $N^+$ is a
"fixed point" of the $\psi$ of 7.
2. Let $N^{++}$ be the set of all nonempty finite or infinite lists of natural numbers. Show
that there exists an isomorphism $N^{++} \cong N + N^{++} \times N$. Thus, $N^+$ is not the
only fixed point of the $\psi$ of 7.
3. Describe $\psi f$ explicitly for the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$ of 7.
4. Prove that the composition of functors as in 21 is again functorial, so is a functor.
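5. Let $\varphi, \psi: \mathbf{C} \to \mathbf{D}$ be functors. A natural transformation $\eta: \varphi \to \psi$ is a family of
$\mathbf{D}$-morphisms of form $\eta C: \varphi C \to \psi C$, one for each object $C$ of $\mathbf{C}$, such that for
each $\mathbf{C}$-morphism $f: C_1 \to C_2$ the following "naturality square" commutes:
\[ \begin{array}{ccc}
\varphi C_1 & \xrightarrow{\ \eta C_1\ } & \psi C_1 \\
{\scriptstyle \varphi f}\downarrow & & \downarrow{\scriptstyle \psi f} \\
\varphi C_2 & \xrightarrow{\ \eta C_2\ } & \psi C_2
\end{array} \]
The functor category $\mathbf{D}^{\mathbf{C}}$ has as objects all functors $\mathbf{C} \to \mathbf{D}$ and natural transformations as morphisms with composition $(\eta'\eta)C = (\eta'C)(\eta C)$ (the right-hand
side being $\mathbf{D}$-composition) and identities $\mathrm{id}_\psi: \psi \to \psi$, $(\mathrm{id}_\psi)C = \mathrm{id}_{\psi C}$.
(i) Prove that the composition of natural transformations is a natural transformation and that $\mathrm{id}_\psi$ above is always a natural transformation.
(ii) Complete the proof that $\mathbf{D}^{\mathbf{C}}$ is a category.
(iii) Two functors $\mathbf{C} \to \mathbf{D}$ are isomorphic or naturally equivalent if they are isomorphic in $\mathbf{D}^{\mathbf{C}}$. Prove that $\eta: \varphi \to \psi$ is an isomorphism in $\mathbf{D}^{\mathbf{C}}$ if and only
if each $\eta C$ is an isomorphism in $\mathbf{D}$. [Hint: Prove that $(\eta C)^{-1}$ is a natural
transformation.]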
6. Show that $\mathrm{pr}_i: \psi_1 \times \cdots \times \psi_n \to \psi_i$ defined as the projection
$\mathrm{pr}_i C: \psi_1 C \times \cdots \times \psi_n C \to \psi_i C$ defines a natural transformation.
7. Show that $\varphi, \psi: \mathbf{Set} \to \mathbf{Set}$ are isomorphic if $\varphi(A) = C_0 + (A \times (C_1 \times A))$ and
$\psi(A) = C_0 + (C_1 \times A^2)$.
8. Let 1 be a one element set and let $\psi: \mathbf{Set} \to \mathbf{Set}$ be a functor.
(i) For each $x \in \psi 1$ show that $\eta_x: \mathrm{id} \to \psi$ is a natural transformation if
$\eta_x C: C \to \psi C$ maps each $c \in C$ to $\psi(c)(x)$. Here, we regard $c \in C$ as the
function $1 \to C$ mapping the unique element of 1 to $c$ so that $\psi(c): \psi(1) \to
\psi(C)$.
(ii) Show that $x \mapsto \eta_x$ is a bijection between $\psi 1$ and $\mathbf{Set}^{\mathbf{Set}}(\mathrm{id}, \psi)$. [Hint: The
inverse to $x \mapsto \eta_x$ is $\eta \mapsto \eta 1$.]
9. Let $X$ be any set, and let $f: X \to 2^X$ be any function. Define $A \in 2^X$ by
\[ A = \{x \in X: x \notin f(x)\}. \]
Prove that no $x_0 \in X$ exists with $f(x_0) = A$. This establishes Cantor's diagonal
argument: no surjection exists from $X$ to $2^X$. It follows that there is no largest set:
every set has more subsets than elements.
10. It is a basic fact of set theory that the polynomial functor $\psi$ of 23 admits an
isomorphism $A \cong \psi(A)$ for every infinite set $A$ bigger than $C_0 + \cdots + C_n$. Assuming this, use Exercise 9 to show that $\psi(A) = 2^A$ as in 24 is not a polynomial
functor.
11. Generalizing 3.2.24, a functorial iterate on a category $\mathbf{C}$ with finite coproducts
assigns to each $(A, B, f)$ with $f: A \to A + B$ a morphism $f^\dagger: A \to B$ subject to
two axioms:
(a)
f
A
If
l
(b) Given f: A
---+
(A
g
A
then
Ih+k
lC+D
hI
C
A+B
A
hi
C
+ B,
+B
, (A
+ B) + B)t =
A
+B
<It. ids)
,
B.
(i) Define appropriate categories so that $f \mapsto f^\dagger$ describes the action on objects
of a functor. This explains the terminology.
(ii) Show that the partially additive iterate of 3.2.24 is indeed a functorial iterate.
Draw flowschemes to express axioms (a) and (b). [Hint: See Exercise 1.5.12.]
(iii) Show that any functorial iterate satisfies the Elgot iteration equation
\[ f^\dagger = \Bigl( A \xrightarrow{\ f\ } A + B \xrightarrow{\ \langle f^\dagger, \mathrm{id}_B\rangle\ } B \Bigr). \]
(iv) Show that if a functorial iterate exists, $\mathbf{C}$ has zero maps with $0: A \to B$ being
$(\mathrm{in}_1: A \to A + B)^\dagger$.
(v) Prove that the usual iterate is the unique functorial one in Pfn. [Hint: If
$f^\dagger$ is the usual one and $f^*$ is another, $f^\dagger \le f^*$ by Kleene semantics. To show
$DD(f^*) \subset DD(f^\dagger)$ show that $f^*h = 0$ if $h: A \to A$ is the guard function for
the complement of $DD(f^\dagger)$.]
It is an open question at the time of this writing if (v) holds for an arbitrary
partially additive category.
10.2 Fixed Points of Functors
In the previous section, we introduced recursive specification of data types by
considering the set $N^+$ of nonempty lists of integers to be a solution (see
Exercise 10.1.1) $L$ of the equation
\[ L ::= N + L \times N. \]
This led us to introduce the general concept of a functor, of which an example
$\psi: \mathbf{Set} \to \mathbf{Set}$ was defined by
\[ \psi(L) = N + L \times N, \]
while for $f: A \to B$
\[ \psi(f): \psi(A) \to \psi(B), \qquad n \mapsto n, \quad (a, n) \mapsto (f(a), n). \]
Thus, $N^+$ could be seen as a fixed point in some sense of the endofunctor $\psi$.
In this section we define least and greatest fixed points for endofunctors
generally. The definitions generalize definitions familiar on posets to categories. It will be proved in the next chapter that every polynomial endofunctor of Set has special such fixed points, but some examples will be
explored herein to introduce the idea that any solution of the fixed point
equation comes equipped with associated functions.
1 Definition. Let $\mathbf{C}$ be an arbitrary category and let $\psi: \mathbf{C} \to \mathbf{C}$ be an endofunctor of $\mathbf{C}$. A fixed point of $\psi$ is a pair $(A, \delta)$ with $\delta: \psi A \to A$ an isomorphism. There are two notions of "pre-fixed point" that we emphasize. A
$\psi$-algebra is a pair $(A, \delta)$ with $\delta: \psi A \to A$ an arbitrary morphism, whereas
a $\psi$-coalgebra is a pair $(A, \Delta)$ with $\Delta: A \to \psi A$ any morphism. If $(A, \delta)$, $(B, \gamma)$
are $\psi$-algebras, a morphism of $\psi$-algebras $f: (A, \delta) \to (B, \gamma)$ is a $\mathbf{C}$-morphism
$f: A \to B$ such that the square
\[ \begin{array}{ccc}
\psi A & \xrightarrow{\ \delta\ } & A \\
{\scriptstyle \psi f}\downarrow & & \downarrow{\scriptstyle f} \\
\psi B & \xrightarrow{\ \gamma\ } & B
\end{array} \]
commutes, that is, $f\delta = \gamma(\psi f)$.
Then $\mathrm{id}_A: (A, \delta) \to (A, \delta)$ is a morphism because $\psi\,\mathrm{id}_A = \mathrm{id}_{\psi A}$ and if $f:
(A, \delta) \to (B, \gamma)$, $g: (B, \gamma) \to (C, \zeta)$ are morphisms then $gf: (A, \delta) \to (C, \zeta)$
is a morphism as $\psi(gf) = (\psi g)(\psi f)$. This gives rise to the category $\psi$-Alg of
$\psi$-algebras. Note the crucial role of the functoriality of $\psi$.
Similarly, if $(A, \Delta)$, $(B, \Gamma)$ are $\psi$-coalgebras, a morphism $f: (A, \Delta) \to (B, \Gamma)$
of $\psi$-coalgebras is a $\mathbf{C}$-morphism $f: A \to B$ with
\[ \begin{array}{ccc}
A & \xrightarrow{\ \Delta\ } & \psi A \\
{\scriptstyle f}\downarrow & & \downarrow{\scriptstyle \psi f} \\
B & \xrightarrow{\ \Gamma\ } & \psi B
\end{array} \]
commuting, that is, $\Gamma f = (\psi f)\Delta$,
and this gives rise to the category $\psi$-coAlg of $\psi$-coalgebras. In the case
$\mathbf{C} = \mathbf{C}_{(P,\le)}$, we already know that a functor $\psi: \mathbf{C}_{(P,\le)} \to \mathbf{C}_{(P,\le)}$ is just a
monotone map $\psi: (P, \le) \to (P, \le)$. A fixed point of $\psi$ in the sense of 1 is
just an $x$ with $\psi(x) = x$. To see this, observe that an isomorphism $x \to y$ is
just an assertion of equality (for if $x \to y$ is an isomorphism, it has inverse
$y \to x$, but then, by antisymmetry of $\le$, we conclude $x = y$ from the assertions $x \le y$ and $y \le x$).
What are algebras and coalgebras in $\mathbf{C}_{(P,\le)}$? A $\psi$-algebra is an $x$ with
$\psi x \le x$, while a $\psi$-coalgebra is an $x$ with $x \le \psi x$. Given two $\psi$-algebras
$\psi x \le x$ and $\psi y \le y$, a morphism $f: x \to y$ of $\psi$-algebras is just an assertion
$x \le y$ (the first square above is meaningful since $\psi$ is monotone and commutes because in a poset all diagrams do!); similarly for morphisms of
$\psi$-coalgebras.
Now consider the set of all $\psi$-algebras (still considering the case of a
monotone $\psi: (P, \le) \to (P, \le)$). This is just a subset of $P$ and so inherits the
ordering of $P$. Suppose it has a least element $x_0$ (recall that least element in
$(P, \le)$ generalizes to initial object in a category), that is, $\psi x_0 \le x_0$ and $x_0 \le x$
for any $x$ with $\psi x \le x$. It is a well-known result of poset theory (see Exercise
6.2.4) that this $x_0$ is a fixed point of $\psi$ and so is the least fixed point of $\psi$. This
result generalizes immediately to the following:
2 Theorem. An initial object of $\psi$-Alg is a fixed point of $\psi$. A terminal object of
$\psi$-coAlg is a fixed point of $\psi$.
PROOF. Let $(L, \mu)$ be an initial $\psi$-algebra. Applying $\psi$ to $\mu: \psi L \to L$,
$(\psi L, \psi\mu)$ is a $\psi$-algebra. Obviously $\mu: (\psi L, \psi\mu) \to (L, \mu)$
\[ \begin{array}{ccccc}
\psi L & \xrightarrow{\ \psi f\ } & \psi\psi L & \xrightarrow{\ \psi\mu\ } & \psi L \\
{\scriptstyle \mu}\downarrow & & \downarrow{\scriptstyle \psi\mu} & & \downarrow{\scriptstyle \mu} \\
L & \xrightarrow{\ f\ } & \psi L & \xrightarrow{\ \mu\ } & L
\end{array} \]
is a morphism (the needed commutativity is $\mu(\psi\mu) = \mu(\psi\mu)$). As $(L, \mu)$ is
initial there exists $f: (L, \mu) \to (\psi L, \psi\mu)$ and, as $\mathrm{id}_L$ is the only morphism
$(L, \mu) \to (L, \mu)$, $\mu f = \mathrm{id}_L$. But then $f\mu = (\psi\mu)(\psi f) = \psi(\mu f) = \psi(\mathrm{id}_L) = \mathrm{id}_{\psi L}$.
This shows $\mu$ is an isomorphism with inverse $f$. Dually, if $(G, M)$ is a terminal
coalgebra and $f: (\psi G, \psi M) \to (G, M)$ is the unique morphism, then $f$ is
inverse to $M$. □
We have motivated the following:
3 Definition. Let $\psi: \mathbf{C} \to \mathbf{C}$ be an endofunctor of $\mathbf{C}$. The least fixed point of $\psi$
is the initial object (if it exists) of $\psi$-Alg. The greatest fixed point of $\psi$ is the
terminal object (if it exists) of $\psi$-coAlg.
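In the event that $\psi$ has both a least fixed point $(L, \mu)$ and a greatest fixed
point $(G, M)$, there exist unique $f$, $g$ such that $f: (L, \mu) \to (G, M^{-1})$ is a
morphism of $\psi$-algebras and $g: (L, \mu^{-1}) \to (G, M)$ is a morphism of
$\psi$-coalgebras, that is,
\[ f\mu = M^{-1}(\psi f), \qquad Mg = (\psi g)\mu^{-1}. \]
Noting, however, that
\[ Mf = Mf\mu\mu^{-1} = MM^{-1}(\psi f)\mu^{-1} = (\psi f)\mu^{-1}, \]
we see that $f = g$. The morphism $f: L \to G$ is called the comparison map.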
We underscore that Definition 3 defines what we mean by "least" and
"greatest" in the general case. There is no pre-established ordering. Note that,
in general, the obvious way to attempt to generalize from $\mathbf{C}_{(P,\le)}$ to $\mathbf{C}$ via the
relation $R$ on objects defined by $A\,R\,B$ if there exists $f: A \to B$ in $\mathbf{C}$ is not
antisymmetric and is often not interesting. For example, in Set we have $A\,R\,B$
always holds unless $A$ is nonempty while $B = \emptyset$.
What makes Definition 3 important is that many data structures can be
constructed as least or greatest fixed points of functors which arise naturally
from recursive definitions. As will be clarified by the examples, the commutative squares in Definition 3 are not just technicalities needed to generalize
from posets to categories, but rather embody in highly conceptual form a
framework for recursive definition (of f and of g). The principle here is that a
recursively defined data type should automatically have the capability to
define functions on (or to) it recursively.
We now develop some examples. But first, some useful notation:
4 Language Theory Notations. If $A$ is a set, we have already introduced
\[ A^+ = \{a_1 \cdots a_n : n \ge 1,\ a_i \in A\} = A + (A \times A) + (A \times A \times A) + \cdots, \]
\[ A^* = \{\Lambda\} + A^+ = \{a_1 \cdots a_n : n \ge 0,\ a_i \in A\}. \]
We think of elements of $A^+$, $A^*$ as "words on the alphabet $A$."
If $S$ is a set of words on alphabet $A$ and $T$ is a set of words on alphabet $B$
then define their set concatenation as
\[ ST = \{st : s \in S,\ t \in T\} \]
which is a set of words on the alphabet $A \cup B$. (It is not always the case
that $ST \cong S \times T$. For example, if $S = \{a, ab\}$, $T = \{ab, bab\}$ then $ST =
\{aab, abab, abbab\}$ has only three elements whereas $S \times T$ has four.) Note,
however, that our use of the coproduct in constructing $A^+$, $A^*$ means that an
element of $A^*$ belongs to $A^n$ for a unique $n$. See Exercise 2. Define
\[ A^\infty = \{(a_i : i = 1, 2, 3, \dots) : \text{each } a_i \in A\}. \]
We write elements of $A^\infty$ as "infinitely long words" $a_1 a_2 a_3 \cdots$
so that it is possible to concatenate finite words on the left. For example,
\[ A^n A^\infty = A^\infty \ \text{ for all } n, \qquad A^* A^\infty = A^\infty, \]
but $A^\infty A$ is not defined.
5 Example. Let $A$, $B$ be disjoint sets. Consider the specification "data structure = something in $B$ or else something in $A$ followed by data structure." To
be more precise, let $\psi: \mathbf{Set} \to \mathbf{Set}$ be the polynomial functor
\[ \psi(C) = B + (A \times C). \]
Thus, if $f: C \to D$, $\psi(f)$ maps $b$ to $b$ and maps $(a, c)$ to $(a, f(c))$, that is, $\psi(f) =
\mathrm{id}_B + (\mathrm{id}_A \times f)$. While $A^*B = (\{\Lambda\} + A^+)B = B + A^+B = B + (A^+ \times B)$ (see
Exercise 2) $= \psi(A^*B)$ is a fixed point of $\psi$ (see 9 below), there is a fixed
point of $\psi$ possessing infinite words. Indeed, $\psi$ has a greatest fixpoint
$(A^*B + A^\infty, M)$ as follows. Define $M$ to be the isomorphism
\[ M: A^*B + A^\infty \to B + A \times (A^*B + A^\infty), \]
\[ M(w) = \begin{cases}
b & \text{if } w = \Lambda b \in A^*B \\
\langle a_1, a_2 \cdots a_n b\rangle & \text{if } w = a_1 \cdots a_n b \in A^*B,\ n > 0 \\
\langle a_1, a_2 a_3 \cdots\rangle & \text{if } w = a_1 a_2 a_3 \cdots \in A^\infty.
\end{cases} \]
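To see this, let $\Delta: C \to B + A \times C$ be an arbitrary function. We must show
that there exists unique $g$
6
\[ \begin{array}{ccc}
C & \xrightarrow{\ \Delta\ } & B + A \times C \\
{\scriptstyle g}\downarrow & & \downarrow{\scriptstyle \mathrm{id}_B + \mathrm{id}_A \times g} \\
A^*B + A^\infty & \xrightarrow{\ M\ } & B + A \times (A^*B + A^\infty)
\end{array} \]
Clearly, 6 is equivalent to
7
If $\Delta(c) \in B$ then $g(c) = \Delta(c)$.
If $\Delta(c) = \langle a_1, c_1\rangle \in A \times C$ then $Mg(c) = a_1 g(c_1)$,
that is, $g(c) = M^{-1}(a_1 g(c_1))$.
To see that 7 truly defines a unique $g$, classify $c \in C$ into two cases:
Case 1. There exists $k$ with $\Delta(c) = \langle a_1, c_1\rangle$, $\Delta(c_1) = \langle a_2, c_2\rangle$, ..., $\Delta(c_k) =
\langle a_{k+1}, c_{k+1}\rangle$, $\Delta(c_{k+1}) \in B$.
Case 2. Otherwise, $\Delta(c) = \langle a_1, c_1\rangle$, $\Delta(c_1) = \langle a_2, c_2\rangle$, $\Delta(c_2) = \langle a_3, c_3\rangle$, ....
(Remember: $\Delta$ is a total function.)
In case 1,
\[ g(c) = a_1 g(c_1) = a_1 a_2 g(c_2) = \cdots = a_1 \cdots a_{k+1} g(c_{k+1}) = a_1 \cdots a_{k+1} \Delta(c_{k+1}). \]
In case 2, $g(c) = a_1 \cdots a_n g(c_n)$ for every $n$. But then $a_1 \cdots a_n$ is an initial
subword of $g(c)$ so $g(c) = a_1 a_2 a_3 \cdots \in A^\infty$.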
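The two cases can be mirrored by a Python generator, which produces the letters of $g(c)$ one at a time and so copes with both finite and infinite words. This is a sketch under assumptions of our own: `delta` returns a tagged pair, with tag `"B"` for the first summand and `"A"` for the second, and the sample coalgebra on the integers is hypothetical.

```python
from itertools import islice

def unfold(delta, c):
    """Generate the letters of g(c).

    delta(c) returns ("B", b) to stop with b, or ("A", (a, c_next)) to emit a
    and continue from c_next. The generator ends in Case 1 and runs forever
    in Case 2.
    """
    while True:
        tag, body = delta(c)
        if tag == "B":
            yield body
            return
        a, c = body
        yield a

# Hypothetical coalgebra on the integers: count down and stop at 0;
# started from a negative number it never stops.
def delta(c):
    if c == 0:
        return ("B", "done")
    return ("A", (c, c - 1))

finite_word = list(unfold(delta, 3))                   # Case 1
infinite_prefix = list(islice(unfold(delta, -1), 4))   # Case 2, first four letters
```

Only a finite prefix of a Case 2 word can ever be inspected, which is why the example uses `islice`.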
To apply the greatest fixed point of Example 5 we are going to look at the
trace semantics of the iteration of the partial function $f: A \to A + B_0$.
By this we mean the actual sequence of values produced through the successive applications of $f$ which will thus be of the form
$(a_0, a_1, \dots, a_n, b)$ if iteration is successfully completed with result $b$;
$(a_0, a_1, \dots, a_n, \text{error})$ if iteration eventually produces an $a_n$ for which $f(a_n)$ is
undefined; or
$(a_0, a_1, \dots, a_n, \dots)$ if iteration never terminates.
We define $B = B_0 + \{\text{error}\}$ to simplify notation. It will then be useful to
have a function
\[ \text{last}: A^*B + A^\infty \to B \]
with $\text{last}(a_1 \cdots a_n b) = b$ while $\text{last}(a_1 \cdots a_n \cdots)$ is undefined, which can determine the final result in $B$ of the iteration from the string produced by the
trace semantics.
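We define the total function $\bar{f}: A \to A + B$ by
\[ \bar{f}(a) = \begin{cases} f(a) & \text{if } a \in DD(f) \\ \text{error} & \text{else.} \end{cases} \]
Such $\bar{f}$ induces a $\psi$-coalgebra structure on $A$
\[ \Delta: A \to B + A \times A, \qquad \Delta(a) = \begin{cases} \bar{f}(a) & \text{if } \bar{f}(a) \in B \\ \langle \bar{f}(a), \bar{f}(a)\rangle & \text{if } \bar{f}(a) \in A. \end{cases} \]
The map $g: A \to A^*B + A^\infty$ of 6 is recursively specified as in 7 by
8
\[ g(a) = \begin{cases} \bar{f}(a) & \text{if } \bar{f}(a) \in B \\ \bar{f}(a) \cdot g(\bar{f}(a)) & \text{else.} \end{cases} \]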
It follows that $g$ provides the trace semantics of $f$. For if the iteration halts
then $g(a) = f(a) f^2(a) \cdots f^{n-1}(a) \bar{f}^n(a)$, where $f^k(a) \in A$ for $k < n$, $f^n(a) \in B_0$, or
$f^n(a)$ is undefined (so that $\bar{f}^n(a) = \text{error} \in B$), whereas if iteration does not halt
then $g(a)$ is the infinite sequence $f(a) f^2(a) f^3(a) \cdots$. We shall apply this trace
semantics map to derive a formula for $f^\dagger$ of 3.2.24 in 14 below, after we have
defined last. But first we observe that the least set of finite strings satisfying
$C = B + (A \times C)$ is given by the least fixed point of our functor.
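The trace map $g$ of 8 lends itself to a generator in Python. The sketch below rests on conventions that are assumptions of this illustration, not of the text: the partial function returns `("A", a2)` to continue, `("B0", b)` to halt, and `None` where it is undefined; the sample functions `halve` and `spin` are hypothetical.

```python
from itertools import islice

ERROR = "error"

def trace(f, a):
    """Yield f(a), f^2(a), ... and finish with a value in B = B0 + {error}.

    f(a) returns ("A", a2) to continue, ("B0", b) to halt with b, or None
    where the partial function is undefined.
    """
    while True:
        step = f(a)
        if step is None:      # f undefined here: the total extension gives error
            yield ERROR
            return
        tag, value = step
        yield value
        if tag == "B0":
            return
        a = value

# Hypothetical example: halve even numbers, halt at 1, undefined on other odds.
def halve(a):
    if a == 1:
        return ("B0", "reached one")
    if a % 2 == 0:
        return ("A", a // 2)
    return None

def spin(a):
    """A function whose iteration never halts."""
    return ("A", a + 1)

halting = list(trace(halve, 8))
failing = list(trace(halve, 12))
diverging_prefix = list(islice(trace(spin, 0), 3))
```

The three runs correspond to the three shapes of trace listed above: successful completion, an error, and nontermination (of which only a prefix is shown).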
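9 Example. The functor $\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = B + (A \times C)$ of 5 also has a
least fixed point $(A^*B, \mu)$, where $\mu$ is the isomorphism
\[ \mu: B + (A \times A^*B) \to A^*B, \qquad \mu(w) = \begin{cases} w & \text{if } w \in B \\ av & \text{if } w = \langle a, v\rangle \in A \times A^*B. \end{cases} \]
To see this, given a function $\delta: B + A \times C \to C$, we must show that
10
\[ \begin{array}{ccc}
B + A \times A^*B & \xrightarrow{\ \mu\ } & A^*B \\
{\scriptstyle \mathrm{id}_B + \mathrm{id}_A \times f}\downarrow & & \downarrow{\scriptstyle f} \\
B + A \times C & \xrightarrow{\ \delta\ } & C
\end{array} \]
defines $f$ uniquely. This is clear since 10 is equivalent to
11
\[ f(a_1 \cdots a_n b) = \begin{cases} \delta(b) & \text{if } n = 0 \\ \delta(a_1, f(a_2 \cdots a_n b)) & \text{if } n > 0. \end{cases} \]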
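We illustrate the results obtained by defining the partial function
12
\[ \text{first}: A^*B \to A, \qquad a_1 \cdots a_n b \mapsto \begin{cases} a_1 & \text{if } n > 0 \\ \bot & \text{else,} \end{cases} \]
and the total function
13
\[ \text{last}: A^*B \to B, \qquad a_1 \cdots a_n b \mapsto b. \]
The function first is defined by unpacking the isomorphism $\mu^{-1}$ using the
quasi projection (3.2.6) $PR_2$ to read items occurring in the second term of the
coproduct below.
\[ \text{first} = A^*B \xrightarrow{\ \mu^{-1}\ } B + A \times (A^*B) \xrightarrow{\ PR_2\ } A \times (A^*B) \xrightarrow{\ p_A\ } A. \]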
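To define the last function, consider the unique morphism to $(B, \delta)$ for
$\delta = \langle \mathrm{id}_B, p_B\rangle$, where $p_B(a, b) = b$:
\[ \begin{array}{ccc}
B + A \times (A^*B) & \xrightarrow{\ \mu\ } & A^*B \\
{\scriptstyle \mathrm{id}_B + \mathrm{id}_A \times \text{last}}\downarrow & & \downarrow{\scriptstyle \text{last}} \\
B + A \times B & \xrightarrow{\ \delta\ } & B
\end{array} \]
By 11, $f(b) = \delta(b) = b$ and, for $n > 0$, $f(a_1 \cdots a_n b) = \delta(a_1, f(a_2 \cdots a_n b)) = f(a_2 \cdots a_n b)$,
so that
\[ f(a_1 \cdots a_n b) = f(a_2 \cdots a_n b) = f(a_3 \cdots a_n b) = \cdots = f(b) = b = \text{last}(a_1 \cdots a_n b). \]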
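Equation 11 is a recursion scheme, and the derivation of last from it can be replayed in Python. The encoding is an assumption of this sketch: a word $a_1 \cdots a_n b$ is the pair `(letters, b)`, the algebra $\delta$ is split into its two components `delta_b` and `delta_a`, and `count_letters` is an extra fold added only as a further illustration.

```python
def fold(delta_b, delta_a, word):
    """Equation 11: f(b) = delta(b); f(a1 ... an b) = delta(a1, f(a2 ... an b))."""
    letters, b = word
    if not letters:
        return delta_b(b)
    return delta_a(letters[0], fold(delta_b, delta_a, (letters[1:], b)))

def last(word):
    """The fold for delta = <id_B, p_B> with p_B(a, b) = b."""
    return fold(lambda b: b, lambda a, b: b, word)

def first(word):
    """Partial map 12; None stands for 'undefined' when n = 0."""
    letters, _ = word
    return letters[0] if letters else None

def count_letters(word):
    """Another fold: delta(b) = 0 and delta(a, n) = n + 1."""
    return fold(lambda b: 0, lambda a, n: n + 1, word)

w = (("x", "y", "z"), "stop")
```

Unlike `last`, the map `first` is written by inspecting the word directly, in line with the text, which obtains it from $\mu^{-1}$ rather than from the fold.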
Now let $g: A \to A^*B + A^\infty$ be the trace semantics of the partial function
$f: A \to A + B_0$ as in Example 5. Then the iterate of $f$, as defined earlier in
3.2.24, is equivalently defined by
14
\[ f^\dagger = A \xrightarrow{\ g\ } A^*B + A^\infty \xrightarrow{\ PR_1\ } A^*B \xrightarrow{\ \text{last}\ } B. \]
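We now also compute the comparison map for this functor $\psi$. By 3 it is
that $h: A^*B \to A^*B + A^\infty$ determined by
\[ \begin{array}{ccc}
B + A \times A^*B & \xrightarrow{\ \mathrm{id}_B + \mathrm{id}_A \times h\ } & B + A \times (A^*B + A^\infty) \\
{\scriptstyle \mu}\downarrow & & \downarrow{\scriptstyle M^{-1}} \\
A^*B & \xrightarrow{\ h\ } & A^*B + A^\infty
\end{array} \]
By 11
\[ h(a_1 \cdots a_n b) = \begin{cases} M^{-1}(b) & \text{if } n = 0 \\ M^{-1}(a_1, h(a_2 \cdots a_n b)) & \text{if } n > 0, \end{cases} \]
from which we see
\[ h(a_1 \cdots a_n b) = M^{-1}(a_1, M^{-1}(a_2, \dots, M^{-1}(a_n, M^{-1}(b)) \cdots)) = a_1 \cdots a_n b, \]
and $h = \mathrm{in}_{A^*B}: A^*B \to A^*B + A^\infty$ is, as was to be expected, simply the
inclusion map.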
15 Example (Simple Recursion). The category Srd of simple recursion data of
2.2.27 has $(N, 0, S)$ as initial object, where $S(n) = n + 1$. Let us introduce the
polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$ with $\psi(X) = 1 + X$, the coproduct of $X$
with a one-element set denoted here simply as 1. An object in Srd is then just
a $\psi$-algebra since $\delta: \psi X \to X$ must have the form $\langle x_0, f\rangle$ with $x_0: 1 \to X$
an element of $X$, and $f: X \to X$. It is easily checked that a morphism of
$\psi$-algebras is just a Srd-morphism.
In this way, the natural numbers arise as the least fixed point of the data
type specification
\[ C ::= 1 + C \]
and the fixed point isomorphism unpacks to yield 0 and the successor function. Moreover, the least fixed point property embodies the principle of
simple recursion.
This example is really a special case of 9: let $B = 1 = A$.
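The principle of simple recursion can be written as a small Python function: from a $\psi$-algebra $\langle x_0, f\rangle$ on $X$ it builds the unique map $h$ out of the natural numbers with $h(0) = x_0$ and $h(n + 1) = f(h(n))$. The name `simple_recursion` and the sample algebras are assumptions of this sketch.

```python
def simple_recursion(x0, f):
    """Return h: N -> X with h(0) = x0 and h(n + 1) = f(h(n))."""
    def h(n):
        value = x0
        for _ in range(n):    # apply f exactly n times, starting from x0
            value = f(value)
        return value
    return h

double = simple_recursion(0, lambda x: x + 2)         # h(n) = 2n
power_of_two = simple_recursion(1, lambda x: 2 * x)   # h(n) = 2^n
repeat_a = simple_recursion("", lambda s: s + "a")    # h(n) = "a" repeated n times
```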
16 Example (Finite Lists of Natural Numbers). The set $N^+$ of finite lists of
natural numbers, considered in 10.1.1-5 arises as the least fixed point of
$\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = N + C \times N$. This is essentially the same as Example 9
with $A = B = N$ but we have written $N + C \times N$ instead of $N + N \times C$ to
accommodate the form of the push map in 10.1.4. This gives $N^+$ the form
$NN^*$ instead of $N^*N$ but, of course, these are the same. Thus, the isomorphism $\mu: N + N^+ \times N \to N^+$ is given by
\[ \mu(w) = \begin{cases} w & \text{if } w \in N \\ vn & \text{if } w = (v, n) \in N^+ \times N \end{cases} \]
and given any total function $\delta: N + C \times N \to C$, the diagram
\[ \begin{array}{ccc}
N + N^+ \times N & \xrightarrow{\ \mu\ } & N^+ \\
{\scriptstyle \mathrm{id} + f \times \mathrm{id}}\downarrow & & \downarrow{\scriptstyle f} \\
N + C \times N & \xrightarrow{\ \delta\ } & C
\end{array} \]
defines $f$ since
17
\[ f(n_1 \cdots n_k) = \begin{cases} \delta(n_1) & \text{if } k = 1 \\ \delta(f(n_1 \cdots n_{k-1}), n_k) & \text{if } k > 1. \end{cases} \]
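The maps last, push, and $p$ of 10.1.3-5 arise as follows. The last map is
constructed similarly to the map of 13 by applying 17 to a $\delta$ above where
$\delta = \langle \mathrm{id}_N, p_2\rangle$. Push is just
\[ N^+ \times N \xrightarrow{\ \mathrm{in}_2\ } N + N^+ \times N \xrightarrow{\ \mu\ } N^+. \]
The "length map" $p: N^+ \to N$ of 10.1.5 is defined inductively by
\[ \text{length}(n) = 1 \quad \text{for } n \in N, \]
\[ \text{length}(wn) = \text{length}(w) + 1 = s(\text{length}(w)) \quad \text{for } w \in N^+,\ n \in N. \]
But this just says that length is the $f$ induced by 17 with $\delta: N + N \times
N \to N$ defined as $\langle \alpha, \beta\rangle$, where $\alpha(n) = 1$ for all $n$ and
\[ \beta = N \times N \xrightarrow{\ p_1\ } N \xrightarrow{\ s\ } N. \]
The maps $\text{create}_k$ and spec of 10.1.1 and 10.1.2 will be discussed in the next
chapter.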
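Equation 17 and the derived maps can be sketched in Python, with a nonempty tuple standing for an element of $N^+$. The splitting of $\delta$ into two components `delta_n` and `delta_cn` is an assumption of this illustration.

```python
def fold_list(delta_n, delta_cn, lst):
    """Equation 17: f(n1) = delta(n1); f(n1 ... nk) = delta(f(n1 ... n_{k-1}), nk)."""
    if len(lst) == 1:
        return delta_n(lst[0])
    return delta_cn(fold_list(delta_n, delta_cn, lst[:-1]), lst[-1])

def length(lst):
    """The fold for delta = <alpha, beta>: alpha(n) = 1, beta(c, n) = c + 1."""
    return fold_list(lambda n: 1, lambda c, n: c + 1, lst)

def last(lst):
    """The fold for delta = <id_N, p_2>: p_2(c, n) = n."""
    return fold_list(lambda n: n, lambda c, n: n, lst)

def push(lst, m):
    """mu applied to the pair (lst, m) taken in the second summand."""
    return lst + (m,)

primes = (2, 3, 5, 7)
```

Both `length` and `last` use the same recursion scheme and differ only in the algebra supplied, which is the point of the least fixed point property.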
18 Example (Automata Theory). An automaton is given by
a set $Q$ of states,
an initial state $q_0 \in Q$,
an input alphabet $A$,
a next-state function $\gamma: Q \times A \to Q$,
an output function $\beta: Q \to Y$.
Such an automaton processes an input word $a_1 \cdots a_n \in A^*$ by beginning at
the initial state $q_0$, processing the input letters from the left to pass through
the sequence of states $q_1 = \gamma(q_0, a_1)$, $q_2 = \gamma(q_1, a_2)$, ..., $q_n = \gamma(q_{n-1}, a_n)$, and
emitting $\beta(q_n)$ as the output in response to the input word.
We now show that two different constructions, one a least fixed point and
the other a greatest fixed point, may be used to describe this response function. Here $A$, $Y$ are regarded as fixed sets.
Consider the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = 1 + C \times A$. This is
very similar to 9 and the least fixed point of $\psi$ is $A^*$. If the unique element of
1 is called $\Lambda$, the fixpoint isomorphism $\mu: 1 + A^* \times A \to A^*$ is essentially
the identity function, that is, "$\{\Lambda\} + A^+ = A^*$." A $\psi$-algebra $\delta: 1 + Q \times
A \to Q$ takes the form $\langle q_0, \gamma\rangle$, where
$q_0 \in Q$ (identifying elements of $Q$ with functions from 1) is an initial state,
and
$\gamma: Q \times A \to Q$ is a next-state function,
so that $\psi$-algebra = automaton without output function. The unique
morphism of $\psi$-algebras $r: A^* \to Q$ from $(A^*, \mu)$ to $(Q, \delta)$
is defined by
\[ r(\Lambda) = q_0, \qquad r(wa) = \gamma(r(w), a) \]
and is precisely the final state reached from $q_0$ in processing $w$. We call $r$ the
reachability map of the automaton. The input/output response function with
initial state $q_0$ is, then,
\[ A^* \xrightarrow{\ r\ } Q \xrightarrow{\ \beta\ } Y. \]
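The reachability map is a left-to-right fold over the input word, which a loop expresses directly. In the Python sketch below, words are strings, the empty string plays the role of the empty word, and the two-state parity automaton is a hypothetical example.

```python
def reachability(q0, gamma):
    """r(empty word) = q0 and r(wa) = gamma(r(w), a)."""
    def r(word):
        state = q0
        for letter in word:   # process the input letters from the left
            state = gamma(state, letter)
        return state
    return r

# Hypothetical automaton: states 0 and 1 track the parity of the number of 'a's.
gamma = lambda q, a: 1 - q if a == "a" else q
beta = lambda q: "odd" if q == 1 else "even"

r = reachability(0, gamma)
response = lambda word: beta(r(word))   # the composite of r followed by beta
```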
We now turn to a greatest fixed point construction. Let $C$, $D$ be sets.
Let $[C \to D]$ denote the set of all functions from $C$ to $D$, that is, $[C \to D] =
\mathbf{Set}(C, D)$.
19 Observation. $[C \to (-)]: \mathbf{Set} \to \mathbf{Set}$ is a functor, where for $f: D_1 \to D_2$,
\[ [C \to f]: [C \to D_1] \to [C \to D_2], \qquad (C \xrightarrow{\ g\ } D_1) \mapsto (C \xrightarrow{\ g\ } D_1 \xrightarrow{\ f\ } D_2). \]
Thus, $[C \to f] = f \circ -$. Functoriality is easy: if $f = \mathrm{id}_D$ then $g \mapsto \mathrm{id}_D\, g = g$ so
$[C \to f] = \mathrm{id}_{[C \to D]}$; and, if $f: D_1 \to D_2$, $f': D_2 \to D_3$, $[C \to f'f](g) = (f'f)g =
f'(fg) = [C \to f']([C \to f](g))$.
Note that $[C \to (-)]$ is a polynomial functor if $C$ is finite (for then
$[C \to D] = D \times \cdots \times D$ ($n$ times if $C$ has $n$ elements)), but this seems doubtful
otherwise.
Above, we saw that the input/output response function of an automaton
is of the form $A^* \to Y$. To motivate what follows, given an automaton introduce the notation $h_q$ for the response function if the initial state were $q$. Thus,
the actual response is $h_{q_0}$. We immediately observe that
\[ h_q(\Lambda) = \beta(q) \]
while
\[ h_{\gamma(q,a)}(w) = h_q(aw). \]
These equations may be expressed in the form $K(h_q) = (a \mapsto h_{\gamma(q,a)}, \beta(q))$ if
\[ K: [A^* \to Y] \to [A \to [A^* \to Y]] \times Y \]
is defined by
\[ K(h) = (S, h(\Lambda)), \]
where $S: A \to [A^* \to Y]$, $a \mapsto (w \mapsto h(aw))$. But such $K$ has the
general appearance of a coalgebra!
To formalize this, define $\psi: \mathbf{Set} \to \mathbf{Set}$ by $\psi = [A \to (-)] \times Y$. Thus,
$\psi(C) = [A \to C] \times Y$, and if $f: C \to D$, $\psi(f)(\alpha, y) = (f\alpha, y)$. We now show
that $\psi$ has greatest fixed point $([A^* \to Y], K)$.
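Let $\Delta: Q \to [A \to Q] \times Y$ be a $\psi$-coalgebra. We must show that
20
\[ \begin{array}{ccc}
Q & \xrightarrow{\ \sigma\ } & [A^* \to Y] \\
{\scriptstyle \Delta}\downarrow & & \downarrow{\scriptstyle K} \\
[A \to Q] \times Y & \xrightarrow{\ [A \to \sigma] \times \mathrm{id}_Y\ } & [A \to [A^* \to Y]] \times Y
\end{array} \]
defines $\sigma$ uniquely. For each $q \in Q$, $\Delta(q)$ has the form $(\gamma_q, \beta_q)$ with $\gamma_q: A \to Q$,
$\beta_q \in Y$. Chasing $q$ to $[A \to [A^* \to Y]] \times Y$ in the two ways shown in
20 leads to two equations, one for each coordinate. These are, with the
$Y$-coordinate shown first,
21
\[ \sigma(q)(\Lambda) = \beta_q, \qquad \sigma(q)(aw) = \sigma(\gamma_q(a))(w). \]
Thus, $\sigma(q)$ is defined on $\Lambda$ for all $q$, and $\sigma(q)$ is defined on all words of length
$n + 1$ for all $q$ providing $\sigma(q)$ is defined on all words of length $n$ for all $q$, so 21
is an inductive definition of $\sigma$.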
But more is true! A $\psi$-coalgebra is just a way of coding an automaton
without initial state. The formal principle involved is the following:
22 If $C$, $D$, $E$ are sets then
\[ f(c, d) = g(c)(d) \]
describes both directions of a bijection
\[ [C \times D \to E] \cong [C \to [D \to E]]. \]
The proof of 22 is obvious. But then given $\Delta: Q \to [A \to Q] \times Y$, first
$\Delta = [\hat{\gamma}, \beta]$ with $\hat{\gamma}: Q \to [A \to Q]$, $\beta: Q \to Y$. Then, by 22, such $\hat{\gamma}$ are in bijective correspondence with functions $\gamma: Q \times A \to Q$ so that $\Delta$ is a way of
coding $(\gamma, \beta)$ = automaton without initial state. From this perspective, 21
becomes
23
\[ \sigma(q)(\Lambda) = \beta(q), \qquad \sigma(q)(aw) = \sigma(\gamma(q, a))(w), \]
from which it follows at once that $\sigma(q): A^* \to Y$ is the response of the automaton with $\gamma: Q \times A \to Q$, $\beta: Q \to Y$ and initial state $q$. Such $\sigma$ is called the
observability map.
The reachability and observability maps are important in automata
theory. An automaton is reachable if r is surjective, so that every state can be
reached from the initial state and so "every state is necessary." An automaton
is observable if $\sigma$ is injective. This says that whenever two states are different,
their input/output responses differ on at least one input word. A fundamental
result of automata theory is that given arbitrary $f: A^* \to Y$ there exists a
reachable and observable automaton whose input/output response is f and,
moreover, it is unique up to isomorphism.
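As a hedged illustration of the inductive equations for the observability map, the following sketch computes the response of a state to a word and searches for a word that separates two states. The function names, the length bound `max_len`, and the sample automaton are assumptions made for this example; a bounded search can only confirm that two states are distinguishable, never that an infinite automaton is observable.

```python
from itertools import product

def sigma(gamma, beta, q, word):
    """Response of state q to `word`:
    sigma(q)(empty) = beta(q), sigma(q)(a w) = sigma(gamma(q, a))(w)."""
    for a in word:
        q = gamma(q, a)
    return beta(q)

def distinguishable(gamma, beta, alphabet, p, q, max_len):
    """True if some word of length <= max_len gives different responses from p and q."""
    for n in range(max_len + 1):
        for word in product(alphabet, repeat=n):
            if sigma(gamma, beta, p, word) != sigma(gamma, beta, q, word):
                return True
    return False

# Hypothetical automaton: states 0..3 count the letter 'a' modulo 4,
# and the output is the parity of the state.
gamma = lambda q, a: (q + 1) % 4 if a == "a" else q
beta = lambda q: q % 2
```

In this sample automaton states 0 and 2 always have the same parity after any input, so no word separates them and the automaton is not observable.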
EXERCISES FOR SECTION 10.2
1. Specialize the proof of Theorem 2 to provide an alternate approach to Exercise
6.2.4.
2. Let $S$, $T$ be disjoint subsets of $S^*$. Show that $ST = S \times T$. How is $\{\Lambda\} + B + (B \times B) + (B \times B \times B) + \cdots$ different from $\{\Lambda\} \cup B \cup BB \cup BBB \cup \cdots$ if $B$ has form $A^*$?
3. Specialize the results of this section to compute the least and greatest fixed points and their comparison map for the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = B + C$.
4. The polynomial functor of Exercise 3 is much less interesting on a poset. Let $(P, \le)$ be a poset with greatest element 1 and assume that $b \vee x$ exists for all $x$. Define $\psi: P \to P$ by $\psi(x) = b \vee x$. Show that $\psi$ has $b$ as least fixed point and has 1 as greatest fixed point.
5. Using the structure of $A^*B$ as a least fixed point as in 9, define
$A^*B \to A, \qquad a_1 \cdots a_n b \mapsto \begin{cases} a_2 & \text{if } n > 1 \\ \bot & \text{else,} \end{cases}$
analogously to the definitions of first and last in 12 and 13.
6. In any category C, given
[Diagram: commutative square with vertical morphisms $a$ and $b$]
with $a$, $b$ isomorphisms, prove that
[Diagram: the corresponding square with vertical morphisms $a^{-1}$ and $b^{-1}$]
Conclude that in the context of 3, if there exists a $\psi$-algebra morphism $(G, M^{-1}) \to (L, \mu)$ then the comparison map is an isomorphism.
7. Let $\phi, \psi: C \to C$ be functors and let $\eta: \phi \to \psi$ be a natural equivalence (Exercise 10.1.5). Let $(A, \delta)$ be a $\psi$-algebra. Show that $(A, \delta)$ is a least fixed point of $\psi$ if and only if $(A, \delta \cdot \eta_A)$ is a least fixed point of $\phi$. State the dual result for greatest fixed points (hence, of course, no proof is needed).
8. Let $E$ be an object in C and let $+$, $\oplus$ both assign to each two C-objects $A$, $B$ definite coproducts
$A \to A + B \leftarrow B, \qquad A \to A \oplus B \leftarrow B.$
Show that the endofunctors $\phi(C) = E + C$, $\psi(C) = E \oplus C$ are isomorphic. More generally (thereby raising a technical problem previously swept under the rug), show by induction that up to isomorphism of functors (Exercise 10.1.5) it does not matter which products and coproducts are chosen for a given polynomial functor. The resulting least and greatest fixed points are then also unique up to isomorphism by Exercise 7.
Notes and References for Chapter 10
Functors and natural transformations between them play a central role in the founding paper of category theory [S. Eilenberg and S. Mac Lane, "General theory of natural equivalences," Transactions of the American Mathematical Society, 58, 1945, pp. 231-294]. Quoting from the introduction of the book of Freyd cited in the notes to Chapter 2,
It is not too misleading, at least historically, to say that categories are what one must define in order to define functors, and that functors are what one must define in order to define natural transformations.
The reader may wish to look up "representable functors" and the "Yoneda lemma" about them; Exercise 10.1.8 is a special case. For more on functors see the references cited in Chapter 2.
The functorial iterate of Exercise 10.1.11 was introduced by the authors in their paper cited in Chapter 3 notes. The Elgot iteration equation was studied earlier by C. C. Elgot. (See his paper cited in the notes to Chapter 3.)
In the 1970s a number of workers including G. Plotkin, M. B. Smyth, and M. Wand considered least fixed points of functors in connection with the recursive specification of data types, both in Set and in categories of domains such as those considered in Chapter 13; see [D. J. Lehmann and M. B. Smyth, "Algebraic specification of data types," Mathematical Systems Theory, 14, 1981, pp. 97-139]. Much of this work had been done earlier in a more abstract setting. Theorem 10.2.2 appears in M. Barr, "Coequalizers and free triples," Mathematische Zeitschrift, 116, 1970, pp. 307-322, where the result is attributed to Lambek. Endofunctors of Set were the object of intensive study by a number of mathematicians in Prague including J. Adámek, V. Koubek, and V. Trnková in a series of papers beginning in 1971; see J. Adámek and V. Koubek, "Remarks on Fixed Points of Functors," in Lecture Notes in Computer Science, Vol. 56, Springer-Verlag, 1977, pp. 199-205. The use of greatest fixed points to specify data types is due to the authors in "Parametrized data types do not need highly constrained parameters," Information and Control, 52, 1982, pp. 139-158.
For a category-theoretic treatment of reachability and observability see the authors' paper "Adjoint machines, state-behavior machines and duality" in Journal of Pure and Applied Algebra, 6, 1975, pp. 313-344.
CHAPTER 11
Recursive Specification of Data Types
11.1 From Least Upper Bounds to Least Fixed Points
11.2 Co-continuous Functors
11.3 Continuous Functors and Greatest Fixed Points
Just as posets generalize to categories, the Kleene fixed point theorem of 6.2.13 generalizes to provide a "co-continuous" endofunctor $\psi: C \to C$ with a least fixed point as the "colimit" of the "right chain"
$\bot \to \psi(\bot) \to \psi^2(\bot) \to \psi^3(\bot) \to \cdots.$
The details are developed in the first two sections. The dual theory is used to get greatest fixed points in Section 3. In 11.3.18 we introduce the notations $C ::= \psi(C)$ and $C =:: \psi(C)$ for the least and greatest fixed points of $\psi$, respectively. These exist for the wide class of "polynomial" functors Set $\to$ Set and in many other cases as well.
11.1 From Least Upper Bounds to Least Fixed Points
In 10.1.12 we observed, for the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$ defined by $\psi(C) = N + C \times N$, that
1    $\bot \to \psi(\bot) \to \psi^2(\bot) \to \psi^3(\bot) \to \cdots$
is, up to isomorphism, the ascending chain of sets
2    $\emptyset \subset N \subset N + N^2 \subset N + N^2 + N^3 \subset \cdots$
whose union $N^+$ arises, as we have seen in 10.2.16, as the least fixed point of $\psi$. We have not been very precise about what it might mean for 1 and 2 to "coincide up to isomorphism." Moreover, to seek the proper level of generality for our theory we should like, at least at first, to consider 1 for a wide class of functors $\psi: C \to C$ for C a category with initial object, but not necessarily Set. In an arbitrary category, what do the "inclusions" mentioned in 2 mean? Our approach, however, is to avoid answering such questions (which we regard as red herrings, though answerable), preferring to attack 1 directly through a "colimit" construction. While such exists in many categories we pay special attention in this section to the details in Set.
We turn to defining colimits. As in the previous chapter, we proceed by finding the appropriate concept in a poset $(P, \le)$, lifting it to $C(P, \le)$, and then generalizing it to an arbitrary category C. It is clear that an ascending chain $x_0 \le x_1 \le x_2 \le \cdots$ generalizes to a chain of morphisms $C_0 \to C_1 \to C_2 \to \cdots$. A least upper bound is an element $x$ of $P$ such that not only is $x_n \le x$ for all $n$, but if also $x_n \le y$ for all $n$, then $x \le y$. Replacing each $\le$ by $\to$ motivates the following definitions.
3 Definitions. Let C be a category. A right chain in C is an arbitrary sequence $(c_n \mid n \ge 0)$ of morphisms of the form
$C_0 \xrightarrow{c_0} C_1 \xrightarrow{c_1} C_2 \xrightarrow{c_2} C_3 \to \cdots.$
(The term "right" makes reference to the convention that morphisms are usually written with the domain on the left and the codomain on the right. Thus, a right chain proceeds to infinity toward the right whereas the "left chain" to be introduced in 11.3.1 comes in from infinity from the left, $\cdots \to C_3 \to C_2 \to C_1 \to C_0$.)
An upper bound of a right chain $(c_n)$ as above is a pair $(U, \alpha)$ with $\alpha = (\alpha_n \mid n \ge 0)$, $\alpha_n: C_n \to U$, such that
4    $\alpha_{n+1} c_n = \alpha_n \quad (n \ge 0).$
For a fixed chain $(c_n)$ we may form a category whose objects are the upper bounds of $(c_n)$ and in which a morphism $f: (U, \alpha) \to (V, \beta)$ is a C-morphism $f: U \to V$ such that
5    $f\alpha_n = \beta_n \quad (n \ge 0).$
Composition and identities are as in C. That this makes sense is clear from the equations $\mathrm{id}_U \alpha_n = \alpha_n$ and, for $f: (U, \alpha) \to (V, \beta)$ and $g: (V, \beta) \to (W, \gamma)$, $(gf)\alpha_n = g\beta_n = \gamma_n$.
As in 10.2.3, we then use an initial object to generalize "least": a colimit of $(c_n)$ is an initial object in the category of upper bounds of $(c_n)$. Thus, if $(U, \alpha)$ is a colimit of $(c_n)$ and $(V, \beta)$ is an arbitrary upper bound of $(c_n)$, there exists a unique $f$ as in 5. If $(V, \beta)$ is also a colimit, this $f$ is an isomorphism in C as well as in the category of upper bounds of $(c_n)$.
The category C has colimits of right chains if every right chain has a colimit.
Colimits of right chains are dual to limits of left chains, as discussed in Section 11.3. Although the decision as to "which is the original and which is the co" is arbitrary, this terminology is well established in category theory.
We now check that colimits of right chains do truly generalize least upper bounds of ascending chains in a poset.
6 Example (Least Upper Bounds in a Poset). Let C = $C(P, \le)$ be a poset considered as a category as in Section 2.1. Then a right chain coincides with the notion of ascending chain
$C_0 \le C_1 \le C_2 \le \cdots.$
If $U$ is an upper bound of $(C_n)$ in the sense of the theory of posets, that is, if $U \in P$ with $C_n \le U$ for all $n$, then using $\alpha_n$ to name the unique $C_n \to U$, we have that $(U, \alpha)$ is an upper bound of $(C_n)$ in the sense of 3 because all diagrams commute in a poset, those of 4 in particular. Pursuing the same reasoning a little further, $U = \bigvee C_n$ if and only if $U$ is the colimit of $(C_n)$ (and, for posets, colimits are unique, not just up to isomorphism).
7 Example (Colimits of Right Chains in Set). While somewhat involved, this construction is basic and deserves the reader's detailed attention. Let
$C_0 \xrightarrow{c_0} C_1 \xrightarrow{c_1} C_2 \xrightarrow{c_2} \cdots$
be a right chain in Set, so that each $c_n: C_n \to C_{n+1}$ is a total function. We will construct a colimit $(U, \alpha)$ of $(c_n)$ explicitly. The idea is a simple one: think of the functions $c_n$ as "inclusions" and "take the union." The analogy falters a little because the $c_n$ are not required to be injective. The more precise way of thinking of "$C_n \subset U$" is that, for each $n$, an $x \in C_n$ is represented by the "ultimate value" of the sequence
8    $x,\ c_n(x),\ c_{n+1}c_n(x),\ c_{n+2}c_{n+1}c_n(x),\ \ldots$
(which is the constant sequence $x$ when the $c_n$ are inclusions). Since the $C_n$ are sets without further structure there is no reasonable sense in which 8 should "converge," so we let the "ultimate value" be represented by the sequence itself. But we will regard two sequences as "having the same ultimate value" if they agree on all but finitely many values.
Here, then, are the precise definitions. For $m \le n$ let $c_{mn}: C_m \to C_n$ be the composition $c_{n-1}c_{n-2} \cdots c_m$ ($= \mathrm{id}$ if $m = n$). Set $A = \coprod(C_n : n \ge 0) = \{(n, x) : x \in C_n,\ n \ge 0\}$. Define an equivalence relation $\sim$ on $A$ by
9    $(n, x) \sim (m, y) \iff c_{nk}(x) = c_{mk}(y) \text{ for some } k \ge m, n.$
That $\sim$ is reflexive and symmetric is obvious. That $\sim$ is transitive is easy since if $c_{nk}(x) = c_{mk}(y)$, $c_{ml}(y) = c_{tl}(z)$, choose $u \ge k, l$ so that
$c_{nu}(x) = c_{ku}c_{nk}(x) = c_{ku}c_{mk}(y) = c_{mu}(y) = c_{lu}c_{ml}(y) = c_{lu}c_{tl}(z) = c_{tu}(z).$
Define $U$ to be the set of $\sim$-equivalence classes of $A$, so that $U = \{[n, x] : x \in C_n\}$, where $[n, x]$ is the equivalence class of $(n, x)$, the set of all $(m, y)$ with $(n, x) \sim (m, y)$. Thus, $[n, x] = [m, y] \iff (n, x) \sim (m, y)$. Since $[n, x] = [n + 1, c_n(x)] = [n + 2, c_{n+1}c_n(x)] = \cdots$, this captures the intuitive content of 8. Define
10    $\alpha_n: C_n \to U, \qquad x \mapsto [n, x].$
We must show that $(U, \alpha)$ is an upper bound of $(c_n)$, that is, $\alpha_{n+1}c_n = \alpha_n$. But if $x \in C_n$, $\alpha_{n+1}(c_n(x)) = [n + 1, c_n(x)] = [n, x] = \alpha_n(x)$. Finally, we must show that if $(V, \beta)$ is another upper bound then
11    $f: U \to V, \qquad [n, x] \mapsto \beta_n(x)$
is the unique function $f$ with
$f\alpha_n = \beta_n \quad (n \ge 0).$
First of all, is 11 well-defined, that is, if $[n, x] = [m, y]$ is $\beta_n(x) = \beta_m(y)$? Well, as $(n, x) \sim (m, y)$ there exists $k \ge m, n$ with $c_{nk}(x) = c_{mk}(y)$. But as $(V, \beta)$ is an upper bound we have $\beta_k c_{nk} = \beta_n$ and $\beta_k c_{mk} = \beta_m$, so that $\beta_n(x) = \beta_k c_{nk}(x) = \beta_k c_{mk}(y) = \beta_m(y)$. It is then obvious that $f\alpha_n = \beta_n$. Furthermore, if also $g: U \to V$ satisfies $g\alpha_n = \beta_n$ then $g[n, x] = g\alpha_n(x) = \beta_n(x) = f[n, x]$, so $g = f$.
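The equivalence relation 9 can be explored with a small sketch. The representation of a chain as a single function `c(i, x)` standing for $c_i(x)$, the names `transport` and `equivalent`, and the search bound `max_k` are assumptions made for this illustration. Since the relation asks for the existence of some $k$, a bounded search can confirm equivalence but a negative answer only means no witness was found up to the bound.

```python
def transport(c, n, k, x):
    """Apply c_{nk} = c_{k-1} ... c_n to x in C_n (the identity when k == n)."""
    for i in range(n, k):
        x = c(i, x)
    return x

def equivalent(c, n, x, m, y, max_k):
    """Search for k with max(n, m) <= k <= max_k such that c_{nk}(x) == c_{mk}(y)."""
    return any(
        transport(c, n, k, x) == transport(c, m, k, y)
        for k in range(max(n, m), max_k + 1)
    )

# Hypothetical chain: every C_n is the positive integers and every c_n doubles.
double = lambda i, x: 2 * x

# Hypothetical non-injective chain: every c_n collapses its argument to 0.
collapse = lambda i, x: 0
```

With the collapsing chain, distinct elements of $C_0$ become equivalent after one step, which is the sense in which the $c_n$ fail to be "inclusions."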
12 Observation. Let the right chain $(c_n)$ in a category C have colimit $(U, \alpha)$ and let $f: U \to V$ be any C-morphism. Then $(V, f\alpha)$ is an upper bound of $(c_n)$, where $(f\alpha)_n = f\alpha_n$, as is immediate from
$(f\alpha_{n+1})c_n = f(\alpha_{n+1}c_n) = f\alpha_n.$
It follows that if $g: U \to V$ satisfies $g\alpha_n = f\alpha_n$ for all $n$, then $g = f$.
We have seen that posets generalize to categories, and monotone maps generalize to functors. In the next section, we shall see that continuous maps generalize to co-continuous functors, and that a generalized Kleene fixed point theorem is available for co-continuous functors in a suitable category C (11.2.13). We shall also prove that every polynomial functor Set $\to$ Set is co-continuous (11.2.12). As an immediate corollary, we then have:
13 Generalized Kleene Fixed Point Theorem for Polynomial Functors. Every polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$ has as least fixed point the colimit $(L, \alpha)$ for the right chain
$\bot \xrightarrow{\,!\,} \psi(\bot) \xrightarrow{\psi(!)} \psi^2(\bot) \xrightarrow{\psi^2(!)} \psi^3(\bot) \to \cdots.$
Since this result can be applied without study of the proof, we now provide examples for readers who do not wish to study the technical details of Section 11.2.
14 Finite Lists of Natural Numbers. We now reopen the discussion of 10.2.16. There we showed directly that the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = N + C \times N$, has a least fixed point of the form $(N^+, \mu)$. But we have yet to see how to define the maps $\mathit{create}_k$ and $\mathit{spec}$ of 10.1.1 and 10.1.2, and this we do now.
By Theorem 13, $(N^+, \alpha)$ is a colimit of $\psi^n(!)$ for appropriate $\alpha$. By the discussion of 10.1.10 we may think of $\alpha_k$ as the inclusion of $N + \cdots + N^k$ in $N^+$. But then we have
15
To define spec, first define for each $k$ the maps
16    $\mathit{spec}_k: N \times N^k \to N, \qquad (m, (n_1, \ldots, n_k)) \mapsto \begin{cases} n_m & \text{if } 1 \le m \le k \\ \bot & \text{else.} \end{cases}$
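If we are to play fair in defining $\mathit{spec}_k$ (and, eventually, spec), we should only use properties of $N$ that we can deduce by creating $N$ as in 10.2.15. There, a least fixed point isomorphism decomposed $N$ as the coproduct
17    $1 \to N \leftarrow N.$
But if $C = A + B$ and $B = D + E$ we should have $C = A + D + E$. This is true, but we need to be more precise. Thus, if
$A \xrightarrow{t} C \xleftarrow{u} B \quad \text{and} \quad D \xrightarrow{v} B \xleftarrow{w} E$
are coproducts, claim that
$A \xrightarrow{t} C, \qquad D \xrightarrow{uv} C, \qquad E \xrightarrow{uw} C$
is a coproduct. For, given $f_A: A \to F$, $f_D: D \to F$, $f_E: E \to F$, define $f_B: B \to F$ by
$f_B v = f_D, \qquad f_B w = f_E$
and then $f: C \to F$ by
$ft = f_A, \qquad fu = f_B.$
Then $ft = f_A$, $fuv = f_Bv = f_D$, $fuw = f_Bw = f_E$. Furthermore, if $gt = f_A$, $guv = f_D$, $guw = f_E$ then $gu = f_B$ so that $g = f$.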
Applying this to 17 we have that
[Diagram: three coproduct injections $1 \to N$, $1 \to N$, $N \to N$]
is a ternary coproduct. Applying very similar ideas, for any $k \ge 1$,
[Diagram: coproduct injections into $N$]
is a $(k + 1)$-ary coproduct. By the distributive law 10.1.9, $N \times N^k$ decomposes as a $(k + 1)$-ary coproduct
[Diagram: coproduct injections into $N \times N^k$]
and we may regard this as a coproduct in Pfn by 2.3.18. Thus, $\mathit{spec}_k$ is defined in Pfn by
18    $\mathit{spec}_k: N \times N^k \to N$, given on the $i$th summand by $\bot$ if $i = 0$ and by $p_i$ if $1 \le i < k$,
where $p_i(n_1, \ldots, n_k) = n_i$.
Applying the polynomial functor $N \times -$ to the colimit $N^+$ of $N^k \subset N^{k+1}$ we see that $N \times N^k \to N \times N^{k+1}$ has colimit $N \times N^+$, at least in Set. Our objective is to define $\mathit{spec}: N \times N^+ \to N$ by the colimit property, so we had best pause for the following:
19 Observation. If $(U, \alpha)$ is a colimit of $(c_n)$ in Set then $(U, \alpha)$ is a colimit of $(c_n)$ in Pfn too.
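PROOF: Let $(V, \beta)$ be an upper bound of $(c_n)$ in Pfn. Thus, $\beta_{n+1}c_n = \beta_n$ in Pfn, where the $c_n$ and $\alpha_n$ are total functions. Define $W = V + \{*\}$ and define total functions $\gamma_n: C_n \to W$ by
$\gamma_n(x) = \begin{cases} \beta_n(x) & \text{if } x \in DD(\beta_n) \\ * & \text{else.} \end{cases}$
Then $\gamma_{n+1}c_n = \gamma_n$ in Set, so there exists unique $f_0: U \to W$ with
$f_0\alpha_n = \gamma_n \quad (n \ge 0).$
Define $f: U \to V$ by
$f(u) = \begin{cases} f_0(u) & \text{if } f_0(u) \ne * \\ \bot & \text{else.} \end{cases}$
Then $f\alpha_n = \beta_n$. Moreover, if $g\alpha_n = \beta_n$ in Pfn then $g_0\alpha_n = \gamma_n$ in Set if we define $g_0(u) = *$ when $u \notin DD(g)$, $g_0(u) = g(u)$ else, so that $g_0 = f_0$ and hence $g = f$. □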
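To conclude the definition of spec, observe that the upper bound condition 4 for the maps $\mathit{spec}_k$ holds in Pfn, so that $(N, (\mathit{spec}_k \mid k \ge 0))$ is an upper bound, inducing the desired map $\mathit{spec}: N \times N^+ \to N$ from the colimit.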
20 Example (s-Expressions). In the programming language LISP, an s-expression is either an atom or a pair of s-expressions. This recursive specification corresponds to the polynomial functor $\psi: \mathbf{Set} \to \mathbf{Set}$, $\psi(C) = A + C \times C$, $A$ being the set of "atoms". By Theorem 13, the least fixed point $L$ is obtained as the colimit of $\psi^n(\emptyset)$. We compute
$\psi(\emptyset) = A$
$\psi^2(\emptyset) = A + A \times A$
$\psi^3(\emptyset) = A + (A + (A \times A)) \times (A + (A \times A))$
$\qquad = A + A \times A + (A \times A) \times A + A \times (A \times A) + (A \times A) \times (A \times A)$
so that $L$ is the set of all binary trees with labels only at the leaves, where each label is an element of $A$. Each s-expression that is not an atom decomposes into two trees, a "head" and a "tail". In LISP notation, the head function is denoted CAR, and the tail function CDR. The fixed point isomorphism
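$\mu: A + L \times L \to L$
uncouples to give the inclusion $\mu\,\mathit{in}_1$ of the atoms, and the (partial) CAR and CDR functions:
$\mathrm{CAR} = L \xrightarrow{\mu^{-1}} A + (L \times L) \xrightarrow{PR} L \times L \xrightarrow{p_1} L, \qquad \mathrm{CDR} = L \xrightarrow{\mu^{-1}} A + (L \times L) \xrightarrow{PR} L \times L \xrightarrow{p_2} L.$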
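A minimal sketch of the s-expression example follows, under the assumption that atoms are Python strings and pairs are 2-tuples, so that atoms and pairs are automatically disjoint. The names `car`, `cdr` and `stage` are hypothetical, and `None` stands in for "undefined" in the partial functions.

```python
def car(s):
    """Head of a pair; None models 'undefined' on atoms."""
    return s[0] if isinstance(s, tuple) else None

def cdr(s):
    """Tail of a pair; None models 'undefined' on atoms."""
    return s[1] if isinstance(s, tuple) else None

def stage(atoms, n):
    """Elements of psi^n(empty set) for psi(C) = A + C x C, as a Python set."""
    current = set()
    for _ in range(n):
        # One application of psi: atoms together with all pairs of earlier trees.
        current = set(atoms) | {(s, t) for s in current for t in current}
    return current

# Hypothetical two-element set of atoms.
atoms = {"a", "b"}
sizes = [len(stage(atoms, n)) for n in range(4)]
```

For a finite set of atoms the sizes satisfy $|\psi^{n+1}(\emptyset)| = |A| + |\psi^n(\emptyset)|^2$, and each stage is contained in the next, matching the picture of the least fixed point as a union of stages.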
EXERCISES FOR SECTION 11.1
1. Taking note of 2.3.18 show that 19 generalizes to Mfn and ANMfn.
2. State more precisely and prove: In any category, the colimit of the right chain
$C \to C \to C \to C \to \cdots$
exists and is $C$ itself.
3. Let $X = \{1, 2, 3, \ldots\}$ and let $f(x) = 2x$. Consider the colimit of
$X \xrightarrow{f} X \xrightarrow{f} X \xrightarrow{f} X \to \cdots$
as constructed by 7. Characterize the equivalence relation $(m, x) \sim (n, y)$ in terms of the prime factorizations of $x$, $y$ and show that there are infinitely many equivalence classes in the colimit.
4. Let $X = \{0, 1, \ldots, 9\}$. Let $d_i \in X$ be the $i$th digit after the decimal point in the expansion of $\pi$, $d_1 = 1$, $d_2 = 4$, $d_3 = 1$, $d_4 = 5$, $d_5 = 9$, .... Define $f_i: X \to X$ by
$f_i(d) = \begin{cases} d & \text{if } d \ge d_i \\ 0 & \text{if } d < d_i. \end{cases}$
Is the colimit of
$X \xrightarrow{f_1} X \xrightarrow{f_2} X \xrightarrow{f_3} X \to \cdots$
finite or infinite?
11.2 Co-continuous Functors
We now define "co-continuous functors" to generalize continuous maps between domains and prove that all polynomial functors Set $\to$ Set are co-continuous. We shall then prove the generalized Kleene fixed point theorem promised in the previous section.
1 Observation. Let $\psi: C \to D$ be a functor. Then if
$(c_n) = C_0 \xrightarrow{c_0} C_1 \xrightarrow{c_1} C_2 \xrightarrow{c_2} C_3 \to \cdots$
is a right chain in C,
$(\psi c_n) = \psi C_0 \xrightarrow{\psi c_0} \psi C_1 \xrightarrow{\psi c_1} \psi C_2 \xrightarrow{\psi c_2} \psi C_3 \to \cdots$
is a right chain in D. Furthermore, if $(U, \alpha)$ is any upper bound of $(c_n)$ then $(\psi U, \psi\alpha)$, where $(\psi\alpha)_n = \psi\alpha_n$, is an upper bound of $(\psi c_n)$, since the functoriality of $\psi$ yields
$(\psi\alpha_{n+1})(\psi c_n) = \psi(\alpha_{n+1}c_n) = \psi\alpha_n.$
Thus, any functor preserves right chains and upper bounds of right chains.
2 Definition. A functor $\psi: C \to D$ is co-continuous if whenever $c_n: C_n \to C_{n+1}$ is a right chain in C and $(U, \alpha)$ is a colimit for $(c_n)$ then $(\psi U, \psi\alpha)$ as in 1, automatically an upper bound for $(\psi c_n)$, is a colimit for $(\psi c_n)$.
3 Example. By 11.1.6, if C and D are domains regarded as categories then a continuous map is just a co-continuous functor. (The awkwardness of the terminology is unfortunate; the dual notion of "continuous functor" will be introduced in 11.3.7 in connection with greatest fixed points, and here the terminology in category theory is too standard to change; and the dual notion to continuous maps of posets seems not to play a role in the order semantics of control structures owing, for example, to the failure of Pfn(A, B) to be a "codomain.") Definition 2 generalizes, in particular, to arbitrary posets as opposed to domains.
4 Example. Let $\psi: \mathbf{Set} \to \mathbf{Set}$ be a functor such that $\psi(A) = 2^A$ (two examples are given in 10.1.24 and 10.1.25). Then $\psi$ is not co-continuous. For the right chain
$\{0\} \subset \{0, 1\} \subset \{0, 1, 2\} \subset \cdots$
has $(N, \alpha)$ as colimit where $\alpha_n(x) = x$, as is clear from 11.1.7. Again applying 11.1.7 to construct the colimit of
$\psi\{0\} \to \psi\{0, 1\} \to \psi\{0, 1, 2\} \to \cdots,$
even though the functions $\psi\{0, \ldots, n\} \to \psi\{0, \ldots, n + 1\}$ are not definite, we can still assert that the colimit is obtained by partitioning a countable union of finite sets into equivalence classes, and so is a countable set (see Exercise 1). Since $\psi(N) = 2^N$ is not countable (see Exercise 10.1.9), the argument is complete.
5 Theorem. The constant functor $D: C \to D$ of 10.1.16 is co-continuous.
PROOF: If $\alpha_n = \mathrm{id}_D$, $(D, \alpha)$ is an upper bound of
$D \xrightarrow{\mathrm{id}} D \xrightarrow{\mathrm{id}} D \xrightarrow{\mathrm{id}} \cdots$
and is, in fact, a colimit since if $(V, \beta)$ is another upper bound then
$\beta_{n+1}\,\mathrm{id} = \beta_n$
shows all $\beta_n$ have a common value; hence,
$f\alpha_n = \beta_n \quad (n \ge 0)$
has a unique solution $f = \beta_0$. □
6 Theorem. The identity functor $\mathrm{id}_C: C \to C$ is co-continuous.
7 Theorem. Let D have finite coproducts and let $\psi_1, \ldots, \psi_k: C \to D$ be co-continuous. Then their coproduct $\psi_1 + \cdots + \psi_k: C \to D$ as in 10.1.19 is co-continuous.
PROOF. Let $(c_n)$ be a chain in C with colimit $(U, \alpha)$ and let $(D, \beta)$ be an upper bound of $(\psi_1 + \cdots + \psi_k)(c_n)$. Consult the diagram
[Diagram: the chains $(\psi_i c_n)$ and $((\psi_1 + \cdots + \psi_k)c_n)$ with the maps $\psi_i\alpha_n$, $\psi_1\alpha_n + \cdots + \psi_k\alpha_n$ and $\beta_n$, connected by the coproduct injections $t_i$, $u_i$, $v_i$]
where $1 \le i \le k$ and $t_i$, $u_i$, $v_i$ are coproduct injections. We must show there exists unique $f$ with $f(\psi_1\alpha_n + \cdots + \psi_k\alpha_n) = \beta_n$ for all $n \ge 0$. As the diagram shows, for each $i$, $(D, \beta u_i)$, with $(\beta u_i)_n = \beta_n u_i$, is an upper bound of $(\psi_i c_n)$ so that, as $\psi_i$ is co-continuous, there exists unique $f_i$ with $f_i(\psi_i\alpha_n) = \beta_n u_i$ for all $n$. By the definition of a coproduct, there exists unique $f$ with $ft_i = f_i$. Then for each $i$,
$f(\psi_1\alpha_n + \cdots + \psi_k\alpha_n)u_i = ft_i(\psi_i\alpha_n) = f_i(\psi_i\alpha_n) = \beta_n u_i$
so that, by the uniqueness property of coproduct-induced maps, $f(\psi_1\alpha_n + \cdots + \psi_k\alpha_n) = \beta_n$. Now suppose $g(\psi_1\alpha_n + \cdots + \psi_k\alpha_n) = \beta_n$ for all $n$. Then for each $i$,
$gt_i(\psi_i\alpha_n) = g(\psi_1\alpha_n + \cdots + \psi_k\alpha_n)u_i = \beta_n u_i$
so that, recalling the uniqueness proviso in the definition of $f_i$, $gt_i = f_i$. But then $g = f$. □
8 Theorem. Let $\psi_1: C \to D$, $\psi_2: D \to E$ be co-continuous. Then their composition $\psi_2\psi_1: C \to E$ is co-continuous.
The astute reader may have observed that in 10.1.19 and Theorem 7 above, the finiteness of the coproducts was only a notational convenience. Any coproduct of co-continuous functors is co-continuous.
We have so far not proved that a finite product of co-continuous functors $C \to D$ is co-continuous and, indeed, this is true only for special categories D. The next theorem establishes this for D = Set and here the finiteness of the product is crucial. This result is important, if difficult, and merits the reader's careful scrutiny.
9 Theorem. Let $\psi_1, \ldots, \psi_k: C \to \mathbf{Set}$ be co-continuous. Then their product $\psi_1 \times \cdots \times \psi_k: C \to \mathbf{Set}$ as in 10.1.18 is again co-continuous.
PROOF: It is best to isolate the following:
10 Lemma. For $i = 1, \ldots, k$ let
$C_0^i \xrightarrow{c_0^i} C_1^i \xrightarrow{c_1^i} C_2^i \to \cdots$
be a right chain in Set with colimit $(U^i, \alpha^i)$. Define $(U, \alpha)$ by $U = U^1 \times \cdots \times U^k$, $\alpha_n = \alpha_n^1 \times \cdots \times \alpha_n^k: C_n^1 \times \cdots \times C_n^k \to U$. Then $(U, \alpha)$ is a colimit of
$C_0^1 \times \cdots \times C_0^k \xrightarrow{c_0^1 \times \cdots \times c_0^k} C_1^1 \times \cdots \times C_1^k \xrightarrow{c_1^1 \times \cdots \times c_1^k} C_2^1 \times \cdots \times C_2^k \to \cdots.$
PROOF OF LEMMA. We leave it as an isomorphism-chasing exercise to show that if the lemma holds for any choice of colimits $(U^i, \alpha^i)$ then it holds for all such choices. This granted, we assume $(U^i, \alpha^i)$ is constructed as in 11.1.7. Now consider the diagram
[Diagram: the product chain $(c_n^1 \times \cdots \times c_n^k)$ with the maps $\alpha_n$ into $U$ and $\beta_n$ into $V$, and $f: U \to V$]
with $(V, \beta)$ an upper bound for $(c_n^1 \times \cdots \times c_n^k)$. We must show that there exists unique $f$ with $f(\alpha_n^1 \times \cdots \times \alpha_n^k) = \beta_n$ for all $n \ge 0$.
Recall the notations of 11.1.7. A typical element of $U$ has the form $([n_1, x_1], \ldots, [n_k, x_k])$ with $x_i \in C_{n_i}^i$. But, and here is where finiteness comes in, there exists $l$ with $l \ge n_i$ for all $i = 1, \ldots, k$. Then $[n_i, x_i] = [l, y_i]$ if $y_i = c^i_{n_i l}(x_i)$. Thus, if $f$ exists we must have
$f([n_1, x_1], \ldots, [n_k, x_k]) = f(\alpha_l^1(y_1), \ldots, \alpha_l^k(y_k)) = \beta_l(y_1, \ldots, y_k)$
which proves the uniqueness of $f$. To prove existence define $f$ this way, that is, define
11    $f([n_1, x_1], \ldots, [n_k, x_k]) = \beta_l(c^1_{n_1 l}(x_1), \ldots, c^k_{n_k l}(x_k))$ for any $l \ge n_1, \ldots, n_k$.
Since $\beta_{t+1}(c_t^1 \times \cdots \times c_t^k) = \beta_t$ for all $t$, this definition is independent of $l$. In particular, for any $n$, $f(\alpha_n^1(x_1), \ldots, \alpha_n^k(x_k)) = f([n, x_1], \ldots, [n, x_k]) = \beta_n(x_1, \ldots, x_k)$ as desired.
The main theorem then follows from the lemma by starting with a colimit $(W, \gamma)$ on a right chain $(c_n)$ in C and setting $(c_n^i) = (\psi_i c_n)$, $(U^i, \alpha^i) = (\psi_i W, \psi_i\gamma)$. □
12 Corollary. Any polynomial functor Set $\to$ Set is co-continuous.
We now generalize the Kleene fixed point theorem to provide least fixed points of co-continuous endofunctors. Note that the conditions on C simply generalize the conditions on a poset which make it a domain.
13 Generalized Kleene Fixed Point Theorem. Let C be a category with an initial object $\bot$ and such that every right chain has a colimit. Then every functor $\psi: C \to C$ which is co-continuous has as least fixed point the colimit $(L, \alpha)$ for the right chain
$\bot \xrightarrow{\,!\,} \psi(\bot) \xrightarrow{\psi(!)} \psi^2(\bot) \xrightarrow{\psi^2(!)} \psi^3(\bot) \to \cdots.$
PROOF. This right chain $(c_n)$, $c_n = \psi^n(!)$, has been constructed so that $\psi c_n = c_{n+1}$. Hence, if $\beta_n = \alpha_{n+1}$, $(L, \beta)$ is an upper bound of the right chain $(\psi c_n)$, that is,
$\beta_{n+1}(\psi c_n) = \alpha_{n+2}c_{n+1} = \alpha_{n+1} = \beta_n.$
But as $\psi$ is co-continuous, $(\psi L, \psi\alpha)$ is a colimit of $(\psi c_n)$. Hence, there exists a unique $\mu$:
14    $\mu: \psi L \to L$ with $\mu(\psi\alpha_n) = \alpha_{n+1} \quad (n \ge 0)$
so that $(L, \mu)$ is a $\psi$-algebra. To show that $(L, \mu)$ is the least fixed point of $\psi$, we must (by 10.2.3) show that it is the initial $\psi$-algebra. So let $(A, \delta)$ be any $\psi$-algebra. We must find a unique $f$ with $f\mu = \delta(\psi f)$.
15    [Diagram: $\psi^{n+1}(\bot) \xrightarrow{\psi\alpha_n} \psi L \xrightarrow{\mu} L$ with $\alpha_{n+1}: \psi^{n+1}(\bot) \to L$, together with the square (?) formed by $\mu$, $f: L \to A$, $\psi f: \psi L \to \psi A$ and $\delta: \psi A \to A$]
Define $f_n: \psi^n(\bot) \to A$ recursively by
16    $f_0 = {!}, \qquad f_{n+1} = \psi(\psi^n(\bot)) \xrightarrow{\psi f_n} \psi A \xrightarrow{\delta} A$
and claim that
$f_{n+1}c_n = f_n$
holds for all $n \ge 0$. This is clear for $n = 0$ as $\bot$ is initial. For the inductive step,
$f_{n+2}c_{n+1} = \delta(\psi f_{n+1})(c_{n+1}) = \delta(\psi f_{n+1})(\psi c_n) = \delta(\psi(f_{n+1}c_n)) = \delta(\psi f_n) = f_{n+1}.$
Hence, there exists a unique $f$
17    $f: L \to A$ with $f\alpha_n = f_n \quad (n \ge 0).$
By 15 and observation 11.1.12, $f\mu$ is the only morphism $g: \psi L \to A$ with $g(\psi\alpha_n) = f\alpha_{n+1}$ for all $n \ge 0$. In view of 17, $f\mu$ is the only $g$ with $g(\psi\alpha_n) = f_{n+1}$. But also $(\delta(\psi f))(\psi\alpha_n) = \delta(\psi(f\alpha_n)) = \delta(\psi f_n) = f_{n+1}$ so that $f\mu = \delta(\psi f)$. If also $h\mu = \delta(\psi h)$ then $h\alpha_n = f_n$ by induction, since both paths are $!$ when $n = 0$, and the inductive step is, consulting 15 (with $h$ instead of $f$), $h\alpha_{n+1} = h\mu(\psi\alpha_n) = \delta(\psi h)(\psi\alpha_n) = \delta(\psi(h\alpha_n)) = \delta(\psi f_n) = f_{n+1}$. Then by the uniqueness of $f$ in 17, $h = f$. □
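For the functor $\psi(C) = N + C \times N$ of 11.1.14, whose least fixed point is the set $N^+$ of nonempty lists, the unique algebra morphism out of the initial algebra can be sketched as a left-to-right fold. This is an illustration under assumptions: a $\psi$-algebra on a set is represented by two hypothetical Python functions `base` and `step` (its two coproduct components), lists are Python lists, and the structure map of $N^+$ is taken to append an element on the right.

```python
def fold(base, step, ns):
    """Sketch of the unique algebra morphism f: N+ -> A for the algebra [base, step]:
    f([n]) = base(n) and f(w + [n]) = step(f(w), n)."""
    if not ns:
        raise ValueError("N+ contains only nonempty lists")
    acc = base(ns[0])
    for n in ns[1:]:
        acc = step(acc, n)
    return acc

# Three hypothetical algebras on the set of natural numbers.
total = fold(lambda n: n, lambda a, n: a + n, [3, 1, 4])
last = fold(lambda n: n, lambda a, n: n, [3, 1, 4])
length = fold(lambda n: 1, lambda a, n: a + 1, [3, 1, 4])
```

The restriction of `fold(base, step, -)` to lists of length at most $n$ plays the role of the maps $f_n$ in the proof: each is determined by the previous one through one application of the algebra structure.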
18 Corollary. The Kleene fixed point theorem 6.2.13 is the special case that C is the category $C(P, \le)$ of a domain $(P, \le)$.
EXERCISES FOR SECTION 11.2
1. Let $X = X_1 \cup X_2 \cup X_3 \cup \cdots$, where each $X_i$ is countable. Prove that $X$ is countable. [Hint: Use the coproduct property to construct a surjective function $\coprod X_i \to X$ and conclude that it suffices to prove $\coprod X_i$ is countable. To this end let $p_1 < p_2 < p_3 < \cdots$ be prime numbers. List the elements of $X_i$ and let $f_i: X_i \to N$ map the $j$th element of $X_i$ to $p_{2i}p_{2j+1}$. Show that the induced function $\coprod X_i \to N$ is injective.]
2. Let $\psi: \mathbf{Set} \to \mathbf{Set}$ be the functor $\psi(A) = A^*$, where for $f: A \to B$, $\psi(f)(a_1, \ldots, a_n) = (fa_1, \ldots, fa_n) \in B^*$. Verify that $\psi$ is a functor and show that $\psi$ is co-continuous. [Hint: Write $\psi$ as an infinite coproduct of polynomial functors.]
It follows that the functor $A \mapsto B + A^*$ is co-continuous; its least fixed point formalizes the semantics of "a widget is an element of B or a list of widgets."
3. In the spirit of Exercise 2, formalize "a widget is an element of B or a finite set of widgets" as the least fixed point of a co-continuous functor Set $\to$ Set.
4. In the spirit of Exercises 2 and 3 use 8 to formalize "a widget is an element of B or a list of finite sets of lists of widgets" as the least fixed point of a co-continuous functor Set $\to$ Set.
11.3 Continuous Functors and Greatest Fixed Points
The dual of the Generalized Kleene Fixed Point Theorem (11.2.13) asserts that if $\psi$ is continuous and $T$ is terminal then the limit of the left chain
$\cdots \to \psi^3 T \xrightarrow{\psi^2(!)} \psi^2 T \xrightarrow{\psi(!)} \psi T \xrightarrow{\,!\,} T$
provides $\psi$ with its greatest fixed point. The definitions and general theory are dual to those in the preceding two sections and so have already been done. The main task of this section is to explore limits in Set and to show that polynomial endofunctors of Set are not only co-continuous but continuous as well and thus are "bicontinuous."
1 Definition. Let C be a category. Notations for the dual category $C^{op}$ were given in 2.2.13. A left chain in C is a right chain in $C^{op}$ and so has the form
$\cdots \to C_3 \xrightarrow{c_2} C_2 \xrightarrow{c_1} C_1 \xrightarrow{c_0} C_0.$
Thus, left chains proceed from infinity from the left. Let $(c_n)$ be a left chain. A lower bound of $(c_n)$ is $(U, \alpha)$ with $\alpha = (\alpha_n \mid n \ge 0)$, $\alpha_n: U \to C_n$ such that $(U, \alpha)$ is an upper bound of $(c_n)$ in $C^{op}$:
$c_n\alpha_{n+1} = \alpha_n \quad (n \ge 0).$
A limit of $(c_n)$ is a lower bound of $(c_n)$ which, in $C^{op}$, is a colimit of $(c_n)$. Thus, if $(V, \beta)$ is a lower bound of $(c_n)$ in C, there exists unique $f: V \to U$ with
$\alpha_n f = \beta_n \quad (n \ge 0).$
C has limits of left chains if every left chain has a limit.
2 Example. Let C = $C(P, \le)$ be a poset qua category as discussed following 10.1.7. A left chain is a descending chain
$\cdots \le x_3 \le x_2 \le x_1 \le x_0.$
A lower bound is an element $x$ with $x \le x_n$ for every $n$, and a limit is the greatest element of the set of lower bounds, that is, $x$ is a limit of $(x_n)$ if $x \le x_n$ for all $n$ and if for every $y$ with $y \le x_n$ for all $n$, $y \le x$. This is a special case of the greatest lower bound introduced earlier in 6.1.1.
In the theory of domains, ascending chains represent approximating sequences as discussed following 5.1.16, in the notes for Chapter 6, and in Section 13.1. While this interpretation fares poorly for right chains in Set, this is not so for left chains as we next explore.
3 Example (Limits of Left Chains in Set). Let
$\cdots \to C_3 \xrightarrow{c_2} C_2 \xrightarrow{c_1} C_1 \xrightarrow{c_0} C_0$
be a left chain in Set. We think of the elements of $C_n$ as "structures of depth $n$," and $c_n(x)$ as "the highest $n$ levels of the depth $n + 1$ structure $x$;" $c_n(x)$ "approximates" $x$. An example to keep in mind is $A$ an alphabet, $C_n = A^n$ = words of length $n$, $c_n(a_1 \cdots a_{n+1}) = a_1 \cdots a_n$. An approximating sequence is a sequence $(x_n \mid n \ge 0)$ with $x_n \in C_n$ such that $c_n(x_{n+1}) = x_n$ for all $n$. In the example above, an approximating sequence amounts to an infinite list $a_1a_2a_3 \cdots$ represented as the sequence of its finite approximations $a_1$, $a_1a_2$, $a_1a_2a_3$, .... In general, given the left chain $(c_n)$, set $U$ to be the set of all approximating sequences of $(c_n)$ and define $\alpha_n: U \to C_n$ by $\alpha_n(x_0, x_1, x_2, \ldots) = x_n$. If $(x_0, x_1, x_2, \ldots) \in U$ then $c_n(x_{n+1}) = x_n$ so that $c_n\alpha_{n+1} = \alpha_n$ and $(U, \alpha)$ is a lower bound of $(c_n)$. We will show it is in fact a limit. Let $(V, \beta)$ be another lower bound of $(c_n)$. We must show there exists unique $f: V \to U$ with
$\alpha_n f = \beta_n \quad (n \ge 0).$
Thus, if such $f$ exists the $n$th coordinate of $f(v)$, being $\alpha_n f(v)$, must be $\beta_n(v)$, that is,
4    $f(v) = (\beta_0(v), \beta_1(v), \beta_2(v), \ldots)$
is the only possible way of defining $f$. The only thing left to check is that $f(v)$ in 4 is an approximating sequence, that is, that $c_n(\beta_{n+1}(v)) = \beta_n(v)$. But this is precisely the condition that $(V, \beta)$ be a lower bound for $(c_n)$.
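The construction can be tried out on finite prefixes of approximating sequences. In this sketch the chain is given by one function `c(n, x)` standing for $c_n(x)$, and the names `is_approximating`, `induced`, `drop_last` and the sample lower bound are assumptions introduced for the illustration; only finitely many coordinates can be inspected, so this checks the defining condition rather than proving anything about the whole limit.

```python
def is_approximating(c, xs):
    """Check c_n(x_{n+1}) == x_n along a finite prefix xs = [x_0, ..., x_N]."""
    return all(c(n, xs[n + 1]) == xs[n] for n in range(len(xs) - 1))

def induced(betas, v):
    """First len(betas) coordinates of f(v) = (beta_0(v), beta_1(v), ...)."""
    return [beta(v) for beta in betas]

# Hypothetical chain: C_n = words of length n, and c_n drops the last letter.
drop_last = lambda n, word: word[:n]

# Hypothetical lower bound: V = nonempty strings, beta_n(v) = first n letters
# of the periodic word v v v ...
betas = [lambda v, n=n: (v * n)[:n] for n in range(6)]
prefix = induced(betas, "ab")
```

Here `prefix` lists the finite approximations of the infinite word obtained by repeating "ab", and the lower bound condition on the `betas` is exactly what makes it an approximating sequence.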
It was observed in 11.1.19 that colimits of right chains of total functions are the same in Set and Pfn. Such is not the case for limits of left chains. Limits of left chains in Pfn are explored further in Exercises 1-4.
5 Example. Let $A$ be a set ("the alphabet") and let $(c_n)$ be the left chain
6    $\cdots \to A^3 \xrightarrow{c_2} A^2 \xrightarrow{c_1} A^1 \xrightarrow{c_0} A^0$
where $c_n(a_1 \cdots a_{n+1}) = a_1 \cdots a_n$. Then $(A^\infty, \alpha)$ with $\alpha_n(a_1a_2a_3 \cdots) = a_1 \cdots a_n$ is the limit of $(c_n)$. It is a lower bound since $c_n\alpha_{n+1}(a_1a_2a_3 \cdots) = c_n(a_1 \cdots a_{n+1}) = a_1 \cdots a_n = \alpha_n(a_1a_2a_3 \cdots)$. If $(V, \beta)$ is an arbitrary lower bound define $y_n(v)$, for $n \ge 1$, to be the last symbol in $\beta_n(v)$. We shall prove that $\beta_n(v) = y_1(v) \cdots y_n(v)$ for all $n \ge 0$ by induction. This is clear for $n = 0$. For the inductive step, since $c_n$ keeps all but the last symbol of a string of $A^{n+1}$ we have that
$\beta_{n+1}(v) = (c_n\beta_{n+1}(v))\,y_{n+1}(v) = \beta_n(v)\,y_{n+1}(v) = y_1(v) \cdots y_n(v)\,y_{n+1}(v).$
It then follows that $(U, \alpha)$, with $U = A^\infty$, is a limit of $(c_n)$. Define $f: V \to U$ by $f(v) = y_1(v)y_2(v)y_3(v) \cdots$. Then, for all $n \ge 0$,
$(\alpha_n f)(v) = \alpha_n(y_1(v)y_2(v)y_3(v) \cdots) = y_1(v) \cdots y_n(v) = \beta_n(v).$
If also $g: V \to U$ satisfies $\alpha_n g = \beta_n$ for all $n$ then $g(v) = x_1x_2x_3 \cdots$, where $x_1 \cdots x_n = \alpha_n g(v) = \beta_n(v) = y_1(v) \cdots y_n(v)$, so that $f = g$.
An alternate way to prove $(A^\infty, \alpha)$ is a limit is to show that it is isomorphic to the construction of Example 3; we leave this as an exercise for the reader.
7 Definition. A functor $\psi: C \to C$ is continuous if whenever $c_n: C_{n+1} \to C_n$ is a left chain in C and $(U, \alpha)$ is a limit for $(c_n)$ then $(\psi U, \psi\alpha)$, automatically a lower bound for $(\psi c_n)$ since $(\psi c_n)(\psi \alpha_{n+1}) = \psi(c_n \alpha_{n+1}) = \psi \alpha_n$, is a limit for $(\psi c_n)$.
8 Theorem. A constant functor and the identity functor are continuous. A composition of continuous functors is continuous. If D has finite products and $\psi_1, \ldots, \psi_k$ are continuous functors $C \to D$ then their product $\psi_1 \times \cdots \times \psi_k$ (as in 10.1.18) is continuous.
PROOF. This follows by duality from Theorems 11.2.5-8. Here a crucial observation is that $\psi: C \to D$ "is" also a functor $\psi^{op}: C^{op} \to D^{op}$ defined by
9    $\psi^{op}(C) = \psi C, \qquad \psi^{op}(C_1 \xrightarrow{\;f\;} C_2) = \psi C_1 \xrightarrow{\;\psi f\;} \psi C_2,$
functoriality being clear. (In understanding $\psi^{op}$, the reader may wish to consider the case where C, D are posets; here, to say $f$ is monotone one could say "$x_1 \leq x_2 \Rightarrow f x_1 \leq f x_2$" or, alternatively, "$x_1 \geq x_2 \Rightarrow f x_1 \geq f x_2$," the latter description being $f^{op}$.) Also, dualizing the remark following 11.2.8, any product of continuous functors is continuous. □
The dual of Theorem 11.2.9 applies to functors $\mathrm{Set}^{op} \to \mathrm{Set}^{op}$, not to
Set → Set. The following, then, is a new theorem whose proof must be
supplied:
10 Theorem. Let $\psi_1, \ldots, \psi_k: C \to \mathrm{Set}$ be continuous. Then their coproduct $\psi_1 + \cdots + \psi_k: C \to \mathrm{Set}$ (as in 10.1.19) is continuous.
PROOF. Just as in the proof of 11.2.9, it is useful to first prove the following:
11 Lemma. For $i = 1, \ldots, k$ let
$$c_n^i: C_{n+1}^i \to C_n^i$$
be a left chain in Set, with limit $(U^i, \alpha^i)$. Define $(U, \alpha)$ by $U = U^1 + \cdots + U^k$; $\alpha_n = \alpha_n^1 + \cdots + \alpha_n^k: U \to C_n^1 + \cdots + C_n^k$. Then $(U, \alpha)$ is a limit of
$$c_n^1 + \cdots + c_n^k: C_{n+1}^1 + \cdots + C_{n+1}^k \longrightarrow C_n^1 + \cdots + C_n^k.$$
PROOF OF LEMMA. As in the proof of Lemma 11.2.10, it is left as an exercise to show that we may assume each $(U^i, \alpha^i)$ is constructed as in Example 3. Now let $(V, \beta)$ be a lower bound for $(c_n^1 + \cdots + c_n^k)$. We must construct unique $f: V \to U$ with $(\alpha_n^1 + \cdots + \alpha_n^k) f = \beta_n$ for all $n \geq 0$.
Let $v \in V$. There exists unique $i \in \{1, \ldots, k\}$ such that $\beta_0(v)$ has form $(i, x_0)$, $x_0 \in C_0^i$. Since $(i, x_0) = \beta_0(v) = (c_0^1 + \cdots + c_0^k)(\beta_1(v))$, $\beta_1(v)$ must have the form $(i, x_1)$ with $x_1 \in C_1^i$, $c_0^i(x_1) = x_0$. Continuing in this way, $\beta_{n+1}(v)$ has the form $(i, x_{n+1})$ with $c_n^i(x_{n+1}) = x_n$. Define $f(v) = (i, (x_0, x_1, x_2, \ldots))$. Since $(x_0, x_1, \ldots)$ is indeed an approximating sequence for the left chain $(c_n^i)$, it is in $U^i$ so that $f$ is well defined. The remaining details are clear. □
For the proof of Theorem 10, let $(c_n)$ be a left chain in C and set
$$c_n^i: C_{n+1}^i \to C_n^i \;=\; \psi_i(c_n): \psi_i C_{n+1} \to \psi_i C_n.$$
The lemma then applies. □
Unlike 11.2.9, the finiteness of the coproduct in Theorem 10 is not crucial.
The same proof shows that an arbitrary coproduct of continuous set-valued
functors is continuous.
12 Definition. A functor $\psi: C \to D$ is bicontinuous if it is continuous and co-continuous.
Combining the theorems of this section and 11.2.12, we have a major
result:
13 Theorem. A polynomial functor $\psi: \mathrm{Set} \to \mathrm{Set}$ is bicontinuous.
Next, we spell out the dual of 11.2.13 as an exercise in using duality.
14 Theorem. Let C be a category with a terminal object $T$ and such that every left chain has a limit. Then every continuous $\psi: C \to C$ has as greatest fixed point the limit of the left chain
$$\cdots \longrightarrow \psi^3(T) \xrightarrow{\psi^2(!)} \psi^2(T) \xrightarrow{\psi(!)} \psi(T) \xrightarrow{\;!\;} T.$$
PROOF. $C^{op}$ has an initial object, namely, $T$; $\psi^{op}: C^{op} \to C^{op}$ as in 9 is co-continuous and the right chain
$$T \xrightarrow{\;!\;} \psi(T) \xrightarrow{\psi(!)} \psi^2(T) \xrightarrow{\psi^2(!)} \cdots$$
(we write $\psi$ rather than $\psi^{op}$ to avoid tedium) has a colimit, call it $(L, \alpha)$, in $C^{op}$ so that, by the proof of Theorem 11.2.13, there exists unique $M: \psi L \to L$ in $C^{op}$ such that
$$M\,\psi(\alpha_n) = \alpha_{n+1} \qquad (n \geq 0)$$
and $(L, M)$ is the least fixed point of $\psi^{op}$. It follows that there exists unique $M: L \to \psi L$ in C such that
15    $\psi(\alpha_n)\, M = \alpha_{n+1} \qquad (n \geq 0).$
Given an arbitrary $\psi$-coalgebra $(A, \Delta)$, $(A, \Delta)$ is a $\psi^{op}$-algebra so that there exists a unique $\psi^{op}$-algebra morphism $f$ from $(L, M)$ to $(A, \Delta)$ in $C^{op}$. That is, there exists a unique $f: A \to L$ in C with
16    $M f = \psi(f)\, \Delta.$
Thus, $(L, M)$ is the greatest fixed point of $\psi$. □
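17 Example (Infinite Lists). The limit of Example 5 arises from the polynomial functor $\psi: \mathrm{Set} \to \mathrm{Set}$, $\psi(C) = A \times C$ whose least fixed point is empty but whose greatest fixed point is the limit of the left chain (written here with the factor $A$ on the right, which changes $\psi$ only up to isomorphism)
$$\cdots \longrightarrow ((T \times A) \times A) \times A \xrightarrow{(! \times id) \times id} (T \times A) \times A \xrightarrow{\;! \times id\;} T \times A \xrightarrow{\;!\;} T = \{\Lambda\}$$
which, up to isomorphism, is just 6. The greatest fixed point is $(A^\infty, M)$ with $M$ being the fixed point isomorphism $A^\infty \cong A \times A^\infty$.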
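For $\psi(C) = A \times C$ in Set, the greatest fixed point property of Theorem 14 can be illustrated with lazy streams. This is a hedged sketch rather than the book's construction: Python generators stand in for elements of $A^\infty$, and `unfold`, `M`, `alpha`, and the coalgebra `delta` are names chosen here for illustration.

```python
from itertools import islice

def unfold(delta, state):
    """The map f : S -> A^infinity induced by a coalgebra delta : S -> A x S."""
    while True:
        letter, state = delta(state)
        yield letter

def M(stream):
    """Fixed point isomorphism applied to one stream: split off the head."""
    it = iter(stream)
    head = next(it)
    return head, it

def alpha(n, stream):
    """Limit projection alpha_n: the first n letters of a stream."""
    return tuple(islice(stream, n))

def delta(s):
    """Hypothetical coalgebra on the integers: emit 'a' or 'b' by parity."""
    return ("ab"[s % 2], s + 1)

first_four = alpha(4, unfold(delta, 0))
```

The coalgebra morphism condition 16 says that taking `M` of `unfold(delta, s)` yields the letter produced by `delta(s)` together with the stream unfolded from the next state; the sketch only checks this on finite prefixes.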
18 Notation for Recursive Specification. If $\psi: C \to C$ is an endofunctor,
$$C ::= \psi(C), \qquad C =:: \psi(C)$$
specify, respectively, the least and greatest fixed points of $\psi$. Criteria to ensure existence have been developed in this chapter. The presence of both possibilities encourages careful thought in advance as to whether one wants to define functions from or to the data type to be specified.
EXERCISES FOR SECTION 11.3
Let C, D be categories. A functor $\psi: C \to D$ is an isomorphism of categories if $\psi$ is bijective on objects and if for every $C_1$, $C_2$, $f \mapsto \psi f$ is bijective from $C(C_1, C_2)$ to $D(\psi C_1, \psi C_2)$. C, D are isomorphic categories if such an isomorphism $\psi$ exists.
1. Show that "isomorphism of categories" is an equivalence relation.
2. Show that if C, D are isomorphic categories then C has limits of left chains if and only if D does. Prove a similar statement for colimits of right chains, products, and coproducts. [Hint: Use duality!]
3. A set with base point is $(X, x_0)$ where $X$ is a set and $x_0 \in X$. These form the objects of a category if a morphism $f: (X, x_0) \to (Y, y_0)$ is a total function $f: X \to Y$ with $f(x_0) = y_0$. Verify that this is a category if composition and identities are the usual ones for total functions. Then prove that this category is isomorphic to Pfn. [Hint: If C is the category of sets with base point define $\psi: \mathrm{Pfn} \to C$ by $\psi X = (X \cup \{\bot\}, \bot)$, where $\bot \notin X$. For $f: X \to Y$ in Pfn define
$$(\psi f)(x) = \begin{cases} f(x) & \text{if } x \in DD(f) \\ \bot & \text{else.} \end{cases}$$
Then $\psi$ is an isomorphism.]
4. Prove that Pfn has limits of left chains. [Hint: the constructions of 3 generalize directly to the category of sets with base point.]
5. Show that the functor $\psi(A) = B + A^*$ of Exercise 11.2.2 is bicontinuous.
6. Define $\psi: \mathrm{Set} \to \mathrm{Set}$ by $\psi(A) = A^\infty$, $(\psi f)(a_1 a_2 a_3 \cdots) = f(a_1) f(a_2) f(a_3) \cdots$. Verify that $\psi$ is a functor. Show that $\psi$ is continuous. [Hint: $\psi$ is an infinite product of polynomial functors.] Show that $\psi$ is not co-continuous. Conclude that $\psi$ is not a polynomial functor.
7. Let C be a category. Define a new category C × C with objects all pairs $(C_1, C_2)$ with $C_1$, $C_2$ C-objects, morphisms $(f_1, f_2): (C_1, C_2) \to (D_1, D_2)$ pairs $f_1: C_1 \to D_1$, $f_2: C_2 \to D_2$ of C-morphisms and composition and identities as in C on each coordinate. Verify that C × C is a category. Also show that if C has finite products, finite coproducts, colimits of right chains, or limits of left chains then C × C does too by performing the corresponding C-construction on each coordinate.
8. For Set × Set as in Exercise 7, the semantics of the simultaneous recursive specification
$$L ::= A + (A \times L) + M$$
$$M ::= L + (B \times M)$$
arises by computing the least fixed point of
$$\mathrm{Set} \times \mathrm{Set} \longrightarrow \mathrm{Set} \times \mathrm{Set}, \qquad (L, M) \longmapsto (A + (A \times L) + M,\; L + (B \times M)).$$
Show that this functor is co-continuous so that 11.2.13 applies. Show that the object part of the least fixed point is
$$(A^*A + A^*(B + A^*)^*A^*A,\; (B + A^*)^*A^*A)$$
by using substitution and 10.2.9.
Notes and References for Chapter 11
The Generalized Kleene Theorem 11.2.13, clearly stated as a generalization of the
Kleene fixed point theorem 6.2.13 from posets to categories, is due to M. B. Smyth,
"Category-theoretic solution of recursive domain equations," Theory of Computation
Report No. 14, Department of Computer Science, University of Warwick, 1976.
The reader may wish to consult a text in category theory to learn more about limits and colimits. Colimits of right chains and coproducts are examples of colimits. A simple colimit, not otherwise discussed in this book, is that of a coequalizer of a pair $f, g: X \to Y$, this being a morphism of form $h: Y \to Z$ such that $hf = hg$ and with the property that (see the diagram below)
$$X \underset{g}{\overset{f}{\rightrightarrows}} Y \xrightarrow{\;h\;} Z \overset{\alpha}{\dashrightarrow} Z', \qquad h': Y \to Z'$$
whenever $h'f = h'g$ there exists a unique $\alpha$ with $\alpha h = h'$. An important theorem is that all colimits can be constructed from coproducts and coequalizers. Dual statements, of course, hold for limits.
Greatest fixed points in sets and their utility for the semantics of data types are due
to the authors in their paper on parametrized data types cited in the notes to Chapter
10.
CHAPTER 12
Parametric Specification
12.1 Arrays
12.2 Stacks and Queues
12.3 A Functional Programming Fragment Revisited
In Section 2 we will define "lists of E" in terms of the least fixed point
specification
$$E^* ::= 1 + E \times E^*$$
(i.e., "a list is the empty list ($1 = \{\Lambda\}$) or an element of E followed by a list").
But the parenthetical explication just given works only in Set, whereas the
specification works in any category in which polynomial functors have least
fixed points. In this sense E is a "parameter" for the "lists-of-" specifier. The
specific parameter E may itself arise from another data type specification and
may live in a category of arbitrary complexity subject to the technical needs
of semantics.
We do not wish to address the general theory of "computable parametric
specification" in this book. We are content to give a leisurely discussion of
three parametric specifiers: arrays-of-, lists-of-, and DTNs-of-. Arrays are
considered in Section 1. No recursion is used here but products are required
to distribute over coproducts in order to define "array-handling" functions
familiar to programmers. In Section 2 we construct lists and regard these
alternatively as stacks or as queues depending on which morphisms defined
on lists are regarded as focal. Here recursive specification is central. Section
3 formalizes the data type DTN of Section 1.3 and makes it clear that the
functional programming fragment FPF, which received a formal semantics in
Pfn in Section 6.3, could in fact be given a formal semantics in a broader class
of categories.
12.1 Arrays
The Pascal type declaration
1    array 1 .. n of C
is a simple data type; that is, it involves no recursive specification. If $C$ is a set, the informal semantics of 1 is the $n$-fold Cartesian product set $C^n = C \times \cdots \times C$ together with the functions
2    $\text{spec}: C^n \times \{1, \ldots, n\} \longrightarrow C, \qquad ((c_1, \ldots, c_n), i) \longmapsto c_i$
3    $\text{assign}: C^n \times C \times \{1, \ldots, n\} \longrightarrow C^n, \qquad ((c_1, \ldots, c_n), c, i) \longmapsto (c_1, \ldots, c_{i-1}, c, c_{i+1}, \ldots, c_n)$
which express the Pascal operations
4    X[i]        X[i] := c
if X is of type 1. The choice of 2 and 3 as "functions belonging to an array" is somewhat arbitrary, but suitable to illustrate a number of ideas.
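In Set, the functions 2 and 3 are easy to write down directly. The following is a minimal sketch, assuming arrays are modelled as Python tuples with the 1-based indexing of the Pascal declaration; the function names follow the text, everything else is illustrative.

```python
def spec(X, i):
    """Equation 2: ((c_1, ..., c_n), i) |-> c_i, for 1 <= i <= n."""
    if not 1 <= i <= len(X):
        raise IndexError("index outside 1..n")
    return X[i - 1]

def assign(X, c, i):
    """Equation 3: replace the i-th coordinate of X by c (X itself is unchanged)."""
    if not 1 <= i <= len(X):
        raise IndexError("index outside 1..n")
    return X[:i - 1] + (c,) + X[i:]

X = (10, 20, 30)
Y = assign(X, 99, 2)   # models the Pascal statement X[2] := 99
```

Since `assign` returns a new tuple, it matches the mathematical reading of 3 as a function $C^n \times C \times \{1, \ldots, n\} \to C^n$ rather than an in-place update.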
For the balance of this section we work in a category C which satisfies the following axioms which are also required in Section 2:
5 C has a terminal object, call it 1.
6 C has finite coproducts. The coproduct of $1 + \cdots + 1$ ($n$ times) will be denoted $[n]$.
7 C has finite products.
8 For each object $A$, $A \times -$ preserves finite coproducts, that is, if $in_i: B_i \to B_1 + \cdots + B_n$ is a coproduct then
$$id_A \times in_i: A \times B_i \longrightarrow A \times (B_1 + \cdots + B_n)$$
is again a coproduct.
These axioms all hold in Set (verify!). We need one technical result:
9 Observation. For any object $A$, $pr_1: A \times 1 \to A$ is an isomorphism.
PROOF. Consider the map $[id_A, !]: A \to A \times 1$ induced by $id_A: A \to A$ and the unique map $!: A \to 1$. We see that $[id_A, !]\,pr_1 = [pr_1, !] = id_{A \times 1}$. But $pr_1 [id_A, !] = id_A$. This shows $pr_1$ is an isomorphism with inverse $[id_A, !]$. □
We are now ready to define the semantics of 1 in C, where $C$ is any object of C. The "object part" is $C^n$, the $n$-fold product. Then
$$id \times in_j: C^n \times 1 \longrightarrow C^n \times [n]$$
is a coproduct by axiom 8, which with 9 allows the following definition of spec:
10    $\text{spec}: C^n \times [n] \to C$ is the unique map with $\text{spec}\,(id \times in_j) = pr_j\, pr_1$ for $j = 1, \ldots, n$, where $pr_1: C^n \times 1 \to C^n$ is the isomorphism of 9 and $pr_j: C^n \to C$.
This clearly yields 2 if C = Set.
To define assign, we make use of the following:
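11 Theorem (Bi-index Principle). Let $A_1, \ldots, A_m$, $B_1, \ldots, B_n$ be objects and let $f_{ij}: A_i \to B_j$, $i = 1, \ldots, m$, $j = 1, \ldots, n$. Then there exists unique $f: A_1 + \cdots + A_m \to B_1 \times \cdots \times B_n$ such that
$$pr_j\, f\, in_i = f_{ij} \qquad (\text{all } i, j).$$
PROOF. For fixed $j$ there exists unique $f_j: A_1 + \cdots + A_m \to B_j$ with $f_j\, in_i = f_{ij}$ for all $i$. Let $f$ be the unique map with $pr_j f = f_j$. Then $pr_j f\, in_i = f_j\, in_i = f_{ij}$. If also $pr_j g\, in_i = f_{ij}$ then, for each $j$, $pr_j g$ must be $f_j$ so that $g = f$. (Alternatively, first construct $h_i: A_i \to B_1 \times \cdots \times B_n$ with $pr_j h_i = f_{ij}$ for all $j$, and then define $f$ by $f\, in_i = h_i$.) □
The map assign is then defined using 8 and 11 by
12    $\text{assign}: C^n \times C \times [n] \to C^n$ is the unique map with $pr_j\, \text{assign}\, (id \times in_i) = f_{ij}$ for all $i, j$, where $id \times in_i: C^n \times C \times 1 \to C^n \times C \times [n]$
and where
$$f_{ij} = \begin{cases} C^n \times C \times 1 \xrightarrow{\;pr_2\;} C & \text{if } i = j \\ C^n \times C \times 1 \xrightarrow{\;pr_1\;} C^n \xrightarrow{\;pr_j\;} C & \text{if } i \neq j. \end{cases}$$
This recaptures 3 when C = Set.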
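The Bi-index Principle has a direct reading in Set. The sketch below is an illustration under stated assumptions: coproduct elements are modelled as tagged pairs `(i, a)`, product elements as tuples, indices are 0-based, and the family `f` is an invented example.

```python
def bi_index(f):
    """Given f[i][j] : A_i -> B_j, return the induced map
    A_1 + ... + A_m -> B_1 x ... x B_n (0-based indices)."""
    def induced(tagged):
        i, a = tagged                       # element of the coproduct
        return tuple(f_ij(a) for f_ij in f[i])
    return induced

def inj(i):
    """Coproduct injection in_i."""
    return lambda a: (i, a)

def pr(j):
    """Product projection pr_j."""
    return lambda bs: bs[j]

# Hypothetical family with m = 2 (integers, strings) and n = 3.
f = [
    [lambda a: a + 1, lambda a: a * 2, lambda a: str(a)],
    [lambda s: len(s), lambda s: s.upper(), lambda s: s + "!"],
]
F = bi_index(f)
```

The defining equation $pr_j\, f\, in_i = f_{ij}$ becomes `pr(j)(F(inj(i)(a))) == f[i][j](a)`. The same pattern yields assign: on the summand with tag `i`, coordinate `j` of the result is the new value when `i == j` and the old coordinate otherwise.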
To show that these constructions are not just exercises in formalism we
introduce explicit elements and functions (but see Exercise 1).
13 Definition. For $C$ any object, an element of $C$ is a morphism $x: 1 \to C$. We write $x \in C$.
This corresponds to the usual notion in Set. In general, "morphisms are functions" in that if $x \in C$ and $f: C \to D$ in C the composition $fx$ indeed is an element of $D$. Note also that the product property (2.3.11) asserts that the elements of a product $C_1 \times \cdots \times C_n$ are just the Cartesian product set of the sets of elements of the $C_i$.
14 Definition. We now generalize 4. If $X \in C^n$ and if $1 \leq i \leq n$, $X[i] \in C$ is defined by
$$X[i] = 1 \xrightarrow{\;X\;} C^n \xrightarrow{\;pr_i\;} C.$$
If also $a \in C$ then "$X[i] := a$" should change $X$ to the $Y: 1 \to C^n$ defined as
$$1 \xrightarrow{[X,\, a,\, in_i]} C^n \times C \times [n] \xrightarrow{\text{assign}} C^n.$$
The reader should note that the theory of the present section does not work in Pfn or Mfn (see Exercise 1 for more details). For example, in Mfn the categorical product construction yields the disjoint union, with projections
$$pr_j(a, i) = \begin{cases} \{a\} & \text{if } i = j \\ \emptyset & \text{else,} \end{cases}$$
which hardly yields the semantics of arrays. This leads us to make a point which appears counterintuitive but which, once grasped, avoids a great deal of confusion. The appropriate category for the theory of program semantics is not, in general, the appropriate category for the semantics of data types. For example, we have seen that Pfn and Mfn provide appropriate categories for the semantics of large classes of deterministic and nondeterministic programs, respectively, whereas Set provides the setting for defining many data types. The functions associated with data types through constructions in Set are, of course, then available as morphisms in both Pfn and Mfn (cf. the discussion of $C_{TOT}$ preceding 12.2.7). In fact, morphisms defined in Set can then yield familiar interpretations of partial isomorphisms, as will be seen to be the case for pop and push of stacks in the next section. No matter which category we choose for our semantics, the general theory is still available to guide our analysis and constructions.
EXERCISES FOR SECTION 12.1
1. Show that Pfn satisfies axioms 5-7 but not 8. [Hint: See Exercises 2.3.8 and 11.3.3.] Do the same for Mfn. [Hint: See Exercise 2.3.7.] In both categories, show that every object has exactly one element! [Hint: What is the terminal object 1 of Mfn and Pfn?] Hence, using elements and functions gives a degenerate view of these categories.
There are many important categories, for example, sheaves over a topological space, which satisfy axioms 5-8 but for which elements give an incomplete view of the structure of the objects. It is not clear that such categories would arise in program semantics, however.
2. Given $X \in C^n$ and $1 \leq i \leq n$, use spec to define $X[i] \in C$.
3. Given an "array of arrays" $X \in (C^n)^m$ and $1 \leq i \leq n$, $1 \leq j \leq m$, define $X[i,j] \in C$.
12.2 Stacks and Queues
Given a set $E$ of elements, we may imagine situations in which we construct words $x_n \cdots x_1$ in $E^*$ in such a way that our access to the $x_i$ in a word is restricted. Stacks and queues provide two such examples. An instance of a stack is a stack of bills on a spike at a restaurant cash register: if $x_n \cdots x_1$ is such a stack (with $x_1$ the first put on the spike, $x_2$ the second, etc.) then immediate access is limited to $x_n$ which, if removed, leaves $x_{n-1} \cdots x_1$. Hence, when $E^*$ is regarded as "stacks of E" we will want to include with this data type the following partial functions. In this section total functions will be written $A \to B$ and (possibly but not necessarily total) partial functions will be denoted with half-arrows $A \rightharpoonup B$.
1    $\text{top}: E^* \rightharpoonup E$, $DD(\text{top}) = E^+$ = nonempty words, $\text{top}(x_n \cdots x_1) = x_n$
2    $\text{pop}: E^* \rightharpoonup E^*$, $DD(\text{pop}) = E^+$, $\text{pop}(x_n \cdots x_1) = x_{n-1} \cdots x_1$
An example of a queue is a queue of people buying tickets. The ticket seller deals with individuals in the order in which they entered the queue, just the reverse of the stack case. Hence, relevant queue functions are
3    $\text{bottom}: E^* \rightharpoonup E$, $DD(\text{bottom}) = E^+$, $\text{bottom}(x_n \cdots x_1) = x_1$
4    $\text{rest}: E^* \rightharpoonup E^*$, $DD(\text{rest}) = E^+$, $\text{rest}(x_n \cdots x_1) = x_n \cdots x_2$
Both stacks and queues have the empty word which we express formally as a function on a one-element set as in 12.1.13 by
5    $\Lambda: 1 \to E^*$
as well as the function
6    $\text{push}: E \times E^* \to E^*$, $(x, w) \mapsto xw$.
Both stacks and queues are built up from the left (choice of left over right is arbitrary) by successive pushes beginning with the empty word, so that
$$\text{push}(x_1, \Lambda) = x_1, \qquad \text{push}(x_2, x_1) = x_2 x_1, \qquad \text{push}(x_3, x_2 x_1) = x_3 x_2 x_1,$$
and so on.
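The functions 1-6 in the case C = Pfn can be sketched as follows. This is an illustrative model only: words $x_n \cdots x_1$ are Python tuples with the most recently pushed letter on the left, and "undefined" is modelled by returning `None`, which assumes `None` is not itself a letter of $E$.

```python
EMPTY = ()  # the empty word, written Λ in the text

def push(x, w):
    """Equation 6: (x, w) |-> xw, a total function."""
    return (x,) + w

def top(w):
    """Equation 1: leftmost letter, defined only on nonempty words."""
    return w[0] if w else None

def pop(w):
    """Equation 2: drop the leftmost letter, defined only on nonempty words."""
    return w[1:] if w else None

def bottom(w):
    """Equation 3: rightmost letter (the first one pushed)."""
    return w[-1] if w else None

def rest(w):
    """Equation 4: drop the rightmost letter."""
    return w[:-1] if w else None

w = push(3, push(2, push(1, EMPTY)))   # the word x3 x2 x1 with x_i = i
```

Stack access uses `top` and `pop`, queue access uses `bottom` and `rest`; both share `EMPTY` and `push`, as in the text.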
The task of this section is to define 1-6 in a broad class of semantic categories. Our approach will be as follows. As long as C has zero maps we may consider a subcategory, call it $C_{TOT}$, of all C-objects and (not necessarily all) total C-morphisms (see 2.2.21-22). Examples to keep in mind are C = Pfn and C = Mfn with $C_{TOT}$ = Set in both cases (though not all total morphisms in Mfn are in Set). $E^*$ arises as the least fixed point of the endofunctor $C \mapsto 1 + E \times C$ of Set (cf. 10.2.9). We shall impose axioms on C so that polynomial endofunctors on $C_{TOT}$ exist and have least fixed points and use this not only to define $E^*$ but to define 1-6 in C in a way that makes natural reference to all of the recursive properties involved.
7 Definition. For the balance of this section we let C be a category and let $C_{TOT}$ be a subcategory of C subject to the following properties:
(i) C has zero morphisms (2.2.16).
(ii) All C-objects are objects of $C_{TOT}$ and every morphism in $C_{TOT}$ is total in C. Furthermore, $C_{TOT}$ satisfies axioms 12.1.5-8, that is, $C_{TOT}$ has a terminal object 1, has finite products and coproducts, and is such that $A \times -: C_{TOT} \to C_{TOT}$ preserves coproducts for every object $A$.
(iii) If $in_j: A_j \to A$ is a finite coproduct in $C_{TOT}$ it is a coproduct in C as well.
(iv) $C_{TOT}$ has colimits of right chains and limits of left chains and every polynomial functor $C_{TOT} \to C_{TOT}$ is bicontinuous.
Notational convention: $A \to B$ for $C_{TOT}$-morphism, $A \rightharpoonup B$ for C-morphism (which may be, but is not necessarily, total).
As already noted, main examples are C = Pfn or Mfn with $C_{TOT}$ = Set. Note that C is not required to have products.
While the full strength of these axioms will not be applied in this section, experience dictates that at least this much should be assumed. Many stronger assumptions are suggested in the exercises.
Let $E$ be an object in C. We then define $E^*$ as follows.
8 Definition. Let $\psi: C_{TOT} \to C_{TOT}$ be the polynomial functor
9    $\psi(C) = 1 + E \times C.$
By 11.2.13 $\psi$ has a least fixed point $(E^*, \mu)$ with $\mu: 1 + E \times E^* \to E^*$. Thus, by the definition of a coproduct, $\mu$ decomposes as $\mu = \langle \Lambda, \text{push} \rangle$ which not only defines the desired generalizations of 5 and 6 but gives
10    $1 \xrightarrow{\;\Lambda\;} E^* \xleftarrow{\;\text{push}\;} E \times E^*$ is a coproduct
because $\mu$ is an isomorphism (see Exercise 1). Note that 10 is a coproduct in both C and $C_{TOT}$ by 7 so we need not specify which.
If C = Pfn then 10 is given by the functions of 5 and 6. That this is a coproduct is clear. The least fixed point property is
11    $f\, \langle \Lambda, \text{push} \rangle = \langle d, \delta \rangle\, (id + id_E \times f)$, where $id + id_E \times f: 1 + E \times E^* \to 1 + E \times D$ and $f: E^* \to D$,
wherein $f$ exists uniquely given $d: 1 \to D$, $\delta: E \times D \to D$. Clearly, 11 is equivalent to
12    $f(\Lambda) = d, \qquad f(xw) = \delta(x, f(w)),$
which indeed does define $f$ recursively.
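Equation 12 is the familiar "fold" over lists. As a minimal sketch in Set (words modelled as Python tuples, leftmost letter first; `fold`, `length`, and `total` are names introduced here for illustration), the unique $f$ determined by $d$ and $\delta$ can be computed as follows.

```python
def fold(d, delta):
    """Return the f determined by f(Λ) = d and f(xw) = delta(x, f(w))."""
    def f(w):
        result = d
        for x in reversed(w):        # the rightmost letter is handled first
            result = delta(x, result)
        return result
    return f

# Illustrative instances: D = natural numbers.
length = fold(0, lambda x, n: n + 1)
total = fold(0, lambda x, s: x + s)

# With D = E*, d = Λ and delta = push, the fold is the identity on E*.
identity = fold((), lambda x, w: (x,) + w)
```

The loop is an iterative unrolling of the recursion in 12; it avoids Python's recursion limit on long words but computes the same function.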
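In general, the definition of the stack morphisms in C is immediate, using the quasi projections of 3.2.6 (written PR below, onto the summand $E \times E^*$).
13    $\text{top} = E^* \xrightarrow{\;\mu^{-1}\;} 1 + E \times E^* \xrightarrow{\;PR\;} E \times E^* \xrightarrow{\;pr_1\;} E;$
14    $\text{pop} = E^* \xrightarrow{\;\mu^{-1}\;} 1 + E \times E^* \xrightarrow{\;PR\;} E \times E^* \xrightarrow{\;pr_2\;} E^*.$
That this gives 1 and 2 when C = Pfn is easily checked.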
Our strategy for defining bottom and rest in C by generalizing 3 and 4 is to use the least fixed point property. We begin by showing how 3 is a consequence of 11 when C = Pfn. First extend bottom to $b: E^* \to E^*$ by
$$b(w) = \begin{cases} \Lambda & \text{if } w = \Lambda \\ \text{bottom}(w) & \text{else.} \end{cases}$$
Then
15    $b(\Lambda) = \Lambda, \qquad b(x) = x \;\; (x \in E), \qquad b(xw) = b(w) \;\; (x \in E,\ w \neq \Lambda)$
defines $b$ recursively. For example,
$$b(x_3 x_2 x_1) = b(x_2 x_1) = b(x_1) = x_1.$$
But 15 is a special case of 12 if $D = E^*$, $d = \Lambda$, and
16    $\delta(x, v) = \begin{cases} x & \text{if } v = \Lambda \\ v & \text{else.} \end{cases}$
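The claim that 15 is the instance of 12 with $D = E^*$, $d = \Lambda$ and the $\delta$ of 16 can be checked mechanically in Set. This is a sketch under the same assumptions as before: words are tuples, a letter $x$ viewed as a one-letter word is `(x,)`, and the helper names are illustrative.

```python
def fold(d, delta):
    """The f of equation 12: f(Λ) = d, f(xw) = delta(x, f(w))."""
    def f(w):
        result = d
        for x in reversed(w):
            result = delta(x, result)
        return result
    return f

LAMBDA = ()  # the empty word

def delta16(x, v):
    """Equation 16: delta(x, v) = x if v = Λ, and v otherwise."""
    return (x,) if v == LAMBDA else v

b = fold(LAMBDA, delta16)   # the total extension b of bottom

def bottom(w):
    """Partial function bottom recovered from b (None models 'undefined')."""
    v = b(w)
    return v[0] if v else None
```

For instance `b((3, 2, 1))` evaluates as $\delta(3, \delta(2, \delta(1, \Lambda)))$, which collapses to the one-letter word `(1,)`, matching $b(x_3 x_2 x_1) = x_1$.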
To do this in a general C then amounts to finding a suitable definition of the $\delta$ of 16. The key fact here is that $E \times (-)$ preserves the coproduct 10 in $C_{TOT}$ by 7. Thus,
17    $E \times 1 \xrightarrow{\;id_E \times \Lambda\;} E \times E^* \xleftarrow{\;id_E \times \text{push}\;} E \times (E \times E^*)$ is a coproduct.
Hence, $\delta$ is uniquely defined in $C_{TOT}$ by
18    $\delta\,(id_E \times \Lambda) = \text{push}\,(id_E \times \Lambda), \qquad \delta\,(id_E \times \text{push}) = \text{push}\; pr_2,$
where $pr_2: E \times (E \times E^*) \to E \times E^*$. For C = Pfn, 18 reduces to
19    $\delta(x, \Lambda) = \text{push}(x, \Lambda) = x, \qquad \delta(x, yw) = \text{push}(y, w) = yw$, that is, $\delta(x, v) = v$ if $v \neq \Lambda$,
which is just 16. For general C, there exists a unique $b$ in $C_{TOT}$ defined by the least fixed point property
20    $b\, \langle \Lambda, \text{push} \rangle = \langle \Lambda, \delta \rangle\, (id + id_E \times b)$, where $id + id_E \times b: 1 + E \times E^* \to 1 + E \times E^*$ and $b: E^* \to E^*$,
which gives 15 when C = Pfn.
The desired generalization of 3 in C is then
21    $\text{bottom} = E^* \xrightarrow{\;b\;} E^* \rightharpoonup E \times E^* \rightharpoonup E \times 1 \xrightarrow{\;pr_1\;} E,$
where the quasi projections refer to 10 and 17. This clearly recaptures 3 if C = Pfn, as $E^* \rightharpoonup E$ in 21 is $x \mapsto x$ with domain of definition $E$.
The general construction of rest: $E^* \rightharpoonup E^*$ uses the same concepts that led to 20 and so is left as Exercise 2.
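EXERCISES FOR SECTION 12.2
1. Let $f: C \to D$ be an isomorphism and let
$$A \xrightarrow{\;in_1\;} C \xleftarrow{\;in_2\;} B$$
be a coproduct. Show that
$$A \xrightarrow{\;f\, in_1\;} D \xleftarrow{\;f\, in_2\;} B$$
is also a coproduct.
2. For C = Pfn extend rest to a total function $r: E^* \to E^*$ by $r(\Lambda) = \Lambda$, $r(w) = \text{rest}(w)$ ($w \neq \Lambda$). Show that
$$r(\Lambda) = \Lambda, \qquad r(xw) = \delta(x, r(w))$$
if $\delta: E \times E^* \to E^*$ is defined by
$$\delta(x, w) = \begin{cases} \Lambda & w = \Lambda \\ xw & w \neq \Lambda. \end{cases}$$
Generalize this to the C of 7 and define
$$\text{rest} = E^* \longrightarrow E \times E^* \xrightarrow{\;\text{push}\;} E^* \longrightarrow E^*.$$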
3. If 7(ii) is extended to countable coproducts a much more explicit construction of $E^*$ can be given.
(i) Using the general theory of products, construct an isomorphism $\alpha_n: E \times E^n \to E^{n+1}$ which, in Set, is $\alpha_n(x, (x_1, \ldots, x_n)) = (x, x_1, \ldots, x_n)$. Letting $E^0$ mean 1, show that $\alpha_0$ recaptures 12.1.9.
(ii) Define $E^*$ as the infinite coproduct $1 + E + E^2 + \cdots$. Define $\Lambda = in_0: 1 \to E^*$. As $id_E \times in_n: E \times E^n \to E \times E^*$ is a coproduct, define push: $E \times E^* \to E^*$ by
$$\text{push}\,(id_E \times in_n) = in_{n+1}\, \alpha_n \qquad (n \geq 0).$$
Prove that
$$1 \xrightarrow{\;\Lambda\;} E^* \xleftarrow{\;\text{push}\;} E \times E^*$$
is a coproduct and that $(E^*, \langle \Lambda, \text{push} \rangle)$ is a least fixed point of the $\psi$ of 9.
4. In Set, the reverse map rev: $E^* \to E^*$ is defined by $\text{rev}(\Lambda) = \Lambda$, $\text{rev}(x_n \cdots x_1) = x_1 \cdots x_n$.
(i) Define the queue functions in terms of the stack functions and rev.
(ii) Define rev if $E^*$ is constructed in $C_{TOT}$ as in Exercise 3. (We do not see how to do this in the context of 7 without additional assumptions.)
5. Define $E^+$ by the least fixed point specification
$$E^+ ::= \varphi(E^+) = E + (E \times E^+).$$
For C as in Exercise 3 show that $E^+ \cong E \times E^*$. [Hint: Paralleling Exercise 3, show directly that a least fixed point can be built on $E + E^2 + E^3 + \cdots$.] Conclude that there exists a coproduct
$$1 \longrightarrow E^* \longleftarrow E^+.$$
See Exercise 13.3.3 for a related result.
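6. Define $E^+$ as in Exercise 5. Prove, assuming only 7, that there exists a coproduct of the form
$$1 \longrightarrow E^* \longleftarrow E^+$$
(although $E^+ \cong E \times E^*$ is not clear) by completing the following outline.
(i) Show that there is a coproduct
$$E \longrightarrow E \times (1 + E^+) \longleftarrow E \times E^+$$
so that (recall Exercises 10.2.7-8) we may write the least fixed point isomorphism as $\nu: E \times (1 + E^+) \to E^+$.
(ii) Explain why, given $d: 1 \to D$, $\delta: E \times D \to D$, it suffices to show that there exists unique $c: 1 \to D$, $g: E^+ \to D$ with
$$\langle c, g \rangle\, (id + \nu) = \langle d, \delta \rangle\, (id + (id_E \times \langle c, g \rangle)),$$
where $id + \nu: 1 + (E \times (1 + E^+)) \to 1 + E^+$, $id + (id_E \times \langle c, g \rangle): 1 + (E \times (1 + E^+)) \to 1 + (E \times D)$, and $\langle c, g \rangle: 1 + E^+ \to D$.
(iii) $h = E \xrightarrow{\;pr_1^{-1}\;} E \times 1 \xrightarrow{\;id_E \times d\;} E \times D \xrightarrow{\;\delta\;} D$ depends only on $d$ and $\delta$. Complete the proof by showing that $c$, $g$ satisfy the square of (ii) if and only if $c = d$ and $g$ is the unique $\varphi$-algebra morphism $(E^+, \nu) \to (D, \langle h, \delta \rangle)$.
7. Give an example of a total morphism in Mfn which is not in Set.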
12.3 A Functional Programming Fragment
Revisited
In Section 1.3 we introduced the data type DTN of dynamic trees of natural
numbers and described a number of building operations (composition, conditional, construction, apply-to-all, and insertion) to convert old functions
DTN → DTN into new ones. In this section we apply the theory in the
intervening chapters to present the semantics of this functional programming
fragment in a broad class of semantic categories. While we do not pursue the
details, the requirements are roughly those of 12.2.7, where C(X, Y) has
suitable ordered structure or partially additive structure to model iteration
and recursion.
In the following treatment, we shall make use of a number of results on
functorial constructions. These results are easy to state and apply, but their
proof would unduly burden the body of this volume. (Interested readers will
not find it difficult to provide their own proofs and are encouraged to do so.)
The results are the following:
1 Theorem. Let C, D have colimits of right chains and let C × D be the product category with ob(C × D) = ob(C) × ob(D), and $(C \times D)((C_1, D_1), (C_2, D_2)) = C(C_1, C_2) \times D(D_1, D_2)$ with composition and identities those of C, D on each coordinate.
Then a functor $\Gamma: C \times D \to E$ is co-continuous if and only if $\Gamma$ is separately co-continuous, that is, $\Gamma(C, -): D \to E$ and $\Gamma(-, D): C \to E$ are co-continuous for each $C$ in ob(C) and each $D$ in ob(D).
2 Theorem. Let C have initial object $\bot$ and colimits of right chains, and let $\Gamma: C \times C \to C$ be co-continuous (equivalently, separately co-continuous). Denote $\Gamma(C, -): C \to C$ by $\Gamma_C$. Then the right chain
3    $\bot \longrightarrow \Gamma_C(\bot) \longrightarrow \Gamma_C^2(\bot) \longrightarrow \cdots$
has a colimit which we denote
4    $\alpha_{C,n}: \Gamma_C^n(\bot) \longrightarrow \psi C$
from which we derive the least fixed point $\mu_C: \Gamma_C(\psi C) \to \psi C$ by the colimit property.
It can then be shown that for all $C$, $D$ and $f: C \to D$ in C, there exists a unique $\psi f$ such that
5    $\psi f\; \alpha_{C,n} = \alpha_{D,n}\; \Gamma_n f$
where
$$\Gamma_1 f = \Gamma(C, \bot) \xrightarrow{\;\Gamma(f, id)\;} \Gamma(D, \bot),$$
$$\Gamma_{n+1} f = \Gamma_C \Gamma_C^n(\bot) \xrightarrow{\;\Gamma(f, id)\;} \Gamma_D \Gamma_C^n(\bot) \xrightarrow{\;\Gamma_D(\Gamma_n f)\;} \Gamma_D \Gamma_D^n(\bot).$$
With this definition on morphisms, $\psi$ is a functor $C \to C$. It can in fact then be shown that $\psi$ is co-continuous, that is,
6    $C \longmapsto$ least fixed point of $\Gamma(C, -)$
is a co-continuous functor.
Dually, if C has a terminal object and limits of left chains, then the entire discussion applies to $C^{op}$, yielding the dual result that if $\Gamma$ is continuous, then
7    $C \longmapsto$ greatest fixed point of $\Gamma(C, -)$
is a continuous functor.
The functor $\Gamma: \mathrm{Set} \times \mathrm{Set} \to \mathrm{Set}$, $\Gamma(A, B) = 1 + A \times B$ is separately bicontinuous, being polynomial in each variable, and so is bicontinuous by Theorem 1 and its dual. As discussed in Section 2, $\Gamma(A, -): \mathrm{Set} \to \mathrm{Set}$ has a least fixed point of the form $(A^*, \mu_A)$. This gives rise to a co-continuous functor $A \mapsto A^*$ by 6. We leave it as an exercise for the reader to carefully unravel 5 to prove that, for $f: A \to B$,
8    $f^*: A^* \to B^*$ with $f^*(\langle a_1, \ldots, a_n \rangle) = \langle f a_1, \ldots, f a_n \rangle,$
where, in accordance with the notation of Section 1.3, we write $\langle a_1, \ldots, a_n \rangle$ instead of $a_1 \cdots a_n$ for elements of $A^*$. (Actually, for this special case Exercise 11.2.2 provides a more direct route to the functoriality and co-continuity of $A \mapsto A^*$.)
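In Set, the action 8 of the functor $A \mapsto A^*$ on morphisms is the usual "map over a list". A small sketch, assuming words are modelled as Python tuples and using the illustrative names `star` and `compose`:

```python
def star(f):
    """Send f : A -> B to f* : A* -> B*, applying f letter by letter (equation 8)."""
    return lambda word: tuple(f(a) for a in word)

def compose(g, f):
    """Ordinary composition g after f."""
    return lambda x: g(f(x))

double = lambda n: 2 * n
describe = lambda n: "n=" + str(n)

word = (1, 2, 3)
mapped = star(double)(word)
```

Functoriality amounts to `star(compose(g, f))` agreeing with `compose(star(g), star(f))` and `star` of the identity being the identity; the sketch can only spot-check these equations on sample words.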
We then define the data type $DT_A$ of dynamic trees of $A$ by the least fixed point specification
9    $DT_A ::= A + (DT_A)^*.$
Except that we are using an arbitrary set $A$ rather than the set of natural numbers for the length-1 trees, this specification agrees with the basis and induction steps originally used to define the set of DTNs in Section 1.3.
The least fixed point specified by 9 exists because $\psi C = A + C^*$ is co-continuous, being a coproduct of a constant functor and $C \mapsto C^*$. Indeed, it may be shown that $A \mapsto DT_A$ is co-continuous and that
$$Q ::= A + DT_Q$$
is a valid least fixed point specification, but that is another story. We turn now to building suitable functions on $DT_A$.
The fixed point isomorphism takes the form of a coproduct
10    $A \xrightarrow{\;in_1\;} DT_A \xleftarrow{\;in_2\;} (DT_A)^*.$
Given $f: DT_A \to DT_A$, "apply-to-all $f$" $\alpha f: DT_A \to DT_A$ is then defined using this coproduct by
11    $\alpha f\; in_1 = in_1, \qquad \alpha f\; in_2 = in_2\, f^*,$
where $f^*: (DT_A)^* \to (DT_A)^*$ is as in 8.
The following general discussion will be useful shortly. In a least fixed point specification $Q ::= \psi(Q)$ for co-continuous $\psi$, the least fixed point $(L, \mu)$ arises (as in 4) via a colimit $\alpha_n: \psi^n \bot \to L$. Question: Is it valid to use the $\alpha_n$ as maps associated with the data type? Answer: Yes, since they are easily derived in a finite way from the least fixed point itself. Indeed, define $\beta_n: \psi^n \bot \to L$ for $n \geq 0$ as follows:
12    $\beta_0 = \bot \xrightarrow{\;!\;} L, \qquad \beta_{n+1} = \psi^{n+1} \bot \xrightarrow{\;\psi(\beta_n)\;} \psi L \xrightarrow{\;\mu\;} L.$
Then it may be shown that the $\beta_n$ constitute colimit injections, that is, "$\beta_n = \alpha_n$ up to isomorphism." But even without pausing to prove this here, it is clear that the $\alpha_{C,n}$ of 5 can be constructed directly from the least fixed point isomorphism $\mu$.
For the case $\psi C = 1 + B \times C$ it may be checked directly that $\beta_n$ is, up to isomorphism, the inclusion
$$1 + B + \cdots + B^{n-1} \longrightarrow B^*.$$
By composing this with the appropriate injection, the inclusion
13    $\gamma: B^n \longrightarrow B^*, \qquad (b_1, \ldots, b_n) \longmapsto \langle b_1, \ldots, b_n \rangle$
is available, as is our old friend
14    $\langle\,\rangle: 1 \longrightarrow B^*,$
the empty word (written $\Lambda$ in previous sections but as $\langle\,\rangle$ in Section 1.3).
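Recall that the initial object property of $B^*$ amounts to: given $q_0$, there exists unique $r$ with
[commutative diagram expressing the initial object property of $B^*$]
Hence, we may define, given $f: DT_A \to DT_A$, a map $g: (DT_A)^* \to DT_A$ by
[commutative diagram defining $g$]
and it is then easily seen that the insertion map $/f: DT_A \to DT_A$ is given by
15    [commutative diagram defining $/f$ on the coproduct $A \xrightarrow{\;in_1\;} DT_A \xleftarrow{\;in_2\;} (DT_A)^*$]
The construction map is easy using the $\gamma$ of 13. Given $f_1, \ldots, f_n: DT_A \to DT_A$, define $f$ by
$$f = [f_1, \ldots, f_n]: DT_A \longrightarrow (DT_A)^n$$
and then
16    $DT_A \xrightarrow{\;f\;} (DT_A)^n \xrightarrow{\;\gamma\;} (DT_A)^* \xrightarrow{\;in_2\;} DT_A.$
Composition and conditional are dealt with as in earlier chapters.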
EXERCISES FOR SECTION 12.3
The results of this section provide the semantics of FPF in Mfn. Note that DTN does not change if we define $\mathrm{Mfn}_{TOT}$ = Set. In Exercises 1 and 2 evaluate in Mfn using Kleene semantics in the usual way.
1. while p do f (6.3.25) for p, f ∈ Mfn(DTN, DTN).
2. foreach ∘ [f, g] where foreach is defined in Exercise 1.4.3 and f, g ∈ Mfn(DTN, DTN).
3. Verify in detail that $\alpha f$, $/f$, and $[f_1, \ldots, f_n]$ as defined in this section take their expected meanings in the context of Section 1.3.
Notes and References for Chapter 12
For a much more detailed discussion of the issues raised in Exercise 12.1.1 see the
book of R. Goldblatt cited in the notes to Chapter 2.
The omitted proofs of 12.3.1-6 follow easily from results in the Lehmann-Smyth
paper cited in the notes to Chapter 10.
The constructions in Sections 1 and 2 refine those given by the authors in their
paper on parametrized data types cited in the notes to Chapter 10.
CHAPTER 13
Order Semantics of Data Types
13.1 Introduction
13.2 Constructions with Domains
13.3 Cartesian-Closed Categories
13.4 Solving Function Space Equations
Work initiated by D. S. Scott and C. Strachey in the 1960s, and contributed
to by many up to the present writing, yields a framework for program
semantics in which every data type is a domain and every computed function
is continuous. We provide a critique of these basic assumptions in Section 1,
but then proceed to develop an introduction to this theory of ordered semantics in the remaining sections.
13.1 Introduction
In this section we offer a critique of the motivations for ordered semantics.
We hope the reader will not misconstrue our claims that other avenues of
approach are possible as arguing against the merits of the ordered approach.
Rather, we are reacting to the acceptance, certainly championed by some, of
ordered semantics as the only mathematical foundation for the theory of
program semantics.
1 A Basic Claim of Ordered Semantics. All computable functions are
continuous.
Defense of 1. Let $D$ and $D'$ be domains and let $f: D \to D'$ be computable (in
some reasonable sense). In these domains the approximation relationship
$x \le y$ is interpreted to mean that the "information content" of $x$ is included
in that of $y$ (i.e., $y$ has more and better information). If there is to be some
notion of computability, information will have to be determined by "finite"
approximation, and a function will only be able to compute in terms of these
finite approximations. To provide a precise notion of "finite approximation,"
suppose that an increasing chain
\[ x_0 \le x_1 \le x_2 \le \cdots \]
of elements of $D$ is given. Information here is increasing monotonically, and
since $D$ is a domain we can form the least upper bound
\[ x = \bigvee_{n \ge 0} x_n. \]
Suppose then that $f$ is computable. We then ought to be able to write
\[ fx \le fy \quad \text{if } x \le y, \]
for if $x$ approximates $y$ then $fx$ should approximate $fy$. Moreover, we should
have
\[ f\Bigl(\bigvee_{n \ge 0} x_n\Bigr) = \bigvee_{n \ge 0} f(x_n), \]
since the finite amounts of information are the same on both sides and the
basic assumption is that an element is determined by its finite information
content. But the above says precisely that $f$ is continuous.
Critique of the Necessity of 1. In our study of the order semantics of
recursive definitions in Section 6.2, we did not require that the functions so
defined be continuous: they were simply partial functions $f: A \to B$, say,
where $A$ and $B$ were ordinary sets, not domains. We did see that the functions
$\psi: \mathbf{Pfn}(A, B) \to \mathbf{Pfn}(A, B)$ associated with recursive specifications were continuous maps, but this continuity was with respect to the partial ordering on
$\mathbf{Pfn}(A, B)$ and had nothing to do with requiring $A$ and $B$ to be domains, or
with requiring that the least fixed point of $\psi$ be continuous. There is a
simple technical device to counter our objection, using the flat domain
construction of Example 6.1.16. Let $f: A \to B$ be a partial function. Define
$f_\bot: (A, =)_\bot \to (B, =)_\bot$ by
2
\[ f_\bot(a) = \begin{cases} b & \text{if } f(a) \text{ is defined and } f(a) = b \\ \bot & \text{if } a = \bot \text{ or } f(a) \text{ is not defined.} \end{cases} \]
Then $f \mapsto f_\bot$ establishes a bijective correspondence between $\mathbf{Pfn}(A, B)$ and
continuous functions $(A, =)_\bot \to (B, =)_\bot$. But we still object, for we know
from computability theory that computability of $f$ does not guarantee computability of $f_\bot$. For if $f_\bot$ were computable, we could solve the halting
problem for any program computing $f$ by setting
\[ \mathrm{halt}(a) = \begin{cases} 1 & \text{if } f_\bot(a) \ne \bot \\ 0 & \text{if } f_\bot(a) = \bot. \end{cases} \]
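The lifting in 2 can be illustrated by a minimal Python sketch. It assumes that a partial function on a finite set is represented as a dict and that the added bottom element is represented by None; the names BOTTOM, lift, leq and half are illustrative choices, not notation from the text. The check at the end confirms monotonicity, which on a flat domain coincides with continuity because every ascending chain is eventually constant. The sketch says nothing about computability: the dict already records where the function is undefined, which is exactly the information that is unavailable for a general program.

```python
# Sketch only: BOTTOM, lift, leq and the sample function are illustrative names.
BOTTOM = None  # stands for the least element added by the flat domain construction


def lift(f):
    """Turn a partial function (given as a dict) into a total map on the flat domain."""
    def f_bot(a):
        if a is BOTTOM:
            return BOTTOM          # the lifted map sends bottom to bottom
        return f.get(a, BOTTOM)    # undefined inputs are sent to bottom
    return f_bot


def leq(x, y):
    """Flat ordering: bottom is below everything, other elements only below themselves."""
    return x is BOTTOM or x == y


half = {0: 0, 2: 1, 4: 2}          # partial function on {0,...,4}, undefined at odd inputs
half_bot = lift(half)
points = [BOTTOM, 0, 1, 2, 3, 4]

# Monotone: x <= y implies half_bot(x) <= half_bot(y).
is_monotone = all(
    leq(half_bot(x), half_bot(y))
    for x in points for y in points if leq(x, y)
)
```

Here half_bot(3) is BOTTOM because half is undefined at 3, and is_monotone evaluates to True.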
We thus feel that the above principle should be weakened to:
3 If $D$ and $D'$ are domains and if the orderings $\le$ are viewed as "approximation of information content" then all computable functions $f: D \to D'$ can
be argued to be continuous.
Thus, the theory of recursive definitions does not force computable functions to be continuous. However, could it be that the theory of data types
offers reasons to structure the sets D, D' underlying data types as domains
and to interpret the underlying relations as "approximation of information
content"? We consider a number of issues to further this discussion.
4 A Basic Claim of Ordered Semantics. A data type has an approximation
ordering.
Defense of 4. In the real world, we can in general only compute finite
approximants to computable functions. Data types require an approximation ordering so as to express this.
Critique of the Necessity of 4. The defense has many justifications but
does not apply to all functions. There are many simple functions (consider
Boolean and arithmetical operations) which we compute completely, not
just approximately. More complicated functions arise through iteration and
recursion. Thus, if $\psi: \mathbf{Pfn}(A, B) \to \mathbf{Pfn}(A, B)$ is the continuous function associated with a recursive specification, the semantics $\bigvee \psi^k(\bot)$ of 6.2.3 is
approximated by the $\psi^k(\bot)$, but $A$, $B$ are just sets and need not themselves
be partially ordered. Furthermore, this is not an approximation in the sense
that if $a$ approximates $a'$, then $f(a)$ approximates $f(a')$. It is the approximation of successive extensions of a partial function.
The trace semantics map $g: A \to A^*B + A^\infty$ of 10.2.8 tracing the (perhaps infinite) iterated application of $f: A \to A + B$ is approximated by the
maps
\[ A \xrightarrow{\;g\;} A^*B + A^\infty \xrightarrow{\;\alpha_n\;} B + AB + \cdots + A^{n-1}B + A^n, \]
where $(\alpha_n)$ is the limit of the greatest fixed point construction for the polynomial functor $\psi C = B + A \times C$ (see 11.3.13). Again, no ordered sets were
needed.
In short, what is fuzzy about the defense of 4 is the leap from the fact that
recursive definitions of computable functions yield sequences of "approximations" (which equal the function restricted to ever larger subsets of its domain
of definition) to the suggested need for an approximation ordering on the
domains of that function. The ordering of subsets of a set in no way requires
an ordering on elements of the set.
5 A Basic Claim of Ordered Semantics. Every reasonable recursive specification of a data type is solved by taking the least fixed point of a continuous
functor on domains.
Defense of 5. For then every recursive specification written by the programmer will have a meaning. The category $\mathbf{Dom}_{adj}$ of Section 4 below
allows fixed points not only for polynomial functions but for function-space
specifications that have no solution in Set.
Critique of 5. Mathematically, this is a beautiful idea which has profoundly
altered research directions in the foundations. In the situations we have
examined in Set, however, the domain semantics of a recursive specification
may not be what the programmer had in mind. In many cases, specifications
important to the programmer can be solved directly in Set rather than in
some category of domains, and for some of these (e.g., Example 13.2.10
below) the answer is closer to the programmer's intuition.
Section 4 below expands the discussion of 5. In Section 2 we introduce
constructions for domains which give rise to polynomial functors as well as
the "function-space" domain $[D \to E]$ of continuous maps from $D$ to $E$.
Section 3 introduces Cartesian-closed categories to formalize the function-space construction. Then Section 4 introduces "reflexive domain equations"
such as
\[ D \cong A + [D \to D] \]
which have nontrivial solutions in $D$ for domains but not for sets.
13.2 Constructions with Domains
1 Definition. Given domains $D$ and $D'$, let $[D \to D']$ be the set of all continuous functions $f: D \to D'$, partially ordered by
\[ f \le g \iff f(x) \le g(x) \quad \text{for all } x \text{ in } D. \]
2 Observation. $[D \to D']$ is a domain under $\le$. We call $[D \to D']$ a function-space domain.
PROOF. The bottom element is $\bot$ with $\bot(x) = \bot \in D'$ for each $x$ in $D$. Given
an ascending chain $f_0 \le f_1 \le f_2 \le \cdots$, define $F: D \to D'$ by
\[ F(x) = \bigvee f_n(x). \]
Then $F$ is continuous as follows. If $d \le e$ in $D$, $f_n(d) \le f_n(e) \le F(e)$ for all $n$
so that $F(e)$ is an upper bound of the $f_n(d)$ and $F(d) \le F(e)$. This shows $F$ is
monotone.
Now let $d_0 \le d_1 \le d_2 \le \cdots$ be an ascending chain in $D$ and let $d = \bigvee d_n$.
To show $F(d) = \bigvee F(d_n)$ let $F(d_n) \le d'$ for all $n$ and show $F(d) \le d'$. Fix $m$.
For all $n$, $f_m(d_n) \le F(d_n) \le d'$. As $f_m$ is continuous, $f_m(d) \le d'$. This shows
$F(d) \le d'$. $\Box$
3 Definition. Given domains $D_1, \ldots, D_n$ we define the following:
(i) $D_1 \times \cdots \times D_n$ is the set of all ordered n-tuples $(x_1, \ldots, x_n)$ with $x_i \in D_i$
under the ordering $(x_1, \ldots, x_n) \le (y_1, \ldots, y_n)$ iff $x_i \le y_i$ in $D_i$, $i = 1, \ldots, n$
(cf. 6.1.17).
(ii) $D_1 + \cdots + D_n = \{\bot\} \cup \{1\} \times D_1 \cup \cdots \cup \{n\} \times D_n$ under the ordering
$\bot \le z$ for all $z$ in $D_1 + \cdots + D_n$, while $(i, x) \le (j, y)$ if $i = j$ and $x \le y$ in $D_i$.
Observation. $D_1 \times \cdots \times D_n$ and $D_1 + \cdots + D_n$ are both domains.
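The two orderings of Definition 3 can be written out for finite posets as a short Python sketch. It assumes each domain is given as a pair (list of elements, ordering function); the names BOTTOM, product_leq, separated_sum and the sample chains are hypothetical and serve only this illustration.

```python
# Sketch only: all names here are illustrative, not notation from the text.
BOTTOM = "bot"  # tag for the new least element adjoined by the sum of 3(ii)


def product_leq(orders):
    """Componentwise ordering on n-tuples, as in 3(i)."""
    def leq(xs, ys):
        return all(le(x, y) for le, x, y in zip(orders, xs, ys))
    return leq


def separated_sum(domains):
    """Build D_1 + ... + D_n as in 3(ii) from a list of (elements, leq) pairs."""
    elements = [BOTTOM] + [
        (i, x)
        for i, (elems, _) in enumerate(domains, start=1)
        for x in elems
    ]

    def leq(u, v):
        if u == BOTTOM:
            return True                      # the new bottom lies below everything
        if v == BOTTOM:
            return False
        (i, x), (j, y) = u, v
        return i == j and domains[i - 1][1](x, y)   # comparable only inside one summand

    return elements, leq


def int_leq(x, y):
    return x <= y


two_chain = ([0, 1], int_leq)   # the chain 0 <= 1
one_point = ([0], int_leq)      # the one-element domain
elements, sum_leq = separated_sum([two_chain, one_point])
pair_leq = product_leq([int_leq, int_leq])
```

Note that the least element of each summand $D_i$ reappears as $(i, \bot)$, strictly above the new bottom, so elements of different summands are never comparable.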
4 Categories of Domains. To define data types recursively we are led, as in
Section 11.1, to consider colimits of right chains and, as in Section 11.3, limits
of left chains. This is meaningless unless we pin down which category of
domains we are in. There exists a wealth of possible definitions of such
categories in the literature. In this section we introduce the category $\mathbf{Dom}_c$ of
domains and continuous maps. The categories $\mathbf{Dom}$ and $\mathbf{Dom}_{adj}$ are studied
in Section 4. The reader should understand that different categories are
introduced to solve different problems. Any claim that domains provide the
necessary objects in semantics should clarify that different types of morphism
are required at different times.
The interest in $\mathbf{Dom}_c$ springs from the philosophy of 13.1.1 that "computable maps are continuous." (No one claims that all continuous maps are
computable.)
5 Definition. The category $\mathbf{Dom}_c$ (c for "continuous") has domains for objects
and continuous maps for morphisms. (Morphisms are not necessarily strict:
we do not require $f(\bot) = \bot$.)
6 Observation. $\mathbf{Dom}_c$ has products. The construction 3(i) with the usual projections (6.1.17) works.
The reader might well expect us to say that the construction of 3(ii) is the
coproduct in $\mathbf{Dom}_c$. This is not so. $\mathbf{Dom}_c$ is a bit out of kilter because its
morphisms do not preserve all the structure the objects possess in ignoring
$\bot$. Thus, in attempting to define $\langle f, g \rangle: \{\bot\} \cup \{1\} \times D \cup \{2\} \times D' \to E$ given
$f: D \to E$, $g: D' \to E$ there is no unique way to define $\langle f, g \rangle(\bot)$. In fact,
7 Example. $\mathbf{Dom}_c$ does not have coproducts. Let $1 = \{\bot\}$ be the one-element
domain and suppose
\[ 1 \xrightarrow{\;d\;} D \xleftarrow{\;e\;} 1 \]
were a coproduct. Define domains $E$, $F$ as shown:
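    x     y            x     y
     \   /              \   /
       ⊥                  z
                          |
                          ⊥

       E                  F
Define continuous maps
\[ f: E \to F, \quad f(x) = x,\ f(y) = y,\ f(\bot) = z, \]
\[ g: E \to F, \quad g(x) = x,\ g(y) = y,\ g(\bot) = \bot. \]
Viewing $x$, $y$ as (continuous) maps $1 \to F$, we can use the supposed coproduct property to form $\alpha = \langle x, y \rangle: D \to F$.
[Diagram: the injections $d, e: 1 \to D$, the maps $x, y: 1 \to F$, and the induced map $\alpha: D \to F$.]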
Then $\alpha(d) = x$ and $\alpha(e) = y$. But $\alpha$ must be continuous if indeed we have a
coproduct in $\mathbf{Dom}_c$. Hence, we cannot have $d \le e$ in $D$ since $\alpha(d) \not\le \alpha(e)$.
Similarly, $e \not\le d$. Thus, the least element $\bot$ of $D$ is distinct from $d$, $e$.
Similarly, let $\beta = \langle x, y \rangle: D \to E$. As $f\beta$ maps $d$ to $x$ and $e$ to $y$, $f\beta = \alpha$,
by the uniqueness property of coproducts. Similarly, $g\beta = \alpha$. This is possible only if $\alpha(\bot) \in \{x, y\}$, since $f(E) \cap g(E) = \{x, y\}$. But then consider
$\sigma = \langle e, d \rangle: D \to D$ and $\tau: F \to F$, $\tau x = y$, $\tau y = x$, $\tau z = z$, $\tau\bot = \bot$. Since $\sigma$ is an
isomorphism owing to generalities about coproducts (why?) and all isomorphisms in $\mathbf{Dom}_c$ preserve $\bot$ (why?), $\sigma(\bot) = \bot$. As $\gamma = \tau\alpha\sigma: D \to F$ maps $d$ to
$x$ and $e$ to $y$, $\gamma = \alpha$. But this is a contradiction since if $\alpha(\bot) = x$, $\gamma(\bot) = y$
whereas if $\alpha(\bot) = y$, $\gamma(\bot) = x$.
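The non-uniqueness remarked on just before Example 7 can be checked by brute force for the sum $1 + 1$ of 3(ii). The Python sketch below assumes the four-element domain $F$ of Example 7, encoded with string labels chosen only for this illustration, and lists every value that a monotone map out of $1 + 1$ may assign to the new bottom once the two summands are sent to $x$ and $y$. It illustrates why 3(ii) fails to be a coproduct; it is not a substitute for the argument of Example 7 that no coproduct exists at all.

```python
# Sketch only: the string labels and helper names are illustrative.
# The domain F of Example 7: bot <= z <= x and z <= y, with x and y incomparable.
F_ELEMENTS = ["bot", "z", "x", "y"]
F_ORDER = {
    ("bot", "bot"), ("bot", "z"), ("bot", "x"), ("bot", "y"),
    ("z", "z"), ("z", "x"), ("z", "y"),
    ("x", "x"), ("y", "y"),
}


def leq_F(a, b):
    return (a, b) in F_ORDER


# A monotone map m from 1 + 1 = {bottom, (1, *), (2, *)} to F with
# m(1, *) = x and m(2, *) = y only needs m(bottom) below both x and y.
candidates = [v for v in F_ELEMENTS if leq_F(v, "x") and leq_F(v, "y")]
```

Two candidates survive, "bot" and "z", so the mediating map is not unique.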
The situation is not as bad as would appear at first glance because the
construction of 3(ii) does have all the properties we need. For example,
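8 Definition. Given $f_i: D_i \to E_i$ ($i = 1, \ldots, n$) in $\mathbf{Dom}_c$ define $f_1 + \cdots + f_n: D_1 + \cdots + D_n \to E_1 + \cdots + E_n$ by
\[ (f_1 + \cdots + f_n)(x) = \begin{cases} f_i(x) & \text{if } x \in D_i \\ \bot & \text{if } x = \bot. \end{cases} \]
It is easily checked that
\[ \mathrm{id}_{D_1} + \cdots + \mathrm{id}_{D_n} = \mathrm{id}_{D_1 + \cdots + D_n} \]
and, given $f_i: D_i \to E_i$, $g_i: E_i \to F_i$ ($i = 1, \ldots, n$), then
\[ (g_1 + \cdots + g_n)(f_1 + \cdots + f_n) = g_1 f_1 + \cdots + g_n f_n. \]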
Thus, just as in 10.1.19, if $F_1, \ldots, F_n: \mathbf{Dom}_c \to \mathbf{Dom}_c$ are functors, so is
$F_1 + \cdots + F_n$ (with + as in 3(ii)). And it is then clear how 10.1.22 is interpreted to define polynomial functors in $\mathbf{Dom}_c$.
9 Fact. $\mathbf{Dom}_c$ has colimits of right chains.
In the Set-construction of 11.1.7, if each $C_n$ is a poset with LUBs of
ascending chains, the colimit $U$ is partially ordered by $[n, x] \le [m, y]$ iff there
exist $l \ge m, n$ and $x' \le y'$ in $C_l$ with $[n, x] = [l, x']$, $[m, y] = [l, y']$. If such $U$
happens to be a domain, it is the desired colimit in $\mathbf{Dom}_c$. In general, the
desired colimit exists as the "completion" of $U$ obtained by adding, in a
suitably precise sense, LUBs which were missing in $U$. Although we avoid
further discussion of completion, the following example provides intuition.
10 Example. One is tempted to define the natural numbers $N$ by
\[ N ::= N + \{0\}. \]
But in which category? In Set, this yields the colimit
\[ \emptyset \to \bar{0} \to \bar{1} \to \bar{2} \to \cdots, \]
where $\bar{n} = \{0, \ldots, n\}$ and the unlabeled maps are inclusions. If we regard each
$\bar{n}$ as a poset in the usual way, the ordering on $N$ obtained as discussed above
is the usual one. But in $N$, the ascending chain $0 \le 1 \le 2 \le \cdots$ has no LUB.
The colimit in $\mathbf{Dom}_c$ is $N \cup \{\infty\}$ with $\infty = \mathrm{LUB}(0, 1, 2, \ldots)$. This illustrates a
general slogan: "with domains, one may get infinite elements in the data
structures, like it or not."
$\mathbf{Dom}_c$ shares a very pleasant property with Set:
11 Fact. Any polynomial functor $\mathbf{Dom}_c \to \mathbf{Dom}_c$ is co-continuous and so has
least fixed point.
The proof will be outlined in 13.3.10.
EXERCISES FOR SECTION 13.2
1. Let $P_{X,Y} = \mathbf{Pfn}(X, Y) + \{\top\}$. For $(f_i \mid i \in I)$ a family in $P_{X,Y}$, we define the operator
$\bigwedge$ by
\[ \bigwedge f_i = \begin{cases} \top & \text{if no } i \text{ exists with } f_i \ne \top \\ f & \text{else,} \end{cases} \]
where $DD(f) = \{x \in X : f_i(x) = f_j(x) \text{ for all } i, j \text{ with } f_i \ne \top \ne f_j\}$ and, for $x$ in
$DD(f)$, $f(x)$ is the common value of all such $f_i(x)$. Show that $\bigwedge$ is the infimum
operation of a complete poset. What is the supremum operation? (See Exercise
6.1.6.) Show that $f \vee g = \top$ if there exists $x$ with $f(x) \ne g(x)$. Hence, $\top$ is the
"overdefined" element.
2. Let C be the full subcategory of $\mathbf{Dom}_c$ of all complete posets.
(i) Show that $[D \to D']$ is complete if $D'$ is.
(ii) Show that C has products using the construction of 6.1.17.
(iii) Show that if Definition 3 of $D_1 + \cdots + D_n$ is modified to add a greatest element
as well then $D_1 + \cdots + D_n$ is complete. Hence, polynomial functors may be
defined C $\to$ C.
(iv) Show that C does not have coproducts. [Hint: Modify Example 7 by giving $E$,
$F$ greatest elements.]
3. Give an example of a domain with least and greatest elements which is not complete. [Hint: Six elements suffice.]
13.3 Cartesian-Closed Categories
In this section, we introduce the notion of a Cartesian-closed category for
two reasons: it supports the notion of a "function-space object" which generalizes the function-space domain $[D \to D']$ of 13.2.2 thus allowing a categorical version of the notion of "Currying" from the $\lambda$-calculus; and it provides
insight into intuitionistic generalizations of Boolean logic.
1 Definition. A category C is Cartesian-closed if
(i) C has a terminal object, 1;
(ii) C has finite products; and
(iii) for each $B$, $C$ in C there exists a "function-space object" $[C \to B]$ and an
"evaluation morphism" $e: [C \to B] \times C \to B$ with the property that for
all $f: C' \times C \to B$ there exists unique $f'': C' \to [C \to B]$ with
\[ e \circ (f'' \times \mathrm{id}_C) = f. \]
2 Example. Set is Cartesian-closed. Define $[C \to B]$ to be the set of all functions from $C$ to $B$. Define $e(g, c) = g(c)$. The unique $f''$ is given by $f''(c')(c) = f(c', c)$.
This property of functions is familiar to users of LISP or the $\lambda$-calculus as
Currying or lambda-abstraction, which is the operation that converts the
function $f: C' \times C \to B$ to the function $f'': C' \to [C \to B]$. Here, the function we write as $x \mapsto g(x)$ would be written $\lambda x.g(x)$. Then $f: C' \times C \to B$
has $\lambda$-notation $\lambda(c', c).f(c', c)$, and $f'': C' \to [C \to B]$ has $\lambda$-notation
$\lambda c'.(\lambda c.f(c', c))$. That is, $f''$ takes the argument $c'$ to return $\lambda c.f(c', c)$, which
is in $[C \to B]$. The commutativity of the diagram of 1, expressed in $\lambda$-notation,
is that for all $a'$ in $C'$ and $a$ in $C$,
\[ (\lambda c'.\lambda c.f(c', c))(a')(a) = (\lambda(c', c).f(c', c))(a', a) \]
since both sides evaluate to f(a', a) in B.
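The situation in Set can be mirrored with ordinary functions. The following Python sketch is only an illustration under the assumption that Python callables stand in for elements of $[C \to B]$; the names curry, ev and the sample function f are hypothetical. It checks the defining equation of 1(iii), evaluation after currying gives back the original function, on a small range of arguments.

```python
# Sketch only: curry, ev and f are illustrative names.
def curry(f):
    """Send f: C' x C -> B to the map C' -> [C -> B] taking c1 to (c -> f(c1, c))."""
    return lambda c1: lambda c: f(c1, c)


def ev(g, c):
    """The evaluation map e(g, c) = g(c)."""
    return g(c)


def f(c1, c):
    return 10 * c1 + c


f_curried = curry(f)

# Commutativity of the diagram of 1(iii): e(f_curried(a1), a) == f(a1, a).
diagram_commutes = all(
    ev(f_curried(a1), a) == f(a1, a)
    for a1 in range(5) for a in range(5)
)
```

For example, f_curried(3) is the one-argument function sending c to 30 + c, so ev(f_curried(3), 4) and f(3, 4) both give 34.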
We now show that if a poset P, considered as a "set of generalized truth
values," is Cartesian-closed when viewed as a category, then we capture what
is known as a Heyting algebra in intuitionistic logic. Classical Boolean logic
is recaptured when P = {F, T} with F < T.
3 Example. Let $\mathbf{C} = \mathbf{C}(P, \le)$ for a poset $(P, \le)$. What does it mean to say that
C is Cartesian-closed? Think of elements of $P$ as "propositions" and interpret
$p \le q$ as "if $p$ is assumed then $q$ can be proved." The terminal object of C is
the greatest element of $P$, which we then write as $T$ for "true." A product
$p \times q$ is characterized by
\[ p \times q \le p, \qquad p \times q \le q, \qquad \text{if } r \le p,\ r \le q \text{ then } r \le p \times q, \]
which coincides with the greatest lower bound $p \wedge q$ as defined in 3.3.3; so we
write $p \wedge q$ for $p \times q$ and think of it as "$p$ and $q$." We shall write $[p \to q]$ as
$p \Rightarrow q$ and think of it as "$p$ implies $q$."
Let us now reexamine the diagram of 1(iii) in the form
[Diagram: the evaluation $e: (p \Rightarrow q) \wedge p \le q$, a map $f: r \wedge p \le q$, and the induced $f'': r \le (p \Rightarrow q)$.]
Then $e$ is just the deduction rule known as "modus ponens" to logicians,
being the assertion
\[ ((p \Rightarrow q) \wedge p) \le q. \]
The commutativity of the diagram is of no interest in this example since all
diagrams commute in a poset. What is really asserted (modus ponens, that
is, $e$, having been given) is that $f$ exists if and only if $f''$ does, that is,
\[ r \wedge p \le q \text{ if and only if } r \le (p \Rightarrow q) \quad \text{for all } p, q, r. \]
Evidently, this property requires $p \Rightarrow q$ to be $\mathrm{LUB}\{r \mid r \wedge p \le q\}$ and so it is
uniquely determined by $p$, $q$. A Cartesian-closed poset is called a Heyting
algebra and is well known as the appropriate generalization of Boolean
algebra in the area known as intuitionistic set theory.
Classical Boolean logic, $\{F, T\}$ with $F < T$, provides a Heyting algebra
with $p \wedge q = T$ if and only if $p = T = q$ and $p \Rightarrow q = T$ if and only if $p \le q$.
Another Heyting algebra is the unit interval $[0, 1]$ of real numbers with $\le$
the usual ordering. Here, $T = 1$, $p \wedge q = \mathrm{Min}\{p, q\}$, so that
\[ p \Rightarrow q = \begin{cases} 1 & p \le q \\ q & p > q. \end{cases} \]
Both of these examples have a least element, which we shall denote F for
"false." Define the complement of $p$, $\neg p$, by
\[ \neg p = (p \Rightarrow F). \]
In the Boolean case, $\neg F = T$, $\neg T = F$ is the classical complement. Here,
$\neg\neg p = p$ for all $p$. In the unit interval example, however,
\[ \neg p = \begin{cases} 1 & p = 0 \\ 0 & p > 0, \end{cases} \qquad \neg\neg p = \begin{cases} 1 & p > 0 \\ 0 & p = 0, \end{cases} \]
so that $p \le \neg\neg p$ but $p \ne \neg\neg p$ in general. It is, however, true in general
that $\neg p = \neg\neg\neg p$.
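The unit interval example can be checked numerically on a finite grid. The Python sketch below assumes exact rational arithmetic via the standard fractions module and samples only the five points 0, 1/4, 1/2, 3/4, 1, so it is a spot check rather than a proof; the function names are illustrative.

```python
from fractions import Fraction

# Sketch only: meet, implies and neg are illustrative names for the
# operations of the Heyting algebra [0, 1] described in Example 3.


def meet(p, q):
    return min(p, q)


def implies(p, q):
    return Fraction(1) if p <= q else q


def neg(p):
    return implies(p, Fraction(0))     # complement: p implies F, with F = 0


grid = [Fraction(k, 4) for k in range(5)]

# The defining property: r /\ p <= q  iff  r <= (p => q).
adjunction_holds = all(
    (meet(r, p) <= q) == (r <= implies(p, q))
    for p in grid for q in grid for r in grid
)
double_negation_above = all(p <= neg(neg(p)) for p in grid)
triple_negation = all(neg(p) == neg(neg(neg(p))) for p in grid)
not_fixed = [p for p in grid if neg(neg(p)) != p]
```

On this grid the double complement differs from $p$ exactly at 1/4, 1/2 and 3/4, in line with the displayed formulas.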
The uniqueness of [C -4 B] is not special to posets:
4 Observation. In any category with finite products, if $C$, $B$ are fixed and
$([C \to B], e)$ as in 1(iii) exists, then it is unique up to isomorphism. For it is a
terminal object in the obvious category with objects $(C', f)$, where $f: C' \times C \to B$.
5 Fact. $\mathbf{Dom}_c$ is Cartesian-closed.
PROOF OUTLINE. $1 = \{\bot\}$ is a terminal object. As noted in Exercise 6.2.5, the
construction of 6.1.17 provides products in $\mathbf{Dom}_c$. Define $[C \to B]$ as in
13.2.2. Useful technical results, not hard to prove, are the following:
6 $f: C' \times C \to B$ is continuous if and only if for each $c' \in C'$, $c \in C$, $f(c', -): C \to B$ and $f(-, c): C' \to B$ are continuous. (See Exercise 6.3.7.)
7 $e: [C \to B] \times C \to B$ defined by $e(g, c) = g(c)$ is continuous. Defining $e$ as in
7, then, given $f$ as in 1(iii), the unique function $F$ defined as in 2 by $F(c')(c) = f(c', c)$ is continuous by 6.
More is true:
8 The "abstraction map"
\[ \alpha: [C' \times C \to B] \to [C' \to [C \to B]], \qquad \alpha(f) = F, \]
is not just a bijection as in 10.2.22 but is an isomorphism in $\mathbf{Dom}_c$ (cf. Exercise
4 for the generalization). $\Box$
9 Discussion. We pause to discuss this notion of Cartesian-closed category.
First, a category is or is not Cartesian-closed: the function-space objects, if
they exist, do so in only one way by 4. If objects are not sets neither are the
function-space objects (cf. Example 3) so that function-space objects are not
sets of functions in general. (We might, however, associate sets to objects by
defining an element of an object $X$ to be a map $x: 1 \to X$ from the terminal
object as we did in 12.1.13. Then an element $x: 1 \to [C \to B]$ is of the form $f''$
for a unique $f: 1 \times C \to B$. But using the canonical isomorphism $C \cong 1 \times C$ (cf. 12.1.9), we then see that elements $x$ correspond bijectively to actual
morphisms $C \to B$.)
Cartesian-closed categories have many nice properties. Some elementary
ones are considered in the exercises. It may be shown that any Cartesian-closed category which has colimits of right chains satisfies Lemma 11.2.10.
This is the hard part of
10 Fact. Any polynomial functor $\mathbf{Dom}_c \to \mathbf{Dom}_c$ is co-continuous and so has a
least fixed point.
PROOF OUTLINE. Constant functors and the identity functor are co-continuous
and any composition of co-continuous functors is again so. As remarked
above, the proof of 11.2.9 that a product of co-continuous functors is co-continuous goes through in any Cartesian-closed category. While Theorem
11.2.7 asserts that a coproduct of co-continuous functors is co-continuous,
this does not apply since the + for the current polynomial functors is not a
coproduct. Nonetheless, the proof of 11.2.7 goes through with minor changes. $\Box$
EXERCISES FOR SECTION 13.3
1. In any Cartesian-closed category prove the following:
(i) $[1 \to A] \cong A$ for all $A$.
(ii) $[A \to B \times C] \cong [A \to B] \times [A \to C]$ for all $A$, $B$, $C$.
(iii) $[A \to 1] \cong 1$. [Hint: Show $[A \to 1]$ is a terminal object.]
2. Given $A$ and $f: B \to B'$ in a Cartesian-closed category C, define $[A \to f]: [A \to B] \to [A \to B']$ by
\[ e \circ ([A \to f] \times \mathrm{id}_A) = f \circ e: [A \to B] \times A \to B'. \]
Show that this renders $[A \to (-)]$ a functor C $\to$ C. Give an explicit description of
$[A \to f]$ in Set and in $\mathbf{Dom}_c$.
3. Let C be a Cartesian-closed category and let C, $\mathbf{C}_{TOT}$ satisfy the axioms of 12.2.7.
Define $(E^*, \mu)$ as the least fixed point of $\psi(C) = 1 + E \times C$ and define $(E^+, \nu)$ as the
least fixed point of $\varphi(C) = E + E \times C$. Then, as in Exercise 12.2.5 but without
assuming countable coproducts, prove that $E^+ \cong E \times E^*$. [Hint: Use the least
fixed point property of $E^*$ to show that $\varphi$ has a least fixed point with object $E \times E^*$. The needed $f: E \times E^* \to D$ corresponds to a suitable $f'': E^* \to [E \to D]$.]
4. Recall from Exercise 2.3.2 that in any category with products there is an isomorphism $\theta: A \times (B \times C) \to (A \times B) \times C$. In a Cartesian-closed category, given $B$,
$C$, $C'$ define $\gamma$, $\beta$, $\alpha$ by
[Diagrams: three commutative diagrams defining $\gamma$, $\beta$, and $\alpha$ from the evaluation maps $e$, the isomorphism $\theta$, and identity maps, relating $[C' \times C \to B]$ and $[C' \to [C \to B]]$.]
Generalizing 8, show that $\alpha$ is an isomorphism with inverse $\beta$. [Hint: Use uniqueness of $f''$ in 1.]
5. Let C be a Cartesian-closed category with an initial object 0 and finite coproducts.
Prove the following:
(i) $0 \times A \cong 0$. [Hint: Show $0 \times A$ is initial.]
(ii) $[0 \to A] \cong 1$. [Hint: Show $[0 \to A]$ is terminal.]
(iii) $[A + B \to C] \cong [A \to C] \times [B \to C]$.
(iv) Show that $A$ is initial if $f: A \to 0$ exists. [Hint: $A \xrightarrow{f} 0 \to A \ne \mathrm{id}_A$ if $A$ is
not initial; this contradicts (i).]
6. Establish the Lawvere diagonal argument in a Cartesian-closed category C: Given
$\alpha: J \to J$ in C which has no fixed points in the sense that $\alpha x \ne x$ for all elements $x: 1 \to J$ of $J$ then there exists no $f: X \to [X \to J]$ which is surjective on
elements. [Hint: For any such $f$, consider $g: X \times X \to J$ with $g'' = f$ and let
$g_0: 1 \to [X \to J]$ be $h''$, where
\[ h = X \xrightarrow{\;\Delta\;} X \times X \xrightarrow{\;g\;} J \xrightarrow{\;\alpha\;} J \]
and $\Delta$ is defined by $\mathrm{pr}_1\Delta = \mathrm{id}_X = \mathrm{pr}_2\Delta$. Show that if $x_0: 1 \to X$ existed with
$f x_0 = g_0$ then $\alpha$ would have a fixed point; hence, no such $x_0$ exists.]
Show in detail that the proof outlined in the above hint when C = Set and
$J = \{\text{true}, \text{false}\}$ with $\alpha(\text{true}) = \text{false}$, $\alpha(\text{false}) = \text{true}$ amounts to Cantor's diagonal
argument of Exercise 10.1.9.
7. Interpret the identities of Exercises 1 and 5 in a Heyting algebra. Similarly, interpret the functor $[A \to (-)]$ of Exercise 2. (For Exercise 5 assume $x \vee y$ exists.)
8. Let C be any Cartesian-closed category and let $in_i: A_i \to A$ be a coproduct in C.
Show that
\[ \mathrm{id}_E \times in_i: E \times A_i \to E \times A \]
is again a coproduct in C. Hence, axiom 12.1.8 always holds in Cartesian-closed
categories. [Hint: Given $f_i: E \times A_i \to B$ the desired $f: E \times A \to B$ corresponds to a suitable $g: A \to [E \to B]$ induced by the coproduct property of $A$;
recall that $S \times T \cong T \times S$ as in Exercise 2.3.2.]
9. Let $H$ be a Heyting algebra. Show that $H$ is a Boolean algebra (3.3.12) if and only
if $\neg\neg x = x$ for all $x$. [Hint: If $\neg\neg x = x$ define $0 = \neg 1$, $x \vee y = \neg((\neg x) \wedge (\neg y))$.
$H$ is a distributive lattice by Exercise 8. Show that $x \wedge (\neg x) = 0$ in any Heyting
algebra with 0; hence, if $\neg\neg x = x$, $x \vee (\neg x) = 1$.]
13.4 Solving Function Space Equations
We have shown how to solve a wide variety of recursive equations to define
useful data types in Set. We shall consider an example due to Scott and
Strachey which suggests that it may be useful to solve recursive specifications
in which the function space construction can appear on the right-hand side.
But before considering this example, let us see, if we accept it, why it would
force us to go beyond Set as a setting for the construction of data types. Look
at the simple example of the isomorphism
1
\[ \beta: D \cong A + [D \to D], \]
where $A$ is a fixed object of "atoms," so that 1 asserts "a datum is either an
atomic datum or a function from data to data." (Nontrivial solutions of
$D \cong [D \to D]$ are discussed in the exercises and arise by a mild extension of
the theory of this section. The advantage of 1 is that $D$ cannot be the one-element domain and so must be infinite (if $A \ne \emptyset$), whereas $D \cong [D \to D]$ is
true for the one-element domain.) The point here is that, in the Cartesian-closed category Set, isomorphism 1 with $A$ nonempty has no solutions since,
by Cantor's diagonal argument (Exercise 13.3.6, but see also Exercise 10.1.9),
the cardinality of Set($D$, $D$) is strictly greater than that of $D$ for any set $D$ with
at least two elements. It was a striking discovery of Scott to show that this
isomorphism could be solved for $D$ a suitable infinite domain and $[D \to D]$
the function-space domain of 13.2.1.
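The cardinality obstruction in Set can be made concrete for a small finite set. The Python sketch below assumes functions $D \to D$ are represented as dicts and uses a cyclic shift as the fixed-point-free map required by the diagonal argument; the names all_functions, diagonal_escape and the three-element set are illustrative only. Given any assignment of a function to each element of $D$, it constructs a function that the assignment misses.

```python
from itertools import product

# Sketch only: all names and the sample set are illustrative.


def all_functions(D):
    """Every function D -> D, each represented as a dict."""
    return [dict(zip(D, values)) for values in product(D, repeat=len(D))]


def diagonal_escape(D, table):
    """Given table: D -> (functions D -> D), build a function outside its image."""
    # A cyclic shift has no fixed point as soon as D has at least two elements.
    shift = {d: D[(i + 1) % len(D)] for i, d in enumerate(D)}
    # g differs from table[d] at the argument d, for every d.
    return {d: shift[table[d][d]] for d in D}


D = ["a", "b", "c"]
funcs = all_functions(D)                       # 3 ** 3 = 27 functions
table = {d: funcs[i] for i, d in enumerate(D)}  # one arbitrary attempt at a surjection
g = diagonal_escape(D, table)
missed = g not in table.values()
```

Whatever table is chosen, g disagrees with table[d] at d, so no map from $D$ onto Set($D$, $D$) exists once $D$ has at least two elements.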
Scott and Strachey considered the following approach to the formal
semantics of a programming environment. We let the store have a given
domain, location, as the set of locations. Each location can hold any value
from some domain value. Thus, a state is given by assigning a value from
value to each location in location:
\[ \mathit{state} = [\mathit{location} \to \mathit{value}]. \]
Next, a procedure is to be regarded as a procedure for mapping values to
values but also changing one state into another ("side effects") and so we
represent procedures by the domain
\[ \mathit{procedure} = [\mathit{value} \times \mathit{state} \to \mathit{value} \times \mathit{state}]. \]
But Scott and Strachey require that the values which can be stored in a
location include elements from any of the given domains $V_1, \ldots, V_n$, the
specification of some location, the specification of a procedure, or a list of
values. Formally, this leads them to the equation
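\[ \mathit{value} = V_1 + \cdots + V_n + \mathit{location} + \mathit{procedure} + \mathit{list}(\mathit{value}). \]
The main task of this section will be to introduce a category $\mathbf{Dom}_{adj}$ in
which each of the three functors
\[ \Gamma_i: \mathbf{Dom}_{adj}^3 \to \mathbf{Dom}_{adj} \]
involved in the above recursive specification,
\[ \Gamma_1(\mathit{state}, \mathit{procedure}, \mathit{value}) = [\mathit{location} \to \mathit{value}], \]
\[ \Gamma_2(\mathit{state}, \mathit{procedure}, \mathit{value}) = [\mathit{value} \times \mathit{state} \to \mathit{value} \times \mathit{state}], \]
\[ \Gamma_3(\mathit{state}, \mathit{procedure}, \mathit{value}) = V_1 + \cdots + V_n + \mathit{location} + \mathit{procedure} + \mathit{list}(\mathit{value}), \]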
is continuous. The proof, which is a categorical refinement of an argument
made by Scott in the setting of lattice theory, is a major achievement, giving
one type of demonstration that the existence of a mathematical space of such
values in which "procedures can call themselves as arguments" can be made
precise without internal inconsistencies. Nonetheless, we deny that it is necessary to forsake Set for $\mathbf{Dom}_{adj}$, since the specification
\[ \mathit{procedure} = [\mathit{state} \to \mathit{state}] \]
is not a necessary part of semantics. In actual programming languages, procedures are built up, as in Parts 1 and 2 of the present volume, in such a way
that they form a proper subset of, for example, Pfn(state, state) forming, in
fact, a denumerable subset of the nondenumerable space of arbitrary maps of
the denumerable set of states to itself.
To provide another motivation for the theory presented in this section,
but to also extend the above critique, we may note that Scott's development
of order semantics of programs went hand in hand with his work on the
$\lambda$-calculus, developed by Church as an alternative formulation of the syntax
of computable functions.
In the "type-free Church-Curry $\lambda$-calculus," a so-called "$\lambda$-expression" (cf.
the discussion following 13.3.2) may be interpreted semantically either as a
function or as a piece of data, so that the concatenation $MN$ of two such
$\lambda$-expressions $M$ and $N$ is to be interpreted as "apply the function denoted by
$M$ to the datum denoted by $N$." High-level programming languages pass
functions as arguments so that in one context the semantics of a function is a
function whereas in another context the function may be viewed as a datum.
Thus, many workers have felt that providing formal models of the type-free
$\lambda$-calculus is a necessary step in demonstrating the mathematical consistency
of certain programming languages. To this we now turn.
Let $E$ denote the set of $\lambda$-expressions, and let $D$ be the space where their
values lie. (Precise definitions of $E$, $D$ will not be needed to make the motivational point we require.) We need to interpret each $M$ in two ways, both as a
data element in $D$ and as a function $D \to D$. This can be accomplished with a
pair of maps
\[ E \xrightarrow{\;\alpha\;} D \xrightarrow{\;\beta\;} [D \to D], \]
where $\alpha$ tells us how to interpret a $\lambda$-expression as an element of $D$ while $\beta$
tells us how to reinterpret each $x$ in $D$ as a map $\beta(x): D \to D$. For consistency,
then, we require that
\[ \alpha(MN) = \beta(\alpha(M))(\alpha(N)). \]
The reader familiar with computability theory may recognize $\beta$ as related
to Gödel numbering. In computability theory, we take $D$ to be $N$ and take
$[N \to N]$ to be the set of all partial functions from $N$ to $N$. We call $n$ the index
or Gödel number for $\beta(n)$, which is defined to be the partial function $N \to N$
computed by the program (or other formal specification) encoded by the
number $n$. Note that in this setting the map $\beta$ is neither one-to-one nor onto.
Thus, despite our discussion of the Scott-Strachey example above, it may
seem surprising that Scott sought conditions under which $D$ and $\beta$ could be
chosen with $\beta$ an isomorphism $\beta: D \cong [D \to D]$, the limiting case of 1 in
which $A$ takes the value $\emptyset$. This contrast with computability theory, where
(unavoidably) each function has infinitely many Gödel numbers, is intentional since a denotational theory should deal directly with the computable
functions themselves.
The ability of computability theory to use nonisomorphic Gödel numberings reinforces our suggestion that the solution of $D \cong A + [D \to D]$ is
not a necessary part of the formal semantics of programming languages.
Nonetheless, the Scott-Strachey approach has been so influential that we
shall focus this section on constructing a nontrivial isomorphism as in 1.
While our construction is very close to the original one of Scott, we couch it
in terms of our earlier work with functorial fixed points, eventually solving
$D \cong A + [D \to D]$ as the greatest fixed point of a suitable functor $\phi D = A + [D \to D]$. (We could, of course, regard any greatest fixed point of $\phi: \mathbf{C} \to \mathbf{C}$ as the least fixed point of the same functor considered as $\mathbf{C}^{op} \to \mathbf{C}^{op}$,
and in this way other workers have viewed the construction below as a least
fixed point. The choice is a matter of taste. We support our choice by virtue
of a close comparison with the greatest fixed point construction in Set.)
Several obstacles need to be overcome. We are not sure of what the + in
$A + [D \to D]$ means until the category we work in is stabilized. Such a category must have a function-space construction $[D \to D]$ which is a functor in
$D$. But in even so nice a Cartesian-closed category as Set or $\mathbf{Dom}_c$ there is no
obvious way that a morphism $f: D \to E$ induces a morphism $[D \to D] \to [E \to E]$. Let us begin, then, by considering how $f: D \to E$ should induce such
a map. If there were also a map $g: E \to D$ then given $h \in [D \to D]$ (and here we
are not in an arbitrary Cartesian-closed category but are dealing with a
function-space object which is truly a set of functions so that $h$ is a function
$D \to D$) $f$ and $g$ induce a map
\[ E \xrightarrow{\;g\;} D \xrightarrow{\;h\;} D \xrightarrow{\;f\;} E \]
hopefully in $[E, E]$. One approach might be to insist $f$ is an isomorphism and
set $g = f^{-1}$. But a subcategory which has only isomorphisms will not have
enough interesting maps. For example, the least fixed point colimit of 11.1.13
would just be $\bot$! What is needed is a broad class of maps which induce a map
in the opposite direction. It turns out that the following definition will work:
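2 Definition. Let $D$, $E$ be domains and let $f\colon D \to E$ be continuous. An adjoint
of $f$ is a monotone function $f^*\colon E \to D$ satisfying $f f^*(e) = e$ and $f^* f(d) \le d$
for all $e \in E$, $d \in D$; that is,
$$f f^* = \mathrm{id}_E, \qquad f^* f \le \mathrm{id}_D.$$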
3 Theorem.
(i) A continuous map has at most one adjoint. (Hence, if $f$ has an adjoint $f^*$,
$f^*$ is called the adjoint of $f$.)
(ii) If $f$ has adjoint $f^*$, $f$ and $f^*$ are strict. In particular, $f^*$ is continuous.
(iii) If $f\colon D \to E$, $g\colon E \to F$ have adjoints so does $gf$ and $(gf)^* = f^* g^*$.
(iv) If $f$ is an isomorphism, $f^* = f^{-1}$ is the adjoint of $f$.
PROOF. (i) If $f f^* = f g = \mathrm{id}_E$, while $f^* f,\ g f \le \mathrm{id}_D$, then for $e \in E$,
$f^* e = f^*(f g)e = (f^* f) g e \le g e$. Symmetrically, $g e \le f^* e$. Thus, $f^* = g$.
(ii) As $\bot \le f^* \bot$ and $f$ is monotone, $f \bot \le f f^* \bot = \bot$ which implies
$f \bot = \bot$ so $f$ is strict. Let $\bigvee e_n = e$. As $f^*$ is assumed monotone, to show $f^*$
is continuous it suffices to verify that if $f^* e_n \le d$ for all $n$, then $f^* e \le d$ holds.
As $f$ is monotone, $e_n = f f^* e_n \le f d$ for all $n$ so that $e \le f d$. As $f^*$ is monotone,
$f^* e \le f^* f d \le d$, so $f^*$ is indeed continuous. Finally, as $\bot \le f \bot$,
$f^* \bot \le f^* f \bot \le \bot$ so that $f^* \bot = \bot$ and $f^*$ is strict.
(iii) $(g f)(f^* g^*) = g (f f^*) g^* = g g^* = \mathrm{id}_F$. For $d \in D$, $g^* g f d \le f d$ and, as $f^*$
is monotone, $f^* g^* g f d \le f^* f d \le d$.
(iv) Obvious.
□
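For finite domains the conditions of Definition 2 can be checked by exhaustive search, since every monotone map between finite posets is continuous. The following Python sketch is only an illustration and is not part of the formal development; the helper names `is_monotone` and `adjoints`, and the choice of two finite chains with least element 0, are assumptions made for the example.

```python
from itertools import product

def is_monotone(f, dom, leq_dom, leq_cod):
    """True if the map f (a dict) preserves the order."""
    return all(leq_cod(f[a], f[b]) for a in dom for b in dom if leq_dom(a, b))

def adjoints(f, D, leq_d, E, leq_e):
    """All monotone g: E -> D with f(g(e)) == e and g(f(d)) <= d."""
    found = []
    for values in product(D, repeat=len(E)):
        g = dict(zip(E, values))
        if (is_monotone(g, E, leq_e, leq_d)
                and all(f[g[e]] == e for e in E)
                and all(leq_d(g[f[d]], d) for d in D)):
            found.append(g)
    return found

# D is the chain 0 < 1 < 2 < 3 and E is the chain 0 < 1 (0 is bottom in both).
D, E = [0, 1, 2, 3], [0, 1]
le = lambda a, b: a <= b

# f collapses {0, 1} to 0 and {2, 3} to 1; it is strict and surjective.
f = {0: 0, 1: 0, 2: 1, 3: 1}
result = adjoints(f, D, le, E, le)

# A constant map is not surjective, so it can have no adjoint.
constant = {0: 0, 1: 0, 2: 0, 3: 0}
none_found = adjoints(constant, D, le, E, le)
```

In this example the search finds exactly one candidate, in line with 3(i), and that candidate sends each element of `E` to the least element of its preimage under `f`.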
4 Definition. Define $\mathbf{Dom}$ to be the category whose objects are all domains
and whose morphisms $\mathbf{Dom}(D, E)$ are the continuous functions $f\colon D \to E$
which are strict, that is, $f(\bot) = \bot$. $\mathbf{Dom}$ is a subcategory of $\mathbf{Dom}_c$.
We then define the category $\mathbf{Dom}_{adj}$ as the subcategory of $\mathbf{Dom}$ of all
domains and of maps which have an adjoint. By 3(ii), if $f$ has an adjoint, $f$ is
in $\mathbf{Dom}$. By 3(iii), $\mathbf{Dom}_{adj}$ is closed under composition, and identity maps of
$\mathbf{Dom}$ are in $\mathbf{Dom}_{adj}$ by 3(iv). Thus, $\mathbf{Dom}_{adj}$ is indeed a subcategory of $\mathbf{Dom}$.
5 Example. In $\mathbf{Dom}$, each projection function $pr_i\colon D_1 \times \cdots \times D_n \to D_i$ has
an adjoint. Define $pr_i^*(x) = (\bot, \ldots, \bot, x, \bot, \ldots, \bot)$ with $x$ in the $i$th coordinate.
6 Example. In $\mathbf{Dom}$, $i\colon D \to 1$ has an adjoint, namely, $\bot\colon 1 \to D$.
7 Example. For any two sets $D$, $E$, the map $f\colon \mathbf{Mfn}(D, E) \to \mathbf{Pfn}(D, E)$
defined by
$$f(g)(d) = \begin{cases} e & \text{if } g(d) = \{e\}, \\ \text{undefined} & \text{else} \end{cases}$$
has an adjoint, namely,
$$f^*(h)(d) = \begin{cases} \emptyset & \text{if } h(d) \text{ is not defined}, \\ \{h(d)\} & \text{else.} \end{cases}$$
8 Example. Let $D$ be any domain. Define $f\colon [D \to D] \to D$ by $f(g) = g(\bot)$.
Then $f$ has an adjoint, namely, $f^*(d)(e) = d$, that is, $f^*(d)\colon D \to D$ is constantly $d$.
9 Example. If $f\colon D \to E$ in $\mathbf{Dom}_{adj}$ then the equation $f f^* = \mathrm{id}_E$ implies that $f$
is surjective and $f^*$ is injective. This implies, in particular, that $\mathbf{Dom}_{adj}$ cannot have an initial object. Let $D$ be any domain and let $E$ be the flat domain
6.1.16 on the set $E$ of subsets of $D$. By Cantor's theorem (Exercise 13.3.6) there
is no surjection $D \to E$ and so no $\mathbf{Dom}_{adj}$-morphism $D \to E$. Thus, $D$ is not
initial.
The Appendix to this section (which may be omitted by readers who are
content to simply apply the following result) shows that Domadj meets a
number of important criteria for a "category for recursive specification of
data types as domains." In particular, it will have the property which motivated this section:
10 Theorem. There is, for each domain $A$, a domain $D$ defined as the greatest
fixed point of the functor $\psi\colon \mathbf{Dom}_{adj} \to \mathbf{Dom}_{adj}$, $\psi(C) = A + [C \to C]$,
$$D \cong A + [D \to D].$$
Furthermore, each polynomial functor $\mathbf{Dom}_{adj} \to \mathbf{Dom}_{adj}$ is continuous and
has a greatest fixed point.
As a result, sets of recursive specifications have greatest fixed point solutions in Domadj . The details are established in the Appendix below. We
conclude the body of this section with examples, starting with the example
due to Scott and Strachey that introduced this section.
11 Example. For a given set location and sets $V_1, \ldots, V_n$ of atomic values,
value is recursively defined in $\mathbf{Dom}_{adj}$ by the specification
state ::= [location → value]
procedure ::= [value × state → value × state]
value ::= $V_1 + \cdots + V_n$ + location + procedure + list(value).
Here, if $\Gamma\colon \mathbf{Dom}_{adj} \times \mathbf{Dom}_{adj} \to \mathbf{Dom}_{adj}$ is the functor $\Gamma(A, B) = 1 + A \times B$
then via 12.3.7, list(C) = greatest fixed point of $\Gamma(C, -)$ is continuous (though
as we mentioned earlier, and will discuss below, list(C) will of necessity
contain infinite lists). Since this makes the constituent functors continuous,
these specifications may be solved in $\mathbf{Dom}_{adj}$ as indicated in Section 12.3.
Since value has at least two elements (unless $V_1 + \cdots + V_n$ + location is
trivial) and [value → value] is embedded as a subset of value, these specifications have no solutions in Set.
12 Example. Consider an attempt to form $A^*B$ as a data type in some
category of domains, the only technical requirement being that the concatenation map
13
$$c\colon A \times A^*B \to A^*B, \qquad (a, w) \mapsto aw$$
be continuous. (In Set, $c$ arises from the isomorphism $\mu\colon B + A \times A^*B \to A^*B$
by composing with $in_{A \times A^*B}$; presumably a similar construction would work to justify the continuity of 13.) As a result, for fixed $a \in A$,
14
$$\psi\colon A^*B \to A^*B, \qquad w \mapsto aw$$
is continuous, being $c(a, -)$. By the Kleene fixed point theorem 6.2.13, $\psi$ has
a fixed point $v$. Thus,
$$v = av = aav = aaav = \cdots$$
so that "$A^*B$" has an infinitely long word after all.
We see in particular that to define $f^\dagger\colon A \to B$ given $f\colon A \to A + B$ in
$\mathbf{Dom}_{adj}$ the approach of 10.2.5 (which used not only the terminal object
property of the greatest fixed point $A^*B + A^\infty$ of $\psi C = B + (A \times C)$ to define
the trace semantics $A \to A^*B + A^\infty$ of $f$ but also the initial object property
of the least fixed point $A^*B$ of $\psi$ to define last: $A^*B \to B$) would require
modification.
It is possible to define $f^\dagger$ in $\mathbf{Dom}_{adj}$ in a different way. Observe that
15
$$\psi\colon [A \to B] \to [A \to B], \qquad \psi(g)(a) = \begin{cases} g(f(a)) & \text{if } f(a) \in A, \\ f(a) & \text{if } f(a) \in B, \\ \bot & \text{if } f(a) = \bot \end{cases}$$
is continuous if $f$ is, and so has least fixed point by the Kleene fixed point
theorem. This least fixed point is $f^\dagger$. This can be done without domains using
essentially the same $\psi$ from Pfn(A, B) to itself, but what is mathematically
nicer about the situation in $\mathbf{Dom}_{adj}$ is that $[A \to B]$ is itself a data type which
is "computable" when $A$, $B$ are, a statement not true for Pfn(N, N).
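The Kleene iteration behind 15 can be run directly when $A$ and $B$ are finite flat domains. The Python sketch below is a hedged illustration only: representing $\bot$ by `None`, tagging values of $A + B$ as `("A", a)` or `("B", b)`, listing only the non-bottom elements of $A$, and the names `psi` and `dagger` are all assumptions of this example, not notation from the text.

```python
def psi(f, g):
    """One step of 15: psi(g)(a) = g(f(a)) if f(a) is in A,
    f(a) if f(a) is in B, and bottom (None) if f(a) is bottom."""
    out = {}
    for a, r in f.items():
        if r is None:
            out[a] = None
        elif r[0] == "A":
            out[a] = g[r[1]]
        else:
            out[a] = r[1]
    return out

def dagger(f):
    """Least fixed point of psi, reached by iterating from the
    everywhere-bottom function (the Kleene sequence)."""
    g = {a: None for a in f}
    while True:
        nxt = psi(f, g)
        if nxt == g:
            return g
        g = nxt

# f: A -> A + B on A = {0, 1, 2, 3, 4}.
# 0 -> 1 -> 2 -> exit with "done"; 3 loops forever; 4 falls into the loop.
f_example = {
    0: ("A", 1),
    1: ("A", 2),
    2: ("B", "done"),
    3: ("A", 3),
    4: ("A", 3),
}
f_dagger = dagger(f_example)
```

Because the domains here are finite, the ascending sequence stabilizes after finitely many steps; arguments from which `f_example` never exits into `B` are sent to bottom.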
Appendix to Section 13.4
Our work in this Appendix will fall into two parts. The first will establish
general criteria for a category $\mathbf{C}$ of domains to enable functors like $\psi(C) = A + [C \to C]$
to be continuous, so that they do have greatest fixed points.
The second half of the Appendix will establish that $\mathbf{Dom}_{adj}$ meets these
criteria. These technical details will not be used elsewhere in the book, and so
may be omitted by those readers uninterested in the proof of Theorem 10.
16 Desiderata for a data type category $\mathbf{C}$:
(i) $\mathbf{C}$ has limits of left chains and a terminal object.
(ii) $\mathbf{C}$ has "polynomial" endofunctors, all of which are continuous.
(iii) $\mathbf{C}$ has a "function-space endofunctor," given by $C \mapsto [C \to C]$ on objects, which is continuous. Unfortunately, we shall see that the function-space belongs not to $\mathbf{C}$ but to a larger category of which $\mathbf{C}$ is a subcategory. Putting this another way, we start with the category with the desired
function-space, find it has too many morphisms for $C \mapsto [C \to C]$ to be
made into a functor, and thus restrict the set of morphisms in $\mathbf{C}$ to obtain
functoriality.
(iv) The greatest fixed points arising from the functors of (ii) and (iii) accord
with the programmer's intuition.
The point of the emphasized statement in (iii) is that, although $\mathbf{Dom}_c$ has
the morphisms we need for program specification, we have to find a subcategory with fewer morphisms if we are to meet the above desiderata for
data type specification.
We begin by studying the category Dom of domains and strict maps of 4.
17 Observation. Dom has products. The construction of 13.2.3(i) works with
the usual projection functions. Infinite products are constructed in the same
way.
18 Observation. $\mathbf{Dom}$ has coproducts. $D_1 + \cdots + D_n$ is constructed as the disjoint union with least elements identified, that is,
$$D_1 + \cdots + D_n = \{\bot\} + \{1\} \times (D_1 \setminus \{\bot_1\}) + \cdots + \{n\} \times (D_n \setminus \{\bot_n\})$$
(where $B \setminus A = \{x : x \in B,\ x \notin A\}$) with ordering $\bot \le z$ for all $z$, whereas
$(i, x) \le (j, y)$ if and only if $i = j$ and $x \le y$ in $D_i$. This is clearly a domain, and
the injection $in_i\colon D_i \to D_1 + \cdots + D_n$ mapping $\bot_i$ to $\bot$ and mapping $x \ne \bot_i$
to $(i, x)$ is strict. Given $f_i\colon D_i \to E$ strict, the map $f$ in the triangle
$$D_i \xrightarrow{\;in_i\;} D_1 + \cdots + D_n \xrightarrow{\;f\;} E$$
defined by $f(\bot) = \bot_E$, $f(i, x) = f_i(x)$ is strict (since each nontrivial chain is in
one of the $D_i$) and $f\, in_i = f_i$ because $f_i(\bot_i) = \bot_E$.
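The construction of 18 is easy to carry out for small finite domains. The following Python sketch is offered as an illustration under stated assumptions: a domain is represented as a triple (elements, bottom, order relation), the shared least element is the string `"bot"`, and the names `coalesced_sum`, `injection`, and `copair` are hypothetical helpers introduced here.

```python
BOT = "bot"  # the single least element shared by all summands

def coalesced_sum(domains):
    """domains: list of (elements, bottom, leq).
    Returns (elements, leq) for D_1 + ... + D_n with bottoms identified."""
    elems = [BOT] + [(i, x)
                     for i, (xs, bot, _) in enumerate(domains, start=1)
                     for x in xs if x != bot]

    def leq(u, v):
        if u == BOT:
            return True          # bottom lies below everything
        if v == BOT:
            return False
        (i, x), (j, y) = u, v
        return i == j and domains[i - 1][2](x, y)

    return elems, leq

def injection(domains, i):
    """in_i: D_i -> D_1 + ... + D_n, sending the bottom of D_i to BOT."""
    bot = domains[i - 1][1]
    return lambda x: BOT if x == bot else (i, x)

def copair(strict_maps, bottom_of_target):
    """The induced f with f(BOT) = bottom of E and f(i, x) = f_i(x)."""
    def f(u):
        if u == BOT:
            return bottom_of_target
        i, x = u
        return strict_maps[i - 1](x)
    return f

# D_1 is the chain 0 < 1 < 2; D_2 is the flat domain on {'a', 'b'} with bottom None.
chain = ([0, 1, 2], 0, lambda x, y: x <= y)
flat = ([None, "a", "b"], None, lambda x, y: x is None or x == y)
domains = [chain, flat]

elems, leq = coalesced_sum(domains)

# Two strict maps into the chain of natural numbers (bottom 0).
f1 = lambda x: x
f2 = lambda x: {None: 0, "a": 5, "b": 7}[x]
f = copair([f1, f2], 0)
```

Composing `f` with `injection(domains, i)` recovers `f1` and `f2`, and it is the strictness of each summand map that makes the value at the identified bottom unambiguous.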
Recall, by contrast, that $\mathbf{Dom}_c$, in which the continuous maps need not be
strict, does not have coproducts. However, this should not tempt us to use
$\mathbf{Dom}$ as the setting for program semantics. A continuous $f\colon D \to D$ in $\mathbf{Dom}$ is
strict, so that $f(\bot) = \bot$. But then its least fixed point is $\bot$. But this means
that in $\mathbf{Dom}$ every recursive specification has the undefined function for its
semantics. So, we must reject the strictness condition if we are to use the
Kleene sequence for the semantics of recursive program specification. We
shall see that order semantics requires a different category of domains for
program specification from that for data type specification: recall the emphasized remark in 16(iii) above. This is an important observation which must be
understood if confusion is to be avoided. However, it does not invalidate the
ordered approach, for we have seen that the approach we offered in Chapter
12 used Set as the setting for data type specification, but Pfn or Mfn for
program specification. In any case, we must further study $\mathbf{Dom}$ to prepare for
the study of $\mathbf{Dom}_{adj}$ which serves as the category of domains which meets the
criteria 16 for recursive specification of data types, if it is required (contra
our critique of Section 13.1) that all data types be domains.
19 Proposition. $\mathbf{Dom}$ has limits of left chains and a terminal object.
PROOF. The terminal object is the one-element domain $1 = \{\bot\}$ with $\bot \le \bot$.
That $i\colon D \to 1$ is strict for any domain $D$ is clear. Given a left chain
$d_n\colon D_{n+1} \to D_n$ in $\mathbf{Dom}$, construct the limit $\alpha_n\colon D \to D_n$ in Set as in 11.3.3. Thus,
$D = \{(x_n : n = 0, 1, 2, \ldots) \mid x_n \in D_n,\ d_n(x_{n+1}) = x_n\}$ with $\alpha_m(x_n) = x_m$. Define
$(x_n) \le (y_n)$ to mean $x_n \le y_n$ for all $n$. If $(x_n^0) \le (x_n^1) \le (x_n^2) \le \cdots$ is an ascending
chain in $D$ then for each $n$, $x_n^0 \le x_n^1 \le x_n^2 \le \cdots$ is an ascending chain in $D_n$ and
so has least upper bound $y_n$. As $d_n$ is continuous
$$d_n y_{n+1} = d_n\Bigl(\bigvee_m x_{n+1}^m\Bigr) = \bigvee_m d_n x_{n+1}^m = \bigvee_m x_n^m \ \ (\text{as } (x_n^m \mid n = 0, 1, 2, \ldots) \in D) = y_n,$$
so that $(y_n \mid n = 0, 1, 2, \ldots) \in D$ and hence is the least upper bound of
$(x_n^0) \le (x_n^1) \le (x_n^2) \le \cdots$. Clearly, $\alpha_n\colon D \to D_n$ is strict. Given another lower bound
$\beta_n\colon V \to D_n$, the unique function $f\colon V \to D$ with $\alpha_n f = \beta_n$ for all $n$
is strict since it is obvious from the construction of $D$ that any function $f$ with
$\alpha_n f$ strict for all $n$ is itself strict.
□
20 Proposition. Any polynomial functor $\psi\colon \mathbf{Dom} \to \mathbf{Dom}$ (defined exactly
as in 10.1.22) is continuous and hence has greatest fixed point by the dual of
11.2.13.
PROOF. This is proved just as in Set. The general result 11.3.8 applies. All that
remains is that a coproduct of continuous functors be continuous and here
only minor modifications of the proof of 11.3.10 are needed. We leave them
to the reader.
D
21 Strategy. We seek a subcategory $\mathbf{C}$ of $\mathbf{Dom}$ with $\mathrm{ob}(\mathbf{C}) = \mathrm{ob}(\mathbf{Dom})$ and
with the following virtues:
(i) $D \mapsto [D \to D]$, where $[D \to D]$ is defined as in 13.2.1, extends to a
functor $\mathbf{C} \to \mathbf{C}$.
(ii) The terminal object 1 of $\mathbf{Dom}$ is terminal in $\mathbf{C}$.
(iii) If $d_n\colon D_{n+1} \to D_n$ is a left chain in $\mathbf{C}$ with limit $\alpha_n\colon D \to D_n$ in $\mathbf{Dom}$ as in 19
then $\alpha_n$ is in $\mathbf{C}$ and, moreover, if $(V, \beta_n)$ is a lower bound with $\beta_n$ in $\mathbf{C}$ then
the unique strict $f$ with $\alpha_n f = \beta_n$ is in $\mathbf{C}$. It follows that $\alpha_n\colon D \to D_n$ is the
limit in $\mathbf{C}$.
(iv) The functor $[D \to D]$ of (i) is continuous.
(v) If $f_i\colon D_i \to E_i$, $i = 1, \ldots, n$, in $\mathbf{C}$ then $f_1 \times \cdots \times f_n\colon D_1 \times \cdots \times D_n \to E_1 \times \cdots \times E_n$
and $f_1 + \cdots + f_n\colon D_1 + \cdots + D_n \to E_1 + \cdots + E_n$, computed
in $\mathbf{Dom}$, are in fact in $\mathbf{C}$. It follows that every polynomial functor
$\mathbf{Dom} \to \mathbf{Dom}$ maps $\mathbf{C}$ into $\mathbf{C}$.
Such a subcategory would meet the desiderata of 16 rather well. The
"polynomial" functors of 16(ii) are those of 21(v) and these all have greatest
fixed points because of 20 and 21(ii, iii). Similarly, 16(iii) is met by 21(i, iv).
The aesthetic criterion 16(iv) is argued by pointing out that the underlying
set of the greatest fixed point of a polynomial functor is almost the same as if
computed in Set (since products and limits of left chains are constructed as in
Set whereas the coproduct is very close to the disjoint union) whereas the
function-space domain $[D \to D]$ has been motivated by 13.2.2. Note, however, that $\mathbf{C}[D, D]$ for such a $\mathbf{C}$ is a subset of the strict maps from $D$ to $D$
and so is much smaller than $[D \to D]$, which equals the function-space
$\mathbf{Dom}_c[D, D]$ of all continuous maps from $D$ to $D$, whether strict or not.
We will in fact satisfy 21 with the subcategory $\mathbf{Dom}_{adj}$ of $\mathbf{Dom}$. What we
have said so far might well apply to subcategories of categories other than
$\mathbf{Dom}$.
We now show that $\mathbf{Dom}_{adj}$ satisfies the five conditions of 21, beginning
with the following:
22 Definition. $D \mapsto [D \to D]$ as in 13.2.2 extends to a functor
$\psi\colon \mathbf{Dom}_{adj} \to \mathbf{Dom}_{adj}$ as discussed prior to 2, namely, for $f\colon D \to E$ define
$\psi f\colon [D \to D] \to [E \to E]$ by
$$(\psi f)(D \xrightarrow{\;h\;} D) = E \xrightarrow{\;f^*\;} D \xrightarrow{\;h\;} D \xrightarrow{\;f\;} E.$$
By 3(ii) $(\psi f)h \in [E \to E]$. Recalling how $[D \to D]$, $[E \to E]$ are domains from
13.2.2, the continuity of $f$ implies that of $\psi f$ since if $h \le l$ in $[D, D]$ and $e \in E$
then $((\psi f)h)e = f h f^* e \le f l f^* e = ((\psi f)l)e$, whereas if $h_0 \le h_1 \le h_2 \le \cdots$ is an
ascending chain in $[D \to D]$ then
$$(\psi f)\Bigl(\bigvee h_n\Bigr)e = f\Bigl(\bigvee h_n\Bigr)f^* e = \bigvee f h_n f^* e = \Bigl(\bigvee (\psi f)(h_n)\Bigr)e.$$
Define $(\psi f)^*$ by
$$(\psi f)^*(E \xrightarrow{\;t\;} E) = D \xrightarrow{\;f\;} E \xrightarrow{\;t\;} E \xrightarrow{\;f^*\;} D.$$
Then $(\psi f)^*$ is monotone because $f^*$ is. $(\psi f)(\psi f)^* t = f(f^* t f)f^* = (f f^*) t (f f^*) = t$,
whereas $(\psi f)^*(\psi f)h = f^*(f h f^*)f = (f^* f)h(f^* f) \le h(f^* f) \le h$.
Finally, applying 3(iii), if $g\colon E \to F$, $\psi(g f)(h) = g f h (g f)^* = g f h f^* g^* = (\psi g)(\psi f)(h)$
and by 3(iv), $(\psi\, \mathrm{id}_D)h = \mathrm{id}_D\, h\, \mathrm{id}_D^* = h$. This shows $\psi$ is functorial.
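The two identities established in 22 can be confirmed by enumeration on a small example. The Python sketch below is an illustration only; it assumes $D$ is the chain $0 < 1 < 2 < 3$, $E$ is the chain $0 < 1$, takes $[D \to D]$ and $[E \to E]$ to be all monotone maps (which coincide with the continuous ones on finite posets), and uses hypothetical names `psi_f` and `psi_f_star`.

```python
from itertools import product

D, E = [0, 1, 2, 3], [0, 1]
f = {0: 0, 1: 0, 2: 1, 3: 1}      # f: D -> E
f_star = {0: 0, 1: 2}             # its adjoint f*: E -> D

def monotone_maps(P):
    """All monotone self-maps of the finite chain P, as dicts."""
    maps = []
    for values in product(P, repeat=len(P)):
        h = dict(zip(P, values))
        if all(h[a] <= h[b] for a in P for b in P if a <= b):
            maps.append(h)
    return maps

def psi_f(h):
    """(psi f)(h) = f h f*, a map E -> E."""
    return {e: f[h[f_star[e]]] for e in E}

def psi_f_star(t):
    """(psi f)*(t) = f* t f, a map D -> D."""
    return {d: f_star[t[f[d]]] for d in D}

def pointwise_leq(h, l):
    return all(h[x] <= l[x] for x in h)

# (psi f)(psi f)* t = t for every t in [E -> E]
split = all(psi_f(psi_f_star(t)) == t for t in monotone_maps(E))

# (psi f)*(psi f) h <= h for every h in [D -> D]
deflation = all(pointwise_leq(psi_f_star(psi_f(h)), h)
                for h in monotone_maps(D))
```

Both flags come out true, as the calculation in 22 predicts; replacing `f_star` by any other map would break at least one of them, since adjoints are unique by 3(i).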
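23 Proposition. The terminal object $1 = \{\bot\}$ of $\mathbf{Dom}$ is terminal in $\mathbf{Dom}_{adj}$.
PROOF. This is immediate from 6.
□
We turn next to establishing 21(iii) for $\mathbf{Dom}_{adj}$, namely, that it forms limits
$\alpha_n$ in $\mathbf{Dom}$. To this end let $d_n\colon D_{n+1} \to D_n$ be a left chain in $\mathbf{Dom}_{adj}$. Define
24
$$D_m \xrightarrow{\;d_{mn}\;} D_n = \begin{cases} d_n \cdots d_{m-1} & \text{if } m > n, \\ \mathrm{id} & \text{if } m = n, \\ d_{n-1}^* \cdots d_m^* & \text{if } m < n. \end{cases}$$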
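25 Lemma. For all $m$, $n$, $d_n\, d_{m(n+1)} = d_{mn}$; that is, the triangle formed by
$d_{m(n+1)}\colon D_m \to D_{n+1}$, $d_n\colon D_{n+1} \to D_n$, and $d_{mn}\colon D_m \to D_n$ commutes.
PROOF. If $m > n$, $d_n d_{m(n+1)} = d_n d_{n+1} \cdots d_{m-1} = d_{mn}$. If $m = n$, $d_n d_{m(n+1)} = d_n d_n^* = \mathrm{id} = d_{mn}$.
If $m < n$ then $d_n d_{m(n+1)} = d_n d_n^* \cdots d_m^* = d_{n-1}^* \cdots d_m^* = d_{mn}$.
□
26 Definition. Let $\alpha_n\colon D \to D_n$ be the limit of $d_n\colon D_{n+1} \to D_n$ in $\mathbf{Dom}$, as in 19.
27 Lemma. Let $(x_n) \in D$ so that $d_n x_{n+1} = x_n$ for all $n$. Then for any fixed $m$,
$d_{mn} x_m \le x_n$ for all $n$.
PROOF. If $m \ge n$, $d_{mn} x_m = d_n \cdots d_{m-1} x_m = x_n$. If $m < n$,
$$d_{mn} x_m = d_{n-1}^* \cdots d_m^* x_m = d_{n-1}^* \cdots d_m^* d_m x_{m+1} \le d_{n-1}^* \cdots d_{m+1}^* x_{m+1} \le \cdots \le d_{n-1}^* d_{n-1} x_n \le x_n.$$
□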
28 Proposition. The conditions of 21(iii) hold, namely, that the $\mathbf{Dom}$ limit of a
chain in $\mathbf{Dom}_{adj}$ has limit projection in $\mathbf{Dom}_{adj}$ and the unique strict map
induced by a lower bound in $\mathbf{Dom}_{adj}$ is again in $\mathbf{Dom}_{adj}$.
PROOF. The maps $d_{mn}$ are in $\mathbf{Dom}$ by 3(ii). Hence, the content of Lemma 25 is
that $(D_m, (d_{mn} : n = 0, 1, 2, \ldots))$ is a lower bound of $d_n\colon D_{n+1} \to D_n$ in $\mathbf{Dom}$,
inducing a unique strict $\alpha_m^*\colon D_m \to D$ with $\alpha_n \alpha_m^* = d_{mn}$ for all $n$.
Then $\alpha_m \alpha_m^* = d_{mm} = \mathrm{id}$. To show $\alpha_m^*$ is the adjoint of $\alpha_m$ let $(x_n) \in D$ and show
$\alpha_m^* \alpha_m (x_n) \le (x_n)$ or, equivalently, for each $n$ that $\alpha_n \alpha_m^* x_m \le x_n$ (since $\alpha_k(x_l) = x_k$).
But as $\alpha_n \alpha_m^* = d_{mn}$ this reduces to $d_{mn} x_m \le x_n$ which is precisely Lemma
27. This shows that $\alpha_m \in \mathbf{Dom}_{adj}$.
Now let $(V, (\beta_n))$ be a lower bound of $d_n\colon D_{n+1} \to D_n$ in $\mathbf{Dom}_{adj}$, so that
29
$$d_n \beta_{n+1} = \beta_n \quad \text{for all } n.$$
As $(V, (\beta_n))$ is certainly a lower bound in $\mathbf{Dom}$, there exists unique strict $f\colon V \to D$
with $\alpha_n f = \beta_n$ for all $n$.
We must show that $f$ has an adjoint $f^*\colon D \to V$. Let $(x_n) \in D$. Claim that
$\beta_m^* \alpha_m (x_n) = \beta_m^* x_m$ is an ascending chain. Indeed, $\beta_m^* x_m = \beta_{m+1}^* d_m^* x_m$ (as
$d_m \beta_{m+1} = \beta_m$; use 3(iii)) $= \beta_{m+1}^* d_m^* d_m x_{m+1} \le \beta_{m+1}^* x_{m+1}$. So define
30
$$f^*(x_n) = \bigvee_m \beta_m^* x_m.$$
By 3(ii), $\beta_m^*$ is strict. It follows from the proof 13.2.2 that $f^*$ is continuous,
monotone in particular. To show $f^* f \le \mathrm{id}_V$, $f^* f v = \bigvee \beta_m^* \alpha_m f v$ (as $x = (\alpha_m x : m = 0, 1, 2, \ldots)$
for all $x$ in $D$) $= \bigvee \beta_m^* \beta_m v$. As $\beta_m^* \beta_m v \le v$, $v$ is an upper
bound of $(\beta_m^* \beta_m v)$ so that $f^* f v \le v$. Finally, we show $f f^* = \mathrm{id}$. We must show
$\alpha_n f f^*(x_m) = x_n$ for all $n$, $(x_m) \in D$. To do this we exploit
$$d_{mn} \beta_m = \beta_n \qquad (m \ge n)$$
which is immediate from 24 and 29. Computing, $\alpha_n f f^*(x_m) = \beta_n \bigvee \beta_m^* x_m = \bigvee_{m \ge 0} \beta_n \beta_m^* x_m$
(continuity of $\beta_n$). But for any $n$, an ascending chain $y_0 \le y_1 \le y_2 \le \cdots$
has the same set of upper bounds as $y_n \le y_{n+1} \le y_{n+2} \le \cdots$ so that
both have the same least upper bound. Thus,
$$\alpha_n f f^*(x_m) = \bigvee_{m \ge n} \beta_n \beta_m^* x_m = \bigvee_{m \ge n} d_{mn} \beta_m \beta_m^* x_m = \bigvee_{m \ge n} d_{mn} x_m = x_n$$
as each $d_{mn} x_m = x_n$.
□
31 Lemma. In $\mathbf{Dom}_{adj}$ let $d_n\colon D_{n+1} \to D_n$ be a left chain and let $\beta_n\colon V \to D_n$ be a
lower bound. Then $\beta_m^* \beta_m v$ is an ascending chain in $V$ for every $v \in V$, and $(V, (\beta_n))$
is a limit of $(d_n)$ if and only if $\bigvee \beta_m^* \beta_m = \mathrm{id}_V$.
PROOF. Let $\alpha_n\colon D \to D_n$ be the limit as in 19 which is then a limit in $\mathbf{Dom}_{adj}$
by 28. Let $g\colon V \to D$ be the unique map in $\mathbf{Dom}_{adj}$ with $\alpha_n g = \beta_n$ for all $n$. By
30 $g^*(x_n) = \bigvee \beta_m^* x_m$ so that $g^* g v = g^*(\alpha_n g v \mid n \ge 0) = \bigvee \beta_m^* \alpha_m g v = \bigvee \beta_m^* \beta_m v$.
Thus, $g^* g = \bigvee \beta_m^* \beta_m$, and this is always defined. But then $(V, (\beta_n))$ is a limit
if and only if $g$ is an isomorphism if and only if $g^* g = \mathrm{id}_V$.
□
32 Lemma. Let $(P, \le)$ be a poset, let $A \subset P$ be such that $h = \mathrm{LUB}(A)$ exists
and suppose $B \subset A$ is such that for all $a \in A$ there exists $b \in B$ with $a \le b$. Then
$\mathrm{LUB}(B)$ exists and $\mathrm{LUB}(B) = h$.
PROOF. That $h$ is an upper bound for $B$ is clear. Now let $u$ be any upper
bound for $B$. If $a \in A$ there exists $b \in B$ with $a \le b$. As $b \le u$, $a \le u$; thus,
$u \in \mathrm{UB}(A)$ and $h \le u$.
□
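33 Proposition. The functor $\psi\colon \mathbf{Dom}_{adj} \to \mathbf{Dom}_{adj}$, $\psi D = [D \to D]$ is
continuous.
PROOF. Let $d_n\colon D_{n+1} \to D_n$ be a left chain in $\mathbf{Dom}_{adj}$ with limit $\alpha_n\colon D \to D_n$ as in
28. Let $\beta_n = \psi \alpha_n\colon [D \to D] \to [D_n \to D_n]$ so that
$$\beta_n h = \alpha_n h \alpha_n^*, \qquad \beta_n^* l = \alpha_n^* l \alpha_n, \qquad \beta_n^* \beta_n h = \alpha_n^* \alpha_n h \alpha_n^* \alpha_n.$$
By Lemma 31 we must show $\bigvee \beta_n^* \beta_n h = h$. Lemma 31 guarantees that
$\bigvee \alpha_n^* \alpha_n = \mathrm{id}_D$. Thus,
$$\begin{aligned}
h = h\, \mathrm{id}_D &= h \bigvee \alpha_n^* \alpha_n \\
&= \bigvee h \alpha_n^* \alpha_n \quad (h \text{ is continuous}) \\
&= \mathrm{id}_D \bigvee h \alpha_n^* \alpha_n = \bigvee_m \alpha_m^* \alpha_m \bigvee_n h \alpha_n^* \alpha_n \\
&= \bigvee_m \bigvee_n \alpha_m^* \alpha_m h \alpha_n^* \alpha_n \quad (\alpha_m^* \alpha_m \text{ is continuous}).
\end{aligned}$$
Now if $m \le n$, $\alpha_m^* \alpha_m h \alpha_n^* \alpha_n \le \alpha_n^* \alpha_n h \alpha_n^* \alpha_n$ since $\alpha_k^* \alpha_k$ is an ascending chain
whereas, if $m > n$, $\alpha_m^* \alpha_m h \alpha_n^* \alpha_n \le \alpha_m^* \alpha_m h \alpha_m^* \alpha_m$ because $\alpha_k^* \alpha_k$ is ascending and
$\alpha_m^* \alpha_m h$ is monotone. By Lemma 32, $h = \bigvee_n \alpha_n^* \alpha_n h \alpha_n^* \alpha_n = (\bigvee_n \beta_n^* \beta_n)h$ as desired.
□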
34 Proposition. The conditions of 21(v) hold in $\mathbf{Dom}_{adj}$, namely, every polynomial functor on $\mathbf{Dom}$ maps $\mathbf{Dom}_{adj}$ into $\mathbf{Dom}_{adj}$.
PROOF. Since $\times$, $+$ are the product and coproduct in $\mathbf{Dom}$ as in 17 and 18,
10.1.18 and its dual apply. Using 3(iii) we have
$(f_1 \times \cdots \times f_n)(f_1^* \times \cdots \times f_n^*) = f_1 f_1^* \times \cdots \times f_n f_n^* = \mathrm{id} \times \cdots \times \mathrm{id} = \mathrm{id}$,
whereas $(f_1^* \times \cdots \times f_n^*)(f_1 \times \cdots \times f_n) = f_1^* f_1 \times \cdots \times f_n^* f_n$.
But in a product domain, $(x_1, \ldots, x_n) \le (y_1, \ldots, y_n)$ if
and only if $x_i \le y_i$. Hence, $f_i^* f_i \le \mathrm{id}$ implies $f_1^* f_1 \times \cdots \times f_n^* f_n \le \mathrm{id}$. Thus,
$(f_1 \times \cdots \times f_n)^* = f_1^* \times \cdots \times f_n^*$.
The proof that $(f_1 + \cdots + f_n)^* = f_1^* + \cdots + f_n^*$ is similar.
□
Notes and References for Chapter 13
The Scott-Strachey approach to formal semantics was set forth in D. S. Scott and
C. Strachey, "Towards a mathematical semantics for computer languages," in Proceedings of the Symposium on Computers and Automata (J. Fox, ed.), Polytechnic
Institute of Brooklyn Press, 1971, pp. 19-46. The relation of this to the λ-calculus can
be seen from D. S. Scott, "Models for various type-free calculi," Proceedings of the
IVth International Congress for Logic, Methodology and Philosophy of Science IV (P.
Suppes, L. Henkin, A. Joja, and G. C. Moisil, eds.), North-Holland 1973, pp. 157-187.
The (pre-categorical) statement of the formation of data types by inverse limits in
some space of data types is given by D. S. Scott, "Data types as lattices," SIAM
Journal on Computing, 5, 1976, pp. 522-587. All these matters are treated in textbook
form by J. E. Stoy, Denotational Semantics: The Scott-Strachey Approach to Programming Language Theory, MIT Press, 1977. By contrast, for an informal use of the
Scott-Strachey approach without attention to the mathematical issues involved see
M. J. C. Gordon, The Denotational Description of Programming Languages, An Introduction, Springer-Verlag, 1979.
Gödel introduced his numbering in "Über formal unentscheidbare Sätze der
Principia Mathematica und verwandter Systeme, I," Monatshefte für Mathematik
und Physik, 38, 1931, pp. 173-198. (An English translation appears in M. Davis, The
Undecidable, Raven Press, 1965.) An exposition of computability theory, including the
use of Gödel numbering therein, is given by A. J. Kfoury, R. N. Moll, and M. A. Arbib,
A Programming Approach to Computability, Springer-Verlag, 1982.
The notion of an adjoint map as generalizing a concept of Scott's is due to Gordon
Plotkin; the way in which this concept is used here in making clear the relation
between different categories of domains for programming language semantics is due
to the authors.
For an elementary introduction to Cartesian-closed categories and Heyting algebras see the book by Goldblatt cited in the Chapter 2 notes. Exercise 13.3.6 is an
unpublished observation of F. W. Lawvere from the late 1960s.
One of the problems in writing an introductory textbook is that many exciting
topics have to be omitted. One such topic is that of the semantics of concurrent
programs. For order semantics, the key notion is that of a power domain, designed to
generalize the construction of the set $2^D$ of subsets of a set, while avoiding the
paradoxes associated with a recursive specification of the type $D ::= 2^D$. Power
domains were introduced by G. D. Plotkin, "A power domain construction," SIAM
Journal on Computing, 5, 1976, pp. 452-487. Our own approach to concurrency,
avoiding order semantics, is given by M. Steenstrup, M. A. Arbib, and E. G. Manes,
"Port automata and the algebra of concurrent processes," Journal of Computer and
System Sciences, 27, 1983, pp. 29-50. For a well-received approach, see R. Milner, A
Calculus of Communicating Systems, Springer-Verlag, 1980.
CHAPTER 14
Equational Specification
14.1 Initial Algebras
14.2 Sur-reflections
A collection of sets $X_1, \ldots, X_n$ together with various functions of the form
$X_{i_1} \times \cdots \times X_{i_k} \to X_{i_{k+1}}$ constitutes a "many-sorted algebra." Section 1 gives
examples of data types which arise as many-sorted algebras. An "equational
specification" for a data type posits a many-sorted algebraic structure subject
to a finite set of equations. What is attractive about this idea is that equational specifications are easily formalized within programming languages
and have been partially implemented in experimental languages such as
CLEAR, ACT ONE, CLU, and others. This provides a tool to define data
types useful in programming and additionally promises to make available a
useful research aid for pure mathematicians who study equationally defined
algebraic structures.
It is an unavoidable fact that each consistent equational specification
has many different models. The pure mathematician mentioned above has
this in mind from the beginning. The programmer interested in defining a
definite data type, however, seeks "the minimal structure satisfying the given
equations and no other equations." This is formalized as the initial object of
the category of all models of the specification.
In Section 1 we offer basic examples and introduce "sorted functors"
which are simply endofunctors X on a suitable category. Appropriate subcategories of X-algebras lead to a rapprochement with the least fixed points
introduced in Chapter 10. These subcategories are characterized in Section 2
where it is also proved that the desired initial objects exist. In general,
the proof is nonconstructive and uses "very high orders of infinity." This
nonconstructiveness is well known to be unavoidable.
Many proponents of equational specification feel that almost all data
types should be specified equationally. As discussed at the end of Section 1
and elsewhere in the Chapter, we feel that this approach creates more
problems than it solves. It remains for future language designers to find the
proper role for equational specification. We have introduced other methods
in the preceding chapters.
14.1 Initial Algebras
Our goal in this section is to give some basic examples of equational
specifications, to formalize their semantics as initial single- or many-sorted
algebras, and to introduce their generalization as algebras over "sorted
functors."
1 Example. An equational specification of the data type "Boolean" is as
follows:
$$F, T\colon 1 \to B, \qquad \neg\colon B \to B, \qquad \vee\colon B^2 \to B$$
(i.e., there is a set $B$ with two constants $F$ (for "false"), $T$ (for "true"), a unary
operation $\neg$ ("negation"), and a binary operation $\vee$ ("or")) subject to the six
equations:
2
$$\neg F = T, \qquad \neg T = F, \qquad T \vee T = T \vee F = F \vee T = T, \qquad F \vee F = F.$$
What we have in mind is the set $B = \{T, F\}$ for which the above equations
would force the usual definitions of "negation" and "or." The additional
property
3
"if $x \in B$ then $x = T$ or $x = F$"
would guarantee this, but 3 is not in the form of an equation. Well-known
theorems of "first-order model theory" guarantee that 3 cannot be equivalently reexpressed using equations, even if infinitely many are allowed. (See
15 below for a definition of equation.) The point of limiting constraints to
equations is simplicity, both from the point of view of syntax and efficiency
of implementation.
As it stands there are many models of these equations, for example, $B = 2^X$
for any set $X$, with $F = \emptyset$, $T = X$, $\neg$ = complement, and $\vee$ = union.
The "intended" model is the initial object in a suitable category of models.
To make this precise, given two models $(B, T, F, \neg, \vee)$, $(B', T', F', \neg', \vee')$
satisfying (the appropriately tagged version of) 2, define a morphism
$$f\colon (B, T, F, \neg, \vee) \to (B', T', F', \neg', \vee')$$
between two models to be a total function $f\colon B \to B'$ subject to
4
$$f(T) = T', \qquad f(F) = F', \qquad f(\neg x) = \neg'(f(x)), \qquad f(x \vee y) = f(x) \vee' f(y).$$
The conditions 4 are natural. They may be paraphrased as "$f$ preserves the
structure." It is easily seen that this yields a category under the usual composition of total functions and with the usual identity functions. Furthermore,
the intended model $(B, T, F, \neg, \vee)$ with $B = \{T, F\}$ is an initial object in this
category since for arbitrary $(B', T', F', \neg', \vee')$, $f(T) = T'$, $f(F) = F'$ is the
requisite unique morphism.
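The initiality claim can be tried out on a concrete model. The Python sketch below is an illustration under stated assumptions: a model is a dictionary with keys `T`, `F`, `neg`, `orr` (names chosen here), the intended model uses Python's booleans, and the target is the powerset model on a three-element set.

```python
def satisfies_equations(T, F, neg, orr):
    """The six equations of 2."""
    return (neg(F) == T and neg(T) == F
            and orr(T, T) == T and orr(T, F) == T and orr(F, T) == T
            and orr(F, F) == F)

# The intended model on B = {T, F}.
intended = dict(T=True, F=False,
                neg=lambda b: not b,
                orr=lambda a, b: a or b)

def powerset_model(X):
    """B = 2^X with F = empty set, T = X, neg = complement, or = union."""
    X = frozenset(X)
    return dict(T=X, F=frozenset(),
                neg=lambda s: X - s,
                orr=lambda a, b: a | b)

def unique_morphism(target):
    """The only candidate allowed by 4: T goes to T', F goes to F'."""
    table = {True: target["T"], False: target["F"]}
    return lambda b: table[b]

def is_morphism(f, carrier, src, tgt):
    """Check the four conditions of 4 on every element of the source carrier."""
    return (f(src["T"]) == tgt["T"] and f(src["F"]) == tgt["F"]
            and all(f(src["neg"](x)) == tgt["neg"](f(x)) for x in carrier)
            and all(f(src["orr"](x, y)) == tgt["orr"](f(x), f(y))
                    for x in carrier for y in carrier))

model = powerset_model({1, 2, 3})
f = unique_morphism(model)
ok = is_morphism(f, [True, False], intended, model)
```

The first two conditions of 4 leave no freedom in choosing `f` on a two-element carrier, which is the uniqueness half of initiality; the check above confirms the existence half for this particular target.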
5 Example. An equational specification of "the natural numbers" is
$$1 \xrightarrow{\;0\;} Q \xrightarrow{\;s\;} Q$$
(so far we have an arbitrary set $Q$, an element $0 \in Q$, and a function $s\colon Q \to Q$)
subject to the empty set of equations.
What we have in mind is $Q = \mathbf{N}$ with 0 the usual zero and $s(n) = n + 1$ the
successor map. By iterated composition we can generate $s^n\colon Q \to Q$ and we
can apply such to 0 to get $s^n 0$. But in the intended model no equations
such as
$$s^n = s^m \quad (m \ne n), \qquad s^n 0 = s^m 0 \quad (m \ne n)$$
hold, and it is hard to think of any equations to impose. Many other
properties come to mind such as
if $n \ne m$ then $s^m 0 \ne s^n 0$
or
if $q \in Q$ then $q = s^n 0$ for some $n$,
but these statements are not equations. However, as has already been shown
in 2.2.27, the intended model is the initial object in the category of all models
of the specification.
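The unique morphism out of the intended model is just iteration of the successor of the target. As a hedged Python sketch (the function name `unique_morphism` and the two sample models are assumptions chosen for illustration):

```python
def unique_morphism(zero, succ):
    """The map h: N -> Q determined by h(0) = zero and h(n + 1) = succ(h(n))."""
    def h(n):
        q = zero
        for _ in range(n):
            q = succ(q)
        return q
    return h

# A model in which an extra equation happens to hold: integers mod 5.
h = unique_morphism(0, lambda q: (q + 1) % 5)

# A model on strings: zero is the empty string, successor appends a stroke.
k = unique_morphism("", lambda s: s + "|")
```

In the first model $s^5 0 = 0$ holds, which illustrates why the intended model, where no such equation is satisfied, is the one that maps into every other.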
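6 Example. Let $E = \{e_1, \ldots, e_p\}$ be a finite alphabet. An equational specification for the data type "stacks of E" analogous to 12.2.1,2,5,6 is as in 7
together with equations to be discovered.
7
$$e_1, \ldots, e_p, \mathrm{aerr}\colon 1 \to A, \qquad \Lambda, \mathrm{serr}\colon 1 \to S, \qquad \mathrm{top}\colon S \to A, \qquad \mathrm{pop}\colon S \to S, \qquad \mathrm{push}\colon A \times S \to S$$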
Here the intended model is $A = E + \{\mathrm{aerr}\}$, $S = E^* + \{\mathrm{serr}\}$ where aerr
and serr represent "alphabet error" and "stack error" constants which are
introduced so that all functions may be total.
Given two such models 7 and 7' (where in 7' we write $A'$, $e_i'$, top', etc.) a
morphism from the model of 7 to that of 7' is a pair $f\colon A \to A'$, $g\colon S \to S'$ of
total functions which preserves the structure in the expected sense:
8
$$\begin{aligned}
f(e_i) &= e_i' \quad (i = 1, \ldots, p), \\
f(\mathrm{aerr}) &= \mathrm{aerr}', \\
g(\Lambda) &= \Lambda', \\
g(\mathrm{serr}) &= \mathrm{serr}', \\
g(\mathrm{pop}(w)) &= \mathrm{pop}'(g(w)), \\
\mathrm{top}'(g(w)) &= f(\mathrm{top}(w)), \\
g(\mathrm{push}(x, w)) &= \mathrm{push}'(f(x), g(w)).
\end{aligned}$$
Now consider a model 7. There is an obvious candidate for a morphism
from the intended interpretation to this model, namely,
9
$$f\colon E + \{\mathrm{aerr}\} \to A, \qquad f(e_i) = e_i, \quad f(\mathrm{aerr}) = \mathrm{aerr}$$
and $g\colon E^* + \{\mathrm{serr}\} \to S$ inductively defined by
10
$$g(\mathrm{serr}) = \mathrm{serr}, \qquad g(\Lambda) = \Lambda, \qquad g(xw) = \mathrm{push}(x, g(w)) \quad (x \in E,\ w \in E^*).$$
Appropriate equations, which together with 7 define an equational specification of "stack of E," are discovered by exploring what is needed to make 9
and 10 define a unique morphism.
Since push(x, w) = xw in the intended model, any morphism must satisfy 9
and 10. Hence, $f$, $g$ are unique without imposing any equations. But certain
equations are suggested if $f$, $g$ are to be morphisms.
Consider the need for g(pop(w)) = pop(g(w)). The inductive step would
be for $xw$. Now pop(xw) is $w$ in 12.2.2, whereas by 10 pop(g(xw)) =
pop(push(x, g(w))). These are the same provided that we impose the equation
pop(push(x, w)) = w
which is certainly true in the intended model. Proceeding in this way, we
leave it to the reader to show that the intended model is the initial object in
the category of models 7 satisfying the equations 11.
11 For all $x \in A$, $w \in S$,
$$\begin{aligned}
\mathrm{pop}(\mathrm{push}(x, w)) &= w, \\
\mathrm{top}(\mathrm{push}(x, w)) &= x, \\
\mathrm{pop}(\Lambda) = \mathrm{pop}(\mathrm{serr}) &= \mathrm{serr}, \\
\mathrm{top}(\Lambda) = \mathrm{top}(\mathrm{serr}) &= \mathrm{aerr}, \\
\mathrm{push}(\mathrm{aerr}, w) = \mathrm{push}(x, \mathrm{serr}) &= \mathrm{serr}.
\end{aligned}$$
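The intended model of 7 is short enough to write down directly. The Python sketch below is an illustration, not a definitive implementation: words of $E^*$ are represented as tuples with the top of the stack first, $\Lambda$ is the empty tuple, and the error constants are the strings `"aerr"` and `"serr"`; all of these representation choices are assumptions made here.

```python
AERR, SERR = "aerr", "serr"   # error constants of sorts alphabet and stack
EMPTY = ()                    # the empty stack, Lambda

def push(x, w):
    """push(x, w) = xw, with errors propagated to serr."""
    if x == AERR or w == SERR:
        return SERR
    return (x,) + w

def pop(w):
    """Remove the top letter; popping Lambda or serr gives serr."""
    if w == SERR or w == EMPTY:
        return SERR
    return w[1:]

def top(w):
    """Read the top letter; the top of Lambda or serr is aerr."""
    if w == SERR or w == EMPTY:
        return AERR
    return w[0]

# Sample non-error data over the alphabet E = {'a', 'b'}.
letters = ["a", "b"]
stacks = [EMPTY, ("a",), ("b", "a"), ("a", "a", "b")]

pop_push_ok = all(pop(push(x, w)) == w for x in letters for w in stacks)
top_push_ok = all(top(push(x, w)) == x for x in letters for w in stacks)
```

The two flags test the first two equations of 11 on letters of $E$ and words of $E^*$ only; restricting the sample to non-error arguments is a choice made for this illustration.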
With these examples in tow we may now give the general definitions
needed to define equational specifications.
First of all, data types may require several sets. In Example 6 there were
the sets $A$, $S$. Operations take the form $X_1 \times \cdots \times X_k \to X_{k+1}$, where each $X_i$
is one of $A$, $S$ and if $k = 0$ the operation is a constant of $X_{k+1}$. Now imagine
the syntax of a programming language that declared, for example, 7. First a
set of sorts would be declared, for the two sorts of set. Here {alpha, stack}
would be appropriate. We then might declare 7 with the self-evident syntax
[ → alpha] = x_1, ..., x_p, aerr,
[ → stack] = Λ, serr,
[stack → alpha] = top,
[stack → stack] = pop,
[alpha × stack → stack] = push.
The mathematical definition of a signature is precisely this sort of declaration
save that we write $\Omega_{x_1 \cdots x_n x}$ instead of $[x_1 \times \cdots \times x_n \to x]$. (Note that the
use of $S$ for the set of sorts in 12 should not be confused with $S$ for stacks in 7.)
12 Definition. A signature is $(S, \Omega)$ where $S$ is a finite nonempty set (of sorts)
and $\Omega$ is a family $(\Omega_w \mid w \in S^+)$ (recall $S^+$ is the set of nonempty strings in $S$) of
sets with $\Omega_w \cap \Omega_v = \emptyset$ if $w \ne v$.
13 Definition. An algebra over a signature (S, Q) (usually called an Q-algebra
for short) is a structure (Q, S) where Q is a family (Qxlx E S) of sets, that is,
there is one set of each sort, and c5 is a family of operations (c5",lwE U Qw)
323
14.1 Initial Algebras
where if W E aXl ... XnX' (jw has the form
Q Xl x···xQ Xn ~Q X
(this being a constant in Qx if n = 0).
A signature or any of its algebras are called many-sorted if there is more
than one sort and single-sorted if there is only one sort.
Examples 1 and 5 involve single-sorted algebras. Example 6, as already
discussed, has two sorts.
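For readers who think in terms of data structures, Definitions 12 and 13 can be sketched as follows. This is an illustrative encoding, not part of the text: the dictionary layout and the names STACK_SIGNATURE, sorts_of and is_algebra are assumptions, the constants x_1, ..., x_p are omitted for brevity, and the check only makes sense for finite carriers.

```python
from itertools import product

# Hypothetical encoding: each operation name maps to (argument sorts, result
# sort), so the string w in S+ that indexes the operation is args + (result,).
STACK_SIGNATURE = {
    "aerr": ((), "alph"),
    "empty": ((), "stack"),  # the constant for the empty stack
    "serr": ((), "stack"),
    "top": (("stack",), "alph"),
    "pop": (("stack",), "stack"),
    "push": (("alph", "stack"), "stack"),
}


def sorts_of(signature):
    """Collect every sort mentioned by the signature."""
    found = set()
    for args, result in signature.values():
        found.update(args)
        found.add(result)
    return found


def is_algebra(signature, carriers, ops):
    """Check that each operation maps its argument carriers into its result carrier."""
    for name, (args, result) in signature.items():
        for values in product(*(carriers[s] for s in args)):
            if ops[name](*values) not in carriers[result]:
                return False
    return True


# The terminal algebra: one element of each sort, every operation constant.
TERMINAL_CARRIERS = {"alph": {0}, "stack": {0}}
TERMINAL_OPS = {name: (lambda *args: 0) for name in STACK_SIGNATURE}
```

Because a dictionary has unique keys, each operation name belongs to exactly one arity string, which mirrors the disjointness condition of Definition 12. The signature is many-sorted precisely when sorts_of returns more than one sort.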
14 Definition. The set of terms of a signature (S, Ω) and the type of each term are mutually defined inductively as follows:
(i) If x ∈ S and i is an integer ≥ 1 then v_{x,i} is a term and the type of v_{x,i} is x. (Such v_{x,i} is the ith variable of type x.)
(ii) If ω ∈ Ω_{x_1 ... x_n x} and t_1, ..., t_n are terms such that the type of t_i is x_i then ω(t_1, ..., t_n) is a term of type x.
Intuitively, terms codify what appears on one side of an equation. Thus, 11 provides several examples. For instance, for brevity write x = v_{alph,1}, w = v_{stack,1}. Then
top(push(x, w))
is a term of type alph as follows:
x is a term of type alph, w is a term of type stack by (i);
push(x, w) is a term of type stack by (ii);
top(push(x, w)) is a term of type alph by (ii).
The "variables" in top(push(x, w)) are x, w. If (Q, δ) is any algebra this term is interpreted as a function
Q_alph × Q_stack → Q_alph.
The type of the term is the sort where this function takes values. We leave the precise inductive definition to the reader as Exercise 2. We then have
15 Definition. Let (S, Ω) be a signature. An Ω-equation is any pair of terms of the same type; such an equation is written t_1 = t_2 rather than (t_1, t_2). An Ω-algebra (Q, δ) satisfies the equation t_1 = t_2 if the interpreted functions of t_1, t_2 on (Q, δ) are equal.
The following is then the central definition of this section.
16 Definition. An equational specification is (S, Ω, E), where (S, Ω) is a signature and E is a finite set of Ω-equations. The data type specified by (S, Ω, E) is the initial object of the full subcategory of all Ω-algebras satisfying all equations in E where a morphism f: (Q, δ) → (Q', δ') of Ω-algebras is a family f = (f_x | x ∈ S) where each
f_x: Q_x → Q'_x
is a total function and the algebra structure is preserved in the sense that
17 δ'_ω(f_{x_1}(q_1), ..., f_{x_n}(q_n)) = f_x(δ_ω(q_1, ..., q_n))
for all q_i ∈ Q_{x_i}. (See Exercise 3.) The existence of this initial object is proved as Corollary 14.2.25 below.
One special case is worthy of note here.
18 Theorem. For an equational specification (S, Ω, ∅) with no equations, the specified data type (Q, δ) may be constructed with Q_x the subset of terms of type x which have no variables.
PROOF. See Exercise 7 for an outline. □
Theorem 18 applies to Example 5. If S = {x}, all terms (necessarily of type x) are one of s^n(0), s^n(x) and so the terms with no variables are the s^n(0) as expected.
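The content of Theorem 18 in this case can be made concrete. The sketch below assumes Example 5 is the signature with a constant 0 and a unary s, encodes variable-free terms as nested tuples, and uses the hypothetical names ground_terms and evaluate; evaluate is the map out of the term algebra determined by a choice of zero and successor in a target algebra.

```python
ZERO = ("0",)


def s(t):
    # The unary operation of the term algebra simply builds a bigger term.
    return ("s", t)


def ground_terms(n):
    """Return the variable-free terms 0, s(0), ..., s^n(0)."""
    terms = [ZERO]
    for _ in range(n):
        terms.append(s(terms[-1]))
    return terms


def evaluate(term, zero, succ):
    """The morphism out of the term algebra determined by (zero, succ)."""
    if term == ZERO:
        return zero
    return succ(evaluate(term[1], zero, succ))
```

By construction evaluate(s(t), zero, succ) equals succ(evaluate(t, zero, succ)), which is the morphism condition of 17 for the operation s; the recursion leaves no other choice, which is the uniqueness half of initiality.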
We now wish to develop the idea that Ω-algebras are really X_Ω-algebras in the sense of Chapter 10. This idea is simple. Consider the Ω-algebra appropriate to Example 6, so that such a (Q, δ) has the form
19 x_1, ..., x_p, aerr: 1 → Q_alph, top: Q_stack → Q_alph,
Λ, serr: 1 → Q_stack, pop: Q_stack → Q_stack, push: Q_alph × Q_stack → Q_stack.
Using the device that a family of morphisms A_i → B amounts to a single morphism ∐ A_i → B, all of 19 can be expressed as a single morphism of the form
X_Ω(Q_alph, Q_stack) → (Q_alph, Q_stack).
Here a morphism (A, B) → (A', B') is defined to be a pair of total functions A → A', B → B' with no further conditions, and, recalling the notation [n] = {1, ..., n}, we define
20 X_Ω(Q_alph, Q_stack) = ([p + 1] + Q_stack, [2] + Q_stack + (Q_alph × Q_stack)),
noting that a morphism [n] → Q amounts to a list of n elements of Q.
Viewing things from this point of view allows us, in the next section, to
put equational specifications and functorial least fixed points in a common
framework. This generalizes both theories in a useful way.
The idea of this chapter works for a much larger class of categories than
we actually introduce. This limitation keeps the basic simplicity of the ideas
in focus. Some further developments are suggested in Exercise 9 and some of
the exercises to Section 2.
21 Definition. If S is any nonempty set, the category Set^S is defined as follows. An object A is an S-indexed family of sets, A = (A_x : x ∈ S) with each A_x a set. A morphism f: A → B is a family f = (f_x) of total functions of the form f_x: A_x → B_x for all x ∈ S. Composition is the usual one on each coordinate, that is, for g: B → C, (gf)_x(a) = g_x(f_x(a)) for all x ∈ S, a ∈ A_x. The identity morphism id_A: A → A is defined by (id_A)_x(a) = a. That this constitutes a category is obvious.
22 Definition. If (S, Ω) is any signature we define a functor X_Ω: Set^S → Set^S so that an X_Ω-algebra in the sense of 10.2.1 is essentially the same thing as an Ω-algebra as defined in 13. To this end, if (A_x) is an object in Set^S and w ∈ S* let
23 A_w = A_{x_1} × ... × A_{x_n}
if w = x_1 ... x_n. (If w = Λ, define A_w = {Λ}.) Define X_Ω on objects by
24 X_Ω(A_x) = (B_x), B_x = ∐(A_w | w ∈ S*, ω ∈ Ω_{wx}).
This says that for each w in S*, B_x contains a separate copy of A_w for each ω in Ω_{wx}. (Recall that Definition 12 guarantees that each ω labels an element of at most one Ω_{wx}, so we can write (B_x)_ω = A_w for the corresponding copy. Thus, B_x contains as many copies of A_w as there are elements of Ω_{wx}; if Ω_{wx} is empty for all w, B_x = ∅.)
The reader should now pause to verify that 20 is a special case of 23 providing we identify [n] with an n-fold coproduct of singletons.
We then already see that an X_Ω-algebra δ: X_Ω Q → Q has the form of a family (δ_x | x ∈ S) with δ_x of the form
∐(Q_w | w ∈ S*, ω ∈ Ω_{wx}) → Q_x.
By the coproduct property, each such δ_x has the form ⟨δ_ω | w ∈ S*, ω ∈ Ω_{wx}⟩ where each such δ_ω has the form
Q_w → Q_x.
The resulting family of δ_ω is then exactly the same as in 13 so an X_Ω-algebra "is" an Ω-algebra under the bijective correspondence δ ↔ (δ_ω) induced by the definition of a Set^S-morphism and the coproduct property.
There is no difficulty in defining X_Ω on morphisms to make it a functor in such a way that an X_Ω-algebra morphism corresponds to an Ω-algebra morphism as in 17. We leave proof details to the reader, but the definition is as follows: Given f: A → Ā in Set^S, define X_Ω f = (g_x) where, in the notation of 23, g_x: B_x → B̄_x is defined by 25:
25 g_x(a_1, ..., a_n) = (f_{x_1}(a_1), ..., f_{x_n}(a_n)), from the ω-copy of A_w in B_x to the ω-copy of Ā_w in B̄_x (w = x_1 ... x_n, ω ∈ Ω_{wx}).
26 Example. For X_Ω as in 20, if f: A → Ā then X_Ω f = (g_alph, g_stack). Here,
g_alph: [p + 1] + A_stack → [p + 1] + Ā_stack
maps 1, ..., p + 1 to themselves and maps A_stack to Ā_stack using f_stack, whereas
g_stack: [2] + A_stack + A_alph × A_stack → [2] + Ā_stack + Ā_alph × Ā_stack
is the map
1 ↦ 1,
2 ↦ 2,
a ↦ f_stack(a) (a ∈ A_stack),
(a, b) ↦ (f_alph(a), f_stack(b)) (a ∈ A_alph, b ∈ A_stack).
Having presented the basic definitions, we now assess some of the problems associated with defining data types by equational specification. In
situations such as Examples 1 and 5, where the desired equations seem clear
at the beginning, the method works well. In a situation such as Example 6,
where the intended model seems clearer than the set of equations, serious
problems may arise:
Problem (i). More than finitely many equations may be needed.
Problem (ii). It may be difficult to reconcile the initial object condition
with intuitively correct equations of the intended model.
A different problem, but a crucial one from the point of view of implementation of equational specification, is the following:
Problem (iii). Given an equational specification (S, Ω, E) there may be no
effective algorithm to decide whether or not a given Ω-equation holds in the
specified data type.
Problem (iii) creates situations where nonconstructive mathematical proofs
of the existence of a specified data type exist, but no effective construction of
these data types can be given.
A practical illustration of some of these difficulties arises by grappling with
the relationship between stacks and queues. Our earlier approach in Section
12.2 seemed natural to us. The fundamental set is the set of lists. This is then
interpreted as stacks, or as queues depending on which functions one is
allowed to use. From the point of view of using equational specification, we
have defined stacks in Example 6 and it is still necessary to find appropriate
equations for queues. We have left this to the reader as Exercise 10. Once
this is achieved, it would be a nice victory for equational specification to
mathematically provide natural reasons why the same set-lists-underlies
the data type stacks and queues. Unfortunately, we know of no proof other
than to verify that both initial objects are built from lists. In short, our
criticism is that we have put the cart before the horse in that what we already
knew about stacks and queues is used to build an equational specification
whereas equational specification has not helped us to better understand these
data types.
EXERCISES FOR SECTION 14.1
1. Show explicitly that for f, g in 9 and 10 the equations of 11 force
top(g(w)) = f(top(w)),
g(push(x, w)) = push(f(x), g(w)).
2. For any signature (S, Ω) as discussed just prior to 15 define "the variables of term t" and define "the function interpreting a term on an Ω-algebra" by induction on how the term is built.
3. Prove that with the morphisms of 17 and with composition
(gf)_x = Q_x → Q'_x → Q''_x (f_x followed by g_x)
and identities (id_Q)_x = id_{Q_x} that Ω-algebras form a category.
4. Verify in detail that the single-sorted data type {T, F} of Example 1 is the data type specified by an equational specification in the sense of 16.
5. Repeat Exercise 4 for Example 5.
6. Repeat Exercise 4 for Example 6.
7. Prove Theorem 18. [Hint: For ω ∈ Ω_{x_1 ... x_n x}, let 14(ii) define δ_ω. Exercise 2 provides the initial object property.]
8. Prove carefully that X_Ω is functorial and that the bijective correspondence between X_Ω-algebras and Ω-algebras is in fact an isomorphism of categories in the sense of the exercises of Section 11.3.
9. Generalize Definition 21 to C^S for any category C.
10. Give an equational specification for queues.
14.2 Sur-reflections
In Section 1 we saw that the equational approach to abstract specification of data types is to define a finite set E of equations over some signature (S, Ω), and then define the data type so specified as the initial object of the corresponding category of Ω-algebras. In this section we develop the proof that such an initial object exists. The reader willing to accept this fact without proof may omit this section.
We begin by studying the structure of the category Set^S of 14.1.21.
1 Observation. Set^S has products, coproducts, colimits of right chains and limits of left chains, and these are constructed independently on each coordinate as in Set.
Thus, if A_i = (A_{i,x} | x ∈ S) in Set^S, define A by A_x = ∏_i A_{i,x}. Then with projections pr_i: A → A_i, the usual projection pr_{i,x}: A_x → A_{i,x} for each x, this is a product in Set^S. Coproducts are constructed similarly.
Let us outline most of the details for colimits of right chains. (Limits of left chains are then similar.)
Let A_n → A_{n+1} be a right chain in Set^S. Thus, for each x ∈ S, the chain A_{n,x} → A_{n+1,x} is a right chain in Set and so has a colimit α_{n,x}: A_{n,x} → U_x as in 11.1.7. Thus, α_n: A_n → U is a morphism in Set^S if U = (U_x | x ∈ S). The α_n are compatible with the chain maps in Set^S because the α_{n,x} are compatible with them for each x ∈ S. The rest of the proof that (U, α) is a colimit follows a similar pattern: do whatever is needed for each x separately.
2 Proposition. For every signature (S, Ω), X_Ω: Set^S → Set^S is bicontinuous.
PROOF. This is an easy consequence of 11.2.5-9 and 11.3.8,10. □
In particular, we recapture 14.1.18, the existence of an initial Ω-algebra, by applying the Generalized Kleene theorem 11.2.13.
Of course, we have extra work to do to establish an initial object among those algebras satisfying a given set of equations. According to Definition 14.1.16, we seek the initial object of certain subcategories of Ω-algebras or, equivalently, of X_Ω-algebras. The subcategories are characterized by the satisfaction of equations. We shall first show that such subcategories must be closed under products and subalgebras and then go on to see why any such subcategory must have an initial object. We begin with the relevant definitions.
3 Definition. Let (S, Ω) be a signature and ((Q_i, δ_i) | i ∈ I) be a family of X_Ω-algebras (I is not necessarily countable). Let Q = ∏ Q_i in Set^S. By the product property there exists unique δ with
pr_i δ = δ_i (X_Ω pr_i)
for all i ∈ I. Then (Q, δ) is an X_Ω-algebra. From the Ω-algebra point of view, if ω ∈ Ω_{x_1 ... x_n x} then it is routinely checked that δ_ω: Q_{x_1} × ... × Q_{x_n} → Q_x is defined by
δ_ω((q_{i x_1}, ..., q_{i x_n}) | i ∈ I) = (r_i), r_i = δ_{iω}(q_{i x_1}, ..., q_{i x_n}).
Such (Q, δ) is called the product algebra. See Exercise 3. When I = ∅ the definition is satisfied with the obvious terminal algebra with each Q_x = 1 a one-element set and with the unique morphism X_Ω Q → Q.
4 Definition. Let (S, Ω) be a signature and let (Q, δ) be an X_Ω-algebra. For each x ∈ S let A_x ⊂ Q_x with inclusion map inc_x: A_x → Q_x, inc_x(a) = a. Then A = (A_x) is a subalgebra of (Q, δ) if there exists γ: X_Ω A → A with
inc γ = δ (X_Ω inc).
Evidently, such γ exists in at most one way, and exists if and only if A is "closed under the operations," that is, for ω ∈ Ω_{x_1 ... x_n x} and a_1 ∈ A_{x_1}, ..., a_n ∈ A_{x_n}, δ_ω(a_1, ..., a_n), necessarily in Q_x, is in fact in A_x. Note that (A, γ) is an algebra in its own right. We also say "(A, γ) is a subalgebra of (Q, δ)."
5 Observation. Let (S, Ω) be a signature and let E be any set of Ω-equations. Then the class of all Ω-algebras satisfying all equations in E is closed under products and subalgebras, that is,
(i) if (Q_i, δ_i) satisfies E (i ∈ I) then their product algebra (Q, δ) also satisfies E, and,
(ii) if (Q, δ) satisfies E, and (A, γ) is a subalgebra of (Q, δ) then (A, γ) also satisfies E.
PROOF IDEA. Let t be a term of type x with variables x_1, ..., x_n. Then the interpretation of t on Q is the I-tuple of interpretations on the Q_i, that is, t maps ((q_{ij} | i ∈ I)) ∈ Q_{x_j} to (r_i | i ∈ I) ∈ Q_x, where r_i is the interpreted value of t in (Q_i)_x of (q_{i1}, ..., q_{in}). This is because the Ω-operations in a product are performed on each coordinate separately. Similarly, if A is a subalgebra of (Q, δ) and a_j ∈ A_{x_j}, t maps (a_j) to the same value of A_x regardless of whether t is the interpretation in (A, γ) or (Q, δ) since the operations on a subalgebra just restrict those of the ambient algebra. The conclusions (i) and (ii) above are then clear. □
6 Example. Let (S, Ω, E) be the equational presentation of stacks as in 14.1.6-11. Let (Q, δ) be the initial stack Q_alph = E + {aerr}, Q_stack = E* + {serr} discussed earlier. Then the product (R, γ) = (Q, δ) × (Q, δ) has
R_alph = (E + {aerr}) × (E + {aerr}),
R_stack = (E* + {serr}) × (E* + {serr}).
In (R, γ),
push((x, aerr), (w, v)) = (push(x, w), push(aerr, v)) = (xw, serr),
so that
top(push((x, aerr), (w, v))) = top(xw, serr) = (x, aerr).
This is expected as (R, γ) satisfies all equations in 11.
(R, γ) is very different from the intended model (Q, δ). For example, the constants e_j are (e_j, e_j) ∈ R_alph but R_alph has more arbitrary elements of the form (e_i, e_j). This illustrates how the class of Ω-algebras satisfying E has many models that differ from any "intended" one and that some device such as initiality is needed to specify a particular model.
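The computation in Example 6 can be replayed in code. This is a sketch only: the stack model is encoded with tuples (top element first) and the string markers "aerr" and "serr", and the names push2 and top2 for the coordinatewise operations of the product are hypothetical.

```python
AERR, SERR, EMPTY = "aerr", "serr", ()


def push(x, w):
    # push(aerr, w) = push(x, serr) = serr in the intended model.
    if x == AERR or w == SERR:
        return SERR
    return (x,) + w


def top(w):
    # top of the empty stack or of serr is aerr.
    if w == SERR or w == EMPTY:
        return AERR
    return w[0]


def push2(xy, wv):
    # Operations of the product act on each coordinate separately.
    return (push(xy[0], wv[0]), push(xy[1], wv[1]))


def top2(wv):
    return (top(wv[0]), top(wv[1]))
```

Taking x = "a", w = ("b",) and v = ("c",), push2(("a", AERR), (w, v)) yields a pair whose first coordinate is a proper stack and whose second is serr, so its top is ("a", AERR): an element of R_alph that is neither a constant pair nor the error pair.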
For the balance of this section we fix a finite set S, an arbitrary cocontinuous endofunctor X: Set^S → Set^S and a full subcategory B of the category (10.2.1) X-Alg of X-algebras.
Our main objective is to discover conditions on B which, on the one hand, hold when X has the form X_Ω and B is the subcategory of all Ω-algebras satisfying a set of equations, and which, on the other hand, guarantee that B has an initial object. To this end we require that every X-algebra has a "sur-reflection" in B. The proof that B has an initial object requires some study of surjective functions and "image factorization" in Set. We present these ideas in a way that leads naturally to generalization to more arbitrary categories than Set^S, as is explored in the end-of-chapter notes.
7 Definition. Let (Q, δ) be an X-algebra. A reflection of (Q, δ) in B is a morphism θ: (Q, δ) → (B, γ) in X-Alg such that (B, γ) is in B with the property that whenever θ': (Q, δ) → (B', γ') is an X-algebra morphism
8 θ: (Q, δ) → (B, γ), θ': (Q, δ) → (B', γ'), f: (B, γ) → (B', γ'), fθ = θ'
with (B', γ') in B then there exists unique f: (B, γ) → (B', γ') in X-Alg with fθ = θ' as shown. Clearly, (B, γ, θ) are unique up to isomorphism when they exist. See Exercise 11.
Our interest in reflections lies in the following.
9 Proposition. If the initial X-algebra (Q, δ) has a reflection θ: (Q, δ) → (B, γ) in B, (B, γ) is an initial object of B.
PROOF. (Such initial (Q, δ) exists by the Generalized Kleene theorem 11.2.13 in view of 1 and the assumed co-continuity of X.) Now let (B', γ') in B be arbitrary. As (Q, δ) is initial, there exists unique θ' as in 8 and hence unique f. If g: (B, γ) → (B', γ') is arbitrary then as (Q, δ) is initial necessarily gθ = θ' so that, by the definition of a reflection, g = f. □
10 Example. Consider the single-sorted signature
c, d: 1 → Q.
An algebra is simply a set with two constants. Let B be the full subcategory of all algebras satisfying the equation
c = d.
If (Q, δ) is an arbitrary algebra with constants δ_c, δ_d its reflection in B is obtained by identifying δ_c and δ_d, that is, let
θ: Q → B
be the surjection onto the set B obtained from Q by collapsing δ_c and δ_d to a single element c. That (B, γ) is in B if γ_c = γ_d = c is clear and that θ is a morphism is obvious. Now consider θ' in 8. To see that f as in 8 is induced, it pays to first establish the following fact from set theory which we shall need later as well.
11 Lemma. Let θ: Q → B be a surjective function and let θ': Q → B' be an arbitrary function. Then there exists f: B → B' with
fθ = θ'
if and only if for all q, r ∈ Q, θ'q = θ'r whenever θq = θr. Furthermore, if such f exists it is the only f with fθ = θ'.
PROOF. If fθ = θ' then if θq = θr, θ'q = f(θq) = f(θr) = θ'r. Conversely, define f as follows. Given b ∈ B, as θ is surjective choose q_0 with θq_0 = b and define fb = θ'q_0. Let q ∈ Q. Then by the definition of f, fθq = θ'q_0 where θq_0 = θq. By hypothesis, θ'q_0 = θ'q so fθq = θ'q and this shows fθ = θ'. Finally, f is unique because if also gθ = θ' then if fb = θ'q_0 as above, gb = gθq_0 = θ'q_0 = fb. □
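For finite sets the construction in the proof of Lemma 11 is easy to carry out. The following sketch uses a hypothetical helper factor_through that represents f as a dictionary; it returns None exactly when some pair q, r has the same image under the first map but different images under the second, which is the failure of the condition in the lemma.

```python
def factor_through(theta, theta_prime, Q):
    """Return f (as a dict) with f[theta(q)] == theta_prime(q) for all q in Q,
    or None if theta(q) == theta(r) but theta_prime(q) != theta_prime(r)
    for some q, r in Q."""
    f = {}
    for q in Q:
        b = theta(q)
        value = theta_prime(q)
        if b in f and f[b] != value:
            return None  # the compatibility condition fails
        f[b] = value
    return f
```

The dictionary is defined on the image of the first map, so it is a total function on B exactly when that map is surjective, as the lemma assumes.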
Returning to Example 10, the only situation in which θq = θr with q ≠ r is when {q, r} = {δ_c, δ_d} and in this instance, because θ' is a morphism and (B', γ') is in B, θ'q = θ'r. It follows from Lemma 11 that there exists a unique function f with fθ = θ'. The argument is complete if such f is an algebra morphism. This is established by the following lemma which is also needed later.
12 Lemma. Let θ: (Q, δ) → (B, γ), θ': (Q, δ) → (B', γ') be X-algebra morphisms and let f: B → B' be a morphism in Set^S such that
fθ = θ'
commutes. Then if θ_x: Q_x → B_x is surjective for each x ∈ S, f: (B, γ) → (B', γ') is an X-algebra morphism.
PROOF. Let x ∈ S. As θ_x: Q_x → B_x is surjective there exists d_x: B_x → Q_x with θ_x d_x = id_{B_x}. (If Q_x = ∅ necessarily B_x = ∅ so let d_x = id_∅, else let d_x(b) be any q with θ_x q = b.) This defines d: B → Q in Set^S with θd = id_B. As X is a functor, (Xθ)(Xd) = id_{XB}. It follows that each (Xθ)_x: (XQ)_x → (XB)_x is surjective since for any t ∈ (XB)_x, t = (Xθ)_x u if u = (Xd)_x t. Now consider the diagram
Xθ: XQ → XB, Xf: XB → XB' (top row),
δ: XQ → Q, γ: XB → B, γ': XB' → B' (vertical maps),
θ: Q → B, f: B → B' (bottom row),
in which the right-hand square is marked ?.
The square marked ? is what we need to establish as commutative. But as fθ = θ', (Xf)(Xθ) = X(fθ) = Xθ', the outer rectangle commutes by our assumption that θ' is an algebra morphism. Thus,
(γ'(Xf))Xθ = γ'((Xf)(Xθ)) = fθδ = (fγ)Xθ,
where the last equality uses the assumption that θ is an algebra morphism. Hence, for x ∈ S, (γ'(Xf))_x, (fγ)_x are two functions that agree when preceded by the surjective function (Xθ)_x, and so these are equal. □
We now introduce the central definitions of this section.
13 Definition. A sur-reflection of (Q, δ) in B is a reflection θ: (Q, δ) → (B, γ) of (Q, δ) in B such that θ_x is surjective for all x ∈ S.
Generalizing Definitions 3 and 4 where X = X_Ω, the product (Q, δ) of a family ((Q_i, δ_i) | i ∈ I) of X-algebras is defined by Q = ∏ Q_i (the product in Set^S as in 1) and
pr_i δ = δ_i (X pr_i) (i ∈ I).
Furthermore, if (Q, δ) is an X-algebra and R_x ⊂ Q_x for all x ∈ S then R = (R_x) is a subalgebra of (Q, δ) if there exists (necessarily unique) δ_0: XR → R with
inc δ_0 = δ X(inc)
where inc_x(r) = r are the inclusion maps.
B is a quasivariety if the product of any family of algebras in B is again in B and if every subalgebra of an algebra in B is again in B, that is, "B is closed under products and subalgebras." The empty product is included, that is, we require that the terminal algebra X1 → 1 (where 1_x has one element for all x ∈ S) is in B.
It follows from 5 that the class of all Ω-algebras satisfying a set of equations is a quasivariety.
The main result relating quasivarieties and sur-reflections is the following:
14 Theorem. B is a quasivariety if and only if every X-algebra has a sur-reflection in B, and every X-algebra isomorphic to an algebra in B is in B.
Before proving (half of) this, we note the following:
15 Corollary. Every quasivariety has an initial object.
PROOF. This is immediate from Proposition 9 and Theorem 14. □
As Corollary 15 is our main focus we shall only prove the needed half of Theorem 14. The converse, that sur-reflections imply quasivariety, will be left for Exercise 18. We need a few preliminary results about products, subalgebras, and image factorizations.
16 Lemma. The product algebra is the product in the category of X-algebras.
PROOF. In the notation following 13, let f_i: (R, γ) → (Q_i, δ_i) be morphisms and let f: R → Q be the unique morphism in Set^S with pr_i f = f_i. Then in the diagram
Xf: XR → XQ, Xpr_i: XQ → XQ_i (top row, with composite Xf_i),
γ: XR → R, δ: XQ → Q, δ_i: XQ_i → Q_i (vertical maps),
f: R → Q, pr_i: Q → Q_i (bottom row, with composite f_i),
in which the left-hand square is marked (?),
we must prove that (?) commutes given that the three indicated subdiagrams and the outer rectangles commute. But for each i we have
pr_i(δXf) = (pr_i δ)Xf = δ_i Xpr_i Xf = δ_i Xf_i = f_i γ = pr_i(fγ).
By the uniqueness property of products, δXf = fγ. □
17 Definition. Let f: A → B be a total function. The image factorization of f is (I, p, inc) where
p: A → I, inc: I → B
with I = {b ∈ B: b = fa for some a ∈ A}, inc is the inclusion map inc(c) = c, and p(a) = f(a).
Note that f = inc p with p surjective and inc injective. Uniqueness properties and categorical generalizations are explored in the exercises.
The definition extends in the obvious way to Set^S. If f: A → B is a morphism in Set^S the image factorization of f is (I, p, inc) where, for x ∈ S, (I_x, p_x, inc_x) is the image factorization of f_x.
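Definition 17 is directly computable for finite sets. In the sketch below the name image_factorization is hypothetical and the two maps p and inc are represented as dictionaries.

```python
def image_factorization(f, A):
    """Return (I, p, inc) with I the image of f on A, p(a) = f(a) viewed as a
    map onto I, and inc the inclusion of I into the codomain."""
    I = {f(a) for a in A}
    p = {a: f(a) for a in A}
    inc = {c: c for c in I}
    return I, p, inc
```

For f(a) = a * a on A = {-2, -1, 0, 1, 2} this gives I = {0, 1, 4}; composing the two dictionaries recovers f, the values of p exhaust I (surjectivity), and inc is injective because it is an identity map on I.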
Image factorizations provide subalgebras:
18 Proposition. If f: (Q, δ) → (R, γ) is an X-algebra morphism and if (I, p, inc) is the image factorization of f in Set^S then I is a subalgebra (I, β) of (R, γ) and p: (Q, δ) → (I, β) and inc: (I, β) → (R, γ) are algebra morphisms.
PROOF. To see this it is useful to establish the diagonal fill-in property: Given a commutative square us = it of functions as in 19
19 s: A → B, t: A → C, u: B → D, i: C → D, β: B → C
with s surjective and i injective there exists a unique β with βs = t, iβ = u. To construct a unique β with βs = t use Lemma 11: if sa = sa' then ita = usa = usa' = ita' and, as i is injective, ta = ta'. To see iβ = u observe that each b ∈ B has the form sa so that ub = usa = ita = iβsa = iβb. See Exercise 15.
To apply the diagonal fill-in property, consider 20:
20 Xp: XQ → XI, X(inc): XI → XR (top row),
δ: XQ → Q, β: XI → I, γ: XR → R (vertical maps),
p: Q → I, inc: I → R (bottom row).
Then for each x ∈ S, (Xp)_x is surjective; this was established in the proof of Lemma 12. Thus, 19 applies with s = (Xp)_x, i = inc_x to produce a unique β in Set^S with 20 commutative. In particular, I is a subalgebra and p: (Q, δ) → (I, β), inc: (I, β) → (R, γ) are algebra morphisms. □
We are finally ready to prove the desired half of Theorem 14:
21 Theorem. If B is a quasivariety, every X-algebra has a sur-reflection in B.
PROOF. Let (Q, δ) be an X-algebra. Fix an algebra morphism f_0: (Q, δ) → (B_0, γ_0) with (B_0, γ_0) in B; such always exists since we can choose f_0 the unique morphism to the terminal algebra. For each x ∈ S, q, r ∈ Q_x we define an algebra morphism
22 f_{xqr}: (Q, δ) → (B_{xqr}, γ_{xqr}),
as follows. If 22 exists with f_{xqr}(q) ≠ f_{xqr}(r) choose one. Else for all f: (Q, δ) → (B, γ) with (B, γ) in B, f(q) = f(r); in this case define 22 by f_{xqr} = f_0.
Now let (B̄, γ̄) be the product of all the (B_{xqr}, γ_{xqr}). Then as B is a quasivariety, (B̄, γ̄) is in B. By Lemma 16 there exists a unique algebra morphism f as shown in 23:
23 f: (Q, δ) → (B̄, γ̄), pr_{xqr} f = f_{xqr}.
Let (B, θ, inc) be the image factorization of f. Then B is a subalgebra (B, γ) of (B̄, γ̄) and θ: (Q, δ) → (B, γ) is an algebra morphism by Proposition 18. As B is a quasivariety and (B̄, γ̄) is in B, (B, γ) is in B. We shall show that θ is the desired sur-reflection.
To see that θ is a reflection, let θ': (Q, δ) → (B', γ') be an algebra morphism with (B', γ') in B. To construct g as in 24, we rely on Lemma 11. Let x ∈ S, q, r ∈ Q_x
24 θ: (Q, δ) → (B, γ), inc: (B, γ) → (B̄, γ̄), θ': (Q, δ) → (B', γ'), g: (B, γ) → (B', γ'), gθ = θ'
and suppose θ_x q = θ_x r. Then
f_{xqr}(q) = (pr_{xqr} f)q (by 23)
= (pr_{xqr} inc_x θ_x)q (as f = inc θ)
= (pr_{xqr} inc_x θ_x)r (as θ_x q = θ_x r)
= f_{xqr}(r) (similarly).
So that, by the definition of the f_{xqr}, no morphism exists to an algebra in B distinguishing q and r and, in particular, in 24 θ'q = θ'r. Thus, unique g in 24 exists and such g is an algebra morphism by Lemma 12. □
As mentioned earlier, we have the following important result in the theory of equational specification:
25 Corollary. The data type specified by an equational specification as in 14.1.16 always exists.
PROOF. The full subcategory B of all Ω-algebras (thought of as X_Ω-algebras by 14.1.22-25) satisfying the set of equations E is a quasivariety by 5. Hence, B has an initial object by Theorem 21 and its Corollary 15. □
EXERCISES FOR SECTION 14.2
1. In the context of Exercise 14.1.9, show that Observation 1 holds in C^S providing it holds in C.
2. What difficulties, if any, arise in generalizing Proposition 2 to C^S, if
(i) C = Dom?
(ii) C = Dom_c?
(iii) C = Dom_adj?
3. Show that the product algebra of 3 is indeed the product in the category of algebras (in either the Ω- or X_Ω-sense; it does not matter which because of Exercises 14.1.8 and 11.3.2).
4. Show that vector spaces constitute the full subcategory of algebras of a single-sorted signature with nullary 0, a unary r for each real number r (for the operations of scalar multiplication, q ↦ rq), and binary + which satisfy a set of equations (the usual ones). Show that the concept of a "vector subspace" coincides with that of a subalgebra as in 4. What is the initial vector space?
5. Let (Q, δ) be the data type specified by (S, Ω, E) as in 14.1.16. Prove that the only subalgebra A of (Q, δ) is A = Q. [Hint: Use the initial algebra property to prove that inc: A → Q is surjective. Where is Observation 5(ii) used?]
Two equational specifications (S, Ω, E), (S, Ω', E') on the same sort set are equivalent if for each Q in Set^S there is a bijection δ ↔ δ' between Ω-algebras (Q, δ) satisfying E and Ω'-algebras (Q, δ') satisfying E' such that for f: Q → R in Set^S, f is an Ω-algebra morphism (Q, δ) → (R, γ) if and only if f is an Ω'-algebra morphism (Q, δ') → (R, γ').
6. Show that the two categories of algebras for equivalent specifications are isomorphic categories in the sense of the exercises to Section 11.3.
7. Show that the equational specification for Boolean algebras of Example 14.1.1 is
equivalent to
with equations
¬F = T,
¬T = F,
T ∧ T = T,
T ∧ F = F ∧ T = F ∧ F = F.
8. The concept of a group (Exercise 2.2.11) may be captured by the single-sorted specification
e: 1 → G, ( )^{-1}: G → G, ·: G² → G
with equations
a·(b·c) = (a·b)·c,
a·e = a = e·a,
a·a^{-1} = e = a^{-1}·a.
Show that this specification is equivalent to
÷: G² → G
subject to the single equation
a ÷ ((((a ÷ a) ÷ b) ÷ c) ÷ (((a ÷ a) ÷ a) ÷ c)) = b.
[Hint: The bijection δ ↔ δ' is given by a ÷ b = a·b^{-1}; e = a ÷ a (you must prove independence of choice of a), a^{-1} = e ÷ a, a·b = a ÷ b^{-1}.] This example shows that it may be far from obvious that two specifications are equivalent.
9. Let C be the category of join-semilattices (3.3.3) whose morphisms are defined by the condition f(x ∨ y) = f(x) ∨ f(y). Let B be the class of all single-sorted algebras with signature
∨: Q² → Q
subject to the equations
q ∨ (r ∨ s) = (q ∨ r) ∨ s,
q ∨ r = r ∨ q,
q ∨ q = q.
Show that B, C are isomorphic categories. [Hint: (Q, ≤) ↦ (Q, δ) via the usual q ∨ r whereas (Q, δ) ↦ (Q, ≤) if q ≤ r means q ∨ r = r.] This shows that a nonequational structure (e.g., a semilattice qua poset) may be reexpressible in equational form.
10. Prove that any coproduct of functors of the form X_Ω also has this form. Conclude that X: Set → Set, XQ = A + Q* as in Exercise 11.2.2 has the form X_Ω. Find a suitable Ω explicitly.
11. Prove that reflections, as defined in 7, are unique up to isomorphism. [Hint: A reflection is just an initial object in a suitable category. If (B', γ', θ') is another reflection, f in 8 will be an isomorphism.]
12. Let (S, Ω) be a signature and let E be the set of all equations
v_{x,1} = v_{x,2} (x ∈ S).
Let B be the full subcategory of all Ω-algebras satisfying E. Give a direct construction of the sur-reflection of each Ω-algebra in B (bypassing the existence proof of this section). [Hint: If (B, γ) is in B, B_x has at most one element.]
13. Let X: Set → Set be the polynomial functor XQ = 1 + A × Q with A finite. This has the form X_Ω for the single-sorted signature with S = {x}, Ω_x = 1, Ω_{xx} = A. The initial Ω-algebra is then the least fixed point A* as in 12.2.9. A subset of A may be viewed as a list in A for which order and repetition are not important. This may be expressed by the set E of equations
a(b(v)) = b(a(v)),
a(a(v)) = a(v),
where a, b ∈ Ω_{xx} and v = v_{x,1} is a variable.
Show that the data type specified by (S, Ω, E) is 2^A. [Hint: Directly construct a sur-reflection θ: A* → 2^A where θ(w) = set of symbols occurring in w.]
14. Let (R, γ) be a subalgebra of (Q, δ) with inclusions inc: R → Q. Let (U, α) be an algebra and let f: U → R be functions such that inc f: (U, α) → (Q, δ) is an algebra morphism. Prove that f: (U, α) → (R, γ) is an algebra morphism.
15. Let θ: Q → B be a function. Prove that θ is surjective if and only if for arbitrary θ': Q → B' there exists at most one f: B → B' with
fθ = θ'.
Let i: B → R be a function. Prove that i is injective if and only if for arbitrary i': B' → R there exists at most one g: B' → B with
ig = i'.
16. In an arbitrary category, a morphism θ is an epimorphism if, as in Exercise 15, for all θ' there is at most one f with fθ = θ'. Dually, i is a monomorphism if, as in Exercise 15, for all i' there exists at most one g with ig = i'.
Prove the following:
(i) If f: A → B, g: B → C are epimorphisms, so is gf.
(ii) If f: A → B, g: B → C then if gf is an epimorphism, so is g.
(iii) idA: A -+ A is an epimorphism.
State (and hence there is no need to prove) the dual result for monomorphisms.
17. Consider the single-sorted signature of Exercise 10 so that X = X_Ω: Set → Set, XQ = A + Q*. Let B be the full subcategory of all X-algebras (Q, δ) for which there exists a monoid structure on Q for which the restriction of δ to Q* is the map
Λ ↦ monoid unit,
q_1 ... q_n ↦ monoid product q_1 ··· q_n.
Prove that B is a quasivariety and that its initial object is (A*, μ) where μ: A + (A*)* → A* maps elements of A to length-1 words and words of words to single long words by merging, for example, (aba)(bc) of length 2 in (A*)* to ababc of length 5 in A*.
18. Prove the converse part of Theorem 21 by showing that if every object has a
sur-reflection in B then the reflection maps () of a product of B-objects or of a
subalgebra of a B-object are isomorphisms.
19. Let B be a full subcategory of X-algebras. Say that B is closed under images if
whenever f: (B, γ) -> (Q, δ) is an algebra morphism with each fx surjective then
if (B, γ) is in B, also (Q, δ) is in B. Show that if X = Xn and B is the set of all
X -algebras satisfying a set of equations then B is closed under images.
Notes and References for Chapter 14
Initial algebras with data types in mind are due to J. A. Goguen, J. W. Thatcher,
E. G. Wagner, and J. B. Wright, "Initial algebra semantics and continuous algebras,"
Journal of the Association for Computing Machinery, 24, 1977, pp. 68-95. This paper
is worth the attention of the reader who has made it this far in the book. Many-sorted
algebras were studied earlier by universal algebraists.
For much more detail and further topics and references in the area of equational
specification, see H. Ehrig and B. Mahr, Fundamentals of Algebraic Specification 1,
Springer-Verlag, 1985. This book discusses, and provides references for, the difficulties
mentioned at the end of Section 1. See pages 305-306 for references to languages such
as CLEAR, ACT ONE, and CLU as mentioned in the chapter introduction.
Theorem 14.2.14 may be generalized to full subcategories of X -algebras for any
category C with products and "image factorization system" (E, M) as long as X maps
morphisms in E into E. The necessary definitions and results may be found in the
authors' text cited in the notes to Chapter 2. The interested reader may wish to pursue
this and explore its applicability to Doms, (Domc)S, and (Domadj)s.
The term "quasivariety" is used because a more central concept in universal
algebra (the study of operations and equations; see G. Grätzer, Universal Algebra,
Van Nostrand, 1968) is that of a variety, that is, a quasivariety which is additionally
closed under homomorphic images as defined in Exercise 14.2.19. This exercise
establishes that if B is a full subcategory of Xn-algebras then if B is the class satisfying
a set of equations, B is a variety. The converse is true by a celebrated theorem of
Garrett Birkhoff (proved in 1935 in the single-sorted case). For a discussion of the
many-sorted case see G. Birkhoff and J. D. Lipson, "Heterogeneous algebras," Journal
of Combinatorial Theory, 8, 1970, pp. 115-133, and Chapter 4 of the Ehrig-Mahr
book cited above. A variety need not be describable by only finitely many equations,
however, and the proof that the set of equations exists must be regarded as nonconstructive in general. Exercise 14.1.17 illustrates that the needed set of equations
may not be obvious.
The term "reflection" was introduced by Freyd in the exercises to the book cited
in Chapter 2. All the texts on category theory cited there prove Freyd's "General
adjoint functor theorem." Readers familiar with this result could obtain a much
quicker proof of Theorem 14.2.21.
Epilogue
A number of major mathematical structures were introduced in this book in
the context of their application to program semantics. The reader interested
in further pursuit will discover that a voluminous body of theory and open
problems exists for each structure. The most important concepts discussed
were:
1. Partially Ordered Sets (Posets). These are sets equipped with an order
relation ≤ subject to axioms. In general, there may be x, y such that
neither x ≤ y nor y ≤ x holds. Domains are posets in which there is a
least element and each ascending chain has a least upper bound.
The Kleene fixed point theorem asserts that each function f from a
domain to itself which is continuous, that is, preserves least upper bounds
of ascending chains, has a least fixed point x, that is, fx = x and if
fy = y then x ≤ y. Boolean algebras are special posets in which the
standard Boolean operations such as or, and, not, and if-then-else are defined.
2. Categories. These have objects and morphisms. Mathematically (but not
conceptually from the point of view of this book), categories are
generalizations of posets. Least upper bounds and greatest lower bounds
generalize to coproducts and products. Similarly, least upper bounds of
ascending chains generalize to colimits of right chains with limits of left
chains as the dual concept. Isomorphic objects in a category are "abstractly
the same." An initial object admits a unique morphism to each object
and a terminal object admits a unique morphism from every object. Zero
morphisms generalize the totally undefined partial function. In a
Cartesian-closed category, "lambda conversion" of morphisms is possible.
3. Metric Spaces. These are sets equipped with a distance function d(x, y)
subject to axioms. In a complete metric space each sequence x_1, x_2, x_3, ...
whose elements approach each other (i.e., d(x_m, x_n) -> 0) converges to a
limit x (i.e., d(x, x_n) -> 0). The Banach fixed point theorem states that if f
is a function from a complete metric space to itself which is a contraction
in that there exists K < 1 with d(fx, fy) ≤ K d(x, y) for all x, y, then f has
a unique fixed point x = fx.
4. Functors. Roughly speaking, these are structure-preserving functions
between categories. The generalized Kleene fixed point theorem
establishes that a functor from an appropriate category to itself which is
co-continuous by virtue of preserving colimits of right chains has a
least fixed point. The dual theorem provides greatest fixed points for
continuous functors. Bicontinuous functors are both continuous and
co-continuous, and polynomial functors, which are built up from constant
functors and the identity functor by using product and coproduct, are
bicontinuous on appropriate categories.
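As an illustration of the Banach fixed point theorem in item 3, the following is a minimal sketch, not taken from the text, that iterates a contraction on the real line (a complete metric space under d(x, y) = |x - y|). The function names and the particular map f(x) = x/2 + 1, which has K = 1/2, are assumptions chosen only for the example.

```python
def banach_fixed_point(f, x0, tol=1e-12, max_iter=10_000):
    """Iterate x0, f(x0), f(f(x0)), ... until two successive terms
    are within tol of each other, then return the later term."""
    x = x0
    for _ in range(max_iter):
        nxt = f(x)
        if abs(nxt - x) <= tol:
            return nxt
        x = nxt
    raise RuntimeError("no convergence within max_iter steps")


def half_plus_one(x):
    # Hypothetical example map: |f(x) - f(y)| = |x - y| / 2, so K = 1/2.
    # Solving x = x/2 + 1 gives the unique fixed point x = 2.
    return x / 2 + 1


# The limit does not depend on the starting point.
from_zero = banach_fixed_point(half_plus_one, 0.0)
from_hundred = banach_fixed_point(half_plus_one, 100.0)
```

Both runs approach 2, which reflects the uniqueness clause of the theorem: any starting point yields the same fixed point.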
Additionally, we discussed the partially additive monoids introduced by the
authors in 1980. These are sets equipped with a sum operation that applies to
some but not all countable families, subject to axioms. In a partially additive
category the set of morphisms between two objects is a partially additive
monoid (and there are other axioms as well).
The fundamental problem addressed in the first two parts of the book
was how to give the overall semantics of a recursive or iterative program
given the semantics of the pieces. We achieved this by considering "semantic
categories" C with enough structure so that the infinitary process of repeated
call found suitable expression, important examples being Pfn (sets and partial
functions) and Mfn (sets with multivalued functions). We emphasized two
approaches to determine the desired element f of the set C(X, Y) of
C-morphisms from X to Y as the semantics of a recursive specification.
In the first, f is the least fixed point of an appropriate continuous function
ψ mapping C(X, Y) to itself, as constructed by the Kleene fixed point
theorem. Here, the semantic category must be designed so that C(X, Y) is a
domain. In the second, f is the pattern-of-calls expansion which arises when
C is a partially additive category as a pertinent fixed point of a requisite
power-series map ψ from C(X, Y) to itself. In the most important cases, both
approaches are applicable, and they yield the same f, but in "orthogonal"
ways-the least fixed point is, roughly speaking, the limit of "up to n calls"
as n increases whereas the pattern-of-calls expansion is the sum of each of the
possible (perhaps infinite) computation paths in the tree-of-call.
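The "up to n calls" reading of the least fixed point can be sketched concretely. The example below is an illustration under stated assumptions rather than the book's own construction: partial functions on the natural numbers are stored as Python dictionaries, the recursive specification is the familiar factorial, and the names psi and kleene_sequence are hypothetical.

```python
def psi(f):
    """One unfolding of the specification
        fact(n) = if n == 0 then 1 else n * fact(n - 1),
    applied to a partial function f stored as a dict.
    The result is defined at 0 and at n + 1 whenever f is defined at n."""
    g = {0: 1}
    for n, value in f.items():
        g[n + 1] = (n + 1) * value
    return g


def kleene_sequence(steps):
    """Return the chain f0, psi(f0), psi(psi(f0)), ... of length steps + 1,
    starting from the everywhere-undefined partial function."""
    f = {}  # least element of the extension ordering
    chain = [f]
    for _ in range(steps):
        f = psi(f)
        chain.append(f)
    return chain


chain = kleene_sequence(5)
```

Each member of the chain extends the previous one, and the k-th member is defined exactly on the arguments that need fewer than k recursive calls. The least fixed point itself is the union of the whole chain, which the code only approximates by finitely many stages.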
At the level of abstract trees, the infinite tree-of-call just mentioned arises
as the unique fixed point of a contraction on a suitable complete metric space
of trees, as constructed by the Banach fixed point theorem.
All three fixed points are examples of a canonical fixed point. This definition
formalizes the idea that, in each case, the fixed point assigned to each member
of a class of functions is constructed "the same way in all cases." Because the
definition uses only morphisms (rather than specific structure on the objects
such as domain, partially additive, metric) it offers the mathematician more
leeway in seeking useful semantic categories.
In the third part of the text we considered the problem of data type
specification. In our opinion, the concepts we are trying to formalize are not
as crystallized at this time in the computer science community as those
involved in program flow and recursion, so that we have resisted any attempt
at a definitive treatment. We have not, for example, formally defined "data
type." Our philosophy, however, is that a data type is an object or finitely
many objects in a semantic category C together with a specified finite
collection of morphisms between them. We have taken the stand that a
number of different tools can be employed to construct these objects and
morphisms (and, implicitly, that such tools be considered for implementation
in programming languages).
The simplest data types such as arrays should be constructed directly
(in this case using products in C). Recursively defined data types such as
lists may be defined as functorial fixed points using the generalized Kleene
theorem or its dual. Many data types are usefully defined using equational
specification.
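To make the idea of a recursively defined data type as a functorial fixed point more tangible, here is a small sketch. It assumes the usual reading of lists over A as a least fixed point of the functor sending X to 1 + A x X; the encoding (None for the summand 1, a pair for A x X) and the names cons, NIL and fold are illustrative choices, not notation from the text.

```python
NIL = None  # the single element of the summand "1"


def cons(head, tail):
    """An element of the summand A x L: a head together with a shorter list."""
    return (head, tail)


def fold(nil_case, cons_case, xs):
    """Evaluate a list in another algebra for the same functor.
    nil_case interprets the summand 1 and cons_case interprets A x X."""
    if xs is None:
        return nil_case
    head, tail = xs
    return cons_case(head, fold(nil_case, cons_case, tail))


xs = cons(1, cons(2, cons(3, NIL)))

# Two different target algebras on the integers.
length = fold(0, lambda _head, n: n + 1, xs)
total = fold(0, lambda head, s: head + s, xs)
```

The function fold plays the role of the unique morphism out of the initial algebra: once the two cases of the target algebra are fixed, the value on every list is determined.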
We also discussed the work of Dana Scott in constructing a domain
isomorphism D ≅ A + [D -> D]. While emphasizing that the specification
of such data types is not likely to arise in practical semantics, the existence
of such isomorphisms is mathematically important in establishing the consistency of mathematical frameworks in which functions can be passed as
arguments to procedures.
The above mathematical structures and applications provide the reader
with a varied set of algebraic techniques and frameworks useful in the study
of the semantics of current and future programming languages. Progress
in concurrency, parallel-processing, and many other areas will profoundly
affect the design of programming languages. Yet it is our hope that basic
concepts of program and data type construction can be formalized in such
new environments by choosing an appropriate category for denotational
semantics and then applying theory, such as that of this text, which works for
a broad class of categories.
Author Index
A
Ackermann, W. 145
Adámek, J. 70, 257
Alagic, S. 37, 114
Arbib, M.A. 37, 70, 96, 114, 145, 179, 209, 231, 257, 317
Arden, D.N. 175
B
Backus, J. 1, 37
Barr, M. 257
Benson, D.B. 97
Birkhoff, G. 340
Bloom, S.L. 231
C
Church, A. 306
D
Davis, M. 317
De Bakker, J.W. 37
De Roever, W.P. Jr. 37
Dijkstra, E.W. 37, 99, 114
E
Ehrig, H. 339, 340
Eilenberg, S. 257
Elgot, C.C. 70, 97, 231, 257
F
Floyd, R. 37, 39
Freyd, P. 70, 114, 257, 340
G
Gödel, K. 317
Goguen, J.A. 339
Goldblatt, R. 70, 292, 317
Gordon, M.J.C. 317
Grätzer, G. 340
Gries, D. 114
Guessarian, I. 175, 231
H
Halmos, P.R. 96
Herrlich, H. 70
Hilbert, D. 145
Hoare, C.A.R. 37, 99
J
Jacobson, N. 37
Jensen, K. 37
K
Karp, R.M. 37
Kfoury, A.J. 37, 317
Kleene, S.C. 145, 175
Knaster, B. 175
Koubek, V. 257
Krause, E. 231
L
Lagarias, J.C. 145
Lallement, G. 231
Lambek, J. 257
Lawvere, F.W. 70, 317
Lehmann, D. 257, 292
Lipson, J.D. 340
M
Mac Lane, S. 70, 257
Mahr, B. 339, 340
Manes, E.G. 70, 96, 97, 114, 145, 179, 209, 257, 317
Manna, Z. 145
McCarthy, J. 145
Meertens, L.G. 37
Milner, R. 317
Mitchell, B. 70, 114
Moll, R.N. 37, 317
N
Nivat, M. 231
P
Padulo, L. 231
Plotkin, G. 257, 317
R
Rogers, H. Jr. 145
S
Schützenberger, M. 231
Scott, D.S. 37, 145, 175, 209, 293, 305, 317, 343
Smyth, M.B. 257, 278
Steenstrup, M.E. 96, 317
Stoy, J. 37, 317
Strachey, C. 37, 293, 305, 317
Strecker, G.E. 70
T
Tarski, A. 175
Thatcher, J.W. 339
Tindell, R. 231
Trnková, V. 257
W
Wagner, E.G. 317
Wand, M. 257
Wirth, N. 37
Wright, J.B. 339
Subject Index
A
abstraction map 302
abstract iterative program 159
abstract syntax 187
Ackermann function 121, 137, 145,224
Adamczyk, K. 145
additive domain 194
additive function 181
m- 181
strongly m- 205
adjoint 308
-Alg 246
algebra
of endofunctor 246
over a signature 322
alternative construct 33, 93
ANMfn 40
APL 25,184
approximating sequence 273
approximation ordering 149,295
array 280
assertion 4
associativity 23, 72
automaton 174, 253
B
Banach fixed point theorem 215
bicontinuous functor 275
bi-index principle 280
Boolean algebra 88
C
CIM ,.,,) 240
C IP ,") 45, 238
Cantor diagonal argument 244
CAR 265
cartesian product 57
case statement 32, 93
category 39
cartesian-closed 300
discrete 46
ordered partially additive 194
of PAR-schemes 186
partially additive 79
of recursion schemes 176
Cauchy sequence 214
CDR 265
center 96
chain
ascending 149
descending 272
left 272
right 259
co- 50
-coAlg 246
coalgebra of endofunctor 246
co-continuous functor 267
separately 288
codomain 35
coequalizer 278
colimit of right chain 260
comparison map 247
complement 87
composition
in a category 39
of functors 242
of multifunctions 22
of partial functions 14
concatenation 43, 248
concurrent programs 317
conditional (if-then-else) 33, 82, 93
generalized 82
context-free grammar 169
continuous function 154
continuous functor 274
contraction 215
coproduct 62
of functors 242
D
determinant 183
deterministic morphism 114
diagonal fill-in 334
direct sum system 101
disjoint union 58, 282
distributive law 30,31,75,95
Dom 308
Dom_adj 308
Dom, 297
domain (of morphism) 39
domain (poset) 149
additive 194
coproduct 297, 311
flat 150
function-space 296
power 317
product 151, 158, 297, 311
domain of definition 13, 95
DTN 12,288
DTN 12
dual category 50
dynamic tree 289
of numerals 12
E
e (= abstract syntax) 187
If' (= pattern-of-calls expansion) 189
element 282, 303
Elgot iteration equation 245
empty family 72
empty string 43
endofunctor 240
epimorphism 339
=.bb 17
equation 323
equational specification 323
equivalent s 336
evaluation morphism 300
extension ordering 42, 90, 95, 114, 125,
148, 149
F
family notation 58, 72
fibonacci function 128, 139
fixed point 124, 153
canonical 177
of endofunctor 246
greatest 247
least 153, 247
FPF (= functional programming fragment) 12, 203, 291
multifunction semantics for 25
full subcategory 43
function
additive 181
bijective 47
continuous 154
everywhere undefined 20,30, 121, 125
guard 32,89
inclusion 32, 89
injective 20
m-additive 181
multi- 21
partial 13
primitive recursive 128, 144
semantically equivalent s 11, 20
separately continuous 168
strict 177
surjective 20
total 14
functional programming languages
function space
domain 296
object 300
functor 240
bicontinuous 275
category 244
co-continuous 267
constant 241
continuous 274
coproduct of s 242
identity 241
polynomial 243
product of s 241
separately co-continuous 288
functoriality axioms 240
FwR 46
G
generalized conditional 82
Gödel number 307
greatest element 86
greatest lower bound (see Infimum)
group 56, 337
Guard(X) 89
guarded command 32
guard transformer 113
H
Hasse diagram 42
Heyting algebra 301
homomorphism
canonical, of PAR schemes 191
of monoids 43, 240
of PAR schemes 186
I
identity morphism 39
if-then-else (see Conditional) 95, 186, 207
image factorization 334
infimum
arbitrary 147
binary 86, 94
initial object 48
initial value problem 218
injection morphism 63
integral operator 219
interpretation of syntax tree 188
inverse
in a category 47
in a group 56
invertible 56
isomorphic 48
functors 244
isomorphism 47
of categories 277
iterate 83, 85
functorial 245
J
join (see Supremum)
join-semilattice 86, 337
K
kernel 107
kernel-domain
decomposition 104
system 104
Kleene fixed point theorem 154
generalized 270
Kleene semantics 125, 152, 198
Kleene sequence 125, 152
L
L(G) 169
lambda-abstraction 300
language theory notations 248
lattice 86
distributive 88
Lawvere diagonal argument 304
least element 55, 86
least upper bound (see Supremum)
least upper bound axiom 230
left chain 272
limit
of left chain 272
in a metric space 213
line tying morphism 66
Lipschitz condition 219
LISP 12, 139, 265, 300
lower bound 94, 147, 272
M
many-sorted 323
matrix
of languages 173
over Pfn(X,X) 144, 159
meet (see Infimum)
meet -semilattice 86
metric 211
discrete 217
Manhattan 211, 231
non-Archimedian 212
metric space 211
complete 215
metric subspace 213
closed 217
Mfn 40
Mfn(X,Y) 21
Mfn~ 46
modus ponens 301
Mon 43
monoid 43
monoid homomorphism 43
monomorphism 339
monotone map 43
morphism 39
reliable 46
total 52
N
naturally equivalent 244
natural transformation 244
O
ob(C) 39
observability map 256
(_)OP 50
opposite category 50
P
𝒫(-) 22
PAR scheme 183
parallel construction 66, 242
partial correctness 99, 200
specification 99
partial function 13
with reliability 46
partially additive
category 79
monoid 72
ordered category 194
recursive scheme 183
semiring 95
structure 75
partially ordered set (= poset) 41
partition 72
partition-associativity 72
Pascal 37
pattern of calls 131, 134
expansion 134, 189, 198
polynomial 204, 243
functor 243, 312
map 183, 185, 204
poset (= partially ordered set) 41
complete 147
consistently complete 152, 199
discretely-ordered 150
Poset 43
postcondition 99
power series 204
map 183, 204
recursive specification 205
scheme 204
power set 22
precondition 99
primitive recursion 128
product
of algebras 329, 333, 336
of domains 151, 158
of functors 241
of partially additive monoids 206
projection function 59
projection morphism 60
Q
quasi projection 76
quasivariety 333
queue 283
R
reachability map 254
recursive specification
on Pfn(X,Y) 125
on a domain 152
reflection 330, 340
repetitive construct 35, 93
S
SC(X,Y) 26
semantically equivalent 11, 20
semantic category 26
semantics 3
assertion 3, 4, 98
denotational 3
operational 3, 4, 69
partially additive 26
Set 40
s-expression 265
signature 322
simple recursion 53, 252
single-sorted 323
sorts 322
stack 283, 321, 330
strict continuous map 297
subalgebra 329, 333
subcategory 44
sum
of multifunctions 28
of partial functions 29
summable 72
sum-ordering 90, 95, 194
supremum (= least upper bound)
arbitrary 147
ascending chain 260
binary 86, 94
sur-reflection 333
T
terminal object 49
terms 323
test 32
total correctness 99, 201
total morphism 52
Tot(X,Y) 14
totalizer 56, 111
totally ordered set 42
trace semantics 249
tree 224
tree induction rule 200
U
unit interval 96
upper bound 94, 147,259
V
variety 340
Vect 44
W
weakest liberal precondition (wlp(S,-)) 99
weakest precondition (wp(S,-)) 100
while-do 34, 84, 85, 93, 96
generalized 83
Z
zero
morphisms 51
object 52
in partially additive monoid 73
Texts and Monographs in Computer Science
Suad Alagic
Relational Database Technology
Suad Alagic and Michael A. Arbib
The Design of Well-Structured and Correct Programs
S. Thomas Alexander
Adaptive Signal Processing: Theory and Applications
Michael A. Arbib, A. J. Kfoury, and Robert N. Moll
A Basis for Theoretical Computer Science
Michael A. Arbib and Ernest G. Manes
Algebraic Approaches to Program Semantics
F. L. Bauer and H. Wössner
Algorithmic Language and Program Development
Kaare Christian
The Guide to Modula-2
Edsger W. Dijkstra
Selected Writings on Computing: A Personal Perspective
Nissim Francez
Fairness
Peter W. Frey, Ed.
Chess Skill in Man and Machine, 2nd Edition
R. T. Gregory and E. V. Krishnamurthy
Methods and Applications of Error-Free Computation
David Gries, Ed.
Programming Methodology: A Collection of Articles by Members of IFIP WG2.3
David Gries
The Science of Programming
A. J. Kfoury, Robert N. Moll, and Michael A. Arbib
A Programming Approach to Computability
E. V. Krishnamurthy
Error-Free Polynomial Matrix Computations
Franco P. Preparata and Michael Ian Shamos
Computational Geometry: An Introduction
Brian Randell, Ed.
The Origins of Digital Computers: Selected Papers
Arto Salomaa and Matti Soittola
Automata-Theoretic Aspects of Formal Power Series
Jeffrey R. Sampson
Adaptive Information Processing: An Introductory Survey
William M. Waite and Gerhard Goos
Compiler Construction
Niklaus Wirth
Programming In Modula-2