Author: Rossi H.
Tags: mathematics higher mathematics mathematical analysis natural sciences
Year: 1970
ADVANCED
CALCULUS
PROBLEMS AND APPLICATIONS TO
SCIENCE AND ENGINEERING
Hugo Rossi
Brandeis University
W. A. BENJAMIN, INC., New York ■ 1970
Copyright © 1970 by W. A. Benjamin, Inc.
All rights reserved
Library of Congress Catalog Card Number 70-110973
Manufactured in the United States of America
12345R2109
W. A. BENJAMIN, INC., New York, New York 10016
PREFACE
During the 19th and early 20th centuries the curriculum in advanced
mathematics centered around the Cours d'Analyse: the course in
mathematical analysis. This three-or-more year study was a catalog of the concepts,
techniques, and accomplishments of the calculus throughout mathematics
and physics. During this period we saw first the emergence of differential
geometry and complex analysis as separate disciplines, then the development
of linear algebra, group theory, and other branches of algebra and the
beginning of an intensive research into foundations of, and formulation of
mathematical concepts and techniques. During this time the Cours
d'Analyse fattened, as more accomplishments of calculus were added to the
catalog. The mathematical research of the late 19th and early 20th centuries
brought about a revolution in mathematics which not only opened up broad
new areas but changed the basic approach to the subject itself. Another
level of abstraction was attained, from which it became possible to scan large
areas of mathematical research, observing new relations and interconnections
with a more profound understanding and exposing new frontiers of discovery.
What is more, it became necessary for all scientists to attain this new level.
By the mid-1920's it was apparent that the Cours d'Analyse was massively
unwieldy as well as out-of-date. Thus it fragmented into a collection of
smaller disciplines; some remained (calculus, differential equations,
differential geometry), others disappeared and were replaced by courses in more
recent mathematics (point set topology, algebra, potential theory,
integration theory). A piece of advanced calculus, which was important but
essentially unchanged remained (series expansion, vector calculus, partial
differential equations, calculus of variations). This was the course in
advanced calculus. However, research in mathematics during the past forty
years has been extensive in these particular subjects. The intensive and far-
reaching developments in the study of differentiable manifolds and partial
differential equations have cast advanced calculus in a new and important
light. It was then clear that geometry and algebra form two important
cornerstones for advanced calculus. In 1957, Nickerson, Spencer, and
Steenrod wrote a new advanced calculus textbook which was in effect an
introduction to the techniques of modern analysis. This book bore little
resemblance to the existing texts in the subject, and was not successful in
replacing them. However, it made the others obsolete; every text written
since then must reckon with the Nickerson-Spencer-Steenrod conception
of advanced calculus.
In 1963, I taught from that text to a class of exceptionally brilliant students
at Princeton University. I believe that course was successful and influential
for those students. As the text has no illustrative material, I developed a
set of notes of "classical" advanced calculus which we used as a supplement
to the text. This was the beginning of the present textbook of advanced
calculus.
I began to feel that indeed algebra, geometry, and topology are
cornerstones of modern mathematical analysis, but so is "classical" advanced
calculus. I decided that we needed a bridge between freshman calculus and
modern analysis which leaned heavily upon the techniques of algebra and
the concepts of geometry. This text is an attempt at such a bridge. In
1967, I taught from a preliminary edition to a class of physics-motivated
juniors, and in 1968 I taught from what is essentially the present text to a
class of sophomores. These two classes have had a profound influence on
the development of the text and I am deeply indebted to them for assistance
in matters of style and pedagogy.
Needless to say, I did not complete the entire text in either year. As a
text for juniors we covered a now-extinct chapter on metric spaces and
Chapters 4-8, and in the class of sophomores we covered Chapters 1, 2, 3, 5,
and 7. I feel that either of these is an adequate year's course, depending
upon the structure of the preceding and subsequent courses. This text
assumes only a course in calculus that includes analytic geometry, multiple
integration, and partial differentiation. These topics are speedily reviewed
in Chapter 2 with a view to setting up the style of the present text as well as
indicating the more valuable facts and concepts of calculus. Chapter 2
includes the abstract formulation of the technique of successive
approximations; this is, of course, the basic theoretical tool in advanced calculus.
Chapter 1 is hardly a course in linear algebra; it is rather a tour through
those algebraic ideas and techniques which are essential to analysis. It is
a large chapter and it is very likely that the student will become anxious to
return to his analytic tools before the chapter is completed. It can thus be
split up: Sections 1.3-1.8 are relevant to Chapters 3 and 4, because the topic
of differential equations is consistently handled in the context of systems.
The last four sections can be postponed until the student reaches Chapters
6-8, for it is with the geometric study of Fourier series that they begin to be
relevant. Similarly, Chapter 2 can be broken up. Sections 2.6-2.9 are
completely review material and can be omitted altogether if that is suitable.
If the proof of Picard's theorem is omitted or delayed, this chapter is not
relevant until Chapter 5. Thus Sections 2.1-2.3 could be done just before
beginning Chapter 5. Sections 2.4, 2.5, 2.10, and 2.11 are of a purely
theoretical nature and can be kept aside until Section 3.5 is studied, or until
the integral calculus in several variables is begun (Chapters 7 and 8).
Chapters 3-5 constitute a little course in ordinary differential equations.
Since the study of curves and some complex variables are relevant to this
topic, they are introduced in these chapters. Chapter 4, in particular, is
about particle motion and Chapter 5 about series expansions in the complex
domain.
Chapter 6 is devoted to the study of Fourier series and their use in the
classical partial differential equations. This is the only illustration of
eigenvalue expansions.
Chapters 7 and 8 form the part of advanced calculus having to do with
integration; here, we find the various versions of Stokes' theorem and its
applications. Outside of the notion of a differential one-form the approach
is vector calculus rather than differential forms. I have included in Chapter 8
the study of geodesics and Dirichlet's principle; there is no further calculus
of variations.
This text is thus intended to cover the course in advanced calculus given
at the sophomore or junior level. The emphasis in the text is on concepts
and techniques; my main intention being to present the methods of calculus.
There are numerous illustrative examples and exercises on each method
introduced. The exercises appear at the end of each section. Proofs of
theorems are included mainly to offer further understanding of the
mathematical machinery, and secondarily to illustrate its logical structure. It should
be possible to read this book while skipping all proofs. The problems at the
ends of the sections and the miscellaneous problems are included to deepen
the student's understanding of the material, to allow him to try his hand at
mathematical inference, and to suggest related topics.
I wish to thank Anne Clarke and Irene Dougherty of Brandeis University
for their typing of the preliminary notes leading to this text and the classes
of Mathematics 35 (1967-1968) and Mathematics 21 (1968-1969) for their
assistance in correcting them. I thank also the editorial staff of W. A.
Benjamin, Inc., for their patient, friendly, and expert assistance. Finally,
it gives me great pleasure to thank my wife, Ricki, who not only typed the
entire final manuscript and who gave me the needed encouragement to see
this text through, but who also makes the world's greatest martinis.
Hugo Rossi
Waltham, Massachusetts
October 1969
CONTENTS
PREFACE
Chapter 1 Linear Functions
1.1 Simultaneous equations
1.2 Numbers, notation, and geometry
1.3 Linear transformations
1.4 Linear subspaces of R^n
1.5 Rank + nullity = dimension
1.6 Invertible matrices
1.7 Eigenvectors and change of basis
1.8 Complex numbers
1.9 Space geometry
1.10 Abstract notions of linearity
1.11 Inner products
Miscellaneous problems
Chapter 2 Notions of Calculus
2.1 Convergence of sequences
2.2 Series
2.3 Tests for convergence
2.4 Convergence in R^n
2.5 Continuity
2.6 Calculus of one variable
2.7 Multiple integration
2.8 Partial differentiation
2.9 Improper integrals
2.10 The space of continuous functions
2.11 The fixed point theorem
2.12 Summary
Miscellaneous problems
Chapter 3 Ordinary Differential Equations
3.1 Differentiation
3.2 Taylor's formula
3.3 Differential equations
3.4 Some techniques for solving equations
3.5 Existence theorems
3.6 Linear differential equations
3.7 Second-order linear equations
3.8 Summary
Miscellaneous problems
Chapter 4 Curves
4.1 Parametrization of curves
4.2 Arc length
4.3 Local geometry of curves
4.4 Curves in space
4.5 Varying a curve in the plane
4.6 Vector fields and fluid flows
4.7 Summary
Miscellaneous problems
Chapter 5 Series of Functions
5.1 Convergence
5.2 The fundamental theorem of algebra
5.3 Constant coefficient linear differential equations
5.4 Solutions in series
5.5 Power series
5.6 Complex differentiation
5.7 Differential equations with analytic coefficients
5.8 Infinitely flat functions
5.9 Summary
Miscellaneous problems
Chapter 6 Functions on the Circle (Fourier Analysis)
6.1 Approximation by trigonometric polynomials
6.2 Laplace's equation
6.3 Fourier sine and cosine series
6.4 The one-dimensional wave and heat equations
6.5 The geometry of Fourier expansions
6.6 Differential equations on the circle
6.7 Taylor series and Fourier series
6.8 Summary
Miscellaneous problems
Chapter 7 Line Integrals and Green's Theorem
7.1 The differential
7.2 Coordinate changes
7.3 Differential forms
7.4 Work and conservative fields
7.5 Integration of differential forms
7.6 Applications of Green's theorem
7.7 The Cauchy integral formula
7.8 Summary
Miscellaneous problems
Chapter 8 Potential Theory in Three Dimensions
8.1 Divergence and the equation of continuity
8.2 Curl and rotation
8.3 Surfaces
8.4 Surface integrals and Stokes' theorem
8.5 The divergence theorem
8.6 Dirichlet's principle
8.7 Summary
Miscellaneous problems
ANSWERS TO SELECTED EXERCISES
INDEX
Chapter 1
LINEAR FUNCTIONS
You probably recall from calculus that a function is a rule which associates
particular values of one variable quantity to particular values of another
variable quantity. Analysis is that branch of mathematics devoted to the
study, or analysis, of functions. The main kind of analysis that goes on is
this: for small changes in the first variable, we try to determine an
approximate value to the corresponding change in the second. Now, we ask, for
large changes in the first variable to what extent can we predict, from such
approximations, the corresponding change in the second? The primary
technique involved in this kind of analysis is simplification of the problem.
That is, we replace the given function by a suitable very simple and more
easily calculable function and work with this simple function instead (making
sure to keep in mind the effect of that replacement).
The simplest possible functions are those which behave linearly. This
means that they have a straight line as graph. Such a function has the
following property. The increment in the value of the function
corresponding to an increment in the variable is a constant multiple of that increment:
\[ f(x + t) - f(x) = Ct \tag{1.1} \]
for some C. Now, when one moves to the consideration of functions of
several variable quantities the study of even these simplest functions becomes
complex enough that it forms a special mathematical discipline, called linear
algebra. The calculus of one variable, coupled with the concepts and
techniques of linear algebra constitute the basic tools of analysis of functions
of several variables. It is our purpose in this text to study this subject.
First then, we must study the notions and methods of linear algebra.
We begin our study in a familiar context: that of the solution of a system
of simultaneous equations in more than one unknown. We shall develop
a standard technique for discovering solutions (if they exist), called row
reduction. This is the foundation pillar of the theory of linear algebra.
After a brief section on notational conventions, we look at the system of
equations from another point of view. Instead of seeking particular
solutions of a system, we analyze the system itself. This leads us to consider the
fundamental concept of linear algebra: that of a linear transformation. In
this larger context we can resolve the question of existence of solutions and
effectively describe the totality of all solutions to a given linear problem.
We proceed then to analyze the totality of linear transformations as an
object of interest in itself. This chapter ends with the study of several
important topics allied with linear algebra. We study the plane as the system
of complex numbers and the inner and vector products in three space.
1.1 Simultaneous Equations
Let us begin by considering a well-known problem: that of finding solutions
to systems of simultaneous linear equations. The simplest nontrivial example
is that of two equations in two unknowns.
Examples
1.
\[
\begin{array}{rcr}
8x + 5y &=& 3 \\
7x - y &=& 8
\end{array}
\tag{1.2}
\]
The technique for solution is that of elimination of one of the
variables. This is accomplished by multiplying the equations by
appropriate nonzero numbers and adding or subtracting the resulting
equations. This is quite legitimate, for the set of solutions of the
system will not be changed by any such operations. It is our intention
to select such operations so that we eventually obtain as equations:
x = something, y = something. In the present case this is quite
easy: if we add five times the second equation to the first, y will
conveniently disappear:
\[
\begin{array}{rcr}
8x + 5y &=& 3 \\
35x - 5y &=& 40 \\ \hline
43x &=& 43
\end{array}
\]
and we obtain the equation x = 1. Substituting that in the first
equation gives 8 + 5y = 3, or y = -1. Then x = 1, y = -1 is the
solution. Let us try a few more illustrative examples.
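2.
\[
\begin{array}{rcr}
3x - 2y &=& 9 \\
-x + 3y &=& 11
\end{array}
\]
We can eliminate x as follows: multiply the second equation by 3
and add:
\[
\begin{array}{rcr}
3x - 2y &=& 9 \\
-3x + 9y &=& 33 \\ \hline
7y &=& 42
\end{array}
\]
We obtain y = 6 and, substituting in the first equation, x = 7.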
3.
\[
\begin{array}{rcr}
3x + 4y &=& 7 \\
6x + 8y &=& 15
\end{array}
\tag{1.3}
\]
If we subtract twice the first equation from the second, we obtain
a mess:
\[
\begin{array}{rcr}
6x + 8y &=& 15 \\
-6x - 8y &=& -14 \\ \hline
0 &=& 1
\end{array}
\tag{1.4}
\]
Thus there can be no numbers x and y satisfying Equations (1.3),
because they imply the Equation (1.4) which is patently false. Notice,
if the second equation were
\[ 6x + 8y = 14 \]
then our technique would lead to the equation 0 = 0 which is true,
but hardly offers much new information. We can conclude that our
simple technique of elimination does not always produce results.
We shall go into the causes for this in Section 1.3.
4. Let us now generalize our technique to systems involving more
variables. Consider, for example, the system
\[
\begin{array}{rcr}
x + y + z &=& 5 \\
3x - 2y + 5z &=& -1 \\
2x + y - z &=& 0
\end{array}
\tag{1.5}
\]
The first equation expresses x in terms of y and z; if the second and
third equation were free of x we could solve as above for the two
unknowns y, z and then use the first to find x. But now it is easy to
replace those last two equations by another two which must also be
satisfied and which are free of the variable x. We use the first to
eliminate x from the latter two. Namely, subtract three times the
first from the second:
\[
\begin{array}{rcr}
3x - 2y + 5z &=& -1 \\
3x + 3y + 3z &=& 15 \\ \hline
-5y + 2z &=& -16
\end{array}
\]
and twice the first from the third:
\[
\begin{array}{rcr}
2x + y - z &=& 0 \\
2x + 2y + 2z &=& 10 \\ \hline
-y - 3z &=& -10
\end{array}
\]
The system (1.5) has been replaced by this new system:
\[
\begin{array}{rcr}
x + y + z &=& 5 \\
-5y + 2z &=& -16 \\
-y - 3z &=& -10
\end{array}
\tag{1.6}
\]
and we can now see our way clear to the end. We solve the last two
as a system in two unknowns:
\[
\begin{array}{rcr}
-5y + 2z &=& -16 \\
5y + 15z &=& 50 \\ \hline
17z &=& 34 \\
z &=& 2
\end{array}
\]
Then, substituting this value in the last equation, we obtain -y - 6 =
-10 or y = 4. Finally, substituting these values for y and z in the
first equation, we find x = -1. Thus the solution is x = -1, y = 4,
z = 2.
5.
\[
\begin{array}{rcr}
x - y - z &=& 5 \\
2x + y - 3z &=& 0 \\
-4x - y + z &=& 10
\end{array}
\]
Eliminate x from the second and third equations by adding appropriate
multiples of the first:
\[
\begin{array}{rcr}
2x + y - 3z &=& 0 \\
2x - 2y - 2z &=& 10 \\ \hline
3y - z &=& -10
\end{array}
\qquad
\begin{array}{rcr}
-4x - y + z &=& 10 \\
4x - 4y - 4z &=& 20 \\ \hline
-5y - 3z &=& 30
\end{array}
\]
The given system has been replaced by these equations:
\[
\begin{array}{rcr}
x - y - z &=& 5 \\
3y - z &=& -10 \\
-5y - 3z &=& 30
\end{array}
\]
We solve the last two easily: y = -30/7, z = -20/7. Substitution
into the first equation completes the solution: x = -15/7.
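The two-unknown eliminations in Examples 1 to 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `eliminate_2x2` and the three outcome labels are hypothetical choices made here, not notation from the text, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction

def eliminate_2x2(eq1, eq2):
    """Solve a1*x + b1*y = c1, a2*x + b2*y = c2 by eliminating x.

    Each equation is a triple (a, b, c). Returns ("unique", x, y),
    ("inconsistent",) when elimination yields 0 = nonzero, or
    ("no information",) when it yields 0 = 0.
    """
    a1, b1, c1 = (Fraction(v) for v in eq1)
    a2, b2, c2 = (Fraction(v) for v in eq2)
    if a1 == 0:
        # Put an equation that actually contains x first.
        a1, b1, c1, a2, b2, c2 = a2, b2, c2, a1, b1, c1
    if a1 == 0:
        raise ValueError("x appears in neither equation")
    # a1 * (second equation) - a2 * (first equation) is free of x.
    b = a1 * b2 - a2 * b1
    c = a1 * c2 - a2 * c1
    if b == 0:
        return ("inconsistent",) if c != 0 else ("no information",)
    y = c / b
    x = (c1 - b1 * y) / a1   # substitute y back into the first equation
    return ("unique", x, y)

print(eliminate_2x2((8, 5, 3), (7, -1, 8)))    # the system of Example 1
print(eliminate_2x2((3, 4, 7), (6, 8, 15)))    # the system of Example 3
```

The function reports the failure of Example 3 rather than explaining it; the explanation is deferred, as in the text, to the later theory.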
Of course, we can run into difficulties as we did in the two unknown
equations of Example 3. We should be prepared for such occurrences
and perhaps even more mysterious ones. Nevertheless, our technique is
productive: if there is a solution to be had we can locate it by this process
of successive eliminations. Furthermore, it easily generalizes to systems
with more unknowns. This is the technique stated for the case of n
unknowns. Eliminate the first variable from all the equations except the first
by adding appropriate multiples of the first. Then, we handle the resulting
equations as a system in n - 1 unknowns. That is, using the second equation
we can eliminate the second variable from all but the second equation, using
the new third equation we can eliminate the third variable from the
remaining equations, and so forth. Eventually we run out of equations and we
ought to be able to find the desired solution by a succession of substitutions.
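The procedure just described, for n equations in n unknowns, might be sketched as follows. The function name `solve_by_elimination` and the convention of returning `None` when no unique solution emerges are assumptions made for this illustration; the sketch does not try to describe the solutions in the cases where elimination breaks down.

```python
from fractions import Fraction

def solve_by_elimination(A, b):
    """Solve the n x n system A x = b by successive elimination.

    Returns the unique solution as a list of Fractions, or None when the
    procedure fails to produce one (no solution, or more than one).
    """
    n = len(A)
    # Augmented rows [a_1 ... a_n | b], in exact rational arithmetic.
    rows = [[Fraction(v) for v in row] + [Fraction(rhs)]
            for row, rhs in zip(A, b)]
    for k in range(n):
        # Renumber the equations so that the k-th one contains variable k.
        pivot = next((i for i in range(k, n) if rows[i][k] != 0), None)
        if pivot is None:
            return None
        rows[k], rows[pivot] = rows[pivot], rows[k]
        # Add a multiple of equation k to each later one to remove variable k.
        for i in range(k + 1, n):
            factor = -rows[i][k] / rows[k][k]
            rows[i] = [ri + factor * rk for ri, rk in zip(rows[i], rows[k])]
    # Successive substitutions, starting from the last equation.
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        known = sum(rows[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (rows[k][n] - known) / rows[k][k]
    return x

# The system (1.5) of Example 4.
print(solve_by_elimination([[1, 1, 1], [3, -2, 5], [2, 1, -1]], [5, -1, 0]))
```

Working with `Fraction` keeps answers such as those of Example 5 exact instead of as rounded decimals.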
We shall want to do more than discover solutions if they exist. We want
to be able to predict the existence of solutions; we want to be able to compare
systems, and we want to know in some sense how many solutions there are.
In other words, we should come to understand the nature of a given system
of equations. In order to do that we have to analyze this technique and
develop a notation and theory which do so. That is where linear algebra
begins. Before going into this, we study another pair of more complicated
examples.
Examples
6.
\[
\begin{array}{rcr}
x + 2y - z - 3w &=& 13 \\
5x - y - z + 2w &=& -14 \\
y + z + w &=& 4 \\
3x + 2y - 2z &=& -7
\end{array}
\]
According to our technique, we replace the last three equations by a
new set in which the variable x does not appear. We do this by
adding the suitable multiple of the first equation:
\[
\begin{array}{lrcr}
(-5) \times (\text{first}) + \text{second:} & -11y + 4z + 17w &=& -79 \\
0 \times (\text{first}) + \text{third:} & y + z + w &=& 4 \\
(-3) \times (\text{first}) + \text{fourth:} & -4y + z + 9w &=& -46
\end{array}
\]
Now we solve this set by applying the same procedure: we now
eliminate y. Of course the order of the equations is not relevant;
we could have listed them some other way. Since we can avoid
fractions by adding multiples of the second equation to the first and
third, let's do it that way.
\[
\begin{array}{lrcr}
(11) \times (\text{second}) + \text{first:} & 15z + 28w &=& -35 \\
4 \times (\text{second}) + \text{third:} & 5z + 13w &=& -30
\end{array}
\]
Finally, of this set, (-3) × (second) + first gives -11w = 55. Thus
the original set of four equations is replaced by this set:
\[
\begin{array}{rcr}
x + 2y - z - 3w &=& 13 \\
y + z + w &=& 4 \\
5z + 13w &=& -30 \\
-11w &=& 55
\end{array}
\]
The solutions are now easily found,
\[ w = -5, \quad z = 7, \quad y = 2, \quad x = 1 \]
7. Now, let us consider this set:
\[
\begin{array}{rcr}
x + 2y + 3z + u - v &=& 2 \\
-5x + y + 7z &=& -5 \\
2y + 4u + 3v &=& 18 \\
3z - u - v &=& -5
\end{array}
\tag{1.7}
\]
x is already eliminated from the last two equations. Using the first
to eliminate x from the second, we obtain these three equations in
place of the last three above,
\[
\begin{array}{rcr}
11y + 22z + 5u - 5v &=& 5 \\
2y + 4u + 3v &=& 18 \\
3z - u - v &=& -5
\end{array}
\tag{1.8}
\]
Now y is already eliminated from the last. We eliminate it from the
second (without getting involved with fractions) in this way:
\[ (-2) \times (\text{first}) + (11) \times (\text{second}): \quad -44z + 34u + 43v = 188 \]
Now this equation together with the last of the set (1.8) gives this
system
\[
\begin{array}{rcr}
-44z + 34u + 43v &=& 188 \\
3z - u - v &=& -5
\end{array}
\]
We can eliminate v from the first, by adding 43 times the second, to
obtain 85z - 9u = -27. Thus the system (1.7) has been transformed into this:
\[
\begin{array}{rcr}
x + 2y + 3z + u - v &=& 2 \\
11y + 22z + 5u - 5v &=& 5 \\
3z - u - v &=& -5 \\
85z - 9u &=& -27
\end{array}
\]
Now we can solve for x by the first equation once we know y, z, u, v;
we can solve for y by the second once we know z, u, v; we can solve
for v in the third once we know z and u; and we can use any z, u which
make the last equation true. For example, if z = 0, we must have
u = 3, and so on up the line: v = 2, y = 0, x = 1. Notice that for
any value of z we can always find u, v, x, y that make these equations
all hold. Thus in this case there is more than one solution.
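The remark that every choice of z leads to a solution of Example 7 can be checked directly against the four original equations of (1.7). The sketch below is an illustration only; the function names are hypothetical, and the coefficient 85 in the first step comes from adding 43 times the equation 3z - u - v = -5 to -44z + 34u + 43v = 188.

```python
from fractions import Fraction

def solution_for(z):
    """Return (x, y, z, u, v) solving system (1.7) for a chosen value of z."""
    z = Fraction(z)
    # Adding 43 times (3z - u - v = -5) to (-44z + 34u + 43v = 188)
    # removes v and leaves 85z - 9u = -27.
    u = (85 * z + 27) / 9
    v = 3 * z - u + 5                        # from 3z - u - v = -5
    y = (5 - 22 * z - 5 * u + 5 * v) / 11    # from 11y + 22z + 5u - 5v = 5
    x = 2 - 2 * y - 3 * z - u + v            # from the first equation of (1.7)
    return x, y, z, u, v

def satisfies_1_7(x, y, z, u, v):
    """Check the four original equations of system (1.7)."""
    return (x + 2 * y + 3 * z + u - v == 2
            and -5 * x + y + 7 * z == -5
            and 2 * y + 4 * u + 3 * v == 18
            and 3 * z - u - v == -5)

# Every value of z tried here yields a genuine solution.
print(all(satisfies_1_7(*solution_for(z)) for z in range(-5, 6)))
```

Choosing z = 0 reproduces the particular solution quoted in the example.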
Formulation of the Procedure: Row Reduction
Now, let us turn to the abstract formulation of this procedure. In the
general case we will have some, say m, equations in n unknowns. Let us
refer to the unknowns as $x^1, \dots, x^n$. These m equations may be written as
\[
\begin{array}{rcl}
a_1^1 x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &=& b^1 \\
a_1^2 x^1 + a_2^2 x^2 + \cdots + a_n^2 x^n &=& b^2 \\
& \vdots & \\
a_1^m x^1 + a_2^m x^2 + \cdots + a_n^m x^n &=& b^m
\end{array}
\tag{1.10}
\]
We proceed to solve this system as follows: multiply the first equation
by $-a_1^2/a_1^1$ and add it to the second equation; multiply the first equation by
$-a_1^3/a_1^1$ and add it to the third, and so on. The result will be a new system,
which we may write this way:
\[
\begin{array}{rcl}
a_1^1 x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &=& b^1 \\
\alpha_2^2 x^2 + \cdots + \alpha_n^2 x^n &=& \beta^2 \\
& \vdots & \\
\alpha_2^m x^2 + \cdots + \alpha_n^m x^n &=& \beta^m
\end{array}
\tag{1.11}
\]
We now continue with the same technique applied to the system of m - 1
equations in n - 1 unknowns given by the system (1.11) (except for the first
equation). This is an effective reduction of the problem, because $x^1$ can be
computed from the first equation once $x^2, \dots, x^n$ are known. Of course, if
$a_1^1 = 0$, this technique must be slightly modified. We just renumber the
equations so that the coefficient of $x^1$ in the first one is nonzero and then
proceed as above. If that is impossible then $x^1$ appears in no equation so
we can disregard it and work with $x^2$ instead.
We now introduce a formalism which allows us to keep track of this
procedure. It is clear that the essence of the left side of the system of
Equations (1.10) is embodied in the array of numbers
\[
A = \begin{pmatrix}
a_1^1 & a_2^1 & \cdots & a_n^1 \\
\vdots & \vdots & & \vdots \\
a_1^m & a_2^m & \cdots & a_n^m
\end{pmatrix}
\tag{1.12}
\]
This array is called a matrix: the upper index of the general term is the
row index and the lower index is the column index. Thus, $a_5^3$ is the number
in the third row and fifth column, $a_{42}^7$ is in the seventh row and forty-second
column, $a_k^j$ is the number in the jth row and the kth column. Symbol
(1.12) is an m × n matrix: it has m rows and n columns. The matrix
\[
b = \begin{pmatrix} b^1 \\ \vdots \\ b^m \end{pmatrix}
\]
is an m × 1 matrix. Equations (1.10) can now be written symbolically as
\[ Ax = b \tag{1.13} \]
Now the technique for solving the equation described above consists of a
sequence of such equations with new matrices A and b, ending with one whose
solution is obvious. The step from one equation to the next is performed
by a row operation (remember, the rows are the separate equations); that is,
one of these particular steps:
Step 1. Multiply a row by a nonzero constant.
Step 2. Add one row to another.
Step 3. Interchange two rows.
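The three steps listed above can be sketched as operations on an augmented matrix stored as a list of rows. The function names are hypothetical, chosen for this illustration, and the rows hold the coefficients followed by the right-hand side. The demonstration retraces the elimination of Example 2.

```python
from fractions import Fraction

def scale_row(M, i, c):
    """Step 1: multiply row i by the nonzero constant c."""
    if c == 0:
        raise ValueError("the constant must be nonzero")
    M[i] = [c * entry for entry in M[i]]

def add_row(M, i, j):
    """Step 2: add row i to row j."""
    M[j] = [a + b for a, b in zip(M[i], M[j])]

def swap_rows(M, i, j):
    """Step 3: interchange rows i and j."""
    M[i], M[j] = M[j], M[i]

# Augmented matrix [A | b] of Example 2: 3x - 2y = 9, -x + 3y = 11.
M = [[Fraction(3), Fraction(-2), Fraction(9)],
     [Fraction(-1), Fraction(3), Fraction(11)]]
scale_row(M, 1, 3)                 # second equation becomes -3x + 9y = 33
add_row(M, 0, 1)                   # second equation becomes 7y = 42
scale_row(M, 1, Fraction(1, 7))    # second equation becomes y = 6
print(M)
```

Adding a multiple of one row to another, the move used throughout the examples, is a combination of Steps 1 and 2: scale the row, add it, and scale it back.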
It is clear (and will be verified in Section 1.3) that any such operation does
not change the collection of solutions. Finally, the end result desired is a
matrix of this form, called a row-reduced matrix:
\[
\begin{pmatrix}
1 & a_2^1 & a_3^1 & \cdots \\
0 & 1 & a_3^2 & \cdots \\
0 & 0 & \ddots & \\
\vdots & \vdots & & \ddots
\end{pmatrix}
\tag{1.14}
\]
Descriptively: the first nonzero entry of any row is a 1 and this 1 in any
row is to the right of the 1 in any previous row. This is the kind of matrix
the above procedure leads to; and it is most desirable because the system it
represents can be immediately solved. In order to see this, we shall
distinguish between two cases by resolving the dotted ambiguity in the lower right
corner of (1.14).
Let
\[ Ax = b \]
be a system of linear equations where A is a row-reduced matrix (of the form
(1.14)). Let d be the number of nonzero rows of A.
Case 1. d = n. In this case the system of equations has this form:
\[
\begin{array}{rcl}
x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &=& b^1 \\
x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n &=& b^2 \\
& \vdots & \\
x^{n-1} + a_n^{n-1} x^n &=& b^{n-1} \\
x^n &=& b^n \\
0 &=& b^{n+1} \\
& \vdots & \\
0 &=& b^m
\end{array}
\]
Thus there is a solution if and only if $b^{n+1} = \cdots = b^m = 0$, and the solution
is found by successive substitutions. In this case the solution is unique.
Case 2. d < n. In this case our system has the form:
\[
\begin{array}{rcl}
x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &=& b^1 \\
x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n &=& b^2 \\
& \vdots & \\
x^d + a_{d+1}^d x^{d+1} + \cdots + a_n^d x^n &=& b^d \\
0 &=& b^{d+1} \\
& \vdots & \\
0 &=& b^m
\end{array}
\]
(We may have to reindex the variables in order to get all the leading 1's in a
line.) There is a solution if and only if $b^{d+1} = \cdots = b^m = 0$, and all the
solutions are obtained by giving $x^{d+1}, \dots, x^n$ arbitrary values, and finding
the values of the remaining variables by substitutions.
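Both cases can be handled by one short routine of successive substitutions. The sketch below assumes, as in the displayed systems, that the leading 1 of row i sits in column i and that any zero rows come last; the name `solve_row_reduced` and the `free_values` argument, which carries the arbitrary values of the last n - d unknowns, are hypothetical.

```python
from fractions import Fraction

def solve_row_reduced(A, b, free_values=()):
    """Solve A x = b where A is row-reduced with leading 1's on the diagonal.

    A has m rows and n columns; its first d rows are nonzero, with the
    leading 1 of row i in column i. free_values supplies the n - d
    arbitrary values of the last n - d unknowns (empty when d = n).
    Returns the solution as a list, or None when some zero row has a
    nonzero right-hand side.
    """
    m, n = len(A), len(A[0])
    d = sum(1 for row in A if any(entry != 0 for entry in row))
    if any(b[i] != 0 for i in range(d, m)):
        return None                      # an equation 0 = b^i with b^i nonzero
    if len(free_values) != n - d:
        raise ValueError("need exactly n - d free values")
    x = [Fraction(0)] * d + [Fraction(v) for v in free_values]
    for i in reversed(range(d)):         # successive substitutions
        x[i] = Fraction(b[i]) - sum(Fraction(A[i][j]) * x[j]
                                    for j in range(i + 1, n))
    return x

# Case 1 (d = n = 2, m = 3) and Case 2 (d = 2 < n = 3).
print(solve_row_reduced([[1, 2], [0, 1], [0, 0]], [5, 1, 0]))
print(solve_row_reduced([[1, 2, 3], [0, 1, 4]], [1, 2], free_values=[1]))
```

Changing `free_values` in the second call produces a different solution of the same system, which is the content of Case 2.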
We now summarize the factual (rather than the procedural) content of this
discussion in a theorem, the proof of which will appear in Section 1.3.
Definition 1. A matrix A is called a row-reduced matrix if
(i) the first nonzero entry in any row is 1,
(ii) in any row this first 1 appears to the right of the first 1 in any preceding
row.
The number of nonzero rows of A is called its index.
Theorem 1.1. Let A be an m × n matrix. A can be brought into row-reduced
form A' by a succession of row operations. The equation Ax = b has
precisely the same solutions as the equation A'x = b' if b' is obtained from b
by the same sequence of row operations that led from A to A'.
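The first assertion of Theorem 1.1 can be illustrated by a routine that carries out the reduction and reports the index of the result. This is a sketch under assumptions: the names `row_reduce` and `is_row_reduced` are hypothetical, zero rows are placed last, and the particular sequence of row operations chosen here is only one of many that lead to a row-reduced form.

```python
from fractions import Fraction

def row_reduce(A):
    """Bring A to row-reduced form using the three row operations.

    Returns (R, index), where R satisfies Definition 1 and index is the
    number of nonzero rows of R.
    """
    R = [[Fraction(v) for v in row] for row in A]
    m, n = len(R), len(R[0])
    r = 0                                   # next row to receive a leading 1
    for col in range(n):
        pivot = next((i for i in range(r, m) if R[i][col] != 0), None)
        if pivot is None:
            continue                        # this column gives no new leading 1
        R[r], R[pivot] = R[pivot], R[r]     # interchange two rows
        lead = R[r][col]
        R[r] = [v / lead for v in R[r]]     # multiply a row by a nonzero constant
        for i in range(r + 1, m):           # clear the column below the leading 1
            factor = R[i][col]
            R[i] = [vi - factor * vr for vi, vr in zip(R[i], R[r])]
        r += 1
        if r == m:
            break
    return R, r

def is_row_reduced(R):
    """Check conditions (i) and (ii) of Definition 1, with zero rows last."""
    last = -1
    seen_zero_row = False
    for row in R:
        nonzero = [j for j, v in enumerate(row) if v != 0]
        if not nonzero:
            seen_zero_row = True
            continue
        first = nonzero[0]
        if seen_zero_row or row[first] != 1 or first <= last:
            return False
        last = first
    return True

R, index = row_reduce([[3, 4], [6, 8]])     # coefficient matrix of Example 3
print(R, index, is_row_reduced(R))
```

To follow the second assertion of the theorem, the same operations would be applied to b alongside A, for instance by reducing the augmented matrix.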
• EXERCISES
1. Find solutions for these systems
(a) $2x - 3y = 23$
    $3x + y = -4$
(b) ix + 4y = 10
-ix + 8y= 0
(c) $x + y + z = 15$
    $x - y + z = 3$
    $2x - 3y - 5z = -7$
(d) $x + y + z + w = 4$
    $x + y + z - w = 2$
    $x - y + z - w = 0$
    $-x + 2y - 3z + 4w = 2$
(e) $-x + y + z = 0$
    $2x + 2y + 10z = 28$
    $x + y + z = 22$
(f) $3x + 6y + 9z = 12$
    $x + 2y + 3z = 4$
(g) $x + y = 7$
    $x - y = 1$
    $3x - 4y = 0$
(h) $x + y = 7$
    $x + 2y = 9$
    $x + 3y = 11$
(i) $x + y + z + w = 4$
    $x + y - z - w = 6$
(j) $x + 2y + z = 0$
    $x - 3y - 6z = 4$
    $4x + 8y + 4z = 11$
2. A homogeneous system of linear equations is a system of the form
Ax = 0; that is, the right-hand side is zero. Find nonzero solutions (if
possible) to these homogeneous systems.
(a) $x + y + z = 0$
    $x - y + z = 0$
    $x + 2y + z = 0$
(b) $x + y + z = 0$
    $x - y - z = 0$
(c) $x + y + z + w = 0$
    $x - 2y + z - 2w = 0$
    $2x - y + 2z - w = 0$
3. Suppose $(x^1, \dots, x^n)$ is a solution for a given homogeneous system.
Show that for every real number t, $(tx^1, \dots, tx^n)$ is also a solution.
4. If $(x^1, \dots, x^n)$, $(y^1, \dots, y^n)$ are solutions for a given homogeneous
system, then so is $(x^1 + y^1, \dots, x^n + y^n)$.
5. Find the row-reduced matrix which corresponds to the given matrix
according to Theorem 1.1.
\[
A = \begin{pmatrix} 0 & 7 & 1 \\ 3 & 2 & 2 \\ -1 & 6 & 4 \end{pmatrix}
\]
/l 0 0 6 5\
0 0 0 2 0
\[
\begin{pmatrix} 1 & 2 & 6 & 1 \\ -2 & -4 & 0 & 2 \\ 0 & 0 & 8 & 8 \\ 3 & 6 & 9 & 12 \end{pmatrix}
\]
6. Let
i)-Θ i)
Solve these equations:
(a) Ax = b, (b) Bx = a, (c) Bx = c, (d) Cx = a,
(e) Cx = c, where A, B, C are given in Exercise 5.
• PROBLEMS
1. Show that the system
$ax + by = \alpha$
$cx + dy = \beta$
has a solution no matter what $\alpha$, $\beta$ are if $ad - bc \neq 0$, and there is only one
such solution.
2. Can you suggest an explanation of the ugly phenomenon illustrated
by Example 3 ?
3. Is there only one row-reduced matrix to which a given matrix may be
reduced by row operations? If A' and A" are two such row-reduced
matrices, coming from a given matrix A show that they must have the same
index.
4. Suppose we have a system of n equations in n unknowns, Ax = b.
After row reduction the index of the row-reduced matrix is also n. Show
that in this case the equation Ax = b always has one and only one solution
for every b.
5. Suppose that you have a system of m equations in n unknowns to
solve. What should you expect in the way of existence and uniqueness
of solutions in the cases m < n, m > n?
6. Suppose we are given the n × n system Ax = b, and all the rows of A
are multiples of the first row; that is, there are $s^1, \dots, s^n$ such that $a_j^i = s^i a_j^1$
for all j and i = 1, ..., n. Under what conditions will the given system
have a solution?
7. Suppose instead that the columns of A are multiples of the first column.
Can you make any assertions?
8. Verify that if the columns of a 3 × 3 matrix are multiples of the first
column, then the rows are multiples of one of the rows.
1.2 Numbers, Notation, and Geometry
We now interrupt our discussion of simultaneous equations in order to
introduce certain facts and notational conventions which shall be in use
throughout this text. We shall also describe the geometry of the plane from
the linear, or vector point of view as an alternative introduction to linear
algebra.
First of all, the collection {1, 2, 3, ...} of positive integers (the "counting
numbers") will be denoted by P. Every integer n has an immediate successor,
denoted by n + 1. If a fact is true for the integer 1 and also holds for the
successor of every integer for which it is true, then it is true for all integers.
This is the Principle of Mathematical Induction, which we shall take as an
axiom, or defining property of the integers. We shall formulate it this way.
Principle of Mathematical Induction. Let S be a subset of P with these
properties:
(i) 1 is a member of S,
(ii) whenever a particular integer n is in S, so also is its successor n + 1
in S.
Then S must be the set P of all positive integers.
This assertion is intuitively clear. You can see, for example that 2 is in S.
For by (i) 1 is in S, and thus by (ii) 1 + 1 = 2 is also in S. Continuing,
3 = 2 + 1 is in S, again by (ii). By applying (ii) another time 4 is in S.
Applying (ii) another 32 times we see that all the integers up to 36 are also
in S. No positive integer can escape: since 1 is in S we need only apply (ii) n
times to verify that the integer n is in S. In fact, the assertion of the principle
of mathematical induction is that there are no integers other than those that
can be captured in this way, and in this sense the principle is a defining
property of the integers.
The principle of mathematical induction provides us with a tool for writing
proofs of assertions for all positive integers which avoids the phrases:
"continuing in this way," "and so forth," "...". We shall find this a
helpful device in verifying assertions concerning problems with an unspecified
number, n, of unknowns. Let us illustrate this method by proving a few
propositions about integers.
Proposition 1. The sum of the first n integers is $\tfrac{1}{2} n(n + 1)$.
Proof. Let S be the set of integers for which Proposition 1 is true. Certainly
1 is in S:
\[ 1 = \tfrac{1}{2} \cdot 1(1 + 1) \]
Now, assuming the assertion of Proposition 1 for any integer n, we show that it also
holds for n + 1.
\[
\begin{aligned}
1 + \cdots + (n + 1) &= 1 + \cdots + n + (n + 1) = \tfrac{1}{2} n(n + 1) + (n + 1) \\
&= (n + 1)\left(\tfrac{1}{2} n + 1\right) = \tfrac{1}{2}(n + 1)(n + 2)
\end{aligned}
\]
which is the appropriate conclusion. Thus by the principle of mathematical
induction, Proposition 1 is proven.
Proposition 2. The sum of the first n odd integers is $n^2$.
Proof. $1 = 1^2$ surely. We now assume the proposition for any n, and show that
it follows for n + 1:
\[
\begin{aligned}
1 + 3 + \cdots + [2(n + 1) - 1] &= 1 + 3 + \cdots + (2n - 1) + (2n + 1) \\
&= n^2 + 2n + 1 = (n + 1)^2
\end{aligned}
\]
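A finite check is no substitute for the induction argument, but it is a quick way to catch a misremembered formula. The sketch below, with hypothetical function names, adds the terms one at a time and compares the totals with the closed forms of Propositions 1 and 2 for n up to 100.

```python
def sum_first(n):
    """1 + 2 + ... + n, added term by term."""
    return sum(range(1, n + 1))

def sum_first_odd(n):
    """1 + 3 + ... + (2n - 1), added term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))

for n in range(1, 101):
    assert 2 * sum_first(n) == n * (n + 1)      # Proposition 1
    assert sum_first_odd(n) == n ** 2           # Proposition 2
```

The first comparison is written as 2(1 + ... + n) = n(n + 1) so that only integer arithmetic is involved.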
Proposition 3. Let K be a given positive integer. Then for any integer n
we can write
$$n = QK + R \qquad (1.15)$$
with $0 \le R < K$ in one and only one way.
Proof. We may of course immediately discard the case K = 1, for in that case
(1.15) is just the trivial comment that $n = n \cdot 1$ for all n. Thus take K > 1, and now
proceed by mathematical induction. The proposition is true for n = 1:
$$1 = 0 \cdot K + 1$$
Now we assume that the proposition is true for any given integer n. Thus
$$n = QK + R$$
for some Q and R, $0 \le R < K$. Then $R + 1 \le K$. If R + 1 < K, we have
$$n + 1 = QK + (R + 1)$$
with $0 \le R + 1 < K$, as desired. Otherwise, R + 1 = K, in which case
$$n + 1 = QK + K = (Q + 1)K + 0$$
as desired. Thus, by mathematical induction, (1.15) is possible for every integer n.
This representation is unique, for if
$$n = Q'K + R'$$
is also possible with $0 \le R' < K$, then we have
$$Q'K + R' = QK + R$$
or
$$(Q' - Q)K = R - R'$$
and $R - R'$ is strictly between $-K$ and K. Now the only multiple of K strictly between $-K$ and
K is 0, so $(Q' - Q)K = R - R' = 0$, from which we conclude R = R', Q' = Q.
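The induction step in this proof is really a recipe: starting from the representation of 1, each passage from n to n + 1 either raises R by one or resets R to 0 and raises Q. A minimal Python sketch of that recipe follows; the function name `divide` is ours, and the treatment of K = 1 in the base case is our own choice for the case the proof sets aside.

```python
def divide(n, K):
    """Return (Q, R) with n == Q*K + R and 0 <= R < K, for integers n >= 1, K >= 1.

    The pair is built exactly as in the induction: start at n = 1 and
    step up to n one unit at a time.
    """
    # Base case n = 1: 1 = 0*K + 1 when K > 1, and 1 = 1*1 + 0 when K = 1.
    Q, R = (0, 1) if K > 1 else (1, 0)
    for _ in range(n - 1):
        if R + 1 < K:
            R += 1               # n + 1 = Q*K + (R + 1)
        else:
            Q, R = Q + 1, 0      # n + 1 = (Q + 1)*K + 0
    return Q, R


print(divide(17, 5))   # 17 = 3*5 + 2
```

By the uniqueness half of the proposition, the result must agree with Python's built-in `divmod(n, K)` for positive n and K.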
Set Notation
The set of positive integers forms a subset of a larger number system, the
set Z of all integers. Z consists of all positive integers, their negatives and 0.
The collection of all quotients of members of Z (with nonzero denominator) is the set of rational numbers,
denoted by Q. Q is a very large subset of the set of all real numbers R.
For the purposes of geometric interpretation we will conceive of the real
number system R as being in one-to-one correspondence with the points
on a straight line. That is, given a straight line, we fix two points on it,
one is the origin O, and the other denotes the unit of measurement. All
other points Ρ on the line are given a numerical value: it is the displacement
from О as measured on the given scale (negative if О is between Ρ and 1
and positive otherwise). (See Figure 1.1.)
Figure 1.1
There are certain ideas and notations in connection with sets which we
shall standardize before proceeding. Customarily, in any given context
there is one largest set of objects under consideration, called the universe
(it may be the positive integers P, or the rabbits in Texas, or the people on
the moon) and all sets are actually subsets of this universe. If X is a set and
x is an object we shall write $x \in X$ to mean: x is a member of X. $x \notin X$
means that x is not a member of X. Thus, for example, $-7 \in Z$, but $-7 \notin P$.
The set with no elements is called the empty set, and is designated $\emptyset$. Most
specific sets are defined by a property: the set in question is the set of all
elements of the universe that have that property. We use the following
shorthand form to represent that phrase:
$$\{x \in U : x \text{ has that property}\}$$
For example, the set of all positive real numbers is $\{x \in R : x > 0\}$. The set
of all Englishmen who drink coffee is $\{x \in \text{England} : x \text{ drinks coffee}\}$. The
set of all integers between 8 and 18 is $\{x \in Z : 8 < x < 18\}$. This is the same
as $\{x \in P : 8 < x < 18\}$ and $\{x \in Z : |x - 13| < 5\}$.
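The set-builder shorthand has a direct counterpart in Python's set comprehensions. In the sketch below the infinite set Z is replaced by a finite window of integers (an assumption made only so that the sets can be enumerated), and the three descriptions of the integers between 8 and 18 are compared.

```python
# A finite stand-in for Z; large enough to contain every integer of interest here.
Z_window = range(-50, 51)

A = {x for x in Z_window if 8 < x < 18}             # {x in Z: 8 < x < 18}
B = {x for x in Z_window if x > 0 and 8 < x < 18}   # {x in P: 8 < x < 18}
C = {x for x in Z_window if abs(x - 13) < 5}        # {x in Z: |x - 13| < 5}

print(sorted(A))
print(A == B == C)
```

Three different collections of symbols, one set.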
If X and Y are two sets, and every element of X is an element of Y, we shall
say that X is contained in Y, written $X \subset Y$. Notice that $\emptyset \subset X$ for every
set X. We shall consider also these operations on sets:
$-X$: the set of all x not in X
$X \cup Y$: the set of all x in either X or Y (or both)
$X \cap Y$: the set of all x in both X and Y
$X - Y$: the set of all x in X, and not in Y
(Consult Figure 1.2 for a pictorial interpretation.) Notice that $X - Y$ is
the same as $X \cap -Y$. There are many other identities: $-(-X) = X$,
$-X \cup -Y = -(X \cap Y)$, $X \cap (Y \cup Z) = (X \cap Y) \cup (X \cap Z)$, and so on,
so don't be surprised when two different collections of symbols identify the
same set. A final operation is that of forming the Cartesian product. If U
is a given universe, then $U \times U$ is the set of all ordered pairs of elements in U.
$U \times U$ is often denoted by $U^2$. By extension we can define $U^3$ as the set of
all ordered triples $(x^1, x^2, x^3)$ of elements of U; and more generally $U^n$ is the
set of all ordered n-tuples of elements of U.
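These operations can be tried out on a small finite universe. The sketch below is an illustration under our own choices: the universe U, the sets X, Y, Z and the helper `complement` are hypothetical, and the check of each identity on one example is not a proof.

```python
from itertools import product

U = set(range(10))                       # a small universe, chosen for illustration
X, Y, Z = {0, 1, 2, 3}, {2, 3, 4, 5}, {3, 5, 7}


def complement(S):
    """-S: the set of all elements of the universe U that are not in S."""
    return U - S


checks = {
    "X - Y = X n -Y": X - Y == X & complement(Y),
    "-(-X) = X": complement(complement(X)) == X,
    "-X u -Y = -(X n Y)": complement(X) | complement(Y) == complement(X & Y),
    "X n (Y u Z) = (X n Y) u (X n Z)": X & (Y | Z) == (X & Y) | (X & Z),
}
for name, holds in checks.items():
    print(name, holds)

U2 = set(product(U, repeat=2))           # the Cartesian product U x U
rectangle = set(product(X, Y))           # the subset X x Y of U x U
print(len(U2), len(rectangle))
```

Here `|`, `&` and `-` are Python's union, intersection and difference; the complement has to be taken relative to an explicit universe, just as in the text.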
If $X^1, \ldots, X^n$ are subsets of U, the set of all ordered n-tuples $(x^1, \ldots, x^n)$
with $x^1 \in X^1, \ldots, x^n \in X^n$ is denoted $X^1 \times \cdots \times X^n$. Not every subset of
$U^n$ is of the form $X^1 \times \cdots \times X^n$; those which are of this form are called
rectangles.
Figure 1.2
Thus the space of n-tuples of real numbers is denoted $R^n$. If $I^1, \ldots, I^n$
are intervals in R, then $I^1 \times \cdots \times I^n$ is indeed a rectangle. A point
$(x^1, \ldots, x^n)$ in $R^n$ will be denoted, when specific reference to its elements is
not required, by a single boldfaced letter $\mathbf{x} = (x^1, \ldots, x^n)$. If $\mathbf{a} = (a^1, \ldots, a^n)$,
$\mathbf{b} = (b^1, \ldots, b^n)$ are two points in $R^n$ with $b^i > a^i$, $1 \le i \le n$, we shall use the
notation $[\mathbf{a}, \mathbf{b}]$ to denote the rectangle
$$\{(x^1, \ldots, x^n) : a^1 \le x^1 \le b^1, \ldots, a^n \le x^n \le b^n\}$$
If the inequalities are strict we shall denote the rectangle by $(\mathbf{a}, \mathbf{b})$:
$$(\mathbf{a}, \mathbf{b}) = \{(x^1, \ldots, x^n) : a^1 < x^1 < b^1, \ldots, a^n < x^n < b^n\}$$
A function from a set X to another set Y is a rule which associates to each
point x of X a uniquely determined point y in Y. It is customary to avoid the
use of the new word rule by defining a function as a certain kind of subset
of $X \times Y$. Namely, a function is a set of ordered pairs (x, y) with $x \in X$,
$y \in Y$, with each $x \in X$ appearing precisely once as a first member. If
(x, y) is such a pair we denote y by f(x): y = f(x). We shall use the notation
$f: X \to Y$ to indicate that f is to mean a function from X to Y. X is called
the domain of f; the range of f is the set $\{f(x) : x \in X\}$ of values of f. If every
point of Y appears as a value of f we say that f maps X onto Y. If every point
of Y is the value of f at at most one x in X, we say that f is one-to-one. More
precisely, f is one-to-one if $x \ne x'$ implies $f(x) \ne f(x')$. Now, if f is a one-to-one
function from X onto Y, then for each $y \in Y$, there is one and only one
$x \in X$ such that f(x) = y. This defines a function $g: Y \to X$ which is also
one-to-one and onto and has this property: g(y) = x if and only if f(x) = y.
In this case we shall say that f is invertible and g is its inverse, denoted
$g = f^{-1}$. Finally, if $f_1: X \to Y$, $f_2: Y \to Z$ are two functions, we can compose
them to form a third function, denoted $f_2 \circ f_1: X \to Z$, defined by
$$(f_2 \circ f_1)(x) = f_2(f_1(x))$$
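For finite sets the definition of a function as a set of ordered pairs can be made quite literal. The following sketch is an illustration only: the helper names and the sample sets are ours, and a Python dictionary plays the role of the set of pairs once we know each first member occurs exactly once.

```python
def is_function(pairs, X, Y):
    """True if each x in X appears exactly once as a first member, with values in Y."""
    firsts = [x for x, _ in pairs]
    return (set(firsts) == set(X) and len(firsts) == len(set(firsts))
            and all(y in Y for _, y in pairs))


def is_one_to_one(f):
    """f is a dict x -> f(x); distinct points must have distinct values."""
    return len(set(f.values())) == len(f)


def is_onto(f, Y):
    """Every point of Y appears as a value of f."""
    return set(f.values()) == set(Y)


def inverse(f):
    """The function g with g(y) = x if and only if f(x) = y (f one-to-one and onto)."""
    return {y: x for x, y in f.items()}


def compose(f2, f1):
    """The composition f2 o f1, sending x to f2(f1(x))."""
    return {x: f2[f1[x]] for x in f1}


X, Y = {1, 2, 3}, {"a", "b", "c"}
pairs = {(1, "b"), (2, "c"), (3, "a")}
f = dict(pairs)
g = inverse(f)
print(is_function(pairs, X, Y), is_one_to_one(f), is_onto(f, Y))
print(compose(g, f))   # the identity on X
```

Composing f with its inverse in either order returns the identity, which is the defining property of g stated above.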
Plane Geometry
We now turn to the geometric study of the plane, as an alternative
introduction to linear algebra. According to the notion of the Cartesian
coordinate system we can make a correspondence between a plane, supposed
to be of infinite extent in all directions, and the collection R2 of ordered
pairs of real numbers. This is done in the following way: first a point on the
plane is chosen, to be called the origin and denoted О (Figure 1.3). Then
two distinct lines intersecting at О are drawn (it is ordinarily supposed that
these lines are perpendicular, but it is hardly necessary). These lines are
called the coordinate axes; they are sometimes referred to more specifically
as the x and y axes, ordered counterclockwise (Figure 1.4). Now a point
is chosen on each of these axes; we call these $E_1$ and $E_2$ (Figure 1.5). These
are the "unit lengths" in each of the directions of the coordinate axes.
Having chosen a unit on these lines, we can put each of them in one-to-one
correspondence with the real numbers. Now, letting P be any point in the
plane, we associate a pair of real numbers to P in this way. Draw the lines
through P which are parallel to the coordinate axes and let x be the
intersection with the line through $E_1$ and y the intersection with the line through
$E_2$. Then we identify P with the pair of real numbers (x, y) (Figure 1.6).
In this way to every point in the plane there corresponds a point in $R^2$
(called its coordinates relative to the choice O, $E_1$, $E_2$). Clearly, for any pair
of real numbers (x, y) we have a point with those coordinates, namely the
fourth vertex of the parallelogram of side lengths x and y along the
coordinate axes with one vertex at O (Figure 1.6).
Figure 1.3
Figure 1.4
Figure 1.5
Figure 1.6
Figure 1.7
Once a particular point in the plane is fixed as the origin, there can be
defined two operations on the points of the plane, and these operations form
the tools of linear algebra. Since they cannot be defined on the points until
an origin is chosen, we are forced to distinguish between the plane as a point
set and the plane with a chosen point. This distinction gives rise to the
notion of vector: a vector is a point in the plane-with-origin. The vector
can be physically realized as the directed line segment from the origin to the
given point; such a visualization is nothing more than a heuristic aid. It is
important to realize that as sets, the set of vectors in the plane is the same
as the set of points in the plane. The difference is that the set of vectors has
additional structure: a particular point has been designated the origin. We
shall denote vectors by boldface letters; thus the point Ρ becomes the vector
P, the origin О becomes the vector 0. We shall now describe these two
operations geometrically and then compute them in coordinates.
1. Scalar Multiplication. Let P be a vector in the plane, and r a real
number. Consider the line through 0 and P. Considering P now as a unit
length, we can put that line into one-to-one correspondence with R. Using
this scale, rP is the point corresponding to the real number r. Said differently,
rP is one of the points on that line whose distance from 0 is |r| times the
distance of P from 0 (Figure 1.7). Now, if P has the coordinates (x, y) we
shall see that rP has coordinates (rx, ry). First, suppose r > 0. Draw the
triangles formed by the line through 0, P and rP, and the $E_1$ axis and the lines
parallel to the $E_2$ axis (Figure 1.8). Triangles I and II are similar. Thus,
referring to the lengths as denoted in Figure 1.8,
$$\frac{|P|}{|rP|} = \frac{x}{s}$$
By definition $|P|/|rP| = 1/r$, thus the first coordinate of rP (here denoted by s)
is rx. The second coordinate is similarly seen to be ry. Thus rP has the
coordinates (rx, ry). The case r < 0 is only slightly more complicated.
2. Addition. Let P, Q be vectors in $R^2$. Then 0, P, Q are three vertices
of a uniquely determined parallelogram. We define P + Q to be the fourth
vertex. The description of this operation in terms of coordinates is extremely
simple: if P has coordinates (x, y), and Q has coordinates (s, t), then P + Q
has (x + s, y + t) as coordinates. There is nothing profound to be learned
from the verification of this fact, so we shall not go through it in detail.
After all, it is not our purpose here to logically incorporate plane geometry
into our mathematics, but rather to use it as an intuitive tool for
comprehension. For those who are suspicious of our assurances we include the
verification of a special case. Consider Figure 1.9 and include the data relating
to the coordinates (Figure 1.10). To show that the length of the line segment
OB is s + x, we must verify that the length of AB is s. Draw the line through
P parallel to $0E_1$, and let C be the intersection of that line with the line
through P + Q and B. The quadrilateral ABCP has parallel sides and thus
is a parallelogram. Hence, AB and PC have the same length. Now triangles
0sQ and PC(P + Q) have pairwise parallel sides (as shown in Figure 1.10)
and further 0Q and P(P + Q) have the same length. Thus these triangles
are congruent, so the length of PC is the same as the length of 0s, namely s.
Thus the length of AB is also s, and so OB has length s + x. Notice that this
is a special case since it refers to an explicit picture which does not cover all
possibilities.
Figure 1.8
Figure 1.9
The operation inverse to addition is subtraction: Ρ — Q is that vector
which must be added to Q in order to obtain Ρ (Figure 1.11). The best way
to visualize Ρ — Q is as the directed line segment running from the head of
Q to the head of Ρ (denoted L in Figure 1.11). In actuality, Ρ - Q is the
vector obtained by translating L to the origin; in practice, it is customary not
to do this but to systematically confuse a vector with any of its translates.
We shall do this only for purposes of pictorial representation.
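In coordinates the three operations are one-line computations. The sketch below is an illustration with hypothetical sample vectors: it represents a vector as a pair of numbers and checks, for one example, that 0, P, P + Q, Q are the vertices of a parallelogram and that P - Q is the vector which must be added to Q to obtain P.

```python
def scale(r, P):
    """rP has coordinates (rx, ry) when P has coordinates (x, y)."""
    return (r * P[0], r * P[1])


def add(P, Q):
    """P + Q has coordinates (x + s, y + t)."""
    return (P[0] + Q[0], P[1] + Q[1])


def subtract(P, Q):
    """P - Q, computed as P + (-1)Q."""
    return add(P, scale(-1, Q))


P, Q = (3, 1), (1, 2)          # sample vectors, chosen for illustration
S = add(P, Q)

# Opposite sides of the parallelogram 0, P, P + Q, Q are translates of each other.
assert subtract(S, P) == Q
assert subtract(S, Q) == P
# P - Q is what must be added to Q to reach P.
assert add(Q, subtract(P, Q)) == P
print(S, scale(2, P), subtract(P, Q))
```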
Figure 1.11
Notice that, having chosen the vectors $E_1$ and $E_2$, we can express any
vector in the plane uniquely in terms of them and the operations of addition
and scalar multiplication:
$$(x, y) = (x, 0) + (0, y) = x(1, 0) + y(0, 1) = xE_1 + yE_2$$
This is true no matter how $E_1$ and $E_2$ are chosen, so long as the points
0, $E_1$, $E_2$ do not lie on the same line (we say the vectors $E_1$ and $E_2$ are not
collinear). Thus we have this important fact.
Proposition 4. Let $E_1$ and $E_2$ be any two noncollinear vectors in the plane.
Then we can write any vector Q uniquely as
$$Q = x^1 E_1 + x^2 E_2$$
$x^1$ and $x^2$ are the coordinates of Q relative to the choice of origin 0 and the
vectors $E_1$ and $E_2$.
If we state this geometric fact purely as a fact about $R^2$, it turns out to be
a theoretical assertion about the solvability of a pair of linear equations.
Thus, let us suppose $E_1 = (a_1^1, a_1^2)$, $E_2 = (a_2^1, a_2^2)$ relative to some standard
coordinate system (for example, the usual rectangular coordinates). First
of all, how do we express algebraically the assertion that $E_1$ and $E_2$ do not
lie on the same line? We need an algebraic description of a straight line
through the origin.
Proposition 5.
(i) A set L is a straight line through 0 if and only if there exists $(a, b) \in R^2$, $(a, b) \ne (0, 0)$,
such that
$$L = \{(x, y) : ax + by = 0\}$$
(ii) The points (x, y), (x', y') lie on the same line through the origin if and
only if
$$\frac{x}{x'} = \frac{y}{y'}$$
You certainly recall these facts from analytic geometry; we leave the
verification to the exercises. Returning to the vectors $E_1$ and $E_2$, these
geometric facts become the following algebraic fact.
Proposition 6. Let
$$A = \begin{pmatrix} a_1^1 & a_2^1 \\ a_1^2 & a_2^2 \end{pmatrix}$$
be a 2 × 2 matrix with nonzero columns.
(i) If $a_1^1 a_2^2 - a_1^2 a_2^1 \ne 0$, then the equation Ax = b has a unique solution
for every $b \in R^2$.
(ii) The equation Ax = 0 has a nonzero solution if and only if $a_1^1 a_2^2 = a_1^2 a_2^1$.
(iii) If $a_1^1 a_2^2 = a_1^2 a_2^1$, the equation Ax = b has a solution if and only
if $a_1^1 b^2 = a_1^2 b^1$.
Proof.
(i) This condition is, according to Proposition 5(ii), precisely the assertion that
the vectors $(a_1^1, a_1^2)$, $(a_2^1, a_2^2)$ are not collinear. Then, according to Proposition 4,
for any $(b^1, b^2)$ there is a unique pair $(x^1, x^2)$ such that
$$(b^1, b^2) = x^1 (a_1^1, a_1^2) + x^2 (a_2^1, a_2^2)$$
This is the same as the pair of equations b = Ax.
(ii) By the above, if $a_1^1 a_2^2 \ne a_1^2 a_2^1$, then the only solution of Ax = 0 is x = 0.
On the other hand, if $a_1^1 a_2^2 = a_1^2 a_2^1$, then either $(a_2^2, -a_1^2)$ or $(-a_2^1, a_1^1)$ is a
nonzero solution of Ax = 0, or all the entries of A are zero, in which case everything
solves Ax = 0.
(iii) If $a_1^1 a_2^2 = a_1^2 a_2^1$, then $(a_1^1, a_1^2)$, $(a_2^1, a_2^2)$ lie on the same line through the
origin by Proposition 5(ii). Any combination $x^1 (a_1^1, a_1^2) + x^2 (a_2^1, a_2^2)$ will have to
lie on that line, and conversely, any point on that line must be such a combination.
Thus Ax = b has a solution if and only if $(b^1, b^2)$ lies on the line through 0
determined by $(a_1^1, a_1^2)$. The equation for this is, by Proposition 5(ii),
$$\frac{b^1}{a_1^1} = \frac{b^2}{a_1^2} \quad \text{or} \quad b^2 a_1^1 = b^1 a_1^2$$
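The three cases of Proposition 6 translate into a short decision procedure. The sketch below is an illustration under stated assumptions: the function name `solve_2x2` and the string labels are ours, the matrix is passed by rows, its entries are integers or fractions, and its columns are assumed nonzero as in the proposition. The explicit formula used in the first case is the familiar elimination formula, which the text does not write out.

```python
from fractions import Fraction


def solve_2x2(A, b):
    """Classify and, when possible, solve the system
           p*x1 + q*x2 = b1
           r*x1 + s*x2 = b2
    where A = ((p, q), (r, s)) has nonzero columns (p, r) and (q, s)."""
    (p, q), (r, s) = A
    b1, b2 = b
    det = p * s - q * r
    if det != 0:
        # Case (i): a unique solution for every b.
        x1 = Fraction(b1 * s - q * b2, det)
        x2 = Fraction(p * b2 - r * b1, det)
        return ("unique", (x1, x2))
    # Case (iii): solvable exactly when b lies on the line through the first column.
    if p * b2 == r * b1:
        return ("many", None)
    return ("none", None)


print(solve_2x2(((1, 2), (2, 4)), (1, 2)))   # collinear columns, b on their line
print(solve_2x2(((1, 2), (2, 4)), (1, 3)))   # collinear columns, b off their line
```

Exact rational arithmetic is used so that the test `det != 0` means what it says.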
Examples
8. Let $L_1$ be the line through (0, 0) and (3, 2) and $L_2$ the line
through (1, 1) and (0, 6). Find the point of intersection of $L_1$ and $L_2$.
$L_1$ has the equation $2x - 3y = 0$, and $L_2$ the equation $5x + y = 6$.
The point of intersection must lie on both lines, and thus is the pair
(x, y) solving
$$2x - 3y = 0$$
$$5x + y = 6$$
We find x = 18/17, y = 12/17.
9. Find the line L through the point (7, 3) that is parallel to the
line L': $8x + 2y = 17$. L will be given by an equation of the form
$ax + by = c$. In order to be parallel to L', L and L' must have no
point of intersection, so the equations
$$8x + 2y = 17$$
$$ax + by = c$$
can have no common solution. Thus we must have
$$\frac{8}{a} = \frac{2}{b}$$
Furthermore, since (7, 3) is on L, we must also have
$$7a + 3b = c$$
This pair of equations in three unknowns has for a solution a = 4,
b = 1, c = 31. Thus L is given by the equation
$$4x + y = 31$$
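The reasoning of Example 9 works for any line and any point: keep the coefficients of x and y and choose the constant so that the given point satisfies the equation. A minimal sketch, assuming integer coefficients (the function name and the reduction by the greatest common divisor are our own choices):

```python
from math import gcd


def parallel_through(a, b, point):
    """Return (a', b', c') describing the line a'x + b'y = c' through `point`
    parallel to a*x + b*y = const, with integer coefficients in lowest terms.
    Assumes a and b are integers, not both zero."""
    c = a * point[0] + b * point[1]        # force the point onto the line
    g = gcd(gcd(a, b), c)                  # common factor of all three numbers
    return (a // g, b // g, c // g)


print(parallel_through(8, 2, (7, 3)))      # Example 9: 4x + y = 31
```

The answer is determined only up to a common nonzero factor, which is why the example speaks of "a solution" of two equations in three unknowns.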
EXERCISES
7. Show that for every integer n,
$$1^2 + 2^2 + \cdots + n^2 = \tfrac16 n(n + 1)(2n + 1)$$
8. Show that for every integer n,
$$2 + 2^2 + \cdots + 2^n = 2^{n+1} - 2$$
9. Show that $X \cap (Y \cup Z) = (X \cap Y) \cup (X \cap Z)$ and $X \cup (Y \cap Z) =
(X \cup Y) \cap (X \cup Z)$.
10. Give an example of a subset of a Cartesian product which is not a
rectangle.
11. Find the point of intersection of these pairs of lines in $R^2$:
(a) $3x + y = 1$, $x - 17y = 1$
(b) $x - 2y = 4$, $2x + y = 0$
(c) $2x + 2y = -1$, $x + 12y = 14$
(d) $y = 2x + 1$, $x = 3y + 18$
12. Find the line through P which is parallel to L:
(a) P = (2, -1), L: $3x + 7y = 4$
(b) P = (8, 1), L: $x - y = -1$
(c) P = (0, -7), L: $y - 2x = 3$
PROBLEMS
9. We can define the line through P and Q as the set of all X such that
the vector X - P is parallel to the vector P - Q. Show that two vectors are
parallel if and only if one is a multiple of the other. Conclude that the line
through P and Q is the set
$$\{P + t(P - Q) : t \in R\}$$
10. Using the definition in Problem 9 show that a straight line in the plane
is, in terms of coordinates, given as
$$\{(x, y) \in R^2 : ax + by + c = 0\}$$
for suitable a, b, c.
11. Suppose L is a line given by the equation $bx + ay + c = 0$.
(a) Show that the tangent of the angle this line makes with the
horizontal is $-b/a$.
(b) Show that the vector (b, a) is perpendicular to L.
(c) Find the point on L which is closest to the origin.
12. Find the line through the point P and perpendicular to L:
(a) P = (3, 7), L: $x - 3y = 2$.
(b) P = (-1, 1), L: $2x + 3y = 0$.
(c) P = (0, 2), L: $5x = 2y$.
13. Suppose coordinates have been chosen in the plane. Let $E_1$, $E_2$ be
two vectors in the plane which are not collinear. (That is, 0, $E_1$, $E_2$ do not
lie on a straight line.) Then we can recoordinatize the plane relative to this
choice of principal directions. Give formulas which relate these two
coordinatizations in terms of the given coordinates of $E_1$, $E_2$ (see Figure 1.12).
Figure 1.12
14. In the text, the equation $1 + 2 + \cdots + n = \tfrac12 n(n + 1)$ was verified by
induction. There is another way of doing this. An n × n matrix has $n^2$
entries. There are n of these entries on the diagonal and $1 + 2 + \cdots + (n - 1)$
entries both above and below that diagonal. Thus
$$2(1 + 2 + \cdots + (n - 1)) + n = n^2$$
1.3 Linear Transformations
We now return to the problem of analyzing systems of simultaneous linear
equations, with a broader question in mind: given the m × n matrix A, for
which b is there a solution of the equation Ax = b? In order to study this,
we associate to A the function from $R^n$ to $R^m$:
$$f_A(x^1, \ldots, x^n) = (a_1^1 x^1 + \cdots + a_n^1 x^n, \ldots, a_1^m x^1 + \cdots + a_n^m x^n) \qquad (1.16)$$
Thus the set of b such that Ax = b has a solution is precisely the range of $f_A$.
Let us begin by introducing the two fundamental operations on $R^n$ (just as
in the case n = 2 studied in the previous section):
1. Scalar Multiplication: for $r \in R$, $\mathbf{x} = (x^1, \ldots, x^n) \in R^n$, define
$$r\mathbf{x} = (rx^1, \ldots, rx^n)$$
2. Addition: for $\mathbf{x} = (x^1, \ldots, x^n)$, $\mathbf{y} = (y^1, \ldots, y^n) \in R^n$, define
$$\mathbf{x} + \mathbf{y} = (x^1 + y^1, \ldots, x^n + y^n)$$
Definition 2. A function f from $R^n$ to $R^m$ is a linear transformation if it
preserves these two operations, that is, if
$$f(r\mathbf{x}) = rf(\mathbf{x})$$
$$f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) + f(\mathbf{y})$$
The function $f_A$ defined above for the m × n matrix A is linear:
$$f_A(r\mathbf{x}) = (a_1^1 rx^1 + \cdots + a_n^1 rx^n, \ldots, a_1^m rx^1 + \cdots + a_n^m rx^n)$$
$$= (r(a_1^1 x^1 + \cdots + a_n^1 x^n), \ldots, r(a_1^m x^1 + \cdots + a_n^m x^n))$$
$$= rf_A(\mathbf{x})$$
$$f_A(\mathbf{x} + \mathbf{y}) = (a_1^1 (x^1 + y^1) + \cdots + a_n^1 (x^n + y^n), \ldots, a_1^m (x^1 + y^1) + \cdots + a_n^m (x^n + y^n))$$
$$= (a_1^1 x^1 + \cdots + a_n^1 x^n + a_1^1 y^1 + \cdots + a_n^1 y^n, \ldots, a_1^m x^1 + \cdots + a_n^m x^n + a_1^m y^1 + \cdots + a_n^m y^n)$$
$$= f_A(\mathbf{x}) + f_A(\mathbf{y})$$
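Formula (1.16) and the two linearity identities are easy to try on numbers. In the sketch below the matrix is stored as a list of its m rows, each of length n; the particular matrix, vectors and scalar are hypothetical sample data, and checking one example illustrates the computation above without replacing it.

```python
def f_A(A, x):
    """Apply the m x n matrix A (a list of m rows of length n) to the n-tuple x,
    as in formula (1.16): the ith entry is the sum over j of A[i][j] * x[j]."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in A)


def scale(r, x):
    return tuple(r * xi for xi in x)


def add(x, y):
    return tuple(xi + yi for xi, yi in zip(x, y))


A = [[1, 2, 0],
     [0, -1, 3]]                # a sample 2 x 3 matrix, so f_A maps R^3 to R^2
x, y, r = (1, 4, -2), (3, 0, 5), 7

assert f_A(A, scale(r, x)) == scale(r, f_A(A, x))            # f(rx) = r f(x)
assert f_A(A, add(x, y)) == add(f_A(A, x), f_A(A, y))        # f(x + y) = f(x) + f(y)
print(f_A(A, x), f_A(A, y))
```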
The significance of the introduction of linear transformations, from the
point of view of systems of equations, is that it provides a context in which to
consistently interpret the technique of row reduction. For the application
of a row operation to a system of equations amounts to composition of the
associated linear transformation with a particular linear transformation
corresponding to the row operation. Once we have seen that, we can analyze
the given system by studying these successive compositions. Looking ahead,
it is even more important to recognize row reduction as a tool for analyzing
linear transformations. Let us now interpret the row operations as linear
transformations.
Type I. Multiply a row by a nonzero constant. Consider the multiplication
of the rth row by $c \ne 0$. Let $P_1$ be the transformation on $R^m$:
$$P_1(b^1, \ldots, b^m) = (b^1, \ldots, cb^r, \ldots, b^m)$$
(multiplication of the rth entry by c). The effect of this row operation is that
of composing the transformation $f_A: R^n \to R^m$ with $P_1$, and changing the
equation Ax = b into the equation $P_1Ax = P_1b$. These two equations have
the same set of solutions since the transformation $P_1$ can be reversed (it is
invertible). Precisely, its inverse is given by multiplying the rth entry by 1/c.
Type II. Add one row to another. Adding the rth row to the ith row
corresponds to this transformation on $R^m$:
$$P_2(b^1, \ldots, b^m) = (b^1, \ldots, b^r, \ldots, b^i + b^r, \ldots, b^m)$$
Again, this step in the solution of the equations amounts to transforming the
equation Ax = b to $P_2Ax = P_2b$. Since $P_2$ is invertible (what is its inverse?)
we cannot have affected the solutions.
Type III. Interchange two rows. Interchanging the rth and ith rows
corresponds to the transformation
$$P_3(b^1, \ldots, b^r, \ldots, b^i, \ldots, b^m) = (b^1, \ldots, b^i, \ldots, b^r, \ldots, b^m)$$
The importance of these observations is this: the row operations correspond
to linear transformations which in turn are representable by matrices. The
solution of the system of equations Ax = b thus can be accomplished
completely in terms of manipulations with the matrix corresponding to the system.
It is our purpose now to study the representation of linear transformations
by matrices and the representation of composition of transformations.
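The three elementary transformations can be written down directly as functions on m-tuples. The sketch below is an illustration only: the function names are ours, positions are counted from 0 as is usual in Python, and `P2_inverse` records one natural answer to the question of what undoes a Type II operation.

```python
from fractions import Fraction


def P1(b, r, c):
    """Type I: multiply the rth entry by the nonzero constant c."""
    out = list(b)
    out[r] = c * out[r]
    return tuple(out)


def P2(b, r, i):
    """Type II: add the rth entry to the ith entry (r != i)."""
    out = list(b)
    out[i] = out[i] + out[r]
    return tuple(out)


def P2_inverse(b, r, i):
    """Undo P2: subtract the rth entry from the ith entry."""
    out = list(b)
    out[i] = out[i] - out[r]
    return tuple(out)


def P3(b, r, i):
    """Type III: interchange the rth and ith entries."""
    out = list(b)
    out[r], out[i] = out[i], out[r]
    return tuple(out)


b = (2, -1, 5, 0)                                    # a sample point of R^4
assert P1(P1(b, 2, 4), 2, 1 / Fraction(4)) == b      # multiply by c, then by 1/c
assert P2_inverse(P2(b, 0, 3), 0, 3) == b            # add, then subtract
assert P3(P3(b, 1, 2), 1, 2) == b                    # interchanging twice
```

Each operation has an inverse of the same kind, which is exactly why applying it to both sides of Ax = b cannot change the set of solutions.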
In $R^n$ the n vectors (1, 0, ..., 0), (0, 1, 0, ..., 0), ..., (0, ..., 0, 1) play a
fundamental role. We shall refer to them as $E_1, \ldots, E_n$, respectively.
Thus $E_i$ has all entries zero, but the ith, which is 1.
Proposition 7. Any vector in $R^n$ has a unique representation as a linear
combination of $E_1, \ldots, E_n$.
Proof. Obviously,
$$(b^1, \ldots, b^n) = b^1 E_1 + \cdots + b^n E_n$$
We shall refer to the set of vectors $E_1, \ldots, E_n$ as the standard basis for $R^n$. Out
of Proposition 7 comes this more illuminating fact.
Proposition 8. Corresponding to any linear transformation $L: R^n \to R^m$
there is a unique m × n matrix $(a_j^i)$ such that
$$L(x^1, \ldots, x^n) = \Big(\sum_{j=1}^n a_j^1 x^j, \ldots, \sum_{j=1}^n a_j^m x^j\Big) \qquad (1.17)$$
Proof. It is clear, by the way, that, given the matrix $(a_j^i)$, Equation (1.17) does
define a linear transformation. Now, given L, since it is linear, we can write
$$L((x^1, \ldots, x^n)) = L(x^1 E_1 + \cdots + x^n E_n) = x^1 L(E_1) + \cdots + x^n L(E_n) \qquad (1.18)$$
Thus a linear transformation is completely determined by what it does to the
standard basis. Let
$$L(E_1) = (a_1^1, \ldots, a_1^m), \ldots, L(E_n) = (a_n^1, \ldots, a_n^m)$$
Then Equation (1.18) becomes
$$L((x^1, \ldots, x^n)) = x^1 (a_1^1, \ldots, a_1^m) + \cdots + x^n (a_n^1, \ldots, a_n^m)$$
$$= (x^1 a_1^1, \ldots, x^1 a_1^m) + \cdots + (x^n a_n^1, \ldots, x^n a_n^m)$$
$$= (x^1 a_1^1 + \cdots + x^n a_n^1, \ldots, x^1 a_1^m + \cdots + x^n a_n^m)$$
which is just Equation (1.17).
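The proof is constructive: the jth column of the matrix is simply $L(E_j)$. A short sketch of that construction follows, with a hypothetical linear transformation L from $R^3$ to $R^2$ chosen for illustration (the helper names are ours).

```python
def standard_basis(n):
    """Return E_1, ..., E_n as tuples of length n."""
    return [tuple(1 if k == j else 0 for k in range(n)) for j in range(n)]


def matrix_of(L, n):
    """Return the matrix of the linear transformation L on R^n as a list of rows.
    Column j of the matrix is L(E_j), as in the proof of Proposition 8."""
    columns = [L(E) for E in standard_basis(n)]
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def L(x):
    """A sample linear transformation from R^3 to R^2."""
    x1, x2, x3 = x
    return (x1 + 2 * x3, 3 * x2 - x1)


M = matrix_of(L, 3)
print(M)

# Equation (1.17): applying the matrix reproduces L.
x = (4, -1, 2)
assert tuple(sum(a * xj for a, xj in zip(row, x)) for row in M) == L(x)
```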
Matrix Multiplication
Now we must discover how to represent the composition of two linear
transformations by an operation on matrices. There is only one way to
make this discovery: compute. Suppose then that $T: R^n \to R^m$ and
$S: R^m \to R^p$ are linear transformations represented by the matrices $(a_j^i)$ and
$(b_j^i)$, respectively. Then, we can compute the composition ST as follows:
$$T(x^1, \ldots, x^n) = \Big(\sum_{j=1}^n a_j^1 x^j, \ldots, \sum_{j=1}^n a_j^m x^j\Big)$$
$$ST(x^1, \ldots, x^n) = \Big(\sum_{k=1}^m b_k^1 \Big(\sum_{j=1}^n a_j^k x^j\Big), \ldots, \sum_{k=1}^m b_k^p \Big(\sum_{j=1}^n a_j^k x^j\Big)\Big)$$
Thus ST is represented by the p × n matrix whose (i, j)th entry is
$$\sum_{k=1}^m b_k^i a_j^k \qquad (1.19)$$
Definition 3. Let $A = (a_j^i)$ be an m × n matrix and $B = (b_j^i)$ a p × m
matrix. Then the product BA is defined as the p × n matrix whose (i, j)th
entry is given by (1.19).
The preceding discussion thus provides the verification of
Proposition 9. If $T: R^n \to R^m$, $S: R^m \to R^p$ are represented by the matrices
A, B, respectively, then ST is represented by the product BA.
The product operation may seem a bit obscure at first sight; but it is easily
described in this way: the (i, j)th entry of BA is found by multiplying entry
by entry the ith row of B by the jth column of A, and adding.
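The row-by-column rule is a three-line program. The sketch below (function names ours, matrices stored as lists of rows) forms the product of the two matrices A and B of Example 10 and checks on one sample vector that the product matrix represents the composition of the two transformations.

```python
def mat_mul(B, A):
    """Return the product BA: its (i, j) entry is the ith row of B times
    the jth column of A, multiplied entry by entry and added."""
    assert len(B[0]) == len(A), "B must have as many columns as A has rows"
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]


def apply(M, x):
    """Apply the matrix M to the tuple x."""
    return tuple(sum(a * xj for a, xj in zip(row, x)) for row in M)


A = [[5, 3, 7], [6, 5, 1], [8, 11, -4]]
B = [[6, 1, 0], [-3, 2, 5], [4, 4, 4]]

AB = mat_mul(A, B)
print(AB)

# AB represents "first apply B, then apply A".
x = (1, -2, 3)
assert apply(AB, x) == apply(A, apply(B, x))
```

Note the order: in the product AB it is B that acts first, just as in the composition of functions.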
Examples
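10.
$$A = \begin{pmatrix} 5 & 3 & 7 \\ 6 & 5 & 1 \\ 8 & 11 & -4 \end{pmatrix} \qquad B = \begin{pmatrix} 6 & 1 & 0 \\ -3 & 2 & 5 \\ 4 & 4 & 4 \end{pmatrix}$$
Let $AB = (c_j^i)$. Then
$$c_1^1 = 5 \cdot 6 + 3(-3) + 7 \cdot 4 = 49$$
$$c_1^2 = 6 \cdot 6 + 5(-3) + 1 \cdot 4 = 25$$
$$c_1^3 = 8 \cdot 6 + 11(-3) + (-4)4 = -1$$
$$c_3^2 = 6 \cdot 0 + 5 \cdot 5 + 1 \cdot 4 = 29$$
$$AB = \begin{pmatrix} 49 & 39 & 43 \\ 25 & 20 & 29 \\ -1 & 14 & 39 \end{pmatrix}$$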
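11.
$$\begin{pmatrix} -1 \\ 0 \\ 1 \end{pmatrix} \begin{pmatrix} 2 & 5 & 1 \end{pmatrix} = \begin{pmatrix} 2 \cdot (-1) & 5 \cdot (-1) & 1 \cdot (-1) \\ 2 \cdot 0 & 5 \cdot 0 & 1 \cdot 0 \\ 2 \cdot 1 & 5 \cdot 1 & 1 \cdot 1 \end{pmatrix}$$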
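12.
$$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} -1 & -7 & 0 \\ -1 & 4 & -2 \end{pmatrix}$$
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} -1 & 4 & -2 \\ 1 & 7 & 0 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 11 & -2 \\ -1 & 4 & -2 \end{pmatrix}$$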
Now, let us recapitulate the discussion of this section so far. The problem
of systems of m linear equations in η unknowns amounts to describing the
range of a linear transformation $T: R^n \to R^m$. The technique of row
reduction corresponds to composing T with a succession of invertible transformations
on $R^m$. These transformations are those which provide the row operations;
we shall call them elementary transformations. Linear transformations can
be represented by means of the standard basis by matrices, and composition
of the transformations corresponds to matrix multiplication. Thus, we
solve a system of linear equations as follows: Multiply the matrix on the left
by a succession of elementary matrices in order to obtain a row-reduced
matrix. Then we can easily read off the solutions. Since multiplication
by an elementary matrix is the same as applying the corresponding row
operation to the matrix it is easy to keep track of this process.
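The bookkeeping just described can be sketched in a few lines of Python: every row operation applied to the matrix is applied at the same time to a second matrix that starts out as the identity, so that at the end the second matrix is the product P of the elementary matrices. This is a minimal sketch under our own conventions, not the only possible ordering of the operations: the function name is ours, exact fractions are used, every leading entry is normalized to 1, and only the entries below a leading entry are cleared. A computation done by hand with different choices can therefore produce a different, equally valid P.

```python
from fractions import Fraction


def row_reduce_with_P(A):
    """Row reduce A (a list of m rows) and return (M, P, d), where M = PA,
    P is the accumulated product of elementary matrices, and d is the number
    of nonzero rows of M."""
    m, n = len(A), len(A[0])
    M = [[Fraction(v) for v in row] for row in A]
    P = [[Fraction(int(i == j)) for j in range(m)] for i in range(m)]
    pivot_row = 0
    for col in range(n):
        if pivot_row == m:
            break
        # Look for a nonzero entry in this column at or below the pivot row.
        src = next((r for r in range(pivot_row, m) if M[r][col] != 0), None)
        if src is None:
            continue
        # Type III: interchange rows (in both matrices).
        M[pivot_row], M[src] = M[src], M[pivot_row]
        P[pivot_row], P[src] = P[src], P[pivot_row]
        # Type I: divide the pivot row by its leading entry.
        c = M[pivot_row][col]
        M[pivot_row] = [v / c for v in M[pivot_row]]
        P[pivot_row] = [v / c for v in P[pivot_row]]
        # Subtract multiples of the pivot row from the rows below it.
        for r in range(pivot_row + 1, m):
            k = M[r][col]
            if k != 0:
                M[r] = [v - k * w for v, w in zip(M[r], M[pivot_row])]
                P[r] = [v - k * w for v, w in zip(P[r], P[pivot_row])]
        pivot_row += 1
    return M, P, pivot_row


A = [[2, 0, 0, 2], [3, -1, 1, 0], [2, 2, 0, 0]]     # the matrix of Example 15
M, P, d = row_reduce_with_P(A)
for row in M:
    print([str(v) for v in row])
for row in P:
    print([str(v) for v in row])
```

Whatever order the operations are performed in, the two outputs must satisfy M = PA, which gives a convenient check on the arithmetic.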
Examples
13. Let us consider the system of four equations in three unknowns
corresponding to the matrix
$$A = \begin{pmatrix} 4 & 0 & 1 \\ 3 & 2 & 2 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix}$$
We shall record the process of row reduction in two columns. In
the first we shall list the succession of transformations which A
undergoes and in the second we shall accumulate the products of the
corresponding elementary matrices.
(a) Multiply the third row by -1 and interchange it with the
first.
$$\begin{pmatrix} 1 & 0 & -1 \\ 3 & 2 & 2 \\ 4 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(b) Multiply the first row by 3 and subtract it from the second;
multiply the first row by 4 and subtract it from the third.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 2 & 5 \\ 0 & 0 & 5 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 3 & 0 \\ 1 & 0 & 4 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(c) Divide the second row by 2 and the third row by 5.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1/2 & 3/2 & 0 \\ 1/5 & 0 & 4/5 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(d) Subtract the second row from, and add one-half the third
row to, the fourth.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1/2 & 3/2 & 0 \\ 1/5 & 0 & 4/5 & 0 \\ 1/10 & -1/2 & -11/10 & 1 \end{pmatrix}$$
Let us denote the product of the elementary matrices by P; thus P
is the last matrix on the right and the matrix on the left is PA. Now,
it is easy to see that if PAx = y has a solution, the fourth entry of y
must be zero. Now our original problem
$$Ax = b$$
has a solution if and only if PAx = Pb is solvable (since P is invertible).
Thus b is in the range of A if and only if the fourth entry of Pb is zero:
Ax = b can be solved if and only if
$$\tfrac{1}{10} b^1 - \tfrac12 b^2 - \tfrac{11}{10} b^3 + b^4 = 0$$
If b satisfies that condition, there is an x such that Ax = b; we find
it by solving PAx = Pb:
$$x^1 - x^3 = -b^3$$
$$x^2 + \tfrac52 x^3 = \tfrac12 b^2 + \tfrac32 b^3$$
$$x^3 = \tfrac15 b^1 + \tfrac45 b^3$$
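14. Consider now the system in three unknowns given by
$$A = \begin{pmatrix} 1 & 3 & -2 \\ 4 & 2 & -2 \\ 3 & 4 & -3 \end{pmatrix}$$
We row reduce as above.
(a) Multiply row 1 by 4 and subtract it from row 2; multiply
row 1 by 3 and subtract it from row 3.
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & -10 & 6 \\ 0 & -5 & 3 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ -4 & 1 & 0 \\ -3 & 0 & 1 \end{pmatrix}$$
(b) Subtract 1/2 of row 2 from row 3; divide row 2 by -10.
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & 1 & -3/5 \\ 0 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 2/5 & -1/10 & 0 \\ -1 & -1/2 & 1 \end{pmatrix}$$
The system Ax = b thus has a solution if and only if
$$-b^1 - \tfrac12 b^2 + b^3 = 0 \qquad (1.20)$$
In that case the solution is given by
$$x^1 + 3x^2 - 2x^3 = b^1$$
$$x^2 - \tfrac35 x^3 = \tfrac25 b^1 - \tfrac{1}{10} b^2$$
Any arbitrarily chosen value of $x^3$ will provide a solution (granted the
condition (1.20) is satisfied).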
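15.
$$A = \begin{pmatrix} 2 & 0 & 0 & 2 \\ 3 & -1 & 1 & 0 \\ 2 & 2 & 0 & 0 \end{pmatrix}$$
(a) Divide row 1 by 2.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 3 & -1 & 1 & 0 \\ 2 & 2 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
(b) Subtract 3 times row 1 from row 2; subtract twice row 1
from row 3.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & -1 & 1 & -3 \\ 0 & 2 & 0 & -2 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ -3/2 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$$
(c) Multiply row 2 by -1, and subtract twice the result from
row 3.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 2 & -8 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ 3/2 & -1 & 0 \\ -4 & 2 & 1 \end{pmatrix}$$
Here there is no condition for the equation PAx = y to be
solvable, thus every problem Ax = b is also solvable. The solution
is found by writing the system PAx = Pb:
$$x^1 + x^4 = \tfrac12 b^1$$
$$x^2 - x^3 + 3x^4 = \tfrac32 b^1 - b^2$$
$$2x^3 - 8x^4 = -4b^1 + 2b^2 + b^3$$
Clearly, the value of $x^4$ can be freely chosen, and $x^1$, $x^2$, $x^3$ are
easily found by the equations.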
Validity of Row Reduction
The basic point behind the present discussion is that the study of m
simultaneous linear equations in n unknowns is the same as the study of linear
transformations of $R^n$ into $R^m$, which is the same as the study of m × n
matrices under multiplication by m × m matrices. The matrix version of
this story is the easiest to work, if only because it imposes the minimum
amount of notation. However, the linear transformation interpretation is
the most significant, and in the next section we will follow that line of thought.
But first, let us record a proof of the main result of Section 1.1 in terms of
matrices.
Theorem 1.2. Let A be an m × n matrix. There is a finite collection
$E_0, \ldots, E_s$ of elementary m × m matrices such that the product $E_s \cdots E_0 A$
is in row-reduced form. Let $P = E_s \cdots E_0$ and let d be the index of PA.
(i) The system Ax = b has a solution if and only if PAx = Pb has a
solution.
(ii) The system Ax = b has a solution if and only if the last m - d entries
of Pb vanish.
(iii) n - d unknowns can be freely chosen in any solution of Ax = b.
Proof. First of all, we may, by a sequence of row operations, replace A with a
matrix whose first nonzero column is
$$\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \qquad (1.21)$$
Namely, suppose the jth column is the first nonzero column. Thus some entry
in that column, say $a_j^i$, is nonzero. Interchange the first and ith rows. This is
accomplished by multiplication on the left by an elementary matrix of Type III, call
it $E_0$. Now, $E_0 A = (a_j^i)$ with $a_j^1 \ne 0$. Multiply the first row by $(a_j^1)^{-1}$; this makes
the (1, j) entry 1 and is accomplished by means of an elementary matrix, say $E_1$.
Now, for i = 2, ..., m, let $E_i$ be the elementary matrix representing the operation of adding
$-a_j^i$ times the new first row to the ith row (this makes the (i, j) entry zero). Then
$E_m \cdots E_0 A$ has its first nonzero column (1.21).
The proof now proceeds by induction on m. If m = 1 the proof is concluded:
the 1 × n matrix $(0, \ldots, 0, 1, a_{j+1}^1, \ldots, a_n^1)$ is in row-reduced form. For m > 1,
the matrix $E_m \cdots E_0 A$ has the form
$$\begin{pmatrix} 0 & \cdots & 0 & 1 & a_{j+1}^1 \; \cdots \; a_n^1 \\ 0 & \cdots & 0 & 0 & B \end{pmatrix} \qquad (1.22)$$
where B is an (m - 1) × (n - j) matrix. The induction assumption thus applies to
B. There is a collection $F_0, \ldots, F_s$ of (m - 1) × (m - 1) elementary matrices such
that $F_s \cdots F_0 B$ is in row-reduced form. Now let
$$E_{m+1+k} = \begin{pmatrix} 1 & 0 \\ 0 & F_k \end{pmatrix}, \qquad 0 \le k \le s$$
Then, it is easy to compute that further multiplication of (1.22) by these matrices
does not affect the first row, and in fact,
$$E_{m+1+s} \cdots E_0 A = \begin{pmatrix} 0 & \cdots & 0 & 1 & a_{j+1}^1 \; \cdots \; a_n^1 \\ 0 & \cdots & 0 & 0 & F_s \cdots F_0 B \end{pmatrix}$$
which is in row-reduced form.
(i) Suppose there are x and b such that Ax = b. Then multiplication by P
preserves the equality, so PAx = Pb. On the other hand, suppose x, b are given
such that PAx = Pb. Let $f_P$ be the transformation on $R^m$ corresponding to P.
$f_P$ is a composition of row operations which are invertible, thus $f_P$ is invertible. In
particular, $f_P$ is one-to-one, so since $f_P(Ax) = f_P(b)$ we must have Ax = b.
(ii) If d is the index of the m × n matrix PA, its last m - d rows vanish. Thus for
PAx = Pb to hold for some x, the last m - d entries of Pb must vanish. By (i)
this is also the condition for Ax = b to have a solution.
(iii) The solutions of Ax = b are the same as those of PAx = Pb. This latter
system has the form
$$x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n = \sum_j P_j^1 b^j$$
$$x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n = \sum_j P_j^2 b^j$$
$$\vdots$$
$$x^d + a_{d+1}^d x^{d+1} + \cdots + a_n^d x^n = \sum_j P_j^d b^j$$
Clearly, $x^1, \ldots, x^d$ are uniquely determined once $x^{d+1}, \ldots, x^n$, $b^1, \ldots, b^m$ are known.
The b's are restricted by the last m - d equations of Pb = 0, but $x^{d+1}, \ldots, x^n$ are
free to take any values.
EXERCISES
13. Compute the products AB:
(a) A = lo
(b) A =
0
1
0
0
2
0
-3
= 0 -2
B =
Ό
1
2
0
°\
°
i/
1
-2
3
1
2
3
0
2
3
0
-1
3
(с) А = (6, 6, 3,2,1,) B =
(d) A
2 8 6\ /0
! :i -i) B4?
14. Compute the products BA for the matrices A, B of Exercise 13.
15. Compute the matrix corresponding to the sequence of row operations
which row reduce the matrices of Exercise 5.
16. For the given $m \times n$ matrix $A$ find conditions on the vector $b$ in $R^m$
under which the equation $Ax = b$ has a solution.
(a) A as given in Exercise 13(a).
(b) A as given in Exercise 13(b).
(c)
i-'
2
0
\ 4
/2
4
2
\8
2\
0
-1
1
V
6 0
2 0
1 0
6
0
0
1
-1
(d)
17. Show that if $A$ is an $m \times n$ matrix with $m > n$, then there are always
$b$ for which the equation $Ax = b$ has no solution.
18. Verify that the composition of two linear transformations is again
linear.
19. Suppose that $T: R^n \to R^m$ is linear and has this property:
$T(E_1) = 0, \dots, T(E_n) = 0$.
Show that $T(x) = 0$ for every $x \in R^n$.
20. Show that there is only one linear function on $R^n$ with this property:
$f(E_1) = E_2$, $f(E_2) = E_3$, $\dots$, $f(E_n) = E_1$.
• PROBLEMS
15. Let $f: R^n \to R$ be defined by $f(x^1, \dots, x^n) = \sum_{i=1}^n x^i$. Show that $f$
is a linear function. Is the function
$g(x^1, \dots, x^n) = \sum_{i=1}^n (x^i)^2$
linear? Is the function $h(x^1, x^2) = x^1 x^2$ linear?
16. Suppose that $S$, $T$ are linear transformations of $R^n$ to $R^m$. Show that
$S + T$, defined by
$(S + T)(x) = S(x) + T(x)$
is also linear. Show that the matrix representing $S + T$ is the entry by
entry sum of the matrices representing $S$, $T$, respectively.
17. Let $A$, $B$, $C$ be $n \times n$ matrices. Show that
$(AB)C = A(BC)$
$(A + B)C = AC + BC$
$A(B + C) = AB + AC$
Show that $AB = BA$ need not be true.
18. Write down the products of the elementary matrices which row
reduce these matrices:
l\
3
2
°/
19. Is it possible to apply further operations to the matrices of Exercise 18
in order to bring them to the identity? Notice that when this is possible
for a given matrix A, the product Ρ of the elementary matrices
corresponding to these operations has the property PA = I. That is, Ρ is an
inverse to A. Using this suggestion compute inverses to these matrices
also:
/o
1
0
0
0
4
0
1
0
-2
7
-5
0
5
-1
3
6
0
1
1
6
2
0
3
3
-5
3
3
2
4
2
6
0
1
3
3
0
1
2
1
0
!\
4
°/
/8
0
0
\2
6
0
1
0
0
1
0
-1
0
0
0
-1
20. Find a $2 \times 2$ matrix $A$, different from the identity, such that $A^2 = I$.
Find a $2 \times 2$ matrix such that $A^2 = -I$.
21. Is the equation $(I + A)(I + B) = I + A + B$ possible (with nonzero
$A$ and $B$)?
22. An $n \times n$ matrix $A = (a_j^i)$ is said to be diagonal if $a_j^i = 0$ for $i \neq j$.
Show that diagonal matrices commute; that is, if $A$ and $B$ are diagonal
matrices, $AB = BA$. Give necessary and sufficient conditions for a diagonal
matrix to have an inverse.
1.4 Linear Subspaces of $R^n$
In the last section we saw that the equation
Ax = b
can be solved just for b's restricted by certain linear equations and that the
set of solutions of that equation might have some degrees of freedom. In both
cases these sets are determined by some linear equations; such sets are called
linear subspaces of $R^n$. We will begin with an intrinsic definition of linear
subspace and the notion of its dimension. In the next section we shall find
a simple relation between the dimensions of the sets related to the equation
Ax = b.
Definition 4.
(i) A set $V$ in $R^n$ is a linear subspace if it is closed under the operations of
addition and scalar multiplication. That is, these conditions must be
satisfied:
(1) $v_1, v_2 \in V$ implies $v_1 + v_2 \in V$.
(2) $r \in R$, $v \in V$ implies $rv \in V$.
(ii) If $S$ is a set of vectors in $R^n$, the linear span of $S$, denoted $[S]$, is the set
of all vectors of the form
$c^1 v_1 + \cdots + c^k v_k$
with $v_1, \dots, v_k \in S$.
(iii) The dimension of a linear subspace $V$ of $R^n$ is the minimum number
of vectors whose linear span is $V$.
Linear Span
Having now given the intuitively loaded word "dimension" a definition,
we had better hope that it suits our preconception of that notion. It does
just that in $R^3$: a line is one dimensional since it is the linear span of but one
vector; and a plane is two dimensional because we need that many vectors
to span it. In fact, it is precisely those observations which have motivated
the above definition. We should also ask that the above definition makes
this assertion true: $R^n$ has dimension $n$. You may need a little convincing
that this is not immediately obvious, since you do know of $n$ vectors (the
standard basis) whose linear span is $R^n$. But how can we be sure that we
cannot find fewer than $n$ vectors with the same property? Consider this
restatement of the notion of "spanning": If the vectors $v_1, \dots, v_k$ span
$R^n$, then the system of $n$ simultaneous linear equations
$$\sum_{i=1}^k x^i v_i = b$$
has a solution for every $b \in R^n$. We already know from the preceding section
that this cannot be if $k < n$, and that gives us a proof that $R^n$ has dimension
$n$. We now repeat the arguments in the present context.
Theorem 1.3. If the set $S$ of vectors in $R^n$ spans $R^n$, then $S$ has at least $n$
members. Thus, the dimension of $R^n$ is $n$.
Proof. The proof is by induction on $n$ and goes like this. Supposing that
$v_1, \dots, v_k$ span $R^n$, one of them must have a nonzero first entry. Subtracting an
appropriate multiple of that from each of the others, we may suppose that the
remaining $k - 1$ vectors all have first entry equal to zero. Then they are the same
as vectors in $R^{n-1}$, and since the original $v_1, \dots, v_k$ spanned $R^n$ we can show that
these must span $R^{n-1}$. Now, by induction $k - 1 \ge n - 1$, and we have it. (Notice
that this is the same as the first step in the proof of Theorem 1.2.) Here now is a
more precise argument.
If none of the $v_1, \dots, v_k$ has a nonzero first entry then $E_1 = (1, 0, \dots, 0)$ could
hardly be in their linear span. Letting $a_j$ be the first entry of $v_j$, we may suppose
(by reordering) that $a_1 \neq 0$. Now let $w_1 = v_1$ and $w_j = v_j - a_j a_1^{-1} v_1$ for $j = 2, \dots, k$.
The vectors $w_1, \dots, w_k$ have the same linear span as the vectors $v_1, \dots, v_k$ (see
Problem 23); the difference is that only $w_1$ has a nonzero first entry. Let $w_1 = (a_1, b_1)$,
$w_2 = (0, b_2), \dots, w_k = (0, b_k)$, where $b_1, \dots, b_k$ are in $R^{n-1}$. Now, $b_2, \dots, b_k$ span
$R^{n-1}$. For let $c \in R^{n-1}$. Then $(0, c) \in R^n$, and since $w_1, \dots, w_k$ span $R^n$, there are
$x^1, \dots, x^k \in R$ such that
$$\sum_{i=1}^k x^i w_i = (0, c)$$
Thus $x^1 a_1 + x^2 \cdot 0 + \cdots + x^k \cdot 0 = 0$ and $x^1 b_1 + \cdots + x^k b_k = c$. Since $a_1 \neq 0$, the first
equation implies $x^1 = 0$, so the second equation becomes $x^2 b_2 + \cdots + x^k b_k = c$.
Thus, $b_2, \dots, b_k$ span $R^{n-1}$, so by induction $k - 1 \ge n - 1$; that is, $k \ge n$. Thus,
$\dim R^n \ge n$. On the other hand, the standard basis $E_1, \dots, E_n$ clearly spans, so in
fact $\dim R^n = n$.
Examples
16. Let
$v_1 = (0, 1, 0, 3)$
$v_2 = (2, 2, 2, 2)$
$v_3 = (3, 3, 3, 3)$
be three vectors in $R^4$, and let $S$ be their linear span. Then clearly
$\dim S \le 3$. But it is also clear that $v_3$ is superfluous, since $v_3 = \tfrac{3}{2} v_2$.
Thus $S$ is also the linear span of $v_1, v_2$: if
$v = a^1 v_1 + a^2 v_2 + a^3 v_3$
then we can also write
$v = a^1 v_1 + (a^2 + \tfrac{3}{2} a^3) v_2$
Thus, $\dim S \le 2$. In fact, $S$ has dimension precisely 2. For suppose there
were a vector $w = (a^1, a^2, a^3, a^4)$ which spanned $S$. Then we would
have numbers $c_1, c_2$ such that $v_1 = c_1 w$, $v_2 = c_2 w$. Explicitly this
becomes
$$\begin{aligned} 0 &= c_1 a^1 & 2 &= c_2 a^1 \\ 1 &= c_1 a^2 & 2 &= c_2 a^2 \\ 0 &= c_1 a^3 & 2 &= c_2 a^3 \\ 3 &= c_1 a^4 & 2 &= c_2 a^4 \end{aligned}$$
But this is clearly impossible. By the second equation we must have
$c_1 \neq 0$, so by the first we must have $a^1 = 0$. But $2 = c_2 a^1$, which
could not be. Thus, $\dim S = 2$.
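The count in Example 16 can be cross-checked by machine. The sketch below relies on two facts stated later as Problems 27 and 28 (row operations do not change the linear span of the rows, and for a row-reduced matrix the dimension of that span is its index); the function names `rref` and `span_dimension` are illustrative, and the reduction clears entries above the leading 1s too, which does not change the count.

```python
from fractions import Fraction

def rref(rows):
    """Fully row reduce with exact fractions; return (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def span_dimension(vectors):
    """Dimension of the linear span: number of leading entries after reduction."""
    return len(rref(vectors)[1])

vs = [(0, 1, 0, 3), (2, 2, 2, 2), (3, 3, 3, 3)]
print(span_dimension(vs))        # 2: v_3 adds nothing to the span
print(span_dimension(vs[:2]))    # 2: v_1 and v_2 already span S
```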
17. Let $V$ be the subset of $R^4$ given by
$V = \{v : v^1 + v^2 + v^3 - v^4 = 0\}$
$V$ is certainly a linear subspace of $R^4$. We will shortly have the
theoretical tools to deduce that $V$ has dimension 3; with a little
work we can show it now. First of all, let $A_1 = (1, 0, 0, 1)$, $A_2 =
(0, 1, 0, 1)$, $A_3 = (0, 0, 1, 1)$. Then $A_1, A_2, A_3$ are all in $V$, and if
$v = (v^1, v^2, v^3, v^4)$ is in $V$, since $v^4 = v^1 + v^2 + v^3$ we have
$v = v^1 A_1 + v^2 A_2 + v^3 A_3$
Thus $V$ is the linear span of $A_1, A_2, A_3$, so $\dim V \le 3$. On the other
hand, if $\dim V < 3$, then $A_1, A_2, A_3$ can all be expanded in terms of
some pair of vectors $B_1, B_2$. If we delete the fourth entry in all
these vectors this amounts to saying that the standard basis vectors
in $R^3$ can be spanned by a pair of vectors. But $\dim R^3 = 3$, so this is
impossible. Thus $\dim V = 3$ also.
Independence
Repeating the definition once again, dimension is the minimum number of
vectors it takes to span a linear space. There is another closely allied
intuitive concept: that of "degrees of freedom" or "independent directions."
In such phrases as "there is a four parameter family of curves," "two
independently varying quantities are involved," allusion is being made to a
dimension-like notion. Now, if we try to pin down this notion
mathematically and specify the concept of independence in the linear space context, it
turns out to be precisely the requirement for a spanning set of vectors to be
minimal. In other words, the dimension of a linear space is also the
maximum number of degrees of freedom, or independent vectors, in the space.
Definition 5. Let $S$ be a set of vectors in $R^n$. We say that $S$ is a set of
independent vectors if the equation
$x^1 v_1 + \cdots + x^k v_k = 0$
with $x^1, \dots, x^k \in R$ and $v_1, \dots, v_k$ distinct elements of $S$ implies $x^1 =
0, \dots, x^k = 0$.
The standard basis of $R^n$ is an independent set, as is very easy to verify.
We now verify that $R^n$ has in fact no more than $n$ degrees of freedom in this
sense.
Proposition 10. Let $v_1, \dots, v_k$ be an independent set in $R^n$. Then $k \le n$.
Proof. The proof is by induction on $k$. The case $k = 1$ is automatically true,
since $n \ge 1$ always. Now let us proceed to the induction step ($k > 1$). Let $a_i$ be
the first entry of $v_i$; we can thus write $v_i = (a_i, b_i)$, where $b_i \in R^{n-1}$. If all the $a_i$
are zero, then $b_1, \dots, b_k$ are an independent set in $R^{n-1}$. By the induction
assumption then, $k \le n - 1$, so $k \le n$. Now suppose instead that some $a_i$ is nonzero. We
can reorder the given vectors so that $a_1 \neq 0$. Let $w_i = v_i - a_i a_1^{-1} v_1$ for $i \ge 2$. Then
the first entry of $w_i$ is 0, so $w_i = (0, \beta_i)$ with $\beta_i \in R^{n-1}$. $\beta_2, \dots, \beta_k$ are an
independent set in $R^{n-1}$. For if $\sum_{i=2}^k c^i \beta_i = 0$, then also $\sum_{i=2}^k c^i w_i = 0$, so
$$\Bigl(-\sum_{i=2}^k c^i a_i a_1^{-1}\Bigr) v_1 + \sum_{i=2}^k c^i v_i = 0$$
Since $v_1, \dots, v_k$ are an independent set, $c^2 = \cdots = c^k = 0$, so $\beta_2, \dots, \beta_k$ are also
independent. Thus, by induction, once again $k - 1 \le n - 1$. Thus in every case
$k \le n$, and the proposition is proved.
Examples
18. Let
$v_1 = (0, 3, 0, 2)$
$v_2 = (5, 1, 1, 2)$
$v_3 = (1, 0, 2, 2)$
In order to show that these vectors are independent we must show
that the system of equations
$$x^1 v_1 + x^2 v_2 + x^3 v_3 = 0 \tag{1.23}$$
has only the zero solution. But this system is the same as the system
corresponding to the matrix whose columns are $v_1, v_2, v_3$:
$$A = \begin{pmatrix} 0 & 5 & 1 \\ 3 & 1 & 0 \\ 0 & 1 & 2 \\ 2 & 2 & 2 \end{pmatrix}$$
If we row reduce this matrix we obtain
$$PA = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix}$$
Now the system $PAx = 0$ obviously has only the zero solution: if
$$\begin{aligned} x^1 + x^2 + x^3 &= 0 \\ x^2 + 2x^3 &= 0 \\ x^3 &= 0 \\ 0 &= 0 \end{aligned} \tag{1.24}$$
we find, reading upward, that $x^3 = 0$, $x^2 = 0$, $x^1 = 0$. Since $P$ is
invertible, the system $Ax = 0$ then has only the zero solution. What is
the same, if (1.23) holds, so must (1.24), so $x^3 = x^2 = x^1 = 0$. Thus
the vectors $v_1, v_2, v_3$ are independent.
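19. Now let
$v_1 = (3, 2, 1, 0)$
$v_2 = (1, 2, 3, 1)$
$v_3 = (2, 0, -2, -1)$
Again, let $A$ be the matrix whose columns are $v_1, v_2, v_3$:
$$A = \begin{pmatrix} 3 & 1 & 2 \\ 2 & 2 & 0 \\ 1 & 3 & -2 \\ 0 & 1 & -1 \end{pmatrix}$$
$A$ row reduces to
$$PA = \begin{pmatrix} 1 & 3 & -2 \\ 0 & 1 & -1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
The system $PAx = 0$ has the solutions
$x^1 = -3x^2 + 2x^3$
$x^2 = x^3$
The system $Ax = 0$ has the same solutions. Taking $x^3 = 1$ we have
the particular solution $(-1, 1, 1)$. Thus
$-v_1 + v_2 + v_3 = 0$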
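Examples 18 and 19 follow one recipe: place the vectors as columns of a matrix, row reduce, and read off the solutions of $Ax = 0$. A minimal sketch of that recipe is given below; the function names `rref` and `relations` are assumptions made for illustration, and the reduction is carried all the way to leading 1s with zeros above and below, so each free unknown yields one relation directly.

```python
from fractions import Fraction

def rref(rows):
    """Fully row reduce with exact fractions; return (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def relations(vectors):
    """Basis of all (x^1, ..., x^k) with x^1 v_1 + ... + x^k v_k = 0."""
    A = [list(row) for row in zip(*vectors)]      # the v_i become the columns of A
    m, pivots = rref(A)
    k = len(vectors)
    basis = []
    for free in (c for c in range(k) if c not in pivots):
        x = [Fraction(0)] * k
        x[free] = Fraction(1)                     # set one free unknown to 1
        for i, pc in enumerate(pivots):
            x[pc] = -m[i][free]                   # solve for the pivot unknowns
        basis.append(x)
    return basis

example_18 = [(0, 3, 0, 2), (5, 1, 1, 2), (1, 0, 2, 2)]
example_19 = [(3, 2, 1, 0), (1, 2, 3, 1), (2, 0, -2, -1)]
print(relations(example_18))   # empty list: the vectors are independent
print(relations(example_19))   # one relation, with coefficients -1, 1, 1
```

An empty result means the only solution is the zero solution, which is exactly the independence condition of Definition 5.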
20. Four vectors in $R^3$ cannot be independent. Let
$v_1 = (2, 1, 2)$
$v_2 = (0, 3, 0)$
$v_3 = (1, 0, 4)$
$v_4 = (0, 1, 2)$
Find a linear relation which these vectors must satisfy. If we row
reduce the matrix whose columns are the $v$'s, we obtain the matrix
$$A = \begin{pmatrix} 1 & 3 & 0 & 1 \\ 0 & 1 & -1/6 & 1/3 \\ 0 & 0 & 1 & 2/3 \end{pmatrix}$$
Now, corresponding to any value of $x^4$ we obtain a solution of
$Ax = 0$, and thus of $\sum x^i v_i = 0$. Take $x^4 = 1$. Then
$x^3 = -\tfrac{2}{3} x^4 = -\tfrac{2}{3}$
$x^2 = \tfrac{1}{6} x^3 - \tfrac{1}{3} x^4 = -\tfrac{4}{9}$
$x^1 = -3x^2 - x^4 = \tfrac{1}{3}$
Thus
$\tfrac{1}{3} v_1 - \tfrac{4}{9} v_2 - \tfrac{2}{3} v_3 + v_4 = 0$
Now, the equivalent form of these two propositions about $R^n$, that any
spanning set of vectors has at least $n$ members, and any independent set has
at most $n$ members, holds for any linear subspace of $R^n$ as well.
Proposition 11. Let $V$ be a linear subspace of $R^n$ of dimension $d$.
(i) A spanning set has no fewer than $d$ elements.
(ii) An independent set has no more than $d$ elements.
Proof. Part (i) is of course just the definition, so we need only consider part (ii).
The proof amounts to a reduction to the case where $V$ is $R^d$, and an application of
Proposition 10.
Let $w_1, \dots, w_d$ span $V$; since $V$ has dimension $d$ there exist such vectors.
Suppose, as in (ii), that $v_1, \dots, v_k$ are independent vectors in $V$. Then we can write
each $v_j$ as a linear combination of $w_1, \dots, w_d$:
$$v_j = \sum_{i=1}^d a_j^i w_i \qquad 1 \le j \le k$$
for suitable numbers $a_j^i$. The vectors $(a_j^1, \dots, a_j^d)$ for $j = 1, \dots, k$ are vectors in
$R^d$ corresponding to the vectors $\{v_j\}$; we shall now show that they are likewise
independent. For if
$$\sum_{j=1}^k c^j (a_j^1, \dots, a_j^d) = 0$$
then also $\sum_{j=1}^k c^j v_j = 0$ by this computation:
$$\sum_{j=1}^k c^j v_j = \sum_{j=1}^k c^j \sum_{i=1}^d a_j^i w_i = \sum_{i=1}^d \Bigl(\sum_{j=1}^k c^j a_j^i\Bigr) w_i = 0 \cdot w_1 + \cdots + 0 \cdot w_d$$
Thus, by the independence of $v_1, \dots, v_k$, we must have $c^1 = 0, \dots, c^k = 0$. Thus
the $k$ vectors $(a_j^1, \dots, a_j^d)$ are independent in $R^d$, so by Proposition 10 $d \ge k$.
Definition 6. Let $V$ be a linear space. A basis of $V$ is a set $S$ of vectors
such that each $v \in V$ can be written in the form
$$v = \sum_{i=1}^k c^i v_i \qquad \text{with } c^i \in R,\ v_i \in S$$
in one and only one way.
Another way of putting this is: a basis for a linear subspace $V$ is a set of
independent and spanning vectors in $V$.
Proposition 12. $S$ is a basis for the linear space $V$ if and only if both these
conditions hold:
(i) $S$ is an independent set,
(ii) the linear span of $S$ is $V$.
Proof. Suppose that $S$ is a basis of $V$. Since every vector in $V$ can be written
as a linear combination of vectors in $S$, certainly (ii) is true: $V$ is the linear span of $S$.
Since 0 can be written in only one way as a linear combination of vectors in $S$, any
time we have
$c^1 v_1 + \cdots + c^k v_k = 0$
with $c^1, \dots, c^k$ in $R$ and $v_1, \dots, v_k$ distinct members of $S$, we must have $c^1 =
0, \dots, c^k = 0$ (since $0 = 0 \cdot v_1 + \cdots + 0 \cdot v_k$ also). Thus (i) holds: $S$ is an
independent set.
Conversely, suppose now that (i) and (ii) are true for the set $S$. Then (by (ii))
any vector $v$ in $V$ can be written
$$v = c^1 v_1 + \cdots + c^k v_k \tag{1.25}$$
with $c^i \in R$, $v_i \in S$. This can be done in only one way because of the independence.
In fact, suppose (1.25) holds, and also
$$v = a^1 v_1 + \cdots + a^k v_k \tag{1.26}$$
is true. Subtracting, $(c^1 - a^1) v_1 + \cdots + (c^k - a^k) v_k = 0$, so $c^i = a^i$ since the $v_i$ are independent.
Dimension and Basis
The important facts to know about dimension of linear subspaces of $R^n$
are these: such a space $V$ always has a basis with a finite number of elements.
That number is the same for all bases and is the dimension of $V$, and is not
greater than $n$. We summarize this as follows:
Theorem 1.4. Let $V$ be a linear subspace of $R^n$.
(i) There is an integer $d \le n$ such that $V$ has dimension $d$.
(ii) Any basis of $V$ has precisely $d$ elements.
(iii) $d$ independent vectors in $V$ form a basis.
(iv) $d$ spanning vectors in $V$ form a basis.
Proof. (i) The proof of this part of the theorem is by mathematical induction
on $n$. If $n = 1$, either $V = \{0\}$ or $V$ has a nonzero vector, in which case $V = R$.
Thus either $\dim V = 0$ or 1, so $\dim V \le 1$. Now we proceed to the induction step.
Let us describe how it goes. We assume the assertion (i) for $n - 1$, and consider
$R^{n-1}$ as the set of $n$-tuples in $R^n$ with zero first entry. If $V$ is a subspace of $R^n$, it
intersects this space in some subspace of $R^{n-1}$ which is, by induction, spanned by
some $\delta$ vectors, with $\delta \le n - 1$. Now, choosing any other vector in $V$ with a
nonzero first entry, this together with the vectors referred to above will span $V$.
Now we make this argument precise.
Let $V$ be a subspace of $R^n$. If $V = \{0\}$, then $\dim V = 0$; if not, $V$ has a nonzero
vector $v_0 = (a^1, \dots, a^n)$. One of the entries is nonzero; we may, by reordering the
coordinates, assume that $a^1 \neq 0$. Let now
$W = \{w \in R^{n-1} : (0, w) \in V\}$
$W$ is a linear subspace of $R^{n-1}$. For if $w_1, w_2 \in W$ and $c^1, c^2 \in R$, we also have
$c^1 (0, w_1) + c^2 (0, w_2) = (0, c^1 w_1 + c^2 w_2)$
in $V$, so $c^1 w_1 + c^2 w_2 \in W$. Now, by the induction hypothesis, $W$ has dimension
$\delta \le n - 1$. Let $w_1, \dots, w_\delta$ span $W$. By definition of $W$, $v_1 = (0, w_1), \dots, v_\delta =
(0, w_\delta)$ are in $V$. Now we need only show that $v_0, v_1, \dots, v_\delta$ span $V$. Let $v \in V$, and
let $c$ be its first entry. Then $v - c (a^1)^{-1} v_0$ is also in $V$ and its first entry is 0. Thus
this vector is of the form $(0, w)$ with $w \in W$. Then there are $c^1, \dots, c^\delta$ such that
$w = c^1 w_1 + \cdots + c^\delta w_\delta$
Thus
$v - c (a^1)^{-1} v_0 = (0, w) = c^1 (0, w_1) + \cdots + c^\delta (0, w_\delta)$
or,
$v = c (a^1)^{-1} v_0 + c^1 v_1 + \cdots + c^\delta v_\delta$
Thus, there are $\delta + 1$ vectors which span $V$, so $V$ has dimension $d$ with $d \le \delta + 1 \le
(n - 1) + 1 = n$.
(ii) This follows easily from Propositions 11 and 12. If $S$ is a basis for $V$ ($\dim V = d$),
then since $S$ spans, it has at least $d$ elements, and since $S$ is independent it has at
most $d$ elements.
(iii) Suppose that $v_1, \dots, v_d$ are independent vectors in $V$; we must show that
they span. Let $v_0 \in V$. By Proposition 11(ii), since $\dim V = d$, $v_0, \dots, v_d$ are
dependent, so there exist $(c^0, \dots, c^d) \neq 0$ such that
$c^0 v_0 + \cdots + c^d v_d = 0$
If $c^0 = 0$, since $v_1, \dots, v_d$ are independent we must also have $c^1 = \cdots = c^d = 0$, a
contradiction. Thus $c^0 \neq 0$, so $v_0 = (-c^0)^{-1} (c^1 v_1 + \cdots + c^d v_d)$ as desired.
(iv) Suppose that $v_1, \dots, v_d$ span $V$. If they are dependent, then the equation
$c^1 v_1 + \cdots + c^d v_d = 0$
holds with at least one $c^i \neq 0$. If say $c^r \neq 0$, then
$v_r = (-c^r)^{-1} (c^1 v_1 + \cdots + c^{r-1} v_{r-1} + c^{r+1} v_{r+1} + \cdots + c^d v_d)$
so $v_1, \dots, v_d$, with $v_r$ excluded, also span $V$. Hence, $V$ has dimension at most
$d - 1$, a contradiction, so we must have had $v_1, \dots, v_d$ independent and thus a
basis.
This final proposition, whose proof is left as an exercise, is an indication of
the (theoretical) ease in finding bases.
Proposition 13. Let $V$ be a linear subspace of $R^n$ of dimension $d$.
(i) Any set of vectors whose span is V contains d vectors which form a basis.
(ii) Any set of independent vectors in V is part of a basis for V.
Examples
21. Find a basis for the linear span $V$ of the vectors
$v_1 = (4, 3, 2, 1)$
$v_2 = (5, 2, 2, 1)$
$v_3 = (0, 4, 0, 1)$
$v_4 = (4, 0, 0, 1)$
and express $V$ by a linear equation.
We want to find all vectors $b$ of the form
$$\sum x^i v_i = b \tag{1.27}$$
and we want to find a basis for such vectors. Now (1.27) is the
system corresponding to the matrix $A$ whose columns are the vectors
$v_1, v_2, v_3, v_4$. The span of these vectors is just the range of $A$. If
$P$ is a product of elementary matrices row reducing $A$, then any vector
$b$ is in the range of $A$ if and only if $Pb$ is in the range of $PA$. Thus by
row reduction we should easily be able to solve our problem.
$$A = \begin{pmatrix} 4 & 5 & 0 & 4 \\ 3 & 2 & 4 & 0 \\ 2 & 2 & 0 & 0 \\ 1 & 1 & 1 & 1 \end{pmatrix}$$
The end result of row reduction produces
$$PA = \begin{pmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & -4 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 \end{pmatrix} \qquad P = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & -4 \\ 0 & 0 & -1/2 & 1 \\ 1 & 1 & -3/2 & -4 \end{pmatrix}$$
Thus, the range of $A$ is obtained by setting the fourth entry of $Pb$
to zero:
$V = \text{range of } A = \{(b^1, b^2, b^3, b^4) : b^1 + b^2 - \tfrac{3}{2} b^3 - 4b^4 = 0\}$
$V$ has dimension at least three since it contains the independent vectors
$(4, 0, 0, 1)$, $(0, 4, 0, 1)$, $(0, 0, 2/3, -1/4)$. On the other hand, $V \neq R^4$,
so $\dim V \le 3$. Thus, $\dim V = 3$ and these three vectors are a basis.
22. Find a basis for the linear subspace $V$ of $R^5$ given by
the equations
$$\begin{aligned} 5x^1 + 8x^2 + 3x^3 + x^4 + x^5 &= 0 \\ x^1 - x^3 - x^5 &= 0 \\ x^2 + 2x^4 &= 0 \end{aligned}$$
We are seeking the solution space of $Ax = 0$, where
$$A = \begin{pmatrix} 5 & 8 & 3 & 1 & 1 \\ 1 & 0 & -1 & 0 & -1 \\ 0 & 1 & 0 & 2 & 0 \end{pmatrix}$$
Row reduction leads to
$$PA = \begin{pmatrix} 1 & 0 & -1 & 0 & -1 \\ 0 & 1 & 0 & 2 & 0 \\ 0 & 0 & 8 & -15 & 6 \end{pmatrix}$$
and $V$ is the set of $x$ such that $PAx = 0$. According to these equations
$x^4$ and $x^5$ are to be freely chosen and $x^1, x^2, x^3$ determined by this
choice. Thus, $\dim V = 2$. Choosing $(x^4, x^5) = (1, 0), (0, 1)$,
respectively, we obtain as a basis
$(\tfrac{15}{8}, -2, \tfrac{15}{8}, 1, 0) \qquad (\tfrac{1}{4}, 0, -\tfrac{3}{4}, 0, 1)$
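A basis found this way is easy to check by substituting it back into the defining equations. The sketch below computes a basis of the solution space of the three equations of Example 22 directly from their coefficient matrix and then verifies every basis vector; `rref` and `null_space` are illustrative names, and full reduction is used so that each free unknown can be set to 1 in turn.

```python
from fractions import Fraction

def rref(rows):
    """Fully row reduce with exact fractions; return (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space(A):
    """Basis of {x : Ax = 0}, one vector per free unknown."""
    m, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            x[pc] = -m[i][free]
        basis.append(x)
    return basis

A = [[5, 8, 3, 1, 1],
     [1, 0, -1, 0, -1],
     [0, 1, 0, 2, 0]]
basis = null_space(A)
for x in basis:
    residuals = [sum(a * xi for a, xi in zip(row, x)) for row in A]
    print([str(xi) for xi in x], residuals)   # every residual should be 0
```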
• EXERCISES
21. What is the dimension of the linear span of these vectors?
(a) $v_1 = (-1, 2, -1, 0)$
$v_2 = (2, 5, 7, 2)$
$v_3 = (0, 2, 1, 1)$
$v_4 = (3, 5, 7, 1)$
(b) $v_1 = (-1, 0, 2, 1)$
$v_2 = (2, 2, -2, 2)$
$v_3 = (1, 1, 1, 1)$
(c) $v_1 = (0, 2, 1, 1)$
$v_2 = (1, 7, 3, 3)$
$v_3 = (0, 0, 0, 1)$
$v_4 = (1, 3, 1, 2)$
$v_5 = (1, 5, 2, 2)$
(d) $v_1 = (0, 0, 1, 1, 1)$
$v_2 = (1, 0, 0, 1, 1)$
$v_3 = (0, 1, 0, 1, 0)$
22. What is the dimension of the space $S$ given by these equations:
(a) $S = \{x \in R^5 : x^1 + x^2 - x^3 - x^4 = 0,\ x^1 + x^3 = 0\}$
(b) $S = \{x \in R^5 : x^2 + x^4 + x^5 = 0,\ x^1 - x^3 + x^4 = 0,\ x^1 - x^2 - x^3 - x^5 = 0\}$
(c) $S = \{x \in R^4 : x^1 + x^2 + x^3 = x^3 - x^2 - x^1 + x^4\}$
23. Determine the linear span of these vectors by a system of equations.
(a) $v_1 = (1, 0, 0, 1)$
$v_2 = (0, 1, 1, 0)$
$v_3 = (0, 1, 0, 1)$
(b) $v_1 = (2, 2, 6, 2)$
$v_2 = (1, 2, 3, 0)$
$v_3 = (0, 1, 0, -1)$
(c) $v_1 = (1, 0, 1)$
$v_2 = (-1, 1, 1)$
(d) $v_1 = (1, 0, 0, 0, 0)$
$v_2 = (2, 0, 1, 0, 1)$
24. Are these vectors independent?
(a) $v_1, \dots, v_5$ as given in Exercise 21(c).
(b) $v_1, v_2, v_3$ as given in Exercise 23(a).
(c) $v_1 = (0, 2, 0, 2, 0, 6)$
$v_2 = (1, 1, -1, -1, 1, 1)$
$v_3 = (2, 4, 6, 8, 10, 12)$
$v_4 = (0, 0, -2, -2, 0, 0)$
$v_5 = (0, 1, 0, 0, 1, 0)$
$v_6 = (1, 1, 1, 1, 1, 1)$
25. Find all linear relations involving these sets of vectors.
(a) $v_1 = (0, 1, 1)$
$v_2 = (5, 3, 1)$
$v_3 = (0, 2, 0)$
$v_4 = (1, -1, 1)$
(b) $v_1 = (0, 2, 0, 2)$
$v_2 = (0, 1, 0, 0)$
$v_3 = (0, 1, 0, 1)$
$v_4 = (0, 0, 0, 1)$
$v_5 = (1, 0, -1, 0)$
(c) $v_1 = (0, 0, 0, 0)$
$v_2 = (1, 1, 1, 1)$
$v_3 = (1, 1, 0, 0)$
$v_4 = (0, 0, -2, -2)$
26. Find a basis for the linear subspace of $R^5$ spanned by $(0, 0, 0, 1, 1)$,
$(0, 1, 0, 0, 0)$, $(1, 0, 0, 0, 1)$, $(1, 1, 0, 0, 1)$, $(2, 1, 0, 1, 2)$.
27. Find a basis for these linear spaces:
(a) $\{(x^1, \dots, x^5) \in R^5 : x^1 + 2x^2 + x^3 = 0,\ x^1 + 2x^4 + x^5 = 0,\ x^1 + x^5 = 0\}$
(b) $\{(x^1, \dots, x^4) \in R^4 : x^1 - x^2 + x^3 - x^4 = 0,\ x^1 - x^3 = 0\}$
28. If the given vectors in $R^5$ are independent, extend them to a basis:
(a) $(0, 0, 0, 0, 1)$, $(0, 0, 0, 1, 1)$, $(0, 0, 1, 1, 1)$
(b) $(1, 5, 2, 0, -3)$, $(6, 7, 0, 2, 1)$, $(1, 0, -1, -2, 0)$, $(1, 1, 1, 1, 1)$
(c) $(4, 4, 3, 2, 1)$, $(3, 3, 3, 2, 1)$, $(2, 2, 2, 2, 1)$
• PROBLEMS
23. Suppose we are given $k$ vectors $v_1, \dots, v_k$ in $R^n$. Let $w_1 = v_1$,
$w_2 = v_2 - \beta_2 v_1, \dots, w_k = v_k - \beta_k v_1$ for some numbers $\beta_2, \dots, \beta_k$. Show
that the sets $\{v_1, \dots, v_k\}$ and $\{w_1, \dots, w_k\}$ have the same linear span.
24. The proof of Theorem 1.3 proceeds by assuming that the set $S$
consists of the vectors $v_1, \dots, v_k$. What of the case where $S$ has infinitely
many elements?
25. Prove Proposition 13.
26. Show that if $V$, $W$ are subspaces of $R^n$, so is $V \cap W$.
27. Show that if $A$ is obtained from $B$ by a row operation, the linear span
of the rows of $A$ is the linear span of the rows of $B$.
28. Show that if $A$ is a row-reduced matrix the dimension of the linear
span of the rows of $A$ is the same as its index.
1.5 Rank + Nullity = Dimension
Now let us apply the propositions of the preceding section about linear
spaces, and in particular the notion of dimension, to the subject of linear
transformations. There are certain obvious linear spaces to be associated
to a given transformation.
Definition 7. Let $T: R^n \to R^m$ be a linear transformation.
(i) The set
$K(T) = \{v \in R^n : T(v) = 0\}$
is a linear subspace of $R^n$, called the kernel of $T$. Its dimension is the nullity
of $T$, denoted $\nu(T)$.
(ii) The set
$R(T) = \{T(v) : v \in R^n\}$
is a linear subspace of $R^m$, called the range of $T$. Its dimension is the rank
of $T$, denoted $\rho(T)$.
Theorem 1.5. Let $T: R^n \to R^m$ be a linear transformation. We have
$n = \nu(T) + \rho(T)$
that is, dimension = nullity + rank.
Proof. For short, write $\nu(T)$ as $\nu$. Let $v_1, \dots, v_\nu$ be a basis for the kernel of $T$.
Let $v_{\nu+1}, \dots, v_n$ be the rest of a basis for $R^n$: $v_1, \dots, v_\nu, v_{\nu+1}, \dots, v_n$ thus span
$R^n$. Let $w_j = T(v_j)$ for $j = \nu + 1, \dots, n$. Now the crux of the matter is this:
$w_{\nu+1}, \dots, w_n$ form a basis for the range of $T$. Once this is shown, we will have
$\rho(T) = n - \nu$, which is the desired equation.
(i) Let $w \in R(T)$. Then there is a $v \in R^n$ such that $w = T(v)$. Expand $v$ in the
basis $v_1, \dots, v_n$: $v = c^1 v_1 + \cdots + c^n v_n$. Then
$$\begin{aligned} w = T(v) &= T(c^1 v_1 + \cdots + c^n v_n) \\ &= c^1 T(v_1) + \cdots + c^\nu T(v_\nu) + c^{\nu+1} T(v_{\nu+1}) + \cdots + c^n T(v_n) \\ &= c^{\nu+1} w_{\nu+1} + \cdots + c^n w_n \end{aligned}$$
The second line is justified since $T$ is linear and the third follows since $v_1, \dots, v_\nu$ are
in the kernel of $T$ and $T(v_{\nu+1}) = w_{\nu+1}, \dots, T(v_n) = w_n$. Thus these last vectors
span $R(T)$.
(ii) $w_{\nu+1}, \dots, w_n$ are independent. Suppose
$$c^{\nu+1} w_{\nu+1} + \cdots + c^n w_n = 0 \tag{1.28}$$
We must show that the $\{c^i\}$ are all zero. In any event, from (1.28) we have
$T(c^{\nu+1} v_{\nu+1} + \cdots + c^n v_n) = c^{\nu+1} T(v_{\nu+1}) + \cdots + c^n T(v_n) = 0$
so $c^{\nu+1} v_{\nu+1} + \cdots + c^n v_n \in K(T)$. $v_1, \dots, v_\nu$ span $K(T)$, so there are $c^1, \dots, c^\nu$ such
that
$c^{\nu+1} v_{\nu+1} + \cdots + c^n v_n = c^1 v_1 + \cdots + c^\nu v_\nu$
or
$(-c^1) v_1 + \cdots + (-c^\nu) v_\nu + c^{\nu+1} v_{\nu+1} + \cdots + c^n v_n = 0$
Since $v_1, \dots, v_n$ are independent, all the $c^j$ are zero, as required. The theorem is
proven.
Examples
23. Let $T: R^4 \to R^3$ be given by the matrix
$$A = \begin{pmatrix} 1 & 3 & 2 & 7 \\ 0 & 0 & 1 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix} \tag{1.29}$$
We can completely analyze this transformation by row reduction.
$A$ easily row reduces to
$$\begin{pmatrix} 1 & 3 & 2 & 7 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \end{pmatrix} \tag{1.30}$$
merely by interchanging the last two rows. Thus, letting $P$ be the
transformation corresponding to
$$\begin{pmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}$$
we know that $PT$ is the linear transformation corresponding to (1.30).
Now, the range of $PT$ is easily seen to be all of $R^3$, and the range of
$T$ is $P^{-1}$(range of $PT$), which is again all of $R^3$, so $\rho(T) = 3$. The
kernel of $T$ is the same as the kernel of $PT$, which has the equations
given by (1.30):
$$\begin{aligned} x^1 + 3x^2 + 2x^3 + 7x^4 &= 0 \\ x^2 &= 0 \\ x^3 + x^4 &= 0 \end{aligned} \tag{1.31}$$
The set of all such solutions is found by letting $x^4$ take on all real
values and solving for the remaining coordinates by (1.31). Thus
$K(T) = \{(-5t, 0, -t, t) : t \in R\}$, which is one dimensional.
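As a numerical companion to Theorem 1.5, the sketch below computes the rank and a kernel basis for the matrix (1.29) and confirms that they add up to the dimension of the domain. The helper names `rref` and `null_space` are illustrative assumptions; the rank is taken as the number of leading entries after reduction.

```python
from fractions import Fraction

def rref(rows):
    """Fully row reduce with exact fractions; return (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def null_space(A):
    """Basis of the kernel {x : Ax = 0}."""
    m, pivots = rref(A)
    n = len(A[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for i, pc in enumerate(pivots):
            x[pc] = -m[i][free]
        basis.append(x)
    return basis

A = [[1, 3, 2, 7],
     [0, 0, 1, 1],
     [0, 1, 0, 0]]
rank = len(rref(A)[1])
kernel = null_space(A)
print(rank, len(kernel), len(A[0]))     # rank, nullity, n
print([int(x) for x in kernel[0]])      # the case t = 1 of (-5t, 0, -t, t)
```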
24. Let $T: R^4 \to R^4$ be given by the matrix
$$A = \begin{pmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 \\ 3 & -1 & 3 & 2 \\ 2 & -3 & 2 & -1 \end{pmatrix}$$
Let us row reduce this matrix, keeping track of our row operations:
$$PA = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ -3 & 4 & 1 & 0 \\ -2 & 5 & 0 & 1 \end{pmatrix} A = \begin{pmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}$$
Now, the kernel of $T$ is easy to find; it is the same as the kernel of
the transformation $S$ corresponding to the last matrix $PA$ (because
$S$ is the composition of $T$ with an invertible transformation). Now the
kernel of $S$, and thus also of $T$, has, corresponding to the matrix $PA$,
the form:
$$\begin{aligned} x^1 + x^2 + x^3 + 2x^4 &= 0 \\ x^2 + x^4 &= 0 \\ 0 &= 0 \\ 0 &= 0 \end{aligned}$$
or $x^2 = -x^4$, $x^1 = -x^3 - x^4$. Thus,
$K(T) = \{(-(u + v), -v, u, v) : (u, v) \in R^2\}$
so $\nu(T) = 2$. The range of $T$ is a little harder to find. If $R$ is the
transformation corresponding to the product $P$ of the elementary
matrices on the left, then $S = R \circ T$, so the range of $T$ is $R^{-1}$ of the
range of $S$, which has the equations $x^3 = x^4 = 0$. (That is, the vector
$(b^1, \dots, b^4)$ is in the range of $S$ if and only if there exist $(x^1, \dots, x^4)$
such that
$$\begin{aligned} x^1 + x^2 + x^3 + 2x^4 &= b^1 \\ x^2 + x^4 &= b^2 \\ 0 &= b^3 \\ 0 &= b^4 \end{aligned}$$
The necessary and sufficient condition is $b^3 = b^4 = 0$.) Thus the
necessary and sufficient condition for $v$ to be in the range of $T$ is that
$Pv$ be in the range of $S$; that is, the third and fourth coordinates of
$Pv$ must vanish:
$$\begin{aligned} -3x^1 + 4x^2 + x^3 &= 0 \\ -2x^1 + 5x^2 + x^4 &= 0 \end{aligned}$$
Thus, $\rho(T) = 2$.
25. Let us do one more example briefly. Suppose that $T: R^3 \to R^5$
corresponds to the matrix
$$\begin{pmatrix} 1 & 0 & 1 \\ 2 & 1 & 3 \\ 0 & 1 & 1 \\ 1 & 1 & 2 \\ 4 & 3 & 7 \end{pmatrix}$$
This matrix can be row reduced to
$$A = \begin{pmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}$$
by multiplication on the left by this matrix
$$P = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ -2 & 1 & 0 & 0 & 0 \\ 2 & -1 & 1 & 0 & 0 \\ 1 & -1 & 0 & 1 & 0 \\ -1 & -1 & -1 & -1 & 1 \end{pmatrix}$$
The kernel of $T$ can be found by looking at the row-reduced form $A$;
it is the set of $x = (x^1, x^2, x^3)$ in $R^3$ such that $Ax = 0$. Precisely, we
must have $x^1 + x^3 = 0$, $x^2 + x^3 = 0$. Thus a vector is in $K(T)$ if
its first and second coordinates are the negative of the third; that is,
$K(T) = \{(-t, -t, t) : t \in R\}$. Thus $\nu(T) = 1$. The range of $T$ is
the set of $x = (x^1, \dots, x^5)$ such that $Px$ is in the range of $A$ (since
$A = PT$). The 5-tuples in the range of $A$ are precisely those with
third, fourth, and fifth coordinates zero. Thus the third through fifth
coordinates of $Px$ must be zero for $x$ to be in the range of $T$. Specifically
$R(T)$ is the set of simultaneous solutions of
$$\begin{aligned} 2x^1 - x^2 + x^3 &= 0 \\ x^1 - x^2 + x^4 &= 0 \\ -x^1 - x^2 - x^3 - x^4 + x^5 &= 0 \end{aligned}$$
We can take $x^1, x^2$ as free variables and use these equations to define
$x^3, x^4, x^5$: thus
$R(T) = \{(u, v, v - 2u, v - u, 3v - 2u) : (u, v) \in R^2\}$
so $\rho(T) = 2$.
These examples illustrate the fact that Theorem 1.5 can be formulated
purely in terms of matrices. We now do just that.
Proposition 14. Let $A$ be an $m \times n$ matrix, representing the linear
transformation $T: R^n \to R^m$. Then
$\rho(T)$ = number of independent columns of $A$
= number of independent rows of $A$
= index of the row-reduced matrix to which $A$ can be reduced.
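Proposition 14 can be spot-checked on the $5 \times 3$ matrix of Example 25 (entries as read from that example): reducing the matrix counts independent rows, reducing its transpose counts independent columns, and the two counts should agree. This is a sketch with illustrative helper names `rref` and `rank`, not a proof.

```python
from fractions import Fraction

def rref(rows):
    """Fully row reduce with exact fractions; return (matrix, pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def rank(M):
    """Index of the reduced matrix: the number of leading entries."""
    return len(rref(M)[1])

A = [[1, 0, 1],
     [2, 1, 3],
     [0, 1, 1],
     [1, 1, 2],
     [4, 3, 7]]
A_transposed = [list(col) for col in zip(*A)]
print(rank(A), rank(A_transposed))   # independent rows vs. independent columns
print(len(A[0]) - rank(A))           # nullity predicted by Theorem 1.5
```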
Finally, we can also reformulate Theorem 1.5 as a conclusion for systems
of linear equations, thus bringing us to the ultimate version of Theorems 1.1
and 1.2.
Theorem 1.6. Suppose given a system of $m$ linear equations in $n$ unknowns,
and suppose $d$ is the index, or rank, of the corresponding matrix $A$. Then
(i) $d \le m$, $d \le n$.
(ii) $\{x : Ax = 0\}$ is a vector space of dimension $n - d$.
(iii) $\{b : \text{there exists a solution of } Ax = b\}$ is a vector space of dimension $d$.
• EXERCISES
29. Describe by linear equations the range and kernel of the linear
transformations given by these matrices in terms of the standard basis:
(a)
(b)
(c)
№ /8 0 0 1 6\
V
30. Find bases for $K(T)$, $R(T)$ for each $T$ given by the matrices (a)-(d) of
Exercise 29.
31. Let $f: R^n \to R$ be a nonzero linear function. Show that the kernel
of $f$ is a linear subspace of $R^n$ of dimension $n - 1$.
32. Let $f(x^1, \dots, x^n) = \sum_{i=1}^n x^i$. Find a spanning set of vectors for the
kernel of $f$.
• PROBLEMS
29. Let $T: R^n \to R^m$ be a linear transformation. Then $K(T)$ and $R(T)$
are linear subspaces of $R^n$, $R^m$, respectively.
30. Let $T$ be the transformation represented by the $m \times n$ matrix $A$.
Show that $R(T)$ is spanned by the columns of $A$. Show that
$$K(T) = \Bigl\{(x^1, \dots, x^n) : \sum_{j=1}^n x^j C_j = 0\Bigr\}$$
where $C_1, \dots, C_n$ are the columns of $A$.
31. Let $w \in R^n$. Define $\perp(w)$ as the set of $v$ such that $\sum_{i=1}^n v^i w^i = 0$.
Show that for $w \neq 0$, $\perp(w)$ is a subspace of $R^n$ of dimension $n - 1$.
32. Let $S \subset R^n$. Define $\perp(S)$ as the set of $v$ such that $\sum_{i=1}^n v^i w^i = 0$ for
all $w \in S$. Show that $\perp(S)$ is a subspace of $R^n$, and $\dim \perp(S) + \dim [S] = n$.
1.6 Invertible Matrices
In this section we shall pay particular attention to the collection of linear
transformations of $R^n$ into $R^n$ or, what is the same, the $n \times n$ matrices.
From the point of view of linear equations this is reasonable; for it is usually
the case that a given problem will have as many equations as unknowns.
First of all, it is clear that there are certain operations which are defined
on the collection of all linear transformations of $R^n$, thus making of this set an
algebraic object of some sort. We collect together all these notions in the
following definition.
Definition 8. The algebra of linear operators on $R^n$, denoted by $E^n$, is the
collection of linear transformations of $R^n$ into $R^n$ provided with these operations:
(i) if $f$ is in $E^n$, and $c$ is a real number,
$(cf)(x) = c f(x)$
(ii) if $f$, $g$ are in $E^n$, $f + g$ is defined by
$(f + g)(x) = f(x) + g(x)$
(iii) $f \circ g$ is defined by
$(f \circ g)(x) = f(g(x))$
It is important to think of the elements of $E^n$ as functions taking $n$-tuples
of numbers into $n$-tuples of numbers; but in working with them it is convenient
to represent them in terms of the standard basis by matrices. Thus, we are
led to consider also the algebra $M^n$ of real $n \times n$ matrices with the operations
of scalar multiplication, addition and multiplication, the definitions of which
we now recapitulate.
Definition 9. The algebra $M^n$ is the collection of $n \times n$ matrices provided
with these operations:
(i) If $A = (a_j^i)$ is in $M^n$, and $c$ is a real number,
$cA = (c a_j^i)$
(ii) If $A = (a_j^i)$ and $B = (b_j^i)$ are in $M^n$, then
$A + B = (a_j^i + b_j^i)$
$AB = \bigl(\sum_{k=1}^n a_k^i b_j^k\bigr)$
The two algebras $E^n$, $M^n$ are completely interchangeable, for $M^n$ is just
the explicit representation of $E^n$ relative to the standard basis.
Now the operations on M" obey certain laws, some of which we have
already observed in previous sections. Let us list some important ones.
Proposition 15. These equations hold for all $n \times n$ matrices $A$, $B$, $C$ and all
real numbers $k$.
(i) $k(A + B) = kA + kB$
(ii) $C(A + B) = CA + CB$
(iii) $(A + B)C = AC + BC$
(iv) $A(BC) = (AB)C$
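The laws of Proposition 15 can be tried on small integer matrices. The sketch below implements the entrywise definitions of Definition 9 (the function names `mat_add`, `mat_scale`, `mat_mul` and the three sample matrices are illustrative choices) and also exhibits a pair with $AB \neq BA$, so commutativity is not among the laws.

```python
def mat_add(A, B):
    """Entry-by-entry sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def mat_scale(k, A):
    """Multiply every entry of A by the real number k."""
    return [[k * a for a in row] for row in A]

def mat_mul(A, B):
    """Entry (i, j) of AB is the sum over k of A[i][k] * B[k][j]."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 2], [0, 1]]
B = [[0, 1], [1, 0]]
C = [[2, 0], [1, 3]]

print(mat_scale(3, mat_add(A, B)) == mat_add(mat_scale(3, A), mat_scale(3, B)))  # (i)
print(mat_mul(C, mat_add(A, B)) == mat_add(mat_mul(C, A), mat_mul(C, B)))        # (ii)
print(mat_mul(mat_add(A, B), C) == mat_add(mat_mul(A, C), mat_mul(B, C)))        # (iii)
print(mat_mul(A, mat_mul(B, C)) == mat_mul(mat_mul(A, B), C))                    # (iv)
print(mat_mul(A, B), mat_mul(B, A))   # the two products differ
```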
If $A$ is a given matrix, we shall let $A^2$ denote $A \cdot A$, $A^3 = A \cdot A \cdot A$, and
in general $A^n$ is the $n$-fold product of $A$ with itself. Since we may also add
matrices, and multiply by real numbers, we may consider polynomials in a
given matrix. That is, $A^2 + 3A + A$, $A^7 + 3\pi A^3 + A^6$, ... In fact, if
we adopt the usual convention that $A^0 = I$, then for any polynomial
$p(X) = \sum_{i=0}^n c_i X^i$ in the indeterminate $X$, we may consider the matrix $p(A) =
\sum_{i=0}^n c_i A^i$. A most remarkable observation can now be made, by noticing
that the collection $M^n$ of $n \times n$ matrices is the same as the collection $R^{n^2}$ of
$n^2$-tuples of real numbers.
Proposition 16. Given any $n \times n$ matrix $A$, there is a nonzero polynomial
$p$ of degree at most $n^2$ such that $p(A) = 0$.
Proof. An element of $M^n$ is a rectangular array of $n^2$ real numbers, thus
corresponds to an element of $R^{n^2}$. We may make this correspondence explicit, by, say,
placing the rows one after another. That is, the matrix $(a_j^i)$ corresponds to the
vector $(a_1^1, \dots, a_n^1, a_1^2, \dots, a_n^2, a_1^3, \dots, a_{n-1}^n, a_n^n)$ in $R^{n^2}$. In any event, the notions
of sum and scalar multiplication are the same in the two interpretations. Now
consider the matrices $I, A, A^2, \dots, A^{n^2}$. These $n^2 + 1$ vectors in $R^{n^2}$ cannot be
independent, so there are real numbers $c_0, c_1, \dots, c_{n^2}$, not all zero, such that
$c_{n^2} A^{n^2} + \cdots + c_2 A^2 + c_1 A + c_0 I = 0$
Thus the proposition is verified with $p$ the polynomial
$p(X) = c_{n^2} X^{n^2} + \cdots + c_2 X^2 + c_1 X + c_0$
We may rephrase this proposition in this way Every matrix is a root of some
nonzero polynomial equation with real coefficients. From the purely
algebraic point of view this formulation is of some interest and raises the
converse speculation: given a polynomial with real coefficients, does it have
some η χ η matrix as a root? We shall verify this fact, and with η no greater
than two. More precisely, we shall, in a later section, introduce the system
of complex numbers as a certain collection of 2 χ 2 matrices, and later verify
that every real polynomial has a root in the system of complex numbers.
This is known as the fundamental theorem of algebra.
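The argument in the proof of Proposition 16 can be carried out mechanically. The following is a minimal sketch, assuming exact rational arithmetic; the function names `mat_mul` and `annihilating_polynomial` are ours and are not part of the text. It flattens $I, A, A^2, \ldots$ into vectors of $R^{n^2}$ and stops at the first power that depends linearly on the earlier ones.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def annihilating_polynomial(a):
    """Return [c_0, ..., c_m], not all zero, with c_0 I + c_1 A + ... + c_m A^m = 0."""
    n = len(a)
    a = [[Fraction(x) for x in row] for row in a]
    current = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    # Flatten I, A, ..., A^(n^2) row by row into vectors of R^(n^2).
    powers = []
    for _ in range(n * n + 1):
        powers.append([x for row in current for x in row])
        current = mat_mul(current, a)
    # Eliminate each power against the earlier ones, remembering which
    # combination of powers every reduced vector stands for.
    basis = []  # entries: (pivot index, reduced vector, combination of powers)
    for k, vec in enumerate(powers):
        vec = vec[:]
        combo = [Fraction(0)] * len(powers)
        combo[k] = Fraction(1)
        for pivot, bvec, bcombo in basis:
            if vec[pivot] != 0:
                factor = vec[pivot] / bvec[pivot]
                vec = [v - factor * w for v, w in zip(vec, bvec)]
                combo = [c - factor * d for c, d in zip(combo, bcombo)]
        if all(v == 0 for v in vec):
            return combo[:k + 1]  # first linear dependency among the powers
        pivot = next(i for i, v in enumerate(vec) if v != 0)
        basis.append((pivot, vec, combo))
    raise AssertionError("n^2 + 1 vectors in R^(n^2) are always dependent")


# Coefficients c_0, c_1, c_2 for a sample 2 x 2 matrix (our choice of example).
print(annihilating_polynomial([[2, 0], [1, 1]]))
```

For a $2 \times 2$ matrix the dependency is typically found already at $A^2$, well below the bound $n^2 = 4$ guaranteed by the proposition.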
Now, a linear transformation in $E^n$ is invertible if it has an inverse as a
function from $R^n$ to $R^n$. For this it must be one-to-one and onto, that is,
it must have zero nullity and rank n. We have seen (n = rank + nullity)
that either of these assertions implies the other. Now it is clear that these
assertions must be expressible in terms of matrices; we now do that.
Definition 10. The $n \times n$ matrix A is invertible if there is a matrix B such
that $BA = I = AB$. In this case B is said to be an inverse for A.
Proposition 17. An invertible matrix has a unique inverse.
Proof. This is clear: if B, C are inverses to A, then all these equations hold:
$$BA = I = AB \qquad CA = I = AC$$
Then
$$B = BI = B(AC) = (BA)C = IC = C$$
We shall denote the inverse of a matrix A, if it exists, by $A^{-1}$. The
relationship between matrices and linear transformations gives us this proposition:
Proposition 18. Let A be an $n \times n$ matrix. These assertions are equivalent:
(i) A is invertible.
(ii) A represents an invertible transformation.
(iii) There is a matrix B such that $BA = I$.
(iv) There is a matrix B such that $AB = I$.
(v) A has index n.
Proof. We have already seen (in discussing systems of linear equations) that (ii)
and (v) are equivalent. By definition (i) implies both (iii) and (iv). Thus we have
left to prove that (i) and (ii) are equivalent, (iii) implies (i), and (iv) implies (i).
(i) implies (ii). Let A be the given invertible matrix, and T the transformation
it represents. Let S be the transformation represented by the inverse, $A^{-1}$, of A.
Since $A \cdot A^{-1} = I = A^{-1} \cdot A$, we have $T \circ S = I = S \circ T$. Thus S is inverse to T, so
(ii) holds.
(ii) implies (i) by the same kind of reasoning with the roles of matrix and
transformation interchanged.
(iii) implies (i). If T is the linear transformation represented by A, then by (iii),
there is a transformation S such that $S \circ T = I$. Thus, if $T(x) = 0$ we must also
have $x = S(T(x)) = S(0) = 0$, so T has nullity zero and is thus invertible. Thus
(iii) implies (ii), so also implies (i).
(iv) implies (i). If again, T is the transformation represented by A, by (iv) there
is a transformation S such that $T \circ S = I$. Every y in $R^n$ is then of the form
$y = T(S(y))$, so T has rank n and thus is invertible.
Computing the Inverse
Now, it is clear that the question of invertibility for a given matrix is
important and that the problems arise of effectively deciding this question
and of effectively computing the inverse, if it exists. To ask that the rows
(or columns) be independent, or span $R^n$, while responsive to this question,
hardly provides a procedure for determining invertibility. We shall now
introduce two such procedures: one is a continuation of row reduction and
the second is based on the notion of the determinant. The determinant is a
real-valued function defined on the algebra $M^n$ of $n \times n$ matrices; its basic
property is that it is nonzero precisely on the invertible matrices. We shall
depend heavily on the determinant in the study of eigenvectors (Section 1.7).
In Section 1.9 we shall explore the connection between the determinant and
the notion of volume in $R^3$.
In order to verify the critical properties of the determinant function it is
necessary to return to the elementary matrices, for they provide a technique
for decomposing an invertible matrix into a product of simple ones, and as
a result, a technique for computing inverses. We recall these facts: the
elementary matrices are the matrices which represent the row operations.
Since the row operations are invertible, so are the elementary matrices.
For any matrix A there is a sequence $P_0, \ldots, P_s$ of elementary
matrices such that $B = P_s P_{s-1} \cdots P_0 A$ is in row-reduced form. The index
of A is the number of nonzero rows of B. We augment these facts by this
further observation:
Proposition 19. Suppose that A is an invertible $n \times n$ matrix. There is a
sequence $P_0, \ldots, P_s$ of elementary $n \times n$ matrices such that $P_s \cdots P_0 A$ is the
identity matrix:
$$P_s \cdots P_0 A = I$$
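Proof. The proof will be by induction on n. It is a slight modification of
Theorem 1.2. The first column of A is nonzero since the columns of A must be
independent (A is invertible). As we have seen in the proof of Theorem 1.2, there
exist elementary matrices $P_0, \ldots, P_k$ such that the first column of $P_k \cdots P_0 A$ is $E_1$.
Thus,
$$P_k \cdots P_0 A = \begin{pmatrix} 1 & A_{12} \\ 0 & A_{22} \end{pmatrix}$$
where the blocks have sizes 1 and $n - 1$. Since $P_k \cdots P_0 A$ is invertible so is
$A_{22}$ (see Problem 37). Thus the proposition applies to $A_{22}$. There is a sequence
$Q_{k+1}, \ldots, Q_s$ of elementary $(n - 1) \times (n - 1)$ matrices such that
$Q_s \cdots Q_{k+1} A_{22} = I$. Let
$$P_j = \begin{pmatrix} 1 & 0 \\ 0 & Q_j \end{pmatrix} \qquad \text{for } j = k+1, \ldots, s$$
Then
$$P_s \cdots P_0 A = \begin{pmatrix} 1 & A_{12} \\ 0 & Q_s \cdots Q_{k+1} A_{22} \end{pmatrix} = \begin{pmatrix} 1 & A_{12} \\ 0 & I \end{pmatrix}$$
Now the matrix
$$\begin{pmatrix} 1 & -A_{12} \\ 0 & I \end{pmatrix}$$
is the product $P_{s+n-1} \cdots P_{s+1}$ of elementary matrices corresponding to these row
operations: subtract $a_j^1$ times the jth row from the first row, $j = 2, \ldots, n$
(here $a_j^1$ is the jth entry of the first row of $P_s \cdots P_0 A$). Finally,
$$P_{s+n-1} \cdots P_0 A = \begin{pmatrix} 1 & -A_{12} \\ 0 & I \end{pmatrix} \begin{pmatrix} 1 & A_{12} \\ 0 & I \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & I \end{pmatrix} = I$$
as required.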
This proposition provides us with an effective way for computing inverses;
we just continue the process of row reduction until we obtain the identity.
Then the corresponding product of elementary matrices is the inverse.
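The procedure just described can be sketched in a few lines. This is an illustration only, assuming exact rational arithmetic; the function name `inverse` and the sample matrix are our own choices. The identity is carried along beside A, so that the accumulated product of elementary matrices appears on the right when the left half has been reduced to I.

```python
from fractions import Fraction


def inverse(a):
    """Invert a square matrix by row reduction; return None if it is not invertible."""
    n = len(a)
    # Augment A with the identity: the row operations that take A to I take I to A^-1.
    rows = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
            for i, row in enumerate(a)]
    for col in range(n):
        # Look for a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None  # the columns are dependent, so A is not invertible
        rows[col], rows[pivot] = rows[pivot], rows[col]      # interchange two rows
        scale = rows[col][col]
        rows[col] = [x / scale for x in rows[col]]           # multiply a row by a constant
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                # add a multiple of one row to another
                rows[r] = [x - factor * y for x, y in zip(rows[r], rows[col])]
    return [row[n:] for row in rows]


sample = [[1, 2, -1], [1, 1, 3], [2, 2, 1]]
for row in inverse(sample):
    print([str(x) for x in row])
```

Each of the three row operations used in the loop corresponds to one type of elementary matrix, so the right half of the augmented array is exactly the product $P_s \cdots P_0$ of Proposition 19.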
Row reduction Product of elementary matrices
Thus
/ 137 -21 -10\
А-1=тЫ 65 15 -20
\-10 5 25/
27.
A =
П О 1
2 1 2
О -1 1
^0 4 2-
П О 1 2
0 10-3
0 0 1 О
νΟ Ο 2 -10
α ο ο 2\ ι-\ 2 -ι o^
О 1 0 -3\| 1-2 0 0
0 0 1 oil 1 -1 10
νΟ 0 0 -10/ \-6 11 -2 1,
24 -и А
Ί О О 0\ /-^
0 10 0 |f
0 0 10 1 -1 1
vo о о ι/\ л -Н ^
Thus,
А-1 = -±-
** — in
'-22 42 -12 2^
28 -53 6 -3
10 | 10 -10 10 О
6 -11 2 -Ь
The Determinant Function
The determinant of a matrix is a pretty complicated concept; before going
into a study of it and its properties, we shall first see how to compute it.
Looking ahead, the method of computation comes from Equations (1.35)
and (1.36), but we shall not use those equations to derive it. Instead we shall
simply describe the technique for finding determinants.
The determinant of a $2 \times 2$ matrix is defined by
$$\det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc$$
The determinant of a $3 \times 3$ matrix is found as follows. First, select
a row. The determinant will be a sum of the products of the elements of that
row with numbers, called their cofactors. The cofactor of the (i, j)th entry
is $(-1)^{i+j}$ times the determinant of the $2 \times 2$ matrix remaining when the ith
row and jth column are deleted.
Examples
28. Compute the determinant of
$$A = \begin{pmatrix} 1 & 3 & 2 \\ -1 & 4 & 0 \\ 7 & -2 & 1 \end{pmatrix}$$
If we select the first row we find
$$\det A = 1[4(1) - (-2)0] - 3[(-1)(1) - 7(0)] + 2[(-1)(-2) - 7(4)] = -45$$
Selecting the second row:
$$\det A = -(-1)[3(1) - (-2)2] + 4[1(1) - 2(7)] - 0[\cdots] = -45$$
Selecting the third row:
$$\det A = 7[3(0) - 4(2)] - (-2)[1(0) - 2(-1)] + [1(4) - 3(-1)] = -45$$
Now, we could also have selected a column first, and proceeded in the
same way. For example, selecting the second column:
$$\det A = -3[(-1)1 - 0(7)] + 4[1(1) - 2(7)] - (-2)[1(0) - 2(-1)] = -45$$
Now, in general, the determinant of an $n \times n$ matrix is found in
the same way. Select a row (or column). The determinant is the
sum of the products of the entries in that row (or column) with their
cofactors. The cofactor of the (i, j)th entry is $(-1)^{i+j}$ times the
determinant of the $(n - 1) \times (n - 1)$ matrix remaining when the ith
row and jth column are deleted.
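The recipe just stated is recursive, and a short sketch makes that explicit. The code below is an illustration under our own naming (`det` is not notation from the text) and always expands along the first row; any other row or column would serve equally well.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for j in range(n):
        # Delete the first row and the column of the chosen entry.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        # With rows and columns counted from 1 the sign is (-1)^(1 + (j + 1)) = (-1)^j.
        total += (-1) ** j * m[0][j] * det(minor)
    return total


# The 3 x 3 matrix of Example 28.
print(det([[1, 3, 2], [-1, 4, 0], [7, -2, 1]]))
```

The number of multiplications grows like n!, which is one reason row reduction is preferred for matrices of any size.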
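29. Let
$$A = \begin{pmatrix} 4 & 3 & 1 & 0 \\ 2 & 6 & 0 & -1 \\ 1 & 0 & 0 & 4 \\ 2 & 1 & 1 & -1 \end{pmatrix}$$
Select the first row:
$$\det A = 4 \det \begin{pmatrix} 6 & 0 & -1 \\ 0 & 0 & 4 \\ 1 & 1 & -1 \end{pmatrix} - 3 \det \begin{pmatrix} 2 & 0 & -1 \\ 1 & 0 & 4 \\ 2 & 1 & -1 \end{pmatrix} + \det \begin{pmatrix} 2 & 6 & -1 \\ 1 & 0 & 4 \\ 2 & 1 & -1 \end{pmatrix} - 0 \det (\cdots)$$
We now compute the determinants of the $3 \times 3$ matrices by taking
advantage of the location of the 0's. Select the second column in the
first three, and don't bother with the last since its factor is 0:
$$\det A = 4(-1)[6(4) - 0(-1)] - 3(-1)[2(4) - 1(-1)] + (-6)[1(-1) - 4(2)] - [2(4) - 1(-1)] = -24$$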
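30.
$$A = \begin{pmatrix} 6 & 2 & 1 & 0 & -1 \\ 4 & 3 & 8 & 1 & 0 \\ 0 & 0 & 2 & 0 & 0 \\ 8 & 1 & 4 & 0 & 1 \\ 2 & 1 & 4 & -1 & 1 \end{pmatrix}$$
Select the third row:
$$\det A = 2(-1)^{3+3} \det \begin{pmatrix} 6 & 2 & 0 & -1 \\ 4 & 3 & 1 & 0 \\ 8 & 1 & 0 & 1 \\ 2 & 1 & -1 & 1 \end{pmatrix}$$
Select the third column:
$$\det A = 2\left[(-1)^{2+3} \det \begin{pmatrix} 6 & 2 & -1 \\ 8 & 1 & 1 \\ 2 & 1 & 1 \end{pmatrix} + (-1)^{4+3}(-1) \det \begin{pmatrix} 6 & 2 & -1 \\ 4 & 3 & 0 \\ 8 & 1 & 1 \end{pmatrix}\right] = 96$$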
We turn now to the theory of determinants. We begin with a definition
of the determinant function which is appropriate to the theoretical
discussion and then verify that it has the multiplicative property:
$$\det(AB) = \det A \cdot \det B$$
The formulas (1.35) and (1.36) below, which form the basis for the preceding
computations, will result from a rewriting of the formula for the determinant.
The determinant of an $n \times n$ matrix can be described in this way: it is
the sum of all products of precisely one element from each row and column,
with appropriate signs. Our first business is to determine this appropriate
sign. A selection of precisely one element from each row and column is
described as follows: In the first row we select a certain element, say in the
$\pi(1)$th column. In the second row we select an element, coming from a different
column, say $\pi(2)$. We have $\pi(2) \neq \pi(1)$, and so forth. We select the
element $a^i_{\pi(i)}$ in the ith row and $\pi(i)$th column, making sure that the numbers
$\pi(1), \ldots, \pi(n)$ are all distinct. These numbers then form a rearrangement,
or permutation, of the numbers $1, \ldots, n$. To form the determinant then, we
consider all products
$$a^1_{\pi(1)} \cdots a^n_{\pi(n)}$$
as $\pi$ ranges over all permutations of the numbers $1, \ldots, n$. A particular
kind of permutation is an interchange of two successive integers:
$$i \to i + 1, \qquad i + 1 \to i, \qquad j \to j \ \text{ for all other } j$$
(We consider the integers as arranged in a circle, so that 1 is the successor to
n.) Now it is a fact about permutations, that any permutation consists of a
succession of such interchanges. There may be many ways to build up a
given permutation by these simple interchanges, but the parity of the number
involved is always the same. That is, if we can write a given permutation as a
succession of an even number of interchanges, then every way of writing that
permutation as a succession of interchanges will involve an even number.
For example, consider the permutation on four integers
1 2 3 4 -> 3 1 4 2
This is obtained by this succession of interchanges:
1 2 3 4
2 1 3 4
2 3 1 4
3 2 1 4
3 1 2 4
3 1 4 2
Here is a better way of doing it:
1 2 3 4
1 3 2 4
3 1 2 4
3 1 4 2
Either way, there is an odd number of interchanges involved. We shall not
verify these facts about permutations; the verification would be tangential
to our present study. However, we shall use these facts. We shall say that
a given permutation is even if it can be formed by an even numbered succession
of interchanges; the permutation is odd if an odd numbered succession of
interchanges is required. For any permutation $\pi$, its sign, denoted $\varepsilon(\pi)$,
will be $+1$ if $\pi$ is even, and $-1$ if $\pi$ is odd. There is another way of defining
the sign function on permutations which is described in Problem 36. This
description does not involve the notion of interchange.
Definition 11. If $A = (a_j^i)$ is an $n \times n$ matrix, its determinant is
$$\det A = \sum_{\text{all permutations } \pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)} \tag{1.32}$$
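Definition 11 can be evaluated directly for small matrices. The sketch below is only an illustration under stated assumptions: the names `sign` and `det` are ours, indices run from 0 rather than 1, and the sign is computed by counting inversions (pairs that appear out of order). That the parity of the inversion count agrees with the parity of the number of interchanges is a standard fact which, like the facts quoted above, we use without proof.

```python
from itertools import permutations


def sign(perm):
    """+1 if the permutation is even, -1 if it is odd (parity of the inversion count)."""
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1


def det(a):
    """Sum over all permutations of sign times the product of the selected entries."""
    n = len(a)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= a[i][perm[i]]  # one entry from each row and each column
        total += term
    return total


# The permutation 1 2 3 4 -> 3 1 4 2 of the text, written with indices from 0.
print(sign((2, 0, 3, 1)))
print(det([[1, 3, 2], [-1, 4, 0], [7, -2, 1]]))
```

The first value printed is the sign of the permutation worked out above by interchanges, and the second is the determinant of the matrix of Example 28.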
We shall now show that $\det A \neq 0$ if and only if A is invertible, by showing
in fact a stronger statement: $\det(AB) = \det A \cdot \det B$.
Lemma 1.
(i) $\det I = 1$.
(ii) If A has a zero row, $\det A = 0$.
Proof.
(i) Writing $I = (e_j^i)$, we have $e^i_{\pi(i)} = 0$ unless $\pi(i) = i$. Thus, the sum (1.32)
has only one nonzero term, that corresponding to the identity permutation. Since
each $e_i^i = 1$, $\det I = 1 \cdot 1 \cdots 1 = 1$.
(ii) If the jth row of A is zero, each term of the sum (1.32) has a factor $a^j_{\pi(j)} = 0$,
so is zero. Then $\det A = 0$.
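Lemma 2. If P is an elementary matrix, and A any matrix,
$$\det(PA) = \det P \cdot \det A \tag{1.33}$$
Proof. Let $A = (a_j^i)$, $PA = (b_j^i)$.
Type I. If P multiplies the rth row by c, then
$$\det PA = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi)\, a^1_{\pi(1)} \cdots c\,a^r_{\pi(r)} \cdots a^n_{\pi(n)} = c \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)} = c \det A$$
In the special case $A = I$, we have $\det P = \det(PI) = c \det I = c$. Thus (1.33)
holds in this case.
Type II. Suppose now P interchanges the rth and sth rows. Let $\eta$ represent
the permutation which interchanges r and s. Thus, $b_j^i = a_j^{\eta(i)}$. Now we compute:
$$\det PA = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^{\eta(i)}_{\pi(i)}$$
Now we change the index of summation. Let $\pi = \tau \cdot \eta$, and sum over $\tau$.
$$\det PA = \sum_{\tau} \varepsilon(\tau \cdot \eta) \prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))} = -\sum_{\tau} \varepsilon(\tau) \prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))}$$
The sign changes since $\eta$ is an interchange; thus, if $\tau$ is even, $\tau \cdot \eta$ is odd. Now the
product $\prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))}$ is the same as the product $\prod_{i=1}^{n} a^i_{\tau(i)}$ (another change of index)
so
$$\det PA = -\sum_{\tau} \varepsilon(\tau) \prod_{i=1}^{n} a^i_{\tau(i)} = -\det A$$
In particular, $\det P = \det(PI) = -1$, so (1.33) holds in this case.
Type III. Suppose that P adds $\alpha$ times row r to row s. Then $b_j^i = a_j^i$ if $i \neq s$
and $b_j^s = a_j^s + \alpha a_j^r$. We now compute
$$\det(PA) = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)} + \alpha \sum_{\pi} \varepsilon(\pi) \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)} \tag{1.34}$$
The first term on the right is $\det A$. The second term is zero. We can see that by
splitting up the sum into odd and even permutations. Let $\eta$ represent the
interchange of r and s. It is important to note that the odd permutations are just those
of the form $\pi \cdot \eta$, where $\pi$ is even. Thus the sum in the last term in Equation (1.34) is
$$\sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)} - \sum_{\pi \text{ odd}} \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)} = \sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi(i)} \left( a^r_{\pi(r)} a^r_{\pi(s)} - a^r_{\pi(s)} a^r_{\pi(r)} \right) = 0$$
Thus, $\det PA = \det A$. In particular, $\det P = 1$, so (1.33) is verified also for
Type III elementary matrices.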
Now, lemma is a word denoting a logical particle of no particular intrinsic
interest, but of crucial importance in the verification of a theorem. Here
now is the main theorem concerning determinants.
Theorem 1.7. A matrix M is invertible if and only if $\det M \neq 0$. Moreover,
$\det AB = \det A \cdot \det B$ for any two matrices.
Proof. Suppose M is an $n \times n$ matrix which is not invertible. Then there are
elementary matrices $P_0, \ldots, P_s$ such that $P_s \cdots P_0 M$ is row reduced and has a zero
row. Thus, by the above lemmas
$$0 = \det(P_s \cdots P_0 M) = \det P_s \cdot \det P_{s-1} \cdots \det P_0 \cdot \det M$$
Since the determinant of an elementary matrix is nonzero, we must have $\det M = 0$.
On the other hand, if M is invertible, there are elementary matrices $P_0, \ldots, P_s$ such
that $I = P_s \cdots P_0 M$. Then
$$1 = \det I = \det P_s \cdot \det P_{s-1} \cdots \det P_0 \cdot \det M$$
Thus $\det M \neq 0$.
Now let A, B be two $n \times n$ matrices. If one of A or B is not invertible, neither
is AB, so $\det AB = 0$ and either $\det A = 0$ or $\det B = 0$. In any case
$$\det AB = \det A \cdot \det B$$
is true. If A and B are invertible, there are elementary matrices $P_0, \ldots, P_s$, $Q_0, \ldots, Q_t$
such that
$$P_s \cdots P_0 A = I = Q_t \cdots Q_0 B$$
Then
$$Q_t \cdots Q_0 P_s \cdots P_0 AB = Q_t \cdots Q_0 (P_s \cdots P_0 A) B = Q_t \cdots Q_0 B = I$$
Thus
$$\det Q_t \cdots \det Q_0 \cdot \det P_s \cdots \det P_0 \cdot \det(AB) = 1$$
$$\det Q_t \cdots \det Q_0 \cdot \det B = 1$$
$$\det P_s \cdots \det P_0 \cdot \det A = 1$$
Multiplying the last two equations and comparing with the first, we see again that
$\det(AB) = \det A \cdot \det B$.
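A numerical check is no substitute for the proof, but it is a useful sanity test of the multiplicative property. The sketch below assumes small integer matrices of our own choosing; the helper names `det` and `mat_mul` are ours, and the determinant is computed by first-row cofactor expansion.

```python
def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


a = [[1, 3, 2], [-1, 4, 0], [7, -2, 1]]
b = [[1, 2, -1], [1, 1, 3], [2, 2, 1]]
singular = [[1, 2, 3], [2, 4, 6], [0, 1, 1]]   # second row is twice the first

print(det(mat_mul(a, b)), det(a) * det(b))
print(det(mat_mul(singular, a)), det(singular) * det(a))
```

Each printed pair should consist of two equal numbers; in the second pair both are zero, since a product with a non-invertible factor is not invertible.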
Notice that the formula $\det AB = \det A \cdot \det B$ is far from transparent on
the basis of the definition above. In fact, it is not at all derivable without
some information regarding the structure of $n \times n$ matrices. We have a
means of computing $A^{-1}$ for a given invertible matrix A; namely, the process
of row reduction. But we have not given explicitly any formula for the
inverse. Such is provided by the cofactor expansion of a determinant. This
formula is of theoretical interest, but not of any great computational value.
As far as computations are concerned, the surest and quickest route to the
inverse is the process of row reduction.
Let A be an $n \times n$ matrix. The adjoint matrix of an entry of A is the
$(n - 1) \times (n - 1)$ matrix obtained by deleting the row and column of the
given entry (see Figure 1.13). Let $A_j^i$ be the adjoint matrix of the entry $a_j^i$.
Then the inverse to the matrix A (if it is invertible) is easily given by the
determinants of the adjoints: the (i, j)th entry of $A^{-1}$ is
$$(-1)^{i+j}\, \frac{\det A_i^j}{\det A}$$
More precisely we have these formulas (the explicit version of $AA^{-1} =
A^{-1}A = I$) known as Cramer's rule:
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a_j^i \det A_j^i \quad \text{for all } i \tag{1.35}$$
$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a_j^i \det A_j^i \quad \text{for all } j \tag{1.36}$$
$$0 = \sum_{j=1}^{n} (-1)^{i+j} a_j^k \det A_j^i \quad \text{for all } i \neq k \tag{1.37}$$
$$0 = \sum_{i=1}^{n} (-1)^{i+j} a_k^i \det A_j^i \quad \text{for all } j \neq k \tag{1.38}$$
Figure 1.13 (the adjoint matrix $A_j^i$)
The verifications of these formulas are simpler than it may seem; they can
be based directly on formula (1.32). For example, let us verify (1.35). First
fix a row index i. We shall break up the sum in (1.32) into n parts: those
permutations taking $i \to 1$, $i \to 2$, ..., $i \to n$. Consider, for a fixed column
index j, the permutations taking $i \to j$. (That is, those $\pi$ for which $\pi(i) = j$.)
These are precisely the same as all permutations on the indices of the
matrix adjoint to $a_j^i$ (those permutations which take the integers $1, \ldots, n$,
except for i, into the integers $1, \ldots, n$, except for j). Thus the terms
appearing in the sum (1.32) which have $a_j^i$ as a factor are the same as those in
(1.35): we must now verify that the signs agree. Let $\tau$ be a permutation on
the indices of the adjoint to $a_j^i$. The corresponding permutation $\pi$ of
$(1, \ldots, n)$ does the same as $\tau$ and takes i into j. The number of interchanges
involved in building this permutation is just that for $\tau$, together with the interchanges
required to send i to j. The last number is $j - i$, which has the same parity
as $i + j$. Thus, $\varepsilon(\pi) = (-1)^{i+j} \varepsilon(\tau)$, so the signs of corresponding terms in
(1.32) and (1.35) also agree. Thus (1.35) is true. We shall leave the
verifications of the other formulas to the exercises. (Equations (1.37) and (1.38)
require a small trick.)
Cramer's rule allows for a simple description for solving the equation
$Ax = b$ when A is an invertible $n \times n$ matrix. Let $A(i)$ be the matrix obtained
by replacing the ith column of A with the column b. Then the equation
$Ax = b$, which is the same as $x = A^{-1}b$, turns out, according to Cramer's rule,
to read
$$x^i = \frac{\det A(i)}{\det A} \qquad 1 \le i \le n$$
This is checked out by unraveling all the definitions and applying the formulas
of Cramer's rule: since $x = A^{-1}b$,
$$x^i = \sum_{j=1}^{n} (A^{-1})_j^i\, b^j = \frac{1}{\det A} \sum_{j=1}^{n} (-1)^{i+j} \det A_i^j\, b^j$$
But the summation is just the determinant of the matrix obtained by replacing
the ith column of A with the column vector b! Thus we can solve by taking
quotients of determinants.
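Example
31. Solve the equations
$$x^1 + 2x^2 - x^3 = 2$$
$$x^1 + x^2 + 3x^3 = 0$$
$$2x^1 + 2x^2 + x^3 = 1$$
The determinant of the matrix
$$A = \begin{pmatrix} 1 & 2 & -1 \\ 1 & 1 & 3 \\ 2 & 2 & 1 \end{pmatrix}$$
is easily found by cofactor expansion along the first row:
$$\det A = 1(1 - 6) - 2(1 - 6) - 1(2 - 2) = 5$$
By Cramer's rule
$$x^1 = \frac{1}{5} \det \begin{pmatrix} 2 & 2 & -1 \\ 0 & 1 & 3 \\ 1 & 2 & 1 \end{pmatrix} = \frac{1}{5}[2(-5) + 1(6 + 1)] = -\frac{3}{5}$$
$$x^2 = \frac{1}{5} \det \begin{pmatrix} 1 & 2 & -1 \\ 1 & 0 & 3 \\ 2 & 1 & 1 \end{pmatrix} = \frac{1}{5}[-2(-5) - (3 + 1)] = \frac{6}{5}$$
$$x^3 = \frac{1}{5} \det \begin{pmatrix} 1 & 2 & 2 \\ 1 & 1 & 0 \\ 2 & 2 & 1 \end{pmatrix} = \frac{1}{5}[2(0) + 1(1 - 2)] = -\frac{1}{5}$$
(the determinants are computed by column cofactor expansion).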
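The computation of Example 31 can be mechanized as follows. This is a minimal sketch under our own naming (`det`, `cramer_solve`), assuming integer entries so that the quotients can be kept as exact fractions; it replaces the ith column of A by b and divides determinants, exactly as in the formula for $x^i$ above.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def cramer_solve(a, b):
    """Solve Ax = b for an invertible integer matrix A by quotients of determinants."""
    d = det(a)
    if d == 0:
        raise ValueError("A is not invertible, so Cramer's rule does not apply")
    solution = []
    for i in range(len(a)):
        # A(i): the matrix A with its ith column replaced by the column b.
        a_i = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(a)]
        solution.append(Fraction(det(a_i), d))
    return solution


# The system of Example 31.
x = cramer_solve([[1, 2, -1], [1, 1, 3], [2, 2, 1]], [2, 0, 1])
print([str(v) for v in x])
```

As the text remarks about the cofactor formulas generally, this is of more theoretical than computational value: for larger systems row reduction needs far fewer operations.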
• EXERCISES
33. Find the inverse of these matrices
Ί
2
3
'4
3
0
0
1
-1
6
1
4
-1'
2
1
i;
2
0
34. Solve the equation
Hi)
where A is given by
(a) the matrix in Exercise 33(a)
(b) the matrix in Exercise 33(b)
(c)
A = |
(d)
A = |
35. Suppose that the $n \times n$ matrix $A = (a_j^i)$ has this property:
a/=0 ifj<y
Show that $A^n = 0$.
36. If A is a matrix such that $A^n = 0$ show that $I + A$ is invertible.
• PROBLEMS
33. Show that if a linear transformation T has rank n, it is invertible.
Show that if there is a transformation S such that $T \circ S = I$, T is invertible.
34. Derive Equations (1.35)-(1.38) using the definition of the determinant.
35. Assume this fact about polynomials: A polynomial of degree d has
no more than d roots. Prove the following assertions:
(a) Let A be an $n \times n$ matrix. There are at most n numbers s such
that $A + sI$ is not invertible.
(b) The $n \times n$ matrix
$$V = \begin{pmatrix} 1 & r_1 & r_1^2 & \cdots & r_1^{n-1} \\ 1 & r_2 & r_2^2 & \cdots & r_2^{n-1} \\ \vdots & & & & \vdots \\ 1 & r_n & r_n^2 & \cdots & r_n^{n-1} \end{pmatrix}$$
has a nonzero determinant if and only if the $r_i$ are all distinct. (Hint:
If $\det V = 0$, there is a nonvanishing linear relation among the columns.)
36. Let
$$f(x^1, \ldots, x^n) = \prod_{i < j} (x^i - x^j)$$
where $x^1, \ldots, x^n$ are distinct numbers. Show that the permutation $\pi$ is
even if and only if
$$f(x^{\pi(1)}, \ldots, x^{\pi(n)}) = f(x^1, \ldots, x^n)$$
and similarly $\pi$ is odd if and only if
$$f(x^{\pi(1)}, \ldots, x^{\pi(n)}) = -f(x^1, \ldots, x^n)$$
37. Let A be an invertible $n \times n$ matrix. For $m < n$, let B be an $(n - m)
\times (n - m)$ matrix formed from A by deleting any m rows and m columns.
Show that B is also invertible. (Hint: You need only take $m = 1$, and
proceed by induction.)
38. Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
Verify that
$$A^2 - (a + d)A + (ad - bc)I = 0$$
that is, that A is a zero of a polynomial of degree 2.
39. The same fact is true for all n, that is, an $n \times n$ matrix is the zero of a
polynomial of degree n. This is part of a famous theorem of algebra, which
goes like this: If A is any matrix, the polynomial
$$P_A(x) = \det(A - xI)$$
is the characteristic polynomial of A. A is a root of the polynomial equation
$P_A(x) = 0$ (Cayley-Hamilton). That is,
$$P_A(A) = 0$$
Verify the Cayley-Hamilton theorem for (i) a diagonal matrix, (ii) a triangular
matrix.
1.7 Eigenvectors and Change of Basis
One fruitful way of studying linear transformations on $R^n$ is to find
directions along which they act merely by stretching the vector. For example,
if a transformation T is represented by a diagonal matrix
$$\begin{pmatrix} d_1 & 0 & \cdots & 0 \\ 0 & d_2 & \cdots & 0 \\ \vdots & & \ddots & \vdots \\ 0 & 0 & \cdots & d_n \end{pmatrix} \tag{1.39}$$
then $T(E_i) = d_i E_i$, where $E_1, \ldots, E_n$ are the standard basis vectors. Thus
T acts by stretching by a factor $d_i$ along the ith direction:
$$T(x^1, \ldots, x^n) = d_1 x^1 E_1 + d_2 x^2 E_2 + \cdots + d_n x^n E_n$$
More generally, suppose we can find a basis $v_1, \ldots, v_n$ of vectors in $R^n$
such that T acts by stretching along the direction of $v_i$ for each i:
$$T(v_i) = d_i v_i$$
Then, if v is any vector, the action of T is easily computed by referring v to
the basis $v_1, \ldots, v_n$: if $v = \sum s^i v_i$, then $T(v) = \sum d_i s^i v_i$. T is represented
by the diagonal matrix (1.39) relative to this basis. The process of finding
a basis of vectors along which T acts by stretching is called diagonalization.
Unfortunately, not all transformations can be so diagonalized and this
presents a major difficulty in this line of investigation. For example, a
rotation in the plane clearly does not have any such directions in which it
acts as a stretch. More precisely, let T be represented by the matrix
$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
Then $T(x, y) = (y, -x)$. (T is a clockwise rotation through a right angle.)
If $v = (a, b)$ is such that $T(a, b) = d(a, b)$, we must have
$$da = b \qquad db = -a$$
Then $d^2 a = db = -a$, and there are no real numbers d, a making this equation
true (except $a = 0$).
Nevertheless, there are many transformations which can be analyzed in
this way, and it is our purpose in this section to study the techniques for
doing so.
Definition 12. Let $T: R^n \to R^n$ be a linear transformation. An eigenvalue
of T is a number d for which there exists a nonzero vector v such that $Tv = dv$.
An eigenvector of T with eigenvalue d is a nonzero vector v such that $Tv = dv$.
Proposition 20. If $T: R^n \to R^n$ is a linear transformation for which there is
a basis of eigenvectors $v_1, \ldots, v_n$ with eigenvalues $d_1, \ldots, d_n$, respectively,
then for any vector $v = \sum s^i v_i$, $T(v) = \sum d_i s^i v_i$.
Proof. Compute $T(v)$ using the fact that T is linear.
Now we find the eigenvalues of a linear transformation T by making use
of this remark: d is an eigenvalue of T if and only if $T - dI$ is singular (not
invertible). If A is the matrix representing T in terms of the standard basis,
this condition is verified precisely when $\det(A - dI) = 0$. Thus the
eigenvalues of T are just the roots of this equation. Notice that when T is rotation
by a right angle
$$\det\left[\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} - dI\right] = d^2 + 1$$
which has no real roots, thus explaining in another way why this
transformation has no eigenvectors. We shall see that when we extend the real number
system to a system in which every polynomial has a root (the complex
numbers), then T can be represented in terms of (complex) eigenvectors.
This is one of the important reasons (particularly in the study of differential
equations, as we shall see) for so extending the number system. Let us now
collect these observations.
Proposition 21. Let T be a transformation on $R^n$ represented by the matrix
A. d is an eigenvalue of T if and only if d is a root of the equation
$$\det(A - tI) = 0$$
If d is an eigenvalue, the set of eigenvectors corresponding to d is the kernel
of $T - dI$ (with the zero vector removed).
Proof. Suppose d is an eigenvalue of T. Then there is a $v \neq 0$ such that
$Tv = dv$, or $(T - dI)v = 0$. Thus the nullity of $T - dI$ is positive, so $T - dI$ is not
invertible. Thus, $\det(A - dI) = 0$. On the other hand, if $\det(A - dI) = 0$, then
$T - dI$ is not invertible, so has a positive dimensional kernel. If $v \neq 0$ is in the
kernel, $(T - dI)(v) = 0$, or $Tv = dv$; thus d is an eigenvalue of T.
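For a $2 \times 2$ matrix the equation of Proposition 21 is a quadratic, $\det(A - tI) = t^2 - (\text{trace})\,t + \det A$, so the real eigenvalues can be read off from the quadratic formula. The sketch below is an illustration only; the function name is ours, and the second sample matrix is a hypothetical one chosen so that its characteristic polynomial is $t^2 - 3t + 2$.

```python
import math


def real_eigenvalues_2x2(a):
    """Real roots of det(A - tI) = t^2 - (p + s)t + (ps - qr) for A = [[p, q], [r, s]]."""
    (p, q), (r, s) = a
    trace = p + s
    determinant = p * s - q * r
    discriminant = trace * trace - 4 * determinant
    if discriminant < 0:
        return []  # no real roots, hence no real eigenvalues
    root = math.sqrt(discriminant)
    # A set removes the duplicate when the two roots coincide.
    return sorted({(trace - root) / 2, (trace + root) / 2})


print(real_eigenvalues_2x2([[0, 1], [-1, 0]]))  # rotation by a right angle
print(real_eigenvalues_2x2([[2, 0], [1, 1]]))   # characteristic polynomial t^2 - 3t + 2
```

The rotation yields an empty list, in agreement with the observation that $d^2 + 1$ has no real roots.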
Examples
32. Let T be represented by the matrix
$$A = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}$$
Then
$$A - tI = \begin{pmatrix} 2 - t & 0 \\ 1 & 1 - t \end{pmatrix}$$
and $\det(A - tI) = t^2 - 3t + 2$. The roots are $t = 2, 1$. The space
of eigenvectors corresponding to $t = 2$ is the kernel of
$$A - 2I = \begin{pmatrix} 0 & 0 \\ 1 & -1 \end{pmatrix}$$
that is, the space of all vectors (x, y) such that $x - y = 0$. Thus
(1, 1) is an eigenvector with eigenvalue 2. The eigenvectors
corresponding to $t = 1$ lie in the kernel of
$$A - I = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$$
that is, in the space of vectors (x, y) such that $x = 0$. (0, 1) is such
an eigenvector. Since (1, 1) and (0, 1) are a basis for $R^2$, we have
diagonalized T. Relative to this basis T is represented by the matrix
$$\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}$$
33. Consider the transformation given, relative to the standard
basis, by a matrix A for which
$\det(A - tI) = t^2 - 8t + 16 = (t - 4)^2$. Thus 4 is the only
eigenvalue of T. The matrix $A - 4I$
has as kernel $\{(x, y): x + 2y = 0\}$, which is one dimensional. Thus
there cannot be a basis of eigenvectors, for the only eigenvectors lie
on the line $x = -2y$.
Notice that this example differs from that of a rotation, for there
is no problem with the roots; the difficulty lies with the transformation
itself.
34. Let $T: R^3 \to R^3$ be given by the matrix
$$A = \begin{pmatrix} -7 & 0 & -18 \\ 2 & 2 & 4 \\ 3 & 0 & 8 \end{pmatrix}$$
Then $\det(A - tI) = -t^3 + 3t^2 - 4$. The roots of
$$\det(A - tI) = 0$$
are 2, $-1$.
Eigenvalue 2:
$$A - 2I = \begin{pmatrix} -9 & 0 & -18 \\ 2 & 0 & 4 \\ 3 & 0 & 6 \end{pmatrix}$$
The kernel is the set of vectors (x, y, z) such that $x + 2z = 0$. This
space is two dimensional, so we can find two independent eigenvectors
with eigenvalue 2; for example, $v_1 = (0, 1, 0)$, $v_2 = (-2, 0, 1)$.
Eigenvalue $-1$:
$$A - (-1)I = \begin{pmatrix} -6 & 0 & -18 \\ 2 & 3 & 4 \\ 3 & 0 & 9 \end{pmatrix}$$
The kernel is the set of vectors (x, y, z) such that
$$x + 3z = 0 \quad \text{or} \quad x = -3z$$
$$2x + 3y + 4z = 0 \quad \text{or} \quad y = \tfrac{2}{3} z$$
which is one dimensional. An eigenvector is $v_3 = (-9, 2, 3)$. These
$v_1, v_2, v_3$ thus form a basis of eigenvectors, and T is represented by
the matrix
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{1.40}$$
relative to the basis $v_1, v_2, v_3$.
Jordan Canonical Form
Notice that in general there are two difficulties with the procedure
described above. The polynomial $\det(A - tI)$ may not have many real roots,
and it may have multiple roots. As we shall see in the next section the first
difficulty can be overcome by transferring to the complex number system.
Example 34 above demonstrates that the second possibility, that of multiple
roots, may not be severe, whereas Example 33 shows that it can seriously
handicap the diagonalization procedure. Continued study of this situation
becomes quite difficult and we shall not enter into it. The conclusion is that
the typical matrix which cannot be diagonalized is of this form
$$\begin{pmatrix} d & 1 & 0 & 0 & \cdots & 0 \\ 0 & d & 1 & 0 & \cdots & 0 \\ 0 & 0 & d & 1 & \cdots & 0 \\ \vdots & & & \ddots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & d & 1 \\ 0 & 0 & 0 & \cdots & 0 & d \end{pmatrix} \tag{1.41}$$
representing the transformation
$$T(x^1, \ldots, x^n) = (dx^1 + x^2, dx^2 + x^3, \ldots, dx^n)$$
Given any matrix, we can find a basis of vectors (which includes all possible
eigenspaces) relative to which T decomposes into pieces, each of which has
the form (1.41). This is called the Jordan canonical form.
Change of Basis
Before leaving this subject, let us compute explicitly the formulas which
allow us to change bases in $R^n$. If $\{E_1, \ldots, E_n\}$ is a basis for $R^n$, then any
x in $R^n$ can be written
$$x = x^1 E_1 + \cdots + x^n E_n$$
uniquely. We shall refer to the n-tuple $(x^1, \ldots, x^n)$ as the coordinates of x
relative to the basis $E: \{E_1, \ldots, E_n\}$, denoted $x_E$.
Let $F: \{F_1, \ldots, F_n\}$ be another basis for $R^n$. Let $x_F$ be the coordinates of
x relative to this new basis. To each set of E coordinates $x_E$ we can associate
the F coordinates $x_F$ of the point corresponding to $x_E$. In this way we can
write $x_F$ as a function of $x_E$. The precise relation is this:
Proposition 22. Let $E: \{E_1, \ldots, E_n\}$, $F: \{F_1, \ldots, F_n\}$ be two different
bases for $R^n$. Write the $E_j$ in terms of the $F_i$:
$$E_j = \sum_{i=1}^{n} a_j^i F_i$$
The matrix $(a_j^i)$ is called the change of basis matrix, and is denoted $A^F_E$.
For any point x in $R^n$ we have this relation between its E and F coordinates:
$$x_F = A^F_E\, x_E \tag{1.42}$$
Proof. Let $x_E = (x^1, \ldots, x^n)$, $x_F = (y^1, \ldots, y^n)$. Then
$$x = \sum_{j=1}^{n} x^j E_j = \sum_{j=1}^{n} x^j \Big( \sum_{i=1}^{n} a_j^i F_i \Big) = \sum_{i=1}^{n} \Big( \sum_{j=1}^{n} a_j^i x^j \Big) F_i$$
Thus for each i, $y^i = \sum_{j=1}^{n} a_j^i x^j$, which is the same as (1.42).
Notice that it follows from (1.42) that $(A^F_E)^{-1} = A^E_F$. For, given any $x_F$,
$$x_F = A^F_E\, x_E = A^F_E A^E_F\, x_F$$
Thus $A^F_E A^E_F = I$.
Now, if T is any linear transformation on $R^n$, it can be represented by
a matrix, relative to any basis $E: \{E_1, \ldots, E_n\}$. Let us denote that matrix
by $T_E$:
$$T(x)_E = T_E\, x_E$$
Proposition 23. If $E: \{E_1, \ldots, E_n\}$, $F: \{F_1, \ldots, F_n\}$ are two bases of $R^n$,
and $T: R^n \to R^n$ is a linear transformation, we have
$$T_F = (A^E_F)^{-1}\, T_E\, A^E_F$$
Proof.
$$T(x)_F = A^F_E\, T(x)_E = A^F_E\, T_E\, x_E = A^F_E\, T_E\, A^E_F\, x_F$$
On the other hand, by definition
$$T(x)_F = T_F\, x_F$$
Thus $T_F = A^F_E\, T_E\, A^E_F = (A^E_F)^{-1}\, T_E\, A^E_F$.
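Proposition 23 translates directly into a computation. The following sketch is restricted to $R^2$ to keep the inverse explicit; the function names and the sample matrix are our own assumptions, not taken from the text. With E the standard basis, the columns of $A^E_F$ are simply the vectors of F.

```python
from fractions import Fraction


def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(m):
    """Inverse of an integer 2 x 2 matrix via the cofactor formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is not invertible")
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]


def change_basis(t_e, basis_f):
    """Return T_F = (A^E_F)^(-1) T_E A^E_F, where E is the standard basis of R^2."""
    # The jth column of A^E_F holds the standard coordinates of the jth vector of F.
    a_ef = [[basis_f[j][i] for j in range(2)] for i in range(2)]
    return mat_mul(mat_mul(inverse_2x2(a_ef), t_e), a_ef)


# A hypothetical T_E for which (1, 1) and (0, 1) are eigenvectors with eigenvalues 2 and 1.
t_f = change_basis([[2, 0], [1, 1]], [(1, 1), (0, 1)])
print([[str(x) for x in row] for row in t_f])
```

When F consists of eigenvectors, as here, the resulting matrix $T_F$ is diagonal with the eigenvalues on the diagonal.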
Examples
35. Let T: R2 -> R2 be represented, relative to the standard basis Ε
by
Ho i)
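Let $F: \{(1, 1), (2, -1)\}$ be another basis. Find the matrix $T_F$.
Now,
$$A^E_F = \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix} \qquad A^F_E = (A^E_F)^{-1} = \frac{1}{3} \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}$$
Thus
$$T_F = A^F_E\, T_E\, A^E_F = \begin{pmatrix} 7/3 & -1 \\ 4/3 & 2/3 \end{pmatrix}$$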
36. Let T be given, relative to the standard basis E, by
$$T_E = \begin{pmatrix} -7 & 0 & -18 \\ 2 & 2 & 4 \\ 3 & 0 & 8 \end{pmatrix}$$
and let $F: \{(0, 1, 0), (-2, 0, 1), (-9, 2, 3)\}$. Each vector of F is an
eigenvector of T (compare Example 34), with eigenvalues 2, 2, $-1$,
respectively. Thus we may conclude that $T_F$ is given by (1.40).
• EXERCISES
37. Find a basis of eigenvectors, if possible, for the transformation
represented in terms of the standard basis by the matrix A:
(a)
(b)
(c)
A= 2 7 5/2 eigenvalues: 2, 3, -1
eigenvalues: 1, — 1, 0, 2
A = I „ „ . - eigenvalues: 1,4
(d) /1-1 Λ
A= "I » 4
\ 2 2 0/
38. Show that for $F: \{F_1, \ldots, F_n\}$ a basis for $R^n$, and E the standard basis,
the matrix $A^E_F$ is just the matrix whose columns are $F_1, \ldots, F_n$.
39. Find the matrix $A^E_F$ for these pairs of bases in $R^n$.
(a) F: (1, 0, 1), (0, 1, 1), (1, 0, 0)
E: (0, 1, 2), (2, 0, 1), (1, 2, 0).
(b) F: (1, 0, 0), (2, 0, 1), (0, 1, 0)
E: (3, 1, 5), (0, 2, 3), (-1, -1, 0).
(c) F: (1, 0, 1, 0), (0, 1, 1, 0), (0, 0, 2, 0), (0, 0, 1, 1)
E: (0, 2, 0, 2), (2, 0, 0, 0), (2, 0, 2, 0), (0, 2, 2, 0).
40. Let $T: R^3 \to R^3$ be a linear transformation represented by one of
$$\text{(a)} \quad T_E = \begin{pmatrix} 2 & 0 & 0 \\ -1 & 0 & 3 \\ 1 & 0 & 1 \end{pmatrix} \qquad \text{(b)} \quad T_E = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 4 \\ 2 & 0 & -1 \end{pmatrix}$$
relative to the standard basis E. Find $T_F$, where F is one of these bases
(F as in Exercises 39(a) and 39(b)).
41. If $T: R^2 \to R^2$ has two independent eigenvectors with the same
eigenvalue, then T is represented by a diagonal matrix in any basis.
• PROBLEMS
40. Prove Proposition 20.
41. If T is a linear transformation on $R^n$ represented by the matrix A
which has n distinct eigenvalues $d_1, \ldots, d_n$, then
$$P_A(x) = (-1)^n (x - d_1)(x - d_2) \cdots (x - d_n)$$
and $P_A(A) = 0$ ($P_A$ is defined in Problem 39).
42. Let T be a linear transformation on $R^n$. Let $E(r) = \{v \in R^n : Tv = rv\}$.
Show that $E(r)$ is a linear subspace of $R^n$ (called the r eigenspace of T).
Show that if $r \neq s$, then $E(r) \cap E(s) = \{0\}$.
43. Suppose A represents a linear transformation on $R^n$ with this property:
if $r_1, \ldots, r_k$ are the eigenvalues of A, then $n = \sum_{i=1}^{k} \dim E(r_i)$. Then
$$P_A(x) = (-1)^n (x - r_1)^{\dim E(r_1)} \cdots (x - r_k)^{\dim E(r_k)}$$
Verify the Cayley-Hamilton theorem for A.
44. Find a matrix with no nontrivial eigenspaces. How would you
expect to prove the Cayley-Hamilton theorem for such a matrix?
1.8 Complex Numbers
Pythagoras' discovery, that $\sqrt{2}$ is not the quotient of two integers, was
considered in his day to be a geometric mystery. His conception of numbers
was limited to rational numbers and his desire to measure lengths (to associate
numbers to line segments) led to this unhappy realization: there are some
lengths which are not measurable! (as the hypotenuse of an isosceles right
triangle of leg length 1). It took a long time for mathematicians to realize
that the solution to this situation was to expand the notion of number. The
general liberation of thought that was the Renaissance led in mathematics to
the possibility of expressing the value of certain lengths by never-ending
decimals, or continued fractions, or other types of infinite expressions. It
was during those days that mathematicians formulated the view that such
expressions represented numbers and served to determine all lengths. Earlier,
Middle Eastern mathematicians were led from certain algebraic problems
to envision extension of the number concept in another direction. As they
observed, quite clearly —1 has no square root; some bold adventurer then
suggested that we contemplate, in our minds, some purely imaginary quantity
whose square would be $-1$ and treat it as if it were another number. As
this supposition did not contradict any of the known facts concerning the
number system, it could do no harm—and might do a great deal of good
(at least in our minds).
Today we need not be so mysterious or cunning in our ways. We need
only recall that there is a 2 × 2 matrix (see Problem 20) whose square is the
negative of the identity. We can thus say quite factually that in the set of
2 × 2 matrices, $-1$ does indeed have a square root. Well, there is also a
4 × 4 matrix, and an $n \times n$ matrix for any even $n$, whose square is $-I$, so we should
ask for the smallest algebraic system in which $-1$ has a square root. The
complex number system is this system and we shall later derive the remarkable
fact (the fundamental theorem of algebra): Every polynomial has a root in
the complex number system.
Now, to be explicit, the matrix
\[ i = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \tag{1.43} \]
has the property that $i^2 = -I$. The complex number system is the collection
of all 2 × 2 matrices of the form $aI + bi$, where $a$, $b$ are real numbers.
Definition 13. $C$, the set of complex numbers, is the collection of all 2 × 2
matrices of the form
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \]
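Proposition 24.
(i) The operations of addition and multiplication are defined on $C$.
(ii) Every nonzero complex number has an inverse.
(iii) $C$ is in one-to-one correspondence with $R^2$.
Proof.
(i)
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} + \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} a + c & -(b + d) \\ b + d & a + c \end{pmatrix} \]
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix} \]
(ii) If
\[ M = \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \]
is nonzero, then one of $a$ or $b$ is nonzero, so $\det M = a^2 + b^2 \neq 0$, and thus $M$ has
an inverse. By Cramer's rule
\[ M^{-1} = \frac{1}{a^2 + b^2} \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \]
(iii) is obvious, since every complex number is given by a pair of real numbers
and conversely every pair $(a, b)$ of real numbers gives rise to a complex number.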
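The matrix description of the complex numbers can be checked numerically. The following is a minimal sketch, assuming plain nested lists for 2 × 2 matrices; the helper names (to_matrix, mat_add, mat_mul, mat_inv) are illustrative and do not come from the text.

```python
# Sketch: represent a + bi as the 2 x 2 real matrix [[a, -b], [b, a]].
# All helper names here are hypothetical, chosen only for this illustration.

def to_matrix(a, b):
    """Matrix of the form used in Definition 13 for the pair (a, b)."""
    return [[a, -b], [b, a]]

def mat_add(m, n):
    return [[m[r][c] + n[r][c] for c in range(2)] for r in range(2)]

def mat_mul(m, n):
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def mat_inv(m):
    """Inverse of [[a, -b], [b, a]] using the formula from part (ii)."""
    a, b = m[0][0], m[1][0]
    d = a * a + b * b              # det M = a^2 + b^2, nonzero unless a = b = 0
    return [[a / d, b / d], [-b / d, a / d]]

i = to_matrix(0, 1)
z, w = to_matrix(1, 2), to_matrix(3, -1)    # 1 + 2i and 3 - i

i_squared = mat_mul(i, i)                   # the negative of the identity
total = mat_add(z, w)                       # matrix of (1 + 2i) + (3 - i)
product = mat_mul(z, w)                     # matrix of (1 + 2i)(3 - i)
expected = (1 + 2j) * (3 - 1j)              # Python's built-in complex type
identity_check = mat_mul(z, mat_inv(z))     # should be close to the identity
```

Comparing `product` with `to_matrix(expected.real, expected.imag)` shows that matrix multiplication and ordinary complex multiplication agree for this pair.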
Cartesian Form of a Complex Number
We need now a notation which is more convenient than the matrix notation,
and we get our cue from (iii) above. The matrices I and i correspond to the
points (1, 0), (0, 1) of the plane and thus form a basis for $C$. More explicitly,
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} = aI + bi \tag{1.44} \]
If we identify the real number 1 with the identity matrix $I$, and more
generally the real number $r$ with the complex number $rI + 0i$, then we can
say that every real number is also a complex number. In fact, the complex
system is just the real number system with a square root of — 1 tacked on.
(This takes us full circle back to the original conception of that Arabian
adventurer. The difference here is that we now know what we mean by this
procedure and that it produces no inconsistencies.)
Thus, we can suppress the identity matrix in the expression and write a
complex number in the form a + bi. We now recapitulate the relevant facts.
$C$ is the set of all 2 × 2 matrices $c = a + bi$ with $a$, $b$ real numbers; $a$ is
the real part of $c$, written $a = \mathrm{Re}\, c$, and $b$ is the imaginary part, written
$b = \mathrm{Im}\, c$. The following rules hold:
\[ i^2 = -1 \]
\[ (a + bi) + (c + di) = (a + c) + (b + d)i \]
\[ (a + bi)(c + di) = (ac - bd) + (ad + bc)i \]
\[ (a + bi)^{-1} = \frac{a - bi}{a^2 + b^2} \quad \text{when } a^2 + b^2 \neq 0 \]
Polar Form of a Complex Number
Since С is in one-to-one correspondence with R2, we can represent complex
numbers by points in the plane (see Figure 1.14). Addition of
complex numbers is the same as addition of vectors in the plane. We now
seek a geometric description of multiplication of complex numbers. For
this purpose it is convenient to move to polar coordinates.
Definition 14. Let $z = x + yi$. The modulus of $z$, written $|z|$, is its
distance from the origin:
\[ |z| = (x^2 + y^2)^{1/2} \]
Figure 1.14
The argument of $z$, written $\arg z$, is defined for $z \neq 0$; it is the angle defining
the ray on which $z$ lies:
\[ \arg z = \tan^{-1} \frac{y}{x} \]
We can write complex numbers in polar form: if $z = x + yi$ has the polar
coordinates $(r, \theta)$ then, since $x = r \cos\theta$, $y = r \sin\theta$, we have
\[ z = r(\cos\theta + i \sin\theta) \]
(We have moved the $i$ in front of $\sin\theta$ for the obvious notational
convenience which results.) The set of points of modulus 1 is the unit circle
centered at the origin. It is the set of all points of the form $\cos\theta + i \sin\theta$.
We shall sometimes abbreviate this to $\operatorname{cis}\theta$. Precisely, $\operatorname{cis}\theta$ is the point of
the unit circle lying on the ray of angle $\theta$. Now, let $z$, $w$ be two complex
numbers,
\[ z = r \operatorname{cis}\theta \qquad w = \rho \operatorname{cis}\phi \]
Then
\[ \begin{aligned} zw &= r \operatorname{cis}(\theta)\, \rho \operatorname{cis}(\phi) \\ &= (r \cos\theta + ir \sin\theta)(\rho \cos\phi + i\rho \sin\phi) \\ &= r\rho(\cos\theta \cos\phi - \sin\theta \sin\phi) + ir\rho(\cos\theta \sin\phi + \cos\phi \sin\theta) \\ &= r\rho \operatorname{cis}(\theta + \phi) \end{aligned} \]
Thus we form the product of two complex numbers by multiplying the
moduli and adding the arguments. (This does not make sense if one of the
numbers is zero, but that case is trivial anyway.)
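The rule "multiply the moduli, add the arguments" can be tried out with Python's standard `cmath` module. This is only a sketch: the function name `cis` mirrors the abbreviation in the text, and the sample values of the moduli and angles are arbitrary choices.

```python
import cmath
import math

def cis(theta):
    """Point of the unit circle lying on the ray of angle theta."""
    return complex(math.cos(theta), math.sin(theta))

# Arbitrary sample values, chosen so that theta + phi stays below pi.
r, theta = 2.0, math.pi / 6      # z = 2 cis(pi/6)
rho, phi = 3.0, math.pi / 4      # w = 3 cis(pi/4)
z = r * cis(theta)
w = rho * cis(phi)

# cmath.polar returns (modulus, argument) with the argument in (-pi, pi].
modulus, argument = cmath.polar(z * w)
```

Here `modulus` agrees with `r * rho` and `argument` with `theta + phi` up to rounding. For angles whose sum leaves the interval (-π, π], the argument returned by `cmath.polar` differs from the sum by a multiple of 2π.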
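Notice then, if $z = \rho \operatorname{cis}\theta$, then $z^2 = \rho^2 \operatorname{cis} 2\theta$ and more generally
\[ z^n = \rho^n \operatorname{cis} n\theta \tag{1.45} \]
This observation leads to the fact that it is easy to extract roots. For the
converse of (1.45) is
\[ z^{1/k} = \rho^{1/k} \operatorname{cis} \frac{\theta}{k} \]
Proposition 25. Let $c$ be a nonzero complex number, and $k$ a positive integer. There are
precisely $k$ distinct solutions to the equation $X^k = c$.
Proof. Write $c$ in polar form: $c = r \operatorname{cis}\theta$. If $z = \rho \operatorname{cis}\phi$ is a solution, then
\[ r \operatorname{cis}\theta = c = z^k = \rho^k \operatorname{cis} k\phi \]
Thus the modulus of $z$ is the $k$th root of the modulus of $c$, and the argument of $z$
is an angle such that $k$ times it is $\theta$, up to a multiple of $2\pi$. Well, $(1/k)\theta$ is such an angle, but so is
$(1/k)(\theta + 2\pi)$. In fact, each of the angles
\[ \frac{1}{k}\theta,\ \frac{1}{k}(\theta + 2\pi),\ \frac{1}{k}(\theta + 4\pi),\ \ldots,\ \frac{1}{k}(\theta + 2(k - 1)\pi) \]
has this property. These angles give distinct points of the unit circle, and any other
such angle differs from one of them by a multiple of $2\pi$, so $c = r \operatorname{cis}\theta$
has precisely these $k$ roots:
\[ r^{1/k} \operatorname{cis} \frac{\theta}{k},\ r^{1/k} \operatorname{cis} \frac{\theta + 2\pi}{k},\ \ldots,\ r^{1/k} \operatorname{cis} \frac{\theta + 2\pi(k - 1)}{k} \]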
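The list of roots in the proof translates directly into a short computation. The sketch below assumes a nonzero $c$ and a positive integer $k$; the function name `kth_roots` is an illustrative choice, and `cmath.polar` and `cmath.rect` are the standard-library conversions between Cartesian and polar form.

```python
import cmath
import math

def kth_roots(c, k):
    """All k solutions of X**k = c for nonzero c, following Proposition 25."""
    r, theta = cmath.polar(c)            # c = r cis(theta)
    root_modulus = r ** (1.0 / k)        # the k-th root of the modulus
    return [cmath.rect(root_modulus, (theta + 2 * math.pi * m) / k)
            for m in range(k)]

roots = kth_roots(-8, 3)                 # the three cube roots of -8
residuals = [abs(z ** 3 + 8) for z in roots]
```

Each entry of `residuals` is zero up to rounding, and one of the three roots is the real number -2.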
Complex Eigenvalues
We shall work extensively with the complex number system in this text.
In fact, we shall discover many situations besides the algebraic one above
where study within the system of complex numbers is beneficial. In
particular, let us return to the eigenvalue problems of the preceding section.
We consider $C^n$, the space of $n$-tuples of complex numbers. We can define
linear transformations on $C^n$ just as we did on $R^n$. In fact, the entire theory
of linear algebra through Section 1.7 holds over $C^n$ as well as $R^n$. Let
\[ E_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0) \quad (1 \text{ in the } i\text{th place}) \]
be the standard basis vectors for $C^n$. Again, any linear transformation on
$C^n$ is given by a matrix $A = (a_j^i)$ of complex numbers relative to the standard
basis:
\[ T(z^1, \ldots, z^n) = \Big( \sum_j a_j^1 z^j, \ldots, \sum_j a_j^n z^j \Big) \]
for all $(z^1, \ldots, z^n) \in C^n$.
Examples
37. Consider the matrix
\[ A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]
as representing a transformation $T$ on $C^2$ relative to the
standard basis. Its eigenvalues are the roots of $\det(A - tI) = 0$. But
$\det(A - tI) = t^2 + 1$, so the roots are $i$, $-i$.
Eigenvalue $i$:
\[ A - iI = \begin{pmatrix} -i & 1 \\ -1 & -i \end{pmatrix} \]
The second row is $-i$ times the first, so the kernel of $A - iI$ is given
by the single equation $-ix + y = 0$. An eigenvector is $(1, i)$.
The eigenvalue $-i$ has the eigenvector $(1, -i)$. Now, $F$: {(1, i),
(1, -i)} is a basis of eigenvectors for $C^2$, so $T$ becomes diagonalized
relative to this basis:
\[ T_F = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix} \]
That is, if
\[ z = z^1 (1, i) + z^2 (1, -i) \]
then
\[ Tz = i z^1 (1, i) - i z^2 (1, -i) \]
38. Consider the matrix
\[ A = \begin{pmatrix} 0 & 1 & -1 \\ 1/4 & 0 & 3/4 \\ 2 & 1 & 1 \end{pmatrix} \]
representing a transformation $T$ on $C^3$ relative to the standard basis.
$\det(A - tI) = -t^3 + t^2 - t + 1$. This polynomial has the roots
$1$, $i$, $-i$. Since the roots are distinct and each must have a
corresponding eigenvector, there is a basis of eigenvectors. We now find such
a basis.
Eigenvalue 1:
\[ A - I = \begin{pmatrix} -1 & 1 & -1 \\ 1/4 & -1 & 3/4 \\ 2 & 1 & 0 \end{pmatrix} \]
The kernel of $A - I$ is found as a linear relation among the columns
(recall Example 18). Such a relation is
\[ C_1 - 2C_2 - 3C_3 = 0 \]
Thus (1, -2, -3) is an eigenvector with eigenvalue 1.
Eigenvalue $i$:
\[ A - iI = \begin{pmatrix} -i & 1 & -1 \\ 1/4 & -i & 3/4 \\ 2 & 1 & 1 - i \end{pmatrix} \]
In order to find a relation among the columns we must row reduce.
The result of row reduction is
\[ \begin{pmatrix} 1 & 0 & (3 - 4i)/5 \\ 0 & 1 & -(1 - 3i)/5 \\ 0 & 0 & 0 \end{pmatrix} \]
A solution of the corresponding homogeneous system is found by
taking $z^3 = 5$; then we obtain $z^2 = 1 - 3i$, $z^1 = -3 + 4i$. Thus,
$(-3 + 4i, 1 - 3i, 5)$ is an eigenvector with eigenvalue $i$. Similarly,
we find the eigenvector $(-3 - 4i, 1 + 3i, 5)$ corresponding to the
eigenvalue $-i$. Thus, $T$ is represented by
\[ \begin{pmatrix} 1 & 0 & 0 \\ 0 & i & 0 \\ 0 & 0 & -i \end{pmatrix} \]
relative to the basis
$(1, -2, -3)$, $(-3 + 4i, 1 - 3i, 5)$, $(-3 - 4i, 1 + 3i, 5)$.
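The eigenvectors found in Example 38 can be checked by direct multiplication. In the sketch below the matrix entries are an assumption: they are the unique real 3 × 3 matrix having the three quoted eigenvalue and eigenvector pairs, and its characteristic polynomial is $-t^3 + t^2 - t + 1$. The helper names are illustrative.

```python
# Assumption: A is reconstructed from the eigenvalues and eigenvectors quoted
# in Example 38; treat the entries as illustrative rather than authoritative.
A = [[0, 1, -1],
     [0.25, 0, 0.75],
     [2, 1, 1]]

def apply(matrix, vector):
    """Matrix-vector product; entries may be real or complex."""
    return [sum(a * z for a, z in zip(row, vector)) for row in matrix]

def is_eigenvector(matrix, vector, eigenvalue, tol=1e-12):
    """True when matrix * vector equals eigenvalue * vector up to tol."""
    image = apply(matrix, vector)
    return all(abs(t - eigenvalue * z) <= tol for t, z in zip(image, vector))

pairs = [
    (1, [1, -2, -3]),
    (1j, [-3 + 4j, 1 - 3j, 5]),
    (-1j, [-3 - 4j, 1 + 3j, 5]),
]
checks = [is_eigenvector(A, v, lam) for lam, v in pairs]
```

All three entries of `checks` come out true, which is consistent with $T$ being diagonal relative to this basis.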
• EXERCISES
42. Find the inverse of these complex numbers:
(a) $5 - 3i$ (d) $4 \operatorname{cis}(2/3)$
(b) $(1 - i)/2$ (e) $\operatorname{cis} 7$
(c) $3 + i$
43. Show that $z^{-1} = \bar{z}$ if and only if $z$ is on the unit circle.
44. Show that the complex number $\operatorname{cis}\theta$ represents rotation in the plane
through the angle $\theta$, when considered as a 2 × 2 matrix.
45. Find all $k$th roots of $z$:
(a) $k = 2$, $z = -i$ (d) $k = 3$, $z = i$
(b) $k = 5$, $z = -1$ (e) $k = 2$, $z = 3i - 4$
(c) $k = 4$, $z = 1 + i$ (f) $k = 3$, $z = 15 + 5i$
46. Find, if possible, a basis of possibly complex eigenvectors for the
transformations represented by these matrices
(a)
(b)
(" "")
(c)
(d)
Ό
1
^
I
I
1
-2
-3
-2
/ °
2
0
\-2
3\
5
3/
1
0
0
0
1
-i
0
0
1
i
0
2
• PROBLEMS
45. Compute that the matrix
\[ \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \]
has square equal to $-I$. We have chosen $i$ to be the 2 × 2 matrix
\[ \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \]
so that the correspondence between complex numbers and operations on
$R^2$ will be correct. More precisely, we conceive a complex number in two
ways: as a certain transformation on the plane, and as a vector on the plane.
Given two complex numbers $z$, $w$ we may interpret their product in two
ways: composition of the transformations, or the application of the
transformation corresponding to $z$ to the vector $w$. We would like these two
interpretations to have the same result. If $z = a + ib$, $w = c + id$, show
that $zw$,
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c \\ d \end{pmatrix} \]
and
\[ \begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} \]
are all the same under that correspondence.
46. Show that the complex numbers $z$, $\bar{z}$, when considered as vectors in
$R^2$, are independent (unless they are real or pure imaginary).
47. Why do the complex eigenvalues of a real matrix come in conjugate
pairs?
1.9 Space Geometry
In this section we shall introduce the basic notions of three-dimensional
geometry, using vector notation. First of all, as in the plane, we select a
particular point in space, called the origin and denoted 0. That being done
we may refer to the points of space as vectors and think heuristically of the
directed line segment from the origin to the point as a vector. The operations
of scalar multiplication and addition can be defined as on the plane, and
expressed in terms of coordinates in much the same way:
(i) If $P$ is a vector and $r$ a real number, $rP$ is the vector lying on the line
through 0 and $P$ and of distance from 0 equal to $|r|$ times the length of the
segment $0P$. If $r > 0$, $rP$ lies on the same side of 0 as $P$; if $r < 0$, $rP$ lies on
the opposite side.
(ii) If $P$, $Q$ are two vectors in space, there is a unique parallelogram lying
in the plane determined by $P$ and $Q$, three of whose vertices are 0, $P$, $Q$.
We define $P + Q$ to be the fourth vertex.
Now, we turn to the coordinatization of space. Having chosen a point
as origin, let $E_1$, $E_2$, $E_3$ be three new points with the property that 0, $E_1$, $E_2$,
$E_3$ do not all lie on the same plane (we say the vectors $E_1$, $E_2$, $E_3$ are not
coplanar). The three lines determined by the vectors $E_1$, $E_2$, $E_3$ are called
the coordinate axes. Just as in two dimensions, the choice of the vectors
$E_1$, $E_2$, $E_3$ enables us to put each line in one-to-one correspondence with the
real numbers.
Figure 1.15
In three dimensions two lines determine a plane. We shall call the plane
through 0 determined by $E_1$ and $E_2$ the 1-2 plane, by $E_1$ and $E_3$ the 1-3 plane,
and the plane determined by $E_2$ and $E_3$ the 2-3 plane (Figure 1.15). These
three planes are called the coordinate planes. Each of these planes can be put
into one-to-one correspondence with R2 just as in the case of two dimensions.
Now, to each point in space we can associate a triple of numbers relative to
these choices in the following way. Let $P$ be any such point. There is a
unique plane through $P$ which is parallel to the 2-3 plane; and this plane
intersects the 1 axis in a unique point. This point has the coordinate $x^1$
relative to the scale determined by $E_1$. We shall call $x^1$ the first coordinate
of $P$. The second, $x^2$, is found in the same way: by intersecting the plane
through $P$ and parallel to the 1-3 plane with the 2 axis. Finally we find
the third coordinate $x^3$ similarly, and associate the triple $(x^1, x^2, x^3)$ to $P$.
In this way we put all of space into one-to-one correspondence with $R^3$,
dependent upon the choice of vectors $E_1$, $E_2$, $E_3$, called a basis for space.
The expressions in terms of coordinates of the operations of addition and
scalar multiplication are precisely the same as in $R^2$ (no matter what basis
is chosen):
\[ r(x^1, x^2, x^3) = (rx^1, rx^2, rx^3) \]
\[ (x^1, x^2, x^3) + (y^1, y^2, y^3) = (x^1 + y^1, x^2 + y^2, x^3 + y^3) \]
There is no need to check that these formulas correspond to the geometric
descriptions given above; we need only refer to the computation in the plane.
When we are interested in the pictorial representation of problems of
three-dimensional Euclidean geometry it is best if we consistently use a
particular coordinatization. For this purpose we select the "right-handed
rectangular coordinate system", where the coordinate axes are mutually
perpendicular and the order 1 → 2 → 3 is that of a right-handed screw (see
Figure 1.16). It is common in particular problems to refer to the coordinates
by the letters (x, y, z) rather than (x1, x2, x3). We shall use the numbered
coordinates when it is more convenient to do so.
Inner Product
Now, the basic notions of Euclidean geometry are length and angle. It
will be of importance to us to derive expressions for these in terms of
coordinates. Consider first the length of the line segment $0P$ between the origin
and the point $P$ with coordinates $(x, y, z)$. This can be easily computed
by use of the Pythagorean theorem (consult Figure 1.17). Let $P'$ be the
point of intersection with the $xz$ plane of the line through $P$ and parallel
to the $y$ axis. Then $0PP'$ is a right triangle, so
\[ |0P|^2 = |0P'|^2 + |P'P|^2 \]
Figure 1.16
Figure 1.17
Letting $P''$ be the point of intersection with the $x$ axis of the line through $P'$
and parallel to the $z$ axis, we obtain
\[ |0P|^2 = |0P''|^2 + |P''P'|^2 + |P'P|^2 \]
But now $|0P''|^2 = x^2$, $|P''P'|^2 = z^2$, $|P'P|^2 = y^2$, so
\[ |0P| = [x^2 + y^2 + z^2]^{1/2} \]
Now, suppose $P(x, y, z)$, $Q(a, b, c)$ are any two points in space. By
definition of addition, $P$ is the fourth vertex of the parallelogram three of
whose vertices are 0, $P - Q$ and $Q$. Thus, the side through $P$ and $Q$ has
the same length as the side through $P - Q$ and 0, so
\[ |PQ| = |(P - Q)0| = [(x - a)^2 + (y - b)^2 + (z - c)^2]^{1/2} \tag{1.46} \]
Finally, we can compute the angle between $P$ and $Q$ by the law of cosines
(consult Figure 1.18); if $\theta$ is that angle, then
\[ |PQ|^2 = |0P|^2 + |0Q|^2 - 2|0P|\,|0Q| \cos\theta \]
In coordinates,
\[ (x - a)^2 + (y - b)^2 + (z - c)^2 = x^2 + y^2 + z^2 + a^2 + b^2 + c^2 - 2(x^2 + y^2 + z^2)^{1/2}(a^2 + b^2 + c^2)^{1/2} \cos\theta \]
which reduces to
\[ \cos\theta = \frac{xa + yb + zc}{(x^2 + y^2 + z^2)^{1/2}(a^2 + b^2 + c^2)^{1/2}} \tag{1.47} \]
The form in the numerator thus has some special importance: it together
with the notion of length determines angles. It is called the inner product
of the two vectors P, Q.
Definition 15. Let $P$, $Q$ be two vectors in space. Their Euclidean inner
product, denoted $\langle P, Q\rangle$, is defined as $|P|\,|Q| \cos\theta$, where $\theta$ is the angle
between $P$ and $Q$. In coordinates, $P = (x_1, y_1, z_1)$, $Q = (x_2, y_2, z_2)$,
\[ \langle P, Q\rangle = x_1 x_2 + y_1 y_2 + z_1 z_2 \]
Proposition 26. The nonzero vectors $P$ and $Q$ are perpendicular if and only
if $\langle P, Q\rangle = 0$.
Proof. $P$ and $Q$ are perpendicular if and only if the angle $\theta$ between them is a
right angle; $\theta$ is a right angle if and only if $\cos\theta = 0$, and this holds precisely
when $\langle P, Q\rangle = 0$.
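Length, angle and perpendicularity are all computed from the inner product, as formulas (1.46) and (1.47) show. The following sketch works with coordinate triples; the function names and the sample vectors are illustrative choices, not taken from the text.

```python
import math

def inner(p, q):
    """Euclidean inner product of two coordinate tuples."""
    return sum(a * b for a, b in zip(p, q))

def length(p):
    return math.sqrt(inner(p, p))

def angle(p, q):
    """Angle between nonzero vectors p and q, from formula (1.47)."""
    c = inner(p, q) / (length(p) * length(q))
    return math.acos(max(-1.0, min(1.0, c)))   # clamp to guard against rounding

P, Q = (1, 0, 1), (0, 1, 1)
theta = angle(P, Q)                                  # cos(theta) = 1/2 here
perpendicular = inner((1, -1, -1), (1, 0, 1)) == 0   # test of Proposition 26
```

For these sample vectors the angle is π/3, and the second pair is perpendicular because its inner product vanishes.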
A plane through the origin is the linear span of two vectors. If $N$ is a
vector perpendicular to such a plane $\Pi$, then $\Pi$ is given by the equation
\[ \Pi: \langle x, N\rangle = 0 \]
Figure 1.18
More generally, if $p$ is a point on a plane $\Pi$ (not necessarily through the origin)
and $N$ is orthogonal to $\Pi$, then $\Pi$ is given by the equation
\[ \langle x - p, N\rangle = 0 \]
A line through the origin is the linear span of a single vector, and can be
expressed by two linear equations (since a line is the intersection of two
planes).
Examples
39. Find the equation of the plane through (1, 2, 0) spanned by the
two vectors (1, 0, 1), (3, 1, 2). If $N = (n^1, n^2, n^3)$ is perpendicular to
this plane we must have
\[ \langle N, (1, 0, 1)\rangle = n^1 + n^3 = 0 \]
\[ \langle N, (3, 1, 2)\rangle = 3n^1 + n^2 + 2n^3 = 0 \]
A solution of this system is (1, -1, -1), so we may take
$N = (1, -1, -1)$. Then the equation of the plane is
\[ \langle x - (1, 2, 0), (1, -1, -1)\rangle = 0 \quad \text{or} \quad x - y - z + 1 = 0 \]
40. Find the equation of the plane through $P = (1, 0, -1)$, $Q = (2, 2, 2)$,
$R = (3, 1, 1)$. If $N$ is perpendicular to the plane, we have
\[ \langle N, P - R\rangle = 0 \qquad \langle N, Q - R\rangle = 0 \]
Letting $N = (n^1, n^2, n^3)$, we obtain this system of equations:
\[ -2n^1 - n^2 - 2n^3 = 0 \]
\[ -n^1 + n^2 + n^3 = 0 \]
which has a solution $N = (1, 4, -3)$. Thus the equation we seek is
$\langle x - R, N\rangle = 0$, or $(x - 3) + 4(y - 1) - 3(z - 1) = 0$, or
\[ x + 4y - 3z = 4 \]
41. Find the equations of the line $L$ through (4, 0, 0) and
perpendicular to the plane in Example 40. If $x$ is on $L$ we must have
\[ \langle x - (4, 0, 0), P - R\rangle = 0 \qquad \langle x - (4, 0, 0), Q - R\rangle = 0 \]
so we may take these as the equations:
\[ 2x + y + 2z = 8 \]
\[ -x + y + z = -4 \]
Vector Product
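Given two noncollinear vectors $v_1$, $v_2$ in space, the set of vectors
perpendicular to $v_1$, $v_2$ is a line. We shall now develop a useful formula for
selecting a particular vector on that line, called the vector product $v_1 \times v_2$.
If $N$ is on that line, and $x$ is in the linear span of $v_1$ and $v_2$, we have
\[ \langle x, N\rangle = 0 \]
On the other hand, since $x$, $v_1$, $v_2$ are coplanar we have
\[ \det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix} = 0 \]
Now there is a uniquely determined vector $N$ such that
\[ \langle x, N\rangle = \det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix} \]
for all $x \in R^3$. This is easily seen using coordinates. Write
\[ v_1 = (v_1^1, v_1^2, v_1^3), \quad v_2 = (v_2^1, v_2^2, v_2^3), \quad x = (x^1, x^2, x^3) \]
Then
\[ \begin{aligned} \det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix} &= \det\begin{pmatrix} x^1 & x^2 & x^3 \\ v_1^1 & v_1^2 & v_1^3 \\ v_2^1 & v_2^2 & v_2^3 \end{pmatrix} \\ &= x^1(v_1^2 v_2^3 - v_1^3 v_2^2) - x^2(v_1^1 v_2^3 - v_1^3 v_2^1) + x^3(v_1^1 v_2^2 - v_1^2 v_2^1) \\ &= \langle (x^1, x^2, x^3),\ (v_1^2 v_2^3 - v_1^3 v_2^2,\ v_1^3 v_2^1 - v_1^1 v_2^3,\ v_1^1 v_2^2 - v_1^2 v_2^1) \rangle \end{aligned} \]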
Definition 16. Let $v = (v^1, v^2, v^3)$, $w = (w^1, w^2, w^3)$ be two vectors in $R^3$.
The vector product $v \times w$ is defined by
\[ v \times w = (v^2 w^3 - v^3 w^2,\ v^3 w^1 - v^1 w^3,\ v^1 w^2 - v^2 w^1) \]
Proposition 27.
(i) $\langle x, v \times w\rangle = \det\begin{pmatrix} x \\ v \\ w \end{pmatrix}$ for all $x \in R^3$.
(ii) $v \times w = -w \times v$.
(iii) $v \times w$ is orthogonal to $v$ and $w$.
(iv) The plane through the origin spanned by $v$ and $w$ has
the equation $\langle x, v \times w\rangle = 0$.
The proof of this proposition is completely contained in the preceding
discussion. The basic property of the vector product is the first; it follows,
for example, that for any three vectors $u$, $v$, $w$
\[ \langle u, v \times w\rangle = \langle u \times v, w\rangle = \langle v, w \times u\rangle = \langle v \times w, u\rangle \]
Notice that if $v$, $w$ are collinear, $v \times w = 0$. If they are not collinear, the
ordered basis $v \to w \to v \times w$ is right handed (see Figure 1.19). The following
proposition gives an important geometric interpretation of the magnitude
of $v \times w$.
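The coordinate formula of Definition 16 gives a quick way to produce a normal vector to a plane. The sketch below applies it to the spanning vectors (1, 0, 1), (3, 1, 2) and the point (1, 2, 0) used in Example 39; the function names are illustrative.

```python
def cross(v, w):
    """Vector product of Definition 16, for coordinate triples."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def inner(p, q):
    return sum(a * b for a, b in zip(p, q))

v, w = (1, 0, 1), (3, 1, 2)     # spanning vectors from Example 39
p = (1, 2, 0)                   # a point on the plane
N = cross(v, w)                 # normal vector; orthogonal to both v and w
offset = inner(N, p)            # the plane is the set of x with <x, N> = offset

anticommutes = cross(w, v) == tuple(-c for c in N)   # Proposition 27(ii)
```

Here `N` comes out as (-1, 1, 1), the negative of the normal chosen in Example 39, so the plane equation -x + y + z = 1 agrees with x - y - z + 1 = 0.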
Proposition 28. Let $u$, $v$, $w$ be three noncollinear vectors.
(i) The area of the parallelogram spanned by $u$ and $v$ is $\|u \times v\|$.
(ii) The volume of the parallelepiped spanned by $u$, $v$, and $w$ is
\[ \left| \det\begin{pmatrix} u \\ v \\ w \end{pmatrix} \right| \]
Figure 1.19
Proof.
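Step (a). The first step is to verify (ii) in case the vectors $u$, $v$, $w$ are mutually
perpendicular. In that case we must show that
\[ \left| \det\begin{pmatrix} u \\ v \\ w \end{pmatrix} \right| = \|u\| \cdot \|v\| \cdot \|w\| \]
This follows easily from the multiplicative property of the determinant. First we
note that
\[ \begin{pmatrix} u \\ v \\ w \end{pmatrix} (u, v, w) = \begin{pmatrix} \langle u, u\rangle & 0 & 0 \\ 0 & \langle v, v\rangle & 0 \\ 0 & 0 & \langle w, w\rangle \end{pmatrix} \]
since the $(i, j)$th entry is the inner product of the $i$th row of the first matrix with the
$j$th column of the second (see Problem 49). Thus,
\[ \det\begin{pmatrix} u \\ v \\ w \end{pmatrix}^2 = \det\left[ \begin{pmatrix} u \\ v \\ w \end{pmatrix} (u, v, w) \right] = \|u\|^2 \|v\|^2 \|w\|^2 \]
Step (b). In particular, if $u$, $v$ are perpendicular, then $u$, $v$, $u \times v$ are mutually
perpendicular, so
\[ \left| \det\begin{pmatrix} u \times v \\ u \\ v \end{pmatrix} \right| = \|u \times v\| \cdot \|u\| \cdot \|v\| \]
By Proposition 27(i) this determinant is $\langle u \times v, u \times v\rangle = \|u \times v\|^2$,
so $\|u \times v\| = \|u\| \cdot \|v\|$ when $u$ is perpendicular to $v$.
Step (c). Now we prove part (i) in general. Let $\theta$ be the angle between $u$
and $v$ (see Figure 1.20). Then the area of the parallelogram spanned by $u$ and $v$
is the product of the base and the height $a$ over that base:
\[ \text{area} = \|u\| a = \|u\| \cdot \|v\| \sin\theta \]
Now the vector $u \times (u \times v)$ is orthogonal to $u$ and $u \times v$, so it lies in the plane spanned
by $u$ and $v$ and is orthogonal to $u$. We have
\[ \sin\theta = \left| \cos\left( \frac{\pi}{2} - \theta \right) \right| = \frac{|\langle v, u \times (u \times v)\rangle|}{\|v\| \cdot \|u \times (u \times v)\|} \]
Since $u$ and $u \times v$ are orthogonal, by Step (b) we have $\|u \times (u \times v)\| = \|u\| \cdot \|u \times v\|$.
Moreover $\langle v, u \times (u \times v)\rangle = \langle v \times u, u \times v\rangle = -\langle u \times v, u \times v\rangle$.
Thus
\[ \text{area} = \|u\| \cdot \|v\| \frac{|\langle v, u \times (u \times v)\rangle|}{\|v\| \cdot \|u \times (u \times v)\|} = \|u\| \frac{\langle u \times v, u \times v\rangle}{\|u\| \cdot \|u \times v\|} = \|u \times v\| \]
Figure 1.20
Step (d). To prove part (ii) we refer to Figure 1.21. The volume of the
parallelepiped spanned by $u$, $v$, $w$ is the product of the area of the base and the altitude $h$:
\[ \text{volume} = \|u \times v\| h = \|u \times v\| \cdot \|w\| \sin\phi \]
where $\phi$ is the angle between $w$ and the $u$, $v$ plane. Since $u \times v$ is orthogonal to the $u$, $v$ plane,
\[ \sin\phi = \cos\left( \frac{\pi}{2} - \phi \right) = \frac{|\langle w, u \times v\rangle|}{\|w\| \cdot \|u \times v\|} \]
Figure 1.21
Thus
\[ \text{volume} = \|u \times v\| \cdot \|w\| \frac{|\langle w, u \times v\rangle|}{\|w\| \cdot \|u \times v\|} = \left| \det\begin{pmatrix} u \\ v \\ w \end{pmatrix} \right| \]
A final equality which will prove useful is this:
\[ \|u \times v\|^2 = \|u\|^2 \|v\|^2 - \langle u, v\rangle^2 \]
This follows easily from the above arguments:
\[ \begin{aligned} \|u \times v\|^2 &= \|u\|^2 \|v\|^2 \sin^2\theta \\ &= \|u\|^2 \|v\|^2 (1 - \cos^2\theta) \\ &= \|u\|^2 \|v\|^2 - \langle u, v\rangle^2 \end{aligned} \]
since the angle between $u$ and $v$ is $\theta$.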
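The area and volume statements, together with the final identity, can be spot-checked on integer vectors, where every quantity is exact. This is a minimal sketch with arbitrarily chosen sample vectors; `cross`, `inner` and `det3` are illustrative helper names.

```python
def cross(v, w):
    """Vector product in coordinates."""
    return (v[1] * w[2] - v[2] * w[1],
            v[2] * w[0] - v[0] * w[2],
            v[0] * w[1] - v[1] * w[0])

def inner(p, q):
    return sum(a * b for a, b in zip(p, q))

def det3(u, v, w):
    """Determinant of the matrix with rows u, v, w, computed as <u, v x w>."""
    return inner(u, cross(v, w))

u, v, w = (1, 2, 0), (0, 1, 3), (2, 0, 1)    # arbitrary sample vectors

area_squared = inner(cross(u, v), cross(u, v))            # ||u x v||^2
identity_rhs = inner(u, u) * inner(v, v) - inner(u, v) ** 2
volume = abs(det3(u, v, w))                               # part (ii)
volume_via_base = abs(inner(w, cross(u, v)))              # |<w, u x v>|
```

For these vectors `area_squared` and `identity_rhs` are both 46, and both volume expressions equal 13.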
• EXERCISES
47. Which pairs of the following vectors are orthogonal?
$v_1 = (2, 1, 2)$, $v_2 = (3, -1, 4)$, $v_3 = (7, 0, 5)$, $v_4 = (6, -2, 5)$
$v_5 = (1, 3, 0)$, $v_6 = (0, 0, 1)$, $v_7 = (-15, 5, 21)$
48. Find the vector products $v_i \times v_j$ for all pairs of vectors given in
Exercise 47.
49. Find a vector $v$ such that
\[ \langle v, v_1\rangle = 2 \qquad \langle v, v_2\rangle = -1 \qquad \langle v, v_3\rangle = 7 \]
where $v_1$, $v_2$, $v_3$ are given in Exercise 47.
50. Find the equation of the plane spanned by the vectors (a) Vi, ve,
(b) v2, v3, (c) v5, v., (d) v2, n, where the v, are given in
Exercise 47.
51. Find the equation of the line spanned by the vectors given in
Exercise 47.
52. Find the equation of the plane through (3, 2, 1) and orthogonal
to the vector (-7, 1,2).
53. Find the equation of the line through (0, 2, 0) and orthogonal to the
plane spanned by (1, — 1, 1) and (0, 3, 1).
54. Find the equation of the line through the origin and perpendicular
to the plane through the points
(a) $E_1$, $E_2$, $E_3$
(b) (1,1, 5), (0,0, 2), (-1,-1,0)
(c) (0, 0, 0), (0, 0, 1), (0, 1, 0)
55. Find the equation of the line of intersection of the two planes,
(a) determined by (a), (b) of Exercise 54.
(b) determined by (a), (c) of Exercise 54.
(c) determined by (b), (c) of Exercise 54.
56. Find the plane of vectors perpendicular to each of the lines
determined in Exercise 55.
57. Let A be a 3 χ 3 matrix. Show that
(a) if the rows of A lie on a plane (but not on a line), the set of
solutions of Ax = b forms a line, or is empty.
(b) if the rows of A lie on a line, the set of solutions of Ax = b forms
a plane, or is empty.
58. Show that $\|v \times w\| = \|v\| \cdot \|w\| \sin\theta$, where $\theta$ is the angle between
the two vectors $v$ and $w$.
59. Is the vector product associative; that is, is
\[ (u \times v) \times w = u \times (v \times w) \]
always true?
60. If $v$, $w$ are two noncollinear vectors show that the three vectors
$v$, $v \times w$, $v \times (v \times w)$ are pairwise orthogonal.
• PROBLEMS
48. Prove the identities of Proposition 27.
49. Let $v_1$, $v_2$, $v_3$ be three vectors in $R^3$. Let $A$ be the matrix whose
rows are $v_1$, $v_2$, $v_3$ and $B$ the matrix with columns $v_1$, $v_2$, $v_3$. Show that
(a) the $(i, j)$th entry of $AB$ is $\langle v_i, v_j\rangle$.
(b) $\det A = \det B$.
50. Let $P$ be given by the coordinates $(x, y, z)$ relative to a choice $E_1$, $E_2$,
$E_3$ of basis for space. Show that the point of intersection of the line
through $P$ parallel to the $E_1$ axis with the 2-3 plane has coordinates $(0, y, z)$.
1.10 Abstract Notions of Linearity
There are many collections of mathematical objects which are endowed
with a natural algebraic structure which is very reminiscent of $R^n$. To be less
vague, there are defined, within these collections, the operations of addition
and multiplication by real numbers. Furthermore, the problems that
naturally arise in these other contexts are reminiscent of the problems on $R^n$
which we have been studying. The question to ask, then, is this: does the
same theory hold, and will the same techniques work in this more general
context? We shall see in this section that for a large class of such objects
(the finite-dimensional vector spaces) the theory is the same. We shall see
later on that in many other cases, the techniques we have developed can be
modified to provide solutions to problems in the more general context.
First, let us consider some examples.
Examples
42. If $f$ and $g$ are continuous real-valued functions on the interval
[0, 1], then we can define the functions $f + g$, $cf$ as follows:
\[ (f + g)(x) = f(x) + g(x) \]
\[ (cf)(x) = c f(x) \]
Clearly, $f + g$ and $cf$ are also continuous. Thus we see that operations
of addition and scalar multiplication are defined on the collection
$C([0, 1])$ of all continuous functions on the interval [0, 1].
43. In the above example, if $f$ and $g$ are differentiable, so are $f + g$
and $cf$. Thus the space $C^1([0, 1])$ of functions on the interval [0, 1]
with continuous derivatives also has the operations of addition and
scalar multiplication. Notice that the operation of differentiation
takes functions in $C^1([0, 1])$ into $C([0, 1])$: if $f$ is in $C^1([0, 1])$ it
has a continuous derivative, so $f'$ is in $C([0, 1])$. Furthermore,
differentiation could be described as a linear transformation:
\[ (f + g)' = f' + g' \]
\[ (cf)' = cf' \]
So is, by the way, integration a linear transformation:
\[ \int (f + g) = \int f + \int g \]
\[ \int (cf) = c \int f \]
The fundamental theorem of calculus says that differentiation is the
inverse operation for integration:
\[ \left( \int f \right)' = f \]
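The linearity of differentiation can be seen concretely on polynomials, which form a subspace of $C^1([0, 1])$. The sketch below is an illustration under an added assumption: a polynomial of degree below 4 is stored as its list of coefficients, so that addition and scalar multiplication act entry by entry. The function names are hypothetical.

```python
# Assumption for this sketch: the list [a0, a1, a2, a3] stands for the
# polynomial a0 + a1*x + a2*x^2 + a3*x^3.

def add(f, g):
    return [a + b for a, b in zip(f, g)]

def scale(c, f):
    return [c * a for a in f]

def derivative(f):
    """Coefficients of f', padded with a zero to keep the same length."""
    return [k * f[k] for k in range(1, len(f))] + [0]

f = [1, 2, 0, 4]     # 1 + 2x + 4x^3
g = [0, -1, 3, 0]    # -x + 3x^2

lhs = derivative(add(f, scale(5, g)))                  # (f + 5g)'
rhs = add(derivative(f), scale(5, derivative(g)))      # f' + 5g'
```

Both sides give the same coefficient list, which is the statement $(f + cg)' = f' + cg'$ for this pair with $c = 5$.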
These remarks may strike you as merely a curious way of describing the
well-known phenomena, but the implied point of view has led to a wide
range of mathematical discoveries. The subject of functional analysis which
was developed early in the 20th century came out of this geometric-algebraic
approach to long standing problems of analysis.
Examples
44. If $S$ and $T$ are linear transformations of $R^n$ to $R^m$, then so is
the function $S + T$ defined by:
\[ (S + T)(x) = S(x) + T(x) \]
We can also multiply a linear transformation by a scalar:
\[ (cS)(x) = cS(x) \]
Thus the space $L(R^n, R^m)$ of linear transformations of $R^n$ to $R^m$ has
defined on it operations of addition and scalar multiplication.
45. We have already observed (Section 1.6) that the collection $M^n$
of $n \times n$ matrices has defined on it these two important operations.
In fact, we used, in an essential way, the fact that when we viewed
$M^n$ this way it was just the same as $R^{n^2}$.
These examples, together with $R^n$, lead to the notion of an abstract vector
space: a set together with the operations of addition and scalar multiplication.
We include in the definition the algebraic laws governing these operations.
Definition 17. An abstract vector space is a set V with a distinguished
element, 0, called the origin, on which are defined two operations:
Addition. If $v$ and $w$ are elements of $V$, then $v + w$ is a well-defined element
of $V$.
Scalar multiplication. If $v$ is in $V$ and $c$ is a real number, $cv$ is a well-defined
element of $V$. These operations must behave in accordance with these laws:
(i) $v + (w + x) = (v + w) + x$,
(ii) $v + w = w + v$,
(iii) $v + 0 = v$,
(iv) $c(v + w) = cv + cw$,
(v) $c_1(c_2 w) = (c_1 c_2)w$,
(vi) $1w = w$.
The preceding examples are all abstract vector spaces; the verifications
of the required laws are easily performed. We now want to investigate the
extent to which the ideas and facts discussed in the case of $R^n$ carry over to
abstract vector spaces. First of all, all the definitions carry over sensibly
to the abstract case if we just replace the word $R^n$ by the words an abstract
vector space V. Thus we take these notions as defined also in the abstract
case: linear transformation, linear subspace, span, independent, basis, dimension.
Now there is one bit of amplification necessary in the case of dimension.
We have until now encountered spaces of only finite dimension.
Example
46. Let $R^\infty$ be the collection of all sequences of real numbers. Thus
an element of $R^\infty$ is an ordered $\infty$-tuple,
\[ (x^1, x^2, \ldots, x^n, \ldots) \]
$R^\infty$ is an abstract vector space with these operations:
\[ (x^1, x^2, \ldots, x^n, \ldots) + (y^1, y^2, \ldots, y^n, \ldots) = (x^1 + y^1, x^2 + y^2, \ldots, x^n + y^n, \ldots) \]
\[ c(x^1, x^2, \ldots, x^n, \ldots) = (cx^1, cx^2, \ldots, cx^n, \ldots) \]
Now $R^\infty$ has an infinite set of independent vectors. Let $E_n$ be the
sequence all of whose entries are zero but for the $n$th, which is 1. This
entire collection $\{E_1, \ldots, E_n, \ldots\}$ is an independent set. For if
there is a relation among some finite subset of these, it must be of the
form
\[ c^1 E_1 + \cdots + c^k E_k = 0 \]
(of course, many of the $c$'s may be zero). But
\[ c^1 E_1 + \cdots + c^k E_k = (c^1, c^2, \ldots, c^k, 0, 0, \ldots) \]
so if this vector is zero we must have $c^1 = c^2 = \cdots = c^k = 0$. Thus
indeed the set $\{E_1, \ldots, E_n, \ldots\}$ is an infinite independent set in $R^\infty$.
We now make the following restriction to the so-called finite-dimensional
vector spaces, and we shall see that all of the preceding information about $R^n$
holds also in this more general case.
Definition 18. A vector space $V$ is finite dimensional if there is a finite set
of vectors $v_1, \ldots, v_k$ which span $V$. That $R^\infty$ is not finite dimensional
follows from some of the observations to be made below. It can also be
verified in the terms of the above definition (see Problem 53). The important
result about finite-dimensional vector spaces is that they are no different from
the spaces $R^n$.
Proposition 29. Let $V$ be a finite-dimensional vector space of dimension $d$.
If $v_1, \ldots, v_d$ is a basis for $V$, every vector in $V$ can be expressed uniquely as a
combination of $v_1, \ldots, v_d$:
\[ v = x^1 v_1 + \cdots + x^d v_d \]
$(x^1, \ldots, x^d)$ is called the coordinate of $v$ relative to the basis $v_1, \ldots, v_d$.
The correspondence $v \to (x^1, \ldots, x^d)$ is a one-to-one linear transformation
of $V$ onto $R^d$.
Proof. The definition of basis (Definition 6) makes this proposition quite clear.
We leave the verifications to the reader (Problem 54).
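For $V = R^d$ the coordinate of a vector relative to a basis is found by solving a linear system. The sketch below does this by Gauss-Jordan elimination over exact fractions; it assumes the given vectors really form a basis, and the function name `coordinates` and the sample basis are illustrative choices.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve x^1 v_1 + ... + x^d v_d = v for (x^1, ..., x^d).

    Assumes `basis` is a basis of R^d; raises ValueError if the vectors
    turn out to be dependent.
    """
    d = len(basis)
    # Augmented matrix: the columns are the basis vectors, followed by v.
    rows = [[Fraction(basis[j][i]) for j in range(d)] + [Fraction(v[i])]
            for i in range(d)]
    for col in range(d):
        pivot = next((r for r in range(col, d) if rows[r][col] != 0), None)
        if pivot is None:
            raise ValueError("vectors are not independent")
        rows[col], rows[pivot] = rows[pivot], rows[col]
        p = rows[col][col]
        rows[col] = [entry / p for entry in rows[col]]
        for r in range(d):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][d] for i in range(d)]

basis = [(1, 0, 1), (0, 1, 1), (1, 0, 0)]
x = coordinates(basis, (2, 3, 4))
# Rebuild the vector from its coordinates as a consistency check.
rebuilt = [sum(x[j] * basis[j][i] for j in range(3)) for i in range(3)]
```

For this basis the vector (2, 3, 4) has coordinate (1, 3, 1), and `rebuilt` returns the original vector, which illustrates the one-to-one correspondence in the proposition.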
What is not so clear is that every finite-dimensional vector space has a
basis, and that every basis has the same number of elements. However,
once these facts are established the above proposition serves to reduce the
general finite-dimensional space to one of the $R^n$, and the results of Sections
1.3 through 1.6 carry over.
Proposition 30. Every finite-dimensional vector space V has a finite basis,
and every basis has the same number of elements, the dimension of V.
Proof. Suppose V is finite dimensional. Then V has a finite spanning set. Let
{vx,..., Vi) be a spanning set with the minimal number of vectors; by definition V
has dimension d. We shall show that {οι,..., vd) is a basis.
1.10 Abstract Notions of Linearity 109
Since $\{v_1, \dots, v_d\}$ span, every vector in $V$ can be written as a linear combination
of these vectors. We have to show that there is only one way in which this can be
done. Suppose for some vector $v$ we have two different such ways:
$$v = x^1 v_1 + \cdots + x^d v_d = y^1 v_1 + \cdots + y^d v_d \tag{1.48}$$
Then
$$(x^1 - y^1) v_1 + \cdots + (x^d - y^d) v_d = 0$$
Since these two expressions differ we must have $x^j \ne y^j$ for some $j$. Thus
$$v_j = -\sum_{i \ne j} \frac{x^i - y^i}{x^j - y^j}\, v_i$$
Now this equation says that $v_j$ is in the linear span of the $d - 1$ elements $v_1, \dots,
v_{j-1}, v_{j+1}, \dots, v_d$, so these elements serve to span all of $V$ also. But this
contradicts the minimal assumption about $d$. Thus it must be impossible to express $v$
in terms of $v_1, \dots, v_d$ in two different ways. Hence $\{v_1, \dots, v_d\}$ is a basis.
That any two bases have the same number of elements follows easily from
Proposition 28 (see also Problem 55). Let $T: V \to R^d$ be the linear
transformation associating to each vector its coordinate relative to the above basis
$\{v_1, \dots, v_d\}$. If $\{w_1, \dots, w_\delta\}$ is another basis, let $S: V \to R^\delta$ be the same
coordinate mapping relative to this basis. Then $L = S \circ T^{-1}$ is a one-to-one
linear mapping of $R^d$ onto $R^\delta$, so $\rho(L) = \delta$, $\nu(L) = 0$. Thus (rank + nullity =
dimension): $\delta = d$.
• PROBLEMS
51. Show that for any finite set of vectors $S = \{v_1, \dots, v_k\}$ in $R^\infty$, there is
a vector $w \in R^\infty$ which does not lie in their linear span $[S]$. (Hint: Let
$v'$ represent the first $(k + 1)$-tuple of entries in $v$. Since $v_1', \dots, v_k'$ cannot
span $R^{k+1}$, there is a vector $w'$ in $R^{k+1}$ which cannot be written as a
combination of $v_1', \dots, v_k'$. Let $w = (w', 0, \dots)$.)
52. Are the vectors $E_1, \dots, E_n, \dots$ in $R^\infty$ described in Example 43 a basis
for $R^\infty$?
53. Let $R_0^\infty$ be the collection of those sequences of real numbers
$(x^1, x^2, \dots, x^n, \dots)$ such that $x^n = 0$ for all but finitely many $n$. Then $R_0^\infty$
is a linear subspace of $R^\infty$. Show that the vectors $E_1, \dots, E_n, \dots$ are a
basis for $R_0^\infty$.
54. Prove Proposition 29.
55. Prove, by following the arguments in Section 1.4, that any two bases
of a finite-dimensional vector space have the same number of elements.
56. Let $V$, $W$ be two vector spaces. Show that the collection $L(V, W)$ of
linear transformations from $V$ to $W$ is a vector space under the two
operations:
(a) if $c \in R$, $L \in L(V, W)$, $(cL)(x) = cL(x)$,
(b) if $L, L' \in L(V, W)$, $(L + L')(x) = L(x) + L'(x)$.
57. What is the dimension of $L(R^n, R^m)$?
58. Show that a vector space $V$ is finite dimensional if there is a one-to-one
linear transformation of $V$ into $R^n$ for some $n$.
59. Show that a vector space $V$ is finite dimensional if there is a linear
transformation $T$ of $R^n$ onto $V$ for some $n$.
60. Verify that the collection $P$ of polynomials is an abstract vector space.
For a positive integer $n$, let $P_n$ be the collection of polynomials of degree not
more than $n$. Show that $P_n$ is a linear subspace of $P$. Show that $P$ is not
finite dimensional, whereas $P_n$ is. What is the dimension of $P_n$?
61. Let $x_0, \dots, x_n$ be distinct real numbers and $c_0, \dots, c_n$ another
collection of real numbers. Show that there is one and only one polynomial $p$ in
$P_n$ such that
$$p(x_i) = c_i, \qquad 0 \le i \le n$$
(Hint: Let $L: P_n \to R^{n+1}$ be defined by $L(p) = (p(x_0), \dots, p(x_n))$. Show
that $L$ has rank $n + 1$.)
62. Let $g$ be a polynomial, and define the function $G: P \to P$:
$$G(p) = pg$$
Show that $G$ is a linear function. Describe the range and kernel of $G$.
63. Define $D^k: P \to P$: $D^k(p) = d^k p/dx^k$. What are the range and kernel
of $D^k$?
64. Let $x_0 \in R$, and let $c_0, \dots, c_k$ be given numbers. Show that there is
one and only one polynomial $p$ in $P_k$ such that
$$p(x_0) = c_0, \quad \frac{dp}{dx}(x_0) = c_1, \quad \dots, \quad \frac{d^k p}{dx^k}(x_0) = c_k$$
(Hint: Use the same idea as in Problem 61.)
65. Does $D^k: P \to P$ have any eigenvalues?
66. Show that $C([0, 1])$ is not a finite-dimensional vector space.
1.11 Inner Products
The notion of length, or distance, is important in the geometric study of
planar and spatial configurations. In Section 1.3 we studied these concepts
and related them to an algebraic concept, the inner product. From the
point of view of analysis also it is true that these concepts are significant:
it is in terms of distance that we can express "closeness" and in particular
"convergence." By analogy with $R^3$ we define the inner product in $R^n$,
and in terms of it, distance. While we are here we shall, in this section,
introduce some topological terms.
Definition 19. The inner product of two vectors $v = (v^1, \dots, v^n)$, $w =
(w^1, \dots, w^n)$, denoted by $\langle v, w \rangle$, is defined as
$$\langle v, w \rangle = \sum_{i=1}^{n} v^i w^i$$
We shall say that $v$ is orthogonal to $w$ if $\langle v, w \rangle = 0$. The distance $d(v, w)$
between $v$ and $w$ is defined by
$$d(v, w) = \Big[\sum_{i=1}^{n} (v^i - w^i)^2\Big]^{1/2}$$
The modulus $|v|$ of a vector $v$ is the distance between $v$ and $0$,
$$|v| = d(v, 0) = \Big[\sum_{i=1}^{n} (v^i)^2\Big]^{1/2}$$
Distance in $R^n$ behaves much as it does in $R^2$ and $R^3$; in particular, the
Pythagorean theorem holds:
$$d(v, w)^2 = d(v, x)^2 + d(x, w)^2 \tag{1.49}$$
when $\langle v - x, w - x \rangle = 0$. In any event, two points are no further apart
than the sum of their distances from a third,
$$d(v, w) \le d(v, x) + d(x, w) \tag{1.50}$$
These facts will be verified in the problems.
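The definitions of inner product, distance, and modulus translate directly into a few lines of code. The following is only a minimal sketch in Python: the function names `inner`, `distance`, `modulus` and the sample points in $R^4$ are illustrative choices, not part of the text. It checks the Pythagorean relation (1.49) and the triangle inequality (1.50) on one example; this is a numerical illustration, not a proof.

```python
import math

def inner(v, w):
    """Inner product <v, w> = sum of v^i * w^i for vectors of equal length."""
    if len(v) != len(w):
        raise ValueError("vectors must have the same number of entries")
    return sum(a * b for a, b in zip(v, w))

def distance(v, w):
    """Distance d(v, w) = [sum (v^i - w^i)^2]^(1/2)."""
    diff = [a - b for a, b in zip(v, w)]
    return math.sqrt(inner(diff, diff))

def modulus(v):
    """Modulus |v| = d(v, 0)."""
    return distance(v, [0.0] * len(v))

# Sample points in R^4 (arbitrary illustrative values).
x = [1.0, 0.0, 2.0, -1.0]
v = [4.0, 0.0, 2.0, -1.0]   # v - x = (3, 0, 0, 0)
w = [1.0, 4.0, 2.0, -1.0]   # w - x = (0, 4, 0, 0), orthogonal to v - x

v_minus_x = [a - b for a, b in zip(v, x)]
w_minus_x = [a - b for a, b in zip(w, x)]
print(inner(v_minus_x, w_minus_x))                                 # orthogonality condition
print(distance(v, w) ** 2, distance(v, x) ** 2 + distance(x, w) ** 2)  # both sides of (1.49)
print(distance(v, w) <= distance(v, x) + distance(x, w))           # inequality (1.50)
```

Here the legs have lengths 3 and 4, so both sides of (1.49) equal 25.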
Topological Notions
Definition 20. The ball in $R^n$ of radius $R > 0$ and center $c$, denoted $B(c, R)$,
is the set of all points whose distance from $c$ is less than $R$:
$$B(c, R) = \{x \in R^n : d(x, c) < R\}$$
A set $S$ is said to be a neighborhood of a point $c$ if it contains some ball
centered at $c$. A set $U$ is said to be open if it contains a neighborhood of
each of its points.
Thus, a set $S$ is a neighborhood of $c$ if there is some $R$ (presumably very
small) such that
$$d(x, c) < R \quad \text{implies} \quad x \in S$$
A set $U$ is open if for every $c \in U$, there is an $R$ such that $U \supset B(c, R)$. Notice
that any ball is open. For suppose $x \in B(c, R)$. Then $d(x, c) < R$, so
$R - d(x, c) > 0$. Now $B(c, R)$ contains the ball of radius $R - d(x, c)$
centered at $x$. For if $y$ is a point in that ball, then by (1.50),
$$d(y, c) \le d(y, x) + d(x, c) < R - d(x, c) + d(x, c) = R$$
Here is a collection of formal properties of the collection of open sets.
Proposition 31.
(i) $R^n$ is open.
(ii) If $U_1, \dots, U_n$ are open, so is $U_1 \cap \cdots \cap U_n$.
(iii) If $C$ is any collection of open sets, then the set of all points belonging
to any of the sets in $C$ is open. (This set is denoted $\bigcup_{U \in C} U$.)
Proof.
(i) Clearly, $R^n$ contains a ball centered at every one of its points.
(ii) Suppose $U_1, \dots, U_n$ are open, and $x$ is in every $U_i$. Then there are $R_1, \dots,
R_n$ such that $U_1 \supset B(x, R_1), \dots, U_n \supset B(x, R_n)$. Let $R = \min[R_1, \dots, R_n]$. Then if
$d(y, x) < R$, $y$ is in each $B(x, R_i)$ so is in each $U_i$. Thus $y$ is in $U_1 \cap \cdots \cap U_n$. In
particular, $U_1 \cap \cdots \cap U_n \supset B(x, R)$. Thus $U_1 \cap \cdots \cap U_n$ is a neighborhood of
any one of its points $x$, and is thus open.
(iii) Suppose $C$ is a collection of open sets. If $x$ is in any one of them, say $U$,
then since $U$ is open there is an $R$ such that $U \supset B(x, R)$. Thus, $\bigcup_{U \in C} U \supset B(x, R)$.
Thus $\bigcup_{U \in C} U$ is a neighborhood of any one of its points, so is open.
Many of the concepts a mathematician studies are so-called local concepts:
They happen in a neighborhood of a point, or are determined by what goes
on near a point; far behavior being irrelevant. Differentiation is thus local,
whereas integration is not. The importance of open sets is that it is precisely
on such sets that we should study these local concepts, since their definition
at a point depends on behavior in some neighborhood of the point.
If a set is open its complement, the set of all points not in the given set, is
said to be closed. Thus, $S$ is a closed subset of $R^n$ if $R^n - S = \{x \in R^n : x \notin S\}$
is open. Corresponding to Proposition 31 we have this proposition about
closed sets.
Proposition 32.
(i) $R^n$ is closed.
(ii) If $S_1, \dots, S_n$ are closed, so is $S_1 \cup \cdots \cup S_n$.
(iii) If $C$ is a collection of closed sets, then the set of all points common to all
the sets of $C$ is closed. (This set is denoted $\bigcap_{S \in C} S$.)
Proof. Problem 67.
Notice that there are sets which are both open and closed. There are not
many of them: $R^n$ and $\emptyset$ are the only ones. There are also sets which are
neither open nor closed, and there are many of them. For example, an
interval is open in $R^1$ if it contains neither end point, closed if it contains
both, and neither open nor closed if it contains only one end point.
We are acquainted with the notion of "dropping a perpendicular" in the
plane. That is, if $l$ is a line and $p$ is a point not on the line, then we can drop
a perpendicular from $p$ to $l$ as in Figure 1.22. The point $p_0$ of intersection
of the perpendicular with $l$ is the point on $l$ which is closest to $p$. A more
sophisticated way of describing this situation is to say that $p_0$ is the orthogonal
projection of $p$ on $l$. The concept of orthogonal projection generalizes to $R^n$
and will prove quite useful there. In order to discuss this problem, we shall
generalize even further.
Definition 21. A Euclidean vector space is an abstract vector space $V$ on
which is defined a real-valued function of pairs of vectors, called the inner
product, and denoted $\langle\ ,\ \rangle$. The inner product must obey these laws:
(i) $\langle v, v \rangle \ge 0$. If $\langle v, v \rangle = 0$, then $v = 0$.
(ii) $\langle v, w \rangle = \langle w, v \rangle$.
(iii) $\langle av, w \rangle = a \langle v, w \rangle$.
(iv) $\langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle$.
[Figure 1.22]
It is clear that $R^n$ is a Euclidean vector space when endowed with its inner
product. The space $C[0, 1]$ of continuous functions on the unit interval is a
Euclidean vector space with this inner product:
$$\langle f, g \rangle = \int_0^1 f(t)\, g(t)\, dt$$
We leave it to the reader to verify that the laws (i)-(iv) are obeyed. It is
interesting that the laws (i)-(iv) are all that is essential to the notion of inner
product; that is, any such function behaving in accordance with those laws
will have all the properties of an inner product. Despite the inherent interest
in this " metamathematical" point, we shall not pursue it further, but take
it for granted that the above definition has indeed abstracted the essence of
this notion.
In terms of an inner product on a vector space we can define the notions of
length and orthogonality:
$$\|v\| = [\langle v, v \rangle]^{1/2}$$
$$v \perp w \text{ if and only if } \langle v, w \rangle = 0$$
The important bases in a Euclidean vector space are those bases whose
vectors are mutually orthogonal. More specifically, we shall call a set
$\{E_1, \dots, E_n\}$ in a Euclidean vector space $V$ an orthonormal set if
$$\|E_i\| = 1 \text{ for all } i$$
$$E_i \perp E_j \text{ for all } i \ne j$$
If the vectors $E_1, \dots, E_n$ span $V$ we shall call them an orthonormal basis.
(Any orthonormal set of vectors is independent; see Problem 68.) The basic
geometric fact concerning orthonormal sets is the following:
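Proposition 33. Let $V$ be a Euclidean vector space and $\{E_1, \dots, E_n\}$ an
orthonormal set in $V$. For any vector $v$ in $V$, the vector $v_0 = v - \sum_{i=1}^{n} \langle v, E_i \rangle E_i$
is orthogonal to the linear span $S$ of $\{E_1, \dots, E_n\}$.
Proof. Let $w = \sum_{i=1}^{n} c^i E_i$ be in $S$. Then
$$\langle v_0, w \rangle = \Big\langle v - \sum_{i=1}^{n} \langle v, E_i \rangle E_i,\ w \Big\rangle = \langle v, w \rangle - \sum_{i=1}^{n} \langle v, E_i \rangle \langle E_i, w \rangle$$
Now
$$\langle E_i, w \rangle = \Big\langle E_i, \sum_{j=1}^{n} c^j E_j \Big\rangle = \sum_{j=1}^{n} c^j \langle E_i, E_j \rangle = c^i$$
$$\langle v, w \rangle = \Big\langle v, \sum_{i=1}^{n} c^i E_i \Big\rangle = \sum_{i=1}^{n} c^i \langle v, E_i \rangle$$
Thus
$$\langle v_0, w \rangle = \sum_{i=1}^{n} c^i \langle v, E_i \rangle - \sum_{i=1}^{n} \langle v, E_i \rangle c^i = 0$$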
Theorem 1.8. Let $V$ be a Euclidean vector space, and let $\{E_1, \dots, E_n\}$
be an orthonormal set in $V$. For any vector $v$, let
$$v_0 = \sum_{i=1}^{n} \langle v, E_i \rangle E_i$$
Then
(i) $\|v\|^2 = \|v - v_0\|^2 + \|v_0\|^2$;
(ii) for any $w$ in the linear span of $\{E_1, \dots, E_n\}$,
$$\|v - v_0\|^2 \le \|v - w\|^2$$
Proof.
(i) $$\|v\|^2 = \langle v, v \rangle = \langle (v - v_0) + v_0,\ (v - v_0) + v_0 \rangle$$
$$= \|v - v_0\|^2 + \|v_0\|^2 + \langle v_0, v - v_0 \rangle + \langle v - v_0, v_0 \rangle$$
The last two terms are zero by the preceding proposition, since $v_0$ is in the linear
span of $\{E_1, \dots, E_n\}$.
(ii) $$\|v - w\|^2 = \langle v - w, v - w \rangle$$
$$= \langle v - v_0 + v_0 - w,\ v - v_0 + v_0 - w \rangle$$
$$= \|v - v_0\|^2 + \|v_0 - w\|^2 + \langle v - v_0, v_0 - w \rangle + \langle v_0 - w, v - v_0 \rangle$$
Again, the last two terms are zero, for both $v_0$ and $w$, and thus also $v_0 - w$, are in the linear
span of $\{E_1, \dots, E_n\}$. Thus
$$\|v - w\|^2 = \|v - v_0\|^2 + \|v_0 - w\|^2 \ge \|v - v_0\|^2$$
so (ii) is proven.
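Both parts of Theorem 1.8 can be checked numerically on a small case. The sketch below is only an illustration in Python: the helper names (`project`, `norm_sq`, `sub`), the orthonormal set in $R^3$, and the sample vectors are assumptions made for the example. It forms $v_0 = \sum \langle v, E_i \rangle E_i$, compares the two sides of (i), and tests inequality (ii) against a handful of other vectors $w$ in the span.

```python
import math

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def norm_sq(v):
    """Squared length ||v||^2 = <v, v>."""
    return inner(v, v)

def sub(v, w):
    """Componentwise difference v - w."""
    return [a - b for a, b in zip(v, w)]

def project(v, orthonormal_set):
    """Return v_0 = sum <v, E_i> E_i for an orthonormal set {E_i}."""
    v0 = [0.0] * len(v)
    for e in orthonormal_set:
        coeff = inner(v, e)
        v0 = [a + coeff * b for a, b in zip(v0, e)]
    return v0

# An orthonormal set in R^3 chosen for illustration.
r = 1.0 / math.sqrt(2.0)
E = [[r, r, 0.0], [0.0, 0.0, 1.0]]
v = [3.0, 1.0, 2.0]
v0 = project(v, E)            # approximately (2, 2, 2)

# Part (i): ||v||^2 = ||v - v0||^2 + ||v0||^2.
lhs = norm_sq(v)
rhs = norm_sq(sub(v, v0)) + norm_sq(v0)

# Part (ii): v0 is at least as close to v as other vectors w = a*E_1 + b*E_2.
candidates = [[a * r, a * r, b] for a in (-2.0, 0.0, 1.5, 3.0) for b in (-1.0, 2.0, 4.0)]
closest_is_v0 = all(
    norm_sq(sub(v, v0)) <= norm_sq(sub(v, w)) + 1e-12 for w in candidates
)
print(v0, lhs, rhs, closest_is_v0)
```

The small tolerance `1e-12` in the comparison only guards against floating-point rounding; it is not part of the theorem.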
Gram-Schmidt Process
Notice that $v - v_0 = v'$ is orthogonal to the linear span $S$ of $\{E_1, \dots, E_n\}$.
Here $v_0$ is the vector in $S$ which is closest to $v$; it is called the orthogonal projection
of $v$ into $S$. It seems, by Theorem 1.8, that one needs an orthonormal basis
in order to find orthogonal projections; the following proposition gives a
procedure for obtaining orthonormal bases for finite-dimensional vector
spaces, and thus with it, orthogonal projections.
Proposition 34. Let $F_1, \dots, F_n$ be a basis for a Euclidean vector space $V$.
We can find an orthonormal basis $E_1, \dots, E_n$ so that the linear span of $E_1, \dots,
E_j$ is the same as the linear span of $F_1, \dots, F_j$ for all $j$.
Proof. The proof is by induction on $n$. If $n = 1$, we need only take $E_1 =
\|F_1\|^{-1} F_1$.
Now in general, let $F_1, \dots, F_n$ be a basis for a Euclidean vector space $V$. Then
the linear span $W$ of $F_1, \dots, F_{n-1}$ is a Euclidean vector space also, and we can
apply the proposition to $W$ by the inductive hypothesis. Let $E_1, \dots, E_{n-1}$ be an
orthonormal basis with the required properties. Now, we must find a vector $E_n$
such that
$$\|E_n\| = 1$$
$$\langle E_n, E_i \rangle = 0 \text{ for all } i \ne n$$
$$F_n \text{ is in the linear span of } E_1, \dots, E_n$$
If $E_n^0$ is a vector that fulfills the last two conditions, then we can take $E_n = \|E_n^0\|^{-1} E_n^0$.
Thus we need only find a vector fulfilling the last two conditions. That is easy; take
$$E_n^0 = F_n - \sum_{j<n} \langle F_n, E_j \rangle E_j$$
Then, for $i < n$,
$$\langle E_n^0, E_i \rangle = \langle F_n, E_i \rangle - \sum_{j<n} \langle F_n, E_j \rangle \langle E_j, E_i \rangle$$
$$= \langle F_n, E_i \rangle - \langle F_n, E_i \rangle \langle E_i, E_i \rangle = 0$$
Furthermore,
$$F_n = E_n^0 + \sum_{j<n} \langle F_n, E_j \rangle E_j$$
so the last two conditions are fulfilled and the proposition is proven.
The proof of this proposition provides a procedure for finding orthonormal
bases in a Euclidean vector space, known as the Gram-Schmidt process. It
goes like this:
First, pick any basis $F_1, \dots, F_n$ of $V$. Take
$$E_1 = \|F_1\|^{-1} F_1$$
Then choose $E_2^0 = F_2 - \langle F_2, E_1 \rangle E_1$, and divide by the length to find $E_2$,
and so forth. If $E_1, \dots, E_j$ are found, take
$$E_{j+1}^0 = F_{j+1} - \langle F_{j+1}, E_1 \rangle E_1 - \langle F_{j+1}, E_2 \rangle E_2 - \cdots - \langle F_{j+1}, E_j \rangle E_j$$
and let $E_{j+1}$ be the vector of length one collinear with $E_{j+1}^0$.
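The process just described can be written out as a short program. What follows is a minimal sketch in Python for vectors in $R^n$ with the standard inner product; the function name `gram_schmidt`, the tolerance `tol` used to detect a dependent input, and the sample basis are assumptions made for illustration. It follows the formula in the text literally (each coefficient is $\langle F_{j+1}, E_i \rangle$); no claim is made about its numerical behaviour on badly conditioned bases.

```python
import math

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(basis, tol=1e-12):
    """Orthonormalize linearly independent vectors F_1, ..., F_n in order."""
    orthonormal = []
    for f in basis:
        # E^0_{j+1} = F_{j+1} - sum_i <F_{j+1}, E_i> E_i
        e0 = list(f)
        for e in orthonormal:
            coeff = inner(f, e)
            e0 = [a - coeff * b for a, b in zip(e0, e)]
        length = math.sqrt(inner(e0, e0))
        if length < tol:
            # F_{j+1} lies (numerically) in the span of the earlier vectors.
            raise ValueError("vectors are not linearly independent")
        # E_{j+1} is the vector of length one collinear with E^0_{j+1}.
        orthonormal.append([a / length for a in e0])
    return orthonormal

# An arbitrary basis of R^3 used only as an illustration.
F = [[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
E = gram_schmidt(F)
for i, ei in enumerate(E):
    print(i + 1, ei, [round(inner(ei, ej), 12) for ej in E])
```

Each printed row lists $E_i$ followed by its inner products with $E_1, E_2, E_3$, which should be 1 on the diagonal and 0 elsewhere up to rounding.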
Examples
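47. Apply the Gram-Schmidt process to this basis of $R^3$:
$$F_1 = (1, 0, 1)$$
$$F_2 = (3, -1, 2)$$
$$F_3 = (0, 0, 1)$$
Take
$$E_1 = \|F_1\|^{-1} F_1 = \Big(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\Big)$$
$$E_2^0 = F_2 - \langle F_2, E_1 \rangle E_1 = (3, -1, 2) - \Big(\frac{5}{2}, 0, \frac{5}{2}\Big) = \Big(\frac{1}{2}, -1, -\frac{1}{2}\Big)$$
Then
$$E_2 = \|E_2^0\|^{-1} E_2^0 = \Big(\frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}}\Big)$$
$$E_3^0 = (0, 0, 1) - \frac{1}{\sqrt{2}} \Big(\frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}}\Big) + \frac{1}{\sqrt{6}} \Big(\frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}}\Big) = \Big(-\frac{1}{3}, -\frac{1}{3}, \frac{1}{3}\Big)$$
and finally
$$E_3 = \Big(-\frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}}\Big)$$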
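48. Find an orthonormal basis for the kernel of $\lambda: R^4 \to R$,
$$\lambda(x^1, x^2, x^3, x^4) = x^1 + x^2 + x^3 + 2x^4.$$
First of all, let us pick a suitable basis for $K(\lambda)$; that is,
$(1, 0, 0, -1/2)$, $(0, 1, 0, -1/2)$, $(0, 0, 1, -1/2)$. Applying the Gram-Schmidt
process, we obtain
$$E_1 = \frac{2}{\sqrt{5}} \Big(1, 0, 0, -\frac{1}{2}\Big) = \Big(\frac{2}{\sqrt{5}}, 0, 0, -\frac{1}{\sqrt{5}}\Big)$$
$$E_2^0 = \Big(0, 1, 0, -\frac{1}{2}\Big) - \frac{1}{2\sqrt{5}} \Big(\frac{2}{\sqrt{5}}, 0, 0, -\frac{1}{\sqrt{5}}\Big) = \Big(-\frac{1}{5}, 1, 0, -\frac{2}{5}\Big)$$
$$E_2 = \Big(\frac{-1}{(30)^{1/2}}, \Big(\frac{5}{6}\Big)^{1/2}, 0, \frac{-2}{(30)^{1/2}}\Big)$$
$$E_3^0 = \Big(0, 0, 1, -\frac{1}{2}\Big) - \frac{1}{2\sqrt{5}} E_1 - \frac{1}{(30)^{1/2}} E_2 = \Big(-\frac{1}{6}, -\frac{1}{6}, 1, -\frac{1}{3}\Big)$$
$$E_3 = \Big(\frac{6}{7}\Big)^{1/2} \Big(-\frac{1}{6}, -\frac{1}{6}, 1, -\frac{1}{3}\Big)$$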
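49. Find the orthogonal projection of $(3, 1, 2)$ into the kernel of
$T: R^3 \to R$:
$$T(x, y, z) = x + 2y + z$$
Now the kernel of $T$ is spanned by $F_1 = (2, -1, 0)$, $F_2 = (0, -1, 2)$.
Applying the Gram-Schmidt process, we obtain the orthonormal basis
$$E_1 = \Big(\frac{1}{5}\Big)^{1/2} (2, -1, 0) \qquad E_2 = \Big(\frac{5}{6}\Big)^{1/2} \Big(-\frac{1}{5}, -\frac{2}{5}, 1\Big)$$
Thus the orthogonal projection of $(3, 1, 2)$ into this plane is
$$\Big(\frac{1}{5}\Big)^{1/2} 5 \Big(\frac{1}{5}\Big)^{1/2} (2, -1, 0) + \Big(\frac{5}{6}\Big)^{1/2} 1 \Big(\frac{5}{6}\Big)^{1/2} \Big(-\frac{1}{5}, -\frac{2}{5}, 1\Big) = \Big(\frac{11}{6}, -\frac{4}{3}, \frac{5}{6}\Big)$$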
50. Find the point on the line
$$L: \quad x + y - z = 0$$
$$\qquad 3y + z = 0$$
which is closest to $(7, 1, 0)$. $L$ is the linear span of the vector
$(4, -1, 3)$. Thus the orthogonal projection of $(7, 1, 0)$ on this
line (the closest point) is
$$\Big\langle (7, 1, 0), \frac{(4, -1, 3)}{(26)^{1/2}} \Big\rangle \frac{(4, -1, 3)}{(26)^{1/2}} = \frac{27}{26} (4, -1, 3)$$
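The closest-point computation for a line through the origin can be checked in exact arithmetic. The sketch below is an illustration in Python using the standard `fractions` module; the helper names `cross` and `project_onto_line` are assumptions, and obtaining the direction of the line as the cross product of the two normal vectors $(1, 1, -1)$ and $(0, 3, 1)$ is a shortcut not used in the text. For the line and point of Example 50 it produces the direction $(4, -1, 3)$ and the coefficient $27/26$.

```python
from fractions import Fraction

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def cross(a, b):
    """Cross product in R^3; orthogonal to both a and b."""
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    ]

def project_onto_line(v, w):
    """Closest point to v on the line spanned by w: (<v, w> / <w, w>) w."""
    t = Fraction(inner(v, w), inner(w, w))
    return [t * c for c in w]

# The line is cut out by x + y - z = 0 and 3y + z = 0.
n1 = [1, 1, -1]
n2 = [0, 3, 1]
direction = cross(n1, n2)
point = [7, 1, 0]
closest = project_onto_line(point, direction)

print(direction)
print(closest)
# The residual point - closest should be orthogonal to the line.
print(inner([p - q for p, q in zip(point, closest)], direction))
```

Dividing by $\langle w, w \rangle$ instead of normalizing $w$ first avoids square roots, which is why the answer comes out as exact fractions.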
• EXERCISES
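61. Which of the following sets are open, closed, or neither?
(a) $\{x \in R : 2 < |x - 5| < 13\}$.
(b) $\{x \in R : 0 < x \le 4\}$.
(c) $\{x \in R : x > 32\}$.
(d) $\{x \in R^n : \langle x, x \rangle = 4\}$.
(e) $\{x \in R^3 : \langle x, (0, 2, 1) \rangle = 0\}$.
(f) $\{x \in R^3 : 2 < \|x - (3, 0, 3)\| < 14\}$.
(g) $\{x \in R^n : x^1 > 0, \dots, x^n > 0\}$.
(h) The set of integers (considered as a subset of $R$).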
(i) {xeR!·: 2aV<£}.
(j) $\{x \in R^n : \sum x^i a^i \ne 1\}$.
(k) $\{x \in R^n : \sum (x^i)^3 < \sum (x^i)^2\}$.
62. Find the point on the plane
χ + Ъу + Iz = 4
closest to the point (1, 0, 1).
63. Find the point on the line
$$x + 7y + z = 2$$
$$x - z = 0$$
closest to the point $(-7, 1, 0)$.
64. Find an orthonormal basis for the linear span of
(a) $v_1 = (0, 2, 2)$, $v_2 = (1, 0, 2)$, $v_3 = (1, 2, 4)$.
(b) $v_1 = (0, 1, 0, 1)$, $v_2 = (1, 0, 1, 0)$, $v_3 = (1, 1, 2, 3)$.
(c) $v_1 = (0, 3, 0, 0, 0)$, $v_2 = (0, 6, 0, 3, 0)$, $v_3 = (0, 0, 2, -1, 1)$.
(d) $v_1 = (1, 2, 3, 4)$, $v_2 = (4, 3, 2, 1)$, $v_3 = (2, 1, 4, 3)$.
65. Find orthonormal bases for the linear span and kernel of these
transformations on $R^4$:
(a) /8 6 1 0\ (b)
• PROBLEMS
67. Prove Proposition 32.
68. Show that an orthogonal set of vectors is independent.
69. Give an example of a sequence $\{U_n\}$ of open sets such that $\bigcap_{n=1}^{\infty} U_n$
is not open.
70. Give an example of a sequence $\{C_n\}$ of closed sets such that $\bigcup_{n=1}^{\infty} C_n$
is open.
71. Find an orthonormal basis for the linear span of $1, x, x^2, x^3$ in the
vector space $C([0, 1])$ with the inner product $\langle f, g \rangle = \int_0^1 fg$.
In the next four problems V represents a vector space endowed with an
inner product, denoted < , >.
72. Let $v$, $w$, $x$ be three points in $V$ such that $v - x$ is orthogonal to
$w - x$. Show that the Pythagorean theorem is valid:
$$\|v - w\|^2 = \|v - x\|^2 + \|x - w\|^2$$
73. Let $v$, $w$ be two vectors in $V$. Show that the vector in the linear span
of $w$ which is closest to $v$ is
$$v_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle}\, w \tag{1.51}$$
(You can verify this by minimizing the function $f(t) = \|v - tw\|^2$ by
calculus.)
74. Prove Schwarz's inequality:
$$|\langle v, w \rangle| \le \|v\| \cdot \|w\|$$
for any two vectors in $V$. (Hint: $\|v - v_0\|^2 \ge 0$ where $v_0$ is given by
(1.51).)
75. Prove the triangle inequality:
$$\|v - x\| \le \|v - w\| + \|w - x\|$$
for any three vectors in $V$ (use Schwarz's inequality).
76. Let $V$ be a vector space with an inner product. Suppose that $W$
is a subspace of $V$. Let $\perp(W) = \{v : \langle v, w \rangle = 0 \text{ for all } w \in W\}$. This is
called the orthogonal complement of $W$. Show that $\perp(W)$ is a linear
subspace of $V$ and (if $V$ is finite dimensional) that $W$ and $\perp(W)$ together
span $V$.
77. Let $T: R^n \to R^m$ be a linear transformation represented by the matrix
$A$. Show that the rows of $A$ span $\perp(K(T))$.
78. Show that a linear transformation is one-to-one on the orthogonal
complement of its kernel.
• FURTHER READING
R. E. Johnson, Linear Algebra, Prindle, Weber & Schmidt, Boston, 1968.
This book covers the same material and includes a derivation of the Jordan
canonical form.
K. Hoffman and R. Kunze, Linear Algebra, Prentice-Hall, Englewood
Cliffs, N.J., 1961. This book is more thorough and abstract, and has a full
discussion of canonical forms.
L. Fox, An Introduction to Numerical Linear Algebra, Oxford University
Press, 1965. This is a detailed treatment of computational problems in
matrix theory.
H. K. Nickerson, D. C. Spencer, and N. Steenrod, Advanced Calculus,
Van Nostrand, Princeton, N.J., 1957. This set of notes has a full treatment
of all the abstract linear algebra required in modern analysis.
• MISCELLANEOUS PROBLEMS
79. Show that if A' is obtained from A by a sequence of row operations
then these equations have the same solutions: Ax = 0, A'x = 0.
80. Show that every nonempty set of positive integers has a least element.
81. Show that a set with $n$ elements has precisely $2^n$ subsets.
82. Show that the $n$-fold Cartesian product of a set with $k$ elements has
$k^n$ elements.
83. Can you interpret the case $k = 2$ in Problem 82 so as to deduce the
assertion of Exercise 3?
84. Let $A = (a_j^i)$ be an $n \times n$ matrix such that $a_j^i = 0$ if $i - j > r$ for some
$r > 0$. Show that $A^{n-r} = 0$. Show that the same conclusion follows from
the assumption $j - i > r$ for some $r > 0$. Will the hypothesis $|i - j| > r$ do
as well?
85. Let $T: R^n \to R^m$ be a linear transformation of rank $r$. Show that
there are linear transformations $S_1: R^m \to R^{m-r}$, $S_2: R^{n-r} \to R^n$ such that
(a) $S_1$ has rank $m - r$ and $b \in R(T)$ if and only if $S_1 b = 0$.
(b) $S_2$ has rank $n - r$ and $x \in K(T)$ if and only if $x \in R(S_2)$.
86. Suppose that T: R" -> R° and Τ = I. Show that Τ is invertible.
87. Let $S$ be a subset of $R^n$. Show that the linear span $[S]$ of $S$ is the
intersection of all linear subspaces of $R^n$ containing $S$.
88. Let $S$, $T$ be subsets of $R^n$. Show that
$$\dim([S \cup T]) \le \dim([S]) + \dim([T]),$$
and equality holds if and only if $[S] \cap [T] = \{0\}$.
89. Let $V$ and $W$ be subspaces of $R^n$. Let $X$ be the set of all sums
$v + w$ with $v \in V$, $w \in W$. Show that $X$ is a linear subspace of $R^n$. The
relationship between $X$ and $V$ and $W$ is indicated by writing $X = V + W$. If
in addition $V \cap W = \{0\}$, then every $x \in X$ can be written in the form
$v + w$ in only one way. In this case, $X = V + W$ with $V \cap W = \{0\}$, we say
that $X$ is the direct sum of $V$ and $W$ and write $X = V \oplus W$.
90. Suppose $X = V \oplus W$. Then $\dim X = \dim V + \dim W$.
91. Show that if $\lambda: R^n \to R$ is a linear function, there exists a $w \in R^n$ such
that $\lambda(v) = \langle v, w \rangle$ for all $v \in R^n$.
92. If $S$ is a subset of $R^n$ define
$$\perp(S) = \{v \in R^n : \langle v, s \rangle = 0 \text{ for all } s \in S\}.$$
(a) Show that $\perp(S)$ is a subspace of $R^n$ and that $S \cap \perp(S) = \{0\}$.
(b) Show that $[S] = \perp(\perp(S))$.
(c) If $V$ is a linear subspace of $R^n$, $R^n = V \oplus \perp(V)$.
93. Suppose that $T: V \to W$ is a linear transformation and $V$ is not finite
dimensional. Show that either the rank or the nullity of $T$ must be infinite.
94. Let $V$ be an abstract vector space. A bilinear function $p$ on $V$ is a
function of two variables in $V$ with these properties:
$$p(cv, w) = c\,p(v, w) \qquad p(v, cw) = c\,p(v, w)$$
$$p(v_1 + v_2, w) = p(v_1, w) + p(v_2, w) \qquad p(v, w_1 + w_2) = p(v, w_1) + p(v, w_2)$$
Show that the sum of two bilinear functions is bilinear. In fact, the space
$B_V$ of all bilinear functions is an abstract vector space. If $V$ is finite
dimensional, what is the dimension of $B_V$? (Hint: See the next problem.)
95. Let $p$ be a bilinear function on $R^n$. Let
$$a_{i,j} = p(E_i, E_j)$$
Show that $p$ is completely determined by the matrix $(a_{i,j})$.
96. Let $V$ be an abstract vector space.
(a) Show that the space $V^*$ of linear functions on $V$ is a vector space
under addition and scalar multiplication.
(b) If $\dim V = d$, show that $\dim V^* = d$ also.
(c) Show that to every $\lambda \in (R^n)^*$ there is a $w \in R^n$ such that $\lambda(v) =
\langle v, w \rangle$ for all $v \in R^n$. (Recall Problem 91.)
97. Suppose that $V$ is a linear subspace of $W$. We define the annihilator
of $V$, denoted ann($V$), to be the set of $\lambda \in W^*$ such that $\lambda(v) = 0$ if $v \in V$.
Show that ann($V$) is a linear subspace of $W^*$. If $\dim W = n$, $\dim V = d$,
show that ann($V$) has dimension $n - d$.
98. Let $V$ be a linear subspace of $R^n$, and suppose that $T: V \to R^m$ is a
linear transformation. Show that there is a linear transformation $T'$:
$R^n \to R^m$ defined on all of $R^n$ which extends $T$.
99. The closure of a set $S$, denoted $\bar{S}$, is the set of all points $x$ such that
every neighborhood of $x$ contains points of $S$. Find the closure of all the
sets in Exercise 61.
100. Show that the closure of a set $S$ is the smallest closed set containing $S$.
101. The boundary of a set $S$, denoted $\partial S$, is the set of all points $x$ such
that every neighborhood of $x$ contains points of both $S$ and the complement
of $S$. Find the boundary of all the sets in Exercise 61.
102. Show that the boundary of a set is a closed set.
103. Show that the boundary of a set $S$ is also the boundary of its
complement $R^n - S$. In fact, show that $\partial S = \bar{S} \cap \overline{(R^n - S)}$.
104. Let $T: V \to W$ be a linear transformation between vector spaces with
inner products. The adjoint of $T$ is the transformation $T^*: W \to V$ defined
in this way:
$$\langle T^*(w), v \rangle = \langle w, Tv \rangle \text{ for all } v \in V$$
(a) Show that $T^*$ is a well-defined linear transformation.
(b) If $T: R^n \to R^m$ is represented by the matrix $A = (a_j^i)$, then
$T^*: R^m \to R^n$ is represented by the matrix $A^* = (a_j^{*i})$, where $a_j^{*i} = a_i^j$.
(This matrix is called the adjoint or transpose of $A$.)
(c) Show that $R(T^*)$ is complementary to $K(T)$.
(d) In fact, p(T*) = v(T), v(T*) = p(T).
105. A bilinear form $p$ on a vector space $V$ is called symmetric if it obeys
the law: $p(v, w) = p(w, v)$ for all $v$ and $w$. An inner product is a symmetric
bilinear form and many of the formal manipulations with inner products
remain valid for symmetric bilinear forms. For example, the Gram-Schmidt
process (Proposition 34) gives rise to this fact (see if you can rework
the proof of Proposition 34 to give it):
Proposition. Let $p$ be a symmetric bilinear form on $V$. Suppose $F_1, \dots,
F_n$ is a basis for $V$. We can find another basis, $E_1, \dots, E_n$ of $V$ such that the
linear span of $E_1, \dots, E_j$ is the same as that of $F_1, \dots, F_j$ for all $j$, and
$p(E_i, E_j) = 0$ if $i \ne j$.
We shall call such a basis $E_1, \dots, E_n$ $p$-orthogonal.
106. Let $p$ be a symmetric bilinear form on a vector space $V$, and suppose
$E_1, \dots, E_n$ is a $p$-orthogonal basis.
(a) Show that $p(v, w)$ can be computed in terms of this basis as
follows: if $v = \sum v^i E_i$, $w = \sum w^i E_i$, then
$$p(v, w) = \sum_{i=1}^{n} v^i w^i p(E_i, E_i) \tag{1.52}$$
(b) Show that $p$ is an inner product on the linear span of the $E_i$
such that $p(E_i, E_i) > 0$.
(c) Similarly, $-p$ is an inner product on the linear span of the $E_i$
such that $p(E_i, E_i) < 0$.
107. Prove this fact: Let $p$ be a symmetric bilinear form on a finite-dimensional
vector space $V$. There is a basis $E_1, \dots, E_n$, integers $r$, $s$ such
that $r + s \le n$ and such that if $v = \sum v^i E_i$, then
$$p(v, v) = \sum_{i \le r} (v^i)^2 - \sum_{r < i \le r+s} (v^i)^2 \tag{1.53}$$
(Hint: Modify the basis $\{E_i\}$ in Problem 106 so that (1.52) becomes (1.53).)
108. The integers $r$, $s$ of Problem 107 are determined by $p$ alone, and are
independent of the basis. Here is a sketch of how a proof would go.
Suppose $F_1, \dots, F_n$ is another $p$-orthogonal basis and $\rho$ is the number of
$F_i$'s such that $p(F_i, F_i) > 0$. We have to show $\rho = r$. Let $W$ be the linear
span of these $F_i$'s. Expressing points of $W$ in terms of the basis $E_1, \dots, E_n$
we may consider the transformation $T: W \to R^r$ given by
$$T\Big(\sum v^i E_i\Big) = (v^1, \dots, v^r)$$
$T$ is one-to-one on $W$, for if $w \in W$, and $w \ne 0$,
$$0 < p(w, w) = \sum_{i \le r} (v^i)^2 - \sum_{r < i \le r+s} (v^i)^2$$
so we must have
$$\sum_{i \le r} (v^i)^2 > 0$$
on $W$. Since $T$ is one-to-one, it follows that $r \ge \rho$. The inequality $\rho \ge r$
follows from the same argument with the roles of $E_1, \dots, E_n$ and $F_1, \dots, F_n$
interchanged.
109. Let $A = (a_{i,j})$ be a symmetric $n \times n$ matrix, that is, $a_{i,j} = a_{j,i}$.
Then $A$ determines a symmetric bilinear form on $R^n$ as follows:
$$p_A(v, w) = \sum a_{i,j} v^i w^j$$
If $P$ is the matrix corresponding to the change of basis from the standard
basis to that described in Problem 105, then $P^* A P$ is diagonal. Verify that
assertion.
110. Find the $p$-orthogonal basis and the representation (1.53) of Problem
107 for the symmetric bilinear forms given by these matrices:
(a) /4 3 0 l\ (b)
111. Describe the sets $p(v, v) > 0$, $= 0$, $< 0$ in $R^4$ where $p$ is given by
$$p(v, v) = (v^1)^2 + (v^2)^2 + (v^3)^2 - (v^4)^2$$
112. A transformation $T: V \to V$ is called self-adjoint if it is its own
adjoint ($\langle Tv, w \rangle = \langle v, Tw \rangle$ for all $v, w \in V$). Show that if $T$ is a
self-adjoint transformation on $R^n$, then
$$R^n = K(T) \oplus R(T)$$
113. Suppose that $v$, $w$ are eigenvectors of a self-adjoint transformation $T$
on $V$ with different eigenvalues. Show that $\langle v, w \rangle = 0$.
114. If $T$ is a self-adjoint transformation on $R^n$, and $v_0 \in R^n$ is such that
$\sum (v_0^i)^2 = 1$ and
$$\langle Tv_0, v_0 \rangle = \max\Big\{\langle Tv, v \rangle : \sum (v^i)^2 = 1\Big\}$$
then $v_0$ is an eigenvector for $T$.
115. Use Problems 113 and 114 to prove the Spectral theorem for
self-adjoint operators on $R^n$:
Theorem. There is an orthonormal basis $E_1, \dots, E_n$ of eigenvectors of $T$.
$T$ can be computed in terms of this basis by
$$T\Big(\sum x^i E_i\Big) = \sum x^i c_i E_i$$
116. Find a basis of eigenvectors in $R^4$ for the self-adjoint transformations
given by the matrices (a), (b) of Problem 110.
117. Orthonormalize these bases of $R^4$:
(a) $(1, 0, 0, 0)$, $(0, 1, 1, 1)$, $(0, 0, 2, 2)$, $(3, 0, 0, 3)$.
(b) $(-1, -1, -1, -1)$, $(0, -1, -1, -1)$, $(0, 0, -1, -1)$,
$(0, 0, 0, -1)$.
(c) $(0, 1, 0, 1)$, $(1, 0, 1, 0)$, $(1, 0, 0, 1)$, $(0, 1, 1, 0)$.
118. Find the orthogonal projection of $R^5$ onto these spaces:
(a) The span of $(0, 1, 0, 0, 1)$.
(b) The span of $(1, 1, 0, 0, 0)$, $(1, 0, 1, 0, 0)$.
(c) The span of $(1, 0, 0, 0, 1)$, $(0, 1, 0, 0, 1)$, $(0, 0, 1, 0, 1)$.
(d) The span of the vectors given in (c) and the vector $(0, 0, 0, 1, 1)$.
Chapter 2
NOTIONS OF CALCULUS
One of the main methods of the modern approach to mathematics is the
recognition of familiar concepts at work in unfamiliar settings. Thus, the
ideas of linear algebra, originally introduced for the purpose of solving
systems of equations, will be seen also to have relevance in the study of
functions. In time it will be seen that many simple concepts of geometry
permeate a lot of mathematics. Thus it is important to us, where possible,
to try to isolate our concepts and set them in an initially very abstract situation
in order to maximize their applicability. Of course, we can't just do that;
we must have had some familiarity with the behavior of those concepts. For
that we need examples. As we study these examples we can begin to
recognize more and more clearly the essence of our concept. This gives rise to a
(perhaps) tentative abstract proposal which requires further study of new
examples, born out of our generalizations. This procedure, iterated over
and over again, may take many generations and the best work of many
mathematicians before a clear, precise and satisfactory definition is molded.
So it has been with the limit notion, which was implicit in the early 17th
century, which was in some sense formulated by Newton and Leibniz in the
18th century, but which did not take a final and comprehensible form until
the late 19th century. We shall not try to encompass over two centuries of
struggle in a few pages; we shall have to take some short cuts and we shall
try (for obvious pedagogical reasons) to avoid the great confusion that is
suffered during such development.
The basic technique of calculus is approximation. Let us give an illustration
of how it goes. The problems of calculus are such that we are required
to produce a function that has given properties. There are two aspects to
this problem. There is the theoretical aspect: to be assured that there exists
a solution to our problem, and the practical one: to describe a procedure for
effectively computing that solution. These two aspects are inseparable.
In fact, we make a sequence of attempts to solve the problem. If these
attempts are good they will provide us with a sequence of functions successively
providing better solutions to the problem. Then further study of the general
form of these tentative solutions may provide a clue to the accurate solution.
Suppose we have a square of side length one unit in the plane (see Figure
2.1), and consider this problem. Find a function $f$ defined on the box which has
prescribed values at the vertices and which satisfies this condition: For every
point in the box and any rectangle with center at that point, the value of $f$ at
that point is the average of the values of $f$ at the vertices of the rectangle.
Now we can write these conditions more precisely:
$$f(0, 0) = a \qquad f(1, 0) = b \qquad f(1, 1) = c \qquad f(0, 1) = d$$
where $a$, $b$, $c$, $d$ are given numbers. Further, for any $(x, y)$ and $(s, t)$ we must
have
$$f(x, y) = \tfrac{1}{4}\big[f(x - s, y - t) + f(x + s, y - t) + f(x + s, y + t) + f(x - s, y + t)\big] \tag{2.1}$$
Now, how do we find such a function? We compute, based on the given
information, its value at certain points and try to see if we obtain a pattern.
[Figure 2.1: the unit square, with vertices labeled (0,1), (1,1), (1,0).]
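First of all, the value at the center of the box is easy to find:
$$f(\tfrac{1}{2}, \tfrac{1}{2}) = \tfrac{1}{4}(a + b + c + d)$$
By (2.1) we can compute the value at the center points of the sides
($s = \tfrac{1}{2}$, $t = 0$):
$$f(\tfrac{1}{2}, 0) = \tfrac{1}{4}\big[f(0, 0) + f(1, 0) + f(1, 0) + f(0, 0)\big] = \frac{a + b}{2}$$
Similarly, we obtain the values at all the other center points shown in Figure
2.2. Let us move to more complicated points, for example, the centers of the
four squares in Figure 2.2. Since we know the values at all the relevant
vertices, we may compute, by (2.1),
$$f(\tfrac{1}{4}, \tfrac{1}{4}) = \tfrac{9}{16} a + \tfrac{3}{16} b + \tfrac{1}{16} c + \tfrac{3}{16} d$$
$$f(\tfrac{3}{4}, \tfrac{1}{4}) = \tfrac{3}{16} a + \tfrac{9}{16} b + \tfrac{3}{16} c + \tfrac{1}{16} d \tag{2.2}$$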
[Figure 2.2: the unit square with the values of $f$ at the midpoints of the sides, $(a + b)/2$, $(b + c)/2$, $(c + d)/2$, $(a + d)/2$, and at the center, $(a + b + c + d)/4$.]
Now we can see that we can break the given square into 16 squares and
compute the values at the centers (points of the form $p/2^3$, $q/2^3$), and so
forth. We can successively compute the necessary values of $f$ at all points
of the form $p/2^n$, $q/2^n$. Since any point in the rectangle has points of this
form arbitrarily near it, we surmise that by this tedious procedure, we will
be able to approximate the value of $f$ at any point. It is fair to guess then
that a solution to our problem exists and that we have described a technique
for computing its values. If we return to Equations (2.2) (or their successors
at the next stage) we may be able to really find a formula for the solution.
Now it turns out that Equations (2.2) may be rewritten as
$$f\Big(\frac{p}{2^n}, \frac{q}{2^n}\Big) = \frac{(2^n - p)(2^n - q)}{2^{2n}}\, a + \frac{p(2^n - q)}{2^{2n}}\, b + \frac{pq}{2^{2n}}\, c + \frac{q(2^n - p)}{2^{2n}}\, d \tag{2.3}$$
(the case $n = 2$; $p = 0, 1, 2, 3$; $q = 0, 1, 2, 3$). We can show by successively
computing the values at centers of squares that (2.3) is valid for all $n$. Thus
rewriting (2.3), we can assert that if $(x, y)$ is of the form $(p/2^n, q/2^n)$ with $p$ and
$q$ as integers, then
$$f(x, y) = (1 - x)(1 - y)a + x(1 - y)b + xyc + (1 - x)yd \tag{2.4}$$
Assuming that $f$ is a well-behaved function this must then hold for all points
$(x, y)$. Finally, we can show by substituting into the required conditions
that (2.4) gives the solution.
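The final substitution can also be spot-checked by machine. The following is a minimal sketch in Python with exact rational arithmetic; the function names `f` and `rectangle_average`, the sample vertex values, and the test points are illustrative assumptions, and the vertex labeling is the one read off from (2.4), namely $a, b, c, d$ at $(0,0), (1,0), (1,1), (0,1)$. It checks the vertex values and the averaging condition (2.1) at a few points, which illustrates but does not replace the algebraic verification.

```python
from fractions import Fraction

def f(x, y, a, b, c, d):
    """Formula (2.4), with a, b, c, d the values at (0,0), (1,0), (1,1), (0,1)."""
    return (1 - x) * (1 - y) * a + x * (1 - y) * b + x * y * c + (1 - x) * y * d

def rectangle_average(x, y, s, t, a, b, c, d):
    """Right-hand side of (2.1): average of f over the four rectangle vertices."""
    corners = [(x - s, y - t), (x + s, y - t), (x + s, y + t), (x - s, y + t)]
    return sum(f(u, v, a, b, c, d) for u, v in corners) / 4

# Arbitrary vertex values chosen for illustration.
a, b, c, d = Fraction(2), Fraction(5), Fraction(-1), Fraction(7)

half, quarter = Fraction(1, 2), Fraction(1, 4)
# Each entry is (x, y, s, t): a center and the half-widths of a rectangle in the box.
checks = [
    (half, half, half, half),
    (quarter, quarter, quarter, quarter),
    (Fraction(3, 8), Fraction(5, 8), Fraction(1, 8), Fraction(1, 4)),
]
mean_value_holds = all(
    f(x, y, a, b, c, d) == rectangle_average(x, y, s, t, a, b, c, d)
    for x, y, s, t in checks
)
print(f(half, half, a, b, c, d), f(quarter, quarter, a, b, c, d), mean_value_holds)
```

The check succeeds for every rectangle because $f$ has the form $\alpha + \beta x + \gamma y + \delta xy$, and each of these four terms is reproduced exactly by averaging over the vertices of a rectangle centered at $(x, y)$.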
Our purpose in the present chapter is to discuss the theoretical concepts
which remove the fuzziness in the above discussion. We shall expose the
ideas of limit and continuity in the setting of functions of many variables. We
shall also present a review of the information from calculus which is necessary
to the study of this text.
2.1 Convergence of Sequences
Before proceeding directly to the limit notion, a few words on the notion
of a sequence are in order. Let $X$ be any set. A sequence of points in $X$
is an ordered collection $\{x_1, x_2, \ldots, x_n, \ldots\}$ of points in $X$, one for each
positive integer. Another way of saying that is this: a sequence of points
in $X$ is given by a function $f: P \to X$, where we denote $f(n)$ by $x_n$. As a
shorthand device we will often denote the sequence $\{x_1, x_2, \ldots, x_n, \ldots\}$
merely by its general term $\{x_n\}$.
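Examples
1. $\{1, 2, 3, \ldots, n, \ldots\}$, with $f(n) = n$, written $\{n\}$.
2. $\{-1, \tfrac{1}{2}, -\tfrac{1}{3}, \ldots, (-1)^n/n, \ldots\}$, with $f(n) = (-1)^n/n$, written $\{(-1)^n/n\}$.
3. $\{10, 10^{1/2}, \ldots, 10^{1/n}, \ldots\}$, with $f(n) = 10^{1/n}$, written $\{10^{1/n}\}$.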
A subsequence of a given sequence $\{x_n\}$ is a sequence $\{y_n\}$ extracted
from the ordered collection $\{x_1, \ldots, x_n, \ldots\}$. Thus, the collections
4. {odd-numbered $x_n$'s} $= \{x_{2n-1}\}$,
5. {every fifth term in $\{x_n\}$} $= \{x_{5n}\}$,
6. $\{x_{p_n}\}$, where $p_n$ is the $n$th prime,
7. $\{x_{g(n)}\}$, where $g$ is a strictly increasing function on the positive
integers,
are all subsequences of $\{x_n\}$, whereas
8. $\{x_1, x_1, \ldots, x_1, \ldots\}$ is not a subsequence.
The above description of subsequence is a bit vague. The phrase "extracted
from" is picturesque but not too meaningful. Is the sequence
$$\{x_5, x_4, x_3, x_2, x_1, x_{10}, x_9, \ldots, x_6, \ldots, x_{5n}, x_{5n-1}, x_{5n-2}, x_{5n-3}, x_{5n-4}, \ldots\}$$
a subsequence of $\{x_n\}$? It isn't clear from the preceding paragraph.
However, we should draw the line and exclude such new sequences. The essence
of a subsequence will be that it consists of some of the $x_n$'s, infinitely many
of them, and collected in the same order. Now, to be really exacting, our
notion of sequence itself is imprecise; we seem to have failed to say what it is.
"An ordered collection" is not very satisfactory. We have already elaborated
on that: "a sequence ... is given by a function $f: P \to X$ ... ." Yet, it is
"given by ...," but what is it? It turns out that this line of metaphysical
questioning bogs down, and is in fact irrelevant. We have already found
something which completely describes the sequence (the function $f: P \to X$),
so why not define a sequence just as such a function? Indeed, when we do
so, it becomes very easy to also define a subsequence.
Definition 1. Let $X$ be a set. A sequence in $X$ is a function $f: P \to X$.
A subsequence of this sequence is another sequence $h: P \to X$, where $h = f \circ g$
and $g$ is a strictly increasing function from $P$ to $P$.
Thus, if $\{x_1, \ldots, x_n, \ldots\}$ is a sequence, this is in fact just another way of
writing the function $f$, $f(n) = x_n$. If $f \circ g$ is a subsequence, we can enumerate
it as $\{x_{g(1)}, x_{g(2)}, \ldots, x_{g(n)}, \ldots\}$.
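Definition 1 translates almost word for word into code. The sketch below treats a sequence as a function on the positive integers and a subsequence as a composition with a strictly increasing index map; the names `x`, `g`, `subsequence` and `strictly_increasing_on` are illustrative, not notation from the text.

```python
def x(n):
    """The sequence of Example 2, viewed as a function on the positive integers."""
    return (-1) ** n / n

def g(n):
    """A strictly increasing map of the positive integers into themselves."""
    return 2 * n

def subsequence(f, g):
    """Return h = f o g, the subsequence of f selected by g (Definition 1)."""
    return lambda n: f(g(n))

def strictly_increasing_on(g, upto):
    """Spot-check g(1) < g(2) < ... < g(upto); a finite check, not a proof."""
    return all(g(n) < g(n + 1) for n in range(1, upto))

h = subsequence(x, g)
first_terms = [h(n) for n in range(1, 5)]  # x_2, x_4, x_6, x_8
```

The constant index map `lambda n: 1` fails the monotonicity check, which is exactly why the collection in item 8 above is not a subsequence.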
The above definition is an illustration of a standard mathematical procedure
of defining things. A concept is, mathematically, an object with such and
such properties. Once we have stated the properties which we feel describe
the concept, there is no need to further inquire what the object is; we simply
define it by those properties. We now introduce the notion of convergence
of a sequence of numbers (which we may take as complex numbers).
Definition 2. Let $\{z_n\}$ be a sequence of complex numbers. We say that
the sequence converges if there is a $z \in C$ such that to every positive number
$\varepsilon > 0$, there corresponds an integer $N$ such that $|z_n - z| < \varepsilon$ for $n \ge N$. In
this case we say $\{z_n\}$ converges to $z$, written $\lim z_n = z$ or $\lim_{n \to \infty} z_n = z$ or
$z_n \to z$.
Said another way, the sequence $f: P \to C$ converges to $z \in C$ if, given any
disk centered at $z$, the range of $f$ on all but finitely many integers lies in that
disk (see Figure 2.3).
Figure 2.3
The following proposition asserts that a sequence cannot converge to more
than one point, and gives necessary conditions for convergence, without
reference to the limit point.
Proposition 1. Suppose $\lim z_n = z$.
(i) if also $\lim z_n = w$, then $w = z$,
(ii) the sequence is bounded, that is, there is an $M > 0$ such that $|z_n| \le M$
for all $n$,
(iii) (Cauchy criterion) for every $\varepsilon > 0$, there is an $N > 0$ such that
$|z_n - z_m| < \varepsilon$ for all $n, m \ge N$.
Proof.
(i) By the hypotheses, given $\varepsilon > 0$, there are $N_1$, $N_2$ such that $|z_n - z| < \varepsilon$ for
$n \ge N_1$, $|z_n - w| < \varepsilon$ for $n \ge N_2$. Thus,
$$|z - w| \le |z - z_{N_1 + N_2}| + |z_{N_1 + N_2} - w| < 2\varepsilon$$
Since the inequality $|z - w| \le 2\varepsilon$ holds for all $\varepsilon > 0$, we must have $|z - w| \le 0$, or
$z = w$.
(ii) Taking $\varepsilon = 1$, there is an integer $N$ such that $|z_n - z| < 1$ for $n \ge N$. Let
$M = \max\{|z|, |z_1|, |z_2|, \ldots, |z_N|\} + 1$. Then if $n \le N$, certainly $|z_n| \le M$. If
$n > N$, $|z_n| \le |z_n - z| + |z| \le 1 + |z| \le M$.
(iii) Let $\varepsilon > 0$ be given. There is an integer $N$ such that $|z_n - z| < \varepsilon/2$ for
$n \ge N$. Thus, if $n, m \ge N$, we have
$$|z_n - z_m| \le |z_n - z| + |z_m - z| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$$
Condition (iii) is called a criterion, for it implies convergence, as we shall
see below. Notice that (ii) does not imply convergence. The sequence
$\{(-1)^n\}$ is clearly bounded, but does not converge (it doesn't even satisfy
the Cauchy criterion: $|(-1)^n - (-1)^{n+1}| = 2$ for all $n$).
Examples
9. $\lim (1/n) = 0$. Let $\varepsilon > 0$ be given, and choose the integer $N$
so that $N > \varepsilon^{-1}$. Then, for $n \ge N$,
$$\left|\frac{1}{n} - 0\right| = n^{-1} \le N^{-1} < \varepsilon$$
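The choice of $N$ in Example 9 can be carried out mechanically. The following sketch, using exact fractions, picks the smallest integer $N > \varepsilon^{-1}$ and spot-checks the inequality on a finite range of indices; the function names and the horizon of 1000 terms are assumptions made for illustration, and a finite check is of course no substitute for the proof.

```python
from fractions import Fraction
from math import floor

def N_for(eps):
    """Smallest integer N with N > 1/eps, as chosen in Example 9."""
    return floor(1 / eps) + 1

def check(eps, horizon=1000):
    """Verify |1/n - 0| < eps for N <= n < N + horizon (a finite spot check)."""
    N = N_for(eps)
    return all(Fraction(1, n) < eps for n in range(N, N + horizon))

eps = Fraction(1, 100)
N = N_for(eps)  # smallest integer exceeding 1/eps = 100
```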
10. $\lim (i^n/n) = 0$. The proof is the same (see also Problem 2).
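11. $\lim \dfrac{2}{1 + (1/n)} = 2$. Let $\varepsilon > 0$ be given. Now,
$$\left|\frac{2}{1 + (1/n)} - 2\right| = 2\left|\frac{1}{1 + (1/n)} - 1\right| = 2\left|\frac{n - (n + 1)}{n + 1}\right| = \frac{2}{n + 1}$$
Thus, we need only take $N > 2\varepsilon^{-1}$ to verify the condition for
convergence.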
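12. $\lim \dfrac{n}{n + 1} = 1$, since
$$\left|\frac{n}{n + 1} - 1\right| = \frac{1}{n + 1} < \varepsilon \quad \text{if } n > \varepsilon^{-1}$$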
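13. $\lim h^n = 0$ for $0 < h < 1$. Since $h < 1$, there is an integer
$K \ge 1$ such that $h < K/(K + 1)$. Since the sequence $n/(n + 1)$ is
increasing, we have $h < n/(n + 1)$ for all $n \ge K$. Now, we shall show by
mathematical induction that $h^n < K/n$ for all $n$. The case $n = 1$ is
clear since $K \ge 1$. For $n < K$ the step is immediate, since
$h^{n+1} < 1 \le K/(n + 1)$. For $n \ge K$, using the $n$th inequality we obtain the
$(n + 1)$th:
$$h^{n+1} = h \cdot h^n < \frac{n}{n + 1} \cdot \frac{K}{n} = \frac{K}{n + 1}$$
Thus, if $\varepsilon > 0$ is given, let $N \ge K\varepsilon^{-1}$. For $n > N$, $|h^n - 0| = h^n < K/n < \varepsilon$.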
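The bound $h^n < K/n$ of Example 13 is easy to test numerically. The sketch below finds the smallest admissible $K$ for a sample value $h = 9/10$ and checks the bound on a finite range with exact fractions; the sample value, the range and the function names are assumptions chosen for illustration.

```python
from fractions import Fraction

def find_K(h):
    """Smallest integer K >= 1 with h < K / (K + 1); requires 0 < h < 1."""
    K = 1
    while not h < Fraction(K, K + 1):
        K += 1
    return K

def bound_holds(h, n_max):
    """Check h**n < K/n for 1 <= n <= n_max (finite spot check of Example 13)."""
    K = find_K(h)
    return all(h ** n < Fraction(K, n) for n in range(1, n_max + 1))

h = Fraction(9, 10)
K = find_K(h)  # 9/10 < 10/11, while 9/10 is not < 9/10
```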
The study of the convergence of complex sequences is easily reduced to that
of real sequences by the following fact. A complex sequence converges
if and only if the real and imaginary parts both converge.
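Proposition 2. Let $\{z_n\}$ be a sequence of complex numbers, $z_n = x_n + iy_n$.
$\lim z_n = z = x + iy$ if and only if $\lim x_n = x$ and $\lim y_n = y$.
Proof. Suppose $\lim z_n = z$. Then, given $\varepsilon > 0$ there is an $N$ such that for
$n \ge N$, $|z_n - z| < \varepsilon$. Since
$$|x_n - x| \le |z_n - z| \quad \text{and} \quad |y_n - y| \le |z_n - z|$$
we also have
$$|x_n - x| < \varepsilon \quad \text{and} \quad |y_n - y| < \varepsilon \quad \text{for } n \ge N$$
so $\lim x_n = x$ and $\lim y_n = y$.
Conversely, given $\varepsilon > 0$, there are $N_1$, $N_2$ such that $n \ge N_1$ implies $|x_n - x| < \varepsilon/\sqrt{2}$, and
$n \ge N_2$ implies $|y_n - y| < \varepsilon/\sqrt{2}$. Then for $n \ge \max(N_1, N_2)$,
$$|z_n - z| = \left(|x_n - x|^2 + |y_n - y|^2\right)^{1/2} < \left(\frac{\varepsilon^2}{2} + \frac{\varepsilon^2}{2}\right)^{1/2} = \varepsilon$$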
We have so far been considering questions of this form: Given the sequence
$\{z_n\}$ and the number $z$, is $\lim z_n = z$? A deeper problem is this: Given
the sequence $\{z_n\}$, find, if possible, a number $z$ such that $\lim z_n = z$. A
solution of such problems requires a more profound understanding of the
real number system than we have so far needed. A question of existence
is now involved. To resolve such questions we have to have explicit
knowledge that there are many real numbers, whereas until now we have made use
only of the existence of the numbers 0 and 1. The explicit knowledge desired
here is that provided by the axiom of the least upper bound, which roughly
states that there are no gaps in the line of real numbers. A set $S$ of real
numbers is said to be bounded from above if there is a number $M$ such that
$x \le M$ for every $x \in S$. If $S$ is a set which is bounded above, it is conceivably
useful to know the smallest number Μ which will serve as an upper bound.
We shall refer to such a least upper bound of the set S by sup S (and inf S
will denote the greatest lower bound if it exists). The axiom of the least
upper bound asserts that for a set S which is bounded above, sup S exists.
We shall state this same axiom in terms of sequences because in that form it is
more appropriate to our present context.
Theorem 2.1. Let $\{x_n\}$ be a decreasing sequence of real numbers; that is,
$x_n \ge x_{n+1}$ for all $n$. If the set $\{x_n\}$ is bounded, the sequence converges.
We have called this a theorem since it can be deduced from the axiom of
the least upper bound (see Problem 1), which we can take as a defining
property of the real number system. A consequence of this fact of existence
for the real numbers is the fact that the Cauchy criterion (see Proposition
1(iii)) is a criterion for convergence. The proof goes like this: first we find
a subsequence of the given sequence which is decreasing. An easy
consequence of the Cauchy criterion is that the sequence is bounded. Thus, by
Theorem 2.1 this subsequence has a limit x. It now follows from the Cauchy
criterion that the full given sequence also has the limit x.
Theorem 2.2. Let $\{x_n\}$ be a Cauchy sequence of real numbers. That is,
for every $\varepsilon > 0$, there is an integer $N$ such that $|x_n - x_m| < \varepsilon$ whenever $n, m \ge N$.
Then there is an $x$ such that $\lim x_n = x$.
Proof. First, a Cauchy sequence is bounded. Let $\varepsilon = 1$; there is an $N$ such that
$|x_n - x_m| < 1$ for $n, m \ge N$. Then $M = \max\{|x_1|, \ldots, |x_N|\} + 1$ is a bound for $\{x_n\}$.
Let $u_k = \sup\{x_n : n \ge k\}$. Clearly, the $u_k$ are decreasing, and $u_k \ge -M$ for all $k$, so
by Theorem 2.1 the sequence $u_k$ converges, say $\lim u_k = u$. We shall show that also
$\lim x_n = u$.
Let $\varepsilon > 0$. There are $N_1$, $N_2$ such that
$$|x_n - x_m| < \frac{\varepsilon}{3} \quad \text{for } n, m \ge N_1$$
$$|u_k - u| < \frac{\varepsilon}{3} \quad \text{for } k \ge N_2$$
Let $N = N_1 + N_2$. Since $u_N = \sup\{x_n : n \ge N\}$, there is an $n_0 \ge N$ such that
$u_N - (\varepsilon/3) < x_{n_0} \le u_N$. Then, combining all these inequalities, we have for $n \ge N$,
$$|x_n - u| \le |x_n - x_{n_0}| + |x_{n_0} - u_N| + |u_N - u| < \varepsilon$$
Because, by Proposition 2, a complex sequence converges if its real and
imaginary parts do, we can deduce the same theorem for complex sequences.
Corollary. A Cauchy sequence of complex numbers converges.
Proof. Problem 3.
• EXERCISES
1. What are the limits (when they exist) of these sequences:
(a) $\{n^2 - 4\}$ (c) jb^j
(b) $\{(n^2 - 4)^{-1}\}$ (d) $\{(-1)^n - (-1)^{n+1}\}$
(e)
(f)
МЭ1 w Ρ£^
2. If $\{x_n\}$ is a convergent sequence, then $\lim_{n \to \infty} (x_{n+1} - x_n) = 0$. Is this a
criterion for convergence?
3. Suppose $\lim x_n = z$. Let $\{y_n\}$ be the sequence $\{x_{N+1}, x_{N+2}, \ldots\}$. Show
that $\lim y_n = z$ also.
4. Let $\{s_n\}$, $\{t_n\}$ be two convergent sequences. Show that if they are
convergent sequences with the same limit, then $\lim (s_n - t_n) = 0$. Is the
converse true?
5. What is $\lim ((n + 1)^{1/2} - \sqrt{n})$?
6. Show that $\lim z_n = 0$ if and only if $\lim |z_n| = 0$.
• PROBLEMS
1. Let $\{x_n\}$ be a decreasing sequence of real numbers. Prove, using the
least upper bound axiom, that if $\{x_n\}$ is bounded, it is convergent. Deduce
also that an increasing bounded sequence is convergent.
2. Suppose that $\lim z_n = 0$ and $\{c_n\}$ is a bounded sequence. Prove that
$\lim_{n \to \infty} c_n z_n = 0$. If $\{z_n\}$ is a convergent sequence and $\{c_n\}$ is a bounded sequence,
is $\{c_n z_n\}$ convergent? or bounded?
3. Deduce from the fact that Cauchy sequences of real numbers
converge, that a Cauchy sequence of complex numbers is convergent.
4. Let $\{z_n\}$ be a sequence of complex numbers and $\{c_n\}$ a sequence of
positive real numbers such that $|z_n| \le c_n$ for all $n \ge N_0$. Prove that if
$\lim c_n = 0$, also $\lim z_n = 0$.
5. Suppose $\lim z_n = z$. Let $\{y_n\}$ be a subsequence of $\{z_n\}$. Then
$\lim y_n = z$ also.
6. Let $\{s_n\}$, $\{x_n\}$, $\{t_n\}$ be three sequences of real numbers. Suppose that
$s_n \le x_n \le t_n$ for all $n$.
(a) Show that if $\lim s_n = \lim t_n = c$, then also $\lim x_n = c$.
(b) Show that if $\lim s_n = c$ and $\lim (t_n - s_n) = 0$, then also $\lim x_n = c$.
7. (a) Let $1 > \delta > h > 0$. Show that there is an integer $K$ such that for
$n \ge K$, $(n/(n + 1))\delta > h$.
(b) Let $1 > h > 0$. Show that $\lim n h^n = 0$.
8. Suppose $\lim z_n = z$, $\lim w_n = w$. Show that
(a) $\lim |z_n| = |z|$.
(b) $\lim (z_n + w_n) = \lim z_n + \lim w_n$.
(c) $\lim z_n w_n = \lim z_n \cdot \lim w_n$.
2.2 Series
A sequence may be formed term-by-term by adding a little bit to each term.
In this case the limit, if it exists, will be an infinite sum. Such sequences are
probably the most important kind, for in practice what we usually know about
a sequence is the difference between two successive terms. This sequence is
given to us as the sequence of sums of these differences.
Let $\{z_n\}$ be a given sequence. The series formed of this sequence is the
sequence of sums
$$z_1,\; z_1 + z_2,\; \ldots,\; z_1 + \cdots + z_n = \sum_{i=1}^{n} z_i,\; \ldots$$
The series converges if the sequence of sums $\sum_{i=1}^{n} z_i$ converges; in this case
the limit is denoted by $\sum_{i=1}^{\infty} z_i$.
Example
14. The geometric series $\sum_{n=0}^{\infty} z^n$. Let $S_N = \sum_{n=0}^{N} z^n$.
Then
$$S_{N+1} = 1 + z + \cdots + z^N + z^{N+1} = S_N + z^{N+1}$$
Notice also that
$$S_{N+1} = 1 + z(1 + z + \cdots + z^N) = 1 + zS_N$$
These two equations give us the general term of the sequence explicitly:
$$1 + zS_N = S_N + z^{N+1}$$
or
$$S_N = \frac{1 - z^{N+1}}{1 - z} \quad (z \ne 1) \tag{2.5}$$
Now, if $|z| < 1$, then $\lim S_N = (1 - z)^{-1}$, for
$$\left|S_N - \frac{1}{1 - z}\right| = \left|\frac{1 - z^{N+1}}{1 - z} - \frac{1}{1 - z}\right| = \frac{|z|^{N+1}}{|1 - z|}$$
and $\lim |z|^N = 0$ (Example 13). So given $\varepsilon > 0$, we find $N_0$ so that
$|z|^{N+1} < |1 - z|\,\varepsilon$ for $N \ge N_0$, and thus
$$\left|S_N - \frac{1}{1 - z}\right| < \varepsilon \quad \text{for } N \ge N_0$$
Notice that we cannot immediately determine whether or not the geometric
series converges for $|z| \ge 1$ (of course, for $z = 1$, $S_N = N + 1$, so the series
diverges). In fact, the geometric series does not converge for $|z| \ge 1$, by
application of the following proposition.
Proposition 3. If $\sum_{n=0}^{\infty} z_n$ exists, then the general term must converge to
zero; that is, $\lim z_n = 0$.
Proof. Let $\{s_n\}$ be the sequence of partial sums. By hypothesis $\lim s_n$ exists.
Let $\varepsilon > 0$ be given. There is an $N$ such that for $n, m \ge N$, $|s_n - s_m| < \varepsilon$. Thus, for
$n \ge N$, $|z_{n+1}| = |s_{n+1} - s_n| < \varepsilon$.
Thus,
$$\sum_{n=0}^{\infty} z^n \begin{cases} = \dfrac{1}{1 - z} & \text{for } |z| < 1 \\ \text{diverges} & \text{for } |z| \ge 1 \end{cases} \tag{2.6}$$
For if $|z| > 1$, the general term has absolute value $|z|^n$. By Example 13, $\lim (1/|z|)^n = 0$, so
$\{|z|^n\}$ gets arbitrarily large and does not converge. If $|z| = 1$, $|z|^n = 1$, and the
sequence $\{1\}$ does not converge to zero.
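The closed form (2.5) and the error estimate for $|z| < 1$ can be compared against a term-by-term summation. The sketch below does so for one sample point in floating-point complex arithmetic; the point $z = 0.5 + 0.25i$, the list of values of $N$ and the function names are arbitrary illustrative choices.

```python
def partial_sum(z, N):
    """S_N = 1 + z + ... + z**N, summed term by term."""
    total, power = 0, 1
    for _ in range(N + 1):
        total += power
        power *= z
    return total

def closed_form(z, N):
    """Formula (2.5); valid for z != 1."""
    return (1 - z ** (N + 1)) / (1 - z)

z = 0.5 + 0.25j          # a sample point with |z| < 1
limit = 1 / (1 - z)
errors = [abs(partial_sum(z, N) - limit) for N in (5, 10, 20, 40)]
bounds = [abs(z) ** (N + 1) / abs(1 - z) for N in (5, 10, 20, 40)]
```

Up to rounding, each entry of `errors` coincides with the matching entry of `bounds`, since the estimate in Example 14 is in fact an equality.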
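By the way, the condition $\lim z_n = 0$ is not sufficient for the convergence
of the series $\sum z_n$, as the following examples show.
Examples
15. Certainly the series $1 + 1 + \cdots + 1 + \cdots$ diverges. But we can
rewrite this as
$$1 + \frac{1}{2} + \frac{1}{2} + \frac{1}{3} + \frac{1}{3} + \frac{1}{3} + \cdots + \underbrace{\frac{1}{n} + \frac{1}{n} + \cdots + \frac{1}{n}}_{n \text{ terms}} + \frac{1}{n + 1} + \cdots$$
Here the general term tends to zero.
16. $\sum_{n=1}^{\infty} 1/n$ diverges. Let $s_N = \sum_{n=1}^{N} 1/n$. Then
$$s_{2N} - s_N = \sum_{n=N+1}^{2N} \frac{1}{n} \ge N \cdot \frac{1}{2N} = \frac{1}{2}$$
Thus, $\{s_N\}$ is not a Cauchy sequence.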
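The failure of the Cauchy criterion in Example 16 rests on the standard estimate that the block of terms from $N + 1$ to $2N$ sums to at least $1/2$. The sketch below checks that estimate exactly for the first few $N$; the range of $N$ is an arbitrary choice.

```python
from fractions import Fraction

def s(N):
    """Partial sum s_N = 1 + 1/2 + ... + 1/N of the harmonic series."""
    return sum(Fraction(1, n) for n in range(1, N + 1))

# s_{2N} - s_N is a sum of N terms, each at least 1/(2N).
gaps = [s(2 * N) - s(N) for N in range(1, 30)]
```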
The sum of a series of positive numbers is particularly easy to work with.
For if $\{c_n\}$ is a sequence of positive numbers, then the sequence of partial sums $\{\sum_{k=1}^{n} c_k\}$
is an increasing sequence, so by Theorem 2.1 (as rewritten in Problem 1) this
sequence converges if and only if it is bounded.
Proposition 4. Let $\{c_n\}$ be a sequence of nonnegative numbers. The
following assertions are equivalent.
(i) $\sum c_k$ converges.
(ii) $\{\sum_{k=1}^{n} c_k\}$ is bounded.
(iii) For each $\varepsilon > 0$, there is an $N$ such that for all $m > N$,
$$\sum_{k=N}^{m} c_k < \varepsilon$$
The proof of the equivalence of (i) and (ii) is essentially given in the
preceding paragraph. Part (iii) is just the Cauchy criterion restated for positive
series (see Problem 11).
Examples
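17. $\sum_{n=1}^{\infty} 1/n!$ converges. For $n! \ge 2^{n-1}$ for all $n$, so
$$\frac{1}{n!} \le \frac{1}{2^{n-1}}$$
and thus for all $N$,
$$\sum_{n=1}^{N} \frac{1}{n!} \le \sum_{n=0}^{N-1} \frac{1}{2^n} < 2$$
by (2.5).
18.
$$\sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n + \cos(1/n)}\right)$$
converges. For
$$\frac{1}{n} - \frac{1}{n + \cos(1/n)} \le \frac{1}{n} - \frac{1}{n + 1}$$
Thus, for all $N$
$$\sum_{n=1}^{N} \left(\frac{1}{n} - \frac{1}{n + \cos(1/n)}\right) \le \sum_{n=1}^{N} \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{2} + \frac{1}{2} - \frac{1}{3} + \cdots + \frac{1}{N} - \frac{1}{N + 1} < 1$$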
There is no such simple criterion as Proposition 4 for arbitrary series of
complex (or real) numbers, and the question of convergence as well as
computation of a limit can become extremely subtle. However, if for a given
series the series formed of the absolute values converges, the situation is
considerably clarified. Ordinarily we shall discuss the convergence of a
series only in the happy circumstance that the corresponding series of absolute
values converges.
Proposition 5. Let $\{c_k\}$ be a sequence of complex numbers. If $\sum |c_n|$
converges, $\sum c_n$ also converges.
Proof. Let $t_n$ be the sequence of partial sums of $\sum |c_n|$ and $s_n$ the partial sums of
$\sum c_n$. Notice, for $m > n$,
$$|s_m - s_n| = \left|\sum_{k=n+1}^{m} c_k\right| \le \sum_{k=n+1}^{m} |c_k| = t_m - t_n$$
Thus, if $\{t_n\}$ is a Cauchy sequence, so also is $\{s_n\}$.
Definition 3. Let $\{c_k\}$ be a sequence of complex numbers. $\sum c_n$ is
absolutely convergent if $\sum |c_n|$ converges. If $\sum |c_n|$ diverges, but $\sum c_n$
converges, we say $\sum c_n$ is conditionally convergent.
There are such things as conditionally convergent series. In fact,
$\sum_{n=1}^{\infty} (-1)^n/n$ converges, but as we have seen in Example 16 the series
$\sum_{n=1}^{\infty} 1/n$ of absolute values is divergent. It is easy to see that $\sum_{n=1}^{\infty} (-1)^n/n$
converges. Let $\{s_n\}$ be the sequence of partial sums. Then the subsequence
$\{s_{2n}\}$ is decreasing, and bounded below by $s_1$, and the subsequence $\{s_{2n+1}\}$
is increasing, and bounded above by $s_2$. Thus, both these subsequences
converge. Since
$$|s_{2n+1} - s_{2n}| = \frac{1}{2n + 1} < \frac{1}{n + 1}$$
they have the same limit. It is easy to deduce that the full sequence also
converges to that common limit. Here is the proof in a more general case
(known as Leibniz's theorem).
Proposition 6. Let $\{c_n\}$ be a decreasing sequence of positive numbers such
that $\lim c_n = 0$. Then $\sum (-1)^n c_n$ converges.
Proof. Let $s_n = \sum_{k=1}^{n} (-1)^k c_k$. We consider the sequences of even and odd
partial sums separately. The sequence $\{s_{2n}\}$ is decreasing, since
$$s_{2(n+1)} - s_{2n} = c_{2n+2} - c_{2n+1} \le 0$$
Similarly, the sequence of odd partial sums $\{s_{2n+1}\}$ is increasing. Furthermore
these sequences are bounded, for, given any $n$,
$$s_1 \le s_{2n+1} = s_{2n} - c_{2n+1} \le s_{2n} \le s_2$$
so $\{s_{2n}\}$ is bounded below by $s_1$ and above by $s_2$. The same is true for the sequence
$\{s_{2n+1}\}$. Thus, by Theorem 2.1 $\lim s_{2n} = s$, $\lim s_{2n+1} = s'$ both exist. Furthermore,
$s' - s = \lim s_{2n+1} - \lim s_{2n} = \lim (s_{2n+1} - s_{2n}) = \lim (-c_{2n+1}) = 0$, so $s' = s$. Since both
sequences, of odd partial sums and even partial sums, converge and have the same
limit, the whole sequence also converges to that limit.
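The monotonicity claims in the proof can be observed directly on the series with $c_k = 1/k$. The following sketch computes the first 40 partial sums exactly and separates the even-indexed from the odd-indexed ones; the cutoff of 40 terms and the function name are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def partial_sums(c, n_max):
    """s_1, ..., s_{n_max} for the series sum (-1)**k * c(k), k = 1, 2, ..."""
    return list(accumulate((-1) ** k * c(k) for k in range(1, n_max + 1)))

s = partial_sums(lambda k: Fraction(1, k), 40)   # c_k = 1/k
even = s[1::2]   # s_2, s_4, ..., s_40 (decreasing)
odd = s[0::2]    # s_1, s_3, ..., s_39 (increasing)
```

The gap `even[-1] - odd[-1]` equals $c_{40} = 1/40$, the quantity that the proof shows tends to zero.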
Notice that this argument does not give any hint as to the value of
$\sum (-1)^n/n$. Outside of the case of Proposition 4, there is no positive
assertion that can be made about conditionally convergent series. In fact, they
tend to behave very badly, as the following illustrative example shows.
Example
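19. The series
$$\frac{1}{2} + \frac{1}{2} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \cdots + \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{2n \text{ terms}} + \cdots$$
is the same as $1 + 1 + 1 + \cdots$ and thus diverges. Since the
general term is decreasing to zero, by Leibniz's theorem this series is
conditionally convergent:
$$\frac{1}{2} - \frac{1}{2} + \frac{1}{4} - \frac{1}{4} + \frac{1}{4} - \frac{1}{4} + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots$$
where the pair $\frac{1}{2n} - \frac{1}{2n}$ occurs $n$ times.
The sequence of partial sums is
$$\frac{1}{2},\, 0,\, \frac{1}{4},\, 0,\, \frac{1}{4},\, 0,\, \ldots,\, \frac{1}{2n},\, 0,\, \ldots$$
and thus obviously converges to zero. However, we may now
rearrange terms of the series so that it no longer converges! Consider
the same series where in each group we first add the positive terms
and then the negative terms:
$$\frac{1}{2} - \frac{1}{2} + \frac{1}{4} + \frac{1}{4} - \frac{1}{4} - \frac{1}{4} + \cdots + \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{n \text{ terms}} - \underbrace{\frac{1}{2n} - \cdots - \frac{1}{2n}}_{n \text{ terms}} + \cdots \tag{2.7}$$
The corresponding sequence of partial sums is
$$\frac{1}{2},\, 0,\, \frac{1}{4},\, \frac{1}{2},\, \frac{1}{4},\, 0,\, \ldots,\, \frac{1}{2n},\, \ldots,\, \frac{n-1}{2n},\, \frac{1}{2},\, \frac{n-1}{2n},\, \ldots,\, \frac{1}{2n},\, 0,\, \ldots$$
Thus, there is a subsequence: $\{\frac{1}{2}, \frac{1}{2}, \ldots\}$ and another: $\{0, 0, \ldots\}$ so
we cannot have convergence of (2.7). We leave to the student
(Exercise 9) to show that it can be further rearranged so that it once
again converges, but this time to one!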
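The effect of the rearrangement in Example 19 shows up clearly when the partial sums are computed. The sketch below builds the first few groups of both series with exact fractions; the group count `G = 8` and the function names are illustrative assumptions.

```python
from fractions import Fraction
from itertools import accumulate

def alternating_group(n):
    """n-th group of the conditionally convergent series: 1/2n - 1/2n, n times."""
    t = Fraction(1, 2 * n)
    return [t, -t] * n

def rearranged_group(n):
    """The same 2n terms, positive ones first: the n-th group of series (2.7)."""
    t = Fraction(1, 2 * n)
    return [t] * n + [-t] * n

def partial_sums(group, groups):
    """Partial sums of the series formed by the first `groups` groups."""
    terms = [x for n in range(1, groups + 1) for x in group(n)]
    return list(accumulate(terms))

G = 8
original = partial_sums(alternating_group, G)
rearranged = partial_sums(rearranged_group, G)
```

In `original` the partial sums inside the last group never exceed $1/(2G)$, whereas `rearranged` returns to both $1/2$ and $0$ once in every group, however large `G` is taken.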
No such foolishness holds for absolutely convergent series. We may
attempt to sum the series in any way we please. If we arrive at a limit, it is
the sum. In fact, if $\sum c_n$ is an absolutely convergent series we may sum first
the positive terms, and then the negative terms; and $\sum c_n$ is the sum of these
two sums. We conclude this section with the proof of these facts.
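Proposition 7. Let $\sum c_n$ be an absolutely convergent series of real numbers.
(i) Let
$$c_k^+ = \begin{cases} c_k & \text{if } c_k \ge 0 \\ 0 & \text{if } c_k < 0 \end{cases} \qquad c_k^- = \begin{cases} -c_k & \text{if } c_k < 0 \\ 0 & \text{if } c_k \ge 0 \end{cases}$$
Then the sums $\sum c_k^+$, $\sum c_k^-$ converge and $\sum c_k = \sum c_k^+ - \sum c_k^-$.
(ii) (Rearrangement.) Let $g$ be a one-to-one mapping of the positive integers
onto the positive integers. Then $\sum c_{g(n)} = \sum c_n$.
(iii) (Regrouping.) Let $h$ be any strictly increasing function from $P$ into $P$, and put $h(0) = 0$.
Let
$$d_n = \sum_{k=h(n-1)+1}^{h(n)} c_k$$
Then $\sum d_n = \sum c_n$.
Proof.
(i) Since the sequence $\left(\sum_{k=1}^{n} |c_k|\right)$ is bounded by absolute convergence, and
$$\sum_{k=1}^{n} |c_k| \ge \sum_{k=1}^{n} c_k^+, \quad \sum_{k=1}^{n} c_k^-$$
the sequences $\sum_{k=1}^{n} c_k^+$, $\sum_{k=1}^{n} c_k^-$ are also increasing and bounded. Thus they
converge to, say, $s$, $t$ respectively, by Theorem 2.1. We have to show that $\sum c_n = s - t$.
Let $\varepsilon > 0$. Then there are $N_1$, $N_2$ such that for $n \ge N_1$,
$$\left|\sum_{k=1}^{n} c_k^+ - s\right| < \frac{\varepsilon}{2}$$
and for $n \ge N_2$,
$$\left|\sum_{k=1}^{n} c_k^- - t\right| < \frac{\varepsilon}{2}$$
Then for $n \ge \max(N_1, N_2)$,
$$\left|\sum_{k=1}^{n} c_k - (s - t)\right| = \left|\sum_{k=1}^{n} c_k^+ - \sum_{k=1}^{n} c_k^- - (s - t)\right| < \varepsilon$$
(ii) Let $g$ be a one-to-one map of $P$ onto $P$. Then $g^{-1}$ is defined and also maps
$P$ onto $P$ in a one-to-one fashion. For each $n$, let $N_n = \max(g(1), \ldots, g(n))$.
Then
$$\sum_{k=1}^{n} |c_{g(k)}| \le \sum_{k=1}^{N_n} |c_k| \le \sum_{k=1}^{\infty} |c_k|$$
for all $n$, so the series $\sum c_{g(k)}$ is absolutely convergent. Similarly,
$$\sum_{k=1}^{n} c_{g(k)}^+ \le \sum_{k=1}^{N_n} c_k^+ \le \sum_{k=1}^{\infty} c_k^+ \quad \text{and} \quad \sum_{k=1}^{n} c_{g(k)}^- \le \sum_{k=1}^{\infty} c_k^-$$
for all $n$, so we have $\sum c_{g(k)}^+ \le \sum c_k^+$, $\sum c_{g(k)}^- \le \sum c_k^-$. Applying the same reasoning
but reversing the roles of the two series, we obtain the reverse inequalities so that
in fact, $\sum c_{g(k)}^+ = \sum c_k^+$ and $\sum c_{g(k)}^- = \sum c_k^-$. Thus, by part (i) we obtain the
desired equality; that is, $\sum c_{g(k)} = \sum c_k$.
Part (iii) is actually true for any convergent series. Let $\sum c_n = c$, and the strictly
increasing function $h$ be given. Notice that $h(n) \ge n$ for all $n$. For $\varepsilon > 0$, there
is an $N$ such that
$$\left|\sum_{k=1}^{n} c_k - c\right| < \varepsilon$$
for all $n \ge N$. Thus, for $n \ge N$,
$$\sum_{k=1}^{n} d_k = \sum_{k=1}^{n} \sum_{j=h(k-1)+1}^{h(k)} c_j = \sum_{j=1}^{h(n)} c_j$$
and $h(n) \ge N$, so that
$$\left|\sum_{k=1}^{n} d_k - c\right| = \left|\sum_{j=1}^{h(n)} c_j - c\right| < \varepsilon$$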
• EXERCISES
7. Show that
$$\sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n + 1}\right)$$
converges.
8. What is
$$\sum \frac{(-1)^n}{a_n}$$
where $a_{2n} = 2^n$, $a_{2n+1} = 3^n$?
9. Rearrange the series
$$\frac{1}{2} - \frac{1}{2} + \frac{1}{4} - \frac{1}{4} + \frac{1}{4} - \frac{1}{4} + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots$$
so it has the sum one.
10. Can the series $\sum (-1)^n/n$ be rearranged so as to have sum 10,000?
• PROBLEMS
9. Suppose $\sum z_n = z$ and $\sum w_n = w$. Show that $\sum (z_n + w_n) = z + w$.
10. Suppose
(a) $\sum z_n$ and $\lim w_n$ exist. Does $\sum z_n w_n$ exist?
(b) $\sum z_n$ and $\sum w_n$ exist. Does $\sum z_n w_n$ exist?
11. Prove that $\sum z_n$ converges if and only if for all $\varepsilon > 0$, there exists an
$N > 0$ such that
$$\left|\sum_{k=N}^{n} z_k\right| < \varepsilon \quad \text{for all } n > N$$
Deduce that Proposition 4(iii) is true.
2.3 Tests for Convergence
Since the theory of series is so important and the definition of convergence
unwieldy, there has developed a large collection of tests (or criteria) for
convergence which are more or less easy to apply in the relevant cases. We
have already given some criteria for convergence.
(1) Cauchy criterion: $\sum c_n$ converges if and only if for every $\varepsilon > 0$, there
is an integer $N$ such that $|c_{n+1} + \cdots + c_m| < \varepsilon$ for all $m > n \ge N$.
(2) If the sequence $\{c_n\}$ decreases to zero, then $\sum (-1)^n c_n$ converges.
(3) If the sequence $\{c_n\}$ is nonnegative, $\sum c_n$ converges if and only if the
sequence $\{\sum_{k=1}^{n} c_k\}$ of partial sums is bounded.
The last condition, which can be considered as a condition for absolute
convergence, gives rise to the following criterion which is the basic one.
The idea is to compare a given series with a known convergent one (if we
suspect that it converges) or to a known divergent one (if we suspect that it
diverges).
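Examples
20. $\sum 1/n!$ converges, as we have seen in Example 17. There, we
noticed that $1/n! \le 2^{-n+1}$, and since $\sum 2^{-n+1}$ is convergent, so is
$\sum 1/n!$. For
$$\sum_{n=1}^{N} \frac{1}{n!} \le \sum_{n=1}^{N} \frac{1}{2^{n-1}} \le \sum_{n=1}^{\infty} \frac{1}{2^{n-1}}$$
21. $\sum_{n=1}^{\infty} \sin(1/n)$
diverges. For if $x > 0$ is small enough, $\sin x > x/2$. Thus, there is an
$N$ such that if $n \ge N$,
$$\sin\left(\frac{1}{n}\right) > \frac{1}{2n}$$
and thus for $m > N$,
$$\sum_{n=1}^{m} \sin\left(\frac{1}{n}\right) = \sum_{n=1}^{N} \sin\left(\frac{1}{n}\right) + \sum_{n=N+1}^{m} \sin\left(\frac{1}{n}\right) \ge \sum_{n=1}^{N} \sin\left(\frac{1}{n}\right) + \frac{1}{2} \sum_{n=N+1}^{m} \frac{1}{n}$$
But we can make the last sum as large as we please by taking $m$ large
enough. Thus, $\sum_{n=1}^{m} \sin(1/n)$ is not bounded, and so the series is not
convergent.
22.
$$\sum_{n=0}^{\infty} \frac{1}{(1 + i)^n}$$
is absolutely convergent. For $|1 + i| = \sqrt{2}$, so for any $m$,
$$\sum_{n=0}^{m} \frac{1}{|1 + i|^n} \le \sum_{n=0}^{\infty} \frac{1}{(\sqrt{2})^n} < \infty \quad \text{since } \sqrt{2} > 1$$
The idea behind these examples is contained in the following theorem.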
Theorem 2.3. (Comparison Test) Let $\{c_n\}$ be a sequence of complex numbers.
If there is a positive number $K$ and an $N$, and a sequence $\{p_n\}$ of positive
numbers such that
(i) $|c_n| \le K p_n$, for $n \ge N$,
(ii) $\sum_{n=1}^{\infty} p_n < \infty$,
then $\sum c_n$ converges absolutely.
If instead, we have
(i)' $|c_n| \ge K p_n$, for $n \ge N$,
(ii)' $\sum_{n=1}^{\infty} p_n = \infty$,
then $\sum |c_n|$ diverges.
Proof. In the first case the sequence of partial sums is bounded.
$$\sum_{k=1}^{n} |c_k| = \sum_{k=1}^{N} |c_k| + \sum_{k=N+1}^{n} |c_k| \le \sum_{k=1}^{N} |c_k| + K \sum_{k=1}^{\infty} p_k < \infty$$
In the second case, the sequence of partial sums is unbounded.
$$\sum_{k=1}^{n} |c_k| = \sum_{k=1}^{N} |c_k| + \sum_{k=N+1}^{n} |c_k| \ge \sum_{k=1}^{N} |c_k| + K \sum_{k=N+1}^{n} p_k$$
which is unbounded as $n \to \infty$.
Examples
23. $\sum_{n=0}^{\infty} z^n/n!$ converges absolutely for any complex $z$. Choose
an integer $N$ so that $N > 2|z|$. Then, for all $n$, $(N + n)! > (2|z|)^n$,
so that
$$\frac{|z|^{N+n}}{(N + n)!} \le \frac{|z|^N}{2^n}$$
Since $\sum 1/2^n$ converges, so does $\sum |z|^n/n!$ by the comparison test. As
a corollary result we obtain $\lim z^n/n! = 0$ for all $z$ (this however could
have been derived directly).
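The comparison used in Example 23 can be checked with exact arithmetic. The sketch below takes $|z| = 5$ (for instance $z = 3 + 4i$), chooses $N = 2|z| + 1$ and verifies the bound for the first 60 values of $n$; the sample modulus, the range and the function name are illustrative assumptions.

```python
from fractions import Fraction
from math import factorial

def term(r, n):
    """r**n / n! for a nonnegative rational r = |z|."""
    return Fraction(r) ** n / factorial(n)

r = 5            # |z| for the sample point z = 3 + 4i
N = 2 * r + 1    # any integer N > 2|z| works
ok = all(term(r, N + n) <= Fraction(r) ** N / 2 ** n for n in range(60))
```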
24. $\sum n^k z^n$ converges absolutely for all $z$, $|z| < 1$ and all integers $k$;
and otherwise diverges. If $|z| \ge 1$, then $\lim n^k z^n \ne 0$, so the series
can hardly converge. Now suppose $|z| < 1$. We want to prove the
convergence by comparison with the geometric series, so we must
account for the effect of the coefficients $n^k$. Note that $(n + 1)/n \to 1$
as $n \to \infty$, thus also $(n + 1)^k/n^k \to 1$ (Exercise 13). Let $s$ be any number
greater than 1. Then there is an $N$ such that for all $n \ge N$,
$$\frac{(n + 1)^k}{n^k} < s \quad \text{or} \quad (n + 1)^k < s n^k$$
Thus, by induction we can conclude that, for all $n > 0$, $(N + n)^k < s^n N^k$.
Thus, $(N + n)^k |z|^{N+n} < (s|z|)^n N^k |z|^N$. We should choose $s < 1/|z|$,
so that $\sum_n (s|z|)^n < \infty$. With the choice then of $s$: $1 < s < 1/|z|$,
we can apply the comparison test to obtain the convergence of our
series $\sum n^k z^n$.
25. $\sum n! z^n$ diverges for all $z \ne 0$. We have seen in Example 23
that for any complex number $c$, $\lim_{n \to \infty} c^n/n! = 0$, or, replacing $c$ by $z^{-1}$,
$\lim_{n \to \infty} 1/(n! z^n) = 0$. This precludes the possibility that $\lim n! z^n = 0$,
so the given series cannot converge.
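26. $\sum 1/n^2$ converges. In a later section we shall give another
proof of this; at present we rely on a tricky observation.
$$\sum_{n=1}^{N} \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{N + 1}$$
Thus, the series
$$\sum_{n=1}^{\infty} \left(\frac{1}{n} - \frac{1}{n + 1}\right)$$
converges to 1. But
$$\frac{1}{n} - \frac{1}{n + 1} = \frac{1}{n(n + 1)}$$
thus
$$\sum_{n=1}^{\infty} \frac{1}{n(n + 1)} = 1$$
Now, $2n^2 \ge n^2 + n = n(n + 1)$, thus
$$\frac{1}{n^2} \le \frac{2}{n(n + 1)}$$
so by comparison $\sum 1/n^2$ also converges.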
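Both steps of Example 26, the telescoping identity and the resulting bound on the partial sums of $\sum 1/n^2$, can be confirmed exactly for small $N$. The function names below are illustrative.

```python
from fractions import Fraction

def telescoping(N):
    """Sum of 1/n - 1/(n+1) for n = 1..N; the text shows it equals 1 - 1/(N+1)."""
    return sum(Fraction(1, n) - Fraction(1, n + 1) for n in range(1, N + 1))

def squares(N):
    """Partial sum of 1/n**2 for n = 1..N."""
    return sum(Fraction(1, n * n) for n in range(1, N + 1))
```

For every $N$ the comparison gives `squares(N) <= 2 * telescoping(N) < 2`, so the partial sums of $\sum 1/n^2$ are bounded by 2.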
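27. $\sum 1/n^{1+\varepsilon}$ converges for any $\varepsilon > 0$. Let $k$ be an integer so
large that $k\varepsilon > 2$. Then, for any $n$, if $m \ge n^k$,
$$\frac{1}{m^{1+\varepsilon}} \le \frac{1}{n^{k(1+\varepsilon)}} \le \frac{1}{n^{k+2}}$$
Between $n^k$ and $(n + 1)^k$ there are $(n + 1)^k - n^k$ integers. Since
$$\lim_{n \to \infty} \frac{(n + 1)^k}{n^k} = 1$$
there is an $n_0$ such that for $n \ge n_0$, $(n + 1)^k < 2n^k$, or $(n + 1)^k - n^k < n^k$.
Thus,
$$\sum_{m=n^k+1}^{(n+1)^k} \frac{1}{m^{1+\varepsilon}} \le \frac{n^k}{n^{k+2}} = \frac{1}{n^2}$$
Well, now we can show that the sequence of partial sums
is bounded, for
$$\sum_{n=1}^{N^k} \frac{1}{n^{1+\varepsilon}} \le \sum_{n=1}^{n_0^k} \frac{1}{n^{1+\varepsilon}} + \sum_{n=n_0^k+1}^{N^k} \frac{1}{n^{1+\varepsilon}} \le n_0^k + \sum_{n=n_0}^{N} \sum_{m=n^k+1}^{(n+1)^k} \frac{1}{m^{1+\varepsilon}} \le n_0^k + \sum_{n=n_0}^{N} \frac{1}{n^2} \le n_0^k + \sum_{n=1}^{\infty} \frac{1}{n^2} < \infty$$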
Now a special kind of series is a power series: the geometric series and
the series in Examples 24 and 25 are such series. A power series is a series
of the form
$$\sum_{n=0}^{\infty} c_n z^n$$
Such a series has the property that if it converges for some $z_0$, then it
converges for all $z$ such that $|z| < |z_0|$, and if it diverges for some $z_1$, then it
diverges for all $z$ such that $|z| > |z_1|$. Thus, the geometric series diverges
for $|z| \ge 1$ and converges for $|z| < 1$; the series $\sum z^n/n!$ converges for all $z$,
and $\sum n! z^n$ converges for no $z \ne 0$. This general property of power series is
easily deduced from the comparison test. We make the following somewhat
stronger statement.
Proposition 8. Let $\{c_n\}$ be a sequence of complex numbers.
(i) If $\{|c_n| t^n\}$ is bounded for some positive number $t$, then $\sum c_n z^n$ converges
absolutely for all $z$, $|z| < t$.
(ii) If $\{|c_n| t^n\}$ is unbounded, then $\sum c_n z^n$ diverges for all $z$, $|z| > t$.
Proof.
(i) Suppose $M \ge |c_n| t^n$ for all $n$. Let $z$ be such that $|z| < t$. Then
$$|c_n z^n| \le |c_n| t^n \left(\frac{|z|}{t}\right)^n \le M \left(\frac{|z|}{t}\right)^n \quad \text{for all } n$$
and since $|z|/t < 1$, $\sum (|z|/t)^n < \infty$, so by the comparison test the series $\sum c_n z^n$
converges absolutely.
(ii) If $\{|c_n| t^n\}$ is unbounded so is $\{c_n z^n\}$ for all $z$, $|z| \ge t$. Thus, we cannot have
$\lim c_n z^n = 0$, so $\sum c_n z^n$ cannot converge.
Definition 4. Let $\{c_n\}$ be a sequence of complex numbers. The power
series associated to $\{c_n\}$ is the series $\sum_{n=0}^{\infty} c_n z^n$. The radius of convergence
of the power series is the least upper bound $R$ of all real numbers $t$ such that
the sequence $\{|c_n| t^n\}$ is bounded.
According to Proposition 8 the series $\sum_{n=0}^{\infty} c_n z^n$ converges for $z$ inside
the disk of radius $R$ ($|z| < R$), and diverges for $z$ outside that disk (see
Problem 12).
Examples
28. $\sum_{n=1}^{\infty} z^n/n$ has radius of convergence one. For if $t > 1$, then
$\{t^n/n\}$ is unbounded, and if $t < 1$, $t^n/n \to 0$. Notice that we can make
no clear assertion for $z$ on the unit circle, since $\sum_{n=1}^{\infty} (1)^n/n$ diverges,
but $\sum_{n=1}^{\infty} (-1)^n/n$ converges.
29. If $\{c_n\}$ is bounded, but does not tend to zero, $\sum_{n=0}^{\infty} c_n z^n$ has
radius of convergence one. For clearly $\{c_n t^n\}$ is bounded for $t \le 1$,
and unbounded for $t > 1$.
There are two final tests of some importance. These are as follows:
Root test. If eventually
$$(|c_n|)^{1/n} \le r \quad \text{for some } r < 1$$
then $\sum c_n$ converges absolutely. If there are infinitely many $n$ such that
$$(|c_n|)^{1/n} \ge R \quad \text{for some } R > 1$$
then $\sum c_n$ diverges.
Ratio test. If there is an $r < 1$ such that eventually
$$\left|\frac{c_{n+1}}{c_n}\right| \le r < 1$$
then $\sum c_n$ converges absolutely. If
$$\left|\frac{c_{n+1}}{c_n}\right| \ge R > 1 \quad \text{for all sufficiently large } n$$
then $\sum c_n$ diverges.
These are both derived by comparison with the geometric series. We leave
it to the student to derive these tests (Problem 13). Let us here indicate why
the convergence assertions are true. Suppose $(|c_n|)^{1/n} \le r < 1$, for $n$ large
enough (say $n \ge N$). Then $|c_n| \le r^n$ eventually, so the partial sums $\sum |c_n|$
are bounded by
$$\sum_{n=0}^{N} |c_n| + \frac{1}{1 - r}$$
by comparison with the geometric series. As for the ratio test, suppose
$$\left|\frac{c_{n+1}}{c_n}\right| \le r \quad \text{for } n \ge N$$
Then we have
$$|c_{N+1}| \le r|c_N|$$
$$|c_{N+2}| \le r|c_{N+1}| \le r^2|c_N|$$
$$|c_{N+3}| \le r^3|c_N|$$
$$|c_{N+k}| \le r^k|c_N|$$
by induction. Thus, $\sum_{n=0}^{\infty} |c_n| \le \sum_{n=0}^{N} |c_n| + |c_N| \sum r^n < \infty$ since $r < 1$.
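The induction behind the ratio test can be traced on a concrete series. The sketch below uses the sample terms $c_n = n/2^n$, for which the ratio of successive terms is $(n + 1)/(2n) \le 3/4$ once $n \ge 2$; the sample series, the constants `r` and `N`, and the finite ranges are assumptions chosen for illustration.

```python
from fractions import Fraction

def c(n):
    """Sample terms c_n = n / 2**n."""
    return Fraction(n, 2 ** n)

r, N = Fraction(3, 4), 2   # c(n+1)/c(n) = (n+1)/(2n) <= 3/4 once n >= 2

ratios_ok = all(c(n + 1) / c(n) <= r for n in range(N, 200))
decay_ok = all(c(N + k) <= r ** k * c(N) for k in range(200))

# Bound in the spirit of the text: the terms up to N, plus |c_N| times the
# geometric tail r + r**2 + ... = r / (1 - r).
bound = sum(c(n) for n in range(N + 1)) + c(N) * r / (1 - r)
partial = sum(c(n) for n in range(400))
```

Here `bound` works out to $5/2$, which dominates every partial sum of the sample series.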
• EXERCISES
11. Which of the following series converge?
(a)
(b)
(c)
(d)
H)·
**$·
Σ,η(1).
2tanG)_SinG)·
(e)
(f)
(g)
(h)
v n5 + 8
^tn'+n*'
v η3 + η2 + η + 1
и4 + η5 + η6 + ηΊ
Σ-·
Τ л', χ > 0.
η!
(i) Σ -f x"> ^ a positive integer, 0 < χ < 1.
G) 2(-υ-sinί. (m) z(L + _i— + —L_).
« Σ(-ΐ)"^-· (η) ς(-—Ц + —Λ
η + 1 \и и + 1 и + 2/
(1) Σ(-ΐ)"—-—·
Л (и + I)2
12. Verify directly that $\lim z^n/n! = 0$ for every $z$.
13. Suppose $\lim c_n = c$. Then for any integer $k$, $\lim c_n^k = c^k$.
(b)
(с)
(d)
(e)
• PROBLEMS
oo 2
n=0 И
ю Ζ"
£, и!
пГо(2и)!
ii-V
14. Find the disk of convergence of the following power series.
(a) \z\ (f) fn!z».
" n = 0
(g) Σ(^·
(h) 20+«Л)г".
(0 Σ ζ-2.
(j) Σ (! + *)"·
12. Let $\{c_n\}$ be a sequence of complex numbers, and let $R$ be the radius
of convergence of the power series $\sum c_n z^n$. Show that $\sum c_n z^n$ converges
absolutely for $|z| < R$, and $\sum c_n z^n$ diverges for $|z| > R$.
13. Derive the convergence and divergence assertions of the root and
ratio tests.
2.4 Convergence in $R^n$
The notion of convergence of a sequence of vectors is easy to conceive,
since a vector in $R^n$ is just an $n$-tuple of real numbers. Thus, a sequence of
vectors is an $n$-tuple of real sequences, and the question of convergence of
the vector sequence is just that of the simultaneous convergence of those $n$
real sequences. We might also directly paraphrase Definition 2 of
convergence, using the notion of distance in $R^n$ discussed in Chapter 1. These two
possible notions are in fact the same.
Definition 5. Let $\{\mathbf{v}_k\}$ be a sequence of vectors in $R^n$. The sequence
converges if there is a vector $\mathbf{v} \in R^n$ such that to every positive number $\varepsilon > 0$
there corresponds an integer $K$ such that $\|\mathbf{v}_k - \mathbf{v}\| < \varepsilon$ for $k \ge K$. We write
$\lim_{k \to \infty} \mathbf{v}_k = \mathbf{v}$ if $\{\mathbf{v}_k\}$ converges to $\mathbf{v}$.
Thus, $\lim \mathbf{v}_k = \mathbf{v}$ means precisely that $\lim \|\mathbf{v}_k - \mathbf{v}\| = 0$; that is, the distance
between the general term $\mathbf{v}_k$ and $\mathbf{v}$ tends to zero as $k$ becomes infinite. When
put this way it sounds like just the notion we have in mind. Recalling that
in Section 2.1 we said that a complex sequence $\{c_k\}$ converges to $c$ precisely
when $|c_k - c| \to 0$, we see that this coincides with the above definition when
$n = 2$. Now, if we write out the sequence $\mathbf{v}_k$ of vectors in $R^n$ as an $n$-tuple
$$\mathbf{v}_k = (v_k^1, \ldots, v_k^n) \tag{2.8}$$
we can view the given sequence as the $n$ real sequences $\{v_k^j\}$, where
$j = 1, \ldots, n$. We now verify the fact mentioned above, that $\mathbf{v}_k \to \mathbf{v}$ precisely
when $v_k^j \to v^j$ for all $j$. Notice that Proposition 2 is that fact in the case
of $R^2$.
Proposition 9. The sequence (2.8) converges to the vector $\mathbf{v} = (v^1, \ldots, v^n)$
if and only if $\lim_{k \to \infty} v_k^j = v^j$ for all $j$.
Proof. If $w = (w^1, \ldots, w^n)$ is a vector in $R^n$, then by definition
$$\|w\| = \Big(\sum_j (w^j)^2\Big)^{1/2}$$
Then, in particular,
$$|v_k^j - v^j| \le \|v_k - v\|, \qquad j = 1, \ldots, n \tag{2.9}$$
Suppose now that $v_k \to v$. Then, given $\varepsilon > 0$, there is a $K$ such that $\|v_k - v\| < \varepsilon$
for $k \ge K$. Thus, by Equation (2.9), for each $j$, $|v_k^j - v^j| < \varepsilon$ for $k \ge K$. But this
means precisely that $\lim_{k\to\infty} v_k^j = v^j$.
Conversely, if $v_k^j \to v^j$ for all $j$, then $(v_k^j - v^j)^2 \to 0$ for all $j$, so
$\big[\sum_j (v_k^j - v^j)^2\big]^{1/2} = \|v_k - v\| \to 0$ as $k \to \infty$. But then, by Definition 5, $v_k \to v$.
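The equivalence in Proposition 9 can be observed numerically. The following is a minimal sketch, not part of the original text: the sequence `v(k)` and its limit are hypothetical choices made only for illustration. It compares the largest componentwise error with the distance $\|v_k - v\|$; by (2.9) the first never exceeds the second, and the second is at most $\sqrt{n}$ times the first.

```python
import math

def norm(w):
    """Euclidean norm of a vector given as a sequence of floats."""
    return math.sqrt(sum(c * c for c in w))

def v(k):
    """Hypothetical sequence in R^3 (an assumption), converging to (1, 0, 2)."""
    return (1 + 1 / k, math.sin(k) / k, 2 - 1 / k**2)

limit = (1.0, 0.0, 2.0)

def errors(k):
    """Return (largest componentwise error, norm of the error) for the k-th term."""
    diff = [a - b for a, b in zip(v(k), limit)]
    return max(abs(d) for d in diff), norm(diff)

for k in (1, 10, 100, 1000):
    comp, dist = errors(k)
    # Both columns shrink together: comp <= dist <= sqrt(3) * comp.
    print(k, comp, dist)
```

Both error columns tend to zero together, which is the content of the proposition for this particular sequence.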
In precisely the same way we can verify that if the sequence of vectors (2.8)
satisfies a Cauchy criterion, so does each of the real sequences $\{v_k^j\}$, and thus
each of them is convergent. Hence, by Proposition 9 the sequence of vectors $\{v_k\}$ also
converges, so we have a Cauchy criterion for vector sequences also. This
fact, as well as some basic algebraic properties of convergence of vectors, is
easily verifiable. Accordingly, we make these assertions, leaving the proofs
to the reader.
Proposition 10. (Cauchy Criterion) Let $\{v_k\}$ be a sequence of vectors in $R^n$.
Suppose to every $\varepsilon > 0$ there corresponds a $K$ such that $\|v_r - v_s\| < \varepsilon$ whenever
both $r, s > K$. Then the sequence $\{v_k\}$ is convergent.
Proposition 11. Suppose $\lim v_k = v$, $\lim w_k = w$, $\lim c_k = c$, where $\{v_k\}$,
$\{w_k\}$ are sequences of vectors in $R^n$, and $\{c_k\}$ is a sequence of real numbers.
Then
(i) $\lim (v_k + w_k) = v + w$,
(ii) $\lim \langle v_k, w_k \rangle = \langle v, w \rangle$,
(iii) $\lim c_k v_k = c v$.
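Example
30. Let us find a point of a given plane in $R^3$ which is closest to
the origin. A plane is given by the equation $\langle x, a \rangle = c$ for fixed
$a$, $c$. Let $m = \text{g.l.b.}\{\|x\| : \langle x, a \rangle = c\}$. Choose a sequence $\{x_n\}$
on the plane such that $\|x_n\| \to m$. We shall show that $\{x_n\}$ actually
converges. Now,
$$\|x_n - x_m\|^2 = \|x_n\|^2 + \|x_m\|^2 - 2\langle x_n, x_m \rangle \tag{2.10}$$
We can estimate the last term by using the fact that the midpoint
$\frac{1}{2}(x_n + x_m)$ between $x_n$ and $x_m$ must also be on the given plane, so
that its norm is at least $m$:
$$m^2 \le \Big\| \frac{1}{2}(x_n + x_m) \Big\|^2 = \frac{\|x_n\|^2}{4} + \frac{\|x_m\|^2}{4} + \frac{1}{2}\langle x_n, x_m \rangle$$
Thus,
$$-2\langle x_n, x_m \rangle \le \|x_n\|^2 + \|x_m\|^2 - 4m^2 \tag{2.11}$$
Combining (2.10) and (2.11), we find that
$$\|x_n - x_m\|^2 \le 2(\|x_n\|^2 + \|x_m\|^2 - 2m^2) \tag{2.12}$$
Now, since $\|x_n\| \to m$, if $\varepsilon > 0$ is given, there is an $n_0$ such that for
$n, m > n_0$, we have $\|x_n\| < m + \varepsilon$, $\|x_m\| < m + \varepsilon$. Inequality (2.12)
then gives
$$\|x_n - x_m\|^2 < 2((m + \varepsilon)^2 + (m + \varepsilon)^2 - 2m^2) = 8m\varepsilon + 4\varepsilon^2 = 4\varepsilon(2m + \varepsilon)$$
This can be made as small as we please by choosing $\varepsilon$ small. Thus if
$n, m$ are large enough, $\|x_n - x_m\|$ is small, so the sequence $\{x_n\}$ is
Cauchy, and thus convergent. If $x = \lim x_n$, then $x$ lies on the plane
(since $\langle x_n, a \rangle \to \langle x, a \rangle$ by Proposition 11) and $\|x\| = \lim \|x_n\| = m$,
so $x$ is the closest point on the plane to the origin.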
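The example proves existence without exhibiting the point. As an illustrative sketch only, the code below uses the closed form $x_0 = (c/\|a\|^2)\,a$, which is a standard fact not derived in the text above, and checks on a hypothetical plane (the vectors `a`, `u`, `w` and the constant `c` are assumptions chosen for the example) that no other sampled point of the plane is nearer the origin.

```python
import math

def dot(u, v):
    """Inner product <u, v> of two vectors of equal length."""
    return sum(p * q for p, q in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def closest_point(a, c):
    """Candidate closest point of the plane <x, a> = c: x0 = (c / |a|^2) a."""
    s = c / dot(a, a)
    return tuple(s * ai for ai in a)

# Hypothetical plane <x, (1, 2, 2)> = 9.
a, c = (1.0, 2.0, 2.0), 9.0
x0 = closest_point(a, c)
m = norm(x0)

def plane_point(s, t):
    """Another point of the plane: x0 + s*u + t*w, with u and w orthogonal to a."""
    u = (2.0, -1.0, 0.0)
    w = (0.0, 1.0, -1.0)
    return tuple(x + s * ui + t * wi for x, ui, wi in zip(x0, u, w))

print(x0, m)
print(norm(plane_point(0.5, -0.25)) >= m)
```

Because the displacement directions are orthogonal to `a`, every point produced by `plane_point` stays on the plane, so the comparison is between genuine competitors.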
Let us pause for a moment to consider the reasons, as illustrated by the
above example, for studying the convergence of vectors. The central
problem of calculus is to find an object, usually considered as a point in a given
collection of points, which has certain specified properties (e.g., the maximum
of a given function, or a zero of a function). At least, the theoretical aspect
of the problem is to prove the existence of a point with such and such
properties. Our technique for doing this is to use the desired properties to
develop a sequence of approximations; our hope is that the approximations
will converge; and that the limit will have the desired properties. It is thus
essential to be able to discuss the question of convergence without already
knowing the limit. Hence, for example, we have the Cauchy criterion.
Further, we will need techniques, or criteria, to apply to the given properties
in order to be able to extract the desired Cauchy sequence of approximations.
For example, we will want to know: (a) If we have a convergent sequence
of points having a property, does the limit have that property? (b) If we
have a sequence of points having a property, does the sequence converge?
or, at least does it have a convergent subsequence? These questions lead
us to the reconsideration of the closed sets introduced in Section 1.11.
Recall that a closed set in $R^n$ is a set whose complement is open. More
precisely, $S$ is closed if and only if corresponding to every $v \notin S$, there is an
$\varepsilon > 0$ such that any vector within $\varepsilon$ of $v$ is also not in $S$. In particular, if $S$
is a closed set, and $v \notin S$, then $v$ cannot be the limit of a sequence of vectors
in $S$. To put it positively, a closed set contains the limits of all convergent
sequences it contains. This is in fact a defining criterion for closedness:
Proposition 12. Let $S$ be a set in $R^n$. The following assertions are
equivalent:
(i) $S$ is closed.
(ii) If $\{v_k\}$ is a convergent sequence contained in $S$, then $\lim v_k \in S$.
Proof. Suppose $S$ is closed. Let $\{v_k\}$ be a sequence contained in $S$ and suppose
it converges to $v$. If $v \notin S$, since $S$ is closed, there is an $\varepsilon > 0$ such that no vector
in $S$ gets within $\varepsilon$ of $v$. This is nonsense since $v$ is the limit of a sequence in $S$.
Thus, we must have $v \in S$.
Suppose now $S$ is not closed. Then there is a $v \notin S$ such that for every $\varepsilon > 0$
there is a vector in $S$ which is within $\varepsilon$ of $v$. In particular, for each $n$, taking
$\varepsilon = 1/n$ there is a $v_n$ such that $\|v_n - v\| \le 1/n$ and $v_n \in S$. Thus, $v_n \to v$, a
convergent sequence in $S$ whose limit is not in $S$, so (ii) does not hold for $S$.
We are now in a position to state our last basic consequence of the
fundamental existence axiom for the real number system. This is that every
bounded sequence in $R^n$ has a convergent subsequence. It is easy to derive
this from the Cauchy criterion, itself an assertion of existence. Let us
illustrate the situation in $R^2$. Suppose $\{c_k\}$ is a sequence of complex numbers
which is bounded; that is, it remains in some fixed square $S_0$ of side length $K$.
Cut that square into four equal squares. At least one of these new squares
has infinitely many of the $\{c_k\}$; let $S_1$ be one such square. Cut $S_1$ into four
equal pieces and let $S_2$ be one of these new squares which has infinitely many
of the $\{c_k\}$; now do the same with $S_2$ and so on (see Figure 2.4). In this
way we obtain a sequence of squares $\{S_n\}$ with the properties:
(i) $S_n \supset S_{n+1}$,
(ii) side length of $S_n$ is $K/2^n$,
(iii) $S_n$ has infinitely many of the $\{c_k\}$.
Now that this is done, we can, for each integer $n$, select a $k(n)$ such that
$c_{k(n)} \in S_n$, and $\{c_{k(n)}\}$ forms a subsequence of $\{c_k\}$. (For this we need to
know that $S_n$ contains infinitely many $\{c_k\}$, so that we can choose $k(n)$ greater
than any previously chosen index.) Now, $\{c_{k(n)}\}$ is a Cauchy sequence.
For let $\varepsilon > 0$, and choose $N$ so that $\varepsilon > K\sqrt{2}/2^N$. Then, if $n, m > N$, we
have $c_{k(n)}, c_{k(m)} \in S_N$, so
$$|c_{k(n)} - c_{k(m)}| \le \Big[\Big(\frac{K}{2^N}\Big)^2 + \Big(\frac{K}{2^N}\Big)^2\Big]^{1/2} = \frac{K\sqrt{2}}{2^N} < \varepsilon$$
Since the sequence $\{c_{k(n)}\}$ is a Cauchy sequence, by Proposition 10 it
converges, and the argument for $R^2$ is concluded. This is the basic idea of the
verification of the following theorem.
Figure 2.4
Theorem 2.4. Every sequence in a closed and bounded set $S$ in $R^n$ has a
subsequence which converges to a point of $S$.
Proof. Suppose that $S$ is closed and bounded and $\{v_k\}$ is a sequence in $S$. We
shall find a Cauchy subsequence. Since the sequence is bounded, it is contained in
some ball $B(0, R)$. This ball can be covered by finitely many balls of radius 1.
Since there are infinitely many terms $v_k$, there is one such ball which contains infinitely many. Call
it $B_1$, and let $v_{k(1)} \in B_1$. $B_1$ can be covered by finitely many balls of radius $1/2$. Let
$B_2$ be one such which contains infinitely many of the $\{v_k\}$ and let $v_{k(2)} \in B_2$ with
$k(2) > k(1)$.
Continuing in this way we obtain a sequence $\{B_n\}$ of balls, a subsequence $\{v_{k(n)}\}$ of
$\{v_k\}$ such that (i) $B_n$ has radius $1/n$, (ii) $v_{k(n)} \in B_n$, (iii) $B_n \supset B_{n+1}$. Then $\{v_{k(n)}\}$ is a
Cauchy sequence, for if $n, m \ge N$, $v_{k(n)}$ and $v_{k(m)} \in B_N$, which has radius $1/N$, so
$$\|v_{k(n)} - v_{k(m)}\| \le \frac{2}{N} \quad \text{for all } n, m \ge N$$
By Proposition 10 there is a $v$ such that $v_{k(n)} \to v$ as $n \to \infty$. Since $S$ is a closed set,
and $\{v_{k(n)}\} \subset S$, we also have $v \in S$, so the theorem is proven.
Example
31. The unit sphere $S = \{x \in R^n : \|x\| = 1\}$ is closed. For if
$x_n \to x$, then certainly $\|x_n\| \to \|x\|$, so if $x_n \in S$, so is $x$. Now suppose
$T$ is a linear transformation of $R^n$ to $R^n$. We want to know if there
is an $x \in S$ at which $\|Tx\|$ is a maximum. First of all, the set of
numbers of the form $\|Tx\|$ with $x \in S$ is bounded. Let $A = (a_j^i)$ be
the matrix representing $T$, and $M = \max |a_j^i|$. Then
$$Tx = T(x^1, \ldots, x^n) = \Big(\sum_j a_j^1 x^j, \ldots, \sum_j a_j^n x^j\Big)$$
so
$$\|Tx\| = \Big[\Big(\sum_j a_j^1 x^j\Big)^2 + \cdots + \Big(\sum_j a_j^n x^j\Big)^2\Big]^{1/2} \le \big[nM^2\|x\|^2 + \cdots + nM^2\|x\|^2\big]^{1/2} = nM\|x\| \tag{2.13}$$
Thus, $nM$ is the desired bound. By the least upper bound axiom then,
$m = \sup\{\|Tx\| : x \in S\}$ exists, and there is a sequence $\{x_n\} \subset S$ such
that $\|Tx_n\| \to m$. According to the above theorem there is a
subsequence $\{y_n\}$ which converges, say to $y$, and $y \in S$ since $S$ is closed.
Since $\|Tx_n\| \to m$, we also have $\|Ty_n\| \to m$, and by (2.13) applied to
$y_n - y$ we have $\|Ty_n - Ty\| \le nM\|y_n - y\| \to 0$, so in fact $\|Ty\| = \lim \|Ty_n\| = m$.
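To see the bound $nM$ and the attained maximum side by side, here is a rough numerical sketch for $n = 2$. It is an illustration under assumptions: the matrix `A` is a hypothetical example, and the maximum over the unit circle is approximated by brute-force sampling of angles rather than by the compactness argument of the text.

```python
import math

# Hypothetical matrix of a linear transformation T of R^2 (an assumption).
A = [[2.0, 1.0],
     [0.0, 1.0]]

def apply(A, x):
    """Matrix-vector product Ax."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def norm(w):
    return math.sqrt(sum(c * c for c in w))

def max_on_unit_circle(A, samples=20000):
    """Approximate max ||Ax|| over ||x|| = 1 by sampling equally spaced angles."""
    best, best_x = -1.0, None
    for i in range(samples):
        t = 2 * math.pi * i / samples
        x = (math.cos(t), math.sin(t))
        value = norm(apply(A, x))
        if value > best:
            best, best_x = value, x
    return best, best_x

n = len(A)
M = max(abs(a) for row in A for a in row)
best, x_best = max_on_unit_circle(A)
print(best, n * M)   # the sampled maximum stays below the bound nM of (2.13)
```

The bound $nM$ is crude; the sampled maximum is typically well below it, which is harmless since the argument only needs some finite bound.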
PROBLEMS
14. Prove Proposition 10.
15. Prove Proposition 11.
16. Let $\Pi$ be a plane in $R^3$, and suppose $x_0$ is the point on $\Pi$ which is
closest to the origin. Show that if $x \in \Pi$, then $x_0$ is orthogonal to $x - x_0$.
(Hint: If not, then one of $x - x_0$, $x + x_0$ is closer to the origin than $x_0$.)
17. Find the point on the plane given by the equation <x, (1, 1, 1)> = 3
which is closest to the origin.
18. Find the point on the plane <x, (1,0, 1)> = 2 which is closest to
-α, ι, υ.
19. Let $L$ be a linear function from $R^n$ to $R^m$. Show that the kernel and
range of $L$ are both closed.
20. Let $L: R^n \to R$ be a linear function. Show that if $\lim x_n = x$, then
also $\lim L(x_n) = L(x)$.
21. Let $v_0$ be a vector in $R^n$, and $\Pi$ the set of $x$ such that $\langle x, v_0 \rangle = c$.
Show that $\Pi$ is closed.
22. Show that for any $v_0 \in R^n$ and $r > 0$,
$$\{v \in R^n : \|v - v_0\| \le r\}$$
is closed.
23. Show that $v_k \to v$ in $R^n$ if and only if
$$\max_j |v_k^j - v^j| \to 0$$
2.5 Continuity
We turn now to the consideration of functions from subsets of $R^n$ to $R^m$.
The basic notion of analysis being that of convergence, the fundamental
class of functions will consist of those which respect convergence; that is,
those which take convergent sequences into convergent sequences. These
are continuous functions.
Definition 6. Let $S$ be a set in $R^n$, and $f$ a function defined on $S$, taking
values in $R^m$. $f$ is continuous on $S$ if whenever $v_k \to v$ with $v_k \in S$ for all $k$ and $v \in S$,
then $f(v_k) \to f(v)$.
We shall be concerned most usually with the local study of a function near
a given point. For this purpose we make this additional definition.
Definition 7. A function $f$ from a set in $R^n$, taking values in $R^m$, will be
said to be continuous at $v_0 \in R^n$ if $f$ is defined in a neighborhood of $v_0$ and
$v \to v_0$ implies $f(v) \to f(v_0)$.
Examples
32. $f: R^n \to R$, $f(v) = \|v\|$ is continuous. For if $v_n \to v$, then
$\|v_n - v\| \to 0$ so that $\|v_n\| \to \|v\|$ since
$$\big|\, \|v_n\| - \|v\| \,\big| \le \|v_n - v\|$$
33. $f: C \to C$, $f(z) = \bar{z}$ is continuous: $z_n \to z$ implies $|\bar{z}_n - \bar{z}|
= |z_n - z| \to 0$, so that also $\bar{z}_n \to \bar{z}$.
34. A linear function on $R^n$ is continuous. Let
$$f(v^1, \ldots, v^n) = \sum_{i=1}^n a_i v^i \tag{2.14}$$
Then, if $v_k \to v$ we have $v_k^1 \to v^1, \ldots, v_k^n \to v^n$, so that $\sum_{i=1}^n a_i v_k^i \to
\sum_{i=1}^n a_i v^i$ since the limit of a sum is the sum of the limits. Thus,
$f(v_k) \to f(v)$.
Roughly, the idea of continuity of a function $f$ is this: as a moving point
$p$ gets close to $p_0$, the value $f(p)$ of $f$ at $p$ gets close to $f(p_0)$. That is, we can
ensure that $f(p)$ is as close as we please to $f(p_0)$ by choosing $p$ sufficiently
close to $p_0$. This leads to the so-called "$\varepsilon$-$\delta$" criterion for continuity,
which we now give.
Proposition 13. Let $S$ be a subset of $R^n$, and let $f$ be an $R^m$-valued function
defined on $S$.
(i) Let $x_0 \in S$. $f$ is continuous at $x_0$ if and only if, to every $\varepsilon > 0$, there
corresponds a $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|f(x) - f(x_0)\| < \varepsilon$.
(ii) If $S$ is open, $f$ is continuous on $S$ if and only if $f$ is continuous at every
point of $S$.
Proof. (i) Supposing first that the $\varepsilon$-$\delta$ criterion is true, we shall show that $f$
is continuous at $x_0$. Let $x_n \to x_0$. We have to show $f(x_n) \to f(x_0)$. Given $\varepsilon > 0$,
there is a $\delta > 0$ such that whenever $x$ is within $\delta$ of $x_0$ we have $\|f(x) - f(x_0)\| < \varepsilon$.
Since $x_n \to x_0$, there is an $N$ such that $n \ge N$ implies $\|x_n - x_0\| < \delta$. Thus, for
$n \ge N$, $\|f(x_n) - f(x_0)\| < \varepsilon$, as desired.
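Conversely, if the $\varepsilon$-$\delta$ criterion is false, then there is an $\varepsilon_0 > 0$ such that for every
$\delta > 0$ there is an $x_\delta$ for which $\|x_\delta - x_0\| < \delta$ but $\|f(x_\delta) - f(x_0)\| \ge \varepsilon_0$. Selecting
$$\delta = 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, \tfrac{1}{n}, \ldots$$
we obtain the corresponding sequence $x_1, x_{1/2}, \ldots, x_{1/n}, \ldots$, which converges to $x_0$.
But $f(x_{1/n}) \not\to f(x_0)$ since the $f(x_{1/n})$ are always outside the ball of radius $\varepsilon_0$ centered
at $f(x_0)$.
Part (ii) is left as an exercise.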
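Examples
35. $f: R^2 \to R$ defined by
$$f(x, y) = \frac{5x}{1 + y^2}$$
is continuous at $(0, 0)$. For
$$\Big| \frac{5x}{1 + y^2} \Big| \le 5|x| \le 5\|(x, y)\|$$
Thus, if $\varepsilon$ is given we can choose $\delta = \varepsilon/5$. Then $\|(x, y)\| < \delta$ implies
$$\Big| \frac{5x}{1 + y^2} \Big| < 5\delta = \varepsilon$$
36.
$$f(x, y, z) = \frac{y^3 z}{1 + x^2 + z^2}$$
is continuous at $(0, 0, 0)$. We have
$$|f(x, y, z) - f(0, 0, 0)| = \Big| \frac{y^3 z}{1 + x^2 + z^2} \Big| \le |y^3 z| \le \|(x, y, z)\|^4$$
Thus for each $\varepsilon > 0$ choose $\delta = \varepsilon^{1/4}$. Then $\|(x, y, z)\| < \delta$ implies
$|f(x, y, z)| < \delta^4 = \varepsilon$.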
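The choice $\delta = \varepsilon/5$ in Example 35 can be spot-checked by sampling. The sketch below is only a sanity check, not a proof: it draws random points of the open disc of radius $\delta$ (the sampling scheme, trial count, and seed are arbitrary assumptions) and confirms that the function values stay within $\varepsilon$ of $f(0, 0) = 0$.

```python
import math
import random

def f(x, y):
    """The function of Example 35."""
    return 5 * x / (1 + y * y)

def delta_for(eps):
    """The delta chosen in the text for a given epsilon."""
    return eps / 5

def check(eps, trials=10000, seed=0):
    """Sample points with ||(x, y)|| < delta and test |f(x, y) - f(0, 0)| < eps."""
    rng = random.Random(seed)
    d = delta_for(eps)
    for _ in range(trials):
        r = d * rng.random()              # radius in [0, d)
        t = 2 * math.pi * rng.random()    # angle
        x, y = r * math.cos(t), r * math.sin(t)
        if abs(f(x, y) - f(0.0, 0.0)) >= eps:
            return False
    return True

print(all(check(eps) for eps in (1.0, 0.1, 1e-3)))
```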
37.
Я*, у) = (4^r (*>зО # (0, o) /(o, o) = о
This function is not continuous, since
/(*'*) = 2? = 2+>°
If we redefine /(0, 0) = 2, this new function is still not continuous,
since
яаз0 = т = 1+*2
38. We can easily verify the continuity of the linear function (2.14)
by the $\varepsilon$-$\delta$ criterion. For
$$|f(v) - f(w)| = \Big| \sum a_i (v^i - w^i) \Big| \le \|(a_1, \ldots, a_n)\| \, \|v - w\|$$
by Schwarz's inequality. Thus, if $\varepsilon > 0$ is given, we can take
$\delta = \|(a_1, \ldots, a_n)\|^{-1} \varepsilon$. Then $\|v - v_0\| < \delta$ implies $|f(v) - f(v_0)| < \varepsilon$.
The facts concerning convergence discussed in previous sections have
application to the study of continuity, as might be expected. In particular,
the assertion that every sequence in a closed bounded set has a convergent
subsequence has profound significance for the behavior of continuous
functions. Here is an important illustration.
Proposition 14. (Intermediate Value Theorem) Let $f$ be a continuous
function on the interval $\{x \in R : a \le x \le b\}$, and suppose that $f(a) < \gamma < f(b)$.
Then there is a $c$ in the interval such that $f(c) = \gamma$.
Proof. We seek (as in Figure 2.5) not just a point at which the value of $f$ is $\gamma$,
but more precisely the first such point $c$. We must find a way to describe this
point which permits us to use the existence theorem. If $x < c$ we must have
$f(x) < \gamma$, otherwise the graph of $f$ crosses the line $y = \gamma$ between $a$ and $c$. Thus, $c$
is a lower bound for the set of $x$ such that $f(x) \ge \gamma$. Since $c$ is in that set, it must
be the greatest such lower bound. So if there exists a first $c$ at which $f(c) = \gamma$, it is
the greatest lower bound of $\{x \in R : a \le x \le b, f(x) \ge \gamma\}$. We now show that this
point (which exists by the least upper bound property) is the desired $c$.
Let $c = \text{g.l.b.}\{x : a \le x \le b, f(x) \ge \gamma\}$. Then $c$ is a limit of a sequence $\{x_n\}$ in this
set. Since $\gamma \le f(x_n)$ we must also have $\gamma \le \lim f(x_n) = f(c)$ since $f$ is continuous.
Now if $f(c) \ne \gamma$ we must have $f(c) > \gamma$. Again, by continuity, there is a $\delta$ such
that if $|x - c| < \delta$, then
$$|f(x) - f(c)| < f(c) - \gamma$$
from which it follows that for all $x$ between $c$ and $c - \delta$, $f(x) > \gamma$. Thus,
$f(c - \delta) \ge \gamma$, contradicting the definition of $c$ as a lower bound for the set of $x$
with $f(x) \ge \gamma$. Hence $f(c) > \gamma$ is impossible, so we must have $f(c) = \gamma$.
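The proof above locates $c$ as a greatest lower bound, which is not directly computable. A common numerical counterpart, offered here only as a sketch and using a different technique (bisection) from the one in the proof, repeatedly halves an interval on which $f$ passes from below $\gamma$ to at least $\gamma$. The function name `bisect_value`, the tolerance, and the sample function are assumptions for illustration; bisection finds some point where $f = \gamma$, not necessarily the first one.

```python
def bisect_value(f, a, b, gamma, tol=1e-12):
    """Approximate a point c in [a, b] with f(c) = gamma.

    Assumes f is continuous on [a, b] and f(a) < gamma < f(b).
    The invariant f(lo) < gamma <= f(hi) is kept at every step.
    """
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# Hypothetical example: solve x^3 + x = 5 on [0, 2].
c = bisect_value(lambda x: x**3 + x, 0.0, 2.0, 5.0)
print(c, c**3 + c)
```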
Now, the most important fact about continuous real-valued functions is
that they are bounded on closed and bounded sets. This follows easily
from Theorem 2.4. If, say, $f$ is continuous and not bounded above on the
set $S$, then, for every positive integer $n$, there is an $x_n \in S$ such that $f(x_n) > n$.
If $S$ is closed and bounded, $\{x_n\}$ has a convergent subsequence $\{x_{n(k)}\}$. Let
$\lim x_{n(k)} = x_0$. Since $f$ is continuous, $f(x_0) = \lim f(x_{n(k)}) \ge \lim n(k)$. But
$n(k) \to \infty$ as $k \to \infty$, so this is impossible. Thus $f$ is bounded on $S$. What is
more, it attains its least upper bound. For if $m$ is this least upper bound,
but is not a value of $f$, then $g(x) = (f(x) - m)^{-1}$ is a continuous unbounded function
on $S$, again a contradiction. To conclude: if $f$ is a continuous real-valued
function on a closed and bounded set $S$ in $R^n$, then there are $x_1, x_2 \in S$
such that
$$f(x_1) = \sup\{f(x) : x \in S\}$$
$$f(x_2) = \inf\{f(x) : x \in S\}$$
Figure 2.5
Here are the proofs in a slightly more general context.
Theorem 2.5. Let $f$ be a continuous $R^m$-valued function on the closed and
bounded set $S$ in $R^n$. Then the set of values of $f$ on $S$,
$$f(S) = \{f(x) : x \in S\}$$
is closed and bounded.
Proof. First, $f(S)$ is closed. Suppose $y_n \in f(S)$ and $y_n \to y \in R^m$. We must
show that $y \in f(S)$. But this is easy. Since $y_n \in f(S)$, there is for each $n$ an $x_n \in S$
such that $f(x_n) = y_n$. Since $S$ is closed and bounded there is a subsequence $\{z_k\}$ of
$\{x_n\}$ which converges, $z_k \to z \in S$. Since $f$ is continuous, $f(z_k) \to f(z)$. On the other
hand, $\{f(z_k)\}$ is a subsequence of $\{y_n\}$, so $f(z_k) \to y$. Thus $f(z) = \lim f(z_k) = y$ and
$y \in f(S)$.
If $f(S)$ is not bounded, there is for each $n$ an $x_n \in S$ such that $\|f(x_n)\| \ge n$. But
$\{x_n\}$ has a convergent subsequence $\{z_k\}$. Let $\lim z_k = z$. Then $\lim f(z_k) = f(z)$.
But $\{f(z_k)\}$ is a subsequence of $\{f(x_n)\}$, so $\|f(z_k)\| \to \infty$, which is impossible since
$\{f(z_k)\}$ is convergent. Thus, $f(S)$ must be bounded.
In particular, suppose $f$ is a continuous real-valued function defined on the closed and
bounded set $S$. Then $f(S)$ is bounded, so $M = \sup\{t : t \in f(S)\}$ exists, and
since $f(S)$ is closed, $M \in f(S)$. Thus there is an $x_1 \in S$ such that
$$f(x_1) = \sup\{f(x) : x \in S\}$$
Similarly, there is an $x_2$ such that $f(x_2) = \inf\{f(x) : x \in S\}$. This basic fact
we state as
Theorem 2.6. A continuous function attains its maximum and minimum
on a closed bounded set.
• PROBLEMS
24. Let $x_0 \in R^n$. Show that $f(x) = \langle x, x_0 \rangle$ is continuous on $R^n$.
25. Show that a linear function $L: R^n \to R^m$ is continuous.
26. Prove part (ii) of Proposition 13.
27. Show that if $f$ is a continuous real-valued function on a closed and
bounded set $S$, there is an $x_2 \in S$ such that $f(x_2) = \text{g.l.b.}\{f(x) : x \in S\}$.
28. Suppose that $f$, $g$ are $R^m$-valued functions continuous at $p_0 \in R^n$.
Show that $f + g$ and $\langle f, g \rangle$ are also continuous at $p_0$. If $c \in R$, then also
$cf$ is continuous at $p_0$.
2.6 Calculus of One Variable
Theorem 2.6, which asserts that a continuous function attains its maximum
and minimum on a closed and bounded set, is the fundamental theoretical
tool of the calculus. We shall now give a brief review of the fundamentals
of calculus, leaving the recollection of techniques to the student's memory.
We shall give brief justifications of some of the more basic or special facts.
First of all, we studied in the calculus a limit concept which was more
general than the sequential limit we have been studying. We recall the
definition.
Definition 8. Suppose $f$ is a real-valued function defined in a set
$$\{x : 0 < |x - x_0| < \delta_0\}$$
We say $\lim_{x \to x_0} f(x) = L$ if and only if, for all $\varepsilon > 0$, there is $\delta > 0$ such that
$0 < |x - x_0| < \delta$ implies $|f(x) - L| < \varepsilon$.
The relationship between the two concepts of limit is an easy
one: $\lim_{x \to x_0} f(x) = L$ if and only if for every sequence $x_n$ converging to $x_0$, we
have $\lim_{n \to \infty} f(x_n) = L$. We can thus rephrase the notion of continuity using
Definition 8: $f$ is continuous at $x_0$ if and only if $\lim_{x \to x_0} f(x) = f(x_0)$.
Proposition 15.
(i) Suppose $f$ is defined in $I = \{x : 0 < |x - x_0| < \delta\}$. Then $\lim_{x \to x_0} f(x) = L$
if and only if for every sequence $\{x_n\}$ in $I$ such that $x_n \to x_0$ we have
$$\lim f(x_n) = L$$
(ii) If $f$ is also defined at $x_0$, $f$ is continuous at $x_0$ if and only if
$$\lim_{x \to x_0} f(x) = f(x_0)$$
Proof. We will prove only (i). The proof of (ii) is the same and is left as a
problem. Suppose first that $\lim_{x \to x_0} f(x) = L$. Let $\{x_n\}$ be a sequence in $I$ such that
$x_n \to x_0$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that $|f(x) - L| < \varepsilon$ for any $x$ such
that $0 < |x - x_0| < \delta$. Now since $x_n \to x_0$, there is an $N$ such that for $n \ge N$,
$|x_n - x_0| < \delta$. Thus if $n \ge N$, $|f(x_n) - L| < \varepsilon$. Thus, $f(x_n) \to L$.
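Now suppose $\lim_{x \to x_0} f(x) = L$ is false. Then there is an $\varepsilon_0 > 0$ such that for every $\delta$
we can find an $x_\delta$ in $I$ such that $|x_\delta - x_0| < \delta$ but $|f(x_\delta) - L| \ge \varepsilon_0$. Consider the sequence
$\{c_n\}$ of $x_\delta$'s for $\delta = 1, \frac{1}{2}, \ldots, \frac{1}{n}, \ldots$. Then $|c_n - x_0| < 1/n$, so certainly $c_n \to x_0$.
But $f(c_n)$ is always outside the interval of radius $\varepsilon_0$ and center $L$, so it cannot converge
to $L$.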
Definition 9. Let $f$ be a real-valued function defined in an interval about
$x_0 \in R$. $f$ is differentiable at $x_0$ if the limit
$$\lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t}$$
exists. If it does the limit is called the derivative of $f$ at $x_0$ and is denoted
$$f'(x_0) \quad \text{or} \quad \frac{df}{dx}(x_0)$$
If $f$ is differentiable in an interval $I$ and the derivative $f'$ is also differentiable
there, then $f$ is said to be twice differentiable on $I$ and $(f')'$ is the second
derivative of $f$ and is denoted by
$$f'' \quad \text{or} \quad \frac{d^2 f}{dx^2}$$
The higher derivatives $f''', \ldots, f^{(n)}, \ldots$ are defined successively in the obvious
manner. A function which has derivatives of all orders on the interval will
be said to be infinitely differentiable there. If $f$, $g$ are $n$-times differentiable
on $I$, so are $f + g$, $fg$, and $cf$ for $c$ a real number. If $f$ is differentiable in an
interval $I$ it is continuous there. If $f$ is differentiable at a point $x_0$ where it
attains a local maximum (or minimum), then $f'(x_0) = 0$. This, together
with Theorem 2.6, gives this basic existence theorem.
Theorem 2.7. (Mean Value Theorem) Let $f$ be differentiable on the closed
interval $[a, b]$. There is a point $\xi \in (a, b)$ such that
$$f'(\xi) = \frac{f(b) - f(a)}{b - a} \tag{2.15}$$
Proof. This theorem has a nice geometric interpretation (Figure 2.6). There is a
point $(\xi, f(\xi))$ on the graph $y = f(x)$ at which the tangent line is parallel to the line
through $(a, f(a))$ and $(b, f(b))$. Clearly (see Problem 30), we need only verify this
when the latter line is horizontal, that is, $f(b) = f(a)$. In this case, let $\xi_0 \in [a, b]$,
$\xi_1 \in [a, b]$ be the points at which $f$ attains its maximum and minimum respectively
on the interval (Figure 2.7). If either $\xi_0$ or $\xi_1$ is interior, then $f$ has a local maximum
or minimum there, so $f'(\xi) = 0$ for the appropriate $\xi$. If this is false, then $\xi_0$, $\xi_1$ are the endpoints
$a$, $b$, so $f(a) = f(b)$ is at once the maximum and minimum of $f$. Thus, $f$ is constant
on $[a, b]$, so $f'$ is identically zero and we can choose any point for our $\xi$.
Figure 2.6
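The mean value theorem asserts only that $\xi$ exists. For a concrete function it can be located numerically; the following is a sketch under assumptions, not the method of the proof. The helper name `mean_value_point` is hypothetical, the derivative is supplied explicitly, and bisection requires that $f'(x)$ minus the chord slope change sign between the endpoints, which holds for the sample $f(x) = x^3$ on $[0, 2]$ but not for every differentiable $f$.

```python
import math

def mean_value_point(f, df, a, b, tol=1e-12):
    """Find xi in (a, b) with df(xi) = (f(b) - f(a)) / (b - a) by bisection.

    Assumes df is continuous and df - slope has opposite signs at a and b.
    """
    slope = (f(b) - f(a)) / (b - a)
    lo, hi = a, b
    if (df(lo) - slope) * (df(hi) - slope) > 0:
        raise ValueError("no sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if (df(lo) - slope) * (df(mid) - slope) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

# Hypothetical example: f(x) = x^3 on [0, 2]; the chord has slope 4.
xi = mean_value_point(lambda x: x**3, lambda x: 3 * x * x, 0.0, 2.0)
print(xi, 3 * xi * xi)
```

For this sample the exact answer is $\xi = 2/\sqrt{3}$, since $3\xi^2 = 4$.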
Figure 2.7
Now suppose that $f$ is a differentiable function defined on the interval
$[a, b]$, and $g$ is a function defined on the range of $f$, and differentiable there.
Then the composed function $h = g \circ f$, defined by
$$h(x) = g(f(x))$$
is also differentiable on $[a, b]$. For if $x_0 \in [a, b]$, then
$$\frac{h(x) - h(x_0)}{x - x_0} = \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \frac{f(x) - f(x_0)}{x - x_0} \tag{2.16}$$
Taking the limit on both sides, we have (since $x \to x_0$ implies $f(x) \to f(x_0)$),
$$\lim_{x \to x_0} \frac{h(x) - h(x_0)}{x - x_0} = \lim_{f(x) \to f(x_0)} \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0}$$
The limits on the right exist since $f$ is differentiable at $x_0$, and $g$ is differentiable
at $f(x_0)$, so the limit on the left exists. Thus $h$ is differentiable and we obtain
the chain rule:
$$h'(x_0) = (g \circ f)'(x_0) = g'(f(x_0)) f'(x_0)$$
(Notice that if $f(x) = f(x_0)$ for some $x \ne x_0$, then (2.16) is invalid and the proof breaks down.
However, that case can be treated separately.)
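The chain rule can be checked against a difference quotient. This is a minimal numerical sketch, with the functions `f` and `g` and the point `x0` chosen arbitrarily for illustration; the quotient only approximates the derivative, so agreement is up to an error that shrinks with the step `t`.

```python
import math

def diff_quotient(F, x0, t):
    """Difference quotient (F(x0 + t) - F(x0)) / t from Definition 9."""
    return (F(x0 + t) - F(x0)) / t

# Hypothetical choices (assumptions): f = sin, g(u) = u^2 + e^u.
f = math.sin
df = math.cos
g = lambda u: u**2 + math.exp(u)
dg = lambda u: 2 * u + math.exp(u)
h = lambda x: g(f(x))

x0 = 0.7
exact = dg(f(x0)) * df(x0)   # chain rule: g'(f(x0)) f'(x0)
for t in (1e-2, 1e-4, 1e-6):
    print(t, diff_quotient(h, x0, t), exact)
```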
If $f$ is a function from the interval $[a, b]$ to the interval $[\alpha, \beta]$ and there
exists a function $g: [\alpha, \beta] \to [a, b]$ such that
$$g \circ f(x) = x \quad \text{for all } x \in [a, b]$$
$$f \circ g(y) = y \quad \text{for all } y \in [\alpha, \beta]$$
we say that $f$ is invertible and $g$ is its inverse. The mean value theorem gives
us a condition under which a differentiable function is invertible. If a
function $f$ has an inverse, it must be one-to-one. From (2.15) we see that
this will be guaranteed if $f'$ is never zero. This is a sufficient condition
for the invertibility of $f$.
Theorem 2.8. Suppose that $f$ is a continuously differentiable function defined
on the interval $[a, b]$, and $f'$ is never zero. Let $f(a) = \alpha$ and $f(b) = \beta$. There
is a continuously differentiable function $g$ defined on the interval between
$\alpha$ and $\beta$ such that
$$g(f(x)) = x \quad \text{and} \quad g'(f(x)) = \frac{1}{f'(x)} \quad \text{for all } x$$
Proof. $f$ is one-to-one. For if $a \le a_1 < b_1 \le b$, there is, by the mean value
theorem, a $\xi$ between $a_1$ and $b_1$ such that
$$f(b_1) - f(a_1) = f'(\xi)(b_1 - a_1) \ne 0$$
by hypothesis. Thus $f(b_1) \ne f(a_1)$. By the intermediate value theorem every $y$
between $\alpha$ and $\beta$ is attained by $f$. Now we can define $g$ as follows: let $g(y)$ be that
$x$ such that $f(x) = y$. Clearly, $g(f(x)) = x$ and $f(g(y)) = y$. Now $g$ is differentiable:
writing $y = f(x)$ and $y_0 = f(x_0)$,
$$\lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} = \lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)} = \lim_{x \to x_0} \frac{1}{[f(x) - f(x_0)]/(x - x_0)} = \frac{1}{f'(x_0)}$$
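The formula $g'(f(x)) = 1/f'(x)$ can be tested numerically. In the sketch below, which is an illustration under assumptions, the sample function $f(x) = x^3 + x$ on $[0, 2]$ has $f' > 0$, its inverse `g` is computed by bisection (a convenience, not the construction in the proof), and a central difference quotient of `g` is compared with $1/f'(x_0)$.

```python
def f(x):
    """Hypothetical sample function with f'(x) = 3x^2 + 1 > 0."""
    return x**3 + x

def df(x):
    return 3 * x * x + 1

def g(y, a=0.0, b=2.0, tol=1e-13):
    """Inverse of f on [a, b], computed by bisection (f is increasing there)."""
    lo, hi = a, b
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x0 = 1.0
y0 = f(x0)
t = 1e-5
approx = (g(y0 + t) - g(y0 - t)) / (2 * t)   # central difference quotient of g at y0
exact = 1 / df(x0)                           # value predicted by Theorem 2.8
print(approx, exact)
```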
A further fundamental fact to be drawn from the mean value theorem is
this: A function is determined, up to a constant, by its derivative.
Theorem 2.9. Suppose that $f$, $g$ are differentiable on the interval $[a, b]$
and that $f'(x) = g'(x)$ for all $x \in [a, b]$. Then there is a constant $C$ such that
$f(x) = g(x) + C$.
Proof. Let $h = f - g$. By hypothesis $h'(x) = 0$ for all $x \in [a, b]$. By the mean
value theorem, for any $c \in (a, b]$, there is a $\xi$, $a < \xi < c$, such that
$$h'(\xi) = \frac{h(c) - h(a)}{c - a}$$
But $h'(\xi) = 0$, so $h(c) = h(a)$. This holds for all $c \in [a, b]$, so $h$ is constant and thus $f$
differs from $g$ by a constant, as desired.
Now, given any real-valued function $f$ defined on an interval $I$, we consider
those differentiable functions $F$ defined on $I$ such that $F' = f$. By Theorem
2.9, any two such functions differ by a constant; thus by specifying the value
of such an $F$ at any point it is completely determined. We denote by $\int_a^x f =
F(x)$ that function (if it exists) such that $F(a) = 0$ and $F'(x) = f(x)$ for
all $x \in [a, b]$. $\int_a^x f$ is called the indefinite integral of $f$. Every continuous
function has an indefinite integral, which is given by the process of Riemann
integration which we now describe.
Let $f$ be a bounded function defined on the interval $I$. A partition $P$ of $I$
consists of an increasing sequence of points $a_0 < a_1 < \cdots < a_n$ such that
$I = [a_0, a_n]$. We now construct two sums, corresponding to the
approximations to the area under the graph of $f$ given in Figure 2.8:
$$\Sigma(P, f) = \sum_{i=1}^n M_i (a_i - a_{i-1})$$
$$\sigma(P, f) = \sum_{i=1}^n m_i (a_i - a_{i-1})$$
where $M_i$, $m_i$ are the least upper bound and greatest lower bound of the values of $f$
on the interval $[a_{i-1}, a_i]$ (the maximum and minimum values when $f$ is continuous).
Figure 2.8
Definition 10. Let $f$ be a bounded real-valued function defined on the
interval $I$. $f$ is Riemann integrable if
$$\inf_P \Sigma(P, f) = \sup_P \sigma(P, f) \tag{2.17}$$
(i.e., if we can find partitions for which the two sums $\Sigma$ and $\sigma$ are as close as
we please). In this case the common value is called the definite integral of $f$ over
the interval $I$, and denoted $\int_I f$.
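The two sums of Definition 10 are easy to compute for a concrete function. The sketch below is illustrative only: it uses uniform partitions, and it approximates the largest and smallest values on each subinterval by sampling a few points including the endpoints, which is exact for monotone functions such as the sample $f(x) = x^2$ on $[0, 1]$ but is an assumption in general. The function name `upper_lower_sums` and the parameter `probes` are hypothetical.

```python
def upper_lower_sums(f, a, b, n, probes=4):
    """Return (upper sum, lower sum) of f over a uniform partition of [a, b] into n pieces.

    The extreme values on each subinterval are estimated from probes + 1 sample
    points (endpoints included); for monotone f this gives the exact sums.
    """
    pts = [a + (b - a) * i / n for i in range(n + 1)]
    upper = lower = 0.0
    for left, right in zip(pts, pts[1:]):
        vals = [f(left + (right - left) * j / probes) for j in range(probes + 1)]
        upper += max(vals) * (right - left)
        lower += min(vals) * (right - left)
    return upper, lower

for n in (10, 100, 1000):
    U, L = upper_lower_sums(lambda x: x * x, 0.0, 1.0, n)
    # The gap U - L equals 1/n here, so the two sums squeeze the value 1/3.
    print(n, L, U, U - L)
```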
If $f$, $g$ are integrable on the interval $I$, then so are $f + g$ and $cf$, $c \in R$. Further,
$\int_I (f + g) = \int_I f + \int_I g$, $\int_I cf = c \int_I f$. If $f$ is integrable on the interval $I$,
then $f$ is integrable on every interval $J \subset I$. If $f$ is integrable on the intervals
$[a, b]$ and $[b, c]$ with $a < b < c$, then $f$ is integrable on $[a, c]$ and
$$\int_{[a, c]} f = \int_{[a, b]} f + \int_{[b, c]} f$$
Furthermore, if $f \ge g$ and both functions are integrable, then $\int_I f \ge \int_I g$.
Finally, if $f$ is integrable on $[a, b]$, then
$$F(x) = \int_{[a, x]} f$$
is a continuous function of $x$.
The fundamental theorem of calculus says more: if $f$ is continuous on
$[a, b]$, then $\int_{[a, b]} f = \int_a^b f$; that is, the definite and the indefinite integrals of
$f$ coincide. The proof of this is actually quite easy to describe. Define
these functions on the interval $[a, b]$, corresponding to the two sides of
Equation (2.17):
$$\overline{F}(x) = \inf\{\Sigma(P, f) : P \text{ a partition of } [a, x]\}$$
$$\underline{F}(x) = \sup\{\sigma(P, f) : P \text{ a partition of } [a, x]\}$$
To prove that $f$ is Riemann integrable on $[a, b]$ is to prove $\overline{F}(b) = \underline{F}(b)$. We
show, using Theorem 2.9, that in fact $\overline{F}(x) = \underline{F}(x)$ for all $x \in [a, b]$.
First of all $\overline{F}$ is differentiable in $[a, b]$. Let $x \in [a, b]$ and $h > 0$; then
$$\overline{F}(x + h) \le \overline{F}(x) + Mh \tag{2.18}$$
$$\overline{F}(x + h) \ge \overline{F}(x) + mh \tag{2.19}$$
where $M$, $m$ are the maximum and minimum of $f$ in the interval $[x, x + h]$.
These inequalities can be routinely verified (see Problem 32); Figure 2.9 is
convincing: $\overline{F}(x + h)$ is just $\overline{F}(x)$ plus the infimum of all $\Sigma(P, f)$ over
partitions of $[x, x + h]$. Any such sum lies between $Mh$ and $mh$. Now
Equations (2.18) and (2.19) give
$$m \le \frac{\overline{F}(x + h) - \overline{F}(x)}{h} \le M$$
Figure 2.9
Letting $h \to 0$, since $f$ is continuous, $M$ and $m$ both tend to $f(x)$. Thus
$\overline{F}'(x)$ exists and is $f(x)$. Similarly, one verifies that $\underline{F}'(x)$ also exists for all $x$
and has the same value. Thus, $\overline{F}$ and $\underline{F}$ differ by a constant. Since
$\overline{F}(a) = \underline{F}(a) = 0$ is obvious, we have that $\overline{F}(x) = \underline{F}(x)$ for all $x$. Thus
$\int_{[a, x]} f$ is defined for all $x$, is differentiable and has derivative $f$. This, then,
is the proof of
Theorem 2.10. (Fundamental Theorem of Calculus) Suppose $f$ is
continuous on the interval $[a, b]$. Then the integral $\int_a^x f$ exists for all $x \in [a, b]$.
This is a differentiable function of $x$, and
$$\frac{d}{dx} \int_a^x f = f(x)$$
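Theorem 2.10 can be observed numerically by integrating and then differentiating. The sketch below rests on assumptions chosen for illustration: the integral is approximated by a midpoint rule (a stand-in for the upper and lower sums of the text), the integrand is $\cos$, and the derivative of $F(x) = \int_a^x f$ is estimated by a central difference quotient.

```python
import math

def integral(f, a, x, n=20000):
    """Midpoint-rule approximation of the integral of f over [a, x]."""
    h = (x - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))

f = math.cos      # hypothetical integrand (an assumption)
a = 0.0

def F(x):
    """Approximation of the function x -> integral of f from a to x."""
    return integral(f, a, x)

x0, t = 1.0, 1e-3
dq = (F(x0 + t) - F(x0 - t)) / (2 * t)   # difference quotient of F at x0
print(dq, f(x0))                          # the two values should nearly agree
```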
• PROBLEMS
29. Prove Proposition 15(ii).
30. In the text the mean value theorem is proven in the case where
$f(b) = f(a)$. The way to do the general case is to compare the graph of $f$
with the line through $(a, f(a))$ and $(b, f(b))$. More precisely, let $g$ be the function
whose graph is that line, and consider $h = f - g$.
(a) Show that
$$h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a) \tag{2.20}$$
(b) Show that $h(a) = h(b) = 0$.
(c) Now from the text there is a $\xi$ between $a$ and $b$ such that $h'(\xi) = 0$.
Differentiating (2.20), deduce that
$$f'(\xi) = \frac{f(b) - f(a)}{b - a}$$
31. Suppose that $f$ is differentiable on the interval $[a, b]$, and $f'(x) > 0$
for all $x$. Show that $f$ is strictly increasing, that is, $f(x) < f(y)$ if $x < y$.
32. Verify inequalities (2.18) and (2.19).
33. Give an example of a continuous function of a real variable which is
not differentiable. Give an example of an integrable function which is not
continuous.
34. Find the real-valued function /, continuous on the interval [0, 1]
such that
ί f{t) dt=\ /(/) dt for all χ e [0, 1 ]
35. Suppose $f$ is $k$ times differentiable on $R$, and $f^{(k)}(x) = 0$ for all $x$.
Verify that $f$ is a polynomial of degree at most $k - 1$.
2.7 Multiple Integration
The calculus of many variables results from the attempt to study functions
of several variable quantities by generalizing to that context the calculus of a
single variable. Some notions generalize easily, others require some ideas
of linear algebra to be properly understood. The integration theory is much
closer to that of one variable than is differentiation, hence we shall describe
it first.
A closed rectangle in $R^n$ is a set of the form
$$\{(x^1, \ldots, x^n) \in R^n : a^i \le x^i \le b^i\} = [(a^1, \ldots, a^n), (b^1, \ldots, b^n)]$$
for some fixed points $a = (a^1, \ldots, a^n)$, $b = (b^1, \ldots, b^n)$ in $R^n$. As in the
case of intervals, we denote the corresponding open and half-open rectangles
in the same way.
$$(a, b) = \{x \in R^n : a^i < x^i < b^i\}$$
$$[a, b) = \{x \in R^n : a^i \le x^i < b^i\}$$
$$(a, b] = \{x \in R^n : a^i < x^i \le b^i\}$$
The term rectangle will refer to any of these possibilities. The volume of
the rectangle $R$ determined by the vectors $a$ and $b$ is
$$\mathrm{Vol}(R) = (b^1 - a^1) \cdots (b^n - a^n)$$
Notice that the volume of $R$ is the same whether $R$ is closed, open or half-open.
Of course, this is as it should be since the faces contribute no volume.
Now let $S$ be any set. The characteristic function of $S$, denoted by $\chi_S$,
is the function which is one on $S$ and identically zero off $S$. We should want
to define the integral so that the volume of $S$ coincides with the integral of $\chi_S$.
In particular, for a rectangle $R$ we shall have $\int \chi_R = \mathrm{Vol}(R)$. The notion
of integral will be built up piece by piece so that things turn out that way.
Now suppose that $f$ is a finite linear combination of characteristic functions
of rectangles: $f = \sum_{i=1}^k a_i \chi_{R_i}$. Such a function is called a simple function:
it is constant on each of some finite collection of rectangles, and identically
zero off their union.
Definition 11. Let $f$ be a simple function. If $f = \sum_{i=1}^k a_i \chi_{R_i}$, we define
$$\int f = \sum_{i=1}^k a_i \, \mathrm{Vol}(R_i) \tag{2.21}$$
We immediately have a problem. It may be possible to also write the
same function in another way, $f = \sum_{j=1}^m c_j \chi_{S_j}$ for some other collection
of rectangles. For Definition 11 to make sense, we must be assured that the
sum $\sum_{j=1}^m c_j \mathrm{Vol}(S_j)$ coincides with (2.21). In case the $a_i$ and $c_j$ are all one
and the $\{R_i\}$ and $\{S_j\}$ are nonoverlapping (intersect only in faces), this
amounts to the assertion that the volume of a set is the sum of the volumes
of its rectangular pieces, no matter how it is so partitioned. The verification
that (2.21) is the same for all expressions of the function $f$ as a combination
of characteristic functions is long and is omitted. We now
make this general definition of the integral.
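Definition 12. Let $f$ be a bounded real-valued function which is identically
zero outside some rectangle $R$. The upper integral of $f$ is
$$\overline{\int} f = \inf\Big\{ \int \sigma : \sigma \text{ a simple function on } R \text{ such that } \sigma \ge f \Big\}$$
The lower integral of $f$ is
$$\underline{\int} f = \sup\Big\{ \int \sigma : \sigma \text{ a simple function on } R \text{ such that } \sigma \le f \Big\}$$
$f$ is integrable if
$\overline{\int} f = \underline{\int} f$; the common value is the integral $\int f$.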
This is the direct generalization of the definition of the Riemann integral
given in Section 2.6. On the plane and in space it bears the same relation
to area and volume as does the Riemann integral to length.
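Returning to Definition 11, the independence of the integral of a simple function from its representation can be illustrated in a small case. The sketch below is an example under assumptions, not the omitted general verification: rectangles are encoded as pairs of corner tuples, and the same function (the constant 3 on $[0, 2] \times [0, 1]$, zero elsewhere, up to values on a shared face) is written once with one rectangle and once with two nonoverlapping ones.

```python
from math import prod

def vol(a, b):
    """Volume (b^1 - a^1) ... (b^n - a^n) of the rectangle with corners a, b."""
    return prod(bi - ai for ai, bi in zip(a, b))

def integral_simple(terms):
    """Sum of a_i * Vol(R_i) over a list of (coefficient, (a, b)) pairs, as in (2.21)."""
    return sum(c * vol(a, b) for c, (a, b) in terms)

# Two hypothetical representations of the same simple function on R^2.
rep1 = [(3.0, ((0, 0), (2, 1)))]
rep2 = [(3.0, ((0, 0), (1, 1))),
        (3.0, ((1, 0), (2, 1)))]

print(integral_simple(rep1), integral_simple(rep2))
```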
Definition 13. Let $S$ be a set in $R^n$. If $\chi_S$ is integrable, we define the
volume of $S$ to be
$$\mathrm{Vol}(S) = \int \chi_S$$
Now there are sets for which $\chi_S$ is not integrable; these are highly
pathological and shall not occur in this text. Notice that if $R_1, \ldots, R_n$ are
nonoverlapping rectangles contained in the set $S$, then the sum of the volumes
$\sum \mathrm{Vol}(R_i) = \int (\sum \chi_{R_i})$ is at most $\int \chi_S$, since $\chi_S \ge \sum \chi_{R_i}$. Thus the volume
of $S$ is at least the sum of the volumes of any collection of nonoverlapping
rectangles contained in $S$. Similarly, if now $R_1, \ldots, R_n$ are
nonoverlapping rectangles containing $S$, $\int \chi_S \le \sum \mathrm{Vol}(R_i)$. Thus, the volume of
$S$ is trapped between the volume of any union of rectangles containing $S$ and
the volume of any union of rectangles contained in $S$. If we can make these
two volumes as close as we please by proper choices of the rectangles, then
$\chi_S$ is integrable (for then $\overline{\int} \chi_S = \underline{\int} \chi_S$), and its integral is the volume of $S$.
Theorem 2.11. Let $R$ be a closed rectangle in $R^n$. If $f$ is continuous on $R$
and zero off $R$, then $f$ is integrable.
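Proof. Given $\varepsilon > 0$, we must find simple functions $\sigma, \tau$ such that $\sigma \ge f \ge \tau$ and
$\int \sigma \le \int \tau + \varepsilon \,\mathrm{Vol}(R)$; for then it will follow that
$$\overline{\int} f \le \int \sigma \le \int \tau + \varepsilon \,\mathrm{Vol}(R) \le \underline{\int} f + \varepsilon \,\mathrm{Vol}(R)$$
for any $\varepsilon > 0$. Thus, $\overline{\int} f \le \underline{\int} f$. Since the inequality $\underline{\int} f \le \overline{\int} f$ is
obvious in any case, $f$ is integrable.
Such functions $\sigma, \tau$ are easily found using the basic property of uniform
continuity (discussed in miscellaneous Problem 80). According to that theorem, given
$\varepsilon > 0$, there is a $\delta > 0$ such that, if $|x - y| < \delta$ then $|f(x) - f(y)| < \varepsilon$. Now
partition $R$ into a finite set $S$ of rectangles each of which has the property that any
two points are within $\delta$ of each other. Thus, if for each such rectangle $\rho$, $M_\rho$ and
$m_\rho$ are respectively the maximum and minimum of $f$ on $\rho$, we must have $M_\rho - m_\rho < \varepsilon$.
Let
$$\sigma = \sum_{\rho \in S} M_\rho \chi_\rho \qquad \tau = \sum_{\rho \in S} m_\rho \chi_{\rho_0}$$
where $\rho_0$ is the open rectangle corresponding to $\rho$. Then $\sigma \ge f \ge \tau$ certainly, and
$$\int \sigma = \sum_{\rho \in S} M_\rho \,\mathrm{Vol}(\rho) \le \sum_{\rho \in S} (m_\rho + \varepsilon)\,\mathrm{Vol}(\rho) \le \int \tau + \varepsilon \sum_{\rho \in S} \mathrm{Vol}(\rho) \le \int \tau + \varepsilon \,\mathrm{Vol}(R)$$
since $S$ is a partition of $R$ into rectangles.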
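The squeeze used in this proof can be watched numerically. The following is a minimal sketch, not part of the text: it assumes the rectangle is the unit square and that f(x, y) = x^2 + y^2 (chosen here because it is increasing in each variable, so the maximum and minimum on a grid cell sit at its corners). The function name upper_lower_sums is hypothetical.

```python
def upper_lower_sums(f, n):
    """Integrals of simple functions sigma >= f >= tau on [0,1]^2 for an n-by-n grid.

    Assumption of this sketch: f is monotone in each variable, so the max and
    min of f on a cell are attained at its four corners.
    """
    h = 1.0 / n
    upper = lower = 0.0
    for i in range(n):
        for j in range(n):
            corners = [
                f(i * h, j * h),
                f((i + 1) * h, j * h),
                f(i * h, (j + 1) * h),
                f((i + 1) * h, (j + 1) * h),
            ]
            upper += max(corners) * h * h   # contribution M_rho * Vol(rho)
            lower += min(corners) * h * h   # contribution m_rho * Vol(rho)
    return upper, lower

f = lambda x, y: x * x + y * y

for n in (4, 16, 64):
    up, lo = upper_lower_sums(f, n)
    # The gap up - lo shrinks as the partition is refined.
    print(n, lo, up, up - lo)
```

As the grid is refined, the two sums close in on the common value of the upper and lower integrals, which for this particular f is 2/3.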
The following basic properties of the integral are easily derived.
Proposition 16. The collection of integrable functions is a vector space and
the integral is a linear function. That is:
(i) If $f$ is integrable and $c \in R$, then $cf$ is integrable and $\int cf = c \int f$.
(ii) If $f$, $g$ are integrable, so is $f + g$ and $\int (f + g) = \int f + \int g$.
(iii) Furthermore, if $f \le g$ then $\int f \le \int g$.
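Proof. We leave the proof of (i) to the reader. (ii) is certainly true for simple
functions. For if $f = \sum a_i \chi_{R_i}$, $g = \sum b_j \chi_{S_j}$, where $R_i$, $S_j$ are rectangles, then
$f + g = \sum a_i \chi_{R_i} + \sum b_j \chi_{S_j}$ is also simple, and thus integrable. By Definition 11,
$$\int (f + g) = \sum a_i \,\mathrm{Vol}(R_i) + \sum b_j \,\mathrm{Vol}(S_j) = \int f + \int g$$
More generally, now let $f$, $g$ be any integrable functions. If $\varepsilon > 0$, there are
simple functions $\sigma_1, \sigma_2, \tau_1, \tau_2$ such that
$$\sigma_1 \ge f \ge \sigma_2 \qquad \tau_1 \ge g \ge \tau_2$$
and
$$\int \sigma_1 \le \int \sigma_2 + \varepsilon \qquad \int \tau_1 \le \int \tau_2 + \varepsilon$$
Thus
$$\sigma_1 + \tau_1 \ge f + g \ge \sigma_2 + \tau_2$$
so
$$\overline{\int} (f + g) \le \int \sigma_1 + \int \tau_1 \le \int (\sigma_2 + \tau_2) + 2\varepsilon \le \underline{\int} (f + g) + 2\varepsilon$$
Since $\varepsilon > 0$ was arbitrary, we obtain $\overline{\int} (f + g) \le \underline{\int} (f + g)$, so $f + g$ is integrable.
Finally,
$$\int (f + g) \le \int \sigma_2 + \int \tau_2 + 2\varepsilon \le \int f + \int g + 2\varepsilon$$
so letting $\varepsilon \to 0$,
$$\int (f + g) \le \int f + \int g$$
Similarly,
$$\int (f + g) + 2\varepsilon \ge \int \sigma_1 + \int \tau_1 \ge \int f + \int g$$
so again letting $\varepsilon \to 0$,
$$\int (f + g) \ge \int f + \int g$$
(iii) Finally if $f \le g$, then $g - f \ge 0$. But certainly the function which is identically
zero is a simple function. Thus $\int (g - f) \ge \underline{\int} (g - f) \ge 0$. By (i) and (ii) it follows that
$\int g - \int f \ge 0$, or $\int g \ge \int f$.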
We shall now give the basic tool for computing integrals: Fubini's theorem.
According to that result we can integrate by integrating one variable at a
time. For the purpose of showing this, write the variable $(x^1, \ldots, x^n)$ of
$R^n$ as $(x, y)$ where $x \in R^{n-1}$ and $y \in R$: $x = (x^1, \ldots, x^{n-1})$, $y = x^n$. Let $f$
be a function defined on a rectangle $R$ in $R^n$, and suppose for each $y$ fixed,
$f(x, y)$ is an integrable function of $x$. Define $F(y) = \int f(x, y)\,dx$. If $F$
is an integrable function of $y$, its integral
$$\int F(y)\,dy = \int \left[ \int f(x, y)\,dx \right] dy$$
is called the iterated integral of $f$. We shall now show that if $f$ is integrable
this is the same as $\int f$. More generally (after applying this principle $n$ times)
if all functions appearing in the following formula are integrable, then the
formula is valid.
$$\int f(x^1, \ldots, x^n)\,dx^1 \cdots dx^n = \int \left[ \cdots \left[ \int \left[ \int f(x^1, \ldots, x^n)\,dx^1 \right] dx^2 \right] \cdots \right] dx^n \tag{2.22}$$
This follows from Fubini's theorem.
Theorem 2.12. Let $f$ be an integrable function on a rectangle $R$ in $R^n$. We
refer to the coordinates of $R^n$ as $(x, y)$, where $x \in R^k$, $y \in R^{n-k}$.
(i) These functions of $y$, $\overline{\int} f(x, y)\,dx$, $\underline{\int} f(x, y)\,dx$, are integrable.
(ii) These functions of $x$, $\overline{\int} f(x, y)\,dy$, $\underline{\int} f(x, y)\,dy$, are integrable.
(iii) $\int f$ is given by any iterated integral of $f$; for example,
$$\int f(x, y)\,dx\,dy = \int \left[ \overline{\int} f(x, y)\,dx \right] dy = \int \left[ \underline{\int} f(x, y)\,dy \right] dx$$
Proof. It is easily verified that the collection of functions for which the
assertions (i), (ii), and (iii) are true is a vector space. Furthermore, these assertions are
obvious for the characteristic function of a rectangle. Thus, Fubini's theorem
holds for simple functions.
Now, suppose $f$ is a bounded, real-valued function on the given rectangle $R$, and
suppose that $\sigma$ is a simple function, and $f \ge \sigma$. By definition of the lower integral
with respect to the $x$ coordinate,
$$\underline{\int} f(x, y)\,dx \ge \int \sigma(x, y)\,dx$$
Now this inequality is maintained after taking the lower integrals with respect to $y$,
thus
$$\underline{\int} \left[ \underline{\int} f(x, y)\,dx \right] dy \ge \int \left[ \int \sigma(x, y)\,dx \right] dy = \int \sigma(x, y)\,dx\,dy \tag{2.23}$$
since Theorem 2.12 is true for simple functions. Equation (2.23) being true for
any $\sigma \le f$, we can take the least upper bound on the right, obtaining
$$\underline{\int} \left[ \underline{\int} f(x, y)\,dx \right] dy \ge \underline{\int} f(x, y)\,dx\,dy$$
Now, by considering simple functions $\sigma$ such that $\sigma \ge f$ and applying the same
kind of reasoning we obtain this inequality
$$\overline{\int} \left[ \overline{\int} f(x, y)\,dx \right] dy \le \overline{\int} f(x, y)\,dx\,dy$$
As a result, we obtain this string of inequalities, which is valid for any bounded,
real-valued function on $R$:
$$\overline{\int} f \;\ge\; \overline{\int} \left[ \overline{\int} f\,dx \right] dy \;\ge\; \left\{ \begin{array}{c} \underline{\int} \left[ \overline{\int} f\,dx \right] dy \\[4pt] \overline{\int} \left[ \underline{\int} f\,dx \right] dy \end{array} \right\} \;\ge\; \underline{\int} \left[ \underline{\int} f\,dx \right] dy \;\ge\; \underline{\int} f \tag{2.24}$$
(The second and third inequalities follow immediately from the fact that the upper
integral always dominates the lower integral.) Now, if $f$ is indeed integrable, the
first and last terms of (2.24) are the same, so all are the same. That the second and
top third are equal implies that $\overline{\int} f(x, y)\,dx$ is integrable. That the bottom third
and fourth are equal says that $\underline{\int} f(x, y)\,dx$ is integrable. The equation
$$\int f(x, y)\,dx\,dy = \int \left[ \int f(x, y)\,dx \right] dy$$
now just states the equality of the end terms with the interior terms.
Now we shall illustrate the use of Fubini's theorem. Before doing that,
we should remark that we rarely have the occasion to integrate functions
defined on a rectangle; more often such a function is defined or considered
only on a given measurable domain D. We make the following definition.
Definition 14. Let $D$ be a domain contained in a rectangle $R$. Given a
function $f$ defined on $D$, we say $f$ is integrable if this is so for the function $\tilde f$
defined on $R$ by
$$\tilde f(x) = \begin{cases} f(x) & x \in D \\ 0 & x \in R,\ x \notin D \end{cases}$$
We define $\int_D f = \int \tilde f$.
If $D$ is a subdomain of a rectangle $R$ bounded by a surface which is the
graph of a function, or has some other redeeming property, then the function
$\tilde f$ will be integrable if $f$ is. We shall not pursue this theoretical inquiry,
but rather tacitly assume our domains are redeemable.
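Example
39. $D = \{(x, y): 0 \le y \le x^2,\ 0 \le x \le 1\}$, $f(x, y) = x^2 + y^2$.
Define $\tilde f(x, y) = x^2 + y^2$ if $(x, y) \in D$, and $\tilde f(x, y) = 0$ otherwise.
Then
$$\int_D f = \int_R \tilde f = \int_{-1}^{1} \left[ \int_{-1}^{1} \tilde f(x, y)\,dy \right] dx = \int_0^1 \left[ \int_0^{x^2} (x^2 + y^2)\,dy \right] dx$$
since, for fixed $x$, $\tilde f(x, y)$ is zero if $x < 0$ or $y > x^2$ and otherwise is
$x^2 + y^2$. We thus obtain
$$\int_D f = \int_0^1 \left[ x^2 y + \frac{y^3}{3} \right]_0^{x^2} dx = \int_0^1 \left( x^4 + \frac{x^6}{3} \right) dx = \frac{1}{5} + \frac{1}{21} = \frac{26}{105}$$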
[Figure 2.10; curve label: y = g(x)]
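Let us do the same example, iterating this time in the other order.
$$\int_D f = \int_{-1}^{1} \left[ \int_{-1}^{1} \tilde f(x, y)\,dx \right] dy = \int_0^1 \left[ \int_{\sqrt{y}}^{1} (x^2 + y^2)\,dx \right] dy = \frac{1}{3} + \frac{1}{3} - \frac{2}{15} - \frac{2}{7} = \frac{26}{105}$$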
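The agreement of the two orders of iteration in Example 39 can be checked numerically. What follows is a rough sketch using a midpoint rule; the helper name midpoint and the number of subintervals are choices made here, not anything prescribed by the text.

```python
import math

def midpoint(g, a, b, n=400):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

f = lambda x, y: x * x + y * y

# Integrate in y first: for fixed x in [0, 1], y runs over [0, x^2].
dy_first = midpoint(lambda x: midpoint(lambda y: f(x, y), 0.0, x * x), 0.0, 1.0)

# Integrate in x first: for fixed y in [0, 1], x runs over [sqrt(y), 1].
dx_first = midpoint(lambda y: midpoint(lambda x: f(x, y), math.sqrt(y), 1.0), 0.0, 1.0)

print(dy_first, dx_first, 26 / 105)
```

Both iterated approximations should land close to 26/105, the value computed by hand above.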
The general technique can be described as follows: Try to write the domain
in either of these forms (Figure 2.10).
$$D = \{(x, y): a \le x \le b,\ g(x) \le y \le f(x)\}$$
or (Figure 2.11)
$$D = \{(x, y): a \le y \le b,\ \phi(y) \le x \le \psi(y)\}$$
Then, given a function $h$ defined on $D$, we can write
$$\int_D h = \int_a^b \left[ \int_{g(x)}^{f(x)} h(x, y)\,dy \right] dx$$
in the first case; and in the second
$$\int_D h = \int_a^b \left[ \int_{\phi(y)}^{\psi(y)} h(x, y)\,dx \right] dy$$
Of course, if neither case can be obtained, then $D$ might have to be broken
up into pieces in each of which either representation is possible. The
computation of integrals in more than two dimensions is done in pretty much
the same way, but with a certain amount of additional care. For example,
one should try to pick out one of the coordinates, say $z$, so that the given
domain takes the form $g(y) \le z \le f(y)$, where $y$ represents all the other
coordinates and ranges through some domain $D_0$. Now one proceeds to
break down $D_0$ in the same way.
Examples
40. $D = \{(x, y, z): x^2 + y^2 + z^2 \le 1,\ x \ge 0,\ y \ge 0,\ z \ge 0\}$,
$f(x, y, z) = xyz$.
Now $z$ ranges between 0 and $(1 - (x^2 + y^2))^{1/2}$, so
$$D = \{(x, y, z): x^2 + y^2 \le 1,\ 0 \le x,\ 0 \le y,\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
Thus, continuing the analysis of
$$D_0 = \{(x, y): x^2 + y^2 \le 1,\ 0 \le x,\ 0 \le y\}$$
$$D = \{(x, y, z): 0 \le x \le 1,\ 0 \le y \le (1 - x^2)^{1/2},\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
[Figure 2.11]
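and
$$\int_D f = \int_0^1 \left[ \int_0^{(1 - x^2)^{1/2}} \left[ \int_0^{[1 - (x^2 + y^2)]^{1/2}} xyz\,dz \right] dy \right] dx = \frac{1}{2} \int_0^1 \left[ \int_0^{(1 - x^2)^{1/2}} xy\,(1 - (x^2 + y^2))\,dy \right] dx$$
$$= \frac{1}{2} \int_0^1 \left[ \frac{x(1 - x^2)^2}{2} - \frac{x(1 - x^2)^2}{4} \right] dx = \frac{1}{8} \int_0^1 x(1 - x^2)^2\,dx = \frac{1}{48}$$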
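As a sanity check on Example 40, the three nested integrations can be carried out numerically in the same order (z, then y, then x). This is only a sketch; the midpoint rule and the resolution n = 40 are assumptions made here for illustration, and the function names are hypothetical.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def integral_xyz(n=40):
    """Iterated integral of xyz over the part of the unit ball in the first octant."""
    def over_z(x, y):
        # z runs from 0 to sqrt(1 - x^2 - y^2); max() guards against rounding below 0.
        top = math.sqrt(max(0.0, 1.0 - x * x - y * y))
        return midpoint(lambda z: x * y * z, 0.0, top, n)

    def over_y(x):
        # y runs from 0 to sqrt(1 - x^2).
        top = math.sqrt(max(0.0, 1.0 - x * x))
        return midpoint(lambda y: over_z(x, y), 0.0, top, n)

    return midpoint(over_y, 0.0, 1.0, n)

print(integral_xyz(), 1 / 48)
```

The nesting of the helper functions mirrors the nesting of the brackets in the iterated integral: each level fixes the outer variables and integrates over the next one.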
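41. $D = \{(x, y, z): x^2 + y^2 + z^2 \le 1,\ (x - \tfrac12)^2 + y^2 \le \tfrac14,\ x \ge 0,\ y \ge 0,\ z \ge 0\}$, $f(x, y, z) = 1$
(see Figure 2.12). We may rewrite this domain as
$$D = \{(x, y, z): (x - \tfrac12)^2 + y^2 \le \tfrac14,\ x \ge 0,\ y \ge 0,\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
$$= \{(x, y, z): 0 \le x \le 1,\ 0 \le y \le [\tfrac14 - (x - \tfrac12)^2]^{1/2},\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
Thus
$$\mathrm{Vol}(D) = \int_0^1 \left[ \int_0^{[\frac14 - (x - \frac12)^2]^{1/2}} \left[ \int_0^{[1 - (x^2 + y^2)]^{1/2}} dz \right] dy \right] dx$$
[Figure 2.12]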
Integration is clearly of value in computing volumes; it also plays a role
in the study of mass. Suppose $E$ is a domain in $R^3$ filled with a certain fluid.
If $D$ is any subdomain in $E$, we shall let mass$(D)$ be the mass of the fluid contained
in $D$. What information do we need in order to compute mass$(D)$, and how
do we compute it? The answer is suggested by comparison of the properties
of mass with those of volume. In fact, it is clear that the intuitive properties
of mass are the same as the properties of volume; so we should also expect
to be able to compute masses by integration. In fact, we introduce the
notion of density: for $x_0 \in E$, the density $\sigma(x_0)$ of the fluid at $x_0$ is the limit
$$\sigma(x_0) = \lim_{R \to x_0} \frac{\text{mass}(R)}{\mathrm{Vol}(R)}$$
where we mean by $R \to x_0$ that $x_0$ is in the rectangle $R$, and the lengths of
the sides of $R$ tend to zero (we might call mass$(R)/\mathrm{Vol}(R)$ the relative
density of the fluid in the rectangle $R$). Now, the mass of the fluid in any
domain is computable in terms of this density function $\sigma$. Suppose $D$ is
such a domain and $\{R_i\}$ is a collection of pairwise disjoint rectangles in $D$
and almost filling $D$. Then, choosing a point $x_i$ in each $R_i$,
$$\sum_i \sigma(x_i)\,\mathrm{Vol}(R_i)$$
is an approximation to mass$(D)$, and as the size of the rectangles gets smaller
and smaller, the approximation gets better. On the other hand, this sum is
the integral of a simple function approximating $\sigma$, and thus approximates
$\int_D \sigma$. Taking the limit we obtain mass$(D) = \int_D \sigma$.
• EXERCISES
15. Compute the volume of these domains:
(a) $\{(x, y) \in R^2: x^2 + y^2 \le 1\}$.
(b) $\{(x, y) \in R^2: x^2 \le y \le 1\}$.
(c) {(^j,z)ei3:0^x<l,0<^l,0<z<^ + /}.
(d) $\{(x, y, z) \in R^3: -1 \le x \le 1,\ 0 \le y \le 2,\ y \le z \le y + x^2\}$.
16. Verify that the volume of a right circular cylinder of radius $r$ and
height $h$ is $\pi r^2 h$.
17. Integrate the function $f$ on the unit rectangle $[(0, 0), (1, 1)]$ in $R^2$.
(a) $f(x, y) = x \cos 2\pi y$.
(b) $f(x, y) = |(x - \tfrac12)(y - \tfrac12)|$.
(c) f(x,y)=xe" + ye-'.
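(d) $f(x, y) = \begin{cases} x & \text{if } x \le y \\ y & \text{if } y \le x. \end{cases}$
(e) $f(x, y) = \begin{cases} x + y & \text{if } x + y \le 1 \\ 1 & \text{if } x + y \ge 1. \end{cases}$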
(f) f(x,y) = 0 + x2 + y2y2·
18. Integrate the function $f$ on the domain $D$ in $R^2$.
(a) $D = \{(x, y): 0 \le x,\ 0 \le y,\ x + y \le 1\}$, $f(x, y) = x^2 + y^2$.
(b) $D = \{(x, y): 0 \le y \le x \le 1\}$, $f(x, y) = xy^2$.
(C) D = {(л:, у):0<у^х^ \},f(x, у) = л:2г·
(d) $D = \{(x, y): x^2 + y^2 \le 1\}$, $f(x, y) = (x^2 - y^2)^2$.
19. Integrate the function $f$ on the domain $D$ in $R^3$.
(a) $D$ is the intersection of the unit ball with the octant $\{x \ge 0,\ y \ge 0,\ z \ge 0\}$
and $f(x, y, z) = x + y + z$.
(b) $D$ is as above and $f(x, y, z) = xyz$.
(c) $D$ is the unit cube in the first octant and $f(x, y, z) = x^2 + y^2 + z^2$.
(d) $D$ is the domain in the first octant bounded by the coordinate
planes and the plane $x + y + z = 1$ and $f(x, y, z) = z$.
• PROBLEMS
36. Verify that the integral on $R^n$ as defined in this section coincides,
when $n = 1$, with the Riemann integral defined in the previous section.
37. Let $f$ be a bounded, nonnegative, real-valued function defined on the
interval $I$, and let $D = \{(x, y) \in R^2: x \in I,\ 0 \le y \le f(x)\}$. Verify this
assertion: $f$ is integrable if and only if $D$ is measurable, and $\int_I f = \mathrm{Vol}(D)$.
38. Use Problem 37 to verify this. Let $D$ be a domain in $R^2$ and suppose
that $D$ is of the form
$$\{(x, y) \in R^2: a \le x \le b,\ g(x) \le y \le f(x)\}$$
Then, if $D$ is measurable, $\mathrm{Vol}(D) = \int_a^b [f(x) - g(x)]\,dx$.
39. Complete the proof of Fubini's theorem by verifying the second and
third inequalities of Equation (2.24).
40. State and prove Fubini's theorem in three dimensions.
41. Suppose the unit ball is filled with a fluid whose density is proportional
to the distance to the boundary. Find the radius of the ball centered at
the origin which has precisely half the mass.
42. Suppose a cone of base radius r and height h is filled with mud
(Figure 2.13). Suppose the density of the mud is equal to the distance from
the base. What is the mass of the mud ?
43. A beach $B$ is shaped in the form of a crescent (see Figure 2.14)
$$B = \{(x, y): 1 \le x^2 + y^2;\ (x - \tfrac12)^2 + y^2 \le 1\}$$
and the human density $\sigma$ increases with the distance from the water. More
precisely, $\sigma(x, y) = (x^2 + y^2)^{-1}$. What is the mass of humanity on that
beach?
[Figure 2.13]
[Figure 2.14]
2.8 Partial Differentiation
Although the integral in $R^n$ is defined without reference to the coordinates,
it is computed by a succession of integrations, one coordinate at a time. The
notion of differentiation is, to begin with, generalized to $R^n$ one coordinate
at a time. Later we shall see how to build out of this generalization an
invariant notion of derivation.
Let $\mathbf{x}_0 \in R^n$, and suppose that $f$ is a real-valued function defined in a
neighborhood of $\mathbf{x}_0$. For each $i$ consider the function of the single variable
$x^i$ given by
$$f(x_0^1, \ldots, x^i, \ldots, x_0^n)$$
If this function is differentiable, we denote the derivative by $\partial f/\partial x^i$, and call
it the partial derivative of $f$ in the $x^i$ direction. More precisely,
Definition 15. Let $f$ be a real-valued function defined in a neighborhood
of $\mathbf{x}_0$ in $R^n$. The partial derivative of $f$ with respect to $x^i$ at $\mathbf{x}_0$ is the limit
$$\frac{\partial f}{\partial x^i}(\mathbf{x}_0) = \lim_{t \to 0} \frac{f(x_0^1, \ldots, x_0^i + t, \ldots, x_0^n) - f(x_0^1, \ldots, x_0^n)}{t}$$
Another way of describing the partial derivative is this. Consider the
function $f$ only as a function on the line through $\mathbf{x}_0$ in the $\mathbf{E}_i$ direction.
This restriction is a function of one variable and $\partial f/\partial x^i$ is its derivative.
These partial derivatives are computed merely by considering all but the
relevant variable as constant.
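Examples
42. $f(x, y) = xy$
$$\frac{\partial f}{\partial x}(x, y) = y \qquad \frac{\partial f}{\partial y}(x, y) = x$$
43.
$$\frac{\partial}{\partial x}(x^2 y) = 2xy \qquad \frac{\partial}{\partial y}(x^2 y) = x^2$$
44. $f(x, y) = \cos[x(1 + y)]$
$$\frac{\partial f}{\partial x}(x, y) = -(1 + y)\sin[x(1 + y)] \qquad \frac{\partial f}{\partial y}(x, y) = -x \sin[x(1 + y)]$$
45. $f(x, y) = x^y$
$$\frac{\partial f}{\partial x}(x, y) = y x^{y-1} \qquad \frac{\partial f}{\partial y}(x, y) = x^y \ln x$$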
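The formulas in Examples 44 and 45 can be compared against difference quotients taken in one coordinate while the other is held fixed, which is all that Definition 15 asks for. The sketch below uses a central difference; the helper name partial, the step size, and the sample point are illustrative choices.

```python
import math

def partial(f, point, i, h=1e-6):
    """Central-difference estimate of the partial derivative of f in coordinate i."""
    up = list(point)
    down = list(point)
    up[i] += h      # move only coordinate i; all others stay constant
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)

x, y = 1.5, 0.7

# Example 44: f(x, y) = cos[x(1 + y)]
f44 = lambda x, y: math.cos(x * (1 + y))
print(partial(f44, (x, y), 0), -(1 + y) * math.sin(x * (1 + y)))
print(partial(f44, (x, y), 1), -x * math.sin(x * (1 + y)))

# Example 45: f(x, y) = x**y, for x > 0
f45 = lambda x, y: x ** y
print(partial(f45, (x, y), 0), y * x ** (y - 1))
print(partial(f45, (x, y), 1), x ** y * math.log(x))
```

Each printed pair should agree to several decimal places.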
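Of course, if the functions
$$\frac{\partial f}{\partial x^1}, \ldots, \frac{\partial f}{\partial x^n}$$
are also defined in a neighborhood of $\mathbf{x}_0$, we may subject them to further
partial differentiation, and keep going in this way as far as possible. We shall
refer to any such operation as a partial differentiation and call its order the
number of individual partial derivatives involved.
Thus, the order of
$$\frac{\partial}{\partial x^i}\left(\frac{\partial f}{\partial x^j}\right)$$
is 2; the order of
$$\frac{\partial^2}{\partial x^2}\left(\frac{\partial}{\partial y}\left(\frac{\partial^3 f}{\partial z^3}\right)\right)$$
is 6. We introduce a notational convention which deletes parentheses.
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)$$
$$\frac{\partial^2 f}{\partial x^i\,\partial x^j} = \frac{\partial}{\partial x^i}\left(\frac{\partial f}{\partial x^j}\right) \qquad \frac{\partial^3 f}{\partial x^i\,\partial x^j\,\partial x^k} = \frac{\partial}{\partial x^i}\left(\frac{\partial}{\partial x^j}\left(\frac{\partial f}{\partial x^k}\right)\right)$$
$$\frac{\partial^6 f}{\partial x^2\,\partial y\,\partial z^3} = \frac{\partial}{\partial x}\left(\frac{\partial^5 f}{\partial x\,\partial y\,\partial z^3}\right)$$
and so forth.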
Suppose now that $f$ is a function defined in an open set $N$ in $R^n$ and that
$\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ all exist in $N$. If we set all the variables constant except
one, say $x^i$, then $\partial f/\partial x^i$ is just the derivative of $f$ along this line. Thus, if
$\partial f/\partial x^i = 0$, $f$ is constant along the line on which only $x^i$ varies. In such
circumstances we say that $f$ is independent of $x^i$, since $f$ does not vary as $x^i$
alone varies. If, moreover, $\partial f/\partial x^i$ is zero at all points of $N$ for all $i$, then $f$
depends on none of the variables, so is constant. As this is an important
observation, we state it as a proposition.
Proposition 17. Suppose that $f$ is a real-valued function defined in a
neighborhood of $\mathbf{x}_0$ in $R^n$. $f$ is constant near $\mathbf{x}_0$ if and only if all the derivatives
$\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ exist and are zero near $\mathbf{x}_0$.
Proof. If $f$ is constant, it is obvious that $\partial f/\partial x^i = 0$ for all $i$. On the other hand,
suppose that these conditions are valid in a ball $B(\mathbf{x}_0, r)$ centered at $\mathbf{x}_0$. Let
$\mathbf{y} = (y^1, \ldots, y^n) \in B(\mathbf{x}_0, r)$. We will show that $f(\mathbf{y}) = f(\mathbf{x}_0)$. Figure 2.15 illustrates
the proof. Consider the function of $x^n$:
$$f(x_0^1, \ldots, x_0^{n-1}, x^n)$$
This function has derivative zero by hypothesis, so is constant. Thus,
$$f(x_0^1, \ldots, x_0^{n-1}, x_0^n) = f(x_0^1, \ldots, x_0^{n-1}, y^n)$$
Now, the function of $x^{n-1}$,
$$f(x_0^1, \ldots, x_0^{n-2}, x^{n-1}, y^n)$$
also has derivative zero, and thus must be constant, so
$$f(x_0^1, \ldots, x_0^{n-1}, y^n) = f(x_0^1, \ldots, x_0^{n-2}, y^{n-1}, y^n)$$
This together with the preceding equation gives
$$f(x_0^1, \ldots, x_0^{n-1}, x_0^n) = f(x_0^1, \ldots, x_0^{n-2}, y^{n-1}, y^n)$$
Continuing in this way, we can replace each $x_0^j$ by the corresponding $y^j$ one at a
time, ending up with the desired equation $f(\mathbf{x}_0) = f(\mathbf{y})$.
[Figure 2.15]
As far as the higher order differentiations are concerned, there is one basic
fact we should now verify. This is that each partial differentiation depends
only on the number of derivatives with respect to each coordinate, and not
on the order in which they are performed. For example,
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} \tag{2.25}$$
$$\frac{\partial^5 f}{\partial x\,\partial y\,\partial z\,\partial z\,\partial x} = \frac{\partial^5 f}{\partial y\,\partial x\,\partial z\,\partial x\,\partial z} = \frac{\partial^5 f}{\partial y\,\partial x\,\partial z\,\partial z\,\partial x}$$
We shall verify only the first equation, it being clear that all others follow
from a succession of applications of the first one. The verification of (2.25)
amounts to an interesting application of Fubini's theorem.
Theorem 2.13. Let $f$ be a real-valued function defined in a neighborhood
$N$ of $(x_0, y_0)$ in $R^2$ and suppose that all first- and second-order partial
derivatives of $f$ exist and are continuous on $N$. Then
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$$
throughout $N$.
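Proof. We apply Fubini's theorem to $\partial^2 f/\partial x\,\partial y$ in a sufficiently small rectangle
$R = ((x_0, y_0), (s, t))$ contained in $N$ (see Figure 2.16):
$$\int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}\,dy \right] dx = \int_{y_0}^{t} \left[ \int_{x_0}^{s} \frac{\partial^2 f}{\partial x\,\partial y}\,dx \right] dy \tag{2.26}$$
[Figure 2.16]
Now, we can easily evaluate the integral on the right-hand side. For fixed $y$,
$$\int_{x_0}^{s} \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)(x, y)\,dx = \frac{\partial f}{\partial y}(s, y) - \frac{\partial f}{\partial y}(x_0, y) \tag{2.27}$$
Integrating once again (this time with respect to $y$) we obtain from Equations (2.26)
and (2.27)
$$\int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}\,dy \right] dx = \int_{y_0}^{t} \frac{\partial}{\partial y}\,[f(s, y) - f(x_0, y)]\,dy = f(s, t) - f(x_0, t) - [f(s, y_0) - f(x_0, y_0)] \tag{2.28}$$
Now, we can differentiate this equation with respect to $s$ first, and then $t$. By the
fundamental theorem of calculus, we know how to differentiate the integral on the
left with respect to the upper limit of integration:
$$\frac{\partial}{\partial s} \int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(x, y)\,dy \right] dx = \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(s, y)\,dy$$
Then, from (2.28)
$$\int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(s, y)\,dy = \frac{\partial f}{\partial x}(s, t) - \frac{\partial f}{\partial x}(s, y_0)$$
Differentiating this equation now with respect to $t$, we obtain
$$\frac{\partial^2 f}{\partial x\,\partial y}(s, t) = \frac{\partial^2 f}{\partial y\,\partial x}(s, t)$$
as desired.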
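The key identity (2.28) in this proof, that the double integral of the mixed partial over a rectangle equals an alternating sum of corner values of f, lends itself to a quick numerical check. The sketch below assumes a particular smooth function, f(x, y) = x^2 y^3 + sin(xy), whose mixed partial was worked out by hand; the function, the rectangle, and the helper names are all illustrative choices, not taken from the text.

```python
import math

def f(x, y):
    return x ** 2 * y ** 3 + math.sin(x * y)

def d2f_dxdy(x, y):
    # d/dx (df/dy) for the f above: df/dy = 3 x^2 y^2 + x cos(xy).
    return 6 * x * y ** 2 + math.cos(x * y) - x * y * math.sin(x * y)

def double_midpoint(g, x0, s, y0, t, n=200):
    """Midpoint-rule approximation of the integral of g over [x0, s] x [y0, t]."""
    hx, hy = (s - x0) / n, (t - y0) / n
    total = 0.0
    for i in range(n):
        x = x0 + (i + 0.5) * hx
        for j in range(n):
            total += g(x, y0 + (j + 0.5) * hy)
    return total * hx * hy

x0, y0, s, t = 0.2, 0.1, 1.0, 0.8
lhs = double_midpoint(d2f_dxdy, x0, s, y0, t)
rhs = f(s, t) - f(x0, t) - (f(s, y0) - f(x0, y0))
print(lhs, rhs)
```

The right-hand side is symmetric in the roles of x and y, which is the reason the same corner expression results from integrating the other mixed partial.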
Another important application of Fubini's theorem is this result, which
allows us to differentiate under the integral sign.
Proposition 18. Suppose that $f$ is a continuously differentiable function of
two variables $x$ and $y$, $a \le x \le b$, and $y \in D$, a domain in $R^n$. Define the
function $F$ on the interval $[a, b]$ by
$$F(x) = \int_D f(x, y)\,dy$$
Then $F$ is differentiable and
$$\frac{dF}{dx}(x) = \int_D \frac{\partial f}{\partial x}(x, y)\,dy$$
Proof. We shall show that $F$ is the indefinite integral of the function
$$x \mapsto \int_D \frac{\partial f}{\partial x}(x, y)\,dy$$
and thus by the fundamental theorem of calculus, the proposition follows. By
Fubini's theorem
$$\int_a^t \left[ \int_D \frac{\partial f}{\partial x}(x, y)\,dy \right] dx = \int_D \left[ \int_a^t \frac{\partial f}{\partial x}(x, y)\,dx \right] dy$$
But by the fundamental theorem of calculus, the inner integral on the right is
$f(t, y) - f(a, y)$. Thus
$$\int_a^t \left[ \int_D \frac{\partial f}{\partial x}(x, y)\,dy \right] dx = \int_D [f(t, y) - f(a, y)]\,dy = F(t) - F(a)$$
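Proposition 18 can be illustrated with a one-dimensional domain. The sketch below assumes D = [0, 1] and f(x, y) = sin(xy), both chosen here for convenience, and compares a difference quotient of F with the integral of the partial derivative.

```python
import math

def midpoint(g, a, b, n=2000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

f = lambda x, y: math.sin(x * y)
df_dx = lambda x, y: y * math.cos(x * y)   # partial derivative of f in x

def F(x):
    """F(x) = integral over D = [0, 1] of f(x, y) dy."""
    return midpoint(lambda y: f(x, y), 0.0, 1.0)

x, h = 0.9, 1e-4
difference_quotient = (F(x + h) - F(x - h)) / (2 * h)
under_integral = midpoint(lambda y: df_dx(x, y), 0.0, 1.0)
print(difference_quotient, under_integral)
```

The two printed numbers should agree closely, one obtained by differentiating after integrating and the other by integrating after differentiating.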
Let us return now to the consideration of the first-order derivatives. These
are obtained by differentiating after restricting the function to lines parallel
to the coordinate axes. We generalize this notion to allow differentiation
along any line. That is, we make this definition.
Definition 16. Let $\mathbf{x}_0 \in R^n$ and suppose $f$ is a real-valued function defined
in a neighborhood of $\mathbf{x}_0$. If $\mathbf{v}$ is a vector in $R^n$, we define the directional
derivative $df(\mathbf{x}_0, \mathbf{v})$ to be
$$df(\mathbf{x}_0, \mathbf{v}) = \frac{d}{dt} f(\mathbf{x}_0 + t\mathbf{v})\Big|_{t=0}$$
This is clearly the same as
$$\lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t}$$
We leave it as an exercise to verify that
$$\frac{\partial f}{\partial x^i}(\mathbf{x}_0) = df(\mathbf{x}_0, \mathbf{E}_i) \tag{2.29}$$
Now, in certain pathological cases the directional derivatives need not hang
together in any nice way, but typically we need only know the partial
derivatives in order to find any directional derivative.
Proposition 19. Suppose $f$ is defined in a neighborhood of $\mathbf{x}_0$ and the partial
derivatives $\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ all exist and are continuous near $\mathbf{x}_0$. Then the
directional derivatives $df(\mathbf{x}_0, \mathbf{v})$ vary linearly in $\mathbf{v}$.
Proof. The argument consists in looking at the difference
$$f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)$$
one variable at a time. In order to expose the idea without encumbering the
argument with a pile of indices, we consider the two-variable case. Write the
difference
$$f(x_0 + th, y_0 + tk) - f(x_0, y_0) = \{f(x_0 + th, y_0 + tk) - f(x_0 + th, y_0)\} + \{f(x_0 + th, y_0) - f(x_0, y_0)\}$$
We can find a better expression for the term in the second set of braces by applying
the mean value theorem to the function $f(s, y_0)$ of $s$. That is, there is a $\xi_0$ between
$x_0$ and $x_0 + th$ such that
$$f(x_0 + th, y_0) - f(x_0, y_0) = \frac{\partial f}{\partial x}(\xi_0, y_0)\,th$$
Similarly, by applying the mean value theorem to the function $f(x_0 + th, s)$, we can
rewrite the term in the first set of braces as
$$\frac{\partial f}{\partial y}(x_0 + th, \eta_0)\,tk$$
for some $\eta_0$ between $y_0$ and $y_0 + tk$. Thus, we have for suitable $(\xi_0, \eta_0)$ in the
rectangle $[(x_0, y_0), (x_0 + th, y_0 + tk)]$,
$$\frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t} = \frac{\partial f}{\partial x}(\xi_0, y_0)\,h + \frac{\partial f}{\partial y}(x_0 + th, \eta_0)\,k$$
Letting $t \to 0$, we obtain by continuity that
$$df((x_0, y_0), (h, k)) = \frac{\partial f}{\partial x}(x_0, y_0)\,h + \frac{\partial f}{\partial y}(x_0, y_0)\,k \tag{2.30}$$
Thus the proposition is verified, at least in $R^2$.
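Formula (2.30) says that a directional derivative is a linear combination of the two partial derivatives. A small numerical sketch follows, with an arbitrarily chosen smooth function, base point, and direction (none of them from the text); the partial derivatives fx and fy are written out by hand.

```python
import math

def directional(f, p, v, t=1e-6):
    """Symmetric difference quotient approximating df(p, v)."""
    plus = [a + t * b for a, b in zip(p, v)]
    minus = [a - t * b for a, b in zip(p, v)]
    return (f(*plus) - f(*minus)) / (2 * t)

f = lambda x, y: x * x * y + math.sin(y)
fx = lambda x, y: 2 * x * y              # partial derivative in x
fy = lambda x, y: x * x + math.cos(y)    # partial derivative in y

p, v = (1.2, -0.4), (3.0, 2.0)
numeric = directional(f, p, v)
formula = fx(*p) * v[0] + fy(*p) * v[1]  # right-hand side of (2.30) with (h, k) = v
print(numeric, formula)
```

Doubling v should double both numbers, which is the linearity asserted in Proposition 19.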
This linear function $df(\mathbf{x}_0, \mathbf{v})$ of the vector $\mathbf{v}$ in $R^n$ is called the differential
of $f$ at $\mathbf{x}_0$. We will make a systematic study of this in a later chapter. The
vector-valued function
$$\left(\frac{\partial f}{\partial x^1}, \ldots, \frac{\partial f}{\partial x^n}\right)$$
is called the gradient of $f$ and is denoted by $\nabla f$. It is clear from Proposition
19 that the generalization of (2.30) to $n$ variables is
$$df(\mathbf{x}_0, \mathbf{v}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(\mathbf{x}_0)\,v^i = \langle \mathbf{v}, \nabla f(\mathbf{x}_0) \rangle \tag{2.31}$$
The gradient behaves as a sort of "total derivative." It is not as powerful
in the analysis of a function as the derivative in one variable and it is
somewhat more cumbersome, but it does provide a similar kind of tool. For
example,
Proposition 20. The gradient of a function vanishes at any point at which it
attains a maximum or minimum value.
Proof. If $f$ attains (for instance) a maximum at $\mathbf{x}_0 = (x_0^1, \ldots, x_0^n)$, then
$f(x_0^1, \ldots, x^i, \ldots, x_0^n)$, as a function of $x^i$, attains a maximum at $x_0^i$. Thus,
$\partial f/\partial x^i$ vanishes at $\mathbf{x}_0$. Since this is true for all $i$, $\nabla f(\mathbf{x}_0) = 0$.
Examples
46. Consider $f(x, y) = x^2 + xy + y^2$.
$$\nabla f = (2x + y,\ x + 2y)$$
Thus $\nabla f$ is zero when
$$2x + y = 0 \qquad x + 2y = 0$$
that is, only at the origin. This is the only critical point, and a
minimum at that.
47. $f(x, y, z) = x \cos y + z$
$$\nabla f = (\cos y,\ -x \sin y,\ 1)$$
is never zero, so $f$ has no critical points.
48. $f(x, y, z) = x \cos(yz)$
$$\nabla f = (\cos(yz),\ -xz \sin(yz),\ -xy \sin(yz))$$
$\nabla f$ is zero only when $x = 0$ and $yz = \pi(n + \tfrac12)$ for any integer $n$. Clearly,
$f$ has both negative and positive values near any point of the plane
$\{x = 0\}$, so no such point is a maximum or minimum. Thus, $f$ has no extreme values.
• EXERCISES
20. Find the first partial derivatives of these functions.
(a) $xyz$ (b) $\sin(xy)$ (c) $x^{y^z}$ (d) $x^2 y + y^2 x$
21. Differentiate $x^{x^x}$. (Hint: This is the same as finding the directional
derivative of $x^{y^z}$ at a point $(x, x, x)$ in the direction of $(1, 1, 1)$.)
22. If $f$ is differentiable at $\mathbf{x}_0$, then
$$\frac{\partial f}{\partial x^i}(\mathbf{x}_0) = df(\mathbf{x}_0, \mathbf{E}_i)$$
for all $i$.
23. Suppose that $f$, $g$ are differentiable at $\mathbf{x}_0$ in $R^n$. Show that $fg$ is also
differentiable and $\nabla(fg)(\mathbf{x}_0) = f(\mathbf{x}_0)\nabla g(\mathbf{x}_0) + g(\mathbf{x}_0)\nabla f(\mathbf{x}_0)$.
24. If $f$ is differentiable at $\mathbf{x}_0$, and $f(\mathbf{x}_0) \ne 0$, then
$$\nabla\left(\frac{1}{f}\right)(\mathbf{x}_0) = -\frac{1}{f(\mathbf{x}_0)^2}\,\nabla f(\mathbf{x}_0)$$
25. What is the minimum of $x^2 + y^2 + (2y + 1)^2$?
26. What is the maximum of
$$\frac{x + 3y}{1 + x^2 + y^2}\ ?$$
27. Compute the differentials of the functions in Exercise 20.
• PROBLEMS
44. Suppose $f$ is a differentiable function of two variables and $g_1, g_2$ are
differentiable functions of one variable so that the range of $(g_1, g_2)$ is in the
domain of $f$. Find the derivative of $h(t) = f(g_1(t), g_2(t))$.
45. Let $f$ be a differentiable function of two variables. Show that $f$ is a
function of $x - y$ alone if and only if $\partial f/\partial x + \partial f/\partial y = 0$.
46. Suppose that $L: R^n \to R$ is a linear function. What is $\nabla L$?
47. Let $T: R^n \to R^n$ be a linear transformation. Define the function on
$R^n \times R^n$: $f(\mathbf{x}, \mathbf{y}) = \langle T\mathbf{x}, \mathbf{y} \rangle$. Show that $f$ is differentiable, and $\nabla f(\mathbf{x}, \mathbf{y}) =
(T^t \mathbf{y}, T\mathbf{x})$ (recall that $T^t$ is the transpose of $T$: if $T$ is represented by the
matrix $(a_j^i)$, then $T^t$ is represented by $(b_j^i)$ where $b_j^i = a_i^j$).
48. If $T: R^n \to R^n$ is a linear transformation, then the function $g(\mathbf{x}) =
\langle T\mathbf{x}, \mathbf{x} \rangle$ is differentiable, and $\nabla g(\mathbf{x}) = T\mathbf{x} + T^t \mathbf{x}$.
2.9 Improper Integrals
We return now to the study of functions of one variable; in fact, we will
be considering functions defined on the whole real line. Our interest will
focus on the "behavior at infinity" of such functions. For this purpose we
introduce the notion of $\lim f(x)$ as $x \to \infty$.
Definition 17. If $f$ is a real-valued function defined in an infinite interval
$\{x: x > a\}$ we say that $f(x)$ converges to $L$ as $x$ becomes infinite, written
$\lim_{x \to \infty} f(x) = L$, if for every $\varepsilon > 0$ there is an $M > 0$ such that $x > M$ implies
$|f(x) - L| < \varepsilon$. Similarly, if $f$ is defined in $\{x: x < b\}$ we say $\lim_{x \to -\infty} f(x) = L$
(the limit of $f(x)$ is $L$ as $x$ becomes negatively infinite) if, for every $\varepsilon > 0$
there is an $M > 0$ such that $x < -M$ implies $|f(x) - L| < \varepsilon$.
Examples
49. $\lim_{x \to \infty} 1/x = 0$. For given $\varepsilon > 0$, we can take $M = \varepsilon^{-1}$. Then
$x > M$ implies $|1/x - 0| < \varepsilon$.
50.
$$\lim_{x \to \infty} \frac{4x^2 + 3x + 5}{8x^2 - 7} = \frac{1}{2}$$
For, so long as $x > 0$,
$$\frac{4x^2 + 3x + 5}{8x^2 - 7} = \frac{4 + 3/x + 5/x^2}{8 - 7/x^2} \tag{2.32}$$
Now, we can compute the desired limit by using the standard algebraic
rules (the limit of a sum is the sum of the limits, etc.). (See Exercise
28.) Since $1/x$, $1/x^2$ tend to zero as $x \to \infty$, the limit of (2.32) as
$x \to \infty$ is $4/8 = 1/2$.
51.
$$\lim_{x \to \infty} \frac{x|x|}{1 + x^2} = 1 \qquad \lim_{x \to -\infty} \frac{x|x|}{1 + x^2} = -1$$
If $x > 0$,
$$\frac{x|x|}{1 + x^2} = \frac{x^2}{1 + x^2} = \frac{1}{1 + 1/x^2}$$
if $x < 0$,
$$\frac{x|x|}{1 + x^2} = \frac{-x^2}{1 + x^2} = \frac{-1}{1 + 1/x^2}$$
52. $\lim_{x \to \infty} \arctan x = \pi/2$.
Definition 17 is the analog for functions defined on an infinite interval
of the notion of convergence of a sequence (a function defined on the integers).
Just as we pass from sequences to series we can pass from infinite limits of
functions to infinite sums, that is, integrals over infinite intervals.
Definition 18. Let $f$ be a continuous function on the interval $\{x: x \ge a\}$.
We say $f$ is integrable if $\lim_{x \to \infty} \int_a^x f$ exists, in which case we write the limit as
$\int_a^\infty f$. $f$ is absolutely integrable if $\lim_{x \to \infty} \int_a^x |f|$ exists.
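Examples
53. $x^{-2}$ is integrable on the interval $[1, \infty)$. For
$$\int_1^m x^{-2}\,dx = -\frac{1}{m} + 1$$
$$\int_1^\infty x^{-2}\,dx = \lim_{m \to \infty} \left( -\frac{1}{m} + 1 \right) = 1$$
54. $x^{-1} \cos x$ is not absolutely integrable on the interval $[1, \infty)$.
For
$$\int_1^\infty \left| \frac{\cos x}{x} \right| dx \ge \sum_{n=1}^{\infty} \int_{2\pi n - \pi/3}^{2\pi n + \pi/3} \frac{\cos x}{x}\,dx$$
Between $2\pi n - \pi/3$ and $2\pi n + \pi/3$, $x^{-1} \cos x \ge (2\pi n + \pi/3)^{-1} \cdot \tfrac12$.
Thus,
$$\int_1^\infty \left| \frac{\cos x}{x} \right| dx \ge \sum_{n=1}^{\infty} \frac{1}{2} \cdot \frac{1}{2\pi n + \pi/3} \cdot \frac{2\pi}{3} = \infty$$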
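The contrast between Examples 53 and 54 shows up clearly if the integrals over [1, m] are tabulated for growing m. The sketch below uses a midpoint rule with a fixed number of sample points per unit length; the helper names and the resolution are choices made here for illustration.

```python
import math

def midpoint(g, a, b, n):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (k + 0.5) * h) for k in range(n))

def partial_integrals(m, per_unit=100):
    """Integrals over [1, m] of x**-2 and of |cos x| / x."""
    n = per_unit * m
    convergent = midpoint(lambda x: x ** -2, 1.0, float(m), n)
    divergent = midpoint(lambda x: abs(math.cos(x)) / x, 1.0, float(m), n)
    return convergent, divergent

for m in (10, 100, 1000):
    print(m, *partial_integrals(m))
```

The first column of values settles toward 1, in line with Example 53, while the second keeps growing by a roughly constant amount each time m is multiplied by 10, which is the logarithmic divergence behind Example 54.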
The theory of integration on infinite intervals is entirely analogous to the
theory of infinite series. We have the following facts (whose counterparts
in the theory of series are easily recognized).
Proposition 21. Let $f$ be continuous on the interval $\{x: x \ge a\}$.
(i) $f$ is absolutely integrable if and only if the set $\{\int_a^x |f|\}$ is bounded.
(ii) If $f$ is absolutely integrable, then $f$ is integrable.
(iii) (Comparison Test). If there exists a $b > a$ and a constant $K$ and an
integrable positive function $g$ defined on $\{x: x \ge b\}$ such that $Kg \ge |f|$, then $f$
is absolutely integrable.
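Proof.
(i) If $|f|$ is integrable, clearly $\{\int_a^x |f|\}$ is bounded. On the other hand, if $\{\int_a^x |f|\}$
is bounded, let $L = \sup\{\int_a^x |f|\}$. Then for $\varepsilon > 0$, $L - \varepsilon$ is not an upper bound, so
there exists an $x_0$ such that $\int_a^{x_0} |f| \ge L - \varepsilon$. Then for all $x \ge x_0$,
$$L \ge \int_a^x |f| \ge \int_a^{x_0} |f| \ge L - \varepsilon$$
so that
$$\left| \int_a^x |f| - L \right| \le \varepsilon$$
(ii) Suppose $\int_a^\infty |f| = L$. Let $c_n = \int_a^n f$. We show that $\{c_n\}$ is a Cauchy sequence.
Let $\varepsilon > 0$. Then there is an $x_0$ such that for $x \ge x_0$,
$$\left| \int_a^x |f| - L \right| < \frac{\varepsilon}{4}$$
Then for $n \ge m \ge x_0$,
$$|c_n - c_m| = \left| \int_m^n f \right| \le \int_m^n |f| = \int_a^n |f| - \int_a^m |f| \le \left| \int_a^n |f| - L \right| + \left| \int_a^m |f| - L \right| < \frac{\varepsilon}{2}$$
Thus $\{c_n\}$ is Cauchy, so converges, say to $c$. We shall show that in fact $\int_a^\infty f = c$.
With $\varepsilon$ and $x_0$ as above, find an integer $N \ge x_0$ so that $|c_n - c| < \varepsilon/2$ for $n \ge N$. Then for $x \ge N$,
$$\left| \int_a^x f - c \right| \le \left| \int_a^N f - c \right| + \int_N^x |f| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2}$$
as in the previous computation.
(iii) Under the given hypothesis, if $x > b$, then
$$\int_a^x |f| \le \int_a^b |f| + K \int_b^\infty g < \infty$$
Thus by (i), $f$ is absolutely integrable.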
Here is an easily derived relationship between the absolute convergence of
series and integrals which provides yet another test for the convergence of
series.
Proposition 22. (Integral Test) Let $f$ be a positive, decreasing function
defined on $R^+$. Then $\int_1^\infty f$ exists if and only if $\sum_{n=1}^{\infty} f(n) < \infty$.
Proof. For $x$ with $n \le x \le n + 1$ we have $f(n) \ge f(x) \ge f(n + 1)$. Thus $f(n) \ge \int_n^{n+1} f \ge
f(n + 1)$. Thus, by comparison the series $\sum \int_n^{n+1} f$ and $\sum f(n)$ converge or diverge
together. But the convergence of the first series is the same as the existence of
$\int_1^\infty f$, and conversely.
This proposition gives an easy proof that $\sum 1/n^{1+\varepsilon} < \infty$ for $\varepsilon > 0$. (Compare
to the work of Example 18.) For if we consider the integral $\int_1^x dt/t^{1+\varepsilon}$, we have
$$\int_1^x \frac{dt}{t^{1+\varepsilon}} = \left. \frac{-1}{\varepsilon t^{\varepsilon}} \right|_1^x = \frac{1}{\varepsilon} - \frac{1}{\varepsilon x^{\varepsilon}} \to \frac{1}{\varepsilon}$$
as $x \to \infty$.
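The inequalities in the proof of the integral test give explicit two-sided bounds on partial sums. As a sketch, take f(t) = t^(-(1+eps)) with eps = 0.5 (a value picked here only for illustration) and use the closed form of the integral computed above; the function names are hypothetical.

```python
def partial_sum(eps, N):
    """Sum of 1 / n**(1 + eps) for n = 1, ..., N."""
    return sum(1.0 / n ** (1.0 + eps) for n in range(1, N + 1))

def integral(eps, x):
    """Closed form of the integral of t**-(1 + eps) from 1 to x."""
    return (1.0 - x ** -eps) / eps

eps = 0.5
for N in (10, 1000, 100000):
    s = partial_sum(eps, N)
    # Summing f(n) >= integral over [n, n+1] >= f(n+1) gives:
    #   integral(1, N+1) <= s <= f(1) + integral(1, N)
    print(N, integral(eps, N + 1), s, 1.0 + integral(eps, N))
```

Since the integral stays below 1/eps for every x, the partial sums never exceed 1 + 1/eps, which is the boundedness needed for convergence.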
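Example
55.
$$\sum_{n=2}^{\infty} \frac{1}{n (\log n)^2} < \infty$$
For
$$\int_2^x \frac{dt}{t (\log t)^2} = \int_{\log 2}^{\log x} \frac{du}{u^2} = \left. -u^{-1} \right|_{\log 2}^{\log x} = \frac{1}{\log 2} - \frac{1}{\log x}$$
Thus
$$\int_2^\infty \frac{dt}{t (\log t)^2} = \lim_{x \to \infty} \left( \frac{1}{\log 2} - \frac{1}{\log x} \right) = \frac{1}{\log 2} < \infty$$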
• EXERCISES
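28. Verify these algebraic properties of lim. Suppose $\lim_{x \to \infty} f(x)$, $\lim_{x \to \infty} g(x)$
exist.
(a) $\lim_{x \to \infty} [f(x) + g(x)] = \lim_{x \to \infty} f(x) + \lim_{x \to \infty} g(x)$.
(b) $\lim_{x \to \infty} f(x) g(x) = \lim_{x \to \infty} f(x) \cdot \lim_{x \to \infty} g(x)$.
(c) $\lim_{x \to \infty} \dfrac{f(x)}{g(x)} = \dfrac{\lim_{x \to \infty} f(x)}{\lim_{x \to \infty} g(x)}$, if $\lim_{x \to \infty} g(x) \ne 0$.
29. Compute these limits as $x \to \infty$.
(a) $\sin x$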
χ2 + Зх + 1
(Ь) *·+! "
л:2- 1
(d) $\tan \dfrac{1}{x}$.
(e) $x \sin \dfrac{1}{x}$
w x2 + 1
30. Which of these series converge:
(a) Σ-1- (Ο Σ —
V »ti и log и W »=2 n3'2
oo J oo С— iyi
(Ь) Σ -тг^ (β) Σ ( У
η = г rt(log η)2 η = 2 (log η)2
oo 1 oo 1
ω Σ „ _ _ ч2 (h) Σ
η = 2 ^(log log л)2 п=2 (log л)2
00 1 OO 1
№ Σ ^—ττττ—ίΤΤ^ (0 Σ
η=2 (log /i)2(log log я)2 n=2 (n sin я)2
oo 1 oo 1
(e) Σ —, TV <J) Σ
Η
n=l Ι 7Γ η π=3 η log(logn)1+t
tan' '
2.10 The Space of Continuous Functions
The mathematician attacks his problems with a certain store of techniques.
Occasionally a problem will require the development of a new technique;
more often the problem is solved by viewing it in one way, and then another
and then again another until a viewpoint is obtained which allows for the
application of one of those techniques. Sometimes if the viewpoint is clever
enough, or profound enough—or naive enough—the applicable technique
is quite elementary and surprising and leads to further deep discoveries.
This is the case with the contraction lemma (a fixed point theorem) which we
shall apply several times in this text to obtain some of the basic facts of
calculus. First, in this section, we shall develop the particular viewpoint in
the relevant context. It is simple enough—instead of looking at continuous
functions one at a time, we consider them all.
Let us illustrate this with a particular problem. Suppose we are interested
in finding a differentiable function with these properties:
$$f'(x) = f(x) \text{ for all } x \quad \text{and} \quad f(0) = 1 \tag{2.33}$$
To find such a function means first of all to verify that a solution to our
problem exists, and secondly to establish some technique for computing it.
We already have enough experience with calculus to know that this second
objective will be hard to fulfill. What we in fact seek is a means of effectively
approximating our solution. This provides a clue: let us look for a sequence
of functions $\{f_n\}$ which converges to a function with the properties (2.33).
Such a sequence would be a sequence of differentiable functions $\{f_n\}$ such
that the sequence $\{f_n(x)\}$ converges for all $x$, and $f_n(x) = f'_{n-1}(x)$. If we
had such a sequence, we could take the limit and deduce that
$$\lim f_n(x) = \lim f'_{n-1}(x)$$
so $f(x) = \lim f_n(x)$ will solve our problem.
Now this is a good idea, because Equation (2.33) itself provides the
technique for generating such a sequence. Let $f_0$ be any function, and define
$f_1 = f'_0$. Then let $f_2 = f'_1$, $f_3 = f'_2$, and so forth. Will the sequence
$\{f_n\}$ converge? Well, that is a problem. Notice that $f_2 = f'_1 = f''_0$,
$f_3 = f'_2 = f'''_0$, and more generally $f_n = f_0^{(n)}$. Thus, we must be very
careful to choose an infinitely differentiable function for $f_0$. Suppose $f_0$ is
chosen as a polynomial of degree $n$. Then $f_{n+1} = f_0^{(n+1)} = 0$, and so all
the rest of our functions are zero. Thus, the sequence certainly converges,
but hardly to a solution, since the condition f(0) = 1 is not verified. In fact,
this present approach has obviously petered out fruitlessly and it may be
because we have not incorporated the initial condition f(0) = 1 in our
approach. Can we put all of (2.33) in one statement, and then proceed
with this technique of generating an approximating sequence? The
fundamental theorem of calculus says yes; in fact, (2.33) can be rewritten as
    $f(x) = \int_0^x f(t)\,dt + 1$   (2.34)
This now is an operation involving integration rather than differentiation,
and so we have the added advantage of not having to choose a very well-
behaved function for the first approximant. Let us try again, with (2.34)
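rather than (2.33). Letting $f_0 = 1$, we find
    $f_1(x) = \int_0^x 1\,dt + 1 = x + 1$
    $f_2(x) = \int_0^x (t + 1)\,dt + 1 = \frac{x^2}{2} + x + 1$
    $f_3(x) = \int_0^x \left(\frac{t^2}{2} + t + 1\right) dt + 1 = \frac{x^3}{3!} + \frac{x^2}{2} + x + 1$
    $f_n(x) = \int_0^x \left(\frac{t^{n-1}}{(n-1)!} + \cdots + t + 1\right) dt + 1 = \frac{x^n}{n!} + \cdots + x + 1$   (2.35)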
Now we're getting somewhere. We have already seen that the series (2.35)
converges for any x. Thus, letting
    $f(x) = \lim f_n(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$
this must be the sought after function. (Of course the reader has long since
recognized the solution of our problem as being the exponential function.
Thus he should be reassured to see that it did in fact turn out that way.)
What we need now is the theoretical mathematics that will allow us to take
the limit in (2.35) and correctly deduce
    $f(x) = \int_0^x f(t)\,dt + 1 = \sum_{n=0}^{\infty} \frac{x^n}{n!}$
Thus we are led to the question of convergence in the space of continuous
functions. We now proceed to that theory.
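Before turning to the theory, the successive approximations built from (2.34) can be carried out mechanically. The following is a minimal sketch, assuming we represent each approximant as a list of exact rational coefficients; the function names (`picard_step`, `picard_iterates`, `evaluate`) are our own illustrative choices and do not come from the text.

```python
from fractions import Fraction
from math import e, factorial

def picard_step(coeffs):
    """Apply f -> 1 + integral_0^x f(t) dt to a polynomial.

    coeffs[k] is the coefficient of x**k.
    """
    # Integrating x**k gives x**(k+1) / (k+1); the constant term starts at 0.
    integrated = [Fraction(0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    integrated[0] += 1  # the "+ 1" carries the initial condition f(0) = 1
    return integrated

def picard_iterates(n):
    """Return the coefficient list of f_n, starting from f_0 = 1."""
    f = [Fraction(1)]
    for _ in range(n):
        f = picard_step(f)
    return f

def evaluate(coeffs, x):
    """Evaluate the polynomial at a floating point x."""
    return sum(float(c) * x**k for k, c in enumerate(coeffs))

f5 = picard_iterates(5)
print(f5)                                      # the coefficients 1/k! for k = 0..5
print(evaluate(picard_iterates(15), 1.0), e)   # the two values agree closely
```

Each step raises the degree by one and leaves the earlier coefficients untouched, which is why the approximants are exactly the partial sums of the series in (2.35).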
Let X be a closed bounded set in $R^n$, and let C(X) denote the space of all
continuous complex-valued functions on X. We know that if f and g are two
functions in C(X), then so are f + g and fg and cf for c a complex number.
In particular, C(X) is a vector space on which multiplication is defined.
The vector space C(X) is quite different from the vector spaces $C^n$, $R^n$: C(X)
is usually infinite dimensional (see Problem 49). C(X) does not have any
obvious "standard basis"; in fact, we wouldn't know how to choose one.
In other particulars, however, C(X) is not very different. There is in this
space a reasonable notion of closeness. Two functions are close if their
values are everywhere close; that is, if the maximum of their difference is
small. This leads to a notion of length and distance in C(X).
Definition 19. Let X be a closed and bounded set in $R^n$, and C(X) the
space of continuous functions on X. If $f \in C(X)$, the length of f is
    $\|f\| = \max\{|f(x)| : x \in X\}$
If f, g are in C(X), the distance between f and g is $\|f - g\|$.
The properties of length and distance are just those of the corresponding
notions in $R^n$:
    $\|cf\| = |c|\,\|f\|$
    $\|f + g\| \le \|f\| + \|g\|$
If $\|f\| = 0$, then f = 0. What is important is that we can consider the
notion of convergence of a sequence of continuous functions. We say that
$f_n \to f$ if $\|f_n - f\| \to 0$, that is, if the distance between the general term of the
sequence and f becomes arbitrarily small. This is the same as saying that
the values of $f_n$ at points of X converge to the values of f in a uniform manner.
The value of these notions lies not only in their naturality, but in the now
realizable possibility of finding specific functions satisfying given properties
by techniques of approximation. Let us make this precise.
Definition 20. Let X be a closed bounded set, and $\{f_n\}$ a sequence in
C(X). We say that $\{f_n\}$ is uniformly convergent if there is an $f \in C(X)$ such
that
    $\lim \|f_n - f\| = 0$
We say that the sequence is uniformly Cauchy if, for every ε > 0 there is an N
such that
    $\|f_n - f_m\| < \varepsilon$ whenever $n, m \ge N$
Examples
56. Let X be the interval [0, 1], $f_n(x) = (1 - x)x^n$. This sequence
converges uniformly to zero. Let us compute $\max |f_n(x)| = \|f_n\|$.
    $f_n'(x) = n(1 - x)x^{n-1} - x^n$
so $f_n'(x) = 0$ has the solutions x = 0, x = n/(n + 1). Thus
    $\|f_n\| = f_n\left(\frac{n}{n+1}\right) = \frac{1}{n+1}\left(\frac{n}{n+1}\right)^n$
which tends to zero.
57. On the same interval the sequence $f_n(x) = \sin(x/n)$ tends to zero,
for
    $\|f_n\| = \sin\frac{1}{n} \to 0$ as $n \to \infty$
58. Consider the convergence of the sequence $\{nx \sin(x/n)\}$ on the
interval [0, 1]. Now we know that $\sin(x/n) \to 0$ as $n \to \infty$, but
$nx \to \infty$, so we cannot make any deduction about the product.
We have to refine our information about $\sin(x/n)$. For large values of
n, it is very close to x/n. Thus
    $nx \sin\frac{x}{n} \approx nx \cdot \frac{x}{n} = x^2$   (2.36)
so we guess that $nx \sin(x/n) \to x^2$. Let us prove it by computing
    $\left\| nx \sin\frac{x}{n} - x^2 \right\|$   (2.37)
In order to do that, let us provide an estimate to our guess (2.36).
    $\left| \sin\frac{x}{n} - \frac{x}{n} \right| < \frac{1}{n^2}$ in the interval [0, 1]   (2.38)
Then (2.37) becomes
    $\left\| nx \sin\frac{x}{n} - x^2 \right\| = \left\| nx\left(\sin\frac{x}{n} - \frac{x}{n}\right) + nx \cdot \frac{x}{n} - x^2 \right\| = \left\| nx\left(\sin\frac{x}{n} - \frac{x}{n}\right) \right\|$   (2.39)
    $\le \|nx\| \cdot \left\| \sin\frac{x}{n} - \frac{x}{n} \right\| < n \cdot \frac{1}{n^2} = n^{-1}$   (2.40)
and since $n^{-1} \to 0$ as $n \to \infty$, we are through.
59. On the interval [0, 1] the sequence $\{\sin nx\}$ is not convergent.
It is not even a Cauchy sequence. The distance $\|\sin nx - \sin mx\|$
does not become arbitrarily small as $n, m \to \infty$. In particular, if
m = 2n (and n ≥ 2, so that π/2n lies in [0, 1]), we have
    $\|\sin(nx) - \sin(2nx)\| \ge \left| \sin\left(n \cdot \frac{\pi}{2n}\right) - \sin\left(2n \cdot \frac{\pi}{2n}\right) \right| = 1$
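The lengths computed in Examples 56, 58, and 59 can be checked numerically. The sketch below approximates the maximum over [0, 1] by sampling a fine uniform grid; this is only an approximation to the true maximum (the grid size is an arbitrary choice of ours), and the helper names are hypothetical.

```python
import math

def sup_norm(f, a=0.0, b=1.0, samples=20001):
    """Approximate max |f(x)| on [a, b] by sampling a uniform grid."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step)) for i in range(samples))

def norm_example_56(n):
    """Approximate length of f_n(x) = (1 - x) x^n on [0, 1]."""
    return sup_norm(lambda x: (1 - x) * x**n)

def exact_example_56(n):
    """The value of f_n at its critical point x = n / (n + 1)."""
    return (1 / (n + 1)) * (n / (n + 1)) ** n

def norm_example_58(n):
    """Approximate distance from n x sin(x/n) to x^2 on [0, 1]."""
    return sup_norm(lambda x: n * x * math.sin(x / n) - x * x)

def distance_example_59(n):
    """Approximate distance between sin(nx) and sin(2nx) on [0, 1]."""
    return sup_norm(lambda x: math.sin(n * x) - math.sin(2 * n * x))

for n in (2, 10, 100):
    print(n, norm_example_56(n), exact_example_56(n),
          norm_example_58(n), distance_example_59(n))
```

The first two columns should nearly coincide, the third should stay below 1/n, and the last should stay at least 1 however large n becomes, in agreement with the three examples.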
The basic theorem about convergence of continuous functions is the
following, which plays the same role in C(X) as the least upper bound axiom
does for R. It provides the assertion of existence of functions with prescribed
properties. In order to verify that a sequence of functions has a continuous
limit, we need only verify that it is a uniformly Cauchy sequence.
Theorem 2.14. A uniformly Cauchy sequence of continuous functions is
uniformly convergent.
Proof. Suppose $\{f_n\}$ is a uniformly Cauchy sequence of continuous functions
on X. This means: for every ε > 0, there is an N > 0 such that $\|f_n - f_m\| < \varepsilon$ for
$n, m \ge N$. This means precisely
    $|f_n(x) - f_m(x)| < \varepsilon$ for all $x \in X$   (2.41)
Thus, for each x, $\{f_n(x)\}$ is a Cauchy sequence of numbers, and
thus converges. Denote the limit, $\lim f_n(x)$, by f(x). We must show that this
function $x \to f(x)$ is continuous, and that $f_n$ converges uniformly to f.
First of all, if ε > 0, choose N as above, and let $m \to \infty$ in (2.41). We obtain, for
$n \ge N$,
    $\lim_{m \to \infty} |f_n(x) - f_m(x)| = |f_n(x) - f(x)| \le \varepsilon$ for all $x \in X$
Thus, if $n \ge N$, $\|f_n - f\| \le \varepsilon$. This implies that $\lim_{n \to \infty} \|f_n - f\| = 0$, as desired.
Now f is continuous. Fix $x_0 \in X$. Let ε > 0 and choose N so large that
$\|f_N - f\| < \varepsilon/3$. Since $f_N$ is continuous, there is a δ > 0 such that $\|x - x_0\| < \delta$
implies $|f_N(x) - f_N(x_0)| < \varepsilon/3$. Then if $\|x - x_0\| < \delta$,
    $|f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$
as desired.
Having seen one vector space of functions, we can easily see them
everywhere. The collection of bounded real-valued functions on a set X is a
vector space over the reals. The collection of all bounded functions on X
taking values in $R^n$ is also a vector space; similarly, the space of continuous
functions taking values in $R^n$. All the spaces here are endowed with the
same concept of length:
    $\|f\| = \sup\{\|f(x)\| : x \in X\}$
Of even more interest are the spaces of functions on which are defined some
analytic operations. For example, if I is an interval, the space of all
real-valued functions which are differentiable on I is a vector space. The space
$C^1(I)$ of all functions whose derivative is continuous is also a vector space,
as is the space $C^{(n)}(I)$ of all functions which have continuous nth derivatives.
The space R(I) of functions which are integrable on I is a vector space. These
(and other) examples are further elaborated in the exercises. Suffice it to
say here that the mathematical theory which follows this point of view
(called functional analysis) is a recent (20th-century) development which has
had profound impact, not only in foundations of mathematics, but in the
practical application of mathematics in all branches of science.
Let us return to the space C(X) of continuous functions on a closed
bounded set X in $R^n$. Once we begin thinking of these functions as points
in a space, on which are defined such notions as distance and convergence,
we are easily led to consider functions on that space. Naturally, such a
function is continuous if it takes convergent sequences into convergent
sequences.
Examples
60. Let $g \in C(X)$ and define $\varphi(f) = fg$. φ is continuous, for if
$f_n \to f$, that is, $\|f_n - f\| \to 0$, then
    $\|f_n g - fg\| \le \|f_n - f\|\,\|g\| \to 0$
61. Define $\psi: C(X) \to C(X)$, $\psi(f) = f^2$. ψ is also continuous, for
    $\|f_n^2 - f^2\| = \|(f_n - f)(f_n + f)\| \le \|f_n - f\| \cdot \|f_n + f\|$   (2.42)
If $f_n \to f$, the term $\|f_n + f\|$ remains bounded while $\|f_n - f\| \to 0$,
thus also $\|f_n^2 - f^2\| \to 0$.
62. If P is any polynomial, $\psi_P(f) = P(f)$ is continuous on C(X)
(Problem 55).
63. Define $M: C(X) \to R$, $M(f) = \|f\|$.
This is continuous, since
    $|M(f) - M(g)| = \left|\, \|f\| - \|g\| \,\right| \le \|f - g\|$
64. Let $x_0 \in X$ and define $F_0: C(X) \to R$, $F_0(f) = f(x_0)$. Certainly
$F_0$ is continuous: for if $f_n \to f$ in C(X), then the maximum over X
of $|f_n(x) - f(x)|$ tends to zero; in particular, $|f_n(x_0) - f(x_0)| \to 0$, so
$F_0(f_n) \to F_0(f)$.
65. The definite integral is a continuous function on C(I), where
$I = [a, b] \subset R$. For
    $\left| \int_I f_n - \int_I f \right| = \left| \int_I (f_n - f) \right| \le \|f_n - f\|(b - a)$
so if $f_n \to f$ also $\int_I f_n \to \int_I f$. A stronger and more important statement
than that of Example 65 is that the indefinite integral, as a function
from C(I) to C(I), is continuous. This is contained in the next
proposition.
Proposition 23. Let $I = \{x \in R: a \le x \le b\}$. Suppose $\{f_n\}$ is a sequence of
continuous functions on I converging uniformly to f. Let $F_n(x) = \int_a^x f_n$,
$F(x) = \int_a^x f$. Then $F_n \to F$ uniformly.
Proof.
    $|F_n(x) - F(x)| = \left| \int_a^x (f_n - f) \right| \le \|f_n - f\|(x - a) \le \|f_n - f\|(b - a)$
Thus, taking the maximum on the left,
    $\|F_n - F\| \le \|f_n - f\|(b - a)$
so if $f_n \to f$ uniformly so also $F_n \to F$.
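The inequality in the proof of Proposition 23 can be observed numerically. As an illustration only, we take the interval [0, 2], let $f_n$ be the nth partial sum of the exponential series and f the exponential function (both choices are ours, not the text's), and compare sampled maxima on a uniform grid.

```python
import math

A, B = 0.0, 2.0
GRID = [A + (B - A) * i / 2000 for i in range(2001)]

def f_n(n, x):
    """Partial sum of the exponential series, a continuous function on [A, B]."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def F_n(n, x):
    """Integral of f_n from A = 0 to x, computed term by term."""
    return sum(x**(k + 1) / math.factorial(k + 1) for k in range(n + 1))

def sup_distance(g, h):
    """Approximate the distance between g and h by sampling GRID."""
    return max(abs(g(x) - h(x)) for x in GRID)

for n in (2, 5, 10):
    d_f = sup_distance(lambda x: f_n(n, x), math.exp)
    # The integral of exp from 0 to x is exp(x) - 1.
    d_F = sup_distance(lambda x: F_n(n, x), lambda x: math.exp(x) - 1.0)
    print(n, d_f, d_F, d_F <= d_f * (B - A))
```

In each row the distance between the integrals is controlled by (b - a) times the distance between the integrands, so the integrals converge uniformly as soon as the integrands do.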
Problem 56 is intended to demonstrate that on the other hand,
differentiation is not a continuous function on C(I). (It isn't even everywhere defined;
i.e., there are continuous functions that do not have a derivative.)
Nevertheless, Proposition 23 has this consequence for differentiation.
Proposition 24. Let $\{f_n\}$ be a sequence of continuously differentiable
functions on the interval [a, b] and suppose that (i) $\{f_n'\}$ is uniformly Cauchy,
(ii) $f_n(a) = 0$ for all n. Then $\{f_n\}$ is uniformly convergent to a differentiable
function f and $f' = \lim f_n'$.
Proof. The proof of this proposition consists in a rereading of Proposition 23
via the fundamental theorem of calculus. By that theorem
    $f_n(x) = \int_a^x f_n'$
so by Proposition 23, $f_n$ is also convergent. If we let $g = \lim f_n'$, then $\lim f_n = \int_a^x g$.
Thus, $\lim f_n$ is indeed differentiable and its derivative is $g = \lim f_n'$.
Let us return now to the consideration of our original problem. In fact,
let us generalize it slightly. Let c be a complex number, and let us seek a
differentiable complex-valued function f such that
    $f'(x) = cf(x)$ for all x and $f(0) = 1$   (2.43)
This is, by the fundamental theorem of calculus, the same as seeking a
continuous function f such that
    $f(x) = c \int_0^x f(t)\,dt + 1$   (2.44)
Now that we have the necessary theory and point of view available, we may
follow a more sophisticated approach. Let I be the interval I = [-R, R],
and define the function T on C(I):
    $Tf(x) = c \int_0^x f(t)\,dt + 1$   (2.45)
We seek a function f such that f = Tf, that is, a fixed point of the
transformation. Our technique is that of successive approximation. Let $f_0$ be any
continuous function, and define $f_1 = Tf_0$, $f_2 = Tf_1 = T^2 f_0$, and in general
$f_n = Tf_{n-1} = T^n f_0$. We must show that the sequence $\{f_n\}$ converges. If
we choose $f_0 = 1$ we can compute the sequence explicitly, and we find that
    $f_n(x) = \frac{(cx)^n}{n!} + \frac{(cx)^{n-1}}{(n-1)!} + \cdots + cx + 1$
Then if m > n,
    $f_m(x) - f_n(x) = \frac{(cx)^m}{m!} + \frac{(cx)^{m-1}}{(m-1)!} + \cdots + \frac{(cx)^{n+1}}{(n+1)!}$
On the interval [-R, R] the maximum of this expression is dominated by
replacing c by |c|, and x by R. Thus,
    $\|f_m - f_n\| \le \frac{(|c|R)^m}{m!} + \frac{(|c|R)^{m-1}}{(m-1)!} + \cdots + \frac{(|c|R)^{n+1}}{(n+1)!} = \sum_{k=0}^{m} \frac{(|c|R)^k}{k!} - \sum_{k=0}^{n} \frac{(|c|R)^k}{k!}$   (2.46)
Since the series
    $\sum_{k=0}^{\infty} \frac{(|c|R)^k}{k!}$
converges, its sequence of partial sums is a Cauchy sequence, so by (2.46),
$\{f_n\}$ is a Cauchy sequence and is thus uniformly convergent. Since T is
continuous on C(I), we have
    $\lim f_n = \lim T(f_{n-1}) = T(\lim f_{n-1}) = T(\lim f_n)$
so $\lim f_n$ solves the given problem. This function is important enough for us
to spend a few more paragraphs discussing it.
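The successive approximations for (2.45) and the estimate (2.46) can be traced on a machine. The following is a minimal sketch, assuming polynomial approximants stored as lists of complex coefficients; the sample values c = i and R = 2, the grid, and all function names are illustrative assumptions of ours.

```python
import cmath
import math

def apply_T(coeffs, c):
    """T f(x) = c * integral_0^x f(t) dt + 1, acting on polynomial coefficients."""
    # The constant term is 1; the coefficient of x**(k+1) is c * a_k / (k + 1).
    return [1 + 0j] + [c * a / (k + 1) for k, a in enumerate(coeffs)]

def iterate_T(c, n):
    """Return the coefficients of f_n = T^n f_0 with f_0 = 1."""
    f = [1 + 0j]
    for _ in range(n):
        f = apply_T(f, c)
    return f

def evaluate(coeffs, x):
    return sum(a * x**k for k, a in enumerate(coeffs))

def tail_bound(c, R, n, m):
    """Right-hand side of (2.46): sum of (|c|R)^k / k! for n < k <= m."""
    return sum((abs(c) * R) ** k / math.factorial(k) for k in range(n + 1, m + 1))

c, R = 1j, 2.0
f10, f20 = iterate_T(c, 10), iterate_T(c, 20)
grid = [-R + 2 * R * i / 400 for i in range(401)]
gap = max(abs(evaluate(f20, x) - evaluate(f10, x)) for x in grid)
print(gap, tail_bound(c, R, 10, 20))            # the gap stays below the bound
print(evaluate(iterate_T(c, 30), 2.0), cmath.exp(2j))
```

The last line compares the thirtieth approximant at x = 2 with the library value of $e^{2i}$; the agreement reflects the uniform convergence on [-R, R] established above.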
Definition 21. The exponential function, denoted exp(cx), or $e^{cx}$, for any
complex number c is the solution of the differential equation
    $f'(x) = cf(x)$,  $f(0) = 1$
First of all, this definition makes sense, because there is only one solution.
If g also solves, then
    $\frac{d}{dx}\left[\frac{e^{cx}}{g}\right] = \frac{ce^{cx}g - e^{cx}g'}{g^2} = \frac{ce^{cx}g - ce^{cx}g}{g^2} = 0$
since $g' = cg$. Thus $e^{cx}g^{-1}$ is constant. Since its value at 0 is 1, $e^{cx}g^{-1} = 1$,
or $e^{cx} = g$. From these discussions we have these additional properties of
the exponential function.
Proposition 25.
(i) $e^{cx} = \sum_{n=0}^{\infty} \frac{(cx)^n}{n!}$.
(ii) $e^{x+y} = e^x e^y$.
(iii) $e^{cx}$ is never zero.
Proof. Part (i) follows directly from the argument above. Part (ii) follows
from the uniqueness. Fix y, and define $h(x) = e^{x+y}/e^y$. Then
    $h'(x) = \frac{e^{x+y}}{e^y} = h(x)$ and $h(0) = \frac{e^y}{e^y} = 1$
Thus we must have $h(x) = e^x$, so (ii) is verified. Part (iii) follows immediately
from (ii):
    $e^{cx}e^{-cx} = e^{cx-cx} = e^0 = 1$
so $(e^{cx})^{-1} = e^{-cx}$.
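The addition formula and the nonvanishing of the exponential can be spot-checked from the series in part (i). This is a sketch under stated assumptions: we truncate the series at 60 terms (ample for the small arguments used here, but an arbitrary cutoff), and `exp_series` is a hypothetical helper of ours.

```python
def exp_series(z, terms=60):
    """Partial sum of sum_{n>=0} z**n / n!, built term by term."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # turns z**n / n! into z**(n+1) / (n+1)!
    return total

x, y = 0.7 + 0.2j, -1.3 + 2.0j
# Part (ii): e^(x+y) and e^x e^y should agree up to rounding.
print(abs(exp_series(x + y) - exp_series(x) * exp_series(y)))
# Part (iii): e^x times e^(-x) should be 1, so e^x cannot vanish.
print(abs(exp_series(x) * exp_series(-x) - 1))
```

Both printed quantities should be at the level of floating point rounding error.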
• PROBLEMS
49. Let I be a nonempty interval in R. Show that C(I) is infinite
dimensional.
50. Show that the sequence of functions on the closed unit disk in C
defined by
    $f_n(z) = \sum_{k=1}^{n} \frac{z^k}{k^2}$
converges.
51. Does the sequence $\left\{ \sum_{k=1}^{n} z^k/k \right\}$ converge on the closed unit disk?
52. Let $\{a_n\}$ be a sequence of complex numbers such that $\sum |a_n| < \infty$.
Verify these facts:
(a) For every z, $|z| \le 1$, $f(z) = \sum_{n=1}^{\infty} a_n z^n$ converges, and
(b) f is continuous on $\{z \in C: |z| \le 1\}$. This is true because f is the
uniform limit of the polynomials $f_N(z) = \sum_{n=1}^{N} a_n z^n$, since
$\|f - f_N\| \le \sum_{n=N+1}^{\infty} |a_n| \to 0$ as $N \to \infty$.
53. Let f, g be continuous functions on the closed and bounded set X.
Show that $\|fg\| \le \|f\|\,\|g\|$. Is $\|fg\| < \|f\| \cdot \|g\|$ possible?
54. Show that on the interval [0, 1],
    $\left| \sin\frac{x}{n} - \frac{x}{n} \right| < \frac{1}{n^2}$ for all n
55. Let $x_1, \ldots, x_k \in X$ and p be any polynomial in k variables. Define
    $\Psi: C(X) \to C$
    $\Psi(f) = p(f(x_1), \ldots, f(x_k))$
Show that Ψ is continuous.
56. Find a sequence $\{f_n\}$ of differentiable functions which is uniformly
convergent, but such that $\{f_n'(x)\}$ is not convergent.
2.11 The Fixed Point Theorem
2.11 The Fixed Point Theorem
The fixed point theorem is a generalization of the technique of successive
approximations described above in the discussion of the exponential function.
This technique was first used by Newton as a technique for finding roots of
polynomial equations. Simply stated, Newton's method is this. First,
a technique is described by means of which one can transform a given
approximation to a root into a better approximation. One then chooses a reasonable
approximation, applies this technique to it to find a better one. Having
this, one again applies the technique: if it's a good one, the result is an even
better approximation. Continuing in this way, one obtains a sequence of
approximations which should converge to the root. Now, having described
the procedure, let us turn to Newton's specific technique for bettering
approximations.
Let f be a given real polynomial. We want to find a point $x_0$ such that
$f(x_0) = 0$. Choose a $p_1$ so that $f(p_1)$ is small. Now, replace the function by
its linear approximation at $p_1$: $L(x) = f(p_1) + f'(p_1)(x - p_1)$, and let $p_2$ be
the root of L(x) = 0. In other words, replace the graph of f by its tangent
line and let $p_2$ be the x intercept of that line (see Figure 2.17). Now apply
this procedure to $p_2$. Let $p_3$ be the root of the linear approximation to f
at $p_2$, and so forth. We can describe Newton's technique abstractly as
follows: For any point p, let T(p) be the zero of the linear approximation of
f at p: T(p) solves the equation $f(p) + f'(p)(T(p) - p) = 0$. (We must
assume that $f' \ne 0$ for T to be a well-defined function.) Clearly, if f(p) = 0,
we have T(p) = p, and conversely; thus we are in reality seeking a fixed
point of T!
Figure 2.17
Suppose T has the property of contraction on some interval I. There is a
c < 1 such that $|Tx - Ty| \le c|x - y|$, all $x, y \in I$. Then Newton's method
works. There is a root of f(x) = 0 (or f'(x) = 0) on the interval I, and it is
the limit of the sequence $x_0, Tx_0, T^2x_0, \ldots$, where $x_0$ is any point of I.
This is the content of the fixed point theorem.
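Newton's map T is easy to program for polynomials. The sketch below is one possible arrangement, not a definitive implementation: coefficients are listed from the highest degree down, the cubic $x^3 - 2x - 5$ and the starting point 2 are illustrative choices of ours, and no attempt is made to verify the contraction hypothesis beforehand.

```python
def poly_eval(coeffs, x):
    """Evaluate a polynomial by Horner's rule; coeffs run from highest degree down."""
    value = 0.0
    for a in coeffs:
        value = value * x + a
    return value

def poly_derivative(coeffs):
    """Coefficients of the derivative, again from highest degree down."""
    n = len(coeffs) - 1
    return [a * (n - i) for i, a in enumerate(coeffs[:-1])]

def newton_step(coeffs, p):
    """T(p): the zero of the linear approximation f(p) + f'(p)(x - p)."""
    slope = poly_eval(poly_derivative(coeffs), p)
    if slope == 0:
        raise ZeroDivisionError("f'(p) = 0, so T(p) is undefined")
    return p - poly_eval(coeffs, p) / slope

def newton(coeffs, p, steps=20):
    """Return the result of applying T to p the given number of times."""
    for _ in range(steps):
        p = newton_step(coeffs, p)
    return p

cubic = [1.0, 0.0, -2.0, -5.0]     # x^3 - 2x - 5, an illustrative choice
root = newton(cubic, 2.0)
print(root, poly_eval(cubic, root))
```

At the computed root the value of the polynomial is negligible and a further application of T leaves the point essentially unchanged, which is the fixed point property described above.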
We now state and prove it explicitly for subsets of C(X). It will be clear
that the theorem is true for subsets of R", by virtue of the same argument.
Theorem 2.15. Suppose S is a closed set of functions in C(X): that is,
S contains all limits of sequences in S. Suppose T is a mapping of S into S
which is a contraction, that is, there is a c < 1 such that
    $\|T(f) - T(g)\| \le c\|f - g\|$ for all $f, g \in S$
Then there is a unique continuous function $f_0$ in S such that $T(f_0) = f_0$.
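Proof. Certainly the fixed point is unique. For if $T(f_0) = f_0$ and $T(f_1) = f_1$, then
$\|f_0 - f_1\| = \|T(f_0) - T(f_1)\| \le c\|f_0 - f_1\| < \|f_0 - f_1\|$ unless $\|f_0 - f_1\| = 0$, that
is, $f_0 = f_1$.
Now let $g \in S$. Let the sequence $\{g_n\}$ be defined as follows: $g_1 = g$, $g_2 = Tg_1$,
$g_3 = Tg_2, \ldots, g_n = Tg_{n-1}$. $\{g_n\}$ is a Cauchy sequence. For
    $\|g_{n+1} - g_n\| = \|Tg_n - Tg_{n-1}\| \le c\|g_n - g_{n-1}\|$
so we can verify by induction that
    $\|g_{n+1} - g_n\| \le c^{n-1}\|g_2 - g_1\|$
Thus, for m > n we have
    $\|g_m - g_n\| = \left\| \sum_{j=n}^{m-1} (g_{j+1} - g_j) \right\| \le \sum_{j=n}^{m-1} \|g_{j+1} - g_j\| \le \left( \sum_{j=n}^{m-1} c^{j-1} \right) \|g_2 - g_1\| \le \frac{c^{n-1}}{1 - c}\|g_2 - g_1\|$
Since c < 1, $\{g_n\}$ is Cauchy, so has a limit $f_0 \in C(X)$ by Theorem 2.14, and $f_0$
lies in S because S is closed. Since T is continuous, $Tf_0 =
\lim_{n \to \infty} Tg_n = \lim_{n \to \infty} g_{n+1} = f_0$, and thus $f_0$ is the desired fixed function.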
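The proof is constructive, and on the real line it translates directly into a program. In the sketch below we write $x_0$ for the starting point and $x_{k+1} = T(x_k)$; summing the geometric series as in the proof gives the a priori estimate $|x_n - x^*| \le \frac{c^n}{1-c}|x_1 - x_0|$ for the fixed point $x^*$. The map T = cos on [0, 1] with c = sin 1 is an illustrative assumption of ours (cos maps [0, 1] into itself and its derivative there is bounded by sin 1 < 1).

```python
import math

def fixed_point_iterates(T, x0, steps):
    """Return [x0, T(x0), T(T(x0)), ...] with the given number of applications of T."""
    xs = [x0]
    for _ in range(steps):
        xs.append(T(xs[-1]))
    return xs

def a_priori_bound(c, first_gap, n):
    """Bound c**n / (1 - c) * |x1 - x0| on the distance from x_n to the fixed point."""
    return c**n / (1 - c) * first_gap

T = math.cos
c = math.sin(1.0)                      # contraction constant for cos on [0, 1]
xs = fixed_point_iterates(T, 1.0, 200)
limit = xs[-1]                         # stands in for the fixed point
for n in (5, 20, 50):
    print(n, abs(xs[n] - limit), a_priori_bound(c, abs(xs[1] - xs[0]), n))
```

In every row the actual error is smaller than the bound. The bound is usually pessimistic, since it only uses the worst-case constant c over the whole interval.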
As an illustration on the real numbers let us prove that if a > 0, there is
an $x_0 > 0$ such that $x_0^2 = a$, by Newton's method. First, we describe the
map T. Let p > 0; the linear approximation to $x^2 - a$ at p is
$p^2 - a + 2p(x - p)$. Thus, the zero of this linear polynomial is
    $Tp = \frac{a - p^2}{2p} + p = \frac{1}{2}\left(\frac{a}{p} + p\right)$
Clearly, if T has a fixed point $x_0$, we must have $x_0^2 = a$. Thus, we must show
that T is a contraction on some closed interval:
    $|Tx - Ty| = \frac{1}{2}\left| x + \frac{a}{x} - y - \frac{a}{y} \right| = \frac{1}{2}\left| x - y + \frac{a}{xy}(y - x) \right| = \frac{1}{2}|x - y| \left| 1 - \frac{a}{xy} \right|$
Since a, x, y are all positive, $1 - (a/xy) < 1$, so we need only ensure that
$1 - (a/xy) \ge -1$, for T to be a contraction with c = 1/2. Let $I = \{x > 0: x^2 \ge a/2\}$.
Then for $x, y \in I$, $xy \ge a/2$, so $a/xy \le 2$, which is the desired inequality.
Thus, by the fixed point theorem there is an $x_0$ with $x_0^2 \ge a/2$ such that
$x_0^2 = a$.
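The square root iteration and its contraction constant can be tested directly. This is a small sketch, assuming the starting point max(a, 1) (which always satisfies $p^2 \ge a/2$) and an arbitrary finite sample of points of I; the helper names are ours.

```python
import math

def T(p, a):
    """Newton map for x^2 - a: the zero of p^2 - a + 2p(x - p)."""
    return 0.5 * (a / p + p)

def sqrt_by_iteration(a, steps=60):
    """Iterate T from a point of I = {x > 0 : x^2 >= a/2}."""
    p = max(a, 1.0)          # any starting point with p^2 >= a/2 would do
    for _ in range(steps):
        p = T(p, a)
    return p

a = 7.0
lower = math.sqrt(a / 2)     # left end of I, used only to pick sample points
points = [lower + 0.37 * k for k in range(1, 30)]
# Largest observed ratio |Tx - Ty| / |x - y| over the sample; should be at most 1/2.
worst_ratio = max(
    abs(T(x, a) - T(y, a)) / abs(x - y)
    for x in points for y in points if x != y
)
print(sqrt_by_iteration(a), math.sqrt(a), worst_ratio)
```

The contraction estimate guarantees at least a halving of the error at each step; in practice Newton's iteration converges far faster once it is near the root.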
We shall now give a somewhat more subtle application of the fixed point
theorem. Sometimes a relation between two real variables determines one
as a function of the other. For example, the relation x + y = 0 determines
y as a function of x: y = -x; $x^2 + y^2 = 1$ gives $y = (1 - x^2)^{1/2}$ near the
value (0, 1), and near (1, 0) we should write $x = (1 - y^2)^{1/2}$ as a function
of y. The relations
1
sin(x(log y)) = 0
are somewhat less transparent, nevertheless we can ask whether or not they
do determine у as a function of x.
Suppose now, in general we have an equation (see Figure 2.18)
    $F(x, y) = 0$   (2.47)
defined in the plane. We ask: does there exist a function g of x such that
(2.47) amounts to saying y = g(x)? More precisely, is there a function g
such that
    $F(x, y) = 0$ if and only if $y = g(x)$
It is not hard to find a necessary condition. For there to be such a function
it must be the case that each line x = constant intersects the set F(x, y) = 0
in only one point (see Figure 2.19). Thus the function F(x, y), as a function
of y on lines x = constant, must take the value 0 only once. The root of
F(x, y) = 0 is then the value g(x). Now we recall from one-variable theory
that a function H(y) will take each value at most once if $H'(y) \ne 0$. Thus the
reasonable condition to impose on F is that it has a continuous partial derivative
with respect to y, and $\partial F/\partial y \ne 0$. This condition turns out to be enough.
More precisely, suppose that F is defined and has continuous partial
derivatives in the neighborhood of the origin in $R^2$, and $\partial F/\partial y(0, 0) \ne 0$.
Figure 2.18 (two panels: y is a function of x; y is not a function of x)
Figure 2.19
We seek a function g defined in a neighborhood of x = 0 such that g(0) = 0
and F(x, g(x)) = 0. If we fix $x = x_0$ near 0, then we seek a root of
$F(x_0, y) = 0$. This brings us right back to Newton's method. Define T
as a function of y as Newton did: T(y) is the zero of the linear approximation
of $F(x_0, y)$ at y; that is,
    $F(x_0, y) + \frac{\partial F}{\partial y}(x_0, y)(Ty - y) = 0$
or
    $Ty = y - \left[\frac{\partial F}{\partial y}(x_0, y)\right]^{-1} F(x_0, y)$   (2.48)
Just as in Newton's case the solution of $F(x_0, y) = 0$ is the fixed point of T.
Thus, we need only verify that T is a contraction in some interval of values of
y for $x_0$ near 0 so that it will have a fixed point; and we define $g(x_0)$ to be
that fixed point. This application of the fixed point theorem really works, as
we now shall prove.
Theorem 2.16. Suppose that F has continuous partial derivatives in a
neighborhood of (0, 0), and that F(0, 0) = 0, $\partial F/\partial y(0, 0) \ne 0$. Then there is a
function g defined for x in some interval (-ε, ε) such that
    $F(x, y) = 0$ if and only if $y = g(x)$
Proof. Instead of (2.48) we consider something slightly simpler. For x near 0,
define
    $T_x(y) = y - \left[\frac{\partial F}{\partial y}(0, 0)\right]^{-1} F(x, y)$   (2.49)
We want to find the fixed point, if it exists, of (2.49). Thus we seek suitable intervals,
-ε < x < ε, -η < y < η, in which $T_x$ is a contraction.
    $T_x(y_1) - T_x(y_2) = y_1 - y_2 - \left[\frac{\partial F}{\partial y}(0, 0)\right]^{-1} [F(x, y_1) - F(x, y_2)]$   (2.50)
By the mean value theorem there is a ξ between $y_1$ and $y_2$ such that
    $F(x, y_1) - F(x, y_2) = \frac{\partial F}{\partial y}(x, \xi)(y_1 - y_2)$
Equation (2.50) becomes, upon substitution,
    $T_x(y_1) - T_x(y_2) = (y_1 - y_2)\left[1 - \left(\frac{\partial F}{\partial y}(0, 0)\right)^{-1} \frac{\partial F}{\partial y}(x, \xi)\right]$   (2.51)
Now the term in brackets is continuous in (x, ξ) and has the value 0 at
(0, 0). Thus we may choose ε so that that term is less than 1/2 in absolute
value if -ε < x < ε, $-\varepsilon < y_1 < \varepsilon$, $-\varepsilon < y_2 < \varepsilon$ and ξ is between $y_1$ and $y_2$.
With this choice of ε, (2.51) gives
    $|T_x(y_1) - T_x(y_2)| \le \frac{1}{2}|y_1 - y_2|$
so $T_x$ is indeed a contraction. Define g(x) as the fixed point of $T_x$. Then, if
F(x, y) = 0, then by (2.49) $T_x(y) = y$, so we must have y = g(x). On the
other hand, if y = g(x), then $T_x(y) = y$, so again by (2.49) we must have
F(x, y) = 0. The theorem is proved.
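The map (2.49) gives a practical way to compute g. The sketch below uses the relation $F(x, y) = 2y + y^3 - \sin x$, which is purely an illustrative choice of ours satisfying F(0, 0) = 0 and $\partial F/\partial y(0, 0) = 2$; the sample values of x are kept small so that the iteration stays where $T_x$ is a contraction.

```python
import math

def F(x, y):
    """An illustrative relation with F(0, 0) = 0 and dF/dy(0, 0) = 2."""
    return 2.0 * y + y**3 - math.sin(x)

DF_DY_AT_ORIGIN = 2.0

def T(x, y):
    """The map (2.49): y minus F(x, y) divided by dF/dy at the origin."""
    return y - F(x, y) / DF_DY_AT_ORIGIN

def g(x, steps=100):
    """Fixed point of T(x, .) reached by iterating from y = 0."""
    y = 0.0
    for _ in range(steps):
        y = T(x, y)
    return y

for x in (-0.3, 0.0, 0.1, 0.3):
    print(x, g(x), F(x, g(x)))
```

For each sample x the last column is negligible, so the computed g(x) does satisfy F(x, g(x)) = 0. Note that only the single number $\partial F/\partial y(0, 0)$ is needed, not the derivative at each iterate, which is the simplification made in passing from (2.48) to (2.49).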
To say that the function g exists is already good enough, but much more is
true: g is a continuously differentiable function. We will leave the
verification of this fact to the interested reader (see Problem 58). In Section 7.2
we shall reconsider this theorem (known as the Implicit function theorem)
in many more variables. The beauty of the fixed point theorem is that the
general context does not at all complicate the ideas, nor the verifications.
• EXERCISES
31. Find, by Newton's method, a sequence of numbers converging to
the square root of a, for any a > 0. Now, do the cube root.
32. Find a sequence converging to a root of these polynomials:
(a) x3 + χ2 + χ + 1 (с) χό -2χ2-1χ + 2
(b) хг - χ + 1 (d) χ5 - χ - 1
33. (a) Let F(x, y) = x sin(xy). For what values of (x, y) such that
F(x, y) = 0 is it true that nearby the equation F(x, y) = 0 defines y as a
function of x?
(b) Same problem for
(i) F(x,y) = xy2 + 2xy+l, (11) F(x,y)=x*-y,
(iii) F(x, y) = x" + y2
34. Let F(x, y) be differentiable in a domain D, and $(x_0, y_0) \in D$ such
that $F(x_0, y_0) = 0$. Suppose g is differentiable and has the property
$g(x_0) = y_0$, F(x, g(x)) = 0. Show that
    $g'(x_0) = -\frac{\partial F/\partial x(x_0, y_0)}{\partial F/\partial y(x_0, y_0)}$
35. Find g' where g is defined implicitly by
(a) x sin(xy) = 0 (c) e" = 1
(b) cos(x+y)=y (d) e'"=:y
• PROBLEMS
57. Prove the fixed point theorem in $R^n$.
Theorem. If S is a subset of $R^n$ and T is defined on S and is a contraction
on S, then there is a unique $y_0 \in S$ such that $T(y_0) = y_0$.
58. Let F have continuous partial derivatives near $(x_0, y_0)$ and suppose
$F(x_0, y_0) = 0$, $\partial F/\partial y(x_0, y_0) \ne 0$. Let g be the function described in
Theorem 2.16 (F(x, g(x)) = 0 and $g(x_0) = y_0$). We can prove that g is
differentiable as follows.
(a) First of all, by the mean value theorem, for any (x, y), there is a
(ξ, η) on the line between $(x_0, y_0)$ and (x, y) such that
    $F(x, y) - F(x_0, y_0) = \frac{\partial F}{\partial x}(\xi, \eta)(x - x_0) + \frac{\partial F}{\partial y}(\xi, \eta)(y - y_0)$
Why is the mean value theorem applicable?
(b) Now, if we substitute y = g(x), $y_0 = g(x_0)$, we have
    $0 = \frac{\partial F}{\partial x}(\xi, \eta)(x - x_0) + \frac{\partial F}{\partial y}(\xi, \eta)(g(x) - g(x_0))$
Thus
    $\frac{g(x) - g(x_0)}{x - x_0} = -\frac{\partial F/\partial x(\xi, \eta)}{\partial F/\partial y(\xi, \eta)}$
Conclude that g is differentiable and
    $g'(x_0) = -\frac{\partial F/\partial x(x_0, g(x_0))}{\partial F/\partial y(x_0, g(x_0))}$
2.12 Summary
A sequence $z_1, \ldots, z_n, \ldots$ of complex numbers is a function from the
positive integers to C. The sequence $\{z_n\}$ converges to z if, for every ε > 0
there is an N such that $|z_n - z| < \varepsilon$ for $n \ge N$.
A convergent sequence is bounded, but not conversely. A monotonic
bounded sequence of real numbers is convergent. Cauchy criterion: a
sequence $\{z_n\}$ converges if, for every ε > 0, there is an N such that $|z_n - z_m| < \varepsilon$
for both $n, m \ge N$.
The series formed of a sequence $\{z_n\}$ is the sequence of sums $\{\sum_{k=1}^{n} z_k\}$.
If the sequence of sums converges, we say that the series converges and denote
the limit by $\sum_{n=1}^{\infty} z_n$. If $\sum z_n$ converges, then $z_n \to 0$, but not conversely.
If $\{c_k\}$ is a sequence of nonnegative numbers, $\sum c_k$ converges if and only
if the sequence $\sum_{k=1}^{n} c_k$ is bounded. A series $\sum z_n$ converges absolutely if
$\sum |z_n| < \infty$. Absolutely convergent series may be summed in any convenient
way.
Tests for Convergence
comparison test. Suppose $|z_n| \le |w_n|$ for all but finitely many n. Then
(i) if $\sum |w_n|$ converges, $\sum z_n$ is absolutely convergent, (ii) if $\sum |z_n|$ diverges,
so does $\sum |w_n|$.
root test. If $|c_n|^{1/n} \le r$ for some r < 1 and all but finitely many n, $\sum c_n$
is absolutely convergent.
ratio test. If $|c_{n+1}/c_n| \le r$ for some r < 1 and all but finitely many n,
$\sum c_n$ is absolutely convergent.
The sequence $\{v_k\}$ of vectors in $R^n$ is said to converge to v if, for every
ε > 0, there is an N such that $\|v_k - v\| < \varepsilon$ for $k \ge N$. A sequence of vectors
converges if and only if it does so in each coordinate.
A set S is closed if and only if $v_k \in S$, $\lim v_k = v$ implies $v \in S$ also. Every
sequence contained in a closed and bounded set has a convergent subsequence.
An $R^m$-valued function f defined in $R^n$ is said to be continuous at $v_0$ if f is
defined in a neighborhood of $v_0$ and $v_k \to v_0$ implies $f(v_k) \to f(v_0)$. A
function is continuous on a set S if it is continuous at every point of S. If S is a
closed and bounded set, and f is a continuous real-valued function defined
on S, then f is bounded and attains its maximum and minimum.
Sections 2.6 and 2.7 are mainly about integration. We shall not recollect
the definitions here; only the major results.
fundamental theorem of calculus. Suppose f is continuous on the
interval [a, b]. Then the integral
    $F(x) = \int_a^x f$
exists for all $x \in [a, b]$. F is differentiable on (a, b) and F' = f.
Fubini's theorem. Let f be an integrable function defined on a rectangle
$R = I_1 \times \cdots \times I_n$ in $R^n$. $\int_R f$ can be computed by iteration:
    $\int_R f = \int_{I_1} \left[ \cdots \left[ \int_{I_n} f(x^1, \ldots, x^n)\,dx^n \right] dx^{n-1} \cdots \right] dx^1$
Let f be a real-valued function defined in a neighborhood of $x_0$ in $R^n$. If
v is a vector in $R^n$, the directional derivative $df(x_0, v)$ of f at $x_0$ in the direction
v is defined by
    $\lim_{t \to 0} \frac{f(x_0 + tv) - f(x_0)}{t}$
(if it exists). The partial derivative of f with respect to $x^i$ at $x_0$ is
    $\frac{\partial f}{\partial x^i}(x_0) = df(x_0, E_i)$
If these partial derivatives are all defined and continuous near $x_0$, then
$df(x_0, v)$ is linear in v. We can write
    $df = \sum_i \frac{\partial f}{\partial x^i}\,dx^i$
If the partial derivatives $\partial f/\partial x^i$ all exist in an open set we may be able to
compute the derivatives $\partial(\partial f/\partial x^i)/\partial x^j$. These are the second-order partial
derivatives. If all first and second derivatives of f exist and are continuous
in an open set N, then
    $\frac{\partial^2 f}{\partial x^i\,\partial x^j} = \frac{\partial^2 f}{\partial x^j\,\partial x^i}$
throughout N.
Suppose that f has continuous partial derivatives in the domain I × D,
where I is an interval of reals, and D is a domain in $R^n$. Let
    $F(x) = \int_D f(x, y)\,dy$
Then F is differentiable and
    $\frac{dF}{dx}(x) = \int_D \frac{\partial f}{\partial x}(x, y)\,dy$
Suppose f is a real-valued function defined on R. We say that f(x)
converges to L as $x \to \infty$, written $\lim_{x \to \infty} f(x) = L$, if $|f(x) - L|$ can be made
arbitrarily small by taking x sufficiently large. If now f is a continuous function
on R such that
    $\lim \int f$
exists, we say that f is integrable on R. If $\lim \int |f|$ exists, f is absolutely
integrable. Integral test: If f is a positive, decreasing continuous function
defined on R, then $\int_1^{\infty} f$ exists if and only if $\sum_{n=1}^{\infty} f(n) < \infty$.
Let X be a closed and bounded set in $R^n$. We denote by C(X) the collection
of all complex-valued continuous functions on X. C(X) is a vector space.
If f is in C(X), the length of f is
    $\|f\| = \max\{|f(x)| : x \in X\}$
For f, g in C(X) the distance between f and g is $\|f - g\|$. If $\{f_n\}$ is a sequence
in C(X), and $\|f_n - f\| \to 0$ as $n \to \infty$ for some $f \in C(X)$, we say that $\{f_n\}$
converges uniformly to f. Cauchy criterion. Suppose $\{f_n\}$ is a sequence
in C(X) satisfying the following condition: for each ε > 0, there is an N such
that $\|f_n - f_m\| < \varepsilon$ whenever $n, m \ge N$. Then there is an $f \in C(X)$ such that
$f_n \to f$ uniformly.
integration. If X is an interval in R, and $f_n \to f$ uniformly in C(X) then
also $\int_a^x f_n \to \int_a^x f$ uniformly.
The exponential function, denoted exp(cx), or $e^{cx}$ for any complex number
c is the solution of the differential equation y' = cy, y(0) = 1. It has these
properties:
    $e^{cx} = \sum_{n=0}^{\infty} \frac{(cx)^n}{n!}$
$e^{cx}$ is never zero.
fixed point theorem. Let S be a closed set of functions in C(X) and T a
mapping of S into S which is a contraction; that is, there is c < 1 such that
    $\|T(f) - T(g)\| \le c\|f - g\|$ for all $f, g \in S$
Then there is a unique continuous function $f_0$ such that $T(f_0) = f_0$.
implicit function theorem. Suppose that F has continuous partial
derivatives in a neighborhood of (0, 0), and that F(0, 0) = 0, $\partial F/\partial y(0, 0) \ne 0$.
Then there is a function g defined for x in some interval (-ε, ε) such that
    $F(x, y) = 0$ if and only if $y = g(x)$
• FURTHER READING
M. Spivak, Calculus, Benjamin, New York, 1967. This is an eloquent
text in the one-variable calculus. It is an excellent reference for a full
treatment of the material in this chapter.
T. A. Bak and J. Lichtenberg, Mathematics for Scientists, Benjamin,
New York, 1966. This is a review of the theory of calculus from the point
of view of the physical scientist. It includes a chapter on numerical analysis.
C. W. Burrill and J. R. Knudsen, Real Variables, Holt, Rinehart and
Winston, New York, 1969. An advanced text, going thoroughly through
the material of this chapter and beyond to the theory of Lebesgue integration.
• MISCELLANEOUS PROBLEMS
59. Let $\{x_n\}$ and $\{y_n\}$ be sequences. Then $\{x_n + y_n\}$ is also a sequence.
So also is $\{rx_n\}$ for any real number r; thus the collection S of all real
sequences is a vector space. Show that it is not finite dimensional.
60. Show that the collection B of bounded sequences is a linear subspace
of the vector space S of all sequences (Problem 59).
61. Show that the collection C of convergent sequences is a linear
subspace of B. Also $C_0$, the collection of all sequences converging to zero, is a
linear subspace of B. These spaces are all infinite dimensional.
62. Define the function "lim" on convergent sequences in the obvious
way: $\lim: C \to R$: $\lim\{x_n\} = \lim x_n$. Show that lim is a linear function.
63. What is the dimension of the space of linear functionals on C which
annihilate $C_0$?
64. Let $x_1 = 4$, $x_2 = \frac{1}{2}(4 + \frac{3}{4})$, and once $x_2, \ldots, x_n$ are defined, let
$x_{n+1} = \frac{1}{2}(x_n + 3/x_n)$. Prove that $\{x_n\}$ converges. Assuming that, find the
limit.
65. (a) Show that for every integer k,
    $\lim n^k/(n + 1)^k = 1$
    $\lim n^k/(n + 1)^{k+1} = 0$
    $\lim n^{k+1}/(n + 1)^k$ does not exist
(b) Let k be an integer, and 1 > h > 0. Show that $\lim n^k h^n = 0$.
(c) Show that lim n/ft" does not exist.
66. Let χι = 1, and in general
3 +x„
Find lim x„.
67. Suppose lim z„ = z.
(a) Let % = £(z„- ι + z„). Then lim y„ = ζ
(b) Let fc be a positive integer. Now let {%} be defined by
1
y» — ~£~τ\ (Zn +Zn+i + " ■ + z»+*)
Then lim y„ = ζ also.
(c) This time take
1
% = - (zi + 1- z„)
Once again lim >>„ = z.
68. Suppose that $f$ is continuous at $c$, and $\lim c_n = c$. Then
$\lim f(c_n) = f(c)$.
69. Let $\{c_n\}$ be a sequence of complex numbers, and suppose $\lim (|c_n|)^{1/n} = R$.
Show that $R^{-1}$ is the radius of convergence of $\sum c_n z^n$.
70. Let $\{s_n\}$, $\{t_n\}$ be two sequences of positive numbers such that $\lim s_n t_n^{-1}$
exists and is nonzero. Then $\sum s_n$ converges if and only if $\sum t_n$ converges.
71. Let $\{c_n\}$ be a sequence of positive numbers. Suppose that for every
sequence of positive numbers $\{p_n\}$ such that $\sum p_n < \infty$ we have also
$\sum c_n p_n < \infty$. Prove that $\{c_n\}$ is bounded.
72. Verify Schwarz's inequality:
$$\left(\sum_{n=1}^{\infty} a_n b_n\right)^2 \le \sum_{n=1}^{\infty} |a_n|^2 \cdot \sum_{n=1}^{\infty} |b_n|^2$$
(Hint: It is true by virtue of the same fact for finite sums, which was
discussed in Problem 74 of Chapter 1.)
73. Prove that if $\sum |a_n|^2 < \infty$, then $\sum (1/n)|a_n| < \infty$. Is the reverse
implication true?
74. Let $S$ be a subset of $R^n$. Show that $\perp(S) = \{v \in R^n : \langle v, s\rangle = 0$
for all $s \in S\}$ is a closed set.
75. Suppose that $f$ is a continuous positive real-valued function defined
on a set $S$ in $R^n$. Show that $\log f$ is also continuous.
76. Suppose that $f$ is a continuous real-valued function defined on all of
$R^n$. Let $x_0, x_1 \in R^n$ and $c \in R$ be such that $f(x_0) < c < f(x_1)$. Show that
there is an $x_2 \in R^n$ such that $f(x_2) = c$.
224 2. Notions of Calculus
77. Show that if $f$ is a continuous function on the interval $I$ taking only
rational values, then $f$ must be constant.
78. A set $S$ in $R^n$ is called connected if every continuous real-valued
function has the intermediate value property. Show that this is equivalent to the
following definition:
A set S is not connected if there is a continuous real-valued function /
defined on S which takes precisely two values.
79. Verify the following assertions:
(a) A ball in $R^n$ is connected.
(b) The set of integers is not connected.
(c) The sphere $\{x \in R^3 : \|x\| = 1\}$ is connected.
(d) The union of two balls in $R^n$ is connected if and only if they
intersect.
(e) An open set is not connected if and only if it can be written as the
disjoint union of two nonempty open subsets.
(f) A closed set is not connected if and only if it can be written as the
disjoint union of two nonempty closed sets.
80. Let $f$ be a continuous function on the closed and bounded set $X$.
Then $f$ is uniformly continuous; that is, given $\varepsilon > 0$, there is a $\delta > 0$ such that
for all $x, y \in X$ such that $|x - y| < \delta$ we have $|f(x) - f(y)| < \varepsilon$. Supposing
not, we can derive a contradiction as follows. There is an $\varepsilon_0$ such that
for every $\delta$, "$|x - y| < \delta$ implies $|f(x) - f(y)| < \varepsilon_0$" is not true. Taking
$\delta = 1/n$, there are $x_n, y_n$ with $|x_n - y_n| < 1/n$ but $|f(x_n) - f(y_n)| \ge \varepsilon_0$. Since
$X$ is closed and bounded, these sequences have convergent subsequences:
$\{x_n'\}$, $\{y_n'\}$. Show that $\lim x_n' = \lim y_n'$ but $|f(\lim x_n') - f(\lim y_n')| \ge \varepsilon_0$, a
contradiction.
81. Let $L$ be a linear functional on $R^n$ and choose $v_0$ such that $\|v_0\| = 1$
and
$$L(v_0) = \max\{L(v) : \|v\| = 1\}$$
Show that for every $v \in R^n$, $L(v) = L(v_0)\,\langle v, v_0\rangle$.
82. Let $f$ be an integrable function on the rectangle $[a, b]$. Let $R_t$ be the
rectangle $[a, a + t(b - a)]$, for $0 < t \le 1$. Verify that $f$ is integrable on
each rectangle $R_t$, and define $F(t) = \int_{R_t} f$. Show that $F$ is continuous. Is
$F$ differentiable?
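83. Let $Q = \{p/q : p, q \text{ integers with } 0 \le p < q\}$. $Q$ is a subset of the unit
interval $[0, 1]$ which is not measurable. For surely $\underline{\int} \chi_Q = 0$, and if
$R_1 \cup \cdots \cup R_n \supset Q$, then also $R_1 \cup \cdots \cup R_n \supset [0, 1]$, so $\int \sum \chi_{R_i} \ge 1$, and
thus $\overline{\int} \chi_Q = 1$.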
84. Let $f$ be an integrable nonnegative function defined on the domain
$B \subset R^2$ and consider $D = \{(x, y, z) \in R^3 : 0 \le z \le f(x, y);\ (x, y) \in B\}$.
Verify that $\mathrm{Vol}(D) = \int_B f$.
85. Suppose that $f$ is a continuous decreasing real-valued function of a
real variable and $\lim_{x \to \infty} f(x) = 0$. Then $\int_0^{\infty} f(x) \sin x \, dx$ converges (compare
this with Leibniz's theorem for series).
86. Suppose that $f$ is a real-valued function defined on $R^n$. We say that
$$f(x) \to +\infty \quad \text{as } \|x\| \to \infty$$
if, for every $M$ there is a $K$ such that $f(x) \ge M$ whenever $\|x\| \ge K$. Show
that if $f$ is a real-valued continuous function on $R^n$ such that $f(x) \to +\infty$ as
$\|x\| \to \infty$, then $f$ attains a minimum at some point.
87. Define
$$f(x) \to 0 \quad \text{as } \|x\| \to \infty$$
in a way suggested by the definition in the above problem. Show that if a
continuous function on $R^n$ has this property, then it attains both a maximum
and a minimum on $R^n$.
88. Suppose $f$ is a real-valued function which has continuous partial
derivatives in the ball $\{x \in R^n : \|x\| < 1\}$. Show that the function
$$g(x) = \int_0^1 f(tx)\,dt$$
has the same properties, and find $\nabla g$.
89. Let $l^2$ be the space of sequences $\{c_n\}$ of real numbers such that
$$\sum_{n=1}^{\infty} |c_n|^2 < \infty$$
Because of the result in Problem 72 (Schwarz's inequality), if $\{c_n\}$ and $\{d_n\}$
are in $l^2$, then
$$\langle \{c_n\}, \{d_n\} \rangle = \sum_{n=1}^{\infty} c_n d_n$$
converges. Show that $l^2$ is a Euclidean vector space with that inner product.
90. The space of continuous functions on the unit interval can be made
into a Euclidean vector space in this way:
$$\langle f, g \rangle = \int_0^1 f(t) g(t)\,dt$$
Corresponding to this inner product is a notion of length which we denote
by $\|\cdot\|_2$ so as to distinguish it from the modulus $\|\cdot\|_\infty$ introduced in the
text. Show that this length is deficient in these respects:
(a) We can have $\|f_n\|_2 \to 0$ without having $\|f_n\|_\infty \to 0$.
(b) We can have a sequence $\{f_n\}$ of continuous functions which is a
Cauchy sequence in the sense of the length $\|\cdot\|_2$, but which does not
converge to a continuous function. On the other hand, show that
(c) if $\|f_n\|_\infty \to 0$, then $\|f_n\|_2 \to 0$ also.
91. Suppose $L : C[0, 1] \to R$ is a linear function. Show that $L$ is
continuous if and only if there is an $M > 0$ such that
$$|L(f)| \le M \|f\|_\infty$$
92. Show that there is a unique differentiable function /0 such that
f'o(x) = (fo(x))2 for all л: and /0(0) = *
Do it by applying the fixed point theorem to the function Τ defined below
ontfaeset{/eC[*,*]: ||/H«£»:
x
Tf(x)= f P(t)dt+i
•Ό
93. We can talk of open and closed sets, and convergence in the space $M^n$
of $(n \times n)$ matrices, merely by considering them as vectors in $R^{n^2}$. Doing so,
verify these statements:
(a) The set $G$ of invertible $(n \times n)$ matrices is open.
(b) The set of triangular matrices is closed.
(c) The function $A \to A^2$ is continuous.
(d) If $p$ is any polynomial in one variable the function
$$T \to p(T)$$
is continuous.
(e) $\lim_{k \to \infty} \sum_{n=0}^{k} (1/n!)\,T^n$ exists for all $T \in L(R^n, R^n)$.
94. Suppose $g$ is a continuous real-valued function on the interval
$[-a, a]$. Show that the implication
$$\int_{-a}^{a} g(t) f(t)\,dt = 0$$
for all $f \in F$ implies $g = 0$ holds whenever $F$ is any one of these classes:
(a) $F = C([-a, a])$.
(b) $F = C^1([-a, a])$.
(c) $F$ is the collection of all polynomials.
(d) $F$ is the collection $\{\chi_I : I \text{ a subinterval of } [-a, a]\}$.
(e) $F$ is the collection of all continuously differentiable functions such
that $f(-a) = f(a) = 0$.
Chapter 3
ORDINARY DIFFERENTIAL
EQUATIONS
In these next three chapters we shall elaborate on the study of the
differential calculus of one variable and its application to geometry and classical
(Newtonian) physics. The motivating problem throughout is the central
problem of the subject of differential equations: to find a function on the
basis of given information on its derivatives. Observed phenomena in the
sciences seem always to involve rates of change. For example, it is observed
that the rate of acceleration of a falling body is a constant independent of
mass, height, or velocity; the progress of a chemical reaction slows down as
it proceeds, dependent on the quantities of the chemicals involved. These
observations, when made precise, appear as differential equations. In
order to predict (the time it takes for the body to fall a given height, the
amount of new chemicals produced before the reaction stops), the function
described by the differential equation must be found.
The first two sections of the present chapter are devoted to the description
of the basic concepts involved; in the first we shall discuss the differentiation
of vector-valued functions, and the second is devoted to approximation and
Taylor's formula. We also include a brief excursion into the computation of
maxima and minima of functions of several variables subject to constraints
by the technique of Lagrange multipliers.
The main theoretical tool in this study is Picard's theorem which gives
conditions under which a differential equation has a solution and only one
solution. This theorem essentially tells us what a well-posed problem is,
and asserts that well-posed problems are always solvable. The question
of actually producing a formula for the solution, or an algorithm for
computing approximate values for the solution is another matter altogether.
Several techniques will be exposed in this chapter and Chapter 5 (successive
approximations, series expansions); there are many more very efficient
computational techniques which we shall not develop here.
It will become clear that the subject of ordinary differential equations has a
lot to do with the study of curves (paths of motion). Thus in the next
chapter we shall investigate the geometry of curves and its relation with the
subject of differential equations.
3.1 Differentiation
The first important step in the study of differential equations is to consider
vector-valued functions of a real variable as well as real-valued functions.
This is the appropriate setting for many problems involving differential
equations, and is particularly relevant when studying equations involving
derivatives of order greater than one. In the first sections we shall consider
differentiable vector-valued functions of a real variable and introduce a
special technique for approximating values: Taylor's expansion.
Definition 1. Let $x_0 \in R$, and suppose $f$ is an $R^n$-valued function defined
in a neighborhood of $x_0$. $f$ is differentiable at $x_0$ if
$$\lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t}$$
exists. The limit is called the derivative of $f$ at $x_0$ and is denoted by $f'(x_0)$.
If $f$ is defined in an open set $U$, we say $f$ is differentiable (written $f$ is $C^1$) in
$U$ if $[f(x + t) - f(x)]/t$ converges for all $x$ in $U$ to a continuous function $f'$
as $t \to 0$.
That this definition is not so far from the derivative encountered in calculus
is demonstrated by the following assertion.
Proposition 1. Let $f$ be an $R^n$-valued function defined in a neighborhood of
$x_0 \in R$. Write $f = (f_1, \ldots, f_n)$ in coordinates. $f$ is differentiable at $x_0$ if and
only if $f_1, \ldots, f_n$ are differentiable at $x_0$. Further, $f'(x_0) = (f_1'(x_0), \ldots,
f_n'(x_0))$.
Proof.
$$\frac{f(x_0 + t) - f(x_0)}{t} = \left(\frac{f_1(x_0 + t) - f_1(x_0)}{t}, \ldots, \frac{f_n(x_0 + t) - f_n(x_0)}{t}\right)$$
The limit on the left as $t \to 0$ exists if and only if all the limits on the right exist
(Proposition 10 in Chapter 2), and equality holds also in the limit. That is all that
Proposition 1 says.
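To make Proposition 1 concrete, the following sketch forms the difference quotient of a vector-valued function coordinate by coordinate and compares it with the coordinatewise derivative. The helix $f(t) = (\cos t, \sin t, t)$, the base point $x_0 = 1$ and the helper name `difference_quotient` are illustrative choices, not taken from the text at this point.

```python
import math

def f(t):
    # Sample R^3-valued function (a helix); chosen only for illustration.
    return (math.cos(t), math.sin(t), t)

def difference_quotient(func, x0, t):
    """Componentwise (func(x0 + t) - func(x0)) / t for a tuple-valued func."""
    a, b = func(x0 + t), func(x0)
    return tuple((ai - bi) / t for ai, bi in zip(a, b))

x0 = 1.0
exact = (-math.sin(x0), math.cos(x0), 1.0)  # derivative of each coordinate
for t in (1e-1, 1e-3, 1e-5):
    dq = difference_quotient(f, x0, t)
    err = max(abs(d - e) for d, e in zip(dq, exact))
    print(t, dq, err)
```

The largest coordinate error shrinks as $t$ decreases, which is the numerical face of the statement that the vector limit exists exactly when each coordinate limit does.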
Now if $f$ is a differentiable function on an interval taking values in $R^n$, its
image is a curve in $R^n$. The derivative $f'(x_0)$ is a vector in $R^n$ and points in
the direction of motion of the curve (Figure 3.1). That is, the line through
$f(x_0)$ and parallel to $f'(x_0)$ is the limiting position of the line through $f(x_0)$
and a nearby point $f(x_0 + t)$. For that line is parallel to $t^{-1}(f(x_0 + t) - f(x_0))$,
and by definition this vector has $f'(x_0)$ as limit as $t \to 0$. This line through
$f(x_0)$ and parallel to $f'(x_0)$ is called the tangent line of the curve at $f(x_0)$.
From Proposition 1 it easily follows that if $f$, $g$ are differentiable, so is $f + g$,
and $(f + g)'(x_0) = f'(x_0) + g'(x_0)$. The chain rule also follows easily:
Proposition 2. (Chain Rule I) Let $g$ be a real-valued function defined in a
neighborhood of $x_0$ in $R$, and differentiable at $x_0$. Suppose $f$ is an $R^n$-valued
function which is differentiable at $g(x_0)$ (see Figure 3.2). Then $f \circ g$ is
differentiable at $x_0$ and $(f \circ g)'(x_0) = g'(x_0) f'(g(x_0))$. (We have written $g'(x_0)$
before $f'(g(x_0))$ as this is the customary way of writing the product of a scalar
and a vector.)
Figure 3.1
Figure 3.2
This is of course true, just because it is true in each coordinate, by the
ordinary chain rule. Thus if $f = (f_1, \ldots, f_n)$, then $f \circ g = (f_1 \circ g, \ldots, f_n \circ g)$,
so
$$(f \circ g)' = ((f_1 \circ g)', \ldots, (f_n \circ g)')$$
Example
1. Let $f(x) = (x, x^2, x^3)$, $g(t) = \sin t$. Then $(f \circ g)(t) = (\sin t, \sin^2 t,
\sin^3 t)$ and
$$(f \circ g)' = \cos t\,(1, 2 \sin t, 3 \sin^2 t)$$
Now, there is also a chain rule for taking a real-valued function of a
vector-valued function (Figure 3.3). Suppose now $g$ is a continuously
differentiable function defined on an interval $I$ taking values in a domain $D$
in $R^n$. Suppose $f$ is a real-valued function defined on $D$ which has all
partial derivatives continuous. Then $f \circ g$ is a real-valued function on the
interval $I$.
For clarity of exposition, let us take the case $n = 2$. We can write $g$ in
coordinates as $g(x) = (g_1(x), g_2(x))$. Then
$$\begin{aligned}
f(g(x_0 + t)) - f(g(x_0))
&= f(g_1(x_0 + t), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0)) \\
&= f(g_1(x_0 + t), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0 + t)) \\
&\quad + f(g_1(x_0), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0))
\end{aligned} \qquad (3.1)$$
Now the function $f(s, g_2(x_0 + t))$ is differentiable (it is the restriction of $f$
to the line $y = g_2(x_0 + t)$). By the mean value theorem, the first difference is
$$\frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,[g_1(x_0 + t) - g_1(x_0)]$$
for some $\xi_1$ between $g_1(x_0 + t)$ and $g_1(x_0)$. Now applying the mean value
theorem we see that
$$g_1(x_0 + t) - g_1(x_0) = g_1'(\eta_1)\,t$$
for some $\eta_1$ between $x_0 + t$ and $x_0$. Thus the first difference in (3.1) is
$$\frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,g_1'(\eta_1)\,t$$
$$g_1(x_0) < \xi_1 < g_1(x_0 + t) \qquad x_0 < \eta_1 < x_0 + t$$
Similarly, the second difference is
$$\frac{\partial f}{\partial y}(g_1(x_0), \xi_2)\,g_2'(\eta_2)\,t$$
$$g_2(x_0) < \xi_2 < g_2(x_0 + t) \qquad x_0 < \eta_2 < x_0 + t$$
Figure 3.3
Thus, we may rewrite (3.1) as
$$\frac{f(g(x_0 + t)) - f(g(x_0))}{t}
= \frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,g_1'(\eta_1) + \frac{\partial f}{\partial y}(g_1(x_0), \xi_2)\,g_2'(\eta_2) \qquad (3.2)$$
Taking the limit as $t \to 0$, we have on the right $\xi_1 \to g_1(x_0)$ (since $g_1$ is
continuous), and $g_2(x_0 + t)$, $\xi_2$ both tend to $g_2(x_0)$ since $g_2$ is continuous. Also
$\eta_1$, $\eta_2$ both tend to $x_0$ since they lie between $x_0$ and $x_0 + t$. Since all the
derivatives in (3.2) are continuous, the limit exists, so
$$\frac{d(f \circ g)}{dx}(x_0) = \frac{\partial f}{\partial x}(g(x_0))\,g_1'(x_0) + \frac{\partial f}{\partial y}(g(x_0))\,g_2'(x_0) \qquad (3.3)$$
Notice that, using the directional derivative notation, (3.3) becomes
$$\frac{d(f \circ g)}{dx}(x_0) = df(g(x_0), g'(x_0)) = \langle \nabla f(g(x_0)), g'(x_0) \rangle \qquad (3.4)$$
Thus the derivative of $f$ along the curve $\mathbf{x} = g(x)$ is the same as its directional
derivative along the tangent direction to the curve (Figure 3.4). This is
true not only in $R^2$, but in all $R^n$. The derivation is of course the same,
only with the notational complication of many more variables. Thus
Figure 3.4
Proposition 3. (Chain Rule II) Let $g$ be a continuously differentiable
function of a real variable, taking values in a domain $D$ in $R^n$, and suppose $f$
is a continuously differentiable real-valued function defined on $D$. Then $f \circ g$
is a differentiable function and
$$(f \circ g)'(t) = df(g(t), g'(t))$$
Examples
2. Let $g(t) = (\sin t, \cos t)$, $f(x, y) = xy^2$. Then
$$df((x, y), (a, b)) = \frac{\partial f}{\partial x} a + \frac{\partial f}{\partial y} b = y^2 a + 2xy\,b$$
$$g'(t) = (\cos t, -\sin t)$$
$$(f \circ g)'(t) = df(g(t), g'(t)) = \cos^2 t \cos t + 2 \sin t \cos t\,(-\sin t)
= \cos t\,(\cos^2 t - 2 \sin^2 t)$$
We can, of course, verify this by direct substitution, since $f \circ g(t) =
\sin t \cos^2 t$.
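The computation in Example 2 can be checked numerically. The sketch below evaluates the chain-rule expression $\langle \nabla f(g(t)), g'(t) \rangle$, the derivative obtained by differentiating $\sin t \cos^2 t$ directly, and a centered difference quotient; the function names and the step size `h` are illustrative assumptions.

```python
import math

def g(t):
    return (math.sin(t), math.cos(t))

def g_prime(t):
    return (math.cos(t), -math.sin(t))

def f(x, y):
    return x * y ** 2

def grad_f(x, y):
    return (y ** 2, 2 * x * y)

def chain_rule(t):
    """<grad f(g(t)), g'(t)>, as in Proposition 3."""
    fx, fy = grad_f(*g(t))
    a, b = g_prime(t)
    return fx * a + fy * b

def direct(t):
    """Derivative of sin(t) * cos(t)**2 computed by the product rule."""
    return math.cos(t) ** 3 - 2 * math.sin(t) ** 2 * math.cos(t)

def numeric(t, h=1e-6):
    """Centered difference quotient of f(g(t))."""
    return (f(*g(t + h)) - f(*g(t - h))) / (2 * h)

for t in (0.3, 1.1, 2.5):
    print(t, chain_rule(t), direct(t), numeric(t))
```

All three columns should agree up to the discretization error of the difference quotient.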
3. Let $g(t) = (t, t^2, 2t)$, $f(x, y, z) = xy + \log z$.
$$df((x, y, z), (a, b, c)) = ya + xb + \frac{c}{z}$$
$$g'(t) = (1, 2t, 2)$$
$$(f \circ g)'(t) = df((t, t^2, 2t), (1, 2t, 2)) = t^2 + 2t^2 + \frac{2}{2t} = 3t^2 + \frac{1}{t}$$
4. Suppose $f$, $g$ are given as in Proposition 3, and $f \circ g$ has a
maximum at $t_0$. Then $\nabla f(g(t_0))$ is orthogonal to $g'(t_0)$. For $(f \circ g)'(t_0) =
0$, but
$$(f \circ g)'(t_0) = df(g(t_0), g'(t_0)) = \langle \nabla f(g(t_0)), g'(t_0) \rangle$$
Lagrange Multipliers
This last example serves to provide a method for finding maxima (or
minima) of functions subject to certain constraints. This is the process of
Lagrange multipliers. Suppose $f$, $g$ are differentiable functions in a certain
domain $D$ in $R^n$. We consider $f$ as the function we are studying and $g(x) = 0$
the constraint. Suppose $f$ has a maximum on $g(x) = 0$ at $x_0$. Thus, if $\Gamma$
is a curve in the set $\{g(x) = 0\}$ going through $x_0$, then $\nabla f(x_0)$ is orthogonal
to the tangent line to $\Gamma$ at $x_0$. For if $\Gamma$ is the image of a function $\varphi$ of a real
variable, and $\varphi(t_0) = x_0$, then as in Example 4, $\langle \nabla f(x_0), \varphi'(t_0) \rangle = 0$, and
$\varphi'(t_0)$ spans the tangent line to $\Gamma$ at $x_0$. Now also $g \circ \varphi$ is constant, so
$\langle \nabla g(x_0), \varphi'(t_0) \rangle = 0$. Thus at the maximum point $x_0$ of $f$ on $\{g(x) = 0\}$,
$\nabla f(x_0)$ and $\nabla g(x_0)$ are both orthogonal to all curves through $x_0$ subject to
the constraint $g(x) = 0$. If there are enough such curves, say, so that the
set of tangent vectors fills out a subspace of $R^n$ of dimension $n - 1$, then
$\nabla f(x_0)$ and $\nabla g(x_0)$ must be collinear. We will not worry here that there are
enough of these curves, but take it for granted. After all, we are not here
studying the theory, but only seeking a technique which will provide
candidates for a maximum point. We can state this principle: if $x_0$ is a maximum
(or minimum) point for $f$ subject to the constraint $g(x) = 0$, then there is a $\lambda$
such that
$$\nabla f(x_0) = \lambda \nabla g(x_0)$$
Thus we can find possible $x_0$ by solving the system of equations
$$\nabla f(x) = \lambda \nabla g(x), \qquad g(x) = 0 \qquad (3.5)$$
for $x$, $\lambda$.
Examples
5. We shall find the maximum value of $xyz$ on the unit sphere
$x^2 + y^2 + z^2 = 1$. Let $f(x) = xyz$, $g(x) = x^2 + y^2 + z^2 - 1$.
$$\nabla f(x) = (yz, xz, xy) \qquad \nabla g(x) = (2x, 2y, 2z)$$
Thus we must solve
$$x^2 + y^2 + z^2 = 1, \qquad (yz, xz, xy) = 2\lambda(x, y, z) \qquad (3.6)$$
Eliminating $\lambda$ from Equations (3.6), we obtain
$$\frac{yz}{x} = \frac{xz}{y} = \frac{xy}{z} \qquad (3.7)$$
This can be written as
$$z = 0 \ \text{ or } \ x = 0 \ \text{ or } \ y = 0 \ \text{ or } \ \frac{y}{x} = \frac{x}{y},\ \frac{z}{y} = \frac{y}{z} \qquad (3.8)$$
Thus either one of the coordinates is zero or $x^2 = y^2 = z^2$. Near
any point where one of the coordinates is zero, $f$ changes sign, so
these points are disqualified. This leaves any one of the points
$(1/\sqrt{3})(\pm 1, \pm 1, \pm 1)$. The value of $f$ at any one of these points is
$\pm 3^{-3/2}$, thus $3^{-3/2}$ is the maximum.
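As a sanity check on Example 5, one can simply search the unit sphere on a grid and compare the largest value of $xyz$ found with $3^{-3/2}$. This is a brute-force sketch, not the Lagrange multiplier method itself; the spherical-coordinate grid and the resolution `n` are assumptions made for the illustration.

```python
import math

def f(x, y, z):
    return x * y * z

def max_on_unit_sphere(n=200):
    """Brute-force maximum of f over a latitude/longitude grid on the unit sphere."""
    best_value, best_point = -math.inf, None
    for i in range(n + 1):
        theta = math.pi * i / n            # polar angle in [0, pi]
        for j in range(2 * n):
            phi = math.pi * j / n          # azimuth in [0, 2*pi)
            p = (math.sin(theta) * math.cos(phi),
                 math.sin(theta) * math.sin(phi),
                 math.cos(theta))
            value = f(*p)
            if value > best_value:
                best_value, best_point = value, p
    return best_value, best_point

best_value, best_point = max_on_unit_sphere()
print(best_value, 3 ** -1.5)   # grid maximum versus the Lagrange candidate value
print(best_point)              # coordinates close to +-1/sqrt(3) in absolute value
```

The grid maximum sits slightly below $3^{-3/2}$ because the exact maximizer need not be a grid point; refining `n` closes the gap.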
6. Find the point on the curve $2(x - 1)^2 + y^2 = 4$ which is closest
to the origin. Here $g(x, y) = 2(x - 1)^2 + y^2 - 4$ and $f(x, y) = x^2 + y^2$.
Thus
$$\nabla f = (2x, 2y) \qquad \nabla g = (4(x - 1), 2y)$$
The equations become
$$x = 2\lambda(x - 1)$$
$$y = \lambda y$$
$$2(x - 1)^2 + y^2 = 4$$
From the second equation, either $y = 0$ or $\lambda = 1$. The second case
gives $x = 2$. Thus, the candidates are $(1 \pm \sqrt{2}, 0)$, $(2, \pm\sqrt{2})$. The
values of $f$ at the first pair are $(1 - \sqrt{2})^2$, $(1 + \sqrt{2})^2$; and at the second
the value of $f$ is 6. Clearly, the minimum distance is $|1 - \sqrt{2}|$ and
the maximum value of $f$ is 6, so the maximum distance is $\sqrt{6}$ (see Figure 3.5).
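The candidates in Example 6 can be compared against a direct scan of the curve. The sketch assumes the constraint is $2(x - 1)^2 + y^2 = 4$, as used in the gradient computation, and parametrizes it by $x = 1 + \sqrt{2}\cos\theta$, $y = 2\sin\theta$; this parametrization and the grid size `n` are illustrative choices, not part of the text.

```python
import math

def point(theta):
    """Point on 2(x - 1)^2 + y^2 = 4 for the parameter theta."""
    return (1 + math.sqrt(2) * math.cos(theta), 2 * math.sin(theta))

def f(x, y):
    return x * x + y * y   # squared distance to the origin

n = 3600
values = [f(*point(2 * math.pi * k / n)) for k in range(n)]
f_min, f_max = min(values), max(values)

print(f_min, (1 - math.sqrt(2)) ** 2)   # smallest squared distance
print(f_max, 6.0)                       # largest squared distance
print(math.sqrt(f_min), math.sqrt(f_max))
```

The scan reproduces the Lagrange candidates: the extreme values of $f$ occur at $(1 - \sqrt{2}, 0)$ and at $(2, \pm\sqrt{2})$.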
Figure 3.5
7. Find the point on the intersection of the two surfaces
$$xyz = 1$$
$$x^2 + y^2 + 2z^2 = 8$$
which is closest to the origin. In this problem we have two constraints,
but we can see through the technique. The tangent vector to the
curve of intersection is orthogonal to the gradient of both constraining
functions, and at the closest point $\nabla(x^2 + y^2 + z^2)$ is orthogonal to the curve.
Thus this gradient must be coplanar with the gradients of the
constraining functions. Let $f(x) = x^2 + y^2 + z^2$, $g(x) = xyz - 1$,
$h(x) = x^2 + y^2 + 2z^2 - 8$. Then $\nabla f = 2(x, y, z)$, $\nabla g = (yz, xz, xy)$,
$\nabla h = 2(x, y, 2z)$. We must solve these five equations for $x, y, z, \lambda, \mu$:
$$2(x, y, z) = \lambda(yz, xz, xy) + 2\mu(x, y, 2z)$$
$$xyz = 1$$
$$x^2 + y^2 + 2z^2 = 8$$
8. Let $M = (a^i_j)$ be a symmetric $n \times n$ matrix. That is, $a^i_j = a^j_i$
for all $i$ and $j$. If $T$ is the transformation on $R^n$ defined by $M$,
$$\nabla(\langle Tx, x \rangle) = 2Tx$$
We show this by computation:
$$\langle Tx, x \rangle = \sum_{i,j} a^i_j x^i x^j \qquad (3.9)$$
The $k$th component of $\nabla(\langle Tx, x \rangle)$ is found by differentiating (3.9)
with respect to $x^k$; this gives
$$\sum_i a^i_k x^i + \sum_j a^k_j x^j$$
But since $M$ is symmetric, this is the same as $\sum_i a^k_i x^i + \sum_j a^k_j x^j =
2(Tx)^k$. Then $\nabla(\langle Tx, x \rangle) = 2Tx$ is established. Now, the function
$f(x) = \langle Tx, x \rangle$ must attain a maximum on the unit sphere, say at $x_0$.
The Lagrange multiplier procedure tells us that there is a $\lambda$ such that
$$\nabla(\langle Tx, x \rangle)\big|_{x = x_0} = \lambda\, \nabla\Big(\sum_i (x^i)^2 - 1\Big)\Big|_{x = x_0} \quad \text{or} \quad 2Tx_0 = 2\lambda x_0$$
Thus the transformation $T$ has an eigenvector, namely that $x_0$ on the
unit sphere which maximizes the function $\langle Tx, x \rangle$.
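We can continue this idea in order to prove that a transformation given by
a symmetric matrix has an orthogonal basis of eigenvectors. For,
let $x_1$ be the eigenvector found as in Example 8, with eigenvalue $\lambda_1$. Now
maximize $\langle Tx, x \rangle$ subject to the constraints $\langle x, x \rangle = 1$, $\langle x, x_1 \rangle = 0$. If $x_2$
is the maximum point subject to these constraints, we have $\lambda_2$, $\mu_2$ such that
$$\|x_2\| = 1, \quad \langle x_2, x_1 \rangle = 0, \quad 2Tx_2 = 2\lambda_2 x_2 + \mu_2 x_1$$
(here $x_1 = \nabla(\langle x, x_1 \rangle)$). Taking the inner product of the third equation with
$x_1$ and using the symmetry of $T$, we have $\langle Tx_2, x_1 \rangle = \langle x_2, Tx_1 \rangle = \lambda_1 \langle x_2, x_1 \rangle = 0$,
so $\mu_2 = 0$. Thus, by the first two equations, $x_2$ is nonzero and orthogonal to $x_1$, and
by the third, $x_2$ is an eigenvector of $T$. Now proceed to the constraints
$\langle x, x \rangle = 1$, $\langle x, x_1 \rangle = 0$, $\langle x, x_2 \rangle = 0$. The same technique works to produce
a third eigenvector. We can go on until we have found $n$ independent
eigenvectors.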
Examples
9. Let
$$M = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$$
and find the eigenvectors of $M$.
$\lambda$ is an eigenvalue of $M$ if and only if there is a nonzero vector $x$
such that $(M - \lambda I)x = 0$. We know the necessary and sufficient
condition for that: $\det(M - \lambda I) = 0$. Thus the eigenvalues of $M$ are
the roots of $\det(M - \lambda I) = 0$. Now
$$M - \lambda I = \begin{pmatrix} 2 - \lambda & 1 & 1 \\ 1 & 2 - \lambda & 1 \\ 1 & 1 & 2 - \lambda \end{pmatrix}$$
After a computation we find that
$$\det(M - \lambda I) = (2 - \lambda)^3 - 3(2 - \lambda) + 2 = -(\lambda - 1)^2(\lambda - 4)$$
Thus the eigenvalues are 1, 4. We find the corresponding eigenvectors
by solving the equations $(M - I)x = 0$, $(M - 4I)x = 0$ for nonzero
vectors.
eigenvalue 1:
$$M - I = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$$
corresponding eigenvectors: $(1, -1, 0)$, $(0, -1, 1)$
(Any two independent vectors such that $v_1 + v_2 + v_3 = 0$ will do.)
eigenvalue 4:
$$M - 4I = \begin{pmatrix} -2 & 1 & 1 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix}$$
The sum of the three rows is zero, so they are dependent. The first
and second are independent, so the corresponding eigenvector lies
on the line
$$-2x + y + z = 0$$
$$x - 2y + z = 0$$
Such a vector is $(1, 1, 1)$. Thus the eigenvectors of $M$ are $(1, -1, 0)$,
$(0, -1, 1)$ with eigenvalue 1, and $(1, 1, 1)$ with eigenvalue 4.
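The eigenpairs of Example 9 are easy to verify by direct multiplication. In the sketch below the matrix `M` is the one implied by the displayed $M - I$ (all entries 1), which is an inference from the example rather than a quoted display; `matvec` and `dot` are small helpers written for this check.

```python
# Matrix implied by M - I having all entries equal to 1 (an inference from Example 9).
M = [[2, 1, 1],
     [1, 2, 1],
     [1, 1, 2]]

def matvec(A, v):
    """Matrix-vector product with plain Python lists."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

pairs = [(1, [1, -1, 0]), (1, [0, -1, 1]), (4, [1, 1, 1])]
for lam, v in pairs:
    # M v should equal lam * v for each claimed eigenpair.
    print(lam, v, matvec(M, v), [lam * x for x in v])

# Eigenvectors for the distinct eigenvalues 1 and 4 are orthogonal, as the
# discussion of symmetric matrices predicts.
print(dot([1, 1, 1], [1, -1, 0]), dot([1, 1, 1], [0, -1, 1]))
```

Note that the two eigenvectors listed for the eigenvalue 1 are independent but not orthogonal to each other; an orthogonal pair can be chosen inside the plane $v_1 + v_2 + v_3 = 0$.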
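10. Find the eigenvalues of
$$M = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$$
Here $\det(M - \lambda I) = (2 - \lambda)^2 - 9$ which has the roots $-1$, 5.
eigenvalue $-1$: $M + I = \begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix}$ kills the vector $(1, -1)$.
eigenvalue 5: $M - 5I = \begin{pmatrix} -3 & 3 \\ 3 & -3 \end{pmatrix}$ kills the vector $(1, 1)$.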
• EXERCISES
1. Differentiate these functions and graph the curve defined by the
function
(a) $f(t) = e^{ct}$, $c$ a complex number.
(b) $f(t) = (\cos t, \sin t, t)$.
(c) $f(t) = (a \cos t, b \sin t)$.
(d) $f(t) = (t^2, t^3)$.
(e) $f(t) = (t, t^2, t^3)$.
(f) $f(t) = (\sin t, \cos t, 0)$.
2. What is the length of $f'(t)$ in each of Exercises 1(a)-(f)? What is the
angle between $f'(t)$ and $f''(t)$?
3. At which pairs of points are the tangent lines to the curves (a) ($c = i$),
and (c) of Exercise 1 parallel?
4. At which pairs of points are the tangent lines to Exercises 1(b), (f)
parallel?
5. Find the maximum of $xy$ on the ellipse $ax^2 + by^2 = 1$.
6. Find the minimum of $x + y$ on the curve $xy = 1$ in the first quadrant.
7. Find the two points on the curves $y = x^2$ and $xy = -1$ which are
closest.
8. Minimize $x^2 + y^2 + z^2$ on the ellipsoid $ax^2 + by^2 + cz^2 = 1$.
9. Given two straight lines $L^1$ and $L^2$ in space how would you try to find
the points $p_1 \in L^1$, $p_2 \in L^2$ which are closest (i.e., minimize $\|p - q\|$ for
$p \in L^1$, $q \in L^2$)?
10. Find the eigenvalues and eigenvectors of these matrices:
« {I J> « ("I i)-
11. Find the eigenvalues and eigenvectors of these matrices:
1 0 —1\ /2 1 3\
(a) I 0 1 0). (b) |l 0 3
-10 3
• PROBLEMS
1. Let $f$, $g$ be differentiable $R^n$-valued functions defined on an interval $I$.
(a) Show that the inner product $h = \langle f, g \rangle$ is differentiable and
$h' = \langle f', g \rangle + \langle f, g' \rangle$.
(b) Show that $\|f\| = \langle f, f \rangle^{1/2}$ is constant if and only if $f(x)$, $f'(x)$ are
orthogonal for all $x$.
(c) Give a condition for $f$ to lie on a straight line.
2. Find the point on the intersection of these two surfaces
$$a^2x^2 + b^2y^2 + c^2z^2 = 1$$
$$x^2 + y^2 = 1$$
which is closest to the origin.
3. A rectangular box of maximum volume is to be constructed, with sides
parallel to the coordinate planes, one vertex at the origin and the diagonally
opposite vertex on the plane ax + by + cz = 1. Find the volume of that
box.
4. A community consumes water at the rate of $\sin^2(2\pi t/24)$ gallons per
hour. They wish to build a storage tank of capacity $Q$ with a pump of rate
$w$ gallons per hour, so that the community will never run out of water. The
cost is $Q + kw$. Minimize this cost for them.
5. Show that if $f$ is any differentiable function on $R^3$, there are at least
two points $x$ on the unit sphere at which $\nabla f(x)$ is parallel to $x$.
3.2 Taylor's Formula
Higher order derivatives appear for vector-valued functions just as they
do in the usual one-variable calculus.
Definition 2. Let $f$ be an $R^n$-valued function defined on an open set
$U \subset R$. $f$ is $k$-times differentiable on $U$ if there exist differentiable functions
$g_1, \ldots, g_k$ defined on $U$ such that $g_1 = f'$, $g_2 = g_1'$, $\ldots$, $g_k = g_{k-1}'$. We will
denote $g_k$ by $f^{(k)}$. $f$ is $k$-times continuously differentiable on $U$ (written $f$
is $C^k(U)$) if $f^{(k)}$ is continuous on $U$.
The following proposition is an obvious extension of Proposition 1 by
induction.
Proposition 4. Let $f = (f_1, \ldots, f_n)$ be an $R^n$-valued function defined on $U$.
$f$ is $k$-times (continuously) differentiable on $U$ if $f_1, \ldots, f_n$ are each $k$-times
(continuously) differentiable on $U$. Further, $f^{(k)} = (f_1^{(k)}, \ldots, f_n^{(k)})$.
Knowing that a given function is differentiable at a particular point can
be a great aid in computing approximations to its values at nearby points.
These considerations in turn lead to a better understanding of the notion of
differentiability. Suppose that $f$ is a differentiable $R^n$-valued function defined
in a neighborhood of 0. By definition the difference quotient,
$$\frac{1}{t}[f(t) - f(0)]$$
converges to $f'(0)$. In other words, the function $\varepsilon(t)$ defined for $t \ne 0$ by
$$\varepsilon(t) = \frac{1}{t}[f(t) - f(0)] - f'(0)$$
has limit 0 as $t \to 0$. Rewriting this,
$$f(t) = f(0) + f'(0)t + \varepsilon(t) \cdot t \qquad (3.10)$$
where $\lim \varepsilon(t) = 0$. Thus a good approximation to the value $f(t)$ would be
$f(0) + f'(0)t$; how good depends of course on the function $\varepsilon(t)$. But since
the difference between this approximation and $f(t)$ is $\varepsilon(t) \cdot t$, it suffices to
know just the maximum of $|\varepsilon(t)|$. We give an illustration of how to go
about determining this.
Suppose $f$ is a real-valued $C^2$ function defined in an interval $[-R, R]$. Let
$$M = \sup\{|f''(x)| : |x| \le R\}$$
Then
$$|f(t) - (f(0) + f'(0)t)| \le MR|t| \quad \text{for } t \in [-R, R] \qquad (3.11)$$
This follows easily from the mean value theorem. There is a $\xi$ between
$t$ and 0 such that
$$f(t) - f(0) = f'(\xi)\,t$$
Further, there is an $\eta$ between $\xi$ and 0 such that $f'(\xi) - f'(0) = f''(\eta)\,\xi$. Thus,
for a given $t \in [-R, R]$,
$$\varepsilon(t) = \frac{1}{t}[f(t) - f(0)] - f'(0)
= f'(\xi) - f'(0) = f''(\eta)\,\xi \qquad \eta, \xi \in [-R, R]$$
Thus $|\varepsilon(t)| \le MR$. Inequality (3.11) follows from (3.10) and this inequality.
Now, although it could be very difficult to adequately describe the function
$\varepsilon(t)$, the maximum $M$ is much easier to obtain. In practice, $f''$ is monotonic
near 0 so we need only look at its values at the end points $-R$ and $R$ to
obtain this estimate. We shall now generalize this argument in order to
obtain estimates which are even more accurate.
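Estimate (3.11) can be tried out on a concrete function. The sketch below takes $f = \exp$ on $[-R, R]$ with $R = 1/2$, so that $M = e^{R}$; the choice of function, the value of `R` and the sample grid are assumptions made only for this illustration.

```python
import math

R = 0.5
f = math.exp
f_prime_0 = 1.0          # f'(0) for f = exp
M = math.exp(R)          # sup of |f''| on [-R, R], since f'' = exp is increasing

def linear_error(t):
    """|f(t) - (f(0) + f'(0) t)|, the left side of (3.11)."""
    return abs(f(t) - (f(0.0) + f_prime_0 * t))

ts = [-R + 2 * R * k / 200 for k in range(201)]
bound_holds = all(linear_error(t) <= M * R * abs(t) + 1e-15 for t in ts)
print(bound_holds)
for t in (-0.5, -0.1, 0.1, 0.5):
    print(t, linear_error(t), M * R * abs(t))
```

The printed pairs show that the bound $MR|t|$ is valid but far from sharp for small $|t|$, which is the motivation for the higher order estimates that follow.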
Rereading Equation (3.10) and the special illustration above, we can assert
that differentiability of a function at a point shows us how the values of the
function at nearby points can be well approximated by the values of a first-
order polynomial. (Well approximated here means that the error is small
relative to the distance between the two points.) Furthermore, this well
approximability is a criterion for differentiability.
Proposition 5. Suppose that $f$ is an $R^n$-valued function defined in a
neighborhood of $x_0 \in R$. $f$ is differentiable at $x_0$ if and only if there exists a linear
function $L : R \to R^n$ and a function $\varepsilon$ defined for small $t$ such that $\lim \varepsilon(t) = 0$
and
$$f(x_0 + t) = f(x_0) + L(t) + \varepsilon(t)\,t$$
Furthermore, $L(t) = f'(x_0) \cdot t$.
Proof. We have seen above that differentiability implies this condition.
Conversely, suppose this condition is verified. Then
$$\lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t} = \lim_{t \to 0} \frac{L(t)}{t} + \lim_{t \to 0} \varepsilon(t) = \lim_{t \to 0} \frac{L(t)}{t} = L(1)$$
for since $L$ is linear, $L(t) = tL(1)$. Thus $f$ is differentiable at $x_0$, and $f'(x_0) = L(1)$.
Now, an approximate evaluation of $f(t)$ for $t$ near 0 with error that is
small relative to $|t|$ may not be as good as required. A better approximation
would be one whose error is small as compared to $t^2$, or even better $|t|^k$
for sufficiently large $k$. This is where the higher order derivatives come in.
We shall now derive a theorem which gives such approximations. The
derivation follows by induction directly from the above remarks.
Theorem 3.1. (Taylor's Theorem) Suppose that $f$ is a $(k + 1)$-times
continuously differentiable $R^n$-valued function defined in an interval $I$ about $x_0$.
Then there is a polynomial $P$ (with coefficients in $R^n$) of degree $k$, and a function
$\varepsilon$ defined for $t$ in $I$ such that
(i) $\varepsilon(t)$ is bounded by $\max\{|f^{(k+1)}(x)| : x \text{ between } x_0 \text{ and } x_0 + t\}$,
(ii) $$f(x_0 + t) = P(t) + \frac{\varepsilon(t)\,t^{k+1}}{(k + 1)!} \qquad (3.12)$$
Furthermore, $P$ is unique and is given by
$$P(t) = f(x_0) + f'(x_0)\,t + \frac{f''(x_0)}{2}\,t^2 + \cdots + \frac{f^{(k)}(x_0)}{k!}\,t^k$$
If we write $x = x_0 + t$, (3.12) becomes a more familiar expression, called
Taylor's expansion of degree $k$ about $x_0$:
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(x_0)}{i!}\,(x - x_0)^i + \varepsilon(x - x_0)\,\frac{(x - x_0)^{k+1}}{(k + 1)!} \qquad (3.13)$$
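Proof. The proof is by induction on $k$. The case $k = 1$ was already discussed
above. We now assume the proposition for $k = n - 1$ and prove it for $k = n$, by
applying the induction hypothesis to $f'$. For simplicity we take $x_0 = 0$, and
$I = \{x : |x| < a\}$.
Let $t \in I$. By the induction hypothesis we can write
$$f'(t) = \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{i!}\,t^i + \frac{\varepsilon_0(t)}{n!}\,t^n \qquad (3.14)$$
since $(f')^{(i)} = f^{(i+1)}$. Here $\varepsilon_0(t)$ is bounded by $M = \max\{|f^{(n+1)}(x)| : x \text{ between } 0 \text{ and }
t\}$. Now let us integrate (3.14) from 0 to $x$:
$$\int_0^x f'(t)\,dt = \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{i!} \int_0^x t^i\,dt + \frac{1}{n!} \int_0^x \varepsilon_0(t)\,t^n\,dt \qquad (3.15)$$
The integral on the left is, by the fundamental theorem of calculus, $f(x) - f(0)$.
Thus, letting
$$\varepsilon(x) = \frac{n + 1}{x^{n+1}} \int_0^x \varepsilon_0(t)\,t^n\,dt$$
we obtain from (3.15)
$$f(x) = f(0) + \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{i!}\,\frac{x^{i+1}}{i + 1} + \varepsilon(x)\,\frac{x^{n+1}}{(n + 1)!}$$
which is just the same as (3.12). We must show that $\varepsilon(x)$ is bounded by $M$. But,
$$|\varepsilon(x)| \le \frac{n + 1}{x^{n+1}} \int_0^x |\varepsilon_0(t)|\,t^n\,dt \le \frac{n + 1}{x^{n+1}}\,M \int_0^x t^n\,dt = M$$
since $\varepsilon_0$ is bounded by $M$.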
Examples
11. Find the Taylor expansion of degree 3 about 1 of $f(t) =
1 + t + 3t^4$.
$$f(1) = 5 \qquad f'(1) = 1 + 12t^3\big|_{t=1} = 13$$
$$f''(1) = 36 \qquad f'''(1) = 72 \qquad \text{and} \qquad f^{(4)}(t) = 72$$
thus the Taylor expansion is
$$f(t) = 5 + 13(t - 1) + 18(t - 1)^2 + 12(t - 1)^3 + \varepsilon(t)\,\frac{(t - 1)^4}{4!}$$
where $|\varepsilon(t)| \le 72$.
Notice that, since $f^{(5)}(t) = 0$, the Taylor expansion of degree 4 is
accurate:
$$f(t) = 5 + 13(t - 1) + 18(t - 1)^2 + 12(t - 1)^3 + 3(t - 1)^4$$
for all $t$.
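The expansion in Example 11 can be confirmed with exact rational arithmetic. The sketch compares $f(t) = 1 + t + 3t^4$ with the degree 3 and degree 4 expansions about 1 at a few sample points; the sample points and the function names are illustrative.

```python
from fractions import Fraction

def f(t):
    return 1 + t + 3 * t ** 4

def taylor3(t):
    """Degree 3 Taylor polynomial of f about 1."""
    u = t - 1
    return 5 + 13 * u + 18 * u ** 2 + 12 * u ** 3

def taylor4(t):
    """Degree 4 Taylor polynomial of f about 1 (exact, since f has degree 4)."""
    return taylor3(t) + 3 * (t - 1) ** 4

samples = [Fraction(k, 4) for k in range(-8, 13)]
for t in samples[::5]:
    remainder = f(t) - taylor3(t)
    bound = Fraction(72, 24) * abs(t - 1) ** 4   # 72 |t - 1|^4 / 4!
    print(t, f(t), taylor4(t), remainder, bound)
```

For this polynomial the degree 3 remainder equals the bound $72\,|t - 1|^4/4!$ exactly, because $f^{(4)}$ is the constant 72.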
12. Find the Taylor expansion of degree 4 about 0 of $f(t) =
(1 + t^2)^{-1}$.
$$f(0) = 1$$
$$f'(t) = -2t(1 + t^2)^{-2} \qquad f'(0) = 0$$
$$f''(t) = -2(1 + t^2)^{-2} + 8t^2(1 + t^2)^{-3} \qquad f''(0) = -2$$
$$f'''(t) = 24t(1 + t^2)^{-3} - 48t^3(1 + t^2)^{-4} \qquad f'''(0) = 0$$
$$f^{(4)}(t) = 24(1 + t^2)^{-3} + t[\cdots] \qquad f^{(4)}(0) = 24$$
$$f(t) = 1 - t^2 + t^4 + \varepsilon(t)\,\frac{t^5}{5!}$$
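13. Calculate $(40)^{1/2}$ to three decimal places. We expand
$f(x) = \sqrt{x}$ about 36.
$$f'(x) = \tfrac12 x^{-1/2} \qquad f''(x) = -\tfrac14 x^{-3/2}$$
$$f'''(x) = \tfrac38 x^{-5/2} \qquad f^{(4)}(x) = -\tfrac{15}{16} x^{-7/2}$$
$$f(36) = 6 \qquad f'(36) = \frac{1}{12} \qquad f''(36) = -\frac{1}{4 \cdot 6^3}$$
$$f'''(36) = \frac{3}{8 \cdot 6^5} \qquad |f^{(4)}(\xi)| \le \frac{15}{16 \cdot 6^7}$$
for $\xi$ between 36 and 40. Thus
$$f(x) = 6 + \frac{1}{12}(x - 36) - \frac{1}{8 \cdot 6^3}(x - 36)^2
+ \frac{1}{16 \cdot 6^5}(x - 36)^3 + \varepsilon(x - 36)\,\frac{(x - 36)^4}{4!}$$
where $|\varepsilon(t)| \le 15/(16 \cdot 6^7)$. Thus
$$\left|\sqrt{40} - \left(6 + \frac{4}{12} - \frac{4^2}{8 \cdot 6^3} + \frac{4^3}{16 \cdot 6^5}\right)\right| \le \frac{1}{4!} \cdot \frac{15}{16 \cdot 6^7} \cdot 4^4 = \frac{10}{6^7} < 10^{-4}$$
and the desired approximation is 6.325.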
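The arithmetic of Example 13 is short enough to replay. The sketch evaluates the degree 3 Taylor polynomial of $\sqrt{x}$ about 36 at $x = 40$, using the derivative values $f'(36) = 1/12$, $f''(36) = -1/(4 \cdot 6^3)$, $f'''(36) = 3/(8 \cdot 6^5)$ and the fourth-derivative bound $15/(16 \cdot 6^7)$; the variable names are illustrative.

```python
import math

h = 40 - 36
# Degree 3 Taylor polynomial of sqrt about 36, evaluated at 36 + h.
approx = 6 + h / 12 - h ** 2 / (8 * 6 ** 3) + h ** 3 / (16 * 6 ** 5)

# Remainder bound: max |f''''| on [36, 40] times h^4 / 4!.
bound = (15 / (16 * 6 ** 7)) * h ** 4 / math.factorial(4)

print(approx, math.sqrt(40))
print(abs(approx - math.sqrt(40)), bound)
print(round(approx, 3))
```

The true error is just under the remainder bound $10/6^7$, so here the estimate of Taylor's theorem is nearly sharp.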
14. Calculate $e^4$ to three decimal places. We first write down the
Taylor expansion of $f(x) = e^x$ about 0. Since $f'(x) = f(x)$, we have
$f^{(k)}(x) = e^x$ for all $x$. Thus the Taylor expansion of $e^x$, degree $n$, is
$$e^x = \sum_{i=0}^{n} \frac{x^i}{i!} + \varepsilon(x)\,\frac{x^{n+1}}{(n + 1)!} \qquad (3.16)$$
where $|\varepsilon(x)| \le \max\{|e^t| : 0 \le t \le x\}$. Thus to estimate $e^4$ we now
take $|\varepsilon(x)| \le e^4 < 3^4$. The error in the approximation by the Taylor expansion
of degree $n$ is bounded by
$$|\varepsilon(4)|\,\frac{4^{n+1}}{(n + 1)!} \le \frac{3^4\,4^{n+1}}{(n + 1)!}$$
We must choose $n$ so large that this is bounded by $10^{-3}$. $n \ge 41$ will
do, as we see by the following succession of inequalities
$$\frac{3^4 \cdot 4^{n+1}}{(n + 1)!} \le \frac{4^{n+5}}{(n + 1)!} \le \frac{2^{2n+10}}{8^{n-7}} = \frac{1}{2^{n-31}}
\le \frac{1}{10^{(3/10)(n-31)}}$$
Thus it suffices to have $(3/10)(n - 31) \ge 3$, or $n \ge 41$.
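The bound used in Example 14 is deliberately crude, and a short computation shows how much room it leaves. The sketch evaluates the remainder bound $3^4 \cdot 4^{n+1}/(n+1)!$ directly, finds the first degree at which it drops below $10^{-3}$, and compares the partial sum with `math.exp(4)`; the helper names and the search range are assumptions for the illustration.

```python
import math

def taylor_exp(x, n):
    """Partial sum of x^i / i! for i = 0..n."""
    term, total = 1.0, 1.0
    for i in range(1, n + 1):
        term *= x / i
        total += term
    return total

def remainder_bound(n):
    """3^4 * 4^(n+1) / (n+1)!, the bound on the degree-n error at x = 4."""
    return 3 ** 4 * 4 ** (n + 1) / math.factorial(n + 1)

# Smallest degree whose remainder bound is at most 10^-3.
n_min = next(n for n in range(1, 100) if remainder_bound(n) <= 1e-3)

print(n_min, remainder_bound(n_min))
print(taylor_exp(4.0, n_min), math.exp(4.0))
print(remainder_bound(41))
```

The chain of inequalities in the example trades sharpness for a bound that can be verified by hand; evaluating the factorial directly shows that a considerably smaller degree than 41 already meets the tolerance.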
In the Taylor expansion (3.16) of $e^x$ observe that the remainder term is
dominated by
$$e^{x} \cdot \frac{x^{n+1}}{(n + 1)!}$$
and therefore tends to zero as $n \to \infty$. Thus, if we let $n \to \infty$ in (3.16), we
obtain (once again)
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
Now, this kind of an argument can be applied to any function which we
happen to know has derivatives of all orders. That is, if $f$ is infinitely
differentiable in an interval $I$ about $x_0$ we can write the Taylor expansion
$$f(x) = f(x_0) + \sum_{i=1}^{n} \frac{f^{(i)}(x_0)}{i!}\,(x - x_0)^i + \varepsilon(x)\,\frac{(x - x_0)^{n+1}}{(n + 1)!} \qquad (3.17)$$
where $|\varepsilon(x)| \le \max\{|f^{(n+1)}(t)| : t \text{ between } x_0 \text{ and } x\}$, valid for every $n$. Let
$M^{n+1}(x)$ be this bound. If
$$\lim_{n \to \infty} M^{n+1}(x)\,\frac{|x - x_0|^{n+1}}{(n + 1)!} = 0 \qquad (3.18)$$
then clearly we can take the limit as $n \to \infty$ in (3.17) and represent $f$ as a
series. This series is called the Taylor expansion of $f$ about $x_0$. In Chapters
5 and 6 we shall return to the consideration of series expansions for functions.
In Section 5.8 we shall construct infinitely differentiable functions which are
not represented by these Taylor expansions. For the present we mean only
to remark on the Taylor expansion as a tool for
approximation.
Examples
15. Consider now $f(x) = \sin x$. We have
$$f'(x) = \cos x \qquad f''(x) = -\sin x \qquad f'''(x) = -\cos x$$
$$f^{(4)}(x) = \sin x, \ldots$$
and the cycle repeats itself. Thus
$$f^{(4n+1)}(x) = \cos x \qquad f^{(4n+2)}(x) = -\sin x \qquad f^{(4n+3)}(x) = -\cos x$$
$$f^{(4n+4)}(x) = \sin x$$
The Taylor expansion about zero is thus found to be
$$f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + \text{Remainder term}$$
Since all derivatives of $\sin x$ are one of $\pm\sin x$, $\pm\cos x$, they are
bounded by 1, so the remainder for the Taylor expansion of degree $k$
is bounded by
$$\frac{|x|^{k+1}}{(k + 1)!}$$
which tends to zero as $k \to \infty$. Thus the Taylor expansion
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + \frac{(-1)^k x^{2k+1}}{(2k + 1)!} + \cdots$$
accurately expresses sine as an infinite sum. Similarly, we can
compute a Taylor expansion for the cosine (see Exercise 15),
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + \frac{(-1)^k x^{2k}}{(2k)!} + \cdots$$
16. Find $\sin \pi/4$ to an accuracy of $10^{-3}$. We need to compute
a bound on the remainder after calculating $k$ terms of the Taylor
expansion and then ensure this bound is $< 10^{-3}$. Now the remainder
after $k$ terms is bounded by $[(2k+1)!]^{-1}(\pi/4)^{2k+1}$. We shall use the
fact that $\pi/4 < 4/5$ to verify that $k = 3$ will work:
$$\frac{1}{7!}\left(\frac{\pi}{4}\right)^7 < \frac{1}{6 \cdot 4^4} \cdot \frac{4^7}{5^7} \le \frac{1}{6 \cdot 5^4} < 10^{-3}$$
Thus an estimate to $\sin \pi/4$ to within one thousandth is
$$\frac{\pi}{4} - \frac{\pi^3}{6 \cdot 64} + \frac{\pi^5}{120 \cdot 4^5}$$
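As a quick check of Example 16, here is a small sketch (the helper name `sin_taylor` is an assumption, not something from the text) that evaluates the three-term estimate and compares the actual error with the remainder bound $(\pi/4)^7/7!$.

```python
from math import pi, sin, factorial

def sin_taylor(x, k):
    # First k nonzero terms: x - x^3/3! + ... + (-1)^(k-1) x^(2k-1)/(2k-1)!
    return sum((-1)**j * x**(2 * j + 1) / factorial(2 * j + 1) for j in range(k))

x = pi / 4
estimate = sin_taylor(x, 3)      # pi/4 - pi^3/(6*64) + pi^5/(120*4^5)
bound = x**7 / factorial(7)      # remainder bound after three terms
error = abs(estimate - sin(x))   # error against the library sine

print(estimate, error, bound)
```

The error printed should sit below the bound, which in turn is below $10^{-3}$.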
17. The logarithm is infinitely differentiable around the point 1.
Does it have an infinite Taylor expansion there? By computation, we
find
$$\log'(x) = x^{-1} \qquad \log'(1) = 1$$
$$\log''(x) = -x^{-2} \qquad \log''(1) = -1$$
$$\log'''(x) = 2x^{-3} \qquad \log'''(1) = 2$$
$$\log^{(4)}(x) = -3 \cdot 2\,x^{-4} \qquad \log^{(4)}(1) = -3 \cdot 2$$
$$\log^{(n)}(x) = (-1)^{n-1}(n-1)!\,x^{-n} \qquad \log^{(n)}(1) = (-1)^{n-1}(n-1)! \tag{3.19}$$
The Taylor expansion of degree $n$ about 1 is thus
$$\log(x) = \sum_{i=1}^{n} \frac{(-1)^{i-1}}{i}(x - 1)^i + \varepsilon_n(x)\,\frac{(x-1)^{n+1}}{(n+1)!} \tag{3.20}$$
Notice that from the first equation of (3.19), if $x < 1$,
$$|\varepsilon_n(x)| \le n!\,x^{-(n+1)}$$
and thus the remainder of (3.20) is bounded by
$$\frac{1}{n+1}\left|\frac{x - 1}{x}\right|^{n+1}$$
which tends to zero as $n \to \infty$, so long as $1 > x > 1/2$. Similarly,
we can show (Exercise 18) that the remainder goes to zero if
$1 < x < 3/2$. Thus, in the interval $1/2 < x < 3/2$, the logarithm has
the Taylor series
$$\log(x) = \sum_{k=1}^{\infty} (-1)^{k-1}\frac{(x - 1)^k}{k}$$
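The series of Example 17 is easy to try out. The sketch below sums its first $n$ terms and compares with the library logarithm at two points inside $(1/2, 3/2)$; the name `log_series` and the sample points are illustrative choices, not taken from the text.

```python
from math import log

def log_series(x, n):
    # Partial sum of sum_{k>=1} (-1)^(k-1) (x-1)^k / k
    return sum((-1)**(k - 1) * (x - 1)**k / k for k in range(1, n + 1))

for x in (0.75, 1.25):
    print(x, log_series(x, 60), log(x))
```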
• EXERCISES
12. Find the Taylor expansion about the origin of degree 5 of $\tan x$; of
$(1 + x)^{-1}$.
13. Find $\sin 1$ accurately to 4 decimal places.
14. Find $\sqrt{3}$ accurately to 4 decimal places.
15. Derive the Taylor expansion (given after Example 15) of cos x.
16. Find an interval about the origin in which the substitution
$$(1 + x)^{-1} = 1 - x$$
is accurate to three decimal places. What about the substitution
$$(1 + x)^{-1} = 1 - x + x^2?$$
17. Find an interval about the origin in which the substitution
$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}$$
is accurate to three decimal places.
18. Show that the series
$$\sum_{k=1}^{\infty} (-1)^{k-1}\frac{(x - 1)^k}{k}$$
represents the logarithm in the interval $1/2 < x < 3/2$. Observe that the
series converges for all $x$ in the interval $(0, 2)$. Does it converge there to
$\log x$?
• PROBLEMS
6. Suppose $f$ is a $k$-times differentiable real-valued function defined on the
interval $I$. Suppose $f^{(k)}(x) = 0$ for all $x$ in $I$. Show that $f$ is a polynomial of
degree at most $k - 1$.
7. Suppose that $f$, $g$ are $C^k$ functions defined on an interval containing 0,
and $f(0) = \cdots = f^{(k-1)}(0) = 0$, $g(0) = \cdots = g^{(k-1)}(0) = 0$, but $g^{(k)}(0) \ne 0$.
Prove that
$$\lim_{t \to 0} \frac{f(t)}{g(t)} = \frac{f^{(k)}(0)}{g^{(k)}(0)}$$
8. (Taylor's form of the mean value theorem) Suppose that $f$ is $C^k$ on
the interval $[-R, R]$. Show that for $t \in [-R, R]$, there is a $\xi$ between 0 and $t$
such that
$$f(t) = \sum_{i=0}^{k-1} \frac{f^{(i)}(0)}{i!}\,t^i + \frac{f^{(k)}(\xi)}{k!}\,t^k$$
9. Let $m$ be any positive integer and define the functions $f_0, \ldots, f_{m-1}$ by
$$f_i(x) = \sum_{n=0}^{\infty} \frac{x^{mn+i}}{(mn + i)!}$$
Show that:
(a) $e^x = f_0(x) + \cdots + f_{m-1}(x)$.
(b) $f_i' = f_{i-1}$ for $i = 1, \ldots, m - 1$.
(c) $f_0' = f_{m-1}$.
(d) The functions $f_0, \ldots, f_{m-1}$ are all solutions of the differential
equation
$$y^{(m)} = y$$
10. (a) Suppose that $f$ is continuous on the interval $[-R, R]$. Define
$$g(t) = \int_0^1 f(tx)\,dx \qquad t \in [-R, R]$$
and show that $g$ is also continuous.
(b) Suppose that $h$ is $C^1$ on $[-R, R]$. Prove that there is a
continuous function $k$ such that $h(t) = h(0) + tk(t)$. (Hint: Consider
$\int_0^t h'(\tau)\,d\tau$ and make the substitution $\tau = tx$.)
3.3 Differential Equations
Now, an ordinary differential equation is (roughly speaking) an equation
involving the variable $x$, an "unknown" function $f$, and some of its
derivatives $f', f'', \ldots, f^{(k)}$. Thus
$$f'(x) = k(x)$$
$$f'' + f = 0$$
$$f'(x) = x f(x)$$
$$[f^{(4)}(x)]^2 + e^{f(x)} = |f^{(3)}(x)| + \log|x + 1|$$
are examples of differential equations. A solution is a function which makes
the equation true. For example, $\int k$, $\sin x$, $\exp(\tfrac{1}{2}x^2)$ solve the first three
equations respectively (as for the fourth, we cannot easily exhibit a solution).
We prefer to think about differential equations in this informal sense rather
than to attempt a formal definition.
Many equations do not admit solutions and some equations admit many.
Consider these:
$$|y'| + |y - x| = 0$$
$$(y')^2 + 1 = 0$$
$$y'' + y = 0$$
The first has no solution $y = f(x)$, because we cannot have both $f(x) = x$
and $f'(x) = 0$; the second has no solution because the derivative of the
supposed solution would be imaginary. The third equation has as solutions
$\sin x$, $\cos x$, as well as any linear combination of these. The first equation
must be discarded as being self-contradictory; the second admits solutions
if we permit ourselves to consider complex-valued functions. As we shall
see this turns out to be a very fruitful course, for it permits understanding
the third as well.
The importance of calculus derives from the fact that it is necessary to the
solution of concrete problems (mainly derived from the study of physics and
the natural sciences). These problems usually are stated mathematically
as differential equations.
Examples
18. Compound interest. A bank likes to pay its depositors on the
basis of the amount deposited and the length of time they have been
able to use these deposits. Thus every (say) June 30 your bank
would add to your deposit an amount equal to (say) 5 % of that part
of your deposit which they have held for the past year (and if they are
decent about it a reasonable fraction of that 5 % for parts left in for
fractions of that year). Many years ago, that great financial wizard,
L. Waverly Oakes, pointed out that that amount that he kept in his
bank for the first half-year was working for the bank and he should
be paid for it. Furthermore, argued Mr. Oakes, the payment he
should have received was also sunk back into the bank's investments
so also was earning income for the bank, and thus for its depositors.
Finally, Mr. Oakes pointed out that there is nothing special in half a
year, or any particular fraction thereof. His very words were " Over
any period of time, no matter how small, the earning of a particular
balance relative to that balance should be directly proportional to
that period of time. In order to best approach the interest due its
depositors, our banks should be computing interest as often as
possible." The banks all responded to this profound utterance by
recomputing their interest every month instead of every year.
Somebody even suggested that, with an army of secretaries, they could so
compute the accrued interest every 30 seconds. And there the matter
would have rested were it not for an obscure student of Isaac Newton
who dabbled in the stock market.
Suppose at time $t_0$ a sum of $s_0$ pounds is deposited in the bank.
Let $f(t)$ for all times $t \ge t_0$ be the balance accruing from this deposit
according to the Oakes system. Then, Oakes' assertion is, for all
times $t_2 > t_1 \ge t_0$,
$$\frac{f(t_2) - f(t_1)}{f(t_1)} = k(t_2 - t_1) \tag{3.21}$$
where $k$ is the earning power (interest rate) of money. The first
thing this brilliant person remarked is that (3.21) cannot possibly
always hold. Let us illustrate his discussion.
Suppose that 500 pounds are deposited in the bank at a 5% per
annum interest rate. Then $f(0) = 500$ and at the end of one year,
the interest is 25 pounds, so $f(1) = 525$. Now, if interest is computed
every half-year, we obtain, by (3.21),
$$\frac{f(1/2) - 500}{500} = 0.05\left(\tfrac{1}{2}\right)$$
or $f(1/2) = 512.50$. Then, over the second half of the year, we obtain
$$\frac{f(1) - 512.50}{512.50} = 0.05\left(\tfrac{1}{2}\right)$$
so that by this computation $f(1) = 525.31$. As this is closer to the
actual earnings of the initial deposit, this is more like the amount the
depositor should get. Furthermore, this semiannual computation
has neglected the earnings during the last three-quarters of the 6.25
accrued during the first quarter. In fact, when we compute the
interest quarterly we find that the value of $f(1)$ should be no less than
525.47. And so it goes: no matter what period we choose for the
computation of interest, we will be neglecting the interest accrued
by the growing total during that period. Thus Oakes' formulation
cannot be correct. However, our student was moved by the basic
justice of Oakes' ideas and after rewriting Oakes' formula as
$$\frac{f(t_2) - f(t_1)}{t_2 - t_1} = k f(t_1)$$
he asserted that he had found the precise statement of the Oakes
formula. Oakes should have said "over any infinitesimally small
interval of time ..." rather than "over any period of time, no matter
how small ..." Precisely, then: the rate of change of the balance
at any time is proportional to the balance at that time; that is, $f' = kf$,
where $k$ is the interest rate (0.05 above). Thus, the problem is to
find a solution $f$ for the differential equation $y' - ky = 0$ so that
$f(t_0) = s_0$.
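The arithmetic of Example 18 can be reproduced with a short loop. The sketch below is illustrative only: `balance` is a made-up helper that applies (3.21) once per period over one year, and the final line uses the standard fact that $f' = kf$, $f(0) = s_0$ has the solution $s_0 e^{kt}$, which is the value the repeated recomputation approaches.

```python
from math import exp

def balance(principal, rate, periods):
    # Apply (3.21) once per period of length 1/periods, for one year
    b = principal
    for _ in range(periods):
        b += b * rate / periods
    return b

for n in (1, 2, 4, 12, 365):
    print(n, round(balance(500, 0.05, n), 4))

# Balance after one year if f' = 0.05 f holds exactly
print("continuous", round(500 * exp(0.05), 4))
```

Each refinement of the period raises the year-end balance a little, and every one of them stays below the continuous value.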
19. Population explosion. Population tends to grow also according
to the above differential equation. That is, it is assumed that every
individual has the same propensity to reproduce and that propensity
is independent of time. Thus over any infinitesimal period of time
the ratio of the increment in population to the initial population is
proportional to the time elapsed. (You know what mobs are like:
the larger they are the faster they seem to grow.) This assertion is
supposed to be true for brief periods of time; thus we should more
precisely assert that the rate of change is directly proportional to the
total population; thus if $f$ is the total population, $f' = kf$ where the
constant $k$ is called the growth rate.
In some societies the growth rate varies with time; among certain
mammals it peaks at certain times of the year. In these cases the
population as a function $f$ of time satisfies a differential equation:
$f'(t) = k(t)f(t)$, where $k(t)$ is the variable growth rate. It may even
happen that the growth rate depends on the total population; in a
well-regulated society (1984) this would be the case. Then the
population function is a solution to a more complicated equation,
$y' = k(y)y$.
20. Survival of the fetahs. On a remote volcanic atoll in the South
Pacific there live only two species of animals, the fetahs and the
garibs. These animals are essentially vegetarian and there is an
ever-present undergrowth to feed them. However fetahs especially
love to eat garibs and garibs find the succulent fetahs hard to resist.
Now each fetah tends to reproduce at the rate of one young per
year, and consume garibs at the rate of 7 per year. Conversely
the garibs have only one young per year and eat fetahs at the rate of
17 per year. Thus the increments $\Delta f$, $\Delta g$ of fetahs and garibs in a
year should be given by
$$\Delta f = f_0 - 17 g_0 \qquad \Delta g = g_0 - 7 f_0 \tag{3.22}$$
where $f_0$, $g_0$ are the initial populations of these groups. However,
the Oakes reasoning must be applied to this case, because as the
population changes, it will continually affect the increment. The
solution is, as in the above case, to rewrite (3.22) as a differential
equation. If $f(t)$, $g(t)$ are the populations of fetahs and garibs at
time $t$, then these equations describe the growth of $f$ and $g$:
$$f' = f - 17g \qquad g' = g - 7f$$
21. The biotic matrix. On a less remote island there are $n$ different
species of animals, all of which have some effect on the growth patterns
of all the others (some feed on others; some house, or protect others).
This kind of society can be represented by a biotic $n \times n$ matrix
$A = (a_j^i)$. The $(i, j)$th entry is described as follows: The increment
in the $i$th species in one year which is attributable to each member
of the $j$th species is $a_j^i$. (Thus the effect of one member of the $j$th
species on the $i$th species in an interval of time $\Delta t$ years is $a_j^i\,\Delta t$.) If
$\mathbf{f}(t) = (f^1(t), \ldots, f^n(t))$ is the population function on this island,
then this differential equation must be satisfied:
$$\mathbf{f}' = A\mathbf{f} \tag{3.23}$$
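The passage from yearly increments to (3.23) can be made concrete by taking one small time step at a time. The sketch below is only an illustration under stated assumptions: `euler_step` is a hypothetical helper, the matrix is the one read off from the fetah and garib equations of Example 20, and the starting populations are invented.

```python
def euler_step(A, f, dt):
    # One small increment: f(t + dt) is approximately f(t) + dt * A f(t)
    return [fi + dt * sum(a * fj for a, fj in zip(row, f))
            for fi, row in zip(f, A)]

# Biotic matrix for f' = f - 17 g, g' = g - 7 f
A = [[1.0, -17.0],
     [-7.0, 1.0]]

populations = [100.0, 50.0]   # hypothetical fetah and garib counts
after = euler_step(A, populations, 0.001)
print(after)
```

With `dt = 1` the same function reproduces the crude yearly increments of (3.22); shrinking `dt` is the numerical counterpart of the Oakes reasoning.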
22. Particle motion. We consider now the motion of a particle
in $R^n$. Let $\mathbf{f}(t)$ be the location of that particle at time $t$. $\mathbf{f}$ is thus an
$R^n$-valued function of a real variable. The rate of change of position
at a time $t_0$ is the limit as $t \to t_0$ of
$$\frac{1}{t - t_0}\,(\mathbf{f}(t) - \mathbf{f}(t_0))$$
and thus is $\mathbf{f}'(t_0)$, called the velocity of the particle at $t_0$. The rate of
change of velocity, $\mathbf{f}''$, is the acceleration of the particle.
The velocity vector has both magnitude and direction; we can write
$\mathbf{f}'(t) = v(t)\mathbf{T}(t)$ (at least when $\mathbf{f}' \ne 0$), where $v(t)$ is a positive function of $t$,
and $\mathbf{T}(t)$ is a unit vector. $\mathbf{T}(t)$ points out the direction in which the particle
is traveling at time $t$ and $v(t)$ is the speed at which it is moving. Also, $v(t)$
can be given the following description. The length of the path that the
particle traces out in a certain period of time is the distance traveled by the
particle; $v(t)$ is the rate of change of that distance at time $t$. We will
have to await a full discussion of arc length (Section 4.2) before justifying
this; however some heuristic arguments are possible (see Problem 22).
According to this description, the distance $s(t)$ traveled by the particle from
time $t_0$ to $t$ is a solution of the differential equation $y' = \|\mathbf{f}'(t)\|$ with $y(t_0) = 0$.
We would hope that there is only one solution, for there is no further way to
determine this function. (Fortunately, by the fundamental theorem of
calculus this problem has a unique solution.)
Consider, for example, a particle moving on the unit circle in the plane.
Let $\mathbf{f}$ be the position function of this particle. Let $s(t)$ be the arc length on
the circle from the point $(1, 0)$ to $\mathbf{f}(t)$ at time $t$. Then (since arc length on
the unit circle is the same as the angle)
$$\mathbf{f}(t) = (\cos s(t), \sin s(t))$$
The velocity vector is $\mathbf{f}'(t) = s'(t)(-\sin s(t), \cos s(t))$. Notice that
$|s'(t)| = \|\mathbf{f}'(t)\|$, giving further weight to our description of speed above.
Notice also that $\mathbf{f}'(t)$ is tangent to the circle at the point $\mathbf{f}(t)$; this reflects
the fact that the motion is constrained to the circle. Differentiating further,
we find that the acceleration is
$$\mathbf{f}''(t) = s''(t)(-\sin s(t), \cos s(t)) + s'(t)^2(-\cos s(t), -\sin s(t)) = s''(t)\mathbf{T}(t) - [s'(t)]^2\,\mathbf{f}(t)$$
Thus the acceleration has a component tangent to the circle (in the
direction of the motion) whose magnitude is the rate of change of speed, and a
component perpendicular to the direction of motion, whose magnitude is
equal to the speed squared. For example, if the particle is rotating around
the circle with constant speed, it is accelerating toward the center of the
circle.
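The constant-speed case can be checked numerically. The following sketch assumes $s(t) = \omega t$ with an arbitrarily chosen $\omega$ (the constant `OMEGA` is hypothetical) and compares a central-difference estimate of the acceleration with the predicted value $-[s'(t)]^2\,\mathbf{f}(t)$.

```python
from math import cos, sin

OMEGA = 2.0  # constant speed s'(t); a hypothetical value for illustration

def position(t):
    s = OMEGA * t                      # arc length s(t) = OMEGA * t
    return (cos(s), sin(s))

def second_difference(t, h=1e-4):
    # Central-difference estimate of f''(t), taken componentwise
    p0, p1, p2 = position(t - h), position(t), position(t + h)
    return tuple((a - 2 * b + c) / h**2 for a, b, c in zip(p0, p1, p2))

def predicted_acceleration(t):
    # s''(t) T(t) - s'(t)^2 f(t), with s'' = 0 for constant speed
    x, y = position(t)
    return (-OMEGA**2 * x, -OMEGA**2 * y)

print(second_difference(1.0), predicted_acceleration(1.0))
```

The two printed vectors should agree to several decimal places and both point from the particle toward the origin.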
According to Newton's laws of motion the situation is as follows. Given
a particle at time $t_0$ situated at $\mathbf{p}_0$ and having velocity $\mathbf{v}_0$, all further motion
is determined uniquely by the forces acting on the object. The motion is
determined by this law: the acceleration is directly proportional to the force
acting on the particle. Thus, in the absence of any forces, if $\mathbf{f}(t)$ is the
position of the particle at time $t$, we have
$$\mathbf{f}(t_0) = \mathbf{p}_0 \qquad \mathbf{f}'(t_0) = \mathbf{v}_0 \qquad \mathbf{f}''(t) = 0 \text{ for all } t$$
and $\mathbf{f}$ is uniquely determined by these conditions. We say that $\mathbf{f}$ is a
solution of the differential equation $y'' = 0$ with the initial conditions $y(t_0) = \mathbf{p}_0$,
$y'(t_0) = \mathbf{v}_0$. Newton's laws require that the solution exists and is unique.
Mathematics bears this out; the solution is $\mathbf{f}(t) = \mathbf{p}_0 + (t - t_0)\mathbf{v}_0$. Thus, in the
absence of force, a particle will move with constant velocity, that is, in a
straight line at a constant speed.
Now, in general, the mechanics of motion can be described as follows.
There is a function $\mathbf{F}$ defined on $R^n \times R$ taking values in $R^n$. The value
$\mathbf{F}(\mathbf{x}, t)$ represents the force that will act on a unit mass situated at the point $\mathbf{x}$ at
time $t$. The function $\mathbf{F}$ is called a force field. A particle of mass $m$ situated
at the point $\mathbf{x}$ will experience the force $m\mathbf{F}(\mathbf{x}, t)$ at time $t$. According to
Newton's law it will accelerate in the direction of $\mathbf{F}$. The magnitude of this
acceleration is determined by this statement of Newton's
law: Force = mass · acceleration,
$$m\mathbf{F} = m\mathbf{a}$$
($\mathbf{a}$ = acceleration).
Suppose we place a particle of mass $m$ at $\mathbf{p}_0$ with velocity $\mathbf{v}_0$ into this
situation at time $t_0$. Let $\mathbf{f}$ be the function describing its subsequent motion
according to Newton's law. Then at time $t$ it is at $\mathbf{f}(t)$ and it experiences a
force $m\mathbf{F}(\mathbf{f}(t), t)$. Thus, dividing by $m$, we have
$$\mathbf{f}''(t) = \mathbf{F}(\mathbf{f}(t), t)$$
Thus $\mathbf{f}$ is the solution of the differential equation $y'' = \mathbf{F}(y, t)$ with the initial
conditions $\mathbf{f}(t_0) = \mathbf{p}_0$, $\mathbf{f}'(t_0) = \mathbf{v}_0$. Newton's laws require that the solution
exist uniquely. In the next section we shall show that for smoothly varying
force fields this is the case.
• PROBLEMS
11. Find all complex-valued solutions of the differential equation
$(y')^2 + 1 = 0$.
12. Solve the differential equation $y' = y$ with the initial condition
$y(0) = 0$.
13. (a) How long will it take 100 dollars to double at a compound
interest rate of 5% per year?
(b) How long will it take 350 dollars to double at the same rate?
(c) How long will it take 100 dollars to double at a rate of 10%
per year?
14. It is observed that radioactive elements decay into heavy metals. It
is assumed that the probability of any given atom decaying is independent
of the particular atom. Let $k$ be the probability that a given atom of a
particular element will decay within one year. Show that the function $f$
is governed by the differential equation $y' = -ky$ if $f(t)$ is the mass of the
given element after time $t$.
15. The time it takes for a radioactive element to halve in mass is called
the half-life. If an element has a half-life of 14 million years, find the
constant $k$ of Problem 14.
16. Why is Oakes' formulation of the interest problem wrong? Can you
solve equation (3.21) so that it holds for a specified period; that is, given
$n$, find $f$ so that (3.21) holds for $t_1 = k/n$, $t_2 = (k + 1)/n$, $0 \le k < n$?
17. A weight of mass $m$ is suspended from a rigid support by a spring of
natural length $L$. According to Hooke's law the spring produces a
"restoring force" which is proportional to the displacement from its natural
length, and directed toward its natural position (Figure 3.6). Let us denote
this constant of proportionality by $k$. Let $x$ denote the distance of the mass
from the natural position, where the positive direction is upward. Then the
mass has two forces acting on it: a force $F_1 = -kx$ due to the restoring effort
of the spring, and the force of gravity $F_2 = -mg$. If the mass is at rest, then
there is no acceleration, so by Newton's laws $F_1 + F_2 = 0$, from which we
may conclude that the rest position is at $x = -k^{-1}mg$. Now suppose we
displace the mass by an amount $h_0$ and let it go. Using Newton's law find
the differential equation governing the subsequent motion.
Figure 3.6
18. A certain insect lays its eggs in the flesh of a mammal. Each insect
hatches $h$ eggs per year. Now every time one of these eggs hatches in a
horse, it kills the horse. Assuming the total mammalian population is a
constant $T$, we can derive the differential equations governing the growth of
this insect and horse population if we also know the natural death rate ($d_I$)
of the insect and the natural birth and death rates ($b_H$, $d_H$) of the horse.
Let $I(t)$, $H(t)$ be the population of the insect, horse, respectively. During
a period of time $\Delta t$, $b_H \cdot H \cdot \Delta t$ horses are born, and $d_H \cdot H \cdot \Delta t$ horses die of
natural causes. Now each insect hatches $h\,\Delta t$ eggs during this interval;
the probability that its host is a horse is $H/T$. Thus there are $hI(H/T)\,\Delta t$
horse deaths attributed to the insect during this time interval. The change
$\Delta H$ in the horse population is thus
$$\Delta H = b_H H\,\Delta t - d_H H\,\Delta t - hI\,\frac{H}{T}\,\Delta t$$
Find the corresponding change in the insect population and deduce that these
differential equations govern the growth:
$$H' = (b_H - d_H)H - \frac{hH}{T}\,I$$
$$I' = hI - d_I I$$
19. It was observed by Galileo that the gravitational attraction of the
earth is constant. In the small, we may assume the world is flat, thus we
take as a model $R^3$, and assume that the plane $z = 0$ is the surface of the
earth. The gravitational attraction then is a force field $\mathbf{F}(x, y, z) =
(0, 0, -g)$. Suppose a particle of mass $m$ is at $\mathbf{p}_0$ and has a velocity $\mathbf{v}_0$ at
time $t_0$. Let $\mathbf{f}(t)$ be the position of this particle at time $t$. What is the
differential equation governing the motion of the particle? Can you
solve for $\mathbf{f}$?
20. Suppose there is a wind coming out of the east which exerts a force
$(c, 0, 0)$ on our particle, no matter what the position is. Now find the
equation of motion.
21. Suppose that on the plane there is a centripetal force field
proportional to the distance from the origin. At the time $t = 0$, a particle is
placed at the point $\mathbf{z}_0$ and has a velocity $\mathbf{v}_0$. What is the equation of
motion?
22. We can try to find a formula for the length of a curve by
approximating it by a line segment. Let
$$x = x(t) \qquad y = y(t)$$
be the equations of a curve, and let $(x(t_0), y(t_0))$ be a point on the curve. For
a very short period of time $\Delta t$, the curve can be replaced by its tangent line
(see Figure 3.7). The length of the curve between $(x(t_0), y(t_0))$ and
$(x(t_0 + \Delta t), y(t_0 + \Delta t))$ is then approximately equal to $((\Delta x)^2 + (\Delta y)^2)^{1/2}$.
Figure 3.7
Then the rate of change of arc length over the interval $\Delta t$ is
$$\frac{((\Delta x)^2 + (\Delta y)^2)^{1/2}}{\Delta t}$$
Letting $\Delta t \to 0$ deduce that the rate of change of arc length along the curve
is the length of the vector $(x'(t), y'(t))$.
3.4 Some Techniques for Solving Equations
The fundamental theorem of calculus is of course the basic existence
theorem on solutions for differential equations, and integration is the primary
tool. Thus an equation of the form
$$y' = h(x)$$
has the solution $f(x) = \int^x h(t)\,dt + c$, and this solution is unique but for a
constant. Let us state the same result for vector-valued functions.
Definition 3. Let $h = (h_1, \ldots, h_n)$ be a continuous $R^n$-valued function on
the closed interval $[a, b]$. Define the integral of $h$ over the interval $[a, b]$ to be
$$\int_a^b h = \left(\int_a^b h_1, \ldots, \int_a^b h_n\right)$$
Theorem 3.2. Let $h$ be a continuous $R^n$-valued function defined on the open
interval $(a, b)$ and let $a < c < b$. Then the differential equation
$$y' = h(x) \qquad y(c) = p_0 \tag{3.24}$$
has the unique solution $\int_c^x h + p_0$.
Proof. By the fundamental theorem of calculus and Proposition 1,
$$f(x) = \int_c^x h + p_0$$
is differentiable and satisfies the conditions (3.24). If $g$ is another solution, then
$g' = f'$ on $(a, b)$ so each coordinate of $g - f$ has zero derivative and thus is constant.
Since $g(c) = p_0 = f(c)$, this constant is zero, so $g = f$.
Separation of Variables
There is a class of differential equations which can be solved simply by
integration, just by recalling the chain rule. This is the class of first-order
equations (only the first derivative of the unknown function $y$ appears) in
which the variables separate; that is, these are equations of the form
$$h(y)\,y' = g(x) \tag{3.25}$$
The left-hand side appears to be the result of application of the chain rule;
we can rewrite (3.25) as
$$\frac{d}{dx}\left[\int^{y(x)} h\right] = g(x)$$
Thus, if we let $H$ be an indefinite integral of $h$, $H = \int h$, then (3.25) becomes
$$[H(y(x))]' = g(x)$$
so we can integrate:
$$H(y(x)) = \int g \tag{3.26}$$
If we can solve (3.26) for $y(x)$, we will have the desired explicit expression of
$y$ as a function of $x$.
Examples
23. $yy' = 1$. Let $H(y) = \int y = y^2/2$. Then the equation can be
rewritten as $[H(y(x))]' = 1$, or $H(y) = y^2/2 = x + c$, where $c$ is a
constant to be determined by the initial conditions. Thus the general
solution of $yy' = 1$ is $y = \pm(2(x + c))^{1/2}$.
24. $y' = x^2 y^2$. Again, we write
$$y^{-2}y' = x^2$$
Integrate:
$$-y^{-1} = \frac{x^3}{3} + c$$
so
$$y = \frac{-3}{x^3 + 3c}$$
25. $y' \cos y = \sin x$. After integrating this becomes
$$\sin y = -\cos x + c$$
or $y = \arcsin(c - \cos x)$. A particular solution is $f(x) = x - \pi/2$.
26.
$$y' = \frac{1 + x}{1 + y} \tag{3.27}$$
After integration we have
$$y + \frac{y^2}{2} = x + \frac{x^2}{2} + c$$
It is now a bit difficult to write the solution explicitly as a function of
$x$, but it is possible using the formula for roots of a quadratic
polynomial:
$$y = \frac{-2 \pm (4 + 8x + 4x^2 + c)^{1/2}}{2} \tag{3.28}$$
(here the constant $8c$ has been renamed $c$).
The constant $c$ is presumably determined by the initial conditions, and
with it the function $y$. Notice however, that each value of $c$ gives
two candidates for the solution, but they may not both be solutions.
For example, suppose we seek the solution of (3.27) with the initial
condition $y(0) = 0$. We arrive at (3.28) and upon substituting
$x = 0$, $y = 0$, we obtain
$$0 = \frac{-2 \pm (4 + c)^{1/2}}{2}$$
so we must choose $c = 0$ and the positive sign before the radical.
This boils down to $y = x$. If the initial condition is $y(0) = -2$, again
$c = 0$, but we must take the negative root, obtaining $y = -(x + 2)$.
Notice also that upon substituting the initial condition $y(-1) = -1$
into (3.28), we find $c = 0$ and both roots give solutions to this problem;
that is, both functions $y = x$ and $y = -(x + 2)$ are solutions with this
initial value. Thus it is not always true that the initial conditions
uniquely determine the solution of the differential equations. Looking
back at the original equation (3.27) we find what might be a clue to
this bizarre behavior: the function $(1 + x)(1 + y)^{-1}$ is ill-behaved
at $y = -1$.
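The failure of uniqueness in Example 26 can be confirmed by direct substitution. In this sketch (the names `F` and `residual` are illustrative), each candidate is given together with its derivative, and the residual $y'(x) - (1+x)/(1+y(x))$ is evaluated at a few points away from $x = -1$, where the right-hand side is undefined.

```python
def F(x, y):
    # Right-hand side of y' = (1 + x) / (1 + y)
    return (1 + x) / (1 + y)

def residual(y, dy, x):
    # How far the candidate y (with derivative dy) is from solving the equation at x
    return dy(x) - F(x, y(x))

line_up = (lambda x: x, lambda x: 1.0)             # y = x
line_down = (lambda x: -(x + 2), lambda x: -1.0)   # y = -(x + 2)

for x in (-0.5, 0.0, 1.0, 3.0):
    print(x, residual(*line_up, x), residual(*line_down, x))

# Both lines pass through the point (-1, -1)
print(line_up[0](-1), line_down[0](-1))
```

Every residual should be zero, and both candidates take the value $-1$ at $x = -1$.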
Uses of Exponential
We shall now turn to the study of the exponential function; because it is
the solution of such a simple differential equation it gives rise to several
techniques. Recall from Chapter 2 (Definition 21) that the differential
equation
$$y' = cy \qquad y(0) = 1 \qquad c \text{ any complex number} \tag{3.29}$$
has a unique solution, denoted $e^{cx}$. Notice that
$$(e^{cx})' = ce^{cx}, \quad (e^{cx})'' = c^2 e^{cx}, \quad \ldots, \quad (e^{cx})^{(k)} = c^k e^{cx} \tag{3.30}$$
These remarks suggest a method of attack on another class of equations.
A homogeneous constant coefficient equation is one of the form
$$y^{(k)} + a_{k-1}y^{(k-1)} + \cdots + a_1 y' + a_0 y = 0 \tag{3.31}$$
We shall consider this class in greater detail in Section 3.6. Let us compute
the left-hand side of (3.31) under the substitution $y = e^{cx}$. By (3.30),
$$c^k e^{cx} + a_{k-1}c^{k-1}e^{cx} + \cdots + a_1 c e^{cx} + a_0 e^{cx} = (c^k + a_{k-1}c^{k-1} + \cdots + a_1 c + a_0)e^{cx} \tag{3.32}$$
We find that $e^{cx}$ is a solution of (3.31) if $c$ is a root of the polynomial appearing
in (3.32).
Examples
27. Find solutions for $y'' - y = 0$.
Substituting $y = e^{cx}$, we obtain $(c^2 - 1)e^{cx} = 0$, thus we must have
$c = \pm 1$. We conclude that $e^x$, $e^{-x}$ are solutions. Notice also that
for any $a$, $b$, $ae^x + be^{-x}$ is also a solution.
28. Find solutions of $y''' + y = 0$.
Here substitution of $y = e^{cx}$ yields $(c^3 + 1)e^{cx} = 0$, so $c$ must be a
cube root of $-1$. Thus we obtain three solutions:
$$e^{-x} \qquad \exp(e^{i\pi/3}x) \qquad \exp(e^{-i\pi/3}x)$$
Of course, all functions of the form $ae^{-x} + b\exp(e^{i\pi/3}x) + c\exp(e^{-i\pi/3}x)$ are
solutions.
29. Solve the initial value problem
$$y''' + y' = 0 \qquad y(0) = 0 \qquad y'(0) = 1 \qquad y''(0) = 1 \tag{3.33}$$
Substituting $y = e^{cx}$, we obtain $(c^3 + c)e^{cx} = 0$, so we must have
$c = 0$ or $c = i$ or $c = -i$. Thus all functions of the form
$$ae^{0x} + be^{ix} + ce^{-ix}$$
are solutions. Let us see if we can solve for $a$, $b$, $c$ by substituting the
initial conditions:
$$y(0) = 0: \quad a + b + c = 0$$
$$y'(0) = 1: \quad ib - ic = 1$$
$$y''(0) = 1: \quad -b - c = 1$$
We can solve this system, obtaining
$$a = 1 \qquad b = -\frac{1 + i}{2} \qquad c = -\frac{1 - i}{2}$$
Thus the function
$$f(x) = 1 - \frac{1 + i}{2}\,e^{ix} - \frac{1 - i}{2}\,e^{-ix}$$
will solve our problem.
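The coefficients found in Example 29 can be verified with complex arithmetic. The sketch below uses the values $a = 1$, $b = -(1+i)/2$, $c = -(1-i)/2$ that solve the three linear equations above; the helper names are illustrative. It also compares the result with $1 - \cos x + \sin x$, which is the same function written in real form.

```python
import cmath
from math import cos, sin

a, b, c = 1, -(1 + 1j) / 2, -(1 - 1j) / 2

def f(x):
    # a e^{0x} + b e^{ix} + c e^{-ix}
    return a + b * cmath.exp(1j * x) + c * cmath.exp(-1j * x)

def derivative(k, x):
    # k-th derivative (k >= 1); the constant term a drops out
    return b * (1j)**k * cmath.exp(1j * x) + c * (-1j)**k * cmath.exp(-1j * x)

print(f(0), derivative(1, 0), derivative(2, 0))   # initial data: 0, 1, 1
print(derivative(3, 0.7) + derivative(1, 0.7))    # y''' + y' at a sample point
print(f(0.7), 1 - cos(0.7) + sin(0.7))
```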
30. Solve the initial value problem
$$y'' + y = 0 \qquad y(0) = 1 \qquad y'(0) = 0 \tag{3.34}$$
Here we have, as general solution, $ae^{ix} + be^{-ix}$. Substituting the
initial conditions, we obtain
$$a + b = 1 \qquad ia - ib = 0$$
and thus $a = b = 1/2$. Thus we obtain as solution
$$f(x) = \tfrac{1}{2}(e^{ix} + e^{-ix})$$
Notice that we already know from calculus the solution $f(x) = \cos x$. We
shall learn in the next section that the initial value problem (3.34) has a
unique solution. Thus this interesting equation follows:
$$\cos x = \tfrac{1}{2}(e^{ix} + e^{-ix}) \tag{3.35}$$
We shall leave to the exercises the verification of these other relationships
between the trigonometric and exponential functions:
$$\sin x = \frac{1}{2i}(e^{ix} - e^{-ix}) \tag{3.36}$$
$$e^{ix} = \operatorname{cis} x = \cos x + i \sin x \tag{3.37}$$
First-Order Linear Equations
Now if $f$ is a differentiable function, so is $e^f$, and $(e^f)' = f'e^f$. Letting
$y = e^f$, we obtain the differential equation $y' = f'y$. Thus, working
backwards we see how to solve an equation of the form
$$y' = g(x)y$$
Namely, $\exp(\int g)$ is a solution. With a little more ingenuity we can see how
to explicitly solve any linear first-order equation. These are differential
equations of the form
$$y' + f(x)y = g(x) \tag{3.38}$$
where $f$, $g$ are continuous in an interval about $a$. Let $H(x) = \int_a^x f$, and
consider the new function $z = e^H y$. Then $z' = e^H y' + H'e^H y = e^H(y' + fy)$,
since $H' = f$. Since by (3.38), $y' + fy = g$, we have this equation in $z$:
$$z' = e^H g$$
which is solvable by integration:
$$z = \int_a^x e^H g + c$$
Finally, $y = e^{-H}z$, thus the general solution of (3.38) is found:
$$y = e^{-H}z = e^{-H}\int_a^x e^H g + ce^{-H} \tag{3.39}$$
where $H$ is the indefinite integral of $f$, and $c$ is to be found by substituting
the initial condition.
Examples
31. $y' + xy = x$, $y(0) = 0$.
Here we take $H = \int x = x^2/2$ and consider $z = y\exp(x^2/2)$. Thus
the corresponding equation in $z$ is
$$z' = y'\exp(x^2/2) + yx\exp(x^2/2) = \exp(x^2/2)(y' + xy) = \exp(x^2/2)\,x$$
Thus
$$z = \int \exp(x^2/2)\,x\,dx + c = \exp(x^2/2) + c$$
so
$$y = z\exp(-x^2/2) = 1 + c\exp(-x^2/2)$$
Substituting the initial condition, $0 = 1 + c\exp(-0^2/2) = 1 + c$,
so $c = -1$. The solution thus is $y = 1 - \exp(-x^2/2)$.
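32. $y' - 2x^{-1}y = x$, $y(1) = 0$.
Here we take $H = -\int 2/x = -2\ln x$ and consider $z = ye^{-2\ln x} =
x^{-2}y$. Then $z' = -2x^{-3}y + x^{-2}y' = x^{-2}(y' - 2x^{-1}y) = x^{-2}x = x^{-1}$.
We obtain
$$z = \ln x + c \quad \text{and} \quad y = x^2\ln x + cx^2$$
The initial condition $y(1) = 0$ gives $c = 0$, so $y = x^2\ln x$.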
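Formula (3.39) can also be evaluated numerically when the integrals are awkward to do by hand. The following is a rough sketch under stated assumptions: `integrate` is a plain composite Simpson rule and `solve_linear` is a hypothetical helper, neither taken from the text. With $H(x) = \int_a^x f$ we have $H(a) = 0$, so the constant $c$ in (3.39) is simply the initial value $y(a)$.

```python
from math import exp, log

def integrate(func, a, x, steps=200):
    # Composite Simpson's rule on [a, x]; steps must be even
    h = (x - a) / steps
    total = func(a) + func(x)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def solve_linear(f, g, a, y_a, x):
    # y' + f y = g, y(a) = y_a, evaluated at x via (3.39) with H(t) = int_a^t f
    H = lambda t: integrate(f, a, t)
    z = integrate(lambda t: exp(H(t)) * g(t), a, x) + y_a
    return exp(-H(x)) * z

# Example 31: y' + x y = x, y(0) = 0; closed form 1 - exp(-x^2/2)
print(solve_linear(lambda t: t, lambda t: t, 0.0, 0.0, 1.0), 1 - exp(-0.5))

# Example 32: y' - (2/x) y = x, y(1) = 0; closed form x^2 ln x
print(solve_linear(lambda t: -2 / t, lambda t: t, 1.0, 0.0, 2.0), 4 * log(2))
```

The nested quadrature is wasteful but keeps the sketch close to the formula; each printed pair should agree to many decimal places.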
• EXERCISES
19. Solve these differential equations:
(a) $x^2\exp(x^2)\,y' = x^3$, $y(0) = 1$
(b) $y' = x\sin x + \cos x$, $y(0) = 0$
(c) $(x'(t), y'(t), z'(t)) = (0, t, t^3)$, $(x(0), y(0), z(0)) = (0, 1, 0)$
(d) $z'(t) = e^{it} + ((1 + i)t)^2$, $z(0) = 1$
20. Solve these differential equations:
(a) $y' = y^2$
(b) $y'\cos x = \cos y$
(c) $x^2 + y^2 y' = 0$
(d) $y' = (y^2 - 1)(x^2 - 1)$
(e) $y'' = xy'$
(f) $y' = (1 + x^2)y$
(g) $xy^2 + (1 - x)y' = 0$
(h) $y' = e^{x+y}$
(i) $y' = \sin(x + y) + \sin(x - y)$
21. Solve these differential equations:
(a) $y' + xy = \cos x$, $y(0) = 0$
(b) $y'\cos x + y\sin x = \tan x$, $y(0) = 1$
(c) $y' + xy = x^2$, $y(0) = 0$
(d) $e^x y' + e^x y = e^{-x}$, $y(0) = 1$
(e) $y' = ye^{-x}$, $y(1) = 1$
22. Solve these differential equations:
(a) $y''' - 2y'' + y' - 2y = 0$, $y(0) = 0$, $y'(0) = 0$, $y''(0) = 1$
(b) $y'' - 2y' - y = 0$, $y(0) = 1$, $y'(0) = 0$
(c) $y''' - (1 + 3i)y'' + (3i - 2)y' + 2y = 0$, $y(0) = 0$, $y'(0) = 1$,
$y''(0) = 0$
3.5 Existence Theorems
In this section we shall state and prove the basic existence theorem for
ordinary differential equations. The method is due to Picard and is that of
successive approximations. (Recall how we found, in Section 2.10, the
solution to the equation y' = cy.)
The first theorem is about first-order equations. We shall first illustrate
the method of successive approximations.
Example
33. Successive approximations. There is one and only one solution
of
/ = ex + у у(0) = 0
3.5 Existence Theorems 267
Now if/(*) solves this equation, then by integration we see that
fix) = f/'(0 dt = fry + /(i)] di
Thus if Γ is the transformation defined on continuous functions by
Tg(x) = TV + 0(i)] dt
we see that Tf = f; that is, / is a fixed point of Τ According to
Newton's method we should be able to find Τ as the limit of the
sequence /0, Tf0, T{Tf0), ..., T"f0, ... . Let us compute this
sequence. We may choose any function for/0, say/0 = 0 Then
Tf0 = f e» dt = ex - 1
T2/0 = Г(7У0) = f (2e» - 1) dt = 2e* - 2 - χ
T3f0 = T(T2f0) = £(3e» -2-t)dt = 3f-3-2x-j
. *2 *3
T4/o = 4e* - 4 - 3* - 2 - - -
x2 x3 *4
T5/0 = 5^-5-4x-3y-2---
Г"/0 = nex - η - (и - l)x - (и - 2) y
■(«-j)-t-···-
j! (л-1)!
We can't tell yet that this sequence of functions converges, but if we
replace ex by its Taylor expansion we can get a better picture:
00 x' n'1 , x'
r"/o=I«-T-Z(«-j)|(
II . — η /I
n-l vJ °° XJ "~1 X1 £ XJ
= Σο:-(--λ:-; + Σ^ = Σο(73ϊ)Τ + Σ»;- (340)
268 3 Ordinary Differential Equations
As $n \to \infty$ the last sum in (3.40) tends to zero, and we obtain
$$\lim_{n\to\infty} T^n f_0 = \sum_{j=1}^{\infty}\frac{x^j}{(j-1)!} = xe^x$$
Indeed $xe^x$ solves the given problem! Now we would like to show
that the solution is unique. This is easy, because it is easy to verify
that $T$ is a contraction:
$$Tf(x) - Tg(x) = \int_0^x (e^t + f(t) - e^t - g(t))\,dt = \int_0^x (f(t) - g(t))\,dt$$
so in the interval $|x| \le \tfrac12$, say,
$$\|Tf - Tg\| \le \tfrac12\|f - g\|$$
Thus if $Tf = f$ and $Tg = g$, we obtain $\tfrac12\|f - g\| \ge \|Tf - Tg\| = \|f - g\|$,
which is possible only if $f - g = 0$.
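The iteration of this example can be carried out mechanically. The following is a minimal sketch, assuming that every function is represented by its Taylor coefficients truncated at an arbitrary degree N; the names `integrate`, `T`, and `picard` are ours and are introduced only for illustration.

```python
from fractions import Fraction
from math import factorial

N = 8  # truncate every power series after degree N (an arbitrary choice)

# Taylor coefficients of e^x up to degree N
EXP = [Fraction(1, factorial(j)) for j in range(N + 1)]

def integrate(coeffs):
    """Integrate a truncated power series from 0 to x, dropping terms above degree N."""
    out = [Fraction(0)] * (N + 1)
    for j in range(N):
        out[j + 1] = coeffs[j] / (j + 1)
    return out

def T(g):
    """Tg(x) = integral from 0 to x of (e^t + g(t)) dt, acting on truncated series."""
    return integrate([e + c for e, c in zip(EXP, g)])

def picard(n):
    """Return the coefficients of T^n f_0 with f_0 = 0."""
    f = [Fraction(0)] * (N + 1)
    for _ in range(n):
        f = T(f)
    return f

# Coefficients of x e^x: the coefficient of x^j is 1/(j-1)! for j >= 1.
XEX = [Fraction(0)] + [Fraction(1, factorial(j - 1)) for j in range(1, N + 1)]
```

Comparing `picard(n)` with `XEX` for increasing n shows the low-order coefficients settling one degree at a time, which is the content of (3.40) in truncated form.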
Now, the most general differential equation of first order that we shall
consider is
$$y' = F(x, y) \tag{3.41}$$
where $F$ is a real-valued function defined in a neighborhood of the point
$(a, b)$ in the plane. A solution is a function $y = f(x)$ defined for $x$ in a
neighborhood of $a$ with these properties:
$$f(a) = b, \qquad f'(x) = F(x, f(x))$$
If $f$ is a solution, it is a fixed point of the transformation
$$Tg(x) = \int_a^x F(t, g(t))\,dt + b \tag{3.42}$$
The fixed point will be found by the method of successive approximations:
$f_0$ = anything, $f_1 = Tf_0$, $f_2 = Tf_1$, and in general $f_n = Tf_{n-1}$. In order to
guarantee that this sequence has a limit and that the fixed point is unique, we
must guarantee the hypothesis of the fixed point theorem. More precisely,
we must know enough about the function $F$ in order to guarantee that the
transformation defined by (3.42) is a contraction on the space of continuous
functions on a suitable interval about a. It suffices (as the proof below
shows) if the following condition is satisfied.
Definition 4. Let $F$ be a function of two variables $x$, $y$ in the domain
$D$ in $R^{n+m}$ ($x$ ranges in $R^n$ and $y$ in $R^m$). $F$ is Lipschitz in $y$ if there is a
constant $M$ such that
$$|F(x, y_1) - F(x, y_2)| \le M|y_1 - y_2|$$
for all $y_1$, $y_2$ such that $(x, y_1)$ and $(x, y_2)$ are in $D$.
Notice that since $(1 + x)(1 + y)^{-1}$ is not Lipschitz near $y = -1$, we
cannot apply Picard's theorem; and in fact it does not hold, as we saw in
Example 26. We have allowed $x$, $y$ to range through many variables because
of the generality we need for Picard's theorem. Notice that if $n = m = 1$,
$F$ will be Lipschitz if the partial derivative $\partial F/\partial y$ exists and is bounded. For
by the mean value theorem (along the line $x$ = constant)
$$F(x, y_1) - F(x, y_2) = \frac{\partial F}{\partial y}(x, \xi)(y_1 - y_2), \qquad y_1 < \xi < y_2$$
and thus we can take the $M$ of Definition 4 to be the bound of $\partial F/\partial y$.
Now let us turn to higher order equations. A differential equation of
order $k$ is given in the form
$$y^{(k)} = F(x, y, y', y'', \ldots, y^{(k-1)}) \tag{3.43}$$
where $F$ is a function defined in a neighborhood of $(a, b_0, \ldots, b_{k-1})$ in
$R^{k+1}$. A solution is a $k$-times differentiable function $y = f(x)$ with these
properties:
$$f(a) = b_0, \quad f'(a) = b_1, \quad \ldots, \quad f^{(k-1)}(a) = b_{k-1}$$
$$f^{(k)}(x) = F(x, f(x), f'(x), \ldots, f^{(k-1)}(x))$$
We would like to solve (3.43) with the given initial conditions by successive
approximations, but the method is not transparent. However, the problem
does reduce to the first-order case by means of a great idea. First, we
illustrate.
Example
34. $y'' = 2y' - y$, $y(0) = 0$, $y'(0) = 1$.
We introduce a new unknown function ζ and require that y' = z.
Then the given equation is reduced to the system
$$y' = z, \qquad y(0) = 0$$
$$z' = 2z - y, \qquad z(0) = 1$$
which is first order. Thus, what we seek is the vector-valued solution
of the vector differential equation
$$(y, z)' = (z, 2z - y), \qquad (y(0), z(0)) = (0, 1)$$
This we can rewrite by integration and thus solve by successive
approximations. Precisely, the solution is the fixed point of the
transformation (defined on pairs of functions):
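$$T(f, g)(x) = \int_0^x (g(t),\ 2g(t) - f(t))\,dt + (0, 1)$$
Let us compute some of the successive approximations.
$$(f_0, g_0) = (0, 1)$$
$$(f_1, g_1) = T(f_0, g_0) = (x,\ 2x + 1)$$
$$(f_2, g_2) = T(f_1, g_1) = (x^2 + x,\ \tfrac32 x^2 + 2x + 1)$$
$$(f_3, g_3) = T(f_2, g_2) = (\tfrac12 x^3 + x^2 + x,\ \tfrac23 x^3 + \tfrac32 x^2 + 2x + 1)$$
$$(f_4, g_4) = T(f_3, g_3) = (\tfrac16 x^4 + \tfrac12 x^3 + x^2 + x,\ \tfrac{5}{24} x^4 + \tfrac23 x^3 + \tfrac32 x^2 + 2x + 1)$$
It is now not hard to surmise that the general form of $(f_n, g_n)$ is
$$(f_n, g_n) = \left(\sum_{j=1}^{n}\frac{x^j}{(j-1)!},\ \sum_{j=0}^{n}\frac{(j+1)x^j}{j!}\right)$$
and that $\lim f_n = xe^x$.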
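The successive approximations for this system are polynomials, so they can be generated exactly. Here is a small sketch, assuming polynomials are stored as lists of rational coefficients in increasing degree; the helper names `integrate`, `add`, `T`, and `iterate` are ours.

```python
from fractions import Fraction

def integrate(p):
    """Integral from 0 to x of the polynomial with coefficient list p."""
    return [Fraction(0)] + [c / (j + 1) for j, c in enumerate(p)]

def add(p, q, scale=1):
    """Return p + scale*q for coefficient lists of possibly different lengths."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + scale * b for a, b in zip(p, q)]

def T(f, g):
    """T(f, g)(x) = integral from 0 to x of (g, 2g - f) dt + (0, 1)."""
    new_f = integrate(g)
    new_g = add(integrate(add([2 * c for c in g], f, -1)), [Fraction(1)])
    return new_f, new_g

def iterate(n):
    """Return (f_n, g_n) starting from (f_0, g_0) = (0, 1)."""
    f, g = [Fraction(0)], [Fraction(1)]
    for _ in range(n):
        f, g = T(f, g)
    return f, g
```

Calling `iterate(4)` should reproduce the coefficients of $(f_4, g_4)$ computed by hand in the example.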
This then is the typical means of reducing the higher order equation to
first order. Given the Equation (3.43), we introduce new unknown functions
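$y_0, y_1, \ldots, y_{k-1}$ and replace (3.43) by the first-order system
$$\begin{aligned} y_0' &= y_1 \\ y_1' &= y_2 \\ &\ \ \vdots \\ y_{k-1}' &= F(x, y_0, y_1, \ldots, y_{k-1}) \end{aligned}$$
$$y_0(a) = b_0, \quad y_1(a) = b_1, \quad \ldots, \quad y_{k-1}(a) = b_{k-1}$$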
Now the general existence theorem for $k$th-order equations falls directly
out of the theorem for systems of first-order equations. The beauty of this
trick is that Picard's theorem is no harder for systems and consists merely of
verifying that the appropriate transformation defined by an integral on
vector-valued functions is a contraction, so the fixed point theorem applies.
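The reduction to a first-order system is also how such equations are usually handed to a numerical routine. The sketch below is an illustration only: it builds the system from F and then integrates it with the classical Runge-Kutta method, which is a numerical stand-in and not the successive approximations of the text. The names `to_first_order` and `rk4` are ours.

```python
import math

def to_first_order(F):
    """Turn y^(k) = F(x, y, y', ..., y^(k-1)) into Y' = G(x, Y), Y = (y_0, ..., y_{k-1})."""
    def G(x, Y):
        return list(Y[1:]) + [F(x, *Y)]
    return G

def rk4(G, a, Y0, x_end, steps=1000):
    """Classical fourth-order Runge-Kutta integration of Y' = G(x, Y) from a to x_end."""
    h = (x_end - a) / steps
    x, Y = a, list(Y0)
    for _ in range(steps):
        k1 = G(x, Y)
        k2 = G(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k1)])
        k3 = G(x + h / 2, [y + h / 2 * k for y, k in zip(Y, k2)])
        k4 = G(x + h, [y + h * k for y, k in zip(Y, k3)])
        Y = [y + h / 6 * (p + 2 * q + 2 * r + s)
             for y, p, q, r, s in zip(Y, k1, k2, k3, k4)]
        x += h
    return Y

# Example 34: y'' = 2y' - y, y(0) = 0, y'(0) = 1, whose solution is x e^x.
G = to_first_order(lambda x, y, yp: 2 * yp - y)
y1, yp1 = rk4(G, 0.0, [0.0, 1.0], 1.0)
```

At x = 1 the two components should be close to $e$ and $2e$, the values of $xe^x$ and its derivative.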
Here, then, are the fundamental existence and uniqueness theorems for
ordinary differential equations.
Theorem 3.3. (Fundamental Existence and Uniqueness Theorem) Let
$(a, b)$ be a point in $R \times R^n$, and $F$ an $R^n$-valued Lipschitz function defined in a
neighborhood of $(a, b)$. There is an $\varepsilon > 0$ and a unique continuously
differentiable $R^n$-valued function $f$ defined on $(a - \varepsilon, a + \varepsilon)$ such that $f(a) = b$ and
$f'(x) = F(x, f(x))$ for all $x$ in $(a - \varepsilon, a + \varepsilon)$.
Proof. The idea behind the proof is to change the given problem to a problem
involving integration. In fact, by the fundamental theorem of calculus, our
desired function is that function $f$ such that
$$f(x) = b + \int_a^x F(t, f(t))\,dt$$
for all points $x$ near $a$. That is, we seek a fixed point of the function $T$ defined on
$[C((a - \varepsilon, a + \varepsilon))]^n$ (the space of $n$-tuples of continuous functions on $(a - \varepsilon, a + \varepsilon)$) by
$$Tf(x) = b + \int_a^x F(t, f(t))\,dt$$
Because $F$ is Lipschitz, we can choose $\varepsilon$ so that $T$ is a contraction. We shall of
course refer to the distance between functions introduced in Chapter 2.
First, since $F$ is Lipschitz in a neighborhood of $(a, b)$, there is an $M$ and some
rectangle $B$ centered at $(a, b)$ such that
$$\|F(x, y) - F(x, y')\| \le M\|y - y'\|$$
for all $(x, y)$, $(x, y')$ in that rectangle. In particular, $F$ is bounded on that rectangle
by $K = \|F(a, b)\| + M\varepsilon_0$. Let $\varepsilon < \min\{\varepsilon_0 K^{-1},\ \tfrac12 M^{-1},\ \varepsilon_0\}$. Let $X$ be the set of $n$-tuples
of continuous functions $f$ on the interval $(a - \varepsilon, a + \varepsilon)$ such that $\|f - b\| \le \varepsilon_0$. If
$f \in X$, then for all $t \in (a - \varepsilon, a + \varepsilon)$, $(t, f(t))$ is in $B$ and $F$ is defined on $B$, so the
transformation
$$Tf(x) = b + \int_a^x F(t, f(t))\,dt$$
is well defined on $X$. We verify now that it is a contraction on $X$. Let $f \in X$.
Then
$$\|Tf(x) - b\| \le \left|\int_a^x \|F(t, f(t))\|\,dt\right| \le K|x - a| \le K\varepsilon < \varepsilon_0$$
Thus $\|Tf - b\| \le \varepsilon_0$, so $Tf \in X$ also.
Let $f, g \in X$.
$$\|Tf(x) - Tg(x)\| \le \left|\int_a^x \|F(t, f(t)) - F(t, g(t))\|\,dt\right| \le M\left|\int_a^x \|f - g\|\,dt\right| \le M|x - a|\,\|f - g\| \le M\varepsilon\|f - g\| < \tfrac12\|f - g\|$$
Thus $T$ is a contraction, so by the fixed point theorem it has a unique fixed point $f$.
We have
$$f(x) = b + \int_a^x F(t, f(t))\,dt$$
so by the fundamental theorem of calculus $f$ is continuously differentiable (because
the right-hand side is so), and $f(a) = b$, $f'(x) = F(x, f(x))$ for all $x \in (a - \varepsilon, a + \varepsilon)$.
Certain remarks on this theorem are necessary. First of all, the general
differential equation of first order is of the form $F(y', y, x) = 0$, not $y' = F(x, y)$.
The question arises: when can we rewrite the relation $F(y', y, x) = 0$ in the
form of Picard's theorem? For in this case we will know that solutions exist.
This question, of explicitly solving an equation $H(u, v) = 0$ for one of its
variables, say $u$ (so that there is a function $G(v)$ such that $H(u, v) = 0$ if and
only if $u = G(v)$), will be discussed further in Chapter 7. (Recall from
Theorem 2.16 that we have a condition for functions $F$ of two real variables:
$\partial F/\partial y \ne 0$. We shall see that this is the general condition.)
Secondly, Picard's theorem only asserts the existence of local solutions.
Supposing that $F(x, y)$ is defined in $I \times R^n$, $I$ any interval in $R$, we can ask
if there exists, for each $y_0 \in R^n$, a function $f$ defined on all of $I$ such that
$$f'(x) = F(x, f(x)) \text{ for all } x \in I$$
$$f(x_0) = y_0 \text{ for given } x_0 \in I$$
The answer is, in general, no. For example, the function $F(x, y) = y^2$ is
certainly Lipschitz in any rectangle, so local solutions always exist. But we
already know that if $y' = y^2$, $y$ must be of the form $(c - x)^{-1}$ for some
constant $c$. Thus, if we impose an initial condition $f(x_0) = c_0$, the (local)
solution is
$$f(x) = \left(x_0 + \frac{1}{c_0} - x\right)^{-1}$$
On any interval on which the solution exists it is given by this formula (see
Exercise 19(a)). Thus there is no solution to this initial value problem in any
interval containing the point $x_0 + 1/c_0$.
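The failure of global existence can be seen numerically. The following is a small sketch, assuming the initial data $x_0 = 0$, $c_0 = 2$ (our choice, made only for illustration): it evaluates the local solution, checks the equation $y' = y^2$ by a central difference, and locates the point where the solution ceases to exist.

```python
def solution(x0, c0):
    """Local solution of y' = y^2 with y(x0) = c0 (c0 nonzero), and its blow-up point."""
    blow_up = x0 + 1.0 / c0
    return (lambda x: 1.0 / (blow_up - x)), blow_up

f, blow_up = solution(0.0, 2.0)   # assumed initial data: y(0) = 2

# Central-difference check of y' = y^2 at an interior point.
h = 1e-6
x = 0.3
residual = (f(x + h) - f(x - h)) / (2 * h) - f(x) ** 2
```

Evaluating `f` at points approaching `blow_up` from the left gives values that grow without bound, although $F(x, y) = y^2$ is perfectly smooth everywhere.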
We now turn to equations of higher order and the reduction to systems of
first order. Let us represent a point of $R^{1+(k+1)n}$ by coordinates
$(x, y_0, \ldots, y_k)$, where $x$ is a real number and the $y_i$ range through $R^n$.
Theorem 3.4. Let $(a, b_0, \ldots, b_k) \in R^{1+(k+1)n}$ and let $F$ be an $R^n$-valued
Lipschitz function defined in a neighborhood of $(a, b_0, \ldots, b_k)$. There is an
$\varepsilon > 0$ and a unique $(k+1)$-times continuously differentiable $R^n$-valued function $f$
defined on $(a - \varepsilon, a + \varepsilon)$ such that
$$f(a) = b_0, \qquad f^{(i)}(a) = b_i, \quad 1 \le i \le k$$
$$f^{(k+1)}(x) = F(x, f(x), f'(x), \ldots, f^{(k)}(x))$$
Proof. Consider the $R^{(k+1)n}$-valued function $G$ defined in a neighborhood of
$(a, b_0, \ldots, b_k)$ by
$$G(x, y_0, \ldots, y_k) = (y_1, \ldots, y_k, F(x, y_0, \ldots, y_k))$$
Clearly, $G$ is Lipschitz wherever $F$ is. By Theorem 3.3, there is an $\varepsilon > 0$ and a
unique function $g$ defined in $(a - \varepsilon, a + \varepsilon)$ taking values in $R^{(k+1)n}$ such that
$$g(a) = (b_0, \ldots, b_k), \qquad g'(x) = G(x, g(x)) \tag{3.44}$$
Writing $g = (g_0, \ldots, g_k)$ we have $g_i(a) = b_i$ and $(g_0, \ldots, g_k)'(x) = (g_1(x), \ldots,
g_k(x), F(x, g_0, \ldots, g_k))$. Thus, splitting this into coordinates, $g_i' = g_{i+1}$, $0 \le i < k$,
and $g_k'(x) = F(x, g_0, \ldots, g_k)$. Thus $g_0' = g_1$, $g_0'' = g_1' = g_2$, and in general $g_0^{(i)} = g_i$.
Thus
$$g_0(a) = b_0, \qquad g_0^{(i)}(a) = b_i, \quad 1 \le i \le k$$
and
$$g_0^{(k+1)}(x) = F(x, g_0(x), g_0'(x), \ldots, g_0^{(k)}(x))$$
which solves our problem. The uniqueness follows immediately, for if $f$ is a
solution of our original problem then clearly $(f, f', \ldots, f^{(k)})$ solves (3.44), but the
solution of that is unique.
• PROBLEMS
23. Let $h_0, \ldots, h_{k-1}$ be infinitely differentiable functions on the interval
$I$ and suppose $f$ is a solution of
$$y^{(k)} + \sum_{i=0}^{k-1} h_i y^{(i)} = 0$$
Show that $f$ also must be infinitely differentiable. (Hint: Any solution of
$$y^{(k+1)} + h_{k-1}y^{(k)} + \sum_{i=1}^{k-1}(h_{i-1} + h_i')y^{(i)} + h_0'y = 0$$
is also a solution of the first equation.)
24. Prove: If $\{f_n\}$ is a sequence of bounded functions in $C(I)$ such that
$\|f_n - f_{n-1}\| \le C_n$, where $\sum C_n < \infty$, then the sequence $\{f_n\}$ converges to a
continuous function.
25. The differential equation $y'' + y = 0$ has unique solutions
corresponding to the initial conditions
$$y(0) = 1, \quad y'(0) = 0$$
$$y(0) = 0, \quad y'(0) = 1$$
respectively. Let $C$, $S$ be these two functions. Prove:
(a) $C^2 + S^2 = 1$
(b) $S' = C$, $C' = -S$
(c) $S(2x) = 2S(x)C(x)$
(d) $e^{ix} = C(x) + iS(x)$
Of course, the reader will recognize that $C(x) = \cos x$ and $S(x) = \sin x$,
and thus these equations should follow. However, the intention here is to
verify these equations on the basis only of the defining differential equation.
26. Sometimes it is of value to find a linear differential equation which
has as its space of solutions the vector space spanned by $n$ given functions.
We find an equation of $n$th order by substituting the $n$ functions in the
equation $y^{(n)} + g_{n-1}y^{(n-1)} + \cdots + g_0 y = 0$.
For example, suppose we want to find the linear equation whose solution
set is the span of $x$ and $\sin x$. We try a second-order equation
$y'' + gy' + hy = 0$ and substitute $x$ and $\sin x$:
$$g + hx = 0$$
$$-\sin x + g\cos x + h\sin x = 0$$
We can solve these linear equations:
$$h(x) = \frac{\sin x}{\sin x - x\cos x}, \qquad g(x) = \frac{-x\sin x}{\sin x - x\cos x}$$
Thus the differential equation is
$$(\sin x - x\cos x)y'' - y'x\sin x + y\sin x = 0$$
Find the linear differential equation whose solution set is the vector space
spanned by the given set of functions.
(a) $x$, $x^2$, $x^3$
(b) $e^x$, $e^{ix}$, $e^{(1+i)x}$
(c) $xe^x$, $\exp(x^2)$
(d) $\sin x$, $\cos x$, $\tan x$
(e) $x\sin x$, $\cos x$
(f) $x$, $e^x$, $\tan x$
3.6 Linear Differential Equations
The most important and best understood class of differential equations
are those which are linear in the unknown function and its derivatives. We
now give the definition of this class.
Definition 5. Let $I$ be an interval in $R$. A linear differential operator of
order $k$ is a transformation from the space of $k$-times differentiable functions
on $I$ to the space of continuous functions on $I$ of the form
$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} h_i f^{(i)} \tag{3.45}$$
where $h_0, \ldots, h_{k-1}$ are given continuous functions on $I$.
Notice that the coefficient of the highest order term is 1. More generally,
it could be any function $h_k$. In this case, if $h_k$ is never zero on $I$, we could
divide by $h_k$ and obtain the form (3.45). If $h_k$ sometimes has the value zero,
then the theory to be presented here will fail (see Problem 31).
A transformation of the type (3.45) is linear, in the sense that
$$L(f + g) = L(f) + L(g), \qquad L(cf) = cL(f)$$
It follows that the collection of functions which get mapped into zero by $L$,
$K(L) = \{f : L(f) = 0\}$, the kernel of $L$, is a vector space of functions. We shall
now show that this is a $k$-dimensional vector space.
First of all, the equation $L(f) = g$, for a given continuous function $g$
defined on the interval $I$, has a solution $f$ on the whole interval, which is
uniquely determined by given initial conditions $f(a) = b_0, \ldots, f^{(k-1)}(a) = b_{k-1}$.
In other words, in this case, Picard's theorem is more than local; it gives
a solution on the whole interval. We shall verify this fact below (in
Proposition 9). Thus, we can state:
Proposition 6. Let $I$ be an interval in $R$, $a \in I$, and $L$ a linear differential
operator of order $k$ defined on $I$.
(i) If $g$ is continuous on $I$ and $b_0, \ldots, b_{k-1}$ are any real numbers, there is a
unique $C^k$ function $f$ defined on $I$ such that
$$L(f) = g, \qquad f(a) = b_0, \quad f'(a) = b_1, \quad \ldots, \quad f^{(k-1)}(a) = b_{k-1}$$
(ii) The space $K(L)$ of solutions on $I$ of $Lf = 0$ is a vector space of dimension $k$.
Proof.
(i) will follow immediately from Proposition 9 below according to the same
procedure as in the preceding section for reducing a $k$th-order equation to a
first-order system.
(ii) Let $E_a$ be the transformation from $K(L)$ to $R^k$ defined by evaluation at $a$:
$$E_a(f) = (f(a), f'(a), \ldots, f^{(k-1)}(a))$$
$E_a$ is linear and, by the existence and uniqueness theorem, one-to-one and onto. Thus $K(L)$
also has dimension $k$.
Let us reconsider briefly the case of constant coefficient linear operators:
$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} a_i f^{(i)} \tag{3.46}$$
We associate to $L$ the polynomial
$$P_L(X) = X^k + \sum_{i=0}^{k-1} a_i X^i$$
(called the characteristic polynomial of $L$). We have already seen, by
substitution of $f(x) = e^{rx}$, that if $P_L(r) = 0$, then $e^{rx}$ is in $K(L)$. Now if $P_L$ has $k$
distinct roots $r_1, \ldots, r_k$, then all of the functions $\exp(r_1x), \ldots, \exp(r_kx)$
are in $K(L)$, as well as all linear combinations of these. Since $K(L)$ has
dimension $k$, these exponential functions form a basis for $K(L)$ and every
solution of $L(f) = 0$ is of the form
$$A_1\exp(r_1x) + \cdots + A_k\exp(r_kx)$$
where the $A_i$ are to be determined by the initial conditions. In case $P_L$
does not have $k$ distinct roots (for example, $P_L(X) = X^2 - 2X + 1$), the
situation is more complicated. We shall complete this discussion in the
next chapter, where we shall also discuss the question of factoring polynomials.
Examples
35. Solve $y''' + 3y'' + 2y' = 0$ with the initial conditions $y(0) = 0$,
$y'(0) = 1$, $y''(0) = 1$. The characteristic polynomial, $X^3 + 3X^2 + 2X$,
has the roots $0, -2, -1$. Thus the general solution is of the form
$A + Be^{-2x} + Ce^{-x}$. We solve for $A$, $B$, $C$ by substituting the initial
conditions:
$$A + B + C = 0$$
$$-2B - C = 1$$
$$4B + C = 1$$
Solving, we find $A = 2$, $B = 1$, $C = -3$, so the solution is
$f(x) = 2 + e^{-2x} - 3e^{-x}$.
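When the characteristic roots are distinct, fitting the initial conditions is a linear algebra problem: the $i$th derivative of $\exp(rx)$ at $x = 0$ is $r^i$. The sketch below, which assumes exact rational arithmetic and a small Gauss-Jordan routine named `solve` of our own, recovers the constants of Example 35.

```python
from fractions import Fraction

def solve(A, b):
    """Solve A c = b by Gauss-Jordan elimination over the rationals."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col:
                M[r] = [v - M[r][col] * w for v, w in zip(M[r], M[col])]
    return [row[-1] for row in M]

roots = [0, -2, -1]
# Row i holds r^i for each root r: the i-th derivative of exp(rx) at x = 0.
A = [[r ** i for r in roots] for i in range(3)]
coeffs = solve(A, [0, 1, 1])   # right-hand side: y(0), y'(0), y''(0)
```

The list `coeffs` holds $A$, $B$, $C$ in the order of the roots $0, -2, -1$.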
Linear Systems with Constant Coefficients
We now turn to the solution of systems of linear differential equations
with constant coefficients. First, let us try to see an example through to the
end.
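36. Consider the system
$$x_1' = x_1 + x_2, \qquad x_1(0) = a$$
$$x_2' = x_1 - x_2, \qquad x_2(0) = b \tag{3.47}$$
According to the fundamental theorem we can approach a solution
by successive approximations using the transformation
$$T(x_1(t), x_2(t)) = \int_0^t (x_1(\tau) + x_2(\tau),\ x_1(\tau) - x_2(\tau))\,d\tau + (a, b) \tag{3.48}$$
It is convenient to use matrix notation. Thus, writing
$$x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad x_0 = \begin{pmatrix} a \\ b \end{pmatrix}$$
(3.47) becomes
$$x' = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}x, \qquad x(0) = x_0$$
Equation (3.48) becomes
$$Tx(t) = \int_0^t \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}x(\tau)\,d\tau + x_0$$
Now, we successively approximate
$$x_1 = \int_0^t \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}x_0\,d\tau + x_0 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}t\,x_0 + x_0$$
$$x_2 = \int_0^t \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\left[\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\tau\,x_0 + x_0\right]d\tau + x_0 = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^2\frac{t^2}{2}x_0 + \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}t\,x_0 + x_0$$
$$x_3 = \left[\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^3\frac{t^3}{3!} + \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^2\frac{t^2}{2!} + \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}t + I\right]x_0$$
$$x_n = \left[I + \sum_{k=1}^{n}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^k\frac{t^k}{k!}\right]x_0$$
According to the fundamental theorem the series converges to the
solution
$$x(t) = \left[I + \sum_{k=1}^{\infty}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}^k\frac{t^k}{k!}\right]x_0 \tag{3.49}$$
The formula (3.49) represents the solution in the sense that it describes
a way of computing approximations to the pair of functions $x_1(t)$,
$x_2(t)$. (The question of measuring the accuracy of those
approximations is important; we shall return to those questions in Chapter 5.)
However, we have not obtained formulas for the functions individually.
That is not really surprising since the functions are given by an
interdependent relation (3.47).
By analogy with the series for $e^x$, we define the exponential of a
matrix as
$$\exp(M) = e^M = I + \sum_{k=1}^{\infty}\frac{M^k}{k!} \tag{3.50}$$
Then we can write the solution to (3.47) as
$$x(t) = \exp\left[t\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\right]x_0$$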
We now state a proposition which summarizes this discussion for general
linear first-order systems.
Proposition 7. Consider the linear first-order system of $n$ equations in $n$
unknown functions:
$$x'(t) = Mx(t), \qquad x(0) = x_0$$
where $x = (x_1, \ldots, x_n)$ and $M$ is an $n \times n$ matrix. The solution is given by
$$x(t) = e^{Mt}x_0$$
Proof. We find, by successive approximations, the fixed point of
$$Tx(t) = \int_0^t Mx(\tau)\,d\tau + x_0$$
Starting from the constant function $x_0$, we obtain
$$x_1 = Mtx_0 + x_0$$
$$x_2 = \frac{(Mt)^2}{2!}x_0 + Mtx_0 + x_0$$
$$x_n = \left(\frac{(Mt)^n}{n!} + \frac{(Mt)^{n-1}}{(n-1)!} + \cdots + Mt + I\right)x_0$$
By the fundamental theorem the sequence of vector-valued functions $x_n$ converges
to the solution of the given differential equations. But the limit of $x_n$ is given by
$$x(t) = \left[I + \sum_{n=1}^{\infty}\frac{(Mt)^n}{n!}\right]x_0 \tag{3.51}$$
Although we have not questioned the convergence of the series (3.50), we
know there is no problem. For, by the fundamental theorem the sequence
$\{x_n\}$ converges, so the series in (3.51) must converge. Finally, $e^M$ is just
$e^{Mt}$ at $t = 1$.
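The partial sums of the series (3.50) are easy to compute. The following is a minimal sketch, assuming that forty terms are enough for the small matrices and times used here (an arbitrary cutoff, not an error estimate); the function names `matmul`, `expm`, and `apply` are ours.

```python
import math

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(M, t=1.0, terms=40):
    """Partial sum I + sum_{k=1}^{terms} (Mt)^k / k! of the series (3.50)."""
    n = len(M)
    Mt = [[t * v for v in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]          # current term (Mt)^k / k!
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, Mt)]
        result = [[r + s for r, s in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def apply(A, v):
    """Matrix-vector product."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

# The system (3.47) with the assumed initial vector x0 = (1, 0), evaluated at t = 0.5.
A = [[1.0, 1.0], [1.0, -1.0]]
x_half = apply(expm(A, t=0.5), [1.0, 0.0])
```

For this particular matrix the square is twice the identity, so the series can be summed in closed form with hyperbolic functions of $\sqrt2\,t$, which gives an independent check on the partial sums.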
Finding the exponential of a matrix is not an easy thing to do; ordinarily
it is best to just work with the series and approximate solutions. However,
in certain cases we can obtain explicit formulas for the solution.
Examples (Eigenvectors)
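37. Suppose the matrix $M$ is diagonal. Then
$$M = \begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix}$$
and the equations are
$$x_1' = d_1x_1, \quad x_2' = d_2x_2, \quad \ldots, \quad x_n' = d_nx_n$$
However, this system is not a system at all, but just $n$ independent
equations. The solutions are
$$x_1 = \exp(d_1t)x_1(0), \quad \ldots, \quad x_n = \exp(d_nt)x_n(0)$$
Thus, in particular, we see that
$$\exp\begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix} = \begin{pmatrix} \exp(d_1) & & 0 \\ & \ddots & \\ 0 & & \exp(d_n) \end{pmatrix}$$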
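38. Suppose that the vector of initial conditions $x_0$ is an eigenvector
of $M$: $Mx_0 = \lambda x_0$ for some $\lambda$. Then $M^2x_0 = \lambda^2x_0, \ldots, M^nx_0 =
\lambda^nx_0$, so we can compute the solution explicitly:
$$x(t) = e^{Mt}x_0 = \left(I + \sum_{n=1}^{\infty}\frac{M^nt^n}{n!}\right)x_0 = x_0 + \sum_{n=1}^{\infty}\frac{t^n}{n!}M^nx_0 = x_0 + \sum_{n=1}^{\infty}\frac{t^n\lambda^n}{n!}x_0 = e^{\lambda t}x_0$$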
This computation leads us to speculate as to the existence and quantity of
eigenvectors of the (η χ n)-matrix M. In general, this is a difficult quest and
still does not lead to a complete explicit solution of the differential equation
x' = Mx. However, if there is a basis of R" of eigenvectors of M, then we
can give a complete explicit solution.
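Proposition 8. Suppose $v_1, \ldots, v_n$ are independent eigenvectors of the
$(n \times n)$-matrix $M$, with eigenvalues $\lambda_1, \ldots, \lambda_n$, respectively. Then the equation
$x' = Mx$, $x(0) = x_0$ can be solved explicitly as follows. Write $x_0 = c_1v_1 + \cdots
+ c_nv_n$. The solution is
$$x(t) = c_1\exp(\lambda_1t)v_1 + \cdots + c_n\exp(\lambda_nt)v_n$$
Proof. We compute the series (3.51) directly:
$$Mx_0 = M(c_1v_1 + \cdots + c_nv_n) = c_1\lambda_1v_1 + \cdots + c_n\lambda_nv_n$$
$$M^2x_0 = M(c_1\lambda_1v_1 + \cdots + c_n\lambda_nv_n) = c_1\lambda_1^2v_1 + \cdots + c_n\lambda_n^2v_n$$
$$M^kx_0 = c_1\lambda_1^kv_1 + \cdots + c_n\lambda_n^kv_n$$
Thus
$$\left(I + \sum_{k=1}^{\infty}\frac{M^kt^k}{k!}\right)x_0 = \sum_{j=1}^{n}c_j\left(1 + \sum_{k=1}^{\infty}\frac{t^k\lambda_j^k}{k!}\right)v_j = \sum_{j=1}^{n}c_j\exp(\lambda_jt)v_j$$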
Examples
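39. Consider the system of differential equations
$$x' = -x + y, \qquad x(0) = 4$$
$$y' = -6x + 4y, \qquad y(0) = 12$$
We find the eigenvalues and eigenvectors corresponding to this system
of equations as in Section 1.7. Let
$$M = \begin{pmatrix} -1 & 1 \\ -6 & 4 \end{pmatrix}$$
Then $\det(M - \lambda I) = \lambda^2 - 3\lambda + 2$. The eigenvalues are the roots
$\lambda = 1, 2$ of this polynomial. The eigenvalue 1 has as eigenspace the
kernel of
$$M - I = \begin{pmatrix} -2 & 1 \\ -6 & 3 \end{pmatrix}$$
The vector $(1, 2)$ spans the kernel. Similarly, the vector $(1, 3)$ is an
eigenvector of $M$ with eigenvalue 2 since it is in the kernel of
$$M - 2I = \begin{pmatrix} -3 & 1 \\ -6 & 2 \end{pmatrix}$$
The general solution of the given differential equation is
$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = c_1e^{t}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2e^{2t}\begin{pmatrix} 1 \\ 3 \end{pmatrix}$$
This vector has the initial conditions $x(0) = c_1 + c_2$, $y(0) = 2c_1 + 3c_2$.
Our initial conditions are $(4, 12)$, so we can solve for $c_1$, $c_2$: $c_1 = 0$,
$c_2 = 4$. Thus, we obtain the explicit solution
$$x(t) = 4e^{2t}, \qquad y(t) = 12e^{2t}$$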
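The recipe of Proposition 8 can be checked on this example. The sketch below assumes the matrix with eigenvalues 1, 2 and eigenvectors $(1, 2)$, $(1, 3)$ quoted in the example (these data determine it as $M$ below); the function names `coefficients` and `solution` are ours, and the coefficient solver handles only the $2 \times 2$ case.

```python
import math

# Assumed data: the 2x2 matrix determined by the eigenpairs quoted in Example 39.
M = [[-1.0, 1.0], [-6.0, 4.0]]
eigenpairs = [(1.0, (1.0, 2.0)), (2.0, (1.0, 3.0))]

def coefficients(x0):
    """Solve x0 = c1*v1 + c2*v2 by Cramer's rule (2x2 case only)."""
    (_, v1), (_, v2) = eigenpairs
    det = v1[0] * v2[1] - v2[0] * v1[1]
    c1 = (x0[0] * v2[1] - v2[0] * x0[1]) / det
    c2 = (v1[0] * x0[1] - x0[0] * v1[1]) / det
    return c1, c2

def solution(x0, t):
    """x(t) = c1 exp(l1 t) v1 + c2 exp(l2 t) v2, as in Proposition 8."""
    cs = coefficients(x0)
    return [sum(c * math.exp(lam * t) * v[i] for c, (lam, v) in zip(cs, eigenpairs))
            for i in range(2)]
```

With the initial vector $(4, 12)$ the first coefficient vanishes, so the solution moves along the single eigenvector $(1, 3)$.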
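40.
$$x' = \begin{pmatrix} 1 & 1 \\ \tfrac14 & 1 \end{pmatrix}x, \qquad x(0) = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$$
The eigenvalues are $1/2$, $3/2$, and they have eigenvectors $(2, -1)$,
$(2, 1)$, respectively. Thus the general solution is
$$x(t) = c_1e^{t/2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} + c_2e^{3t/2}\begin{pmatrix} 2 \\ 1 \end{pmatrix}$$
Substituting in our initial conditions, we obtain these equations for
$c_1$, $c_2$:
$$3 = 2c_1 + 2c_2$$
$$3 = -c_1 + c_2$$
The solutions are $c_1 = -3/4$, $c_2 = 9/4$. Thus the solution of the
given system is
$$x(t) = -\tfrac34e^{t/2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} + \tfrac94e^{3t/2}\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -\tfrac32e^{t/2} + \tfrac92e^{3t/2} \\ \tfrac34e^{t/2} + \tfrac94e^{3t/2} \end{pmatrix}$$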
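41.
$$x' = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}x, \qquad x(0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{3.52}$$
The equation for $\lambda$ here turns out to be $(1 - \lambda)^2 + 1 = 0$, so $\lambda = 1 \pm i$.
The root $\lambda = 1 + i$ gives the eigenvector $(1, i)$, and for the root
$\lambda = 1 - i$ we obtain the eigenvector $(1, -i)$. Now our initial
conditions are $(1, 0) = (1, i)/2 + (1, -i)/2$, and thus we obtain the
solution
$$x(t) = \tfrac12e^{(1+i)t}\begin{pmatrix} 1 \\ i \end{pmatrix} + \tfrac12e^{(1-i)t}\begin{pmatrix} 1 \\ -i \end{pmatrix} \tag{3.53}$$
There is an easier way to solve this equation, and that consists in
recognizing that the matrix is of the form
$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$
and represents a complex number $a + ib$: in our case $1 - i$. Thus, we can
replace our system (3.52) by the single equation
$$z'(t) = (1 - i)z(t), \qquad z(0) = 1$$
by substituting $z(t) = x(t) + iy(t)$. This has the solution
$$z(t) = e^{(1-i)t}$$
which is the same as (3.53), of course:
$$x(t) = \operatorname{Re} z(t) = \tfrac12\left(e^{(1-i)t} + e^{(1+i)t}\right)$$
$$y(t) = \operatorname{Im} z(t) = \tfrac{1}{2i}\left(e^{(1-i)t} - e^{(1+i)t}\right)$$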
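The complex shortcut is convenient to compute with. The sketch below assumes the system $x' = x + y$, $y' = -x + y$ with $x(0) = 1$, $y(0) = 0$ (the system obtained from $z' = (1 - i)z$ with $z = x + iy$); it reads $x$ and $y$ off from $z(t)$ and checks both equations by a central difference. The function names are ours.

```python
import cmath
import math

def solution(t):
    """(x(t), y(t)) read off from z(t) = exp((1 - i)t), with z(0) = 1."""
    z = cmath.exp((1 - 1j) * t)
    return z.real, z.imag

def residual(t, h=1e-6):
    """Central-difference check of x' = x + y, y' = -x + y at time t."""
    x, y = solution(t)
    xp, yp = solution(t + h)
    xm, ym = solution(t - h)
    dx, dy = (xp - xm) / (2 * h), (yp - ym) / (2 * h)
    return dx - (x + y), dy - (-x + y)
```

In real terms the solution is $x(t) = e^t\cos t$, $y(t) = -e^t\sin t$, which is what the real and imaginary parts of $e^{(1-i)t}$ give.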
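42. Find the general solution of
$$x' = \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}x$$
Now, after computation, we find $\det(M - \lambda I) = (-2 - \lambda)^2(4 - \lambda)$;
thus $M$ has the eigenvalues $-2$, $4$.
Eigenvalue $-2$:
$$M - \lambda I = \begin{pmatrix} 3 & -3 & 3 \\ 3 & -3 & 3 \\ 6 & -6 & 6 \end{pmatrix}$$
Thus the corresponding eigenvectors lie in the plane $x - y + z = 0$.
Two independent vectors in this plane are $(1, 2, 1)$, $(0, 1, 1)$.
Eigenvalue 4:
$$M - \lambda I = \begin{pmatrix} -3 & -3 & 3 \\ 3 & -9 & 3 \\ 6 & -6 & 0 \end{pmatrix}$$
The corresponding eigenvectors lie on the line $-x - y + z = 0$,
$x - y = 0$, which is spanned by $(1, 1, 2)$. Thus, the general solution is
$$x(t) = c_1e^{-2t}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + c_2e^{-2t}\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + c_3e^{4t}\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}$$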
43.
^(Ιίί)" Hi)
This matrix is symmetric, so it has a spanning set of eigenvectors. We
already found them in Example 9: (1, —1,0), (0, —1, 1) have the
eigenvalue 1, (1, 1, 1) has the eigenvalue ψ- The general solution is
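$$x(t) = c_1e^{t}\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + c_2e^{t}\begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix} + c_3e^{2t}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
The initial condition is
$$\begin{pmatrix} 3 \\ -3 \\ 3 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + 2\begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$
Thus the solution is
$$x_1(t) = 2e^{t} + e^{2t}, \qquad x_2(t) = -4e^{t} + e^{2t}, \qquad x_3(t) = 2e^{t} + e^{2t}$$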
44.
$$x' = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}x, \qquad x(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{3.54}$$
The equation for the eigenvalues is $(1 - \lambda)^2 = 0$, so we obtain only
one eigenvalue, $\lambda = 1$. This has the eigenvector $(1, 0)$. Thus we
know one solution of the general equation:
$$x(t) = e^{t}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$
However, this does not satisfy the given conditions. We cannot
proceed to solve this equation without further study of the matrix,
and that is generally a difficult search. In the present case we can
avoid such difficulties by observing that the second row of (3.54) is
just $y' = y$, $y(0) = 1$. This has the solution $y(t) = e^{t}$. Then the
first row is
$$x'(t) = x(t) + e^{t}, \qquad x(0) = 0$$
and we know how to solve this equation: $x(t) = te^{t}$. Thus our
sought-after pair is $(te^{t}, e^{t})$.
Notice that in the last example the solutions are not linear combinations
of exponentials, but admit polynomial factors. Only when there is a basis
of eigenvectors are the solutions linear combinations of exponentials; when
there are too few eigenvectors, we must expect more complicated
coefficients. There is a theorem that any solution of a first-order linear system
with constant coefficients is a combination of exponentials with polynomial
factors. This theorem follows from the Jordan canonical form of a matrix;
we shall not go into it here.
We conclude this section with the proof of the global version of Picard's
theorem from which Proposition 6 was obtained.
Proposition 9. (Global Version of Picard's Theorem) Let $I$ be an interval
in $R$. Suppose $F$ is a continuous $R^n$-valued function defined on $I \times R^n$ which
satisfies this strong Lipschitz property: there is a constant $K > 0$ such that
for all $y_1, y_2 \in R^n$
$$\sup\{\|F(x, y_1) - F(x, y_2)\| : x \in I\} \le K\|y_1 - y_2\| \tag{3.55}$$
Then the system of $n$ equations
$$y' = F(x, y), \qquad y(c) = a$$
has a unique solution for any initial condition $a$ at $c \in I$.
Proof. We cannot simply use the fixed point theorem, for the transformation $T$
defined by
$$Tf(x) = a + \int_c^x F(t, f(t))\,dt$$
is not a contraction on the space of functions continuous on the interval $I$.
Nevertheless the successive approximations procedure works. Define a sequence $\{f_n\}$ of
continuous functions on $I$ by induction:
$$f_0(x) = a$$
$$f_1(x) = a + \int_c^x F(t, f_0(t))\,dt$$
$$f_n(x) = a + \int_c^x F(t, f_{n-1}(t))\,dt$$
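The sequence $\{f_n\}$ converges in $C(I)$. By making $K$ larger we may assume that
besides (3.55) we also have $\|F(x, a)\| \le K$ for all $x \in I$. We prove by induction that
$$\|f_n(x) - f_{n-1}(x)\| \le \frac{K^n}{n!}|x - c|^n \tag{3.56}$$
(i) $n = 1$:
$$\|f_1(x) - f_0(x)\| = \left\|\int_c^x F(t, a)\,dt\right\| \le K\left|\int_c^x dt\right| = K|x - c|$$
(ii) $n - 1 \Rightarrow n$:
$$\|f_n(x) - f_{n-1}(x)\| = \left\|\int_c^x [F(t, f_{n-1}(t)) - F(t, f_{n-2}(t))]\,dt\right\| \le K\left|\int_c^x \|f_{n-1}(t) - f_{n-2}(t)\|\,dt\right| \le \frac{K^n}{(n-1)!}\left|\int_c^x |t - c|^{n-1}\,dt\right| = \frac{K^n}{n!}|x - c|^n$$
From (3.56), we obtain (here $b - a$ denotes the length of the interval $I$)
$$\|f_n - f_{n-1}\| \le \frac{[K(b - a)]^n}{n!}$$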
Since the series $\sum [K(b - a)]^n/n!$ converges, it follows that $\{f_n\}$ is a Cauchy sequence
in $C(I)$, so there is an $f \in C(I)$ such that $f_n \to f$. Since $T$ is continuous on $C(I)$,
$Tf_n \to Tf$. But $Tf_n = f_{n+1}$, so $f_n \to Tf$ also; thus $Tf = f$. Since $f$ is a fixed point of $T$
we conclude as in Picard's theorem that $f$ solves our problem.
Now the fixed point theorem asserts the uniqueness of our fixed point, and
we seem to have lost that. But we can regain it on $I$, because locally we have
uniqueness, by Picard's theorem. Suppose $g$ is another solution of the problem; we
have to show that $g = f$ on $I$. For this purpose we may assume that the point $c$ at
which the initial condition is given is one of the end points of $I$. Let
$$R = \sup\{r \in I : f(x) = g(x) \text{ for all } x \le r\}$$
Since $f(c) = a = g(c)$, $c$ is in the set on the right. Also, $b$ is an upper bound for this set,
so the least upper bound $R$ exists. We have to show that $R = b$. If $R < b$, then
the differential equation is defined in a neighborhood of $R$. By Picard's theorem,
there is an $\varepsilon > 0$ such that the equation $y' = F(x, y)$ has a unique solution in
$(R - \varepsilon, R + \varepsilon)$ with initial condition $y(R) = f(R)$. But both $f$ and $g$, when
considered as functions on $(R - \varepsilon, R + \varepsilon)$, are such solutions. (Notice $g(R) = f(R)$
by continuity.) Thus, $f = g$ on $(R - \varepsilon, R + \varepsilon)$, so $R + \varepsilon$ is in the set above, and $R$
is not an upper bound. Thus the assumption $R < b$ is contradicted, so $R = b$ and
the proposition is proved.
• EXERCISES
23. Find the general solution of these systems of equations
(a) ri =4)Ί-2y2
y'i = 2y2 + 4yi
(b) y'i=yi-y2
y'i = ay ι + у г
(С) y'i = У ι + У г + Уз
У'г = ayi + уг
у'ъ = ayi + уз
24. Find the solution of these initial value problems
(a) The system in 23(a) with initial condition ^,(0) = 1, y2(0) = 1.
(b) The system in 23(c) with initial conditions ^ί(Ο) = ^2(0) = 0,
(c)
(d)
y'i =^1+^2
y'i = —У1 + Уг
/ι = 3^i - Уз
У'г=У1+ 2y2 - Уз
у'з = 2у! - 2y3
у№
УМ
Л(0)
Л(0)
Уз(0)
25. Find the general solution of the equation x' = Mx, where Μ is given
by:
(a) the matrix in Example 10.
(b) the matrices in Exercise 10.
(c) the matrices in Exercise 11
(d) / о -1 -3\ (g)
(e) / 4 7\ (h)
<0 (_i Ι) ω
( 3
-2
0
V °
Ί ι
0 1
ρ о
Ί ο
0 1
,ι -ι
2
3
0
0
0
0
1
-2
°\
1
1/
°\
0
1/
0
0
2
1
• PROBLEMS
27. Suppose $M = (a_i^j)$ is an $n \times n$ matrix such that $a_i^j = 0$ if $i \le j$. Show
that the solutions of $x' = Mx$ are all polynomials of degree at most $n$.
(Hint: $M^n = 0$.)
28. Show that $\exp(M^t) = (e^M)^t$.
29. Show that if $M$ is skew-symmetric ($M^t = -M$), then $e^M(e^M)^t = I$.
For such a matrix the rows form an orthonormal basis: A matrix $A$ with the
property $AA^t = I$ is thus called orthogonal, and represents a rotation.
3.7 Second-Order Linear Equations
The most common type of equation arising from physical problems is the
second-order linear equation:
$$y'' + a_1(x)y' + a_0(x)y = g(x) \tag{3.57}$$
Thus the techniques for solving such equations have been well developed.
In this section, we shall assume that we know one solution of the associated
homogeneous equation
$$y'' + a_1(x)y' + a_0(x)y = 0 \tag{3.58}$$
and show how to find the general solution of (3.57). The question of finding
this first solution is of course difficult, and further discussion will be postponed
until Chapter 5. The technique involved in finding the general solution
consists in substituting candidates involving the given solution and a new
unknown function, and thereby attempting to reduce the complication in the
given equation.
In order to motivate this discussion, let us recall the theory of the first-order
equation: $y' + h(x)y = g(x)$. The homogeneous equation is easily solved
by separation of variables: $f(x) = \exp(-\int^x h)$ is a solution of $y' + hy = 0$.
Now, to find the general solution of the given equation we substitute $y = zf$,
where $z$ is some new unknown function. From $f' + hf = 0$, we obtain
$$g = y' + hy = z'f + z(f' + hf) = z'f$$
Thus $z' = f^{-1}g$, so $z$ is found by integration: $z = \int f^{-1}g + c$.
Now, the second-order homogeneous equation (3.58) has two independent
solutions. By assumption we know one, call it $f_1$. Let us try to find another
by substituting $y = zf_1$. The new equation in $z$ is
$$y'' + a_1y' + a_0y = z''f_1 + 2z'f_1' + zf_1'' + a_1(z'f_1 + zf_1') + a_0zf_1 = z''f_1 + (2f_1' + a_1f_1)z' = 0 \tag{3.59}$$
This equation is linear in $z'$ and thus we can solve for $z'$ and then integrate to
find $z$. We have as a result
$$z(x) = c\int^x f_1(t)^{-2}\exp\left(-\int^t a_1\right)dt$$
Examples
45. The equation $x^2y'' + xy' - y = 0$ has the solution $y(x) = x$.
We now find another solution by substituting $y(x) = z(x)x$. We have
$y' = z'x + z$, $y'' = z''x + 2z'$, so
$$x^2y'' + xy' - y = z''x^3 + 2z'x^2 + z'x^2 + xz - zx = 0$$
or
$$z''x^3 + 3z'x^2 = 0$$
Dividing by $x^2$ we have $z''x + 3z' = 0$, which we can solve for $z'$ by
separation of variables: $z = C_1x^{-2} + C_2$. We can take $z = x^{-2}$,
and thus the second solution, $y = zx = x^{-1}$, is found.
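The general formula for $z$ can also be evaluated numerically when the integrals are not convenient by hand. The following is a sketch under stated assumptions: the equation is first divided by $x^2$ so that $a_1(x) = 1/x$, the lower limits of both integrals are taken to be $x_0 = 1$, the constant $c$ is 1, and the integrals are approximated by the composite Simpson rule. The function names are ours.

```python
import math

def simpson(func, a, b, n=200):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3

def second_solution(f1, a1, x, x0=1.0):
    """f1(x) * integral_{x0}^{x} f1(t)^-2 exp(-integral_{x0}^{t} a1) dt, with c = 1."""
    integrand = lambda t: math.exp(-simpson(a1, x0, t)) / f1(t) ** 2
    return f1(x) * simpson(integrand, x0, x)

# Example 45 in normalized form: y'' + (1/x) y' - (1/x^2) y = 0, known solution f1 = x.
y2 = second_solution(lambda x: x, lambda x: 1.0 / x, 2.0)
```

With these limits the formula produces $(x - x^{-1})/2$, a combination of $x$ and $x^{-1}$, so it differs from the $x^{-1}$ of the example only by a multiple of the known solution and a constant factor.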
46. $\sin x^2$ is a solution of
$$xy'' - y' + 4x^3y = 0$$
We substitute $y = z\sin x^2$ and obtain this differential equation for $z$:
$$z''x\sin x^2 + z'(4x^2\cos x^2 - \sin x^2) = 0$$
Thus
$$\frac{z''}{z'} = -4x\cot x^2 + \frac{1}{x}$$
Integrating, we obtain
$$\ln z' = 2\ln\csc(x^2) + \ln x + C$$
or
$$z' = C_1x\csc^2x^2$$
Integrating once again, we find $z = C_1\cot x^2 + C_2$ (with a new constant $C_1$). Thus, the
second solution can be chosen as $\cot x^2\sin x^2 = \cos x^2$ (which we
might have guessed at the beginning).
Now that we have a technique for finding two independent solutions for
the homogeneous equation, we return to the general equation (3.57). Taking
our cue from the first-order case, we try a combination of the solutions of
the homogeneous equation. Let us refer to these two solutions of (3.58)
as $f_1$, $f_2$. Now, we consider a function of the form
$$y(x) = z_1(x)f_1(x) + z_2(x)f_2(x) \tag{3.60}$$
If we compute $y'$ and $y''$ and substitute into (3.57) we will get a totally
unintelligible equation of second order in the two unknown functions $z_1$, $z_2$.
What we need, to find two unknown functions, is of course a pair of equations.
From where is the second equation to come? We notice, first of all, that
the formula (3.60) does not uniquely determine the functions $z_1$, $z_2$, even if
we know the sought-after function $y$. For, if $z_1$, $z_2$ are found so that (3.60)
gives the solution $y$, then we may add $gf_2$ to $z_1$, and subtract $gf_1$ from $z_2$,
obtaining another pair making (3.60) valid. We thus seek another condition
(preferably involving derivatives) which will serve to uniquely identify the
functions $z_1$, $z_2$. Differentiating (3.60), we obtain
$$y'(x) = z_1(x)f_1'(x) + z_2(x)f_2'(x) + z_1'(x)f_1(x) + z_2'(x)f_2(x) \tag{3.61}$$
Equations (3.60) and (3.61) will give a pair of linear equations in $z_1(x)$ and
$z_2(x)$ if the sum $z_1'(x)f_1(x) + z_2'(x)f_2(x)$ vanishes. This pair of equations
(if noncollinear) will then identify $z_1(x)$, $z_2(x)$ in terms of $y(x)$, $y'(x)$. Thus, if
that condition is satisfied we know that $z_1$, $z_2$ are uniquely determined by the
solution $y$. Turning the argument around, we impose the condition
$$z_1'f_1 + z_2'f_2 = 0 \tag{3.62}$$
and hope now that, together with this condition, the given differential
equation will explicitly determine $z_1$, $z_2$. (In fact it will do so theoretically, since
Equation (3.57) determines the solution $y$ which in turn determines $z_1$, $z_2$
in the presence of the condition (3.62).) Let us try our idea on Example 45.
Example
47. Solve $x^2y'' + xy' - y = x^2$.
We have the two solutions $x, x^{-1}$ of the homogeneous equation.
We consider $y = z_1x + z_2x^{-1}$ and impose the condition
$z_1'x + z_2'x^{-1} = 0$ (3.63)
Now let us substitute this information into the given equation. In
the presence of (3.63), we have
$y' = z_1 - z_2x^{-2}$
$y'' = z_1' - z_2'x^{-2} + 2z_2x^{-3}$
Then
$x^2 = x^2y'' + xy' - y$
$= x^2z_1' - z_2' + 2z_2x^{-1} + xz_1 - z_2x^{-1} - z_1x - z_2x^{-1}$
so that
$x^2z_1' - z_2' = x^2$ (3.64)
Now the pair of linear Equations (3.63), (3.64) can be solved by
Cramer's rule. The determinant of the system is $-2x$, and
$z_1' = \frac{-x}{-2x} = \frac{1}{2}, \quad z_2' = \frac{x^3}{-2x} = -\frac{x^2}{2}$
Integrating, we find that $z_1 = x/2 + c_1$, $z_2 = -x^3/6 + c_2$, and so the
general solution is
$y = z_1f_1 + z_2f_2 = \tfrac{1}{2}x^2 - \tfrac{1}{6}x^2 + c_1x + c_2x^{-1} = \tfrac{1}{3}x^2 + c_1x + c_2x^{-1}$
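As a sanity check on Example 47, the short sketch below substitutes $y = x^2/3 + c_1x + c_2x^{-1}$ back into $x^2y'' + xy' - y = x^2$ at a few sample points. The function names and the sample values of $x$, $c_1$, $c_2$ are illustrative choices, not part of the text; the derivatives are entered by hand.

```python
# Check Example 47: x^2 y'' + x y' - y = x^2 with
# y = x^2/3 + c1*x + c2/x (derivatives computed by hand).

def y(x, c1, c2):
    return x**2 / 3 + c1 * x + c2 / x

def dy(x, c1, c2):
    return 2 * x / 3 + c1 - c2 / x**2

def d2y(x, c1, c2):
    return 2 / 3 + 2 * c2 / x**3

def residual(x, c1, c2):
    """Left side minus right side of the differential equation."""
    return x**2 * d2y(x, c1, c2) + x * dy(x, c1, c2) - y(x, c1, c2) - x**2

# Arbitrary sample points and constants (hypothetical test values).
for x in (0.5, 1.0, 2.0, 3.5):
    for c1, c2 in ((0.0, 0.0), (1.0, -2.0), (-0.3, 4.0)):
        assert abs(residual(x, c1, c2)) < 1e-9
```

The residual vanishes (up to rounding) for every choice of the constants, as it should for a general solution.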
Now, it was not an accident that in this case the equations turned out to be
a pair of linear first-order equations: this is always the case. We shall now
describe the technique in general. Supposing that $f_1, f_2$ are two independent
solutions of the homogeneous Equation (3.58) we try a function $y = z_1f_1 + z_2f_2$
as solution of (3.57). We impose the condition
$z_1'f_1 + z_2'f_2 = 0$ (3.65)
Then
$y' = z_1f_1' + z_2f_2'$
$y'' = z_1'f_1' + z_2'f_2' + z_1f_1'' + z_2f_2''$
Thus (3.57) becomes
$z_1'f_1' + z_2'f_2' + z_1f_1'' + z_2f_2'' + z_1a_1f_1' + z_2a_1f_2' + z_1a_0f_1 + z_2a_0f_2 = g$
or
$z_1'f_1' + z_2'f_2' = g$ (3.66)
the rest of the terms vanishing since $f_1, f_2$ solve the homogeneous Equation (3.58).
We solve the pair of linear Equations (3.65), (3.66) by Cramer's rule:
$z_1' = \frac{-f_2g}{f_1f_2' - f_2f_1'}, \quad z_2' = \frac{f_1g}{f_1f_2' - f_2f_1'}$
and these can be integrated in order to find the general solution. One
apparent problem is the denominator. If it ever vanishes, these functions
may not be integrable. In fact, our whole discussion will break down.
Fortunately, we can verify once and for all that this function is nonzero.
The function
$W(x) = f_1(x)f_2'(x) - f_1'(x)f_2(x) = \det\begin{pmatrix} f_1(x) & f_2(x) \\ f_1'(x) & f_2'(x) \end{pmatrix}$
is called the Wronskian of the pair $f_1, f_2$. Notice that
$W' = f_1f_2'' - f_1''f_2$
Since $f_1, f_2$ solve the homogeneous Equation (3.58), we can easily check that
$W' + a_1W = 0$
and thus
$W(x) = W(x_0)\exp\left(-\int_{x_0}^{x} a_1\right)$
Thus if $W$ is nonzero at one point, it is never zero. $W$ is nonzero at $x_0$ if
the vectors $(f_1(x_0), f_1'(x_0))$ and $(f_2(x_0), f_2'(x_0))$ are independent; this is
guaranteed if the functions $f_1$ and $f_2$ are independent.
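The relation $W(x) = W(x_0)\exp(-\int_{x_0}^{x} a_1)$ can be checked numerically on a concrete case. The sketch below assumes the homogeneous equation of Example 47 divided by $x^2$, namely $y'' + x^{-1}y' - x^{-2}y = 0$, so that $a_1(x) = 1/x$ and the two solutions are $x$ and $x^{-1}$. The helper names are hypothetical.

```python
import math

# Homogeneous equation y'' + (1/x) y' - (1/x^2) y = 0,
# with independent solutions f1 = x and f2 = 1/x.

def wronskian(x):
    f1, df1 = x, 1.0
    f2, df2 = 1.0 / x, -1.0 / x**2
    return f1 * df2 - df1 * f2

def wronskian_from_formula(x, x0):
    # W(x0) * exp(-integral of a1 from x0 to x); with a1(t) = 1/t
    # the integral equals log(x / x0).
    return wronskian(x0) * math.exp(-math.log(x / x0))

for x in (0.5, 1.0, 2.0, 7.0):
    assert abs(wronskian(x) - wronskian_from_formula(x, 1.0)) < 1e-12
```

Here both expressions reduce to $-2/x$, which is never zero for $x > 0$.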
Examples
48. Solve $y'' + xy' - y = 1$, $y(0) = 0$, $y'(0) = 0$. It is easy
to see that $x$ is a solution of the homogeneous equation. We
find another solution by substituting $y = zx$. The equation for $z$
is $z''x + (2 + x^2)z' = 0$. Thus
$\frac{z''}{z'} = -\frac{2 + x^2}{x} = -2x^{-1} - x$
so
$z' = Cx^{-2}\exp(-x^2/2)$
Thus we may take as the second solution
$y(x) = xz(x) = x\int_0^x t^{-2}\exp(-t^2/2)\,dt$
Now let us refer to the integral by $\varphi(x)$. We solve the given equation
by substituting $y = z_1x + z_2x\varphi(x)$; this gives the pair of equations
$z_1'x + z_2'x\varphi(x) = 0$
$z_1' + z_2'(x\varphi'(x) + \varphi(x)) = 1$
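Since $\varphi'(x) = x^{-2}\exp(-x^2/2)$, we find, by Cramer's rule,
$z_1' = \frac{-x\varphi(x)}{\exp(-x^2/2)}, \quad z_2' = \frac{x}{\exp(-x^2/2)}$
or
$z_1(x) = -\int_0^x t\,\varphi(t)\exp(t^2/2)\,dt, \quad z_2(x) = \exp(x^2/2)$
The integrals defining $z_1$ are not expressible in closed form, but they
nevertheless define a function. Thus the solution is
$y(x) = -x\int_0^x t\exp(t^2/2)\left[\int_0^t \tau^{-2}\exp(-\tau^2/2)\,d\tau\right]dt + x\exp(x^2/2)\int_0^x t^{-2}\exp(-t^2/2)\,dt$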
This technique for solving second-order equations is called variation of
parameters. It can be applied to higher order linear equations. Suppose
we are given such a differential equation:
$y^{(n)} + \sum_{i=1}^{n-1} a_i(x)y^{(i)} + a_0(x)y = g(x)$ (3.67)
Suppose we have somehow found $n$ independent solutions $f_1, \ldots, f_n$ of the
homogeneous equation. Then we try a solution
$y = z_1f_1 + \cdots + z_nf_n$
As in the second-order case, the solution will uniquely determine the functions
$z_1, \ldots, z_n$ if we impose the conditions
$z_1'f_1 + \cdots + z_n'f_n = 0$
$z_1'f_1' + \cdots + z_n'f_n' = 0$
$\vdots$
$z_1'f_1^{(n-2)} + \cdots + z_n'f_n^{(n-2)} = 0$
In the presence of these conditions, (3.67) becomes
$z_1'f_1^{(n-1)} + \cdots + z_n'f_n^{(n-1)} = g$
We can solve this system as a system of linear equations and then find the
$z_1, \ldots, z_n$ by integration. Just as in the second-order case, this system is
solvable since the determinant (called the Wronskian of the $n$ functions
$f_1, \ldots, f_n$) is never zero.
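49. Solve
$\tfrac{1}{3}x^3y''' - x^2y'' + 2xy' - 2y = x^5(x^2 - 9)$ (3.68)
The homogeneous equation has the solutions $x, x^2, x^3$. Thus we try
$y = z_1x + z_2x^2 + z_3x^3$. We impose these conditions:
$z_1'x + z_2'x^2 + z_3'x^3 = 0$
$z_1' + 2z_2'x + 3z_3'x^2 = 0$
In the presence of these conditions we compute (3.68) to be
$z_2' + 6z_3' = x^5(x^2 - 9)$
The matrix of this system is
$\begin{pmatrix} x & x^2 & x^3 \\ 1 & 2x & 3x^2 \\ 0 & 1 & 6 \end{pmatrix}$
which has the determinant $-2x^3 + 6x^2 = -2x^2(x - 3)$. Thus, by Cramer's
rule, we must have
$z_1' = \frac{x^5(x^2 - 9) \cdot x^4}{-2x^2(x - 3)}, \quad z_2' = \frac{-x^5(x^2 - 9) \cdot 2x^3}{-2x^2(x - 3)}, \quad z_3' = \frac{x^5(x^2 - 9) \cdot x^2}{-2x^2(x - 3)}$
After integration we can express the general solution as
$y(x) = x\left[-\tfrac{1}{2}\left(\tfrac{x^9}{9} + \tfrac{3x^8}{8}\right) + C_1\right] + x^2\left[\tfrac{x^8}{8} + \tfrac{3x^7}{7} + C_2\right] + x^3\left[-\tfrac{1}{2}\left(\tfrac{x^7}{7} + \tfrac{x^6}{2}\right) + C_3\right]$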
• EXERCISES
26. Show that the general solution of
$y'' + y = f$
can be expressed as
$y(t) = c_1\cos t + c_2\sin t + \int_0^t \sin(t - \tau)f(\tau)\,d\tau$
27. Find the general solution of
$y'' + y = x$
28. Find the general solution of:
(a) $y'' - 4y = 1$
(b) y-y = &
(c) $y'' + 3y' + 2y = \sin x$
(d) r-^Z + ^r^o
(e) $x^2y'' - 4xy' + 6y = x^3 + x^2$
29. Find the solution of
x2/ - 2y = 2x2 Я0) = 1 y'(0) = 1
30. Find the general solution of
$y'' + xe^xy' - e^xy = 0$
31. Find the solution of
e-y + xy'-y = \ y(0) = 0 /(0) = 1
• PROBLEMS
30. A differential equation of the form
$a_kx^ky^{(k)} + a_{k-1}x^{k-1}y^{(k-1)} + \cdots + a_1xy' + a_0y = 0$
where the $a_i$'s are constants can be easily solved. Try the substitution
$y = x^s$. You should obtain
$x^s[a_ks(s - 1)\cdots(s - k + 1) + a_{k-1}s(s - 1)\cdots(s - k + 2) + \cdots + a_1s + a_0] = 0$
Thus we need only find the $k$ roots of the polynomial in brackets.
Find the general solution of these differential equations:
(a) $x^2y'' - 2xy' + y = 0$
(b) $x^2y'' - 3xy' - 3y = 0$
(c) $x^2y'' + 4xy' + 3y = x^5$
(d) $x^2y'' - xy' + y = 0$
31. Solve the second-order $2 \times 2$ system of equations
>ч: ;)>·+(» in
(Hint: Go to the first-order $4 \times 4$ system by adding the equations $y' = z$.)
3.8 Summary
An $R^n$-valued function $f$ defined in a neighborhood of $x_0$ in $R$ is called
differentiable at $x_0$ if
$\lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t}$
exists. This limit is denoted $f'(x_0)$. If $f$ is differentiable on an interval
$I$ in $R$ its image is a curve in $R^n$. The line through $f(x_0)$ spanned by $f'(x_0)$
is the tangent line to the curve at $f(x_0)$. If $h$ is differentiable in a
neighborhood of the curve and has a relative maximum on the curve at $f(x_0)$, then
$\langle \nabla h(f(x_0)), f'(x_0) \rangle = 0$. We can deduce the following principle from this. If
$h, g$ are differentiable functions defined in a domain in $R^n$, then the maximum
(or minimum) of $h$ subject to the restraint $g(x) = 0$ is attained at those points
$x$ for which there exists a $\lambda$ such that
$g(x) = 0, \quad \nabla h(x) = \lambda\nabla g(x)$
If $h, g_1, \ldots, g_k$ are differentiable in $R^n$, and $h$ has a maximum (or minimum)
subject to the restraints $g_1(x) = 0, \ldots, g_k(x) = 0$ at $x_0$, there exist $\lambda_1, \ldots, \lambda_k$
such that
$g_1(x_0) = 0, \ldots, g_k(x_0) = 0, \quad \nabla h(x_0) = \lambda_1\nabla g_1(x_0) + \cdots + \lambda_k\nabla g_k(x_0)$
Suppose $f$ is an $R^n$-valued function defined on the interval $I$. $f$ is $C^k$
($k$-times continuously differentiable) if $f', \ldots, f^{(k)}$ all exist and are
continuous. If $f$ is such a function we have Taylor's expansion about any $x_0 \in I$:
$f(x) = \sum_{n=0}^{k-1} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \varepsilon(x - x_0)\frac{(x - x_0)^k}{k!}$
where $\varepsilon(x - x_0)$ is bounded by $M_k = \max\{|f^{(k)}(t)| : t \text{ between } x_0 \text{ and } x\}$.
If $f$ has derivatives of all orders, and
$\lim_{k \to \infty} M_k\frac{(x - x_0)^k}{k!} = 0$
then $f$ can be expanded in an infinite Taylor expansion:
$f(x) = f(x_0) + \sum_{n=1}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$
A differential equation of order $k$ is a relation involving a function of
$x, y, y', \ldots, y^{(k)}$. If there is a $k$-times differentiable function $f$ such that this
relation holds for all $x$ after the substitution $y = f(x)$, $y' = f'(x)$, ..., $y^{(k)} = f^{(k)}(x)$,
we say $f$ is a solution of the differential equation. A linear differential
equation of order $k$ is a relation of the form
$y^{(k)} + \sum_{i=1}^{k-1} a_i(x)y^{(i)} + a_0(x)y = g(x)$ (3.69)
where the functions $a_i$ and $g$ are (at least) continuous on an interval $I$. If
$g = 0$, the equation is called homogeneous. The space of solutions of the
homogeneous equation is a vector space of dimension $k$. Equation (3.69)
has a solution on $I$ uniquely determined by the initial conditions
$f(x_0) = a_0, \; f'(x_0) = a_1, \; \ldots, \; f^{(k-1)}(x_0) = a_{k-1}$ (3.70)
Any equation of the form
$y^{(k)} = F(x, y, y', \ldots, y^{(k-1)})$
has a unique solution subject to the initial conditions (3.70) under these
conditions on $F$:
(i) $F$ is defined and continuous in a neighborhood of $(x_0, a_0, \ldots, a_{k-1})$.
(ii) $F$ is Lipschitz: there is an $M$ such that
$|F(x, y_1, y_1', \ldots, y_1^{(k-1)}) - F(x, y_2, y_2', \ldots, y_2^{(k-1)})|$
$\le M(|y_1 - y_2| + |y_1' - y_2'| + \cdots + |y_1^{(k-1)} - y_2^{(k-1)}|)$
Techniques for Solution
1. Successive approximations. The equation
$y' = F(x, y), \quad y(x_0) = a_0$
is solvable if $F$ is Lipschitz near $x_0$. The solution can be approximated by a
sequence $\{f_n\}$ defined as follows:
$f_0$ = any continuous function,
$f_1(x) = \int_{x_0}^{x} F(t, f_0(t))\,dt + a_0$
$f_n(x) = \int_{x_0}^{x} F(t, f_{n-1}(t))\,dt + a_0$
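The successive approximations can be carried out exactly for a simple case. The sketch below assumes the illustrative equation $y' = y$, $y(0) = 1$ (so $F(x, y) = y$, $x_0 = 0$, $a_0 = 1$) and stores each approximation as a list of polynomial coefficients; the function name `picard_step` is a hypothetical helper, not something from the text.

```python
from fractions import Fraction

def picard_step(coeffs, a0):
    """One step f -> a0 + integral_0^x f(t) dt for the equation y' = y.

    A polynomial is stored as its list of coefficients, lowest degree first.
    """
    # Integrating c * t^k gives c / (k + 1) * x^(k + 1); the constant term is a0.
    integrated = [Fraction(a0)] + [c / (k + 1) for k, c in enumerate(coeffs)]
    return integrated

f = [Fraction(1)]          # f_0 = 1, one possible continuous starting function
for _ in range(5):
    f = picard_step(f, 1)
```

After five steps the coefficients are $1, 1, 1/2, 1/6, 1/24, 1/120$, the beginning of the Taylor expansion of the exponential function.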
2. Separation of variables. If $y' = f(x)g(y)$, then the equation
$\int g(y)^{-1}\,dy = \int f(x)\,dx + C$
implicitly determines $y$ as a function of $x$.
3. First-order linear equations. The homogeneous equation
y'+fy=0
can be solved by separation of variables: $y = c\exp(-\int f)$. The equation
$y' + fy = g$ can be reduced by the substitution $y = z\exp(-\int f)$. The
resulting equation in $z$ is solved by separation of variables.
4. Constant coefficient linear equations. The characteristic polynomial
of the differential equation
$y^{(k)} + a_{k-1}y^{(k-1)} + \cdots + a_1y' + a_0y = 0$ (3.71)
is the polynomial $X^k + a_{k-1}X^{k-1} + \cdots + a_1X + a_0$. If $r$ is a root of this
polynomial, then $e^{rx}$ is a solution of (3.71).
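For a second-order equation the statement is easy to test numerically. The sketch below assumes the illustrative equation $y'' - 3y' + 2y = 0$; the function names are hypothetical, and the check uses $y' = ry$, $y'' = r^2y$ for $y = e^{rx}$.

```python
import cmath

def characteristic_roots(a1, a0):
    """Roots of X^2 + a1*X + a0, the characteristic polynomial of y'' + a1 y' + a0 y = 0."""
    disc = cmath.sqrt(a1 * a1 - 4 * a0)
    return (-a1 + disc) / 2, (-a1 - disc) / 2

def residual(r, a1, a0, x):
    """Plug y = exp(r x) into y'' + a1 y' + a0 y, using y' = r y and y'' = r^2 y."""
    y = cmath.exp(r * x)
    return r * r * y + a1 * r * y + a0 * y

roots = characteristic_roots(-3.0, 2.0)   # y'' - 3y' + 2y = 0
for r in roots:
    assert abs(residual(r, -3.0, 2.0, 0.7)) < 1e-9
```

Using `cmath` means the same sketch also covers complex roots, such as those of $y'' + y = 0$.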
5. First-order linear systems. Let $A$ be an $n \times n$ matrix. The equation
in $n$ unknown functions $y = (y_1, \ldots, y_n)$:
$y' = Ay, \quad y(0) = y_0$
has the solution $y(t) = e^{At}y_0$. The exponential of a matrix is defined by
$e^M = \exp(M) = I + \sum_{n=1}^{\infty} \frac{M^n}{n!}$
If $y_0$ is an eigenvector of $A$ with eigenvalue $\lambda$, then the solution is $y(t) = e^{\lambda t}y_0$.
If $R^n$ has a basis $y_1, \ldots, y_n$ of eigenvectors of $A$, with eigenvalues $\lambda_1, \ldots, \lambda_n$
respectively, then the general solution is
$c_1\exp(\lambda_1t)y_1 + \cdots + c_n\exp(\lambda_nt)y_n$
In general, we must allow polynomial coefficients.
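The series defining the matrix exponential can be summed directly for small matrices. The sketch below is a minimal illustration using plain nested lists; the names `mat_mul` and `mat_exp`, the number of terms, and the test matrix are all assumptions made for the example. For the matrix $A$ with rows $(0, -1)$ and $(1, 0)$, the exponential $e^{At}$ should be the rotation through the angle $t$.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_exp(m, terms=30):
    """Partial sum I + M + M^2/2! + ... of the exponential series."""
    n = len(m)
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    power = [row[:] for row in result]          # holds M^0 = I
    for k in range(1, terms):
        power = mat_mul(power, m)
        power = [[entry / k for entry in row] for row in power]   # now M^k / k!
        result = [[result[i][j] + power[i][j] for j in range(n)] for i in range(n)]
    return result

t = 1.2
A = [[0.0, -1.0], [1.0, 0.0]]
At = [[t * entry for entry in row] for row in A]
E = mat_exp(At)
```

Applied to $y_0 = (1, 0)$, the first column of `E` gives $(\cos t, \sin t)$, the solution of $y' = Ay$ starting at $y_0$.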
6. Second-order linear equations, knowing one solution. Suppose $f_1$ is
a solution of
$y'' + a_1(x)y' + a_0(x)y = 0$ (3.72)
We find a second by substituting $y = zf_1$. This produces a linear first-order
equation in $z'$. Suppose $f_1, f_2$ are solutions of (3.72). Then we solve
$y'' + a_1(x)y' + a_0(x)y = g(x)$ (3.73)
by the substitution $y = z_1f_1 + z_2f_2$. In the presence of the condition
$z_1'f_1 + z_2'f_2 = 0$ (3.74)
Equation (3.73) becomes
$z_1'f_1' + z_2'f_2' = g$ (3.75)
The linear Equations (3.74), (3.75) can be solved for $z_1', z_2'$ and then $z_1, z_2$ are
found by integration.
• FURTHER READING
E. A. Coddington, An Introduction to Ordinary Differential Equations,
Prentice-Hall, Englewood Cliffs, N.J., 1961. An elementary book on
differential equations which goes more deeply into the material of this
chapter.
M. Tenenbaum and H. Pollard, Ordinary Differential Equations, Harper
and Row, New York, 1963. This is a thorough treatment of the subject of
differential equations. Many special techniques and applications are
exposed.
F. Brauer and J. A. Nohel, Qualitative theory of Ordinary Differential
Equations, Benjamin, New York, 1967. This book studies the theory of
systems of differential equations, and in particular the behavior of sets of
solutions.
L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading,
Mass., 1968. This is a very modern approach to the subject. It goes
thoroughly into the fundamental theorem.
• MISCELLANEOUS PROBLEMS
32. Show that if $M$ is a skew-symmetric matrix ($M^t = -M$), then
$\langle Mx, x \rangle = 0$ for all $x$. Show that $M^2$ is symmetric and thus has a basis of
eigenvectors. Conclude that, considered as a matrix over $C$, $M$ also has a
basis of eigenvectors. (Hint: $M^2 - \lambda = (M + \sqrt{\lambda})(M - \sqrt{\lambda})$.) Thus if $x$ is
an eigenvector of $M^2$ with eigenvalue $\lambda$, then either $x$ is an eigenvector of $M$
with eigenvalue $\sqrt{\lambda}$ or $(M - \sqrt{\lambda})x$ is an eigenvector of $M$ with eigenvalue
$-\sqrt{\lambda}$.
33. Let $T$ be any linear transformation. Compute the gradient of
$\langle Tx, Tx \rangle$, and show that the maximum of $\|Tx\|^2$ on $\|x\|^2 = 1$ is attained at
an eigenvector of $T^tT$.
34. Show that if $T$ is a symmetric matrix, $T(\{\|x\|^2 = 1\})$ is an ellipsoid
whose major axes are of length equal to the eigenvalues of $T$.
35. Find the points $p_0 \in \{(x, y) \in R^2 : xy = 1\}$, $p_1 \in \{(x, y) \in R^2 : y + x^2 = 0\}$
which minimize the distance between these two curves.
36. Minimize and maximize the volume of a box with given surface area.
37. Find the point on the ellipse {x2 + iy2 = 1} which is closest to (i, 0).
Find the furthest point from (i, 0).
38. Find the point on the ellipse {x2 + iy2 =1} which is closest to the
circle of radius i centered at (i, i).
39. Suppose $\{a_n\}$ is a bounded sequence. Define $f(x) = \sum_{n=1}^{\infty} a_nx^n$.
Show that $f$ is infinitely differentiable in the interval $(-1, 1)$, and $n!\,a_n = f^{(n)}(0)$.
40. Let $f$ be a twice continuously differentiable function defined in a
neighborhood $N$ of $(0, 0)$ in $R^2$. Show that there is a function $\varepsilon$ defined in
$N$ such that $\lim_{p \to 0} \varepsilon(p) = 0$ and
$f(x, y) = f(0) + \frac{\partial f}{\partial x}(0, 0)x + \frac{\partial f}{\partial y}(0, 0)y + \varepsilon(x, y)\|(x, y)\|$
41. Using Taylor's theorem, we can derive the exponential function in
yet another way. Suppose that $f$ is a function with the property that
$f'(x) = f(x)$ for all $x$, normalized so that $f(0) = 1$. Then $f^{(n)}(x) = f(x)$ for all $x$,
so $f$ must have the Taylor expansion
$f(x) = \sum_{n=0}^{k} \frac{1}{n!}x^n + \varepsilon_k(x)\frac{x^{k+1}}{(k+1)!}$
for all $k$. Because of the estimate on $\varepsilon_k$, it remains bounded as $k \to \infty$, so
we should expect $f$ to be the limit of the polynomials $P_k(x) = \sum_{n=0}^{k} (1/n!)x^n$.
We already know, from the theory of Chapter 2, that $\lim P_k(x)$ exists
for all $x$. Noticing that $P_k' = P_{k-1}$, prove that $f(x) = \lim P_k(x)$ does indeed
have the property $f' = f$.
42. With a little bit of patience, and in the same way as in Exercise 41,
you should be able to find a function $f$ defined on $R$ such that $f(0) = 1$,
$f'(0) = 0$ and $f''(x) + f(x) = 0$ for all $x$.
43. (a) Suppose that $f$ is $C^k$ on $[-R, R]$ and $f(0) = f'(0) = \cdots = f^{(k-1)}(0) = 0$.
Then there is a continuous function $g$ such that $f(t) = t^kg(t)$,
and $g(0) = (1/k!)f^{(k)}(0)$.
(b) Suppose that $f$ is $C^k$ on $[-R, R]$. Show that there is a continuous
function $g$ such that
$f(t) = \sum_{i=0}^{k-1} \frac{f^{(i)}(0)}{i!}t^i + t^kg(t)$
44. Change the conditions in Problem 18 as follows: The ratio of horse
population to total population is constant and only the eggs hatched in
horses produce mature insects. Derive the differential equations now
governing the population growth.
45. Suppose now we have an insect which has a natural death rate of $d_I$
per insect per year and which lays $h$ eggs per insect per year in the air. The
egg hatches if it lands on a horse and the hatching causes the horse's death.
Assuming birth and death rates $b_H, d_H$ for the horse and a probability $k$
that a given egg will land on a given horse, now find the differential equations
of population.
46. Suppose $f(z) = -z$ represents a force field on the plane. Let a
particle be at 1 at time 0. Describe the motion in case the velocity is $i$,
$(1 + i)/2$, $(1 - i)/2$.
47. We assume that a particle generates a force field directed toward
the particle and of strength equal to the inverse of the square of the distance
to the particle. At time $t = 0$ there are particles at rest at points $p_1, \ldots, p_k$
in $R^2$. Let $\mathbf{f}_i(t)$ be the position at time $t$ of the particle originally at $p_i$.
What is the differential equation the function $(\mathbf{f}_1, \ldots, \mathbf{f}_k)$ must satisfy?
48. Suppose a river deposits water in a lake at the rate of $v$ gal/day. We
may assume that $v$ is a periodic function of time with period 365. Suppose
two pumps pump water out at the constant rates of $w_1, w_2$ gal/day. Finally,
water evaporates out of the lake at a rate of $k(t)$ gal/day/ft$^2$, where $k$ is also
periodic with period 365. We may assume that the area of the lake is
proportional to $W^{2/3}$, where $W(t)$ is the volume of the water in the lake at
day $t$. Write the differential equation $W$ must satisfy.
49. Suppose a missile A is moving in a straight line with constant velocity
Vo. A tracking missile В of constant speed i0 is always pointed toward the
missile A. Find the differential equation of motion of the tracking missileB.
50. Suppose we have the same situation as in Problem 49, but this time
the speed of В is proportional to the distance between A and B. Find the
equation of motion of B.
51. A falling body actually experiences a drag due to air resistance which
is proportional to its velocity. Suppose a body of 100 tons is dropped from
a plane 5 miles high; and this constant of proportionality (which depends
of course on the shape of the body) is 20. How long will it take for the
body to reach the ground ?
52. Two chemicals А, В in solution combine to create chemical С
according to the equation 2A + В -> С. Suppose the rate of the formation of С is
proportional to the product of the amounts of A and В present and inversely
proportional to the amount of С present. Find the differential equation
governing the formation of C, assuming initial amounts A0, B0 of chemicals
A,B.
53. Suppose in the above problem, A0 = 10, B0 = 5, and the proportion
constant is 1. How long will it take for the reaction to complete?
54. If two bodies $A, B$ of different temperatures come in contact with each
other, the rate of change of temperature is proportional to the difference in
temperature (the proportion constant depends on the bodies). Thus if
$T_A, T_B$ are the temperatures of $A, B$, respectively, we have
$T_A' = k_A(T_A - T_B)$
$T_B' = k_B(T_B - T_A)$
Find the formula for $T_A, T_B$ with these data:
(a) kA = 4, кв = 5, TA(0) = 100, Тв(0) = 0.
(b) кА =2,кв = i, TA(0) = 120, Тв(0) = 50.
55. In Problem 54, as $t \to \infty$ the bodies tend to a common temperature.
What is it in case (a), case (b), in general?
56. Solve these differential equations:
(a) У4) - 3/ + 2y = 0.
(b) У + ЗУ + 2у = 2е\
(c) У sin у + cos χ cos у = cos x.
(d) (x2+\)y' -2xy = x2+l.
(e) xy' + Ъу = x~2 sin x.
(f) x' + ax = b sin t.
(g) y"=xe>.
(h) У4)-У3)-У2)-У-2^ = 0.
(ι) ay' + by' + су = 0.
U) y'{\+x1) = \+y\
(к) x' + y' = 2x.
x' ~ У = Ь-
ω y-(; j)y.
<m> y' = (_f б)у·
57. Solve these initial value problems:
(a) у" -Зу'+2у = е3*, уф) = 0, /(0) = 1.
(b) xy'+ 3y = x\ y(0) = 5.
(c) У4> - 3/2> + 2^ = 0, j<0) = 1, уЩ = 0, /'(0) = 0, /"(0) = 0.
(d) У = (J J)y,y(0) = ([).
(e) У' = (_з8 8)у.У(0) = (^).
(f) e*y" + Xy'-y = e*, y(0) = 0, /(0) = 0.
(g) хгу" + 3xy' + y = 0, y(0) = 1, /(0) = 1.
(h) хгу" + 4xy' + 2y = x\ y(0) = 1, /(0) = 0.
58. Show that if all the entries of the matrix $M$ are less than 1, then the
series
$\sum_{n=0}^{\infty} M^n$
converges. Show that the limit is $(I - M)^{-1}$.
59. Use the idea of the preceding problem to approximate A-1 to within
two decimals, where
(a) $A = \begin{pmatrix} 1 & 0 & 0.08 \\ 0.07 & 0.91 & 0.11 \\ 0.14 & -0.03 & 1.13 \end{pmatrix}$
(b)
A =
0.98
0.13
0.02
0.11
0.01
1.18
-0.02
-0.11
-0.12
0
1.01
0.13
-0.03
-0.1
0
1
Chapter 4
CURVES
Force Fields
According to Newton's laws of motion, a particle will move in a straight
line at constant velocity unless it is subjected to forces. In that case it will
accelerate according to Newton's second law
$\mathbf{F} = m\mathbf{a}$ (4.1)
where $m$ is the mass of the particle. In this chapter we shall study the motion
of particles subjected to variable forces. That is, we must allow the
possibility that the force applied to the particle depends upon its position (as in
gravitation) or even upon time (in the case of a variable electromagnet).
This gives rise to the notion of a field of force. A field of force will be given
in this way: at time $t$ and position $\mathbf{x}$ a particle of unit mass will experience
a force $\mathbf{F}(\mathbf{x}, t)$. Thus for each $t_0$ we have associated a vector $\mathbf{F}(\mathbf{x}, t_0)$ to
each point $\mathbf{x}$. We can illustrate this as in Figure 4.1. Now, we have seen
that a particle of unit mass situated at $\mathbf{x}_0$ at time $t = 0$, with a velocity $\mathbf{v}_0$
at $t = 0$, will follow the path of motion determined by the given field of force
as the solution of the differential equation
$\mathbf{f}''(t) = \mathbf{F}(\mathbf{f}(t), t)$
$\mathbf{f}'(0) = \mathbf{v}_0$
$\mathbf{f}(0) = \mathbf{x}_0$
Figure 4.1
The path of motion is a curve in space given by the function f which solves
this equation.
Examples
1. Suppose a particle moves around the unit circle in the plane
according to the function
$\mathbf{f}(t) = (\cos t, \sin t)$ (4.2)
What force field would account for this motion? Differentiating
twice we find that
$\mathbf{f}'(t) = (-\sin t, \cos t)$
$\mathbf{f}''(t) = (-\cos t, -\sin t) = -\mathbf{f}(t)$
Thus the particle is accelerating toward the origin with constant
magnitude (see Figure 4.2). This motion can be accounted for by
the force field
$F(z, t) = -z$
In fact, in the presence of this field, if a particle has a velocity at time
$t = 0$ orthogonal to its position vector, then it will continue to move in
a circle centered at the origin. We can see this by solving the
differential equation
$f''(t) = -f(t)$
$f(0) = z_0, \quad f'(0) = iz_0$
Figure 4.2
The solution of this equation is
$f(t) = z_0e^{it} = z_0(\cos t, \sin t)$
which is just (4.2) with $z_0 = 1$.
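The claim that the field $F(z, t) = -z$ keeps the particle on a circle can also be observed numerically. The sketch below integrates $f'' = -f$, $f(0) = z_0$, $f'(0) = iz_0$ with the classical fourth-order Runge-Kutta method, a numerical technique swapped in here for illustration and not used in the text; the function name, step count, and end time are arbitrary choices.

```python
import cmath

def rk4_circle(z0, t_end, steps=2000):
    """Integrate f'' = -f, f(0) = z0, f'(0) = i*z0 as a first-order system (f, v)."""
    h = t_end / steps
    f, v = complex(z0), 1j * complex(z0)
    for _ in range(steps):
        # Right-hand side of the system: f' = v, v' = -f.
        k1f, k1v = v, -f
        k2f, k2v = v + 0.5 * h * k1v, -(f + 0.5 * h * k1f)
        k3f, k3v = v + 0.5 * h * k2v, -(f + 0.5 * h * k2f)
        k4f, k4v = v + h * k3v, -(f + h * k3f)
        f += h * (k1f + 2 * k2f + 2 * k3f + k4f) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return f

z0 = 1.0
t = 2.5
approx = rk4_circle(z0, t)
exact = z0 * cmath.exp(1j * t)   # the closed-form solution z0 * e^{it}
```

The numerical position agrees with $z_0e^{it}$ to high accuracy, and its modulus stays equal to $|z_0|$.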
2. Suppose we are given in space a force field directed toward the
$z$ axis with magnitude the distance from the $z$ axis (Figure 4.3).
$\mathbf{F}(x, y, z) = -(x, y, 0)$
Figure 4.3
Here again the force field is independent of time and is given by
$\mathbf{F}(x, y, z, t) = -(x, y, 0)$
If a particle is at $(1, 0, 0)$ with an initial velocity of $(0, 1, a)$, what is
its path of motion? We must solve this differential equation for
three unknown functions $\mathbf{f}(t) = (x(t), y(t), z(t))$:
$\mathbf{f}''(t) = (x''(t), y''(t), z''(t)) = -(x(t), y(t), 0)$
$x(0) = 1, \; y(0) = 0, \; z(0) = 0$
$x'(0) = 0, \; y'(0) = 1, \; z'(0) = a$
The solution is easily found to be
$\mathbf{f}(t) = (\cos t, \sin t, at)$
Thus, if $a = 0$, the path of motion is a circle in the plane $z = 0$. If $a$
is positive, the path of motion is an upward spiral lying over the unit
circle, of slope $a$, and if $a < 0$, the path followed is a downward spiral
(Figure 4.4).
3. Time-independent fields. If we are given a time-independent
force field on a domain in R2, or R3, and we graph sufficiently many
values of the field, it seems to be a broken line picture of a family of
curves. In fact, there is a family of curves which fits the picture in
this sense: there is a curve through each point χ which is tangent to
the vector F(x) at that point. These curves are called the lines of
force of the field and are found by solving the differential equation
$\mathbf{f}'(t) = \mathbf{F}(\mathbf{f}(t))$
$\mathbf{f}(0) = \mathbf{x}_0$
The solution of this differential equation describes the line of force
passing through the point x0 .
Fluid Flows
The general notion of field of vectors arises in many other ways besides
as force fields. Such an example which gives rise to a field is that of a fluid
in motion in a certain domain in R3. There are various ways of describing
that flow. First of all, we may idealize, by assuming that at the time $t = 0$,
there is a particle at each point $\mathbf{x}_0$ in $R^3$. Then we can describe the flow by
describing the motion of each particle. The particle which is at $\mathbf{x}_0$ at time
$t = 0$ follows a certain path which is given by a function $\mathbf{f}(\mathbf{x}_0, t)$. The
equations of motion are thus
$\mathbf{x} = \mathbf{f}(\mathbf{x}_0, t)$
Precisely, the position $\mathbf{x}$ at time $t$ of the particle originally at $\mathbf{x}_0$ is $\mathbf{f}(\mathbf{x}_0, t)$.
We assume that particles are neither created nor destroyed; this amounts to
asking that, for each $t$, the function $\mathbf{x}_0 \to \mathbf{f}(\mathbf{x}_0, t)$ is one-to-one and onto, and
thus can be inverted. So we can also write
$\mathbf{x}_0 = \boldsymbol{\varphi}(\mathbf{x}, t)$
for some function $\boldsymbol{\varphi}$. Precisely, the original position of the particle at $\mathbf{x}$ at
time $t$ was $\boldsymbol{\varphi}(\mathbf{x}, t)$.
Figure 4.4 (panels: $a = 0$, $a > 0$, $a < 0$)
4. Suppose a gas is rising at constant speed, and spiraling around
the vertical axis. Thus the motion of a particle is a helix as described
in Example 2. We do best to express this motion in cylindrical
coordinates: Let $z$ be the (complex) coordinate in the plane ($z = re^{i\theta}$)
and $w$ the height off the plane. Thus the path of motion described
by the gas is
$(z, w) = (z_0e^{it}, at + w_0)$ (4.3)
Thus the particle originally at $(z_0, w_0)$ will be at $(z_0e^{it}, w_0 + at)$ at
time $t$. We can certainly invert these equations:
$(z_0, w_0) = (ze^{-it}, w - at)$ (4.4)
Now, another way to describe a fluid flow is by its velocity. Let $\mathbf{v}(\mathbf{x}, t)$
be the velocity of the particle which is at position $\mathbf{x}$ at time $t$. The field $\mathbf{v}$ is
called the velocity field of the flow. We can find the equations of motion
from the velocity field by solving the appropriate differential equation. For
the function $\mathbf{f}(\mathbf{x}_0, t)$ describes the motion of the particle originally at $\mathbf{x}_0$.
The velocity of this particle at time $t$ is $\mathbf{f}'(\mathbf{x}_0, t)$ and its position is $\mathbf{f}(\mathbf{x}_0, t)$.
Thus we must have
$\mathbf{f}'(\mathbf{x}_0, t) = \mathbf{v}(\mathbf{f}(\mathbf{x}_0, t), t)$
$\mathbf{f}(\mathbf{x}_0, 0) = \mathbf{x}_0$
This equation can be solved uniquely.
5. Let us find the velocity field of the gas flow in Example 4. The
flow equations are (4.3). The velocity of the particle originally at $(z_0, w_0)$ is
$(z', w') = (iz_0e^{it}, a)$
To find the velocity field we must write this as a function of position
at time $t$, rather than original position. We can do this by means of
the inversion (4.4), obtaining as velocity field
$\mathbf{v}(z, w) = (iz, a)$
6. Suppose a fluid on the plane is spiraling in toward the origin
(Figure 4.5) according to this equation of flow
$z(t) = \frac{z_0e^{it}}{t}$
Figure 4.5
Here the particle at time $t = 1$ moves toward the origin so that its
argument is proportional to time elapsed, and its distance from the
origin is inversely proportional to time. Then
$z'(t) = \frac{iz_0e^{it}}{t} - \frac{z_0e^{it}}{t^2} = \left(i - \frac{1}{t}\right)z(t)$
Thus the velocity field is
$v(z, t) = \left(i - \frac{1}{t}\right)z$
The angular velocity is thus constant whereas the radial velocity
decreases as time goes on.
7. Suppose now a fluid spiraled in toward the origin so that its
velocity field was time independent, for example,
$v(z) = (i - 1)z$
The equations of motion are the solutions of
$f'(t) = (i - 1)f(t)$
$f(0) = z_0$
This gives
$f(t) = z_0e^{(-1+i)t} = z_0e^{-t}e^{it}$
In this case the distance from the origin decreases exponentially with
time (Figure 4.6).
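The relation between the flow and its velocity field in Example 7 can be checked with complex arithmetic. The sketch below assumes the flow $z_0e^{(i-1)t}$ and compares a central difference quotient in $t$ with the field $(i - 1)z$ evaluated at the current position; the starting point, time, and step size are arbitrary illustrative values.

```python
import cmath
import math

def flow(z0, t):
    """Position at time t of the particle starting at z0: z0 * exp((i - 1) t)."""
    return z0 * cmath.exp((1j - 1) * t)

def velocity_field(z):
    return (1j - 1) * z

z0, t, h = 2 + 1j, 0.8, 1e-6
# Central difference approximation of the time derivative of the flow.
numeric_velocity = (flow(z0, t + h) - flow(z0, t - h)) / (2 * h)
distance = abs(flow(z0, t))
```

The difference quotient matches the velocity field at the particle's position, and the distance from the origin equals $|z_0|e^{-t}$.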
We shall make a study of the geometry of paths of motion of single particles
and fluid flows, or families of motions, in this chapter. This study is a
continuation of analytic geometry, and begins the subject of differential geometry.
Figure 4.6
4.1 Parametrization of Curves
A curve in $R^n$ is a one-dimensional subset $\Gamma$ of $R^n$. This means that the
set $\Gamma$ can be put into one-to-one correspondence with a line, in a smooth way.
We make this notion a little more precise.
Definition 1. The image in $R^n$ of an interval under a continuously
differentiable one-to-one function with a nowhere vanishing derivative is called a
$C^1$ curve. If the function is $k$-times continuously differentiable we shall call
this curve a $C^k$ curve. The particular function is called a parametrization
of the curve.
Examples
8. The unit circle in $R^2$ is a curve. It has this parametrization:
$\Gamma$: $z(t) = (\cos t, \sin t), \quad t \in R$ (4.5)
Since $z'(t) = (-\sin t, \cos t)$ is never zero (the sine and cosine are
never simultaneously zero), this is a good parametrization.
We could also parametrize the unit circle in this way:
$z(t) = (t, (1 - t^2)^{1/2})$ (4.6)
but this parametrization fails at $t = \pm 1$, since the function $(1 - t^2)^{1/2}$
is not differentiable there. Notice that (4.6) does not parametrize
the whole circle, but only the upper semicircle. Both of these failings
can be alleviated by introducing parametrizations which cover the
other parts of the circle. That is,
$z(t) = ((1 - t^2)^{1/2}, t)$
will parametrize the circle in the right half-plane,
$z(t) = (t, -(1 - t^2)^{1/2})$
takes care of the lower semicircle, and so on.
9. It is often convenient to use complex notation to describe curves
in the plane. For example, the parametrization of the circle (4.5)
can be written as
$z(t) = \cos t + i\sin t = e^{it}$
Another curve is the spiral:
$z(t) = e^{ct}$
where $c$ is some complex number. Writing $c = a + ib$, this becomes
$z(t) = e^{at}e^{ibt}$
or, in polar notation, $z = re^{i\theta}$,
$r(t) = e^{at}, \quad \theta(t) = bt$
Thus the modulus of $z$ varies exponentially with $t$, and the argument
is linear in $t$ (see Figures 4.7 and 4.8).
10. The curve $\Gamma$:
$\mathbf{x}(t) = (\sin t, \cos t, t)$ (4.7)
called a right circular helix, is pictured in Figure 4.9. Since
$\mathbf{x}'(t) = (\cos t, -\sin t, 1)$
is never zero, (4.7) is a valid parametrization of the curve.
11. The intersection of two cylinders with different axes is a curve
(see Figure 4.10). Suppose the cylinders are both of radius 1 and
one, $C_1$, has as axis the $y$ axis, and the other, $C_2$, has as axis the $x$ axis.
Then $C_1$ has the equation
$x^2 + z^2 = 1$ (4.8)
and $C_2$ has the equation
$y^2 + z^2 = 1$ (4.9)
Figure 4.7: $z(t) = e^{at}$, Re $a > 0$
Figure 4.8: $z(t) = e^{at}$, Re $a < 0$
The intersection is, of course, the set of points where both equations
hold and thus can be written $x^2 = 1 - z^2$, $y^2 = 1 - z^2$. We can
thus parametrize at least part of the curve by
$x = (1 - z^2)^{1/2}, \quad y = (1 - z^2)^{1/2}$, or
$\mathbf{f}(t) = ((1 - t^2)^{1/2}, (1 - t^2)^{1/2}, t)$
Figure 4.9
Figure 4.10
Other parts will be found by variations on this theme:
$\mathbf{f}(t) = (-(1 - t^2)^{1/2}, (1 - t^2)^{1/2}, t)$
$\mathbf{f}(t) = (t, t, (1 - t^2)^{1/2})$
and so on.
A simpler parametrization is found by the substitution $x = \cos t$.
Then we have the two distinct branches of the intersection given by
$\mathbf{f}_1(t) = (\cos t, \cos t, \sin t)$
$\mathbf{f}_2(t) = (\cos t, -\cos t, \sin t)$
In the situation of the above example, we say that the curve is given
implicitly by the Equations (4.8) and (4.9). More often than not, when we
are given a collection of equations such as these, we can determine, just by
working with them, whether or not they do implicitly define a curve.
Nevertheless, the theoretical question remains: under what conditions can the
set defined by a collection of equations be parametrized as a curve ? We
have already answered this question in R2 in Theorem 2.14. We shall
restate the conclusion as a fact about curves.
Proposition 1. Suppose that $F$ is a differentiable real-valued function
defined in a neighborhood of $(a_0, b_0)$ and $F(a_0, b_0) = 0$ but $dF(a_0, b_0) \neq 0$.
Then the set
$\{(x, y) \in N : F(x, y) = 0\}$ (4.10)
is a curve in some neighborhood $N$ of $(a_0, b_0)$.
Proof. Since $dF(a_0, b_0) \neq 0$, then either $(\partial F/\partial x)(a_0, b_0) \neq 0$ or $(\partial F/\partial y)(a_0, b_0) \neq 0$.
Suppose the latter. Then, according to Theorem 2.16, there is an $\varepsilon > 0$ and a
differentiable function $g$ defined on the interval $(a_0 - \varepsilon, a_0 + \varepsilon)$ such that $F(x, y) = 0$
if and only if $y = g(x)$. In particular, $g(a_0) = b_0$. Let $f : (a_0 - \varepsilon, a_0 + \varepsilon) \to R^2$ be
defined by $f(t) = (t, g(t))$. Then $f$ parametrizes the set (4.10) near $(a_0, b_0)$, and
clearly $f'(t) = (1, g'(t)) \neq 0$. If instead $(\partial F/\partial x)(a_0, b_0) \neq 0$ we can give the same
argument merely by changing the roles of $x$ and $y$.
In higher dimensions the situation is a little more complicated. We shall
describe it in $R^3$. If $F, G$ are two differentiable functions defined in a
neighborhood of a point $p_0$, and $\nabla F(p_0), \nabla G(p_0)$ are independent, then the set
$\{p : F(p) - F(p_0) = 0, \; G(p) - G(p_0) = 0\}$ (4.11)
is a curve through $p_0$.
The verification of this fact is basically another use of the fixed point
theorem, complicated by some more linear algebra. We first assume that
$F(p_0) = 0 = G(p_0)$. Since the vectors $\nabla F(p_0), \nabla G(p_0)$ are independent, we
can change coordinates in $R^3$ so that $\nabla F(p_0) = E_2$ and $\nabla G(p_0) = E_3$. That
is, with respect to the new coordinates $(x, y, z)$, $\partial F/\partial x(p_0) = 0$, $\partial F/\partial y(p_0) = 1$,
$\partial F/\partial z(p_0) = 0$ and $\partial G/\partial x(p_0) = 0$, $\partial G/\partial y(p_0) = 0$, $\partial G/\partial z(p_0) = 1$. Now let
$p_0 = (x_0, y_0, z_0)$; for $x$ near $x_0$ we want to show that there are uniquely
determined $y, z$ such that
$F(x, y, z) = 0, \quad G(x, y, z) = 0$
Following Newton's method, we ask to find the fixed point in the $y, z$ plane
of the transformation
$T(y, z) = \left(y - \frac{\partial F}{\partial y}(p_0)^{-1}F(x, y, z), \; z - \frac{\partial G}{\partial z}(p_0)^{-1}G(x, y, z)\right)$
Our conditions $\nabla F(p_0) = E_2$, $\nabla G(p_0) = E_3$ will guarantee that in some
neighborhood of $p_0$, $T$ is a contraction. Thus there are unique $y = g(x)$, $z = h(x)$
such that
$T(g(x), h(x)) = (g(x), h(x))$
or
$F(x, g(x), h(x)) = 0 = G(x, g(x), h(x))$
Thus the function $\mathbf{f}(t) = (t, g(t), h(t))$ parametrizes the set (4.11) as a curve.
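The fixed point iteration can be tried on the two cylinders of Example 11. The sketch below assumes $F = x^2 + y^2 - 1$ and $G = x^2 + z^2 - 1$ with base point $p_0 = (0, 1, 1)$, where $\nabla F(p_0) = 2E_2$ and $\nabla G(p_0) = 2E_3$; rather than rescaling coordinates, it divides by the partial derivatives at $p_0$, and it subtracts the correction as in Newton's method. All names and numerical values are illustrative assumptions.

```python
import math

def F(x, y, z):
    return x * x + y * y - 1.0

def G(x, y, z):
    return x * x + z * z - 1.0

# Base point p0 = (0, 1, 1): dF/dy(p0) = 2 and dG/dz(p0) = 2, and all other
# partial derivatives of F and G vanish there.
DFDY0 = 2.0
DGDZ0 = 2.0

def solve_yz(x, y=1.0, z=1.0, iterations=60):
    """Iterate T(y, z) = (y - F/DFDY0, z - G/DGDZ0) starting from (y0, z0) = (1, 1)."""
    for _ in range(iterations):
        y, z = y - F(x, y, z) / DFDY0, z - G(x, y, z) / DGDZ0
    return y, z

x = 0.3
g, h = solve_yz(x)   # approximations to g(x) and h(x)
```

For $x$ near 0 the iteration converges quickly to $g(x) = h(x) = (1 - x^2)^{1/2}$, so $t \to (t, g(t), h(t))$ traces the branch of the intersection through $p_0$.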
Examples
12. At what points in the plane is the set $e^{x+y} = y$ a curve? Let
$F(x, y) = e^{x+y} - y$. Then $\nabla F(x, y) = (e^{x+y}, e^{x+y} - 1)$. Since $\partial F/\partial x$
is never zero, this is everywhere a curve and the equation $e^{x+y} - y = 0$
determines $x$ as a function of $y$ implicitly. $\partial F/\partial y$ is zero when
$x + y = 0$. The only point on the curve where $e^{x+y} = y$ and $x + y = 0$
is $(-1, 1)$, so at that point we cannot expect to find $y$ as a function
of $x$.
Notice that even though we cannot explicitly determine the function
$x = f(y)$ given implicitly by $e^{x+y} = y$, we can find its derivative. For
$\exp(f(y) + y) - y = 0$
so upon differentiating we have
$\exp(f(y) + y)(f'(y) + 1) - 1 = 0$
or
$f'(y) = \exp[-(f(y) + y)] - 1$
13.
$F(x, y) = x\sin xy - \cos y$ (4.12)
$\nabla F(x, y) = (\sin xy + xy\cos xy, \; x^2\cos xy + \sin y)$
If $x > 1$, $\partial F/\partial y(x, y) \neq 0$, so (4.12) defines $y$ implicitly as a function
of $x$. Differentiating (4.12) with respect to $x$ we find
$\sin xy + x\cos(xy)(y + xy') + y'\sin y = 0$
or
$y' = -\frac{\sin xy + xy\cos xy}{\sin y + x^2\cos xy}$
14. $F(x, y, z) = x^3 y + y^2$, $G(x, y, z) = xyz + e^z$.
$$\nabla F = (3x^2 y,\ x^3 + 2y,\ 0) \qquad \nabla G = (yz,\ xz,\ xy + e^z)$$
$\nabla F$ and $\nabla G$ are dependent when
$$\frac{3x^2 y}{yz} = \frac{x^3 + 2y}{xz} = \frac{0}{xy + e^z}$$
These equations become
$$xy + e^z = 0 \quad\text{and}\quad y = x^3$$
or
$$3x^2 y = 0 = x^3 + 2y$$
The first pair has no solutions, and the second pair amounts to $x = 0$
and $y = 0$. But the set $F(x, y, z) = 0$, $G(x, y, z) = 0$ never intersects
this line (there $G = e^z \ne 0$), so everywhere on that set $\nabla F$ and $\nabla G$ are independent. Thus
$$\{(x, y, z) : F(x, y, z) = G(x, y, z) = 0\}$$
is a curve in $R^3$.
Comparison of Parametrizations
Now, we have seen that a given curve admits many parametrizations, and
it would be to our advantage to be able to single out a best possible one. In
the study of the motion of particles there is a distinguished parameter, that
of time. But as far as the geometric study is concerned we can take any
parametrization we care to, the only criterion being that of convenience.
Geometrically, a most convenient parameter, or measure, along the curve
is that of length as measured from a fixed point.
Before considering the particular parametrization by arc length, let us first
see how to compare two different parametrizations. Suppose Γ is a curve,
parametrized by
$$x = f(t) \qquad a \le t \le b$$
If $\sigma$ is a continuously differentiable function with nonzero derivative defined
on the interval $[\alpha, \beta]$ and taking values on the interval $[a, b]$, then the
composed function $f \circ \sigma$ also parametrizes $\Gamma$. That is, we can write $\Gamma$
as the image of
$$x = g(\tau) = f(\sigma(\tau)) \qquad \alpha \le \tau \le \beta$$
If $\tau$ increases as $t$ does, then these two parametrizations determine the same
sense of direction along the curve $\Gamma$. This sense of direction is called
orientation. We know from calculus that the necessary and sufficient
condition for $t$, $\tau$ to increase simultaneously along the curve is that $\sigma' > 0$
on the interval $[\alpha, \beta]$. We shall say that $\tau$ is an orientation-preserving
parameter if this condition is satisfied, and otherwise τ is orientation reversing.
On the other hand, if we started out with two different parametrizations
of a curve
$$\Gamma:\quad x = f(t) \quad\text{or}\quad x = g(\tau) \tag{4.13}$$
then there must exist a function $\sigma$ relating the two parameters. For each
point of $\Gamma$ corresponds to precisely one value of $t$ and precisely one value
of $\tau$. The correspondence
$$\tau \to g(\tau) = f(t) \to t$$
defines the function $\sigma$. We shall verify below that $\sigma$ is a differentiable
function of $\tau$ and we have $g(\tau) = f(\sigma(\tau))$.
Notice that, given the two parametrizations, so that $t = \sigma(\tau)$, we have by
the chain rule
$$g'(\tau) = f'(\sigma(\tau)) \cdot \sigma'(\tau) \tag{4.14}$$
Thus the vectors $g'(\tau)$ and $f'(t)$ are collinear when $t$, $\tau$ determine the same point,
and point in the same direction when $\sigma' > 0$, that is, when $g$, $f$ induce the
same orientation along $\Gamma$.
Definition 2. Let $\Gamma$ be a curve parametrized by $x = f(t)$, $a \le t \le b$. The
unit tangent vector to $\Gamma$ at $f(t)$ is the vector
$$T = \frac{f'(t)}{\|f'(t)\|}$$
By the above remarks we see that the unit tangent vector is the same no
matter what parametrization we choose so long as it induces the same
orientation. For if we have the two parametrizations (4.13), then by (4.14)
(since $\sigma' > 0$)
$$\frac{g'(\tau)}{\|g'(\tau)\|} = \frac{f'(\sigma(\tau))\,\sigma'(\tau)}{\|f'(\sigma(\tau)) \cdot \sigma'(\tau)\|} = \frac{f'(t)}{\|f'(t)\|}$$
when $t$, $\tau$ determine the same point of $\Gamma$.
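The independence of the unit tangent from the parametrization is easy to observe numerically. In the sketch below the sample curve and the change of parameter sigma (with sigma' > 0) are our own assumptions, and the derivative is approximated by a central difference rather than computed exactly.

```python
import math

def unit_tangent(curve, t, h=1e-6):
    """Approximate f'(t) / ||f'(t)|| by a central difference."""
    p, q = curve(t + h), curve(t - h)
    d = [(a - b) / (2 * h) for a, b in zip(p, q)]
    n = math.sqrt(sum(c * c for c in d))
    return [c / n for c in d]

def f(t):                      # sample curve (an assumption for illustration)
    return (t, t**2, t**3)

def sigma(tau):                # hypothetical change of parameter, sigma' > 0
    return tau**3 + tau

def g(tau):                    # the reparametrized curve g = f o sigma
    return f(sigma(tau))

tau = 0.7
T_f = unit_tangent(f, sigma(tau))   # tangent from f at t = sigma(tau)
T_g = unit_tangent(g, tau)          # tangent from g at tau
print(T_f, T_g)
```

Replacing sigma by a decreasing function would reverse the sign of the second vector, in agreement with the remark on orientation.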
Examples
15. Consider the unit circle, given parametrically by
$$z = e^{it}$$
Then $z' = ie^{it}$, which is a unit vector, so $T = ie^{it}$.
Notice: we have $T = iz$, so that the tangent vector is orthogonal to
the position vector.
More generally, consider the spiral $z = e^{at}$, where $a$ is a complex
number. Then $z' = ae^{at}$, so the tangent vector is $T = \exp i((\operatorname{Im} a)t + \arg a)$.
Notice that the angle between the tangent vector and the position
vector is
$$\arg T - \arg z = \arg a$$
Thus $T$, $z$ always make the same angle.
16. For the curve in space given by
$$x = (t,\ t^2,\ t^3)$$
we have
$$\frac{dx}{dt} = (1,\ 2t,\ 3t^2)$$
so
$$T(t) = (1 + 4t^2 + 9t^4)^{-1/2}(1,\ 2t,\ 3t^2)$$
Now, here is the verification of the fact that two parametrizations are
related by a continuously differentiable function.
Proposition 2. Let $\Gamma$ be a curve, and $f: [a, b] \to \Gamma$, $g: [\alpha, \beta] \to \Gamma$ two
parametrizations of $\Gamma$. Then there is a continuously differentiable function $\sigma$
mapping $[\alpha, \beta]$ one-to-one onto $[a, b]$ such that $g(\tau) = f(\sigma(\tau))$ for all $\tau \in [\alpha, \beta]$,
and $f(t) = g(\sigma^{-1}(t))$ for all $t \in [a, b]$.
Proof. Let $\tau \in [\alpha, \beta]$. Since $f$ maps $[a, b]$ one-to-one onto $\Gamma$ there is precisely
one $t \in [a, b]$ such that $f(t) = g(\tau)$. Define $\sigma(\tau) = t$. Then $\sigma$ is a well-defined
function from $[\alpha, \beta]$ to $[a, b]$. $\sigma$ is one-to-one. Suppose $\sigma(\tau_1) = \sigma(\tau_2)$. Then
$$g(\tau_1) = f(\sigma(\tau_1)) = f(\sigma(\tau_2)) = g(\tau_2)$$
Since $g$ is one-to-one we must have $\tau_1 = \tau_2$.
$\sigma$ maps $[\alpha, \beta]$ onto $[a, b]$. For if $t \in [a, b]$ there is a point $\tau \in [\alpha, \beta]$ such that
$f(t) = g(\tau)$. Clearly, then $t = \sigma(\tau)$.
We now have only to verify that $\sigma$ is a continuously differentiable function. Let
$\tau_0 \in [\alpha, \beta]$ and $t_0 = \sigma(\tau_0)$. Now $f$ is a differentiable function at $t_0$ and $f'(t_0) \ne 0$.
Let $f = (f_1, \ldots, f_n)$ in coordinates. There is a $j$ such that $f_j'(t_0) \ne 0$. Then
$f_j$ is a real-valued continuously differentiable function of a real variable and
since $f_j'(t_0) \ne 0$, it is invertible. That is, there is a function $h$ defined on the
range of $f_j$ near $t_0$ such that $h(f_j(t)) = t$ for $t$ near $t_0$. $h$ is also continuously
differentiable. Now, since $f(\sigma(\tau)) = g(\tau)$, we have $f_j(\sigma(\tau)) = g_j(\tau)$, so
$$\sigma(\tau) = h(f_j(\sigma(\tau))) = (h \circ g_j)(\tau)$$
Since $h$ and $g_j$ are continuously differentiable so is $\sigma$. The proposition is proven.
Without the requirement that the derivative of the parametrization is
nonzero we would in general not have such a good relationship between
different parametrizations. Notice that by the same argument the inverse
mapping $\sigma^{-1}$ to $\sigma$ is also continuously differentiable. Since $\sigma^{-1} \circ \sigma(\tau) = \tau$,
for all $\tau \in (\alpha, \beta)$, we must have, by the chain rule,
$$(\sigma^{-1})'(\sigma(\tau))\,\sigma'(\tau) = 1$$
so $\sigma'(\tau)$ is also never zero. If it is always positive, $\sigma$ is an increasing function
of $\tau$; if always negative, $\sigma$ is a decreasing function of $\tau$. Notice that if
$f$, $g$ are two parametrizations of a curve and they do reverse orientation,
then they will become compatible simply by negating one of the
parameters. Thus if $f$ is not compatible with $g$, then $\tilde f: [-b, -a] \to \Gamma$ defined by
$\tilde f(t) = f(-t)$ certainly is.
If $f: [a, b] \to \Gamma$ is a parametrization of a curve we shall call $f(a)$ the left
end point of $\Gamma$ and $f(b)$ the right end point.
The Tangent Line
Now, let $\Gamma$ be a curve in $R^n$, and $x_0$ a point on $\Gamma$. The tangent line to $\Gamma$
at $x_0$ is the straight line through $x_0$ which best approximates the curve. We
shall show that this is the line through the tangent vector and is given by this
equation
$$x = x_0 + tT(x_0) \qquad t \in R$$
The tangent line at $x_0$ can be computed as the limiting position of lines
through $x_0$ and nearby points $x_1$ on $\Gamma$, as $x_1 \to x_0$ (Figure 4.11). Let $L(x_1)$
be that line. Then $L(x_1)$ is the set of all vectors originating at $x_0$ and parallel
to $x_1 - x_0$. Let $f$ give a parametrization of $\Gamma$ so that $x_0 = f(t_0)$, $x_1 = f(t_1)$.
Now $L(x_1)$ is the set of points $x$ such that $x - x_0$ is parallel to
$$x_1 - x_0 = f(t_1) - f(t_0)$$
But that is the same as the set of points $x$ such that $x - x_0$ is parallel to
$$\frac{f(t_1) - f(t_0)}{t_1 - t_0}$$
Figure 4.11
Now $x_1 \to x_0$ is the same as $t_1 \to t_0$ and the limit of the difference quotient as
$t_1 \to t_0$ is $f'(t_0)$. Thus $L(x_1)$ tends to the line through $x_0$ and parallel to
$f'(t_0)$, as desired.
Examples
17. Consider the helix (Figure 4.9), given by the parametrization
$$f(t) = (a\cos t,\ a\sin t,\ bt)$$
Then
$$f'(t) = (-a\sin t,\ a\cos t,\ b)$$
$f$ is a positive parametrization if we take for the unit tangent
$$T(t) = (a^2 + b^2)^{-1/2}(-a\sin t,\ a\cos t,\ b) \tag{4.15}$$
(see Figure 4.12).
Figure 4.12
18. A damped helix (Figure 4.13) parametrized by
$$f(t) = (e^t\cos t,\ e^t\sin t,\ bt)$$
Thus
$$f'(t) = (e^t(\cos t - \sin t),\ e^t(\sin t + \cos t),\ b)$$
so we can take as the tangent vector
$$T(t) = (2e^{2t} + b^2)^{-1/2}(e^t(\cos t - \sin t),\ e^t(\sin t + \cos t),\ b) \tag{4.16}$$
Figure 4.13
Notice that the curve on the unit sphere swept out by the tangent
is the same for both helices (Figure 4.14), and that the functions
(4.15) and (4.16) give two different parametrizations of this curve.
If we consider the parameter $t$ as time, then the "moving point" described
by (4.15) has no tangential acceleration, whereas in (4.16) it is
accelerating exponentially.
Figure 4.14
19. A different helix is this one (Figure 4.15):
$$f(t) = (\cos t,\ \sin t,\ e^t)$$
Here we take as tangent vector
$$T(t) = (1 + e^{2t})^{-1/2}(-\sin t,\ \cos t,\ e^t)$$
(Figure 4.16). This again is a helix on the unit sphere which tends
to the equator as $t \to -\infty$ and winds rapidly around the north pole
as $t \to +\infty$. (Notice that the third component of $T(t)$ is
$$\frac{1}{(1 + e^{-2t})^{1/2}}$$
which tends to 1 as $t \to +\infty$ and to 0 as $t \to -\infty$.)
Figure 4.15
Figure 4.16
20. The intersection of a sphere and a cylinder (Figure 4.17)
$$x^2 + y^2 + z^2 = 1$$
$$(x - \tfrac12)^2 + y^2 = \tfrac14$$
In order to avoid the cross at $(1, 0, 0)$ we shall restrict attention to the
part of the curve lying above the $xy$ plane. Let us first parametrize
this curve. We shall use as parameter the angle $\theta$ as shown in the
figure. Then
$$x(\theta) = \tfrac12 + \tfrac12\cos\theta \qquad y(\theta) = \tfrac12\sin\theta$$
and $z(\theta)$ is the point on the unit sphere lying above $(x(\theta), y(\theta), 0)$,
thus $z(\theta)$ is the positive square root of $1 - (x(\theta))^2 - (y(\theta))^2$, which is
$$\left(\frac{1 - \cos\theta}{2}\right)^{1/2} = \sin\frac{\theta}{2}$$
Thus, we can parametrize this curve with the function
$$f(\theta) = \left(\tfrac12 + \tfrac12\cos\theta,\ \tfrac12\sin\theta,\ \sin\tfrac{\theta}{2}\right)$$
Then
$$f'(\theta) = \tfrac12\left(-\sin\theta,\ \cos\theta,\ \cos\tfrac{\theta}{2}\right)$$
and we can take as unit tangent
$$T(\theta) = \left(\frac{2}{3 + \cos\theta}\right)^{1/2}\left(-\sin\theta,\ \cos\theta,\ \cos\tfrac{\theta}{2}\right) \tag{4.17}$$
Notice that this does not parametrize $\Gamma$ at the point $(1, 0, 0)$, since
this point corresponds to both parametric values $0$, $2\pi$. In fact, $\Gamma$
is not a curve at the point $(1, 0, 0)$ since it does not have a unique
tangent line: the limiting position of (4.17) as $x \to (1, 0, 0)$ is either
$(0, 1, 1)/\sqrt2$ or $(0, 1, -1)/\sqrt2$!
• EXERCISES
1. Find a parametrization for the curve of intersection of the ellipsoid
x2 + iy2 + z2 = 1
with the cylinder
x2 + z2 = 1
2. Parametrize the intersection of the paraboloid $z = x^2 + y^2$ with the
unit sphere $x^2 + y^2 + z^2 = 1$.
3. At what points is the set defined (in polar coordinates in $R^2$) by
$r(1 + a\cos\theta) = 1$ a curve? Find a parametrization of the curve.
Figure 4.17
Figure 4.18
4. Consider the family of cardioids (Figure 4.18)
$$r = (1 + c)^{-1}(1 + c\cos\theta)$$
(a) Describe the behavior of this family as $c$ ranges between 0 and $+\infty$.
(b) For $c = 1$, $c = 2$, calculate the unit tangent vector to the curve as a
function of $\theta$.
5. What is the tangent vector to the curve $r = a\cos b\theta$? Graph the curve
fori = 1,2, 5,л/2.
6. Calculate the tangent lines to the following curves:
(a) f(0 = (e"'cosf, e-'sinf) at (1,0).
(b) f(x) = (x, sin-) at (1,0).
(c)
,=(*·8,η;)
f (0 = ie', τ^-j-, sin / j at (1,1,0).
(d) $x^2 + y^2 + z^2 = 4a^2$, $(x - a)^2 + y^2 = a^2$ at $(2a, 0, 0)$.
(e) $f(t) = (t, \cos t, \sin t)$ at $(0, 1, 0)$.
(f) f(/)=(/2,l-/',/) at (1,0,1).
7. Find the tangent line at the origin for these curves,
(a) e'+1,~y-l =0
(b)
(c)
(d)
cos xy = у + 1
χ' + y3z + sin ζ = 0
exp(sin (x^ + z)) = 1
е"г — cos xy = 0
x2 + У2 + z2 = χ + у + ζ
• PROBLEMS
1. A snail deposits calcium at the leading edge of its shell in a direction
which makes a fixed angle with the ray from the snail's center to the leading
edge. Show that this hypothesis explains the spiral form of a snail's shell.
2. Graph the curve $r = (1 + \theta^2)^{-1}(2 + \theta^2)$ and compute its tangent
vector.
3. Graph the curve in R3 given in spherical coordinates by r = e\ θ = t,
ζ = e'. Graph the curve on the unit sphere made by the tangent vector of
the given curve.
4.2 Arc Length
Definition 3. Let $\Gamma$ be an oriented curve positively parametrized by
$f: [a, b] \to \Gamma$. Let $a \le a_0 < b_0 \le b$. Define the length of $\Gamma$ between $f(a_0)$
and $f(b_0)$ to be the least upper bound of all sums
$$\sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\| \tag{4.18}$$
over all choices of points $t_0, \ldots, t_k$ such that
$$a_0 = t_0 < t_1 < \cdots < t_k = b_0$$
This definition has this description. Approximate (Figure 4.19) the curve
by a "broken line" joining a succession of points along $\Gamma$ between $f(a_0)$
and $f(b_0)$. Then the sum of the lengths of the line segments is less than,
and approximates, the length of the curve. Now, if the points $t_i$ and $t_{i-1}$
are very close, then the vector $f(t_i) - f(t_{i-1})$ is approximately equal to
$f'(t_i)(t_i - t_{i-1})$. If we replace this in (4.18) we get a sum
$$\sum_{i=1}^{k} \|f'(t_i)\|(t_i - t_{i-1}) \tag{4.19}$$
which is a Riemann sum approximating the integral
$$\int_{a_0}^{b_0} \|f'(t)\|\,dt \tag{4.20}$$
Figure 4.19
Of course, the substitutions taking us from (4.18) to (4.19) admit a small error
term by term, but since $k$ may be very large, we have no hold on the error
between (4.18) and (4.19). Nevertheless, we can, by being very careful,
justify that substitution and deduce that the limit of the lengths of the
approximating line segment curves is the integral (4.20).
Proposition 3. Let $\Gamma$ be a curve parametrized by $f: [a, b] \to \Gamma$. The
length of $\Gamma$ between $f(a_0)$ and $f(b_0)$ is given by the integral (4.20).
Proof. We will use the fundamental theorem of calculus to show this. Let
$s(t)$ be the length of $\Gamma$ between $f(a_0)$ and $f(t)$. We shall show that $s$ is a
differentiable function of $t$, and $s'(t) = \|f'(t)\|$.
Fix $t_0 \ge a_0$ and consider a $t \ge t_0$. If $S_0$ is any sum like (4.18) approximating the
length of $\Gamma$ between $f(a_0)$ and $f(t_0)$, then $S_0 + \|f(t) - f(t_0)\|$ is a sum like (4.18) for
the length between $f(a_0)$ and $f(t)$. Thus
$$S_0 + \|f(t) - f(t_0)\| \le s(t)$$
Taking the least upper bound over all such $S_0$, we obtain the inequality
$$s(t_0) + \|f(t) - f(t_0)\| \le s(t) \tag{4.21}$$
Now, suppose $S$ is a sum like (4.18) corresponding to a partition of the interval
$[a_0, t]$. We may suppose that $t_0$ is one of the points in this partition. For if not,
we can add it to the given partition, and get a still larger sum. Let $t_0 < t_1 < \cdots
< t_k = t$ be the points of the partition between $t_0$ and $t$. Then
$$S = S_0 + \sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\|$$
where $S_0$ is a sum corresponding to the interval $[a_0, t_0]$. Thus
$$S \le s(t_0) + \sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\| = s(t_0) + \sum_{i=1}^{k} \left\|\int_{t_{i-1}}^{t_i} f'(u)\,du\right\|$$
$$\le s(t_0) + \sum_{i=1}^{k} \int_{t_{i-1}}^{t_i} \|f'(u)\|\,du = s(t_0) + \int_{t_0}^{t} \|f'(u)\|\,du$$
Since this is true for all such sums $S$, we have
$$s(t) \le s(t_0) + \int_{t_0}^{t} \|f'(u)\|\,du \tag{4.22}$$
From inequalities (4.21) and (4.22), we obtain
$$\frac{\|f(t) - f(t_0)\|}{t - t_0} \le \frac{s(t) - s(t_0)}{t - t_0} \le \frac{1}{t - t_0}\int_{t_0}^{t} \|f'(u)\|\,du \tag{4.23}$$
As $t \to t_0$, both the left and right ends converge, since $f$ is continuously differentiable, to
$\|f'(t_0)\|$. Thus $s$ is differentiable at $t_0$, and $s'(t_0) = \|f'(t_0)\|$. Since this is valid
for $t_0$ between $a_0$ and $b_0$, we have the desired conclusion.
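The content of the proposition can be observed numerically: the broken-line sums (4.18) increase toward the integral (4.20) as the partition is refined. The sketch below is an illustration under our own assumptions: the curve is a helix with sample constants a = 2, b = 1, for which $\|f'(t)\| = (a^2 + b^2)^{1/2}$ is constant, and the partitions are equally spaced.

```python
import math

def helix(t, a=2.0, b=1.0):
    """Sample curve (a cos t, a sin t, b t); the constants are assumptions."""
    return (a * math.cos(t), a * math.sin(t), b * t)

def polygon_length(curve, t0, t1, k):
    """Sum of ||f(t_i) - f(t_{i-1})|| over k equal subintervals, as in (4.18)."""
    total, prev = 0.0, curve(t0)
    for i in range(1, k + 1):
        cur = curve(t0 + (t1 - t0) * i / k)
        total += math.dist(cur, prev)
        prev = cur
    return total

# Integral (4.20) for the helix on [0, 3]: constant speed times elapsed parameter.
exact = math.sqrt(2.0**2 + 1.0**2) * 3.0
for k in (4, 16, 64, 256):
    print(k, polygon_length(helix, 0.0, 3.0, k), exact)
```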
Now, if $\Gamma$ is a curve parametrized by $f: [a, b] \to \Gamma$, we can consider arc
length as a function along $\Gamma$. Precisely, let $s(t)$ be the length of the piece of
$\Gamma$ from $f(a)$ to $f(t)$. Then, from the above proposition,
$$s(t) = \int_a^t \|f'(u)\|\,du$$
Since $s'(t) = \|f'(t)\| > 0$, we can parametrize $\Gamma$ by arc length, and it induces
the same orientation as the original parametrization. Thus $g(s)$, for every
$s$, is the point on $\Gamma$ of distance $s$ from $f(a)$: $g(s(t)) = f(t)$. If $L$ is the length of
$\Gamma$ from $f(a)$ to $f(b)$, $g: [0, L] \to \Gamma$ parametrizes $\Gamma$. Notice that
$$f'(t) = g'(s(t)) \cdot s'(t) = g'(s(t)) \cdot \|f'(t)\|$$
so that
$$g'(s(t)) = \frac{f'(t)}{\|f'(t)\|}$$
Thus $g'(s)$ is the unit tangent to $\Gamma$ at $g(s)$.
Examples
21. The circle $x^2 + y^2 = a^2$. Parametrize this circle by
$$x = a\cos\theta \qquad y = a\sin\theta$$
Thus
$$f(\theta) = (a\cos\theta,\ a\sin\theta)$$
$$f'(\theta) = a(-\sin\theta,\ \cos\theta)$$
$$\|f'(\theta)\| = a$$
Thus arc length is given by
$$s = s(\theta) = \int_0^\theta a\,d\varphi = a\theta$$
The parametrization according to arc length is thus given by
substituting $s = a\theta$:
$$x = g(s) = \left(a\cos\frac{s}{a},\ a\sin\frac{s}{a}\right)$$
The unit tangent vector is given by
$$T(s) = \left(-\sin\frac{s}{a},\ \cos\frac{s}{a}\right)$$
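22. Consider the helix of Example 17 given by
$$f(t) = (a\cos t,\ a\sin t,\ bt)$$
Then
$$f'(t) = (-a\sin t,\ a\cos t,\ b)$$
$$\|f'(t)\| = (a^2 + b^2)^{1/2}$$
Thus $s = s(t) = (a^2 + b^2)^{1/2}\,t$, and the arc length parametrization is
$$g(s) = \left(a\cos\frac{s}{(a^2 + b^2)^{1/2}},\ a\sin\frac{s}{(a^2 + b^2)^{1/2}},\ \frac{b\,s}{(a^2 + b^2)^{1/2}}\right)$$
The tangent vector is
$$T(s) = (a^2 + b^2)^{-1/2}(-a\sin t,\ a\cos t,\ b) \qquad\text{where } t = \frac{s}{(a^2 + b^2)^{1/2}}$$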
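A parametrization by arc length should have speed identically 1. The following sketch checks this for the helix; the constants A and B are sample values of our choosing, and the speed is estimated by a central difference, so it is a numerical illustration rather than a proof.

```python
import math

A, B = 2.0, 1.0                       # sample helix constants (assumed values)
C = math.sqrt(A * A + B * B)          # the constant speed ||f'(t)||

def g(s):
    """Arc length parametrization: substitute t = s / C into f."""
    t = s / C
    return (A * math.cos(t), A * math.sin(t), B * t)

def speed(curve, s, h=1e-5):
    """Central-difference estimate of ||g'(s)||."""
    return math.dist(curve(s + h), curve(s - h)) / (2 * h)

print([speed(g, s) for s in (0.0, 1.0, 2.5)])
```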
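23. The curve of Example 20 has the parametrization
$$f(\theta) = \left(\tfrac12 + \tfrac12\cos\theta,\ \tfrac12\sin\theta,\ \sin\tfrac{\theta}{2}\right)$$
and we find
$$\|f'(\theta)\| = \frac{1}{2\sqrt2}(3 + \cos\theta)^{1/2}$$
so
$$s(\theta) = \frac{1}{2\sqrt2}\int_0^\theta (3 + \cos\varphi)^{1/2}\,d\varphi$$
and the unit tangent vector is given as
$$T(\theta) = \left(\frac{2}{3 + \cos\theta}\right)^{1/2}\left(-\sin\theta,\ \cos\theta,\ \cos\tfrac{\theta}{2}\right)$$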
Equations of Motion
Now we shall consider in greater detail the equations of a particle in
motion. Suppose a particle moves through $R^n$ along the path given by
$x = x(t)$. The velocity at time $t$ is $x'(t)$, and the acceleration is $x''(t)$. These
are vector-valued functions describing the instantaneous change in the
motion (direction and magnitude) of the particle. The speed of the particle
is the rate at which the distance covered changes, and thus is the time
derivative, $ds/dt$, of arc length. As we have seen above, this is the magnitude
of the velocity. Thus
$$\text{velocity} = \frac{dx}{dt} \qquad \text{speed} = \frac{ds}{dt} = \left\|\frac{dx}{dt}\right\| \tag{4.24}$$
Now, it is instructive to decompose the acceleration vector into a component
tangent to the curve, and a component orthogonal to the curve. We write
$$\text{acceleration} = \frac{d^2x}{dt^2} = a_T T + a_N N$$
where $T$ is the tangent vector and $N$ is a unit vector orthogonal to $T$ and lying
in the plane spanned by the velocity and acceleration vectors. $N$ is called
the principal normal to the curve of motion, $a_T$ is the tangential acceleration
of the particle, and $a_N$ is the normal acceleration. We now show how to
compute these components of the acceleration. Differentiate the equation
$$\frac{dx}{dt} = \frac{ds}{dt}\,T$$
obtaining
$$\frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \frac{ds}{dt}\frac{dT}{dt} \tag{4.25}$$
Now $dT/dt$ is orthogonal to $T$, since $T$ is a unit vector. Differentiate
$$\langle T, T\rangle = 1$$
We then have
$$\langle T', T\rangle + \langle T, T'\rangle = 2\langle T, T'\rangle = 0 \tag{4.26}$$
Thus we can take for the normal vector the unit vector in the direction $dT/dt$:
$$N = \frac{dT/dt}{\|dT/dt\|} = \frac{dT/ds}{\|dT/ds\|} \tag{4.27}$$
(Of course, the differentiation in (4.25) could have been with respect to arc
length as well as time.) Let $\kappa = \|dT/ds\|$. This is called the curvature of the
path of motion. Then $dT/ds = \kappa N$, and (4.25) becomes
$$\frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \frac{ds}{dt}\,\frac{dT}{ds}\,\frac{ds}{dt}$$
$$\text{acceleration} = \frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \kappa\left(\frac{ds}{dt}\right)^2 N$$
Thus the tangential acceleration is the rate of change of the speed, and the
normal acceleration is proportional to the curvature, or bending, of the
curve.
$$a_T = \frac{d^2s}{dt^2} \qquad a_N = \left(\frac{ds}{dt}\right)^2\kappa \tag{4.28}$$
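In computations it is often convenient to obtain $a_T$, $a_N$ and $\kappa$ directly from the velocity and acceleration vectors. The sketch below is one way to do this, under the assumption that the velocity is nonzero: it projects the acceleration on $T$, takes the length of the remainder as $a_N$, and solves (4.28) for $\kappa$. The function name decompose is ours; the sample data are those of the motion $x(t) = (t - 1,\ 2t - t^2)$ at $t = 0$.

```python
import math

def decompose(v, a):
    """Split acceleration a into tangential and normal parts, given velocity v.

    Returns (a_T, a_N, kappa), using a_T = <a, T>, a_N = ||a - a_T T||
    and kappa = a_N / speed**2, which is (4.28) solved for the curvature.
    """
    speed = math.sqrt(sum(c * c for c in v))
    T = [c / speed for c in v]
    a_T = sum(x * y for x, y in zip(a, T))
    normal_part = [x - a_T * y for x, y in zip(a, T)]
    a_N = math.sqrt(sum(c * c for c in normal_part))
    return a_T, a_N, a_N / speed**2

# Sample motion x(t) = (t - 1, 2t - t^2), evaluated at t = 0.
t = 0.0
v = (1.0, 2.0 * (1.0 - t))     # velocity
a = (0.0, -2.0)                # acceleration
print(decompose(v, a))
```

Since $T$ and $N$ are orthonormal, $a_T^2 + a_N^2$ always equals the squared length of the acceleration, which gives a quick consistency check.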
Examples
24. Suppose a particle moves along the parabola $y = 1 - x^2$
according to these equations
$$x = t - 1 \qquad y = 2t - t^2$$
Then
$$\mathbf{x} = (t - 1,\ 2t - t^2)$$
$$\frac{d\mathbf{x}}{dt} = (1,\ 2(1 - t))$$
$$\frac{d^2\mathbf{x}}{dt^2} = (0,\ -2)$$
Thus the motion of the particle is determined by a downward vertical
acceleration of constant magnitude (perhaps due to gravity) (see
Figure 4.20). The speed of the particle is
$$\frac{ds}{dt} = \left\|\frac{d\mathbf{x}}{dt}\right\| = (1 + 4(1 - t)^2)^{1/2} \tag{4.29}$$
Thus we see that the speed is decreasing until time $t = 1$ (the
maximum height of the trajectory), and then increases. The tangent
vector to the path of motion is
$$T = (1 + 4(1 - t)^2)^{-1/2}(1,\ 2(1 - t))$$
Figure 4.20
and so
$$\frac{dT}{dt} = \frac{2}{[1 + 4(1 - t)^2]^{3/2}}\,(2(1 - t),\ -1)$$
The normal vector is the unit vector in this direction:
$$N = (1 + 4(1 - t)^2)^{-1/2}(2(1 - t),\ -1) \tag{4.30}$$
Now
$$\frac{dT}{ds} = \frac{dT/dt}{ds/dt} = \frac{2}{(1 + 4(1 - t)^2)^2}\,(2(1 - t),\ -1) = \frac{2}{[1 + 4(1 - t)^2]^{3/2}}\,N$$
Thus the curvature of the path of motion is
$$\kappa = \frac{2}{(1 + 4(1 - t)^2)^{3/2}}$$
And finally
$$a_T = \frac{d^2s}{dt^2} = \frac{-4(1 - t)}{(1 + 4(1 - t)^2)^{1/2}} \qquad a_N = \left(\frac{ds}{dt}\right)^2\kappa = \frac{2}{(1 + 4(1 - t)^2)^{1/2}}$$
The length of the trajectory from $x = -1$ to $x = +1$ is
$$\int_0^2 \frac{ds}{dt}\,dt = \int_0^2 (1 + 4(1 - t)^2)^{1/2}\,dt$$
25. (Rotation) (Figure 4.21). Suppose now that a particle rotates
around the unit circle according to the equations
$$x = \cos(e^t) \qquad y = \sin(e^t)$$
Then
$$\mathbf{x} = (\cos(e^t),\ \sin(e^t))$$
$$\frac{d\mathbf{x}}{dt} = e^t(-\sin(e^t),\ \cos(e^t)) \tag{4.31}$$
$$\frac{d^2\mathbf{x}}{dt^2} = e^t(-\sin(e^t),\ \cos(e^t)) - e^{2t}(\cos(e^t),\ \sin(e^t)) \tag{4.32}$$
Figure 4.21
Now we already know, just from geometric considerations, what
are the tangent and normal to the path of motion:
$$T = (-\sin(e^t),\ \cos(e^t)) \qquad N = -(\cos(e^t),\ \sin(e^t))$$
Thus (4.32) can be written as
$$\frac{d^2\mathbf{x}}{dt^2} = e^t\,T + e^{2t}\,N$$
Thus the normal acceleration is the square of the tangential
acceleration. From (4.31) we read
$$\frac{ds}{dt} = \left\|\frac{d\mathbf{x}}{dt}\right\| = e^t$$
thus
$$s = e^t$$
and the curvature of the unit circle is 1.
Notice that any motion on the unit circle can be written in the form
$$\mathbf{x} = (\cos(f(t)),\ \sin(f(t)))$$
where $f(t)$ represents arc length as a function of time. Since the curvature
of the unit circle is 1, we obtain for any circular motion
$$\text{acceleration} = \frac{d^2s}{dt^2}\,T + \left(\frac{ds}{dt}\right)^2 N$$
The tangential acceleration is the rate of change of speed, and the normal
acceleration is the square of the speed.
Figure 4.22
26. Now let us consider the motion of an object down a slide (see
Figure 4.22). The slide will be represented by the curve Γ. Let
$z = z(t) = x(t) + iy(t)$ be the equation of motion of the particle. The
acceleration is $z''(t)$; according to Newton's laws
$$mz'' = F$$
where $m$ is the mass of the object, and $F$ is the sum of the forces
acting on the object. One such force is the force due to gravity
which is $mg$, where $g$ is the gravitational field. The other force is
the restraining force due to the curve. This force acts in a direction
normal to the curve, and has undetermined magnitude. (That is, its
magnitude is determined only by the object.) Let us call this force
$\varphi N$, where $\varphi$ is a scalar and $N$ is the normal to the curve. Thus we
have
$$mz'' = mg + \varphi N$$
Now, since we know the path of motion, we need only determine the
tangential acceleration $a_T$. By Equation (4.28), we have
$$\frac{d^2s}{dt^2} = a_T = \langle z'', T\rangle = \langle g, T\rangle \tag{4.33}$$
where $T$ is the tangent of the curve. If we consider the curve as
parametrized by arc length: $z = f(s)$ is the equation of the curve,
then the tangent vector is $f'(s)$. Then Equation (4.33) becomes
$$\frac{d^2s}{dt^2} = \langle g, f'(s)\rangle$$
and the speed can be found as the solution to this differential equation
with initial conditions $s(0) = s'(0) = 0$.
For specific examples, let us first consider the curve to be a straight
line (Figure 4.23) with equation
$$z(s) = i + s\xi_0$$
where $\xi_0 = a + ib$ is a unit vector in the third quadrant ($b < 0$). Then
$T(s) = \xi_0$ is constant, and the force due to gravity is $-ig$. The speed
is thus found as the solution of the differential equation
$$\frac{d^2s}{dt^2} = \langle -ig,\ a + ib\rangle = -gb$$
$$s(0) = s'(0) = 0$$
Figure 4.23
Thus $s(t) = -(gbt^2)/2$ and the equation of motion is
$$z = z(t) = i - (\tfrac12 gbt^2)\,\xi_0$$
Figure 4.24
27. Suppose now the curve is a semicircle (Figure 4.24)
$$z(s) = \sin s + i\cos s$$
Then
$$T(s) = \cos s - i\sin s$$
and the speed is the solution of the differential equation
$$\frac{d^2s}{dt^2} = \langle -ig,\ \cos s - i\sin s\rangle = g\sin s$$
$$s(0) = s'(0) = 0$$
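The differential equations of Examples 26 and 27 can be integrated numerically. The sketch below uses the classical fourth-order Runge-Kutta method, which is our choice and not a method discussed in the text; the numerical values of $g$ and $b$ are assumptions. For the straight slide the result can be compared with the closed form $s(t) = -gbt^2/2$. For the semicircle, note that with $s(0) = s'(0) = 0$ the right-hand side $g\sin s$ vanishes and the object rests at the top, so the sketch gives it a small initial speed instead.

```python
import math

def integrate(accel, s0, v0, t_end, n=2000):
    """Integrate s'' = accel(s) with RK4; returns (s, s') at time t_end."""
    h = t_end / n
    s, v = s0, v0
    for _ in range(n):
        k1s, k1v = v, accel(s)
        k2s, k2v = v + 0.5 * h * k1v, accel(s + 0.5 * h * k1s)
        k3s, k3v = v + 0.5 * h * k2v, accel(s + 0.5 * h * k2s)
        k4s, k4v = v + h * k3v, accel(s + h * k3s)
        s += h * (k1s + 2 * k2s + 2 * k3s + k4s) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
    return s, v

GRAV = 32.0     # assumed magnitude of g (feet/second^2)
b = -0.6        # assumed imaginary part of the unit vector xi_0 (third quadrant)

# Straight slide: s'' = -g b, with closed form s(t) = -g b t^2 / 2.
s_line, _ = integrate(lambda s: -GRAV * b, 0.0, 0.0, 1.5)
print(s_line, -GRAV * b * 1.5**2 / 2)

# Semicircular slide: s'' = g sin s, started with a small push (assumption).
s_arc, v_arc = integrate(lambda s: GRAV * math.sin(s), 0.0, 0.1, 0.5)
print(s_arc, v_arc)
```

For the semicircle the quantity $\tfrac12 (s')^2 + g\cos s$ is constant along solutions, which provides a check on the integration.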
Rotating Plates
28. We can describe the motion of a rotating flat circular plate by
referring to the angle as a function of time. Let a line through the
center of the plate be chosen at time $t = 0$ and let $\theta(t)$ be the angle
this line makes at time $t$ with its original position. Then a point at
$z_0$ at time $t = 0$ follows the path of motion
$$z = z_0 e^{i\theta(t)}$$
Its velocity is $iz_0\theta' e^{i\theta(t)}$, so its speed is $|z_0|\theta'$. The acceleration of
the point is found by differentiating further:
$$z''(t) = iz_0\theta'' e^{i\theta(t)} - z_0(\theta')^2 e^{i\theta(t)} \tag{4.34}$$
Thus the tangential acceleration is $|z_0|\theta''$ and the radial acceleration
has magnitude $|z_0|(\theta')^2$, directed toward the center.
If there is an object of mass $m$ on the plate, a force $mz''(t)$ is required
so that the object will follow the motion of the plate. Friction may
provide this force. Notice that the central component of this force
has magnitude $m|z_0|(\theta')^2$, so even if there is no angular acceleration, friction must
do its job. The further the object is from the center, or the faster the
plate spins, the greater the force required. It is this principle which
explains the centrifuge, which settles precipitates in solution by
spinning the fluid.
29. Suppose now we have a curved circular plate spinning at
constant angular velocity, and there is a ball of mass m in the plate
(Figure 4.25). Assuming there is no friction, we can describe the
motion of the ball in terms of the initial data.
Let us use cylindrical coordinates $r$, $\theta$, $z$ in $R^3$, so that the plate is
given by the equation $z = f(r)$. In Figure 4.25 we depict a planar
section of the plate. Let
$$r = r(t) \qquad \theta = \theta(t) \qquad z = z(t)$$
be the equations of motion of the ball, and let $\alpha$ be the angular velocity
of the plate. Since there is no friction, the ball rotates as does the
plate, so $\theta = \theta(t) = \alpha t$. Since the ball is constrained to lie on the
plate we must have $z(t) = f(r(t))$ for all $t$. Thus we have
$$x(t) = (r(t)e^{i\alpha t},\ f(r(t)))$$
as the equation of motion, and we must find, using Newton's laws,
the function $r(t)$. Now the acceleration is
$$x'' = ((r'' - \alpha^2 r + 2i\alpha r')e^{i\alpha t},\ f''(r)(r')^2 + f'(r)r'') \tag{4.35}$$
Figure 4.25
Letting $g$ be the gravitational field, $g = -(0, g)$, we have a force
$mg$ due to gravity. There is another force, that which restrains the
motion to the profile of the plate. This acts in a direction normal
to the plate and has undetermined magnitude. Let $\varphi N$ denote this
force. There is a third force acting on the ball, due to the rotation
of the plate, and the direction of this force is tangential to the circle on
which the ball lies. We shall denote this force by $C$. Then, by
Newton's laws
$$\varphi N + C + mg = mx'' \tag{4.36}$$
Let us equate coordinates. Now, since $N$ is normal to the surface,
it lies in the plane through the $z$ axis and the ball (the $rz$ plane) and is
normal to the curve $z = f(r)$. Thus $N = (n_1 e^{i\alpha t},\ n_2)$ and
$$\frac{n_2}{n_1} = -(f'(r))^{-1}$$
(since $n_2/n_1$ is the slope of the line perpendicular to the curve $z = f(r)$).
Since $C$ is tangent to the circle on which the ball lies, $C = (ice^{i\alpha t},\ 0)$.
The magnitude $c$ of $C$ is yet to be determined. Finally, $g$ is vertical,
so $g = -(0, g)$. Using (4.35) and substituting these values in (4.36)
we have these three equations as a result:
$$\varphi n_1 = m(r'' - \alpha^2 r)$$
$$c = 2\alpha r' m$$
$$-mg + \varphi n_2 = m(f''(r)(r')^2 + f'(r)r'')$$
Thus, eliminating $\varphi$ from the first and last equations, we find that
$r = r(t)$ is a solution of the differential equation
$$(1 + f'(r)^2)\,r'' = \alpha^2 r - f'(r)f''(r)(r')^2 - f'(r)g \tag{4.37}$$
For what kind of a plate will it be true that the ball will not move
up or down, once released, no matter what its position? We must
have $r' = r'' = 0$, so (4.37) becomes
$$\alpha^2 r = f'(r)g$$
Solving, we obtain $f(r) = (\alpha^2/2g)r^2$. Thus if the plate is a paraboloid
of revolution, we can rotate it at a suitable angular velocity so that it
will have this property.
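The conclusion can be confirmed by evaluating the right-hand side of (4.37) for a ball at rest relative to the plate. In this sketch the values of $\alpha$ and $g$ are assumptions, and the conical profile $f(r) = r$ is included only for contrast.

```python
def r_accel(r, r_dot, alpha, g, fp, fpp):
    """Equation (4.37) solved for r''; fp and fpp are f' and f''."""
    return (alpha**2 * r - fp(r) * fpp(r) * r_dot**2 - fp(r) * g) / (1 + fp(r)**2)

alpha, g = 3.0, 32.0                       # assumed sample values

# Paraboloid f(r) = alpha^2 r^2 / (2g): f'(r) = alpha^2 r / g, f''(r) = alpha^2 / g.
fp = lambda r: alpha**2 * r / g
fpp = lambda r: alpha**2 / g

# With r' = 0 the radial acceleration should vanish at every radius.
print([r_accel(r, 0.0, alpha, g, fp, fpp) for r in (0.1, 0.5, 2.0)])

# For contrast, a cone f(r) = r balances only at the single radius r = g / alpha^2.
print(r_accel(1.0, 0.0, alpha, g, lambda r: 1.0, lambda r: 0.0))
```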
30. Suppose we are given a field of force in space, and the initial
position and velocity of a particle. Then we can find the path of
motion of that particle. For example, suppose the force field is
F(x) = — x, and the initial position and velocity of the particle are
x0, v0. Then the path of motion is given by the solution of this
differential equation:
$$f''(t) = -f(t)$$
$$f(0) = x_0 \qquad f'(0) = v_0$$
We know the solution; it is
$$f(t) = \cos t \cdot x_0 + \sin t \cdot v_0$$
Thus the path of the particle is an ellipse in the plane determined by
the vectors $x_0$, $v_0$. If $x_0$, $v_0$ are orthogonal the major and minor
semiaxes have lengths $\|x_0\|$, $\|v_0\|$ (see Figure 4.26). The velocity vector is
$$f'(t) = -\sin t \cdot x_0 + \cos t \cdot v_0$$
and the speed is the length of this vector.
Figure 4.26
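As a check on Example 30, the solution $f(t) = \cos t \cdot x_0 + \sin t \cdot v_0$ can be substituted into the equation numerically. The initial data below are assumptions, and the second derivative is approximated by a second difference, so agreement is only up to discretization error.

```python
import math

def path(t, x0, v0):
    """f(t) = cos(t) x0 + sin(t) v0, computed coordinate by coordinate."""
    return [math.cos(t) * a + math.sin(t) * b for a, b in zip(x0, v0)]

x0, v0 = (1.0, 0.0, 2.0), (0.0, 3.0, 1.0)    # assumed initial position and velocity
t, h = 0.8, 1e-4

# Second difference approximating f''(t); it should agree with -f(t).
second = [(p - 2 * q + r) / h**2
          for p, q, r in zip(path(t + h, x0, v0), path(t, x0, v0), path(t - h, x0, v0))]
print(second)
print([-c for c in path(t, x0, v0)])
```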
31. Suppose we have a force field in the plane which is of the same
magnitude as the position vector, but orthogonal to it. Using
complex variables on the plane, the force field is given by
F(z) = iz or — iz
Let us assume it is the former. Suppose a particle has initial position
z0 and velocity v0. Then, the motion is found by solving
$$f''(t) = if(t) \qquad f(0) = z_0 \qquad f'(0) = v_0$$
The solution is of the form
$$f(t) = Ae^{\alpha t} + Be^{-\alpha t}$$
where $\alpha = \sqrt{i} = (1 + i)/\sqrt2$. We solve for $A$, $B$ by substituting the
initial conditions,
$$f(0) = z_0 = A + B$$
$$f'(0) = v_0 = \alpha(A - B)$$
Thus
$$f(t) = \frac{\alpha z_0 + v_0}{2\alpha}\,e^{\alpha t} + \frac{\alpha z_0 - v_0}{2\alpha}\,e^{-\alpha t}$$
Suppose $z_0 = 1$, $v_0 = 0$. Then
$$f(t) = \tfrac12(e^{\alpha t} + e^{-\alpha t})$$
For large positive $t$, the second term is negligible, and the curve is very close to
$$z = \tfrac12 e^{\alpha t}$$
which we know is an outgoing counterclockwise spiral. For large negative
$t$, the second term $e^{-\alpha t}$ is dominant and that gives an incoming clockwise
spiral. Thus the particle comes spiraling in from outer space and then at
time $t = 0$ pauses for a breath and then goes racing back from whence it
came. (See Figure 4.27.)
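The behavior described in Example 31 can be tabulated with complex arithmetic. The sketch below is an illustration under our own conventions: ALPHA is the square root of $i$ in the first quadrant, and the coefficients are obtained from $A + B = z_0$, $\alpha(A - B) = v_0$. It prints the modulus and argument of the solution at a few times.

```python
import cmath

ALPHA = cmath.exp(1j * cmath.pi / 4)       # a square root of i, (1 + i)/sqrt(2)

def motion(t, z0, v0):
    """Solution of f'' = i f with f(0) = z0, f'(0) = v0."""
    A = (z0 + v0 / ALPHA) / 2
    B = (z0 - v0 / ALPHA) / 2
    return A * cmath.exp(ALPHA * t) + B * cmath.exp(-ALPHA * t)

z0, v0 = 1.0, 0.0
for t in (-4.0, -2.0, 0.0, 2.0, 4.0):
    z = motion(t, z0, v0)
    print(t, abs(z), cmath.phase(z))
```

With $v_0 = 0$ the solution is an even function of $t$, so the incoming and outgoing branches are mirror images in time.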
Figure 4.27
• EXERCISES
8. Find arc length as a function of the parameter for each of the following
curves
(a) $r(1 + a\cos\theta) = 1$
(b) $r = 1 + 2\cos\theta$
(c) The curves in Exercises 6(a)(b)(d)(f), and 7(a).
9. Parametrize these curves according to arc length, and find the curvature
and normal:
(a) $x^2 + y^2 = 1$, $x^2 + z^2 = 1$.
(b) The curves in Exercise 8(a)(b).
(c) The curve in Exercise 6(a)(e).
(d) The curve of Example 22.
(e) The curve of Example 23.
10. Find the normal and tangential accelerations for these planar
motions:
(a) $z(t) = \exp((1 - i)t)$ (c) $z(t) = (1 + 2\cos t)e^{it}$
(b) x(t) = t<,y(t) = t* (d) z(t) = t + e"
11. Find the normal and tangential accelerations of these motions in
space:
(a) $x(t) = (t, \sin t, \sin t)$ (c) $x(t) = t(\sin t, \cos t, 1)$
(b) $x(t) = (e^t, e^{-t}, t^2)$
• PROBLEMS
4. The graph of a differentiable function $y = f(x)$ is a curve in the plane.
Find the curvature as a function of $x$.
5. The graph of a differentiable $R^2$-valued function $y = f(x)$, $z = g(x)$ is a
curve in space. Find its curvature as a function of $x$.
6. A skier has to negotiate a series of hills whose profile is the curve
$y = e^{-x}\cos x$ (Figure 4.28). There are three forces acting on the skier:
that due to gravity, the restraining force of the hills, and a force due to
friction which is proportional to his velocity. Find the differential equation
describing his motion.
7. I shot an arrow into the sky at an initial velocity of 80 feet/second and
at an angle of $\pi/3$ with the horizontal. The gravitational field is vertical
downward with a magnitude of 32 feet/second$^2$. The air drags the arrow
with a force of 0.05 times its velocity. Find the equation of motion, and
the curvature of the curve of motion (the arrow weighs one pound).
8. In Example 26, let $\kappa$ be the curvature of the slide. Show that the
magnitude of the constraining force due to the slide is $f = (ds/dt)^2\kappa - \langle g, N\rangle$.
Find the differential equations which determine $x(t)$, $y(t)$. Write out these
equations when the slide is the curve $y = \cos x$.
9. Suppose we have a field of force in space given by $F(x, y, z) =
(-y, x, z)$. Find the path of motion of a particle which at time $t = 0$ is at
$(1, 1, 1)$ with velocity $(-1, -1, 1)$.
Figure 4.28
Figure 4.29
10. Suppose a race track is formed by rotating the curve $(x - 1)^2 + z^2 = 1$,
$-1 \le z \le 0$ around the $z$ axis. (The surface is, in cylindrical coordinates,
$(r - 1)^2 + z^2 = 1$, Figure 4.29.) A cyclist cycling around the track tends to
ride up the bank as he goes faster. Explain that.
11. Water is at rest in a very large sink when a stopper is removed in the
bottom center of the sink. An idealization of the ensuing motion is as
follows. The water accelerates toward the hole. The forces acting on each
particle of water are due to gravity and the mass of the fluid itself. The
field due to the former is $-(0, 0, g)$ and the field due to the latter operates
as if the particle were on an inclined plane with vertex at the hole. Find
the resultant force field. Find the differential equation giving the rate of
rotation around the hole.
12. We must send a ball of unit mass over a hill whose profile is the curve
$y = \exp(-x^2)$ from $x = -1$ to $x = 1$. What minimum initial speed is
required to ensure that the ball maneuvers this hill?
13. Suppose we are given in space a force field which is directed toward
the origin and so that its component in the ζ direction is always 1. Find
the path of motion of a particle which is at rest at time t = 0 at the point
(1,1,1).
4.3 Local Geometry of Curves
We have seen, from the physical problems discussed, that the higher-order
derivatives of a function parametrizing a curve have some significance. In
this section we will discuss the higher-order invariants of a curve; that is, those
concepts which depend only on the geometry and not on the particular
parametrization.
Let $\Gamma$ be a curve in $R^n$. For purposes of simplicity, we shall take $\Gamma$ to be
parametrized by arc length by $x = x(s)$. If $\Gamma$ is twice differentiable, the tangent
vector $T(s) = x'(s)$ is a differentiable function. Since $\langle T(s), T(s)\rangle = 1$ for
all $s$, we obtain through differentiation $2\langle T(s), T'(s)\rangle = 0$. Thus at any
point $T'$ is orthogonal to $T$.
Definition 4. The normal line to $\Gamma$ at $x_0 = x(s_0)$ is the line through $x_0$
and parallel to the vector $T'(s_0)$. The osculating (or tangent) plane to $\Gamma$ at
$x_0$ is the plane spanned by the tangent and normal lines.
The name osculating plane is quite descriptive. This plane osculates in
the following sense.
Proposition 4. Let $x_0, x_1, x_2$ be three points on the curve Γ. If they are
noncollinear, they determine a plane. This plane has the osculating plane as
limiting position, as $x_1, x_2$ tend to $x_0$.
Proof. In order to determine the limiting position of the plane through $x_0, x_1, x_2$
it suffices to find two independent vectors which are limits of vectors on the variable
plane. The easiest way to do this is to refer to the Taylor expansion of the arc
length parametrization. Suppose $f: (a, b) \to \Gamma$ parametrizes Γ with respect to arc
length, and $f(0) = x_0$. For simplicity we may assume $x_0$ is the origin, 0. According
to Theorem 4.1 we can write
$f(s) = T(0)s + \tfrac{1}{2}T'(0)s^2 + \epsilon(s)s^2$   (4.38)
where $\lim_{s \to 0} \epsilon(s) = 0$.
Let $x_1 = f(s_1)$, $x_2 = f(s_2)$. Since $x_0 = f(0) = 0$, the plane $\pi(s_1, s_2)$ through
$x_0, x_1, x_2$ is the plane spanned by the vectors $f(s_1), f(s_2)$. Now, for each $s_1$,
$s_1^{-1}f(s_1)$ is on $\pi(s_1, s_2)$, and
$s_1^{-1}f(s_1) = T(0) + \tfrac{1}{2}T'(0)s_1 + \epsilon(s_1)s_1$
Letting $s_1 \to 0$, that says that $\lim s_1^{-1}f(s_1) = T(0)$ is on the limiting plane. Now, to
find another vector on the limiting plane, we take an appropriate combination of
$f(s_1), f(s_2)$ so as to dispose of the T(0)s term in the Taylor expansion (4.38). Thus, we
consider
$s_2 f(s_1) - s_1 f(s_2) = \tfrac{1}{2}T'(0)(s_2 s_1^2 - s_1 s_2^2) + \epsilon(s_1)s_1^2 s_2 - \epsilon(s_2)s_1 s_2^2$   (4.39)
We are interested in finding some vector of this form which has a limit as $s_1, s_2$
tend to zero. Let us take the special case $s_1 = 2s_2 = 2s$; (4.39) becomes
$T'(0)s^3 + \epsilon(2s)\,4s^3 - \epsilon(s)\,2s^3 = s^3\,(T'(0) + 4\epsilon(2s) - 2\epsilon(s))$
Thus $T'(0) + 4\epsilon(2s) - 2\epsilon(s)$ is on the plane spanned by f(s) and f(2s). Letting $s \to 0$,
we see that T'(0) is on the limiting plane. Thus the limiting plane is indeed spanned
by T(0) and T'(0).
A few remarks are in order. If T'(0) = 0, then the osculating plane is not
defined. In particular, if the curve Γ is a straight line, then the tangent
vector is constant, and there is no plane which is closest to Γ, so a straight
line has no osculating plane anywhere. Conversely, if T and T' are always
collinear along Γ, then Γ must be a straight line (Problem 14). Now, in the
case where T' and T are collinear at the point in question, but not always
collinear, it may happen that the plane through $x_0, x_1, x_2$ of Proposition 4
has a limiting position as $x_1, x_2 \to x_0$; and it may not (see Problem 15). In
the former case we shall consider the osculating plane as defined by the limiting
position, and in the latter case, we shall say that the osculating plane does not
exist. Generally speaking, such cases are pathological, and we shall exclude
them from further discussion.
Observe that for curves in $R^2$, the osculating plane is (of course) just $R^2$.
For curves Γ in $R^2$, we define the normal vector to Γ at $x_0$ as that unit vector
N on the normal line so that the sense of rotation T → N is counterclockwise
(see Figure 4.30). Then the normal vector N varies continuously along the
curve and the vectors (T, N) will form a "natural" orthonormal basis for
$R^2$ along the curve (called the moving frame). In $R^n$ for n > 2 there is
no uniquely determined choice for a normal vector, and thus we leave the
choice undetermined save that it should vary continuously along Γ.
Definition 5. Let Γ be a twice differentiable oriented curve in $R^n$. The
normal vector to Γ is a choice of unit vector on the normal line which varies
continuously along Γ. The curvature of Γ is the scalar function of s, κ(s),
such that T'(s) = κ(s)N(s) along Γ.
Figure 4.30
Examples
32. The circle in $R^2$ (Figure 4.31)
$x(s) = \left(a\cos\frac{s}{a},\ a\sin\frac{s}{a}\right)$
$T(s) = \left(-\sin\frac{s}{a},\ \cos\frac{s}{a}\right)$
The normal is orthogonal to and counterclockwise from T, so
$N(s) = -\left(\cos\frac{s}{a},\ \sin\frac{s}{a}\right)$
Then $T'(s) = -\frac{1}{a}\left(\cos\frac{s}{a},\ \sin\frac{s}{a}\right) = N(s)/a$, so the curvature of
the circle of radius a is $a^{-1}$.
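Figure 4.31
33. The spiral $r = e^{\theta}$ (in polar coordinates) (Figure 4.32). The
parametrization is
$z = z(\theta) = e^{\theta}e^{i\theta} = e^{(1+i)\theta}$
$z'(\theta) = (1 + i)e^{(1+i)\theta}$
so
$\frac{ds}{d\theta} = |z'(\theta)| = \sqrt{2}\,e^{\theta}$
Thus the tangent vector is
$T(\theta) = \frac{1 + i}{\sqrt{2}}\,e^{i\theta} = e^{i(\theta + \pi/4)}$
The normal is $N(\theta) = e^{i(\theta + 3\pi/4)}$. Now,
$\frac{dT}{ds} = \frac{dT}{d\theta}\,\frac{d\theta}{ds} = i e^{i(\theta + \pi/4)} \cdot \frac{1}{\sqrt{2}\,e^{\theta}} = \frac{e^{-\theta}}{\sqrt{2}}\,N(\theta)$
Thus the curvature is given by $\kappa(\theta) = e^{-\theta}/\sqrt{2}$.
Figure 4.32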
Here is a proposition which gives an interpretation of curvature in the
plane and sometimes makes the curvature easily computable. It says that
the curvature is the rate of rotation of the moving frame with respect to arc
length.
Proposition 5. Let Γ be a given plane curve. The curvature of Γ is the rate
of rotation of the tangent with respect to arc length, that is,
$\kappa(s) = \frac{d}{ds}\,(\arg T(s))$
Proof. Let $T(s) = r(s)e^{i\theta(s)}$ in polar coordinates. Since T is a unit vector,
r(s) = 1. Then $N(s) = e^{i(\theta(s) + \pi/2)}$, and
$\frac{dT}{ds} = \frac{d}{ds}\,(e^{i\theta(s)}) = i\theta'(s)e^{i\theta(s)} = \theta'(s)e^{i(\theta(s) + \pi/2)} = \theta'(s)N(s)$
Thus $\kappa(s) = \theta'(s)$.
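Proposition 5 lends itself to a quick numerical check. The following is a minimal sketch, not part of the text: the function name `plane_curvature`, the step size `h`, and the finite-difference scheme are all assumptions made for illustration. It estimates the turning of the tangent per unit arc length for a plane curve given as a complex-valued function of an arbitrary parameter.

```python
import cmath
import math

def plane_curvature(z, t, h=1e-4):
    """Estimate d(arg T)/ds at parameter t for a plane curve t -> z(t) (complex)."""
    # Velocities just before and just after t (difference quotients).
    v_minus = (z(t) - z(t - h)) / h   # approximates z'(t - h/2)
    v_plus = (z(t + h) - z(t)) / h    # approximates z'(t + h/2)
    # Angle through which the tangent turns between those two points.
    dtheta = cmath.phase(v_plus / v_minus)
    # Arc length between them is approximately |z'(t)| * h.
    speed = abs((z(t + h) - z(t - h)) / (2 * h))
    return dtheta / (speed * h)

# Circle of radius 3 (Example 32) and the spiral r = e^theta (Example 33).
circle = plane_curvature(lambda t: 3 * cmath.exp(1j * t), 0.7)
spiral = plane_curvature(lambda t: cmath.exp((1 + 1j) * t), 0.5)
print(circle, spiral)
```

For the circle the estimate should be close to 1/3, and for the spiral close to $e^{-\theta}/\sqrt{2}$ at θ = 0.5, in agreement with the two examples above.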
Examples
34. The helix
$f(t) = (a\cos t,\ a\sin t,\ bt)$
has arc length $s = (a^2 + b^2)^{1/2}t$, and tangent vector
$T(s) = (a^2 + b^2)^{-1/2}\left(-a\sin((a^2 + b^2)^{-1/2}s),\ a\cos((a^2 + b^2)^{-1/2}s),\ b\right)$
Thus
$T'(s) = (a^2 + b^2)^{-1}\left(-a\cos((a^2 + b^2)^{-1/2}s),\ -a\sin((a^2 + b^2)^{-1/2}s),\ 0\right)$
Thus $N = (-\cos t, -\sin t, 0)$ and
$\kappa = \frac{a}{a^2 + b^2}$
Observe that the normal line to the helix always points toward the
axis of the helix.
35. Consider the curve (Figure 4.33)
$x(t) = (\cos t,\ \sin t,\ \sin 3t)$
Figure 4.33
Then
$x'(t) = (-\sin t,\ \cos t,\ 3\cos 3t)$
$\frac{ds}{dt} = \|x'(t)\| = (1 + 9\cos^2 3t)^{1/2}$
$T(t) = (1 + 9\cos^2 3t)^{-1/2}(-\sin t,\ \cos t,\ 3\cos 3t)$
Computing
$\frac{dT}{ds} = \frac{dT}{dt}\,\frac{dt}{ds}$
$= -(1 + 9\cos^2 3t)^{-2}\left(\cos t\,(1 + 9\cos^2 3t) + 27\sin t\,\sin 3t\,\cos 3t,\ \sin t\,(1 + 9\cos^2 3t) - 27\cos t\,\sin 3t\,\cos 3t,\ 9\sin 3t\right)$
and the curvature is the length of this vector.
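The recipe used in Examples 34 and 35 (form T, differentiate, divide by ds/dt, take the length) can be imitated numerically. The sketch below is only an illustration under stated assumptions: the helper names `unit_tangent` and `curvature` and the central-difference step `h` are ours, not the text's.

```python
import math

def unit_tangent(x, t, h=1e-4):
    """Return (T, ds/dt) at parameter t for a curve t -> x(t) given as a tuple."""
    v = [(p - m) / (2 * h) for p, m in zip(x(t + h), x(t - h))]
    speed = math.sqrt(sum(c * c for c in v))
    return [c / speed for c in v], speed

def curvature(x, t, h=1e-4):
    """Length of dT/ds, computed as (dT/dt) / (ds/dt) with central differences."""
    t_plus, _ = unit_tangent(x, t + h, h)
    t_minus, _ = unit_tangent(x, t - h, h)
    _, speed = unit_tangent(x, t, h)
    dT_dt = [(p - m) / (2 * h) for p, m in zip(t_plus, t_minus)]
    dT_ds = [c / speed for c in dT_dt]
    return math.sqrt(sum(c * c for c in dT_ds))

# Helix with a = 2, b = 1 (Example 34) and the curve of Example 35 at t = 0.
helix = lambda t: (2 * math.cos(t), 2 * math.sin(t), t)
example_35 = lambda t: (math.cos(t), math.sin(t), math.sin(3 * t))
print(curvature(helix, 0.3), curvature(example_35, 0.0))
```

For the helix the value should be near $a/(a^2 + b^2) = 0.4$. For Example 35 at t = 0 the displayed vector reduces to $-(1/100)(10, 0, 0)$, so the curvature there should be near 0.1.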
Now, let us make one final remark about a curve in the plane. It is
completely determined, up to Euclidean motions, by its curvature. Thus, for
example, the only curve of constant curvature is a circle. This is, as we
shall see, an easy consequence of Picard's existence theorem for differential
equations.
Theorem 4.1. Let κ(s) be a continuous function of s in some interval I about
the origin. There is a plane curve Γ whose curvature function is κ(s). If Γ' is
another curve parametrized by arc length on the interval I which has the same
curvature, then a Euclidean motion will move Γ' onto Γ.
Proof. First we shall verify the uniqueness. Let Γ be a curve with the given
curvature. Let x = x(s) be its arc length parametrization. We may apply a
Euclidean motion (translation and rotation) so that x(0) is the origin and T(0) is
the vector $E_1$. Now we show there is only one curve with these properties. The
proof depends on the observation that the normal is rigidly attached to the tangent;
that is, its motion along the curve is completely determined by the tangent. In
fact, writing $T(s) = e^{i\theta(s)}$, we have $N(s) = e^{i(\theta(s) + \pi/2)}$. Thus
$N' = i\theta' e^{i(\theta + \pi/2)} = \theta' e^{i(\theta + \pi)} = -\kappa T$. Now the system of differential equations
$T'(s) = \kappa(s)N(s) \qquad N'(s) = -\kappa(s)T(s)$   (4.40)
has only one solution subject to the initial conditions $T(0) = E_1$, $N(0) = E_2$. Thus
T is unique, so
$x(s) = \int_0^s T(\sigma)\,d\sigma$
is also uniquely determined by the given conditions. Thus there is only one Γ
with the given curvature.
We now turn to the question of the existence of a plane curve with given
curvature. Again, by the fundamental theorem on differential equations, there exists a
solution of the system (4.40) subject to the initial conditions $T(0) = E_1$, $N(0) = E_2$.
If (T(s), N(s)) is the solution, then
$x = x(s) = \int_0^s T(\sigma)\,d\sigma$
defines a plane curve Γ. We must show that s is arc length along Γ. For then
$x''(s) = \kappa(s)N(s)$, so κ(s) is the curvature. To show that s is arc length we must
show that x'(s) = T(s) is a unit vector.
Now, let $\tilde{T}(s) = -iN(s)$, $\tilde{N}(s) = iT(s)$. Then
$\tilde{T}(0) = -iE_2 = E_1 \qquad \tilde{N}(0) = iE_1 = E_2$
$\tilde{T}'(s) = -iN'(s) = -i(-\kappa(s)T(s)) = \kappa(s)\tilde{N}(s)$
$\tilde{N}'(s) = iT'(s) = i\kappa(s)N(s) = -\kappa(s)\tilde{T}(s)$
Thus $\tilde{T}, \tilde{N}$ also solve the given initial value problem. By the uniqueness, $\tilde{T} = T$,
$\tilde{N} = N$. Thus N = iT, so N ⊥ T. It follows that
$\frac{d}{ds}\langle T(s), T(s)\rangle = 2\langle T(s), T'(s)\rangle = 2\kappa(s)\langle T(s), N(s)\rangle = 0$
so T(s) has constant length. Since $T(0) = E_1$, it is a unit vector.
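The existence half of the proof is constructive enough to try on a machine. The following is a rough sketch under our own assumptions (the name `curve_from_curvature`, the midpoint rule, and the step count are illustrative choices, not the text's): writing $T = e^{i\theta}$ reduces (4.40) to θ' = κ, and x(s) is then the integral of T.

```python
import math

def curve_from_curvature(kappa, length, steps=2000):
    """Build points of the plane curve with x(0) = 0, T(0) = E1 and curvature kappa(s)."""
    h = length / steps
    theta = 0.0          # angle of the tangent; T(0) = E1 means theta(0) = 0
    x, y = 0.0, 0.0      # x(0) is the origin
    points = [(x, y)]
    for i in range(steps):
        s = i * h
        # Tangent angle at the middle of the step (midpoint rule for theta' = kappa).
        theta_mid = theta + 0.5 * h * kappa(s + 0.25 * h)
        # Advance x(s), the integral of T = (cos theta, sin theta).
        x += h * math.cos(theta_mid)
        y += h * math.sin(theta_mid)
        # Advance theta over the full step.
        theta += h * kappa(s + 0.5 * h)
        points.append((x, y))
    return points

# Constant curvature 1/2 over one full turn should trace a circle of radius 2.
pts = curve_from_curvature(lambda s: 0.5, 4 * math.pi)
print(max(abs(math.hypot(px, py - 2.0) - 2.0) for px, py in pts))
```

With κ = 1/2 the computed points stay close to the circle of radius 2 centered at (0, 2), which is the point $x(0) + \kappa^{-1}N(0)$, consistent with the remark that the only curve of constant curvature is a circle.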
• PROBLEMS
14. Show that if Γ: x = x(s) is a curve in $R^3$ and T(s), T'(s) are everywhere
collinear, then Γ is a straight line.
15. (a) Let Γ be given by
$x = (x, x^3, x^3)$
Show that T(0), T'(0) are collinear, but Γ has an osculating plane at the
origin.
(b) Let
$g(x) = \begin{cases} x^3 & \text{if } x < 0 \\ 0 & \text{if } x \ge 0 \end{cases}$
Show that the curve Γ given by
$x = (x, -g(x), g(-x))$
does not have an osculating plane at the origin.
16. Let Γ be a curve on the sphere $x^2 + y^2 + z^2 = 1$. Show that Γ is
an arc of a great (i.e., diametric) circle if and only if the normal to Γ is
always collinear with the position vector.
17. Show that a curve is a straight line if all its tangents are parallel.
18. Three noncollinear points in $R^2$ determine a circle. If, for the
purposes of this exercise, we consider a straight line as a circle (of infinite
radius) we may assert that any three points determine a circle. Suppose Γ
is a curve in $R^2$ through $p_0$. Following the kind of reasoning on pages 324
and 325, define the osculating circle to Γ at $p_0$ and find its equation in terms
of a parametrization of Γ.
19. The radius of the osculating circle is called the radius of curvature.
Show that it is $\kappa^{-1}$.
20. If the osculating circle to Γ is always a straight line, deduce that Γ is
a straight line.
21. Find the osculating circle at a general point of an ellipse.
22. Find the osculating circle at a general point of a parabola.
23. Show that if the osculating circle to a curve is always a circle of
radius R, the curve is a circle of radius R.
24. Suppose Γ is given parametrically by arc length by x = x(s), y = y(s).
Show that the curvature is given by
$\kappa = x'y'' - y'x''$, and that $|\kappa| = [(x'')^2 + (y'')^2]^{1/2}$
25. Show that a curve in the plane of constant curvature is a circle.
Figure 4.34
26. Suppose Γ: x = f(s) is a curve with this property: for every t, the
distance between f(s) and f(s + t) is independent of s. Show that Γ is a
circle.
27. Suppose f is a nonnegative function of a real variable with the
property that the area under the graph of f between 0 and x is proportional
to the arc length of that graph. Find the curve.
28. Find the curve Γ with the property that at any point p the angle
between the tangent to Γ at p and the tangent to the ellipse
$E: x^2 + 2y^2 = 1$
at the point of intersection of E with the ray through p is constant.
29. Let Γ: x = f(s) be a planar curve. Suppose we have a string along
Γ with one end point at $x_0$. If we unwind the string tautly and without
stretching, the end point will follow a curve E, called an evolute of Γ
(Figure 4.34). If s measures arc length from $x_0 = f(0)$, the curve E is
parametrized by x = f(s) + sf'(s). Find the evolutes to (a) the unit circle,
(b) the spiral $z = e^{(1+i)t}$, (c) the parabola $y = x^2$, (d) an ellipse.
30. If we rotate a cylinder of water about its axis, the surface of the water
does not remain a plane. What shape does it take and why?
4.4 Curves in Space
Suppose Γ is a curve in space. Let $x_0 \in \Gamma$, and suppose T and N are the
tangent and normal to Γ at $x_0$. A third unit vector orthogonal to both T
and N will serve to provide a natural frame within which to discuss the
behavior of the curve near $x_0$. This vector B, called the binormal to the
curve, is chosen so that the triple T → N → B forms a right-handed frame (see
Figure 4.35). In this section we shall use this frame, called the moving
trihedron along the curve, much as we used the tangent and normal to study
plane curves.
The three vectors T, N, B determine three planes: the tangent (or osculating)
plane is spanned by T and N, the normal plane is spanned by N and B, and
the plane spanned by T and B is called the rectifying plane. Now the
curvature of the curve is, as we have seen, the rate of rotation, with respect to
arc length, of the tangent line in the osculating plane. In three dimensions
there is another important intrinsic function on the curve. Since B is a
unit vector along Γ, <B', B> = 0. Thus B' lies in the osculating plane. Since
<B, T> = 0, we have
<B', T> + <B, T'> = 0
Since T' = κN, <B, T'> = κ<B, N> = 0, thus also <B', T> = 0, so B' must
be collinear with N.
Figure 4.35
Definition 6. The torsion τ of a curve Γ is that function such that
B' = -τN.
The torsion measures the torque, that is, the twisting of the osculating
plane about the tangent line. That is, since the binormal is orthogonal to
the osculating plane, the change in the binormal reflects adequately the
change in the osculating plane. The Taylor development of the binormal
in a neighborhood of a point $x_0 = x(0)$ is
$B(s) = B(0) - \tau(0)N(0)s + \epsilon(s)$
Thus (considering only first-order terms) the binormal at x(s) has moved
-τ(0)s toward the normal. Thus if τ(0) > 0, the osculating plane has
twisted in the right-handed sense about the tangent line. At a point where
τ = 0, the osculating plane pauses; it may or may not change its direction
of rotation about the tangent line. If τ ≡ 0, the osculating plane remains
fixed along the curve; it follows that the curve lies on this plane.
Proposition 6. Let Γ be a curve in $R^3$. Γ is a plane curve if and only if
τ = 0 along Γ.
Proof. If Γ is a plane curve, let Π be the plane containing Γ. The tangent
and normal to Γ always lie on Π, so the binormal is always the unit vector orthogonal
to Π. Thus the binormal is constant, so B' = 0, thus τ = 0.
On the other hand, suppose τ = 0. Let x = x(s) be the parametrization of Γ by
arc length. Since τ = 0, B' = 0, so B is constant along Γ. If for some $s_0$, $x(s_0)$ is
not on the plane through x(0) and orthogonal to B, then
$\langle x(s_0) - x(0), B\rangle \ne 0$   (4.41)
Let φ(s) be the function <x(s) - x(0), B>. Then φ'(s) = <T(s), B>, which is zero
since B = B(s) for all s and is orthogonal to T(s). Thus φ(s) is constant. Since
φ(0) = 0, $\varphi(s_0) = 0$ also, contradicting (4.41).
The fundamental formulas of space curve theory are those relating T',
Ν', Β' with Τ, Ν, Β. We can now easily derive them.
Theorem 4.2. (Frenet-Serret Formula)
$T' = \kappa N$
$N' = -\kappa T + \tau B$
$B' = -\tau N$
Proof. The first and the third are just the definitions of κ, τ, respectively. Since
N is a unit vector, <N', N> = 0, so N' lies in the rectifying plane. Write
$N' = \alpha T + \beta B$; we must verify α = -κ, β = τ. But that follows from <N, T> = 0,
<N, B> = 0. For
$\alpha = \langle N', T\rangle = -\langle N, T'\rangle = -\kappa$
$\beta = \langle N', B\rangle = -\langle N, B'\rangle = -(-\tau) = \tau$
Examples
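36. The circular helix:
$x(t) = (a\cos t,\ a\sin t,\ bt)$
We have already computed that s = ct, where $c = (a^2 + b^2)^{1/2}$, and
$T(s) = \frac{1}{c}\left(-a\sin\frac{s}{c},\ a\cos\frac{s}{c},\ b\right)$
$\kappa N(s) = T'(s) = \frac{1}{c^2}\left(-a\cos\frac{s}{c},\ -a\sin\frac{s}{c},\ 0\right)$
thus
$\kappa = \frac{a}{c^2} \qquad N = -\left(\cos\frac{s}{c},\ \sin\frac{s}{c},\ 0\right)$
$B = T \times N = -\frac{1}{c}\left(-b\sin\frac{s}{c},\ b\cos\frac{s}{c},\ -a\right)$
thus $\tau = b/c^2$.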
37. Let C be a curve in the xy plane, and let Γ be a curve of constant
slope lying over the curve C (see Figure 4.36). Thus if T is the tangent
to Γ, $\langle T, E_3\rangle$ is constant. Then, for some constant b, Γ has the
parametrization
$x(t) = (x(t),\ y(t),\ bt)$
where (x(t), y(t)) parametrizes C. We may assume the parameter
is arc length along C. Then
$x' = (x', y', b)$
so
$\frac{ds}{dt} = \|x'\| = ((x')^2 + (y')^2 + b^2)^{1/2} = (1 + b^2)^{1/2}$
Figure 4.36
Thus $s = (1 + b^2)^{1/2}t$ and the tangent to Γ is
$T = \frac{1}{(1 + b^2)^{1/2}}\,(x', y', b)$
Thus, differentiating with respect to s,
$\kappa N = T' = \frac{1}{1 + b^2}\,(x'', y'', 0)$
Now if $\kappa_C$ is the curvature of C, since (x', y') is its tangent vector,
(-y', x') is its normal vector, so
$(x'', y'') = \kappa_C\,(-y', x')$
Thus
$\kappa N = \frac{\kappa_C}{1 + b^2}\,(-y', x', 0)$
so
$\kappa = \frac{\kappa_C}{1 + b^2} \qquad N = (-y', x', 0)$
Then
$B = T \times N = \frac{1}{(1 + b^2)^{1/2}}\,(-bx',\ -by',\ (x')^2 + (y')^2) = \frac{1}{(1 + b^2)^{1/2}}\,(-bx',\ -by',\ 1)$
Differentiating,
$-\tau N = B' = \frac{1}{1 + b^2}\,(-bx'',\ -by'',\ 0) = -\frac{b\kappa_C}{1 + b^2}\,(-y', x', 0)$
Thus
$\tau = \frac{b\kappa_C}{1 + b^2}$
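The values of κ and τ found in Examples 36 and 37 can be checked numerically. The sketch below rests on an assumption not derived in the text: it uses the standard closed formulas $\kappa = \|x' \times x''\| / \|x'\|^3$ and $\tau = \langle x' \times x'', x'''\rangle / \|x' \times x''\|^2$, valid for any regular parametrization and consistent with the sign convention B' = -τN. The function names are ours.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Inner product of two vectors in R^3."""
    return sum(p * q for p, q in zip(u, v))

def curvature_torsion(d1, d2, d3):
    """Curvature and torsion from the first three derivatives x', x'', x'''."""
    w = cross(d1, d2)
    speed = math.sqrt(dot(d1, d1))
    w_norm2 = dot(w, w)           # vanishes where the curvature is zero
    kappa = math.sqrt(w_norm2) / speed ** 3
    tau = dot(w, d3) / w_norm2
    return kappa, tau

def helix_derivatives(a, b, t):
    """x', x'', x''' for the helix x(t) = (a cos t, a sin t, bt)."""
    d1 = (-a * math.sin(t), a * math.cos(t), b)
    d2 = (-a * math.cos(t), -a * math.sin(t), 0.0)
    d3 = (a * math.sin(t), -a * math.cos(t), 0.0)
    return d1, d2, d3

kappa, tau = curvature_torsion(*helix_derivatives(3.0, 4.0, 1.2))
print(kappa, tau)
```

With a = 3, b = 4 we have $c^2 = 25$, so the printed values should be close to $a/c^2 = 0.12$ and $b/c^2 = 0.16$, matching Example 36.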
Local Behavior of a Curve
We shall now make a close study of the local behavior of a curve relative
to the moving trihedron. Let Γ be a sufficiently differentiable curve,
parametrized by arc length by x = x(s), -ε < s < ε. We may perform a
Euclidean transformation so that x(0) = 0, $T(0) = E_1$, $N(0) = E_2$, $B(0) = E_3$.
Expanding x(s) in a Taylor series, we obtain
$x(s) = x(0) + x'(0)s + x''(0)\frac{s^2}{2} + x'''(0)\frac{s^3}{6} + \epsilon(s^3)$   (4.42)
Now
$x' = T, \quad x'' = \kappa N, \quad x''' = \kappa' N + \kappa N' = \kappa' N + \kappa(-\kappa T + \tau B)$
Evaluating these at zero and substituting into (4.42), we obtain
$x(s) = sE_1 + \frac{\kappa}{2}s^2 E_2 + \frac{\kappa'}{6}s^3 E_2 - \frac{\kappa^2}{6}s^3 E_1 + \frac{\kappa\tau}{6}s^3 E_3 + \epsilon(s^3)$
In coordinates,
$x = s - \frac{\kappa^2}{6}s^3 + \epsilon(s^3)$
$y = \frac{\kappa}{2}s^2 + \frac{\kappa'}{6}s^3 + \epsilon(s^3)$
$z = \frac{\kappa\tau}{6}s^3 + \epsilon(s^3)$
Thus for small values of s, the given curve looks like the cubic curve given
by the equations
$y = \frac{\kappa}{2}x^2 \qquad z = \frac{\kappa\tau}{6}x^3$
and hence $z^2 = \frac{2\tau^2}{9\kappa}y^3$.
Figure 4.37 is a picture of this curve for κ > 0, τ > 0. Notice that, so long
as κτ ≠ 0 the curve always passes through its osculating and normal planes,
but lies on one side of its rectifying plane.
Figure 4.37
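As an illustration of the coordinate expansion above, one can compare it with an actual curve. The following sketch, with names and sample values chosen by us, takes the helix of Example 36 with a = b = 1 (so κ = τ = 1/2 and κ' = 0), expresses x(s) - x(0) in the trihedron at s = 0, and compares with the cubic model.

```python
import math

def helix_frame_coords(s, a=1.0, b=1.0):
    """Coordinates of x(s) - x(0) for the helix, relative to T(0), N(0), B(0)."""
    c = math.hypot(a, b)
    t = s / c                                  # arc length s = ct
    d = (a * math.cos(t) - a, a * math.sin(t), b * t)
    T = (0.0, a / c, b / c)                    # tangent at t = 0
    N = (-1.0, 0.0, 0.0)                       # normal at t = 0 (points to the axis)
    B = (0.0, -b / c, a / c)                   # binormal T x N at t = 0
    return tuple(sum(p * q for p, q in zip(d, e)) for e in (T, N, B))

def cubic_model(s, kappa, tau, dkappa=0.0):
    """Third-order expansion of the curve in the moving trihedron at s = 0."""
    return (s - kappa ** 2 * s ** 3 / 6,
            kappa * s ** 2 / 2 + dkappa * s ** 3 / 6,
            kappa * tau * s ** 3 / 6)

exact = helix_frame_coords(0.1)
model = cubic_model(0.1, 0.5, 0.5)
print([abs(p - q) for p, q in zip(exact, model)])
```

The three differences should be of fourth order in s, far smaller than the cubic terms themselves, which is what the remainder $\epsilon(s^3)$ asserts.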
Now, just as the curvature determines plane curves up to a Euclidean
motion, space curves are so determined by the curvature and torsion. The
proof of this fact is by the same kind of application of Picard's existence
and uniqueness theorem as we used in the case of the plane. We shall leave
the verification to the interested reader.
Theorem 4.3. Given continuous functions f, g defined in an interval I there
is a space curve Γ: x = x(s) given parametrically by arc length in some
subinterval of I such that
$\kappa(s) = f(s) \qquad \tau(s) = g(s)$
Γ is unique up to Euclidean motions in $R^3$.
• PROBLEMS
31. Show that a curve in R3 is a plane curve if all its tangent planes pass
through a given point.
32. Show that a curve in R3 is a plane curve if its binormal is constant.
33. Let Γ be a curve in the plane and let γ be the intersection of the cylinder
over Γ with the cone $x^2 + y^2 = z$, z ≥ 0. Find the curvature of Γ in terms
of that of γ. What is the torsion of γ?
34. Let Γ be a curve in space, and γ its projection onto the xy plane.
What is the relation between the curvature and torsion of Γ and the
curvature of γ?
35. Suppose that Γ is the intersection of the surface $z = y^2$ in $R^3$ with the
plane ax + by = 0. What is the curvature of Γ at the origin?
36. Let Γ be the intersection of the surface $z = x^2 + 2y^2$ with the plane
ax + by = 0. What are the curvature and torsion of Γ?
37. Let Г be given in R3 by χ = x(s). Let Σ be the surface swept out by
the tangent lines to Γ. Show that a curve on Σ which is everywhere
orthogonal to those tangent lines is given by
χ = x(s) + (c - s)T(s) for some constant с
4.5 Varying a Curve in the Plane
A family of curves in the plane is a collection of curves $\{\Gamma_c\}$, as c ranges
through some set, usually of n-tuples of numbers. It is to be understood that
the curves of the family vary smoothly, although we shall not make this idea
precise. For example, if x(t, c), y(t, c) are functions defined for real t and c lying
in some set S, then the equations
$x = x(t, c), \quad y = y(t, c)$   (4.43)
determine a family of curves: each curve in the family is found by fixing a
value of c. We refer to Equations (4.43) as the explicit form of the family.
More often, a curve is determined by a relation between x, y and a family
could be given by an equation
$F(x, y, c) = 0$   (4.44)
which, for fixed c, gives the relation determining a curve. We refer to (4.44)
as the implicit form of the family. Since it does not refer to any particular
parametrization of the individual members, this form is particularly useful.
The "constant" c which picks out the member of the family usually ranges
through some set in $R^n$, in which case we refer to the family ((4.43), (4.44))
as an n-parameter family of curves.
Examples
38. A straight line in the plane is given by the equation
ax + by + с = 0 (4.45)
Thus the set of all straight lines is given by (4.45) implicitly as a
3-parameter family of curves. If instead, we write down the slope-
intercept form of a straight line,
у = mx + b (4.46)
then we exhibit this family explicitly as a 2-parameter family of curves.
39. Let
$x = x(t) \qquad y = y(t)$
be the equation of a curve Γ in the plane, and consider the family of
tangent lines to Γ. The equation of the line tangent to Γ at (x(t),
y(t)) is
$y = y(t) + \frac{y'(t)}{x'(t)}\,(x - x(t))$   (4.47)
This is the explicit form then of a 1-parameter family. (The parameter
is t.)
40. Consider the case where Γ is the circle
$x = \cos t \qquad y = \sin t$
The family of tangent lines to Γ is given by the equation
$y = \sin t - \frac{\cos t}{\sin t}\,(x - \cos t)$   (4.48)
This simplifies to
$y = -x\cot t + \csc t$
We can make this appear even more palatable by taking -cot t
as the parameter of the family. Letting c = -cot t, we find
$\csc t = \pm(1 + c^2)^{1/2}$ (the sign being that of sin t), so (4.48) becomes
$y = xc \pm (1 + c^2)^{1/2}$   (4.49)
a 1-parameter family of lines for each choice of sign.
41. Suppose a hoop is rolling along a horizontal line (see Figure
4.38). This collection of positions of the hoop forms a 1-parameter
family of circles where the point of tangency with the horizontal (the
x axis) is taken to be the parameter. The implicit equation for the
family is thus
$(x - c)^2 + (y - 1)^2 = 1$
Figure 4.38
Figure 4.39
42. The family of circles tangent to both the x axis and the y axis
is a 1-parameter family of curves (Figure 4.39). We take for the
parameter the point of tangency of the curve with the x axis. If r
is the radius of the cth circle, then the equation of the family is clearly
$(x - c)^2 + (y - r)^2 = r^2$
It is easily seen that r = c; this follows from elementary geometric
considerations. Thus the family is implicitly described by this
equation
$(x - c)^2 + (y - c)^2 = c^2$   (4.50)
43. The family of circles of radius 1 tangent to the parabola $y = x^2$
(Figure 4.40). We can take as the parameter the x coordinate c of
the points of tangency. The center of the circle is on the line
perpendicular to the parabola at $(c, c^2)$. Thus if (r, s) are the coordinates
of the center of the cth circle, we have
$s - c^2 = -\frac{1}{2c}\,(r - c)$
$(r - c)^2 + (s - c^2)^2 = 1$
These equations have the solution
$r = c + \frac{2c}{(1 + 4c^2)^{1/2}} \qquad s = c^2 - \frac{1}{(1 + 4c^2)^{1/2}}$
Thus the implicit equation for this family of circles is
$\left(x - c - \frac{2c}{(1 + 4c^2)^{1/2}}\right)^2 + \left(y - c^2 + \frac{1}{(1 + 4c^2)^{1/2}}\right)^2 = 1$
Figure 4.40
44. Let Γ be a curve in the plane. We seek the family of tangents
to Γ. If Γ is given as a function of arc length by x = x(s), then the
lines
$g(s, u) = x(s) + uT(s)$   (4.51)
form the family of tangents to Γ with s as parameter. Suppose now
that γ is a curve which is orthogonal to this family at every point.
If h(s) is the point of intersection of γ with the particular tangent line
(4.51) at x(s), then γ is parametrized by x = h(s). h(s) is then of the
form (4.51) with a particular choice u(s) of u. Writing then h(s) =
x(s) + u(s)T(s), and differentiating, we obtain
$h'(s) = (1 + u'(s))T(s) + u(s)T'(s)$
Since h'(s) is tangent to γ and thus, by assumption, orthogonal to T,
and since T' is orthogonal to T, we must have 1 + u'(s) = 0. Thus
u(s) = c - s. So the family of curves orthogonal to the tangent lines to Γ
is given by
$x = x(s) + (c - s)T(s)$   (4.52)
The family of curves orthogonal to the tangents to the circle $z = e^{is}$
is given by
$x = e^{is} + i(c - s)e^{is} = [1 + i(c - s)]e^{is}$
These are just the evolutes of the circle.
The Differential Equation of a Family
A differential equation
$y' = F(x, y)$
determines a 1-parameter family of curves, if the function F is decent enough.
For, under such conditions, for each c there is a unique solution of the initial
value problem
$y' = F(x, y) \qquad y(x_0) = c$
The solution can be written y = f(x, c), which can be considered as either
the explicit or implicit form of the family. Now, it is usually true that a
1-parameter family of curves is the family of solutions of some differential
equation, and we would often like to find that differential equation.
Suppose, for example, that y = f(x, c) is the equation of a given 1-parameter
family. If y = y(x) is one particular curve (i.e., $y(x) = f(x, c_0)$ for some
fixed $c_0$), then these two equations must hold
$y = f(x, c) \qquad y' = \frac{\partial f}{\partial x}(x, c)$
for some value of c (i.e., $c = c_0$). It may be possible to eliminate the
parameter c from these two equations, thus obtaining a relation between x, y, y'
which must be satisfied; this is the differential equation of the family. For it is
a differential equation which must be valid for each member of the family,
and this is a differential equation which determines the family.
More generally, suppose the family is given implicitly by
$F(x, y, c) = 0$
If x = x(t), y = y(t) parametrizes one of the curves in the family, then there
is a c such that
$F(x(t), y(t), c) = 0$   (4.53)
identically in t. Differentiating now with respect to t, we have
$\frac{\partial F}{\partial x}(x, y, c)\,x' + \frac{\partial F}{\partial y}(x, y, c)\,y' = 0$   (4.54)
If we can eliminate c from Equations (4.53) and (4.54), the result will be a
relation between x, y, x', y' which must be satisfied for each curve in the
family and thus is the differential equation of the family. Of course, if x
is the parameter along the curve, and y = y(x) is its equation, (4.54) becomes
$\frac{\partial F}{\partial x}(x, y, c) + \frac{\partial F}{\partial y}(x, y, c)\,\frac{dy}{dx} = 0$   (4.55)
Examples
45. Consider the family of parabolas (Figure 4.41)
$y^2 - cx = 0$
Differentiating with respect to x (considering y as a function of x),
$2yy' - c = 0$
Thus the differential equation of the family is
$y^2 - 2yy'x = 0$
Figure 4.41
or, excluding the curve y = 0,
$y - 2y'x = 0$
46. The family $y = ce^x$ is given by the differential equation y' = y
(as we already know). The family $y = e^{cx}$ is given by the differential
equation
$y' = \frac{y \ln y}{x}$
47. (Clairaut's Equation). Let y = f(x) give a curve in the plane,
and consider the family of lines tangent to that curve. That family
is given implicitly (taking the x coordinate of the point of tangency as
the parameter) by this equation,
$y = f(c) + f'(c)(x - c)$   (4.56)
Now, upon differentiation we find
$y' = f'(c)$   (4.57)
To say that we can eliminate c from the pair of Equations (4.56) and
(4.57) amounts to saying that we can solve (4.57) for c as a function of
y'. Then, upon eliminating, we obtain as differential equation the
equation
$y = y'x + h(y')$   (4.58)
where h(y') represents the expression f(c) - f'(c)c, considered as a
function of y'.
Thus Equation (4.58), known as Clairaut's equation, is the general form
of the differential equation of a family of lines tangent to a curve. Its
solutions are
$y = cx + h(c)$
Notice that the given curve y = f(x) also solves Equation (4.58) (because
it is derived from (4.56) and (4.57), which hold under the substitution y = f(x)).
It is called the singular solution of the equation.
48. The family of lines tangent to the parabola $y = x^2$ has the
implicit form
$y = c^2 + 2c(x - c) = 2cx - c^2$
Differentiating, we obtain y' = 2c. Thus $c = \tfrac{1}{2}y'$, so we can eliminate
c to obtain this differential equation of the family,
$y = y'x - \tfrac{1}{4}(y')^2$
49. The family of lines tangent to the circle $x^2 + y^2 = 1$ is given
implicitly by
$y = xc \pm (1 + c^2)^{1/2}$
Then y' = c, so the Clairaut equation of the family is
$y = y'x \pm (1 + (y')^2)^{1/2}$
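Example 48 is simple enough to check by direct substitution. The sketch below is an illustration only; the helper names `tangent_line` and `clairaut_residual` are ours. It evaluates the residual of the Clairaut equation $y = y'x - \tfrac{1}{4}(y')^2$ both on members of the family of tangent lines and on the parabola itself, which should be the singular solution.

```python
def tangent_line(c):
    """Member of the family of Example 48: returns y and y' for y = 2cx - c^2."""
    return (lambda x: 2 * c * x - c * c), (lambda x: 2 * c)

def clairaut_residual(y, dy, x):
    """Left side minus right side of y = y'x - (1/4)(y')^2 at the point x."""
    p = dy(x)
    return y(x) - (p * x - 0.25 * p * p)

# Each tangent line satisfies the equation ...
for c in (-2.0, 0.5, 3.0):
    y, dy = tangent_line(c)
    print(c, [clairaut_residual(y, dy, x) for x in (-1.0, 0.0, 2.5)])

# ... and so does the parabola y = x^2 itself (the singular solution).
print([clairaut_residual(lambda x: x * x, lambda x: 2 * x, x) for x in (-1.0, 0.0, 2.5)])
```

All printed residuals should vanish up to rounding: on a line, y' = 2c gives $2cx - c^2$ on the right, and on the parabola, y' = 2x gives $2x^2 - x^2 = x^2$.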
Family Orthogonal to a Given Family
50. Let F be a given family of curves. We propose to find a family
G of curves everywhere orthogonal to F. Thus, if p is a point in the
plane, and Γ is the curve in F through p with tangent $T_1$, and γ is the
curve in G through p with tangent $T_2$, we must have $\langle T_1, T_2\rangle = 0$.
Suppose the family F is given by the differential equation (Figure 4.42)
$a(x, y)x' + b(x, y)y' = 0$   (4.59)
Thus, since (x', y') is the tangent field to F, we must have $T_2$ collinear
with (a(x, y), b(x, y)) (for $\langle T_1, (a, b)\rangle = 0$ by (4.59)). Thus the
differential equation for the family G is
$\frac{x'}{a(x, y)} = \frac{y'}{b(x, y)}$
Figure 4.42
51. Find the family orthogonal to the family of hyperbolas
$xy = c$
The differential equation of this family is yx' + xy' = 0. Thus the
differential equation of the orthogonal family is
$\frac{x'}{y} = \frac{y'}{x}$
or xx' - yy' = 0, which integrates to $x^2 - y^2 = c$.
52. The family orthogonal to the family of parabolas in Example 45
is given by the differential equation
$\frac{1}{y} = \frac{y'}{-2x}$
(here x is the parameter, so x' = 1). This integrates to
$x^2 + \frac{y^2}{2} = c$   (4.61)
53. Find the family which makes an angle of π/4 with the family
(4.61). The differential equation of the family (4.61) is
$2xx' + yy' = 0$
The family orthogonal to this family has tangent collinear with 2x + iy,
thus the family we seek has tangent collinear with this vector rotated
by π/4. Thus the tangent field is collinear with $e^{i\pi/4}(2x + iy)$, or,
what is the same, (1 + i)(2x + iy) = 2x - y + i(2x + y). Thus, the
differential equation is
$\frac{x'}{2x - y} = \frac{y'}{2x + y}$
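The orthogonality claimed in Examples 51 and 52 can be tested pointwise. The following sketch is an illustration under our own assumptions: each family is written as the level curves of a function (for Example 45 we take $F = y^2/x$, whose level sets are the parabolas $y^2 = cx$), and we use the fact that two level curves cross at right angles exactly when their gradients are orthogonal. The names `gradient` and `families_orthogonal_at` are hypothetical helpers.

```python
import math

def gradient(F, x, y, h=1e-6):
    """Central-difference gradient of F at (x, y)."""
    return ((F(x + h, y) - F(x - h, y)) / (2 * h),
            (F(x, y + h) - F(x, y - h)) / (2 * h))

def families_orthogonal_at(F, G, x, y, tol=1e-6):
    """True if the level curves of F and G through (x, y) meet at a right angle."""
    fx, fy = gradient(F, x, y)
    gx, gy = gradient(G, x, y)
    scale = max(1.0, math.hypot(fx, fy) * math.hypot(gx, gy))
    return abs(fx * gx + fy * gy) <= tol * scale

hyperbolas = lambda x, y: x * y                  # Example 51: xy = c
orth_51 = lambda x, y: x * x - y * y             # x^2 - y^2 = c
parabolas = lambda x, y: y * y / x               # Example 45: y^2 = cx
orth_52 = lambda x, y: x * x + y * y / 2         # (4.61)

for point in ((1.3, 0.7), (2.0, -1.5)):
    print(families_orthogonal_at(hyperbolas, orth_51, *point),
          families_orthogonal_at(parabolas, orth_52, *point))
```

Both columns should print True at every sample point. Replacing `orth_51` by, say, $x^2 + y^2$ should give False away from the axes, which is a useful sanity check on the test itself.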
Envelopes
Many of the families we have been studying have the property that there
is a curve (or curves) which is not a member of the family but bounds the
family (see Figures 4.39-4.41). Similarly, for a family of lines tangent to a
given curve, the curve bounds the family. Such a bounding curve is called
an envelope. We want to see how to find envelopes for families.
First of all, some families do not admit envelopes. Clearly, the families
χ = с, у = с, у = χ2 + с do not admit envelopes. However, if an envelope
exists we can find it by the present techniques.
Definition 7. Let F be a family of curves in the plane. A curve Γ is an
envelope for the family F if through every point p in Γ there goes a curve in
F which is tangent to Γ at p.
Suppose that a family is given implicitly by
$F(x, y, c) = 0$
and that the curve Γ: y = f(x) is an envelope of this family. Then, for every
$x_0$ there is a $c(x_0)$ such that the curve C corresponding to $F(x, y, c(x_0)) = 0$
is tangent to Γ at $(x_0, f(x_0))$. Thus we must have
$F(x_0, f(x_0), c(x_0)) = 0$   (4.62)
and since the curve C has the tangent direction $(1, f'(x_0))$, we must have, by
(4.54),
$\frac{\partial F}{\partial x}(x_0, f(x_0), c(x_0)) + \frac{\partial F}{\partial y}(x_0, f(x_0), c(x_0))\,f'(x_0) = 0$   (4.63)
Differentiating (4.62) with respect to $x_0$ we also find
$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,f'(x_0) + \frac{\partial F}{\partial c}\,c'(x_0) = 0$   (4.64)
Comparing (4.63) and (4.64) we have as a result
$\frac{\partial F}{\partial c}(x_0, f(x_0), c(x_0))\,c'(x_0) = 0$   (4.65)
Thus, wherever $c'(x_0) \ne 0$ (that is, wherever the member of the family touching Γ
actually changes from point to point), if (x, y) is on the envelope Γ, there is a c such that
$F(x, y, c) = 0 \qquad \frac{\partial F}{\partial c}(x, y, c) = 0$
and we can eliminate c from this pair of equations to obtain an implicit
equation of Γ. Notice that from (4.62) and (4.63), the equations
$F(x, y, c) = 0 \qquad \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\,y' = 0$
also hold on Γ. Eliminating c from this pair we obtain once again the
differential equation of the family, so the envelope must also satisfy this
differential equation.
Examples
54. Find the envelopes of the family
$(x - c)^2 + (y - 1)^2 = 1$   (4.66)
of Example 41. We differentiate with respect to c to find
-2(x - c) = 0 or x = c
Eliminating c we obtain $(y - 1)^2 = 1$, or y = 2, y = 0.
55. Find the envelopes of the family
$(x - c)^2 + (y - c)^2 = c^2$   (4.67)
of Example 42. We must eliminate c from this equation and
$-2(x - c) - 2(y - c) = 2c$
or
$x + y = c$
Substituting this in (4.67) we obtain
$(-y)^2 + (-x)^2 = (x + y)^2$
or
$2xy = 0$
Thus the envelopes are x = 0, y = 0.
56. Find the envelopes of the family
$y = x^2 \sin cx$
Differentiation with respect to c yields
$0 = x^3 \cos cx$
or
$x = 0, \qquad cx = \frac{\pi}{2}, \frac{3\pi}{2}, \ldots$
The condition x = 0 gives only the origin, which fails as an envelope. But
cx = π/2, 3π/2, ... yields the envelopes $y = \pm x^2$ (Figure 4.43).
57. Find the envelope of the family given by
$y = y'x + 1 + (y')^2$
This is a Clairaut equation and has the solution
$y = cx + 1 + c^2$
Figure 4.43
Differentiation with respect to c yields
$0 = x + 2c$ or $c = -\frac{x}{2}$
Thus the envelope of this family is the curve
$y = -\frac{x^2}{4} + 1$
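When the elimination of c cannot be carried out by hand, the pair of equations F = 0, ∂F/∂c = 0 can still be solved numerically for each fixed c. The sketch below is one possible approach under our own assumptions, not the text's method: it applies Newton's method in the unknowns (x, y), with the partial derivative `Fc` supplied by the user and the Jacobian estimated by central differences. The name `envelope_point` and its parameters are hypothetical.

```python
def envelope_point(F, Fc, c, guess, steps=50, h=1e-6):
    """Solve F(x, y, c) = 0 and Fc(x, y, c) = 0 for (x, y) by Newton's method."""
    x, y = guess
    for _ in range(steps):
        f, g = F(x, y, c), Fc(x, y, c)
        # Jacobian of (F, Fc) with respect to (x, y), by central differences.
        fx = (F(x + h, y, c) - F(x - h, y, c)) / (2 * h)
        fy = (F(x, y + h, c) - F(x, y - h, c)) / (2 * h)
        gx = (Fc(x + h, y, c) - Fc(x - h, y, c)) / (2 * h)
        gy = (Fc(x, y + h, c) - Fc(x, y - h, c)) / (2 * h)
        det = fx * gy - fy * gx
        if det == 0:
            raise ZeroDivisionError("singular Jacobian; try another starting guess")
        # Newton step from Cramer's rule applied to J * (dx, dy) = (f, g).
        dx = (f * gy - fy * g) / det
        dy = (fx * g - f * gx) / det
        x, y = x - dx, y - dy
        if abs(dx) + abs(dy) < 1e-12:
            break
    return x, y

# Example 54: rolling circles; the upper envelope is y = 2.
circles = lambda x, y, c: (x - c) ** 2 + (y - 1) ** 2 - 1
circles_c = lambda x, y, c: -2 * (x - c)
print(envelope_point(circles, circles_c, 3.0, (3.2, 1.7)))

# Example 57: lines y = cx + 1 + c^2; the envelope is y = 1 - x^2/4.
lines = lambda x, y, c: y - c * x - 1 - c * c
lines_c = lambda x, y, c: -x - 2 * c
print(envelope_point(lines, lines_c, 1.0, (0.0, 0.0)))
```

For the circles with c = 3 the iteration started near the top of the circle should settle at about (3, 2); starting below the center it would find the other envelope y = 0 instead. For the lines with c = 1 it should return about (-2, 0), which lies on $y = 1 - x^2/4$.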
• EXERCISES
12. Find the differential equations for these families of curves:
(a) xyc = 1
(b) sin xy - a cos xy = 0
(c) $xe^{cy} = 1$
(d) x sin y + c sin x = 0
(e) $ye^{c(x + y)} = 1$
(f) sin(x + y + c) + cos(x + y + c) = 1
13. Find the implicit form of the family given by these differential
equations:
(a) xy' - yx' = 0
(b) x' + yy' = 1
(c) $(y')^2 + y^2 = 1$
(d) y + y'x + sin y = 0
(e) y'(sec x - tan x) = 1 - y
14. Find the implicit form and the differential equation of the family of
circles with center on the у axis and tangent to the χ axis.
15. Find the family of ellipses with foci at (-1, 0), (0, 1).
16. Find the family of curves orthogonal to the family in Exercise 14;
Exercise 15.
17. Find the family orthogonal to the families of Exercises 12(a), (b),
(f), 13(b), (d).
18. Find the family making an angle of π/3 with the family of Exercises
12(a), (b), (c).
19. Find the envelopes of the families of Exercises 12(a), (b), (c), (d), (e).
20. Find the envelopes of these families:
(a) $y = \sin (x - c)^2$
(b) Exercise 13(b)
(c) Exercise 13(e)
(d) $y = e^x \sin cx$
(e) The family of cardioids $r = (1 + c)^{-1}(1 + c\cos\theta)$.
(f) The family r = sin αθ.
• PROBLEMS
38. Find the family of evolutes of the parabola у = л:2. Find the family
orthogonal to this family of evolutes.
39 Find the family orthogonal to the family of spirals r = ce".
40. A ladder 10 feet tall originally leaning against a building slips
(Figure 4.44). Find the family of curves which are the trajectories of the
points on the ladder.
Figure 4.44
41. Find the family of trajectories of the points on the circumference of a
ball rolling on a horizontal plane.
42. A line segment of length 2 has its endpoints on the parabola $y = x^2$.
Find the trajectories of points $x_0$ on the segment as it slides along the
parabola (Figure 4.45).
43. A ball of unit mass is at the end of a string of unit length attached
to the top of a vertical bar rotating at constant angular velocity. Find the
path of motion of the ball assuming its position and velocity at time t = 0
to be (1, 0, 0), (1, 0, 1), respectively. Find the trajectory of any point on
the string.
44. Find the family of curves swept out by the midpoints of bars of given
length with endpoints along the curve $xy = 1$ in the first quadrant.
Figure 4.45
4.6 Vector Fields and Fluid Flows
We have come across vector fields several times already: the gradient of a
function, the gravitational field, a field of forces, are all vector fields. We
now want to study such fields in connection with fluid flows: motions of a
mass of noninteracting particles.
Figure 4.46
A vector field is a function which assigns to each point in a given domain
in $R^n$ a vector in $R^n$, usually considered as based at the given point. Thus,
a vector field defined on $D$ in $R^n$ is nothing more than an $R^n$-valued function
on $D$, but interpreted pictorially as in Figure 4.46.
Examples
58. A body in space sets up a field of gravitational attraction.
Suppose there is a body of unit mass situated at the origin. According
to Newton's laws another body of unit mass is attracted to the given
body at the origin with a force proportional to the inverse of the
distance squared. We represent this attraction at a point $p$ by a
vector directed toward the origin and of length $\|p\|^{-2}$ (see Figure 4.47).
Thus the gravitational field of a body situated at the origin is the
vector field defined on $R^3 - \{0\}$ by
$$v(p) = -\frac{p}{\|p\|^3}$$
or, in rectangular coordinates
$$v(x, y, z) = -\frac{(x, y, z)}{(x^2 + y^2 + z^2)^{3/2}}$$
59. Given a family of curves, we may consider the field of unit
tangents to the family (Figure 4.48). In particular the field of
tangents to the family of circles $x^2 + y^2 = c^2$ is defined on $R^2 - \{(0, 0)\}$,
Figure 4.47
and is given by
$$T(x, y) = \frac{(-y, x)}{(x^2 + y^2)^{1/2}}$$
The family of unit tangents to the family of rays is defined on
$R^2 - \{(0, 0)\}$ by
$$T(x, y) = \frac{(x, y)}{(x^2 + y^2)^{1/2}}$$
Figure 4.48
If we are given a vector field $v$ on a domain $D$ in $R^n$, the questions arise: Is
it a field tangent to a family of curves, and if so, can we discover the curves?
Suppose then that $v$ is a given vector field in the domain $D$, and $\Gamma$ is a
curve in $D$ such that $v(x)$ is tangent to $\Gamma$ at each point $x$ on $\Gamma$. Let $f$ be a
function which parametrizes the curve $\Gamma$. Then $f'(t)$ is tangent to $\Gamma$ at $f(t)$,
so we must have $f'(t)$ and $v(f(t))$ collinear. In particular then, if $f$ is a solution
of the differential equation
$$f'(t) = v(f(t))$$
then $f$ parametrizes a curve tangent to the given field. In the terminology
of the preceding section
$$\frac{dX}{dt} - v(X) = 0$$
is the (parametric) differential equation of the family of curves tangent to
the vector field.
60. Suppose $v(x, y) = (x, 2y)$ (Figure 4.49). Then the family of
curves tangent to the vector field $v$ is given parametrically by this
differential equation:
$$x' = x, \qquad y' = 2y \tag{4.68}$$
$$x(0) = x_0, \qquad y(0) = y_0$$
The solution is given by
$$x = x_0 e^t, \qquad y = y_0 e^{2t} \tag{4.69}$$
We can write this family of curves implicitly as
$$y - cx^2 = 0$$
(taking the constant $c$ as $y_0 x_0^{-2}$). Thus the family we seek is a system
of parabolas.
Another way to find the implicit equation of the curve is to divide
one equation in (4.68) by the other:
$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{2y}{x}$$
This we can solve directly by separation of variables.
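Example 60 can also be explored numerically. The sketch below is our own illustration, not the author's method: it integrates the system (4.68) with a classical Runge-Kutta step (an assumed choice of integrator) and checks that the quantity $c = y/x^2$ stays constant along the computed curve, as the implicit form predicts.

```python
import math

def v(p):
    """Vector field v(x, y) = (x, 2y) of Example 60."""
    x, y = p
    return (x, 2 * y)

def rk4_step(p, h):
    """One classical Runge-Kutta step for p' = v(p)."""
    def shifted(a, b, s):
        return (a[0] + s * b[0], a[1] + s * b[1])
    k1 = v(p)
    k2 = v(shifted(p, k1, h / 2))
    k3 = v(shifted(p, k2, h / 2))
    k4 = v(shifted(p, k3, h))
    return (p[0] + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6,
            p[1] + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6)

def integrate(p0, t_end, steps=1000):
    """Follow the curve tangent to v from p0 for time t_end."""
    p, h = p0, t_end / steps
    for _ in range(steps):
        p = rk4_step(p, h)
    return p

x0, y0 = 1.5, -0.7
x1, y1 = integrate((x0, y0), 1.0)
print(x1, x0 * math.e)          # numerical and exact x-coordinate at t = 1
print(y1 / x1**2, y0 / x0**2)   # the constant c = y / x^2 before and after
```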
Figure 4.49
61. Let $v(x, y) = (1, x + y)$. Then the differential equation is
$$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{x + y}{1} \quad\text{or}\quad y' = x + y$$
which has the general solution
$$y = -(x + 1) + ce^x$$
Now let us consider a fluid in motion in a domain $D$ in $R^n$. The equations
of fluid motion are written as follows. We suppose that at time $t = 0$
there is a particle of fluid at each point $x_0$ in $D$. The position of that particle
at the subsequent time $t$ is denoted by $\phi(x_0, t)$. The equation of motion
then is
$$x = \phi(x_0, t) \tag{4.70}$$
For a fixed $x_0$, the curve described by (4.70) is the path of the particle which
is at $x_0$ at time $t = 0$. Thus we are assuming that
$$x_0 = \phi(x_0, 0) \tag{4.71}$$
It is also assumed that no two particles can ever occupy the same position
at the same time. Then for each $t$, the function $\phi(x_0, t)$ is one-to-one and
thus can always be inverted: there is also a function $\psi(x, t)$ which describes
the $t = 0$ position of the particle at $x$ at time $t$ such that
$$x = \phi(x_0, t) \quad\text{if and only if}\quad x_0 = \psi(x, t)$$
Definition 8. Given the fluid motion described by Equation (4.70), its
velocity at the time $t = t_0$ is the vector field
$$\frac{\partial \phi}{\partial t}(x_0, t_0)$$
situated at the point $x = \phi(x_0, t_0)$. If the vector field is independent of
time, we say that the fluid motion is a steady flow.
Thus the velocity field of a flow at time $t_0$ and point $x$ is the velocity
$v(x, t_0)$ of the particle which is at $x$ at that time. If the velocity is
independent of the time, or the particular particle, the flow is steady. The flow in a
river of constant volume is determined by the shape of the river bed, and thus
tends to be steady, whereas the flow of clouds in the sky is time dependent.
If the flow is steady, then the path lines (the curves described by (4.70)) are
the curves of the family tangent to the velocity field. If the velocity field is
time dependent, then these tangent families (called the lines of force) vary
with time and have little to do with the paths of individual particles. This is
easy to see. Suppose the flow
$$x = \phi(x_0, t) \tag{4.72}$$
has the velocity field $v(x, t)$. Then the path lines (4.72) are the solutions of
the differential equation
$$\frac{dx}{dt} = v(x(t), t), \qquad x(0) = x_0 \tag{4.73}$$
The lines of force at time $t = t_0$ are the solutions of the equation
$$\frac{dx}{d\tau} = v(x(\tau), t_0), \qquad x(0) = x_0 \tag{4.74}$$
These are the same differential equations if and only if $v(x, t) = v(x, t_0)$ for
all $t$, that is, if and only if the flow is steady.
Examples
62. Consider the flow in $R^2$ given by Equation (4.69):
$$x = x_0 e^t, \qquad y = y_0 e^{2t} \tag{4.75}$$
Then
$$x' = x_0 e^t, \qquad y' = 2y_0 e^{2t} \tag{4.76}$$
Thus the velocity at time $t$ of the particle originally at $(x_0, y_0)$ is
$(x_0 e^t, 2y_0 e^{2t})$. To find the velocity field we must solve (4.75) for
$x_0$, $y_0$ in terms of $x$, $y$, $t$ and substitute. Thus (4.76) becomes
$$x' = x, \qquad y' = 2y$$
so the velocity field is $v(x, y) = (x, 2y)$ and the flow is steady.
63. Consider the flow in $R^3$ given by
$$x = x_0 + t, \qquad y = y_0 + t^2, \qquad z = z_0 + t^3$$
$$x' = 1, \qquad y' = 2t, \qquad z' = 3t^2$$
Thus the velocity field
$$v(x, y, z) = (1, 2t, 3t^2)$$
is independent of position but is time dependent. In fact, the path
lines are independent of position and are just translates of the twisted
cubic (Figure 4.50). It is as if all of space were being rigidly
translated along the curve $y = x^2$, $z = x^3$. Notice that since the
velocity field at any given time $t = t_0$ is a constant field, the lines of
force are straight lines.
Figure 4.50
64. $x = x_0 + t$, $y = y_0(1 + t)$, $z = z_0 e^t$. Then
$$\frac{dx}{dt} = (1, y_0, z_0 e^t)$$
so the velocity field is
$$v(x, y, z) = \left(1, \frac{y}{1 + t}, z\right)$$
and the flow is not steady. The lines of force at time $t = t_0$ are the solutions
of
$$x' = 1, \qquad y' = \frac{y}{1 + t_0}, \qquad z' = z$$
that is, the family
$$x = x_0 + t, \qquad y = y_0 \exp\frac{t}{1 + t_0}, \qquad z = z_0 e^t$$
which is quite different from the family of path lines.
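To see how different the two families of Example 64 are, one can evaluate both closed forms from the same starting point. This is a small illustrative sketch in Python; the function names `path_line` and `line_of_force` are ours.

```python
import math

def path_line(p0, t):
    """Position at time t of the particle that starts at p0 (Example 64)."""
    x0, y0, z0 = p0
    return (x0 + t, y0 * (1 + t), z0 * math.exp(t))

def line_of_force(p0, tau, t0):
    """Point at parameter tau on the curve through p0 that is tangent
    to the velocity field frozen at time t0."""
    x0, y0, z0 = p0
    return (x0 + tau, y0 * math.exp(tau / (1 + t0)), z0 * math.exp(tau))

p0 = (0.0, 1.0, 1.0)
for s in (0.5, 1.0, 2.0):
    # The x and z coordinates agree; the y coordinates do not.
    print(s, path_line(p0, s), line_of_force(p0, s, 0.0))
```

Along the path line the $y$-coordinate grows linearly, while along the line of force at $t_0 = 0$ it grows exponentially.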
65. The flow is given by
$$x = x_0 e^t, \qquad y = y_0 e^{-t} + x_0(e^t - e^{-t}), \qquad z = z_0 e^{2t} - x_0(e^t - e^{2t}) \tag{4.77}$$
Then
$$x' = x_0 e^t, \qquad y' = -y_0 e^{-t} + x_0 e^t + x_0 e^{-t}, \qquad z' = 2z_0 e^{2t} - x_0(e^t - 2e^{2t}) \tag{4.78}$$
The Equations (4.77) are linear in $x_0, y_0, z_0$, so we may solve for these
in terms of $x, y, z$. Doing so, and substituting the result in (4.78),
we obtain the velocity field of the flow,
$$v(x, y, z) = (x, 2x - y, 2z + x)$$
This flow is time independent, or steady.
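The algebra of Example 65 is easy to get wrong, so a numerical check is worthwhile. The sketch below (our own illustration; the step size `h` is an assumed value) differentiates the flow (4.77) by central differences and compares the result with the velocity field $v(x, y, z) = (x, 2x - y, 2z + x)$ evaluated at the moving point.

```python
import math

def flow(p0, t):
    """The flow (4.77) of Example 65."""
    x0, y0, z0 = p0
    e1, em1, e2 = math.exp(t), math.exp(-t), math.exp(2 * t)
    return (x0 * e1,
            y0 * em1 + x0 * (e1 - em1),
            z0 * e2 - x0 * (e1 - e2))

def v(p):
    """Velocity field claimed in Example 65."""
    x, y, z = p
    return (x, 2 * x - y, 2 * z + x)

def max_mismatch(p0, t, h=1e-5):
    """Largest componentwise gap between d(flow)/dt and v(flow) at time t."""
    a, b = flow(p0, t + h), flow(p0, t - h)
    deriv = [(ai - bi) / (2 * h) for ai, bi in zip(a, b)]
    return max(abs(d - w) for d, w in zip(deriv, v(flow(p0, t))))

print(max_mismatch((1.0, -2.0, 0.5), 0.7))  # should be tiny (finite-difference error only)
```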
It is an immediate consequence of the uniqueness assertion of Picard's
theorem that a flow is completely determined by its velocity field. For the
flow equation is the solution of the initial value problem (4.73), which is
unique. Notice also that the existence part of Picard's theorem asserts that
there always is a flow associated with a given velocity field (which is sufficiently
smooth).
The last remark we care to make at this time (we shall continue the study
of fluid flows in Chapter 8) is that in the case of a steady flow, the particles
follow one another along a fixed family of paths (whereas in general each
particle determines its own path). These are of course the lines of force.
What we must show is this: If two particles $x_0$, $x_1$ occupy the same position
(at different times, of course), then they follow the same paths. That is, if
there are $s_0$, $s_1$ such that
$$\phi(x_0, s_0) = \phi(x_1, s_1)$$
then the curves
$$\Gamma_0: x = \phi(x_0, t) \qquad \Gamma_1: x = \phi(x_1, t)$$
are the same, except for parametrization. The following proposition proves
this, and more. It makes explicit the relation between the two
parametrizations.
Proposition 7. Suppose $x = \phi(x_0, t)$ describes a steady flow. If for some
$(x_0, s_0)$, $(x_1, s_1)$, we have
$$\phi(x_0, s_0) = \phi(x_1, s_1)$$
then
$$\phi(x_0, s_0 + t) = \phi(x_1, s_1 + t) \quad\text{for all } t \tag{4.79}$$
In particular, $x_1 = \phi(x_0, s_0 - s_1)$.
Proof. The proof is simply that the two functions in (4.79) solve the same
initial value problem. Let $v(x)$ be the velocity field of the flow (by assumption $v$ is
time independent). Consider these functions
$$f_0(t) = \phi(x_0, s_0 + t), \qquad f_1(t) = \phi(x_1, s_1 + t) \tag{4.80}$$
We have
$$f_0(0) = f_1(0)$$
Since
$$\frac{\partial \phi}{\partial t}(x_0, t) = v(\phi(x_0, t))$$
for all $x_0$, $t$, we have
$$\frac{df_0}{dt}(t) = \frac{\partial}{\partial t}\phi(x_0, s_0 + t) = v(\phi(x_0, s_0 + t)) = v(f_0(t))$$
$$\frac{df_1}{dt}(t) = v(f_1(t))$$
Thus $f_0$, $f_1$ solve the same first-order differential equation and, by (4.80) and the
hypothesis, have the same value at 0. Thus $f_0 = f_1$ identically.
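Proposition 7 can be illustrated with the steady flow of Example 62. In the following Python sketch (our own, with arbitrarily chosen sample values) we pick a particle $x_0$ and times $s_0$, $s_1$, define $x_1 = \phi(x_0, s_0 - s_1)$, and check that the two particles then travel together in the sense of (4.79).

```python
import math

def phi(p0, t):
    """Steady flow of Example 62: phi((x0, y0), t) = (x0 e^t, y0 e^{2t})."""
    x0, y0 = p0
    return (x0 * math.exp(t), y0 * math.exp(2 * t))

def close(p, q, tol=1e-12):
    """Componentwise comparison with a relative tolerance."""
    return all(abs(a - b) <= tol * max(1.0, abs(a), abs(b)) for a, b in zip(p, q))

p0, s0, s1 = (0.3, -1.2), 0.8, 0.25
p1 = phi(p0, s0 - s1)   # the particle that reaches the common point at time s1

print(close(phi(p0, s0), phi(p1, s1)))
print(all(close(phi(p0, s0 + t), phi(p1, s1 + t)) for t in (-1.0, 0.0, 0.5, 2.0)))
```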
Planetary Motion
We conclude this chapter with a study of the classical equations of planetary
motion. This study first requires these simplifications. We assume all
action is in a plane, and that the only force acting is that due to the sun's
gravitational field. These simplifications approximate the true situation with
enormous accuracy. For the other forces acting on the body are gravitational
forces due to other celestial bodies, which are either too far away or too small,
relative to the sun, to make a substantial contribution. According to
Newton's laws, the acceleration of a body due to the gravitational field is
proportional to the field. The motion is thus completely determined by
this force and an initial position and velocity. For if $s = s(t)$ is the equation
of motion, then $s$ is the solution of an initial value problem:
$$s(0) = s_0$$
$$s'(0) = v_0$$
$$s''(t) = kF(s(t))$$
where $F$ is the given force field.
Our purpose here is to describe the motion of planets in terms of an
observed position and velocity. If we locate the sun at the origin, then the
gravitational force field is given (in complex notation) by
$$F(z) = -\frac{z}{|z|^3}$$
Thus, we must explicitly solve this system
$$z(0) = z_0$$
$$z'(0) = v_0 \tag{4.81}$$
$$z''(t) = -\frac{z(t)}{|z(t)|^3}$$
The best way to solve this is by means of polar coordinates. Write
$z(t) = r(t)e^{i\theta(t)}$. Then differentiating, we have
$$z' = r'e^{i\theta} + i\theta' re^{i\theta} \tag{4.82}$$
$$z'' = r''e^{i\theta} + 2i\theta' r'e^{i\theta} + i\theta'' re^{i\theta} - (\theta')^2 re^{i\theta} \tag{4.83}$$
and Equation (4.81) becomes
$$z'' = r''e^{i\theta} + 2i\theta' r'e^{i\theta} + i\theta'' re^{i\theta} - (\theta')^2 re^{i\theta} = -\frac{e^{i\theta}}{r^2} \tag{4.84}$$
Multiplying through by $e^{-i\theta}$, we obtain
$$r'' - (\theta')^2 r + i(2\theta' r' + r\theta'') = -\frac{1}{r^2}$$
which reduces to this system (equating real and imaginary parts):
$$r'' - (\theta')^2 r = -\frac{1}{r^2}, \qquad 2\theta' r' + r\theta'' = 0 \tag{4.85}$$
The second equation reads
$$2(\ln r)' = \frac{2r'}{r} = -\frac{\theta''}{\theta'} = -(\ln\theta')'$$
so either $\theta' = 0$ or $\theta'$ is proportional to $r^{-2}$. We have then these two
alternatives. In one case $\theta$ is constant, in which case the planet approaches the
sun along a straight line. In the other case, the planet rotates around the
sun at an angular velocity inversely proportional to the square of the distance
from the sun (the closer it is to the sun the faster it rotates around it). Notice
also that the solution $r$ = constant, $\theta'$ = constant is possible, so that an
admissible path is that of circular motion of constant angular velocity. The
angular velocity decreases as the circle gets larger.
We proceed now to the full solution of (4.84). We already have $\theta' r^2 = h$,
a constant (determined by the initial conditions). From (4.84), we obtain
$$z'' = -\frac{e^{i\theta}}{r^2} = -\frac{1}{h}e^{i\theta}\theta' = \frac{i}{h}(e^{i\theta})'$$
Thus we can integrate to obtain
$$z' = \frac{i}{h}e^{i\theta} + C$$
where $C = \rho e^{i\omega}$ is an arbitrary constant. Now, using (4.82) we have
$$r'e^{i\theta} + i\theta' re^{i\theta} = \frac{i}{h}e^{i\theta} + \rho e^{i\omega}$$
Multiply through by $e^{-i\theta}$ and equate imaginary parts:
$$\theta' r = \frac{1}{h} + \rho\sin(\omega - \theta)$$
Once again using $\theta' r^2 = h$, we obtain this implicit relation between $r$ and $\theta$:
$$h = r\left[\frac{1}{h} + \rho\sin(\omega - \theta)\right]$$
or
$$r = \frac{h^2}{1 + \rho h\sin(\omega - \theta)} \tag{4.86}$$
The constants $\rho$, $h$, $\omega$ are to be determined by the initial conditions. Equation
(4.86) is the polar form of the equation of a conic with one focus at the origin.
If $\rho h < 1$, it is an ellipse; $\rho h = 1$, a parabola; and $\rho h > 1$, a hyperbola.
These are then the possible paths of motion of a planet, or comet, around the
sun.
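The derivation above can be tested against a direct numerical integration of (4.81). The following Python sketch is an illustration under our own assumptions (a classical Runge-Kutta integrator, unit constants, and arbitrarily chosen initial data); it is not taken from the text. It tracks the two constants of the motion, $h = r^2\theta'$ and $C = z' - (i/h)e^{i\theta}$, and measures how far the computed positions stray from the conic (4.86).

```python
import cmath
import math

def accel(z):
    """Right side of (4.81): z'' = -z/|z|^3."""
    return -z / abs(z) ** 3

def rk4_step(z, w, dt):
    """One Runge-Kutta step for the first-order system z' = w, w' = accel(z)."""
    k1z, k1w = w, accel(z)
    k2z, k2w = w + 0.5 * dt * k1w, accel(z + 0.5 * dt * k1z)
    k3z, k3w = w + 0.5 * dt * k2w, accel(z + 0.5 * dt * k2z)
    k4z, k4w = w + dt * k3w, accel(z + dt * k3z)
    z_new = z + dt * (k1z + 2 * k2z + 2 * k3z + k4z) / 6
    w_new = w + dt * (k1w + 2 * k2w + 2 * k3w + k4w) / 6
    return z_new, w_new

def constants(z, w):
    """h = r^2 theta' and C = z' - (i/h) e^{i theta} for position z, velocity w."""
    h = (z.conjugate() * w).imag
    C = w - 1j * (z / abs(z)) / h
    return h, C

def conic_residual(z, h, C):
    """r - h^2 / (1 + rho h sin(omega - theta)); vanishes on the curve (4.86)."""
    r, theta = cmath.polar(z)
    rho, omega = cmath.polar(C)
    return r - h * h / (1 + rho * h * math.sin(omega - theta))

def max_drift(z0, w0, dt=1e-3, steps=5000):
    """Largest change in h, largest change in C, largest conic residual."""
    h0, C0 = constants(z0, w0)
    z, w = z0, w0
    worst = [0.0, 0.0, 0.0]
    for _ in range(steps):
        z, w = rk4_step(z, w, dt)
        h, C = constants(z, w)
        worst[0] = max(worst[0], abs(h - h0))
        worst[1] = max(worst[1], abs(C - C0))
        worst[2] = max(worst[2], abs(conic_residual(z, h0, C0)))
    return worst

print(max_drift(1 + 0j, 0.2 + 1.1j))  # all three numbers should be very small
```

For the initial data used here $\rho h < 1$, so the computed path is an arc of an ellipse.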
• EXERCISES
21. Find the family of curves tangent to the given vector fields:
(a) $v(x, y) = (x, -y)$
(b) $v(x, y) = (-y, x)$
(c) $v(x, y, z) = (-x, -y, z)$
(d) $v(x, y, z) = (x, -1, z)$
22. Find a field of vectors tangent to these families:
(a) z = ec+")r
(b) z = ec + ,!
(c) χ = let, y = l— (ct)2
(d) $x = x_0 + t$, $y = e^t y_0$, $z = \sin t$
23. Find the velocity field of these flows:
(a) $x(t) = (e^{-t}x_0, y_0 + t, e^{-t}z_0)$
(b) $x(t) = (x_0(1 + t), y_0(1 + t), z_0 + t(x_0^2 + y_0^2))$
(c) $x(t) = (x_0, y_0 + t, z_0\cos t)$
(d) $x(t) = e^{-t}(x_0, y_0, z_0\cos t)$
24. Find the flow with the given velocity field:
(a) $v(t) = t(x, y, z)$
(b) $v(t) = t(-y, x, 1)$
(c) Exercise 21(b).
(d) Exercise 21(c).
25. Is there a steady flow whose path lines are the trajectories of the
particles at (x0, y0, 0) at time t = 0 in the flow in Exercise 23(b) ?
PROBLEMS
45. Under what conditions on the velocity field of a flow are the lines of
force at all times the same as the paths of motion?
46. Consider a flow which spirals around the line $L$: $x = y = z$ at constant
angular velocity, whose distance from the origin increases exponentially
with time and whose distance from L decreases exponentially with time.
Find the velocity field of the flow.
47. If we are given a family of curves in the plane we may consider the
tangent field of the family as well as its differential equation and the tangent
field of the orthogonal family as well as its differential equation. How are
all these formulas related ?
4.7 Summary
The image in $R^n$ of an interval under a one-to-one $C^1$ function with a
nowhere vanishing derivative is called a curve. If $\Gamma$ is a curve given by the
function
$$x = f(t) \qquad a \le t \le b$$
the variable $t$ is called the parameter of the curve. If
$$x = g(\tau) \qquad \alpha \le \tau \le \beta$$
is another parametrization of the curve, there is a one-to-one function
$$t = \sigma(\tau)$$
mapping the interval $[\alpha, \beta]$ onto the interval $[a, b]$ such that
$$g(\tau) = f(\sigma(\tau)) \qquad \alpha \le \tau \le \beta$$
If $\sigma' > 0$ ($\sigma$ is increasing) we say that the parameters $t$, $\tau$ induce the same
orientation on $\Gamma$. This notion divides all parametrizations into two classes.
An oriented curve is one for which one of these classes, the well-oriented
parameters, is chosen.
If $F$ is a differentiable function of two variables such that $\nabla F$ is never zero,
then the equation $F(x, y) = 0$ defines a curve implicitly. For we can find a
parametrization
$$x = f(t) \qquad y = g(t)$$
for the set $F(x, y) = 0$. Similarly, if $F$, $G$ are two differentiable functions
of three variables such that $\nabla F$ and $\nabla G$ are everywhere independent, then the set
$$F(x, y, z) = 0 \qquad G(x, y, z) = 0$$
implicitly defines a curve in $R^3$.
If $\Gamma$ is an oriented curve with a parametrization
$$x = f(t) \qquad a \le t \le b$$
the length of $\Gamma$ between $f(a)$ and $f(t)$ for $a \le t \le b$ is defined to be the least
upper bound of all sums
$$\sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\|$$
over all choices of points $t_0, \ldots, t_k$ such that
$$a = t_0 < t_1 < \cdots < t_k = t$$
If $s(t)$ is this number, the function $s = s(t)$, $a \le t \le b$ gives a parametrization
of $\Gamma$. This is the parametrization by arc length. $s$ is the solution of the
differential equation
$$s'(t) = \|f'(t)\|$$
$$s(a) = 0$$
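The two descriptions of length, as a least upper bound of polygonal sums and as the integral of $\|f'(t)\|$, can be compared on a concrete curve. The sketch below is our own example, not from the text: it uses the helix $f(t) = (\cos t, \sin t, t)$, for which $\|f'(t)\| = \sqrt{2}$, so that the length over $0 \le t \le 2\pi$ is $2\pi\sqrt{2}$.

```python
import math

def f(t):
    """A sample curve: one turn of a helix."""
    return (math.cos(t), math.sin(t), t)

def polygon_length(curve, a, b, k):
    """Sum of |curve(t_i) - curve(t_{i-1})| over k equal subdivisions of [a, b]."""
    ts = [a + (b - a) * i / k for i in range(k + 1)]
    pts = [curve(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(pts, pts[1:]))

exact = 2 * math.pi * math.sqrt(2)   # integral of |f'(t)| = sqrt(2) over [0, 2*pi]
for k in (4, 16, 64, 1024):
    print(k, polygon_length(f, 0.0, 2 * math.pi, k), exact)
```

The polygonal sums never exceed the exact length and increase toward it as the subdivision is refined.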
The unit tangent to a curve $\Gamma$: $x = x(s)$ is the vector $T(s) = x'(s)$. The
tangent line is the line through $x(s)$ spanned by this vector. The unit normal
to the curve is a choice of unit vector $N(s)$ lying on the line spanned by $T'(s)$.
In two dimensions $N$ is chosen so that the rotation $T \to N$ is
counterclockwise. In three dimensions the $T$-$N$ plane is called the osculating plane.
The unit binormal is the vector $B$ so that the basis $T \to N \to B$ is a right-handed
orthonormal basis: $B = T \times N$. This frame is determined by these
differential equations, the Frenet-Serret formulas:
$$T' = \kappa N$$
$$N' = -\kappa T + \tau B$$
$$B' = -\tau N$$
The scalar functions $\kappa$, $\tau$, the curvature and torsion respectively of the curve,
are defined by the first and third equations. The curvature $\kappa$ is the angular
velocity of the tangent in the osculating plane and the torsion is the angular
velocity of the osculating plane about the tangent. A curve in $R^3$ is uniquely
determined (but for Euclidean motions) by its curvature and torsion. A
curve in $R^2$ is uniquely determined (but for Euclidean motion) by its curvature.
If $x = f(t)$ is the equation of motion of a particle, we call the curve
described by this function the path of motion; $ds/dt$ is the speed, $f'(t)$ is the
velocity and $f''(t)$ is the acceleration of the particle. The acceleration vector
lies in the osculating ($T$-$N$) plane. We can write
$$\text{acceleration} = a_T T + a_N N$$
where $a_T$ is the tangential acceleration and $a_N$ the normal acceleration. These
equations hold:
$$a_T = \frac{d^2 s}{dt^2} \qquad a_N = \left(\frac{ds}{dt}\right)^2 \kappa$$
where $\kappa$ is the curvature of the path of motion.
A family of curves in the plane is a collection of curves $\{\Gamma_c\}$ as $c$ ranges
through some set. A pair of equations
$$x = x(t, c) \qquad y = y(t, c)$$
determines a family of curves. This is the explicit form of the family. A
functional equation
$$F(x, y, c) = 0$$
also determines a family. This is the implicit form of the family. The set
of solutions of a differential equation
$$a(x, y)x' + b(x, y)y' = 0 \tag{4.87}$$
forms a family of curves in the plane. If
$$F(x, y, c) = 0 \tag{4.88}$$
is the implicit form of a family, its differential equation is found by eliminating
$c$ from (4.88) and
$$\frac{\partial F}{\partial x}x' + \frac{\partial F}{\partial y}y' = 0$$
If (4.87) is the differential equation of a family $F$, the family of curves
orthogonal to the family $F$ is given by the differential equation
$$-b(x, y)x' + a(x, y)y' = 0$$
A vector field in a domain $U \subset R^n$ is an $R^n$-valued function defined in $U$.
The vector associated to a particular point in $U$ is depicted as originating at
that point. A fluid flow is given by the function
$$x = \phi(x_0, t)$$
with these properties:
(i) $\phi(x_0, 0) = x_0$.
(ii) $\phi$ has continuous partial derivatives.
(iii) For each $t$ the function $x = \phi(x_0, t)$ is invertible.
The curves $x = \phi(x_0, t)$, $x_0$ fixed, are the paths of motion of the flow. The
velocity $v(x, t)$ of the particle at $x$ at time $t$ is the velocity field of the flow
$$v(x, t) = \frac{\partial \phi}{\partial t}(x_0, t)\Big|_{x = \phi(x_0, t)}$$
If $v$ is independent of $t$, the flow is steady. The velocity field of a flow
completely determines the flow, for the paths of motion are obtained by
solving the differential equation
$$\frac{dx}{dt} = v(x, t)$$
$$x(0) = x_0$$
When the flow is steady the paths of motion do not change with time, and
particles on the same path remain on the same path.
• FURTHER READING
In addition to the bibliography at the end of Chapter 3, we should also
mention these excellent texts on differential geometry:
D. Struik, Lectures on Classical Differential Geometry, Addison-Wesley,
Reading, Mass., 1950.
H. Guggenheimer, Differential Geometry, McGraw Hill, New York, 1963.
R. T. Seeley, Calculus, Scott-Foresman, Glenview, Ill., 1967 has a
derivation of Newton's law of gravitational attraction from Kepler's laws.
• MISCELLANEOUS PROBLEMS
48. Suppose that $\gamma$ is a closed curve in the plane which lies outside the
unit disk and encircles the origin. Show that the length of $\gamma$ is at least $2\pi$.
49. Suppose that $\gamma$ is a closed curve lying completely inside the unit disk
with the property that it crosses every ray once and only once. Is there an
a priori bound on the length of $\gamma$?
50. Suppose that $\gamma$ is a curve as described in Problem 49, whose curvature
is bounded by 1. Is there now a bound to the length of $\gamma$?
51. A pendulum consists of a body of mass m hanging on a rope of length
$L$ which is fixed at one end. If the mass is displaced from the vertical and let
go it will swing along the circle of radius $L$. Find the differential equation
of the motion.
52. Suppose a particle is moving along the curve of Example 20 at
constant speed. Find the speed of its projection onto the xy plane.
53. Suppose that a particle moves along the right circular cone according
to the equation
$$x = (\cos t, \sin t, 2t)$$
Find the equation of motion of the projection of the particle on the plane
$x = 1$.
54. A horse is running around the elliptical track
$$x^2 + 2y^2 = 1$$
at constant speed. There is a wall along the line $y = -1$ and a floodlight
at the point (0, 1) which casts the horse's shadow on the wall. Find the
equation of motion of the shadow.
55. A man six feet tall walks at constant speed along a straight line
passing directly beneath a street lamp 12 feet off the ground. Find the
equation of motion of the head of the man's shadow cast by the street lamp.
56. A loose foot bridge of length L hangs across a chasm of width W
(L > W). A man appears at one entrance on a pair of roller skates.
Suddenly he lets go and begins skating down the bridge. Assuming the
only forces acting on him are those due to gravity and the restraining forces
of the bridge, find the differential equation governing his motion.
57. Why does a river going around a curve wear out the far bank and
deposit silt along the near bank directly after the curve ?
58. Suppose a disk of radius $r$ rolls with constant speed (at the center)
along a disk of radius R in the plane. Find the equation of motion of a
typical point on the circumference of the smaller disk.
59. Find the differential equation of the motion of a ball rolling in a
parabolic dish (with profile $y = x^2$) starting at rest at some point other than
the center.
60. Assuming that the population of the organisms on a given remote
island remains bounded, can you say anything about the eigenvalues of the
biotic matrix?
61. Find the curvature and torsion of these curves in $R^3$:
(a) $x = (u - \sin u, 1 - \cos u, 3u)$
(b) $x = (\sin u, 1 + \cos u, \sin u)$
62. Let $x = x(t)$ be the equation of a curve $\gamma$ in $R^3$ whose tangent vector
$T(t)$ traces out a circle on the sphere. Show that $\gamma$ is a helix.
63. A general helix is a curve lying on a surface of revolution $z = f(r)$
which cuts the curves $z = f(r)$, $\theta$ = constant at a fixed angle. Show that
the ratio $\kappa/\tau$ is constant on a helix.
64. Find the curve on the xy plane onto which a helix on a cone projects.
65. Let $\gamma_1$ and $\gamma_2$ be two space curves for which we have a point for point
correspondence such that the line joining corresponding points is the normal
line to both curves. Show that the line segment between corresponding
points has constant length.
66. Let $x = x(t)$ be an $R^n$-valued function of a real variable which is
$n$-times continuously differentiable. Then the image of $x$ is a curve $\gamma$ in $R^n$.
The Frenet-Serret frame of $\gamma$ is the orthonormal set obtained by applying
the Gram-Schmidt process to the vectors
$$x'(t), x''(t), \ldots, x^{(n)}(t)$$
(a) Show that for $n = 3$, the Frenet-Serret frame is $T$, $\pm N$, $\pm B$.
(b) Show that if there are only $k$ independent vectors in the
Frenet-Serret frame at every point, the curve lies in a linear subspace of
dimension $k$.
(c) Suppose that the Frenet-Serret frame $T_1, T_2, \ldots, T_n$ is a basis.
Show that the matrix representing the vectors $dT_1/ds, \ldots, dT_n/ds$ in this
basis is skew-symmetric. These are the generalized Frenet-Serret formulas.
67. Find the Frenet-Serret formulas for the curve
$$x = (\cos t, \sin t, t, 2t)$$
in $R^4$.
68. Kepler's laws of planetary motion (from which Newton derived his
law of gravitational attraction) are these:
I. For each planet the ray from the sun to the planet sweeps out
equal areas in equal times.
II. The path of motion of each planet is an ellipse with the sun at
one focus.
III. The square of the time period required to make one revolution
is proportional to the cube of the major axis of the ellipse. This constant
of proportionality is the same for all the planets.
In the text we have derived Kepler's second law from Newton's laws.
Now derive Kepler's first and third laws.
Chapter 5
SERIES OF FUNCTIONS
We have already run into series developments of functions several times:
the exponential, sine, cosine functions were expanded into power series;
Taylor's theorem provides a way to develop series expansions for suitable
functions; the exponential of a matrix gives us the only sure way to " solve "
a system of constant coefficient linear equations. We shall see in this chapter
that a general technique for solving a differential equation involves
approximation of the solution by series expansions.
We shall begin by formulating the definition of convergence of a series of
continuous functions and verifying the general criteria guaranteeing
convergence. One of the most important of series expansions is that of power series.
We shall say that a function is analytic if it can be locally developed into a
power series. We shall finally verify the fundamental theorem of algebra
and complete the discussion of constant coefficient equations. We have
delayed this until now because the kind of analytic techniques involved in the
fundamental theorem are those which are most appropriately developed for
the class of analytic functions. Further techniques for operating with
power series will be explored, as well as the question of estimation of the error
in replacing the power series by a partial sum.
5.1 Convergence
Definition 1. Let $\{f_k\}$ be a sequence of continuous functions defined on a
subset $X$ of $R^n$. The series formed of the $\{f_k\}$ is the sequence of partial sums
$\{\sum_{k=1}^{n} f_k\}$. We say that the series converges if the sequence of these partial
sums converges (in the sense of Definition 19 of Chapter 2), and denote the
limit by $\sum_{k=1}^{\infty} f_k$.
More precisely then, $f = \sum_{k=1}^{\infty} f_k$ if, corresponding to every $\varepsilon > 0$, there is an
$N$ such that
$$\left| f(x) - \sum_{k=1}^{n} f_k(x) \right| < \varepsilon \quad\text{for all } n \ge N \text{ and } x \in X$$
Since the limit of a uniformly convergent sequence of functions is continuous
(Theorem 2.14), we can assert that the sum of a convergent series of
continuous functions is continuous. Likewise, from the Cauchy criterion for
sequences, we obtain a corresponding criterion for the convergence of series.
Proposition 1. (Cauchy Criterion) Let $\{f_k\}$ be a sequence of continuous
functions. The series $\sum f_k$ converges if and only if, to each $\varepsilon > 0$ there
corresponds an $N$ such that
$$\left\| \sum_{k=n+1}^{m} f_k \right\| < \varepsilon \quad\text{for all } m > n \ge N$$
Proof. We must show that the sequence $g_n = \sum_{k=1}^{n} f_k$ satisfies the Cauchy
criterion. For a given $\varepsilon > 0$, let $N$ be as in the proposition. Then, for any $x$,
$m > n \ge N$
$$|g_m(x) - g_n(x)| = \left| \sum_{k=n+1}^{m} f_k(x) \right| \le \left\| \sum_{k=n+1}^{m} f_k \right\| < \varepsilon$$
Thus $\|g_m - g_n\| < \varepsilon$, so the proposition is proven.
Notice that the Cauchy criterion is guaranteed if the series of real numbers
$\sum_{k=1}^{\infty} \|f_k\|$ converges (for $\|\sum_{k=n+1}^{m} f_k\| \le \sum_{k=n+1}^{m} \|f_k\|$). This gives us a
powerful technique for verifying convergence of series.
Definition 2. Let $\{f_k\}$ be a sequence of continuous functions defined on a
set $X$. The series is said to converge absolutely if $\sum_{k=1}^{\infty} \|f_k\| < \infty$.
Of course, as remarked above, an absolutely convergent series is
convergent. In the case of absolute convergence we can pose a comparison test,
just as for series of numbers.
Theorem 5.1. (Comparison Test) Let $\{f_k\}$ be a sequence of continuous
functions defined on a set $X$. Suppose there is a sequence $\{p_k\}$ of positive
numbers and an integer $N > 0$ such that
(i) $\|f_k\| \le p_k$ for $k \ge N$
(ii) $\sum_{k=1}^{\infty} p_k < \infty$
Then $\sum f_k$ converges absolutely.
Proof. The verification is the same as that for number series (Theorem 2.3).
Examples
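1. $\sum z^k = 1/(1 - z)$ uniformly and absolutely in $\{|z| \le r\}$ for any
$r < 1$. For $\|z^k\| \le r^k$ in that domain, and
$$\sum_{k=0}^{\infty} r^k = \frac{1}{1 - r} < \infty$$
2. $\sum z^k$ does not converge uniformly in $\{|z| < 1\}$. In fact, the
series is not a Cauchy sequence of functions, because for every $n$, with $f_k(z) = z^k$,
$$\left\| \sum_{k=1}^{n+1} f_k - \sum_{k=1}^{n} f_k \right\| = \|z^{n+1}\| = 1$$
Thus, for $\varepsilon = \frac{1}{2}$, say, there is no $N$ such that $\|\sum_{k=n+1}^{m} f_k\| < \frac{1}{2}$ for all
$m > n \ge N$, in fact, not even for $m = n + 1$.
3. $e^z = \sum_{k=0}^{\infty} z^k/k!$ converges uniformly in any disk $\{|z| \le R\}$ with
$R$ finite. Again, by comparison
$$\left\| \frac{z^k}{k!} \right\| \le \frac{R^k}{k!} \quad\text{and}\quad \sum_{k=0}^{\infty} \frac{R^k}{k!} < \infty$$
4. The series
$$\sum_{n=1}^{\infty} \frac{\cos nx}{n^2}$$
converges uniformly on the whole real line. For any $x$,
$$\left| \frac{\cos nx}{n^2} \right| \le \frac{1}{n^2}$$
Since $\sum 1/n^2 < \infty$ the comparison test easily applies.
5. If $\{a_k\}$ is any sequence of numbers such that $\sum |a_k| < \infty$, then
$f(z) = \sum_{k=1}^{\infty} a_k z^k$ is a continuous function on the closed unit disk.
The series converges uniformly since $\|a_k z^k\| \le |a_k|$.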
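The contrast between Examples 1 and 2 shows up clearly in a computation. The Python sketch below is our own illustration: it estimates the sup norm of the remainder $1/(1 - z) - \sum_{k=0}^{n} z^k$ by sampling points on a circle $|z| = r$ (the sample count is an assumed value). For $r = 0.9$ the remainder shrinks with $n$; for $r$ close to 1 it stays large for the same $n$, which is why no single $N$ works on the whole open disk.

```python
import cmath
import math

def tail_sup(n, r, samples=720):
    """Largest |1/(1-z) - sum_{k=0}^{n} z^k| over sample points on |z| = r."""
    worst = 0.0
    for j in range(samples):
        z = r * cmath.exp(2j * math.pi * j / samples)
        partial = sum(z ** k for k in range(n + 1))
        worst = max(worst, abs(1 / (1 - z) - partial))
    return worst

for n in (10, 50, 200):
    print(n, tail_sup(n, 0.9), tail_sup(n, 0.9999))
```

The remainder equals $z^{n+1}/(1 - z)$, whose modulus on $|z| = r$ is largest at $z = r$, where it is $r^{n+1}/(1 - r)$.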
Finally, for the purpose of availability, we record the obvious extensions
to series of the propositions concerning integration and differentiation of
sequences of functions.
Proposition 2.
(i) Let $\{f_n\}$ be a sequence of continuous functions on the interval $[a, b]$.
Let $g_n(x) = \int_a^x f_n$. If the series of functions $\sum f_n$ converges, so does the series
$\sum g_n$, and
$$\sum_{n=1}^{\infty} \int_a^x f_n = \int_a^x \left( \sum_{n=1}^{\infty} f_n \right) \tag{5.1}$$
(ii) Let $\{f_n\}$ be a sequence of continuously differentiable functions on the
interval $[a, b]$. Let $g_n = f_n'$. If the series of functions $\sum g_n$ converges, and
for some $c$, $\sum f_n(c)$ converges, then the series $\sum f_n$ converges. The limit is
continuously differentiable and
$$\left( \sum f_n \right)' = \sum f_n' \tag{5.2}$$
Examples
6. $$-\ln(1 - x) = \sum_{k=1}^{\infty} \frac{x^k}{k} \qquad -1 < x < 1$$
This follows by integrating the geometric series (Example 1) term by
term:
$$\int_0^x \frac{dt}{1 - t} = \sum_{k=0}^{\infty} \int_0^x t^k\,dt$$
$$-\ln(1 - x) = \sum_{k=0}^{\infty} \frac{x^{k+1}}{k + 1} = \sum_{k=1}^{\infty} \frac{x^k}{k}$$
7. The function
$$f(x) = \sum_{n=1}^{\infty} \frac{\cos nx}{n!}$$
is infinitely differentiable. For the differentiated series
$$-\sum_{n=1}^{\infty} \frac{\sin nx}{(n - 1)!} \tag{5.3}$$
is also convergent. By Proposition 2(ii) the sum is $f'(x)$. Similarly,
the series (5.3) can be differentiated term by term, and gives
$$-\sum_{n=1}^{\infty} \frac{n\cos nx}{(n - 1)!}$$
which is again convergent.
8. We can develop a series expansion for $\arctan x$ according to the
following observations. From the geometric series
$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k$$
we obtain by substituting $-x^2$ for $x$
$$\frac{1}{1 + x^2} = \sum_{k=0}^{\infty} (-1)^k x^{2k}$$
Integrate:
$$\arctan x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k + 1}$$
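The expansion of Example 8 is easy to try out. This short Python sketch (ours, for illustration) sums the first few terms of $\sum (-1)^k x^{2k+1}/(2k+1)$ and compares the result with the library arctangent; for $|x| < 1$ the series alternates with decreasing terms, so the error is at most the first omitted term.

```python
import math

def arctan_series(x, terms):
    """Partial sum of sum_{k>=0} (-1)^k x^(2k+1) / (2k+1) with the given number of terms."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

x = 0.5
for terms in (2, 5, 10, 30):
    approx = arctan_series(x, terms)
    print(terms, approx, abs(approx - math.atan(x)))
```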
EXERCISES
1. For what values of χ do these series of functions converge absolutely:
(a) £ 2"x" (d) | (x+18)"
" COS ИХ »
(b) 2 —— (e) Σ e "*
n=l .X ft=o
(c) f eni (Ο Σ*01''
n = 0 n=0
2. In which domains of the complex plane do these series converge?
(a) | nz" (b) 2 -£- (c) | b-V-
11 = 0 n = 0 (,^Wj! n = 0
3. Which of these series can be differentiated or integrated on their
domain of convergence?
(a) Exercise 1(a) (c) Exercise 1(d)
к» COS ИХ
(b) Exercise 1(b) (d) У —-—
n=o П2
4. Find the power series expansion for these functions:
(a> «hy (c) 1>л
d /1 +x\ r' dt
<"> ϊΜ <«» J.t+?
PROBLEMS
1. (a) Find a power series expansion for $\sin x\cos x$.
(Hint: $2\sin x\cos x = \sin 2x$.)
(b) Find power series expansions for $\sin^2 x$ and $\cos^2 x$.
2. Prove Proposition 2.
3. Show that
lim 2 — = loS 2
*-.- 1 11=1 П
Can you conclude
» (—1)"
Σ — =log2?
п = 0 П
5.2 The Fundamental Theorem of Algebra
For the remainder of this chapter we restrict attention exclusively to
complex-valued functions of a complex variable. The simplest class of such
functions are the polynomial functions, that is, functions of the form
$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0,$$
(where the $a_i$ are complex numbers). We shall always assume $a_n \ne 0$; in
this case $n$ is called the degree of $P$. It is a basic fact of mathematics that
every polynomial of positive degree has a root, that is, there is a number $c \in C$ such that $P(c) = 0$.
The proof of this fact consists in a systematic investigation of the analytic
properties of polynomials. First, we recall de Moivre's theorem.
Lemma. Every nonzero complex number has $n$ distinct $n$th roots.
Proof. Let $c \in C$, $c \neq 0$. For this purpose, the polar representation
$c = re^{i\theta}$ is most convenient. An $n$th root of $c$ is a number
$w = \rho e^{i\phi}$ such that $\rho^n = r$ and $e^{in\phi} = e^{i\theta}$, that is,
$n\phi - \theta$ is an integral multiple of $2\pi$. Let
$$\alpha_1 = \frac{2\pi}{n},\ \alpha_2 = \frac{4\pi}{n},\ \ldots,\ \alpha_k = \frac{2\pi k}{n},\ \ldots,\ \alpha_{n-1} = \frac{2\pi(n-1)}{n},\ \alpha_n = 2\pi$$
Then $\omega_1 = \exp(i\alpha_1), \ldots, \omega_n = \exp(i\alpha_n)$ are all distinct and
have the property $(\omega_k)^n = 1$. These are called the $n$th roots of unity.
Now, if $\rho = r^{1/n}$ and $\phi = \theta/n$, then $(\rho e^{i\phi})^n = re^{i\theta} = c$.
The numbers $\rho e^{i\phi}\omega_1, \ldots, \rho e^{i\phi}\omega_k, \ldots, \rho e^{i\phi}\omega_n$
are then all distinct, and are all $n$th roots of $c$.
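The construction in the proof translates directly into a few lines of code. The sketch below (the function name `nth_roots` is chosen here for illustration) takes one particular root $\rho e^{i\phi}$ and multiplies it by each $n$th root of unity, using only the standard `cmath` module.

```python
import cmath
import math

def nth_roots(c: complex, n: int) -> list:
    """Return the n distinct nth roots of a nonzero complex number c."""
    if c == 0 or n < 1:
        raise ValueError("need c != 0 and n >= 1")
    r, theta = cmath.polar(c)        # c = r * exp(i * theta)
    rho = r ** (1.0 / n)             # rho^n = r
    phi = theta / n                  # n * phi = theta
    base = cmath.rect(rho, phi)      # one particular nth root of c
    # Multiply by the nth roots of unity exp(2*pi*i*k/n), k = 1, ..., n.
    return [base * cmath.exp(2j * math.pi * k / n) for k in range(1, n + 1)]

if __name__ == "__main__":
    for w in nth_roots(-8, 3):
        print(w, w ** 3)
```

Floating-point rounding means the cubes printed above agree with $-8$ only up to a small error.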
Now, we need two deeper facts depending on the continuity properties of
polynomials. The first is intuitively clear: that $|P(z)|$ gets arbitrarily large as
$z \to \infty$. The second is the crucial fact for the fundamental theorem: the
place where a polynomial has a minimum modulus must be a root.
Lemma. Let $P(z) = a_n z^n + \cdots + a_1 z + a_0$ be a polynomial of degree $n > 0$.
(i) $\lim_{|z| \to \infty} |P(z)| = \infty$, that is, given any $M > 0$ there is a
$K > 0$ such that $|P(z)| > M$ whenever $|z| > K$.
(ii) If $P(z_0) \neq 0$, then $z_0$ cannot be a minimum point for $|P|$; that is, there
are $z$ close to $z_0$ such that $|P(z)| < |P(z_0)|$.
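Proof.
(i) The point here is that the highest-degree term of $P$ is the dominating term as
regards the behavior of $P$ as $z \to \infty$. For $z \neq 0$,
$$|P(z)| = |z|^n \left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right|$$
If $|z| \geq K \geq 1$, then $|z|^{n-k} \geq K$ also for $k < n$, so
$$\left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right| \geq |a_n| - \frac{1}{K} \sum_{k=0}^{n-1} |a_k|$$
Let $M > 0$ be given, and choose
$$K = \max\left( 1,\ 2M|a_n|^{-1},\ 2|a_n|^{-1} \sum_{k=0}^{n-1} |a_k| \right)$$
Then, for $|z| > K$,
$$\left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right| \geq |a_n| - \frac{1}{K} \sum_{k=0}^{n-1} |a_k| \geq \tfrac{1}{2}|a_n|$$
Thus
$$|P(z)| \geq |z|^n \cdot \tfrac{1}{2}|a_n| > K \cdot \tfrac{1}{2}|a_n| \geq M$$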
(ii) Suppose now that $P(z_0) \neq 0$. Let
$$Q(z) = (P(z_0))^{-1} P(z + z_0)$$
Then $Q$ is also a polynomial, $Q(0) = 1$, and we must show that 0 is not a minimum
point for $|Q|$. Let
$$Q(z) = 1 + \sum_{k=1}^{n} a_k z^k = 1 + z^m (a_m + z g(z))$$
where $m$ is chosen as the least positive integer $k$ for which $a_k \neq 0$, and
$g(z) = \sum_{k=m+1}^{n} a_k z^{k-(m+1)}$. $g$ is a polynomial and is thus continuous
(and that is all we need to know about $g$). Here again we want to use the fact that
for small $z$, $z^m$ dominates $z^{m+1}$, so $Q$ is very close to the polynomial
$1 + z^m a_m$ which has no minimum modulus at 0 (choose $z$ so that $z^m = -r/a_m$
with $r < 1$).
In our case, we choose an $m$th root of $-a_m^{-1}$; call it $z_0$, and consider the
function $Q(rz_0)$ of a real variable $r$. We have
$$Q(rz_0) = 1 + r^m(-1 + r h(r))$$
where $h(r) = -a_m^{-1} z_0\, g(rz_0)$ is a continuous complex-valued function. Thus,
for $0 < r < 1$,
$$|Q(rz_0)| \leq |1 - r^m| + r^{m+1}|h(r)|$$
Now $\lim_{r \to 0} r h(r) = 0$, so we can choose $r_0$, $0 < r_0 < 1$, small enough so
that $|r_0 h(r_0)| < \tfrac{1}{2}$. Then
$$|Q(r_0 z_0)| \leq 1 - r_0^m + r_0^m\left(\tfrac{1}{2}\right) = 1 - \tfrac{1}{2} r_0^m < 1$$
which proves part (ii).
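Part (i) of the lemma is constructive: it names a radius beyond which $|P(z)|$ exceeds a prescribed $M$. The sketch below computes that radius from the maximum of $1$, $2M/|a_n|$ and $2\sum_{k<n}|a_k|/|a_n|$, and samples a circle just outside it. The helper names `evaluate` and `growth_radius` are illustrative choices made here, and coefficients are listed from $a_0$ up to $a_n$.

```python
import cmath
import math

def evaluate(coeffs, z):
    """Horner evaluation of P(z); coeffs = [a_0, a_1, ..., a_n]."""
    value = 0
    for a in reversed(coeffs):
        value = value * z + a
    return value

def growth_radius(coeffs, M):
    """Return K >= 1 such that |P(z)| > M whenever |z| > K (degree n > 0)."""
    *lower, lead = coeffs
    if not lower or lead == 0:
        raise ValueError("need positive degree and a nonzero leading coefficient")
    return max(1.0, 2 * M / abs(lead), 2 * sum(abs(a) for a in lower) / abs(lead))

if __name__ == "__main__":
    P = [5, -2, 0, 1]                       # P(z) = z^3 - 2z + 5
    K = growth_radius(P, 100.0)
    samples = [cmath.rect(1.001 * K, 2 * math.pi * k / 12) for k in range(12)]
    print(K, min(abs(evaluate(P, z)) for z in samples))
```

The radius is far from optimal; the lemma only needs some $K$ to exist.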
Theorem 5.2. (Fundamental Theorem of Algebra) Let $P$ be a polynomial of
positive degree. There is a $z_0 \in C$ such that $P(z_0) = 0$.
Proof. Let $P(0) = c_0$. By part (i) of the lemma, there is a $K > 0$ such that for
$|z| > K$, $|P(z)| > |c_0|$. Now $\Delta = \{z \in C: |z| \leq K\}$ is compact, so $|P|$
attains a minimum value on $\Delta$, say at $z_0$. But then $z_0$ is a minimum point for
all of $C$. For, since $0 \in \Delta$, $|P(z_0)| \leq |P(0)| = |c_0|$, and for
$z \notin \Delta$, $|P(z)| > |c_0| \geq |P(z_0)|$. Thus, even for $z \notin \Delta$, we
have $|P(z_0)| < |P(z)|$. But then, by part (ii), there is no alternative: we
must have $P(z_0) = 0$.
Factorization Theory
We should recall that if $c$ is a zero of the polynomial $P$, then $z - c$ factors $P$
(this is proven below in Theorem 5.3). Thus $P(z) = (z - c)Q(z)$ and $Q$ has
degree 1 less than that of $P$. If $\deg Q > 0$, $Q$ has a zero $c'$, which is also a
zero of $P$. Further, $Q(z) = (z - c')Q'(z)$ and we can repeat this argument in
order to find exactly $\deg P$ zeros of $P$. This is the factorization theorem of
algebra.
Theorem 5.3. (Factorization Theorem) Let $P$ be a polynomial of degree
$n > 0$. There are complex numbers $a \neq 0$, $z_1, \ldots, z_n$ such that
$$P(z) = a(z - z_1) \cdots (z - z_n)$$
Proof. The proof is by induction on $n$. If $n = 1$ the situation is simple:
$$P(z) = a_1 z + a_0 = a_1\left( z - \left(-\frac{a_0}{a_1}\right) \right)$$
(since $a_1 \neq 0$). Now we consider the case of general degree $n$, assuming the
theorem for polynomials of degree $n - 1$. By Theorem 5.2, there is a point $c$ such
that $P(c) = 0$. Then
$$P(z) = P(z) - P(c) = \sum_{k=1}^{n} a_k (z^k - c^k) = \sum_{k=1}^{n} a_k (z - c) \sum_{j=0}^{k-1} z^j c^{k-1-j}$$
$$= (z - c) \sum_{j=0}^{n-1} \left( \sum_{k=j+1}^{n} a_k c^{k-1-j} \right) z^j$$
The factor on the right is a polynomial of degree $n - 1$, so the induction assumption
applies: it can be written as $a(z - z_1) \cdots (z - z_{n-1})$ for suitable $a \neq 0$,
$z_1, \ldots, z_{n-1}$. Thus, writing $c = z_n$, we obtain
$$P(z) = a(z - z_1) \cdots (z - z_n) \tag{5.4}$$
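The step that peels a factor $z - c$ off $P$ can be carried out mechanically. The sketch below uses synthetic division (Horner's scheme), a standard technique rather than the double-sum identity of the proof, to produce the quotient and the remainder $P(c)$. The name `deflate` is an illustrative choice, and coefficients are listed from the highest degree down.

```python
def deflate(coeffs, c):
    """Divide P by (z - c).

    coeffs = [a_n, ..., a_1, a_0], highest degree first.
    Returns (quotient, remainder) with P(z) = (z - c) * Q(z) + remainder,
    where remainder equals P(c).
    """
    partial = []
    carry = 0
    for a in coeffs:
        carry = carry * c + a
        partial.append(carry)
    remainder = partial.pop()       # the last Horner value is P(c)
    return partial, remainder

if __name__ == "__main__":
    print(deflate([1, -3, 2], 1))   # z^2 - 3z + 2 = (z - 1)(z - 2)
    print(deflate([1, 0, 1], 1j))   # z^2 + 1 = (z - i)(z + i)
```

When $c$ is a root the remainder is zero and the quotient is the degree $n - 1$ factor to which the induction hypothesis applies.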
This factorization is clearly unique, except for the order of the $z_i$'s: $a$ is the
leading coefficient of $P$ and $\{z_1, \ldots, z_n\}$ are the roots of $P$. Of course,
$z_1, \ldots, z_n$ need not be distinct; let $r_1, \ldots, r_s$ be the set of distinct
roots. If we let $m_i$ be the number of occurrences of the root $r_i$ in the list
$\{z_1, \ldots, z_n\}$, $m_i$ is called the multiplicity of the root $r_i$. We can
rewrite (5.4) as
$$P(z) = a(z - r_1)^{m_1} \cdots (z - r_s)^{m_s} \tag{5.5}$$
and clearly $m_1 + \cdots + m_s = n$, the degree of $P$.
Before concluding this section we should remark on the factorization of
real polynomials. Real polynomials need not have real roots (viz., $z^2 + 1 = 0$),
but their complex roots come in conjugate pairs. Let
$P(z) = a_n z^n + \cdots + a_1 z + a_0$ be a real polynomial. If $P(r) = 0$, then
$$P(\bar r) = a_n \bar r^{\,n} + \cdots + a_1 \bar r + a_0 = \overline{a_n r^n + \cdots + a_1 r + a_0} = \overline{P(r)} = 0$$
so $\bar r$ is also a root of $P$. Since
$$(z - r)(z - \bar r) = z^2 - (r + \bar r)z + r\bar r = z^2 - 2\,\mathrm{Re}(r)\,z + |r|^2$$
this quadratic polynomial has real coefficients. Thus, if we rearrange the roots of $P$
into the real roots $r_1, \ldots, r_k$ and the conjugate pairs
$r_{k+1}, \bar r_{k+1}, \ldots, r_t, \bar r_t$ we can rewrite the factorization (5.5)
into a product of linear and quadratic real polynomials.
$$P(z) = a(z - r_1)^{m_1} \cdots (z - r_k)^{m_k} (z^2 - 2\,\mathrm{Re}(r_{k+1})z + |r_{k+1}|^2)^{m_{k+1}} \cdots (z^2 - 2\,\mathrm{Re}(r_t)z + |r_t|^2)^{m_t}$$
PROBLEMS
4. Let $\omega_1, \ldots, \omega_n$ be the $n$ $n$th roots of unity. Show that they are
arranged at $n$ equidistant points around the unit circle. Show that the sets
$\{\omega_1, \ldots, \omega_n\}$, $\{\omega_1, \omega_1^2, \ldots, \omega_1^n\}$ are the
same, if $\omega_1$ is the nearest such point to 1.
5. Let $\omega_1, \ldots, \omega_n$ be the $n$th roots of unity. Choose $k$ so that
$kn - 2$ is divisible by 4. Show that $i^k\omega_1, \ldots, i^k\omega_n$ are the $n$th
roots of $-1$.
6. Show that: (a) $\deg PQ = \deg P + \deg Q$.
(b) $\deg(P + Q) = \max(\deg P, \deg Q)$ if $\deg P \neq \deg Q$.
(c) When is the equation in (b) not true?
7. Given two polynomials $P$, $Q$ show that there is a polynomial $R$ which
factors both $P$, $Q$ and is factored by any polynomial which factors both
$P$, $Q$. $R$ is called the greatest common divisor of $P$ and $Q$.
8. Show that a real polynomial of odd degree has a real root.
9. Prove that the polynomial $1 + z^m a_m$ ($m > 0$) has no minimum modulus
at $z = 0$.
10. For $P(z) = \sum_{n=0}^{N} a_n z^n$ a polynomial, let
$$P'(z) = \sum_{n=1}^{N} n a_n z^{n-1}$$
(a) Verify that the transformation $P \to P'$ is linear and satisfies
$$(PQ)' = PQ' + P'Q$$
(by induction on $\deg P$).
($P \to P'$ is a complex analog of differentiation.)
(b) Prove that $r$ is a multiple root of $P$ if and only if $P(r) = 0$ and
$P'(r) = 0$.
(c) Define $P'' = (P')'$, $P''' = (P'')'$, and so on. Then $r$ is a root of $P$
of at least multiplicity $m$ if and only if
$P(r) = P'(r) = \cdots = P^{(m-1)}(r) = 0$.
5.3 Constant Coefficient Linear Differential Equations
Now that we know the factorization theorem for polynomials we can return
to complete the study of constant coefficient equations in one unknown
function. Let $L$ be a constant coefficient differential operator of order $k$; that is,
$L$ is a mapping from functions to functions defined by
$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} a_i f^{(i)}, \qquad a_i \in C \tag{5.6}$$
Corresponding to $L$ is the polynomial
$$P_L(X) = X^k + \sum_{i=0}^{k-1} a_i X^i$$
called the characteristic polynomial of $L$. We recall the facts that we already
know about such differential operators.
Theorem 5.4. Let $L$ be given by (5.6). The collection $S(L)$ of solutions of
the equation $Lf = 0$ is a $k$-dimensional vector space of infinitely differentiable
functions. If $r$ is a root of $P_L(X) = 0$, then $e^{rx} \in S(L)$.
Now if all the roots of the characteristic polynomial are distinct, we have $k$
solutions of $Lf = 0$, and it is easily verified (Problem 11) that they are
independent. Thus they span $S(L)$. To examine the case of multiple roots,
we must examine more closely the relationship between the given differential
operator and its characteristic polynomial. If $P$ is a polynomial, we will let
$L_P$ represent the corresponding operator; that is, for
$P(X) = \sum_{i=0}^{n} a_i X^i$, $L_P$ is defined by
$$L_P(f) = \sum_{i=0}^{n} a_i f^{(i)}$$
Now, from what we already know about these differential equations we can
guess that the factorization of $P$ will tell us all we want to know about $L_P$.
In fact, we can factor the corresponding operator accordingly as the next
lemma shows.
Lemma 1. $L_{P+Q} = L_P + L_Q$; $L_{PQ} = L_P L_Q$.
Proof. Of course, $L_P L_Q$ is defined as the composition of operators:
$(L_P L_Q)(f) = L_P(L_Q(f))$. The first equation is obvious. The second takes a
little work. We will prove it by induction on the degree of $P$. If $\deg P = 0$,
that is, $P(X) = a_0$, then $PQ = a_0 Q$ and $L_{PQ}(f) = a_0 L_Q(f) = L_P(L_Q(f))$, for
any sufficiently differentiable function $f$. Now suppose the lemma is true for all
polynomials of degree $n$. Let $P$ be a polynomial of degree $n + 1$. If $a$ is a root
of $P$, we can write $P(X) = (X - a)S(X)$, where $S$ is a polynomial of degree $n$.
Thus, by hypothesis, $L_{SQ} = L_S L_Q$. We have left only to verify the lemma for
polynomials of degree 1. That is, we must show that if $R$ is a polynomial of
degree 1 and $T$ is any polynomial, then $L_{RT} = L_R L_T$. For once this is verified,
we take $R(X) = X - a$, so that $P = RS$. Then
$$L_{PQ} = L_{RSQ} = L_R L_{SQ} = L_R L_S L_Q = L_{RS} L_Q = L_P L_Q$$
So, let $R(X) = X - a$, $T(X) = \sum_{i=0}^{m} b_i X^i$. Then, with the convention
$b_{m+1} = 0$,
$$RT(X) = \sum_{i=0}^{m} (b_i - a b_{i+1}) X^{i+1} - a b_0$$
Now we compute $L_R L_T$:
$$L_R\left( \sum_{i=0}^{m} b_i f^{(i)} \right) = \left( \sum_{i=0}^{m} b_i f^{(i)} \right)' - a \sum_{i=0}^{m} b_i f^{(i)} = \sum_{i=0}^{m} (b_i - a b_{i+1}) f^{(i+1)} - a b_0 f$$
which is $L_{RT}(f)$. The lemma is proven.
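Lemma 1 can be checked on examples by letting the operators act on polynomial test functions, where differentiation is exact integer arithmetic. In the sketch below all names (`poly_mul`, `derivative`, `apply_operator`) are illustrative choices, and every polynomial, whether it plays the role of $P$, $Q$ or $f$, is stored as a coefficient list with the constant term first.

```python
def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (lowest degree first)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def derivative(f):
    """Coefficient list of f'."""
    return [k * f[k] for k in range(1, len(f))] or [0]

def apply_operator(p, f):
    """Apply L_P = sum_i p[i] * D^i to the polynomial f."""
    result = [0] * len(f)
    g = list(f)                        # g holds the i-th derivative of f
    for a in p:
        for k, c in enumerate(g):
            result[k] += a * c
        g = derivative(g)
    return result

if __name__ == "__main__":
    P = [-2, 1]                        # X - 2
    Q = [1, 0, 3]                      # 3X^2 + 1
    f = [1, 2, 0, 5]                   # 5x^3 + 2x + 1
    print(apply_operator(poly_mul(P, Q), f))               # [90, -184, 15, -10]
    print(apply_operator(P, apply_operator(Q, f)))         # [90, -184, 15, -10]
```

Agreement on a few examples is of course no substitute for the induction; it only makes the statement $L_{PQ} = L_P L_Q$ concrete.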
It follows from the lemma that if $Q$ is a factor of $P$, then any solution
of $L_Q(f) = 0$ is a solution of $L_P(f) = 0$. Now let $P$ be a given polynomial.
We can, by the factorization theorem, write $P$ as a product of first-order
factors.
$$P(X) = (X - a_1)^{m_1} \cdots (X - a_s)^{m_s} \quad \text{with } m_1 + \cdots + m_s = \deg P$$
Because of Lemma 1 the solutions corresponding to the factors $(X - a_j)^{m_j}$
are in $S(L_P)$. Thus we need to discover the solutions of the differential
equation $L_P(f) = 0$, where $P(X) = (X - c)^m$.
Consider, for example, the differential operator corresponding to $(X - c)^2$.
We know one solution: $e^{cx}$; we find another by the technique of variation of
parameters. $(X - c)^2 = X^2 - 2cX + c^2$. Test the operator on $y = ze^{cx}$.
$$y' = z'e^{cx} + zce^{cx} \qquad y'' = z''e^{cx} + 2z'ce^{cx} + zc^2e^{cx}$$
Then
$$y'' - 2cy' + c^2y = z''e^{cx} = 0 \quad \text{or} \quad z'' = 0$$
Thus we may take $z = x$, and the second solution is $xe^{cx}$. We can guess then
that the general situation is this.
Lemma 2. The solutions of $L_{(X-c)^m}(f) = 0$ are spanned by $e^{cx}, xe^{cx}, \ldots,
x^{m-1}e^{cx}$.
Proof. We have to show that the named functions are solutions. We do that
by induction. The case $m = 1$ is already known (by Theorem 5.4). Thus we may
assume the lemma for a given value of $m$, and prove it for $m + 1$. By Lemma 1, we
need only verify that $L_{(X-c)^{m+1}}(x^m e^{cx})$ is zero. But this is
$$L_{(X-c)^m} L_{X-c}(x^m e^{cx}) = L_{(X-c)^m}(m x^{m-1} e^{cx} + c x^m e^{cx} - c x^m e^{cx})$$
$$= L_{(X-c)^m}(m x^{m-1} e^{cx}) = 0$$
by induction.
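The computation in this proof rests on one identity: $(D - c)(p(x)e^{cx}) = p'(x)e^{cx}$ for any polynomial $p$, so applying $(D - c)^m$ simply differentiates the polynomial factor $m$ times. The sketch below tracks only that polynomial factor; the function name `apply_shifted_power` is an illustrative choice, and the result does not depend on $c$.

```python
def apply_shifted_power(poly, m):
    """Return q with (D - c)^m (poly(x) * e^{cx}) = q(x) * e^{cx}.

    poly is a coefficient list, constant term first. Since
    (D - c)(p * e^{cx}) = p' * e^{cx}, q is the m-th derivative of poly.
    An empty list stands for the zero polynomial.
    """
    q = list(poly)
    for _ in range(m):
        q = [k * q[k] for k in range(1, len(q))]
    return q

if __name__ == "__main__":
    m = 3
    for j in range(m + 1):
        monomial = [0] * j + [1]             # x^j
        print(j, apply_shifted_power(monomial, m))
```

For $m = 3$ the monomials $1, x, x^2$ are annihilated, while $x^3$ is sent to the constant 6, so $x^3e^{cx}$ is not a solution.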
Theorem 5.5. Let $p(X) = X^n + \sum_{i=0}^{n-1} a_i X^i$ be a polynomial with complex
coefficients. Let $a_1, \ldots, a_s$ be the roots of $p(X) = 0$ with multiplicities
$m_1, \ldots, m_s$, respectively. Then the space $S(L_p)$ of solutions of the
differential equation
$$L_p(f) = f^{(n)} + \sum_{i=0}^{n-1} a_i f^{(i)} = 0$$
is the linear span of the functions $x^j e^{a_i x}$, $0 \leq j < m_i$, $1 \leq i \leq s$.
• EXERCISES
5. Solve these differential equations:
(a) $y''' - 5y'' + 8y' - 4y = 0$, $y(0) = 0$, $y'(0) = 0$, $y''(0) = 1$.
(b) $y''' - y'' - 5y' - 3y = 0$, $y(0) = 1$, $y'(0) = 2$, $y''(0) = -1$.
(c) $y''' - 6y'' + 12y' - 8y = 0$, $y(0) = 1$, $y'(0) = 0$, $y''(0) = 1$.
(d) $y''' - 3y'' + 3y' - y = 0$, $y(0) = 3$, $y'(0) = 2$, $y''(0) = 1$.
(e) $y^{(4)} + 2y'' + y = 0$, $y(0) = 2$, $y'(0) = 2$, $y''(0) = 2$, $y'''(0) = 2$.
(f) $y^{(4)} + 4y''' - 2y'' - 12y' + 9y = 0$, $y(0) = y'(0) = y''(0) = 1$,
$y'''(0) = 0$.
(g) $y^{(4)} - 3y'' + 2y = 0$, $y(0) = 0$, $y'(0) = y''(0) = y'''(0) = 1$.
• PROBLEMS
11. (a) Show that if $r_1, \ldots, r_n$ are $n$ distinct numbers, the matrix
$$\begin{pmatrix} 1 & \cdots & 1 \\ r_1 & \cdots & r_n \\ r_1^2 & \cdots & r_n^2 \\ \vdots & & \vdots \\ r_1^{n-1} & \cdots & r_n^{n-1} \end{pmatrix}$$
is nonsingular. (Hint: If the rows are dependent, we obtain a
polynomial of degree $n - 1$ with $n$ distinct roots.)
(b) The functions $\exp(r_1 x), \ldots, \exp(r_n x)$ are independent. (Hint:
If these functions were dependent, we would be able to prove that the
columns of the above matrix are dependent.)
5.4 Solutions in Series
If now we are given a linear differential equation which is not homogeneous,
or has variable coefficients, we have a problem of a much different magnitude.
In general, such problems cannot be solved explicitly. Thus, we must seek
ways to obtain approximate solutions. This is one of the places where
series representations of functions are usable. The procedure of series
approximation has two aspects. First, we must establish the theoretical
validity of such a technique and, secondly (and this is essential from the
practical point of view), we need a technique for effectively computing the
error. In this section we shall describe this procedure, deferring these two
essential points (which turn out to be the same!) until Section 5.7.
First, an example. Suppose we want the function $f$ such that
$$f''(x) + g_1(x)f'(x) + g_0(x)f(x) = 0 \qquad f(0) = a_0 \qquad f'(0) = a_1$$
where $g_0$ and $g_1$ are defined in a neighborhood of 0. We shall assume
that they are sufficiently differentiable. Now our initial conditions
give us the first two terms of the Taylor expansion of $f$ at 0:
$$f(x) = a_0 + a_1 x + \text{higher-order terms} \tag{5.7}$$
Our technique will be based on the tacit assumption that the "higher-order
terms" are computable, and knowing enough of them will give
a usable approximation to the solution. Now, evaluating the
differential equation itself at 0 gives us the second-order term:
$$f''(0) + g_1(0)f'(0) + g_0(0)f(0) = 0$$
or
$$f''(0) = -(g_1(0)a_1 + g_0(0)a_0)$$
so
$$f(x) = a_0 + a_1 x - \tfrac{1}{2}(a_0 g_0(0) + a_1 g_1(0))x^2 + \text{higher-order terms} \tag{5.8}$$
Differentiating the differential equation will give an identity expressing
$f'''$ in terms of lower derivatives, so we may continue,
$$f'''(x) + g_1'(x)f'(x) + g_1(x)f''(x) + g_0'(x)f(x) + g_0(x)f'(x) = 0$$
so
$$f'''(0) = -(g_1'(0) + g_0(0))a_1 - g_0'(0)a_0 + g_1(0)(g_1(0)a_1 + g_0(0)a_0)$$
$$= (g_1(0)^2 - g_1'(0) - g_0(0))a_1 + (g_0(0)g_1(0) - g_0'(0))a_0$$
and so we have the third term of the Taylor series of $f$:
$$f(x) = a_0 + a_1 x - \tfrac{1}{2}(a_0 g_0(0) + a_1 g_1(0))x^2$$
$$+ \tfrac{1}{6}\left[ (g_1(0)^2 - g_1'(0) - g_0(0))a_1 + (g_0(0)g_1(0) - g_0'(0))a_0 \right]x^3$$
$$+ \text{higher-order terms}$$
Example
9. Perhaps an explicit calculation is in order. We shall find an
approximate solution of:
$$y'' + (x^2 - 1)y' + xy = x^2 \tag{5.9}$$
$$y(0) = 0 \qquad y'(0) = 2$$
The solution thus begins $f(x) = 2x + \cdots$. $f''(0)$ is easy to calculate
by substituting the initial conditions into Equation (5.9): $f''(0) = f'(0) = 2$, so
$$f(x) = 2x + x^2 + \cdots$$
Differentiating (5.9), we obtain
$$y''' + 2xy' + (x^2 - 1)y'' + y + xy' = 2x \tag{5.10}$$
Evaluating at 0 we find $f'''(0) = f''(0) - f(0) = 2$. Differentiating
(5.10) and evaluating at 0, we obtain $f^{(4)}(0) = 2 + f'''(0) - 4f'(0) = -4$; once
again gives $f^{(5)}(0) = f^{(4)}(0) - 9f''(0) = -22$. Thus, to five terms the Taylor
expansion of the desired solution is
$$f(x) = 2x + x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{6}x^4 - \tfrac{11}{60}x^5 + \text{higher-order terms}$$
Admittedly this is not very glamorous, but it is computable! The phrase
"higher-order terms" represents the error between the fifth-degree
polynomial exhibited above and the actual solution. That polynomial is
completely meaningless without some estimate on the error incurred. But our
method gives no hint as to how to estimate it. So, in the hope of being able to
give more form to the "higher-order terms," we will try a more brazen
approach: we begin by assuming that the desired solution is the sum of a
convergent power series (its "full Taylor expansion") and we try to find the
coefficients. If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then differentiating term by
term we obtain
$$f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$$
$$f''(x) = \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}$$
$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) a_n x^{n-k}$$
We make these substitutions into the given differential equation and
solve for the $\{a_n\}$ by equating the coefficients of each power of $x$.
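Let us reconsider (5.9). Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the desired
solution. The statement of the problem becomes
$$\sum_{n=2}^{\infty} n(n-1)a_n x^{n-2} + (x^2 - 1)\sum_{n=1}^{\infty} n a_n x^{n-1} + \sum_{n=0}^{\infty} a_n x^{n+1} - x^2 = 0 \tag{5.11}$$
$$a_0 = 0 \qquad a_1 = 2$$
The coefficient of $x^k$ in the left-hand side of (5.11) is, for $k \neq 2$,
$$(k+2)(k+1)a_{k+2} + (k-1)a_{k-1} - (k+1)a_{k+1} + a_{k-1}$$
(for $k = 2$ the term $-x^2$ contributes an additional $-1$). Thus we have to solve
these equations
$$a_0 = 0$$
$$a_1 = 2 \qquad \text{(initial conditions)}$$
$$2a_2 - a_1 = 0 \qquad (k = 0)$$
$$3 \cdot 2a_3 + a_0 - 2a_2 = 0 \qquad (k = 1)$$
$$4 \cdot 3a_4 + 2a_1 - 3a_3 = 1 \qquad (k = 2)$$
$$n(n-1)a_n + (n-2)a_{n-3} - (n-1)a_{n-1} = 0 \qquad (k = n - 2,\ n > 4)$$
We can solve, because each equation can be written in the form
$$a_n = \frac{(n-1)a_{n-1} - (n-2)a_{n-3}}{n(n-1)} \qquad n \geq 2,\ n \neq 4 \tag{5.12}$$
(for $n = 4$ the numerator carries the extra term $+1$).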
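However, we have an added advantage in that we can make a guess
at an estimate for the general term $a_n$. In fact, we assert
$$|a_n| \leq \frac{2^n}{[n/3]!} \tag{5.13}$$
($[x]$ = largest integer less than or equal to $x$). This is in fact true for
$n = 0, 1, 2, 3, 4$ by direct computation; we verify it in general by induction.
$$|a_n| \leq \frac{(n-1)|a_{n-1}| + (n-2)|a_{n-3}|}{n(n-1)} \leq \frac{1}{n}(|a_{n-1}| + |a_{n-3}|)$$
$$\leq \frac{1}{n}\left( \frac{2^{n-1}}{[(n-1)/3]!} + \frac{2^{n-3}}{[(n-3)/3]!} \right)$$
Now, since $[(n-3)/3] = [n/3] - 1$,
$$[n/3]! = [n/3] \cdot [(n-3)/3]! \leq \frac{n}{3}\,[(n-3)/3]!$$
so
$$\frac{1}{[(n-3)/3]!} \leq \frac{n}{3} \cdot \frac{1}{[n/3]!}$$
Similarly,
$$\frac{1}{[(n-1)/3]!} \leq \frac{n}{3} \cdot \frac{1}{[n/3]!}$$
Thus,
$$|a_n| \leq \frac{1}{3[n/3]!}(2^{n-1} + 2^{n-3}) < \frac{2^n}{[n/3]!}$$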
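The recursion and the bound can both be checked by exact rational arithmetic. The sketch below works directly from Equation (5.9): it uses the recursion obtained by equating coefficients, with the right-hand side $x^2$ contributing $+1$ to the numerator at $n = 4$, and then tests the estimate $|a_n| \leq 2^n/[n/3]!$ term by term. The function name `series_coefficients` and the cutoff of 40 terms are choices made here.

```python
from fractions import Fraction
from math import factorial

def series_coefficients(count):
    """a_0, ..., a_{count-1} for y'' + (x^2 - 1)y' + xy = x^2, y(0) = 0, y'(0) = 2."""
    a = [Fraction(0), Fraction(2)]           # initial conditions
    for n in range(2, count):
        prev3 = a[n - 3] if n >= 3 else Fraction(0)
        forcing = 1 if n == 4 else 0         # coefficient of x^2 on the right side
        a.append(((n - 1) * a[n - 1] - (n - 2) * prev3 + forcing) / (n * (n - 1)))
    return a[:count]

if __name__ == "__main__":
    a = series_coefficients(40)
    print(a[:6])
    # Check the estimate |a_n| <= 2^n / floor(n/3)! for every computed term.
    print(all(abs(a[n]) <= Fraction(2 ** n, factorial(n // 3)) for n in range(40)))
```

Exact fractions are used so that the comparison with the bound is not blurred by rounding.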
This now tells us a lot. For, the solution to the problem in (5.9)
differs from
$$2x + x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{6}x^4 - \tfrac{11}{60}x^5$$
by at most $\sum_{n \geq 6} |a_n||x|^n$, where the $a_n$ satisfy (5.13). Thus the error
is dominated by
$$\sum_{n \geq 6} \frac{2^n}{[n/3]!}|x|^n = \sum_{k \geq 2} \frac{2^{3k}|x|^{3k}}{k!} + \sum_{k \geq 2} \frac{2^{3k+1}|x|^{3k+1}}{k!} + \sum_{k \geq 2} \frac{2^{3k+2}|x|^{3k+2}}{k!}$$
$$\leq (1 + 2|x| + 4|x|^2) \cdot \left( \exp((2|x|)^3) - 1 - (2|x|)^3 \right)$$
Hence in the interval $0 < x < 1$ the solution is given by the above
polynomial except for an error of at most $7(e^8 - 9)$ (which is about
$2 \times 10^4$)! The reader is forgiven if he is unimpressed with our estimate, but
he should not go so far as to discard the technique for this reason. For
the paucity of our results is due to laziness rather than the uselessness
of the Taylor development. If we pushed this procedure up to 1000
terms (an easy task for a computer), then by (5.13) the error would be at most
$7e^8 \cdot 2^{999}/333!$, which is less than $10^{-390}$; a good estimate indeed.
Let us recapitulate the basic ideas. We are given an initial value problem:
$$y^{(k)} + \sum_{i=0}^{k-1} g_i(x) y^{(i)} = h(x), \quad y(0) = c_0,\ y'(0) = c_1,\ \ldots,\ y^{(k-1)}(0) = c_{k-1}$$
We replace the $g$'s and $h$ by power series expansions and test the "solution"
$f(x) = \sum_{n=0}^{\infty} a_n x^n$. The first $k$ terms are found from the initial
conditions, and the rest are found by equating the coefficient of $x^n$ on both
sides of the equation. This leaves us with these problems to resolve:
(i) Can we represent the given $g$'s and $h$ by power series?
(ii) Can we differentiate a power series term by term?
(iii) How do we multiply power series? (In the above illustration, the
$g$'s were polynomials, so there was not much difficulty.)
(iv) Can the system of relations between the $a_k$'s really be solved uniquely?
(v) Can we effectively estimate the error between the solution and a finite
part of its (supposed) Taylor expansion?
Little by little, we will resolve these problems. Suffice it to say that the
answer to (i) in general is No (see Section 5.8). However, in problems that
do arise naturally, the given functions usually are sums of convergent power
series. If this is the case, all other questions can be satisfactorily answered;
that is, the solution also is the sum of a convergent power series whose
coefficients can be determined by the above technique and the estimate on the
remainder can be effectively computed. Let us look at another illustration.
Examples
10.
$$x^3 y''' + x^2 y'' + xy' + y = e^x \tag{5.14}$$
$$y(0) = 1 \qquad y'(0) = 1/2 \qquad y''(0) = 1/5$$
Let the solution be $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Substituting in (5.14), we
obtain
$$\sum_{n=0}^{\infty} n(n-1)(n-2)a_n x^n + \sum_{n=0}^{\infty} n(n-1)a_n x^n + \sum_{n=0}^{\infty} n a_n x^n + \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
which gives these equations for the coefficients:
$$a_n(n(n-1)(n-2) + n(n-1) + n + 1) = \frac{1}{n!} \qquad \text{for all } n$$
or
$$a_n = \frac{1}{n!(n^3 - 2n^2 + 2n + 1)} \tag{5.15}$$
Notice that we have not used the initial conditions and fortunately
they conform to the requirements (5.15). That is, for this particular
equation, there is a unique power series solution independent of any initial
conditions. This does not contradict any previous results because
Picard's theorems do not apply (since the leading coefficient $x^3$ vanishes
at 0 and so is not invertible there).
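A quick numerical check of a coefficient formula for (5.14) is to sum a truncated series and its derivatives and substitute into the equation. The sketch below takes the coefficients obtained by substituting $\sum a_n x^n$ directly into (5.14), namely $a_n \cdot (n(n-1)(n-2) + n(n-1) + n + 1) = 1/n!$; the function names and the truncation at 30 terms are choices made here for illustration.

```python
from fractions import Fraction
from math import exp, factorial

def coefficient(n: int) -> Fraction:
    """a_n from a_n * (n(n-1)(n-2) + n(n-1) + n + 1) = 1/n!."""
    return Fraction(1, factorial(n) * (n ** 3 - 2 * n ** 2 + 2 * n + 1))

def residual(x: float, terms: int = 30) -> float:
    """x^3 y''' + x^2 y'' + x y' + y - e^x for the truncated series."""
    y = y1 = y2 = y3 = 0.0
    for n in range(terms):
        a = float(coefficient(n))
        y += a * x ** n
        if n >= 1:
            y1 += n * a * x ** (n - 1)
        if n >= 2:
            y2 += n * (n - 1) * a * x ** (n - 2)
        if n >= 3:
            y3 += n * (n - 1) * (n - 2) * a * x ** (n - 3)
    return x ** 3 * y3 + x ** 2 * y2 + x * y1 + y - exp(x)

if __name__ == "__main__":
    print([coefficient(n) for n in range(3)])
    print(residual(0.5), residual(-1.0))
```

The first three coefficients give the values $y(0) = a_0$, $y'(0) = a_1$, $y''(0) = 2a_2$ that any power series solution must have.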
11.
$$y'' - xy' + 2y = 0 \tag{5.16}$$
$$y(0) = 1 \qquad y'(0) = 0$$
Here Picard's theorem does apply, so we should get a unique solution
with the given initial conditions. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the
candidate. (5.16) becomes
$$\sum_{n=0}^{\infty} n(n-1)a_n x^{n-2} - \sum_{n=0}^{\infty} n a_n x^n + \sum_{n=0}^{\infty} 2a_n x^n = 0$$
or
$$a_0 = 1 \qquad a_1 = 0$$
$$(n+2)(n+1)a_{n+2} - na_n + 2a_n = 0 \qquad n \geq 0$$
or
$$a_{n+2} = \frac{(n-2)a_n}{(n+2)(n+1)}$$
Thus $a_2 = -1$, $a_3 = 0$, $a_4 = 0$ and thus all further coefficients are
zero. The solution is $f(x) = 1 - x^2$.
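The recursion of this example is short enough to run by hand, but a sketch in code shows how the series terminates. The function name `coefficients_516` is an illustrative choice; it starts from $a_0 = 1$, $a_1 = 0$ and applies $a_{n+2} = (n-2)a_n/((n+2)(n+1))$.

```python
from fractions import Fraction

def coefficients_516(count):
    """First `count` coefficients of the series solution of y'' - x y' + 2y = 0,
    y(0) = 1, y'(0) = 0."""
    a = [Fraction(1), Fraction(0)]
    for n in range(count - 2):
        a.append(Fraction(n - 2, (n + 2) * (n + 1)) * a[n])
    return a[:count]

if __name__ == "__main__":
    print(coefficients_516(8))    # coefficients of 1 - x^2, padded with zeros
```

The factor $n - 2$ vanishes at $n = 2$, which is what cuts the even-indexed coefficients off after $a_2$.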
In the next section, we shall fully develop the theory of power series. It
is most advantageous (as we have already seen) to do so in the complex
domain.
• EXERCISES
6. Find an approximate solution for
/ - xy = 0 XO) = 0 /(0) = 1
with an error of at most 10"* in the interval [— 1, 1]
7. Do the same for
у -x2y = l X0)=0 /(0)=0 /'(0)=0
with an error of ΙΟ""1 m [—i, i]
8. Find a recursive formula for the coefficients of the solution, and a
reasonable estimate:
(a) у" -2у' + у = 0, Х0) = 1, /(0) = 0.
(b) У -2/ + ху = е*,Х0) = 1,/(0) = 1.
(c) y<k) + у = 1, arbitrary initial conditions.
(d) / - k2y = 0
(e) / = x2 + xy, y(0) = 0.
• PROBLEMS
12. Why doesn't Picard's theorem apply to Equation (5.14)?
13. The second-order equation xy" + y'=0 seems to have only one
solution by the series method, but two independent solutions by the method
of separation of variables. Explain that.
5.5 Power Series
We have already discussed at length the power series expansion of the
exponential and trigonometric functions, the geometric series and some others.
We have also seen that the Taylor formula produces a power series expansion
for suitable functions. We have observed that there is a certain disk
corresponding to each power series, called the disk of convergence. The series
converges inside that disk and diverges outside. We shall recollect all this
information as the starting point of our discussion of complex power series.
Theorem 5.6. Let $\{c_n\}$ be a sequence of complex numbers. There is a
nonnegative number $R$ (called the radius of convergence of the power series
$\sum c_n z^n$) with these properties:
(a) $\sum_{n=0}^{\infty} c_n z^n$ diverges if $|z| > R$.
(b) $\sum_{n=0}^{\infty} c_n z^n$ converges absolutely and uniformly in any disk
$\{z \in C: |z| \leq r\}$ with $r < R$.
$R$ has these two descriptions:
(i) $R = \text{l.u.b.}\ \{t: |c_n|t^n \text{ is bounded}\}$.
(ii) $R = (\limsup (|c_n|)^{1/n})^{-1}$.
Proof. For at least part of the proof we could refer to Proposition 9. As in that
proposition we consider the set
$$\{t \geq 0: \text{there is an } M \text{ such that } M \geq |c_n| t^n \text{ for all } n\}$$
If this set is unbounded, we can take $R = \infty$; otherwise, let $R$ be the least upper
bound of this set.
(a) Suppose $|z| > R$. Then there is a $t$, $|z| > t > R$ such that $\{|c_n|t^n\}$ is
unbounded. Since $|c_n||z^n| \geq |c_n|t^n$ for all $n$, we cannot have
$\lim c_n z^n = 0$ so $\sum c_n z^n$ diverges.
(b) Let $r < R$, $\Delta = \{z \in C: |z| \leq r\}$. Then there is a $t$, $r < t < R$
such that $\{|c_n|t^n\}$ is bounded, say by $M$. If $|z| \leq r$,
$$|c_n z^n| = |c_n|t^n \left( \frac{|z|}{t} \right)^n \leq M \left( \frac{r}{t} \right)^n$$
Thus, letting $\|\cdot\|$ be the uniform norm for $C(\Delta)$, we have
$\|c_n z^n\| \leq M(r/t)^n$. Since $r/t < 1$, $\sum (r/t)^n < \infty$, so by comparison
$\sum c_n z^n$ converges absolutely and uniformly in $C(\Delta)$.
Further, by definition, $R$ is given by (i); the more esoteric formulation (ii) we
shall leave as Problem 14.
Examples
12. If $\sum c_n z^n$ is a given power series with radius of convergence
$R$, the question may arise: what happens on the circle $|z| = R$? The
answer is that practically anything can happen.
(a) If the sequence $\{c_n\}$ is summable, that is, $\sum |c_n| < \infty$, then by
comparison $\sum c_n z^n$ converges uniformly in $\{|z| \leq 1\}$. Thus the series
$\sum (z^n/n^2)$ has radius of convergence 1 and converges uniformly in
$\{|z| \leq 1\}$.
(b) $\sum (z^n/n)$ also has radius of convergence 1, but $\sum (1^n/n)$ does
not converge, whereas $\sum [(-1)^n/n]$ does converge.
(c) $\sum z^n$ has radius of convergence 1, but $\sum z^n$ does not converge
for any $z$ with $|z| = 1$ at all ($\lim z^n \neq 0$ if $|z| = 1$!).
Since no general assertion on the circle of convergence is possible,
we needn't be concerned with the behavior of the series there (except
in particular cases).
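13. The geometric series $\sum_{n=0}^{\infty} z^n$ is a power series with radius of
convergence 1. This series converges to $(1 - z)^{-1}$ uniformly and
absolutely on any disk $\{z \in C: |z| \leq r\}$ with $r < 1$. Thus
$$\frac{1}{1-z} = \sum_{n=0}^{\infty} z^n \qquad \text{for } |z| < 1$$
Now let $a \in C$, $a \neq 0$. Then
$$\frac{1}{a - z} = \frac{1}{a} \cdot \frac{1}{1 - (z/a)} = \frac{1}{a} \sum_{n=0}^{\infty} \left( \frac{z}{a} \right)^n = \sum_{n=0}^{\infty} \frac{z^n}{a^{n+1}}$$
This convergence is assured in the disk $\{z \in C: |z| < |a|\}$.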
14. The series $\sum_{n=0}^{\infty} (z^n/n!)$ has infinite radius of convergence.
Thus the sum is a continuous function on the whole plane, denoted
$e^z$ since this sum does converge to the exponential function for real
values of $z$. We have seen that, for real $x$, $e^{ix} = \cos x + i \sin x$. We
can use this (or Taylor's theorem) to obtain series for the sine and
cosine:
$$\cos x + i \sin x = \sum_{n=0}^{\infty} \frac{(ix)^n}{n!}$$
$$= \sum_{k=0}^{\infty} \left[ \frac{x^{4k}}{(4k)!} + i\frac{x^{4k+1}}{(4k+1)!} - \frac{x^{4k+2}}{(4k+2)!} - i\frac{x^{4k+3}}{(4k+3)!} \right]$$
$$= \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}$$
These series also converge on the entire plane. We can use them to
define the complex cosine and sine:
$$\cos z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!} \qquad \sin z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!} \tag{5.17}$$
We also have the equation
$$e^{iz} = \cos z + i \sin z \tag{5.18}$$
for all complex numbers $z$ (for the series will sum again that way).
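The series definitions (5.17) can be summed numerically and compared with Equation (5.18). The sketch below is an illustration only: the names `cos_series` and `sin_series` and the fixed count of 40 terms are choices made here, and the library function `cmath.exp` serves as the reference value.

```python
import cmath

def cos_series(z, terms=40):
    """Partial sum of sum_k (-1)^k z^(2k) / (2k)!."""
    total, term = 0j, 1 + 0j
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 1) * (2 * k + 2))   # next term from the current one
    return total

def sin_series(z, terms=40):
    """Partial sum of sum_k (-1)^k z^(2k+1) / (2k+1)!."""
    total, term = 0j, complex(z)
    for k in range(terms):
        total += term
        term *= -z * z / ((2 * k + 2) * (2 * k + 3))
    return total

if __name__ == "__main__":
    z = 1 + 2j
    print(cos_series(z) + 1j * sin_series(z))
    print(cmath.exp(1j * z))
```

For arguments of moderate size 40 terms are far more than needed; for large $|z|$ the partial sums suffer from cancellation and more care would be required.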
15. Replacing $z$ by $-iz$ and $iz$, alternately, we obtain these other
interesting equations:
$$e^z = \cos(-iz) + i \sin(-iz) = \cos(iz) - i \sin(iz)$$
$$e^{-z} = \cos(iz) + i \sin(iz)$$
Thus
$$\frac{e^z + e^{-z}}{2} = \cos(iz) \tag{5.19}$$
$$\frac{e^z - e^{-z}}{2} = -i \sin(iz) \tag{5.20}$$
For real values of $z$, the left-hand sides of Equations (5.19), (5.20) are
the hyperbolic cosine and hyperbolic sine, respectively. We can use
these expressions to define the complex cosh and sinh:
$$\cosh z = \frac{e^z + e^{-z}}{2} \qquad \sinh z = \frac{e^z - e^{-z}}{2}$$
Because of (5.19) and (5.20), the complex trigonometric functions are,
on the imaginary axis, the hyperbolic functions:
$$\cosh z = \cos(iz) \qquad \sinh z = -i \sin(iz) \tag{5.21}$$
We should also note that the trigonometric identities imply the
hyperbolic ones. Since $\cos^2(iz) + \sin^2(iz) = 1$, it follows from (5.21) that
$$\cosh^2 z - \sinh^2 z = 1 \tag{5.22}$$
(see Exercise 9).
16. $\sum_{n=1}^{\infty} (z^n/n)$ has radius of convergence 1. So does the series
$\sum_{n=1}^{\infty} n^k z^n$, for any integer $k$. We shall see later that the sums of
all these series can be given by closed expressions (such as
$\sum z^n = (1 - z)^{-1}$).
17. A polynomial function in $C$ is given by a power series. In
fact, writing the polynomial $p(z) = \sum_{n=0}^{N} a_n z^n$ is the same as giving
its power series expansion. What is more interesting is that any
point in $C$ can be chosen as the center of a power series expansion for
$p$. Let $z_0 \in C$ and write
$$p(z) = \sum_{n=0}^{N} a_n (z - z_0 + z_0)^n$$
Using the binomial theorem this becomes
$$p(z) = \sum_{n=0}^{N} a_n \sum_{i=0}^{n} \binom{n}{i} (z - z_0)^i z_0^{n-i} \tag{5.23}$$
All sums being finite, we may arrange terms at will. Thus we can
rewrite (5.23) as a sum of powers of $z - z_0$:
$$p(z) = \sum_{n=0}^{N} \left[ \sum_{m=n}^{N} a_m \binom{m}{n} z_0^{m-n} \right] (z - z_0)^n$$
which is the desired expansion.
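The recentering formula is easy to evaluate mechanically. In the sketch below the function name `recenter` is an illustrative choice, coefficients are listed from the constant term upward, and `math.comb` (available in Python 3.8 and later) supplies the binomial coefficients.

```python
from math import comb

def recenter(coeffs, z0):
    """Coefficients b_n with p(z) = sum_n b_n (z - z0)^n.

    coeffs[m] is a_m in p(z) = sum_m a_m z^m, and
    b_n = sum_{m >= n} a_m * C(m, n) * z0^(m - n).
    """
    N = len(coeffs) - 1
    return [
        sum(coeffs[m] * comb(m, n) * z0 ** (m - n) for m in range(n, N + 1))
        for n in range(N + 1)
    ]

if __name__ == "__main__":
    print(recenter([0, 0, 1], 1))        # z^2 = 1 + 2(z - 1) + (z - 1)^2 -> [1, 2, 1]
    print(recenter([1, -3, 0, 2], 2))    # 2z^3 - 3z + 1 about z0 = 2 -> [11, 21, 12, 2]
```

Each $b_n$ is $p^{(n)}(z_0)/n!$, so the output can be cross-checked against the derivatives of $p$ at $z_0$.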
More generally, any series of the form $\sum_{n=0}^{\infty} c_n (z - z_0)^n$ will be
called a power series expansion centered at $z_0$. Can we expand $e^z$ in a power
series centered at a point other than the origin? The answer is yes (cf. Problem
15), and the proof is like the one above for polynomials, but the question of
convergence intervenes after the analog of Equation (5.23) above. It is a
general fact that for any function given by a power series, we may move the
center of the expansion to any other point in the disk of convergence. The
truly courageous student should try to prove this now; it can be done. We
will give a proof later which is simple and avoids convergence problems but
requires more sophisticated information about functions defined by power
series expansions.
Addition and Multiplication of Power Series
Suppose $f$, $g$ are complex-valued functions defined by power series
expansions centered at a point $z_0$. Then we can find series expansions for the
functions $f + g$ and $fg$ also. Addition is easy: if, say
$$f(z) = \sum a_n (z - z_0)^n \qquad g(z) = \sum b_n (z - z_0)^n$$
then
$$(f + g)(z) = \sum (a_n + b_n)(z - z_0)^n$$
But to find the series expansion for the product requires a little more care.
Suppose that $z_0 = 0$ (this involves no loss of generality). To say that
$f(z) = \sum a_n z^n$ is to say that in a certain disk $\Delta$, $f$ is the limit of the
polynomials $\sum_{n=0}^{N} a_n z^n$. Similarly, $g$ is the limit of the polynomials
$\sum_{n=0}^{N} b_n z^n$. Thus, $fg$ is the limit of the sequence of polynomials
$(\sum_{n=0}^{N} a_n z^n)(\sum_{n=0}^{N} b_n z^n)$. Now, we can multiply polynomials
easily,
$$\left( \sum_{n=0}^{N} a_n z^n \right)\left( \sum_{n=0}^{N} b_n z^n \right) = \sum_{n=0}^{N} \sum_{m=0}^{N} a_n b_m z^{n+m}$$
If we collect terms in this expression to form a series of powers of $z$ we do not
get a very aesthetic expression, but if we take some terms from the next few
polynomials in the sequence we obtain a reasonable expression.
$$\sum_{k=0}^{N} \left( \sum_{n+m=k} a_n b_m \right) z^k \tag{5.24}$$
We could hope that $fg$ is the limit of this sequence of polynomials. This is a
reasonable hope; for even though we have modified the original sequence
of polynomials we have neither added nor deleted from the series represented
by that sequence. In fact, by making careful use of this fact, we can verify
that $fg$ is the limit of (5.24).
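Proposition 3. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$, $g(z) = \sum_{n=0}^{\infty} b_n z^n$
and suppose $r$ is less than the radii of convergence of both series. Then
(i) $(f + g)(z) = \sum_{n=0}^{\infty} (a_n + b_n) z^n$ uniformly and absolutely in
$\Delta = \{z \in C: |z| \leq r\}$,
(ii) $(fg)(z) = \sum_{k=0}^{\infty} (\sum_{n+m=k} a_n b_m) z^k$ uniformly and absolutely
in $\Delta$.
Proof. Let $p_n(z) = \sum_{k=0}^{n} a_k z^k$, $q_n(z) = \sum_{k=0}^{n} b_k z^k$. By hypothesis
$p_n \to f$, $q_n \to g$ uniformly in $\Delta$. Thus $p_n + q_n \to f + g$,
$p_n q_n \to fg$ uniformly in $\Delta$ (Problem 2.55). Since
$$p_n(z) + q_n(z) = \sum_{k=0}^{n} (a_k + b_k) z^k$$
(i) is proven.
(ii) Let
$$r_n(z) = \sum_{k=0}^{n} \left( \sum_{i+j=k} a_i b_j \right) z^k$$
We want to show that $r_n \to fg$ uniformly in $\Delta$. We know that $p_n q_n \to fg$ so
it would seem worth our while to compute $p_n q_n - r_n$. But that is easy,
$$p_n q_n - r_n = \sum_{k=n+1}^{2n} \Bigg( \sum_{\substack{i+j=k \\ i \leq n,\ j \leq n}} a_i b_j \Bigg) z^k$$
Now, each term on the right is of the form $a_i b_j z^{i+j}$ with $i + j > n$, so that
$i > n/2$ or $j > n/2$. Thus, computing norms on $\Delta = \{z \in C: |z| \leq r\}$,
$$\|p_n q_n - r_n\| \leq \left( \sum_{i > n/2} |a_i| r^i \right)\left( \sum_{j=0}^{\infty} |b_j| r^j \right) + \left( \sum_{i=0}^{\infty} |a_i| r^i \right)\left( \sum_{j > n/2} |b_j| r^j \right)$$
Now, we know that $\sum_{i=0}^{\infty} |a_i| r^i$, $\sum_{j=0}^{\infty} |b_j| r^j$ are finite.
Let $M$ be a number larger than both. Given $\varepsilon > 0$, there are
(1) $N_1 > 0$ such that $\sum_{i > n/2} |a_i| r^i < \varepsilon$ if $n \geq N_1$,
(2) $N_2 > 0$ such that $\sum_{j > n/2} |b_j| r^j < \varepsilon$ if $n \geq N_2$,
(3) $N_3 > 0$ such that $\|p_n q_n - fg\| < \varepsilon$ if $n \geq N_3$.
These assertions follow from the known convergence in each case. Thus, if
$n \geq \max(N_1, N_2, N_3)$,
$$\|r_n - fg\| \leq \|p_n q_n - fg\| + \|p_n q_n - r_n\| < \varepsilon + \varepsilon \cdot M + M \cdot \varepsilon = (2M + 1)\varepsilon$$
and the proposition is concluded.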
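The coefficient rule of part (ii), $c_k = \sum_{n+m=k} a_n b_m$, is the usual Cauchy product, and it can be tried on truncated coefficient lists. In the sketch below the name `cauchy_product` is an illustrative choice; the example squares the exponential series, for which the binomial theorem predicts the coefficients $2^k/k!$ of $e^{2z}$.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """First min(len(a), len(b)) coefficients of the product series.

    c_k = sum_{n + m = k} a_n * b_m, which only needs a_0..a_k and b_0..b_k.
    """
    size = min(len(a), len(b))
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(size)]

if __name__ == "__main__":
    e = [Fraction(1, factorial(k)) for k in range(10)]     # coefficients of e^z
    print(cauchy_product(e, e)[:5])
    print([Fraction(2 ** k, factorial(k)) for k in range(5)])
```

Only the low-order coefficients of a truncated product are exact, which is why the function returns no more terms than the shorter input provides.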
• EXERCISES
9. Verify, in the way suggested in the text, that $\cos^2 z + \sin^2 z = 1$ is true
for all complex numbers $z$, and thus $\cosh^2 z - \sinh^2 z = 1$ is always true.
10. Find a power series expansion for these functions·
(a) exp(z2) (e) e~' j ^—{
(b) ez sin ζ (f) cosh ζ
cos ζ
(с) γ—^ (g) smhz
dt
(d) ί exp(t2)dt
11. Verify by multiplying the power series that $e^{z+w} = e^z e^w$.
12. From the addition formula for the exponential (Exercise 11), deduce
the addition formulas for cos, sin, sinh, cosh.
PROBLEMS
14. If $\{c_n\}$ is any bounded sequence, then the maximum number, every
neighborhood of which has infinitely many members of the sequence, is denoted
$\limsup c_n$. Show that the radius of convergence of the series $\sum c_n z^n$ is
$R = (\limsup (|c_n|)^{1/n})^{-1}$.
15. Expand $e^z$ in a power series about any point $z_0$.
16. Assuming that $(1 + x^2)^{-1/2}$ can be represented by a power series
centered at 0, find it. Find the power series for arc cos $x$.
17. Assuming that $\tan x$ can be represented by a power series centered
at 0 and using the equation $\tan x \cos x = \sin x$, find the power series
expansion of $\tan x$.
5.6 Complex Differentiation
An easy property of the exponential function is
lim e—^- = 1 (5.25)
For
so
oo 2
n = 0 П\
- f —-1 J γ —\
Now the term in parenthesis is a convergent power series, so is continuous
at 0. Thus, writing the parenthesis as g{z):
e* - 1
lim = 1 + lim zg(z) = 1
z->0 Ζ ζ->ο
From (5.25) and the properties of the exponential it follows that
exp(z0 + z) - exp(z0) ez - I
lim —— —- = exp(z0) lim = exp(z0)
z-»0 Ζ ζ-»0 Ζ
The student of calculus will recognize the limit on the left as a difference
quotient and the entire equation as a replica of the behavior of the real
exponential function. It might be a good idea to consider more generally such
a process of differentiation on the complex plane. This turns out to be a
5.6 Complex Differentiation 429
very significant idea, because there are many beautiful and useful ways to
represent functions which are so differentiable.
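The limit of the difference quotient for the exponential can be observed numerically. The following is only a sketch: the helper name `difference_quotient`, the base point, and the step sizes are illustrative choices, not part of the text. It evaluates the quotient for small steps taken in several complex directions and compares the result with $\exp(z_0)$.

```python
import cmath

def difference_quotient(f, z0, h):
    """Return (f(z0 + h) - f(z0)) / h for a complex step h."""
    return (f(z0 + h) - f(z0)) / h

z0 = 0.3 + 0.7j  # arbitrary base point chosen for illustration
# Approach z0 along the real axis, the imaginary axis, and a diagonal.
for h in (1e-6, 1e-6j, (1 + 1j) * 1e-6):
    q = difference_quotient(cmath.exp, z0, h)
    print(h, abs(q - cmath.exp(z0)))
```

Each printed error is small, whatever the direction of approach, which is what complex differentiability requires.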
Definition 3. Let $f$ be a complex-valued function defined in a neighborhood
of $z_0$ in $C$. $f$ is differentiable at $z_0$ if
$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0}$$
exists. In this case we write the limit as $f'(z_0)$. If $f$ is defined in an open
set $U$ and differentiable at every point of $U$, we say that $f$ is differentiable
on $U$.
The usual algebraic facts on differentiation hold true in the complex
domain.
Proposition 4.
(i) Suppose $f$, $g$ are differentiable at $z_0$. Then so are $f + g$ and $fg$, with
the derivatives given by
$$(f + g)'(z_0) = f'(z_0) + g'(z_0)$$
$$(fg)'(z_0) = f'(z_0)g(z_0) + f(z_0)g'(z_0)$$
(ii) Suppose $f$ is differentiable at $z_0$ and $f(z_0) \ne 0$. Then $1/f$ is
differentiable at $z_0$ and $(1/f)'(z_0) = -f'(z_0)/f(z_0)^2$.
(iii) Suppose $f$ is differentiable at $z_0$ and $g$ is differentiable at $f(z_0)$. Then
$g \circ f$ is differentiable at $z_0$ and $(g \circ f)'(z_0) = g'(f(z_0))f'(z_0)$.
Proof. These propositions are so much like the corresponding propositions in
calculus that their proofs will be left to the reader.
Examples
18. The function $z$ is clearly differentiable, and $z'(z_0) = 1$ for all
$z_0$. A constant function is differentiable with derivative zero. Since
any polynomial is obtained from $z$ and constant functions by a
succession of operations as described in Proposition 4(i), all polynomials
are differentiable.
19. The function $\bar{z}$ is nowhere differentiable. For the difference
quotient $(\bar{z} - \bar{z}_0)/(z - z_0)$, for $z \ne z_0$, is a point on the unit circle,
and as $z$ ranges through a neighborhood of $z_0$ this difference quotient
takes on all values on the unit circle, so it could hardly converge.
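The direction dependence in Example 19 is easy to see numerically. In this sketch (the function name `conj_quotient` and the sample point are illustrative assumptions), the step is $h = re^{i\theta}$, for which the quotient of the conjugate function equals $e^{-2i\theta}$ regardless of how small $r$ is.

```python
import cmath
import math

def conj_quotient(z0, theta, r=1e-3):
    """Difference quotient of z -> conj(z) at z0 with step h = r*exp(i*theta)."""
    h = r * cmath.exp(1j * theta)
    return ((z0 + h).conjugate() - z0.conjugate()) / h

z0 = 1 + 2j  # arbitrary sample point
for theta in (0.0, math.pi / 4, math.pi / 2):
    # The value depends only on the direction theta, not on r.
    print(theta, conj_quotient(z0, theta))
```

Approaching horizontally gives a value near 1, approaching vertically gives a value near -1, so no single limit exists.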
Sum of a Power Series is Infinitely Differentiable
Our introduction to this section was essentially a proof that $e^z$ is
everywhere differentiable. It is in fact true that the sum of a power series is
differentiable in its disk of convergence. We now verify this basic fact.
Theorem 5.7. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ have radius of convergence $R$. Then
$f$ is differentiable at every point in the disk of radius $R$, and the series
$$f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} \qquad (5.26)$$
has the same radius of convergence as $\sum_{n=0}^{\infty} a_n z^n$.
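Proof. $\limsup (n|a_n|)^{1/n} = \lim n^{1/n} \cdot \limsup |a_n|^{1/n} = \limsup |a_n|^{1/n}$, so the series
$\sum n a_n z^{n-1}$ has the same radius of convergence as the given series. We must show
that it represents the derivative of $f$. Fix a $z_0$, $|z_0| < R$, and choose $r$ with $|z_0| < r < R$. The
series $\sum n|a_n| r^{n-1}$ converges, so given $\varepsilon > 0$ there is an $N$ such that
$\sum_{n>N} n|a_n| r^{n-1} < \varepsilon$. Now consider the difference quotient defining $f'(z_0)$:
$$\frac{f(z) - f(z_0)}{z - z_0} = \sum_{n=0}^{\infty} a_n \left(\frac{z^n - z_0^n}{z - z_0}\right) = \sum_{n=1}^{\infty} a_n \left(\sum_{k=1}^{n} z^{n-k} z_0^{k-1}\right)$$
If $|z| < r$ as well as $|z_0| < r$, then
$$\left|\sum_{n>N} a_n \sum_{k=1}^{n} z^{n-k} z_0^{k-1}\right| \le \sum_{n>N} |a_n| \sum_{k=1}^{n} r^{n-k} r^{k-1} \le \sum_{n>N} n|a_n| r^{n-1} < \varepsilon$$
Similarly, $\left|\sum_{n>N} n a_n z_0^{n-1}\right| < \varepsilon$. Thus,
$$\left|\frac{f(z) - f(z_0)}{z - z_0} - \sum_{n=1}^{\infty} n a_n z_0^{n-1}\right| < 2\varepsilon + \sum_{n \le N} |a_n| \left|\sum_{k=1}^{n} z^{n-k} z_0^{k-1} - n z_0^{n-1}\right|$$
Now, by continuity, as $z \to z_0$ the last term tends to zero. Thus, there is a $\delta > 0$
such that if $|z - z_0| < \delta$, the last term is less than $\varepsilon$. Thus, for $|z| < r$ and $|z - z_0| < \delta$
we have
$$\left|\frac{f(z) - f(z_0)}{z - z_0} - \sum_{n=1}^{\infty} n a_n z_0^{n-1}\right| < 3\varepsilon$$
which proves that the limit of the difference quotient exists and is given by (5.26).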
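As a numerical illustration of (5.26), one can differentiate a truncated power series term by term and compare with a known derivative. The sketch below uses the geometric series $\sum z^n = 1/(1-z)$, whose derivative is $1/(1-z)^2$; the helper names `derived_series` and `evaluate` and the truncation at 200 terms are assumptions made for the example.

```python
def derived_series(coeffs):
    """Coefficients of the term-by-term derivative: [1*a_1, 2*a_2, ...]."""
    return [n * a for n, a in enumerate(coeffs)][1:]

def evaluate(coeffs, z):
    """Evaluate sum(coeffs[n] * z**n) by Horner's rule."""
    total = 0
    for a in reversed(coeffs):
        total = total * z + a
    return total

geometric = [1] * 200      # truncation of 1 + z + z^2 + ...
z = 0.3 + 0.2j             # a point inside the unit disk
approx = evaluate(derived_series(geometric), z)
exact = 1 / (1 - z) ** 2
print(abs(approx - exact))
```

Inside the disk of convergence the truncated derived series agrees with the closed form to rounding error.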
In particular, since $f'$ is given by a convergent power series, it also is
differentiable, with derivative $f''(z) = \sum n(n - 1)a_n z^{n-2}$, and so forth. We
thus obtain these results, which form the complex version of Taylor's theorem
for sums of convergent power series.
Corollary 1. Let $f(z) = \sum a_n z^n$ have radius of convergence $R$. Then $f$ is
infinitely differentiable in $\{z: |z| < R\}$. Furthermore, for every $k$ the $k$th
derivative $f^{(k)}$ is given by a convergent power series,
$$f^{(k)}(z) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) a_n z^{n-k}$$
Corollary 2. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be convergent in a disk about 0. The
coefficients $\{a_n\}$ are uniquely determined by $f$:
$$a_n = \frac{f^{(n)}(0)}{n!}$$
Notice that the definition of complex derivative is a genuine generalization
of the differentiation of functions of a real variable. Thus the same
corollaries hold for functions of a real variable represented by power series:
Corollary 3. If $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ in a neighborhood of $x_0$, then $f$
is infinitely differentiable at $x_0$ and
$$a_n = \frac{f^{(n)}(x_0)}{n!}$$
$$f^{(k)}(x) = \sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) a_n (x - x_0)^{n-k}$$
These corollaries are easily derived from the theorem and their proofs are
left to the student. Notice that the implication of Corollary 2 is that the
coefficients of a power series representation of a function are uniquely and
directly determined by the function. In particular, a function cannot be
written as the sum of a power series in more than one way. This
observation allows us to easily verify the identity
$$\cos^2 z + \sin^2 z = 1 \qquad (5.27)$$
For the function $\cos^2 z + \sin^2 z$ is a polynomial in functions which are sums
of power series and thus is the sum of a power series. Its coefficients can
be computed according to Corollary 3 just by letting $z$ take on real values.
But the right-hand side of (5.27) is the Taylor expansion of $\cos^2 z + \sin^2 z$
for real $z$, thus it must be the Taylor expansion for all $z$. Hence (5.27) is
always true.
The Cauchy-Riemann Equation
It is of value to compare the notion of complex differentiation with that
of differentiation of functions defined on $R^2$, since $R^2 = C$. Suppose that
$f$ is a complex-valued function defined in a neighborhood of $z_0 = x_0 + iy_0$.
If $f$ is differentiable as a function of two real variables, then the differential
$df(x_0, y_0)$ is defined and is a complex-valued linear function on $R^2$. If $f$
is also complex differentiable, then
$$f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \qquad (5.28)$$
exists. Let $z \to z_0$ along the horizontal line. Then (5.28) specializes to
$$f'(z_0) = \lim_{x \to x_0} \frac{f(x + iy_0) - f(x_0 + iy_0)}{x - x_0} = \frac{\partial f}{\partial x}(x_0, y_0) \qquad (5.29)$$
If we let $z \to z_0$ along a vertical, we also have
$$f'(z_0) = \lim_{y \to y_0} \frac{f(x_0 + iy) - f(x_0 + iy_0)}{i(y - y_0)} = \frac{1}{i}\frac{\partial f}{\partial y}(x_0, y_0) \qquad (5.30)$$
Thus the right-hand sides of (5.29) and (5.30) are the same. In conclusion,
a complex differentiable function must satisfy (when considered as a function
of two real variables) this relation
$$\frac{\partial f}{\partial x} = -i\,\frac{\partial f}{\partial y} \qquad (5.31)$$
This is called the Cauchy-Riemann equation. More precisely, the Cauchy-Riemann
equations are found by writing $f = u + iv$ and splitting into real
and imaginary parts. Let us record this important fact.
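Theorem 5.8. Let $f$ be a complex differentiable function in a domain $D$.
Split $f$ into real and imaginary parts, $f = u + iv$, and consider $f$ as a function of two real
variables ($z = x + iy$). Then these partial differential equations hold in $D$:
$$\frac{\partial f}{\partial x} = \frac{1}{i}\frac{\partial f}{\partial y} \qquad (5.32)$$
$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y} \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \qquad (5.33)$$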
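Proof. Equation (5.32) was observed above. Equation (5.33) follows from
(5.32) and the identities
$$\frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i\frac{\partial v}{\partial x} \qquad \frac{\partial f}{\partial y} = \frac{\partial u}{\partial y} + i\frac{\partial v}{\partial y}$$
Notice that when $f$ is complex differentiable, its differential is given by
$$df(z_0, (r + is)) = \frac{\partial f}{\partial x}(z_0)\,r + \frac{\partial f}{\partial y}(z_0)\,s = f'(z_0)\,r + i f'(z_0)\,s = f'(z_0)(r + is)$$
Thus the differential of a complex differentiable function is a complex linear
complex-valued function.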
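The Cauchy-Riemann equation (5.31) can be tested numerically by approximating the two partial derivatives with central differences. This is a sketch only: the name `cauchy_riemann_residual` and the step size are assumptions for illustration. The residual $|\partial f/\partial x + i\,\partial f/\partial y|$ vanishes exactly when (5.31) holds.

```python
import cmath

def cauchy_riemann_residual(f, z0, h=1e-5):
    """Approximate |df/dx + i*df/dy| at z0 using central differences."""
    df_dx = (f(z0 + h) - f(z0 - h)) / (2 * h)            # horizontal direction
    df_dy = (f(z0 + 1j * h) - f(z0 - 1j * h)) / (2 * h)  # vertical direction
    return abs(df_dx + 1j * df_dy)

z0 = 1 + 1j
print(cauchy_riemann_residual(lambda z: z * z, z0))          # near 0
print(cauchy_riemann_residual(cmath.exp, z0))                # near 0
print(cauchy_riemann_residual(lambda z: z.conjugate(), z0))  # near 2
```

The polynomial and the exponential give a residual near zero, while the conjugate function, which is not complex differentiable, gives a residual near 2.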
We shall show, via the techniques of the next few chapters, that a complex
differentiable function can be written as the sum of a convergent power
series. Thus, just by virtue of the differential being complex linear, the
function has derivatives of all orders and is the sum of its power series.
• PROBLEMS
18. Prove Proposition 4.
19. Prove Corollaries 1 and 2 of Theorem 5.7.
20. Show that if $f$ is an infinitely differentiable function on an interval
$(-\varepsilon, \varepsilon)$, and there is an $M > 0$ such that
$$|f^{(n)}(x)| < M \quad \text{all } n, \text{ all } x, \; -\varepsilon < x < \varepsilon$$
then $f$ is the sum of a power series which converges in the unit disk.
21. Suppose $f$ is a complex differentiable function in a domain in the
plane. Show that:
(a) if $f$ is real-valued it is constant.
(b) if $|f|$ is constant, then $f$ is constant.
22. Write the Cauchy-Riemann equations in polar coordinates. (Hint:
Differentiate along the ray and circle through a point.)
23. Suppose $f_1, \ldots, f_k$ are given by convergent power series in a disk $\Delta$.
If $F$ is a polynomial in $k$ variables such that
$$F(f_1(z), \ldots, f_k(z)) = 0 \qquad (5.34)$$
for real $z$, then (5.34) is true for all $z$ in $\Delta$.
24. Compute the limits of these quotients as $z \to 0$:
(a)
(b)
(c)
arc tan ζ
ζ
sin hz
sin ζ
cos ζ — 1
(d)
(e)
cos ζ — 1
cos ζ
sin z —tan ζ
(f) ; η = 0,1,2,3,4
25. Suppose $f$ is a differentiable complex-valued function of two real
variables in a domain $D$. Show that $f$ is complex differentiable if and only
if the differential $df(z_0)$ is complex linear for all $z_0 \in D$.
26. Suppose that $f$ is twice differentiable in $D$, and is complex
differentiable. Show also that $f'$ is complex differentiable. Supposing that
$(f')' = 0$, show that $f$ is a quadratic polynomial in $z$.
27. If $f = u + iv$ is complex differentiable in $D$ and twice differentiable,
then
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}$$
5.7 Differential Equations with Analytic Coefficients
A function which can be represented as the sum of a convergent power
series at a point $a \in C$ will be said to be analytic at $a$. We now return to the
study of linear differential equations in order to answer some of the questions
posed in Section 5.5. We can use the information in Section 5.6 to do this
and to provide the sought-for estimates. In particular, we shall verify the
following fact.
Proposition 5. Suppose $h, g_0, \ldots, g_{k-1}$ are analytic at 0. Then the
solution of the differential equation
$$y^{(k)} + \sum_{i=0}^{k-1} g_i y^{(i)} + h(x) = 0 \qquad y(0) = a_0, \ldots, y^{(k-1)}(0) = a_{k-1}$$
is also analytic at 0; that is, in some disk centered at 0, it is the sum of a
convergent power series whose coefficients can be recursively calculated from the
differential equation.
We already know, from Section 5.5, how to compute the Taylor coefficients
of the solution; our business here is to show that the resulting series does in
fact converge. This, of course, involves producing the kind of estimate
required by Theorem 5.7.
Suppose then that $h, g_0, \ldots, g_{k-1}$ are analytic in the interval $|x| < R$. Then
$$h(x) = \sum_{n=0}^{\infty} a_n x^n \qquad g_i(x) = \sum_{n=0}^{\infty} a_n^i x^n$$
and for some positive number $M$, $|a_n| \le MR^{-n}$, $|a_n^i| \le MR^{-n}$ for all $i$ and $n$. We
shall obtain the desired estimate in terms of $M$, $R$ and the initial conditions. For
simplicity, we shall do the homogeneous case only ($h = 0$), leaving the general case
for the reader (Problem 28). If
$$f(x) = \sum_{n=0}^{\infty} c_n x^n$$
is the desired solution, we have
$$c_0 = a_0, \ldots, c_{k-1} = \frac{a_{k-1}}{(k-1)!}$$
and the rest of the coefficients are found from these equations:
$$\sum_{n=k}^{\infty} n(n - 1) \cdots (n - k + 1) c_n x^{n-k} + \sum_{i=0}^{k-1} \left(\sum_{n=0}^{\infty} a_n^i x^n\right)\left(\sum_{n=i}^{\infty} n(n - 1) \cdots (n - i + 1) c_n x^{n-i}\right) = 0 \qquad (5.35)$$
Surely, the reader now has a pain in his stomach similar to that of the author as
he wrote this equation. Patience, dear reader: the fun has just begun! Equating
coefficients of $x^{m-k}$ to zero, we obtain this recursive system of equations for the
coefficients:
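$$m(m - 1) \cdots (m - k + 1) c_m + \sum_{i=0}^{k-1} \sum_{\alpha=0}^{m-k} a_\alpha^i \,(m + i - (k + \alpha)) \cdots (m - (k + \alpha) + 1)\, c_{m+i-(k+\alpha)} = 0$$
or
$$c_m = \frac{-1}{m \cdots (m - k + 1)} \sum_{i=0}^{k-1} \sum_{\alpha=0}^{m-k} (m + i - (k + \alpha)) \cdots (m - (k + \alpha) + 1)\, a_\alpha^i \, c_{m+i-(k+\alpha)} \qquad (5.36)$$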
By the restraints on $i$ and $\alpha$ we have $m + i - (k + \alpha) \le m + k - 1 - k = m - 1$, so
the highest subscript of $c$ on the right is $m - 1$. Thus, given $c_0, \ldots, c_{k-1}$ we can
solve Equations (5.36) successively. We now try to find an estimate. For this
purpose assume $M > 1$, and let
$$C = \max\left(k, \; R(R - 1)^{-1}(R^k - 1), \; \frac{R}{M}|c_j|^{1/j}, \; 0 < j \le k - 1\right)$$
The last condition on $C$ is written so as to assure that
$$|c_j| \le \left(\frac{CM}{R}\right)^j \quad \text{for } j = 0, \ldots, k - 1 \qquad (5.37)$$
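We now prove this inequality for all $m$, by induction. Thus we use (5.36), assuming
(5.37) for all $n < m$:
$$|c_m| \le \frac{1}{m} \sum_{i=0}^{k-1} \sum_{\alpha=0}^{m-k} |a_\alpha^i| \, |c_{m+i-(k+\alpha)}| \le \frac{1}{m} \sum_{i=0}^{k-1} \sum_{\alpha=0}^{m-k} \frac{M}{R^\alpha} \left(\frac{CM}{R}\right)^{m+i-(k+\alpha)} \le \frac{1}{m} \frac{M^m C^{m-1}}{R^{m-k}} \sum_{i=0}^{k-1} \frac{(m - k + 1)}{R^i}$$
Now $\sum_{i=0}^{k-1} R^{-i} = (R^{-k} - 1)(R^{-1} - 1)^{-1} = R(R - 1)^{-1}(R^k - 1)R^{-k}$. Thus
$$|c_m| \le \frac{m - k + 1}{m} \frac{M^m}{R^m} \frac{R(R^k - 1)}{R - 1} C^{m-1} \le \left(\frac{CM}{R}\right)^m$$
by definition of $C$. By this estimate we see that $f(x) = \sum_{n=0}^{\infty} c_n x^n$ converges in the
interval $\{x: |x| < R(CM)^{-1}\}$, and in that interval is the solution to our problem.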
We might perhaps have made a better estimate by more clever substitutions, but
our above estimates were sufficient for the results desired. In any particular case
we could usually be clever and obtain even better estimates.
Examples
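20. $y'' + e^x y' + xy = 0$, $y(0) = 0$, $y'(0) = 1$.
We will find a polynomial which approximates the solution to within
$10^{-5}$ on the interval $[-0.01, 0.01]$. Let $\sum c_n x^n$ be the supposed
solution. By substituting the series we obtain
$$\sum_{n=2}^{\infty} n(n - 1) c_n x^{n-2} + \left(\sum_{n=0}^{\infty} \frac{x^n}{n!}\right)\left(\sum_{n=1}^{\infty} n c_n x^{n-1}\right) + \sum_{n=0}^{\infty} c_n x^{n+1} = 0 \qquad (5.38)$$
The initial conditions give $c_0 = 0$, $c_1 = 1$. Equation (5.38) becomes
$$c_m = \frac{-1}{m(m - 1)} \left(\sum_{i=0}^{m-2} \frac{1}{i!}(m - 1 - i) c_{m-1-i} + c_{m-3}\right)$$
(with $c_{-1} = 0$). Thus
$$c_2 = -\tfrac{1}{2} c_1 = -\tfrac{1}{2}$$
$$c_3 = -\tfrac{1}{6}(2c_2 + c_1 + c_0) = 0$$
$$c_4 = -\tfrac{1}{12}(3c_3 + 2c_2 + \tfrac{1}{2} c_1 + c_1) = -\tfrac{1}{24}$$
$$c_5 = -\tfrac{1}{20}(4c_4 + 3c_3 + c_2 + \tfrac{1}{6} c_1 + c_2) = \tfrac{1}{20}$$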
and so forth. The question is not really what the coefficients are
(that is to be left to a machine), but how many coefficients need to
be computed. The coefficients appear to be bounded (we could in
fact show that they must be, cf. Problem 29). Let's try to prove that
$|c_n| \le K$ for all $n$ by induction. We have
$$|c_m| \le \frac{1}{m} \left(\sum_{i=0}^{m-2} \frac{1}{i!} + 1\right) K \le K$$
so long as $m \ge 4$. Thus we may take for $K$ a bound for the first
four terms, that is, $K = 1/2$. Then the difference between the solution
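and the $k$th partial sum of its Taylor expansion is dominated by
$$\sum_{n \ge k} |c_n| |x|^n \le \frac{1}{2} \sum_{n \ge k} |x|^n = \frac{1}{2} \frac{|x|^k}{1 - |x|}$$
for $|x| < 1$. The interval we are concerned with is $|x| \le 10^{-1}$, so our
bound on the error is
$$\frac{1}{2} \cdot \frac{1}{10^k} \cdot \frac{10}{9} = \frac{10}{18 \cdot 10^k}$$
This is less than $10^{-5}$ if $k = 6$, thus (computing also $c_6$) the solution
differs from
$$x - \frac{1}{2}x^2 - \frac{1}{24}x^4 + \frac{1}{20}x^5 + \frac{1}{720}x^6$$
by at most $10^{-5}$ for all values of $x$ in $[-0.01, 0.01]$.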
21. Suppose we needed that good an estimate in the interval
$[-1, 1]$. It is easy to see that just knowing that the coefficients are
bounded is not good enough. We have to know that $|c_n| \le Kr^n$ for
some $r < 1$ and some $K$. Let's try $r = \frac{1}{2}$. That is, we attempt to
verify by induction that $|c_n| \le K 2^{-n}$ for all $n$. Now, using the equation
defining $\{c_n\}$,
$$|c_m| \le \frac{1}{m(m - 1)} \left(\sum_{i=0}^{m-2} \frac{m - 1 - i}{i!} \frac{K}{2^{m-1-i}} + \frac{K}{2^{m-3}}\right) \le \frac{K}{m\,2^{m-1}} \left(\sum_{i=0}^{m-2} \frac{2^i}{i!} + 4\right) \le \frac{e^2 + 4}{m} \frac{K}{2^{m-1}}$$
This is less than $2^{-m}K$ as soon as $m > 2(e^2 + 4)$, or $m \ge 26$. Thus
the induction step proceeds as soon as $m \ge 26$; we need only choose
$K$ so that the inequality holds for all $m < 26$ ($K = 2$ will do). Thus
$|c_n| \le 1/2^{n-1}$ for all $n$. The desired solution differs in the interval
$[-1, 1]$ from its $k$th-order Taylor polynomials by
$$\left|f(x) - \sum_{n=0}^{k-1} c_n x^n\right| \le \sum_{n \ge k} |c_n| \le \sum_{n \ge k} \frac{1}{2^{n-1}} = \frac{1}{2^{k-2}}$$
This is less than $10^{-5}$ if $k$ is 19. Thus we need to compute 19 terms
of the Taylor expansion to find the desired approximation.
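The count of terms can be checked mechanically. The sketch below assumes the tail bound has the form $1/2^{k-2}$ (the geometric tail of $|c_n| \le 1/2^{n-1}$ starting at $n = k$); the function name `terms_needed` is illustrative.

```python
def terms_needed(tolerance):
    """Smallest k with 1 / 2**(k - 2) < tolerance (assumed form of the tail bound)."""
    k = 2
    while 1 / 2 ** (k - 2) >= tolerance:
        k += 1
    return k

print(terms_needed(1e-5))
```

For a tolerance of $10^{-5}$ this returns 19, in agreement with the count in the example.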
22. Compute the solution of
$$y'' + \frac{1}{1 - x} y' + e^x y = 0 \qquad y(0) = 1 \quad y'(0) = 0$$
to an accuracy of $10^{-3}$ in the interval $[-\frac{1}{2}, \frac{1}{2}]$.
Let $f(x) = \sum_{n=0}^{\infty} c_n x^n$ be the solution. We have these equations
for the solution:
$$c_0 = 1, \quad c_1 = 0$$
$$c_n = \frac{-1}{n(n - 1)} \left(\sum_{i=0}^{n-2} (n - 1 - i) c_{n-1-i} + \sum_{i=0}^{n-2} \frac{1}{i!} c_{n-2-i}\right) \qquad (5.39)$$
We will show by induction that the $\{c_n\}$ are bounded. Suppose that
$|c_n| \le K$ for all $n < m$. Then
$$|c_m| \le \frac{1}{m(m - 1)} \left(\sum_{i=0}^{m-2} (m - 1 - i)K + \sum_{i=0}^{m-2} \frac{1}{i!} K\right) \le \frac{K}{m(m - 1)} \left(\sum_{j=1}^{m-1} j + e\right) = \frac{K}{m(m - 1)} \left(\frac{m(m - 1)}{2} + e\right) = K\left(\frac{1}{2} + \frac{e}{m(m - 1)}\right) < K$$
as soon as $m \ge 3$. Thus we can take $K$ as a bound for the first four
terms. We have from (5.39) $c_2 = -\frac{1}{2}$, $c_3 = 0$, so we may take $K = 1$.
Then the estimate of the remainder after $k$ terms in the interval is
$$\sum_{n > k} K|x|^n \le \sum_{n > k} \frac{1}{2^n} = \frac{1}{2^k}$$
This is less than $10^{-3}$ when $k = 10$, so we need 11 coefficients. We
compute
$$c_4 = \tfrac{1}{12}, \quad c_5 = \tfrac{1}{20}, \quad c_6 = \tfrac{13}{720},$$
etc.
Up to six terms (giving at most an error of 1/64), our solution is
$$1 - \frac{x^2}{2} + \frac{x^4}{12} + \frac{x^5}{20} + \frac{13x^6}{720} + \cdots$$
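The remark that the coefficients are best left to a machine can be made concrete. The following sketch implements recursion (5.39) for Example 22 in exact rational arithmetic; it assumes the equation is $y'' + \frac{1}{1-x}y' + e^x y = 0$ with $y(0) = 1$, $y'(0) = 0$, and the function name `series_coefficients` is an illustrative choice.

```python
from fractions import Fraction
from math import factorial

def series_coefficients(n_terms):
    """Return [c_0, ..., c_{n_terms-1}] from recursion (5.39), with c_0 = 1, c_1 = 0."""
    c = [Fraction(1), Fraction(0)]
    for n in range(2, n_terms):
        # Contribution of y'/(1 - x): sum over i = 0..n-2 of (n-1-i) * c_{n-1-i}
        first = sum((n - 1 - i) * c[n - 1 - i] for i in range(n - 1))
        # Contribution of e^x * y: sum over i = 0..n-2 of c_{n-2-i} / i!
        second = sum(c[n - 2 - i] / factorial(i) for i in range(n - 1))
        c.append(-(first + second) / (n * (n - 1)))
    return c

for n, coefficient in enumerate(series_coefficients(11)):
    print(n, coefficient)
```

The first seven values are 1, 0, -1/2, 0, 1/12, 1/20, 13/720, matching the partial sum displayed above; asking for 11 coefficients gives the Taylor polynomial called for by the error estimate.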
• EXERCISES
13. Find a power series expansion for the general solution in a
neighborhood of 0 for this equation,
$$(1 - x^2)y'' - 2xy' + k(k + 1)y = 0$$
14. Find a power series expansion for the general solution of
$$y'' - 2xy' + 2ky = 0$$
15. Find the power series for the function y=f(x) such that
(a) y'- + e'y = x\ y(Q) = l, /(0) = 0
(b) (У)2=л Х0) = 0
(c) {у'У=УУ"> X0)=0, /(0)=1
(d) / + 2xy2 = 0
16. How many terms of the power series for the solution у do we need:
(a) for an accuracy of 10"3 in the interval (— i, i) in Exercise 3(a)?
(b) for an accuracy of 10"5 in the interval (—10, 10) in Exercise 3(a)?
(c) for an accuracy of 10"5 in the interval (—0.1, 0.1) in Exercise
3(b)?
17. For what $k$ are the solutions of Equations (5.1), (5.2) polynomials?
• PROBLEMS
28. Generalizing the argument in the text, prove this theorem:
Theorem. If $h, g_0, \ldots, g_{k-1}$ are analytic in an interval $(-R, R)$ about the
origin, then any solution of the differential equation
$$y^{(k)} + \sum_{i=0}^{k-1} g_i y^{(i)} = h$$
can be expressed as the sum of a convergent power series in a neighborhood
of the origin.
29. Suppose that $g$, $h$ are convergent power series in some disk $\{|z| < R\}$
with $R > 1$. Show that the solution of the linear differential equation
$$y'' + gy' + hy = 0 \qquad (5.40)$$
is the sum of a convergent power series with bounded coefficients.
30. If the power series $g(x) = \sum a_n x^n$, $h(x) = \sum b_n x^n$ both have infinite
radius of convergence, then so does the series expansion of the solution
of (5.40).
5.8 Infinitely Flat Functions
Not all functions are susceptible to the kind of Taylor series analysis
which we have been doing. A first requirement is that the function have
derivatives of all order; even that however is insufficient. Another glance at
Theorem 5.6 will remind the reader that there is a behavior requirement on
these successive derivatives in order that the given function be the sum of its
Taylor expansion. We shall show by example that there are infinitely
differentiable functions which are not sums of power series. First, we shall
make the notion of analyticity precise.
Definition 4. Let $f$ be a complex-valued function defined in an open set $U$.
Let $a \in U$. $f$ is analytic at $a$ if there is a ball $\{z: |z - a| < r\}$ centered at $a$
such that $f$ is the sum of a convergent power series in this ball. $f$ is analytic
in $U$ if $f$ is analytic at every point of $U$.
We have deliberately stated this definition without reference to the domain
of definition of the function; it applies equally well to functions of a real or
complex variable. The only functions which we know to be analytic are
the polynomials and $e^z$. For example, if $f$ is the sum of a convergent power
series at the origin, we do not yet know that we can expand $f$ in a series of
powers of $(z - a)$ with $a$ any other point in the disk of convergence. We
shall see in the next chapter that this is the case. We have already seen that
an analytic function has derivatives of all orders (is $C^\infty$) and now we will
produce a $C^\infty$ function which is not analytic. The clue to this function is
given by the following fact, which follows from l'Hospital's rule.
Proposition 6. $\lim_{t \to \infty} P(t)e^{-t} = 0$, for any polynomial $P$.
Proof. Problem 31.
The function we have in mind (Figure 5.1) is defined by
$$\sigma(x) = \begin{cases} \exp\left(-\dfrac{1}{x}\right) & x > 0 \\ 0 & x \le 0 \end{cases} \qquad (5.41)$$
Figure 5.1
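$\sigma$ is certainly infinitely differentiable at any point $x_0 \ne 0$, so we need only
consider its behavior at 0. Now,
$$\sigma^{(n)}(x) = 0 \qquad x < 0, \text{ all } n$$
Thus all derivatives of $\sigma$ from the left exist at 0. We have to show that all
derivatives from the right exist and are zero. More precisely we must prove
that for all $n$,
$$\lim_{x \to 0,\; x > 0} \frac{\sigma^{(n)}(x) - \sigma^{(n)}(0)}{x} = 0 \qquad (5.42)$$
We do this by induction. The case $n = 0$ is easy:
$$\lim_{x \to 0,\; x > 0} \frac{\sigma(x)}{x} = \lim_{x \to 0,\; x > 0} \frac{1}{x} \exp\left(-\frac{1}{x}\right) = \lim_{t \to \infty} t e^{-t} = 0$$
To do the general case we must have some idea what $\sigma^{(n)}(x)$ looks like for
$x > 0$. Now
$$\sigma'(x) = \frac{1}{x^2} \exp\left(-\frac{1}{x}\right) \qquad \sigma''(x) = \left(-\frac{2}{x^3} + \frac{1}{x^4}\right) \exp\left(-\frac{1}{x}\right)$$
$$\sigma^{(3)}(x) = \left(\frac{6}{x^4} - \frac{6}{x^5} + \frac{1}{x^6}\right) \exp\left(-\frac{1}{x}\right)$$
A pattern seems to be developing.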
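For each $n$ there is a polynomial $P_n$ such that
$$\sigma^{(n)}(x) = P_n\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x}\right) \quad \text{for } x > 0 \qquad (5.43)$$
This can be verified by induction. Assuming (5.43), we compute
$$\sigma^{(n+1)}(x) = \frac{1}{x^2}\left(P_n\left(\frac{1}{x}\right) - P_n'\left(\frac{1}{x}\right)\right) \exp\left(-\frac{1}{x}\right) = P_{n+1}\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x}\right)$$
where $P_{n+1}(X) = X^2(P_n(X) - P_n'(X))$. Now that we have this, (5.42)
follows immediately from Proposition 6:
$$\lim_{x \to 0,\; x > 0} \frac{\sigma^{(n)}(x) - \sigma^{(n)}(0)}{x} = \lim_{x \to 0,\; x > 0} \frac{1}{x} P_n\left(\frac{1}{x}\right) \exp\left(-\frac{1}{x}\right) = \lim_{t \to \infty} t P_n(t) e^{-t} = 0$$
Thus $\sigma$ is also infinitely differentiable at 0. But it is certainly not analytic.
Its Taylor expansion is $\sum_{n=0}^{\infty} 0 \cdot x^n$, which converges to $\sigma(x)$ only for $x \le 0$
and provides a poor means for approximating the value of $\sigma(x)$ for $x > 0$.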
However, the fact that infinitely differentiable functions exist with this
property has its bright side. The following construction will prove to be
useful.
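The polynomials $P_n$ can be generated mechanically. The sketch below uses the recursion $P_{n+1}(X) = X^2(P_n(X) - P_n'(X))$, obtained by differentiating $P_n(1/x)\exp(-1/x)$ with the chain rule, starting from $P_0 = 1$. Polynomials are stored as coefficient lists (index = power of $X$); all function names are illustrative assumptions.

```python
import math

def next_poly(p):
    """Given coefficients of P_n, return those of X^2 * (P_n - P_n')."""
    derivative = [k * p[k] for k in range(1, len(p))] + [0]  # padded to len(p)
    difference = [a - b for a, b in zip(p, derivative)]
    return [0, 0] + difference  # multiplying by X^2 shifts powers by two

def flat_polys(n_max):
    """Return [P_0, ..., P_{n_max}] as coefficient lists."""
    polys = [[1]]
    for _ in range(n_max):
        polys.append(next_poly(polys[-1]))
    return polys

def sigma(x):
    return math.exp(-1.0 / x) if x > 0 else 0.0

def sigma_derivative(n, x):
    """n-th derivative of sigma at x, using P_n(1/x) * exp(-1/x) for x > 0."""
    if x <= 0:
        return 0.0
    t = 1.0 / x
    return sum(a * t ** k for k, a in enumerate(flat_polys(n)[n])) * math.exp(-t)

print(flat_polys(3))
print([sigma_derivative(n, 0.01) for n in range(6)])
```

Even at $x = 0.01$ every derivative computed this way is far below machine precision, which illustrates why the zero Taylor series at the origin carries no information about $\sigma$ for $x > 0$.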
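Lemma. Given $a < b$, there is a $C^\infty$ function $\tau_{ab}$ such that
(i) $0 \le \tau_{ab}(x) \le 1$ for all $x$,
(ii) $\tau_{ab}(x) > 0$ if $a < x < b$,
(iii) $\tau_{ab}(x) = 0$ if $x \ge b$ or $x \le a$.
Proof. (See Figure 5.2.) $\sigma(x(1 - x))$ has the required properties of $\tau_{01}$. We
then define
$$\tau_{ab}(x) = \tau_{01}\left(\frac{x - a}{b - a}\right) = \sigma\left(\left(\frac{x - a}{b - a}\right)\left(1 - \frac{x - a}{b - a}\right)\right)$$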
Figure 5.2. The graph of $y = \tau_{ab}(x)$.
Theorem 5.9. Let $[a, b]$ be a given interval and $U_1, \ldots, U_n$ a finite
collection of open intervals covering $[a, b]$. There exist $C^\infty$ functions $\rho_i$ such that
(i) $0 \le \rho_i(x) \le 1$ for all $x \in R$, all $i$,
(ii) $\rho_i(x) = 0$ if $x \notin U_i$,
(iii) $\sum \rho_i(x) = 1$, for all $x \in [a, b]$.
Proof. Let $U_i = (a_i, b_i)$, and take $\tau_i = \tau_{a_i b_i}$. Then $\tau(x) = \sum \tau_i(x) > 0$ if
$x \in [a, b]$. Let
$$\rho_i(x) = \begin{cases} \dfrac{\tau_i(x)}{\tau(x)} & x \in (a_i, b_i) \\ 0 & x \notin (a_i, b_i) \end{cases}$$
The $\rho_i$ then have the desired properties.
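The construction in this proof translates directly into code. The following is a minimal sketch, assuming the flat function is $\exp(-1/x)$ for $x > 0$ and using an illustrative two-interval cover of $[0, 1]$; the function names are hypothetical.

```python
import math

def sigma(x):
    """The flat function: exp(-1/x) for x > 0, and 0 otherwise."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def tau(a, b, x):
    """Bump function that is positive exactly on (a, b)."""
    t = (x - a) / (b - a)
    return sigma(t * (1 - t))

def rho(intervals, i, x):
    """i-th function of the partition of unity subordinate to the given intervals."""
    total = sum(tau(a, b, x) for a, b in intervals)
    if total == 0.0:
        return 0.0  # x lies outside every interval of the cover
    a, b = intervals[i]
    return tau(a, b, x) / total

cover = [(-0.5, 0.6), (0.4, 1.5)]  # open intervals covering [0, 1]
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    values = [rho(cover, i, x) for i in range(len(cover))]
    print(x, values, sum(values))
```

At every sample point of $[0, 1]$ the values are nonnegative and sum to 1, and each function vanishes outside its own interval.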
PROBLEMS
31. Prove that for any polynomial $P$, $\lim_{t \to \infty} P(t)e^{-t} = 0$.
32. Let $\sigma$ be defined by (5.41). Define
$$\omega(x) = \frac{\int_0^x \sigma(t(1 - t))\,dt}{\int_0^1 \sigma(t(1 - t))\,dt}$$
Show that (a) $\omega$ is $C^\infty$, (b) $0 \le \omega(x) \le 1$ for all $x$, (c) $\omega(x) = 0$ if $x \le 0$,
(d) $\omega(x) = 1$ if $x \ge 1$.
33. Using Theorem 5.9 it can be shown that any continuous function is
the limit of $C^\infty$ functions. Let $f \in C([0, 1])$ and $\varepsilon > 0$. Find a $C^\infty$ function
$g$ such that $\|f - g\| < \varepsilon$.
Here's how to do it. First pick an integer $N > 0$ such that $|f(x) - f(y)| < \varepsilon/2$
if $|x - y| < 1/N$. Now cover the interval $[0, 1]$ by the intervals
С/о Un, where
and let pu ..., pN be the corresponding functions of Theorem 5.9. Let
For any χ, χ is in only two of the intervals {[/,}, say Ur, Ur+i. Then
**>-!/(*)»<*>
I/(*)-*(*) I =M*)
fix)-f{$\+»+]fix)-f(cir)
ε ε
<2 + 2
5.9 Summary
Let $\{f_k\}$ be a sequence of continuous functions. The series formed from
the $f_k$ is the sequence of sums $\{\sum_{k=1}^{n} f_k\}$. If this sequence converges, we
say that the series converges and denote the limit by $\sum_{k=1}^{\infty} f_k$. The series
converges absolutely if $\sum_{k=1}^{\infty} \|f_k\| < \infty$. The Cauchy criterion for series
asserts that the series converges if and only if the sums $\|\sum_{k=n+1}^{m} f_k\|$ can be
made arbitrarily small by choosing $m$, $n$ sufficiently large.
Comparison test. If there is a sequence $\{p_k\}$ of positive numbers and an
integer $N > 0$ such that
(i) $\|f_k\| \le p_k$ for $k \ge N$
(ii) $\sum p_k < \infty$
then $\sum f_k$ converges absolutely.
Integration. If $\sum f_n$ converges, so does $\sum \int_a^x f_n$ and
$$\int_a^x \left(\sum f_n\right) = \sum \int_a^x f_n$$
Fundamental theorem of algebra. Let $P$ be a polynomial:
$$P(z) = a_n z^n + \cdots + a_1 z + a_0 \qquad (5.44)$$
with $a_n \ne 0$. Then $P(z)$ has a complex root. If $r_1, \ldots, r_k$ are all the roots
of $P$, then corresponding to each root $r_i$ there is a positive integer $m_i$ (called
the multiplicity) such that
(i) $m_1 + \cdots + m_k = n$
(ii) $P(z) = a_n (z - r_1)^{m_1} \cdots (z - r_k)^{m_k}$ (5.45)
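To the polynomial $P$ given by (5.44) we associate the constant coefficient
differential operator $L_P$:
$$L_P(f) = a_n f^{(n)} + \cdots + a_1 f' + a_0 f$$
$P$ is called the characteristic polynomial of $L_P$. These formulas are valid:
$$L_{P+Q} = L_P + L_Q \qquad L_{PQ} = L_P L_Q$$
If (5.45) is the factorization of $P$, then the kernel of $L_P$, the collection of
solutions of $L_P(f) = 0$, is spanned by the functions
$$e^{r_1 x}, \; x e^{r_1 x}, \; \ldots, \; x^{m_1 - 1} e^{r_1 x}$$
$$e^{r_2 x}, \; x e^{r_2 x}, \; \ldots, \; x^{m_2 - 1} e^{r_2 x}$$
$$\cdots$$
$$e^{r_k x}, \; x e^{r_k x}, \; \ldots, \; x^{m_k - 1} e^{r_k x}$$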
Let $\{c_n\}$ be a sequence of complex numbers. There is a nonnegative
number $R$ (called the radius of convergence of the power series $\sum c_n z^n$) with
these properties:
(a) $\sum c_n z^n$ diverges for $|z| > R$.
(b) $\sum c_n z^n$ converges absolutely in $\{|z| \le r\}$ for $r < R$.
(c) $R = [\limsup |c_n|^{1/n}]^{-1}$.
If $f(z) = \sum a_n z^n$, $g(z) = \sum b_n z^n$ in the disk $\{|z| < r\}$, then
$$f(z) + g(z) = \sum_{n=0}^{\infty} (a_n + b_n) z^n$$
$$f(z)g(z) = \sum_{k=0}^{\infty} \left(\sum_{n+m=k} a_n b_m\right) z^k$$
in that same disk.
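The product formula is a finite computation on coefficients and is easy to try out. In this sketch (the name `cauchy_product` is illustrative) the exponential series is multiplied by itself in exact arithmetic; since $e^z e^z = e^{2z}$, the resulting coefficients should be $2^k/k!$.

```python
from fractions import Fraction
from math import factorial

def cauchy_product(a, b):
    """Coefficients c_k = sum over n + m = k of a_n * b_m, for the common range of k."""
    length = min(len(a), len(b))
    return [sum(a[n] * b[k - n] for n in range(k + 1)) for k in range(length)]

exp_coeffs = [Fraction(1, factorial(n)) for n in range(8)]
product = cauchy_product(exp_coeffs, exp_coeffs)
expected = [Fraction(2 ** k, factorial(k)) for k in range(8)]
print(product == expected)
```

Only the first `length` coefficients are returned, because higher coefficients of the product would need terms of the factors that were truncated away.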
If $f$ is a complex-valued function defined near $z_0$ in $C$, we say that $f$ is
complex differentiable at $z_0$ if
$$\lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} = f'(z_0)$$
exists. The sum of a convergent power series is complex differentiable at
every $z_0$ in its disk of convergence. Furthermore, the derivative is the sum
of the derived series:
$$f(z) = \sum_{n=0}^{\infty} a_n z^n \qquad f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1}$$
Thus the sum of a convergent power series is infinitely differentiable. A
differentiable complex-valued function of two real variables is complex
differentiable if and only if it satisfies the Cauchy-Riemann equations:
$$\frac{\partial f}{\partial y} = i\left(\frac{\partial f}{\partial x}\right)$$
If $h, g_0, \ldots, g_{k-1}$ can be represented as sums of convergent power series
in a disk centered at zero, then the same is true for all solutions of the
differential equation
$$y^{(k)} + \sum_{i=0}^{k-1} g_i y^{(i)} + h = 0$$
Furthermore, once given the initial conditions
$$y(0) = a_0, \ldots, y^{(k-1)}(0) = a_{k-1}$$
the coefficients of the power series can be recursively calculated using the
differential equation.
Given any finite covering of the interval $[a, b]$ by open intervals $U_1, \ldots,
U_n$ we can find $C^\infty$ functions $\rho_1, \ldots, \rho_n$ such that
(a) $0 \le \rho_i \le 1$
(b) $\rho_i = 0$ outside $U_i$
(c) $\sum \rho_i(x) = 1$ for all $x \in [a, b]$
These functions are called a partition of unity on $[a, b]$ subordinate to the
cover $U_1, \ldots, U_n$.
• FURTHER READING
I. I. Hirschman, Infinite Series, Holt, Rinehart and Winston, New York,
1962.
H. Cartan, Elementary Theory of Analytic Functions of One or Several
Complex Variables, Addison-Wesley, Reading, Mass., 1963.
This text develops the subject of complex analysis from the point of view
of power series. It also contains a complete discussion of the theorem on
existence of solutions of analytic differential equations.
Further material can be found in
T. A. Bak and J. Lichtenberg, Mathematics for Scientists, W. A. Benjamin,
Inc., New York, 1966.
Kreider, Kuller, Ostberg, Perkins, An Introduction to Linear Analysis,
Addison-Wesley, Reading, Mass., 1966.
W. Fulks, Advanced Calculus, John Wiley and Sons, New York, 1961.
M. R. Spiegel, Applied Differential Equations, Prentice-Hall, Englewood
Cliffs, N. J., 1958.
• MISCELLANEOUS PROBLEMS
34. Find a sequence $\{f_n\}$ of continuous nonnegative real-valued functions
defined on the interval $[0, 1]$ such that $f(x) = \sum_{n=1}^{\infty} f_n(x)$ exists for all
$x \in [0, 1]$, but $f$ is not continuous.
35. For \z\ < 1, define
lnz=li
Show that for all such $z$, $z = 1 - \exp(\ln(1 - z))$.
36. Show that the series
$$\sum_{n=1}^{\infty} \frac{1}{(z - n)^2}$$
converges to a complex differentiable function in the domain
$C - \{1, 2, \ldots, n, \ldots\}$.
37. Show that $\exp(z) = \lim_{m \to \infty} (1 + z/m)^m$. (Hint: Compute the power
series expansion of $(1 + z/m)^m$.)
38. Show that a real polynomial of odd degree always has a real root.
39. Let $\omega = \exp(2\pi i/n)$. Show that
$$1 - z^n = (1 - \omega z)(1 - \omega^2 z) \cdots (1 - \omega^n z)$$
40. Let $P$ be a polynomial. Show that $P$ is the square of another
polynomial if and only if every root of $P$ occurs with even multiplicity.
41. If $P$, $Q$ are two polynomials, $P$ divides $Q$ if and only if every root of
$P$ is a root of $Q$ with no larger multiplicity.
42. Suppose $P$ is a polynomial of degree at least two. Show that there
is a $c$ such that $P(z) - c = 0$ has at least one multiple root. (Hint:
Consider $P'$ as defined in Problem 10. If $P'(a) = 0$, take $c = P(a)$.)
43. Let $f_1, \ldots, f_n$ be functions in $C(X)$. Show that these functions are
independent if and only if there are points $x_1, \ldots, x_n$ such that the matrix
$(f_i(x_j))$ is nonsingular.
44. Show that the functions $e^{rx}, xe^{rx}, \ldots, x^n e^{rx}$ are independent.
45. Show that if $P$, $Q$ are polynomials, $S(L_P) \subset S(L_Q)$ if and only if $P$
divides $Q$.
46. If $P$ is a polynomial of degree at least two, there is a $c \in C$ such that
the equation $L_P f = cf$ has a solution of the form $xe^{rx}$.
47. If a linear differential equation has polynomial coefficients, it has
global solutions on all of R.
48. Let $\{c_n\}$ be a sequence of complex numbers such that $\sum |c_n| < \infty$.
Let $f(z) = \sum_{n=0}^{\infty} c_n z^n$. Prove that $|c_0|$ is not a relative maximum of $|f|$
unless all other coefficients vanish.
49. Suppose the function
$$f(x, y) = x^2 - y^2 + iv(x, y)$$
is complex differentiable. Find $v$.
50. If $f$ is a polynomial in $x$, $y$ which is complex differentiable, then
$f(x, y)$ has the form $Q(x + iy)$, where $Q$ is a polynomial. (Hint:
Substitute $x = (z + \bar{z})/2$, $y = (z - \bar{z})/2i$, and use the Cauchy-Riemann equation.)
51. Suppose $f$ is a $C^2$ complex-valued function defined on a domain $D$
in $C$. Show that if $f$ and $zf$ are both harmonic, then $f$ is complex
differentiable.
52. If $f$, $g$ are complex differentiable and $|f|^2 + |g|^2$ is constant, then both
$f$, $g$ are constant.
53. Suppose that $f$ is a one-to-one mapping of a domain $D \subset C$ onto
$\Delta \subset C$. Let $g: \Delta \to D$ be the inverse of $f$. Show that if $f$ is complex
differentiable, so is $g$.
54. We may consider the function $e^z$ as a mapping from the plane to
the plane. Let $z = x + iy$, $u = \mathrm{Re}\, e^z$, $v = \mathrm{Im}\, e^z$; that is
$$u = e^x \cos y \qquad v = e^x \sin y$$
(a) Show that this mapping maps the lines $x$ = const. onto the circles
centered at the origin, and the lines $y$ = const. onto the rays through the
origin.
(b) Show that in any strip $\{a \le y < a + 2\pi\}$ this mapping takes
every nonzero value precisely once.
(c) In particular, $e^z$ maps the horizontal strip $\{-\pi < y < \pi\}$ one-to-one
onto the entire plane except for the negative real axis. Call this
domain $D$. Define the complex logarithm $\log z: D \to \{-\pi < y < \pi\}$ as
the inverse of this mapping. Show that $\log z$ is complex differentiable
and
$$(\log)'(z) = \frac{1}{z}$$
Show also that $\log z$ can be represented by a power series centered at 1
in the disk $\{|z - 1| < 1\}$. (Recall Miscellaneous Problem 35.) Notice
that this provides a way for extending real functions to the complex
domain besides that of power series. For example, the power series
expansion about 1 of $\log x$ extends it only to the unit disk centered at 1.
The above extension of log is defined in the entire plane except the
negative real axis. This process is called analytic continuation.
55. Consider $z^2$ as a mapping of the plane into the plane. Show that it
maps the open right half plane one-to-one onto the domain $D$ of Problem 54.
Let $\sqrt{z}$ be the inverse, and show that $\sqrt{z}$ is complex differentiable. Provide
a similar discussion for the mapping $z^n$.
56. Discuss the mapping properties of $\cos z$, $\sin z$.
57. Show that the power series expansion of the solution of Exercise 8(d)
with initial values $y(0) = 1$, $y'(0) = 0$ does not converge outside the unit disk.
58. Suppose that/, g are complex-valued functions defined on the interval
/. Show that
' m dt
hz-g(t)
is complex differentiable and can be represented by a power series at any
point of the image of g.
59 If/is C'mCxX and for each fixed x, /(z, x) is differentiable in z,
then
F(t) = jf(z,x)dx
is also complex differentiable.
60. The equation of Exercise 6 is called Legendre's equation and the
solutions {/,} for integral к are called the Legendre polynomials. They have
this interesting property:
fm(x)fn(x) dx = 0 ύτηφη
To prove this we must observe that Legendre's equation may be written as
Thus
by integration by parts. Now do the same, interchanging m and n.
61. Let $P$ be a polynomial of degree $d$. Show that $f(z) = e^{P(z)}$ is complex
differentiable. Show that $f^{(n)}(z)e^{-P(z)}$ is a polynomial of degree $n(d - 1)$.
62. Show that the polynomial
$$\exp(x^2) \frac{d^n}{dx^n} \left(\exp(-x^2)\right)$$
solves the differential equation $y'' - 2xy' + 2ny = 0$.
63. (a) Find a $C^\infty$ real-valued function $f$ defined on $R^n$ with these
properties:
(i) $0 \le f(x) \le 1$
(ii) $f(x) > 0$ if $\|x - x_0\| < R$
(iii) $f(x) = 0$ if $\|x - x_0\| \ge R$
(By $C^\infty$ we mean all higher-order partial derivatives exist and are
continuous.)
(b) Let $X$ be a closed set in $R^n$, and suppose $B_1, \ldots, B_n$ are balls in
$R^n$ such that $X \subset B_1 \cup \cdots \cup B_n$. A partition of unity on $X$ subordinate to
$B_1, \ldots, B_n$ is a collection $\{f_1, \ldots, f_n\}$ of $C^\infty$ functions such that
(i) $0 \le f_i \le 1$
(ii) $f_i(x) = 0$ if $x \notin B_i$
(iii) $\sum_{i=1}^{n} f_i(x) = 1$ if $x \in X$
Find such a partition of unity.
Chapter 6
FUNCTIONS ON THE CIRCLE
(FOURIER ANALYSIS)
In this chapter we shall study periodic functions of a real variable. The
importance of such functions derives from the fact that many natural and
physical phenomena are oscillatory, or recurrent. In the early 19th century,
J. B. J. Fourier laid down the foundations of the study of periodic functions
in his treatise Analytic Theory of Heat. There remained a few gaps and
difficulties in Fourier's theory and much mathematical energy during the
19th century was expended in the study of these problems. The invention of
Lebesgue's theory of integration in the early 20th century finally laid the
foundations to this theory. Our exposition will not follow this
chronological pattern; but rather will try to develop the way of thinking about
Fourier series which emerged during the late 19th century.
A periodic function is one whose behavior is recurrent. That is, there
is a certain number L, called the period of the function, such that the function
repeats itself over every interval of length L,
$$f(x + L) = f(x) \quad \text{for all } x \in \mathbb{R}$$
From our point of view (which is very much a posteriori) the study of periodic
functions begins by discarding the notion of periodicity in favor of a change
in the geometry of the domain. That is, to study the collection of all periodic
functions with a fixed period, we make the underlying space periodic instead.
We shall think of the real line as wound around a circle, and our periodic
functions are just the functions on the circle.
To fix the ideas, we shall have a particular circle in mind: the set $\Gamma$ of
complex numbers of modulus one. We have already seen that there is a
mapping $\theta \to \cos\theta + i\sin\theta = e^{i\theta}$ of the real numbers onto $\Gamma$ which is
one-to-one on an interval of length $2\pi$, except that both end points go onto
the same point. This mapping does precisely what we want: it winds the
real line around $\Gamma$. A continuous function on $\Gamma$ is a function of $e^{i\theta}$ which
varies continuously with $\theta$. Thus the continuous functions on $\Gamma$ are
precisely the continuous functions on $\mathbb{R}$ which are periodic of period $2\pi$:
$$f(x + 2\pi) = f(x) \quad \text{for all } x \in \mathbb{R}$$
In the past few chapters we have been studying the behavior of functions
from the point of view of differentiation. We have studied the Taylor
expansion, an expansion into polynomials, and we have related the
coefficients to the successive derivatives of the function. Since the simplest
periodic functions are the trigonometric polynomials, we attempt to expand
a given periodic function in a series of trigonometric polynomials. This
is the so-called Fourier series of the function. The interesting fact here
is that the relevant coefficients are found by integration. In fact, as we shall
see, the Fourier series of a function is a sort of expansion in terms of an
orthonormal basis in the vector space of continuous functions on the circle
with the inner product
$$\langle f, g\rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)\,\overline{g(\theta)}\,d\theta$$
Finally, as the circle is the set of complex numbers of modulus one, it is the
boundary of the unit disk in $\mathbb{C}$ and we can study the relation between Taylor
expansions in the disk and the Fourier expansions on the circle for suitable
functions. It will turn out that for such functions the Taylor coefficients
can also be obtained by integration on the circle.
6.1 Approximation by Trigonometric Polynomials
We shall begin with the attitude that we are studying complex-valued
functions on the circle. According to this view, the function $e^{i\theta}$ is the
simplest and the most basic function. This attitude is really just a
convenience; the point of view of strictly real-valued functions would oblige us
to consider $\cos\theta$, $\sin\theta$ as the elementary building blocks of our theory.
But, since $e^{i\theta} = \cos\theta + i\sin\theta$, there is little difference, and we select the
more comfortable notation.
Our purpose is to describe a given function on the circle in terms of the
powers of $e^{i\theta}$, both positive and negative. More precisely, if the series
$$\sum_{n=-\infty}^{\infty} a_n e^{in\theta} \tag{6.1}$$
converges for all $\theta$, it defines a function on the circle. We ask the converse
question: Can we express any periodic function as such a series? If only
finitely many of the $\{a_n\}$ in (6.1) are nonzero, there is no problem of
convergence, and the sum defines a function, called a trigonometric polynomial.
This subject gets off the ground once we know how to compute the $\{a_n\}$ from
the given function, and that leads us to our first proposition.
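Proposition 1. Let $P(\theta) = \sum_{n=-N}^{N} a_n e^{in\theta}$ be a trigonometric polynomial.
Then
$$a_m = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(\theta)e^{-im\theta}\,d\theta$$
for all $m$.
Proof.
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P(\theta)e^{-im\theta}\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi}\Big(\sum_{n=-N}^{N} a_n e^{in\theta}\Big)e^{-im\theta}\,d\theta = \frac{1}{2\pi}\sum_{n=-N}^{N} a_n\int_{-\pi}^{\pi} e^{i(n-m)\theta}\,d\theta = \frac{1}{2\pi}\,a_m\cdot 2\pi + 0 = a_m$$
since $\int_{-\pi}^{\pi} e^{i(n-m)\theta}\,d\theta$ is $2\pi$ when $n = m$ and $0$ otherwise.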
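As a numerical illustration of Proposition 1 (a sketch only, not part of the original argument), the following Python fragment builds a trigonometric polynomial from arbitrarily chosen coefficients and recovers them by approximating the integral with an equally spaced Riemann sum. The names `trig_poly` and `coefficient`, and the sample coefficients, are our own assumptions.

```python
import cmath
import math

def trig_poly(coeffs, theta):
    """Evaluate P(theta) = sum of a_n * e^{i n theta}; coeffs maps n -> a_n."""
    return sum(a * cmath.exp(1j * n * theta) for n, a in coeffs.items())

def coefficient(func, m, samples=256):
    """Approximate (1/2pi) * integral over [-pi, pi] of func(t) e^{-i m t} dt
    by a Riemann sum on equally spaced points."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        t = -math.pi + k * h
        total += func(t) * cmath.exp(-1j * m * t)
    return total * h / (2 * math.pi)

# Hypothetical coefficients a_{-2}, a_0, a_1, a_3, chosen only for illustration.
coeffs = {-2: 1 + 2j, 0: 0.5, 1: -3j, 3: 2.0}
recovered = {m: coefficient(lambda t: trig_poly(coeffs, t), m)
             for m in range(-3, 4)}
```

For indices $m$ that do not occur in `coeffs` the recovered value should be zero up to rounding, in agreement with the proposition.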
Now, given a continuous function on the circle, if it has an expansion into
a series of trigonometric polynomials, we could expect that the coefficients
of this series will be related to the function in the same way. Thus we form
this definition.
Definition 1. Let $f$ be a continuous function on the circle. The $n$th
Fourier coefficient of $f$ is
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{-in\phi}\,d\phi \tag{6.2}$$
The Fourier series of $f$ is the series
$$\sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta} \tag{6.3}$$
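Examples
1. Let $f(\theta) = \sin\theta$. Since
$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$$
its Fourier series is
$$-\frac{1}{2i}e^{-i\theta} + \frac{1}{2i}e^{i\theta}$$
From (6.2) we can deduce (as is also easily computed):
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\sin\theta\,e^{i\theta}\,d\theta = -\frac{1}{2i} \qquad \frac{1}{2\pi}\int_{-\pi}^{\pi}\sin\theta\,e^{-i\theta}\,d\theta = \frac{1}{2i}$$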
2. Since $\cos m\theta = \tfrac12(e^{im\theta} + e^{-im\theta})$, the Fourier series of $\cos m\theta$ is
$$\tfrac12 e^{-im\theta} + \tfrac12 e^{im\theta}$$
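3. Let $f(\theta) = \cos^2\theta$. Then
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\cos^2\phi\,e^{-in\phi}\,d\phi = \frac{1}{4\pi}\int_{-\pi}^{\pi}(1 + \cos 2\phi)e^{-in\phi}\,d\phi = \begin{cases} \tfrac14 & n = -2,\ 2 \\ \tfrac12 & n = 0 \\ 0 & n \neq -2,\ 0,\ 2 \end{cases}$$
Thus the Fourier series of $\cos^2\theta$ is
$$\tfrac14 e^{-2i\theta} + \tfrac12 + \tfrac14 e^{2i\theta}$$
(Notice that $\cos^2\theta = \tfrac12(1 + \cos 2\theta) = \tfrac12\big(1 + \tfrac12(e^{2i\theta} + e^{-2i\theta})\big)$ is a
trigonometric polynomial.)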
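4. Let $f(\theta) = \pi^2 - \theta^2$.
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}(\pi^2 - \phi^2)e^{-in\phi}\,d\phi$$
$$\hat f(0) = \frac{\pi^2}{2\pi}\int_{-\pi}^{\pi}d\phi - \frac{1}{2\pi}\int_{-\pi}^{\pi}\phi^2\,d\phi = \frac{2\pi^2}{3} \qquad n = 0$$
$$\hat f(n) = -\frac{1}{2\pi}\int_{-\pi}^{\pi}\phi^2 e^{-in\phi}\,d\phi = (-1)^{n+1}\frac{2}{n^2} \qquad n \neq 0$$
by two integrations by parts. Thus the Fourier series of $\pi^2 - \theta^2$ is
$$\frac{2\pi^2}{3} + 2\sum_{n \neq 0}\frac{(-1)^{n+1}}{n^2}e^{in\theta} \tag{6.4}$$
Notice that by the comparison test, this series does converge to a
continuous function of $e^{i\theta}$:
$$\sum_{n \neq 0}\frac{1}{n^2} < \infty$$
In order to conclude that this is the given function $\pi^2 - \theta^2$, we shall
need more theoretical investigations.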
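The coefficients in Example 4 can be checked numerically. The sketch below approximates the integral (6.2) by the midpoint rule and compares the result with the closed form $\hat f(0) = 2\pi^2/3$, $\hat f(n) = (-1)^{n+1}\,2/n^2$; the function names and the number of sample points are our own choices, and the agreement is only up to the quadrature error.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=4000):
    """Midpoint-rule approximation of (1/2pi) * integral of f(phi) e^{-i n phi}."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * h
        total += f(phi) * cmath.exp(-1j * n * phi)
    return total * h / (2 * math.pi)

def f(theta):
    return math.pi ** 2 - theta ** 2

def predicted(n):
    """Closed form obtained by two integrations by parts."""
    return 2 * math.pi ** 2 / 3 if n == 0 else (-1) ** (n + 1) * 2 / n ** 2

# Triples (n, numerical value, closed form) for a few small n.
table = [(n, fourier_coefficient(f, n), predicted(n)) for n in range(-3, 4)]
```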
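5. It is not necessary for a function to be continuous to have a
Fourier expansion. It need only be integrable for the expressions
(6.2), (6.3) to be computable. Let us compute the Fourier series of
$$f(\theta) = \begin{cases} 1 & \theta > 0 \\ 0 & \theta < 0 \end{cases}$$
$$\hat f(0) = \frac{1}{2\pi}\int_0^{\pi}d\phi = \frac12$$
$$\hat f(n) = \frac{1}{2\pi}\int_0^{\pi}e^{-in\phi}\,d\phi = -\frac{1}{2\pi i n}e^{-in\phi}\Big|_0^{\pi} = -\frac{1}{2\pi i n}\big[e^{-in\pi} - 1\big] = \begin{cases} 0 & n \text{ even},\ n \neq 0 \\ \dfrac{1}{\pi i n} & n \text{ odd} \end{cases}$$
Thus the Fourier series of $f$ is
$$\frac12 + \frac{1}{\pi i}\sum_{n\ \mathrm{odd}}\frac{e^{in\theta}}{n} \tag{6.5}$$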
Recapturing the Function from Its Fourier Series
Notice that no claim of convergence is made in Definition 1. In particular,
the series (6.5) appears not to converge, for the comparison test does not
apply. However, we cannot conclude that convergence fails, only that the
question can be exceedingly difficult. We ask instead what appears to be a
simpler question: Does the Fourier series identify the given function, and if
so, in what way? We now try to investigate the recapture of a function by its
Fourier series, deliberately leaving aside all questions of convergence.
Let $f$ be a given integrable function on the circle and consider the
"function"
$$g(\theta) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$
By definition of $\hat f(n)$,
$$g(\theta) = \sum_{n=-\infty}^{\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{in(\theta-\phi)}\,d\phi$$
Now we interchange $\sum$ and $\int$, obtaining
$$g(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\sum_{n=-\infty}^{\infty}e^{in(\theta-\phi)}\,d\phi$$
Well, it is too bad it turned out this way because we are still up against a
convergence problem, like it or not. In fact, the situation is worse: it is
untenable because
$$\sum_{n=-\infty}^{\infty}e^{in(\theta-\phi)} \tag{6.6}$$
converges for no values of $\phi$. This seemingly insurmountable obstacle can
be overcome, so long as we are not solely interested in pointwise convergence,
by a subtle mathematical technique: that of inserting convergence factors.
If we replace the series (6.6) by the series
$$\sum_{n=-\infty}^{\infty}r^{|n|}e^{in(\theta-\phi)} \tag{6.7}$$
this series converges beautifully for $r < 1$ and the series (6.6) is in some ideal
sense the limit of (6.7) as $r$ tends to 1. Stepping backward two steps, this
causes us to now consider the series
$$g(r, \theta) = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta} \tag{6.8}$$
and the limit $\lim_{r \to 1} g(r, \theta)$ (hoping of course that it is $f(\theta)$). Notice that the
series (6.8) does converge since the Fourier coefficients $\{\hat f(n)\}$ are bounded
(Problem 1) and the comparison test applies. Now, proceeding as above
but this time with $g(r, \theta)$, we obtain
$$g(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\sum_{n=-\infty}^{\infty}r^{|n|}e^{in(\theta-\phi)}\,d\phi \qquad r < 1$$
and here we can interchange $\sum$ and $\int$ because the series in question converges
uniformly. The sum in the above integral can be put in a nicer form since
it is a sum of two geometric series.
$$P(r, t) = \sum_{n=-\infty}^{\infty}r^{|n|}e^{int} = \sum_{n=1}^{\infty}(re^{-it})^n + \sum_{n=0}^{\infty}(re^{it})^n = -1 + \frac{1}{1 - re^{-it}} + \frac{1}{1 - re^{it}} \tag{6.9}$$
$$= \frac{1 - r^2}{1 + r^2 - r(e^{it} + e^{-it})} = \frac{1 - r^2}{1 + r^2 - 2r\cos t}$$
The function $P(r, t)$ is called Poisson's kernel (named after its French
discoverer, not because its whole technique is fishy), and the association of $f$
to $g$ is called the Poisson transform. Thus, the Poisson transform
$$(Pf)(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\,\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,d\phi = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta} \tag{6.10}$$
takes continuous functions on the circle into continuous functions of $r$, $\theta$
for $r < 1$; that is, into continuous functions on the open unit disk. We shall
later see the importance of the Poisson transform from the point of view of
partial differential equations.
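The closed form of Poisson's kernel can be compared numerically with a partial sum of the defining series $\sum r^{|n|}e^{int} = 1 + 2\sum_{n \ge 1} r^n\cos nt$. The following is a minimal sketch under our own naming; the truncation length and the sample points are assumptions, and `mean_value` approximates the average of the kernel over the circle, which by (6.10) applied to the constant function 1 should equal 1.

```python
import math

def poisson_kernel(r, t):
    """Closed form (1 - r^2) / (1 + r^2 - 2 r cos t)."""
    return (1 - r * r) / (1 + r * r - 2 * r * math.cos(t))

def poisson_series(r, t, terms=400):
    """Partial sum of sum_n r^{|n|} e^{int} = 1 + 2 * sum_{n>=1} r^n cos(nt)."""
    return 1 + 2 * sum(r ** n * math.cos(n * t) for n in range(1, terms + 1))

def mean_value(r, samples=2000):
    """Riemann-sum approximation of (1/2pi) * integral of P(r, t) dt over [-pi, pi]."""
    h = 2 * math.pi / samples
    total = sum(poisson_kernel(r, -math.pi + k * h) for k in range(samples))
    return total * h / (2 * math.pi)
```

Evaluating `poisson_kernel(r, 0.0)` for $r$ close to 1 also shows the tall, narrow peak at $t = 0$.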
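Examples (Some Poisson Transforms)
6. We can find the Poisson transform of functions on the circle quite
explicitly, using some complex notation and Equation (6.10). For
example, consider $f(\theta) = \cos^2\theta$. Using Example 3 we have
$$Pf(r, \theta) = \tfrac14 r^2 e^{-2i\theta} + \tfrac12 + \tfrac14 r^2 e^{2i\theta} = \tfrac12\big[1 + \tfrac12(re^{-i\theta})^2 + \tfrac12(re^{i\theta})^2\big]$$
Thinking of $r$, $\theta$ as polar coordinates in the disk, we can rewrite this
(using $z = re^{i\theta} = x + iy$, $\bar z = re^{-i\theta} = x - iy$):
$$Pf(z) = \tfrac12\big[1 + \tfrac12(z^2 + \bar z^2)\big] = \tfrac12[1 + \operatorname{Re} z^2] = \tfrac12(1 + x^2 - y^2)$$
Clearly,
$$\lim_{r \to 1} Pf(r, \theta) = \lim_{|z| \to 1} Pf(z) = \tfrac12\big(1 + x^2 - (1 - x^2)\big) = x^2 = \cos^2\theta$$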
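7. The Poisson transform of
$$f(\theta) = \begin{cases} 1 & \theta > 0 \\ 0 & \theta < 0 \end{cases}$$
is given by
$$Pf(r, \theta) = \frac12 + \frac{1}{\pi i}\sum_{n\ \mathrm{odd}}\frac{r^{|n|}e^{in\theta}}{n} = \frac12 + \frac{1}{\pi i}\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\Big(\frac{r^n e^{in\theta}}{n} - \frac{r^n e^{-in\theta}}{n}\Big)$$
$$Pf(z) = \frac12 + \frac{2}{\pi}\operatorname{Im}\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\frac{z^n}{n} = \frac12 + \frac{2}{\pi}\operatorname{Im}\sum_{n=0}^{\infty}\frac{z^{2n+1}}{2n+1}$$
Now, we can use Taylor expansions to obtain a closed form for this
series. Integrating the geometric series $\sum_{n=0}^{\infty} z^{2n} = (1 - z^2)^{-1}$ term by term gives
$$\sum_{n=0}^{\infty}\frac{z^{2n+1}}{2n+1} = \int_0^z\frac{dz}{1 - z^2}$$
Now
$$\int_0^z\frac{dz}{1 - z^2} = \frac12\Big(\int_0^z\frac{dz}{1 + z} + \int_0^z\frac{dz}{1 - z}\Big) = \frac12\ln\Big(\frac{1 + z}{1 - z}\Big)$$
(We have used real-variable techniques to find this closed form, but
once it is found it is valid for all $z$, $|z| < 1$.) Thus
$$Pf(z) = \frac12 + \frac{1}{\pi}\operatorname{Im}\ln\Big(\frac{1 + z}{1 - z}\Big)$$
As $|z| \to 1$, $Pf(z)$ has a limit except for $z \to 1$, $z \to -1$. We shall
now show that except for these two values, $\lim_{r \to 1} Pf(r, \theta) = f(\theta)$.
$$\lim_{r \to 1} Pf(r, \theta) = Pf(1, \theta) = \frac12 + \frac{1}{\pi}\operatorname{Im}\ln\Big(\frac{1 + e^{i\theta}}{1 - e^{i\theta}}\Big) \tag{6.11}$$
Now
$$\frac{1 + e^{i\theta}}{1 - e^{i\theta}} = \frac{(1 + e^{i\theta})(1 - e^{-i\theta})}{(1 - e^{i\theta})(1 - e^{-i\theta})} = \frac{1 + e^{i\theta} - e^{-i\theta} - 1}{1 - e^{i\theta} - e^{-i\theta} + 1} = \frac{i\sin\theta}{1 - \cos\theta} \tag{6.12}$$
Since $\ln z = \ln|z| + i\arg z$, $\operatorname{Im}\ln z = \arg z$ for any complex number.
Since (6.12) is pure imaginary, we have
$$\operatorname{Im}\ln\frac{1 + e^{i\theta}}{1 - e^{i\theta}} = \begin{cases} \dfrac{\pi}{2} & \theta > 0 \\[4pt] -\dfrac{\pi}{2} & \theta < 0 \end{cases}$$
Thus, referring back to (6.11),
$$\lim_{r \to 1} Pf(r, \theta) = \begin{cases} \tfrac12 + \tfrac12 = 1 & \text{if } \theta > 0 \\ \tfrac12 - \tfrac12 = 0 & \text{if } \theta < 0 \end{cases}$$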
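The Poisson transform of the step function in Example 7 can be examined numerically. The sketch below compares the closed form $\tfrac12 + \tfrac1\pi\operatorname{Im}\ln\frac{1+z}{1-z}$ with a partial sum of the series $\tfrac12 + \tfrac2\pi\sum r^{2n+1}\sin((2n+1)\theta)/(2n+1)$, and evaluates it close to the boundary. The function names and sample points are our own; we assume that the principal argument returned by `cmath.phase` is the appropriate branch, which holds because $(1+z)/(1-z)$ has positive real part for $|z| < 1$.

```python
import cmath
import math

def step_poisson_closed(r, theta):
    """Closed form 1/2 + (1/pi) Im ln((1+z)/(1-z)) with z = r e^{i theta}.
    cmath.phase gives the principal argument, lying in (-pi/2, pi/2) here."""
    z = r * cmath.exp(1j * theta)
    return 0.5 + cmath.phase((1 + z) / (1 - z)) / math.pi

def step_poisson_series(r, theta, terms=500):
    """Partial sum of 1/2 + (2/pi) * sum r^(2n+1) sin((2n+1) theta) / (2n+1)."""
    s = sum(r ** (2 * n + 1) * math.sin((2 * n + 1) * theta) / (2 * n + 1)
            for n in range(terms))
    return 0.5 + 2 * s / math.pi

# Values just inside the unit circle, on both sides of theta = 0.
near_boundary = [(theta, step_poisson_closed(0.999, theta))
                 for theta in (-2.0, -0.5, 0.5, 2.0)]
```

For $\theta > 0$ the values in `near_boundary` should be close to 1, and for $\theta < 0$ close to 0.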
We are still hoping that it is true for all continuous $f$ that $\lim_{r \to 1} Pf(r, \theta) = f(\theta)$. Of course,
this turns out to be true. To see this we have to verify some properties of
Poisson's kernel. First we rewrite the Poisson kernel as
$$P(r, t) = \frac{1 - r^2}{(1 - r)^2 + 2r(1 - \cos t)}$$
From this reformulation we easily conclude the following properties:
(i) $P(r, t) > 0$ for all values of $r$, $t$, $r < 1$
(ii) $P(r, 0) = \dfrac{1 - r^2}{(1 - r)^2} = \dfrac{1 + r}{1 - r} \to \infty$ as $r \to 1$
(iii) On the other hand, for values of $t \neq 0$, $P(r, t) \to 0$ as $r \to 1$. If
$|t| \ge \delta$,
$$P(r, t) \le \frac{1 - r^2}{(1 - r)^2 + 2r(1 - \cos\delta)} \le \frac{1 - r^2}{2r(1 - \cos\delta)} \to 0$$
uniformly as $r \to 1$.
For a fixed value of $r$, the graph of $P(r, t)$ is drawn in Figure 6.1. As $r \to 1$,
the peak goes up and the valleys get larger and deeper. Finally,
(iv) $\dfrac{1}{2\pi}\displaystyle\int_{-\pi}^{\pi} P(r, t)\,dt = 1$
This can be computed directly; however it is easier to use Equation (6.10)
in the particular case where $f$ is the function which is identically one (see
Problem 2).
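Theorem 6.1. If $f$ is a continuous function on the circle,
$$\lim_{r \to 1} Pf(r, \theta) = f(\theta)$$
Proof. Using property (iv) above we can write $Pf(r, \theta) - f(\theta)$ as an integral,
$$(Pf)(r, \theta) - f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}[f(\phi) - f(\theta)]P(r, \theta - \phi)\,d\phi$$
[Figure 6.1: graph of $P(r, t)$]
For any $\delta > 0$ we break up the integral into two pieces:
$$(Pf)(r, \theta) - f(\theta) = \frac{1}{2\pi}\int_{|\phi - \theta| \le \delta}[f(\phi) - f(\theta)]P(r, \theta - \phi)\,d\phi + \frac{1}{2\pi}\int_{|\phi - \theta| \ge \delta}[f(\phi) - f(\theta)]P(r, \theta - \phi)\,d\phi$$
Now, by (iii) the integrand in the second integral tends to zero as $r \to 1$ and, by
continuity, $|f(\phi) - f(\theta)|$ is small for all $\phi$ near enough to $\theta$ so that we can make
the first integral small by taking $\delta$ small.
More precisely, let $\varepsilon > 0$ be given. Let $\delta$ be such that
$$|f(\phi) - f(\theta)| < \frac{\varepsilon}{2} \quad \text{if } |\phi - \theta| < \delta \tag{6.13}$$
Given that $\delta$, by (iii), there is an $\eta > 0$ such that for $|r - 1| < \eta$,
$$\int_{|\phi - \theta| \ge \delta}P(r, \phi - \theta)\,d\phi < \frac{\varepsilon}{2}\cdot\frac{\pi}{\|f\|}$$
Then for $|r - 1| < \eta$,
$$|Pf(r, \theta) - f(\theta)| \le \frac{1}{2\pi}\int_{|\phi - \theta| \le \delta}|f(\phi) - f(\theta)|P(r, \theta - \phi)\,d\phi + \frac{1}{2\pi}\int_{|\phi - \theta| \ge \delta}|f(\phi) - f(\theta)|P(r, \theta - \phi)\,d\phi$$
$$\le \frac{1}{2\pi}\cdot\frac{\varepsilon}{2}\int_{|\phi - \theta| \le \delta}P(r, \theta - \phi)\,d\phi + \frac{1}{2\pi}\cdot 2\|f\|\int_{|\phi - \theta| \ge \delta}P(r, \theta - \phi)\,d\phi$$
$$\le \frac{\varepsilon}{2} + \frac{1}{\pi}\|f\|\cdot\frac{\varepsilon}{2}\cdot\frac{\pi}{\|f\|} = \varepsilon$$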
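Theorem 6.1 can be watched at work numerically. The following sketch (our own illustration, with the continuous function $f(\theta) = |\theta|$ on the circle chosen as an assumption) approximates the integral form of the Poisson transform by the midpoint rule and records the largest deviation from $f$ over a small grid of angles, for several values of $r$ approaching 1.

```python
import math

def poisson_kernel(r, t):
    return (1 - r * r) / (1 + r * r - 2 * r * math.cos(t))

def poisson_transform(f, r, theta, samples=4000):
    """Midpoint-rule approximation of (1/2pi) * integral f(phi) P(r, theta - phi) dphi."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * h
        total += f(phi) * poisson_kernel(r, theta - phi)
    return total * h / (2 * math.pi)

def sup_error(f, r, grid=16):
    """Largest |Pf(r, theta) - f(theta)| over a small grid of theta in [-pi, pi)."""
    thetas = [-math.pi + 2 * math.pi * j / grid for j in range(grid)]
    return max(abs(poisson_transform(f, r, th) - f(th)) for th in thetas)

# f(theta) = |theta| is continuous on the circle because f(-pi) = f(pi).
errors = [sup_error(abs, r) for r in (0.5, 0.9, 0.99)]
```

The list `errors` should decrease as $r$ increases, which is what the theorem leads us to expect; the grid and sample sizes only give an approximation to the true supremum.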
We seem to have come a long way away from our original quest, but we
have not really. The content of Theorem 6.1 is this: Let $f$ be a continuous
function on the circle. Its Fourier series
$$\sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$
is too hard to study as regards convergence, but it does represent $f$ in some
relevant sense. It "almost converges" to $f$; that is, if we put in factors to
ensure the convergence and consider instead
$$Pf(r, \theta) = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta}$$
then for $r$ very close to 1, this function is very close to $f$. This allows us to
make important assertions based on any information on the Fourier series of $f$.
For example,
Corollary 1. If $f$ is a continuous function on the circle, and $\sum_{n=-\infty}^{\infty}|\hat f(n)| < \infty$, then $f$ is the sum of its Fourier series,
$$f(\theta) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$
Proof. The condition allows us to conclude on the basis of the comparison test
that the Fourier series converges; the essential content here is that it converges to $f$.
In fact, by the comparison test, we can conclude that
$$(Pf)(r, \theta) = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta}$$
is a continuous function on the closed unit disk: all $r \le 1$. Then for any $\theta$, by
Theorem 6.1,
$$f(\theta) = \lim_{r \to 1}Pf(r, \theta) = \lim_{r \to 1}\sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta} = \sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$
In particular, if $\hat f(n)$ vanishes for all but finitely many $n$, then $f$ is a
trigonometric polynomial. Thus the trigonometric polynomials are precisely the class
of continuous functions on the circle with only finitely many nonzero Fourier
coefficients. A more basic consequence is that a function is uniquely
determined by its Fourier series.
Corollary 2. If $f$ and $g$ are continuous on the circle and $\hat f(n) = \hat g(n)$ for all $n$,
then $f = g$.
Proof. $f - g$ is continuous on $\Gamma$, and $\widehat{(f - g)}(n) = \hat f(n) - \hat g(n) = 0$ for all $n$.
Applying the first corollary to $f - g$ we see that it is the sum of its Fourier series,
which is identically zero. Thus $f - g = 0$, so $f = g$.
Conditions on the Fourier coefficients of a function, such as that in
Corollary 1, are not hard to come by. For example, suppose $f$ is a twice
continuously differentiable periodic function. Then by integrating by parts
we have
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)e^{-in\phi}\,d\phi = \frac{1}{2\pi i n}\int_{-\pi}^{\pi}f'(\phi)e^{-in\phi}\,d\phi = -\frac{1}{2\pi n^2}\int_{-\pi}^{\pi}f''(\phi)e^{-in\phi}\,d\phi \qquad n \neq 0$$
Since $f''$ is continuous on the circle, it is bounded, say by $M$. We obtain
these bounds on the Fourier coefficients of $f$:
$$|\hat f(n)| \le \frac{1}{2\pi n^2}\cdot 2\pi M = \frac{M}{n^2} \qquad n \neq 0$$
Thus $\sum|\hat f(n)| < \infty$.
Corollary 3. If $f$ is a $C^2$ function on the circle, it is the sum of its Fourier
series.
We shall have an even better result in Section 6.4. Nevertheless, Theorem
6.1 does allow us to make deductions on the convergence of the Fourier
series. As one last application, it tells us that although we may not be able
to approximate a function by its Fourier series, we can nevertheless
approximate it by some sequence of trigonometric polynomials.
Corollary 4. A continuous function on the circle is approximable by
trigonometric polynomials.
Proof. Using the notion of uniform continuity, we can be sure, in the proof of
Theorem 6.1, that the $\delta$ chosen so that (6.13) is true is independent of $\theta$. Thus,
in the rest of the argument we find an $r < 1$ such that
$$|Pf(r, \theta) - f(\theta)| < \varepsilon \quad \text{for all } \theta$$
Now, the series $\sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta}$ converges uniformly to $Pf(r, \theta)$, if $r < 1$. Thus
there is an $N$ such that the partial sum $Q$ of the terms between $-N$ and $N$ is
everywhere within $\varepsilon$ of $Pf(r, \theta)$. Thus
$$|Q(\theta) - f(\theta)| \le |Q(\theta) - Pf(r, \theta)| + |Pf(r, \theta) - f(\theta)| < \varepsilon + \varepsilon = 2\varepsilon$$
for all $\theta$, as desired.
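The construction in this proof is entirely explicit, so it can be tried out. The sketch below forms the trigonometric polynomial $Q(\theta) = \sum_{|n| \le N}\hat f(n)r^{|n|}e^{in\theta}$ for the continuous function $f(\theta) = |\theta|$, with the Fourier coefficients approximated by the midpoint rule. The choices $r = 0.95$, $N = 100$, the test function, and all function names are our own assumptions.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=2000):
    """Midpoint-rule approximation of (1/2pi) * integral of f(phi) e^{-i n phi}."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * h
        total += f(phi) * cmath.exp(-1j * n * phi)
    return total * h / (2 * math.pi)

def approximating_polynomial(f, r, N):
    """Coefficients of Q(theta) = sum_{|n| <= N} fhat(n) r^{|n|} e^{i n theta}."""
    return {n: fourier_coefficient(f, n) * r ** abs(n) for n in range(-N, N + 1)}

def evaluate(coeffs, theta):
    return sum(c * cmath.exp(1j * n * theta) for n, c in coeffs.items())

Q = approximating_polynomial(abs, r=0.95, N=100)
grid = [-math.pi + 2 * math.pi * j / 40 for j in range(40)]
worst = max(abs(evaluate(Q, th) - abs(th)) for th in grid)
```

Taking $r$ closer to 1 (and $N$ correspondingly larger) should reduce `worst`, at the price of a polynomial of higher degree.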
• EXERCISES
1. Find the Fourier series of the following functions on the circle.
(a) $f(\theta) = \theta^2$ (b) $f(\theta) = \cos^5\theta$
(c) /(0) = e'" t* > 0, not necessarily an integer.
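(d)
$$f(\theta) = \begin{cases} 0 & -\pi \le \theta < -\dfrac{\pi}{2} \\[4pt] \theta + \dfrac{\pi}{2} & -\dfrac{\pi}{2} \le \theta < 0 \\[4pt] -\theta + \dfrac{\pi}{2} & 0 \le \theta < \dfrac{\pi}{2} \\[4pt] 0 & \dfrac{\pi}{2} \le \theta < \pi \end{cases}$$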
(e) $f(\theta) = |\sin\theta|$
(f) $f(\theta) = \sin\theta + \cos\theta$
(g) / π
0
1
№-
-7Г^0<-
:θ<ο
O^0<-
■^0<7Г
(h) /(0) = e<
(ι) /(0) = e'"
2. Find the Poisson transforms of the following functions on the circle:
(a) $\cos^3\theta + \sin^3\theta$
(b) (l+cos20)-'
(c) Exercise 1(c).
(d) Exercise 1(d)
(e) Exercise 1(g).
(f) (1+e'T
PROBLEMS
1. Show that the Fourier coefficients $\hat f(n)$ of a continuous function $f$
defined on the circle are bounded:
$$|\hat f(n)| \le \|f\| = \max\{|f(\theta)| : -\pi \le \theta \le \pi\}$$
2. Show that
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}P(r, t)\,dt = 1$$
by computing the Poisson transform of the function 1.
3. Show that if $f$ is a real-valued function on the circle, $\hat f(-n) = \overline{\hat f(n)}$.
4. (a) Show that the Poisson transform of $f$ can be written
$$Pf(r, \theta) = \hat f(0) + \sum_{n=1}^{\infty}\big(\hat f(-n)\bar z^n + \hat f(n)z^n\big)$$
(b) Show that if $\hat f(-n) = 0$, $n > 0$, then $Pf$ is the sum of a
convergent power series in the unit disk.
(c) Show that if $f$ can be written in the form
$$f(\theta) = F(e^{i\theta})$$
where $F$ can be written as a convergent series in powers of $z$, $\bar z$, then
$Pf(z) = F(z)$.
5. What is the Poisson transform of these functions?
(a) exp(e'e) (d) (l+cos20)-'
(b) (1+2*)-" (e) ln(5 + z)
(c) (z+z)" (f) exp(cos0)
6. We can use the approximation theorem (Corollary 4) to prove the
following fact.
(Weierstrass Approximation Theorem). If $f$ is a continuous function on the
interval $[0, 1]$ and $\varepsilon > 0$, there is a polynomial $P(x) = a_0 + a_1x + \cdots + a_nx^n$
such that
$$|f(x) - P(x)| < \varepsilon \quad \text{for all } x \in [0, 1]$$
Prove it according to this idea: First extend $f$ as a continuous function on
the interval $[-\pi, \pi]$ so that $f(-\pi) = f(\pi)$. Now view the extended function
as a function on the circle and, by Corollary 4, approximate it by a trigonometric
polynomial of the form $\sum_{n=-N}^{N}a_ne^{in\theta}$. Now use the fact that the $2N + 1$
functions $\{e^{in\theta} : -N \le n \le N\}$ can be approximated by polynomials in $\theta$.
6.2 Laplace's Equation
The techniques described in the previous section came out of Poisson's
work on the theory of heat flow. Suppose $D$ is a domain in the plane
(representing a homogeneous metallic plate); we wish to study the
temperature distribution on this plate subject to certain sources of heat energy. Let
$u(x, t)$ be the temperature at the point $x$ at time $t$. We shall see in Chapter 8
that, as a consequence of the law of energy conservation, the temperature
function $u$ behaves according to this partial differential equation
(appropriately called the heat equation)
du _ 1 /d2u d2u\
(6.14)
Now, suppose our sources of heat maintain the temperature at the boundary
of $D$, and there is no other source or loss of heat. Then, as $t \to \infty$ the
temperature distribution will tend toward equilibrium: that state at which
$\partial u/\partial t = 0$. This equilibrium (or steady-state) temperature distribution must
therefore satisfy Laplace's equation:
$$\frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} = 0 \tag{6.15}$$
This is sometimes denoted $\Delta u = 0$. Solutions of Laplace's equation are called
harmonic functions.
The Poisson transform has to do with the solution of this steady-state
problem when $D$ is the unit disk. Suppose then, that we are given a
temperature distribution $f(\theta)$ on the unit circle; we wish to find a continuous
function $u(r, \theta)$ defined for $r \le 1$ such that $\Delta u = 0$ and $u(1, \theta) = f(\theta)$. In
order to attack this problem, we assume that $u$ can be represented, on each
circle $r = $ constant, by its Fourier series:
$$u(r, \theta) = \sum_{n=-\infty}^{\infty}a_n(r)e^{in\theta} \tag{6.16}$$
Our conditions become
$$\text{(i)}\quad a_n(1) = \hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\theta)e^{-in\theta}\,d\theta \tag{6.17}$$
$$\text{(ii)}\quad \Delta u = \frac{1}{r^2}\Big[r\frac{\partial}{\partial r}\Big(r\frac{\partial u}{\partial r}\Big) + \frac{\partial^2u}{\partial\theta^2}\Big] = 0 \tag{6.18}$$
(We have rewritten $\Delta u$ in terms of polar coordinates so we can apply it to the
Fourier series. We leave it to the reader to derive the polar form of the
Laplacian.)
Now, computing (ii) term by term in the series (6.16) and multiplying by $r^2$, we obtain
$$\sum_{n=-\infty}^{\infty}(r^2a_n'' + ra_n' - n^2a_n)e^{in\theta} = 0$$
Since the zero function is represented only by the zero Fourier series we
deduce that
$$r^2a_n'' + ra_n' - n^2a_n = 0 \tag{6.19}$$
for all $n$. This ordinary differential equation is easily solved; the solutions are
$$1,\ \log r \quad (n = 0) \qquad\qquad r^n,\ r^{-n} \quad (n \neq 0)$$
We have only one boundary condition (6.17); however, we do want the
functions continuous at $r = 0$, so the solutions $\log r$, $r^{-|n|}$ are excluded. Thus
we must have $a_n = \hat f(n)r^{|n|}$, and the solution must have the form
$$u(r, \theta) = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta}$$
which is Poisson's transform. Hence, if the problem is solvable, the solution
must be given by Poisson's transform. Conversely, the following is a
solution.
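The solutions of the ordinary differential equation $r^2a'' + ra' - n^2a = 0$ quoted above can be checked by substitution. As a rough numerical sketch (central differences stand in for the derivatives, so the residuals vanish only up to discretization error), we evaluate the left side for $r^n$, $r^{-n}$ with $n = 3$ and for $1$, $\log r$ with $n = 0$. The step size, the evaluation point $r = 0.7$, and the function name `euler_residual` are our own choices.

```python
import math

def euler_residual(a, n, r, h=1e-4):
    """Evaluate r^2 a'' + r a' - n^2 a at r, using central differences for a', a''."""
    d1 = (a(r + h) - a(r - h)) / (2 * h)
    d2 = (a(r + h) - 2 * a(r) + a(r - h)) / (h * h)
    return r * r * d2 + r * d1 - n * n * a(r)

n = 3
candidates = {
    "r^n": (lambda r: r ** n, n),
    "r^-n": (lambda r: r ** (-n), n),
    "1": (lambda r: 1.0, 0),
    "log r": (math.log, 0),
}
residuals = {name: euler_residual(a, order, 0.7)
             for name, (a, order) in candidates.items()}

# For contrast: r^2 does not solve the equation with n = 3 (exact residual -5 r^2).
non_solution = euler_residual(lambda r: r ** 2, 3, 0.7)
```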
Theorem 6.2. Let $f$ be a continuous function on the circle. There is a
unique function $u$, harmonic in the disk and assuming the boundary values $f$.
$u$ is the Poisson transform of $f$:
$$u(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)P(r, \theta - \phi)\,d\phi$$
Proof. We need only verify that $u$ is indeed harmonic. Since we can
differentiate under the integral sign,
$$\Delta u(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)\,\Delta P(r, \theta - \phi)\,d\phi$$
we need only show that the Poisson kernel is harmonic. That can be done by
direct computation, or by referring back to Equation (6.9). There we have, with $z = re^{it}$,
$$P(r, t) = \frac{1}{1 - re^{-it}} + \frac{1}{1 - re^{it}} - 1 = \frac{1}{1 - \bar z} + \frac{1}{1 - z} - 1 = -1 + 2\operatorname{Re}\frac{1}{1 - z}$$
Now, $(1 - z)^{-1}$ is a complex differentiable function, and we have already seen that
the real part of such a function satisfies Laplace's equation (see Problem 5.7.2).
Thus $\Delta P = 0$.
To recapitulate, Laplace's equation for the disk with given boundary values
is easily solved by Fourier methods. If $f$ is the boundary temperature
distribution, the solution is
$$u(z) = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta} = \hat f(0) + \sum_{n=1}^{\infty}\big(\hat f(-n)\bar z^n + \hat f(n)z^n\big)$$
(since $r^{|n|}e^{in\theta} = z^n$ for $n > 0$, $r^{|n|}e^{in\theta} = \bar z^{|n|}$ for $n < 0$).
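Examples
8. Find the solution of Laplace's equation with boundary values
$f(\theta) = \cos^3\theta + 3\sin 3\theta$. This is easy to do, for we can easily
recognize the function as the boundary value of the real part of a
complex differentiable function. Since
$$\cos\theta = \frac12(z + \bar z) \qquad \sin 3\theta = \frac{1}{2i}(z^3 - \bar z^3)$$
on the unit circle, we have
$$f(\theta) = \frac18(z + \bar z)^3 + \frac{3}{2i}(z^3 - \bar z^3) = \frac18(z^3 + \bar z^3) + \frac38(z + \bar z) + \frac{3}{2i}(z^3 - \bar z^3)$$
for $z = e^{i\theta}$, where we have expanded the cube and used $z\bar z = 1$ on the circle.
The last expression contains only pure powers of $z$ and of $\bar z$, so it is clearly
harmonic. Thus the solution is given by that expression for all
$z$, $|z| \le 1$.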
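9. Solve Laplace's equation with boundary values $f(\theta) = |\theta|$.
Since $|\theta|$ is not a trigonometric polynomial, we must compute the
Fourier expansion.
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}|\theta|e^{-in\theta}\,d\theta = -\frac{1}{2\pi}\int_{-\pi}^{0}\theta e^{-in\theta}\,d\theta + \frac{1}{2\pi}\int_0^{\pi}\theta e^{-in\theta}\,d\theta$$
$$= \frac{1}{2\pi}\int_0^{\pi}\theta(e^{in\theta} + e^{-in\theta})\,d\theta = \frac{1}{\pi}\int_0^{\pi}\theta\cos n\theta\,d\theta$$
$$= -\frac{1}{\pi n}\int_0^{\pi}\sin n\theta\,d\theta = \frac{1}{\pi n^2}\cos n\theta\Big|_0^{\pi} = \frac{(-1)^n - 1}{\pi n^2} \qquad (n \neq 0)$$
$$\hat f(0) = \frac{1}{\pi}\int_0^{\pi}\theta\,d\theta = \frac{\pi}{2}$$
Finally, since $\hat f(n) = 0$ for even $n \neq 0$ and $\hat f(n) = -2/(\pi n^2)$ for odd $n$,
$$u(z) = \frac{\pi}{2} - \frac{2}{\pi}\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\frac{z^n + \bar z^n}{n^2} = \frac{\pi}{2} - \frac{2}{\pi}\sum_{n=0}^{\infty}\frac{z^{2n+1} + \bar z^{2n+1}}{(2n+1)^2}$$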
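A quick numerical check of Example 9 (a sketch under our own naming, using $z^m + \bar z^m = 2r^m\cos m\theta$): a partial sum of the series for $u$ is evaluated on the unit circle and compared with $|\theta|$, and at the center of the disk, where only the constant term $\pi/2$ survives. The number of terms is an assumption chosen so that the neglected tail is small.

```python
import math

def u(r, theta, terms=2000):
    """Partial sum of pi/2 - (2/pi) * sum (z^m + conj(z)^m) / m^2 over odd m,
    using z^m + conj(z)^m = 2 r^m cos(m theta)."""
    s = 0.0
    for k in range(terms):
        m = 2 * k + 1
        s += 2 * r ** m * math.cos(m * theta) / m ** 2
    return math.pi / 2 - 2 * s / math.pi

# Largest gap between the truncated series on r = 1 and the boundary data |theta|.
boundary_gap = max(abs(u(1.0, th) - abs(th)) for th in (-3.0, -1.0, 0.0, 0.5, 2.0))
center_value = u(0.0, 0.0)
```

The value at the center equals the average of $|\theta|$ over the circle, namely $\pi/2$.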
The problem analogous to the above in the case of a general domain is
known as Dirichlet's problem. More precisely, Dirichlet's problem is to
find, for a given domain $D$ and function $f$ defined on the boundary $\partial D$, a function harmonic
in $D$ and taking the given boundary values. In 1931, O. Perron gave an
elementary, but extremely clever argument which proved the existence of a
solution to Dirichlet's problem. Poisson's method plays a strategic role in
Perron's arguments, which we shall not go into here. However, we shall
verify that the solution is unique: there can be at most one harmonic function
with given boundary values. This follows from the mean value property of
harmonic functions.
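Proposition 2. Suppose $u$ is a harmonic function in the domain $D$. If
$\Delta(z_0, R) \subset D$, then
$$u(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi}u(z_0 + Re^{i\theta})\,d\theta$$
that is, $u(z_0)$ is the average of the values of $u$ around any circle centered at $z_0$ whose disk is contained in $D$.
Proof. We can expand $u$ in a Fourier series around any circle $|z - z_0| = r$,
$r \le R$:
$$u(z) = \sum_{n=-\infty}^{\infty}a_n(r)e^{in\theta} \quad r \le R, \text{ where } z = z_0 + re^{i\theta}$$
Since $\Delta u = 0$, we must have $a_n(r) = \hat f(n)(r/R)^{|n|}$, where $f(\theta) = u(z_0 + Re^{i\theta})$, as we have already seen
(the factor $R^{-|n|}$ ensures that $a_n(R) = \hat f(n)$). Thus
$$u(z) = \sum_{n=-\infty}^{\infty}\hat f(n)\Big(\frac{|z - z_0|}{R}\Big)^{|n|}e^{in\arg(z - z_0)}$$
so
$$u(z_0) = \hat f(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi}u(z_0 + Re^{i\theta})\,d\theta$$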
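The mean value property lends itself to a direct numerical test. In the sketch below (our own example functions and names) the average over a circle is approximated by an equally spaced Riemann sum, once for the harmonic function $\operatorname{Re} e^z = e^x\cos y$ and once for $x^2 + y^2$, which is not harmonic and whose circle average exceeds the central value by $R^2$.

```python
import math

def circle_average(u, x0, y0, R, samples=256):
    """Riemann-sum approximation of (1/2pi) * integral of u(z0 + R e^{i theta}) d theta."""
    total = 0.0
    for k in range(samples):
        theta = -math.pi + 2 * math.pi * k / samples
        total += u(x0 + R * math.cos(theta), y0 + R * math.sin(theta))
    return total / samples

def harmonic(x, y):
    """Real part of e^z, a harmonic function."""
    return math.exp(x) * math.cos(y)

def not_harmonic(x, y):
    """x^2 + y^2 has Laplacian 4, so the mean value property should fail."""
    return x * x + y * y

avg_h = circle_average(harmonic, 0.3, -0.2, 0.5)
avg_n = circle_average(not_harmonic, 0.3, -0.2, 0.5)
```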
Corollary 1. Suppose $u$ is harmonic on the closed and bounded domain $D$.
If $u > 0$ on $\partial D$, then $u > 0$ throughout $D$.
Proof. Let us suppose that the conclusion is false: that at some point $z_0$ inside
$D$, $u(z_0) \le 0$. We shall derive a contradiction. We may take for $z_0$ a point at
which $u$ takes its minimum value. There is such a point since $D$ is closed and
bounded, and it is interior to $D$ since $u > 0$ on $\partial D$. Let $\Delta(z_0, R)$ be the largest disk
centered at $z_0$ contained in $D$. The boundary of $\Delta(z_0, R)$ must touch $\partial D$ (see
Figure 6.2), for if not we could find a larger disk centered at $z_0$ and contained in $D$.
Thus there are points on the circle $|z - z_0| = R$ at which $u > 0$. Since $u(z_0)$ is the
average value of $u$ on this circle, and $u(z_0) \le 0$, there must be points on this circle
at which $u < u(z_0)$ in order to compensate. But $u(z_0)$ is the minimum value of $u$,
so we have a contradiction. More precisely, since $u(z) \ge u(z_0)$ for all $z \in D$,
$u(z_0 + Re^{i\theta}) - u(z_0) \ge 0$ for all $\theta$. On the other hand, by the mean value property
$$\int_{-\pi}^{\pi}\big(u(z_0 + Re^{i\theta}) - u(z_0)\big)\,d\theta = 0$$
When the integral of a continuous nonnegative function is zero, that function is
identically zero. Thus,
$$u(z_0 + Re^{i\theta}) = u(z_0) \quad \text{for all } \theta$$
This contradicts the fact that for some $\theta$, $u(z_0 + Re^{i\theta}) > 0 \ge u(z_0)$.
[Figure 6.2]
Corollary 2. A function harmonic on a closed and bounded domain $D$ is
uniquely determined by its boundary values.
Proof. Suppose that $u$, $v$ are both harmonic in $D$, and $u = v$ on $\partial D$. Let $\varepsilon > 0$.
Then $u - v + \varepsilon$, $v - u + \varepsilon$ are both positive on $\partial D$. By Corollary 1, they are both
positive in $D$, thus
$$u > v - \varepsilon \qquad v > u - \varepsilon \qquad \text{in } D$$
Since $\varepsilon$ is arbitrary, we may now let it tend to zero. We conclude that $u \ge v$ and
$v \ge u$ throughout $D$. Thus $u = v$ in $D$.
Another problem of heat transfer is this: find the steady-state temperature
distribution on the unit disk assuming a given rate of heat flow through the
boundary, and no other source or loss of heat. Now the velocity of heat
flow, denoted $q$, is a vector field on the domain and it is a law of
thermodynamics that this field is proportional to the temperature gradient, but
oppositely directed. Thus, in this problem, our given data are the rate of heat
flow perpendicular to the boundary of the unit disk, which is proportional to
$\partial u/\partial r$ on the boundary. By the law of conservation of energy, since we are
assuming a steady state, the total energy change is zero, thus we must impose
this condition: $\int_{-\pi}^{\pi}(\partial u/\partial r)(e^{i\theta})\,d\theta = 0$. Thus, the mathematical formulation
of this problem (known as Neumann's problem) is this: Find a function $u$
harmonic in the unit disk such that $(\partial u/\partial r)(e^{i\theta})$ assumes given boundary values
$g(\theta)$. We impose the condition $\int_{-\pi}^{\pi}g(\theta)\,d\theta = 0$. (It is necessary to impose
this condition in order to obtain a solution, for mathematical reasons, as you
will see in Problem 8.) We again solve this problem by Fourier methods.
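Find
$$u(re^{i\theta}) = \sum_{n=-\infty}^{\infty}a_n(r)e^{in\theta}$$
so that (i) $(\partial u/\partial r)(e^{i\theta}) = g(\theta)$, (ii) $\Delta u = 0$. Again, this leads to the ordinary
differential equation (6.19) with the boundary condition $a_n'(1) = \hat g(n)$. The
solution continuous at the origin is $|n|^{-1}\hat g(n)r^{|n|}$ for $n \neq 0$; the term $n = 0$ is an
arbitrary constant, since $\hat g(0) = 0$. Thus the solution must be
given by
$$u(re^{i\theta}) = \sum_{n \neq 0}\frac{\hat g(n)}{|n|}r^{|n|}e^{in\theta} \tag{6.20}$$
We will omit the proof that this function does solve Neumann's problem;
the argument is much like that in Theorem 6.1. We can, of course, collapse
(6.20) into an integral formula:
$$u(re^{i\theta}) = \sum_{n \neq 0}\frac{1}{|n|}\Big[\frac{1}{2\pi}\int_{-\pi}^{\pi}g(\phi)e^{-in\phi}\,d\phi\Big]r^{|n|}e^{in\theta}$$
$$= \frac{1}{2\pi}\int_{-\pi}^{\pi}g(\phi)\Big[\sum_{n=1}^{\infty}\frac{r^ne^{-in(\theta - \phi)}}{n} + \sum_{n=1}^{\infty}\frac{r^ne^{in(\theta - \phi)}}{n}\Big]\,d\phi$$
$$= \frac{1}{\pi}\int_{-\pi}^{\pi}g(\phi)\operatorname{Re}\Big[\sum_{n=1}^{\infty}\frac{(re^{i(\theta - \phi)})^n}{n}\Big]\,d\phi$$
$$= -\frac{1}{\pi}\int_{-\pi}^{\pi}g(\phi)\operatorname{Re}\big[\ln(1 - re^{i(\theta - \phi)})\big]\,d\phi$$
Now
$$|1 - re^{it}|^2 = 1 + r^2 - 2r\cos t$$
so
$$\operatorname{Re}\ln(1 - re^{it}) = \tfrac12\ln|1 - re^{it}|^2 = \tfrac12\ln(1 + r^2 - 2r\cos t)$$
Thus the solution to Neumann's problem takes the form (6.20) or
$$u(re^{i\theta}) = -\frac{1}{2\pi}\int_{-\pi}^{\pi}g(\phi)\ln[1 + r^2 - 2r\cos(\theta - \phi)]\,d\phi$$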
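The integral formula for Neumann's problem can be compared with the series (6.20) on a simple example. The boundary data $g(\phi) = \cos\phi + 2\sin 2\phi$ below are our own hypothetical choice (they have mean zero, as required); for them only $n = \pm 1, \pm 2$ contribute to (6.20), which sums to $r\cos\theta + r^2\sin 2\theta$. The integral is approximated by the midpoint rule, so the agreement is numerical only.

```python
import math

def g(phi):
    """Hypothetical boundary data with zero mean over the circle."""
    return math.cos(phi) + 2 * math.sin(2 * phi)

def neumann_integral(g, r, theta, samples=2000):
    """-(1/2pi) * integral of g(phi) ln(1 + r^2 - 2 r cos(theta - phi)) d phi."""
    h = 2 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * h
        total += g(phi) * math.log(1 + r * r - 2 * r * math.cos(theta - phi))
    return -total * h / (2 * math.pi)

def neumann_series(r, theta):
    """Sum of the series (6.20) for this particular g."""
    return r * math.cos(theta) + r ** 2 * math.sin(2 * theta)

pairs = [(neumann_integral(g, r, th), neumann_series(r, th))
         for r, th in ((0.3, 0.0), (0.6, 0.8), (0.8, -2.0))]
```

Differentiating `neumann_series` with respect to $r$ at $r = 1$ gives $\cos\theta + 2\sin 2\theta$, the prescribed boundary values.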
• EXERCISES
3. Solve Dirichlet's problem in the disk with these boundary conditions:
(a) $f(\theta) = \begin{cases} -1 & \theta < 0 \\ \phantom{-}1 & \theta > 0 \end{cases}$
(b) $f(\theta) = \sin^2\theta - \cos^2\theta$
(c) $f(\theta) = \pi^2 - \theta^2$
(d) $f$ as is given in Exercise 1(c).
(e) $f(z)$ as is given in Exercise 2(f).
4. Solve Neumann's problem with these boundary conditions:
(a) $f(\theta) = \sin\theta + 2\cos 2\theta$
(b) /(«) = { J! J
-02 0<O
0
(c) /as is given in Exercise 3(a).
• PROBLEMS
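7. Show that the Laplacian is given in polar coordinates by
$$\Delta u = \frac{1}{r^2}\Big[r\frac{\partial}{\partial r}\Big(r\frac{\partial u}{\partial r}\Big) + \frac{\partial^2u}{\partial\theta^2}\Big]$$
8. Verify that it is necessary that
$$\frac{1}{2\pi}\int_{-\pi}^{\pi}g(\theta)\,d\theta = 0$$
for there to be a function $u$ harmonic in the disk such that
$$\lim_{r \to 1}\frac{\partial u}{\partial r}(r, \theta) = g(\theta)$$
9. Verify by direct computation that $P(r, \theta)$ is harmonic.
10. Show that if $f$ is a complex differentiable function (it satisfies the
Cauchy-Riemann equations), then $f$ is harmonic.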
11. We can prove, using the Poisson transform, this remarkable fact
about complex differentiable functions:
Theorem. Suppose that $f$ is a complex differentiable function on the unit
disk. Then $f$ is the sum of a convergent power series centered at the origin.
The proof goes like this: Let $g(\theta) = f(e^{i\theta})$. Since $f$ is harmonic in the
disk (Problem 10), it solves Dirichlet's problem with the boundary values $g$.
Thus $f(re^{i\theta}) = Pg(re^{i\theta})$. Now prove this fact:
(a) If the Poisson transform $Pg$ is complex differentiable, then
$\hat g(n) = 0$ for $n < 0$. (Hint: Apply $\partial/\partial x + i\,\partial/\partial y$ to the expression
$$Pg(re^{i\theta}) = \hat g(0) + \sum_{n=1}^{\infty}\big(\hat g(-n)\bar z^n + \hat g(n)z^n\big).)$$
(b) Deduce from (a) that
$$f(re^{i\theta}) = \sum_{n=0}^{\infty}\hat g(n)z^n$$
12. Under what conditions on $f$, $g$ is $P(fg) = P(f)P(g)$?
13. (a) Show that if $f$ is a continuous function on the domain $D$ with the
mean value property:
$$f(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(z_0 + Re^{i\theta})\,d\theta \quad \text{for every } \Delta(z_0, R) \subset D$$
then $f$ satisfies a maximum principle: $f(z_0) \le \max\{f(z) : z \in \partial D\}$, for every
$z_0 \in D$.
(b) Conclude that a function having the mean value property
is harmonic.
14. Prove: A bounded function defined on the entire plane which is
harmonic, must be constant.
6.3 Fourier Sine and Cosine Series
There are many notationally different ways of expressing the Fourier
expansion of a function, depending mostly on the dictates of the problem at
hand. We shall devote this section to the development of these various
expressions.
First of all, since the main physical study is that of real-valued functions
we should introduce the purely real notation. We merely convert the
Fourier expansion $\sum\hat f(n)e^{in\theta}$ via the expressions
$$e^{in\theta} = \cos n\theta + i\sin n\theta \qquad e^{-in\theta} = \cos n\theta - i\sin n\theta \qquad n > 0$$
Thus the Fourier expansion will take the form
$$A_0 + \sum_{n=1}^{\infty}(A_n\cos n\theta + B_n\sin n\theta) \tag{6.21}$$
where the $A$'s and $B$'s are found from the Fourier coefficients $C_n = \hat f(n)$ as
follows:
$$\sum_{n=-\infty}^{\infty}C_ne^{in\theta} = \sum_{n=-\infty}^{\infty}C_n(\cos n\theta + i\sin n\theta) = C_0 + \sum_{n=1}^{\infty}\big[(C_n + C_{-n})\cos n\theta + i(C_n - C_{-n})\sin n\theta\big]$$
Thus
$$A_0 = C_0 \qquad A_n = C_n + C_{-n} \qquad B_n = i(C_n - C_{-n}) \qquad n > 0$$
Notice that if $f$ is real valued
$$C_{-n} = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)e^{in\phi}\,d\phi = \overline{\Big[\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)e^{-in\phi}\,d\phi\Big]} = \overline{C_n}$$
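Thus we have
$$A_0 = C_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)\,d\phi$$
$$A_n = 2\operatorname{Re}C_n = \frac{1}{\pi}\int_{-\pi}^{\pi}f(\phi)\cos n\phi\,d\phi \qquad n > 0 \tag{6.22}$$
$$B_n = -2\operatorname{Im}C_n = \frac{1}{\pi}\int_{-\pi}^{\pi}f(\phi)\sin n\phi\,d\phi \qquad n > 0 \tag{6.23}$$
Furthermore,
$$C_n = \tfrac12(A_n - iB_n) \qquad C_{-n} = \tfrac12(A_n + iB_n) \qquad n > 0$$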
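The passage from the complex coefficients $C_n$ to the real coefficients can be tried on a small example. The sketch below uses the relations $A_n = C_n + C_{-n}$ and $B_n = i(C_n - C_{-n})$ stated above; the test function, a real trigonometric polynomial with $A_0 = 1$, $A_1 = 2$, $B_2 = -3$, and the function names are our own assumptions.

```python
import cmath
import math

def fourier_coefficient(f, n, samples=256):
    """Riemann-sum approximation of C_n = (1/2pi) * integral of f(phi) e^{-i n phi}."""
    h = 2 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + k * h
        total += f(phi) * cmath.exp(-1j * n * phi)
    return total * h / (2 * math.pi)

def real_coefficients(f, n):
    """Return (A_n, B_n) with A_n = C_n + C_{-n} and B_n = i (C_n - C_{-n})."""
    c_pos = fourier_coefficient(f, n)
    c_neg = fourier_coefficient(f, -n)
    return c_pos + c_neg, 1j * (c_pos - c_neg)

def f(theta):
    """Hypothetical real trigonometric polynomial: A_0 = 1, A_1 = 2, B_2 = -3."""
    return 1 + 2 * math.cos(theta) - 3 * math.sin(2 * theta)

A0 = fourier_coefficient(f, 0)
A1, B1 = real_coefficients(f, 1)
A2, B2 = real_coefficients(f, 2)
```

Since $f$ is real valued, `fourier_coefficient(f, -n)` should also come out as the complex conjugate of `fourier_coefficient(f, n)`.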
Examples
10. Express the Fourier series of $\pi^2 - \theta^2$ in the form (6.21). From
Example 4, we have
$$C_0 = \frac{2\pi^2}{3} \qquad C_n = (-1)^{n+1}\frac{2}{n^2} \quad (n \neq 0)$$
Thus
$$A_0 = \frac23\pi^2 \qquad A_n = 4\,\frac{(-1)^{n+1}}{n^2} \qquad B_n = 0$$
and we obtain this Fourier expansion:
$$\pi^2 - \theta^2 = \frac{2\pi^2}{3} + 4\sum_{n > 0}\frac{(-1)^{n+1}}{n^2}\cos n\theta$$
Notice that equality is justified by Corollary 1 to Theorem 6.1 since
the Fourier series does converge. Evaluating at $\theta = 0$, we obtain this
interesting fact:
$$\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12}$$
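11. Express the Fourier series of $|\theta|$ in the form (6.21). Reading
from Example 9, we have the Fourier series for $|\theta|$:
$$\frac{\pi}{2} - \frac{2}{\pi}\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\frac{e^{-in\theta} + e^{in\theta}}{n^2}$$
Thus we have the real Fourier series
$$\frac{\pi}{2} - \frac{4}{\pi}\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\frac{\cos n\theta}{n^2}$$
Evaluating at $\theta = 0$, we obtain
$$\sum_{\substack{n\ \mathrm{odd}\\ n > 0}}\frac{1}{n^2} = \frac{\pi^2}{8}$$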
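The two numerical series obtained by evaluating at $\theta = 0$ in Examples 10 and 11, namely $\sum_{n \ge 1}(-1)^{n+1}/n^2 = \pi^2/12$ and $\sum_{n\ \mathrm{odd}}1/n^2 = \pi^2/8$, are easy to test with partial sums. The following is a small sketch; the number of terms is an arbitrary choice and the partial sums agree with the limits only up to the neglected tails.

```python
import math

def alternating_square_sum(terms):
    """Partial sum of sum_{n>=1} (-1)^(n+1) / n^2."""
    return sum((-1) ** (n + 1) / n ** 2 for n in range(1, terms + 1))

def odd_square_sum(terms):
    """Partial sum of 1 / n^2 over the first `terms` odd integers n."""
    return sum(1 / (2 * k + 1) ** 2 for k in range(terms))

alt = alternating_square_sum(100000)
odd = odd_square_sum(100000)
```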
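12. As usual, trigonometric polynomials can be handled directly,
without computation of integrals:
$$\cos^4\theta = \Big(\frac{e^{i\theta} + e^{-i\theta}}{2}\Big)^4 = \frac{1}{16}\big(e^{4i\theta} + 4e^{2i\theta} + 6 + 4e^{-2i\theta} + e^{-4i\theta}\big) = \frac{1}{16}(2\cos 4\theta + 8\cos 2\theta + 6)$$
$$\cos^4\theta = \frac38 + \frac12\cos 2\theta + \frac18\cos 4\theta$$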
Even and Odd Functions
A function of a real variable is called an even function if $f(x) = f(-x)$ for all
$x$, and it is an odd function if $f(-x) = -f(x)$. Notice that the product of two
odd functions is even, and the product of an odd and even function is odd.
If $f$ is an odd function on the interval $[-A, A]$, then
$$\int_{-A}^{A}f(t)\,dt = \int_{-A}^{0}f(t)\,dt + \int_0^{A}f(t)\,dt = -\int_0^{A}f(t)\,dt + \int_0^{A}f(t)\,dt = 0$$
We can conclude that if $f$ is an even function on the interval $[-\pi, \pi]$, its
Fourier series is purely a cosine series. For in this case $f(\phi)\sin n\phi$ is odd
for all $n$, so the integrals (6.23) all vanish. Similarly, if $f$ is an odd function its
Fourier series is purely a sine series.
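Example
13. The Fourier series of $\theta$ is of the form
$$\sum_{n=1}^{\infty}B_n\sin n\theta$$
since $\theta$ is an odd function. Here
$$B_n = \frac{1}{\pi}\int_{-\pi}^{\pi}\theta\sin n\theta\,d\theta = -\frac{1}{\pi}\,\frac{\theta\cos n\theta}{n}\Big|_{-\pi}^{\pi} + \frac{1}{\pi}\int_{-\pi}^{\pi}\frac{\cos n\theta}{n}\,d\theta = -\frac{2}{n}(-1)^n$$
Thus $\theta$ has the Fourier series $-2\sum_{n=1}^{\infty}\big((-1)^n/n\big)\sin n\theta$.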
Now, all our computations have been done for periodic functions of
period $2\pi$. Periodic functions arising in physics do not usually have such a
convenient period, yet they are subject to Fourier methods merely by a
normalization. Suppose that $f$ is a periodic function of period $L$. Then
$g(\theta) = f(L\theta/2\pi)$ is periodic of period $2\pi$. For
$$g(\theta + 2\pi) = f\left(\frac{L}{2\pi}(\theta + 2\pi)\right) = f\left(\frac{L\theta}{2\pi} + L\right) = f\left(\frac{L\theta}{2\pi}\right) = g(\theta)$$
Now, if $g$ can be expanded in a Fourier series:
$$g(\theta) = A_0 + \sum_{n=1}^{\infty} (A_n\cos n\theta + B_n\sin n\theta)$$
then we can write
$$f(x) = g\left(\frac{2\pi x}{L}\right) = A_0 + \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{2\pi n x}{L}\right) + B_n\sin\left(\frac{2\pi n x}{L}\right)\right] \tag{6.24}$$
where (as is easy to compute by the change of coordinates $\phi = 2\pi x/L$)
$$A_0 = \frac{1}{L}\int_{-L/2}^{L/2} f(x)\,dx \qquad A_n = \frac{2}{L}\int_{-L/2}^{L/2} f(x)\cos\frac{2\pi n x}{L}\,dx \tag{6.25}$$
$$B_n = \frac{2}{L}\int_{-L/2}^{L/2} f(x)\sin\frac{2\pi n x}{L}\,dx \tag{6.26}$$
With these formulas the Fourier analysis of functions periodic of period $L$ is
made possible.
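As an illustration of the period-$L$ formulas, here is a minimal sketch (not from the original text) that evaluates the integrals for $A_0$, $A_n$, $B_n$ over $[-L/2, L/2]$ by the midpoint rule. The function name `fourier_coefficients_period` and the test signal are hypothetical choices made for this example.

```python
import math


def fourier_coefficients_period(f, period, n_max, samples=2000):
    """Return (A0, [A_1..], [B_1..]) for a function f of the given period.

    A0 = (1/L) * integral of f, A_n = (2/L) * integral of f*cos(2*pi*n*x/L),
    B_n = (2/L) * integral of f*sin(2*pi*n*x/L), all over [-L/2, L/2],
    approximated by the midpoint rule.
    """
    h = period / samples
    xs = [-period / 2 + (j + 0.5) * h for j in range(samples)]
    fx = [f(x) for x in xs]
    a0 = h * sum(fx) / period
    a, b = [], []
    for n in range(1, n_max + 1):
        w = 2 * math.pi * n / period
        a.append(2 * h * sum(v * math.cos(w * x) for v, x in zip(fx, xs)) / period)
        b.append(2 * h * sum(v * math.sin(w * x) for v, x in zip(fx, xs)) / period)
    return a0, a, b


L = 3.0  # an arbitrary period chosen for the demonstration


def signal(x):
    # A signal whose coefficients are known: A0 = 3, A_2 = 2, B_1 = -5.
    return 3 + 2 * math.cos(4 * math.pi * x / L) - 5 * math.sin(2 * math.pi * x / L)


A0, A, B = fourier_coefficients_period(signal, L, 3)
print(A0, A, B)
```

The recovered coefficients should reproduce the amplitudes built into `signal`, which confirms that rescaling $x \mapsto 2\pi x/L$ is all that separates this case from period $2\pi$.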
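Fourier Cosine Series
There are yet two more variations which are, as we shall see, of value in
the study of partial differential equations. Let $f$ be a given periodic function
with period $L$ and define
$$g(\theta) = \begin{cases} f\left(\dfrac{L\theta}{\pi}\right), & 0 \le \theta \le \pi \\[4pt] f\left(-\dfrac{L\theta}{\pi}\right), & -\pi \le \theta \le 0 \end{cases} \tag{6.27}$$
Then $g$ is an even function on the interval $[-\pi, \pi]$, so it can be expressed by
a Fourier series involving only cosines:
$$g(\theta) = A_0 + \sum_{n=1}^{\infty} A_n\cos n\theta$$
where
$$A_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\theta)\,d\theta = \frac{1}{\pi}\int_{0}^{\pi} g(\theta)\,d\theta \qquad A_n = \frac{1}{\pi}\int_{-\pi}^{\pi} g(\theta)\cos n\theta\,d\theta = \frac{2}{\pi}\int_{0}^{\pi} g(\theta)\cos n\theta\,d\theta$$
Now, making the substitution $g(\theta) = f(L\theta/\pi)$ in the interval $0 \le \theta \le \pi$, these
expressions become
$$f(x) = A_0 + \sum_{n=1}^{\infty} A_n\cos\frac{\pi n x}{L} \tag{6.28}$$
$$A_0 = \frac{1}{L}\int_{0}^{L} f(x)\,dx \qquad A_n = \frac{2}{L}\int_{0}^{L} f(x)\cos\frac{\pi n x}{L}\,dx \tag{6.29}$$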
We pause to remind the reader that the use of equality in Equations (6.24)
and (6.28) is not literal; it holds only if the series converge (say, if $g$ is twice
continuously differentiable). The point is that in such cases the expansions
(6.24), (6.28) are valid, where the coefficients are defined by (6.25), (6.26), or
(6.29), respectively. The choice of these expansions is free; it is usually
dependent on the demands of the particular problem at hand. Equation
(6.28) is called the Fourier cosine series for the function $f$. Of course, if we
define $g$ as an odd function, instead of the expression (6.28) we can obtain the
Fourier sine series for $f$:
$$f(x) = \sum_{n=1}^{\infty} B_n\sin\frac{\pi n x}{L} \tag{6.30}$$
where
$$B_n = \frac{2}{L}\int_{0}^{L} f(x)\sin\frac{\pi n x}{L}\,dx \tag{6.31}$$
We leave the verification of this possibility to the reader as a problem.
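To see the two half-range expansions side by side, the sketch below (an illustration added here, not the author's computation) approximates the cosine coefficients $A_0$, $A_n$ and the sine coefficients $B_n$ of a function on $[0, L]$ by the midpoint rule and sums both series at one point. The helper names and the choice $f(x) = x$ on $[0, 1]$ are assumptions made for the example; for this $f$ one expects $A_0 = 1/2$, $A_1 = -4/\pi^2$, $B_1 = 2/\pi$.

```python
import math


def half_range_coefficients(f, length, n_max, samples=2000):
    """Return (A0, A, B): Fourier cosine and Fourier sine coefficients of f on [0, length].

    A0 = (1/L) * integral of f, A_n = (2/L) * integral of f*cos(pi*n*x/L),
    B_n = (2/L) * integral of f*sin(pi*n*x/L), approximated by the midpoint rule.
    """
    h = length / samples
    xs = [(j + 0.5) * h for j in range(samples)]
    fx = [f(x) for x in xs]
    a0 = h * sum(fx) / length
    a, b = [], []
    for n in range(1, n_max + 1):
        w = math.pi * n / length
        a.append(2 * h * sum(v * math.cos(w * x) for v, x in zip(fx, xs)) / length)
        b.append(2 * h * sum(v * math.sin(w * x) for v, x in zip(fx, xs)) / length)
    return a0, a, b


def cosine_series(a0, a, length, x):
    """Partial sum of the Fourier cosine series at x."""
    return a0 + sum(an * math.cos(math.pi * n * x / length)
                    for n, an in enumerate(a, start=1))


def sine_series(b, length, x):
    """Partial sum of the Fourier sine series at x."""
    return sum(bn * math.sin(math.pi * n * x / length)
               for n, bn in enumerate(b, start=1))


A0, A, B = half_range_coefficients(lambda x: x, 1.0, 100)
cos_value = cosine_series(A0, A, 1.0, 0.3)
sin_value = sine_series(B, 1.0, 0.3)
print(cos_value, sin_value)  # both should be close to f(0.3) = 0.3
```

Both partial sums approximate the same value, but the cosine series converges faster here because the even extension of $f(x) = x$ is continuous while the odd extension jumps at the end points.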
EXERCISES
5. Find the Fourier expansions into sines and cosines for these functions:
(a) $\cos^8\theta$
(b) $\sin^k\theta$, $k$ a positive integer
(c) $f$ as given in Exercise 3(a).
(d) $f$ as given in Exercise 1(g).
(e) $f$ as given in Exercise 1(b).
6. Find the function whose Fourier expansion is 2™=-« e'"8/'"
7. Find the Fourier sine and Fourier cosine series for these periodic
functions of period 1:
(a) $f(x) = 1$, all $x$
(b) $f(x) = \sin(2\pi x)$
(c) $f(x) = \cos(2\pi x)$
(d) $f(x) = \begin{cases} 1, & 0 \le x \le 1/2 \\ 0, & 1/2 < x < 1 \end{cases}$
(e) $f(x) = \sin(\pi x)$
(f) $f(x) = \begin{cases} x, & 0 \le x \le 1/2 \\ 1 - x, & 1/2 \le x \le 1 \end{cases}$
(g) $f(x) = \sin(\pi x) + \cos(\pi x)$
8. Show that any periodic function on the circle is the sum of an even
function and an odd function.
9. What is the Fourier expansion of $f(\theta) + f(\pi - \theta)$ in terms of that for
$f(\theta)$?
6.4 The One-Dimensional Wave and Heat Equations
In physics, Fourier analysis begins with the study of wave motions.
Suppose we have a homogeneous string of density $\rho$ and length $L$ lying on the
horizontal axis in the plane which is kept extended by equal and opposite
forces of magnitude $k$ at the end points. If we pluck the string, it will follow
a motion which is (classically) determined by Newton's laws. We shall
derive the differential equation governing the motion. At some time $t$ the
string has a shape somewhat like that pictured in Figure 6.3. We shall refer
to a point on the string according to the distance $s$, measured along the string
from the left end point. The position in the plane of the point at distance $s$
at time $t$ will be denoted by $z(s, t)$. This is the function that fully describes
the motion.
Now, if we argue as if the string were a collection of points, we will get
nowhere. For the only forces acting on the string are those obtained by
transferring the equal, but opposite forces at the end points tangentially
along the string. Thus, at any point the sum of the forces acting is zero, so
there can be no motion. As that is contrary to fact, this model of the string
is inadequate and we must select another.
Now we consider the string as a large finite collection of segments and
again try to deduce the equation of motion from Newton's laws. Having
done that, we can idealize by letting the number of segments become infinite
(as their lengths tend to zero) and obtain a differential equation. Let $s_0$
and $s_0 + \Delta s$ be the end points of such a segment (see Figure 6.4). The mass
of this segment is $\rho\,\Delta s$ and the forces acting on it are opposed tangential
forces of magnitude $k$ acting at the end points. Letting $T(s)$ be the tangent
vector at the point $s$, these forces are thus $-kT(s_0)$, $kT(s_0 + \Delta s)$, respectively.
If $A$ is the acceleration of this segment, we have by Newton's laws
$$\rho\,\Delta s\,A = k[T(s_0 + \Delta s) - T(s_0)]$$
Now, $T(s) = \partial z(s, t)/\partial s$ and $\lim_{\Delta s \to 0} A = (\partial^2 z/\partial t^2)(s_0, t)$. Thus
$$A = \frac{k}{\rho}\,\frac{(\partial z/\partial s)(s_0 + \Delta s, t) - (\partial z/\partial s)(s_0, t)}{\Delta s}$$
and now letting $\Delta s \to 0$ we obtain the equation of motion
$$\frac{\partial^2 z}{\partial t^2} = \frac{k}{\rho}\,\frac{\partial^2 z}{\partial s^2}$$
Figure 6.3
This equation, called the one-dimensional wave equation, is usually written
$$\frac{\partial^2 z}{\partial s^2} = \frac{1}{c^2}\,\frac{\partial^2 z}{\partial t^2} \tag{6.32}$$
(where the substitution $c^2 = k/\rho$ is legitimate since both $k$, $\rho$ are positive).
We now make the (physically plausible) assumption that the horizontal
motion is negligible (for we are interested only in almost horizontal wave
motions with small fluctuations). This assumption allows the replacement
of $s$ by the horizontal coordinate $x$, and the position vector $z$ by only the
vertical coordinate $y$. Thus (6.32) becomes simply
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2} \tag{6.33}$$
The motion of the string is completely governed by this partial differential
equation and the initial displacement and velocity:
$$y(x, 0) = f(x) \qquad \frac{\partial y}{\partial t}(x, 0) = g(x) \tag{6.34}$$
Figure 6.4
The technique for solving this differential equation with boundary conditions
is the same as in the theory of ordinary differential equations. We find an
independent set of solutions of the general equations and hypothesize that the
solution we seek is a linear combination of these. We then identify the
coefficients by substituting the initial conditions. However, the situation is
more complicated than in the one-variable theory. The space of solutions of
(6.32) is infinite dimensional, so the particular solution cannot be picked out
of the general solution by means of simple linear algebra. This difficulty
will be overcome, as we shall see, because the form of the general solution
will be that of a Fourier expansion and so the initial data will give us the
coefficients by Fourier methods.
Let us now solve the differential equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2} \tag{6.35}$$
for a function $y$ defined on the interval $[0, L]$ and where these conditions must
be satisfied:
$$y(0, t) = 0 \qquad y(L, t) = 0 \qquad \text{all } t \tag{6.36}$$
$$y(x, 0) = f(x) \qquad \frac{\partial y}{\partial t}(x, 0) = g(x) \tag{6.37}$$
for given functions $f$, $g$. First, we put aside the initial data (6.37) and find all
solutions of Equation (6.35) subject to (6.36). Since we have no
techniques available, we have to make a guess at the form of the solution, and
hope that our guess is general enough (of course, in the end it will turn out to
be so). The guess that works is
$$y(x, t) = F(x)G(t)$$
and (6.35) becomes
$$F''(x)G(t) = \frac{1}{c^2}F(x)G''(t)$$
or, what is the same (since we exclude the zero solution),
$$\frac{F''(x)}{F(x)} = \frac{1}{c^2}\,\frac{G''(t)}{G(t)}$$
The left-hand side is independent of $t$, and the right is independent of $x$.
Since they are the same, they are both constant. Thus, there must be a $\lambda$
such that
$$\frac{F''}{F} = \lambda \qquad \frac{1}{c^2}\,\frac{G''}{G} = \lambda$$
Now, incorporating the conditions (6.36), we arrive at this one-variable
boundary value problem:
$$F'' - \lambda F = 0 \quad \text{for some } \lambda \tag{6.38}$$
$$F(0) = 0 \qquad F(L) = 0 \tag{6.39}$$
We can find all solutions of this problem. First of all, we see from (6.38)
that the general form of $F$ is
$$F(x) = c_1\exp(\sqrt{\lambda}\,x) + c_2\exp(-\sqrt{\lambda}\,x)$$
Substituting the boundary conditions (6.39), we have
$$0 = F(0) = c_1 + c_2 \qquad 0 = F(L) = c_1\exp(\sqrt{\lambda}L) + c_2\exp(-\sqrt{\lambda}L)$$
In order for there to be a nonzero solution of both equations we must have $c_1 = -c_2$
and
$$\exp(\sqrt{\lambda}L) = \exp(-\sqrt{\lambda}L) \quad \text{or} \quad \exp(2\sqrt{\lambda}L) = 1$$
Thus we must have $2\sqrt{\lambda}L = 2\pi n i$ for some $n > 0$, or $\sqrt{\lambda} = \pi n i/L$. Therefore,
the only possible solutions of (6.38), (6.39) are the multiples of
$$F(x) = \exp\left(\frac{\pi n i}{L}x\right) - \exp\left(-\frac{\pi n i}{L}x\right) = 2i\sin\left(\frac{\pi n}{L}x\right) \qquad \text{all } n > 0$$
Corresponding to the solution $F_n(x) = \sin(\pi n x/L)$ we have $\lambda = -\pi^2n^2/L^2$, and we now solve for $G$:
$$G'' + \frac{\pi^2 n^2 c^2}{L^2}G = 0$$
The solutions are spanned by $G_n(t) = \cos(\pi n c t/L)$, $\sin(\pi n c t/L)$. Thus, all
solutions of (6.35) of the form $F(x)G(t)$ are these:
$$\sin\left(\frac{\pi n x}{L}\right)\cos\left(\frac{\pi n c}{L}t\right) \qquad \sin\left(\frac{\pi n x}{L}\right)\sin\left(\frac{\pi n c}{L}t\right) \tag{6.40}$$
We now return to our particular initial conditions (6.37) and hope to find a
linear combination of the functions (6.40) which has those initial conditions.
Of course, the linear combination will satisfy (6.35) since it is a linear
differential equation. (However, we must caution the reader that ours will be an
infinite linear combination so questions of convergence are inevitable. If
the initial data are well behaved, these problems disperse as you shall see in
Problem 15.) Thus we seek
$$y(x, t) = \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{\pi n c}{L}t\right) + B_n\sin\left(\frac{\pi n c}{L}t\right)\right]\sin\left(\frac{\pi n}{L}x\right)$$
satisfying the conditions (6.37):
$$f(x) = y(x, 0) = \sum_{n=1}^{\infty} A_n\sin\left(\frac{\pi n}{L}x\right)$$
$$g(x) = \frac{\partial y}{\partial t}(x, 0) = \sum_{n=1}^{\infty} \frac{\pi n c}{L}B_n\sin\left(\frac{\pi n}{L}x\right)$$
But we can solve these equations, for these are just the expansions of $f$ and $g$
into Fourier sine series. We collect this discussion into the following
proposition.
Proposition 3. If the functions f, g defined on the interval [0, L] are well
behaved (say at least twice differentiable), then the wave equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2}$$
with the boundary data $y(0, t) = 0$, $y(L, t) = 0$ and the initial data $y(x, 0) = f(x)$,
$(\partial y/\partial t)(x, 0) = g(x)$ has a solution. The solution is given by
$$y(x, t) = \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{\pi n c t}{L}\right) + B_n\sin\left(\frac{\pi n c t}{L}\right)\right]\sin\left(\frac{\pi n x}{L}\right)$$
where
$$A_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{\pi n x}{L}\right)dx \qquad B_n = \frac{2}{\pi n c}\int_0^L g(x)\sin\left(\frac{\pi n x}{L}\right)dx$$
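The recipe of the proposition can be tried out numerically. The following is a minimal sketch rather than the author's method: it assumes the equation is normalized as $\partial^2 y/\partial x^2 = (1/c^2)\,\partial^2 y/\partial t^2$, so that separating variables gives the $n$th mode the angular frequency $\pi n c/L$; the Fourier sine coefficients are approximated by the midpoint rule, and the names `sine_coefficients` and `wave_solution`, as well as the initial data, are hypothetical.

```python
import math


def sine_coefficients(f, length, n_max, samples=2000):
    """Fourier sine coefficients (2/L) * integral_0^L f(x) sin(pi n x / L) dx, n = 1..n_max."""
    h = length / samples
    xs = [(j + 0.5) * h for j in range(samples)]
    fx = [f(x) for x in xs]
    return [2 * h * sum(v * math.sin(math.pi * n * x / length) for v, x in zip(fx, xs)) / length
            for n in range(1, n_max + 1)]


def wave_solution(f, g, length, c, n_max=20):
    """Truncated series solution of y_xx = (1/c^2) y_tt with y(0,t) = y(L,t) = 0,
    y(x,0) = f(x), y_t(x,0) = g(x)."""
    omega = [math.pi * n * c / length for n in range(1, n_max + 1)]
    a = sine_coefficients(f, length, n_max)
    # Differentiating the series in t at t = 0 gives omega_n * B_n = n-th sine coefficient of g.
    b = [gn / w for gn, w in zip(sine_coefficients(g, length, n_max), omega)]

    def y(x, t):
        return sum((a[i] * math.cos(omega[i] * t) + b[i] * math.sin(omega[i] * t))
                   * math.sin(math.pi * (i + 1) * x / length)
                   for i in range(n_max))

    return y


L = math.pi
c = 2.0
y = wave_solution(lambda x: math.sin(2 * x), lambda x: math.sin(5 * x), L, c)


def exact(x, t):
    # For this data the series has only two terms, with frequencies 2c = 4 and 5c = 10.
    return math.cos(4 * t) * math.sin(2 * x) + 0.1 * math.sin(10 * t) * math.sin(5 * x)


print(y(0.7, 0.3), exact(0.7, 0.3))
```

A finite-difference check of $y_{xx} - y_{tt}/c^2$ at a sample point is a quick way to confirm that the frequencies and the equation are matched consistently.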
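Examples
14. Solve the wave equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{4}\,\frac{\partial^2 y}{\partial t^2}$$
on the interval $[0, \pi]$ with initial data
$$y(x, 0) = \sin 2x \qquad (\partial y/\partial t)(x, 0) = \sin^2 x$$
Now $c = 2$, $L = \pi$, so the $n$th mode has frequency $\pi n c/L = 2n$. The Fourier sine series for $y(x, 0)$ is just $\sin 2x$,
so $A_n = 0$ unless $n = 2$, $A_2 = 1$. Now
$$B_n = \frac{2}{\pi n c}\int_0^{\pi} \sin^2 x\,\sin(nx)\,dx = \frac{1}{\pi n}\int_0^{\pi} \frac{1 - \cos 2x}{2}\sin(nx)\,dx = 0 \quad \text{if } n \text{ is even}$$
We concentrate now on the case where $n$ is odd:
$$B_n = \frac{1}{\pi n}\cdot\frac{1}{n} - \frac{1}{2\pi n}\int_0^{\pi} \cos(2x)\sin(nx)\,dx \tag{6.41}$$
Now
$$\int_0^{\pi} \cos(2x)\sin(nx)\,dx = -\frac{\cos(2x)\cos(nx)}{n}\Big|_0^{\pi} - \frac{2}{n}\,\frac{\sin(2x)\sin(nx)}{n}\Big|_0^{\pi} + \frac{4}{n^2}\int_0^{\pi} \cos(2x)\sin(nx)\,dx$$
Thus
$$\left(1 - \frac{4}{n^2}\right)\int_0^{\pi} \cos(2x)\sin(nx)\,dx = \frac{1}{n}(1 - \cos\pi n) = \frac{2}{n} \qquad (n \text{ odd})$$
Now, putting the result of this computation into (6.41):
$$B_n = \frac{1}{2\pi n}\left[\frac{2}{n} - \frac{2n}{n^2 - 4}\right] = \frac{-4}{\pi n^2(n^2 - 4)}$$
Thus the solution is given by
$$y(x, t) = \cos 4t\,\sin 2x - \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin 2nt}{n^2(n^2 - 4)}\sin nx$$
or, writing $n = 2m + 1$,
$$y(x, t) = \cos 4t\,\sin 2x - \frac{4}{\pi}\sum_{m=0}^{\infty} \frac{\sin(2(2m+1)t)\,\sin((2m+1)x)}{(2m+1)^2(4m^2 + 4m - 3)}$$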
15. Solve the same wave equation with initial data
$$y(x, 0) = \sin x + \sin 5x + 2\sin 6x \qquad \frac{\partial y}{\partial t}(x, 0) = 0$$
The expressions for the initial conditions are the Fourier sine series
for those functions; thus we can read off the solution (the $n$th mode has frequency $\pi n c/L = 2n$):
$$y(x, t) = \cos 2t\,\sin x + \cos 10t\,\sin 5x + 2\cos 12t\,\sin 6x$$
Heat Transfer
Another physical problem which gives rise to a partial differential equation
which can be solved in a similar way is the problem of one-dimensional heat
transfer. We shall derive this equation here (the derivation in Chapter 8 of
this equation in higher dimensions shall be seen to be completely analogous).
Suppose we are given a thin homogeneous rod of length $L$ lying on the
horizontal axis. Let $u(x, t)$ be the temperature at $x$ at time $t$. We assume
that there is no heat loss, and the temperatures at the end points are maintained
constant. Now the basic physical law here is that the flow of heat is
proportional to the temperature gradient, but points in the opposite direction.
Thus, during a small interval of time $\Delta t$ the heat (energy) passing from left to
right through a point $x_0$ is proportional to $-(\partial u/\partial x)(x_0)\cdot\Delta t$. If we select
a segment of the rod with end points $x_0$ and $x_0 + \Delta x$ the increase in energy
in that segment of the rod is proportional to
$$-\left(-\frac{\partial u}{\partial x}(x_0 + \Delta x)\,\Delta t\right) + \left(-\frac{\partial u}{\partial x}(x_0)\,\Delta t\right) \tag{6.42}$$
On the other hand, the increase in energy is proportional to the product of
the mass and the change in temperature. Thus (6.42) is proportional to
$\Delta u\cdot\Delta x$. Letting $k^2$ be the constant of proportionality we have, for the
period of time $\Delta t$:
$$\Delta u\cdot\Delta x = k^2\left[\frac{\partial u}{\partial x}(x_0 + \Delta x) - \frac{\partial u}{\partial x}(x_0)\right]\Delta t$$
Dividing by $\Delta x\cdot\Delta t$ and letting both tend to zero, we obtain the heat equation:
$$\frac{1}{k^2}\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \tag{6.43}$$
We now propose to solve (6.43) given the boundary conditions
$$u(0, t) = 0 \qquad u(L, t) = 0 \tag{6.44}$$
and the initial temperature distribution
$$u(x, 0) = f(x) \tag{6.45}$$
The technique is the same as that for the wave equation. We try a solution
of the form $u(x, t) = F(x)G(t)$. Then (6.43) becomes
$$\frac{1}{k^2}G'(t)F(x) = F''(x)G(t)$$
Dividing by $F(x)G(t)$, we again find that there must be a $\lambda$ such that
$$\frac{F''}{F} = \lambda \qquad \frac{G'}{G} = k^2\lambda$$
The first equation, subject to the boundary conditions (6.44), again has only the
solutions $\sin(\pi n x/L)$, $n > 0$, corresponding to the choices $\sqrt{\lambda} = \pi n i/L$. The
second equation becomes
$$G' = -\frac{\pi^2 n^2 k^2}{L^2}G$$
which has the solutions
$$G_n(t) = \exp\left(-\frac{\pi^2 n^2 k^2}{L^2}t\right)$$
For convenience, let us write $C = \pi k/L$. We now try to fit the series
$$\sum_{n=1}^{\infty} A_n\exp(-C^2n^2t)\sin\left(\frac{\pi n x}{L}\right) \tag{6.46}$$
to the initial conditions. Evaluating at $t = 0$ we find that the $\{A_n\}$ must be
the Fourier sine coefficients of $f(x)$.
Proposition 4. If the function f defined on the interval [0, L] is well-behaved
(say at least twice continuously differentiable), then the heat equation
$$\frac{1}{k^2}\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
with the boundary data $u(0, t) = 0 = u(L, t)$ and the initial condition $u(x, 0) = f(x)$
has a solution. The solution is given by (6.46) where $C = \pi k/L$ and
$$A_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{\pi n x}{L}\right)dx$$
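The series solution of the heat problem is easy to evaluate by machine. The sketch below is an illustration under stated assumptions, not the author's procedure: the equation is taken in the form $\partial u/\partial t = k^2\,\partial^2 u/\partial x^2$, so the $n$th sine mode decays like $\exp(-(\pi n k/L)^2 t)$; the sine coefficients are computed by the midpoint rule; and the function name `heat_solution` and the initial temperature $x(L - x)$ are hypothetical choices.

```python
import math


def heat_solution(f, length, k, n_max=30, samples=2000):
    """Truncated sine-series solution of u_t = k^2 u_xx on [0, length]
    with u(0,t) = u(length,t) = 0 and u(x,0) = f(x)."""
    h = length / samples
    xs = [(j + 0.5) * h for j in range(samples)]
    fx = [f(x) for x in xs]
    # Fourier sine coefficients of the initial temperature (midpoint rule).
    coeffs = [2 * h * sum(v * math.sin(math.pi * n * x / length) for v, x in zip(fx, xs)) / length
              for n in range(1, n_max + 1)]

    def u(x, t):
        return sum(a * math.exp(-((math.pi * n * k / length) ** 2) * t)
                   * math.sin(math.pi * n * x / length)
                   for n, a in enumerate(coeffs, start=1))

    return u


L = 2.0
k = 0.5
u = heat_solution(lambda x: x * (L - x), L, k)
print(u(1.0, 0.0))   # close to the initial temperature f(1) = 1
print(u(1.0, 0.5))   # the temperature has decayed
print(u(1.0, 5.0))   # and keeps decaying toward zero
```

Because every mode carries a decaying exponential, the high-frequency terms disappear almost immediately and a short partial sum already describes the temperature well for $t > 0$.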
Now, the wave and heat equations readily and conveniently led us to the
considerations of Fourier analysis. Actually this could have been (and in
fact was) anticipated on physical grounds, for we should expect periodic
behavior in these circumstances. Other partial differential equations arising
out of physics can be solved by similar techniques, but we do not necessarily
end up with a sequence of solutions of the general equation which are made
up of trigonometric functions. Thus Fourier analysis does not apply,
although the fundamental ideas may carry over. The typical situation is
this: a partial differential operator $P$ is given on a certain domain $D$; we seek
a solution $f$ of
$$P(f) = 0$$
subject to certain boundary conditions "B" and initial data $f(x, 0) = g(x)$.
First, we find all solutions of $P(f) = 0$ subject to the boundary conditions B,
without regard to initial conditions. If $\{S_1, \dots, S_n, \dots\}$ are these solutions,
then we try to find a linear combination $\sum a_n S_n$ which fits our initial data:
$$\sum a_n S_n(x, 0) = g(x)$$
In our typical situation the $S_n(x, 0)$ are orthonormal in the sense of some
convenient inner product on the space of all initial data. In this case the $a_n$
are readily computable.
The cases of the heat and wave equations described above are just special
cases of this method. There are many more examples of such orthogonal
expansions; discussions of them can be found in most texts of mathematical
physics.
Finally, we cannot really expect to be able to follow through such a
program for every partial differential equation; thus the general theory does
not follow such an explicit line of reasoning. In one approach, local
solutions are sought through examination of Taylor expansions (everything
involved is assumed analytic). This is the Cauchy-Kowalewski theory. A
more recent attack has its roots in the above ideas, as well as the Picard
theorem. The vector space of differentiable functions is provided with a
notion of distance and length which is suited to the given problem so that one can
resolve questions of existence and uniqueness (as in the Picard theorem) and
provide usable approximations with estimates derived from the initial data.
This study is one of the most active branches of modern mathematics.
• EXERCISES
10. Solve the wave equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 y}{\partial t^2}$$
on the interval (0, 1) with the boundary data $y(0, t) = 0 = y(1, t)$, and the
following initial data:
(a) $y(x, 0) = \sin x$, $\dfrac{\partial y}{\partial t}(x, 0) = 0$
(b) $y(x, 0) = \cos^3 \pi x - \cos \pi x$, $\dfrac{\partial y}{\partial t}(x, 0) = \sin \pi x$
(c) $y(x, 0) = x(x - 1)$, $\dfrac{\partial y}{\partial t}(x, 0) = 0$
(d) $y(x, 0) = \cos \pi x$, $\dfrac{\partial y}{\partial t}(x, 0) = \sin \pi x$
By Ътт π
(e) y(x, 0) = 0 — (χ, 0) = sin — χ + sin - χ
11. Solve the heat equation
$$\frac{\partial u}{\partial t} = 4\,\frac{\partial^2 u}{\partial x^2}$$
on the interval (0, L) with the boundary data
$$u(0, t) = 0 = u(L, t)$$
and the following initial data:
(a) $u(x, 0) = \sin x$
πχ
(b) u(x, 0) = cos —
(c) $u(x, 0) = x(x - L)$
πχ 5πΧ
(d) u(x, 0) = sin — + 3 sm —
12. (a) Show that the function $u(x, t) = ax + b$ solves the heat equation
on the interval (0, L), with boundary data
$$u(0, t) = b \qquad u(L, t) = aL + b$$
(b) Show that if u, υ solve the heat equation with boundary data
и(0, t) = t, u(L, t) = ti v(0, 0 = 0 v(L, 0 = 0
then и + ν solves the heat equation with the same boundary data as u.
(c) Solve the heat equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
on the interval (0, 1) with boundary data $u(0, t) = 1$, $u(1, t) = e$ and
initial data $u(x, 0) = e^x$.
13. The initial data given in the problem of heat flow may be the rate of
flow of heat energy; or what is the same, the gradient of the temperature.
Show that the solution of the heat equation
$$\frac{1}{k^2}\,\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
on the interval (0, L) with boundary data $u(0, t) = 0 = u(L, t)$ and initial
data $(\partial u/\partial x)(x, 0) = f(x)$ is given by
$$\sum_{n=1}^{\infty} A_n\exp(-C^2n^2t)\sin\frac{\pi n x}{L}$$
where $C$ is a constant, and
$$A_n = \frac{2}{\pi n}\int_0^L f(x)\cos\frac{\pi n x}{L}\,dx$$
14. Solve the heat equation given in Exercise 11 with this initial data:
(a) $(\partial u/\partial x)(x, 0) = \cos(\pi x/L)$,
(b) $(\partial u/\partial x)(x, 0) = \sin(\pi x/L)$.
15. Solve the differential equation
d2u _ d2u
dx2~~~8y2~ + U
on the interval (0, π) with boundary data «(0, t) = 0 = и(1, t) and the
initial data u(x, 0) =/(*)·
16. Do the same where the differential equation is
d2u du _ d2u du
~bx2Tt~' et2Vx
PROBLEMS
15. Show that the series defining the function $y(x, t)$ in Proposition 3
converges uniformly and absolutely under the stated conditions. Does this
observation suffice to deduce the conclusion of Proposition 3?
16. We may be given, in the heat problem, the gradient of the
temperature as boundary data. Show that the general solution of the heat equation
with boundary data
$$\frac{\partial u}{\partial x}(0, t) = 0 = \frac{\partial u}{\partial x}(L, t)$$
can be written as a Fourier cosine series. Solve the equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
on the interval $(0, \pi)$ with the boundary conditions
$$\frac{\partial u}{\partial x}(0, t) = 0 = \frac{\partial u}{\partial x}(\pi, t)$$
and the initial conditions
(a) $u(x, 0) = \sin x$
(b) $\dfrac{\partial u}{\partial x}(x, 0) = \sin x$
17. Solve the differential equation
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
with the boundary data $(\partial u/\partial x)(0, t) = 0$, $(\partial u/\partial x)(L, t) = h$ and initial
conditions $u(x, 0) = 0$.
18. Solve Laplace's equation
$$\Delta u = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0$$
on the infinite rectangle $0 \le y \le L$, $0 \le x$ (see Figure 6.5) with the boundary
values
$$u(x, 0) = 0 = u(x, L)$$
$$u(0, y) = f(y)$$
$$\frac{\partial u}{\partial x}(0, y) = g(y)$$
Show that the assumption that $u$ is bounded implies that the third condition
is unnecessary: the solution is uniquely determined by its boundary values.
19. Find the bounded solution of the differential equation $\Delta u + u = 0$
in the infinite rectangle (Figure 6.5) with the boundary conditions
$$u(x, 0) = 0 = u(x, L)$$
$$u(0, y) = f(y)$$
Figure 6.5
6.5 The Geometry of Fourier Expansions
We now return to the study of functions on the circle, that is, periodic
functions of period $2\pi$. We still have not studied the sense in which the
Fourier series of a function converges to that function; we have only
Corollaries 1 and 3 of Theorem 6.1, which deal with pointwise and uniform convergence.
Let us consider the real Fourier series of a continuous real-valued function $f$:
$$A_0 + \sum_{n=1}^{\infty} [A_n\cos nx + B_n\sin nx] \tag{6.47}$$
$$A_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx \qquad A_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \qquad B_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx$$
Since the Fourier series of a trigonometric polynomial is itself, we find, by
applying these definitions to $\cos nx$, $\sin nx$, that
$$\int_{-\pi}^{\pi} \cos nx\,\sin mx\,dx = 0 \qquad \text{all } n, m \tag{6.48}$$
$$\int_{-\pi}^{\pi} \cos nx\,\cos mx\,dx = \begin{cases} 0 & n \neq m \\ \pi & n = m \neq 0 \\ 2\pi & n = m = 0 \end{cases} \tag{6.49}$$
$$\int_{-\pi}^{\pi} \sin nx\,\sin mx\,dx = \begin{cases} 0 & n \neq m \\ \pi & n = m \neq 0 \end{cases} \tag{6.50}$$
There is a geometric way of interpreting these equations which sheds light
on the subject. We consider $C(T)$ as a vector space endowed with the inner
product
$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$$
This inner product, of course, defines a notion of distance (recall Section
1.11)
$$\|f - g\|_2 = \left[\int_{-\pi}^{\pi} |f(x) - g(x)|^2\,dx\right]^{1/2} \tag{6.51}$$
which is quite distinct from the uniform, or supremum distance
$$\|f - g\| = \max\{|f(x) - g(x)| : -\pi \le x \le \pi\}$$
We shall call the distance (6.51) the mean square distance, and we shall speak
of convergence in this sense as mean square convergence. More precisely,
$f_n \to f$ (mean square) if $\|f_n - f\|_2 \to 0$ as $n \to \infty$.
Now the importance of the equations above is that they imply that the
functions $\cos nx$, $\sin nx$ are mutually orthogonal in the vector space $C(T)$
with this inner product. Thus, we can interpret (6.47) as an orthogonal
expansion. Let us make these new definitions:
$$C_0(x) = \frac{1}{(2\pi)^{1/2}} \qquad C_n(x) = \frac{\cos nx}{\pi^{1/2}} \qquad S_n(x) = \frac{\sin nx}{\pi^{1/2}}$$
Then the collection $\{C_n, S_n\}$ is, according to Equations (6.48)-(6.50), an
orthonormal set. If $f$ is any function on the circle,
$$A_0 = \frac{1}{(2\pi)^{1/2}}\int_{-\pi}^{\pi} f(x)\,\frac{1}{(2\pi)^{1/2}}\,dx = \frac{\langle f, C_0\rangle}{(2\pi)^{1/2}}$$
$$A_n = \frac{1}{\pi^{1/2}}\int_{-\pi}^{\pi} f(x)\,\frac{\cos nx}{\pi^{1/2}}\,dx = \frac{\langle f, C_n\rangle}{\pi^{1/2}}$$
$$B_n = \frac{1}{\pi^{1/2}}\int_{-\pi}^{\pi} f(x)\,\frac{\sin nx}{\pi^{1/2}}\,dx = \frac{\langle f, S_n\rangle}{\pi^{1/2}}$$
so the Fourier expansion (6.47) can be rewritten as
$$\langle f, C_0\rangle C_0 + \sum_{n=1}^{\infty} [\langle f, C_n\rangle C_n + \langle f, S_n\rangle S_n]$$
and is thus the infinite-dimensional analog of the orthogonal expansion of an
element in an inner product space in terms of an orthonormal basis. This
interpretation has important consequences for us.
Theorem 6.3. Let f be a continuous function on the unit circle, and let
(6.47) be its Fourier series.
(i) Among all trigonometric polynomials of degree at most N, the closest
to f (in mean square) is
$$A_0 + \sum_{n=1}^{N} (A_n\cos nx + B_n\sin nx) \tag{6.52}$$
(ii) (Bessel's inequality)
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx \ge A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} (A_n^2 + B_n^2) \tag{6.53}$$
Proof. In order to verify these facts, we use the basic theorem on orthogonal
expansions (Theorem 1.8). The functions $C_0, C_1, \dots, C_N, S_1, \dots, S_N$ form an
orthonormal basis for the space $S_N$ of trigonometric polynomials of degree at most
$N$. The orthogonal projection of $f$ into this space is
$$f_0 = \langle f, C_0\rangle C_0 + \sum_{n=1}^{N} [\langle f, C_n\rangle C_n + \langle f, S_n\rangle S_n]$$
which is the same as (6.52). Thus, according to Theorem 1.8,
(i) $\|f\|_2^2 = \|f_0\|_2^2 + \|f - f_0\|_2^2$
(ii) for any $w \in S_N$, $\|f - f_0\|_2^2 \le \|f - w\|_2^2$
(ii) directly implies Theorem 6.3(i). According to (i),
$$\|f\|_2^2 \ge \|f_0\|_2^2 = \langle f, C_0\rangle^2 + \sum_{n=1}^{N} (\langle f, C_n\rangle^2 + \langle f, S_n\rangle^2)$$
that is,
$$\|f\|_2^2 \ge 2\pi A_0^2 + \pi\sum_{n=1}^{N} (A_n^2 + B_n^2)$$
Since this is true for all $N$, we can take the limit on the right as $N \to \infty$, thus
obtaining Bessel's inequality.
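Both statements of the theorem can be observed numerically. The sketch below is an added illustration, not part of the proof: it takes $f(\theta) = |\theta|$, computes the degree-3 Fourier polynomial by the midpoint rule, and compares its squared mean square distance from $f$ with that of a polynomial whose first cosine coefficient has been shifted by 0.1 (an arbitrary perturbation). All helper names are hypothetical.

```python
import math

SAMPLES = 4096
H = 2 * math.pi / SAMPLES
GRID = [-math.pi + (j + 0.5) * H for j in range(SAMPLES)]


def integrate(values):
    """Midpoint-rule integral over [-pi, pi] of a function sampled on GRID."""
    return H * sum(values)


def coefficients(f, degree):
    """Real Fourier coefficients A_0, [A_1..A_degree], [B_1..B_degree] of f."""
    a0 = integrate([f(t) for t in GRID]) / (2 * math.pi)
    a = [integrate([f(t) * math.cos(n * t) for t in GRID]) / math.pi
         for n in range(1, degree + 1)]
    b = [integrate([f(t) * math.sin(n * t) for t in GRID]) / math.pi
         for n in range(1, degree + 1)]
    return a0, a, b


def trig_poly(a0, a, b):
    """Return the trigonometric polynomial with the given coefficients."""
    def w(t):
        return a0 + sum(a[n - 1] * math.cos(n * t) + b[n - 1] * math.sin(n * t)
                        for n in range(1, len(a) + 1))
    return w


def squared_distance(f, w):
    """Square of the mean square distance between f and w."""
    return integrate([(f(t) - w(t)) ** 2 for t in GRID])


f = abs
a0, a, b = coefficients(f, 3)
best = squared_distance(f, trig_poly(a0, a, b))
perturbed = squared_distance(f, trig_poly(a0, [a[0] + 0.1, a[1], a[2]], b))
bessel_left = integrate([f(t) ** 2 for t in GRID]) / (2 * math.pi)
bessel_right = a0 ** 2 + 0.5 * sum(x * x + y * y for x, y in zip(a, b))
print(best, perturbed)
print(bessel_left, bessel_right)
```

By the Pythagorean relation, moving one coefficient by 0.1 should increase the squared distance by exactly $0.01\,\|\cos\theta\|_2^2 = 0.01\pi$, and the left side of Bessel's inequality should exceed the partial sum on the right.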
Now, it is clear that for trigonometric polynomials, Bessel's inequality is
actually equality. For if $f$ is such a trigonometric polynomial, there is an
$N$ such that $f \in S_N$, so $f = f_0$. Thus, by (i) above $\|f\|_2 = \|f_0\|_2$, and $\|f_0\|_2^2$ is
just $2\pi$ times the right-hand side of Bessel's inequality. Since any continuous function can be
uniformly approximated by trigonometric polynomials (although not
necessarily by its Fourier series), we should expect Bessel's inequality to be
always equality. This is the case.
Corollary. (Parseval's Equality) If f is a continuous function on the unit
circle and has the Fourier series (6.47), then
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} (A_n^2 + B_n^2)$$
Proof. We continue the notation of Theorem 6.3. Let $\varepsilon > 0$ be given. By
Corollary 4 of Theorem 6.1, there is a trigonometric polynomial $w$ such that
$\|w - f\| < \varepsilon$. Then
$$\|w - f\|_2^2 = \int_{-\pi}^{\pi} |w - f|^2\,dx \le \int_{-\pi}^{\pi} \|w - f\|^2\,dx < 2\pi\varepsilon^2$$
Now, since $w$ is a trigonometric polynomial, there is an $N$ such that $w \in S_N$. Let
$f_0$ be the projection of $f$ into $S_N$. Then by (i) and (ii),
$$\|f\|_2^2 = \|f_0\|_2^2 + \|f - f_0\|_2^2 \le \|f_0\|_2^2 + \|f - w\|_2^2 \le \|f_0\|_2^2 + 2\pi\varepsilon^2$$
This becomes, as in the above argument,
$$\int_{-\pi}^{\pi} |f(x)|^2\,dx \le 2\pi A_0^2 + \pi\sum_{n=1}^{N} (A_n^2 + B_n^2) + 2\pi\varepsilon^2$$
Since the sum to infinity only increases the right-hand side,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx \le A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty} (A_n^2 + B_n^2) + \varepsilon^2$$
Now, since $\varepsilon$ was arbitrary we may let it tend to zero. The resulting inequality,
together with Bessel's inequality, gives Parseval's equality.
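Finally, we note that Parseval's equality can be expressed in terms of the
expansion into a series of complex exponentials: $\sum \hat f(n)e^{in\theta}$. Since
$$\hat f(0) = A_0 \qquad \hat f(n) = \tfrac{1}{2}(A_n - iB_n) \qquad \hat f(-n) = \tfrac{1}{2}(A_n + iB_n) \qquad (n > 0)$$
we find
$$A_0^2 = |\hat f(0)|^2 \qquad A_n^2 + B_n^2 = 2(|\hat f(n)|^2 + |\hat f(-n)|^2)$$
so we have
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(\theta)|^2\,d\theta = \sum_{n=-\infty}^{\infty} |\hat f(n)|^2$$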
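Examples
16. Since $\cos^2\theta = \tfrac{1}{4}e^{-2i\theta} + \tfrac{1}{2} + \tfrac{1}{4}e^{2i\theta}$,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} \cos^4\theta\,d\theta = \frac{1}{16} + \frac{1}{4} + \frac{1}{16} = \frac{3}{8}$$
17. $\pi^2 - \theta^2 = \dfrac{2\pi^2}{3} + 2\sum_{n \neq 0} \dfrac{(-1)^{n+1}}{n^2}e^{in\theta}$, so
$$\frac{16\pi^5}{15} = \int_{-\pi}^{\pi} (\pi^2 - \theta^2)^2\,d\theta = 2\pi\left[\frac{4\pi^4}{9} + 4\sum_{n \neq 0} \frac{1}{n^4}\right]$$
We conclude that
$$\sum_{n=1}^{\infty} \frac{1}{n^4} = \frac{\pi^4}{90}$$
The partial sum to degree 3 of the Fourier series of $\pi^2 - \theta^2$ is
$$F_3(\theta) = \frac{2\pi^2}{3} + 4\cos\theta - \cos 2\theta + \frac{4}{9}\cos 3\theta$$
The square of the mean square distance between $\pi^2 - \theta^2$ and this sum
is
$$16\pi\sum_{n>3} \frac{1}{n^4} = 16\pi\left(\frac{\pi^4}{90} - 1 - \frac{1}{16} - \frac{1}{81}\right) \approx 0.376$$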
Figure 6.6
In Figure 6.6 the graphs of $\pi^2 - \theta^2$ and $F_3(\theta)$ are drawn.
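The Parseval computation for $\pi^2 - \theta^2$ can be cross-checked by brute force. The sketch below is an added illustration: it assumes the cosine coefficients $A_0 = 2\pi^2/3$ and $A_n = 4(-1)^{n+1}/n^2$ of $\pi^2 - \theta^2$, sums $\sum 1/n^4$ directly, and compares the Parseval value of the squared mean square distance to the degree-3 partial sum with a direct midpoint-rule integration. The function names are hypothetical.

```python
import math


def f(theta):
    return math.pi ** 2 - theta ** 2


def partial_sum(theta, degree):
    """Degree-N partial sum, assuming A_0 = 2*pi^2/3 and A_n = 4*(-1)^(n+1)/n^2."""
    return 2 * math.pi ** 2 / 3 + sum(
        4 * (-1) ** (n + 1) / n ** 2 * math.cos(n * theta)
        for n in range(1, degree + 1))


def squared_distance(degree, samples=4096):
    """Integral over [-pi, pi] of (f - partial sum)^2, by the midpoint rule."""
    h = 2 * math.pi / samples
    return h * sum((f(t) - partial_sum(t, degree)) ** 2
                   for t in (-math.pi + (j + 0.5) * h for j in range(samples)))


# Direct summation of 1/n^4 against the closed form pi^4/90.
zeta4 = sum(1.0 / n ** 4 for n in range(1, 100001))
# Squared distance to the degree-3 partial sum: Parseval tail versus quadrature.
by_parseval = 16 * math.pi * (math.pi ** 4 / 90 - 1 - 1 / 16 - 1 / 81)
by_quadrature = squared_distance(3)
print(zeta4, math.pi ** 4 / 90)
print(by_parseval, by_quadrature)
```

The two values of the squared distance agree to within the quadrature error, which is a useful sanity check on the signs and sizes of the coefficients in the partial sum.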
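18. $|\theta|$ has the Fourier expansion
$$\frac{\pi}{2} - \frac{2}{\pi}\sum_{n=-\infty}^{\infty} \frac{e^{i(2n+1)\theta}}{(2n+1)^2}$$
From Parseval's equality, we find
$$\frac{\pi^2}{3} = \frac{\pi^2}{4} + \frac{8}{\pi^2}\sum_{n=0}^{\infty} \frac{1}{(2n+1)^4} \quad \text{or} \quad \sum_{n=0}^{\infty} \frac{1}{(2n+1)^4} = \frac{\pi^4}{96}$$
The third partial sum of the Fourier series of $|\theta|$ is
$$\frac{\pi}{2} - \frac{4}{\pi}\left(\cos\theta + \frac{\cos 3\theta}{9}\right)$$
The square of the mean square distance between $|\theta|$ and this trigonometric
polynomial is
$$\frac{16}{\pi}\sum_{n=2}^{\infty} \frac{1}{(2n+1)^4} = \frac{16}{\pi}\left(\frac{\pi^4}{96} - 1 - \frac{1}{81}\right) \approx 0.012$$
(see Figure 6.7)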
Figure 6.7
Mean square approximation is interesting from the physical point of view.
Consider the solution of the wave equation (suitably normalized)
$$u(x, t) = \sum_{n=1}^{\infty} (A_n\cos nt + B_n\sin nt)\sin nx \tag{6.54}$$
The (kinetic) energy of the wave at time $t$ is proportional to
$$\int_0^{\pi} \left|\frac{\partial u}{\partial t}\right|^2 dx$$
Now, by Parseval's equality that can easily be computed in terms of the
Fourier sine coefficients of $\partial u/\partial t$:
$$\frac{\partial u}{\partial t} = \sum_{n=1}^{\infty} n(B_n\cos nt - A_n\sin nt)\sin nx$$
$$\text{const}\int_0^{\pi} \left|\frac{\partial u}{\partial t}\right|^2 dx = \sum_{n=1}^{\infty} n^2(B_n\cos nt - A_n\sin nt)^2$$
(Because of our normalizations, the constant is not relevant; it might as well
be 1.) Now the right-hand side is at most
$$\sum_{n=1}^{\infty} n^2(A_n^2 + B_n^2)$$
(see Problem 20), so this bounds the kinetic energy of the wave. Now,
according to our geometric considerations above, the $N$th partial sum of
(6.54) provides the best approximation to the solution wave in the sense of
energy. Furthermore, the difference in energy levels between the solution
wave and this approximation is readily computable; it is
$$\sum_{n>N} n^2(A_n^2 + B_n^2)$$
Since energy is the important concept in the study of waves, this mean
square approximation is well suited for this study.
• EXERCISES
17 Compute these integrals by Fourier methods:
(a) j" cos830i/0
π
(b) sin2 μθ άθ μ not an integer
(0 J
(d) J
(e) f 0
\-r2
1 + r2 - 2r cos(0 - φ)
1-r2
cos φ
ιάφ
l+r2-2rcos(0-(£)
■άφ
(f) J 04i/0
18 Approximate the given function by a trigonometric polynomial to
within 10 3 in mean
(a) |0|0
™ cos «0
(b) lw*
(c) sin3 0 cos 0
(d) e
PROBLEMS
20. Show that the maximum of $(B\cos nt - A\sin nt)^2$ is $B^2 + A^2$.
21. Let $\{f_n\}$ be a sequence of continuous functions on the circle. Show
that if $f_n \to f$ uniformly, then $f_n \to f$ in mean. Show by example that the
converse statement is false.
22. Prove: if $f$, $g$ are integrable real-valued functions on the circle,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)g(\theta)\,d\theta = \sum_{n=-\infty}^{\infty} \hat f(n)\overline{\hat g(n)}$$
6.6 Differential Equations on the Circle
We now turn to a slightly different problem involving ordinary differential
equations. We propose to find all periodic solutions of a linear constant
coefficient equation. The particular theory which results is not in itself of
vital importance, but it is worthwhile to study because of the symmetry of the
results and because it presents the simplest example of the general theory of
differential operators on compact manifolds.
As we have already seen, it is valuable in the theory of ordinary differential
equations to allow complex-valued functions. We return then to our
original form of the Fourier expansion of a function $f$: $\sum \hat f(n)e^{in\theta}$. Our first
result concerns the computation of the Fourier coefficients of the derivative
of a function.
Proposition 5. Let f be a continuously differentiable function on the circle.
The Fourier series of f' is obtainable by term by term differentiation. That is,
$$\widehat{f'}(n) = in\hat f(n) \tag{6.55}$$
Proof. The proof is by integration by parts:
$$\widehat{f'}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f'(\theta)e^{-in\theta}\,d\theta = \frac{1}{2\pi}f(\theta)e^{-in\theta}\Big|_{-\pi}^{\pi} + \frac{in}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{-in\theta}\,d\theta = in\hat f(n)$$
Thus, if the differentiable function $f$ has the Fourier series $\sum A_n e^{in\theta}$, then
the Fourier series of $f'$ is $\sum inA_n e^{in\theta}$. It follows from the fundamental
theorem of calculus that we can also integrate Fourier series term by term,
so long as it has no constant term: if $f$ has the Fourier series $\sum_{n \neq 0} A_n e^{in\theta}$, then
$\int_0^{\theta} f$ has the Fourier series $\sum_{n \neq 0} (in)^{-1}A_n e^{in\theta}$, up to an additive constant. A useful consequence of
Proposition 5, in conjunction with Bessel's inequality, is that a continuously
differentiable function is the sum of its Fourier series.
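The rule $\widehat{f'}(n) = in\hat f(n)$ is easy to test numerically. The following sketch is an added illustration, assuming the convention $\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{-in\theta}\,d\theta$ and approximating the integral by the midpoint rule; the smooth periodic test function $e^{\cos\theta}$ and the helper name `fourier_coefficient` are arbitrary choices.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=2048):
    """Approximate (1/2pi) * integral over [-pi, pi] of f(theta) * exp(-i n theta)."""
    h = 2 * math.pi / samples
    total = 0j
    for j in range(samples):
        theta = -math.pi + (j + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)


def f(theta):
    return math.exp(math.cos(theta))


def df(theta):
    # Derivative of exp(cos(theta)), computed by hand.
    return -math.sin(theta) * math.exp(math.cos(theta))


# Each pair holds the coefficient of f' and i*n times the coefficient of f.
pairs = [(fourier_coefficient(df, n), 1j * n * fourier_coefficient(f, n))
         for n in range(-3, 4)]
for lhs, rhs in pairs:
    print(lhs, rhs)
```

For a smooth periodic function the equally spaced rule is extremely accurate, so the two columns should agree essentially to machine precision.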
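Proposition 6. If f is a continuously differentiable function, then
$f(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$ holds for all $\theta$.
Proof. By Bessel's inequality
$$\sum |\widehat{f'}(n)|^2 < \infty$$
Using the above proposition we then obtain by Schwarz's inequality
$$\sum |\hat f(n)| = |\hat f(0)| + \sum_{n \neq 0} \frac{|\widehat{f'}(n)|}{|n|} \le |\hat f(0)| + \left(\sum_{n \neq 0} \frac{1}{n^2}\right)^{1/2}\left(\sum |\widehat{f'}(n)|^2\right)^{1/2} < \infty$$
Thus $\sum |\hat f(n)| < \infty$, so Corollary 1 of Theorem 6.1 applies.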
Now, suppose $g$ is a continuous function on the circle. Given a
polynomial $F(X) = X^k + \sum_{i=0}^{k-1} a_iX^i$, we want to find a periodic function $f$
such that
$$f^{(k)} + \sum_{i=0}^{k-1} a_if^{(i)} = g \tag{6.56}$$
The fact that we are interested in periodic functions is a new twist and the
local results, such as Picard's theorem, are hardly applicable. For example,
consider the simplest differential equation:
$$f' = g \tag{6.57}$$
By local considerations we know that $f$ must be
$$f(\theta) = \int_{-\pi}^{\theta} g(\phi)\,d\phi + c$$
However, $f$ will be a periodic function only if $f(\pi) = f(-\pi) = c$; for this we
must have $\int_{-\pi}^{\pi} g(\phi)\,d\phi = 0$. Thus (6.57) has a solution if and only if $\hat g(0) = 0$.
We have already recognized this condition in the above discussion of
integration of Fourier series. For by (6.55), if $f' = g$ we must have $in\hat f(n) = \hat g(n)$
for all $n$. This necessary condition shows up again by taking $n = 0$: we
must have $\hat g(0) = 0$.
Now we return to the general case (6.56). If we look at the Fourier
series of both sides this becomes $F(in)\hat f(n) = \hat g(n)$. Thus we must have
$\hat g(n) = 0$ whenever $F(in) = 0$. Otherwise, the equation does not have a
solution. On the other hand, if this condition is satisfied, then the equation
is easily solved since we must have $\hat f(n) = F(in)^{-1}\hat g(n)$. The solution is the
function whose Fourier series is
$$\sum_{n=-\infty}^{\infty} \frac{\hat g(n)}{F(in)}e^{in\theta} \tag{6.58}$$
Theorem 6.4. Let $F(X) = \sum_{i=0}^{k} c_iX^i$, and let $n_1, \dots, n_\sigma$ be the integer roots of
$F(in) = 0$. Let $L_F$ be the differential operator
$$L_F(f) = \sum_{i=0}^{k} c_if^{(i)}$$
(i) The space of periodic solutions of $L_F(f) = 0$ is spanned by $\exp(in_1\theta)$,
..., $\exp(in_\sigma\theta)$.
(ii) Given any periodic function g, the equation $L_Ff = g$ has a solution if and
only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$. The solution is uniquely determined by specifying
the Fourier coefficients $\hat f(n_i)$, $1 \le i \le \sigma$.
Proof. The Fourier coefficients of $L_F(f)$ are $\{F(in)\hat f(n)\}$. Now if $L_F(f) = 0$, we
must have $F(in)\hat f(n) = 0$ for all $n$, so $\hat f(n) = 0$ necessarily except when $F(in) = 0$.
Since $n_1, \dots, n_\sigma$ are the roots of this equation, (i) is proven.
If $g$ is a periodic function and $L_F(f) = g$, we must have $F(in)\hat f(n) = \hat g(n)$. Thus
$\hat g(n_i) = 0$, $1 \le i \le \sigma$, is a necessary condition for this equation. Suppose now that
this condition is satisfied. Then if $f$ is a solution we must have
$$\hat f(n) = \frac{\hat g(n)}{F(in)} \qquad n \neq n_1, \dots, n_\sigma \tag{6.59}$$
and the $\hat f(n_i)$, $1 \le i \le \sigma$, can be freely chosen. Upon specification of these
coefficients the Fourier series of $f$ is uniquely determined. The only question is this:
are the numbers (6.59) the Fourier coefficients of a function? The answer is yes
when $F$ is of degree at least one. For then $|F(in)| \ge C|n|$ for some constant $C > 0$
and all sufficiently large $n$ (Problem 24), and thus
$$\sum \left|\frac{\hat g(n)}{F(in)}\right| \le \frac{1}{C}\sum \frac{|\hat g(n)|}{|n|} \le \frac{1}{C}\left(\sum \frac{1}{n^2}\right)^{1/2}\left(\sum |\hat g(n)|^2\right)^{1/2} < \infty \tag{6.60}$$
for the tail end of the series, and thus the sum of the whole series is finite. Hence
the Fourier series
$$f(\theta) = \sum_{n \neq n_1, \dots, n_\sigma} \frac{\hat g(n)}{F(in)}e^{in\theta} \tag{6.61}$$
converges uniformly to a continuous function. The theorem is thus proven.
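We can get a much better looking form for the solution, if the degree of $F$
is large enough (at least second degree). For then
$$P(\theta) = \sum_{\substack{n=-\infty \\ n \ne n_1, \ldots, n_\sigma}}^{\infty} \frac{e^{in\theta}}{F(in)} \tag{6.62}$$
defines a continuous function (Problem 24) and the solution (6.61) is given by
$$f(\theta) = \sum_{n=-\infty}^{\infty} \frac{e^{in\theta}}{F(in)}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)e^{-in\phi}\,d\phi
= \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)\sum_{n=-\infty}^{\infty}\frac{e^{in(\theta-\phi)}}{F(in)}\,d\phi
= \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)P(\theta-\phi)\,d\phi$$
using (6.61); all sums omit $n = n_1, \ldots, n_\sigma$. We can now write the conclusions
of Theorem 6.4 explicitly in terms of an integral formula.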
Theorem 6.5. Let $F(X) = \sum_{i=0}^{k} c_i X^i$ $(k \ge 2)$, and let $n_1, \ldots, n_\sigma$ be the
solutions of $F(in) = 0$. Let $L_F$ be the differential operator defined by the
polynomial $F$. Let
$$P(\theta) = \sum_{\substack{n=-\infty \\ n \ne n_1, \ldots, n_\sigma}}^{\infty} \frac{e^{in\theta}}{F(in)}$$
Then the equation $L_F(f) = g$ has a solution if and only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$.
All solutions are of the form
$$f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)P(\theta-\phi)\,d\phi + \sum_{j=1}^{\sigma} c_j \exp(in_j\theta) \tag{6.63}$$
Thus a constant coefficient differential operator on the circle has an inverse
of the form (6.63) (defined on its range), called an integral operator.
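The recipe behind Theorems 6.4 and 6.5 (take Fourier coefficients, divide by $F(in)$, resum) can be tried numerically. What follows is only a minimal sketch in Python using the standard library. The helper names `fourier_coefficient` and `solve_on_circle` are hypothetical, the coefficients are approximated by Riemann sums, and the test equation $f'' - f = \cos 2\theta$ is chosen so that $F(in) = -n^2 - 1$ has no integer roots.

```python
import cmath
import math


def fourier_coefficient(g, n, samples=256):
    """Riemann-sum approximation of (1/2pi) * integral over [-pi, pi] of g(phi) e^{-i n phi}."""
    total = 0j
    for j in range(samples):
        phi = -math.pi + 2 * math.pi * j / samples
        total += g(phi) * cmath.exp(-1j * n * phi)
    return total / samples


def solve_on_circle(F, g, theta, max_n=8):
    """Sum g^(n) e^{i n theta} / F(in) over |n| <= max_n.

    Assumes F(in) != 0 for every integer n in that range (no roots to skip).
    """
    total = 0j
    for n in range(-max_n, max_n + 1):
        total += fourier_coefficient(g, n) * cmath.exp(1j * n * theta) / F(1j * n)
    return total


def F(X):
    # Characteristic polynomial of f'' - f
    return X * X - 1


def g(phi):
    return math.cos(2 * phi)


theta = 0.7
approx = solve_on_circle(F, g, theta)
exact = -math.cos(2 * theta) / 5
print(abs(approx - exact))
```

For this right-hand side only the coefficients $n = \pm 2$ are nonzero, so the truncated sum should agree with $-\frac{1}{5}\cos 2\theta$ up to rounding error.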
Examples
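19. Find a periodic solution of $f'' - f = \cos 2\theta$. Now
$$g(\theta) = \cos 2\theta = \tfrac{1}{2}\left(e^{2i\theta} + e^{-2i\theta}\right)$$
The characteristic polynomial is $F(X) = X^2 - 1$ and $F(in) = -n^2 - 1$
has no roots. Thus there exists a unique solution and it is given by
(6.58):
$$f(\theta) = \sum \frac{\hat g(n)e^{in\theta}}{F(in)} = \frac{1}{2}\left(\frac{e^{2i\theta}}{-5} + \frac{e^{-2i\theta}}{-5}\right) = -\frac{1}{5}\cos 2\theta$$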
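20. Solve: $f'' - 3f' + 2f = \pi^2 - \theta^2$. The characteristic polynomial
is $F(X) = X^2 - 3X + 2 = (X - 1)(X - 2)$ so that again $F(in) = 0$
has no integral roots. Since $\pi^2 - \theta^2$ has (by Example 4) the Fourier
series
$$\frac{2\pi^2}{3} + 2\sum_{n \ne 0} (-1)^{n+1}\frac{e^{in\theta}}{n^2}$$
the solution is (by 6.58), with $F(0) = 2$ and $F(in) = 2 - n^2 - 3in$,
$$f(\theta) = \frac{\pi^2}{3} + 2\sum_{n \ne 0} \frac{(-1)^{n+1}e^{in\theta}}{n^2\,(2 - n^2 - 3in)}$$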
21. $f^{(k)} = g$. This has a solution if $\hat g(0) = 0$. In this case the
solution is given by
$$f(\theta) = c + \sum_{n \ne 0} \hat g(n)\frac{e^{in\theta}}{(in)^k}$$
22. Find all solutions of $f'' + 4f = 0$. Here, the roots of $X^2 + 4
= 0$ are $\pm 2i$; therefore, all solutions are periodic of period $2\pi$: $e^{2i\theta}$,
$e^{-2i\theta}$ span the space of solutions. Notice, however, that there are no
nonzero solutions of $f'' + 5f = 0$ which are periodic of period $2\pi$.
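The difference between $f'' + 4f = 0$ and $f'' + 5f = 0$ in Example 22 comes down to whether $F(in) = 0$ has integer roots. The following sketch (Python, standard library only) searches a bounded range of integers for such roots. The function name `integer_roots` and the search bound are assumptions made for illustration; a bounded search is enough here because $|F(in)|$ grows with $|n|$.

```python
def integer_roots(coeffs, search=50):
    """Integers n with |n| <= search such that F(in) = 0.

    coeffs[k] is the coefficient of X^k in F(X).
    """
    roots = []
    for n in range(-search, search + 1):
        value = sum(c * (1j * n) ** k for k, c in enumerate(coeffs))
        if abs(value) < 1e-9:
            roots.append(n)
    return roots


roots_4 = integer_roots([4, 0, 1])   # F(X) = X^2 + 4
roots_5 = integer_roots([5, 0, 1])   # F(X) = X^2 + 5
print(roots_4, roots_5)
```

By Theorem 6.4(i), the integers found for $X^2 + 4$ index the exponentials that span the periodic solutions, and an empty list means the homogeneous equation has only the zero periodic solution.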
EXERCISES
19 Find all periodic solutions of these differential equations:
(a) /+2//+15j>=0
(b) $y^{(5)} - y^{(4)} + 10y^{(3)} - 10y'' + 9y' - 9y = 0$
(c) y*» + 2/+l=0
20. Find periodic solutions of these differential equations:
(a) ym + 2y" + у = sin 50 + cos 50
(b) $y'' + 6y' + 9y = \pi^2 - \theta^2$
(c) /5) + j' = exp(cos 0)
PROBLEMS
23. Suppose $F$ is a polynomial of degree at least $k$. Show that there is a
$C > 0$, and an integer $N$ such that $|F(in)| \ge C|n|^k$ for $|n| > N$.
24. Show that if $F$ is a polynomial of degree at least 2, then
$$\sum_{n \text{ not a root}} \frac{e^{in\theta}}{F(in)}$$
is a continuous function on the circle.
25. If $f$, $g$ are two continuous functions on the circle, define $f * g$, the
convolution of $f$ and $g$, by
$$(f * g)(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)g(\theta - \phi)\,d\phi$$
(a) Show that $f * g = g * f$.
(b) Show that the Fourier coefficients of $f * g$ are $\hat f(n)\hat g(n)$.
(c) Show that the differential equation $L_F(f) = g$, where $L_F$ is the
constant coefficient operator associated to the polynomial $F$, is solved by
$f = g * P$, where $P$ has the Fourier coefficients $F(in)^{-1}$.
26. Let
Л(г, 0=| P{r,r)dr
(a) Show that lim Pi(r, t) is 0 if f < 0, and 1 if t > 0.
r-*l
(b) Show that for any С' function g on the circle
3(0) = lim f дЦШг, θ~φ)άφ
00
(Hint lim Pi(r, 0) "has the Fourier series" 2 e'"ejin.)
6.7 Taylor Series and Fourier Series
If we now take the attitude that the unit circle is the boundary of the unit
disk we discover connections between Fourier series and Taylor series which
are of enormous significance in complex function theory. These
connections cannot be fully exploited until we learn the fact (in the next chapter)
that complex differentiable functions can be expanded in a power series. In
this section we shall explore the relationship between the Fourier and Taylor
expansions of such a function defined on the unit disk, assuming its Taylor
series converges on the disk.
Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ on the unit disk. In polar coordinates this becomes
$$f(re^{i\theta}) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta} \tag{6.64}$$
which is, for each $r$, a Fourier series. Using the Fourier theoretic material
at hand we get a most remarkable collection of integral formulas for
functions which are sums of convergent power series.
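Theorem 6.6. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be a convergent power series in $\{|z| \le 1\}$.
Then we have these equations.
(i) For each $r \le 1$,
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{-in\theta}\,d\theta = a_n r^n = \frac{r^n}{n!}f^{(n)}(0) \qquad n \ge 0$$
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{in\theta}\,d\theta = 0 \qquad n > 0$$
(ii) $$f(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,d\phi$$
(iii) $$f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{e^{i\phi} - z}\,d\phi$$
for $z = re^{i\theta}$, $r < 1$.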
Proof. By Equation (6.64), for fixed $r$, $a_n r^n$ is the $n$th Fourier coefficient of
$f(re^{i\theta})$ for $n \ge 0$, and for negative $n$ the Fourier coefficient vanishes. This is just
what part (i) says explicitly. Equation (6.64) also says that $f(re^{i\theta})$ is the Poisson
integral of $f(e^{i\theta})$ and thus we obtain (ii). Then (iii) follows from resumming the
series, using the fact that the negative coefficients vanish. Explicitly, we have
$$f(z) = \sum_{n=0}^{\infty} a_n z^n = \sum_{n=0}^{\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})e^{-in\phi}\,d\phi\; z^n
= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\sum_{n=0}^{\infty}\left(ze^{-i\phi}\right)^n d\phi
= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{e^{i\phi} - z}\,d\phi$$
the last change being accomplished by summing the geometric series.
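Parts (i) and (iii) of the theorem can be checked numerically on a polynomial. The sketch below (Python, standard library only) replaces each integral by an average over equally spaced points of the circle. The helper `circle_average` and the sample polynomial $1 + 2z + 3z^2$ are assumptions made for illustration, and part (i) is used in the form stated in the proof, namely that $a_n r^n$ is the $n$th Fourier coefficient of $f(re^{i\theta})$.

```python
import cmath
import math


def circle_average(h, samples=512):
    """Approximate (1/2pi) * integral over [-pi, pi] of h(phi) by an equally spaced average."""
    total = 0j
    for j in range(samples):
        phi = -math.pi + 2 * math.pi * j / samples
        total += h(phi)
    return total / samples


def f(z):
    # Sample power series with a_0 = 1, a_1 = 2, a_2 = 3
    return 1 + 2 * z + 3 * z ** 2


r = 0.5
# Part (i): the n = 2 Fourier coefficient of f(r e^{i theta}) should be a_2 r^2 = 0.75
coeff_2 = circle_average(lambda t: f(r * cmath.exp(1j * t)) * cmath.exp(-2j * t))
# Part (i), second line: the coefficient against e^{+i theta} should vanish
coeff_minus_1 = circle_average(lambda t: f(r * cmath.exp(1j * t)) * cmath.exp(1j * t))

# Part (iii), Cauchy's formula at an interior point
z = 0.3 + 0.2j
cauchy = circle_average(
    lambda p: f(cmath.exp(1j * p)) * cmath.exp(1j * p) / (cmath.exp(1j * p) - z)
)
print(coeff_2, coeff_minus_1, cauchy, f(z))
```

The equally spaced average is a reasonable choice here because the integrands are smooth and periodic, so the discretization error decays very quickly with the number of samples.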
There are several more or less immediate conclusions one can draw from
the above theorem. First of all, the sum of a convergent power series on the
unit disk is completely determined by its boundary values, by Equation (iii),
known as Cauchy's formula. This of course follows from the maximum
principle verified in the last chapter for analytic functions. The Cauchy
formula itself implies the maximum principle (see Problem 27). A more
important implication is that the sum of a convergent power series in the disk
is analytic; that is, it can be expanded in a power series centered at any
point.
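Corollary. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ in the disk $\{|z| \le 1\}$. Then for any $z_0$,
$|z_0| < 1$, $f$ can be expanded in a power series centered at $z_0$.
Proof. By Cauchy's formula
$$f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{e^{i\phi} - z}\,d\phi$$
Now
$$\frac{1}{e^{i\phi} - z} = \frac{1}{e^{i\phi} - z_0 - (z - z_0)} = \frac{1}{e^{i\phi} - z_0}\left(1 - \frac{z - z_0}{e^{i\phi} - z_0}\right)^{-1}$$
In the disk $\{z \in C : |z - z_0| < 1 - |z_0|\}$, the last factor is the sum of a geometric
series:
$$\left(1 - \frac{z - z_0}{e^{i\phi} - z_0}\right)^{-1} = \sum_{n=0}^{\infty}\left(\frac{z - z_0}{e^{i\phi} - z_0}\right)^n$$
which we may substitute in the integral. We obtain
$$f(z) = \sum_{n=0}^{\infty}\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{(e^{i\phi} - z_0)^{n+1}}\,d\phi\right](z - z_0)^n$$
the series being convergent for all $z$ such that $|z - z_0| < 1 - |z_0|$. As a consequence
we have still more integral formulas:
$$\frac{f^{(n)}(z_0)}{n!} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{(e^{i\phi} - z_0)^{n+1}}\,d\phi$$
for any $z_0$, $|z_0| < 1$.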
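The integral formula for the Taylor coefficients at $z_0$ can be tested on a function whose derivatives are known in closed form. This is a hedged sketch in Python (standard library only): the helper name `taylor_coefficient` is hypothetical, the integral is approximated by an equally spaced average over the circle, and $f(z) = e^z$ is chosen because $f^{(n)}(z_0)/n! = e^{z_0}/n!$.

```python
import cmath
import math


def taylor_coefficient(f, z0, n, samples=512):
    """Approximate (1/2pi) * integral of f(w) w / (w - z0)^(n+1) d(phi), with w = e^{i phi}."""
    total = 0j
    for j in range(samples):
        w = cmath.exp(1j * (-math.pi + 2 * math.pi * j / samples))
        total += f(w) * w / (w - z0) ** (n + 1)
    return total / samples


z0 = 0.3
c3 = taylor_coefficient(cmath.exp, z0, 3)
expected = math.exp(z0) / math.factorial(3)
print(c3, expected)
```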
We shall see in the next chapter that these integral formulas can be
explained in yet another way (basically the fundamental theorem of calculus)
and are just very special cases of general formulas. We conclude now with
an approximation theorem which should be contrasted with the Weierstrass
approximation theorem (Problem 7) for a real variable.
Theorem 6.7. Let $f$ be a continuous function on the circle. $f$ is approximable
by polynomials in $z$ if and only if
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\theta})e^{in\theta}\,d\theta = 0 \qquad n > 0 \tag{6.65}$$
Proof. If $f$ is approximable by polynomials, there is a sequence $\{f_k\}$ of
polynomials such that $f_k \to f$ uniformly. Since (6.65) is readily verified for any
polynomial, it thus holds also for $f$, by continuity of the integral.
Conversely, if (6.65) is verified, then the Poisson transform of $f$ is the sum of a
convergent power series:
$$Pf(z) = \sum_{n=0}^{\infty} a_n z^n \qquad |z| < 1 \tag{6.66}$$
Since $Pf(re^{i\theta}) \to f(\theta)$ uniformly as $r \to 1$, given $\varepsilon > 0$ there is an $r < 1$ such that
$|Pf(re^{i\theta}) - f(\theta)| < \varepsilon$ for all $\theta$.
Now on the circle $|z| = r$, the series (6.66) converges uniformly, so there is an $N$
such that
$$\left|Pf(z) - \sum_{n=0}^{N} a_n z^n\right| < \varepsilon \qquad |z| = r$$
Then, on the unit circle,
$$\left|f(z) - \sum_{n=0}^{N} a_n r^n z^n\right| \le \left|f(e^{i\theta}) - Pf(re^{i\theta})\right| + \left|Pf(re^{i\theta}) - \sum_{n=0}^{N} a_n (re^{i\theta})^n\right| < 2\varepsilon$$
independently of $\theta$.
• EXERCISES
21. Integrate:
(a) /
* e3,e + 4e2ie_(_eil
le'e - 1
άθ
(b) L&
+ 1/4)"
άθ к, п positive integers
PROBLEMS
27. Deduce the maximum principle for convergent power series on the
disk from Cauchy's formula.
28. Using the results of Problem 11, verify that these assertions for a
function defined on the disk are equivalent:
(a) $f(z) = \sum a_n z^n$
(b) $f$ is complex differentiable.
(c) $f$ is uniformly approximable by polynomials.
(d) $f$ is harmonic and $\hat f(-n) = 0$ for $n > 0$.
6.8 Summary
The function $f(t) = \exp(2\pi it/L)$ wraps the real line around the circle so
that every interval of length L covers the circle once. The collection of
periodic functions of period L may be viewed as the collection of functions
on the circle.
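If $f$ is a piecewise continuous function on the circle, its $n$th Fourier
coefficient is
$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{-in\phi}\,d\phi$$
The Fourier series of $f$ is the series
$$\sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$$
The Poisson transform of $f$ is the function on the unit disk given by
$$Pf(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,d\phi = \sum_{n=-\infty}^{\infty} \hat f(n)r^{|n|}e^{in\theta}$$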
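The two expressions for the Poisson transform (an integral against the Poisson kernel and a damped Fourier series) can be compared numerically. The sketch below uses Python's standard library and assumes the kernel $(1 - r^2)/(1 + r^2 - 2r\cos(\theta - \phi))$ with the $1/2\pi$ normalization. The helper name `poisson_integral` is hypothetical, and $f = \cos$ is chosen because its only nonzero Fourier coefficients are $\hat f(\pm 1) = 1/2$, so the series side is $r\cos\theta$.

```python
import math


def poisson_integral(f, r, theta, samples=1024):
    """Approximate (1/2pi) * integral of f(phi) (1 - r^2) / (1 + r^2 - 2 r cos(theta - phi))."""
    total = 0.0
    for j in range(samples):
        phi = -math.pi + 2 * math.pi * j / samples
        kernel = (1 - r * r) / (1 + r * r - 2 * r * math.cos(theta - phi))
        total += f(phi) * kernel
    return total / samples


r, theta = 0.5, 1.1
by_integral = poisson_integral(math.cos, r, theta)
by_series = r * math.cos(theta)   # sum of f^(n) r^|n| e^{i n theta} for f = cos
print(by_integral, by_series)
```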
Theorem. If $f$ is a continuous function on the circle and $g$ is the function on
the disk defined by
$$g(r, \theta) = Pf(r, \theta) \quad r < 1 \qquad g(1, \theta) = f(\theta)$$
then $g$ is continuous on the disk and harmonic (satisfies Laplace's equation)
inside:
$$\frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} = 0 \qquad \text{for } r < 1$$
Theorem. If $f$ is continuously differentiable on the circle, then it is the sum
of its Fourier series:
$$f(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$$
Suppose $u$ is harmonic in a closed and bounded domain $D$ in the plane.
Then
(i) if $\Delta(a, R) \subset D$,
$$u(a) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(a + Re^{i\theta})\,d\theta \qquad \text{(mean value property)}$$
(ii) if $u \le M$ on $\partial D$, then $u \le M$ inside $D$ (maximum principle).
A function harmonic on a closed and bounded domain is uniquely
determined by its boundary values.
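The mean value property is easy to observe numerically. The following is a small sketch in Python (standard library only). The helper name `circle_mean`, the harmonic polynomial $u(x, y) = x^2 - y^2 + 3x$, the center and the radius are all illustrative choices and not taken from the text.

```python
import math


def circle_mean(u, center, radius, samples=360):
    """Average of u over equally spaced points of the circle |z - center| = radius."""
    cx, cy = center
    total = 0.0
    for j in range(samples):
        t = -math.pi + 2 * math.pi * j / samples
        total += u(cx + radius * math.cos(t), cy + radius * math.sin(t))
    return total / samples


def u(x, y):
    # x^2 - y^2 + 3x is harmonic: u_xx + u_yy = 2 - 2 = 0
    return x * x - y * y + 3 * x


center = (0.4, -0.2)
mean = circle_mean(u, center, 0.5)
value = u(*center)
print(mean, value)
```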
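If the real-valued function $f$ has the Fourier series $\sum C_n e^{in\theta}$, we can rewrite
it as
$$A_0 + \sum_{n=1}^{\infty} \left(A_n\cos n\theta + B_n\sin n\theta\right)$$
where
$$A_0 = C_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\,d\phi$$
$$A_n = 2\,\mathrm{Re}\,C_n = C_n + C_{-n} = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\phi)\cos n\phi\,d\phi$$
$$B_n = -2\,\mathrm{Im}\,C_n = i(C_n - C_{-n}) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\phi)\sin n\phi\,d\phi$$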
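If $f$ is a $C^1$ periodic function of period $L$, it can be expanded in a Fourier
cosine series
$$f(x) = A_0 + \sum_{n=1}^{\infty} A_n\cos\frac{\pi nx}{L}$$
$$A_0 = \frac{1}{L}\int_0^L f(x)\,dx \qquad A_n = \frac{2}{L}\int_0^L f(x)\cos\frac{\pi nx}{L}\,dx$$
or a Fourier sine series
$$f(x) = \sum_{n=1}^{\infty} B_n\sin\frac{\pi nx}{L} \qquad B_n = \frac{2}{L}\int_0^L f(x)\sin\frac{\pi nx}{L}\,dx$$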
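The wave equation. Given the $C^2$ periodic functions $f$, $g$ of period $L$
the equation
$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\frac{\partial^2 y}{\partial t^2}$$
with the boundary data
$$y(0, t) = 0 \qquad y(L, t) = 0$$
and the initial data
$$y(x, 0) = f(x) \qquad \frac{\partial y}{\partial t}(x, 0) = g(x)$$
has a solution. The solution is given by
$$y(x, t) = \sum_{n=1}^{\infty}\left[A_n\cos\frac{\pi nct}{L} + B_n\sin\frac{\pi nct}{L}\right]\sin\frac{\pi nx}{L}$$
where
$$A_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{\pi nx}{L}\right)dx \qquad B_n = \frac{2}{\pi nc}\int_0^L g(x)\sin\left(\frac{\pi nx}{L}\right)dx$$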
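The heat equation. Given the $C^2$ periodic function $f$ of period $L$ the
equation
$$\frac{1}{k^2}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$
with the boundary data
$$u(0, t) = 0 = u(L, t)$$
and the initial condition
$$u(x, 0) = f(x)$$
has a solution given by
$$u(x, t) = \sum_{n=1}^{\infty} A_n\exp(-C^2n^2t)\sin\frac{\pi nx}{L}$$
where $C = \pi k/L$ and
$$A_n = \frac{2}{L}\int_0^L f(x)\sin\frac{\pi nx}{L}\,dx$$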
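Consider $C(\Gamma)$ as endowed with the inner product
$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$$
Let
$$C_0 = \frac{1}{\sqrt{2\pi}} \qquad C_n = \frac{\cos nx}{\sqrt{\pi}} \qquad S_n = \frac{\sin nx}{\sqrt{\pi}}$$
$\{C_n, S_n\}$ is an orthonormal set. The Fourier series of a function $f$ can be
rewritten as
$$\sum_{n=0}^{\infty} \langle f, C_n\rangle C_n + \sum_{n=1}^{\infty} \langle f, S_n\rangle S_n$$
Parseval's equality.
$$\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x)|^2\,dx = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}\left(A_n^2 + B_n^2\right)$$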
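For a real-valued $f$ with cosine and sine coefficients $A_n$, $B_n$, Parseval's equality reads $\frac{1}{2\pi}\int_{-\pi}^{\pi}|f|^2\,dx = A_0^2 + \frac{1}{2}\sum (A_n^2 + B_n^2)$. The sketch below (Python, standard library only) checks this on a trigonometric polynomial whose coefficients are known by construction. The helper name `mean_square` and the sample coefficients are assumptions made for illustration.

```python
import math


def mean_square(f, samples=720):
    """Approximate (1/2pi) * integral over [-pi, pi] of f(x)^2 by an equally spaced average."""
    total = 0.0
    for j in range(samples):
        x = -math.pi + 2 * math.pi * j / samples
        total += f(x) ** 2
    return total / samples


A0, A1, B2 = 1.0, 2.0, 3.0


def f(x):
    # f = A0 + A1 cos x + B2 sin 2x, so all other coefficients vanish
    return A0 + A1 * math.cos(x) + B2 * math.sin(2 * x)


lhs = mean_square(f)
rhs = A0 ** 2 + 0.5 * (A1 ** 2 + B2 ** 2)
print(lhs, rhs)
```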
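Differential equations on the circle. Let $F$ be a polynomial and let
$n_1, \ldots, n_\sigma$ be the integer solutions of $F(in) = 0$. Let $L_F$ be the differential
operator defined by the polynomial $F$. Let
$$P(\theta) = \sum_{\substack{n=-\infty \\ n \ne n_1, \ldots, n_\sigma}}^{\infty} \frac{e^{in\theta}}{F(in)}$$
Then the equation $L_F(f) = g$ has a solution if and only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$.
All solutions are of the form
$$f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)P(\theta - \phi)\,d\phi + \sum_{j=1}^{\sigma} c_j\exp(in_j\theta)$$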
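Theorem. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be a convergent power series on the unit
disk. Then these equations are valid:
(i) $$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{-in\theta}\,d\theta = a_n r^n = \frac{r^n}{n!}f^{(n)}(0) \qquad n \ge 0,\ r \le 1$$
(ii) $$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{in\theta}\,d\theta = 0 \qquad n > 0,\ r \le 1$$
(iii) $$f(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,d\phi$$
(iv) $$f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\phi})\frac{e^{i\phi}}{e^{i\phi} - z}\,d\phi$$
If $f$ is complex differentiable in a domain $D$, it can be expanded in a
power series in some disk centered at any point in $D$.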
• FURTHER READING
The theory of Fourier series is exposed in these texts:
R. Seeley, An Introduction to Fourier Series and Transforms, W. A. Benjamin,
Inc., New York, 1966.
Kreider, Kuller, Ostberg, Perkins, An Introduction to Linear Analysis,
Addison-Wesley, Reading, Mass., 1966.
Hardy and Rogosinski, Fourier Series, Oxford University Press, New
York, 1956.
Further applications to physics and the development of other partial
differential equations can be pursued in
E. Butkov, Mathematical Physics, Addison-Wesley, Reading, Mass., 1968.
O. D. Kellogg, Foundations of Potential Theory, Dover, New York, 1953.
• MISCELLANEOUS PROBLEMS
29. Let
Κθ) - (ο θ < о
οο
Show that 2 /W = !/2 (see Example 7) What is
n— - oo
л = - QO
30. Let $f$ be a piecewise continuous function on the circle. Suppose $f$
has a jump discontinuity at 0; that is, the limits
$$\lim_{x \to 0,\ x < 0} f(x) = a \qquad \lim_{x \to 0,\ x > 0} f(x) = b$$
both exist, but are different. Show that
$$\lim_{r \to 1} Pf(r, 0) = \tfrac{1}{2}(a + b)$$
(Hint: Follow the proof of Theorem 6.1 for $\phi > 0$, $\phi < 0$ independently,
using the substitution
$$\frac{1}{2} = \int_0^{\pi} P(r, -\phi)\,d\phi = \int_{-\pi}^{0} P(r, -\phi)\,d\phi\,)$$
31. Show that $f$ is an infinitely differentiable function on the circle if
and only if this condition on the Fourier coefficients is satisfied: for every
$k > 0$ there is an $M > 0$ such that
$$|\hat f(n)| \le \frac{M}{|n|^k} \qquad \text{for all } n \ne 0$$
32. Suppose
$$f(z) = \frac{P(z)}{Q(z)}$$
where $P$ is analytic on $\{|z| \le 1\}$ and $Q$ is a polynomial. Show that there is
an integer $N$ and complex numbers $a_0, \ldots, a_N$ such that
$$a_N\hat f(k - N) + \cdots + a_0\hat f(k) = 0 \qquad \text{for all } k < 0$$
(Hint: Let $Q(z) = a_N z^N + \cdots + a_0$.) State and prove the converse
assertion.
33. Suppose that $f$ is complex differentiable in the annulus $\{r < |z| < R\}$.
Using the polar form of the Cauchy-Riemann equations, show that
$$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n$$
where the $a_n$ can be computed from the Fourier coefficients $\hat f(n)$ of $f(\rho e^{i\theta})$
for any $\rho$ between $r$ and $R$.
34. Show that if $f$ is analytic in the punctured disk $\{0 < |z| < R\}$ and
bounded, then $f$ extends analytically to the entire disk.
35. Show that if $u$ is harmonic in the disk $\{|z - a| \le R\}$, then for every
$r < R$,
$$u(a + re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(a + Re^{i\phi})\frac{R^2 - r^2}{R^2 + r^2 - 2Rr\cos(\theta - \phi)}\,d\phi$$
36. (Harnack's principle) If $u$ is harmonic and nonnegative in the disk
$\{|z - a| \le R\}$, then for $r < R$
$$\frac{R - r}{R + r}\,u(a) \le u(a + re^{i\theta}) \le \frac{R + r}{R - r}\,u(a)$$
37. Suppose $\{u_n\}$ is a sequence of nonnegative harmonic functions on the
disk $\{|z - a| < R\}$. Suppose also that $\sum u_n(a) < \infty$. Then
$$u(z) = \sum u_n(z)$$
converges for all $z$ in that disk, and $u$ is harmonic.
38. Show that
$$\sum_{n=0}^{\infty} \frac{(-1)^n}{2n + 1} = \frac{\pi}{4}$$
39. Verify the trigonometric identity
1 " sin(2W+l)fl
-+ 2cos2n6l=— д
2 n=i 2 sin ρ
(Яш(: The sum to be evaluated is
- + 2Re Σ (Ο")
Ζ η- 1
40. Using the identity of Problem 39, obtain Dinchlet's integral for the
partial sums of the Fourier series of/:
N
SN(&) = A0 + 2 (A cos ηθ + B„ sin ηθ)
1 f- 8ΐη(ΛΓ+»Μ
= 2^J_. sin^ /<* + 0d*
1 r"sin(Af+J)<i
Z7T J0 sin £<£
41 Using the Dinchlet integral (Problem 40), verify that for f a
continuous function on the circle which is differentiable at the point θ0, then
/(0o) =Ao+ Σ (A* cos ηθ0 + B„ sin ηθ0)
(Hint:
lr" / 1\ /(θ0 + φ)-/(θ0)
1 f" Г ^
= brijlnNt[C0S2
φ /(θ0 + φ)-/(θ0)
sin \φ
+ cos Νφ[/(θ0 +φ)-/(θ0)]α-φ
The expressions in brackets are continuous functions)
42 Solve the differential equation
X0)=0 XL)=0
by Fourier methods, where (a) g is constant, (b) g(x) = L — x.
43. Suppose we want to study the problem of heat transfer through a
homogeneous rod with insulated ends: heat does not flow through the ends.
If the rod is assumed to lie on the interval $0 \le x \le L$ this amounts to the
boundary conditions
$$\frac{\partial u}{\partial x}(0, t) = 0 = \frac{\partial u}{\partial x}(L, t)$$
The initial condition may be given either as the temperature distribution
$(u(x, 0))$ or the initial heat flow $[(\partial u/\partial t)(x, 0)]$. Show that with these
boundary conditions and either kind of initial condition the heat equation
(6.43) can be uniquely solved.
44. Solve the insulated end heat problem with (a) constant initial
temperature, (b) initial temperature = cos(x/L), (c) initial heat flow = x(x — L).
45. Suppose that $u$ is a real-valued function harmonic in the unit disk.
Show that $u$ is the real part of an analytic function. (Hint: Write $u(z) =
a_0 + \sum_{n=1}^{\infty} (a_{-n}\bar z^n + a_n z^n)$, and add a pure imaginary-valued harmonic
function with the same negative Fourier coefficients.)
46. If $u$ is harmonic and real-valued on the unit disk, there is a unique
harmonic real-valued function $v$ such that $v(0) = 0$ and $u + iv$ is analytic.
Using the relation between the Fourier expansions of $v$ and $u$ (Problem 45)
find an integral form for $v$ in terms of the boundary values of $u$.
47. If $u$ and $v$ are as in Problem 46, show that the families of curves
$\{u = \text{constant}\}$, $\{v = \text{constant}\}$ are orthogonal.
48. (The convolution transform) Let $g$ be a continuous function on the
circle and define the transformation $G: C(\Gamma) \to C(\Gamma)$ by
$$G(f)(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)g(\theta - \phi)\,d\phi$$
Show that (a) the eigenvalues of $G$ are the Fourier coefficients $\hat g(n)$ of $g$.
(b) The nonzero eigenvalues of $G$ form a sequence converging to zero.
(c) The eigenspaces associated to the nonzero eigenvalues are finite
dimensional.
(d) The Fourier series of $G(f)$ is
$$\sum \hat g(n)\hat f(n)e^{in\theta}$$
(e) $G(u) = f$ has a solution if and only if $f$ is orthogonal to the kernel
of $G$.
49. Under what conditions is the convolution transform (Problem 48)
symmetric; skew-symmetric; one-to-one?
The Laplace Transform
50. The Laplace transform is useful in the study of differential equations
defined on the positive real axis, $R^+$. If $f$ is a bounded function on $R^+$,
define
$$L(f)(s) = \int_0^{\infty} e^{-st}f(t)\,dt$$
Show that $L(f)$ is defined for all $s > 0$.
51. A function $f$ defined on $R^+$ is of exponential order $s_0$ if $\exp(-s_0t)f(t)$
is a bounded function. Show that for such a function, $L(f)$ is defined for
$s > s_0$.
52. Compute these Laplace transforms:
(a) $L(1)(s) = \dfrac{1}{s}$
(b) $L(t^n) = \dfrac{n!}{s^{n+1}}$
(c) $L(e^t) = \dfrac{1}{s - 1}$
(d) $L(e^{-t}) = \ ?$
53. Verify these properties of the Laplace transform:
(a) $L$ is linear.
(b) If, for $a > 0$,
$$f_a(x) = \begin{cases} f(x - a) & x > a \\ 0 & x < a \end{cases}$$
then $L(f_a) = e^{-as}L(f)$.
(c) $L(f') = sL(f) - f(0)$
54. Notice that by the above problems we see that the Laplace transform
of a polynomial is a polynomial in $1/s$. More generally, we might expect
at least that if $f$ is of exponential type, $L(f)(s) \to 0$ as $s \to \infty$. Is this true?
55. Because of property (c) in Problem 53, Laplace transformation
transforms a differential equation into an algebraic problem. For example,
suppose we want to solve $y'' + y = 1$, $y(0) = 0$, $y'(0) = 0$ on $R^+$. If $f$ is a
solution we must have
$$L(f') = sL(f) - f(0)$$
$$L(f'') = sL(f') - f'(0) = s^2L(f) - sf(0) - f'(0)$$
Thus, using the differential equation and the initial conditions
$$L(f'' + f) = L(1)$$
$$s^2L(f) + L(f) = \frac{1}{s}$$
$$L(f) = \frac{1}{s(s^2 + 1)} = \frac{1}{s} - \frac{1}{2}\left(\frac{1}{s + i}\right) - \frac{1}{2}\left(\frac{1}{s - i}\right)$$
Reading Problem 52(a)-(d) backward, we obtain the solution
$$f(t) = 1 - \tfrac{1}{2}\left(e^{it} + e^{-it}\right) = 1 - \cos t$$
Solve these equations by Laplace transformation:
(a) у +y = \ X0)=0 /(0) = 1
(b) у'-у = ег X0) = 1 /(0)=0
(c) у'-2у=е-г Х0) = 1
(d) у +3v' + 2y = e~' + t Х0) = 1 /(0)=0
56 Solve these systems by Laplace transformation:
(a) yi+y2=e~2t yi(0) = 0=y2(P)
yi + 2yi = 1.
(b) yl-}'2=yi »(0) = 0 /i(0)=l
y" + yi = 2y2 y2(0) = 1 yi(0) = 0
57. Define this convolution for continuous functions on $R^+$:
$$(f * g)(t) = \int_0^t f(\tau)g(t - \tau)\,d\tau$$
Show that $L(f * g) = L(f)L(g)$.
58. Solve the differential equation
$$y'' + 2y' + y = f(t) \qquad y(0) = 0 \qquad y'(0) = 1$$
where $f$ is of exponential type (recall Problem 51).
59. Show that the function of a complex variable
$$L(f)(z) = \int_0^{\infty} e^{-zt}f(t)\,dt$$
is complex differentiable (if $f$ is continuous and bounded) in the domain
$\{z \in C : \mathrm{Re}\, z > 0\}$.
The Fourier Transform
60 We now consider the collection of continuous functions defined on
all of R We shall discuss the Fourier transform, which is the analog of
Fourier series:
/periodic /defined on R
Λ") = γ f ГФ)е- '" dd /(0 = —^— f °° f{x)e- ■*■ άξ
Fourier series of/= У fyy F(/)(x) = —\— f °° /(0e<« ^
Of course, since we are working on an infinite interval, we want to be assured
that the Fourier transform is defined. The most appropriate class of
functions will come out of these observations:
(i) If $f$ is integrable on $R$, $\hat f$ is a bounded continuous function.
(ii) The Fourier transform $f \to \hat f$ is linear.
(iii) $(f')^{\wedge} = (i\xi)\hat f$
(iv) $(\hat f)' = (-ixf)^{\wedge}$
Since Fourier transformation interchanges the operations of differentiation
and multiplication, we select the class of functions such that the effect of all
such operations produces an integrable function. This is the Schwartz
class $S(R)$ of test functions: $f$ is a Schwartz function, $f \in S(R)$, if and only if
$f$ is $C^{\infty}$, and for all positive integers $n$ and $k$ the function
$$x^n\frac{d^kf}{dx^k}$$
is integrable on $R$. Show that $f \in S(R)$ implies $\hat f \in S(R)$. (Hint: "$(x^nf)^{(k)}$
integrable" implies "$(i\xi)^k\hat f^{(n)}$ bounded" implies "$\xi^{k-2}\hat f^{(n)}$ integrable".)
61 For /,je S(7?) define the convolution by
(/**(*)= f f(y)g(x-y)dy
Show that (/♦з)'1 = /£
62 Borrowing again from the theory of Fourier series, we should expect
that/(x) = F(/)(x):
/w-^D<*№* (6·67)
If we try to verify this directly we enter difficulties similar to that in Theorem
6.1: the integral on the right is
and we cannot apply Fubini's theorem since J elii'~') άξ does not exist.
The difficulty is overcome by introducing the convergence factor e~'w, then
letting у -+0 More precisely, define the Poisson transform of/, a function
defined on the upper half plane by
Pf(x + iy) = тг4тз f /(Oexpttf* - у |£|] άξ
(Ζπ) J _ α.
Prove these assertions:
(ι) if/is integrable on R,
and
hmPf(x + iy)=f(x)
y-*o
л QO
(Hint- Integrate ε\ρ[ιξ(χ — t) — у \ξ\] άξ over R+, R~, independently.
J -a>
The second statement follows as does the result of Theorem 6.1, since this
Poisson kernel y[(x — t)2 + y2]~l has the same behavior as that for the disk.)
(ii) Pf is harmonic in the upper half plane and thus solves Dirichlet's
problem there with the boundary values/.
(iii)if/eS(/?),
hm Pf(x + ,y) = F(f)(x)
У-.0
so the inversion formula (6.67) holds on S(R).
Chapter 7
LINE INTEGRALS AND
GREEN'S THEOREM
The basic idea of analysis is the suitable approximation of complicated
functions by simpler ones, such as linear functions. Thus a differentiable
function will be one that is, near every point in its domain of definition,
approximable by a linear function. It is our purpose to discover what
knowledge about the function is deducible from knowledge of this
approximation, called its differential. Two hundred years ago it might have been
said that the differential expresses the infinitesimal, or instantaneous behavior
of the function and the total behavior is the sum of its infinitesimal parts.
Nowadays, it is generally conceded that such an assertion is nonsense;
nevertheless it serves to describe the mood of the analyst as he begins his
investigations.
Up until now we have been mainly concerned with one-dimensional
calculus; although some of the applications have led us into the plane and
space, our techniques have been mainly one dimensional. In the present
chapter we turn to two dimensions, and in the next chapter we shall deal with
the calculus of three dimensions. Each dimension has its own flavor. In
one dimension, the order of the real numbers plays an important role; in two,
we have the influence of complex numbers; and in three, we discover the
vector product. However, there is also much that is the same in all these
dimensions, and for these common concepts there is much to be gained from
a unified treatment. Thus we begin the present chapter with a study of
differentiable $R^m$-valued functions of $n$ variables. We will be interested in
mappings from $R^1$ to $R^2$, $R^3$ to $R^2$, and so on, but the concept of
differentiability is the same in all cases and it is important for us to take cognizance
of that fact. An $R^m$-valued function $\mathbf{f}$ defined in a neighborhood of a point
$\mathbf{p}$ in $R^n$ will be said to be differentiable at $\mathbf{p}$ if it can be suitably approximated
near $\mathbf{p}$ by a linear transformation of $R^n$ to $R^m$. This definition will make
precise our usage up to now of the word differentiable. The transformation,
whose existence is required, is called the differential of $\mathbf{f}$ and is denoted
$d\mathbf{f}(\mathbf{p})$. We shall see that a differentiable $R^m$-valued function is an $m$-tuple
of differentiable real-valued functions. We have already studied such
functions in $R^2$, where we showed that if a function $f$ has continuous first
partial derivatives near $\mathbf{p}$, then it is differentiable there, and the differential
is given by
$$df(\mathbf{p}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(\mathbf{p})\,dx^i(\mathbf{p})$$
where $x^1, \ldots, x^n$ are the rectangular coordinate functions of $R^n$.
We have studied, in Chapter 1, some examples of coordinate systems for
$R^2$ and $R^3$. We shall want, in the subsequent chapters, to consider more
general kinds of coordinates. A coordinate system near a point $\mathbf{p}$ in $R^n$
arises in this way: if $\mathbf{F}$ is a continuously differentiable $R^n$-valued function
defined near $\mathbf{p}$, and the differential $d\mathbf{F}(\mathbf{p})$ is a nonsingular linear
transformation, then the functions
$$y^1 = F^1(\mathbf{x}), \ldots, y^n = F^n(\mathbf{x})$$
are coordinates in a neighborhood of $\mathbf{p}$. That is, the values of $y^1, \ldots, y^n$
serve to identify all points near $\mathbf{p}$. This fact, that the nonsingularity of the
differential implies that of the mapping, is called the inverse mapping theorem.
It asserts that the mapping $\mathbf{F}$ has an inverse near $\mathbf{p}$ when its differential at
$\mathbf{p}$ does.
Suppose that $f$ is a differentiable real-valued function defined in a domain
$D$. Then its differential associates to each point in $D$ a linear function on $R^n$.
Any rule which does this is called a differential form. An important question
which we shall study is this: just when is a differential form the differential of a
function? In one variable, this question is easily answered. For if $f$ is a
differentiable function of a real variable, its differential is given by
$$f'(x)\,dx$$
Any continuous differential form in one variable is of the form $g(x)\,dx$. We
know from the fundamental theorem of calculus that if $G$ is an indefinite
integral of $g$:
$$G(x) = \int_a^x g(t)\,dt$$
then $G$ is differentiable and $dG = g\,dx$. Thus the answer to our question
in one variable is: always. The situation in several variables is not so easy.
But the extension of the idea of integration to differential forms provides
us with a tool for answering this question, and a several variable analog of
the fundamental theorem (Green's theorem in $R^2$).
Green's theorem provides us with a tool to extensively study complex
differentiable functions. This is the Cauchy integral formula which gives a
means for determining such a function at interior points of a domain by its
boundary values. It follows easily from this formula (a generalization of the
formula given in Section 6.7) that a complex differentiable function must be
analytic: expressible as a convergent power series. In fact, the entire behavior
of such functions can be read off from the integral formula; this is the basis
of the Cauchy theory of complex variables. We shall only begin this study.
7.1 The Differential
In Chapter 2 we studied differentiation of real-valued functions of many
variables, differentiating with respect to one variable at a time. This gave
us the concept of partial derivatives, which generalized to the directional
derivatives $df(\mathbf{p}, \mathbf{v})$ of a function $f$ at a point $\mathbf{p}$ and in a direction $\mathbf{v}$.
According to Proposition 20 of Chapter 2, if the partial derivatives are continuous in a
neighborhood of $\mathbf{p}$, then the directional derivative $df(\mathbf{p}, \mathbf{v})$ varies linearly in $\mathbf{v}$.
This linear function we called the differential of $f$ at $\mathbf{p}$. Now we shall give
a more precise definition of this notion, in a style more like the definition of
the derivative of an $R^m$-valued function of a real variable (see Proposition 5
of Chapter 3).
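Definition 1. Let $\mathbf{p} \in R^n$, and suppose $\mathbf{f}$ is an $R^m$-valued function defined
on a neighborhood of $\mathbf{p}$. We say that $\mathbf{f}$ is differentiable at $\mathbf{p}$ if there is a
linear transformation $T: R^n \to R^m$ and a nonnegative real-valued function $\varepsilon$
of a real variable such that $\lim_{t \to 0} \varepsilon(t) = 0$ and
$$\|\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - T(\mathbf{v})\| \le \varepsilon(\|\mathbf{v}\|)\,\|\mathbf{v}\| \tag{7.1}$$
when $\|\mathbf{v}\|$ is sufficiently small.
If such a linear transformation exists it is called the differential of $\mathbf{f}$ at $\mathbf{p}$
and is denoted by $d\mathbf{f}(\mathbf{p})$.
Notice that there can be at most one linear transformation $T$ satisfying
these requirements. For suppose also $S: R^n \to R^m$ satisfies (7.1). Then
$$\|S(\mathbf{h}) - T(\mathbf{h})\| \le 2\varepsilon(\|\mathbf{h}\|)\,\|\mathbf{h}\|$$
for sufficiently small $\mathbf{h}$. Let $\mathbf{h} = t\mathbf{v}$ with $t$ small:
$$\|S(t\mathbf{v}) - T(t\mathbf{v})\| = |t|\,\|S(\mathbf{v}) - T(\mathbf{v})\| \le 2\varepsilon(\|t\mathbf{v}\|)\,|t|\,\|\mathbf{v}\|$$
thus $\|S(\mathbf{v}) - T(\mathbf{v})\| \le 2\varepsilon(\|t\mathbf{v}\|)\,\|\mathbf{v}\|$ for all small $t \ne 0$. Letting $t \to 0$, we obtain
$S(\mathbf{v}) = T(\mathbf{v})$. Thus $S = T$.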
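Definition 1 can be probed numerically: if $T$ is the differential, the quotient $\|f(\mathbf{p} + \mathbf{v}) - f(\mathbf{p}) - T(\mathbf{v})\| / \|\mathbf{v}\|$ must tend to zero with $\|\mathbf{v}\|$. The sketch below (Python, standard library only) does this for $f(x, y) = xy^2$ with the candidate differential $(h, k) \to y_0^2h + 2x_0y_0k$, the function treated by hand in the examples of this section. The helper names, the base point and the direction are illustrative assumptions.

```python
import math


def f(x, y):
    return x * y * y


def T(x0, y0, h, k):
    """Candidate differential of f at (x0, y0), applied to the vector (h, k)."""
    return y0 * y0 * h + 2 * x0 * y0 * k


def remainder_ratio(x0, y0, h, k):
    """|f(p + v) - f(p) - T(v)| / ||v||, which should shrink as ||v|| -> 0."""
    err = abs(f(x0 + h, y0 + k) - f(x0, y0) - T(x0, y0, h, k))
    return err / math.hypot(h, k)


x0, y0 = 1.5, -2.0
# Shrink v = s * (0.6, 0.8) along a fixed unit direction
ratios = [remainder_ratio(x0, y0, s * 0.6, s * 0.8) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

A decreasing sequence of ratios is of course only evidence, not a proof; the estimate in the definition has to hold for all small $\mathbf{v}$, not just along one direction.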
Examples
1. $f(x, y) = xy^2$ is differentiable in the plane. Let $(x_0, y_0) \in R^2$
and let $(h, k)$ be any vector. Then
$$f(x_0 + h, y_0 + k) = (x_0 + h)(y_0 + k)^2 = x_0y_0^2 + y_0^2h + 2x_0y_0k + 2y_0hk + x_0k^2 + hk^2$$
Thus
$$f(x_0 + h, y_0 + k) - f(x_0, y_0) - (y_0^2h + 2x_0y_0k) = 2y_0hk + x_0k^2 + hk^2$$
This in norm is dominated by
$$2|y_0||hk| + |x_0||k|^2 + |h||k|^2 \le 2|y_0|(h^2 + k^2) + |x_0|(h^2 + k^2) + |h|(h^2 + k^2)$$
$$\le \|(h, k)\|\left[\|(h, k)\|\,(2|y_0| + |x_0|) + \|(h, k)\|^2\right]$$
since $\|(h, k)\| = (h^2 + k^2)^{1/2}$.
Thus $xy^2$ is differentiable and has the differential at $(x_0, y_0)$:
$$(h, k) \to y_0^2h + 2x_0y_0k$$
This means that for small values of $(h, k)$, the difference
$$(x_0 + h)(y_0 + k)^2 - x_0y_0^2$$
is effectively approximable by
$$y_0^2h + 2x_0y_0k$$
The meaning of "effective" is that the error in this approximation is
of the order of $\varepsilon\|(h, k)\|$, where $\varepsilon$ can be made as small as we please, by
choosing the neighborhood of $(x_0, y_0)$ small enough.
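2. More generally, Proposition 20 of Chapter 2 suggests that
a real-valued function with continuous partial derivatives near
$\mathbf{p}_0$ is differentiable there. This means that for small values of $\mathbf{v}$,
$f(\mathbf{p}_0 + \mathbf{v}) - f(\mathbf{p}_0)$ is effectively approximable by $\langle\nabla f(\mathbf{p}_0), \mathbf{v}\rangle =
\sum \frac{\partial f}{\partial x^i}(\mathbf{p}_0)v^i$. Let us complete Proposition 20 of Chapter 2 to
a verification of this fact (at least in $R^2$). By the mean value theorem
we may write, for $\mathbf{p} = (x_0, y_0)$, $\mathbf{v} = (h, k)$:
$$f(\mathbf{p} + \mathbf{v}) - f(\mathbf{p}) = \frac{\partial f}{\partial x}(\xi_0, y_0)h + \frac{\partial f}{\partial y}(x_0 + h, \eta_0)k$$
where $|\xi_0 - x_0| \le |h|$, $|\eta_0 - y_0| \le |k|$. Then
$$\left|f(\mathbf{p} + \mathbf{v}) - f(\mathbf{p}) - \left[\frac{\partial f}{\partial x}(\mathbf{p})h + \frac{\partial f}{\partial y}(\mathbf{p})k\right]\right| \le \left|\frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p})\right||h| + \left|\frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p})\right||k| \tag{7.2}$$
where $\mathbf{p}_1$, $\mathbf{p}_2$ are at least as close to $\mathbf{p}$ as $\mathbf{p} + \mathbf{v}$. By Schwarz's
inequality (7.2) is dominated by
$$\left[\left|\frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p})\right|^2 + \left|\frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p})\right|^2\right]^{1/2}\|\mathbf{v}\|$$
and the first factor is dominated by
$$\varepsilon(\|\mathbf{v}\|) = \max\left\{\left[\left|\frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p})\right|^2 + \left|\frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p})\right|^2\right]^{1/2} : \text{all } \mathbf{p}_1, \mathbf{p}_2 \text{ in the ball } B(\mathbf{p}, \|\mathbf{v}\|)\right\}$$
which tends to zero as $\|\mathbf{v}\| \to 0$.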
3. Error analysis. The differential of a function gives us
approximately the difference between two values of a function in terms of the
difference between the variables:
$$f(\mathbf{x}) - f(\mathbf{x}_0) = \langle\nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0\rangle + \text{error} \tag{7.3}$$
where the error is negligible if the difference is small. Considered this
way, the differential may be used to compute tolerance levels for
errors in measurement. For example, we can compute the maximal
error in the volume of a rectangular box, given certain tolerances in
the measurements of the sides. Suppose the sides can be measured
within an error of 2%. The function we are concerned with is
$f(x, y, z) = xyz$ and $\nabla f = (yz, xz, xy)$. The error in the
measurement of a volume will be, according to (7.3), approximately equal to
$$\langle(yz, xz, xy), 0.02(x, y, z)\rangle = 3[0.02(xyz)]$$
Thus, the percentage error is
$$100\,\frac{f(\mathbf{x}) - f(\mathbf{x}_0)}{f(\mathbf{x}_0)} = 100\,\frac{0.06\,(xyz)}{xyz} = 6\%$$
Thus an error is magnified threefold.
4. Let $f(x, y, z) = x(\cos y)e^{x+z}$. Given error tolerances of 2%,
1%, 5% in the measurements of $x$, $y$, $z$, respectively, what error is
possible in the computation of $f$?
Here
$$\nabla f = \left((\cos y)e^{x+z}(1 + x),\ -x(\sin y)e^{x+z},\ x(\cos y)e^{x+z}\right)$$
The ratio of the increment in $f$ to the computed value of $f$ is
approximately
$$\frac{\langle\nabla f(x, y, z), (0.02x, 0.01y, 0.05z)\rangle}{f(x, y, z)} = (1 + x)(0.02) - y(\tan y)(0.01) + (0.05)z$$
Here we see that the error in the computed value of $f$ depends on
the magnitude of the variables. If $y$ is close to $\pi/2$, the error is very
bad. The maximum percent error for values of $x$, $y$, $z$ in these
ranges: $|x| \le 1$, $|y| \le \pi/4$, $|z| \le 1$, is
$$2(2) + \frac{\pi}{4}(1) + 5(1) = 9 + \frac{\pi}{4}$$
which is less than 10.
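The two error estimates above (6% for the box, and at most $9 + \pi/4$ percent for $x(\cos y)e^{x+z}$) can be reproduced with a few lines of arithmetic. This is a minimal sketch in Python's standard library. The function names and the sample box dimensions are assumptions made for illustration, and the second function returns the bound obtained by taking absolute values term by term.

```python
import math


def relative_error_box(tol, sides=(2.0, 3.0, 5.0)):
    """First-order relative error of the volume xyz when each side is off by the fraction tol."""
    x, y, z = sides
    grad = (y * z, x * z, x * y)
    increment = sum(g * tol * s for g, s in zip(grad, sides))
    return increment / (x * y * z)


def relative_error_bound(x, y, z, tx=0.02, ty=0.01, tz=0.05):
    """Bound on |<grad f, (tx*x, ty*y, tz*z)> / f| for f = x cos(y) e^(x+z)."""
    return abs(1 + x) * tx + abs(y * math.tan(y)) * ty + abs(z) * tz


box = relative_error_box(0.02)
worst = relative_error_bound(1.0, math.pi / 4, 1.0)
print(100 * box, 100 * worst)
```

The worst case is evaluated at the corner $x = 1$, $y = \pi/4$, $z = 1$ of the stated ranges, where each of the three terms is largest.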
5. A linear transformation is differentiable at every point. Let
$T: R^n \to R^m$ be a given linear transformation, and let $\mathbf{p} \in R^n$. Since
$$T(\mathbf{p} + \mathbf{v}) - T(\mathbf{p}) = T(\mathbf{v})$$
we have $\|T(\mathbf{p} + \mathbf{v}) - T(\mathbf{p}) - T(\mathbf{v})\| = 0$, so the estimate required by
the definition is precise. Furthermore, for any $\mathbf{p} \in R^n$, $dT(\mathbf{p}) = T$.
In particular, the coordinate functions x1, ..., x" are differentiable and
dx'(p, v) = v' for any p, v. Since dx' is independent of the base point we
shall often omit it. Notice, that dx1, ..., dx" form a basis for the space of
linear functions on R", so the differential of any function will be a linear
combination of these differentials. In particular, if/is differentiable at p,
we have
d№ = t |i(P)<fr' (7·4)
1=1 OX
We have just shown that in two dimensions, but it is easier to directly compare
Definition 1 of this chapter and Definition 14 of Chapter 2 (cf. Problem 1)
to obtain
rf/(P)(E.) = lim / = ^ (p)
The verification of the following proposition concerning the behavior of
the differential under algebraic operations is easily performed.
Proposition 1.
(i) Suppose that f, g are $R^m$-valued functions differentiable at p. Then
$f + g$ and $\langle f, g \rangle$ are also differentiable and
\[ d(f + g)(p) = df(p) + dg(p) \]
\[ d\langle f, g \rangle(p) = \langle df(p), g(p) \rangle + \langle f(p), dg(p) \rangle \]
(ii) Suppose $f = (f^1, \ldots, f^m)$ is an $R^m$-valued function defined in a
neighborhood of p. f is differentiable at p if and only if $f^1, \ldots, f^m$ are. In this case we
have
\[ df(p) = (df^1(p), \ldots, df^m(p)) \]
Proof. We shall only verify the differentiability of $\langle f, g \rangle$; the other assertions
are clear. By the hypothesis of (i) there are functions $\varepsilon$, $\eta$ of a real variable such
that $\lim \varepsilon(t) = \lim \eta(t) = 0$ as $t \to 0$, and linear transformations R, S such that
\[ \|f(p + v) - f(p) - R(v)\| \le \varepsilon(\|v\|)\,\|v\| \tag{7.5} \]
\[ \|g(p + v) - g(p) - S(v)\| \le \eta(\|v\|)\,\|v\| \tag{7.6} \]
Let $h(x) = \langle f(x), g(x) \rangle$. Then
\[ h(p + v) - h(p) = \langle f(p + v), g(p + v) \rangle - \langle f(p), g(p) \rangle \]
\[ = \langle f(p + v) - f(p),\, g(p + v) \rangle + \langle f(p),\, g(p + v) - g(p) \rangle \tag{7.7} \]
If we replace $f(p + v) - f(p)$ in the first term by R(v) we commit an error of at most
$\varepsilon(\|v\|)\,\|v\|\,\|g(p + v)\|$, and if we replace $g(p + v) - g(p)$ in the last term by S(v)
we commit an error of at most $\eta(\|v\|)\,\|v\|\,\|f(p)\|$. These
are admissible errors, so we shall bravely proceed with these replacements. From
(7.7), we obtain
\[ |h(p + v) - h(p) - (\langle R(v), g(p) \rangle + \langle f(p), S(v) \rangle)| \]
\[ \le |\langle f(p + v) - f(p) - R(v),\, g(p + v) \rangle| + |\langle R(v),\, g(p + v) - g(p) \rangle| \]
\[ \quad + |\langle f(p),\, g(p + v) - g(p) - S(v) \rangle| \]
\[ \le \varepsilon(\|v\|)\,\|v\|\,\|g(p + v)\| + \|R(v)\|\,(\|S(v)\| + \eta(\|v\|)\,\|v\|) + \|f(p)\|\,\eta(\|v\|)\,\|v\| \]
If we take M larger than the maximum value of $\|g(p + v)\|$, and also larger than
$\|R\|$ and $\|S\|$, this is dominated by
\[ [M\varepsilon(\|v\|) + M^2\|v\| + M\|v\|\eta(\|v\|) + \|f(p)\|\eta(\|v\|)]\,\|v\| \]
which is of the desired form.
Examples
6. $f(x, y) = e^x\cos y + y^x$.
\[ df(x, y) = (e^x\cos y + y^x\log y)\,dx + (-e^x\sin y + xy^{x-1})\,dy \]
7. f(x, y, z) = xyz, df(x, y, z) = yz dx + xz dy + xy dz.
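A differential computed by hand can be sanity-checked against difference quotients. The sketch below does this for Example 6; the function names, the step size, and the sample point are assumptions for illustration, and the formula requires $y > 0$ so that $y^x$ and $\log y$ are defined.

```python
import math

def f(x, y):
    return math.exp(x) * math.cos(y) + y ** x

def df_coefficients(x, y):
    # coefficients of dx and dy from Example 6 (requires y > 0)
    a = math.exp(x) * math.cos(y) + y ** x * math.log(y)
    b = -math.exp(x) * math.sin(y) + x * y ** (x - 1)
    return a, b

def central_difference(g, x, y, h=1e-6):
    # symmetric difference quotients approximating the two partials
    gx = (g(x + h, y) - g(x - h, y)) / (2 * h)
    gy = (g(x, y + h) - g(x, y - h)) / (2 * h)
    return gx, gy

print(df_coefficients(0.7, 1.3))
print(central_difference(f, 0.7, 1.3))
```

The two printed pairs should agree to several decimal places; a mismatch would point to an algebra slip in the hand computation.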
• EXERCISES
1. Find the differential of these functions:
(a) $y\cos x + \sin zx$
(b) cosO"') + cosO?')-
(c) exp<x, a>.
(d) <x, exp<x, a> >.
(e) x2 + y2 + ζχ.
(f) (x-^)e*+y
(g) Щ.хдс'.
2. For each of the following functions, in how large an interval about the
origin may we estimate $f(v) - f(0)$ by $\langle \nabla f(0), v \rangle$ incurring an error of at
most $10^{-3}\|v\|$?
(a) xy (d) sin(x + 2y)
(b) ex+> (e) x + e2'
(c) sin χ + cos у (f) exp(x2 + y2)
3. In how large a disk about the point $p \ne 0$ can we estimate the polar
coordinates of nearby points $p + v$ by a linear function, with an error of
at most $10^{-3}\|v\|$?
• PROBLEMS
1. Suppose that f is a differentiable real-valued function defined in a
neighborhood of p in $R^n$. Using the definition, verify that
\[ df(p)(E_i) = \lim_{t \to 0} \frac{f(p + tE_i) - f(p)}{t} = \frac{\partial f}{\partial x^i}(p) \]
and conclude that
\[ df(p) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(p)\, dx^i \]
2. Let M(x), N(x) be $n \times n$ matrix-valued functions of the variable x. If
M, N are differentiable at p, so is MN. Show that $d(MN)(p) =
dM(p)N + M\,dN(p)$.
3. If $f(t) = \det(\exp(Mt))$, show that $f'(0) = \operatorname{tr} M$.
4. A quantity Q varies with x, y, z according to
e'
Q = -
Suppose that x, y, z can be measured to within an error of 1%, 1/2%, 3%,
respectively. What will be the corresponding maximal error in Q at
corresponding values? At (a) x = 0, y = 2, z = 5, (b) x = 2, y = 1, z = 3, in
particular?
7.2 Coordinate Changes
In Chapter 1 we introduced some systems of coordinates in $R^n$, and we saw
that for certain problems a change of coordinates made the problem
understandable and solvable. Later on we saw, in the study of systems of linear
differential equations, that it was convenient, where possible, to switch to
coordinates relative to a basis of eigenvectors. In the geometric study of
surfaces, and in many physical problems, it is advantageous to admit very
general coordinate changes. We now introduce a general notion of
coordinates.
Definition 2. Let U be a domain in $R^n$. A system of coordinates is an
n-tuple of continuously differentiable functions $y = (y^1, \ldots, y^n)$ defined on
U such that
(i) if $p \ne q$, then $y(p) \ne y(q)$,
(ii) $dy^1(p), \ldots, dy^n(p)$ are independent at all $p \in U$.
The first condition states that any point is uniquely determined by the
value of y at that point. In this sense $y^1, \ldots, y^n$ are coordinates. We can
name points in U by means of the functions $y^1, \ldots, y^n$. Further, if f is a
function defined on U, we can describe it as a function of the coordinates
$y^1, \ldots, y^n$. The second condition asserts that the differentials $dy^1, \ldots, dy^n$
span the space of linear functions. Thus we can express the differential of a
function as a linear combination of these differentials; it should be no
surprise that (7.4) is valid in any coordinate system.
Proposition 2. Suppose that $y^1, \ldots, y^n$ are coordinates in a neighborhood
of p. If f is a differentiable function defined in a neighborhood of p, then
\[ df(p) = \sum_{i=1}^{n} \frac{\partial f}{\partial y^i}(p)\, dy^i(p) \]
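Proof. Let $x^1, \ldots, x^n$ be the coordinates of $R^n$ relative to the standard basis. We
know that
\[ df(p) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(p)\, dx^i \]
Now we can express the standard coordinates as differentiable functions of the new
coordinates $y^1, \ldots, y^n$: $x^i = x^i(y^1, \ldots, y^n)$, $i = 1, \ldots, n$, and f can be expressed as a
function of $y^1, \ldots, y^n$ by composition:
\[ f(p) = f(x^1(y(p)), \ldots, x^n(y(p))) \]
Let us assume that p is the origin relative to both x and y coordinates. Now
$\partial f/\partial y^i$ is the derivative of f with respect to $y^i$, holding the other variables $y^j$, $j \ne i$,
constant. In other words, $\partial f/\partial y^i(p)$ is the derivative of f at p along the curve
$y^j = 0$, $j \ne i$. We can parametrize this curve by
\[ x^1 = g^1(t) = x^1(0, \ldots, 0, t, 0, \ldots, 0) \]
\[ \vdots \]
\[ x^n = g^n(t) = x^n(0, \ldots, 0, t, 0, \ldots, 0) \]
for t near 0, where t occupies the ith place. Now by Proposition 3 of Chapter 3, we have
\[ \frac{\partial f}{\partial y^i}(p) = \frac{d}{dt} f(g^1(t), \ldots, g^n(t))\Big|_{t=0} = \sum_{k=1}^{n} \frac{\partial f}{\partial x^k}(0)\, \frac{dg^k}{dt}(0) \]
But $dg^k/dt(0) = \partial x^k/\partial y^i(0)$. Thus
\[ \frac{\partial f}{\partial y^i}(p) = \sum_{k=1}^{n} \frac{\partial f}{\partial x^k}(p)\, \frac{\partial x^k}{\partial y^i}(p) \tag{7.8} \]
As the $x^i$ are differentiable functions of y, $dx^i = \sum_j (\partial x^i/\partial y^j)\, dy^j$ and we conclude
that
\[ df = \sum_{i} \frac{\partial f}{\partial x^i}\, dx^i = \sum_{i,j} \frac{\partial f}{\partial x^i}\, \frac{\partial x^i}{\partial y^j}\, dy^j = \sum_{j} \frac{\partial f}{\partial y^j}\, dy^j \]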
Examples
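9. Polar coordinates: the change of coordinates
\[ x = r\cos\theta \qquad y = r\sin\theta \]
is valid in any disk not containing the origin. We have
\[ dx = \cos\theta\, dr - r\sin\theta\, d\theta, \qquad dy = \sin\theta\, dr + r\cos\theta\, d\theta \]
so
\[ \frac{\partial x}{\partial r} = \cos\theta \qquad \frac{\partial x}{\partial\theta} = -r\sin\theta \qquad \frac{\partial y}{\partial r} = \sin\theta \qquad \frac{\partial y}{\partial\theta} = r\cos\theta \]
If f is any differentiable function,
\[ \frac{\partial f}{\partial r} = \cos\theta\, \frac{\partial f}{\partial x} + \sin\theta\, \frac{\partial f}{\partial y} \]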
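10. Spherical coordinates:
\[ x = r\cos\theta\cos\varphi \qquad y = r\sin\theta\cos\varphi \qquad z = r\sin\varphi \]
\[ dx = \cos\theta\cos\varphi\, dr - r\sin\theta\cos\varphi\, d\theta - r\cos\theta\sin\varphi\, d\varphi \]
\[ dy = \sin\theta\cos\varphi\, dr + r\cos\theta\cos\varphi\, d\theta - r\sin\theta\sin\varphi\, d\varphi \]
\[ dz = \sin\varphi\, dr + r\cos\varphi\, d\varphi \]
If f is differentiable,
\[ \frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r}
   = \cos\theta\cos\varphi\, \frac{\partial f}{\partial x} + \sin\theta\cos\varphi\, \frac{\partial f}{\partial y} + \sin\varphi\, \frac{\partial f}{\partial z} \]
\[ \frac{\partial f}{\partial\theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial\theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\theta} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\theta}
   = r\left[ -\sin\theta\cos\varphi\, \frac{\partial f}{\partial x} + \cos\theta\cos\varphi\, \frac{\partial f}{\partial y} \right] \]
\[ \frac{\partial f}{\partial\varphi} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial\varphi} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial\varphi} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial\varphi}
   = r\left[ -\cos\theta\sin\varphi\, \frac{\partial f}{\partial x} - \sin\theta\sin\varphi\, \frac{\partial f}{\partial y} + \cos\varphi\, \frac{\partial f}{\partial z} \right] \]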
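11. Let $f(x, y, z) = e^x yz$. Find $\partial f/\partial r$, $\partial f/\partial\theta$:
\[ \frac{\partial f}{\partial r} = (\cos\theta\cos\varphi)e^x yz + (\sin\theta\cos\varphi)e^x z + (\sin\varphi)e^x y \]
\[ = r\exp(r\cos\theta\cos\varphi)\,(r\sin\theta\cos\theta\cos^2\varphi\sin\varphi + 2\sin\theta\cos\varphi\sin\varphi) \]
\[ \frac{\partial f}{\partial\theta} = r[(-\sin\theta\cos\varphi)e^x yz + (\cos\theta\cos\varphi)e^x z] \]
\[ = r^2\exp(r\cos\theta\cos\varphi)\,[-r\sin^2\theta\cos^2\varphi\sin\varphi + \cos\theta\cos\varphi\sin\varphi] \]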
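The closed forms in Example 11 can be compared with difference quotients taken directly in the spherical variables. This is a small sketch under stated assumptions: the function names, step size, and test point are illustrative choices, and the coordinate convention is the one of Example 10 ($z = r\sin\varphi$).

```python
import math

def f_spherical(r, theta, phi):
    # f(x, y, z) = e^x * y * z written in the spherical coordinates of Example 10
    x = r * math.cos(theta) * math.cos(phi)
    y = r * math.sin(theta) * math.cos(phi)
    z = r * math.sin(phi)
    return math.exp(x) * y * z

def df_dr(r, theta, phi):
    s, c = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return r * math.exp(r * c * cp) * (r * s * c * cp ** 2 * sp + 2 * s * cp * sp)

def df_dtheta(r, theta, phi):
    s, c = math.sin(theta), math.cos(theta)
    sp, cp = math.sin(phi), math.cos(phi)
    return r ** 2 * math.exp(r * c * cp) * (-r * s ** 2 * cp ** 2 * sp + c * cp * sp)

def quotient(g, args, index, h=1e-6):
    # central difference quotient in the variable numbered `index`
    up = list(args); up[index] += h
    down = list(args); down[index] -= h
    return (g(*up) - g(*down)) / (2 * h)

point = (1.2, 0.5, 0.4)
print(df_dr(*point), quotient(f_spherical, point, 0))
print(df_dtheta(*point), quotient(f_spherical, point, 1))
```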
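12. Find $\partial f/\partial x$ if $f(r, \theta, \varphi) = \varphi^2$ in spherical coordinates. In order
to solve this we have to write f explicitly as a function of the rectangular
coordinates. Since $\varphi = \arcsin(z/r)$,
\[ \frac{\partial f}{\partial x} = 2\varphi\, \frac{\partial\varphi}{\partial x}
   = 2\arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}}\; \frac{\partial}{\partial x}\left[ \arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}} \right] \]
Carrying out the differentiation gives
\[ \frac{\partial f}{\partial x} = -\frac{2xz}{(x^2 + y^2 + z^2)(x^2 + y^2)^{1/2}}\, \arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}} \]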
The Jacobian
In general, if
\[ y^1 = f^1(x^1, \ldots, x^n) \]
\[ \vdots \]
\[ y^n = f^n(x^1, \ldots, x^n) \]
is a change of coordinates, we shall write this as $y = F(x)$. The differential
$dF(x_0)$ is a nonsingular linear transformation on $R^n$. The matrix relative
to x coordinates representing this transformation is referred to as the Jacobian
of the mapping and denoted (when it is of value to make the coordinates
explicit) by
\[ \frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)} = \left( \frac{\partial y^i}{\partial x^j} \right) \]
According to Proposition 2
\[ \frac{\partial y^i}{\partial y^j} = \sum_{k=1}^{n} \frac{\partial y^i}{\partial x^k}\, \frac{\partial x^k}{\partial y^j} \qquad i, j = 1, \ldots, n \]
which is just the entry by entry form of the equation
\[ I = \frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)}\; \frac{\partial(x^1, \ldots, x^n)}{\partial(y^1, \ldots, y^n)} \]
Thus the matrices are inverse to each other as are the corresponding
differentials:
\[ dF^{-1}(y_0) = [dF(x_0)]^{-1} \qquad \text{if } y_0 = F(x_0) \]
Example
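13. Let
\[ u = x + e^y \]
\[ v = x\cos y \]
be a coordinate change in a domain in $R^2$. Then
\[ \frac{\partial(u, v)}{\partial(x, y)} = \begin{pmatrix} 1 & e^y \\ \cos y & -x\sin y \end{pmatrix} \]
\[ \frac{\partial(x, y)}{\partial(u, v)} = \frac{-1}{e^y\cos y + x\sin y} \begin{pmatrix} -x\sin y & -e^y \\ -\cos y & 1 \end{pmatrix} \]
If $f(u, v) = u^2 + v^2$, then
\[ \frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = 2u + 2v\cos y = 2(x + e^y + x\cos^2 y) \]
If $g(x, y) = x^2 + y^2$, then
\[ \frac{\partial g}{\partial u} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial u} = \frac{2(x^2\sin y + y\cos y)}{e^y\cos y + x\sin y} \]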
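That the two Jacobian matrices of this example are inverse to each other can be confirmed by multiplying them at a sample point. The sketch below is illustrative only: the helper names and the point (0.5, 0.3) are assumptions, and plain nested lists stand in for matrices.

```python
import math

def jacobian(x, y):
    # d(u, v)/d(x, y) for u = x + e^y, v = x cos y
    return [[1.0, math.exp(y)],
            [math.cos(y), -x * math.sin(y)]]

def inverse_jacobian(x, y):
    # d(x, y)/d(u, v): the closed form of the example, signs multiplied through
    d = math.exp(y) * math.cos(y) + x * math.sin(y)
    return [[x * math.sin(y) / d, math.exp(y) / d],
            [math.cos(y) / d, -1.0 / d]]

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dg_du(x, y):
    # chain rule for g = x^2 + y^2, using the first column of the inverse
    inv = inverse_jacobian(x, y)
    return 2 * x * inv[0][0] + 2 * y * inv[1][0]

print(matmul2(jacobian(0.5, 0.3), inverse_jacobian(0.5, 0.3)))
print(dg_du(0.5, 0.3))
```

The product should be the identity matrix up to rounding, wherever $e^y\cos y + x\sin y \ne 0$.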
These observations form special cases of the multivariable chain rule. We
have already seen (Propositions 3.2, 3.3) other special cases. The general
situation is this: the differential of a composed function (see Figure 7.1) is the
composition of the differentials:
\[ d(g \circ f)(p) = dg(f(p)) \circ df(p) \tag{7.9} \]
In coordinates this is easy to compute by linear algebra. Let $x^1, \ldots, x^n$
be coordinates in $R^n$, $y^1, \ldots, y^m$ in $R^m$ and $z^1, \ldots, z^p$ in $R^p$. Then f and g are
given in coordinates by
\[ f:\; y^i = f^i(x^1, \ldots, x^n) \qquad 1 \le i \le m \]
\[ g:\; z^j = g^j(y^1, \ldots, y^m) \qquad 1 \le j \le p \]
Let $h = g \circ f$. Then h is given by the p-tuple of functions
\[ z^j = h^j(x^1, \ldots, x^n) = g^j(f^1(x^1, \ldots, x^n), \ldots, f^m(x^1, \ldots, x^n)) \]
(7.9) is the same as all these equations
\[ \frac{\partial h^j}{\partial x^i}(p) = \sum_{k=1}^{m} \frac{\partial g^j}{\partial y^k}(f(p))\, \frac{\partial f^k}{\partial x^i}(p) \qquad 1 \le j \le p,\; 1 \le i \le n \tag{7.10} \]
This is true since dg(f(p)), df(p) are represented by the matrices
\[ \left( \frac{\partial g^j}{\partial y^k}(f(p)) \right), \qquad \left( \frac{\partial f^k}{\partial x^i}(p) \right) \]
respectively. We can rewrite (7.9) and (7.10) again in matrix form. The
Jacobian of a composition is the product of the Jacobians, and (7.9), (7.10)
become
\[ \frac{\partial(z^1, \ldots, z^p)}{\partial(x^1, \ldots, x^n)}(p) = \frac{\partial(z^1, \ldots, z^p)}{\partial(y^1, \ldots, y^m)}(f(p))\; \frac{\partial(y^1, \ldots, y^m)}{\partial(x^1, \ldots, x^n)}(p) \]
Here is the proof of the chain rule.
Theorem 7.1. (The Chain Rule) Let p be a point in $R^n$. Suppose f is a
differentiable $R^m$-valued function defined in a neighborhood of p, and g is a
differentiable $R^p$-valued function defined in a neighborhood of f(p). Then
$h = g \circ f$ is differentiable at p and $dh(p) = dg(f(p)) \circ df(p)$.
Proof. Let $T = dg(f(p))$, and $S = df(p)$. We must show that
\[ \|h(p + v) - h(p) - T \circ S(v)\| \le \varepsilon(v)\,\|v\| \tag{7.11} \]
where $\lim_{\|v\| \to 0} \varepsilon(v) = 0$. Let
\[ \varphi(v) = f(p + v) - f(p) - S(v) \tag{7.12} \]
\[ \psi(w) = g(f(p) + w) - g(f(p)) - T(w) \tag{7.13} \]
Then, since f, g are differentiable,
\[ \|\varphi(v)\| \le \delta(v)\,\|v\| \qquad \|\psi(w)\| \le \eta(w)\,\|w\| \]
where $\delta(v) \to 0$ as $\|v\| \to 0$ and $\eta(w) \to 0$ as $\|w\| \to 0$. Now, we verify (7.11) by
computation:
\[ h(p + v) - h(p) = g(f(p + v)) - g(f(p)) \]
\[ = g(f(p)) + T(f(p + v) - f(p)) + \psi(f(p + v) - f(p)) - g(f(p)) \]
(by taking $w = f(p + v) - f(p)$ in (7.13)). Now using (7.12) we can continue:
\[ = T(S(v) + \varphi(v)) + \psi(S(v) + \varphi(v)) \]
\[ = T \circ S(v) + T(\varphi(v)) + \psi(S(v) + \varphi(v)) \]
since T is linear. Thus
\[ \|h(p + v) - h(p) - T \circ S(v)\| \le \frac{\|T(\varphi(v))\| + \|\psi(S(v) + \varphi(v))\|}{\|v\|}\, \|v\| \]
Now we must show that
\[ \frac{\|T(\varphi(v))\|}{\|v\|} + \frac{\|\psi(S(v) + \varphi(v))\|}{\|v\|} \to 0 \]
as $\|v\| \to 0$. As for the first term
\[ \frac{\|T(\varphi(v))\|}{\|v\|} \le \|T\|\, \frac{\|\varphi(v)\|}{\|v\|} \le \|T\|\,\delta(v) \]
which tends to zero as $v \to 0$, so that is all right. The second term is
\[ \frac{\|\psi(S(v) + \varphi(v))\|}{\|v\|} \le \eta(S(v) + \varphi(v))\, \frac{\|S(v) + \varphi(v)\|}{\|v\|}
   \le \eta(S(v) + \varphi(v))\,(\|S\| + \delta(v)) \]
As $v \to 0$, so does $S(v) + \varphi(v) \to 0$, and also $\eta(S(v) + \varphi(v)) \to 0$. The final
parenthesis is bounded so the whole term tends to zero. We are through.
Finally, we wish to give a sufficient condition that an n-tuple of functions
$y^1 = f^1(x), \ldots, y^n = f^n(x)$ gives a coordinate system in a domain D in $R^n$.
If $y^1, \ldots, y^n$ are coordinates, then we can invert these equations, that is,
since the y's suffice to determine points in D, we can compute the x
coordinates in terms of $y^1, \ldots, y^n$. Thus there are functions $x^1 = g^1(y), \ldots, x^n =
g^n(y)$ such that
\[ x = g(y) \quad \text{if and only if} \quad y = f(x) \]
in the domain D. Now the second condition defining a coordinate system
is that the differentials $df^1, \ldots, df^n$ are independent. The inverse mapping
theorem asserts that if this second condition is valid at a point, then the first
must hold in a neighborhood of that point. Thus the independence of the
vectors $df^1(p), \ldots, df^n(p)$ is enough to guarantee that $y^1, \ldots, y^n$ are
coordinates near p.
Theorem 7.2. (Inverse Function Theorem) Let F be a continuously differentiable
$R^n$-valued function defined in a neighborhood of $p_0$ in $R^n$. Let $q_0 = F(p_0)$.
If the differential $dF(p_0)$ is nonsingular, then there is a neighborhood U of $q_0$
and a continuously differentiable mapping G defined on U such that $G(q_0) = p_0$
and for each q in U
\[ F(p) = q \quad \text{if and only if} \quad p = G(q) \]
Proof. Let us, for simplicity of notation, assume that $p_0 = q_0 = 0$. We have to
show that if q is small enough the equation
\[ F(p) - q = 0 \]
has a unique solution p in a neighborhood of 0. This suggests Newton's method
for finding roots. The linear approximation to the mapping $p \to F(p) - q$ at a
point $p_1$ is given in terms of the differential:
\[ p \to F(p_1) - q + dF(p_1)(p - p_1) \tag{7.14} \]
If $p_1$ is near enough to 0, $dF(p_1)$ is nonsingular, so we can find a root of (7.14),
namely,
\[ p = p_1 - dF(p_1)^{-1}[F(p_1) - q] \tag{7.15} \]
Now we consider the transformation $T_q$ defined in a neighborhood of 0 by
\[ T_q(p) = p - dF(0)^{-1}[F(p) - q] \tag{7.16} \]
[For simplicity we have replaced $dF(p_1)$ in (7.15) by $dF(0)$.] It is shown below, in
Lemma 1, that for q sufficiently small, $T_q$ is a contraction in a neighborhood of 0.
Thus, for each q near 0, $T_q$ has a unique fixed point, which we denote G(q). Clearly,
$F(p) = q$ if and only if p is the fixed point of $T_q$, that is, if and only if $p = G(q)$. It
remains only to verify that G is differentiable.
Let $q_0$ be a point near 0, and $p_0 = G(q_0)$. Let $T = dF(p_0)$. Then, by definition
\[ F(p) - F(p_0) = T(p - p_0) + \varphi(p - p_0) \tag{7.17} \]
where $\|\varphi(p - p_0)\| \le \varepsilon(p - p_0)\,\|p - p_0\|$ and $\varepsilon(t) \to 0$ as $t \to 0$. Let $p = G(q)$.
Then (7.17) becomes
\[ q - q_0 = T(G(q) - G(q_0)) + \varphi(G(q) - G(q_0)) \]
Since T is invertible this can be rewritten as
\[ G(q) - G(q_0) = T^{-1}(q - q_0) - T^{-1}\varphi(G(q) - G(q_0)) \tag{7.18} \]
If we can successfully study the behavior of the last term we will have verified the
differentiability of G at $q_0$, with
\[ dG(q_0) = T^{-1} = dF(p_0)^{-1} \]
But (7.18) gives us
\[ \|G(q) - G(q_0)\| \le \|T^{-1}\|\,\|q - q_0\| + \|T^{-1}\|\,\varepsilon(G(q) - G(q_0))\,\|G(q) - G(q_0)\| \tag{7.19} \]
Since G is continuous (by Problem 10), we may choose q so close to $q_0$ that the last
term is dominated by $\tfrac{1}{2}\|G(q) - G(q_0)\|$. Then (7.19) implies
\[ \|G(q) - G(q_0)\| \le 2\|T^{-1}\|\,\|q - q_0\| \]
and (7.18) produces this inequality which guarantees differentiability:
\[ \|G(q) - G(q_0) - T^{-1}(q - q_0)\| \le \|T^{-1}\|\,\|\varphi(G(q) - G(q_0))\| \]
\[ \le \|T^{-1}\|\,\varepsilon(G(q) - G(q_0))\,\|G(q) - G(q_0)\| \]
\[ \le 2\|T^{-1}\|^2\,\varepsilon(G(q) - G(q_0))\,\|q - q_0\| \]
and certainly $\lim_{q \to q_0} 2\|T^{-1}\|^2\,\varepsilon(G(q) - G(q_0)) = 0$.
Here is the lemma which guarantees that the $T_q$ are contractions for q
near enough to 0:
Lemma 1. Given the hypotheses of Theorem 7.2, there is a $\delta > 0$ such that
for $q \in B(0, \delta)$, the map
\[ T(p) = p - dF(0)^{-1}(F(p) - q) \]
is a contraction on $B(0, \delta)$.
Proof. Let p, p' be two points near 0 and consider the function
\[ h(t) = p + t(p' - p) - dF(0)^{-1}(F(p + t(p' - p)) - q) \qquad 0 \le t \le 1 \]
Then
\[ T(p') - T(p) = h(1) - h(0) = \int_0^1 h'(t)\, dt \tag{7.20} \]
\[ h'(t) = p' - p - dF(0)^{-1}\, dF(p + t(p' - p))(p' - p) \]
\[ h'(t) = [I - dF(0)^{-1}\, dF(p + t(p' - p))](p' - p) \tag{7.21} \]
Now choose $\delta > 0$ so that
\[ \|I - dF(0)^{-1}\, dF(x)\| \le 1/2 \]
for $\|x\| < \delta$. Then if $p, p' \in B(0, \delta)$, every $p + t(p' - p)$ is in $B(0, \delta)$, for $0 \le t \le 1$,
so, using (7.21)
\[ \|h'(t)\| \le \|I - dF(0)^{-1}\, dF(p + t(p' - p))\|\,\|p' - p\| \le \tfrac{1}{2}\|p' - p\| \]
Thus, by (7.20)
\[ \|T(p) - T(p')\| \le \int_0^1 \tfrac{1}{2}\|p' - p\|\, dt = \tfrac{1}{2}\|p' - p\| \]
so T is a contraction in $B(0, \delta)$.
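The fixed-point construction in this proof can be run directly. The sketch below is an illustration under stated assumptions, not the author's procedure: it takes the map of Example 13 shifted so that $F(0) = 0$, hard-codes $dF(0)^{-1}$ for that map, and iterates $T_q$ from (7.16) starting at 0. The names `F`, `T`, `G`, and the sample q are choices made here.

```python
import math

def F(p):
    # Example 13 shifted so that F(0, 0) = (0, 0): u = x + e^y - 1, v = x cos y
    x, y = p
    return (x + math.exp(y) - 1.0, x * math.cos(y))

# dF(0) = [[1, 1], [1, 0]], whose inverse is [[0, 1], [1, -1]]
DF0_INV = ((0.0, 1.0), (1.0, -1.0))

def T(p, q):
    # T_q(p) = p - dF(0)^{-1} [F(p) - q], as in (7.16)
    fx, fy = F(p)
    rx, ry = fx - q[0], fy - q[1]
    return (p[0] - (DF0_INV[0][0] * rx + DF0_INV[0][1] * ry),
            p[1] - (DF0_INV[1][0] * rx + DF0_INV[1][1] * ry))

def G(q, steps=60):
    # iterate the contraction from p = 0; the limit is the fixed point G(q)
    p = (0.0, 0.0)
    for _ in range(steps):
        p = T(p, q)
    return p

q = (0.05, 0.02)
p = G(q)
print(p, F(p))  # F(p) should reproduce q
```

Unlike Newton's method (7.15), the differential is never re-evaluated; the price is linear rather than quadratic convergence, which is all the proof needs.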
• EXERCISES
4. Compute the Jacobian
\[ \frac{\partial(u^1, \ldots, u^n)}{\partial(x^1, \ldots, x^n)} \]
for each of the following functions and determine those points $(x^1, \ldots, x^n)$
at which $u^1, \ldots, u^n$ are coordinates:
(a) $u = xe^y$
    $v = ye^x$
(b) $u^1 = x^1 + x^2 + x^3$
    $u^2 = x^1x^2 + x^2x^3 + x^3x^1$
    $u^3 = x^1x^2x^3$
(c) $u = x^2 - y^2$
    $v = xy$
(d) $u = x^2 + y^2 + z^2$
    $v = yx^{-1}$
    $w = zx^{-1}$
(e) $u^1 = x^1$
    $u^2 = x^2/x^1$
    ...
    $u^n = x^n/x^1$
(f) $u^1 = h_1(x)x^1$
    ...
    $u^n = h_n(x)x^n$
    $h_i(x) \ne 0$ for all i and x
5. Express the differential of $f(x) = \sum_{i=1}^{n} (x^i)^2$ in terms of the coordinates
$u^1, \ldots, u^n$ given in Exercise 4(e).
6. Express df in terms of the coordinates of Exercise 4(d), where
(a) $f(x) = \ln(x^2 + y^2 + z^2)$
(b) $f(x) = yz$
(c) $f(x) = x + y + z$
7. Compute the differential of
\[ [(x - a)^2 + (y - b)^2 + (z - c)^2]^{-1} \]
in spherical coordinates in $R^3 - \{(a, b, c)\}$.
8. What is the rate of change of the volume of a rectangular box with
respect to the area of its surface, assuming the length of one side and the
sum of the lengths of the other two sides are held fixed?
• PROBLEMS
5. Let f be a differentiable function defined on a domain D in $R^2$. Show
that f is a function of $x + y$ alone if and only if $\partial f/\partial x = \partial f/\partial y$ on D.
(Hint: Consider the change of coordinates $u = x + y$, $v = x - y$.)
6. Give a condition guaranteeing that a differentiable function of two
variables can be expressed as a function of xy.
7. Suppose that f, g are two differentiable functions on $R^2$ with $\nabla g \ne 0$.
Show that f is a function of g alone if and only if $\nabla f$, $\nabla g$ are everywhere
collinear.
8. Show that for any twice differentiable function f defined on the plane,
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{1}{r}\frac{\partial}{\partial r}\left( r\frac{\partial f}{\partial r} \right) + \frac{1}{r^2}\frac{\partial^2 f}{\partial\theta^2} \]
9. Show that for $f(z) = z^n$, $n \ne 0$,
\[ \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0 \]
10. The proof of Theorem 7.2 is still incomplete: we must show that the
function G is continuous. There are two ways.
(a) Suppose $q_n \to q$. Let $q_n = F(p_n)$. Suppose that $p_n \to p$. Then
$F(p) = \lim F(p_n) = \lim q_n = q$. Applying G we have
\[ \lim G(q_n) = \lim p_n = p = G(F(p)) = G(q) \]
Thus G is continuous, as desired. Why may we suppose that the sequence
$\{p_n\}$ converges?
(b) In this approach we reprove Theorem 7.2 so that the continuity
is automatic.
For a sufficiently small $\varepsilon > 0$, we consider the space C of continuous
functions h on $\{q \in R^n: \|q\| \le \varepsilon\}$ such that $h(0) = 0$. Define $T: C \to C$ by
\[ T(h)(q) = h(q) - dF(0)^{-1}[F(h(q)) - q] \]
As in Lemma 1, show that T is a contraction (on the space C of
continuous functions!). Thus T has a fixed point G. Clearly, $F(G(q)) = q$
as desired and the continuity of G is assured.
11. Suppose that f is a continuously differentiable function defined in a
neighborhood of 0 in $R^3$, and $f(0) = 0$ and $\partial f/\partial z(0) \ne 0$. Then the equation
\[ f(x, y, z) = 0 \]
implicitly defines z as a function of x and y. More precisely, there is a
function g defined for small enough x, y such that
\[ f(x, y, z) = 0 \quad \text{if and only if} \quad z = g(x, y) \]
near the origin. This can be proven as a corollary of Theorem 7.2 as
follows: apply Theorem 7.2 to the mapping
\[ u = x \]
\[ v = y \tag{7.22} \]
\[ w = f(x, y, z) \]
We can find functions h, k, g of (u, v, w) such that (7.22) holds if and only if
\[ x = h(u, v, w) \]
\[ y = k(u, v, w) \]
\[ z = g(u, v, w) \]
Obviously, $h(u, v, w) = u$, $k(u, v, w) = v$. It follows that when $w = 0$,
$z = g(u, v, 0) = g(x, y, 0)$. This is the desired conclusion.
12. Here is a similar fact. The proof should be analogous to the
argument for Problem 11. Suppose f, g are continuously differentiable near 0
in $R^3$ and that $f(0) = g(0) = 0$ and
\[ \det\frac{\partial(f, g)}{\partial(x, y)}(0) \ne 0 \]
Then, there are continuously differentiable functions h, k defined for small
enough z such that
\[ f(x, y, z) = 0 = g(x, y, z) \quad \text{if and only if} \quad x = h(z),\; y = k(z) \]
7.3 Differential Forms
The differential of a real-valued function defined on a domain D in $R^n$ is a
function defined on D whose values are linear functions on $R^n$. A function
of this type is called a differential form, and a central issue in the calculus of
several variables is this: just when is a differential form the differential
of a function? This problem is resolved by the generalization of the
fundamental theorem of calculus which takes the form in this chapter of Green's
theorem. The one-variable fundamental theorem asserts that every
differential form on an interval is the differential of a function. This is far from being
true in several variables.
For example, to say that $\sum a_i\, dx^i$ is the differential of a function f is to
assert that $a_i = \partial f/\partial x^i$. Since
\[ \frac{\partial^2 f}{\partial x^i\,\partial x^j} = \frac{\partial^2 f}{\partial x^j\,\partial x^i} \qquad \text{all } i \text{ and } j \tag{7.23} \]
we must have $\partial a_i/\partial x^j = \partial a_j/\partial x^i$. This is not always the case: $e^x(dx + dy)$ and
$y\,dx - x\,dy$ are not differentials of functions because the coefficients do not
satisfy these conditions. We shall explore this situation at length in the
following two sections.
Definition 3. Let D be a domain in $R^n$. A differential form on D is a
function which associates to each point p in D a linear functional on $R^n$.
If f is a differentiable function on D, then df is a differential form on D.
In particular, if $x_1, \ldots, x_n$ is a coordinate system in $R^n$, $dx_1, \ldots, dx_n$ are
differential forms on $R^n$. Furthermore, for any $p \in D$, $dx_1(p), \ldots, dx_n(p)$
form a basis of the space of linear functionals on $R^n$, so any such functional
is a linear combination of the $dx_i(p)$. Thus, the general differential form
on D is of the form $\sum_{i=1}^{n} a_i(p)\, dx_i(p)$ where the $a_i$ are real-valued functions
on D.
Definition 4. Let $\omega$ be a differential form on the domain D, and write
$\omega = \sum a_i\, dx_i$ relative to the standard coordinates of $R^n$. We shall say that
$\omega$ is a k-times (continuously) differentiable differential form on D ($\omega \in C^k(D)$)
if the functions $a_1, \ldots, a_n$ are all k-times (continuously) differentiable.
Suppose now that $u_1, \ldots, u_n$ are differentiable functions in $D \subset R^n$ and
that $du_1(p), \ldots, du_n(p)$ are independent at some $p \in R^n$. Such an n-tuple of
functions forms a coordinate system near p: the mapping $u = (u_1, \ldots, u_n)$
maps a neighborhood D of p onto a domain D' in one-to-one fashion.
Furthermore, $du_1(p), \ldots, du_n(p)$ form a basis for $(R^n)^*$, so any differential
form can be written as $\sum \alpha_i\, du_i$. We can compute the relation between the
$a_i$ and the $\alpha_i$ by the chain rule: since $du_i = \sum_j (\partial u_i/\partial x_j)\, dx_j$, we have
\[ a_j = \sum_{i=1}^{n} \alpha_i\, \frac{\partial u_i}{\partial x_j} \tag{7.24} \]
Thus differential forms transform under a coordinate change as the
differential of a function (compare Equations (7.4) and (7.23)). Now the equality
of mixed partials of a twice differentiable function gives a necessary condition
for a differential form to be the differential of a function.
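Proposition 3. Let $\omega$ be a continuously differentiable differential form in a
domain D. Suppose $u^1, \ldots, u^n$ is any coordinate system for D. If $\omega =
\sum a_i\, du^i$ is the differential of a function we must have
\[ \frac{\partial a_i}{\partial u^j} = \frac{\partial a_j}{\partial u^i} \qquad 1 \le i, j \le n \tag{7.25} \]
Proof. If $\omega = df$, then $a_i = \partial f/\partial u^i$. Then
\[ \frac{\partial a_i}{\partial u^j} = \frac{\partial}{\partial u^j}\left( \frac{\partial f}{\partial u^i} \right) = \frac{\partial}{\partial u^i}\left( \frac{\partial f}{\partial u^j} \right) = \frac{\partial a_j}{\partial u^i} \]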
Closed and Exact Forms
We shall say that a differential form is exact in a domain D if it is the
differential of a function, and closed if Equations (7.25) hold. It is easily
verified that if these equations hold in any coordinate system, then they hold
in all coordinate systems (see Problem 13); so it is not too difficult to verify
that a form is closed.
In the plane a form has the expression $\omega = p\,dx + q\,dy$ with respect to
the rectangular coordinates. In this case there is only one nontrivial equation in
(7.25), namely,
\[ \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 0 \tag{7.26} \]
We shall refer to this function as $d\omega$; that is, if
\[ \omega = p\,dx + q\,dy \qquad d\omega = \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} \]
Thus, a differential form on a plane is closed if $d\omega = 0$ and is exact if $\omega = df$.
Examples
14. $\omega = x\,dx + y\,dy$, $d\omega = 0 - 0 = 0$. In fact, $\omega = d((x^2 + y^2)/2)$.
15. $\omega = y\,dx + x\,dy$, $d\omega = 1 - 1 = 0$. Here $\omega$ is also exact,
since $\omega = d(xy)$.
16. $\omega = y\,dx - x\,dy$, $d\omega = -1 - 1 = -2$, so $\omega$ is not closed.
Notice however that $y^{-2}\omega$ is closed, since it is exact (except for $y = 0$,
where it is not defined): $y^{-2}\omega = d(x/y)$.
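The closedness test $d\omega = \partial q/\partial x - \partial p/\partial y$ is easy to evaluate numerically for forms like those of Examples 14 to 16. In this minimal sketch the helper `d_omega`, the dictionary of forms, and the sample point are assumptions for illustration; partial derivatives are approximated by central differences.

```python
def d_omega(p, q, x, y, h=1e-5):
    # dq/dx - dp/dy for the form p dx + q dy, by central differences
    dq_dx = (q(x + h, y) - q(x - h, y)) / (2 * h)
    dp_dy = (p(x, y + h) - p(x, y - h)) / (2 * h)
    return dq_dx - dp_dy

forms = {
    "x dx + y dy": (lambda x, y: x, lambda x, y: y),
    "y dx + x dy": (lambda x, y: y, lambda x, y: x),
    "y dx - x dy": (lambda x, y: y, lambda x, y: -x),
    "(y dx - x dy)/y^2": (lambda x, y: 1 / y, lambda x, y: -x / y ** 2),
}

for name, (p, q) in forms.items():
    print(name, d_omega(p, q, 0.7, 1.3))
```

Only the third form should give a value away from zero (namely -2); dividing it by $y^2$ restores closedness, as Example 16 notes.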
17. Integrating factors. Let $\omega = p\,dx + q\,dy$ be a differential
form given in a neighborhood of $p_0$. The vector field $(-q, p)$ can be
realized as the field of tangents to a family of curves, as we saw in
Chapter 4. Let this family be given implicitly by
\[ F(x, y) = c \]
Thus, since F(x, y) is constant on these curves, its derivative along
the curve is zero; or what is the same
\[ dF(x, y)(-q(x, y), p(x, y)) = 0 \]
Since dF and $\omega$ annihilate the same vectors at each point, they are
collinear. Thus there is a function $\lambda(x, y)$ such that
\[ dF = \lambda\omega \]
We conclude that for any differential form $\omega$ there is a factor $\lambda$ such
that $\lambda\omega$ is exact. This is true in two dimensions, and fails in higher
dimensions. $\lambda$ is called an integrating factor for $\omega$.
18. The polar coordinate $\theta$ is not a well-defined function on the
domain $R^2 - \{0\}$, but its differential is:
\[ d\theta = d\left( \arctan\frac{y}{x} \right) = \frac{-y\,dx + x\,dy}{x^2 + y^2} \]
Thus this form is closed, but not exact on the domain $R^2 - \{0\}$.
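We shall now verify that every closed form on $R^2 - \{0\}$ is equal to an
exact form plus a constant multiple of $d\theta$. Thus the space of closed forms
on $R^2 - \{0\}$ is larger than the space of exact forms by one dimension.
Suppose that $\omega$ is a closed form in $R^2 - \{0\}$. In polar coordinates
$\omega(re^{i\theta}) = a(re^{i\theta})\,dr + b(re^{i\theta})\,d\theta$ and since $\omega$ is closed we have $\partial b/\partial r = \partial a/\partial\theta$.
It follows that
\[ F(r) = \int_0^{2\pi} b(re^{i\theta})\, d\theta \]
is a constant. For
\[ \frac{dF}{dr} = \int_0^{2\pi} \frac{\partial b}{\partial r}\, d\theta = \int_0^{2\pi} \frac{\partial a}{\partial\theta}\, d\theta = a(re^{2\pi i}) - a(re^{0}) = a(r) - a(r) = 0 \]
Let $c(\omega)$ be that constant. Notice that $c(d\theta) = 2\pi$. Further, if $\omega = df$,
then $c(\omega) = 0$. For
\[ c(\omega) = \int_0^{2\pi} b\, d\theta = \int_0^{2\pi} \frac{\partial f}{\partial\theta}\, d\theta = f(re^{2\pi i}) - f(re^{0}) = 0 \]
Conversely, if $c(\omega) = 0$, then $\omega$ is the differential of a function defined on
$R^2 - \{0\}$. Let
\[ f(r, \theta) = \int_1^r a(t)\, dt + \int_0^{\theta} b(re^{i\varphi})\, d\varphi \tag{7.27} \]
Since $c(\omega) = 0$, $f(r, \theta + 2\pi) = f(r, \theta)$ for all r, $\theta$, so we can define a function
F on $R^2 - \{0\}$ by $F(re^{i\theta}) = f(r, \theta)$. Differentiating (7.27), we have
\[ \frac{\partial F}{\partial r} = \frac{\partial f}{\partial r} = a(r) + \int_0^{\theta} \frac{\partial b}{\partial r}(re^{i\varphi})\, d\varphi = a(r) + \int_0^{\theta} \frac{\partial a}{\partial\varphi}(re^{i\varphi})\, d\varphi = a(re^{i\theta}) \]
\[ \frac{\partial F}{\partial\theta} = \frac{\partial f}{\partial\theta} = b(re^{i\theta}) \]
Thus, $dF = \omega$.
Finally, if $\omega$ is any closed form on $R^2 - \{0\}$, let $\vartheta = \omega - c(\omega)\,d\theta/2\pi$.
Then
\[ c(\vartheta) = c(\omega) - \frac{c(\omega)}{2\pi}\, 2\pi = 0 \]
so $\vartheta$ is exact: $\vartheta = dF$. Thus
\[ \omega = dF + \frac{c(\omega)}{2\pi}\, d\theta \]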
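The constant $c(\omega)$ can be approximated numerically for a form given in rectangular coordinates. The sketch below is an illustration under assumptions: for $\omega = p\,dx + q\,dy$ the $d\theta$-coefficient on the circle of radius r is $b = p\,\partial x/\partial\theta + q\,\partial y/\partial\theta = -py + qx$, and the integral over $[0, 2\pi]$ is taken by the midpoint rule. The function names and the number of sample points are choices made here.

```python
import math

def c(form, r=1.0, n=2000):
    # c(omega) = integral over [0, 2*pi] of b(r e^{i theta}) d(theta),
    # with b = -p*y + q*x for omega = p dx + q dy
    dtheta = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * dtheta
        x, y = r * math.cos(theta), r * math.sin(theta)
        p, q = form(x, y)
        total += (-p * y + q * x) * dtheta
    return total

def d_theta(x, y):
    # d(theta) = (-y dx + x dy) / (x^2 + y^2)
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def d_xy(x, y):
    # the exact form d(xy) = y dx + x dy
    return (y, x)

print(c(d_theta, r=0.5), c(d_theta, r=2.0))  # both close to 2*pi
print(c(d_xy))                               # close to 0
```

The value for $d\theta$ does not depend on the radius, in line with the argument that F(r) is constant for a closed form.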
EXERCISES
9. Which of the following forms are closed?
(a) $\sum_{i=1}^{n} x^i\, dx^i$
(b) $xy\,dz + yz\,dx + zx\,dy$
(c) $xyz(dx + dy + dz)$
(d) $r\,dr + d\theta$
(f) r sm θ dr + r cos θ
(g) r sin φ dr + r cos φ sm θ ί/0 + r sm ^ d^
(h) d{xe" cos(xyz))
(l) X1X2 ί/хз + X2 Хъ dXi + x3xA dxi + XiXi dx2
(j) Xi ito + хъ dxa, + x5 dx6
(k) Xi dxi + xi dx3 + Хъ dxi
10. Is the form $(z - a)^{-1}\,dz$ exact in $C - \{a\}$? Is its real part exact? Is
its imaginary part exact?
11. Find integrating factors for the following forms:
(a) x(dy + dx) (d) χ dy
(b) xy(dx + dy) (e) e*+y dx + e dy
(c) $-y\,dx + x\,dy$ (f) $\sin x\,dx + \cos x\,dy$
• PROBLEMS
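13. Let $(x^1, \ldots, x^n)$ and $(u^1, \ldots, u^n)$ be two coordinate systems valid in a
domain D in $R^n$. Let $\omega$ be a differential form defined in D and write $\omega$ in
terms of these coordinates as
\[ \omega = \sum a_i\, dx^i = \sum \alpha_i\, du^i \]
Show that if
\[ \frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i} \qquad \text{for all } i, j \]
then
\[ \frac{\partial\alpha_i}{\partial u^j} = \frac{\partial\alpha_j}{\partial u^i} \qquad \text{for all } i, j \]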
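14. Let the hypotheses be the same as in Problem 13, but this time
suppose n = 2. Show that
\[ \frac{\partial a_1}{\partial x_2} - \frac{\partial a_2}{\partial x_1} = \left( \frac{\partial\alpha_1}{\partial u_2} - \frac{\partial\alpha_2}{\partial u_1} \right) \frac{\partial(u_1, u_2)}{\partial(x_1, x_2)} \]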
15. Show that the space of closed forms on $R^2 - \{0, 1\}$ is larger than the
space of exact forms by 2 dimensions. (Hint: Let $\theta_0$ be the ordinary polar
angle, and let $\theta_1(p)$ be the angle between the ray from 1 to p and the horizontal.
Then if $\omega$ is closed in $R^2 - \{0, 1\}$ there are constants a, b such that
$\omega - a\,d\theta_0 - b\,d\theta_1$ is exact.)
7.4 Work and Conservative Fields
Suppose we have a field of forces F given in a domain D in $R^n$: F(x) is the
force felt by a unit mass situated at the point x. In moving an object of mass
m along a certain path a certain amount of energy is expended; this is called
work. In this section we shall describe the computation of work.
Suppose first that a body of mass m moving in a straight line experiences
a force of magnitude F per unit mass operating in the direction opposite
the motion. Then, by definition the work required to move that body a
distance d is — F · m · d. In a more complicated situation the force acts in
space in a fixed direction with a certain magnitude; thus the force is
represented by a vector F. Suppose we want to move a body of mass m from a
point a to another point b. The work required for this movement will
depend only on the component of the force in the direction of motion and
will be given again by $-F_0 \cdot m \cdot d$, where $F_0$ is this component and d is the
distance between a and b. That is, if $b - a = dE$, where E is a unit vector,
then $F_0 = \langle F, E \rangle$ and the work is
\[ -\langle F, E \rangle md = -m\langle F, b - a \rangle \]
Now, in general, the force is not necessarily constant, but varies with
position. The general situation is that of a force given by a vector field
(vector-valued function) F on R3. Suppose that for some perverse reason
we desire to move a given body from a to b along a particular path $\Gamma$. As
is customary we try to adapt the above formula to this revised situation by
assuming that the force field varies little over small intervals (that is, F is
continuous) and that the path is very close to being a sequence of straight line
segments. Then, we get a reasonable approximation to the total work
involved by adding up the work required over each line segment assuming
the force is constant there. More precisely, then, we select a very large
number of points $a = p_0, p_1, \ldots, p_s = b$ numbered sequentially along the
path (see Figure 7.2). The work we seek is then approximated by
\[ -m\sum_{i=1}^{s} \langle F(p_i),\, p_i - p_{i-1} \rangle \tag{7.28} \]
We define the work as the limit of all such sums as the maximum of the
distance between successive points tends to zero, and we expect that, as
usual, the calculus will make that computable.
And it does. Suppose given, for example, a field of force F given in a
domain D in $R^3$; then $F = (f_1(x), f_2(x), f_3(x))$ is an $R^3$-valued function defined
on D. Suppose $\Gamma$ is an oriented curve in D, given by the parametrization
$x = g(t) = (g_1(t), g_2(t), g_3(t))$, $a \le t \le b$. We shall now compute the work
done in moving a particle of mass m from g(a) to g(b). Let $g(a) = p_0, \ldots,
p_s = g(b)$ be a very large number of points situated along $\Gamma$. Referring to
the parametrization we can write $p_0 = g(t_0)$, $p_1 = g(t_1), \ldots, p_s = g(t_s)$, with
$a = t_0 < t_1 < \cdots < t_s = b$. Then the approximate work done is given by
(7.28):
\[ -m\sum_{i=1}^{s} \langle F(g(t_i)),\, g(t_i) - g(t_{i-1}) \rangle \tag{7.29} \]
\[ = -m\sum_{i=1}^{s} \big\{ f_1(g(t_i))[g_1(t_i) - g_1(t_{i-1})] + f_2(g(t_i))[g_2(t_i) - g_2(t_{i-1})]
   + f_3(g(t_i))[g_3(t_i) - g_3(t_{i-1})] \big\} \]
[Figure 7.2]
By the mean value theorem, there are $\xi_{1,i}$, $\xi_{2,i}$, $\xi_{3,i}$ such that
\[ g_j(t_i) - g_j(t_{i-1}) = g_j'(\xi_{j,i})(t_i - t_{i-1}) \qquad t_{i-1} < \xi_{j,i} < t_i \]
Thus the approximating sum (7.29) becomes
\[ -m\sum_{i=1}^{s} \Big[ \sum_{j=1}^{3} f_j(g(t_i))\, g_j'(\xi_{j,i}) \Big](t_i - t_{i-1}) \tag{7.30} \]
which is a typical Riemann sum approximating
\[ -m\int_a^b \sum_{j=1}^{3} f_j(g(t))\, g_j'(t)\, dt = -m\int_a^b \langle F(g(t)),\, g'(t) \rangle\, dt \tag{7.31} \]
In fact, as the "very large number of points" on $\Gamma$ becomes infinite, the
sums (7.30) do tend to the integral (7.31), so we are justified in referring to
this as the work required to move the mass along $\Gamma$. We are thus led to this
definition of work:
Definition 5. Let D be a domain in $R^n$ and F a force field defined in D;
that is, F is an $R^n$-valued function on D. Let $\Gamma$ be an oriented curve defined
in D. The work required to move a unit mass along $\Gamma$ is
\[ W(\Gamma, F) = -\int_a^b \langle F(g(t)),\, g'(t) \rangle\, dt \]
where $g: [a, b] \to \Gamma$ is a parametrization of $\Gamma$.
Notice that since $W(\Gamma, F)$ is the limit of a collection of sums defined
independently of any particular parametrization, $W(\Gamma, F)$ is also
independent of the parametrization.
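The polygonal sums (7.28) can be computed directly, which makes both the convergence and the independence of parametrization visible. The following is a sketch under assumptions: it uses the field $F(x, y) = (-y, x^2)$ and the unit circle, and the function names, the reparametrization $t \mapsto t^2$, and the numbers of chords are choices made for illustration.

```python
import math

def F(p):
    x, y = p
    return (-y, x * x)

def g(t):
    # unit circle, counterclockwise, 0 <= t <= 2*pi
    return (math.cos(t), math.sin(t))

def g_reparametrized(t):
    # same oriented curve traced at a different speed, 0 <= t <= sqrt(2*pi)
    return g(t * t)

def polygonal_work(field, curve, a, b, s, mass=1.0):
    # the sum (7.28): -m * sum of <F(p_i), p_i - p_{i-1}> over s chords
    total = 0.0
    prev = curve(a)
    for i in range(1, s + 1):
        cur = curve(a + (b - a) * i / s)
        force = field(cur)
        total += sum(fc * (c - p) for fc, c, p in zip(force, cur, prev))
        prev = cur
    return -mass * total

for s in (10, 100, 1000):
    print(s, polygonal_work(F, g, 0.0, 2 * math.pi, s))
print(polygonal_work(F, g_reparametrized, 0.0, math.sqrt(2 * math.pi), 1000))
```

As s grows the sums settle near $-\pi$, the value of $-\int_0^{2\pi} \langle F(g(t)), g'(t) \rangle\, dt$ for this field, and the reparametrized curve gives the same limit.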
Sometimes paths of motion have a break in direction (see Figure 7.3).
Such a curve is called a piecewise continuously differentiable curve, or a path
for short. More precisely, we make the following definition.
Definition 6. An oriented path is the image of an interval [a, b] under a
continuous function f such that
(i) f is continuously differentiable with nonzero derivative at all but
finitely many points $t_1, \ldots, t_s$.
(ii) $\lim_{t \to t_i^-} f'(t)$ and $\lim_{t \to t_i^+} f'(t)$ exist (but are not necessarily equal) and are
nonzero.
If $f(a) = f(b)$ the path is said to be closed. If $\Gamma$ is an oriented path we can
write $\Gamma = \Gamma_1 + \cdots + \Gamma_{s+1}$, where the $\Gamma_i$ are the oriented curves between
the points $f(t_{i-1})$ and $f(t_i)$ (with $t_0 = a$ and $t_{s+1} = b$). We define the work $W(\Gamma, F)$ by
\[ W(\Gamma, F) = \sum_i W(\Gamma_i, F) \]
Examples
19. Let $F(x, y) = (-y, x^2)$ be a force field in $R^2$. The work done
by moving a unit mass around the unit circle is found this way. First,
we parametrize the circle:
$$\Gamma: x = (\cos t, \sin t), \qquad 0 \le t \le 2\pi$$
Figure 7.3
Then
$$W(\Gamma, F) = -\int_0^{2\pi} \langle (-\sin t, \cos^2 t), (-\sin t, \cos t) \rangle\, dt = -\int_0^{2\pi} (\sin^2 t + \cos^3 t)\, dt = -\pi$$
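20. For the same force field, find the work done around the
boundary Γ of the rectangle [(0, 0), (1, 1)], traversed counterclockwise.
Here $\Gamma = \Gamma_1 + \Gamma_2 + \Gamma_3 + \Gamma_4$, where
$$\Gamma_1: x = (t, 0), \quad \Gamma_2: x = (1, t), \quad \Gamma_3: x = (1 - t, 1), \quad \Gamma_4: x = (0, 1 - t), \qquad 0 \le t \le 1$$
Then
$$W(\Gamma, F) = -\Big[ \int_0^1 \langle (0, t^2), (1, 0) \rangle\, dt + \int_0^1 \langle (-t, 1), (0, 1) \rangle\, dt + \int_0^1 \langle (-1, (1-t)^2), (-1, 0) \rangle\, dt + \int_0^1 \langle (-(1-t), 0), (0, -1) \rangle\, dt \Big]$$
$$= -\int_0^1 (0 + 1 + 1 + 0)\, dt = -2$$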
21. Let $F(x, y, z) = (yz, xz, xy)$ and compute the work done along
one full loop of the helix
$$\Gamma: x = (\cos t, \sin t, t), \qquad 0 \le t \le 2\pi$$
$$W(\Gamma, F) = -\int_0^{2\pi} \langle (t \sin t, t \cos t, \sin t \cos t), (-\sin t, \cos t, 1) \rangle\, dt = -\int_0^{2\pi} (-t \sin^2 t + t \cos^2 t + \sin t \cos t)\, dt = 0$$
22. Compute the work done in the presence of the same force field
along the curve x = 1, y = 0, $0 \le z \le 2\pi$. Here Γ is given parametrically by
$$\Gamma: x = (1, 0, t), \qquad 0 \le t \le 2\pi$$
Thus
$$W(\Gamma, F) = -\int_0^{2\pi} \langle (0, t, 0), (0, 0, 1) \rangle\, dt = 0$$
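The work integrals of Examples 19 to 22 can be checked numerically. The following is a minimal sketch, not part of the text: the helper name `work` and the use of the midpoint rule are assumptions made for illustration. It keeps the minus sign of Definition 5, that is, $W(\Gamma, F) = -\int_a^b \langle F(g(t)), g'(t) \rangle\, dt$.

```python
import math

def work(F, g, dg, a, b, n=20000):
    """Approximate W = -int_a^b <F(g(t)), g'(t)> dt by the midpoint rule.

    F  : force field, called with the coordinates of a point
    g  : parametrization t -> point on the curve
    dg : derivative t -> g'(t)
    (all three names are illustrative, not from the text)
    """
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        total += sum(fi * di for fi, di in zip(F(*g(t)), dg(t)))
    return -total * h

# Example 19: F = (-y, x^2) around the unit circle.
F2 = lambda x, y: (-y, x * x)
w19 = work(F2,
           lambda t: (math.cos(t), math.sin(t)),
           lambda t: (-math.sin(t), math.cos(t)),
           0.0, 2 * math.pi)

# Example 20: same field, boundary of the unit square, counterclockwise.
sides = [
    (lambda t: (t, 0.0),       lambda t: (1.0, 0.0)),
    (lambda t: (1.0, t),       lambda t: (0.0, 1.0)),
    (lambda t: (1.0 - t, 1.0), lambda t: (-1.0, 0.0)),
    (lambda t: (0.0, 1.0 - t), lambda t: (0.0, -1.0)),
]
w20 = sum(work(F2, g, dg, 0.0, 1.0, n=2000) for g, dg in sides)

# Examples 21 and 22: F = (yz, xz, xy) along the helix and along a vertical segment.
F3 = lambda x, y, z: (y * z, x * z, x * y)
w21 = work(F3,
           lambda t: (math.cos(t), math.sin(t), t),
           lambda t: (-math.sin(t), math.cos(t), 1.0),
           0.0, 2 * math.pi)
w22 = work(F3,
           lambda t: (1.0, 0.0, t),
           lambda t: (0.0, 0.0, 1.0),
           0.0, 2 * math.pi)

print(w19, w20, w21, w22)
```

With the minus sign of Definition 5 retained, the four printed values are close to $-\pi$, $-2$, $0$ and $0$. The first two are nonzero on closed paths, so this field cannot be conservative.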
Conservation of Energy
Now let us suppose we are given a field of forces on a domain D. Let Γ
be a closed path in D. Under optimal conditions we would hope for no
loss of energy in moving a mass around Γ. We shall call a field conservative
if this situation is the case; that is, the field F is conservative if W(T, F) = 0
for every closed path Γ. Not every field is conservative, as Examples 19 and
20 show. In case F is a conservative field, then the work required to move
a unit mass from one point $p_0$ to another $p_1$ will be the same no matter what
path from $p_0$ to $p_1$ is followed. For suppose we take two such oriented paths
Γ, Γ'. Then the path from $p_0$ to $p_0$ obtained by first traversing Γ and then
-Γ' (Γ' oriented from $p_1$ to $p_0$) is a closed path. Thus $W(\Gamma - \Gamma', F) = 0$
since F is conservative. But $W(\Gamma - \Gamma', F) = W(\Gamma, F) - W(\Gamma', F)$, so
$$W(\Gamma, F) = W(\Gamma', F)$$
Definition 7. Let F be a conservative field defined in the domain D. A
potential function for F is a real-valued function Π defined on D such that,
for any path Γ from p to p', the quantity
$$W(\Gamma, F) + \Pi(p') - \Pi(p) \tag{7.32}$$
is a constant.
Π is sometimes called the potential energy of the force field F, and the
constancy of (7.32) is just the assertion that a conservative force field obeys
the law of conservation of energy. We can relate the potential function of a
conservative field to the field by its differential. We obtain this important
result:
Theorem 7.3. Suppose that D is a domain such that any two points can be
joined by a path (we say D is pathwise connected).
(i) Every conservative field on D has a potential function.
(ii) Two potentials of the given field differ by a constant.
(iii) If the field $F = (f_1, \ldots, f_n)$ has the potential function Π, then
$$d\Pi = f_1\, dx^1 + \cdots + f_n\, dx^n$$
Proof. (i) Suppose that $F = (f_1, \ldots, f_n)$ is a conservative field defined on D.
Then if Γ and Γ' are two oriented curves with the same end points, $W(\Gamma, F) = W(\Gamma', F)$ since F is conservative. Fix $p_0 \in D$. Since D is pathwise connected, if p
is any point of D there is a curve Γ from $p_0$ to p. Define $\Pi(p) = -W(\Gamma, F)$. Π(p)
is a well-defined function of p since the work required does not depend on the
choice of Γ. Now let p and p' be two points in D, and let Γ be a path from p to p'.
If $\Gamma_0$ is a curve from $p_0$ to p, then $\Gamma + \Gamma_0$ is a curve from $p_0$ to p', so
$$-W(\Gamma_0, F) = \Pi(p), \qquad -W(\Gamma + \Gamma_0, F) = \Pi(p')$$
But $W(\Gamma + \Gamma_0, F) = W(\Gamma, F) + W(\Gamma_0, F) = W(\Gamma, F) - \Pi(p)$. Thus $-\Pi(p') = W(\Gamma, F) - \Pi(p)$, or $W(\Gamma, F) + \Pi(p') - \Pi(p) = 0$, so (i) is proven.
(ii) If Π' is another potential and $\Gamma_0$ is a curve joining $p_0$ to p, then by the above
definition
$$\Pi'(p) - \Pi'(p_0) + W(\Gamma_0, F)$$
is a constant, say C. But $W(\Gamma_0, F) = -\Pi(p)$ by definition; thus
$$\Pi'(p) - \Pi(p) = C + \Pi'(p_0)$$
another constant. Thus two potentials for the field F indeed differ by a constant.
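(iii) Finally, we prove that $d\Pi = \sum f_i\, dx^i$. Let $p \in D$. Fix i, and let ε be so
small that the ball $B(p, \varepsilon) \subset D$. Let $\Gamma_\varepsilon$ be the curve with this parametrization:
$$g(t) = p + tE_i, \qquad 0 \le t \le \varepsilon$$
Since Π is a potential for F,
$$\Pi(p + \varepsilon E_i) - \Pi(p) = -W(\Gamma_\varepsilon, F) = \int_0^\varepsilon \langle F(g(t)), g'(t) \rangle\, dt$$
Now $g'(t) = E_i$ and
$$\langle F(g(t)), g'(t) \rangle = \sum_j f_j(p + tE_i) \langle E_j, E_i \rangle = f_i(p + tE_i)$$
Thus
$$\Pi(p + \varepsilon E_i) - \Pi(p) = \int_0^\varepsilon f_i(p + tE_i)\, dt$$
Thus
$$\frac{\partial \Pi}{\partial x^i}(p) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon} \int_0^\varepsilon f_i(p + tE_i)\, dt = f_i(p)$$
and so the proof of the theorem is concluded.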
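The construction in the proof can be tried numerically: define Π(p) as minus the work along a path from a base point to p, then compare its partial derivatives with the components of F. The sketch below is only an illustration under stated assumptions: it uses the conservative field F = (yz, xz, xy) of Example 21, takes the base point at the origin, always integrates along the straight segment, and the function names `potential` and `gradient` are hypothetical.

```python
def F(p):
    """The field of Example 21, F(x, y, z) = (yz, xz, xy)."""
    x, y, z = p
    return (y * z, x * z, x * y)

def potential(p, p0=(0.0, 0.0, 0.0), n=2000):
    """Pi(p) = -W(Gamma, F), with Gamma the straight segment from p0 to p.

    The segment is g(t) = p0 + t (p - p0), 0 <= t <= 1, so g'(t) = p - p0,
    and -W is the plain integral of <F(g(t)), g'(t)> (midpoint rule).
    """
    d = [pi - qi for pi, qi in zip(p, p0)]
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        point = [qi + t * di for qi, di in zip(p0, d)]
        total += sum(fi * di for fi, di in zip(F(point), d))
    return total * h

def gradient(func, p, eps=1e-5):
    """Central-difference approximation to the partial derivatives of func at p."""
    grads = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += eps
        lo[i] -= eps
        grads.append((func(hi) - func(lo)) / (2 * eps))
    return grads

p = (0.7, -1.2, 0.5)
print(potential(p))            # close to x*y*z = -0.42
print(gradient(potential, p))  # close to F(p)
print(F(p))
```

For this field the potential obtained is xyz, and the finite-difference gradient agrees with F to within the quadrature error, as part (iii) of the theorem predicts.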
• EXERCISES
12. Find the work required to move a unit mass around the given path
Γ in the presence of the given force field:
(a) F(x, y) = (χ χ) Γ: unit circle
(b) $F(x, y) = (y^2, y - x^2)$ Γ: boundary of the triangle with
vertices at (0, 0), (0, 2), (0, 1)
(c) $F(x, y) = (1, x)$ Γ: $z(t) = \exp((1 + i)t)$ from t = 0 to t = 1
(d) $F(x, y, z) = (-y, x, z)$ Γ: $x = (\cos t, \sin t, t)$
(e) $F(x, y) = (x, xy)$ Γ: the portion of the parabola $y = kx^2$
from (0, 0) to $(a, ka^2)$
(f) FO, ;v, ζ) = (ζ, χ2, χ) Γ: closed polygon with successive
vertices (0, 0, 0), (2, 0, 0), (2, 3, 0), (0, 0, 1), (0, 0, 0)
13. Which of these fields are conservative?
(a) F(x, y) = (cos x, cos y, sin χ sin y)
(b) F(x, χ) = (cos χ cos χ — sm χ sin χ)
(c) F(x, у) = (χ, у)
(d) F(x, χ ζ) = (χ ζ, χ)
(e) F(x, χ, ζ) = (-χ, χ, 1)
(f) -(χ2+^)-"2(χ,χ)
(g) (χ2 + yT1/2(-X χ)
• PROBLEMS
16. Let $F(x, y) = (A(x), B(y))$. Show that $W(\Gamma, F) = 0$ for any closed
path Γ.
17. Find potential functions for these fields:
(a) $F(x, y, z) = -(0, 0, 1)$
(b) $F(x, y, z) = -(x^2 + y^2 + z^2)^{-1/2}(x, y, z)$
(c) F(x, χ ζ) = (χ, x, 1)
(d) $F(x, y, z) = xy\, dz + yz\, dx + zx\, dy$
18. Let F be a force field in the domain D and Γ an oriented path in D
from $p_0$ to p. Show that the work $W(\Gamma, F)$ can be written as
$$-\int_\Gamma \|F\| \cos\theta\, ds$$
where s is arc length along Γ, and θ is the angle between F and the tangent
to Γ.
19. Suppose the field F has the potential function Π. The surfaces
Π = constant are called equipotential surfaces for the field F.
(a) What are the equipotential surfaces for a central force field?
(b) What are the equipotentials for the fields of Exercise 13 which are
conservative?
20. Show that if F is a conservative force field in R2 the lines of force for
F are orthogonal to the equipotential curves for F.
21. If F is a vector field in a domain D in the plane, we define *F as the
field perpendicular and clockwise to F of the same magnitude. Verify this
relation between F and *F if
$$F(p) = (h_1(p), h_2(p)), \qquad {*F}(p) = (-h_2(p), h_1(p))$$
22. Suppose both F and *F are conservative fields with potentials Π,
Π*, respectively. Show that
(a) Π is harmonic.
(b) Π + iΠ* satisfies the Cauchy-Riemann equations.
23. Show that if f = u + iv is a complex analytic function, then u is the potential for a
field F such that *F is also conservative (and has potential function v).
24. A vector field F is called radial if it is central and its magnitude is a
function of the radius. Show that if F is a nonzero radial vector field it is
conservative, but *F is not.
7.5 Integration of Differential Forms
The study of work has led us to differentials of functions via the obvious
relation between vector fields and differential forms. If $F = (f_1, \ldots, f_n)$
is a vector field defined on a domain in $R^n$, the differential form $\sum_{i=1}^{n} f_i\, dx^i$
will be denoted $\langle F, dx \rangle$ (for obvious reasons). According to the results
of Section 7.4, the field F is conservative if and only if the form $\langle F, dx \rangle$
is exact. In this case $\langle F, dx \rangle = d\Pi$, where Π is a potential function
for the field F.
On the other hand, if ω is a form we can write $\omega = \langle F, dx \rangle$ for some vector
field F (if $\omega = \sum a_i\, dx^i$, then $F = (a_1, \ldots, a_n)$). We can thus rely on the notion of
work to define the integral of ω over a path Γ:
$$\int_\Gamma \omega = \int_\Gamma \langle F, dx \rangle = -W(\Gamma, F) \tag{7.33}$$
Thus, if $\omega = \sum a_i\, dx^i$ and Γ is parametrized explicitly by $x^i = x^i(t)$ for
$a \le t \le b$, then
$$\int_\Gamma \omega = \int_a^b \sum_i a_i\, \frac{dx^i}{dt}\, dt \tag{7.34}$$
The idea of defining the integral of a form in terms of work presents us
with a subtle inconsistency which we would like to avoid. The notion of a
differential form on $R^n$ involves the geometry of $R^n$ only insofar as it is a
vector space. In the conception of differential form, the inner product of
$R^n$ is irrelevant and no particular coordinatization of $R^n$ is selected over any
other. But the notion of work is deliberately expressed in terms of the
Euclidean structure of $R^n$; it essentially involves lengths and angles. As
a result, with the definition (7.33) of integration, we can only compute the
integral by means of (7.34) in terms of rectangular coordinates for $R^n$.
Since the concept of differential form is free of a particular basis, we want
accessory concepts (such as integration) also to be free; in fact, we would
hope to compute $\int_\Gamma \omega$ by means of (7.34) with respect to any coordinate
system as well as any parametrization of Γ. This turns out to be the case,
and therein we begin to see the importance of the notion of invariance with
respect to coordinate choices.
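Proposition 4. Let ω be a differential form defined on a domain D in $R^n$
and suppose $\omega = \sum f_i\, dx^i = \sum \varphi_i\, du^i$ with respect to two different coordinate
systems $(x^1, \ldots, x^n)$, $(u^1, \ldots, u^n)$. Let Γ be a path in D parametrized in two
different ways by
$$x^i = x^i(t), \qquad a \le t \le b$$
$$u^i = u^i(\tau), \qquad \alpha \le \tau \le \beta$$
Then
$$\int_a^b \sum_i f_i(x(t))\, \frac{dx^i}{dt}\, dt = \int_\alpha^\beta \sum_i \varphi_i(u(\tau))\, \frac{du^i}{d\tau}\, d\tau \tag{7.35}$$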
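Proof. We can write the x's as functions of the u's and t as a function of τ:
$$x^i = x^i(u^1, \ldots, u^n), \quad 1 \le i \le n; \qquad t = t(\tau), \quad \alpha \le \tau \le \beta$$
Now, according to (7.24),
$$\varphi_i(u) = \sum_{j=1}^{n} f_j(x)\, \frac{\partial x^j}{\partial u^i}(u) \tag{7.36}$$
when x, u are coordinates for the same point.
Now, let us compute the integral on the left of (7.35) by the change of variable
$t \to \tau$, according to the calculus of one dimension:
$$\int_a^b \sum_j f_j(x(t))\, \frac{dx^j}{dt}\, dt = \int_\alpha^\beta \sum_j f_j(x(t(\tau)))\, \frac{dx^j}{dt} \frac{dt}{d\tau}\, d\tau = \int_\alpha^\beta \sum_j f_j(x(t(\tau)))\, \frac{dx^j}{d\tau}\, d\tau \tag{7.37}$$
But we can compute $dx^j/d\tau$ by the chain rule; x is a function of u, which is a
function of τ:
$$\frac{dx^j}{d\tau} = \sum_i \frac{\partial x^j}{\partial u^i} \frac{du^i}{d\tau}$$
(7.37) becomes
$$\int_\alpha^\beta \sum_{i,j} f_j(x(t(\tau)))\, \frac{\partial x^j}{\partial u^i} \frac{du^i}{d\tau}\, d\tau = \int_\alpha^\beta \sum_i \varphi_i(u(\tau))\, \frac{du^i}{d\tau}\, d\tau$$
by (7.36). The proof is concluded.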
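Proposition 4 can be illustrated numerically on one concrete case. The sketch below is an assumption-laden example, not taken from the text: the form is $-y\, dx + x\, dy$ in rectangular coordinates, which is $r^2\, d\theta$ in polar coordinates; the path is the spiral $r = 1 + \theta$, $0 \le \theta \le \pi$; and the second parametrization uses the arbitrary substitution $t = \tau^2$. The helper name `midpoint` is hypothetical.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# Rectangular coordinates, parameter t in [0, pi]:
#   x = (1 + t) cos t,  y = (1 + t) sin t,  omega = -y dx + x dy.
def rectangular_integrand(t):
    x = (1 + t) * math.cos(t)
    y = (1 + t) * math.sin(t)
    dx = math.cos(t) - (1 + t) * math.sin(t)
    dy = math.sin(t) + (1 + t) * math.cos(t)
    return -y * dx + x * dy

# Polar coordinates, parameter tau in [0, sqrt(pi)] with t = tau^2:
#   r = 1 + tau^2,  theta = tau^2,  omega = 0 dr + r^2 dtheta.
def polar_integrand(tau):
    r = 1 + tau * tau
    dr = 2 * tau
    dtheta = 2 * tau
    return 0.0 * dr + r * r * dtheta

left = midpoint(rectangular_integrand, 0.0, math.pi)
right = midpoint(polar_integrand, 0.0, math.sqrt(math.pi))
exact = ((1 + math.pi) ** 3 - 1) / 3  # integral of (1 + t)^2 over [0, pi]
print(left, right, exact)
```

Both computations approximate the same number, $((1 + \pi)^3 - 1)/3$, even though the coordinates and the parameter differ, which is the content of (7.35).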
On the basis of that proposition we may now define the path integral
of a form.
Definition 8. (The Path Integral) Let Γ be an oriented path in a domain
in which the form ω is defined. If $\Gamma = \sum_{i=1}^{s} \Gamma_i$, where the $\Gamma_i$ are
parametrized by $x = g_i(t)$, $a_i \le t \le b_i$, we define
$$\int_\Gamma \omega = \sum_{i=1}^{s} \int_{a_i}^{b_i} \omega(g_i(t), g_i'(t))\, dt$$
Notice that if Γ is parametrized with respect to arc length, then g' is the
tangent T and the integral may be written as
$$\int_\Gamma \omega = \int_\Gamma \omega(T)\, ds$$
Examples
23. Find $\int_\Gamma r^2\, d\theta$, where Γ is the boundary of the rectangle
$-1 \le x \le 1$, $-1 \le y \le 1$. Now, in rectangular coordinates $r^2\, d\theta = -y\, dx + x\, dy$. Thus
$$\int_\Gamma r^2\, d\theta = \int_{-1}^{1} -(-1)\, dx + \int_{-1}^{1} (1)\, dy - \int_{-1}^{1} -(1)\, dx - \int_{-1}^{1} (-1)\, dy = 8$$
24. Find $\int_\Gamma (x^2 + y^2 + z^2)(dx + xy\, dy + dz)$ around the curve
$x^2 + y^2 = a^2$, $x^2 + y^2 + z^2 = b^2$. This can be parametrized by
$$x = a\cos\theta, \qquad y = a\sin\theta, \qquad z = \pm(b^2 - a^2)^{1/2}$$
and thus has two branches. On each branch $x^2 + y^2 + z^2 = b^2$ and dz = 0. Thus
$$\int_\Gamma (x^2 + y^2 + z^2)(dx + xy\, dy + dz) = 2b^2 \int_0^{2\pi} (-a\sin\theta + a^3\cos^2\theta\,\sin\theta)\, d\theta = 0$$
In case the curve Γ is a closed path (a continuous image of a circle) it is
customary to write $\oint_\Gamma$ to indicate that the integration is around a loop. We
now summarize what we know so far about the integration of differential
forms.
Theorem 7.4. Let $\omega = \sum a_i\, dx^i$ be a differentiable differential form defined
on a domain D in $R^n$.
(i) ω is the differential of a function if and only if $\oint_\Gamma \omega = 0$ for all closed
curves Γ.
(ii) ω is the differential of a function if and only if the field $(a_1, \ldots, a_n)$
is conservative.
(iii) If ω = df, then
$$\frac{\partial a_i}{\partial x^j}(p) = \frac{\partial a_j}{\partial x^i}(p) \qquad \text{all } i, j, \text{ all } p \in D \tag{7.38}$$
When is a Closed Form Exact ?
For certain domains, Equations (7.38) are sufficient to guarantee that the
form ω is the differential of a function; but this is not always true. For
example, let
$$\omega = \frac{-y\, dx + x\, dy}{x^2 + y^2} \quad \text{in } R^2 - \{(0, 0)\}$$
Certainly, ω satisfies the required conditions (recall Example 5).
If ω were the differential of a function, then we would have $\int_\Gamma \omega = 0$ for
every closed curve Γ. However, since ω = dθ (as remarked in Example 5),
$\int_\Gamma \omega = 2\pi$ if Γ is a circle centered at the origin. Notice that in some sense
ω is the differential of a function, albeit not single valued. If we exclude
the line x = 0 (or the line y = 0), in the remaining domain we can take a
principal value of $\theta = \tan^{-1}(y/x)$; but we cannot find a continuous single-valued
function on all of $R^2 - \{(0, 0)\}$ whose differential is ω.
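The failure of exactness for this form shows up immediately in a numerical experiment. The following sketch (the function name `integrate_dtheta` and the choice of circles are illustrative assumptions) integrates $\omega = (-y\, dx + x\, dy)/(x^2 + y^2)$ counterclockwise around a circle that encloses the origin and around one that does not.

```python
import math

def integrate_dtheta(cx, cy, radius, n=20000):
    """Integrate (-y dx + x dy)/(x^2 + y^2) counterclockwise around the
    circle with center (cx, cy) and the given radius (midpoint rule in t)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        x = cx + radius * math.cos(t)
        y = cy + radius * math.sin(t)
        dx = -radius * math.sin(t)
        dy = radius * math.cos(t)
        total += (-y * dx + x * dy) / (x * x + y * y)
    return total * h

around_origin = integrate_dtheta(0.0, 0.0, 1.0)  # circle encloses the origin
off_origin = integrate_dtheta(3.0, 0.0, 1.0)     # circle misses the origin
print(around_origin, off_origin)
```

The first value is close to 2π and the second is close to 0: the integral detects whether the loop surrounds the missing point.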
Of course, in the above example, in any small enough neighborhood of any
point in $R^2 - \{(0, 0)\}$ we can write ω = df for some function f. This is in
fact true for any differential form satisfying the compatibility equations
(7.38). That is, suppose $\omega = \sum a_i\, dx^i$ is a differentiable differential form
defined in a neighborhood U of $p_0$ in $R^n$ and the equations (7.38) are satisfied.
Then if B is a ball centered at $p_0$ and contained in U, there is a differentiable
function f defined in B such that df = ω in B. This is really easy to prove:
if p is any point in B, let $L_p$ be the oriented line from $p_0$ to p and define
$f(p) = \int_{L_p} \omega$. Then we can differentiate f with respect to $x^j$ by differentiating
under the integral sign.
Now the integrand will have one term of the form $\sum_i (\partial a_i/\partial x^j)\, dx^i$, which is
by Equations (7.38) the same as $\sum_i (\partial a_j/\partial x^i)\, dx^i = da_j$. This is the essential
term: by the fundamental theorem of calculus we can conclude from $\partial f/\partial x^j = \int da_j$ that $\partial f/\partial x^j = a_j$, as desired. Here is the precise proof.
Theorem 7.5. (Poincare's Lemma) Suppose that D is a domain with this
property: there is a $p_0 \in D$ such that for every $p \in D$ the line joining p to $p_0$ is
also in D. (D is star shaped (see Figures 7.4 and 7.5).) Then in D every
closed form is exact.
Proof. We may suppose $p_0$ is the origin. For $p \in D$, let $L_p$ be the oriented line
segment joining 0 to p. We may parametrize $L_p$ by
$$L_p: x = x(t) = tp, \qquad 0 \le t \le 1 \tag{7.39}$$
If ω is a closed form, define $f(p) = \int_{L_p} \omega$. We shall show that df = ω. In
coordinates, $p = (x^1, \ldots, x^n)$, $\omega = \sum a_i\, dx^i$, and by (7.39)
$$f(x^1, \ldots, x^n) = \int_{L_p} \sum_i a_i\, \frac{dx^i}{dt}\, dt = \int_0^1 \sum_{i=1}^{n} a_i(tx)\, x^i\, dt$$
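Figure 7.4. D is star shaped.
Figure 7.5. D is not star shaped.
Then, differentiating under the integral sign:
$$\frac{\partial f}{\partial x^j}(p) = \int_0^1 \sum_{i=1}^{n} \Big[ \frac{\partial a_i}{\partial x^j}(tp)\, t\, x^i + a_i(tp)\, \frac{\partial x^i}{\partial x^j} \Big]\, dt = \int_0^1 \sum_{i=1}^{n} \frac{\partial a_i}{\partial x^j}(tp)\, x^i\, t\, dt + \int_0^1 a_j(tp)\, dt$$
Now, using the compatibility equations, the integrand of the first integral takes
the form
$$\sum_i \frac{\partial a_i}{\partial x^j}(tp)\, x^i\, t = \sum_i \frac{\partial a_j}{\partial x^i}(tp)\, x^i\, t = \frac{d}{dt}[a_j(tp)] \cdot t$$
We can now compute the first integral by integration by parts:
$$\int_0^1 \frac{d}{dt}[a_j(tp)]\, t\, dt = a_j(tp) \cdot t\, \Big|_0^1 - \int_0^1 a_j(tp)\, dt$$
Thus
$$\frac{\partial f}{\partial x^j}(p) = a_j(p) \cdot 1 - \int_0^1 a_j(tp)\, dt + \int_0^1 a_j(tp)\, dt = a_j(p)$$
and the proof of Poincare's lemma is concluded.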
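The function $f(p) = \int_0^1 \sum a_i(tp)\, x^i\, dt$ used in the proof is easy to evaluate numerically. The sketch below is an illustration under stated assumptions: the closed form $\omega = (2xy + \cos x)\, dx + (x^2 + 3y^2)\, dy$ on the (star-shaped) plane is a made-up example, and the names `a`, `poincare_potential` and `partials` are hypothetical. It checks that the partial derivatives of f reproduce the coefficients of ω.

```python
import math

def a(p):
    """Coefficients (a_1, a_2) of an example closed form:
    d/dy (2xy + cos x) = 2x = d/dx (x^2 + 3y^2)."""
    x, y = p
    return (2 * x * y + math.cos(x), x * x + 3 * y * y)

def poincare_potential(p, n=2000):
    """f(p) = int_0^1 sum_i a_i(t p) x^i dt, the integral of omega along
    the segment from the origin to p (midpoint rule in t)."""
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        coeffs = a([t * xi for xi in p])
        total += sum(ai * xi for ai, xi in zip(coeffs, p))
    return total * h

def partials(func, p, eps=1e-5):
    """Central-difference approximation to the partial derivatives of func at p."""
    out = []
    for j in range(len(p)):
        hi, lo = list(p), list(p)
        hi[j] += eps
        lo[j] -= eps
        out.append((func(hi) - func(lo)) / (2 * eps))
    return out

p = (0.8, -0.6)
f_value = poincare_potential(p)
df = partials(poincare_potential, p)
print(f_value, df, a(p))
```

For this example the integral can also be done by hand, giving $f = x^2 y + \sin x + y^3$, so the numerical value and its partial derivatives can be compared against a closed formula.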
Poincare's lemma serves to indicate the nature of the solution to the basic
question: when are closed forms exact? It depends on the shape of the
domain. If the domain is a ball, or a cube, or any " star-shaped " domain,
then every differential form which satisfies the compatibility differential
equations (7.38) is the differential of a function. On the other hand, if the
domain has holes (as does $R^2 - \{0\}$), there are closed forms which are not
exact. We have seen, to be precise, in the discussion following Example 18
that on $R^2 - \{0\}$ the dimension of the space of closed forms exceeds the
dimension of the space of exact forms by one. Problems 15 and 33 are
devoted to showing that when we remove a finite number of points from $R^2$
this excess dimension on the remaining space is the same as the number of
removed points. These examples suggest that domains with holes are not
just defective in the closed-exact problem, but further that the solution to this
problem gives a measure of the defect. This striking relationship between
the shape, or topology, of the domain and the analytic question of integrability
persists when we move to more complicated domains, or surfaces,
and even into higher dimensions. The shape of a pretzel is accurately
reflected in the closed vs. exact controversy on its surface. The general
theorem relating this analysis to the topology of the domain is de Rham's
theorem and is one of the cornerstones of the modern subject of differential
topology.
Now, back in one dimension, the fundamental theorem of calculus relates
the values of a function on the boundary of an interval with the integral of
its derivative over the interval:
$$f(b) - f(a) = \int_a^b df = \int_a^b \frac{df}{dt}\, dt \tag{7.40}$$
The analog of this theorem for differential forms in $R^2$ is Green's theorem;
there are many analogs in higher dimensions and we shall study some of
these in the next chapter. For the remainder of the present chapter we shall
study only the two-variable case.
Suppose D is a domain in $R^2$, and the boundary of D is made up of a finite
collection of curves (see Figure 7.6). We make the boundary into an
oriented path by choosing the direction of motion so that the domain D is
always on the left. If T → N is the (right-handed) tangent-normal frame on
the domain, then the normal N always points into the domain (see Figure
7.7). We shall refer to the boundary of D when so oriented as ∂D. Now
Green's theorem simply says this: if ω is a $C^1$ differential form defined on a
neighborhood of D, then
$$\int_{\partial D} \omega = \int_D d\omega \tag{7.41}$$
Figure 7.6
Figure 7.7
If we consider the boundary of the interval in (7.40) as oriented in some
appropriate way, then (7.41) appears to be a direct generalization of (7.40).
In order to see why (7.41) is true, we must first assume that D is of a special
form. We say that the domain D is regular if it can be expressed in both of the
following forms:
$$D = \{(x, y) \in R^2: a \le x \le b,\ f(x) \le y \le g(x)\} \tag{7.42}$$
$$= \{(x, y) \in R^2: \alpha \le y \le \beta,\ \varphi(y) \le x \le \psi(y)\} \tag{7.43}$$
(see Figure 7.8).
For regular domains, Green's theorem follows easily from the fundamental
theorem of calculus. Let ω be a given $C^1$ form, and write ω = p dx + q dy.
Figure 7.8. A regular domain and an irregular domain.
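Then
$$\int_D d\omega = \int_D (q_x - p_y)\, dx\, dy = \int_D q_x\, dx\, dy - \int_D p_y\, dy\, dx$$
We perform these integrations by iteration: integrate with respect to x first in the first integral,
and with respect to y first in the second.
$$\int_D q_x\, dx\, dy = \int_\alpha^\beta \Big[ \int_{\varphi(y)}^{\psi(y)} \frac{\partial q}{\partial x}\, dx \Big]\, dy = \int_\alpha^\beta q(\psi(y), y)\, dy - \int_\alpha^\beta q(\varphi(y), y)\, dy \tag{7.44}$$
Now, we can parametrize ∂D in two parts as
$$\partial D = \Gamma_1 + \Gamma_2$$
$$\Gamma_1: x(t) = (\psi(t), t), \qquad \alpha \le t \le \beta$$
$$-\Gamma_2: x(t) = (\varphi(t), t), \qquad \alpha \le t \le \beta$$
Thus
$$\int_{\partial D} q\, dy = \int_{\Gamma_1} q\, dy - \int_{-\Gamma_2} q\, dy = \int_\alpha^\beta q(\psi(t), t)\, dt - \int_\alpha^\beta q(\varphi(t), t)\, dt \tag{7.45}$$
Comparing (7.44) and (7.45) we deduce that
$$\int_D q_x\, dx\, dy = \int_{\partial D} q\, dy \tag{7.46}$$
We leave it to the reader to verify by the same kind of argument that
$$\int_D p_y\, dx\, dy = -\int_{\partial D} p\, dx \tag{7.47}$$
(Problem 25). Equations (7.46) and (7.47) together give Green's theorem.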
Now, not every domain can be represented in both the required ways;
in fact, in general neither is possible. However, for most domains D it is
true that D can be covered by finitely many disks $\Delta_1, \ldots, \Delta_s$ so that $D \cap \Delta_i = D_i$ is regular for every i. Clearly, if D is bounded by finitely many polygonal
curves this is true. All but the most pathological domains that we have seen
have this property. The above argument generalizes easily to these types of
domains. We shall now call any such domain regular.
Definition 9. A domain D is regular if its boundary is a path and if D
can be covered by disks $\Delta_1, \ldots, \Delta_s$ such that each $D \cap \Delta_i$ can be represented
in both forms (7.42) and (7.43).
Theorem 7.6. (Green's Theorem) Let D be a regular domain and ω a
differential form defined on a neighborhood of D. Then,
$$\int_{\partial D} \omega = \int_D d\omega$$
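Proof. Let $D_i = D \cap \Delta_i$, where the disks $\Delta_1, \ldots, \Delta_s$ are as given in the definition.
In particular, by the preceding arguments, Green's theorem is true on $D_i$ for each i.
Let $\rho_1, \ldots, \rho_s$ be a partition of unity subordinate to the covering $\Delta_1, \ldots, \Delta_s$. Then
the $\rho_i$ are $C^\infty$ functions and $\sum \rho_i = 1$ on D, and $\rho_i$ is nonzero only inside $\Delta_i$. Now,
by Green's theorem on $D_i$,
$$\int_{\partial D_i} \rho_i\, \omega = \int_{D_i} d(\rho_i\, \omega)$$
Since $\rho_i\, \omega$ is zero off $D_i$,
$$\int_{D_i} d(\rho_i\, \omega) = \int_D d(\rho_i\, \omega)$$
But also $\rho_i\, \omega$ is nonzero only on the part of each of the curves $\partial D$, $\partial D_i$ which is
common to both, thus also
$$\int_{\partial D} \rho_i\, \omega = \int_{\partial D_i} \rho_i\, \omega$$
Thus
$$\int_{\partial D} \rho_i\, \omega = \int_D d(\rho_i\, \omega), \qquad 1 \le i \le s$$
Adding these equations, we obtain Green's theorem for D since $\sum \rho_i = 1$:
$$\int_{\partial D} \omega = \int_{\partial D} \sum_i \rho_i\, \omega = \sum_i \int_{\partial D} \rho_i\, \omega = \sum_i \int_D d(\rho_i\, \omega) = \int_D d\Big( \sum_i \rho_i\, \omega \Big) = \int_D d\omega$$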
Examples
25. Let D be the unit rectangle [(0, 0), (1, 1)]. Then, by Green's
theorem
$$\int_{\partial D} x^2 y\, dx + (x - y)\, dy = \int_D (1 - x^2)\, dx\, dy = \int_0^1 (1 - x^2)\, dx = \frac{2}{3}$$
26. The integral of $\omega = \cos xy\, dx + y \cos x\, dy$ over the boundary
of the domain
$$D = \{(x, y): 0 \le x^2 \le y \le 1\}$$
is
$$\int_{\partial D} \omega = \int_D [-y \sin x + x \sin xy]\, dx\, dy = \int_{-1}^{1} \int_{x^2}^{1} [-y \sin x + x \sin xy]\, dy\, dx$$
Green's theorem is also convenient for transforming double integrals
into line integrals. Noticing that dx dy arises as d(-y dx) or d(x dy)
in Green's theorem, we may compute areas of domains by line integrals.
27. Find the area bounded by the curves $y = 1 - x^4$ and $y = 1 - x^6$
in the upper half plane:
$$\text{area} = \int_D dx\, dy = -\int_{\partial D} y\, dx = -\int_{-1}^{1} (1 - x^4)\, dx + \int_{-1}^{1} (1 - x^6)\, dx = \frac{4}{35}$$
28. Find the area inside the ellipse
$$E: \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$
We can parametrize E by the polar angle:
$$x = a \cos\theta, \qquad y = b \sin\theta$$
Thus
$$\text{area} = \int_E x\, dy = ab \int_0^{2\pi} \cos^2\theta\, d\theta = \pi ab$$
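The boundary integrals in Examples 25, 27 and 28 can be evaluated directly, without Green's theorem, and compared with the double-integral answers. The sketch below is illustrative only: `line_integral` is a hypothetical helper using the midpoint rule, and the semi-axes a = 3, b = 2 of the ellipse are arbitrary choices.

```python
import math

def line_integral(p, q, g, dg, a, b, n=20000):
    """Approximate the integral of p dx + q dy over the curve t -> g(t),
    a <= t <= b, where dg(t) = g'(t) (midpoint rule)."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        x, y = g(t)
        dx, dy = dg(t)
        total += p(x, y) * dx + q(x, y) * dy
    return total * h

zero = lambda x, y: 0.0

# Example 25: x^2 y dx + (x - y) dy around the unit square, counterclockwise.
square = [
    (lambda t: (t, 0.0),       lambda t: (1.0, 0.0)),
    (lambda t: (1.0, t),       lambda t: (0.0, 1.0)),
    (lambda t: (1.0 - t, 1.0), lambda t: (-1.0, 0.0)),
    (lambda t: (0.0, 1.0 - t), lambda t: (0.0, -1.0)),
]
ex25 = sum(line_integral(lambda x, y: x * x * y, lambda x, y: x - y,
                         g, dg, 0.0, 1.0) for g, dg in square)

# Example 27: area = -(integral of y dx) over the boundary; the lower curve
# y = 1 - x^4 runs left to right, the upper curve y = 1 - x^6 right to left.
minus_y = lambda x, y: -y
lower = line_integral(minus_y, zero,
                      lambda t: (t, 1 - t ** 4), lambda t: (1.0, -4 * t ** 3),
                      -1.0, 1.0)
upper = line_integral(minus_y, zero,
                      lambda t: (-t, 1 - t ** 6), lambda t: (-1.0, -6 * t ** 5),
                      -1.0, 1.0)
area27 = lower + upper

# Example 28: area = integral of x dy around the ellipse x = a cos t, y = b sin t.
a_axis, b_axis = 3.0, 2.0
area28 = line_integral(zero, lambda x, y: x,
                       lambda t: (a_axis * math.cos(t), b_axis * math.sin(t)),
                       lambda t: (-a_axis * math.sin(t), b_axis * math.cos(t)),
                       0.0, 2 * math.pi)

print(ex25, area27, area28)
```

The three values agree with 2/3, 4/35 and πab (here 6π) respectively.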
EXERCISES
14. Compute the line integrals of differential forms arising out of the
work problems in Exercise 12(a), (b) using Green's theorem.
15. Compute $\int_\Gamma \omega$ for given ω and Γ (using Green's theorem if
convenient).
(a) $\omega = z\, dx + x\, dy + y\, dz$ Γ: closed oriented polygon with
successive vertices (0, 0, 0), (0, 1, 1), (1, 0, 0), (-1, -1, -1).
(b) $\omega = x^2 y\, dx + y^2 x\, dy$ Γ: the ellipse $a^2x^2 + b^2y^2 = 1$.
(c) $\omega = (x + y)\, dx + (x^2 + y^2)\, dy$ Γ: the triangle with successive
vertices (0, 0), (4, 0), (2, 3).
(d) $\omega = x^2\, dy + 2xy\, dx$ Γ: $z = e^{(1+i)t}$ from t = 0 to t = 2.
(e) $\omega = (x + y)\, dx + (y + z)\, dy + (z + x)\, dz$
Γ: the circle $x^2 + z^2 = 1$, y = 3.
16. Compute, using Green's theorem, the area of the domain D:
(a) $D = \{(x, y): 0 \le \sin x \le y \le \tan x \le 1\}$
(b) D is the domain in the upper half plane bounded by the ellipse
$x^2 + 2y^2 = 1$ and the parabola $x = 2y^2$.
(c) D is the quadrilateral with vertices at (0, 0), (1, 0), (7, 3), (2, 5).
(d) D is the domain inside the curve $x = \cos^n t$, $y = \sin^n t$, n > 0.
PROBLEMS
25. Verify Equation (7.47) in the text and conclude the proof of Green's
theorem.
26. Using Green's theorem prove that if ω is a closed differential form in
all of R2, then ω is exact.
27. A differential form is called radial if it is of the form $\langle F, dx \rangle$ where
F is a radial vector field (see Problem 24). Show that if ω is radial, it is of
the form f(r) dr.
28. Show that if ω is a compactly supported form on the plane (that is, it is identically zero
outside some large disk), then
(a) $\int_{R^2} d\omega = 0$
(b) f ω = ί dm
29. Show that if ω is a compactly supported closed form in $R^2$, it is the
differential of a compactly supported function.
30. If ω is a differential form, define *ω as follows: if
$$\omega = \langle F, dx \rangle, \qquad {*\omega} = \langle {*F}, dx \rangle$$
(a) Show that if ω = p dx + q dy, then *ω = -q dx + p dy.
(b) Show that (in a disk) *df is also exact if and only if f is harmonic.
(c) Show, using complex notation,
$${*\omega}(T) = \omega(iT)$$
(d) If f is harmonic, let f* be such that df* = *df. Show that f + if*
satisfies the Cauchy-Riemann equations.
31. Let Γ be an oriented curve in $R^2$, with tangent T and normal N. If
f is a differentiable function we define these derivatives of f along Γ:
$$\frac{\partial f}{\partial T} = df(T) = \langle \nabla f, T \rangle, \qquad \frac{\partial f}{\partial N} = df(N) = \langle \nabla f, N \rangle$$
Show that
/γΉγ*ϊλ Lm*=L*«
32. Suppose that D is a regular domain and f, g are twice differentiable
functions defined on a neighborhood of D. Verify these formulas (using
Green's theorem):
iLds^O
aDST
(a) f
•Id
(b) \j^ds=\\Xixi~%dix)dxdy
(d) Lss*=JJ>**
(e) L ^ έ ^=ίί„ [^Δ/ + <Vg-v/> ] rfx dy
(f) Ц^-^1)л-Яв^-^)л
7.6 Applications of Green's Theorem
Several of the exercises at the end of the previous section have indicated
the uses of Green's theorem. The rest of this chapter is devoted to the
application of this theorem to some of the topics we have been developing.
We shall leave aside until the next section its more profound uses in the study
of complex differentiable functions.
The Shape of the Domain
The most immediate implication of Green's theorem is the suggestion of
the relationship of the shape of a domain to the question of the exactness of
closed forms. If every closed curve in the domain D is the boundary of a
subdomain in D, then every closed form is exact. For, suppose ω is a closed
form. By Theorem 7.4 (i), to show that ω is exact, we need only verify that
its integral over any closed curve is zero. If Γ is such a curve, then by
hypothesis it is the boundary of a subdomain E. Then, by Green's theorem
$$\int_\Gamma \omega = \int_E d\omega = 0$$
since dω = 0.
We can say that a domain D "has no holes" if every closed curve in D
is the boundary of a subdomain of D. This is intuitively clear: we can draw
a loop around any hole which will bound the hole and this is not a subdomain
in D. The further study and precision of these notions is a rather difficult
branch of mathematics and falls within the domain of topology. It turns
out that there is a precise relation between this vague geometric study and
the question of exactness. The number of "holes" in the domain is the
same as the number of independent closed but nonexact forms. We already
saw that (in Section 7.2) for $R^2 - \{0\}$ and in Problem 15 for $R^2 - \{0, 1\}$.
That argument easily generalizes to the case of the complement of finitely
many points, $p_1, \ldots, p_s$. Let $\theta_i(z) = \arg(z - p_i)$. Although $\theta_i$ is not a
well-defined function on $R^2 - \{p_1, \ldots, p_s\}$, $d\theta_i$ is a well-defined form.
Clearly, $d\theta_1, \ldots, d\theta_s$ are independent, so there are at least s independent
closed nonexact forms on $R^2 - \{p_1, \ldots, p_s\}$. Now, let ω be any closed
form and define
$$c_i(\omega) = \frac{1}{2\pi} \int_{C_i} \omega$$
where $C_i$ is a small circle centered at $p_i$. Since $\int_{C_i} d\theta_j$ is 2π for j = i and 0 otherwise, the form
$$\omega' = \omega - \sum_{i=1}^{s} c_i(\omega)\, d\theta_i$$
has zero integral around every $C_i$, and in fact
is exact. This can be proven by verifying condition (i) of Theorem 7.4 by
Green's theorem (see Problem 33). Thus if ω is any closed form it is, but
for an exact form, a linear combination of the $d\theta_i$.
Area Computation
Now, as in Examples 27 and 28, we can compute areas by boundary integrals:
if D is a regular domain,
$$\text{area of } D = \iint_D dx\, dy = \int_{\partial D} x\, dy = -\int_{\partial D} y\, dx = \frac{1}{2} \int_{\partial D} x\, dy - y\, dx \tag{7.48}$$
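Example
29. The area of a trapezoid is $\tfrac{1}{2}(b_1 + b_2)h$ (see Figure 7.9). Only the two slanted sides $L_1$, $L_2$ contribute to the boundary integral, since dy = 0 on the horizontal sides:
$$\text{area} = \int_{\partial D} x\, dy = \int_{L_1} x\, dy + \int_{L_2} x\, dy$$
$$L_1: y = \frac{h}{a + b_2 - b_1}(x - b_1), \quad x \text{ between } b_1 \text{ and } a + b_2; \qquad L_2: y = \frac{h}{a}\, x, \quad x \in [0, a]$$
$$\text{area} = \int_{b_1}^{a + b_2} x\, \frac{h}{a + b_2 - b_1}\, dx + \int_a^0 x\, \frac{h}{a}\, dx = \frac{h}{2} \Big[ \frac{(a + b_2)^2 - b_1^2}{a + b_2 - b_1} - \frac{a^2}{a} \Big] = \frac{h}{2}[a + b_2 + b_1 - a] = \frac{1}{2}(b_1 + b_2)h$$
Figure 7.9. A trapezoid with vertices including $(a + b_2, h)$ and $(b_1, 0)$.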
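For a polygon, the last expression in (7.48) can be summed edge by edge: on the segment from $(x_0, y_0)$ to $(x_1, y_1)$ the integral of $x\, dy - y\, dx$ is exactly $x_0 y_1 - x_1 y_0$. The sketch below applies this to a trapezoid. The function name `polygon_area`, the vertex layout (0, 0), $(b_1, 0)$, $(a + b_2, h)$, (a, h), which is read off from the equations of $L_1$ and $L_2$, and the sample numbers are assumptions made for illustration.

```python
def polygon_area(vertices):
    """Area enclosed by a closed polygon via (1/2) * integral of x dy - y dx.

    `vertices` lists the corners in counterclockwise order; each edge from
    (x0, y0) to (x1, y1) contributes x0*y1 - x1*y0 to the line integral.
    """
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        total += x0 * y1 - x1 * y0
    return total / 2.0

# Trapezoid with a = 1, b1 = 4, b2 = 2, h = 3 (sample values).
a, b1, b2, h = 1.0, 4.0, 2.0, 3.0
trapezoid = [(0.0, 0.0), (b1, 0.0), (a + b2, h), (a, h)]
print(polygon_area(trapezoid), 0.5 * (b1 + b2) * h)
```

Listing the vertices clockwise instead flips the sign of the result, which reflects the orientation convention for ∂D.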
Integration after a Change of Variable
A line integral of a differential form is the same, no matter what coordinates
are used to compute it (recall Proposition 4). Using this knowledge and
the preceding computational techniques we can find a formula for computing
double integrals by a coordinate change.
Suppose that F is a nonsingular differentiable transformation of the domain
D onto the domain E (that is, F maps D one-to-one onto E and dF is
everywhere nonsingular). Let us write F in terms of coordinates:
$$F: \begin{cases} u = u(x, y) \\ v = v(x, y) \end{cases} \quad (x, y) \in D \qquad F^{-1}: \begin{cases} x = x(u, v) \\ y = y(u, v) \end{cases} \quad (u, v) \in E \tag{7.49}$$
If Γ is a path in D, then F(Γ) is a path in E. If ω = p dx + q dy is a
differential form defined on D, we may associate it to a form on E: $\tilde\omega = \alpha\, du + \beta\, dv$, where the coefficients are given (see (7.24)) by the coordinate
change (x, y) → (u, v). Then $\int_\Gamma \omega = \int_{F(\Gamma)} \tilde\omega$, since they represent the same
integration relative to two different coordinate sets. Now, if Γ bounds a
domain Δ, F(Γ) bounds F(Δ), and if we apply Green's theorem to both sides
we will obtain a relation between the double integrals. However, to apply
Green's theorem we must be sure that both Γ and F(Γ) are oriented as the
boundary of the domains Δ, F(Δ), respectively. That is not necessarily
the case.
Example
30. The transformation
$$u = y, \qquad v = x$$
amounts to reflection in the line x = y. If Γ is a circle centered on
that line, Γ and F(Γ) are the same curve, but oriented in opposite
directions (see Figure 7.10).
This difficulty may be overcome by restricting attention exclusively to
transformations that preserve the sense of orientation around a curve. This
will be guaranteed if the sense of "counterclockwise" rotation about
corresponding points is the same. Thus, if we rotate the xy plane about the
point p in the clockwise sense, the induced motion under the transformation
T must also be clockwise. This will be the case if it is so for the linear
approximation dT(p), and that is guaranteed by
$$\frac{\partial(x, y)}{\partial(u, v)}(p) = \det \begin{pmatrix} \dfrac{\partial x}{\partial u}(p) & \dfrac{\partial x}{\partial v}(p) \\ \dfrac{\partial y}{\partial u}(p) & \dfrac{\partial y}{\partial v}(p) \end{pmatrix} > 0 \tag{7.50}$$
Figure 7.10
These remarks are not completely obvious, but we shall not pause to
verify them. It is intuitively clear that the sense of rotation at a point is the
same for the transformation and its differential. What is not so clear, and
more difficult to obtain is that this local criterion assures that the sense of
orientation of any boundary is the same in the two coordinate systems. All
these geometric considerations can be avoided, by replacing them with
appropriate algebraic considerations. We shall see further illustrations of
the difficulty in a purely geometric, rather than algebraic, approach in the
next chapter.
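In any event, if (7.49) defines a change of variables satisfying condition
(7.50), then for any subdomain Δ of D, ∂Δ and $\partial F^{-1}(\Delta)$ define the same
orientation on the boundary of Δ. Thus, if ω is any differential form,
$$\int_{\partial\Delta} \omega = \int_{\partial F^{-1}(\Delta)} \omega$$
In particular,
$$\text{area}(\Delta) = \int_{\partial\Delta} x\, dy = \int_{\partial F^{-1}(\Delta)} x\, dy = \int_{\partial F^{-1}(\Delta)} x \frac{\partial y}{\partial u}\, du + x \frac{\partial y}{\partial v}\, dv$$
$$= \int_{F^{-1}(\Delta)} \Big[ \frac{\partial}{\partial u}\Big(x \frac{\partial y}{\partial v}\Big) - \frac{\partial}{\partial v}\Big(x \frac{\partial y}{\partial u}\Big) \Big]\, du\, dv = \int_{F^{-1}(\Delta)} \frac{\partial(x, y)}{\partial(u, v)}\, du\, dv$$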
A more important formula is that allowing us to compute double integrals
with respect to the new coordinates (u, v).
Theorem 7.7. Let D be a domain in the plane, and suppose
$$x = x(u, v), \qquad y = y(u, v)$$
is an orientation-preserving change of coordinates (that is, $\partial(x, y)/\partial(u, v) > 0$).
Let E be the domain in (u, v) variables corresponding to D. If f is a function
defined on a rectangle containing D, then $\int_D f$ can be computed in terms of the
(u, v) coordinates:
$$\int_D f = \int_E f(x(u, v), y(u, v))\, \det \frac{\partial(x, y)}{\partial(u, v)}(u, v)\, du\, dv \tag{7.51}$$
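Proof. Let R = [(a, b), (α, β)] and define
$$F(x, y) = \int_a^x f(t, y)\, dt \qquad \text{for } (x, y) \in R$$
Thus F(x, y) is a $C^1$ differentiable function on R such that $\partial F/\partial x = f$. Now, by
Green's theorem
$$\int_D f\, dx\, dy = \int_{\partial D} F\, dy$$
We can compute the integral over ∂D in the (u, v) coordinates:
$$\int_{\partial D} F\, dy = \int_{\partial E} F\, dy = \int_{\partial E} F \frac{\partial y}{\partial u}\, du + F \frac{\partial y}{\partial v}\, dv$$
By Green's theorem (in the (u, v) variables), the last integral is
$$\int_E \Big[ \frac{\partial}{\partial u}\Big(F \frac{\partial y}{\partial v}\Big) - \frac{\partial}{\partial v}\Big(F \frac{\partial y}{\partial u}\Big) \Big]\, du\, dv$$
$$= \int_E \Big[ \Big( \frac{\partial F}{\partial x} \frac{\partial x}{\partial u} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial u} \Big) \frac{\partial y}{\partial v} + F \frac{\partial^2 y}{\partial u\, \partial v} - \Big( \frac{\partial F}{\partial x} \frac{\partial x}{\partial v} + \frac{\partial F}{\partial y} \frac{\partial y}{\partial v} \Big) \frac{\partial y}{\partial u} - F \frac{\partial^2 y}{\partial v\, \partial u} \Big]\, du\, dv$$
$$= \int_E f(x(u, v), y(u, v))\, \det \frac{\partial(x, y)}{\partial(u, v)}\, du\, dv$$
Thus (7.51) is proven.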
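Examples
31.
$$\int_{x^2 + y^2 \le 1} \frac{dx\, dy}{(x^2 + y^2)^{1/2}} = \int_{r \le 1} \frac{r\, dr\, d\theta}{r} = \int_0^1 \Big[ \int_0^{2\pi} d\theta \Big]\, dr = 2\pi$$
32.
$$\int_{R^2} \exp[-(x^2 + y^2)]\, dx\, dy = \int_{R^2} \exp(-r^2)\, r\, dr\, d\theta = 2\pi \int_0^\infty \exp(-r^2)\, r\, dr = \pi[-\exp(-r^2)]_0^\infty = \pi$$
Notice that
$$\Big( \int_{-\infty}^{\infty} \exp(-t^2)\, dt \Big)^2 = \int_{-\infty}^{\infty} \exp(-x^2)\, dx \cdot \int_{-\infty}^{\infty} \exp(-y^2)\, dy = \int_{R^2} \exp[-(x^2 + y^2)]\, dx\, dy$$
Thus
$$\int_{-\infty}^{\infty} \exp(-t^2)\, dt = \sqrt{\pi}$$
a computation that would have been impossible without the change
of variable to polar coordinates.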
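The two sides of the computation in Example 32 can be compared numerically. This is a rough sketch under stated assumptions: the infinite ranges are cut off at R = 8 (where exp(-R²) is negligible), the integrals are approximated with the midpoint rule, and the helper name `midpoint` is hypothetical.

```python
import math

def midpoint(f, a, b, n=20000):
    """Midpoint-rule approximation to the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

R = 8.0  # cutoff standing in for infinity; exp(-64) is far below rounding error

# One-dimensional integral of exp(-t^2) over the whole line (truncated).
one_dim = midpoint(lambda t: math.exp(-t * t), -R, R)

# Polar form of the double integral: the Jacobian contributes the factor r,
# and the theta integration contributes the factor 2*pi.
polar = 2 * math.pi * midpoint(lambda r: math.exp(-r * r) * r, 0.0, R)

print(one_dim ** 2, polar, math.pi)
print(one_dim, math.sqrt(math.pi))
```

The square of the one-dimensional integral over the whole line and the polar double integral both come out close to π.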
The Divergence Theorem
The general form of Green's theorem first came up in the study of fluid
flows and the theory of potentials. In this study it arises in the form of the
divergence theorem, which we shall now discuss in two variables.
Let $\mathbf{v} = (v, w)$ be a vector field defined in some open set in the plane and
let $x = x(x_0, t)$ be the equations of the associated flow (that flow with velocity
field $\mathbf{v}$). Let D be a domain on which the flow takes place. The fluid which
at time t = 0 occupies D has moved after a time t to a domain $D_t$ given by
$$D_t = \{x: x = x(x_0, t),\ x_0 \in D\}$$
The area of $D_t$ is
$$\text{area}(D_t) = \int_{D_t} dx\, dy = \int_D \frac{\partial(x_t, y_t)}{\partial(x_0, y_0)}\, dx_0\, dy_0$$
where we have rewritten the equations of flow as
$$x = x(x_0, t) = (x_t(x_0, y_0), y_t(x_0, y_0))$$
The rate of change of the area of $D_t$ is
$$\frac{d}{dt}\, \text{area}(D_t) = \int_D \frac{\partial}{\partial t} \Big[ \frac{\partial(x_t, y_t)}{\partial(x_0, y_0)} \Big]\, dx_0\, dy_0 \tag{7.52}$$
Now let us evaluate this at time t = 0. Remembering that x(x0, 0) = x0,
we have
= Т-т—(х0,Уо 0) + —-—(xo,JO,0)
(=o dt дх0 tit ду0
д2у dw
δ
dt
δ'
dt
'дх, ду,
βχ0ду0
'■χ δ
3χ0 δχ0
дх, ду,
ду0 дх0
/дх\ (
\dt)~d
dt ду0 ду0
Thus the instantaneous rate of change of the area of D (Equation (7.52))
is given by
JD\dx
dV + ^)dxdy (7.53)
dy)
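The identity behind (7.52) and (7.53), that the time derivative of the Jacobian determinant at $t = 0$ equals the divergence of the velocity field, can be tested on a concrete flow. The sketch below uses the flow of Exercise 18(b); the function names and finite-difference step sizes are assumptions chosen for illustration.

```python
def flow(x0, y0, t):
    """Flow of Exercise 18(b): x = x0 (1 + t), y = y0 (1 - t^2)."""
    return x0 * (1 + t), y0 * (1 - t * t)

def jacobian_det(x0, y0, t, h=1e-5):
    """Central-difference estimate of det d(x_t, y_t)/d(x0, y0)."""
    xa, ya = flow(x0 + h, y0, t)
    xb, yb = flow(x0 - h, y0, t)
    xc, yc = flow(x0, y0 + h, t)
    xd, yd = flow(x0, y0 - h, t)
    dx_dx0, dy_dx0 = (xa - xb) / (2 * h), (ya - yb) / (2 * h)
    dx_dy0, dy_dy0 = (xc - xd) / (2 * h), (yc - yd) / (2 * h)
    return dx_dx0 * dy_dy0 - dx_dy0 * dy_dx0

def jacobian_rate(x0, y0, dt=1e-4):
    """Time derivative of the Jacobian determinant at t = 0."""
    return (jacobian_det(x0, y0, dt) - jacobian_det(x0, y0, -dt)) / (2 * dt)

def velocity(x0, y0, dt=1e-5):
    """Velocity field (v, w) at t = 0, i.e. the time derivative of the flow."""
    xa, ya = flow(x0, y0, dt)
    xb, yb = flow(x0, y0, -dt)
    return (xa - xb) / (2 * dt), (ya - yb) / (2 * dt)

def divergence(x0, y0, h=1e-4):
    """Central-difference estimate of dv/dx + dw/dy at t = 0."""
    v_right, _ = velocity(x0 + h, y0)
    v_left, _ = velocity(x0 - h, y0)
    _, w_up = velocity(x0, y0 + h)
    _, w_down = velocity(x0, y0 - h)
    return (v_right - v_left) / (2 * h) + (w_up - w_down) / (2 * h)

point = (0.7, -1.3)  # arbitrary sample point
print(jacobian_rate(*point), divergence(*point))
```

For this flow the velocity at $t = 0$ is $(x_0, 0)$, so both quantities should be close to 1 at every point.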
The integrand is called the divergence of the flow and is denoted div $\mathbf{v}$. The
divergence theorem says that this integral can be computed by a boundary
integral. To put it physically: the rate of expansion of $D$ is the same as the
rate at which fluid flows into $D$. We will now try to compute that latter
amount. Let $\partial D$ have the frame $\mathbf{T} \to \mathbf{N}$ so that $\mathbf{N}$ points into the domain
(see Figure 7.11). The amount of fluid passing into $D$ through a small
piece of the boundary (of length $\Delta s$) in a time $\Delta t$ is
\[ \langle \mathbf{v}, \mathbf{N} \rangle\, \Delta s\, \Delta t \]
The total amount passing through $\partial D$ is thus well approximated by a Riemann
sum for the integral
\[ \left[ \int_{\partial D} \langle \mathbf{v}, \mathbf{N} \rangle\, ds \right] \Delta t \]
Thus the rate at which fluid passes into $D$ can be thought to be given by
\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{N} \rangle\, ds \]
Using the notation of Exercise 29 this is the same as
By Green's theorem this is the same as (7.53). Thus the divergence theorem
is verified:
\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{N} \rangle\, ds = \int_D \text{div } \mathbf{v} \tag{7.54} \]
If $\mathbf{v}$ is a conservative field it has a potential function $f$, and $\langle \mathbf{v}, d\mathbf{x} \rangle = df$.
Then $\langle {*\mathbf{v}}, d\mathbf{x} \rangle = {*df}$ and (7.54) becomes
\[ \int_{\partial D} {*df} = \int_D d{*df} = \int_D \Delta f \]
Thus, if $f$ is the potential function for a conservative and incompressible
(divergence free) flow, $f$ must be a harmonic function. Dirichlet's problem
(to find a harmonic function with given boundary values) may be restated as:
find the conservative incompressible flow with given boundary potential
levels.
The Cauchy Theorem
This last remark leads directly to the study of complex analysis. Suppose
that $f$ is a complex-valued complex differentiable $C^1$ function defined on a
domain in the plane. Then
\[ \lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = f'(z) \]
exists for all $z$ and (what is the same assertion) the Cauchy-Riemann equations
hold:
\[ \frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y} \]
It follows that the form $f(z)\, dz$ is a closed complex-valued form:
\[ f\, dz = f\, dx + i f\, dy \]
\[ d(f\, dz) = \frac{\partial (if)}{\partial x} - \frac{\partial f}{\partial y} = i f_x - f_y = 0 \]
Theorem 7.8. (Cauchy's Theorem) If $f$ is a $C^1$ complex differentiable
function defined in the regular domain $D$, then
\[ \int_{\partial D} f\, dz = 0 \]
Proof. By Green's theorem
\[ \int_{\partial D} f\, dz = \int_D d(f\, dz) = 0 \]
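Cauchy's theorem can be observed numerically by integrating around a circle. This is a minimal sketch under stated assumptions: the helper `circle_integral`, the sample functions, and the choice of circle are all illustrative and do not come from the text.

```python
import cmath
import math

def circle_integral(f, center, radius, n=2000):
    """Trapezoid-rule value of the integral of f(z) dz taken
    counterclockwise over the circle |z - center| = radius."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        z = center + radius * w
        dz = 1j * radius * w * (2 * math.pi / n)
        total += f(z) * dz
    return total

# A complex differentiable integrand: the boundary integral should vanish.
analytic = circle_integral(lambda z: cmath.exp(z) * z ** 2, 0.3 + 0.2j, 1.5)

# The conjugate function violates the Cauchy-Riemann equations; its
# integral over the unit circle is 2*pi*i rather than zero.
conjugate = circle_integral(lambda z: z.conjugate(), 0, 1.0)

print(abs(analytic), conjugate)
```

The second integrand is included only for contrast: it is $C^1$ but not complex differentiable, so the hypothesis of Theorem 7.8 fails and so does the conclusion.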
EXERCISES
17. Compute the area of these domains:
(a) x* + У < a*
(b) x2y<l,0<x<a
(c) $r < 1 + 2\cos\theta$ (each section)
(d) $r \le e^{\theta}$, $0 < \theta \le 2\pi$
(e) The domain {u2 + v2 < 1/2}, where
и = x(l + χ cos υ)
υ = y(l +;vcosx)
(f) The domain $\{0 < u \le 1,\ 0 \le v < 1\}$, where $u = xy$, $v = x^2 - y^2$
18. Compute div ν for these flows:
(a) x(xo,0=exp
(-! -:>:
Xo
(b) $x = x_0(1 + t)$, $y = y_0(1 - t^2)$
(c) $\mathbf{v}(x, y) = (x^2 - y,\ y^2 - x)$
(d) $\mathbf{v}(x, y) = (x + y,\ x - y)$
PROBLEMS
33. Let $D = R^2 - \{p_1, \ldots, p_s\}$, where $p_1, \ldots, p_s$ are $s$ distinct points in the
plane. Show that there is an $s$-dimensional space $L$ of closed, but not exact
forms defined on $D$ such that every closed form can be written $df + \omega$, with
$\omega \in L$.
34. Let $\omega$ be a closed form in $R^2 - \{(0, 0)\}$. Show that if $\omega$ is exact in
some annulus $\{a < |z| < b\}$, then it is exact.
35. Let $f$ be a complex-valued differentiable function defined in the
domain $E$. Show that $f$ is complex analytic if and only if $\int_{\partial D} f\, dz = 0$ for all
subdomains $D$ of $E$. (Hint: $d(f\, dz) = 0$ is the same as the Cauchy-Riemann
equations.)
7.7 The Cauchy Integral Formula
In Chapter 5 we introduced the power series development of functions in
order to effectively compute solutions to certain differential equations.
Those functions which admit an expansion into a power series are called
analytic. We saw that this is the most computable class of functions. We
saw that such functions are differentiable in the complex sense, and that the
differential equations can be interpreted in the sense of complex variables.
In Chapter 6 we found that if a function is the sum of a convergent power
series in the closed unit disk, it can be computed by means of an integral
around the circle:
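if
\[ f(\zeta) = \sum_{n=0}^{\infty} a_n \zeta^n \quad \text{for } |\zeta| \le 1 \]
then
\[ f(\zeta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(e^{i\theta})\, e^{i\theta}}{e^{i\theta} - \zeta}\, d\theta \]
for $|\zeta| < 1$. The integral may be rewritten as a line integral:
\[ f(\zeta) = \frac{1}{2\pi i} \int_{|z|=1} \frac{f(z)\, dz}{z - \zeta} \]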
The Cauchy integral formula is a great generalization of this. It weakens
the hypothesis to that of complex differentiability and strengthens the
conclusion by replacing the unit circle by the boundary of any regular domain.
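Theorem 7.9. (Cauchy Integral Formula) Suppose that $f$ is a $C^1$ complex-valued
complex differentiable function defined in a neighborhood of the regular
domain $D$. Then, for $\zeta \in D$,
\[ f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} \tag{7.55} \]
Proof. Let $\Delta_n = \{z : |z - \zeta| < n^{-1}\}$. If $n$ is large enough, $\Delta_n$ is contained in
$D$ (see Figure 7.12) and $f(z)(z - \zeta)^{-1}$ is complex differentiable in $D - \Delta_n$. This is
because the product of complex differentiable functions is complex differentiable.
Thus $f(z)(z - \zeta)^{-1}\, dz$ is closed, so that
\[ \int_{\partial(D - \Delta_n)} \frac{f(z)\, dz}{z - \zeta} = 0 \]
Thus
\[ \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} = \int_{\partial \Delta_n} \frac{f(z)\, dz}{z - \zeta} = \int_0^{2\pi} \frac{f(\zeta + n^{-1} e^{i\theta})}{n^{-1} e^{i\theta}}\, i n^{-1} e^{i\theta}\, d\theta = i \int_0^{2\pi} f(\zeta + n^{-1} e^{i\theta})\, d\theta \]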
Figure 7.12
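But as $n \to \infty$, $f(\zeta + n^{-1} e^{i\theta}) \to f(\zeta)$ uniformly on the circle, just because $f$ is
continuous at $\zeta$. Since $n$ is arbitrary (but large),
\[ \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} = \lim_{n \to \infty} i \int_0^{2\pi} f(\zeta + n^{-1} e^{i\theta})\, d\theta = i \int_0^{2\pi} f(\zeta)\, d\theta = 2\pi i f(\zeta) \]
and thus (7.55) is proven.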
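Formula (7.55) lends itself to a direct numerical check when $D$ is a disk. The sketch below is illustrative only: the function `cauchy_value`, the sample function `g`, and the sample points are assumptions, and the boundary is discretized with the trapezoid rule.

```python
import cmath
import math

def cauchy_value(f, zeta, center=0j, radius=1.0, n=2000):
    """Approximate (1 / 2 pi i) * integral of f(z) dz / (z - zeta)
    over the circle |z - center| = radius, counterclockwise."""
    total = 0j
    for k in range(n):
        z = center + radius * cmath.exp(2j * math.pi * k / n)
        dz = 1j * (z - center) * (2 * math.pi / n)
        total += f(z) / (z - zeta) * dz
    return total / (2j * math.pi)

def g(z):
    """Hypothetical sample function, complex differentiable everywhere."""
    return cmath.exp(z) * cmath.cos(z)

inside = 0.3 - 0.4j   # a point of the unit disk
outside = 2.0 + 0j    # a point outside the closed unit disk

print(cauchy_value(g, inside), g(inside))  # boundary values recover g(inside)
print(abs(cauchy_value(g, outside)))       # close to zero, by Cauchy's theorem
```

For a point outside the disk the integrand is complex differentiable on all of $D$, so the integral vanishes by Theorem 7.8 instead of returning the value of the function.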
The Cauchy integral formula implies that complex differentiable functions
are extremely well behaved; after all a function certainly must be quite
special for it to be completely and explicitly determined within a domain by
its boundary values. Here are a few corollaries of Theorem 7.9 which
demonstrate this.
For simplicity of notation we shall write $f \in A(D)$ to mean that $f$ is a $C^1$
complex differentiable function on a regular domain $D$.
Proposition 5. (The Maximum Principle) Let $f$ be in $A(D)$. The
maximum of $|f|$ on $D$ is attained on $\partial D$.
Proof. Since $D$ is compact, the maximum of $|f|$ is attained at some point $\zeta \in D$.
If there is no point on $\partial D$ at which $|f|$ attains its maximum, then not only is $\zeta \notin \partial D$,
but
\[ |f(\zeta)| > \max\{|f(z)| : z \in \partial D\} \]
We shall show that this assumption leads to a contradiction. Define
\[ g(z) = \frac{f(z)}{f(\zeta)} \]
Then $g \in A(D)$ also, $g(\zeta) = 1$ and $\|g\|_{\partial D} < 1$. Then $g^n \to 0$ uniformly on $\partial D$ as
$n \to \infty$. Thus
\[ \int_{\partial D} \frac{g^n(z)\, dz}{z - \zeta} \to 0 \]
as $n \to \infty$. But, by the Cauchy integral formula, that integral is $2\pi i\, g^n(\zeta) = 2\pi i$
which does not tend to zero.
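Proposition 6. Suppose $f_n$, $f$ are all in $A(D)$ and $\lim f_n = f$ uniformly on
$\partial D$. Then $\lim f_n = f$ uniformly in $D$.
Proof. By assumption, $\|f_n - f\|_{\partial D} \to 0$ as $n \to \infty$. But since $f_n - f \in A(D)$, by
the maximum principle,
\[ \|f_n - f\|_D = \|f_n - f\|_{\partial D} \quad \text{so } \|f_n - f\|_D \to 0 \text{ as } n \to \infty \text{ also.} \]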
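Proposition 7. (Liouville's Theorem) If $f$ is bounded and complex differentiable
on the entire plane, $f$ is constant.
Proof. Let $M$ be an upper bound for $|f(z)|$. Let $\zeta_1$, $\zeta_2$ be any two points on the
plane. For $R > \max(|\zeta_1|, |\zeta_2|)$, the Cauchy integral formula on the disk of radius $R$ gives
\begin{align*}
|f(\zeta_1) - f(\zeta_2)| &= \left| \frac{1}{2\pi i} \int_{|z|=R} \left[ \frac{1}{z - \zeta_1} - \frac{1}{z - \zeta_2} \right] f(z)\, dz \right| \\
&= \left| \frac{\zeta_1 - \zeta_2}{2\pi i} \int_{|z|=R} \frac{f(z)\, dz}{(z - \zeta_1)(z - \zeta_2)} \right| \\
&\le \frac{M |\zeta_1 - \zeta_2|}{2\pi R} \int_0^{2\pi} \frac{d\theta}{|e^{i\theta} - \zeta_1 R^{-1}|\, |e^{i\theta} - \zeta_2 R^{-1}|}
\end{align*}
As $R \to \infty$, the integrand converges to 1. Thus the entire expression on the right
becomes arbitrarily small as $R \to \infty$. On the other hand, the left-hand side is
independent of $R$, hence must be zero. Thus, $f(\zeta_1) = f(\zeta_2)$ for any $\zeta_1$, $\zeta_2$.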
The most important property of complex differentiable functions is that
they are analytic, that is, they can be expressed as the sum of a convergent
power series about any point. The following theorem brings together all
the notions of analyticity and summarizes the basic properties of analytic
functions.
Theorem 7.10. Let $f$ be a $C^1$ complex-valued function defined in a
neighborhood of the regular domain $D$. The following assertions are equivalent:
(i) For any $\zeta \in D$, and $R$ such that the disk $\Delta(\zeta, R)$ is contained in $D$, $f$
is the sum in $\Delta(\zeta, R)$ of a convergent power series:
\[ f(z) = \sum_{n=0}^{\infty} a_n (z - \zeta)^n \tag{7.56} \]
(ii) $f$ is complex differentiable.
(iii) $f$ satisfies the Cauchy-Riemann equations:
\[ \frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y} \]
(iv) $f\, dz$ is closed.
(v) for any $\zeta \in D$,
\[ f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} \]
In case $f$ has these properties the coefficients $a_n$ of (7.56) are given by
\[ a_n = \frac{f^{(n)}(\zeta)}{n!} = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{(z - \zeta)^{n+1}} \tag{7.57} \]
Proof. The implications (i) $\Rightarrow$ (ii), (ii) $\Rightarrow$ (iii) were observed in Chapter 5,
(iii) $\Rightarrow$ (iv) in the preceding section and (iv) $\Rightarrow$ (v) is the Cauchy integral formula
(Theorem 7.9). That leaves only the implication (v) $\Rightarrow$ (i) and the first part of the
theorem will be proven. Suppose then, that (v) holds, and $\Delta(\zeta, R) \subset D$. We have
to show that $f$ can be expanded in a power series centered at $\zeta$. By hypothesis,
$\min\{|z - \zeta| : z \in \partial D\} \ge R$. Thus for $w \in \Delta(\zeta, R)$,
\[ \left| \frac{w - \zeta}{z - \zeta} \right| \le \frac{|w - \zeta|}{R} < 1 \]
for all $z \in \partial D$. Thus
\[ \frac{1}{z - w} = \frac{1}{z - \zeta - (w - \zeta)} = \frac{1}{z - \zeta} \left( 1 - \frac{w - \zeta}{z - \zeta} \right)^{-1} = \sum_{n=0}^{\infty} \frac{(w - \zeta)^n}{(z - \zeta)^{n+1}} \]
uniformly for $z \in \partial D$. We can thus substitute this sum for the term $(z - w)^{-1}$ in
the Cauchy integral:
\[ f(w) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - w} = \frac{1}{2\pi i} \int_{\partial D} f(z) \sum_{n=0}^{\infty} \frac{(w - \zeta)^n}{(z - \zeta)^{n+1}}\, dz \]
Thus $f$ is represented by a power series whose coefficients are given by the integrals
in (7.57). That the coefficients also are given by the successive derivatives as in
(7.57) was already observed as part of Taylor's formula. Thus, the theorem is
completely proven.
Examples
33. If $f$ is analytic in the disk $\Delta(\zeta, R)$, then the power series
representing $f$ near $\zeta$ actually converges to $f$ in the entire disk $\Delta(\zeta, R)$.
For, by Theorem 7.10, $f$ is, in this whole disk, the sum of a power
series centered at $\zeta$, but such a power series is uniquely determined by
$f$ so must be the given one. In particular, if $f$ is analytic in the entire
plane it can be expanded in a power series converging everywhere.
34. Suppose that $f$ is analytic near $\zeta$. Then
\[ \frac{f(z) - f(\zeta)}{z - \zeta} \tag{7.58} \]
is also analytic near $\zeta$. For, we can easily factor the Taylor expansion
of $f(z) - f(\zeta)$. If $f(z) = \sum_{n=0}^{\infty} a_n (z - \zeta)^n$, then
\[ f(z) - f(\zeta) = \sum_{n=1}^{\infty} a_n (z - \zeta)^n = (z - \zeta) \sum_{n=0}^{\infty} a_{n+1} (z - \zeta)^n \]
so (7.58) is given by $\sum_{n=0}^{\infty} a_{n+1} (z - \zeta)^n$. In particular, $z^{-1} \sin z$ is
analytic on the whole plane, and has the Taylor expansion
\[ z^{-1} \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n + 1)!} \]
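The integral expression (7.57) for the coefficients can be compared with the expansion of $z^{-1} \sin z$ just obtained. In this sketch the boundary is taken to be the unit circle and the integral is discretized by the trapezoid rule; the helper name `taylor_coefficient` and the sample count are assumptions made for illustration.

```python
import cmath
import math

def taylor_coefficient(f, n, radius=1.0, samples=512):
    """Approximate a_n = (1 / 2 pi i) * integral of f(z) dz / z^(n+1)
    over the circle |z| = radius (power series centered at 0)."""
    total = 0j
    for k in range(samples):
        z = radius * cmath.exp(2j * math.pi * k / samples)
        dz = 1j * z * (2 * math.pi / samples)
        total += f(z) / z ** (n + 1) * dz
    return total / (2j * math.pi)

def sinc(z):
    """z^(-1) sin z, evaluated only on the circle, where z is nonzero."""
    return cmath.sin(z) / z

for m in range(4):
    expected = (-1) ** m / math.factorial(2 * m + 1)
    print(2 * m, taylor_coefficient(sinc, 2 * m).real, expected)
```

The odd-index coefficients computed this way should vanish, in agreement with the expansion containing only even powers of $z$.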
35.
\[ \int_{|z|=1} \frac{\tan z\, dz}{z^2} = 2\pi i \left. \frac{\tan z}{z} \right|_{z=0} = 2\pi i \]
36.
\[ \int_{|z-i|=1} \frac{dz}{z^2 + 1} = \int_{|z-i|=1} \frac{dz}{(z + i)(z - i)} = 2\pi i \cdot \frac{1}{2i} = \pi \]
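37.
\[ \int_{|z|=2} \frac{\sin z}{z^n}\, dz = \frac{2\pi i}{(n - 1)!} \left. (\sin z)^{(n-1)} \right|_{z=0} = \begin{cases} 0 & n \text{ odd} \\[4pt] (-1)^{(n/2)-1}\, \dfrac{2\pi i}{(n - 1)!} & n \text{ even} \end{cases} \]
38.
\[ \int_{|z-\zeta|=1} \frac{e^z}{(z - \zeta)^n}\, dz = \frac{2\pi i}{(n - 1)!} \left. (e^z)^{(n-1)} \right|_{z=\zeta} = \frac{2\pi i\, e^{\zeta}}{(n - 1)!} \]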
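39.
\[ \int_0^{2\pi} \frac{d\theta}{1 - 2a \cos\theta + a^2}, \qquad |a| < 1 \]
This integral can be computed by means of Cauchy's theorem by
interpreting it as an integral over the unit circle. Since
\[ \cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2}, \qquad dz = i e^{i\theta}\, d\theta = iz\, d\theta \]
on the unit circle, we may rewrite the integral as
\begin{align*}
\int_{|z|=1} \left( 1 - 2a \left( \frac{z + z^{-1}}{2} \right) + a^2 \right)^{-1} \frac{dz}{iz}
&= \frac{1}{i} \int_{|z|=1} \frac{dz}{z - az^2 - a + a^2 z} \\
&= -\frac{1}{ia} \int_{|z|=1} \frac{dz}{z^2 - (a + (1/a))z + 1} \\
&= -\frac{1}{ia} \int_{|z|=1} \frac{dz}{(z - a)(z - a^{-1})} \tag{7.59}
\end{align*}
Since $|a| < 1$, the function $(z - a^{-1})^{-1}$ is analytic on the unit disk
and the integral (7.59) can be computed by the Cauchy integral
formula
\[ \int_{|z|=1} \frac{dz}{(z - a^{-1})(z - a)} = \frac{2\pi i}{a - a^{-1}} \]
Thus
\[ \int_0^{2\pi} \frac{d\theta}{1 - 2a \cos\theta + a^2} = -\frac{1}{ia} \cdot \frac{2\pi i}{a - a^{-1}} = -\frac{2\pi}{a} \left( \frac{1}{a - a^{-1}} \right) = \frac{2\pi}{1 - a^2} \]
Theory of Residues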
There are many definite integrals which may be computed in similar
fashion. The integral formulas of complex analysis provide a powerful
technique for computing such definite integrals called the residue calculus.
We shall give a brief introduction to these methods. First, a few more
illustrations
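40.
\[ \int_0^{2\pi} \cos^6\theta\, d\theta = \int_{|z|=1} \left( \frac{z + z^{-1}}{2} \right)^6 \frac{dz}{iz} = 2\pi \cdot \frac{5 \cdot 3 \cdot 1}{6 \cdot 4 \cdot 2} \]
41.
\[ \int_0^{2\pi} \frac{d\theta}{1 + \cos^2\theta} = \int_{|z|=1} \left[ 1 + \left( \frac{z + z^{-1}}{2} \right)^2 \right]^{-1} \frac{dz}{iz} = \frac{4}{i} \int_{|z|=1} \frac{z\, dz}{z^4 + 6z^2 + 1} \tag{7.60} \]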
We are now not in a very good position, for we cannot recognize the
integrand as a Cauchy integrand. To do so we should be able to write it in
the form $f(z)(z - \zeta)^{-n}$ for some function $f$ analytic on the unit disk, and $\zeta$
in the disk. But it is not of that form. The integrand is
\[ \frac{z}{(z^2 + 3 + 2\sqrt{2})(z^2 + 3 - 2\sqrt{2})} = \frac{z}{(z^2 + (3 + 2\sqrt{2}))(z + (-3 + 2\sqrt{2})^{1/2})(z - (-3 + 2\sqrt{2})^{1/2})} \]
which has the form $f(z)(z - \alpha)^{-1}(z - \beta)^{-1}$ for two points $\alpha$, $\beta$ in the disk.
However, we can still compute this integral by returning to the proof of
Cauchy's integral formula. If $\Delta_1$, $\Delta_2$ are two small disks centered at $\alpha$, $\beta$,
respectively, then $f(z)(z - \alpha)^{-1}(z - \beta)^{-1}$ is analytic in $\Delta - (\Delta_1 \cup \Delta_2)$, so
by Cauchy's theorem
\[ \int_{\partial[\Delta - (\Delta_1 \cup \Delta_2)]} \frac{f(z)\, dz}{(z - \alpha)(z - \beta)} = 0 \]
Thus, the integral appearing in (7.60) is the same as
\[ \int_{\partial \Delta_1} \frac{z\, dz}{(z^2 + 3 + 2\sqrt{2})(z - \beta)(z - \alpha)} + \int_{\partial \Delta_2} \frac{z\, dz}{(z^2 + 3 + 2\sqrt{2})(z - \alpha)(z - \beta)} \tag{7.61} \]
Now these integrands are of the form $f(z)(z - \zeta)^{-1}$ with $f$ analytic on the
disk and $\zeta$ in the disk, and can be evaluated by Cauchy's integral. (7.61) is
thus
\[ 2\pi i \left[ \frac{\alpha}{(\alpha^2 + 3 + 2\sqrt{2})(\alpha - \beta)} + \frac{\beta}{(\beta^2 + 3 + 2\sqrt{2})(\beta - \alpha)} \right] \]
Since $\alpha = -(-3 + 2\sqrt{2})^{1/2}$, $\beta = (-3 + 2\sqrt{2})^{1/2}$, we obtain the result
\[ \int_0^{2\pi} \frac{d\theta}{1 + \cos^2\theta} = \frac{4}{i} \cdot 2\pi i \cdot \frac{1}{-3 + 2\sqrt{2} + 3 + 2\sqrt{2}} \cdot \frac{\alpha - \beta}{\alpha - \beta} = \pi\sqrt{2} \]
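The values found in Examples 39 and 41 can be confirmed by direct quadrature over $[0, 2\pi]$ and, for Example 41, by evaluating the two Cauchy integrals in complex arithmetic. This is a sketch only: `periodic_integral` is a hypothetical helper, and the value $a = 0.5$ is an arbitrary choice with $|a| < 1$.

```python
import cmath
import math

def periodic_integral(g, n=4000):
    """Trapezoid-rule integral of a 2*pi-periodic function over one period."""
    h = 2 * math.pi / n
    return h * sum(g(k * h) for k in range(n))

# Example 39 with an assumed value of a.
a = 0.5
ex39 = periodic_integral(lambda t: 1 / (1 - 2 * a * math.cos(t) + a * a))
print(ex39, 2 * math.pi / (1 - a * a))

# Example 41 by quadrature.
ex41 = periodic_integral(lambda t: 1 / (1 + math.cos(t) ** 2))
print(ex41, math.pi * math.sqrt(2))

# Example 41 by the two Cauchy integrals at alpha and beta.
c = 3 + 2 * math.sqrt(2)
beta = cmath.sqrt(-3 + 2 * math.sqrt(2))
alpha = -beta
bracket = (alpha / ((alpha ** 2 + c) * (alpha - beta))
           + beta / ((beta ** 2 + c) * (beta - alpha)))
ex41_residues = (4 / 1j) * (2j * math.pi) * bracket
print(ex41_residues)
```

The trapezoid rule is used because it converges very quickly for smooth periodic integrands, so a few thousand sample points are ample here.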
The above idea of suitably generalizing the integral formula so as to
accommodate a larger class of integrals is called the residue theorem. We
shall now prove it in general.
Definition 10. Suppose that $f$ is analytic in a neighborhood of the point $\zeta$,
except perhaps at $\zeta$. We say that $f$ has an isolated singularity at $\zeta$. The
residue of such a function $f$ at $\zeta$ is defined to be
\[ \text{Res}(f, \zeta) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{|z - \zeta| = \varepsilon} f(z)\, dz \]
Of course, we do not a priori know that this limit exists, and therefore that
the residue is well defined. However, there is no problem: for any $\varepsilon$ and $\varepsilon'$,
we have
\[ \int_{|z - \zeta| = \varepsilon} f(z)\, dz = \int_{|z - \zeta| = \varepsilon'} f(z)\, dz \]
by Cauchy's theorem, since $f$ is analytic in the (regular) domain bounded by
these two circles. Thus the limit certainly exists since it is independent of $\varepsilon$.
Now the residue theorem says that the boundary integral of a function analytic
but for isolated singularities is given by its residues; which we may calculate
by the integral formula, or other available local means.
Theorem 7.11. (Residue Theorem) Suppose that $f$ is analytic on the regular
domain $D$ but for isolated singularities at $\zeta_1, \ldots, \zeta_n$ in $D$. Then
\[ \int_{\partial D} f(z)\, dz = 2\pi i \sum_{i=1}^{n} \text{Res}(f, \zeta_i) \tag{7.62} \]
Proof. Let $\Delta_1, \ldots, \Delta_n$ be disjoint disks centered at $\zeta_1, \ldots, \zeta_n$, respectively. Then
since $f$ is analytic in $D - \bigcup_{i=1}^{n} \Delta_i$, by Cauchy's theorem
\[ \int_{\partial D} f(z)\, dz = \sum_{i=1}^{n} \int_{\partial \Delta_i} f(z)\, dz \]
But the sum is just (7.62) by the definition of residue.
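Both Definition 10 and Theorem 7.11 can be illustrated numerically: the small-circle integral should not depend on the radius, and the boundary integral should equal $2\pi i$ times the sum of the residues. The test function below, with a double pole at 0.3 and a simple pole at $-0.4i$, is a hypothetical choice, as are the helper names and radii.

```python
import cmath
import math

def circle_integral(f, center, radius, n=4000):
    """Trapezoid-rule value of the integral of f(z) dz over
    the circle |z - center| = radius, counterclockwise."""
    total = 0j
    for k in range(n):
        w = cmath.exp(2j * math.pi * k / n)
        total += f(center + radius * w) * (1j * radius * w) * (2 * math.pi / n)
    return total

def residue(f, zeta, eps):
    """Residue at zeta computed from a circle of radius eps (Definition 10)."""
    return circle_integral(f, zeta, eps) / (2j * math.pi)

def f(z):
    """Assumed example: isolated singularities at 0.3 and -0.4i."""
    return cmath.exp(z) / ((z - 0.3) ** 2 * (z + 0.4j))

poles = [0.3 + 0j, -0.4j]
res_small = [residue(f, p, 0.05) for p in poles]  # radius 0.05
res_large = [residue(f, p, 0.2) for p in poles]   # radius 0.2, disks still disjoint
boundary = circle_integral(f, 0j, 1.0)            # D is the unit disk

print(res_small, res_large)
print(boundary, 2j * math.pi * sum(res_small))
```

The two radii are chosen so that each circle encloses exactly one singularity, which is the situation in the proof of the theorem.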
Examples
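42.
\begin{align*}
\int_{-\pi}^{\pi} \frac{\cos^2\theta}{1 + \sin^2\theta}\, d\theta &= \int_{|z|=1} \frac{\frac{1}{4}(z + (1/z))^2}{1 - \frac{1}{4}(z - (1/z))^2}\, \frac{dz}{iz} \\
&= \int_{|z|=1} \frac{-1}{iz} \cdot \frac{z^4 + 2z^2 + 1}{z^4 - 6z^2 + 1}\, dz
\end{align*}
Now the roots of the denominator are
\[ 0, \qquad \pm(\sqrt{2} - 1), \qquad \pm(\sqrt{2} + 1) \]
and the integrand can be rewritten as
\[ f(z) = \frac{-1}{iz} \cdot \frac{z^4 + 2z^2 + 1}{(z^2 - (3 + 2\sqrt{2}))(z + (\sqrt{2} - 1))(z - (\sqrt{2} - 1))} \]
The residues to be computed are those at $0$, $\pm(\sqrt{2} - 1)$, since the roots $\pm(\sqrt{2} + 1)$
lie outside the unit circle. The integral
around each singularity is a Cauchy integral, so we need only evaluate
the relevant function at the point in question. Writing $r = \sqrt{2} - 1$, so that
$r^2 = 3 - 2\sqrt{2}$ and $r^4 + 2r^2 + 1 = 24 - 16\sqrt{2}$,
\[ \text{Res}_0 f = -\frac{1}{i} \]
\[ \text{Res}_{r} f = -\frac{1}{i} \cdot \frac{24 - 16\sqrt{2}}{r\, (r^2 - 3 - 2\sqrt{2})(2r)} = \frac{1}{i\sqrt{2}} \]
\[ \text{Res}_{-r} f = -\frac{1}{i} \cdot \frac{24 - 16\sqrt{2}}{(-r)(r^2 - 3 - 2\sqrt{2})(-2r)} = \frac{1}{i\sqrt{2}} \]
Thus, our integral is
\[ 2\pi i \left[ -\frac{1}{i} + \frac{1}{i\sqrt{2}} + \frac{1}{i\sqrt{2}} \right] = 2\pi(\sqrt{2} - 1) \]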
It is clear that any integral of the form
\[ \int_{-\pi}^{\pi} R(\cos\theta, \sin\theta)\, d\theta \]
where $R$ is a quotient of polynomials, can be handled in this way by the
substitutions
\[ \cos\theta = \frac{1}{2}\left( z + \frac{1}{z} \right) \qquad \sin\theta = \frac{1}{2i}\left( z - \frac{1}{z} \right) \qquad d\theta = \frac{dz}{iz} \]
The integrand then becomes a quotient of polynomials in $z$, and we need
only compute the residues at the roots of the denominator which lie inside
the unit circle. At such a root $r$, the integrand takes the form
\[ \frac{f(z)}{g(z)(z - r)^k} \]
where $f/g$ is analytic near $r$. Thus the residue is, by Cauchy's formula
\[ \frac{1}{2\pi i} \int \frac{f(z)\, dz}{g(z)(z - r)^k} = \frac{1}{(k - 1)!} \left. \left( \frac{f(z)}{g(z)} \right)^{(k-1)} \right|_{z=r} \]
The cases we have considered so far are those where $k = 1$. Here is an
illustration of the more general case.
43.
\begin{align*}
\int_{-\pi}^{\pi} \frac{d\theta}{(2 + \cos\theta)^2} &= \int_{|z|=1} \frac{1}{[2 + \frac{1}{2}(z + (1/z))]^2}\, \frac{dz}{iz} \\
&= \frac{4}{i} \int_{|z|=1} \frac{z\, dz}{(z^2 + 4z + 1)^2}
\end{align*}
The roots of the denominator of the integrand are
\[ -2 + \sqrt{3} \qquad -2 - \sqrt{3} \]
These are both double roots. We need not be concerned with the
root $-2 - \sqrt{3}$, since it is outside the unit disk. The integral is
conveniently rewritten as
\[ \frac{4}{i} \int_{|z|=1} \frac{z\, dz}{(z + 2 + \sqrt{3})^2 (z + 2 - \sqrt{3})^2} \]
By Cauchy's formula the integral is evaluating the derivative of
$f(z) = z(z + 2 + \sqrt{3})^{-2}$ at $-2 + \sqrt{3}$. Now
\[ f'(-2 + \sqrt{3}) = -\frac{-2 + \sqrt{3} - 2 - \sqrt{3}}{(-2 + \sqrt{3} + 2 + \sqrt{3})^3} = \frac{1}{2(27)^{1/2}} \]
Therefore, our integral is
\[ \frac{4}{i} \cdot 2\pi i \cdot \frac{1}{2(27)^{1/2}} = \frac{4\pi}{(27)^{1/2}} \]
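The double-root case of Example 43 can be checked in two ways: by quadrature of the original integral, and by evaluating the derivative that Cauchy's formula calls for. This is a sketch under stated assumptions; the helper `periodic_integral` and the finite-difference step `h` are illustrative choices, and the derivative is estimated numerically rather than symbolically.

```python
import math

def periodic_integral(g, n=4000):
    """Trapezoid-rule integral of a 2*pi-periodic function over [-pi, pi]."""
    h = 2 * math.pi / n
    return h * sum(g(-math.pi + k * h) for k in range(n))

# Direct quadrature of the integral in Example 43.
quadrature = periodic_integral(lambda t: 1 / (2 + math.cos(t)) ** 2)

def f(z):
    """The factor left after removing (z + 2 - sqrt(3))^2 from the integrand."""
    return z / (z + 2 + math.sqrt(3)) ** 2

# Central-difference estimate of f' at the double root inside the unit disk.
r = -2 + math.sqrt(3)
h = 1e-5
f_prime = (f(r + h) - f(r - h)) / (2 * h)

# (4 / i) * 2 pi i * f'(r) simplifies to the real number 8 pi f'(r).
by_cauchy = 8 * math.pi * f_prime

print(quadrature, by_cauchy, 4 * math.pi / math.sqrt(27))
```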
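44. Occasionally, the integrand does not obligingly form itself into
a Cauchy integral, and we must play around a little more.
\[ \int_{-\pi}^{\pi} e^{2\cos\theta}\, d\theta = \int_{|z|=1} \exp\left( z + \frac{1}{z} \right) \frac{dz}{iz} = \frac{1}{i} \int_{|z|=1} \frac{e^z e^{1/z}}{z}\, dz \]
The only singularity is at 0, but we cannot rearrange this in the form
$f(z) \cdot z^{-n}$. Thus we must compute the integral directly by some other
means. Since
\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} \qquad e^{1/z} = \sum_{n=0}^{\infty} \frac{z^{-n}}{n!} \]
and
\[ e^z e^{1/z} = \sum_{n=-\infty}^{\infty} \left( \sum_{j \ge 0,\ j \ge -n} \frac{1}{j!\,(n + j)!} \right) z^n \]
Thus
\[ \int_{-\pi}^{\pi} e^{2\cos\theta}\, d\theta = \sum_{n=-\infty}^{\infty} \sum_{j \ge 0,\ j \ge -n} \frac{1}{j!\,(n + j)!} \cdot \frac{1}{i} \int_{|z|=1} z^{n-1}\, dz \]
But that last integral, with its factor $1/i$, is zero unless $n = 0$, in which case it is $2\pi$. We
conclude that
\[ \int_{-\pi}^{\pi} e^{2\cos\theta}\, d\theta = 2\pi \sum_{j \ge 0} \frac{1}{(j!)^2} \]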
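The series obtained in Example 44 converges rapidly, so it is easy to compare with a direct quadrature of the integral. The following is a minimal sketch; truncating the series at 20 terms and the helper `periodic_integral` are assumptions made for illustration.

```python
import math

def periodic_integral(g, n=2000):
    """Trapezoid-rule integral of a 2*pi-periodic function over [-pi, pi]."""
    h = 2 * math.pi / n
    return h * sum(g(-math.pi + k * h) for k in range(n))

# Left side: the integral of exp(2 cos(theta)) over [-pi, pi].
quadrature = periodic_integral(lambda t: math.exp(2 * math.cos(t)))

# Right side: 2*pi times the sum of 1/(j!)^2, truncated after 20 terms
# (the omitted terms are far below double precision).
series = 2 * math.pi * sum(1 / math.factorial(j) ** 2 for j in range(20))

print(quadrature, series)
```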
Integrals from −∞ to +∞
The techniques of residue calculus also apply to suitable integrals of the
form
\[ \int_{-\infty}^{\infty} F(x)\, dx \tag{7.63} \]
If, say, $F(z)$ is analytic but for isolated singularities at $z_1, \ldots, z_k$ in the
upper half plane, then
\[ \int_{\partial D} F(z)\, dz = 2\pi i \sum_{i=1}^{k} \text{Res}_{z_i}(F) \]
whenever $D$ is a domain containing $z_1, \ldots, z_k$. Choosing $D = D_R =
\{z : |z| < R,\ \text{Im } z > 0\}$, the integral is
\[ \int_{-R}^{R} F(x)\, dx + \int_{H_R} F(z)\, dz \]
where $H_R$ is the boundary in the upper half plane of the disk of radius $R$.
Now if $F(z) \to 0$ as $|z| \to \infty$ fast enough, the integral over $H_R$ will tend to zero
and the integral from $-R$ to $R$ will tend to (7.63). We shall say that $F$ is
dissipative in the upper half plane in this case. Thus we conclude that when
$F$ is dissipative in the half plane $\Pi$,
\[ \int_{-\infty}^{\infty} F(x)\, dx = 2\pi i \sum_{z \in \Pi} \text{Res}_z(F) \]
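45.
\[ \int_{-\infty}^{\infty} \frac{x^2\, dx}{x^4 + 1} = \lim_{R \to \infty} \int_{\partial D_R} \frac{z^2\, dz}{z^4 + 1} \]
For
\[ \left| \int_{H_R} \frac{z^2\, dz}{z^4 + 1} \right| = \left| \int_0^{\pi} \frac{R^2 e^{2i\theta}\, iR e^{i\theta}\, d\theta}{R^4 e^{4i\theta} + 1} \right| \le \frac{\pi R^3}{R^4 - 1} \to 0 \]
as $R \to \infty$.
Now the roots of $z^4 + 1$ are $(\pm 1 \pm i)/\sqrt{2}$. Those in the upper half
plane are $a = (1 + i)/\sqrt{2}$, $b = (-1 + i)/\sqrt{2}$. Thus
\[ \text{Res}_a \left( \frac{z^2}{z^4 + 1} \right) = \frac{a^2}{(a - b)(a - \bar{a})(a - \bar{b})} = \frac{1 + i}{4i\sqrt{2}} \]
\[ \text{Res}_b \left( \frac{z^2}{z^4 + 1} \right) = \frac{b^2}{(b - a)(b - \bar{a})(b - \bar{b})} = \frac{1 - i}{4i\sqrt{2}} \]
Finally,
\[ \int_{-\infty}^{\infty} \frac{x^2\, dx}{x^4 + 1} = 2\pi i \left[ \frac{1 + i}{4i\sqrt{2}} + \frac{1 - i}{4i\sqrt{2}} \right] = \frac{\pi}{\sqrt{2}} \]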
A condition on F that guarantees that it is dissipative is that F is the
quotient of two polynomials such that the denominator is of degree
two more than the numerator (see Problem 37).
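The residue computation of Example 45 can be reproduced in complex arithmetic and compared with a quadrature of the real integral. In this sketch the infinite range is handled by the substitution $x = \tan\varphi$, which is an assumption of the illustration and not a step used in the text; the names `residue_at`, `by_residues`, and `quadrature` are likewise hypothetical.

```python
import math

# The four roots of z^4 + 1, namely (+-1 +- i) / sqrt(2).
roots = [complex(s, t) / math.sqrt(2) for s in (1, -1) for t in (1, -1)]
upper = [z for z in roots if z.imag > 0]

def residue_at(a):
    """Residue of z^2 / (z^4 + 1) at the simple root a: evaluate
    z^2 divided by the product of the remaining linear factors."""
    denominator = 1
    for r in roots:
        if r != a:
            denominator *= (a - r)
    return a * a / denominator

by_residues = 2j * math.pi * sum(residue_at(a) for a in upper)

# Quadrature after x = tan(phi): dx = dphi / cos(phi)^2, phi in (-pi/2, pi/2).
n = 2000
h = math.pi / n
quadrature = 0.0
for k in range(n):
    phi = -math.pi / 2 + (k + 0.5) * h   # midpoints avoid the endpoints
    x = math.tan(phi)
    quadrature += x * x / (x ** 4 + 1) / math.cos(phi) ** 2 * h

print(by_residues, quadrature, math.pi / math.sqrt(2))
```

After the substitution the integrand is smooth and periodic in $\varphi$, which is why the midpoint rule with a modest number of points is already accurate.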
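46. Compute
\[ \int_{-\infty}^{\infty} \frac{e^{-iax}\, dx}{1 + x^2}, \qquad a > 0 \tag{7.64} \]
Now, we would hope to apply the residue theorem to $e^{-iaz}(1 + z^2)^{-1}$.
For $z = x + iy$, this becomes
\[ \frac{e^{-iax} e^{ay}}{1 + (x + iy)^2} \]
which is hardly dissipative for $y > 0$. But it is dissipative in the lower
half plane $y < 0$:
\begin{align*}
\left| \int_{H_R^-} \frac{e^{-iaz}\, dz}{1 + z^2} \right| &= \left| \int_{-\pi}^{0} \frac{\exp[-iaR(\cos\theta + i \sin\theta)]\, iR e^{i\theta}\, d\theta}{1 + R^2 e^{2i\theta}} \right| \\
&\le \frac{R}{R^2 - 1} \int_{-\pi}^{0} \exp(aR \sin\theta)\, d\theta \le \frac{\pi R}{R^2 - 1} \to 0
\end{align*}
as $R \to \infty$, where $H_R^-$ is the boundary in the lower half plane of the disk of radius $R$.
Thus we compute (7.64) by residues over the lower
half plane:
\[ \int_{-\infty}^{\infty} \frac{e^{-iax}\, dx}{1 + x^2} = -2\pi i\, \text{Res}_{-i} \left( \frac{e^{-iaz}}{1 + z^2} \right) = -2\pi i\, \frac{e^{-a}}{-2i} = \frac{\pi}{e^a} \]
(The sign changes since the $x$ axis is oriented opposite to the
orientation it obtains as boundary of the lower half plane.) Notice, by the
way, that since
\[ \int_{-\infty}^{\infty} \frac{\sin ax\, dx}{1 + x^2} = 0 \]
(the integrand is an odd function), we obtain
\[ \int_{-\infty}^{\infty} \frac{\cos ax\, dx}{1 + x^2} = \int_{-\infty}^{\infty} \frac{e^{iax}\, dx}{1 + x^2} = \int_{-\infty}^{\infty} \frac{e^{-iax}\, dx}{1 + x^2} = \frac{\pi}{e^a}, \qquad a > 0 \]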
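The value $\pi/e^a$ of Example 46 can be compared with a truncated numerical integral of $\cos ax/(1 + x^2)$. This sketch assumes $a = 1$ and a cutoff $L = 200$ in place of infinity; since the neglected tail is of order $1/L^2$, only about three decimal places of agreement should be expected from the quadrature.

```python
import cmath
import math

def midpoint(f, lo, hi, n):
    """Midpoint-rule approximation of the integral of f over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (k + 0.5) * h) for k in range(n))

a = 1.0     # assumed sample value of the parameter, a > 0
L = 200.0   # assumed cutoff standing in for infinity

# The integrand is even, so integrate over [0, L] and double.
quadrature = 2 * midpoint(lambda x: math.cos(a * x) / (1 + x * x), 0.0, L, 200000)

# Residue at z = -i of exp(-i a z) / ((z - i)(z + i)), with the sign
# change for the lower half plane as in the text.
z0 = -1j
res = cmath.exp(-1j * a * z0) / (z0 - 1j)
by_residue = -2j * math.pi * res

print(quadrature, by_residue, math.pi * math.exp(-a))
```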
EXERCISES
19. Perform the indicated integrations by residues:
ο Γ άθ
W Ln cos2 6> + 2 sin2 6>
Γ" άθ
(b) -L (cos2 6> + 2 sin2 6>)2
(c) LiW^r*2
r e'z dz
(d) L-.wFTi)
(e) Jo 5^4 cos 0
(f) ΓΊ 5'a<1
Jo 1 + a Sin Ρ
cos χ dx
fe) Lx(x2+a2)(x2 + b2)
r" e"dx
(h) J_.TT3?
0) L^ + l)
χ sin χ
dx
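(k) $\displaystyle \int_{-\infty}^{\infty} \frac{dx}{x^2 + 3x + 2}$
(l) $\displaystyle \int_{-\infty}^{\infty} \frac{dx}{1 + x^{10}}$
20. Suppose that $f$ is analytic in a neighborhood of $\zeta$, and
\[ f^{(j)}(\zeta) = 0, \qquad 0 < j < k \]
Show that
\[ g(z) = \frac{f(z) - f(\zeta)}{(z - \zeta)^k} \]
is an analytic function.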
21. Suppose that $f$ is analytic in a disk centered at $\zeta$, and all derivatives
of $f$ vanish at $\zeta$. Then $f$ is identically zero.
22. Suppose that $f$ is analytic in the punctured disk $0 < |z - \zeta| < R$ and
bounded. Then, defining $f$ at $\zeta$ by
\[ f(\zeta) = \lim_{z \to \zeta} f(z) \]
the extended function is analytic.
PROBLEMS
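36. If $\{f_n\}$ is a convergent sequence of analytic functions in the domain $D$,
then the limit function is also analytic.
37. If
\[ F(z) = \frac{P(z)}{Q(z)} \]
where $P$, $Q$ are polynomials, then $F$ is dissipative if the degree of $Q$ is 2
more than that of $P$.
38. Suppose that $f$ is analytic in the punctured disk $0 < |z - \zeta_0| < R$.
(a) Show that
\[ \int_{|z - \zeta_0| = r} \frac{f(z)\, dz}{(z - \zeta_0)^n}, \qquad 0 < r < R \]
is independent of $r$.
(b) Fix some $r_0 < R$. Show that if $r_0 < |\zeta - \zeta_0| < R$,
\[ f(\zeta) = \frac{1}{2\pi i} \int_{|z - \zeta_0| = R} \frac{f(z)\, dz}{z - \zeta} - \frac{1}{2\pi i} \int_{|z - \zeta_0| = r_0} \frac{f(z)\, dz}{z - \zeta} \]
(c) Expand $f$ in a series of the form
\[ f(\zeta) = \sum_{n=-\infty}^{\infty} a_n (\zeta - \zeta_0)^n \tag{7.65} \]
called the Laurent expansion of $f$, by noticing that
\[ \frac{1}{z - \zeta} = \frac{1}{z - \zeta_0} \left( 1 - \frac{\zeta - \zeta_0}{z - \zeta_0} \right)^{-1} = \sum_{n=0}^{\infty} \frac{(\zeta - \zeta_0)^n}{(z - \zeta_0)^{n+1}} \]
for $|z - \zeta_0| = R$, $|\zeta - \zeta_0| < R$, and
\[ \frac{1}{z - \zeta} = -\sum_{n=0}^{\infty} \frac{(z - \zeta_0)^n}{(\zeta - \zeta_0)^{n+1}} \]
for $|z - \zeta_0| = r$, and $|\zeta - \zeta_0| > r$.
(d) Show that $\text{Res}_{\zeta_0} f = a_{-1}$.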
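39. Equation (7.65) can be verified in another way. Expand $f$ in a
Fourier series around each circle $|z - \zeta_0| = r$:
\[ f(z) = \sum_{n=-\infty}^{\infty} a_n(r)\, e^{in\theta}, \qquad z = re^{i\theta} \tag{7.66} \]
(a) The Cauchy-Riemann equations imply that
\[ r \frac{\partial f}{\partial r} = -i \frac{\partial f}{\partial \theta} \]
(b) Differentiating (7.66), we obtain
\[ 0 = \sum_{n=-\infty}^{\infty} (r a_n' - n a_n)\, e^{in\theta} \]
Conclude that $a_n(r) = A_n r^n$. Thus (7.66) becomes
\[ f(z) = \sum_{n=-\infty}^{\infty} A_n r^n e^{in\theta} = A_0 + \sum_{n=1}^{\infty} (A_{-n} z^{-n} + A_n z^n) \]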
40. Suppose that $f$ is one-to-one in the domain $D$. Then by the residue
theorem
\[ \frac{1}{2\pi i} \int_{\partial D} \frac{z f'(z)\, dz}{f(z) - w} = 0 \]
if $w$ is not a value of $f$ in $D$. Suppose $f(a) = w$. Then
\[ a = f^{-1}(w) = \frac{1}{2\pi i} \int_{\partial D} \frac{z f'(z)\, dz}{f(z) - w} \]
Conclude that the inverse of a one-to-one analytic function is again analytic.
7.8 Summary
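Let $\mathbf{p} \in R^n$ and suppose $\mathbf{f}$ is an $R^m$-valued function defined in a
neighborhood of $\mathbf{p}$. $\mathbf{f}$ is differentiable at $\mathbf{p}$ if there is a linear transformation
$T: R^n \to R^m$ such that
\[ \frac{\|\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - T(\mathbf{v})\|}{\|\mathbf{v}\|} \to 0 \quad \text{as } \mathbf{v} \to 0 \]
$T$ is called the differential of $\mathbf{f}$ at $\mathbf{p}$ and is denoted $d\mathbf{f}(\mathbf{p})$.
The differential is linear in the function $\mathbf{f}$ and also satisfies
\[ d\langle \mathbf{f}, \mathbf{g} \rangle = \langle d\mathbf{f}, \mathbf{g} \rangle + \langle \mathbf{f}, d\mathbf{g} \rangle \]
Let $U$ be a domain in $R^n$. A system of coordinates on $U$ is an $n$-tuple of
$C^1$ functions $\mathbf{y}$ such that
(i) if $\mathbf{p} \ne \mathbf{q}$, $\mathbf{y}(\mathbf{p}) \ne \mathbf{y}(\mathbf{q})$
(ii) $d\mathbf{y}(\mathbf{p})$ is nonsingular at all $\mathbf{p} \in U$
The matrix
\[ \frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)} = \left( \frac{\partial y^i}{\partial x^j} \right) \]
is called the Jacobian of the coordinate change.
The chain rule. The differentials of composed mappings compose as
linear transformations:
\[ d(\mathbf{g} \circ \mathbf{f})(\mathbf{p}) = d\mathbf{g}(\mathbf{f}(\mathbf{p})) \circ d\mathbf{f}(\mathbf{p}) \]
Inverse mapping theorem. Suppose $\mathbf{F}$ is a $C^1$ $R^n$-valued function defined
in a neighborhood of $\mathbf{p}_0$ such that $d\mathbf{F}(\mathbf{p}_0)$ is nonsingular. Then there are
neighborhoods $N$ of $\mathbf{p}_0$ and $U$ of $\mathbf{F}(\mathbf{p}_0)$ and a $C^1$ mapping $\mathbf{G}: U \to N$ such that
$\mathbf{G} = \mathbf{F}^{-1}$.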
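Let $D$ be a domain in $R^n$. A differential form on $D$ is a function which
associates to each point $\mathbf{p}$ in $D$ a linear function $\omega(\mathbf{p})$ on $R^n$. A differential
form has the form
\[ \omega(\mathbf{p}) = \sum_{i=1}^{n} a_i(\mathbf{p})\, dx^i(\mathbf{p}) \]
$\omega$ is said to be $C^k$ on $D$ if all the functions $a_1, \ldots, a_n$ are $C^k$. If $\omega$ is the
differential of a function we must have
\[ \frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i}, \qquad 1 \le i, j \le n \tag{7.67} \]
A differential form is exact if it is the differential of a function, and closed
if (7.67) holds.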
Suppose that $\mathbf{F}$ is a force field defined in a domain $D$ in $R^n$, and $\Gamma$ is an
oriented path defined in $D$. The work required to move a unit mass along $\Gamma$ is
\[ W(\Gamma, \mathbf{F}) = -\int_a^b \langle \mathbf{F}(\mathbf{g}(t)), \mathbf{g}'(t) \rangle\, dt \]
where $\mathbf{g}$ furnishes a parametrization of $\Gamma$.
A field is conservative if $W(\Gamma, \mathbf{F}) = 0$ over all closed paths $\Gamma$. A potential
function for a field $\mathbf{F}$ is a real-valued function $\Pi$ such that
\[ W(\Gamma, \mathbf{F}) + \Pi(\mathbf{p}') - \Pi(\mathbf{p}) \]
is the same for every oriented path $\Gamma$ from $\mathbf{p}$ to $\mathbf{p}'$.
Suppose $D$ is a domain such that any two points can be joined by a path
in $D$. Then
(i) every field, conservative in $D$, has a potential function
(ii) two potentials of a given field differ by a constant
(iii) if $\mathbf{F} = (f_1, \ldots, f_n)$ has the potential $\Pi$, $d\Pi = \sum_i f_i\, dx^i$
Line integral of a differential form. Let $\Gamma$ be an oriented path in a
domain on which the form $\omega$ is defined. If $\Gamma = \sum_{i=1}^{k} \Gamma_i$, define
\[ \int_\Gamma \omega = \sum_{i=1}^{k} \int_{a_i}^{b_i} \omega(\mathbf{g}_i(t))(\mathbf{g}_i'(t))\, dt \]
If $\mathbf{T}$ is the tangent to $\Gamma$,
\[ \int_\Gamma \omega = \int_\Gamma \omega(\mathbf{T})\, ds \]
Let $\omega = \sum a_i\, dx^i$ be a $C^1$ differential form defined on $D$. $\omega = df$ for
some function $f$
(i) if and only if the field $(a_1, \ldots, a_n)$ is conservative
(ii) if and only if $\int_\Gamma \omega = 0$ for all closed curves
(iii) only if
\[ \frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i} \quad \text{for all } i, j \]
throughout $D$.
Poincaré's lemma. Suppose that $D$ is a domain such that for some fixed
point $\mathbf{p}_0$ in $D$ and every $\mathbf{p} \in D$, the line segment joining $\mathbf{p}_0$ to $\mathbf{p}$ is contained
in $D$. Then every closed form is exact in $D$.
In two dimensions a differential form has the form $\omega = p\, dx + q\, dy$.
If $\omega$ is $C^1$ we shall denote the function
\[ \frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} \]
by $d\omega$. A regular domain in $R^2$ is bounded by a piecewise $C^1$ curve. We
orient this curve so that its principal normal points into $D$ (it winds
counterclockwise around $D$). When so oriented we shall denote the bounding
path by $\partial D$.
Green's theorem. If $\omega$ is a $C^1$ differential form defined on the regular
domain $D$,
\[ \int_{\partial D} \omega = \int_D d\omega \]
Integration under a coordinate change. Suppose
\[ x = x(u, v) \qquad y = y(u, v) \]
is a coordinate change on the domain $D$ in $R^2$. Let $E$ be the domain in the
$uv$ plane corresponding to $D$. If $f$ is continuous on $D$, then
\[ \int_D f = \int_E f(x(u, v), y(u, v)) \left| \det \frac{\partial(x, y)}{\partial(u, v)} \right| du\, dv \]
Let $\mathbf{v} = (v, w)$ be a $C^1$ vector field. The divergence of $\mathbf{v}$ is
\[ \text{div } \mathbf{v} = \frac{\partial v}{\partial x} + \frac{\partial w}{\partial y} \]
Divergence theorem. If $\mathbf{v}$ is a $C^1$ vector field defined on the regular
domain $D$,
\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{N} \rangle\, ds = \int_D \text{div } \mathbf{v} \]
A $C^2$ function $f$ is the potential of a conservative divergence-free flow if
and only if it is harmonic.
Cauchy's theorem. If $f$ is a $C^1$ complex differentiable function defined
on the regular domain $D$, then
\[ \int_{\partial D} f\, dz = 0 \]
Cauchy integral formula. Under the same hypotheses on $f$, if $\zeta \in D$,
\[ f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} \]
Maximum principle. If $f$ is analytic on $D$, $|f|$ attains its maximum on $\partial D$.
Theorem. Let $f$ be a $C^1$ complex-valued function defined on the regular
domain $D$. The following assertions are equivalent:
(i) for any $\zeta \in D$, and some $R$ such that $\Delta(\zeta, R) \subset D$, $f$ is the sum in $\Delta(\zeta, R)$
of a convergent power series
\[ f(z) = \sum_{n=0}^{\infty} a_n (z - \zeta)^n \tag{7.68} \]
(ii) replace the word some in (i) by any
(iii) $f$ is complex differentiable
(iv) $f$ satisfies the Cauchy-Riemann equations
\[ \frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y} \]
(v) $f\, dz$ is closed
(vi) for any $\zeta \in D$
\[ f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} \]
In case $f$ has these properties ($f$ is analytic), the coefficients $a_n$ of (7.68) are
given by
\[ a_n = \frac{f^{(n)}(\zeta)}{n!} = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{(z - \zeta)^{n+1}} \]
If $f$ is analytic in $\{0 < |z - z_0| < R\}$, we say that $f$ has an isolated singularity
at $z_0$. In this case the integrals
\[ \frac{1}{2\pi i} \int_{|z - z_0| = r} f(z)\, dz \]
are all the same for $0 < r < R$. Their common value is the residue of $f$ at
$z_0$, denoted $\text{Res}(f, z_0)$.
Residue theorem. If $f$ is an analytic function on the regular domain $D$,
except for isolated singularities at $z_1, \ldots, z_n$ in $D$, then
\[ \int_{\partial D} f(z)\, dz = 2\pi i \sum_{i=1}^{n} \text{Res}(f, z_i) \]
• FURTHER READING
The general theorems on differentiation in R" are fully discussed in:
H. K. Nickerson, N. Steenrod, D. C. Spencer, Advanced Calculus, D. Van
Nostrand Company, Inc., Princeton, N. J., 1957.
M. E. Munroe, Modern Multidimensional Calculus, Addison-Wesley,
Reading, Mass., 1963.
L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading,
Mass., 1968.
For further information on complex analytic functions see
Z. Nehari, Introduction to Complex Analysis, Allyn and Bacon, Inc., Boston,
1961.
H. Cartan, Elementary Theory of Analytic Functions of One or Several
Complex Variables, Addison-Wesley, Reading, Mass., 1963.
E. Hille, Analytic Function Theory, Ginn and Company, Boston, 1959.
L. Ahlfors, Complex Analysis, McGraw-Hill, New York, 1953.
• MISCELLANEOUS PROBLEMS
41. Prove the assertion concerning integration under a coordinate change
as given in the summary (where no reference to the orientation is made).
42. Show that if $\omega$ is a differential form of compact support in $R^2$, then
\[ \int_{R^2} d\omega = 0 \]
43. Recall the definition of connectedness given in Problem 78 of
Chapter 2. Show that a domain in $R^2$ is connected if and only if it is
pathwise connected.
44. If $\omega = p\, dx + q\, dy$ is a $C^1$ form, define
\[ {*\omega} = -q\, dx + p\, dy \]
(a) Show that for any regular domain $D$,
\[ \int_{\partial D} \omega(\mathbf{N})\, ds = \int_D d{*\omega} \]
where $\mathbf{N}$ is the interior normal to $D$.
(b) Show that the function $u$ is harmonic if and only if $d{*du} = 0$.
(c) $\omega$ is (locally) the differential of a harmonic function if and only
if $d\omega = 0$, $d{*\omega} = 0$.
45. If $u$ is a harmonic function in the domain $D$ and if ${*du}$ is exact in $D$,
then $u$ is the real part of an analytic function in $D$.
46. If $u$ is harmonic in $D$, and $\Gamma$ is a closed path in $D$, the integral
\[ \frac{1}{2\pi} \int_\Gamma {*du} \]
is called the period of $u$ about $\Gamma$. Show that $u$ has zero periods about all
paths if and only if $u$ is the real part of an analytic function. Show that
$\exp(u)$ is the modulus of an analytic function if and only if $u$ has integer
periods.
47. Let $D = R^2 - \{p_1, \ldots, p_s\}$, where $p_1, \ldots, p_s$ are $s$ distinct points in
the plane. Show that there is an $s$-dimensional space $L$ of harmonic
functions which are not the real part of an analytic function in $D$ such that every
harmonic function has the form $u = u_1 + \text{Re} f$, $u_1 \in L$, $f$ analytic in $D$.
(Recall Problem 33.)
48. The Gamma function. Define
\[ \Gamma(z) = \int_0^{\infty} \exp[(z - 1) \ln t - t]\, dt = \int_0^{\infty} t^{z-1} e^{-t}\, dt \]
(a) Show that $\Gamma(n + 1) = n!$
(b) Show by integration by parts that
\[ \Gamma(z + 1) = z\Gamma(z) \]
(c) Show that $\Gamma$ is an analytic function in the half plane $\{\text{Re } z > 1\}$
(differentiate under the integral sign).
49. (a) Show that for any $a > 0$ the function
\[ \Gamma_a(z) = \int_a^{\infty} t^{z-1} e^{-t}\, dt \]
is analytic on the entire plane.
(b) Substitute
\[ e^{-t} = \sum_{n=0}^{\infty} (-1)^n \frac{t^n}{n!} \]
into the integral
\[ \int_0^a t^{z-1} e^{-t}\, dt \]
to obtain the formula
\[ \Gamma(z) = \sum_{n=0}^{\infty} \frac{(-1)^n a^{z+n}}{n!\,(z + n)} + \Gamma_a(z) \]
Justify that substitution.
(c) If $\text{Re } z > 1$, does $\lim \Gamma_a(z) = \Gamma(z)$ as $a \to 0$?
(d) Use the result of part (b) to extend $\Gamma$ to a function analytic on
the entire plane, but for isolated singularities at $0, -1, -2, \ldots$.
(e) Calculate the residue of $\Gamma$ at those points.
50. Find the residue at the origin of
\[ \exp\left( z + \frac{1}{z} \right) \]
51. Compute the Fourier transform of (1 + x2)-1: find
1 f°° e'lx
(use Example 46).
52. Compute the Fourier transforms of these functions:
(a) (1+x*)-1.
(b) (l+xTl(a2 + x2)-1·
(c) (ΤΤΪν-
cos χ
(d) oner
53. Suppose $\{f_n\}$ is a sequence of analytic functions in $D$, and $\lim f_n = f$
uniformly in $D$. Show that $f$ is analytic.
54. Prove: If $f$ is $C^1$ in $D$ and $\int_{\partial \Delta} f\, dz = 0$ for all disks $\Delta$ contained in $D$,
then $f$ is analytic.
55. Morera's theorem. Suppose $f$ is a continuous complex-valued
function defined in $D$ such that
\[ \int_\Gamma f\, dz = 0 \]
over every closed path $\Gamma$ in $D$. Then $f$ is analytic. (Hint: Let $F$ be a
potential function for $f\, dz$ and show that $F$ is complex differentiable.)
56. If $f = u + iv$ is an analytic function in the domain $D$, then $u$ is the
potential of a divergence-free velocity field. Show that the curves $\{v =
\text{constant}\}$ are the path lines of the associated flow.
57. Let $f$ be analytic in the domain $\{0 < |z - z_0| < R\}$. $f$ is said to be
meromorphic at $z_0$ if there is a function $g$ analytic in a neighborhood of $z_0$
such that $f \cdot g$ extends analytically across $z_0$. Verify that these are
equivalent conditions for meromorphicity.
(a) the Laurent expansion (7.65) of $f$ about $z_0$ has only finitely
many negative terms.
(b) there is an $n$ such that $(z - z_0)^n f$ extends analytically across $z_0$.
58. Show that if $f$ is analytic in the domain $D$ except for isolated
singularities at $p_1, \ldots, p_s$, where it is meromorphic, then there is a polynomial
$P$ such that $f \cdot P$ extends analytically to all of $D$.
59. If $f$ is meromorphic at $z_0$, is $\exp(f)$ also meromorphic there?
60. Schwarz's lemma. Suppose that $f$ is analytic on the disk $\{z \in C:
|z| \le 1\}$, and
(i) $\max\{|f(z)| : |z| = 1\} = M$
(ii) $f(0) = 0$
Show that for any $z$ in that disk
\[ |f(z)| \le M|z| \]
(Hint: Apply the maximum principle to $z^{-1} f$.)
61. Under the same hypotheses as above show that
\[ |f'(0)| \le M \]
and if $|f'(0)| = M$, then $f(z) = cz$ for some constant $c$ of modulus $M$.
62. Let / be in S(R), and suppose that fit) = 0 for negative t. Show
that
V2wJo
is an analytic function for ζ in the upper half plane. Notice that f(\y) =
63. Suppose that/is analytic and dissipative in the upper half plane and
f is in S(R) on the real axis Show that there is a function g e S(R) with
git) = 0 for negative t such that /(z) = giz). (Hint:
Let
m=h(j j^e'l"dt)
Then, by Fourier inversion, g and /are analytic in the upper half plane and
have the same values on the real axis. Verify that g(t) = 0 for negative t
by Cauchy's theorem.)
Chapter 8
POTENTIAL THEORY IN
THREE DIMENSIONS
The theory of the preceding chapter, when generalized to three or more
dimensions, becomes considerably more complicated. The development of this
theory during the 19th century was motivated to a considerable extent by
physical intuition. The study of fields of force and velocity of fluid flows
led to the theorems on integration in several variables which are in this
chapter. More modern expositions of this material lean heavily on algebraic
developments of the late 19th and early 20th centuries. Although the
mathematics has significantly improved with the introduction of the notions
of differential forms and invariance, the intuition provided by concrete
interpretations has been lost. We shall lean heavily on the interpretation
by fluid flows, thereby sacrificing some mathematical rigor for a little bit of
concreteness. We certainly should point out that the importance of the
subject of differential forms by far transcends its use in putting the divergence
theorem on firm ground. This theory has had major impact on all branches
of modern research mathematics and physics. We have, however, chosen to
complete our story rather than begin to suggest a new one.
A fluid flow is given by a function φ(x0, t) defined for x0 in some domain
D in R3 and t on an interval in R about the origin. We require that
(i) φ is continuously differentiable in all variables,
(ii) φ(x0, 0) = x0, all x0 ∈ D,
(iii) for fixed t, the transformation x0 → φ(x0, t) is one-to-one and has a
nonsingular differential.
The value φ(x0, t) represents the space position at time t of the particle
which was at x0 at time t = 0. We shall refer to x0 as the particle coordinate
and to x = φ(x0, t) as the space coordinate. Condition (ii) asserts that the
particle and space coordinates coincide at t = 0. Condition (iii) asserts that
the relation between particle and space coordinates at any time t is invertible:
we can recapture the initial position of a particle from its position at any
time. We shall denote the inverse of φ by ψ: x = φ(x0, t) if and only if
x0 = ψ(x, t).
The curve given by x = φ(x0, t) is the path of motion of the particle x0. The
velocity of x0 at time t is, of course, (∂φ/∂t)(x0, t). If we fix the time t,
the collection of velocity vectors forms a field, denoted by v(x, t) (referring of
course to spatial coordinates), called the velocity field of the flow. v(x, t)
is the velocity of the particle at x at time t. We have already noted that
\[ v(x, t) = \frac{\partial \phi}{\partial t}(x_0, t)\Big|_{x_0 = \psi(x, t)} \tag{8.1} \]
If the velocity field is independent of time, we say that the flow is steady.
The velocity field of a flow completely determines the flow: the path of
motion x = u(t) of a particle x0 is the solution of the differential equation
\[ \frac{du}{dt} = v(u, t), \qquad u(0) = x_0 \tag{8.2} \]
By (8.1) the solution is given by u(t) = φ(x0, t), for (8.1) can be rewritten as
\[ v(\phi(x_0, t), t) = \frac{\partial \phi}{\partial t}(x_0, t) \]
Thus the equation of flow is recaptured from the velocity field by solving
Equation (8.2).
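As an illustration of recapturing a flow from its velocity field, the following is a minimal numerical sketch, not part of the original text. It assumes the steady field v(x, y, z) = (y, −x, 1), integrates Equation (8.2) with the classical Runge-Kutta scheme, and compares the result with the closed-form flow x = x0 cos t + y0 sin t, y = y0 cos t − x0 sin t, z = z0 + t. The function names, step count, and sample point are illustrative choices.

```python
import math

def velocity(u, t):
    # Steady velocity field v(x, y, z) = (y, -x, 1), chosen for illustration.
    x, y, z = u
    return (y, -x, 1.0)

def rk4_path(x0, t_end, steps=1000):
    """Integrate du/dt = v(u, t), u(0) = x0 with the classical Runge-Kutta scheme."""
    h = t_end / steps
    u, t = tuple(x0), 0.0
    for _ in range(steps):
        k1 = velocity(u, t)
        k2 = velocity(tuple(a + 0.5 * h * b for a, b in zip(u, k1)), t + 0.5 * h)
        k3 = velocity(tuple(a + 0.5 * h * b for a, b in zip(u, k2)), t + 0.5 * h)
        k4 = velocity(tuple(a + h * b for a, b in zip(u, k3)), t + h)
        u = tuple(a + h * (p + 2 * q + 2 * r + s) / 6
                  for a, p, q, r, s in zip(u, k1, k2, k3, k4))
        t += h
    return u

def exact_flow(x0, t):
    # Closed form phi(x0, t) whose velocity field is (y, -x, 1).
    a, b, c = x0
    return (a * math.cos(t) + b * math.sin(t),
            b * math.cos(t) - a * math.sin(t),
            c + t)

numeric = rk4_path((1.0, 2.0, 0.5), 1.5)
exact = exact_flow((1.0, 2.0, 0.5), 1.5)
max_error = max(abs(p - q) for p, q in zip(numeric, exact))
```

The quantity max_error measures how closely the numerically integrated path of the particle x0 = (1, 2, 0.5) agrees with φ(x0, t) at t = 1.5.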
This introduction recapitulates what we have already learned about fluid
flows. In the subsequent section we shall develop the mathematics required
to study the evolution through time of a given mass of fluid. We shall see
that the various laws of conservation of physics (mass, energy) correspond
to mathematical theorems (divergence theorem, Stokes' theorem).
8.1 Divergence and the Equation of Continuity
Let us begin with a fluid flowing through a domain in R3 according to the
equation x = φ(x0, t). According to reasonable physical assumptions, if
we define the density at a point p as the limit
\[ \rho(p) = \lim_{\Delta \to p} \frac{\text{mass}\,\Delta}{\text{vol}\,\Delta} \]
as the domain Δ shrinks uniformly down to p, then the mass of any domain
is given by integration of the density function ρ. In our case, that of a fluid
in motion, we shall express the density of the fluid at the point x at time t
as ρ(x, t). Thus, for any domain D, the mass of fluid in D at time t is
\[ \int_D \rho(x, t)\, dV \]
We can also consider the density at a particle: ρ(φ(x0, t), t) is the density
of the fluid at time t at the particle (originally at) x0. (More generally, we
always have this option of referring measurable quantities to either the spatial
or the particle coordinates. This option is a source of some confusion, as
well as of some deepening of our understanding.)
The law of conservation of matter asserts that the mass of a given object
is independent of time. If we fix a domain D, the space occupied at time t
by the fluid originally in D is the domain D_t = {φ(x0, t): x0 ∈ D}. The
mass of fluid in D_t is
\[ \int_{D_t} \rho(x, t)\, dV \]
Since mass must be conserved, this must be independent of t. Thus the
law of conservation of mass can be expressed by this equation:
\[ \frac{d}{dt} \int_{D_t} \rho(x, t)\, dV = 0 \tag{8.3} \]
for any domain D. We would prefer to state this as an equation involving
functions of points, rather than domains. In order to do that we must know
how to carry through the differentiation implied in (8.3). The problem with
(8.3) is that we have a variable domain of integration. This can be solved
by replacing that integral by one over D. We shall now briefly interrupt
this discussion with a description of the formula for change of variables in
an integral. This will allow us to compute (8.3).
Suppose now that we are given a one-to-one transformation y = F(x) of a
domain D onto a domain Δ. We assume that F is continuously differentiable,
and its differential is everywhere nonsingular. We shall require also
that dF(x) is orientation-preserving: that is, that it maps the standard basis
E1 → E2 → E3 into a right-handed system. Writing x = (x^1, x^2, x^3),
y = (y^1, y^2, y^3), the image of E_i under the linear transformation dF(x) is
just (∂F/∂x^i)(x). Thus we require that
\[ \frac{\partial F}{\partial x^1}(x) \to \frac{\partial F}{\partial x^2}(x) \to \frac{\partial F}{\partial x^3}(x) \]
be a right-handed system, which is the same as asking that
\[ \det \frac{\partial(y^1, y^2, y^3)}{\partial(x^1, x^2, x^3)} > 0 \]
With these hypotheses we have the following formula for integration under
the change of variable F. If f is an integrable function on Δ, then
\[ \int_\Delta f(y)\, dV = \int_D f(F(x)) \det \frac{\partial(y^1, y^2, y^3)}{\partial(x^1, x^2, x^3)}\, dV \tag{8.4} \]
We shall defer the derivation of this formula to the end of this section. The
motivating idea is that it is true in the small: if the function f is constant, and
the transformation F is a linear transformation, and D is a rectangle, then
(8.4) just says that the volume of the parallelepiped F(D) is det F · vol(D)
(an easily verified fact). The general case follows by locally approximating
by this case and summing over the whole domain.
Examples
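1. Find ∫_B x² y⁴ dV, where B is the unit ball. We use spherical
coordinates for this computation:
x = r sin θ cos φ, y = r sin θ sin φ, z = r cos θ
\[ \frac{\partial(x, y, z)}{\partial(r, \theta, \varphi)} = \begin{pmatrix} \sin\theta\cos\varphi & r\cos\theta\cos\varphi & -r\sin\theta\sin\varphi \\ \sin\theta\sin\varphi & r\cos\theta\sin\varphi & r\sin\theta\cos\varphi \\ \cos\theta & -r\sin\theta & 0 \end{pmatrix} \]
so
\[ \det \frac{\partial(x, y, z)}{\partial(r, \theta, \varphi)} = r^2 \sin\theta \]
\[ \int_B x^2 y^4\, dV = \int_0^\pi \int_{-\pi}^{\pi} \int_0^1 r^8 \sin^7\theta \cos^2\varphi \sin^4\varphi\, dr\, d\varphi\, d\theta \]
\[ = \int_0^1 r^8\, dr \cdot \int_{-\pi}^{\pi} \cos^2\varphi \sin^4\varphi\, d\varphi \cdot \int_0^\pi \sin^7\theta\, d\theta \]
\[ = \frac{1}{9} \cdot \frac{\pi}{8} \cdot \frac{32}{35} = \frac{4\pi}{315} \]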
2. ∫_D (x² − y²) dx dy, where D = {0 < x < 1, x − 1 < y < x}
becomes
\[ \frac{1}{2} \int_0^1 \int_{-u}^{2-u} uv\, dv\, du = \frac{1}{6} \]
under the change of variable u = x − y, v = x + y.
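3. ∫_B (x² + y² + z²) dx dy dz, where B is the domain
B = {x² + y² ≤ 1, 0 ≤ z ≤ 2}
This can be easily computed in cylindrical coordinates:
\[ \int_B (x^2 + y^2 + z^2)\, dx\, dy\, dz = \int_0^2 \int_0^1 \int_0^{2\pi} (r^2 + z^2)\, r\, d\theta\, dr\, dz \]
\[ = 2\pi \int_0^1 \Bigl(2r^3 + \frac{8}{3} r\Bigr)\, dr = \frac{11\pi}{3} \]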
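A quick numerical cross-check of the spherical-coordinate computation in Example 1 can be sketched as follows. This is an illustrative addition rather than part of the original text: it assumes the integrand x²y⁴ times the Jacobian r² sin θ factors as r⁸ · sin⁷θ · cos²φ sin⁴φ, and approximates each one-dimensional factor with a composite midpoint rule. The helper name `midpoint` and the number of subintervals are arbitrary choices.

```python
import math

def midpoint(f, a, b, n=2000):
    """Composite midpoint rule for a one-variable integral."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

# x^2 y^4 * r^2 sin(theta) = r^8 * sin(theta)^7 * cos(phi)^2 sin(phi)^4
radial = midpoint(lambda r: r ** 8, 0.0, 1.0)
polar = midpoint(lambda th: math.sin(th) ** 7, 0.0, math.pi)
azimuthal = midpoint(lambda ph: math.cos(ph) ** 2 * math.sin(ph) ** 4,
                     -math.pi, math.pi)

integral = radial * polar * azimuthal
closed_form = 4 * math.pi / 315
```

Under these assumptions the product of the three factors should agree with 4π/315 to within the quadrature error.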
We return to our fluid flow given by x = φ(x0, t). We shall express it, for
the sake of computation, in coordinates:
\[ (x^1, x^2, x^3) = \phi(x_0^1, x_0^2, x_0^3, t) \tag{8.5} \]
Since (8.5) reduces to the identity for t = 0, we have
\[ \frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}\Big|_{t=0} = I \]
Thus, the determinant
\[ J_t(x_0) = \det \frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)} \tag{8.6} \]
is positive for all small t, so we can apply the change of variable formula to
the computation of (8.3) for fixed small t. We now have the mass
conservation law expressed by
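\[ 0 = \frac{d}{dt} \int_{D_t} \rho(x, t)\, dV = \frac{d}{dt} \int_D \rho(\phi(x_0, t), t)\, J_t\, dV = \int_D \frac{\partial}{\partial t}(\rho J_t)\, dV \]
(The final equation follows since differentiation under the integral is now
allowable.) Since this must be true for every domain D, the integrand is
identically zero:
\[ \frac{\partial}{\partial t}(\rho J_t) = 0 \tag{8.7} \]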
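We can explicitly compute that derivative for t = 0, using (8.6). First,
let us consider
\[ \frac{\partial J_t}{\partial t}\Big|_{t=0} = \frac{\partial}{\partial t} \det \frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}\Big|_{t=0} \tag{8.8} \]
The determinant is the usual sum of products of the various partial derivatives
∂x^i/∂x_0^j. The derivative of such a product will have three terms, in each
of which only one factor is differentiated with respect to t. Each term is
of the form
\[ \frac{\partial}{\partial t}\Bigl(\frac{\partial r^1}{\partial s^1}\Bigr) \frac{\partial r^2}{\partial s^2} \frac{\partial r^3}{\partial s^3} \tag{8.9} \]
where {r^1, r^2, r^3} is a permutation of {x^1, x^2, x^3}, and {s^1, s^2, s^3} a
permutation of {x_0^1, x_0^2, x_0^3}. According to (8.6)
\[ \frac{\partial r}{\partial s} = 0 \ \text{ if } s \ne r_0, \qquad \frac{\partial r}{\partial s} = 1 \ \text{ if } s = r_0 \]
Thus the only relevant terms (8.9) are those where s^2 = r_0^2, s^3 = r_0^3 and, a
fortiori, s^1 = r_0^1. Finally, by the equality of mixed partial derivatives,
\[ \frac{\partial}{\partial t}\Bigl(\frac{\partial x^i}{\partial x_0^i}\Bigr)\Big|_{t=0} = \frac{\partial}{\partial x_0^i}\Bigl(\frac{\partial x^i}{\partial t}\Bigr)\Big|_{t=0} = \frac{\partial v^i}{\partial x_0^i} \]
where v = (v^1, v^2, v^3) is the velocity field of the flow (recall Equation (8.1)).
Thus, the computation of (8.8) is complete: there are only three relevant
terms, for r^1 = x^1, x^2, x^3, respectively, and we have
\[ \frac{\partial J_t}{\partial t}\Big|_{t=0} = \frac{\partial v^1}{\partial x_0^1} + \frac{\partial v^2}{\partial x_0^2} + \frac{\partial v^3}{\partial x_0^3} \tag{8.10} \]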
Definition 1. Let v = (v^1, v^2, v^3) be a differentiable vector field defined in a
domain in R3. The divergence of v is the function defined by
\[ \operatorname{div} v = \sum_{i=1}^{3} \frac{\partial v^i}{\partial x^i} \]
The name will be justified presently. We now summarize our
discussion in the following assertion.
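Proposition 1. (Equation of Continuity) Let v(x, t) be the velocity field of a
fluid flow, and ρ(x, t) its density. The law of mass conservation takes this form:
\[ \frac{\partial \rho}{\partial t} + \operatorname{div}(\rho v) = \frac{\partial \rho}{\partial t} + \sum_{i=1}^{3} v^i \frac{\partial \rho}{\partial x^i} + \rho \operatorname{div} v = 0 \tag{8.11} \]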
Proof. Referring to the preceding discussion we have seen from (8.7) that the
law of mass conservation asserts that
\[ \frac{\partial}{\partial t}\bigl(\rho(\phi(x_0, t), t)\, J_t(x_0)\bigr) = 0 \]
for all t, x0. Evaluating at t = 0, this becomes
\[ 0 = \frac{\partial}{\partial t}\rho(\phi(x_0, t), t)\Big|_{t=0} \cdot J_0(x_0) + \rho(\phi(x_0, 0), 0) \cdot \frac{\partial}{\partial t} J_t(x_0)\Big|_{t=0} \]
\[ = \sum_{i=1}^{3} \frac{\partial \rho}{\partial x^i}(x_0, 0) \frac{\partial x^i}{\partial t}(x_0, 0) + \frac{\partial \rho}{\partial t}(x_0, 0) + \rho(x_0, 0) \operatorname{div} v(x_0, 0) \tag{8.12} \]
The second expression follows from our computation above terminating in (8.10),
and the fact that J_0(x0) = 1, x0 = φ(x0, 0). Now, we could have started our clock
at any time; there is nothing special about the time t = 0 except that our formulas
are most easily computed there. Thus, (8.12) must hold for all (x, t) since it is
valid for all (x0, 0). Thus (8.11) is true. We leave the first equality as an exercise.
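Equation (8.11) can be referred to the particle coordinates of the motion:
\[ \frac{\partial \rho}{\partial t}\Big|_{x = \phi(x_0, t)} + \sum_{i=1}^{3} \frac{\partial x^i}{\partial t} \frac{\partial \rho}{\partial x^i}\Big|_{x = \phi(x_0, t)} + \rho(\phi(x_0, t), t) \operatorname{div} \frac{\partial x}{\partial t}(x_0, t) = 0 \]
which compresses into
\[ \frac{\partial}{\partial t} \rho(\phi(x_0, t), t) + \rho(\phi(x_0, t), t) \operatorname{div} \frac{\partial x}{\partial t}(x_0, t) = 0 \tag{8.13} \]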
This relates the time rate of change of density at a particle with the rate of
change of its position. A fluid flow is called incompressible if the same mass
always occupies the same volume. For an incompressible fluid flow we
must therefore have that ∫_{D_t} dV is constant for any initial domain D. Thus
\[ 0 = \frac{d}{dt} \int_{D_t} dV = \frac{d}{dt} \int_D J_t\, dV = \int_D \frac{\partial J_t}{\partial t}\, dV = \int_D \operatorname{div} v\, dV \tag{8.14} \]
for every domain D, where the last equality is (8.10), valid at t = 0; since the
clock may be started at any time, the conclusion holds at every time. Thus
div v = 0 is the necessary and sufficient condition
for a flow to be incompressible. By the equation of continuity (in the form
(8.13)) this is the same as asking that the density at a particle is also
independent of time.
Corollary 1. v is the velocity of flow of an incompressible fluid if and only
if div v = 0.
Corollary 2. The fluid is incompressible if and only if the density at a
particle is constant under all flows of the fluid.
Now the integral ∫_D div v dV is the rate of expansion of the fluid in D,
according to our computation (8.14). (Hence, the name divergence.) We
could also calculate the "infinitesimal expansion" of D by calculating the
amount of fluid which enters during an "infinitesimal" amount of time, and
subtracting from it the amount of fluid that leaves. The mathematical
expression of this will be an integral over the boundary of the domain D.
The fact that this is the same as ∫_D div v dV is the divergence theorem, which
is a fundamental fact in calculus. We shall return to this theorem and
its implications in Section 8.5.
Examples
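4. Consider the flow given by the equations
x = x0(1 + t) + t y0, y = y0(1 − t) + t x0, z = z0 e^t
If D is the original position of a mass of fluid,
D_t = {(x0(1 + t) + t y0, y0(1 − t) + t x0, z0 e^t): (x0, y0, z0) ∈ D}
and the volume of D_t is
\[ \int_{D_t} dV = \int_D \det \frac{\partial(x, y, z)}{\partial(x_0, y_0, z_0)}\, dV = \int_D e^t (1 - 2t^2)\, dV = e^t (1 - 2t^2) \operatorname{vol}(D) \]
Since (d/dt) vol(D_t) = ∫_{D_t} div v dV for every domain D, and div v does
not depend on x here (the flow is linear in x0), we have
\[ \operatorname{div} v(x, t) \cdot e^t (1 - 2t^2) = \frac{d}{dt}\, e^t (1 - 2t^2) = e^t (1 - 4t - 2t^2) \]
so that div v(x, t) = (1 − 4t − 2t²)/(1 − 2t²) for 2t² < 1.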
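The relation between the divergence and the rate of change of the Jacobian determinant in Example 4 can be checked numerically. The sketch below is an illustrative addition under stated assumptions: it inverts the flow by hand (the 2 × 2 block in x0, y0 has determinant 1 − 2t²), forms v(x, t) from (8.1), estimates div v by central differences, and compares it with (∂J_t/∂t)/J_t for J_t = e^t(1 − 2t²). The sample point and step size are arbitrary.

```python
import math

def space_to_particle(x, t):
    # Invert x = x0(1+t) + t*y0, y = t*x0 + y0(1-t), z = z0*e^t.
    xs, ys, zs = x
    d = 1 - 2 * t * t
    return (((1 - t) * xs - t * ys) / d,
            ((1 + t) * ys - t * xs) / d,
            zs * math.exp(-t))

def velocity(x, t):
    # v(x, t) = (d phi / dt)(x0, t) evaluated at x0 = psi(x, t), as in (8.1).
    x0, y0, z0 = space_to_particle(x, t)
    return (x0 + y0, x0 - y0, z0 * math.exp(t))

def divergence(x, t, h=1e-5):
    """Central-difference estimate of div v at the space point x."""
    total = 0.0
    for i in range(3):
        hi, lo = list(x), list(x)
        hi[i] += h
        lo[i] -= h
        total += (velocity(hi, t)[i] - velocity(lo, t)[i]) / (2 * h)
    return total

def jacobian(t):
    return math.exp(t) * (1 - 2 * t * t)

def jacobian_rate(t):
    return math.exp(t) * (1 - 4 * t - 2 * t * t)

point = (0.7, -0.2, 1.1)
div_at_start = divergence(point, 0.0)
div_later = divergence(point, 0.3)
ratio_later = jacobian_rate(0.3) / jacobian(0.3)
```

At t = 0 the Jacobian is 1, so div v coincides with ∂J_t/∂t there, as in (8.10); at later times the sketch compares div v with the ratio (∂J_t/∂t)/J_t.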
5. For this flow:
x = x0 e^t, y = y0 e^{−t}, z = z0 e^t + x0(1 − e^t)
we have
\[ \frac{\partial x}{\partial t} = (x_0 e^t,\ -y_0 e^{-t},\ z_0 e^t - x_0 e^t) \]
so v(x, t) = (x, −y, z − x e^{−t}) and div v = 1. Thus, for any domain
D, (d/dt) vol(D_t) = 1 · vol(D_t), so vol(D_t) = e^t vol(D). If ρ(x, t) is the
density function at time t, the equation of continuity allows us to
find ρ in terms of its initial values. Let ρ(x0, 0) = ρ(x0) be given.
Then, according to (8.13), if ρ(x0, t) is the particle density, we have
\[ \frac{\partial \rho}{\partial t}(x_0, t) = -\rho(x_0, t), \qquad \rho(x_0, 0) = \rho(x_0) \]
Thus, in particle coordinates ρ(x0, t) = ρ(x0) e^{−t}, and in space coordinates
\[ \rho(x, t) = e^{-t} \rho\bigl(x e^{-t},\ y e^{t},\ (z - x(e^{-t} - 1))\, e^{-t}\bigr) \]
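The equation of continuity (8.11) can be tested numerically on Example 5. The following sketch is an illustrative addition: it assumes a hypothetical smooth initial density `rho0`, builds ρ(x, t) = e^{−t} ρ0(ψ(x, t)) with the inverse flow ψ(x, t) = (x e^{−t}, y e^{t}, (z − x(e^{−t} − 1)) e^{−t}), and estimates ∂ρ/∂t + div(ρv) by central differences. The step size and sample point are arbitrary choices.

```python
import math

def rho0(a, b, c):
    # Hypothetical smooth initial density, chosen only for the check.
    return 1.0 + 0.5 * math.sin(a) + 0.25 * math.cos(b * c)

def rho(x, y, z, t):
    e = math.exp(-t)
    # Pull the initial density back along the inverse flow and scale by e^{-t}.
    return e * rho0(x * e, y / e, (z - x * (e - 1.0)) * e)

def velocity(x, y, z, t):
    return (x, -y, z - x * math.exp(-t))

def continuity_residual(x, y, z, t, h=1e-4):
    """Central-difference estimate of d(rho)/dt + div(rho v)."""
    def flux(i, px, py, pz):
        return rho(px, py, pz, t) * velocity(px, py, pz, t)[i]
    d_rho_dt = (rho(x, y, z, t + h) - rho(x, y, z, t - h)) / (2 * h)
    d_flux = ((flux(0, x + h, y, z) - flux(0, x - h, y, z))
              + (flux(1, x, y + h, z) - flux(1, x, y - h, z))
              + (flux(2, x, y, z + h) - flux(2, x, y, z - h))) / (2 * h)
    return d_rho_dt + d_flux

residual = continuity_residual(0.4, -0.3, 0.8, 0.5)
```

If the density formula and the velocity field are consistent, the residual should vanish up to the discretization error of the difference quotients.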
6. Suppose an incompressible fluid flows steadily in the direction
a = (a^1, a^2, a^3). That is, the path lines are parallel to the vector a.
Then the speed is constant along the paths. For the velocity field is
v(x, 0) = φ(x)a
where φ is a scalar function (the speed), and since v is divergence free,
we have
\[ \operatorname{div} v = \frac{\partial \phi}{\partial x^1} a^1 + \frac{\partial \phi}{\partial x^2} a^2 + \frac{\partial \phi}{\partial x^3} a^3 = 0 \]
But then
\[ d\phi(x)(a) = \langle \nabla\phi(x), a \rangle = 0 \]
for all x, so φ is constant along the lines parallel to a; but these are
the paths of motion.
Integration Under a Coordinate Change
Theorem 8.1. Let (u, v, w) = F(x, y, z) be an orientation-preserving change
of coordinates valid in the domain D in x, y, z space. Let Δ = {F(x, y, z):
(x, y, z) ∈ D}. If g is a function continuous on D, then
\[ \int_D g(x, y, z)\, dx\, dy\, dz = \int_\Delta g(F^{-1}(u, v, w)) \det \frac{\partial(x, y, z)}{\partial(u, v, w)}\, du\, dv\, dw \]
Proof. The proof consists in a series of reductions terminating in the
one-variable case. It is enough to show that for any point p ∈ D, this theorem is true
for some rectangle centered at p. For, once this is shown, we may cover D by
finitely many such rectangles R_1, ..., R_n. If {ρ_1, ..., ρ_n} is a partition of unity
subordinate to {R_1, ..., R_n}, then ρ_i · g is zero outside R_i. The theorem is thus
true for each ρ_i · g. Summing over i, we obtain the general result.
Thus we may concentrate our attention on a particular point p0 in D, which we
take to be the origin. If the theorem is valid for the coordinate changes u = F(x),
y = G(u), then it is also true for the composed mapping y = G(F(x)), simply because
\[ \frac{\partial(x^1, x^2, x^3)}{\partial(y^1, y^2, y^3)} = \frac{\partial(x^1, x^2, x^3)}{\partial(u^1, u^2, u^3)} \cdot \frac{\partial(u^1, u^2, u^3)}{\partial(y^1, y^2, y^3)} \]
We will decompose our mapping into a composition of four special cases, for each
of which the theorem is easy. The general result will follow by composing these
mappings.
First of all, let T be the linear mapping
\[ T = \frac{\partial(x, y, z)}{\partial(u, v, w)}\Big|_{(u, v, w) = 0} \]
Then F = (F ∘ T) ∘ T^{-1} and F ∘ T has the property that its Jacobian at 0 is the
identity. The theorem is easily seen to be true for a linear mapping (Problem 5), so
we need only prove it for F ∘ T.
Our situation is now this: we are given a change of coordinates (u, v, w) =
G(x, y, z) defined at the origin such that
\[ \frac{\partial(u, v, w)}{\partial(x, y, z)}(0) = I \]
It follows that
\[ \frac{\partial(u, y, z)}{\partial(x, y, z)}(0) = I, \qquad \frac{\partial(u, v, z)}{\partial(x, y, z)}(0) = I \]
Thus, by the inverse mapping theorem there is a neighborhood B of 0 in which
(x, y, z), (u, y, z), (u, v, z), (u, v, w) are all bona fide orientation-preserving coordinate
systems. If we denote the respective coordinate changes as follows
F_1(x, y, z) = (u, y, z)
F_2(u, y, z) = (u, v, z)
F_3(u, v, z) = (u, v, w)
then F = F_3 ∘ F_2 ∘ F_1. Each F_i changes only one coordinate at a time, and we need
only to prove the theorem for each F_i. Since the proof of each case is the same,
we shall do it only once.
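Now, here we do our computation. Let
u = h(x, y, z)
v = y
w = z
be a coordinate change defined on a rectangle
R = {−a ≤ x ≤ a, −b ≤ y ≤ b, −c ≤ z ≤ c}
centered at the origin. Let
Δ = {(u, v, w): u = h(x, v, w), −a ≤ x ≤ a, −b ≤ v ≤ b, −c ≤ w ≤ c}
If now g is a continuous function on R,
\[ \int_R g(x, y, z)\, dx\, dy\, dz = \int_{-c}^{c} \int_{-b}^{b} \Bigl[ \int_{-a}^{a} g(x, y, z)\, dx \Bigr] dy\, dz \tag{8.15} \]
Now, according to the theorem of change of variable in one dimension
\[ \int_{-a}^{a} g(x, y, z)\, dx = \int_{h(-a, y, z)}^{h(a, y, z)} g(h^{-1}(u, y, z), y, z)\, \frac{\partial h^{-1}}{\partial u}(u, y, z)\, du \]
where h^{-1}(u, y, z) denotes the value of x for which u = h(x, y, z).
Thus (8.15) becomes
\[ \int_{-c}^{c} \int_{-b}^{b} \Bigl[ \int_{h(-a, v, w)}^{h(a, v, w)} g(h^{-1}(u, v, w), v, w)\, \frac{\partial h^{-1}}{\partial u}(u, v, w)\, du \Bigr] dv\, dw \]
\[ = \int_\Delta g(h^{-1}(u, v, w), v, w) \det \frac{\partial(x, y, z)}{\partial(u, v, w)}\, du\, dv\, dw \]
The last equation follows from
\[ \frac{\partial(u, v, w)}{\partial(x, y, z)} = \begin{pmatrix} \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} & \dfrac{\partial h}{\partial z} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
\[ \det \frac{\partial(x, y, z)}{\partial(u, v, w)} = \Bigl( \det \frac{\partial(u, v, w)}{\partial(x, y, z)} \Bigr)^{-1} = \Bigl( \frac{\partial h}{\partial x} \Bigr)^{-1} = \frac{\partial h^{-1}}{\partial u} \]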
EXERCISES
1. Compute the area of these domains, using either spherical or
cylindrical coordinates:
(a) x2 + y2 + z2 ;> xyz
(b) 1 ≥ x² + y² − z² ≥ 0
(c) x² + y² ≤ z ≤ 1
(d) a²x² + b²y² + c²z² ≤ 1
2. Integrate / over the domain D
(a) f{x) = x^2z» D = {x2+y2 + z2<l}
(b) /(x) = xyz £) = {x* + y2 < 1 0<z<l}
(c) f(x) = x2+y2-z2 D = {a2x2 + b2y2<l 0<z<x2+^2}
(d) f(x) = r sin2 6> cos2 φ D = {0<x(x2 f ;y2+z2)<l}
3. What is the mass of a parabolic section:
0 ≤ z ≤ a(x² + y²)
whose density is proportional to the distance from the xy plane?
4. Find the mass of the ball of radius 1, whose density is ρ(x) = (1 + r)^{-1}.
5. Let
x = x0 + t y0, y = y0 e^t − t z0, z = z0 e^{−t} + t x0
be the equations of a flow in space.
(a) Compute the velocity field v(x, t).
(b) Compute the divergence of the flow.
(c) Assuming an initial density function which is constant, find the
density function ρ(x, t).
(d) What is the mass of the fluid in the unit cube at time t = 1?
6. Which of these fluid flows is incompressible ?
(a) v(x, f) =(-z, x,y)
(b) v(x, t) = (z2 - χ2, ζ - у, ζ)
(c) χ = х0е· + (I - t)y0, у = Усе-"2 + (I - t)z0, z =e~"2z0
(d) χ = Xo cos t + y0 sin t y = yQcost— x0 sin t ζ = αζ0(1 + t)
(e) v(x, t) = (x cos t, xy sin t, ze')
7. Find the volume at time t = 1 of the mass of fluid originally in the unit
sphere under these flows:
(a) Exercise 6(a). (b) Exercise 6(c). (c) Exercise 6(d).
8. Show that a C² function in R3 is harmonic if and only if it is the
potential of the vector field of an incompressible flow. (Hint: div ∇f = Δf.)
PROBLEMS
1. A radial field is a field of the form
v(x) = ^(||x||)x
Find all incompressible radial fields.
2. If L is a line in R3, a flow around the axis L is one whose velocity field
at any point is tangent to the cylinder with central line L. Show that the
flow of Exercise 6(d) is a flow around the ζ axis. Find another such flow
which is incompressible.
3. Find the incompressible flow whose path lines are the curves
x = x0 + u, y = y0 + sin u, z = z0
4. Find the incompressible flow whose path lines are the curves (in
cylindrical coordinates)
z = Cr-i 0 = 0O
(see Figure 8.1).
5. Prove Theorem 8.1 for the coordinate change u = T(x), where T is a
nonsingular linear transformation.
6. In the proof of Theorem 8.1, a function
u = h(x, y, z)
was found. It was tacitly assumed that ∂h/∂x > 0. Why is that so?
Express ∂h^{-1}/∂u in terms of the original functions (u, v, w) = G(x, y, z).
8.2 Curl and Rotation
The divergence of the velocity field of a flow measures the rate of expansion
of the fluid in flow, as we have seen. We shall now compute an indicator of
its rotation around a given axis. Suppose
\[ x = \phi(x_0, t) \tag{8.16} \]
is the equation of motion of the flow. Let x0 be any point, and n a direction
(unit) vector at the point x0. We shall compute the average angular velocity
in the plane orthogonal to n at the point x0 in terms of the velocity field v.
We take x0 = 0 for convenience. Since we are interested in the motion around
the axis n, relative to the motion of 0, we must work in coordinates relative
to 0. What is the same, we shall subtract from the above motion a motion
of translation by the image of 0, so that 0 remains fixed. Since translation
involves no rotation, our computation will be valid for the original motion.
Thus we replace (8.16) by the flow
\[ x = \psi(x_0, t) = \phi(x_0, t) - \phi(0, t) \tag{8.17} \]
so that in our new motion the origin is fixed.
Figure 8.1
Figure 8.2
Let C_r be a circle of radius r centered at 0 lying in the plane Π(n) orthogonal
to n. Let a be a point on C_r. After a time t, the particle originally at a has
moved to ψ(a, t). Let L be the projection of ψ(a, t) − a onto the line tangent
to C_r at a (see Figure 8.2). Let θ(t) be the angle at 0 in Π(n) between a and
a + L. Thus θ(t) is the angle in the plane orthogonal to n through which a
has moved (relative to 0) during the time t. Thus
mi) = sin — = sin
r r
where T is the unit tangent vector to C_r at a. Dividing by t and letting
t → 0, we obtain the angular velocity for the particle a in Π(n) as
(!"-"'7)10 = (ГГ(^|,.0 = <1(а'0)-Т>
= <v(a, 0) - v(0, 0), T>
according to (8.17). The sum over all of C_r of this angular velocity is called
the total circulation of the flow about C_r and is denoted circ(C_r). Thus
\[ \operatorname{circ}(C_r) = \int_{C_r} \langle v(a, 0) - v(0, 0), T \rangle\, ds \tag{8.18} \]
This number, calculated for small r, gives us some idea of the instantaneous
rotation of the flow around n at 0. If we suitably normalize ((8.18) tends
to zero as fast as r² as r → 0), and take the limit as r → 0, we will have the same kind
of information, but it will be given by a point function, rather than a function
of circles.
Definition 2. Let v be the velocity field of a flow in a domain D. For
each point x0 in D, and unit vector n, define the curl of the flow about n at x0
to be
\[ \operatorname{curl} v(x_0, n) = \lim_{r \to 0} \frac{\operatorname{circ}(C_r)}{\pi r^2} \tag{8.19} \]
where C_r is the circle of radius r centered at x0 in the plane orthogonal to n.
Example
7. Consider the flow (Figure 8.3)
x = x0 cos t + y0 sin t, y = y0 cos t − x0 sin t, z = z0 + t
Let us take x0 = (1, 0, 0) and n = E3. Then, as we have already seen,
v(x, t) = (y, −x, 1)
If we take
\[ C_r = \Bigl\{ x = 1 + r\cos\frac{s}{r},\ y = r\sin\frac{s}{r},\ z = 0 \Bigr\} \]
then
\[ \operatorname{circ}(C_r) = \int_{C_r} \langle v(x) - v(1, 0, 0), T \rangle\, ds \]
\[ = \int_0^{2\pi r} \Bigl\langle \Bigl( r\sin\frac{s}{r},\ -r\cos\frac{s}{r},\ 0 \Bigr), \Bigl( -\sin\frac{s}{r},\ \cos\frac{s}{r},\ 0 \Bigr) \Bigr\rangle\, ds \]
\[ = \int_0^{2\pi} (-r)\, r\, d\theta = -2\pi r^2 \]
Thus the xy plane rotates around (1, 0, 0) in the negative sense (with
constant angular velocity), as t changes. If now we take n = E1, we
have
\[ C_r = \Bigl\{ x = 1,\ y = r\cos\frac{s}{r},\ z = r\sin\frac{s}{r} \Bigr\} \]
\[ \operatorname{circ}(C_r) = \int_0^{2\pi r} \Bigl\langle \Bigl( r\cos\frac{s}{r},\ 0,\ 0 \Bigr), \Bigl( 0,\ -\sin\frac{s}{r},\ \cos\frac{s}{r} \Bigr) \Bigr\rangle\, ds = 0 \]
Thus there is no rotation in this plane.
Figure 8.3
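Now, we shall compute the curl explicitly in terms of the velocity field v.
Again take x0 = 0 and let α = (α^1, α^2, α^3), β = (β^1, β^2, β^3) be two unit vectors
in the plane orthogonal to n so that α → β → n is a right-handed orthonormal
basis. Thus n = α × β, so
\[ n = (\alpha^2\beta^3 - \alpha^3\beta^2,\ \alpha^3\beta^1 - \alpha^1\beta^3,\ \alpha^1\beta^2 - \alpha^2\beta^1) \tag{8.20} \]
For a time we shall compute relative to this basis. C_r has this
parametrization
\[ x = x(s) = r\cos\frac{s}{r} \cdot \alpha + r\sin\frac{s}{r} \cdot \beta \tag{8.21} \]
The tangent vector is
\[ T(s) = -\sin\frac{s}{r} \cdot \alpha + \cos\frac{s}{r} \cdot \beta \]
Expanding the velocity field in terms of this basis:
\[ v(x, 0) - v(0, 0) = v^\alpha(x)\alpha + v^\beta(x)\beta + v^n(x)n \]
Then
\[ \operatorname{circ}(C_r) = \int_{C_r} \langle v(x, 0) - v(0, 0), T(x) \rangle\, ds \]
\[ = \int_0^{2\pi r} \Bigl( -v^\alpha(x(s))\sin\frac{s}{r} + v^\beta(x(s))\cos\frac{s}{r} \Bigr)\, ds \tag{8.22} \]
Now, substitute θ = s/r in the integral and approximate the v^ν (ν = α, β)
by their differentials:
\[ v^\nu(x(\theta)) = v^\nu(0) + dv^\nu(0)(x(\theta)) + \varepsilon^\nu(x) \]
where
\[ \|x\|^{-1} \varepsilon^\nu(x) \to 0 \quad \text{as } \|x\| \to 0 \tag{8.23} \]
Since v^ν(0) = 0, using (8.21) for x(θ), we have
\[ v^\nu(x(\theta)) = r\cos\theta \cdot dv^\nu(0)(\alpha) + r\sin\theta \cdot dv^\nu(0)(\beta) + \varepsilon^\nu(x) \]
Substituting these expressions into (8.22), we obtain
\[ \operatorname{circ}(C_r) = \int_0^{2\pi} \bigl[ -dv^\alpha(0)(\alpha) + dv^\beta(0)(\beta) \bigr] r^2 \cos\theta \sin\theta\, d\theta \]
\[ + \int_0^{2\pi} \bigl[ -dv^\alpha(0)(\beta)\sin^2\theta + dv^\beta(0)(\alpha)\cos^2\theta \bigr] r^2\, d\theta \]
\[ + \int_0^{2\pi} \bigl[ -\varepsilon^\alpha(x)\sin\theta + \varepsilon^\beta(x)\cos\theta \bigr] r\, d\theta \]
\[ = \pi r^2 \bigl[ -dv^\alpha(0)(\beta) + dv^\beta(0)(\alpha) \bigr] + r \int_0^{2\pi} \bigl[ -\varepsilon^\alpha(x)\sin\theta + \varepsilon^\beta(x)\cos\theta \bigr]\, d\theta \]
Dividing by πr², and letting r → 0, the second term disappears because of
(8.23) and we obtain
\[ \operatorname{curl} v(0, n) = dv^\beta(0)(\alpha) - dv^\alpha(0)(\beta) \tag{8.24} \]
This can be rewritten in terms of the vector n. Let v = (v^1, v^2, v^3) in terms
of the standard Euclidean coordinates. Then
\[ v^\alpha(x, 0) = \langle v(x, 0) - v(0, 0), \alpha \rangle = \sum_{i=1}^{3} [v^i(x, 0) - v^i(0, 0)]\, \alpha^i \]
so
\[ dv^\alpha(0)(\beta) = \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j} \alpha^i \beta^j \]
Similarly,
\[ dv^\beta(0)(\alpha) = \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j} \beta^i \alpha^j \]
(8.24) can be expanded out as
\[ \operatorname{curl} v(0, n) = \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j} \beta^i \alpha^j - \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j} \alpha^i \beta^j \]
\[ = \Bigl( \frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3} \Bigr)(\alpha^2\beta^3 - \alpha^3\beta^2) + \Bigl( \frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1} \Bigr)(\alpha^3\beta^1 - \alpha^1\beta^3) + \Bigl( \frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2} \Bigr)(\alpha^1\beta^2 - \alpha^2\beta^1) \tag{8.25} \]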
Referring back to (8.20) we see that this is the inner product of a vector
derived from ν with the given unit vector n. We collect these results in a
definition and a proposition.
Definition 3. If v = (v^1, v^2, v^3) is a vector field defined in a domain in
R3, we define the vector field curl v by
\[ \operatorname{curl} v = \Bigl( \frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3},\ \frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1},\ \frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2} \Bigr) \tag{8.26} \]
Proposition 2. If v is the velocity field of a fluid flow, the curl of v at x0
around the direction n at time t is given by ⟨curl v(x0, t), n⟩.
Proof. Equation (8.25) is just ⟨curl v, n⟩.
Definition 4. A flow with velocity field ν is called irrotational if curl ν = 0.
Examples
8. Let v(x) = (y, −x, 1) (as in Example 7). Then
curl v = (0, 0, −2)
Thus for any plane Π = {p: ⟨p − x, n⟩ = 0} through x, the rotation
in that plane has angular velocity −2⟨n, E3⟩. Thus the maximum
rotation is about the z axis.
In general, curl v(x) spans the axis of the "infinitesimal" rotation about
x and its magnitude is the angular velocity.
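The limit in Definition 2 can be approximated directly. The sketch below is an illustrative addition, assuming the steady field v = (y, −x, 1): it evaluates circ(C_r) from (8.18) by a midpoint rule on a small circle around (1, 0, 0) and divides by πr², once with axis E3 and once with axis E1. The function `circulation`, the radius, and the number of sample points are hypothetical choices made for this check.

```python
import math

def velocity(p):
    # Steady field v(x, y, z) = (y, -x, 1).
    x, y, z = p
    return (y, -x, 1.0)

def circulation(center, alpha, beta, r, n=2000):
    """Midpoint-rule approximation of circ(C_r) as in (8.18).

    C_r is the circle center + r cos(theta) alpha + r sin(theta) beta,
    with alpha, beta orthonormal and alpha x beta equal to the axis n.
    """
    v0 = velocity(center)
    total = 0.0
    for k in range(n):
        th = 2 * math.pi * (k + 0.5) / n
        c, s = math.cos(th), math.sin(th)
        point = tuple(center[i] + r * (c * alpha[i] + s * beta[i]) for i in range(3))
        tangent = tuple(-s * alpha[i] + c * beta[i] for i in range(3))
        v = velocity(point)
        total += sum((v[i] - v0[i]) * tangent[i] for i in range(3))
    return total * (2 * math.pi * r / n)   # ds = r d(theta)

E1, E2, E3 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)
r = 0.01
curl_about_E3 = circulation((1.0, 0.0, 0.0), E1, E2, r) / (math.pi * r * r)
curl_about_E1 = circulation((1.0, 0.0, 0.0), E2, E3, r) / (math.pi * r * r)
```

For this field the two quotients should approximate ⟨curl v, E3⟩ = −2 and ⟨curl v, E1⟩ = 0, in line with Proposition 2.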
9. Let
x = x0(1 + t) + y0(1 − e^t), y = y0 e^{−t}, z = z0(1 + t)
be the equations of a flow. The velocity field is
\[ v(x, t) = \Bigl( \frac{x - y e^t - t y e^{2t}}{1 + t},\ -y,\ \frac{z}{1 + t} \Bigr) \]
thus
\[ \operatorname{curl} v(x, t) = \Bigl( 0,\ 0,\ \frac{e^t + t e^{2t}}{1 + t} \Bigr) \]
so again the rotation at any point is about the z axis. Notice that the
equations break down at t = −1. We can consider that as the initial
point of the motion: the fluid came, at t = −1, spinning off the xy
plane with infinite angular velocity.
The form of curl v recalls the discussion of closed and exact forms in the
previous chapter. If we consider the differential 1-form ω = ⟨v, dx⟩
associated to the vector field v, then curl v = 0 is the necessary condition for
ω to be the differential of a function (and by Poincaré's lemma it is locally
sufficient). In particular, if the field is conservative, then the flow induced
by the field is irrotational.
We can make physical sense of this statement by referring it to the
acceleration field a = dv/dt of the flow rather than the velocity field. By
Newton's law this is essentially the field of forces which generates the flow.
As we have seen, if this field is conservative, then the work done by the flow
in moving a mass from one point to another is precisely what is needed; it
is the same as the change in energy level. For this to be the case no work
can be expended in wastefully rotating the mass; hence the field is irrotational.
In the theory of electromagnetism the existence of two fields, the electric
E, and the magnetic H, is postulated. Certain relations between these
fields, corroborated by experimental evidence, form the basic laws of the
subject. These are Maxwell's equations. Two of these are
\[ \operatorname{curl} E + \sigma \frac{\partial H}{\partial t} = 0, \qquad \operatorname{div} H = 0 \]
(σ a suitable constant), which state that the rate of change of the magnetic
field is determined by the rotation of the electric field, and that the
"magnetic flow" is incompressible.
Here are several important relations between the gradient, curl, and
divergence which are easily derived.
curl ∇f = 0 (8.27)
div curl v = 0 (8.28)
div ∇f = Δf (8.29)
curl(fv) = f curl v + ∇f × v (8.30)
div(fv) = f div v + ⟨∇f, v⟩ (8.31)
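These identities can be spot-checked numerically before proving them. The following sketch is an illustrative addition: it builds gradient, divergence, and curl from central differences (with an assumed step H) and evaluates (8.27), (8.28), and (8.31) at one sample point for a hypothetical scalar function f and vector field v. It is a sanity check under these assumptions, not a proof.

```python
import math

H = 1e-3

def partial(g, p, i):
    """Central-difference partial derivative of scalar g at point p along axis i."""
    hi, lo = list(p), list(p)
    hi[i] += H
    lo[i] -= H
    return (g(hi) - g(lo)) / (2 * H)

def grad(f):
    return lambda p: tuple(partial(f, p, i) for i in range(3))

def div(v):
    return lambda p: sum(partial(lambda q, i=i: v(q)[i], p, i) for i in range(3))

def curl(v):
    def c(p):
        d = lambda i, j: partial(lambda q: v(q)[i], p, j)   # d v^i / d x^j
        return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return c

# Hypothetical test data: any smooth f and v would do.
f = lambda p: math.sin(p[0]) * p[1] + p[2] ** 2
v = lambda p: (p[1] * p[2], math.cos(p[0]) + p[2], p[0] * p[1] ** 2)

p = (0.3, -0.7, 1.2)
div_curl = div(curl(v))(p)             # (8.28): expected to vanish
curl_grad = curl(grad(f))(p)           # (8.27): expected to vanish
fv = lambda q: tuple(f(q) * c for c in v(q))
lhs = div(fv)(p)                       # left side of (8.31)
rhs = f(p) * div(v)(p) + sum(a * b for a, b in zip(grad(f)(p), v(p)))
```

The component order in `curl` follows Definition 3; the nested difference quotients make the check slow but keep the code free of external dependencies.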
Example
10. Suppose
A = (−x, 0, y)
is the acceleration field of a fluid in motion. Find the equations of
motion, assuming an initial velocity field of (0, 1, 0), and find the
divergence and curl of the flow.
If x = φ(x0, t) is the equation of motion, we have
\[ \phi(x_0, 0) = x_0, \qquad \frac{\partial \phi}{\partial t}(x_0, 0) = (0, 1, 0) \]
and φ(x0, t) solves the differential equation
(x, y, z)'' = (−x, 0, y)
The general solutions are
x = A_0 cos t + B_0 sin t
y = A_1 + B_1 t
z = A_2 + B_2 t + (A_1/2) t² + (B_1/6) t³
The initial conditions give these as the equations of motion:
x = x0 cos t
y = y0 + t
z = z0 + y0 t²/2 + t³/6
The velocity field is
V(x, t) = (−x tan t, 1, ty − t²/2)
div V(x, t) = −tan t
curl V(x, t) = (t, 0, 0)
Notice that at t = π/2 the holocaust arrives. Before that moment,
our fluid is moving generally in the positive y direction, rotating
about the line parallel to the x axis (counterclockwise for t > 0, as
seen from the positive x direction) and spinning away
from it (t < 0) and back again toward it when t > 0.
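The divergence and curl in Example 10 can be cross-checked with difference quotients. This sketch is an illustrative addition and assumes the velocity field in space coordinates is V(x, t) = (−x tan t, 1, ty − t²/2), obtained by substituting x0 = x/cos t and y0 = y − t into the time derivative of the equations of motion. The sample point, time, and step size are arbitrary.

```python
import math

def velocity(p, t):
    # Assumed velocity field of Example 10 in space coordinates.
    x, y, z = p
    return (-x * math.tan(t), 1.0, t * y - 0.5 * t * t)

def d(i, j, p, t, h=1e-5):
    """Central difference for d V^i / d x^j at the point p and time t."""
    hi, lo = list(p), list(p)
    hi[j] += h
    lo[j] -= h
    return (velocity(hi, t)[i] - velocity(lo, t)[i]) / (2 * h)

p, t = (0.4, 1.3, -0.6), 0.7
div_v = d(0, 0, p, t) + d(1, 1, p, t) + d(2, 2, p, t)
curl_v = (d(2, 1, p, t) - d(1, 2, p, t),
          d(0, 2, p, t) - d(2, 0, p, t),
          d(1, 0, p, t) - d(0, 1, p, t))
```

With these assumptions div_v should be close to −tan t and curl_v close to (t, 0, 0) at the chosen time.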
EXERCISES
9. Compute the curl for these fluid flows:
(a) x = x0 + t y0, y = y0 e^t − t z0, z = z0 e^{−t} + t x0
(b) v(x, y, z) = (−z, x, y)
(c) v(x, y, z) = (y, z, x)
(d) The flow described in Exercise 6(b).
(e) The flow of Exercise 6(c).
(f) The flow of Exercise 6(e).
10. Verify Equations (8.27)-(8.31).
11. Find the equations of motion and analyze the flow as in Example 10
given this acceleration field and initial velocity:
(a) A = (−y, x, 1), V(x0) = 0
(b) A = (x, z, x), V(x0) = (0, 0, 1)
12. Compute the rotation at x0 about the E2 axis for the flow of
Example 7.
PROBLEMS
7. Suppose we are given a time-independent field of forces F in a medium
of constant density (say = 1). By Newton's law the fluid will flow according
to the equation F = A. Let D be a small ball of fluid. The kinetic energy
of D at time t is
\[ \frac{1}{2} \int_{D_t} \|v\|^2\, dV \]
where v is the velocity field of the flow. Show that the work done by F in
moving D to D_t is equal to the change in kinetic energy. (Hint:
(∂/∂t)(½‖v‖²) = ⟨v, F⟩ along the path of a particle.)
8. Verify these identities:
(a) curl(g∇f) = ∇g × ∇f
(b) curl(f∇f) = 0
9. Show that if u, ν are curl-free vector fields, then u χ ν is divergence
free.
10. Show that in a ball, a vector field is a gradient if and only if its curl
is zero.
11. Let M be a 3 × 3 matrix, and consider the flow
x = exp(Mt)x0
(a) Compute the divergence and curl of the velocity field of the flow.
(b) Show that the flow is divergence free if and only if tr M = 0.
(c) Show that the flow is curl free if and only if M is symmetric.
12. Consider the flow
x = exp(Mt)x0
where M is a symmetric matrix.
(a) Show that the velocity field of the flow is conservative and has
the potential function
\[ \Pi(x) = \tfrac{1}{2} \langle Mx, x \rangle \]
(b) Show that the flow in an eigenspace with eigenvalue α is in a
straight line either toward the origin (α < 0), or away from the origin
(α > 0).
(c) Diagram the flow lines for such a flow in the plane in case the
eigenvalues (i) are the same, (ii) have the same sign, (iii) have opposite
signs.
8.3 Surfaces
A surface in R3 is (as we have been using the notion in this text) a subset
of R3 which is two dimensional. By this we mean that every point has some
neighborhood which can be put into one-to-one correspondence with a
domain in the plane. We shall assume that this correspondence is smooth:
it is given by a continuously differentiable mapping with a nonsingularity
condition on its differential.
Definition 5. A surface patch in R3 is the image of a domain D in R2
under a map x = x(u, v) with these properties:
(i) x is one-to-one.
(ii) x is continuously differentiable.
(iii) The vectors ∂x/∂u, ∂x/∂v are independent at every point.
(u, v) are called the parameters for the surface patch. The curves u = constant, and
v = constant are called the parametric curves.
A surface is a set Σ in R3 which can be covered by surface patches, that
is, every point p on Σ has a neighborhood N such that Σ ∩ N is a surface
patch.
Notice that if we fix u = c, then the function φ(v) = x(c, v) parametrizes a
curve (since φ is also one-to-one and
\[ \frac{d\phi}{dv} = \frac{\partial x}{\partial v} \]
is everywhere nonzero). The vector ∂x/∂v is thus the tangent vector to the
parametric curve u = constant. Condition (iii) asks that the curves u = c,
v = c' at any point have independent tangents. Another way of phrasing
(iii) is that the 2 × 3 matrix
\[ \begin{pmatrix} \partial x / \partial u \\ \partial x / \partial v \end{pmatrix} \]
has rank 2.
Examples
11. The sphere: x² + y² + z² = 1 (Figure 8.4). Near the point
(0, 0, 1) we can write z as a function of x and y on the plane: z =
(1 − x² − y²)^{1/2}. Thus we can use x, y to define a surface patch
surrounding (0, 0, 1):
\[ x = x(u, v) = (u,\ v,\ (1 - u^2 - v^2)^{1/2}) \]
which coordinatizes the upper hemisphere as u, v range through the
disk u² + v² < 1. Since
\[ x_u = \frac{\partial x}{\partial u} = (1,\ 0,\ -u(1 - u^2 - v^2)^{-1/2}) \]
\[ x_v = \frac{\partial x}{\partial v} = (0,\ 1,\ -v(1 - u^2 - v^2)^{-1/2}) \]
these vectors are independent. Every point on the sphere can be put
in such a surface patch, by permuting the roles of (x, y, z) above.
For example, the point (−1, 0, 0) lies in the surface patch given by
\[ x = x(u, v) = (-(1 - u^2 - v^2)^{1/2},\ u,\ v), \qquad u^2 + v^2 < 1 \]
Spherical coordinates can be used to coordinatize the whole sphere
except for the points (0, 0, ±1):
\[ x = x(\theta, \varphi) = (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ \sin\theta) \]
Figure 8.4
Figure 8.5
12. The ellipsoid (Figure 8.5)
$$a^2x^2 + b^2y^2 + c^2z^2 = 1$$
is also easily parametrized by spherical coordinates (again except for
$z = \pm c^{-1}$):
$$\mathbf{x} = \mathbf{x}(u, v) = \left(\frac{\cos u\cos v}{a}, \frac{\cos u\sin v}{b}, \frac{\sin u}{c}\right)$$
Figure 8.6
13. The paraboloid $z = x^2 + y^2$ (Figure 8.6) is a surface patch: it
is coordinatized by
$$\mathbf{x} = \mathbf{x}(u, v) = (u, v, u^2 + v^2)$$
Since $\mathbf{x}_u = (1, 0, 2u)$, $\mathbf{x}_v = (0, 1, 2v)$, they are independent.
14. The cone $z = (x^2 + y^2)^{1/2}$ (Figure 8.7) can be coordinatized,
except for the vertex, by
$$\mathbf{x} = \mathbf{x}(u, v) = (u, v, (u^2 + v^2)^{1/2}), \qquad u \ne 0,\ v \ne 0$$
We might ask if there is any way to coordinatize a neighborhood of the
vertex of the cone. It is quite difficult to show that there exists no function
which does so, but there is one important implication of the differentiability
of such a function which is easy to check out. The differentiability implies
good approximability by linear functions; thus we should anticipate the
existence of a linear surface (a plane) which comes " nearest" the surface at
a given point. This is the tangent plane, which we shall now describe by
limiting arguments as in the case of the tangent line to a curve.
Suppose ρ is a point on a surface Σ and q, r are two nearby points. The
three points p, q, r (in general) determine a plane. As q, r tend to p, this
plane will (in general) attain a limiting position: this is the tangent plane.
We now compute this process with coordinates. Suppose the function
$\mathbf{x} = \mathbf{x}(u^1, u^2)$, $(u^1, u^2) \in D$ coordinatizes $\Sigma$ near $p$. We may assume
$p = \mathbf{x}(0, 0) = 0$. Let $q = \mathbf{x}(u^1, u^2)$, $r = \mathbf{x}(v^1, v^2)$. The plane $\Pi(q, r)$ through
$p$, $q$, $r$ is then the set of all vectors perpendicular to
$$q \times r = \mathbf{x}(u^1, u^2) \times \mathbf{x}(v^1, v^2) \qquad (8.32)$$
In order to take the limit we approximate $\mathbf{x}$ by its differential
$$\mathbf{x}(u^1, u^2) = \mathbf{x}_1(0)u^1 + \mathbf{x}_2(0)u^2 + \varepsilon(\|u\|)$$
where $t^{-1}\varepsilon(t) \to 0$ as $t \to 0$. Equation (8.32) becomes
$$q \times r = (\mathbf{x}_1(0) \times \mathbf{x}_2(0))(u^1v^2 - u^2v^1) + R \qquad (8.33)$$
where we have combined all the error terms in the expression $R$. The
important behavior of $R$ is this:
$$R(u, v) = \|u\|\,\varepsilon_1(\|v\|) + \|v\|\,\varepsilon_2(\|u\|) + \varepsilon_3(\|u\|)\,\varepsilon_4(\|v\|)$$
where the $\varepsilon_i$ all have the same behavior: $t^{-1}\varepsilon_i(t) \to 0$ as $t \to 0$.
Now, so as to treat the remainder $R$ as an insignificant remainder, we must
be careful with the term $u^1v^2 - u^2v^1$. It may, for example, be zero, in which
case the remainder becomes very significant. Thus we must assume that
this term tends to zero more slowly than $R$ as $q, r \to p$. Since
$$u^1v^2 - u^2v^1 = \sin\theta\,\|u\|\,\|v\|$$
where $\theta$ is the angle from $u$ to $v$, it suffices to assume that the angle between
the coordinate vectors does not tend to zero as $q, r \to p$. Then, under this
assumption, we can divide (8.33) by $u^1v^2 - u^2v^1$, obtaining $\Pi(q, r)$ as the plane
through $p$ orthogonal to the vector
$$\mathbf{x}_1(0) \times \mathbf{x}_2(0) + R_1$$
where $R_1 \to 0$ as $q, r \to p$. Thus the limiting position of $\Pi(q, r)$ is the plane
orthogonal to $(\partial\mathbf{x}/\partial u^1) \times (\partial\mathbf{x}/\partial u^2)$ at $p$: it is the plane spanned by
$\partial\mathbf{x}/\partial u^1$, $\partial\mathbf{x}/\partial u^2$.
Figure 8.7
Definition 6. Let $p$ be a point on a surface $\Sigma$ coordinatized by $\mathbf{x} = \mathbf{x}(u^1, u^2)$.
The tangent plane to $\Sigma$ at $p$ is the plane spanned by the vectors $\partial\mathbf{x}/\partial u^1$, $\partial\mathbf{x}/\partial u^2$
at $p$.
Proposition 3. Let $p$ be a point on the surface $\Sigma$, and let $\Pi(q, r)$ be the plane
spanned by two points $q$, $r$ on $\Sigma$ so that the angle between $q - p$ and $r - p$ is
nonzero. If $q, r \to p$ so that this angle remains bounded away from zero,
then $\Pi(q, r)$ tends to the plane tangent to $\Sigma$ at $p$.
Of course the angle assumption is crucial; Problem 28 exhibits the difficulty
obtained without it.
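The role of the angle assumption can be seen numerically on the surface $z = x^2$ with $p$ at the origin, where the tangent plane is the $xy$ plane. The sketch below is illustrative only: the helper names and the value of `t` are assumptions. It computes the unit normal of the plane through $p$, $q$, $r$ for one pair of points whose parameter vectors stay at a right angle, and for the pair $q = (2t, 0, 4t^2)$, $r = (t, t^2, t^2)$ of Problem 28, whose parameter vectors become parallel.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def x(u, v):
    # the surface z = x^2, parametrized as in Problem 28
    return (u, v, u*u)

def secant_normal(q, r):
    """Unit normal of the plane through the origin, q and r (sign fixed so z >= 0)."""
    n = cross(q, r)
    length = math.sqrt(sum(c*c for c in n))
    n = tuple(c / length for c in n)
    return n if n[2] >= 0 else tuple(-c for c in n)

t = 1e-3
good = secant_normal(x(t, 0.0), x(0.0, t))      # parameter vectors at a right angle
bad = secant_normal(x(2*t, 0.0), x(t, t*t))     # angle between parameter vectors tends to 0

print(good)   # close to (0, 0, 1), the normal of the tangent plane
print(bad)    # close to (0, 1, 1)/sqrt(2), not the tangent plane normal
```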
Examples
15. There is no tangent plane to the cone
$$z = (x^2 + y^2)^{1/2}$$
at its vertex (Figure 8.7). For, if we take $q_1 = (t, 0, t)$, $q_2 = (0, t, t)$,
the plane spanned by $q_1$ and $q_2$ is the plane spanned by $(1, 0, 1)$,
$(0, 1, 1)$ for all $t > 0$. Thus this is a candidate for the tangent plane.
However, if we consider now the points $q_1 = (-t, 0, t)$, $q_2 = (0, -t, t)$
for $t > 0$, the candidate we obtain is the plane spanned by $(-1, 0, 1)$,
$(0, -1, 1)$. Since these two planes are distinct, there can be no
tangent plane (Figure 8.8).
Figure 8.8
16. The cylinder $x^2 + y^2 = 1$ is a surface. It can be coordinatized
by using cylindrical coordinates:
$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u, \sin u, v)$$
$$\mathbf{x}_u = (-\sin u, \cos u, 0)$$
$$\mathbf{x}_v = (0, 0, 1)$$
The tangent plane at $\mathbf{x}(u, v)$ is the plane orthogonal to the vector
$\mathbf{x}_u \times \mathbf{x}_v = (\cos u, \sin u, 0)$.
17. If $\mathbf{x} = \mathbf{x}(s)$ is the equation of a curve, the "surface swept out"
by its family of tangent lines is a surface. It is parametrized by
$$\mathbf{x} = \mathbf{x}(s, t) = \mathbf{x}(s) + t\mathbf{T}(s)$$
We have
$$\mathbf{x}_s = \mathbf{T}(s) + t\kappa\mathbf{N}(s), \qquad \mathbf{x}_t = \mathbf{T}(s)$$
Thus, so long as $\kappa \ne 0$, $s$, $t$ are patch coordinates for all $s$, $t > 0$. This
surface is called the developable defined by the curve. Its tangent
plane at the point $(s, t)$ is the same as the osculating plane to the
curve at $\mathbf{x}(s)$.
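Let $\Sigma$ be a surface, and $p$ a point on the surface. We shall denote the
tangent plane to $\Sigma$ at $p$ by $T(p)$. If $\mathbf{x} = \mathbf{x}(u, v)$ parametrizes $\Sigma$ in a
neighborhood of $p$, with $p = \mathbf{x}(u_0, v_0)$, then the vectors $\partial\mathbf{x}/\partial u\,(u_0, v_0)$, $\partial\mathbf{x}/\partial v\,(u_0, v_0)$
span the plane $T(p)$. The inner product on $R^3$ induces an inner product
on this plane just by restriction. It will be valuable to us to see how to
express this inner product in terms of the basis $\mathbf{x}_u$, $\mathbf{x}_v$. If $\mathbf{t} = a\mathbf{x}_u + b\mathbf{x}_v$ is
a vector in $T(p)$ its length is given by
$$\|\mathbf{t}\|^2 = \langle \mathbf{t}, \mathbf{t}\rangle = a^2\langle \mathbf{x}_u, \mathbf{x}_u\rangle + 2ab\langle \mathbf{x}_u, \mathbf{x}_v\rangle + b^2\langle \mathbf{x}_v, \mathbf{x}_v\rangle$$
Suppose that $C$ is a curve on $\Sigma$. Choose a parametrization of $C$:
$$\mathbf{x} = \mathbf{g}(s), \qquad 0 \le s \le L \qquad (8.34)$$
Let $(u(s), v(s))$ be the $(u, v)$ coordinates of $\mathbf{g}(s)$. Then (8.34) is the same as
$$\mathbf{x} = \mathbf{x}(u(s), v(s)) \qquad (8.35)$$
and by the chain rule, the tangent to $C$ is
$$\mathbf{T} = \mathbf{x}_u\frac{du}{ds} + \mathbf{x}_v\frac{dv}{ds}$$
and
$$\|\mathbf{T}\|^2 = \langle \mathbf{x}_u, \mathbf{x}_u\rangle\left(\frac{du}{ds}\right)^2 + 2\langle \mathbf{x}_u, \mathbf{x}_v\rangle\frac{du}{ds}\frac{dv}{ds} + \langle \mathbf{x}_v, \mathbf{x}_v\rangle\left(\frac{dv}{ds}\right)^2 \qquad (8.36)$$
We shall use the following notational conventions relative to coordinates
on $\Sigma$:
$$E = \langle \mathbf{x}_u, \mathbf{x}_u\rangle, \qquad F = \langle \mathbf{x}_u, \mathbf{x}_v\rangle, \qquad G = \langle \mathbf{x}_v, \mathbf{x}_v\rangle \qquad (8.37)$$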
In terms of this notation we have this way, intrinsic to the surface, for
computing the lengths of curves on $\Sigma$:
Proposition 4. Let $\Sigma$ be a surface patch parametrized by $\mathbf{x} = \mathbf{x}(u, v)$.
Let $C$ be a curve on $\Sigma$ parametrized by $\mathbf{x} = \mathbf{x}(u(t), v(t))$, $a \le t \le b$. Then
the length of $C$ is
$$\int_a^b\left[E\left(\frac{du}{dt}\right)^2 + 2F\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2\right]^{1/2} dt \qquad (8.38)$$
Proof. The length of $C$ is
$$\int_a^b \|\mathbf{T}\|\,dt$$
which is, by (8.36), given by (8.38).
We shall adopt the convention (borrowed from the differential form
notation) that $ds$ is the integrand which gives arc length along a curve. This
means just that the length of any curve $C$ is $\int_C ds$. According to (8.38) we
can be assured that
$$ds = \left[E\left(\frac{du}{dt}\right)^2 + 2F\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2\right]^{1/2} dt$$
for any parameter $t$ along $C$. We can also write this as
$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 \qquad (8.39)$$
Definition 7. The form (8.39), where $E$, $F$, $G$ are given by (8.37) relative to a
parametrization $\mathbf{x} = \mathbf{x}(u, v)$ on $\Sigma$, is called the first fundamental form of $\Sigma$.
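If $C_1$, $C_2$ are two curves given parametrically by
$$C_1: u = u_1(s), \quad v = v_1(s)$$
$$C_2: u = u_2(s), \quad v = v_2(s)$$
then their tangents are
$$\mathbf{T}_1 = \mathbf{x}_u\frac{du_1}{ds} + \mathbf{x}_v\frac{dv_1}{ds}$$
$$\mathbf{T}_2 = \mathbf{x}_u\frac{du_2}{ds} + \mathbf{x}_v\frac{dv_2}{ds}$$
At a point of intersection $p$ the vectors $\mathbf{T}_1(p)$, $\mathbf{T}_2(p)$ lie in the tangent plane
at $p$ and their inner product is
$$\langle \mathbf{T}_1, \mathbf{T}_2\rangle = E\frac{du_1}{ds}\frac{du_2}{ds} + F\left(\frac{du_1}{ds}\frac{dv_2}{ds} + \frac{du_2}{ds}\frac{dv_1}{ds}\right) + G\frac{dv_1}{ds}\frac{dv_2}{ds}$$
The curves are orthogonal at $p$ if $\langle \mathbf{T}_1, \mathbf{T}_2\rangle = 0$.
Proposition 5. The parametric curves $u$ = constant, $v$ = constant on a
surface patch are orthogonal if and only if $F = 0$.
Proof. The tangent line to $u = c$ is spanned by $\mathbf{x}_v$; the tangent line to $v = c$
is spanned by $\mathbf{x}_u$. These lines are orthogonal if and only if $\langle \mathbf{x}_u, \mathbf{x}_v\rangle = F = 0$.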
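The coefficients $E$, $F$, $G$ of (8.37) are easy to approximate numerically, which gives a quick test of Proposition 5. The sketch below is an illustration under stated assumptions: the function name `first_form`, the step size and the sample points are not from the text. It uses central differences for the partial derivatives and compares the spherical coordinates of Example 11 with the patch of Exercise 14(d).

```python
import math

def first_form(x, u, v, h=1e-5):
    """Approximate (E, F, G) of (8.37) at (u, v) by central differences."""
    xu = [(p - m) / (2*h) for p, m in zip(x(u + h, v), x(u - h, v))]
    xv = [(p - m) / (2*h) for p, m in zip(x(u, v + h), x(u, v - h))]
    dot = lambda a, b: sum(s*t for s, t in zip(a, b))
    return dot(xu, xu), dot(xu, xv), dot(xv, xv)

def sphere(u, v):
    return (math.cos(u)*math.cos(v), math.cos(u)*math.sin(v), math.sin(u))

def exercise_14d(u, v):
    return (u + v*v, v + u*u, u*v)

E, F, G = first_form(sphere, 0.4, 1.1)
print(E, F, G)                           # F vanishes: the parametric curves are orthogonal
print(first_form(exercise_14d, 1.0, 1.0))  # F is nonzero here
```

At $(u, v) = (1, 1)$ the second patch has $\mathbf{x}_u = (1, 2, 1)$ and $\mathbf{x}_v = (2, 1, 1)$, so $F = 5$ and the parametric curves are not orthogonal there.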
Examples
18. The plane $z = 0$. In the standard rectangular coordinates we
have $ds^2 = dx^2 + dy^2$. If $x = f(t)$, $y = g(t)$, $0 \le t \le L$, is any curve
joining $a$ to $b$ we have (as in Chapter 5) that the length of the curve is
$$\int_0^L (f'(t)^2 + g'(t)^2)^{1/2}\,dt$$
If we parametrize this curve by $x$ we obtain the length as
$$\int_a^b\left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx$$
If the $x$ axis is chosen through $a$ and $b$, this is minimized when $dy/dx = 0$;
that is, when the curve is a straight line. This conforms with known facts.
19. The cylinder
$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u, \sin u, v)$$
Here $\mathbf{x}_u = (-\sin u, \cos u, 0)$, $\mathbf{x}_v = (0, 0, 1)$. Thus $E = 1 = G$, $F = 0$,
so
$$ds^2 = du^2 + dv^2$$
Again, the length of a curve given as $v = v(u)$ is
$$\int\left[1 + \left(\frac{dv}{du}\right)^2\right]^{1/2} du$$
so the curves of minimal length (called geodesics) on the cylinder are
those represented by straight lines in the $u$, $v$ coordinates. Thus the
typical geodesic on the cylinder is the helix
$$\mathbf{x} = (\cos t, \sin t, at)$$
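Formula (8.38) can be evaluated numerically for any curve given in patch coordinates. The following sketch is illustrative, with assumed names and step sizes: it applies the midpoint rule to (8.38) and checks it on one turn of the helix $u = t$, $v = at$ on the cylinder, whose length is $2\pi(1 + a^2)^{1/2}$ because $ds^2 = du^2 + dv^2$ there.

```python
import math

def curve_length(E, F, G, u, v, a, b, n=2000):
    """Midpoint-rule approximation of (8.38) for a curve (u(t), v(t)), a <= t <= b."""
    h = (b - a) / n
    d = 1e-6                      # step for the numerical derivatives du/dt, dv/dt
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * h
        du = (u(t + d) - u(t - d)) / (2*d)
        dv = (v(t + d) - v(t - d)) / (2*d)
        uu, vv = u(t), v(t)
        total += math.sqrt(E(uu, vv)*du*du + 2*F(uu, vv)*du*dv + G(uu, vv)*dv*dv) * h
    return total

slope = 0.5                       # the constant a of the helix; chosen for illustration
helix_length = curve_length(
    lambda u, v: 1.0, lambda u, v: 0.0, lambda u, v: 1.0,   # E, F, G of the cylinder
    lambda t: t, lambda t: slope*t, 0.0, 2*math.pi)

print(helix_length, 2*math.pi*math.sqrt(1 + slope**2))
```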
20. For the sphere
$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u\cos v, \cos u\sin v, \sin u)$$
we have $E = 1$, $F = 0$, $G = \cos^2 u$. Thus
$$ds^2 = du^2 + \cos^2 u\,dv^2$$
Once again, we discover the geodesics by minimizing the integral
$\int_\gamma ds$. Let $a$, $b$ be two points on the sphere; by rotating the sphere
we may suppose that $a$, $b$ lie on the longitude $v = 0$. If $\gamma$ is any curve
joining $a$ to $b$, the length of $\gamma$ is
$$\int_\gamma ds = \int_\gamma (du^2 + \cos^2 u\,dv^2)^{1/2} \qquad (8.40)$$
The length of the longitude ($v = 0$) is
$$\int_a^b (du^2)^{1/2} = \int_a^b du \qquad (8.41)$$
Now (8.40) is always larger than (8.41) unless $dv = 0$ along $\gamma$; that is,
$v$ is constant. Thus it is the longitude which is the curve of the
shortest distance between $a$ and $b$. By rotating back again we
conclude that the geodesics on the sphere are the sections by diametric
planes: the great circles.
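The comparison of (8.40) with (8.41) can be illustrated numerically. In the sketch below, which is only an illustration with an assumed detour curve, the two endpoints lie on the longitude $v = 0$ at $u = 0$ and $u = 1$, and a competing curve $v = 0.3\sin(\pi u)$ with the same endpoints is measured with $ds^2 = du^2 + \cos^2 u\,dv^2$.

```python
import math

def length_on_sphere(dv_du, u0, u1, n=4000):
    """Length of a curve v = v(u) on the unit sphere, using ds^2 = du^2 + cos^2(u) dv^2."""
    h = (u1 - u0) / n
    total = 0.0
    for k in range(n):
        u = u0 + (k + 0.5) * h
        total += math.sqrt(1.0 + math.cos(u)**2 * dv_du(u)**2) * h
    return total

# the longitude v = 0 from u = 0 to u = 1
meridian = length_on_sphere(lambda u: 0.0, 0.0, 1.0)
# hypothetical detour v = 0.3 sin(pi u), which has the same endpoints
detour = length_on_sphere(lambda u: 0.3*math.pi*math.cos(math.pi*u), 0.0, 1.0)

print(meridian, detour)   # the detour is strictly longer
```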
Geodesics
The problem of finding the geodesics on any surface is more difficult,
because the general form
$$E\,du^2 + 2F\,du\,dv + G\,dv^2$$
is harder to analyze. One way to proceed is to try to find coordinates so
that the first fundamental form looks like the above examples: it has the
form
$$ds^2 = du^2 + G\,dv^2 \qquad (8.42)$$
When this is the case we can verify that the curves $v$ = constant are geodesics
(Problem 17). However, in order to find such coordinates, we must know
what we are looking for; that is, we must know how to find geodesics in the
first place. Thus, this line of reasoning has to be supplemented by the
discovery of a characteristic property of geodesics. We seek such a
characteristic property by trying to understand the "infinitesimal" behavior of a
geodesic: this (we hope) leads to a differential equation which is solvable.
Then we can carry out our original plan: solving the differential equations
will provide a convenient coordinate system in which we can discover the
curves of minimal length. We shall, however, not carry through the entire
program here; we shall only derive the basic property.
If $\gamma$ is a geodesic, a curve of minimal length, on the surface $\Sigma$, then,
relative to $\Sigma$ it is a straight line. That is, it would have to be as close to a
straight line as it could be: it should bend only as much as it must in order
to remain on $\Sigma$. Thus the rate of change of the tangent, relative to $\Sigma$, should
be zero. Infinitesimally this says that the normal to the curve has no
component on the tangent plane to $\Sigma$. We shall now show that a geodesic has
this property.
Theorem 8.2. Let $\gamma$ be a geodesic (curve of minimal length) on the surface
$\Sigma$. Then, at any point $p$ on $\gamma$, the normal to $\gamma$ is orthogonal to the tangent plane
of $\Sigma$.
Proof. Let $p \in \gamma$ and let $u$, $v$ be coordinates for $\Sigma$ near $p$ so that $p$ corresponds to $(u, v) = (0, 0)$.
We may choose these coordinates so that $\gamma$ is the curve $v = 0$ and so that the
coordinates are everywhere orthogonal (see Problems 9 and 10). Now let $a$ be
small enough so that the interval from $(-a, 0)$ to $(a, 0)$ in the $uv$ plane lies in the
domain $D$ of the coordinates. If $\Gamma$: $v = f(u)$ defines a curve lying in $D$ and joining
$(-a, 0)$ to $(a, 0)$, then $\mathbf{x} = \mathbf{x}(u, f(u))$, $-a \le u \le a$ gives another curve on $\Sigma$, joining
two points of $\gamma$ (Figure 8.9). The length of $\Gamma$ is no less than that of $\gamma$, since $\gamma$
is a geodesic.
Figure 8.9
We have not yet done enough to investigate the local behavior of $\gamma$; we must
consider a whole family of curves including $\gamma$ rather than just one other. But that
is easy to do: let $\Gamma_t$ be the curve parametrized by
$$\Gamma_t: \mathbf{x} = \mathbf{x}(u, tf(u)), \qquad -a \le u \le a$$
for $-1 \le t \le 1$. $\gamma$ is $\Gamma_0$ and $\Gamma$ is $\Gamma_1$. Let $F(t)$ be the length of $\Gamma_t$. Then $F(t)$
has a minimum at $t = 0$, so (if it is differentiable) $F'(0) = 0$. We now compute this:
$$F(t) = \int_{-a}^{a} \|\mathbf{x}_u + \mathbf{x}_v\,tf'(u)\|\,du$$
is certainly a differentiable function of $t$, and
$$F'(t) = \int_{-a}^{a} \frac{\partial}{\partial t}\|\mathbf{x}_u + \mathbf{x}_v\,tf'(u)\|\,du$$
Now, at $t = 0$, the integrand is
$$\frac{\partial}{\partial t}\langle \mathbf{x}_u + \mathbf{x}_v\,tf'(u), \mathbf{x}_u + \mathbf{x}_v\,tf'(u)\rangle^{1/2}\Big|_{t=0} = \frac{1}{2\|\mathbf{x}_u\|}\,2\langle \mathbf{x}_{uv}f(u) + \mathbf{x}_vf'(u), \mathbf{x}_u\rangle = -\frac{\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle}{\|\mathbf{x}_u\|}f(u) \qquad (8.43)$$
The last equation follows from the assumption that the coordinates are orthogonal:
$\langle \mathbf{x}_v, \mathbf{x}_u\rangle = 0$. First, the second term drops out; secondly, the expression (8.43)
derives from
$$0 = \frac{\partial}{\partial u}\langle \mathbf{x}_u, \mathbf{x}_v\rangle = \langle \mathbf{x}_{uu}, \mathbf{x}_v\rangle + \langle \mathbf{x}_u, \mathbf{x}_{vu}\rangle$$
Therefore, from $F'(0) = 0$, we obtain
$$\int_{-a}^{a} \frac{\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle}{\|\mathbf{x}_u\|}f(u)\,du = 0$$
This equation must hold for all differentiable functions $f$ such that $f(-a) = f(a) = 0$.
We conclude then that
$$\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle = 0$$
along $\gamma$ (see Miscellaneous Problem 41 of Chapter 2). Now, the normal $\mathbf{N}$ to $\gamma$
is in the plane spanned by $\mathbf{x}_u$ and $\mathbf{x}_{uu}$. Since these are both orthogonal to $\mathbf{x}_v$,
$\mathbf{N} \perp \mathbf{x}_v$. Further, $\mathbf{N}$ is orthogonal to the tangent line of $\gamma$ which is spanned by $\mathbf{x}_u$.
Thus $\mathbf{N}$ is orthogonal to both $\mathbf{x}_u$ and $\mathbf{x}_v$, so is orthogonal to the tangent plane of $\Sigma$.
Examples
21. Find the geodesics on the surface
$$\Sigma: y = x^2$$
We parametrize $\Sigma$ by $\mathbf{x} = \mathbf{x}(u, v) = (u, u^2, v)$. Let $u = u(s)$, $v = v(s)$
parametrize a geodesic $\Gamma$ on $\Sigma$. Then $\Gamma$ has the form
$$\mathbf{x} = (u(s), u^2(s), v(s))$$
and
$$\mathbf{x}_s = (u', 2uu', v')$$
$$\mathbf{x}_{ss} = \kappa\mathbf{N} = (u'', 2(u')^2 + 2uu'', v'')$$
For $\Gamma$ to be a geodesic, this must be orthogonal to both
$$\mathbf{x}_u = (1, 2u, 0), \qquad \mathbf{x}_v = (0, 0, 1)$$
Thus, the functions $u(s)$, $v(s)$ parametrizing the geodesic $\Gamma$ satisfy these
differential equations
$$u'' + 2u[2(u')^2 + 2uu''] = 0$$
$$v'' = 0$$
Notice that from Picard's theorem the equations
$$u'' = \frac{-4u(u')^2}{1 + 4u^2}$$
$$v'' = 0$$
have unique solutions given the initial values of $u$, $v$, $u'$, $v'$. Thus,
there exists a curve of minimal length in every direction, at every
point.
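The geodesic equations of Example 21 can be integrated numerically. The sketch below is an illustration only: it uses a classical fourth-order Runge-Kutta step (a standard technique, not something the text prescribes) with assumed initial values, and checks the one property that is easy to verify, namely that the speed $E(u')^2 + G(v')^2 = (1 + 4u^2)(u')^2 + (v')^2$ stays constant along a solution.

```python
def rhs(state):
    """Right-hand side of u'' = -4u(u')^2/(1 + 4u^2), v'' = 0 as a first-order system."""
    u, v, p, q = state            # p = u', q = v'
    return (p, q, -4.0*u*p*p / (1.0 + 4.0*u*u), 0.0)

def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    k1 = rhs(state)
    k2 = rhs(tuple(s + 0.5*h*k for s, k in zip(state, k1)))
    k3 = rhs(tuple(s + 0.5*h*k for s, k in zip(state, k2)))
    k4 = rhs(tuple(s + h*k for s, k in zip(state, k3)))
    return tuple(s + h*(a + 2*b + 2*c + d)/6.0
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def speed_squared(state):
    """||x_s||^2 = E u'^2 + G v'^2 with E = 1 + 4u^2, F = 0, G = 1 for x = (u, u^2, v)."""
    u, v, p, q = state
    return (1.0 + 4.0*u*u)*p*p + q*q

state = (0.5, 0.0, 0.5, 0.5)      # assumed initial values of u, v, u', v'
initial = speed_squared(state)
for _ in range(1000):
    state = rk4_step(state, 0.001)
final = speed_squared(state)

print(initial, final)             # the two values agree to high accuracy
```

Since $v'' = 0$, the coordinate $v$ is linear in $s$, so after integrating to $s = 1$ the value of $v$ is $0.5$.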
22. Find the geodesics on the cone
$$\Sigma: z^2 = x^2 + y^2$$
Notice that any plane $z + x\cos a + y\sin a = b$ intersects $\Sigma$ at right
angles (Figure 8.10). Thus the normal to the curve of intersection is
orthogonal to the surface, and such a plane always intersects $\Sigma$ in a
geodesic. More generally, we can compute the equations for any
geodesic using Theorem 8.2:
$$\Sigma: \mathbf{x} = \mathbf{x}(u, v) = (v\cos u, v\sin u, v)$$
$$\mathbf{x}_u = (-v\sin u, v\cos u, 0)$$
$$\mathbf{x}_v = (\cos u, \sin u, 1)$$
If $u = u(s)$, $v = v(s)$ parametrizes a geodesic $\Gamma$, then on $\Gamma$
$$\mathbf{x}_s = (v'\cos u - vu'\sin u,\ v'\sin u + vu'\cos u,\ v')$$
$$\mathbf{x}_{ss} = (v''\cos u - 2v'u'\sin u - vu''\sin u - v(u')^2\cos u,\ v''\sin u + 2v'u'\cos u + vu''\cos u - v(u')^2\sin u,\ v'')$$
The differential equations are readily computed (and hardly solved
explicitly) by expressing $\langle \mathbf{x}_{ss}, \mathbf{x}_u\rangle = 0$, $\langle \mathbf{x}_{ss}, \mathbf{x}_v\rangle = 0$.
Figure 8.10
Surface Area
We would like now to define the area of a surface in a way analogous to
the definition of the length of a curve. We select a collection of points
$\mathbf{x}_1, \ldots, \mathbf{x}_k$ on $\Sigma$ and replace $\Sigma$ by the polygonal surface $\Sigma'$ whose vertices are
$\mathbf{x}_1, \ldots, \mathbf{x}_k$. If the points $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are very numerous and close to each
other, then the sum of the areas of the faces of $\Sigma'$ is a good approximation
to the area of $\Sigma$. We can then try to define the area of $\Sigma$ to be the limit of
such sums as the set of points $\mathbf{x}_1, \ldots, \mathbf{x}_k$ becomes infinitely numerous and
everywhere dense.
Now this definition unfortunately does not work: there are ways of
partitioning a surface so as to obtain any desired area (for a fuller account see
Spivak, pp. 128-130). Rather than give it all up as a hopeless task because
of this phenomenon, we try a different approach. First, we study the
approximation of area in the small, hoping to generate a plausible formula for
surface area (by plausible I mean that approximations to our formula are
also approximations to our notion of area). If the formula turns out to be
intrinsic, that is, independent of parametrizations, then it will define a relevant
measure, which we shall call surface area. Returning to the above
"approximation," let $F$ be one of the faces of $\Sigma'$, and $\mathbf{x}_0$ one of its vertices. Let $F_0$
be the projection of $F$ onto $\Sigma$ (see Figure 8.11) and $F_1$ the projection onto
the tangent plane $T(\mathbf{x}_0)$. If the surface is very smooth, then for small $F$
these three surfaces have essentially the same area, and we can confuse the
three. We may suppose that $F_0$ lies in a patch parametrized by $\mathbf{x} = \mathbf{x}(u, v)$
with $\mathbf{x}_0 = \mathbf{x}(u_0, v_0)$. Let $D$ be such that
$$F_0 = \{\mathbf{x}(u, v);\ (u, v) \in D\}$$
Confusing the surface with $F_1$, we may take $\mathbf{x}$ to be the linear map
$$\mathbf{x}(u, v) = \mathbf{x}_0 + \mathbf{x}_u(u_0, v_0)u + \mathbf{x}_v(u_0, v_0)v$$
Now, we know how to compute area on the image of a linear map:
$$\text{area}(F_1) = \|\mathbf{x}_u(u_0, v_0) \times \mathbf{x}_v(u_0, v_0)\|\ \text{area}(D)$$
This is true because it is true for rectangles, as we have seen in Proposition
28 of Chapter 1. Thus, at least on this coordinate patch, the area of $\Sigma'$ is
very close to
$$\sum_i \|\mathbf{x}_u(u_i, v_i) \times \mathbf{x}_v(u_i, v_i)\|\ \text{area}(D_i)$$
where the $\{D_i\}$ partition the coordinate domain $D$ and $(u_i, v_i) \in D_i$. The
limit of such sums is
$$\int_D \|\mathbf{x}_u \times \mathbf{x}_v\|\,du\,dv$$
We take this to be the definition of surface area.
Figure 8.11
Definition 8. Let $\Sigma$ be a surface patch with coordinates $u$, $v$ ranging
through $D$ in $R^2$. The area of $\Sigma$ is
$$\int_D \|\mathbf{x}_u \times \mathbf{x}_v\|\,du\,dv$$
If $\Sigma$ is a surface, partition $\Sigma$ into pieces $D_1, \ldots, D_k$ such that each $D_i$ is a
surface patch. Define
$$\text{area}(\Sigma) = \sum_i \text{area}(D_i)$$
We must show that this definition is independent of the particular partition.
Proposition 6. The above definition is independent of the partition of $\Sigma$
chosen.
Proof. Suppose we also partition $\Sigma$ another way: $\Sigma = E_1 \cup \cdots \cup E_n$. Then
$$\Sigma = (D_1 \cap E_1) \cup (D_1 \cap E_2) \cup \cdots \cup (D_k \cap E_n)$$
is still a third partition. Clearly,
$$\text{area}(E_i) = \sum_j \text{area}(D_j \cap E_i) \qquad (8.44)$$
$$\text{area}(D_j) = \sum_i \text{area}(D_j \cap E_i) \qquad (8.45)$$
since in each case we are computing relative to the same coordinates. We leave it
to the reader (see Problem 18) to verify that the computation of the area of $D_j \cap E_i$
is the same whether it is done in the $D_j$ or $E_i$ coordinates. Then, summing (8.44)
over $i$, and (8.45) over $j$, the right-hand sides are the same; and so are the lefts, as
desired.
In accordance with our convention to denote $ds$ as the integrand for arc
length, we shall let $dS$ denote the integrand for surface area. Thus, in terms
of any coordinate system $u$, $v$ we have $dS = H\,du\,dv$, where $H = \|\mathbf{x}_u \times \mathbf{x}_v\|$.
It follows from Lagrange's identity (Chapter 1) that also $H = (EG - F^2)^{1/2}$.
Examples
23. Find the area of the sphere
$$\{x^2 + y^2 + z^2 = R^2\}$$
We use spherical coordinates:
$$\mathbf{x} = (R\cos u\cos v, R\cos u\sin v, R\sin u)$$
$$\mathbf{x}_u = (-R\sin u\cos v, -R\sin u\sin v, R\cos u)$$
$$\mathbf{x}_v = (-R\cos u\sin v, R\cos u\cos v, 0)$$
so $H = [EG - F^2]^{1/2} = R^2|\cos u|$. The area is
$$\int_{-\pi}^{\pi}\int_{-\pi/2}^{\pi/2} R^2\cos u\,du\,dv = 4\pi R^2$$
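Definition 8 lends itself to a direct numerical check. The sketch below is illustrative, with assumed helper names, step sizes and grid size: it approximates $H$ both as $\|\mathbf{x}_u \times \mathbf{x}_v\|$ and as $(EG - F^2)^{1/2}$ using central differences, and then sums $H$ over a midpoint grid in the parameter domain to recover the area $4\pi R^2$ of the sphere.

```python
import math

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(s*t for s, t in zip(a, b))

def area_elements(x, u, v, h=1e-5):
    """Return H computed as ||x_u x x_v|| and as sqrt(EG - F^2)."""
    xu = [(p - m) / (2*h) for p, m in zip(x(u + h, v), x(u - h, v))]
    xv = [(p - m) / (2*h) for p, m in zip(x(u, v + h), x(u, v - h))]
    n = cross(xu, xv)
    E, F, G = dot(xu, xu), dot(xu, xv), dot(xv, xv)
    return math.sqrt(dot(n, n)), math.sqrt(max(E*G - F*F, 0.0))

def area(x, u0, u1, v0, v1, n=100):
    """Midpoint-rule approximation of the integral of H du dv over a rectangle."""
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            H, _ = area_elements(x, u0 + (i + 0.5)*hu, v0 + (j + 0.5)*hv)
            total += H * hu * hv
    return total

R = 2.0   # radius chosen for illustration

def sphere(u, v):
    return (R*math.cos(u)*math.cos(v), R*math.cos(u)*math.sin(v), R*math.sin(u))

H_cross, H_form = area_elements(sphere, 0.3, 1.0)
sphere_area = area(sphere, -math.pi/2, math.pi/2, -math.pi, math.pi)
print(H_cross, H_form)
print(sphere_area, 4*math.pi*R*R)
```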
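24. The area of the piece of the paraboloid
$$\{z = x^2 + y^2,\ 0 < z < 1\}$$
The parametrization is
$$\mathbf{x} = (r\cos u, r\sin u, r)$$
$$\mathbf{x}_r = (\cos u, \sin u, 1), \qquad \mathbf{x}_u = (-r\sin u, r\cos u, 0)$$
$E = 2$, $F = 0$, $G = r^2$, $H = (EG - F^2)^{1/2} = \sqrt{2}\,r$. The area is
$$\int_0^{2\pi}\int_0^1 \sqrt{2}\,r\,dr\,d\theta = \sqrt{2}\,\pi$$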
• EXERCISES
13. Let $f$ be a $C^1$ function defined in a domain $D$ in $R^2$.
(a) Show that $\Sigma$: $\{z = f(x, y)\}$ is a surface patch with coordinates $x$, $y$.
(b) Compute the first fundamental form and the area element for $f$.
(c) Show that the element of area is given by $\sec\gamma\,dx\,dy$, where $\gamma$ is
the angle between the normal to $\Sigma$ and the $z$ axis.
14. Find the tangent plane, first fundamental form and area element for
these surfaces:
(a) The paraboloid $x = y^2 + z^2$.
(b) The cone $z^2 = x^2 + y^2$.
(c) The hyperboloid $z = x^2 - y^2$.
(d) $\Sigma$: $\mathbf{x}(u, v) = (u + v^2, v + u^2, uv)$
15. Find the length of the intersection of these surfaces:
(a) x2 + y2 + z2 = 1
ix2 + 2y2 + 2z2 = 1
(b) z2 = 2x2+^2
ζ = χ2 + 2y2
16. Find the angle between the parametric curves at a general point for
the surface given in Exercise 14(d).
17. Find the area cut off the tip of the paraboloid $x^2 = y^2 + z^2$ by the
plane $x + z = 1$.
18. Find the area of these surfaces:
(a) The cone $z^2 = x^2 + y^2$, $0 < z < a$.
(b) $\Sigma$: $\mathbf{x} = (u, \cos u, v)$, $0 < v < \pi$, $-\pi < u < \pi$.
(c) The part of the hyperboloid $z = x^2 - y^2$ inside the unit ball.
(d) The ellipsoid $x^2 + y^2 + 4z^2 = 4$.
• PROBLEMS
13. Recall that a differential form $M\,du + N\,dv$ determines a family of
curves: those curves along which $M\,du + N\,dv = 0$. If $ds^2 = E\,du^2
+ 2F\,du\,dv + G\,dv^2$ is the first fundamental form of a surface patch, show
that the family of curves orthogonal to the family defined by $M\,du + N\,dv = 0$
is determined by
$$(EN - FM)\,du + (FN - GM)\,dv = 0$$
14. Let $p_0$ be a point on the surface $\Sigma$. Show that we can find a surface
patch near $p_0$ so that the parametric curves are orthogonal. (Hint: Let
$u$, $v$ be coordinates near $p_0$ and explicitly find the family of curves $u = u(t, c)$,
$v = v(t, c)$ orthogonal to the curves $dv = 0$ such that $u(0, c) = 0$, $v(0, c) = v_0$.
Show that $v$, $c$ are orthogonal coordinates.)
15. Let $\gamma$ be a curve on the surface $\Sigma$. Find orthogonal coordinates
$u$, $v$ at a point $p_0$ on $\gamma$ so that (i) $\gamma$ is the curve $v = 0$, (ii) $u$ is arc length
along $\gamma$.
16. Show that a cube is not a surface along its edges.
17. Is $ds$ a differential 1-form?
18. Find the differential equations for the geodesics on the torus (Figure
8.12):
$$x = (1 - \cos\phi)\sin\theta$$
$$y = (1 - \cos\phi)\cos\theta$$
$$z = \sin\phi$$
Figure 8.12
19. Find those planes which intersect the ellipsoid $x^2 + a^2y^2 + b^2z^2 = 1$ in a
geodesic.
20. Let $\{(u, v) \in R^2: u > 0, v > 0\}$ parametrize a surface with first
fundamental form $ds^2 = v^2\,du^2 + u^2\,dv^2$. Find the equation of the family of
curves orthogonal to the curves $uv$ = constant, and express the fundamental
form in terms of these new coordinates.
21. Find the geodesics on the surface with first fundamental form
$$ds^2 = du^2 + f(u)\,dv^2$$
22. Show that the curves $v$ = constant on a surface with first fundamental
form $E\,du^2 + G\,dv^2$ are geodesics if and only if $\partial E/\partial v = 0$.
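23. Let $\Sigma$ be a surface patch with two different coordinates:
$$\Sigma: \mathbf{x} = \mathbf{x}(u, v), \qquad (u, v) \in D$$
$$\Sigma: \mathbf{x} = \mathbf{x}(r, s), \qquad (r, s) \in \Delta$$
Show that
$$\left\|\frac{\partial\mathbf{x}}{\partial u} \times \frac{\partial\mathbf{x}}{\partial v}\right\| du\,dv = \left\|\frac{\partial\mathbf{x}}{\partial r} \times \frac{\partial\mathbf{x}}{\partial s}\right\| dr\,ds$$
(Hint: Define $u = u(r, s)$, $v = v(r, s)$ by this property: $\mathbf{x} = \mathbf{x}(r, s)$ if and only
if $\mathbf{x} = \mathbf{x}(u, v)$ with $u = u(r, s)$, $v = v(r, s)$. Show that
$$\frac{\partial\mathbf{x}}{\partial r} \times \frac{\partial\mathbf{x}}{\partial s} = \left(\frac{\partial\mathbf{x}}{\partial u} \times \frac{\partial\mathbf{x}}{\partial v}\right)\frac{\partial(u, v)}{\partial(r, s)}\ )$$
The following problems use the normal to a surface $\Sigma$: this is a unit vector
$\mathbf{N}$ orthogonal to the tangent plane.
24. Let $\gamma$ be a curve on the surface $\Sigma$. Let $\mathbf{N}$ represent the normal to
$\Sigma$, and $\mathbf{T}$ the tangent to $\gamma$. The unit surface normal to $\gamma$ is the vector
$\mathbf{N}_\gamma = \mathbf{N} \times \mathbf{T}$.
(a) Show that $\gamma$ is a geodesic on $\Sigma$ if and only if $\langle \mathbf{N}_\gamma, d\mathbf{T}/ds\rangle = 0$.
(b) In general, the inner product $\kappa_g = \langle \mathbf{N}_\gamma, d\mathbf{T}/ds\rangle$ is called the
geodesic curvature of $\gamma$ on $\Sigma$. Suppose $(u, v)$ are orthogonal coordinates
on $\Sigma$ and $\kappa_g^1$, $\kappa_g^2$ are the geodesic curvatures of the lines $v$ = constant,
$u$ = constant, respectively. Verify Liouville's formula: the geodesic
curvature of the curve $\gamma$ is given by
$$\kappa_g = \frac{d\theta}{ds} + \kappa_g^1\cos\theta + \kappa_g^2\sin\theta$$
where $\theta$ is the angle between the tangent to $\gamma$ and the direction $\mathbf{x}_u$.
(Hint: Write
$$\mathbf{T} = \mathbf{T}_1\cos\theta + \mathbf{T}_2\sin\theta$$
where $\mathbf{T}_1$, $\mathbf{T}_2$ are the tangents to the curves $v$ = constant, $u$ = constant.
Then
$$\frac{d\mathbf{T}}{ds} = \frac{d\mathbf{T}_1}{ds}\cos\theta + \frac{d\mathbf{T}_2}{ds}\sin\theta + (-\mathbf{T}_1\sin\theta + \mathbf{T}_2\cos\theta)\frac{d\theta}{ds}$$
Substitute these expressions into
$$\kappa_g = \left\langle \mathbf{N} \times \mathbf{T}, \frac{d\mathbf{T}}{ds}\right\rangle$$
and evaluate at $\theta = 0$, $\theta = \pi/2$.)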
25. If $\gamma$ is a curve on the surface $\Sigma$ we can decompose $d\mathbf{T}/ds$ into its
components tangent to and orthogonal to the surface:
$$\frac{d\mathbf{T}}{ds} = \kappa_g\mathbf{N}_\gamma + \kappa_N\mathbf{N}$$
where $\kappa_N$ is called the normal curvature to $\mathbf{T}$.
(a) Show that the curvature of $\gamma$ is $(\kappa_g^2 + \kappa_N^2)^{1/2}$.
(b) Show that the normal curvature of a curve $\gamma$ depends only on the
tangent to $\gamma$ and is the same as the curvature of the curve of intersection
of $\Sigma$ with the plane through $\mathbf{T}$ and $\mathbf{N}$.
(c) Show that the curvature of the curve $\gamma$ is given by $\kappa_N(\mathbf{T})\sec\theta$,
where $\theta$ is the angle between $d\mathbf{T}/ds$ and $\mathbf{N}$.
26. Using Liouville's formula find the geodesic curvature of a general
curve on the surface obtained by revolving the curve $z = \exp(-x^2)$ around
the $z$ axis.
27. Let $\Sigma$ be a surface such that at every point every curve on $\Sigma$ has
zero normal curvature. Show that $\Sigma$ is a piece of a plane.
28. Let $p$ be a point on a surface $\Sigma$ and let $q$, $r$ be two nearby points. It
is possible to select $q$, $r$ tending to $p$ so that the plane determined by
$p$, $q$, $r$ does not converge to the tangent plane (unless $\Sigma$ is itself a plane).
For example, if $\gamma$ is the curve of intersection of some plane $\Pi$ with $\Sigma$ and if $r$
follows $q$ along $\gamma$ then the plane determined by $p$, $q$, $r$ is always $\Pi$, which
need not be the tangent plane to $\Sigma$. Furthermore, if we move $q$ slightly
off $\gamma$ we can be sure of the same behavior with the requirement that the angle
between $q$ and $r$ (in some parametrization) is not zero (however, it must tend
to zero). Here is an explicit example. $\Sigma$ is the surface $z = x^2$
parametrized by $\mathbf{x}(u, v) = (u, v, u^2)$. The tangent plane at $p$, the origin, is the $xy$
plane. However, if
$$q = (2t, 0, 4t^2), \qquad r = (t, t^2, t^2)$$
then the plane determined by $p$, $q$, $r$ tends (as $t \to 0$) to the plane orthogonal
to $(0, 1, 1)$.
8.4 Surface Integrals and Stokes' Theorem
Suppose that / is a continuous function defined in a domain D in R3,
and Σ is a surface in D. We can verify by an argument identical to that in
Proposition 6 that the following definition makes sense independently of the
coordinate choices involved.
Definition 9. Partition $\Sigma$ into subsets $\Sigma_1, \ldots, \Sigma_n$ of surface patches on $\Sigma$.
Define the integral of $f$ over $\Sigma$ to be
$$\int_\Sigma f\,dS = \sum_i \int_{\Sigma_i} f\,H\,du\,dv$$
where $H\,du\,dv$ is the surface area element in the patch containing $\Sigma_i$.
Examples
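25. $I = \int_\Sigma x^2y^2z\,dS$, where $\Sigma$ is the hemisphere $\Sigma$: $\{(x, y, z)$:
$x^2 + y^2 + z^2 = 1$, $z > 0\}$. Using the same parametrization as in
Example 23, we have
$$I = \int_{-\pi}^{\pi}\int_0^{\pi/2} \cos^5 u\,\cos^2 v\,\sin^2 v\,\sin u\,du\,dv$$
$$= \frac{1}{4}\int_{-\pi}^{\pi} \sin^2 2v\,dv \int_0^{\pi/2} \cos^5 u\,\sin u\,du = \frac{\pi}{24}$$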
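The value $\pi/24$ of Example 25 can be confirmed by evaluating Definition 9 numerically. The sketch below is illustrative only, with an assumed grid size: it integrates $x^2y^2z$ against $dS = \cos u\,du\,dv$ on the unit upper hemisphere in the spherical coordinates of Example 23 with $R = 1$, using the midpoint rule.

```python
import math

def hemisphere_integral(f, n=200):
    """Midpoint-rule integral of f over the unit upper hemisphere, with dS = cos(u) du dv."""
    hu = (math.pi / 2) / n        # u ranges over [0, pi/2]
    hv = (2 * math.pi) / n        # v ranges over [-pi, pi]
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * hu
        for j in range(n):
            v = -math.pi + (j + 0.5) * hv
            x = math.cos(u) * math.cos(v)
            y = math.cos(u) * math.sin(v)
            z = math.sin(u)
            total += f(x, y, z) * math.cos(u) * hu * hv
    return total

I = hemisphere_integral(lambda x, y, z: x*x*y*y*z)
print(I, math.pi / 24)
```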
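26. $I = \int_\Sigma (x + y^2)\,dS$, where $\Sigma$ is the piece of the paraboloid given
in Example 24. With the parametrization $\mathbf{x} = (r\cos u, r\sin u, r)$ used there, $dS = \sqrt{2}\,r\,dr\,du$, so
$$I = \sqrt{2}\int_0^{2\pi}\int_0^1 [r^2\cos u + r^3\sin^2 u]\,dr\,du = \frac{\sqrt{2}\,\pi}{4}$$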
Normal and Orientation
Let $\Sigma$ be a surface in $R^3$. The tangent plane to $\Sigma$ at a point $\mathbf{x}_0$ is a
two-dimensional plane, thus its orthogonal complement is a line, called the
normal line to $\Sigma$ at $\mathbf{x}_0$. The normal vector $\mathbf{N}$ is a choice of unit vector lying
on this line which varies continuously with the point. Such a choice is
always possible locally, but is not always possible over the whole surface.
Figure 8.13 (the Moebius band)
Consider the surface depicted in Figure 8.13 (called the Moebius band).
This is obtained from a rectangle (Figure 8.14) by gluing together the
vertical sides so that vertices with corresponding labels abut. There is no
way to continuously select a normal vector to this surface which does not
point in the opposite direction when traced around the circle in Figure 8.13.
Notice that the same kind of phenomenon is put in evidence by Figure 8.14:
a right-handed basis gets transformed into a left-handed basis when we cross
the vertical line. We express this by saying that the Moebius band is not
orientable.
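The failure of orientability can be observed numerically. The sketch below is an illustration under assumptions: the text gives no formula for the Moebius band, so a commonly used parametrization is taken here, and the helper names and step sizes are hypothetical. It follows the unit normal $\mathbf{x}_u \times \mathbf{x}_v / \|\mathbf{x}_u \times \mathbf{x}_v\|$ once around the center circle $v = 0$; the normal varies continuously but returns to the starting point reversed.

```python
import math

def mobius(u, v):
    # a common parametrization of the Moebius band (an assumption; not from the text)
    r = 1.0 + 0.5 * v * math.cos(0.5 * u)
    return (r * math.cos(u), r * math.sin(u), 0.5 * v * math.sin(0.5 * u))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1],
            a[2]*b[0] - a[0]*b[2],
            a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(s*t for s, t in zip(a, b))

def unit_normal(u, v=0.0, h=1e-6):
    """Unit vector along x_u x x_v, with the partials taken by central differences."""
    xu = [(p - m) / (2*h) for p, m in zip(mobius(u + h, v), mobius(u - h, v))]
    xv = [(p - m) / (2*h) for p, m in zip(mobius(u, v + h), mobius(u, v - h))]
    n = cross(xu, xv)
    length = math.sqrt(dot(n, n))
    return tuple(c / length for c in n)

steps = 200
normals = [unit_normal(2 * math.pi * k / steps) for k in range(steps + 1)]

# neighbouring normals nearly agree, so the choice is continuous along the circle
continuous = all(dot(normals[k], normals[k + 1]) > 0.99 for k in range(steps))
# yet after one full turn the normal at the same point is reversed
flip = dot(normals[0], normals[-1])
print(continuous, flip)
```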
Thus in two dimensions we find a problem which does not exist in one
dimension. We ran into the same problem in the discussion of integration
under a change of variable in the plane, and we successfully sidestepped it
then. But we cannot avoid it now. We shall refer to an orientation on a
surface $\Sigma$ in $R^3$ as a choice of a sense of positive rotation in the tangent
plane at every point. This choice is assumed to vary continuously: that is,
if $\mathbf{v}_1$, $\mathbf{v}_2$ are nowhere collinear continuous vector fields defined on the surface
and the rotation $\mathbf{v}_1 \to \mathbf{v}_2$ is positive at $\mathbf{x}_0$ it must be so in a neighborhood of
$\mathbf{x}_0$. A choice of orientation is equivalent to a choice of normal vector.
For, if a normal $\mathbf{N}$ is chosen we define positive rotation in the tangent plane
as follows: $\mathbf{v}_1 \to \mathbf{v}_2$ is positive if $\mathbf{v}_1 \to \mathbf{v}_2 \to \mathbf{N}$ is a right-handed system.
Conversely, if an orientation is chosen we can define $\mathbf{N} = \mathbf{v}_1 \times \mathbf{v}_2$, where
$\mathbf{v}_1$, $\mathbf{v}_2$ are orthogonal unit vectors and the rotation $\mathbf{v}_1 \to \mathbf{v}_2$ is positive. If $\Sigma$ is oriented
and $(u, v)$ are coordinates on a patch in $\Sigma$, we shall say that $(u, v)$ is a positively
oriented coordinate system if the rotation $\mathbf{x}_u \to \mathbf{x}_v$ is positive. Here is a fact
relating positively oriented coordinate systems which completes the
discussion.
Proposition 7. If $(u, v)$ and $(u', v')$ are two positively oriented coordinate
systems defined on the oriented surface $\Sigma$, then
$$\frac{\partial(u, v)}{\partial(u', v')} > 0$$
Examples
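27. If $f$ is a $C^1$ function defined on a domain $D$ in the $xy$ plane,
then the graph
$$\Gamma(f): z = f(x, y)$$
is a surface patch. We consider it oriented so that the rotation from
$\mathbf{x}_x = (1, 0, \partial f/\partial x)$ to $\mathbf{x}_y = (0, 1, \partial f/\partial y)$ is positive. Then the normal
vector $\mathbf{N}$ always points upward out of the surface ($N^3 > 0$):
$$\mathbf{N} = \left(1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right)^{-1/2}\left(-\frac{\partial f}{\partial x}, -\frac{\partial f}{\partial y}, 1\right)$$
28. More generally, we can always orient a surface patch $\Sigma$: $\mathbf{x} =
\mathbf{x}(u, v)$, $(u, v) \in D$ by transferring the orientation from the $u$, $v$ plane.
That is, we take $\mathbf{x}_u \to \mathbf{x}_v$ as the positive sense of orientation. Then
the normal to $\Sigma$ is
$$\mathbf{N} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{\|\mathbf{x}_u \times \mathbf{x}_v\|}$$
Figure 8.14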
Just as, in the case of curves, we introduced the "vector length element"
$d\mathbf{x} = \mathbf{T}\,ds$ along the curve, we introduce the vector area element $d\mathbf{S} = \mathbf{N}\,dS$ on
a surface. Notice that, in terms of coordinates
$$d\mathbf{S} = \mathbf{N}\,\|\mathbf{x}_u \times \mathbf{x}_v\|\,du\,dv = \mathbf{x}_u\,du \times \mathbf{x}_v\,dv$$
(i.e., it is the vector product of the length elements along the parametric
curves). In this way we can integrate vector fields along oriented surfaces:
Definition 10. If $\Sigma$ is an oriented surface and $\mathbf{v}$ is a vector field defined
around $\Sigma$, define the flux of $\mathbf{v}$ across $\Sigma$ by
$$\int_\Sigma \langle \mathbf{v}, d\mathbf{S}\rangle = \int_\Sigma \langle \mathbf{v}, \mathbf{N}\rangle\,dS$$
The significance of the word flux will become apparent in the next section.
Example
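29. Compute the flux of $\mathbf{v}(x, y, z) = (xy, yz, zx)$ across the graph of
$$f(x, y) = x^2 + 2y^2, \qquad x^2 + y^2 \le 1$$
We take $x$, $y$ as coordinates on $\Sigma$. Then
$$\int_\Sigma \langle \mathbf{v}, \mathbf{N}\rangle\,dS = \int_{x^2+y^2 \le 1} \left\langle \mathbf{v}, \frac{\partial\mathbf{x}}{\partial x} \times \frac{\partial\mathbf{x}}{\partial y}\right\rangle dx\,dy$$
$$= \int_{x^2+y^2 \le 1} \det\begin{pmatrix} xy & y(x^2 + 2y^2) & (x^2 + 2y^2)x \\ 1 & 0 & 2x \\ 0 & 1 & 4y \end{pmatrix} dx\,dy$$
$$= \int_{x^2+y^2 \le 1} [x^3 + 2y^2x - 2x^2y - 4x^2y^2 - 8y^4]\,dx\,dy$$
The terms odd in $x$ or in $y$ integrate to zero over the disk, so in polar coordinates this is
$$= -\int_0^{2\pi}\int_0^1 (4\cos^2\theta\sin^2\theta + 8\sin^4\theta)\,r^5\,dr\,d\theta = -\frac{7\pi}{6}$$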
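The flux in Example 29 can be checked by direct numerical integration of Definition 10. The following sketch is illustrative, with assumed function names and grid sizes: for a graph $z = f(x, y)$ the vector area element is $(-f_x, -f_y, 1)\,dx\,dy$, and the integral over the unit disk is taken in polar coordinates with the midpoint rule.

```python
import math

def integrand(x, y):
    """<v, x_x x x_y> for v = (xy, yz, zx) on the graph z = x^2 + 2y^2."""
    f = x*x + 2*y*y
    v = (x*y, y*f, f*x)
    n = (-2*x, -4*y, 1.0)         # x_x x x_y = (-f_x, -f_y, 1)
    return sum(a*b for a, b in zip(v, n))

def flux(n_r=200, n_t=200):
    """Midpoint rule over the unit disk in polar coordinates (Jacobian r)."""
    hr, ht = 1.0 / n_r, 2 * math.pi / n_t
    total = 0.0
    for i in range(n_r):
        r = (i + 0.5) * hr
        for j in range(n_t):
            t = (j + 0.5) * ht
            total += integrand(r * math.cos(t), r * math.sin(t)) * r * hr * ht
    return total

print(flux(), -7 * math.pi / 6)
```

The two printed numbers agree to about three decimal places, which matches the exact value $-\pi/6 - \pi = -7\pi/6$ obtained from the $x^2y^2$ and $y^4$ terms.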
Suppose that $\mathbf{v}$ is the velocity field of a flow and $C$ is a closed path (oriented
closed curve). In Section 8.2 we defined the circulation around a circle; we
could use the same definition to define the circulation around $C$:
$$\text{circ}(C) = \int_C \langle \mathbf{v}, \mathbf{T}\rangle\,ds \qquad (8.46)$$
($s$ = arc length along $C$, and $\mathbf{T}$ is the tangent vector to $C$). In Section 8.2
we used this idea to define curl $\mathbf{v}$, the "infinitesimal circulation" about a
point; now we ask if we can recapture the total circulation from the
infinitesimal. A clue is obtained by recognizing the integrand of (8.46) as the
differential form associated to $\mathbf{v}$. If $\mathbf{v} = (v^1, v^2, v^3)$, then, on the curve
$\langle \mathbf{v}, \mathbf{T}\rangle\,ds = \sum v^i\,dx^i = \langle \mathbf{v}, d\mathbf{x}\rangle$. What we are then asking for is the analog
for surfaces of Green's theorem. Since the curl plays the same role in three
variables that $d\omega$ plays in two, it is no accident that such a theorem exists.
Stokes' Theorem
Suppose now that $\Sigma$ is an oriented surface lying in the domain of the
vector field $\mathbf{v}$, and $D$ is a subset of $\Sigma$ bounded by a curve $\Gamma$. For the purposes
of integration we must choose an orientation of $\Gamma$. It will be the natural one
corresponding to the given orientation of $\Sigma$: $\Gamma$ winds counterclockwise around
$D$. To be more precise, we shall define the positively directed tangent. Let
$p \in \Gamma$ and consider a small path $\gamma$, with tangent vector $\mathbf{t}$ at $p$, which crosses $\Gamma$
and is directed so that it enters $D$. Then the tangent vector we wish to choose
is that one $\mathbf{T}$ such that the rotation $\mathbf{T} \to \mathbf{t}$ is positive (see Figure 8.15). This
corresponds to the counterclockwise sense of rotation about the normal to
the tangent plane. When the boundary of $D$ is so oriented it is a path,
denoted $\partial D$. Now the theorem we have in mind (Stokes' theorem) asserts
that the circulation around $\partial D$ is given by
$$\int_D \langle \text{curl}\,\mathbf{v}, \mathbf{N}\rangle\,dS$$
Figure 8.15
In order to derive this theorem from Green's theorem we must ensure that the
conditions of Green's theorem will be met. Hence the following notion
of a regular domain.
Definition 11. Let Σ be a surface in R3. A subset D of Σ will be called a
regular domain if it can be partitioned into finitely many subsets of surface
patches which correspond to regular domains in the plane in the particular
coordinate representation.
Theorem 8.3. Let v be a vector field defined in a domain U in $R^3$, and
suppose Σ is an oriented surface lying in U with normal N. Let D be a
regular domain in Σ whose boundary $\partial D$ is a curve. Then
$$\int_{\partial D} \langle v, T\rangle\, ds = \int_D \langle \operatorname{curl} v, N\rangle\, dS \tag{8.47}$$
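Proof. Since D is regular, there are coordinate patches $\Sigma_1, \ldots, \Sigma_n$ and a partition
$D = D_1 \cup \cdots \cup D_n$ of D such that $D_i \subset \Sigma_i$ and $D_i$ corresponds to a regular
domain in the $\Sigma_i$ coordinates. Now let $B_1, \ldots, B_m$ be balls in $R^3$ such that
$D \subset B_1 \cup \cdots \cup B_m$ and each $B_j$ lies completely inside one of the coordinate patches
$\Sigma_i$. Let $\rho_1, \ldots, \rho_m$ be a partition of unity subordinate to this cover. Then, since
$\sum_j \rho_j = 1$ on D,
$$\int_{\partial D} \langle v, T\rangle\, ds = \sum_j \int_{\partial D} \langle \rho_j v, T\rangle\, ds = \sum_{i,j} \int_{\partial D_i} \langle \rho_j v, T\rangle\, ds$$
since each part of $\partial D_i$ which is not on $\partial D$ appears as part of $\partial D_j$ for some $j \ne i$ and
with the opposite orientation. Similarly,
$$\int_D \langle \operatorname{curl} v, N\rangle\, dS = \sum_j \int_D \langle \operatorname{curl}(\rho_j v), N\rangle\, dS = \sum_{i,j} \int_{D_i} \langle \operatorname{curl}(\rho_j v), N\rangle\, dS$$
Thus, we only need to show that the right-hand sides are equal termwise: we may
assume that we are in a coordinate patch.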
This is now our situation. Let Σ be a surface patch coordinatized by
$$x = (x^1(u, v), x^2(u, v), x^3(u, v)), \qquad (u, v) \in N \subset R^2$$
and suppose Δ is a regular domain in N and D is the subdomain of Σ corresponding
to Δ: $D = \{x(u, v) : (u, v) \in \Delta\}$. Let $v = (v^1, v^2, v^3)$ be a vector field defined on Σ.
Then we must verify
$$\int_{\partial D} \langle v, T\rangle\, ds = \int_D \langle \operatorname{curl} v, N\rangle\, dS$$
This is just the computation that $\langle \operatorname{curl} v, N\rangle\, dS = (d\omega_v)\, du\, dv$ under the change of
variables. First, we study the left integral:
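$$\int_{\partial D} \langle v, T\rangle\, ds = \int_{\partial D} \sum_i v^i\, dx^i = \int_{\partial \Delta} \sum_i v^i \frac{\partial x^i}{\partial u}\, du + \sum_i v^i \frac{\partial x^i}{\partial v}\, dv$$
By Green's theorem this is
$$\int_\Delta \sum_i \left[ \frac{\partial}{\partial u}\left( v^i \frac{\partial x^i}{\partial v} \right) - \frac{\partial}{\partial v}\left( v^i \frac{\partial x^i}{\partial u} \right) \right] du\, dv \tag{8.48}$$
Now
$$\frac{\partial}{\partial u}\left( v^i \frac{\partial x^i}{\partial v} \right) = \sum_j \frac{\partial v^i}{\partial x^j} \frac{\partial x^j}{\partial u} \frac{\partial x^i}{\partial v} + v^i \frac{\partial^2 x^i}{\partial u\, \partial v}$$
$$\frac{\partial}{\partial v}\left( v^i \frac{\partial x^i}{\partial u} \right) = \sum_j \frac{\partial v^i}{\partial x^j} \frac{\partial x^j}{\partial v} \frac{\partial x^i}{\partial u} + v^i \frac{\partial^2 x^i}{\partial v\, \partial u}$$
The integrand in (8.48) is thus
$$\sum_{i,j} \frac{\partial v^i}{\partial x^j} \left( \frac{\partial x^j}{\partial u} \frac{\partial x^i}{\partial v} - \frac{\partial x^j}{\partial v} \frac{\partial x^i}{\partial u} \right)
= \sum_{i<j} \left( \frac{\partial v^j}{\partial x^i} - \frac{\partial v^i}{\partial x^j} \right) \left( \frac{\partial x^i}{\partial u} \frac{\partial x^j}{\partial v} - \frac{\partial x^i}{\partial v} \frac{\partial x^j}{\partial u} \right)
= \langle \operatorname{curl} v, x_u \times x_v\rangle$$
Hence, after Green's theorem the left integral becomes
$$\int_\Delta \langle \operatorname{curl} v, x_u \times x_v\rangle\, du\, dv$$
But the right integral is
$$\int_\Delta \langle \operatorname{curl} v, N\rangle\, \|x_u \times x_v\|\, du\, dv = \int_\Delta \langle \operatorname{curl} v, x_u \times x_v\rangle\, du\, dv$$
since $x_u \times x_v = \|x_u \times x_v\|\, N$. The proof is concluded.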
Examples
30. Calculate $\int_{\partial \Sigma} \langle v, dx\rangle$, where Σ is the surface
$$\Sigma: z = x^2, \qquad 0 \le x \le 1, \quad 0 \le y \le 1$$
and v(x, y, z) = (y, z, x). We make the computations:
$$\operatorname{curl} v = -(1, 1, 1)$$
$$x_x = (1, 0, 2x)$$
$$x_y = (0, 1, 0)$$
$$dS = (x_x \times x_y)\, dx\, dy = (-2x, 0, 1)\, dx\, dy$$
$$\int_{\partial \Sigma} \langle v, dx\rangle = \int_\Sigma \langle \operatorname{curl} v, dS\rangle = \int_0^1 \int_0^1 (2x - 1)\, dx\, dy = 0$$
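Stokes' theorem in Example 30 can be checked numerically. The following is a minimal sketch, assuming the field v = (y, z, x) and the boundary traversed counterclockwise in the (x, y) parameters; the function names `line_integral` and `surface_integral` are illustrative, not taken from the text. Both sides are approximated by the midpoint rule.

```python
# Illustrative check of Stokes' theorem on the surface z = x^2 over the unit square.
# Assumption: the field is v = (y, z, x), so that curl v = -(1, 1, 1).

def field(x, y, z):
    return (y, z, x)

def line_integral(n=2000):
    """Midpoint-rule value of the integral of <v, dx> around the boundary curve."""
    # Corners of the parameter square, traversed counterclockwise.
    corners = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.0, 0.0)]
    total = 0.0
    for (xa, ya), (xb, yb) in zip(corners, corners[1:]):
        dx = (xb - xa) / n
        dy = (yb - ya) / n
        for k in range(n):
            t = (k + 0.5) / n
            x = xa + t * (xb - xa)
            y = ya + t * (yb - ya)
            z = x * x
            dz = 2.0 * x * dx          # dz = 2x dx on the surface z = x^2
            vx, vy, vz = field(x, y, z)
            total += vx * dx + vy * dy + vz * dz
    return total

def surface_integral(n=400):
    """Midpoint-rule value of the integral of <curl v, dS>, dS = (-2x, 0, 1) dx dy."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        # <(-1, -1, -1), (-2x, 0, 1)> = 2x - 1, independent of y
        total += (2.0 * x - 1.0) * h
    return total
```

Calling `line_integral()` and `surface_integral()` should both return values that vanish up to rounding, in agreement with the example.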
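31. Let Σ be the surface patch
$$x = x(u, v) = (u \cos v,\ u \sin v,\ u \cos 6v), \qquad 0 \le u \le 1, \quad 0 \le v \le 2\pi$$
Let $N = (N^1, N^2, N^3)$ be the normal to Σ. Then
$$\int_\Sigma (N^1 + N^2 + N^3)\, dS = -\int_\Sigma \langle \operatorname{curl} v, dS\rangle$$
where v = (y, z, x), since curl v = −(1, 1, 1). Thus, the sought-for integral can be computed
as $-\int_\gamma \langle v, dx\rangle$, where γ is the curve u = 1:
$$\gamma: x = x(v) = (\cos v, \sin v, \cos 6v)$$
$$\int_\gamma \langle v, dx\rangle = \int_0^{2\pi} (-\sin^2 v + \cos v \cos 6v - 6 \cos v \sin 6v)\, dv = -\pi$$
so that $\int_\Sigma (N^1 + N^2 + N^3)\, dS = \pi$.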
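The two integrals in Example 31 can be compared numerically. This sketch assumes the normal N points along $x_u \times x_v$ and that the field is (y, z, x), whose curl is −(1, 1, 1); the helper names are hypothetical. It computes the boundary line integral and the surface integral of $N^1 + N^2 + N^3$ separately, so the relation between them (equal up to the sign coming from the curl) can be read off.

```python
import math

def x_patch(u, v):
    """The surface patch x(u, v) = (u cos v, u sin v, u cos 6v)."""
    return (u * math.cos(v), u * math.sin(v), u * math.cos(6 * v))

def boundary_line_integral(n=20000):
    """Integral of <w, dx> with w = (y, z, x) around the curve u = 1."""
    dv = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dv
        x, y, z = x_patch(1.0, t)
        # derivative of x(1, t) with respect to t
        dx, dy, dz = -math.sin(t), math.cos(t), -6 * math.sin(6 * t)
        total += (y * dx + z * dy + x * dz) * dv
    return total

def normal_component_sum(nu=200, nv=2000):
    """Integral of (N1 + N2 + N3) dS = <(1, 1, 1), x_u x x_v> du dv."""
    du, dv = 1.0 / nu, 2 * math.pi / nv
    total = 0.0
    for i in range(nu):
        u = (i + 0.5) * du
        for j in range(nv):
            t = (j + 0.5) * dv
            xu = (math.cos(t), math.sin(t), math.cos(6 * t))
            xv = (-u * math.sin(t), u * math.cos(t), -6 * u * math.sin(6 * t))
            cross = (xu[1] * xv[2] - xu[2] * xv[1],
                     xu[2] * xv[0] - xu[0] * xv[2],
                     xu[0] * xv[1] - xu[1] * xv[0])
            total += sum(cross) * du * dv
    return total
```

Under these assumptions the line integral comes out close to −π and the integral of $N^1 + N^2 + N^3$ close to π.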
EXERCISES
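19. Calculate $\int_\Sigma f\, dS$, where
(a) $f(x, y, z) = x^2 + 2y$, $\Sigma: z^2 = x^2 + y^2$, $0 \le z \le 1$
(b) $f(x, y, z) = xy + yz + zx$, $\Sigma: z = x^2 + y^2$, $0 \le z \le 1$
(c) $f(x, y, z) = xyz$, $\Sigma: x(u, v) = (u \cos u, u \sin v, v \sin u)$,
$0 \le u \le 2\pi$, $0 \le v \le 2\pi$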
20. Calculate $\int_\Sigma \langle v, dS\rangle$, where
(a) v(x, y, z) = (xy, yz, zx) Σ:ζ = β*' O^x^l
O^^^l
(b) $v(x, y, z) = (1, -y, x)$, $\Sigma: x^2 + y^2 + z^2 = 1$
(c) $v(x, y, z) = (1, 0, y)$, $\Sigma: z = x^2 - y^2$, $x^2 + y^2 \le 1$
21. Suppose v is a vector field defined in a neighborhood of the domain D.
Show that
$$\int_{\partial D} \langle \operatorname{curl} v, dS\rangle = 0$$
22. (a) Suppose that D is a regular domain on a surface Σ. Verify that
for any vector a,
$$\frac{1}{2} \int_{\partial D} \langle a \times x, dx\rangle = \int_D \langle a, dS\rangle$$
(b) Just as we integrated vector functions on the interval, we can
integrate vector functions on lines and surfaces (and in space). Show
that, for a regular domain D, these vectors are the same:
$$\int_D dS = \frac{1}{2} \int_{\partial D} x \times dx$$
(Hint: This follows from part (a).)
23. Show that if u, v are $C^1$ functions on the regular domain D, then
$$\int_{\partial D} \langle u \nabla v, dx\rangle = \int_D \langle \nabla u \times \nabla v, dS\rangle$$
• PROBLEMS
29. If ω is a closed form defined in a neighborhood of the unit sphere in
$R^3$, show that there is a function f such that ω = df on the sphere.
30. Consider the torus T:
$$x = x(u, v) = ((2 + \cos v) \cos u,\ (2 + \cos v) \sin u,\ \sin v)$$
(a) Show that the differentials du, dv are well-defined differential
forms on T.
(b) If ω is a closed form defined on T, show that the integrals
$$\int_\Gamma \omega, \qquad \int_\gamma \omega$$
are constant as Γ ranges over all circles v = constant, and γ ranges over
all circles u = constant.
(c) If ω is a closed form there are constants $c_1$, $c_2$ and a differentiable
function f such that
$$\omega = c_1\, du + c_2\, dv + df$$
(Hint: Take $c_1 = \int_\Gamma \omega / 2\pi$, $c_2 = \int_\gamma \omega / 2\pi$, where the integrals are taken as
defined in part (b).)
31. State and prove a fact like that in Problem 30(c) when T is replaced
by a cylinder.
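32. Verify this restatement of Stokes' theorem: Let v = (F, G, H) be a
vector field defined in a domain U in $R^3$ and suppose that Σ is an oriented
surface lying in U with normal N = (cos α, cos β, cos γ). If D is a regular
domain in Σ, then
$$\int_{\partial D} F\, dx + G\, dy + H\, dz
= \int_D \left[ \left( \frac{\partial H}{\partial y} - \frac{\partial G}{\partial z} \right) \cos\alpha + \left( \frac{\partial F}{\partial z} - \frac{\partial H}{\partial x} \right) \cos\beta + \left( \frac{\partial G}{\partial x} - \frac{\partial F}{\partial y} \right) \cos\gamma \right] dS$$
33. Let D be a regular domain on the oriented surface Σ. Show that if
v is a vector field defined on Σ
$$\int_{\partial D} \langle v, N_\gamma\rangle\, ds = \int_D \langle \operatorname{curl}(v \times N), N\rangle\, dS$$
where $N_\gamma$ is the unit surface normal to $\partial D$ (see Problem 24).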
8.5 The Divergence Theorem
Let v be a vector field defined in a domain $U \subset R^3$, and $x = \phi(x_0, t)$ the
associated steady flow. Let D be a domain whose closure is contained in U
such that $\partial D$ is a sufficiently differentiable surface. Notice that $\partial D$ is
orientable, since we can choose as normal vector the unit vector N which is
exterior to the domain D. We shall assume throughout this section that
this is the chosen normal. For a small interval of time Δt, let us attempt to
calculate the amount of fluid that passes through $\partial D$. For $x_0 \in \partial D$, the
particle at the point $\phi(x_0, -t)$ at time 0, for $0 \le t \le \Delta t$, passes through $x_0$,
since $\phi(x_0, -t + t) = \phi(x_0, 0) = x_0$. Thus, the volume of the fluid passing
through $\partial D$ in time Δt is the volume of the domain
$$D_{\Delta t} = \{x : x = \phi(x_0, -t),\ 0 \le t \le \Delta t,\ x_0 \in \partial D\}$$
We shall approximate this volume by linearizing locally. That is, we cover
$\partial D$ by small neighborhoods $U_i$, and replace $U_i \cap \partial D$ by the piece $T_i$ of the
tangent plane to $\partial D$ with the same area at some point in $U_i$. We assume
also that φ is a pure translation through $T_i$. Then the volume which passes
through $T_i$ is a parallelepiped of volume
$$\langle \phi(x_i, -\Delta t) - \phi(x_i, 0), N\rangle\, \Delta A_i$$
where $x_i$ is some point in $U_i \cap \partial D$, and $\Delta A_i$ is the area of $T_i$. Let us point
out that this is a signed volume, the sign being positive if the flow is into D
(since N is the exterior normal, and if $\phi(x_i, -\Delta t)$ is on the same side of $\partial D$
as N, $\langle \phi(x_i, -\Delta t) - \phi(x_i, 0), N\rangle$ is positive). This is in fact what we want,
for we want to discover the flow into D rather than the flow through $\partial D$.
It follows that an approximation to the volume of $D_{\Delta t}$ is
$$\sum_i \langle \phi(x_i, -\Delta t) - \phi(x_i, 0), N\rangle\, \Delta A_i$$
and by letting the covering get arbitrarily fine, we may replace this by an
integral:
$$\int_{\partial D} \langle \phi(x, -\Delta t) - \phi(x, 0), N\rangle\, dS \tag{8.49}$$
The limit of 1/Δt times (8.49) as Δt → 0 through positive values is the
instantaneous flow into D, or the flux into D at time t = 0.
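Proposition 8. The flux out of D at time t = 0 is
$$\iint_{\partial D} \langle v, dS\rangle$$
Proof. The flux out of D is
$$-\lim_{\Delta t \to 0} \frac{1}{\Delta t} \int_{\partial D} \langle \phi(x, -\Delta t) - \phi(x, 0), N\rangle\, dS
= \lim_{\Delta t \to 0} \frac{1}{-\Delta t} \int_{\partial D} \langle \phi(x, -\Delta t) - \phi(x, 0), N\rangle\, dS$$
$$= \int_{\partial D} \left\langle \lim_{\Delta t \to 0} \frac{1}{\Delta t} [\phi(x, \Delta t) - \phi(x, 0)], dS \right\rangle = \int_{\partial D} \langle v, dS\rangle$$
Now the flux out of D is the instantaneous rate of flow of fluid out of D.
On physical grounds this should be identical to the instantaneous rate of
expansion of the fluid in D, which is (as in Section 8.1) $\int_D \operatorname{div} v\, dV$. Thus,
we should expect
$$\iint_{\partial D} \langle v, dS\rangle = \iiint_D \operatorname{div} v\, dV \tag{8.50}$$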
and in fact this is the case. Equation (8.50) is known as the divergence
theorem. For suitable domains it is an easy consequence of the
fundamental theorem of calculus. As in the case of Green's theorem, we shall
call such domains, or finite unions of such domains, regular domains. Many
domains in R3 are regular, but by no means are all regular. The general
theorem, for an arbitrary domain, is not easy to prove and we shall here
avoid the issue.
Definition 12. A domain D in R3 is regular if it can be expressed in each
of these ways:
$$D = \{(x, y, z) : (x, y) \in D_1,\ f(x, y) \le z \le g(x, y)\}$$
$$= \{(x, y, z) : (x, z) \in D_2,\ r(x, z) \le y \le s(x, z)\}$$
$$= \{(x, y, z) : (y, z) \in D_3,\ u(y, z) \le x \le v(y, z)\}$$
where all functions are continuously differentiable.
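Lemma. If v is a differentiable vector field defined in a neighborhood of
the regular domain D, then
$$\int_{\partial D} \langle v, dS\rangle = \int_D \operatorname{div} v\, dV$$
Proof. Let $v = v^1 E_1 + v^2 E_2 + v^3 E_3$. Then
$$\operatorname{div} v = \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3}$$
We shall show that for each i,
$$\int_{\partial D} \langle v^i E_i, dS\rangle = \int_D \frac{\partial v^i}{\partial x^i}\, dV$$
Then the lemma will follow by summing over i. To prove the ith case, we use the
appropriate representation of the domain. Since all cases are then the same, we
shall only verify one case, say the third.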
Now, using the expression
$$D = \{(x, y, z) : (x, y) \in D_1,\ f(x, y) \le z \le g(x, y)\}$$
the boundary of D consists of the part $\Sigma_0$ lying over $\partial D_1$ and the two surfaces
$$\Sigma_1: z = f(x, y), \qquad (x, y) \in D_1$$
$$\Sigma_2: z = g(x, y), \qquad (x, y) \in D_1$$
Since $E_3$ is tangent to the surface lying over $\partial D_1$ at every point, the left-hand
integral over $\Sigma_0$ vanishes.
Now $\Sigma_1$ has the parametrization
$$x = (x, y, f(x, y)), \qquad (x, y) \in D_1$$
Since the domain lies above this surface, the exterior normal points downward, so
is determined by $-x_x \times x_y$ (see Figure 8.16). Now
$$x_x = (1, 0, f_x), \qquad x_y = (0, 1, f_y)$$
so we have $dS = (f_x, f_y, -1)\, dx\, dy$. Then
$$\int_{\Sigma_1} \langle v^3 E_3, dS\rangle = -\int_{D_1} v^3(x, y, f(x, y))\, dx\, dy$$
Figure 8.16
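A similar computation produces
$$\int_{\Sigma_2} \langle v^3 E_3, dS\rangle = \int_{D_1} v^3(x, y, g(x, y))\, dx\, dy$$
Now, we compute $\int_D (\partial v^3 / \partial z)\, dV$ by Fubini's theorem.
$$\int_D \frac{\partial v^3}{\partial z}\, dV = \int_{D_1} \left[ \int_{f(x, y)}^{g(x, y)} \frac{\partial v^3}{\partial z}(x, y, z)\, dz \right] dx\, dy
= \int_{D_1} \left[ v^3(x, y, g(x, y)) - v^3(x, y, f(x, y)) \right] dx\, dy$$
by the fundamental theorem of calculus. But this is, according to our previous
calculations, the same as $\int_{\partial D} \langle v^3 E_3, dS\rangle$. Thus the lemma is verified.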
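Theorem 8.4. (Divergence Theorem) Let v be a continuously differentiable
vector field defined in a domain D in $R^3$. Suppose D can be covered by
finitely many balls $B_1, \ldots, B_n$ such that each $D \cap B_i$ is a regular domain.
Then
$$\int_{\partial D} \langle v, dS\rangle = \int_D \operatorname{div} v\, dV$$
Proof. Let $\rho_1, \ldots, \rho_n$ be a partition of unity subordinate to $B_1, \ldots, B_n$. Then
$$\int_{\partial D} \langle v, dS\rangle = \sum_i \int_{\partial D} \langle \rho_i v, dS\rangle = \sum_i \int_{\partial (D \cap B_i)} \langle \rho_i v, dS\rangle$$
$$\int_D \operatorname{div} v\, dV = \sum_i \int_D \operatorname{div}(\rho_i v)\, dV = \sum_i \int_{D \cap B_i} \operatorname{div}(\rho_i v)\, dV$$
for the customary reasons: $\sum_i \rho_i = 1$ and $\rho_i = 0$ outside $B_i$. By the lemma, the
right-hand sides are the same termwise, so the left-hand sides are the same. We
shall henceforth describe domains of the type referred to in Theorem 8.4 as regular.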
Examples
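32. First of all, the result of Exercise 21 follows easily from the
divergence theorem, since div curl v = 0. For then
$$\int_{\partial D} \langle \operatorname{curl} v, dS\rangle = \int_D \operatorname{div} \operatorname{curl} v\, dV = 0$$
33. Let
$$D = \{x^2 + y^2 + z^2 \le 1\}, \qquad f(x, y, z) = x^2 + y^2 + z^2$$
Then
$$\int_{\partial D} \langle \nabla f, dS\rangle = \int_D \operatorname{div} \nabla f\, dV = 6 \int_D dV = 8\pi$$
34. Let D be the domain $\{1 \ge z \ge x^2 + y^2\}$, and let v(x, y, z) =
(xy, yz, x). Then div v = y + z and
$$\int_D \operatorname{div} v\, dV = \int_0^1 \left[ \int_{x^2 + y^2 \le z} (y + z)\, dx\, dy \right] dz = \pi \int_0^1 z^2\, dz = \frac{\pi}{3}$$
$$\int_{\partial D} \langle v, dS\rangle = \int_{z = 1} \langle v, dS\rangle - \int_{z = x^2 + y^2} \langle v, dS\rangle$$
$$= \int_{x^2 + y^2 \le 1} x\, dx\, dy - \int_{x^2 + y^2 \le 1} \langle (xy,\ y(x^2 + y^2),\ x),\ (-2x, -2y, 1)\rangle\, dx\, dy$$
$$= 2 \int_{x^2 + y^2 \le 1} (y^2 x^2 + y^4)\, dx\, dy = \frac{\pi}{3}$$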
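Both sides of the divergence theorem in Example 34 can be approximated directly. The sketch below is illustrative only: it uses the midpoint rule in cylindrical coordinates, and the function names and grid sizes are assumptions, not part of the text.

```python
import math

def volume_integral(nr=200, nt=64, nz=100):
    """Integral of div v = y + z over {x^2 + y^2 <= z <= 1}, cylindrical midpoint rule."""
    dr, dt = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        dz = (1.0 - r * r) / nz          # z runs from r^2 up to 1
        for j in range(nt):
            y = r * math.sin((j + 0.5) * dt)
            for k in range(nz):
                z = r * r + (k + 0.5) * dz
                total += (y + z) * r * dr * dt * dz
    return total

def flux(nr=400, nt=64):
    """Flux of v = (xy, yz, x) out of D: top disk z = 1 plus the paraboloid."""
    dr, dt = 1.0 / nr, 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            t = (j + 0.5) * dt
            x, y = r * math.cos(t), r * math.sin(t)
            z = r * r
            top = x                                  # <v, (0, 0, 1)> on z = 1
            # exterior normal on the paraboloid points downward: (2x, 2y, -1) dx dy
            bottom = (x * y) * 2 * x + (y * z) * 2 * y - x
            total += (top + bottom) * r * dr * dt
    return total
```

Both `volume_integral()` and `flux()` should approach π/3 as the grids are refined.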
The Heat Equation
In Chapter 6 in our discussion of the heat equation we postponed its
derivation in dimensions greater than one. We had to await the divergence
theorem; with that we can carry through our argument just as in the
one-dimensional case. Thus, we suppose a homogeneous metallic object U in
$R^3$ has at time t a temperature distribution u(x, t). According to the laws of
thermodynamics, the vector field q associated to the flow of heat energy is
proportional to the gradient of the temperature, but for sign:
$$q + c \nabla u = 0 \tag{8.51}$$
Another basic principle is this: The increase in temperature of a unit mass is
proportional to the increase in heat energy. More specifically, the change
in energy in any given domain D in a time interval Δt is given by
$$k\rho \int_D \Delta u\, dV$$
where Δu(x, Δt) is the change in temperature at x over the period Δt, ρ is the
density, and k is the proportionality constant (the specific heat). Thus, the
rate of increase of heat energy in D is
$$k\rho \int_D \frac{\partial u}{\partial t}\, dV$$
Now, we can compute (using the law of conservation of energy) the rate of
increase of energy in D; it is the flux into D across the boundary. Thus we
obtain this basic equation for every domain D:
$$-\int_{\partial D} \langle q, dS\rangle = k\rho \int_D \frac{\partial u}{\partial t}\, dV$$
By the divergence theorem and (8.51) we have
$$\int_D \operatorname{div} \nabla u\, dV = \frac{k\rho}{c} \int_D \frac{\partial u}{\partial t}\, dV$$
for every domain D. Thus the two integrands must be the same, and we
obtain the heat equation:
$$\Delta u = \left( \frac{k\rho}{c} \right) \frac{\partial u}{\partial t}$$
As we saw in Chapter 6, the steady state (or equilibrium) temperature
distribution solves Laplace's equation:
$$\operatorname{div} \nabla u = 0$$
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0$$
• EXERCISES
24. If Σ is an oriented surface with normal N and f is a $C^2$ function
defined near Σ, we denote $\langle \nabla f, N\rangle$ by $\partial f / \partial N$. Show that
$$\int_{\partial D} \frac{\partial f}{\partial N}\, dS = \int_D \Delta f\, dV$$
for any regular domain D.
25. If v is a vector field such that div v = 1, then for any regular domain D
$$\operatorname{vol}(D) = \int_{\partial D} \langle v, dS\rangle$$
In particular, we may make any one of these choices for v:
(x, 0, 0) (0, y, 0) (0, 0, z)
Find the volume of these domains, using the divergence theorem.
(a) The cap $z \ge ax^2 + by^2$, $0 \le z \le 3$.
(b) The cone $z^2 \ge ax^2 + by^2$, $0 \le z \le 1$.
(c) The tetrahedron bounded by the planes z = 0, x + y + z = 1,
x = 2y, y = 0.
26. Verify this formula for any regular domain:
$$4 \int_D \|x\|\, dV = \int_{\partial D} \|x\| \langle x, dS\rangle$$
27. Here is another way of expressing the divergence theorem, which is
free of vector notation. Express N in terms of its direction cosines:
N = (cos α, cos β, cos γ)
Then for any three functions F, G, H,
$$\int_{\partial D} (F \cos\alpha + G \cos\beta + H \cos\gamma)\, dS = \int_D \left( \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} + \frac{\partial H}{\partial z} \right) dV$$
28. Compute
(a) $\int_\Sigma \langle (x^2, y^2, z^2), dS\rangle$ where Σ is the (oriented) surface of the cube
with side edge 2, and center at the origin.
(b) $\int_S (x \cos\alpha - y \cos\beta - z \cos\gamma)\, dS$ over the sphere $S: x^2 + y^2 + (z - 1)^2 = 1$,
where (cos α, cos β, cos γ) is the normal.
• PROBLEMS
34. Let Σ be a surface which intersects each ray from the origin in at most
one point. The set of rays which intersect Σ will pierce the unit sphere in
a set S. The area of S is the solid angle subtended by Σ. Show that the
solid angle is given by
$$\int_\Sigma \frac{\langle x, dS\rangle}{\|x\|^3}$$
35. Vector-valued functions can easily be integrated over any domain,
coordinate by coordinate. Verify these formulas for a regular domain D:
$$-\int_{\partial D} v \times dS = \int_D \operatorname{curl} v\, dV$$
$$\int_{\partial D} f\, dS = \int_D \nabla f\, dV$$
$$\int_{\partial D} N\, dS = 0$$
36. Let v be a divergence-free vector field defined in a domain U. Show
that if γ is a closed curve defined in U, then for any regular domain D on a
surface Σ such that $\partial D = \gamma$, the integral
$$\int_D \langle v, dS\rangle$$
always has the same value.
37. Show that the function f is harmonic in the domain D if and only if,
for every ball $B \subset D$,
$$\int_{\partial B} \langle \nabla f, dS\rangle = 0$$
38. Suppose there is given a flow in $R^3$ with these properties:
(a) The flow has constant velocity outside of some large bounded set.
(b) The flow on the {z = 0} plane remains along that plane (no fluid
passes from the upper half space to the lower half space). Show that
$$\int_H \operatorname{div} v\, dV = 0$$
where H is the half space $\{z \ge 0\}$.
8.6 Dirichlet's Principle
Let D be a domain in $R^3$, and suppose v is the velocity field of a flow through
D which is steady (time independent). The total kinetic energy of the flow
is given by the integral
$$\frac{1}{2} \int_D \rho \|v\|^2\, dV \tag{8.52}$$
where ρ is the density of the fluid (we shall here take ρ to be constant). An
important physical problem is this: find the flow which minimizes the energy
(8.52) subject to certain conditions being fixed on $\partial D$. For example, we may
assume that the normal component of the flow $\langle v, N\rangle$ through the boundary
is fixed. Or we may assume that the flow is conservative, that is, v has a
potential function, and the values of the potential are fixed on the boundary.
These problems are analogous to Neumann's and Dirichlet's problems
respectively (see Chapter 6). Dirichlet's principle is that the flow which
minimizes the energy is the gradient of a harmonic function (solution of
Laplace's equation). In this section we shall derive Dirichlet's principle and
indicate how the techniques involved can be used to discover the solution
to the problems. In order to do this, let us make these problems precise.
Let D be a domain in $R^3$, and f a function defined on $\partial D$.
I. (Dirichlet's Problem) Among all $C^2$ functions u defined on D which
have the boundary values f, find the one which minimizes the integral
$$\int_D \|\nabla u\|^2\, dV \tag{8.53}$$
II. (Neumann's Problem) Among all $C^2$ functions u defined on D such
that $\langle \nabla u, N\rangle = f$ on $\partial D$, find the one which minimizes the integral (8.53).
In order to study these problems we need (i) to relate boundary data to the
integral (8.53), (ii) to discover an interpretation of (8.53) which will suggest a
technique for minimizing that integral. The first need is filled by the
divergence theorem, which will take the form of Green's identities (given below).
The interpretation requested in (ii) is that of Euclidean vector spaces and the
technique will be orthogonal projection. Let us describe this idea more
fully.
Let $C^2(D)$ represent the collection of functions which are twice
continuously differentiable on D. We can make this vector space into a
Euclidean vector space by defining on it the inner product
$$E\langle u, v\rangle = \int_D \langle \nabla u, \nabla v\rangle\, dV \tag{8.54}$$
Then (8.53) is the square of the length of u in terms of this inner product.
We shall denote (8.53) by $E^2\langle u\rangle$. Our problem is to minimize this length
among all functions with the given boundary value f. Let $M_f$ be the space of
functions in $C^2(D)$ with boundary value f (so that $M_0$ consists of the functions
with boundary value zero). Then $M_f$ is a translate of the
space $M_0$: if u is a function with boundary value f, then $M_f = \{u + g : g \in M_0\}$.
Now it is a simple principle of Euclidean vector spaces that the vector in $M_f$
which is closest to 0 is orthogonal to $M_f$, hence also orthogonal to $M_0$.
The solution to our problem will then be that function in $M_f \cap M_0^\perp$.
Finally, we can identify $M_0^\perp$ as the space of harmonic functions.
There is one fault with our reasoning. The "simple principle" above is
one about finite-dimensional Euclidean vector spaces (recall Chapter 1),
and it is not necessarily true in the infinite-dimensional case (of which ours
is a prime example). The problem is that there need not be any point in
$M_f \cap M_0^\perp$; and our argument will be complete once this problem of existence
is resolved. The mid-19th century mathematicians such as Dirichlet and
Riemann were little troubled by such problems; it was during the late 19th
century that mathematicians began to think of existence questions as crucial
(with good reason). And it was not until the last decade of that century that
the existence problem was effectively solved. (The reader is referred to the
history by Kellogg (pp. 277-286) for a fuller account.)
The link between the geometry described above and the subject of harmonic
functions comes out of certain computations involving the divergence theorem
(Green's identities). These will now be exposed. We shall adopt one more
notational convention before proceeding (already foreseen in the problems):
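if u is defined on the oriented surface Σ, then $\langle \nabla u, N\rangle$ is the directional
derivative of u in the direction normal to Σ. We shall denote it by $\partial u / \partial N$.
Theorem 8.5. (Green's Identities) Let f, g be two $C^2$ functions defined on a
regular domain D. Then
$$\int_{\partial D} f \frac{\partial g}{\partial N}\, dS = \int_D \left[ f \Delta g + \langle \nabla f, \nabla g\rangle \right] dV \tag{8.55}$$
Proof.
$$\int_{\partial D} f \frac{\partial g}{\partial N}\, dS = \int_{\partial D} \langle f \nabla g, N\rangle\, dS = \int_D \operatorname{div}(f \nabla g)\, dV$$
But, as is easily computed (see Exercise 10):
$$\operatorname{div}(f \nabla g) = f \operatorname{div} \nabla g + \langle \nabla f, \nabla g\rangle$$
so Theorem 8.5 is proven.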
Corollary 1.
(i) If g is harmonic, $\int_{\partial D} f (\partial g / \partial N)\, dS = E\langle f, g\rangle$.
(ii) If $f \in M_0$ and g is harmonic, $E\langle f, g\rangle = 0$.
(iii) If f and g are harmonic,
$$\int_{\partial D} f \frac{\partial g}{\partial N}\, dS = \int_{\partial D} g \frac{\partial f}{\partial N}\, dS \tag{8.56}$$
(iv) If f is orthogonal to every function in $M_0$, f is harmonic.
Proof.
(i) If g is harmonic, then Δg = 0, so by (8.55) we have
$$\int_{\partial D} f \frac{\partial g}{\partial N}\, dS = \int_D \langle \nabla f, \nabla g\rangle\, dV = E\langle f, g\rangle \tag{8.57}$$
(ii) Now, if $f \in M_0$, f has boundary values 0, so the integral on the left of (8.57)
also vanishes, and thus $E\langle f, g\rangle = 0$.
(iii) If g is harmonic, we have (8.57). If f is also harmonic we may interchange
the roles of f and g in (8.57), obtaining
$$\int_{\partial D} g \frac{\partial f}{\partial N}\, dS = E\langle g, f\rangle$$
Thus (8.56) results since $E\langle g, f\rangle = E\langle f, g\rangle$.
(iv) If $g \in M_0$, then by (8.55) (interchanging f and g), we have
$$\int_D g \Delta f\, dV + E\langle f, g\rangle = 0$$
Now if f is orthogonal to $M_0$, then $\int_D g \Delta f\, dV = 0$ for every g with boundary value
zero. This implies that Δf = 0 everywhere. For suppose Δf(p) > 0 for some p
in D. Let B be a ball in D centered at p in which Δf > 0, and let ρ be a nonnegative $C^2$ function
such that ρ(p) = 1 and ρ = 0 off B. Then $\rho \in M_0$, so
$$\int_D \rho \Delta f\, dV = \int_B \rho \Delta f\, dV = 0$$
Since $\rho \Delta f \ge 0$ in B, it must be zero. Thus Δf(p) = ρ(p)Δf(p) = 0, a contradiction.
Corollary 2. The orthogonal complement of $M_0$ in $C^2(D)$ with the inner
product $E\langle f, g\rangle$ is the space H of harmonic functions.
Theorem 8.6. (Dirichlet's Principle) Let D be a regular domain in $R^3$ and
suppose f is a continuous function on $\partial D$. Let $M_f$ be the class of functions in
$C^2(D)$ with boundary value f.
(i) If there is a harmonic function in $M_f$, it minimizes the energy integral.
(ii) If there is a function in $C^2(D)$ which minimizes the energy integral, it
must be harmonic.
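Proof. These facts follow from the same reasoning as in Euclidean geometry.
(i) Let $u \in M_f$ be such that Δu = 0. If g is another function in $M_f$, g − u = 0 on
$\partial D$, so $g - u \in M_0$.
$$E^2\langle g\rangle = E^2\langle g - u + u\rangle = E^2\langle g - u\rangle + 2E\langle g - u, u\rangle + E^2\langle u\rangle = E^2\langle g - u\rangle + E^2\langle u\rangle$$
since $u \perp M_0$. Thus, $E^2\langle g\rangle \ge E^2\langle u\rangle$ for every $g \in M_f$.
(ii) If $u \in C^2(D)$ minimizes the energy integral in $M_f$, it must be orthogonal to
$M_0$. For if $g \in M_0$, then u ± g are both in $M_f$, and thus $E^2\langle u \pm g\rangle \ge E^2\langle u\rangle$. But
$$E^2\langle u \pm g\rangle = E^2\langle u\rangle \pm 2E\langle u, g\rangle + E^2\langle g\rangle$$
so $0 \le \pm 2E\langle u, g\rangle + E^2\langle g\rangle$ for all $g \in M_0$. Consider for $t \in R$ the function
$$\phi(t) = 2E\langle u, tg\rangle + E^2\langle tg\rangle$$
Since $\phi(t) \ge 0$ for all t (positive or negative), and φ(0) = 0, we must have φ'(0) = 0.
But $\phi'(0) = 2E\langle u, g\rangle$. Thus $u \perp M_0$, so, by Corollary 1, u is harmonic.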
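The orthogonality argument in part (i) can be illustrated on a concrete case. The sketch below is an assumption-laden example, not part of the text: it takes D to be the unit cube, the harmonic function $u = x^2 + y^2 - 2z^2$, and the perturbation $g = \sin \pi x \sin \pi y \sin \pi z$ (which vanishes on the boundary), and approximates the inner product E by the midpoint rule. The names `energy_inner`, `grad_u`, `grad_g`, `grad_sum` are hypothetical.

```python
import math

def energy_inner(grad_a, grad_b, n=24):
    """Midpoint-rule approximation of E<a, b> = integral over [0,1]^3 of <grad a, grad b>."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        for j in range(n):
            y = (j + 0.5) * h
            for k in range(n):
                z = (k + 0.5) * h
                ga, gb = grad_a(x, y, z), grad_b(x, y, z)
                total += (ga[0] * gb[0] + ga[1] * gb[1] + ga[2] * gb[2]) * h ** 3
    return total

def grad_u(x, y, z):
    # u = x^2 + y^2 - 2 z^2 is harmonic: 2 + 2 - 4 = 0
    return (2 * x, 2 * y, -4 * z)

def grad_g(x, y, z):
    # g = sin(pi x) sin(pi y) sin(pi z) has boundary value zero on the cube
    sx, sy, sz = math.sin(math.pi * x), math.sin(math.pi * y), math.sin(math.pi * z)
    cx, cy, cz = math.cos(math.pi * x), math.cos(math.pi * y), math.cos(math.pi * z)
    return (math.pi * cx * sy * sz, math.pi * sx * cy * sz, math.pi * sx * sy * cz)

def grad_sum(x, y, z):
    # gradient of the competitor u + g, which has the same boundary values as u
    a, b = grad_u(x, y, z), grad_g(x, y, z)
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])
```

With these choices `energy_inner(grad_u, grad_g)` should vanish up to rounding, and `energy_inner(grad_sum, grad_sum)` should exceed `energy_inner(grad_u, grad_u)` by the energy of g, as the proof predicts.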
In order to solve Dirichlet's problem by his principle it remains to show that
there exists a function in C2(D) which minimizes the energy integral. The
technique for carrying this through was finally accomplished by Hermann
Weyl (1926) and his methods have had far reaching effect in a wide class of
boundary value problems for partial differential equations.
Harmonic Functions
We can use Green's identities and Dirichlet's principle in order to derive
the basic properties of harmonic functions (analogous to those in two
dimensions given in Chapter 6). Out of this will come a hint for solving the
Dirichlet problem.
Proposition 9. Let f be a $C^2$ function defined on the boundary of a regular
domain D. There is at most one harmonic function with boundary value f.
Proof. If u, v are both harmonic and have the boundary values f, then u − v is at
the same time harmonic and in $M_0$. Thus $E\langle u - v, u - v\rangle = 0$. But
$$E\langle u - v, u - v\rangle = \int_D \|\nabla(u - v)\|^2\, dV$$
so we must have ∇(u − v) = 0 in D. Thus u − v is constant. Since u = v on $\partial D$, u
is identical to v.
The gravitational field of a particle of unit mass situated at the point p is,
according to Newton, given as
$$\frac{-1}{\|x - p\|^2} \frac{x - p}{\|x - p\|} \tag{8.58}$$
This field is easily seen to be conservative and divergence free, thus it is the
gradient of a harmonic function, called Newton's gravitational potential.
Writing (8.58) out in coordinates, we have
$$-\frac{(x^1 - p^1,\ x^2 - p^2,\ x^3 - p^3)}{[(x^1 - p^1)^2 + (x^2 - p^2)^2 + (x^3 - p^3)^2]^{3/2}}$$
and it is not hard to see that this is the gradient of
$$\Pi_p(x) = \|x - p\|^{-1} = [(x^1 - p^1)^2 + (x^2 - p^2)^2 + (x^3 - p^3)^2]^{-1/2}$$
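The two claims about Newton's potential (its gradient is the field (8.58), and it is harmonic away from p) can be spot-checked with finite differences. This is a rough numerical sketch; the step sizes and the helper names `gradient` and `laplacian` are assumptions chosen for illustration.

```python
import math

def newton_potential(x, p):
    """Pi_p(x) = 1 / ||x - p||."""
    return 1.0 / math.dist(x, p)

def gradient(f, x, h=1e-5):
    """Central-difference gradient of f at the point x (a 3-tuple)."""
    g = []
    for i in range(3):
        xp = list(x); xm = list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(tuple(xp)) - f(tuple(xm))) / (2 * h))
    return tuple(g)

def laplacian(f, x, h=1e-3):
    """Second-difference approximation to the Laplacian of f at x."""
    total = 0.0
    for i in range(3):
        xp = list(x); xm = list(x)
        xp[i] += h
        xm[i] -= h
        total += (f(tuple(xp)) - 2 * f(x) + f(tuple(xm))) / (h * h)
    return total
```

For example, with p at the origin and x = (1, 2, 2), so that ‖x − p‖ = 3, the computed gradient should be close to −(1, 2, 2)/27 and the computed Laplacian close to zero.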
This particular function stands at the beginning of a sequence of ideas which
lead to a technique due to Green, for solving Dirichlet's problem. These
steps were motivated by an inquiry into the nature of gravitational fields (due
to masses more general than that of a particle), the point being to show that
every harmonic function arises as the potential of a gravitational field.
Green's first result is an easy consequence (reminiscent of the Cauchy integral
formula) of his identities.
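Proposition 10. Let D be a regular domain, and h a function harmonic on D.
Let $p \in D$. Then
$$h(p) = \frac{-1}{4\pi} \int_{\partial D} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.59}$$
Proof. Once again we first remove a small ball B(p, ε) centered at p and
contained in D. Since both h, $\Pi_p$ are harmonic in D − B(p, ε), Corollary 1 (iii) applies
in that domain. Thus,
$$\int_{\partial (D - B(p, \varepsilon))} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS = 0$$
This implies that
$$\int_{\partial D} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS = \int_{\partial B(p, \varepsilon)} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.60}$$
Now the second integral can be computed using spherical coordinates centered
at p:
$$x = p + (r \cos\theta \cos\phi,\ r \sin\theta \cos\phi,\ r \sin\phi)$$
Then $\Pi_p(x) = r^{-1}$. The sphere S(p, ε) is given by
$$x = p + \varepsilon(\cos\theta \cos\phi,\ \sin\theta \cos\phi,\ \sin\phi)$$
and its exterior normal is the radial vector, so $\partial / \partial N = \partial / \partial r$. The element of area on
$\partial B(p, \varepsilon)$ is $dS = \varepsilon^2 \cos\phi\, d\theta\, d\phi$. Thus the right-hand side of (8.60) is
$$\int_{\partial B(p, \varepsilon)} \left[ h \frac{\partial (r^{-1})}{\partial r} - r^{-1} \frac{\partial h}{\partial r} \right] dS
= \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \left[ -\frac{h(x)}{\varepsilon^2} - \frac{1}{\varepsilon} \frac{\partial h}{\partial r} \right] \varepsilon^2 \cos\phi\, d\theta\, d\phi$$
$$= -\int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} h(x) \cos\phi\, d\theta\, d\phi - \varepsilon \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \frac{\partial h}{\partial r} \cos\phi\, d\theta\, d\phi$$
Since $|\partial h / \partial r| \le \|\nabla h\|$, the second integrand is bounded as ε → 0. Thus the second
term will vanish for ε → 0. As for the first term, x → p as ε → 0, so h(x) → h(p).
Thus, letting ε → 0 our integral tends to
$$-h(p) \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \cos\phi\, d\theta\, d\phi = -4\pi h(p)$$
which is what was desired.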
Now, if D is the ball of radius R centered at p, then $\Pi_p(x) = \|x - p\|^{-1}$,
so on $\partial D$, $\Pi_p = R^{-1}$ and $\partial \Pi_p / \partial N = -R^{-2}$. Equation (8.59) becomes
$$h(p) = \frac{1}{4\pi R^2} \int_{\partial D} h\, dS + \frac{1}{4\pi R} \int_{\partial D} \frac{\partial h}{\partial N}\, dS$$
Since h is harmonic in D the second integral vanishes (Problem 47) and we
obtain the mean value property for harmonic functions in three variables.
Proposition 11. (Gauss' Theorem) If h is harmonic in a neighborhood of
B(p, R), then h satisfies the mean value property:
$$h(p) = \frac{1}{4\pi R^2} \int_{\partial B(p, R)} h\, dS$$
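The mean value property can be tested numerically by averaging a harmonic function over a sphere in the latitude/longitude coordinates used in the proof of Proposition 10. This is an illustrative sketch; `sphere_average` and the sample functions in the usage note are assumptions, not taken from the text.

```python
import math

def sphere_average(h, p, R, n_theta=400, n_phi=200):
    """Approximate (1 / 4 pi R^2) * integral of h over the sphere of radius R about p.

    Uses x = p + R (cos t cos f, sin t cos f, sin f) with dS = R^2 cos f dt df.
    """
    dth, dph = 2 * math.pi / n_theta, math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        th = -math.pi + (i + 0.5) * dth
        for j in range(n_phi):
            ph = -math.pi / 2 + (j + 0.5) * dph
            x = (p[0] + R * math.cos(th) * math.cos(ph),
                 p[1] + R * math.sin(th) * math.cos(ph),
                 p[2] + R * math.sin(ph))
            total += h(x) * math.cos(ph) * dth * dph   # this is dS / R^2
    return total / (4 * math.pi)
```

For instance, with the harmonic polynomial $h = x^2 + y^2 - 2z^2 + 3x - z + 1$, or with $h(x) = 1/\|x - q\|$ for a point q outside the ball, the value of `sphere_average(h, p, R)` should agree with h(p) up to the quadrature error.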
Green's Function
Now, by Corollary 1(iii), if k is any function harmonic on D, then (8.59)
can be modified by k:
$$h(p) = \frac{-1}{4\pi} \int_{\partial D} \left[ h \frac{\partial (\Pi_p - k)}{\partial N} - (\Pi_p - k) \frac{\partial h}{\partial N} \right] dS \tag{8.61}$$
Thus, if k is chosen so as to solve Dirichlet's problem with the boundary
value $\Pi_p$, the second term will vanish and we obtain an integral formula
for h in terms only of its boundary values. Finally, we could use that
formula to solve Dirichlet's problem with any boundary values. Thus
(8.59) allows us to reduce the general problem to that for a certain family
$\{\Pi_p\}$ of specific functions, and for many regular domains that solution is
easily found.
Definition 13. Let D be a domain in $R^3$. If $k_p$ solves Dirichlet's problem
with the boundary values $\Pi_p$, we shall call the function $G_p = k_p - \Pi_p$ the
Green's function with singularity at p.
Theorem 8.7. Suppose D is a regular domain such that there is a Green's
function for every point p in D. Then if h is harmonic on D, h can be found in
terms of its boundary values:
$$h(p) = \frac{1}{4\pi} \int_{\partial D} h \frac{\partial G_p}{\partial N}\, dS$$
Proof. By (8.61),
$$h(p) = \frac{1}{4\pi} \int_{\partial D} h \frac{\partial G_p}{\partial N}\, dS - \frac{1}{4\pi} \int_{\partial D} G_p \frac{\partial h}{\partial N}\, dS$$
but the second integral vanishes since $k_p - \Pi_p = 0$ on $\partial D$.
Example
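35. Let us take D to be the upper half space D = {(x, y, z): z > 0}.
Then $\partial D = \{(x, y, 0) : (x, y) \in R^2\}$. Since the domain is infinite we
have to restrict attention to functions for which the integrals make
sense. If $H_l$ is a large hemisphere:
$$H_l = \{(x, y, z) : x^2 + y^2 + z^2 \le l^2,\ z \ge 0\}$$
then (8.59) holds for functions h harmonic on D:
$$h(p) = \frac{-1}{4\pi} \int_{\partial B(0, l),\ z \ge 0} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS
- \frac{1}{4\pi} \int_{\{z = 0\} \cap H_l} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.62}$$
We shall call the function h dissipative if the first integral tends to 0
as l → ∞, and the second integral converges. For example, if
$\|x\|^2 h(x)$ and $\|x\|^2 \nabla h(x)$ are bounded functions on D, h is dissipative
(Problem 48). This is true for $\Pi_p$, p not on the xy plane.
Now if h is dissipative we can let l → ∞ in (8.62) and obtain
$$h(p) = \frac{-1}{4\pi} \int_{\{z = 0\}} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS$$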
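Now if $p = (x_0, y_0, z_0)$,
$$\Pi_p(x) = [(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2]^{-1/2}$$
and its boundary values (z = 0) are the same as those for $\Pi_q$ where
$q = (x_0, y_0, -z_0)$. Since $\Pi_q$ is harmonic in D and dissipative, there
is a Green's function. Thus, the Green's function for $p = (x_0, y_0, z_0)$
is
$$G_p(x) = \Pi_q(x) - \Pi_p(x)
= \frac{1}{[(x - x_0)^2 + (y - y_0)^2 + (z + z_0)^2]^{1/2}} - \frac{1}{[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2]^{1/2}}$$
Now the exterior normal to the plane is the downward vertical, so
$\partial / \partial N = -\partial / \partial z$. A final computation gives
$$\frac{\partial G_p}{\partial N}(x) = -\frac{\partial G_p}{\partial z}(x, y, 0) = \frac{2 z_0}{[(x - x_0)^2 + (y - y_0)^2 + z_0^2]^{3/2}}$$
Thus, if h is harmonic and dissipative in the upper half space, we have
for any $z_0 > 0$
$$h(x_0, y_0, z_0) = \frac{z_0}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{h(x, y, 0)}{[(x - x_0)^2 + (y - y_0)^2 + z_0^2]^{3/2}}\, dx\, dy \tag{8.63}$$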
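The half-space formula with kernel $z_0 / (2\pi [(x - x_0)^2 + (y - y_0)^2 + z_0^2]^{3/2})$ can be sanity-checked on a case where the answer is known. The sketch below assumes the test function $h = \Pi_q$ with $q = (x_0, y_0, -a)$, a > 0, which is harmonic for z > 0 and takes the value $1/(z_0 + a)$ at $(x_0, y_0, z_0)$. Its boundary values depend only on the distance r to $(x_0, y_0)$, so the double integral reduces to a radial one; the substitution $r = z_0 \tan s$ and the name `poisson_half_space_radial` are choices made here for illustration.

```python
import math

def poisson_half_space_radial(z0, a, n=20000):
    """Half-space Poisson integral of h(x, y, 0) = (r^2 + a^2)^(-1/2), r = distance to (x0, y0).

    In polar coordinates the integral is z0 * int_0^inf r dr / ((r^2+a^2)^(1/2) (r^2+z0^2)^(3/2)).
    With r = z0 tan(s), 0 <= s < pi/2, the integrand becomes
    sin(s) cos(s) / sqrt(z0^2 sin(s)^2 + a^2 cos(s)^2), which is smooth on [0, pi/2].
    """
    ds = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        sn, cs = math.sin(s), math.cos(s)
        total += sn * cs / math.sqrt(z0 * z0 * sn * sn + a * a * cs * cs) * ds
    return total
```

Under these assumptions `poisson_half_space_radial(z0, a)` should reproduce $1/(z_0 + a)$, the value of $\Pi_q$ at $(x_0, y_0, z_0)$.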
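Finally, we remark that (8.59) can be used to solve Neumann's
problem in the same sense. If there is a harmonic function $k_p$ for each
p in D such that
$$\frac{\partial k_p}{\partial N} = \frac{\partial \Pi_p}{\partial N} \quad \text{on } \partial D$$
then for any function h harmonic on D we have
$$h(p) = \frac{1}{4\pi} \int_{\partial D} (\Pi_p - k_p) \frac{\partial h}{\partial N}\, dS$$
Thus h is determined by its normal derivative on the boundary.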
• PROBLEMS
39. Prove Corollary 2 of Theorem 8.5.
Green's Function for a Ball
40. Using a little bit of plane geometry it is possible to discover the
Green's function for the unit ball. If P is a point inside the ball, let Q be
the point inverse to P in the sphere.
Now let X be a point on the sphere. Verify that the triangles (see Figure
8.17) OPX and OXQ are similar (since the angles POX and QOX are the
same and
$$\frac{|QO|}{1} = \frac{1}{|PO|} \quad \text{or} \quad \frac{QO}{OX} = \frac{OX}{PO}).$$
Conclude that
$$\frac{QX}{PX} = \frac{OQ}{OX}$$
41. From the above problem we deduce that, for x on the unit sphere,
$$\Pi_p(x) = \frac{1}{\|p\|} \Pi_q(x)$$
where q is the point inverse to p in the unit sphere. Since $\Pi_q(x)$ is harmonic
in the unit ball B, the Green's function for B is
$$G_p(x) = \frac{\Pi_q(x)}{\|p\|} - \Pi_p(x) = \frac{1}{\|q - x\|\, \|p\|} - \frac{1}{\|p - x\|}$$
Figure 8.17
Calculate the precise form of Theorem 8.7 (known as Poisson's formula for
the ball) if h is harmonic on the unit ball:
$$h(p) = \frac{1}{4\pi} \int_{\|x\| = 1} \frac{1 - \|p\|^2}{\|x - p\|^3}\, h(x)\, dS$$
42. Solve Dirichlet's problem for the ball.
43. Solve Neumann's problem for the ball.
44. Find the steady state temperature distribution in the ball if the
surface temperature on the sphere is maintained at
(a) cos φ, φ is the angle between the point and the north pole.
(b) A(x + 2y), A a constant.
(c) $x^2 + y^2 - 2z^2$
(d) cos 4θ sin 2φ, where θ, φ are spherical coordinates.
45. Suppose D is a domain for which there exists a Green's function $G_p$
for all $p \in D$. Show that if $p \ne p'$
$$G_p(p') = G_{p'}(p)$$
(Hint: Show, by Green's identity, that the integral
$$\int_{\partial B} \left[ G_p \frac{\partial G_{p'}}{\partial N} - G_{p'} \frac{\partial G_p}{\partial N} \right] dS$$
is the same as
$$\int_{\partial B'} \left[ G_{p'} \frac{\partial G_p}{\partial N} - G_p \frac{\partial G_{p'}}{\partial N} \right] dS$$
where B, B' are balls of radius ε centered at p, p', respectively. Now, using
the fact that
$$G_p = -\frac{1}{r} + \text{harmonic}$$
compute the limits as ε → 0.)
46. Suppose D, D' are domains with Green's functions $G_D$, $G_{D'}$ and
$D \supset D'$. Show that for $p \in D'$
$$G_{D,p}(x) \le G_{D',p}(x) \quad \text{for all } x \text{ in } D'$$
47. Show that for h harmonic in the ball of radius R centered at p,
$$\int_{\|x - p\| = R} \frac{\partial h}{\partial N}\, dS = 0$$
48. Show that the function h defined on the upper half space $D = \{z \ge 0\}$
is dissipative if
$$\|x\|^2 h(x), \qquad \|x\|^2 \nabla h(x)$$
are bounded.
49. Show that if h is harmonic and dissipative in the upper half space
and zero on the z = 0 plane, then h is identically zero.
50. Suppose that h(x, y) is dissipative on the plane. Prove that there
exists a unique dissipative function u continuous on the upper half space
$\{z \ge 0\}$ and harmonic for {z > 0} which attains the boundary values h. u is
given by
$$u(x_0, y_0, z_0) = \frac{z_0}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{h(x, y)}{[(x - x_0)^2 + (y - y_0)^2 + z_0^2]^{3/2}}\, dx\, dy$$
51. Find the steady state dissipative temperature distribution on the
upper half space if the temperature on the plane z = 0 is maintained at
$\exp(x^2 + y^2)^{-1}$.
8.7 Summary
A fluid flow is given by a C1 .Revalued function φ(χ0, /) denned for Xq in
some domain D in R3 and / on an interval in R about the origin, φ has these
properties:
(i) φ(χ0, 0) = x0 all x0 e D.
(ii) For fixed /, x0 -> φ(χ0, /) is one-to-one and has a nonsingular
differential.
The vector field
Зф(х0. 0
v(x, 0 =
dt
χο = Φ~4*.Ι)
is the velocity field of the flow. The flow is steady if ν is independent of /.
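If $v = (v^1, v^2, v^3)$ is a differentiable vector field, its divergence is the function
$$\operatorname{div} v = \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3}$$
EQUATION OF CONTINUITY. If $v(x, t)$ is the velocity field of a flow and
$\rho(x, t)$ is its density, the law of conservation of mass implies
$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho v) = \frac{\partial \rho}{\partial t} + \sum_i v^i \frac{\partial \rho}{\partial x^i} + \rho \operatorname{div} v = 0$$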
A flow is incompressible if the same mass always occupies the same volume.
The necessary and sufficient condition for this is $\operatorname{div} v = 0$, where $v$ is the
velocity field of the flow. The fluid is incompressible if and only if the
density at a particle is constant under all flows of the fluid.
INTEGRATION UNDER A COORDINATE CHANGE. Let $(u, v, w) = F(x, y, z)$ be
a change of coordinates taking a domain $D$ onto the domain $\Delta$. If $g$ is
continuous on $D$,
$$\int_D g(x, y, z)\, dx\, dy\, dz = \int_\Delta g(F^{-1}(u, v, w)) \left| \det \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du\, dv\, dw$$
If $v$ is the velocity field of a flow, the circulation around a curve $C$ is defined
as
$$\operatorname{circ}(C) = \int_C \langle v, T \rangle\, ds$$
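If we fix the point $x_0$ and vector $n$ at $x_0$, let $C_r$ be the circle in the plane
perpendicular to $n$ of radius $r$ centered at $x_0$. The curl of the flow about $n$
at $x_0$ is
$$\operatorname{curl} v(x_0, n) = \lim_{r \to 0} \frac{\operatorname{circ}(C_r)}{\pi r^2}$$
If $v = (v^1, v^2, v^3)$ define
$$\operatorname{curl} v = \left( \frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3},\ \frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1},\ \frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2} \right)$$
Then $\operatorname{curl} v(x_0, n) = \langle \operatorname{curl} v(x_0), n \rangle$. A flow is irrotational if $\operatorname{curl} v = 0$.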
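The relation between the circulation limit and the component formula for the curl can be checked numerically. The Python sketch below integrates $\langle v, T \rangle$ around a small circle perpendicular to $n$ and compares the quotient by $\pi r^2$ with $\langle \operatorname{curl} v(x_0), n \rangle$. The sample field, the point, and all function names are hypothetical choices made for illustration; the curl of the sample field was worked out by hand from the component formula.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(a):
    length = math.sqrt(dot(a, a))
    return tuple(p / length for p in a)

def sample_field(x, y, z):
    # Hypothetical vector field v = (v1, v2, v3), chosen only as an example.
    return (-y + x * z, x * x, y * z)

def sample_curl(x, y, z):
    # Component formula applied by hand to sample_field:
    # (dv3/dy - dv2/dz, dv1/dz - dv3/dx, dv2/dx - dv1/dy)
    return (z, x, 2 * x + 1.0)

def circulation(field, x0, n, r, m=400):
    """Integral of <v, T> ds around the circle of radius r about x0 in the
    plane perpendicular to n, oriented so that (e1, e2, n) is right handed."""
    n = normalize(n)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    e1 = normalize(tuple(h - dot(helper, n) * c for h, c in zip(helper, n)))
    e2 = cross(n, e1)
    total = 0.0
    for k in range(m):
        th = 2 * math.pi * k / m
        point = tuple(x0[i] + r * (math.cos(th) * e1[i] + math.sin(th) * e2[i])
                      for i in range(3))
        tangent = tuple(r * (-math.sin(th) * e1[i] + math.cos(th) * e2[i])
                        for i in range(3))
        total += dot(field(*point), tangent)
    return total * 2 * math.pi / m

def curl_about(field, x0, n, r):
    """Approximation of curl v(x0, n): circulation divided by the disc area."""
    return circulation(field, x0, n, r) / (math.pi * r * r)

if __name__ == "__main__":
    x0, n = (0.3, -0.2, 0.5), (1.0, 2.0, 2.0)
    print(curl_about(sample_field, x0, n, 0.01))
    print(dot(sample_curl(*x0), normalize(n)))
```

The two printed numbers should agree closely; for a general smooth field the agreement only improves as $r \to 0$.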
A surface patch in $R^3$ is the image of a domain $D$ in $R^2$ under a $C^1$ map
$x = x(u, v)$ with these properties:
(i) $x$ is one-to-one
(ii) the vectors $x_u = \partial x/\partial u$, $x_v = \partial x/\partial v$ are independent.
$(u, v)$ are called
parameters or coordinates for the surface patch. A surface is a set $\Sigma$ in $R^3$
which can be covered by surface patches. The tangent plane to $\Sigma$ is the plane
spanned by the vectors $x_u$, $x_v$ (this is independent of the particular
coordinates). The normal $N$ to a surface is a unit vector defined for each point
and orthogonal to the tangent plane there.
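The form
$$ds^2 = E\, du^2 + 2F\, du\, dv + G\, dv^2$$
defined on a surface $\Sigma$ with coordinates $(u, v)$ by
$$E = \langle x_u, x_u \rangle \qquad F = \langle x_u, x_v \rangle \qquad G = \langle x_v, x_v \rangle$$
is the first fundamental form of the surface. The parametric curves are
orthogonal if $F = 0$. The length of a curve on $\Sigma$ given by $u = u(t)$, $v = v(t)$ is
$$\int ds = \int \left[ E \left( \frac{du}{dt} \right)^2 + 2F \frac{du}{dt} \frac{dv}{dt} + G \left( \frac{dv}{dt} \right)^2 \right]^{1/2} dt$$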
A geodesic is a curve of minimal length. If $\gamma$ is a geodesic on $\Sigma$, then at any
point $p$ on $\gamma$ the normal to $\gamma$ is orthogonal to the tangent plane of $\Sigma$.
The area of a domain $D$ on a surface $\Sigma$ is defined by
$$\int_D dS = \int \|x_u \times x_v\|\, du\, dv$$
The integral of a continuous function $f$ defined on $D$ is
$$\int_D f\, dS = \int f \|x_u \times x_v\|\, du\, dv$$
These definitions are independent of the parameters chosen.
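The area formula and the coefficients of the first fundamental form can be evaluated directly from a parametrization. The Python sketch below does this for the unit sphere in spherical coordinates, using central differences for $x_u$, $x_v$ and a midpoint rule for the double integral. The parametrization, the step size, the grid sizes, and the function names are all assumptions of this sketch and do not come from the text.

```python
import math

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def sphere(u, v):
    # Sample surface patch: unit sphere, 0 < u < pi, 0 < v < 2*pi.
    return (math.sin(u) * math.cos(v), math.sin(u) * math.sin(v), math.cos(u))

def partials(x, u, v, h=1e-5):
    """Central-difference approximations of x_u and x_v."""
    xu = tuple((a - b) / (2 * h) for a, b in zip(x(u + h, v), x(u - h, v)))
    xv = tuple((a - b) / (2 * h) for a, b in zip(x(u, v + h), x(u, v - h)))
    return xu, xv

def first_fundamental_form(x, u, v):
    """Coefficients E, F, G at the parameter point (u, v)."""
    xu, xv = partials(x, u, v)
    return dot(xu, xu), dot(xu, xv), dot(xv, xv)

def area_element(x, u, v):
    """The factor ||x_u x x_v|| appearing in dS."""
    xu, xv = partials(x, u, v)
    normal = cross(xu, xv)
    return math.sqrt(dot(normal, normal))

def surface_area(x, u_range, v_range, nu=200, nv=40):
    """Midpoint-rule approximation of the integral of ||x_u x x_v|| du dv."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / nu, (v1 - v0) / nv
    total = 0.0
    for i in range(nu):
        for j in range(nv):
            total += area_element(x, u0 + (i + 0.5) * du, v0 + (j + 0.5) * dv)
    return total * du * dv

if __name__ == "__main__":
    print(surface_area(sphere, (0.0, math.pi), (0.0, 2 * math.pi)), 4 * math.pi)
    print(first_fundamental_form(sphere, 1.0, 0.7))
```

For this parametrization $F$ comes out (numerically) zero, so the parametric curves are orthogonal, and the area element agrees with $(EG - F^2)^{1/2}$.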
If $\Sigma$ is an oriented surface and $v$ is a vector field defined around $\Sigma$, the
flux of $v$ across $\Sigma$ is
$$\int_\Sigma \langle v, N \rangle\, dS$$
STOKES' THEOREM. If $v$ is a $C^1$ vector field defined in a domain $U$, and $\Sigma$ is
an oriented surface in $U$ and $D$ is a regular domain on $\Sigma$, then
$$\int_{\partial D} \langle v, T \rangle\, ds = \int_D \langle \operatorname{curl} v, N \rangle\, dS$$
DIVERGENCE THEOREM. If $v$ is a $C^1$ vector field defined in a neighborhood
of a regular domain $D$ in $R^3$ then, with the exterior normal orientation on $\partial D$,
$$\int_{\partial D} \langle v, N \rangle\, dS = \int_D \operatorname{div} v\, dV$$
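A quick numerical check of the divergence theorem is easy to set up on the unit cube. The Python sketch below compares the outward flux through the six faces with the volume integral of the divergence, both computed by a midpoint rule. The sample field and the function names are hypothetical, chosen only so that both sides can also be computed by hand (each equals 2); the divergence is entered by hand from the definition.

```python
def field(x, y, z):
    # Hypothetical C^1 vector field on a neighborhood of the unit cube.
    return (x * x * y, y * z, z * z + x)

def div_field(x, y, z):
    # Divergence of the sample field, computed by hand:
    # d(x^2 y)/dx + d(y z)/dy + d(z^2 + x)/dz = 2xy + z + 2z
    return 2 * x * y + 3 * z

def midpoints(n):
    return [(i + 0.5) / n for i in range(n)]

def volume_integral(n=20):
    """Midpoint-rule approximation of the integral of div v over [0, 1]^3."""
    pts = midpoints(n)
    cell = 1.0 / n ** 3
    return sum(div_field(x, y, z) for x in pts for y in pts for z in pts) * cell

def outward_flux(n=20):
    """Midpoint-rule approximation of the integral of <v, N> over the six
    faces, with N the exterior normal (+e_i on the face x_i = 1, -e_i on
    the face x_i = 0)."""
    pts = midpoints(n)
    cell = 1.0 / n ** 2
    total = 0.0
    for a in pts:
        for b in pts:
            total += field(1.0, a, b)[0] - field(0.0, a, b)[0]
            total += field(a, 1.0, b)[1] - field(a, 0.0, b)[1]
            total += field(a, b, 1.0)[2] - field(a, b, 0.0)[2]
    return total * cell

if __name__ == "__main__":
    print(outward_flux(), volume_integral())
```

The cube is used here only because its faces are flat, so the surface integral needs no parametrization; the theorem itself applies to any regular domain.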
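GREEN'S IDENTITIES. Let $f$, $g$ be two $C^2$ functions defined on a regular
domain $D$. Then
$$\int_{\partial D} f \frac{\partial g}{\partial N}\, dS = \int_D \left[ f \Delta g + \langle \nabla f, \nabla g \rangle \right] dV$$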
DIRICHLET'S PRINCIPLE. Let $D$ be a regular domain in $R^3$ and suppose $f$ is
a continuous function on $D$. Let $M_f$ be the class of $C^2$ functions on $D$ with
boundary values given by $f$.
(i) If there is a harmonic function in $M_f$, it minimizes the energy integral
$$E^2(u) = \int_D \|\nabla u\|^2\, dV$$
(ii) If there is a $C^2$ function which minimizes the energy integral, it must
be harmonic.
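Part (i) can be illustrated with a small numerical experiment. In the Python sketch below, $D$ is taken to be the unit cube, the harmonic function is $h(x, y, z) = x$, and the competitors are $u = h + \varepsilon w$ with $w = \sin \pi x \sin \pi y \sin \pi z$, which vanishes on $\partial D$ so that $u$ has the same boundary values as $h$. These choices, the midpoint rule, and the function names are assumptions of this sketch. The gradient of $u$ is entered by hand.

```python
import math

def grad_u(x, y, z, eps):
    """Gradient of u = h + eps * w, where h(x, y, z) = x is harmonic and
    w = sin(pi x) sin(pi y) sin(pi z) is zero on the boundary of the cube."""
    sx, sy, sz = (math.sin(math.pi * t) for t in (x, y, z))
    cx, cy, cz = (math.cos(math.pi * t) for t in (x, y, z))
    return (1.0 + eps * math.pi * cx * sy * sz,
            eps * math.pi * sx * cy * sz,
            eps * math.pi * sx * sy * cz)

def energy(eps, n=20):
    """Midpoint-rule approximation of the energy integral of u over [0, 1]^3."""
    pts = [(i + 0.5) / n for i in range(n)]
    total = 0.0
    for x in pts:
        for y in pts:
            for z in pts:
                gx, gy, gz = grad_u(x, y, z, eps)
                total += gx * gx + gy * gy + gz * gz
    return total / n ** 3

if __name__ == "__main__":
    for eps in (-0.5, 0.0, 0.1, 0.5):
        print(eps, energy(eps))
```

For this family the energy works out by hand to $1 + 3\pi^2 \varepsilon^2 / 8$, so the harmonic member ($\varepsilon = 0$) has the smallest energy, as the principle predicts.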
FURTHER READING
In order to continue the study of the divergence theorem and further related
topics one must turn to the notations and ideas of differential forms. The
small book
M. Spivak, Calculus on Manifolds, W. A. Benjamin, Inc., New York, 1965,
gives a clear and direct account of this subject. The book
H. K. Nickerson, D. C. Spencer, and N. Steenrod, Advanced Calculus,
D. Van Nostrand Company, New York, 1957,
was the first to give a complete
account of this subject on an advanced calculus level. For a more recent
account, with a chapter on potential theory in $R^n$, see
L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading,
Mass., 1968.
Other references are
M. E. Munroe, Modern Multidimensional Calculus, Addison-Wesley,
Reading, Mass., 1963.
E. Butkov, Mathematical Physics, Addison-Wesley, Reading, Mass., 1968.
For further study of differential geometry we recommend
D. J. Struik, Lectures on Classical Differential Geometry, Addison-Wesley,
Reading, Mass., 1950.
H. Guggenheimer, Differential Geometry, McGraw-Hill, New York, N.Y.,
1963.
MISCELLANEOUS PROBLEMS
52. Suppose that $F$ is a $C^1$ function defined in a neighborhood of $p_0$ in
$R^3$ such that $F(p_0) = 0$ and $dF(p_0) \neq 0$. Show that the set $\Sigma = \{p : F(p) = 0\}$
is a surface patch in some neighborhood of $p_0$. (Hint: Choose coordinates
$x, y, z$ so that the forms $dF(p_0)$, $dx(p_0)$, $dy(p_0)$ are independent. Then the
transformation $\mathbf{F}(p) = (x(p), y(p), F(p))$ is invertible. If $G$ is the inverse
to $\mathbf{F}$, the function
$$\varphi(u, v) = G(u, v, 0)$$
parametrizes $\Sigma$.)
53. A family of surfaces in a domain $D$ in $R^3$ is given implicitly by the
equation
$$F(p) = c \qquad (8.64)$$
where $F$ is $C^1$ in $D$ and $dF(p) \neq 0$. For each $c$, the set (8.64) determines a
surface. Show that the vector field $\nabla F$ is the velocity field of a flow whose
path lines intersect each surface orthogonally.
54. Find the family of curves which are orthogonal to these families of
surfaces:
(a) $x^2 + 2y^2 + z^2 = c$.
(b) $z^2 x^2 = c^2$.
(c) $x^2 + y^2 = c(z + c)$.
(d) $z = c \cos y$.
55. Given a family $\mathcal{F}$ of curves in space, there may not exist a family of
surfaces orthogonal to $\mathcal{F}$. If, say, $v$ is a vector field tangent to the family $\mathcal{F}$
and $\{F(p) = c\}$ is the family of orthogonal surfaces, show that $\nabla F$ must be
collinear with $v$. The condition that $v$ must be collinear with a gradient
must be satisfied in order for the path lines associated to $v$ to have an
orthogonal family of surfaces. Show that this condition may be written
$\langle \operatorname{curl} v, v \rangle = 0$.
56. Show that the family of path lines of the helical flow
$$(x, y, z) = (x_0 \cos t + y_0 \sin t,\ -x_0 \sin t + y_0 \cos t,\ z_0 + t)$$
does not admit an orthogonal family of surfaces.
57. Show that if the vector field $v$ is conservative, the family of surfaces
$\{\Pi(p) = c\}$, where $\Pi$ is a potential for $v$, is orthogonal to the path lines.
58. Show that, although the vector field
$$v(x, y, z) = (yx, y, 0)$$
is not conservative, the path lines of its associated flow do admit a family
of orthogonal surfaces.
59. Suppose that $D$ is a star-shaped domain in $R^3$ centered at the origin.
That is, if $p \in D$, then the line segment joining 0 to $p$ is also in $D$. Suppose
that $v$ is a $C^1$ vector field defined on $D$ such that $\operatorname{div} v = 0$. Define the
vector field $u$ by
$$u(p) = \int_0^1 [v(tp) \times tp]\, dt$$
Show that $\operatorname{curl} u = v$. (Hint: Recall Poincaré's lemma (see Theorem 7.5);
this is just a generalization. Differentiate under the integral sign, use the
condition $\operatorname{div} v = 0$ and then integrate by parts.)
60. Suppose that $u$ is a $C^1$ vector field defined in a neighborhood of a
sphere $S$. Show that
$$\int_S \langle \operatorname{curl} u, N \rangle\, dS = 0$$
(Use Stokes' theorem one hemisphere at a time.)
61. Every curl-free vector field defined on $R^3 - \{0\}$ is a gradient; however
there is a divergence-free vector field defined there which is not a curl. For
example, take
$$v_0(p) = \frac{p}{\|p\|^3}$$
Then $\operatorname{div} v_0 = 0$, but if $S$ is a sphere centered at the origin
$$\int_S \langle v_0, N \rangle\, dS = 4\pi$$
so by Problem 60, $v_0$ is not a curl. It can be shown that if $v$ is any divergence
free field defined in $R^3 - \{0\}$, there is a vector field $u$ and a constant $c$ such
that
$$v = \operatorname{curl} u + c v_0$$
Can you suggest how to define $c$ and $u$?
Normal Curvature
62. Let $\Sigma$ be a surface patch in $R^3$ coordinatized by $x = x(u, v)$. Let
$N$ be the normal to $\Sigma$ so chosen that $x_u \to x_v \to N$ is right handed. $N$ can
be viewed as a differentiable function of $u$, $v$. For $p$ on $\Sigma$, $dN(p)$ is thus
an $R^3$-valued linear map of $R^2$. By defining
$$dN(p)(x_u) = \frac{\partial N}{\partial u}(p) \qquad dN(p)(x_v) = \frac{\partial N}{\partial v}(p)$$
we may consider $dN$ as a mapping of the tangent space $T(\Sigma)_p$ into $R^3$.
(a) Show that the range of $dN(p)$ is orthogonal to $N(p)$. (Hint: $N$
is a unit vector.)
(b) Because of (a) $dN(p)$ can be considered as a linear transformation
of $T(\Sigma)_p$ to $T(\Sigma)_p$. Show that $dN(p)$ is symmetric:
$$\langle dN(p)v, w \rangle = \langle v, dN(p)w \rangle$$
(Hint: You need only show that
$$\langle dN(p)(x_u), x_v \rangle = \langle x_u, dN(p)(x_v) \rangle)$$
(c) Show that $dN(p)(v)$ is $\kappa_N(v)$ where $\kappa_N(v)$ is the normal curvature
(see Problem 25) of the curve of intersection of the plane through $N$ and
$v$ with $\Sigma$.
Since $dN(p)$ is symmetric on $T(\Sigma)_p$, it has two real eigenvalues and the
corresponding eigenspaces are orthogonal. The eigenvalues are called
the principal curvatures of $\Sigma$ at $p$, and the eigendirections are the principal
directions.
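63. The second fundamental form on a surface is the form
$$\mathrm{II}(v) = \langle dN(p)v, v \rangle \quad \text{for } v \in T(\Sigma)_p$$
Show that II can be expressed as
$$\mathrm{II} = L\, du^2 + 2M\, du\, dv + N\, dv^2 \qquad (8.65)$$
where
$$L = \left\langle \frac{\partial N}{\partial u}, \frac{\partial x}{\partial u} \right\rangle \qquad M = \left\langle \frac{\partial N}{\partial u}, \frac{\partial x}{\partial v} \right\rangle = \left\langle \frac{\partial N}{\partial v}, \frac{\partial x}{\partial u} \right\rangle \qquad N = \left\langle \frac{\partial N}{\partial v}, \frac{\partial x}{\partial v} \right\rangle$$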
64. Compute the second fundamental form and find the principal
directions on these surfaces:
(a) $x^2 + 2y^2 + z^2 = 1$
(b) $2y = x^2$
(c) $x^2 - y = z^2$
(d) $x^2 - y^2 + z^2 = 0$
65. (Rodrigues' Formula) Show that a curve $\Gamma$ on a surface $\Sigma$ is tangent
to a principal direction at every point if and only if $dN + \kappa_N\, dx = 0$ along $\Gamma$.
(Such curves are called lines of curvature.)
66. Find the lines of curvature on the surface $\Sigma$:
(a) $\Sigma$ is the cylinder given by $x(u, v) = (\cos u, \sin u, v)$.
(b) $\Sigma$ is the torus $x(u, v) = ((2 + \cos u)\cos v, (2 + \cos u)\sin v, \sin u)$.
(c) $\Sigma$ is the sphere $x^2 + y^2 + z^2 = 1$.
67. A point $p$ on a surface $\Sigma$ is called an elliptic point if the principal
curvatures have the same sign, a hyperbolic point if the principal curvatures
have different signs and a parabolic point if one principal curvature is zero.
Find examples of all three kinds of points on a torus. Show that $p$ is
elliptic, hyperbolic, parabolic as $LN - M^2 > 0$, $< 0$, $= 0$.
68. Show that at a hyperbolic point a surface intersects its tangent
plane in two curves with zero normal curvature.
ANSWERS TO
SELECTED EXERCISES
Chapter 1
SECTION 1.1
1. (a) (1,-7) (f) (4-2y-3z,y,z)
(b) (4,3/4) (g) (4,3)
(c) (8,6,1) (h) (5,2)
(d) (0,1,2,1) (i) (5-y,y,-l-w,w)
(e) (11,13,-2) (j) no solutions
2. (a) (-z, 0, z)
(b) (0, -z, z)
(c) (-z, -w, z, w)
(c) /1 2 0 Γ
0 0 10
0 0 0 1
,0 0 0 0/
(b)
0
1
0
0
0
0
1
0
(a) (32/78, -5/78, 35/78)
(b) $(-3 - 5x^5,\ 2 + (10/3)x^5,\ -(7/2) - 4x^5,\ 1/2,\ x^5)$
(c) $(-3 - 5x^5,\ 8/3 + (10/3)x^5,\ -4 - 4x^5,\ 1,\ x^5)$
(d) no solutions
(e) no solutions
SECTION 1.2
11. (a) (120/52,4/52)
(b) (4/5,-8/5)
(c) (-51/22,29/22)
(d) (-39/5,-43/5)
12. (a) Ъх+1у = -\
(b) x-y = 8
(c) y-2x = U
SECTION 1.3
13. (a) /l
(b)
14. (a)
(c) (24)
(d)
2
1
ч -1
/ 24 24 12 8
12 12 6 4
-6 -6 -3 -2
48 48 24 16
\ 42 42 21 14
4\
2
V
(b)
(d) doesn't exist
15. (a)
0
3
-20/78
0
-1
7/78
(c) / 0 -1/2 0
1/6 1/12 0
-1/6 5/12 1/8
-1/2 1/4 0
(b) / 1 0 0 0 \
J -1 1/3 0 0 \
I 1 0-1 0 I
\ 0 0 0 1/2/
16. (a) no conditions
(b) ^=0
(c) 2b2 + b3 + b'=0
ί»1 + 362 — 26* = 0
4b2 - b" + b5 =0
(d) -Abl - 25b2 + 1463 + 10b* = 0
17. Since the index $d$ of $A$ is at most $n$, if $P$ row reduces $A$ we obtain $PAx = Pb$.
Since (at least) the last $m - d$ rows of $PA$ are zero, $b$ must satisfy the
(nonvacuous) conditions that the last $m - d$ entries of $Pb$ are zero.
19. If $x = \sum x^i E_i$, $T(x) = \sum x^i T(E_i) = 0$.
20. If $x = \sum x^i E_i$, $T(x) = \sum_{i=1}^{n-1} x^i E_{i+1} + x^n E_1$, so $T$ is uniquely determined by
the conditions.
SECTION 1.4
21. (a) 4 (b) 3 (c) 3 (d) 3
22. (a) 3 (b) 3 (c) 3
23. (a) $\{x \in R^4 : x^1 + x^2 = x^3 + x^4\}$
(b) $\{x \in R^4 : -3x^1 + x^3 = 0,\ 2x^1 - x^2 - x^4 = 0\}$
(c) $\{x \in R^3 : x^1 + 2x^2 - x^3 = 0\}$
(d) $\{x \in R^5 : x^3 = x^5,\ x^2 = 0,\ x^4 = 0\}$
24. (a) No (b) Yes (c) No
25. (a) c(—4υι — v2 — 6υ3 + 5υ„) = 0
(b) α(υ2 - υΛ) + b(2.Vi + υ, + υ„) = 0
(c) αοι + b(2v2 — 2ьъ + νΛ) = 0
26. The given vectors form a basis for R5.
27. (a) (0,-1/2,1,0,0) (-1,1/2,0,0,1)
(b) (1,2,1,0) (0,-1,0,1)
SECTION 1.5
29. (а) К = [бх1 = 17x4, χ2 = -2x\ хъ = x4/3}
R = R3
(b) K = {x1=0, x2=0}
r = {6x2 = \2x« - \\x\ 2x3 = 4x* - 39л:1}
(c) К = {х1=0,3х* + 2х*=0,3х3 + 4хл=0}
R = {xi-x2 + x3~x*=0}
(d) K = {Sx1 + x* + 6x5=0, 8x2 + 5x* + 7x5=0, x3 + 2x4 + 2x5=0}
R = R3
30. (a) K: (17/6, -2, 1/3, 1), R: Eu Ε,, E3
(b) K: (0, 0, 1, 0), (0, 0, 0, 1), R: (1, -11/6, -39/2, 0), (0, 2, 4, 1)
(c) K: (0, -2/3, -4/3, 1), Я: (1, 1, 0, 0), (-1, 0, 1, 0), (1, 0, 0, 1)
(d) K:(-l, -5, -16,8,0), (-3, -7, -8, 0, 4), R: ЕиЕг,Еъ
31. If $f$ is nonzero, its range is all of $R$, so its rank is 1. Thus its nullity is
$n - 1$.
32. $\{E_i - E_1 : i = 2, \ldots, n\}$
SECTION 1.6
(d)
1/9
1/6
1/9
2/9
1/9
0
2/9
-2/9
-1/3
1/6
0
6/9
1/9
-1/12
-1/9
5/18
(b)
(c) /0 1 1
1 0
-1 0
о о
34. (a) (0,2,1) (c) (9/8,1/2,-7/8)
(b) (-17,5,1) (d) (-1/2,1/2,1)
35. By induction we can show that $A^k$ has the property that its $(i, j)$ entries are
zero for all $i < j + k - 1$. Once $k > n$, these are all the entries.
36. (7+Λ)-^ |(-1)U
n = l
Σ(
( = 0
SECTION 1.7
37. Eigenvalue Eigenvectors
(a) 2 (0, 1, -2)
3 (1, -2, -4)
-1 (1,1,-4)
(b) 1 (1,0,1,-1)
-1 (2,0,2,-3)
0 (0,0,1,-1)
2 (0, 2, 0, 0)
(c) 1 (1,0,0,0), (0,0, 1,-1)
4 (0,1,0,0)
no basis of eigenvectors
(d) 2 (1,0, 1), (0, 1, 1)
-2 (1, 1, -2)
39. If G represents the standard basis, we use Exercise 38 to find Aar, AGE and
use the fact that
-AGF(AG*)-
0
0
0
72
0
0
-1
-1/2
1/2
0
0
1/2
0
1/2
1
-1/2
5 -1\ (c)
1 2
4 1/
(b) / 1/2 -1/2 -1/2 \
3/16 -5/16 —7/16 J
\-l/2 1/2 -1/2 /
40. (a) /o -2 2\ (b) /-3 -5 0\
2 3-1 2 3 0
\2 2 0/ \ 0 4 1/
41. If $T_E = cI$, where $E$ is a basis of eigenvectors, then for any basis $F$,
$T_F = (A_E^F)^{-1} T_E A_E^F = c (A_E^F)^{-1} (A_E^F) = cI$
SECTION 1.8
42. (a) (5 + 3i)/34 (d) cis(-2/3)/4
(b) 1 + i (e) cis(-7)
(c) (3 - i)/10
43. $z\bar{z} = 1$ if and only if $z^{-1} = \bar{z}$
45. (а) ±(-1 + 0Л/2
(b) cis(A:7r/5) к = \,Ъ, 5, 7, 9
(c) ±1, ±/(21/8)cis(7r/16)
(d) i,(±l + IV3)/2
(e) (±5)1/2cis[Jarctan(-i)]
(f) 2501'6 cis[£tt/3 + (l/3)arctan(l/3)] к = 0, 2, 4
46. (a) (1,0,(1,-0
(d) (1,-4, 5,1), (1,3,0, 5), ((
-3+V21)/4, -2
((-3 -л/21)/4, -2 -V21/2, 0, 1)
SECTION 1.9
47.
48.
49.
50.
Table of <υ,, υ,>:
υ2 «з
υι 13 24
v2 41
υ3
υ4
υ5
V6
Table of υ ι χ Vj
υ2
οι (6,-5,-5)
«2
ν3
V5
νι (-6,2,5)
υ2 (-12,4,10)
ν3 (-15,5,21)
υ4 (-15,5,20)
ν5
ν6
(37, 16, -28)/17
(a) 9χι + 2χ2 -
(b) 12*1 - 4χ2 -
(с) Зх1 - χ2 = С
(d) Зх1 + 9х2 =
ν» v5
20 5
40 0
67 7
0
υ3 .
(5, 4, -7)
(-5,13,7)
«6
(9,2,-10)
(-1,-3,0)
(0,-7,0)
(-2,-6,0)
(3,-1,0)
10х3=0
- 10х3 = 0
I
0
v6 νη
2 17
4 34
5 0
5 -15
0 0
21
ν*
(9,2,-10)
(3, 9, 0)
(10,-5,-14)
Vl
(11,-72,25)
(-41, -123,0)
(-25, -222, 35)
(-67,-201,0)
(63, -21, 50)
(-5,-15,0)
51. Vi.: χ = z, 2y = z
vi: χ + Ъу = 0, ζ + Ay = 0
v3: у =0,5x = 7z
υΛ: χ + Ъу = 0, 2z + 5y = 0
v5: ζ =0,3x=y
v6:x=0,y = 0
υ-,: χ + Ъу = 0, Ix + 5z = 0
52. Ix1 - x2 - 2x3 = 17
53. Ax1 + x2 - 3x3 = 2
54. (a) <x,E2-E1>=0 = <,x,E3-E1>
(b) x1 + x1 + 3x3=0,2x1 + 2x2 + 5x3=0
(c) x2=0, x3=0
55. The planes are given by the equations
(a) x + y + z=l (b) x=y (c) x = 0
The intersection of (a) and (b) is given by equations (a) and (b), etc.
56. (a) x + у = 2z (b) у = ζ (с) ζ = 0
58. The area of a parallelogram of side lengths $a$, $b$ is $ab \sin \theta$, where $\theta$ is
(either) included angle.
59. False if $u$ is perpendicular to $v$ and $w$, but $v$ is not perpendicular to $w$.
60. Apply the equation $\langle a, b \times c \rangle = \det \begin{pmatrix} a \\ b \\ c \end{pmatrix}$ to each pair, and observe that there
are always two rows the same.
SECTION 1.11
61. (a) open (b) neither (c) closed (d) closed (e) closed
(f) open (g) open (h) closed (i) open (j) open
(k) open
62. (29,-3,26)/14
63. (Ill,-22, lll)/34
64. (0, 1, 1)/21/2, (1, -1, 0/31'2
(b) (0, 1, 0, D/21'2, (1, 0, 1, 0)/2^, (-1, -2, 1, 2)110»'
(c) (0, 1, 0, 0, 0), (0, 0, 0, 1, 0), (0, 0, 2, 0, lys1"
(d) (1, 2, 3, 4)/301'2, (2, 1,0,-1)/6·", (1, -3, 3, -1)/»1"
65. (a) ΛΓ: (10, -16, 16, 10/4771'2
R: (0, 0, 1, 0), (1, 0, 0, D/21'2, (1, 2, 0, -l)^1'2
(b) K: (1, 0,-1, 0)1Г'\ (0, 1, 0, - D/21'2
R: (-1, -2, 1, 0W\ (3, 12, -3, 2Э/1561'2
Chapter 2
SECTION 2.1
1. (a) does not exist (b) 0 (c) 0 (d) no limit (e) 1 (f) 1
2.
4.
5.
(g) 1
No, take
Yes
0
SECTION 2.2
8.
-1/2
(h)
x„ =
3
= η
9. Form the new sum in this way: at any stage, if the sum is 1, add the first
negative term not yet used, and if the sum is less than 1, add positive terms
until the sum is 1. The resulting series is
lllllllllj_j_ 1 1 1
2+4 + 4-2 + 8 + 8+8 + 8_4+Ϊ6 + Ϊ6 + Ϊ6 + Ϊ6~4 + "'
The terms come in blocks. The first bracket encloses the first block, the
second bracket begins the second block. The nth block consists of 2"_1
copies of
11111
yi ' 2"+2 2"+2 2"+2 2"+2
10. Yes. Since the sum of the positive terms is $+\infty$, and the sum of the
negative terms is $-\infty$, we can rearrange so that at any stage, if the sum is
less than 10,000 we add positive terms until 10,000 is passed, and if the sum
is not less than 10,000 add negative terms until 10,000 is passed.
SECTION 2.3
11. (b), (d), (f), (g), (h), (l), (m) converge
(a), (c), (e), (i), (j), (k), (n) diverge
14. (a) $|z| < 1$ (b) $|z| < 1$ (c) all $z$ (d) all $z$ (e) all $z$
(f) $z = 0$ (g) $|z| \le 1$ (h) $|z| < 1$ (i) $|z| < 1$
(j) $|1 + z| < 1$
SECTION 2.7
15. (a) π (b) 2/3 (c) 2/3 (d) 4/3
17. (a) 0 (b) 1/16 (c) β-(ΐ/2β)-3/2 (d) 1/4 (e) 5/6
(f) 1 /3) JS'4 [(1 + sec2 0)3'2 - 1 ] dd
18. (a) 1/6 (b) 1/60 (c) 1/10 (d) π/10
19. (а) Зтг/16 (b) 1/48 (c) 1 (d) 1/24
SECTION 2.8
20. df/дх df\dy df/dz
(a) yz xz xy
(b) у cos(xy) χ cos(xy)
(c) y'x1"'-1' /z/-'Inx xyZy'\nx\ny
(d) 2xy + y2 x2 + 2yx
21. ххХ[хх-1 + х]пх + хх(Ых)2]
23. Since VA = (й/г/йх1, .. , dh/дх") for any function h, we need only show that
a a# a/
The proof is just as for functions of one variable.
24. By Exercise 23
$$0 = \nabla\left( f \cdot \frac{1}{f} \right) = \frac{1}{f} \nabla f + f \nabla\left( \frac{1}{f} \right)$$
25. 5/9
26. $10^{1/2}/(1 + 10^{1/2})$
SECTION 2.9
29. (a) 0 (b) 0 (c) 1 (d) 0 (e) 1
30. (b), (c), (f) converge; (a), (d), (e) diverge
SECTION 2.11
31. (a) $x_n = (1/2)(x_{n-1} + a/x_{n-1})$
(b) $x_n = 2x_{n-1}/3 + a/3x_{n-1}^2$
32. (a) **, l-^i±^zi±l
3 3 3xS_1 + 2x„_1+ 1
*2-i-l
(b) *-2£7=7·*=?
, _ xl-1 — 2x2_ ι — 3x„_! + 2
(c) x„ — x„_! —— —
ixl-1 — 4x„_i — 3
<л\ 4 , л4*-1 +5
(d) )ί,=-)ί,-ι + 4-
5 5χί_! — 1
33. (a) all points except on the line χ = 0
(b) ι) all except (1,-1)
(ii) all points
(iii) no points
34. $0 = \dfrac{d}{dx} F(x, g(x)) = \dfrac{\partial F}{\partial x}(x, g(x)) + \dfrac{\partial F}{\partial y}(x, g(x))\, g'(x)$
35. (a) — (ду + tan ху)\хг
(b) -sin(x + y)\{\ + sin(x + y))
(C) -J</*
(d) -ye°>\{xe"> - 1)
Chapter 3
SECTION 3.1
1. (a) $ce^{ct}$
(b) $(-\sin t, \cos t, 1)$
(c) $(-a \sin t, b \cos t)$
(d) $(2t, 3t^2)$
(e) $(1, 2t, 3t^2)$
(f) $(\cos t, -\sin t, 0)$
2. (a) $|c| \exp[(\operatorname{Re} c)t]$, $\arg c$
(b) $2^{1/2}$, $\pi/2$
(c) $(a^2 \sin^2 t + b^2 \cos^2 t)^{1/2}$
(d) $|t|(4 + 9t^2)^{1/2}$, $\arccos\{(4t + 18t^3)/2|t|[(4 + 9t^2)(1 + 9t^2)]^{1/2}\}$
(e) $(1 + 4t^2 + 9t^4)^{1/2}$
(f) 1, $\pi/2$
The tangent to (a) at the point e" is parallel to the tangent to (b) at the
point (a cos t, b sin t) precisely when
1 lb \
■■ arctan I - tan t)
Im с \a J
4.
5.
6.
7.
8.
10.
11.
Never
(flfr)1'
2
(-1,
12
1),(-1,1)
mm(l/a, \/b, 1/c)
(a)
(b)
(c)
(d)
(a)
(b)
Eigenvalues
11.516
-4.516
14
10
3 + 4(2)1/2
3 - 4(2)1/2
1 + 2C10)1'2
1 - 2(10)1/2
2 + 21'2
1
2 - 21'2
7.411
0.313
-1.724
Eigenvectors
(0.685, 0.729)
(0.729, -0.685)
(1,1)
(-1,1)
(0.383, 0.924)
(0.924, -0.383)
(0.585,0.811)
(0.811, -0.585)
(-0.383, 0, 0.924)
(0, 1, 0)
(0.924, 0, 0.383)
(-0.501, -0.382, -0.777)
(0.838, -0.438, -0.325)
(-0.216, -0.814, 0.540)
SECTION 3.2
12. $x - x^3/3 + x^5/5$, $1 - x + x^2 - x^3 + x^4$
13. 0.1987
14. 1.7320
16. [-0.0312,0.0322], |x|< 0.1
17. |*|^0.125
SECTION 3.4
18. (a) $y = (-1/2)\exp(-x^2) + 3/2$
(b) $y = -x \cos x + \sin x$
(c) $x = t^2/2$, $y = t^3/3 + 1$, $z = t^4/4$
(d) ζ = -ie" + (1 + i)2'3/3 + 1 + ι
19. (a) y={c-x)~1
(b) tan у + sec у = K(ta.n χ + sec x)
(c) x3+/ = C
(d) ;v=sin(x-(l/3)x3+ C)
(e) y=Kfexp(t2/2)dt+C
(f) у = С exp(x + x3/3)
(g) >' = [1п(1-х)-х-сГ1
(h) > = — ln(c — <?*)
(ι) tan j< + sec у = К ехр(— 2 cos χ)
20. (a) >- = exp(-x2/2) Joexp(i2/2)cosii/i
(b) у = (sec χ — cos x)/2
(c) ^ = x-exp(-x2/2)/Sexp(i2/2)i/i
(d) у = exp(/(x)) Ji exp(-/(i) - lit) Λ + exp(l/l -1)
where/(x) = (exp(l - ΐ)χ)\{\ - ί)
(e) у = ln(x + e - 1)
21. (а) у = -ex/2 + e~xl6 + e2x/3
(b) у = e'(coshV2t + smbVlt/Vl)
22. (a)
(b) a>'0
(с) α ^ 0
/*\ = ci exp(4 + 2ί)ίί j) + c2 exp(4 - 2ι)/ Γ')
(у1) =*«ρα +^)'(_ν^) +C2 exp(1 -V^'(v^)
a=0
Ы / ** X
\Λ/ V'C-Cji + Ci)/
афО
Г1) = cie'( -1 I + c2exp(l + V2a)iΙα/λ/2α I
W \ 1/ W
+ c3 exp(l - Vla)t I -a/V2a I
\-а/л/2я/
Uj =[(cJ + c1)/e, + Cie'] |0 + c2e'jl +сэ<?'(0
23. (a) ci = exp[(l - 0/2] _ _
(b) d = 1/2, c2 = -(1 -Л/2Ф0/4, сз =(-1 -л/2я/а)/4
(c) >ί = [(1 + Oexpfl - 0' + (1 - ')ехр(1 + 00/2
Уг = [(1 + 0ехр(1 - 0' - (1 - ОехрО + О' W
l5+Vu\
(d) /y^
W
9Λ/17- 17
: 34
exp(l+Vl7)i/2
4
5+Λ/Ϊ7
/
/5-Vn\
9Λ/17 — 17
34
exp(l-Vl7)i/2
4
5-Λ/Ϊ7
/
24. (a)
ft)—(-
(d) /M
(e)
+ c2e5
+ c2e4t 1 \ + c3e-
1
\y2) = Cl ^P^6 + /Зл/^' 2 + ί3λ/5
+ сг ехр(6 - ι3λ/5)ί 2 _ ,3л/5
(f) >ί = He exp(4 — 7ί)' + с ехр(4 + 7ι)ί)
1
уг = — (с ехр(4 — li)t — с ехр(4 + 7ι)ί)
(g) >ί = Re(c exp(3 - 2ί))
j<2 = Im(ci exp(3 — 2/))
j-3 = Re(c2 exp(l - 2i))
j<4 = Im(c2 exp(l — 2/))
(h) M\
I >21 = е'(сз ί 2/2 + c2 f + οΟΕΊ + <?'(c3 f + c2)£2 + c3 е'Яз
(0 M\
><2 = cie'Ei + c2 e'E2 + e'[(d - c2)f + d]E3
SECTION 3.7
26. y = e-*/2 jt2e' dt+ d + de'x
У = x2/2 - χ + ci + c2 e~x
27 (a) cie2' + c2e-2,-l/4
(b) dex+c3e-x-d-(^-\-6x)ll
(c) Cie-*+ с2<г2* +(sin x — 3 cos x)/10
(d) d ex + Cix
(e) dx2 + dx3 + (In x)(—x2 + x3)
28. j< = 4x2/9 + 5/9x + 2x2 In x/3
29. ;v = CiX + c2x Jexp[(l — t)e']/t2 dt
30. > = e_1{exp[(l - x)e*] + χ J" e' exp[(l - f )<?'] Λ} + χ - 1
Chapter 4
SECTION 4.1
1. x(0) = (cos Θ, 0, sin 0) 0 < θ < 2тг
1 +V5
2. χ(0) = —-— (cos θ, sin 0,1) 0 < 6» < 2π
3. θ = arc cos((-_' — 1) с > i parametrizes the curve in the upper half-plane
by taking 0 < θ < π; in the lower half-plane by taking — π < θ <0.
5. z(f) = a cos f<?""
= e"'b(-b sinl+i cos i)/(cos2 f + ft2 sm2 f)1'2
6. (a) (1-/,/)
(b) (1 + /, sm 1 - ( cos 1)
(c) (1 +/, 1 -/,/)
(d) (2a, at, t)
(e) (M,/)
(f) (1 + 2/,-2M +/)
7. (a) x axis
(b) ^ axis
(c) ^ axis
(d) x=y, z = 0
section 4.2
ds _ (1 + 2a cos θ + α2)1'2
' (a) dl= (1 + a cos Θ)2
(b) ~ = (5 + 4 cos 0)1'2
(c) (6a)i = (l-e-')/21/2
Λ (χ4 + cos2(l/x))1'2
(6d) ds = $ (a2 + cos2(0/2))1/2 dB
(6/)j=J(8i2+ l)1'2*
Г U
(7φ=ί -
(,2+1)l,2
dt
f+1
10. (a) aN=21'2e,=aT
4+18i2 (9^ + 16)1'2
W flr=(4 + 9,2)1/2sgiW,fl„=3|i| 4+9i2
(d) aw = [(1 - sin i)/2]1/2
ar = -[(l + sin/)/2]1'a
11. (a) aN = |sm f 1(2/2 + cos 2f )1/2
αΓ = —sin 2i/(2 + cos 2f)1/2
(c) ar = //(2 + i2)1'2, aN = [(t* + 5f2 + 8)/(i2 + 2)]1'2
SECTION 4.5
12. (a) y=-xy'
(b) x/ + y=0
(c) χ exp(-j</*/) = 1
(d) sin x(sin у + χ/ cos j<) = χ sin .y cos χ
(e) j< = exp(x + y) y'jy{\ + y')
(f) l+/=0
13. (a) ^=cx
(b) x + ^2/2 = c
(c) 2х + у2 = с
(d) х = сехр[-/у (7 +sin ί)-1]<#
(e) c(;y — 1) = csc(7r/4 — x)
14. x' + (y-c)2=c2
(y2-x2)y' + 2xy=0
(x + y)2 (y-x-ir ,
2a2 + 2«2-l _I
16. y2=cx
17. (a) x2-y2=c (b) y2-x2=c (с) у=х + с
1
18.
(а).СЬ)^-^ + 2ч/Здс,=с (с) y = -^-^x-^j
19. (a) x^=0 (b) ду=0 (c) ^=0 (d) sinx=0
(e) (x + >)j-=0
20. (a) y = ±l
(d) > = ±e*
(e) r = \,r = cos θ
(f) 0=O,r = ±l
SECTION 4.6
21. (a) xy=c (b) x2 + y2=c2 (c) xz = c,;yz=i/
(d) (x, y, z) = (ae\ b-t, ce')
22. (a) (l+'^)^ (b) в (с) (х,-хЧ2) (d) (l.^.Vl-z2)
23. (a) (-χ,ΐ,-z)
(b) ((1 + f)*, (1 + t)y, 2t(x2 + y2))l(\ + t)2)
(c) (0,1, -z tan 0
(d) (-*, -л -z(l + tan /))
24. (a) exp(f 2/2)(x0, у о, ζ0)
(b) (xo cos f — yo sin f, jo cos t + xo sin t, t + z0)
(c) (x0e~\ y0e~', z0e')
Chapter 5
SECTION 5.1
1. (a) |*| < 1/2 (b) |x| >1 (c) x<0 (d) -19<x<-17
(e) never (f) |*| < 1
2. (a) $|z| < 1$ (b) all $z$ (c) $\operatorname{Im} z > 0$
3. (a) both (b) both (c) both (d) integrated for all x,
differentiate for χ/2π not an integer
4. (a) f 2(n+l)(-x)2n
n = 0
(b) 2 2 nx-1
qo v2i + l
(c) 2
πΐΌ(2ι'+1)ι!
qo фАП+1
(d) 2(-l)"
„=o 4n+ 1
SECTION 5.3
5. (a) -e2x + xe2x + ex
(b) (SIA)e-x + 2xe-x + (\l4)e3
(c) e2x - 2xe2x + (5/2)xV*
(d) 3ex - xex
(e) - Re[(l - 2i)e'x + (-1 - i)xe'x]
(f) (6/5)<?* - (7/40)xe* - (1 /40)х<?-3*
(g) (l/2)[exp(21/2x) + exp(-21'2x)] - e~
x* x1 x10
9. х+т^+^Г,+
12 504 40360
X3 JC8
10·Τ + 20Ϊ6
2ft,+i ft,
11. (a) a„+2 =
η + 2 (η + 1)(л + 2)
3*
1*,|<-г
л!
2α„+ι α„-ι 1
(b) й-+2=77^-Л, , 1V_ , ,-» + -}
|0ηΙ<
η + 2 (/ι+ΙΧβ + 2) η!
4ΛΓ
[η/2)»
-ak
(d)
(e)
SECTION
14. (a)
(b)
(c)
(d)
(e)
(f)
(g)
(и + 1) · · · (η + к)
К
kl<-
ап +
к2а„
(η + 2)(η + 1)
К
\а«\<-
Оп-1
^ „+1
Ы*ЫЫ
5.5
оо 72п
Σ —
„to n!
- (i + On-(i-O"
Ζ ... г"
αο
Σ
π = 0
■[»/2](_ΐ)*-
Λ (2/t)!
ζ"
α> ν2η+1
Σ
»ΐο(2η+1)η!
00
Σ
π = 1
π J-1 1
ν ν
A(foy(„_y)!,t
ν"
Λ
00 7·2Π
У —
„to (2η)!
00
Σ
π = 0
ζ2η + 1
2η +1)!
16. cos(z + w) = cos z cos w - sin z sin w
cosh(z + w) = cosh z cosh w + sinh z sinh w
sin(z + w) = sin z cos w + cos z sin w
sinh(z + w) = sinh z cosh w + cosh z sinh w
SECTION 5.7
17. α„{ Σ ϊΓΤ^τ-, Π IW + 1) - CM/ + l)]*2" + 1
\n=i(2n+ 1)! j-o
1 "=i
+ «ι Σ
(2и+1)1
Π [*<* + 1) - (2/ + 1)(2/ + 2)]χ2"+1 + *
18. α0 Σ
2" "-1
(2η)! ;=ο
χ2"+ 1
+ «1 Σ
2" "-1
-—ПСУ+1-А)
(2η)! j=o
χ2+1 + χ
19. (а) я0 = 1, αϊ =2, α„+2
1 _ α,
:(n+l)(n+2)1+f=„7!
(b) y=x2/4ory=0
(c) по solution
(d) у = (2x2 - с)- = -(1/ci) Σ (2/d)"x2
20. (a) 10
(b) 460
(c) 3
21. (а) к odd integer, a0 = 0 or к even integer, αϊ = 0
Chapter 6
SECTION 6.1
1. (а) /(0) = 7г2/3
/(n) = (-l)"4
η
(b) /(5) =/(-5) = 1/32
/(3)=/(-3) = 5/32
/(l)=/(-l) = 10/32
/(n)=0 allother η
(-1)" e1"" - <r "■" (-1)" sin(fiTr)
(c) /(n) = — = , μ not an integer
2πι μ — η
μ — η
(d) /(0) = π/8
/(л)=0 rfn = 4fc
= l/πη2 if η is odd
= 2/ττη2 !fn = 4k + 2
(e) /(n)=0 η odd
= 2/тг(1 — η2) η even
(/) /(1)= (1-0/2
/(-l)=(l + ,)2
/(n)=0 all other η
(g) /(0) = 1/2
/(n)=0 η odd or n=4&, A:^0
/(η) = -2/τΓΐη „=4Α: + 2
(h) /(„)=(-iy
(0 /(") =
/>Я д — Я
2тг(1 - m)
2 + 2me"(-l)n
1 + n2
2. (a) (Re z)3 + (Im z)3
2 « /_iy
(b) 3 Σΐ-g-l (rVM + ^
e-2i»\n
1 « sin ρ . . , „
(с) - Σ (-1)"—ί-r'V"1'
7Γ π=-α> μ- — П
π 1
(d) - + -
о я
2
(е) - Re
7Г
α> „Ι2Π+1Ι J oo (-212П + 11
„έ«,(2η+1)2" ' 2тг„=еа)(2п+1)2
КН"й]
(f) (i + *)2
SECTION 6.2
3. (a) -2 + ;Imlnll-zj
(b) ~(z
2 + ζ"2)
2π2 ^, r""i?
(с) — + 2Σ(-1)" —
3 π *0 Π
1 ^
ч Sin μπ
π „^-ю μ — η
(d) - 2 (-1)""-^—(-"V"9
(e) (1 + ζ)2
4. (a) r sin θ + r2 cos 20
(b)
I H"1 /V
7г„^о |nl \n2
2(-l)"-2\
(-1)- + ^—),
1
2 oo z2n+i
(c) - Im 2 -——— + -
π „=o (2n+ l)2 2
SECTION 6.3
5. (a) (35/2 +28 cos 26»+ 14 cos 40 + 4 cos 66» + (1/2) cos 80)/64
(b) 2/ДI . I(- 1)J sin(A: - 2;)0 к odd
2?o П(- DJ cos(A: - 2j)6 + (-1)" /Μ л even
1 2 » sm(2n + 1)0
(c) г + " Ζ , , <—
ζ 7Γπ = ο ζη+1
1__2 » sm(4A: + 2)0
( ) 2 π „t-o 2/fc + 1
(e) (cos 50 + 5 cos 30 + 10 cos 0)/16
6. See Problem 27, Section 6.5
7. cosine series
sine series
(a) 1
(b) (1 - cos 4тгх)/2
(c) cos(2itx)
4 " sm(nnx)
π π = ι Π
sin(27rx)
4 » (2n + l)sm(2n + 1)πχ
π,ίΌ (2n+3)(2n+l)
H, ! , 2 ν (-l)ncos(2n+l)7rx 2 - 1
(d) 2 + ^Jo (2^ΓΪ) i^-C^+D™
+ 2 sm(4n + 2)тгх + sm(4n + 3)πχ)
(e) (1 - cos 2тгх)/2 sin πχ
1 8 ^ cos(4n+2)7rx 4_ ^ sin(2n+l)7rx
u 4 π2 „и (4„ + 2)> π* Λ( r (2n+l)2
(g) (1 + 2 cos πχ — cos 2тгх)/2
4 « 2n + 1
2 sm πχ + - 2 ,, , ,v, rr sm(2n + ί)πχ
π „ = o (2n + 3)(2n— 1)
8. /(^) = [/(&) +/(-0)]/2 + [/(в) -/(-0)]/2
9. 2 2 [Л2„ cos(2n)0 + Я2п+1 sin(2n + 1)0]
n = 0
SECTION 6.4
.„ , ч »/sin(7rn-l) sm(7rn+l)\
10. (a) > I — — I cos nnt sin πηχ
π = ι \ πη — 1 7ГП+ 1 /
1 64 « η2 ...
(b) - sin nt sm πχ + — 2 ZT^—»w. ,—ττ cos 2πη( sin 2πηχ
π π „ = i (4n2 — 9)(4n2 — 1)
—8 °° 1
(c) "Τ Σ „ , ,чз cos(2" + !)"' sin<2" + 1)7Γ*
πό Л = о \2п + I)
1 8 £ "
(d) - sm ττί sin тгх + - 2, η—7 cos 2π"'sin 2π"χ
4 ' ττ π „ = .ι 4η2 — 1
128 « η(η2-2)
СеЧ У (— 1У-1 : r Sin πηί Sin πηΧ
Κ) ττ2 Α Κ ' {An1 - 1)(4η2 - 9)
» / — 7Γ2η2Λ l™x\l
"■ (a) I^bzrM'nl
ττ2η2ί\ /7rnx\/sin(7rn —L) sin(7rn + L)\
I + 1)πχ
-8L2 » /-·ττ2ί \ 1 (In
(c) —— Σ exp —— (In + l)2 sin —
π „=o \ 4i/ J (2n+ 1)
/-7г2Л 7ГХ /-25тг2А
(d) exp^—js.n-^ + Sexp^-jsi
5πχ
sm-^—
2 —4e » sin(2n+Ιπ-χ)
12. (c) 1 + (e - \)x + 2 ехр(-тг2(2п + l)2f) \ }
7Γ п = 0 2П+ I
« n(2-e(-l)n)
+ Σ exp(—ττ2η2ί) — — sin πηχ
n = l 1 + ГГ + 7Γ2
14. (a) - exp
/—7Г2Л 7ГХ
[-UFJ sin Τ
2L » /-irVA/ 1 \
2πηχ
sm-
L
15. The general solution is of the form
f (A sin(n2 + l)1/2f + B„ cos(n2 + l)1'2) sin nx
Fl = l
where the A, ft are determined by the sine series of the initial data.
16. On the interval [—π, π]
Σ (е2'"'-1)(А„ + В„еш)
where the A„, Вп are determined by the Fourier series of the initial data.
SECTION 1
17. (a)
(b)
(c)
(d)
5.5
27tt/64
4 ^, ηζ
7Γ Λ („2 _ μ2)2 "
2тЛ1 + Д 2r2" J
2яг
2μπ — sin 2μπ
2μ
1 + Г2
1 — г2
1
(e) 2π 2
п = 1 П
2тг5 » l
(f) ^ + 16-2^
SECTION 6.6
19. (a) span ofexp(3;0), exp(-5i0)
(b) span of exp(±3i0), exp(±i0)
(c) span of e±'"
20. (a) (sm 56» + cos 50)/576 + Л sin 0 + Я cos 0
2тг2 (—11V"9
(b) ^7 + 2 2 ( }
27 „to n2(9 - n2 + 6/и)
(с) — exp(cos θ) χ m Ч
SECTION 6.7
21. (а) 13тг/4
(b) 0 Ar<n
Chapter 7
SECTION 7.1
1. (a) [—у sin χ + ζ cos(zx)] dx + cos xdy+ x cos(2x) dz
(b) — [<?* +y sin(e* +y) + e> sin(xey)] ^ - [<?* +y sm(x + j<) + xe" sm(xey)] dy
(c) <?<*·"> <Ae, a>
(d) <Ле, <?<*·<■>> + <x, <?<*·■> ><Λ,α>
(e) (2x + ζ) φ- + 2j< i/y + x dz
(f) e*+y[l + x-;y]i/x+i?*+y[x-;v-l]i/.y
А(п*'У
(g) Σ Π*' )dxJ
2. (a) 2/1000 (c) 2/1000 (e) 2/1000e
(b) l/1000e (d) 2/5000 (f) l/1000e
3. ||p 1Г/Ю00 if ||p || < 2, \\p Ц/500 if \\p || ^ 2.
SECTION 7.2
4. (a)
(b)
/ e" xey\
[ye1 e*)
хуф\
χ1, χ2, χ3 all different
(c) /2x -2y\
\y x) <*»*°
(d) / 2x 2y 2z \
i-y/x2 1/x 0 J хфО
\-z/x2 0 l/x/
(e) / 1 0 0
-хг1(х1)г l/x1 0
-x3/(x')2 0 l/x1 0
-x"/^1)2 0
x^O
χ" — ^4- +0 ;
д(х\ ..., Их")
χ" φ 0 and the hu ...,h„ coordinates.
К
2 n
5. -. 2 и' du' + 2
tt1 f = 2
1 а («О2
и1 — У —-
1 = 2 Ui
du1
6. (a) duju
wv du
(b)
(c)
l + w2 — v2 l + v2 — w2
' + 7T-;—г-;—κϊ ""'* + ;
1 + υ2 + w2 (1 + w2 + υ2)2
(1 + w2 + υ2):
vudw
1 1+υ+ w
2 [«(1 + υ2 + w2)]
7. с/2 с = fixed edge
Tjidu +
u1/2w(w — v)
*i*+
u1/2v(v — w)
(l+v2 + w2)3'2 (1 + v2 + w2)3'2
<Λν
section 7.3
9. (a), (b), (d), (g), (ι) are closed
(c), (e), (f), (h), (j) are not closed
10. The real part is exact, the imaginary part is not.
11. (a) e> (c) -\tf (e) exp(-x-y)
(b) e\y (d) 1/jc (f) l/cosx
SECTION 7.4
12. (a) 0
(b) 1
(c) -ecos 1 -02sm l)/4 - e2/4 + 5/4
(d) -f-f2/2
(e) -a2/2 - 2k2a5/5
(f) 21/2
13 (b), (c), (f) are conservative
(a), (d), (e), (g) are not conservative
SECTION 7.5
14. (a) 0 (b) 1
15. (a) 0
7Γ
W Tab
(c) 18
"1 - Γ
Ρ α2.
(d) e6(cos2 2)(sm 2)
(e) π
16. (a) 1 + In 21'2 - 2"1'2
(b) Let sm 0o = (51'2 - l)/2 The area is (3 sin 2θ0 + θ0- sin3 θ0)/6(2γι2
(c) 16
(d) If η is even there is no inside. If η is odd, the area is
(^)T
SECTION 7.6
17. (a) Set (d) (e**-l)/4
(b) oo _ (e) 4/3
(c) (2тг + 5л/3)/3
18. (a) 2 (b) l-2f-3f2 (c) 2x + 2;y (d) 0
SECTION 7.7
19. (а) 21/27г (b) Зтг/23'2 (с) e'[5i - 3]/6 (d) πι cos(l/2)/2
(e) 2тг/3 _ (f) 2тг/(1 - a2)1'^ (g) _ne~°(l + а)12с?
(h) 7r[cos(V2/2) + яп(л/2/2)]/л/2 ехр(-л/2/2) (0 т/3 (j) тг/e
(к) тг[2 sm(7r/10) + 2 δΐη(3π/10) + 1]/5
Chapter 8
SECTION 8.1
1. (а) со (b) 2π (с) тг/2 (d) Λπ/ЗаЬс
2. (а) 2тг/495 (b) 0 (c) Let A =a~\ B = b~\ The integral is
π(ΑΒ)1Ι2[-5(Α3 + В3) - ЪАВ{А + В) + 2ДА2 + 6АВ + 24£2]/3(2)6
3. /bra2r6/6
4. 4тг(1п2-1/2)
5. (а) (уе~г + tz - t2x, y(l - t1) + (t - l)(ze· - txe1),
-z(l + f2) + (1 + /)(* ~ tye'))
(b) -3i2
(c) fcfl-/1)-1
(d) oo
6. (a), (c) are incompressible.
7. (а) 4тг/3 (b) 4тг/3 (с) 8tt/3
SECTION 8.2
9. (a) (<?'(1 - f)+ <?""'('+1), -l.'eO-O-*"")/!-'3
(b) (1,-1,1)
(c) -(1,1,1)
(d) (-l,2z,0)
(e) (e''2(l + 0/2, e'(l - 00?'(1 - 0 + 1), e"2(2 - t))
(f) (0, 0, у sin 0
11. HM = (a'j), div=ai1 + a22 + fl33
curl = (a32 — агъ, л3 — α3\ аг1 — fli2)
12. О
SECTION 8.3
13. (b) dS>=(l+fx2)dx> + 2fxfydxdy + (l+tf)dy>
dS = (\+tf+f,iyi1dxdy
14. Tangent plane
(a) <p, (1, -2j<, -2z> =0 (1 + 4/) φ-2 + Syz dy dz + (1 + 4z2) ife2
(2x2 + y*) dx* + 2xy dx dy + (x2 + 2/) dy2
(b) <p, (-*, ->-, z)> =0 ——
(с) </>, (-2x, -2у,1)У=0 (1 + 4x2) dx2 - 8xy dx dy + (l + 4y2) dy2
Area element
(a) (1 + Ay' + 4Z2)1'2 dy dz
(b) 21/2dxdy
(c) (1 + 4x2 + Лу2)1'2 dx dy
15. (а) 4тг/31/2
16. cos θ = (2v + lu + uv)/(l + Ли2 + v2)1/2(l + и2 + 4υ2)1/2
18. (а) 2тг[(1 + а2)3'2 - 1]/3
(b) π J"_„ (1 + sin2 и)1'2 ί/й
(d) 2тг
4 +
1 /л/4 + л/3\
λ/ϊ \л/4-л/3/
SECTION 8.4
19. (а) г1'2^ (b) О
20. (а) е-2 (Ь) -4тг/3 (с) 0
SECTION 8.5
25. (а) 9п/2(аЬУ'2 (Ь) тг/З^)1'* (с) 1/18
28. (а) 0 (Ь) -4тг/3
INDEX
Absolutely convergent series, 140
Absolutely integrable, 197
Abstract vector space, 107
Acceleration, 254
normal, 335
particle, 335
tangential, 335
Addition
in the plane, 21
in Rn, 28
Adjoint matrix of an entry, 72
Adjoint of a linear transformation,
123
Algebra
of linear operations, 59
of n × n matrices, 60
Analytic continuation, 450
Analytic function, 400, 441, 534,
584
Angular velocity, 626
Annihilator of a subspace, 122
Approximation, 126
Arc length
definition of, 331
on a surface, 643
(Problem 22), 258
Area, 572
surface, definition of, 651
Argument, 87
Axiom of the least upper bound,
134
Ball, 111
Basis, 47
Bessel's inequality, 497
Bilinear function, 122
Binormal, 359
Biotic matrix, 253
Boundary of a set, 123
Cannes, 185
Cartesian product, 16
Cauchy
uniformly, 204
theorem of, 205
Cauchy criterion, 132, 401
Cauchy formula, the disk, 510
Cauchy integral formula, 585
Cauchy theorem, 583
Cauchy-Riemann equation, 432
Cayley-Hamilton theorem, 76
Chain rule, 168, 229, 233, 540
Change of basis, 82
Characteristic polynomial
of L, 277, 411
of M, 76
Circular motion, 338
Circulation of a flow, 626
Clairaut's equation, 372
Closed, 112, 156
Closed and bounded, 158
Closed path, 555
Closure of a set, 123
Coefficients
Fourier, 454
Fourier cosine, 481
Fourier sine, 481
real Fourier, 476
Taylor, 246
Cofactor, 65
expansion, 72
Collinear, 24
Compact set in Rn, 156
Comparison test, 147, 198, 402
theorem of, 147
for integrals, 198
Complex derivative, 428
Complex eigenvalue, 89
Complex number (Section 1.8), 85
argument, 87
modulus, 88
polar form, 87
Compound interest, example of, 251
Conditionally convergent series, 140
Connected, pathwise, 557
Connected set (Problem 78), 224
Conservative field, 557, 632
Constant coefficient
differential operators, 410
equation, homogeneous, 262
linear differential equation
(Section 5.3), 410
linear differential operator,
characteristic polynomial, 411
Contained in, 16
Continuity (Section 2.5), 159
Continuous, 160, 201, 206
Continuously differentiable, k-times, 240
Contraction, 213, 268
lemma, theorem, 213
Convergence
Cauchy criterion, 154
in Rn, 153
mean square, 496
of a sequence, 131
of a series of functions, 401
uniform, 203
Convolution (Problem 25), 508
Convolution transform, 520
Coordinate axes, 18
Coordinate particle (of a flow), 612
Coordinate relative to a basis, 108
Coordinate space (of a flow), 612
Coordinates
cylindrical, 536, 635
in R3, 94
polar, 535
spherical, 536
Coplanar vectors, 93
Counterclockwise, 577
Cramer's rule, 72, 74
Curl
of a flow about n, 626
of a vector field, 630
Curvature
geodesic (Problem 24), 655
lines of (Problem 65), 693
normal, 692
(Problem 25), 656
of a curve, 336
principal (Problem 62), 692
Curve, 313
binormal, 359
curvature, 336, 352
Frenet-Serret formula, 360
implicitly defined, 317
length of, 331
moving trihedron, 359
normal
line, 350
plane, 359
vector, 351
of a minimal length, 646
osculating plane, 350
parametrization of, 313
piecewise continuously differentiable, 555
principal normal, 335
rectifying plane, 359
tangent line, 324
torsion, 360
unit tangent, 322
Curves, family of, 365
Density, 183, 613
de Rham's theorem, 566
Derivative, 166
directional, 192
partial, 187
Determinant, 61
cofactor expansion of, 67, 69
Developable surface, 642
Diagonalization, 77
Differentiable, 166, 228
definition of, 166
function on Rm, 527
Differential, 193, 527
Differential equations (Section 3.3),
250
of a family of curves, 370
of the tangent family of a vector
field, 383
on the circle (Section 6.6), 505
Differential form (Section 7.3), 547
closed, 548
exact, 548
radial (Problem 27), 573
Differential operator, linear, 276
Differentiation, 166
complex (Section 5.6), 428
partial, 187
Dimension, 40
of an abstract vector space, 107
Direct sum, 122
Directional derivative, 192
Dirichlet principle, 674
Dirichlet problem, 471
on the disk, 469
on the half plane, 524
Dirichlet theorem, 674
Dissipative functions, 597, 682
Distance
in C(X), 203
in Rn, 111
mean square, 496
Divergence of the flow, 582, 619
Divergence theorem (Exercise 27),
673
in R2, 581
in R3, 670
Domain of a function, 17
regular, in R2, 568, 571
in R3, 662, 668
∈, ∉, 16
ε — δ criterion (Proposition 13),
160
Eigenspace, 85
Eigenvalues, 76
of a skew symmetric matrix
(Problem 32), 302
of a symmetric matrix (Example
8), 236
Eigenvectors, 76
of linear systems of differential
equations, 280
Elementary matrices, 32, 63
Elementary transformations, 32
Empty set, ∅, 16
Energy
conservation of, 612
integral, 674
kinetic, 501
potential, 557
Envelope of a family of curves, 376
Equation
heat, 467, 489
Laplace's, 467
of continuity, 617
wave, 482
Equations of motion, 311
Equations of a particle, 335
Euclidean inner product, 114
in Rn, 111
in R3, 97
Euclidean vector space, 113
Expansion
Fourier, 454
Laurent (Problem 38), 601
Taylor, 246
Exponential, 262
of a matrix, 279
(Problem 93), 226
Exponential function, definition of
(Proposition 25), 210
Exponential order, 521
Family of curves, 365
differential equation of, 370
envelope of, 376
explicit form, 366
implicit form, 366
orthogonal family, 374
tangent to a vector field, 383
Fetah, 253
Field, 306
conservative, 557
First fundamental form on a
surface, 643
First-order linear equations, 264
Fixed point, 209
Fixed point theorem (Section 2.11),
211
(Theorem 2.15), 213
Fluid flows, 385
circulation, 626, 660
curl about n, 626
divergence of, 582, 619
equations of motion, 310, 611
incompressible, 618
irrotational, 631
steady, 612
velocity field, 612, 385
Flux of a field, 660, 667
Force field, 255, 306
Fourier coefficient, 454
real, 476
Fourier series (Chapter 5), 452
real, 476
sine, 476, 481
cosine, 476, 481
transform, 523
Frenet-Serret formula, 360
Frenet-Serret frame, in Rn
(Problem 66), 398
Fubini's theorem, 177, 184, 189
Function, 17
analytic, 400, 441, 534
continuous, 160, 201
differentiable, 527
dissipative, 596, 682
domain of, 17
infinitely differentiable, 443
flat, 443
invertible, 17
harmonic, 468
linear, 1
Lipschitz, 269
meromorphic, 616
odd, 478
one-to-one, 17
onto, 17
periodic, 452
range, 17
Schwartz test, 523
Fundamental existence and
uniqueness theorems, 271, 273
Fundamental theorem of algebra,
61, 86
(Theorem 5.2), 408
Fundamental theorem of calculus,
172
Gamma function (Problem 48), 608
Garib, 253
Gauss' theorem (Proposition 11),
681
Geodesics, 645
curvature (Problem 24), 655
Geometric series, 137
Gradient, 193
Gram-Schmidt process, 117
Great circles, 645
Greatest common divisor, 410
Green's identities (Theorem 8.5),
676
Green's function (Definition 13),
681
for a half space, 682
for a ball (Problems 40 and 41),
683, 684
Green's theorem, 567, 571
Harmonic functions, 468, 676, 677
Harnack's principle (Problem 36),
518
Heat equation (Section 6.4)
in R2, 482
in R3, 671
Helix, 315
Homogeneous constant coefficient
equation, 262
Hooke's law (Problem 17), 256
Hyperbolic cosine, 424
Hyperbolic sine, 424
i, 85
Implicit function theorem, 216
Implicitly defined curves, 317
Incompressible flow, 618
Independence, 43
Index, 10
Infinitely differentiable, 166, 443
flat, 443
Inner product
in an abstract vector space,
114
in Rn, 111
in R3, 97
Integrable function, 174, 179
absolutely, 197
Integral
definite, 170
Dirichlet's, 519
energy, 674
improper (Section 2.9), 195
indefinite, 169
iterated, 177
multiple, 173
of a vector function, 259
on a surface, 657
test, 199
Integrating factor, 549
Integration, 170
formula for change of variable,
614, 579, 621
multiple (Section 2.7), 173
of differential form, 563
Intermediate value theorem, 162
Intersection, 16
Inverse function theorem (Theorem
7.2), 541
of a function, 17, 168
Invertible function, 168
Invertible matrix, 61
Irrotational flow, 631
Isolated singularity, 592
Iterated integral, 177
Jacobian, 537
Jordan canonical form, 81
Jump discontinuity (Problem 30),
517
k-times differentiable, 240
Kepler's laws (Problem 68), 399
Kernel
of a linear transformation, 53
Poisson, 458
Kinetic energy, 501
\\ (Problem 89), 225
L (Problem 90), 225
Lagrange multipliers, 234
Laplace transform, 521
Laplace's equation, 467, 672
Laurent expansion (Problem 38),
601
Legendre's equation, 450
polynomials (Problem 60), 450
Length, 203
of a curve, 331
Leibniz's theorem, 141
Limit, 165, 196
Linear differential equations
(Section 3.6), 275
first order, 264
second order (Section 3.7), 289
systems, 278
Linear differential operator, 276
Linear function, 1
Linear span, 40
Linear subspace, 40
basis, 47
dimension, 40
Linear systems of differential
equations, 278
Linear transformation, 28
adjoint, 123
eigenspace, 85
eigenvalue, 77
complex, 89
eigenvector, 77
kernel, 55
nullity, 53
range, 53
rank, 53
self-adjoint, 125
spectral theorem, 125
Lines of curvature (Problem 65), 693
Lines of force, 309
of a fluid flow, 386
Liouville's formula, 655
Liouville's theorem (Proposition 7),
587
Lipschitz function, 269
Local, 112
Logarithm (Example 17), 247
Mass, conservation of, 612
Matrix, 8
change of basis, 82
characteristic polynomial, 76
column index, 8
diagonal (Problem 22), 39
diagonalization of, 77
eigenvalue of, 77
eigenvector of, 77
index (Definition 1), 10
invertible, 61
Jordan canonical form of, 81
multiplication, 31
orthogonal (Problem 29), 289
row index, 8
symmetric, 123
transpose, 123
Maximum principle, 449, 586
for analytic functions, 512
for harmonic functions, 472
Maxwell's equations, 632
Mean square convergence, 496
Mean square distance, 496
Mean value property (Proposition
2), 471
Mean value theorem, 166
Meromorphic function, 610
Mixed partial derivatives (Theorem
2.13), 189
Modulus, 88, 111
Morera's theorem (Problem 55),
609
Moving trihedron, 359
Multiplication, matrix, 31
Multiplicity, 409
Neighborhood, 111
Neumann's problem, 473, 675
Newton's gravitational potential,
679
Newton's law, 255, 340
Newton's method, 213
Normal
to a curve in R2, 335
to a surface, 655
Normal acceleration, 335
Normal curvature (Problems 25 and
62), 656, 692
Normal line to a curve, 350
Normal line to a surface, 657
Nullity, 53
One-to-one, 17
Open set, 111
Operator, integral, 507
Order, 187
exponential, 521
Orientation
in R2, 579
in R3, 614
of a curve, 321
of a surface, 658
of the boundary in R2, 567
of the boundary in R3, 666
of the boundary of a curve on a
surface, 661
Oriented path, 555
Origin, 18
Orthogonal, 97, 111
Orthogonal curves on a surface,
644
Orthogonal family to a family of
curves, 374
Orthogonal matrix (Problem 29),
289
Orthogonal projection, 113
Orthonormal functions, 491
Orthonormal set of vectors, 114
Osculating circle, 357
Osculating plane, 350
Pain, 435
Parametrization, 313, 321
Parseval's equality, 498
Partial derivative, 187
Partial differentiation, 187
Particle motion, 254, 335
Partition of unity, 444, 451
Path
closed, 555
integral, 562
of motion (of a flow), 612
oriented, 555
Period of a harmonic function
(Problem 46), 608
Periodic function, 452
Permutation, 68
even, odd, 69
interchange, 68
Picard's theorem
(theorem 3.3), 271
(theorem 3.4), 273
global version (Proposition 9),
286
Plane
in R3, 97
with chosen point, 20
Planetary motion, 390
Plane geometry, 18
Poincaré's lemma, 564
Point on a surface (Problem 67)
elliptic, 693
hyperbolic, 693
parabolic, 693
Poisson kernel, 458
Poisson transform, 458
Polar coordinates, 535
Polynomial functions, 406
Population, 252
Positive integers, P, 13
Positively oriented coordinates, 658
Potential energy, 555
Potential function, 555
Power series, 149
addition and multiplication
(Proposition 3), 426
and Taylor expansions, 246
radius of convergence, 150, 421
Principal curvatures (Problem 62),
692
Principal normal vector, 335
Principle of Mathematical
Induction, 13
Product, matrix, 31
Projection, 113
Properties of analytic functions
(Theorem 7.10), 588
Radius of convergence, 150, 421
Range, 17
Rank, 53
Ratio test, 151
Rational numbers, Q, 15
Real numbers, R, 15
Rearrangement of series, 143
Rectangle, 16
closed, 173
volume of, 173
Rectifying plane, 359
Regrouping of series, 143
Regular domain, 568, 571, 662, 668
Residue theorem, 592, 593
Riemann integrable, 170
Rn, 16
addition in, 28
linear subspace of, 40
scalar multiplication, 28
Rodrigues's formula (Problem 65),
693
Root, 409
Root test, 151
Roots of unity, 406
Row operation, 9
transformation corresponding to,
29
Row-reduced matrix, 9
Row reduction, 7
Scalar multiplication
in the plane, 20
in Rn, 28
Schwartz test functions, 523
Schwarz's inequality, 120, 223
Schwarz's lemma (Problem 60),
610
Second fundamental form (Problem
63), 692
Second-order linear equations, 289
Self-adjoint transformation, 125
Separation of variables, 260, 290
Sequence, 129
convergence of, 131
subsequence of, 130
Series, 137
absolutely convergent, 140
comparison test, 147
conditionally convergent, 140
convergence, 137
Fourier cosine, 481
Fourier sine, 481
geometric, 137
of functions, 401
power, 149
ratio test, 151
root test, 151
Taylor, 246, 509
with positive terms (Proposition
4), 139
Simultaneous linear equations, 2
homogeneous system, 11
Singular solution, 373
Skew-symmetric matrix (Problem
32), 302
Solid angle (Problem 34), 673
Spectral theorem for self-adjoint
operators, 125
Spherical coordinates, 536
Steady flow, 385
Stokes' theorem, 662, 666
Straight line, 24
Subtraction, 22
Successive approximations, 266
Surface, 635
arc length, 643
area, definition of, 651
definition, 635
developable, 642
elliptic point, 693
first fundamental form, 643
geodesics, 645
hyperbolic point, 693
normal, 655
orientable, 658
orientation, 658
orthogonal curves, 644
parabolic point, 693
patch, 635
second fundamental form, 692
tangent plane, 640
Symmetric bilinear form, 123
Symmetric matrix, 123
System of coordinates, 534
Tangent to a curve, 322, 324
Tangent plane
to a curve, 350
to a surface, 640
Tangential acceleration, 335
Taylor expansion, 246
Taylor's formula (Theorem 3.1),
242
Tests
comparison, 147, 198
integral, 199
ratio, 151
root, 151
Topological terms, 111
Topology, 575
Torsion, 360
Transform
Fourier, 523
Convolution, 520
Laplace, 521-22
Poisson, 458, 524
Transpose of a matrix, 123
Triangle inequality, 120
Trigonometric functions, Taylor
expansion, 246
Trigonometric polynomial, 454
Uniform convergence, 198, 204,
205
Union, 6
Unity
nth roots of, 406
partition of, 444, 451
Variation of parameters, 295
Vector, 20, 28
addition in R2, 21
addition in Rn, 28
field, 306, 381
independent set, 43
in the plane, 20
product, 100
subtraction in R2, 22
Vector field, 306, 381
curl, 630
flux across a surface, 660
radial, 560, 624
divergence, 581, 670
Vector space, 107, 108
Velocity, 254
of a particle, 335
of a fluid flow, 385
field of a flow, 612
Volume, 173
Wave equation, 482
Weierstrass approximation theorem
(Problem 6), 467
Work, 554
Wronskian, 293