

ADVANCED CALCULUS
ADVANCED CALCULUS PROBLEMS AND APPLICATIONS TO SCIENCE AND ENGINEERING Hugo Rossi Brandeis University W. A. BENJAMIN, INC., New York ■ 1970
ADVANCED CALCULUS Problems and Applications to Science and Engineering Copyright © 1970 by W. A. Benjamin, Inc. All rights reserved Library of Congress Catalog Card Number 70-110973 Manufactured in the United States of America 12345R2109 W. A. BENJAMIN, INC., New York, New York 10016
PREFACE

During the 19th and early 20th centuries the curriculum in advanced mathematics centered around the Cours d'Analyse: the course in mathematical analysis. This three-or-more year study was a catalog of the concepts, techniques, and accomplishments of the calculus throughout mathematics and physics. During this period we saw first the emergence of differential geometry and complex analysis as separate disciplines, then the development of linear algebra, group theory, and other branches of algebra, and the beginning of an intensive research into the foundations and formulation of mathematical concepts and techniques. During this time the Cours d'Analyse fattened, as more accomplishments of calculus were added to the catalog.

The mathematical research of the late 19th and early 20th centuries brought about a revolution in mathematics which not only opened up broad new areas but changed the basic approach to the subject itself. Another level of abstraction was attained, from which it became possible to scan large areas of mathematical research, observing new relations and interconnections with a more profound understanding and exposing new frontiers of discovery. What is more, it became necessary for all scientists to attain this new level.

By the mid-1920's it was apparent that the Cours d'Analyse was massively unwieldy as well as out-of-date. Thus it fragmented into a collection of smaller disciplines; some remained (calculus, differential equations, differential geometry), others disappeared and were replaced by courses in more recent mathematics (point set topology, algebra, potential theory, integration theory). A piece of advanced calculus, which was important but essentially unchanged, remained (series expansion, vector calculus, partial differential equations, calculus of variations). This was the course in advanced calculus.

However, research in mathematics during the past forty years has been extensive in these particular subjects. The intensive and far-reaching developments in the study of differentiable manifolds and partial differential equations have cast advanced calculus in a new and important light. It was then clear that geometry and algebra form two important cornerstones for advanced calculus. In 1957, Nickerson, Spencer, and Steenrod wrote a new advanced calculus textbook which was in effect an introduction to the techniques of modern analysis. This book bore little resemblance to the existing texts in the subject, and was not successful in replacing them. However, it made the others obsolete; every text written since then must reckon with the Nickerson-Spencer-Steenrod conception of advanced calculus.

In 1963, I taught from that text to a class of exceptionally brilliant students at Princeton University. I believe that course was successful and influential for those students. As the text has no illustrative material, I developed a set of notes of "classical" advanced calculus which we used as a supplement to the text. This was the beginning of the present textbook of advanced calculus. I began to feel that indeed algebra, geometry, and topology are cornerstones of modern mathematical analysis, but so is "classical" advanced calculus. I decided that we needed a bridge between freshman calculus and modern analysis which leaned heavily upon the techniques of algebra and the concepts of geometry. This text is an attempt at such a bridge.

In 1967, I taught from a preliminary edition to a class of physics-motivated juniors, and in 1968 I taught from what is essentially the present text to a class of sophomores.
These two classes have had a profound influence on the development of the text and I am deeply indebted to them for assistance in matters of style and pedagogy. Needless to say, I did not complete the entire text in either year. As a text for juniors we covered a now-extinct chapter on metric spaces and Chapters 4-8, and in the class of sophomores we covered Chapters 1, 2, 3, 5, and 7. I feel that either of these is an adequate year's course, depending upon the structure of the preceding and subsequent courses.

This text assumes only a course in calculus that includes analytic geometry, multiple integration, and partial differentiation. These topics are speedily reviewed in Chapter 2 with a view to setting up the style of the present text as well as indicating the more valuable facts and concepts of calculus. Chapter 2 includes the abstract formulation of the technique of successive approximations; this is, of course, the basic theoretical tool in advanced calculus.

Chapter 1 is hardly a course in linear algebra; it is rather a tour through those algebraic ideas and techniques which are essential to analysis. It is a large chapter and it is very likely that the student will become anxious to return to his analytic tools before the chapter is completed. It can thus be split up: Sections 1.3-1.8 are relevant to Chapters 3 and 4, because the topic of differential equations is consistently handled in the context of systems. The last four sections can be postponed until the student reaches Chapters 6-8, for it is with the geometric study of Fourier series that they begin to be relevant.

Similarly, Chapter 2 can be broken up. Sections 2.6-2.9 are completely review material and can be omitted altogether if that is suitable. If the proof of Picard's theorem is omitted or delayed, this chapter is not relevant until Chapter 5. Thus Sections 2.1-2.3 could be done just before beginning Chapter 5. Sections 2.4, 2.5, 2.10, and 2.11 are of a purely theoretical nature and can be kept aside until Section 3.5 is studied, or until the integral calculus in several variables is begun (Chapters 7 and 8).

Chapters 3-5 constitute a little course in ordinary differential equations. Since the study of curves and some complex variables are relevant to this topic, they are introduced in these chapters. Chapter 4, in particular, is about particle motion and Chapter 5 about series expansions in the complex domain. Chapter 6 is devoted to the study of Fourier series and their use in the classical partial differential equations. This is the only illustration of eigenvalue expansions. Chapters 7 and 8 form the part of advanced calculus having to do with integration; here, we find the various versions of Stokes' theorem and its applications. Outside of the notion of a differential one-form, the approach is vector calculus rather than differential forms. I have included in Chapter 8 the study of geodesics and Dirichlet's principle; there is no further calculus of variations.
This text is thus intended to cover the course in advanced calculus given at the sophomore or junior level. The emphasis in the text is on concepts and techniques, my main intention being to present the methods of calculus. There are numerous illustrative examples and exercises on each method introduced. The exercises appear at the end of each section. Proofs of theorems are included mainly to offer further understanding of the mathematical machinery, and secondarily to illustrate its logical structure. It should be possible to read this book while skipping all proofs. The problems at the ends of the sections and the miscellaneous problems are included to deepen the student's understanding of the material, to allow him to try his hand at mathematical inference, and to suggest related topics.

I wish to thank Anne Clarke and Irene Dougherty of Brandeis University for their typing of the preliminary notes leading to this text and the classes of Mathematics 35 (1967-1968) and Mathematics 21 (1968-1969) for their assistance in correcting them. I thank also the editorial staff of W. A. Benjamin, Inc., for their patient, friendly, and expert assistance. Finally, it gives me great pleasure to thank my wife, Ricki, who not only typed the entire final manuscript and who gave me the needed encouragement to see this text through, but who also makes the world's greatest martinis.

Hugo Rossi
Waltham, Massachusetts
October 1969
CONTENTS

PREFACE

Chapter 1 Linear Functions
    1.1 Simultaneous equations
    1.2 Numbers, notation, and geometry
    1.3 Linear transformations
    1.4 Linear subspaces of R^n
    1.5 Rank + nullity = dimension
    1.6 Invertible matrices
    1.7 Eigenvectors and change of basis
    1.8 Complex numbers
    1.9 Space geometry
    1.10 Abstract notions of linearity
    1.11 Inner products
    Miscellaneous problems

Chapter 2 Notions of Calculus
    2.1 Convergence of sequences
    2.2 Series
    2.3 Tests for convergence
    2.4 Convergence in R^n
    2.5 Continuity
    2.6 Calculus of one variable
    2.7 Multiple integration
    2.8 Partial differentiation
    2.9 Improper integrals
    2.10 The space of continuous functions
    2.11 The fixed point theorem
    2.12 Summary
    Miscellaneous problems

Chapter 3 Ordinary Differential Equations
    3.1 Differentiation
    3.2 Taylor's formula
    3.3 Differential equations
    3.4 Some techniques for solving equations
    3.5 Existence theorems
    3.6 Linear differential equations
    3.7 Second-order linear equations
    3.8 Summary
    Miscellaneous problems

Chapter 4 Curves
    4.1 Parametrization of curves
    4.2 Arc length
    4.3 Local geometry of curves
    4.4 Curves in space
    4.5 Varying a curve in the plane
    4.6 Vector fields and fluid flows
    4.7 Summary
    Miscellaneous problems

Chapter 5 Series of Functions
    5.1 Convergence
    5.2 The fundamental theorem of algebra
    5.3 Constant coefficient linear differential equations
    5.4 Solutions in series
    5.5 Power series
    5.6 Complex differentiation
    5.7 Differential equations with analytic coefficients
    5.8 Infinitely flat functions
    5.9 Summary
    Miscellaneous problems

Chapter 6 Functions on the Circle (Fourier Analysis)
    6.1 Approximation by trigonometric polynomials
    6.2 Laplace's equation
    6.3 Fourier sine and cosine series
    6.4 The one-dimensional wave and heat equations
    6.5 The geometry of Fourier expansions
    6.6 Differential equations on the circle
    6.7 Taylor series and Fourier series
    6.8 Summary
    Miscellaneous problems

Chapter 7 Line Integrals and Green's Theorem
    7.1 The differential
    7.2 Coordinate changes
    7.3 Differential forms
    7.4 Work and conservative fields
    7.5 Integration of differential forms
    7.6 Applications of Green's theorem
    7.7 The Cauchy integral formula
    7.8 Summary
    Miscellaneous problems

Chapter 8 Potential Theory in Three Dimensions
    8.1 Divergence and the equation of continuity
    8.2 Curl and rotation
    8.3 Surfaces
    8.4 Surface integrals and Stokes' theorem
    8.5 The divergence theorem
    8.6 Dirichlet's principle
    8.7 Summary
    Miscellaneous problems

ANSWERS TO SELECTED EXERCISES

INDEX
Chapter 1 LINEAR FUNCTIONS

You probably recall from calculus that a function is a rule which associates particular values of one variable quantity to particular values of another variable quantity. Analysis is that branch of mathematics devoted to the study, or analysis, of functions. The main kind of analysis that goes on is this: for small changes in the first variable, we try to determine an approximate value to the corresponding change in the second. Now, we ask, for large changes in the first variable to what extent can we predict, from such approximations, the corresponding change in the second?

The primary technique involved in this kind of analysis is simplification of the problem. That is, we replace the given function by a suitable very simple and more easily calculable function and work with this simple function instead (making sure to keep in mind the effect of that replacement). The simplest possible functions are those which behave linearly. This means that they have a straight line as graph. Such a function has the following property. The increment in the value of the function corresponding to an increment in the variable is a constant multiple of that increment:

    f(x + t) - f(x) = Ct        (1.1)

for some C. Now, when one moves to the consideration of functions of several variable quantities the study of even these simplest functions becomes complex enough that it forms a special mathematical discipline, called linear algebra. The calculus of one variable, coupled with the concepts and techniques of linear algebra, constitutes the basic tools of analysis of functions of several variables. It is our purpose in this text to study this subject.

First then, we must study the notions and methods of linear algebra. We begin our study in a familiar context: that of the solution of a system of simultaneous equations in more than one unknown. We shall develop a standard technique for discovering solutions (if they exist), called row reduction. This is the foundation pillar of the theory of linear algebra. After a brief section on notational conventions, we look at the system of equations from another point of view. Instead of seeking particular solutions of a system, we analyze the system itself. This leads us to consider the fundamental concept of linear algebra: that of a linear transformation. In this larger context we can resolve the question of existence of solutions and effectively describe the totality of all solutions to a given linear problem. We proceed then to analyze the totality of linear transformations as an object of interest in itself.

This chapter ends with the study of several important topics allied with linear algebra. We study the plane as the system of complex numbers and the inner and vector products in three space.

1.1 Simultaneous Equations

Let us begin by considering a well-known problem: that of finding solutions to systems of simultaneous linear equations. The simplest nontrivial example is that of two equations in two unknowns.

Examples

1.
    8x + 5y = 3
    7x -  y = 8        (1.2)

The technique for solution is that of elimination of one of the variables. This is accomplished by multiplying the equations by appropriate nonzero numbers and adding or subtracting the resulting equations. This is quite legitimate, for the set of solutions of the system will not be changed by any such operations. It is our intention to select such operations so that we eventually obtain as equations: x = something, y = something.
In the present case this is quite easy: if we add five times the second equation to the first, y will conveniently disappear:

     8x + 5y =  3
    35x - 5y = 40
    -------------
    43x      = 43

and we obtain the equation x = 1. Substituting that in the first equation gives 8 + 5y = 3, or y = -1. Then x = 1, y = -1 is the solution. Let us try a few more illustrative examples.

2.
    3x - 2y =  9
    -x + 3y = 11

We can eliminate x as follows: multiply the second equation by 3 and add:

     3x - 2y =  9
    -3x + 9y = 33
    -------------
          7y = 42

We obtain y = 6 and, substituting into the first equation, x = 7.

3.
    3x + 4y =  7
    6x + 8y = 15        (1.3)

If we subtract twice the first equation from the second, we obtain a mess:

     6x + 8y =  15
    -6x - 8y = -14
    --------------
           0 =   1        (1.4)

Thus there can be no numbers x and y satisfying Equations (1.3), because they imply Equation (1.4), which is patently false. Notice that if the second equation were 6x + 8y = 14, then our technique would lead to the equation 0 = 0, which is true but hardly offers much new information. We can conclude that our simple technique of elimination does not always produce results. We shall go into the causes for this in Section 1.3.

4. Let us now generalize our technique to systems involving more variables. Consider, for example, the system

     x +  y +  z =  5
    3x - 2y + 5z = -1        (1.5)
    2x +  y -  z =  0
The first equation expresses x in terms of y and z; if the second and third equations were free of x we could solve as above for the two unknowns y, z and then use the first to find x. But now it is easy to replace those last two equations by another two which must also be satisfied and which are free of the variable x. We use the first to eliminate x from the latter two. Namely, subtract three times the first from the second:

    3x - 2y + 5z =  -1
    3x + 3y + 3z =  15
    ------------------
        -5y + 2z = -16

and twice the first from the third:

    2x +  y -  z =   0
    2x + 2y + 2z =  10
    ------------------
         -y - 3z = -10

The system (1.5) has been replaced by this new system:

    x + y +  z =   5
      -5y + 2z = -16        (1.6)
       -y - 3z = -10

and we can now see our way clear to the end. We solve the last two as a system in two unknowns:

    -5y +  2z = -16
     5y + 15z =  50
    ---------------
          17z =  34
            z =   2

Then, substituting this value in the last equation, we obtain -y - 6 = -10, or y = 4. Finally, substituting these values for y and z in the first equation, we find x = -1. Thus the solution is x = -1, y = 4, z = 2.

5.
      x - y -  z =  5
     2x + y - 3z =  0
    -4x - y +  z = 10
Eliminate x from the second and third equations by adding appropriate multiples of the first:

    2x +  y - 3z =   0
    2x - 2y - 2z =  10
    ------------------
         3y -  z = -10

    -4x -  y +  z = 10
     4x - 4y - 4z = 20
    ------------------
         -5y - 3z = 30

The given system has been replaced by these equations:

    x - y -  z =   5
       3y -  z = -10
      -5y - 3z =  30

We solve the last two easily: y = -30/7, z = -20/7. Substitution into the first equation completes the solution: x = -15/7.

Of course, we can run into difficulties as we did in the two unknown equations of Example 3. We should be prepared for such occurrences and perhaps even more mysterious ones. Nevertheless, our technique is productive: if there is a solution to be had we can locate it by this process of successive eliminations. Furthermore, it easily generalizes to systems with more unknowns. This is the technique stated for the case of n unknowns. Eliminate the first variable from all the equations except the first by adding appropriate multiples of the first. Then, we handle the resulting equations as a system in n - 1 unknowns. That is, using the second equation we can eliminate the second variable from all but the second equation, using the new third equation we can eliminate the third variable from the remaining equations, and so forth. Eventually we run out of equations and we ought to be able to find the desired solution by a succession of substitutions.

We shall want to do more than discover solutions if they exist. We want to be able to predict the existence of solutions; we want to be able to compare systems, and we want to know in some sense how many solutions there are. In other words, we should come to understand the nature of a given system of equations. In order to do that we have to analyze this technique and develop a notation and theory which do so. That is where linear algebra begins. Before going into this, we study another pair of more complicated examples.
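The technique of successive elimination followed by substitution can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `solve_by_elimination` is ours, the system is assumed to have as many equations as unknowns, and exact fractions are used so that answers such as -15/7 come out exactly. When elimination breaks down, as in Example 3, the sketch simply reports failure by returning `None`.

```python
from fractions import Fraction


def solve_by_elimination(coeffs, rhs):
    """Solve n equations in n unknowns by successive elimination.

    coeffs: list of n rows, each a list of n coefficients.
    rhs:    list of n right-hand sides.
    Returns the unique solution as a list of Fractions, or None when
    elimination breaks down (no solution, or more than one).
    """
    n = len(coeffs)
    # Work on an augmented copy: each row is one equation.
    rows = [[Fraction(c) for c in row] + [Fraction(b)]
            for row, b in zip(coeffs, rhs)]

    for k in range(n):
        # Find an equation, at position k or later, that still contains
        # the k-th unknown, and move it into position k.
        pivot = next((r for r in range(k, n) if rows[r][k] != 0), None)
        if pivot is None:
            return None
        rows[k], rows[pivot] = rows[pivot], rows[k]
        # Eliminate the k-th unknown from every later equation by
        # adding a suitable multiple of equation k.
        for r in range(k + 1, n):
            factor = rows[r][k] / rows[k][k]
            rows[r] = [a - factor * p for a, p in zip(rows[r], rows[k])]

    # Successive substitutions, starting from the last equation.
    x = [Fraction(0)] * n
    for k in reversed(range(n)):
        known = sum(rows[k][j] * x[j] for j in range(k + 1, n))
        x[k] = (rows[k][n] - known) / rows[k][k]
    return x


# Example 4: x = -1, y = 4, z = 2.
print(solve_by_elimination([[1, 1, 1], [3, -2, 5], [2, 1, -1]], [5, -1, 0]))
# Example 3: elimination produces 0 = 1, so no solution is returned.
print(solve_by_elimination([[3, 4], [6, 8]], [7, 15]))
```

Applied to Example 5 the same function returns the fractions -15/7, -30/7, -20/7 found above.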
Examples

6.
     x + 2y -  z - 3w =  13
    5x -  y -  z + 2w = -14
          y +  z +  w =   4
    3x + 2y - 2z      =  -7

According to our technique, we replace the last three equations by a new set in which the variable x does not appear. We do this by adding the suitable multiple of the first equation:

    (-5) × (first) + second:    -11y + 4z + 17w = -79
      0  × (first) + third:        y +  z +   w =   4
    (-3) × (first) + fourth:     -4y +  z +  9w = -46

Now we solve this set by applying the same procedure: we now eliminate y. Of course the order of the equations is not relevant; we could have listed them some other way. Since we can avoid fractions by adding multiples of the second equation to the first and third, let's do it that way.

    (11) × (second) + first:    15z + 28w = -35
      4  × (second) + third:     5z + 13w = -30

Finally, of this set, (-3) × (second) + first gives -11w = 55. Thus the original set of four equations is replaced by this set:

    x + 2y -  z -  3w =  13
         y +  z +   w =   4
             5z + 13w = -30
                 -11w =  55

The solutions are now easily found: w = -5, z = 7, y = 2, x = 1.

7. Now, let us consider this set:

      x + 2y + 3z +  u -  v =  2
    -5x +  y + 7z           = -5
          2y      + 4u + 3v = 18        (1.7)
               3z -  u -  v = -5
x is already eliminated from the last two equations. Using the first to eliminate x from the second, we obtain these three equations in place of the last three above:

    11y + 22z + 5u - 5v =  5
     2y       + 4u + 3v = 18        (1.8)
           3z -  u -  v = -5

Now y is already eliminated from the last. We eliminate it from the second (without getting involved with fractions) in this way:

    (-2) × (first) + (11) × (second):    -44z + 34u + 43v = 188

Now this equation together with the last of the set (1.8) gives this system:

    -44z + 34u + 43v = 188
      3z -   u -   v =  -5

We can eliminate v from the first, by adding 43 times the second, to obtain 85z - 9u = -27. Thus the system (1.7) has been transformed into this:

    x + 2y +  3z +  u -  v =   2
       11y + 22z + 5u - 5v =   5
              3z -  u -  v =  -5
             85z - 9u      = -27

Now we can solve for x by the first equation once we know y, z, u, v; we can solve for y by the second once we know z, u, v; we can solve for v in the third once we know z and u; and we can use any z, u which make the last equation true. For example, if z = 0, we must have u = 3, and so on up the line: v = 2, y = 0, x = 1. Notice that for any value of z we can always find u, v, x, y that make these equations all hold. Thus in this case there is more than one solution.

Formulation of the Procedure: Row Reduction

Now, let us turn to the abstract formulation of this procedure. In the general case we will have some, say m, equations in n unknowns. Let us refer to the unknowns as $x^1, \dots, x^n$. These m equations may be written as

\[
\begin{aligned}
a_1^1 x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &= b^1 \\
a_1^2 x^1 + a_2^2 x^2 + \cdots + a_n^2 x^n &= b^2 \\
&\;\;\vdots \\
a_1^m x^1 + a_2^m x^2 + \cdots + a_n^m x^n &= b^m
\end{aligned}
\tag{1.10}
\]
We proceed to solve this system as follows: multiply the first equation by $-a_1^2/a_1^1$ and add it to the second equation; multiply the first equation by $-a_1^3/a_1^1$ and add it to the third, and so on. The result will be a new system, which we may write this way:

\[
\begin{aligned}
a_1^1 x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &= b^1 \\
\alpha_2^2 x^2 + \cdots + \alpha_n^2 x^n &= \beta^2 \\
&\;\;\vdots \\
\alpha_2^m x^2 + \cdots + \alpha_n^m x^n &= \beta^m
\end{aligned}
\tag{1.11}
\]

We now continue with the same technique applied to the system of m - 1 equations in n - 1 unknowns given by the system (1.11) (except for the first equation). This is an effective reduction of the problem, because $x^1$ can be computed from the first equation once $x^2, \dots, x^n$ are known. Of course, if $a_1^1 = 0$, this technique must be slightly modified. We just renumber the equations so that the coefficient of $x^1$ in the first one is nonzero and then proceed as above. If that is impossible then $x^1$ appears in no equation, so we can disregard it and work with $x^2$ instead.

We now introduce a formalism which allows us to keep track of this procedure. It is clear that the essence of the left side of the system of Equations (1.10) is embodied in the array of numbers

\[
A = \begin{pmatrix}
a_1^1 & a_2^1 & \cdots & a_n^1 \\
\vdots & \vdots & & \vdots \\
a_1^m & a_2^m & \cdots & a_n^m
\end{pmatrix}
\tag{1.12}
\]

This array is called a matrix: the upper index of the general term is the row index and the lower index is the column index. Thus, $a_5^3$ is the number in the third row and fifth column, $a_{42}^7$ is in the seventh row and forty-second column, $a_k^j$ is the number in the jth row and the kth column. Symbol (1.12) is an m × n matrix: it has m rows and n columns. The matrix

\[
b = \begin{pmatrix} b^1 \\ \vdots \\ b^m \end{pmatrix}
\]

is an m × 1 matrix. Equations (1.10) can now be written symbolically as

\[
Ax = b \tag{1.13}
\]
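To make the index convention concrete, here is a small Python sketch using the system of Example 6. The helper names `entry` and `apply_matrix` are ours, introduced only for this illustration; the matrix is stored as a list of rows, so the number the text writes with upper index j and lower index k is found in row j, column k.

```python
# Coefficient matrix and right-hand side of Example 6,
# with the unknowns taken in the order x, y, z, w.
A = [
    [1,  2, -1, -3],
    [5, -1, -1,  2],
    [0,  1,  1,  1],
    [3,  2, -2,  0],
]
b = [13, -14, 4, -7]


def entry(matrix, row, column):
    """Entry with upper (row) index `row` and lower (column) index
    `column`, both counted from 1 as in the text."""
    return matrix[row - 1][column - 1]


def apply_matrix(matrix, x):
    """Left-hand sides of the equations: one number per row of the matrix."""
    return [sum(a * xk for a, xk in zip(row, x)) for row in matrix]


# The number in the second row and fourth column.
print(entry(A, 2, 4))

# The solution found in Example 6 satisfies Ax = b.
x = [1, 2, 7, -5]
print(apply_matrix(A, x) == b)
```

Writing the system as a single array and a single column is exactly what lets the elimination steps be described as operations on rows.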
Now the technique for solving the equation described above consists of a sequence of such equations with new matrices A and b, ending with one whose solution is obvious. The step from one equation to the next is performed by a row operation (remember, the rows are the separate equations); that is, one of these particular steps:

Step 1. Multiply a row by a nonzero constant.
Step 2. Add one row to another.
Step 3. Interchange two rows.

It is clear (and will be verified in Section 1.3) that any such operation does not change the collection of solutions. Finally, the end result desired is a matrix of this form, called a row-reduced matrix:

\[
\begin{pmatrix}
1 & a_2^1 & a_3^1 & \cdots & a_n^1 \\
0 & 1 & a_3^2 & \cdots & a_n^2 \\
0 & 0 & \ddots & & \vdots \\
\vdots & \vdots & & \ddots &
\end{pmatrix}
\tag{1.14}
\]

Descriptively: the first nonzero entry of any row is a 1 and this 1 in any row is to the right of the 1 in any previous row. This is the kind of matrix the above procedure leads to; and it is most desirable because the system it represents can be immediately solved. In order to see this, we shall distinguish between two cases by resolving the dotted ambiguity in the lower right corner of (1.14).

Let Ax = b be a system of linear equations where A is a row-reduced matrix (of the form (1.14)). Let d be the number of nonzero rows of A.

Case 1. d = n. In this case the system of equations has this form:

\[
\begin{aligned}
x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &= b^1 \\
x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n &= b^2 \\
&\;\;\vdots \\
x^{n-1} + a_n^{n-1} x^n &= b^{n-1} \\
x^n &= b^n \\
0 &= b^{n+1} \\
&\;\;\vdots \\
0 &= b^m
\end{aligned}
\]
Thus there is a solution if and only if $b^{n+1} = \cdots = b^m = 0$, and the solution is found by successive substitutions. In this case the solution is unique.

Case 2. d < n. In this case our system has the form:

\[
\begin{aligned}
x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &= b^1 \\
x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n &= b^2 \\
&\;\;\vdots \\
x^d + a_{d+1}^d x^{d+1} + \cdots + a_n^d x^n &= b^d \\
0 &= b^{d+1} \\
&\;\;\vdots \\
0 &= b^m
\end{aligned}
\]

(We may have to reindex the variables in order to get all the leading 1's in a line.) There is a solution if and only if $b^{d+1} = \cdots = b^m = 0$, and all the solutions are obtained by giving $x^{d+1}, \dots, x^n$ arbitrary values, and finding the values of the remaining variables by substitutions.

We now summarize the factual (rather than the procedural) content of this discussion in a theorem, the proof of which will appear in Section 1.3.

Definition 1. A matrix A is called a row-reduced matrix if (i) the first nonzero entry in any row is 1, (ii) in any row this first 1 appears to the right of the first 1 in any preceding row. The number of nonzero rows of A is called its index.

Theorem 1.1. Let A be an m × n matrix. A can be brought into row-reduced form A' by a succession of row operations. The equation Ax = b has precisely the same solutions as the equation A'x = b' if b' is obtained from b by the same sequence of row operations that led from A to A'.

• EXERCISES

1. Find solutions for these systems.
(a)
    2x - 3y = 23
    3x +  y = -4
(b)
     (1/2)x + 4y = 10
    -(1/2)x + 8y =  0
(c)
     x +  y +  z = 15
     x -  y +  z =  3
    2x - 3y - 5z = -7
(d)
     x +  y +  z +  w = 4
     x +  y +  z -  w = 2
     x -  y +  z -  w = 0
    -x + 2y - 3z + 4w = 2
(e)
    -x +  y +   z =  0
    2x + 2y + 10z = 28
     x +  y +   z = 22
(f)
    3x + 6y + 9z = 12
     x + 2y + 3z =  4
(g)
     x +  y = 7
     x -  y = 1
    3x - 4y = 0
(h)
    x +  y =  7
    x + 2y =  9
    x + 3y = 11
(i)
    x + y + z + w = 4
    x + y - z - w = 6
(j)
     x + 2y +  z =  0
     x - 3y - 6z =  4
    4x + 8y + 4z = 11

2. A homogeneous system of linear equations is a system of the form Ax = 0; that is, the right-hand side is zero. Find nonzero solutions (if possible) to these homogeneous systems.
(a)
    x +  y + z = 0
    x -  y + z = 0
    x + 2y + z = 0
(b)
    x + y + z = 0
    x - y - z = 0
(c)
     x +  y +  z +  w = 0
     x - 2y +  z - 2w = 0
    2x -  y + 2z -  w = 0

3. Suppose $(x^1, \dots, x^n)$ is a solution for a given homogeneous system. Show that for every real number t, $(tx^1, \dots, tx^n)$ is also a solution.

4. If $(x^1, \dots, x^n)$, $(y^1, \dots, y^n)$ are solutions for a given homogeneous system, then so is $(x^1 + y^1, \dots, x^n + y^n)$.

5. Find the row-reduced matrix which corresponds to the given matrix according to Theorem 1.1.

    A = (  0  7  1 )
        (  3  2  2 )
        ( -1  6  4 )

    B = ( 1  0  0  6  5 )
        ( 0  0  0  2  0 )
        [any further rows are missing in the source]

    C = (  1   2  6   1 )
        ( -2  -4  0   2 )
        (  0   0  8   8 )
        (  3   6  9  12 )

6. Let a, b, c be the given column matrices [entries missing in the source]. Solve these equations: (a) Ax = b, (b) Bx = a, (c) Bx = c, (d) Cx = a, (e) Cx = c, where A, B, C are given in Exercise 5.

• PROBLEMS

1. Show that the system

    ax + by = α
    cx + dy = β

has a solution no matter what α, β are if ad - bc ≠ 0, and there is only one such solution.

2. Can you suggest an explanation of the ugly phenomenon illustrated by Example 3?

3. Is there only one row-reduced matrix to which a given matrix may be reduced by row operations? If A' and A'' are two such row-reduced matrices, coming from a given matrix A, show that they must have the same index.

4. Suppose we have a system of n equations in n unknowns, Ax = b. After row reduction the index of the row-reduced matrix is also n. Show that in this case the equation Ax = b always has one and only one solution for every b.

5. Suppose that you have a system of m equations in n unknowns to solve. What should you expect in the way of existence and uniqueness of solutions in the cases m < n, m > n?

6. Suppose we are given the n × n system Ax = b, and all the rows of A are multiples of the first row; that is, there are $s^1, \dots, s^n$ such that $a_j^i = s^i a_j^1$ for all j and i = 1, ..., n. Under what conditions will the given system have a solution?

7. Suppose instead that the columns of A are multiples of the first column. Can you make any assertions?

8. Verify that if the columns of a 3 × 3 matrix are multiples of the first column, then the rows are multiples of one of the rows.
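The row-reduction procedure of this section, together with the two cases distinguished by the index, can be sketched in Python as follows. This is an illustration under our own assumptions, not a definitive implementation: the names `row_reduce` and `classify` are ours, zero rows are pushed to the bottom, and exact fractions are used. Instead of reindexing the variables as the text suggests for Case 2, the sketch simply skips any column in which no remaining row has a nonzero entry.

```python
from fractions import Fraction


def row_reduce(matrix, rhs):
    """Bring (matrix | rhs) to row-reduced form using row operations.

    Returns (reduced_matrix, reduced_rhs, index), where index is the
    number of nonzero rows of the reduced matrix.
    """
    rows = [[Fraction(c) for c in row] + [Fraction(v)]
            for row, v in zip(matrix, rhs)]
    m, n = len(rows), len(matrix[0])
    top = 0  # rows above position `top` are already in final form
    for col in range(n):
        if top == m:
            break
        pivot = next((r for r in range(top, m) if rows[r][col] != 0), None)
        if pivot is None:
            continue  # this unknown appears in no remaining equation
        # Step 3: interchange two rows.
        rows[top], rows[pivot] = rows[pivot], rows[top]
        # Step 1: multiply a row by a nonzero constant to get a leading 1.
        lead = rows[top][col]
        rows[top] = [a / lead for a in rows[top]]
        # Steps 1 and 2 combined: subtract a multiple of the pivot row
        # from each later row to clear the entries below the leading 1.
        for r in range(top + 1, m):
            factor = rows[r][col]
            rows[r] = [a - factor * p for a, p in zip(rows[r], rows[top])]
        top += 1
    return [row[:n] for row in rows], [row[n] for row in rows], top


def classify(matrix, rhs):
    """Describe the solutions of Ax = b using the index d."""
    _, new_rhs, d = row_reduce(matrix, rhs)
    if any(value != 0 for value in new_rhs[d:]):
        return "no solution"          # some equation reads 0 = nonzero
    n = len(matrix[0])
    return "unique solution" if d == n else "more than one solution"


# Example 3 and its variant with right-hand side 14.
print(classify([[3, 4], [6, 8]], [7, 15]))
print(classify([[3, 4], [6, 8]], [7, 14]))
# Example 7: four equations in the five unknowns x, y, z, u, v.
print(classify([[1, 2, 3, 1, -1], [-5, 1, 7, 0, 0],
                [0, 2, 0, 4, 3], [0, 0, 3, -1, -1]], [2, -5, 18, -5]))
```

For Example 7 the index is 4 while there are 5 unknowns, which is Case 2: one unknown may be given an arbitrary value, in agreement with the observation made in that example.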
1.2 Numbers, Notation, and Geometry

We now interrupt our discussion of simultaneous equations in order to introduce certain facts and notational conventions which shall be in use throughout this text. We shall also describe the geometry of the plane from the linear, or vector, point of view as an alternative introduction to linear algebra.

First of all, the collection {1, 2, 3, ...} of positive integers (the "counting numbers") will be denoted by P. Every integer n has an immediate successor, denoted by n + 1. If a fact is true for the integer 1 and also holds for the successor of every integer for which it is true, then it is true for all integers. This is the Principle of Mathematical Induction, which we shall take as an axiom, or defining property of the integers. We shall formulate it this way.

Principle of Mathematical Induction. Let S be a subset of P with these properties: (i) 1 is a member of S, (ii) whenever a particular integer n is in S, so also is its successor n + 1 in S. Then S must be the set P of all positive integers.

This assertion is intuitively clear. You can see, for example, that 2 is in S. For by (i) 1 is in S, and thus by (ii) 1 + 1 = 2 is also in S. Continuing, 3 = 2 + 1 is in S, again by (ii). By applying (ii) another time 4 is in S. Applying (ii) another 32 times we see that all the integers up to 36 are also in S. No positive integer can escape: since 1 is in S we need only apply (ii) n times to verify that the integer n is in S. In fact, the assertion of the principle of mathematical induction is that there are no integers other than those that can be captured in this way, and in this sense the principle is a defining property of the integers.
The principle of mathematical induction provides us with a tool for writing proofs of assertions for all positive integers which avoids the phrases "continuing in this way," "and so forth," and "...". We shall find this a helpful device in verifying assertions concerning problems with an unspecified number, n, of unknowns. Let us illustrate this method by proving a few propositions about integers.
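A finite computation can never replace an induction proof, but it is a useful sanity check on the statements of the three propositions that follow. The Python sketch below is ours and only spot-checks small cases: the sum formulas of Propositions 1 and 2, and the existence and uniqueness of the representation n = QK + R with 0 ≤ R < K of Proposition 3. The function names are hypothetical helpers introduced for this illustration.

```python
def sum_first_integers(n):
    """1 + 2 + ... + n, computed term by term."""
    return sum(range(1, n + 1))


def sum_first_odd_integers(n):
    """1 + 3 + ... + (2n - 1), computed term by term."""
    return sum(2 * k - 1 for k in range(1, n + 1))


def representations(n, K):
    """All pairs (Q, R) with n = Q*K + R and 0 <= R < K, for Q = 0, ..., n,
    found by brute force."""
    return [(Q, n - Q * K) for Q in range(0, n + 1) if 0 <= n - Q * K < K]


for n in range(1, 101):
    # Proposition 1, written without fractions: 2(1 + ... + n) = n(n + 1).
    assert 2 * sum_first_integers(n) == n * (n + 1)
    # Proposition 2: the sum of the first n odd integers is n squared.
    assert sum_first_odd_integers(n) == n ** 2
    for K in range(1, 21):
        # Proposition 3: exactly one representation, and it agrees with
        # Python's built-in quotient and remainder.
        assert representations(n, K) == [divmod(n, K)]

print(representations(17, 5))
```

The loop passes silently; the last line prints the single pair (Q, R) for n = 17, K = 5.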
14 1. Linear Functions Proposition 1. The sum of the first η integers is (1/2)л(л + 1). Proof. Let S be the set of integers for which Proposition 1 is true. Certainly 1 is in S: 1=*· 1(1 + 1) Now, assuming the assertion of Proposition 1 for any integer n, we show that it also holds for η + 1. 1 + --- + л+1 = 1 + -- + л+л+1=*л(л+1) + л+1 = (л + 1)(*л + 1) = i(n + 1)(л + 2) which is the appropriate conclusion. Thus by the principle of mathematical induction, Proposition 1 is proven. Proposition 2. The sum of the first η odd integers is л2. Proof. 1 = 12 surely. We now assume the proposition for any л, and show that it follows for л + 1: 1+3 + --- + 2(л+1)-1=1+3 + -" + 2л-1+2л+1 = л2 + 2л+1 =(л+1)2 Proposition 3. Let К be a given positive integer. Then for any integer η we can write n=QK+R (1.15) with 0 < R < К in one and only one way. Proof. We may of course immediately discard the case К = 1 for in that case (1.15) is just the trivial comment that л = л · 1 for all л. Thus take К > 1, and now proceed by mathematical induction. The proposition is true for л = 1: 1 =0·ί|1 Now we assume that the proposition is true for any given integer л. Thus n=QK+R for some β and R, 0 <. R < K. Then R + 1 <; K. If R + 1 < K, we have n+l = QK+(R+l)
with $0 < R + 1 < K$, as desired. Otherwise, $R + 1 = K$, in which case
$$n + 1 = QK + K = (Q + 1)K + 0$$
as desired. Thus, by mathematical induction, (1.15) is possible for every integer $n$. This representation is unique, for if $n = Q'K + R'$ is also possible with $0 \le R' < K$, then we have
$$Q'K + R' = QK + R \quad \text{or} \quad (Q' - Q)K = R - R'$$
and $R - R'$ is strictly between $-K$ and $K$. Now the only multiple of $K$ strictly between $-K$ and $K$ is 0, so $(Q' - Q)K = R - R' = 0$, from which we conclude $R = R'$, $Q' = Q$.

Set Notation

The set of positive integers forms a subset of a larger number system, the set $Z$ of all integers. $Z$ consists of all positive integers, their negatives, and 0. The collection of all quotients of members of $Z$ (with nonzero denominator) is the set of rational numbers, denoted by $Q$. $Q$ is a very large subset of the set of all real numbers $R$. For the purposes of geometric interpretation we will conceive of the real number system $R$ as being in one-to-one correspondence with the points on a straight line. That is, given a straight line, we fix two points on it: one is the origin $O$, and the other, labeled 1, denotes the unit of measurement. All other points $P$ on the line are given a numerical value: it is the displacement from $O$ as measured on the given scale (negative if $O$ is between $P$ and 1, and positive otherwise). (See Figure 1.1.)

Figure 1.1

There are certain ideas and notations in connection with sets which we
shall standardize before proceeding. Customarily, in any given context there is one largest set of objects under consideration, called the universe (it may be the positive integers $P$, or the rabbits in Texas, or the people on the moon) and all sets are actually subsets of this universe. If $X$ is a set and $x$ is an object we shall write $x \in X$ to mean: $x$ is a member of $X$. $x \notin X$ means that $x$ is not a member of $X$. Thus, for example, $-7 \in Z$, but $-7 \notin P$. The set with no elements is called the empty set, and is designated $\emptyset$.

Most specific sets are defined by a property: the set in question is the set of all elements of the universe that have that property. We use the following shorthand form to represent that phrase:
$$\{x \in U : x \text{ has that property}\}$$
For example, the set of all positive real numbers is $\{x \in R : x > 0\}$. The set of all Englishmen who drink coffee is $\{x \in \text{England} : x \text{ drinks coffee}\}$. The set of all integers between 8 and 18 is $\{x \in Z : 8 < x < 18\}$. This is the same as $\{x \in P : 8 < x < 18\}$ and $\{x \in Z : |x - 13| < 5\}$.

If $X$ and $Y$ are two sets, and every element of $X$ is an element of $Y$, we shall say that $X$ is contained in $Y$, written $X \subset Y$. Notice that $\emptyset \subset X$ for every set $X$. We shall consider also these operations on sets:

$-X$: the set of all $x$ not in $X$
$X \cup Y$: the set of all $x$ in either $X$ or $Y$ (or both)
$X \cap Y$: the set of all $x$ in both $X$ and $Y$
$X - Y$: the set of all $x$ in $X$, and not in $Y$

(Consult Figure 1.2 for a pictorial interpretation.) Notice that $X - Y$ is the same as $X \cap -Y$. There are many other identities: $-(-X) = X$, $-X \cup -Y = -(X \cap Y)$, $X \cap (Y \cup Z) = (X \cap Y) \cup (X \cap Z)$, and so on, so don't be surprised when two different collections of symbols identify the same set.

A final operation is that of forming the Cartesian product. If $U$ is a given universe, then $U \times U$ is the set of all ordered pairs of elements in $U$. $U \times U$ is often denoted by $U^2$.
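The set operations and identities above can be tried out on a small finite universe. The sketch below uses Python's built-in `set` type and `itertools.product`; the particular universe `U` (the integers 1 through 20) and the set `Z` of even numbers are arbitrary illustrative choices, not taken from the text.

```python
from itertools import product

U = set(range(1, 21))                     # a small illustrative universe: 1..20
X = {x for x in U if 8 < x < 18}          # {x in U : 8 < x < 18}
Y = {x for x in U if abs(x - 13) < 5}     # {x in U : |x - 13| < 5}
Z = {x for x in U if x % 2 == 0}          # the even members of U


def complement(S):
    """-S: the set of all x in the universe U that are not in S."""
    return U - S


same_set = (X == Y)                                   # two descriptions, one set
difference_identity = (X - Z == X & complement(Z))    # X - Z = X n -Z
double_complement = (complement(complement(X)) == X)  # -(-X) = X
de_morgan = (complement(X) | complement(Z) == complement(X & Z))
distributive = (X & (Y | Z) == (X & Y) | (X & Z))
empty_is_subset = set() <= X                          # the empty set is in every X

# The Cartesian product U x U: all ordered pairs of elements of U.
U2 = set(product(U, repeat=2))

print(same_set, difference_identity, double_complement,
      de_morgan, distributive, empty_is_subset, len(U2))
```

Note that the complement only makes sense relative to a fixed universe, which is why `complement` refers to `U` explicitly.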
By extension we can define $U^3$ as the set of all ordered triples $(x^1, x^2, x^3)$ of elements of $U$; and more generally $U^n$ is the set of all ordered $n$-tuples of elements of $U$. If $X^1, \ldots, X^n$ are subsets of $U$, the set of all ordered $n$-tuples $(x^1, \ldots, x^n)$ with $x^1 \in X^1, \ldots, x^n \in X^n$ is denoted $X^1 \times \cdots \times X^n$. Not every subset of $U^n$ is of the form $X^1 \times \cdots \times X^n$; those which are of this form are called rectangles. Thus the space of $n$-tuples of real numbers is denoted $R^n$. If $I^1, \ldots, I^n$ are intervals in $R$, then $I^1 \times \cdots \times I^n$ is indeed a rectangle. A point $(x^1, \ldots, x^n)$ in $R^n$ will be denoted, when specific reference to its elements is
not required, by a single boldfaced letter $\mathbf{x} = (x^1, \ldots, x^n)$. If $\mathbf{a} = (a^1, \ldots, a^n)$, $\mathbf{b} = (b^1, \ldots, b^n)$ are two points in $R^n$ with $b^i > a^i$, $1 \le i \le n$, we shall use the notation $[\mathbf{a}, \mathbf{b}]$ to denote the rectangle
$$\{(x^1, \ldots, x^n) : a^1 \le x^1 \le b^1, \ldots, a^n \le x^n \le b^n\}$$
If the inequalities are strict we shall denote the rectangle by $(\mathbf{a}, \mathbf{b})$:
$$(\mathbf{a}, \mathbf{b}) = \{(x^1, \ldots, x^n) : a^1 < x^1 < b^1, \ldots, a^n < x^n < b^n\}$$

A function from a set $X$ to another set $Y$ is a rule which associates to each point $x$ of $X$ a uniquely determined point $y$ in $Y$. It is customary to avoid the use of the new word rule by defining a function as a certain kind of subset of $X \times Y$. Namely, a function is a set of ordered pairs $(x, y)$ with $x \in X$, $y \in Y$, with each $x \in X$ appearing precisely once as a first member. If $(x, y)$ is such a pair we denote $y$ by $f(x)$: $y = f(x)$. We shall use the notation $f: X \to Y$ to indicate that $f$ is to mean a function from $X$ to $Y$. $X$ is called the domain of $f$; the range of $f$ is the set $\{f(x) : x \in X\}$ of values of $f$. If every point of $Y$ appears as a value of $f$ we say that $f$ maps $X$ onto $Y$. If every point of $Y$ is the value of $f$ at at most one $x$ in $X$, we say that $f$ is one-to-one. More precisely, $f$ is one-to-one if $x \ne x'$ implies $f(x) \ne f(x')$. Now, if $f$ is a one-to-one function from $X$ onto $Y$, then for each $y \in Y$ there is one and only one $x \in X$ such that $f(x) = y$. This defines a function $g: Y \to X$ which is also one-to-one and onto and has this property: $g(y) = x$ if and only if $f(x) = y$. In this case we shall say that $f$ is invertible and $g$ is its inverse, denoted
$g = f^{-1}$. Finally, if $f_1: X \to Y$, $f_2: Y \to Z$ are two functions, we can compose them to form a third function, denoted $f_2 \circ f_1: X \to Z$, defined by
$$(f_2 \circ f_1)(x) = f_2(f_1(x))$$

Plane Geometry

We now turn to the geometric study of the plane, as an alternative introduction to linear algebra. According to the notion of the Cartesian coordinate system we can make a correspondence between a plane, supposed to be of infinite extent in all directions, and the collection $R^2$ of ordered pairs of real numbers. This is done in the following way: first a point on the plane is chosen, to be called the origin and denoted $O$ (Figure 1.3). Then two distinct lines intersecting at $O$ are drawn (it is ordinarily supposed that these lines are perpendicular, but it is hardly necessary). These lines are called the coordinate axes; they are sometimes referred to more specifically as the $x$ and $y$ axes, ordered counterclockwise (Figure 1.4). Now a point is chosen on each of these axes; we call these $E_1$ and $E_2$ (Figure 1.5). These are the "unit lengths" in each of the directions of the coordinate axes. Having chosen a unit on these lines, we can put each of them in one-to-one correspondence with the real numbers.

Figure 1.3

Figure 1.4

Now, letting $P$ be any point in the plane, we associate a pair of real numbers to $P$ in this way. Draw the lines through $P$ which are parallel to the coordinate axes and let $x$ be the intersection with the line through $E_1$ and $y$ the intersection with the line through
$E_2$. Then we identify $P$ with the pair of real numbers $(x, y)$ (Figure 1.6). In this way to every point in the plane there corresponds a point in $R^2$ (called its coordinates relative to the choice $O$, $E_1$, $E_2$). Clearly, for any pair of real numbers $(x, y)$ we have a point with those coordinates, namely the fourth vertex of the parallelogram of side lengths $x$ and $y$ along the coordinate axes with one vertex at $O$ (Figure 1.6).

Figure 1.5

Figure 1.6
Figure 1.7

Once a particular point in the plane is fixed as the origin, there can be defined two operations on the points of the plane, and these operations form the tools of linear algebra. Since they cannot be defined on the points until an origin is chosen, we are forced to distinguish between the plane as a point set and the plane with a chosen point. This distinction gives rise to the notion of vector: a vector is a point in the plane-with-origin. The vector can be physically realized as the directed line segment from the origin to the given point; such a visualization is nothing more than a heuristic aid. It is important to realize that, as sets, the set of vectors in the plane is the same as the set of points in the plane. The difference is that the set of vectors has additional structure: a particular point has been designated the origin. We shall denote vectors by boldface letters; thus the point $P$ becomes the vector $\mathbf{P}$, and the origin $O$ becomes the vector $\mathbf{0}$. We shall now describe these two operations geometrically and then compute them in coordinates.

1. Scalar Multiplication. Let $\mathbf{P}$ be a vector in the plane, and $r$ a real number. Consider the line through $\mathbf{0}$ and $\mathbf{P}$. Considering $\mathbf{P}$ now as a unit length, we can put that line into one-to-one correspondence with $R$. Using this scale, $r\mathbf{P}$ is the point corresponding to the real number $r$. Said differently, $r\mathbf{P}$ is one of the points on that line whose distance from $\mathbf{0}$ is $|r|$ times the distance of $\mathbf{P}$ from $\mathbf{0}$ (Figure 1.7). Now, if $\mathbf{P}$ has the coordinates $(x, y)$ we shall see that $r\mathbf{P}$ has coordinates $(rx, ry)$. First, suppose $r > 0$. Draw the
triangles formed by the line through $\mathbf{0}$, $\mathbf{P}$, and $r\mathbf{P}$, the $E_1$ axis, and the lines parallel to the $E_2$ axis (Figure 1.8). Triangles I and II are similar. Thus, referring to the lengths as denoted in Figure 1.8,
$$\frac{|\mathbf{P}|}{|r\mathbf{P}|} = \frac{x}{s}$$
By definition $|\mathbf{P}| / |r\mathbf{P}| = 1/r$; thus the first coordinate of $r\mathbf{P}$ (here denoted by $s$) is $rx$. The second coordinate is similarly seen to be $ry$. Thus $r\mathbf{P}$ has the coordinates $(rx, ry)$. The case $r < 0$ is only slightly more complicated.

Figure 1.8

2. Addition. Let $\mathbf{P}$, $\mathbf{Q}$ be vectors in $R^2$. Then $\mathbf{0}$, $\mathbf{P}$, $\mathbf{Q}$ are three vertices of a uniquely determined parallelogram. We define $\mathbf{P} + \mathbf{Q}$ to be the fourth vertex. The description of this operation in terms of coordinates is extremely simple: if $\mathbf{P}$ has coordinates $(x, y)$, and $\mathbf{Q}$ has coordinates $(s, t)$, then $\mathbf{P} + \mathbf{Q}$ has $(x + s, y + t)$ as coordinates. There is nothing profound to be learned from the verification of this fact, so we shall not go through it in detail. After all, it is not our purpose here to logically incorporate plane geometry
into our mathematics, but rather to use it as an intuitive tool for comprehension.

Figure 1.9

For those who are suspicious of our assurances we include the verification of a special case. Consider Figure 1.9 and include the data relating to the coordinates (Figure 1.10). To show that the length of the line segment $0B$ is $s + x$, we must verify that the length of $AB$ is $s$. Draw the line through $\mathbf{P}$ parallel to $0E_1$, and let $C$ be the intersection of that line with the line through $\mathbf{P} + \mathbf{Q}$ and $B$. The quadrilateral $ABCP$ has parallel sides and thus is a parallelogram. Hence, $AB$ and $PC$ have the same length. Now triangles $0sQ$ and $PC(P + Q)$ have pairwise parallel sides (as shown in Figure 1.10), and further $0Q$ and $P(P + Q)$ have the same length. Thus these triangles are congruent, so the length of $PC$ is the same as the length of $0s$, namely $s$. Thus the length of $AB$ is also $s$, and so $0B$ has length $s + x$. Notice that this is a special case since it refers to an explicit picture which does not cover all possibilities.

The operation inverse to addition is subtraction: $\mathbf{P} - \mathbf{Q}$ is that vector which must be added to $\mathbf{Q}$ in order to obtain $\mathbf{P}$ (Figure 1.11). The best way to visualize $\mathbf{P} - \mathbf{Q}$ is as the directed line segment running from the head of $\mathbf{Q}$ to the head of $\mathbf{P}$ (denoted $L$ in Figure 1.11). In actuality, $\mathbf{P} - \mathbf{Q}$ is the vector obtained by translating $L$ to the origin; in practice, it is customary not to do this but to systematically confuse a vector with any of its translates. We shall do this only for purposes of pictorial representation.

Notice that, having chosen the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$, we can express any vector in the plane uniquely in terms of them and the operations of addition
Figure 1.11
and scalar multiplication:
$$(x, y) = (x, 0) + (0, y) = x(1, 0) + y(0, 1) = x\mathbf{E}_1 + y\mathbf{E}_2$$
This is true no matter how $\mathbf{E}_1$ and $\mathbf{E}_2$ are chosen, so long as the points $\mathbf{0}$, $\mathbf{E}_1$, $\mathbf{E}_2$ do not lie on the same line (we say the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$ are not collinear). Thus we have this important fact.

Proposition 4. Let $\mathbf{E}_1$ and $\mathbf{E}_2$ be any two noncollinear vectors in the plane. Then we can write any vector $\mathbf{Q}$ uniquely as
$$\mathbf{Q} = x^1\mathbf{E}_1 + x^2\mathbf{E}_2$$
$x^1$ and $x^2$ are the coordinates of $\mathbf{Q}$ relative to the choice of origin $\mathbf{0}$ and the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$.

If we state this geometric fact purely as a fact about $R^2$, it turns out to be a theoretical assertion about the solvability of a pair of linear equations. Thus, let us suppose $\mathbf{E}_1 = (a_1^1, a_1^2)$, $\mathbf{E}_2 = (a_2^1, a_2^2)$ relative to some standard coordinate system (for example, the usual rectangular coordinates). First of all, how do we express algebraically the assertion that $\mathbf{E}_1$ and $\mathbf{E}_2$ do not lie on the same line? We need an algebraic description of a straight line through the origin.

Proposition 5.
(i) A set $L$ is a straight line through $\mathbf{0}$ if and only if there exists a nonzero $(a, b) \in R^2$ such that
$$L = \{(x, y) : ax + by = 0\}$$
(ii) The points $(x, y)$, $(x', y')$ lie on the same line through the origin if and only if
$$\frac{x}{x'} = \frac{y}{y'}$$

You certainly recall these facts from analytic geometry; we leave the verification to the exercises. Returning to the vectors $\mathbf{E}_1$ and $\mathbf{E}_2$, these geometric facts become the following algebraic fact.

Proposition 6. Let
$$A = \begin{pmatrix} a_1^1 & a_2^1 \\ a_1^2 & a_2^2 \end{pmatrix}$$
be a $2 \times 2$ matrix with nonzero columns.
(i) If $a_1^1 a_2^2 - a_1^2 a_2^1 \ne 0$, then the equation $A\mathbf{x} = \mathbf{b}$ has a unique solution for every $\mathbf{b} \in R^2$.
(ii) The equation $A\mathbf{x} = \mathbf{0}$ has a nonzero solution if and only if $a_1^1 a_2^2 = a_1^2 a_2^1$.
(iii) If $a_1^1 a_2^2 = a_1^2 a_2^1$, the equation $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $a_1^1 b^2 = a_1^2 b^1$.

Proof. (i) This condition is, according to Proposition 5(ii), precisely the assertion that the vectors $(a_1^1, a_1^2)$, $(a_2^1, a_2^2)$ are not collinear. Then, according to Proposition 4, for any $(b^1, b^2)$ there is a unique pair $(x^1, x^2)$ such that
$$(b^1, b^2) = x^1(a_1^1, a_1^2) + x^2(a_2^1, a_2^2)$$
This is the same as the pair of equations $\mathbf{b} = A\mathbf{x}$.

(ii) By the above, if $a_1^1 a_2^2 \ne a_1^2 a_2^1$, then the only solution of $A\mathbf{x} = \mathbf{0}$ is $\mathbf{x} = \mathbf{0}$. On the other hand, if $a_1^1 a_2^2 = a_1^2 a_2^1$, then either $(a_2^2, -a_1^2)$ or $(-a_2^1, a_1^1)$ is a nonzero solution of $A\mathbf{x} = \mathbf{0}$, or all the entries of $A$ are zero, in which case everything solves $A\mathbf{x} = \mathbf{0}$.

(iii) If $a_1^1 a_2^2 = a_1^2 a_2^1$, then $(a_1^1, a_1^2)$, $(a_2^1, a_2^2)$ lie on the same line through the origin by Proposition 5(ii). Any combination $x^1(a_1^1, a_1^2) + x^2(a_2^1, a_2^2)$ will have to lie on that line, and conversely, any point on that line must be such a combination. Thus $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $(b^1, b^2)$ lies on the line through $\mathbf{0}$ determined by $(a_1^1, a_1^2)$. The equation for this is, by Proposition 5(ii),
$$\frac{b^1}{a_1^1} = \frac{b^2}{a_1^2} \quad \text{or} \quad b^2 a_1^1 = b^1 a_1^2$$

Examples

8. Let $L_1$ be the line through $(0, 0)$ and $(3, 2)$ and $L_2$ the line through $(1, 1)$ and $(0, 6)$. Find the point of intersection of $L_1$ and $L_2$. $L_1$ has the equation $2x - 3y = 0$, and $L_2$ the equation $5x + y = 6$. The point of intersection must lie on both lines, and thus is the pair $(x, y)$ solving
$$2x - 3y = 0$$
$$5x + y = 6$$
We find $x = 18/17$, $y = 12/17$.

9. Find the line $L$ through the point $(7, 3)$ that is parallel to the line $L'$: $8x + 2y = 17$. $L$ will be given by an equation of the form
$$ax + by = c$$
In order for $L$ to be parallel to $L'$, $L$ and $L'$ must have no point of intersection, so the equations
$$8x + 2y = 17$$
$$ax + by = c$$
can have no common solution. Thus we must have
$$\frac{8}{a} = \frac{2}{b}$$
Furthermore, since $(7, 3)$ is on $L$, we must also have
$$7a + 3b = c$$
This pair of equations in three unknowns has for a solution $a = 4$, $b = 1$, $c = 31$. Thus $L$ is given by the equation
$$4x + y = 31$$

EXERCISES

7. Show that for every integer $n$,
$$1^2 + 2^2 + \cdots + n^2 = \tfrac{1}{6}n(n + 1)(2n + 1)$$
8. Show that for every integer $n$,
$$2 + 2^2 + \cdots + 2^n = 2^{n+1} - 2$$
9. Show that $X \cap (Y \cup Z) = (X \cap Y) \cup (X \cap Z)$ and $X \cup (Y \cap Z) = (X \cup Y) \cap (X \cup Z)$.
10. Give an example of a subset of a Cartesian product which is not a rectangle.
11. Find the point of intersection of these pairs of lines in $R^2$:
(a) $3x + y = 1$, $x - 17y = 1$
(b) $x - 2y = 4$, $2x + y = 0$
(c) $2x + 2y = -1$, $x + 12y = 14$
(d) $y = 2x + 1$, $x = 3y + 18$
12. Find the line through $P$ which is parallel to $L$:
(a) $P = (2, -1)$, $L$: $3x + 7y = 4$
(b) $P = (8, 1)$, $L$: $x - y = -1$
(c) $P = (0, -7)$, $L$: $y - 2x = 3$

PROBLEMS

9. We can define the line through $\mathbf{P}$ and $\mathbf{Q}$ as the set of all $\mathbf{X}$ such that the vector $\mathbf{X} - \mathbf{P}$ is parallel to the vector $\mathbf{P} - \mathbf{Q}$. Show that two vectors are parallel if and only if one is a multiple of the other. Conclude that the line
through $\mathbf{P}$ and $\mathbf{Q}$ is the set
$$\{\mathbf{P} + t(\mathbf{P} - \mathbf{Q}) : t \in R\}$$
10. Using the definition in Problem 9, show that a straight line in the plane is, in terms of coordinates, given as
$$\{(x, y) \in R^2 : ax + by + c = 0\}$$
for suitable $a$, $b$, $c$.
11. Suppose $L$ is a line given by the equation
$$bx + ay + c = 0$$
(a) Show that the tangent of the angle this line makes with the horizontal is $-b/a$.
(b) Show that the vector $(b, a)$ is perpendicular to $L$.
(c) Find the point on $L$ which is closest to the origin.
12. Find the line through the point $P$ and perpendicular to $L$:
(a) $P = (3, 7)$, $L$: $x - 3y = 2$.
(b) $P = (-1, 1)$, $L$: $2x + 3y = 0$.
(c) $P = (0, 2)$, $L$: $5x = 2y$.
13. Suppose coordinates have been chosen in the plane. Let $\mathbf{E}_1$, $\mathbf{E}_2$ be two vectors in the plane which are not collinear. (That is, $\mathbf{0}$, $\mathbf{E}_1$, $\mathbf{E}_2$ do not lie on a straight line.) Then we can recoordinatize the plane relative to this choice of principal directions. Give formulas which relate these two coordinatizations in terms of the given coordinates of $\mathbf{E}_1$, $\mathbf{E}_2$ (see Figure 1.12).

Figure 1.12
14. In the text, the equation
$$1 + 2 + \cdots + n = \tfrac{1}{2}n(n + 1)$$
was verified by induction. There is another way of doing this. An $n \times n$ matrix has $n^2$ entries. There are $n$ of these entries on the diagonal and $1 + 2 + \cdots + (n - 1)$ entries above the diagonal, and as many below it. Thus
$$2(1 + 2 + \cdots + (n - 1)) + n = n^2$$

1.3 Linear Transformations

We now return to the problem of analyzing systems of simultaneous linear equations, with a broader question in mind: given the $m \times n$ matrix $A$, for which $\mathbf{b}$ is there a solution of the equation $A\mathbf{x} = \mathbf{b}$? In order to study this, we associate to $A$ the function from $R^n$ to $R^m$:
$$f_A(x^1, \ldots, x^n) = (a_1^1 x^1 + \cdots + a_n^1 x^n, \ldots, a_1^m x^1 + \cdots + a_n^m x^n) \qquad (1.16)$$
Thus the set of $\mathbf{b}$ for which $A\mathbf{x} = \mathbf{b}$ has a solution is precisely the range of $f_A$. Let us begin by introducing the two fundamental operations on $R^n$ (just as in the case $n = 2$ studied in the previous section):

1. Scalar Multiplication: for $r \in R$, $\mathbf{x} = (x^1, \ldots, x^n) \in R^n$, define
$$r\mathbf{x} = (rx^1, \ldots, rx^n)$$
2. Addition: for $\mathbf{x} = (x^1, \ldots, x^n)$, $\mathbf{y} = (y^1, \ldots, y^n) \in R^n$, define
$$\mathbf{x} + \mathbf{y} = (x^1 + y^1, \ldots, x^n + y^n)$$

Definition 2. A function $f$ from $R^n$ to $R^m$ is a linear transformation if it preserves these two operations, that is, if
$$f(r\mathbf{x}) = rf(\mathbf{x})$$
$$f(\mathbf{x} + \mathbf{y}) = f(\mathbf{x}) + f(\mathbf{y})$$

The function $f_A$ defined above for the $m \times n$ matrix $A$ is linear:
$$\begin{aligned} f_A(r\mathbf{x}) &= (a_1^1 rx^1 + \cdots + a_n^1 rx^n, \ldots, a_1^m rx^1 + \cdots + a_n^m rx^n) \\ &= (r(a_1^1 x^1 + \cdots + a_n^1 x^n), \ldots, r(a_1^m x^1 + \cdots + a_n^m x^n)) \\ &= rf_A(\mathbf{x}) \end{aligned}$$
$$\begin{aligned} f_A(\mathbf{x} + \mathbf{y}) &= (a_1^1(x^1 + y^1) + \cdots + a_n^1(x^n + y^n), \ldots, a_1^m(x^1 + y^1) + \cdots + a_n^m(x^n + y^n)) \\ &= (a_1^1 x^1 + \cdots + a_n^1 x^n + a_1^1 y^1 + \cdots + a_n^1 y^n, \ldots, a_1^m x^1 + \cdots + a_n^m x^n + a_1^m y^1 + \cdots + a_n^m y^n) \\ &= f_A(\mathbf{x}) + f_A(\mathbf{y}) \end{aligned}$$

The significance of the introduction of linear transformations, from the point of view of systems of equations, is that it provides a context in which to consistently interpret the technique of row reduction. For the application of a row operation to a system of equations amounts to composition of the associated linear transformation with a particular linear transformation corresponding to the row operation. Once we have seen that, we can analyze the given system by studying these successive compositions. Looking ahead, it is even more important to recognize row reduction as a tool for analyzing linear transformations. Let us now interpret the row operations as linear transformations.

Type I. Multiply a row by a nonzero constant.
Consider the multiplication of the $r$th row by $c \ne 0$. Let $P_1$ be the transformation on $R^m$:
$$P_1(b^1, \ldots, b^m) = (b^1, \ldots, cb^r, \ldots, b^m)$$
(multiplication of the $r$th entry by $c$). The effect of this row operation is that of composing the transformation $f_A: R^n \to R^m$ with $P_1$, and changing the equation $A\mathbf{x} = \mathbf{b}$ into the equation
$$P_1 A\mathbf{x} = P_1\mathbf{b}$$
These two equations have the same set of solutions since the transformation $P_1$ can be reversed (it is invertible). Precisely, its inverse is given by multiplying the $r$th entry by $1/c$.

Type II. Add one row to another.
Adding the $r$th row to the $i$th row corresponds to this transformation on $R^m$:
$$P_2(b^1, \ldots, b^m) = (b^1, \ldots, b^r, \ldots, b^i + b^r, \ldots, b^m)$$
Again, this step in the solution of the equations amounts to transforming the equation $A\mathbf{x} = \mathbf{b}$ to
$$P_2 A\mathbf{x} = P_2\mathbf{b}$$
Since $P_2$ is invertible (what is its inverse?) we cannot have affected the solutions.

Type III. Interchange two rows.
Interchanging the $r$th and $i$th rows corresponds to the transformation
$$P_3(b^1, \ldots, b^r, \ldots, b^i, \ldots, b^m) = (b^1, \ldots, b^i, \ldots, b^r, \ldots, b^m)$$
The importance of these observations is this: the row operations correspond to linear transformations which in turn are representable by matrices. The
solution of the system of equations $A\mathbf{x} = \mathbf{b}$ thus can be accomplished completely in terms of manipulations with the matrix corresponding to the system. It is our purpose now to study the representation of linear transformations by matrices and the representation of composition of transformations.

In $R^n$ the $n$ vectors $(1, 0, \ldots, 0)$, $(0, 1, 0, \ldots, 0)$, $\ldots$, $(0, \ldots, 0, 1)$ play a fundamental role. We shall refer to them as $\mathbf{E}_1, \ldots, \mathbf{E}_n$, respectively. Thus $\mathbf{E}_i$ has all entries zero but the $i$th, which is 1.

Proposition 7. Any vector in $R^n$ has a unique representation as a linear combination of $\mathbf{E}_1, \ldots, \mathbf{E}_n$.

Proof. Obviously,
$$(b^1, \ldots, b^n) = b^1\mathbf{E}_1 + \cdots + b^n\mathbf{E}_n$$

We shall refer to the set of vectors $\mathbf{E}_1, \ldots, \mathbf{E}_n$ as the standard basis for $R^n$. Out of Proposition 7 comes this more illuminating fact.

Proposition 8. Corresponding to any linear transformation $L: R^n \to R^m$ there is a unique $m \times n$ matrix $(a_j^i)$ such that
$$L(x^1, \ldots, x^n) = \left(\sum_{j=1}^n a_j^1 x^j, \ldots, \sum_{j=1}^n a_j^m x^j\right) \qquad (1.17)$$

Proof. It is clear, by the way, that, given the matrix $(a_j^i)$, Equation (1.17) does define a linear transformation. Now, given $L$, since it is linear, we can write
$$L((x^1, \ldots, x^n)) = L(x^1\mathbf{E}_1 + \cdots + x^n\mathbf{E}_n) = x^1 L(\mathbf{E}_1) + \cdots + x^n L(\mathbf{E}_n) \qquad (1.18)$$
Thus a linear transformation is completely determined by what it does to the standard basis. Let
$$L(\mathbf{E}_1) = (a_1^1, \ldots, a_1^m), \ldots, L(\mathbf{E}_n) = (a_n^1, \ldots, a_n^m)$$
Then Equation (1.18) becomes
$$\begin{aligned} L((x^1, \ldots, x^n)) &= x^1(a_1^1, \ldots, a_1^m) + \cdots + x^n(a_n^1, \ldots, a_n^m) \\ &= (x^1 a_1^1, \ldots, x^1 a_1^m) + \cdots + (x^n a_n^1, \ldots, x^n a_n^m) \\ &= (x^1 a_1^1 + \cdots + x^n a_n^1, \ldots, x^1 a_1^m + \cdots + x^n a_n^m) \end{aligned}$$
which is just Equation (1.17).
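The proof of Proposition 8 is constructive: the $j$th column of the matrix is simply $L(\mathbf{E}_j)$. The following Python sketch carries this out under illustrative assumptions: vectors are plain lists, matrices are lists of rows, and the map `L` below is a made-up example of a linear transformation from $R^3$ to $R^2$, not one taken from the text.

```python
def standard_basis(n):
    """Return E_1, ..., E_n as lists: E_j has a 1 in position j and 0 elsewhere."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def matrix_of(L, n):
    """Matrix (list of m rows) of a linear map L: R^n -> R^m.

    Column j of the result is L(E_j), exactly as in the proof of Proposition 8.
    """
    columns = [L(e) for e in standard_basis(n)]        # L(E_1), ..., L(E_n)
    m = len(columns[0])
    return [[columns[j][i] for j in range(n)] for i in range(m)]


def apply(A, x):
    """Evaluate Equation (1.17): entry i is the sum over j of A[i][j] * x[j]."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def L(x):
    """An illustrative linear map R^3 -> R^2 (hypothetical example)."""
    x1, x2, x3 = x
    return [2 * x1 - x3, x1 + 4 * x2 + 5 * x3]


A = matrix_of(L, 3)
x = [3, -2, 7]
print(A)                      # the rows of the matrix representing L
print(apply(A, x) == L(x))    # the matrix reproduces L on this vector
```

The construction only uses the values of `L` on the standard basis, which reflects the remark that a linear transformation is completely determined by what it does to $\mathbf{E}_1, \ldots, \mathbf{E}_n$. If `L` were not linear, `matrix_of` would still return a matrix, but `apply` would no longer agree with `L`.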
Matrix Multiplication

Now we must discover how to represent the composition of two linear transformations by an operation on matrices. There is only one way to make this discovery: compute. Suppose then that $T: R^n \to R^m$ and $S: R^m \to R^p$ are linear transformations represented by the matrices $(a_j^i)$ and $(b_j^i)$, respectively. Then, we can compute the composition $ST$ as follows:
$$T(x^1, \ldots, x^n) = \left(\sum_{j=1}^n a_j^1 x^j, \ldots, \sum_{j=1}^n a_j^m x^j\right)$$
$$ST(x^1, \ldots, x^n) = \left(\sum_{k=1}^m b_k^1\left(\sum_{j=1}^n a_j^k x^j\right), \ldots, \sum_{k=1}^m b_k^p\left(\sum_{j=1}^n a_j^k x^j\right)\right)$$
Thus $ST$ is represented by the $p \times n$ matrix
$$\left(\sum_{k=1}^m b_k^i a_j^k\right) \qquad (1.19)$$

Definition 3. Let $A = (a_j^i)$ be an $m \times n$ matrix and $B = (b_j^i)$ a $p \times m$ matrix. Then the product $BA$ is defined as the $p \times n$ matrix whose $(i, j)$th entry is given by (1.19).

The preceding discussion thus provides the verification of

Proposition 9. If $T: R^n \to R^m$, $S: R^m \to R^p$ are represented by the matrices $A$, $B$, respectively, then $ST$ is represented by the product $BA$.

The product operation may seem a bit obscure at first sight, but it is easily described in this way: the $(i, j)$th entry of $BA$ is found by multiplying, entry by entry, the $i$th row of $B$ by the $j$th column of $A$, and adding.

Examples

10.
$$A = \begin{pmatrix} 5 & 3 & 7 \\ 6 & 5 & 1 \\ 8 & 11 & -4 \end{pmatrix} \qquad B = \begin{pmatrix} 6 & 1 & 0 \\ -3 & 2 & 5 \\ 4 & 4 & 4 \end{pmatrix}$$
Let $AB = (c_j^i)$. Then
$$\begin{aligned} c_1^1 &= 5 \cdot 6 + 3(-3) + 7 \cdot 4 = 49 \\ c_1^2 &= 6 \cdot 6 + 5(-3) + 1 \cdot 4 = 25 \\ c_1^3 &= 8 \cdot 6 + 11(-3) + (-4)4 = -1 \\ c_3^2 &= 6 \cdot 0 + 5 \cdot 5 + 1 \cdot 4 = 29 \end{aligned}$$
$$AB = \begin{pmatrix} 49 & 39 & 43 \\ 25 & 20 & 29 \\ -1 & 14 & 39 \end{pmatrix}$$

11. (2 5 1-1 -5 -1\ 5 1)= 20 50 1 0 \2 · 1 5-1 1-1/

12.
$$\begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} -1 & -7 & 0 \\ -1 & 4 & -2 \end{pmatrix}$$
$$\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} -1 & 4 & -2 \\ 1 & 7 & 0 \end{pmatrix}$$
$$\begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\begin{pmatrix} 1 & 7 & 0 \\ -1 & 4 & -2 \end{pmatrix} = \begin{pmatrix} 0 & 11 & -2 \\ -1 & 4 & -2 \end{pmatrix}$$

Now, let us recapitulate the discussion of this section so far. The problem of systems of $m$ linear equations in $n$ unknowns amounts to describing the range of a linear transformation $T: R^n \to R^m$. The technique of row reduction corresponds to composing $T$ with a succession of invertible transformations on $R^m$. These transformations are those which provide the row operations; we shall call them elementary transformations. Linear transformations can be represented by matrices by means of the standard basis, and composition of the transformations corresponds to matrix multiplication. Thus, we solve a system of linear equations as follows: multiply the matrix on the left by a succession of elementary matrices in order to obtain a row-reduced matrix. Then we can easily read off the solutions. Since multiplication by an elementary matrix is the same as applying the corresponding row operation to the matrix, it is easy to keep track of this process.
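The rule "row of the left factor times column of the right factor" and the effect of elementary matrices can both be checked with a few lines of Python. This is a minimal sketch with matrices stored as lists of rows; the helper name `matmul` and the variable names for the three elementary matrices are illustrative, and the data are those of Examples 10 and 12.

```python
def matmul(B, A):
    """Product BA: entry (i, j) is the ith row of B times the jth column of A."""
    assert all(len(row) == len(A) for row in B), "inner dimensions must agree"
    return [[sum(B[i][k] * A[k][j] for k in range(len(A)))
             for j in range(len(A[0]))]
            for i in range(len(B))]


# Example 10.
A = [[5, 3, 7], [6, 5, 1], [8, 11, -4]]
B = [[6, 1, 0], [-3, 2, 5], [4, 4, 4]]
AB = matmul(A, B)
BA = matmul(B, A)

# Example 12: each 2 x 2 elementary matrix, multiplied on the left,
# performs one row operation on the 2 x 3 matrix M.
M = [[1, 7, 0], [-1, 4, -2]]
scale_row_1 = [[-1, 0], [0, 1]]        # Type I: multiply row 1 by -1
swap_rows = [[0, 1], [1, 0]]           # Type III: interchange the rows
add_row_2_to_row_1 = [[1, 1], [0, 1]]  # Type II: add row 2 to row 1

print(AB)
print(AB == BA)                        # matrix multiplication need not commute
print(matmul(scale_row_1, M))
print(matmul(swap_rows, M))
print(matmul(add_row_2_to_row_1, M))
```

Multiplying on the left is what makes the elementary matrix act on rows; the same matrices multiplied on the right would act on columns instead, and here the shapes would not even match.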
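Examples

13. Let us consider the system of four equations in three unknowns corresponding to the matrix
$$A = \begin{pmatrix} 4 & 0 & 1 \\ 3 & 2 & 2 \\ -1 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix}$$
We shall record the process of row reduction in two columns. In the first we shall list the succession of transformations which $A$ undergoes, and in the second we shall accumulate the products of the corresponding elementary matrices.

(a) Multiply the third row by $-1$ and interchange it with the first.
$$\begin{pmatrix} 1 & 0 & -1 \\ 3 & 2 & 2 \\ 4 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(b) Multiply the first row by 3 and subtract it from the second; multiply the first row by 4 and subtract it from the third.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 2 & 5 \\ 0 & 0 & 5 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1 & 3 & 0 \\ 1 & 0 & 4 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(c) Divide the second row by 2 and the third row by 5.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & 0 & 1 \\ 0 & 1 & 2 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1/2 & 3/2 & 0 \\ 1/5 & 0 & 4/5 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$
(d) Subtract the second row from, and add one-half the third row to, the fourth.
$$\begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 5/2 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 0 & 0 & -1 & 0 \\ 0 & 1/2 & 3/2 & 0 \\ 1/5 & 0 & 4/5 & 0 \\ 1/10 & -1/2 & -11/10 & 1 \end{pmatrix}$$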
Let us denote the product of the elementary matrices by $P$; thus $P$ is the last matrix on the right and the matrix on the left is $PA$. Now, it is easy to see that if $PA\mathbf{x} = \mathbf{y}$ has a solution, the fourth entry of $\mathbf{y}$ must be zero. Now our original problem $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $PA\mathbf{x} = P\mathbf{b}$ is solvable (since $P$ is invertible). Thus $\mathbf{b}$ is in the range of $A$ if and only if the fourth entry of $P\mathbf{b}$ is zero: $A\mathbf{x} = \mathbf{b}$ can be solved if and only if
$$\tfrac{1}{10}b^1 - \tfrac{1}{2}b^2 - \tfrac{11}{10}b^3 + b^4 = 0$$
If $\mathbf{b}$ satisfies that condition, there is an $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{b}$; we find it by solving $PA\mathbf{x} = P\mathbf{b}$:
$$\begin{aligned} x^1 - x^3 &= -b^3 \\ x^2 + \tfrac{5}{2}x^3 &= \tfrac{1}{2}b^2 + \tfrac{3}{2}b^3 \\ x^3 &= \tfrac{1}{5}b^1 + \tfrac{4}{5}b^3 \end{aligned}$$

14. Consider now the system in three unknowns given by

We row reduce as above.

(a) Multiply row 1 by 4 and subtract it from row 2; multiply row 1 by 3 and subtract it from row 3.
(b) Subtract 1/2 of row 2 from row 3; divide row 2 by $-10$.
$$\begin{pmatrix} 1 & 3 & -2 \\ 0 & 1 & -3/5 \\ 0 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1 & 0 & 0 \\ 2/5 & -1/10 & 0 \\ -1 & 1/2 & 1 \end{pmatrix}$$
The system $A\mathbf{x} = \mathbf{b}$ thus has a solution if and only if
$$-b^1 + \tfrac{1}{2}b^2 + b^3 = 0 \qquad (1.20)$$
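In that case the solution is given by
$$\begin{aligned} x^1 + 3x^2 - 2x^3 &= b^1 \\ x^2 - \tfrac{3}{5}x^3 &= \tfrac{2}{5}b^1 - \tfrac{1}{10}b^2 \end{aligned}$$
Any arbitrarily chosen value of $x^3$ will provide a solution (granted the condition (1.20) is satisfied).

15.
$$A = \begin{pmatrix} 2 & 0 & 0 & 2 \\ 3 & -1 & 1 & 0 \\ 2 & 2 & 0 & 0 \end{pmatrix}$$
(a) Divide row 1 by 2.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 3 & -1 & 1 & 0 \\ 2 & 2 & 0 & 0 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
(b) Subtract 3 times row 1 from row 2; subtract twice row 1 from row 3.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & -1 & 1 & -3 \\ 0 & 2 & 0 & -2 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ -3/2 & 1 & 0 \\ -1 & 0 & 1 \end{pmatrix}$$
(c) Multiply row 2 by $-1$, and subtract twice the result from row 3.
$$\begin{pmatrix} 1 & 0 & 0 & 1 \\ 0 & 1 & -1 & 3 \\ 0 & 0 & 2 & -8 \end{pmatrix} \qquad \begin{pmatrix} 1/2 & 0 & 0 \\ 3/2 & -1 & 0 \\ -4 & 2 & 1 \end{pmatrix}$$
Here there is no condition for the equation $PA\mathbf{x} = \mathbf{y}$ to be solvable; thus every problem $A\mathbf{x} = \mathbf{b}$ is also solvable. The solution is found by writing the system $PA\mathbf{x} = P\mathbf{b}$:
$$\begin{aligned} x^1 + x^4 &= \tfrac{1}{2}b^1 \\ x^2 - x^3 + 3x^4 &= \tfrac{3}{2}b^1 - b^2 \\ 2x^3 - 8x^4 &= -4b^1 + 2b^2 + b^3 \end{aligned}$$
Clearly, the value of $x^4$ can be freely chosen, and $x^1$, $x^2$, $x^3$ are easily found by the equations.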
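The two-column bookkeeping of these examples can be mechanized by performing each row operation on the augmented array $[A \mid I]$: the left block becomes $PA$ and the right block accumulates $P$. The Python sketch below replays steps (a) through (c) of Example 15 with exact fractions. The helper names (`scale`, `add_multiple`, `times`, `solve`) and the right-hand side `b` are illustrative choices, not part of the text.

```python
from fractions import Fraction


def scale(M, r, c):
    """Type I: multiply row r of M by the nonzero constant c (in place)."""
    M[r] = [c * v for v in M[r]]


def add_multiple(M, src, dst, c):
    """Add c times row src to row dst (in place)."""
    M[dst] = [d + c * s for s, d in zip(M[src], M[dst])]


def times(P, A):
    """Matrix product PA, with matrices stored as lists of rows."""
    return [[sum(p * A[k][j] for k, p in enumerate(row))
             for j in range(len(A[0]))] for row in P]


A = [[2, 0, 0, 2], [3, -1, 1, 0], [2, 2, 0, 0]]   # the matrix of Example 15
m, n = len(A), len(A[0])

# Augmented array [A | I]: the left block becomes PA, the right block becomes P.
M = [[Fraction(v) for v in row] + [Fraction(int(i == k)) for k in range(m)]
     for i, row in enumerate(A)]

scale(M, 0, Fraction(1, 2))    # (a) divide row 1 by 2
add_multiple(M, 0, 1, -3)      # (b) subtract 3 times row 1 from row 2
add_multiple(M, 0, 2, -2)      #     subtract twice row 1 from row 3
scale(M, 1, -1)                # (c) multiply row 2 by -1
add_multiple(M, 1, 2, -2)      #     subtract twice the result from row 3

PA = [row[:n] for row in M]
P = [row[n:] for row in M]
consistent = times(P, A) == PA   # the accumulated P really carries A to PA


def solve(b, x4):
    """Back-substitute in PAx = Pb for a chosen value of the free unknown x4.

    The coefficients used below are read off from the rows of PA.
    """
    c = [sum(p * bj for p, bj in zip(row, b)) for row in P]   # c = Pb
    x3 = (c[2] + 8 * x4) / 2       # from 2*x3 - 8*x4 = c[2]
    x2 = c[1] + x3 - 3 * x4        # from x2 - x3 + 3*x4 = c[1]
    x1 = c[0] - x4                 # from x1 + x4 = c[0]
    return [x1, x2, x3, x4]


b = [4, 1, -6]                     # a hypothetical right-hand side
x = solve(b, x4=5)
residual_ok = [sum(a * xi for a, xi in zip(row, x)) for row in A] == b

print(consistent, residual_ok)
```

Using `Fraction` instead of floating point keeps every entry exact, so the comparison of `times(P, A)` with `PA` is a genuine equality test rather than an approximate one. Changing `x4` produces a different solution of the same system, which illustrates the freedom in the last unknown.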
Validity of Row Reduction

The basic point behind the present discussion is that the study of $m$ simultaneous linear equations in $n$ unknowns is the same as the study of linear transformations of $R^n$ into $R^m$, which is the same as the study of $m \times n$ matrices under multiplication by $m \times m$ matrices. The matrix version of this story is the easiest to work, if only because it imposes the minimum amount of notation. However, the linear transformation interpretation is the most significant, and in the next section we will follow that line of thought. But first, let us record a proof of the main result of Section 1.1 in terms of matrices.

Theorem 1.2. Let $A$ be an $m \times n$ matrix. There is a finite collection $E_0, \ldots, E_s$ of elementary $m \times m$ matrices such that the product $E_s \cdots E_0 A$ is in row-reduced form. Let $P = E_s \cdots E_0$ and let $d$ be the index of $PA$.
(i) The system $A\mathbf{x} = \mathbf{b}$ has a solution if and only if $PA\mathbf{x} = P\mathbf{b}$ has a solution.
(ii) The system $A\mathbf{x} = \mathbf{b}$ has a solution if and only if the last $m - d$ entries of $P\mathbf{b}$ vanish.
(iii) $n - d$ unknowns can be freely chosen in any solution of $A\mathbf{x} = \mathbf{b}$.

Proof. First of all, we may, by a sequence of row operations, replace $A$ with a matrix whose first nonzero column is
$$\begin{pmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{pmatrix} \qquad (1.21)$$
Namely, suppose the $j$th column is the first nonzero column. Thus some entry in that column, say $a_j^i$, is nonzero. Interchange the first and $i$th rows. This is accomplished by multiplication on the left by an elementary matrix of Type III, call it $E_0$. Now, $E_0 A = (a_j^i)$ with $a_j^1 \ne 0$. Multiply the first row by $(a_j^1)^{-1}$; this makes the $(1, j)$ entry 1 and is accomplished by means of an elementary matrix, say $E_1$. Now, for $i = 2, \ldots, m$, let $E_i$ be the elementary matrix representing the operation of adding $-a_j^i$ times the (now normalized) first row to the $i$th row (this makes the $(i, j)$ entry zero). Then $E_m \cdots E_0 A$ has (1.21) as its first nonzero column.

The proof now proceeds by induction on $m$. If $m = 1$ the proof is concluded: the $1 \times n$ matrix $(0, \ldots, 0, 1, a_{j+1}^1, \ldots, a_n^1)$ is in row-reduced form.
For $m > 1$, the matrix $E_m \cdots E_0 A$ has the form
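$$\begin{pmatrix} 0 \cdots 0 & 1 & a_{j+1}^1 \cdots a_n^1 \\ 0 \cdots 0 & 0 & B \end{pmatrix} \qquad (1.22)$$
where $B$ is an $(m - 1) \times (n - j)$ matrix. The induction assumption thus applies to $B$. There is a collection $F_0, \ldots, F_s$ of $(m - 1) \times (m - 1)$ elementary matrices such that $F_s \cdots F_0 B$ is in row-reduced form. Now let
$$E_{m+1+k} = \begin{pmatrix} 1 & 0 \cdots 0 \\ 0 & F_k \end{pmatrix}, \qquad k = 0, \ldots, s$$
Then, it is easy to compute that further multiplication of (1.22) by these matrices does not affect the first row, and in fact,
$$E_{m+1+s} \cdots E_0 A = \begin{pmatrix} 0 \cdots 0 & 1 & a_{j+1}^1 \cdots a_n^1 \\ 0 \cdots 0 & 0 & F_s \cdots F_0 B \end{pmatrix}$$
which is in row-reduced form.

(i) Suppose there are $\mathbf{x}$ and $\mathbf{b}$ such that $A\mathbf{x} = \mathbf{b}$. Then multiplication by $P$ preserves the equality, so $PA\mathbf{x} = P\mathbf{b}$. On the other hand, suppose $\mathbf{x}$, $\mathbf{b}$ are given such that $PA\mathbf{x} = P\mathbf{b}$. Let $f_P$ be the transformation on $R^m$ corresponding to $P$. $f_P$ is a composition of row operations which are invertible, thus $f_P$ is invertible. In particular, $f_P$ is one-to-one, so since $f_P(A\mathbf{x}) = f_P(\mathbf{b})$ we must have $A\mathbf{x} = \mathbf{b}$.

(ii) If $d$ is the index of the $m \times n$ matrix $PA$, its last $m - d$ rows vanish. Thus for $PA\mathbf{x} = P\mathbf{b}$ to hold for some $\mathbf{x}$, the last $m - d$ entries of $P\mathbf{b}$ must vanish. By (i) this is also the condition for $A\mathbf{x} = \mathbf{b}$ to have a solution.

(iii) The solutions of $A\mathbf{x} = \mathbf{b}$ are the same as those of $PA\mathbf{x} = P\mathbf{b}$. This latter system has the form
$$\begin{aligned} x^1 + a_2^1 x^2 + \cdots + a_n^1 x^n &= \sum_j P_j^1 b^j \\ x^2 + a_3^2 x^3 + \cdots + a_n^2 x^n &= \sum_j P_j^2 b^j \\ &\;\;\vdots \\ x^d + a_{d+1}^d x^{d+1} + \cdots + a_n^d x^n &= \sum_j P_j^d b^j \end{aligned}$$
Clearly, $x^1, \ldots, x^d$ are uniquely determined once $x^{d+1}, \ldots, x^n$, $b^1, \ldots, b^m$ are known. The $b$'s are restricted by the last $m - d$ equations of $P\mathbf{b} = \mathbf{0}$, but $x^{d+1}, \ldots, x^n$ are free to take any values.

EXERCISES

13. Compute the products $AB$: (a) A = lo (b) A = 0 1 0 0 2 0 -3 = 0 -2 B = Ό 1 2 0 °\ ° i/ 1 -2 3 1 2 3 0 2 3 0 -1 3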
(с) А = (6, 6, 3,2,1,) B = (d) A 2 8 6\ /0 ! :i -i) B4?
14. Compute the products $BA$ for the matrices $A$, $B$ of Exercise 13.
15. Compute the matrix corresponding to the sequence of row operations which row reduce the matrices of Exercise 5.
16. For the given $m \times n$ matrix $A$ find conditions on the vector $\mathbf{b}$ in $R^m$ under which the equation $A\mathbf{x} = \mathbf{b}$ has a solution.
(a) $A$ as given in Exercise 13(a).
(b) $A$ as given in Exercise 13(b).
(c) i-' 2 0 \ 4 /2 4 2 \8 2\ 0 -1 1 V 6 0 2 0 1 0 6 0 0 1 -1 (d)
17. Show that if $A$ is an $m \times n$ matrix with $m > n$, then there are always $\mathbf{b}$ for which the equation $A\mathbf{x} = \mathbf{b}$ has no solution.
18. Verify that the composition of two linear transformations is again linear.
19. Suppose that $T: R^n \to R^m$ has this property: $T(\mathbf{E}_1) = \mathbf{0}, \ldots, T(\mathbf{E}_n) = \mathbf{0}$. Show that $T(\mathbf{x}) = \mathbf{0}$ for every $\mathbf{x} \in R^n$.
20. Show that there is only one linear function on $R^n$ with this property: $f(\mathbf{E}_1) = \mathbf{E}_2$, $f(\mathbf{E}_2) = \mathbf{E}_3$, $\ldots$, $f(\mathbf{E}_n) = \mathbf{E}_1$.

PROBLEMS

15. Let $f: R^n \to R$ be defined by $f(x^1, \ldots, x^n) = \sum_{i=1}^n x^i$. Show that $f$ is a linear function. Is the function
$$g(x^1, \ldots, x^n) = \sum_{i=1}^n (x^i)^2$$
1.3 Linear Transformations 39 linear ? Is the function h(x\ x2) = xlx2 lmear? 16. Suppose that S, Τ are lmear transformations of i?" to Rm Show that S+T, defined by (S + T)(x) = S(x) + T(x) is also linear. Show that the matrix representing S + Τ is the entry by entry sum of the matrices representing S, T, respectively. 17. Let А, В, С be η χ η matrices. Show that (AB)C = A(BC) (A + B)C = AC + ВС A(B + С) = AB + AC Show that AB = BA need not be true. 18. Write down the products of the elementary matrices which row reduce these matrices: l\ 3 2 °/ 19. Is it possible to apply further operations to the matrices of Exercise 18 in order to bring them to the identity? Notice that when this is possible for a given matrix A, the product Ρ of the elementary matrices corresponding to these operations has the property PA = I. That is, Ρ is an inverse to A. Using this suggestion compute inverses to these matrices also: /o 1 0 0 0 4 0 1 0 -2 7 -5 0 5 -1 3 6 0 1 1 6 2 0 3 3 -5 3 3 2 4 2 6 0 1 3 3 0 1 2 1 0 !\ 4 °/ /8 0 0 \2 6 0 1 0 0 1 0 -1 0 0 0 -1 20. Find a 2 χ 2 matrix A, different from the identity such that A2 = I. Find a 2 χ 2 matrix such that A2 = — I. 21. Is the equation (I + A)(I + B)=I + A + B possible (with nonzero AandB)? 22. An η χ и matrix A = (a/) is said to be diagonal if a/ = 0 for / Ф]. Show that diagonal matrices commute; that is, if A and В are diagonal matrices, AB = BA. Give necessary and sufficient conditions for a diagonal matrix to have an inverse.
1.4 Linear Subspaces of R^n

In the last section we saw that the equation Ax = b can be solved only for those b restricted by certain linear equations, and that the set of solutions of that equation might have some degrees of freedom. In both cases these sets are determined by some linear equations; such sets are called linear subspaces of R^n. We will begin with an intrinsic definition of linear subspace and the notion of its dimension. In the next section we shall find a simple relation between the dimensions of the sets related to the equation Ax = b.

Definition 4.
(i) A set V in R^n is a linear subspace if it is closed under the operations of addition and scalar multiplication. That is, these conditions must be satisfied:
    (1) v_1, v_2 ∈ V implies v_1 + v_2 ∈ V.
    (2) r ∈ R, v ∈ V implies rv ∈ V.
(ii) If S is a set of vectors in R^n, the linear span of S, denoted [S], is the set of all vectors of the form c^1 v_1 + ... + c^k v_k with c^1, ..., c^k ∈ R and v_1, ..., v_k ∈ S.
(iii) The dimension of a linear subspace V of R^n is the minimum number of vectors whose linear span is V.

Linear Span

Having now given the intuitively loaded word "dimension" a definition, we had better hope that it suits our preconception of that notion. It does just that in R^3: a line is one dimensional since it is the linear span of but one vector; and a plane is two dimensional because we need that many vectors to span it. In fact, it is precisely those observations which have motivated the above definition. We should also ask that the above definition makes
this assertion true: R^n has dimension n. You may need a little convincing that this is not immediately obvious, since you do know of n vectors (the standard basis) whose linear span is R^n. But how can we be sure that we cannot find fewer than n vectors with the same property? Consider this restatement of the notion of "spanning": If the vectors v_1, ..., v_k span R^n, then the system of n simultaneous linear equations

    Σ_{i=1}^{k} x^i v_i = b

has a solution for every b ∈ R^n. We already know from the preceding section that this cannot be if k < n, and that gives us a proof that R^n has dimension n. We now repeat the arguments in the present context.

Theorem 1.3. If the set S of vectors in R^n spans R^n, then S has at least n members. Thus, the dimension of R^n is n.

Proof. The proof is by induction on n and goes like this. Supposing that v_1, ..., v_k span R^n, one of them must have a nonzero first entry. Subtracting an appropriate multiple of that from each of the others, we may suppose that the remaining k - 1 vectors all have first entry equal to zero. Then they are the same as vectors in R^{n-1}, and since the original v_1, ..., v_k spanned R^n we can show that these must span R^{n-1}. Now, by induction k - 1 ≥ n - 1, and we have it. (Notice that this is the same as the first step in the proof of Theorem 1.2.)

Here now is a more precise argument. If none of the v_1, ..., v_k has a nonzero first entry then E_1 = (1, 0, ..., 0) could hardly be in their linear span. Letting a_i be the first entry of v_i, we may suppose (by reordering) that a_1 ≠ 0. Now let w_1 = v_1 and w_j = v_j - a_j a_1^{-1} v_1 for j = 2, ..., k. The vectors w_1, ..., w_k have the same linear span as the vectors v_1, ..., v_k (see Problem 18); the difference is that only w_1 has a nonzero first entry. Let w_1 = (a_1, b_1), w_2 = (0, b_2), ..., w_k = (0, b_k), where b_1, ..., b_k are in R^{n-1}. Now, b_2, ..., b_k span R^{n-1}.
For let c ∈ R^{n-1}. Then (0, c) ∈ R^n, and since w_1, ..., w_k span R^n, there are x^1, ..., x^k ∈ R such that

    Σ_{i=1}^{k} x^i w_i = (0, c)

Thus

    x^1 a_1 + x^2 · 0 + ... + x^k · 0 = 0,
    x^1 b_1 + x^2 b_2 + ... + x^k b_k = c.

Since a_1 ≠ 0, the first equation implies x^1 = 0, so the second equation becomes

    x^2 b_2 + ... + x^k b_k = c

Thus, b_2, ..., b_k span R^{n-1}, so by induction k - 1 ≥ n - 1; that is, k ≥ n. Thus, dim R^n ≥ n. On the other hand, the standard basis E_1, ..., E_n clearly spans, so in fact dim R^n = n.
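The induction step in this proof is itself a small algorithm, and it can be sketched in code. The following is a minimal illustration, not part of the original text: the function names `peel_first_coordinate` and `spans` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is assumed so that tests for zero are reliable. Each call removes one vector and one coordinate, so a list of fewer than n vectors can never pass n rounds, which is the content of Theorem 1.3.

```python
from fractions import Fraction


def peel_first_coordinate(vectors):
    """One step of the proof: pick a vector v_1 with nonzero first entry a_1,
    form w_j = v_j - a_j * a_1^{-1} * v_1 for the others, and return the
    vectors b_j obtained by dropping the (now zero) first entry.
    Returns None when every first entry is zero (so E_1 is not in the span)."""
    vs = [[Fraction(x) for x in v] for v in vectors]
    idx = next((i for i, v in enumerate(vs) if v[0] != 0), None)
    if idx is None:
        return None
    v1 = vs.pop(idx)
    a1 = v1[0]
    reduced = []
    for v in vs:
        factor = v[0] / a1
        w = [x - factor * y for x, y in zip(v, v1)]
        reduced.append(w[1:])  # first entry of w is 0 by construction
    return reduced


def spans(vectors, n):
    """Decide whether the given vectors of length n span R^n by repeating
    the step above n times."""
    if n == 0:
        return True
    rest = peel_first_coordinate(vectors)
    if rest is None:
        return False
    return spans(rest, n - 1)


print(spans([(1, 0, 0), (0, 1, 0), (0, 0, 1)], 3))  # three vectors can span R^3
print(spans([(1, 2, 3), (4, 5, 6)], 3))             # two vectors never can
```

The recursion relies on the converse of the step in the proof as well: if some vector has a nonzero first entry and the reduced vectors span R^{n-1}, then the original vectors span R^n.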
Examples

16. Let

    v_1 = (0, 1, 0, 3)
    v_2 = (2, 2, 2, 2)
    v_3 = (3, 3, 3, 3)

be three vectors in R^4, and let S be their linear span. Then clearly dim S ≤ 3. But it is also clear that v_3 is superfluous, since v_3 = (3/2) v_2. Thus S is also the linear span of v_1, v_2: if

    v = a^1 v_1 + a^2 v_2 + a^3 v_3

then we can also write

    v = a^1 v_1 + (a^2 + (3/2) a^3) v_2

Thus, dim S ≤ 2. In fact, S has dimension precisely 2. For suppose there were a vector w = (a^1, a^2, a^3, a^4) which spanned S. Then we would have numbers c_1, c_2 such that v_1 = c_1 w, v_2 = c_2 w. Explicitly this becomes

    0 = c_1 a^1    2 = c_2 a^1
    1 = c_1 a^2    2 = c_2 a^2
    0 = c_1 a^3    2 = c_2 a^3
    3 = c_1 a^4    2 = c_2 a^4

But this is clearly impossible. By the second equation in the first column we must have c_1 ≠ 0, so by the first we must have a^1 = 0. But 2 = c_2 a^1, which could not be. Thus, dim S = 2.

17. Let V be the subset of R^4 given by

    V = {v : v^1 + v^2 + v^3 - v^4 = 0}

V is certainly a linear subspace of R^4. We will shortly have the theoretical tools to deduce that V has dimension 3; with a little work we can show it now. First of all, let A_1 = (1, 0, 0, 1), A_2 = (0, 1, 0, 1), A_3 = (0, 0, 1, 1). Then A_1, A_2, A_3 are all in V, and if v = (v^1, v^2, v^3, v^4) is in V, since v^4 = v^1 + v^2 + v^3 we have

    v = v^1 A_1 + v^2 A_2 + v^3 A_3
Thus V is the linear span of A_1, A_2, A_3, so dim V ≤ 3. On the other hand, if dim V < 3, then A_1, A_2, A_3 can all be expanded in terms of some pair of vectors B_1, B_2. If we delete the fourth entry in all these vectors this amounts to saying that the standard basis vectors in R^3 can be spanned by a pair of vectors. But dim R^3 = 3, so this is impossible. Thus dim V = 3 also.

Independence

Repeating the definition once again, dimension is the minimum number of vectors it takes to span a linear space. There is another closely allied intuitive concept: that of "degrees of freedom" or "independent directions." In such phrases as "there is a four parameter family of curves," "two independently varying quantities are involved," allusion is being made to a dimension-like notion. Now, if we try to pin down this notion mathematically and specify the concept of independence in the linear space context, it turns out to be precisely the requirement for a spanning set of vectors to be minimal. In other words, the dimension of a linear space is also the maximum number of degrees of freedom, or independent vectors in the space.

Definition 5. Let S be a set of vectors in R^n. We say that S is a set of independent vectors if the equation

    x^1 v_1 + ... + x^k v_k = 0

with x^1, ..., x^k ∈ R and v_1, ..., v_k distinct elements of S implies x^1 = 0, ..., x^k = 0.

The standard basis of R^n is an independent set, as is very easy to verify. We now verify that R^n has in fact no more than n degrees of freedom in this sense.

Proposition 10. Let v_1, ..., v_k be an independent set in R^n. Then k ≤ n.

Proof. The proof is by induction on k. The case k = 1 is automatically true, since n ≥ 1 always. Now let us proceed to the induction step (k > 1). Let a_i be the first entry of v_i; we can thus write v_i = (a_i, b_i), where b_i ∈ R^{n-1}. If all the a_i are zero, then b_1, ..., b_k are an independent set in R^{n-1}. By the induction assumption then, k ≤ n - 1, so k ≤ n. Now suppose instead that some a_i is nonzero.
We can reorder the given vectors so that a_1 ≠ 0. Let w_i = v_i - a_i a_1^{-1} v_1 for i ≥ 2. Then the first entry of w_i is 0, so w_i = (0, β_i) with β_i ∈ R^{n-1}. β_2, ..., β_k are an
independent set in R^{n-1}. For if Σ_{i=2}^{k} c^i β_i = 0, then also Σ_{i=2}^{k} c^i w_i = 0, so

    ( -Σ_{i=2}^{k} c^i a_i a_1^{-1} ) v_1 + Σ_{i=2}^{k} c^i v_i = 0

Since v_1, ..., v_k are an independent set, c^2 = ... = c^k = 0, so β_2, ..., β_k are also independent. Thus, by induction, once again k - 1 ≤ n - 1. Thus in every case k ≤ n, and the proposition is proved.

Examples

18. Let

    v_1 = (0, 3, 0, 2)
    v_2 = (5, 1, 1, 2)
    v_3 = (1, 0, 2, 2)

In order to show that these vectors are independent we must show that the system of equations

    x^1 v_1 + x^2 v_2 + x^3 v_3 = 0      (1.23)

has only the zero solution. But this system is the same as the system corresponding to the matrix whose columns are v_1, v_2, v_3:

    A = [ 0  5  1 ]
        [ 3  1  0 ]
        [ 0  1  2 ]
        [ 2  2  2 ]

If we row reduce this matrix we obtain

    PA = [ 1  1  1 ]
         [ 0  1  2 ]
         [ 0  0  1 ]
         [ 0  0  0 ]

Now the system PAx = 0 obviously has only the zero solution: if

    x^1 + x^2 +  x^3 = 0
          x^2 + 2x^3 = 0
                 x^3 = 0      (1.24)
                   0 = 0
we find, reading upward, that x^3 = 0, x^2 = 0, x^1 = 0. Since P is invertible, the system Ax = 0 then has only the zero solution. What is the same, if (1.23) holds, so must (1.24), so x^3 = x^2 = x^1 = 0. Thus the vectors v_1, v_2, v_3 are independent.

19. Now let

    v_1 = (3, 2, 1, 0)
    v_2 = (1, 2, 3, 1)
    v_3 = (2, 0, -2, -1)

Again, let A be the matrix whose columns are v_1, v_2, v_3:

    A = [ 3  1   2 ]
        [ 2  2   0 ]
        [ 1  3  -2 ]
        [ 0  1  -1 ]

A row reduces to

    PA = [ 1  3  -2 ]
         [ 0  1  -1 ]
         [ 0  0   0 ]
         [ 0  0   0 ]

The system PAx = 0 has the solutions

    x^1 = -3x^2 + 2x^3
    x^2 = x^3

The system Ax = 0 has the same solutions. Taking x^3 = 1 we have the particular solution (-1, 1, 1). Thus

    -v_1 + v_2 + v_3 = 0

20. Four vectors in R^3 cannot be independent. Let

    v_1 = (2, 1, 2)
    v_2 = (0, 3, 0)
    v_3 = (1, 0, 4)
    v_4 = (0, 1, 2)
Find a linear relation which these vectors must satisfy. If we row reduce the matrix whose columns are the v's, we obtain the matrix

    A = [ 1  3    0     1  ]
        [ 0  1  -1/6   1/3 ]
        [ 0  0    1    -1  ]

Now, corresponding to any value of x^4 we obtain a solution of Ax = 0, and thus of Σ x^i v_i = 0. Take x^4 = 1. Then

    x^3 = x^4 = 1
    x^2 = (1/6) x^3 - (1/3) x^4 = -1/6
    x^1 = -3x^2 - x^4 = -1/2

Thus

    -(1/2) v_1 - (1/6) v_2 + v_3 + v_4 = 0

Now, the equivalent form of these two propositions about R^n, that any spanning set of vectors has at least n members, and any independent set has at most n members, holds for any linear subspace of R^n as well.

Proposition 11. Let V be a linear subspace of R^n of dimension d.
(i) A spanning set has no less than d elements.
(ii) An independent set has no more than d elements.

Proof. Part (i) is of course just the definition, so we need only consider part (ii). The proof amounts to a reduction to the case where V is R^d, and an application of Proposition 10. Let w_1, ..., w_d span V; since V has dimension d there exist such vectors. Suppose, as in (ii), that v_1, ..., v_k are independent vectors in V. Then we can write each v_j as a linear combination of w_1, ..., w_d:

    v_j = Σ_{i=1}^{d} a_j^i w_i,    1 ≤ j ≤ k

for suitable numbers a_j^i. The vectors (a_j^1, ..., a_j^d) for j = 1, ..., k are vectors in R^d corresponding to the vectors {v_j}; we shall now show that they are likewise independent. For if

    Σ_{j=1}^{k} c^j (a_j^1, ..., a_j^d) = 0
then also Σ_{j=1}^{k} c^j v_j = 0 by this computation:

    Σ_{j=1}^{k} c^j v_j = Σ_{j=1}^{k} c^j Σ_{i=1}^{d} a_j^i w_i = Σ_{i=1}^{d} ( Σ_{j=1}^{k} c^j a_j^i ) w_i = 0 · w_1 + ... + 0 · w_d

Thus, by the independence of v_1, ..., v_k, we must have c^1 = 0, ..., c^k = 0. Thus the k vectors (a_j^1, ..., a_j^d) are independent in R^d, so by Proposition 10 d ≥ k.

Definition 6. Let V be a linear space. A basis of V is a set S of vectors such that each v ∈ V can be written in the form

    v = Σ_{i=1}^{k} c^i v_i    with c^i ∈ R, v_i ∈ S

in one and only one way.

Another way of putting this is: a basis for a linear subspace V is a set of independent and spanning vectors in V.

Proposition 12. S is a basis for the linear space V if and only if both these conditions hold:
(i) S is an independent set,
(ii) the linear span of S is V.

Proof. Suppose that S is a basis of V. Since every vector in V can be written as a linear combination of vectors in S, certainly (ii) is true: V is the linear span of S. Since 0 can be written in only one way as a linear combination of vectors in S, any time we have

    c^1 v_1 + ... + c^k v_k = 0

with c^1, ..., c^k in R and v_1, ..., v_k distinct members of S, we must have c^1 = 0, ..., c^k = 0 (since 0 = 0 · v_1 + ... + 0 · v_k also). Thus (i) holds: S is an independent set.

Conversely, suppose now that (i) and (ii) are true for the set S. Then (by (ii)) any vector v in V can be written

    v = c^1 v_1 + ... + c^k v_k      (1.25)

with c^i ∈ R, v_i ∈ S. This can be done in only one way because of the independence. In fact, suppose (1.25) holds, and also

    v = a^1 v_1 + ... + a^k v_k      (1.26)

is true. Subtracting (1.26) from (1.25) gives (c^1 - a^1) v_1 + ... + (c^k - a^k) v_k = 0, so c^i = a^i since the v_i are independent.
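The uniqueness asserted in Definition 6 means that, relative to a basis, every vector has well-defined coordinates, and these can be found by row reduction. The sketch below is an illustration only: the function name `coordinates` is hypothetical, and exact rational arithmetic is assumed. It solves c^1 v_1 + ... + c^k v_k = v by reducing the augmented matrix whose columns are v_1, ..., v_k and v, and it is tried on the vectors A_1, A_2, A_3 of Example 17.

```python
from fractions import Fraction


def coordinates(basis, v):
    """Return [c^1, ..., c^k] with c^1 v_1 + ... + c^k v_k = v, or None if v
    is not in the span. Raises ValueError if the v_i are not independent,
    since then the coefficients would not be unique."""
    k, n = len(basis), len(v)
    # Row r of the augmented matrix is (v_1[r], ..., v_k[r] | v[r]).
    rows = [[Fraction(b[r]) for b in basis] + [Fraction(v[r])] for r in range(n)]
    done = 0  # number of pivot rows found so far
    for col in range(k):
        p = next((r for r in range(done, n) if rows[r][col] != 0), None)
        if p is None:
            raise ValueError("vectors are not independent")
        rows[done], rows[p] = rows[p], rows[done]
        lead = rows[done][col]
        rows[done] = [x / lead for x in rows[done]]
        for r in range(n):
            if r != done and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [a - f * b for a, b in zip(rows[r], rows[done])]
        done += 1
    # The remaining rows read 0 = (something); all must read 0 = 0.
    if any(rows[r][k] != 0 for r in range(done, n)):
        return None
    return [rows[r][k] for r in range(k)]


A1, A2, A3 = (1, 0, 0, 1), (0, 1, 0, 1), (0, 0, 1, 1)
print(coordinates([A1, A2, A3], (2, -1, 5, 6)))  # v lies in V, since 2 - 1 + 5 = 6
print(coordinates([A1, A2, A3], (1, 0, 0, 0)))   # not in V
```

For a vector of V the coefficients returned are just its first three entries, in agreement with the formula v = v^1 A_1 + v^2 A_2 + v^3 A_3 of Example 17.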
Dimension and Basis

The important facts to know about dimension of linear subspaces of R^n are these: such a space V always has a basis with a finite number of elements. That number is the same for all bases and is the dimension of V, and is not greater than n. We summarize this as follows:

Theorem 1.4. Let V be a linear subspace of R^n.
(i) There is an integer d ≤ n such that V has dimension d.
(ii) Any basis of V has precisely d elements.
(iii) d independent vectors in V form a basis.
(iv) d spanning vectors in V form a basis.

Proof. (i) The proof of this part of the theorem is by mathematical induction on n. If n = 1, either V = {0} or V has a nonzero vector, in which case V = R. Thus either dim V = 0 or 1, so dim V ≤ 1.

Now we proceed to the induction step. Let us describe how it goes. We assume the assertion (i) for n - 1, and consider R^{n-1} as the set of n-tuples in R^n with zero first entry. If V is a subspace of R^n, it intersects this space in some subspace of R^{n-1} which is, by induction, spanned by some δ vectors, with δ ≤ n - 1. Now, choosing any other vector in V with a nonzero first entry, this together with the vectors referred to above will span V.

Now we make this argument precise. Let V be a subspace of R^n. If V = {0}, then dim V = 0; if not, V has a nonzero vector v_0 = (a^1, ..., a^n). One of the entries is nonzero; we may, by reordering the coordinates, assume that a^1 ≠ 0. Let now

    W = {w ∈ R^{n-1} : (0, w) ∈ V}

W is a linear subspace of R^{n-1}. For if w_1, w_2 ∈ W and c^1, c^2 ∈ R, we also have

    c^1 (0, w_1) + c^2 (0, w_2) = (0, c^1 w_1 + c^2 w_2)

in V, so c^1 w_1 + c^2 w_2 ∈ W. Now, by the induction hypothesis, W has dimension δ ≤ n - 1. Let w_1, ..., w_δ span W. By definition of W, v_1 = (0, w_1), ..., v_δ = (0, w_δ) are in V. Now we need only show that v_0, v_1, ..., v_δ span V. Let v ∈ V, and let c be its first entry. Then v - c (a^1)^{-1} v_0 is also in V and its first entry is 0. Thus this vector is of the form (0, w) with w ∈ W.
Then there are c^1, ..., c^δ such that

    w = c^1 w_1 + ... + c^δ w_δ

Thus

    v - c (a^1)^{-1} v_0 = (0, w) = c^1 (0, w_1) + ... + c^δ (0, w_δ)
or,

    v = c (a^1)^{-1} v_0 + c^1 v_1 + ... + c^δ v_δ

Thus, there are δ + 1 vectors which span V, so V has dimension d with d ≤ δ + 1 ≤ (n - 1) + 1 = n.

(ii) This follows easily from Propositions 11 and 12. If S is a basis for V (dim V = d), then since S spans, it has at least d elements, and since S is independent it has at most d elements.

(iii) Suppose that v_1, ..., v_d are independent vectors in V; we must show that they span. Let v_0 ∈ V. By Proposition 11(ii), since dim V = d, v_0, ..., v_d are dependent, so there exist (c^0, ..., c^d) ≠ 0 such that

    c^0 v_0 + ... + c^d v_d = 0

If c^0 = 0, since v_1, ..., v_d are independent we must also have c^1 = ... = c^d = 0, a contradiction. Thus c^0 ≠ 0, so

    v_0 = (-c^0)^{-1} (c^1 v_1 + ... + c^d v_d)

as desired.

(iv) Suppose that v_1, ..., v_d span V. If they are dependent, then the equation

    c^1 v_1 + ... + c^d v_d = 0

holds with at least one c^i ≠ 0. If say c^r ≠ 0, then

    v_r = (-c^r)^{-1} (c^1 v_1 + ... + c^{r-1} v_{r-1} + c^{r+1} v_{r+1} + ... + c^d v_d)

so v_1, ..., v_d, with v_r excluded, also span V. Hence, V has dimension at most d - 1, a contradiction, so we must have had v_1, ..., v_d independent and thus a basis.

This final proposition, whose proof is left as an exercise, is an indication of the (theoretical) ease in finding bases.

Proposition 13. Let V be a linear subspace of R^n of dimension d.
(i) Any set of vectors whose span is V contains d vectors which form a basis.
(ii) Any set of independent vectors in V is part of a basis for V.

Examples

21. Find a basis for the linear span V of the vectors

    v_1 = (4, 3, 2, 1)
    v_2 = (5, 2, 2, 1)
    v_3 = (0, 1, 0, 1)
    v_4 = (1, 0, 0, 1)

and express V by a linear equation.
50 1. Linear Functions We want to find all vectors b of the form Е*Ч = Ь (1-27) and we want to find a basis for such vectors. Now (1.27) is the system corresponding to the matrix A whose columns are the vectors yi< v2. v3. v4 · The span of these vectors is just the range of A. If Ρ is a product of elementary matrices row reducing A, then any vector b is in the range of A if and only if Pb is in the range of PA. Thus by row reduction we should easily be able to solve our problem. A = The end result of row reduction produces Ί 1 1 1\ /0 0 0 Γ PA=I° l ~l °l P= I1 ° ° ~4 1 0 0 1 -11 lo 0 -1/2 1 \0 0 0 0/ \l 1 -3/2 -4y Thus, the range of A is obtained by setting the fourth entry of Pb to zero: V = range of A = {φ1, b2, b3, bA): bl + b2 - \ЪЪ - 4b* = 0} V has dimension at least three since it contains the independent vectors (4, 0, 0, 1), (0, 4, 0, 1), (0, 0, 2/3, -1/4). On the other hand, V # R\ so dim V < 3. Thus, dim V = 3 and these three vectors are a basis. 22. Find a basis for the linear subspace V of R5 given by the equations 5xx + 8x2 + 3x3 + x4 + x5 = 0 x1 - x3- x5=0 x2 + 2x4 =0 We are seeking the solution space of Ax = 0, where /5 8 3 1 1\ A= 1 0-10 -1 \0 1 0 2 0/
1.4 Linear Subspaces ofR" 51 Row reduction leads to /10-1 0 -1\ PA= 0 1 0 2 0 \0 0 -2 -15 6/ and Fis the set of χ such that PAx = 0. According to these equations x4 and x5 are to be freely chosen and x1, x2, x3 determined by this choice. Thus, dimF=2. Choosing (x4, x5) = (1, 0), (0,1), respectively, we obtain as a basis (-¥.-2,-^,1,0) (4,0,3,0,1) • EXERCISES 21. What is the dimension of the linear span of these vectors? (a) Vi =(-1,2,-1,0) v2=(2,5,7,2) v3=(0,2, 1, 1) v* = (3, 5, 7, 1) (b) vi =(-1,0,2,1) v2=(2,2,-2,2) v3 = (l,l, 1,1) (c) Vl =(0,2, 1,1) v2=(l,7, 3, 3) v3 = (0,0, 0, 1) v* = (l,3, 1,2) v5=(l,5,2,2) (d) vi= (0,0, 1,1,1) v2 = (1,0,0, 1, 1) v3 = (0, 1,0, 1,0) 22. What is the dimension of the space S given by these equations: (a) 5 = {x e R5: x1 + x2 - x3 - x* = 0, x1 + x3 = 0} (b) 5 = {x e R5: x2 + x* + x5 = 0, x1 - x3 + x* = 0 л:1 - χ2 - x3 - x5 = 0} (c) S = {\ e R*: χ1 + χ2 + χ3 = χ3 - χ2 - χ1 + χ4} 23. Determine the linear span of these vectors by a system of equations (a) Vi= (1,0,0, 1) va=(0, 1,1,0) v3=(0, 1,0,1) (b) vi = (2, 2, 6, 2) v2=(l,2,3,0) v3 = (0,1,0,-1) (c) vi= (1,0,1) v, = (-1,1,1) (d) vi= (1,0, 0,0,0) v2 = (2,0, 1,0, 1)
24. Are these vectors independent?
    (a) v_1, ..., v_5 as given in Exercise 21(c).
    (b) v_1, v_2, v_3 as given in Exercise 23(a).
    (c) v_1 = (0, 2, 0, 2, 0, 6), v_2 = (1, 1, -1, -1, 1, 1), v_3 = (2, 4, 6, 8, 10, 12), v_4 = (0, 0, -2, -2, 0, 0), v_5 = (0, 1, 0, 0, 1, 0), v_6 = (1, 1, 1, 1, 1, 1)
25. Find all linear relations involving these sets of vectors.
    (a) v_1 = (0, 1, 1), v_2 = (5, 3, 1), v_3 = (0, 2, 0), v_4 = (1, -1, 1)
    (b) v_1 = (0, 2, 0, 2), v_2 = (0, 1, 0, 0), v_3 = (0, 1, 0, 1), v_4 = (0, 0, 0, 1), v_5 = (1, 0, -1, 0)
    (c) v_1 = (0, 0, 0, 0), v_2 = (1, 1, 1, 1), v_3 = (1, 1, 0, 0), v_4 = (0, 0, -2, -2)
26. Find a basis for the linear subspace of R^5 spanned by (0, 0, 0, 1, 1), (0, 1, 0, 0, 0), (1, 0, 0, 0, 1), (1, 1, 0, 0, 1), (2, 1, 0, 1, 2).
27. Find a basis for these linear spaces:
    (a) {(x^1, ..., x^5) ∈ R^5 : x^1 + 2x^2 + x^3 = 0, x^1 + 2x^4 + x^5 = 0, x^1 + x^5 = 0}
    (b) {(x^1, ..., x^4) ∈ R^4 : x^1 - x^2 + x^3 - x^4 = 0, x^1 - x^3 = 0}
28. If the given vectors in R^5 are independent, extend them to a basis:
    (a) (0, 0, 0, 0, 1), (0, 0, 0, 1, 1), (0, 0, 1, 1, 1)
    (b) (1, 5, 2, 0, -3), (6, 7, 0, 2, 1), (1, 0, -1, -2, 0), (1, 1, 1, 1, 1)
    (c) (4, 4, 3, 2, 1), (3, 3, 3, 2, 1), (2, 2, 2, 2, 1)

• PROBLEMS

23. Suppose we are given k vectors v_1, ..., v_k in R^n. Let w_1 = v_1, w_2 = v_2 - β_2 v_1, ..., w_k = v_k - β_k v_1 for some numbers β_2, ..., β_k. Show that the sets {v_1, ..., v_k} and {w_1, ..., w_k} have the same linear span.
24. The proof of Theorem 1.3 proceeds by assuming that the set S consists of the vectors v_1, ..., v_k. What of the case where S has infinitely many elements?
25. Prove Proposition 13.
26. Show that if V, W are subspaces of R^n, so is V ∩ W.
27. Show that if A is obtained from B by a row operation, the linear span of the rows of A is the linear span of the rows of B.
28. Show that if A is a row-reduced matrix the dimension of the linear span of the rows of A is the same as its index.

1.5 Rank + Nullity = Dimension

Now let us apply the propositions of the preceding section about linear spaces, and in particular the notion of dimension, to the subject of linear transformations. There are certain obvious linear spaces to be associated to a given transformation.

Definition 7. Let T: R^n → R^m be a linear transformation.
(i) The set

    K(T) = {v ∈ R^n : T(v) = 0}

is a linear subspace of R^n, called the kernel of T. Its dimension is the nullity of T, denoted ν(T).
(ii) The set

    R(T) = {T(v) : v ∈ R^n}

is a linear subspace of R^m, called the range of T. Its dimension is the rank of T, denoted ρ(T).

Theorem 1.5. Let T: R^n → R^m be a linear transformation. We have

    n = ν(T) + ρ(T)

that is, dimension = nullity + rank.

Proof. For short, write ν(T) as ν. Let v_1, ..., v_ν be a basis for the kernel of T. Let v_{ν+1}, ..., v_n be the rest of a basis for R^n: v_1, ..., v_ν, v_{ν+1}, ..., v_n thus span R^n. Let w_j = T(v_j) for j = ν + 1, ..., n. Now the crux of the matter is this: w_{ν+1}, ..., w_n form a basis for the range of T. Once this is shown, we will have ρ(T) = n - ν, which is the desired equation.

(i) Let w ∈ R(T). Then there is a v ∈ R^n such that w = T(v). Expand v in the
basis v_1, ..., v_n: v = c^1 v_1 + ... + c^n v_n. Then

    w = T(v) = T(c^1 v_1 + ... + c^n v_n)
      = c^1 T(v_1) + ... + c^ν T(v_ν) + c^{ν+1} T(v_{ν+1}) + ... + c^n T(v_n)
      = c^{ν+1} w_{ν+1} + ... + c^n w_n

The second line is justified since T is linear and the third follows since v_1, ..., v_ν are in the kernel of T and T(v_{ν+1}) = w_{ν+1}, ..., T(v_n) = w_n. Thus these last vectors span R(T).

(ii) w_{ν+1}, ..., w_n are independent. Suppose

    c^{ν+1} w_{ν+1} + ... + c^n w_n = 0      (1.28)

We must show that the {c^i} are all zero. In any event, from (1.28) we have

    T(c^{ν+1} v_{ν+1} + ... + c^n v_n) = c^{ν+1} T(v_{ν+1}) + ... + c^n T(v_n) = 0

so

    c^{ν+1} v_{ν+1} + ... + c^n v_n ∈ K(T)

v_1, ..., v_ν span K(T) so there are c^1, ..., c^ν such that

    c^{ν+1} v_{ν+1} + ... + c^n v_n = c^1 v_1 + ... + c^ν v_ν

or

    (-c^1) v_1 + ... + (-c^ν) v_ν + c^{ν+1} v_{ν+1} + ... + c^n v_n = 0

Since v_1, ..., v_n are independent, all the c^j are zero, as required. The theorem is proven.

Examples

23. Let T: R^4 → R^3 be given by the matrix

    A = [ 1  3  2  7 ]
        [ 0  0  1  1 ]      (1.29)
        [ 0  1  0  0 ]

We can completely analyze this transformation by row reduction. A easily row reduces to

    [ 1  3  2  7 ]
    [ 0  1  0  0 ]      (1.30)
    [ 0  0  1  1 ]
1.5 Rank + Nullity = Dimension 55 merely by interchanging the last two rows. Thus, letting Ρ be the transformation corresponding to (1 °o Ϊ) \0 1 0/ we know that PT is the linear transformation corresponding to (1.30). Now, the range of PT is easily seen to be all of R3, and the range of Τ is P~1 (range of PT), which is again all of R3, so p(T) = 3. The kernel of Τ is the same as the kernel of PT, which has the equations given by (1.30): x1 + 3x2 + 2x3 + 7x4 = 0 x2 = 0 (1.31) x3 + x4 = 0 The set of all such solutions is found by letting x4 take on all real values and solving for the remaining coordinates by (1.31). Thus K(T) = {(-5i, 0, -t, t): 16 R}, which is one dimensional. 24. 1 "I Let Τ /1 0 3 " \2 - :R 1 1 ■1 ■3 4 _,, 1 0 3 2 R*\ 2 1 2 -1 R4 be given by the matrix Let us row reduce this matrix, keeping track of our row operations: v0 PA Now, the kernel of Τ is easy to find; it is the same as the kernel of the transformation S corresponding to the last matrix PA (because
56 1. Linear Functions S = the composition of Τ by an invertible transformation). Now the kernel of S, and thus also of T, has, corresponding to the matrix PA, the form: x1 + x2 + x3 + 2x4 = 0 x2 + x4 = 0 0 = 0 0 = 0 or x2 = — x4, x1 = — x3 — x4. Thus, K(T) = {(-(и + ν), -ν, и, υ): (и, υ) е R2} so ν(Γ) = 2. The range of Γ is a little harder to find. If R is the transformation corresponding to the product of the elementary matrices on the left, then S = R - T, so the range of Γ is R ~x of the range of S, which has the equations x3 = x4 = 0. (That is, the vector (61, ..., b4) is in the range of S if and only if there exist (x1, ..., x4) such that x1 + x2 + x3 + 2x4 = bl x2 + хА = Ъ2 0 = b3 0 = 64 The necessary and sufficient condition is b3 = bA = 0.) Thus the necessary and sufficient condition for ν to be in the range of Τ is that Pv be in the range of S; that is, the third and fourth coordinates of Pv must vanish: -Зх1 +4х2 + х3 =0 -2jcx-5x2 + x4 = 0 Thus, p(T) = 2. 25. Let us do one more example briefly. Suppose that T: R3 -> R5 corresponds to the matrix /1 0 1\ 2 1 31 0 11 1 1 21 \4 3 7/
1.5 Rank + Nullity = Dimension 57 This matrix can be row reduced to /1 0 0 0 \o 0 1 0 0 0 i\ 1 0 0 o/ by multiplication on the left by this matrix / ι -2 2 1 \-i 0 1 -1 -1 -1 0 0 1 0 -1 0 0\ 0 0 0 0 1 0 -i i/ The kernel of Γ can be found by looking at the row-reduced form A; it is the set of χ = (χ1, χ2, χ3) in R3 such that Ax = 0. Precisely, we must have x1 + x3 = 0, x2 + x3 = 0. Thus a vector is in K(T) if its first and second coordinates are the negative of the third; that is, K{T)= {{-t,-t,t):teR}. Thus v(T) = 1. The range of Τ is the set of χ = (x1, ..., x5) such that Px is in the range of A (since A = PT). The 5-tuples in the range of A are precisely those with third, fourth, and fifth coordinates zero. Thus the third through fifth coordinates of Px must be zero for χ to be in the range of T. Specifically R(T) is the set of simultaneous solutions of 2xx x2 + x3 + X« x4 + x5 ■0 0 ■ 0 We can take x1, x2 as free variables and use these equations to define x3, x4, x5: thus R(T) = {(и, ν - 2u, ν - u, 3υ - 2w): (и, v) e R2} so p(T) = 2. These examples illustrate the fact that Theorem 1.5 can be formulated purely in terms of matrices. We now do just that.
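As a check on Theorem 1.5 in matrix terms, the computation of Example 23 can be reproduced mechanically. The following is a minimal sketch and not part of the original text: the function names `rref` and `kernel_basis` are hypothetical, exact rational arithmetic is assumed, and the reduction goes one step further than the row-reduced form used in the examples by also clearing the entries above each leading 1, which makes the kernel easy to read off.

```python
from fractions import Fraction


def rref(matrix):
    """Row reduce, clearing entries above and below each leading 1.
    Returns the reduced rows and the list of pivot columns."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivot_cols = []
    r = 0
    for col in range(len(rows[0])):
        p = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if p is None:
            continue  # no pivot here: this column gives a free variable
        rows[r], rows[p] = rows[p], rows[r]
        lead = rows[r][col]
        rows[r] = [x / lead for x in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivot_cols.append(col)
        r += 1
        if r == len(rows):
            break
    return rows, pivot_cols


def kernel_basis(matrix):
    """One basis vector of {x : Ax = 0} per free variable."""
    rows, pivot_cols = rref(matrix)
    n = len(matrix[0])
    basis = []
    for free in (c for c in range(n) if c not in pivot_cols):
        x = [Fraction(0)] * n
        x[free] = Fraction(1)
        for r, pc in enumerate(pivot_cols):
            x[pc] = -rows[r][free]
        basis.append(x)
    return basis


A = [[1, 3, 2, 7],
     [0, 0, 1, 1],
     [0, 1, 0, 0]]  # the matrix (1.29)

rank = len(rref(A)[1])
nullity = len(kernel_basis(A))
print(rank, nullity, rank + nullity == len(A[0]))
print(kernel_basis(A))
```

The rank is counted as the number of pivots and the nullity as the number of free variables, so their sum is the number of columns n by construction; the kernel vector found for (1.29) is the one with t = 1 in K(T) = {(-5t, 0, -t, t)}.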
58 1. Linear Functions Proposition 14. Let A be an τη χ η matrix, representing the linear transformation T: R" ->Rm Then p(T) = number of independent columns of A = number of independent rows of A = index of the row-reduced matrix to which A can be reduced. Finally, we can also reformulate Theorem 1.5 as a conclusion for systems of linear equations, thus bringing us to the ultimate version of Theorems 1.1 and 1.2. Theorem 1.6. Suppose given a system of m linear equations in η unknowns, and suppose d is the index, or rank, of the corresponding matrix A. Then (i) d <m,d< n. (ii) {x: Ax = 0} is a vector space of dimension η — d. (iii) {b. there exists a solution of Ax = b} is a vector space of dimension d. • EXERCISES 29. Describe by linear equations the range and kernel of the linear transformations given by these matrices in terms of the standard basis: (a) (b) (c) № /8 0 0 1 6\ V 30. Find bases for K(T), R(T) for each Τ given by the matrices (a)-(d) of Exercise 29. 31. Let/: R"^R be a nonzero linear function. Show that the kernel of/is a linear subspace of R" of dimension и — 1. 32. Let f(x\ ...,x")= 2?=i x'. Find a spanning set of vectors for the kernel of/.
• PROBLEMS

29. Let T: R^n → R^m be a linear transformation. Then K(T) and R(T) are linear subspaces of R^n, R^m, respectively.
30. Let T be the transformation represented by the m × n matrix A. Show that R(T) is spanned by the columns of A. Show that

    K(T) = {(x^1, ..., x^n) : Σ_{j=1}^{n} x^j C_j = 0}

where C_1, ..., C_n are the columns of A.
31. Let w ∈ R^n. Define ⊥(w) as the set of v such that Σ_{i=1}^{n} v^i w^i = 0. Show that for w ≠ 0, ⊥(w) is a subspace of R^n of dimension n - 1.
32. Let S ⊂ R^n. Define ⊥(S) as the set of v such that Σ_{i=1}^{n} v^i w^i = 0 for all w ∈ S. Show that ⊥(S) is a subspace of R^n, and dim ⊥(S) + dim [S] = n.

1.6 Invertible Matrices

In this section we shall pay particular attention to the collection of linear transformations of R^n into R^n (or, what is the same, the n × n matrices). From the point of view of linear equations this is reasonable; for it is usually the case that a given problem will have as many equations as unknowns.

First of all, it is clear that there are certain operations which are defined on the collection of all linear transformations of R^n, thus making of this set an algebraic object of some sort. We collect together all these notions in the following definition.

Definition 8. The algebra of linear operators on R^n, denoted by E^n, is the collection of linear transformations provided with these operations:
(i) if f is in E^n, and c is a real number, (cf)(x) = c f(x)
(ii) if f, g are in E^n, f + g is defined by (f + g)(x) = f(x) + g(x)
(iii) f ∘ g is defined by (f ∘ g)(x) = f(g(x))
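The three operations of Definition 8 act on functions rather than on numbers, which is easy to make concrete. The sketch below is only an illustration: the helper names `scale`, `add`, and `compose`, and the two sample operators on R^2, are assumptions introduced here and do not come from the text.

```python
def scale(c, f):
    """(cf)(x) = c f(x)"""
    return lambda x: tuple(c * y for y in f(x))


def add(f, g):
    """(f + g)(x) = f(x) + g(x)"""
    return lambda x: tuple(a + b for a, b in zip(f(x), g(x)))


def compose(f, g):
    """(f o g)(x) = f(g(x))"""
    return lambda x: f(g(x))


# Two sample linear operators on R^2 (chosen for illustration only).
def rotate(x):
    return (-x[1], x[0])     # quarter turn


def stretch(x):
    return (2 * x[0], x[1])  # double the first coordinate


print(add(rotate, stretch)((1, 1)))
print(scale(3, rotate)((1, 0)))
print(compose(rotate, stretch)((1, 1)))
print(compose(stretch, rotate)((1, 1)))
```

The last two lines apply the same pair of operators in the two possible orders to the same vector; comparing the printed values shows that composition, unlike addition, depends on the order.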
It is important to think of the elements of E^n as functions taking n-tuples of numbers into n-tuples of numbers; but in working with them it is convenient to represent them in terms of the standard basis by matrices. Thus, we are led to consider also the algebra M^n of real n × n matrices with the operations of scalar multiplication, addition and multiplication, the definitions of which we now recapitulate.

Definition 9. The algebra M^n is the collection of n × n matrices provided with these operations:
(i) If A = (a_j^i) is in M^n, and c is a real number,

    cA = (c a_j^i)

(ii) If A = (a_j^i) and B = (b_j^i) are in M^n, then

    A + B = (a_j^i + b_j^i)
    AB = ( Σ_{k=1}^{n} a_k^i b_j^k )

The two algebras E^n, M^n are completely interchangeable, for M^n is just the explicit representation of E^n relative to the standard basis. Now the operations on M^n obey certain laws, some of which we have already observed in previous sections. Let us list some important ones.

Proposition 15. These equations hold for all n × n matrices A, B, C and all real numbers k.
(i) k(A + B) = kA + kB
(ii) C(A + B) = CA + CB
(iii) (A + B)C = AC + BC
(iv) A(BC) = (AB)C

If A is a given matrix, we shall let A^2 denote A · A, A^3 = A · A · A, and in general A^n is the n-fold product of A with itself. Since we may also add matrices, and multiply by real numbers, we may consider polynomials in a given matrix. That is, A^2 + 3A + A, A^7 + 3πA^3 + A^6, .... In fact, if we adopt the usual convention that A^0 = I, then for any polynomial p(X) = Σ_{i=0}^{n} c_i X^i in the indeterminate X, we may consider the matrix p(A) = Σ_{i=0}^{n} c_i A^i. A most remarkable observation can now be made, by noticing that the collection M^n of n × n matrices is the same as the collection R^{n^2} of n^2-tuples of real numbers.
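Evaluating a polynomial at a matrix uses nothing beyond the operations of Definition 9, together with the convention A^0 = I. Here is a minimal sketch, assuming matrices are stored as lists of rows; the names `mat_mul` and `poly_of_matrix` are hypothetical, and Horner's rule is simply one convenient way to organize the computation.

```python
def mat_mul(A, B):
    """Product of two n x n matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def poly_of_matrix(coeffs, A):
    """Given coeffs = [c_0, c_1, ..., c_m], return
    c_0 I + c_1 A + ... + c_m A^m, computed by Horner's rule:
    (...((c_m I) A + c_{m-1} I) A + ...) A + c_0 I."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = mat_mul(result, A)
        for i in range(n):
            result[i][i] += c  # add c times the identity
    return result


A = [[1, 2],
     [3, 4]]
print(poly_of_matrix([0, 0, 1], A))    # p(X) = X^2, so this is A^2
print(poly_of_matrix([-2, -5, 1], A))  # p(X) = X^2 - 5X - 2
```

For this particular A the second polynomial evaluates to the zero matrix, so a nonzero polynomial can vanish at a matrix.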
Proposition 16. Given any n × n matrix A, there is a nonzero polynomial p of degree at most n^2 such that p(A) = 0.

Proof. An element of M^n is a rectangular array of n^2 real numbers, thus corresponds to an element of R^{n^2}. We may make this correspondence explicit, by, say, placing the rows one after another. That is, the matrix (a_j^i) corresponds to the vector

    (a_1^1, ..., a_n^1, a_1^2, ..., a_n^2, a_1^3, ..., a_{n-1}^n, a_n^n)

in R^{n^2}. In any event, the notions of sum and scalar multiplication are the same in the two interpretations. Now consider the matrices I, A, A^2, ..., A^{n^2}. These n^2 + 1 vectors in R^{n^2} cannot be independent, so there are real numbers c_0, c_1, ..., c_{n^2}, not all zero, such that

    c_{n^2} A^{n^2} + ... + c_2 A^2 + c_1 A + c_0 I = 0

Thus the proposition is verified with p the polynomial

    p(X) = c_{n^2} X^{n^2} + ... + c_2 X^2 + c_1 X + c_0

We may rephrase this proposition in this way: Every matrix is a root of some nonzero polynomial equation with real coefficients. From the purely algebraic point of view this formulation is of some interest and raises the converse speculation: given a polynomial with real coefficients, does it have some n × n matrix as a root? We shall verify this fact, and with n no greater than two. More precisely, we shall, in a later section, introduce the system of complex numbers as a certain collection of 2 × 2 matrices, and later verify that every real polynomial has a root in the system of complex numbers. This is known as the fundamental theorem of algebra.

Now, a linear transformation in E^n is invertible if it has an inverse as a function from R^n to R^n. For this it must be one-to-one and onto, that is, it must have zero nullity and rank n. We have seen (n = rank + nullity) that either of these assertions implies the other. Now it is clear that these assertions must be expressible in terms of matrices; we now do that.

Definition 10.
The $n \times n$ matrix $A$ is invertible if there is a matrix $B$ such that
$$BA = I = AB.$$
In this case $B$ is said to be an inverse for $A$.

Proposition 17. An invertible matrix has a unique inverse.

Proof. This is clear: if $B$, $C$ are inverses to $A$, then all these equations hold:
$$BA = I = AB, \qquad CA = I = AC.$$
Then
$$B = BI = B(AC) = (BA)C = IC = C.$$
We shall denote the inverse of a matrix $A$, if it exists, by $A^{-1}$. The relationship between matrices and linear transformations gives us this proposition:

Proposition 18. Let $A$ be an $n \times n$ matrix. These assertions are equivalent:
(i) $A$ is invertible.
(ii) $A$ represents an invertible transformation.
(iii) There is a matrix $B$ such that $BA = I$.
(iv) There is a matrix $B$ such that $AB = I$.
(v) $A$ has index $n$.

Proof. We have already seen (in discussing systems of linear equations) that (ii) and (v) are equivalent. By definition (i) implies both (iii) and (iv). Thus we have left to prove that (i) and (ii) are equivalent, (iii) implies (i), and (iv) implies (i).

(i) implies (ii). Let $A$ be the given invertible matrix, and $T$ the transformation it represents. Let $S$ be the transformation represented by the inverse, $A^{-1}$, of $A$. Since $A \cdot A^{-1} = I = A^{-1} A$, we have $T \circ S = I = S \circ T$. Thus $S$ is inverse to $T$, so (ii) holds.

(ii) implies (i) by the same kind of reasoning with the roles of matrix and transformation interchanged.

(iii) implies (i). If $T$ is the linear transformation represented by $A$, then by (iii), there is a transformation $S$ such that $S \circ T = I$. Thus, if $T(x) = 0$ we must also have $x = S(T(x)) = S(0) = 0$, so $T$ has nullity zero and is thus invertible. Thus (iii) implies (ii), so also implies (i).

(iv) implies (i). If again $T$ is the transformation represented by $A$, by (iv) there is a transformation $S$ such that $T \circ S = I$. This implies that $T$ has rank $n$ and thus is invertible.

Computing the Inverse

Now, it is clear that the question of invertibility for a given matrix is important and that the problems arise of effectively deciding this question and of effectively computing the inverse, if it exists. To ask that the rows (or columns) be independent, or span $R^n$, while responsive to this question, hardly provides a procedure for determining invertibility.
We shall now introduce two such procedures: one is a continuation of row reduction and the second is based on the notion of the determinant. The determinant is a real-valued function defined on the algebra $M^n$ of $n \times n$ matrices; its basic property is that it is nonzero only on the invertible matrices. We shall depend heavily on the determinant in the study of eigenvectors (Section 1.7). In Section 1.9 we shall explore the connection between the determinant and the notion of volume in $R^3$.
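The first of these procedures, continuing row reduction until the identity appears, can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the book's own algorithm: the function name `inverse` and the use of exact fractions are choices made here, and the three row operations are applied to the augmented array $[A \mid I]$ so that the right half accumulates the product of the elementary matrices.

```python
from fractions import Fraction

def inverse(A):
    """Return the inverse of the square matrix A, or None if A is not invertible.

    Row-reduces the augmented array [A | I]; when the left half becomes I,
    the right half is the product of the elementary matrices used.
    """
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # look for a row at or below the diagonal with a nonzero entry in this column
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None                              # index of A is less than n
        M[col], M[pivot] = M[pivot], M[col]          # interchange two rows
        p = M[col][col]
        M[col] = [x / p for x in M[col]]             # multiply a row by a nonzero number
        for r in range(n):
            if r != col and M[r][col] != 0:
                f = M[r][col]
                # add a multiple of one row to another
                M[r] = [x - f * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

print(inverse([[2, 1], [1, 1]]))   # entries are Fractions equal to [[1, -1], [-1, 2]]
print(inverse([[1, 2], [2, 4]]))   # None: the rows are dependent
```

Exact rational arithmetic is used so that the test for a zero pivot is not disturbed by rounding.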
In order to verify the critical properties of the determinant function it is necessary to return to the elementary matrices, for they provide a technique for decomposing an invertible matrix into a product of simple ones, and as a result, a technique for computing inverses. We recall these facts: the elementary matrices are the matrices which represent the row operations. Since the row operations are invertible, so are the elementary matrices. For any matrix $A$ there is a sequence $P_0, \ldots, P_s$ of elementary matrices such that
$$B = P_s P_{s-1} \cdots P_0 A$$
is in row-reduced form. The index of $A$ is the number of nonzero rows of $B$. We augment these facts by this further observation:

Proposition 19. Suppose that $A$ is an invertible $n \times n$ matrix. There is a sequence $P_0, \ldots, P_s$ of elementary $n \times n$ matrices such that $P_s \cdots P_0 A$ is the identity matrix:
$$P_s \cdots P_0 A = I.$$

Proof. The proof will be by induction on $n$. It is a slight modification of Theorem 1.2. The first column of $A$ is nonzero since the columns of $A$ must be independent ($A$ is invertible). As we have seen in the proof of Theorem 1.2, there exist elementary matrices $P_0, \ldots, P_k$ such that the first column of $P_k \cdots P_0 A$ is $E_1$. Thus,
$$P_k \cdots P_0 A = \begin{pmatrix} 1 & A_{12} \\ 0 & A_{22} \end{pmatrix},$$
where $A_{12}$ is a row of length $n - 1$ and $A_{22}$ is an $(n-1) \times (n-1)$ matrix. Since $P_k \cdots P_0 A$ is invertible so is $A_{22}$ (see Problem 37). Thus the proposition applies to $A_{22}$. There is a sequence $Q_{k+1}, \ldots, Q_s$ of elementary $(n - 1) \times (n - 1)$ matrices such that $Q_s \cdots Q_{k+1} A_{22} = I$. Let
$$P_j = \begin{pmatrix} 1 & 0 \\ 0 & Q_j \end{pmatrix}, \qquad j = k+1, \ldots, s.$$
Then
$$P_s \cdots P_0 A = \begin{pmatrix} 1 & A_{12} \\ 0 & Q_s \cdots Q_{k+1} A_{22} \end{pmatrix} = \begin{pmatrix} 1 & A_{12} \\ 0 & I \end{pmatrix}.$$
Now the matrix
$$\begin{pmatrix} 1 & -A_{12} \\ 0 & I \end{pmatrix}$$
is the product $P_{s+n-1} \cdots P_{s+1}$ of elementary matrices corresponding to these row
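operations: subtract $a^1_j$ times the $j$th row from the first row, $j = 2, \ldots, n$. Finally,
$$P_{s+n-1} \cdots P_0 A = \begin{pmatrix} 1 & -A_{12} \\ 0 & I \end{pmatrix} \begin{pmatrix} 1 & A_{12} \\ 0 & I \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & I \end{pmatrix} = I$$
as required.

This proposition provides us with an effective way for computing inverses; we just continue the process of row reduction until we obtain the identity. Then the corresponding product of elementary matrices is the inverse.

[Worked example: a two-column tableau headed "Row reduction" and "Product of elementary matrices"; its entries are illegible in this copy. It concludes that $A^{-1}$ is a scalar multiple (the scalar is illegible) of
$$\begin{pmatrix} 137 & -21 & -10 \\ 65 & 15 & -20 \\ -10 & 5 & 25 \end{pmatrix}.]$$

27. A =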
[The matrix $A$ of this example and the first stages of its row reduction are illegible in this copy. The last legible stage is]
$$\left(\begin{array}{rrrr|rrrr} 1 & 0 & 0 & 2 & -1 & 2 & -1 & 0 \\ 0 & 1 & 0 & -3 & 1 & -2 & 0 & 0 \\ 0 & 0 & 1 & 0 & 1 & -1 & 1 & 0 \\ 0 & 0 & 0 & -10 & -6 & 11 & -2 & 1 \end{array}\right)$$
with the row-reduced matrix on the left and the product of elementary matrices on the right. Dividing the last row by $-10$ and clearing the last column brings the left side to the identity. Thus,
$$A^{-1} = \frac{1}{10}\begin{pmatrix} -22 & 42 & -14 & 2 \\ 28 & -53 & 6 & -3 \\ 10 & -10 & 10 & 0 \\ 6 & -11 & 2 & -1 \end{pmatrix}.$$

The Determinant Function

The determinant of a matrix is a pretty complicated concept; before going into a study of it and its properties, we shall first see how to compute it. Looking ahead, the method of computation comes from Equations (1.35) and (1.36), but we shall not use those equations to derive it. Instead we shall simply describe the technique for finding determinants.

The determinant of a $2 \times 2$ matrix is defined by
$$\det\begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc.$$

The determinant of a $3 \times 3$ matrix is found as follows. First, select a row. The determinant will be a sum of the products of the elements of that row with numbers, called their cofactors. The cofactor of the $(i,j)$th entry is $(-1)^{i+j}$ times the determinant of the $2 \times 2$ matrix remaining when the $i$th row and $j$th column are deleted.
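The cofactor technique just described can be sketched in a few lines of Python. The names `minor` and `det` are illustrative, and the function is written recursively so that it also handles matrices larger than $3 \times 3$ in the same way; the sample matrix is the one used in Example 28.

```python
def minor(A, i, j):
    """Matrix left after deleting row i and column j (0-based indices)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det(A, row=0):
    """Determinant by cofactor expansion along the chosen row (0-based)."""
    n = len(A)
    if n == 1:
        return A[0][0]
    if n == 2:
        return A[0][0] * A[1][1] - A[0][1] * A[1][0]
    # (-1) ** (row + j) is the sign of the cofactor; the smaller
    # determinants are always expanded along their first row.
    return sum((-1) ** (row + j) * A[row][j] * det(minor(A, row, j))
               for j in range(n))

A = [[1, 3, 2], [-1, 4, 0], [7, -2, 1]]
print([det(A, row=r) for r in range(3)])   # [-45, -45, -45]
```

Whichever row is selected, the same value results, which is the point of the examples that follow.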
Examples

28. Compute the determinant of
$$A = \begin{pmatrix} 1 & 3 & 2 \\ -1 & 4 & 0 \\ 7 & -2 & 1 \end{pmatrix}.$$
If we select the first row we find
$$\det A = 1[4(1) - (-2)0] - 3[(-1)(1) - 7(0)] + 2[(-1)(-2) - 7(4)] = -45.$$
Selecting the second row:
$$\det A = -(-1)[3(1) - (-2)2] + 4[1(1) - 2(7)] - 0[\ldots] = -45.$$
Selecting the third row:
$$\det A = 7[3(0) - 4(2)] - (-2)[1(0) - 2(-1)] + [1(4) - 3(-1)] = -45.$$
Now, we could also have selected a column first, and proceeded in the same way. For example, selecting the second column:
$$\det A = -3[(-1)(1) - 0(7)] + 4[1(1) - 7(2)] - (-2)[1(0) - 2(-1)] = -45.$$

Now, in general, the determinant of an $n \times n$ matrix is found in the same way. Select a row (or column). The determinant is the sum of the products of the entries in that row (or column) with their cofactors. The cofactor of the $(i,j)$th entry is $(-1)^{i+j}$ times the determinant of the $(n - 1) \times (n - 1)$ matrix remaining when the $i$th row and $j$th column are deleted.

29. Let
$$A = \begin{pmatrix} 4 & 3 & 1 & 0 \\ 2 & 6 & 0 & -1 \\ 1 & 0 & 0 & 4 \\ 2 & 1 & 1 & -1 \end{pmatrix}.$$
Select the first row:
$$\det A = 4\det\begin{pmatrix} 6 & 0 & -1 \\ 0 & 0 & 4 \\ 1 & 1 & -1 \end{pmatrix} - 3\det\begin{pmatrix} 2 & 0 & -1 \\ 1 & 0 & 4 \\ 2 & 1 & -1 \end{pmatrix} + \det\begin{pmatrix} 2 & 6 & -1 \\ 1 & 0 & 4 \\ 2 & 1 & -1 \end{pmatrix} - 0\cdot\det(\cdots).$$
We now compute the determinants of the $3 \times 3$ matrices by taking advantage of the location of the 0's. Select the second column in the first three, and don't bother with the last since its factor is 0:
$$\det A = 4(-1)[6(4) - 0(-1)] - 3(-1)[2(4) - 1(-1)] + (-6)[1(-1) - 4(2)] - [2(4) - 1(-1)] = -24.$$

30. [The $5 \times 5$ matrix of this example, and its expansion along the third row followed by the third column, are illegible in this copy.]

We turn now to the theory of determinants. We begin with a definition of the determinant function which is appropriate to the theoretical discussion and then verify that it has the multiplicative property:
$$\det(AB) = \det A \cdot \det B.$$
The formulas (1.35) and (1.36) below, which form the basis for the preceding computations, will result from a rewriting of the formula for the determinant. The determinant of an $n \times n$ matrix can be described in this way: it is the sum of all products of precisely one element from each row and column, with appropriate signs. Our first business is to determine this appropriate sign.

A selection of precisely one element from each row and column is described as follows: In the first row we select a certain element, say in the $\pi(1)$ column. In the second row we select an element, coming from a different column, say $\pi(2)$. We have $\pi(2) \neq \pi(1)$, and so forth. We select the element $a^i_{\pi(i)}$ in the $i$th row and $\pi(i)$th column, making sure that the numbers $\pi(1), \ldots, \pi(n)$ are all distinct. These numbers then form a rearrangement, or permutation, of the numbers $1, \ldots, n$. To form the determinant then, we consider all products
$$a^1_{\pi(1)} \cdots a^n_{\pi(n)}$$
as $\pi$ ranges over all permutations of the numbers $1, \ldots, n$.

A particular kind of permutation is an interchange of two successive integers:
$$i \to i + 1, \qquad i + 1 \to i, \qquad j \to j \ \text{ for all other } j.$$
(We consider the integers as arranged in a circle, so that 1 is the successor to $n$.) Now it is a fact about permutations that any permutation consists of a succession of such interchanges. There may be many ways to build up a given permutation by these simple interchanges, but the parity of the number involved is always the same. That is, if we can write a given permutation as a succession of an even number of interchanges, then every way of writing that permutation as a succession of interchanges will involve an even number. For example, consider the permutation on four integers
$$1\ 2\ 3\ 4 \to 3\ 1\ 4\ 2.$$
This is obtained by this succession of interchanges:
1 2 3 4
2 1 3 4
2 3 1 4
3 2 1 4
3 1 2 4
3 1 4 2
Here is a better way of doing it:
1 2 3 4
1 3 2 4
3 1 2 4
3 1 4 2
Either way, there is an odd number of interchanges involved. We shall not verify these facts about permutations; the verification would be tangential to our present study. However, we shall use these facts. We shall say that a given permutation is even if it can be formed by an even numbered succession of interchanges; the permutation is odd if an odd numbered succession of interchanges is required. For any permutation $\pi$, its sign, denoted $\varepsilon(\pi)$, will be $+1$ if $\pi$ is even, and $-1$ if $\pi$ is odd. There is another way of defining the sign function on permutations which is described in Problem 36. This description does not involve the notion of interchange.

Definition 11. If $A = (a^i_j)$ is an $n \times n$ matrix its determinant is
$$\det A = \sum_{\text{all permutations } \pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)}. \tag{1.32}$$

We shall now show that $\det A \neq 0$ if and only if $A$ is invertible, by showing in fact a stronger statement: $\det(AB) = \det A \cdot \det B$.

Lemma 1.
(i) $\det I = 1$.
(ii) If $A$ has a zero row, $\det A = 0$.

Proof.
(i) Writing $I = (e^i_j)$, we have $e^i_{\pi(i)} = 0$ unless $\pi(i) = i$. Thus, the sum (1.32) has only one nonzero term, that corresponding to the identity permutation. Since each $e^i_i = 1$, $\det I = 1 \cdot 1 \cdots 1 = 1$.
(ii) If the $j$th row of $A$ is zero, each term of the sum (1.32) has a factor $a^j_{\pi(j)} = 0$, so is zero. Then $\det A = 0$.

Lemma 2. If $P$ is an elementary matrix, and $A$ any matrix,
$$\det(PA) = \det P \cdot \det A. \tag{1.33}$$

Proof. Let $A = (a^i_j)$, $PA = (b^i_j)$.

Type I. If $P$ multiplies the $r$th row by $c$, then
$$\det PA = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi)\, a^1_{\pi(1)} \cdots c\,a^r_{\pi(r)} \cdots a^n_{\pi(n)} = c \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)} = c \det A.$$
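In the special case $A = I$, we have
$$\det P = \det(PI) = c \det I = c.$$
Thus (1.33) holds in this case.

Type II. Suppose now $P$ interchanges the $r$th and $s$th rows. Let $\eta$ represent the permutation which interchanges $r$ and $s$. Thus, $b^i_j = a^{\eta(i)}_j$. Now we compute:
$$\det PA = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^{\eta(i)}_{\pi(i)}.$$
Now we change the index of summation. Let $\pi = \tau \cdot \eta$, and sum over $\tau$:
$$\det PA = \sum_{\tau} \varepsilon(\tau \cdot \eta) \prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))} = -\sum_{\tau} \varepsilon(\tau) \prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))}.$$
The sign changes since $\eta$ is an interchange; thus, if $\tau$ is even, $\tau \cdot \eta$ is odd. Now the product $\prod_{i=1}^{n} a^{\eta(i)}_{\tau(\eta(i))}$ is the same as the product $\prod_{i=1}^{n} a^i_{\tau(i)}$ (another change of index), so
$$\det PA = -\sum_{\tau} \varepsilon(\tau) \prod_{i=1}^{n} a^i_{\tau(i)} = -\det A.$$
In particular, $\det P = \det(PI) = -1$, so (1.33) holds in this case.

Type III. Suppose that $P$ adds $\alpha$ times row $r$ to row $s$. Then $b^i_j = a^i_j$ if $i \neq s$ and $b^s_j = a^s_j + \alpha a^r_j$. We now compute
$$\det(PA) = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} b^i_{\pi(i)} = \sum_{\pi} \varepsilon(\pi) \prod_{i=1}^{n} a^i_{\pi(i)} + \alpha \sum_{\pi} \varepsilon(\pi)\, a^r_{\pi(s)} \prod_{i \neq s} a^i_{\pi(i)}. \tag{1.34}$$
The first term on the right is $\det A$. The second term is zero. We can see that by splitting up the sum into odd and even permutations. Let $\eta$ represent the interchange of $r$ and $s$. It is important to note that the odd permutations are just those of the form $\pi \cdot \eta$, where $\pi$ is even. Thus the last sum in Equation (1.34) is
$$\sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)} - \sum_{\pi \text{ odd}} \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)}$$
$$= \sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi(i)} \cdot a^r_{\pi(r)} a^r_{\pi(s)} - \sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi\eta(i)} \cdot a^r_{\pi\eta(r)} a^r_{\pi\eta(s)}$$
$$= \sum_{\pi \text{ even}} \prod_{i \neq r,s} a^i_{\pi(i)} \bigl( a^r_{\pi(r)} a^r_{\pi(s)} - a^r_{\pi(s)} a^r_{\pi(r)} \bigr) = 0.$$
Thus, $\det PA = \det A$. In particular, $\det P = 1$, so (1.33) is verified also for Type III elementary matrices.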
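Formula (1.32) and Lemma 2 can be checked numerically on small matrices. The sketch below is an illustration only: the sign of a permutation is computed by counting inversions, a standard equivalent of the parity of the number of interchanges (this substitution is an assumption of the sketch, not the text's definition), and the three elementary matrices are hypothetical samples of Types I, II, and III.

```python
from itertools import permutations
from math import prod

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (parity of inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Formula (1.32): signed sum, over all permutations, of products
    taking one entry from each row and column."""
    n = len(A)
    return sum(sign(p) * prod(A[i][p[i]] for i in range(n))
               for p in permutations(range(n)))

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[1, 3, 2], [-1, 4, 0], [7, -2, 1]]
scale = [[1, 0, 0], [0, 5, 0], [0, 0, 1]]   # Type I: multiply row 2 by 5
swap = [[0, 1, 0], [1, 0, 0], [0, 0, 1]]    # Type II: interchange rows 1 and 2
shear = [[1, 0, 0], [0, 1, 0], [4, 0, 1]]   # Type III: add 4 times row 1 to row 3

for P in (scale, swap, shear):
    # prints det P, then whether det(PA) = det P * det A
    print(det(P), det(mat_mul(P, A)) == det(P) * det(A))
```

The sum in (1.32) has $n!$ terms, so this direct evaluation is only practical for very small $n$.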
Now, lemma is a word denoting a logical particle of no particular intrinsic interest, but of crucial importance in the verification of a theorem. Here now is the main theorem concerning determinants.

Theorem 1.7. A matrix $M$ is invertible if and only if $\det M \neq 0$. $\det AB = \det A \cdot \det B$ for any two matrices.

Proof. Suppose $M$ is an $n \times n$ matrix which is not invertible. Then there are elementary matrices $P_s, \ldots, P_0$ such that $P_s \cdots P_0 M$ is row reduced and has zero rows. Thus, by the above lemmas,
$$0 = \det(P_s \cdots P_0 M) = \det P_s \cdot \det P_{s-1} \cdots \det P_0 \cdot \det M.$$
Since the determinant of an elementary matrix is nonzero, we must have $\det M = 0$. On the other hand, if $M$ is invertible, there are elementary matrices $P_s, \ldots, P_0$ such that $I = P_s \cdots P_0 M$. Then
$$1 = \det I = \det P_s \cdot \det P_{s-1} \cdots \det P_0 \cdot \det M.$$
Thus $\det M \neq 0$.

Now let $A$, $B$ be two $n \times n$ matrices. If one of $A$ or $B$ is not invertible, neither is $AB$, so $\det AB = 0$ and either $\det A = 0$ or $\det B = 0$. In any case $\det AB = \det A \cdot \det B$ is true. If $A$ and $B$ are invertible, there are elementary matrices $P_s, \ldots, P_0$, $Q_t, \ldots, Q_0$ such that
$$P_s \cdots P_0 A = I = Q_t \cdots Q_0 B.$$
Then
$$Q_t \cdots Q_0 P_s \cdots P_0 AB = Q_t \cdots Q_0 (P_s \cdots P_0 A) B = Q_t \cdots Q_0 B = I.$$
Thus
$$\det Q_t \cdots \det Q_0 \cdot \det P_s \cdots \det P_0 \cdot \det(AB) = 1,$$
$$\det Q_t \cdots \det Q_0 \cdot \det B = 1,$$
$$\det P_s \cdots \det P_0 \cdot \det A = 1.$$
Thus again $\det(AB) = \det A \cdot \det B$.

Notice that the formula $\det AB = \det A \cdot \det B$ is far from transparent on the basis of the definition above. In fact, it is not at all derivable without some information regarding the structure of $n \times n$ matrices.

We have a means of computing $A^{-1}$ for a given invertible matrix $A$; namely, the process of row reduction. But we have not given explicitly any formula for the
inverse. Such is provided by the cofactor expansion of a determinant. This formula is of theoretical interest, but not of any great computational value. As far as computations are concerned, the surest and quickest route to the inverse is the process of row reduction.

Let $A$ be an $n \times n$ matrix. The adjoint matrix of an entry of $A$ is the $(n - 1) \times (n - 1)$ matrix obtained by deleting the row and column of the given entry (see Figure 1.13). Let $A^i_j$ be the adjoint matrix of the entry $a^i_j$. Then the inverse to the matrix $A$ (if it is invertible) is easily given by the determinants of the adjoints: the $(i,j)$th entry of $A^{-1}$ is
$$(-1)^{i+j}\, \frac{\det A^j_i}{\det A}.$$
More precisely we have these formulas (the explicit version of $AA^{-1} = A^{-1}A = I$), known as Cramer's rule:
$$\det A = \sum_{j=1}^{n} (-1)^{i+j} a^i_j \det A^i_j \quad \text{for all } i \tag{1.35}$$
$$\det A = \sum_{i=1}^{n} (-1)^{i+j} a^i_j \det A^i_j \quad \text{for all } j \tag{1.36}$$
$$0 = \sum_{j=1}^{n} (-1)^{i+j} a^k_j \det A^i_j \quad \text{for all } i \neq k \tag{1.37}$$
$$0 = \sum_{i=1}^{n} (-1)^{i+j} a^i_k \det A^i_j \quad \text{for all } j \neq k \tag{1.38}$$

[Figure 1.13]
The verifications of these formulas are simpler than it may seem; they can be based directly on formula (1.32). For example, let us verify (1.35). First fix a row index $i$. We shall break up the sum in (1.32) into $n$ parts: those permutations taking $i \to 1$, $i \to 2$, ..., $i \to n$. Consider, for a fixed column index $j$, the permutations taking $i \to j$. (That is, those $\pi$ for which $\pi(i) = j$.) These are precisely the same as all permutations on the indices of the matrix adjoint to $a^i_j$ (those permutations which take the integers $1, \ldots, n$, except for $i$, into the integers $1, \ldots, n$, except for $j$). Thus the terms appearing in the sum (1.32) which have $a^i_j$ as a factor are the same as those in (1.35): we must now verify that the signs agree. Let $\tau$ be a permutation on the indices of the adjoint to $a^i_j$. The corresponding permutation $\pi$ of $(1, \ldots, n)$ does the same as $\tau$ and takes $i$ into $j$. The number of interchanges involved in building this permutation is just that for $\tau$, together with the interchanges required to send $i$ to $j$. The last number is $j - i$, which has the same parity as $i + j$. Thus, $\varepsilon(\pi) = (-1)^{i+j} \varepsilon(\tau)$, so the signs of corresponding terms in (1.32) and (1.35) also agree. Thus (1.35) is true. We shall leave the verifications of the other formulas to the exercises. (Equations (1.37) and (1.38) require a small trick.)

Cramer's rule allows for a simple description for solving the equation $Ax = b$ when $A$ is an invertible $n \times n$ matrix. Let $A^{(i)}$ be the matrix obtained by replacing the $i$th column of $A$ with the column $b$. Then the equation $Ax = b$, which is the same as $x = A^{-1}b$, turns out, according to Cramer's rule, to read
$$x^i = \frac{\det A^{(i)}}{\det A}, \qquad 1 \le i \le n.$$
This is checked out by unraveling all the definitions and applying the formulas of Cramer's rule: since $x = A^{-1}b$,
$$x^i = \sum_{j=1}^{n} (A^{-1})^i_j\, b^j = \frac{1}{\det A} \sum_{j=1}^{n} (-1)^{i+j} \det A^j_i\; b^j.$$
But the summation is just the determinant of the matrix obtained by replacing the $i$th column of $A$ with the column vector $b$!
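The recipe $x^i = \det A^{(i)} / \det A$ translates directly into code. The following is a minimal sketch in Python; the function names are illustrative, the determinant is computed by first-row cofactor expansion, and the sample system is the one treated in Example 31.

```python
from fractions import Fraction

def det(A):
    """Determinant by cofactor expansion along the first row."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j]
               * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def cramer(A, b):
    """Solve Ax = b for an invertible A via x^i = det A^(i) / det A."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is not invertible")
    solution = []
    for i in range(len(A)):
        # A^(i): replace column i of A by the column b
        Ai = [row[:i] + [b[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(Ai), d))
    return solution

A = [[1, 2, -1], [1, 1, 3], [2, 2, 1]]
b = [2, 0, 1]
print(cramer(A, b))   # the three quotients -3/5, 6/5, -1/5, as Fractions
```

Substituting back confirms the result: for instance the first equation gives $-\tfrac35 + \tfrac{12}{5} + \tfrac15 = 2$.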
Thus we can solve by taking quotients of determinants.

Example

31. Solve the equations
$$\begin{aligned} x^1 + 2x^2 - x^3 &= 2 \\ x^1 + x^2 + 3x^3 &= 0 \\ 2x^1 + 2x^2 + x^3 &= 1 \end{aligned}$$
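The determinant of the matrix is easily found by cofactor expansion along the first row:
$$\det A = 1(1 - 6) - 2(1 - 6) - 1(2 - 2) = 5.$$
By Cramer's rule
$$x^1 = \frac15 \det\begin{pmatrix} 2 & 2 & -1 \\ 0 & 1 & 3 \\ 1 & 2 & 1 \end{pmatrix} = \frac15[2(-5) + 1(6 + 1)] = -\frac35,$$
$$x^2 = \frac15 \det\begin{pmatrix} 1 & 2 & -1 \\ 1 & 0 & 3 \\ 2 & 1 & 1 \end{pmatrix} = \frac15[-2(-5) - (3 + 1)] = \frac65,$$
$$x^3 = \frac15 \det\begin{pmatrix} 1 & 2 & 2 \\ 1 & 1 & 0 \\ 2 & 2 & 1 \end{pmatrix} = \frac15[2(0) + 1(1 - 2)] = -\frac15$$
(the determinants are computed by column cofactor expansion).

• EXERCISES

33. Find the inverse of these matrices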
[The matrices for parts (a) and (b) are illegible in this copy.]

34. Solve the equation [equation illegible in this copy] where $A$ is given by
(a) the matrix in Exercise 33(a)
(b) the matrix in Exercise 33(b)
(c) $A$ = [illegible]
(d) $A$ = [illegible]

35. Suppose that the $n \times n$ matrix $A = (a^i_j)$ has this property: $a^i_j = 0$ if $j \le i$. Show that $A^n = 0$.

36. If $A$ is a matrix such that $A^n = 0$, show that $I + A$ is invertible.

• PROBLEMS

33. Show that if a linear transformation $T$ has rank $n$, it is invertible. Show that if there is a transformation $S$ such that $T \circ S = I$, $T$ is invertible.

34. Derive Equations (1.35)-(1.38) using the definition of the determinant.

35. Assume this fact about polynomials: A polynomial of degree $d$ has no more than $d$ roots. Prove the following assertions:
(a) Let $A$ be an $n \times n$ matrix. There are at most $n$ numbers $s$ such that $A + sI$ is not invertible.
(b) The $n \times n$ matrix
$$V = \begin{pmatrix} 1 & r_1 & r_1^2 & \cdots & r_1^{n-1} \\ 1 & r_2 & r_2^2 & \cdots & r_2^{n-1} \\ \vdots & & & & \vdots \\ 1 & r_n & r_n^2 & \cdots & r_n^{n-1} \end{pmatrix}$$
has a nonzero determinant if and only if the $r_i$ are all distinct. (Hint: If $\det V = 0$, there is a nonvanishing linear relation among the columns.)

36. Let
$$f(x^1, \ldots, x^n) = \prod_{i<j} (x^i - x^j)$$
where $x^1, \ldots, x^n$ are distinct numbers. Show that the permutation $\pi$ is even if and only if
$$f(x^{\pi(1)}, \ldots, x^{\pi(n)}) = f(x^1, \ldots, x^n)$$
and similarly $\pi$ is odd if and only if
$$f(x^{\pi(1)}, \ldots, x^{\pi(n)}) = -f(x^1, \ldots, x^n).$$

37. Let $A$ be an invertible $n \times n$ matrix. For $m < n$, let $B$ be an $(n - m) \times (n - m)$ matrix formed from $A$ by deleting any $m$ rows and $m$ columns. Show that $B$ is also invertible. (Hint: You need only take $m = 1$, and proceed by induction.)

38. Let
$$A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}.$$
Verify that
$$A^2 - (a + d)A + (ad - bc)I = 0,$$
that is, that $A$ is a zero of a polynomial of degree 2.

39. The same fact is true for all $n$; that is, an $n \times n$ matrix is the zero of a polynomial of degree $n$. This is part of a famous theorem of algebra, which goes like this: If $A$ is any matrix, the polynomial $P_A(x) = \det(A - xI)$ is the characteristic polynomial of $A$. $A$ is a root of the polynomial equation $P_A(x) = 0$ (Cayley-Hamilton). That is,
$$P_A(A) = 0.$$
Verify the Cayley-Hamilton theorem for (i) a diagonal matrix, (ii) a triangular matrix.

1.7 Eigenvectors and Change of Basis

One fruitful way of studying linear transformations on $R^n$ is to find directions along which they act merely by stretching the vector. For example, if a transformation $T$ is represented by a diagonal matrix
$$\begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix} \tag{1.39}$$
then $T(E_i) = d_i E_i$, where $E_1, \ldots, E_n$ are the standard basis vectors. Thus $T$ acts by stretching by a factor $d_i$ along the $i$th direction:
$$T(x^1, \ldots, x^n) = d_1 x^1 E_1 + d_2 x^2 E_2 + \cdots + d_n x^n E_n.$$
More generally, suppose we can find a basis $v_1, \ldots, v_n$ of vectors in $R^n$ such that $T$ acts by stretching along the direction of $v_i$ for each $i$:
$$T(v_i) = d_i v_i.$$
Then, if $v$ is any vector, the action of $T$ is easily computed by referring $v$ to the basis $v_1, \ldots, v_n$: if $v = \sum s^i v_i$, then $T(v) = \sum d_i s^i v_i$. $T$ is represented by the diagonal matrix (1.39) relative to this basis.

The process of finding a basis of vectors along which $T$ acts by stretching is called diagonalization. Unfortunately, not all transformations can be so diagonalized and this presents a major difficulty in this line of investigation. For example, a rotation in the plane clearly does not have any such directions in which it acts as a stretch. More precisely, let $T$ be represented by the matrix
$$\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}.$$
Then $T(x, y) = (y, -x)$. ($T$ is a clockwise rotation through a right angle.) If $v = (a, b)$ is such that $T(a, b) = d(a, b)$, we must have
$$da = b, \qquad db = -a.$$
Then $d^2 a = db = -a$, and there are no real numbers $d$, $a$ making this equation true (except $a = 0$). Nevertheless, there are many transformations which can be analyzed in this way, and it is our purpose in this section to study the techniques for doing so.

Definition 12. Let $T: R^n \to R^n$ be a linear transformation. An eigenvalue of $T$ is a number $d$ for which there exists a nonzero vector $v$ such that $Tv = dv$. An eigenvector of $T$ with eigenvalue $d$ is a nonzero vector $v$ such that $Tv = dv$.

Proposition 20. If $T: R^n \to R^n$ is a linear transformation for which there is a basis of eigenvectors $v_1, \ldots, v_n$ with eigenvalues $d_1, \ldots, d_n$, respectively, then for any vector $v = \sum s^i v_i$, $T(v) = \sum d_i s^i v_i$.
Proof. Compute $T(v)$ using the fact that $T$ is linear.

Now we find the eigenvalues of a linear transformation $T$ by making use of this remark: $d$ is an eigenvalue of $T$ if and only if $T - dI$ is singular (not invertible). If $A$ is the matrix representing $T$ in terms of the standard basis, this condition is verified precisely when $\det(A - dI) = 0$. Thus the eigenvalues of $T$ are just the roots of this equation. Notice that when $T$ is rotation by a right angle
$$\det\left[\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} - dI\right] = d^2 + 1,$$
which has no real roots, thus explaining in another way why this transformation has no eigenvectors. We shall see that when we extend the real number system to a system in which every polynomial has a root (the complex numbers), then $T$ can be represented in terms of (complex) eigenvectors. This is one of the important reasons (particularly in the study of differential equations, as we shall see) for so extending the number system. Let us now collect these observations.

Proposition 21. Let $T$ be a transformation on $R^n$ represented by the matrix $A$. $d$ is an eigenvalue of $T$ if and only if $d$ is a root of the equation
$$\det(A - tI) = 0.$$
If $d$ is an eigenvalue, the set of eigenvectors corresponding to $d$ is the kernel of $T - dI$.

Proof. Suppose $d$ is an eigenvalue of $T$. Then there is a $v \neq 0$ such that $Tv = dv$, or $(T - dI)v = 0$. Thus the nullity of $T - dI$ is positive, so $T - dI$ is not invertible. Thus, $\det(A - dI) = 0$. On the other hand, if $\det(A - dI) = 0$, then $T - dI$ is not invertible, so has a positive dimensional kernel. If $v \neq 0$ is in the kernel, $(T - dI)(v) = 0$, or $Tv = dv$; thus $d$ is an eigenvalue of $T$.

Examples

32. Let $T$ be represented by the matrix
$$A = \begin{pmatrix} 2 & 0 \\ 1 & 1 \end{pmatrix}.$$
Then
$$A - tI = \begin{pmatrix} 2 - t & 0 \\ 1 & 1 - t \end{pmatrix}$$
and $\det(A - tI) = t^2 - 3t + 2$. The roots are $t = 2, 1$. The space of eigenvectors corresponding to $t = 2$ is the kernel of
$$A - 2I = \begin{pmatrix} 0 & 0 \\ 1 & -1 \end{pmatrix},$$
that is, the space of all vectors $(x, y)$ such that $x - y = 0$. Thus $(1, 1)$ is an eigenvector with eigenvalue 2. The eigenvectors corresponding to $t = 1$ lie in the kernel of
$$A - I = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix},$$
that is, in the space of vectors $(x, y)$ such that $x = 0$. $(0, 1)$ is such an eigenvector. Since $(1, 1)$ and $(0, 1)$ are a basis for $R^2$, we have diagonalized $T$. Relative to this basis $T$ is represented by the matrix
$$\begin{pmatrix} 2 & 0 \\ 0 & 1 \end{pmatrix}.$$

33. Consider the transformation given, relative to the standard basis, by the matrix $A$ [entries illegible in this copy]. Then $\det(A - tI) = t^2 - 8t + 16 = (t - 4)^2$. Thus 4 is the only eigenvalue of $T$. $A - 4I$ has as kernel $\{(x, y): x + 2y = 0\}$, which is one dimensional. Thus there cannot be a basis of eigenvectors, for the only eigenvectors lie on the line $x = -2y$. Notice that this example differs from that of a rotation, for there is no problem with the roots; the difficulty lies with the transformation itself.
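For $2 \times 2$ matrices the whole procedure fits in a few lines, since $\det(A - tI) = t^2 - (a + d)t + (ad - bc)$. The sketch below is illustrative only: the function names are invented here, and the sample matrix with rows $(2, 0)$ and $(1, 1)$ is an assumption, chosen because it is the matrix consistent with the eigenvalues and eigenvectors found in Example 32.

```python
from math import sqrt

def eigenvalues_2x2(A):
    """Real roots of det(A - tI) = t^2 - (a + d) t + (ad - bc), largest first."""
    (a, b), (c, d) = A
    trace, det = a + d, a * d - b * c
    disc = trace * trace - 4 * det
    if disc < 0:
        return []                        # no real roots, hence no eigenvectors
    r = sqrt(disc)
    return sorted({(trace + r) / 2, (trace - r) / 2}, reverse=True)

def kernel_direction_2x2(M):
    """A nonzero vector v with Mv = 0, for a singular, nonzero 2 x 2 matrix M."""
    (a, b), (c, d) = M
    # (b, -a) is annihilated by the row (a, b); fall back to the second row
    return (b, -a) if (a, b) != (0, 0) else (d, -c)

A = [[2, 0], [1, 1]]                     # assumed matrix for Example 32
print(eigenvalues_2x2(A))                # [2.0, 1.0]
print(kernel_direction_2x2([[0, 0], [1, -1]]))   # (-1, -1), a multiple of (1, 1)
print(eigenvalues_2x2([[0, 1], [-1, 0]]))        # []: the rotation has none
```

A repeated root, as in Example 33, is reported once; whether a basis of eigenvectors exists then depends on the dimension of the kernel, which this sketch does not examine.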
34. Let $T: R^3 \to R^3$ be given by the matrix
$$A = \begin{pmatrix} -7 & 0 & -18 \\ 2 & 2 & 4 \\ 3 & 0 & 8 \end{pmatrix}.$$
Then $\det(A - tI) = -t^3 + 3t^2 - 4$. The roots of $\det(A - tI) = 0$ are $2, -1$.

Eigenvalue 2:
$$A - 2I = \begin{pmatrix} -9 & 0 & -18 \\ 2 & 0 & 4 \\ 3 & 0 & 6 \end{pmatrix}.$$
The kernel is the set of vectors $(x, y, z)$ such that $x + 2z = 0$. This space is two dimensional, so we can find two independent eigenvectors with eigenvalue 2; for example, $v_1 = (0, 1, 0)$, $v_2 = (-2, 0, 1)$.

Eigenvalue $-1$:
$$A - (-1)I = \begin{pmatrix} -6 & 0 & -18 \\ 2 & 3 & 4 \\ 3 & 0 & 9 \end{pmatrix}.$$
The kernel is the set of vectors $(x, y, z)$ such that
$$x + 3z = 0 \quad\text{or}\quad x = -3z,$$
$$2x + 3y + 4z = 0 \quad\text{or}\quad y = \tfrac{2}{3}z,$$
which is one dimensional. An eigenvector is $v_3 = (-9, 2, 3)$. These $v_1, v_2, v_3$ thus form a basis of eigenvectors, and $T$ is represented by the matrix
$$\begin{pmatrix} 2 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -1 \end{pmatrix} \tag{1.40}$$
relative to the basis $v_1, v_2, v_3$.
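A claimed eigenvector is easy to check by direct multiplication. The following sketch (function names are illustrative) tests three vectors against the matrix that appears as $T_E$ in Example 36, which is the matrix of this example; the vector $(-9, 2, 3)$ spans the kernel of $A + I$.

```python
def apply(A, v):
    """Matrix-vector product Av."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def is_eigenvector(A, v, d):
    """True when v is nonzero and Av = d v."""
    return any(v) and apply(A, v) == [d * x for x in v]

A = [[-7, 0, -18], [2, 2, 4], [3, 0, 8]]
for v, d in [((0, 1, 0), 2), ((-2, 0, 1), 2), ((-9, 2, 3), -1)]:
    print(v, d, is_eigenvector(A, v, d))   # True in all three cases
```

Integer arithmetic keeps the comparison exact, so no tolerance is needed.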
Jordan Canonical Form

Notice that in general there are two difficulties with the procedure described above. The polynomial $\det(A - tI)$ may not have many real roots, and it may have multiple roots. As we shall see in the next section the first difficulty can be overcome by transferring to the complex number system. Example 34 above demonstrates that the second possibility, that of multiple roots, may not be severe, whereas Example 33 shows that it can seriously handicap the diagonalization procedure. Continued study of this situation becomes quite difficult and we shall not enter into it. The conclusion is that the typical matrix which cannot be diagonalized is of this form
$$\begin{pmatrix} d & 1 & 0 & \cdots & 0 \\ 0 & d & 1 & \cdots & 0 \\ \vdots & & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & d & 1 \\ 0 & 0 & \cdots & 0 & d \end{pmatrix} \tag{1.41}$$
representing the transformation
$$T(x^1, \ldots, x^n) = (dx^1 + x^2, dx^2 + x^3, \ldots, dx^n).$$
Given any matrix, we can find a basis of vectors (which includes all possible eigenspaces) relative to which $T$ decomposes into pieces, each of which has the form (1.41). This is called the Jordan canonical form.

Change of Basis

Before leaving this subject, let us compute explicitly the formulas which allow us to change bases in $R^n$. If $\{E_1, \ldots, E_n\}$ is a basis for $R^n$, then any $x$ in $R^n$ can be written
$$x = x^1 E_1 + \cdots + x^n E_n$$
uniquely. We shall refer to the $n$-tuple $(x^1, \ldots, x^n)$ as the coordinate of $x$ relative to the basis $E: \{E_1, \ldots, E_n\}$, denoted $x_E$. Let $F: \{F_1, \ldots, F_n\}$ be another basis for $R^n$. Let $x_F$ be the coordinates of $x$ relative to this new basis. To each set of $E$ coordinates $x_E$ we can associate the $F$ coordinates $x_F$ of the point corresponding to $x_E$. In this way we can write $x_F$ as a function of $x_E$. The precise relation is this:
Proposition 22. Let $E: \{E_1, \ldots, E_n\}$, $F: \{F_1, \ldots, F_n\}$ be two different bases for $R^n$. Write the $E_j$ in terms of the $F$'s:
$$E_j = \sum_{i=1}^{n} a^i_j F_i.$$
The matrix $(a^i_j)$ is called the change of basis matrix, and is denoted $A_F^E$. For any point $x$ in $R^n$ we have this relation between its $E$ and $F$ coordinates:
$$x_F = A_F^E\, x_E. \tag{1.42}$$

Proof. Let $x_E = (x^1, \ldots, x^n)$, $x_F = (y^1, \ldots, y^n)$. Then
$$x = \sum_{j=1}^{n} x^j E_j = \sum_{j=1}^{n} x^j \Bigl(\sum_{i=1}^{n} a^i_j F_i\Bigr) = \sum_{i=1}^{n} \Bigl(\sum_{j=1}^{n} a^i_j x^j\Bigr) F_i.$$
Thus for each $i$, $y^i = \sum_{j=1}^{n} a^i_j x^j$, which is the same as (1.42).

Notice that it follows from (1.42) that $(A_F^E)^{-1} = A_E^F$. For, given any $x_F$,
$$x_F = A_F^E\, x_E = A_F^E A_E^F\, x_F.$$
Thus $A_F^E A_E^F = I$.

Now, if $T$ is any linear transformation on $R^n$, it can be represented by a matrix, relative to any basis $E: \{E_1, \ldots, E_n\}$. Let us denote that matrix by $T_E$:
$$T(x)_E = T_E\, x_E.$$

Proposition 23. If $E: \{E_1, \ldots, E_n\}$, $F: \{F_1, \ldots, F_n\}$ are two bases of $R^n$, and $T: R^n \to R^n$ is a linear transformation, we have
$$T_F = (A_E^F)^{-1}\, T_E\, A_E^F.$$
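Proof.
$$T(x)_F = A_F^E\, T(x)_E = A_F^E\, T_E\, x_E = A_F^E\, T_E\, A_E^F\, x_F.$$
On the other hand, by definition
$$T(x)_F = T_F\, x_F.$$
Thus
$$T_F = A_F^E\, T_E\, A_E^F = (A_E^F)^{-1}\, T_E\, A_E^F.$$

Examples

35. Let $T: R^2 \to R^2$ be represented, relative to the standard basis $E$, by
$$T_E = \begin{pmatrix} 2 & 3 \\ 0 & 1 \end{pmatrix}.$$
Let $F: \{(1, 1), (2, -1)\}$ be another basis. Find the matrix $T_F$. Now,
$$A_E^F = \begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}, \qquad A_F^E = (A_E^F)^{-1} = \frac13\begin{pmatrix} 1 & 2 \\ 1 & -1 \end{pmatrix}.$$
Thus
$$T_F = A_F^E\, T_E\, A_E^F = \begin{pmatrix} 7/3 & -1/3 \\ 4/3 & 2/3 \end{pmatrix}.$$

36. Let $T$ be given, relative to the standard basis $E$, by
$$T_E = \begin{pmatrix} -7 & 0 & -18 \\ 2 & 2 & 4 \\ 3 & 0 & 8 \end{pmatrix}$$
and let $F: \{(0, 1, 0), (-2, 0, 1), (-9, 2, 3)\}$. We have already seen that $F$ is a basis of eigenvectors for $T$, with eigenvalues $2, 2, -1$, respectively. Thus we may conclude that $T_F$ is given by (1.40).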
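Proposition 23 can be checked without computing an inverse: when $E$ is the standard basis and $P$ is the matrix whose columns are the vectors of $F$, the relation $T_F = P^{-1} T_E P$ is equivalent (since $P$ is invertible) to $T_E P = P\,T_F$. The sketch below uses that form. The function names are illustrative, and the numerical entries for the $2 \times 2$ case are assumptions of this sketch (the values consistent with the basis $(1, 1)$, $(2, -1)$); the $3 \times 3$ case uses the matrix $T_E$ of Example 36 with the eigenvector basis $(0, 1, 0)$, $(-2, 0, 1)$, $(-9, 2, 3)$.

```python
from fractions import Fraction

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def columns(vectors):
    """Matrix whose columns are the given vectors."""
    return [list(row) for row in zip(*vectors)]

def represents(T_E, basis, T_F):
    """Check T_F = P^{-1} T_E P in the equivalent form T_E P = P T_F,
    where the columns of P are the basis vectors (assumed independent)."""
    P = columns(basis)
    return mat_mul(T_E, P) == mat_mul(P, T_F)

# 2 x 2 case; the entries of T_E and T_F are assumed values
third = Fraction(1, 3)
print(represents([[2, 3], [0, 1]], [(1, 1), (2, -1)],
                 [[7 * third, -third], [4 * third, 2 * third]]))    # True

# 3 x 3 case: a basis of eigenvectors gives a diagonal matrix
T_E = [[-7, 0, -18], [2, 2, 4], [3, 0, 8]]
F = [(0, 1, 0), (-2, 0, 1), (-9, 2, 3)]
print(represents(T_E, F, [[2, 0, 0], [0, 2, 0], [0, 0, -1]]))       # True
```

Checking $T_E P = P\,T_F$ only confirms a proposed $T_F$; to compute $T_F$ from scratch one would still invert $P$, for instance by row reduction.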
• EXERCISES

37. Find a basis of eigenvectors, if possible, for the transformation represented in terms of the standard basis by the matrix $A$. [The four matrices are illegible in this copy; the surviving hints appear to read: (a) eigenvalues 2, 3, $-1$; (b) eigenvalues 1, $-1$, 0, 2; (c) eigenvalues 1, 4; (d) no hint.]

38. Show that for $F: \{F_1, \ldots, F_n\}$ a basis for $R^n$, and $E$ the standard basis, the matrix $A_E^F$ is just the matrix whose columns are $F_1, \ldots, F_n$.

39. Find the matrix $A_E^F$ for these pairs of bases in $R^n$.
(a) $F$: (1, 0, 1), (0, 1, 1), (1, 0, 0); $E$: (0, 1, 2), (2, 0, 1), (1, 2, 0).
(b) $F$: (1, 0, 0), (2, 0, 1), (0, 1, 0); $E$: (3, 1, 5), (0, 2, 3), ($-1$, $-1$, 0).
(c) $F$: (1, 0, 1, 0), (0, 1, 1, 0), (0, 0, 2, 0), (0, 0, 1, 1); $E$: (0, 2, 0, 2), (2, 0, 0, 0), (2, 0, 2, 0), (0, 2, 2, 0).

40. Let $T: R^3 \to R^3$ be a linear transformation represented by one of
$$\text{(a)}\ T_E = \begin{pmatrix} 2 & 0 & 0 \\ -1 & 0 & 3 \\ 1 & 0 & 1 \end{pmatrix} \qquad \text{(b)}\ T_E = \begin{pmatrix} 1 & 0 & -1 \\ 0 & 1 & 4 \\ 2 & 0 & -1 \end{pmatrix}$$
relative to the standard basis $E$. Find $T_F$, where $F$ is one of these bases ($F$ as in Exercises 39(a) and 39(b)).

41. If $T: R^2 \to R^2$ has two independent eigenvectors with the same eigenvalue, then $T$ is represented by a diagonal matrix in any basis.

• PROBLEMS

40. Prove Proposition 20.

41. If $T$ is a linear transformation on $R^n$ represented by the matrix $A$ which has $n$ distinct eigenvalues $d_1, \ldots, d_n$, then
$$P_A(x) = (-1)^n (x - d_1)(x - d_2) \cdots (x - d_n)$$
and $P_A(A) = 0$ ($P_A$ is defined in Problem 39).
42. Let $T$ be a linear transformation on $R^n$. Let $E(r) = \{v \in R^n : Tv = rv\}$. Show that $E(r)$ is a linear subspace of $R^n$ (called the $r$ eigenspace of $T$). Show that if $r \neq s$, then $E(r) \cap E(s) = \{0\}$.

43. Suppose $A$ represents a linear transformation on $R^n$ with this property: if $r_1, \ldots, r_k$ are the eigenvalues of $A$, then $n = \sum_{i=1}^{k} \dim E(r_i)$. Then
$$P_A(x) = (-1)^n (x - r_1)^{\dim E(r_1)} \cdots (x - r_k)^{\dim E(r_k)}.$$
Verify the Cayley-Hamilton theorem for $A$.

44. Find a matrix with no nontrivial eigenspaces. How would you expect to prove the Cayley-Hamilton theorem for such a matrix?

1.8 Complex Numbers

Pythagoras' discovery, that $\sqrt{2}$ is not the quotient of two integers, was considered in his day to be a geometric mystery. His conception of numbers was limited to rational numbers and his desire to measure lengths (to associate numbers to line segments) led to this unhappy realization: there are some lengths which are not measurable (such as the hypotenuse of an isosceles right triangle of leg length 1)! It took a long time for mathematicians to realize that the solution to this situation was to expand the notion of number. The general liberation of thought that was the Renaissance led in mathematics to the possibility of expressing the value of certain lengths by never-ending decimals, or continued fractions, or other types of infinite expressions. It was during those days that mathematicians formulated the view that such expressions represented numbers and served to determine all lengths.

Earlier, Middle Eastern mathematicians were led from certain algebraic problems to envision extension of the number concept in another direction.
As they observed, quite clearly $-1$ has no square root; some bold adventurer then suggested that we contemplate, in our minds, some purely imaginary quantity whose square would be $-1$ and treat it as if it were another number. As this supposition did not contradict any of the known facts concerning the number system, it could do no harm, and might do a great deal of good (at least in our minds).

Today we need not be so mysterious or cunning in our ways. We need only recall that there is a $2 \times 2$ matrix (see Problem 20) whose square is the negative of the identity. We can thus say quite factually that in the set of $2 \times 2$ matrices, $-I$ does indeed have a square root. Well, there is also a $5 \times 5$ matrix, and an $n \times n$ matrix for any $n$, whose square is $-I$, so we should ask for the smallest algebraic system in which $-1$ has a square root. The
complex number system is this system and we shall later derive the remarkable fact (the fundamental theorem of algebra): Every polynomial has a root in the complex number system.

Now, to be explicit, the matrix
$$i = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \qquad (1.43)$$
has the property that $i^2 = -I$. The complex number system is the collection of all 2 × 2 matrices of the form $aI + bi$, where $a$, $b$ are real numbers.

Definition 13. $C$, the set of complex numbers, is the collection of all 2 × 2 matrices of the form
$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$

Proposition 24.
(i) The operations of addition and multiplication are defined on $C$.
(ii) Every nonzero complex number has an inverse.
(iii) $C$ is in one-to-one correspondence with $R^2$.

Proof.
(i)
$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} + \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} a + c & -(b + d) \\ b + d & a + c \end{pmatrix}$$
$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} \begin{pmatrix} c & -d \\ d & c \end{pmatrix} = \begin{pmatrix} ac - bd & -(ad + bc) \\ ad + bc & ac - bd \end{pmatrix}$$
(ii) If
$$M = \begin{pmatrix} a & -b \\ b & a \end{pmatrix}$$
is nonzero, then one of $a$ or $b$ is nonzero, so $\det M = a^2 + b^2 \neq 0$, and thus $M$ has an inverse. By Cramer's rule
$$M^{-1} = \frac{1}{a^2 + b^2} \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
(iii) is obvious, since every complex number is given by a pair of real numbers and conversely every pair $(a, b)$ of real numbers gives rise to a complex number.
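The matrix description of $C$ in Definition 13 and Proposition 24 can be checked numerically. The following is only a small sketch in plain Python; the helper names `as_matrix`, `mat_mul`, and `mat_inv` are hypothetical and chosen for this illustration, not taken from the text.

```python
# Model the complex number a + bi as the 2x2 real matrix [[a, -b], [b, a]].
# All function names here are illustrative assumptions.

def as_matrix(a, b):
    """Return the matrix of Definition 13 for the pair (a, b)."""
    return [[a, -b], [b, a]]

def mat_mul(m, n):
    """Ordinary product of two 2x2 matrices given as nested lists."""
    return [[sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]

def mat_inv(m):
    """Inverse from Proposition 24(ii): (1/(a^2+b^2)) [[a, b], [-b, a]]."""
    a, b = m[0][0], m[1][0]
    d = a * a + b * b          # the determinant a^2 + b^2, assumed nonzero
    return [[a / d, b / d], [-b / d, a / d]]

i = as_matrix(0, 1)
i_squared = mat_mul(i, i)                            # the negative of the identity
product = mat_mul(as_matrix(1, 2), as_matrix(3, 4))  # (1 + 2i)(3 + 4i) = -5 + 10i
check = mat_mul(as_matrix(3, 4), mat_inv(as_matrix(3, 4)))  # close to the identity
```

The product of two matrices of this special form is again of the same form, which is the content of part (i) of the proposition.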
Cartesian Form of a Complex Number

We need now a notation which is more convenient than the matrix notation, and we get our cue from (iii) above. The matrices $I$ and $i$ correspond to the points $(1, 0)$, $(0, 1)$ of the plane and thus form a basis for $C$. More explicitly
$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix} = aI + bi \qquad (1.44)$$
If we identify the real number 1 with the identity matrix $I$, and more generally the real number $r$ with the complex number $rI + 0i$, then we can say that every real number is also a complex number. In fact, the complex system is just the real number system with a square root of $-1$ tacked on. (This takes us full circle back to the original conception of that Arabian adventurer. The difference here is that we now know what we mean by this procedure and that it produces no inconsistencies.) Thus, we can suppress the identity matrix in the expression and write a complex number in the form $a + bi$.

We now recapitulate the relevant facts. $C$ is the set of all 2 × 2 matrices $c = a + bi$ with $a$, $b$ real numbers. $a$ is the real part of $c$, written $a = \mathrm{Re}\, c$, and $b$ is the imaginary part, written $b = \mathrm{Im}\, c$. The following rules hold:
$$i^2 = -1$$
$$(a + bi) + (c + di) = (a + c) + (b + d)i$$
$$(a + bi)(c + di) = (ac - bd) + (ad + bc)i$$
$$(a + bi)^{-1} = \frac{a - bi}{a^2 + b^2} \quad \text{when } a^2 + b^2 \neq 0$$

Polar Form of a Complex Number

Since $C$ is in one-to-one correspondence with $R^2$, we can represent complex numbers by points in the plane (see Figure 1.14). Addition of complex numbers is the same as addition of vectors in the plane. We now seek a geometric description of multiplication of complex numbers. For this purpose it is convenient to move to polar coordinates.

Definition 14. Let $z = x + yi$. The modulus of $z$, written $|z|$, is its distance from the origin:
$$|z| = (x^2 + y^2)^{1/2}$$
[Figure 1.14: the point $z = x + iy$ in the plane]

The argument of $z$, written $\arg z$, is defined for $z \neq 0$; it is the angle defining the ray on which $z$ lies:
$$\arg z = \tan^{-1} \frac{y}{x}$$
We can write complex numbers in polar form: if $z = x + yi$ has the polar coordinates $(r, \theta)$ then, since $x = r \cos\theta$, $y = r \sin\theta$, we have
$$z = r(\cos\theta + i \sin\theta)$$
(We have moved the $i$ in front of $\sin\theta$ for the obvious notational convenience which results.) The set of points of modulus 1 is the unit circle centered at the origin. It is the set of all points of the form $\cos\theta + i \sin\theta$. We shall sometimes abbreviate this to $\mathrm{cis}\,\theta$. Precisely, $\mathrm{cis}\,\theta$ is the point of the unit circle lying on the ray of angle $\theta$.

Now, let $z$, $w$ be two complex numbers,
$$z = r\,\mathrm{cis}\,\theta \qquad w = \rho\,\mathrm{cis}\,\varphi$$
Then
$$zw = (r\,\mathrm{cis}\,\theta)(\rho\,\mathrm{cis}\,\varphi) = (r \cos\theta + ir \sin\theta)(\rho \cos\varphi + i\rho \sin\varphi)$$
$$= r\rho(\cos\theta \cos\varphi - \sin\theta \sin\varphi) + ir\rho(\cos\theta \sin\varphi + \cos\varphi \sin\theta) = r\rho\,\mathrm{cis}(\theta + \varphi)$$
Thus we form the product of two complex numbers by multiplying the moduli and adding the arguments. (This does not make sense if one of the numbers is zero, but that case is trivial anyway.)
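The rule "multiply the moduli, add the arguments" is easy to confirm numerically. The sketch below uses Python's built-in complex type and the standard `cmath.polar` function; the helper `cis` and the sample values are assumptions made for this illustration.

```python
import cmath
import math

def cis(theta):
    """Point of the unit circle on the ray of angle theta (illustrative helper)."""
    return complex(math.cos(theta), math.sin(theta))

# Two sample numbers in polar form: z = 2 cis(pi/6), w = 3 cis(pi/3).
z = 2 * cis(math.pi / 6)
w = 3 * cis(math.pi / 3)

zw = z * w
# cmath.polar returns the pair (modulus, argument) of a complex number.
modulus, argument = cmath.polar(zw)
# Expect modulus close to 2 * 3 and argument close to pi/6 + pi/3.
```

Note that `cmath.polar` reports the argument in the interval $(-\pi, \pi]$, so for larger angles the sum of the arguments agrees with the reported value only up to a multiple of $2\pi$.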
Notice then, if $z = \rho\,\mathrm{cis}\,\theta$, then $z^2 = \rho^2\,\mathrm{cis}\,2\theta$ and more generally
$$z^n = \rho^n\,\mathrm{cis}\,n\theta \qquad (1.45)$$
This observation leads to the fact that it is easy to extract roots. For the converse of (1.45) is
$$z^{1/k} = \rho^{1/k}\,\mathrm{cis}\,\frac{\theta}{k}$$

Proposition 25. Let $c$ be a nonzero complex number, and $k$ a positive integer. There are precisely $k$ distinct solutions to the equation
$$X^k = c$$

Proof. Write $c$ in polar form: $c = r\,\mathrm{cis}\,\theta$. If $z = \rho\,\mathrm{cis}\,\varphi$ is a solution, then
$$r\,\mathrm{cis}\,\theta = c = z^k = \rho^k\,\mathrm{cis}\,k\varphi$$
Thus the modulus of $z$ is the $k$th root of the modulus of $c$, and the argument of $z$ is an angle such that $k$ times it is $\theta$ (up to a multiple of $2\pi$). Well, $(1/k)\theta$ is such an angle, but so is $(1/k)(\theta + 2\pi)$. In fact, each of the angles
$$\frac{1}{k}\theta,\ \frac{1}{k}(\theta + 2\pi),\ \frac{1}{k}(\theta + 4\pi),\ \ldots,\ \frac{1}{k}(\theta + 2(k-1)\pi)$$
has the property that $k$ times it defines the same ray as $\theta$. All these angles define distinct rays, so $c = r\,\mathrm{cis}\,\theta$ has precisely these $k$ roots:
$$r^{1/k}\,\mathrm{cis}\,\frac{\theta}{k},\quad r^{1/k}\,\mathrm{cis}\,\frac{\theta + 2\pi}{k},\quad \ldots,\quad r^{1/k}\,\mathrm{cis}\,\frac{\theta + 2\pi(k-1)}{k}$$

Complex Eigenvalues

We shall work extensively with the complex number system in this text. In fact, we shall discover many situations besides the algebraic one above where study within the system of complex numbers is beneficial. In particular, let us return to the eigenvalue problems of the preceding section. We consider $C^n$, the space of $n$-tuples of complex numbers. We can define linear transformations on $C^n$ just as we did on $R^n$. In fact, the entire theory of linear algebra through Section 1.7 holds over $C^n$ as well as $R^n$. Let
$$E_i = (0, 0, \ldots, 0, 1, 0, \ldots, 0) \quad (1 \text{ in the } i\text{th place})$$
be the standard basis vectors for $C^n$. Again, any linear transformation on $C^n$ is given by a matrix $A = (a_j^i)$ of complex numbers relative to the standard basis:
$$T(z^1, \ldots, z^n) = \Big(\sum_j a_j^1 z^j, \ldots, \sum_j a_j^n z^j\Big)$$
for all $(z^1, \ldots, z^n) \in C^n$.

Examples

37. Consider the matrix
$$A = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$
as representing a transformation $T$ on $C^2$ relative to the standard basis. Its eigenvalues are the roots of $\det(A - tI) = 0$. But $\det(A - tI) = t^2 + 1$, so the roots are $i$, $-i$.

Eigenvalue $i$:
$$A - iI = \begin{pmatrix} -i & 1 \\ -1 & -i \end{pmatrix}$$
The second row is $-i$ times the first, so the kernel of $A - iI$ is given by the single equation $-ix + y = 0$. An eigenvector is $(1, i)$.

The eigenvalue $-i$ has the eigenvector $(1, -i)$. Now, $F: \{(1, i), (1, -i)\}$ are a basis of eigenvectors for $C^2$, so $T$ becomes diagonalized relative to this basis:
$$T_F = \begin{pmatrix} i & 0 \\ 0 & -i \end{pmatrix}$$
That is, if $z$ has coordinates $(z^1, z^2)$ relative to $F$, then $Tz$ has coordinates $(iz^1, -iz^2)$ relative to $F$.
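The eigenvector claims of Example 37 can be verified directly with Python's built-in complex numbers. This is a minimal sketch: the matrix entries are read from the example's stated characteristic polynomial $t^2 + 1$ and kernel equation $-ix + y = 0$, and the helper name `apply` is hypothetical.

```python
# Matrix consistent with Example 37: det(A - tI) = t^2 + 1 and the
# first row of A - iI is (-i, 1).
A = [[0, 1], [-1, 0]]

def apply(m, v):
    """Apply the matrix m (nested lists) to the vector v (tuple)."""
    return tuple(sum(m[r][c] * v[c] for c in range(len(v)))
                 for r in range(len(m)))

v_plus = (1, 1j)     # claimed eigenvector for the eigenvalue i
v_minus = (1, -1j)   # claimed eigenvector for the eigenvalue -i

image_plus = apply(A, v_plus)
image_minus = apply(A, v_minus)

# Scalar multiples to compare against: i * v_plus and (-i) * v_minus.
scaled_plus = tuple(1j * t for t in v_plus)
scaled_minus = tuple(-1j * t for t in v_minus)
```

Comparing `image_plus` with `scaled_plus` (and likewise for the other pair) confirms $Av = iv$ and $Av = -iv$ for the two vectors.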
1.8 Complex Numbers 91 38. Consider the matrix 4 i 1) representing a transformation Τ on C3 relative to the standard basis. det(A - rl) = -t3 + t2 - t + 1. This polynomial has the roots 1, i, — i. Since the roots are distinct and each must have a corresponding eigenvector, there is a basis of eigenvectors. We now find such a basis. Eigenvalue 1: A—(1 -i 1) The kernel of A — I is found as a linear relation among the columns (recall Example 18). Such a relation is Ct - 2C2 - 3C3 = 0 Thus (1, —2, —3) is an eigenvector with eigenvalue 1. Eigenvalue i: A-/I= i -/ i \ 2 1 l-t/ In order to find a relation among the columns we must row reduce· The result of row reduction is [0 ί -\Ui\ \0 0 0 / A solution of the corresponding homogeneous system is found by taking z3 = 5, then we obtain z2 = 1 - 3/, z1 = -3 + 4г. Thus, (-3 + 4/, 1 - 3f, 5) is an eigenvector with eigenvalue i. Similarly, we find the eigenvector (-3 - 4/, 1 + 3/, 5) corresponding to the
92 1. Linear Functions eigenvalue — i. Thus, Τ is represented by Ί 0 0 / ,0 0 & 0 — i relative to the basis (1, -2, -3), (-3 + 4/, 1 - 3/, 5), (-3 - 4/, 1 + 3/, 5). • EXERCISES 42. Find the inverse of these complex numbers: (a) 5-3/ (d) 4cis(2/3) (b) (1-0/2 (e) cis7 (c) 3 + i 43. Show that z_1 = ζ if and only if ζ is on the unit circle. 44. Show that the complex number cis θ represents rotation in the plane through the angle Θ, when considered as a 2 χ 2 matrix. 45. Find all kth roots of z: (a) к = 2, ζ = — i. (d) к = 3, ζ = i. (b) k = 5,z=-l (e) fc=2,z = 3/-4 (c) k=4,z=l + i (f) k = 3, ζ =15 + 5/ 46. Find, if possible, a basis of possibly complex eigenvectors for the transformations represented by these matrices (a) (b) (" "") (c) (d) Ό 1 ^ I I 1 -2 -3 -2 / ° 2 0 \-2 3\ 5 3/ 1 0 0 0 1 -i 0 0 1 i 0 2 • PROBLEMS 45. Compute that the matrix (-Ϊ1) has squar Ρ-J) has square equal to —I. We have chosen i to be the 2 χ 2 matrix so that the correspondence between complex numbers and operations on R2 will be correct. More precisely, we conceive a complex number in two ways: as a certain transformation on the plane, and as a vector on the plane. Given two complex numbers z, w we may interpret their product in two
ways: composition of the transformations, or the application of the transformation corresponding to $z$ to the vector $w$. We would like these two interpretations to have the same result. If $z = a + ib$, $w = c + id$, show that
$$zw, \qquad \begin{pmatrix} a & -b \\ b & a \end{pmatrix}\begin{pmatrix} c \\ d \end{pmatrix} \qquad \text{and} \qquad \begin{pmatrix} a & -b \\ b & a \end{pmatrix}\begin{pmatrix} c & -d \\ d & c \end{pmatrix}$$
are all the same under that correspondence.

46. Show that the complex numbers $z$, $\bar{z}$, when considered as vectors in $R^2$, are independent (unless they are real or pure imaginary).

47. Why do the complex eigenvalues of a real matrix come in conjugate pairs?

1.9 Space Geometry

In this section we shall introduce the basic notions of three-dimensional geometry, using vector notation. First of all, as in the plane, we select a particular point in space, called the origin and denoted 0. That being done we may refer to the points of space as vectors and think heuristically of the directed line segment from the origin to the point as a vector. The operations of scalar multiplication and addition can be defined as on the plane, and expressed in terms of coordinates in much the same way:

(i) If $P$ is a vector and $r$ a real number, $rP$ is the vector lying on the line through 0 and $P$ and of distance from 0 equal to $|r|$ times the length of the segment $0P$. If $r > 0$, $rP$ lies on the same side of 0 as $P$; if $r < 0$, $rP$ lies on the opposite side.

(ii) If $P$, $Q$ are two vectors in space, there is a unique parallelogram lying in the plane determined by $P$ and $Q$, three of whose vertices are 0, $P$, $Q$. We define $P + Q$ to be the fourth vertex.

Now, we turn to the coordinatization of space. Having chosen a point as origin, let $E_1$, $E_2$, $E_3$ be three new points with the property that 0, $E_1$, $E_2$, $E_3$ do not all lie on the same plane (we say the vectors $E_1$, $E_2$, $E_3$ are not coplanar). The three lines determined by the vectors $E_1$, $E_2$, $E_3$ are called the coordinate axes. Just as in two dimensions, the choice of the vectors $E_1$, $E_2$, $E_3$ enables us to put each line in one-to-one correspondence with the real numbers. In three dimensions two lines determine a plane.
We shall call the planes
through 0 determined by $E_1$ and $E_2$ the 1-2 plane, by $E_1$ and $E_3$ the 1-3 plane, and the plane determined by $E_2$ and $E_3$ is the 2-3 plane (Figure 1.15). These three planes are called the coordinate planes. Each of these planes can be put into one-to-one correspondence with $R^2$ just as in the case of two dimensions.

[Figure 1.15]

Now, to each point in space we can associate a triple of numbers relative to these choices in the following way. Let $P$ be any such point. There is a unique plane through $P$ which is parallel to the 2-3 plane, and this plane intersects the 1 axis in a unique point. This point has the coordinate $x^1$ relative to the scale determined by $E_1$. We shall call $x^1$ the first coordinate of $P$. The second, $x^2$, is found in the same way: by intersecting the plane through $P$ and parallel to the 1-3 plane with the 2 axis. Finally we find the third coordinate $x^3$ similarly, and associate the triple $(x^1, x^2, x^3)$ to $P$. In this way we put all of space into one-to-one correspondence with $R^3$, dependent upon the choice of vectors $E_1$, $E_2$, $E_3$, called a basis for space. The expressions in terms of coordinates of the operations of addition and scalar multiplication are precisely the same as in $R^2$ (no matter what basis is chosen):
$$r(x^1, x^2, x^3) = (rx^1, rx^2, rx^3)$$
$$(x^1, x^2, x^3) + (y^1, y^2, y^3) = (x^1 + y^1, x^2 + y^2, x^3 + y^3)$$
There is no need to check that these formulas correspond to the geometric descriptions given above; we need only refer to the computation in the plane.

When we are interested in the pictorial representation of problems of three-dimensional Euclidean geometry it is best if we consistently use a particular coordinatization. For this purpose we select the "right-handed rectangular coordinate system," where the coordinate axes are mutually perpendicular and the order 1 → 2 → 3 is that of a right-handed screw (see Figure 1.16). It is common in particular problems to refer to the coordinates by the letters $(x, y, z)$ rather than $(x^1, x^2, x^3)$. We shall use the numbered coordinates when it is more convenient to do so.

Inner Product

Now, the basic notions of Euclidean geometry are length and angle. It will be of importance to us to derive expressions for these in terms of coordinates. Consider first the length of the line segment $0P$ between the origin and the point $P$ with coordinates $(x, y, z)$. This can be easily computed by use of the Pythagorean theorem (consult Figure 1.17). Let $P'$ be the point of intersection with the $xz$ plane of the line through $P$ and parallel to the $y$ axis. Then $0P'P$ is a right triangle, so
$$|0P|^2 = |0P'|^2 + |P'P|^2$$

[Figure 1.16]
[Figure 1.17: the point $P(x, y, z)$]

Letting $P''$ be the point of intersection with the $x$ axis of the line through $P'$ and parallel to the $z$ axis, we obtain
$$|0P|^2 = |0P''|^2 + |P''P'|^2 + |P'P|^2$$
But now $|0P''|^2 = x^2$, $|P''P'|^2 = z^2$, $|P'P|^2 = y^2$, so
$$|0P| = [x^2 + y^2 + z^2]^{1/2}$$
Now, suppose $P(x, y, z)$, $Q(a, b, c)$ are any two points in space. By definition of addition, $P$ is the fourth vertex of the parallelogram three of whose vertices are 0, $P - Q$ and $Q$. Thus, the side through $P$ and $Q$ has the same length as the side through $P - Q$ and 0, so
$$|PQ| = |(P - Q)0| = [(x - a)^2 + (y - b)^2 + (z - c)^2]^{1/2} \qquad (1.46)$$
Finally, we can compute the angle between $P$ and $Q$ by the law of cosines (consult Figure 1.18); if $\theta$ is that angle, then
$$|PQ|^2 = |0P|^2 + |0Q|^2 - 2|0P|\,|0Q| \cos\theta$$
In coordinates,
$$(x - a)^2 + (y - b)^2 + (z - c)^2 = x^2 + y^2 + z^2 + a^2 + b^2 + c^2 - 2(x^2 + y^2 + z^2)^{1/2}(a^2 + b^2 + c^2)^{1/2} \cos\theta$$
which reduces to
$$\cos\theta = \frac{xa + yb + zc}{(x^2 + y^2 + z^2)^{1/2}(a^2 + b^2 + c^2)^{1/2}} \qquad (1.47)$$
The form in the numerator thus has some special importance: it together with the notion of length determines angles. It is called the inner product of the two vectors $P$, $Q$.

Definition 15. Let $P$, $Q$ be two vectors in space. Their Euclidean inner product, denoted $\langle P, Q \rangle$, is defined as $|P|\,|Q| \cos\theta$, where $\theta$ is the angle between $P$ and $Q$. In coordinates, if $P = (x_1, y_1, z_1)$, $Q = (x_2, y_2, z_2)$, then
$$\langle P, Q \rangle = x_1 x_2 + y_1 y_2 + z_1 z_2$$

Proposition 26. The nonzero vectors $P$ and $Q$ are perpendicular if and only if $\langle P, Q \rangle = 0$.

Proof. $P$ and $Q$ are perpendicular if and only if the angle $\theta$ between them is a right angle. $\theta$ is a right angle if and only if $\cos\theta = 0$, and this holds precisely when $\langle P, Q \rangle = 0$.

A plane through the origin is the linear span of two vectors. If $N$ is a vector perpendicular to such a plane $\Pi$, then $\Pi$ is given by the equation
$$\Pi: \langle x, N \rangle = 0$$

[Figure 1.18]
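Formula (1.47) and Proposition 26 translate directly into a few lines of code. The sketch below is an illustration in plain Python; the function names `inner`, `norm`, and `angle` and the sample vectors are assumptions for this example.

```python
import math

def inner(p, q):
    """Euclidean inner product <P, Q> in coordinates."""
    return sum(a * b for a, b in zip(p, q))

def norm(p):
    """Length |0P| = <P, P>^(1/2)."""
    return math.sqrt(inner(p, p))

def angle(p, q):
    """Angle between nonzero vectors p and q, from formula (1.47)."""
    cosine = inner(p, q) / (norm(p) * norm(q))
    # Clamp to [-1, 1] to guard against rounding slightly outside the range.
    return math.acos(max(-1.0, min(1.0, cosine)))

theta = angle((1, 0, 0), (1, 1, 0))            # expected to be close to pi/4
perpendicular = inner((1, 2, 3), (3, 0, -1))   # 3 + 0 - 3, so these are orthogonal
```

The clamp before `math.acos` is a design choice for floating-point safety; it does not change the mathematics.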
More generally, if $p$ is a point on a plane $\Pi$ (not necessarily through the origin) and $N$ is orthogonal to $\Pi$, then $\Pi$ is given by the equation
$$\langle x - p, N \rangle = 0$$
A line through the origin is the linear span of a single vector, and can be expressed by two linear equations (since a line is the intersection of two planes).

Examples

39. Find the equation of the plane through $(1, 2, 0)$ spanned by the two vectors $(1, 0, 1)$, $(3, 1, 2)$. If $N = (n^1, n^2, n^3)$ is perpendicular to this plane we must have
$$\langle N, (1, 0, 1) \rangle = n^1 + n^3 = 0$$
$$\langle N, (3, 1, 2) \rangle = 3n^1 + n^2 + 2n^3 = 0$$
A solution of this system is $(1, -1, -1)$, so we may take $N = (1, -1, -1)$. Then the equation of the plane is
$$\langle x - (1, 2, 0), (1, -1, -1) \rangle = 0 \quad \text{or} \quad x - y - z + 1 = 0$$

40. Find the equation of the plane through $P = (1, 0, -1)$, $Q = (2, 2, 2)$, $R = (3, 1, 1)$. If $N$ is perpendicular to the plane, we have
$$\langle N, P - R \rangle = 0 \qquad \langle N, Q - R \rangle = 0$$
Letting $N = (n^1, n^2, n^3)$, we obtain this system of equations:
$$-2n^1 - n^2 - 2n^3 = 0$$
$$-n^1 + n^2 + n^3 = 0$$
which has a solution $N = (1, 4, -3)$. Thus the equation we seek is $\langle x - R, N \rangle = 0$, or
$$(x - 3) + 4(y - 1) - 3(z - 1) = 0 \quad \text{or} \quad x + 4y - 3z = 4$$
(One checks that $P$, $Q$, and $R$ all satisfy this equation.)

41. Find the equations of the line $L$ through $(4, 0, 0)$ and perpendicular to the plane in Example 40. If $x$ is on $L$ we must have
$$\langle x - (4, 0, 0), P - R \rangle = 0 \qquad \langle x - (4, 0, 0), Q - R \rangle = 0$$
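so we may take these as the equations:
$$2x + y + 2z = 8$$
$$-x + y + z = -4$$

Vector Product

Given two noncollinear vectors $v_1$, $v_2$ in space, the set of vectors perpendicular to $v_1$, $v_2$ is a line. We shall now develop a useful formula for selecting a particular vector on that line, called the vector product $v_1 \times v_2$. If $N$ is on that line, and $x$ is in the linear span of $v_1$ and $v_2$, we have
$$\langle x, N \rangle = 0$$
On the other hand, since $x$, $v_1$, $v_2$ are coplanar we have
$$\det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix} = 0$$
Now there is a uniquely determined vector $N$ such that
$$\langle x, N \rangle = \det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix}$$
for all $x \in R^3$. This is easily seen using coordinates. Write
$$v_1 = (v_1^1, v_1^2, v_1^3), \quad v_2 = (v_2^1, v_2^2, v_2^3), \quad x = (x^1, x^2, x^3)$$
Then
$$\det\begin{pmatrix} x \\ v_1 \\ v_2 \end{pmatrix} = \det\begin{pmatrix} x^1 & x^2 & x^3 \\ v_1^1 & v_1^2 & v_1^3 \\ v_2^1 & v_2^2 & v_2^3 \end{pmatrix}$$
$$= x^1(v_1^2 v_2^3 - v_1^3 v_2^2) - x^2(v_1^1 v_2^3 - v_1^3 v_2^1) + x^3(v_1^1 v_2^2 - v_1^2 v_2^1)$$
$$= \langle (x^1, x^2, x^3),\ (v_1^2 v_2^3 - v_1^3 v_2^2,\ v_1^3 v_2^1 - v_1^1 v_2^3,\ v_1^1 v_2^2 - v_1^2 v_2^1) \rangle$$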
Definition 16. Let $v = (v^1, v^2, v^3)$, $w = (w^1, w^2, w^3)$ be two vectors in $R^3$. The vector product $v \times w$ is defined by
$$v \times w = (v^2 w^3 - v^3 w^2,\ v^3 w^1 - v^1 w^3,\ v^1 w^2 - v^2 w^1)$$

Proposition 27.
(i) $\langle x, v \times w \rangle = \det\begin{pmatrix} x \\ v \\ w \end{pmatrix}$ for all $x \in R^3$.
(ii) $v \times w = -w \times v$.
(iii) $v \times w$ is orthogonal to $v$ and $w$.
(iv) The plane through the origin spanned by $v$ and $w$ has the equation $\langle x, v \times w \rangle = 0$.

The proof of this proposition is completely contained in the preceding discussion. The basic property of the vector product is the first; it follows, for example, that for any three vectors $u$, $v$, $w$
$$\langle u, v \times w \rangle = \langle u \times v, w \rangle = \langle v, w \times u \rangle = \langle v \times w, u \rangle$$
Notice that if $v$, $w$ are collinear, $v \times w = 0$. If they are not collinear, the ordered basis $v \to w \to v \times w$ is right handed (see Figure 1.19).

The following proposition gives an important geometric interpretation of the magnitude of $v \times w$.

Proposition 28. Let $u$, $v$, $w$ be three noncollinear vectors.
(i) The area of the parallelogram spanned by $u$ and $v$ is $\|u \times v\|$.

[Figure 1.19]
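(ii) The volume of the parallelepiped spanned by $u$, $v$, and $w$ is
$$\left|\det\begin{pmatrix} u \\ v \\ w \end{pmatrix}\right|$$

Proof.
Step (a). The first step is to verify (ii) in case the vectors $u$, $v$, $w$ are mutually perpendicular. In that case we must show that
$$\left|\det\begin{pmatrix} u \\ v \\ w \end{pmatrix}\right| = \|u\| \cdot \|v\| \cdot \|w\|$$
This follows easily from the multiplicative property of the determinant. First we note that
$$\begin{pmatrix} u \\ v \\ w \end{pmatrix}(u, v, w) = \begin{pmatrix} \langle u, u \rangle & 0 & 0 \\ 0 & \langle v, v \rangle & 0 \\ 0 & 0 & \langle w, w \rangle \end{pmatrix}$$
since the $(i, j)$th entry is the inner product of the $i$th row of the first matrix with the $j$th column of the second (see Problem 49). Thus,
$$\det\begin{pmatrix} u \\ v \\ w \end{pmatrix}^2 = \det\left[\begin{pmatrix} u \\ v \\ w \end{pmatrix}(u, v, w)\right] = \|u\|^2 \|v\|^2 \|w\|^2$$

Step (b). In particular, if $u$, $v$ are perpendicular, then $u \times v$, $u$, $v$ are mutually perpendicular, so by Proposition 27(i) and Step (a),
$$\|u \times v\|^2 = \det\begin{pmatrix} u \times v \\ u \\ v \end{pmatrix} = \|u \times v\| \cdot \|u\| \cdot \|v\|$$
so $\|u \times v\| = \|u\| \cdot \|v\|$ when $u$ is perpendicular to $v$.

Step (c). Now we prove part (i) in general. Let $\theta$ be the angle between $u$ and $v$ (see Figure 1.20). Then the area of the parallelogram spanned by $u$ and $v$ is the product of the base and the height $a$ over the base:
$$\text{area} = \|u\|\,a = \|u\| \cdot \|v\| \sin\theta$$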
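Now the vector $u \times (u \times v)$ is orthogonal to $u$ and $u \times v$, so lies in the plane spanned by $u$ and $v$ and is orthogonal to $u$. We have
$$\sin\theta = \cos\left(\frac{\pi}{2} - \theta\right) = \frac{|\langle v, u \times (u \times v) \rangle|}{\|v\| \cdot \|u \times (u \times v)\|}$$
Since $u$ and $u \times v$ are orthogonal, by Step (b) we have $\|u \times (u \times v)\| = \|u\| \cdot \|u \times v\|$. Moreover, by the identities following Proposition 27, $\langle v, u \times (u \times v) \rangle = \langle v \times u, u \times v \rangle = -\langle u \times v, u \times v \rangle$. Thus
$$\text{area} = \|u\| \cdot \|v\| \frac{|\langle v, u \times (u \times v) \rangle|}{\|v\| \cdot \|u \times (u \times v)\|} = \|u\| \frac{\langle u \times v, u \times v \rangle}{\|u\| \cdot \|u \times v\|} = \|u \times v\|$$

Step (d). To prove part (ii) we refer to Figure 1.21. The volume of the parallelepiped spanned by $u$, $v$, $w$ is the product of the area of the base and the altitude $b$:
$$\text{volume} = \|u \times v\|\,b = \|u \times v\| \cdot \|w\| \sin\varphi$$
where $\varphi$ is the angle between $w$ and the $u$, $v$ plane. Since $u \times v$ is orthogonal to the $u$, $v$ plane,
$$\sin\varphi = \cos\left(\frac{\pi}{2} - \varphi\right) = \frac{|\langle w, u \times v \rangle|}{\|w\| \cdot \|u \times v\|}$$

[Figure 1.20: the vectors $u$, $v$, and $u \times (u \times v)$]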
[Figure 1.21]

Thus
$$\text{volume} = \|u \times v\| \cdot \|w\| \frac{|\langle w, u \times v \rangle|}{\|w\| \cdot \|u \times v\|} = \left|\det\begin{pmatrix} u \\ v \\ w \end{pmatrix}\right|$$
A final equality which will prove useful is this:
$$\|u \times v\|^2 = \|u\|^2 \|v\|^2 - \langle u, v \rangle^2$$
This follows easily from the above arguments:
$$\|u \times v\|^2 = \|u\|^2 \|v\|^2 \sin^2\theta = \|u\|^2 \|v\|^2 (1 - \cos^2\theta) = \|u\|^2 \|v\|^2 - \langle u, v \rangle^2$$
since the angle between $u$ and $v$ is $\theta$.

• EXERCISES

47. Which pairs of the following vectors are orthogonal?
$v_1 = (2, 1, 2)$, $v_2 = (3, -1, 4)$, $v_3 = (7, 0, 5)$, $v_4 = (6, -2, 5)$, $v_5 = (1, 3, 0)$, $v_6 = (0, 0, 1)$, $v_7 = (-15, 5, 21)$
104 1. Linear Functions 48. Find the vector products Vj χ vj for all pairs of vectors given in Exercise 47. 49. Find a vector ν such that <v,V!>=2 <v, v2> = -l <v, v3>=7 where vb v2, v3 are given in Exercise 47. 50. Find the equation of the plane spanned by the vectors (a) Vi, ve, (b) v2, v3, (c) v5, v., (d) v2, n, where the v, are given in Exercise 47. 51. Find the equation of the line spanned by the vectors given in Exercise 47. 52. Fmd the equation of the plane through (3, 2, 1) and orthogonal to the vector (-7, 1,2). 53. Find the equation of the line through (0, 2, 0) and orthogonal to the plane spanned by (1, — 1, 1) and (0, 3, 1). 54. Find the equation of the line through the origin and perpendicular to the plane through the points (a) E1;E2,E3 (b) (1,1, 5), (0,0, 2), (-1,-1,0) (c) (0, 0, 0), (0, 0, 1), (0, 1, 0) 55. Find the equation of the line of intersection of the two planes, (a) determined by (a), (b) of Exercise 54. (b) determined by (a), (c) of Exercise 54. (c) determined by (b), (c) of Exercise 54. 56. Find the plane of vectors perpendicular to each of the lines determined in Exercise 55. 57. Let A be a 3 χ 3 matrix. Show that (a) if the rows of A lie on a plane (but not on a line), the set of solutions of Ax = b forms a line, or is empty. (b) if the rows of A lie on a line, the set of solutions of Ax = b forms a plane, or is empty. 58. Show that ||v χ w|| = ||v|| · ||w|| sin Θ, where θ is the angle between the two vectors ν and w. 59. Is the vector product associative; that is, is (u χ ν) χ w = u χ (ν χ w) always true? 60. If v, w are two noncollinear vectors show that the three vectors ν, ν χ w, ν χ (ν χ w) are pairwise orthogonal. • PROBLEMS 48. Prove the identities of Proposition 27. 49. Let Vi, v2, v3 be three vectors in R1 Let A be the matrix whose
rows are $v_1$, $v_2$, $v_3$ and $B$ the matrix with columns $v_1$, $v_2$, $v_3$. Show that
(a) the $(i, j)$th entry of $AB$ is $\langle v_i, v_j \rangle$.
(b) $\det A = \det B$.

50. Let $P$ be given by the coordinates $(x, y, z)$ relative to a choice $E_1$, $E_2$, $E_3$ of basis for space. Show that the point of intersection of the line through $P$ parallel to the $E_1$ axis with the 2-3 plane has coordinates $(0, y, z)$.

1.10 Abstract Notions of Linearity

There are many collections of mathematical objects which are endowed with a natural algebraic structure which is very reminiscent of $R^n$. To be less vague, there are defined, within these collections, the operations of addition and multiplication by real numbers. Furthermore, the problems that naturally arise in these other contexts are reminiscent of the problems on $R^n$ which we have been studying. The question to ask, then, is this: does the same theory hold, and will the same techniques work in this more general context? We shall see in this section that for a large class of such objects (the finite-dimensional vector spaces) the theory is the same. We shall see later on that in many other cases, the techniques we have developed can be modified to provide solutions to problems in the more general context. First, let us consider some examples.

Examples

42. If $f$ and $g$ are continuous real-valued functions on the interval $[0, 1]$, then we can define the functions $f + g$, $cf$ as follows:
$$(f + g)(x) = f(x) + g(x) \qquad (cf)(x) = cf(x)$$
Clearly, $f + g$ and $cf$ are also continuous. Thus we see that operations of addition and scalar multiplication are defined on the collection $C([0, 1])$ of all continuous functions on the interval $[0, 1]$.

43. In the above example, if $f$ and $g$ are differentiable, so are $f + g$ and $cf$. Thus the space $C^1([0, 1])$ of functions on the interval $[0, 1]$ with continuous derivatives also has the operations of addition and scalar multiplication.
Notice that the operation of differentiation takes functions in $C^1([0, 1])$ into $C([0, 1])$: if $f$ is in $C^1([0, 1])$ it has a continuous derivative, so $f'$ is in $C([0, 1])$. Furthermore,
differentiation could be described as a linear transformation:
$$(f + g)' = f' + g' \qquad (cf)' = cf'$$
So is, by the way, integration a linear transformation:
$$\int (f + g) = \int f + \int g \qquad \int (cf) = c \int f$$
The fundamental theorem of calculus says that differentiation is the inverse operation for integration:
$$\left(\int f\right)' = f$$
These remarks may strike you as merely a curious way of describing well-known phenomena, but the implied point of view has led to a wide range of mathematical discoveries. The subject of functional analysis, which was developed early in the 20th century, came out of this geometric-algebraic approach to long-standing problems of analysis.

Examples

44. If $S$ and $T$ are linear transformations of $R^n$ to $R^m$, then so is the function $S + T$ defined by
$$(S + T)(x) = S(x) + T(x)$$
We can also multiply a linear transformation by a scalar:
$$(cS)(x) = cS(x)$$
Thus the space $L(R^n, R^m)$ of linear transformations of $R^n$ to $R^m$ has defined on it operations of addition and scalar multiplication.

45. We have already observed (Section 1.6) that the collection $M^n$ of $n \times n$ matrices has defined on it these two important operations. In fact, we used, in an essential way, the fact that when we viewed $M^n$ this way it was just the same as $R^{n^2}$.

These examples, together with $R^n$, lead to the notion of an abstract vector space: a set together with the operations of addition and scalar multiplication. We include in the definition the algebraic laws governing these operations.
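Example 44 can be made concrete by representing linear transformations of $R^3$ to $R^2$ as matrices. The following is a hedged sketch in plain Python; the helper names `apply`, `add`, and `scale` and the sample matrices are assumptions for this illustration only.

```python
# Linear transformations of R^3 to R^2, stored as 2x3 matrices (lists of rows).
def apply(m, x):
    """Value of the linear transformation with matrix m at the vector x."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in m]

def add(s, t):
    """Matrix of S + T, formed entry by entry."""
    return [[a + b for a, b in zip(rs, rt)] for rs, rt in zip(s, t)]

def scale(c, s):
    """Matrix of cS."""
    return [[c * a for a in row] for row in s]

S = [[1, 2, 0], [0, 1, -1]]
T = [[0, 1, 1], [2, 0, 3]]
x = [1, -1, 2]

sum_at_x = apply(add(S, T), x)                                 # (S + T)(x)
pointwise = [a + b for a, b in zip(apply(S, x), apply(T, x))]  # S(x) + T(x)
scaled_at_x = apply(scale(3, S), x)                            # (3S)(x)
```

Here `sum_at_x` and `pointwise` agree, which is the defining property of $S + T$, and `scaled_at_x` equals three times `apply(S, x)`.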
Definition 17. An abstract vector space is a set $V$ with a distinguished element, 0, called the origin, on which are defined two operations:

Addition. If $v$ and $w$ are elements of $V$, then $v + w$ is a well-defined element of $V$.

Scalar multiplication. If $v$ is in $V$ and $c$ is a real number, $cv$ is a well-defined element of $V$.

These operations must behave in accordance with these laws:
(i) $v + (w + x) = (v + w) + x$,
(ii) $v + w = w + v$,
(iii) $v + 0 = v$,
(iv) $c(v + w) = cv + cw$,
(v) $c_1(c_2 w) = (c_1 c_2)w$,
(vi) $1w = w$.

The preceding examples are all abstract vector spaces; the verifications of the required laws are easily performed. We now want to investigate the extent to which the ideas and facts discussed in the case of $R^n$ carry over to abstract vector spaces. First of all, all the definitions carry over sensibly to the abstract case if we just replace the word $R^n$ by the words "an abstract vector space $V$." Thus we take these notions as defined also in the abstract case: linear transformation, linear subspace, span, independent, basis, dimension.

Now there is one bit of amplification necessary in the case of dimension. We have until now encountered spaces of only finite dimension.

Example

46. Let $R^\infty$ be the collection of all sequences of real numbers. Thus an element of $R^\infty$ is an ordered $\infty$-tuple,
$$(x^1, x^2, \ldots, x^n, \ldots)$$
$R^\infty$ is an abstract vector space with these operations:
$$(x^1, x^2, \ldots, x^n, \ldots) + (y^1, y^2, \ldots, y^n, \ldots) = (x^1 + y^1, x^2 + y^2, \ldots, x^n + y^n, \ldots)$$
$$c(x^1, x^2, \ldots, x^n, \ldots) = (cx^1, cx^2, \ldots, cx^n, \ldots)$$
Now $R^\infty$ has an infinite set of independent vectors. Let $E_n$ be the sequence all of whose entries are zero but for the $n$th, which is 1. This entire collection $\{E_1, \ldots, E_n, \ldots\}$ is an independent set. For if there is a relation among some finite subset of these, it must be of the form
$$c^1 E_1 + \cdots + c^k E_k = 0$$
(of course, many of the $c$'s may be zero). But
$$c^1 E_1 + \cdots + c^k E_k = (c^1, c^2, \ldots, c^k, 0, 0, \ldots)$$
so if this vector is zero we must have $c^1 = c^2 = \cdots = c^k = 0$. Thus indeed the set $\{E_1, \ldots, E_n, \ldots\}$ is an infinite independent set in $R^\infty$.

We now make the following restriction to the so-called finite-dimensional vector spaces, and we shall see that all of the preceding information about $R^n$ holds also in this more general case.

Definition 18. A vector space $V$ is finite dimensional if there is a finite set of vectors $v_1, \ldots, v_k$ which span $V$.

That $R^\infty$ is not finite dimensional follows from some of the observations to be made below. It can also be verified in the terms of the above definition (see Problem 53). The important result about finite-dimensional vector spaces is that they are no different from the spaces $R^n$.

Proposition 29. Let $V$ be a finite-dimensional vector space of dimension $d$. If $v_1, \ldots, v_d$ is a basis for $V$, every vector in $V$ can be expressed uniquely as a combination of $v_1, \ldots, v_d$:
$$v = x^1 v_1 + \cdots + x^d v_d$$
$(x^1, \ldots, x^d)$ is called the coordinate of $v$ relative to the basis $v_1, \ldots, v_d$. The correspondence $v \to (x^1, \ldots, x^d)$ is a one-to-one linear transformation of $V$ onto $R^d$.

Proof. The definition of basis (Definition 6) makes this proposition quite clear. We leave the verifications to the reader (Problem 54).

What is not so clear is that every finite-dimensional vector space has a basis, and that every basis has the same number of elements. However, once these facts are established the above proposition serves to reduce the general finite-dimensional space to one of the $R^n$, and the results of Sections 1.3 through 1.6 carry over.

Proposition 30. Every finite-dimensional vector space $V$ has a finite basis, and every basis has the same number of elements, the dimension of $V$.

Proof. Suppose $V$ is finite dimensional. Then $V$ has a finite spanning set. Let $\{v_1, \ldots, v_d\}$ be a spanning set with the minimal number of vectors; by definition $V$ has dimension $d$.
We shall show that $\{v_1, \ldots, v_d\}$ is a basis.
Since $\{v_1, \ldots, v_d\}$ span, every vector in $V$ can be written as a linear combination of these vectors. We have to show that there is only one way in which this can be done. Suppose for some vector $v$ we have two different such ways:
$$v = x^1 v_1 + \cdots + x^d v_d = y^1 v_1 + \cdots + y^d v_d \qquad (1.48)$$
Then
$$(x^1 - y^1)v_1 + \cdots + (x^d - y^d)v_d = 0$$
Since these two expressions differ we must have $x^j \neq y^j$ for some $j$. Thus
$$v_j = -\sum_{i \neq j} \frac{x^i - y^i}{x^j - y^j}\, v_i$$
Now this equation says that $v_j$ is in the linear span of the $d - 1$ elements $v_1, \ldots, v_{j-1}, v_{j+1}, \ldots, v_d$, so these elements serve to span all of $V$ also. But this contradicts the minimal assumption about $d$. Thus it must be impossible to express $v$ in terms of $v_1, \ldots, v_d$ in two different ways. Hence $\{v_1, \ldots, v_d\}$ is a basis.

That any two bases have the same number of elements follows easily from Proposition 29 (see also Problem 55). Let $T: V \to R^d$ be the linear transformation associating to each vector its coordinate relative to the above basis $\{v_1, \ldots, v_d\}$. If $\{w_1, \ldots, w_\delta\}$ is another basis, let $S: V \to R^\delta$ be the same coordinate mapping relative to this basis. Then $L = S \circ T^{-1}$ is a one-to-one linear mapping of $R^d$ onto $R^\delta$, so $\rho(L) = \delta$, $\nu(L) = 0$. Thus (rank + nullity = dimension): $\delta = d$.

• PROBLEMS

51. Show that for any finite set of vectors $S = \{v_1, \ldots, v_k\}$ in $R^\infty$, there is a vector $w \in R^\infty$ which does not lie in their linear span $[S]$. (Hint: Let $v'$ represent the first $(k + 1)$-tuple of entries in $v$. Since $v_1', \ldots, v_k'$ cannot span $R^{k+1}$, there is a vector $w'$ in $R^{k+1}$ which cannot be written as a combination of $v_1', \ldots, v_k'$. Let $w = (w', 0, \ldots)$.)

52. Are the vectors $E_1, \ldots, E_n, \ldots$ in $R^\infty$ described in Example 46 a basis for $R^\infty$?

53. Let $R_0^\infty$ be the collection of those sequences of real numbers $(x^1, x^2, \ldots, x^n, \ldots)$ such that $x^n = 0$ for all but finitely many $n$. Then $R_0^\infty$ is a linear subspace of $R^\infty$. Show that the vectors $E_1, \ldots, E_n, \ldots$ are a basis for $R_0^\infty$.

54. Prove Proposition 29.

55.
Prove, by following the arguments in Section 1.4, that any two bases of a finite-dimensional vector space have the same number of elements.
56. Let $V$, $W$ be two vector spaces. Show that the collection $L(V, W)$ of linear transformations from $V$ to $W$ is a vector space under the two operations:
(a) if $c \in R$, $L \in L(V, W)$, $(cL)(x) = cL(x)$,
(b) if $L, L' \in L(V, W)$, $(L + L')(x) = L(x) + L'(x)$.

57. What is the dimension of $L(R^n, R^m)$?

58. Show that a vector space $V$ is finite dimensional if there is a one-to-one linear transformation of $V$ into $R^n$ for some $n$.

59. Show that a vector space $V$ is finite dimensional if there is a linear transformation $T$ of $R^n$ onto $V$ for some $n$.

60. Verify that the collection $P$ of polynomials is an abstract vector space. For a positive integer $n$, let $P_n$ be the collection of polynomials of degree not more than $n$. Show that $P_n$ is a linear subspace of $P$. Show that $P$ is not finite dimensional, whereas $P_n$ is. What is the dimension of $P_n$?

61. Let $x_0, \ldots, x_n$ be distinct real numbers and $c_0, \ldots, c_n$ another collection of real numbers. Show that there is one and only one polynomial $p$ in $P_n$ such that
$$p(x_i) = c_i \qquad 0 \leq i \leq n$$
(Hint: Let $L: P_n \to R^{n+1}$ be defined by $L(p) = (p(x_0), \ldots, p(x_n))$. Show that $L$ has rank $n + 1$.)

62. Let $g$ be a polynomial, and define the function $G: P \to P$:
$$G(p) = pg$$
Show that $G$ is a linear function. Describe the range and kernel of $G$.

63. Define $D^k: P \to P$: $D^k(p) = d^k p / dx^k$. What are the range and kernel of $D^k$?

64. Let $x_0 \in R$, and let $c_0, \ldots, c_k$ be given numbers. Show that there is one and only one polynomial $p$ in $P_k$ such that
$$p(x_0) = c_0, \quad \frac{dp}{dx}(x_0) = c_1, \quad \ldots, \quad \frac{d^k p}{dx^k}(x_0) = c_k$$
(Hint: Use the same idea as in Problem 61.)

65. Does $D^k: P \to P$ have any eigenvalues?

66. Show that $C([0, 1])$ is not a finite-dimensional vector space.

1.11 Inner Products

The notion of length, or distance, is important in the geometric study of planar and spatial configurations. In Section 1.3 we studied these concepts and related them to an algebraic concept, the inner product. From the point of view of analysis also it is true that these concepts are significant:
it is in terms of distance that we can express "closeness" and in particular "convergence." By analogy with $R^3$ we define the inner product in $R^n$, and in terms of it, distance. While we are here we shall, in this section, introduce some topological terms.

Definition 19. The inner product of two vectors $v = (v^1, \ldots, v^n)$, $w = (w^1, \ldots, w^n)$, denoted by $\langle v, w \rangle$, is defined as
$$\langle v, w \rangle = \sum_{i=1}^{n} v^i w^i$$
We shall say that $v$ is orthogonal to $w$ if $\langle v, w \rangle = 0$. The distance $d(v, w)$ between $v$ and $w$ is defined by
$$d(v, w) = \Big[ \sum_{i=1}^{n} (v^i - w^i)^2 \Big]^{1/2}$$
The modulus $|v|$ of a vector $v$ is the distance between $v$ and $0$,
$$|v| = d(v, 0) = \Big[ \sum_{i=1}^{n} (v^i)^2 \Big]^{1/2}$$

Distance in $R^n$ behaves much as it does in $R^2$ and $R^3$; in particular, the Pythagorean theorem holds:
$$d(v, w)^2 = d(v, x)^2 + d(x, w)^2 \qquad (1.49)$$
when $\langle v - x, w - x \rangle = 0$. In any event, two points are no further apart than the sum of their distances from a third,
$$d(v, w) \le d(v, x) + d(x, w) \qquad (1.50)$$
These facts will be verified in the problems.

Topological Notions

Definition 20. The ball in $R^n$ of radius $R > 0$ and center $c$, denoted $B(c, R)$, is the set of all points whose distance from $c$ is less than $R$:
$$B(c, R) = \{x \in R^n : d(x, c) < R\}$$
A set $S$ is said to be a neighborhood of a point $c$ if it contains some ball centered at $c$. A set $U$ is said to be open if it contains a neighborhood of each of its points.

Thus, a set $S$ is a neighborhood of $c$ if there is some $R$ (presumably very
small) such that $d(x, c) < R$ implies $x \in S$. A set $U$ is open if for every $c \in U$, there is an $R$ such that $U \supset B(c, R)$.

Notice that any ball is open. For suppose $x \in B(c, R)$. Then $d(x, c) < R$, so $R - d(x, c) > 0$. Now $B(c, R)$ contains the ball of radius $R - d(x, c)$ centered at $x$. For if $y$ is a point in that ball, then by (1.50),
$$d(y, c) \le d(y, x) + d(x, c) < R - d(x, c) + d(x, c) = R$$

Here is a collection of formal properties of the collection of open sets.

Proposition 31.
(i) $R^n$ is open.
(ii) If $U_1, \ldots, U_n$ are open, so is $U_1 \cap \cdots \cap U_n$.
(iii) If $C$ is any collection of open sets, then the set of all points belonging to any of the sets in $C$ is open. (This set is denoted $\bigcup_{U \in C} U$.)

Proof.
(i) Clearly, $R^n$ contains a ball centered at every one of its points.
(ii) Suppose $U_1, \ldots, U_n$ are open, and $x$ is in every $U_i$. Then there are $R_1, \ldots, R_n$ such that $U_1 \supset B(x, R_1), \ldots, U_n \supset B(x, R_n)$. Let $R = \min[R_1, \ldots, R_n]$. Then if $d(y, x) < R$, $y$ is in each $B(x, R_i)$ so is in each $U_i$. Thus $y$ is in $U_1 \cap \cdots \cap U_n$. In particular, $U_1 \cap \cdots \cap U_n \supset B(x, R)$. Thus $U_1 \cap \cdots \cap U_n$ is a neighborhood of any one of its points $x$, and is thus open.
(iii) Suppose $C$ is a collection of open sets. If $x$ is in any one of them, say $U$, then since $U$ is open there is an $R$ such that $U \supset B(x, R)$. Thus, $\bigcup_{U \in C} U \supset B(x, R)$. Thus $\bigcup_{U \in C} U$ is a neighborhood of any one of its points, so is open.

Many of the concepts a mathematician studies are so-called local concepts: they happen in a neighborhood of a point, or are determined by what goes on near a point, far behavior being irrelevant. Differentiation is thus local, whereas integration is not. The importance of open sets is that it is precisely on such sets that we should study these local concepts, since their definition at a point depends on behavior in some neighborhood of the point.

If a set is open, its complement, the set of all points not in the given set, is said to be closed. Thus, $S$ is a closed subset of $R^n$ if $R^n - S = \{x \in R^n : x \notin S\}$ is open.
Corresponding to Proposition 31 we have this proposition about closed sets.

Proposition 32.
(i) $R^n$ is closed.
(ii) If $S_1, \ldots, S_n$ are closed, so is $S_1 \cup \cdots \cup S_n$.
(iii) If $C$ is a collection of closed sets, then the set of all points common to all the sets of $C$ is closed. (This set is denoted $\bigcap_{S \in C} S$.)

Proof. Problem 67.

Notice that there are sets which are both open and closed. There are not many of them: $R^n$ and $\emptyset$ are the only ones. There are also sets which are neither open nor closed, and there are many of them. For example, an interval is open in $R^1$ if it contains neither end point, closed if it contains both, and neither open nor closed if it contains only one end point.

We are acquainted with the notion of "dropping a perpendicular" in the plane. That is, if $l$ is a line and $p$ is a point not on the line, then we can drop a perpendicular from $p$ to $l$ as in Figure 1.22. The point $p_0$ of intersection of the perpendicular with $l$ is the point on $l$ which is closest to $p$. A more sophisticated way of describing this situation is to say that $p_0$ is the orthogonal projection of $p$ on $l$. The concept of orthogonal projection generalizes to $R^n$ and will prove quite useful there. In order to discuss this problem, we shall generalize even further.

Figure 1.22

Definition 21. A Euclidean vector space is an abstract vector space $V$ on which is defined a real-valued function of pairs of vectors, called the inner product, and denoted $\langle \, , \rangle$. The inner product must obey these laws:
(i) $\langle v, v \rangle \ge 0$. If $\langle v, v \rangle = 0$, then $v = 0$.
(ii) $\langle v, w \rangle = \langle w, v \rangle$.
(iii) $\langle av, w \rangle = a \langle v, w \rangle$.
(iv) $\langle v_1 + v_2, w \rangle = \langle v_1, w \rangle + \langle v_2, w \rangle$.
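The four laws of Definition 21 can be spot-checked for the inner product of Definition 19. The following is only a sketch: the helper names (`inner`, `add`, `scale`, `check_laws`, `rand_vec`) are made up for this illustration, exact rational arithmetic is used so that equalities can be tested without rounding, and a finite random test is of course not a proof.

```python
from fractions import Fraction
import random

def inner(v, w):
    """Inner product of Definition 19: the sum of v^i * w^i."""
    return sum(a * b for a, b in zip(v, w))

def add(v, w):
    return [a + b for a, b in zip(v, w)]

def scale(c, v):
    return [c * a for a in v]

def check_laws(v1, v2, w, a):
    """True if laws (i)-(iv) hold for these particular vectors and scalar."""
    zero = [Fraction(0)] * len(v1)
    law_i = inner(v1, v1) >= 0 and (inner(v1, v1) != 0 or v1 == zero)
    law_ii = inner(v1, w) == inner(w, v1)
    law_iii = inner(scale(a, v1), w) == a * inner(v1, w)
    law_iv = inner(add(v1, v2), w) == inner(v1, w) + inner(v2, w)
    return law_i and law_ii and law_iii and law_iv

rng = random.Random(0)  # fixed seed so the run is repeatable

def rand_vec(n):
    """A random vector in R^n with small rational coordinates."""
    return [Fraction(rng.randint(-9, 9), rng.randint(1, 9)) for _ in range(n)]

all_ok = all(
    check_laws(rand_vec(4), rand_vec(4), rand_vec(4), Fraction(rng.randint(-9, 9), 7))
    for _ in range(200)
)
print(all_ok)
```

The same `check_laws` routine would work for any other candidate inner product once `inner` is replaced, which is the point of isolating the four laws.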
It is clear that $R^n$ is a Euclidean vector space when endowed with its inner product. The space $C[0, 1]$ of continuous functions on the unit interval is a Euclidean vector space with this inner product:
$$\langle f, g \rangle = \int_0^1 f(t) g(t) \, dt$$
We leave it to the reader to verify that the laws (i)-(iv) are obeyed.

It is interesting that the laws (i)-(iv) are all that is essential to the notion of inner product; that is, any such function behaving in accordance with those laws will have all the properties of an inner product. Despite the inherent interest in this "metamathematical" point, we shall not pursue it further, but take it for granted that the above definition has indeed abstracted the essence of this notion.

In terms of an inner product on a vector space we can define the notions of length and orthogonality:
$$\|v\| = [\langle v, v \rangle]^{1/2}$$
$$v \perp w \text{ if and only if } \langle v, w \rangle = 0$$

The important bases in a Euclidean vector space are those bases whose vectors are mutually orthogonal. More specifically, we shall call a set $\{E_1, \ldots, E_n\}$ in a Euclidean vector space $V$ an orthonormal set if
$$\|E_i\| = 1 \text{ for all } i$$
$$E_i \perp E_j \text{ for all } i \ne j$$
If the vectors $E_1, \ldots, E_n$ span $V$ we shall call them an orthonormal basis. (Any orthonormal set of vectors is independent; see Problem 68.) The basic geometric fact concerning orthonormal sets is the following:

Proposition 33. Let $V$ be a Euclidean vector space and $\{E_1, \ldots, E_n\}$ an orthonormal set in $V$. For any vector $v$ in $V$, the vector $v_0 = v - \sum_{i=1}^{n} \langle v, E_i \rangle E_i$ is orthogonal to the linear span $S$ of $\{E_1, \ldots, E_n\}$.

Proof. Let $w = \sum_{i=1}^{n} c^i E_i$ be in $S$. Then
$$\langle v_0, w \rangle = \Big\langle v - \sum_i \langle v, E_i \rangle E_i, \, w \Big\rangle = \langle v, w \rangle - \sum_i \langle v, E_i \rangle \langle E_i, w \rangle$$
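Now
$$\langle E_i, w \rangle = \Big\langle E_i, \sum_{j=1}^{n} c^j E_j \Big\rangle = \sum_{j=1}^{n} c^j \langle E_i, E_j \rangle = c^i$$
$$\langle v, w \rangle = \Big\langle v, \sum_{i=1}^{n} c^i E_i \Big\rangle = \sum_{i=1}^{n} c^i \langle v, E_i \rangle$$
Thus
$$\langle v_0, w \rangle = \sum_{i=1}^{n} c^i \langle v, E_i \rangle - \sum_{i=1}^{n} \langle v, E_i \rangle c^i = 0$$

Theorem 1.8. Let $V$ be a Euclidean vector space, and let $\{E_1, \ldots, E_n\}$ be an orthonormal set in $V$. For any vector $v$, let
$$v_0 = \sum_{i=1}^{n} \langle v, E_i \rangle E_i$$
Then
(i) $\|v\|^2 = \|v - v_0\|^2 + \|v_0\|^2$;
(ii) for any $w$ in the linear span of $\{E_1, \ldots, E_n\}$,
$$\|v - v_0\|^2 \le \|v - w\|^2$$

Proof.
(i)
$$\|v\|^2 = \langle v, v \rangle = \langle (v - v_0) + v_0, (v - v_0) + v_0 \rangle = \|v - v_0\|^2 + \|v_0\|^2 + \langle v_0, v - v_0 \rangle + \langle v - v_0, v_0 \rangle$$
The last two terms are zero by the preceding proposition, since $v_0$ is in the linear span of $\{E_1, \ldots, E_n\}$.
(ii)
$$\|v - w\|^2 = \langle v - w, v - w \rangle = \langle v - v_0 + v_0 - w, \, v - v_0 + v_0 - w \rangle = \|v - v_0\|^2 + \|v_0 - w\|^2 + \langle v - v_0, v_0 - w \rangle + \langle v_0 - w, v - v_0 \rangle$$
Again, the last two terms are zero, for both $v_0$ and $w$, and thus also $v_0 - w$, are in the linear span of $\{E_1, \ldots, E_n\}$. Thus
$$\|v - w\|^2 = \|v - v_0\|^2 + \|v_0 - w\|^2 \ge \|v - v_0\|^2$$
so (ii) is proven.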
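Theorem 1.8 can be tried out on a small concrete case. The sketch below assumes $V = R^3$ with the standard inner product; the orthonormal pair `E`, the vector `v`, and the helper names are all chosen for this illustration only (the entries 3/5 and 4/5 keep every number rational, so the identities can be tested exactly). Part (ii) is only sampled on a finite grid of coefficients, which illustrates the inequality but does not prove it.

```python
from fractions import Fraction as Fr

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def sub(v, w):
    return [a - b for a, b in zip(v, w)]

def project(v, E):
    """v0 = sum over i of <v, E_i> E_i, for an orthonormal set E."""
    v0 = [Fr(0)] * len(v)
    for e in E:
        c = inner(v, e)
        v0 = [x + c * y for x, y in zip(v0, e)]
    return v0

# An orthonormal set in R^3 with rational entries (illustrative choice).
E = [[Fr(1), Fr(0), Fr(0)], [Fr(0), Fr(3, 5), Fr(4, 5)]]
v = [Fr(2), Fr(1), Fr(3)]

v0 = project(v, E)   # orthogonal projection of v into the span of E
r = sub(v, v0)       # the residual v - v0

# (i): |v|^2 = |v - v0|^2 + |v0|^2
part_i = inner(v, v) == inner(r, r) + inner(v0, v0)

# (ii): no w = s*E_1 + t*E_2 on the sample grid is closer to v than v0 is
def dist2_to(s, t):
    w = [s * a + t * b for a, b in zip(E[0], E[1])]
    d = sub(v, w)
    return inner(d, d)

grid = [Fr(k, 2) for k in range(-8, 9)]
part_ii = all(dist2_to(s, t) >= inner(r, r) for s in grid for t in grid)

print(v0, part_i, part_ii)
```

Here the residual `r` is also orthogonal to each `E[i]`, which is the content of Proposition 33.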
Gram-Schmidt Process

Notice that $v - v_0 = v'$ is orthogonal to the linear span $S$ of $\{E_1, \ldots, E_n\}$. $v_0$ is the vector in $S$ which is closest to $v$; it is called the orthogonal projection of $v$ into $S$. It seems, by Theorem 1.8, that one needs an orthonormal basis in order to find orthogonal projections; the following proposition gives a procedure for obtaining an orthonormal basis for finite-dimensional vector spaces, and thus with it, orthogonal projections.

Proposition 34. Let $F_1, \ldots, F_n$ be a basis for a Euclidean vector space $V$. We can find an orthonormal basis $E_1, \ldots, E_n$ so that the linear span of $E_1, \ldots, E_j$ is the same as the linear span of $F_1, \ldots, F_j$ for all $j$.

Proof. The proof is by induction on $n$. If $n = 1$, we need only take $E_1 = \|F_1\|^{-1} F_1$. Now in general, let $F_1, \ldots, F_n$ be a basis for a Euclidean vector space $V$. Then the linear span $W$ of $F_1, \ldots, F_{n-1}$ is a Euclidean vector space also, and we can apply the proposition to $W$ by the inductive hypothesis. Let $E_1, \ldots, E_{n-1}$ be an orthonormal basis with the required properties. Now, we must find a vector $E_n$ such that
$$\|E_n\| = 1$$
$$\langle E_n, E_i \rangle = 0 \text{ for all } i \ne n$$
$$F_n \text{ is in the linear span of } E_1, \ldots, E_n$$
If $E_n^0$ is a vector that fulfills the last two conditions, then we can take $E_n = \|E_n^0\|^{-1} E_n^0$. Thus we need only find a vector fulfilling the last two conditions. That is easy; take
$$E_n^0 = F_n - \sum_{j<n} \langle F_n, E_j \rangle E_j$$
Then, for $i < n$,
$$\langle E_n^0, E_i \rangle = \langle F_n, E_i \rangle - \sum_{j<n} \langle F_n, E_j \rangle \langle E_j, E_i \rangle = \langle F_n, E_i \rangle - \langle F_n, E_i \rangle \langle E_i, E_i \rangle = 0$$
Furthermore,
$$F_n = E_n^0 + \sum_{j<n} \langle F_n, E_j \rangle E_j$$
so the last two conditions are fulfilled and the proposition is proven.

The proof of this proposition provides a procedure for finding orthonormal
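bases in a Euclidean vector space, known as the Gram-Schmidt process. It goes like this: First, pick any basis $F_1, \ldots, F_n$ of $V$. Take $E_1 = \|F_1\|^{-1} F_1$. Then choose $E_2^0 = F_2 - \langle F_2, E_1 \rangle E_1$, and divide by the length to find $E_2$, and so forth. If $E_1, \ldots, E_j$ are found, take
$$E_{j+1}^0 = F_{j+1} - \langle F_{j+1}, E_1 \rangle E_1 - \langle F_{j+1}, E_2 \rangle E_2 - \cdots - \langle F_{j+1}, E_j \rangle E_j$$
and let $E_{j+1}$ be the vector of length one collinear with $E_{j+1}^0$.

Examples

47. Apply the Gram-Schmidt process to this basis of $R^3$:
$$F_1 = (1, 0, 1) \qquad F_2 = (3, -1, 2) \qquad F_3 = (0, 0, 1)$$
Take
$$E_1 = \frac{1}{\sqrt{2}} F_1 = \Big( \frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}} \Big)$$
$$E_2^0 = (3, -1, 2) - \Big( \frac{5}{2}, 0, \frac{5}{2} \Big) = \Big( \frac{1}{2}, -1, -\frac{1}{2} \Big)$$
so that, dividing by the length $(3/2)^{1/2}$,
$$E_2 = \Big( \frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}} \Big)$$
Then
$$E_3^0 = (0, 0, 1) - \frac{1}{\sqrt{2}} \Big( \frac{1}{\sqrt{2}}, 0, \frac{1}{\sqrt{2}} \Big) + \frac{1}{\sqrt{6}} \Big( \frac{1}{\sqrt{6}}, -\frac{2}{\sqrt{6}}, -\frac{1}{\sqrt{6}} \Big) = \Big( -\frac{1}{3}, -\frac{1}{3}, \frac{1}{3} \Big)$$
and finally
$$E_3 = \Big( -\frac{1}{\sqrt{3}}, -\frac{1}{\sqrt{3}}, \frac{1}{\sqrt{3}} \Big)$$

48. Find an orthonormal basis for the kernel of $\lambda: R^4 \to R$,
$$\lambda(x^1, x^2, x^3, x^4) = x^1 + x^2 + x^3 + 2x^4$$
First of all, let us pick a suitable basis for $K(\lambda)$; that is, $(1, 0, 0, -1/2)$, $(0, 1, 0, -1/2)$, $(0, 0, 1, -1/2)$. Applying the Gram-Schmidt process, we obtain
$$E_1 = \Big( \frac{2}{\sqrt{5}}, 0, 0, -\frac{1}{\sqrt{5}} \Big)$$
$$E_2^0 = \Big( 0, 1, 0, -\frac{1}{2} \Big) - \frac{1}{2\sqrt{5}} \Big( \frac{2}{\sqrt{5}}, 0, 0, -\frac{1}{\sqrt{5}} \Big) = \Big( -\frac{1}{5}, 1, 0, -\frac{2}{5} \Big)$$
$$E_2 = \Big( -\frac{1}{(30)^{1/2}}, \Big( \frac{5}{6} \Big)^{1/2}, 0, -\frac{2}{(30)^{1/2}} \Big)$$
$$E_3^0 = \Big( 0, 0, 1, -\frac{1}{2} \Big) - \frac{1}{2\sqrt{5}} E_1 - \frac{1}{(30)^{1/2}} E_2 = \Big( -\frac{1}{6}, -\frac{1}{6}, 1, -\frac{1}{3} \Big)$$
$$E_3 = \Big( \frac{6}{7} \Big)^{1/2} \Big( -\frac{1}{6}, -\frac{1}{6}, 1, -\frac{1}{3} \Big)$$

49. Find the orthogonal projection of $(3, 1, 2)$ into the kernel of $T: R^3 \to R$:
$$T(x, y, z) = x + 2y + z$$
Now the kernel of $T$ is spanned by $F_1 = (2, -1, 0)$, $F_2 = (0, -1, 2)$. Applying the Gram-Schmidt process, we obtain the orthonormal basis
$$E_1 = \Big( \frac{1}{5} \Big)^{1/2} (2, -1, 0) \qquad E_2 = \Big( \frac{5}{6} \Big)^{1/2} \Big( -\frac{1}{5}, -\frac{2}{5}, 1 \Big)$$
Thus the orthogonal projection of $(3, 1, 2)$ into this plane is
$$\Big( \frac{1}{5} \Big)^{1/2} \cdot 5 \cdot \Big( \frac{1}{5} \Big)^{1/2} (2, -1, 0) + \Big( \frac{5}{6} \Big)^{1/2} \cdot 1 \cdot \Big( \frac{5}{6} \Big)^{1/2} \Big( -\frac{1}{5}, -\frac{2}{5}, 1 \Big) = \Big( \frac{11}{6}, -\frac{4}{3}, \frac{5}{6} \Big)$$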
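The Gram-Schmidt process and the projection formula of Theorem 1.8 can be sketched in a few lines of code. This is a minimal illustration under stated assumptions, not a robust numerical routine: the function names `gram_schmidt` and `project` are invented here, the inner product is the standard one on $R^n$, floating-point arithmetic is used, and the input vectors are assumed to be independent (otherwise a division by zero occurs).

```python
import math

def inner(v, w):
    """Standard inner product on R^n."""
    return sum(a * b for a, b in zip(v, w))

def gram_schmidt(F):
    """Orthonormalize the independent vectors F_1, ..., F_n in order."""
    E = []
    for f in F:
        e0 = list(f)
        # E_{j+1}^0 = F_{j+1} - <F_{j+1}, E_1> E_1 - ... - <F_{j+1}, E_j> E_j
        for e in E:
            c = inner(f, e)
            e0 = [x - c * y for x, y in zip(e0, e)]
        length = math.sqrt(inner(e0, e0))
        E.append([x / length for x in e0])  # vector of length one collinear with e0
    return E

def project(v, E):
    """Orthogonal projection of v into the span of the orthonormal set E."""
    v0 = [0.0] * len(v)
    for e in E:
        c = inner(v, e)
        v0 = [x + c * y for x, y in zip(v0, e)]
    return v0

# The basis of Example 47.
E47 = gram_schmidt([(1, 0, 1), (3, -1, 2), (0, 0, 1)])

# Example 49: project (3, 1, 2) into the plane x + 2y + z = 0.
p49 = project((3, 1, 2), gram_schmidt([(2, -1, 0), (0, -1, 2)]))
print(p49)  # approximately (11/6, -4/3, 5/6)
```

A quick sanity check on any output of `gram_schmidt` is that `inner(E[i], E[j])` is close to 1 when `i == j` and close to 0 otherwise.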
50. Find the point on the line
$$L: \quad x + y - z = 0, \qquad 3y + z = 0$$
which is closest to $(7, 1, 0)$. $L$ is the linear span of the vector $(4, -1, 3)$. Thus the orthogonal projection of $(7, 1, 0)$ on this line (the closest point) is
$$\Big\langle (7, 1, 0), \frac{(4, -1, 3)}{(26)^{1/2}} \Big\rangle \frac{(4, -1, 3)}{(26)^{1/2}} = \frac{27}{26} (4, -1, 3)$$

• EXERCISES

61. Which of the following sets are open, closed, or neither?
(a) $\{x \in R : 2 < |x - 5| < 13\}$.
(b) $\{x \in R : 0 < x \le 4\}$.
(c) $\{x \in R : x > 32\}$.
(d) $\{x \in R^n : \langle x, x \rangle = 4\}$.
(e) $\{x \in R^3 : \langle x, (0, 2, 1) \rangle = 0\}$.
(f) $\{x \in R^3 : 2 < \|x - (3, 0, 3)\| < 14\}$.
(g) $\{x \in R^n : x^1 > 0, \ldots, x^n > 0\}$.
(h) The set of integers (considered as a subset of $R$).
(i) {xeR!·: 2aV<£}.
(j) $\{x \in R^n : \sum x^i a^i \ne 1\}$.
(k) $\{x \in R^n : \sum (x^i)^3 < \sum (x^i)^2\}$.
62. Find the point on the plane x + 3y + Iz = 4 closest to the point $(1, 0, 1)$.
63. Find the point on the line
$$x + 7y + z = 2, \qquad x - z = 0$$
closest to the point $(-7, 1, 0)$.
64. Find an orthonormal basis for the linear span of
(a) $v_1 = (0, 2, 2)$, $v_2 = (1, 0, 2)$, $v_3 = (1, 2, 4)$.
(b) $v_1 = (0, 1, 0, 1)$, $v_2 = (1, 0, 1, 0)$, $v_3 = (1, 1, 2, 3)$.
(c) $v_1 = (0, 3, 0, 0, 0)$, $v_2 = (0, 6, 0, 3, 0)$, $v_3 = (0, 0, 2, -1, 1)$.
(d) $v_1 = (1, 2, 3, 4)$, $v_2 = (4, 3, 2, 1)$, $v_3 = (2, 1, 4, 3)$.
65. Find orthonormal bases for the linear span and kernel of these transformations on $R^4$: (a) /8 6 1 0\ (b)

• PROBLEMS

67. Prove Proposition 32.
68. Show that an orthogonal set of vectors is independent.
69. Give an example of a sequence $\{U_n\}$ of open sets such that $\bigcap_{n=1}^{\infty} U_n$ is not open.
70. Give an example of a sequence $\{C_n\}$ of closed sets such that $\bigcup_{n=1}^{\infty} C_n$ is open.
71. Find an orthonormal basis for the linear span of $1, x, x^2, x^3$ in the vector space $C([0, 1])$ with the inner product $\langle f, g \rangle = \int fg$.

In the next four problems $V$ represents a vector space endowed with an inner product, denoted $\langle \, , \rangle$.

72. Let $v, w, x$ be three points in $V$ such that $v - x$ is orthogonal to $w - x$. Show that the Pythagorean theorem is valid:
$$\|v - w\|^2 = \|v - x\|^2 + \|x - w\|^2$$
73. Let $v, w$ be two vectors in $V$. Show that the vector in the linear span of $w$ which is closest to $v$ is
$$v_0 = \frac{\langle v, w \rangle}{\langle w, w \rangle} w \qquad (1.51)$$
(You can verify this by minimizing the function $f(t) = \|v - tw\|^2$ by calculus.)
74. Prove Schwarz's inequality:
$$|\langle v, w \rangle| \le \|v\| \cdot \|w\|$$
for any two vectors in $V$. (Hint: $\|v - v_0\|^2 \ge 0$ where $v_0$ is given by (1.51).)
75. Prove the triangle inequality:
$$\|v - x\| \le \|v - w\| + \|w - x\|$$
for any three vectors in $V$ (use Schwarz's inequality).
76. Let $V$ be a vector space with an inner product. Suppose that $W$ is a subspace of $V$. Let $\perp(W) = \{v : \langle v, w \rangle = 0 \text{ for all } w \in W\}$. This is called the orthogonal complement of $W$. Show that $\perp(W)$ is a linear subspace of $V$ and (if $V$ is finite dimensional) that $W$ and $\perp(W)$ together span $V$.
77. Let $T: R^n \to R^m$ be a linear transformation represented by the matrix $A$. Show that the rows of $A$ span $\perp(K(T))$.
78. Show that a linear transformation is one-to-one on the orthogonal complement of its kernel.

• FURTHER READING

R. E. Johnson, Linear Algebra, Prindle, Weber & Schmidt, Boston, 1968. This book covers the same material and includes a derivation of the Jordan canonical form.
K. Hoffman and R. Kunze, Linear Algebra, Prentice-Hall, Englewood Cliffs, N.J., 1961. This book is more thorough and abstract, and has a full discussion of canonical forms.
L. Fox, An Introduction to Numerical Linear Algebra, Oxford University Press, 1965. This is a detailed treatment of computational problems in matrix theory.
H. K. Nickerson, D. C. Spencer, and N. Steenrod, Advanced Calculus, Van Nostrand, Princeton, N.J., 1957. This set of notes has a full treatment of all the abstract linear algebra required in modern analysis.

• MISCELLANEOUS PROBLEMS

79. Show that if $A'$ is obtained from $A$ by a sequence of row operations then these equations have the same solutions: $Ax = 0$, $A'x = 0$.
80. Show that every nonempty set of positive integers has a least element.
81. Show that a set with $n$ elements has precisely $2^n$ subsets.
82. Show that the $n$-fold Cartesian product of a set with $k$ elements has $k^n$ elements.
83. Can you interpret the case $k = 2$ in Problem 82 so as to deduce the assertion of Exercise 3?
84. Let $A = (a^i_j)$ be an $n \times n$ matrix such that $a^i_j = 0$ if $i - j > r$ for some $r > 0$. Show that $A^{n-r} = 0$. Show that the same conclusion follows from the assumption $j - i > r$ for some $r > 0$. Will the hypothesis $|i - j| > r$ do as well?
85. Let $T: R^n \to R^m$ be a linear transformation of rank $r$. Show that there are linear transformations $S_1: R^m \to R^{m-r}$, $S_2: R^{n-r} \to R^n$ such that
(a) $S_1$ has rank $m - r$ and $b \in R(T)$ if and only if $S_1 b = 0$.
(b) $S_2$ has rank $n - r$ and $x \in K(T)$ if and only if $x \in R(S_2)$.
86. Suppose that $T: R^n \to R^n$ and Τ = I. Show that $T$ is invertible.
87. Let $S$ be a subset of $R^n$. Show that the linear span $[S]$ of $S$ is the intersection of all linear subspaces of $R^n$ containing $S$.
88. Let $S$, $T$ be subsets of $R^n$. Show that $\dim([S \cup T]) \le \dim([S]) + \dim([T])$, and equality holds if and only if $[S] \cap [T] = \{0\}$.
89. Let $V$ and $W$ be subspaces of $R^n$. Let $X$ be the set of all sums $v + w$ with $v \in V$, $w \in W$. Show that $X$ is a linear subspace of $R^n$. The relationship between $X$ and $V$ and $W$ is indicated by writing $X = V + W$. If in addition $V \cap W = \{0\}$, then every $x \in X$ can be written in the form $v + w$ in only one way. In this case, $X = V + W$ with $V \cap W = \{0\}$, we say that $X$ is the direct sum of $V$ and $W$ and write $X = V \oplus W$.
90. Suppose $X = V \oplus W$. Then $\dim X = \dim V + \dim W$.
91. Show that if $\lambda: R^n \to R$ is a linear function, there exists a $w \in R^n$ such that $\lambda(v) = \langle v, w \rangle$ for all $v \in R^n$.
92. If $S$ is a subset of $R^n$ define $\perp(S) = \{v \in R^n : \langle v, s \rangle = 0 \text{ for all } s \in S\}$.
(a) Show that $\perp(S)$ is a subspace of $R^n$ and that $S \cap \perp(S) = \{0\}$.
(b) Show that $[S] = \perp(\perp(S))$.
(c) If $V$ is a linear subspace of $R^n$, $R^n = V \oplus \perp(V)$.
93. Suppose that $T: V \to W$ is a linear transformation and $V$ is not finite dimensional. Show that either the rank or the nullity of $T$ must be infinite.
94. Let $V$ be an abstract vector space. A bilinear function $p$ on $V$ is a function of two variables in $V$ with these properties:
$$p(cv, w) = c\,p(v, w) \qquad p(v, cw) = c\,p(v, w)$$
$$p(v_1 + v_2, w) = p(v_1, w) + p(v_2, w) \qquad p(v, w_1 + w_2) = p(v, w_1) + p(v, w_2)$$
Show that the sum of two bilinear functions is bilinear. In fact, the space $B_V$ of all bilinear functions is an abstract vector space. If $V$ is finite dimensional, what is the dimension of $B_V$? (Hint: See the next problem.)
95. Let $p$ be a bilinear function on $R^n$. Let $a_{i,j} = p(E_i, E_j)$. Show that $p$ is completely determined by the matrix $(a_{i,j})$.
96. Let $V$ be an abstract vector space.
(a) Show that the space $V^*$ of linear functions on $V$ is a vector space under addition and scalar multiplication.
(b) If $\dim V = d$, show that $\dim V^* = d$ also.
(c) Show that to every $\lambda \in (R^n)^*$ there is a $w \in R^n$ such that $\lambda(v) = \langle v, w \rangle$ for all $v \in R^n$. (Recall Problem 91.)
97. Suppose that $V$ is a linear subspace of $W$. We define the annihilator of $V$, denoted $\mathrm{ann}(V)$, to be the set of $\lambda \in W^*$ such that $\lambda(v) = 0$ if $v \in V$.
Show that $\mathrm{ann}(V)$ is a linear subspace of $W^*$. If $\dim W = n$, $\dim V = d$, show that $\mathrm{ann}(V)$ has dimension $n - d$.
98. Let $V$ be a linear subspace of $R^n$, and suppose that $T: V \to R^m$ is a linear transformation. Show that there is a linear transformation $T': R^n \to R^m$ defined on all of $R^n$ which extends $T$.
99. The closure of a set $S$, denoted $\bar{S}$, is the set of all points $x$ such that every neighborhood of $x$ contains points of $S$. Find the closure of all the sets in Problem 61.
100. Show that the closure of a set $S$ is the smallest closed set containing $S$.
101. The boundary of a set $S$, denoted $\partial S$, is the set of all points $x$ such that every neighborhood of $x$ contains points of both $S$ and the complement of $S$. Find the boundary of all the sets in Problem 61.
102. Show that the boundary of a set is a closed set.
103. Show that the boundary of a set $S$ is also the boundary of its complement $R^n - S$. In fact, show that $\partial S = \bar{S} \cap \overline{(R^n - S)}$.
104. Let $T: V \to W$ be a linear transformation of vector spaces with an inner product. The adjoint of $T$ is the transformation $T^*: W \to V$ defined in this way:
$$\langle T^*(w), v \rangle = \langle w, Tv \rangle \text{ for all } v \in V$$
(a) Show that $T^*$ is a well-defined linear transformation.
(b) If $T: R^n \to R^m$ is represented by the matrix $A = (a^i_j)$, then $T^*: R^m \to R^n$ is represented by the matrix $A^* = (a^{*i}_j)$, where $a^{*i}_j = a^j_i$. (This matrix is called the adjoint or transpose of $A$.)
(c) Show that $R(T^*)$ is complementary to $K(T)$.
(d) In fact, $\rho(T^*) = \nu(T)$, $\nu(T^*) = \rho(T)$.
105. A bilinear form $p$ on a vector space $V$ is called symmetric if it obeys the law: $p(v, w) = p(w, v)$ for all $v$ and $w$. An inner product is a symmetric bilinear form, and many of the formal manipulations with inner products remain valid for symmetric bilinear forms. For example, the Gram-Schmidt process (Proposition 34) gives rise to this fact (see if you can rework the proof of Proposition 34 to give it):
Proposition. Let $p$ be a symmetric bilinear form on $V$. Suppose $F_1, \ldots, F_n$ is a basis for $V$. We can find another basis $E_1, \ldots, E_n$ of $V$ such that the linear span of $E_1, \ldots, E_j$ is the same as that of $F_1, \ldots, F_j$ for all $j$, and $p(E_i, E_j) = 0$ if $i \ne j$.
We shall call such a basis $E_1, \ldots, E_n$ $p$-orthogonal.
106. Let $p$ be a symmetric bilinear form on a vector space $V$, and suppose $E_1, \ldots, E_n$ is a $p$-orthogonal basis.
(a) Show that $p(v, w)$ can be computed in terms of this basis as follows: if $v = \sum v^i E_i$, $w = \sum w^i E_i$, then
$$p(v, w) = \sum_{i=1}^{n} v^i w^i p(E_i, E_i) \qquad (1.52)$$
(b) Show that $p$ is an inner product on the linear span of the $E_i$ such that $p(E_i, E_i) > 0$.
(c) Similarly, $-p$ is an inner product on the linear span of the $E_i$ such that $p(E_i, E_i) < 0$.
107. Prove this fact: Let $p$ be a symmetric bilinear form on a finite-dimensional vector space $V$. There is a basis $E_1, \ldots, E_n$ and integers $r$, $s$ such that $r + s \le n$ and such that if $v = \sum v^i E_i$, then
$$p(v, v) = \sum_{i \le r} (v^i)^2 - \sum_{r < i \le r+s} (v^i)^2 \qquad (1.53)$$
(Hint: Modify the basis $\{E_i\}$ in Problem 106 so that (1.52) becomes (1.53).)
108. The integers $r$, $s$ of Problem 107 are determined by $p$ alone, and are independent of the basis. Here is a sketch of how a proof would go. Suppose $F_1, \ldots, F_n$ is another $p$-orthogonal basis and $\rho$ is the number of $F_i$'s such that $p(F_i, F_i) > 0$. We have to show $\rho = r$. Let $W$ be the linear span of these $F$'s. Expressing points of $W$ in terms of the basis $E_1, \ldots, E_n$ we may consider the transformation $T: W \to R^r$ given by
$$T\Big( \sum v^i E_i \Big) = (v^1, \ldots, v^r)$$
$T$ is one-to-one on $W$, for if $w \in W$, and $w \ne 0$,
$$0 < p(w, w) = \sum_{i \le r} (v^i)^2 - \sum_{r < i \le r+s} (v^i)^2$$
so we must have
$$\sum_{i \le r} (v^i)^2 > 0$$
on $W$. Since $T$ is one-to-one, it follows that $r \ge \rho$. The inequality $\rho \ge r$ follows from the same argument with the roles of $E_1, \ldots, E_n$ and $F_1, \ldots, F_n$ interchanged.
109. Let $A = (a_{i,j})$ be a symmetric $n \times n$ matrix, that is, $a_{i,j} = a_{j,i}$. Then $A$ determines a symmetric bilinear form on $R^n$ as follows:
$$p_A(v, w) = \sum a_{i,j} v^i w^j$$
If $P$ is the matrix corresponding to the change of basis from the standard basis to that described in Problem 105, then $P^* A P$ is diagonal. Verify that assertion.
110. Find the $p$-orthogonal basis and the representation (1.53) of Problem 107 for the symmetric bilinear forms given by these matrices: (a) /4 3 0 l\ (b)
111. Describe the sets $p(v, v) > 0$, $= 0$, $< 0$ in $R^4$ where $p$ is given by
$$p(v, v) = (v^1)^2 + (v^2)^2 + (v^3)^2 - (v^4)^2$$
112. A transformation $T: V \to V$ is called self-adjoint if it is its own adjoint ($\langle Tv, w \rangle = \langle v, Tw \rangle$ for all $v, w \in V$). Show that if $T$ is a self-adjoint transformation on $R^n$, then
$$R^n = K(T) \oplus R(T)$$
113. Suppose that $v$, $w$ are eigenvectors of a self-adjoint transformation $T$ on $V$ with different eigenvalues. Show that $\langle v, w \rangle = 0$.
114. If $T$ is a self-adjoint transformation on $R^n$, and $v_0 \in R^n$ is such that $\sum (v_0^i)^2 = 1$ and
$$\langle Tv_0, v_0 \rangle = \max \Big\{ \langle Tv, v \rangle : \sum (v^i)^2 = 1 \Big\}$$
then $v_0$ is an eigenvector for $T$.
115. Use Problems 113 and 114 to prove the Spectral theorem for self-adjoint operators on $R^n$:
Theorem. There is an orthonormal basis $E_1, \ldots, E_n$ of eigenvectors of $T$. $T$ can be computed in terms of this basis by
$$T\Big( \sum x^i E_i \Big) = \sum x^i c_i E_i$$
116. Find a basis of eigenvectors in $R^4$ for the self-adjoint transformations given by the matrices (a), (b) of Problem 110.
117. Orthonormalize these bases of $R^4$:
(a) $(1, 0, 0, 0)$, $(0, 1, 1, 1)$, $(0, 0, 2, 2)$, $(3, 0, 0, 3)$.
(b) $(-1, -1, -1, -1)$, $(0, -1, -1, -1)$, $(0, 0, -1, -1)$, $(0, 0, 0, -1)$.
(c) $(0, 1, 0, 1)$, $(1, 0, 1, 0)$, $(1, 0, 0, 1)$, $(0, 1, 1, 0)$.
118. Find the orthogonal projection of $R^5$ onto these spaces:
(a) The span of $(0, 1, 0, 0, 1)$.
(b) The span of $(1, 1, 0, 0, 0)$, $(1, 0, 1, 0, 0)$.
(c) The span of $(1, 0, 0, 0, 1)$, $(0, 1, 0, 0, 1)$, $(0, 0, 1, 0, 1)$.
(d) The span of the vectors given in (c) and the vector $(0, 0, 0, 1, 1)$.
Chapter 2 NOTIONS OF CALCULUS

One of the main methods of the modern approach to mathematics is the recognition of familiar concepts at work in unfamiliar settings. Thus, the ideas of linear algebra, originally introduced for the purpose of solving systems of equations, will be seen also to have relevance in the study of functions. In time it will be seen that many simple concepts of geometry permeate a lot of mathematics. Thus it is important to us, where possible, to try to isolate our concepts and set them in an initially very abstract situation in order to maximize their applicability. Of course, we can't just do that; we must have had some familiarity with the behavior of those concepts. For that we need examples. As we study these examples we can begin to recognize more and more clearly the essence of our concept. This gives rise to a (perhaps) tentative abstract proposal which requires further study of new examples, born out of our generalizations. This procedure, iterated over and over again, may take many generations and the best work of many mathematicians before a clear, precise, and satisfactory definition is molded. So it has been with the limit notion, which was implicit in the early 17th century, which was in some sense formulated by Newton and Leibniz in the late 17th century, but which did not take a final and comprehensible form until the late 19th century. We shall not try to encompass over two centuries of struggle in a few pages; we shall have to take some short cuts and we shall try (for obvious pedagogical reasons) to avoid the great confusion that is suffered during such development.

The basic technique of calculus is approximation. Let us give an illustration of how it goes. The problems of calculus are such that we are required to produce a function that has given properties. There are two aspects to this problem. There is the theoretical aspect: to be assured that there exists a solution to our problem, and the practical one: to describe a procedure for effectively computing that solution. These two aspects are inseparable. In fact, we make a sequence of attempts to solve the problem. If these attempts are good, they will provide us with a sequence of functions successively providing better solutions to the problem. Then further study of the general form of these tentative solutions may provide a clue to the accurate solution.

Suppose we have a square of side length one unit in the plane (see Figure 2.1), and consider this problem. Find a function $f$ defined on the box which has prescribed values at the vertices and which satisfies this condition: For every point in the box and any rectangle with center at that point, the value of $f$ at that point is the average of the values of $f$ at the vertices of the rectangle. Now we can write these conditions more precisely:
$$f(0, 0) = a \qquad f(1, 0) = b \qquad f(1, 1) = c \qquad f(0, 1) = d$$
where $a, b, c, d$ are given numbers. Further, for any $(x, y)$ and $(s, t)$ we must have
$$f(x, y) = \tfrac{1}{4} [f(x - s, y - t) + f(x + s, y - t) + f(x + s, y + t) + f(x - s, y + t)] \qquad (2.1)$$
Now, how do we find such a function? We compute, based on the given information, its value at certain points and try to see if we obtain a pattern.

Figure 2.1 (the unit square, with vertices (0, 1), (1, 1), (1, 0) marked)
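Condition (2.1) is easy to test mechanically for any candidate function. As a sketch, the code below tests the bilinear candidate that the discussion arrives at in formula (2.4). The assumptions here are the corner labeling used by (2.4), namely $f(0,0) = a$, $f(1,0) = b$, $f(1,1) = c$, $f(0,1) = d$, the sample corner values, and the helper names `candidate` and `satisfies_mean_value`, all of which are chosen for illustration. Exact fractions are used, and only finitely many rectangles are sampled, so this is a check rather than a proof.

```python
from fractions import Fraction as Fr

def candidate(a, b, c, d):
    """The bilinear function of formula (2.4) with the given corner values."""
    def f(x, y):
        return (1 - x) * (1 - y) * a + x * (1 - y) * b + x * y * c + (1 - x) * y * d
    return f

def satisfies_mean_value(f, x, y, s, t):
    """Condition (2.1) for the rectangle centered at (x, y) with half-sides s, t."""
    average = (f(x - s, y - t) + f(x + s, y - t)
               + f(x + s, y + t) + f(x - s, y + t)) / 4
    return f(x, y) == average

# Sample corner values (arbitrary illustrative numbers).
f = candidate(Fr(1), Fr(5), Fr(-2), Fr(7))

# Centers p/8, q/8 inside the box; half-sides small enough to stay in the box.
checks = [
    satisfies_mean_value(f, Fr(p, 8), Fr(q, 8), Fr(s, 16), Fr(t, 16))
    for p in range(1, 8) for q in range(1, 8)
    for s in range(0, 3) for t in range(0, 3)
]
print(all(checks))
```

The check succeeds because the only nonlinear term, $xy$, averages to $xy$ over the four corners of a rectangle centered at $(x, y)$.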
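First of all, the value at the center of the box is easy to find:
$$f(\tfrac{1}{2}, \tfrac{1}{2}) = \tfrac{1}{4}(a + b + c + d)$$
By (2.1) we can compute the value at the center points of the sides ($s = \tfrac{1}{2}$, $t = 0$):
$$f(\tfrac{1}{2}, 0) = \tfrac{1}{4} [f(0, 0) + f(1, 0) + f(1, 0) + f(0, 0)] = \frac{a + b}{2}$$
Similarly, we obtain the values at all the other center points shown in Figure 2.2. Let us move to more complicated points, for example, the centers of the four squares in Figure 2.2. Since we know the values at all the relevant vertices, we may compute, by (2.1),
$$f(\tfrac{1}{4}, \tfrac{1}{4}) = \tfrac{9}{16} a + \tfrac{3}{16} b + \tfrac{1}{16} c + \tfrac{3}{16} d$$
$$f(\tfrac{3}{4}, \tfrac{1}{4}) = \tfrac{3}{16} a + \tfrac{9}{16} b + \tfrac{3}{16} c + \tfrac{1}{16} d \qquad (2.2)$$

Figure 2.2 (the unit square divided into four squares, with the values $(a+b)/2$, $(b+c)/2$, $(c+d)/2$, $(a+d)/2$ at the midpoints of the sides and $(a+b+c+d)/4$ at the center)

Now we can see that we can break the given square into 16 squares and compute the values at the centers (points of the form $p/2^3$, $q/2^3$, and so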
forth). We can successively compute the necessary values of $f$ at all points of the form $p/2^n$, $q/2^n$. Since any point in the rectangle has points of this form arbitrarily near it, we surmise that by this tedious procedure, we will be able to approximate the value of $f$ at any point. It is fair to guess then that a solution to our problem exists and that we have described a technique for computing its values.

If we return to Equations (2.2) (or their successors at the next stage) we may be able to really find a formula for the solution. Now it turns out that Equations (2.2) may be rewritten as
$$f\Big( \frac{p}{2^n}, \frac{q}{2^n} \Big) = \frac{(2^n - p)(2^n - q)}{2^{2n}} a + \frac{p(2^n - q)}{2^{2n}} b + \frac{pq}{2^{2n}} c + \frac{q(2^n - p)}{2^{2n}} d \qquad (2.3)$$
(the case $n = 2$; $p = 0, 1, 2, 3$; $q = 0, 1, 2, 3$). We can show by successively computing the values at centers of squares that (2.3) is valid for all $n$. Thus rewriting (2.3), we can assert that if $(x, y)$ is of the form $(p/2^n, q/2^n)$ with $p$ and $q$ integers, then
$$f(x, y) = (1 - x)(1 - y) a + x(1 - y) b + xy\,c + (1 - x) y\,d \qquad (2.4)$$
Assuming that $f$ is a well-behaved function this must then hold for all points $(x, y)$. Finally, we can show by substituting into the required conditions that (2.4) gives the solution.

Our purpose in the present chapter is to discuss the theoretical concepts which remove the fuzziness in the above discussion. We shall expose the ideas of limit and continuity in the setting of functions of many variables. We shall also present a review of the information from calculus which is necessary to the study of this text.

2.1 Convergence of Sequences

Before proceeding directly to the limit notion, a few words on the notion of a sequence are in order. Let $X$ be any set. A sequence of points in $X$ is an ordered collection $\{x_1, x_2, \ldots, x_n, \ldots\}$ of points in $X$, one for each positive integer. Another way of saying that is this: a sequence of points in $X$ is given by a function $f: P \to X$, where we denote $f(n)$ by $x_n$.
As a shorthand device we will often denote the sequence $\{x_1, x_2, \ldots, x_n, \ldots\}$ merely by its general term $\{x_n\}$.
Examples

1. $\{1, 2, 3, \ldots, n, \ldots\}$; $f(n) = n$; shorthand $\{n\}$.
2. $\{-1, \tfrac{1}{2}, -\tfrac{1}{3}, \ldots, (-1)^n/n, \ldots\}$; $f(n) = (-1)^n/n$; shorthand $\{(-1)^n/n\}$.
3. $\{10, 10^{1/2}, \ldots, 10^{1/n}, \ldots\}$; $f(n) = 10^{1/n}$; shorthand $\{10^{1/n}\}$.

A subsequence of a given sequence $\{x_n\}$ is a sequence $\{y_n\}$ extracted from the ordered collection $\{x_1, \ldots, x_n, \ldots\}$. Thus, the collections

4. $\{$odd-numbered $x_n$'s$\} = \{x_{2n-1}\}$,
5. $\{$every fifth term in $\{x_n\}\} = \{x_{5n}\}$,
6. $\{x_{p_n}\}$, where $p_n$ is the $n$th prime,
7. $\{x_{g(n)}\}$, where $g$ is a strictly increasing function on the positive integers,

are all subsequences of $\{x_n\}$, whereas

8. $\{x_1, x_1, \ldots, x_1, \ldots\}$

is not a subsequence.

The above description of subsequence is a bit vague. The phrase "extracted from" is picturesque but not too meaningful. Is the sequence
$$\{x_5, x_4, x_3, x_2, x_1, x_{10}, x_9, \ldots, x_6, \ldots, x_{5n}, x_{5n-1}, x_{5n-2}, x_{5n-3}, x_{5n-4}, \ldots\}$$
a subsequence of $\{x_n\}$? It isn't clear from the preceding paragraph. However, we should draw the line and exclude such new sequences. The essence of a subsequence will be that it consists of some of the $x_n$'s, infinitely many of them, and collected in the same order.

Now, to be really exacting, our notion of sequence itself is imprecise; we seem to have failed to say what it is. "An ordered collection" is not very satisfactory. We have already elaborated on that: "a sequence ... is given by a function $f: P \to X$ ...." Yet, it is "given by" a function, but what is it? It turns out that this line of metaphysical questioning bogs down, and is in fact irrelevant. We have already found something which completely describes the sequence (the function $f: P \to X$), so why not define a sequence just as such a function? Indeed, when we do so, it becomes very easy to also define a subsequence.

Definition 1. Let $X$ be a set. A sequence in $X$ is a function $f: P \to X$. A subsequence of this sequence is another sequence $h: P \to X$, where $h = f \circ g$ and $g$ is a strictly increasing function from $P$ to $P$.
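Definition 1 translates almost word for word into code: a sequence is a function on the positive integers, and a subsequence is a composition with a strictly increasing index function. The sketch below takes $f(n) = (-1)^n/n$ as an illustrative sequence; the names `subsequence`, `looks_strictly_increasing`, `odd`, and `constant` are invented for this example, and the monotonicity test only inspects finitely many indices, so it is a spot check and not a proof.

```python
from fractions import Fraction

def f(n):
    """An illustrative sequence: f(n) = (-1)^n / n for n = 1, 2, 3, ..."""
    return Fraction((-1) ** n, n)

def subsequence(f, g):
    """Return h = f o g, as in Definition 1."""
    return lambda n: f(g(n))

def looks_strictly_increasing(g, upto=1000):
    """Finite spot check that g(1) < g(2) < ... < g(upto)."""
    return all(g(n) < g(n + 1) for n in range(1, upto))

odd = lambda n: 2 * n - 1   # picks out the odd-numbered terms, as in Example 4
constant = lambda n: 1      # repeats x_1 forever, as in Example 8

h = subsequence(f, odd)
first_terms = [h(n) for n in range(1, 5)]
print(first_terms)
print(looks_strictly_increasing(odd), looks_strictly_increasing(constant))
```

The index function `constant` fails the test, which is exactly why the collection in Example 8 is excluded by the definition.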
Thus, if $\{x_1, \ldots, x_n, \ldots\}$ is a sequence, this is in fact just another way of writing the function $f$, $f(n) = x_n$. If $f \circ g$ is a subsequence, we can enumerate it as $\{x_{g(1)}, x_{g(2)}, \ldots, x_{g(n)}, \ldots\}$.

The above definition is an illustration of a standard mathematical procedure of defining things. A concept is, mathematically, an object with such and such properties. Once we have stated the properties which we feel describe the concept, there is no need to further inquire what the object is; we simply define it by those properties.

We now introduce the notion of convergence of a sequence of numbers (which we may take as complex numbers).

Definition 2. Let $\{z_n\}$ be a sequence of complex numbers. We say that the sequence converges if there is a $z \in C$ such that to every positive number $\varepsilon > 0$, there corresponds an integer $N$ such that $|z_n - z| < \varepsilon$ for $n \ge N$. In this case we say $\{z_n\}$ converges to $z$, written $\lim z_n = z$ or $\lim_{n \to \infty} z_n = z$ or $z_n \to z$.

Said another way, the sequence $f: P \to C$ converges to $z \in C$ if, given any disk centered at $z$, the range of $f$ on all but finitely many integers lies in that disk (see Figure 2.3).

Figure 2.3
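The roles of $\varepsilon$ and $N$ in Definition 2 can be made concrete with a small experiment. The sketch below uses an illustrative complex sequence, $z_n = 1 + i^n/n$, whose distance from 1 is $1/n$; the sequence and the names `z` and `within_eps_from` are assumptions made for this example. A program can only inspect finitely many terms (here a window of `horizon` indices), so a `True` result is evidence that a proposed $N$ works, not a proof of convergence.

```python
def z(n):
    """Illustrative sequence z_n = 1 + i^n / n, which converges to 1."""
    powers_of_i = [1, 1j, -1, -1j]  # i^n depends only on n mod 4
    return 1 + powers_of_i[n % 4] / n

def within_eps_from(seq, limit, eps, N, horizon=10_000):
    """Check |seq(n) - limit| < eps for n = N, N + 1, ..., N + horizon - 1."""
    return all(abs(seq(n) - limit) < eps for n in range(N, N + horizon))

# For eps = 0.01 the choice N = 101 works, since |z_n - 1| = 1/n.
print(within_eps_from(z, 1, 0.01, 101))
# N = 50 is too small: |z_50 - 1| = 1/50 is not less than 0.01.
print(within_eps_from(z, 1, 0.01, 50))
```

Shrinking `eps` forces a larger `N`, which is the dependence of $N$ on $\varepsilon$ that the definition requires.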
The following proposition asserts that a sequence cannot converge to more than one point, and gives necessary conditions for convergence, without reference to the limit point.

Proposition 1. Suppose $\lim z_n = z$.
(i) If also $\lim z_n = w$, then $w = z$.
(ii) The sequence is bounded; that is, there is an $M > 0$ such that $|z_n| \le M$ for all $n$.
(iii) (Cauchy criterion) For every $\varepsilon > 0$, there is an $N > 0$ such that $|z_n - z_m| < \varepsilon$ for all $n, m \ge N$.

Proof.
(i) By the hypotheses, given $\varepsilon > 0$, there are $N_1, N_2$ such that $|z_n - z| < \varepsilon$ for $n \ge N_1$ and $|z_n - w| < \varepsilon$ for $n \ge N_2$. Thus,

$$|z - w| \le |z - z_{N_1+N_2}| + |z_{N_1+N_2} - w| < 2\varepsilon.$$

Since the inequality $|z - w| < 2\varepsilon$ holds for all $\varepsilon > 0$, we must have $|z - w| \le 0$, or $z = w$.

(ii) Taking $\varepsilon = 1$, there is an integer $N$ such that $|z_n - z| < 1$ for $n \ge N$. Let $M = \max\{|z|, |z_1|, |z_2|, \ldots, |z_N|\} + 1$. Then if $n \le N$, certainly $|z_n| \le M$. If $n > N$, $|z_n| \le |z_n - z| + |z| \le 1 + |z| \le M$.

(iii) Let $\varepsilon > 0$ be given. There is an integer $N$ such that $|z_n - z| < \varepsilon/2$ for $n \ge N$. Thus, if $n, m \ge N$, we have

$$|z_n - z_m| \le |z_n - z| + |z_m - z| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon.$$

Condition (iii) is called a criterion for it implies convergence, as we shall see below. Notice that (ii) does not imply convergence. The sequence $\{(-1)^n\}$ is clearly bounded, but does not converge (it doesn't even satisfy the Cauchy criterion: $|(-1)^n - (-1)^{n+1}| = 2$ for all $n$).

Examples

9. $\lim (1/n) = 0$. Let $\varepsilon > 0$ be given, and choose the integer $N$ so that $N > \varepsilon^{-1}$. Then, for $n \ge N$,

$$\left|\frac{1}{n}\right| = n^{-1} \le N^{-1} < \varepsilon.$$
10. $\lim (i^n/n) = 0$. The proof is the same (see also Problem 2).

11. $\lim \dfrac{2}{1 + (1/n)} = 2$. Let $\varepsilon > 0$ be given. Now,

$$\left|\frac{2}{1 + (1/n)} - 2\right| = 2\left|\frac{1}{1 + (1/n)} - 1\right| = 2\left|\frac{n - (n+1)}{n+1}\right| = \frac{2}{n+1}.$$

Thus, we need only take $N > 2\varepsilon^{-1}$ to verify the condition for convergence.

12. $\lim \dfrac{n}{n+1} = 1$. For

$$\left|\frac{n}{n+1} - 1\right| = \frac{1}{n+1} < \varepsilon \quad\text{if } n > \varepsilon^{-1}.$$

13. $\lim h^n = 0$ for $0 < h < 1$. Since $h < 1$, there is an integer $K > 1$ such that $h < K/(K+1)$. Since the sequence $n/(n+1)$ is increasing, we have $h < n/(n+1)$ for all $n \ge K$. Now, we shall show by mathematical induction that $h^n < K/n$ for all $n$. For $n \le K$ this is clear, since $h^n < 1 \le K/n$. For $n \ge K$, using the $n$th inequality we obtain the $(n+1)$th:

$$h^{n+1} = h \cdot h^n < \frac{n}{n+1} \cdot \frac{K}{n} = \frac{K}{n+1}.$$

Thus, if $\varepsilon > 0$ is given, let $N \ge K\varepsilon^{-1}$. For $n > N$, $|h^n - 0| = h^n < K/n < \varepsilon$.

The study of the convergence of complex sequences is easily reduced to that of real sequences by the following fact. A complex sequence converges if and only if the real and imaginary parts both converge.

Proposition 2. Let $\{z_n\}$ be a sequence of complex numbers, $z_n = x_n + i y_n$. Then $\lim z_n = z = x + iy$ if and only if $\lim x_n = x$ and $\lim y_n = y$.
Proof. Suppose $\lim z_n = z$. Then, given $\varepsilon > 0$ there is an $N$ such that for $n \ge N$, $|z_n - z| < \varepsilon$. Since $|x_n - x| \le |z_n - z|$ and $|y_n - y| \le |z_n - z|$, we also have $|x_n - x| < \varepsilon$ and $|y_n - y| < \varepsilon$ for $n \ge N$, so $\lim x_n = x$ and $\lim y_n = y$.

Conversely, given $\varepsilon > 0$, there are $N_1, N_2$ such that $n \ge N_1$ implies $|x_n - x| < \varepsilon/\sqrt{2}$, and $n \ge N_2$ implies $|y_n - y| < \varepsilon/\sqrt{2}$. Then for $n \ge \max(N_1, N_2)$,

$$|z_n - z| = \left(|x_n - x|^2 + |y_n - y|^2\right)^{1/2} < \left(\frac{\varepsilon^2}{2} + \frac{\varepsilon^2}{2}\right)^{1/2} = \varepsilon.$$

We have so far been considering questions of this form: Given the sequence $\{z_n\}$ and the number $z$, is $\lim z_n = z$? A deeper problem is this: Given the sequence $\{z_n\}$, find, if possible, a number $z$ such that $\lim z_n = z$. A solution of such problems requires a more profound understanding of the real number system than we have so far needed. A question of existence is now involved. To resolve such questions we have to have explicit knowledge that there are many real numbers, whereas until now we have made use only of the existence of the numbers 0 and 1. The explicit knowledge desired here is that provided by the axiom of the least upper bound, which roughly states that there are no gaps in the line of real numbers.

A set $S$ of real numbers is said to be bounded from above if there is a number $M$ such that $x \le M$ for every $x \in S$. If $S$ is a set which is bounded above, it is conceivably useful to know the smallest number $M$ which will serve as an upper bound. We shall refer to such a least upper bound of the set $S$ by $\sup S$ (and $\inf S$ will denote the greatest lower bound if it exists). The axiom of the least upper bound asserts that for a set $S$ which is bounded above, $\sup S$ exists. We shall state this same axiom in terms of sequences because in that form it is more appropriate to our present context.

Theorem 2.1. Let $\{x_n\}$ be a decreasing sequence of real numbers; that is, $x_n \ge x_{n+1}$ for all $n$. If the set $\{x_n\}$ is bounded, the sequence converges.

We have called this a theorem since it can be deduced from the axiom of the least upper bound (see Problem 1), which we can take as a defining property of the real number system.
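As a concrete illustration of Theorem 2.1, here is a small sketch using a sequence that is not from the text: $x_1 = 2$, $x_{n+1} = \tfrac12(x_n + 2/x_n)$. The function name `heron_sequence` is hypothetical. Exact rational arithmetic shows that the first few terms are decreasing and satisfy $x_n^2 > 2$, so they are bounded below; the theorem then guarantees a limit, and that limit (whose square is 2) is not rational, which is the kind of "gap" the least upper bound axiom fills.

```python
from fractions import Fraction

def heron_sequence(count: int) -> list[Fraction]:
    """First `count` terms of x_1 = 2, x_{n+1} = (x_n + 2/x_n)/2, computed exactly."""
    xs = [Fraction(2)]
    while len(xs) < count:
        x = xs[-1]
        xs.append((x + 2 / x) / 2)
    return xs

xs = heron_sequence(6)
# Decreasing: each term is at least as large as the next one.
is_decreasing = all(a >= b for a, b in zip(xs, xs[1:]))
# Bounded below: x_n^2 > 2 and x_n > 0, hence x_n > 1 for every computed term.
is_bounded_below = all(x * x > 2 for x in xs)

if __name__ == "__main__":
    for x in xs:
        print(x, float(x))
    print(is_decreasing, is_bounded_below)
```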
A consequence of this fact of existence for the real numbers is the fact that the Cauchy criterion (see Proposition 1(iii)) is a criterion for convergence. The proof goes like this: first we find a decreasing sequence associated with the given sequence. An easy consequence of the Cauchy criterion is that the sequence is bounded. Thus, by Theorem 2.1 this decreasing sequence has a limit $x$. It now follows from the Cauchy criterion that the full given sequence also has the limit $x$.

Theorem 2.2. Let $\{x_n\}$ be a Cauchy sequence of real numbers. That is, for every $\varepsilon > 0$, there is an integer $N$ such that $|x_n - x_m| < \varepsilon$ whenever $n, m \ge N$. Then there is an $x$ such that $\lim x_n = x$.

Proof. First, a Cauchy sequence is bounded. Let $\varepsilon = 1$; there is an $N$ such that $|x_n - x_m| < 1$ for $n, m \ge N$. Then $M = \max\{|x_1|, \ldots, |x_N|\} + 1$ is a bound for $\{x_n\}$. Let $u_k = \sup\{x_n : n \ge k\}$. Clearly, the $u_k$ are decreasing, and $u_k \ge -M$ for all $k$, so by Theorem 2.1 the sequence $u_k$ converges, say $\lim u_k = u$. We shall show that also $\lim x_n = u$. Let $\varepsilon > 0$. There are $N_1, N_2$ such that

$$|x_n - x_m| < \frac{\varepsilon}{3} \quad\text{for } n, m \ge N_1, \qquad |u_k - u| < \frac{\varepsilon}{3} \quad\text{for } k \ge N_2.$$

Let $N = N_1 + N_2$. Since $u_N = \sup\{x_n : n \ge N\}$, there is an $n_0 \ge N$ such that $u_N - (\varepsilon/3) < x_{n_0} \le u_N$. Then, combining all these inequalities, we have for $n \ge N$,

$$|x_n - u| \le |x_n - x_{n_0}| + |x_{n_0} - u_N| + |u_N - u| < \varepsilon.$$

Because of Proposition 2, that a complex sequence converges if its real and imaginary parts do, we can deduce the same theorem for complex sequences.

Corollary. A Cauchy sequence of complex numbers converges.

Proof. Problem 3.

• EXERCISES

1. What are the limits (when they exist) of these sequences:
(a) $\{n^2 - 4\}$
(b) $\{(n^2 - 4)^{-1}\}$
(c) jb^j
(d) $\{(-1)^n - (-1)^{n+1}\}$
(e) (f) МЭ1 w Ρ£^

2. If $\{x_n\}$ is a convergent sequence, then $\lim_{n\to\infty}(x_{n+1} - x_n) = 0$. Is this a criterion for convergence?
3. Suppose $\lim x_n = z$. Let $\{y_n\}$ be the sequence $\{x_{N+1}, x_{N+2}, \ldots\}$. Show that $\lim y_n = z$ also.
4. Let $\{s_n\}$, $\{t_n\}$ be two convergent sequences. Show that if they are convergent sequences with the same limit, then $\lim(s_n - t_n) = 0$. Is the converse true?
5. What is $\lim((n+1)^{1/2} - \sqrt{n})$?
6. Show that $\lim z_n = 0$ if and only if $\lim |z_n| = 0$.

• PROBLEMS

1. Let $\{x_n\}$ be a decreasing sequence of real numbers. Prove, using the least upper bound axiom, that if $\{x_n\}$ is bounded, it is convergent. Deduce also that an increasing bounded sequence is convergent.
2. Suppose that $\lim z_n = 0$ and $\{c_n\}$ is a bounded sequence. Prove that $\lim_{n\to\infty} c_n z_n = 0$. If $\{z_n\}$ is a convergent sequence and $\{c_n\}$ is a bounded sequence, is $\{c_n z_n\}$ convergent? or bounded?
3. Deduce from the fact that Cauchy sequences of real numbers converge, that a Cauchy sequence of complex numbers is convergent.
4. Let $\{z_n\}$ be a sequence of complex numbers and $\{c_n\}$ a sequence of positive real numbers such that $|z_n| \le c_n$ for all $n \ge N_0$. Prove that if $\lim c_n = 0$, also $\lim z_n = 0$.
5. Suppose $\lim z_n = z$. Let $\{y_n\}$ be a subsequence of $\{z_n\}$. Then $\lim y_n = z$ also.
6. Let $\{s_n\}$, $\{x_n\}$, $\{t_n\}$ be three sequences of real numbers. Suppose that $s_n \le x_n \le t_n$ for all $n$.
(a) Show that if $\lim s_n = \lim t_n = c$, then also $\lim x_n = c$.
(b) Show that if $\lim s_n = c$ and $\lim(t_n - s_n) = 0$, then also $\lim x_n = c$.
7. (a) Let $1 > \delta > h > 0$. Show that there is an integer $K$ such that for $n \ge K$, $(n/(n+1))\delta > h$.
(b) Let $1 > h > 0$. Show that $\lim n h^n = 0$.
8. Suppose $\lim z_n = z$, $\lim w_n = w$. Show that
(a) $\lim |z_n| = |z|$.
(b) $\lim(z_n + w_n) = \lim z_n + \lim w_n$.
(c) $\lim z_n w_n = \lim z_n \cdot \lim w_n$.
2.2 Series

A sequence may be formed term-by-term by adding a little bit to each term. In this case the limit, if it exists, will be an infinite sum. Such sequences are probably the most important kind, for in practice what we usually know about a sequence is the difference between two successive terms. This sequence is given to us as the sequence of sums of these differences.

Let $\{z_n\}$ be a given sequence. The series formed of this sequence is the sequence of sums

$$z_1,\; z_1 + z_2,\; \ldots,\; z_1 + \cdots + z_n = \sum_{i=1}^{n} z_i,\; \ldots$$

The series converges if the sequence of sums $\sum_{i=1}^{n} z_i$ converges; in this case the limit is denoted by $\sum_{i=1}^{\infty} z_i$.

Example

14. The geometric series $\sum_{n=0}^{\infty} z^n$. Let $S_N = \sum_{n=0}^{N} z^n$. Then

$$S_{N+1} = 1 + z + \cdots + z^N + z^{N+1} = S_N + z^{N+1}.$$

Notice also that

$$S_{N+1} = 1 + z(1 + z + \cdots + z^N) = 1 + zS_N.$$

These two equations give us the general term of the sequence explicitly: $1 + zS_N = S_N + z^{N+1}$, or

$$S_N = \frac{1 - z^{N+1}}{1 - z} \qquad (z \ne 1) \tag{2.5}$$

Now, if $|z| < 1$, then $\lim S_N = (1 - z)^{-1}$, for

$$\left|S_N - \frac{1}{1-z}\right| = \left|\frac{z^{N+1}}{1-z}\right| = \frac{1}{|1-z|}\,|z|^{N+1}$$
and $\lim |z|^N = 0$ (Example 13). So given $\varepsilon > 0$, we find $N_0$ so that $|z|^{N+1} < |1 - z|\,\varepsilon$ for $N \ge N_0$, and thus

$$\left|S_N - \frac{1}{1-z}\right| < \varepsilon \quad\text{for } N \ge N_0.$$

Notice that we cannot immediately determine whether or not the geometric series converges for $|z| \ge 1$ (of course, for $z = 1$, $S_N = N + 1$, so the series diverges). In fact, the geometric series does not converge for $|z| \ge 1$, by application of the following proposition.

Proposition 3. If $\sum_{n=0}^{\infty} z_n$ exists, then the general term must converge to zero; that is, $\lim z_n = 0$.

Proof. Let $\{s_n\}$ be the sequence of partial sums. By hypothesis $\lim s_n$ exists. Let $\varepsilon > 0$ be given. There is an $N$ such that for $n, m \ge N$, $|s_n - s_m| < \varepsilon$. Thus, for $n \ge N$, $|z_{n+1}| = |s_{n+1} - s_n| < \varepsilon$.

Thus,

$$\sum_{n=0}^{\infty} z^n = \begin{cases} \dfrac{1}{1-z} & \text{for } |z| < 1 \\[1ex] \text{diverges} & \text{for } |z| \ge 1 \end{cases} \tag{2.6}$$

For if $|z| > 1$, the general term has absolute value $|z|^n$. By Example 13, $\lim (1/|z|)^n = 0$, so $\{|z|^n\}$ gets arbitrarily large and does not converge. If $|z| = 1$, $|z|^n = 1$, and the sequence $\{1\}$ does not converge to zero.

By the way, the condition $\lim z_n = 0$ is not sufficient for the convergence of the series $\sum z_n$, as the following examples show.

Examples

15. Certainly the series $1 + 1 + \cdots + 1 + \cdots$ diverges. But we can rewrite this as

$$1 + \frac12 + \frac12 + \frac13 + \frac13 + \frac13 + \cdots + \underbrace{\frac1n + \frac1n + \cdots + \frac1n}_{n \text{ times}} + \frac{1}{n+1} + \cdots$$

Here the general term tends to zero.
16. $\sum_{n=1}^{\infty} 1/n$ diverges. Let $s_N = \sum_{n=1}^{N} 1/n$. Then

$$s_{2N} - s_N = \frac{1}{N+1} + \cdots + \frac{1}{2N} \ge N \cdot \frac{1}{2N} = \frac12.$$

Thus, $\{s_N\}$ is not a Cauchy sequence.

The sum of a series of positive numbers is particularly easy to work with. For if $\{c_n\}$ is a sequence of positive numbers, then the series $\{\sum_{k=1}^{n} c_k\}$ is an increasing sequence, so by Theorem 2.1 (as rewritten in Problem 1) this sequence converges if and only if it is bounded.

Proposition 4. Let $\{c_n\}$ be a sequence of nonnegative numbers. The following assertions are equivalent.
(i) $\sum c_k$ converges.
(ii) $\{\sum_{k=1}^{n} c_k\}$ is bounded.
(iii) For each $\varepsilon > 0$, there is an $N$ such that for all $m \ge N$,

$$\sum_{k=N}^{m} c_k < \varepsilon.$$

The proof of the equivalence of (i) and (ii) is essentially given in the preceding paragraph. Part (iii) is just the Cauchy criterion restated for positive series (see Problem 11).

Examples

17. $\sum_{n=1}^{\infty} 1/n!$ converges. For $n! \ge 2^{n-1}$ for all $n$, so

$$\frac{1}{n!} \le \frac{1}{2^{n-1}}$$

and thus for all $N$,

$$\sum_{n=1}^{N} \frac{1}{n!} \le \sum_{n=0}^{N-1} \frac{1}{2^n} < 2 \quad\text{by (2.5)}.$$
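The two behaviours in Examples 16 and 17 can be checked with exact rational arithmetic. The sketch below is only an illustration under stated assumptions: the helper names `harmonic` and `factorial_sum` are hypothetical, and the block estimate $s_{2N} - s_N \ge \tfrac12$ is the standard way of seeing that the harmonic partial sums are not Cauchy. The second check confirms the bound $\sum_{n=1}^{N} 1/n! < 2$ for a range of $N$.

```python
from fractions import Fraction
from math import factorial

def harmonic(N: int) -> Fraction:
    """Partial sum s_N = 1 + 1/2 + ... + 1/N of the harmonic series."""
    return sum((Fraction(1, n) for n in range(1, N + 1)), Fraction(0))

def factorial_sum(N: int) -> Fraction:
    """Partial sum 1/1! + 1/2! + ... + 1/N!."""
    return sum((Fraction(1, factorial(n)) for n in range(1, N + 1)), Fraction(0))

# Blocks s_{2N} - s_N never drop below 1/2, so {s_N} cannot be Cauchy.
gaps = [harmonic(2 * N) - harmonic(N) for N in (1, 10, 100)]

# The partial sums of 1/n! stay below 2, so this positive series is bounded.
bounded = all(factorial_sum(N) < 2 for N in range(1, 30))

if __name__ == "__main__":
    print([float(g) for g in gaps])
    print(bounded, float(factorial_sum(29)))
```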
18. $\displaystyle\sum_{n=1}^{\infty}\left(\frac1n - \frac{1}{n + \cos(1/n)}\right)$ converges. For

$$\frac1n - \frac{1}{n + \cos(1/n)} \le \frac1n - \frac{1}{n+1}.$$

Thus, for all $N$,

$$\sum_{n=1}^{N}\left(\frac1n - \frac{1}{n + \cos(1/n)}\right) \le \sum_{n=1}^{N}\left(\frac1n - \frac{1}{n+1}\right) = 1 - \frac12 + \frac12 - \frac13 + \cdots + \frac1N - \frac{1}{N+1} \le 1.$$

There is no such simple criterion as Proposition 4 for arbitrary series of complex (or real) numbers, and the question of convergence as well as computation of a limit can become extremely subtle. However, if for a given series the series formed of the absolute values converges, the situation is considerably clarified. Ordinarily we shall discuss the convergence of a series only in the happy circumstance that the corresponding series of absolute values converges.

Proposition 5. Let $\{c_k\}$ be a sequence of complex numbers. If $\sum |c_k|$ converges, $\sum c_k$ also converges.

Proof. Let $t_n$ be the sequence of partial sums of $\sum |c_k|$ and $s_n$ the partial sums of $\sum c_k$. Notice, for $m > n$,

$$|s_m - s_n| = \left|\sum_{k=n+1}^{m} c_k\right| \le \sum_{k=n+1}^{m} |c_k| = t_m - t_n.$$

Thus, if $\{t_n\}$ is a Cauchy sequence, so also is $\{s_n\}$.

Definition 3. Let $\{c_k\}$ be a sequence of complex numbers. $\sum c_n$ is absolutely convergent if $\sum |c_n|$ converges. If $\sum |c_n|$ diverges, but $\sum c_n$ converges, we say $\sum c_n$ is conditionally convergent.

There are such things as conditionally convergent series. In fact, $\sum_{n=1}^{\infty} (-1)^n/n$ converges. But as we have seen in Example 16 the series $\sum_{n=1}^{\infty} 1/n$ of absolute values is divergent. It is easy to see that $\sum_{n=1}^{\infty} (-1)^n/n$
converges. Let $\{s_n\}$ be the sequence of partial sums. Then the subsequence $\{s_{2n}\}$ is decreasing, and bounded below by $s_1$, and the subsequence $\{s_{2n+1}\}$ is increasing, and bounded above by $s_2$. Thus, both these subsequences converge. Since

$$|s_{2n+1} - s_{2n}| = \frac{1}{2n+1} < \frac{1}{n+1},$$

they have the same limit. It is easy to deduce that the full sequence also converges to that common limit. Here is the proof in a more general case (known as Leibniz's theorem).

Proposition 6. Let $\{c_n\}$ be a decreasing sequence of positive numbers such that $\lim c_n = 0$. Then $\sum (-1)^n c_n$ converges.

Proof. Let $s_n = \sum_{k=1}^{n} (-1)^k c_k$. We consider the sequences of even and odd partial sums separately. The sequence $\{s_{2n}\}$ is decreasing, since

$$s_{2(n+1)} - s_{2n} = c_{2n+2} - c_{2n+1} \le 0.$$

Similarly, the sequence of odd partial sums $\{s_{2n+1}\}$ is increasing. Furthermore these sequences are bounded, for, given any $n$,

$$s_1 \le s_{2n+1} = s_{2n} - c_{2n+1} \le s_{2n} \le s_2,$$

so $\{s_{2n}\}$ is bounded below by $s_1$ and above by $s_2$. The same is true for the sequence $\{s_{2n+1}\}$. Thus, by Theorem 2.1, $\lim s_{2n} = s$ and $\lim s_{2n+1} = s'$ both exist. Furthermore,

$$s' - s = \lim s_{2n+1} - \lim s_{2n} = \lim(s_{2n+1} - s_{2n}) = \lim(-c_{2n+1}) = 0,$$

so $s' = s$. Since both sequences, of odd partial sums and even partial sums, converge and have the same limit, the whole sequence also converges to that limit.

Notice that this argument does not give any hint as to the value of $\sum (-1)^n/n$. Outside of the case of Proposition 4, there is no positive assertion that can be made about conditionally convergent series. In fact, they tend to behave very badly, as the following illustrative example shows.

Example

19. The series

$$\frac12 + \frac12 + \frac14 + \frac14 + \frac14 + \frac14 + \cdots + \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{2n \text{ times}} + \cdots$$
is the same as $1 + 1 + 1 + \cdots$ and thus diverges. Since the general term is decreasing to zero, by Leibniz's theorem this series is conditionally convergent:

$$\frac12 - \frac12 + \frac14 - \frac14 + \frac14 - \frac14 + \cdots + \underbrace{\frac{1}{2n} - \frac{1}{2n} + \cdots + \frac{1}{2n} - \frac{1}{2n}}_{n \text{ times}} + \cdots$$

The sequence of partial sums is

$$\frac12,\; 0,\; \frac14,\; 0,\; \frac14,\; 0,\; \ldots,\; \frac{1}{2n},\; 0,\; \ldots$$

and thus obviously converges to zero. However, we may now rearrange terms of the series so that it no longer converges! Consider the same series where in each group we first add the positive terms and then the negative terms:

$$\frac12 - \frac12 + \frac14 + \frac14 - \frac14 - \frac14 + \cdots + \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{n \text{ times}} - \underbrace{\frac{1}{2n} - \cdots - \frac{1}{2n}}_{n \text{ times}} + \cdots \tag{2.7}$$

The corresponding sequence of partial sums is

$$\frac12,\; 0,\; \frac14,\; \frac12,\; \frac14,\; 0,\; \ldots,\; \frac{1}{2n},\; \ldots,\; \frac{n-1}{2n},\; \frac12,\; \frac{n-1}{2n},\; \ldots,\; \frac{1}{2n},\; 0,\; \ldots$$

Thus, there is a subsequence $\{\tfrac12, \tfrac12, \ldots\}$ and another $\{0, 0, \ldots\}$, so we cannot have convergence of (2.7). We leave to the student (Exercise 9) to show that it can be further rearranged so that it once again converges, but this time to one!

No such foolishness holds for absolutely convergent series. We may attempt to sum the series in any way we please. If we arrive at a limit, it is the sum. In fact, if $\sum c_n$ is an absolutely convergent series we may sum first the positive terms, and then the negative terms; and $\sum c_n$ is the sum of these two sums. We conclude this section with the proof of these facts.
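The rearrangement in Example 19 is easy to reproduce exactly. The sketch below assumes the grouping described there: group $n$ consists of $n$ pairs $+\tfrac{1}{2n}, -\tfrac{1}{2n}$ in the original order, and of $n$ positive terms followed by $n$ negative terms in the rearranged order. The function names are hypothetical. It confirms that both lists contain the same terms, that the original partial sums shrink toward zero, and that the rearranged partial sums keep returning to both $\tfrac12$ and $0$.

```python
from fractions import Fraction
from itertools import accumulate

def alternating_terms(groups: int) -> list[Fraction]:
    """Group n contributes n pairs (+1/2n, -1/2n)."""
    terms = []
    for n in range(1, groups + 1):
        t = Fraction(1, 2 * n)
        terms.extend([t, -t] * n)
    return terms

def rearranged_terms(groups: int) -> list[Fraction]:
    """Group n contributes n terms +1/2n followed by n terms -1/2n."""
    terms = []
    for n in range(1, groups + 1):
        t = Fraction(1, 2 * n)
        terms.extend([t] * n + [-t] * n)
    return terms

G = 50  # number of groups used in this finite experiment
original_sums = list(accumulate(alternating_terms(G)))
rearranged_sums = list(accumulate(rearranged_terms(G)))

if __name__ == "__main__":
    print(original_sums[:6])     # 1/2, 0, 1/4, 0, 1/4, 0
    print(rearranged_sums[:6])   # 1/2, 0, 1/4, 1/2, 1/4, 0
    print(max(original_sums[-2 * G:]), rearranged_sums.count(Fraction(1, 2)))
```

Every group of the rearranged series climbs to exactly $\tfrac12$ before falling back to $0$, which is why its partial sums have two different subsequential limits.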
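Proposition 7. Let $\sum c_n$ be an absolutely convergent series of real numbers.

(i) Let

$$c_k^+ = \begin{cases} c_k & \text{if } c_k \ge 0 \\ 0 & \text{if } c_k < 0 \end{cases} \qquad c_k^- = \begin{cases} -c_k & \text{if } c_k \le 0 \\ 0 & \text{if } c_k > 0 \end{cases}$$

Then the sums $\sum c_k^+$, $\sum c_k^-$ converge and $\sum c_k = \sum c_k^+ - \sum c_k^-$.

(ii) (Rearrangement.) Let $g$ be a one-to-one mapping of the positive integers onto the positive integers. Then $\sum c_{g(n)} = \sum c_n$.

(iii) (Regrouping.) Let $h$ be any strictly increasing function from $P$ into $P$, and put $h(0) = 0$. Let

$$d_n = \sum_{k=h(n-1)+1}^{h(n)} c_k.$$

Then $\sum d_n = \sum c_n$.

Proof.
(i) Since the sequence $\left(\sum_{k=1}^{n} |c_k|\right)$ is bounded by absolute convergence, and

$$\sum_{k=1}^{n} |c_k| \ge \sum_{k=1}^{n} c_k^+, \qquad \sum_{k=1}^{n} |c_k| \ge \sum_{k=1}^{n} c_k^-,$$

the sequences $\sum_{k=1}^{n} c_k^+$, $\sum_{k=1}^{n} c_k^-$ are also increasing and bounded. Thus they converge to, say, $s$, $t$ respectively, by Theorem 2.1. We have to show that $\sum c_n = s - t$. Let $\varepsilon > 0$. Then there are $N_1, N_2$ such that for $n \ge N_1$,

$$\left|\sum_{k=1}^{n} c_k^+ - s\right| < \frac{\varepsilon}{2}$$

and for $n \ge N_2$,

$$\left|\sum_{k=1}^{n} c_k^- - t\right| < \frac{\varepsilon}{2}.$$

Then for $n \ge \max(N_1, N_2)$,

$$\left|\sum_{k=1}^{n} c_k - (s - t)\right| = \left|\sum_{k=1}^{n} c_k^+ - \sum_{k=1}^{n} c_k^- - (s - t)\right| < \varepsilon.$$

(ii) Let $g$ be a one-to-one map of $P$ onto $P$. Then $g^{-1}$ is defined and also maps $P$ onto $P$ in a one-to-one fashion. For each $n$, let $N_n = \max(g(1), \ldots, g(n))$.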
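Then for all $n$,

$$\sum_{k=1}^{n} |c_{g(k)}| \le \sum_{k=1}^{N_n} |c_k| \le \sum_{k=1}^{\infty} |c_k|,$$

so the series $\sum c_{g(k)}$ is absolutely convergent. Similarly,

$$\sum_{k=1}^{n} c_{g(k)}^+ \le \sum_{k=1}^{N_n} c_k^+ \le \sum_{k=1}^{\infty} c_k^+ \quad\text{and}\quad \sum_{k=1}^{n} c_{g(k)}^- \le \sum_{k=1}^{\infty} c_k^-$$

for all $n$, so we have $\sum c_{g(k)}^+ \le \sum c_k^+$ and $\sum c_{g(k)}^- \le \sum c_k^-$. Applying the same reasoning but reversing the roles of the two series, we obtain the reverse inequalities, so that in fact $\sum c_{g(k)}^+ = \sum c_k^+$ and $\sum c_{g(k)}^- = \sum c_k^-$. Thus, by part (i) we obtain the desired equality; that is, $\sum c_{g(k)} = \sum c_k$.

Part (iii) is actually true for any convergent series. Let $\sum c_n = c$, and let the strictly increasing function $h$ be given (with $h(0) = 0$). Notice that $h(n) \ge n$ for all $n$. For $\varepsilon > 0$, there is an $N$ such that

$$\left|\sum_{k=1}^{n} c_k - c\right| < \varepsilon$$

for all $n \ge N$. Thus, for $n \ge N$,

$$\sum_{k=1}^{n} d_k = \sum_{k=1}^{n} \sum_{j=h(k-1)+1}^{h(k)} c_j = \sum_{j=1}^{h(n)} c_j$$

and $h(n) \ge N$, so that

$$\left|\sum_{k=1}^{n} d_k - c\right| = \left|\sum_{j=1}^{h(n)} c_j - c\right| < \varepsilon.$$

• EXERCISES

7. Show that $\displaystyle\sum_{n=1}^{\infty}\left(\frac1n - \frac{1}{n+1}\right)$ converges.
8. What is $\displaystyle\sum \frac{(-1)^n}{a_n}$, where $a_{2n} = 2^n$, $a_{2n+1} = 3^n$?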
9. Rearrange the series

$$\frac12 - \frac12 + \frac14 - \frac14 + \frac14 - \frac14 + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots + \frac{1}{2n} - \frac{1}{2n} + \cdots$$

so it has the sum one.
10. Can the series $\sum (-1)^n/n$ be rearranged so as to have sum 10,000?

• PROBLEMS

9. Suppose $\sum z_n = z$ and $\sum w_n = w$. Show that $\sum (z_n + w_n) = z + w$.
10. Suppose
(a) $\sum z_n$ and $\lim w_n$ exist. Does $\sum z_n w_n$ exist?
(b) $\sum z_n$ and $\sum w_n$ exist. Does $\sum z_n w_n$ exist?
11. Prove that $\sum z_n$ converges if and only if for all $\varepsilon > 0$, there exists an $N > 0$ such that

$$\left|\sum_{k=N}^{n} z_k\right| < \varepsilon \quad\text{for all } n \ge N.$$

Deduce that Proposition 4(iii) is true.

2.3 Tests for Convergence

Since the theory of series is so important and the definition of convergence unwieldy, there has developed a large collection of tests (or criteria) for convergence which are more or less easy to apply in the relevant cases. We have already given some criteria for convergence.

(1) Cauchy criterion: $\sum c_n$ converges if and only if for every $\varepsilon > 0$, there is an integer $N$ such that $|c_{n+1} + \cdots + c_m| < \varepsilon$ for all $m > n \ge N$.
(2) If the sequence $\{c_n\}$ decreases to zero, then $\sum (-1)^n c_n$ converges.
(3) If the sequence $\{c_n\}$ is nonnegative, $\sum c_n$ converges if and only if the sequence $\{\sum_{k=1}^{n} c_k\}$ of partial sums is bounded.

The last condition, which can be considered as a condition for absolute convergence, gives rise to the following criterion, which is the basic one. The idea is to compare a given series with a known convergent one (if we suspect that it converges) or to a known divergent one (if we suspect that it diverges).
Examples

20. $\sum 1/n!$ converges, as we have seen in Example 17. There, we noticed that $1/n! \le 2^{-n+1}$, and since $\sum 2^{-n+1}$ is convergent, so is $\sum 1/n!$. For

$$\sum_{n=1}^{N} \frac{1}{n!} \le \sum_{n=1}^{N} \frac{1}{2^{n-1}} \le \sum_{n=1}^{\infty} \frac{1}{2^{n-1}}.$$

21. $\displaystyle\sum_{n=1}^{\infty} \sin\left(\frac1n\right)$ diverges. For if $x > 0$ is small enough, $\sin x > x/2$. Thus, there is an $N$ such that if $n \ge N$,

$$\sin\left(\frac1n\right) > \frac{1}{2n}$$

and thus for $m > N$,

$$\sum_{n=1}^{m} \sin\left(\frac1n\right) \ge \sum_{n=1}^{N} \sin\left(\frac1n\right) + \frac12 \sum_{n=N+1}^{m} \frac1n.$$

But we can make the last sum as large as we please by taking $m$ large enough. Thus, $\{\sum_{n=1}^{m} \sin(1/n)\}$ is not bounded, and so it is not convergent.

22. $\displaystyle\sum_{n=0}^{\infty} \frac{1}{(1+i)^n}$ is absolutely convergent. For $|1 + i| = \sqrt{2}$, so for any $m$,

$$\sum_{n=0}^{m} \frac{1}{|1+i|^n} \le \sum_{n=0}^{\infty} \frac{1}{(\sqrt{2})^n} < \infty \quad\text{since } \sqrt{2} > 1.$$

The idea behind these examples is contained in the following theorem.
Theorem 2.3. (Comparison Test) Let $\{c_n\}$ be a sequence of complex numbers. If there is a positive number $K$ and an $N$, and a sequence $\{p_n\}$ of positive numbers such that

(i) $|c_n| \le K p_n$ for $n \ge N$,
(ii) $\sum_{n=1}^{\infty} p_n < \infty$,

then $\sum c_n$ converges absolutely. If instead, we have

(i)' $|c_n| \ge K p_n$ for $n \ge N$,
(ii)' $\sum_{n=1}^{\infty} p_n = \infty$,

then $\sum |c_n|$ diverges.

Proof. In the first case the sequence of partial sums is bounded:

$$\sum_{k=1}^{n} |c_k| = \sum_{k=1}^{N} |c_k| + \sum_{k=N+1}^{n} |c_k| \le \sum_{k=1}^{N} |c_k| + K \sum_{k=1}^{\infty} p_k < \infty.$$

In the second case, the sequence of partial sums is unbounded:

$$\sum_{k=1}^{n} |c_k| = \sum_{k=1}^{N} |c_k| + \sum_{k=N+1}^{n} |c_k| \ge \sum_{k=1}^{N} |c_k| + K \sum_{k=N+1}^{n} p_k,$$

which is unbounded as $n \to \infty$.

Examples

23. $\sum_{n=0}^{\infty} z^n/n!$ converges absolutely for any complex $z$. Choose an integer $N$ so that $N > 2|z|$. Then, for all $n$, $(N + n)! > (2|z|)^n$, so that

$$\frac{|z|^{N+n}}{(N+n)!} \le \frac{|z|^N}{2^n}.$$

Since $\sum 1/2^n$ converges, so does $\sum |z|^n/n!$ by the comparison test. As a corollary result we obtain $\lim z^n/n! = 0$ for all $z$ (this however could have been derived directly).
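The comparison used in Example 23 can be checked numerically. The following is a rough sketch, with hypothetical helper names, that tests the termwise bound $|z|^{N+n}/(N+n)! \le |z|^N/2^n$ for an integer $N > 2|z|$ and sums the series in floating point. The comparison of the partial sum with `cmath.exp` relies on the standard fact, not stated in this passage, that the series sums to $e^z$.

```python
import cmath
from math import ceil, factorial

def exp_partial_sum(z: complex, terms: int) -> complex:
    """Sum of z**n / n! for n = 0, ..., terms - 1, built term by term."""
    total, term = 0j, 1 + 0j
    for n in range(terms):
        total += term
        term *= z / (n + 1)   # turns z^n/n! into z^(n+1)/(n+1)!
    return total

def tail_terms_dominated(z: complex, how_many: int = 40) -> bool:
    """Check |z|^(N+n)/(N+n)! <= |z|^N / 2^n for n < how_many, with N > 2|z|."""
    r = abs(z)
    N = ceil(2 * r) + 1
    return all(
        r ** (N + n) / factorial(N + n) <= r ** N / 2 ** n
        for n in range(how_many)
    )

if __name__ == "__main__":
    z = 3 + 4j
    print(tail_terms_dominated(z))
    print(exp_partial_sum(z, 60), cmath.exp(z))
```

The geometric majorant $|z|^N/2^n$ is what makes the tail of the series small regardless of how large $|z|$ is; only the starting index $N$ has to grow with $|z|$.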
24. $\sum n^k z^n$ converges absolutely for all $z$, $|z| < 1$, and all nonnegative integers $k$; and otherwise diverges. If $|z| \ge 1$, then $\lim n^k z^n \ne 0$, so the series can hardly converge. Now suppose $|z| < 1$. We want to prove the convergence by comparison with the geometric series, so we must account for the effect of the coefficients $n^k$. Note that $(n+1)/n \to 1$ as $n \to \infty$, thus also $(n+1)^k/n^k \to 1$ (Exercise 13). Let $s$ be any number greater than 1. Then there is an $N$ such that for all $n \ge N$,

$$\frac{(n+1)^k}{n^k} < s \quad\text{or}\quad (n+1)^k < s\,n^k.$$

Thus, by induction we can conclude that, for all $n > 0$, $(N+n)^k < s^n N^k$. Thus,

$$(N+n)^k |z|^{N+n} < (s|z|)^n N^k |z|^N.$$

We should choose $s < 1/|z|$, so that $\sum_n (s|z|)^n < \infty$. With the choice then of $s$, $1 < s < 1/|z|$, we can apply the comparison test to obtain the convergence of our series $\sum n^k z^n$.

25. $\sum n!\,z^n$ diverges for all $z \ne 0$. We have seen in Example 23 that for any complex number $c$, $\lim_{n\to\infty} c^n/n! = 0$, or, replacing $c$ by $z^{-1}$,

$$\lim_{n\to\infty} \frac{1}{n!\,z^n} = 0.$$

This precludes the possibility that $\lim_{n\to\infty} n!\,z^n = 0$, so the given series cannot converge.

26. $\sum 1/n^2$ converges. In a later section we shall give another proof of this; at present we rely on a tricky observation:

$$\sum_{n=1}^{N}\left(\frac1n - \frac{1}{n+1}\right) = 1 - \frac{1}{N+1}.$$

Thus, the series

$$\sum\left(\frac1n - \frac{1}{n+1}\right)$$

converges to 1. But

$$\frac1n - \frac{1}{n+1} = \frac{1}{n(n+1)},$$

thus

$$\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1.$$
Now, $2n^2 \ge n^2 + n = n(n+1)$, thus

$$\frac{1}{n^2} \le \frac{2}{n(n+1)},$$

so by comparison $\sum 1/n^2$ also converges.

27. $\sum 1/n^{1+\varepsilon}$ converges for any $\varepsilon > 0$. Let $k$ be an integer so large that $k\varepsilon \ge 2$. Then, for any $n$, if $m \ge n^k$,

$$\frac{1}{m^{1+\varepsilon}} \le \frac{1}{n^{k(1+\varepsilon)}} \le \frac{1}{n^{k+2}}.$$

Between $n^k$ and $(n+1)^k$ there are $(n+1)^k - n^k$ integers. Since

$$\lim_{n\to\infty} \left(\frac{n+1}{n}\right)^k = 1,$$

there is an $n_0$ such that for $n \ge n_0$, $(n+1)^k < 2n^k$, or $(n+1)^k - n^k < n^k$. Thus,

$$\sum_{m=n^k+1}^{(n+1)^k} \frac{1}{m^{1+\varepsilon}} \le \frac{n^k}{n^{k+2}} = \frac{1}{n^2}.$$

Well, now we can show that the sequence of partial sums is bounded, for

$$\sum_{n=1}^{N^k} \frac{1}{n^{1+\varepsilon}} \le \sum_{n=1}^{n_0^k} \frac{1}{n^{1+\varepsilon}} + \sum_{n=n_0^k+1}^{N^k} \frac{1}{n^{1+\varepsilon}} \le n_0^k + \sum_{n=n_0}^{N} \sum_{m=n^k+1}^{(n+1)^k} \frac{1}{m^{1+\varepsilon}} \le n_0^k + \sum_{n=n_0}^{N} \frac{1}{n^2} < \infty.$$

Now a special kind of a series is a power series: the geometric series, and the series in Examples 24 and 25 are such series. A power series is a series
of the form

$$\sum_{n=0}^{\infty} c_n z^n.$$

Such a series has the property that if it converges for some $z_0$, then it converges for all $z$ such that $|z| < |z_0|$, and if it diverges for some $z_1$, then it diverges for all $z$ such that $|z| > |z_1|$. Thus, the geometric series diverges for $|z| \ge 1$ and converges for $|z| < 1$; the series $\sum z^n/n!$ converges for all $z$, and $\sum n!\,z^n$ converges for no $z \ne 0$. This general property of power series is easily deduced from the comparison test. We make the following somewhat stronger statement.

Proposition 8. Let $\{c_n\}$ be a sequence of complex numbers.
(i) If $\{|c_n| t^n\}$ is bounded for some positive number $t$, then $\sum c_n z^n$ converges absolutely for all $z$, $|z| < t$.
(ii) If $\{|c_n| t^n\}$ is unbounded, then $\sum c_n z^n$ diverges for all $z$, $|z| \ge t$.

Proof.
(i) Suppose $M \ge |c_n| t^n$ for all $n$. Let $z$ be such that $|z| < t$. Then

$$|c_n z^n| = |c_n|\,t^n \left(\frac{|z|}{t}\right)^n \le M \left(\frac{|z|}{t}\right)^n \quad\text{for all } n,$$

and since $|z|/t < 1$, $\sum (|z|/t)^n < \infty$, so by the comparison test the series $\sum c_n z^n$ converges absolutely.
(ii) If $\{|c_n| t^n\}$ is unbounded, so is $\{c_n z^n\}$ for all $z$, $|z| \ge t$. Thus, we cannot have $\lim c_n z^n = 0$, so $\sum c_n z^n$ cannot converge.

Definition 4. Let $\{c_n\}$ be a sequence of complex numbers. The power series associated to $\{c_n\}$ is the series $\sum_{n=0}^{\infty} c_n z^n$. The radius of convergence of the power series is the least upper bound $R$ of all real numbers $t$ such that the sequence $\{|c_n| t^n\}$ is bounded.

According to Proposition 8 the series $\sum_{n=0}^{\infty} c_n z^n$ converges for $z$ inside the disk of radius $R$ ($|z| < R$), and diverges for $z$ outside that disk (see Problem 12).

Examples

28. $\sum_{n=1}^{\infty} z^n/n$ has radius of convergence one. For if $t > 1$, then $\{t^n/n\}$ is unbounded, and if $t < 1$, $t^n/n \to 0$. Notice that we can make no clear assertion for $z$ on the unit circle, since $\sum_{n=1}^{\infty} (1)^n/n$ diverges, but $\sum_{n=1}^{\infty} (-1)^n/n$ converges.
29. If $\{c_n\}$ is bounded, but does not tend to zero, $\sum_{n=0}^{\infty} c_n z^n$ has radius of convergence one. For clearly $\{|c_n| t^n\}$ is bounded for $t \le 1$, and unbounded for $t > 1$.

There are two final tests of some importance. These are as follows:

Root test. If eventually $|c_n|^{1/n} \le r$ for some $r < 1$, then $\sum c_n$ converges absolutely. If there are infinitely many $n$ such that $|c_n|^{1/n} \ge R$ for some $R > 1$, then $\sum c_n$ diverges.

Ratio test. If there is an $r < 1$ such that eventually

$$\left|\frac{c_{n+1}}{c_n}\right| \le r < 1,$$

then $\sum c_n$ converges absolutely. If

$$\left|\frac{c_{n+1}}{c_n}\right| \ge R > 1$$

for all sufficiently large $n$, then $\sum c_n$ diverges.

These are both derived by comparison with the geometric series. We leave it to the student to derive these tests (Problem 13). Let us here indicate why the convergence assertions are true. Suppose $|c_n|^{1/n} \le r < 1$ for $n$ large enough (say $n \ge N$). Then $|c_n| \le r^n$ eventually, so the partial sums $\sum |c_n|$ are bounded by

$$\sum_{n=0}^{N} |c_n| + \frac{1}{1-r}$$

by comparison with the geometric series. As for the ratio test, suppose

$$\left|\frac{c_{n+1}}{c_n}\right| \le r \quad\text{for } n \ge N.$$
Then we have

$$|c_{N+1}| \le r\,|c_N|, \qquad |c_{N+2}| \le r\,|c_{N+1}| \le r^2 |c_N|, \qquad |c_{N+3}| \le r^3 |c_N|, \qquad \ldots, \qquad |c_{N+k}| \le r^k |c_N|$$

by induction. Thus,

$$\sum_{n=0}^{\infty} |c_n| \le \sum_{n=0}^{N} |c_n| + |c_N| \sum_{n=1}^{\infty} r^n < \infty$$

since $r < 1$.

• EXERCISES

11. Which of the following series converge? (a) (b) (c) (d) H)· **$· Σ,η(1). 2tanG)_SinG)· (e) (f) (g) (h) v n5 + 8 ^tn'+n*' v η3 + η2 + η + 1 и4 + η5 + η6 + ηΊ Σ-· Τ л', χ > 0. η! (i) Σ -f x"> ^ a positive integer, 0 < χ < 1. G) 2(-υ-sinί. (m) z(L + _i— + —L_). « Σ(-ΐ)"^-· (η) ς(-—Ц + —Λ η + 1 \и и + 1 и + 2/ (1) Σ(-ΐ)"—-—· Л (и + I)2
12. Verify directly that $\lim z^n/n! = 0$ for every $z$.
13. Suppose $\lim c_n = c$. Then for any integer $k$, $\lim c_n^k = c^k$.
14. Find the disk of convergence of the following power series. (a) (b) (с) (d) (e) oo 2 n=0 И ю Ζ" £, и! пГо(2и)! ii-V \z\ (f) fn!z». " n = 0 (g) Σ(^· (h) 20+«Л)г". (0 Σ ζ-2. (j) Σ (! + *)"·

• PROBLEMS

12. Let $\{c_n\}$ be a sequence of complex numbers, and let $R$ be the radius of convergence of the power series $\sum c_n z^n$. Show that $\sum c_n z^n$ converges absolutely for $|z| < R$, and $\sum c_n z^n$ diverges for $|z| > R$.
13. Derive the convergence and divergence assertions of the root and ratio tests.

2.4 Convergence in $R^n$

The notion of convergence of a sequence of vectors is easy to conceive, since a vector in $R^n$ is just an $n$-tuple of real numbers. Thus, a sequence of vectors is an $n$-tuple of real sequences, and the question of convergence of the vector sequence is just that of the simultaneous convergence of those $n$ real sequences. We might also directly paraphrase Definition 2 of convergence, using the notion of distance in $R^n$ discussed in Chapter 1. These two possible notions are in fact the same.

Definition 5. Let $\{\mathbf{v}_k\}$ be a sequence of vectors in $R^n$. The sequence converges if there is a vector $\mathbf{v} \in R^n$ such that to every positive number $\varepsilon > 0$ there corresponds an integer $K$ such that $\|\mathbf{v}_k - \mathbf{v}\| < \varepsilon$ for $k \ge K$. We write $\lim_{k\to\infty} \mathbf{v}_k = \mathbf{v}$ if $\{\mathbf{v}_k\}$ converges to $\mathbf{v}$.
Thus, $\lim \mathbf{v}_k = \mathbf{v}$ means precisely that $\lim \|\mathbf{v}_k - \mathbf{v}\| = 0$; that is, the distance between the general term $\mathbf{v}_k$ and $\mathbf{v}$ tends to zero as $k$ becomes infinite. When put this way it sounds like just the notion we have in mind. Recalling that in Section 2.1 we said that a complex sequence $\{c_k\}$ converges to $c$ precisely when $|c_k - c| \to 0$, we see that this coincides with the above definition when $n = 2$.

Now, if we write out the sequence $\mathbf{v}_k$ of vectors in $R^n$ as an $n$-tuple

$$\mathbf{v}_k = (v_k^1, \ldots, v_k^n) \tag{2.8}$$

we can view the given sequence as the $n$ real sequences $\{v_k^j\}$, where $j = 1, \ldots, n$. We now verify the fact mentioned above, that $\mathbf{v}_k \to \mathbf{v}$ precisely when $v_k^j \to v^j$ for all $j$. Notice that Proposition 2 is that fact in the case of $R^2$.

Proposition 9. The sequence (2.8) converges to the vector $\mathbf{v} = (v^1, \ldots, v^n)$ if and only if $\lim_{k\to\infty} v_k^j = v^j$ for all $j$.

Proof. If $\mathbf{w} = (w^1, \ldots, w^n)$ is a vector in $R^n$, then by definition

$$\|\mathbf{w}\| = \left(\sum_{j=1}^{n} (w^j)^2\right)^{1/2}.$$

Then, in particular,

$$|v_k^j - v^j| \le \|\mathbf{v}_k - \mathbf{v}\|, \qquad j = 1, \ldots, n \tag{2.9}$$

Suppose now that $\mathbf{v}_k \to \mathbf{v}$. Then, given $\varepsilon > 0$, there is a $K$ such that $\|\mathbf{v}_k - \mathbf{v}\| < \varepsilon$ for $k \ge K$. Thus, by Equation (2.9), for each $j$, $|v_k^j - v^j| < \varepsilon$ for $k \ge K$. But this means precisely that $\lim_{k\to\infty} v_k^j = v^j$.

Conversely, if $v_k^j \to v^j$ for all $j$, then $(v_k^j - v^j)^2 \to 0$ for all $j$, so

$$\left[\sum_{j=1}^{n} (v_k^j - v^j)^2\right]^{1/2} = \|\mathbf{v}_k - \mathbf{v}\| \to 0$$

as $k \to \infty$. But then, by Definition 5, $\mathbf{v}_k \to \mathbf{v}$.

In precisely the same way we can verify that if the sequence of vectors (2.8) satisfies a Cauchy criterion so do each of the real sequences $\{v_k^j\}$, and thus they are convergent. Hence, by Proposition 9 the sequence of vectors $\{\mathbf{v}_k\}$ also converges, so we have a Cauchy criterion for vector sequences also. This fact, as well as some basic algebraic properties of convergence of vectors, is easily verifiable. Accordingly, we make these assertions, leaving the proofs to the reader.

Proposition 10. (Cauchy Criterion) Let $\{\mathbf{v}_k\}$ be a sequence of vectors in $R^n$. Suppose to every $\varepsilon > 0$ there corresponds a $K$ such that $\|\mathbf{v}_r - \mathbf{v}_s\| < \varepsilon$ whenever both $r, s \ge K$.
Then the sequence $\{\mathbf{v}_k\}$ is convergent.
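Proposition 11. Suppose $\lim \mathbf{v}_k = \mathbf{v}$, $\lim \mathbf{w}_k = \mathbf{w}$, $\lim c_k = c$, where $\{\mathbf{v}_k\}$, $\{\mathbf{w}_k\}$ are sequences of vectors in $R^n$, and $\{c_k\}$ is a sequence of real numbers. Then
(i) $\lim(\mathbf{v}_k + \mathbf{w}_k) = \mathbf{v} + \mathbf{w}$,
(ii) $\lim\langle \mathbf{v}_k, \mathbf{w}_k\rangle = \langle \mathbf{v}, \mathbf{w}\rangle$,
(iii) $\lim c_k \mathbf{v}_k = c\,\mathbf{v}$.

Example

30. Let us find a point of a given plane in $R^3$ which is closest to the origin. A plane is given by the equation $\langle \mathbf{x}, \mathbf{a}\rangle = c$ for fixed $\mathbf{a}$, $c$. Let $m = \text{g.l.b.}\{\|\mathbf{x}\| : \langle \mathbf{x}, \mathbf{a}\rangle = c\}$. Choose a sequence $\{\mathbf{x}_n\}$ on the plane such that $\|\mathbf{x}_n\| \to m$. We shall show that $\{\mathbf{x}_n\}$ actually converges. Now, for indices $n$ and $k$,

$$\|\mathbf{x}_n - \mathbf{x}_k\|^2 = \|\mathbf{x}_n\|^2 + \|\mathbf{x}_k\|^2 - 2\langle \mathbf{x}_n, \mathbf{x}_k\rangle \tag{2.10}$$

We can estimate the last term by using the fact that the midpoint $\tfrac12(\mathbf{x}_n + \mathbf{x}_k)$ between $\mathbf{x}_n$ and $\mathbf{x}_k$ must also be on the given plane:

$$m^2 \le \left\|\tfrac12(\mathbf{x}_n + \mathbf{x}_k)\right\|^2 = \tfrac14\|\mathbf{x}_n\|^2 + \tfrac14\|\mathbf{x}_k\|^2 + \tfrac12\langle \mathbf{x}_n, \mathbf{x}_k\rangle.$$

Thus,

$$-2\langle \mathbf{x}_n, \mathbf{x}_k\rangle \le \|\mathbf{x}_n\|^2 + \|\mathbf{x}_k\|^2 - 4m^2 \tag{2.11}$$

Combining (2.10) and (2.11), we find that

$$\|\mathbf{x}_n - \mathbf{x}_k\|^2 \le 2\left(\|\mathbf{x}_n\|^2 + \|\mathbf{x}_k\|^2 - 2m^2\right) \tag{2.12}$$

Now, since $\|\mathbf{x}_n\| \to m$, if $\varepsilon > 0$ is given, there is an $n_0$ such that for $n, k \ge n_0$, we have $\|\mathbf{x}_n\| < m + \varepsilon$, $\|\mathbf{x}_k\| < m + \varepsilon$. Inequality (2.12) then gives

$$\|\mathbf{x}_n - \mathbf{x}_k\|^2 < 2\left((m + \varepsilon)^2 + (m + \varepsilon)^2 - 2m^2\right) = 8m\varepsilon + 4\varepsilon^2 = 4\varepsilon(2m + \varepsilon).$$

This can be made as small as we please by choosing $\varepsilon$ small. Thus if $n, k$ are large enough, $\|\mathbf{x}_n - \mathbf{x}_k\|$ is small, so the sequence $\{\mathbf{x}_n\}$ is Cauchy, and thus convergent. If $\mathbf{x} = \lim \mathbf{x}_n$, then $\langle \mathbf{x}, \mathbf{a}\rangle = \lim\langle \mathbf{x}_n, \mathbf{a}\rangle = c$ by Proposition 11(ii), so $\mathbf{x}$ lies on the plane, and $\|\mathbf{x}\| = \lim\|\mathbf{x}_n\| = m$, so $\mathbf{x}$ is the closest point on the plane to the origin.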
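The estimate (2.12) in Example 30 can be tested on a concrete plane. The sketch below is an illustration under stated assumptions: the plane $\langle \mathbf{x}, \mathbf{a}\rangle = c$ with $\mathbf{a} = (1, 2, 2)$, $c = 9$ is chosen arbitrarily, the minimizing sequence `x(n)` is a hypothetical choice, and the closed form $c\,\mathbf{a}/\|\mathbf{a}\|^2$ for the closest point is a standard fact that the example itself does not derive (the text proves existence by the Cauchy criterion instead).

```python
from fractions import Fraction

def dot(x, y):
    """Euclidean inner product of two tuples of equal length."""
    return sum(xi * yi for xi, yi in zip(x, y))

a = (Fraction(1), Fraction(2), Fraction(2))   # normal vector of the plane
c = Fraction(9)                               # plane: <x, a> = c
u = (Fraction(2), Fraction(-1), Fraction(0))  # direction inside the plane, <u, a> = 0

# Standard closed form for the point of the plane nearest the origin.
closest = tuple(c * ai / dot(a, a) for ai in a)
m_sq = dot(closest, closest)                  # m^2, the squared minimal distance

def x(n: int):
    """A sequence on the plane whose norms decrease to m."""
    return tuple(p + ui / n for p, ui in zip(closest, u))

def inequality_2_12(n: int, k: int) -> bool:
    """Check ||x_n - x_k||^2 <= 2(||x_n||^2 + ||x_k||^2 - 2 m^2) exactly."""
    xn, xk = x(n), x(k)
    diff = tuple(s - t for s, t in zip(xn, xk))
    return dot(diff, diff) <= 2 * (dot(xn, xn) + dot(xk, xk) - 2 * m_sq)

if __name__ == "__main__":
    print(closest, m_sq)
    print(all(inequality_2_12(n, k) for n in range(1, 20) for k in range(1, 20)))
```

Because the right side of (2.12) tends to zero as the norms approach $m$, the inequality forces the points themselves to cluster, which is the content of the Cauchy argument.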
Let us pause for a moment to consider the reasons, as illustrated by the above example, for studying the convergence of vectors. The central problem of calculus is to find an object, usually considered as a point in a given collection of points, which has certain specified properties (e.g., the maximum of a given function, or a zero of a function). At least, the theoretical aspect of the problem is to prove the existence of a point with such and such properties. Our technique for doing this is to use the desired properties to develop a sequence of approximations; our hope is that the approximations will converge, and that the limit will have the desired properties. It is thus essential to be able to discuss the question of convergence without already knowing the limit. Hence, for example, we have the Cauchy criterion. Further, we will need techniques, or criteria, to apply to the given properties in order to be able to extract the desired Cauchy sequence of approximations. For example, we will want to know:

(a) If we have a convergent sequence of points having a property, does the limit have that property?
(b) If we have a sequence of points having a property, does the sequence converge? Or, at least, does it have a convergent subsequence?

These questions lead us to the reconsideration of the closed sets introduced in Section 1.11. Recall that a closed set in $R^n$ is a set whose complement is open. More precisely, $S$ is closed if and only if corresponding to every $\mathbf{v} \notin S$, there is an $\varepsilon > 0$ such that any vector within $\varepsilon$ of $\mathbf{v}$ is also not in $S$. In particular, if $S$ is a closed set, and $\mathbf{v} \notin S$, then $\mathbf{v}$ cannot be the limit of a sequence of vectors in $S$. To put it positively, a closed set contains the limits of all convergent sequences it contains. This is in fact a defining criterion for closedness:

Proposition 12. Let $S$ be a set in $R^n$. The following assertions are equivalent:
(i) $S$ is closed.
(ii) If $\{v_k\}$ is a convergent sequence contained in $S$, then $\lim v_k \in S$.

Proof. Suppose $S$ is closed. Let $\{v_k\}$ be a sequence contained in $S$ and suppose it converges to $v$. If $v \notin S$, since $S$ is closed, there is an $\varepsilon > 0$ such that no vector in $S$ gets within $\varepsilon$ of $v$. This is nonsense since $v$ is the limit of a sequence in $S$. Thus, we must have $v \in S$.
Suppose now $S$ is not closed. Then there is a $v \notin S$ such that for every $\varepsilon > 0$ there is a vector in $S$ which is within $\varepsilon$ of $v$. In particular, for each $n$, taking $\varepsilon = 1/n$ there is a $v_n$ such that $\|v_n - v\| \le 1/n$ and $v_n \in S$. Thus, $v_n \to v$, so (ii) does not hold for $S$.

We are now in a position to state our last basic consequence of the fundamental existence axiom for the real number system. This is that every bounded sequence in $R^n$ has a convergent subsequence. It is easy to derive
this from the Cauchy criterion, itself an assertion of existence. Let us illustrate the situation in $R^2$. Suppose $\{c_k\}$ is a sequence of complex numbers which is bounded; that is, it remains in some fixed square $S_0$ of side length $K$. Cut that square into four equal squares. At least one of these new squares has infinitely many of the $\{c_k\}$; let $S_1$ be one such square. Cut $S_1$ into four equal pieces and let $S_2$ be one of these new squares which has infinitely many of the $\{c_k\}$; now do the same with $S_2$ and so on (see Figure 2.4). In this way we obtain a sequence of squares $\{S_n\}$ with the properties:
(i) $S_n \supset S_{n+1}$,
(ii) the side length of $S_n$ is $K/2^n$,
(iii) $S_n$ has infinitely many of the $\{c_k\}$.
Now that this is done, we can, for each integer $n$, select a $k(n)$ such that $c_{k(n)} \in S_n$, and $\{c_{k(n)}\}$ forms a subsequence of $\{c_k\}$. (For this we need to know that $S_n$ contains infinitely many $\{c_k\}$, so that we can choose $k(n)$ greater than any previously chosen index.) Now, $\{c_{k(n)}\}$ is a Cauchy sequence. For let $\varepsilon > 0$, and choose $N$ so that $\varepsilon > K\sqrt{2}/2^N$. Then, if $n, m \ge N$, we have $c_{k(n)}, c_{k(m)} \in S_N$, so
$$ |c_{k(n)} - c_{k(m)}| \le \left[ \left(\frac{K}{2^N}\right)^2 + \left(\frac{K}{2^N}\right)^2 \right]^{1/2} = \frac{K\sqrt{2}}{2^N} < \varepsilon $$
Since the sequence $\{c_{k(n)}\}$ is a Cauchy sequence, by Proposition 10 it converges, and the argument for $R^2$ is concluded.

[Figure 2.4]

This is the basic idea of the verification of
Theorem 2.4. Every sequence in a closed and bounded set $S$ in $R^n$ has a subsequence which converges to a point of $S$.

Proof. Suppose that $S$ is closed and bounded and $\{v_k\}$ is a sequence in $S$. We shall find a Cauchy subsequence. Since the sequence is bounded, it is contained in some ball $B(0, R)$. This ball can be covered by finitely many balls of radius 1. Since the $\{v_k\}$ are infinite, there is one such ball which contains infinitely many. Call it $B_1$, and let $v_{k(1)} \in B_1$. $B_1$ can be covered by finitely many balls of radius $\tfrac{1}{2}$. Let $B_2$ be one such which contains infinitely many of the $\{v_k\}$ and let $v_{k(2)} \in B_2$ with $k(2) > k(1)$. Continuing in this way we obtain a sequence $\{B_n\}$ of balls and a subsequence $\{v_{k(n)}\}$ of $\{v_k\}$ such that
(i) $B_n$ has radius $1/n$,
(ii) $v_{k(n)} \in B_n$,
(iii) $B_n \supset B_{n+1}$.
Then $\{v_{k(n)}\}$ is a Cauchy sequence, for if $n, m \ge N$, $v_{k(n)}$ and $v_{k(m)} \in B_N$, which has radius $1/N$, so
$$ \|v_{k(n)} - v_{k(m)}\| \le \frac{2}{N} \quad \text{for all } n, m \ge N $$
By Proposition 10 there is a $v$ such that $v_{k(n)} \to v$ as $n \to \infty$. Since $S$ is a closed set, and $\{v_{k(n)}\} \subset S$, we also have $v \in S$, so the theorem is proven.

Example 31. The unit sphere $S = \{x \in R^n : \|x\| = 1\}$ is closed. For if $x_n \to x$, then certainly $\|x_n\| \to \|x\|$, so if $x_n \in S$, so is $x$. Now suppose $T$ is a linear transformation of $R^n$ to $R^n$. We want to know if there is an $x \in S$ at which $\|Tx\|$ is a maximum. First of all, the set of numbers of the form $\|Tx\|$ with $x \in S$ is bounded. Let $A = (a^i_j)$ be the matrix representing $T$, and $M = \max |a^i_j|$. Then
$$ Tx = T(x^1, \ldots, x^n) = \left( \sum_j a^1_j x^j, \ldots, \sum_j a^n_j x^j \right) $$
so
$$ \|Tx\| = \left[ \Big(\sum_j a^1_j x^j\Big)^2 + \cdots + \Big(\sum_j a^n_j x^j\Big)^2 \right]^{1/2} \le \left[ nM^2\|x\|^2 + \cdots + nM^2\|x\|^2 \right]^{1/2} \le nM\|x\| \qquad (2.13) $$
Thus, $nM$ is the desired bound. By the least upper bound axiom then, $m = \sup\{\|Tx\| : x \in S\}$ exists, and there is a sequence $\{x_n\} \subset S$ such that $\|Tx_n\| \to m$. According to the above theorem there is a subsequence $\{y_n\}$ which converges, say to $y$. Since $\|Tx_n\| \to m$, we also have $\|Ty_n\| \to m$, and by (2.13), in fact $\|Ty\| = \lim \|Ty_n\| = m$.
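The bound $\|Tx\| \le nM\|x\|$ of (2.13) can be checked numerically on the unit sphere. This is a rough sketch under stated assumptions: the matrix `A`, the random sampling scheme, and the function names are illustrative choices, and sampling only approximates the supremum $m$ from below; it does not locate the maximizing point whose existence Example 31 proves.

```python
import math
import random

def apply(A, x):
    """Matrix-vector product: the i-th entry is sum_j A[i][j] * x[j]."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in A]

def norm(x):
    """Euclidean length ||x||."""
    return math.sqrt(sum(xi * xi for xi in x))

def entry_bound(A):
    """The bound n*M of (2.13) for a square matrix A, with M = max |a_ij|."""
    n = len(A)
    M = max(abs(aij) for row in A for aij in row)
    return n * M

def sampled_max(A, trials=2000, seed=0):
    """Largest ||Ax|| found over randomly sampled unit vectors x."""
    rng = random.Random(seed)
    n = len(A)
    best = 0.0
    for _ in range(trials):
        x = [rng.gauss(0.0, 1.0) for _ in range(n)]
        r = norm(x)
        if r == 0.0:
            continue  # cannot normalize the zero vector; skip it
        x = [xi / r for xi in x]
        best = max(best, norm(apply(A, x)))
    return best

A = [[2.0, 0.0], [0.0, 1.0]]  # hypothetical example; here sup ||Ax|| = 2
```

For this diagonal matrix the sampled maximum stays below the true supremum 2, which in turn is below the cruder bound $nM = 4$.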
PROBLEMS
14. Prove Proposition 10.
15. Prove Proposition 11.
16. Let $\Pi$ be a plane in $R^3$, and suppose $x_0$ is the point on $\Pi$ which is closest to the origin. Show that if $x \in \Pi$, then $x_0$ is orthogonal to $x - x_0$. (Hint: If not, then one of $x - x_0$, $x + x_0$ is closer to the origin than $x_0$.)
17. Find the point on the plane given by the equation $\langle x, (1, 1, 1) \rangle = 3$ which is closest to the origin.
18. Find the point on the plane $\langle x, (1, 0, 1) \rangle = 2$ which is closest to -α, ι, υ.
19. Let $L$ be a linear function from $R^n$ to $R^m$. Show that the kernel and range of $L$ are both closed.
20. Let $L: R^n \to R$ be a linear function. Show that if $\lim x_n = x$, then also $\lim L(x_n) = L(x)$.
21. Let $v_0$ be a vector in $R^n$, and $\Pi$ the set of $x$ such that $\langle x, v_0 \rangle = c$. Show that $\Pi$ is closed.
22. Show that for any $v_0 \in R^n$ and $r > 0$, $\{v \in R^n : \|v - v_0\| \le r\}$ is closed.
23. Show that $v_k \to v$ in $R^n$ if and only if $\max_i |v_k^i - v^i| \to 0$.

2.5 Continuity

We turn now to the consideration of functions from subsets of $R^n$ to $R^m$. The basic notion of analysis being that of convergence, the fundamental class of functions will consist of those which respect convergence; that is, those which take convergent sequences into convergent sequences. These are the continuous functions.

Definition 6. Let $S$ be a set in $R^n$, and $f$ a function defined on $S$, taking values in $R^m$. $f$ is continuous on $S$ if whenever $v_k \to v$ with $v_k \in S$ for all $k$ and $v \in S$, then $f(v_k) \to f(v)$.

We shall be concerned most usually with the local study of a function near a given point. For this purpose we make this additional definition.
Definition 7. A function $f$ from a set in $R^n$, taking values in $R^m$, will be said to be continuous at $v_0 \in R^n$ if $f$ is defined in a neighborhood of $v_0$ and $v \to v_0$ implies $f(v) \to f(v_0)$.

Examples
32. $f: R^n \to R$, $f(v) = \|v\|$ is continuous. For if $v_n \to v$, then $\|v_n - v\| \to 0$, so that $\|v_n\| \to \|v\|$ since
$$ \big|\, \|v_n\| - \|v\| \,\big| \le \|v_n - v\| $$
33. $f: C \to C$, $f(z) = \bar{z}$ is continuous: $z_n \to z$ implies $|\bar{z}_n - \bar{z}| = |z_n - z| \to 0$, so that also $\bar{z}_n \to \bar{z}$.
34. A linear function on $R^n$ is continuous. Let
$$ f(v^1, \ldots, v^n) = \sum_{i=1}^n a_i v^i \qquad (2.14) $$
Then, if $v_k \to v$ we have $v_k^1 \to v^1, \ldots, v_k^n \to v^n$, so that $\sum_{i=1}^n a_i v_k^i \to \sum_{i=1}^n a_i v^i$ since the limit of a sum is the sum of the limits. Thus, $f(v_k) \to f(v)$.

Roughly, the idea of continuity of a function $f$ is this: as a moving point $p$ gets close to $p_0$, the value $f(p)$ of $f$ at $p$ gets close to $f(p_0)$. That is, we can ensure that $f(p)$ is as close as we please to $f(p_0)$ by choosing $p$ sufficiently close to $p_0$. This leads to the so-called "$\varepsilon$-$\delta$" criterion for continuity, which we now give.

Proposition 13. Let $S$ be a subset of $R^n$, and let $f$ be an $R^m$-valued function defined on $S$.
(i) Let $x_0 \in S$. $f$ is continuous at $x_0$ if and only if, to every $\varepsilon > 0$, there corresponds a $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $\|f(x) - f(x_0)\| < \varepsilon$.
(ii) If $S$ is open, $f$ is continuous on $S$ if and only if $f$ is continuous at every point of $S$.

Proof. (i) Supposing first that the $\varepsilon$-$\delta$ criterion is true, we shall show that $f$ is continuous at $x_0$. Let $x_n \to x_0$. We have to show $f(x_n) \to f(x_0)$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that whenever $x$ is within $\delta$ of $x_0$ we have $\|f(x) - f(x_0)\| < \varepsilon$. Since $x_n \to x_0$, there is an $N$ such that $n \ge N$ implies $\|x_n - x_0\| < \delta$. Thus, for $n \ge N$, $\|f(x_n) - f(x_0)\| < \varepsilon$, as desired.
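Conversely, if the $\varepsilon$-$\delta$ criterion is false, then there is an $\varepsilon_0$ such that for every $\delta > 0$ there is an $x_\delta$ for which $\|x_\delta - x_0\| < \delta$ but $\|f(x_\delta) - f(x_0)\| \ge \varepsilon_0$. Selecting
$$ \delta = 1, \tfrac{1}{2}, \tfrac{1}{3}, \ldots, \tfrac{1}{n}, \ldots $$
we obtain the corresponding sequence $x_1, x_{1/2}, \ldots, x_{1/n}, \ldots$, which converges to $x_0$. But $f(x_{1/n}) \not\to f(x_0)$ since the $f(x_{1/n})$ are always outside the ball of radius $\varepsilon_0$ centered at $f(x_0)$. Part (ii) is left as an exercise.

Examples
35. $f: R^2 \to R$ defined by
$$ f(x, y) = \frac{5x}{1 + y^2} $$
is continuous at $(0, 0)$. For
$$ \left| \frac{5x}{1 + y^2} \right| \le 5|x| \le 5\|(x, y)\| $$
Thus, if $\varepsilon$ is given we can choose $\delta = \varepsilon/5$. Then $\|(x, y)\| < \delta$ implies
$$ \left| \frac{5x}{1 + y^2} \right| < 5\delta = \varepsilon $$
36. $$ f(x, y, z) = \frac{y^3 z}{1 + x^2 + z^2} $$
is continuous at $(0, 0, 0)$. We have
$$ |f(x, y, z) - f(0, 0, 0)| = \left| \frac{y^3 z}{1 + x^2 + z^2} \right| \le |y^3 z| \le \|(x, y, z)\|^4 $$
Thus for each $\varepsilon > 0$ choose $\delta = \varepsilon^{1/4}$. Then $\|(x, y, z)\| < \delta$ implies $|f(x, y, z)| < \delta^4 = \varepsilon$.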
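The choices $\delta = \varepsilon/5$ in Example 35 and $\delta = \varepsilon^{1/4}$ in Example 36 can be spot-checked by sampling points inside the ball of radius $\delta$. The sketch below is illustrative only: the names `f35`, `f36`, and `check` are not from the text, and random sampling can refute a bad choice of $\delta$ but cannot prove a good one.

```python
import math
import random

def f35(x, y):
    """The function of Example 35."""
    return 5 * x / (1 + y * y)

def f36(x, y, z):
    """The function of Example 36."""
    return y ** 3 * z / (1 + x * x + z * z)

def check(f, dim, eps, delta, trials=5000, seed=0):
    """Return True if |f(p)| < eps at every sampled p with ||p|| < delta.

    Both examples have f(0) = 0, so |f(p)| is the distance |f(p) - f(0)|.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        p = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        r = math.sqrt(sum(t * t for t in p))
        if r == 0.0:
            continue
        scale = delta * rng.random() / r  # rescale so that ||p|| lies in [0, delta)
        p = [t * scale for t in p]
        if abs(f(*p)) >= eps:
            return False
    return True

eps = 0.01
ok35 = check(f35, 2, eps, eps / 5)      # delta from Example 35
ok36 = check(f36, 3, eps, eps ** 0.25)  # delta from Example 36
too_big = check(f35, 2, eps, 1.0)       # a delta that is far too generous
```

With these settings the first two checks pass, while the oversized $\delta = 1$ is rejected for Example 35.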
37. Я*, у) = (4^r for $(x, y) \ne (0, 0)$, and $f(0, 0) = 0$. This function is not continuous, since /(*'*) = 2? = 2 $\not\to 0$. If we redefine $f(0, 0) = 2$, this new function is still not continuous, since яаз0 = т = 1 $\not\to 2$.
38. We can easily verify the continuity of the linear function (2.14) by the $\varepsilon$-$\delta$ criterion. For
$$ |f(v) - f(w)| = \Big| \sum_i a_i (v^i - w^i) \Big| \le \|(a_1, \ldots, a_n)\| \, \|v - w\| $$
by Schwarz's inequality. Thus, if $\varepsilon > 0$ is given, we can take $\delta = \|(a_1, \ldots, a_n)\|^{-1} \varepsilon$. Then $\|v - v_0\| < \delta$ implies $|f(v) - f(v_0)| < \varepsilon$.

The facts concerning convergence discussed in previous sections have application to the study of continuity, as might be expected. In particular, the assertion that every sequence in a closed bounded set has a convergent subsequence has profound significance for the behavior of continuous functions. Here is an important illustration.

Proposition 14. (Intermediate Value Theorem) Let $f$ be a continuous function on the interval $\{x \in R : a \le x \le b\}$, and suppose that $f(a) < \gamma < f(b)$. Then there is a $c$ in the interval such that $f(c) = \gamma$.

Proof. We seek (as in Figure 2.5) not just a point at which the value of $f$ is $\gamma$, but more precisely the first such point $c$. We must find a way to describe this point which permits us to use the existence theorem. If $x < c$ we must have $f(x) < \gamma$, otherwise the graph of $f$ crosses the line $y = \gamma$ between $a$ and $c$. Thus, $c$ is a lower bound for the set of $x$ such that $f(x) \ge \gamma$. Since $c$ is in that set, it must be the greatest such lower bound. So if there exists a first $c$ at which $f(c) = \gamma$, it is the greatest lower bound of $\{x \in R : a \le x \le b, f(x) \ge \gamma\}$. We now show that this point (which exists by the least upper bound property) is the desired $c$.
Let $c = \text{g.l.b.}\{x : a \le x \le b, f(x) \ge \gamma\}$. Then $c$ is a limit of a sequence $\{x_n\}$ in this set. Since $\gamma \le f(x_n)$ we must also have $\gamma \le \lim f(x_n) = f(c)$, since $f$ is continuous. Now if $f(c) \ne \gamma$ we must have $f(c) > \gamma$. Again, by continuity, there is a $\delta$ such that if $|x - c| < \delta$, then
$$ |f(x) - f(c)| < f(c) - \gamma $$
from which it follows that for all $x$ between $c$ and $c - \delta$, $f(x) > \gamma$. Thus, $f(c - \delta) \ge \gamma$, contradicting the definition of $c$ as a lower bound for the set of $x$ with $f(x) \ge \gamma$. Hence $f(c) > \gamma$ is impossible, so we must have $f(c) = \gamma$.

Now, the most important fact about continuous real-valued functions is that they are bounded on closed and bounded sets. This follows easily from Theorem 2.4. If, say, $f$ is continuous and not bounded above on the set $S$, then, for every positive integer $n$, there is an $x_n \in S$ such that $f(x_n) > n$. If $S$ is closed and bounded, $\{x_n\}$ has a convergent subsequence $\{x_{n(k)}\}$. Let $\lim x_{n(k)} = x_0$. Since $f$ is continuous, $f(x_0) = \lim f(x_{n(k)}) \ge \lim n(k)$. But $n(k) \to \infty$ as $k \to \infty$, so this is impossible. Thus $f$ is bounded on $S$. What is more, it attains its least upper bound. For if $m$ is this least upper bound, but is not a value of $f$, then $g(x) = (f(x) - m)^{-1}$ is a continuous but unbounded function on $S$, again a contradiction.

To conclude: if $f$ is a continuous real-valued function on a closed and bounded set $S$ in $R^n$, then there are $x_1, x_2 \in S$ such that
$$ f(x_1) = \sup\{f(x) : x \in S\}, \qquad f(x_2) = \inf\{f(x) : x \in S\} $$

[Figure 2.5]
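The intermediate value theorem also suggests a way to approximate a point $c$ with $f(c) = \gamma$. The sketch below uses bisection, which is a different technique from the greatest-lower-bound argument of the proof: it is assumed here only as a convenient numerical method, and it converges to some crossing point, not necessarily the first one. The function name `bisect_value` and the tolerance are illustrative.

```python
def bisect_value(f, a, b, gamma, tol=1e-12):
    """Approximate a c in [a, b] with f(c) = gamma, for continuous f
    satisfying f(a) < gamma < f(b)."""
    if not (f(a) < gamma < f(b)):
        raise ValueError("need f(a) < gamma < f(b)")
    lo, hi = a, b  # invariant: f(lo) < gamma <= f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid == lo or mid == hi:
            break  # floating-point resolution reached
        if f(mid) < gamma:
            lo = mid
        else:
            hi = mid
    return hi

# Example: the point of [0, 2] where x^2 takes the value 2.
c = bisect_value(lambda x: x * x, 0.0, 2.0, 2.0)
```

Each step halves the interval while keeping $f(\text{lo}) < \gamma \le f(\text{hi})$, so continuity forces the common limit of the endpoints to satisfy $f(c) = \gamma$.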
Here are the proofs in a slightly more general context.

Theorem 2.5. Let $f$ be a continuous $R^m$-valued function on the closed and bounded set $S$ in $R^n$. Then the set of values of $f$ on $S$,
$$ f(S) = \{f(x) : x \in S\} $$
is closed and bounded.

Proof. First, $f(S)$ is closed. Suppose $y_n \in f(S)$ and $y_n \to y \in R^m$. We must show that $y \in f(S)$. But this is easy. Since $y_n \in f(S)$, there is for each $n$ an $x_n \in S$ such that $f(x_n) = y_n$. Since $S$ is closed and bounded there is a subsequence $\{z_k\}$ of $\{x_n\}$ which converges, $z_k \to z \in S$. Since $f$ is continuous, $f(z_k) \to f(z)$. On the other hand, $\{f(z_k)\}$ is a subsequence of $\{y_n\}$, so $f(z_k) \to y$. Thus $f(z) = \lim f(z_k) = y$ and $y \in f(S)$.
If $f(S)$ is not bounded, there is for each $n$ an $x_n \in S$ such that $\|f(x_n)\| \ge n$. But $\{x_n\}$ has a convergent subsequence $\{z_k\}$. Let $\lim z_k = z$. Then $\lim f(z_k) = f(z)$. But $\{f(z_k)\}$ is a subsequence of $\{f(x_n)\}$, so $\|f(z_k)\| \to \infty$, which is impossible since $\{f(z_k)\}$ is convergent. Thus, $f(S)$ must be bounded.

In particular, suppose $f$ is a continuous real-valued function defined on the closed and bounded set $S$. Then $f(S)$ is bounded, so $M = \sup\{t : t \in f(S)\}$ exists, and since $f(S)$ is closed, $M \in f(S)$. Thus there is an $x_1 \in S$ such that
$$ f(x_1) = \sup\{f(x) : x \in S\} $$
Similarly, there is an $x_2$ such that $f(x_2) = \inf\{f(x) : x \in S\}$. This basic fact we state as

Theorem 2.6. A continuous function attains its maximum and minimum on a closed bounded set.

PROBLEMS
24. Let $x_0 \in R^n$. Show that $f(x) = \langle x, x_0 \rangle$ is continuous on $R^n$.
25. Show that a linear function $L: R^n \to R^m$ is continuous.
26. Prove part (ii) of Proposition 13.
27. Show that if $f$ is a continuous real-valued function on a closed and bounded set $S$, there is an $x_2$ such that $f(x_2) = \text{g.l.b.}\{f(x) : x \in S\}$.
28. Suppose that $f$, $g$ are $R^m$-valued functions continuous at $p_0 \in R^n$. Show that $f + g$ and $\langle f, g \rangle$ are also continuous at $p_0$. If $c \in R$, then also $cf$ is continuous at $p_0$.
2.6 Calculus of One Variable

Theorem 2.6, which asserts that a continuous function attains its maximum and minimum on a closed and bounded set, is the fundamental theoretical tool of the calculus. We shall now give a brief review of the fundamentals of calculus, leaving the recollection of techniques to the student's memory. We shall give brief justifications of some of the more basic or special facts. First of all, we studied in the calculus a limit concept which was more general than the sequential limit we have been studying. We recall the definition.

Definition 8. Suppose $f$ is a real-valued function defined in a set
$$ \{x : 0 < |x - x_0| < \delta_0\} $$
We say $\lim_{x \to x_0} f(x) = L$ if and only if, for all $\varepsilon > 0$, there is $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies $|f(x) - L| < \varepsilon$.

First of all, the relationship between the two concepts of limit is an easy one: $\lim_{x \to x_0} f(x) = L$ if and only if for every sequence $x_n$ converging to $x_0$, we have $\lim_{n \to \infty} f(x_n) = L$. We can thus rephrase the notion of continuity using Definition 8: $f$ is continuous at $x_0$ if and only if $\lim_{x \to x_0} f(x) = f(x_0)$.

Proposition 15.
(i) Suppose $f$ is defined in $I = \{x : 0 < |x - x_0| < \delta\}$. Then
$$ \lim_{x \to x_0} f(x) = L $$
if and only if for every sequence $\{x_n\}$ in $I$ such that $x_n \to x_0$ we have
$$ \lim_{n \to \infty} f(x_n) = L $$
(ii) If $f$ is also defined at $x_0$, $f$ is continuous at $x_0$ if and only if
$$ \lim_{x \to x_0} f(x) = f(x_0) $$

Proof. We will prove only (i). The proof of (ii) is the same and is left as a problem.
Suppose first that $\lim_{x \to x_0} f(x) = L$. Let $\{x_n\}$ be a sequence in $I$ such that $x_n \to x_0$. Given $\varepsilon > 0$, there is a $\delta > 0$ such that $|f(x) - L| < \varepsilon$ for any $x$ such that $0 < |x - x_0| < \delta$. Now since $x_n \to x_0$, there is an $N$ such that for $n \ge N$, $|x_n - x_0| < \delta$. Thus if $n \ge N$, $|f(x_n) - L| < \varepsilon$. Thus, $f(x_n) \to L$.
Now suppose $\lim_{x \to x_0} f(x) = L$ is false. Then there is an $\varepsilon_0$ such that for every $\delta$ we can find an $x_\delta$ such that $|x_\delta - x_0| < \delta$ but $|f(x_\delta) - L| \ge \varepsilon_0$. Consider the sequence $\{c_n\}$ of $x$'s for $\delta = 1, \tfrac{1}{2}, \ldots, \tfrac{1}{n}, \ldots$. Then $|c_n - x_0| < 1/n$, so certainly $c_n \to x_0$. But $f(c_n)$ is always outside the interval $(L - \varepsilon_0, L + \varepsilon_0)$, so it cannot converge to $L$.

Definition 9. Let $f$ be a real-valued function defined in an interval about $x_0 \in R$. $f$ is differentiable at $x_0$ if the limit
$$ \lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t} $$
exists. If it does, the limit is called the derivative of $f$ at $x_0$ and is denoted
$$ f'(x_0) \quad \text{or} \quad \frac{df}{dx}(x_0) $$
If $f$ is differentiable in an interval $I$ and the derivative $f'$ is also differentiable there, then $f$ is said to be twice differentiable on $I$, and $(f')'$ is the second derivative of $f$ and is denoted by
$$ f'' \quad \text{or} \quad \frac{d^2 f}{dx^2} $$
The higher derivatives $f''', \ldots, f^{(n)}, \ldots$ are defined successively in the obvious manner. A function which has derivatives of all orders on the interval will be said to be infinitely differentiable there.

If $f$, $g$ are $n$-times differentiable on $I$, so are $f + g$, $fg$, and $cf$ for $c$ a real number. If $f$ is differentiable in an interval $I$ it is continuous there. If $f$ is differentiable at a point $x_0$ where it attains a local maximum (or minimum), then $f'(x_0) = 0$. This, together with Theorem 2.6, gives this basic existence theorem.

Theorem 2.7. (Mean Value Theorem) Let $f$ be differentiable on the closed interval $[a, b]$. There is a point $\xi \in (a, b)$ such that
$$ f'(\xi) = \frac{f(b) - f(a)}{b - a} \qquad (2.15) $$

Proof. This theorem has a nice geometric interpretation (Figure 2.6). There is a point $(\xi, f(\xi))$ on the graph $y = f(x)$ at which the tangent line is parallel to the line through $(a, f(a))$ and $(b, f(b))$. Clearly (see Problem 30), we need only verify this when the latter line is horizontal, that is, $f(b) = f(a)$. In this case, let $\xi_0 \in [a, b]$,
$\xi_1 \in [a, b]$ be the points at which $f$ attains its maximum and minimum respectively on the interval (Figure 2.7). If either $\xi_0$ or $\xi_1$ is interior, then $f$ has a local maximum or minimum there, so $f'(\xi) = 0$ for the appropriate $\xi$. If this is false, then $\{\xi_0, \xi_1\}$ are the points $\{a, b\}$, so $f(a) = f(b)$ is at once the maximum and minimum of $f$. Thus, $f$ is constant on $[a, b]$, so $f'$ is identically zero and we can choose any point for our $\xi$.

[Figure 2.6]

[Figure 2.7]

Now suppose that $f$ is a differentiable function defined on the interval $[a, b]$, and $g$ is a function defined on the range of $f$, and differentiable there. Then the composed function $h = g \circ f$, defined by
$$ h(x) = g(f(x)) $$
is also differentiable on $[a, b]$. For if $x_0 \in [a, b]$, then
$$ \frac{h(x) - h(x_0)}{x - x_0} = \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \frac{f(x) - f(x_0)}{x - x_0} \qquad (2.16) $$
Taking the limit on both sides, we have (since $x \to x_0$ implies $f(x) \to f(x_0)$),
$$ \lim_{x \to x_0} \frac{h(x) - h(x_0)}{x - x_0} = \lim_{f(x) \to f(x_0)} \frac{g(f(x)) - g(f(x_0))}{f(x) - f(x_0)} \cdot \lim_{x \to x_0} \frac{f(x) - f(x_0)}{x - x_0} $$
The limits on the right exist since $f$ is differentiable at $x_0$, and $g$ is differentiable at $f(x_0)$, so the limit on the left exists. Thus $h$ is differentiable and we obtain the chain rule:
$$ h'(x_0) = (g \circ f)'(x_0) = g'(f(x_0)) f'(x_0) $$
(Notice that if $f(x) = f(x_0)$, then (2.16) is invalid and the proof breaks down. However, that case can be treated separately.)

If $f$ is a function from the interval $[a, b]$ to the interval $[\alpha, \beta]$ and there exists a function $g: [\alpha, \beta] \to [a, b]$ such that
$$ g \circ f(x) = x \ \text{ for all } x \in [a, b], \qquad f \circ g(y) = y \ \text{ for all } y \in [\alpha, \beta] $$
we say that $f$ is invertible and $g$ is its inverse. The mean value theorem gives us a condition under which a differentiable function is invertible. If a function $f$ has an inverse, it must be one-to-one. From (2.15) we see that this will be guaranteed if $f'$ is never zero. This is the sufficient condition for the invertibility of $f$.

Theorem 2.8. Suppose that $f$ is a continuously differentiable function defined on the interval $[a, b]$, and $f'$ is never zero. Let $f(a) = \alpha$ and $f(b) = \beta$. There is a continuously differentiable function $g$ defined on the interval between $\alpha$ and $\beta$ such that
$$ g(f(x)) = x \quad \text{and} \quad g'(f(x)) = \frac{1}{f'(x)} \quad \text{for all } x $$

Proof. $f$ is one-to-one. For if $a \le a_1 < b_1 \le b$, there is, by the mean value theorem, a $\xi$ between $a_1$ and $b_1$ such that
$$ f(b_1) - f(a_1) = f'(\xi)(b_1 - a_1) \ne 0 $$
by hypothesis. Thus $f(b_1) \ne f(a_1)$. By the intermediate value theorem every $y$ between $\alpha$ and $\beta$ is attained by $f$. Now we can define $g$ as follows: let $g(y)$ be that
$x$ such that $f(x) = y$. Clearly, $g(f(x)) = x$ and $f(g(y)) = y$. Now $g$ is differentiable:
$$ \lim_{y \to y_0} \frac{g(y) - g(y_0)}{y - y_0} = \lim_{x \to x_0} \frac{x - x_0}{f(x) - f(x_0)} = \lim_{x \to x_0} \frac{1}{[f(x) - f(x_0)]/(x - x_0)} = \frac{1}{f'(x_0)} $$

A further fundamental fact to be drawn from the mean value theorem is this: a function is determined, up to a constant, by its derivative.

Theorem 2.9. Suppose that $f$, $g$ are differentiable on the interval $[a, b]$ and that $f'(x) = g'(x)$ for all $x \in [a, b]$. Then there is a constant $C$ such that $f(x) = g(x) + C$.

Proof. Let $h = f - g$. By hypothesis $h'(x) = 0$ for all $x \in [a, b]$. By the mean value theorem, for any $c \in [a, b]$, there is a $\xi$, $a \le \xi \le c$, such that
$$ h'(\xi)(c - a) = h(c) - h(a) $$
But $h'(\xi) = 0$, so $h(c) = h(a)$. This holds for all $c \in [a, b]$, so $h$ is constant and thus $f$ differs from $g$ by a constant, as desired.

Now, given any real-valued function $f$ defined on an interval $I$, we consider those differentiable functions $F$ defined on $I$ such that $F' = f$. By Theorem 2.9, any two such functions differ by a constant; thus by specifying the value of such an $F$ at any point it is completely determined. We denote by $\int_a^x f = F(x)$ that function (if it exists) such that $F(a) = 0$ and $F'(x) = f(x)$ for all $x \in [a, b]$. $\int_a^x f$ is called the indefinite integral of $f$. Every continuous function has an indefinite integral, which is given by the process of Riemann integration which we now describe.

Let $f$ be a bounded function defined on the interval $I$. A partition $P$ of $I$ consists of an increasing sequence of points $a_0 < a_1 < \cdots < a_n$ such that $I = [a_0, a_n]$. We now construct two sums, corresponding to the approximations to the area under the graph of $f$ given in Figure 2.8:
$$ \Sigma(P, f) = \sum_{i=1}^n M_i (a_i - a_{i-1}), \qquad \sigma(P, f) = \sum_{i=1}^n m_i (a_i - a_{i-1}) $$
where $M_i$, $m_i$ are the maximum and minimum values of $f$ on the interval $[a_{i-1}, a_i]$.

[Figure 2.8]

Definition 10. Let $f$ be a bounded real-valued function defined on the interval $I$. $f$ is Riemann integrable if
$$ \inf_P \Sigma(P, f) = \sup_P \sigma(P, f) \qquad (2.17) $$
(i.e., if we can find partitions for which the two sums $\Sigma$ and $\sigma$ are as close as we please). In this case the common value is called the definite integral of $f$ over the interval $I$, and denoted $\int_I f$.

If $f$, $g$ are integrable on the interval $I$, then so are $f + g$ and $cf$, $c \in R$. Further,
$$ \int_I (f + g) = \int_I f + \int_I g, \qquad \int_I cf = c \int_I f $$
If $f$ is integrable on the interval $I$, then $f$ is integrable on every interval $J \subset I$. If $f$ is integrable on the intervals $[a, b]$ and $[b, c]$ with $a < b < c$, then $f$ is integrable on $[a, c]$ and
$$ \int_{[a, c]} f = \int_{[a, b]} f + \int_{[b, c]} f $$
Furthermore, if $f \ge g$ and both functions are integrable, then $\int_I f \ge \int_I g$. Finally, if $f$ is integrable on $[a, b]$, then
$$ F(x) = \int_{[a, x]} f $$
is a continuous function of $x$.
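The two sums $\Sigma(P, f)$ and $\sigma(P, f)$ are easy to compute when $f$ is increasing, since then the maximum and minimum on each piece of the partition are attained at its right and left endpoints. The sketch below makes that simplifying assumption (it is not required by Definition 10) and uses a uniform partition; the function name is illustrative.

```python
def riemann_sums_increasing(f, a, b, n):
    """Upper and lower sums of an increasing function f over the uniform
    partition of [a, b] into n pieces.

    Assumption: f is increasing, so on [a_{i-1}, a_i] the maximum M_i is
    f(a_i) and the minimum m_i is f(a_{i-1}).
    """
    h = (b - a) / n
    pts = [a + i * h for i in range(n + 1)]
    upper = sum(f(pts[i]) * (pts[i] - pts[i - 1]) for i in range(1, n + 1))
    lower = sum(f(pts[i - 1]) * (pts[i] - pts[i - 1]) for i in range(1, n + 1))
    return upper, lower

# f(x) = x^2 on [0, 1]; the integral 1/3 is trapped between the two sums.
upper, lower = riemann_sums_increasing(lambda x: x * x, 0.0, 1.0, 1000)
gap = upper - lower  # about (f(1) - f(0)) / n = 0.001
```

For an increasing function the gap between the sums telescopes to $(f(b) - f(a))(b - a)/n$, which tends to zero as the partition is refined, in line with the criterion (2.17).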
The fundamental theorem of calculus says more: if $f$ is continuous on $[a, b]$, then $\int_{[a, b]} f = \int_a^b f$; that is, the definite and the indefinite integrals of $f$ coincide. The proof of this is actually quite easy to describe. Define these functions on the interval $[a, b]$, corresponding to the two sides of Equation (2.17):
$$ \overline{F}(x) = \inf\{\Sigma(P, f) : P \text{ a partition of } [a, x]\}, \qquad \underline{F}(x) = \sup\{\sigma(P, f) : P \text{ a partition of } [a, x]\} $$
To prove that $f$ is Riemann integrable on $[a, b]$ is to prove $\overline{F}(b) = \underline{F}(b)$. We show, using Theorem 2.9, that in fact $\overline{F}(x) = \underline{F}(x)$ for all $x \in [a, b]$. First of all, $\overline{F}$ is differentiable in $[a, b]$. Let $x \in [a, b]$ and $h > 0$; then
$$ \overline{F}(x + h) \le \overline{F}(x) + Mh \qquad (2.18) $$
$$ \overline{F}(x + h) \ge \overline{F}(x) + mh \qquad (2.19) $$
where $M$, $m$ are the maximum and minimum of $f$ in the interval $[x, x + h]$. These inequalities can be routinely verified (see Problem 32); Figure 2.9 is convincing: $\overline{F}(x + h)$ is just $\overline{F}(x)$ plus the infimum of all $\Sigma(P, f)$ over partitions of $[x, x + h]$. Any such sum lies between $Mh$ and $mh$. Now Equations (2.18) and (2.19) give
$$ m \le \frac{\overline{F}(x + h) - \overline{F}(x)}{h} \le M $$

[Figure 2.9]
Letting $h \to 0$, since $f$ is continuous, $M$ and $m$ both tend to $f(x)$. Thus $\overline{F}'(x)$ exists and is $f(x)$. Similarly, one verifies that $\underline{F}'(x)$ also exists for all $x$ and has the same value. Thus, $\overline{F}$ and $\underline{F}$ differ by a constant. Since $\overline{F}(a) = \underline{F}(a) = 0$ is obvious, we have that $\overline{F}(x) = \underline{F}(x)$ for all $x$. Thus $\int_{[a, x]} f$ is defined for all $x$, is differentiable, and has derivative $f$. This, then, is the proof of

Theorem 2.10. (Fundamental Theorem of Calculus) Suppose $f$ is continuous on the interval $[a, b]$. Then the integral $\int_a^x f$ exists for all $x \in [a, b]$. This is a differentiable function of $x$, and
$$ \frac{d}{dx} \int_a^x f = f(x) $$

PROBLEMS
29. Prove Proposition 15(ii).
30. In the text the mean value theorem is proven in the case where $f(b) = f(a)$. The way to do the general case is to compare the graph of $f$ with the line through $(a, f(a))$ and $(b, f(b))$. More precisely, let $g$ be the function whose graph is that line, and consider $h = f - g$.
(a) Show that
$$ h(x) = f(x) - f(a) - \frac{f(b) - f(a)}{b - a}(x - a) \qquad (2.20) $$
(b) Show that $h(a) = h(b) = 0$.
(c) Now from the text there is a $\xi$ between $a$ and $b$ such that $h'(\xi) = 0$. Differentiating (2.20), deduce that
$$ f'(\xi) = \frac{f(b) - f(a)}{b - a} $$
31. Suppose that $f$ is differentiable on the interval $[a, b]$, and $f'(x) > 0$ for all $x$. Show that $f$ is strictly increasing, that is, $f(x) < f(y)$ if $x < y$.
32. Verify inequalities (2.18) and (2.19).
33. Give an example of a continuous function of a real variable which is not differentiable. Give an example of an integrable function which is not continuous.
34. Find the real-valued function $f$, continuous on the interval $[0, 1]$, such that ί f{t) dt=\ /(/) dt for all $x \in [0, 1]$.
35. Suppose $f$ is $k$ times differentiable on $R$, and $f^{(k)}(x) = 0$ for all $x$. Verify that $f$ is a polynomial of degree at most $k - 1$.

2.7 Multiple Integration

The calculus of many variables results from the attempt to study functions of several variable quantities by generalizing to that context the calculus of a single variable. Some notions generalize easily; others require some ideas of linear algebra to be properly understood. The integration theory is much closer to that of one variable than is differentiation, hence we shall describe it first.

A closed rectangle in $R^n$ is a set of the form
$$ \{(x^1, \ldots, x^n) \in R^n : a^i \le x^i \le b^i\} = [(a^1, \ldots, a^n), (b^1, \ldots, b^n)] $$
for some fixed points $a = (a^1, \ldots, a^n)$, $b = (b^1, \ldots, b^n)$ in $R^n$. As in the case of intervals, we denote the corresponding open and half-open rectangles in the same way:
$$ (a, b) = \{x \in R^n : a^i < x^i < b^i\} $$
$$ [a, b) = \{x \in R^n : a^i \le x^i < b^i\} $$
$$ (a, b] = \{x \in R^n : a^i < x^i \le b^i\} $$
The term rectangle will refer to any of these possibilities. The volume of the rectangle $R$ determined by the vectors $a$ and $b$ is
$$ \text{Vol}(R) = (b^1 - a^1) \cdots (b^n - a^n) $$
Notice that the volume of $R$ is the same whether $R$ is closed, open, or half-open. Of course, this is as it should be, since the faces contribute no volume.

Now let $S$ be any set. The characteristic function of $S$, denoted by $\chi_S$, is the function which is one on $S$ and identically zero off $S$. We should want to define the integral so that the volume of $S$ coincides with the integral of $\chi_S$. In particular, for a rectangle $R$ we shall have $\int \chi_R = \text{Vol}(R)$. The notion of integral will be built up piece by piece so that things turn out that way.

Now suppose that $f$ is a finite linear combination of characteristic functions of rectangles: $f = \sum_{i=1}^k a_i \chi_{R_i}$. Such a function is called a simple function: it is constant on each of some finite collection of rectangles, and identically zero off their union.
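The volume formula and the notion of a simple function translate directly into a few lines of code. In the sketch below a simple function is represented, as an illustrative assumption, by a list of `(coefficient, a, b)` triples, one per closed rectangle $[a, b]$. The value computed by `integral_simple` is the sum $\sum a_i \text{Vol}(R_i)$, which is what linearity together with the requirement $\int \chi_R = \text{Vol}(R)$ suggests for the integral of such a function.

```python
def volume(a, b):
    """Vol(R) = (b^1 - a^1) ... (b^n - a^n) for the rectangle determined by a and b."""
    vol = 1.0
    for ai, bi in zip(a, b):
        vol *= (bi - ai)
    return vol

def evaluate_simple(terms, x):
    """Value at the point x of sum_i coef_i * chi_{R_i}, with closed rectangles R_i."""
    total = 0.0
    for coef, a, b in terms:
        if all(ai <= xi <= bi for ai, xi, bi in zip(a, x, b)):
            total += coef
    return total

def integral_simple(terms):
    """The sum of coef_i * Vol(R_i) over the rectangles of the simple function."""
    return sum(coef * volume(a, b) for coef, a, b in terms)

# A hypothetical simple function on the plane: 2 on [0,1]x[0,1], 3 on [1,3]x[0,2].
terms = [(2.0, (0.0, 0.0), (1.0, 1.0)), (3.0, (1.0, 0.0), (3.0, 2.0))]
total = integral_simple(terms)  # 2*1 + 3*4
```

Note that on the shared face $x^1 = 1$ both characteristic functions equal one, so `evaluate_simple` returns 5 there; faces have no volume, so this does not affect the sum of volumes.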
Definition 11. Let $f$ be a simple function. If $f = \sum_{i=1}^k a_i \chi_{R_i}$, we define
$$ \int f = \sum_{i=1}^k a_i \text{Vol}(R_i) \qquad (2.21) $$

We immediately have a problem. It may be possible to also write the same function in another way, $f = \sum_{j=1}^m c_j \chi_{S_j}$ for some other collection of rectangles. For Definition 11 to make sense, we must be assured that the sum $\sum_{j=1}^m c_j \text{Vol}(S_j)$ coincides with (2.21). In case the $a_i$ and $c_j$ are all one and the $\{R_i\}$ and $\{S_j\}$ are nonoverlapping (intersect only in faces), this amounts to the assertion that the volume of a set is the sum of the volumes of its rectangular pieces, no matter how it is so partitioned. The verification that (2.21) is the same for all expressions of the function $f$ as a combination of characteristic functions is long and is omitted. We now make this general definition of the integral.

Definition 12. Let $f$ be a bounded real-valued function which is identically zero outside some rectangle $R$. The upper integral of $f$ is
$$ \overline{\int} f = \inf\left\{ \int \sigma : \sigma \text{ a simple function on } R \text{ such that } \sigma \ge f \right\} $$
The lower integral of $f$ is
$$ \underline{\int} f = \sup\left\{ \int \sigma : \sigma \text{ a simple function on } R \text{ such that } \sigma \le f \right\} $$
$f$ is integrable if $\overline{\int} f = \underline{\int} f$; the common value is the integral $\int f$.

This is the direct generalization of the definition of the Riemann integral given in Section 2.6. On the plane and in space it bears the same relation to area and volume as does the Riemann integral to length.

Definition 13. Let $S$ be a set in $R^n$. If $\chi_S$ is integrable, we define the volume of $S$ to be
$$ \text{Vol}(S) = \int \chi_S $$

Now there are sets for which $\chi_S$ is not integrable; these are highly pathological and shall not occur in this text. Notice that if $R_1, \ldots, R_n$ are nonoverlapping rectangles contained in the set $S$, then the sum of the volumes
$\sum \text{Vol}(R_i) = \int \left( \sum \chi_{R_i} \right)$ is at most $\int \chi_S$, since $\chi_S \ge \sum \chi_{R_i}$. Thus the volume of $S$ is at least the sum of the volumes of any collection of nonoverlapping rectangles contained in $S$. Similarly, if now $R_1, \ldots, R_n$ are nonoverlapping rectangles containing $S$, $\int \chi_S \le \sum \text{Vol}(R_i)$. Thus, the volume of $S$ is trapped between the volume of any union of rectangles containing $S$ and the volume of any union of rectangles contained in $S$. If we can make these two volumes as close as we please by proper choices of the rectangles, then $\chi_S$ is integrable (for then $\overline{\int} \chi_S = \underline{\int} \chi_S$), and its integral is the volume of $S$.

Theorem 2.11. Let $R$ be a closed rectangle in $R^n$. If $f$ is continuous on $R$ and zero off $R$, then $f$ is integrable.

Proof. Given $\varepsilon > 0$, we must find simple functions $\sigma$, $\tau$ such that $\sigma \ge f \ge \tau$ and $\int \sigma \le \int \tau + \varepsilon \text{Vol}(R)$; for then it will follow that
$$ \overline{\int} f \le \int \sigma \le \int \tau + \varepsilon \text{Vol}(R) \le \underline{\int} f + \varepsilon \text{Vol}(R) $$
for any $\varepsilon > 0$. Thus, $\overline{\int} f \le \underline{\int} f$. Since the inequality $\underline{\int} f \le \overline{\int} f$ is obvious in any case, $f$ is integrable.
Such functions $\sigma$, $\tau$ are easily found using the basic property of uniform continuity (discussed in miscellaneous Problem 80). According to that theorem, given $\varepsilon > 0$, there is a $\delta > 0$ such that, if $\|x - y\| < \delta$, then $|f(x) - f(y)| < \varepsilon$. Now partition $R$ into a finite set $\mathcal{S}$ of rectangles each of which has the property that any two points are within $\delta$ of each other. Thus, if for each such rectangle $\rho$, $m_\rho$ and $M_\rho$ are respectively the minimum and maximum of $f$ on $\rho$, we must have $M_\rho - m_\rho < \varepsilon$. Let
$$ \sigma = \sum_{\rho \in \mathcal{S}} M_\rho \chi_{\rho_0}, \qquad \tau = \sum_{\rho \in \mathcal{S}} m_\rho \chi_{\rho_0} $$
where $\rho_0$ is the open rectangle corresponding to $\rho$. Then $\sigma \ge f \ge \tau$ certainly, and
$$ \int \sigma = \sum_{\rho \in \mathcal{S}} M_\rho \text{Vol}(\rho) \le \sum_{\rho \in \mathcal{S}} (m_\rho + \varepsilon) \text{Vol}(\rho) \le \int \tau + \varepsilon \sum_{\rho \in \mathcal{S}} \text{Vol}(\rho) \le \int \tau + \varepsilon \text{Vol}(R) $$
since $\mathcal{S}$ is a partition of $R$ into rectangles.

The following basic properties of the integral are easily derived.
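Proposition 16. The collection of integrable functions is a vector space and the integral is a linear function. That is:
(i) If $f$ is integrable and $c \in R$, then $cf$ is integrable and $\int cf = c \int f$.
(ii) If $f$, $g$ are integrable, so is $f + g$, and $\int (f + g) = \int f + \int g$.
(iii) Furthermore, if $f \le g$ then $\int f \le \int g$.

Proof. We leave the proof of (i) to the reader. (ii) is certainly true for simple functions. For if $f = \sum a_i \chi_{R_i}$, $g = \sum b_j \chi_{S_j}$, where $R_i$, $S_j$ are rectangles, then $f + g = \sum a_i \chi_{R_i} + \sum b_j \chi_{S_j}$ is also simple, and thus integrable. By Definition 11,
$$ \int (f + g) = \sum a_i \text{Vol}(R_i) + \sum b_j \text{Vol}(S_j) = \int f + \int g $$
More generally, now let $f$, $g$ be any integrable functions. If $\varepsilon > 0$, there are simple functions $\sigma_1$, $\sigma_2$, $\tau_1$, $\tau_2$ such that
$$ \sigma_1 \ge f \ge \sigma_2, \qquad \tau_1 \ge g \ge \tau_2 $$
and
$$ \int \sigma_1 \le \int \sigma_2 + \varepsilon, \qquad \int \tau_1 \le \int \tau_2 + \varepsilon $$
Thus $\sigma_1 + \tau_1 \ge f + g \ge \sigma_2 + \tau_2$, so
$$ \overline{\int} (f + g) \le \int \sigma_1 + \int \tau_1 \le \int (\sigma_2 + \tau_2) + 2\varepsilon \le \underline{\int} (f + g) + 2\varepsilon $$
Since $\varepsilon > 0$ was arbitrary, we obtain $\overline{\int} (f + g) \le \underline{\int} (f + g)$, so $f + g$ is integrable. Finally,
$$ \int (f + g) \le \int \sigma_1 + \int \tau_1 \le \int \sigma_2 + \int \tau_2 + 2\varepsilon \le \int f + \int g + 2\varepsilon $$
so letting $\varepsilon \to 0$,
$$ \int (f + g) \le \int f + \int g $$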
Similarly,
$$ \int (f + g) + 2\varepsilon \ge \int \sigma_1 + \int \tau_1 \ge \int f + \int g $$
so again letting $\varepsilon \to 0$,
$$ \int (f + g) \ge \int f + \int g $$
(iii) Finally, if $f \le g$, then $g - f \ge 0$. But certainly the function which is identically zero is a simple function. Thus $\int (g - f) \ge \underline{\int} (g - f) \ge 0$. By (ii) it follows that $\int g - \int f \ge 0$, or $\int g \ge \int f$.

We shall now give the basic tool for computing integrals: Fubini's theorem. According to that result we can integrate by integrating one variable at a time. For the purpose of showing this, write the variable $(x^1, \ldots, x^n)$ of $R^n$ as $(x, y)$ where $x \in R^{n-1}$ and $y \in R$: $x = (x^1, \ldots, x^{n-1})$, $y = x^n$. Let $f$ be a function defined on a rectangle $R$ in $R^n$, and suppose for each $y$ fixed, $f(x, y)$ is an integrable function of $x$. Define $F(y) = \int f(x, y)\,dx$. If $F$ is an integrable function of $y$, its integral
$$ \int F(y)\,dy = \int \left[ \int f(x, y)\,dx \right] dy $$
is called the iterated integral of $f$. We shall now show that if $f$ is integrable this is the same as $\int f$. More generally (after applying this principle $n$ times), if all functions appearing in the following formula are integrable, then the formula is valid:
$$ \int f(x^1, \ldots, x^n)\,dx^1 \cdots dx^n = \int \left[ \int \cdots \left[ \int f(x^1, \ldots, x^n)\,dx^1 \right] dx^2 \cdots \right] dx^n \qquad (2.22) $$
This follows from Fubini's theorem.

Theorem 2.12. Let $f$ be an integrable function on a rectangle $R$ in $R^n$. We refer to the coordinates of $R^n$ as $(x, y)$, where $x \in R^k$, $y \in R^{n-k}$.
(i) These functions of $y$, $\underline{\int} f(x, y)\,dx$, $\overline{\int} f(x, y)\,dx$, are integrable.
(ii) These functions of $x$, $\underline{\int} f(x, y)\,dy$, $\overline{\int} f(x, y)\,dy$, are integrable.
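(iii) $\int f$ is given by any iterated integral of $f$; for example,
$$\int f(x, y)\,dx\,dy = \int \left[ \int f(x, y)\,dx \right] dy = \int \left[ \int f(x, y)\,dy \right] dx$$
where each inner integral may be taken as either the upper or the lower integral.

Proof. It is easily verified that the collection of functions for which the assertions (i), (ii), and (iii) are true is a vector space. Furthermore, these assertions are obvious for the characteristic function of a rectangle. Thus, Fubini's theorem holds for simple functions. Now, suppose $f$ is a bounded, real-valued function on the given rectangle $R$, and suppose that $\sigma$ is a simple function, and $f \ge \sigma$. By definition of the lower integral with respect to the $x$ coordinate,
$$\underline{\int} f(x, y)\,dx \ge \int \sigma(x, y)\,dx$$
Now this inequality is maintained after taking the lower integrals with respect to $y$, thus
$$\underline{\int} \left[ \underline{\int} f(x,y)\,dx \right] dy \ge \int \left[ \int \sigma(x,y)\,dx \right] dy = \int \sigma(x,y)\,dx\,dy \qquad (2.23)$$
since Theorem 2.12 is true for simple functions. Equation (2.23) being true for any $\sigma \le f$, we can take the least upper bound on the right, obtaining
$$\underline{\int} \left[ \underline{\int} f(x,y)\,dx \right] dy \ge \underline{\int} f(x, y)\,dx\,dy$$
Now, by considering simple functions $\sigma$ such that $\sigma \ge f$ and applying the same kind of reasoning, we obtain this inequality:
$$\overline{\int} \left[ \overline{\int} f(x,y)\,dx \right] dy \le \overline{\int} f(x,y)\,dx\,dy$$
As a result, we obtain this string of inequalities, which is valid for any bounded, real-valued function on $R$:
$$\overline{\int} f\,dx\,dy \;\ge\; \overline{\int}\left[\overline{\int} f\,dx\right]dy \;\ge\; \left\{ \begin{matrix} \underline{\int}\left[\overline{\int} f\,dx\right]dy \\[4pt] \overline{\int}\left[\underline{\int} f\,dx\right]dy \end{matrix} \right\} \;\ge\; \underline{\int}\left[\underline{\int} f\,dx\right]dy \;\ge\; \underline{\int} f\,dx\,dy \qquad (2.24)$$
(The second and third inequalities follow immediately from the fact that the upper integral always dominates the lower integral.) Now, if $f$ is indeed integrable, the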
first and last terms of (2.24) are the same, so all are the same. That the second and top third are equal implies that $\overline{\int} f(x,y)\,dx$ is integrable as a function of $y$. That the bottom third and fourth are equal says that $\underline{\int} f(x, y)\,dx$ is integrable. The equation
$$\int f(x,y)\,dx\,dy = \int \left[ \int f(x,y)\,dx \right] dy$$
now just states the equality of the end terms with the interior terms.

Now we shall illustrate the use of Fubini's theorem. Before doing that, we should remark that we rarely have the occasion to integrate functions defined on a rectangle; more often such a function is defined or considered only on a given measurable domain $D$. We make the following definition.

Definition 14. Let $D$ be a domain contained in a rectangle $R$. Given a function $f$ defined on $D$, we say $f$ is integrable if this is so for the function $\tilde f$ defined on $R$ by
$$\tilde f(x) = f(x) \ \text{ if } x \in D, \qquad \tilde f(x) = 0 \ \text{ if } x \in R,\ x \notin D$$
We define $\int_D f = \int \tilde f$.

If $D$ is a subdomain of a rectangle $R$ bounded by a surface which is the graph of a function, or has some other redeeming property, then the function $\tilde f$ will be integrable if $f$ is. We shall not pursue this theoretical inquiry, but rather tacitly assume our domains are redeemable.

Example
39. $D = \{(x,y): 0 \le y \le x^2,\ 0 \le x \le 1\}$, $f(x, y) = x^2 + y^2$. Define $\tilde f(x, y) = x^2 + y^2$ if $(x, y) \in D$, and $\tilde f(x, y) = 0$ otherwise. Then
$$\int_D f = \int_R \tilde f = \int_{-1}^{1} \left[ \int_{-1}^{1} \tilde f(x, y)\,dy \right] dx = \int_0^1 \left[ \int_0^{x^2} (x^2 + y^2)\,dy \right] dx$$
since, for fixed $x$, $\tilde f(x, y)$ is zero if $x < 0$, $y < 0$, or $y > x^2$, and otherwise is $x^2 + y^2$. We thus obtain
$$\int_0^1 \left[ x^2 y + \frac{y^3}{3} \right]_0^{x^2} dx = \int_0^1 \left( x^4 + \frac{x^6}{3} \right) dx = \frac{1}{5} + \frac{1}{21} = \frac{26}{105}$$
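The value 26/105 in Example 39 can be checked numerically. The following is a minimal sketch, not part of the text: the helper names `simpson` and `iterated_integral` are hypothetical, and the composite Simpson rule is an assumed choice of one-variable quadrature. It integrates first in $y$ and then in $x$, and then in the opposite order, which Fubini's theorem says must give the same value.

```python
def simpson(func, lo, hi, n=200):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def iterated_integral(integrand, a, b, lower, upper, n=200):
    """Integrate integrand(outer, inner) over a <= outer <= b,
    lower(outer) <= inner <= upper(outer): inner variable first."""
    def inner_integral(outer):
        return simpson(lambda inner: integrand(outer, inner),
                       lower(outer), upper(outer), n)
    return simpson(inner_integral, a, b, n)


def f(x, y):
    return x * x + y * y


# Order of Example 39: 0 <= x <= 1, 0 <= y <= x^2.
dy_first = iterated_integral(f, 0.0, 1.0, lambda x: 0.0, lambda x: x * x)

# Opposite order: 0 <= y <= 1, sqrt(y) <= x <= 1.
dx_first = iterated_integral(lambda y, x: f(x, y), 0.0, 1.0,
                             lambda y: y ** 0.5, lambda y: 1.0)

print(dy_first, dx_first, 26 / 105)
```

The second order converges more slowly because the lower limit $\sqrt{y}$ is not smooth at $y = 0$, so the two printed values agree with 26/105 to different accuracy.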
Figure 2.10

Let us do the same example, iterating this time in the other order.
$$\int_D f = \int_0^1 \left[ \int_{-1}^{1} \tilde f(x,y)\,dx \right] dy = \int_0^1 \left[ \int_{\sqrt{y}}^{1} (x^2 + y^2)\,dx \right] dy = \frac{1}{3} + \frac{1}{3} - \frac{2}{15} - \frac{2}{7} = \frac{26}{105}$$
The general technique can be described as follows: Try to write the domain in either of these forms (Figure 2.10):
$$D = \{(x, y): a \le x \le b,\ g(x) \le y \le f(x)\}$$
or (Figure 2.11)
$$D = \{(x, y): a \le y \le b,\ \phi(y) \le x \le \psi(y)\}$$
Then, given a function $h$ defined on $D$, we can write
$$\int_D h = \int_a^b \left[ \int_{g(x)}^{f(x)} h(x,y)\,dy \right] dx$$
in the first case; and in the second
$$\int_D h = \int_a^b \left[ \int_{\phi(y)}^{\psi(y)} h(x,y)\,dx \right] dy$$
Of course, if neither case can be obtained, then $D$ might have to be broken up into pieces in each of which either representation is possible.

Figure 2.11

The computation of integrals in more than two dimensions is done in pretty much the same way, but with a certain amount of additional care. For example, one should try to pick out one of the coordinates, say $z$, so that the given domain takes the form $g(y) \le z \le f(y)$, where $y$ represents all the other coordinates and ranges through some domain $D_0$. Now one proceeds to break down $D_0$ in the same way.

Examples
40. $D = \{(x, y, z): x^2 + y^2 + z^2 \le 1,\ x \ge 0,\ y \ge 0,\ z \ge 0\}$, $f(x, y, z) = xyz$. Now $z$ ranges between $0$ and $(1 - (x^2 + y^2))^{1/2}$, so
$$D = \{(x, y, z): x^2 + y^2 \le 1,\ 0 \le x,\ 0 \le y,\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
Thus, continuing the analysis of
$$D_0 = \{(x, y): x^2 + y^2 \le 1,\ 0 \le x,\ 0 \le y\}$$
we find
$$D = \{(x, y, z): 0 \le x \le 1,\ 0 \le y \le (1 - x^2)^{1/2},\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
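and
$$\int_D f = \int_0^1 \left[ \int_0^{(1-x^2)^{1/2}} \left[ \int_0^{[1-(x^2+y^2)]^{1/2}} xyz\,dz \right] dy \right] dx = \frac{1}{2} \int_0^1 x \left[ \int_0^{(1-x^2)^{1/2}} y\,(1 - (x^2 + y^2))\,dy \right] dx$$
$$= \frac{1}{2} \int_0^1 \left[ \frac{x(1-x^2)^2}{2} - \frac{x(1-x^2)^2}{4} \right] dx = \frac{1}{8} \int_0^1 x (1 - x^2)^2\,dx = \frac{1}{48}$$
41. $D = \{(x, y, z): x^2 + y^2 + z^2 \le 1,\ (x - \tfrac12)^2 + y^2 \le \tfrac14,\ x \ge 0,\ y \ge 0,\ z \ge 0\}$, $f(x, y, z) = 1$ (see Figure 2.12). We may rewrite this domain as
$$D = \{(x, y, z): (x - \tfrac12)^2 + y^2 \le \tfrac14,\ x \ge 0,\ y \ge 0,\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
$$= \{(x, y, z): 0 \le x \le 1,\ 0 \le y \le [\tfrac14 - (x - \tfrac12)^2]^{1/2},\ 0 \le z \le [1 - (x^2 + y^2)]^{1/2}\}$$
Thus
$$\mathrm{Vol}(D) = \int_0^1 \left[ \int_0^{[\frac14 - (x-\frac12)^2]^{1/2}} \left[ \int_0^{[1-(x^2+y^2)]^{1/2}} dz \right] dy \right] dx$$

Figure 2.12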
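The reduction of Example 40 to three nested one-variable integrals can be sketched in code. This is a hedged illustration rather than the text's method: `simpson`, `root`, and `integrate_over_octant` are hypothetical names, Simpson's rule is an assumed quadrature, and `root` clips tiny negative rounding errors before taking a square root. The limits of integration are exactly those of the description of $D$ in Example 40.

```python
import math


def simpson(func, lo, hi, n=40):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(lo + i * h)
    return total * h / 3


def root(value):
    """Square root that treats small negative rounding errors as zero."""
    return math.sqrt(max(value, 0.0))


def integrate_over_octant(f, n=40):
    """Integrate f over the part of the unit ball with x, y, z >= 0,
    one coordinate at a time: z first, then y, then x."""
    def over_z(x, y):
        return simpson(lambda z: f(x, y, z), 0.0, root(1 - x * x - y * y), n)

    def over_y(x):
        return simpson(lambda y: over_z(x, y), 0.0, root(1 - x * x), n)

    return simpson(over_y, 0.0, 1.0, n)


value = integrate_over_octant(lambda x, y, z: x * y * z)
print(value, 1 / 48)
```

For this integrand the two inner integrals are polynomials of degree at most three in the variable of integration, where Simpson's rule is exact, so only the outermost integration contributes a small quadrature error.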
Integration is clearly of value in computing volumes; it also plays a role in the study of mass. Suppose $E$ is a domain in $R^3$ filled with a certain fluid. If $D$ is any subdomain in $E$, we shall let $\mathrm{mass}(D)$ be the mass of the fluid contained in $D$. What information do we need in order to compute $\mathrm{mass}(D)$, and how do we compute it? The answer is suggested by comparison of the properties of mass with those of volume. In fact, it is clear that the intuitive properties of mass are the same as the properties of volume; so we should also expect to be able to compute masses by integration. In fact, we introduce the notion of density: for $x_0 \in E$, the density $\sigma(x_0)$ of the fluid at $x_0$ is the limit
$$\sigma(x_0) = \lim_{R \to x_0} \frac{\mathrm{mass}(R)}{\mathrm{Vol}(R)}$$
where we mean by $R \to x_0$ that $x_0$ is in the rectangle $R$, and the lengths of the sides of $R$ tend to zero (we might call $\mathrm{mass}(R)/\mathrm{Vol}(R)$ the relative density of the fluid in the rectangle $R$). Now, the mass of the fluid in any domain is computable in terms of this density function $\sigma$. Suppose $D$ is such a domain and $\{R_i\}$ is a collection of pairwise disjoint rectangles in $D$ and almost filling $D$. Then
$$\sum_i \frac{\mathrm{mass}(R_i)}{\mathrm{Vol}(R_i)}\,\mathrm{Vol}(R_i)$$
is an approximation to $\mathrm{mass}(D)$, and as the size of the rectangles gets smaller and smaller, the approximation gets better. On the other hand, this sum is the integral of a simple function approximating $\sigma$, and thus approximates $\int_D \sigma$. Taking the limit we obtain $\mathrm{mass}(D) = \int_D \sigma$.

• EXERCISES
15. Compute the volume of these domains:
(a) $\{(x,y) \in R^2: x^2 + y^2 \le 1\}$.
(b) $\{(x,y) \in R^2: x^2 \le y \le 1\}$.
(c) $\{(x,y,z) \in R^3: 0 \le x \le 1,\ 0 \le y \le 1,\ 0 \le z \le x^2 + y^2\}$.
(d) $\{(x,y,z) \in R^3: -1 \le x \le 1,\ 0 \le y \le 2,\ y \le z \le y + x^2\}$.
16. Verify that the volume of a right circular cylinder of radius $r$ and height $h$ is $\pi r^2 h$.
17. Integrate the function $f$ on the unit rectangle $[(0, 0), (1, 1)]$ in $R^2$.
(a) $f(x, y) = x \cos 2\pi y$.
(b) $f(x,y) = |(x - \tfrac12)(y - \tfrac12)|$.
(c) $f(x,y) = x e^{y} + y e^{-x}$.
184 2. Notions of Calculus (d) f(x,y) (e) f(x,y) χ ifx<,y у ify<x. x + y ifx + y^\ 1 ifx + y>\. (f) f(x,y) = 0 + x2 + y2y2· 18 Integrate the function / on the domain Din/?2. (a) D = {(*, y): 0 ^x, 0 <y, χ + у ^ l},f(x, y) = x2 + y2. (b) D = {(x,y):0^y<x^\},f(x,y) = xy2 (C) D = {(л:, у):0<у^х^ \},f(x, у) = л:2г· (d) D = {(x, r): x2 + У2 < 1}, fix, У) = (*2 - У2)2· 19. Integrate the function/on the domain D in R3. (a) D is the intersection of the unit ball with the octant {x>0,y>0, ζ > 0} and f(x, y, z) = χ + у + ζ. (b) D is as above and f(x, y, z) = xyz. (c) Z> is the unit cube in the first octant and f(x, y, z) = x2 + y2 + z1 (d) D is the domain in the first octant bounded by the coordinate axes and the plane χ + у + ζ = 1 and f(x, у, ζ) = ζ. • PROBLEMS 36. Verify that the integral on R" as defined in this section coincides, when и = 1, with the Riemann integral defined in the previous section. 37. Let/be a bounded, nonnegative, real-valued function defined on the interval /, and let D = {(x,y) e R2; xe I, 0<,y^f(x)}. Verify this assertion: /is integrable if and only if D is measurable, and J, /= Vol(D). 38. Use Problem 37 to verify this. Let D be a domain in R2 and suppose that D is of the form {(x,y)eR2:a^x^b, g(x) ^ у ^f(x)} Then, if D is measurable, Vol (D) = JS [f(x) - g(x)] dx. 39. Complete the proof of Fubini's theorem by verifying the second and third inequalities of Equation (2.24). 40. State and prove Fubini's theorem in three dimensions. 41. Suppose the unit ball is filled with a fluid whose density is proportional to the distance to the boundary. Find the radius of the ball centered at the origin which has precisely half the mass. 42. Suppose a cone of base radius r and height h is filled with mud (Figure 2.13). Suppose the density of the mud is equal to the distance from the base. What is the mass of the mud ? 43. 
A beach $B$ is shaped in the form of a crescent (see Figure 2.14),
$$B = \{(x, y): 1 \le x^2 + y^2,\ (x - \tfrac12)^2 + y^2 \le 1\}$$
and the human density $\sigma$ increases with the distance from the water. More precisely, $\sigma(x, y) = (x^2 + y^2)^{-1}$. What is the mass of humanity on that beach?
Figure 2.13

Figure 2.14

2.8 Partial Differentiation

Although the integral in $R^n$ is defined without reference to the coordinates, it is computed by a succession of integrations, one coordinate at a time. The notion of differentiation is, to begin with, generalized to $R^n$ one coordinate at a time. Later we shall see how to build out of this generalization an invariant notion of derivation.

Let $x_0 \in R^n$, and suppose that $f$ is a real-valued function defined in a neighborhood of $x_0$. For each $i$ consider the function of the single variable $x^i$ given by
$$f(x_0^1, \ldots, x^i, \ldots, x_0^n)$$
If this function is differentiable, we denote the derivative by $\partial f/\partial x^i$, and call it the partial derivative of $f$ in the $x^i$ direction. More precisely,

Definition 15. Let $f$ be a real-valued function defined in a neighborhood of $x_0$ in $R^n$. The partial derivative of $f$ with respect to $x^i$ at $x_0$ is the limit
$$\frac{\partial f}{\partial x^i}(x_0) = \lim_{t \to 0} \frac{f(x_0^1, \ldots, x_0^i + t, \ldots, x_0^n) - f(x_0^1, \ldots, x_0^n)}{t}$$
Another way of describing the partial derivative is this. Consider the function $f$ only as a function on the line through $x_0$ in the $E_i$ direction. This restriction is a function of one variable and $\partial f/\partial x^i$ is its derivative. These partial derivatives are computed merely by considering all but the relevant variable as constant.

Examples
42. $f(x, y) = xy$:
$$\frac{\partial f}{\partial x}(x, y) = y \qquad \frac{\partial f}{\partial y}(x, y) = x$$
43.
$$\frac{\partial}{\partial x}(x^2 y) = 2xy \qquad \frac{\partial}{\partial y}(x^2 y) = x^2$$
44. $f(x, y) = \cos[x(1 + y)]$:
$$\frac{\partial f}{\partial x}(x, y) = -(1 + y) \sin[x(1 + y)] \qquad \frac{\partial f}{\partial y}(x,y) = -x \sin[x(1 + y)]$$
45. $f(x, y) = x^y$:
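$$\frac{\partial f}{\partial x}(x,y) = y x^{y-1} \qquad \frac{\partial f}{\partial y}(x, y) = x^y \ln x$$
Of course, if the functions
$$\frac{\partial f}{\partial x^1}, \ldots, \frac{\partial f}{\partial x^n}$$
are also defined in a neighborhood of $x_0$, we may subject them to further partial differentiation, and keep going in this way as far as possible. We shall refer to any such operation as a partial differentiation and call its order the number of individual partial derivatives involved. Thus, the order of
$$\frac{\partial}{\partial x^i}\left(\frac{\partial f}{\partial x^j}\right)$$
is 2; the order of
$$\frac{\partial^2}{\partial x^2}\left(\frac{\partial}{\partial y}\left(\frac{\partial^3 f}{\partial z^3}\right)\right)$$
is 6. We introduce a notational convention which deletes parentheses:
$$\frac{\partial^2 f}{\partial x^2} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) \qquad \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) \qquad \frac{\partial^2 f}{\partial x^i\,\partial x^j} = \frac{\partial}{\partial x^i}\left(\frac{\partial f}{\partial x^j}\right)$$
$$\frac{\partial^3 f}{\partial x^i\,\partial x^j\,\partial x^k} = \frac{\partial}{\partial x^i}\left(\frac{\partial}{\partial x^j}\left(\frac{\partial f}{\partial x^k}\right)\right) \qquad \frac{\partial^6 f}{\partial x^2\,\partial y\,\partial z^3} = \frac{\partial}{\partial x}\left(\frac{\partial^5 f}{\partial x\,\partial y\,\partial z^3}\right)$$
and so forth.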
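The formulas of Examples 44 and 45 can be compared with the limit in Definition 15. The sketch below is an illustration under stated assumptions, not part of the text: the helper `partial` is a hypothetical name, it uses a symmetric difference quotient with a small fixed step instead of the one-sided quotient in the definition, and the test point $(0.7, 0.3)$ is arbitrary.

```python
import math


def partial(f, point, i, h=1e-6):
    """Symmetric difference quotient approximating the i-th partial
    derivative of f at point (all other coordinates held constant)."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


x, y = 0.7, 0.3


def f44(x, y):
    return math.cos(x * (1 + y))


def f45(x, y):
    return x ** y


# Pairs (numerical estimate, formula from the examples).
checks = [
    (partial(f44, (x, y), 0), -(1 + y) * math.sin(x * (1 + y))),
    (partial(f44, (x, y), 1), -x * math.sin(x * (1 + y))),
    (partial(f45, (x, y), 0), y * x ** (y - 1)),
    (partial(f45, (x, y), 1), x ** y * math.log(x)),
]

for estimate, formula in checks:
    print(estimate, formula)
```

Each printed pair should agree to several decimal places; the difference quotient varies only the coordinate named by `i`, which is exactly the "all other variables constant" rule used in the examples.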
Suppose now that $f$ is a function defined in an open set $N$ in $R^n$ and that $\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ all exist in $N$. If we hold all the variables constant except one, say $x^i$, then $\partial f/\partial x^i$ is just the derivative of $f$ along this line. Thus, if $\partial f/\partial x^i = 0$, $f$ is constant along the line on which only $x^i$ varies. In such circumstances we say that $f$ is independent of $x^i$, since $f$ does not vary as $x^i$ alone varies. If, moreover, $\partial f/\partial x^i$ is zero at all points of $N$ for all $i$, then $f$ depends on none of the variables, so is constant. As this is an important observation, we record it.

Proposition 17. Suppose that $f$ is a real-valued function defined in a neighborhood of $x_0$ in $R^n$. $f$ is constant near $x_0$ if and only if all the derivatives $\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ exist and are zero near $x_0$.

Proof. If $f$ is constant, it is obvious that $\partial f/\partial x^i = 0$ for all $i$. On the other hand, suppose that these conditions are valid in a ball $B(x_0, r)$ centered at $x_0$. Let $y = (y^1, \ldots, y^n) \in B(x_0, r)$. We will show that $f(y) = f(x_0)$. Figure 2.15 illustrates the proof.

Figure 2.15

Consider the function of $x^n$:
$$f(x_0^1, \ldots, x_0^{n-1}, x^n)$$
This function has derivative zero by hypothesis, so is constant. Thus,
$$f(x_0^1, \ldots, x_0^{n-1}, x_0^n) = f(x_0^1, \ldots, x_0^{n-1}, y^n)$$
Now, the function of $x^{n-1}$,
$$f(x_0^1, \ldots, x_0^{n-2}, x^{n-1}, y^n)$$
also has derivative zero, and thus must be constant, so
$$f(x_0^1, \ldots, x_0^{n-2}, x_0^{n-1}, y^n) = f(x_0^1, \ldots, x_0^{n-2}, y^{n-1}, y^n)$$
This together with the preceding equation gives
$$f(x_0^1, \ldots, x_0^{n-1}, x_0^n) = f(x_0^1, \ldots, x_0^{n-2}, y^{n-1}, y^n)$$
Continuing in this way, we can replace each $x_0^j$ by the corresponding $y^j$ one at a time, ending up with the desired equation $f(x_0) = f(y)$.

As far as the higher order differentiations are concerned, there is one basic fact we should now verify. This is that each partial differentiation depends only on the number of derivatives with respect to each coordinate, and not on the order in which they are performed. For example,
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x} \qquad (2.25)$$
$$\frac{\partial^5 f}{\partial x\,\partial y\,\partial z\,\partial z\,\partial x} = \frac{\partial^5 f}{\partial y\,\partial x^2\,\partial z^2} = \frac{\partial^5 f}{\partial y\,\partial x\,\partial z\,\partial z\,\partial x}$$
We shall verify only the first equation, it being clear that all others follow from a succession of applications of the first one. The verification of (2.25) amounts to an interesting application of Fubini's theorem.

Theorem 2.13. Let $f$ be a real-valued function defined in a neighborhood $N$ of $(x_0, y_0)$ in $R^2$ and suppose that all first- and second-order partial derivatives of $f$ exist and are continuous on $N$. Then
$$\frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 f}{\partial y\,\partial x}$$
throughout $N$.

Proof. We apply Fubini's theorem to $\partial^2 f/\partial x\,\partial y$ in a sufficiently small rectangle $R = ((x_0, y_0), (s, t))$ contained in $N$ (see Figure 2.16):
$$\int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}\,dy \right] dx = \int_{y_0}^{t} \left[ \int_{x_0}^{s} \frac{\partial^2 f}{\partial x\,\partial y}\,dx \right] dy \qquad (2.26)$$
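Figure 2.16

Now, we can easily evaluate the integral on the right-hand side. For fixed $y$,
$$\int_{x_0}^{s} \frac{\partial^2 f}{\partial x\,\partial y}(x, y)\,dx = \int_{x_0}^{s} \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right)(x,y)\,dx = \frac{\partial f}{\partial y}(s, y) - \frac{\partial f}{\partial y}(x_0, y) \qquad (2.27)$$
Integrating once again (this time with respect to $y$) we obtain from Equations (2.26) and (2.27)
$$\int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}\,dy \right] dx = \int_{y_0}^{t} \frac{\partial}{\partial y}\,[f(s,y) - f(x_0,y)]\,dy = f(s, t) - f(x_0, t) - [f(s, y_0) - f(x_0, y_0)] \qquad (2.28)$$
Now, we can differentiate this equation with respect to $s$ first, and then $t$. By the fundamental theorem of calculus, we know how to differentiate the integral on the left with respect to the upper limit of integration:
$$\frac{\partial}{\partial s} \int_{x_0}^{s} \left[ \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(x, y)\,dy \right] dx = \int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(s, y)\,dy$$
Then, from (2.28),
$$\int_{y_0}^{t} \frac{\partial^2 f}{\partial x\,\partial y}(s, y)\,dy = \frac{\partial f}{\partial x}(s, t) - \frac{\partial f}{\partial x}(s, y_0)$$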
Differentiating this equation now with respect to $t$, we obtain
$$\frac{\partial^2 f}{\partial x\,\partial y}(s, t) = \frac{\partial^2 f}{\partial y\,\partial x}(s, t)$$
as desired.

Another important application of Fubini's theorem is this result, which allows us to differentiate under the integral sign.

Proposition 18. Suppose that $f$ is a continuously differentiable function of two variables $x$ and $y$, $a \le x \le b$, and $y \in D$, a domain in $R^n$. Define the function $F$ on the interval $[a, b]$ by
$$F(x) = \int_D f(x,y)\,dy$$
Then $F$ is differentiable and
$$\frac{dF}{dx}(x) = \int_D \frac{\partial f}{\partial x}(x, y)\,dy$$
Proof. We shall show that $F$ is the indefinite integral of the function
$$\int_D \frac{\partial f}{\partial x}(x, y)\,dy$$
and thus by the fundamental theorem of calculus, the proposition follows. By Fubini's theorem
$$\int_a^t \left[ \int_D \frac{\partial f}{\partial x}(x, y)\,dy \right] dx = \int_D \left[ \int_a^t \frac{\partial f}{\partial x}(x, y)\,dx \right] dy$$
But by the fundamental theorem of calculus, the inner integral on the right is $f(t,y) - f(a,y)$. Thus
$$\int_a^t \left[ \int_D \frac{\partial f}{\partial x}(x, y)\,dy \right] dx = \int_D [f(t, y) - f(a, y)]\,dy = F(t) - F(a)$$
Let us return now to the consideration of the first-order derivatives. These are obtained by differentiating after restricting the function to lines parallel
to the coordinate axes. We generalize this notion to allow differentiation along any line. That is, we make this definition.

Definition 16. Let $\mathbf{x}_0 \in R^n$ and suppose $f$ is a real-valued function defined in a neighborhood of $\mathbf{x}_0$. If $\mathbf{v}$ is a vector in $R^n$, we define the directional derivative $df(\mathbf{x}_0, \mathbf{v})$ to be
$$\frac{d}{dt} f(\mathbf{x}_0 + t\mathbf{v}) \Big|_{t=0}$$
This is clearly the same as
$$\lim_{t \to 0} \frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t}$$
We leave it as an exercise to verify that
$$\frac{\partial f}{\partial x^i}(\mathbf{x}_0) = df(\mathbf{x}_0, \mathbf{E}_i) \qquad (2.29)$$
Now, in certain pathological cases the directional derivatives need not hang together in any nice way, but typically we need only know the partial derivatives in order to find any directional derivative.

Proposition 19. Suppose $f$ is defined in a neighborhood of $\mathbf{x}_0$ and the partial derivatives $\partial f/\partial x^1, \ldots, \partial f/\partial x^n$ all exist and are continuous near $\mathbf{x}_0$. Then the directional derivatives $df(\mathbf{x}_0, \mathbf{v})$ vary linearly in $\mathbf{v}$.

Proof. The argument consists in looking at the difference
$$f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)$$
one variable at a time. In order to expose the idea without encumbering the argument with a pile of indices, we consider the two-variable case, with $\mathbf{x}_0 = (x_0, y_0)$ and $\mathbf{v} = (h, k)$. Write the difference as
$$f(x_0 + th, y_0 + tk) - f(x_0, y_0) = \{f(x_0 + th, y_0 + tk) - f(x_0 + th, y_0)\} + \{f(x_0 + th, y_0) - f(x_0, y_0)\}$$
We can find a better expression for the term in the second set of braces by applying the mean value theorem to the function $f(s, y_0)$ of $s$. That is, there is a $\xi_0$ between $x_0$ and $x_0 + th$ such that
$$f(x_0 + th, y_0) - f(x_0, y_0) = \frac{\partial f}{\partial x}(\xi_0, y_0)\,th$$
Similarly, by applying the mean value theorem to the function $f(x_0 + th, s)$, we can rewrite the term in the first set of braces as
$$\frac{\partial f}{\partial y}(x_0 + th, \eta_0)\,tk$$
for some $\eta_0$ between $y_0$ and $y_0 + tk$. Thus, we have for suitable $(\xi_0, \eta_0)$ in the rectangle $[(x_0, y_0), (x_0 + th, y_0 + tk)]$,
$$\frac{f(\mathbf{x}_0 + t\mathbf{v}) - f(\mathbf{x}_0)}{t} = \frac{\partial f}{\partial x}(\xi_0, y_0)\,h + \frac{\partial f}{\partial y}(x_0 + th, \eta_0)\,k$$
Letting $t \to 0$, we obtain by continuity that
$$df((x_0, y_0), (h, k)) = \frac{\partial f}{\partial x}(x_0, y_0)\,h + \frac{\partial f}{\partial y}(x_0, y_0)\,k \qquad (2.30)$$
Thus the proposition is verified, at least in $R^2$.

This linear function $df(\mathbf{x}_0, \mathbf{v})$ of the vector $\mathbf{v}$ in $R^n$ is called the differential of $f$ at $\mathbf{x}_0$. We will make a systematic study of this in a later chapter. The vector-valued function
$$\left( \frac{\partial f}{\partial x^1}, \ldots, \frac{\partial f}{\partial x^n} \right)$$
is called the gradient of $f$ and is denoted by $\nabla f$. It is clear from Proposition 19 that the generalization of (2.30) to $n$ variables is
$$df(\mathbf{x}_0, \mathbf{v}) = \sum_{i} \frac{\partial f}{\partial x^i}(\mathbf{x}_0)\,v^i = \langle \mathbf{v}, \nabla f(\mathbf{x}_0) \rangle \qquad (2.31)$$
The gradient behaves as a sort of "total derivative." It is not as powerful in the analysis of a function as the derivative in one variable and it is somewhat more cumbersome, but it does provide a similar kind of tool. For example,
Proposition 20. The gradient of a function vanishes at any point at which it attains a maximum or minimum value.

Proof. If $\mathbf{x}_0 = (x_0^1, \ldots, x_0^n)$ is (for instance) a point at which $f$ attains a maximum, then $f(x_0^1, \ldots, x^i, \ldots, x_0^n)$, as a function of $x^i$, attains a maximum at $x_0^i$. Thus, $\partial f/\partial x^i$ vanishes at $\mathbf{x}_0$. Since this is true for all $i$, $\nabla f(\mathbf{x}_0) = 0$.

Examples
46. Consider $f(x, y) = x^2 + xy + y^2$.
$$\nabla f = (2x + y,\ x + 2y)$$
Thus $\nabla f$ is zero when
$$2x + y = 0 \qquad x + 2y = 0$$
that is, only at the origin. This is the only critical point, and a minimum at that.
47. $f(x, y, z) = x \cos y + z$.
$$\nabla f = (\cos y,\ -x \sin y,\ 1)$$
is never zero, so $f$ has no critical points.
48. $f(x, y, z) = x \cos(yz)$.
$$\nabla f = (\cos(yz),\ -xz \sin(yz),\ -xy \sin(yz))$$
$\nabla f$ is zero only when $x = 0$ and $yz = \pi(n + \tfrac12)$ for some integer $n$. Clearly, $f$ has both negative and positive values near any point of the plane $\{x = 0\}$, where $f$ vanishes, so no such point is a maximum or a minimum. Thus, $f$ has no maximum or minimum points.

• EXERCISES
20. Find the first partial derivatives of these functions.
(a) $xyz$
(b) $\sin(xy)$
(c) $x^{y^z}$
(d) $x^2 y + y^2 x$
21. Differentiate $x^{x^x}$. (Hint: This is the same as finding the directional derivative of $x^{y^z}$ at a point $(x, x, x)$ in the direction of $(1, 1, 1)$.)
22. If $f$ is differentiable at $\mathbf{x}_0$, then
$$\frac{\partial f}{\partial x^i}(\mathbf{x}_0) = df(\mathbf{x}_0, \mathbf{E}_i)$$
for all $i$.
23. Suppose that $f$, $g$ are differentiable at $\mathbf{x}_0$ in $R^n$. Show that $fg$ is also differentiable and $\nabla(fg)(\mathbf{x}_0) = f(\mathbf{x}_0)\nabla g(\mathbf{x}_0) + g(\mathbf{x}_0)\nabla f(\mathbf{x}_0)$.
24. If $f$ is differentiable at $\mathbf{x}_0$, and $f(\mathbf{x}_0) \ne 0$, then
$$\nabla\left(\frac{1}{f}\right)(\mathbf{x}_0) = -\frac{1}{f(\mathbf{x}_0)^2}\,\nabla f(\mathbf{x}_0)$$
25. What is the minimum of $x^2 + y^2 + (2y + 1)^2$?
26. What is the maximum of
$$\frac{x + 3y}{1 + x^2 + y^2}\ ?$$
27. Compute the differentials of the functions in Exercise 20.

• PROBLEMS
44. Suppose $f$ is a differentiable function of two variables and $g_1, g_2$ are differentiable functions of one variable so that the range of $(g_1, g_2)$ is in the domain of $f$. Find the derivative of $h(t) = f(g_1(t), g_2(t))$.
45. Let $f$ be a differentiable function of two variables. Show that $f$ is a function of $x - y$ alone if and only if $\partial f/\partial x + \partial f/\partial y = 0$.
46. Suppose that $L: R^n \to R$ is a linear function. What is $\nabla L$?
47. Let $T: R^n \to R^n$ be a linear transformation. Define the function on $R^n \times R^n$: $f(\mathbf{x}, \mathbf{y}) = \langle T\mathbf{x}, \mathbf{y} \rangle$. Show that $f$ is differentiable, and $\nabla f(\mathbf{x}, \mathbf{y}) = (T^t \mathbf{y}, T\mathbf{x})$ (recall that $T^t$ is the transpose of $T$: if $T$ is represented by the matrix $(a_j^i)$, then $T^t$ is represented by $(b_j^i)$ where $b_j^i = a_i^j$).
48. If $T: R^n \to R^n$ is a linear transformation, then the function $g(\mathbf{x}) = \langle T\mathbf{x}, \mathbf{x} \rangle$ is differentiable, and $\nabla g(\mathbf{x}) = T\mathbf{x} + T^t\mathbf{x}$.

2.9 Improper Integrals

We return now to the study of functions of one variable; in fact, we will be considering functions defined on the whole real line. Our interest will focus on the "behavior at infinity" of such functions. For this purpose we introduce the notion of $\lim f(x)$ as $x \to \infty$.
Definition 17. If $f$ is a real-valued function defined in an infinite interval $\{x: x > a\}$ we say that $f(x)$ converges to $L$ as $x$ becomes infinite, written
$$\lim_{x \to \infty} f(x) = L$$
if, for every $\varepsilon > 0$ there is an $M > 0$ such that $x > M$ implies $|f(x) - L| < \varepsilon$. Similarly, if $f$ is defined in $\{x: x < b\}$ we say
$$\lim_{x \to -\infty} f(x) = L$$
(the limit of $f(x)$ is $L$ as $x$ becomes negatively infinite) if, for every $\varepsilon > 0$ there is an $M > 0$ such that $x < -M$ implies $|f(x) - L| < \varepsilon$.

Examples
49. $\lim_{x \to \infty} 1/x = 0$. For given $\varepsilon > 0$, we can take $M = \varepsilon^{-1}$. Then $x > M$ implies $|1/x - 0| < \varepsilon$.
50.
$$\lim_{x \to \infty} \frac{4x^2 + 3x + 5}{8x^2 - 7} = \frac{1}{2}$$
For, so long as $x > 0$,
$$\frac{4x^2 + 3x + 5}{8x^2 - 7} = \frac{4 + 3/x + 5/x^2}{8 - 7/x^2} \qquad (2.32)$$
Now, we can compute the desired limit by using the standard algebraic rules (the limit of a sum is the sum of the limits, etc.). (See Exercise 28.) Since $1/x$, $1/x^2$ tend to zero as $x \to \infty$, the limit of (2.32) as $x \to \infty$ is $4/8 = 1/2$.
51.
$$\lim_{x \to \infty} \frac{x|x|}{1 + x^2} = 1 \qquad \lim_{x \to -\infty} \frac{x|x|}{1 + x^2} = -1$$
If $x > 0$,
$$\frac{x|x|}{1 + x^2} = \frac{x^2}{1 + x^2} = \frac{1}{1 + 1/x^2}$$
and if $x < 0$,
$$\frac{x|x|}{1 + x^2} = \frac{-x^2}{1 + x^2} = \frac{-1}{1 + 1/x^2}$$
52. $\lim_{x \to \infty} \arctan x = \pi/2$.

Definition 17 is the analog for functions defined on an infinite interval of the notion of convergence of a sequence (a function defined on the integers). Just as we pass from sequences to series, we can pass from infinite limits of functions to infinite sums, that is, integrals over infinite intervals.

Definition 18. Let $f$ be a continuous function on the interval $\{x: x \ge a\}$. We say $f$ is integrable if $\lim_{x \to \infty} \int_a^x f$ exists, in which case we write the limit as $\int_a^\infty f$. $f$ is absolutely integrable if $\lim_{x \to \infty} \int_a^x |f|$ exists.

Examples
53. $x^{-2}$ is integrable on the interval $[1, \infty)$. For
$$\int_1^m x^{-2}\,dx = -\frac{1}{m} + 1$$
$$\lim_{m \to \infty} \int_1^m x^{-2}\,dx = \lim_{m \to \infty} \left( -\frac{1}{m} + 1 \right) = 1$$
54. $x^{-1} \cos x$ is not absolutely integrable on the interval $[1, \infty)$. For
$$\int_1^\infty \left| \frac{\cos x}{x} \right| dx \ge \sum_{n=1}^{\infty} \int_{2\pi n - \pi/3}^{2\pi n + \pi/3} \frac{\cos x}{x}\,dx$$
Between $2\pi n - \pi/3$ and $2\pi n + \pi/3$, $x^{-1} \cos x \ge (2\pi n + \pi/3)^{-1} \cdot \tfrac12$. Thus,
$$\int_1^\infty \left| \frac{\cos x}{x} \right| dx \ge \sum_{n=1}^{\infty} \frac{1}{2} \cdot \frac{1}{2\pi n + \pi/3} \cdot \frac{2\pi}{3} = \infty$$
The theory of integration on infinite intervals is entirely analogous to the theory of infinite series. We have the following facts (whose counterparts in the theory of series are easily recognized).
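Proposition 21. Let $f$ be continuous on the interval $\{x: x \ge a\}$.
(i) $f$ is absolutely integrable if and only if the set $\{\int_a^x |f|\}$ is bounded.
(ii) If $f$ is absolutely integrable, then $f$ is integrable.
(iii) (Comparison Test). If there exist a $b > a$, a constant $K$, and an integrable positive function $g$ defined on $\{x: x \ge b\}$ such that $Kg \ge |f|$, then $f$ is absolutely integrable.

Proof.
(i) If $|f|$ is integrable, clearly $\{\int_a^x |f|\}$ is bounded. On the other hand, if $\{\int_a^x |f|\}$ is bounded, let $L = \sup\{\int_a^x |f|\}$. Then for $\varepsilon > 0$, $L - \varepsilon$ is not an upper bound, so there exists an $x_0$ such that $\int_a^{x_0} |f| \ge L - \varepsilon$. Then for all $x \ge x_0$,
$$L \ge \int_a^x |f| \ge \int_a^{x_0} |f| \ge L - \varepsilon$$
so that $\left| \int_a^x |f| - L \right| \le \varepsilon$, and $\lim_{x \to \infty} \int_a^x |f| = L$.
(ii) Suppose $\int_a^\infty |f| = L$. Let $c_n = \int_a^n f$. We show that $\{c_n\}$ is a Cauchy sequence. Let $\varepsilon > 0$. Then there is an $x_0$ such that for $x \ge x_0$,
$$\left| \int_a^x |f| - L \right| < \frac{\varepsilon}{2}$$
Then for $n \ge m \ge x_0$,
$$|c_n - c_m| = \left| \int_m^n f \right| \le \int_m^n |f| = \int_a^n |f| - \int_a^m |f| \le \left| \int_a^n |f| - L \right| + \left| \int_a^m |f| - L \right| < \varepsilon$$
Thus $\{c_n\}$ is Cauchy, so converges, say to $c$. We shall show that in fact $\int_a^\infty f = c$. Let $\varepsilon > 0$. Choose $x_0$ so that $\left| \int_a^x |f| - L \right| < \varepsilon/4$ for $x \ge x_0$, find $N$ so that $|c_n - c| < \varepsilon/2$ for $n \ge N$, and fix an integer $n \ge \max(x_0, N)$. Then for $x \ge n$,
$$\left| \int_a^x f - c \right| \le \left| \int_a^n f - c \right| + \int_n^x |f| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2}$$
as in the previous computation.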
(iii) Under the given hypothesis, if $x > b$, then
$$\int_a^x |f| \le \int_a^b |f| + K \int_b^\infty g < \infty$$
Thus by (i), $f$ is absolutely integrable.

Here is an easily derived relationship between the absolute convergence of series and integrals which provides yet another test for the convergence of series.

Proposition 22. (Integral Test) Let $f$ be a positive, decreasing function defined on $R^+$. Then $\int_1^\infty f$ exists if and only if $\sum_{n=1}^{\infty} f(n) < \infty$.

Proof. For $x$ with $n \le x \le n + 1$ we have $f(n) \ge f(x) \ge f(n + 1)$. Thus $f(n) \ge \int_n^{n+1} f \ge f(n + 1)$. Thus, by comparison the series $\sum \int_n^{n+1} f$ and $\sum f(n)$ converge or diverge together. But the convergence of the first series is the same as the existence of $\int_1^\infty f$.

This proposition gives an easy proof that $\sum 1/n^{1+\varepsilon} < \infty$ for $\varepsilon > 0$. (Compare to the work of Example 18.) For if we consider the integral $\int_1^x dt/t^{1+\varepsilon}$, we have
$$\int_1^x \frac{dt}{t^{1+\varepsilon}} = \left[ \frac{-1}{\varepsilon t^{\varepsilon}} \right]_1^x = \frac{1}{\varepsilon} - \frac{1}{\varepsilon x^{\varepsilon}} \to \frac{1}{\varepsilon}$$
as $x \to \infty$.

Example
55.
$$\sum_{n=2}^{\infty} \frac{1}{n (\log n)^2} < \infty$$
For
$$\int_2^x \frac{dt}{t (\log t)^2} = \int_{\log 2}^{\log x} \frac{du}{u^2} = \left[ -u^{-1} \right]_{\log 2}^{\log x} = \frac{1}{\log 2} - \frac{1}{\log x}$$
Thus
$$\int_2^\infty \frac{dt}{t (\log t)^2} = \lim_{x \to \infty} \left( \frac{1}{\log 2} - \frac{1}{\log x} \right) = \frac{1}{\log 2} < \infty$$
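The comparison used in the proof of the integral test can be made concrete for Example 55. The sketch below is illustrative only: the names `f`, `integral`, and the cutoff `N` are assumptions, and `integral` uses the closed form $1/\log a - 1/\log b$ obtained above by the substitution $u = \log t$. Since $f$ is positive and decreasing on $[2, \infty)$, the partial sum $\sum_{n=2}^{N} f(n)$ lies between $\int_2^{N+1} f$ and $f(2) + \int_2^{N} f$.

```python
import math


def f(t):
    """Term of the series in Example 55: 1 / (t (log t)^2)."""
    return 1.0 / (t * math.log(t) ** 2)


def integral(lo, hi):
    """Exact value of the integral of f from lo to hi (substitution u = log t)."""
    return 1.0 / math.log(lo) - 1.0 / math.log(hi)


N = 10000
partial_sum = sum(f(n) for n in range(2, N + 1))

lower_bound = integral(2, N + 1)        # sum over [n, n+1] of f(n) >= integral
upper_bound = f(2) + integral(2, N)     # f(n+1) <= integral over [n, n+1]
bound_for_all_N = f(2) + 1.0 / math.log(2)

print(lower_bound, partial_sum, upper_bound, bound_for_all_N)
```

The last printed number bounds every partial sum, which is the boundedness that gives convergence of the series; it is a bound on the sum, not its value.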
200 2. Notions of Calculus • EXERCISES 28. Verify these algebraic properties of lim. Suppose lim f(x), lim g(x) JC-»0O JC-»0O JC-»0O exist. (a) lim f(x) + g(x) = li m f(x) + li m g(x). (b) lim f(x)g(x) = lim f(x) lim g(x). f(x) ,im^W (c) Urn ^ = '?" , if Urn *(*) * 0. jc-»oo 29. Compute these limits as χ -> со. (a) sin л: χ2 + Зх + 1 (Ь) *·+! " л:2- 1 (d) tan -. л: 1 (e) χ sin - X w x2 + 1 30. Which of these series converge: (a) Σ-1- (Ο Σ — V »ti и log и W »=2 n3'2 oo J oo С— iyi (Ь) Σ -тг^ (β) Σ ( У η = г rt(log η)2 η = 2 (log η)2 oo 1 oo 1 ω Σ „ _ _ ч2 (h) Σ η = 2 ^(log log л)2 п=2 (log л)2 00 1 OO 1 № Σ ^—ττττ—ίΤΤ^ (0 Σ η=2 (log /i)2(log log я)2 n=2 (n sin я)2 oo 1 oo 1 (e) Σ —, TV <J) Σ Η n=l Ι 7Γ η π=3 η log(logn)1+t tan' '
2.10 The Space of Continuous Functions

The mathematician attacks his problems with a certain store of techniques. Occasionally a problem will require the development of a new technique; more often the problem is solved by viewing it in one way, and then another and then again another until a viewpoint is obtained which allows for the application of one of those techniques. Sometimes, if the viewpoint is clever enough, or profound enough, or naive enough, the applicable technique is quite elementary and surprising and leads to further deep discoveries. This is the case with the contraction lemma (a fixed point theorem) which we shall apply several times in this text to obtain some of the basic facts of calculus. First, in this section, we shall develop the particular viewpoint in the relevant context. It is simple enough: instead of looking at continuous functions one at a time, we consider them all.

Let us illustrate this with a particular problem. Suppose we are interested in finding a differentiable function with these properties:
$$f'(x) = f(x) \ \text{ for all } x \quad \text{and} \quad f(0) = 1 \qquad (2.33)$$
To find such a function means first of all to verify that a solution to our problem exists, and secondly to establish some technique for computing it. We already have enough experience with calculus to know that this second objective will be hard to fulfill. What we in fact seek is a means of effectively approximating our solution. This provides a clue: let us look for a sequence of functions $\{f_n\}$ which converges to a function with the properties (2.33). Such a sequence would be a sequence of differentiable functions $\{f_n\}$ such that the sequence $\{f_n(x)\}$ converges for all $x$, and
$$f_n'(x) = f_{n-1}(x)$$
If we had such a sequence, we could take the limit and deduce that
$$\lim f_n'(x) = \lim f_{n-1}(x)$$
so $f(x) = \lim f_n(x)$ will solve our problem. Now this is a good idea, because Equation (2.33) itself provides the technique for generating such a sequence.
Let $f_0$ be any function, and define $f_1 = f_0'$. Then let $f_2 = f_1'$, $f_3 = f_2'$, and so forth. Will the sequence $\{f_n\}$ converge? Well, that is a problem. Notice that $f_2 = f_1' = f_0''$, $f_3 = f_2' = f_0'''$, and more generally $f_n = f_0^{(n)}$. Thus, we must be very careful to choose an infinitely differentiable function for $f_0$. Suppose $f_0$ is chosen as a polynomial of degree $n$. Then $f_{n+1} = f_0^{(n+1)} = 0$, and so all the rest of our functions are zero. Thus, the sequence certainly converges,
but hardly to a solution, since the condition $f(0) = 1$ is not satisfied. In fact, this present approach has obviously petered out fruitlessly, and it may be because we have not incorporated the initial condition $f(0) = 1$ in our approach. Can we put all of (2.33) in one statement, and then proceed with this technique of generating an approximating sequence? The fundamental theorem of calculus says yes; in fact, (2.33) can be rewritten as
$$f(x) = \int_0^x f(t)\,dt + 1 \qquad (2.34)$$
This now is an operation involving integration rather than differentiation, and so we have the added advantage of not having to choose a very well-behaved function for the first approximant. Let us try again, with (2.34) rather than (2.33). Letting $f_0 = 1$, we find
$$f_1(x) = \int_0^x dt + 1 = x + 1$$
$$f_2(x) = \int_0^x (t + 1)\,dt + 1 = \frac{x^2}{2} + x + 1$$
$$f_3(x) = \int_0^x \left( \frac{t^2}{2} + t + 1 \right) dt + 1 = \frac{x^3}{3!} + \frac{x^2}{2} + x + 1$$
$$f_n(x) = \frac{x^n}{n!} + \frac{x^{n-1}}{(n-1)!} + \cdots + x + 1 \qquad (2.35)$$
Now we're getting somewhere. We have already seen that the series (2.35) converges for any $x$. Thus, letting
$$f(x) = \lim f_n(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
this must be the sought-after function. (Of course the reader has long since recognized the solution of our problem as being the exponential function. Thus he should be reassured to see that it did in fact turn out that way.) What we need now is the theoretical mathematics that will allow us to take the limit in (2.35) and correctly deduce
$$f(x) = \int_0^x f(t)\,dt + 1 = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
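The successive approximations generated by (2.34) are easy to reproduce mechanically. The following is a minimal sketch, assuming a polynomial is stored as its list of coefficients (the names `picard_step`, `picard`, and `evaluate` are hypothetical and not from the text); exact rational arithmetic is used so that the coefficients can be compared with $1/k!$ directly.

```python
from fractions import Fraction
from math import e, factorial


def picard_step(coeffs):
    """Given p(x) = sum coeffs[k] x^k, return the coefficients of
    1 + integral from 0 to x of p(t) dt."""
    return [Fraction(1)] + [c / (k + 1) for k, c in enumerate(coeffs)]


def picard(n):
    """Coefficients of f_n, starting from f_0 = 1."""
    p = [Fraction(1)]
    for _ in range(n):
        p = picard_step(p)
    return p


def evaluate(coeffs, x):
    """Value of the polynomial with the given coefficients at x."""
    return sum(float(c) * x ** k for k, c in enumerate(coeffs))


f3 = picard(3)
print(f3)                          # coefficients of 1 + x + x^2/2 + x^3/6
print(evaluate(picard(15), 1.0), e)
```

Each step integrates term by term and adds the constant 1, so the initial condition $f(0) = 1$ is built into every approximant, which is exactly what the differentiation scheme based on (2.33) alone failed to do.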
Thus we are led to the question of convergence in the space of continuous functions. We now proceed to that theory.

Let $X$ be a closed bounded set in $R^n$, and let $C(X)$ denote the space of all continuous complex-valued functions on $X$. We know that if $f$ and $g$ are two functions in $C(X)$, then so are $f + g$ and $fg$ and $cf$ for $c$ a complex number. In particular, $C(X)$ is a vector space on which multiplication is defined. The vector space $C(X)$ is quite different from the vector spaces $C^n$, $R^n$: $C(X)$ is usually infinite dimensional (see Problem 49). $C(X)$ does not have any obvious "standard basis"; in fact, we wouldn't know how to choose one. In other particulars, however, $C(X)$ is not very different. There is in this space a reasonable notion of closeness. Two functions are close if their values are everywhere close; that is, if the maximum of their difference is small. This leads to a notion of length and distance in $C(X)$.

Definition 19. Let $X$ be a closed and bounded set in $R^n$, and $C(X)$ the space of continuous functions on $X$. If $f \in C(X)$, the length of $f$ is
$$\|f\| = \max\{|f(x)| : x \in X\}$$
If $f$, $g$ are in $C(X)$, the distance between $f$ and $g$ is $\|f - g\|$.

The properties of length and distance are just those of the corresponding notions in $R^n$:
$$\|cf\| = |c|\,\|f\|$$
$$\|f + g\| \le \|f\| + \|g\|$$
$$\text{If } \|f\| = 0, \text{ then } f = 0.$$
What is important is that we can consider the notion of convergence of a sequence of continuous functions. We say that $f_n \to f$ if $\|f_n - f\| \to 0$, that is, if the distance between the general term of the sequence and $f$ becomes arbitrarily small. This is the same as saying that the values of $f_n$ at points of $X$ converge to the values of $f$ in a uniform manner. The value of these notions lies not only in their naturality, but in the now realizable possibility of finding specific functions satisfying given properties by techniques of approximation. Let us make this precise.

Definition 20. Let $X$ be a closed bounded set, and $\{f_n\}$ a sequence in $C(X)$.
We say that $\{f_n\}$ is uniformly convergent if there is an $f \in C(X)$ such that
$$\lim_{n \to \infty} \|f_n - f\| = 0$$
We say that the sequence is uniformly Cauchy if, for every $\varepsilon > 0$ there is an $N$ such that
$$\|f_n - f_m\| < \varepsilon \quad \text{whenever } n, m \ge N$$

Examples
56. Let $X$ be the interval $[0, 1]$, $f_n(x) = (1 - x)x^n$. This sequence converges uniformly to zero. Let us compute $\max|f_n(x)| = \|f_n\|$.
$$f_n'(x) = n(1 - x)x^{n-1} - x^n$$
so $f_n'(x) = 0$ has the solutions $x = 0$, $x = n/(n + 1)$. Thus
$$\|f_n\| = f_n\left( \frac{n}{n+1} \right) = \frac{1}{n+1} \left( \frac{n}{n+1} \right)^n$$
which tends to zero.
57. On the same interval the sequence $f_n(x) = \sin(x/n)$ tends to zero, for
$$\|f_n\| = \sin\frac{1}{n} \to 0 \quad \text{as } n \to \infty$$
58. Consider the convergence of the sequence $\{nx \sin(x/n)\}$ on the interval $[0,1]$. Now we know that $\sin(x/n) \to 0$ as $n \to \infty$, but $nx \to \infty$, so we cannot make any deduction about the product. We have to refine our information about $\sin(x/n)$. For large values of $n$, it is very close to $x/n$. Thus
$$nx \sin\frac{x}{n} \approx nx \cdot \frac{x}{n} = x^2 \qquad (2.36)$$
so we guess that $nx \sin(x/n) \to x^2$. Let us prove it by computing
$$\left\| nx \sin\frac{x}{n} - x^2 \right\| \qquad (2.37)$$
In order to do that, let us provide an estimate to our guess (2.36):
$$\left| \sin\frac{x}{n} - \frac{x}{n} \right| \le \frac{1}{n^2} \quad \text{in the interval } [0, 1] \qquad (2.38)$$
2.10 The Space of Continuous Functions 205 Then (2.37) becomes • * 2 nx sin xz и = = / X X\ nx sin \ η n) 1 . χ x\ их sin \ и и/ < II «II < 1 "'7' χ χ sin и и = п~1 X + их ■ - - и -х2 (2.39) (2.40) and since и λ -* 0 as и -> оо, we are through. 59. On the interval [0, 1] the sequence {sin их} is not convergent. It is not even a Cauchy sequence. The distance ||sin их - sin wx|| does not become arbitrarily small as n,m-> oo. In particular, if m = 2n, we have ||sin(nx) — sin(2nx) sin | и ■ — i - sin 12n ■ —) \ 2л/ \ 2л/ = 1 The basic theorem about convergence of continuous functions is the following, which plays the same role in C{X) as the least upper bound axiom does for R. It provides the assertion of existence of functions with prescribed properties. In order to verify that a sequence of functions has a continuous limit, we need only verify that it is a uniformly Cauchy sequence. Theorem 2.14. A uniformly Cauchy sequence of continuous functions is uniformly convergent. Proof. Suppose {/„} is a uniformly Cauchy sequence of continuous functions on X. This means: for every ε > 0, there is an N> 0 such that \\f„ — fm\\<efoT n,m^.N. This means precisely I/„(x)-/™(x)I<e for all xe X (2.41) Thus, for each x, {/,(■*)} is a uniformly Cauchy sequence of real numbers, and thus converges. Denote the limit, lim/„(x) by /(x). We must show that this function χ ^/(x) is continuous, and that f„ converges uniformly to /
First of all, if $\varepsilon > 0$, choose $N$ as above, and let $m \to \infty$ in (2.41). We obtain, for $n \ge N$,
$$\lim_{m \to \infty} |f_n(x) - f_m(x)| = |f_n(x) - f(x)| \le \varepsilon \quad \text{for all } x \in X$$
Thus, if $n \ge N$, $\|f_n - f\| \le \varepsilon$. This implies that $\lim_{n \to \infty} \|f_n - f\| = 0$, as desired.

Now $f$ is continuous. Fix $x_0 \in X$. Let $\varepsilon > 0$ and choose $N$ so large that $\|f_N - f\| < \varepsilon/3$. Since $f_N$ is continuous, there is a $\delta > 0$ such that $\|x - x_0\| < \delta$ implies $|f_N(x) - f_N(x_0)| < \varepsilon/3$. Then if $\|x - x_0\| < \delta$,
$$|f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon$$
as desired.

Having seen one vector space of functions, we can easily see them everywhere. The collection of bounded real-valued functions on a set $X$ is a vector space over the reals. The collection of all bounded functions on $X$ taking values in $R^n$ is also a vector space; similarly, the space of continuous functions taking values in $R^n$. All the spaces here are endowed with the same concept of length:
$$\|f\| = \sup\{\|f(x)\| : x \in X\}$$
Of even more interest are the spaces of functions on which some analytic operation is defined. For example, if $I$ is an interval, the space of all real-valued functions which are differentiable on $I$ is a vector space. The space $C^1(I)$ of all functions whose derivative is continuous is also a vector space, as is the space $C^{(n)}(I)$ of all functions which have continuous $n$th derivatives. The space $R(I)$ of functions which are integrable on $I$ is a vector space. These (and other) examples are further elaborated in the exercises. Suffice it to say here that the mathematical theory which follows this point of view (called functional analysis) is a recent (20th-century) development which has had profound impact, not only in foundations of mathematics, but in the practical application of mathematics in all branches of science.

Let us return to the space $C(X)$ of continuous functions on a closed bounded set $X$ in $R^n$.
Once we begin thinking of these functions as points in a space, on which are defined such notions as distance and convergence, we are easily led to consider functions on that space. Naturally, such a function is continuous if it takes convergent sequences into convergent sequences.
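The notions of length and uniform convergence in $C(X)$ can be explored numerically. The following is only a sketch: the helper `sup_norm` is a hypothetical name, and it approximates the maximum by sampling a grid on $[0, 1]$, so it approaches the true length from below rather than computing it exactly. It compares the sampled length of $f_n(x) = (1 - x)x^n$ with the value of $f_n$ at its critical point $x = n/(n+1)$, and measures the distance between $\sin 4x$ and $\sin 8x$ as in Example 59.

```python
import math

def sup_norm(f, a=0.0, b=1.0, samples=10001):
    """Approximate max |f(x)| on [a, b] by sampling a uniform grid.
    Assumption: the grid is fine enough for the functions used here."""
    step = (b - a) / (samples - 1)
    return max(abs(f(a + i * step)) for i in range(samples))

def f(n):
    """The function f_n(x) = (1 - x) x^n of Example 56."""
    return lambda x: (1.0 - x) * x ** n

indices = (1, 5, 25, 125)

# Sampled lengths ||f_n||; they should shrink as n grows.
norms = [sup_norm(f(n)) for n in indices]

# Value of f_n at its critical point x = n / (n + 1).
exact = [(1.0 / (n + 1)) * (n / (n + 1)) ** n for n in indices]

# Distance between sin(nx) and sin(2nx) for n = 4; it stays near or above 1.
dist = sup_norm(lambda x: math.sin(4 * x) - math.sin(8 * x))

for n, approx, value in zip(indices, norms, exact):
    print(n, approx, value)
print("||sin 4x - sin 8x|| is about", dist)
```

The sampled lengths decrease toward zero, in agreement with uniform convergence of $\{f_n\}$ to zero, while the distance between $\sin 4x$ and $\sin 8x$ does not become small.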
Examples

60. Let $g \in C(X)$ and define $\varphi(f) = fg$. $\varphi$ is continuous, for if $f_n \to f$, that is, $\|f_n - f\| \to 0$, then
$$\|f_n g - fg\| \le \|f_n - f\|\,\|g\| \to 0$$

61. Define $\psi: C(X) \to C(X)$, $\psi(f) = f^2$. $\psi$ is also continuous, for
$$\|f_n^2 - f^2\| = \|(f_n - f)(f_n + f)\| \le \|f_n - f\| \cdot \|f_n + f\| \tag{2.42}$$
If $f_n \to f$, the term $\|f_n + f\|$ remains bounded while $\|f_n - f\| \to 0$, thus also $\|f_n^2 - f^2\| \to 0$.

62. If $P$ is any polynomial, $\psi_P(f) = P(f)$ is continuous on $C(X)$ (Problem 55).

63. Define $M: C(X) \to R$, $M(f) = \|f\|$. This is continuous, since
$$|M(f) - M(g)| = \big|\,\|f\| - \|g\|\,\big| \le \|f - g\|$$

64. Let $x_0 \in X$ and define $F_0: C(X) \to C$, $F_0(f) = f(x_0)$. Certainly $F_0$ is continuous: for if $f_n \to f$ in $C(X)$, then the maximum over $X$ of $|f_n(x) - f(x)|$ tends to zero; in particular, $|f_n(x_0) - f(x_0)| \to 0$, so $F_0(f_n) \to F_0(f)$.

65. The definite integral is a continuous function on $C(I)$, where $I = [a, b] \subset R$. For
$$\left| \int_a^b f_n - \int_a^b f \right| = \left| \int_a^b (f_n - f) \right| \le \|f_n - f\|(b - a)$$
so if $f_n \to f$ also $\int_a^b f_n \to \int_a^b f$.

A stronger and more important statement than that of Example 65 is that the indefinite integral, as a function from $C(I)$ to $C(I)$, is continuous. This is contained in the next proposition.

Proposition 23. Let $I = \{x \in R: a \le x \le b\}$. Suppose $\{f_n\}$ is a sequence of continuous functions on $I$ converging uniformly to $f$. Let $F_n(x) = \int_a^x f_n$, $F(x) = \int_a^x f$. Then $F_n \to F$ uniformly.
Proof.
$$|F_n(x) - F(x)| = \left| \int_a^x (f_n - f) \right| \le \|f_n - f\|(x - a) \le \|f_n - f\|(b - a)$$
Thus, taking the maximum on the left,
$$\|F_n - F\| \le \|f_n - f\|(b - a)$$
so if $f_n \to f$ uniformly so also $F_n \to F$.

Problem 56 is intended to demonstrate that, on the other hand, differentiation is not a continuous function on $C(I)$. (It isn't even everywhere defined; i.e., there are continuous functions that do not have a derivative.) Nevertheless, Proposition 23 has this consequence for differentiation.

Proposition 24. Let $\{f_n\}$ be a sequence of continuously differentiable functions on the interval $[a, b]$ and suppose that
(i) $\{f_n'\}$ is uniformly Cauchy,
(ii) $f_n(a) = 0$ for all $n$.
Then $\{f_n\}$ is uniformly convergent to a differentiable function $f$ and $f' = \lim f_n'$.

Proof. The proof of this proposition consists in a rereading of Proposition 23 via the fundamental theorem of calculus. By that theorem
$$f_n(x) = \int_a^x f_n'$$
By Theorem 2.14 the sequence $\{f_n'\}$ converges uniformly to a continuous function, so by Proposition 23, $f_n$ is also convergent. If we let $g = \lim f_n'$, then $\lim f_n = \int_a^x g$. Thus, $\lim f_n$ is indeed differentiable and its derivative is $g = \lim f_n'$.

Let us return now to the consideration of our original problem. In fact, let us generalize it slightly. Let $c$ be a complex number, and let us seek a differentiable complex-valued function $f$ such that
$$f'(x) = cf(x) \text{ for all } x \quad \text{and} \quad f(0) = 1 \tag{2.43}$$
This is, by the fundamental theorem of calculus, the same as seeking a continuous function $f$ such that
$$f(x) = c \int_0^x f(t)\,dt + 1 \tag{2.44}$$
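One way to get a feeling for equation (2.44) is to apply its right-hand side repeatedly to a starting guess and watch what comes out. The sketch below is only an illustration under stated assumptions: functions are restricted to polynomials stored as coefficient lists, $c$ is taken rational so that the arithmetic is exact, and the names `apply_T`, `iterate`, and `evaluate` are hypothetical helpers, not notation from the text.

```python
import math
from fractions import Fraction

def apply_T(coeffs, c):
    """Coefficients of c * integral_0^x f(t) dt + 1, where f is the
    polynomial sum(coeffs[k] * x**k). Integration raises each power by one."""
    result = [Fraction(0)] + [c * a / (k + 1) for k, a in enumerate(coeffs)]
    result[0] = Fraction(1)  # the constant term "+ 1" in (2.44)
    return result

def iterate(c, steps):
    """Start from the constant function 1 and apply the map `steps` times."""
    f = [Fraction(1)]
    for _ in range(steps):
        f = apply_T(f, c)
    return f

def evaluate(coeffs, x):
    """Evaluate the polynomial at a float x."""
    return sum(float(a) * x ** k for k, a in enumerate(coeffs))

# Four steps with c = 2 give the coefficients 2^k / k! for k = 0, ..., 4.
print(iterate(Fraction(2), 4))

# With c = 1 the value at x = 1 approaches the number e.
print(evaluate(iterate(Fraction(1), 15), 1.0), math.e)
```

In this sketch the $n$th iterate is the $n$th partial sum of $\sum (cx)^k / k!$, which is why the value at $x = 1$ with $c = 1$ settles near $e$.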
Now that we have the necessary theory and point of view available, we may follow a more sophisticated approach. Let $I$ be the interval $I = [-R, R]$, and define the function $T$ on $C(I)$:
$$Tf(x) = c \int_0^x f(t)\,dt + 1 \tag{2.45}$$
We seek a function $f$ such that $f = Tf$, that is, a fixed point of the transformation. Our technique is that of successive approximation. Let $f_0$ be any continuous function, and define $f_1 = Tf_0$, $f_2 = Tf_1 = T^2 f_0$, and in general $f_n = Tf_{n-1} = T^n f_0$. We must show that the sequence $\{f_n\}$ converges. If we choose $f_0 = 1$ we can compute the sequence explicitly, and we find that
$$f_n(x) = \frac{(cx)^n}{n!} + \frac{(cx)^{n-1}}{(n-1)!} + \cdots + cx + 1$$
Then if $m > n$,
$$f_m(x) - f_n(x) = \frac{(cx)^m}{m!} + \frac{(cx)^{m-1}}{(m-1)!} + \cdots + \frac{(cx)^{n+1}}{(n+1)!}$$
On the interval $[-R, R]$ the maximum of this expression is dominated by replacing $c$ by $|c|$, and $x$ by $R$. Thus,
$$\|f_m - f_n\| \le \frac{(|c|R)^m}{m!} + \frac{(|c|R)^{m-1}}{(m-1)!} + \cdots + \frac{(|c|R)^{n+1}}{(n+1)!} = \sum_{k=0}^{m} \frac{(|c|R)^k}{k!} - \sum_{k=0}^{n} \frac{(|c|R)^k}{k!} \tag{2.46}$$
Since the series $\sum \frac{1}{k!}(|c|R)^k$ converges, its sequence of partial sums is a Cauchy sequence, so by (2.46), $\{f_n\}$ is a Cauchy sequence and is thus uniformly convergent. Since $T$ is continuous on $C(I)$, we have
$$\lim f_n = \lim T(f_{n-1}) = T(\lim f_{n-1}) = T(\lim f_n)$$
so $\lim f_n$ solves the given problem. This function is important enough for us to spend a few more paragraphs discussing it.

Definition 21. The exponential function, denoted $\exp(cx)$, or $e^{cx}$, for any complex number $c$ is the solution of the differential equation
$$f'(x) = cf(x) \qquad f(0) = 1$$

First of all, this definition makes sense, because there is only one solution. If $g$ also solves, then
$$\frac{d}{dx}\left[\frac{e^{cx}}{g}\right] = \frac{ce^{cx}g - e^{cx}g'}{g^2} = \frac{ce^{cx}g - ce^{cx}g}{g^2} = 0$$
since $g' = cg$. Thus $e^{cx}g^{-1}$ is constant. Since its value at 0 is 1, $e^{cx}g^{-1} = 1$, or $e^{cx} = g$. From these discussions we have these additional properties of the exponential function.

Proposition 25.
(i) $e^{cx} = \sum_{n=0}^{\infty} \dfrac{(cx)^n}{n!}$.
(ii) $e^{x+y} = e^x e^y$.
(iii) $e^{cx}$ is never zero.

Proof. Part (i) follows directly from the argument above. Part (ii) follows from the uniqueness. Fix $y$, and define $h(x) = e^{x+y}/e^y$. Then
$$h'(x) = \frac{e^{x+y}}{e^y} = h(x) \quad \text{and} \quad h(0) = \frac{e^y}{e^y} = 1$$
Thus we must have $h(x) = e^x$, so (ii) is verified. Part (iii) follows immediately from (ii):
$$e^{cx}e^{-cx} = e^{cx - cx} = e^0 = 1$$
so $(e^{cx})^{-1} = e^{-cx}$.

• PROBLEMS

49. Let $I$ be a nonempty interval in $R$. Show that $C(I)$ is infinite dimensional.
211 The Fixed Point Theorem 211 50. Show that the sequence of functions on the closed unit disk in С defined by *=ι к2 converges. 51. Does the sequence ! У zkjk\ converge on the closed unit disk'' X sin - — η X — η 52. Let {a„} be a sequence of complex numbers such that 2 |e»l < °°- Verify these facts: (a) For every z, \z\ < l,f(z) = У„'=1 a„z" converges, and (b) /is continuous on (ze C. \z\ < 1} This is true because/is the uniform limit of the polynomials fs(z) =J_;v=l a„z", since ||/-/v||^ 2"=«+1 l°ni ^° as -^^:o- 53. Let f,gbe continuous functions on the closed and bounded set X. Show that \\fg\\ <; 11/11 \\g\\ Is \\fg\\ < ||/|| ■ ||^|| possible'' 54. Show that on the interval [0, 1 ], < — for all η η 55. Let Xi,...,xkeX and ρ be any polynomial in к variables Define Г:С(Х)^С ПЛ=р(ЛхО,- ./(x0) Show that Ψ is continuous. 56. Find a sequence {/„} of differentiable functions which is uniformly convergent, but such that {/'„(i)} is not convergent. 2.11 The Fixed Point Theorem The fixed point theorem is a generalization of the technique of successive approximations described above in the discussion of the exponential function. This technique was first used by Newton as a technique for finding roots of polynomial equations. Simply stated, Newton's method is this. First,
a technique is described by means of which one can transform a given approximation to a root into a better approximation. One then chooses a reasonable approximation, applies this technique to it to find a better one. Having this, one again applies the technique: if it's a good one, the result is an even better approximation. Continuing in this way, one obtains a sequence of approximations which should converge to the root.

Now, having described the procedure, let us turn to Newton's specific technique for bettering approximations. Let $f$ be a given real polynomial. We want to find a point $x_0$ such that $f(x_0) = 0$. Choose a $p_1$ so that $f(p_1)$ is small. Now, replace the function by its linear approximation at $p_1$: $L(x) = f(p_1) + f'(p_1)(x - p_1)$, and let $p_2$ be the root of $L(x) = 0$. In other words, replace the graph of $f$ by its tangent line and let $p_2$ be the $x$ intercept of that line (see Figure 2.17). Now apply this procedure to $p_2$. Let $p_3$ be the root of the linear approximation to $f$ at $p_2$, and so forth.

Figure 2.17

We can describe Newton's technique abstractly as follows: For any point $p$, let $T(p)$ be the zero of the linear approximation of $f$ at $p$: $T(p)$ solves the equation $f(p) + f'(p)(T(p) - p) = 0$. (We must
assume that $f' \ne 0$ for $T$ to be a well-defined function.) Clearly, if $f(p) = 0$, we have $T(p) = p$, and conversely; thus we are in reality seeking a fixed point of $T$! Suppose $T$ has the property of contraction on some interval $I$: there is a $c < 1$ such that $|Tx - Ty| \le c|x - y|$, all $x, y \in I$. Then Newton's method works. There is a root of $f(x) = 0$ (or $f'(x) = 0$) on the interval $I$, and it is the limit of the sequence $x_0, Tx_0, T^2 x_0, \ldots$, where $x_0$ is any point of $I$. This is the content of the fixed point theorem. We now state and prove it explicitly for subsets of $C(X)$. It will be clear that the theorem is true for subsets of $R^n$, by virtue of the same argument.

Theorem 2.15. Suppose $S$ is a closed set of functions in $C(X)$; that is, $S$ contains all limits of sequences in $S$. Suppose $T$ is a mapping of $S$ into $S$ which is a contraction, that is, there is a $c < 1$ such that
$$\|T(f) - T(g)\| \le c\|f - g\| \quad \text{for all } f, g \in S$$
Then there is a unique continuous function $f_0$ such that $T(f_0) = f_0$.

Proof. Certainly the fixed point is unique. For if $T(f_0) = f_0$ and $T(f_1) = f_1$, then
$$\|f_0 - f_1\| = \|T(f_0) - T(f_1)\| \le c\|f_0 - f_1\| < \|f_0 - f_1\|$$
unless $\|f_0 - f_1\| = 0$, that is, $f_0 = f_1$.

Now let $f \in S$. Let the sequence $\{f_n\}$ be defined as follows: $f_1 = f$, $f_2 = Tf_1$, $f_3 = Tf_2$, ..., $f_n = Tf_{n-1}$. $\{f_n\}$ is a Cauchy sequence. For
$$\|f_{n+1} - f_n\| = \|Tf_n - Tf_{n-1}\| \le c\|f_n - f_{n-1}\|$$
so we can verify by induction that
$$\|f_{n+1} - f_n\| \le c^{n-1}\|f_2 - f_1\|$$
Thus, for $m > n$ we have
$$\|f_m - f_n\| = \Big\| \sum_{j=n}^{m-1} (f_{j+1} - f_j) \Big\| \le \sum_{j=n}^{m-1} \|f_{j+1} - f_j\| \le \Big( \sum_{j=n}^{m-1} c^{j-1} \Big) \|f_2 - f_1\| \le \frac{c^{n-1}}{1 - c}\,\|f_2 - f_1\|$$
Since $c < 1$, $\{f_n\}$ is Cauchy, so has a limit $f_0 \in C(X)$, and $f_0$ lies in $S$ because $S$ is closed. Since $T$ is continuous, $Tf_0 = \lim_{n \to \infty} Tf_n = \lim_{n \to \infty} f_{n+1} = f_0$, and thus $f_0$ is the desired fixed function.

As an illustration on the real numbers let us prove that if $a > 0$, there is an $x_0 > 0$ such that $x_0^2 = a$, by Newton's method. First, we describe the
map $T$. Let $p > 0$; the linear approximation to $x^2 - a$ at $p$ is $p^2 - a + 2p(x - p)$. Thus, the zero of this linear polynomial is
$$Tp = \frac{a - p^2}{2p} + p = \frac{1}{2}\left(p + \frac{a}{p}\right)$$
Clearly, if $T$ has a fixed point $x_0$, we must have $x_0^2 = a$. Thus, we must show that $T$ is a contraction on some closed interval:
$$|Tx - Ty| = \frac{1}{2}\left| x - y + \frac{a}{x} - \frac{a}{y} \right| = \frac{1}{2}\left| (x - y) + \frac{a}{xy}(y - x) \right| = \frac{1}{2}\left| 1 - \frac{a}{xy} \right| |x - y|$$
Since $a$, $x$, $y$ are all positive, $1 - (a/xy) < 1$, so we need only ensure that $1 - (a/xy) \ge -1$, for $T$ to be a contraction with $c = \frac{1}{2}$. Let $I = \{x > 0: x^2 \ge a/2\}$. Then for $x, y \in I$, $xy \ge a/2$, so $a/xy \le 2$, which is the desired inequality. Moreover $T$ maps $I$ into itself, since $(Tp)^2 - a = \frac{1}{4}(p - a/p)^2 \ge 0$ for every $p > 0$. Thus, by the fixed point theorem there is an $x_0$ with $x_0^2 \ge a/2$ such that $x_0^2 = a$.

We shall now give a somewhat more subtle application of the fixed point theorem. Sometimes a relation between two real variables determines one as a function of the other. For example, the relation $x + y = 0$ determines $y$ as a function of $x$: $y = -x$; $x^2 + y^2 = 1$ gives $y = (1 - x^2)^{1/2}$ near the value $(0, 1)$, and near $(1, 0)$ we should write $x = (1 - y^2)^{1/2}$ as a function of $y$. The relations 1 sin(x(log y)) = 0 are somewhat less transparent; nevertheless we can ask whether or not they do determine $y$ as a function of $x$. Suppose now, in general, we have an equation (see Figure 2.18)
$$F(x, y) = 0 \tag{2.47}$$
defined in the plane. We ask: does there exist a function $g$ of $x$ such that (2.47) amounts to saying $y = g(x)$? More precisely, is there a function $g$ such that
$$F(x, y) = 0 \quad \text{if and only if} \quad y = g(x)?$$
It is not hard to find a necessary condition. For there to be such a function
it must be the case that each line $x$ = constant intersects the set $F(x, y) = 0$ in only one point (see Figure 2.19). Thus the function $F(x, y)$, as a function of $y$ on lines $x$ = constant, must take the value 0 only once. The root of $F(x, y) = 0$ is then the value $g(x)$. Now we recall from one-variable theory that a function $H(y)$ will take each value at most once if $H'(y) \ne 0$. Thus the reasonable condition to impose on $F$ is that it has a continuous partial derivative with respect to $y$, and $\partial F/\partial y \ne 0$. This condition turns out to be enough. More precisely, suppose that $F$ is defined and has continuous partial derivatives in the neighborhood of the origin in $R^2$, and $\partial F/\partial y(0, 0) \ne 0$.

Figure 2.18. Left panel: $y$ is a function of $x$. Right panel: $y$ is not a function of $x$.

Figure 2.19
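The remark that the root of $F(x, y) = 0$ on the line $x$ = constant is the value $g(x)$ can be tried out numerically. The following sketch rests on assumptions that are not in the text: the sample relation $F(x, y) = y^3 + y - x$ is chosen only because $\partial F/\partial y = 3y^2 + 1$ is never zero, and the names `F`, `dF_dy`, and `g` are hypothetical. For each fixed $x$ it runs Newton's method in the variable $y$.

```python
def F(x, y):
    """Sample relation (an assumption for illustration): F(x, y) = y^3 + y - x."""
    return y ** 3 + y - x

def dF_dy(x, y):
    """Partial derivative of F with respect to y; it is at least 1 everywhere."""
    return 3 * y ** 2 + 1

def g(x, y_start=0.0, tol=1e-12, max_iter=100):
    """For fixed x, find the root y of F(x, y) = 0 by Newton's method."""
    y = y_start
    for _ in range(max_iter):
        step = F(x, y) / dF_dy(x, y)
        y -= step
        if abs(step) < tol:
            return y
    raise RuntimeError("Newton iteration did not converge")

for x in (0.0, 0.3, 2.0):
    y = g(x)
    print(x, y, F(x, y))
```

For this particular relation $g(0) = 0$ and $g(2) = 1$, since $1^3 + 1 - 2 = 0$; other choices of $F$ may need a different starting value or may fail to converge.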
We seek a function $g$ defined in a neighborhood of $x = 0$ such that $g(0) = 0$ and $F(x, g(x)) = 0$. If we fix $x = x_0$ near 0, then we seek a root of $F(x_0, y) = 0$. This brings us right back to Newton's method. Define $T$ as a function of $y$ as Newton did: $T(y)$ is the zero of the linear approximation of $F(x_0, y)$ at $y$; that is,
$$F(x_0, y) + \frac{\partial F}{\partial y}(x_0, y)(Ty - y) = 0$$
or
$$Ty = y - \left[\frac{\partial F}{\partial y}(x_0, y)\right]^{-1} F(x_0, y) \tag{2.48}$$
Just as in Newton's case the solution of $F(x_0, y) = 0$ is the fixed point of $T$. Thus, we need only verify that $T$ is a contraction in some interval of values of $y$ for $x_0$ near 0, so that it will have a fixed point; and we define $g(x_0)$ to be that fixed point. This application of the fixed point theorem really works, as we now shall prove.

Theorem 2.16. Suppose that $F$ has continuous partial derivatives in a neighborhood of $(0, 0)$, and that $F(0, 0) = 0$, $\partial F/\partial y(0, 0) \ne 0$. Then there is a function $g$ defined for $x$ in some interval $(-\varepsilon, \varepsilon)$ such that
$$F(x, y) = 0 \quad \text{if and only if} \quad y = g(x)$$

Proof. Instead of (2.48) we consider something slightly simpler. For $x$ near 0, define
$$T_x(y) = y - \left[\frac{\partial F}{\partial y}(0, 0)\right]^{-1} F(x, y) \tag{2.49}$$
We want to find the fixed point, if it exists, of (2.49). Thus we seek suitable intervals, $-\varepsilon < x < \varepsilon$, $-\eta < y < \eta$, in which $T_x$ is a contraction.
$$T_x(y_1) - T_x(y_2) = y_1 - y_2 - \left[\frac{\partial F}{\partial y}(0, 0)\right]^{-1} [F(x, y_1) - F(x, y_2)] \tag{2.50}$$
By the mean value theorem there is a $\xi$ between $y_1$ and $y_2$ such that
$$F(x, y_1) - F(x, y_2) = \frac{\partial F}{\partial y}(x, \xi)(y_1 - y_2)$$
Equation (2.50) becomes, upon substitution,
$$T_x(y_1) - T_x(y_2) = (y_1 - y_2)\left[ 1 - \frac{\dfrac{\partial F}{\partial y}(x, \xi)}{\dfrac{\partial F}{\partial y}(0, 0)} \right] \tag{2.51}$$
Now the term in brackets is continuous in $(x, \xi)$ and has the value 0 at $(0, 0)$. Thus we may choose $\varepsilon$ so that that term is less than $\frac{1}{2}$ in absolute value if $-\varepsilon < x < \varepsilon$, $-\varepsilon < y_1 < \varepsilon$, $-\varepsilon < y_2 < \varepsilon$ and $\xi$ is between $y_1$ and $y_2$. With this choice of $\varepsilon$, (2.51) gives
$$|T_x(y_1) - T_x(y_2)| \le \tfrac{1}{2}|y_1 - y_2|$$
so $T_x$ is indeed a contraction. Define $g(x)$ as the fixed point of $T_x$. Then, if $F(x, y) = 0$, then by (2.49) $T_x(y) = y$, so we must have $y = g(x)$. On the other hand, if $y = g(x)$, then $T_x(y) = y$, so again by (2.49) we must have $F(x, y) = 0$. The theorem is proved.

To say that the function $g$ exists is already good enough, but much more is true: $g$ is a continuously differentiable function. We will leave the verification of this fact to the interested reader (see Problem 58). In Section 7.2 we shall reconsider this theorem (known as the implicit function theorem) in many more variables. The beauty of the fixed point theorem is that the general context does not at all complicate the ideas, nor the verifications.

• EXERCISES

31. Find, by Newton's method, a sequence of numbers converging to the square root of $a$, for any $a > 0$. Now, do the cube root.

32. Find a sequence converging to a root of these polynomials:
(a) $x^3 + x^2 + x + 1$
(b) хг - χ + 1
(c) χό -2χ2-1χ + 2
(d) $x^5 - x - 1$

33. (a) Let $F(x, y) = x \sin(xy)$. For what values of $(x, y)$ such that $F(x, y) = 0$ is it true that nearby the equation $F(x, y) = 0$ defines $y$ as a function of $x$?
(b) Same problem for
(i) $F(x, y) = xy^2 + 2xy + 1$,
(ii) F(x,y)=x*-y,
(iii) F(x, y) = x" + y2

34. Let $F(x, y)$ be differentiable in a domain $D$, and $(x_0, y_0) \in D$ such that $F(x_0, y_0) = 0$. Suppose $g$ is differentiable and has the property $g(x_0) = y_0$, $F(x, g(x)) = 0$. Show that
$$g'(x_0) = -\frac{\partial F/\partial x\,(x_0, y_0)}{\partial F/\partial y\,(x_0, y_0)}$$
218 2. Notions of Calculus 35. Find g' where g is defined implicitly by (a) x sin(xy) = 0 (c) e" = 1 (b) cos(x+y)=y (d) e'"=:y • PROBLEMS 57. Prove the fixed point theorem in R". Theorem IfS is a subset of R1 and Τ is defined on S and is a contraction on S, then there is a unique y0 e 5 such that T(y0) = yo. 58. Let F have continuous partial derivatives near (x0, Уо) and suppose F(xo,yo) = 0, 8F/dy(xo,yo)=£0. Let g be the function described in Theorem 2.15 (F(x, y(x)) = 0 and #(x0)=.)o). We can prove that g is differentiable as follows. (a) First of all, by the mean value theorem, for any (x, y), there is a (ξ, η) on the line between (x0, Уо) and (x, y) such that dF dF F(x, y) ~ F(x0, Уо) = — (ξ, η)(χ - Xo) + — (l П)(У - Уо) Why is the mean value theorem applicable? (b) Now, if we substitute у =д(х), Уо ~ g(x0), we have dF dF 0 = — (ξ, η)(χ -χ0) + — (ξ, v)(g(x) - g(Xo)) Thus g(x)-g(xo) ^-SFIbx&rj) χ - Xo ЩЪу(£, η) Conclude that g is differentiable and dFldx(Xo,g(xo)) ff'(xo) ■■ ЩЫхо,д(хо)) 2.12 Summary A sequence zu ...,zn,... of complex numbers is a function from the positive integers to C. The sequence {z„} converges to ζ if, for every ε > 0 there is an N such that |z„ — z\ < ε for и > N. A convergent sequence is bounded, but not conversely. A monotonic bounded sequence of real numbers is convergent. Cauchy criterion: a
212 Summary 219 sequence {z„} converges if, for every ε > 0, there is an N such that \z„ - zm\ < ε for both n,m>N. The series formed of a sequence {z„} is the sequence of sums {£"=1 z,}. If the sequence of sums converges, we say that the series converges and denote the limit by £„°°=1 ζ„. If £ z„ converges, then z„->0, but not conversely. If {ck} is a sequence of nonnegative numbers, £ ck converges if and only if the sequence Σϊ = 1 ck is bounded. A series £z„ converges absolutely if Y\zn\ < oo. Absolutely convergent series may be summed in any convenient way. Tests for Convergence comparison test. Suppose |z„| < |и„| for all but finitely many n. Then (ι) if £ |w„| converges, £ z„ is absolutely convergent, (ii) if £ |z„| diverges, so does £ |w„|. root test. If |c„|1/n < r for some r < 1 and all but finitely many n, £ c„ is absolutely convergent. ratio test. If |c„ + 1/c„| < r for some r < 1 and all but finitely many n, Σ c„ is absolutely convergent. The sequence {vk} of vectors in R" is said to converge to ν if, for every ε > 0, there is an N such that || vk — ν || < ε for к > N. A sequence of vectors converges if and only if it does so in each coordinate. A set S is closed if and only if vft e S, hm vk = ν implies ν e S also. Every sequence contained in a closed and bounded set has a convergent subsequence. An Revalued function defined in R" is said to be continuous at v0 if/is defined in a neighborhood of v0 and vk -> v0 implies f(vk) </(v0). A function is continuous on a set S if it is continuous at every point of S. If S is a closed and bounded set, and/is a continuous real-valued function defined on S, then/is bounded and attains its maximum and minimum. Sections 2.6 and 2.7 are mainly about integration. We shall not recollect the definitions here; only the major results. fundamental theorem of CALCULUS. Suppose / is continuous on the interval [a, b]. Then the integral F(x) = f/ Ja exists for all χ e [a, b]. F is differentiable on (a, b) and F' =/.
220 2. Notions of Calculus fubini's theorem. Let /be an integrable function defined on a rectangle R = /j χ ■ ■ ■ χ In in R". J/can be computed by iteration: ί/= f [■■■ [f Kx\...,xn)dx" JR Jlil lJI„ dx"'1·- dx1 Let/be a real-valued function defined in a neighborhood of x0 in R". If vis a vector in R", the directional derivative df(x0, v) of/ at x0 in the direction ν is defined by lim f-»0 Дх0 + fv) - /(Xo) (if it exists). The partial derivative of/with respect to xl at x0 is £1(Xo) = rf/(x0,El) If these partial derivatives are all defined and continuous near x0, then df(x0, v) is linear in v. We can write df=l¥-,dx' J ^ dx' If the partial derivatives dfjdx1 all exist in an open set we may be able to compute the derivatives d(df/dx')/dxJ. These are the second-order partial derivatives. If all first and second derivatives of/ exist and are continuous in an open set N, then δ2/ δ2/ dx1 dxJ dxJ dx1 throughout N. Suppose that / has continuous partial derivatives in the domain I x D, where / is an interval of reals, and β is a domain in R". Let F(x) = f f(x, y) dy JD Then F is differentiable and 7W=[t- (*» У) dy ax Jdox
2.12 Summary 221 Suppose/is a real-valued function defined on R. We say that f(x) converges toiasx-юо written hm/(x) = L if \f(x) - L\ can be made arbi- JC-» 00 tranly small by taking χ sufficiently large. If now/is a continuous function on R such that lim ff exists, we say that/is integrable on R. If hm J* |/| exists,/is absolutely integrable. Integral test: If/is a positive, decreasing continuous function defined on R, then Jf /exists if and only if £"=i/(n) < oo. Let X be a closed and bounded set in R". We denote by C(X) the collection of all complex-valued continuous functions on X. C(X) is a vector space. If/is in C(X), the length of/is ||/||=max{|/(x)|:xeX} For/ #in C(Z) the distance between/and^is ||/— #||. If {/„}isa sequence in C(X), and ||/„ -/|| ->0 as n-> oo for some /e C(X), we say that {/„} converges uniformly to / Cauchy criterion. Suppose {/„} is a sequence in C(X) satisfying the following condition: for each ε > 0, there is an N such that ||/„ —/m|| < ε whenever n,m> N. Then there is an/e C(Z) such that /"„->/uniformly. integration. If X is an interval in R, and/ ->/uniformly in C(Z) then also J;/.-»J;/uniformly. The exponential function, denoted exp(oc), or ecx for any complex number с is the solution of the differential equation y' = cy, y(0) = 1. It has these properties: » (ел:)" „=o и! e" is never zero. fixed point theorem. Let S be a closed set of functions in C(X) and Τ a mapping of S onto S which is a contraction; that is, there is с < 1 such that \\T(f) - Tte)|| < c||/- 9\\ for all/ geS Then there is a unique continuous function /0 such that T(/0) =/0 -
222 2. Notions of Calculus implicit function theorem. Suppose that F has continuous partial derivatives in a neighborhood of (0, 0), and that F(0, 0) = 0, dF/dy(0, 0) # 0. Then there is a function g defined for χ in some interval (— ε, ε) such that F(x, y) = 0 if and only if y = g(x) • FURTHER READING M. Spivak, Calculus, Benjamin, New York, 1967. This is an eloquent text in the one-variable calculus. It is an excellent reference for a full treatment of the material in this chapter. T. A. Bak and J. Lichtenberg, Mathematics for Scientists, Benjamin, New York, 1966. This is a review of the theory of calculus from the point of view of the physical scientist. It includes a chapter on numerical analysis. C. W. Burnll and J. R. Knudsen, Real Variables, Holt, Rinehart and Winston, New York, 1969. An advanced text, going thoroughly through the material of this chapter and beyond to the theory of Lebesque integration. • MISCELLANEOUS PROBLEMS 59. Let {л:„} and {y„} be sequences. Then {x„ + yn} is also a sequence. So also is {rx„} for any real number r; thus the collection S of all real sequences is a vector space. Show that it is not finite dimensional. 60. Show that the collection В of bounded sequences is a linear subspace of the vector space S of all sequences (Problem 59). 61. Show that the collection С of convergent sequences is a linear sub- space of B. Also C0, the collection of all sequences converging to zero is a linear subspace of B. These spaces are all infinite dimensional. 62. Define the function " lim " on convergent sequences in the obvious way: lim: С -* R: limfx»} = lim x„. Show that lim is a linear function. 63. What is the dimension of the space of linear functionals on С which annihilate C0 ? 64. Let χι =4, хг = i(4 + I), and once x2, ...,x„ are defined, let -Wi = Κχη + 3/л:„). Prove that {л:„} converges. Assuming that, find the limit. 65. 
(a) Show that for every integer k, lim ri4(n + 1)" = 1 lim пЧ(п+ 1)"+1=0 lim и'+'/Си + 1)' does not exist (b) Let к be an integer, and 1 > h > 0. Show that lim n'A" = 0. (c) Show that lim n/ft" does not exist.
2.12 Summary 223 66. Let χι = 1, and in general 3 +x„ Find lim x„. 67. Suppose lim z„ = z. (a) Let % = £(z„- ι + z„). Then lim y„ = ζ (b) Let fc be a positive integer. Now let {%} be defined by 1 y» — ~£~τ\ (Zn +Zn+i + " ■ + z»+*) Then lim y„ = ζ also. (c) This time take 1 % = - (zi + 1- z„) Once again lim >>„ = z. 68. Suppose that / is continuous at c, and lim с = c. Then lim/(c) =/(c). 69. Let {c„} be a sequence of complex numbers, and suppose (|с„|)1/п = R. Show that R'1 is the radius of convergence of 2 c»z"· 70. Let {s„}, {f„} be two sequences of positive numbers such that lim s„ t„ ' exists and is nonzero. Then 2 & converges if and only if 2 i» converges. 71. Let {c„} be a sequence of positive numbers. Suppose that for every sequence of positive numbers {p„} such that 2#,<0° we have also 2ftAi<°°- Prove that {c„} is bounded. 72. Verify Schwarz's inequality: Σ\<*μΥ < ||α„|2· ||ό„|2 1=1 у n-L n=l (Hint: It is true by virtue of the same fact for finite sums, which was discussed in Problem 74 of Chapter 1.) 73. Prove that if 2 la»l2 < °°. then J_(l/ri)\a„\ < oo. Is the reverse implication true? 74. Let S be a subset of A". Show that ±(S) = {\eR": <v, s> = 0 for all s e S} is a closed set. 75. Suppose that/is a continuous positive real-valued function defined on a set S in /?". Show that log/is also continuous. 76. Suppose that/is a continuous real-valued function defined on all of R". Let x0, xi e R" and с e R be such that /(x0) < с </(x,). Show that there is an x2 e R" such that /(x2) = с
224 2. Notions of Calculus 77. Show that if/is a continuous function on the interval / taking only rational values, then /must be constant. 78. A set S in R" is called connected if every continuous real-valued function has the intermediate value property. Show that this is equivalent to the following definition: A set S is not connected if there is a continuous real-valued function / defined on S which takes precisely two values. 79. Verify the following assertions: (a) A ball in R" is connected. (b) The set of integers is not connected. (c) The sphere {x e R3: ||x || = 1} is connected. (d) The union of two balls in R" is connected if and only if they intersect. (e) An open set is not connected if and only if it can be written as the disjoint union of two nonempty open subsets. (f) A closed set is not connected if and only if it can be written as the disjoint union of two nonempty closed sets. 80. Let / be a continuous function on the closed and bounded set X. Then/is uniformly continuous; that is, given ε > 0, there is a δ > 0 such that for all x, у e X such that \x—y\ < δ we have | f(x) —f(y)\ < ε. Supposing not, we can derive a contradiction as follows. There is an ε0 such that for every δ, " \x — у | < δ implies \f(x) — f(y) | < ε0 " is not true. Taking δ = 1 /и, there are x„, y„ with \x„ - y„ | < 1 /n but | f(x„) - /(y„) \>ε0. Since X is closed and bounded, these sequences have convergent subsequences: {*'„}, {y„'}. Show that lim x'„ = lim /„ but |/(lim x'„) — /(lim /„)| > ε0, a contradiction. 81. Let L be a linear functional on R" and choose v0 such that ||»0|| = 1 and L(v0)=max{L(v): ||»|| = 1} Show that for every υ e R", L(v) =L(v0) <«, v0}. 82. Let / be an integrable function on the rectangle [a, b]. Let R, be rectangle [a, b + r(b — a)], for 0 < / <, 1. Verify that / is integrable on each rectangle R,, and define F(t) = JR, / Show that / is continuous. Is /differentiable? 83. Let Q = {pjq:p, q integers with 0 <,p < q). 
β is a subset of the unit interval [0,1] which is not measurable. For surely JxQ=0, and if Ri и · · · и R„ => Q, then also Λ υ · · · и R„ => [0, 1], so J 2 Хщ ^ 1. and thusfXo = l. 84. Let / be an integrable nonnegative function defined on the domain B<=R2 and consider D = {(x, y, z) e R3; 0 <, ζ<>f(x, y); (x,y)eB}. Verify that Vol(Z>) = J„ /.
2.12 Summary 225 85. Suppose that/is a continuous decreasing real-valued function of a real variable and lim f(x) = 0. Then f? f(x) sin χ dx converges (compare JC-»0O this with Leibniz's theorem for series). 86. Suppose that/is a real-valued function defined on R". We say that /(x)^+oo as ||x||->oo if, for every Μ there is а К such that /(χ) ^ Μ whenever ||χ||>Κ Show that if/is a real-valued continuous function on R" such that/(x) -* + oo as ||x|] -* oo, then /attains a minimum at some point. 87. Define /(x)^0 as llxll^oo in a way suggested by the definition in the above problem. Show that if a continuous function on R" has this property, then it attains both a maximum and a minimum on R". 88. Suppose / is a real-valued function which has continuous partial derivatives in the ball {x e R": ||x|| < 1}. Show that the function g(x) = f f(tx)dt J 0 has the same properties, and find V#. 89. Let I2 be the space of sequences {c„} of real numbers such that Σ μ2 < °° n = l Because of the result in Problem 72 (Schwarz's inequality), if {c„} and {</„} are in I2, then <{c}, {</„}> = fc„d„ n = l converges. Show that I2 is a Euclidean vector space with that inner product. 90. The space of continuous functions on the unit interval can be made into a Euclidean vector space in this way: </,*>=f /(tMt)dt Corresponding to this inner product is a notion of length which we denote by || · ||2 so as to distinguish it from the modulus || · ||» introduced in the
226 2. Notions of Calculus text. Show that this length is deficient in these respects: (a) We can have \\f„\\i -*0 without having ||/„||„ ->0. (b) We can have a sequence {f„} of continuous functions which is a Cauchy sequence in the sense of the length || · ||2, but which does not converge to a continuous function. On the other hand, show that (c) if ll/JU-^O, then ||/„||2->0also. 91. Suppose L: C[0, 1]->Λ is a linear function. Show that L is continuous if and only if there is an Μ > 0 such that \L(f)\^M\\f\L 92. Show that there is a unique differentiable function /0 such that f'o(x) = (fo(x))2 for all л: and /0(0) = * Do it by applying the fixed point theorem to the function Τ defined below ontfaeset{/eC[*,*]: ||/H«£»: x Tf(x)= f P(t)dt+i •Ό 93. We can talk of open and closed sets, and convergence in the space M" of (n x n) matrices, merely by considering them as vectors in R" . Doing so verify these statements: (a) The set G of invertible (и х и) matrices is open. (b) The set of triangular matrices is closed. (c) The function A-+A2 is continuous. (d) If ρ is any polynomial in one variable the function T^p(T) is continuous. (e) lim (x/ni) £?= 0 (1/и!)Г" exists for all Те L(R", Rm). 94. Suppose g is a continuous real-valued function on the interval [-a, a]. Show that the implication f g(t)f(t)dt = 0 J-a for all fe Fimplies g = 0 holds whenever Fis any one of these classes: (a) F=C([-a,a]). (b) F=Cl([-a,aJ). (c) Fis the collection of all polynomials. (d) Fis the collection {χι: /a sublnterval of [-a, a]}. (e) Fis the collection of all continuously differentiable functions such that Д-a) = Да) =0.
Chapter 3

ORDINARY DIFFERENTIAL EQUATIONS

In these next three chapters we shall elaborate on the study of the differential calculus of one variable and its application to geometry and classical (Newtonian) physics. The motivating problem throughout is the central problem of the subject of differential equations: to find a function on the basis of given information on its derivatives. Observed phenomena in the sciences seem always to involve rates of change. For example, it is observed that the acceleration of a falling body is a constant independent of mass, height, or velocity; the progress of a chemical reaction slows down as it proceeds, depending on the quantities of the chemicals involved. These observations, when made precise, appear as differential equations. In order to predict (the time it takes for the body to fall a given height, the amount of new chemicals produced before the reaction stops), the function described by the differential equation must be found.

The first two sections of the present chapter are devoted to the description of the basic concepts involved; in the first we shall discuss the differentiation of vector-valued functions, and the second is devoted to approximation and Taylor's formula. We also include a brief excursion into the computation of maxima and minima of functions of several variables subject to constraints by the technique of Lagrange multipliers.

The main theoretical tool in this study is Picard's theorem, which gives conditions under which a differential equation has a solution and only one solution. This theorem essentially tells us what a well-posed problem is, and asserts that well-posed problems are always solvable. The question
of actually producing a formula for the solution, or an algorithm for computing approximate values for the solution, is another matter altogether. Several techniques will be presented in this chapter and Chapter 5 (successive approximations, series expansions); there are many more very efficient computational techniques which we shall not develop here.

It will become clear that the subject of ordinary differential equations has a lot to do with the study of curves (paths of motion). Thus in the next chapter we shall investigate the geometry of curves and its relation with the subject of differential equations.

3.1 Differentiation

The first important step in the study of differential equations is to consider vector-valued functions of a real variable as well as real-valued functions. This is the appropriate setting for many problems involving differential equations, and is particularly relevant when studying equations involving derivatives of order greater than one. In the first sections we shall consider differentiable vector-valued functions of a real variable and introduce a special technique for approximating values: Taylor's expansion.

Definition 1. Let $x_0 \in R$, and suppose $\mathbf{f}$ is an $R^n$-valued function defined in a neighborhood of $x_0$. $\mathbf{f}$ is differentiable at $x_0$ if
$$\lim_{t\to 0} \frac{\mathbf{f}(x_0 + t) - \mathbf{f}(x_0)}{t}$$
exists. The limit is called the derivative of $\mathbf{f}$ at $x_0$ and is denoted by $\mathbf{f}'(x_0)$. If $\mathbf{f}$ is defined in an open set $U$, we say $\mathbf{f}$ is continuously differentiable (written $\mathbf{f}$ is $C^1$) in $U$ if $[\mathbf{f}(x + t) - \mathbf{f}(x)]/t$ converges for all $x$ in $U$ to a continuous function $\mathbf{f}'$ as $t \to 0$.

That this definition is not so far from the derivative encountered in calculus is demonstrated by the following assertion.

Proposition 1. Let $\mathbf{f}$ be an $R^n$-valued function defined in a neighborhood of $x_0 \in R$. Write $\mathbf{f} = (f_1, \ldots, f_n)$ in coordinates. $\mathbf{f}$ is differentiable at $x_0$ if and only if $f_1, \ldots, f_n$ are differentiable at $x_0$. Further, $\mathbf{f}'(x_0) = (f_1'(x_0), \ldots, f_n'(x_0))$.
Proof.
$$\frac{\mathbf{f}(x_0 + t) - \mathbf{f}(x_0)}{t} = \left(\frac{f_1(x_0 + t) - f_1(x_0)}{t}, \ldots, \frac{f_n(x_0 + t) - f_n(x_0)}{t}\right)$$
The limit on the left as $t \to 0$ exists if and only if all the limits on the right exist (Proposition 10 in Chapter 2), and equality holds also in the limit. That is all that Proposition 1 says.

Now if $\mathbf{f}$ is a differentiable function on an interval taking values in $R^n$, its image is a curve in $R^n$. The derivative $\mathbf{f}'(x_0)$ is a vector in $R^n$ and points in the direction of motion of the curve (Figure 3.1). That is, the line through $\mathbf{f}(x_0)$ and parallel to $\mathbf{f}'(x_0)$ is the limiting position of the line through $\mathbf{f}(x_0)$ and a nearby point $\mathbf{f}(x_0 + t)$. For that line is parallel to $t^{-1}(\mathbf{f}(x_0 + t) - \mathbf{f}(x_0))$, and by definition this vector has $\mathbf{f}'(x_0)$ as limit as $t \to 0$. This line through $\mathbf{f}(x_0)$ and parallel to $\mathbf{f}'(x_0)$ is called the tangent line of the curve at $\mathbf{f}(x_0)$.

From Proposition 1 it easily follows that if $\mathbf{f}$, $\mathbf{g}$ are differentiable, so is $\mathbf{f} + \mathbf{g}$, and $(\mathbf{f} + \mathbf{g})'(x_0) = \mathbf{f}'(x_0) + \mathbf{g}'(x_0)$. The chain rule also follows easily:

Proposition 2. (Chain Rule I) Let $g$ be a real-valued function defined in a neighborhood of $x_0$ in $R$, and differentiable at $x_0$. Suppose $\mathbf{f}$ is an $R^n$-valued function which is differentiable at $g(x_0)$ (see Figure 3.2). Then $\mathbf{f} \circ g$ is differentiable at $x_0$ and
$$(\mathbf{f} \circ g)'(x_0) = g'(x_0)\,\mathbf{f}'(g(x_0))$$
(We have written $g'(x_0)$ before $\mathbf{f}'(g(x_0))$ as this is the customary way of writing the product of a scalar and a vector.)

Figure 3.1
Figure 3.2

This is of course true, just because it is true in each coordinate, by the ordinary chain rule. Thus if $\mathbf{f} = (f_1, \ldots, f_n)$, then $\mathbf{f} \circ g = (f_1 \circ g, \ldots, f_n \circ g)$, so
$$(\mathbf{f} \circ g)' = ((f_1 \circ g)', \ldots, (f_n \circ g)')$$

Example

1. Let $\mathbf{f}(x) = (x, x^2, x^3)$, $g(t) = \sin t$. Then
$$(\mathbf{f} \circ g)(t) = (\sin t, \sin^2 t, \sin^3 t)$$
$$(\mathbf{f} \circ g)'(t) = \cos t\,(1, 2\sin t, 3\sin^2 t)$$

Now, there is also a chain rule for taking a real-valued function of a vector-valued function (Figure 3.3). Suppose now $\mathbf{g}$ is a continuously differentiable function defined on an interval $I$ taking values in a domain $D$ in $R^n$. Suppose $f$ is a real-valued function defined on $D$ which has all partial derivatives continuous. Then $f \circ \mathbf{g}$ is a real-valued function on the interval $I$. For clarity of exposition, let us take the case $n = 2$. We can write $\mathbf{g}$ in coordinates as $\mathbf{g}(x) = (g_1(x), g_2(x))$. Then
$$f(\mathbf{g}(x_0 + t)) - f(\mathbf{g}(x_0)) = f(g_1(x_0 + t), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0))$$
$$= \big[f(g_1(x_0 + t), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0 + t))\big] + \big[f(g_1(x_0), g_2(x_0 + t)) - f(g_1(x_0), g_2(x_0))\big] \qquad (3.1)$$
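Now the function $s \mapsto f(s, g_2(x_0 + t))$ is differentiable (it is the restriction of $f$ to the line $y = g_2(x_0 + t)$). By the mean value theorem, the first difference is
$$\frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,[g_1(x_0 + t) - g_1(x_0)]$$
for some $\xi_1$ between $g_1(x_0 + t)$ and $g_1(x_0)$. Now applying the mean value theorem to $g_1$ we see that
$$g_1(x_0 + t) - g_1(x_0) = g_1'(\eta_1)\,t$$
for some $\eta_1$ between $x_0 + t$ and $x_0$. Thus the first difference in (3.1) is
$$\frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,g_1'(\eta_1)\,t$$
where $\xi_1$ lies between $g_1(x_0)$ and $g_1(x_0 + t)$, and $\eta_1$ lies between $x_0$ and $x_0 + t$. Similarly, the second difference is
$$\frac{\partial f}{\partial y}(g_1(x_0), \xi_2)\,g_2'(\eta_2)\,t$$
where $\xi_2$ lies between $g_2(x_0)$ and $g_2(x_0 + t)$, and $\eta_2$ lies between $x_0$ and $x_0 + t$.

Figure 3.3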
Thus, we may rewrite (3.1) as
$$\frac{f(\mathbf{g}(x_0 + t)) - f(\mathbf{g}(x_0))}{t} = \frac{\partial f}{\partial x}(\xi_1, g_2(x_0 + t))\,g_1'(\eta_1) + \frac{\partial f}{\partial y}(g_1(x_0), \xi_2)\,g_2'(\eta_2) \qquad (3.2)$$
Taking the limit as $t \to 0$, we have on the right $\xi_1 \to g_1(x_0)$ (since $g_1$ is continuous), and $g_2(x_0 + t)$, $\xi_2$ both tend to $g_2(x_0)$ since $g_2$ is continuous. Also $\eta_1$, $\eta_2$ both tend to $x_0$ since they lie between $x_0$ and $x_0 + t$. Since all the derivatives in (3.2) are continuous, the limit exists, so
$$\frac{d(f \circ \mathbf{g})}{dx}(x_0) = \frac{\partial f}{\partial x}(\mathbf{g}(x_0))\,g_1'(x_0) + \frac{\partial f}{\partial y}(\mathbf{g}(x_0))\,g_2'(x_0) \qquad (3.3)$$
Notice that, using the directional derivative notation, (3.3) becomes
$$\frac{d(f \circ \mathbf{g})}{dx}(x_0) = df(\mathbf{g}(x_0), \mathbf{g}'(x_0)) = \langle \nabla f(\mathbf{g}(x_0)), \mathbf{g}'(x_0) \rangle \qquad (3.4)$$
Thus the derivative of $f$ along the curve $\mathbf{x} = \mathbf{g}(x)$ is the same as its directional derivative along the tangent direction to the curve (Figure 3.4). This is true not only in $R^2$, but in all $R^n$. The derivation is of course the same, only with the notational complication of many more variables.

Figure 3.4

Thus
Proposition 3. (Chain Rule II) Let $\mathbf{g}$ be a continuously differentiable function of a real variable, taking values in a domain $D$ in $R^n$, and suppose $f$ is a continuously differentiable real-valued function defined on $D$. Then $f \circ \mathbf{g}$ is a differentiable function and
$$(f \circ \mathbf{g})'(t) = df(\mathbf{g}(t), \mathbf{g}'(t))$$

Examples

2. Let $\mathbf{g}(t) = (\sin t, \cos t)$, $f(x, y) = xy^2$. Then
$$df((x, y), (a, b)) = \frac{\partial f}{\partial x}a + \frac{\partial f}{\partial y}b = y^2 a + 2xy\,b$$
$$\mathbf{g}'(t) = (\cos t, -\sin t)$$
$$(f \circ \mathbf{g})'(t) = df(\mathbf{g}(t), \mathbf{g}'(t)) = \cos^2 t \cos t + 2\cos t \sin t\,(-\sin t) = \cos t\,(\cos^2 t - 2\sin^2 t)$$
We can, of course, verify this by direct substitution, since $(f \circ \mathbf{g})(t) = \sin t \cos^2 t$.

3. Let $\mathbf{g}(t) = (t, t^2, 2t)$, $f(x, y, z) = xy + \log z$.
$$df((x, y, z), (a, b, c)) = ya + xb + \frac{c}{z}$$
$$\mathbf{g}'(t) = (1, 2t, 2)$$
$$(f \circ \mathbf{g})'(t) = df((t, t^2, 2t), (1, 2t, 2)) = t^2 + 2t^2 + \frac{2}{2t} = 3t^2 + \frac{1}{t}$$

4. Suppose $f$, $\mathbf{g}$ are given as in Proposition 3, and $f \circ \mathbf{g}$ has a maximum at $t_0$. Then $\nabla f(\mathbf{g}(t_0))$ is orthogonal to $\mathbf{g}'(t_0)$. For $(f \circ \mathbf{g})'(t_0) = 0$, but
$$(f \circ \mathbf{g})'(t_0) = df(\mathbf{g}(t_0), \mathbf{g}'(t_0)) = \langle \nabla f(\mathbf{g}(t_0)), \mathbf{g}'(t_0) \rangle$$
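Chain Rule II lends itself to a quick numerical sanity check. The following is a minimal sketch, not part of the original text: it takes the curve and function of Example 2, evaluates the inner product of the gradient with the tangent vector, and compares it with a central difference quotient of the composite. The helper names (`chain_rule_derivative`, `central_difference`) and the step size `h` are illustrative assumptions.

```python
import math

def g(t):
    """The curve g(t) = (sin t, cos t) from Example 2."""
    return (math.sin(t), math.cos(t))

def g_prime(t):
    """Tangent vector g'(t) = (cos t, -sin t)."""
    return (math.cos(t), -math.sin(t))

def f(x, y):
    """The real-valued function f(x, y) = x y^2."""
    return x * y ** 2

def grad_f(x, y):
    """Gradient of f: (y^2, 2xy)."""
    return (y ** 2, 2 * x * y)

def chain_rule_derivative(t):
    """Inner product <grad f(g(t)), g'(t)>, as in Proposition 3."""
    fx, fy = grad_f(*g(t))
    dx, dy = g_prime(t)
    return fx * dx + fy * dy

def central_difference(t, h=1e-6):
    """Numerical derivative of the composite t -> f(g(t)); h is an assumed step."""
    return (f(*g(t + h)) - f(*g(t - h))) / (2 * h)

for t in (0.0, 0.3, 1.0, 2.5):
    print(t, chain_rule_derivative(t), central_difference(t))
```

The two printed columns should agree to several decimal places; both also agree with the closed form $\cos t\,(\cos^2 t - 2\sin^2 t)$ obtained by differentiating $\sin t \cos^2 t$ directly.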
Lagrange Multipliers

This last example serves to provide a method for finding maxima (or minima) of functions subject to certain constraints. This is the process of Lagrange multipliers. Suppose $f$, $g$ are differentiable functions in a certain domain $D$ in $R^n$. We consider $f$ as the function we are studying and $g(\mathbf{x}) = 0$ the constraint. Suppose $f$ has a maximum on $g(\mathbf{x}) = 0$ at $\mathbf{x}_0$. Thus, if $\Gamma$ is a curve in the set $\{g(\mathbf{x}) = 0\}$ going through $\mathbf{x}_0$, then $\nabla f(\mathbf{x}_0)$ is orthogonal to the tangent line to $\Gamma$ at $\mathbf{x}_0$. For if $\Gamma$ is the image of a function $\varphi$ of a real variable, and $\varphi(t_0) = \mathbf{x}_0$, then as in Example 4, $\langle \nabla f(\mathbf{x}_0), \varphi'(t_0) \rangle = 0$, and $\varphi'(t_0)$ spans the tangent line to $\Gamma$ at $\mathbf{x}_0$. Now also $g \circ \varphi$ is constant, so $\langle \nabla g(\mathbf{x}_0), \varphi'(t_0) \rangle = 0$. Thus at the maximum point $\mathbf{x}_0$ of $f$ on $\{g(\mathbf{x}) = 0\}$, $\nabla f(\mathbf{x}_0)$ and $\nabla g(\mathbf{x}_0)$ are both orthogonal to all curves through $\mathbf{x}_0$ subject to the constraint $g(\mathbf{x}) = 0$. If there are enough such curves, say, so that the set of tangent vectors fills out a subspace of $R^n$ of dimension $n - 1$, then $\nabla f(\mathbf{x}_0)$ and $\nabla g(\mathbf{x}_0)$ must be collinear. We will not worry here whether there are enough of these curves, but take it for granted. After all, we are not here studying the theory, but only seeking a technique which will provide candidates for a maximum point.

We can state this principle: if $\mathbf{x}_0$ is a maximum (or minimum) point for $f$ subject to the constraint $g(\mathbf{x}) = 0$, then there is a $\lambda$ such that
$$\nabla f(\mathbf{x}_0) = \lambda \nabla g(\mathbf{x}_0)$$
Thus we can find possible $\mathbf{x}_0$ by solving the system of equations
$$\nabla f(\mathbf{x}) = \lambda \nabla g(\mathbf{x}), \qquad g(\mathbf{x}) = 0 \qquad (3.5)$$
for $\mathbf{x}$, $\lambda$.

Examples

5. We shall find the maximum value of $xyz$ on the unit sphere $x^2 + y^2 + z^2 = 1$. Let $f(\mathbf{x}) = xyz$, $g(\mathbf{x}) = x^2 + y^2 + z^2 - 1$.
$$\nabla f(\mathbf{x}) = (yz, xz, xy) \qquad \nabla g(\mathbf{x}) = (2x, 2y, 2z)$$
Thus we must solve
$$x^2 + y^2 + z^2 = 1, \qquad (yz, xz, xy) = 2\lambda(x, y, z) \qquad (3.6)$$
Eliminating $\lambda$ from Equations (3.6), we obtain
$$\frac{yz}{x} = \frac{xz}{y} = \frac{xy}{z} \qquad (3.7)$$
This can be written as
$$z = 0 \text{ or } x = 0 \text{ or } y = 0 \text{ or } \frac{y}{x} = \frac{x}{y},\ \frac{z}{y} = \frac{y}{z} \qquad (3.8)$$
Thus either one of the coordinates is zero or
$$x^2 = y^2 = z^2$$
Near any point where one of the coordinates is zero, $f$ changes sign, so these points are disqualified. This leaves any one of the points $(1/\sqrt{3})(\pm 1, \pm 1, \pm 1)$. The value of $f$ at any one of these points is $\pm 3^{-3/2}$, thus $3^{-3/2}$ is the maximum.

6. Find the point on the curve $2(x - 1)^2 + y^2 = 4$ which is closest to the origin. Here $g(x, y) = 2(x - 1)^2 + y^2 - 4$ and $f(x, y) = x^2 + y^2$. Thus
$$\nabla f = (2x, 2y) \qquad \nabla g = (4(x - 1), 2y)$$
The equations become
$$x = 2\lambda(x - 1), \qquad y = \lambda y, \qquad 2(x - 1)^2 + y^2 = 4$$
From the second equation, either $y = 0$ or $\lambda = 1$. The second case gives $x = 2$. Thus, the candidates are $(1 \pm \sqrt{2}, 0)$, $(2, \pm\sqrt{2})$. The values of $f$ at the first pair are $(1 - \sqrt{2})^2$, $(1 + \sqrt{2})^2$; and at the second the value of $f$ is 6. Clearly, the minimum distance is $|1 - \sqrt{2}| = \sqrt{2} - 1$, and the maximum value of $f$ is 6, so the maximum distance is $\sqrt{6}$ (see Figure 3.5).

7. Find the point on the intersection of the two surfaces
$$xyz = 1 \qquad x^2 + y^2 + 2z^2 = 8$$
which is closest to the origin. In this problem we have two constraints, but we can see through the technique. The tangent vector to the curve of intersection is orthogonal to the gradients of both constraining functions, and at the extremum point $\nabla(x^2 + y^2 + z^2)$ is orthogonal to the curve. Thus this gradient must be coplanar with the gradients of the constraining functions. Let $f(\mathbf{x}) = x^2 + y^2 + z^2$, $g(\mathbf{x}) = xyz - 1$, $h(\mathbf{x}) = x^2 + y^2 + 2z^2 - 8$. Then $\nabla f = 2(x, y, z)$, $\nabla g = (yz, xz, xy)$, $\nabla h = 2(x, y, 2z)$. We must solve these five equations for $x, y, z, \lambda, \mu$:
$$2(x, y, z) = \lambda(yz, xz, xy) + 2\mu(x, y, 2z)$$
$$xyz = 1$$
$$x^2 + y^2 + 2z^2 = 8$$

Figure 3.5

8. Let $M = (a^i_j)$ be a symmetric $n \times n$ matrix. That is, $a^i_j = a^j_i$ for all $i$ and $j$. If $T$ is the transformation on $R^n$ defined by $M$,
$$\nabla(\langle T\mathbf{x}, \mathbf{x} \rangle) = 2T\mathbf{x}$$
We show this by computation:
$$\langle T\mathbf{x}, \mathbf{x} \rangle = \sum_{i,j} a^i_j x^i x^j \qquad (3.9)$$
The $k$th component of $\nabla(\langle T\mathbf{x}, \mathbf{x} \rangle)$ is found by differentiating (3.9) with respect to $x^k$; this gives
$$\sum_i a^i_k x^i + \sum_j a^k_j x^j$$
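But since $M$ is symmetric, this is the same as $\sum_i a^k_i x^i + \sum_j a^k_j x^j = 2(T\mathbf{x})^k$. Then $\nabla(\langle T\mathbf{x}, \mathbf{x} \rangle) = 2T\mathbf{x}$ is established.

Now, the function $f(\mathbf{x}) = \langle T\mathbf{x}, \mathbf{x} \rangle$ must attain a maximum on the unit sphere, say at $\mathbf{x}_0$. The Lagrange multiplier procedure tells us that there is a $\lambda$ such that
$$\nabla(\langle T\mathbf{x}, \mathbf{x} \rangle)\big|_{\mathbf{x} = \mathbf{x}_0} = \lambda\,\nabla\Big(\sum_i (x^i)^2 - 1\Big)\Big|_{\mathbf{x} = \mathbf{x}_0}$$
or
$$2T\mathbf{x}_0 = 2\lambda\mathbf{x}_0$$
Thus the transformation $T$ has an eigenvector, namely that $\mathbf{x}_0$ on the unit sphere which maximizes the function $\langle T\mathbf{x}, \mathbf{x} \rangle$.

We can continue this idea in order to prove that a transformation given by a symmetric matrix has an orthogonal basis of eigenvectors. For, let $\mathbf{x}_1$ be the eigenvector found as in Example 8, with eigenvalue $\lambda_1$. Now maximize $\langle T\mathbf{x}, \mathbf{x} \rangle$ subject to the constraints $\langle \mathbf{x}, \mathbf{x} \rangle = 1$, $\langle \mathbf{x}, \mathbf{x}_1 \rangle = 0$. If $\mathbf{x}_2$ is the maximum point subject to these constraints, we have $\lambda_2$, $\mu_2$ such that
$$\|\mathbf{x}_2\| = 1, \qquad \langle \mathbf{x}_2, \mathbf{x}_1 \rangle = 0, \qquad 2T\mathbf{x}_2 = 2\lambda_2\mathbf{x}_2 + \mu_2\,\nabla(\langle \mathbf{x}, \mathbf{x}_1 \rangle) = 2\lambda_2\mathbf{x}_2 + \mu_2\mathbf{x}_1$$
Taking the inner product of the third equation with $\mathbf{x}_1$ and using the symmetry of $T$, we get $\langle T\mathbf{x}_2, \mathbf{x}_1 \rangle = \langle \mathbf{x}_2, T\mathbf{x}_1 \rangle = \lambda_1\langle \mathbf{x}_2, \mathbf{x}_1 \rangle = 0$, so $\mu_2 = 0$. Thus, by the first two equations, $\mathbf{x}_2$ is nonzero and orthogonal to $\mathbf{x}_1$, and by the third, $\mathbf{x}_2$ is an eigenvector of $T$. Now proceed to the constraints $\langle \mathbf{x}, \mathbf{x} \rangle = 1$, $\langle \mathbf{x}, \mathbf{x}_1 \rangle = 0$, $\langle \mathbf{x}, \mathbf{x}_2 \rangle = 0$. The same technique works to produce a third eigenvector. We can go on until we have found $n$ independent eigenvectors.

Examples

9. Let
$$M = \begin{pmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{pmatrix}$$
and find the eigenvectors of $M$. $\lambda$ is an eigenvalue of $M$ if and only if there is a nonzero vector $\mathbf{x}$ such that $(M - \lambda I)\mathbf{x} = 0$. We know the necessary and sufficient condition for that: $\det(M - \lambda I) = 0$. Thus the eigenvalues of $M$ are the roots of $\det(M - \lambda I) = 0$. Now
$$M - \lambda I = \begin{pmatrix} 2 - \lambda & 1 & 1 \\ 1 & 2 - \lambda & 1 \\ 1 & 1 & 2 - \lambda \end{pmatrix}$$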
After a computation we find that
$$\det(M - \lambda I) = (2 - \lambda)^3 - 3(2 - \lambda) + 2 = -(\lambda - 1)^2(\lambda - 4)$$
Thus the eigenvalues are 1, 4. We find the corresponding eigenvectors by solving the equations $(M - I)\mathbf{x} = 0$, $(M - 4I)\mathbf{x} = 0$ for nonzero vectors.

eigenvalue 1:
$$M - I = \begin{pmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{pmatrix}$$
corresponding eigenvectors: $(1, -1, 0)$, $(0, -1, 1)$. (Any two independent vectors such that $v_1 + v_2 + v_3 = 0$ will do.)

eigenvalue 4:
$$M - 4I = \begin{pmatrix} -2 & 1 & 1 \\ 1 & -2 & 1 \\ 1 & 1 & -2 \end{pmatrix}$$
The sum of the three rows is zero, so they are dependent. The first and second are independent, so the corresponding eigenvector lies on the line
$$-2x + y + z = 0 \qquad x - 2y + z = 0$$
Such a vector is $(1, 1, 1)$. Thus the eigenvectors of $M$ are $(1, -1, 0)$, $(0, -1, 1)$ with eigenvalue 1, and $(1, 1, 1)$ with eigenvalue 4.

10. Find the eigenvalues of
$$M = \begin{pmatrix} 2 & 3 \\ 3 & 2 \end{pmatrix}$$
Here $\det(M - \lambda I) = (2 - \lambda)^2 - 9$, which has the roots $-1$, 5.

eigenvalue $-1$: $M + I = \begin{pmatrix} 3 & 3 \\ 3 & 3 \end{pmatrix}$ kills the vector $(1, -1)$.

eigenvalue 5: $M - 5I = \begin{pmatrix} -3 & 3 \\ 3 & -3 \end{pmatrix}$ kills the vector $(1, 1)$.
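The eigenpairs found in Examples 9 and 10 can be confirmed by direct multiplication. The sketch below is an illustration added here, not the author's method. It assumes the matrices are $M = (2, 1, 1;\ 1, 2, 1;\ 1, 1, 2)$ and $M = (2, 3;\ 3, 2)$, which is what the displayed $M - I$, $M + I$ and the characteristic polynomials imply; the helper names `mat_vec`, `dot` and `is_eigenpair` are hypothetical.

```python
def mat_vec(m, v):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in m]

def dot(u, v):
    """Euclidean inner product <u, v>."""
    return sum(a * b for a, b in zip(u, v))

def is_eigenpair(m, lam, v, tol=1e-12):
    """True when v is nonzero and M v = lam v holds in every coordinate."""
    mv = mat_vec(m, v)
    return any(v) and all(abs(a - lam * b) <= tol for a, b in zip(mv, v))

# Matrices inferred from the worked examples (an assumption, see the lead-in).
M9 = [[2, 1, 1], [1, 2, 1], [1, 1, 2]]
M10 = [[2, 3], [3, 2]]

pairs = [
    (M9, 1, [1, -1, 0]),
    (M9, 1, [0, -1, 1]),
    (M9, 4, [1, 1, 1]),
    (M10, -1, [1, -1]),
    (M10, 5, [1, 1]),
]
for m, lam, v in pairs:
    print(lam, v, is_eigenpair(m, lam, v))

# Eigenvectors belonging to different eigenvalues of a symmetric matrix
# are orthogonal, as the Lagrange multiplier argument predicts.
print(dot([1, 1, 1], [1, -1, 0]), dot([1, 1, 1], [0, -1, 1]), dot([1, -1], [1, 1]))
```

Note that the two eigenvectors listed for the eigenvalue 1 are independent but not orthogonal to each other; an orthogonal pair can be chosen inside the plane $v_1 + v_2 + v_3 = 0$.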
• EXERCISES

1. Differentiate these functions and graph the curve defined by the function.
(a) $f(t) = e^{ct}$, $c$ a complex number.
(b) $\mathbf{f}(t) = (\cos t, \sin t, t)$.
(c) $\mathbf{f}(t) = (a\cos t, b\sin t)$.
(d) f(/)=(/V3).
(e) $\mathbf{f}(t) = (t, t^2, t^3)$.
(f) $\mathbf{f}(t) = (\sin t, \cos t, 0)$.

2. What is the length of $\mathbf{f}'(t)$ in each of Exercises 1(a)-(f)? What is the angle between $\mathbf{f}'(t)$ and $\mathbf{f}''(t)$?

3. At which pairs of points are the tangent lines to the curves (a) (c = {), and (c) of Exercise 1 parallel?

4. At which pairs of points are the tangent lines to Exercises 1(b), (f) parallel?

5. Find the maximum of $xy$ on the ellipse $ax^2 + by^2 = 1$.

6. Find the minimum of $x + y$ on the curve $xy = 1$ in the first quadrant.

7. Find the two points on the curves $y = x^2$ and $xy = -1$ which are closest.

8. Minimize $x^2 + y^2 + z^2$ on the ellipsoid $ax^2 + by^2 + cz^2 = 1$.

9. Given two straight lines $L^1$ and $L^2$ in space how would you try to find the points $\mathbf{p}_1 \in L^1$, $\mathbf{p}_2 \in L^2$ which are closest (i.e., minimize $\|\mathbf{p} - \mathbf{q}\|$ for $\mathbf{p} \in L^1$, $\mathbf{q} \in L^2$)?

10. Find the eigenvalues and eigenvectors of these matrices: « {I J> « ("I i)-

11. Find the eigenvalues and eigenvectors of these matrices: 1 0 —1\ /2 1 3\ (a) I 0 1 0). (b) |l 0 3 -10 3

• PROBLEMS

1. Let $\mathbf{f}$, $\mathbf{g}$ be differentiable $R^n$-valued functions defined on an interval $I$.
(a) Show that the inner product $h = \langle \mathbf{f}, \mathbf{g} \rangle$ is differentiable and $h' = \langle \mathbf{f}', \mathbf{g} \rangle + \langle \mathbf{f}, \mathbf{g}' \rangle$.
(b) Show that $\|\mathbf{f}\| = \langle \mathbf{f}, \mathbf{f} \rangle^{1/2}$ is constant if and only if $\mathbf{f}(x)$, $\mathbf{f}'(x)$ are orthogonal for all $x$.
(c) Give a condition for $\mathbf{f}$ to lie on a straight line.

2. Find the point on the intersection of these two surfaces
$$a^2x^2 + b^2y^2 + c^2z^2 = 1 \qquad x^2 + y^2 = 1$$
which is closest to the origin.
3. A rectangular box of maximum volume is to be constructed, with sides parallel to the coordinate planes, one vertex at the origin and the diagonally opposite vertex on the plane $ax + by + cz = 1$. Find the volume of that box.

4. A community consumes water at the rate of $\sin^2(2\pi t/24)$ gallons per hour. They wish to build a storage tank of capacity $Q$ with a pump of rate $w$ gallons per hour, so that the community will never run out of water. The cost is $Q + kw$. Minimize this cost for them.

5. Show that if $f$ is any differentiable function on $R^3$, there are at least two points $\mathbf{x}$ on the unit sphere at which $\nabla f(\mathbf{x})$ is parallel to $\mathbf{x}$.

3.2 Taylor's Formula

Higher order derivatives appear for vector-valued functions just as they do in the usual one-variable calculus.

Definition 2. Let $\mathbf{f}$ be an $R^n$-valued function defined on an open set $U \subset R$. $\mathbf{f}$ is $k$-times differentiable on $U$ if there exist differentiable functions $\mathbf{g}_1, \ldots, \mathbf{g}_k$ defined on $U$ such that
$$\mathbf{g}_1 = \mathbf{f}', \quad \mathbf{g}_2 = \mathbf{g}_1', \quad \ldots, \quad \mathbf{g}_k = \mathbf{g}_{k-1}'$$
We will denote $\mathbf{g}_k$ by $\mathbf{f}^{(k)}$. $\mathbf{f}$ is $k$-times continuously differentiable on $U$ (written $\mathbf{f}$ is $C^k(U)$) if $\mathbf{f}^{(k)}$ is continuous on $U$.

The following proposition is an obvious extension of Proposition 1 by induction.

Proposition 4. Let $\mathbf{f} = (f_1, \ldots, f_n)$ be an $R^n$-valued function defined on $U$. $\mathbf{f}$ is $k$-times (continuously) differentiable on $U$ if $f_1, \ldots, f_n$ are each $k$-times (continuously) differentiable on $U$. Further, $\mathbf{f}^{(k)} = (f_1^{(k)}, \ldots, f_n^{(k)})$.

Knowing that a given function is differentiable at a particular point can be a great aid in computing approximations to its values at nearby points. These considerations in turn lead to a better understanding of the notion of differentiability.

Suppose that $f$ is a differentiable $R^n$-valued function defined in a neighborhood of 0. By definition the difference quotient,
$$\frac{1}{t}[f(t) - f(0)]$$
converges to $f'(0)$. In other words, the function $\varepsilon(t)$ defined for $t \ne 0$ by
$$\varepsilon(t) = \frac{1}{t}[f(t) - f(0)] - f'(0)$$
has limit 0 as $t \to 0$. Rewriting this,
$$f(t) = f(0) + f'(0)t + \varepsilon(t) \cdot t \qquad (3.10)$$
where $\lim_{t\to 0} \varepsilon(t) = 0$. Thus a good approximation to the value $f(t)$ would be $f(0) + f'(0)t$; how good depends of course on the function $\varepsilon(t)$. But since the difference between this approximation and $f(t)$ is $\varepsilon(t) \cdot t$, it suffices to know just the maximum of $|\varepsilon(t)|$. We give an illustration of how to go about determining this.

Suppose $f$ is a $C^2$ function defined in an interval $[-R, R]$. Let
$$M = \sup\{|f''(x)| : |x| \le R\}$$
Then
$$|f(t) - (f(0) + f'(0)t)| \le MR|t| \quad \text{for } t \in [-R, R] \qquad (3.11)$$
This follows easily from the mean value theorem. There is a $\xi$ between $t$ and 0 such that
$$f(t) - f(0) = f'(\xi)\,t$$
Further, there is an $\eta$ between $\xi$ and 0 such that $f'(\xi) - f'(0) = f''(\eta)\,\xi$. Thus, for a given $t \in [-R, R]$,
$$\varepsilon(t) = \frac{1}{t}[f(t) - f(0)] - f'(0) = f'(\xi) - f'(0) = f''(\eta)\,\xi, \qquad \eta, \xi \in [-R, R]$$
Thus $|\varepsilon(t)| \le MR$. Inequality (3.11) follows from (3.10) and this inequality.

Now, although it could be very difficult to adequately describe the function $\varepsilon(t)$, the maximum $M$ is much easier to obtain. In practice, $f''$ is monotonic near 0, so we need only look at its values at the end points $-R$ and $R$ to obtain this estimate.

We shall now generalize this argument in order to obtain estimates which are even more accurate. Rereading Equation (3.10) and the special illustration above, we can assert that differentiability of a function at a point shows us how the values of the function at nearby points can be well approximated by the values of a first-order polynomial. (Well approximated here means that the error is small relative to the distance between the two points.) Furthermore, this well approximability is a criterion for differentiability.
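Inequality (3.11) is easy to test numerically. The following minimal sketch is an added illustration and rests on stated assumptions: it takes $f(x) = e^x$ on $[-R, R]$ with $R = 0.5$, for which $f'(0) = 1$ and $M = \sup |f''| = e^{R}$. The sample points and the helper name `linear_approx_error` are arbitrary choices.

```python
import math

def linear_approx_error(f, df0, t):
    """|f(t) - (f(0) + f'(0) t)|, where df0 is the known value of f'(0)."""
    return abs(f(t) - (f(0.0) + df0 * t))

R = 0.5                 # half-width of the interval (assumed for illustration)
M = math.exp(R)         # sup of |f''(x)| = e^x over |x| <= R

samples = [-0.5, -0.25, -0.1, 0.1, 0.25, 0.5]
results = []
for t in samples:
    err = linear_approx_error(math.exp, 1.0, t)
    bound = M * R * abs(t)          # right-hand side of (3.11)
    results.append((t, err, bound))
    print(f"t={t:+.2f}  error={err:.6f}  bound={bound:.6f}")
```

In every row the error sits well below the bound, which reflects how crude (3.11) is: the true error behaves like $t^2/2$, a point that Taylor's theorem makes precise.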
Proposition 5. Suppose that $f$ is an $R^n$-valued function defined in a neighborhood of $x_0 \in R$. $f$ is differentiable at $x_0$ if and only if there exists a linear function $L: R \to R^n$ and a function $\varepsilon$ defined for small $t$ such that $\lim_{t\to 0} \varepsilon(t) = 0$ and
$$f(x_0 + t) = f(x_0) + L(t) + \varepsilon(t)\,t$$
Furthermore, $L(t) = f'(x_0) \cdot t$.

Proof. We have seen above that differentiability implies this condition. Conversely, suppose this condition is verified. Then
$$\lim_{t\to 0} \frac{f(x_0 + t) - f(x_0)}{t} = \lim_{t\to 0} \frac{L(t)}{t} + \lim_{t\to 0} \varepsilon(t) = \lim_{t\to 0} \frac{L(t)}{t} = L(1)$$
for since $L$ is linear, $L(t) = tL(1)$. Thus $f$ is differentiable at $x_0$, and $f'(x_0) = L(1)$.

Now, an approximate evaluation of $f(t)$ for $t$ near 0 with error that is small relative to $|t|$ may not be as good as required. A better approximation would be one whose error is small as compared to $t^2$, or even better $|t|^k$ for sufficiently large $k$. This is where the higher order derivatives come in. We shall now derive a theorem which gives such approximations. The derivation follows by induction directly from the above remarks.

Theorem 3.1. (Taylor's Theorem) Suppose that $f$ is a $(k + 1)$-times continuously differentiable $R^n$-valued function defined in an interval $I$ about $x_0$. Then there is a polynomial $P$ (with coefficients in $R^n$) of degree $k$, and a function $\varepsilon$ defined for $t$ in $I$ such that
(i) $\varepsilon(t)$ is bounded by $\max\{|f^{(k+1)}(x)| : x \text{ between } x_0 \text{ and } x_0 + t\}$,
(ii)
$$f(x_0 + t) = P(t) + \frac{\varepsilon(t)\,t^{k+1}}{(k + 1)!} \qquad (3.12)$$
Furthermore, $P$ is unique and is given by
$$P(t) = f(x_0) + f'(x_0)t + \frac{f''(x_0)}{2}t^2 + \cdots + \frac{f^{(k)}(x_0)}{k!}t^k$$
If we write $x = x_0 + t$, (3.12) becomes a more familiar expression, called Taylor's expansion of degree $k$ about $x_0$:
$$f(x) = \sum_{i=0}^{k} \frac{f^{(i)}(x_0)}{i!}(x - x_0)^i + \varepsilon(x - x_0)\frac{(x - x_0)^{k+1}}{(k + 1)!} \qquad (3.13)$$
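Before turning to the proof, statement (3.12) can be illustrated numerically. The sketch below is an added example under stated assumptions: it takes $f = \sin$ and $x_0 = 0$, uses the familiar fact that the derivatives of sine at 0 cycle through $0, 1, 0, -1$, and bounds $|f^{(k+1)}|$ by 1, so that the remainder is at most $|t|^{k+1}/(k+1)!$. The function names are hypothetical.

```python
import math

def sin_derivative_at_zero(i):
    """i-th derivative of sin at 0; the values cycle 0, 1, 0, -1."""
    return (0.0, 1.0, 0.0, -1.0)[i % 4]

def taylor_polynomial(t, k):
    """P(t) = sum over i = 0..k of f^(i)(0) t^i / i!, for f = sin and x0 = 0."""
    return sum(sin_derivative_at_zero(i) * t ** i / math.factorial(i)
               for i in range(k + 1))

def remainder_bound(t, k):
    """|t|^(k+1) / (k+1)!, valid because every derivative of sin is bounded by 1."""
    return abs(t) ** (k + 1) / math.factorial(k + 1)

for k in (1, 3, 5, 7):
    for t in (0.5, 1.0, 2.0):
        err = abs(math.sin(t) - taylor_polynomial(t, k))
        print(f"k={k} t={t}  error={err:.3e}  bound={remainder_bound(t, k):.3e}")
```

For each degree the observed error stays under the bound, and both shrink rapidly as $k$ grows, which is the behavior the theorem promises.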
Proof. The proof is by induction on $k$. The case $k = 1$ was already discussed above. We now assume the proposition for $k = n - 1$ and prove it for $k = n$, by applying the induction hypothesis to $f'$. For simplicity we take $x_0 = 0$, and $I = \{x : |x| < a\}$. Let $t \in I$. By the induction hypothesis we can write
$$f'(t) = \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{i!}t^i + \frac{1}{n!}\varepsilon_0(t)\,t^n \qquad (3.14)$$
since $(f')^{(i)} = f^{(i+1)}$. Here $\varepsilon_0(t)$ is bounded by $M = \max\{|f^{(n+1)}(x)| : x \text{ between } 0 \text{ and } t\}$. Now let us integrate (3.14) from 0 to $x$:
$$\int_0^x f'(t)\,dt = \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{i!}\int_0^x t^i\,dt + \frac{1}{n!}\int_0^x \varepsilon_0(t)\,t^n\,dt \qquad (3.15)$$
The integral on the left is, by the fundamental theorem of calculus, $f(x) - f(0)$. Thus, letting
$$\varepsilon(x) = \frac{n + 1}{x^{n+1}}\int_0^x \varepsilon_0(t)\,t^n\,dt$$
we obtain from (3.15)
$$f(x) = f(0) + \sum_{i=0}^{n-1} \frac{f^{(i+1)}(0)}{(i + 1)!}x^{i+1} + \frac{1}{(n + 1)!}\varepsilon(x)\,x^{n+1}$$
which is just the same as (3.12). We must show that $\varepsilon(x)$ is bounded by $M$. But,
$$|\varepsilon(x)| = \frac{n + 1}{|x|^{n+1}}\left|\int_0^x \varepsilon_0(t)\,t^n\,dt\right| \le \frac{n + 1}{|x|^{n+1}}\,M\left|\int_0^x t^n\,dt\right| = M$$
since $\varepsilon_0$ is bounded by $M$.

Examples

11. Find the Taylor expansion of degree 3 about 1 of $f(t) = 1 + t + 3t^4$.
$$f(1) = 5 \qquad f'(1) = 1 + 12t^3\big|_{t=1} = 13 \qquad f''(1) = 36 \qquad f'''(1) = 72$$
and $f^{(4)}(t) = 72$; thus the Taylor expansion is
$$f(t) = 5 + 13(t - 1) + 18(t - 1)^2 + 12(t - 1)^3 + \frac{\varepsilon(t)(t - 1)^4}{4!}$$
where $|\varepsilon(t)| \le 72$. Notice that, since $f^{(5)}(t) = 0$, the Taylor expansion of degree 4 is exact:
$$f(t) = 5 + 13(t - 1) + 18(t - 1)^2 + 12(t - 1)^3 + 3(t - 1)^4$$
for all $t$.

12. Find the Taylor expansion of degree 4 about 0 of $f(t) = (1 + t^2)^{-1}$.
$$f(0) = 1$$
$$f'(t) = -2t(1 + t^2)^{-2} \qquad f'(0) = 0$$
$$f''(t) = -2(1 + t^2)^{-2} + 8t^2(1 + t^2)^{-3} \qquad f''(0) = -2$$
$$f'''(t) = 24t(1 + t^2)^{-3} - 48t^3(1 + t^2)^{-4} \qquad f'''(0) = 0$$
$$f^{(4)}(t) = 24(1 + t^2)^{-3} + t[\cdots] \qquad f^{(4)}(0) = 24$$
$$f(t) = 1 - t^2 + t^4 + \frac{\varepsilon(t)\,t^5}{5!}$$

13. Calculate $(40)^{1/2}$ to three decimal places. We expand $f(x) = \sqrt{x}$ about 36.
$$f'(x) = \tfrac{1}{2}x^{-1/2} \qquad f''(x) = -\tfrac{1}{4}x^{-3/2} \qquad f'''(x) = \tfrac{3}{8}x^{-5/2} \qquad f^{(4)}(x) = -\tfrac{15}{16}x^{-7/2}$$
$$f(36) = 6 \qquad f'(36) = \frac{1}{12} \qquad f''(36) = -\frac{1}{4 \cdot 6^3} \qquad f'''(36) = \frac{3}{8 \cdot 6^5}$$
$$|f^{(4)}(x)| \le \frac{15}{16 \cdot 6^7}$$
for $x$ between 36 and 40. Thus
$$f(x) = 6 + \frac{1}{12}(x - 36) - \frac{1}{8 \cdot 6^3}(x - 36)^2 + \frac{1}{16 \cdot 6^5}(x - 36)^3 + \frac{1}{4!}\varepsilon(x - 36)(x - 36)^4$$
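where $|\varepsilon(t)| \le 15/(16 \cdot 6^7)$. Thus
$$\left|\sqrt{40} - \left(6 + \frac{4}{12} - \frac{4^2}{8 \cdot 6^3} + \frac{4^3}{16 \cdot 6^5}\right)\right| \le \frac{1}{4!} \cdot \frac{15}{16 \cdot 6^7} \cdot 4^4 = \frac{10}{6^7} < 10^{-4}$$
and the desired approximation is 6.325 (the sum in parentheses is 6.3246 to four places).

14. Calculate $e^4$ to three decimal places. We first write down the Taylor expansion of $f(x) = e^x$ about 0. Since $f'(x) = f(x)$, we have $f^{(k)}(x) = e^x$ for all $k$. Thus the Taylor expansion of $e^x$, degree $n$, is
$$e^x = \sum_{i=0}^{n} \frac{x^i}{i!} + \varepsilon(x)\frac{x^{n+1}}{(n + 1)!} \qquad (3.16)$$
where $|\varepsilon(x)| \le \max\{|e^t| : 0 \le t \le x\}$. Thus to estimate $e^4$ we now take $|\varepsilon(4)| \le e^4 < 3^4$. The error in the approximation by the Taylor expansion of degree $n$ is bounded by
$$|\varepsilon(4)|\frac{4^{n+1}}{(n + 1)!} \le \frac{3^4\,4^{n+1}}{(n + 1)!}$$
We must choose $n$ so large that this is bounded by $10^{-3}$. $n \ge 41$ will do, as we see by the following succession of inequalities:
$$\frac{3^4 \cdot 4^{n+1}}{(n + 1)!} \le \frac{4^{n+5}}{(n + 1)!} = \frac{2^{2n+10}}{(n + 1)!} \le \frac{2^{2n+10}}{8^{n-7}} = \frac{1}{2^{n-31}} \le \frac{1}{10^{(3/10)(n-31)}}$$
Thus we must have $(3/10)(n - 31) \ge 3$, or $n \ge 41$.

In the Taylor expansion (3.16) of $e^x$ observe that the remainder term is dominated by
$$e^x \cdot \frac{x^{n+1}}{(n + 1)!}$$
and therefore tends to zero as $n \to \infty$. Thus, if we let $n \to \infty$ in (3.16), we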
obtain (once again)
$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}$$
Now, this kind of an argument can be applied to any function which we happen to know has derivatives of all orders. That is, if $f$ is infinitely differentiable in an interval $I$ about $x_0$ we can write the Taylor expansion
$$f(x) = f(x_0) + \sum_{i=1}^{n} \frac{f^{(i)}(x_0)}{i!}(x - x_0)^i + \varepsilon(x)\frac{(x - x_0)^{n+1}}{(n + 1)!} \qquad (3.17)$$
where $|\varepsilon(x)| \le \max\{|f^{(n+1)}(t)| : t \text{ between } x_0 \text{ and } x\}$, valid for every $n$. Let $M^{n+1}(x)$ be this bound. If
$$\lim_{n\to\infty} M^{n+1}(x)\frac{(x - x_0)^{n+1}}{(n + 1)!} = 0 \qquad (3.18)$$
then clearly we can take the limit as $n \to \infty$ in (3.17) and represent $f$ as a series
$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$$
This series is called the Taylor expansion of $f$ about $x_0$. In Chapters 5 and 6 we shall return to the consideration of series expansions for functions. In Section 5.8 we shall construct infinitely differentiable functions which are not represented by these Taylor expansions. For the present we mean only to remark on the Taylor expansion as a tool for approximation.

Examples

15. Consider now $f(x) = \sin x$. We have
$$f'(x) = \cos x \qquad f''(x) = -\sin x \qquad f'''(x) = -\cos x \qquad f^{(4)}(x) = \sin x, \ldots$$
and the cycle repeats itself. Thus
$$f^{(4n+1)}(x) = \cos x \qquad f^{(4n+2)}(x) = -\sin x \qquad f^{(4n+3)}(x) = -\cos x \qquad f^{(4n+4)}(x) = \sin x$$
The Taylor expansion about zero is thus found to be
$$f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots + \text{Remainder term}$$
Since all derivatives of $\sin x$ are one of $\pm\sin x$, $\pm\cos x$, they are bounded by 1, so the remainder for the Taylor expansion of degree $k$ is bounded by
$$\frac{|x|^{k+1}}{(k + 1)!}$$
which tends to zero as $k \to \infty$. Thus the Taylor expansion
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \cdots + \frac{(-1)^k x^{2k+1}}{(2k + 1)!} + \cdots$$
accurately expresses sine as an infinite sum. Similarly, we can compute a Taylor expansion for the cosine (see Exercise 15),
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots + \frac{(-1)^k x^{2k}}{(2k)!} + \cdots$$

16. Find $\sin \pi/4$ to an accuracy of $10^{-3}$. We need to compute a bound on the remainder after calculating $n$ terms of the Taylor expansion and then ensure this bound is $\le 10^{-3}$. Now the remainder after $k$ terms is bounded by $[(2k + 1)!]^{-1}(\pi/4)^{2k+1}$. We shall use the fact that $\pi/4 < 4/5$ to verify that $k = 3$ will work:
$$\frac{1}{7!}\left(\frac{\pi}{4}\right)^7 < \frac{1}{6 \cdot 4^4} \cdot \frac{4^7}{5^7} < \frac{1}{6 \cdot 5^4} < 10^{-3}$$
Thus an estimate to $\sin \pi/4$ to within one thousandth is
$$\frac{\pi}{4} - \frac{\pi^3}{6 \cdot 4^3} + \frac{\pi^5}{120 \cdot 4^5}$$

17. The logarithm is infinitely differentiable around the point 1. Does it have an infinite Taylor expansion there? By computation, we find
$$\log'(x) = x^{-1} \qquad \log'(1) = 1$$
$$\log''(x) = -x^{-2} \qquad \log''(1) = -1$$
$$\log'''(x) = 2x^{-3} \qquad \log'''(1) = 2$$
$$\log^{(4)}(x) = -3 \cdot 2\,x^{-4} \qquad \log^{(4)}(1) = -3 \cdot 2$$
$$\log^{(n)}(x) = (-1)^{n-1}(n - 1)!\,x^{-n} \qquad \log^{(n)}(1) = (-1)^{n-1}(n - 1)! \qquad (3.19)$$
The Taylor expansion of degree $n$ about 1 is thus
$$\log(x) = \sum_{k=1}^{n} (-1)^{k-1}\frac{(x - 1)^k}{k} + \varepsilon_n(x)\frac{(x - 1)^{n+1}}{(n + 1)!} \qquad (3.20)$$
Notice that from the first equation of (3.19), if $x < 1$,
$$|\varepsilon_n(x)| \le n!\,x^{-(n+1)}$$
and thus the remainder of (3.20) is bounded by
$$\frac{1}{n + 1}\left|\frac{x - 1}{x}\right|^{n+1}$$
which tends to zero as $n \to \infty$, so long as $1 > x > 1/2$. Similarly, we can show (Exercise 18) that the remainder goes to zero if $1 < x < 3/2$. Thus, in the interval $1/2 < x < 3/2$, the logarithm has the Taylor series
$$\log(x) = \sum_{k=1}^{\infty} (-1)^{k-1}\frac{(x - 1)^k}{k}$$

• EXERCISES

12. Find the Taylor expansion about the origin of degree 5 of $\tan x$; of $(1 + x)^{-1}$.

13. Find $\sin 1$ accurately to 4 decimal places.

14. Find $\sqrt{3}$ accurately to 4 decimal places.

15. Derive the Taylor expansion (given after Example 15) of $\cos x$.

16. Find an interval about the origin in which the substitution
$$(1 + x)^{-1} = 1 - x$$
is accurate to three decimal places. What about the substitution
$$(1 + x)^{-1} = 1 - x + x^2\,?$$
17. Find an interval about the origin in which the substitution
$$e^x = 1 + x + \frac{x^2}{2} + \frac{x^3}{6}$$
is accurate to three decimal places.

18. Show that the series
$$\sum_{k=1}^{\infty} (-1)^{k-1}\frac{(x - 1)^k}{k}$$
represents the logarithm in the interval $1/2 \le x \le 3/2$. Observe that the series converges for all $x$ in the interval $(0, 2)$. Does it converge there to $\log x$?

• PROBLEMS

6. Suppose $f$ is a $k$-times differentiable real-valued function defined on the interval $I$. Suppose $f^{(k)}(x) = 0$ for all $x$. Show that $f$ is a polynomial of degree at most $k - 1$.

7. Suppose that $f$, $g$ are $C^k$ functions defined on an interval containing 0, and
$$f(0) = \cdots = f^{(k-1)}(0) = 0, \qquad g(0) = \cdots = g^{(k-1)}(0) = 0, \qquad \text{but } g^{(k)}(0) \ne 0$$
Prove that
$$\lim_{t\to 0} \frac{f(t)}{g(t)} = \frac{f^{(k)}(0)}{g^{(k)}(0)}$$

8. (Taylor's form of the mean value theorem) Suppose that $f$ is $C^k$ on the interval $[-R, R]$. Show that for $t \in [-R, R]$, there is a $\xi$ between 0 and $t$ such that
$$f(t) = \sum_{i=0}^{k-1} \frac{f^{(i)}(0)}{i!}t^i + \frac{f^{(k)}(\xi)}{k!}t^k$$

9. Let $m$ be any integer and define the functions $f_0, \ldots, f_{m-1}$ by
$$f_i(x) = \sum_{n=0}^{\infty} \frac{x^{mn+i}}{(mn + i)!}$$
(a) $e^x = f_0(x) + \cdots + f_{m-1}(x)$.
(b) $f_i' = f_{i-1}$ for $i = 1, \ldots, m - 1$.
(c) $f_0' = f_{m-1}$.
(d) The functions $f_0, \ldots, f_{m-1}$ are all solutions of the differential equation $y^{(m)} = y$.
10. (a) Suppose that $f$ is continuous on the interval $[-R, R]$. Define
$$g(t) = \int_0^1 f(tx)\,dx \qquad t \in [-R, R]$$
and show that $g$ is also continuous.
(b) Suppose that $h$ is $C^1$ on $[-R, R]$. Prove that there is a continuous function $k$ such that $h(t) = h(0) + t\,k(t)$. (Hint: Consider $\int_0^t h'(\tau)\,d\tau$ and make the substitution $\tau = tx$.)

3.3 Differential Equations

Now, an ordinary differential equation is (roughly speaking) an equation involving the variable $x$, an "unknown" function $f$, and some of its derivatives $f', f'', \ldots, f^{(k)}$. Thus
$$f'(x) = k(x)$$
$$f'' + f = 0$$
$$f'(x) = x f(x)$$
$$[f^{(4)}(x)]^2 + e^{f''(x)} = |f^{(3)}(x)| + \log|x + 1|$$
are examples of differential equations. A solution is a function which makes the equation true. For example, $\int k$, $\sin x$, $\exp(\tfrac{1}{2}x^2)$ solve the first three equations respectively (as for the fourth, we cannot easily exhibit a solution). We prefer to think about differential equations in this vague sense rather than to attempt a formal definition, so we shall do so.

Many equations do not admit solutions and some equations admit many. Consider these:
$$|y'| + |y - x| = 0$$
$$(y')^2 + 1 = 0$$
$$y'' + y = 0$$
The first has no solution $y = f(x)$, because we cannot have both $f(x) = x$ and $f'(x) = 0$; the second has no solution because the derivative of the supposed solution would be imaginary. The third equation has as solutions $\sin x$, $\cos x$, as well as any linear combination of these. The first equation must be discarded as being self-contradictory; the second admits solutions if we permit ourselves to consider complex-valued functions. As we shall
see this turns out to be a very fruitful course, for it permits understanding the third as well.

The importance of calculus derives from the fact that it is necessary to the solution of concrete problems (mainly derived from the study of physics and the natural sciences). These problems usually are stated mathematically as differential equations.

Examples

18. Compound interest. A bank likes to pay its depositors on the basis of the amount deposited and the length of time they have been able to use these deposits. Thus every (say) June 30 your bank would add to your deposit an amount equal to (say) 5% of that part of your deposit which they have held for the past year (and if they are decent about it a reasonable fraction of that 5% for parts left in for fractions of that year). Many years ago, that great financial wizard, L. Waverly Oakes, pointed out that the amount that he kept in his bank for the first half-year was working for the bank and he should be paid for it. Furthermore, argued Mr. Oakes, the payment he should have received was also sunk back into the bank's investments, so it also was earning income for the bank, and thus for its depositors. Finally, Mr. Oakes pointed out that there is nothing special in half a year, or any particular fraction thereof. His very words were "Over any period of time, no matter how small, the earning of a particular balance relative to that balance should be directly proportional to that period of time. In order to best approach the interest due its depositors, our banks should be computing interest as often as possible."

The banks all responded to this profound utterance by recomputing their interest every month instead of every year. Somebody even suggested that, with an army of secretaries, they could so compute the accrued interest every 30 seconds. And there the matter would have rested were it not for an obscure student of Isaac Newton who dabbled in the stock market.
Suppose at time $t_0$ a sum of $s_0$ pounds is deposited in the bank. Let $f(t)$ for all times $t > t_0$ be the balance accruing from this deposit according to the Oakes system. Then, Oakes' assertion is, for all times $t_1 < t_2$,
\[ \frac{f(t_2) - f(t_1)}{f(t_1)} = k(t_2 - t_1) \tag{3.21} \]
where $k$ is the earning power (interest rate) of money. The first
thing this brilliant person remarked is that (3.21) cannot possibly always hold. Let us illustrate his discussion. Suppose that 500 pounds are deposited in the bank at a 5% per annum interest rate. Then $f(0) = 500$ and at the end of one year, the interest is 25 pounds, so $f(1) = 525$. Now, if interest is computed every half-year, we obtain, by (3.21),
\[ \frac{f(1/2) - 500}{500} = 0.05\left(\tfrac12\right) \]
or $f(1/2) = 512.50$. Then, over the second half of the year, we obtain
\[ \frac{f(1) - 512.50}{512.50} = 0.05\left(\tfrac12\right) \]
so that by this computation $f(1) = 525.31$. As this is closer to the actual earnings of the initial deposit, this is more like the amount the depositor should get. Furthermore, this semiannual computation has neglected the earnings during the last three-quarters of the 6.25 accrued during the first quarter. In fact, when we compute the interest quarterly we find that the value of $f(1)$ should be no less than 525.47. And so it goes: no matter what period we choose for the computation of interest, we will be neglecting the interest accrued by the growing total during that period. Thus Oakes' formulation cannot be correct.

However, our student was moved by the basic justice of Oakes' ideas and after rewriting Oakes' formula as
\[ \frac{f(t_2) - f(t_1)}{t_2 - t_1} = k f(t_1) \]
he asserted that he had found the precise statement of the Oakes formula. Oakes should have said "over any infinitesimally small interval of time ..." rather than "over any period of time, no matter how small ..." Precisely, then: the rate of change of the balance at any time is proportional to the balance at that time; that is, $f' = kf$, where $k$ is the interest rate (0.05 above). Thus, the problem is to find a solution $f$ for the differential equation $y' - ky = 0$ so that $f(t_0) = s_0$.

19. Population explosion. Population tends to grow also according to the above differential equation. That is, it is assumed that every individual has the same propensity to reproduce and that propensity
is independent of time. Thus over any infinitesimal period of time the ratio of the increment in population to the initial population is proportional to the time elapsed. (You know what mobs are like: the larger they are the faster they seem to grow.) This assertion is supposed to be true for brief periods of time; thus we should more precisely assert that the rate of change is directly proportional to the total population; thus if $f$ is the total population, $f' = kf$ where the constant $k$ is called the growth rate. In some societies the growth rate varies with time; among certain mammals it peaks at certain times of the year. In these cases the population as a function $f$ of time satisfies a differential equation: $f'(t) = k(t)f(t)$, where $k(t)$ is the variable growth rate. It may even happen that the growth rate depends on the total population; in a well-regulated society (1984) this would be the case. Then the population function is a solution to a more complicated equation, $y' = k(y)y$.

20. Survival of the fetahs. On a remote volcanic atoll in the South Pacific there live only two species of animals, the fetahs and the garibs. These animals are essentially vegetarian and there is an ever-present undergrowth to feed them. However fetahs especially love to eat garibs and garibs find the succulent fetahs hard to resist. Now each fetah tends to reproduce at the rate of one young per year, and to consume garibs at the rate of 7 per year. Conversely the garibs have only one young per year and eat fetahs at the rate of 17 per year. Thus the increments $\Delta f$, $\Delta g$ of fetahs and garibs in a year should be given by
\[ \Delta f = f_0 - 17 g_0 \qquad \Delta g = g_0 - 7 f_0 \tag{3.22} \]
where $f_0$, $g_0$ are the initial populations of these groups. However, the Oakes reasoning must be applied to this case, because as the population changes, it will continually affect the increment. The solution is, as in the above case, to rewrite (3.22) as a differential equation.
If $f(t)$, $g(t)$ are the populations of fetahs and garibs at time $t$, then these equations describe the growth of $f$ and $g$:
\[ f' = f - 17g \qquad g' = g - 7f \]

21. The biotic matrix. On a less remote island there are $n$ different species of animals, all of which have some effect on the growth patterns of all the others (some feed on others; some house, or protect others).
This kind of society can be represented by a biotic $n \times n$ matrix $A = (a^i_j)$. The $(i,j)$th entry is described as follows: The increment in the $i$th species in one year which is attributable to each member of the $j$th species is $a^i_j$. (Thus the effect of one member of the $j$th species on the $i$th species in an interval of time $\Delta t$ years is $a^i_j\,\Delta t$.) If $\mathbf f(t) = (f^1(t), \ldots, f^n(t))$ is the population function on this island, then this differential equation must be satisfied:
\[ \mathbf f' = A\mathbf f \tag{3.23} \]

22. Particle motion. We consider now the motion of a particle in $R^n$. Let $\mathbf f(t)$ be the location of that particle at time $t$; $\mathbf f$ is thus an $R^n$-valued function of a real variable. The rate of change of position at a time $t_0$ is the limit as $t \to t_0$ of
\[ \frac{1}{t - t_0}\bigl(\mathbf f(t) - \mathbf f(t_0)\bigr) \]
and thus is $\mathbf f'(t_0)$, called the velocity of the particle at $t_0$. The rate of change of velocity, $\mathbf f''$, is the acceleration of the particle. The velocity vector has both magnitude and direction; we can write $\mathbf f'(t) = v(t)\mathbf T(t)$ (at least when $\mathbf f' \neq 0$), where $v(t)$ is a positive function of $t$, and $\mathbf T(t)$ is a unit vector. $\mathbf T(t)$ points out the direction in which the particle is traveling at time $t$ and $v(t)$ is the speed at which it is moving. Also, $v(t)$ can be given the following description. The length of the path that the particle traces out in a certain period of time is the distance traveled by the particle. $|v(t)|$ is also the rate of change of that distance at time $t$. We will have to await a full discussion of arc length (Section 4.2) before justifying this; however some heuristic arguments are possible (see Problem 22). According to this description, the distance $s(t)$ traveled by the particle from time $t_0$ to $t$ is a solution of the differential equation $y' = \|\mathbf f'(t)\|$ with $s(t_0) = 0$. We would hope that there is only one solution, for there is no further way to determine this function. (Fortunately, by the fundamental theorem of calculus this problem has a unique solution.)
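The description of distance as the solution of $y' = \|\mathbf f'(t)\|$, $s(t_0) = 0$ can be tried numerically. The following is only a sketch under stated assumptions: the motion $\mathbf f(t) = (\cos 2t, \sin 2t)$ is a hypothetical example chosen here, the names `speed` and `distance_traveled` are illustrative, and the integral is approximated by a simple midpoint rule.

```python
import math

def speed(velocity, t):
    """Euclidean length of the velocity vector f'(t)."""
    return math.sqrt(sum(v * v for v in velocity(t)))

def distance_traveled(velocity, t0, t, steps=1000):
    """Approximate s(t) = integral from t0 to t of ||f'||, using the midpoint rule."""
    width = (t - t0) / steps
    return width * sum(speed(velocity, t0 + (i + 0.5) * width) for i in range(steps))

# Hypothetical motion f(t) = (cos 2t, sin 2t) on the unit circle,
# so that f'(t) = (-2 sin 2t, 2 cos 2t) and the speed is constantly 2.
circle_velocity = lambda t: (-2 * math.sin(2 * t), 2 * math.cos(2 * t))

# Distance covered between t = 0 and t = pi/4: a quarter of the circle.
quarter_turn = distance_traveled(circle_velocity, 0.0, math.pi / 4)
```

For this motion the computed distance should agree with the length of a quarter of the unit circle, $\pi/2$, which is the angle swept out.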
Consider, for example, a particle moving on the unit circle in the plane. Let $f$ be the position function of this particle. Let $s(t)$ be the arc length on the circle from the point $(1, 0)$ to $f(t)$ at time $t$. Then (since arc length on the unit circle is the same as the angle)
\[ f(t) = (\cos s(t), \sin s(t)) \]
The velocity vector is $f'(t) = s'(t)(-\sin s(t), \cos s(t))$. Notice that $|s'(t)| = \|f'(t)\|$, giving further weight to our description of speed above. Notice also that $f'(t)$ is tangent to the circle at the point $f(t)$; this reflects the fact that the motion is constrained to the circle. Differentiating further, we find that the acceleration is
\[ f''(t) = s''(t)(-\sin s(t), \cos s(t)) + s'(t)^2(-\cos s(t), -\sin s(t)) = s''(t)\,T(t) - [s'(t)]^2 f(t) \]
Thus the acceleration has a component tangent to the circle (in the direction of the motion) whose magnitude is the rate of change of speed, and a component perpendicular to the direction of motion, whose magnitude is equal to the speed squared. For example, if the particle is rotating around the circle with constant speed, it is accelerating toward the center of the circle.

According to Newton's laws of motion the situation is as follows. Given a particle at time $t_0$ situated at $\mathbf p_0$ and having velocity $\mathbf v_0$, all further motion is determined uniquely by the forces acting on the object. The motion is determined by this law: the acceleration is directly proportional to the force acting on the particle. Thus, in the absence of any forces, if $\mathbf f(t)$ is the position of the particle at time $t$, we have
\[ \mathbf f(t_0) = \mathbf p_0 \qquad \mathbf f'(t_0) = \mathbf v_0 \qquad \mathbf f''(t) = 0 \ \text{ for all } t \]
and $\mathbf f$ is uniquely determined by these conditions. We say that $\mathbf f$ is a solution of the differential equation $y'' = 0$ with the initial conditions $y(0) = \mathbf p_0$, $y'(0) = \mathbf v_0$. Newton's laws require that the solution exists and is unique. Mathematics bears this out; the solution is $\mathbf f(t) = \mathbf p_0 + t\mathbf v_0$. Thus, in the absence of force, a particle will move with constant velocity, that is, in a straight line at a constant speed.

Now, in general, the mechanics of motion can be described as follows. There is a function $\mathbf F$ defined on $R^n \times R$ taking values in $R^n$. The value $\mathbf F(\mathbf x, t)$ represents the force that will act on a unit mass located at the point $\mathbf x$ at time $t$. The function $\mathbf F$ is called a force field.
A particle of mass $m$ situated at the point $\mathbf x$ will experience the force $m\mathbf F(\mathbf x, t)$ at time $t$. According to Newton's law it will accelerate in the direction of $\mathbf F$. The magnitude of this acceleration is determined according to this statement of Newton's law: Force = mass · acceleration, $m\mathbf F = m\mathbf a$ ($\mathbf a$ = acceleration).
Suppose we place a particle of mass $m$ at $\mathbf p_0$ with velocity $\mathbf v_0$ into this situation at time $t_0$. Let $\mathbf f$ be the function describing its subsequent motion according to Newton's law. Then at time $t$ it is at $\mathbf f(t)$ and it experiences a force $\mathbf F(\mathbf f(t), t)$. Thus we have
\[ \mathbf f''(t) = \mathbf F(\mathbf f(t), t) \]
Thus $\mathbf f$ is the solution of the differential equation $y'' = \mathbf F(y, t)$ with the initial conditions $\mathbf f(t_0) = \mathbf p_0$, $\mathbf f'(t_0) = \mathbf v_0$. Newton's laws require that the solution exist uniquely. In the next section we shall show that for smoothly varying force fields this is the case.

• PROBLEMS

11. Find all complex-valued solutions of the differential equation $(y')^2 + 1 = 0$.

12. Solve the differential equation $y' = y$ with the initial condition $y(0) = 0$.

13. (a) How long will it take 100 dollars to double at a compound interest rate of 5% per year?
(b) How long will it take 350 dollars to double at the same rate?
(c) How long will it take 100 dollars to double at a rate of 10% per year?

14. It is observed that radioactive elements decay into heavy metals. It is assumed that the probability of any given atom decaying is independent of the particular atom. Let $k$ be the probability that a given atom of a particular element will decay within one year. Show that the function $f$ is governed by the differential equation $y' = -ky$ if $f(t)$ is the mass of the given element after time $t$.

15. The time it takes for a radioactive element to halve in mass is called the half-life. If an element has a half-life of 14 million years, find the constant $k$ of Problem 14.

16. Why is Oakes' formulation of the interest problem wrong? Can you solve equation (3.21) so that it holds for a specified period; that is, given $n$, find $f$ so that (3.21) holds for $t_1 = k/n$, $t_2 = (k + 1)/n$, $0 \le k < n$?

17. A weight of mass $m$ is suspended from a rigid support by a spring of natural length $L$.
According to Hooke's law the spring produces a "restoring force" which is proportional to the displacement from its natural length, and directed toward its natural position (Figure 3.6). Let us denote this constant of proportionality by $k$. Let $x$ denote the distance of the mass from the natural position, where the positive direction is upward. Then the mass has two forces acting on it: a force $F_1 = -kx$ due to the restoring effort of the spring, and the force of gravity $F_2 = -mg$. If the mass is at rest, then there is no acceleration, so by Newton's laws $F_1 + F_2 = 0$, from which we may conclude that the rest position is at $x = -k^{-1}mg$. Now suppose we
displace the mass by an amount $h_0$ and let it go. Using Newton's law find the differential equation governing the subsequent motion.

Figure 3.6

18. A certain insect lays its eggs in the flesh of a mammal. Each insect hatches $h$ eggs per year. Now every time one of these eggs hatches in a horse, it kills the horse. Assuming the total mammalian population is a constant $T$, we can derive the differential equations governing the growth of this insect and horse population if we also know the natural death rate ($d_I$) of the insect and the natural birth and death rates ($b_H$, $d_H$) of the horse. Let $I(t)$, $H(t)$ be the population of the insect, horse, respectively. During a period of time $\Delta t$, $b_H \cdot H \cdot \Delta t$ horses are born, and $d_H \cdot H \cdot \Delta t$ horses die of natural causes. Now each insect hatches $h\,\Delta t$ eggs during this interval; the probability that its host is a horse is $H/T$. Thus there are $hI(H/T)\,\Delta t$ horse deaths attributed to the insect during this time interval. The change $\Delta H$ in the horse population is thus
\[ \Delta H = b_H H\,\Delta t - d_H H\,\Delta t - hI\,\frac{H}{T}\,\Delta t \]
Find the corresponding change in the insect population and deduce that these differential equations govern the growth:
\[ H' = (b_H - d_H)H - \frac{hH}{T}\,I \qquad I' = hI - d_I I \]

19. It was observed by Galileo that the gravitational attraction of the earth is constant. In the small, we may assume the world is flat; thus we take as a model $R^3$, and assume that the plane $z = 0$ is the surface of the earth. The gravitational attraction then is a force field $\mathbf F(x, y, z) = (0, 0, -g)$. Suppose a particle of mass $m$ is at $\mathbf p_0$ and has a velocity $\mathbf v_0$ at time $t_0$. Let $\mathbf f(t)$ be the position of this particle at time $t$. What is the differential equation governing the motion of the particle? Can you solve for $\mathbf f$?

20. Suppose there is a wind coming out of the east which exerts a force $(c, 0, 0)$ on our particle, no matter what the position is. Now find the equation of motion.

21. Suppose that on the plane there is a centripetal force field proportional to the distance from the origin. At the time $t = 0$, a particle is placed at the point $z_0$ and has a velocity $v_0$. What is the equation of motion?

22. We can try to find a formula for the length of a curve by approximating it by a line segment. Let
\[ x = x(t) \qquad y = y(t) \]
be the equations of a curve, and let $(x(t_0), y(t_0))$ be a point on the curve. For a very short period of time $\Delta t$, the curve can be replaced by its tangent line (see Figure 3.7). The length of the curve between $(x(t_0), y(t_0))$ and $(x(t_0 + \Delta t), y(t_0 + \Delta t))$ is then approximately equal to $((\Delta x)^2 + (\Delta y)^2)^{1/2}$.

Figure 3.7
Then the rate of change of arc length over the interval $\Delta t$ is
\[ \frac{((\Delta x)^2 + (\Delta y)^2)^{1/2}}{\Delta t} \]
Letting $\Delta t \to 0$ deduce that the rate of change of arc length along the curve is the length of the vector $(x'(t), y'(t))$.

3.4 Some Techniques for Solving Equations

The fundamental theorem of calculus is of course the basic existence theorem on solutions for differential equations, and integration is the primary tool. Thus an equation of the form $y' = h(x)$ has the solution $f(x) = \int^x h(t)\,dt + c$, and this solution is unique but for a constant. Let us state the same result for vector-valued functions.

Definition 3. Let $\mathbf h = (h_1, \ldots, h_n)$ be a continuous $R^n$-valued function on the closed interval $[a, b]$. Define the integral of $\mathbf h$ over the interval $[a, b]$ to be
\[ \int_a^b \mathbf h = \left(\int_a^b h_1, \ldots, \int_a^b h_n\right) \]

Theorem 3.2. Let $\mathbf h$ be a continuous $R^n$-valued function defined on the open interval $(a, b)$ and let $a < c < b$. Then the differential equation
\[ y' = \mathbf h(x) \qquad y(c) = \mathbf p_0 \tag{3.24} \]
has the unique solution $\int_c^x \mathbf h + \mathbf p_0$.

Proof. By the fundamental theorem of calculus and Proposition 1,
\[ \mathbf f(x) = \int_c^x \mathbf h + \mathbf p_0 \]
is differentiable and satisfies the conditions (3.24). If $\mathbf g$ is another solution, then $\mathbf g' = \mathbf f'$ on $(a, b)$ so each coordinate of $\mathbf g - \mathbf f$ has zero derivative and thus is constant. Since $\mathbf g(c) = \mathbf p_0 = \mathbf f(c)$, this constant is zero, so $\mathbf g = \mathbf f$.
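Theorem 3.2 can be illustrated numerically by integrating each coordinate separately, as in Definition 3. The sketch below is only an illustration under assumptions made here: the right-hand side $\mathbf h(x) = (\cos x, 2x)$, the base point $c = 0$, and the initial value $\mathbf p_0 = (1, 0)$ are hypothetical choices, and the integral is approximated by the composite Simpson rule rather than computed exactly.

```python
import math

def simpson(func, lo, hi, steps=200):
    """Composite Simpson approximation of the integral of func over [lo, hi] (steps even)."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3

def solve_by_integration(h_components, c, p0, x):
    """Return f(x) = (integral from c to x of h) + p0, one coordinate at a time."""
    return [p + simpson(h_i, c, x) for h_i, p in zip(h_components, p0)]

# Hypothetical data: h(x) = (cos x, 2x), y(0) = (1, 0).
h = [math.cos, lambda t: 2 * t]
f_at = solve_by_integration(h, 0.0, [1.0, 0.0], 1.5)
```

For these choices the exact solution is $\mathbf f(x) = (1 + \sin x, x^2)$, so `f_at` should be close to $(1 + \sin 1.5,\ 2.25)$.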
Separation of Variables

There is a class of differential equations which can be solved simply by integration, just by recalling the chain rule. This is the class of first-order equations (only the first derivative of the unknown function $y$ appears) in which the variables separate; that is, these are equations of the form
\[ h(y)\,y' = g(x) \tag{3.25} \]
The left-hand side appears to be the result of application of the chain rule; we can rewrite (3.25) as
\[ \frac{d}{dx}\int^{y(x)} h = g(x) \]
Thus, if we let $H$ be an indefinite integral of $h$, $H = \int h$, then (3.25) becomes $[H(y(x))]' = g(x)$, so we can integrate:
\[ H(y(x)) = \int g \tag{3.26} \]
If we can solve (3.26) for $y(x)$, we will have the desired explicit expression of $y$ as a function of $x$.

Examples

23. $yy' = 1$. Let $H(y) = \int y = y^2/2$. Then the equation can be rewritten as $[H(y(x))]' = 1$, or $H(y) = y^2/2 = x + c$, where $c$ is a constant to be determined by the initial conditions. Thus the general solution of $yy' = 1$ is $y = \pm(2(x + c))^{1/2}$.

24. $y' = x^2y^2$. Again, we write
\[ y^{-2}y' = x^2 \]
Integrate:
\[ -y^{-1} = \frac{x^3}{3} + c \]
so
\[ y = \frac{-3}{x^3 + 3c} \]

25. $y' \cos y = \sin x$. After integrating this becomes
\[ \sin y = -\cos x + c \]
or $y = \arcsin(c - \cos x)$. A particular solution (with $c = 0$) is $f(x) = x - \pi/2$.

26. $y' = \dfrac{1 + x}{1 + y}$. After integration we have
\[ y + \frac{y^2}{2} = x + \frac{x^2}{2} + c \tag{3.27} \]
It is now a bit difficult to write the solution explicitly as a function of $x$, but it is possible using the formula for roots of a quadratic polynomial:
\[ y = \frac{-2 \pm (4 + 8x + 4x^2 + c)^{1/2}}{2} \tag{3.28} \]
The constant $c$ is presumably determined by the initial conditions, and with it the function $y$. Notice however, that each value of $c$ gives two candidates for the solution, but they may not both be solutions. For example, suppose we seek the solution of (3.27) with the initial condition $y(0) = 0$. We arrive at (3.28) and upon substituting $x = 0$, $y = 0$, we obtain
\[ 0 = \frac{-2 \pm (4 + c)^{1/2}}{2} \]
so we must choose $c = 0$ and the positive sign before the radical. This boils down to $y = x$. If the initial condition is $y(0) = -2$, again $c = 0$, but we must take the negative root, obtaining $y = -(x + 2)$.
Notice also that upon substituting the initial condition $y(-1) = -1$ into (3.28), we find $c = 0$ and both roots give solutions to this problem; that is, both functions $y = x$ and $y = -(x + 2)$ are solutions with this initial value. Thus it is not always true that the initial conditions uniquely determine the solution of the differential equation. Looking back at the original equation (3.27) we find what might be a clue to this bizarre behavior: the function $(1 + x)(1 + y)^{-1}$ is ill-behaved at $y = -1$.

Uses of Exponential

We shall now turn to the study of the exponential function; because it is the solution of such a simple differential equation it gives rise to several techniques. Recall from Chapter 2 (Definition 21) that the differential equation
\[ y' = cy \qquad y(0) = 1 \qquad c \text{ any complex number} \tag{3.29} \]
has a unique solution, denoted $e^{cx}$. Notice that
\[ (e^{cx})' = ce^{cx}, \quad (e^{cx})'' = c^2e^{cx}, \quad \ldots, \quad (e^{cx})^{(k)} = c^ke^{cx} \tag{3.30} \]
These remarks suggest a method of attack on another class of equations. A homogeneous constant coefficient equation is one of the form
\[ y^{(k)} + a_{k-1}y^{(k-1)} + \cdots + a_1y' + a_0y = 0 \tag{3.31} \]
We shall consider this class in greater detail in Section 3.6. Let us compute the left-hand side of (3.31) under the substitution $y = e^{cx}$. By (3.30),
\[ c^ke^{cx} + a_{k-1}c^{k-1}e^{cx} + \cdots + a_1ce^{cx} + a_0e^{cx} = (c^k + a_{k-1}c^{k-1} + \cdots + a_1c + a_0)e^{cx} \tag{3.32} \]
We find that $e^{cx}$ is a solution of (3.31) if $c$ is a root of the polynomial appearing in (3.32).

Examples

27. Find solutions for $y'' - y = 0$. Substituting $y = e^{cx}$, we obtain $(c^2 - 1)e^{cx} = 0$, thus we must have $c = \pm 1$. We conclude that $e^x$, $e^{-x}$ are solutions. Notice also that for any $a$, $b$, $ae^x + be^{-x}$ is also a solution.
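The substitution $y = e^{cx}$ lends itself to a quick numerical check. The sketch below evaluates the left-hand side of (3.31) for $y = e^{cx}$ by means of (3.30), so it vanishes exactly when $c$ is a root of the polynomial in (3.32). The function names and the coefficient ordering `a = [a0, a1, ..., a_{k-1}]` are conventions assumed here, not taken from the text; the cubic $y''' + y = 0$ is included as a second test case.

```python
import cmath

def char_poly(a, c):
    """Evaluate c**k + a[k-1]*c**(k-1) + ... + a[1]*c + a[0], where k = len(a)."""
    k = len(a)
    return c**k + sum(a_j * c**j for j, a_j in enumerate(a))

def lhs_for_exponential(a, c, x):
    """Left-hand side of (3.31) for y = exp(c*x), using (exp(cx))^(j) = c**j * exp(cx)."""
    return char_poly(a, c) * cmath.exp(c * x)

# y'' - y = 0 corresponds to a = [a0, a1] = [-1, 0]; its roots are c = 1 and c = -1.
roots_second_order = [1, -1]
residuals_second_order = [abs(lhs_for_exponential([-1, 0], c, 0.7))
                          for c in roots_second_order]

# y''' + y = 0 corresponds to a = [1, 0, 0]; the roots are the cube roots of -1.
roots_third_order = [-1, cmath.exp(1j * cmath.pi / 3), cmath.exp(-1j * cmath.pi / 3)]
residuals_third_order = [abs(char_poly([1, 0, 0], c)) for c in roots_third_order]
```

All residuals should be zero up to rounding, whereas a value such as $c = 2$, which is not a root of $c^2 - 1$, leaves the nonzero remainder $3e^{2x}$ in the second-order equation.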
28. Find solutions of $y''' + y = 0$. Here substitution of $y = e^{cx}$ yields $(c^3 + 1)e^{cx} = 0$, so $c$ must be a cube root of $-1$. Thus we obtain three solutions:
\[ e^{-x}, \qquad \exp(e^{i\pi/3}x), \qquad \exp(e^{-i\pi/3}x) \]
Of course, all functions of the form $a e^{-x} + b\exp(e^{i\pi/3}x) + c\exp(e^{-i\pi/3}x)$ are solutions.

29. Solve the initial value problem
\[ y''' + y' = 0 \qquad y(0) = 0 \quad y'(0) = 1 \quad y''(0) = 1 \tag{3.33} \]
Substituting $y = e^{cx}$, we obtain $(c^3 + c)e^{cx} = 0$, so we must have $c = 0$ or $c = i$ or $c = -i$. Thus all functions of the form $ae^{0x} + be^{ix} + ce^{-ix}$ are solutions. Let us see if we can solve for $a$, $b$, $c$ by substituting the initial conditions:
\[ y(0) = 0: \quad a + b + c = 0 \]
\[ y'(0) = 1: \quad ib - ic = 1 \]
\[ y''(0) = 1: \quad -b - c = 1 \]
We can solve this system, obtaining
\[ a = 1 \qquad b = -\frac{1 + i}{2} \qquad c = -\frac{1 - i}{2} \]
Thus the function
\[ f(x) = 1 - \frac{1 + i}{2}e^{ix} - \frac{1 - i}{2}e^{-ix} \]
will solve our problem.

30. Solve the initial value problem
\[ y'' + y = 0 \qquad y(0) = 1 \quad y'(0) = 0 \tag{3.34} \]
Here we have, as general solution, $ae^{ix} + be^{-ix}$. Substituting the initial conditions, we obtain
\[ a + b = 1 \qquad ia - ib = 0 \]
and thus $a = b = 1/2$. Thus we obtain as solution
\[ f(x) = \tfrac12(e^{ix} + e^{-ix}) \]
Notice that we already know from calculus the solution $f(x) = \cos x$. We shall learn in the next section that the initial value problem (3.34) has a unique solution. Thus this interesting equation follows:
\[ \cos x = \tfrac12(e^{ix} + e^{-ix}) \tag{3.35} \]
We shall leave to the exercises the verification of these other relationships between the trigonometric and exponential functions:
\[ \sin x = \frac{1}{2i}(e^{ix} - e^{-ix}) \tag{3.36} \]
\[ e^{ix} = \operatorname{cis} x = \cos x + i \sin x \tag{3.37} \]

First-Order Linear Equations

Now if $f$ is a differentiable function, so is $e^f$, and $(e^f)' = f'e^f$. Letting $y = e^f$, we obtain the differential equation $y' = f'y$. Thus, working backwards we see how to solve an equation of the form
\[ y' = g(x)y \]
Namely, $\exp(\int g)$ is a solution. With a little more ingenuity we can see how to explicitly solve any linear first-order equation. These are differential equations of the form
\[ y' + f(x)y = g(x) \tag{3.38} \]
where $f$, $g$ are continuous in an interval about $a$. Let $H(x) = \int_a^x f$, and consider the new function $z = e^Hy$. Then $z' = e^Hy' + H'e^Hy = e^H(y' + fy)$, since $H' = f$. Since by (3.38), $y' + fy = g$, we have this equation in $z$:
\[ z' = e^Hg \]
which is solvable by integration:
\[ z = \int_a^x e^Hg + c \]
Finally, $y = e^{-H}z$, thus the general solution of (3.38) is found:
\[ y = e^{-H}z = e^{-H}\int_a^x e^Hg + ce^{-H} \tag{3.39} \]
where $H$ is the indefinite integral of $f$, and $c$ is to be found by substituting the initial condition.

Examples

31. $y' + xy = x$, $y(0) = 0$. Here we take $H = \int x = x^2/2$ and consider $z = y\exp(x^2/2)$. Thus the corresponding equation in $z$ is
\[ z' = y'\exp(x^2/2) + yx\exp(x^2/2) = \exp(x^2/2)(y' + xy) = \exp(x^2/2)\,x \]
Thus
\[ z = \int \exp(x^2/2)\,x\,dx + c = \exp(x^2/2) + c \]
so
\[ y = z\exp(-x^2/2) = 1 + c\exp(-x^2/2) \]
Substituting the initial condition: $0 = y(0) = 1 + c\exp(-0^2/2) = 1 + c$, so $c = -1$. The solution thus is $y = 1 - \exp(-x^2/2)$.

32. $y' - 2x^{-1}y = x$, $y(1) = 0$. Here we take $H = -\int 2/x = -2\ln x$ and consider $z = ye^{-2\ln x} = x^{-2}y$. Then $z' = -2x^{-3}y + x^{-2}y' = x^{-2}(y' - 2x^{-1}y) = x^{-2}x = x^{-1}$. We obtain $z = \ln x + c$ and
\[ y = x^2\ln x + cx^2 \]
The initial condition $y(1) = 0$ gives $c = 0$, so $y = x^2\ln x$.
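Formula (3.39) can also be evaluated numerically when the integrals are not available in closed form. The following sketch makes several assumptions of its own: it takes the base point $a$ to be the point where the initial value $y(a) = y_0$ is given, so that $H(a) = 0$ and $c = y_0$; it approximates both integrals by the composite Simpson rule; and the names `simpson` and `solve_linear` are illustrative only.

```python
import math

def simpson(func, lo, hi, steps=200):
    """Composite Simpson approximation of the integral of func over [lo, hi] (steps even)."""
    width = (hi - lo) / steps
    total = func(lo) + func(hi)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * func(lo + i * width)
    return total * width / 3

def solve_linear(f, g, a, y0, x):
    """Value at x of the solution of y' + f(x) y = g(x), y(a) = y0, computed from (3.39).

    H(s) is the integral of f from a to s, so H(a) = 0 and the constant c equals y0.
    """
    H = lambda s: simpson(f, a, s)
    z = y0 + simpson(lambda s: math.exp(H(s)) * g(s), a, x)
    return math.exp(-H(x)) * z

# Example 31: y' + x y = x, y(0) = 0, evaluated at x = 1.
value_31 = solve_linear(lambda s: s, lambda s: s, 0.0, 0.0, 1.0)

# Example 32: y' - (2/x) y = x, y(1) = 0, evaluated at x = 2.
value_32 = solve_linear(lambda s: -2.0 / s, lambda s: s, 1.0, 0.0, 2.0)
```

The two values can be compared with the closed forms found above, $1 - \exp(-x^2/2)$ at $x = 1$ and $x^2 \ln x$ at $x = 2$. The nested quadrature is wasteful (each outer node recomputes $H$), which is acceptable for a check of this size but not a recommended design for serious computation.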
266 3 Ordinary Differential Equations • EXERCISES 19 20. 21. Solve these differential equations: (a) (b) (c) (d) x2exp(x2)/ = x3,.K0) = l У = χ sin χ + cos x, уЩ = О (*'(0, /(0) = 0, t\ t3), (x(0), y(0), z(0)) z(t) = e" + ((1 + i)t)\ z(0) = 1 Solve these differential equations: (a) (b) (c) (d) (e) (f) (g) (h) (i) У = у2 y' cos χ = cos у χ2 + у2/ = о y' = (y2-l)(x2-l) у" = χ/ /=a+x2)y xy2 + (i - χ)/ = о / = ex+y У = sin(x + y) + sin(x - y) Solve these differential equations: (a) (b) (c) (d) (e) У + xy = cos x, X0) = 0 / cos χ + у sin χ = tan x, y(0) = 1 / + xy = x2, X0) = 0 е"У+е'у = е-",у(0) = 1 У=уе-',у(1) = 1 ■(0,1,0) 22. Solve these differential equations: (a) у'" = 2/ + / - 2y = 0, X0) = 0, /(0) = 0, У(0) = 1 (b) /-2/-r = 0,r(0)=l,/(0) = 0 (c) У - (1 + ЗгК + (3/ - 2)/ + у = 0, Х0) = 0, /(0) = 1, у"(0) = 0 3.5 Existence Theorems In this section we shall state and prove the basic existence theorem for ordinary differential equations. The method is due to Picard and is that of successive approximations. (Recall how we found, in Section 2.10, the solution to the equation y' = cy.) The first theorem is about first-order equations. We shall first illustrate the method of successive approximations. Example 33. Successive approximations. There is one and only one solution of / = ex + у у(0) = 0
Now if $f(x)$ solves this equation, then by integration we see that
\[ f(x) = \int_0^x f'(t)\,dt = \int_0^x [e^t + f(t)]\,dt \]
Thus if $T$ is the transformation defined on continuous functions by
\[ Tg(x) = \int_0^x [e^t + g(t)]\,dt \]
we see that $Tf = f$; that is, $f$ is a fixed point of $T$. According to Newton's method we should be able to find $f$ as the limit of the sequence $f_0, Tf_0, T(Tf_0), \ldots, T^nf_0, \ldots$. Let us compute this sequence. We may choose any function for $f_0$, say $f_0 = 0$. Then
\[ Tf_0 = \int_0^x e^t\,dt = e^x - 1 \]
\[ T^2f_0 = T(Tf_0) = \int_0^x (2e^t - 1)\,dt = 2e^x - 2 - x \]
\[ T^3f_0 = T(T^2f_0) = \int_0^x (3e^t - 2 - t)\,dt = 3e^x - 3 - 2x - \frac{x^2}{2} \]
\[ T^4f_0 = 4e^x - 4 - 3x - 2\,\frac{x^2}{2} - \frac{x^3}{3!} \]
\[ T^5f_0 = 5e^x - 5 - 4x - 3\,\frac{x^2}{2} - 2\,\frac{x^3}{3!} - \frac{x^4}{4!} \]
\[ T^nf_0 = ne^x - n - (n - 1)x - (n - 2)\frac{x^2}{2} - \cdots - (n - j)\frac{x^j}{j!} - \cdots - \frac{x^{n-1}}{(n - 1)!} \]
We can't tell yet that this sequence of functions converges, but if we replace $e^x$ by its Taylor expansion we can get a better picture:
\[ T^nf_0 = \sum_{j=0}^{\infty} n\,\frac{x^j}{j!} - \sum_{j=0}^{n-1} (n - j)\frac{x^j}{j!} = \sum_{j=0}^{n-1} j\,\frac{x^j}{j!} + \sum_{j=n}^{\infty} n\,\frac{x^j}{j!} = \sum_{j=1}^{n-1} \frac{x^j}{(j - 1)!} + n\sum_{j=n}^{\infty} \frac{x^j}{j!} \tag{3.40} \]
As $n \to \infty$ the last sum in (3.40) tends to zero, and we obtain
\[ \lim T^nf_0 = \sum_{j=1}^{\infty} \frac{x^j}{(j - 1)!} = xe^x \]
Indeed $xe^x$ solves the given problem! Now we would like to show that the solution is unique. This is easy, because it is easy to verify that $T$ is a contraction:
\[ Tf(x) - Tg(x) = \int_0^x (e^t + f(t) - e^t - g(t))\,dt = \int_0^x (f(t) - g(t))\,dt \]
so in the interval $|x| \le \tfrac12$, say,
\[ \|Tf - Tg\| \le \tfrac12\|f - g\| \]
Thus if $Tf = f$ and $Tg = g$, we obtain $\tfrac12\|f - g\| \ge \|Tf - Tg\| = \|f - g\|$, which is possible only if $f - g = 0$.

Now, the most general differential equation of first order that we shall consider is
\[ y' = F(x, y) \tag{3.41} \]
where $F$ is a real-valued function defined in a neighborhood of the point $(a, b)$ in the plane. A solution is a function $y = f(x)$ defined for $x$ in a neighborhood of $a$ with these properties:
\[ f(a) = b \qquad f'(x) = F(x, f(x)) \]
If $f$ is a solution, it is a fixed point of the transformation
\[ Tg(x) = \int_a^x F(t, g(t))\,dt + b \tag{3.42} \]
The fixed point will be found by the method of successive approximations: $f_0$ = anything, $f_1 = Tf_0$, $f_2 = Tf_1$, and in general $f_n = Tf_{n-1}$. In order to guarantee that this sequence has a limit and the fixed point is unique, we must guarantee the hypothesis of the fixed point theorem. More precisely, we must know enough about the function $F$ in order to guarantee that the transformation defined by (3.42) is a contraction on the space of continuous
functions on a suitable interval about $a$. It suffices (as the proof below shows) if the following condition is satisfied.

Definition 4. Let $F$ be a function of two variables $x$, $y$ in the domain $D$ in $R^{n+m}$ ($x$ ranges in $R^n$ and $y$ in $R^m$). $F$ is Lipschitz in $y$ if there is a constant $M$ such that
\[ |F(x, y_1) - F(x, y_2)| \le M|y_1 - y_2| \]
for all $y_1$, $y_2$ such that $(x, y_1)$ and $(x, y_2)$ are in $D$.

Notice that since $(1 + x)(1 + y)^{-1}$ is not Lipschitz near $y = -1$, we cannot apply Picard's theorem; and in fact it does not hold as we saw in Example 26. We have allowed $x$, $y$ to range through many variables because of the generality we need for Picard's theorem. Notice that if $n = m = 1$, $F$ will be Lipschitz if the partial derivative $\partial F/\partial y$ exists and is bounded. For by the mean value theorem (along the line $x$ = constant)
\[ F(x, y_1) - F(x, y_2) = \frac{\partial F}{\partial y}(x, \xi)(y_1 - y_2) \qquad y_1 < \xi < y_2 \]
and thus we can take the $M$ of Definition 4 to be the bound of $\partial F/\partial y$.

Now let us turn to higher order equations. A differential equation of order $k$ is given in the form
\[ y^{(k)} = F(x, y, y', y'', \ldots, y^{(k-1)}) \tag{3.43} \]
where $F$ is a function defined in a neighborhood of $(a, b_0, \ldots, b_{k-1})$ in $R^{k+1}$. A solution is a $k$-times differentiable function $y = f(x)$ with these properties:
\[ f(a) = b_0,\ f'(a) = b_1,\ \ldots,\ f^{(k-1)}(a) = b_{k-1}, \]
\[ f^{(k)}(x) = F(x, f(x), f'(x), \ldots, f^{(k-1)}(x)) \]
We would like to solve (3.43) with the given initial conditions by successive approximations, but the method is not transparent. However, the problem does reduce to the first-order case by means of a great idea. First, we illustrate.
Example

34. $y'' = 2y' - y$, $y(0) = 0$, $y'(0) = 1$. We introduce a new unknown function $z$ and require that $y' = z$. Then the given equation is reduced to the system
\[ y' = z \qquad y(0) = 0 \]
\[ z' = 2z - y \qquad z(0) = 1 \]
which is first order. Thus, what we seek is the vector-valued solution of the vector differential equation
\[ (y, z)' = (z, 2z - y) \qquad (y(0), z(0)) = (0, 1) \]
This we can rewrite by integration and thus solve by successive approximations. Precisely, the solution is the fixed point of the transformation (defined on pairs of functions):
\[ T(f, g)(x) = \int_0^x (g(t), 2g(t) - f(t))\,dt + (0, 1) \]
Let us compute some of the successive approximations.
\[ (f_0, g_0) = (0, 1) \]
\[ (f_1, g_1) = T(f_0, g_0) = (x,\ 2x + 1) \]
\[ (f_2, g_2) = T(f_1, g_1) = \left(x^2 + x,\ \tfrac32x^2 + 2x + 1\right) \]
\[ (f_3, g_3) = T(f_2, g_2) = \left(\tfrac12x^3 + x^2 + x,\ \tfrac23x^3 + \tfrac32x^2 + 2x + 1\right) \]
\[ (f_4, g_4) = T(f_3, g_3) = \left(\tfrac16x^4 + \tfrac12x^3 + x^2 + x,\ \tfrac{5}{24}x^4 + \tfrac23x^3 + \tfrac32x^2 + 2x + 1\right) \]
It is now not hard to surmise that the general form of $(f_n, g_n)$ is
\[ (f_n, g_n) = \left(\sum_{j=1}^{n} \frac{x^j}{(j - 1)!},\ \sum_{j=0}^{n} \frac{(j + 1)x^j}{j!}\right) \]
and that $\lim f_n = xe^x$.

This then is the typical means of reducing the higher order equation to first order. Given the Equation (3.43), we introduce new unknown functions
$y_0, y_1, \ldots, y_{k-1}$ and replace (3.43) by the first-order system
\[ y_0' = y_1 \]
\[ y_1' = y_2 \]
\[ \vdots \]
\[ y_{k-1}' = F(x, y_0, y_1, \ldots, y_{k-1}) \]
\[ y_0(a) = b_0,\ y_1(a) = b_1,\ \ldots,\ y_{k-1}(a) = b_{k-1} \]
Now the general existence theorem for $k$th-order equations falls directly out of the theorem for first-order equations for systems. The beauty of this trick is that Picard's theorem is no harder for systems and consists merely of verifying that the appropriate transformation defined by an integral on vector-valued functions is a contraction, so the fixed point theorem applies. Here, then, are the fundamental existence and uniqueness theorems for ordinary differential equations.

Theorem 3.3. (Fundamental Existence and Uniqueness Theorem) Let $(a, \mathbf b)$ be a point in $R \times R^n$, and $\mathbf F$ an $R^n$-valued Lipschitz function defined in a neighborhood of $(a, \mathbf b)$. There is an $\varepsilon > 0$ and a unique continuously differentiable $R^n$-valued function $\mathbf f$ defined on $(a - \varepsilon, a + \varepsilon)$ such that $\mathbf f(a) = \mathbf b$ and $\mathbf f'(x) = \mathbf F(x, \mathbf f(x))$ for all $x$ in $(a - \varepsilon, a + \varepsilon)$.

Proof. The idea behind the proof is to change the given problem to a problem involving integration. In fact, by the fundamental theorem of calculus, our desired function is that function $\mathbf f$ such that
\[ \mathbf f(x) = \mathbf b + \int_a^x \mathbf F(t, \mathbf f(t))\,dt \]
for all points $x$ near $a$. That is, we seek a fixed point of the function $T$ defined on $[C((a - \varepsilon, a + \varepsilon))]^n$ (the space of $n$-tuples of continuous functions on $(a - \varepsilon, a + \varepsilon)$) by
\[ T\mathbf f(x) = \mathbf b + \int_a^x \mathbf F(t, \mathbf f(t))\,dt \]
Because $\mathbf F$ is Lipschitz, we can choose $\varepsilon$ so that $T$ is a contraction. We shall of course refer to the distance between functions introduced in Chapter 2. First, since $\mathbf F$ is Lipschitz in a neighborhood of $(a, \mathbf b)$, there is an $M$ and some rectangle $B$ centered at $(a, \mathbf b)$ such that
\[ |\mathbf F(x, \mathbf y) - \mathbf F(x', \mathbf y')| \le M|\mathbf y - \mathbf y'| \]
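for all $(x, \mathbf y)$, $(x', \mathbf y')$ in that rectangle. In particular, $\mathbf F$ is bounded on that rectangle by $K = |\mathbf F(a, \mathbf b)| + M\varepsilon_0$. Let $\varepsilon$ be smaller than each of $\varepsilon_0K^{-1}$, $M^{-1}/2$, $\varepsilon_0$. Let $X$ be the set of $n$-tuples of continuous functions $\mathbf f$ on the interval $(a - \varepsilon, a + \varepsilon)$ such that $\|\mathbf f - \mathbf b\| \le \varepsilon_0$. If $\mathbf f \in X$, then for all $t \in (a - \varepsilon, a + \varepsilon)$, $(t, \mathbf f(t))$ is in $B$ and $\mathbf F$ is defined on $B$, so the transformation
\[ T\mathbf f(x) = \mathbf b + \int_a^x \mathbf F(t, \mathbf f(t))\,dt \]
is well defined on $X$. We verify now that it is a contraction on $X$. Let $\mathbf f \in X$. Then
\[ \|T\mathbf f(x) - \mathbf b\| \le \left|\int_a^x |\mathbf F(t, \mathbf f(t))|\,dt\right| \le K|x - a| \le K\varepsilon < \varepsilon_0 \]
Thus $\|T\mathbf f - \mathbf b\| \le \varepsilon_0$, so $T\mathbf f \in X$ also. Let $\mathbf f, \mathbf g \in X$.
\[ \|T\mathbf f(x) - T\mathbf g(x)\| \le \left|\int_a^x |\mathbf F(t, \mathbf f(t)) - \mathbf F(t, \mathbf g(t))|\,dt\right| \le M\left|\int_a^x \|\mathbf f - \mathbf g\|\,dt\right| \]
\[ \le M|x - a|\,\|\mathbf f - \mathbf g\| \le M\varepsilon\|\mathbf f - \mathbf g\| < \tfrac12\|\mathbf f - \mathbf g\| \]
Thus $T$ is a contraction, so by the fixed point theorem it has a unique fixed point $\mathbf f$. We have
\[ \mathbf f(x) = \mathbf b + \int_a^x \mathbf F(t, \mathbf f(t))\,dt \]
so by the fundamental theorem of calculus $\mathbf f$ is continuously differentiable (because the right-hand side is so), and $\mathbf f(a) = \mathbf b$, $\mathbf f'(x) = \mathbf F(x, \mathbf f(x))$ for all $x \in (a - \varepsilon, a + \varepsilon)$.

Certain remarks on this theorem are necessary. First of all, the general differential equation of first order is of the form $F(y', y, x) = 0$, not $y' = F(y, x)$. The question arises: when can we rewrite the relation $F(y', y, x) = 0$ in the form of Picard's theorem, for in this case we will know that solutions exist. This question, of explicitly solving an equation $H(u, v) = 0$ for one of its variables, say $u$ (so that there is a function $G(v)$ such that $H(u, v) = 0$ if and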
only if $u = G(v)$), will be discussed further in Chapter 7. (Recall from Theorem 2.16 that we have a condition for functions $F$ of two real variables: $\partial F/\partial y \neq 0$. We shall see that this is the general condition.)

Secondly, Picard's theorem only asserts the existence of local solutions. Supposing that $\mathbf F(x, \mathbf y)$ is defined in $I \times R^n$, $I$ any interval in $R$, we can ask if there exists, for each $\mathbf y_0 \in R^n$, a function defined on all of $I$ such that
\[ \mathbf f'(x) = \mathbf F(x, \mathbf f(x)) \quad \text{for all } x \in I \]
\[ \mathbf f(x_0) = \mathbf y_0 \quad \text{for given } x_0 \in I \]
The answer is, in general, no. For example, the function $F(x, y) = y^2$ is certainly Lipschitz in any rectangle, so local solutions always exist. But we already know that if $y' = y^2$, $y$ must be of the form $(c - x)^{-1}$ for some constant $c$. Thus, if we impose an initial condition $f(x_0) = c_0$, the (local) solution is
\[ f(x) = \left(x_0 + \frac{1}{c_0} - x\right)^{-1} \]
On any interval on which the solution exists it is given by this formula (see Exercise 19(a)). Thus there is no solution to this initial value problem in any interval containing the point $x_0 + 1/c_0$.

We now turn to equations of higher order and the reduction to systems of first order. Let us represent a point of $R^{1+(k+1)n}$ by coordinates $(x, \mathbf y_0, \ldots, \mathbf y_k)$, where $x$ is a real number and the $\mathbf y_i$ range through $R^n$.

Theorem 3.4. Let $(a, \mathbf b_0, \ldots, \mathbf b_k) \in R^{1+(k+1)n}$ and let $\mathbf F$ be an $R^n$-valued Lipschitz function defined in a neighborhood of $(a, \mathbf b_0, \ldots, \mathbf b_k)$. There is an $\varepsilon > 0$ and a unique $(k + 1)$-times continuously differentiable $R^n$-valued function $\mathbf f$ defined on $(a - \varepsilon, a + \varepsilon)$ such that
\[ \mathbf f(a) = \mathbf b_0 \qquad \mathbf f^{(i)}(a) = \mathbf b_i \quad 1 \le i \le k \]
\[ \mathbf F(x, \mathbf f(x), \mathbf f'(x), \ldots, \mathbf f^{(k)}(x)) = \mathbf f^{(k+1)}(x) \]

Proof. Consider the $R^{(k+1)n}$-valued function $\mathbf G$ defined in a neighborhood of $(a, \mathbf b_0, \ldots, \mathbf b_k)$ by
\[ \mathbf G(x, \mathbf y_0, \ldots, \mathbf y_k) = (\mathbf y_1, \ldots, \mathbf y_k, \mathbf F(x, \mathbf y_0, \ldots, \mathbf y_k)) \]
Clearly, $G$ is Lipschitz wherever $F$ is. By Theorem 3.3, there is an $\varepsilon > 0$ and a unique function $\mathbf{g}$ defined in $(a - \varepsilon, a + \varepsilon)$ taking values in $R^{(k+1)n}$ such that

$$\mathbf{g}(a) = (\mathbf{b}_0, \ldots, \mathbf{b}_k), \qquad \mathbf{g}'(x) = G(x, \mathbf{g}(x)) \tag{3.44}$$

Writing $\mathbf{g} = (g_0, \ldots, g_k)$ we have $g_i(a) = \mathbf{b}_i$ and

$$(g_0, \ldots, g_k)'(x) = (g_1(x), \ldots, g_k(x), F(x, g_0, \ldots, g_k))$$

Thus, splitting this into coordinates, $g_i' = g_{i+1}$ for $0 \le i < k$, and $g_k'(x) = F(x, g_0, \ldots, g_k)$. Thus $g_0' = g_1$, $g_0'' = g_1' = g_2$, and in general $g_0^{(i)} = g_i$. Thus

$$g_0(a) = \mathbf{b}_0, \qquad g_0^{(i)}(a) = \mathbf{b}_i \quad (1 \le i \le k)$$

and

$$g_0^{(k+1)}(x) = F(x, g_0(x), g_0'(x), \ldots, g_0^{(k)}(x))$$

which solves our problem. The uniqueness follows immediately, for if $f$ is a solution of our original problem then clearly $(f, f', \ldots, f^{(k)})$ solves (3.44), but the solution of that is unique.

• PROBLEMS

23. Let $h_0, \ldots, h_{k-1}$ be infinitely differentiable functions on the interval $I$ and suppose $f$ is a solution of

$$y^{(k)} + \sum_{i=0}^{k-1} h_i y^{(i)} = 0$$

Show that $f$ also must be infinitely differentiable. (Hint: Any solution of

$$y^{(k+1)} + h_{k-1} y^{(k)} + \sum_{i=1}^{k-1} (h_{i-1} + h_i')\, y^{(i)} + h_0' y = 0$$

is also a solution of the first equation.)

24. Prove: If $\{f_n\}$ is a sequence of bounded functions in $C(I)$ such that $\|f_n - f_{n-1}\| \le C_n$, where $\sum C_n < \infty$, then the sequence $\{f_n\}$ converges to a continuous function.

25. The differential equation $y'' + y = 0$ has unique solutions corresponding to the initial conditions

$$y(0) = 1,\ y'(0) = 0 \qquad \text{and} \qquad y(0) = 0,\ y'(0) = 1$$
respectively. Let $C$, $S$ be these two functions. Prove:
(a) $C^2 + S^2 = 1$
(b) $S' = C$, $C' = -S$
(c) $S(2x) = 2S(x)C(x)$
(d) $e^{ix} = C(x) + iS(x)$
Of course, the reader will recognize that $C(x) = \cos x$ and $S(x) = \sin x$, and thus these equations should follow. However, the intention here is to verify these equations on the basis only of the defining differential equation.

26. Sometimes it is of value to find a linear differential equation which has as its space of solutions the vector space spanned by $n$ given functions. We find an equation of $n$th order by substituting the $n$ functions in the equation

$$y^{(n)} + g_{n-1} y^{(n-1)} + \cdots + g_0 y = 0$$

For example, suppose we want to find the linear equation whose solution set is the span of $x$ and $\sin x$. We try a second-order equation $y'' + g y' + h y = 0$ and substitute $x$ and $\sin x$:

$$g + hx = 0, \qquad -\sin x + g\cos x + h\sin x = 0$$

We can solve these linear equations:

$$h(x) = \frac{\sin x}{\sin x - x\cos x}, \qquad g(x) = \frac{-x\sin x}{\sin x - x\cos x}$$

Thus the differential equation is

$$(\sin x - x\cos x)\,y'' - (x\sin x)\,y' + (\sin x)\,y = 0$$

Find the linear differential equation whose solution set is the vector space spanned by the given set of functions.
(a) $x$, $x^2$, [third function illegible in the source]
(b) $e^x$, $e^{ix}$, $e^{(1+i)x}$
(c) $xe^x$, $\exp(x^2)$
(d) $\sin x$, $\cos x$, $\tan x$
(e) $x\sin x$, $\cos x$
(f) $x$, $e^x$, $\tan x$

3.6 Linear Differential Equations

The most important and best understood class of differential equations are those which are linear in the unknown function and its derivatives. We now give the definition of this class.
Definition 5. Let $I$ be an interval in $R$. A linear differential operator of order $k$ is a transformation from the space of $k$-times continuously differentiable functions on $I$ to the space of continuous functions on $I$ of the form

$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} h_i f^{(i)} \tag{3.45}$$

where $h_0, \ldots, h_{k-1}$ are given continuous functions on $I$.

Notice that the coefficient of the highest order term is 1. More generally, it could be any function $h_k$. In this case, if $h_k$ is never zero on $I$, we could divide by $h_k$ and obtain the form (3.45). If $h_k$ sometimes has the value zero, then the theory to be presented here will fail (see Problem 31).

A transformation of the type (3.45) is linear, in the sense that

$$L(f + g) = L(f) + L(g), \qquad L(cf) = cL(f)$$

It follows that the collection of functions which get mapped into zero by $L$, $K(L) = \{f : L(f) = 0\}$, the kernel of $L$, is a vector space of functions. We shall now show that this is a $k$-dimensional vector space. First of all, the equation $L(f) = g$, for a given continuous function $g$ defined on the interval $I$, has a solution $f$ on the whole interval, which is uniquely determined by given initial conditions $f(a) = b_0, \ldots, f^{(k-1)}(a) = b_{k-1}$. In other words, in this case, Picard's theorem is more than local; it gives a solution on the whole interval. We shall verify this fact below (in Proposition 9). Thus, we can state:

Proposition 6. Let $I$ be an interval in $R$, $a \in I$, and $L$ a linear differential operator of order $k$ defined on $I$.
(i) If $g$ is continuous on $I$ and $b_0, \ldots, b_{k-1}$ are any real numbers, there is a unique $C^k$ function $f$ defined on $I$ such that

$$L(f) = g, \qquad f(a) = b_0,\ f'(a) = b_1,\ \ldots,\ f^{(k-1)}(a) = b_{k-1}$$

(ii) The space $K(L)$ of solutions on $I$ of $Lf = 0$ is a vector space of dimension $k$.

Proof. (i) will follow immediately from Proposition 9 below according to the same procedure as in the preceding section for reducing a $k$th-order equation to a first-order system.
(ii) Let $E_a$ be the transformation from $K(L)$ to $R^k$ defined by evaluation at $a$:

$$E_a(f) = (f(a), f'(a), \ldots, f^{(k-1)}(a))$$

$E_a$ is linear, and by the existence and uniqueness theorem it is one-to-one and onto. Thus $K(L)$ also has dimension $k$.

Let us reconsider briefly the case of constant coefficient linear operators:

$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} a_i f^{(i)} \tag{3.46}$$

We associate to $L$ the polynomial

$$P_L(X) = X^k + \sum_{i=0}^{k-1} a_i X^i$$

(called the characteristic polynomial of $L$). We have already seen, by substitution of $f(x) = e^{rx}$, that if $P_L(r) = 0$, then $e^{rx}$ is in $K(L)$. Now if $P_L$ has $k$ distinct roots $r_1, \ldots, r_k$, then all of the functions $\exp(r_1 x), \ldots, \exp(r_k x)$ are in $K(L)$, as well as all linear combinations of these. Since $K(L)$ has dimension $k$, these exponential functions form a basis for $K(L)$, and every solution of $L(f) = 0$ is of the form

$$A_1 \exp(r_1 x) + \cdots + A_k \exp(r_k x)$$

where the $A_i$ are to be determined by the initial conditions. In case $P_L$ does not have $k$ distinct roots (for example, $P_L(X) = X^2 - 2X + 1$), the situation is more complicated. We shall complete this discussion in the next chapter, where we shall also discuss the question of factoring polynomials.

Examples

35. Solve $y''' + 3y'' + 2y' = 0$ with the initial conditions $y(0) = 0$, $y'(0) = 1$, $y''(0) = 1$. The characteristic polynomial, $X^3 + 3X^2 + 2X$, has the roots $0, -2, -1$. Thus the general solution is of the form $A + Be^{-2x} + Ce^{-x}$. We solve for $A$, $B$, $C$ by substituting the initial conditions:

$$A + B + C = 0, \qquad -2B - C = 1, \qquad 4B + C = 1$$
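The three initial conditions of Example 35 form a small linear system for $A$, $B$, $C$. As an illustration only, here is a minimal Python sketch that solves it exactly with rational arithmetic. The helper name `solve_linear` is hypothetical (it is not part of the text), and the sketch assumes the coefficient matrix is nonsingular.

```python
from fractions import Fraction


def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination.

    Hypothetical helper for illustration; assumes the matrix is nonsingular
    (otherwise the pivot search below raises StopIteration).
    """
    n = len(matrix)
    # Augmented matrix with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(r)]
           for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        # Clear this column in every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Rows encode y(0) = 0, y'(0) = 1, y''(0) = 1 for y = A + B e^{-2x} + C e^{-x}.
coeffs = solve_linear([[1, 1, 1], [0, -2, -1], [0, 4, 1]], [0, 1, 1])
print(coeffs)
```

The same helper applies to any constant coefficient equation with distinct characteristic roots $r_1, \ldots, r_k$: row $j$ of the matrix is $(r_1^{\,j}, \ldots, r_k^{\,j})$ for $j = 0, \ldots, k-1$.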
Solving, we find $A = 2$, $B = 1$, $C = -3$, so the solution is $f(x) = 2 + e^{-2x} - 3e^{-x}$.

Linear Systems with Constant Coefficients

We now turn to the solution of systems of linear differential equations with constant coefficients. First, let us try to see an example through to the end.

36. Consider the system

$$x_1' = x_1 + x_2, \quad x_1(0) = a; \qquad x_2' = x_1 - x_2, \quad x_2(0) = b \tag{3.47}$$

According to the fundamental theorem we can approach a solution by successive approximations using the transformation

$$T(x_1(t), x_2(t)) = \int_0^t \big(x_1(\tau) + x_2(\tau),\ x_1(\tau) - x_2(\tau)\big)\,d\tau + (a, b) \tag{3.48}$$

It is convenient to use matrix notation. Thus, writing

$$\mathbf{x} = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}, \qquad \mathbf{x}_0 = \begin{pmatrix} a \\ b \end{pmatrix}, \qquad M = \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}$$

(3.47) becomes

$$\mathbf{x}' = M\mathbf{x}, \qquad \mathbf{x}(0) = \mathbf{x}_0$$

and Equation (3.48) becomes

$$T\mathbf{x}(t) = \int_0^t M\mathbf{x}(\tau)\,d\tau + \mathbf{x}_0$$

Now, we successively approximate, writing $\mathbf{x}_n$ (in boldface) for the $n$th approximation:

$$\mathbf{x}_1 = \int_0^t M\mathbf{x}_0\,d\tau + \mathbf{x}_0 = Mt\,\mathbf{x}_0 + \mathbf{x}_0$$

$$\mathbf{x}_2 = \int_0^t M\big(M\tau\,\mathbf{x}_0 + \mathbf{x}_0\big)\,d\tau + \mathbf{x}_0 = M^2\frac{t^2}{2}\,\mathbf{x}_0 + Mt\,\mathbf{x}_0 + \mathbf{x}_0$$

$$\mathbf{x}_3 = \left[M^3\frac{t^3}{3!} + M^2\frac{t^2}{2!} + Mt + I\right]\mathbf{x}_0$$

$$\mathbf{x}_n = \left[I + \sum_{k=1}^{n} \frac{M^k t^k}{k!}\right]\mathbf{x}_0$$

According to the fundamental theorem the series converges to the solution

$$\mathbf{x}(t) = \left[I + \sum_{k=1}^{\infty} \frac{M^k t^k}{k!}\right]\mathbf{x}_0 \tag{3.49}$$

The formula (3.49) represents the solution in the sense that it describes a way of computing approximations to the pair of functions $x_1(t)$, $x_2(t)$. (The question of measuring the accuracy of those approximations is important; we shall return to those questions in Chapter 5.) However, we have not obtained formulas for the functions individually. That is not really surprising, since the functions are given by an interdependent relation (3.47).

By analogy with the series for $e^x$, we define the exponential of a matrix as

$$\exp(M) = e^M = I + \sum_{k=1}^{\infty} \frac{M^k}{k!} \tag{3.50}$$

Then we can write the solution to (3.47) as

$$\mathbf{x}(t) = \exp\left[t\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}\right]\mathbf{x}_0$$

We now state a proposition which summarizes this discussion for general linear first-order systems.

Proposition 7. Consider the linear first-order system of $n$ equations in $n$ unknown functions:

$$\mathbf{x}'(t) = M\mathbf{x}(t), \qquad \mathbf{x}(0) = \mathbf{x}_0$$
where $\mathbf{x} = (x_1, \ldots, x_n)$ and $M$ is an $n \times n$ matrix. The solution is given by

$$\mathbf{x}(t) = e^{Mt}\mathbf{x}_0$$

Proof. We find, by successive approximations, the fixed point of

$$T\mathbf{x}(t) = \int_0^t M\mathbf{x}(\tau)\,d\tau + \mathbf{x}_0$$

We obtain

$$\mathbf{x}_0 = \mathbf{x}_0$$

$$\mathbf{x}_1 = Mt\,\mathbf{x}_0 + \mathbf{x}_0$$

$$\mathbf{x}_2 = \frac{(Mt)^2}{2!}\,\mathbf{x}_0 + Mt\,\mathbf{x}_0 + \mathbf{x}_0$$

$$\mathbf{x}_n = \left(\frac{(Mt)^n}{n!} + \frac{(Mt)^{n-1}}{(n-1)!} + \cdots + Mt + I\right)\mathbf{x}_0$$

By the fundamental theorem the sequence of vector-valued functions $\mathbf{x}_n$ converges to the solution of the given differential equations. But the limit of $\mathbf{x}_n$ is given by

$$\mathbf{x}(t) = \left[I + \sum_{n=1}^{\infty} \frac{(Mt)^n}{n!}\right]\mathbf{x}_0 \tag{3.51}$$

Although we have not questioned the convergence of the series (3.50), we know there is no problem. For, by the fundamental theorem the sequence $\{\mathbf{x}_n\}$ converges, so the series in (3.51) must converge. Finally, $e^M$ is just $e^{Mt}$ at $t = 1$.

Finding the exponential of a matrix is not an easy thing to do; ordinarily it is best to just work with the series and approximate solutions. However, in certain cases we can obtain explicit formulas for the solution.

Examples (Eigenvectors)

37. Suppose the matrix $M$ is diagonal. Then

$$M = \begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix}$$
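and the equations are

$$x_1' = d_1 x_1, \quad x_2' = d_2 x_2, \quad \ldots, \quad x_n' = d_n x_n$$

However, this system is not a system at all, but just $n$ independent equations. The solutions are

$$x_1 = \exp(d_1 t)\,x_1(0), \quad \ldots, \quad x_n = \exp(d_n t)\,x_n(0)$$

Thus, in particular, we see that

$$\exp\begin{pmatrix} d_1 & & 0 \\ & \ddots & \\ 0 & & d_n \end{pmatrix} = \begin{pmatrix} \exp(d_1) & & 0 \\ & \ddots & \\ 0 & & \exp(d_n) \end{pmatrix}$$

38. Suppose that the vector of initial conditions $\mathbf{x}_0$ is an eigenvector of $M$: $M\mathbf{x}_0 = \lambda\mathbf{x}_0$ for some $\lambda$. Then $M^2\mathbf{x}_0 = \lambda^2\mathbf{x}_0, \ldots, M^n\mathbf{x}_0 = \lambda^n\mathbf{x}_0$, so we can compute the solution explicitly:

$$\mathbf{x}(t) = e^{Mt}\mathbf{x}_0 = \left(I + \sum_{n=1}^{\infty} \frac{t^n M^n}{n!}\right)\mathbf{x}_0 = \mathbf{x}_0 + \sum_{n=1}^{\infty} \frac{t^n}{n!} M^n\mathbf{x}_0 = \mathbf{x}_0 + \sum_{n=1}^{\infty} \frac{t^n\lambda^n}{n!}\,\mathbf{x}_0 = e^{\lambda t}\mathbf{x}_0$$

This computation leads us to speculate as to the existence and quantity of eigenvectors of the $(n \times n)$-matrix $M$. In general, this is a difficult quest and still does not lead to a complete explicit solution of the differential equation $\mathbf{x}' = M\mathbf{x}$. However, if there is a basis of $R^n$ of eigenvectors of $M$, then we can give a complete explicit solution.

Proposition 8. Suppose $\mathbf{v}_1, \ldots, \mathbf{v}_n$ are independent eigenvectors of the $(n \times n)$-matrix $M$, with eigenvalues $\lambda_1, \ldots, \lambda_n$, respectively. Then the equation $\mathbf{x}' = M\mathbf{x}$, $\mathbf{x}(0) = \mathbf{x}_0$ can be solved explicitly as follows. Write $\mathbf{x}_0 = c^1\mathbf{v}_1 + \cdots + c^n\mathbf{v}_n$. The solution is

$$\mathbf{x}(t) = c^1\exp(\lambda_1 t)\,\mathbf{v}_1 + \cdots + c^n\exp(\lambda_n t)\,\mathbf{v}_n$$

Proof. We compute the series (3.51) directly:

$$M\mathbf{x}_0 = M(c^1\mathbf{v}_1 + \cdots + c^n\mathbf{v}_n) = c^1\lambda_1\mathbf{v}_1 + \cdots + c^n\lambda_n\mathbf{v}_n$$

$$M^2\mathbf{x}_0 = M(c^1\lambda_1\mathbf{v}_1 + \cdots + c^n\lambda_n\mathbf{v}_n) = c^1\lambda_1^2\mathbf{v}_1 + \cdots + c^n\lambda_n^2\mathbf{v}_n$$

$$M^k\mathbf{x}_0 = c^1\lambda_1^k\mathbf{v}_1 + \cdots + c^n\lambda_n^k\mathbf{v}_n$$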
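Thus

$$\left(I + \sum_{k=1}^{\infty} \frac{M^k t^k}{k!}\right)\mathbf{x}_0 = \sum_{j=1}^{n} c^j\left(1 + \sum_{k=1}^{\infty} \frac{t^k\lambda_j^k}{k!}\right)\mathbf{v}_j = \sum_{j=1}^{n} c^j\exp(\lambda_j t)\,\mathbf{v}_j$$

Examples

39. Consider the system of differential equations

$$x' = -x + y, \quad x(0) = 4; \qquad y' = -6x + 4y, \quad y(0) = 12$$

We find the eigenvalues and eigenvectors corresponding to this system of equations as in Section 1.7. Let

$$M = \begin{pmatrix} -1 & 1 \\ -6 & 4 \end{pmatrix}$$

Then $\det(M - \lambda I) = \lambda^2 - 3\lambda + 2$. The eigenvalues are the roots $\lambda = 1, 2$ of this polynomial. The eigenvalue 1 has as eigenspace the kernel of

$$M - I = \begin{pmatrix} -2 & 1 \\ -6 & 3 \end{pmatrix}$$

The vector $(1, 2)$ spans the kernel. Similarly, the vector $(1, 3)$ is an eigenvector of $M$ with eigenvalue 2 since it is in the kernel of

$$M - 2I = \begin{pmatrix} -3 & 1 \\ -6 & 2 \end{pmatrix}$$

The general solution of the given differential equation is

$$\begin{pmatrix} x(t) \\ y(t) \end{pmatrix} = c_1 e^{t}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 e^{2t}\begin{pmatrix} 1 \\ 3 \end{pmatrix}$$

This vector has the initial conditions $x(0) = c_1 + c_2$, $y(0) = 2c_1 + 3c_2$. Our initial conditions are $(4, 12)$, so we can solve for $c_1$, $c_2$: $c_1 = 0$,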
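$c_2 = 4$. Thus, we obtain the explicit solution

$$x(t) = 4e^{2t}, \qquad y(t) = 12e^{2t}$$

40.

$$\mathbf{x}' = \begin{pmatrix} 1 & 1 \\ 1/4 & 1 \end{pmatrix}\mathbf{x}, \qquad \mathbf{x}(0) = \begin{pmatrix} 3 \\ 3 \end{pmatrix}$$

The eigenvalues are $1/2$, $3/2$, and they have eigenvectors $(2, -1)$, $(2, 1)$, respectively. Thus the general solution is

$$\mathbf{x}(t) = c_1 e^{t/2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} + c_2 e^{3t/2}\begin{pmatrix} 2 \\ 1 \end{pmatrix}$$

Substituting in our initial conditions, we obtain these equations for $c_1$, $c_2$:

$$3 = 2c_1 + 2c_2, \qquad 3 = -c_1 + c_2$$

The solutions are $c_1 = -3/4$, $c_2 = 9/4$. Thus the solution of the given system is

$$\mathbf{x}(t) = -\frac{3}{4}e^{t/2}\begin{pmatrix} 2 \\ -1 \end{pmatrix} + \frac{9}{4}e^{3t/2}\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -\tfrac{3}{2}e^{t/2} + \tfrac{9}{2}e^{3t/2} \\ \tfrac{3}{4}e^{t/2} + \tfrac{9}{4}e^{3t/2} \end{pmatrix}$$

41.

$$\mathbf{x}' = \begin{pmatrix} 1 & 1 \\ -1 & 1 \end{pmatrix}\mathbf{x}, \qquad \mathbf{x}(0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \tag{3.52}$$

The equation for $\lambda$ here turns out to be $(1 - \lambda)^2 + 1 = 0$, so $\lambda = 1 \pm i$. The root $\lambda = 1 + i$ gives the eigenvector $(1, i)$, and for the root $\lambda = 1 - i$ we obtain the eigenvector $(1, -i)$. Now our initial conditions are $(1, 0) = (1, i)/2 + (1, -i)/2$, and thus we obtain the solution

$$\mathbf{x}(t) = \frac{1}{2}e^{(1+i)t}\begin{pmatrix} 1 \\ i \end{pmatrix} + \frac{1}{2}e^{(1-i)t}\begin{pmatrix} 1 \\ -i \end{pmatrix} \tag{3.53}$$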
There is an easier way to solve this equation, and that consists in recognizing that the matrix is of the form

$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

and represents multiplication by a complex number, namely $a - ib$: in our case $1 - i$. Thus, we can replace our system (3.52) by the single equation

$$z'(t) = (1 - i)z(t), \qquad z(0) = 1$$

by substituting $z(t) = x(t) + iy(t)$. This has the solution

$$z(t) = e^{(1-i)t}$$

which is the same as (3.53), of course:

$$x(t) = \operatorname{Re} z(t) = \frac{1}{2}\left(e^{(1-i)t} + e^{(1+i)t}\right), \qquad y(t) = \operatorname{Im} z(t) = \frac{1}{2i}\left(e^{(1-i)t} - e^{(1+i)t}\right)$$

42. Find the general solution of

$$\mathbf{x}' = \begin{pmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{pmatrix}\mathbf{x}$$

Now, after computation, we find $\det(M - \lambda I) = (-2 - \lambda)^2(4 - \lambda)$; thus $M$ has the eigenvalues $-2$, $4$.

Eigenvalue $-2$:

$$M - \lambda I = \begin{pmatrix} 3 & -3 & 3 \\ 3 & -3 & 3 \\ 6 & -6 & 6 \end{pmatrix}$$

Thus the corresponding eigenvectors lie in the plane $x - y + z = 0$. Two independent vectors in this plane are $(1, 2, 1)$, $(0, 1, 1)$.
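A quick arithmetic check of the eigenvalue $-2$ step can be sketched in Python. The matrix and the two vectors are taken from Example 42; the helper names `mat_vec` and `is_eigenvector` are hypothetical and serve only this illustration.

```python
def mat_vec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def is_eigenvector(m, v, lam):
    """True when m v equals lam v exactly (integer arithmetic, v nonzero)."""
    return any(c != 0 for c in v) and mat_vec(m, v) == [lam * c for c in v]


# Matrix of Example 42.
M = [[1, -3, 3],
     [3, -5, 3],
     [6, -6, 4]]

for v in ([1, 2, 1], [0, 1, 1]):
    in_plane = (v[0] - v[1] + v[2] == 0)   # the plane x - y + z = 0
    print(v, in_plane, is_eigenvector(M, v, -2))
```

Any vector in the plane $x - y + z = 0$ passes the same test, since every row of $M + 2I$ is a multiple of $(1, -1, 1)$.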
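Eigenvalue 4:

$$M - \lambda I = \begin{pmatrix} -3 & -3 & 3 \\ 3 & -9 & 3 \\ 6 & -6 & 0 \end{pmatrix}$$

The corresponding eigenvectors lie on the line $-x - y + z = 0$, $x - y = 0$, which is spanned by $(1, 1, 2)$. Thus, the general solution is

$$\mathbf{x}(t) = c_1 e^{-2t}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + c_2 e^{-2t}\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + c_3 e^{4t}\begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}$$

43. Solve $\mathbf{x}' = M\mathbf{x}$, $\mathbf{x}(0) = (3, -3, 3)$, where $M$ is the matrix of Example 9. [The matrix display is illegible in the source.]

This matrix is symmetric, so it has a spanning set of eigenvectors. We already found them in Example 9: $(1, -1, 0)$, $(0, -1, 1)$ have the eigenvalue 1, and $(1, 1, 1)$ has the eigenvalue [illegible in the source]. With the exponents as printed, the general solution is

$$\mathbf{x}(t) = c_1 e^{t}\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + c_2 e^{t}\begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix} + c_3 e^{2t}\begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$

The initial condition is

$$\begin{pmatrix} 3 \\ -3 \\ 3 \end{pmatrix} = 2\begin{pmatrix} 1 \\ -1 \\ 0 \end{pmatrix} + 2\begin{pmatrix} 0 \\ -1 \\ 1 \end{pmatrix} + \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}$$

Thus the solution is

$$x_1(t) = 2e^{t} + e^{2t}, \qquad x_2(t) = -4e^{t} + e^{2t}, \qquad x_3(t) = 2e^{t} + e^{2t}$$

44.

$$\mathbf{x}' = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix}\mathbf{x}, \qquad \mathbf{x}(0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \tag{3.54}$$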
The equation for the eigenvalues is $(1 - \lambda)^2 = 0$, so we obtain only one eigenvalue, $\lambda = 1$. This has the eigenvector $(1, 0)$. Thus we know one solution of the general equation:

$$\mathbf{x}(t) = e^{t}\begin{pmatrix} 1 \\ 0 \end{pmatrix}$$

However, this does not satisfy the given conditions. We cannot proceed to solve this equation without further study of the matrix, and that is generally a difficult search. In the present case we can avoid such difficulties by observing that the second row of (3.54) is just $y' = y$, $y(0) = 1$. This has the solution $y(t) = e^{t}$. Then the first row is

$$x'(t) = x(t) + e^{t}, \qquad x(0) = 0$$

and we know how to solve this equation: $x(t) = te^{t}$. Thus our sought-after pair is $(te^{t}, e^{t})$.

Notice that in the last example the solutions are not linear combinations of exponentials, but admit polynomial factors. Only when there is a basis of eigenvectors are the solutions linear combinations of exponentials; when there are too few eigenvectors, we must expect more complicated coefficients. There is a theorem that any solution of a first-order linear system with constant coefficients is a combination of exponentials with polynomial factors. This theorem follows from the Jordan canonical form of a matrix; we shall not go into it here.

We conclude this section with the proof of the global version of Picard's theorem from which Proposition 6 was obtained.

Proposition 9. (Global Version of Picard's Theorem) Let $I$ be an interval in $R$. Suppose $F$ is a continuous $R^n$-valued function defined on $I \times R^n$ which satisfies this strong Lipschitz property: there is a constant $K > 0$ such that for all $\mathbf{y}_1, \mathbf{y}_2 \in R^n$

$$\sup\{\|F(x, \mathbf{y}_1) - F(x, \mathbf{y}_2)\| : x \in I\} \le K\|\mathbf{y}_1 - \mathbf{y}_2\| \tag{3.55}$$

Then the system of $n$ equations

$$\mathbf{y}' = F(x, \mathbf{y}), \qquad \mathbf{y}(c) = \mathbf{a}$$

has a unique solution for any initial condition $\mathbf{a}$ at $c \in I$.
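For a linear system $F(x, \mathbf{y}) = M\mathbf{y}$ the strong Lipschitz property (3.55) holds on every interval, and by the proof of Proposition 7 the $n$th successive approximation is the partial sum $\sum_{k=0}^{n} (t^k/k!)\,M^k\mathbf{x}_0$. The following Python sketch, offered only as an illustration, computes these partial sums for the matrix of Example 44 and compares them with the pair $(te^{t}, e^{t})$ found there. The function names `mat_vec` and `picard_iterate` are hypothetical.

```python
import math


def mat_vec(m, v):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def picard_iterate(m, x0, t, n):
    """Return the n-th successive approximation sum_{k=0}^{n} (t^k / k!) M^k x0."""
    term = [float(c) for c in x0]   # the k = 0 term
    total = list(term)
    for k in range(1, n + 1):
        # Build the k-th term from the previous one: multiply by M and by t / k.
        term = [t / k * c for c in mat_vec(m, term)]
        total = [s + c for s, c in zip(total, term)]
    return total


M = [[1, 1], [0, 1]]     # matrix of Example 44
x0 = [0, 1]              # initial condition of Example 44
t = 1.0

approx = picard_iterate(M, x0, t, 25)
exact = [t * math.exp(t), math.exp(t)]   # (t e^t, e^t) from Example 44
print(approx, exact)
```

The polynomial factor $t$ in the first component appears because $M^k(0, 1) = (k, 1)$, so the first component of the series is $\sum_k k\,t^k/k! = te^{t}$.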
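Proof. We cannot simply use the fixed point theorem, for the transformation $T$ defined by

$$T\mathbf{f}(x) = \mathbf{a} + \int_c^x F(t, \mathbf{f}(t))\,dt$$

is not a contraction on the space of functions continuous on the interval $I$. Nevertheless the successive approximations procedure works. Define a sequence $\{\mathbf{f}_n\}$ of continuous functions on $I$ by induction:

$$\mathbf{f}_0(x) = \mathbf{a}$$

$$\mathbf{f}_1(x) = \mathbf{a} + \int_c^x F(t, \mathbf{f}_0(t))\,dt$$

$$\mathbf{f}_n(x) = \mathbf{a} + \int_c^x F(t, \mathbf{f}_{n-1}(t))\,dt$$

The sequence $\{\mathbf{f}_n\}$ converges in $C(I)$. By making $K$ larger we may assume that besides (3.55) we also have $\|F(x, \mathbf{a})\| \le K$ for all $x \in I$. We prove by induction that

$$\|\mathbf{f}_n(x) - \mathbf{f}_{n-1}(x)\| \le \frac{K^n}{n!}|x - c|^n \tag{3.56}$$

(i) $n = 1$:

$$\|\mathbf{f}_1(x) - \mathbf{f}_0(x)\| = \left\|\int_c^x F(t, \mathbf{a})\,dt\right\| \le K\left|\int_c^x dt\right| = K|x - c|$$

(ii) $n - 1 \Rightarrow n$:

$$\|\mathbf{f}_n(x) - \mathbf{f}_{n-1}(x)\| = \left\|\int_c^x \big[F(t, \mathbf{f}_{n-1}(t)) - F(t, \mathbf{f}_{n-2}(t))\big]\,dt\right\| \le K\left|\int_c^x \|\mathbf{f}_{n-1}(t) - \mathbf{f}_{n-2}(t)\|\,dt\right| \le K\,\frac{K^{n-1}}{(n-1)!}\left|\int_c^x |t - c|^{n-1}\,dt\right| = \frac{K^n}{n!}|x - c|^n$$

From (3.56), we obtain

$$\|\mathbf{f}_n - \mathbf{f}_{n-1}\| \le \frac{[K(b - a)]^n}{n!}$$

where $b - a$ here denotes the length of the interval $I$.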
Since the series $\sum [K(b - a)]^n/n!$ converges, it follows that $\{\mathbf{f}_n\}$ is a Cauchy sequence in $C(I)$, so there is an $\mathbf{f} \in C(I)$ such that $\mathbf{f}_n \to \mathbf{f}$. Since $T$ is continuous on $C(I)$, $T\mathbf{f}_n \to T\mathbf{f}$. But $T\mathbf{f}_n = \mathbf{f}_{n+1}$, so $\mathbf{f}_n \to T\mathbf{f}$ also; thus $T\mathbf{f} = \mathbf{f}$. Since $\mathbf{f}$ is a fixed point of $T$ we conclude as in Picard's theorem that $\mathbf{f}$ solves our problem.

Now the fixed point theorem asserts the uniqueness of our fixed point, and we seem to have lost that. But we can regain it on $I$, because locally we have uniqueness, by Picard's theorem. Suppose $\mathbf{g}$ is another solution of the problem; we have to show that $\mathbf{g} = \mathbf{f}$ on $I$. For this purpose we may assume that the point $c$ at which the initial condition is given is one of the end points of $I$. Let

$$R = \sup\{r \in I : \mathbf{f}(x) = \mathbf{g}(x) \text{ for all } x \le r\}$$

Since $\mathbf{f}(c) = \mathbf{a} = \mathbf{g}(c)$, $c$ is in the set on the right. Also, $b$ is an upper bound for this set, so the least upper bound $R$ exists. We have to show that $R = b$. If $R < b$, then the differential equation is defined in a neighborhood of $R$. By Picard's theorem, there is an $\varepsilon > 0$ such that the equation $\mathbf{y}' = F(x, \mathbf{y})$ has a unique solution in $(R - \varepsilon, R + \varepsilon)$ with initial condition $\mathbf{y}(R) = \mathbf{f}(R)$. But both $\mathbf{f}$ and $\mathbf{g}$, when considered as functions on $(R - \varepsilon, R + \varepsilon)$, are such solutions. (Notice $\mathbf{g}(R) = \mathbf{f}(R)$ by continuity.) Thus $\mathbf{f} = \mathbf{g}$ on $(R - \varepsilon, R + \varepsilon)$, so points greater than $R$ belong to the set above, and $R$ is not an upper bound. Thus the assumption $R < b$ is contradicted, so $R = b$ and the proposition is proved.

• EXERCISES

23. Find the general solution of these systems of equations:
(a) $y_1' = 4y_1 - 2y_2$, $\ y_2' = 2y_2 + 4y_1$
(b) $y_1' = y_1 - y_2$, $\ y_2' = ay_1 + y_2$
(c) $y_1' = y_1 + y_2 + y_3$, $\ y_2' = ay_1 + y_2$, $\ y_3' = ay_1 + y_3$

24. Find the solution of these initial value problems:
(a) The system in 23(a) with initial condition $y_1(0) = 1$, $y_2(0) = 1$.
(b) The system in 23(c) with initial conditions $y_1(0) = y_2(0) = 0$, [remaining value illegible in the source].
(c) $y_1' = y_1 + y_2$, $\ y_2' = -y_1 + y_2$ [initial values illegible in the source]
(d) $y_1' = 3y_1 - y_3$, $\ y_2' = y_1 + 2y_2 - y_3$, $\ y_3' = 2y_1 - 2y_3$ [initial values illegible in the source]

25. 
Find the general solution of the equation $\mathbf{x}' = M\mathbf{x}$, where $M$ is given by: (a) the matrix in Example 10. (b) the matrices in Exercise 10. (c) the matrices in Exercise 11.
(d)–(i) [Six matrices; their entries are illegible in the source.]

• PROBLEMS

27. Suppose $M = (a_j^i)$ is an $n \times n$ matrix such that $a_j^i = 0$ if $i \le j$. Show that the solutions of $\mathbf{x}' = M\mathbf{x}$ are all polynomials of degree at most $n$. (Hint: $M^n = 0$.)

28. Show that $\exp(M^t) = (e^M)^t$.

29. Show that if $M$ is skew-symmetric ($M^t = -M$), then $e^M(e^M)^t = I$. For such a matrix the rows form an orthonormal basis. A matrix $A$ with the property $AA^t = I$ is thus called orthogonal, and represents a rotation.

3.7 Second-Order Linear Equations

The most common type of equation arising from physical problems is the second-order linear equation:

$$y'' + a_1(x)y' + a_0(x)y = g(x) \tag{3.57}$$

Thus the techniques for solving such equations have been well developed. In this section, we shall assume that we know one solution of the associated homogeneous equation

$$y'' + a_1(x)y' + a_0(x)y = 0 \tag{3.58}$$

and show how to find the general solution of (3.57). The question of finding this first solution is of course difficult, and further discussion will be postponed until Chapter 5. The technique involved in finding the general solution consists in substituting candidates involving the given solution and a new unknown function, and thereby attempting to reduce the complication in the given equation.
In order to motivate this discussion, let us recall the theory of the first-order equation $y' + h(x)y = g(x)$. The homogeneous equation is easily solved by separation of variables: $f(x) = \exp\left(-\int^x h\right)$ is a solution of $y' + hy = 0$. Now, to find the general solution of the given equation we substitute $y = zf$, where $z$ is some new unknown function. From $f' + hf = 0$, we obtain

$$g = y' + hy = z'f + z(f' + hf) = z'f$$

Thus $z' = f^{-1}g$, so $z$ is found by integration: $z = \int f^{-1}g + c$.

Now, the second-order homogeneous equation (3.58) has two independent solutions. By assumption we know one; call it $f_1$. Let us try to find another by substituting $y = zf_1$. The new equation in $z$ is

$$y'' + a_1y' + a_0y = z''f_1 + 2z'f_1' + zf_1'' + a_1(z'f_1 + zf_1') + a_0zf_1 = z''f_1 + (2f_1' + a_1f_1)z' = 0 \tag{3.59}$$

This equation is linear and of first order in $z'$, and thus we can solve for $z'$ and then integrate to find $z$. We have as a result

$$z(x) = c\int^x f_1(t)^{-2}\exp\left(-\int^t a_1\right)dt$$

Examples

45. The equation $x^2y'' + xy' - y = 0$ has the solution $y(x) = x$. We now find another solution by substituting $y(x) = z(x)x$. We have $y' = z'x + z$, $y'' = z''x + 2z'$, so

$$x^2y'' + xy' - y = z''x^3 + 2z'x^2 + z'x^2 + xz - zx = 0$$

or

$$z''x^3 + 3z'x^2 = 0$$

Dividing by $x^2$ we have $z''x + 3z' = 0$, which we can solve for $z'$ by separation of variables: $z' = Cx^{-3}$, so $z = C_1x^{-2} + C_2$. We can take $z = x^{-2}$, and thus the second solution, $y = zx = x^{-1}$, is found.

46. $\sin x^2$ is a solution of

$$xy'' - y' + 4x^3y = 0$$
We substitute $y = z\sin x^2$ and obtain this differential equation for $z$:

$$z''x\sin x^2 + z'(4x^2\cos x^2 - \sin x^2) = 0$$

Thus

$$\frac{z''}{z'} = -4x\cot x^2 + \frac{1}{x}$$

Integrating, we obtain

$$\ln z' = 2\ln\csc(x^2) + \ln x + C$$

or

$$z' = C_1x\csc^2 x^2$$

Integrating once again, we find

$$z = C_1'\cot x^2 + C_2$$

Thus, the second solution can be chosen as $\cot x^2 \sin x^2 = \cos x^2$ (which we might have guessed at the beginning).

Now that we have a technique for finding two independent solutions for the homogeneous equation, we return to the general equation (3.57). Taking our cue from the first-order case, we try a combination of the solutions of the homogeneous equation. Let us refer to these two solutions of (3.58) as $f_1$, $f_2$. Now, we consider a function of the form

$$y(x) = z_1(x)f_1(x) + z_2(x)f_2(x) \tag{3.60}$$

If we compute $y'$ and $y''$ and substitute into (3.57) we will get a totally unintelligible equation of second order in the two unknown functions $z_1$, $z_2$. What we need, to find two unknown functions, is of course a pair of equations. From where is the second equation to come? We notice, first of all, that the formula (3.60) does not uniquely determine the functions $z_1$, $z_2$, even if we know the sought-after function $y$. For, if $z_1$, $z_2$ are found so that (3.60) gives the solution $y$, then for any function $u$ we may add $uf_2$ to $z_1$ and subtract $uf_1$ from $z_2$, obtaining another pair making (3.60) valid. We thus seek another condition (preferably involving derivatives) which will serve to uniquely identify the functions $z_1$, $z_2$. Differentiating (3.60), we obtain

$$y'(x) = z_1(x)f_1'(x) + z_2(x)f_2'(x) + z_1'(x)f_1(x) + z_2'(x)f_2(x) \tag{3.61}$$
Equations (3.60) and (3.61) will give a pair of linear equations in $z_1(x)$ and $z_2(x)$ if the sum $z_1'(x)f_1(x) + z_2'(x)f_2(x)$ vanishes. This pair of equations (if noncollinear) will then identify $z_1(x)$, $z_2(x)$ in terms of $y(x)$, $y'(x)$. Thus, if that condition is satisfied we know that $z_1$, $z_2$ are uniquely determined by the solution $y$. Turning the argument around, we impose the condition

$$z_1'f_1 + z_2'f_2 = 0 \tag{3.62}$$

and hope now that, together with this condition, the given differential equation will explicitly determine $z_1$, $z_2$. (In fact it will do so theoretically, since Equation (3.57) determines the solution $y$, which in turn determines $z_1$, $z_2$ in the presence of the condition (3.62).) Let us try our idea on Example 45.

Example

47. Solve $x^2y'' + xy' - y = x^2$. We have the two solutions $x$, $x^{-1}$ of the homogeneous equation. We consider $y = z_1x + z_2x^{-1}$ and impose the condition

$$z_1'x + z_2'x^{-1} = 0 \tag{3.63}$$

Now let us substitute this information into the given equation. In the presence of (3.63), we have

$$y' = z_1 - z_2x^{-2}, \qquad y'' = z_1' - z_2'x^{-2} + 2z_2x^{-3}$$

Then

$$x^2 = x^2y'' + xy' - y = x^2z_1' - z_2' + 2z_2x^{-1} + xz_1 - z_2x^{-1} - z_1x - z_2x^{-1}$$

so that

$$x^2z_1' - z_2' = x^2 \tag{3.64}$$

Now the pair of linear Equations (3.63), (3.64) can be solved by Cramer's rule. The determinant of the system is $x\cdot(-1) - x^{-1}\cdot x^2 = -2x$, so

$$z_1' = \frac{-x}{-2x} = \frac{1}{2}, \qquad z_2' = \frac{x^3}{-2x} = -\frac{x^2}{2}$$

Integrating, we find that $z_1 = x/2 + c_1$, $z_2 = -x^3/6 + c_2$, and so the general solution is

$$y = z_1f_1 + z_2f_2 = \tfrac{1}{2}x^2 - \tfrac{1}{6}x^2 + c_1x + c_2x^{-1} = \tfrac{1}{3}x^2 + c_1x + c_2x^{-1}$$
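The computation in Example 47 can be checked mechanically. The sketch below, in Python with exact rational arithmetic, solves the pair (3.63), (3.64) by Cramer's rule at a sample point and confirms that $y = \tfrac{1}{3}x^2 + c_1x + c_2x^{-1}$ satisfies $x^2y'' + xy' - y = x^2$. The function names `cramer_2x2`, `z_primes` and `residual` are hypothetical, and the derivatives in `residual` are entered by hand rather than computed symbolically.

```python
from fractions import Fraction


def cramer_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a11*u + a12*v = b1, a21*u + a22*v = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


def z_primes(x):
    """Return (z1', z2') at the point x from (3.63) and (3.64)."""
    x = Fraction(x)
    # (3.63): z1' * x   + z2' * x^{-1} = 0
    # (3.64): z1' * x^2 - z2'          = x^2
    return cramer_2x2(x, 1 / x, x * x, Fraction(-1), Fraction(0), x * x)


def residual(x, c1, c2):
    """x^2 y'' + x y' - y - x^2 for y = x^2/3 + c1 x + c2/x (derivatives by hand)."""
    x = Fraction(x)
    y = x * x / 3 + c1 * x + c2 / x
    yp = 2 * x / 3 + c1 - c2 / (x * x)
    ypp = Fraction(2, 3) + 2 * c2 / x ** 3
    return x * x * ypp + x * yp - y - x * x


print(z_primes(3))                      # expected: z1' = 1/2, z2' = -x^2/2 at x = 3
print(residual(Fraction(5, 7), 2, -3))  # zero when y solves the equation
```

Because the arithmetic is exact, a zero residual at a few sample points with arbitrary $c_1$, $c_2$ is strong (though not conclusive) evidence that no sign or coefficient was lost in the hand computation.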
Now, it was not an accident that in this case the equations turned out to be a pair of linear first-order equations: this is always the case. We shall now describe the technique in general. Supposing that $f_1$, $f_2$ are two independent solutions of the homogeneous Equation (3.58), we try a function $y = z_1f_1 + z_2f_2$ as solution of (3.57). We impose the condition

$$z_1'f_1 + z_2'f_2 = 0 \tag{3.65}$$

Then

$$y' = z_1f_1' + z_2f_2', \qquad y'' = z_1'f_1' + z_2'f_2' + z_1f_1'' + z_2f_2''$$

Thus (3.57) becomes

$$z_1'f_1' + z_2'f_2' + z_1f_1'' + z_2f_2'' + z_1a_1f_1' + z_2a_1f_2' + z_1a_0f_1 + z_2a_0f_2 = g$$

or

$$z_1'f_1' + z_2'f_2' = g \tag{3.66}$$

the rest of the terms vanishing since $f_1$, $f_2$ solve (3.58). We solve the pair of linear Equations (3.65), (3.66) by Cramer's rule:

$$z_1' = \frac{-f_2\,g}{f_1f_2' - f_2f_1'}, \qquad z_2' = \frac{f_1\,g}{f_1f_2' - f_2f_1'}$$

and these can be integrated in order to find the general solution. One apparent problem is the denominator. If it ever vanishes, these functions may not be integrable. In fact, our whole discussion will break down. Fortunately, we can verify once and for all that this function is nonzero. The function

$$W(x) = f_1(x)f_2'(x) - f_1'(x)f_2(x) = \det\begin{pmatrix} f_1(x) & f_2(x) \\ f_1'(x) & f_2'(x) \end{pmatrix}$$

is called the Wronskian of the pair $f_1$, $f_2$. Notice that

$$W' = f_1f_2'' - f_1''f_2$$
Since $f_1$, $f_2$ solve (3.58), we can easily check that

$$W' + a_1W = 0$$

and thus

$$W(x) = W(x_0)\exp\left(-\int_{x_0}^x a_1\right)$$

Thus if $W$ is nonzero at one point, it is never zero. $W$ is nonzero at $x_0$ if the vectors $(f_1(x_0), f_1'(x_0))$ and $(f_2(x_0), f_2'(x_0))$ are independent; this is guaranteed if the functions $f_1$ and $f_2$ are independent.

Examples

48. Solve $y'' + xy' - y = 1$, $y(0) = 0$, $y'(0) = 0$. It is easy to see that $x$ is a solution of the homogeneous equation. We find another solution by substituting $y = zx$. The equation for $z$ is $z''x + (2 + x^2)z' = 0$. Thus

$$\frac{z''}{z'} = -\frac{2 + x^2}{x} = -2x^{-1} - x$$

so

$$z' = Cx^{-2}\exp\left(-\frac{x^2}{2}\right)$$

Thus we may take as the second solution

$$y(x) = xz(x) = x\int^x t^{-2}\exp\left(-\frac{t^2}{2}\right)dt$$

Now let us refer to the integral by $\varphi(x)$. We solve the given equation by substituting $y = z_1x + z_2x\varphi(x)$; this gives the pair of equations

$$z_1'x + z_2'x\varphi(x) = 0, \qquad z_1' + z_2'\big(x\varphi'(x) + \varphi(x)\big) = 1$$
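Since $\varphi'(x) = x^{-2}\exp(-x^2/2)$, the determinant of this pair is $x^2\varphi'(x) = \exp(-x^2/2)$, and we find, by Cramer's rule,

$$z_1' = \frac{-x\varphi(x)}{\exp(-x^2/2)}, \qquad z_2' = \frac{x}{\exp(-x^2/2)}$$

or

$$z_1(x) = -\int^x t\,\varphi(t)\exp\left(\frac{t^2}{2}\right)dt, \qquad z_2(x) = \exp\left(\frac{x^2}{2}\right)$$

The integral defining $z_1$ is not expressible in closed form, but it nevertheless defines a function. Thus the solution is

$$y(x) = -x\int^x t\exp\left(\frac{t^2}{2}\right)\left[\int^t \tau^{-2}\exp\left(-\frac{\tau^2}{2}\right)d\tau\right]dt + x\exp\left(\frac{x^2}{2}\right)\int^x t^{-2}\exp\left(-\frac{t^2}{2}\right)dt$$

This technique for solving second-order equations is called variation of parameters. It can be applied to higher order linear equations. Suppose we are given such a differential equation:

$$y^{(n)} + \sum_{i=1}^{n-1} a_i(x)y^{(i)} + a_0(x)y = g(x) \tag{3.67}$$

Suppose we have somehow found $n$ independent solutions $f_1, \ldots, f_n$ of the homogeneous equation. Then we try a solution

$$y = z_1f_1 + \cdots + z_nf_n$$

As in the second-order case, the solution will uniquely determine the functions $z_1, \ldots, z_n$ if we impose the conditions

$$z_1'f_1 + \cdots + z_n'f_n = 0$$

$$z_1'f_1' + \cdots + z_n'f_n' = 0$$

$$\vdots$$

$$z_1'f_1^{(n-2)} + \cdots + z_n'f_n^{(n-2)} = 0$$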
296 3 Ordinary Differential Equations In the presence of these conditions, (3.67) becomes z;/(n-i) + ... + z;/(n-i) = 0 We can solve this system as a system of linear equations and then find the z,, ..., z„ by integration. Just as in the second-order case, this system is solvable since the determinant (called the Wronskian of the η functions fi, •••,/n)is never zero. 49. Solve $x3y'" - x2y" + 2xy' -2y = x\x2 - 9). (3.68) The homogeneous equation has the solutions x, x2, x3. Thus we try у = zlx + z2x2 + z3 x3. We impose these conditions: z\x + z'2x2 + z'3x3 = 0 z\ + 2z'2 χ + 3zj x2 = 0 In the presence of these conditions we compute (3.68) to be z'2 + 6z'3 = χ V - 9) The matrix of this system is [χ χ2 χ3 \ 1 2x 3x2\ \0 1 6 / which has the determinant — 2x3 + 6x2 = —2x2{x — 3). Thus, by Cramer's rule, we must have z, = ■ z\ =■ x\x2 -9)x4 -2х\х-Ъ) x\x2 - 9) · x2 -2х\х-Ъ) ζ-2=· -x\x2-9)-2x3 -2х\х-Ъ) After integration we can express the general solution as y{x) = x9 3x* + x' Γχ8 3χΊ J + — + C> X У χ1 χ6 7 2
• EXERCISES

26. Show that the general solution of $y'' + y = f$ can be expressed as

$$y(t) = c_1\cos t + c_2\sin t + \int_0^t \sin(t - \tau)f(\tau)\,d\tau$$

27. Find the general solution of $y'' + y = x$.

28. Find the general solution of:
(a) $y'' - 4y = 1$
(b) $y'' - y = {}$ [right-hand side illegible in the source]
(c) $y'' + 3y' + 2y = \sin x$
(d) [equation illegible in the source]
(e) $x^2y'' - 4xy' + 6y = x^3 + x^2$

29. Find the solution of $x^2y'' - 2y = 2x^2$, $y(0) = 1$, $y'(0) = 1$.

30. Find the general solution of $y'' + xe^xy' - e^xy = 0$.

31. Find the solution of an equation of the form $(\cdot)\,y'' + xy' - y = (\cdot)$, $y(0) = 0$, $y'(0) = 1$. [The leading coefficient and the right-hand side are illegible in the source.]

• PROBLEMS

30. A differential equation of the form

$$a_kx^ky^{(k)} + a_{k-1}x^{k-1}y^{(k-1)} + \cdots + a_1xy' + a_0y = 0$$

where the $a_i$'s are constants can be easily solved. Try the substitution $y = x^s$. You should obtain

$$x^s\big[a_k\,s(s-1)\cdots(s-k+1) + a_{k-1}\,s(s-1)\cdots(s-k+2) + \cdots + a_1s + a_0\big] = 0$$
Thus we need only find the $k$ roots of the polynomial in brackets. Find the general solution of these differential equations:
(a) $x^2y'' - 2xy' + y = 0$
(b) $x^2y'' - 3xy' - 3y = 0$
(c) $x^2y'' + 4xy' + 3y = x^5$
(d) $x^2y'' - xy' + y = 0$

31. Solve the second-order $2 \times 2$ system of equations [system illegible in the source]. (Hint: Go to the first-order $4 \times 4$ system by adding the equations $\mathbf{y}' = \mathbf{z}$.)

3.8 Summary

An $R^n$-valued function $f$ defined in a neighborhood of $x_0$ in $R$ is called differentiable at $x_0$ if

$$\lim_{t \to 0} \frac{f(x_0 + t) - f(x_0)}{t}$$

exists. This limit is denoted $f'(x_0)$. If $f$ is differentiable on an interval $I$ in $R$ its image is a curve in $R^n$. The line through $f(x_0)$ spanned by $f'(x_0)$ is the tangent line to the curve at $f(x_0)$. If $h$ is differentiable in a neighborhood of the curve and has a relative maximum on the curve at $f(x_0)$, then $\langle \nabla h(f(x_0)), f'(x_0)\rangle = 0$. We can deduce the following principle from this.

If $h$, $g$ are differentiable functions defined in a domain in $R^n$, then the maximum (or minimum) of $h$ subject to the restraint $g(x) = 0$ is attained at those points $x$ for which there exists a $\lambda$ such that

$$g(x) = 0, \qquad \nabla h(x) = \lambda\nabla g(x)$$

If $h, g_1, \ldots, g_k$ are differentiable in $R^n$, and $h$ has a maximum (or minimum) subject to the restraints $g_1(x) = 0, \ldots, g_k(x) = 0$ at $x_0$, there exist $\lambda_1, \ldots, \lambda_k$ such that

$$g_1(x_0) = 0, \ \ldots, \ g_k(x_0) = 0, \qquad \nabla h(x_0) = \lambda_1\nabla g_1(x_0) + \cdots + \lambda_k\nabla g_k(x_0)$$

Suppose $f$ is an $R^n$-valued function defined on the interval $I$. $f$ is $C^k$ ($k$-times continuously differentiable) if $f', \ldots, f^{(k)}$ all exist and are continuous. If $f$ is such a function we have Taylor's expansion about any $x_0 \in I$:

$$f(x) = \sum_{n=0}^{k-1} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n + \varepsilon(x - x_0)\,\frac{(x - x_0)^k}{k!}$$

where $\varepsilon(x - x_0)$ is bounded by $M_k = \max\{|f^{(k)}(t)| : t \text{ between } x_0 \text{ and } x\}$. If $f$ has derivatives of all orders, and

$$\lim_{k \to \infty} M_k\,\frac{(x - x_0)^k}{k!} = 0$$

then $f$ can be expanded in an infinite Taylor expansion:

$$f(x) = f(x_0) + \sum_{n=1}^{\infty} \frac{f^{(n)}(x_0)}{n!}(x - x_0)^n$$

A differential equation of order $k$ is a relation involving a function of $x, y, y', \ldots, y^{(k)}$. If there is a $k$-times differentiable function $f$ such that this relation holds for all $x$ after the substitution $y = f(x)$, $y' = f'(x)$, $\ldots$, $y^{(k)} = f^{(k)}(x)$, we say $f$ is a solution of the differential equation.

A linear differential equation of order $k$ is a relation of the form

$$y^{(k)} + \sum_{i=1}^{k-1} a_i(x)y^{(i)} + a_0(x)y = g(x) \tag{3.69}$$

where the functions $a_i$ and $g$ are (at least) continuous on an interval $I$. If $g = 0$, the equation is called homogeneous. The space of solutions of the homogeneous equation is a vector space of dimension $k$. Equation (3.69) has a solution on $I$ uniquely determined by the initial conditions

$$f(x_0) = a_0, \quad f'(x_0) = a_1, \quad \ldots, \quad f^{(k-1)}(x_0) = a_{k-1} \tag{3.70}$$

Any equation of the form $y^{(k)} = F(x, y, y', \ldots, y^{(k-1)})$ has a unique solution subject to the initial conditions (3.70) under these conditions on $F$:
(i) $F$ is defined and continuous in a neighborhood of $(x_0, a_0, \ldots, a_{k-1})$.
(ii) $F$ is Lipschitz: there is an $M$ such that

$$|F(x, y_1, y_1', \ldots, y_1^{(k-1)}) - F(x, y_2, y_2', \ldots, y_2^{(k-1)})| \le M\big(|y_1 - y_2| + |y_1' - y_2'| + \cdots + |y_1^{(k-1)} - y_2^{(k-1)}|\big)$$
Techniques for Solution

1. Successive approximations. The equation

$$y' = F(x, y), \qquad y(x_0) = a_0$$

is solvable if $F$ is Lipschitz near $x_0$. The solution can be approximated by a sequence $\{f_n\}$ defined as follows: $f_0$ = any continuous function,

$$f_1(x) = \int_{x_0}^x F(t, f_0(t))\,dt + a_0, \qquad \ldots, \qquad f_n(x) = \int_{x_0}^x F(t, f_{n-1}(t))\,dt + a_0$$

2. Separation of variables. If $y' = f(x)g(y)$, then the equation

$$\int g^{-1}(y)\,dy = \int f(x)\,dx + C$$

implicitly determines $y$ as a function of $x$.

3. First-order linear equations. The homogeneous equation $y' + fy = 0$ can be solved by separation of variables: $y = c\exp\left(-\int f\right)$. The equation $y' + fy = g$ can be reduced by the substitution $y = z\exp\left(-\int f\right)$. The resulting equation in $z$, namely $z' = g\exp\left(\int f\right)$, is solved by integration.

4. Constant coefficient linear equations. The characteristic polynomial of the differential equation

$$y^{(k)} + a_{k-1}y^{(k-1)} + \cdots + a_1y' + a_0y = 0 \tag{3.71}$$

is the polynomial $X^k + a_{k-1}X^{k-1} + \cdots + a_1X + a_0$. If $r$ is a root of this polynomial, then $e^{rx}$ is a solution of (3.71).

5. First-order linear systems. Let $A$ be an $n \times n$ matrix. The equation in $n$ unknown functions $\mathbf{y} = (y_1, \ldots, y_n)$:
$$\mathbf{y}' = A\mathbf{y}, \qquad \mathbf{y}(0) = \mathbf{y}_0$$

has the solution $\mathbf{y}(t) = e^{At}\mathbf{y}_0$. The exponential of a matrix is defined by

$$e^M = \exp(M) = I + \sum_{n=1}^{\infty} \frac{M^n}{n!}.$$

If $\mathbf{y}_0$ is an eigenvector with eigenvalue $\lambda$, then the solution is $\mathbf{y}(t) = e^{\lambda t}\mathbf{y}_0$. If $R^n$ has a basis $\mathbf{y}_1, \ldots, \mathbf{y}_n$ of eigenvectors of $A$, with eigenvalues $\lambda_1, \ldots, \lambda_n$ respectively, then the general solution is

$$c_1\exp(\lambda_1 t)\mathbf{y}_1 + \cdots + c_n\exp(\lambda_n t)\mathbf{y}_n.$$

In general, we must allow polynomial coefficients.

6. Second-order linear equations, knowing one solution. Suppose $f_1$ is a solution of

$$y'' + a_1(x)y' + a_0(x)y = 0. \tag{3.72}$$

We find a second by substituting $y = zf_1$. This produces a linear first-order equation in $z'$. Suppose $f_1, f_2$ are solutions of (3.72). Then we solve

$$y'' + a_1(x)y' + a_0(x)y = g(x) \tag{3.73}$$

by the substitution $y = z_1 f_1 + z_2 f_2$. In the presence of the condition

$$z_1' f_1 + z_2' f_2 = 0 \tag{3.74}$$

Equation (3.73) becomes

$$z_1' f_1' + z_2' f_2' = g. \tag{3.75}$$

The linear Equations (3.74), (3.75) can be solved for $z_1'$, $z_2'$ and then $z_1$, $z_2$ are found by integration.

• FURTHER READING

E. A. Coddington, An Introduction to Ordinary Differential Equations, Prentice-Hall, Englewood Cliffs, N.J., 1961. An elementary book on differential equations which goes more deeply into the material of this chapter.

M. Tenenbaum and H. Pollard, Ordinary Differential Equations, Harper
and Row, New York, 1963. This is a thorough treatment of the subject of differential equations. Many special techniques and applications are exposed.

F. Brauer and J. A. Nohel, Qualitative Theory of Ordinary Differential Equations, Benjamin, New York, 1967. This book studies the theory of systems of differential equations, and in particular the behavior of sets of solutions.

L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading, Mass., 1968. This is a very modern approach to the subject. It goes thoroughly into the fundamental theorem.

• MISCELLANEOUS PROBLEMS

32. Show that if $M$ is a skew-symmetric matrix ($M^t = -M$), then $\langle M\mathbf{x}, \mathbf{x}\rangle = 0$ for all $\mathbf{x}$. Show that $M^2$ is symmetric and thus has a basis of eigenvectors. Conclude that, considered as a matrix over $C$, $M$ also has a basis of eigenvectors. (Hint: $M^2 - \lambda = (M + \sqrt{\lambda})(M - \sqrt{\lambda})$.) Thus if $\mathbf{x}$ is an eigenvector of $M^2$ with eigenvalue $\lambda$, then either $\mathbf{x}$ is an eigenvector of $M$ with eigenvalue $\sqrt{\lambda}$ or $(M - \sqrt{\lambda})\mathbf{x}$ is an eigenvector of $M$ with eigenvalue $-\sqrt{\lambda}$.

33. Let $T$ be any linear transformation. Compute the gradient of $\langle T\mathbf{x}, T\mathbf{x}\rangle$, and show that the maximum of $\|T\mathbf{x}\|^2$ on $\|\mathbf{x}\|^2 = 1$ is attained at an eigenvalue of $T^tT$.

34. Show that if $T$ is a symmetric matrix, $T(\{\|\mathbf{x}\|^2 = 1\})$ is an ellipsoid whose major axes are of length equal to the eigenvalues of $T$.

35. Find the points $p_0 \in \{(x, y) \in R^2 : xy = 1\}$, $p_1 \in \{(x, y) \in R^2 : y + x^2 = 0\}$ which minimize the distance between these two curves.

36. Minimize and maximize the volume of a box with given surface area.

37. Find the point on the ellipse {x2 + iy2 = 1} which is closest to (i, 0). Find the furthest point from (i, 0).

38. Find the point on the ellipse {x2 + iy2 = 1} which is closest to the circle of radius i centered at (i, i).

39. Suppose $\{a_n\}$ is a bounded sequence. Define $f(x) = \sum_{n=1}^{\infty} a_n x^n$. Show that $f$ is infinitely differentiable in the interval $(-1, 1)$, and $n!\,a_n = f^{(n)}(0)$.

40. 
Let $f$ be a twice continuously differentiable function defined in a neighborhood $N$ of $(0, 0)$ in $R^2$. Show that there is a function $\varepsilon$ defined in $N$ such that $\lim_{p \to 0} \varepsilon(p) = 0$ and

$$f(x, y) = f(0) + \frac{\partial f}{\partial x}(0, 0)\,x + \frac{\partial f}{\partial y}(0, 0)\,y + \varepsilon(x, y)\,\|(x, y)\|.$$

41. Using Taylor's theorem, we can derive the exponential function in yet another way. Suppose that $f$ is a function with the property that
3.8 Summary 303 fix) =/(*) for all x. Then /<"(*) =/(*) for all x, so / must have the Taylor expansion /(*) =2 ~.x" + ek(x) — «=o «! (fc-f 1)! for all k. Because of the estimate on ek, it remains bounded as к -> oo, so we should expect/to be the limit of the polynomials Pk(x) = £iuо (1/и!)лЛ We already know, from the theory of Chapter 2 that the lim Pk(x) exists for all χ Noticing that Д' = Д_,, prove that /(*) = lim Pk(x) does indeed have the property/' =/. 42. With a little bit of patience, and in the same way as in Exercise 41, you should be able to find a function / defined on R such that /(0) = 1 /'(0) = 0 and /<">(*) + f(x) = 0 for all x. 43. (a) Suppose that / is С on [-R, R] and /(0) =/'(0) = · = • /(*-1)(0) = 0. Then there is a continuous function g such that /(f) = ί'«?(ί), and «?(()) = (l/fc!)/<'>(0) (b) Suppose that/ is С on [- R, R]. Show that there is acontinuous function g such that /(0= Σ —^f + t'git) i = o /! 44. Change the conditions in Problem 18 as follows: The ratio of horse population to total population is constant and only the eggs hatched in horses produce mature insects. Derive the differential equations now governing the population growth. 45. Suppose now we have an insect which has a natural death rate of d, per insect per year and which lays h eggs per insect per year in the air The egg hatches if it lands on a horse and the hatching causes the horse's death Assuming birth and death rates bH, dH for the horse and a probability к that a given egg will land on a given horse, now find the differential equations of population. 46. Suppose /(г) = — ζ represents a force field on the plane Let a particle be at 1 at time 0. Describe the motion in case the velocity is r, (1 + i)/2, (1 - i)/2. 47. We assume that a particle generates a force field directed toward the particle and of strength equal to the inverse of the square of the distance to the particle. At time t = 0 there are particles at rest at points pb . ,pk in R2. 
Let $\mathbf{f}_i(t)$ be the position at time $t$ of the particle originally at $p_i$. What is the differential equation the function $(\mathbf{f}_1, \ldots, \mathbf{f}_k)$ must satisfy?

48. Suppose a river deposits water in a lake at the rate of $v$ gal/day. We may assume that $v$ is a periodic function of time with period 365. Suppose two pumps pump water out at the constant rates of $w_1$, $w_2$ gal/day. Finally,
304 3 Ordinary Differential Equations water evaporates out of the lake at a rate of k(t) gal/day/ft2, where к is also periodic with period 365. We may assume that the area of the lake is proportional to WVi, where W(t) is the volume of the water in the lake at day t. Write the differential equation W must satisfy. 49. Suppose a missile A is moving in a straight line with constant velocity Vo. A tracking missile В of constant speed i0 is always pointed toward the missile A. Find the differential equation of motion of the tracking missileB. 50. Suppose we have the same situation as in Problem 49, but this time the speed of В is proportional to the distance between A and B. Find the equation of motion of B. 51. A falling body actually experiences a drag due to air resistance which is proportional to its velocity. Suppose a body of 100 tons is dropped from a plane 5 miles high; and this constant of proportionality (which depends of course on the shape of the body) is 20. How long will it take for the body to reach the ground ? 52. Two chemicals А, В in solution combine to create chemical С according to the equation 2A + В -> С. Suppose the rate of the formation of С is proportional to the product of the amounts of A and В present and inversely proportional to the amount of С present. Find the differential equation governing the formation of C, assuming initial amounts A0, B0 of chemicals A,B. 53. Suppose in the above problem, A0 = 10, B0 = 5, and the proportion constant is 1. How long will it take for the reaction to complete? 54. If two bodies Л, 5 of different temperatures come in contact with each other, the rate of change of temperature is proportional to the difference in temperature (the proportion constant depends on the bodies). Thus if TA, Τ в are the temperatures of A, B, respectively, we have T'A = kA(TA - T„) Τ в = КвКТв Τа) Find the formula for TA, TB with these data: (a) kA = 4, кв = 5, TA(0) = 100, Тв(0) = 0. (b) кА =2,кв = i, TA(0) = 120, Тв(0) = 50. 
55. In Problem 54, as t ->■ oo the bodies tend to a common temperature. What is it in case (a), case (b), in general ? 56. Solve these differential equations: (a) У4) - 3/ + 2y = 0. (b) У + ЗУ + 2у = 2е\ (c) У sin у + cos χ cos у = cos x. (d) (x2+\)y' -2xy = x2+l. (e) xy' + Ъу = x~2 sin x. (f) x' + ax = b sin t. (g) y"=xe>. (h) У4)-У3)-У2)-У-2^ = 0.
3.8 Summary 305 (ι) ay' + by' + су = 0. U) y'{\+x1) = \+y\ (к) x' + y' = 2x. x' ~ У = Ь- ω y-(; j)y. <m> y' = (_f б)у· 57. Solve these initial value problems: (a) у" -Зу'+2у = е3*, уф) = 0, /(0) = 1. (b) xy'+ 3y = x\ y(0) = 5. (c) У4> - 3/2> + 2^ = 0, j<0) = 1, уЩ = 0, /'(0) = 0, /"(0) = 0. (d) У = (J J)y,y(0) = ([). (e) У' = (_з8 8)у.У(0) = (^). (f) e*y" + Xy'-y = e*, y(0) = 0, /(0) = 0. (g) хгу" + 3xy' + y = 0, y(0) = 1, /(0) = 1. (h) хгу" + 4xy' + 2y = x\ y(0) = 1, /(0) = 0. 58. Show that if all the entries of the matrix Μ are less than 1, then the series |m" converges. Show that the limit is (I — M)-1. 59. Use the idea of the preceding problem to approximate A-1 to within two decimals, where (a) /1 0 0.08\ A=[0.07 0.91 0.11 \0.14 -0.03 1.13/ (b) A = 0.98 0.13 0.02 0.11 0.01 1.18 -0.02 -0.11 -0.12 0 1.01 0.13 -0.03 -0.1 0 1
Chapter 4 CURVES

Force Fields

According to Newton's laws of motion, a particle will move in a straight line at constant velocity unless it is subjected to forces. In that case it will accelerate according to Newton's second law

$$\mathbf{F} = m\mathbf{a} \tag{4.1}$$

where $m$ is the mass of the particle. In this chapter we shall study the motion of particles subjected to variable forces. That is, we must allow the possibility that the force applied to the particle depends upon its position (as in gravitation) or even upon time (in the case of a variable electromagnet). This gives rise to the notion of a field of force.

A field of force will be given in this way: at time $t$ and position $\mathbf{x}$ a particle of unit mass will experience a force $\mathbf{F}(\mathbf{x}, t)$. Thus for each $t_0$ we have associated a vector $\mathbf{F}(\mathbf{x}, t_0)$ to each point $\mathbf{x}$. We can illustrate this as in Figure 4.1. Now, we have seen that a particle of unit mass situated at $\mathbf{x}_0$ at time $t = 0$, with a velocity $\mathbf{v}_0$ at $t = 0$, will follow the path of motion determined by the given field of force as the solution of the differential equation

$$\mathbf{f}''(t) = \mathbf{F}(\mathbf{f}(t), t), \qquad \mathbf{f}'(0) = \mathbf{v}_0, \qquad \mathbf{f}(0) = \mathbf{x}_0.$$
Figure 4.1

The path of motion is a curve in space given by the function $\mathbf{f}$ which solves this equation.

Examples

1. Suppose a particle moves around the unit circle in the plane according to the function

$$\mathbf{f}(t) = (\cos t, \sin t). \tag{4.2}$$

What force field would account for this motion? Differentiating twice we find that

$$\mathbf{f}'(t) = (-\sin t, \cos t), \qquad \mathbf{f}''(t) = (-\cos t, -\sin t) = -\mathbf{f}(t).$$

Thus the particle is accelerating toward the origin with constant magnitude (see Figure 4.2). This motion can be accounted for by the force field

$$F(z, t) = -z.$$

In fact, in the presence of this field, if a particle has a velocity at time $t = 0$ orthogonal to its position vector, then it will continue to move in a circle centered at the origin. We can see this by solving the differential equation

$$f''(t) = -f(t), \qquad f(0) = z_0, \qquad f'(0) = iz_0.$$
Figure 4.2

The solution of this equation is

$$f(t) = z_0 e^{it} = z_0(\cos t, \sin t)$$

which is just (4.2) with $z_0 = 1$.

2. Suppose we are given in space a force field directed toward the $z$ axis with magnitude the distance from the $z$ axis (Figure 4.3).

Figure 4.3
Here again the force field is independent of time and is given by

$$\mathbf{F}(x, y, z, t) = -(x, y, 0).$$

If a particle is at $(1, 0, 0)$ with an initial velocity of $(0, 1, a)$, what is its path of motion? We must solve this differential equation for three unknown functions $\mathbf{f}(t) = (x(t), y(t), z(t))$:

$$\mathbf{f}''(t) = (x''(t), y''(t), z''(t)) = (-x(t), -y(t), 0)$$
$$x(0) = 1, \quad y(0) = 0, \quad z(0) = 0$$
$$x'(0) = 0, \quad y'(0) = 1, \quad z'(0) = a$$

The solution is easily found to be

$$\mathbf{f}(t) = (\cos t, \sin t, at).$$

Thus, if $a = 0$, the path of motion is a circle in the plane $z = 0$. If $a$ is positive, the path of motion is an upward spiral lying over the unit circle, of slope $a$, and if $a < 0$, the path followed is a downward spiral (Figure 4.4).

3. Time-independent fields. If we are given a time-independent force field on a domain in $R^2$, or $R^3$, and we graph sufficiently many values of the field, it seems to be a broken line picture of a family of curves. In fact, there is a family of curves which fits the picture in this sense: there is a curve through each point $\mathbf{x}$ which is tangent to the vector $\mathbf{F}(\mathbf{x})$ at that point. These curves are called the lines of force of the field and are found by solving the differential equation

$$\mathbf{f}'(t) = \mathbf{F}(\mathbf{f}(t)), \qquad \mathbf{f}(0) = \mathbf{x}_0.$$

The solution of this differential equation describes the line of force passing through the point $\mathbf{x}_0$.

Fluid Flows

The general notion of field of vectors arises in many other ways besides as force fields. One such example is that of a fluid in motion in a certain domain in $R^3$. There are various ways of describing that flow. First of all, we may idealize, by assuming that at the time $t = 0$,
there is a particle at each point $\mathbf{x}_0$ in $R^3$. Then we can describe the flow by describing the motion of each particle. The particle which is at $\mathbf{x}_0$ at time $t = 0$ follows a certain path which is given by a function $\mathbf{f}(\mathbf{x}_0, t)$. The equations of motion are thus

$$\mathbf{x} = \mathbf{f}(\mathbf{x}_0, t).$$

Precisely, the position $\mathbf{x}$ at time $t$ of the particle originally at $\mathbf{x}_0$ is $\mathbf{f}(\mathbf{x}_0, t)$. We assume that particles are neither created nor destroyed; this amounts to asking that, for each $t$, the function $\mathbf{x}_0 \to \mathbf{f}(\mathbf{x}_0, t)$ is one-to-one and onto, and thus can be inverted. So we can also write

$$\mathbf{x}_0 = \varphi(\mathbf{x}, t)$$

for some function $\varphi$. Precisely, the original position of the particle at $\mathbf{x}$ at time $t$ was $\varphi(\mathbf{x}, t)$.

Figure 4.4 (panels: $a = 0$, $a > 0$, $a < 0$)
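The relation between the flow $\mathbf{f}$ and its inverse $\varphi$ can be illustrated with a small sketch. The shear flow used here, $\mathbf{f}((x_0, y_0), t) = (x_0 + t y_0, y_0)$, is a hypothetical example chosen for simplicity and is not taken from the text; the names `flow` and `original_position` are likewise assumptions. The sketch only checks that $\varphi(\mathbf{f}(\mathbf{x}_0, t), t) = \mathbf{x}_0$ at one sample point.

```python
# Hypothetical shear flow in the plane (an illustration, not from the text):
# the particle that starts at (x0, y0) is at (x0 + t*y0, y0) at time t.
def flow(p0, t):
    x0, y0 = p0
    return (x0 + t * y0, y0)


# Inverse map phi: the original position of the particle found at p at time t.
def original_position(p, t):
    x, y = p
    return (x - t * y, y)


start = (2.0, 3.0)
later = flow(start, 1.5)                   # position at time t = 1.5
recovered = original_position(later, 1.5)  # should give back the start
print(later, recovered)
```

For each fixed $t$ the map `flow` is one-to-one and onto, which is what makes `original_position` well defined.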
4. Suppose a gas is rising at constant speed, and spiraling around the vertical axis. Thus the motion of a particle is a helix as described in Example 2. We do best to express this motion in cylindrical coordinates: Let $z$ be the (complex) coordinate in the plane ($z = re^{i\theta}$) and $w$ the height off the plane. Thus the path of motion described by the gas is

$$(z, w) = (z_0 e^{it}, at + w_0). \tag{4.3}$$

Thus the particle originally at $(z_0, w_0)$ will be at $(z_0 e^{it}, w_0 + at)$ at time $t$. We can certainly invert these equations:

$$(z_0, w_0) = (ze^{-it}, w - at). \tag{4.4}$$

Now, another way to describe a fluid flow is by its velocity. Let $\mathbf{v}(\mathbf{x}, t)$ be the velocity of the particle which is at position $\mathbf{x}$ at time $t$. The field $\mathbf{v}$ is called the velocity field of the flow. We can find the equations of motion from the velocity field by solving the appropriate differential equation. For the function $\mathbf{f}(\mathbf{x}_0, t)$ describes the motion of the particle originally at $\mathbf{x}_0$. The velocity of this particle at time $t$ is $\mathbf{f}'(\mathbf{x}_0, t)$ and its position is $\mathbf{f}(\mathbf{x}_0, t)$. Thus we must have

$$\mathbf{f}'(\mathbf{x}_0, t) = \mathbf{v}(\mathbf{f}(\mathbf{x}_0, t), t), \qquad \mathbf{f}(\mathbf{x}_0, 0) = \mathbf{x}_0.$$

This equation can be solved uniquely.

5. Let us find the velocity field of the gas flow in Example 4. The flow equations are (4.3). The velocity of the particle originally at $\mathbf{x}_0$ is

$$(z', w') = (iz_0 e^{it}, a).$$

To find the velocity field we must write this as a function of position at time $t$, rather than original position. We can do this by means of the inversion (4.4), obtaining as velocity field

$$\mathbf{v}(z, w) = (iz, a).$$

6. Suppose a fluid on the plane is spiraling in toward the origin (Figure 4.5) according to this equation of flow

$$z(t) = \frac{z_0 e^{it}}{t}.$$
Figure 4.5

Here the particle at time $t = 1$ moves toward the origin so that its argument is proportional to time elapsed, and its distance from the origin is inversely proportional to time. Then

$$z'(t) = \frac{iz_0 e^{it}}{t} - \frac{z_0 e^{it}}{t^2} = \left(i - \frac{1}{t}\right)z.$$

Thus the velocity field is

$$v(z, t) = \left(i - \frac{1}{t}\right)z.$$

The angular velocity is thus constant whereas the radial velocity decreases as time goes on.

7. Suppose now a fluid spiraled in toward the origin so that its velocity field was time independent, for example,

$$v(z) = (i - 1)z.$$

The equations of motion are the solutions of

$$f'(t) = (i - 1)f(t), \qquad f(0) = z_0.$$
This gives

$$f(t) = z_0 e^{(-1 + i)t} = z_0 e^{-t}e^{it}.$$

In this case the distance from the origin decreases exponentially with time (Figure 4.6).

We shall make a study of the geometry of paths of motion of single particles and fluid flows, or families of motions, in this chapter. This study is a continuation of analytic geometry, and begins the subject of differential geometry.

Figure 4.6

4.1 Parametrization of Curves

A curve in $R^n$ is a one-dimensional subset $\Gamma$ of $R^n$. This means that the set $\Gamma$ can be put into one-to-one correspondence with a line, in a smooth way. We make this notion a little more precise.

Definition 1. The image in $R^n$ of an interval under a continuously differentiable one-to-one function with a nowhere vanishing derivative is called a $C^1$ curve. If the function is $k$-times continuously differentiable we shall call this curve a $C^k$ curve. The particular function is called a parametrization of the curve.
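As a hedged illustration of the derivative condition in Definition 1, the sketch below takes the spiral $f(t) = z_0 e^{(-1+i)t}$ of Example 7 with the assumed value $z_0 = 1$ and evaluates $|f'(t)|$ at a few sample times. The function names are assumptions, and sampling a few points does not prove the derivative never vanishes; it only agrees with the closed form $|f'(t)| = \sqrt{2}\,e^{-t} > 0$.

```python
import cmath
import math


def spiral(t, z0=1 + 0j):
    # f(t) = z0 * exp((-1 + i) t), the flow line of Example 7
    return z0 * cmath.exp((1j - 1) * t)


def spiral_derivative(t, z0=1 + 0j):
    # f'(t) = (i - 1) f(t), from the differential equation itself
    return (1j - 1) * spiral(t, z0)


sample_times = [0.0, 1.0, 5.0]
speeds = [abs(spiral_derivative(t)) for t in sample_times]
expected = [math.sqrt(2.0) * math.exp(-t) for t in sample_times]
for t, s in zip(sample_times, speeds):
    print(t, s)
```

The speed decays with $t$ but stays positive, so the derivative condition of Definition 1 holds at every sampled point.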
Examples

8. The unit circle in $R^2$ is a curve. It has this parametrization:

$$\Gamma: \; z(t) = (\cos t, \sin t), \qquad t \in R. \tag{4.5}$$

Since $z'(t) = (-\sin t, \cos t)$ is never zero (the sine and cosine are never simultaneously zero), this is a good parametrization. We could also parametrize the unit circle in this way:

$$z(t) = (t, (1 - t^2)^{1/2}) \tag{4.6}$$

but this parametrization fails at $t = \pm 1$, since the function $(1 - t^2)^{1/2}$ is not differentiable there. Notice that (4.6) does not parametrize the whole circle, but only the upper semicircle. Both of these failings can be alleviated by introducing parametrizations which cover the other parts of the circle. That is,

$$z(t) = ((1 - t^2)^{1/2}, t)$$

will parametrize the circle in the right half-plane,

$$z(t) = (t, -(1 - t^2)^{1/2})$$

takes care of the lower semicircle, and so on.

9. It is often convenient to use complex notation to describe curves in the plane. For example, the parametrization of the circle (4.5) can be written as

$$z(t) = \cos t + i\sin t = e^{it}.$$

Another curve is the spiral:

$$z(t) = e^{ct}$$

where $c$ is some complex number. Writing $c = a + ib$, this becomes

$$z(t) = e^{at}e^{ibt}$$

or, in polar notation, $z = re^{i\theta}$,

$$r(t) = e^{at}, \qquad \theta(t) = bt.$$
Thus the modulus of $z$ varies exponentially with $t$, and the argument is linear in $t$ (see Figures 4.7 and 4.8).

Figure 4.7: $z(t) = e^{at}$, $\operatorname{Re} a > 0$

10. The curve

$$\Gamma: \; \mathbf{x}(t) = (\sin t, \cos t, t) \tag{4.7}$$

called a right circular helix, is pictured in Figure 4.9. Since $\mathbf{x}'(t) = (\cos t, -\sin t, 1)$ is never zero, (4.7) is a valid parametrization of the curve.

11. The intersection of two cylinders with different axes is a curve (see Figure 4.10). Suppose the cylinders are both of radius 1 and one, $C_1$, has as axis the $y$ axis, and the other, $C_2$, has as axis the $x$ axis. Then $C_1$ has the equation

$$x^2 + z^2 = 1 \tag{4.8}$$

and $C_2$ has the equation

$$y^2 + z^2 = 1. \tag{4.9}$$
Figure 4.8: $z(t) = e^{at}$, $\operatorname{Re} a < 0$

The intersection is, of course, the set of points where both equations hold and thus can be written $x^2 = 1 - z^2$, $y^2 = 1 - z^2$. We can thus parametrize at least part of the curve by

$$x = (1 - z^2)^{1/2}, \qquad y = (1 - z^2)^{1/2}$$

or

$$\mathbf{f}(t) = ((1 - t^2)^{1/2}, (1 - t^2)^{1/2}, t).$$

Figure 4.9
Figure 4.10

Other parts will be found by variations on this theme:

$$\mathbf{f}(t) = (-(1 - t^2)^{1/2}, (1 - t^2)^{1/2}, t), \qquad \mathbf{f}(t) = (t, t, (1 - t^2)^{1/2})$$

and so on. A simpler parametrization is found by the substitution $x = \cos t$. Then we have the two distinct branches of the intersection given by

$$\mathbf{f}_1(t) = (\cos t, \cos t, \sin t), \qquad \mathbf{f}_2(t) = (\cos t, -\cos t, \sin t).$$

Implicitly Defined Curves

In the situation of the above example, we say that the curve is given implicitly by the Equations (4.8) and (4.9). More often than not, when we are given a collection of equations such as these, we can determine, just by
working with them, whether or not they do implicitly define a curve. Nevertheless, the theoretical question remains: under what conditions can the set defined by a collection of equations be parametrized as a curve? We have already answered this question in $R^2$ in Theorem 2.14. We shall restate the conclusion as a fact about curves.

Proposition 1. Suppose that $F$ is a differentiable real-valued function defined in a neighborhood of $(a_0, b_0)$ and $F(a_0, b_0) = 0$ but $dF(a_0, b_0) \ne 0$. Then the set

$$\{(x, y) \in N : F(x, y) = 0\} \tag{4.10}$$

is a curve in some neighborhood $N$ of $(a_0, b_0)$.

Proof. Since $dF(a_0, b_0) \ne 0$, then either $(\partial F/\partial x)(a_0, b_0) \ne 0$ or $(\partial F/\partial y)(a_0, b_0) \ne 0$. Suppose the latter. Then, according to Theorem 2.16, there is an $\varepsilon > 0$ and a differentiable function $g$ defined on the interval $(a_0 - \varepsilon, a_0 + \varepsilon)$ such that $F(x, y) = 0$ if and only if $y = g(x)$. In particular, $g(a_0) = b_0$. Let $f: (a_0 - \varepsilon, a_0 + \varepsilon) \to R^2$ be defined by $f(t) = (t, g(t))$. Then $f$ parametrizes the set (4.10) near $(a_0, b_0)$, and clearly $f'(t) = (1, g'(t)) \ne 0$. If instead $(\partial F/\partial x)(a_0, b_0) \ne 0$ we can give the same argument merely by changing the roles of $x$ and $y$.

In higher dimensions the situation is a little more complicated. We shall describe it in $R^3$. If $F$, $G$ are two differentiable functions defined in a neighborhood of a point $\mathbf{p}_0$, and $\nabla F(\mathbf{p}_0)$, $\nabla G(\mathbf{p}_0)$ are independent, then the set

$$\{\mathbf{p} : F(\mathbf{p}) - F(\mathbf{p}_0) = 0, \; G(\mathbf{p}) - G(\mathbf{p}_0) = 0\} \tag{4.11}$$

is a curve through $\mathbf{p}_0$. The verification of this fact is basically another use of the fixed point theorem, complicated by some more linear algebra. We first assume that $F(\mathbf{p}_0) = 0 = G(\mathbf{p}_0)$. Since the vectors $\nabla F(\mathbf{p}_0)$, $\nabla G(\mathbf{p}_0)$ are independent, we can change coordinates in $R^3$ so that $\nabla F(\mathbf{p}_0) = \mathbf{E}_2$ and $\nabla G(\mathbf{p}_0) = \mathbf{E}_3$. That is, with respect to the new coordinates $(x, y, z)$, $\partial F/\partial x(\mathbf{p}_0) = 0$, $\partial F/\partial y(\mathbf{p}_0) = 1$, $\partial F/\partial z(\mathbf{p}_0) = 0$ and $\partial G/\partial x(\mathbf{p}_0) = 0$, $\partial G/\partial y(\mathbf{p}_0) = 0$, $\partial G/\partial z(\mathbf{p}_0) = 1$.
Now let $\mathbf{p}_0 = (x_0, y_0, z_0)$; for $x$ near $x_0$ we want to show that there are uniquely determined $y$, $z$ such that

$$F(x, y, z) = 0, \qquad G(x, y, z) = 0.$$

Following Newton's method, we ask to find the fixed point in the $y$, $z$ plane
of the transformation

$$T(y, z) = \left(y - \frac{\partial F}{\partial y}(\mathbf{p}_0)^{-1}F(x, y, z), \; z - \frac{\partial G}{\partial z}(\mathbf{p}_0)^{-1}G(x, y, z)\right).$$

Our conditions $\nabla F(\mathbf{p}_0) = \mathbf{E}_2$, $\nabla G(\mathbf{p}_0) = \mathbf{E}_3$ will guarantee that in some neighborhood of $\mathbf{p}_0$, $T$ is a contraction. Thus there are unique $y = g(x)$, $z = h(x)$ such that

$$T(g(x), h(x)) = (g(x), h(x))$$

or

$$F(x, g(x), h(x)) = 0 = G(x, g(x), h(x)).$$

Thus the function $\mathbf{f}(t) = (t, g(t), h(t))$ parametrizes the set (4.11) as a curve.

Examples

12. At what points in the plane is the set $e^{x+y} = y$ a curve? Let $F(x, y) = e^{x+y} - y$. Then $\nabla F(x, y) = (e^{x+y}, e^{x+y} - 1)$. Since $\partial F/\partial x$ is never zero, this is everywhere a curve and the equation $e^{x+y} - y = 0$ determines $x$ as a function of $y$ implicitly. $\partial F/\partial y$ is zero when $x + y = 0$. The only point on the curve where $e^{x+y} = y$ and $x + y = 0$ is $(-1, 1)$, so at that point we cannot expect to find $y$ as a function of $x$.

Notice that, even without explicitly determining the function $x = f(y)$ given implicitly by $e^{x+y} = y$, we can find its derivative. For

$$\exp(f(y) + y) - y = 0$$

so upon differentiating we have

$$\exp(f(y) + y)(f'(y) + 1) - 1 = 0$$

or

$$f'(y) = \exp[-(f(y) + y)] - 1.$$

13.

$$F(x, y) = x\sin xy - \cos y \tag{4.12}$$
$$\nabla F(x, y) = (\sin xy + xy\cos xy, \; x^2\cos xy + \sin y)$$

If $x > 1$, $\partial F/\partial y(x, y) \ne 0$, so (4.12) defines $y$ implicitly as a function of $x$. Differentiating (4.12) with respect to $x$ we find

$$\sin xy + x\cos(xy)(y + xy') + y'\sin y = 0$$

or

$$y' = -\frac{\sin xy + xy\cos xy}{\sin y + x^2\cos xy}.$$

14. $F(x, y, z) = x^3 y + y^2$, $G(x, y, z) = xyz + e^z$.

$$\nabla F = (3x^2 y, \; x^3 + 2y, \; 0), \qquad \nabla G = (yz, \; xz, \; xy + e^z)$$

$\nabla F$ and $\nabla G$ are dependent when

$$\frac{3x^2 y}{yz} = \frac{x^3 + 2y}{xz} = \frac{0}{xy + e^z}.$$

These equations become

$$xy + e^z = 0 \quad \text{and} \quad y = x^3$$

or

$$3x^2 y = 0 = x^3 + 2y.$$

The first pair has no solutions, and the second pair amounts to $x = 0$ and $y = 0$. But the set $F(x, y, z) = 0$, $G(x, y, z) = 0$ never intersects this line, so everywhere on that set $\nabla F$ and $\nabla G$ are independent. Thus

$$\{(x, y, z) : F(x, y, z) = G(x, y, z) = 0\}$$

is a curve in $R^3$.

Comparison of Parametrizations

Now, we have seen that a given curve admits many parametrizations, and it would be to our advantage to be able to single out a best possible one. In the study of the motion of particles there is a distinguished parameter, that of time. But as far as the geometric study is concerned we can take any parametrization we care to, the only criterion being that of convenience.
Geometrically, a most convenient parameter, or measure, along the curve is that of length as measured from a fixed point. Before considering the particular parametrization by arc length, let us first see how to compare two different parametrizations.

Suppose $\Gamma$ is a curve, parametrized by

$$x = f(t), \qquad a \le t \le b.$$

If $\sigma$ is a continuously differentiable function with nonzero derivative defined on the interval $[\alpha, \beta]$ and taking values on the interval $[a, b]$, then the composed function $f \circ \sigma$ also parametrizes $\Gamma$. That is, we can write $\Gamma$ as the image of

$$x = g(\tau) = f(\sigma(\tau)), \qquad \alpha \le \tau \le \beta.$$

If $\tau$ increases as $t$ does, then these two parametrizations determine the same sense of direction along the curve $\Gamma$. This sense of direction is called orientation. We know from calculus that the necessary and sufficient condition for $t$, $\tau$ to increase simultaneously along the curve is that $\sigma' > 0$ on the interval $[\alpha, \beta]$. We shall say that $\tau$ is an orientation-preserving parameter if this condition is satisfied, and otherwise $\tau$ is orientation reversing.

On the other hand, if we started out with two different parametrizations of a curve $\Gamma$:

$$x = f(t) \quad \text{or} \quad x = g(\tau) \tag{4.13}$$

then there must exist a function $\sigma$ relating the two parameters. For each point of $\Gamma$ corresponds to precisely one value of $t$ and precisely one value of $\tau$. The correspondence

$$\tau \to g(\tau) = f(t) \to t$$

defines the function $\sigma$. We shall verify below that $\sigma$ is a differentiable function of $\tau$ and we have $g(\tau) = f(\sigma(\tau))$. Notice that, given the two parametrizations, so that $t = \sigma(\tau)$, we have by the chain rule

$$g'(\tau) = f'(\sigma(\tau)) \cdot \sigma'(\tau). \tag{4.14}$$

Thus the vectors $g'(\tau)$ and $f'(t)$ are collinear when $t$, $\tau$ are the same points, and point in the same direction when $\sigma' > 0$, that is, when $g$, $f$ induce the same orientation along $\Gamma$.
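The chain rule (4.14) can be checked numerically on a concrete case. The sketch below is only an illustration under assumed choices that do not come from the text: the arc $f(t) = (\cos t, \sin t)$, the change of parameter $\sigma(\tau) = \tau^2$ (so $\sigma' > 0$ for $\tau > 0$), the sample point $\tau = 0.8$, and a central difference with step $h = 10^{-6}$ standing in for $g'(\tau)$.

```python
import math


def f(t):
    return (math.cos(t), math.sin(t))


def f_prime(t):
    return (-math.sin(t), math.cos(t))


def sigma(tau):
    # Assumed change of parameter; its derivative 2*tau is positive for tau > 0.
    return tau * tau


def sigma_prime(tau):
    return 2.0 * tau


def g(tau):
    # The reparametrized curve g = f o sigma
    return f(sigma(tau))


def central_difference(func, s, h=1e-6):
    # Numerical approximation to the derivative of a vector-valued function
    ahead, behind = func(s + h), func(s - h)
    return tuple((p - q) / (2.0 * h) for p, q in zip(ahead, behind))


tau = 0.8
numeric = central_difference(g, tau)
chain = tuple(c * sigma_prime(tau) for c in f_prime(sigma(tau)))

# Collinearity and common direction of g'(tau) and f'(sigma(tau))
fp = f_prime(sigma(tau))
cross = numeric[0] * fp[1] - numeric[1] * fp[0]
dot = numeric[0] * fp[0] + numeric[1] * fp[1]
print(numeric, chain, cross, dot)
```

The cross product is zero up to rounding and the dot product is positive, which matches the statement that $g'(\tau)$ and $f'(t)$ are collinear and point the same way when $\sigma' > 0$.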
Definition 2. Let $\Gamma$ be a curve parametrized by $x = f(t)$, $a \le t \le b$. The unit tangent vector to $\Gamma$ at $f(t)$ is the vector

$$T = \frac{f'(t)}{\|f'(t)\|}.$$

By the above remarks we see that the unit tangent vector is the same no matter what parametrization we choose so long as it induces the same orientation. For if we have the two parametrizations (4.13), then by (4.14) (since $\sigma' > 0$)

$$\frac{g'(\tau)}{\|g'(\tau)\|} = \frac{f'(\sigma(\tau))\sigma'(\tau)}{\|f'(\sigma(\tau)) \cdot \sigma'(\tau)\|} = \frac{f'(t)}{\|f'(t)\|}$$

when $t$, $\tau$ determine the same point of $\Gamma$.

Examples

15. Consider the unit circle, given parametrically by

$$z = e^{it}.$$

Then $z' = ie^{it}$, which is a unit vector, so $T = ie^{it}$. Notice: we have $T = iz$, so that the tangent vector is orthogonal to the position vector. More generally, consider the spiral $z = e^{at}$, where $a$ is a complex number. Then $z' = ae^{at}$, so the tangent vector is $\exp i((\operatorname{Im} a)t + \arg a)$. Notice that the angle between the tangent vector and the position vector is

$$\arg T - \arg z = \arg a.$$

Thus $T$, $z$ always make the same angle.

16. For the curve in space given by

$$\mathbf{x} = (t, t^2, t^3)$$

we have

$$\frac{d\mathbf{x}}{dt} = (1, 2t, 3t^2)$$
so

$$\mathbf{T}(t) = (1 + 4t^2 + 9t^4)^{-1/2}(1, 2t, 3t^2).$$

Now, here is the verification of the fact that two parametrizations are related by a continuously differentiable function.

Proposition 2. Let $\Gamma$ be a curve, and $f: [a, b] \to \Gamma$, $g: [\alpha, \beta] \to \Gamma$ two parametrizations of $\Gamma$. Then there is a continuously differentiable function $\sigma$ mapping $[\alpha, \beta]$ one-to-one onto $[a, b]$ such that $g(\tau) = f(\sigma(\tau))$ for all $\tau \in [\alpha, \beta]$, and $f(t) = g(\sigma^{-1}(t))$ for all $t \in [a, b]$.

Proof. Let $\tau \in [\alpha, \beta]$. Since $f$ maps $[a, b]$ one-to-one onto $\Gamma$ there is precisely one $t \in [a, b]$ such that $f(t) = g(\tau)$. Define $\sigma(\tau) = t$. Then $\sigma$ is a well-defined function from $[\alpha, \beta]$ to $[a, b]$.

$\sigma$ is one-to-one. Suppose $\sigma(\tau_1) = \sigma(\tau_2)$. Then

$$g(\tau_1) = f(\sigma(\tau_1)) = f(\sigma(\tau_2)) = g(\tau_2).$$

Since $g$ is one-to-one we must have $\tau_1 = \tau_2$.

$\sigma$ maps $[\alpha, \beta]$ onto $[a, b]$. For if $t \in [a, b]$ there is a point $\tau \in [\alpha, \beta]$ such that $f(t) = g(\tau)$. Clearly, then $t = \sigma(\tau)$.

We now have only to verify that $\sigma$ is a continuously differentiable function. Let $\tau_0 \in [\alpha, \beta]$ and $t_0 = \sigma(\tau_0)$. Now $f$ is a differentiable function at $t_0$ and $f'(t_0) \ne 0$. Let $f = (f_1, \ldots, f_n)$ in coordinates. There is a $j$ such that $f_j'(t_0) \ne 0$. Then $f_j$ is a real-valued continuously differentiable function of a real variable and since $f_j'(t_0) \ne 0$, it is invertible. That is, there is a function $h$ defined on the range of $f_j$ near $t_0$ such that $h(f_j(t)) = t$ for $t$ near $t_0$. $h$ is also continuously differentiable. Now, since $f(\sigma(\tau)) = g(\tau)$, we have $f_j(\sigma(\tau)) = g_j(\tau)$, so

$$\sigma(\tau) = h(f_j(\sigma(\tau))) = (h \circ g_j)(\tau).$$

Since $h$ and $g_j$ are continuously differentiable so is $\sigma$. The proposition is proven.

Without the requirement that the derivative of the parametrization is nonzero we would in general not have such a good relationship between different parametrizations.

Notice that by the same argument the inverse mapping $\sigma^{-1}$ to $\sigma$ is also continuously differentiable. Since $\sigma^{-1} \circ \sigma(t) = t$ for all $t \in (\alpha, \beta)$, we must have, by the chain rule,

$$(\sigma^{-1})'(\sigma(t))\,\sigma'(t) = 1$$
so $\sigma'(t)$ is also never zero. If it is always positive, $\sigma$ is an increasing function of $t$; if always negative, $\sigma$ is a decreasing function of $t$.

Notice that if $f$, $g$ are two parametrizations of a curve and they do reverse orientation, then they will become compatible simply by negating one of the parameters. Thus if $f$ is not compatible with $g$, then $\tilde{f}: [-b, -a] \to \Gamma$ defined by $\tilde{f}(t) = f(-t)$ certainly is. If $f: [a, b] \to \Gamma$ is a parametrization of a curve we shall call $f(a)$ the left end point of $\Gamma$ and $f(b)$ the right end point.

The Tangent Line

Now, let $\Gamma$ be a curve in $R^n$, and $\mathbf{x}_0$ a point on $\Gamma$. The tangent line to $\Gamma$ at $\mathbf{x}_0$ is the straight line through $\mathbf{x}_0$ which best approximates the curve. We shall show that this is the line through the tangent vector and is given by this equation

$$\mathbf{x} = \mathbf{x}_0 + t\,\mathbf{T}(\mathbf{x}_0), \qquad t \in R.$$

The tangent line at $\mathbf{x}_0$ can be computed as the limiting position of lines through $\mathbf{x}_0$ and nearby points $\mathbf{x}_1$ on $\Gamma$, as $\mathbf{x}_1 \to \mathbf{x}_0$ (Figure 4.11).

Figure 4.11

Let $L(\mathbf{x}_1)$ be that line. Then $L(\mathbf{x}_1)$ is the set of all vectors originating at $\mathbf{x}_0$ and parallel to $\mathbf{x}_1 - \mathbf{x}_0$. Let $f$ give a parametrization of $\Gamma$ so that $\mathbf{x}_0 = f(t_0)$, $\mathbf{x}_1 = f(t_1)$. Now $L(\mathbf{x}_1)$ is the set of points $\mathbf{x}$ such that $\mathbf{x} - \mathbf{x}_0$ is parallel to

$$\mathbf{x}_1 - \mathbf{x}_0 = f(t_1) - f(t_0).$$

But that is the same as the set of points $\mathbf{x}$ such that $\mathbf{x} - \mathbf{x}_0$ is parallel to

$$\frac{f(t_1) - f(t_0)}{t_1 - t_0}.$$
Now $\mathbf{x}_1 \to \mathbf{x}_0$ is the same as $t_1 \to t_0$ and the limit of the difference quotient as $t_1 \to t_0$ is $f'(t_0)$. Thus $L(\mathbf{x}_1)$ tends to the line through $\mathbf{x}_0$ and parallel to $f'(t_0)$, as desired.

Examples

17. Consider the helix (Figure 4.9), given by the parametrization

$$\mathbf{f}(t) = (a\cos t, a\sin t, bt).$$

Then

$$\mathbf{f}'(t) = (-a\sin t, a\cos t, b).$$

$\mathbf{f}$ is a positive parametrization if we take for the unit tangent

$$\mathbf{T}(t) = (a^2 + b^2)^{-1/2}(-a\sin t, a\cos t, b) \tag{4.15}$$

(see Figure 4.12).

Figure 4.12

18. A damped helix (Figure 4.13) parametrized by

$$\mathbf{f}(t) = (e^t\cos t, e^t\sin t, bt).$$

Thus

$$\mathbf{f}'(t) = (e^t(\cos t - \sin t), \; e^t(\sin t + \cos t), \; b)$$
so we can take as the tangent vector

$$\mathbf{T}(t) = (2e^{2t} + b^2)^{-1/2}(e^t(\cos t - \sin t), \; e^t(\sin t + \cos t), \; b). \tag{4.16}$$

Notice that the curve on the unit sphere swept out by the tangent is the same for both helices (Figure 4.14), and that the functions (4.15) and (4.16) give two different parametrizations of this curve. If we consider the parameter as $t$, then the "moving point" described by (4.15) has no tangential acceleration, whereas in (4.16) it is accelerating exponentially.

Figure 4.13

19. A different helix is this one (Figure 4.15):

$$\mathbf{f}(t) = (\cos t, \sin t, e^t).$$

Here we take as tangent vector

$$\mathbf{T}(t) = (1 + e^{2t})^{-1/2}(-\sin t, \cos t, e^t)$$
(Figure 4.16). This again is a helix on the unit sphere which tends to the equator as $t \to -\infty$ and winds rapidly around the north pole as $t \to +\infty$. (Notice that the third component of $\mathbf{T}(t)$,

$$\frac{e^t}{(1 + e^{2t})^{1/2}},$$

tends to 1 as $t \to +\infty$ and to 0 as $t \to -\infty$.)

Figure 4.14

Figure 4.15
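The behavior described in Example 19 can be checked numerically. The sketch below evaluates $\mathbf{T}(t) = (1 + e^{2t})^{-1/2}(-\sin t, \cos t, e^t)$ directly; the function names and the sample values $t = \pm 20$ are assumptions chosen for illustration, and finite samples only suggest the limits rather than prove them.

```python
import math


def helix_tangent(t):
    # Unit tangent of f(t) = (cos t, sin t, e^t), as in Example 19
    scale = 1.0 / math.sqrt(1.0 + math.exp(2.0 * t))
    return (-math.sin(t) * scale, math.cos(t) * scale, math.exp(t) * scale)


def norm(v):
    return math.sqrt(sum(c * c for c in v))


low = helix_tangent(-20.0)   # far in the past: tangent close to the equator
high = helix_tangent(20.0)   # far in the future: tangent close to the north pole
print(norm(helix_tangent(0.3)), low[2], high[2])
```

The tangent has length one at every sampled point, its height above the equator is nearly 0 at $t = -20$ and nearly 1 at $t = 20$.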
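Figure 4.16

20. The intersection of a sphere and a cylinder (Figure 4.17)

$$x^2 + y^2 + z^2 = 1, \qquad \left(x - \tfrac{1}{2}\right)^2 + y^2 = \tfrac{1}{4}.$$

In order to avoid the cross at $(1, 0, 0)$ we shall restrict attention to the part of the curve lying above the $xy$ plane. Let us first parametrize this curve. We shall use as parameter the angle $\theta$ as shown in the figure. Then

$$x(\theta) = \tfrac{1}{2} + \tfrac{1}{2}\cos\theta, \qquad y(\theta) = \tfrac{1}{2}\sin\theta$$

and $z(\theta)$ is the height of the point on the unit sphere lying above $(x(\theta), y(\theta), 0)$; thus $z(\theta)$ is the positive square root of $1 - (x(\theta))^2 - (y(\theta))^2$, which is

$$\left(\frac{1 - \cos\theta}{2}\right)^{1/2} = \sin\frac{\theta}{2}.$$

Thus, we can parametrize this curve with the function

$$\mathbf{f}(\theta) = \left(\tfrac{1}{2} + \tfrac{1}{2}\cos\theta, \; \tfrac{1}{2}\sin\theta, \; \sin\frac{\theta}{2}\right).$$

Then

$$\mathbf{f}'(\theta) = \frac{1}{2}\left(-\sin\theta, \; \cos\theta, \; \cos\frac{\theta}{2}\right)$$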
and we can take as tangent
$$T(\theta) = \left(\frac{2}{3+\cos\theta}\right)^{1/2}\left(-\sin\theta,\ \cos\theta,\ \cos\tfrac{\theta}{2}\right) \tag{4.17}$$
Notice that this does not parametrize Γ at the point (1, 0, 0), since this point corresponds to both parametric values 0, 2π. In fact, Γ is not a curve at the point (1, 0, 0) since it does not have a unique tangent line: the limiting position of (4.17) as x → (1, 0, 0) is either $(0,1,1)/\sqrt2$ or $(0,1,-1)/\sqrt2$!

• EXERCISES

1. Find a parametrization for the curve of intersection of the ellipsoid x2 + iy2 + z2 = 1 with the cylinder $x^2 + z^2 = 1$.
2. Parametrize the intersection of the paraboloid $z = x^2 + y^2$ with the unit sphere $x^2 + y^2 + z^2 = 1$.
3. At what points is the set defined (in polar coordinates in $R^2$) by $r(1 + a\cos\theta) = 1$ a curve? Find a parametrization of the curve.

[Figure 4.17]
4. Consider the family of cardioids (Figure 4.18)
$$r = (1+c)^{-1}(1 + c\cos\theta)$$
(a) Describe the behavior of this family as c ranges between 0 and $+\infty$.
(b) For c = 1, c = 2, calculate the unit tangent vector to the curve as a function of $\theta$.

[Figure 4.18]

5. What is the tangent vector to the curve $r = a\cos b\theta$? Graph the curve for $b = 1, 2, 5, \sqrt2$.
6. Calculate the tangent lines to the following curves:
(a) $f(t) = (e^{-t}\cos t,\ e^{-t}\sin t)$ at (1, 0).
(b) f(x) = (x, sin-) at (1,0).
(c) ,=(*·8,η;) f (0 = ie', τ^-j-, sin / j at (1,1,0).
(d) $x^2+y^2+z^2 = 4a^2$, $(x-a)^2 + y^2 = a^2$ at (2a, 0, 0).
(e) $f(t) = (t, \cos t, \sin t)$ at (0, 1, 0).
(f) f(/)=(/2,l-/',/) at (1,0,1).
7. Find the tangent line at the origin for these curves, (a) e'+1,~y-l =0 (b) (c) (d) cos xy = у + 1 χ' + y3z + sin ζ = 0 exp(sin (x^ + z)) = 1 е"г - cos xy = 0 x2 + У2 + z2 = χ + у + ζ

• PROBLEMS

1. A snail deposits calcium at the leading edge of its shell in a direction which makes a fixed angle with the ray from the snail's center to the leading edge. Show that this hypothesis explains the spiral form of a snail's shell.
2. Graph the curve $r = (1+\theta^2)^{-1}(2+\theta^2)$ and compute its tangent vector.
3. Graph the curve in $R^3$ given in cylindrical coordinates by $r = e^t$, $\theta = t$, $z = e^t$. Graph the curve on the unit sphere made by the tangent vector of the given curve.

4.2 Arc Length

Definition 3. Let Γ be an oriented curve positively parametrized by $f: [a, b] \to \Gamma$. Let $a \le a_0 < b_0 \le b$. Define the length of Γ between $f(a_0)$ and $f(b_0)$ to be the least upper bound of all sums
$$\sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\| \tag{4.18}$$
over all choices of points $t_0, \dots, t_k$ such that
$$a_0 = t_0 < t_1 < \cdots < t_k = b_0$$

[Figure 4.19]

This definition has the following description. Approximate (Figure 4.19) the curve by a "broken line" joining a succession of points along Γ between $f(a_0)$ and $f(b_0)$. Then the sum of the lengths of the line segments is at most the length of the curve, and approximates it. Now, if the points $t_i$ and $t_{i-1}$ are very close, then the vector $f(t_i) - f(t_{i-1})$ is approximately equal to $f'(t_i)(t_i - t_{i-1})$. If we replace this in (4.18) we get a sum
$$\sum_{i=1}^{k} \|f'(t_i)\|\,(t_i - t_{i-1}) \tag{4.19}$$
which is a Riemann sum approximating the integral
$$\int_{a_0}^{b_0} \|f'(t)\|\,dt \tag{4.20}$$
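The relation between the broken-line sums (4.18) and the integral (4.20) can be illustrated numerically. The sketch below is not from the original text: it assumes the helix $f(t) = (a\cos t, a\sin t, bt)$ with the illustrative values a = 1, b = 0.5, for which $\|f'(t)\| = (a^2+b^2)^{1/2}$ is constant, so the integral over $[0, 2\pi]$ is simply $2\pi(a^2+b^2)^{1/2}$. The function names are hypothetical, and `math.dist` requires Python 3.8 or later.

```python
import math

def polygon_length(f, a0, b0, k):
    # Sum (4.18) for the uniform partition a0 = t_0 < ... < t_k = b0
    ts = [a0 + (b0 - a0) * i / k for i in range(k + 1)]
    points = [f(t) for t in ts]
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))

def helix(t, a=1.0, b=0.5):
    # Illustrative choice of constants; any a > 0 and real b would do
    return (a * math.cos(t), a * math.sin(t), b * t)

# Value of the integral (4.20) for this helix on [0, 2*pi]
exact = 2 * math.pi * math.sqrt(1.0 ** 2 + 0.5 ** 2)

for k in (10, 100, 1000):
    # Each broken line is shorter than the curve; refinement lengthens it
    print(k, polygon_length(helix, 0.0, 2 * math.pi, k), exact)
```

Refining the partition can only increase the sum (by the triangle inequality), which is why the definition takes a least upper bound rather than a limit.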
Of course, the substitutions taking us from (4.18) to (4.19) admit a small error term by term, but since k may be very large, we have no hold on the total error between (4.18) and (4.19). Nevertheless, we can, by being very careful, justify that substitution and deduce that the limit of the lengths of the approximating broken lines is the integral (4.20).

Proposition 3. Let Γ be a curve parametrized by $f: [a, b] \to \Gamma$. The length of Γ between $f(a_0)$ and $f(b_0)$ is given by the integral (4.20).

Proof. We will use the fundamental theorem of calculus to show this. Let $s(t)$ be the length of Γ between $f(a_0)$ and $f(t)$. We shall show that s is a differentiable function of t, and $s'(t) = \|f'(t)\|$. Fix $t_0 > a_0$ and consider a $t > t_0$. If $S_0$ is any sum like (4.18) approximating the length of Γ between $f(a_0)$ and $f(t_0)$, then $S_0 + \|f(t) - f(t_0)\|$ is a sum like (4.18) for the length between $f(a_0)$ and $f(t)$. Thus
$$S_0 + \|f(t) - f(t_0)\| \le s(t)$$
Taking the least upper bound over all such $S_0$, we obtain the inequality
$$s(t_0) + \|f(t) - f(t_0)\| \le s(t) \tag{4.21}$$
Now, suppose S is a sum like (4.18) corresponding to a partition of the interval $[a_0, t]$. We may suppose that $t_0$ is one of the points in this partition. For if not, we can add it to the given partition, and get a still larger sum. Let $t_0 < t_1 < \cdots < t_k = t$ be the points of the partition between $t_0$ and t. Then
$$S = S_0 + \sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\|$$
where $S_0$ is a sum corresponding to the interval $[a_0, t_0]$. Thus
$$S \le s(t_0) + \sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\| = s(t_0) + \sum_{i=1}^{k} \left\| \int_{t_{i-1}}^{t_i} f'(\tau)\,d\tau \right\| \le s(t_0) + \sum_{i=1}^{k} \int_{t_{i-1}}^{t_i} \|f'(\tau)\|\,d\tau = s(t_0) + \int_{t_0}^{t} \|f'(\tau)\|\,d\tau$$
Since this is true for all such sums S, we have
$$s(t) \le s(t_0) + \int_{t_0}^{t} \|f'(\tau)\|\,d\tau \tag{4.22}$$
From inequalities (4.21) and (4.22), we obtain
$$\frac{\|f(t) - f(t_0)\|}{t - t_0} \le \frac{s(t) - s(t_0)}{t - t_0} \le \frac{1}{t - t_0}\int_{t_0}^{t} \|f'(\tau)\|\,d\tau \tag{4.23}$$
As $t \to t_0$, both the left and right ends converge, since f is differentiable, to $\|f'(t_0)\|$. Thus s is differentiable at $t_0$, and $s'(t_0) = \|f'(t_0)\|$. Since this is valid for $t_0$ between $a_0$ and $b_0$, we have the desired conclusion.

Now, if Γ is a curve parametrized by $f: [a, b] \to \Gamma$, we can consider arc length as a function along Γ. Precisely, let $s(t)$ be the length of the piece of Γ from $f(a)$ to $f(t)$. Then, from the above proposition,
$$s(t) = \int_a^t \|f'(\tau)\|\,d\tau$$
Since $s'(t) = \|f'(t)\| > 0$, we can parametrize Γ by arc length, and it induces the same orientation as the original parametrization. Thus $g(s)$, for every s, is the point on Γ of distance s from $f(a)$: $g(s(t)) = f(t)$. If L is the length of Γ from $f(a)$ to $f(b)$, then $g: [0, L] \to \Gamma$ parametrizes Γ. Notice that
$$f'(t) = g'(s(t)) \cdot s'(t) = g'(s(t)) \cdot \|f'(t)\|$$
so that
$$g'(s(t)) = \frac{f'(t)}{\|f'(t)\|}$$
Thus $g'(s)$ is the unit tangent to Γ at $g(s)$.

Examples

21. The circle $x^2 + y^2 = a^2$. Parametrize this circle by
$$x = a\cos\theta, \qquad y = a\sin\theta$$
Thus
$$f(\theta) = (a\cos\theta,\ a\sin\theta), \qquad f'(\theta) = a(-\sin\theta,\ \cos\theta), \qquad \|f'(\theta)\| = a$$
Thus arc length is given by
$$s = s(\theta) = \int_0^\theta a\,d\varphi = a\theta$$
The parametrization according to arc length is thus given by substituting $s = a\theta$:
$$x = g(s) = \left(a\cos\frac{s}{a},\ a\sin\frac{s}{a}\right)$$
The unit tangent vector is given by
$$T(s) = \left(-\sin\frac{s}{a},\ \cos\frac{s}{a}\right)$$

22. Consider the helix of Example 17 given by
$$f(t) = (a\cos t,\ a\sin t,\ bt)$$
Then
$$f'(t) = (-a\sin t,\ a\cos t,\ b), \qquad \|f'(t)\| = (a^2+b^2)^{1/2}$$
Thus $s = s(t) = (a^2+b^2)^{1/2}\,t$, and the arc length parametrization is
$$g(s) = \left(a\cos\frac{s}{(a^2+b^2)^{1/2}},\ a\sin\frac{s}{(a^2+b^2)^{1/2}},\ \frac{b\,s}{(a^2+b^2)^{1/2}}\right)$$
The tangent vector is
$$T(s) = (a^2+b^2)^{-1/2}(-a\sin t,\ a\cos t,\ b), \qquad t = \frac{s}{(a^2+b^2)^{1/2}}$$

23. The curve of Example 20 has the parametrization
$$f(\theta) = \left(\tfrac12 + \tfrac12\cos\theta,\ \tfrac12\sin\theta,\ \sin\tfrac{\theta}{2}\right)$$
and we find
$$\|f'(\theta)\| = \frac{1}{2\sqrt2}(3+\cos\theta)^{1/2}$$
so
$$s(\theta) = \frac{1}{2\sqrt2}\int_0^\theta (3+\cos\varphi)^{1/2}\,d\varphi$$
and the unit tangent vector is given as
$$T(\theta) = \left(\frac{2}{3+\cos\theta}\right)^{1/2}\left(-\sin\theta,\ \cos\theta,\ \cos\tfrac{\theta}{2}\right)$$

Equations of Motion

Now we shall consider in greater detail the equations of a particle in motion. Suppose a particle moves through $R^n$ along the path given by $x = x(t)$. The velocity at time t is $x'(t)$, and the acceleration is $x''(t)$. These are vector-valued functions describing the instantaneous change in the motion (direction and magnitude) of the particle. The speed of the particle is the rate at which the distance covered changes, and thus is the time derivative, $ds/dt$, of arc length. As we have seen above, this is the magnitude of the velocity. Thus
$$\text{velocity} = \frac{dx}{dt}, \qquad \text{speed} = \frac{ds}{dt} = \left\|\frac{dx}{dt}\right\| \tag{4.24}$$
Now, it is instructive to decompose the acceleration vector into a component tangent to the curve, and a component orthogonal to the curve. We write
$$\text{acceleration} = \frac{d^2x}{dt^2} = a_T T + a_N N$$
where T is the tangent vector and N is a unit vector orthogonal to T and lying in the plane spanned by the velocity and acceleration vectors. N is called the principal normal to the curve of motion, $a_T$ is the tangential acceleration of the particle, and $a_N$ is the normal acceleration. We now show how to
compute these components of the acceleration. Differentiate the equation
$$\frac{dx}{dt} = \frac{ds}{dt}\,T$$
obtaining
$$\frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \frac{ds}{dt}\frac{dT}{dt} \tag{4.25}$$
Now $dT/dt$ is orthogonal to T, since T is a unit vector. Differentiate
$$\langle T, T\rangle = 1$$
We then have
$$\langle T', T\rangle + \langle T, T'\rangle = 2\langle T, T'\rangle = 0 \tag{4.26}$$
Thus we can take for the normal vector the unit vector in the direction $dT/dt$:
$$N = \frac{dT/dt}{\|dT/dt\|} = \frac{dT/ds}{\|dT/ds\|} \tag{4.27}$$
(Of course, the differentiation in (4.25) could have been with respect to arc length as well as time.) Let $\kappa = \|dT/ds\|$. This is called the curvature of the path of motion. Then $dT/ds = \kappa N$, and (4.25) becomes
$$\frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \frac{ds}{dt}\,\frac{dT}{ds}\,\frac{ds}{dt}$$
$$\text{acceleration} = \frac{d^2x}{dt^2} = \frac{d^2s}{dt^2}\,T + \kappa\left(\frac{ds}{dt}\right)^2 N$$
Thus the tangential acceleration is the rate of change of the speed, and the normal acceleration is proportional to the curvature, or bending, of the curve.
$$a_T = \frac{d^2s}{dt^2}, \qquad a_N = \left(\frac{ds}{dt}\right)^2\kappa \tag{4.28}$$
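The decomposition (4.28) can be computed directly from the velocity and acceleration vectors. The following Python sketch is an illustration rather than the book's method: it uses the consequences $a_T = \langle x'', T\rangle$, $a_N = \|x'' - a_T T\|$, and $\kappa = a_N/(ds/dt)^2$ of the formulas above, and it assumes the velocity is nonzero. The function name `acceleration_components` and the sample motion (uniform rotation on a circle of radius R with angular speed w) are assumptions made for the example.

```python
import math

def acceleration_components(velocity, acceleration):
    """Return (a_T, a_N, kappa) for given velocity x' and acceleration x''."""
    speed = math.sqrt(sum(c * c for c in velocity))            # ds/dt = ||dx/dt||
    tangent = [c / speed for c in velocity]                     # unit tangent T
    a_t = sum(a * t for a, t in zip(acceleration, tangent))     # <x'', T>
    normal_part = [a - a_t * t for a, t in zip(acceleration, tangent)]
    a_n = math.sqrt(sum(c * c for c in normal_part))            # ||x'' - a_T T||
    kappa = a_n / speed ** 2                                    # from a_N = (ds/dt)^2 kappa
    return a_t, a_n, kappa

# Illustrative motion: x(t) = (R cos wt, R sin wt), evaluated at one instant
R, w, t = 2.0, 3.0, 0.4
v = (-R * w * math.sin(w * t), R * w * math.cos(w * t))
a = (-R * w * w * math.cos(w * t), -R * w * w * math.sin(w * t))

a_t, a_n, kappa = acceleration_components(v, a)
print(a_t, a_n, kappa)  # a_T close to 0, a_N close to R*w**2, kappa close to 1/R
```

For uniform circular motion the speed is constant, so the tangential component vanishes and the whole acceleration is normal, in agreement with (4.28).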
Examples

24. Suppose a particle moves along the parabola $y = 1 - x^2$ according to these equations:
$$x = t - 1, \qquad y = 2t - t^2$$
Then
$$\mathbf{x} = (t-1,\ 2t-t^2), \qquad \frac{d\mathbf{x}}{dt} = (1,\ 2(1-t)), \qquad \frac{d^2\mathbf{x}}{dt^2} = (0,\ -2)$$
Thus the motion of the particle is determined by a downward vertical acceleration of constant magnitude (perhaps due to gravity) (see Figure 4.20). The speed of the particle is
$$\left\|\frac{d\mathbf{x}}{dt}\right\| = (1 + 4(1-t)^2)^{1/2} \tag{4.29}$$
Thus we see that the speed is decreasing until time t = 1 (the maximum height of the trajectory), and then increases.

[Figure 4.20]

The tangent vector to the path of motion is
$$T = (1 + 4(1-t)^2)^{-1/2}\,(1,\ 2(1-t))$$
and so
$$\frac{dT}{dt} = \frac{2}{[1+4(1-t)^2]^{3/2}}\,(2(1-t),\ -1)$$
The normal vector is the unit vector in this direction:
$$N = (1+4(1-t)^2)^{-1/2}\,(2(1-t),\ -1) \tag{4.30}$$
Now
$$\frac{dT}{ds} = \frac{dT/dt}{ds/dt} = \frac{2}{(1+4(1-t)^2)^{2}}\,(2(1-t),\ -1) = \frac{2}{[1+4(1-t)^2]^{3/2}}\,N$$
Thus the curvature of the path of motion is
$$\kappa = \frac{2}{(1+4(1-t)^2)^{3/2}}$$
And finally
$$a_T = \frac{d^2s}{dt^2} = \frac{-4(1-t)}{(1+4(1-t)^2)^{1/2}}, \qquad a_N = \left(\frac{ds}{dt}\right)^2\kappa = \frac{2}{(1+4(1-t)^2)^{1/2}}$$
The length of the trajectory from x = -1 to x = +1 is
$$\int_0^2 \frac{ds}{dt}\,dt = \int_0^2 [1+4(1-t)^2]^{1/2}\,dt$$

25. (Rotation) (Figure 4.21). Suppose now that a particle rotates around the unit circle according to the equations
$$x = \cos(e^t), \qquad y = \sin(e^t)$$
Then
$$\mathbf{x} = (\cos(e^t),\ \sin(e^t))$$
$$\frac{d\mathbf{x}}{dt} = e^t\,(-\sin(e^t),\ \cos(e^t)) \tag{4.31}$$
$$\frac{d^2\mathbf{x}}{dt^2} = e^t\,(-\sin(e^t),\ \cos(e^t)) - e^{2t}\,(\cos(e^t),\ \sin(e^t)) \tag{4.32}$$
[Figure 4.21]

Now we already know, just from geometric considerations, what the tangent and normal to the path of motion are:
$$T = (-\sin(e^t),\ \cos(e^t)), \qquad N = -(\cos(e^t),\ \sin(e^t))$$
Thus (4.32) can be written as
$$\frac{d^2\mathbf{x}}{dt^2} = e^t\,T + e^{2t}\,N$$
Thus the normal acceleration is the square of the tangential acceleration. From (4.31) we read
$$\frac{ds}{dt} = \left\|\frac{d\mathbf{x}}{dt}\right\| = e^t$$
thus $s = e^t$ and the curvature of the unit circle is 1. Notice that any motion on the unit circle can be written in the form
$$\mathbf{x} = (\cos(f(t)),\ \sin(f(t)))$$
where $f(t)$ represents arc length as a function of time. Since the curvature of the unit circle is 1, we obtain for any circular motion
$$\text{acceleration} = \frac{d^2s}{dt^2}\,T + \left(\frac{ds}{dt}\right)^2 N$$
The tangential acceleration is the rate of change of speed, and the normal acceleration is the square of the speed.

[Figure 4.22]

26. Now let us consider the motion of an object down a slide (see Figure 4.22). The slide will be represented by the curve Γ. Let $z = z(t) = x(t) + iy(t)$ be the equation of motion of the particle. The acceleration is $z''(t)$; according to Newton's laws
$$mz'' = F$$
where m is the mass of the object, and F is the sum of the forces acting on the object. One such force is the force due to gravity, which is $mg$, where g is the gravitational field. The other force is the restraining force due to the curve. This force acts in a direction normal to the curve, and has undetermined magnitude. (That is, its magnitude is determined only by the object.) Let us call this force $\varphi N$, where $\varphi$ is a scalar and N is the normal to the curve. Thus we have
$$mz'' = mg + \varphi N$$
Now, since we know the path of motion, we need only determine the tangential acceleration $a_T$. By Equation (4.28), we have
$$\frac{d^2s}{dt^2} = a_T = \langle z'', T\rangle = \langle g, T\rangle \tag{4.33}$$
where T is the tangent of the curve. If we consider the curve as parametrized by arc length, so that $z = f(s)$ is the equation of the curve, then the tangent vector is $f'(s)$. Then Equation (4.33) becomes
$$\frac{d^2s}{dt^2} = \langle g, f'(s)\rangle$$
and the speed can be found as the solution to this differential equation with initial conditions $s(0) = s'(0) = 0$.

[Figure 4.23]

For specific examples, let us first consider the curve to be a straight line (Figure 4.23) with equation
$$z(s) = i + s\xi_0$$
where $\xi_0 = a + ib$ is a unit vector in the third quadrant ($b < 0$). Then $T(s) = \xi_0$ is constant, and the force due to gravity is $-ig$. The speed
is thus found as the solution of the differential equation
$$\frac{d^2s}{dt^2} = \langle -ig,\ a + ib\rangle = -gb, \qquad s(0) = s'(0) = 0$$
Thus $s(t) = -(gbt^2)/2$ and the equation of motion is
$$z = z(t) = i - \left(\tfrac12 gbt^2\right)\xi_0$$

27. Suppose now the curve is a semicircle (Figure 4.24)
$$z(s) = \sin s + i\cos s$$
Then
$$T(s) = \cos s - i\sin s$$
and the speed is the solution of the differential equation
$$\frac{d^2s}{dt^2} = \langle -ig,\ \cos s - i\sin s\rangle = g\sin s, \qquad s(0) = s'(0) = 0$$

[Figure 4.24]

Rotating Plates

28. We can describe the motion of a rotating flat circular plate by referring to the angle as a function of time. Let a line through the center of the plate be chosen at time t = 0 and let $\theta(t)$ be the angle this line makes at time t with its original position. Then a point at
$z_0$ at time t = 0 follows the path of motion
$$z = z_0 e^{i\theta(t)}$$
Its velocity is $iz_0\theta' e^{i\theta(t)}$, so its speed is $|z_0|\theta'$. The acceleration of the point is found by differentiating further:
$$z''(t) = iz_0\theta'' e^{i\theta(t)} - z_0(\theta')^2 e^{i\theta(t)} \tag{4.34}$$
Thus the tangential acceleration is $|z_0|\theta''$ and the radial acceleration has magnitude $|z_0|(\theta')^2$. If there is an object of mass m on the plate, a force $mz''(t)$ is required so that the object will follow the motion of the plate. Friction may provide this force. Notice that the central component of this force has magnitude $m|z_0|(\theta')^2$, so even if there is no angular acceleration, friction must do its job. The further the object is from the center, or the faster the plate spins, the greater the force required. It is this principle which explains the centrifuge, which settles precipitates in solution by spinning the fluid.

29. Suppose now we have a curved circular plate spinning at constant angular velocity, and there is a ball of mass m in the plate (Figure 4.25). Assuming there is no friction, we can describe the motion of the ball in terms of the initial data. Let us use cylindrical coordinates $r, \theta, z$ in $R^3$, so that the plate is given by the equation $z = f(r)$. In Figure 4.25 we depict a planar section of the plate. Let
$$r = r(t), \qquad \theta = \theta(t), \qquad z = z(t)$$
be the equations of motion of the ball, and let $\alpha$ be the angular velocity of the plate.

[Figure 4.25]

Since there is no friction, the ball rotates as does the
plate, so that $\theta = \theta(t) = \alpha t$. Since the ball is constrained to lie on the plate we must have $z(t) = f(r(t))$ for all t. Thus we have
$$\mathbf{x}(t) = (r(t)e^{i\alpha t},\ f(r(t)))$$
as the equation of motion, and we must find, using Newton's laws, the function $r(t)$. Now the acceleration is
$$\mathbf{x}'' = ((r'' - \alpha^2 r + 2i\alpha r')e^{i\alpha t},\ f''\,(r')^2 + f'\,r'') \tag{4.35}$$
Letting $\mathbf{g}$ be the gravitational field, $\mathbf{g} = -(0, g)$, we have a force $m\mathbf{g}$ due to gravity. There is another force, that which restrains the motion to the profile of the plate. This acts in a direction normal to the plate and has undetermined magnitude. Let $\varphi N$ denote this force. There is a third force acting on the ball, due to the rotation of the plate, and the direction of this force is tangential to the circle on which the ball lies. We shall denote this force by C. Then, by Newton's laws,
$$\varphi N + C + m\mathbf{g} = m\mathbf{x}'' \tag{4.36}$$
Let us equate coordinates. Now, since N is normal to the surface, it lies in the plane through the z axis and the ball (the rz plane) and is normal to the curve $z = f(r)$. Thus $N = (n_1 e^{i\alpha t},\ n_2)$ and
$$\frac{n_2}{n_1} = -(f'(r))^{-1}$$
(since $n_2/n_1$ is the slope of the line perpendicular to the curve $z = f(r)$). Since C is tangent to the circle on which the ball lies, $C = (ic\,e^{i\alpha t},\ 0)$. The magnitude c of C is yet to be determined. Finally, $\mathbf{g}$ is vertical, so $\mathbf{g} = (0, -g)$. Using (4.35) and substituting these values in (4.36) we have these three equations as a result:
$$\varphi n_1 = m(r'' - \alpha^2 r)$$
$$c = 2\alpha r' m$$
$$-mg + \varphi n_2 = m(f''\,(r')^2 + f'\,r'')$$
Thus, eliminating $\varphi$ from the first and last equations, we find that $r = r(t)$ is a solution of the differential equation
$$(1 + f'(r)^2)\,r'' = \alpha^2 r - f'(r)f''(r)(r')^2 - f'(r)\,g \tag{4.37}$$
For what kind of a plate will it be true that the ball will not move up or down, once released, no matter what its position? We must have $r' = r'' = 0$, so (4.37) becomes
$$\alpha^2 r = f'(r)\,g$$
Solving, we obtain $f(r) = (\alpha^2/2g)\,r^2$. Thus if the plate is a paraboloid of revolution, we can rotate it at a suitable angular velocity so that it will have this property.

30. Suppose we are given a field of force in space, and the initial position and velocity of a particle. Then we can find the path of motion of that particle. For example, suppose the force field is $F(\mathbf{x}) = -\mathbf{x}$, and the initial position and velocity of the particle are $\mathbf{x}_0$, $\mathbf{v}_0$. Then the path of motion is given by the solution of this differential equation:
$$f''(t) = -f(t), \qquad f(0) = \mathbf{x}_0, \qquad f'(0) = \mathbf{v}_0$$
We know the solution; it is
$$f(t) = \cos t \cdot \mathbf{x}_0 + \sin t \cdot \mathbf{v}_0$$
Thus the path of the particle is an ellipse in the plane determined by the vectors $\mathbf{x}_0$, $\mathbf{v}_0$. If $\mathbf{x}_0$, $\mathbf{v}_0$ are orthogonal, the major and minor axes have lengths $|\mathbf{x}_0|$, $|\mathbf{v}_0|$ (see Figure 4.26). The velocity vector is
$$f'(t) = -\sin t \cdot \mathbf{x}_0 + \cos t \cdot \mathbf{v}_0$$
and the speed is the length of this vector.

[Figure 4.26]
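The solution in Example 30 can be checked numerically. The sketch below is an illustration, not part of the original text: the initial data $\mathbf{x}_0 = (2, 0, 0)$, $\mathbf{v}_0 = (0, 1, 0)$ are assumed values chosen to be orthogonal, and the names `path` and `second_difference` are hypothetical. It approximates $f''$ by a central second difference and compares it with $-f$, and it also evaluates the ellipse equation $(x/2)^2 + y^2 = 1$ along the path.

```python
import math

X0 = (2.0, 0.0, 0.0)   # assumed initial position
V0 = (0.0, 1.0, 0.0)   # assumed initial velocity, orthogonal to X0

def path(t):
    # f(t) = cos t * x0 + sin t * v0
    return tuple(math.cos(t) * p + math.sin(t) * q for p, q in zip(X0, V0))

def second_difference(t, h=1e-4):
    # Central difference approximation to f''(t)
    fp, f0, fm = path(t + h), path(t), path(t - h)
    return tuple((a - 2.0 * b + c) / h ** 2 for a, b, c in zip(fp, f0, fm))

t = 0.9
print(second_difference(t))              # approximately -f(t)
print(tuple(-c for c in path(t)))
x, y, _ = path(t)
print((x / 2.0) ** 2 + y ** 2)           # approximately 1: the point lies on the ellipse
```

With these initial data the semi-axes of the ellipse are $|\mathbf{x}_0| = 2$ and $|\mathbf{v}_0| = 1$.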
31. Suppose we have a force field in the plane which is of the same magnitude as the position vector, but orthogonal to it. Using complex variables on the plane, the force field is given by
$$F(z) = iz \quad \text{or} \quad -iz$$
Let us assume it is the former. Suppose a particle has initial position $z_0$ and velocity $v_0$. Then, the motion is found by solving
$$f''(t) = if(t), \qquad f(0) = z_0, \qquad f'(0) = v_0$$
The solution is of the form
$$f(t) = Ae^{\alpha t} + Be^{-\alpha t}$$
where $\alpha = \sqrt{i} = (1+i)/\sqrt2$. We solve for A, B by substituting the initial conditions,
$$f(0) = z_0 = A + B, \qquad f'(0) = v_0 = \alpha(A - B)$$
Thus
$$f(t) = \frac{\alpha z_0 + v_0}{2\alpha}\,e^{\alpha t} + \frac{\alpha z_0 - v_0}{2\alpha}\,e^{-\alpha t}$$
Suppose $z_0 = 1$, $v_0 = 0$. Then
$$f(t) = \tfrac12\left(e^{\alpha t} + e^{-\alpha t}\right)$$
For large positive t, the second term is negligible, and the curve is very close to $z = \tfrac12 e^{\alpha t}$, which we know is an outgoing counterclockwise spiral. For large negative t, the second term $e^{-\alpha t}$ is dominant and that gives an incoming clockwise spiral. Thus the particle comes spiraling in from outer space and then at time t = 0 pauses for a breath and then goes racing back whence it came. (See Figure 4.27.)
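The closed form in Example 31 lends itself to a quick check with complex arithmetic. The following Python sketch is illustrative only; the function name `path` and the sample times are assumptions. It builds the solution from the initial data, confirms that $\alpha^2 = i$, and compares a central second difference of $f$ with $i f$.

```python
import cmath

ALPHA = cmath.exp(1j * cmath.pi / 4)   # (1 + i)/sqrt(2), a square root of i

def path(t, z0=1 + 0j, v0=0j):
    # Coefficients from f(0) = z0 = A + B and f'(0) = v0 = alpha (A - B)
    A = (z0 + v0 / ALPHA) / 2
    B = (z0 - v0 / ALPHA) / 2
    return A * cmath.exp(ALPHA * t) + B * cmath.exp(-ALPHA * t)

def second_difference(t, h=1e-3):
    # Central difference approximation to f''(t)
    return (path(t + h) - 2 * path(t) + path(t - h)) / h ** 2

t = 0.7
print(abs(ALPHA ** 2 - 1j))                     # essentially zero
print(second_difference(t), 1j * path(t))       # the two values nearly agree
print(abs(path(-6.0)), abs(path(0.0)), abs(path(6.0)))  # far out, at 1, far out again
```

The last line reflects the qualitative description in the example: the distance from the origin is large for large negative and large positive t, and equals $|z_0| = 1$ at t = 0.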
[Figure 4.27]

• EXERCISES

8. Find arc length as a function of the parameter for each of the following curves:
(a) $r(1 + a\cos\theta) = 1$
(b) $r = 1 + 2\cos\theta$
(c) The curves in Exercises 6(a)(b)(d)(f), and 7(a).
9. Parametrize these curves according to arc length, and find the curvature and normal:
(a) $x^2 + y^2 = 1$, $x^2 + z^2 = 1$.
(b) The curves in Exercise 8(a)(b).
(c) The curve in Exercise 6(a)(e).
(d) The curve of Example 22.
(e) The curve of Example 23.
10. Find the normal and tangential accelerations for these planar motions:
(a) $z(t) = \exp((1-i)t)$
(b) x(t) = t<, y(t) = t*
(c) $z(t) = (1 + 2\cos t)e^{it}$
(d) $z(t) = t + e^{it}$
11. Find the normal and tangential accelerations of these motions in space:
(a) $x(t) = (t, \sin t, \sin t)$
(b) $x(t) = (e^t, e^{-t}, t^2)$
(c) $x(t) = t(\sin t, \cos t, 1)$
• PROBLEMS

4. The graph of a differentiable function $y = f(x)$ is a curve in the plane. Find the curvature as a function of x.
5. The graph of a differentiable $R^2$-valued function $y = f(x)$, $z = g(x)$ is a curve in space. Find its curvature as a function of x.
6. A skier has to negotiate a series of hills whose profile is the curve $y = e^{-x}\cos x$ (Figure 4.28). There are three forces acting on the skier: that due to gravity, the restraining force of the hills, and a force due to friction which is proportional to his velocity. Find the differential equation describing his motion.
7. I shot an arrow into the sky at an initial velocity of 80 feet/second and at an angle of $\pi/3$ with the horizontal. The gravitational field is vertical downward with a magnitude of 32 feet/second². The air drags the arrow with a force of 0.05 times its velocity. Find the equation of motion, and the curvature of the curve of motion (the arrow weighs one pound).
8. In Example 26, let $\kappa$ be the curvature of the slide. Show that the magnitude of the constraining force due to the slide is $f = (ds/dt)^2\kappa - \langle g, N\rangle$. Find the differential equations which determine $x(t)$, $y(t)$. Write out these equations when the slide is the curve $y = \cos x$.
9. Suppose we have a field of force in space given by $F(x, y, z) = (-y, x, z)$. Find the path of motion of a particle which at time t = 0 is at (1, 1, 1) with velocity (-1, -1, 1).

[Figure 4.28]
[Figure 4.29]

10. Suppose a race track is formed by rotating the curve $(x-1)^2 + z^2 = 1$, $-1 \le z \le 0$ around the z axis. (The surface is, in cylindrical coordinates, $(r-1)^2 + z^2 = 1$, Figure 4.29.) A cyclist cycling around the track tends to ride up the bank as he goes faster. Explain that.
11. Water is at rest in a very large sink when a stopper is removed in the bottom center of the sink. An idealization of the ensuing motion is as follows. The water accelerates toward the hole. The forces acting on each particle of water are due to gravity and the mass of the fluid itself. The field due to the former is $-(0, 0, g)$ and the field due to the latter operates as if the particle were on an inclined plane with vertex at the hole. Find the resultant force field. Find the differential equation giving the rate of rotation around the hole.
12. We must send a ball of unit mass over a hill whose profile is the curve $y = \exp(-x^2)$ from x = -1 to x = 1. What minimum initial speed is required to ensure that the ball maneuvers this hill?
13. Suppose we are given in space a force field which is directed toward the origin and so that its component in the z direction is always 1. Find the path of motion of a particle which is at rest at time t = 0 at the point (1, 1, 1).

4.3 Local Geometry of Curves

We have seen, from the physical problems discussed, that the higher-order derivatives of a function parametrizing a curve have some significance. In this section we will discuss the higher-order invariants of a curve; that is, those concepts which depend only on the geometry and not on the particular parametrization.
Let Γ be a curve in $R^n$. For purposes of simplicity, we shall take Γ to be parametrized by arc length by $\mathbf{x} = \mathbf{x}(s)$. If Γ is twice differentiable, the tangent vector $T(s) = \mathbf{x}'(s)$ is a differentiable function. Since $\langle T(s), T(s)\rangle = 1$ for all s, we obtain through differentiation $2\langle T(s), T'(s)\rangle = 0$. Thus at any point T' is orthogonal to T.

Definition 4. The normal line to Γ at $\mathbf{x}_0 = \mathbf{x}(s_0)$ is the line through $\mathbf{x}_0$ and parallel to the vector $T'(s_0)$. The osculating (or tangent) plane to Γ at $\mathbf{x}_0$ is the plane spanned by the tangent and normal lines.

The name osculating plane is quite descriptive. This plane osculates in the following sense.

Proposition 4. Let $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2$ be three points on the curve Γ. If they are noncollinear, they determine a plane. This plane has the osculating plane as limiting position, as $\mathbf{x}_1, \mathbf{x}_2$ tend to $\mathbf{x}_0$.

Proof. In order to determine the limiting position of the plane through $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2$ it suffices to find two independent vectors which are limits of vectors on the variable plane. The easiest way to do this is to refer to the Taylor expansion of the arc length parametrization. Suppose $f: (a, b) \to \Gamma$ parametrizes Γ with respect to arc length, and $f(0) = \mathbf{x}_0$. For simplicity we may assume $\mathbf{x}_0$ is the origin, 0. According to Theorem 4.1 we can write
$$f(s) = T(0)s + T'(0)s^2 + \varepsilon(s)s^2 \tag{4.38}$$
where $\lim_{s \to 0} \varepsilon(s) = 0$. Let $\mathbf{x}_1 = f(s_1)$, $\mathbf{x}_2 = f(s_2)$. Since $\mathbf{x}_0 = f(0) = 0$, the plane $\pi(s_1, s_2)$ through $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2$ is the plane spanned by the vectors $f(s_1)$, $f(s_2)$. Now, for each $s_1$, $s_1^{-1}f(s_1)$ is on $\pi(s_1, s_2)$. Now
$$s_1^{-1}f(s_1) = T(0) + T'(0)s_1 + \varepsilon(s_1)s_1$$
Letting $s_1 \to 0$, that says that $\lim s_1^{-1}f(s_1) = T(0)$ is on the limiting plane. Now, to find another vector on the limiting plane, we take an appropriate combination of $f(s_1)$, $f(s_2)$ so as to dispose of the $T(0)s$ term in the Taylor expansion (4.38). Thus, we consider
$$s_2 f(s_1) - s_1 f(s_2) = T'(0)(s_2 s_1^2 - s_1 s_2^2) + \varepsilon(s_1)s_1^2 s_2 - \varepsilon(s_2)s_1 s_2^2 \tag{4.39}$$
We are interested in finding some vector of this form which has a limit as $s_1, s_2$ tend to zero.
Let us take the special case $s_1 = 2s_2 = 2s$; (4.39) becomes
$$T'(0)\,2s^3 + \varepsilon(2s)\,4s^3 - \varepsilon(s)\,2s^3 = 2s^3\,(T'(0) + 2\varepsilon(2s) - \varepsilon(s))$$
Thus $T'(0) + 2\varepsilon(2s) - \varepsilon(s)$ is on the plane spanned by $f(s)$ and $f(2s)$. Letting $s \to 0$, we see that $T'(0)$ is on the limiting plane. Thus the limiting plane is indeed spanned by $T(0)$ and $T'(0)$.

A few remarks are in order. If $T'(0) = 0$, then the osculating plane is not defined. In particular, if the curve Γ is a straight line, then the tangent vector is constant, and there is no plane which is closest to Γ, so a straight line has no osculating plane anywhere. Conversely, if T and T' are always collinear along Γ, then Γ must be a straight line (Problem 14). Now, in the case where T' and T are collinear at the point in question, but not always collinear, it may happen that the plane through $\mathbf{x}_0, \mathbf{x}_1, \mathbf{x}_2$ of Proposition 4 has a limiting position as $\mathbf{x}_1, \mathbf{x}_2 \to \mathbf{x}_0$; and it may not (see Problem 15). In the former case we shall consider the osculating plane as defined by the limiting position, and in the latter case, we shall say that the osculating plane does not exist. Generally speaking, such cases are pathological, and we shall exclude them from further discussion.

Observe that for curves in $R^2$, the osculating plane is (of course) just $R^2$. For curves Γ in $R^2$, we define the normal vector to Γ at $\mathbf{x}_0$ as that unit vector N on the normal line so that the sense of rotation T → N is counterclockwise (see Figure 4.30). Then the normal vector N varies continuously along the curve and the vectors (T, N) will form a "natural" orthonormal basis for $R^2$ along the curve (called the moving frame). In $R^n$ for n > 2 there is no uniquely determined choice for a normal vector, and thus we leave the choice undetermined save that it should vary continuously along Γ.

[Figure 4.30]

Definition 5. Let Γ be a twice differentiable oriented curve in $R^n$. The normal vector to Γ is a choice of unit vector on the normal line which varies
continuously along Γ. The curvature of Γ is the scalar function of s, $\kappa(s)$, such that $T'(s) = \kappa(s)N(s)$ along Γ.

Examples

32. The circle in $R^2$ (Figure 4.31):
$$\mathbf{x}(s) = \left(a\cos\frac{s}{a},\ a\sin\frac{s}{a}\right), \qquad T(s) = \left(-\sin\frac{s}{a},\ \cos\frac{s}{a}\right)$$
The normal is orthogonal to and counterclockwise from T, so
$$N(s) = -\left(\cos\frac{s}{a},\ \sin\frac{s}{a}\right)$$
Then $T'(s) = -[\cos(s/a),\ \sin(s/a)]/a = N(s)/a$, so the curvature of the circle of radius a is $a^{-1}$.

[Figure 4.31]

33. The spiral $r = e^\theta$ (in polar coordinates) (Figure 4.32). The parametrization is
$$z = z(\theta) = e^\theta e^{i\theta} = e^{(1+i)\theta}, \qquad z'(\theta) = (1+i)e^{(1+i)\theta}$$
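so
$$\frac{ds}{d\theta} = \|z'(\theta)\| = \sqrt2\,e^\theta$$
Thus the tangent vector is
$$T(\theta) = \frac{1+i}{\sqrt2}\,e^{i\theta} = e^{i(\theta + \pi/4)}$$
The normal is $N(\theta) = e^{i(\theta + 3\pi/4)}$. Now,
$$\frac{dT}{ds} = \frac{dT}{d\theta}\,\frac{d\theta}{ds} = ie^{i(\theta+\pi/4)}\cdot\frac{e^{-\theta}}{\sqrt2} = \frac{e^{-\theta}}{\sqrt2}\,N(\theta)$$
Thus the curvature is given by $\kappa(\theta) = e^{-\theta}/\sqrt2$.

[Figure 4.32]

Here is a proposition which gives an interpretation of curvature in the plane and sometimes makes the curvature easily computable. It says that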
the curvature is the rate of rotation of the moving frame with respect to arc length.

Proposition 5. Let Γ be a given plane curve. The curvature of Γ is the rate of rotation of the tangent with respect to arc length, that is,
$$\kappa(s) = \frac{d}{ds}(\arg T(s))$$

Proof. Let $T(s) = r(s)e^{i\theta(s)}$ in polar coordinates. Since T is a unit vector, $r(s) = 1$. Then $N(s) = e^{i(\theta(s) + \pi/2)}$, and
$$\frac{dT}{ds} = \frac{d}{ds}\left(e^{i\theta(s)}\right) = i\theta' e^{i\theta} = \theta' e^{i(\theta + \pi/2)}$$
Thus $\kappa(s) = \theta'(s)$.

Examples

34. The helix $f(t) = (a\cos t,\ a\sin t,\ bt)$ has arc length $s = (a^2+b^2)^{1/2}t$, and tangent vector
$$T(s) = (a^2+b^2)^{-1/2}\left(-a\sin\left((a^2+b^2)^{-1/2}s\right),\ a\cos\left((a^2+b^2)^{-1/2}s\right),\ b\right)$$
Thus
$$T'(s) = (a^2+b^2)^{-1}\left(-a\cos\left((a^2+b^2)^{-1/2}s\right),\ -a\sin\left((a^2+b^2)^{-1/2}s\right),\ 0\right)$$
Thus $N = (-\cos t,\ -\sin t,\ 0)$ and
$$\kappa = \frac{a}{a^2+b^2}$$
Observe that the normal line to the helix always points toward the axis of the helix.

35. Consider the curve (Figure 4.33)
$$\mathbf{x}(t) = (\cos t,\ \sin t,\ \sin 3t)$$
[Figure 4.33]

Then
$$\mathbf{x}'(t) = (-\sin t,\ \cos t,\ 3\cos 3t)$$
$$\frac{ds}{dt} = \|\mathbf{x}'(t)\| = (1 + 9\cos^2 3t)^{1/2}$$
$$T(t) = (1 + 9\cos^2 3t)^{-1/2}\,(-\sin t,\ \cos t,\ 3\cos 3t)$$
Computing
$$\frac{dT}{ds} = \frac{dT}{dt}\,\frac{dt}{ds} = -(1 + 9\cos^2 3t)^{-2}\,\big((1 + 9\cos^2 3t)\cos t + 27\sin 3t\cos 3t\sin t,\ (1 + 9\cos^2 3t)\sin t - 27\sin 3t\cos 3t\cos t,\ 9\sin 3t\big)$$
and the curvature is the length of this vector.

Now, let us make one final remark about a curve in the plane. It is completely determined, up to Euclidean motions, by its curvature. Thus, for example, the only curve of constant curvature is a circle. This is, as we shall see, an easy consequence of Picard's existence theorem for differential equations.

Theorem 4.1. Let $\kappa(s)$ be a continuous function of s in some interval I about the origin. There is a curve Γ whose curvature function is $\kappa(s)$. If Γ' is another curve parametrized by arc length on the interval I which has the same curvature, then a Euclidean motion will move Γ' onto Γ.
Proof. First we shall verify the uniqueness. Let Γ be a curve with the given curvature. Let $\mathbf{x} = \mathbf{x}(s)$ be its arc length parametrization. We may apply a Euclidean motion (translation and rotation) so that $\mathbf{x}(0)$ is the origin and $T(0)$ is the vector $E_1$. Now we show there is only one curve with these properties. The proof depends on the observation that the normal is rigidly attached to the tangent; that is, its motion along the curve is completely determined by the tangent. In fact, writing $T(s) = e^{i\theta(s)}$, we have $N(s) = e^{i(\theta(s) + \pi/2)}$. Thus $N' = i\theta' e^{i(\theta + \pi/2)} = \theta' e^{i(\theta + \pi)} = -\kappa T$. Now the system of differential equations
$$T'(s) = \kappa(s)N(s), \qquad N'(s) = -\kappa(s)T(s) \tag{4.40}$$
has only one solution subject to the initial conditions $T(0) = E_1$, $N(0) = E_2$. Thus T is unique, so
$$\mathbf{x}(s) = \int_0^s T(\sigma)\,d\sigma$$
is also uniquely determined by the given conditions. Thus there is only one Γ with the given curvature.

We now turn to the question of the existence of a plane curve with given curvature. Again, by the fundamental theorem on differential equations, there exists a solution of the system (4.40) subject to the initial conditions $T(0) = E_1$, $N(0) = E_2$. If $(T(s), N(s))$ is the solution, then
$$\mathbf{x} = \mathbf{x}(s) = \int_0^s T(\sigma)\,d\sigma$$
defines a plane curve Γ. We must show that s is arc length along Γ. For then $\mathbf{x}''(s) = \kappa(s)N(s)$, so $\kappa(s)$ is the curvature. To show that s is arc length we must show that $\mathbf{x}'(s) = T(s)$ is a unit vector. Now, let $\tilde T(s) = -iN(s)$, $\tilde N(s) = iT(s)$. Then
$$\tilde T(0) = -iE_2 = E_1, \qquad \tilde N(0) = iE_1 = E_2$$
$$\tilde T'(s) = -iN'(s) = -i(-\kappa(s)T(s)) = \kappa(s)\tilde N(s)$$
$$\tilde N'(s) = iT'(s) = i\kappa(s)N(s) = -\kappa(s)\tilde T(s)$$
Thus $\tilde T$, $\tilde N$ also solve the given initial value problem. By the uniqueness, $\tilde T = T$, $\tilde N = N$. Thus $N = iT$, so $N \perp T$. It follows that
$$\frac{d}{ds}\langle T(s), T(s)\rangle = 2\langle T(s), T'(s)\rangle = 2\kappa(s)\langle T(s), N(s)\rangle = 0$$
so $T(s)$ has constant length. Since $T(0) = E_1$, it is a unit vector.
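The existence half of the theorem is constructive enough to be tried on a computer. The sketch below is a numerical illustration under stated assumptions, not the proof's method verbatim: it writes $T(s) = e^{i\theta(s)}$, so that the system (4.40) reduces to $\theta'(s) = \kappa(s)$, and then integrates $\mathbf{x}'(s) = (\cos\theta, \sin\theta)$ with a simple midpoint rule, starting at the origin with $T(0) = E_1$. The function name `curve_from_curvature` and the step count are hypothetical choices.

```python
import math

def curve_from_curvature(kappa, s_end, steps=20000):
    """Integrate theta' = kappa(s), x' = (cos theta, sin theta) from s = 0.

    Starts at the origin with tangent E1 and returns (x, y, theta) at s_end.
    """
    h = s_end / steps
    theta, x, y = 0.0, 0.0, 0.0
    for i in range(steps):
        k = kappa((i + 0.5) * h)          # curvature at the midpoint of the step
        theta_mid = theta + 0.5 * h * k   # tangent angle at the midpoint
        x += h * math.cos(theta_mid)
        y += h * math.sin(theta_mid)
        theta += h * k
    return x, y, theta

# Constant curvature 1/R should trace a circle of radius R through the origin,
# centered at (0, R); after arc length pi*R the point is diametrically opposite.
R = 2.0
x, y, theta = curve_from_curvature(lambda s: 1.0 / R, math.pi * R)
print(x, y, theta)   # approximately 0, 2R, pi
```

Replacing the constant by another continuous function of s produces other plane curves with that prescribed curvature, each determined up to the choice of starting point and starting direction.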
• PROBLEMS

14. Show that if Γ: $\mathbf{x} = \mathbf{x}(s)$ is a curve in $R^3$ and $T(s)$, $T'(s)$ are everywhere collinear, then Γ is a straight line.
15. (a) Let Γ be given by $\mathbf{x} = (x, x^3, x^3)$. Show that $T(0)$, $T'(0)$ are collinear, but Γ has an osculating plane at the origin.
(b) Let
$$g(x) = \begin{cases} x^3 & \text{if } x < 0 \\ 0 & \text{if } x \ge 0 \end{cases}$$
Show that the curve Γ given by $\mathbf{x} = (x, -g(x), g(-x))$ does not have an osculating plane at the origin.
16. Let Γ be a curve on the sphere $x^2 + y^2 + z^2 = 1$. Show that Γ is an arc of a great (i.e., diametric) circle if and only if the normal to Γ is always collinear with the position vector.
17. Show that a curve is a straight line if all its tangents are parallel.
18. Three noncollinear points in $R^2$ determine a circle. If, for the purposes of this exercise, we consider a straight line as a circle (of infinite radius) we may assert that any three points determine a circle. Suppose Γ is a curve in $R^2$ through $p_0$. Following the kind of reasoning on pages 324 and 325, define the osculating circle to Γ at $p_0$ and find its equation in terms of a parametrization of Γ.
19. The radius of the osculating circle is called the radius of curvature. Show that it is $\kappa^{-1}$.
20. If the osculating circle to Γ is always a straight line, deduce that Γ is a straight line.
21. Find the osculating circle at a general point of an ellipse.
22. Find the osculating circle at a general point of a parabola.
23. Show that if the osculating circle to a curve is always a circle of radius R, the curve is a circle of radius R.
24. Suppose Γ is given parametrically by arc length by $x = x(s)$, $y = y(s)$. Show that the curvature is given by
$$\kappa = x'y'' - y'x'' = [(x'')^2 + (y'')^2]^{1/2}$$
25. Show that a curve in the plane of constant curvature is a circle.
26. Suppose $\Gamma$: $\mathbf{x} = \mathbf{f}(s)$ is a curve with this property: for every $t$, the distance between $\mathbf{f}(s)$ and $\mathbf{f}(s + t)$ is independent of $s$. Show that $\Gamma$ is a circle.
27. Suppose $f$ is a nonnegative function of a real variable with the property that the area under the graph of $f$ between 0 and $x$ is proportional to the arc length of that graph. Find the curve.
28. Find the curve $\Gamma$ with the property that at any point $p$ the angle between the tangent to $\Gamma$ at $p$ and the tangent to the ellipse $E$: $x^2 + 2y^2 = 1$ at the point of intersection of $E$ with the ray through $p$ is constant.
29. Let $\Gamma$: $\mathbf{x} = \mathbf{f}(s)$ be a planar curve. Suppose we have a string along $\Gamma$ with one end point at $\mathbf{x}_0$. If we unwind the string tautly and without stretching, the end point will follow a curve $E$, called an evolute of $\Gamma$ (Figure 4.34). If $s$ measures arc length from $\mathbf{x}_0 = \mathbf{f}(0)$, the unwound part of the string has length $s$ and points back along the tangent, so the curve $E$ is parametrized by $\mathbf{x} = \mathbf{f}(s) - s\mathbf{f}'(s)$. Find the evolutes to (a) the unit circle, (b) the spiral $z = e^{(1+i)t}$, (c) the parabola $y = x^2$, (d) an ellipse.
30. If we rotate a cylinder of water about its axis, the surface of the water does not remain a plane. What shape does it take and why?

Figure 4.34
4.4 Curves in Space

Suppose $\Gamma$ is a curve in space. Let $\mathbf{x}_0 \in \Gamma$, and suppose $\mathbf{T}$ and $\mathbf{N}$ are the tangent and normal to $\Gamma$ at $\mathbf{x}_0$. A third unit vector orthogonal to both $\mathbf{T}$ and $\mathbf{N}$ will serve to provide a natural frame within which to discuss the behavior of the curve near $\mathbf{x}_0$. This vector $\mathbf{B}$, called the binormal to the curve, is chosen so that the triple $\mathbf{T} \to \mathbf{N} \to \mathbf{B}$ forms a right-handed frame (see Figure 4.35). In this section we shall use this frame, called the moving trihedron along the curve, much as we used the tangent and normal to study plane curves.

The three vectors $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$ determine three planes: the tangent (or osculating) plane is spanned by $\mathbf{T}$ and $\mathbf{N}$, the normal plane is spanned by $\mathbf{N}$ and $\mathbf{B}$, and the plane spanned by $\mathbf{T}$ and $\mathbf{B}$ is called the rectifying plane.

Now the curvature of the curve is, as we have seen, the rate of rotation, with respect to arc length, of the tangent line in the osculating plane. In three dimensions there is another important intrinsic function on the curve. Since $\mathbf{B}$ is a unit vector on $\Gamma$, $\langle \mathbf{B}', \mathbf{B}\rangle = 0$. Thus $\mathbf{B}'$ lies in the osculating plane. Since $\langle \mathbf{B}, \mathbf{T}\rangle = 0$, we have

$\langle \mathbf{B}', \mathbf{T}\rangle + \langle \mathbf{B}, \mathbf{T}'\rangle = 0$

Since $\mathbf{T}' = \kappa\mathbf{N}$, $\langle \mathbf{B}, \mathbf{T}'\rangle = \kappa\langle \mathbf{B}, \mathbf{N}\rangle = 0$; thus also $\langle \mathbf{B}', \mathbf{T}\rangle = 0$, so $\mathbf{B}'$ must be collinear with $\mathbf{N}$.

Figure 4.35
Definition 6. The torsion $\tau$ of a curve $\Gamma$ is that function such that $\mathbf{B}' = -\tau\mathbf{N}$.

The torsion measures the torque, that is, the twisting of the osculating plane about the tangent line. That is, since the binormal is orthogonal to the osculating plane, the change in the binormal reflects adequately the change in the osculating plane. The Taylor development of the binormal in a neighborhood of a point $\mathbf{x}_0 = \mathbf{x}(0)$ is

$\mathbf{B}(s) = \mathbf{B}(0) - \tau(0)\mathbf{N}(0)s + \varepsilon(s)$

Thus (considering only first-order terms) the binormal at $\mathbf{x}(s)$ has moved $-\tau(0) \cdot s$ toward the normal. Thus if $\tau(0) > 0$, the osculating plane has twisted in the right-handed sense about the tangent line. At a point where $\tau = 0$, the osculating plane pauses; it may or may not change its direction of rotation about the tangent line. If $\tau \equiv 0$, the osculating plane remains fixed along the curve; it follows that the curve lies on this plane.

Proposition 6. Let $\Gamma$ be a curve in $R^3$. $\Gamma$ is a plane curve if and only if $\tau = 0$ along $\Gamma$.

Proof. If $\Gamma$ is a plane curve, let $\Pi$ be the plane containing $\Gamma$. The tangent and normal to $\Gamma$ always lie on $\Pi$, so the binormal is always the unit vector orthogonal to $\Pi$. Thus the binormal is constant, so $\mathbf{B}' = 0$, thus $\tau = 0$. On the other hand, suppose $\tau = 0$. Let $\mathbf{x} = \mathbf{x}(s)$ be the parametrization of $\Gamma$ by arc length. Since $\tau = 0$, $\mathbf{B}' = 0$, so $\mathbf{B}$ is constant along $\Gamma$. If for some $s_0$, $\mathbf{x}(s_0)$ is not on the plane through $\mathbf{x}(0)$ and orthogonal to $\mathbf{B}$, then

$\langle \mathbf{x}(s_0) - \mathbf{x}(0), \mathbf{B}\rangle \ne 0$    (4.41)

Let $\phi(s)$ be the function $\langle \mathbf{x}(s) - \mathbf{x}(0), \mathbf{B}\rangle$. Then $\phi'(s) = \langle \mathbf{T}(s), \mathbf{B}\rangle$, which is zero since $\mathbf{B} = \mathbf{B}(s)$ for all $s$ and is orthogonal to $\mathbf{T}(s)$. Thus $\phi(s)$ is constant. Since $\phi(0) = 0$, $\phi(s_0) = 0$ also, contradicting (4.41).

The fundamental formulas of space curve theory are those relating $\mathbf{T}'$, $\mathbf{N}'$, $\mathbf{B}'$ with $\mathbf{T}$, $\mathbf{N}$, $\mathbf{B}$. We can now easily derive them.

Theorem 4.2. (Frenet-Serret Formulas)

$\mathbf{T}' = \kappa\mathbf{N}$
$\mathbf{N}' = -\kappa\mathbf{T} + \tau\mathbf{B}$
$\mathbf{B}' = -\tau\mathbf{N}$
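Proof. The first and the third are just the definitions of $\kappa$, $\tau$, respectively. Since $\mathbf{N}$ is a unit vector, $\langle \mathbf{N}', \mathbf{N}\rangle = 0$, so $\mathbf{N}'$ lies in the rectifying plane. Write $\mathbf{N}' = \alpha\mathbf{T} + \beta\mathbf{B}$; we must verify $\alpha = -\kappa$, $\beta = \tau$. But that follows from $\langle \mathbf{N}, \mathbf{T}\rangle = 0$, $\langle \mathbf{N}, \mathbf{B}\rangle = 0$. For

$\alpha = \langle \mathbf{N}', \mathbf{T}\rangle = -\langle \mathbf{N}, \mathbf{T}'\rangle = -\kappa$
$\beta = \langle \mathbf{N}', \mathbf{B}\rangle = -\langle \mathbf{N}, \mathbf{B}'\rangle = -(-\tau) = \tau$

Examples

36. The circular helix: $\mathbf{x}(t) = (a \cos t, a \sin t, bt)$. We have already computed that $s = ct$, where $c = (a^2 + b^2)^{1/2}$, and

$\mathbf{T}(s) = \frac{1}{c}\left(-a \sin\frac{s}{c},\ a \cos\frac{s}{c},\ b\right)$
$\kappa\mathbf{N}(s) = \mathbf{T}'(s) = \frac{1}{c^2}\left(-a \cos\frac{s}{c},\ -a \sin\frac{s}{c},\ 0\right)$

thus

$\kappa = \frac{a}{c^2} \qquad \mathbf{N} = \left(-\cos\frac{s}{c},\ -\sin\frac{s}{c},\ 0\right)$
$\mathbf{B} = \mathbf{T} \times \mathbf{N} = \frac{1}{c}\left(b \sin\frac{s}{c},\ -b \cos\frac{s}{c},\ a\right)$

Differentiating, $\mathbf{B}' = \frac{b}{c^2}\left(\cos\frac{s}{c}, \sin\frac{s}{c}, 0\right) = -\frac{b}{c^2}\mathbf{N}$; thus $\tau = b/c^2$.

37. Let $C$ be a curve in the $xy$ plane, and let $\Gamma$ be a curve of constant slope lying over the curve $C$ (see Figure 4.36). Thus if $\mathbf{T}$ is the tangent to $\Gamma$, $\langle \mathbf{T}, \mathbf{E}_3\rangle$ is constant. Then $\Gamma$ has a parametrization $\mathbf{x}(t) = (x(t), y(t), bt)$ for some constant $b$, where $(x(t), y(t))$ parametrizes $C$. We may assume the parameter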
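is arc length along $C$. Then $\mathbf{x}' = (x', y', b)$, so

$\frac{ds}{dt} = \|\mathbf{x}'\| = ((x')^2 + (y')^2 + b^2)^{1/2} = (1 + b^2)^{1/2}$

Thus $s = (1 + b^2)^{1/2}t$ and the tangent to $\Gamma$ is

$\mathbf{T} = \frac{1}{(1 + b^2)^{1/2}}(x', y', b)$

Thus, differentiating with respect to $s$,

$\kappa\mathbf{N} = \mathbf{T}' = \frac{1}{1 + b^2}(x'', y'', 0)$

Now if $\kappa_C$ is the curvature of $C$, since $(x', y')$ is the unit tangent to $C$, $(-y', x')$ is its normal vector, so

$(x'', y'') = \kappa_C(-y', x')$

Figure 4.36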
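Thus

$\kappa\mathbf{N} = \frac{\kappa_C}{1 + b^2}(-y', x', 0)$

so

$\kappa = \frac{\kappa_C}{1 + b^2} \qquad \mathbf{N} = (-y', x', 0)$

Then

$\mathbf{B} = \mathbf{T} \times \mathbf{N} = \frac{1}{(1 + b^2)^{1/2}}(-bx', -by', (x')^2 + (y')^2) = \frac{1}{(1 + b^2)^{1/2}}(-bx', -by', 1)$

Differentiating with respect to $s$,

$-\tau\mathbf{N} = \mathbf{B}' = \frac{1}{1 + b^2}(-bx'', -by'', 0) = -\frac{b\kappa_C}{1 + b^2}(-y', x', 0)$

Thus

$\tau = \frac{b\kappa_C}{1 + b^2}$

Local Behavior of a Curve

We shall now make a close study of the local behavior of a curve relative to the moving trihedron. Let $\Gamma$ be a sufficiently differentiable curve, parametrized by arc length by $\mathbf{x} = \mathbf{x}(s)$, $-\varepsilon < s < \varepsilon$. We may perform a Euclidean transformation so that $\mathbf{x}(0) = 0$, $\mathbf{T}(0) = \mathbf{E}_1$, $\mathbf{N}(0) = \mathbf{E}_2$, $\mathbf{B}(0) = \mathbf{E}_3$. Expanding $\mathbf{x}(s)$ in a Taylor series, we obtain

$\mathbf{x}(s) = \mathbf{x}(0) + \mathbf{x}'(0)s + \mathbf{x}''(0)\frac{s^2}{2} + \mathbf{x}'''(0)\frac{s^3}{6} + \varepsilon(s^3)$    (4.42)

Now

$\mathbf{x}' = \mathbf{T}, \qquad \mathbf{x}'' = \kappa\mathbf{N}, \qquad \mathbf{x}''' = \kappa'\mathbf{N} + \kappa\mathbf{N}' = \kappa'\mathbf{N} + \kappa(-\kappa\mathbf{T} + \tau\mathbf{B})$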
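Evaluating these at zero and substituting into (4.42), we obtain

$\mathbf{x}(s) = s\mathbf{E}_1 + \frac{\kappa}{2}s^2\mathbf{E}_2 + \frac{\kappa'}{6}s^3\mathbf{E}_2 - \frac{\kappa^2}{6}s^3\mathbf{E}_1 + \frac{\kappa\tau}{6}s^3\mathbf{E}_3 + \varepsilon(s^3)$

In coordinates,

$x = s - \frac{\kappa^2}{6}s^3 + \varepsilon(s^3)$
$y = \frac{\kappa}{2}s^2 + \frac{\kappa'}{6}s^3 + \varepsilon(s^3)$
$z = \frac{\kappa\tau}{6}s^3 + \varepsilon(s^3)$

Thus for small values of $s$, keeping only the lowest-order term in each coordinate, the given curve looks like the cubic curve given by the equations

$y = \frac{\kappa}{2}x^2 \qquad z = \frac{\kappa\tau}{6}x^3$

Figure 4.37 is a picture of this curve for $\kappa > 0$, $\tau > 0$. Notice that, so long as $\kappa\tau \ne 0$, the curve always passes through its osculating and normal planes, but lies on one side of its rectifying plane.

Figure 4.37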
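As a rough numerical check of this local expansion, one can compare it against the circular helix of Example 36, for which $\kappa = a/c^2$, $\tau = b/c^2$, and $\kappa' = 0$. The sketch below is illustrative only; the function names are made up here, and the frame at $s = 0$ is the one computed in Example 36.

```python
import math

def helix_local_coordinates(a, b, s):
    """Coordinates of x(s) - x(0) in the frame T(0), N(0), B(0) of the helix."""
    c = math.hypot(a, b)
    # Arc length parametrization of the helix, shifted so that x(0) is the origin.
    d = (a * (math.cos(s / c) - 1.0), a * math.sin(s / c), b * s / c)
    T = (0.0, a / c, b / c)      # T(0) from Example 36
    N = (-1.0, 0.0, 0.0)         # N(0)
    B = (0.0, -b / c, a / c)     # B(0) = T(0) x N(0)
    dot = lambda u, v: sum(p * q for p, q in zip(u, v))
    return dot(d, T), dot(d, N), dot(d, B)

def cubic_model(a, b, s):
    """Third-order Taylor model in the moving trihedron (kappa' = 0 for the helix)."""
    c2 = a * a + b * b
    kappa, tau = a / c2, b / c2
    return (s - kappa ** 2 * s ** 3 / 6,
            kappa * s ** 2 / 2,
            kappa * tau * s ** 3 / 6)

exact = helix_local_coordinates(1.0, 1.0, 0.1)
model = cubic_model(1.0, 1.0, 0.1)
errors = [abs(p - q) for p, q in zip(exact, model)]
```

For small $s$ the three entries of `errors` should be of higher order than $s^3$, which is what the remainder term $\varepsilon(s^3)$ asserts.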
Now, just as the curvature determines plane curves up to a Euclidean motion, space curves are so determined by the curvature and torsion. The proof of this fact is by the same kind of application of Picard's existence and uniqueness theorem as we used in the case of the plane. We shall leave the verification to the interested reader.

Theorem 4.3. Given continuous functions $f$, $g$ defined in an interval $I$, there is a space curve $\Gamma$: $\mathbf{x} = \mathbf{x}(s)$ given parametrically by arc length in some subinterval of $I$ such that

$\kappa(s) = f(s) \qquad \tau(s) = g(s)$

$\Gamma$ is unique up to Euclidean motions in $R^3$.

• PROBLEMS

31. Show that a curve in $R^3$ is a plane curve if all its tangent planes pass through a given point.
32. Show that a curve in $R^3$ is a plane curve if its binormal is constant.
33. Let $\Gamma$ be a curve in the plane and let $\gamma$ be the intersection of the cylinder over $\Gamma$ with the cone $x^2 + y^2 = z$, $z \ge 0$. Find the curvature of $\Gamma$ in terms of that of $\gamma$. What is the torsion of $\gamma$?
34. Let $\Gamma$ be a curve in space, and $\gamma$ its projection onto the $xy$ plane. What is the relation between the curvature and torsion of $\Gamma$ and the curvature of $\gamma$?
35. Suppose that $\Gamma$ is the intersection of the surface $z = y^2$ in $R^3$ with the plane $ax + by = 0$. What is the curvature of $\Gamma$ at the origin?
36. Let $\Gamma$ be the intersection of the surface $z = x^2 + 2y^2$ with the plane $ax + by = 0$. What are the curvature and torsion of $\Gamma$?
37. Let $\Gamma$ be given in $R^3$ by $\mathbf{x} = \mathbf{x}(s)$. Let $\Sigma$ be the surface swept out by the tangent lines to $\Gamma$. Show that a curve on $\Sigma$ which is everywhere orthogonal to those tangent lines is given by $\mathbf{x} = \mathbf{x}(s) + (c - s)\mathbf{T}(s)$ for some constant $c$.

4.5 Varying a Curve in the Plane

A family of curves in the plane is a collection of curves $\{\Gamma_c\}$, as $c$ ranges through some set, usually of $n$-tuples of numbers. It is to be understood that the curves of the family vary smoothly, although we shall not make this idea precise. For example, if $x(t, c)$, $y(t, c)$ are functions defined for real $t$ and $c$ lying
in some set $S$, then the equations

$x = x(t, c), \qquad y = y(t, c)$    (4.43)

determine a family of curves: each curve in the family is found by fixing a value of $c$. We refer to Equations (4.43) as the explicit form of the family. More often, a curve is determined by a relation between $x$, $y$, and a family could be given by an equation

$F(x, y, c) = 0$    (4.44)

which, for fixed $c$, gives the relation determining a curve. We refer to (4.44) as the implicit form of the family. Since it does not refer to any particular parametrization of the individual members, this form is particularly useful. The "constant" $c$ which picks out the member of the family usually ranges through some set in $R^n$, in which case we refer to the family ((4.43), (4.44)) as an $n$-parameter family of curves.

Examples

38. A straight line in the plane is given by the equation

$ax + by + c = 0$    (4.45)

Thus the set of all straight lines is given by (4.45) implicitly as a 3-parameter family of curves. If instead we write down the slope-intercept form of a straight line,

$y = mx + b$    (4.46)

then we exhibit this family explicitly as a 2-parameter family of curves.

39. Let $x = x(t)$, $y = y(t)$ be the equation of a curve $\Gamma$ in the plane, and consider the family of tangent lines to $\Gamma$. The equation of the line tangent to $\Gamma$ at $(x(t), y(t))$ is

$y = y(t) + \frac{y'(t)}{x'(t)}(x - x(t))$    (4.47)
This is the explicit form then of a 1-parameter family. (The parameter is $t$.)

40. Consider the case where $\Gamma$ is the circle

$x = \cos t \qquad y = \sin t$

The family of tangent lines to $\Gamma$ is given by the equation

$y = \sin t - \frac{\cos t}{\sin t}(x - \cos t)$    (4.48)

This simplifies to

$y = -x \cot t + \csc t$

We can make this appear even more palatable by taking $-\cot t$ as the parameter of the family. Letting $c = -\cot t$, we find $\csc^2 t = 1 + c^2$, so $\csc t = \pm(1 + c^2)^{1/2}$ (the sign being that of $\sin t$), and (4.48) becomes

$y = xc \pm (1 + c^2)^{1/2}$    (4.49)

a 1-parameter family of lines.

Figure 4.38

41. Suppose a hoop is rolling along a horizontal line (see Figure 4.38). This collection of positions of the hoop forms a 1-parameter family of circles where the point of tangency with the horizontal (the
$x$ axis) is taken to be the parameter. The implicit equation for the family is thus

$(x - c)^2 + (y - 1)^2 = 1$

42. The family of circles tangent to both the $x$ axis and the $y$ axis is a 1-parameter family of curves (Figure 4.39). We take for the parameter the point of tangency of the curve with the $x$ axis. If $r$ is the radius of the $c$th circle, then the equation of the family is clearly

$(x - c)^2 + (y - r)^2 = r^2$

It is easily seen that $r = c$; this follows from elementary geometric considerations. Thus the family is implicitly described by this equation:

$(x - c)^2 + (y - c)^2 = c^2$    (4.50)

Figure 4.39

43. The family of circles of radius 1 tangent to the parabola $y = x^2$ (Figure 4.40). We can take as the parameter the $x$ coordinate $c$ of the points of tangency. The center of the circle is on the line perpendicular to the parabola at $(c, c^2)$. Thus if $(r, s)$ are the coordinates
of the center of the $c$th circle, we have

$s - c^2 = -\frac{1}{2c}(r - c) \qquad (r - c)^2 + (s - c^2)^2 = 1$

These equations have the solution

$r = c + \frac{2c}{(1 + 4c^2)^{1/2}} \qquad s = c^2 - \frac{1}{(1 + 4c^2)^{1/2}}$

Thus the implicit equation for this family of circles is

$\left(x - c - \frac{2c}{(1 + 4c^2)^{1/2}}\right)^2 + \left(y - c^2 + \frac{1}{(1 + 4c^2)^{1/2}}\right)^2 = 1$

Figure 4.40

44. Let $\Gamma$ be a curve in the plane. We seek the family of tangents to $\Gamma$. If $\Gamma$ is given as a function of arc length by $\mathbf{x} = \mathbf{x}(s)$, then the lines

$\mathbf{g}(s, u) = \mathbf{x}(s) + u\mathbf{T}(s)$    (4.51)

form the family of tangents to $\Gamma$ with $s$ as parameter. Suppose now that $\gamma$ is a curve which is orthogonal to this family at every point. If $\mathbf{h}(s)$ is the point of intersection of $\gamma$ with the particular tangent line
(4.51) at $\mathbf{x}(s)$, then $\gamma$ is parametrized by $\mathbf{x} = \mathbf{h}(s)$. $\mathbf{h}(s)$ is then of the form (4.51) with a particular choice $u(s)$ of $u$. Writing then $\mathbf{h}(s) = \mathbf{x}(s) + u(s)\mathbf{T}(s)$, and differentiating, we obtain

$\mathbf{h}'(s) = (1 + u'(s))\mathbf{T}(s) + u(s)\mathbf{T}'(s)$

Since $\mathbf{h}'(s)$ is tangent to $\gamma$ and thus, by assumption, orthogonal to $\mathbf{T}$, and since $\mathbf{T}' = \kappa\mathbf{N}$ is already orthogonal to $\mathbf{T}$, we must have $1 + u'(s) = 0$. Thus $u(s) = c - s$. So the family of curves orthogonal to the tangent lines to $\Gamma$ is given by

$\mathbf{x} = \mathbf{x}(s) + (c - s)\mathbf{T}(s)$    (4.52)

The family of curves orthogonal to the tangents to the circle $z = e^{is}$ is given by

$z = e^{is} + i(c - s)e^{is} = [1 + i(c - s)]e^{is}$

These are just the evolutes of the circle.

The Differential Equation of a Family

A differential equation $y' = F(x, y)$ determines a 1-parameter family of curves, if the function $F$ is decent enough. For, under such conditions, for each $c$ there is a unique solution of the initial value problem

$y' = F(x, y) \qquad y(x_0) = c$

The solution can be written $y = f(x, c)$, which can be considered as either the explicit or the implicit form of the family. Now, it is usually true that a 1-parameter family of curves is the family of solutions of some differential equation, and we would often like to find that differential equation. Suppose, for example, that $y = f(x, c)$ is the equation of a given 1-parameter family. If $y = y(x)$ is one particular curve (i.e., $y(x) = f(x, c_0)$ for some fixed $c_0$), then these two equations must hold:

$y = f(x, c) \qquad y' = \frac{\partial f}{\partial x}(x, c)$
for some value of $c$ (i.e., $c = c_0$). It may be possible to eliminate the parameter $c$ from these two equations, thus obtaining a relation between $x$, $y$, $y'$ which must be satisfied; this is the differential equation of the family. For it is a differential equation which must be valid for each member of the family, and it is this differential equation which determines the family.

More generally, suppose the family is given implicitly by

$F(x, y, c) = 0$

If $x = x(t)$, $y = y(t)$ parametrizes one of the curves in the family, then there is a $c$ such that

$F(x(t), y(t), c) = 0$    (4.53)

identically in $t$. Differentiating now with respect to $t$, we have

$\frac{\partial F}{\partial x}(x, y, c)x' + \frac{\partial F}{\partial y}(x, y, c)y' = 0$    (4.54)

If we can eliminate $c$ from Equations (4.53) and (4.54), the result will be a relation between $x$, $y$, $x'$, $y'$ which must be satisfied for each curve in the family and thus is the differential equation of the family. Of course, if $x$ is the parameter along the curve, and $y = y(x)$ is its equation, (4.54) becomes

$\frac{\partial F}{\partial x}(x, y, c) + \frac{\partial F}{\partial y}(x, y, c)\frac{dy}{dx} = 0$    (4.55)

Examples

45. Consider the family of parabolas (Figure 4.41)

$y^2 - cx = 0$

Differentiating with respect to $x$ (considering $y$ as a function of $x$),

$2yy' - c = 0$

Thus the differential equation of the family is

$y^2 - 2yy'x = 0$
or, excluding the curve $y = 0$,

$y - 2y'x = 0$

Figure 4.41

46. The family $y = ce^x$ is given by the differential equation $y' = y$ (as we already know). The family $y = e^{[\ldots]}$ is given by the differential equation $y' = \exp([\ldots])$.

47. (Clairaut's Equation). Let $y = f(x)$ give a curve in the plane, and consider the family of lines tangent to that curve. That family is given implicitly (taking the $x$ coordinate of the point of tangency as the parameter) by this equation:

$y = f(c) + f'(c)(x - c)$    (4.56)

Now, upon differentiation we find

$y' = f'(c)$    (4.57)
To say that we can eliminate $c$ from the pair of Equations (4.56) and (4.57) amounts to saying that we can solve (4.57) for $c$ as a function of $y'$. Then, upon eliminating $c$, we obtain as differential equation the equation

$y = y'x + h(y')$    (4.58)

where $h(y')$ represents the expression $f(c) - f'(c)c$, considered as a function of $y'$. Thus Equation (4.58), known as Clairaut's equation, is the general form of the differential equation of a family of lines tangent to a curve. Its solutions are

$y = cx + h(c)$

Notice that the given curve $y = f(x)$ also solves Equation (4.58) (because it is derived from (4.56) and (4.57), which hold under the substitution $y = f(x)$, $c = x$). It is called the singular solution of the equation.

48. The family of lines tangent to the parabola $y = x^2$ has the implicit form

$y = c^2 + 2c(x - c) = 2cx - c^2$

Differentiating, we obtain $y' = 2c$. Thus $c = \frac{1}{2}y'$, so we can eliminate $c$ to obtain this differential equation of the family:

$y = y'x - \frac{1}{4}(y')^2$

49. The family of lines tangent to the circle $x^2 + y^2 = 1$ is given implicitly by

$y = xc \pm (1 + c^2)^{1/2}$

Then $y' = c$, so the Clairaut equation of the family is

$y = y'x \pm (1 + (y')^2)^{1/2}$
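Example 48 can be checked with exact rational arithmetic. The following sketch (function names are illustrative, not from the text) evaluates the residual of the Clairaut equation $y = y'x - \frac{1}{4}(y')^2$ on the tangent lines $y = 2cx - c^2$ and on the singular solution $y = x^2$.

```python
from fractions import Fraction

def tangent_line(c):
    """Slope and intercept of the tangent to y = x^2 at x = c, i.e. y = 2cx - c^2."""
    return 2 * c, -c * c

def clairaut_residual(x, y, yprime):
    """y - (y' x - (y')^2 / 4); this vanishes on solutions of the equation in Example 48."""
    return y - (yprime * x - yprime * yprime / 4)

samples = [Fraction(n, 3) for n in range(-6, 7)]

# Every tangent line is a solution: y' is the constant slope 2c.
line_residuals = [
    clairaut_residual(x, slope * x + intercept, slope)
    for c in samples
    for slope, intercept in [tangent_line(c)]
    for x in samples
]

# The parabola itself is the singular solution: y = x^2, y' = 2x.
singular_residuals = [clairaut_residual(x, x * x, 2 * x) for x in samples]
```

Both lists should consist entirely of zeros; using `Fraction` avoids any rounding in the check.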
Family Orthogonal to a Given Family

50. Let $F$ be a given family of curves. We propose to find a family $G$ of curves everywhere orthogonal to $F$. Thus, if $p$ is a point in the plane, and $\Gamma$ is the curve in $F$ through $p$ with tangent $\mathbf{T}_1$, and $\gamma$ is the curve in $G$ through $p$ with tangent $\mathbf{T}_2$, we must have $\langle \mathbf{T}_1, \mathbf{T}_2\rangle = 0$. Suppose the family $F$ is given by the differential equation (Figure 4.42)

$a(x, y)x' + b(x, y)y' = 0$    (4.59)

Thus, since $(x', y')$ is the tangent field to $F$, we must have $\mathbf{T}_2$ collinear with $(a(x, y), b(x, y))$ (for $\langle \mathbf{T}_1, (a, b)\rangle = 0$ by (4.59)). Thus the differential equation for the family $G$ is

$\frac{x'}{a(x, y)} = \frac{y'}{b(x, y)}$    (4.60)

Figure 4.42
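A numerical sketch of this recipe follows. It is not part of the text: the Runge-Kutta helper `rk4`, the starting point, and the choice of the hyperbolas $xy = c$ (for which $a = y$, $b = x$) as the given family are assumptions made for illustration. Integrating $(x', y') = (a, b)$ should produce a curve of the orthogonal family, which for this choice is a hyperbola $x^2 - y^2 = \text{const}$.

```python
def rk4(field, p, t_end, steps=1000):
    """Integrate (x', y') = field(x, y) from the point p with classical Runge-Kutta."""
    h = t_end / steps
    x, y = p
    for _ in range(steps):
        k1 = field(x, y)
        k2 = field(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = field(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = field(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

def orthogonal_field(a, b):
    """Tangent field (a, b) of the family orthogonal to a x' + b y' = 0."""
    return lambda x, y: (a(x, y), b(x, y))

# Given family xy = c, i.e. y x' + x y' = 0, so a = y and b = x.
field = orthogonal_field(lambda x, y: y, lambda x, y: x)
x1, y1 = rk4(field, (2.0, 1.0), 1.0)
invariant_drift = abs((x1 ** 2 - y1 ** 2) - (2.0 ** 2 - 1.0 ** 2))
```

The drift in $x^2 - y^2$ along the computed curve should be at the level of the integration error.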
51. Find the family orthogonal to the family of hyperbolas $xy = c$. The differential equation of this family is $yx' + xy' = 0$. Thus the differential equation of the orthogonal family is

$\frac{x'}{y} = \frac{y'}{x}$ or $xx' - yy' = 0$

which integrates to $x^2 - y^2 = c$.

52. The family orthogonal to the family of parabolas in Example 45 is given by the differential equation

$\frac{1}{y} = \frac{y'}{-2x}$

(here $x$ is the parameter, so $x' = 1$). This integrates to

$2x^2 + y^2 = c$    (4.61)

53. Find the family which makes an angle of $\pi/4$ with the family (4.61). The differential equation of the family (4.61) is

$2xx' + yy' = 0$

The family orthogonal to this family has tangent collinear with $2x + iy$; thus the family we seek has tangent collinear with this vector rotated by $\pi/4$. Thus the tangent field is collinear with $e^{i\pi/4}(2x + iy)$, or, what is the same, $(1 + i)(2x + iy) = 2x - y + i(2x + y)$. Thus, the differential equation is

$\frac{x'}{2x - y} = \frac{y'}{2x + y}$

Envelopes

Many of the families we have been studying have the property that there is a curve (or curves) which is not a member of the family but bounds the family (see Figures 4.39-4.41). Similarly, for a family of lines tangent to a
given curve, the curve bounds the family. Such a bounding curve is called an envelope. We want to see how to find envelopes for families. First of all, some families do not admit envelopes. Clearly, the families $x = c$, $y = c$, $y = x^2 + c$ do not admit envelopes. However, if an envelope exists we can find it by the present techniques.

Definition 7. Let $F$ be a family of curves in the plane. A curve $\Gamma$ is an envelope for the family $F$ if through every point $p$ in $\Gamma$ there goes a curve in $F$ which is tangent to $\Gamma$ at $p$.

Suppose that a family is given implicitly by

$F(x, y, c) = 0$

and that the curve $\Gamma$: $y = f(x)$ is an envelope of this family. Then, for every $x_0$ there is a $c(x_0)$ such that the curve $C$ corresponding to $F(x, y, c(x_0)) = 0$ is tangent to $\Gamma$ at $(x_0, f(x_0))$. Thus we must have

$F(x_0, f(x_0), c(x_0)) = 0$    (4.62)

and since the curve $C$ has the tangent direction $(1, f'(x_0))$, we must have, by (4.54),

$\frac{\partial F}{\partial x}(x_0, f(x_0), c(x_0)) + \frac{\partial F}{\partial y}(x_0, f(x_0), c(x_0))f'(x_0) = 0$    (4.63)

Differentiating (4.62) with respect to $x_0$ we also find

$\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}f'(x_0) + \frac{\partial F}{\partial c}c'(x_0) = 0$    (4.64)

Comparing (4.63) and (4.64) we have as a result

$\frac{\partial F}{\partial c}(x_0, f(x_0), c(x_0))\,c'(x_0) = 0$    (4.65)

Thus, provided $c'(x_0) \ne 0$ (the member of the family touching $\Gamma$ changes from point to point), if $(x, y)$ is on the envelope $\Gamma$, there is a $c$ such that

$F(x, y, c) = 0 \qquad \frac{\partial F}{\partial c}(x, y, c) = 0$
and we can eliminate $c$ from this pair of equations to obtain an implicit equation of $\Gamma$. Notice that from (4.62) and (4.63), the equations

$F(x, y, c) = 0 \qquad \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}y' = 0$

also hold on $\Gamma$. Eliminating $c$ from this pair we obtain once again the differential equation of the family, so the envelope must also satisfy this differential equation.

Examples

54. Find the envelopes of the family

$(x - c)^2 + (y - 1)^2 = 1$    (4.66)

of Example 41. We differentiate with respect to $c$ to find

$-2(x - c) = 0$ or $x = c$

Eliminating $c$ we obtain $(y - 1)^2 = 1$, or $y = 2$, $y = 0$.

55. Find the envelopes of the family

$(x - c)^2 + (y - c)^2 = c^2$    (4.67)

of Example 42. We must eliminate $c$ from this equation and

$-2(x - c) - 2(y - c) = 2c$ or $x + y = c$

Substituting this in (4.67) we obtain

$(-y)^2 + (-x)^2 = (x + y)^2$ or $2xy = 0$

Thus the envelopes are $x = 0$, $y = 0$.
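The envelope conditions $F = 0$, $\partial F/\partial c = 0$ of Example 55 can be spot-checked numerically. In the sketch below the names `F` and `dF_dc` and the central-difference step are choices made here for illustration; the point $(c, 0)$ is where the $c$th circle touches the $x$ axis and $(0, c)$ is where it touches the $y$ axis.

```python
def F(x, y, c):
    """Implicit form (4.67) of the circles tangent to both axes."""
    return (x - c) ** 2 + (y - c) ** 2 - c ** 2

def dF_dc(x, y, c, h=1e-4):
    """Central-difference approximation of the partial derivative of F in c."""
    return (F(x, y, c + h) - F(x, y, c - h)) / (2 * h)

# Points of tangency with the envelopes y = 0 and x = 0 satisfy both conditions.
on_envelope = [(c, 0.0, c) for c in (0.5, 1.0, 2.0)] + [(0.0, c, c) for c in (0.5, 1.0, 2.0)]
checks = [(abs(F(x, y, c)), abs(dF_dc(x, y, c))) for x, y, c in on_envelope]

# A point of the circle that is not on an envelope: F = 0 but dF/dc is not 0.
top_of_unit_circle = (F(1.0, 2.0, 1.0), dF_dc(1.0, 2.0, 1.0))
```

At the top point $(c, 2c)$ of a circle the first condition holds but the second fails, which is why that point does not lie on an envelope.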
56. Find the envelopes of the family $y = x^2 \sin cx$. Differentiation with respect to $c$ yields

$0 = x^3 \cos cx$

or

$x = 0, \qquad cx = \frac{\pi}{2}, \frac{3\pi}{2}, \ldots$

The condition $x = 0$ gives only the origin, which fails as an envelope. But $cx = \pi/2, 3\pi/2$ yields the envelopes $y = \pm x^2$ (Figure 4.43).

57. Find the envelope of the family given by

$y = y'x + 1 + (y')^2$

This is a Clairaut equation and has the solution

$y = cx + 1 + c^2$

Figure 4.43
Differentiation with respect to $c$ yields

$0 = x + 2c$ or $c = -\frac{x}{2}$

Thus the envelope of this family is the curve

$y = -\frac{x^2}{4} + 1$

• EXERCISES

12. Find the differential equations for these families of curves:
(a) $xyc = 1$
(b) $\sin xy - a \cos xy = 0$
(c) $xe^{cy} = 1$
(d) $x \sin y + c \sin x = 0$
(e) $ye^{c(x+y)} = 1$
(f) $\sin(x + y + c) + \cos(x + y + c) = 1$
13. Find the implicit form of the family given by these differential equations:
(a) $xy' - yx' = 0$
(b) $x' + yy' = 1$
(c) $(y')^2 + y^2 = 1$
(d) $y + y'x + \sin y = 0$
(e) $y'(\sec x - \tan x) = 1 - y$
14. Find the implicit form and the differential equation of the family of circles with center on the $y$ axis and tangent to the $x$ axis.
15. Find the family of ellipses with foci at $(-1, 0)$, $(0, 1)$.
16. Find the family of curves orthogonal to the family in Exercise 14; Exercise 15.
17. Find the family orthogonal to the families of Exercises 12(a), (b), (f), 13(b), (d).
18. Find the family making an angle of $\pi/3$ with the family of Exercises 12(a), (b), (c).
19. Find the envelopes of the families of Exercises 12(a), (b), (c), (d), (e).
20. Find the envelopes of these families:
(a) $y = \sin(x - c)^2$
(b) 13(b)
(c) 13(c)
(d) $y = e^x \sin cx$
(e) The family of cardioids $r = (1 + c)^{-1}(1 + c \cos\theta)$.
(f) The family $r = \sin \alpha\theta$.

• PROBLEMS

38. Find the family of evolutes of the parabola $y = x^2$. Find the family orthogonal to this family of evolutes.
39. Find the family orthogonal to the family of spirals $r = ce^{\theta}$.
40. A ladder 10 feet tall originally leaning against a building slips (Figure 4.44). Find the family of curves which are the trajectories of the points on the ladder.
Figure 4.44

41. Find the family of trajectories of the points on the circumference of a ball rolling on a horizontal plane.
42. A line segment of length 2 has its endpoints on the parabola $y = x^2$. Find the trajectories of points $x_0$ on the segment as it slides along the parabola (Figure 4.45).
43. A ball of unit mass is at the end of a string of unit length attached to the top of a vertical bar rotating at constant angular velocity. Find the path of motion of the ball assuming its position and velocity at time $t = 0$ to be $(1, 0, 0)$, $(1, 0, 1)$, respectively. Find the trajectory of any point on the string.
44. Find the family of curves swept out by the midpoints of bars of given length with endpoints along the curve $xy = 1$ in the first quadrant.

Figure 4.45

4.6 Vector Fields and Fluid Flows

We have come across vector fields several times already: the gradient of a function, the gravitational field, and a field of forces are all vector fields. We now want to study such fields in connection with fluid flows: motions of a mass of noninteracting particles.
Figure 4.46

A vector field is a function which assigns to each point in a given domain in $R^n$ a vector in $R^n$, usually considered as based at the given point. Thus, a vector field defined on $D$ in $R^n$ is nothing more than an $R^n$-valued function on $D$, but interpreted pictorially as in Figure 4.46.

Examples

58. A body in space sets up a field of gravitational attraction. Suppose there is a body of unit mass situated at the origin. According to Newton's laws another body of unit mass is attracted to the given body at the origin with a force proportional to the inverse of the distance squared. We represent this attraction at a point $\mathbf{p}$ by a vector directed toward the origin and of length $\|\mathbf{p}\|^{-2}$ (see Figure 4.47). Thus the gravitational field of a body situated at the origin is the vector field defined on $R^3 - \{0\}$ by

$\mathbf{v}(\mathbf{p}) = -\frac{\mathbf{p}}{\|\mathbf{p}\|^3}$

or, in rectangular coordinates $(x, y, z)$,

$\mathbf{v}(x, y, z) = -\frac{(x, y, z)}{(x^2 + y^2 + z^2)^{3/2}}$

59. Given a family of curves, we may consider the field of unit tangents to the family (Figure 4.48). In particular the field of tangents to the family of circles $x^2 + y^2 = c^2$ is defined on $R^2 - \{(0, 0)\}$,
and is given by

$\mathbf{T}(x, y) = \frac{(-y, x)}{(x^2 + y^2)^{1/2}}$

The family of unit tangents to the family of rays is defined on $R^2 - \{(0, 0)\}$ by

$\mathbf{T}(x, y) = \frac{(x, y)}{(x^2 + y^2)^{1/2}}$

Figure 4.47

Figure 4.48
If we are given a vector field $\mathbf{v}$ on a domain $D$ in $R^n$, the questions arise: Is it a field tangent to a family of curves, and if so, can we discover the curves? Suppose then that $\mathbf{v}$ is a given vector field in the domain $D$, and $\Gamma$ is a curve in $D$ such that $\mathbf{v}(\mathbf{x})$ is tangent to $\Gamma$ at each point $\mathbf{x}$ on $\Gamma$. Let $\mathbf{f}$ be a function which parametrizes the curve $\Gamma$. Then $\mathbf{f}'(t)$ is tangent to $\Gamma$ at $\mathbf{f}(t)$, so we must have $\mathbf{f}'(t)$ and $\mathbf{v}(\mathbf{f}(t))$ collinear. In particular then, if $\mathbf{f}$ is a solution of the differential equation

$\mathbf{f}'(t) = \mathbf{v}(\mathbf{f}(t))$

then $\mathbf{f}$ parametrizes a curve tangent to the given field. In the terminology of the preceding section,

$\mathbf{x}' - \mathbf{v}(\mathbf{x}) = 0$

is the (parametric) differential equation of the family of curves tangent to the vector field.

60. Suppose $\mathbf{v}(x, y) = (x, 2y)$ (Figure 4.49). Then the family of curves tangent to the vector field $\mathbf{v}$ is given parametrically by this differential equation:

$x' = x \qquad y' = 2y$    (4.68)
$x(0) = x_0 \qquad y(0) = y_0$

The solution is given by

$x = x_0e^t \qquad y = y_0e^{2t}$    (4.69)

We can write this family of curves implicitly as $y - cx^2 = 0$ (taking the constant $c$ as $y_0x_0^{-2}$). Thus the family we seek is a system of parabolas. Another way to find the implicit equation of the curve is to divide one equation in (4.68) by the other:

$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{2y}{x}$

This we can solve directly by separation of variables.
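Example 60 is easy to reproduce numerically. The sketch below (the helper `flow` and its step count are illustrative assumptions, not from the text) integrates the system (4.68) with a classical Runge-Kutta step and compares the result with the closed form (4.69) and with the invariant $c = y/x^2$.

```python
import math

def v(x, y):
    """The vector field of Example 60."""
    return x, 2.0 * y

def flow(p0, t_end, steps=1000):
    """Integrate (x', y') = v(x, y) from p0 for time t_end with classical Runge-Kutta."""
    h = t_end / steps
    x, y = p0
    for _ in range(steps):
        k1 = v(x, y)
        k2 = v(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = v(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = v(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return x, y

x0, y0 = 1.0, 3.0
x1, y1 = flow((x0, y0), 1.0)
closed_form = (x0 * math.exp(1.0), y0 * math.exp(2.0))   # (4.69) at t = 1
parabola_constant = y1 / x1 ** 2                          # should stay equal to y0 / x0^2
```

The computed point should agree with (4.69), and `parabola_constant` should remain close to $y_0x_0^{-2} = 3$, confirming that the curve stays on the parabola $y = cx^2$.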
Figure 4.49
61. Let $\mathbf{v}(x, y) = (1, x + y)$. Then the differential equation is

$\frac{dy}{dx} = \frac{dy/dt}{dx/dt} = \frac{x + y}{1}$ or $y' = x + y$

which has the general solution $y = -(x + 1) + ce^x$.

Now let us consider a fluid in motion in a domain $D$ in $R^n$. The equations of fluid motion are written as follows. We suppose that at time $t = 0$ there is a particle of fluid at each point $\mathbf{x}_0$ in $D$. The position of that particle at the subsequent time $t$ is denoted by $\phi(\mathbf{x}_0, t)$. The equation of motion then is

$\mathbf{x} = \phi(\mathbf{x}_0, t)$    (4.70)

For a fixed $\mathbf{x}_0$, the curve described by (4.70) is the path of the particle which is at $\mathbf{x}_0$ at time $t = 0$. Thus we are assuming that

$\mathbf{x}_0 = \phi(\mathbf{x}_0, 0)$    (4.71)

It is also assumed that no two particles can ever occupy the same position at the same time. Then for each $t$, the function $\phi(\mathbf{x}_0, t)$ is one-to-one in $\mathbf{x}_0$ and thus can always be inverted: there is also a function $\psi(\mathbf{x}, t)$ which describes the $t = 0$ position of the particle at $\mathbf{x}$ at time $t$, such that

$\mathbf{x} = \phi(\mathbf{x}_0, t)$ if and only if $\mathbf{x}_0 = \psi(\mathbf{x}, t)$

Definition 8. Given the fluid motion described by Equation (4.70), its velocity at the time $t = t_0$ is the vector field

$\frac{\partial\phi}{\partial t}(\mathbf{x}_0, t_0)$

situated at the point $\mathbf{x} = \phi(\mathbf{x}_0, t_0)$. If the vector field is independent of time, we say that the fluid motion is a steady flow.

Thus the velocity field of a flow at time $t_0$ and point $\mathbf{x}$ is the velocity $\mathbf{v}(\mathbf{x}, t_0)$ of the particle which is at $\mathbf{x}$ at that time. If the velocity is independent of the time, or the particular particle, the flow is steady. The flow in a river of constant volume is determined by the shape of the river bed, and thus
tends to be steady, whereas the flow of clouds in the sky is time dependent. If the flow is steady, then the path lines (the curves described by (4.70)) are the curves of the family tangent to the velocity field. If the velocity field is time dependent, then these tangent families (called the lines of force) vary with time and have little to do with the paths of individual particles. This is easy to see. Suppose the flow

$\mathbf{x} = \phi(\mathbf{x}_0, t)$    (4.72)

has the velocity field $\mathbf{v}(\mathbf{x}, t)$. Then the path lines (4.72) are the solutions of the differential equation

$\frac{d\mathbf{x}}{dt} = \mathbf{v}(\mathbf{x}(t), t) \qquad \mathbf{x}(0) = \mathbf{x}_0$    (4.73)

The lines of force at time $t = t_0$ are the solutions of the equation

$\frac{d\mathbf{x}}{d\tau} = \mathbf{v}(\mathbf{x}(\tau), t_0) \qquad \mathbf{x}(0) = \mathbf{x}_0$    (4.74)

These are the same differential equations if and only if $\mathbf{v}(\mathbf{x}, t) = \mathbf{v}(\mathbf{x}, t_0)$ for all $t$, that is, if and only if the flow is steady.

Examples

62. Consider the flow in $R^2$ given by Equation (4.69):

$x = x_0e^t \qquad y = y_0e^{2t}$    (4.75)

Then

$x' = x_0e^t \qquad y' = 2y_0e^{2t}$    (4.76)

Thus the velocity at time $t$ of the particle originally at $(x_0, y_0)$ is $(x_0e^t, 2y_0e^{2t})$. To find the velocity field we must solve (4.75) for $x_0$, $y_0$ in terms of $x$, $y$, $t$ and substitute. Thus (4.76) becomes

$x' = x \qquad y' = 2y$

so the velocity field is $\mathbf{v}(x, y) = (x, 2y)$ and the flow is steady.
63. Consider the flow in $R^3$ given by

$x = x_0 + t \qquad y = y_0 + t^2 \qquad z = z_0 + t^3$
$x' = 1 \qquad y' = 2t \qquad z' = 3t^2$

Thus the velocity field

$\mathbf{v}(x, y, z, t) = (1, 2t, 3t^2)$

is independent of position but is time dependent. In fact, the path lines are independent of position and are just translates of the twisted cubic (Figure 4.50). It is as if all of space were being rigidly translated along the curve $y = x^2$, $z = x^3$. Notice that since the velocity field at any given time $t = t_0$ is a constant field, the lines of force are straight lines.

Figure 4.50

64. $x = x_0 + t$, $y = y_0(1 + t)$, $z = z_0e^t$. Then

$\frac{d\mathbf{x}}{dt} = (1, y_0, z_0e^t)$

so the velocity field is

$\mathbf{v}(x, y, z, t) = \left(1, \frac{y}{1 + t}, z\right);$
the flow is not steady. The lines of force at time $t = t_0$ are the solutions of

$x' = 1 \qquad y' = \frac{y}{1 + t_0} \qquad z' = z$

so this is the family

$x = x_0 + \tau \qquad y = y_0 \exp\left(\frac{\tau}{1 + t_0}\right) \qquad z = z_0e^{\tau}$

which is quite different from the family of path lines.

65. The flow is given by

$x = x_0e^t \qquad y = y_0e^{-t} + x_0(e^t - e^{-t}) \qquad z = z_0e^{2t} - x_0(e^t - e^{2t})$    (4.77)

Then

$x' = x_0e^t \qquad y' = -y_0e^{-t} + x_0e^t + x_0e^{-t} \qquad z' = 2z_0e^{2t} - x_0(e^t - 2e^{2t})$    (4.78)

The Equations (4.77) are linear in $x_0$, $y_0$, $z_0$, so we may solve for these in terms of $x$, $y$, $z$. Doing so, and substituting the result in (4.78), we obtain the velocity field of the flow,

$\mathbf{v}(x, y, z) = (x, 2x - y, 2z + x)$

This flow is time independent, or steady.

It is an immediate consequence of the uniqueness assertion of Picard's theorem that a flow is completely determined by its velocity field. For the flow equation is the solution of the initial value problem (4.73), which is unique. Notice also that the existence part of Picard's theorem asserts that there always is a flow associated with a given velocity field (which is sufficiently smooth). The last remark we care to make at this time (we shall continue the study of fluid flows in Chapter 8) is that in the case of a steady flow, the particles follow one another along a fixed family of paths (whereas in general each particle determines its own path). These are of course the lines of force.
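The difference between path lines and lines of force in Example 64 can be seen by evaluating both through the same starting point. In the sketch below the function names are illustrative; the velocity field is taken to be $\mathbf{v}(x, y, z, t) = (1, y/(1+t), z)$ as computed in Example 64, and a central difference is used to confirm that the path line really satisfies (4.73).

```python
import math

def path_line(p0, t):
    """Position at time t of the particle that starts at p0 (Example 64)."""
    x0, y0, z0 = p0
    return (x0 + t, y0 * (1.0 + t), z0 * math.exp(t))

def line_of_force(p0, t0, tau):
    """Curve through p0 tangent to the velocity field frozen at time t0."""
    x0, y0, z0 = p0
    return (x0 + tau, y0 * math.exp(tau / (1.0 + t0)), z0 * math.exp(tau))

def velocity(p, t):
    """Velocity field of Example 64 at the point p and time t."""
    x, y, z = p
    return (1.0, y / (1.0 + t), z)

p0 = (0.0, 1.0, 1.0)
on_path = path_line(p0, 1.0)              # y-coordinate doubles
on_force_line = line_of_force(p0, 0.0, 1.0)  # y-coordinate grows by a factor e

# Central-difference check that the path line solves dx/dt = v(x(t), t).
t, h = 0.7, 1e-5
numeric = [(a - b) / (2 * h) for a, b in zip(path_line(p0, t + h), path_line(p0, t - h))]
expected = velocity(path_line(p0, t), t)
```

Starting from the same point, the two curves separate in the $y$ coordinate, which illustrates why lines of force say little about individual particle paths when the flow is not steady.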
What we must show is this: If two particles $\mathbf{x}_0$, $\mathbf{x}_1$ occupy the same position at different times (of course), then they follow the same paths. That is, if there are $s_0$, $s_1$ such that

$\phi(\mathbf{x}_0, s_0) = \phi(\mathbf{x}_1, s_1)$

then the curves

$\Gamma_0$: $\mathbf{x} = \phi(\mathbf{x}_0, t) \qquad \Gamma_1$: $\mathbf{x} = \phi(\mathbf{x}_1, t)$

are the same, except for parametrization. The following proposition proves this, and more. It makes explicit the relation between the two parametrizations.

Proposition 7. Suppose $\mathbf{x} = \phi(\mathbf{x}_0, t)$ describes a steady flow. If for some $(\mathbf{x}_0, s_0)$, $(\mathbf{x}_1, s_1)$, we have

$\phi(\mathbf{x}_0, s_0) = \phi(\mathbf{x}_1, s_1)$

then

$\phi(\mathbf{x}_0, s_0 + t) = \phi(\mathbf{x}_1, s_1 + t)$ for all $t$    (4.79)

In particular, $\mathbf{x}_1 = \phi(\mathbf{x}_0, s_0 - s_1)$.

Proof. The proof is simply that the two functions in (4.79) solve the same initial value problem. Let $\mathbf{v}(\mathbf{x})$ be the velocity field of the flow (by assumption $\mathbf{v}$ is time independent). Consider these functions:

$\mathbf{f}_0(t) = \phi(\mathbf{x}_0, s_0 + t) \qquad \mathbf{f}_1(t) = \phi(\mathbf{x}_1, s_1 + t)$    (4.80)

We have $\mathbf{f}_0(0) = \mathbf{f}_1(0)$. Since

$\frac{\partial\phi}{\partial t}(\mathbf{x}_0, t) = \mathbf{v}(\phi(\mathbf{x}_0, t))$
for all $x_0$, $t$, we have
$$\frac{df_0}{dt}(t) = \frac{\partial}{\partial t}\phi(x_0, s_0 + t) = v(\phi(x_0, s_0 + t)) = v(f_0(t)) \qquad \frac{df_1}{dt}(t) = v(f_1(t))$$
Thus $f_0$, $f_1$ solve the same first-order differential equation and by (4.80) have the same value at 0. Thus $f_0 = f_1$ identically.

Planetary Motion

We conclude this chapter with a study of the classical equations of planetary motion. This study first requires these simplifications. We assume all action is in a plane, and that the only force acting is that due to the sun's gravitational field. These simplifications approximate the true situation with enormous accuracy. For the other forces acting on the body are gravitational forces due to other celestial bodies, which are either too far away or too small, relative to the sun, to make a substantial contribution.

According to Newton's laws, the acceleration of a body due to the gravitational field is proportional to the field. The motion is thus completely determined by this force and an initial position and velocity. For if $s = s(t)$ is the equation of motion, then $s$ is the solution of an initial value problem:
$$s(0) = s_0 \qquad s'(0) = v_0 \qquad s''(t) = kF(s(t))$$
where $F$ is the given force field. Our purpose here is to describe the motion of planets in terms of an observed position and velocity. If we locate the sun at the origin, then the gravitational force field is given (in complex notation, and in units for which the constant of proportionality is 1) by
$$F(z) = -\frac{z}{|z|^3}$$
Thus, we must explicitly solve this system
$$z(0) = z_0 \qquad z'(0) = v_0 \qquad z''(t) = -\frac{z(t)}{|z(t)|^3} \tag{4.81}$$
The best way to solve this is by means of polar coordinates. Write $z(t) = r(t)e^{i\theta(t)}$. Then differentiating, we have
$$z' = r'e^{i\theta} + i\theta' r e^{i\theta} \tag{4.82}$$
$$z'' = r''e^{i\theta} + 2i\theta' r' e^{i\theta} + i\theta'' r e^{i\theta} - (\theta')^2 r e^{i\theta} \tag{4.83}$$
and Equation (4.81) becomes
$$z'' = r''e^{i\theta} + 2i\theta' r' e^{i\theta} + i\theta'' r e^{i\theta} - (\theta')^2 r e^{i\theta} = -\frac{e^{i\theta}}{r^2} \tag{4.84}$$
Multiplying through by $e^{-i\theta}$, we obtain
$$r'' - (\theta')^2 r + i(2\theta' r' + r\theta'') = -\frac{1}{r^2}$$
which reduces to this system (equating real and imaginary parts):
$$r'' - (\theta')^2 r = -\frac{1}{r^2} \qquad 2\theta' r' + r\theta'' = 0 \tag{4.85}$$
The second equation reads
$$2(\ln r)' = \frac{2r'}{r} = -\frac{\theta''}{\theta'} = -(\ln \theta')'$$
so either $\theta' = 0$ or $\theta'$ is proportional to $r^{-2}$. We have then these two alternatives. In one case $\theta$ is constant, in which case the planet approaches the sun along a straight line. In the other case, the planet rotates around the sun at an angular velocity inversely proportional to the square of the distance from the sun (the closer it is to the sun the faster it rotates around it). Notice also that the solution $r$ = constant, $\theta'$ = constant is possible, so that an admissible path is that of circular motion of constant angular velocity. The angular velocity decreases as the circle gets larger.

We proceed now to the full solution of (4.84). We already have $\theta' r^2 = h$, a constant (determined by the initial conditions). From (4.84), we obtain
$$z'' = -\frac{e^{i\theta}}{r^2} = -\frac{1}{h}e^{i\theta}\theta' = \frac{i}{h}\left(e^{i\theta}\right)'$$
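The conservation law $\theta' r^2 = h$ just used can be observed numerically. The following sketch integrates $z'' = -z/|z|^3$ with a classical fourth-order Runge-Kutta step and records $\operatorname{Im}(\bar z z')$, which equals $\theta' r^2$ when $z = re^{i\theta}$. The integrator, the step size, and the initial data are assumptions made for this illustration only.

```python
def accel(z):
    """Right-hand side of z'' = -z / |z|^3 in complex notation."""
    return -z / abs(z) ** 3

def rk4_step(z, w, dt):
    """One Runge-Kutta step for the first-order system z' = w, w' = accel(z)."""
    k1z, k1w = w, accel(z)
    k2z, k2w = w + 0.5 * dt * k1w, accel(z + 0.5 * dt * k1z)
    k3z, k3w = w + 0.5 * dt * k2w, accel(z + 0.5 * dt * k2z)
    k4z, k4w = w + dt * k3w, accel(z + dt * k3z)
    z_new = z + dt / 6 * (k1z + 2 * k2z + 2 * k3z + k4z)
    w_new = w + dt / 6 * (k1w + 2 * k2w + 2 * k3w + k4w)
    return z_new, w_new

def angular_momentum(z, w):
    """theta' r^2, computed as the imaginary part of conj(z) * z'."""
    return (z.conjugate() * w).imag

def track_h(z0, v0, dt=1e-3, steps=5000):
    """Values of theta' r^2 along the numerically integrated orbit."""
    z, w = z0, v0
    values = [angular_momentum(z, w)]
    for _ in range(steps):
        z, w = rk4_step(z, w, dt)
        values.append(angular_momentum(z, w))
    return values

if __name__ == "__main__":
    # Hypothetical initial position and velocity.
    hs = track_h(1 + 0j, 0.3 + 0.9j)
    print(max(hs) - min(hs))
```

The spread printed at the end should be tiny compared with $h$ itself; any drift comes from the integrator, not from the equations.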
Thus we can integrate to obtain
$$z' = \frac{i}{h}e^{i\theta} + C$$
where $C = pe^{i\omega}$ is an arbitrary constant. Now, using (4.82) we have
$$r'e^{i\theta} + i\theta' r e^{i\theta} = \frac{i}{h}e^{i\theta} + pe^{i\omega}$$
Multiply through by $e^{-i\theta}$ and equate imaginary parts:
$$\theta' r = \frac{1}{h} + p\sin(\omega - \theta)$$
Once again using $\theta' r^2 = h$, we obtain this implicit relation between $r$ and $\theta$:
$$h = r\left[\frac{1}{h} + p\sin(\omega - \theta)\right] \quad \text{or} \quad r = \frac{h^2}{1 + ph\sin(\omega - \theta)} \tag{4.86}$$
The constants $p$, $h$, $\omega$ are to be determined by the initial conditions. Equation (4.86) is the polar form of the equation of a conic with one focus at the origin. If $ph < 1$, it is an ellipse; $ph = 1$, a parabola; and $ph > 1$, a hyperbola. These are then the possible paths of motion of a planet, or comet, around the sun.

• EXERCISES

21. Find the family of curves tangent to the given vector fields:
(a) $v(x, y) = (x, -y)$
(b) $v(x, y) = (-y, x)$
(c) $v(x, y, z) = (-x, -y, z)$
(d) $v(x, y, z) = (x, -1, z)$

22. Find a field of vectors tangent to these families:
(a) $z = e^{(c+i)t}$
(b) $z = e^{c+it}$
(c) $x = 2ct$, $y = 1 - (ct)^2$
(d) $x = x_0 + t$, $y = e^{t}y_0$, $z = \sin t$
23. Find the velocity field of these flows:
(a) $x(t) = (e^{-t}x_0, \; y_0 + t, \; e^{-t}z_0)$
(b) $x(t) = (x_0(1 + t), \; y_0(1 + t), \; z_0 + t(x_0^2 + y_0^2))$
(c) $x(t) = (x_0, \; y_0 + t, \; z_0\cos t)$
(d) $x(t) = e^{-t}(x_0, \; y_0, \; z_0\cos t)$

24. Find the flow with the given velocity field:
(a) $v(t) = t(x, y, z)$
(b) $v(t) = t(-y, x, 1)$
(c) Exercise 21(b).
(d) Exercise 21(c).

25. Is there a steady flow whose path lines are the trajectories of the particles at $(x_0, y_0, 0)$ at time $t = 0$ in the flow in Exercise 23(b)?

• PROBLEMS

45. Under what conditions on the velocity field of a flow are the lines of force at all times the same as the paths of motion?

46. Consider a flow which spirals around the line $L$: $x = y = z$ at constant angular velocity, whose distance from the origin increases exponentially with time and whose distance from $L$ decreases exponentially with time. Find the velocity field of the flow.

47. If we are given a family of curves in the plane we may consider the tangent field of the family as well as its differential equation, and the tangent field of the orthogonal family as well as its differential equation. How are all these formulas related?

4.7 Summary

The image in $R^n$ of an interval under a one-to-one $C^1$ function with a nowhere vanishing derivative is called a curve. If $\Gamma$ is a curve given by the function
$$x = f(t) \qquad a \le t \le b$$
the variable $t$ is called the parameter of the curve. If
$$x = g(\tau) \qquad \alpha \le \tau \le \beta$$
is another parametrization of the curve, there is a one-to-one function $t = \sigma(\tau)$ mapping the interval $[\alpha, \beta]$ onto the interval $[a, b]$ such that
$$g(\tau) = f(\sigma(\tau)) \qquad \alpha \le \tau \le \beta$$
If $\sigma' > 0$ ($\sigma$ is increasing) we say that the parameters $t$, $\tau$ induce the same orientation on $\Gamma$. This notion divides all parametrizations into two classes. An oriented curve is one for which one of these classes, the well-oriented parameters, is chosen.

If $F$ is a differentiable function of two variables such that $\nabla F$ is never zero, then the equation $F(x, y) = 0$ defines a curve implicitly. For we can find a parametrization
$$x = f(t) \qquad y = g(t)$$
for the set $F(x, y) = 0$. Similarly, if $F$, $G$ are two differentiable functions of three variables such that $\nabla F$ and $\nabla G$ are everywhere independent, the set
$$F(x, y, z) = 0 \qquad G(x, y, z) = 0$$
implicitly defines a curve in $R^3$.

If $\Gamma$ is an oriented curve with a parametrization
$$x = f(t) \qquad a \le t \le b$$
the length of $\Gamma$ between $f(a)$ and $f(t)$ for $a \le t \le b$ is defined to be the least upper bound of all sums
$$\sum_{i=1}^{k} \|f(t_i) - f(t_{i-1})\|$$
over all choices of points $t_0, \ldots, t_k$ such that
$$a = t_0 < t_1 < \cdots < t_k = t$$
If $s(t)$ is this number, the function $s = s(t)$, $a \le t \le b$ gives a parametrization of $\Gamma$. This is the parametrization by arc length. $s$ is the solution of the differential equation
$$s'(t) = \|f'(t)\| \qquad s(a) = 0$$

The unit tangent to a curve $\Gamma$: $x = x(s)$ is the vector $T(s) = x'(s)$. The tangent line is the line through $x(s)$ spanned by this vector. The unit normal to the curve is a choice of unit vector $N(s)$ lying on the line spanned by $T'(s)$. In two dimensions $N$ is chosen so that the rotation $T \to N$ is counterclockwise. In three dimensions the $T$-$N$ plane is called the osculating plane.
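The definition of length as a least upper bound of polygonal sums can be made concrete with a short computation. The sketch below uses uniform partitions only, which is an assumption made for simplicity, and the circular helix is chosen as a sample curve because its speed is the constant $\sqrt{2}$, so the arc length from $0$ to $t$ is $\sqrt{2}\,t$.

```python
import math

def polygonal_length(f, a, t, k):
    """Sum of |f(t_i) - f(t_{i-1})| over the uniform partition a = t_0 < ... < t_k = t."""
    total = 0.0
    previous = f(a)
    for i in range(1, k + 1):
        current = f(a + (t - a) * i / k)
        total += math.dist(current, previous)  # Euclidean distance between points
        previous = current
    return total

def helix(t):
    """Sample curve x = (cos t, sin t, t), with ||x'(t)|| = sqrt(2)."""
    return (math.cos(t), math.sin(t), t)

if __name__ == "__main__":
    exact = math.sqrt(2) * 2 * math.pi
    for k in (10, 100, 1000):
        print(k, polygonal_length(helix, 0.0, 2 * math.pi, k), exact)
```

Each polygonal sum stays below the arc length, and refining the partition pushes the sum upward toward it, which is what the least upper bound in the definition captures.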
The unit binormal is the vector $B$ so that the basis $T \to N \to B$ is a right-handed orthonormal basis: $B = T \times N$. This frame is determined by these differential equations, the Frenet-Serret formulas:
$$T' = \kappa N \qquad N' = -\kappa T + \tau B \qquad B' = -\tau N$$
The scalar functions $\kappa$, $\tau$, the curvature and torsion respectively of the curve, are defined by the first and third equations. The curvature $\kappa$ is the angular velocity of the tangent in the osculating plane and the torsion is the angular velocity of the osculating plane about the tangent. A curve in $R^3$ is uniquely determined (but for Euclidean motions) by its curvature and torsion. A curve in $R^2$ is uniquely determined (but for Euclidean motions) by its curvature.

If $x = f(t)$ is the equation of motion of a particle, we call the curve described by this function the path of motion; $ds/dt$ is the speed, $f'(t)$ is the velocity and $f''(t)$ is the acceleration of the particle. The acceleration vector lies in the osculating ($T$-$N$) plane. We can write
$$\text{acceleration} = a_T T + a_N N$$
where $a_T$ is the tangential acceleration and $a_N$ the normal acceleration. These equations hold:
$$a_T = \frac{d^2 s}{dt^2} \qquad a_N = \left(\frac{ds}{dt}\right)^2 \kappa$$
where $\kappa$ is the curvature of the path of motion.

A family of curves in the plane is a collection of curves $\{\Gamma_c\}$ as $c$ ranges through some set. A pair of equations
$$x = x(t, c) \qquad y = y(t, c)$$
determines a family of curves. This is the explicit form of the family. A functional equation
$$F(x, y, c) = 0$$
also determines a family. This is the implicit form of the family. The set of solutions of a differential equation
$$a(x, y)x' + b(x, y)y' = 0 \tag{4.87}$$
forms a family of curves in the plane. If
$$F(x, y, c) = 0 \tag{4.88}$$
is the implicit form of a family, its differential equation is found by eliminating $c$ from (4.88) and
$$\frac{\partial F}{\partial x}x' + \frac{\partial F}{\partial y}y' = 0$$
If (4.87) is the differential equation of a family $F$, the family of curves orthogonal to the family $F$ is given by the differential equation
$$-b(x, y)x' + a(x, y)y' = 0$$

A vector field in a domain $U \subset R^n$ is an $R^n$-valued function defined in $U$. The vector associated to a particular point in $U$ is depicted as originating at that point.

A fluid flow is given by the function
$$x = \phi(x_0, t)$$
with these properties:
(i) $\phi(x_0, 0) = x_0$.
(ii) $\phi$ has continuous partial derivatives.
(iii) For each $t$ the function $x = \phi(x_0, t)$ is invertible.

The curves $x = \phi(x_0, t)$, $x_0$ fixed, are the paths of motion of the flow. The velocity $v(x, t)$ of the particle at $x$ at time $t$ is the velocity field of the flow:
$$v(x, t) = \left.\frac{\partial \phi}{\partial t}(x_0, t)\right|_{x = \phi(x_0, t)}$$
If $v$ is independent of $t$, the flow is steady. The velocity field of a flow completely determines the flow, for the paths of motion are obtained by solving the differential equation
$$\frac{dx}{dt} = v(x, t) \qquad x(0) = x_0$$
When the flow is steady the paths of motion do not change with time, and particles on the same path remain on the same path.
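The statement that the velocity field determines the flow can be illustrated by recovering a flow numerically from its field. The sketch below takes the steady field $v(x, y) = (-y, x)$ of Exercise 21(b), integrates $dx/dt = v(x)$ with a Runge-Kutta scheme (the choice of scheme and step count is an assumption of this example), and compares the result with the rotation flow, which solves the same initial value problem.

```python
import math

def v(p):
    """Steady velocity field v(x, y) = (-y, x)."""
    x, y = p
    return (-y, x)

def rk4_flow(p0, t, steps=1000):
    """Approximate phi(p0, t) by integrating dx/dt = v(x), x(0) = p0."""
    dt = t / steps
    x, y = p0
    for _ in range(steps):
        k1 = v((x, y))
        k2 = v((x + 0.5 * dt * k1[0], y + 0.5 * dt * k1[1]))
        k3 = v((x + 0.5 * dt * k2[0], y + 0.5 * dt * k2[1]))
        k4 = v((x + dt * k3[0], y + dt * k3[1]))
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return (x, y)

def exact_flow(p0, t):
    """Rotation through angle t, the exact flow of v."""
    x0, y0 = p0
    return (x0 * math.cos(t) - y0 * math.sin(t),
            x0 * math.sin(t) + y0 * math.cos(t))

if __name__ == "__main__":
    p0 = (1.0, 0.5)  # hypothetical starting point
    print(rk4_flow(p0, 2.0), exact_flow(p0, 2.0))
    # Steady flows compose: following a particle for time s, then t, is time s + t.
    print(rk4_flow(rk4_flow(p0, 0.7), 1.3))
```

The last line reflects the behavior described for steady flows: a particle restarted from a point on a path continues along that same path.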
• FURTHER READING

In addition to the bibliography at the end of Chapter 3, we should also mention these excellent texts on differential geometry:

D. Struik, Lectures on Classical Differential Geometry, Addison-Wesley, Reading, Mass., 1950.
H. Guggenheimer, Differential Geometry, McGraw-Hill, New York, 1963.
R. T. Seeley, Calculus, Scott-Foresman, Glenview, Ill., 1967, has a derivation of Newton's law of gravitational attraction from Kepler's laws.

• MISCELLANEOUS PROBLEMS

48. Suppose that $\gamma$ is a closed curve in the plane which lies outside the unit disk and encircles the origin. Show that the length of $\gamma$ is at least $2\pi$.

49. Suppose that $\gamma$ is a closed curve lying completely inside the unit disk with the property that it crosses every ray once and only once. Is there an a priori bound on the length of $\gamma$?

50. Suppose that $\gamma$ is a curve as described in Problem 49, whose curvature is bounded by 1. Is there now a bound to the length of $\gamma$?

51. A pendulum consists of a body of mass $m$ hanging on a rope of length $L$ which is fixed at one end. If the mass is displaced from the vertical and let go it will swing along the circle of radius $L$. Find the differential equation of the motion.

52. Suppose a particle is moving along the curve of Example 20 at constant speed. Find the speed of its projection onto the $xy$ plane.

53. Suppose that a particle moves along the right circular cone according to the equation
$$x = (\cos t, \sin t, 2t)$$
Find the equation of motion of the projection of the particle on the plane $x = 1$.

54. A horse is running around the elliptical track $x^2 + 2y^2 = 1$ at constant speed. There is a wall along the line $y = -1$ and a floodlight at the point $(0, 1)$ which casts the horse's shadow on the wall. Find the equation of motion of the shadow.

55. A man six feet tall walks at constant speed along a straight line passing directly beneath a street lamp 12 feet off the ground. Find the equation of motion of the head of the man's shadow cast by the street lamp.
56. A loose foot bridge of length L hangs across a chasm of width W (L > W). A man appears at one entrance on a pair of roller skates.
Suddenly he lets go and begins skating down the bridge. Assuming the only forces acting on him are those due to gravity and the restraining forces of the bridge, find the differential equation governing his motion.

57. Why does a river going around a curve wear out the far bank and deposit silt along the near bank directly after the curve?

58. Suppose a disk of radius $r$ rolls with constant speed (at the center) along a disk of radius $R$ in the plane. Find the equation of motion of a typical point on the circumference of the smaller disk.

59. Find the differential equation of the motion of a ball rolling in a parabolic dish (with profile $y = x^2$) starting at rest at some point other than the center.

60. Assuming that the population of the organisms on a given remote island remains bounded, can you say anything about the eigenvalues of the biotic matrix?

61. Find the curvature and torsion of these curves in $R^3$:
(a) $x = (u - \sin u, \; 1 - \cos u, \; 3u)$
(b) $x = (\sin u, \; 1 + \cos u, \; \sin u)$

62. Let $x = x(t)$ be the equation of a curve $\gamma$ in $R^3$ whose tangent vector $T(t)$ traces out a circle on the sphere. Show that $\gamma$ is a helix.

63. A general helix is a curve lying on a surface of revolution $z = f(r)$ which cuts the curves $z = f(r)$, $\theta$ = constant at a fixed angle. Show that the ratio $\kappa/\tau$ is constant on a helix.

64. Find the curve on the $xy$ plane onto which a helix on a cone projects.

65. Let $\gamma_1$ and $\gamma_2$ be two space curves for which we have a point for point correspondence such that the line joining corresponding points is the normal line to both curves. Show that the line segment between corresponding points has constant length.

66. Let $x = x(t)$ be an $R^n$-valued function of a real variable which is $n$-times continuously differentiable. Then the image $\gamma$ of $x$ is a curve in $R^n$. The Frenet-Serret frame of $\gamma$ is the orthonormal set obtained by applying the Gram-Schmidt process to the vectors $x'(t), x''(t), \ldots,$
$x^{(n)}(t)$.
(a) Show that for $n = 3$, the Frenet-Serret frame is $T$, $\pm N$, $\pm B$.
(b) Show that if there are only $k$ independent vectors in the Frenet-Serret frame at every point, the curve lies in a linear subspace of dimension $k$.
(c) Suppose that the Frenet-Serret frame $T_1, T_2, \ldots, T_n$ is a basis. Show that the matrix representing the vectors $dT_1/ds, \ldots, dT_n/ds$ in this basis is skew-symmetric. These are the generalized Frenet-Serret formulas.

67. Find the Frenet-Serret formulas for the curve
$$x = (\cos t, \sin t, t, 2t)$$
in $R^4$.
68. Kepler's laws of planetary motion (from which Newton derived his law of gravitational attraction) are these:
I. For each planet the ray from the sun to the planet sweeps out equal areas in equal times.
II. The path of motion of each planet is an ellipse with the sun at one focus.
III. The square of the time period required to make one revolution is proportional to the cube of the major axis of the ellipse. This constant of proportionality is the same for all the planets.
In the text we have derived Kepler's second law from Newton's laws. Now derive Kepler's first and third laws.
Chapter 5 SERIES OF FUNCTIONS

We have already run into series developments of functions several times: the exponential, sine, and cosine functions were expanded into power series; Taylor's theorem provides a way to develop series expansions for suitable functions; the exponential of a matrix gives us the only sure way to "solve" a system of constant coefficient linear equations. We shall see in this chapter that a general technique for solving a differential equation involves approximation of the solution by series expansions.

We shall begin by formulating the definition of convergence of a series of continuous functions and verifying the general criteria guaranteeing convergence. One of the most important of series expansions is that of power series. We shall say that a function is analytic if it can be locally developed into a power series. We shall finally verify the fundamental theorem of algebra and complete the discussion of constant coefficient equations. We have delayed this until now because the kind of analytic techniques involved in the fundamental theorem are those which are most appropriately developed for the class of analytic functions. Further techniques for operating with power series will be explored, as well as the question of estimation of the error in replacing the power series by a partial sum.
5.1 Convergence

Definition 1. Let $\{f_k\}$ be a sequence of continuous functions defined on a subset $X$ of $R^n$. The series formed of the $\{f_k\}$ is the sequence of partial sums $\{\sum_{k=1}^{n} f_k\}$. We say that the series converges if the sequence of these partial sums converges (in the sense of Definition 19 of Chapter 2), and denote the limit by $\sum_{k=1}^{\infty} f_k$. Precisely then, $f = \sum_{k=1}^{\infty} f_k$ if, corresponding to every $\varepsilon > 0$, there is an $N$ such that
$$\left| f(x) - \sum_{k=1}^{n} f_k(x) \right| < \varepsilon \quad \text{for all } n \ge N \text{ and } x \in X$$

Since the limit of a uniformly convergent sequence of functions is continuous (Theorem 2.14), we can assert that the sum of a convergent series of continuous functions is continuous. Likewise, from the Cauchy criterion for sequences, we obtain a corresponding criterion for the convergence of series.

Proposition 1. (Cauchy Criterion) Let $\{f_k\}$ be a sequence of continuous functions. The series $\sum f_k$ converges if and only if, to each $\varepsilon > 0$ there corresponds an $N$ such that
$$\left\| \sum_{k=n+1}^{m} f_k \right\| < \varepsilon \quad \text{for all } m > n \ge N$$

Proof. We must show that the sequence $g_n = \sum_{k=1}^{n} f_k$ satisfies the Cauchy criterion. For a given $\varepsilon > 0$, let $N$ be as in the proposition. Then, for any $x$, $m > n \ge N$,
$$|g_m(x) - g_n(x)| = \left| \sum_{k=n+1}^{m} f_k(x) \right| \le \left\| \sum_{k=n+1}^{m} f_k \right\| < \varepsilon$$
Thus $\|g_m - g_n\| < \varepsilon$, so the proposition is proven.

Notice that the Cauchy criterion is guaranteed if the series of real numbers $\sum_{k=1}^{\infty} \|f_k\|$ converges (for $\|\sum_{k=n+1}^{m} f_k\| \le \sum_{k=n+1}^{m} \|f_k\|$). This gives us a powerful technique for verifying convergence of series.

Definition 2. Let $\{f_k\}$ be a sequence of continuous functions defined on a set $X$. The series is said to converge absolutely if $\sum_{k=1}^{\infty} \|f_k\| < \infty$.
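The remark that convergence of $\sum \|f_k\|$ forces the Cauchy criterion can be checked on a concrete series. The sketch below takes $f_k(x) = \cos(kx)/k^2$, for which $\|f_k\| = 1/k^2$, and compares the size of a block $f_{n+1} + \cdots + f_m$ with the corresponding block of norms. The supremum is only sampled on a finite grid, which is an assumption of this illustration; the grid contains $x = 0$, where every term attains its maximum.

```python
import math

def f(k, x):
    """k-th term of the sample series: cos(kx) / k^2."""
    return math.cos(k * x) / k ** 2

def block_sup(n, m, grid):
    """Largest |f_{n+1}(x) + ... + f_m(x)| over the sample points in grid."""
    return max(abs(sum(f(k, x) for k in range(n + 1, m + 1))) for x in grid)

def block_bound(n, m):
    """Sum of the norms ||f_k|| = 1/k^2 for n < k <= m."""
    return sum(1.0 / k ** 2 for k in range(n + 1, m + 1))

# Sample points in [0, 2*pi), including x = 0.
grid = [2 * math.pi * j / 400 for j in range(400)]

if __name__ == "__main__":
    for n, m in ((10, 50), (100, 200)):
        print(n, m, block_sup(n, m, grid), block_bound(n, m))
```

Because the blocks of norms shrink as $n$ grows (they are smaller than $1/n$), so do the blocks of functions, uniformly in $x$.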
Of course, as remarked above, an absolutely convergent series is convergent. In the case of absolute convergence we can pose a comparison test, just as for series of numbers.

Theorem 5.1. (Comparison Test) Let $\{f_k\}$ be a sequence of continuous functions defined on a set $X$. Suppose there is a sequence $\{p_k\}$ of positive numbers and an integer $N > 0$ such that
(i) $\|f_k\| \le p_k$ for $k \ge N$
(ii) $\sum_{k=1}^{\infty} p_k < \infty$
Then $\sum f_k$ converges absolutely.

Proof. The verification is the same as that for number series (Theorem 2.3).

Examples

1. $\sum_{k=0}^{\infty} z^k = 1/(1 - z)$ uniformly and absolutely in $\{|z| \le r\}$ for any $r < 1$. For $\|z^k\| \le r^k$ in that domain, and $\sum r^k < \infty$.

2. $\sum z^k$ does not converge uniformly in $\{|z| < 1\}$. In fact, the series is not a Cauchy sequence of functions, because for every $n$,
$$\left\| \sum_{k=1}^{n+1} z^k - \sum_{k=1}^{n} z^k \right\| = \|z^{n+1}\| = 1$$
Thus, for $\varepsilon = \tfrac{1}{2}$, say, there is no $N$ such that $\|\sum_{k=n+1}^{m} z^k\| < \tfrac{1}{2}$ for all $m > n \ge N$; in fact, not even for $m = n + 1$.

3. $e^z = \sum_{k=0}^{\infty} z^k/k!$ converges uniformly in any disk $\{|z| \le R\}$ with $R$ finite. Again, by comparison
$$\left\| \frac{z^k}{k!} \right\| \le \frac{R^k}{k!} \quad \text{and} \quad \sum \frac{R^k}{k!} < \infty$$
4. $\sum_{n=1}^{\infty} \dfrac{\cos nx}{n^2}$ converges uniformly on the whole real line. For any $x$,
$$\left| \frac{\cos nx}{n^2} \right| \le \frac{1}{n^2}$$
Since $\sum 1/n^2 < \infty$ the comparison test easily applies.

5. If $\{a_k\}$ is any sequence of numbers such that $\sum |a_k| < \infty$, then $f(z) = \sum_{k=1}^{\infty} a_k z^k$ is a continuous function on the closed unit disk. The series converges uniformly since $\|a_k z^k\| \le |a_k|$.

Finally, for the purpose of availability, we record the obvious extensions to series of the propositions concerning integration and differentiation of sequences of functions.

Proposition 2.
(i) Let $\{f_n\}$ be a sequence of continuous functions on the interval $[a, b]$. Let $g_n(x) = \int_a^x f_n$. If the series of functions $\sum f_n$ converges, so does the series $\sum g_n$, and
$$\sum_{n=1}^{\infty} \int_a^x f_n = \int_a^x \left( \sum_{n=1}^{\infty} f_n \right) \tag{5.1}$$
(ii) Let $\{f_n\}$ be a sequence of continuously differentiable functions on the interval $[a, b]$. Let $g_n = f_n'$. If the series of functions $\sum g_n$ converges, and for some $c$, $\sum f_n(c)$ converges, then the series $\sum f_n$ converges. The limit is continuously differentiable and
$$\left( \sum f_n \right)' = \sum f_n' \tag{5.2}$$

Examples

6. $-\ln(1 - x) = \sum_{k=1}^{\infty} \dfrac{x^k}{k}$, $\quad -1 < x < 1$
This follows by integrating the geometric series (Example 1) term by term:
$$\int_0^x \frac{dt}{1 - t} = \sum_{k=0}^{\infty} \int_0^x t^k \, dt$$
$$-\ln(1 - x) = \sum_{k=0}^{\infty} \frac{x^{k+1}}{k + 1} = \sum_{k=1}^{\infty} \frac{x^k}{k}$$

7. $f(x) = \sum_{n=1}^{\infty} \dfrac{\cos nx}{n!}$ is infinitely differentiable. For the differentiated series
$$-\sum_{n=1}^{\infty} \frac{\sin nx}{(n - 1)!} \tag{5.3}$$
is also convergent. By Proposition 2(ii) the sum is $f'(x)$. Similarly, the series (5.3) can be differentiated term by term, and gives
$$-\sum_{n=1}^{\infty} \frac{n \cos nx}{(n - 1)!}$$
which is again convergent.

8. We can develop a series expansion for $\arctan x$ according to the following observations. From the geometric series
$$\frac{1}{1 - x} = \sum_{k=0}^{\infty} x^k$$
we obtain by substituting $-x^2$ for $x$
$$\frac{1}{1 + x^2} = \sum_{k=0}^{\infty} (-1)^k x^{2k}$$
Integrate:
$$\arctan x = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k + 1}$$
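The expansion of Example 8 is easy to test by computing partial sums. In the sketch below the function name and the sample point $x = 0.5$ are assumptions for illustration. The error bound used in the comparison is the standard alternating-series estimate (the error is at most the first omitted term); it is quoted here as a known fact rather than something derived in the text above.

```python
import math

def arctan_partial(x, terms):
    """Sum of (-1)^k x^(2k+1) / (2k+1) for k = 0, ..., terms - 1."""
    return sum((-1) ** k * x ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

if __name__ == "__main__":
    x = 0.5  # sample point inside (-1, 1)
    for terms in (2, 5, 10):
        approx = arctan_partial(x, terms)
        first_omitted = x ** (2 * terms + 1) / (2 * terms + 1)
        print(terms, approx, abs(approx - math.atan(x)), first_omitted)
```

For $|x|$ well inside the unit interval the partial sums settle quickly; near $|x| = 1$ many more terms are needed.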
5.1 Convergence 405 EXERCISES 1. For what values of χ do these series of functions converge absolutely: (a) £ 2"x" (d) | (x+18)" " COS ИХ » (b) 2 —— (e) Σ e "* n=l .X ft=o (c) f eni (Ο Σ*01'' n = 0 n=0 2. In which domains of the complex plane do these series converge? (a) | nz" (b) 2 -£- (c) | b-V- 11 = 0 n = 0 (,^Wj! n = 0 3. Which of these series can be differentiated or integrated on their domain of convergence? (a) Exercise 1(a) (c) Exercise 1(d) к» COS ИХ (b) Exercise 1(b) (d) У —-— n=o П2 4. Find the power series expansion for these functions: (a> «hy (c) 1>л d /1 +x\ r' dt <"> ϊΜ <«» J.t+? PROBLEMS 1. (a) Find a power series expansion for sin χ cos x. {Hint: 2 sin χ cos χ = sin 2x.) (b) Find power series expansions for sin2 χ and cos2 χ 2. Prove Proposition 2. 3. Show that lim 2 — = loS 2 *-.- 1 11=1 П Can you conclude » (—1)" Σ — =log2? п = 0 П
5.2 The Fundamental Theorem of Algebra

For the remainder of this chapter we restrict attention exclusively to complex-valued functions of a complex variable. The simplest class of such functions are the polynomial functions, that is, functions of the form
$$P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_1 z + a_0$$
(where the $a_i$ are complex numbers). We shall always assume $a_n \neq 0$; in this case $n$ is called the degree of $P$. It is a basic fact of mathematics that every polynomial has a root, that is, there is a number $c \in C$ such that $P(c) = 0$. The proof of this fact consists in a systematic investigation of the analytic properties of polynomials. First, we recall de Moivre's theorem.

Lemma. Every nonzero complex number has $n$ distinct $n$th roots.

Proof. Let $c \in C$, $c \neq 0$. For this purpose, the polar representation $c = re^{i\theta}$ is most convenient. An $n$th root of $c$ is a number $w = \rho e^{i\phi}$ such that $\rho^n = r$ and $e^{in\phi} = e^{i\theta}$, that is, $n\phi - \theta$ is an integral multiple of $2\pi$. Let
$$\alpha_1 = \frac{2\pi}{n}, \quad \alpha_2 = \frac{4\pi}{n}, \quad \ldots, \quad \alpha_k = \frac{2\pi k}{n}, \quad \ldots, \quad \alpha_{n-1} = \frac{2\pi(n - 1)}{n}, \quad \alpha_n = 2\pi$$
Then $\omega_1 = \exp(i\alpha_1), \ldots, \omega_n = \exp(i\alpha_n)$ are all distinct and have the property $(\omega_k)^n = 1$. These are called the $n$th roots of unity. Now, if $\rho = r^{1/n}$ and $\phi = \theta/n$, then $(\rho e^{i\phi})^n = re^{i\theta} = c$. The numbers $\rho e^{i\phi}\omega_1, \ldots, \rho e^{i\phi}\omega_k, \ldots, \rho e^{i\phi}\omega_n$ are then all distinct, and are all $n$th roots of $c$.

Now, we need two deeper facts depending on the continuity properties of polynomials. The first is intuitively clear: that $|P(z)|$ gets arbitrarily large as $z \to \infty$. The second is the crucial fact for the fundamental theorem: the place where a polynomial has a minimum modulus must be a root.

Lemma. Let $P(z) = a_n z^n + \cdots + a_1 z + a_0$ be a polynomial of degree $n > 0$.
(i) $\lim_{|z| \to \infty} |P(z)| = \infty$, that is, given any $M > 0$ there is a $K > 0$ such that $|P(z)| \ge M$ whenever $|z| \ge K$.
(ii) If $P(z_0) \neq 0$, then $z_0$ cannot be a minimum point for $|P|$; that is, there are $z$ close to $z_0$ such that $|P(z)| < |P(z_0)|$.
Proof. (i) The point here is that the highest-degree term of $P$ is the dominating term as regards the behavior of $P$ as $z \to \infty$. For $z \neq 0$,
$$|P(z)| = |z|^n \left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right|$$
If $|z| \ge K \ge 1$, then $|z|^{n-k} \ge K$ also for $k < n$, so
$$\left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right| \ge |a_n| - \frac{1}{K} \sum_{k=0}^{n-1} |a_k|$$
Let $M > 0$ be given, and choose
$$K = \max\left( 1, \; 2M|a_n|^{-1}, \; 2|a_n|^{-1} \sum_{k=0}^{n-1} |a_k| \right)$$
Then, for $|z| \ge K$,
$$\left| a_n + \sum_{k=0}^{n-1} \frac{a_k}{z^{n-k}} \right| \ge |a_n| - \frac{1}{K} \sum_{k=0}^{n-1} |a_k| \ge \frac{1}{2}|a_n|$$
Thus
$$|P(z)| \ge |z|^n \cdot \tfrac{1}{2}|a_n| \ge K \cdot \tfrac{1}{2}|a_n| \ge M$$

(ii) Suppose now that $P(z_0) \neq 0$. Let
$$Q(z) = (P(z_0))^{-1} P(z + z_0)$$
Then $Q$ is also a polynomial, $Q(0) = 1$, and we must show that 0 is not a minimum point for $|Q|$. Let
$$Q(z) = 1 + \sum_{k=1}^{n} a_k z^k = 1 + z^m(a_m + z g(z))$$
where $m$ is chosen as the least positive integer $k$ for which $a_k \neq 0$, and $g(z) = \sum_{k=m+1}^{n} a_k z^{k-(m+1)}$. $g$ is a polynomial and is thus continuous (and that is all we need to know about $g$). Here again we want to use the fact that for small $z$, $z^m$ dominates $z^{m+1}$, so $Q$ is very close to the polynomial $1 + z^m a_m$, which has no minimum modulus at 0 (choose $z$ so that $z^m = -r/a_m$ with $r < 1$). In our case, we choose an $m$th root of $-a_m^{-1}$; call it $z_0$, and consider the function
$Q(rz_0)$ of a real variable $r$. We have
$$Q(rz_0) = 1 + r^m(-1 + r h(r))$$
where $h(r) = -a_m^{-1} z_0 \, g(rz_0)$ is a continuous complex-valued function. Thus, for $0 < r < 1$,
$$|Q(rz_0)| \le |1 - r^m| + r^{m+1}|h(r)|$$
Now $\lim_{r \to 0} r h(r) = 0$, so we can choose $r_0 < 1$ small enough so that $|r_0 h(r_0)| < \tfrac{1}{2}$. Then
$$|Q(r_0 z_0)| \le 1 - r_0^m + r_0^m |r_0 h(r_0)| < 1 - \tfrac{1}{2} r_0^m < 1$$
which proves part (ii).

Theorem 5.2. (Fundamental Theorem of Algebra) Let $P$ be a polynomial of positive degree. There is a $z_0 \in C$ such that $P(z_0) = 0$.

Proof. Let $P(0) = c_0$. By part (i) of the lemma, there is a $K > 0$ such that for $|z| \ge K$, $|P(z)| > |c_0|$. Now $\Delta = \{z \in C : |z| \le K\}$ is compact, so $|P|$ attains a minimum value on $\Delta$, say at $z_0$. But then $z_0$ is a minimum point for all of $C$. For, since $0 \in \Delta$, $|P(z_0)| \le |P(0)| = |c_0|$, and for $z \notin \Delta$, $|P(z)| > |c_0| \ge |P(z_0)|$. Thus, even for $z \notin \Delta$, we have $|P(z_0)| < |P(z)|$. But then, by part (ii), there is no alternative: we must have $P(z_0) = 0$.

Factorization Theory

We should recall that if $c$ is a zero of the polynomial $P$, then $z - c$ factors $P$ (this is proven below in Theorem 5.3). Thus $P(z) = (z - c)Q(z)$ and $Q$ has degree 1 less than that of $P$. If $\deg Q > 0$, $Q$ has a zero $c'$, which is also a zero of $P$. Further, $Q(z) = (z - c')Q'(z)$ and we can repeat this argument in order to find exactly $\deg P$ zeros of $P$. This is the factorization theorem of algebra.

Theorem 5.3. (Factorization Theorem) Let $P$ be a polynomial of degree $n > 0$. There are complex numbers $a \neq 0$, $z_1, \ldots, z_n$ such that
$$P(z) = a(z - z_1) \cdots (z - z_n)$$

Proof. The proof is by induction on $n$. If $n = 1$ the situation is simple:
$$P(z) = a_1 z + a_0 = a_1\left( z + \frac{a_0}{a_1} \right)$$
(since $a_1 \neq 0$). Now we consider the case of general degree $n$, assuming the theorem for polynomials of degree $n - 1$. By Theorem 5.2, there is a point $c$ such that $P(c) = 0$. Then
$$P(z) = P(z) - P(c) = \sum_{k=1}^{n} a_k(z^k - c^k) = \sum_{k=1}^{n} a_k (z - c)\left( \sum_{j=0}^{k-1} z^j c^{k-1-j} \right) = (z - c) \sum_{k=1}^{n} \sum_{j=0}^{k-1} a_k z^j c^{k-1-j}$$
The factor on the right is a polynomial of degree $n - 1$, so the induction assumption applies: it can be written as $a(z - z_1) \cdots (z - z_{n-1})$ for suitable $a \neq 0$, $z_1, \ldots, z_{n-1}$. Thus, writing $c = z_n$, we obtain
$$P(z) = a(z - z_1) \cdots (z - z_n) \tag{5.4}$$

This factorization is clearly unique, except for the order of the $z_i$'s: $a$ is the leading coefficient of $P$ and $\{z_1, \ldots, z_n\}$ are the roots of $P$. Of course, $z_1, \ldots, z_n$ need not be distinct; let $r_1, \ldots, r_s$ be the set of distinct roots. If we let $m_i$ be the number of occurrences of the root $r_i$ in the list $\{z_1, \ldots, z_n\}$, $m_i$ is called the multiplicity of the root $r_i$. We can rewrite (5.4) as
$$P(z) = a(z - r_1)^{m_1} \cdots (z - r_s)^{m_s} \tag{5.5}$$
and clearly $m_1 + \cdots + m_s = n$, the degree of $P$.

Before concluding this section we should remark on the factorization of real polynomials. Real polynomials need not have real roots (viz., $z^2 + 1 = 0$), but their complex roots come in conjugate pairs. Let $P(z) = a_n z^n + \cdots + a_1 z + a_0$ be a real polynomial. If $P(r) = 0$, then
$$P(\bar r) = a_n(\bar r)^n + \cdots + a_1 \bar r + a_0 = \overline{a_n r^n + \cdots + a_1 r + a_0} = \overline{P(r)} = 0$$
so $\bar r$ is also a root of $P$. Since
$$(z - r)(z - \bar r) = z^2 - (r + \bar r)z + r\bar r = z^2 - 2\operatorname{Re}(r)z + |r|^2$$
the polynomial $(z - r)(z - \bar r)$ has real coefficients. Thus, if we rearrange the roots of $P$ into the real roots $r_1, \ldots, r_k$ and the conjugate pairs $r_{k+1}, \bar r_{k+1}, \ldots, r_t, \bar r_t$, we can rewrite the factorization (5.5) into a product of linear and quadratic real polynomials:
$$P(z) = a(z - r_1)^{m_1} \cdots (z - r_k)^{m_k} \left( z^2 - 2\operatorname{Re}(r_{k+1})z + |r_{k+1}|^2 \right)^{m_{k+1}} \cdots \left( z^2 - 2\operatorname{Re}(r_t)z + |r_t|^2 \right)^{m_t}$$
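The step $P(z) = (z - c)Q(z)$ in the factorization argument can be carried out mechanically. The sketch below does the division with Horner's scheme, which is a standard alternative to the identity for $z^k - c^k$ used in the proof, not the method of the text; the function name and the sample polynomial are assumptions for illustration. The remainder it returns is $P(c)$, so it vanishes exactly when $c$ is a root.

```python
def deflate(coeffs, c):
    """Divide P(z) = coeffs[0]*z^n + ... + coeffs[n] by (z - c) with Horner's scheme.

    Returns (quotient_coeffs, remainder); the remainder equals P(c).
    """
    partial = []
    carry = 0
    for a in coeffs:
        carry = carry * c + a  # Horner update
        partial.append(carry)
    return partial[:-1], partial[-1]

if __name__ == "__main__":
    # Sample polynomial z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3).
    p = [1, -6, 11, -6]
    for root in (1, 2, 3):
        p, remainder = deflate(p, root)
        print(root, p, remainder)
```

Peeling off one root at a time in this way leaves, at the end, only the leading coefficient $a$ of (5.4). With a conjugate pair of roots of a real polynomial, two successive deflations leave a quotient with real coefficients again.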
• PROBLEMS

4. Let $\omega_1, \ldots, \omega_n$ be the $n$ $n$th roots of unity. Show that they are arranged at $n$ equidistant points around the unit circle. Show that the sets $\{\omega_1, \ldots, \omega_n\}$, $\{\omega_1, \omega_1^2, \ldots, \omega_1^n\}$ are the same, if $\omega_1$ is the nearest such point to 1.

5. Let $\omega_1, \ldots, \omega_n$ be the $n$th roots of unity. Choose $k$ so that $kn - 2$ is divisible by 4. Show that $i^k\omega_1, \ldots, i^k\omega_n$ are the $n$th roots of $-1$.

6. Show that:
(a) $\deg PQ = \deg P + \deg Q$.
(b) $\deg(P + Q) = \max(\deg P, \deg Q)$ if $\deg P \neq \deg Q$.
(c) When is the equation in (b) not true?

7. Given two polynomials $P$, $Q$ show that there is a polynomial $R$ which factors both $P$, $Q$ and is factored by any polynomial which factors both $P$, $Q$. $R$ is called the greatest common divisor of $P$ and $Q$.

8. Show that a real polynomial of odd degree has a real root.

9. Prove that the polynomial $1 + z^m a_m$ ($m > 0$) has no minimum modulus at $z = 0$.

10. For $P(z) = \sum_{n=0}^{k} a_n z^n$ a polynomial, let
$$P'(z) = \sum_{n=1}^{k} n a_n z^{n-1}$$
(a) Verify that the transformation $P \to P'$ is linear and satisfies $(PQ)' = PQ' + P'Q$ (by induction on $\deg P$). ($P \to P'$ is a complex analog of differentiation.)
(b) Prove that $r$ is a multiple root of $P$ if and only if $P(r) = 0$ and $P'(r) = 0$.
(c) Define $P'' = (P')'$, $P''' = (P'')'$, and so on. Then $r$ is a root of $P$ of at least multiplicity $m$ if and only if
$$P(r) = P'(r) = \cdots = P^{(m-1)}(r) = 0$$

5.3 Constant Coefficient Linear Differential Equations

Now that we know the factorization theorem for polynomials we can return to complete the study of constant coefficient equations in one unknown function. Let $L$ be a constant coefficient differential operator of order $k$; that is, $L$ is a mapping from functions to functions defined by
$$L(f) = f^{(k)} + \sum_{i=0}^{k-1} a_i f^{(i)} \qquad a_i \in C \tag{5.6}$$
Corresponding to $L$ is the polynomial
$$P_L(X) = X^k + \sum_{i=0}^{k-1} a_i X^i$$
called the characteristic polynomial of $L$. We recall the facts that we already know about such differential operators.

Theorem 5.4. Let $L$ be given by (5.6). The collection $S(L)$ of solutions of the equation $Lf = 0$ is a $k$-dimensional vector space of infinitely differentiable functions. If $r$ is a root of $P_L(X) = 0$, then $e^{rx} \in S(L)$.

Now if all the roots of the characteristic polynomial are distinct, we have $k$ solutions of $Lf = 0$, and it is easily verified (Problem 11) that they are independent. Thus they span $S(L)$. To examine the case of multiple roots, we must examine more closely the relationship between the given differential operator and its characteristic polynomial.

If $P$ is a polynomial, we will let $L_P$ represent the corresponding operator; that is, for $P(X) = \sum_{i=0}^{n} a_i X^i$, $L_P$ is defined by
$$L_P(f) = \sum_{i=0}^{n} a_i f^{(i)}$$
Now, from what we already know about these differential equations we can guess that the factorization of $P$ will tell us all we want to know about $L_P$. In fact, we can factor the corresponding operator accordingly, as the next lemma shows.

Lemma 1. $L_{P+Q} = L_P + L_Q$; $L_{PQ} = L_P L_Q$.

Proof. Of course, $L_P L_Q$ is defined as the composition of operators: $(L_P L_Q)(f) = L_P(L_Q(f))$. The first equation is obvious. The second takes a little work. We will prove it by induction on the degree of $P$. If $\deg P = 0$, that is, $P(X) = a_0$, then $PQ = a_0 Q$ and $L_{PQ}(f) = a_0 L_Q(f) = L_P(L_Q(f))$, for any sufficiently differentiable function $f$. Now suppose the lemma is true for all polynomials of degree $n$. Let $P$ be a polynomial of degree $n + 1$. If $a$ is a root of $P$, we can write $P(X) = (X - a)S(X)$, where $S$ is a polynomial of degree $n$. Thus, by hypothesis, $L_{SQ} = L_S L_Q$. We have left only to verify the lemma for polynomials of degree 1. That is, we must show that if $R$ is a polynomial of degree 1 and $T$ is any polynomial, then $L_{RT} = L_R L_T$. For once this is verified, we take $R(X) = X - a$, so that $P = RS$.
Then
$$L_{PQ} = L_{RSQ} = L_R L_{SQ} = L_R L_S L_Q = L_{RS} L_Q = L_P L_Q$$
So, let $R(X) = X - a$, $T(X) = \sum_{i=0}^{m} b_i X^i$. Then (with the convention $b_{m+1} = 0$)
$$RT(X) = \sum_{i=0}^{m} (b_i - a b_{i+1}) X^{i+1} - a b_0$$
Now we compute $L_R L_T$:
$$L_R\left( \sum_{i=0}^{m} b_i f^{(i)} \right) = \left( \sum_{i=0}^{m} b_i f^{(i)} \right)' - a \sum_{i=0}^{m} b_i f^{(i)} = \sum_{i=0}^{m} (b_i - a b_{i+1}) f^{(i+1)} - a b_0 f$$
which is $L_{RT}(f)$. The lemma is proven.

It follows from the lemma that if $Q$ is a factor of $P$, then any solution of $L_Q(f) = 0$ is a solution of $L_P(f) = 0$. Now let $P$ be a given polynomial. We can, by the factorization theorem, write $P$ as a product of first-order factors:
$$P(X) = (X - \alpha_1)^{m_1} \cdots (X - \alpha_s)^{m_s}$$
with $m_1 + \cdots + m_s = \deg P$. Because of Lemma 1 the solutions corresponding to the factors $(X - \alpha_j)^{m_j}$ are in $S(L_P)$. Thus we need to discover the solutions of the differential equation $L_P(f) = 0$, where $P(X) = (X - c)^m$.

Consider, for example, the differential operator corresponding to $(X - c)^2$. We know one solution: $e^{cx}$; we find another by the technique of variation of parameters. $(X - c)^2 = X^2 - 2cX + c^2$. Test the operator on $y = ze^{cx}$:
$$y' = z'e^{cx} + zce^{cx}$$
$$y'' = z''e^{cx} + 2z'ce^{cx} + zc^2e^{cx}$$
Then
$$y'' - 2cy' + c^2 y = z''e^{cx} = 0 \quad \text{or} \quad z'' = 0$$
Thus we may take $z = x$, and the second solution is $xe^{cx}$. We can guess then that the general situation is this.

Lemma 2. The solutions of $L_{(X-c)^m}(f) = 0$ are spanned by $e^{cx}, xe^{cx}, \ldots, x^{m-1}e^{cx}$.

Proof. We have to show that the named functions are solutions. We do that by induction. The case $m = 1$ is already known (by Lemma 1). Thus we may
assume the lemma for a given value of $m$, and prove it for $m + 1$. By Lemma 1, we need only verify that $L_{(X-c)^{m+1}}(x^m e^{cx})$ is zero. But this is

\[ L_{(X-c)^m} L_{X-c}(x^m e^{cx}) = L_{(X-c)^m}(m x^{m-1} e^{cx} + c x^m e^{cx} - c x^m e^{cx}) = L_{(X-c)^m}(m x^{m-1} e^{cx}) = 0 \]

by induction.

Theorem 5.5. Let $p(X) = X^n + \sum_{i=0}^{n-1} a_i X^i$ be a polynomial with complex coefficients. Let $\alpha_1, \ldots, \alpha_s$ be the roots of $p(X) = 0$ with multiplicities $m_1, \ldots, m_s$, respectively. Then the space $S(L_p)$ of solutions of the differential equation

\[ L_p(f) = f^{(n)} + \sum_{i=0}^{n-1} a_i f^{(i)} = 0 \]

is the linear span of the functions $x^j e^{\alpha_i x}$, $0 \le j < m_i$, $1 \le i \le s$.

• EXERCISES

5. Solve these differential equations:
(a) $y''' - 5y'' + 8y' - 4y = 0$, $y(0) = 0$, $y'(0) = 0$, $y''(0) = 1$.
(b) $y''' - y'' - 5y' - 3y = 0$, $y(0) = 1$, $y'(0) = 2$, $y''(0) = -1$.
(c) $y''' - 6y'' + 12y' - 8y = 0$, $y(0) = 1$, $y'(0) = 0$, $y''(0) = 1$.
(d) $y''' - 3y'' + 3y' - y = 0$, $y(0) = 3$, $y'(0) = 2$, $y''(0) = 1$.
(e) $y^{(4)} + 2y'' + y = 0$, $y(0) = 2$, $y'(0) = 2$, $y''(0) = 2$, $y'''(0) = 2$.
(f) $y^{(4)} + 4y''' - 2y'' - 12y' + 9y = 0$, $y(0) = y'(0) = y''(0) = 1$, $y'''(0) = 0$.
(g) $y^{(4)} - 3y'' + 2y = 0$, $y(0) = 0$, $y'(0) = y''(0) = y'''(0) = 1$.

• PROBLEMS

11. (a) Show that if $r_1, \ldots, r_n$ are $n$ distinct numbers, the matrix

\[ \begin{pmatrix} 1 & r_1 & r_1^2 & \cdots & r_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ 1 & r_n & r_n^2 & \cdots & r_n^{n-1} \end{pmatrix} \]

is nonsingular. (Hint: If the rows are dependent, we obtain a polynomial of degree $n - 1$ with $n$ distinct roots.)
(b) The functions $\exp(r_1 x), \ldots, \exp(r_n x)$ are independent. (Hint: If these functions were dependent, we would be able to prove that the columns of the above matrix are dependent.)
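The content of Theorem 5.5 can be checked mechanically. The following is only a sketch under stated assumptions: a function $q(x)e^{cx}$ is stored as the coefficient list of the polynomial $q$, the operator is applied one first-order factor $(D - a)$ at a time (which Lemma 1 permits), and the names `poly_deriv`, `apply_factor`, `apply_operator` and `is_solution` are ours, chosen for illustration. The cubic $(X - 1)(X - 2)^2$ is likewise an illustrative choice.

```python
def poly_deriv(q):
    """Derivative of q[0] + q[1] x + ... (coefficients, lowest degree first)."""
    return [k * q[k] for k in range(1, len(q))] or [0]


def apply_factor(q, c, a):
    """Apply the first-order operator (D - a) to q(x) e^{cx}.

    Since (q e^{cx})' = (q' + c q) e^{cx}, the result is r(x) e^{cx}
    with r = q' + (c - a) q; the coefficient list of r is returned.
    """
    r = [(c - a) * v for v in q]
    for k, v in enumerate(poly_deriv(q)):
        r[k] += v
    return r


def apply_operator(q, c, roots):
    """Apply prod (D - a)^m, for roots = {a: m}, to q(x) e^{cx}."""
    for a, m in roots.items():
        for _ in range(m):
            q = apply_factor(q, c, a)
    return q


def is_solution(j, c, roots):
    """True if x^j e^{cx} is annihilated by the operator with the given roots."""
    q = [0] * j + [1]  # the monomial x^j
    return all(v == 0 for v in apply_operator(q, c, roots))


# Illustrative operator: X^3 - 5X^2 + 8X - 4 = (X - 1)(X - 2)^2.
roots = {1: 1, 2: 2}
print(is_solution(0, 1, roots), is_solution(0, 2, roots), is_solution(1, 2, roots))
print(is_solution(2, 2, roots))
```

With integer roots the arithmetic is exact, so the three functions $e^x$, $e^{2x}$, $xe^{2x}$ named by the theorem are reported as solutions, while $x^2 e^{2x}$ (one power too many for a double root) is not.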
5.4 Solutions in Series

If now we are given a linear differential equation which is not homogeneous, or has variable coefficients, we have a problem of a much different magnitude. In general, such problems cannot be solved explicitly. Thus, we must seek ways to obtain approximate solutions. This is one of the places where series representations of functions are usable. The procedure of series approximation has two aspects. First, we must establish the theoretical validity of such a technique and, secondly (and this is essential from the practical point of view), we need a technique for effectively computing the error. In this section we shall describe this procedure, deferring these two essential points (which turn out to be the same!) until Section 5.7.

First, an example. Suppose we want the function $f$ such that

\[ f''(x) + g_1(x) f'(x) + g_0(x) f(x) = 0, \qquad f(0) = a_0, \quad f'(0) = a_1, \]

where $g_0$ and $g_1$ are defined in a neighborhood of 0. We shall assume that they are sufficiently differentiable. Now our initial conditions give us the first two terms of the Taylor expansion of $f$ at 0:

\[ f(x) = a_0 + a_1 x + \text{higher-order terms} \tag{5.7} \]

Our technique will be based on the tacit assumption that the "higher-order terms" are computable, and that knowing enough of them will give a usable approximation to the solution. Now, evaluating the differential equation itself at 0 gives us the second-order term:

\[ f''(0) + g_1(0) f'(0) + g_0(0) f(0) = 0 \quad \text{or} \quad f''(0) = -(g_1(0) a_1 + g_0(0) a_0), \]

so

\[ f(x) = a_0 + a_1 x - \tfrac{1}{2}(a_0 g_0(0) + a_1 g_1(0)) x^2 + \text{higher-order terms} \tag{5.8} \]

Differentiating the differential equation will give an identity expressing $f'''$ in terms of lower derivatives, so we may continue:

\[ f'''(x) + g_1'(x) f'(x) + g_1(x) f''(x) + g_0'(x) f(x) + g_0(x) f'(x) = 0, \]

so

\[ f'''(0) = -(g_1'(0) + g_0(0)) a_1 - g_0'(0) a_0 + g_1(0)(g_1(0) a_1 + g_0(0) a_0) = (g_1(0)^2 - g_1'(0) - g_0(0)) a_1 + (g_0(0) g_1(0) - g_0'(0)) a_0, \]

and so we have the third term of the Taylor series of $f$:

\[ f(x) = a_0 + a_1 x - \tfrac{1}{2}(a_0 g_0(0) + a_1 g_1(0)) x^2 + \tfrac{1}{6}\big[(g_1(0)^2 - g_1'(0) - g_0(0)) a_1 + (g_0(0) g_1(0) - g_0'(0)) a_0\big] x^3 + \text{higher-order terms}. \]

Example

9. Perhaps an explicit calculation is in order. We shall find an approximate solution of

\[ y'' + (x^2 - 1) y' + x y = x^2, \qquad y(0) = 0, \quad y'(0) = 2. \tag{5.9} \]

The solution thus begins $f(x) = 2x + \cdots$. $f''(0)$ is easy to calculate by substituting the initial conditions into Equation (5.9): $f''(0) = 2$, so

\[ f(x) = 2x + x^2 + \cdots. \]

Differentiating (5.9), we obtain

\[ y''' + 2x y' + (x^2 - 1) y'' + y + x y' = 2x. \tag{5.10} \]

Evaluating at 0 we find $f'''(0) = f''(0) - f(0) = 2$. Differentiating (5.10) and evaluating at 0, we obtain $f^{(4)}(0) = 2 + f'''(0) - 4f'(0) = -4$; once again gives $f^{(5)}(0) = f^{(4)}(0) - 9 f''(0) = -22$. Thus, to five terms the Taylor expansion of the desired solution is

\[ f(x) = 2x + x^2 + \tfrac{1}{3} x^3 - \tfrac{1}{6} x^4 - \tfrac{11}{60} x^5 + \text{higher-order terms}. \]

Admittedly this is not very glamorous, but it is computable! The phrase
"higher-order terms" represents the error between the fifth-degree polynomial exhibited above and the actual solution. That polynomial is completely meaningless without some estimate on the error incurred. But our method gives no hint as to how to estimate it. So, in the hope of being able to give more form to the "higher-order terms," we will try a more brazen approach: we begin by assuming that the desired solution is the sum of a convergent power series (its "full Taylor expansion") and we try to find the coefficients. If $f(x) = \sum_{n=0}^{\infty} a_n x^n$, then differentiating term by term we obtain

\[ f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}, \qquad f''(x) = \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2}, \qquad f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1) a_n x^{n-k}. \]

We make these substitutions into the given differential equation and solve for the $\{a_n\}$ by equating the coefficients of $x^k$.

Let us reconsider (5.9). Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the desired solution. The statement of the problem becomes

\[ \sum_{n=2}^{\infty} n(n-1) a_n x^{n-2} + (x^2 - 1) \sum_{n=1}^{\infty} n a_n x^{n-1} + \sum_{n=0}^{\infty} a_n x^{n+1} - x^2 = 0, \qquad a_0 = 0, \quad a_1 = 2. \tag{5.11} \]

The coefficient of $x^k$ in the left-hand side of (5.11) is

\[ (k+2)(k+1) a_{k+2} + (k-1) a_{k-1} - (k+1) a_{k+1} + a_{k-1}, \]

less 1 when $k = 2$ (from the term $-x^2$). Thus we have to solve these equations:

\[ \begin{aligned} a_0 = 0, \quad a_1 &= 2 && \text{(initial conditions)} \\ 2 a_2 - a_1 &= 0 && (k = 0) \\ 3 \cdot 2\, a_3 + a_0 - 2 a_2 &= 0 && (k = 1) \\ 4 \cdot 3\, a_4 + 2 a_1 - 3 a_3 &= 1 && (k = 2) \\ n(n-1) a_n + (n-2) a_{n-3} - (n-1) a_{n-1} &= 0 && (k = n - 2,\ n \ge 5) \end{aligned} \]
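We can solve, because each equation can be written in the form

\[ a_n = \frac{(n-1) a_{n-1} - (n-2) a_{n-3}}{n(n-1)}, \qquad n \ge 2 \tag{5.12} \]

(with the extra term 1 added to the numerator when $n = 4$, to account for the right-hand side $x^2$ of (5.9)). However, we have an added advantage in that we can make a guess at an estimate for the general term $a_n$. In fact, we assert

\[ |a_n| \le \frac{2^n}{[n/3]!} \tag{5.13} \]

($[x]$ = largest integer less than or equal to $x$). This is in fact true for $n = 0, 1, \ldots, 4$ by direct computation; we verify it in general by induction. For $n \ge 5$, (5.12) and the induction hypothesis give

\[ |a_n| \le \frac{(n-1)|a_{n-1}| + (n-2)|a_{n-3}|}{n(n-1)} \le \frac{1}{n}\big(|a_{n-1}| + |a_{n-3}|\big) \le \frac{1}{n}\left(\frac{2^{n-1}}{[(n-1)/3]!} + \frac{2^{n-3}}{[(n-3)/3]!}\right). \]

Now, since $[(n-3)/3] = [n/3] - 1$ and $[n/3] \le n/3$,

\[ [(n-3)/3]! = \frac{[n/3]!}{[n/3]} \ge \frac{3}{n}\,[n/3]!. \]

Similarly,

\[ [(n-1)/3]! \ge \frac{3}{n}\,[n/3]!. \]

Thus,

\[ |a_n| \le \frac{1}{n} \cdot \frac{n}{3\,[n/3]!}\,(2^{n-1} + 2^{n-3}) = \frac{1}{3\,[n/3]!}\,(2^{n-1} + 2^{n-3}) \le \frac{2^n}{[n/3]!}. \]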
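The recursion and the estimate (5.13) lend themselves to a machine check. What follows is a minimal sketch, assuming equation (5.9) reads $y'' + (x^2 - 1)y' + xy = x^2$ with $y(0) = 0$, $y'(0) = 2$; the function names are ours, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction
from math import factorial


def example9_coefficients(n_max):
    """Taylor coefficients a_0..a_{n_max} of the solution of
    y'' + (x^2 - 1) y' + x y = x^2, y(0) = 0, y'(0) = 2 (assumed reading of (5.9))."""
    a = [Fraction(0), Fraction(2)]  # initial conditions
    for n in range(2, n_max + 1):
        # The right-hand side x^2 contributes only to the coefficient of x^2,
        # that is, to the equation that determines a_4.
        rhs = Fraction(1) if n == 4 else Fraction(0)
        prev3 = a[n - 3] if n >= 3 else Fraction(0)
        a.append((rhs + (n - 1) * a[n - 1] - (n - 2) * prev3) / (n * (n - 1)))
    return a


def satisfies_bound(a):
    """Check |a_n| <= 2^n / [n/3]! for every computed coefficient."""
    return all(abs(c) <= Fraction(2 ** n, factorial(n // 3)) for n, c in enumerate(a))


coeffs = example9_coefficients(60)
print(coeffs[:6])
print(satisfies_bound(coeffs))
```

The first six values are the coefficients of the fifth-degree Taylor polynomial, and the second line confirms the bound for the first sixty-one coefficients; this is evidence for (5.13), not a substitute for the induction.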
This now tells us a lot. For, the solution to the problem in (5.9) differs from $2x + x^2 + \tfrac{1}{3} x^3 - \tfrac{1}{6} x^4 - \tfrac{11}{60} x^5$ by at most $\sum_{n \ge 6} |a_n| |x|^n$, where the $a_n$ satisfy (5.13). Thus the error is dominated by

\[ \sum_{n \ge 6} \frac{2^n}{[n/3]!} |x|^n = \sum_{k \ge 2} \frac{2^{3k} |x|^{3k}}{k!} + \sum_{k \ge 2} \frac{2^{3k+1} |x|^{3k+1}}{k!} + \sum_{k \ge 2} \frac{2^{3k+2} |x|^{3k+2}}{k!} = (1 + 2|x| + 4|x|^2)\big(\exp((2|x|)^3) - 1 - (2|x|)^3\big). \]

Hence in the interval $0 < x < 1$ the solution is given by the above polynomial except for an error of at most $7(e^8 - 9)$ (which is about 20,000)! The reader is forgiven if he is unimpressed with our estimate, but he should not go so far as to discard the technique for this reason. For the paucity of our results is due to laziness rather than the uselessness of the Taylor development. If we pushed this procedure up to 1000 terms (an easy task for a computer), then the same estimate shows that the error would be at most $5 \cdot 8^{333}/333!$, which is less than $10^{-390}$; a good estimate indeed.

Let us recapitulate the basic ideas. We are given an initial value problem:

\[ y^{(k)} + \sum_{i=0}^{k-1} g_i(x) y^{(i)} = h(x), \qquad y(0) = c_0, \ y'(0) = c_1, \ \ldots, \ y^{(k-1)}(0) = c_{k-1}. \]

We replace the $g$'s and $h$ by power series expansions and test the "solution" $f(x) = \sum_{n=0}^{\infty} a_n x^n$. The first $k$ terms are found from the initial conditions, and the rest are found by equating the coefficients of $x^n$ on both sides of the equation. This leaves us with these problems to resolve:

(i) Can we represent the given $g$'s and $h$ by power series?
(ii) Can we differentiate a power series term by term?
(iii) How do we multiply power series? (In the above illustration, the $g$'s were polynomials, so there was not much difficulty.)
(iv) Can the system of relations between the $a_k$'s really be solved uniquely?
(v) Can we effectively estimate the error between the solution and a finite part of its (supposed) Taylor expansion?

Little by little, we will resolve these problems. Suffice it to say that the answer to (i) in general is No (see Section 5.8). However, in problems that
do arise naturally, the given functions usually are sums of convergent power series. If this is the case, all other questions can be satisfactorily answered; that is, the solution also is the sum of a convergent power series whose coefficients can be determined by the above technique, and the estimate on the remainder can be effectively computed. Let us look at another illustration.

Examples

10.
\[ x^3 y''' + x^2 y'' + x y' + y = e^x, \qquad y(0) = 1, \quad y'(0) = \tfrac{1}{2}, \quad y''(0) = \tfrac{1}{5}. \tag{5.14} \]

Let the solution be $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Substituting in (5.14), we obtain

\[ \sum_{n=0}^{\infty} n(n-1)(n-2) a_n x^n + \sum_{n=0}^{\infty} n(n-1) a_n x^n + \sum_{n=0}^{\infty} n a_n x^n + \sum_{n=0}^{\infty} a_n x^n = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \]

which gives these equations for the coefficients:

\[ a_n\big(n(n-1)(n-2) + n(n-1) + n + 1\big) = \frac{1}{n!} \quad \text{for all } n, \]

or

\[ a_n = \frac{1}{n!\,(n^3 - 2n^2 + 2n + 1)}. \tag{5.15} \]

Notice that we have not used the initial conditions, and fortunately they conform to the requirements (5.15): $a_0 = 1$, $a_1 = \tfrac{1}{2}$, $a_2 = \tfrac{1}{10}$. That is, for this particular equation, there is a unique solution independent of any initial conditions. This does not contradict any previous results because Picard's theorems do not apply (since the leading coefficient is not invertible).

11.
\[ y'' - x y' + 2y = 0, \qquad y(0) = 1, \quad y'(0) = 0. \tag{5.16} \]
Here Picard's theorem does apply, so we should get a unique solution with the given initial conditions. Let $f(x) = \sum_{n=0}^{\infty} a_n x^n$ be the candidate. (5.16) becomes

\[ \sum_{n=0}^{\infty} n(n-1) a_n x^{n-2} - \sum_{n=0}^{\infty} n a_n x^n + \sum_{n=0}^{\infty} 2 a_n x^n = 0 \]

or

\[ a_0 = 1, \quad a_1 = 0, \qquad (n+2)(n+1) a_{n+2} - n a_n + 2 a_n = 0, \quad n \ge 0, \]

or

\[ a_{n+2} = \frac{(n-2)\, a_n}{(n+2)(n+1)}. \]

Thus $a_2 = -1$, $a_3 = 0$, $a_4 = 0$, and thus all further coefficients are zero. The solution is $f(x) = 1 - x^2$.

In the next section, we shall fully develop the theory of power series. It is most advantageous (as we have already seen) to do so in the complex domain.

• EXERCISES

6. Find an approximate solution for
\[ y'' - xy = 0, \qquad y(0) = 0, \quad y'(0) = 1 \]
with an error of at most $10^{-4}$ in the interval $[-1, 1]$.

7. Do the same for
\[ y''' - x^2 y = 1, \qquad y(0) = 0, \quad y'(0) = 0, \quad y''(0) = 0 \]
with an error of $10^{-4}$ in $[-\tfrac{1}{2}, \tfrac{1}{2}]$.

8. Find a recursive formula for the coefficients of the solution, and a reasonable estimate:
(a) $y'' - 2y' + y = 0$, $y(0) = 1$, $y'(0) = 0$.
(b) $y'' - 2y' + xy = e^x$, $y(0) = 1$, $y'(0) = 1$.
(c) $y^{(k)} + y = 1$, arbitrary initial conditions.
(d) $y'' - k^2 y = 0$.
(e) $y' = x^2 + xy$, $y(0) = 0$.
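The recursion of Example 11 is short enough to run by hand, but it is also a convenient template for the exercises above. The sketch below assumes the recurrence $a_{n+2} = (n-2)a_n/((n+2)(n+1))$ obtained for $y'' - xy' + 2y = 0$; the helper names are ours. It generates the coefficients and then substitutes the resulting polynomial back into the equation, coefficient by coefficient.

```python
from fractions import Fraction


def example11_coefficients(n_max, a0=1, a1=0):
    """Coefficients a_0..a_{n_max} from a_{n+2} = (n - 2) a_n / ((n + 2)(n + 1))."""
    a = [Fraction(a0), Fraction(a1)]
    for n in range(n_max - 1):
        a.append(Fraction(n - 2) * a[n] / ((n + 2) * (n + 1)))
    return a


def residual(a):
    """Coefficients of y'' - x y' + 2 y for the polynomial y = sum a[k] x^k."""
    n = len(a)
    res = []
    for k in range(n):
        second = (k + 2) * (k + 1) * a[k + 2] if k + 2 < n else 0
        res.append(second - k * a[k] + 2 * a[k])
    return res


coeffs = example11_coefficients(8)
print(coeffs)            # the series terminates after the x^2 term
print(residual(coeffs))  # every coefficient of the residual vanishes
```

Changing the initial data to `a0=0, a1=1` gives a series that does not terminate, so the polynomial solution is special to the initial conditions chosen in the example.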
• PROBLEMS

12. Why doesn't Picard's theorem apply to Equation (5.14)?

13. The second-order equation $xy'' + y' = 0$ seems to have only one solution by the series method, but two independent solutions by the method of separation of variables. Explain that.

5.5 Power Series

We have already discussed at length the power series expansion of the exponential and trigonometric functions, the geometric series, and some others. We have also seen that the Taylor formula produces a power series expansion for suitable functions. We have observed that there is a certain disk corresponding to each power series, called the disk of convergence. The series converges inside that disk and diverges outside. We shall recollect all this information as the starting point of our discussion of complex power series.

Theorem 5.6. Let $\{c_n\}$ be a sequence of complex numbers. There is a nonnegative number $R$ (called the radius of convergence of the power series $\sum c_n z^n$) with these properties:
(a) $\sum_{n=0}^{\infty} c_n z^n$ diverges if $|z| > R$.
(b) $\sum_{n=0}^{\infty} c_n z^n$ converges absolutely and uniformly in any disk $\{z \in C: |z| \le r\}$ with $r < R$.
$R$ has these two descriptions:
(i) $R = \text{l.u.b.}\,\{t: |c_n| t^n \text{ is bounded}\}$.
(ii) $R = (\limsup (|c_n|)^{1/n})^{-1}$.

Proof. For at least part of the proof we could refer to Proposition 9. As in that proposition we consider the set

\[ \{t \ge 0: \text{there is an } M \text{ such that } M \ge |c_n| t^n \text{ for all } n\}. \]

If this set is unbounded, we can take $R = \infty$; otherwise, let $R$ be the least upper bound of this set.

(a) Suppose $|z| > R$. Then there is a $t$, $|z| > t > R$, such that $\{|c_n| t^n\}$ is unbounded. Since $|c_n| |z^n| \ge |c_n| t^n$ for all $n$, we cannot have $\lim c_n z^n = 0$, so $\sum c_n z^n$ diverges.

(b) Let $r < R$, $\Delta = \{z \in C: |z| \le r\}$. Then there is a $t$, $r < t < R$, such that
$\{|c_n| t^n\}$ is bounded, say by $M$. If $|z| \le r$,

\[ |c_n z^n| = |c_n| t^n \left(\frac{|z|}{t}\right)^n \le M \left(\frac{r}{t}\right)^n. \]

Thus, letting $\|\cdot\|$ be the uniform norm for $C(\Delta)$, we have $\|c_n z^n\| \le M (r/t)^n$. Since $r/t < 1$, $\sum (r/t)^n < \infty$, so by comparison $\sum c_n z^n$ converges absolutely and uniformly in $C(\Delta)$. Further, by definition, $R$ is given by (i); the more esoteric formulation (ii) we shall leave as Problem 14.

Examples

12. If $\sum c_n z^n$ is a given power series with radius of convergence $R$, the question may arise: what happens on the circle $|z| = R$? The answer is that practically anything can happen.
(a) If the sequence $\{c_n\}$ is summable, that is, $\sum |c_n| < \infty$, then by comparison $\sum c_n z^n$ converges uniformly in $\{|z| \le 1\}$. Thus the series $\sum (z^n/n^2)$ has radius of convergence 1 and converges uniformly in $\{|z| \le 1\}$.
(b) $\sum (z^n/n)$ also has radius of convergence 1, but $\sum (1/n)$ (the series at $z = 1$) does not converge, whereas $\sum [(-1)^n/n]$ (the series at $z = -1$) does converge.
(c) $\sum z^n$ has radius of convergence 1, but $\sum z^n$ does not converge for any $z$ with $|z| = 1$ at all ($\lim z^n \ne 0$ if $|z| = 1$!).
Since no general assertion on the circle of convergence is possible, we needn't be concerned with the behavior of the series there (except in particular cases).

13. The geometric series $\sum_{n=0}^{\infty} z^n$ is a power series with radius of convergence 1. This series converges to $(1 - z)^{-1}$ uniformly and absolutely on any disk $\{z \in C: |z| \le r\}$ with $r < 1$. Thus

\[ \frac{1}{1 - z} = \sum_{n=0}^{\infty} z^n \quad \text{for } |z| < 1. \]

Now let $a \in C$, $a \ne 0$. Then

\[ \frac{1}{a - z} = \frac{1}{a} \cdot \frac{1}{1 - (z/a)} = \frac{1}{a} \sum_{n=0}^{\infty} \left(\frac{z}{a}\right)^n = \sum_{n=0}^{\infty} \frac{z^n}{a^{n+1}}. \]

This convergence is assured in the disk $\{z \in C: |z| < |a|\}$.

14. The series $\sum_{n=0}^{\infty} (z^n/n!)$ has infinite radius of convergence.
Thus the sum is a continuous function on the whole plane, denoted $e^z$ since this sum does converge to the exponential function for real values of $z$. We have seen that, for real $x$, $e^{ix} = \cos x + i \sin x$. We can use this (or Taylor's theorem) to obtain series for the sine and cosine:

\[ \cos x + i \sin x = \sum_{n=0}^{\infty} \frac{(ix)^n}{n!} = \sum_{k=0}^{\infty} \left[ \frac{x^{4k}}{(4k)!} + i \frac{x^{4k+1}}{(4k+1)!} - \frac{x^{4k+2}}{(4k+2)!} - i \frac{x^{4k+3}}{(4k+3)!} \right] = \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!} + i \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!}. \]

These series also converge on the entire plane. We can use them to define the complex cosine and sine:

\[ \cos z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k}}{(2k)!}, \qquad \sin z = \sum_{k=0}^{\infty} (-1)^k \frac{z^{2k+1}}{(2k+1)!}. \tag{5.17} \]

We also have the equation

\[ e^{iz} = \cos z + i \sin z \tag{5.18} \]

for all complex numbers $z$ (for the series will sum again that way).

15. Replacing $z$ by $-iz$ and $iz$, alternately, in (5.18), we obtain these other interesting equations:

\[ e^{z} = \cos(-iz) + i \sin(-iz) = \cos(iz) - i \sin(iz), \qquad e^{-z} = \cos(iz) + i \sin(iz). \]

Thus

\[ \frac{e^z + e^{-z}}{2} = \cos(iz) \tag{5.19} \]

\[ \frac{e^z - e^{-z}}{2} = -i \sin(iz) \tag{5.20} \]

For real values of $z$, the left-hand sides of Equations (5.19), (5.20) are
the hyperbolic cosine and hyperbolic sine, respectively. We can use these expressions to define the complex cosh and sinh:

\[ \cosh z = \frac{e^z + e^{-z}}{2}, \qquad \sinh z = \frac{e^z - e^{-z}}{2}. \]

Because of (5.19) and (5.20), the complex trigonometric functions are, on the imaginary axis, the hyperbolic functions:

\[ \cosh z = \cos(iz), \qquad \sinh z = -i \sin(iz). \tag{5.21} \]

We should also note that the trigonometric identities imply the hyperbolic ones. Since $\cos^2(iz) + \sin^2(iz) = 1$, it follows from (5.21) that

\[ \cosh^2 z - \sinh^2 z = 1 \tag{5.22} \]

(see Exercise 9).

16. $\sum_{n=1}^{\infty} (z^n/n)$ has radius of convergence 1. So does the series $\sum_{n=1}^{\infty} n^k z^n$, for any integer $k$. We shall see later that the sums of all these series can be given by closed expressions (such as $\sum z^n = (1 - z)^{-1}$).

17. A polynomial function in $C$ is given by a power series. In fact, writing the polynomial $p(z) = \sum_{n=0}^{N} a_n z^n$ is the same as giving its power series expansion. What is more interesting is that any point in $C$ can be chosen as the center of a power series expansion for $p$. Let $z_0 \in C$ and write

\[ p(z) = \sum_{n=0}^{N} a_n (z - z_0 + z_0)^n. \]

Using the binomial theorem this becomes

\[ p(z) = \sum_{n=0}^{N} a_n \sum_{i=0}^{n} \binom{n}{i} (z - z_0)^i z_0^{n-i}. \tag{5.23} \]

All sums being finite, we may arrange terms at will. Thus we can rewrite (5.23) as a sum of powers of $z - z_0$:

\[ p(z) = \sum_{n=0}^{N} \left[ \sum_{m=n}^{N} a_m \binom{m}{n} z_0^{m-n} \right] (z - z_0)^n, \]

which is the desired expansion.
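The rearrangement in Example 17 is easily turned into a computation. The sketch below assumes the formula $b_n = \sum_{m=n}^{N} a_m \binom{m}{n} z_0^{m-n}$ for the coefficient of $(z - z_0)^n$; the function names `recenter` and `evaluate` are ours.

```python
from math import comb


def recenter(a, z0):
    """Coefficients b_n with p(z) = sum b_n (z - z0)^n, given p(z) = sum a[n] z^n."""
    N = len(a) - 1
    return [
        sum(a[m] * comb(m, n) * z0 ** (m - n) for m in range(n, N + 1))
        for n in range(N + 1)
    ]


def evaluate(coeffs, w):
    """Value of sum coeffs[n] * w^n."""
    return sum(c * w ** n for n, c in enumerate(coeffs))


a = [1, -3, 0, 2]        # p(z) = 1 - 3z + 2z^3, an illustrative polynomial
b = recenter(a, 2)       # expansion about z0 = 2
print(b)
print(all(evaluate(b, z - 2) == evaluate(a, z) for z in range(-5, 6)))
```

Here $b_0 = p(2)$, $b_1 = p'(2)$, $b_2 = p''(2)/2$, and so on, which is the Taylor formula for a polynomial in disguise.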
More generally, any series of the form $\sum_{n=0}^{\infty} c_n (z - z_0)^n$ will be called a power series expansion centered at $z_0$. Can we expand $e^z$ in a power series centered at a point other than the origin? The answer is yes (cf. Problem 15), and the proof is like the one above for polynomials, but the question of convergence intervenes after the analog of Equation (5.23) above. It is a general fact that for any function given by a power series, we may move the center of the expansion to any other point in the disk of convergence. The truly courageous student should try to prove this now; it can be done. We will give a proof later which is simple and avoids convergence problems but requires more sophisticated information about functions defined by power series expansions.

Addition and Multiplication of Power Series

Suppose $f$, $g$ are complex-valued functions defined by power series expansions centered at a point $z_0$. Then we can find series expansions for the functions $f + g$ and $fg$ also. Addition is easy: if, say,

\[ f(z) = \sum a_n (z - z_0)^n, \qquad g(z) = \sum b_n (z - z_0)^n, \]

then

\[ (f + g)(z) = \sum (a_n + b_n)(z - z_0)^n. \]

But to find the series expansion for the product requires a little more care. Suppose that $z_0 = 0$ (this involves no loss of generality). To say that $f(z) = \sum a_n z^n$ is to say that in a certain disk $\Delta$, $f$ is the limit of the polynomials $\sum_{n=0}^{N} a_n z^n$. Similarly, $g$ is the limit of the polynomials $\sum_{n=0}^{N} b_n z^n$. Thus, $fg$ is the limit of the sequence of polynomials $\big(\sum_{n=0}^{N} a_n z^n\big)\big(\sum_{n=0}^{N} b_n z^n\big)$. Now, we can multiply polynomials easily:

\[ \Big(\sum_{n=0}^{N} a_n z^n\Big)\Big(\sum_{m=0}^{N} b_m z^m\Big) = \sum_{n=0}^{N} \sum_{m=0}^{N} a_n b_m z^{n+m}. \]

If we collect terms in this expression to form a series of powers of $z$ we do not get a very aesthetic expression, but if we take some terms from the next few polynomials in the sequence we obtain a reasonable expression:

\[ \sum_{k=0}^{N} \Big( \sum_{n+m=k} a_n b_m \Big) z^k. \tag{5.24} \]

We could hope that $fg$ is the limit of this sequence of polynomials. This is a reasonable hope; for even though we have modified the original sequence
of polynomials we have neither added to nor deleted from the series represented by that sequence. In fact, by making careful use of this fact, we can verify that $fg$ is the limit of (5.24).

Proposition 3. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$, $g(z) = \sum_{n=0}^{\infty} b_n z^n$ and suppose $r$ is less than the radii of convergence of both series. Then
(i) $(f + g)(z) = \sum_{n=0}^{\infty} (a_n + b_n) z^n$ uniformly and absolutely in $\Delta = \{z \in C: |z| \le r\}$,
(ii) $(fg)(z) = \sum_{k=0}^{\infty} \big( \sum_{n+m=k} a_n b_m \big) z^k$ uniformly and absolutely in $\Delta$.

Proof. Let $p_n(z) = \sum_{k=0}^{n} a_k z^k$, $q_n(z) = \sum_{k=0}^{n} b_k z^k$. By hypothesis $p_n \to f$, $q_n \to g$ uniformly in $\Delta$. Thus $p_n + q_n \to f + g$, $p_n q_n \to fg$ uniformly in $\Delta$ (Problem 2.55). Since

\[ p_n(z) + q_n(z) = \sum_{k=0}^{n} (a_k + b_k) z^k, \]

(i) is proven.

(ii) Let

\[ r_n(z) = \sum_{k=0}^{n} \Big( \sum_{i+j=k} a_i b_j \Big) z^k; \]

we want to show that $r_n \to fg$ uniformly in $\Delta$. We know that $p_n q_n \to fg$, so it would seem worth our while to compute $p_n q_n - r_n$. But that is easy:

\[ p_n q_n - r_n = \sum_{k=n+1}^{2n} \Big( \sum_{\substack{i+j=k \\ i \le n,\ j \le n}} a_i b_j \Big) z^k. \]

Now, each term on the right is of the form $a_i b_j z^{i+j}$ with $i + j > n$, so that $i > n/2$ or $j > n/2$. Thus, computing norms on $\Delta = \{z \in C: |z| \le r\}$,

\[ \|p_n q_n - r_n\| \le \Big( \sum_{i > n/2} |a_i| r^i \Big) \Big( \sum_{j=0}^{\infty} |b_j| r^j \Big) + \Big( \sum_{i=0}^{\infty} |a_i| r^i \Big) \Big( \sum_{j > n/2} |b_j| r^j \Big). \]
Now, we know that $\sum_{i=0}^{\infty} |a_i| r^i$ and $\sum_{j=0}^{\infty} |b_j| r^j$ are finite. Let $M$ be a number larger than both. Given $\varepsilon > 0$, there are
(1) $N_1 > 0$ such that $\sum_{i > n} |a_i| r^i < \varepsilon$ if $n \ge N_1$,
(2) $N_2 > 0$ such that $\sum_{j > n} |b_j| r^j < \varepsilon$ if $n \ge N_2$,
(3) $N_3 > 0$ such that $\|p_n q_n - fg\| < \varepsilon$ if $n \ge N_3$.
These assertions follow from the known convergence in each case. Thus, if $n \ge \max(2N_1, 2N_2, N_3)$,

\[ \|r_n - fg\| \le \|p_n q_n - fg\| + \|p_n q_n - r_n\| < \varepsilon + \Big( \sum_{i > n/2} |a_i| r^i \Big)\Big( \sum_{j=0}^{\infty} |b_j| r^j \Big) + \Big( \sum_{i=0}^{\infty} |a_i| r^i \Big)\Big( \sum_{j > n/2} |b_j| r^j \Big) < \varepsilon + \varepsilon M + M \varepsilon = (2M + 1)\varepsilon, \]

and the proposition is concluded.

• EXERCISES

9. Verify, in the way suggested in the text, that $\cos^2 z + \sin^2 z = 1$ is true for all complex numbers $z$, and thus $\cosh^2 z - \sinh^2 z = 1$ is always true.

10. Find a power series expansion for these functions:
(a) $\exp(z^2)$
(b) $e^z \sin z$
(c) γ—^
(d) $\int \exp(t^2)\,dt$
(e) e~' j ^—{
(f) $\cosh z \cos z$
(g) smhz dt

11. Verify by multiplying the power series that $e^{z+w} = e^z e^w$.

12. From the addition formula for the exponential (Exercise 11), deduce the addition formulas for cos, sin, sinh, cosh.

• PROBLEMS

14. If $\{c_n\}$ is any bounded sequence, then the maximum number, every neighborhood of which contains infinitely many members of the sequence, is denoted $\limsup c_n$. Show that the radius of convergence of the series $\sum c_n z^n$ is

\[ R = (\limsup (|c_n|)^{1/n})^{-1}. \]
15. Expand $e^z$ in a power series about any point $z_0$.

16. Assuming that $(1 + x^2)^{-1/2}$ can be represented by a power series centered at 0, find it. Find the power series for arc cos $x$.

17. Assuming that $\tan x$ can be represented by a power series centered at 0 and using the equation $\tan x \cos x = \sin x$, find the power series expansion of $\tan x$.

5.6 Complex Differentiation

An easy property of the exponential function is

\[ \lim_{z \to 0} \frac{e^z - 1}{z} = 1. \tag{5.25} \]

For

\[ e^z = \sum_{n=0}^{\infty} \frac{z^n}{n!} = 1 + z + z^2 \Big( \sum_{n=2}^{\infty} \frac{z^{n-2}}{n!} \Big), \]

so

\[ \frac{e^z - 1}{z} = 1 + z \Big( \sum_{n=2}^{\infty} \frac{z^{n-2}}{n!} \Big). \]

Now the term in parentheses is a convergent power series, so is continuous at 0. Thus, writing the parenthesis as $g(z)$:

\[ \lim_{z \to 0} \frac{e^z - 1}{z} = 1 + \lim_{z \to 0} z g(z) = 1. \]

From (5.25) and the properties of the exponential it follows that

\[ \lim_{z \to 0} \frac{\exp(z_0 + z) - \exp(z_0)}{z} = \exp(z_0) \lim_{z \to 0} \frac{e^z - 1}{z} = \exp(z_0). \]

The student of calculus will recognize the limit on the left as a difference quotient and the entire equation as a replica of the behavior of the real exponential function. It might be a good idea to consider more generally such a process of differentiation on the complex plane. This turns out to be a
very significant idea, because there are many beautiful and useful ways to represent functions which are so differentiable.

Definition 3. Let $f$ be a complex-valued function defined in a neighborhood of $z_0$ in $C$. $f$ is differentiable at $z_0$ if

\[ \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \]

exists. In this case we write the limit as $f'(z_0)$. If $f$ is defined in an open set $U$ and differentiable at every point of $U$, we say that $f$ is differentiable on $U$.

The usual algebraic facts on differentiation hold true in the complex domain.

Proposition 4.
(i) Suppose $f$, $g$ are differentiable at $z_0$. Then so are $f + g$ and $fg$, with the derivatives given by

\[ (f + g)'(z_0) = f'(z_0) + g'(z_0), \qquad (fg)'(z_0) = f'(z_0) g(z_0) + f(z_0) g'(z_0). \]

(ii) Suppose $f$ is differentiable at $z_0$ and $f(z_0) \ne 0$. Then $1/f$ is differentiable at $z_0$ and $(1/f)'(z_0) = -f'(z_0)/f(z_0)^2$.
(iii) Suppose $f$ is differentiable at $z_0$ and $g$ is differentiable at $f(z_0)$. Then $g \circ f$ is differentiable at $z_0$ and $(g \circ f)'(z_0) = g'(f(z_0)) f'(z_0)$.

Proof. These propositions are so much like the corresponding propositions in calculus that their proofs will be left to the reader.

Examples

18. The function $z$ is clearly differentiable, and its derivative is 1 at every $z_0$. A constant function is differentiable with derivative zero. Since any polynomial is obtained from $z$ and constant functions by a succession of operations as described in Proposition 4(i), all polynomials are differentiable.

19. The function $\bar{z}$ is nowhere differentiable. For the difference quotient $(\bar{z} - \bar{z}_0)/(z - z_0)$, for $z \ne z_0$, is a point on the unit circle, and as $z$ ranges through a neighborhood of $z_0$ this difference quotient takes on all values on the unit circle, so it could hardly converge.
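Examples 18 and 19 can be seen numerically. The sketch below is an illustration only (a finite increment $h$ stands in for the limit, and the helper name `difference_quotient` is ours): it compares the difference quotient of $z^2$ with that of the conjugate function $\bar z$ when $h$ approaches 0 along the real axis and along the imaginary axis.

```python
def difference_quotient(f, z0, h):
    """The quotient (f(z0 + h) - f(z0)) / h for a complex increment h."""
    return (f(z0 + h) - f(z0)) / h


def square(z):
    return z * z


def conjugate(z):
    return z.conjugate()


z0 = 1 + 2j
for h in (1e-3, 1e-3j):  # a real increment, then an imaginary one
    print(difference_quotient(square, z0, h), difference_quotient(conjugate, z0, h))
```

For $z^2$ both quotients are close to $2z_0$, whatever the direction of $h$. For $\bar z$ the quotient is $\bar h / h$, which equals $1$ for real $h$ and $-1$ for purely imaginary $h$, so no limit can exist.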
Sum of a Power Series is Infinitely Differentiable

Our introduction to this section was essentially a proof that $e^z$ is everywhere differentiable. It is in fact true that the sum of a power series is differentiable in its disk of convergence. We now verify this basic fact.

Theorem 5.7. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ have radius of convergence $R$. Then $f$ is differentiable at every point in the disk of radius $R$, and

\[ f'(z) = \sum_{n=1}^{\infty} n a_n z^{n-1} \tag{5.26} \]

has the same radius of convergence as $\sum_{n=0}^{\infty} a_n z^n$.

Proof. $\limsup (n |a_n|)^{1/n} = \lim n^{1/n} \cdot \limsup (|a_n|)^{1/n} = \limsup (|a_n|)^{1/n}$, so the series $\sum n a_n z^{n-1}$ has the same radius of convergence as the given series. We must show that it represents the derivative of $f$. Fix a $z_0$, $|z_0| < R$, and choose $r$ with $|z_0| < r < R$. The series $\sum n |a_n| r^{n-1}$ converges, so given $\varepsilon > 0$ there is an $N$ such that $\sum_{n > N} n |a_n| r^{n-1} < \varepsilon$. Now consider the difference quotient defining $f'(z_0)$:

\[ \frac{f(z) - f(z_0)}{z - z_0} = \sum_{n=0}^{\infty} a_n \left( \frac{z^n - z_0^n}{z - z_0} \right) = \sum_{n=1}^{\infty} a_n \Big( \sum_{k=1}^{n} z^{n-k} z_0^{k-1} \Big). \]

If $|z| < r$ as well as $|z_0| < r$, then

\[ \Big| \sum_{n > N} a_n \sum_{k=1}^{n} z^{n-k} z_0^{k-1} \Big| \le \sum_{n > N} |a_n| \sum_{k=1}^{n} r^{n-k} r^{k-1} \le \sum_{n > N} n |a_n| r^{n-1} < \varepsilon. \]

Similarly, $\big| \sum_{n > N} n a_n z_0^{n-1} \big| < \varepsilon$. Thus,

\[ \Big| \frac{f(z) - f(z_0)}{z - z_0} - \sum_{n=1}^{\infty} n a_n z_0^{n-1} \Big| \le 2\varepsilon + \Big| \sum_{n=1}^{N} a_n \Big( \frac{z^n - z_0^n}{z - z_0} - n z_0^{n-1} \Big) \Big|. \]

Now, by continuity, as $z \to z_0$ the last term tends to zero. Thus, there is a $\delta > 0$ such that if $|z - z_0| < \delta$, the last term is less than $\varepsilon$. Thus, for $|z| < r$ and $|z - z_0| < \delta$ we have

\[ \Big| \frac{f(z) - f(z_0)}{z - z_0} - \sum_{n=1}^{\infty} n a_n z_0^{n-1} \Big| < 3\varepsilon, \]

which proves that the limit of the difference quotient exists and is given by (5.26).
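A numerical illustration of Theorem 5.7 may be helpful. The sketch below assumes the geometric series $\sum z^n = (1 - z)^{-1}$ as test case, truncated at 200 terms; differentiating the partial sum term by term should then reproduce $(1 - z)^{-2}$ at any point well inside the unit disk. The helper names are ours.

```python
def termwise_derivative(coeffs):
    """Coefficients of sum n a_n z^(n-1), given the coefficients a_0, a_1, ..."""
    return [n * coeffs[n] for n in range(1, len(coeffs))]


def partial_sum(coeffs, z):
    """Evaluate sum coeffs[n] z^n by Horner's rule."""
    total = 0
    for c in reversed(coeffs):
        total = total * z + c
    return total


geometric = [1] * 200          # truncated series for 1 / (1 - z)
z = 0.3 + 0.2j                 # a point with |z| < 1
print(abs(partial_sum(geometric, z) - 1 / (1 - z)))
print(abs(partial_sum(termwise_derivative(geometric), z) - 1 / (1 - z) ** 2))
```

Both discrepancies are at the level of rounding error, since the neglected tails are of the order of $|z|^{200}$.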
In particular, since $f'$ is given by a convergent power series, it also is differentiable, with derivative $f''(z) = \sum n(n-1) a_n z^{n-2}$, and so forth. We thus obtain these results, which form the complex version of Taylor's theorem for sums of convergent power series.

Corollary 1. Let $f(z) = \sum a_n z^n$ have radius of convergence $R$. Then $f$ is infinitely differentiable in $\{z: |z| < R\}$. Furthermore, for every $k$ the $k$th derivative $f^{(k)}$ is given by a convergent power series,

\[ f^{(k)}(z) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1)\, a_n z^{n-k}. \]

Corollary 2. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be convergent in a disk about 0. The coefficients $\{a_n\}$ are uniquely determined by $f$:

\[ a_n = \frac{f^{(n)}(0)}{n!}. \]

Notice that the definition of complex derivative is a genuine generalization of the differentiation of functions of a real variable. Thus the same corollaries hold for functions of a real variable represented by power series:

Corollary 3. If $f(x) = \sum_{n=0}^{\infty} a_n (x - x_0)^n$ in a neighborhood of $x_0$, then $f$ is infinitely differentiable at $x_0$ and

\[ a_n = \frac{f^{(n)}(x_0)}{n!}, \qquad f^{(k)}(x) = \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1)\, a_n (x - x_0)^{n-k}. \]

These corollaries are easily derived from the theorem and their proofs are left to the student. Notice that the implication of Corollary 2 is that the coefficients of a power series representation of a function are uniquely and directly determined by the function. In particular, a function cannot be written as the sum of a power series in more than one way. This observation allows us to easily verify the identity

\[ \cos^2 z + \sin^2 z = 1. \tag{5.27} \]

For the function $\cos^2 z + \sin^2 z$ is a polynomial in functions which are sums of power series and thus is the sum of a power series. Its coefficients can
be computed according to Corollary 3 just by letting $z$ take on real values. But the right-hand side of (5.27) is the Taylor expansion of $\cos^2 z + \sin^2 z$ for real $z$; thus it must be the Taylor expansion for all $z$. Hence (5.27) is always true.

The Cauchy-Riemann Equation

It is of value to compare the notion of complex differentiation with that of differentiation of functions defined on $R^2$, since $R^2 = C$. Suppose that $f$ is a complex-valued function defined in a neighborhood of $z_0 = x_0 + i y_0$. If $f$ is differentiable as a function of two real variables, then the differential $df(x_0, y_0)$ is defined and is a complex-valued linear function on $R^2$. If $f$ is also complex differentiable, then

\[ f'(z_0) = \lim_{z \to z_0} \frac{f(z) - f(z_0)}{z - z_0} \tag{5.28} \]

exists. Let $z \to z_0$ along the horizontal line. Then (5.28) specializes to

\[ f'(z_0) = \lim_{x \to x_0} \frac{f(x + i y_0) - f(x_0 + i y_0)}{x - x_0} = \frac{\partial f}{\partial x}(x_0, y_0). \tag{5.29} \]

If we let $z \to z_0$ along a vertical, we also have

\[ f'(z_0) = \lim_{y \to y_0} \frac{f(x_0 + i y) - f(x_0 + i y_0)}{i(y - y_0)} = \frac{1}{i} \frac{\partial f}{\partial y}(x_0, y_0). \tag{5.30} \]

Thus the right-hand sides of (5.29) and (5.30) are the same. In conclusion, a complex differentiable function must satisfy (when considered as a function of two real variables) this relation:

\[ \frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y}. \tag{5.31} \]

This is called the Cauchy-Riemann equation. More precisely, the Cauchy-Riemann equations are found by writing $f = u + iv$ and splitting into real and imaginary parts. Let us record this important fact.

Theorem 5.8. Let $f$ be a complex differentiable function in a domain $D$. Split $f$ into real and imaginary parts and consider $f$ as a function of two real
variables ($z = x + iy$). Then these partial differential equations hold in $D$:

\[ \frac{\partial f}{\partial x} = \frac{1}{i} \frac{\partial f}{\partial y} \tag{5.32} \]

\[ \frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x} \tag{5.33} \]

Proof. Equation (5.32) was observed above. Equation (5.33) follows from (5.32) and the identities

\[ \frac{\partial f}{\partial x} = \frac{\partial u}{\partial x} + i \frac{\partial v}{\partial x}, \qquad \frac{\partial f}{\partial y} = \frac{\partial u}{\partial y} + i \frac{\partial v}{\partial y}. \]

Notice that when $f$ is complex differentiable, its differential is given by

\[ df(z_0, (r + is)) = \frac{\partial f}{\partial x}(z_0)\, r + \frac{\partial f}{\partial y}(z_0)\, s = f'(z_0)\, r + i f'(z_0)\, s = f'(z_0)(r + is). \]

Thus the differential of a complex differentiable function is a complex linear complex-valued function. We shall show, via the techniques of the next few chapters, that a complex differentiable function can be written as the sum of a convergent power series. Thus, just by virtue of the differential being complex linear, the function has derivatives of all orders and is the sum of its power series.

• PROBLEMS

18. Prove Proposition 4.

19. Prove Corollaries 1 and 2 of Theorem 5.7.

20. Show that if $f$ is an infinitely differentiable function on an interval $(-\varepsilon, \varepsilon)$, and there is an $M > 0$ such that

\[ |f^{(n)}(x)| \le M \quad \text{for all } n, \text{ all } x, \ -\varepsilon < x < \varepsilon, \]

then $f$ is the sum of a power series which converges in the unit disk.

21. Suppose $f$ is a complex differentiable function in a domain in the plane. Show that:
(a) if $f$ is real-valued it is constant.
(b) if $|f|$ is constant, then $f$ is constant.
22. Write the Cauchy-Riemann equations in polar coordinates. (Hint: Differentiate along the ray and circle through a point.)

23. Suppose $f_1, \ldots, f_k$ are given by convergent power series in a disk $\Delta$. If $F$ is a polynomial in $k$ variables such that

\[ F(f_1(z), \ldots, f_k(z)) = 0 \tag{5.34} \]

for real $z$, then (5.34) is true for all $z$ in $\Delta$.

24. Compute the limits of these quotients as $z \to 0$: (a) (b) (c) arc tan ζ ζ sin hz sin ζ cos ζ — 1 (d) (e) cos ζ — 1 cos ζ sin z —tan ζ (f) ; η = 0,1,2,3,4

25. Suppose $f$ is a differentiable complex-valued function of two real variables in a domain $D$. Show that $f$ is complex differentiable if and only if the differential $df(z_0)$ is complex linear for all $z_0 \in D$.

26. Suppose that $f$ is twice differentiable in $D$, and is complex differentiable. Show also that $f'$ is complex differentiable. Supposing that $(f')' = 0$, show that $f$ is a quadratic polynomial in $z$.

27. If $f = u + iv$ is complex differentiable in $D$ and twice differentiable, then

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 = \frac{\partial^2 v}{\partial x^2} + \frac{\partial^2 v}{\partial y^2}. \]

5.7 Differential Equations with Analytic Coefficients

A function which can be represented as the sum of a convergent power series at a point $a \in C$ will be said to be analytic at $a$. We now return to the study of linear differential equations in order to answer some of the questions posed in Section 5.4. We can use the information in Section 5.6 to do this and to provide the sought-for estimates. In particular, we shall verify the following fact.
Proposition 5. Suppose $h, g_0, \ldots, g_{k-1}$ are analytic at 0. Then the solution of the differential equation

\[ y^{(k)} + \sum_{i=0}^{k-1} g_i y^{(i)} + h(x) = 0, \qquad y(0) = a_0, \ \ldots, \ y^{(k-1)}(0) = a_{k-1} \]

is also analytic at 0; that is, in some disk centered at 0, it is the sum of a convergent power series whose coefficients can be recursively calculated from the differential equation.

We already know, from Section 5.4, how to compute the Taylor coefficients of the solution; our business here is to show that the resulting series does in fact converge. This, of course, involves producing the kind of estimate required by Theorem 5.6. Suppose then that $h, g_0, \ldots, g_{k-1}$ are analytic in the interval $|x| < R$. Then

\[ h(x) = \sum_{n=0}^{\infty} a_n x^n, \qquad g_i(x) = \sum_{n=0}^{\infty} a_n^i x^n, \]

and for some positive number $M$, $|a_n| \le M R^{-n}$, $|a_n^i| \le M R^{-n}$ for all $i$ and $n$. We shall obtain the desired estimate in terms of $M$, $R$ and the initial conditions. For simplicity, we shall do the homogeneous case only ($h = 0$), leaving the general case for the reader (Problem 27). If

\[ f(x) = \sum_{n=0}^{\infty} c_n x^n \]

is the desired solution, the first $k$ coefficients $c_0, \ldots, c_{k-1}$ are fixed by the initial conditions ($c_j$ is the $j$th initial value divided by $j!$), and the rest of the coefficients are found from these equations:

\[ \sum_{n=k}^{\infty} n(n-1) \cdots (n-k+1)\, c_n x^{n-k} + \sum_{i=0}^{k-1} \Big( \sum_{n=0}^{\infty} a_n^i x^n \Big) \Big( \sum_{n=i}^{\infty} n(n-1) \cdots (n-i+1)\, c_n x^{n-i} \Big) = 0. \tag{5.35} \]

Surely, the reader now has a pain in his stomach similar to that of the author as he wrote this equation. Patience, dear reader, the fun has just begun! Equating the coefficient of each power $x^{m-k}$ to zero, we obtain this recursive system of equations for the
436 5 Series of Functions coefficients: m{m — 1) · · · (m — k)cm or -1 Cm ft-1 m-ft + Σ Σ аа1(т+ i-(k+*)) •••(m—(k + a))cm+i_(ft+e) =0 Σ Σ (™ + '-(* + a)) /η · · · (m — к) ι = ·b »=о ···(»»-№ + «))a«lCii.+ i-«+«) (5.36) By the restraints on ι and α we have m + ι — (к + <х)<т + к — 1 — к =т— 1, so the highest subscript of с on the right is m — 1. Thus, given c0,..., ck_i we can solve Equations (5.36) successively. We now try to find an estimate. For this purpose assume Μ > 1, and let С=тах(к,ЩЯ- D"1^ - 1),^,(^V", 0 < у < A: - Л The last condition on С is written so as to assure that ffl \cj\^i — \ fory = 0,...,*-l (5.37) We now prove this inequality for all /я, by induction. Thus we use (5.36), assuming (5.37) for all η < m: J »-l m-k \Cm\ <.— 2 Σ Iй"'I |Cm+l-(k+a)| 7Я 1 = 0 α= ο < 7Я 1 '-'"-'M/CMW1-11*" 72 ifb «ΐΌ Λ" \~r) 1 MmCm-1 *=ι (/η- k+ 1) ~ /η Λ"1-" itb JR* Now 2?Ξί Λ"1 = OR"* - 1)CR_1 - Ι)"' = R(R - l)-l(Rk - l)R~K. Thus m-k+\Mm R(Rk-l) lCM\m by definition of C. By this estimate we see that f(x) = 2Г= о с" ■*" converges in the interval {x: \x\ <R(CM)~'}, and in that interval is the solution to our problem.
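The recursive computation of the coefficients can be sketched in a few lines. The following is a minimal illustration, not the text's general k-th order scheme: it assumes the second-order homogeneous case y'' + g1·y' + g0·y = 0, and the function name `series_solution` and its argument layout are hypothetical choices made for this example. Equating the coefficient of x^(m-2) to zero gives c_m in terms of c_0, ..., c_(m-1).

```python
from fractions import Fraction


def series_solution(g0, g1, y0, y1, n_terms):
    """Taylor coefficients c_0..c_{n_terms-1} of the solution of
    y'' + g1*y' + g0*y = 0 with y(0) = y0, y'(0) = y1.

    g0 and g1 are lists of Taylor coefficients at 0 of the coefficient
    functions; entries beyond the end of a list are treated as zero.
    """
    def coeff(g, j):
        return g[j] if j < len(g) else Fraction(0)

    c = [Fraction(y0), Fraction(y1)]
    for m in range(2, n_terms):
        # Coefficient of x^(m-2):
        #   m(m-1) c_m + sum_j g1_j (m-1-j) c_{m-1-j} + sum_j g0_j c_{m-2-j} = 0
        s = Fraction(0)
        for j in range(m - 1):  # j = 0, ..., m-2
            s += coeff(g1, j) * (m - 1 - j) * c[m - 1 - j]
            s += coeff(g0, j) * c[m - 2 - j]
        c.append(-s / (m * (m - 1)))
    return c[:n_terms]


# y'' + y = 0, y(0) = 1, y'(0) = 0 has the solution cos x.
cos_coeffs = series_solution([1], [], 1, 0, 6)
```

Exact rational arithmetic is used so that the computed coefficients can be compared directly with known Taylor expansions; for y'' + y = 0 the recursion reproduces the coefficients of cos x.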
5.7 Differential Equations with Analytic Coefficients 431 We might perhaps have made a better estimate by more clever substitutions; but our above estimates were sufficient for the results desired In any particular case we could usually be clever and obtain even better estimates. Examples 20. y' + exy' + xy = 0, y(0) = 0, уЩ = 1. We will find a polynomial which approximates the solution to within 10~5 on the interval [0.01, 0.01]. Let £ c„x" be the supposed solution. By substituting the series we obtain oo / oo χη\ /oo \ oo Σφ-ικ^2+ Σ ι Σ^*-1 + Σ^*η+1 = ο (5. η = 2 \η = 0 И!/ \„=ι / „=0 38) The initial conditions give c0 = 0, ct = 1. Equation (5.38) becomes -1 cm = Thus ■1 lm'2 1 \ сг = -¥\ = -2 c3 = -|(2c2 + c, + c0) = 0 Q = -ti(3c3 + 2c2 + |Cl + c.) = 2V C5 = - A(4c4 + 3f3 +C2+ icx + C2) = Tio and so forth. The question is not really what the coefficients are (that is to be left to a machine)—but how many coefficients need to be computed. The coefficients appear to be bounded (we could in fact show that they must be, cf. Problem 29). Let's try to prove that |c„| < К for all η by induction. We have 1 m~2 1 m \,f0 i! so long as m > 4. Thus we may take for К a bound for the first four terms, that is, к = 1/2. Then the difference between the solution
438 5 Series of Functions and the fcth partial sum of its Taylor expansion is dominated by n>k 2„^k 2 1 - \x\ for |x| < 1. The interval we are concerned with is |x| < 10-1 so our bound on the error is 2 10k 9 18 This is less than 1СГ5 if к = 6, thus (computing also c6) the solution differs from v _ lv2 , l_v4 , l_v5 1_ 6 л 2Л τ JtJi τ ΙίΟΛ 3 60Λ by at most 10~5 for all values of χ in [-0.01, 0.01]. 21. Suppose we needed that good an estimate in the interval [— 1, 1]. It is easy to see that just knowing that the coefficients are bounded is not good enough. We have to know that \c„\ < Kr" for some r < 1 and some K. Let's try r = \. That is, we attempt to verify by induction that |c„| < 2~" for all n. Now, using the equation defining {c„}, \CJ<— 7Г > · „_,_, +^-T- 1 lm~2 -— Σ m{m - 1) \ ,sb κ ι /m-2? \ < e2 + 4 К m This is less than 2~mK as soon as m > 2(e2 + 4), or m > 26. Thus the induction step proceeds as soon as m > 26; we need only choose К so that the inequality holds for all m < 26 (K= 2 will do). Thus |c„| < 1/2"-1 for all n. The desired solution differs in the interval [—1, 1] from its fcth-order Taylor polynomials by Σαι)" — у - < - 2ft — 1 l-t on — r\k n = 0 ί £ ^ Σ lCkl - τΤ=ϊ Σ on - oft-2
5.7 Differential Equations with Analytic Coefficients 439 This is less than 10 5 if к is 19. Thus we need to compute 19 terms of the Taylor expansion to find the desired approximation. 22. Compute the solution of y" + (r-zrAy' + exy = 0 X0) = 1 /(0) = 0 to an accuracy of 1СГ3 in the interval [ — \, \\. Let f(x) = Yj°=0 cnx" be the solution. We have these equations for the solution: c0= \,c1 = 0 -1 c„ = n(n - 1) \,=o n-2 n-2 ^ Σ (П- 1 -/)£·„_!_,+ Σ -,Cn --" ι = ο Ι! -) (5.39) We will show by inductions that the {c„} are bounded. Suppose that |c„| ^ К for all η < m. Then kJ < m-2 J (от -1 - OK + Σ -,κ 1 /m_2 ( У v ,_ , ^ /п(от - 1) \ ,^Ό ι=ο ι! Κ Ζ"1-1 \ -К \т(т - 1) ^-7—ή Ei + e =-7— m(m — l)\j = i / m(m— n(m — 1) + e К 1 : + _2 от(/и — 1). <X as soon as от > 3. Thus we can take К as a bound for the first four terms. We have from (5.39) c2= -\, c3 = 0, so we may take К = 1. Then the estimate of the remainder after к terms in the interval Σ κ\ w" ζ Σ ί = ψ This is less than 10~3 when к = 10, so we need 11 coefficients. We compute r — 1 :2"0 /■ —-ДД- etc.
440 5 Series of Functions Up to six terms (giving at most an error of 1/64), our solution is x2 x4 x5 13x6 1 ~T+12 + 20 + ^20 +'" • EXERCISES 13. Find a power series expansion for the general solution in a neighborhood of 0 for this equation, (1 - x2)/ - 2xy' + k(k + \)y = 0 14. Find a power series expansion for the general solution of / - Ixy' + 2ky = 0 15. Find the power series for the function y=f(x) such that (a) y'- + e'y = x\ y(Q) = l, /(0) = 0 (b) (У)2=л Х0) = 0 (c) {у'У=УУ"> X0)=0, /(0)=1 (d) / + 2xy2 = 0 16. How many terms of the power series for the solution у do we need: (a) for an accuracy of 10"3 in the interval (— i, i) in Exercise 3(a)? (b) for an accuracy of 10"5 in the interval (—10, 10) in Exercise 3(a)? (c) for an accuracy of 10"5 in the interval (—0.1, 0.1) in Exercise 3(b)? 17. For what к are the solution of Equations (5.1), (5.2) polynomials? • PROBLEMS 28. Generalizing the argument in the text prove this theorem: Theorem. Ifh, g0,..., gk-i are analytic in an interval (— R, R) about the origin, then any solution of the differential equation y(l,> + kIgtyl0 = h r = 0 can be expressed as the sum of a convergent power series in a neighborhood of the origin. 29. Suppose that g, h are convergent power series in some disk {\z\ < R} with R > 1. Show that the solution of the linear differential equation y" + gy' + hy = o (5.40)
is the sum of a convergent power series with bounded coefficients.

30. If the power series $g(x) = \sum a_n x^n$, $h(x) = \sum b_n x^n$ both have infinite radius of convergence, then so does the series expansion of the solution of (5.40).

5.8 Infinitely Flat Functions

Not all functions are susceptible to the kind of Taylor series analysis which we have been doing. A first requirement is that the function have derivatives of all orders; even that, however, is insufficient. Another glance at Theorem 5.6 will remind the reader that there is a behavior requirement on these successive derivatives in order that the given function be the sum of its Taylor expansion. We shall show by example that there are infinitely differentiable functions which are not sums of power series. First, we shall make the notion of analyticity precise.

Definition 4. Let $f$ be a complex-valued function defined in an open set $U$. Let $a \in U$. $f$ is analytic at $a$ if there is a ball $\{z : |z - a| < r\}$ centered at $a$ such that $f$ is the sum of a convergent power series in this ball. $f$ is analytic in $U$ if $f$ is analytic at every point of $U$.

We have deliberately stated this definition without reference to the domain of definition of the function; it applies equally well to functions of a real or complex variable. The only functions which we know to be analytic are the polynomials and $e^z$. For example, if $f$ is the sum of a convergent power series at the origin, we do not yet know that we can expand $f$ in a series of powers of $(z - a)$ with $a$ any other point in the disk of convergence. We shall see in the next chapter that this is the case.

We have already seen that an analytic function has derivatives of all orders (is $C^\infty$) and now we will produce a $C^\infty$ function which is not analytic. The clue to this function is given by the following fact, which follows from l'Hospital's rule.

Proposition 6. $\lim_{t \to \infty} P(t)e^{-t} = 0$ for any polynomial $P$.

Proof. Problem 31.
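The function we have in mind (Figure 5.1) is defined by

$$\sigma(x) = \begin{cases} \exp\left(-\dfrac{1}{x}\right) & x > 0 \\ 0 & x \le 0 \end{cases} \tag{5.41}$$

[Figure 5.1: graph of $\sigma$]

$\sigma$ is certainly infinitely differentiable at any point $x_0 \ne 0$, so we need only consider its behavior at 0. Now,

$$\sigma^{(n)}(x) = 0 \qquad x < 0, \text{ all } n$$

Thus all derivatives of $\sigma$ from the left exist at 0 and are zero. We have to show that all derivatives from the right exist and are zero. More precisely, we must prove that for all $n$,

$$\lim_{\substack{x \to 0 \\ x > 0}} \frac{\sigma^{(n)}(x) - \sigma^{(n)}(0)}{x} = 0 \tag{5.42}$$

We do this by induction. The case $n = 0$ is easy:

$$\lim_{\substack{x \to 0 \\ x > 0}} \frac{\sigma(x)}{x} = \lim_{\substack{x \to 0 \\ x > 0}} \frac{1}{x}\exp\left(-\frac{1}{x}\right) = \lim_{t \to \infty} t e^{-t} = 0$$

To do the general case we must have some idea what $\sigma^{(n)}(x)$ looks like for $x > 0$. Now

$$\sigma'(x) = \frac{1}{x^2}\exp\left(-\frac{1}{x}\right)$$

$$\sigma''(x) = \left(-\frac{2}{x^3} + \frac{1}{x^4}\right)\exp\left(-\frac{1}{x}\right)$$

$$\sigma^{(3)}(x) = \left(\frac{6}{x^4} - \frac{6}{x^5} + \frac{1}{x^6}\right)\exp\left(-\frac{1}{x}\right)$$

A pattern seems to be developing. For each $n$ there is a polynomial $P_n$ such that

$$\sigma^{(n)}(x) = P_n\left(\frac{1}{x}\right)\exp\left(-\frac{1}{x}\right) \qquad \text{for } x > 0 \tag{5.43}$$

This can be verified by induction. Assuming (5.43), we compute

$$\sigma^{(n+1)}(x) = \left[-\frac{1}{x^2}P_n'\left(\frac{1}{x}\right) + \frac{1}{x^2}P_n\left(\frac{1}{x}\right)\right]\exp\left(-\frac{1}{x}\right) = P_{n+1}\left(\frac{1}{x}\right)\exp\left(-\frac{1}{x}\right)$$

where $P_{n+1}(X) = X^2\,(P_n(X) - P_n'(X))$. Now that we have this, (5.42) follows immediately from Proposition 6, using the induction hypothesis $\sigma^{(n)}(0) = 0$:

$$\lim_{\substack{x \to 0 \\ x > 0}} \frac{\sigma^{(n)}(x) - \sigma^{(n)}(0)}{x} = \lim_{\substack{x \to 0 \\ x > 0}} \frac{1}{x}P_n\left(\frac{1}{x}\right)\exp\left(-\frac{1}{x}\right) = \lim_{t \to \infty} t P_n(t) e^{-t} = 0$$

Thus $\sigma$ is also infinitely differentiable at 0. But it is certainly not analytic. Its Taylor expansion is $\sum_{n=0}^{\infty} 0 \cdot x^n$, which converges to $\sigma(x)$ only for $x \le 0$ and provides a poor means for approximating the value of $\sigma(x)$ for $x > 0$. However, the fact that infinitely differentiable functions exist with this property has its bright side. The following construction will prove to be useful.

Lemma. Given $a < b$, there is a $C^\infty$ function $\tau_{ab}$ such that

(i) $0 \le \tau_{ab}(x) \le 1$ for all $x$,
(ii) $\tau_{ab}(x) > 0$ if $a < x < b$,
(iii) $\tau_{ab}(x) = 0$ if $x \ge b$ or $x \le a$.

Proof. (See Figure 5.2.) $\sigma(x(1 - x))$ has the required properties of $\tau_{01}$. We then define

$$\tau_{ab}(x) = \tau_{01}\left(\frac{x - a}{b - a}\right) = \sigma\left(\left(\frac{x - a}{b - a}\right)\left(1 - \frac{x - a}{b - a}\right)\right)$$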
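The two constructions of this section are easy to experiment with numerically. The sketch below is only an illustration in floating-point arithmetic; the function names `sigma` and `tau` are chosen here to mirror the symbols of the text and are not part of it. It evaluates the infinitely flat function of (5.41) and the bump function of the Lemma.

```python
import math


def sigma(x):
    """The infinitely flat function of (5.41): exp(-1/x) for x > 0, else 0."""
    return math.exp(-1.0 / x) if x > 0 else 0.0


def tau(x, a, b):
    """Bump function of the Lemma: positive on (a, b), zero elsewhere."""
    s = (x - a) / (b - a)          # rescale [a, b] to [0, 1]
    return sigma(s * (1.0 - s))    # tau_01(s) = sigma(s(1 - s))


# sigma(x) is tiny compared with any power of x near 0, which is the
# numerical face of the fact that every derivative vanishes at 0.
flatness_ratio = sigma(0.01) / 0.01 ** 10
```

Since s(1 - s) never exceeds 1/4, the bump function is bounded above by exp(-4), so property (i) of the Lemma holds with room to spare.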
[Figure 5.2: graph of $y = \tau_{ab}(x)$]

Theorem 5.9. Let $[a, b]$ be a given interval and $U_1, \dots, U_n$ a finite collection of open intervals covering $[a, b]$. There exist $C^\infty$ functions $\rho_i$ such that

(i) $0 \le \rho_i(x) \le 1$ for all $x \in R$, all $i$,
(ii) $\rho_i(x) = 0$ if $x \notin U_i$,
(iii) $\sum_i \rho_i(x) = 1$ for all $x \in [a, b]$.

Proof. Let $U_i = (a_i, b_i)$, and take $\tau_i = \tau_{a_i b_i}$. Then $\tau(x) = \sum_i \tau_i(x) > 0$ if $x \in [a, b]$. Let

$$\rho_i(x) = \begin{cases} \dfrac{\tau_i(x)}{\tau(x)} & x \in (a_i, b_i) \\ 0 & x \notin (a_i, b_i) \end{cases}$$

The $\rho_i$ then have the desired properties.

PROBLEMS

31. Prove that for any polynomial $P$, $\lim_{t \to \infty} P(t)e^{-t} = 0$.

32. Let $\sigma$ be defined by (5.41). Define

$$\omega(x) = \frac{\int_0^x \sigma(t(1 - t))\,dt}{\int_0^1 \sigma(t(1 - t))\,dt}$$

Show that (a) $\omega$ is $C^\infty$, (b) $0 \le \omega(x) \le 1$ for all $x$, (c) $\omega(x) = 0$ if $x \le 0$, (d) $\omega(x) = 1$ if $x \ge 1$.

33. Using Theorem 5.9 it can be shown that any continuous function is the limit of $C^\infty$ functions. Let $f \in C([0, 1])$ and $\varepsilon > 0$. Find a $C^\infty$ function $g$ such that $\|f - g\| < \varepsilon$. Here's how to do it. First pick an integer $N > 0$ such that $|f(x) - f(y)| < \varepsilon/2$ if $|x - y| < 1/N$. Now cover the interval $[0, 1]$ by the intervals
5.9 Summary 445 С/о Un, where and let pu ..., pN be the corresponding functions of Theorem 5.9. Let For any χ, χ is in only two of the intervals {[/,}, say Ur, Ur+i. Then **>-!/(*)»<*> I/(*)-*(*) I =M*) fix)-f{$\+»+]fix)-f(cir) ε ε <2 + 2 5.9 Summary Let (Λ) be a sequence of continuous functions. The series formed from the/ft is the sequence of sums {^jUi/*}· If this sequence converges, we say that the series converges and denote the limit by £"=1/&. The series converges absolutely if £it°=1||/J < oo. The Cauchy criterion for series asserts that the series converges if and only if the sums ||^m=n+i /J can be made arbitrarily small by choosing m, η sufficiently large. comparison test. If there is a sequence {pk} of positive numbers and an integer N > 0 such that (i) ΙΙΛΙΙ<Λ for k>N (ii) Хл<оо then Σ/ь converges absolutely. integration. If £/„ converges, so does £ \xaf„ and FUNDAMENTAL THEOREM OF ALGEBRA. If Ρ IS a polynomial: P(z) = a„z" + ··· + αχζ + α0 (5.44)
446 5 Series of Functions with an φ 0. Then P(z) has a complex root. If ru ..., rk are all the roots of P, then corresponding to each root there is a positive integer m, (called the multiplicity) such that (i) mt + ■■■ + mk = η (ii) i(z) = (z - r,)"" ■ ■ ■ (z - г*)™" (5-45) To the polynomial Ρ given by (5.44) we associate the constant coefficient differential operator LP: £ρ(/)=α„/<η) + ···+α1/'+α0/ Ρ is called the characteristic polynomial of LP. These formulas are valid: Lp+Q = Lp + Lq LpQ = LpLQ If (5.45) is the factorization of P, then the kernel of LP, the collection of solutions of LP(f) = 0, is spanned by the functions > · · · > „Г2Х „тг—1</2* c Λ c „ГкХ ,4.-1/*" c Λ c Let {c„} be a sequence of complex numbers. There is a nonnegative number R (called the radius of convergence of the power series ]T cnz") with these properties: (a) £ c„z" diverges for \z\ > R. (b) £ cnz" converges absolutely in {\z\ < r} for r < R. (c) JR=[limsup(|cn|)1"']-1. If/(z) = £ a„z", g{z) = £ 6„z" in the disk {|z| < r}, then /(z) + ff(z) = Σ К + ^)z" л = 0 /(z)ff(z)= £ ( Σ аяЬяУ k = 0 \n + m-k J in that same disk.
5.9 Summary 447 If/ is a complex-valued function defined near z0 in C, we say that/м complex differentiable at z0 if hm =/ (z0) ζ - z0 exists. The sum of a convergent power series is complex differentiable at every z0 in its disk of convergence. Furthermore, the derivative is the sum of the derived series: /(ζ)=Σ>„ζ" /'(ζ)= Σ^ζ"-1 л=0 л=1 Thus the sum of a convergent power series is infinitely differentiable. A differentiable complex-valued function of two real variables is complex differentiable if and only if it satisfies the Cauchy-Riemann equations: .(8f\ dx If h, g0,..., gk can be represented as sums of convergent power series in a disk centered at zero, then the same is true for all solutions of the differential equation fc-l Σ 1 = 0 /'+ Σ^(" + Λ = 0 Furthermore, once given the initial conditions Я0) = а0,---^('1~1,(0) = ак-, the coefficients of the power series can be recursively calculated using the differential equation. Given any finite covering of the interval [a, b~] by open intervals [/,,..., U„ we can find C00 functions p,,..., p„ such that (a) 0<p,<l (b) p, = 0, outside U, (c) X pt(x) = 1 for all χ e [a, b~] These functions are called a partition of unity on [a, b~] subordinate to the cover Uu ..., U„.
448 5 Series of Functions • FURTHER READING I. I. Hirshman, Infinite Series, Holt, Rinehart and Winston, New York, 1962. H. Cartan, Elementary Theory of Analytic Functions of One or Several Complex Variables, Addison-Wesley, Reading, Mass., 1963. This text develops the subject of complex analysis from the point of view of power series. It also contains a complete discussion of the theorem on existence of solutions of analytic differential equations. Further material can be found in T. A. Bak and J. Lichtenberg, Mathematics for Scientists, W. A. Benjamin, Inc., New York, 1966. Kreider, Kuller, Ostberg, Perkins, An Introduction to Linear Analysis, Addison-Wesley, Reading, Mass., 1966. W. Fulks, Advanced Calculus, John Wiley and Sons, New York, 1961. M. R. Spiegel, Applied Differential Equations, Prentice-Hall, Englewood Cliffs, N. J., 1958. • MISCELLANEOUS PROBLEMS 34. Find a sequence {/„} of continuous nonnegative real-valued functions defined on the interval (0,1) such that f(x) = 2™= ι /■(*) exists for all xe [0, 1], but/is not continuous. 35. For \z\ < 1, define lnz=li Show that for all such ζ, ζ = 1 — exp(ln(l — z)). 36. Show that the series „t-i (z - n)2 converges to a complex differentiable function in the domain C-{1,2, ...,n,...}. 37. Show that exp(z) = lim (1 + z/m)m (Hint: Compute the power m~* 00 series expansion of (1 + zjmf.) 38. Show that a real polynomial of odd degree always has a real root. 39. Let ω = sxp(2nijn). Show that 1 - Z" = (1 - ωζ)(1 - ω2ζ) ··■(!- ω"ζ)
5.9 Summary 449 40. Let Ρ be a polynomial. Show that Ρ is the square of another polynomial if and only if every root of Ρ occurs with even multiplicity. 41. If P, Q are two polynomials, Ρ divides Q if and only if every root of Ρ is a root of Q with no larger multiplicity. 42. Suppose Ρ is a polynomial of degree at least two. Show that there is а с such that P(z) — с = 0 has at least one multiple root. {Hint: Consider P' as defined in Problem 10. If P'(a) = 0, take с = P(a).) 43. Let/i,...,/, be functions in C(X). Show that these functions are independent if and only if there are points x,,...,x„ such that the matrix (/i(a-j)) is nonsingular. 44. Show that the functions e", xe™, ..., x"e" are independent. 45. Show that if P, Q are polynomials, S(LP) с S(LQ) if and only if Ρ divides Q. 46. If Ρ is a polynomial of degree at least two, there is а с е С such that the equation LPf= c/has a solution of the form xe". 47. If a linear differential equation has polynomial coefficients, it has global solutions on all of R. 48. Let {c} be a sequence of complex numbers such that 2 k»l < °°· Let /(z) =2"=o c„z" Prove that \c0\ is not a relative maximum of |/|. unless all other coefficients vanish. 49. Suppose the function f(x, y) = x2 - y2 + w(x, y) is complex differentiable. Find v. 50. If / is a polynomial in x, у which is complex differentiable, then /(x, y) has the form Q{x + ty), where Q is a polynomial. {Hint: Substitute x = (z+ z)/2, у = (ζ — z)/2, and use the Cauchy-Riemann equation.) 51 Suppose /is a C2 complex-valued function defined on a domain D in C. Show that if / and fz are both harmonic, then / is complex differentiable. 52. If/, g are complex differentiable and |/|2 + \g\2 is constant, then both / g are constant. 53. Suppose that / is a one-to-one mapping of a domain D <= С onto Δ <= С Let g: Δ -* D be the inverse of / Show that if / is complex differentiable, so is g. 54. We may consider the function e* as a mapping from the plane to the plane. 
Let ζ = χ + iy, и = Re ег, υ = Im e1; that is и = e* cos у ν = e* sin у (a) Show that this mapping maps the lines χ = const, on the circles centered at the origin, the lines у = const, go onto the rays through the origin. (b) Show that in any interval {a < у < a + 2π} this mapping takes every value precisely once.
450 5 Series of Functions (c) In particular, ez maps the horizontal strip {— π < у < π} one-to- one onto the entire plane except for the negative real axis. Call this domain D. Define the complex logarithm logz: D-*{—π <y <π] a.s the inverse of this mapping. Show that log ζ is complex differentiable and (log)'z = - Show also that log ζ can be represented by a power series centered at 1 in the disk {|z— 1| < 1}. (Recall Miscellaneous Problem 35) Notice that this provides a way for extending real functions to the complex domain besides that of power series. For example, the power series expansion about 1 of log χ extends it only to the unit disk centered at 1. The above extension of log is defined in the entire plane except the negative real axis. This process is called analytic continuation. 55. Consider z2 as a mapping of the plane into the plane Show that it maps the open right half plane one-to-one onto the domain D of Problem 54. Let Vzbe the inverse, and show that Vzis complex differentiable Provide a similar discussion for the mapping z" 56. Discuss the mapping properties of cos z, sin z. 57 Show that the power series expansion of the solution of Exercise 8(d) with initial values y(0) = 1, /(0) = 0 does not converge outside the unit disk 58. Suppose that/, g are complex-valued functions defined on the interval /. Show that ' m dt hz-g(t) is complex differentiable and can be represented by a power series at any point of the image of g. 59 If/is C'mCxX and for each fixed x, /(z, x) is differentiable in z, then F(t) = jf(z,x)dx is also complex differentiable. 60. The equation of Exercise 6 is called Legendre's equation and the solutions {/,} for integral к are called the Legendre polynomials. They have this interesting property: fm(x)fn(x) dx = 0 ύτηφη
5.9 Summary 451 To prove this we must observe that Legendre's equation may be written as Thus by integration by parts. Now do the same, interchanging m and n. 61. Let Ρ be a polynomial of degree d. Show that f(z) = enz) is complex differentiable Show that/°"(z)e~',(z) is a polynomial of degree n(d — 1). 62. Show that the polynomial d" exp(*2) ^ (exp(-x2)) solves the differential equation y' — 2y' + 2ny = 0. 63 (a) Find a C°° real-valued function / defined on /?" with these properties: (ι) 0</(χ)<1 (ii) /(x)>0if \\x-Xo\\<R (iii) /(x)=0if \\x-Xo\\>R (By C°° we mean all higher-order partial derivatives exist and are continuous.) (b) Let X be a closed set in R", and suppose B,,..., B„ are balls in R" such that X <= В^ и ■ ■ и В„. A partition of unity on X subordinate to Bu ■ ■ ■, B„ is a collection {/i,...,/»} of C°° functions such that (i) 0 < /, < 1 (ii) f,(x) = 0ux$Bi О") 2;_1/,(x) = lifxe^ Find such a partition of unity.
Chapter 6

FUNCTIONS ON THE CIRCLE (FOURIER ANALYSIS)

In this chapter we shall study periodic functions of a real variable. The importance of such functions derives from the fact that many natural and physical phenomena are oscillatory, or recurrent. In the early 19th century, J. B. J. Fourier laid down the foundations of the study of periodic functions in his treatise Analytic Theory of Heat. There remained a few gaps and difficulties in Fourier's theory, and much mathematical energy during the 19th century was expended in the study of these problems. The invention of Lebesgue's theory of integration in the early 20th century finally laid the foundations for this theory. Our exposition will not follow this chronological pattern, but rather will try to develop the way of thinking about Fourier series which emerged during the late 19th century.

A periodic function is one whose behavior is recurrent. That is, there is a certain number $L$, called the period of the function, such that the function repeats itself over every interval of length $L$:

$$f(x + L) = f(x) \qquad \text{for all } x \in R$$

From our point of view (which is very much a posteriori) the study of periodic functions begins by discarding the notion of periodicity in favor of a change in the geometry of the domain. That is, to study the collection of all periodic functions with a fixed period, we make the underlying space periodic instead. We shall think of the real line as wound around a circle, and our periodic functions are just the functions on the circle.
To fix the ideas, we shall have a particular circle in mind: the set $\Gamma$ of complex numbers of modulus one. We have already seen that there is a mapping

$$\theta \to \cos\theta + i\sin\theta = e^{i\theta}$$

of the real numbers onto $\Gamma$ which is one-to-one on an interval of length $2\pi$, except that both end points go onto the same point. This mapping does precisely what we want. It winds the real line around $\Gamma$. A continuous function on $\Gamma$ is a function of $e^{i\theta}$ which varies continuously with $\theta$. Thus the continuous functions on $\Gamma$ are precisely the continuous functions on $R$ which are periodic of period $2\pi$:

$$f(x + 2\pi) = f(x) \qquad \text{for all } x \in R$$

In the past few chapters we have been studying the behavior of functions from the point of view of differentiation. We have studied the Taylor expansion, an expansion into polynomials, and we have related the coefficients to the successive derivatives of the function. Since the simplest periodic functions are the trigonometric polynomials, we attempt to expand a given periodic function in a series of trigonometric polynomials. This is the so-called Fourier series of the function.
The interesting fact here is that the relevant coefficients are found by integration. In fact, as we shall see, the Fourier series of a function is a sort of expansion in terms of an orthonormal basis in the vector space of continuous functions on the circle with the inner product

$$\langle f, g \rangle = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)\overline{g(\theta)}\,d\theta$$

Finally, as the circle is the set of complex numbers of modulus one, it is the boundary of the unit disk in $C$, and we can study the relation between Taylor expansions in the disk and the Fourier expansions on the circle for suitable functions. It will turn out that for such functions the Taylor coefficients can also be obtained by integration on the circle.

6.1 Approximation by Trigonometric Polynomials

We shall begin with the attitude that we are studying complex-valued functions on the circle. According to this view, the function $e^{i\theta}$ is the simplest and the most basic function. This attitude is really just a convenience; the point of view of strictly real-valued functions would consign us to consider $\cos\theta$, $\sin\theta$ as the elementary building blocks of our theory.
But, since $e^{i\theta} = \cos\theta + i\sin\theta$, there is little difference, and we select the more comfortable notation. Our purpose is to describe a given function on the circle in terms of the powers of $e^{i\theta}$, both positive and negative. More precisely, if the series

$$\sum_{n=-\infty}^{\infty} a_n e^{in\theta} \tag{6.1}$$

converges for all $\theta$, it defines a function on the circle. We ask the converse question: Can we express any periodic function as such a series? If only finitely many of the $\{a_n\}$ in (6.1) are nonzero, there is no problem of convergence, and the sum defines a function, called a trigonometric polynomial. This subject gets off the ground once we know how to compute the $\{a_n\}$ from the given function, and that leads us to our first proposition.

Proposition 1. Let $P(\theta) = \sum_{n=-N}^{N} a_n e^{in\theta}$ be a trigonometric polynomial. Then

$$a_m = \frac{1}{2\pi}\int_{-\pi}^{\pi} P(\theta)e^{-im\theta}\,d\theta \qquad \text{for all } m$$

Proof. Since $\int_{-\pi}^{\pi} e^{i(n-m)\theta}\,d\theta$ is $2\pi$ for $n = m$ and $0$ for $n \ne m$,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P(\theta)e^{-im\theta}\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi}\left(\sum_{n=-N}^{N} a_n e^{in\theta}\right)e^{-im\theta}\,d\theta = \frac{1}{2\pi}\sum_{n=-N}^{N} a_n\int_{-\pi}^{\pi} e^{i(n-m)\theta}\,d\theta = \frac{1}{2\pi}(a_m \cdot 2\pi + 0) = a_m$$

Now, given a continuous function on the circle, if it has an expansion into a series of trigonometric polynomials, we could expect that the coefficients of this series will be related to the function in the same way. Thus we form this definition.

Definition 1. Let $f$ be a continuous function on the circle. The $n$th Fourier coefficient of $f$ is

$$\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{-in\phi}\,d\phi \tag{6.2}$$
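The Fourier series of $f$ is the series

$$\sum_{n=-\infty}^{\infty} \hat{f}(n)e^{in\theta} \tag{6.3}$$

Examples

1. Let $f(\theta) = \sin\theta$. Since

$$\sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i}$$

its Fourier series is

$$-\frac{1}{2i}e^{-i\theta} + \frac{1}{2i}e^{i\theta}$$

From (6.2) we can deduce (as is also easily computed):

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\sin\theta\,e^{i\theta}\,d\theta = -\frac{1}{2i} \qquad \frac{1}{2\pi}\int_{-\pi}^{\pi}\sin\theta\,e^{-i\theta}\,d\theta = \frac{1}{2i}$$

2. Since $\cos m\theta = \frac{1}{2}(e^{im\theta} + e^{-im\theta})$, the Fourier series of $\cos m\theta$ is

$$\frac{1}{2}e^{-im\theta} + \frac{1}{2}e^{im\theta}$$

3. Let $f(\theta) = \cos^2\theta$. Then

$$\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}\cos^2\phi\,e^{-in\phi}\,d\phi = \frac{1}{4\pi}\int_{-\pi}^{\pi}(1 + \cos 2\phi)e^{-in\phi}\,d\phi = \begin{cases} \frac{1}{4} & n = -2 \\ \frac{1}{2} & n = 0 \\ \frac{1}{4} & n = 2 \\ 0 & n \ne -2, 0, 2 \end{cases}$$

Thus the Fourier series of $\cos^2\theta$ is

$$\frac{1}{4}e^{-2i\theta} + \frac{1}{2} + \frac{1}{4}e^{2i\theta}$$

(Notice that $\cos^2\theta = \frac{1}{2}(1 + \cos 2\theta) = \frac{1}{2}\left(1 + \frac{1}{2}(e^{2i\theta} + e^{-2i\theta})\right)$ is a trigonometric polynomial.)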
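The coefficients in Examples 1 and 3 can be checked numerically. The following is a minimal sketch that approximates the integral (6.2) by a uniform Riemann sum; the function name `fourier_coefficient` and the sample count are assumptions made for this illustration. For a trigonometric polynomial of low degree such a sum is exact up to rounding error.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=2048):
    """Approximate (1/2pi) * integral over [-pi, pi] of f(phi) e^{-i n phi}
    by a uniform Riemann sum with the given number of sample points."""
    total = 0j
    for k in range(samples):
        phi = -math.pi + 2.0 * math.pi * k / samples
        total += f(phi) * cmath.exp(-1j * n * phi)
    # The step length (2pi/samples) times the factor 1/(2pi) is 1/samples.
    return total / samples


def cos_squared(phi):
    return math.cos(phi) ** 2


# Example 3: the only nonzero coefficients of cos^2 are at n = -2, 0, 2.
coeffs = {n: fourier_coefficient(cos_squared, n) for n in range(-3, 4)}
```

The same routine applied to `math.sin` returns values close to 1/(2i) at n = 1 and -1/(2i) at n = -1, in agreement with Example 1.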
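4. Let $f(\theta) = \pi^2 - \theta^2$.

$$\hat{f}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi}(\pi^2 - \phi^2)e^{-in\phi}\,d\phi$$

$$\hat{f}(0) = \frac{\pi^2}{2\pi}\int_{-\pi}^{\pi}d\phi - \frac{1}{2\pi}\int_{-\pi}^{\pi}\phi^2\,d\phi = \frac{2\pi^2}{3} \qquad n = 0$$

$$\hat{f}(n) = -\frac{1}{2\pi}\int_{-\pi}^{\pi}\phi^2 e^{-in\phi}\,d\phi = (-1)^{n+1}\frac{2}{n^2} \qquad n \ne 0$$

by two integrations by parts. Thus the Fourier series of $\pi^2 - \theta^2$ is

$$\frac{2\pi^2}{3} + 2\sum_{n \ne 0}\frac{(-1)^{n+1}}{n^2}e^{in\theta} \tag{6.4}$$

Notice that by the comparison test, this series does converge to a continuous function of $e^{i\theta}$:

$$\sum_{n \ne 0}\frac{1}{n^2} < \infty$$

In order to conclude that this is the given function $\pi^2 - \theta^2$, we shall need more theoretical investigations.

5. It is not necessary for a function to be continuous to have a Fourier expansion. It need only be integrable for the expressions (6.2), (6.3) to be computable. Let us compute the Fourier series of

$$f(\theta) = \begin{cases} 1 & \theta > 0 \\ 0 & \theta < 0 \end{cases}$$

We have $\hat{f}(0) = \frac{1}{2\pi}\int_0^{\pi}d\phi = \frac{1}{2}$, and for $n \ne 0$,

$$\hat{f}(n) = \frac{1}{2\pi}\int_0^{\pi}e^{-in\phi}\,d\phi = -\frac{1}{2\pi i n}\left[e^{-in\pi} - 1\right] = \begin{cases} 0 & n \text{ even}, n \ne 0 \\ \dfrac{1}{\pi i n} & n \text{ odd} \end{cases}$$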
Thus the Fourier series of $f$ is

$$\frac{1}{2} + \frac{1}{\pi i}\sum_{n \text{ odd}}\frac{e^{in\theta}}{n} \tag{6.5}$$

Recapturing the Function from Its Fourier Series

Notice that no claim of convergence in Definition 1 is made. In particular, the series (6.5) appears not to converge, for the comparison test does not apply. However, we cannot conclude that convergence fails, only that the question can be exceedingly difficult. We ask instead what appears to be a simpler question: Does the Fourier series identify the given function, and if so, in what way?

We now try to investigate the recapture of a function by its Fourier series, deliberately leaving aside all questions of convergence. Let $f$ be a given integrable function on the circle and consider the "function"

$$g(\theta) = \sum_{n=-\infty}^{\infty}\hat{f}(n)e^{in\theta}$$

By definition of $\hat{f}(n)$,

$$g(\theta) = \sum_{n=-\infty}^{\infty}\frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)e^{in(\theta - \phi)}\,d\phi$$

Now we interchange $\sum$ and $\int$, obtaining

$$g(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)\sum_{n=-\infty}^{\infty}e^{in(\theta - \phi)}\,d\phi$$

Well, it is too bad it turned out this way because we are still up against a convergence problem, like it or not. In fact, the situation is worse; it is untenable because

$$\sum_{n=-\infty}^{\infty}e^{in(\theta - \phi)} \tag{6.6}$$

converges for no values of $\phi$. This seemingly insurmountable obstacle can be overcome, so long as we are not solely interested in pointwise convergence, by a subtle mathematical technique: that of inserting convergence factors.
If we replace the series (6.6) by the series

$$\sum_{n=-\infty}^{\infty}r^{|n|}e^{in(\theta - \phi)} \tag{6.7}$$

this series converges beautifully for $r < 1$, and the series (6.6) is in some ideal sense the limit of (6.7) as $r$ tends to 1. Stepping backward two steps, this causes us to now consider the series

$$g(r, \theta) = \sum_{n=-\infty}^{\infty}\hat{f}(n)r^{|n|}e^{in\theta} \tag{6.8}$$

and the limit $\lim_{r \to 1} g(r, \theta)$ (hoping of course that it is $f(\theta)$). Notice that the series (6.8) does converge since the Fourier coefficients $\{\hat{f}(n)\}$ are bounded (Problem 1) and the comparison test applies. Now, proceeding as above but this time with $g(r, \theta)$, we obtain

$$g(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)\sum_{n=-\infty}^{\infty}r^{|n|}e^{in(\theta - \phi)}\,d\phi \qquad r < 1$$

and here we can interchange $\sum$ and $\int$ because the series in question converges uniformly. The sum in the above integral can be put in a nicer form since it is a sum of two geometric series:

$$P(r, t) = \sum_{n=-\infty}^{\infty}r^{|n|}e^{int} = \sum_{n=0}^{\infty}(re^{-it})^n + \sum_{n=1}^{\infty}(re^{it})^n = \frac{1}{1 - re^{-it}} + \frac{re^{it}}{1 - re^{it}} = \frac{1 - r^2}{1 + r^2 - r(e^{it} + e^{-it})} = \frac{1 - r^2}{1 + r^2 - 2r\cos t} \tag{6.9}$$

The function $P(r, t)$ is called Poisson's kernel (named after its French discoverer, not because its whole technique is fishy), and the association of $f$ to $g$ is called the Poisson transform. Thus, the Poisson transform

$$(Pf)(r, \theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}f(\phi)\frac{1 - r^2}{1 + r^2 - 2r\cos(\theta - \phi)}\,d\phi = \sum_{n=-\infty}^{\infty}\hat{f}(n)r^{|n|}e^{in\theta} \tag{6.10}$$
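takes continuous functions on the circle into continuous functions of $r$, $\theta$ for $r < 1$; that is, into continuous functions on the open unit disk. We shall later see the importance of the Poisson transform from the point of view of partial differential equations.

Examples (Some Poisson Transforms)

6. We can find the Poisson transform of functions on the circle quite explicitly, using some complex notation and Equation (6.10). For example, consider $f(\theta) = \cos^2\theta$. Using Example 3 we have

$$Pf(r, \theta) = \frac{1}{4}r^2e^{-2i\theta} + \frac{1}{2} + \frac{1}{4}r^2e^{2i\theta} = \frac{1}{2}\left[1 + \frac{1}{2}(re^{-i\theta})^2 + \frac{1}{2}(re^{i\theta})^2\right]$$

Thinking of $r$, $\theta$ as polar coordinates in the disk, we can rewrite this (using $z = re^{i\theta} = x + iy$, $\bar{z} = re^{-i\theta} = x - iy$):

$$Pf(z) = \frac{1}{2}\left[1 + \frac{1}{2}(z^2 + \bar{z}^2)\right] = \frac{1}{2}\left[1 + \operatorname{Re} z^2\right] = \frac{1}{2}(1 + x^2 - y^2)$$

Clearly,

$$\lim_{r \to 1}Pf(r, \theta) = \lim_{|z| \to 1}Pf(z) = \frac{1}{2}(1 + x^2 - (1 - x^2)) = x^2 = \cos^2\theta$$

7. The Poisson transform of

$$f(\theta) = \begin{cases} 1 & \theta > 0 \\ 0 & \theta < 0 \end{cases}$$

is given by

$$Pf(z) = \frac{1}{2} + \frac{1}{\pi i}\sum_{n \text{ odd}}\frac{r^{|n|}e^{in\theta}}{n} = \frac{1}{2} + \frac{1}{\pi i}\sum_{\substack{n \text{ odd} \\ n > 0}}\left(\frac{r^ne^{in\theta}}{n} - \frac{r^ne^{-in\theta}}{n}\right) = \frac{1}{2} + \frac{2}{\pi}\operatorname{Im}\left(\sum_{\substack{n \text{ odd} \\ n > 0}}\frac{z^n}{n}\right) = \frac{1}{2} + \frac{2}{\pi}\operatorname{Im}\sum_{n=0}^{\infty}\frac{z^{2n+1}}{2n + 1}$$

Now, we can use Taylor expansions to obtain a closed form for this series.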
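Now, integrating the geometric series $\sum_{n=0}^{\infty} z^{2n} = 1/(1 - z^2)$ term by term,

$$\sum_{n=0}^{\infty}\frac{z^{2n+1}}{2n + 1} = \int\frac{dz}{1 - z^2} = \frac{1}{2}\left(\int\frac{dz}{1 + z} + \int\frac{dz}{1 - z}\right) = \frac{1}{2}\ln\left(\frac{1 + z}{1 - z}\right)$$

(We have used real-variable techniques to find this closed form, but once it is found it is valid for all $z$, $|z| < 1$.) Thus

$$Pf(z) = \frac{1}{2} + \frac{1}{\pi}\operatorname{Im}\ln\left(\frac{1 + z}{1 - z}\right)$$

As $|z| \to 1$, $Pf(z)$ has a limit except for $z \to 1$, $z \to -1$. We shall now show that except for these two values, $\lim_{r \to 1}Pf(r, \theta) = f(\theta)$.

$$\lim_{r \to 1}Pf(r, \theta) = Pf(1, \theta) = \frac{1}{2} + \frac{1}{\pi}\operatorname{Im}\ln\left(\frac{1 + e^{i\theta}}{1 - e^{i\theta}}\right) \tag{6.11}$$

Now

$$\frac{1 + e^{i\theta}}{1 - e^{i\theta}} = \frac{(1 + e^{i\theta})(1 - e^{-i\theta})}{(1 - e^{i\theta})(1 - e^{-i\theta})} = \frac{1 + e^{i\theta} - e^{-i\theta} - 1}{1 - e^{i\theta} - e^{-i\theta} + 1} = \frac{i\sin\theta}{1 - \cos\theta} \tag{6.12}$$

Since $\ln z = \ln|z| + i\arg z$, $\operatorname{Im}\ln z = \arg z$ for any complex number. Since (6.12) is pure imaginary, we have

$$\operatorname{Im}\ln\frac{1 + e^{i\theta}}{1 - e^{i\theta}} = \begin{cases} \dfrac{\pi}{2} & \theta > 0 \\ -\dfrac{\pi}{2} & \theta < 0 \end{cases}$$

Thus, referring back to (6.11),

$$\lim_{r \to 1}Pf(r, \theta) = \frac{1}{2} \pm \frac{1}{\pi}\cdot\frac{\pi}{2} = \begin{cases} 1 & \text{if } \theta > 0 \\ 0 & \text{if } \theta < 0 \end{cases}$$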
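The closed form of Example 7 can be compared with the series (6.10) numerically. The sketch below is an illustration only; the function names and the truncation length are assumptions of this example. It uses the principal branch of the complex logarithm, which is the appropriate one here because (1 + z)/(1 - z) has positive real part for |z| < 1.

```python
import cmath
import math


def poisson_step_series(r, theta, terms=2000):
    """Truncation of 1/2 + (2/pi) * sum over odd n > 0 of r^n sin(n theta)/n."""
    s = 0.0
    for n in range(1, terms, 2):
        s += r ** n * math.sin(n * theta) / n
    return 0.5 + 2.0 / math.pi * s


def poisson_step_closed(r, theta):
    """Closed form 1/2 + (1/pi) Im ln((1 + z)/(1 - z)) with z = r e^{i theta}."""
    z = cmath.rect(r, theta)
    return 0.5 + cmath.log((1 + z) / (1 - z)).imag / math.pi


series_value = poisson_step_series(0.9, 1.0)
closed_value = poisson_step_closed(0.9, 1.0)
near_boundary_up = poisson_step_closed(0.999, 1.0)     # theta > 0
near_boundary_down = poisson_step_closed(0.999, -1.0)  # theta < 0
```

For r close to 1 the closed form approaches 1 when θ > 0 and 0 when θ < 0, which is the limit computed above.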
We are still hoping that it is true for all $f$ that $\lim_{r \to 1}Pf(r, \theta) = f(\theta)$. Of course, this turns out to be true. To see this we have to verify some properties of Poisson's kernel. First we rewrite the Poisson kernel as

$$P(r, t) = \frac{1 - r^2}{(1 - r)^2 + 2r(1 - \cos t)}$$

From this reformulation we easily conclude the following properties:

(i) $P(r, t) > 0$ for all values of $r$, $t$, $r < 1$.

(ii) $P(r, 0) = \dfrac{1 - r^2}{(1 - r)^2} = \dfrac{1 + r}{1 - r} \to \infty$ as $r \to 1$.

(iii) On the other hand, for values of $t \ne 0$, $P(r, t) \to 0$ as $r \to 1$. If $|t| \ge \delta$,

$$P(r, t) \le \frac{1 - r^2}{(1 - r)^2 + 2r(1 - \cos\delta)} \le \frac{1 - r^2}{2r(1 - \cos\delta)} \to 0$$

uniformly as $r \to 1$.

For a fixed value of $r$, the graph of $P(r, t)$ is drawn in Figure 6.1. As $r \to 1$, the peak goes up and the valleys get larger and deeper. Finally,

(iv) $\dfrac{1}{2\pi}\displaystyle\int_{-\pi}^{\pi}P(r, t)\,dt = 1$

This can be computed directly; however, it is easier to use Equation (6.10) in the particular case where $f$ is the function which is identically one (see Problem 2).

Theorem 6.1. If $f$ is a continuous function on the circle,

$$\lim_{r \to 1}Pf(r, \theta) = f(\theta)$$

Proof. Using property (iv) above we can write $Pf(r, \theta) - f(\theta)$ as an integral,

$$(Pf)(r, \theta) - f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi}[f(\phi) - f(\theta)]P(r, \theta - \phi)\,d\phi$$
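Figure 6.1. Graph of $P(r,t)$ for a fixed value of $r$.

For any $\delta > 0$ we break up the integral into two pieces:

$$(Pf)(r,\theta) - f(\theta) = \frac{1}{2\pi}\int_{|\phi-\theta|<\delta} [f(\phi) - f(\theta)]\,P(r,\theta-\phi)\,d\phi + \frac{1}{2\pi}\int_{|\phi-\theta|\ge\delta} [f(\phi) - f(\theta)]\,P(r,\theta-\phi)\,d\phi$$

Now, by (iii) the integrand in the second integral tends to zero as $r \to 1$ and, by continuity, $|f(\phi) - f(\theta)|$ is small for all $\phi$ near enough to $\theta$, so that we can make the first integral small by taking $\delta$ small. More precisely, let $\varepsilon > 0$ be given. Let $\delta$ be such that

$$|f(\phi) - f(\theta)| < \frac\varepsilon2 \quad\text{if } |\phi - \theta| < \delta \tag{6.13}$$

Given that $\delta$, by (iii), there is an $\eta > 0$ such that for $|r - 1| < \eta$,

$$\int_{|\phi-\theta|\ge\delta} P(r,\phi-\theta)\,d\phi < \frac{\pi\varepsilon}{2\|f\|}$$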
Then for $|r - 1| < \eta$,

$$|Pf(r,\theta) - f(\theta)| \le \frac{1}{2\pi}\int_{|\phi-\theta|<\delta} |f(\phi) - f(\theta)|\,P(r,\theta-\phi)\,d\phi + \frac{1}{2\pi}\int_{|\phi-\theta|\ge\delta} |f(\phi) - f(\theta)|\,P(r,\theta-\phi)\,d\phi$$

$$\le \frac{1}{2\pi}\cdot\frac\varepsilon2\int_{|\phi-\theta|<\delta} P(r,\theta-\phi)\,d\phi + \frac{1}{2\pi}\cdot 2\|f\|\int_{|\phi-\theta|\ge\delta} P(r,\theta-\phi)\,d\phi$$

$$\le \frac\varepsilon2 + \frac{\|f\|}{\pi}\cdot\frac{\pi\varepsilon}{2\|f\|} = \varepsilon$$

We seem to have come a long way away from our original quest, but we have not really. The content of Theorem 6.1 is this: Let $f$ be a continuous function on the circle. Its Fourier series

$$\sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$$

is too hard to study as regards convergence, but it does represent $f$ in some relevant sense. It "almost converges" to $f$, that is, if we put in factors to ensure the convergence and consider instead

$$Pf(r,\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta}$$

then for $r$ very close to 1, this function is very close to $f$. This allows us to make important assertions based on any information on the Fourier series of $f$. For example,

Corollary 1. If $f$ is a continuous function on the circle, and $\sum_{n=-\infty}^{\infty} |\hat f(n)| < \infty$, then $f$ is the sum of its Fourier series,

$$f(\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$$

Proof. The condition allows us to conclude on the basis of the comparison test that the Fourier series converges; the essential content here is that it converges to $f$.
In fact, by the comparison test, we can conclude that

$$(Pf)(r,\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta}$$

is a continuous function on the closed unit disk: all $r \le 1$. Then for any $\theta$, by Theorem 6.1,

$$f(\theta) = \lim_{r\to1} Pf(r,\theta) = \lim_{r\to1}\sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta} = \sum_{n=-\infty}^{\infty} \hat f(n)e^{in\theta}$$

In particular, if $\hat f(n)$ vanishes for all but finitely many $n$, then $f$ is a trigonometric polynomial. Thus the trigonometric polynomials are precisely the class of continuous functions on the circle with only finitely many nonzero Fourier coefficients. A more basic consequence is that a function is uniquely determined by its Fourier series.

Corollary 2. If $f$ and $g$ are continuous on the circle and $\hat f(n) = \hat g(n)$ for all $n$, then $f = g$.

Proof. $f - g$ is continuous on $\Gamma$, and $(f-g)^\wedge(n) = \hat f(n) - \hat g(n) = 0$ for all $n$. Applying the first corollary to $f - g$ we see that it is the sum of its Fourier series, which is identically zero. Thus $f - g = 0$, so $f = g$.

Conditions on the Fourier coefficients of a function, such as that in Corollary 1, are not hard to come by. For example, suppose $f$ is a twice continuously differentiable periodic function. Then by integrating by parts we have, for $n \ne 0$,

$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{-in\phi}\,d\phi = \frac{1}{2\pi i n}\int_{-\pi}^{\pi} f'(\phi)e^{-in\phi}\,d\phi = -\frac{1}{2\pi n^2}\int_{-\pi}^{\pi} f''(\phi)e^{-in\phi}\,d\phi$$

Since $f''$ is continuous on the circle, it is bounded, say by $M$. We obtain these bounds on the Fourier coefficients of $f$:

$$|\hat f(n)| \le \frac{M}{n^2} \qquad n \ne 0$$

Thus $\sum |\hat f(n)| < \infty$.
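The bound $|\hat f(n)| \le M/n^2$ can be observed numerically. The sketch below is only an illustration under our own assumptions: the sample function $f(\phi) = e^{\cos\phi}$ and the helper names are not from the text, and the integrals are approximated by the periodic trapezoid rule.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=1024):
    """Approximate (1/2pi) * integral over [-pi, pi] of f(phi) e^{-i n phi} dphi."""
    step = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + k * step
        total += f(phi) * cmath.exp(-1j * n * phi)
    return total / samples


def f(phi):
    """A smooth periodic sample function (our choice, for illustration)."""
    return math.exp(math.cos(phi))


def f_second_derivative(phi):
    """f''(phi) = (sin^2 phi - cos phi) e^{cos phi}, computed by hand."""
    return (math.sin(phi) ** 2 - math.cos(phi)) * math.exp(math.cos(phi))


def sup_on_grid(g, samples=1024):
    """Maximum of |g| over an equally spaced grid on [-pi, pi)."""
    step = 2.0 * math.pi / samples
    return max(abs(g(-math.pi + k * step)) for k in range(samples))


if __name__ == "__main__":
    M = sup_on_grid(f_second_derivative)
    for n in range(1, 8):
        print(n, abs(fourier_coefficient(f, n)), M / n ** 2)
```

For this particular $f$ the coefficients decay much faster than $1/n^2$; the bound is what the integration by parts guarantees for any twice continuously differentiable function.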
Corollary 3. If $f$ is a $C^2$ function on the circle, it is the sum of its Fourier series.

We shall have an even better result in Section 6.4. Nevertheless, Theorem 6.1 does allow us to make deductions on the convergence of the Fourier series. As one last application, it tells us that although we may not be able to approximate a function by its Fourier series, we can nevertheless approximate it by some sequence of trigonometric polynomials.

Corollary 4. A continuous function on the circle is approximable by trigonometric polynomials.

Proof. Using the notion of uniform continuity, we can be sure, in the proof of Theorem 6.1, that the $\delta$ chosen so that (6.13) is true is independent of $\theta$. Thus, in the rest of the argument we find an $r < 1$ such that

$$|Pf(r,\theta) - f(\theta)| < \varepsilon \quad\text{for all } \theta$$

Now, the series $\sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta}$ converges uniformly to $Pf(r,\theta)$, if $r < 1$. Thus there is an $N$ such that the partial sum $Q$ of the terms between $-N$ and $N$ is everywhere within $\varepsilon$ of $Pf(r,\theta)$. Thus

$$|Q(\theta) - f(\theta)| \le |Q(\theta) - Pf(r,\theta)| + |Pf(r,\theta) - f(\theta)| < \varepsilon + \varepsilon = 2\varepsilon$$

for all $\theta$, as desired.

• EXERCISES

1. Find the Fourier series of the following functions on the circle.
(a) $f(\theta) = \theta^2$
(b) $f(\theta) = \cos^5\theta$
(c) /(0) = e'" t* > 0, not necessarily an integer.
(d) $f(\theta) = \begin{cases} 0 & -\pi \le \theta < -\dfrac\pi2 \\[4pt] \theta + \dfrac\pi2 & -\dfrac\pi2 \le \theta < 0 \\[4pt] -\theta + \dfrac\pi2 & 0 \le \theta < \dfrac\pi2 \\[4pt] 0 & \dfrac\pi2 \le \theta < \pi \end{cases}$
(e) $f(\theta) = |\sin\theta|$
(f) $f(\theta) = \sin\theta + \cos\theta$
(g) / π 0 1 №- -7Г^0<- :θ<ο O^0<- ■^0<7Г
(h) /(0) = e<
(i) /(0) = e'"

2. Find the Poisson transforms of the following functions on the circle:
(a) $\cos^3\theta + \sin^3\theta$
(b) $(1 + \cos^2\theta)^{-1}$
(c) Exercise 1(c).
(d) Exercise 1(d).
(e) Exercise 1(g).
(f) (1+e'T

• PROBLEMS

1. Show that the Fourier coefficients $\hat f(n)$ of a continuous function $f$ defined on the circle are bounded:

$$|\hat f(n)| \le \|f\| = \max\{|f(\theta)| : -\pi \le \theta \le \pi\}$$

2. Show that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} P(r,t)\,dt = 1$$

by computing the Poisson transform of the function 1.

3. Show that if $f$ is a real-valued function on the circle, $\hat f(-n) = \overline{\hat f(n)}$.

4. (a) Show that the Poisson transform of $f$ can be written

$$Pf(r,\theta) = \hat f(0) + \sum_{n=1}^{\infty}\left(\hat f(-n)\bar z^n + \hat f(n)z^n\right)$$

(b) Show that if $\hat f(-n) = 0$, $n > 0$, then $Pf$ is the sum of a convergent power series in the unit disk.
(c) Show that if $f$ can be written in the form

$$f(\theta) = F(e^{i\theta})$$

where $F$ can be written as a convergent series in powers of $z$, $\bar z$, then $Pf(z) = F(z)$.

5. What is the Poisson transform of these functions?
(a) $\exp(e^{i\theta})$
(b) (1+2*)-"
(c) $(z + \bar z)^n$
(d) $(1 + \cos^2\theta)^{-1}$
(e) $\ln(5 + z)$
(f) $\exp(\cos\theta)$

6. We can use the approximation theorem (Corollary 4) to prove the following fact.

(Weierstrass Approximation Theorem). If $f$ is a continuous function on the interval $[0, 1]$ and $\varepsilon > 0$, there is a polynomial $P(x) = a_0 + a_1x + \cdots + a_nx^n$ such that

$$|f(x) - P(x)| < \varepsilon \quad\text{for all } x \in [0, 1]$$

Prove it according to this idea: First extend $f$ as a continuous function on the interval $[-\pi, \pi]$ so that $f(-\pi) = f(\pi)$. Now view the extended function as a function on the circle and, by Corollary 4, approximate it by a trigonometric polynomial of the form $\sum_{n=-N}^{N} a_ne^{in\theta}$. Now use the fact that the $2N + 1$ functions $\{e^{in\theta} : -N \le n \le N\}$ can be approximated by polynomials in $\theta$.

6.2 Laplace's Equation

The techniques described in the previous section came out of Poisson's work on the theory of heat flow. Suppose $D$ is a domain in the plane (representing a homogeneous metallic plate); we wish to study the temperature distribution on this plate subject to certain sources of heat energy. Let $u(x, t)$ be the temperature at the point $x$ at time $t$. We shall see in Chapter 8 that, as a consequence of the law of energy conservation, the temperature function $u$ behaves according to this partial differential equation (appropriately called the heat equation)

du _ 1 /d2u d2u\ (6.14)
Now, suppose our sources of heat maintain the temperature at the boundary of $D$, and there is no other source or loss of heat. Then, as $t \to \infty$ the temperature distribution will tend toward equilibrium: that state at which $\partial u/\partial t = 0$. This equilibrium (or steady-state) temperature distribution must therefore satisfy Laplace's equation:

$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0 \tag{6.15}$$

This is sometimes denoted $\Delta u = 0$. Solutions of Laplace's equation are called harmonic functions.

The Poisson transform has to do with the solution of this steady-state problem when $D$ is the unit disk. Suppose then, that we are given a temperature distribution $f(\theta)$ on the unit circle; we wish to find a continuous function $u(r, \theta)$ defined for $r \le 1$ such that $\Delta u = 0$ and $u(1, \theta) = f(\theta)$. In order to attack this problem, we assume that $u$ can be represented, on each circle $r = \text{constant}$, by its Fourier series:

$$u(r,\theta) = \sum_{n=-\infty}^{\infty} a_n(r)e^{in\theta} \tag{6.16}$$

Our conditions become

$$\text{(i)}\quad a_n(1) = \hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{-in\theta}\,d\theta \tag{6.17}$$

$$\text{(ii)}\quad \Delta u = \frac1r\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} = 0 \tag{6.18}$$

(We have rewritten $\Delta u$ in terms of polar coordinates so we can apply it to the Fourier series. We leave it to the reader to derive the polar form of the Laplacian.) Now, computing (ii) term by term in the series (6.16) and multiplying by $r^2$, we obtain

$$\sum_{n=-\infty}^{\infty}\left(r^2a_n'' + ra_n' - n^2a_n\right)e^{in\theta} = 0$$

Since the zero function is represented only by the zero Fourier series we deduce that

$$r^2a_n'' + ra_n' - n^2a_n = 0 \tag{6.19}$$
for all $n$. This ordinary differential equation is easily solved; the solutions are spanned by

$$1,\ \log r \quad (n = 0) \qquad\qquad r^n,\ r^{-n} \quad (n \ne 0)$$

We have only one boundary condition (6.17); however, we do want the functions continuous at $r = 0$, so the solutions $\log r$, $r^{-|n|}$ are excluded. Thus we must have $a_n = \hat f(n)\,r^{|n|}$, and the solution must have the form

$$u(r,\theta) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta}$$

which is Poisson's transform. Hence, if the problem is solvable, the solution must be given by Poisson's transform. Conversely, the following is a solution.

Theorem 6.2. Let $f$ be a continuous function on the circle. There is a unique function $u$, harmonic in the disk and assuming the boundary values $f$. $u$ is the Poisson transform of $f$: $u = Pf$.

Proof. We need only verify that $u$ is indeed harmonic. Since we can differentiate under the integral sign we need only show that the Poisson kernel is harmonic. That can be done by direct computation, or by referring back to Equation (6.9). There we have, with $z = re^{it}$,

$$P(r,t) = \frac{1}{1 - re^{-it}} + \frac{1}{1 - re^{it}} - 1 = \frac{1}{1-\bar z} + \frac{1}{1-z} - 1 = -1 + 2\operatorname{Re}\frac{1}{1-z}$$

Now, $(1 - z)^{-1}$ is a complex differentiable function, and we have already seen that the real part of such a function satisfies Laplace's equation (see Problem 5.7.2). Thus $\Delta P = 0$.
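Theorem 6.2 lends itself to a numerical sanity check. The following sketch (helper names and the step size `h` are our own choices) evaluates the Poisson transform as the integral of $f$ against the kernel $P(r,t) = (1-r^2)/(1 - 2r\cos t + r^2)$, compares the result for $f(\theta) = \cos^2\theta$ with the closed form $\tfrac12(1 + r^2\cos 2\theta)$ found in Example 6, and applies a five-point finite-difference Laplacian to see that the result is approximately harmonic.

```python
import math


def poisson_kernel(r, t):
    """P(r, t) = (1 - r^2) / (1 - 2 r cos t + r^2), for 0 <= r < 1."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(t) + r * r)


def poisson_transform(f, r, theta, samples=2048):
    """Approximate (1/2pi) * integral of f(phi) P(r, theta - phi) dphi
    over [-pi, pi] by the periodic trapezoid rule."""
    step = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = -math.pi + k * step
        total += f(phi) * poisson_kernel(r, theta - phi)
    return total / samples


def u_cartesian(f, x, y):
    """The Poisson transform of f evaluated at the point (x, y) of the disk."""
    return poisson_transform(f, math.hypot(x, y), math.atan2(y, x))


def discrete_laplacian(f, x, y, h=1e-3):
    """Five-point finite-difference approximation to u_xx + u_yy at (x, y)."""
    return (u_cartesian(f, x + h, y) + u_cartesian(f, x - h, y)
            + u_cartesian(f, x, y + h) + u_cartesian(f, x, y - h)
            - 4.0 * u_cartesian(f, x, y)) / (h * h)


if __name__ == "__main__":
    cos_squared = lambda phi: math.cos(phi) ** 2
    print(poisson_transform(cos_squared, 0.6, 0.4),
          0.5 * (1.0 + 0.36 * math.cos(0.8)))
    print(discrete_laplacian(lambda phi: math.exp(math.cos(phi)), 0.3, 0.4))
```

The finite-difference value is only approximately zero; its size reflects the step `h` and rounding, not a failure of harmonicity.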
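To recapitulate, Laplace's equation for the disk with given boundary values is easily solved by Fourier methods. If $f$ is the boundary temperature distribution, the solution is

$$u(z) = \sum_{n=-\infty}^{\infty} \hat f(n)\,r^{|n|}e^{in\theta} = \hat f(0) + \sum_{n=1}^{\infty}\left(\hat f(-n)\bar z^n + \hat f(n)z^n\right)$$

(since $r^{|n|}e^{in\theta} = z^n$ for $n > 0$, $r^{|n|}e^{in\theta} = \bar z^{|n|}$ for $n < 0$).

Examples

8. Find the solution of Laplace's equation with boundary values $f(\theta) = \cos^3\theta + 3\sin 3\theta$. This is easy to do, for we can easily recognize the function as the boundary value of the real part of a complex differentiable function. Since

$$\cos\theta = \frac12(z + \bar z) \qquad \sin 3\theta = \frac{1}{2i}(z^3 - \bar z^3)$$

on the unit circle, we have

$$f(\theta) = \frac18(z + \bar z)^3 + \frac{3}{2i}(z^3 - \bar z^3) = \frac18\left(z^3 + 3z + 3\bar z + \bar z^3\right) + \frac{3}{2i}(z^3 - \bar z^3)$$

for $z = e^{i\theta}$, where the second form uses $z\bar z = 1$ on the circle. Thus the solution is given by this last expression for all $z$, $|z| \le 1$, since each power $z^n$, $\bar z^n$ is harmonic.

9. Solve Laplace's equation with boundary values $f(\theta) = |\theta|$. Since $|\theta|$ is not a trigonometric polynomial, we must compute the Fourier expansion.

$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} |\theta|e^{-in\theta}\,d\theta = -\frac{1}{2\pi}\int_{-\pi}^{0} \theta e^{-in\theta}\,d\theta + \frac{1}{2\pi}\int_{0}^{\pi} \theta e^{-in\theta}\,d\theta$$

$$= \frac{1}{2\pi}\int_0^{\pi} \theta\left(e^{in\theta} + e^{-in\theta}\right)d\theta = \frac1\pi\int_0^{\pi} \theta\cos n\theta\,d\theta$$

$$= -\frac{1}{\pi n}\int_0^{\pi} \sin n\theta\,d\theta = \frac{1}{\pi n^2}\cos n\theta\,\Big|_0^{\pi} = \frac{(-1)^n - 1}{\pi n^2} \qquad (n \ne 0)$$

$$\hat f(0) = \frac1\pi\int_0^{\pi} \theta\,d\theta = \frac\pi2$$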
Finally,

$$u(z) = \frac\pi2 - \frac2\pi\sum_{\substack{n\ \mathrm{odd}\\ n>0}} \frac{z^n + \bar z^n}{n^2} = \frac\pi2 - \frac2\pi\sum_{n=0}^{\infty} \frac{z^{2n+1} + \bar z^{2n+1}}{(2n+1)^2}$$

The problem analogous to the above in the case of a general domain is known as Dirichlet's problem. More precisely, Dirichlet's problem is to find, for a given domain $D$ and function $f$ defined on $\partial D$, a function harmonic in $D$ and taking the given boundary values. In 1931, O. Perron gave an elementary, but extremely clever argument which proved the existence of a solution to Dirichlet's problem. Poisson's method plays a strategic role in Perron's arguments, which we shall not go into here. However, we shall verify that the solution is unique: there can be at most one harmonic function with given boundary values. This follows from the mean value property of harmonic functions.

Proposition 2. Suppose $u$ is a harmonic function in the domain $D$. If $\Delta(z_0, R) \subset D$, then

$$u(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(z_0 + Re^{i\theta})\,d\theta$$

that is, $u(z_0)$ is the average of the values of $u$ around any circle centered at $z_0$ and contained in $D$.

Proof. We can expand $u$ in a Fourier series around any circle $|z - z_0| = r$, $r \le R$:

$$u(z) = \sum_{n=-\infty}^{\infty} a_n(r)e^{in\theta} \qquad r < R, \text{ where } z = z_0 + re^{i\theta}$$

Since $\Delta u = 0$, we must have $a_n(r) = \hat f(n)\,(r/R)^{|n|}$, where $f(\theta) = u(z_0 + Re^{i\theta})$, as already seen. Thus

$$u(z) = \sum_{n=-\infty}^{\infty} \hat f(n)\left(\frac{|z - z_0|}{R}\right)^{|n|}e^{in\arg(z - z_0)}$$

so

$$u(z_0) = \hat f(0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)\,d\theta = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(z_0 + Re^{i\theta})\,d\theta$$
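The mean value property of Proposition 2 is easy to test numerically. In the sketch below (function names are ours), the harmonic function $e^x\cos y = \operatorname{Re} e^{z}$ is averaged over a circle and compared with its value at the center; the non-harmonic function $x^2$ is included for contrast, since its circle average is $x_0^2 + R^2/2$ rather than $x_0^2$.

```python
import math


def circle_average(u, x0, y0, radius, samples=256):
    """Approximate (1/2pi) * integral of u(z0 + R e^{i theta}) d theta
    over [-pi, pi] by the periodic trapezoid rule."""
    step = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        theta = -math.pi + k * step
        total += u(x0 + radius * math.cos(theta),
                   y0 + radius * math.sin(theta))
    return total / samples


def harmonic_example(x, y):
    """Re exp(x + iy): harmonic in the whole plane."""
    return math.exp(x) * math.cos(y)


def non_harmonic_example(x, y):
    """x^2 has Laplacian 2, so the mean value property fails for it."""
    return x * x


if __name__ == "__main__":
    x0, y0, R = 0.3, -0.2, 1.5
    print(circle_average(harmonic_example, x0, y0, R), harmonic_example(x0, y0))
    print(circle_average(non_harmonic_example, x0, y0, R), non_harmonic_example(x0, y0))
```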
Corollary 1. Suppose $u$ is harmonic on the closed and bounded domain $D$. If $u \ge 0$ on $\partial D$, then $u \ge 0$ throughout $D$.

Proof. Let us suppose that the conclusion is false: that at some point $z_0$ inside $D$, $u(z_0) < 0$. We shall derive a contradiction. We may take for $z_0$ a point at which $u$ takes its minimum value. There is such a point since $D$ is closed and bounded, and it is interior to $D$ since $u \ge 0$ on $\partial D$. Let $\Delta(z_0, R)$ be the largest disk centered at $z_0$ contained in $D$. The boundary of $\Delta(z_0, R)$ must touch $\partial D$ (see Figure 6.2), for if not we could find a larger disk centered at $z_0$ and contained in $D$. Thus there are points on the circle $|z - z_0| = R$ at which $u \ge 0$. Since $u(z_0)$ is the average value of $u$ on this circle, and $u(z_0) < 0$, there must be points on this circle at which $u < u(z_0)$ in order to compensate. But $u(z_0)$ is the minimum value of $u$, so we have a contradiction.

More precisely, since $u(z) \ge u(z_0)$ for all $z \in D$, $u(z_0 + Re^{i\theta}) - u(z_0) \ge 0$ for all $\theta$. On the other hand, by the mean value property

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\left(u(z_0 + Re^{i\theta}) - u(z_0)\right)d\theta = 0$$

When the integral of a continuous nonnegative function is zero, that function is identically zero. Thus, $u(z_0 + Re^{i\theta}) = u(z_0)$ for all $\theta$. This contradicts the fact that for some $\theta$, $u(z_0 + Re^{i\theta}) \ge 0$.

Figure 6.2
Corollary 2. A function harmonic on a closed and bounded domain $D$ is uniquely determined by its boundary values.

Proof. Suppose that $u$, $v$ are both harmonic in $D$, and $u = v$ on $\partial D$. Let $\varepsilon > 0$. Then $u - v + \varepsilon$, $v - u + \varepsilon$ are both positive on $\partial D$. By Corollary 1, they are both nonnegative in $D$, thus

$$u \ge v - \varepsilon \qquad v \ge u - \varepsilon \qquad\text{in } D$$

Since $\varepsilon$ is arbitrary, we may now let it tend to zero. We conclude that $u \ge v$ and $v \ge u$ throughout $D$. Thus $u = v$ in $D$.

Another problem of heat transfer is this: find the steady-state temperature distribution on the unit disk assuming a given rate of heat flow through the boundary, and no other source or loss of heat. Now the velocity of heat flow, denoted $q$, is a vector field on the domain and it is a law of thermodynamics that this field is proportional to the temperature gradient, but oppositely directed. Thus, in this problem, our given data are the rate of heat flow perpendicular to the boundary of the unit disk, which is proportional to $\partial u/\partial r$ on the boundary. By the law of conservation of energy, since we are assuming a steady state, the total energy change is zero; thus we must impose this condition: $\int_{-\pi}^{\pi} (\partial u/\partial r)(e^{i\theta})\,d\theta = 0$.

Thus, the mathematical formulation of this problem (known as Neumann's problem) is this: Find a function $u$ harmonic in the unit disk such that $(\partial u/\partial r)(e^{i\theta})$ assumes given boundary values $g(\theta)$. We impose the condition $\int_{-\pi}^{\pi} g(\theta)\,d\theta = 0$. (It is necessary to impose this condition in order to obtain a solution, for mathematical reasons, as you will see in Problem 8.) We again solve this problem by Fourier methods. Find

$$u(re^{i\theta}) = \sum_{n=-\infty}^{\infty} a_n(r)e^{in\theta}$$

so that (i) $(\partial u/\partial r)(e^{i\theta}) = g(\theta)$, (ii) $\Delta u = 0$. Again, this leads to the ordinary differential equation (6.19) with the boundary condition $a_n'(1) = \hat g(n)$. The solution continuous at the origin is $|n|^{-1}\hat g(n)\,r^{|n|}$.
Thus the solution must be given by

$$u(re^{i\theta}) = \sum_{\substack{n=-\infty\\ n\ne0}}^{\infty} \frac{\hat g(n)}{|n|}\,r^{|n|}e^{in\theta} \tag{6.20}$$

We will omit the proof that this function does solve Neumann's problem; the argument is much like that in Theorem 6.1. We can, of course, collapse
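(6.20) into an integral formula:

$$u(re^{i\theta}) = \sum_{n\ne0} \frac{1}{|n|}\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)e^{-in\phi}\,d\phi\right]r^{|n|}e^{in\theta}$$

$$= \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)\left[\sum_{n=1}^{\infty}\frac{r^ne^{-in(\theta-\phi)}}{n} + \sum_{n=1}^{\infty}\frac{r^ne^{in(\theta-\phi)}}{n}\right]d\phi$$

$$= \frac1\pi\int_{-\pi}^{\pi} g(\phi)\operatorname{Re}\left[\sum_{n=1}^{\infty}\frac{(re^{i(\theta-\phi)})^n}{n}\right]d\phi = -\frac1\pi\int_{-\pi}^{\pi} g(\phi)\operatorname{Re}\left[\ln(1 - re^{i(\theta-\phi)})\right]d\phi$$

Now

$$|1 - re^{it}|^2 = 1 + r^2 - 2r\cos t$$

so

$$\operatorname{Re}\ln(1 - re^{it}) = \tfrac12\ln|1 - re^{it}|^2 = \tfrac12\ln(1 + r^2 - 2r\cos t)$$

Thus the solution to Neumann's problem takes the form (6.20) or

$$u(re^{i\theta}) = -\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\phi)\ln\left[1 + r^2 - 2r\cos(\theta - \phi)\right]d\phi$$

• EXERCISES

3. Solve Dirichlet's problem in the disk with these boundary conditions:
(a) $f(\theta) = \begin{cases} -1 & \theta < 0 \\ \phantom{-}1 & \theta > 0 \end{cases}$
(b) $f(\theta) = \sin^2\theta - \cos^2\theta$
(c) $f(\theta) = \pi^2 - \theta^2$
(d) $f$ as is given in Exercise 1(c).
(e) $f$ as is given in Exercise 2(f).

4. Solve Neumann's problem with these boundary conditions:
(a) $f(\theta) = \sin\theta + 2\cos 2\theta$
(b) $f(\theta) = \begin{cases} \phantom{-}\theta^2 & \theta > 0 \\ -\theta^2 & \theta < 0 \end{cases}$
(c) $f$ as is given in Exercise 3(a).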
• PROBLEMS

7. Show that the Laplacian is given in polar coordinates by

$$\Delta u = \frac{1}{r^2}\frac{\partial^2 u}{\partial\theta^2} + \frac1r\frac{\partial}{\partial r}\left(r\frac{\partial u}{\partial r}\right)$$

8. Verify that it is necessary that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\theta)\,d\theta = 0$$

for there to be a function $u$ harmonic in the disk such that

$$\lim_{r\to1}\frac{\partial u}{\partial r}(r,\theta) = g(\theta)$$

9. Verify by direct computation that $P(r, \theta)$ is harmonic.

10. Show that if $f$ is a complex differentiable function (it satisfies the Cauchy-Riemann equations), then $f$ is harmonic.

11. We can prove, using the Poisson transform, this remarkable fact about complex differentiable functions:

Theorem. Suppose that $f$ is a complex differentiable function on the unit disk. Then $f$ is the sum of a convergent power series centered at the origin.

The proof goes like this: Let $g(\theta) = f(e^{i\theta})$. Since $f$ is harmonic in the disk (Problem 10), it solves Dirichlet's problem with the boundary values $g$. Thus $f(re^{i\theta}) = Pg(re^{i\theta})$. Now prove this fact.

(a) If the Poisson transform $Pg$ is complex differentiable, then $\hat g(n) = 0$ for $n < 0$. (Hint: Apply $\partial/\partial x + i\,\partial/\partial y$ to the expression

$$Pg(re^{i\theta}) = \hat g(0) + \sum_{n=1}^{\infty}\left(\hat g(-n)\bar z^n + \hat g(n)z^n\right))$$

(b) Deduce from (a) that

$$f(re^{i\theta}) = \sum_{n=0}^{\infty} \hat g(n)z^n$$

12. Under what conditions on $f$, $g$ is $P(fg) = P(f)P(g)$?

13. (a) Show that if $f$ is a continuous function on the domain $D$ with the mean value property:

$$f(z_0) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(z_0 + Re^{i\theta})\,d\theta \quad\text{for every } \Delta(z_0, R) \subset D$$
then $f$ satisfies a maximum principle: $f(z_0) \le \max\{f(z) : z \in \partial D\}$, for every $z_0 \in D$.
(b) Conclude that a function having the mean value property is harmonic.

14. Prove: A bounded function defined on the entire plane which is harmonic must be constant.

6.3 Fourier Sine and Cosine Series

There are many notationally different ways of expressing the Fourier expansion of a function, depending mostly on the dictates of the problem at hand. We shall devote this section to the development of these various expressions. First of all, since the main physical study is that of real-valued functions we should introduce the purely real notation. We merely convert the Fourier expansion $\sum \hat f(n)e^{in\theta}$ via the expressions

$$e^{in\theta} = \cos n\theta + i\sin n\theta \qquad e^{-in\theta} = \cos n\theta - i\sin n\theta \qquad n > 0$$

Thus the Fourier expansion will take the form

$$A_0 + \sum_{n=1}^{\infty}\left(A_n\cos n\theta + B_n\sin n\theta\right) \tag{6.21}$$

where the $A$'s and $B$'s are found from the Fourier coefficients $C_n = \hat f(n)$ as follows:

$$\sum_{n=-\infty}^{\infty} C_ne^{in\theta} = \sum_{n=-\infty}^{\infty} C_n(\cos n\theta + i\sin n\theta) = C_0 + \sum_{n=1}^{\infty}\left[(C_n + C_{-n})\cos n\theta + i(C_n - C_{-n})\sin n\theta\right]$$

Thus

$$A_0 = C_0 \qquad A_n = C_n + C_{-n} \qquad B_n = i(C_n - C_{-n}) \qquad n > 0$$

Notice that if $f$ is real valued

$$C_{-n} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{in\phi}\,d\phi = \overline{\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)e^{-in\phi}\,d\phi\right]} = \overline{C_n}$$
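Thus we have

$$A_0 = C_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\phi)\,d\phi$$

$$A_n = 2\operatorname{Re} C_n = \frac1\pi\int_{-\pi}^{\pi} f(\phi)\cos n\phi\,d\phi \qquad n > 0 \tag{6.22}$$

$$B_n = -2\operatorname{Im} C_n = \frac1\pi\int_{-\pi}^{\pi} f(\phi)\sin n\phi\,d\phi \qquad n > 0 \tag{6.23}$$

Furthermore,

$$C_n = \tfrac12(A_n - iB_n) \qquad C_{-n} = \tfrac12(A_n + iB_n) \qquad n > 0$$

Examples

10. Express the Fourier series of $\pi^2 - \theta^2$ in the form (6.21). From Example 4, we have

$$\hat f(0) = \frac23\pi^2 \qquad \hat f(n) = \frac{2(-1)^{n+1}}{n^2} \quad (n \ne 0)$$

Thus

$$A_0 = \frac23\pi^2 \qquad A_n = \frac{4(-1)^{n+1}}{n^2} \qquad B_n = 0$$

and we obtain this Fourier expansion:

$$\pi^2 - \theta^2 = \frac{2\pi^2}{3} + 4\sum_{n>0}\frac{(-1)^{n+1}}{n^2}\cos n\theta$$

Notice that equality is justified by Corollary 1 to Theorem 6.1 since the Fourier series does converge. Evaluating at $\theta = 0$, we obtain this interesting fact:

$$\sum_{n=1}^{\infty}\frac{(-1)^{n+1}}{n^2} = \frac{\pi^2}{12}$$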
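The passage from the complex coefficients $C_n$ to the real coefficients $A_n$, $B_n$ can be illustrated numerically for Example 10. This is a minimal sketch with our own helper names; the integrals are approximated by the midpoint rule, so agreement is only up to a small quadrature error.

```python
import cmath
import math


def complex_coefficient(f, n, samples=4000):
    """Approximate C_n = (1/2pi) * integral of f(phi) e^{-i n phi} dphi
    over [-pi, pi] by the midpoint rule."""
    step = 2.0 * math.pi / samples
    total = 0j
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * step
        total += f(phi) * cmath.exp(-1j * n * phi)
    return total / samples


def real_coefficients(f, n):
    """Return (A_n, B_n) for n > 0 via A_n = C_n + C_{-n}, B_n = i (C_n - C_{-n}).
    For real-valued f both numbers are real, so the real parts are returned."""
    c_plus = complex_coefficient(f, n)
    c_minus = complex_coefficient(f, -n)
    return (c_plus + c_minus).real, (1j * (c_plus - c_minus)).real


def alternating_sum(terms):
    """Partial sum of sum_{n>=1} (-1)^(n+1) / n^2."""
    return sum((-1) ** (n + 1) / n ** 2 for n in range(1, terms + 1))


if __name__ == "__main__":
    f = lambda theta: math.pi ** 2 - theta ** 2
    for n in range(1, 5):
        print(n, real_coefficients(f, n), 4.0 * (-1) ** (n + 1) / n ** 2)
    print(alternating_sum(2000), math.pi ** 2 / 12.0)
```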
11. Express the Fourier series of $|\theta|$ in the form (6.21). Reading from Example 9, we have the Fourier series for $|\theta|$:

$$\frac\pi2 - \frac2\pi\sum_{\substack{n\ \mathrm{odd}\\ n>0}}\frac{e^{-in\theta} + e^{in\theta}}{n^2}$$

Thus we have the real Fourier series

$$\frac\pi2 - \frac4\pi\sum_{\substack{n\ \mathrm{odd}\\ n>0}}\frac{\cos n\theta}{n^2}$$

Evaluating at $\theta = 0$, we obtain

$$\sum_{\substack{n\ \mathrm{odd}\\ n>0}}\frac{1}{n^2} = \frac{\pi^2}{8}$$

12. As usual, trigonometric polynomials can be handled directly, without computation of integrals:

$$\cos^4\theta = \left(\frac{e^{i\theta} + e^{-i\theta}}{2}\right)^4 = \frac{1}{16}\left(e^{4i\theta} + 4e^{2i\theta} + 6 + 4e^{-2i\theta} + e^{-4i\theta}\right) = \frac{1}{16}\left(2\cos 4\theta + 8\cos 2\theta + 6\right)$$

$$\cos^4\theta = \frac38 + \frac12\cos 2\theta + \frac18\cos 4\theta$$

Even and Odd Functions

A function of a real variable is called an even function if $f(x) = f(-x)$ for all $x$, and it is an odd function if $f(x) = -f(-x)$. Notice that the product of two odd functions is even, and the product of an odd and even function is odd. If $f$ is an odd function on the interval $[-A, A]$, then

$$\int_{-A}^{A} f(t)\,dt = \int_{-A}^{0} f(t)\,dt + \int_0^{A} f(t)\,dt = -\int_0^{A} f(t)\,dt + \int_0^{A} f(t)\,dt = 0$$

We can conclude that if $f$ is an even function on the interval $[-\pi, \pi]$, its Fourier series is purely a cosine series. For in this case $f(\phi)\sin n\phi$ is odd for all $n$, so the integrals (6.23) all vanish. Similarly, if $f$ is an odd function its Fourier series is purely a sine series.
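The parity statements above can be observed directly from the integrals (6.22) and (6.23). The sketch below uses our own helper names and a midpoint-rule approximation: the sine coefficients of the even function $|\theta|$ and the cosine coefficients of the odd function $\theta$ come out as zero up to rounding, and the cosine coefficients of $|\theta|$ match $-4/(\pi n^2)$ for odd $n$.

```python
import math


def cosine_coefficient(f, n, samples=4000):
    """Approximate A_n = (1/pi) * integral of f(phi) cos(n phi) dphi, n > 0."""
    step = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * step   # midpoints, symmetric about 0
        total += f(phi) * math.cos(n * phi)
    return total * step / math.pi


def sine_coefficient(f, n, samples=4000):
    """Approximate B_n = (1/pi) * integral of f(phi) sin(n phi) dphi, n > 0."""
    step = 2.0 * math.pi / samples
    total = 0.0
    for k in range(samples):
        phi = -math.pi + (k + 0.5) * step
        total += f(phi) * math.sin(n * phi)
    return total * step / math.pi


if __name__ == "__main__":
    for n in range(1, 6):
        print(n,
              cosine_coefficient(abs, n), sine_coefficient(abs, n),
              cosine_coefficient(lambda t: t, n), sine_coefficient(lambda t: t, n))
```

Passing the built-in `abs` as the function $|\theta|$ is just a convenience of the sketch.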
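Example

13. The Fourier series of $\theta$ is of the form

$$\sum_{n=1}^{\infty} B_n\sin n\theta$$

since $\theta$ is an odd function. Here

$$B_n = \frac1\pi\int_{-\pi}^{\pi}\theta\sin n\theta\,d\theta = -\frac1\pi\,\frac{\theta\cos n\theta}{n}\,\Big|_{-\pi}^{\pi} + \frac{1}{\pi n}\int_{-\pi}^{\pi}\cos n\theta\,d\theta = -\frac2n\cos n\pi = -\frac{2(-1)^n}{n}$$

Thus $\theta$ has the Fourier series $-2\sum_{n=1}^{\infty}\dfrac{(-1)^n}{n}\sin n\theta$.

Now, all our computations have been done for periodic functions of period $2\pi$. Periodic functions arising in physics do not usually have such a convenient period, yet they are subject to Fourier methods merely by a normalization. Suppose that $f$ is a periodic function of period $L$. Then $g(\theta) = f(L\theta/2\pi)$ is periodic of period $2\pi$. For

$$g(\theta + 2\pi) = f\left(\frac{L}{2\pi}(\theta + 2\pi)\right) = f\left(\frac{L\theta}{2\pi} + L\right) = f\left(\frac{L\theta}{2\pi}\right) = g(\theta)$$

Now, if $g$ can be expanded in a Fourier series:

$$g(\theta) = A_0 + \sum_{n=1}^{\infty}\left(A_n\cos n\theta + B_n\sin n\theta\right)$$

then we can write

$$f(x) = g\left(\frac{2\pi x}{L}\right) = A_0 + \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{2\pi nx}{L}\right) + B_n\sin\left(\frac{2\pi nx}{L}\right)\right] \tag{6.24}$$

where (as is easy to compute by the change of coordinates $\phi = 2\pi L^{-1}x$)

$$A_0 = \frac1L\int_{-L/2}^{L/2} f(x)\,dx \qquad A_n = \frac2L\int_{-L/2}^{L/2} f(x)\cos\frac{2\pi nx}{L}\,dx \tag{6.25}$$

$$B_n = \frac2L\int_{-L/2}^{L/2} f(x)\sin\frac{2\pi nx}{L}\,dx \tag{6.26}$$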
With these formulas the Fourier analysis of functions periodic of period $L$ is made possible.

Fourier Cosine Series

There are yet two more variations which are, as we shall see, of value in the study of partial differential equations. Let $f$ be a given periodic function with period $L$ and define

$$g(\theta) = \begin{cases} f\left(\dfrac{L\theta}{\pi}\right) & 0 \le \theta \le \pi \\[6pt] f\left(-\dfrac{L\theta}{\pi}\right) & -\pi \le \theta \le 0 \end{cases} \tag{6.27}$$

Then $g$ is an even function on the interval $[-\pi, \pi]$, so it can be expressed by a Fourier series involving only cosines:

$$g(\theta) = A_0 + \sum_{n=1}^{\infty} A_n\cos n\theta$$

where

$$A_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\theta)\,d\theta = \frac1\pi\int_0^{\pi} g(\theta)\,d\theta \qquad A_n = \frac1\pi\int_{-\pi}^{\pi} g(\theta)\cos n\theta\,d\theta = \frac2\pi\int_0^{\pi} g(\theta)\cos n\theta\,d\theta$$

Now, making the substitution $g(\theta) = f(L\theta/\pi)$ in the interval $0 \le \theta \le \pi$, these expressions become

$$f(x) = A_0 + \sum_{n=1}^{\infty} A_n\cos\frac{\pi nx}{L} \tag{6.28}$$

$$A_0 = \frac1L\int_0^{L} f(x)\,dx \qquad A_n = \frac2L\int_0^{L} f(x)\cos\frac{\pi nx}{L}\,dx \tag{6.29}$$

We pause to remind the reader that the use of equality in Equations (6.24) and (6.28) is not literal; it holds only if the series converge (say if $g$ is twice continuously differentiable). The point is that in such cases the expansions (6.24), (6.28) are valid, where the coefficients are defined by (6.25), (6.26), or
(6.29), respectively. The choice of these expansions is free; it is usually dependent on the demands of the particular problem at hand. Equation (6.28) is called the Fourier cosine series for the function $f$. Of course, if we define $g$ as an odd function, instead of the expression (6.28) we can obtain the Fourier sine series for $f$:

$$f(x) = \sum_{n=1}^{\infty} B_n\sin\frac{\pi nx}{L} \tag{6.30}$$

where

$$B_n = \frac2L\int_0^{L} f(x)\sin\frac{\pi nx}{L}\,dx \tag{6.31}$$

We leave the verification of this possibility to the readers as a problem.

• EXERCISES

5. Find the Fourier expansions into sines and cosines for these functions:
(a) $\cos^8\theta$
(b) $\sin^k\theta$, $k$ a positive integer
(c) $f$ as given in Exercise 3(a).
(d) $f$ as given in Exercise 1(g).
(e) $f$ as given in Exercise 1(b).

6. Find the function whose Fourier expansion is 2™=-« e'"8/'"

7. Find the Fourier sine and Fourier cosine series for these periodic functions of period 1:
(a) $f(x) = 1$, all $x$
(b) $f(x) = \sin(2\pi x)$
(c) $f(x) = \cos(2\pi x)$
(d) $f(x) = \begin{cases} 1 & 0 \le x < 1/2 \\ 0 & 1/2 \le x < 1 \end{cases}$
(e) $f(x) = \sin(\pi x)$
(f) $f(x) = \begin{cases} x & 0 \le x < 1/2 \\ 1 - x & 1/2 \le x < 1 \end{cases}$
(g) $f(x) = \sin(\pi x) + \cos(\pi x)$

8. Show that any periodic function on the circle is the sum of an even function and an odd function.

9. What is the Fourier expansion of $f(\theta) + f(\pi - \theta)$ in terms of that for $f(\theta)$?
6.4 The One-Dimensional Wave and Heat Equations

In physics, Fourier analysis begins with the study of wave motions. Suppose we have a homogeneous string of density $\rho$ and length $L$ lying on the horizontal axis in the plane which is kept extended by equal and opposite forces of magnitude $k$ at the end points. If we pluck the string, it will follow a motion which is (classically) determined by Newton's laws. We shall derive the differential equation governing the motion.

At some time $t$ the string has a shape somewhat like that pictured in Figure 6.3. We shall refer to a point on the string according to the distance $s$, measured along the string from the left end point. The position in the plane of the point at distance $s$ at time $t$ will be denoted by $z(s, t)$. This is the function that fully describes the motion.

Now, if we argue as if the string were a collection of points, we will get nowhere. For the only forces acting on the string are those obtained by transferring the equal, but opposite forces at the end points tangentially along the string. Thus, at any point the sum of the forces acting is zero, so there can be no motion. As that is contrary to fact, this model of the string is inadequate and we must select another. Now we consider the string as a large finite collection of segments and again try to deduce the equation of motion from Newton's laws. Having done that, we can idealize by letting the number of segments become infinite (as their lengths tend to zero) and obtain a differential equation.
Let $s_0$ and $s_0 + \Delta s$ be the end points of such a segment (see Figure 6.4). The mass of this segment is $\rho\,\Delta s$ and the forces acting on it are opposed tangential forces of magnitude $k$ acting at the end points. Letting $T(s)$ be the tangent vector at the point $s$, these forces are thus $-kT(s_0)$, $kT(s_0 + \Delta s)$, respectively. If $A$ is the acceleration of this segment, we have by Newton's laws

$$\rho\,\Delta s\,A = k\left[T(s_0 + \Delta s) - T(s_0)\right]$$

Now, $T(s) = \partial z(s, t)/\partial s$ and $\lim_{\Delta s\to0} A = (\partial^2 z/\partial t^2)(s_0, t)$. Thus

$$A = \frac{k}{\rho}\,\frac{(\partial z/\partial s)(s_0 + \Delta s, t) - (\partial z/\partial s)(s_0, t)}{\Delta s}$$

and now letting $\Delta s \to 0$ we obtain the equation of motion

$$\frac{\partial^2 z}{\partial t^2} = \frac{k}{\rho}\,\frac{\partial^2 z}{\partial s^2}$$
Figure 6.3

This equation, called the one-dimensional wave equation, is usually written

$$\frac{\partial^2 z}{\partial s^2} = \frac{1}{c^2}\,\frac{\partial^2 z}{\partial t^2} \tag{6.32}$$

(where the substitution $c^2 = k/\rho$ is legitimate since both $k$, $\rho$ are positive). We now make the (physically plausible) assumption that the horizontal motion is negligible (for we are interested only in almost horizontal wave motions with small fluctuations). This assumption allows the replacement of $s$ by the horizontal coordinate $x$, and the position vector $z$ by only the vertical coordinate $y$. Thus (6.32) becomes simply

$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2} \tag{6.33}$$

The motion of the string is completely governed by this partial differential equation and the initial displacement and velocity:

$$y(s, 0) = f(s) \qquad \frac{\partial y}{\partial t}(s, 0) = g(s) \tag{6.34}$$

Figure 6.4 (the forces $-kT(s_0)$ and $kT(s_0 + \Delta s)$ acting on a segment)
The technique for solving this differential equation with boundary conditions is the same as in the theory of ordinary differential equations. We find an independent set of solutions of the general equation and hypothesize that the solution we seek is a linear combination of these. We then identify the coefficients by substituting the initial conditions. However, the situation is more complicated than in the one-variable theory. The space of solutions of (6.32) is infinite dimensional, so the particular solution cannot be picked out of the general solution by means of simple linear algebra. This difficulty will be overcome, as we shall see, because the form of the general solution will be that of a Fourier expansion and so the initial data will give us the coefficients by Fourier methods.

Let us now solve the differential equation

$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2} \tag{6.35}$$

for a function $y$ defined on the interval $[0, L]$ and where these conditions must be satisfied

$$y(0, t) = 0 \qquad y(L, t) = 0 \qquad\text{all } t \tag{6.36}$$

$$y(x, 0) = f(x) \qquad \frac{\partial y}{\partial t}(x, 0) = g(x) \tag{6.37}$$

for given functions $f$, $g$. First, we put aside the initial data (6.37) and find all solutions of Equation (6.35) subject to (6.36). Since we have no techniques available, we have to make a guess at the form of the solution, and hope that our guess is general enough (of course, in the end it will turn out to be so). The guess that works is

$$y(x, t) = F(x)G(t)$$

and (6.35) becomes

$$F''(x)G(t) = \frac{1}{c^2}F(x)G''(t)$$

or, what is the same (since we exclude the zero solution),

$$\frac{F''(x)}{F(x)} = \frac{1}{c^2}\,\frac{G''(t)}{G(t)}$$
The left-hand side is independent of $t$, and the right is independent of $x$. Since they are the same, they are both constant. Thus, there must be a $\lambda$ such that

$$\frac{F''}{F} = \lambda \qquad \frac{1}{c^2}\,\frac{G''}{G} = \lambda$$

Now, incorporating the conditions (6.36), we arrive at this one-variable boundary value problem.

$$F'' - \lambda F = 0 \quad\text{for some } \lambda \tag{6.38}$$

$$F(0) = 0 \qquad F(L) = 0 \tag{6.39}$$

We can find all solutions of this problem. First of all, we see from (6.38) that the general form of $F$ is

$$F(x) = c_1\exp(\sqrt\lambda\,x) + c_2\exp(-\sqrt\lambda\,x)$$

Substituting the boundary conditions (6.39), we have

$$0 = F(0) = c_1 + c_2 \qquad 0 = F(L) = c_1\exp(\sqrt\lambda\,L) + c_2\exp(-\sqrt\lambda\,L)$$

In order for there to be a nonzero solution of both equations we must have $c_1 = -c_2$ and $\exp(\sqrt\lambda\,L) = \exp(-\sqrt\lambda\,L)$, or $\exp(2\sqrt\lambda\,L) = 1$. Thus we must have $2\sqrt\lambda\,L = 2\pi ni$ for some $n > 0$, or $\sqrt\lambda = \pi ni/L$. Therefore, the only possible solutions of (6.38), (6.39) are

$$F(x) = \exp\left(\frac{\pi ni}{L}x\right) - \exp\left(-\frac{\pi ni}{L}x\right) = 2i\sin\left(\frac{\pi n}{L}x\right) \qquad\text{all } n > 0$$

Corresponding to the solution $F_n(x) = \sin(\pi nx/L)$, for which $\lambda = -\pi^2n^2/L^2$, we now solve for $G$:

$$G'' = c^2\lambda\,G = -\frac{\pi^2n^2c^2}{L^2}\,G$$

The solutions are spanned by $G_n(t) = \cos(\pi nct/L)$, $\sin(\pi nct/L)$. Thus, all
solutions of (6.35) of the form $F(x)G(t)$ are these

$$\sin\left(\frac{\pi nx}{L}\right)\cos\left(\frac{\pi nct}{L}\right) \qquad \sin\left(\frac{\pi nx}{L}\right)\sin\left(\frac{\pi nct}{L}\right) \tag{6.40}$$

We now return to our particular initial conditions (6.37) and hope to find a linear combination of the functions (6.40) which has those initial conditions. Of course, the linear combination will satisfy (6.35) since it is a linear differential equation. (However, we must caution the reader that ours will be an infinite linear combination so questions of convergence are inevitable. If the initial data are well behaved, these problems disperse as you shall see in Problem 15.) Thus we seek

$$y(x, t) = \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{\pi nct}{L}\right) + B_n\sin\left(\frac{\pi nct}{L}\right)\right]\sin\left(\frac{\pi nx}{L}\right)$$

satisfying the conditions (6.37):

$$f(x) = y(x, 0) = \sum_{n=1}^{\infty} A_n\sin\left(\frac{\pi nx}{L}\right)$$

$$g(x) = \frac{\partial y}{\partial t}(x, 0) = \sum_{n=1}^{\infty}\frac{\pi nc}{L}\,B_n\sin\left(\frac{\pi nx}{L}\right)$$

But we can solve these equations, for these are just the expansions of $f$ and $g$ into Fourier sine series. We collect this discussion into the following proposition.

Proposition 3. If the functions $f$, $g$ defined on the interval $[0, L]$ are well behaved (say at least twice differentiable), then the wave equation

$$\frac{\partial^2 y}{\partial x^2} = \frac{1}{c^2}\,\frac{\partial^2 y}{\partial t^2}$$

with the boundary data $y(0, t) = 0$, $y(L, t) = 0$ and the initial data $y(x, 0) = f(x)$, $(\partial y/\partial t)(x, 0) = g(x)$ has a solution. The solution is given by

$$y(x, t) = \sum_{n=1}^{\infty}\left[A_n\cos\left(\frac{\pi nct}{L}\right) + B_n\sin\left(\frac{\pi nct}{L}\right)\right]\sin\left(\frac{\pi nx}{L}\right)$$
where

$$A_n = \frac2L\int_0^{L} f(x)\sin\left(\frac{\pi nx}{L}\right)dx \qquad B_n = \frac{2}{\pi nc}\int_0^{L} g(x)\sin\left(\frac{\pi nx}{L}\right)dx$$

Examples

14. Solve the wave equation

$$\frac{\partial^2 y}{\partial x^2} = \frac14\,\frac{\partial^2 y}{\partial t^2}$$

on the interval $[0, \pi]$ with initial data

$$y(x, 0) = \sin 2x \qquad \frac{\partial y}{\partial t}(x, 0) = \sin^2 x$$

Now $c = 2$, $L = \pi$. The Fourier sine series for $y(x, 0)$ is just $\sin 2x$, so $A_n = 0$ unless $n = 2$, $A_2 = 1$. Now

$$B_n = \frac{1}{\pi n}\int_0^{\pi}\sin^2 x\,\sin(nx)\,dx = \frac{1}{\pi n}\int_0^{\pi}\frac{1 - \cos 2x}{2}\sin(nx)\,dx = 0 \quad\text{if } n \text{ is even}$$

We concentrate now on the case where $n$ is odd:

$$B_n = \frac{1}{\pi n^2} - \frac{1}{2\pi n}\int_0^{\pi}\cos(2x)\sin(nx)\,dx \tag{6.41}$$

Now

$$\int_0^{\pi}\cos(2x)\sin(nx)\,dx = -\frac{\cos(2x)\cos(nx)}{n}\,\Big|_0^{\pi} - \frac{2}{n^2}\sin(2x)\sin(nx)\,\Big|_0^{\pi} + \frac{4}{n^2}\int_0^{\pi}\cos(2x)\sin(nx)\,dx$$
Thus

$$\int_0^{\pi}\cos(2x)\sin(nx)\,dx = \frac{n}{n^2 - 4}(1 - \cos\pi n) = \frac{2n}{n^2 - 4} \qquad (n \text{ odd})$$

Now, putting the result of this computation into (6.41):

$$B_n = \frac{1}{\pi n^2} - \frac{1}{2\pi n}\cdot\frac{2n}{n^2 - 4} = \frac{-4}{\pi n^2(n^2 - 4)}$$

Thus the solution is given by

$$y(x, t) = \cos 4t\,\sin 2x - \frac4\pi\sum_{\substack{n=1\\ n\ \mathrm{odd}}}^{\infty}\frac{\sin(2nt)}{n^2(n^2 - 4)}\sin nx$$

$$y(x, t) = \cos 4t\,\sin 2x - \frac4\pi\sum_{m=0}^{\infty}\frac{\sin\bigl(2(2m+1)t\bigr)\,\sin\bigl((2m+1)x\bigr)}{(2m+1)^2(4m^2 + 4m - 3)}$$

15. Solve the same wave equation with initial data

$$y(x, 0) = \sin x + \sin 5x + 2\sin 6x \qquad \frac{\partial y}{\partial t}(x, 0) = 0$$

The expressions for the initial conditions are the Fourier sine series for those functions; thus we can read off the solution:

$$y(x, t) = \cos 2t\,\sin x + \cos 10t\,\sin 5x + 2\cos 12t\,\sin 6x$$

Heat Transfer

Another physical problem which gives rise to a partial differential equation which can be solved in a similar way is the problem of one-dimensional heat transfer. We shall derive this equation here (the derivation in Chapter 8 of this equation in higher dimensions shall be seen to be completely analogous). Suppose we are given a thin homogeneous rod of length $L$ lying on the horizontal axis. Let $u(x, t)$ be the temperature at $x$ at time $t$. We assume that there is no heat loss, and the temperatures at the end points are maintained constant. Now the basic physical law here is that the flow of heat is proportional to the temperature gradient, but points in the opposite direction.
Thus, during a small interval of time $\Delta t$ the heat (energy) passing from left to right through a point $x_0$ is proportional to $-(\partial u/\partial x)(x_0)\cdot\Delta t$. If we select a segment of the rod with end points $x_0$ and $x_0 + \Delta x$, the increase in energy in that segment of the rod is proportional to

$$-\left(-\frac{\partial u}{\partial x}(x_0+\Delta x)\,\Delta t\right) + \left(-\frac{\partial u}{\partial x}(x_0)\,\Delta t\right) \tag{6.42}$$

On the other hand, the increase in energy is proportional to the product of the mass and the change in temperature. Thus (6.42) is proportional to $\Delta u\cdot\Delta x$. Letting $k^2$ be the constant of proportionality we have, for the period of time $\Delta t$:

$$\Delta u\cdot\Delta x = k^2\left[\frac{\partial u}{\partial x}(x_0+\Delta x) - \frac{\partial u}{\partial x}(x_0)\right]\Delta t$$

Dividing by $\Delta x\cdot\Delta t$ and letting both tend to zero, we obtain the heat equation

$$\frac{1}{k^2}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \tag{6.43}$$

We now propose to solve (6.43) given the boundary conditions

$$u(0,t) = 0 \qquad u(L,t) = 0 \tag{6.44}$$

and the initial temperature distribution

$$u(x,0) = f(x) \tag{6.45}$$

The technique is the same as that for the wave equation. We try a solution of the form $u(x,t) = F(x)G(t)$. Then (6.43) becomes

$$\frac{1}{k^2}\,G'(t)F(x) = F''(x)G(t)$$

Dividing by $F(x)G(t)$, we again find that there must be a $\lambda$ such that

$$\frac{F''}{F} = \lambda \qquad \frac{G'}{G} = k^2\lambda$$

The first equation, subject to the boundary conditions (6.44), again has only the solutions $\sin(\pi n x/L)$, $n > 0$, corresponding to the choices $\sqrt{-\lambda} = \pi n/L$. The
second equation becomes

$$G' = -\frac{\pi^2 n^2 k^2}{L^2}\,G$$

which has the solutions

$$G_n(t) = \exp\left(-\frac{\pi^2 n^2 k^2}{L^2}\,t\right)$$

For convenience, let us write $C = \pi k/L$. We now try to fit the series

$$\sum_{n=1}^{\infty} A_n \exp(-C^2n^2t)\,\sin\left(\frac{\pi n x}{L}\right) \tag{6.46}$$

to the initial conditions. Evaluating at $t = 0$ we find that the $\{A_n\}$ must be the Fourier sine coefficients of $f(x)$.

Proposition 4. If the function $f$ defined on the interval $[0, L]$ is well-behaved (say at least twice continuously differentiable), then the heat equation

$$\frac{1}{k^2}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$

with the boundary data $u(0,t) = 0 = u(L,t)$ and the initial condition $u(x,0) = f(x)$ has a solution. The solution is given by (6.46), where $C = \pi k/L$ and

$$A_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{\pi n x}{L}\right)dx$$

Now, the wave and heat equations readily and conveniently led us to the considerations of Fourier analysis. Actually this could have been (and in fact was) anticipated on physical grounds, for we should expect periodic behavior in these circumstances. Other partial differential equations arising out of physics can be solved by similar techniques, but we do not necessarily end up with a sequence of solutions of the general equation which are made up of trigonometric functions. Thus Fourier analysis itself does not apply, although the fundamental ideas may carry over. The typical situation is this: a partial differential operator $P$ is given on a certain domain $D$; we seek a solution $f$ of

$$P(f) = 0$$
subject to certain boundary conditions "B" and initial data $f(x, 0) = g(x)$. First, we find all solutions of $P(f) = 0$ subject to the boundary conditions B, without regard to initial conditions. If $\{S_1, \ldots, S_n, \ldots\}$ are these solutions, then we try to find a linear combination $\sum a_n S_n$ which fits our initial data:

$$\sum a_n S_n(x, 0) = g(x)$$

In our typical situation the $S_n(x, 0)$ are orthonormal in the sense of some convenient inner product on the space of all initial data. In this case the $a_n$ are readily computable. The cases of the heat and wave equations described above are just special cases of this method. There are many more examples of such orthogonal expansions; discussions of them can be found in most texts of mathematical physics.

Finally, we cannot really expect to be able to follow through such a program for every partial differential equation; thus the general theory does not follow such an explicit line of reasoning. In one approach, local solutions are sought through examination of Taylor expansions (everything involved is assumed analytic). This is the Cauchy-Kowalewski theory. A more recent attack has its roots in the above ideas, as well as the Picard theorem. The vector space of differentiable functions is provided with a notion of distance and length which is suited to the given problem, so that one can resolve questions of existence and uniqueness (as in the Picard theorem) and provide usable approximations with estimates derived from the initial data. This study is one of the most active branches of modern mathematics.

• EXERCISES

10. Solve the wave equation

$$\frac{\partial^2 y}{\partial x^2} = \frac{\partial^2 y}{\partial t^2}$$

on the interval $(0, 1)$ with the boundary data $y(0, t) = 0 = y(1, t)$, and the following initial data:

(a) $y(x, 0) = \sin x$, $\quad \dfrac{\partial y}{\partial t}(x, 0) = 0$

(b) $y(x, 0) = \cos^3 \pi x - \cos \pi x$, $\quad \dfrac{\partial y}{\partial t}(x, 0) = \sin \pi x$
492 6 Functions on the Circle {Fourier Analysis) (c) X*,0)=x(x-1) j-{x,Q)=Q ot dy (d) y(x, 0) = cos πχ — (χ, 0) = sin πχ * By Ътт π (e) y(x, 0) = 0 — (χ, 0) = sin — χ + sin - χ 11. Solve the heat equation ди д2и ~dt = 4~dx1 on the interval (0, L) with the boundary data й(0, t) = 0 = u(L, t) and the following initial data (a) u(x, 0) = sin χ πχ (b) u(x, 0) = cos — (c) «O, 0) = x(x - L) πχ 5πΧ (d) u(x, 0) = sin — + 3 sm — 12. (a) Show that the function u(x, t) =ax + b solves the heat equation on the interval (0, L), with boundary data и(0, t) = b, u(L, t) = aL + b (b) Show that if u, υ solve the heat equation with boundary data и(0, t) = t, u(L, t) = ti v(0, 0 = 0 v(L, 0 = 0 then и + ν solves the heat equation with the same boundary data as u. (c) Solve the heat equation bu d2u (t ~ Bx2
6.4 The One-Dimensional Wave and Heat Equations 493 on the interval (0, 1) with boundary data «(0, t) = 1, и(1, t) =e' and initial data u{x, 0) = e'. 13. The initial data given in the problem of heat flow may be the rate of flow of heat energy; or what is the same, the gradient of the temperature. Show that the solution of the heat equation 1 du _ d2u I~2~dt ~~dx~2 on the interval (0, L) with boundary data u(L, 0) = 0 = u(L, t) and initial data (аи/дх)(х, 0) =/(*) is given by ^. πηχ 2, Л„ехр(—CVf)cos —— n = i L where С is a constant, and 2 Ι Γ πηΧ A„ = — /(x)cos —— dx πη J„ L 14. Solve the heat equation given in Exercise 11 with this initial data: (a) 8ul8x{x, 0) = cos ttx/L, (b) Bu/dx(x, 0) = sin πχ/L. 15. Solve the differential equation d2u _ d2u dx2~~~8y2~ + U on the interval (0, π) with boundary data «(0, t) = 0 = и(1, t) and the initial data u(x, 0) =/(*)· 16. Do the same where the differential equation is d2u du _ d2u du ~bx2Tt~' et2Vx PROBLEMS 15 Show that the series defining the function y(x, t) in Proposition 2 converges uniformly and absolutely under the stated conditions. Does this observation suffice to deduce the conclusion of Proposition 2? 16. We may be given, in the heat problem, the gradient of the temperature as boundary data. Show that the general solution of the heat equation with boundary data du 8u , -(0.0-0--СМ)
494 6 Functions on the Circle (Fourier Analysis) can be written as a Fourier cosine series Solve the equation 8u _ 82u ~Ы~~сТ2 on the interval (0, π) with the boundary conditions till till -(o,o=o = -(L,o and the initial conditions (a) u(x, 0) = sin χ 8u (b) — (x,0) = sinx 17 Solve the differential equation tin ti2u tit ~~ tix2 with the boundary data ди/Вхф, t) =0, 8uj8x(L, t) = h and initial conditions u(x, 0) = 0 18 Solve Laplace's equation д2и д2и Ди = —+ —= 0 8x2 tiy2 on the infinite rectangle 0<y<L, 0<x (see Figure 6.4) with the boundary values u(x, 0) = 0 = u(x, L) u(0,y)=f(>) 8u Yxi0,y)=g(y) Show that the assumption that и is bounded implies that the third condition is unnecessary the solution is uniquely determined by its boundary values 19 Find the bounded solution of the differential equation Ди + и = 0 in the infinite rectangle (Figure 6 5) with the boundary conditions u(x, 0) = 0 = u(x, L) "(0, \)=f(y)
[Figure 6.5]

6.5 The Geometry of Fourier Expansions

We now return to the study of functions on the circle, that is, periodic functions of period $2\pi$. We still have not studied the sense in which the Fourier series of a function converges to that function; we have only Corollaries 1 and 3 of Theorem 6.1, which deal with pointwise uniform convergence. Let us consider the real Fourier series of a continuous real-valued function $f$:

$$A_0 + \sum_{n=1}^{\infty}\left[A_n\cos nx + B_n \sin nx\right] \tag{6.47}$$

$$A_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x)\,dx \qquad A_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\cos nx\,dx \qquad B_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(x)\sin nx\,dx$$

Since the Fourier series of a trigonometric function is itself, we find, by applying these definitions to $\cos nx$, $\sin nx$, that

$$\int_{-\pi}^{\pi}\cos nx\,\sin mx\,dx = 0 \qquad \text{all } n, m \tag{6.48}$$

$$\int_{-\pi}^{\pi}\cos nx\,\cos mx\,dx = \begin{cases} 0 & n \neq m \\ \pi & n = m \neq 0 \\ 2\pi & n = m = 0 \end{cases} \tag{6.49}$$

$$\int_{-\pi}^{\pi}\sin nx\,\sin mx\,dx = \begin{cases} 0 & n \neq m \\ \pi & n = m \neq 0 \end{cases} \tag{6.50}$$
There is a geometric way of interpreting these equations which sheds light on the subject. We consider $C(\Gamma)$ as a vector space endowed with the inner product

$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$$

This inner product, of course, defines a notion of distance (recall Section 1.11)

$$\|f - g\|_2 = \left[\int_{-\pi}^{\pi} |f(x) - g(x)|^2\,dx\right]^{1/2} \tag{6.51}$$

which is quite distinct from the uniform, or supremum, distance

$$\|f - g\| = \max\{|f(x) - g(x)| : -\pi \le x \le \pi\}$$

We shall call the distance (6.51) the mean square distance, and we shall speak of convergence in this sense as mean square convergence. More precisely, $f_n \to f$ (mean square) if $\|f_n - f\|_2 \to 0$ as $n \to \infty$.

Now the importance of the equations above is that they imply that the functions $\cos nx$, $\sin nx$ are mutually orthogonal in the vector space $C(\Gamma)$ with this inner product. Thus, we can interpret (6.47) as an orthogonal expansion. Let us make these new definitions:

$$C_0(x) = \frac{1}{(2\pi)^{1/2}} \qquad C_n(x) = \frac{\cos nx}{\sqrt{\pi}} \qquad S_n(x) = \frac{\sin nx}{\sqrt{\pi}}$$

Then the collection $C_n$, $S_n$ is, according to Equations (6.48)-(6.50), an orthonormal set. If $f$ is any function on the circle,

$$A_0 = \frac{1}{(2\pi)^{1/2}}\int_{-\pi}^{\pi} f(x)\,\frac{1}{(2\pi)^{1/2}}\,dx = \frac{\langle f, C_0\rangle}{(2\pi)^{1/2}}$$

$$A_n = \frac{1}{\sqrt{\pi}}\int_{-\pi}^{\pi} f(x)\,\frac{\cos nx}{\sqrt{\pi}}\,dx = \frac{\langle f, C_n\rangle}{\sqrt{\pi}} \qquad B_n = \frac{1}{\sqrt{\pi}}\int_{-\pi}^{\pi} f(x)\,\frac{\sin nx}{\sqrt{\pi}}\,dx = \frac{\langle f, S_n\rangle}{\sqrt{\pi}}$$
so the Fourier expansion (6.47) can be rewritten as

$$\langle f, C_0\rangle C_0 + \sum_{n=1}^{\infty}\left[\langle f, C_n\rangle C_n + \langle f, S_n\rangle S_n\right]$$

and is thus the infinite-dimensional analog of the orthogonal expansion of an element in an inner product space in terms of an orthonormal basis. This interpretation has important consequences for us.

Theorem 6.3. Let $f$ be a continuous function on the unit circle, and let (6.47) be its Fourier series.

(i) Among all trigonometric polynomials of degree at most $N$, the closest to $f$ in the mean square distance is

$$A_0 + \sum_{n=1}^{N}(A_n\cos nx + B_n\sin nx) \tag{6.52}$$

(ii) (Bessel's inequality)

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x)|^2\,dx \ge A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(A_n^2 + B_n^2) \tag{6.53}$$

Proof. In order to verify these facts, we use the basic theorem on orthogonal expansions (Theorem 1.8). The functions $C_0, C_1, \ldots, C_N, S_1, \ldots, S_N$ form an orthonormal basis for the space $S_N$ of trigonometric polynomials of degree at most $N$. The orthogonal projection of $f$ into this space is

$$f_0 = \langle f, C_0\rangle C_0 + \sum_{n=1}^{N}\left[\langle f, C_n\rangle C_n + \langle f, S_n\rangle S_n\right]$$

which is the same as (6.52). Thus, according to Theorem 1.8,

(i) $\|f\|_2^2 = \|f_0\|_2^2 + \|f - f_0\|_2^2$

(ii) for any $w \in S_N$, $\|f - f_0\|_2^2 \le \|f - w\|_2^2$

(ii) directly implies Theorem 6.3(i). According to (i),

$$\|f\|_2^2 \ge \|f_0\|_2^2 = \langle f, C_0\rangle^2 + \sum_{n=1}^{N}\left(\langle f, C_n\rangle^2 + \langle f, S_n\rangle^2\right)$$

that is,

$$\|f\|_2^2 \ge 2\pi A_0^2 + \pi\sum_{n=1}^{N}(A_n^2 + B_n^2)$$

Since this is true for all $N$, we can take the limit on the right as $N \to \infty$, thus obtaining Bessel's inequality.
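Theorem 6.3 can be checked numerically. The following is a minimal sketch, not part of the text: the sample function $f(x) = |x|$, the degree $N = 3$, the midpoint quadrature rule, and all function names are illustrative assumptions. It computes the coefficients of (6.47), the gap in Bessel's inequality (6.53) truncated at $N$, and the mean square distance (6.51) from $f$ to its partial sum and to a perturbed trigonometric polynomial.

```python
import math

def integrate(func, a, b, steps=4000):
    """Composite midpoint rule for the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

def real_fourier_coefficients(f, N):
    """Return (A_0, [A_1..A_N], [B_1..B_N]) as defined after (6.47)."""
    a0 = integrate(f, -math.pi, math.pi) / (2 * math.pi)
    a = [integrate(lambda x, n=n: f(x) * math.cos(n * x), -math.pi, math.pi) / math.pi
         for n in range(1, N + 1)]
    b = [integrate(lambda x, n=n: f(x) * math.sin(n * x), -math.pi, math.pi) / math.pi
         for n in range(1, N + 1)]
    return a0, a, b

def trig_polynomial(a0, a, b):
    """Build the function x -> A_0 + sum_n (A_n cos nx + B_n sin nx)."""
    def p(x):
        return a0 + sum(a[n - 1] * math.cos(n * x) + b[n - 1] * math.sin(n * x)
                        for n in range(1, len(a) + 1))
    return p

def mean_square_distance_sq(f, g):
    """Square of the mean square distance (6.51) between f and g."""
    return integrate(lambda x: (f(x) - g(x)) ** 2, -math.pi, math.pi)

f = abs   # sample function f(x) = |x| on [-pi, pi] (an illustrative choice)
N = 3
a0, a, b = real_fourier_coefficients(f, N)

# Bessel's inequality (6.53), with the sum truncated at N.
lhs = integrate(lambda x: f(x) ** 2, -math.pi, math.pi) / (2 * math.pi)
rhs = a0 ** 2 + 0.5 * sum(an ** 2 + bn ** 2 for an, bn in zip(a, b))
bessel_gap = lhs - rhs

# Theorem 6.3(i): changing a coefficient moves the polynomial away from f.
best = mean_square_distance_sq(f, trig_polynomial(a0, a, b))
perturbed = mean_square_distance_sq(f, trig_polynomial(a0, [a[0] + 0.1] + a[1:], b))

print(bessel_gap, best, perturbed)
```

By identity (i) of the proof, `best` should agree with `2 * math.pi * bessel_gap` up to quadrature error, and `perturbed` should be larger than `best`.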
Now, it is clear that for trigonometric polynomials, Bessel's inequality is actually equality. For if $f$ is such a trigonometric polynomial, there is an $N$ such that $f \in S_N$, so $f = f_0$. Thus, by (i) above, $\|f\|_2 = \|f_0\|_2$, and $\|f_0\|_2^2$ is just $2\pi$ times the right-hand side of Bessel's inequality. Since any function can be uniformly approximated by trigonometric polynomials (although not necessarily by its Fourier series), we should expect Bessel's inequality to be always equality. This is the case.

Corollary. (Parseval's Equality) If $f$ is a continuous function on the unit circle and has the Fourier series (6.47), then

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x)|^2\,dx = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(A_n^2 + B_n^2)$$

Proof. We continue the notation of Theorem 6.3. Let $\varepsilon > 0$ be given. By Corollary 4 of Theorem 6.1, there is a trigonometric polynomial $w$ such that $\|w - f\| < \varepsilon$. Then

$$\|w - f\|_2^2 = \int_{-\pi}^{\pi}|w - f|^2\,dx \le \|w - f\|^2\int_{-\pi}^{\pi}dx < 2\pi\varepsilon^2$$

Now, since $w$ is a trigonometric polynomial, there is an $N$ such that $w \in S_N$. Let $f_0$ be the projection of $f$ into $S_N$. Then by (i) and (ii),

$$\|f\|_2^2 = \|f_0\|_2^2 + \|f - f_0\|_2^2 \le \|f_0\|_2^2 + \|f - w\|_2^2 < \|f_0\|_2^2 + 2\pi\varepsilon^2$$

This becomes, as in the above argument,

$$\int_{-\pi}^{\pi}|f(x)|^2\,dx < 2\pi A_0^2 + \pi\sum_{n=1}^{N}(A_n^2 + B_n^2) + 2\pi\varepsilon^2$$

Since the sum to infinity only increases the right-hand side,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x)|^2\,dx < A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(A_n^2 + B_n^2) + \varepsilon^2$$

Now, since $\varepsilon$ was arbitrary we may let it tend to zero. The resulting inequality, together with Bessel's inequality, gives Parseval's equality.

Finally, we note that Parseval's equality can be expressed in terms of the
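expansion into a series of complex exponentials: $\sum \hat f(n)e^{in\theta}$. Since

$$\hat f(0) = A_0 \qquad \hat f(n) = \tfrac{1}{2}(A_n - iB_n) \qquad \hat f(-n) = \tfrac{1}{2}(A_n + iB_n)$$

we have

$$A_0^2 = |\hat f(0)|^2 \qquad A_n^2 + B_n^2 = 2\left(|\hat f(n)|^2 + |\hat f(-n)|^2\right)$$

so that

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(\theta)|^2\,d\theta = \sum_{n=-\infty}^{\infty}|\hat f(n)|^2$$

Examples

16. Since $\cos^2\theta = \tfrac14 e^{-2i\theta} + \tfrac12 + \tfrac14 e^{2i\theta}$,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}\cos^4\theta\,d\theta = \frac{1}{16} + \frac{1}{4} + \frac{1}{16} = \frac{3}{8}$$

17. $\displaystyle \pi^2 - \theta^2 = \frac{2\pi^2}{3} + 2\sum_{n\neq 0}(-1)^{n+1}\frac{e^{in\theta}}{n^2}$, so by Parseval's equality

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}(\pi^2 - \theta^2)^2\,d\theta = \frac{1}{2\pi}\cdot\frac{16\pi^5}{15} = \frac{4\pi^4}{9} + 4\sum_{n\neq 0}\frac{1}{n^4}$$

We conclude that

$$\sum_{n=1}^{\infty}\frac{1}{n^4} = \frac{\pi^4}{90}$$

The partial sum to degree 3 of the Fourier series of $\pi^2 - \theta^2$ is

$$F_3(\theta) = \frac{2\pi^2}{3} + 4\cos\theta - \cos 2\theta + \frac{4}{9}\cos 3\theta$$

The square of the mean square distance between $\pi^2 - \theta^2$ and this sum, normalized by $1/2\pi$ as in Parseval's equality, is

$$\sum_{|n| > 3}|\hat f(n)|^2 = 8\sum_{n=4}^{\infty}\frac{1}{n^4} = 8\left[\frac{\pi^4}{90} - 1 - \frac{1}{16} - \frac{1}{81}\right] \approx 0.0598$$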
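Examples 16 and 17 lend themselves to a quick numerical check. The sketch below is an illustration only: the function names, the midpoint quadrature, and the truncation points are assumptions made here. It compares the two sides of Parseval's equality for $\cos^2\theta$ and compares a partial sum of $\sum 1/n^4$ with $\pi^4/90$.

```python
import cmath
import math

def fourier_coefficient(f, n, steps=4096):
    """Approximate (1/2pi) * integral of f(theta) e^{-in theta} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        theta = -math.pi + (k + 0.5) * h
        total += f(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def mean_square(f, steps=4096):
    """Approximate (1/2pi) * integral of |f(theta)|^2 over [-pi, pi]."""
    h = 2 * math.pi / steps
    return sum(abs(f(-math.pi + (k + 0.5) * h)) ** 2 for k in range(steps)) * h / (2 * math.pi)

def cos_squared(theta):
    return math.cos(theta) ** 2

# Example 16: only the coefficients with n = -2, 0, 2 are nonzero.
parseval_sum = sum(abs(fourier_coefficient(cos_squared, n)) ** 2 for n in range(-4, 5))
mean_cos4 = mean_square(cos_squared)

# Example 17: a partial sum of sum 1/n^4, to compare with pi^4 / 90.
zeta4_partial = sum(1.0 / n ** 4 for n in range(1, 2001))

print(parseval_sum, mean_cos4)
print(zeta4_partial, math.pi ** 4 / 90)
```

Both numbers on the first printed line should be close to 3/8, and the two numbers on the second line should agree to many digits.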
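[Figure 6.6]

In Figure 6.6 the graphs of $\pi^2 - \theta^2$ and $F_3(\theta)$ are drawn.

18. $|\theta|$ has the Fourier expansion

$$\frac{\pi}{2} - \frac{2}{\pi}\sum_{n=-\infty}^{\infty}\frac{e^{i(2n+1)\theta}}{(2n+1)^2}$$

From Parseval's equality, we find

$$\frac{\pi^2}{3} = \frac{\pi^2}{4} + \frac{4}{\pi^2}\sum_{n=-\infty}^{\infty}\frac{1}{(2n+1)^4} \qquad\text{or}\qquad \sum_{n=0}^{\infty}\frac{1}{(2n+1)^4} = \frac{\pi^4}{96}$$

The third partial sum of the Fourier series of $|\theta|$ is

$$\frac{\pi}{2} - \frac{4}{\pi}\left(\cos\theta + \frac{\cos 3\theta}{9}\right)$$

The square of the mean square distance between $|\theta|$ and this trigonometric polynomial, normalized by $1/2\pi$ as in Parseval's equality, is

$$\frac{8}{\pi^2}\sum_{n=2}^{\infty}\frac{1}{(2n+1)^4} = \frac{8}{\pi^2}\left[\frac{\pi^4}{96} - 1 - \frac{1}{81}\right] \approx 0.0019$$

(see Figure 6.7).

[Figure 6.7]

Mean square approximation is interesting from the physical point of view. Consider the solution of the wave equation (suitably normalized)

$$u(x,t) = \sum_{n=1}^{\infty}(A_n\cos nt + B_n\sin nt)\sin nx \tag{6.54}$$

The (kinetic) energy of the wave at time $t$ is proportional to

$$\int_0^{\pi}\left(\frac{\partial u}{\partial t}\right)^2 dx$$

Now, by Parseval's equality, that can easily be computed in terms of the Fourier sine coefficients of $\partial u/\partial t$:

$$\frac{\partial u}{\partial t} = \sum_{n=1}^{\infty} n(B_n\cos nt - A_n\sin nt)\sin nx$$

$$\text{const}\cdot\int_0^{\pi}\left(\frac{\partial u}{\partial t}\right)^2 dx = \sum_{n=1}^{\infty} n^2(B_n\cos nt - A_n\sin nt)^2$$

(Because of our normalizations, the constant is not relevant; it might as well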
be 1.) Now each term on the right-hand side is at most $n^2(A_n^2 + B_n^2)$ (see Problem 20), so the kinetic energy of the wave never exceeds

$$\sum_{n=1}^{\infty} n^2(A_n^2 + B_n^2)$$

Now, according to our geometric considerations above, the $N$th partial sum of (6.54) provides the best approximation to the solution wave in the sense of energy. Furthermore, the difference in energy levels between the solution wave and this approximation is readily computable; it is

$$\sum_{n=N+1}^{\infty} n^2(A_n^2 + B_n^2)$$

Since energy is the important concept in the study of waves, this mean square approximation is well suited for this study.

• EXERCISES

17. Compute these integrals by Fourier methods:

(a) $\displaystyle\int_{-\pi}^{\pi}\cos^8 3\theta\,d\theta$

(b) $\displaystyle\int_{-\pi}^{\pi}\sin^2\mu\theta\,d\theta$, $\mu$ not an integer

(c)-(e) [only two of these three integrals are legible in the source]

$$\int_{-\pi}^{\pi}\frac{1-r^2}{1+r^2-2r\cos(\theta-\varphi)}\,d\varphi \qquad \int_{-\pi}^{\pi}\frac{(1-r^2)\cos\varphi}{1+r^2-2r\cos(\theta-\varphi)}\,d\varphi$$

(f) $\displaystyle\int_{-\pi}^{\pi}\theta^4\,d\theta$

18. Approximate the given function by a trigonometric polynomial to within $10^{-3}$ in mean:

(a) $|\theta|\theta$

(b) a series in $\cos n\theta$ [coefficients illegible in the source]

(c) $\sin^3\theta\cos\theta$

(d) an exponential [exponent illegible in the source]
• PROBLEMS

20. Show that the maximum of $(B\cos nt - A\sin nt)^2$ is $B^2 + A^2$.

21. Let $\{f_n\}$ be a sequence of continuous functions on the circle. Show that if $f_n \to f$ uniformly, then $f_n \to f$ in mean. Show by example that the converse statement is false.

22. Prove: if $f$, $g$ are integrable real-valued functions on the circle,

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)g(\theta)\,d\theta = \sum_{n=-\infty}^{\infty}\hat f(n)\,\overline{\hat g(n)}$$

6.6 Differential Equations on the Circle

We now turn to a slightly different problem involving ordinary differential equations. We propose to find all periodic solutions of a linear constant coefficient equation. The particular theory which results is not in itself of vital importance, but it is worthwhile to study because of the symmetry of the results and because it presents the simplest example of the general theory of differential operators on compact manifolds. As we have already seen, it is valuable in the theory of ordinary differential equations to allow complex-valued functions. We return then to our original form of the Fourier expansion of a function $f$: $\sum \hat f(n)e^{in\theta}$. Our first result concerns the computation of the Fourier coefficients of the derivative of a function.

Proposition 5. Let $f$ be a continuously differentiable function on the circle. The Fourier series of $f'$ is obtainable by term by term differentiation. That is,

$$\widehat{f'}(n) = in\hat f(n) \tag{6.55}$$

Proof. The proof is by integration by parts:

$$\widehat{f'}(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f'(\theta)e^{-in\theta}\,d\theta = \frac{1}{2\pi}f(\theta)e^{-in\theta}\Big|_{-\pi}^{\pi} + \frac{in}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{-in\theta}\,d\theta = in\hat f(n)$$

since the boundary term vanishes by periodicity.

Thus, if the differentiable function $f$ has the Fourier series $\sum A_n e^{in\theta}$, then the Fourier series of $f'$ is $\sum inA_n e^{in\theta}$. It follows from the fundamental theorem of calculus that we can also integrate a Fourier series term by term, so long as it has no constant term: if $f$ has the Fourier series $\sum A_n e^{in\theta}$ with $A_0 = 0$, then
$\int_0^\theta f$ has, up to an additive constant, the Fourier series $\sum_{n \neq 0} (in)^{-1}A_n e^{in\theta}$. A useful consequence of Proposition 5, in conjunction with Bessel's inequality, is that a continuously differentiable function is the sum of its Fourier series.

Proposition 6. If $f$ is a continuously differentiable function, then

$$f(\theta) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$

holds for all $\theta$.

Proof. By Bessel's inequality

$$\sum |\widehat{f'}(n)|^2 < \infty$$

Using the above proposition we then obtain by Schwarz's inequality

$$\sum |\hat f(n)| = |\hat f(0)| + \sum_{n \neq 0}\frac{|\widehat{f'}(n)|}{|n|} \le |\hat f(0)| + \left(\sum_{n\neq 0}\frac{1}{n^2}\right)^{1/2}\left(\sum |\widehat{f'}(n)|^2\right)^{1/2} < \infty$$

Thus $\sum |\hat f(n)| < \infty$, so Corollary 1 of Theorem 6.1 applies.

Now, suppose $g$ is a continuous function on the circle. Given a polynomial $F(X) = X^k + \sum_{i=0}^{k-1} a_i X^i$, we want to find a periodic function $f$ such that

$$f^{(k)} + \sum_{i=0}^{k-1} a_i f^{(i)} = g \tag{6.56}$$

The fact that we are interested in periodic functions is a new twist, and the local results, such as Picard's theorem, are hardly applicable. For example, consider the simplest differential equation:

$$f' = g \tag{6.57}$$

By local considerations we know that $f$ must be

$$f(\theta) = \int_{-\pi}^{\theta} g(\varphi)\,d\varphi + c$$

However, $f$ will be a periodic function only if $f(\pi) = f(-\pi) = c$; for this we must have $\int_{-\pi}^{\pi} g(\varphi)\,d\varphi = 0$. Thus (6.57) has a solution if and only if $\hat g(0) = 0$. We have already recognized this condition in the above discussion of
integration of Fourier series. For by (6.55), if $f' = g$ we must have $in\hat f(n) = \hat g(n)$ for all $n$. This necessary condition shows up again by taking $n = 0$: we must have $\hat g(0) = 0$.

Now we return to the general case (6.56). If we look at the Fourier series of both sides this becomes $F(in)\hat f(n) = \hat g(n)$. Thus we must have $\hat g(n) = 0$ whenever $F(in) = 0$. Otherwise, the equation does not have a solution. On the other hand, if this condition is satisfied, then the equation is easily solved since we must have $\hat f(n) = F(in)^{-1}\hat g(n)$. The solution is the function whose Fourier series is

$$\sum_{n=-\infty}^{\infty}\frac{\hat g(n)}{F(in)}\,e^{in\theta} \tag{6.58}$$

Theorem 6.4. Let $F(X) = \sum_{i=0}^{k} c_i X^i$, and let $n_1, \ldots, n_\sigma$ be the integer roots of $F(in) = 0$. Let $L_F$ be the differential operator

$$L_F(f) = \sum_{i=0}^{k} c_i f^{(i)}$$

(i) The space of periodic solutions of $L_F(f) = 0$ is spanned by $\exp(in_1\theta), \ldots, \exp(in_\sigma\theta)$.

(ii) Given any periodic function $g$, the equation $L_F f = g$ has a solution if and only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$. The solution is uniquely determined by specifying the Fourier coefficients $\hat f(n_i)$, $1 \le i \le \sigma$.

Proof. The Fourier coefficients of $L_F(f)$ are $\{F(in)\hat f(n)\}$. Now if $L_F(f) = 0$, we must have $F(in)\hat f(n) = 0$ for all $n$, so $\hat f(n) = 0$ necessarily except when $F(in) = 0$. Since $n_1, \ldots, n_\sigma$ are the roots of this equation, (i) is proven.

If $g$ is a periodic function and $L_F(f) = g$, we must have $F(in)\hat f(n) = \hat g(n)$. Thus $\hat g(n_i) = 0$, $1 \le i \le \sigma$, is a necessary condition for this equation. Suppose now that this condition is satisfied. Then if $f$ is a solution we must have

$$\hat f(n) = \frac{\hat g(n)}{F(in)} \qquad n \neq n_1, \ldots, n_\sigma \tag{6.59}$$

and the $\hat f(n_i)$, $1 \le i \le \sigma$, can be freely chosen. Upon specification of these coefficients the Fourier series of $f$ is uniquely determined. The only question is this: are the numbers (6.59) the Fourier coefficients of a function? The answer is yes when $F$ is of degree at least one. For then $|F(in)| \ge C|n|$ for some constant $C$ and all sufficiently large $n$ (Problem 23), and thus

$$\sum\left|\frac{\hat g(n)}{F(in)}\right| \le \frac{1}{C}\sum\frac{|\hat g(n)|}{|n|} \le \frac{1}{C}\left(\sum\frac{1}{n^2}\right)^{1/2}\left(\sum|\hat g(n)|^2\right)^{1/2} < \infty \tag{6.60}$$
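for the tail end of the series, and thus the sum of the whole series is finite. Hence the Fourier series

$$f(\theta) = \sum_{\substack{n=-\infty\\ n \neq n_1, \ldots, n_\sigma}}^{\infty}\frac{\hat g(n)}{F(in)}\,e^{in\theta} \tag{6.61}$$

converges uniformly to a continuous function. The theorem is thus proven.

We can get a much better looking form for the solution if the degree of $F$ is large enough (at least second degree). For then

$$P(\theta) = \sum_{\substack{n=-\infty\\ n \neq n_1, \ldots, n_\sigma}}^{\infty}\frac{e^{in\theta}}{F(in)} \tag{6.62}$$

defines a continuous function (Problem 24) and the solution (6.61) is given by

$$f(\theta) = \sum_{n=-\infty}^{\infty}\frac{e^{in\theta}}{F(in)}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} g(\varphi)e^{-in\varphi}\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\varphi)\sum_{n=-\infty}^{\infty}\frac{e^{in(\theta-\varphi)}}{F(in)}\,d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\varphi)P(\theta-\varphi)\,d\varphi$$

using (6.62) (all sums omit $n = n_1, \ldots, n_\sigma$). We can now write the conclusions of Theorem 6.4 explicitly in terms of an integral formula.

Theorem 6.5. Let $F(X) = \sum_{i=0}^{k} c_i X^i$ $(k \ge 2)$, and let $n_1, \ldots, n_\sigma$ be the integer solutions of $F(in) = 0$. Let $L_F$ be the differential operator defined by the polynomial $F$. Let

$$P(\theta) = \sum_{\substack{n=-\infty\\ n \neq n_1, \ldots, n_\sigma}}^{\infty}\frac{e^{in\theta}}{F(in)}$$

Then the equation $L_F(f) = g$ has a solution if and only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$. All solutions are of the form

$$f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\varphi)P(\theta - \varphi)\,d\varphi + \sum_{j=1}^{\sigma} c_j\exp(in_j\theta) \tag{6.63}$$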
Thus a constant coefficient differential operator on the circle has an inverse of the form (6.63) (defined on its range), called an integral operator.

Examples

19. Find a periodic solution of $f'' - f = \cos 2\theta$. Now $g(\theta) = \cos 2\theta = \tfrac12(e^{i2\theta} + e^{-i2\theta})$. The characteristic polynomial is $F(X) = X^2 - 1$, and $F(in) = -n^2 - 1$ has no roots. Thus there exists a unique solution and it is given by (6.58):

$$f(\theta) = \sum\frac{\hat g(n)}{F(in)}\,e^{in\theta} = \frac12\left(\frac{e^{i2\theta}}{-5} + \frac{e^{-i2\theta}}{-5}\right) = -\frac15\cos 2\theta$$

20. Solve: $f'' - 3f' + 2f = \pi^2 - \theta^2$. The characteristic polynomial is $F(X) = X^2 - 3X + 2 = (X-1)(X-2)$, so that again $F(in) = 0$ has no integral roots. Since $\pi^2 - \theta^2$ has (by Example 4) the Fourier series

$$\frac{2\pi^2}{3} + 2\sum_{n\neq 0}(-1)^{n+1}\frac{e^{in\theta}}{n^2}$$

the solution is (by (6.58))

$$f(\theta) = \frac{\pi^2}{3} + 2\sum_{n \neq 0}(-1)^{n+1}\frac{e^{in\theta}}{n^2(in-1)(in-2)}$$

21. $f^{(k)} = g$. This has a solution if $\hat g(0) = 0$. In this case the solution is given by

$$f(\theta) = c + \sum_{n\neq 0}\hat g(n)\frac{e^{in\theta}}{(in)^k}$$

22. Find all solutions of $f'' + 4f = 0$. Here, the roots of $X^2 + 4 = 0$ are $\pm 2i$; therefore, all solutions are periodic of period $2\pi$: $e^{2i\theta}$, $e^{-2i\theta}$ span the space of solutions. Notice, however, that $f'' + 5f = 0$ has no nonzero solutions which are periodic of period $2\pi$.
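The recipe behind Examples 19-22 (divide $\hat g(n)$ by $F(in)$, after checking that $\hat g$ vanishes at the integer roots of $F(in) = 0$) can be sketched in a few lines. This is an illustration under stated assumptions rather than the text's own method of computation: the function names, the truncation `max_freq`, the tolerance `tol`, and the midpoint quadrature are all choices made here, $g$ and the coefficients of $F$ are assumed real, and the free coefficients $\hat f(n_i)$ are simply set to zero.

```python
import cmath
import math

def fourier_coefficient(g, n, steps=2048):
    """Approximate (1/2pi) * integral of g(theta) e^{-in theta} over [-pi, pi]."""
    h = 2 * math.pi / steps
    total = 0j
    for k in range(steps):
        theta = -math.pi + (k + 0.5) * h
        total += g(theta) * cmath.exp(-1j * n * theta)
    return total * h / (2 * math.pi)

def poly_eval(coeffs, x):
    """Horner evaluation of F(x); coeffs[i] is the coefficient of x**i."""
    result = 0
    for c in reversed(coeffs):
        result = result * x + c
    return result

def solve_on_circle(coeffs, g, max_freq=20, tol=1e-9):
    """Return a periodic f with L_F f = g, using coefficients (6.59) for |n| <= max_freq.

    Raises ValueError when g_hat(n) != 0 at an integer root n of F(in) = 0,
    which is the case excluded by Theorem 6.4(ii).
    """
    f_hat = {}
    for n in range(-max_freq, max_freq + 1):
        g_n = fourier_coefficient(g, n)
        F_in = poly_eval(coeffs, 1j * n)
        if abs(F_in) < tol:
            if abs(g_n) > tol:
                raise ValueError("no periodic solution: g_hat(%d) != 0" % n)
            continue  # free coefficient f_hat(n); chosen to be 0 here
        f_hat[n] = g_n / F_in

    def f(theta):
        # Real part only: g and the coefficients of F are assumed real.
        return sum(c * cmath.exp(1j * n * theta) for n, c in f_hat.items()).real
    return f

# Example 19: f'' - f = cos 2θ, with F(X) = X^2 - 1.
f19 = solve_on_circle([-1, 0, 1], lambda t: math.cos(2 * t))
print(f19(0.3), -math.cos(0.6) / 5)

# f'' + 4f = cos 2θ has no periodic solution, since F(±2i) = 0 but g_hat(±2) = 1/2.
try:
    solve_on_circle([4, 0, 1], lambda t: math.cos(2 * t))
    unsolvable_detected = False
except ValueError:
    unsolvable_detected = True
print(unsolvable_detected)
```

For Example 19 the two printed values should agree, since the exact solution is $-\tfrac15\cos 2\theta$.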
508 6 Functions on the Circle {Fourier Analysis) EXERCISES 19 Find all periodic solutions of these differential equations: (a) /+2//+15j>=0 (b) /5) - /*> + 10/3) - 10/' + 9/-9y = 0 (c) y*» + 2/+l=0 20. Find periodic solutions of these differential equations: (a) ym + 2y" + у = sin 50 + cos 50 (b) y" + 6/ + 9.v = π' - 02 (c) /5) + j' = exp(cos 0) PROBLEMS 23. Suppose Fis a polynomial of degree at least k. Show that there is a C> 0, and an integer N such that \F{in)\ > С |n|" for n>N. 24. Show that if F is a polynomial of degree at least 2, then л not a root v / is a continuous function on the circle. 25 If /, д are two continuous functions on the circle, define / * g, the convolution of/and g, by 1 n (/* вШ = ^\ ί(φ)9(θ~φ)άφ (a) Show thatf*g=g*f. (b) Show that the Fourier coefficients off*g are f(n)g(n). (c) Show that the differential equation LF(/) = #, where Lf is the constant coefficient operator associated to the polynomial F is solved by f=g * F, where Fhas the Fourier coefficients F(w)-1. 26. Let Л(г, 0=| P{r,r)dr (a) Show that lim Pi(r, t) is 0 if f < 0, and 1 if t > 0. r-*l (b) Show that for any С' function g on the circle 3(0) = lim f дЦШг, θ~φ)άφ 00 (Hint lim Pi(r, 0) "has the Fourier series" 2 e'"ejin.)
6.7 Taylor Series and Fourier Series

If we now take the attitude that the unit circle is the boundary of the unit disk, we discover connections between Fourier series and Taylor series which are of enormous significance in complex function theory. These connections cannot be fully exploited until we learn the fact (in the next chapter) that complex differentiable functions can be expanded in a power series. In this section we shall explore the relationship between the Fourier and Taylor expansions of such a function defined on the unit disk, assuming its Taylor series converges on the disk.

Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ on the unit disk. In polar coordinates this becomes

$$f(re^{i\theta}) = \sum_{n=0}^{\infty} a_n r^n e^{in\theta} \tag{6.64}$$

which is, for each $r$, a Fourier series. Using the Fourier theoretic material at hand we get a most remarkable collection of integral formulas for functions which are sums of convergent power series.

Theorem 6.6. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be a convergent power series in $\{|z| \le 1\}$. Then we have these equations.

(i) For each $r$, $0 < r \le 1$,

$$\frac{1}{2\pi r^n}\int_{-\pi}^{\pi} f(re^{i\theta})e^{-in\theta}\,d\theta = a_n = \frac{1}{n!}f^{(n)}(0) \qquad n \ge 0$$

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{in\theta}\,d\theta = 0 \qquad n > 0$$

(ii) $\displaystyle f(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{1-r^2}{1+r^2-2r\cos(\theta-\varphi)}\,d\varphi$

(iii) $\displaystyle f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{e^{i\varphi}-z}\,d\varphi$

for $z = re^{i\theta}$, $r < 1$.

Proof. By Equation (6.64), for fixed $r$, $a_n r^n$ is the $n$th Fourier coefficient of $f(re^{i\theta})$ for $n \ge 0$, and for negative $n$ the Fourier coefficient vanishes. This is just
what part (i) says explicitly. Equation (6.64) also says that $f(re^{i\theta})$ is the Poisson integral of $f(e^{i\theta})$, and thus we obtain (ii). Then (iii) follows from resumming the series, using the fact that the negative coefficients vanish. Explicitly, we have

$$f(z) = \sum_{n=0}^{\infty} a_n z^n = \sum_{n=0}^{\infty}\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})e^{-in\varphi}\,d\varphi\right]z^n = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\sum_{n=0}^{\infty}\left(ze^{-i\varphi}\right)^n d\varphi = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{e^{i\varphi}-z}\,d\varphi$$

the last change being accomplished by summing the geometric series.

There are several more or less immediate conclusions one can draw from the above theorem. First of all, the sum of a convergent power series on the unit disk is completely determined by its boundary values, by Equation (iii), known as Cauchy's formula. This of course follows from the maximum principle verified in the last chapter for analytic functions. The Cauchy formula itself implies the maximum principle (see Problem 27). A more important implication is that the sum of a convergent power series in the disk is analytic; that is, it can be expanded in a power series centered at any point.

Corollary. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ in the disk $\{|z| \le 1\}$. Then for any $z_0$, $|z_0| < 1$, $f$ can be expanded in a power series centered at $z_0$.

Proof. By Cauchy's formula

$$f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{e^{i\varphi}-z}\,d\varphi$$

Now

$$\frac{1}{e^{i\varphi}-z} = \frac{1}{e^{i\varphi}-z_0-(z-z_0)} = \frac{1}{e^{i\varphi}-z_0}\cdot\left(1-\frac{z-z_0}{e^{i\varphi}-z_0}\right)^{-1}$$

In the disk $\{z \in C : |z - z_0| < 1 - |z_0|\}$, the last factor is the sum of a geometric series:

$$\left(1-\frac{z-z_0}{e^{i\varphi}-z_0}\right)^{-1} = \sum_{n=0}^{\infty}\left(\frac{z-z_0}{e^{i\varphi}-z_0}\right)^n$$
which we may substitute in the integral. We obtain

$$f(z) = \sum_{n=0}^{\infty}\left[\frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{(e^{i\varphi}-z_0)^{n+1}}\,d\varphi\right](z-z_0)^n$$

the series being convergent for all $z$ such that $|z - z_0| < 1 - |z_0|$. As a consequence we have still more integral formulas:

$$\frac{f^{(n)}(z_0)}{n!} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{(e^{i\varphi}-z_0)^{n+1}}\,d\varphi$$

for any $z_0$, $|z_0| < 1$. We shall see in the next chapter that these integral formulas can be explained in yet another way (basically the fundamental theorem of calculus) and are just very special cases of general formulas.

We conclude now with an approximation theorem which should be contrasted with the Weierstrass approximation theorem (Problem 7) for a real variable.

Theorem 6.7. Let $f$ be a continuous function on the circle. $f$ is approximable by polynomials in $z$ if and only if

$$\frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{in\theta}\,d\theta = 0 \qquad n > 0 \tag{6.65}$$

Proof. If $f$ is approximable by polynomials, there is a sequence $\{f_k\}$ of polynomials such that $f_k \to f$ uniformly. Since (6.65) is readily verified for any polynomial, it thus holds also for $f$, by continuity of the integral.

Conversely, if (6.65) is verified, then the Poisson transform of $f$ is the sum of a convergent power series:

$$Pf(z) = \sum_{n=0}^{\infty} a_n z^n \qquad |z| < 1 \tag{6.66}$$

Since $Pf(re^{i\theta}) \to f(\theta)$ uniformly in $\theta$ as $r \to 1$, given $\varepsilon > 0$ there is an $r < 1$ such that $|Pf(re^{i\theta}) - f(\theta)| < \varepsilon$ for all $\theta$. Now on the circle $|z| = r$, the series (6.66) converges uniformly, so there is an $N$ such that

$$\left|Pf(z) - \sum_{n=0}^{N} a_n z^n\right| < \varepsilon \qquad |z| = r$$
Then, on the unit circle $z = e^{i\theta}$,

$$\left|f(z) - \sum_{n=0}^{N} a_n r^n z^n\right| \le \left|f(e^{i\theta}) - Pf(re^{i\theta})\right| + \left|Pf(re^{i\theta}) - \sum_{n=0}^{N} a_n (re^{i\theta})^n\right| < 2\varepsilon$$

independently of $\theta$.

• EXERCISES

21. Integrate:

(a) $\displaystyle\int_{-\pi}^{\pi}\frac{e^{3i\theta} + 4e^{2i\theta} + e^{i\theta}}{2e^{i\theta} - 1}\,d\theta$

(b) $\displaystyle\int_{-\pi}^{\pi}\left(e^{ik\theta} + 1/4\right)^n d\theta$, $\quad k$, $n$ positive integers

• PROBLEMS

27. Deduce the maximum principle for convergent power series on the disk from Cauchy's formula.

28. Using the results of Problem 11, verify that these assertions for a function defined on the disk are equivalent:

(a) $f(z) = \sum_{n=0}^{\infty} a_n z^n$

(b) $f$ is complex differentiable.

(c) $f$ is uniformly approximable by polynomials.

(d) $f$ is harmonic and $\hat f(-n) = 0$ for $n > 0$.

6.8 Summary

The function $f(t) = \exp(2\pi i t/L)$ wraps the real line around the circle so that every interval of length $L$ covers the circle once. The collection of periodic functions of period $L$ may be viewed as the collection of functions on the circle.

If $f$ is a piecewise continuous function on the circle, its $n$th Fourier coefficient is
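$$\hat f(n) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\theta)e^{-in\theta}\,d\theta$$

The Fourier series of $f$ is the series

$$\sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$

The Poisson transform of $f$ is the function on the unit disk given by

$$Pf(r,\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)\,\frac{1-r^2}{1+r^2-2r\cos(\theta-\varphi)}\,d\varphi = \sum_{n=-\infty}^{\infty}\hat f(n)r^{|n|}e^{in\theta}$$

Theorem. If $f$ is a continuous function on the circle and $g$ is the function on the disk defined by

$$g(r,\theta) = Pf(r,\theta) \quad r < 1 \qquad g(1,\theta) = f(\theta)$$

then $g$ is continuous on the disk and harmonic (satisfies Laplace's equation) inside:

$$\frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} = 0 \qquad \text{for } r < 1$$

Theorem. If $f$ is continuously differentiable on the circle, then it is the sum of its Fourier series:

$$f(\theta) = \sum_{n=-\infty}^{\infty}\hat f(n)e^{in\theta}$$

Suppose $u$ is harmonic in a closed and bounded domain $D$ in the plane. Then

(i) if $\Delta(a, R) \subset D$,

$$u(a) = \frac{1}{2\pi}\int_{-\pi}^{\pi} u(a + Re^{i\theta})\,d\theta \qquad \text{(mean value property)}$$

(ii) if $u \le M$ on $\partial D$, then $u \le M$ inside $D$ (maximum principle).

A function harmonic on a closed and bounded domain is uniquely determined by its boundary values.

If the real-valued function $f$ has the Fourier series $\sum C_n e^{in\theta}$, we can rewrite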
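it as

$$A_0 + \sum_{n=1}^{\infty} A_n\cos n\theta + B_n\sin n\theta$$

where

$$A_0 = C_0 = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(\varphi)\,d\varphi$$

$$A_n = 2\,\mathrm{Re}\,C_n = C_n + C_{-n} = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\varphi)\cos n\varphi\,d\varphi$$

$$B_n = -2\,\mathrm{Im}\,C_n = i(C_n - C_{-n}) = \frac{1}{\pi}\int_{-\pi}^{\pi} f(\varphi)\sin n\varphi\,d\varphi$$

If $f$ is a $C^1$ periodic function of period $L$, it can be expanded in a Fourier cosine series

$$f(x) = A_0 + \sum_{n=1}^{\infty} A_n\cos\frac{\pi n x}{L} \qquad A_0 = \frac{1}{L}\int_0^L f(x)\,dx \qquad A_n = \frac{2}{L}\int_0^L f(x)\cos\frac{\pi n x}{L}\,dx$$

or a Fourier sine series

$$f(x) = \sum_{n=1}^{\infty} B_n\sin\frac{\pi n x}{L} \qquad B_n = \frac{2}{L}\int_0^L f(x)\sin\frac{\pi n x}{L}\,dx$$

THE WAVE EQUATION. Given the $C^2$ periodic functions $f$, $g$ of period $L$, the equation

$$\frac{\partial^2 y}{\partial x^2} = c^2\,\frac{\partial^2 y}{\partial t^2}$$

with the boundary data

$$y(0,t) = 0 \qquad y(L,t) = 0$$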
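and the initial data

$$y(x,0) = f(x) \qquad \frac{\partial y}{\partial t}(x,0) = g(x)$$

has a solution. The solution is given by

$$y(x,t) = \sum_{n=1}^{\infty}\left[A_n\cos\frac{\pi n t}{Lc} + B_n\sin\frac{\pi n t}{Lc}\right]\sin\frac{\pi n x}{L}$$

where

$$A_n = \frac{2}{L}\int_0^L f(x)\sin\left(\frac{\pi n x}{L}\right)dx \qquad B_n = \frac{2c}{\pi n}\int_0^L g(x)\sin\left(\frac{\pi n x}{L}\right)dx$$

THE HEAT EQUATION. Given the $C^2$ periodic function $f$ of period $L$, the equation

$$\frac{1}{k^2}\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2}$$

with the boundary data

$$u(0,t) = 0 = u(L,t)$$

and the initial condition

$$u(x,0) = f(x)$$

has a solution given by

$$u(x,t) = \sum_{n=1}^{\infty} A_n\exp(-C^2n^2t)\sin\frac{\pi n x}{L}$$

where $C = \pi k/L$ and

$$A_n = \frac{2}{L}\int_0^L f(x)\sin\frac{\pi n x}{L}\,dx$$

Consider $C(\Gamma)$ as endowed with the inner product

$$\langle f, g\rangle = \int_{-\pi}^{\pi} f(x)g(x)\,dx$$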
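Let

$$C_0(x) = \frac{1}{(2\pi)^{1/2}} \qquad C_n(x) = \frac{\cos nx}{\sqrt{\pi}} \qquad S_n(x) = \frac{\sin nx}{\sqrt{\pi}}$$

$\{C_n, S_n\}$ is an orthonormal set. The Fourier series of a function $f$ can be rewritten as

$$\sum_{n=0}^{\infty}\langle f, C_n\rangle C_n + \sum_{n=1}^{\infty}\langle f, S_n\rangle S_n$$

PARSEVAL'S EQUALITY.

$$\frac{1}{2\pi}\int_{-\pi}^{\pi}|f(x)|^2\,dx = A_0^2 + \frac{1}{2}\sum_{n=1}^{\infty}(A_n^2 + B_n^2)$$

DIFFERENTIAL EQUATIONS ON THE CIRCLE. Let $F$ be a polynomial and let $n_1, \ldots, n_\sigma$ be the integer solutions of $F(in) = 0$. Let $L_F$ be the differential operator defined by the polynomial $F$. Let

$$P(\theta) = \sum_{\substack{n=-\infty\\ n \neq n_1, \ldots, n_\sigma}}^{\infty}\frac{e^{in\theta}}{F(in)}$$

Then the equation $L_F(f) = g$ has a solution if and only if $\hat g(n_i) = 0$, $1 \le i \le \sigma$. All solutions are of the form

$$f(\theta) = \frac{1}{2\pi}\int_{-\pi}^{\pi} g(\varphi)P(\theta-\varphi)\,d\varphi + \sum_{j=1}^{\sigma} c_j\exp(in_j\theta)$$

Theorem. Let $f(z) = \sum_{n=0}^{\infty} a_n z^n$ be a convergent power series on the unit disk. Then these equations are valid:

(i) $\displaystyle\frac{1}{2\pi r^n}\int_{-\pi}^{\pi} f(re^{i\theta})e^{-in\theta}\,d\theta = a_n = \frac{1}{n!}f^{(n)}(0)$, $\quad n \ge 0$, $r \le 1$

(ii) $\displaystyle\frac{1}{2\pi}\int_{-\pi}^{\pi} f(re^{i\theta})e^{in\theta}\,d\theta = 0$, $\quad n > 0$, $r \le 1$

(iii) $\displaystyle f(re^{i\theta}) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{1-r^2}{1+r^2-2r\cos(\theta-\varphi)}\,d\varphi$

(iv) $\displaystyle f(z) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(e^{i\varphi})\,\frac{e^{i\varphi}}{e^{i\varphi}-z}\,d\varphi$, $\quad |z| < 1$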
6.8 Summary 517 If /is complex differentiable in a domain D, it can be expanded in a power series in some disk centered at any point in D. • FURTHER READING The theory of Fourier series is exposed in these texts: R. Seeley, An Introduction to Fourier Series and Transforms, W. A Benjamin, Inc., New York, 1966. Kreider, Kuller, Ostberg, Perkins, An Introduction to Linear Analysis, Addison-Wesley, Reading, Mass., 1966. Hardy and Rogosinski, Fourier Series, Oxford University Press, New York, 1956. Further applications to physics and the development of other partial differential equations can be pursued in E. Butkov, Mathematical Physics, Addison-Wesley, Reading, Mass., 1968. O. D. Kellogg, Foundations of Potential Theory, Dover, New York, 1953. • MISCELLANEOUS PROBLEMS 29. Let Κθ) - (ο θ < о οο Show that 2 /W = !/2 (see Example 7) What is n— - oo л = - QO 30. Let / be a piecewise continuous function on the circle. Suppose / has a. jump discontinuity at 0; that is, the limits lim /(χ) = α lim f(x) = b x-0 x-0 x<0 x>0 both exist, but are different. Show that 1 lim Ρ fir, 0) = -(a + b) Γ-.1 ^ (Hint: Follow the proof of Theorem 6 1 for 0>O, 0<O independently, using the substitution - = f P(r, -φ)άφ = ί P(r, -φ)άφ) 2 J„ J-n
31. Show that $f$ is an infinitely differentiable function on the circle if and only if this condition on the Fourier coefficients is satisfied: for every $k > 0$ there is an $M > 0$ such that

$$|\hat f(n)| \le \frac{M}{|n|^k} \quad \text{for all } n \ne 0.$$

32. Suppose $f(z) = P(z)/Q(z)$, where $P$ is analytic on $\{|z| < 1\}$ and $Q$ is a polynomial. Show that there is an integer $N$ and complex numbers $a_0, \ldots, a_N$ such that

$$a_N \hat f(k - N) + \cdots + a_0 \hat f(k) = 0 \quad \text{for all } k < 0.$$

(Hint: Let $Q(z) = a_N z^N + \cdots + a_0$.) State and prove the converse assertion.

33. Suppose that $f$ is complex differentiable in the annulus $\{r < |z| < R\}$. Using the polar form of the Cauchy-Riemann equations, show that

$$f(z) = \sum_{n=-\infty}^{\infty} a_n z^n$$

where the $a_n$ can be computed from the Fourier coefficients $\hat f(n)$ of $f(\rho e^{i\theta})$ for any $\rho$ between $r$ and $R$.

34. Show that if $f$ is analytic in the punctured disk $\{0 < |z| < R\}$ and bounded, then $f$ extends analytically to the entire disk.

35. Show that if $u$ is harmonic in the disk $\{|z - a| < R\}$, then for every $r < R$,

$$u(a + re^{i\theta}) = \frac{1}{2\pi} \int_{-\pi}^{\pi} u(a + Re^{i\varphi})\, \frac{R^2 - r^2}{R^2 + r^2 - 2Rr\cos(\theta - \varphi)}\, d\varphi$$

36. (Harnack's principle) If $u$ is harmonic and nonnegative in the disk $\{|z - a| < R\}$, then

$$\frac{R - r}{R + r}\, u(a) \le u(a + re^{i\theta}) \le \frac{R + r}{R - r}\, u(a)$$

37. Suppose $\{u_n\}$ is a sequence of nonnegative harmonic functions on the disk $\{|z - a| < R\}$. Suppose also that $\sum u_n(a) < \infty$. Then $u(z) = \sum u_n(z)$ converges for all $z$ in that disk, and $u$ is harmonic.
6.8 Summary 519 38. Show that ntO 2и + 1 4 39. Verify the trigonometric identity 1 " sin(2W+l)fl -+ 2cos2n6l=— д 2 n=i 2 sin ρ (Яш(: The sum to be evaluated is - + 2Re Σ (Ο") Ζ η- 1 40. Using the identity of Problem 39, obtain Dinchlet's integral for the partial sums of the Fourier series of/: N SN(&) = A0 + 2 (A cos ηθ + B„ sin ηθ) 1 f- 8ΐη(ΛΓ+»Μ = 2^J_. sin^ /<* + 0d* 1 r"sin(Af+J)<i Z7T J0 sin £<£ 41 Using the Dinchlet integral (Problem 40), verify that for f a continuous function on the circle which is differentiable at the point θ0, then /(0o) =Ao+ Σ (A* cos ηθ0 + B„ sin ηθ0) (Hint: lr" / 1\ /(θ0 + φ)-/(θ0) 1 f" Г ^ = brijlnNt[C0S2 φ /(θ0 + φ)-/(θ0) sin \φ + cos Νφ[/(θ0 +φ)-/(θ0)]α-φ The expressions in brackets are continuous functions)
6 Functions on the Circle {Fourier Analysis) 42 Solve the differential equation X0)=0 XL)=0 by Fourier methods, where (a) g is constant, (b) g(x) = L — x. 43. Suppose we want to study the problem of heat transfer through a homogeneous rod with insulated ends: heat does not flow through the ends If the rod is assumed to lie on the interval 0 <, χ < L this amounts to the boundary conditions ди ди -(0,,)=0=-(L,0 The initial condition may be given either as the temperature distribution OO,0)) or the initial heat flow [(dujdt)(x, 0)] Show that with these boundary conditions and either kind of initial condition the heat equation (6.43) can be uniquely solved. 44. Solve the insulated end heat problem with (a) constant initial temperature, (b) initial temperature = cos(x/L), (c) initial heat flow = x(x — L). 45. Suppose that и is a real-valued function harmonic in the unit disk. Show that и is the real part of an analytic function. (Hint: Write u(z) = a0 + 2"°^i ifl-nZ'" + a„z"), and add a pure imaginary-valued harmonic function with the same negative Fourier coefficients.) 46. If и is harmonic and real-valued on the unit disk, there is a unique harmonic real-valued function ν such that v(0) =0, и + w is analytic. Using the relation between the Fourier expansions of ν and и (Problem 45) find an integral form for ν in terms of the boundary values of u. 47. If и and υ are as in Problem 45, show that the families of curves {u = constant}, {v = constant} are orthogonal. 48. (The convolution transform) Let g be a continuous function on the circle and define the transformation G: С(Г)^С(Г) by ι π G(/)(0) = ^j ΑφΜθ-φ)άφ Show that (a) the eigenvalues of G are the Fourier coefficients g(n) of g. (b) The nonzero eigenvalues of g form a sequence converging to zero. (c) The eigenspaces associated to the nonzero eigenvalues are finite dimensional. (d) The Fourier series of G(/) is l9(n)Me'·
(e) $G(u) = f$ has a solution if and only if $f$ is orthogonal to the kernel of $G$.

49. Under what conditions is the convolution transform (Problem 48) symmetric; skew-symmetric; one-to-one?

The Laplace Transform

50. The Laplace transform is useful in the study of differential equations defined on the positive real axis, $R^+$. If $f$ is a bounded function on $R^+$, define

$$L(f)(s) = \int_0^\infty e^{-st} f(t)\, dt$$

Show that $L(f)$ is defined for all $s > 0$.

51. A function $f$ defined on $R^+$ is of exponential order $s_0$ if $\exp(-s_0 t) f(t)$ is a bounded function. Show that for such a function, $L(f)$ is defined for $s > s_0$.

52. Compute these Laplace transforms:
(a) $L(1)(s) = \dfrac{1}{s}$
(b) $L(t^n) = \dfrac{n!}{s^{n+1}}$
(c) $L(e^t) = \dfrac{1}{s - 1}$
(d) $L(e^{-t}) = \;?$

53. Verify these properties of the Laplace transform:
(a) $L$ is linear.
(b) If, for $a > 0$,
$$f_a(x) = \begin{cases} f(x - a) & x > a \\ 0 & x \le a \end{cases}$$
then $L(f_a) = e^{-as} L(f)$.
(c) $L(f') = sL(f) - f(0)$.

54. Notice that by the above problems the Laplace transform of a polynomial is a polynomial in $1/s$. More generally, we might expect at least that if $f$ is of exponential type, $L(f)(s) \to 0$ as $s \to \infty$. Is this true?

55. Because of property (c) in Problem 53, Laplace transformation transforms a differential equation into an algebraic problem. For example, suppose we want to solve $y'' + y = 1$, $y(0) = 0$, $y'(0) = 0$ on $R^+$. If $f$ is a
6 Functions on the Circle (Fourier Analysis) solution we must have Д/') = *Д/)-/(0) L(f) =sL(f') - /'(0) = s>L(f) - */<P) - /'(0) Thus, using the differential equation and the initial conditions L(/"+/) = L(l) i2L(/) + L(/) = - s 1 _1 L(/) = ^+1) = ~s Reading Problem 52(a)-(d) backward, we obtain the solution /(0 = 1 - Це" + <?"") = 1 - cos t Solve these equations by Laplace transformation: (a) у +y = \ X0)=0 /(0) = 1 (b) у'-у = ег X0) = 1 /(0)=0 (c) у'-2у=е-г Х0) = 1 (d) у +3v' + 2y = e~' + t Х0) = 1 /(0)=0 56 Solve these systems by Laplace transformation: (a) yi+y2=e~2t yi(0) = 0=y2(P) yi + 2yi = 1. (b) yl-}'2=yi »(0) = 0 /i(0)=l y" + yi = 2y2 y2(0) = 1 yi(0) = 0 57 Define this convolution for continuous functions on R+ (/**)(')= f /(tW-t)* Show that L(/* g)=L(f)Ug) 58 Solve the differential equation У+2/ + ;у=/(0 X0)=0 /(D) = 1 where/is of exponential type (recall Problem 51). 59 Show that the function of a complex variable L(/)(z)=f e"№dt 2\s + i) 2\s-i)
6.8 Summary 523 is complex differentiable (if / is continuous and bounded) in the domain {zeC: Rez>0} The Fourier Transform 60 We now consider the collection of continuous functions defined on all of R We shall discuss the Fourier transform, which is the analog of Fourier series: /periodic /defined on R Λ") = γ f ГФ)е- '" dd /(0 = —^— f °° f{x)e- ■*■ άξ Fourier series of/= У fyy F(/)(x) = —\— f °° /(0e<« ^ Of course, since we are working on an infinite interval, we want to be assured that the Fourier transform is defined The most appropriate class of functions will come out of these observations: (i) If/is integrable on R, /is a bounded continuous function (ii) The Fourier transform/^/is linear ("О (/Г = 00/ (iv) /' = {ixfY Since Fourier transformation interchanges the operations of differentiation and multiplication, we select the class of functions such that the effect of all such operations produces an integrable function This is the Schwartz class S(R) of test functions /is a Schwartz function, /e S(R), if and only if /is C"°, and for all positive integers η and к the function dk is integrable on R Show that /e S(R) implies /e S(R) (Hint " (*"/)<*> integrable" implies "(ίξ)'"fw bounded" implies "f-2/<»> integrable") 61 For /,je S(7?) define the convolution by (/**(*)= f f(y)g(x-y)dy Show that (/♦з)'1 = /£
6 Functions on the Circle {Fourier Analysis) 62 Borrowing again from the theory of Fourier series, we should expect that/(x) = F(/)(x): /w-^D<*№* (6·67) If we try to verify this directly we enter difficulties similar to that in Theorem 6.1: the integral on the right is and we cannot apply Fubini's theorem since J elii'~') άξ does not exist. The difficulty is overcome by introducing the convergence factor e~'w, then letting у -+0 More precisely, define the Poisson transform of/, a function defined on the upper half plane by Pf(x + iy) = тг4тз f /(Oexpttf* - у |£|] άξ (Ζπ) J _ α. Prove these assertions: (ι) if/is integrable on R, and hmPf(x + iy)=f(x) y-*o л QO (Hint- Integrate ε\ρ[ιξ(χ — t) — у \ξ\] άξ over R+, R~, independently. J -a> The second statement follows as does the result of Theorem 6.1, since this Poisson kernel y[(x — t)2 + y2]~l has the same behavior as that for the disk.) (ii) Pf is harmonic in the upper half plane and thus solves Dirichlet's problem there with the boundary values/. (iii)if/eS(/?), hm Pf(x + ,y) = F(f)(x) У-.0 so the inversion formula (6.67) holds on S(R).
Chapter 7

LINE INTEGRALS AND GREEN'S THEOREM

The basic idea of analysis is the suitable approximation of complicated functions by simpler ones, such as linear functions. Thus a differentiable function will be one that is, near every point in its domain of definition, approximable by a linear function. It is our purpose to discover what knowledge about the function is deducible from knowledge of this approximation, called its differential. Two hundred years ago it might have been said that the differential expresses the infinitesimal, or instantaneous, behavior of the function and that the total behavior is the sum of its infinitesimal parts. Nowadays it is generally conceded that such an assertion is nonsense; nevertheless it serves to describe the mood of the analyst as he begins his investigations.

Up until now we have been mainly concerned with one-dimensional calculus; although some of the applications have led us into the plane and space, our techniques have been mainly one dimensional. In the present chapter we turn to two dimensions, and in the next chapter we shall deal with the calculus of three dimensions. Each dimension has its own flavor. In one dimension, the order of the real numbers plays an important role; in two, we have the influence of complex numbers; and in three, we discover the vector product. However, there is also much that is the same in all these dimensions, and for these common concepts there is much to be gained from a unified treatment. Thus we begin the present chapter with a study of differentiable $R^m$-valued functions of $n$ variables. We will be interested in mappings from $R^1$ to $R^2$, $R^3$ to $R^2$, and so on, but the concept of differentiability is the same in all cases and it is important for us to take cognizance of that fact.

An $R^m$-valued function $\mathbf{f}$ defined in a neighborhood of a point $\mathbf{p}$ in $R^n$ will be said to be differentiable at $\mathbf{p}$ if it can be suitably approximated near $\mathbf{p}$ by a linear transformation of $R^n$ to $R^m$. This definition will make precise our usage up to now of the word differentiable. The transformation whose existence is required is called the differential of $\mathbf{f}$ and is denoted $d\mathbf{f}(\mathbf{p})$. We shall see that a differentiable $R^m$-valued function is an $m$-tuple of differentiable real-valued functions. We have already studied such functions in $R^2$, where we showed that if a function $f$ has continuous first partial derivatives near $\mathbf{p}$, then it is differentiable there, and the differential is given by

$$df(\mathbf{p}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(\mathbf{p})\, dx^i(\mathbf{p})$$

where $x^1, \ldots, x^n$ are the rectangular coordinate functions of $R^n$.

We have studied, in Chapter 1, some examples of coordinate systems for $R^2$ and $R^3$. We shall want, in the subsequent chapters, to consider more general kinds of coordinates. A coordinate system near a point $\mathbf{p}$ in $R^n$ arises in this way: if $\mathbf{F}$ is a continuously differentiable $R^n$-valued function defined near $\mathbf{p}$, and the differential $d\mathbf{F}(\mathbf{p})$ is a nonsingular linear transformation, then the functions

$$y^1 = F^1(\mathbf{x}), \ldots, y^n = F^n(\mathbf{x})$$

are coordinates in a neighborhood of $\mathbf{p}$. That is, the values of $y^1, \ldots, y^n$ serve to identify all points near $\mathbf{p}$. This fact, that the nonsingularity of the differential implies that of the mapping, is called the inverse mapping theorem. It asserts that the mapping $\mathbf{F}$ has an inverse near $\mathbf{p}$ when its differential at $\mathbf{p}$ does.

Suppose that $f$ is a differentiable real-valued function defined in a domain $D$. Then its differential associates to each point in $D$ a linear function on $R^n$. Any rule which does this is called a differential form.
An important question which we shall study is this: just when is a differential form the differential of a function? In one variable, this question is easily answered. For if $f$ is a differentiable function of a real variable, its differential is given by

$$f'(x)\, dx$$

Any continuous differential form in one variable is of the form $g(x)\, dx$. We know from the fundamental theorem of calculus that if $G$ is an indefinite
integral of $g$:

$$G(x) = \int_a^x g(t)\, dt$$

then $G$ is differentiable and $dG = g\, dx$. Thus the answer to our question in one variable is: always. The situation in several variables is not so easy. But the extension of the idea of integration to differential forms provides us with a tool for answering this question, and a several variable analog of the fundamental theorem (Green's theorem in $R^2$).

Green's theorem provides us with a tool to study complex differentiable functions extensively. This is the Cauchy integral formula, which gives a means for determining such a function at interior points of a domain by its boundary values. It follows easily from this formula (a generalization of the formula given in Section 6.7) that a complex differentiable function must be analytic: expressible as a convergent power series. In fact, the entire behavior of such functions can be read off from the integral formula; this is the basis of the Cauchy theory of complex variables. We shall only begin this study.

7.1 The Differential

In Chapter 2 we studied differentiation of real-valued functions of many variables, differentiating with respect to one variable at a time. This gave us the concept of partial derivatives, which generalized to the directional derivatives $df(\mathbf{p}, \mathbf{v})$ of a function $f$ at a point $\mathbf{p}$ and in a direction $\mathbf{v}$. According to Proposition 20 of Chapter 2, if the partial derivatives are continuous in a neighborhood of $\mathbf{p}$, then the directional derivative $df(\mathbf{p}, \mathbf{v})$ varies linearly in $\mathbf{v}$. This linear function we called the differential of $f$ at $\mathbf{p}$. Now we shall give a more precise definition of this notion, in a style more like the definition of the derivative of an $R^m$-valued function of a real variable (see Proposition 5 of Chapter 3).

Definition 1. Let $\mathbf{p} \in R^n$, and suppose $\mathbf{f}$ is an $R^m$-valued function defined on a neighborhood of $\mathbf{p}$.
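We say that $\mathbf{f}$ is differentiable at $\mathbf{p}$ if there is a linear transformation $T: R^n \to R^m$ and a nonnegative real-valued function $\varepsilon$ of a real variable such that $\lim_{t \to 0} \varepsilon(t) = 0$ and

$$\|\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - T(\mathbf{v})\| \le \varepsilon(\|\mathbf{v}\|)\, \|\mathbf{v}\| \tag{7.1}$$

when $\|\mathbf{v}\|$ is sufficiently small.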
If such a linear transformation exists it is called the differential of $\mathbf{f}$ at $\mathbf{p}$ and is denoted by $d\mathbf{f}(\mathbf{p})$.

Notice that there can be at most one linear transformation $T$ satisfying these requirements. For suppose also $S: R^n \to R^m$ satisfies (7.1). Then

$$\|S(\mathbf{h}) - T(\mathbf{h})\| \le 2\varepsilon(\|\mathbf{h}\|)\, \|\mathbf{h}\|$$

for sufficiently small $\mathbf{h}$. Fix $\mathbf{v}$ and let $\mathbf{h} = t\mathbf{v}$. By linearity,

$$\|S(t\mathbf{v}) - T(t\mathbf{v})\| = |t|\, \|S(\mathbf{v}) - T(\mathbf{v})\| \le 2\varepsilon(|t|\,\|\mathbf{v}\|)\, |t|\, \|\mathbf{v}\|$$

thus $\|S(\mathbf{v}) - T(\mathbf{v})\| \le 2\varepsilon(|t|\,\|\mathbf{v}\|)\, \|\mathbf{v}\|$ for all small $t \ne 0$. Letting $t \to 0$, we obtain $S(\mathbf{v}) = T(\mathbf{v})$. Thus $S = T$.

Examples

1. $f(x, y) = xy^2$ is differentiable in the plane. Let $(x_0, y_0) \in R^2$ and let $(h, k)$ be any vector. Then

$$f(x_0 + h, y_0 + k) = (x_0 + h)(y_0 + k)^2 = x_0 y_0^2 + y_0^2 h + 2x_0 y_0 k + 2y_0 hk + x_0 k^2 + hk^2$$

Thus

$$f(x_0 + h, y_0 + k) - f(x_0, y_0) - (y_0^2 h + 2x_0 y_0 k) = 2y_0 hk + x_0 k^2 + hk^2$$

Since $2|hk| \le h^2 + k^2$, this is dominated in norm by

$$2|y_0||hk| + |x_0||k|^2 + |h||k|^2 \le |y_0|(h^2 + k^2) + |x_0|(h^2 + k^2) + |h|(h^2 + k^2) \le \|(h, k)\| \left[ \|(h, k)\| (|y_0| + |x_0|) + \|(h, k)\|^2 \right]$$

since $\|(h, k)\| = (h^2 + k^2)^{1/2}$. Thus $xy^2$ is differentiable and has the differential at $(x_0, y_0)$:

$$(h, k) \to y_0^2 h + 2x_0 y_0 k$$
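This means that for small values of $(h, k)$, the difference

$$(x_0 + h)(y_0 + k)^2 - x_0 y_0^2$$

is effectively approximable by

$$y_0^2 h + 2x_0 y_0 k$$

The meaning of "effective" is that the error in this approximation is of the order of $\varepsilon \|(h, k)\|$, where $\varepsilon$ can be made as small as we please by choosing the neighborhood of $(x_0, y_0)$ small enough.

2. More generally, Proposition 20 of Chapter 2 suggests that a real-valued function with continuous partial derivatives near $\mathbf{p}_0$ is differentiable there. This means that for small values of $\mathbf{v}$, $f(\mathbf{p}_0 + \mathbf{v}) - f(\mathbf{p}_0)$ is effectively approximable by $\langle \nabla f(\mathbf{p}_0), \mathbf{v} \rangle = \sum_i \frac{\partial f}{\partial x^i}(\mathbf{p}_0)\, v^i$. Let us complete Proposition 20 of Chapter 2 to a verification of this fact (at least in $R^2$). By the mean value theorem we may write, for $\mathbf{p} = (x_0, y_0)$, $\mathbf{v} = (h, k)$:

$$f(\mathbf{p} + \mathbf{v}) - f(\mathbf{p}) = \frac{\partial f}{\partial x}(\xi_0, y_0)\, h + \frac{\partial f}{\partial y}(x_0 + h, \eta_0)\, k$$

where $|\xi_0 - x_0| \le |h|$, $|\eta_0 - y_0| \le |k|$. Then

$$\left| f(\mathbf{p} + \mathbf{v}) - f(\mathbf{p}) - \frac{\partial f}{\partial x}(\mathbf{p})\, h - \frac{\partial f}{\partial y}(\mathbf{p})\, k \right| \le \left| \frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p}) \right| |h| + \left| \frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p}) \right| |k| \tag{7.2}$$

where $\mathbf{p}_1 = (\xi_0, y_0)$ and $\mathbf{p}_2 = (x_0 + h, \eta_0)$ are at least as close to $\mathbf{p}$ as $\mathbf{p} + \mathbf{v}$. By Schwarz's inequality the right side of (7.2) is dominated by

$$\left[ \left( \frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p}) \right)^2 + \left( \frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p}) \right)^2 \right]^{1/2} \|\mathbf{v}\|$$

and the first factor is dominated by

$$\varepsilon(\|\mathbf{v}\|) = \max \left[ \left( \frac{\partial f}{\partial x}(\mathbf{p}_1) - \frac{\partial f}{\partial x}(\mathbf{p}) \right)^2 + \left( \frac{\partial f}{\partial y}(\mathbf{p}_2) - \frac{\partial f}{\partial y}(\mathbf{p}) \right)^2 \right]^{1/2}$$

the maximum being taken over all $\mathbf{p}_1, \mathbf{p}_2$ in the ball $B(\mathbf{p}, \|\mathbf{v}\|)$. Since the partial derivatives are continuous, this tends to zero as $\|\mathbf{v}\| \to 0$.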
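The estimate of Example 1 can be checked numerically. The following is only a minimal sketch: the base point $(x_0, y_0) = (1, 2)$ and the increments $h = k = s$ are assumptions chosen for illustration, not values from the text. It computes the ratio of the remainder to $\|(h, k)\|$, which Definition 1 requires to tend to zero.

```python
import math

def f(x, y):
    return x * y**2

def remainder_ratio(x0, y0, h, k):
    """|f(p+v) - f(p) - df(p)(v)| / ||v|| for f(x, y) = x*y^2."""
    linear = y0**2 * h + 2 * x0 * y0 * k   # the differential found in Example 1
    rem = f(x0 + h, y0 + k) - f(x0, y0) - linear
    return abs(rem) / math.hypot(h, k)

x0, y0 = 1.0, 2.0                          # illustrative base point (assumption)
ratios = [remainder_ratio(x0, y0, s, s) for s in (1e-1, 1e-2, 1e-3)]
for s, ratio in zip((1e-1, 1e-2, 1e-3), ratios):
    print(s, ratio)                        # the ratio shrinks roughly like s
```

With $h = k = s$ the remainder is $2y_0 s^2 + x_0 s^2 + s^3$, so the printed ratio should decrease roughly in proportion to $s$.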
3. Error analysis. The differential of a function gives us approximately the difference between two values of a function in terms of the difference between the variables:

$$f(\mathbf{x}) - f(\mathbf{x}_0) = \langle \nabla f(\mathbf{x}_0), \mathbf{x} - \mathbf{x}_0 \rangle + \text{error} \tag{7.3}$$

where the error is negligible if the difference is small. Considered this way, the differential may be used to compute tolerance levels for errors in measurement. For example, we can compute the maximal error in the volume of a rectangular box, given certain tolerances in the measurements of the sides. Suppose the sides can be measured within an error of 2%. The function we are concerned with is $f(x, y, z) = xyz$ and $\nabla f = (yz, xz, xy)$. The error in the measurement of a volume will be, according to (7.3), approximately equal to

$$\langle (yz, xz, xy), 0.02(x, y, z) \rangle = 3[0.02(xyz)]$$

Thus, the percentage error is

$$100\, \frac{f(\mathbf{x}) - f(\mathbf{x}_0)}{f(\mathbf{x}_0)} = 100\, \frac{0.06\, xyz}{xyz} = 6\%$$

Thus an error is magnified threefold.

4. Let $f(x, y, z) = x(\cos y)e^{x+z}$. Given error tolerances of 2%, 1%, 5% in the measurements of $x, y, z$, respectively, what error is possible in the computation of $f$? Here

$$\nabla f = \left( (\cos y)e^{x+z}(1 + x),\; -x(\sin y)e^{x+z},\; x(\cos y)e^{x+z} \right)$$

The ratio of the increment in $f$ to the computed value of $f$ is approximately

$$\frac{\langle \nabla f(x, y, z), (0.02x, 0.01y, 0.05z) \rangle}{f(x, y, z)} = (1 + x)(0.02) - y(\tan y)(0.01) + (0.05)z$$

Here we see that the error in the computed value of $f$ depends on the magnitude of the variables. If $y$ is close to $\pi/2$, the error is very bad. The maximum percent error for values of $x, y, z$ in these
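ranges: $|x| \le 1$, $|y| \le \pi/4$, $|z| \le 1$, is

$$2(2) + \frac{\pi}{4}(1) + 5(1) = 9 + \frac{\pi}{4}$$

which is less than 10.

5. A linear transformation is differentiable at every point. Let $T: R^n \to R^m$ be a given linear transformation, and let $\mathbf{p} \in R^n$. Since $T(\mathbf{p} + \mathbf{v}) - T(\mathbf{p}) = T(\mathbf{v})$ we have

$$\|T(\mathbf{p} + \mathbf{v}) - T(\mathbf{p}) - T(\mathbf{v})\| = 0,$$

so the estimate required by the definition is precise. Furthermore, for any $\mathbf{p} \in R^n$, $dT(\mathbf{p}) = T$. In particular, the coordinate functions $x^1, \ldots, x^n$ are differentiable and $dx^i(\mathbf{p}, \mathbf{v}) = v^i$ for any $\mathbf{p}$, $\mathbf{v}$. Since $dx^i$ is independent of the base point we shall often omit it.

Notice that $dx^1, \ldots, dx^n$ form a basis for the space of linear functions on $R^n$, so the differential of any function will be a linear combination of these differentials. In particular, if $f$ is differentiable at $\mathbf{p}$, we have

$$df(\mathbf{p}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(\mathbf{p})\, dx^i \tag{7.4}$$

We have just shown that in two dimensions, but it is easier to directly compare Definition 1 of this chapter and Definition 14 of Chapter 2 (cf. Problem 1) to obtain

$$df(\mathbf{p})(\mathbf{E}_i) = \lim_{t \to 0} \frac{f(\mathbf{p} + t\mathbf{E}_i) - f(\mathbf{p})}{t} = \frac{\partial f}{\partial x^i}(\mathbf{p})$$

The verification of the following proposition concerning the behavior of the differential under algebraic operations is easily performed.

Proposition 1.
(i) Suppose that $\mathbf{f}$, $\mathbf{g}$ are differentiable $R^m$-valued functions at $\mathbf{p}$. Then $\mathbf{f} + \mathbf{g}$ and $\langle \mathbf{f}, \mathbf{g} \rangle$ are also differentiable and

$$d(\mathbf{f} + \mathbf{g})(\mathbf{p}) = d\mathbf{f}(\mathbf{p}) + d\mathbf{g}(\mathbf{p})$$
$$d\langle \mathbf{f}, \mathbf{g} \rangle(\mathbf{p}) = \langle d\mathbf{f}(\mathbf{p}), \mathbf{g}(\mathbf{p}) \rangle + \langle \mathbf{f}(\mathbf{p}), d\mathbf{g}(\mathbf{p}) \rangle$$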
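(ii) Suppose $\mathbf{f} = (f^1, \ldots, f^m)$ is an $R^m$-valued function defined in a neighborhood of $\mathbf{p}$. Then $\mathbf{f}$ is differentiable at $\mathbf{p}$ if and only if $f^1, \ldots, f^m$ are. In this case we have

$$d\mathbf{f}(\mathbf{p}) = (df^1(\mathbf{p}), \ldots, df^m(\mathbf{p}))$$

Proof. We shall only verify the differentiability of $\langle \mathbf{f}, \mathbf{g} \rangle$; the other assertions are clear. By the hypothesis of (i) there are functions $\varepsilon$, $\eta$ of a real variable such that $\lim \varepsilon(t) = \lim \eta(t) = 0$ as $t \to 0$, and linear transformations $R$, $S$ such that

$$\|\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - R(\mathbf{v})\| \le \varepsilon(\|\mathbf{v}\|)\, \|\mathbf{v}\| \tag{7.5}$$
$$\|\mathbf{g}(\mathbf{p} + \mathbf{v}) - \mathbf{g}(\mathbf{p}) - S(\mathbf{v})\| \le \eta(\|\mathbf{v}\|)\, \|\mathbf{v}\| \tag{7.6}$$

Let $h(\mathbf{x}) = \langle \mathbf{f}(\mathbf{x}), \mathbf{g}(\mathbf{x}) \rangle$. Then

$$h(\mathbf{p} + \mathbf{v}) - h(\mathbf{p}) = \langle \mathbf{f}(\mathbf{p} + \mathbf{v}), \mathbf{g}(\mathbf{p} + \mathbf{v}) \rangle - \langle \mathbf{f}(\mathbf{p}), \mathbf{g}(\mathbf{p}) \rangle = \langle \mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}), \mathbf{g}(\mathbf{p} + \mathbf{v}) \rangle + \langle \mathbf{f}(\mathbf{p}), \mathbf{g}(\mathbf{p} + \mathbf{v}) - \mathbf{g}(\mathbf{p}) \rangle \tag{7.7}$$

If we replace $\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p})$ in the first term by $R(\mathbf{v})$ we commit an error of at most

$$\varepsilon(\|\mathbf{v}\|)\, \|\mathbf{v}\|\, \|\mathbf{g}(\mathbf{p} + \mathbf{v})\|$$

and if we replace $\mathbf{g}(\mathbf{p} + \mathbf{v}) - \mathbf{g}(\mathbf{p})$ in the last term by $S(\mathbf{v})$ we commit an error of at most

$$\eta(\|\mathbf{v}\|)\, \|\mathbf{v}\|\, \|\mathbf{f}(\mathbf{p})\|$$

These are admissible errors, so we shall bravely proceed with these replacements. From (7.7), we obtain

$$|h(\mathbf{p} + \mathbf{v}) - h(\mathbf{p}) - (\langle R(\mathbf{v}), \mathbf{g}(\mathbf{p}) \rangle + \langle \mathbf{f}(\mathbf{p}), S(\mathbf{v}) \rangle)|$$
$$\le |\langle \mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - R(\mathbf{v}), \mathbf{g}(\mathbf{p} + \mathbf{v}) \rangle| + |\langle R(\mathbf{v}), \mathbf{g}(\mathbf{p} + \mathbf{v}) - \mathbf{g}(\mathbf{p}) \rangle| + |\langle \mathbf{f}(\mathbf{p}), \mathbf{g}(\mathbf{p} + \mathbf{v}) - \mathbf{g}(\mathbf{p}) - S(\mathbf{v}) \rangle|$$
$$\le \varepsilon(\|\mathbf{v}\|)\, \|\mathbf{v}\|\, \|\mathbf{g}(\mathbf{p} + \mathbf{v})\| + \|R(\mathbf{v})\| \left( \|S(\mathbf{v})\| + \eta(\|\mathbf{v}\|)\, \|\mathbf{v}\| \right) + \|\mathbf{f}(\mathbf{p})\|\, \eta(\|\mathbf{v}\|)\, \|\mathbf{v}\|$$

If we take $M$ larger than the maximum value of $\|\mathbf{g}(\mathbf{p} + \mathbf{v})\|$, and also larger than $\|R\|$ and $\|S\|$, this is dominated by

$$\left[ M\varepsilon(\|\mathbf{v}\|) + M^2 \|\mathbf{v}\| + M\|\mathbf{v}\|\, \eta(\|\mathbf{v}\|) + \|\mathbf{f}(\mathbf{p})\|\, \eta(\|\mathbf{v}\|) \right] \|\mathbf{v}\|$$

which is of the desired form.

Examples

6. $f(x, y) = e^x \cos y + y^x$.

$$df(x, y) = (e^x \cos y + y^x \log y)\, dx + (-e^x \sin y + x y^{x-1})\, dy$$

7. $f(x, y, z) = xyz$, $df(x, y, z) = yz\, dx + xz\, dy + xy\, dz$.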
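The error analysis of Example 4 lends itself to a quick numerical check. The sketch below rests on stated assumptions: it takes the first-order relative error terms $(1 + x)(0.02)$, $y(\tan y)(0.01)$, and $(0.05)z$ from that example, adds their absolute values (the worst case when the three measurement errors may have any signs), and searches a grid over $|x| \le 1$, $|y| \le \pi/4$, $|z| \le 1$. The function names and the grid size are hypothetical choices made for illustration.

```python
import math

def worst_case_relative_error(x, y, z, tol=(0.02, 0.01, 0.05)):
    """First-order worst-case relative error of f = x*cos(y)*exp(x+z).

    Each term is |variable * (df/dvariable) / f| times the relative
    tolerance of that variable.
    """
    tx, ty, tz = tol
    return abs(1 + x) * tx + abs(y * math.tan(y)) * ty + abs(z) * tz

def max_over_box(n=40):
    """Grid search over |x| <= 1, |y| <= pi/4, |z| <= 1 (endpoints included)."""
    best = 0.0
    for i in range(n + 1):
        x = -1 + 2 * i / n
        for j in range(n + 1):
            y = -math.pi / 4 + (math.pi / 2) * j / n
            for k in range(n + 1):
                z = -1 + 2 * k / n
                best = max(best, worst_case_relative_error(x, y, z))
    return best

bound = max_over_box()
print(100 * bound)   # percent error; compare with 9 + pi/4
```

The maximum is attained at a corner of the box, which is why a coarse grid that includes the endpoints is enough here.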
7.1 The Differential 533 • EXERCISES 1. Find the differential of these functions: (a) у cos χ + sin zx (b) cosO"') + cosO?')- (c) exp<x, a>. (d) <x, exp<x, a> >. (e) x2 + y2 + ζχ. (f) (x-^)e*+y (g) Щ.хдс'. 2. For each of the following functions, in how large an interval about the origin may we estimate /(v) — /(0) by <V/(0), v> incurring an error of at most 10-3 ||v||? (a) xy (d) sin(x + 2y) (b) ex+> (e) x + e2' (c) sin χ + cos у (f) exp(x2 + y2) 3. In how large a disk about the point ρ ^=0 can we estimate the polar coordinates of nearby points ρ + ν by a linear function, with an error of at most 10~3||v||? • PROBLEMS 1. Suppose that / is a differentiable real-valued function defined in a neighborhood of ρ in R". Using the definition, verify that л,,™ ι /(ρ + Έ,)-/(ρ) Э/ rf/(p)(E,) = hm = — (p) Γ-.0 1 °Xi and conclude that 2. Let M(x), N(x) be η χ η matrix valued functions of the variable x. If Μ, Ν are differentiable at p, so is MN. Show that rf(MN)(p) = t/M(p)N + Μ ί/Ν(ρ). 3. if /(f) = det(exp(Mf)), show that /'(0) = trM.
4. A quantity $Q$ varies with $x, y, z$ according to e' Q = - Suppose that $x, y, z$ can be measured to within an error of 1%, 1/2%, 3%, respectively. What will be the corresponding maximal error in $Q$ at corresponding values? At (a) $x = 0$, $y = 2$, $z = 5$, (b) $x = 2$, $y = 1$, $z = 3$, in particular?

7.2 Coordinate Changes

In Chapter 1 we introduced some systems of coordinates in $R^n$, and we saw that for certain problems a change of coordinates made the problem understandable and solvable. Later on we saw, in the study of systems of linear differential equations, that it was convenient, where possible, to switch to coordinates relative to a basis of eigenvectors. In the geometric study of surfaces, and in many physical problems, it is advantageous to admit very general coordinate changes. We now introduce a general notion of coordinates.

Definition 2. Let $U$ be a domain in $R^n$. A system of coordinates is an $n$-tuple of continuously differentiable functions $\mathbf{y} = (y^1, \ldots, y^n)$ defined on $U$ such that
(i) if $\mathbf{p} \ne \mathbf{q}$, then $\mathbf{y}(\mathbf{p}) \ne \mathbf{y}(\mathbf{q})$,
(ii) $dy^1(\mathbf{p}), \ldots, dy^n(\mathbf{p})$ are independent at all $\mathbf{p} \in U$.

The first condition states that any point is uniquely determined by the value of $\mathbf{y}$ at that point. In this sense $y^1, \ldots, y^n$ are coordinates. We can name points in $U$ by means of the functions $y^1, \ldots, y^n$. Further, if $f$ is a function defined on $U$, we can describe it as a function of the coordinates $y^1, \ldots, y^n$. The second condition asserts that the differentials $dy^1, \ldots, dy^n$ span the space of linear functions. Thus we can express the differential of a function as a linear combination of these differentials; it should be no surprise that (7.4) is valid in any coordinate system.

Proposition 2. Suppose that $y^1, \ldots, y^n$ are coordinates in a neighborhood of $\mathbf{p}$. If $f$ is a differentiable function defined in a neighborhood of $\mathbf{p}$, then

$$df(\mathbf{p}) = \sum_{i=1}^{n} \frac{\partial f}{\partial y^i}(\mathbf{p})\, dy^i(\mathbf{p})$$
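Proof. Let $x^1, \ldots, x^n$ be the coordinates of $R^n$ relative to the standard basis. We know that

$$df(\mathbf{p}) = \sum_{i=1}^{n} \frac{\partial f}{\partial x^i}(\mathbf{p})\, dx^i$$

Now we can express the standard coordinates as differentiable functions of the new coordinates $y^1, \ldots, y^n$: $x^i = x^i(y^1, \ldots, y^n)$, $i = 1, \ldots, n$, and $f$ can be expressed as a function of $y^1, \ldots, y^n$ by composition:

$$f(\mathbf{p}) = f(x^1(\mathbf{y}(\mathbf{p})), \ldots, x^n(\mathbf{y}(\mathbf{p})))$$

Let us assume that $\mathbf{p}$ is the origin relative to both $x$ and $y$ coordinates. Now $\partial f / \partial y^i$ is the derivative of $f$ with respect to $y^i$, holding the other variables $y^j$, $j \ne i$, constant. In other words, $\partial f / \partial y^i(\mathbf{p})$ is the derivative of $f$ at $\mathbf{p}$ along the curve $y^j = 0$, $j \ne i$. We can parametrize this curve by

$$x^1 = g^1(t) = x^1(0, \ldots, 0, t, 0, \ldots, 0)$$
$$\vdots$$
$$x^n = g^n(t) = x^n(0, \ldots, 0, t, 0, \ldots, 0)$$

for $t$ near 0, where $t$ stands in the $i$th place. Now by Proposition 3 of Chapter 3, we have

$$\frac{\partial f}{\partial y^i}(\mathbf{p}) = \frac{d}{dt} f(g^1(t), \ldots, g^n(t)) \Big|_{t=0} = \sum_{k=1}^{n} \frac{\partial f}{\partial x^k}(\mathbf{0})\, \frac{dg^k}{dt}(0)$$

But $dg^k/dt(0) = \partial x^k / \partial y^i(\mathbf{0})$. Thus

$$\frac{\partial f}{\partial y^i}(\mathbf{p}) = \sum_{k=1}^{n} \frac{\partial f}{\partial x^k}(\mathbf{p})\, \frac{\partial x^k}{\partial y^i}(\mathbf{p}) \tag{7.8}$$

As the $x^k$ are differentiable functions of $\mathbf{y}$, $dx^k = \sum_j (\partial x^k / \partial y^j)\, dy^j$, and we conclude that

$$df = \sum_{k} \frac{\partial f}{\partial x^k}\, dx^k = \sum_{j} \sum_{k} \frac{\partial f}{\partial x^k} \frac{\partial x^k}{\partial y^j}\, dy^j = \sum_{j} \frac{\partial f}{\partial y^j}\, dy^j$$

Examples

9. Polar coordinates: the change of coordinates

$$x = r\cos\theta \qquad y = r\sin\theta$$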
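is valid in any disk not containing the origin. We have

$$dx = \cos\theta\, dr - r\sin\theta\, d\theta, \qquad dy = \sin\theta\, dr + r\cos\theta\, d\theta$$

so

$$\frac{\partial x}{\partial r} = \cos\theta \qquad \frac{\partial x}{\partial \theta} = -r\sin\theta \qquad \frac{\partial y}{\partial r} = \sin\theta \qquad \frac{\partial y}{\partial \theta} = r\cos\theta$$

If $f$ is any differentiable function,

$$\frac{\partial f}{\partial r} = \cos\theta\, \frac{\partial f}{\partial x} + \sin\theta\, \frac{\partial f}{\partial y}$$

10. Spherical coordinates:

$$x = r\cos\theta\cos\varphi \qquad y = r\sin\theta\cos\varphi \qquad z = r\sin\varphi$$

$$dx = \cos\theta\cos\varphi\, dr - r\sin\theta\cos\varphi\, d\theta - r\cos\theta\sin\varphi\, d\varphi$$
$$dy = \sin\theta\cos\varphi\, dr + r\cos\theta\cos\varphi\, d\theta - r\sin\theta\sin\varphi\, d\varphi$$
$$dz = \sin\varphi\, dr + r\cos\varphi\, d\varphi$$

If $f$ is differentiable,

$$\frac{\partial f}{\partial r} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial r} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial r} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial r} = \cos\theta\cos\varphi\, \frac{\partial f}{\partial x} + \sin\theta\cos\varphi\, \frac{\partial f}{\partial y} + \sin\varphi\, \frac{\partial f}{\partial z}$$

$$\frac{\partial f}{\partial \theta} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \theta} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \theta} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial \theta} = r\left[ -\sin\theta\cos\varphi\, \frac{\partial f}{\partial x} + \cos\theta\cos\varphi\, \frac{\partial f}{\partial y} \right]$$

$$\frac{\partial f}{\partial \varphi} = \frac{\partial f}{\partial x}\frac{\partial x}{\partial \varphi} + \frac{\partial f}{\partial y}\frac{\partial y}{\partial \varphi} + \frac{\partial f}{\partial z}\frac{\partial z}{\partial \varphi} = r\left[ -\cos\theta\sin\varphi\, \frac{\partial f}{\partial x} - \sin\theta\sin\varphi\, \frac{\partial f}{\partial y} + \cos\varphi\, \frac{\partial f}{\partial z} \right]$$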
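The spherical-coordinate formula for $\partial f / \partial \theta$ in Example 10 can be compared against a finite-difference quotient. This is a sketch under assumptions: the test function $f(x, y, z) = xy + z^2$, the sample point, and the step size are hypothetical choices made only for illustration.

```python
import math

def to_xyz(r, theta, phi):
    """Spherical coordinates as in Example 10 (phi measured from the xy-plane)."""
    return (r * math.cos(theta) * math.cos(phi),
            r * math.sin(theta) * math.cos(phi),
            r * math.sin(phi))

def f(x, y, z):
    return x * y + z**2          # hypothetical test function

def grad_f(x, y, z):
    return (y, x, 2 * z)         # its partial derivatives in x, y, z

def df_dtheta_chain(r, theta, phi):
    """df/dtheta from the chain-rule formula of Example 10."""
    fx, fy, _ = grad_f(*to_xyz(r, theta, phi))
    return r * (-math.sin(theta) * math.cos(phi) * fx
                + math.cos(theta) * math.cos(phi) * fy)

def df_dtheta_numeric(r, theta, phi, eps=1e-6):
    """Central difference in theta, holding r and phi fixed."""
    hi = f(*to_xyz(r, theta + eps, phi))
    lo = f(*to_xyz(r, theta - eps, phi))
    return (hi - lo) / (2 * eps)

print(df_dtheta_chain(2.0, 0.7, 0.3), df_dtheta_numeric(2.0, 0.7, 0.3))
```

The two printed values should agree to several decimal places; the same comparison can be made for $\partial f / \partial r$ and $\partial f / \partial \varphi$.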
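11. Let $f(x, y, z) = e^x yz$. Find $\partial f / \partial r$, $\partial f / \partial \theta$:

$$\frac{\partial f}{\partial r} = (\cos\theta\cos\varphi)e^x yz + (\sin\theta\cos\varphi)e^x z + (\sin\varphi)e^x y = r\exp(r\cos\theta\cos\varphi)\left( r\sin\theta\cos\theta\cos^2\varphi\sin\varphi + 2\sin\theta\cos\varphi\sin\varphi \right)$$

$$\frac{\partial f}{\partial \theta} = r\left[ (-\sin\theta\cos\varphi)e^x yz + (\cos\theta\cos\varphi)e^x z \right] = r^2 \exp(r\cos\theta\cos\varphi)\left[ -r\sin^2\theta\cos^2\varphi\sin\varphi + \cos\theta\cos\varphi\sin\varphi \right]$$

12. Find $\partial f / \partial x$ if $f(r, \theta, \varphi) = \varphi^2$ in spherical coordinates. In order to solve this we have to write $f$ explicitly as a function of the rectangular coordinates. Since $\varphi = \arcsin(z/r)$,

$$\frac{\partial f}{\partial x} = 2\varphi\, \frac{\partial \varphi}{\partial x} = 2\arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}} \cdot \frac{\partial}{\partial x} \arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}} = -\frac{2xz}{(x^2 + y^2 + z^2)(x^2 + y^2)^{1/2}}\, \arcsin\frac{z}{(x^2 + y^2 + z^2)^{1/2}}$$

The Jacobian

In general, if

$$y^1 = f^1(x^1, \ldots, x^n), \quad \ldots, \quad y^n = f^n(x^1, \ldots, x^n)$$

is a change of coordinates, we shall write this as $\mathbf{y} = \mathbf{F}(\mathbf{x})$. The differential $d\mathbf{F}(\mathbf{x}_0)$ is a nonsingular linear transformation on $R^n$. The matrix relative to $x$ coordinates representing this transformation is referred to as the Jacobian of the mapping and denoted (when it is of value to make the coordinates explicit) by

$$\frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)} = \left( \frac{\partial y^i}{\partial x^j} \right)$$

According to Proposition 2

$$\frac{\partial y^i}{\partial y^j} = \sum_{k=1}^{n} \frac{\partial y^i}{\partial x^k} \frac{\partial x^k}{\partial y^j} \qquad i, j = 1, \ldots, n$$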
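which is just the entry by entry form of the equation

$$I = \frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)}\, \frac{\partial(x^1, \ldots, x^n)}{\partial(y^1, \ldots, y^n)}$$

Thus the matrices are inverse to each other, as are the corresponding differentials:

$$d\mathbf{F}^{-1}(\mathbf{y}_0) = [d\mathbf{F}(\mathbf{x}_0)]^{-1} \quad \text{if } \mathbf{y}_0 = \mathbf{F}(\mathbf{x}_0)$$

Example

13. Let

$$u = x + e^y \qquad v = x\cos y$$

be a coordinate change in a domain in $R^2$. Then

$$\frac{\partial(u, v)}{\partial(x, y)} = \begin{pmatrix} 1 & e^y \\ \cos y & -x\sin y \end{pmatrix}$$

$$\frac{\partial(x, y)}{\partial(u, v)} = \frac{-1}{e^y\cos y + x\sin y} \begin{pmatrix} -x\sin y & -e^y \\ -\cos y & 1 \end{pmatrix}$$

If $f(u, v) = u^2 + v^2$, then

$$\frac{\partial f}{\partial x} = \frac{\partial f}{\partial u}\frac{\partial u}{\partial x} + \frac{\partial f}{\partial v}\frac{\partial v}{\partial x} = 2u + 2v\cos y = 2(x + e^y + x\cos^2 y)$$

If $g(x, y) = x^2 + y^2$, then

$$\frac{\partial g}{\partial u} = \frac{\partial g}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial g}{\partial y}\frac{\partial y}{\partial u} = \frac{2(x^2\sin y + y\cos y)}{e^y\cos y + x\sin y}$$

These observations form special cases of the multivariable chain rule. We have already seen (Propositions 3.2, 3.3) other special cases. The general situation is this: the differential of a composed function (see Figure 7.1) is the composition of the differentials:

$$d(\mathbf{g} \circ \mathbf{f})(\mathbf{p}) = d\mathbf{g}(\mathbf{f}(\mathbf{p})) \circ d\mathbf{f}(\mathbf{p}) \tag{7.9}$$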
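The two matrices of Example 13 can be multiplied numerically to confirm that they are inverse to each other. A minimal sketch, assuming an arbitrary sample point $(x, y) = (0.5, 1.2)$ at which $e^y\cos y + x\sin y \ne 0$; the helper names are hypothetical.

```python
import math

def jacobian(x, y):
    """d(u, v)/d(x, y) for u = x + e^y, v = x cos y."""
    return [[1.0, math.exp(y)],
            [math.cos(y), -x * math.sin(y)]]

def inverse_jacobian(x, y):
    """d(x, y)/d(u, v), from the closed form in Example 13."""
    d = math.exp(y) * math.cos(y) + x * math.sin(y)
    return [[x * math.sin(y) / d, math.exp(y) / d],
            [math.cos(y) / d, -1.0 / d]]

def matmul(a, b):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

prod = matmul(jacobian(0.5, 1.2), inverse_jacobian(0.5, 1.2))
print(prod)   # should be the identity matrix up to rounding
```

The first column of `inverse_jacobian` holds $\partial x / \partial u$ and $\partial y / \partial u$, the quantities used to compute $\partial g / \partial u$ in the example.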
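In coordinates this is easy to compute by linear algebra. Let $x^1, \ldots, x^n$ be coordinates in $R^n$, $y^1, \ldots, y^m$ in $R^m$, and $z^1, \ldots, z^p$ in $R^p$. Then $\mathbf{f}$ and $\mathbf{g}$ are given in coordinates by

$$\mathbf{f}: y^i = f^i(x^1, \ldots, x^n) \qquad 1 \le i \le m$$
$$\mathbf{g}: z^j = g^j(y^1, \ldots, y^m) \qquad 1 \le j \le p$$

Let $\mathbf{h} = \mathbf{g} \circ \mathbf{f}$. Then $\mathbf{h}$ is given by the $p$-tuple of functions

$$z^j = h^j(x^1, \ldots, x^n) = g^j(f^1(x^1, \ldots, x^n), \ldots, f^m(x^1, \ldots, x^n))$$

(7.9) is the same as all these equations

$$\frac{\partial h^j}{\partial x^i}(\mathbf{p}) = \sum_{k=1}^{m} \frac{\partial g^j}{\partial y^k}(\mathbf{f}(\mathbf{p}))\, \frac{\partial f^k}{\partial x^i}(\mathbf{p}) \qquad 1 \le j \le p, \quad 1 \le i \le n \tag{7.10}$$

This is true since $d\mathbf{g}(\mathbf{f}(\mathbf{p}))$, $d\mathbf{f}(\mathbf{p})$ are represented by the matrices

$$\left( \frac{\partial g^j}{\partial y^k}(\mathbf{f}(\mathbf{p})) \right) \qquad \left( \frac{\partial f^k}{\partial x^i}(\mathbf{p}) \right)$$

respectively. We can rewrite (7.9) and (7.10) again in matrix form. The Jacobian of a composition is the product of the Jacobians, and (7.9), (7.10)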
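become

$$\frac{\partial(z^1, \ldots, z^p)}{\partial(x^1, \ldots, x^n)}(\mathbf{p}) = \frac{\partial(z^1, \ldots, z^p)}{\partial(y^1, \ldots, y^m)}(\mathbf{f}(\mathbf{p}))\, \frac{\partial(y^1, \ldots, y^m)}{\partial(x^1, \ldots, x^n)}(\mathbf{p})$$

Here is the proof of the chain rule.

Theorem 7.1. (The Chain Rule) Let $\mathbf{p}$ be a point in $R^n$. Suppose $\mathbf{f}$ is a differentiable $R^m$-valued function defined in a neighborhood of $\mathbf{p}$, and $\mathbf{g}$ is a differentiable $R^p$-valued function defined in a neighborhood of $\mathbf{f}(\mathbf{p})$. Then $\mathbf{h} = \mathbf{g} \circ \mathbf{f}$ is differentiable at $\mathbf{p}$ and $d\mathbf{h}(\mathbf{p}) = d\mathbf{g}(\mathbf{f}(\mathbf{p})) \circ d\mathbf{f}(\mathbf{p})$.

Proof. Let $T = d\mathbf{g}(\mathbf{f}(\mathbf{p}))$, and $S = d\mathbf{f}(\mathbf{p})$. We must show that

$$\|\mathbf{h}(\mathbf{p} + \mathbf{v}) - \mathbf{h}(\mathbf{p}) - T \circ S(\mathbf{v})\| \le \varepsilon(\mathbf{v})\, \|\mathbf{v}\| \tag{7.11}$$

where $\lim_{\|\mathbf{v}\| \to 0} \varepsilon(\mathbf{v}) = 0$. Let

$$\boldsymbol{\varphi}(\mathbf{v}) = \mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p}) - S(\mathbf{v}) \tag{7.12}$$
$$\boldsymbol{\psi}(\mathbf{w}) = \mathbf{g}(\mathbf{f}(\mathbf{p}) + \mathbf{w}) - \mathbf{g}(\mathbf{f}(\mathbf{p})) - T(\mathbf{w}) \tag{7.13}$$

Then, since $\mathbf{f}$, $\mathbf{g}$ are differentiable,

$$\|\boldsymbol{\varphi}(\mathbf{v})\| \le \delta(\mathbf{v})\, \|\mathbf{v}\| \qquad \|\boldsymbol{\psi}(\mathbf{w})\| \le \eta(\mathbf{w})\, \|\mathbf{w}\|$$

where $\delta(\mathbf{v}) \to 0$ as $\|\mathbf{v}\| \to 0$ and $\eta(\mathbf{w}) \to 0$ as $\|\mathbf{w}\| \to 0$. Now, we verify (7.11) by computation:

$$\mathbf{h}(\mathbf{p} + \mathbf{v}) - \mathbf{h}(\mathbf{p}) = \mathbf{g}(\mathbf{f}(\mathbf{p} + \mathbf{v})) - \mathbf{g}(\mathbf{f}(\mathbf{p})) = \mathbf{g}(\mathbf{f}(\mathbf{p})) + T(\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p})) + \boldsymbol{\psi}(\mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p})) - \mathbf{g}(\mathbf{f}(\mathbf{p}))$$

(by taking $\mathbf{w} = \mathbf{f}(\mathbf{p} + \mathbf{v}) - \mathbf{f}(\mathbf{p})$ in (7.13)). Now using (7.12) we can continue:

$$= T(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v})) + \boldsymbol{\psi}(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v})) = T \circ S(\mathbf{v}) + T(\boldsymbol{\varphi}(\mathbf{v})) + \boldsymbol{\psi}(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))$$

since $T$ is linear. Thus

$$\frac{\|\mathbf{h}(\mathbf{p} + \mathbf{v}) - \mathbf{h}(\mathbf{p}) - T \circ S(\mathbf{v})\|}{\|\mathbf{v}\|} \le \frac{\|T(\boldsymbol{\varphi}(\mathbf{v}))\| + \|\boldsymbol{\psi}(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))\|}{\|\mathbf{v}\|}$$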
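Now we must show that

$$\frac{\|T(\boldsymbol{\varphi}(\mathbf{v}))\|}{\|\mathbf{v}\|} + \frac{\|\boldsymbol{\psi}(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))\|}{\|\mathbf{v}\|} \to 0$$

as $\|\mathbf{v}\| \to 0$. As for the first term

$$\frac{\|T(\boldsymbol{\varphi}(\mathbf{v}))\|}{\|\mathbf{v}\|} \le \|T\|\, \frac{\|\boldsymbol{\varphi}(\mathbf{v})\|}{\|\mathbf{v}\|} \le \|T\|\, \delta(\mathbf{v})$$

which tends to zero as $\mathbf{v} \to \mathbf{0}$, so that is all right. The second term is

$$\frac{\|\boldsymbol{\psi}(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))\|}{\|\mathbf{v}\|} \le \eta(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))\, \frac{\|S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v})\|}{\|\mathbf{v}\|} \le \eta(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}))\, (\|S\| + \delta(\mathbf{v}))$$

As $\mathbf{v} \to \mathbf{0}$, so does $S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v}) \to \mathbf{0}$, and also $\eta(S(\mathbf{v}) + \boldsymbol{\varphi}(\mathbf{v})) \to 0$. The final parenthesis is bounded so the whole term tends to zero. We are through.

Finally, we wish to give a sufficient condition that an $n$-tuple of functions

$$y^1 = f^1(\mathbf{x}), \ldots, y^n = f^n(\mathbf{x})$$

gives a coordinate system in a domain $D$ in $R^n$. If $y^1, \ldots, y^n$ are coordinates, then we can invert these equations; that is, since the $y$'s suffice to determine points in $D$, we can compute the $x$ coordinates in terms of $y^1, \ldots, y^n$. Thus there are functions $x^1 = g^1(\mathbf{y}), \ldots, x^n = g^n(\mathbf{y})$ such that

$$\mathbf{x} = \mathbf{g}(\mathbf{y}) \quad \text{if and only if} \quad \mathbf{y} = \mathbf{f}(\mathbf{x})$$

in the domain $D$. Now the second condition defining a coordinate system is that the differentials $df^1, \ldots, df^n$ are independent. The inverse mapping theorem asserts that if this second condition is valid at a point, then the first must hold in a neighborhood of that point. Thus the independence of the differentials $df^1(\mathbf{p}), \ldots, df^n(\mathbf{p})$ is enough to guarantee that $y^1, \ldots, y^n$ are coordinates near $\mathbf{p}$.

Theorem 7.2. (Inverse Function Theorem) Let $\mathbf{F}$ be a continuously differentiable $R^n$-valued function defined in a neighborhood of $\mathbf{p}_0$ in $R^n$. Let $\mathbf{q}_0 = \mathbf{F}(\mathbf{p}_0)$. If the differential $d\mathbf{F}(\mathbf{p}_0)$ is nonsingular, then there is a neighborhood $U$ of $\mathbf{q}_0$ and a continuously differentiable mapping $\mathbf{G}$ defined on $U$ such that $\mathbf{G}(\mathbf{q}_0) = \mathbf{p}_0$ and for each $\mathbf{q}$ in $U$

$$\mathbf{F}(\mathbf{p}) = \mathbf{q} \quad \text{if and only if} \quad \mathbf{p} = \mathbf{G}(\mathbf{q})$$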
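Theorem 7.2 can be explored numerically with the simplified Newton iteration $\mathbf{p} \mapsto \mathbf{p} - d\mathbf{F}(\mathbf{p}_0)^{-1}[\mathbf{F}(\mathbf{p}) - \mathbf{q}]$, whose fixed point solves $\mathbf{F}(\mathbf{p}) = \mathbf{q}$. The following is a sketch under assumptions, not the text's own computation: it uses the map $u = x + e^y$, $v = x\cos y$ of Example 13, the base point $\mathbf{p}_0 = (1, 0)$ with $\mathbf{q}_0 = (2, 1)$, and a nearby target $\mathbf{q} = (2.05, 0.98)$, all chosen here for illustration.

```python
import math

def F(p):
    x, y = p
    return (x + math.exp(y), x * math.cos(y))

P0 = (1.0, 0.0)                          # illustrative base point (assumption)
# dF(P0) = [[1, 1], [1, 0]] for this map; its inverse is [[0, 1], [1, -1]].
DF0_INV = ((0.0, 1.0), (1.0, -1.0))

def solve_near(q, steps=60):
    """Iterate p <- p - dF(P0)^{-1} [F(p) - q], starting from P0."""
    p = P0
    for _ in range(steps):
        fx, fy = F(p)
        rx, ry = fx - q[0], fy - q[1]    # residual F(p) - q
        p = (p[0] - (DF0_INV[0][0] * rx + DF0_INV[0][1] * ry),
             p[1] - (DF0_INV[1][0] * rx + DF0_INV[1][1] * ry))
    return p

q = (2.05, 0.98)                         # a point near q0 = F(P0) = (2, 1)
p = solve_near(q)
print(p, F(p))                           # F(p) should reproduce q
```

Because the derivative is frozen at $\mathbf{p}_0$, each step is cheaper than a full Newton step, at the cost of linear rather than quadratic convergence; for $\mathbf{q}$ far from $\mathbf{q}_0$ the iteration need not converge at all.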
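Proof. Let us, for simplicity of notation, assume that $\mathbf{p}_0 = \mathbf{q}_0 = \mathbf{0}$. We have to show that if $\mathbf{q}$ is small enough the equation

$$\mathbf{F}(\mathbf{p}) - \mathbf{q} = \mathbf{0}$$

has a unique solution $\mathbf{p}$ in a neighborhood of $\mathbf{0}$. This suggests Newton's method for finding roots. The linear approximation to the mapping $\mathbf{p} \to \mathbf{F}(\mathbf{p}) - \mathbf{q}$ at a point $\mathbf{p}_1$ is given in terms of the differential:

$$\mathbf{p} \to \mathbf{F}(\mathbf{p}_1) - \mathbf{q} + d\mathbf{F}(\mathbf{p}_1)(\mathbf{p} - \mathbf{p}_1) \tag{7.14}$$

If $\mathbf{p}_1$ is near enough to $\mathbf{0}$, $d\mathbf{F}(\mathbf{p}_1)$ is nonsingular, so we can find a root of (7.14), namely,

$$\mathbf{p} = \mathbf{p}_1 - d\mathbf{F}(\mathbf{p}_1)^{-1}[\mathbf{F}(\mathbf{p}_1) - \mathbf{q}] \tag{7.15}$$

Now we consider the transformation $T_{\mathbf{q}}$ defined in a neighborhood of $\mathbf{0}$ by

$$T_{\mathbf{q}}(\mathbf{p}) = \mathbf{p} - d\mathbf{F}(\mathbf{0})^{-1}[\mathbf{F}(\mathbf{p}) - \mathbf{q}] \tag{7.16}$$

[For simplicity we have replaced $d\mathbf{F}(\mathbf{p}_1)$ in (7.15) by $d\mathbf{F}(\mathbf{0})$.] It is shown in the lemma below that for $\mathbf{q}$ sufficiently small, $T_{\mathbf{q}}$ is a contraction in a neighborhood of $\mathbf{0}$. Thus, for each $\mathbf{q}$ near $\mathbf{0}$, $T_{\mathbf{q}}$ has a unique fixed point, which we denote $\mathbf{G}(\mathbf{q})$. Clearly, $\mathbf{F}(\mathbf{p}) = \mathbf{q}$ if and only if $\mathbf{p}$ is the fixed point of $T_{\mathbf{q}}$, that is, if and only if $\mathbf{p} = \mathbf{G}(\mathbf{q})$.

It remains only to verify that $\mathbf{G}$ is differentiable. Let $\mathbf{q}_0$ be a point near $\mathbf{0}$, and $\mathbf{p}_0 = \mathbf{G}(\mathbf{q}_0)$. Let $T = d\mathbf{F}(\mathbf{p}_0)$. Then, by definition

$$\mathbf{F}(\mathbf{p}) - \mathbf{F}(\mathbf{p}_0) = T(\mathbf{p} - \mathbf{p}_0) + \boldsymbol{\varphi}(\mathbf{p} - \mathbf{p}_0) \tag{7.17}$$

where $\|\boldsymbol{\varphi}(\mathbf{p} - \mathbf{p}_0)\| \le \varepsilon(\mathbf{p} - \mathbf{p}_0)\, \|\mathbf{p} - \mathbf{p}_0\|$ and $\varepsilon(\mathbf{t}) \to 0$ as $\mathbf{t} \to \mathbf{0}$. Let $\mathbf{p} = \mathbf{G}(\mathbf{q})$. Then (7.17) becomes

$$\mathbf{q} - \mathbf{q}_0 = T(\mathbf{G}(\mathbf{q}) - \mathbf{G}(\mathbf{q}_0)) + \boldsymbol{\varphi}(\mathbf{G}(\mathbf{q}) - \mathbf{G}(\mathbf{q}_0))$$

Since $T$ is invertible this can be rewritten as

$$\mathbf{G}(\mathbf{q}) - \mathbf{G}(\mathbf{q}_0) = T^{-1}(\mathbf{q} - \mathbf{q}_0) - T^{-1}\boldsymbol{\varphi}(\mathbf{G}(\mathbf{q}) - \mathbf{G}(\mathbf{q}_0)) \tag{7.18}$$

If we can successfully study the behavior of the last term we will have verified the differentiability of $\mathbf{G}$ at $\mathbf{q}_0$, with

$$d\mathbf{G}(\mathbf{q}_0) = T^{-1} = d\mathbf{F}(\mathbf{p}_0)^{-1}$$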
But (7.18) gives us
$$\|G(q) - G(q_0)\| \le \|T^{-1}\|\,\|q - q_0\| + \|T^{-1}\|\,\varepsilon(G(q) - G(q_0))\,\|G(q) - G(q_0)\| \tag{7.19}$$
Since $G$ is continuous (by Problem 10), we may choose $q$ so close to $q_0$ that the last term is dominated by $\tfrac12\|G(q) - G(q_0)\|$. Then (7.19) implies
$$\|G(q) - G(q_0)\| \le 2\,\|T^{-1}\|\,\|q - q_0\|$$
and (7.18) produces this inequality, which guarantees differentiability:
$$\|G(q) - G(q_0) - T^{-1}(q - q_0)\| \le \|T^{-1}\|\,\|\varphi(G(q) - G(q_0))\| \le \|T^{-1}\|\,\varepsilon(G(q) - G(q_0))\,\|G(q) - G(q_0)\| \le 2\,\|T^{-1}\|^2\,\varepsilon(G(q) - G(q_0))\,\|q - q_0\|$$
and certainly
$$\lim_{q \to q_0} 2\,\|T^{-1}\|^2\,\varepsilon(G(q) - G(q_0)) = 0.$$

Here is the lemma which guarantees that the $T_q$ are contractions for $q$ near enough to 0:

Lemma 3. Given the hypotheses of Theorem 7.2, there is a $\delta > 0$ such that for $q \in B(0, \delta)$, the map
$$T(p) = p - dF(0)^{-1}(F(p) - q)$$
is a contraction on $B(0, \delta)$.

Proof. Let $p, p'$ be two points near 0 and consider the function
$$h(t) = p + t(p' - p) - dF(0)^{-1}\big(F(p + t(p' - p)) - q\big), \qquad 0 \le t \le 1$$
Then
$$T(p') - T(p) = h(1) - h(0) = \int_0^1 h'(t)\,dt \tag{7.20}$$
$$h'(t) = p' - p - dF(0)^{-1}\,dF(p + t(p' - p))(p' - p)$$
$$h'(t) = [I - dF(0)^{-1}\,dF(p + t(p' - p))](p' - p) \tag{7.21}$$
Now choose $\delta > 0$ so that
$$\|I - dF(0)^{-1}\,dF(x)\| < 1/2$$
for $\|x\| < \delta$. Then if $p, p' \in B(0, \delta)$, every $p + t(p' - p)$ is in $B(0, \delta)$ for $0 \le t \le 1$, so, using (7.21),
$$\|h'(t)\| \le \|I - dF(0)^{-1}\,dF(p + t(p' - p))\|\,\|p' - p\| < \tfrac12\,\|p' - p\|$$
Thus, by (7.20),
$$\|T(p) - T(p')\| \le \int_0^1 \tfrac12\,\|p' - p\|\,dt \le \tfrac12\,\|p' - p\|$$
so $T$ is a contraction in $B(0, \delta)$.

• EXERCISES

4. Compute the Jacobian
$$\frac{\partial(u^1, \dots, u^n)}{\partial(x^1, \dots, x^n)}$$
for each of the following functions and determine those points $(x^1, \dots, x^n)$ at which $u^1, \dots, u^n$ are coordinates:
(a) $u = x e^y$, $v = y e^x$
(b) $u^1 = x^1 + x^2 + x^3$, $u^2 = x^1x^2 + x^2x^3 + x^3x^1$, $u^3 = x^1x^2x^3$
(c) $u = x^2 - y^2$, $v = xy$
(d) $u = x^2 + y^2 + z^2$, $v = yx^{-1}$, $w = zx^{-1}$
(e) $u^1 = x^1$, $u^2 = x^2/x^1$, ..., $u^n = x^n/x^1$
(f) $u^1 = h_1(x)\,x^1$, ..., $u^n = h_n(x)\,x^n$, where $h_i(x) \ne 0$ for all $i$ and $x$

5. Express the differential of $f(x) = \sum_{i=1}^n (x^i)^2$ in terms of the coordinates $u^1, \dots, u^n$ given in Exercise 4(e).

6. Express $df$ in terms of the coordinates of Exercise 4(d), where
(a) $f(x) = \ln(x^2 + y^2 + z^2)$
(b) $f(x) = yz$
(c) $f(x) = x + y + z$
7. Compute the differential of $[(x - a)^2 + (y - b)^2 + (z - c)^2]^{-1}$ in spherical coordinates in $R^3 - \{(a, b, c)\}$.

8. What is the rate of change of the volume of a rectangular box with respect to the area of its surface, assuming the length of one side and the sum of the lengths of the other two sides are held fixed?

• PROBLEMS

5. Let $f$ be a differentiable function defined on a domain $D$ in $R^2$. Show that $f$ is a function of $x + y$ alone if and only if $\partial f/\partial x = \partial f/\partial y$ on $D$. (Hint: Consider the change of coordinates $u = x + y$, $v = x - y$.)

6. Give a condition guaranteeing that a differentiable function of two variables can be expressed as a function of $xy$.

7. Suppose that $f, g$ are two differentiable functions on $R^2$ with $\nabla g \ne 0$. Show that $f$ is a function of $g$ alone if and only if $\nabla f$, $\nabla g$ are everywhere collinear.

8. Show that for any twice differentiable function $f$ defined on the plane,
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = \frac{1}{r}\frac{\partial}{\partial r}\Big(r\,\frac{\partial f}{\partial r}\Big) + \frac{1}{r^2}\frac{\partial^2 f}{\partial \theta^2}$$

9. Show that for $f(z) = z^n$, $n \ne 0$,
$$\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} = 0$$

10. The proof of Theorem 7.2 is still incomplete: we must show that the function $G$ is continuous. There are two ways.
(a) Suppose $q_n \to q$. Let $q_n = F(p_n)$. Suppose that $p_n \to p$. Then
$$F(p) = \lim F(p_n) = \lim q_n = q$$
Applying $G$ we have
$$\lim G(q_n) = \lim p_n = p = G(F(p)) = G(q)$$
Thus $G$ is continuous, as desired. Why may we suppose that the sequence $\{p_n\}$ converges?
(b) In this approach we reprove Theorem 7.2 so that the continuity is automatic. For a sufficiently small $\varepsilon > 0$, we consider the space $C$ of continuous functions $h$ on $\{q \in R^n : \|q\| \le \varepsilon\}$ such that $h(0) = 0$. Define $T : C \to C$ by
$$T(h)(q) = h(q) - dF(0)^{-1}[F(h(q)) - q]$$
As in Lemma 3, show that $T$ is a contraction (on the space $C$ of continuous functions!). Thus $T$ has a fixed point $G$. Clearly, $F(G(q)) = q$ as desired, and the continuity of $G$ is assured.

11. Suppose that $f$ is a continuously differentiable function defined in a neighborhood of 0 in $R^3$, and $f(0) = 0$ and $\partial f/\partial z\,(0) \ne 0$. Then the equation $f(x, y, z) = 0$ implicitly defines $z$ as a function of $x$ and $y$. More precisely, there is a function $g$ defined for small enough $x, y$ such that $f(x, y, z) = 0$ if and only if $z = g(x, y)$ near the origin. This can be proven as a corollary of Theorem 7.2 as follows: applying Theorem 7.2 to the mapping
$$u = x, \qquad v = y, \qquad w = f(x, y, z) \tag{7.22}$$
we can find functions $h, k, g$ of $(u, v, w)$ such that (7.22) holds if and only if
$$x = h(u, v, w), \qquad y = k(u, v, w), \qquad z = g(u, v, w)$$
Obviously, $h(u, v, w) = u$ and $k(u, v, w) = v$. It follows that when $w = 0$, $z = g(u, v, 0) = g(x, y, 0)$. This is the desired conclusion.

12. Here is a similar fact. The proof should be analogous to the argument for Problem 11. Suppose $f, g$ are continuously differentiable near 0 in $R^3$ and that $f(0) = g(0) = 0$ and
$$\frac{\partial(f, g)}{\partial(x, y)}(0) \ne 0$$
Then there are continuously differentiable functions $h, k$ defined for small enough $z$ such that $f(x, y, z) = 0 = g(x, y, z)$ if and only if
$$x = h(z), \qquad y = k(z)$$
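The fixed-point argument in the proof of Theorem 7.2 can be tried numerically. The following is a minimal sketch, not part of the text: the map `F` is a hypothetical example chosen so that $F(0) = 0$ and $dF(0)$ is the identity, and the names `F`, `solve`, and `residual` are illustrative. It iterates $T_q(p) = p - dF(0)^{-1}[F(p) - q]$ starting from $p = 0$ and checks that the limit satisfies $F(p) = q$.

```python
def F(p):
    # Hypothetical example map (an assumption for illustration):
    # F(0) = 0 and dF(0) is the identity matrix.
    x, y = p
    return (x + 0.5 * y * y, y + 0.5 * x * x)


def solve(q, steps=50):
    # Iterate T_q(p) = p - dF(0)^{-1} [F(p) - q]; here dF(0)^{-1} = I.
    p = (0.0, 0.0)
    for _ in range(steps):
        fx, fy = F(p)
        p = (p[0] - (fx - q[0]), p[1] - (fy - q[1]))
    return p


q = (0.1, -0.05)           # a point near q0 = 0
p = solve(q)               # approximate value of G(q)
fx, fy = F(p)
residual = max(abs(fx - q[0]), abs(fy - q[1]))
print(p, residual)
```

For this particular `F` the iteration contracts quickly because the nonlinear part is small near the origin; for a general map, $dF(0)^{-1}$ would have to be computed and applied at each step.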
7.3 Differential Forms

The differential of a real-valued function defined on a domain $D$ in $R^n$ is a function defined on $D$ whose values are linear functions on $R^n$. A function of this type is called a differential form, and a central issue in the calculus of several variables is this: just when is a differential form the differential of a function? This problem is resolved by the generalization of the fundamental theorem of calculus, which in this chapter takes the form of Green's theorem. The one-variable fundamental theorem asserts that every differential form on an interval is the differential of a function. This is far from being true in several variables. For example, to say that $\sum a_i\,dx^i$ is the differential of a function $f$ is to assert that $a_i = \partial f/\partial x^i$. Since
$$\frac{\partial^2 f}{\partial x^i\,\partial x^j} = \frac{\partial^2 f}{\partial x^j\,\partial x^i} \qquad \text{all } i \text{ and } j \tag{7.23}$$
we must have $\partial a_i/\partial x^j = \partial a_j/\partial x^i$. This is not always the case: $e^x(dx + dy)$ and $y\,dx - x\,dy$ are not differentials of functions because the coefficients do not satisfy these conditions. We shall explore this situation at length in the following two sections.

Definition 3. Let $D$ be a domain in $R^n$. A differential form on $D$ is a function which associates to each point $p$ in $D$ a linear functional on $R^n$.

If $f$ is a differentiable function on $D$, then $df$ is a differential form on $D$. In particular, if $x_1, \dots, x_n$ is a coordinate system in $R^n$, then $dx_1, \dots, dx_n$ are differential forms on $R^n$. Furthermore, for any $p \in D$, $dx_1(p), \dots, dx_n(p)$ form a basis of the space of linear functionals on $R^n$, so any such functional is a linear combination of the $dx_i(p)$. Thus, the general differential form on $D$ is of the form $\sum_{i=1}^n a_i(p)\,dx_i(p)$, where the $a_i$ are real-valued functions on $D$.

Definition 4. Let $\omega$ be a differential form on the domain $D$, and write $\omega = \sum a_i\,dx_i$ relative to the standard coordinates of $R^n$.
We shall say that $\omega$ is a $k$-times (continuously) differentiable differential form on $D$ ($\omega \in C^k(D)$) if the functions $a_1, \dots, a_n$ are all $k$-times (continuously) differentiable.

Suppose now that $u_1, \dots, u_n$ are differentiable functions in $D \subset R^n$ and that $du_1(p), \dots, du_n(p)$ are independent at some $p \in D$. Such an $n$-tuple of functions forms a coordinate system near $p$: the mapping $u = (u_1, \dots, u_n)$
maps a neighborhood $D$ of $p$ onto a domain $D'$ in one-to-one fashion. Furthermore, $du_1(p), \dots, du_n(p)$ form a basis for $(R^n)^*$, so any differential form can be written as $\sum \alpha_i\,du_i$. We can compute the relation between the $a_i$ and the $\alpha_i$ by the chain rule: since $du^i = \sum_j (\partial u^i/\partial x^j)\,dx^j$, we have
$$a_j = \sum_i \alpha_i\,\frac{\partial u^i}{\partial x^j} \tag{7.24}$$
Thus differential forms transform under a coordinate change as the differential of a function (compare Equations (7.4) and (7.24)). Now the equality of mixed partials of a twice differentiable function gives a necessary condition for a differential form to be the differential of a function.

Proposition 3. Let $\omega$ be a continuously differentiable differential form in a domain $D$. Suppose $u^1, \dots, u^n$ is any coordinate system for $D$. If $\omega = \sum \alpha_i\,du^i$ is the differential of a function we must have
$$\frac{\partial \alpha_i}{\partial u^j} = \frac{\partial \alpha_j}{\partial u^i} \qquad 1 \le i, j \le n \tag{7.25}$$

Proof. If $\omega = df$, then $\alpha_i = \partial f/\partial u^i$. Then
$$\frac{\partial \alpha_i}{\partial u^j} = \frac{\partial}{\partial u^j}\Big(\frac{\partial f}{\partial u^i}\Big) = \frac{\partial}{\partial u^i}\Big(\frac{\partial f}{\partial u^j}\Big) = \frac{\partial \alpha_j}{\partial u^i}$$

Closed and Exact Forms

We shall say that a differential form is exact in a domain $D$ if it is the differential of a function, and closed if Equations (7.25) hold. It is easily verified that if these equations hold in one coordinate system, then they hold in all coordinate systems (see Problem 13); so it is not too difficult to verify that a form is closed. In the plane a form has the expression $\omega = p\,dx + q\,dy$ with respect to the rectangular coordinates. In this case there is only one nontrivial equation in (7.25), namely,
$$\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y} = 0 \tag{7.26}$$
We shall refer to this function as $d\omega$; that is, if $\omega = p\,dx + q\,dy$,
$$d\omega = \Big(\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}\Big)\,dx\,dy$$
Thus, a differential form on a plane is closed if $d\omega = 0$ and is exact if $\omega = df$.

Examples

14. $\omega = x\,dx + y\,dy$, $d\omega = 0 - 0 = 0$. In fact, $\omega = d\big((x^2 + y^2)/2\big)$.

15. $\omega = y\,dx + x\,dy$, $d\omega = 1 - 1 = 0$. Here $\omega$ is also exact, since $\omega = d(xy)$.

16. $\omega = y\,dx - x\,dy$, $d\omega = -1 - 1 = -2$, so $\omega$ is not closed. Notice however that $y^{-2}\omega$ is closed, since it is exact (except for $y = 0$, where it is not defined): $y^{-2}\omega = d(x/y)$.

17. Integrating factors. Let $\omega = p\,dx + q\,dy$ be a differential form given in a neighborhood of $p_0$. The vector field $(-q, p)$ can be realized as the field of tangents to a family of curves, as we saw in Chapter 4. Let this family be given implicitly by
$$F(x, y) = c$$
Thus, since $F(x, y)$ is constant on these curves, its derivative along the curve is zero; or, what is the same,
$$dF(x, y)\big(-q(x, y),\ p(x, y)\big) = 0$$
Since $dF$ and $\omega$ annihilate the same vectors at each point, they are collinear. Thus there is a function $\lambda(x, y)$ such that
$$dF = \lambda\,\omega$$
We conclude that for any differential form $\omega$ there is a factor $\lambda$ such that $\lambda\omega$ is exact. This is true in two dimensions, and fails in higher dimensions. $\lambda$ is called an integrating factor for $\omega$.

18. The polar coordinate $\theta$ is not a well-defined function on the domain $R^2 - \{0\}$, but its differential is:
$$d\theta = d\Big(\arctan\frac{y}{x}\Big) = \frac{-y\,dx + x\,dy}{x^2 + y^2}$$
Thus this form is closed, but not exact on the domain $R^2 - \{0\}$.
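Example 18 can be checked numerically. The sketch below is an illustration only (the helper names `p`, `q`, `d_omega`, and `circle_integral` are assumptions, not notation from the text): it estimates $\partial q/\partial x - \partial p/\partial y$ for $\omega = (-y\,dx + x\,dy)/(x^2 + y^2)$ by central differences at a sample point, and integrates $\omega$ around a circle centered at the origin with a midpoint rule. A closed form that were exact would integrate to zero around every closed curve; this one does not.

```python
import math


def p(x, y):
    # Coefficient of dx in omega = (-y dx + x dy) / (x^2 + y^2)
    return -y / (x * x + y * y)


def q(x, y):
    # Coefficient of dy in omega
    return x / (x * x + y * y)


def d_omega(x, y, h=1e-5):
    # Central-difference estimate of dq/dx - dp/dy at (x, y)
    qx = (q(x + h, y) - q(x - h, y)) / (2 * h)
    py = (p(x, y + h) - p(x, y - h)) / (2 * h)
    return qx - py


def circle_integral(radius, n=2000):
    # Midpoint-rule approximation of the integral of omega around the
    # circle x = radius*cos t, y = radius*sin t, 0 <= t <= 2*pi.
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x, y = radius * math.cos(t), radius * math.sin(t)
        dx, dy = -radius * math.sin(t), radius * math.cos(t)
        total += (p(x, y) * dx + q(x, y) * dy) * dt
    return total


closedness = d_omega(0.7, -1.3)   # should be close to 0
loop = circle_integral(1.0)       # should be close to 2*pi
print(closedness, loop)
```

The loop integral does not depend on the radius, in keeping with the fact that $\omega$ is closed on $R^2 - \{0\}$.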
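We shall now verify that every closed form on $R^2 - \{0\}$ is equal to an exact form plus a constant multiple of $d\theta$. Thus the space of closed forms on $R^2 - \{0\}$ is larger than the space of exact forms by one dimension. Suppose that $\omega$ is a closed form in $R^2 - \{0\}$. In polar coordinates,
$$\omega(re^{i\theta}) = a(re^{i\theta})\,dr + b(re^{i\theta})\,d\theta$$
and since $\omega$ is closed we have $\partial b/\partial r = \partial a/\partial \theta$. It follows that
$$F(r) = \int_0^{2\pi} b(re^{i\theta})\,d\theta$$
is a constant. For
$$\frac{dF}{dr} = \int_0^{2\pi} \frac{\partial b}{\partial r}\,d\theta = \int_0^{2\pi} \frac{\partial a}{\partial \theta}\,d\theta = a(re^{2\pi i}) - a(re^{0}) = a(r) - a(r) = 0$$
Let $c(\omega)$ be that constant. Notice that $c(d\theta) = 2\pi$. Further, if $\omega = df$, then $c(\omega) = 0$. For
$$c(\omega) = \int_0^{2\pi} b\,d\theta = \int_0^{2\pi} \frac{\partial f}{\partial \theta}\,d\theta = f(re^{2\pi i}) - f(re^{0}) = 0$$
Conversely, if $c(\omega) = 0$, then $\omega$ is the differential of a function defined on $R^2 - \{0\}$. Let
$$f(r, \theta) = \int_1^r a(t)\,dt + \int_0^\theta b(re^{i\varphi})\,d\varphi \tag{7.27}$$
Since $c(\omega) = 0$, $f(r, \theta + 2\pi) = f(r, \theta)$ for all $r, \theta$, so we can define a function $F$ on $R^2 - \{0\}$ by $F(re^{i\theta}) = f(r, \theta)$. Differentiating (7.27), we have
$$\frac{\partial F}{\partial r} = \frac{\partial f}{\partial r} = a(r) + \int_0^\theta \frac{\partial b}{\partial r}(re^{i\varphi})\,d\varphi = a(r) + \int_0^\theta \frac{\partial a}{\partial \varphi}(re^{i\varphi})\,d\varphi = a(re^{i\theta})$$
$$\frac{\partial F}{\partial \theta} = \frac{\partial f}{\partial \theta} = b(re^{i\theta})$$
Thus, $dF = \omega$. Finally, if $\omega$ is any closed form on $R^2 - \{0\}$, let $\eta = \omega - c(\omega)\,d\theta/2\pi$. Then
$$c(\eta) = c(\omega) - \frac{c(\omega)}{2\pi}\,2\pi = 0$$
so $\eta$ is exact: $\eta = dF$. Thus
$$\omega = dF + \frac{c(\omega)}{2\pi}\,d\theta$$

• EXERCISES

9. Which of the following forms are closed?
(a) $\sum_{i=1}^n x^i\,dx^i$
(b) $xy\,dz + yz\,dx + zx\,dy$
(c) $xyz\,(dx + dy + dz)$
(d) $r\,dr + d\theta$
(e) $r^2\,dr + r\,d\theta$
(f) $r\sin\theta\,dr + r\cos\theta\,d\theta$
(g) $r\sin\varphi\,dr + r\cos\varphi\,\sin\theta\,d\theta + r\sin\varphi\,d\varphi$
(h) $d(xe^{y}\cos(xyz))$
(i) $x_1x_2\,dx_3 + x_2x_3\,dx_4 + x_3x_4\,dx_1 + x_4x_1\,dx_2$
(j) $x_1\,dx_2 + x_3\,dx_4 + x_5\,dx_6$
(k) $x_1\,dx_2 + x_2\,dx_3 + x_3\,dx_1$

10. Is the form $(z - a)^{-1}\,dz$ exact in $C - \{a\}$? Is its real part exact? Is its imaginary part exact?

11. Find integrating factors for the following forms:
(a) $x\,(dy + dx)$
(b) $xy\,(dx + dy)$
(c) $-y\,dx + x\,dy$
(d) $x\,dy$
(e) $e^{x+y}\,dx + e\,dy$
(f) $\sin x\,dx + \cos x\,dy$

• PROBLEMS

13. Let $(x^1, \dots, x^n)$ and $(u^1, \dots, u^n)$ be two coordinate systems valid in a domain $D$ in $R^n$. Let $\omega$ be a differential form defined in $D$ and write $\omega$ in terms of these coordinates as
$$\omega = \sum a_i\,dx^i = \sum \alpha_i\,du^i$$
Show that if
$$\frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i} \qquad \text{for all } i, j$$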
then
$$\frac{\partial \alpha_i}{\partial u^j} = \frac{\partial \alpha_j}{\partial u^i} \qquad \text{for all } i, j$$

14. Let the hypotheses be the same as in Problem 13, but this time suppose $n = 2$. Show that
$$\frac{\partial a_1}{\partial x_2} - \frac{\partial a_2}{\partial x_1} = \Big(\frac{\partial \alpha_1}{\partial u_2} - \frac{\partial \alpha_2}{\partial u_1}\Big)\frac{\partial(u_1, u_2)}{\partial(x_1, x_2)}$$

15. Show that the space of closed forms on $R^2 - \{0, 1\}$ is larger than the space of exact forms by 2 dimensions. (Hint: Let $\theta_0$ be the ordinary polar angle, and let $\theta_1(p)$ be the angle between the ray from 1 to $p$ and the horizontal. Then if $\omega$ is closed in $R^2 - \{0, 1\}$ there are constants $a, b$ such that $\omega - a\,d\theta_0 - b\,d\theta_1$ is exact.)

7.4 Work and Conservative Fields

Suppose we have a field of forces $F$ given in a domain $D$ in $R^n$: $F(x)$ is the force felt by a unit mass situated at the point $x$. In moving an object of mass $m$ along a certain path a certain amount of energy is expended; this is called work. In this section we shall describe the computation of work.

Suppose first that a body of mass $m$ moving in a straight line experiences a force of magnitude $F$ per unit mass operating in the direction opposite the motion. Then, by definition, the work required to move that body a distance $d$ is $-F \cdot m \cdot d$. In a more complicated situation the force acts in space in a fixed direction with a certain magnitude; thus the force is represented by a vector $F$. Suppose we want to move a body of mass $m$ from a point $a$ to another point $b$. The work required for this movement will depend only on the component of the force in the direction of motion and will be given again by $-F_0 \cdot m \cdot d$, where $F_0$ is this component and $d$ is the distance between $a$ and $b$. That is, if $b - a = dE$, where $E$ is a unit vector, then $F_0 = \langle F, E\rangle$ and the work is
$$-\langle F, E\rangle\,m\,d = -m\,\langle F, b - a\rangle$$
Now, in general, the force is not necessarily constant, but varies with position. The general situation is that of a force given by a vector field (vector-valued function) $F$ on $R^3$. Suppose that for some perverse reason we desire to move a given body from $a$ to $b$ along a particular path $\Gamma$. As
is customary we try to adapt the above formula to this revised situation by assuming that the force field varies little over small intervals (that is, $F$ is continuous) and that the path is very close to being a sequence of straight line segments. Then we get a reasonable approximation to the total work involved by adding up the work required over each line segment, assuming the force is constant there. More precisely, then, we select a very large number of points $a = p_0, p_1, \dots, p_s = b$ numbered sequentially along the path (see Figure 7.2). The work we seek is then approximated by
$$-m \sum_{i=1}^{s} \langle F(p_i),\ p_i - p_{i-1}\rangle \tag{7.28}$$
We define the work as the limit of all such sums as the maximum of the distance between successive points tends to zero, and we expect that, as usual, the calculus will make that computable. And it does.

[Figure 7.2]

Suppose given, for example, a field of force $F$ in a domain $D$ in $R^3$; then $F = (f_1(x), f_2(x), f_3(x))$ is an $R^3$-valued function defined on $D$. Suppose $\Gamma$ is an oriented curve in $D$, given by the parametrization
$$x = g(t) = (g_1(t), g_2(t), g_3(t)), \qquad a \le t \le b$$
We shall now compute the work done in moving a particle of mass $m$ from $g(a)$ to $g(b)$. Let $g(a) = p_0, \dots, p_s = g(b)$ be a very large number of points situated along $\Gamma$. Referring to the parametrization we can write $p_0 = g(t_0)$, $p_1 = g(t_1)$, ..., $p_s = g(t_s)$, with $a = t_0 < t_1 < \cdots < t_s = b$. Then the approximate work done is given by
(7.28):
$$-m \sum_{i=1}^{s} \langle F(g(t_i)),\ g(t_i) - g(t_{i-1})\rangle \tag{7.29}$$
$$= -m \sum_{i=1}^{s} \Big\{ f_1(g(t_i))[g_1(t_i) - g_1(t_{i-1})] + f_2(g(t_i))[g_2(t_i) - g_2(t_{i-1})] + f_3(g(t_i))[g_3(t_i) - g_3(t_{i-1})] \Big\}$$
By the mean value theorem, there are $\xi_{1,i}, \xi_{2,i}, \xi_{3,i}$ such that
$$g_j(t_i) - g_j(t_{i-1}) = g_j'(\xi_{j,i})(t_i - t_{i-1}), \qquad t_{i-1} < \xi_{j,i} < t_i$$
Thus the approximating sum (7.29) becomes
$$-m \sum_{i=1}^{s} \Big[\sum_{j=1}^{3} f_j(g(t_i))\,g_j'(\xi_{j,i})\Big](t_i - t_{i-1}) \tag{7.30}$$
which is a typical Riemann sum approximating
$$-m \int_a^b \sum_{j=1}^{3} f_j(g(t))\,g_j'(t)\,dt = -m \int_a^b \langle F(g(t)),\ g'(t)\rangle\,dt \tag{7.31}$$
In fact, as the "very large number of points" on $\Gamma$ becomes infinite, the sums (7.30) do tend to the integral (7.31), so we are justified in referring to this as the work required to move the mass along $\Gamma$. We are thus led to this definition of work:

Definition 5. Let $D$ be a domain in $R^n$ and $F$ a force field defined in $D$; that is, $F$ is an $R^n$-valued function on $D$. Let $\Gamma$ be an oriented curve defined in $D$. The work required to move a unit mass along $\Gamma$ is
$$W(\Gamma, F) = -\int_a^b \langle F(g(t)),\ g'(t)\rangle\,dt$$
where $g : [a, b] \to \Gamma$ is a parametrization of $\Gamma$.

Notice that since $W(\Gamma, F)$ is the limit of a collection of sums defined independently of any particular parametrization, $W(\Gamma, F)$ is also independent of the parametrization.
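Definition 5 translates directly into a numerical recipe. The following is a rough sketch under stated assumptions: the field $F(x, y) = (-y, x)$ and the curve $y = x^2$ from $(0, 0)$ to $(1, 1)$ are hypothetical examples, the function name `work` is illustrative, and $g'$ is estimated by central differences rather than supplied exactly. It evaluates $-\int_a^b \langle F(g(t)), g'(t)\rangle\,dt$ with a midpoint rule for two different parametrizations of the same oriented curve, which should agree.

```python
def work(F, g, a, b, n=20000, h=1e-6):
    # Midpoint-rule approximation of W = -integral_a^b <F(g(t)), g'(t)> dt,
    # with g'(t) estimated by a central difference of step h.
    dt = (b - a) / n
    total = 0.0
    for k in range(n):
        t = a + (k + 0.5) * dt
        gp = [(u - v) / (2 * h) for u, v in zip(g(t + h), g(t - h))]
        total += sum(f * d for f, d in zip(F(g(t)), gp)) * dt
    return -total


def field(point):
    # Hypothetical force field F(x, y) = (-y, x)
    x, y = point
    return (-y, x)


# Two parametrizations of the arc of y = x^2 from (0, 0) to (1, 1)
w1 = work(field, lambda t: (t, t * t), 0.0, 1.0)
w2 = work(field, lambda s: (s * s, s ** 4), 0.0, 1.0)
print(w1, w2)
```

For this field and curve the integrand in the first parametrization is $t^2$, so the exact value is $-1/3$; both computed values should be close to it.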
Sometimes paths of motion have a break in direction (see Figure 7.3). Such a curve is called a piecewise continuously differentiable curve, or a path for short. More precisely, we make the following definition.

[Figure 7.3]

Definition 6. An oriented path is the image of an interval $[a, b]$ under a continuous function $f$ such that
(i) $f$ is continuously differentiable with nonzero derivative at all but finitely many points $t_1, \dots, t_s$.
(ii) $\lim_{t \to t_i^-} f'(t)$ and $\lim_{t \to t_i^+} f'(t)$ exist (but are not necessarily equal) and are nonzero.
If $f(a) = f(b)$ the path is said to be closed.

If $\Gamma$ is an oriented path we can write $\Gamma = \Gamma_1 + \cdots + \Gamma_{s+1}$, where the $\Gamma_i$ are the oriented curves between the points $f(t_{i-1})$ and $f(t_i)$ (with $t_0 = a$, $t_{s+1} = b$). We define the work $W(\Gamma, F)$ by
$$W(\Gamma, F) = \sum_i W(\Gamma_i, F)$$

Examples

19. Let $F(x, y) = (-y, x^2)$ be a force field in $R^2$. The work done by moving a unit mass around the unit circle is found this way. First, we parametrize the circle:
$$\Gamma:\ x = (\cos t, \sin t), \qquad 0 \le t \le 2\pi$$
Then
$$W(\Gamma, F) = -\int_0^{2\pi} \langle(-\sin t, \cos^2 t),\ (-\sin t, \cos t)\rangle\,dt = -\int_0^{2\pi} (\sin^2 t + \cos^3 t)\,dt = -\pi$$

20. For the same force field, find the work done around the boundary $\Gamma$ of the rectangle with opposite corners $(0, 0)$ and $(1, 1)$, traversed counterclockwise. Here $\Gamma = \Gamma_1 + \Gamma_2 + \Gamma_3 + \Gamma_4$, where
$$\Gamma_1:\ x = (t, 0) \qquad 0 \le t \le 1$$
$$\Gamma_2:\ x = (1, t) \qquad 0 \le t \le 1$$
$$\Gamma_3:\ x = (1 - t, 1) \qquad 0 \le t \le 1$$
$$\Gamma_4:\ x = (0, 1 - t) \qquad 0 \le t \le 1$$
Then
$$W(\Gamma, F) = -\Big[\int_0^1 \langle(0, t^2), (1, 0)\rangle\,dt + \int_0^1 \langle(-t, 1), (0, 1)\rangle\,dt + \int_0^1 \langle(-1, (1 - t)^2), (-1, 0)\rangle\,dt + \int_0^1 \langle(-(1 - t), 0), (0, -1)\rangle\,dt\Big]$$
$$= -\int_0^1 (0 + 1 + 1 + 0)\,dt = -2$$

21. Let $F(x, y, z) = (yz, xz, xy)$ and compute the work done along one full loop of the helix
$$\Gamma:\ x = (\cos t, \sin t, t), \qquad 0 \le t \le 2\pi$$
$$W(\Gamma, F) = -\int_0^{2\pi} \langle(t\sin t,\ t\cos t,\ \sin t\cos t),\ (-\sin t, \cos t, 1)\rangle\,dt = -\int_0^{2\pi} (-t\sin^2 t + t\cos^2 t + \sin t\cos t)\,dt = 0$$

22. Compute the work done in the presence of the same force field along the curve $x = 1$, $y = 0$, $0 \le z \le 2\pi$. Here $\Gamma$ is given parametrically by
$$\Gamma:\ x = (1, 0, t), \qquad 0 \le t \le 2\pi$$
Thus
$$W(\Gamma, F) = -\int_0^{2\pi} \langle(0, t, 0),\ (0, 0, 1)\rangle\,dt = 0$$

Conservation of Energy

Now let us suppose we are given a field of forces on a domain $D$. Let $\Gamma$ be a closed path in $D$. Under optimal conditions we would hope for no loss of energy in moving a mass around $\Gamma$. We shall call a field conservative if this is the case; that is, the field $F$ is conservative if $W(\Gamma, F) = 0$ for every closed path $\Gamma$. Not every field is conservative, as Examples 19 and 20 show.

In case $F$ is a conservative field, the work required to move a unit mass from one point $p_0$ to another $p_1$ will be the same no matter what path from $p_0$ to $p_1$ is followed. For suppose we take two such oriented paths $\Gamma$, $\Gamma'$. Then the path from $p_0$ to $p_0$ obtained by first traversing $\Gamma$ and then $-\Gamma'$ ($\Gamma'$ oriented from $p_1$ to $p_0$) is a closed path. Thus $W(\Gamma - \Gamma', F) = 0$ since $F$ is conservative. But $W(\Gamma - \Gamma', F) = W(\Gamma, F) - W(\Gamma', F)$, so
$$W(\Gamma, F) = W(\Gamma', F)$$

Definition 7. Let $F$ be a conservative field defined in the domain $D$. A potential function for $F$ is a real-valued function $\Pi$ defined on $D$ such that, for any path $\Gamma$ from $p$ to $p'$,
$$W(\Gamma, F) + \Pi(p') - \Pi(p) \tag{7.32}$$
is a constant.

$\Pi$ is sometimes called the potential energy of the force field $F$, and the constancy of (7.32) is just the assertion that a conservative force field obeys the law of conservation of energy. We can relate the potential function of a conservative field to the field by its differential. We obtain this important result:

Theorem 7.3. Suppose that $D$ is a domain such that any two points can be joined by a path (we say $D$ is pathwise connected).
(i) Every conservative field on $D$ has a potential function.
(ii) Two potentials of the given field differ by a constant.
(iii) If the field $F = (f_1, \dots, f_n)$ has the potential function $\Pi$, then
$$d\Pi = f_1\,dx^1 + \cdots + f_n\,dx^n$$

Proof. (i) Suppose that $F = (f_1, \dots, f_n)$ is a conservative field defined on $D$. Then if $\Gamma$ and $\Gamma'$ are two oriented curves with the same end points, $W(\Gamma, F) = W(\Gamma', F)$ since $F$ is conservative. Fix $p_0 \in D$. Since $D$ is pathwise connected, if $p$ is any point of $D$ there is a curve $\Gamma$ from $p_0$ to $p$. Define $\Pi(p) = -W(\Gamma, F)$. $\Pi(p)$ is a well-defined function of $p$ since the work required does not depend on the choice of $\Gamma$. Now let $p$ and $p'$ be two points in $D$, and let $\Gamma$ be a path from $p$ to $p'$. If $\Gamma_0$ is a curve from $p_0$ to $p$, then $\Gamma + \Gamma_0$ is a curve from $p_0$ to $p'$, so
$$-W(\Gamma_0, F) = \Pi(p), \qquad -W(\Gamma + \Gamma_0, F) = \Pi(p')$$
But $W(\Gamma + \Gamma_0, F) = W(\Gamma, F) + W(\Gamma_0, F) = W(\Gamma, F) - \Pi(p)$. Thus $-\Pi(p') = W(\Gamma, F) - \Pi(p)$, or $W(\Gamma, F) + \Pi(p') - \Pi(p) = 0$, so (i) is proven.

(ii) If $\Pi'$ is another potential and $\Gamma_0$ is a curve joining $p_0$ to $p$, then by the above definition
$$\Pi'(p) - \Pi'(p_0) + W(\Gamma_0, F)$$
is a constant, say $C$. But $W(\Gamma_0, F) = -\Pi(p)$, by definition; thus
$$\Pi'(p) - \Pi(p) = C + \Pi'(p_0)$$
another constant. Thus two potentials for the field $F$ indeed differ by a constant.

(iii) Finally, we prove that $d\Pi = \sum f_i\,dx^i$. Let $p \in D$. Fix $i$, and let $\varepsilon$ be so small that the ball $B(p, \varepsilon) \subset D$. Let $\Gamma_\varepsilon$ be the curve with this parametrization:
$$g(t) = p + tE_i, \qquad 0 \le t \le \varepsilon$$
Since $\Pi$ is a potential for $F$,
$$\Pi(p + \varepsilon E_i) - \Pi(p) = -W(\Gamma_\varepsilon, F) = \int_0^\varepsilon \langle F(g(t)),\ g'(t)\rangle\,dt$$
Now $g'(t) = E_i$ and
$$\langle F(g(t)),\ g'(t)\rangle = \sum_j f_j(p + tE_i)\,\langle E_j, E_i\rangle = f_i(p + tE_i)$$
Thus
$$\Pi(p + \varepsilon E_i) - \Pi(p) = \int_0^\varepsilon f_i(p + tE_i)\,dt$$
Thus
$$\frac{\partial \Pi}{\partial x^i}(p) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon}\int_0^\varepsilon f_i(p + tE_i)\,dt = f_i(p)$$
and so the proof of the theorem is concluded.

• EXERCISES

12. Find the work required to move a unit mass around the given path $\Gamma$ in the presence of the given force field:
(a) $F(x, y) = (x, y)$; $\Gamma$: unit circle
(b) $F(x, y) = (y^2, y - x^2)$; $\Gamma$: boundary of the triangle with vertices at (0, 0), (0, 2), (0, 1)
(c) $F(x, y) = (1, x)$; $\Gamma$: $z(t) = \exp(1 + i)t$ from $t = 0$ to $t = 1$
(d) $F(x, y, z) = (-y, x, z)$; $\Gamma$: $x = (\cos t, \sin t, t)$
(e) $F(x, y) = (x, xy)$; $\Gamma$: the portion of the parabola $y = kx^2$ from $(0, 0)$ to $(a, ka^2)$
(f) $F(x, y, z) = (z, x^2, y)$; $\Gamma$: closed polygon with successive vertices (0, 0, 0), (2, 0, 0), (2, 3, 0), (0, 0, 1), (0, 0, 0)

13. Which of these fields are conservative?
(a) $F(x, y) = (\cos x, \cos y, \sin x \sin y)$
(b) $F(x, y) = (\cos x \cos y, -\sin x \sin y)$
(c) $F(x, y) = (x, y)$
(d) $F(x, y, z) = (y, z, x)$
(e) $F(x, y, z) = (-y, x, 1)$
(f) $-(x^2 + y^2)^{-1/2}(x, y)$
(g) $(x^2 + y^2)^{-1/2}(-y, x)$

• PROBLEMS

16. Let $F(x, y) = (A(x), B(y))$. Show that $W(\Gamma, F) = 0$ for any closed path $\Gamma$.

17. Find potential functions for these fields:
(a) $F(x, y, z) = -(0, 0, 1)$
(b) $F(x, y, z) = -(x^2 + y^2 + z^2)^{-1/2}(x, y, z)$
(c) $F(x, y, z) = (y, x, 1)$
(d) $F(x, y, z) = xy\,dz + yz\,dx + zx\,dy$

18. Let $F$ be a force field in the domain $D$ and $\Gamma$ an oriented path in $D$ from $p_0$ to $p$. Show that the work $W(\Gamma, F)$ can be written as
$$-\int_\Gamma \|F\| \cos\theta\,ds$$
where $s$ is arc length along $\Gamma$, and $\theta$ is the angle between $F$ and the tangent to $\Gamma$.
19. Suppose the field $F$ has the potential function $\Pi$. The surfaces $\Pi = \text{constant}$ are called equipotential surfaces for the field $F$.
(a) What are the equipotential surfaces for a central force field?
(b) What are the equipotentials for the fields of Exercise 13 which are conservative?

20. Show that if $F$ is a conservative force field in $R^2$, the lines of force for $F$ are orthogonal to the equipotential curves for $F$.

21. If $F$ is a vector field in a domain $D$ in the plane, we define $*F$ as the field perpendicular and clockwise to $F$ of the same magnitude. Verify this relation between $F$ and $*F$: if $F(p) = (A_1(p), A_2(p))$, then $*F(p) = (-A_2(p), A_1(p))$.

22. Suppose both $F$ and $*F$ are conservative fields with potentials $\Pi$, $\Pi^*$, respectively. Show that:
(a) $\Pi$ is harmonic.
(b) $\Pi + i\,\Pi^*$ satisfies the Cauchy-Riemann equations.

23. If $f = u + iv$ is a complex analytic function, show that $u$ is the potential for a field $F$ such that $*F$ is also conservative (and has potential function $v$).

24. A vector field $F$ is called radial if it is central and its magnitude is a function of the radius. Show that if $F$ is a nonzero radial vector field it is conservative, but $*F$ is not.

7.5 Integration of Differential Forms

The study of work has led us to differentials of functions via the obvious relation between vector fields and differential forms. If $F = (f_1, \dots, f_n)$ is a vector field defined on a domain in $R^n$, the differential form $\sum_{i=1}^n f_i\,dx^i$ will be denoted $\langle F, dx\rangle$ (for obvious reasons). According to the results of Section 7.4, the field $F$ is conservative if and only if the form $\langle F, dx\rangle$ is exact. In this case $\langle F, dx\rangle = d\Pi$, where $\Pi$ is a potential function for the field $F$. On the other hand, if $\omega$ is a form we can write $\omega = \langle F, dx\rangle$ for some vector field $F$ (if $\omega = \sum a_i\,dx^i$, then $F = (a_1, \dots, a_n)$).
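We can thus rely on the notion of work to define the integral of $\omega$ over a path $\Gamma$:
$$\int_\Gamma \omega = \int_\Gamma \langle F, dx\rangle = -W(\Gamma, F) \tag{7.33}$$
Thus, if $\omega = \sum a_i\,dx^i$ and $\Gamma$ is parametrized explicitly by $x^i = x^i(t)$ for $a \le t \le b$, then
$$\int_\Gamma \omega = \int_a^b \sum_i a_i\,\frac{dx^i}{dt}\,dt \tag{7.34}$$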
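Formula (7.34) is easy to evaluate numerically. The sketch below is illustrative only (the function name `integrate_form` and the two sample paths are assumptions): it applies a midpoint rule to (7.34) for the exact form $\omega = y\,dx + x\,dy = d(xy)$ along two different paths from $(0, 0)$ to $(1, 2)$. Because $\omega$ is exact, both integrals should equal the change in $xy$ between the endpoints, namely 2.

```python
def integrate_form(a, x, dx, t0, t1, n=10000):
    # Midpoint-rule approximation of (7.34):
    #   integral_{t0}^{t1} sum_i a_i(x(t)) * (dx^i/dt)(t) dt
    # a  : coefficients of the form, as a function of the point
    # x  : parametrization t -> point
    # dx : derivative of the parametrization, supplied analytically
    dt = (t1 - t0) / n
    total = 0.0
    for k in range(n):
        t = t0 + (k + 0.5) * dt
        total += sum(ai * di for ai, di in zip(a(x(t)), dx(t))) * dt
    return total


def omega(point):
    # Coefficients (a_1, a_2) of omega = y dx + x dy
    x, y = point
    return (y, x)


# Straight segment from (0, 0) to (1, 2)
line = integrate_form(omega, lambda t: (t, 2 * t), lambda t: (1.0, 2.0), 0.0, 1.0)

# A curved path with the same endpoints: x = t^2, y = 2 t^3
curve = integrate_form(
    omega, lambda t: (t * t, 2 * t ** 3), lambda t: (2 * t, 6 * t * t), 0.0, 1.0
)
print(line, curve)
```

Replacing `omega` by a form that is not closed, such as $-y\,dx + x\,dy$, would in general give different values along the two paths.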
The idea of defining the integral of a form in terms of work presents us with a subtle inconsistency which we would like to avoid. The notion of a differential form on $R^n$ involves the geometry of $R^n$ only insofar as it is a vector space. In the conception of differential form, the inner product of $R^n$ is irrelevant and no particular coordinatization of $R^n$ is selected over any other. But the notion of work is deliberately expressed in terms of the Euclidean structure of $R^n$; it essentially involves lengths and angles. As a result, with the definition (7.33) of integration, we can only compute the integral by means of (7.34) in terms of rectangular coordinates for $R^n$. Since the concept of differential form is free of a particular basis, we want accessory concepts (such as integration) also to be free; in fact, we would hope to compute $\int_\Gamma \omega$ by means of (7.34) with respect to any coordinate system as well as any parametrization of $\Gamma$. This turns out to be the case, and therein we begin to see the importance of the notion of invariance with respect to coordinate choices.

Proposition 4. Let $\omega$ be a differential form defined on a domain $D$ in $R^n$ and suppose $\omega = \sum f_i\,dx^i = \sum \varphi_i\,du^i$ with respect to two different coordinate systems $(x^1, \dots, x^n)$, $(u^1, \dots, u^n)$. Let $\Gamma$ be a path in $D$ parametrized in two different ways by
$$x^i = x^i(t), \qquad a \le t \le b$$
$$u^i = u^i(\tau), \qquad \alpha \le \tau \le \beta$$
Then
$$\int_a^b \sum_i f_i(x(t))\,\frac{dx^i}{dt}\,dt = \int_\alpha^\beta \sum_i \varphi_i(u(\tau))\,\frac{du^i}{d\tau}\,d\tau \tag{7.35}$$

Proof. We can write the $x$'s as functions of the $u$'s and $t$ as a function of $\tau$:
$$x^i = x^i(u^1, \dots, u^n), \qquad t = t(\tau), \quad \alpha \le \tau \le \beta$$
Now, according to (7.24),
$$\varphi_i(u) = \sum_j f_j(x)\,\frac{\partial x^j}{\partial u^i}(u) \tag{7.36}$$
when $x$, $u$ are coordinates for the same point. Now, let us compute the integral on the left of (7.35) by the change of coordinates $t \to \tau$, according to the calculus of one dimension:
$$\int_a^b \sum_j f_j(x(t))\,\frac{dx^j}{dt}\,dt = \int_\alpha^\beta \sum_j f_j(x(t(\tau)))\,\frac{dx^j}{dt}\,\frac{dt}{d\tau}\,d\tau = \int_\alpha^\beta \sum_j f_j(x(t(\tau)))\,\frac{dx^j}{d\tau}\,d\tau \tag{7.37}$$
But we can compute $dx^j/d\tau$ by the chain rule; $x$ is a function of $u$, which is a function of $\tau$:
$$\frac{dx^j}{d\tau} = \sum_i \frac{\partial x^j}{\partial u^i}\,\frac{du^i}{d\tau}$$
(7.37) becomes
$$\int_\alpha^\beta \sum_{i,j} f_j(x(t(\tau)))\,\frac{\partial x^j}{\partial u^i}\,\frac{du^i}{d\tau}\,d\tau = \int_\alpha^\beta \sum_i \varphi_i(u(\tau))\,\frac{du^i}{d\tau}\,d\tau$$
by (7.36). The proof is concluded.

On the basis of that proposition we may now define the path integral of a form.

Definition 8. (The Path Integral) Let $\Gamma$ be an oriented path in a domain in which the form $\omega$ is defined. If $\Gamma = \sum_{i=1}^s \Gamma_i$, where the $\Gamma_i$ are parametrized by $x = g_i(t)$, $a_i \le t \le b_i$, we define
$$\int_\Gamma \omega = \sum_{i=1}^{s} \int_{a_i}^{b_i} \omega\big(g_i(t),\ g_i'(t)\big)\,dt$$
Notice that if $\Gamma$ is parametrized with respect to arc length, then $g'$ is the tangent $T$ and the integral may be written as
$$\int_\Gamma \omega = \int_\Gamma \omega(T)\,ds$$

Examples

23. Find $\int_\Gamma r^2\,d\theta$, where $\Gamma$ is the boundary of the rectangle $-1 \le x \le 1$, $-1 \le y \le 1$. Now, in rectangular coordinates, $r^2\,d\theta = -y\,dx + x\,dy$. Thus
$$\int_\Gamma r^2\,d\theta = \int_{-1}^{1} -(-1)\,dx + \int_{-1}^{1} (1)\,dy - \int_{-1}^{1} -(1)\,dx - \int_{-1}^{1} (-1)\,dy = 8$$

24. Find $\int_\Gamma (x^2 + y^2 + z^2)(dx + xy\,dy + dz)$ around the curve
$x^2 + y^2 = a^2$, $x^2 + y^2 + z^2 = b^2$. This can be parametrized by
$$x = a\cos\theta, \qquad y = a\sin\theta, \qquad z = \pm(b^2 - a^2)^{1/2}$$
and thus has two branches. On each branch $x^2 + y^2 + z^2 = b^2$ and $dz = 0$. Thus
$$\int_\Gamma (x^2 + y^2 + z^2)(dx + xy\,dy + dz) = 2b^2 \int_0^{2\pi} (-a\sin\theta + a^3\cos^2\theta\,\sin\theta)\,d\theta = 0$$

In case the curve $\Gamma$ is a closed path (a continuous image of a circle) it is customary to write $\oint_\Gamma$ to indicate that the integration is around a loop.

We now summarize what we know so far about the integration of differential forms.

Theorem 7.4. Let $\omega = \sum a_i\,dx^i$ be a differentiable differential form defined on a domain $D$ in $R^n$.
(i) $\omega$ is the differential of a function if and only if $\oint_\Gamma \omega = 0$ for all closed curves $\Gamma$.
(ii) $\omega$ is the differential of a function if and only if the field $(a_1, \dots, a_n)$ is conservative.
(iii) If $\omega = df$, then
$$\frac{\partial a_i}{\partial x_j}(p) = \frac{\partial a_j}{\partial x_i}(p) \qquad \text{all } i, j, \text{ all } p \in D \tag{7.38}$$

When is a Closed Form Exact?

For certain domains, Equations (7.38) are sufficient to guarantee that the form $\omega$ is the differential of a function; but this is not always true. For example, let
$$\omega = \frac{-y\,dx + x\,dy}{x^2 + y^2} \qquad \text{in } R^2 - \{(0, 0)\}$$
Certainly, $\omega$ satisfies the required conditions (recall Example 18):
$$\frac{\partial}{\partial x}\Big(\frac{x}{x^2 + y^2}\Big) = \frac{y^2 - x^2}{(x^2 + y^2)^2} = \frac{\partial}{\partial y}\Big(\frac{-y}{x^2 + y^2}\Big)$$
If $\omega$ were the differential of a function, then we would have $\oint_\Gamma \omega = 0$ for every closed curve $\Gamma$. However, since $\omega = d\theta$ (as remarked in Example 18), $\oint_\Gamma \omega = 2\pi$ if $\Gamma$ is a circle centered at the origin. Notice that in some sense $\omega$ is the differential of a function, albeit not single valued. If we exclude the line $x = 0$ (or the line $y = 0$), in the remaining domain we can take a principal value of $\theta = \tan^{-1}(y/x)$; but we cannot find a continuous single-valued function on all of $R^2 - \{(0, 0)\}$ whose differential is $\omega$.

Of course, in the above example, in any small enough neighborhood of any point in $R^2 - \{(0, 0)\}$ we can write $\omega = df$ for some function $f$. This is in fact true for any differential form satisfying the compatibility equations (7.38). That is, suppose $\omega = \sum a_i\,dx_i$ is a differentiable differential form defined in a neighborhood $U$ of $p_0$ in $R^n$ and the equations (7.38) are satisfied. Then if $B$ is a ball centered at $p_0$ and contained in $U$, there is a differentiable function $f$ defined in $B$ such that $df = \omega$ in $B$. This is really easy to prove: if $p$ is any point in $B$, let $L_p$ be the oriented line from $p_0$ to $p$ and define $f(p) = \int_{L_p} \omega$. Then we can differentiate $f$ with respect to $x^j$ by differentiating under the integral sign. The integrand will have one term of the form $\sum_i (\partial a_i/\partial x^j)\,dx^i$, which is by Equations (7.38) the same as $\sum_i (\partial a_j/\partial x^i)\,dx^i = da_j$. This is the essential term: by the fundamental theorem of calculus we can conclude from $\partial f/\partial x^j = \int da_j$ that $\partial f/\partial x^j = a_j$, as desired. Here is the precise proof.

Theorem 7.5. (Poincaré's Lemma) Suppose that $D$ is a domain with this property: there is a $p_0 \in D$ such that for every $p \in D$ the line segment joining $p$ to $p_0$ is also in $D$. ($D$ is star shaped (see Figures 7.4 and 7.5).) Then in $D$ every closed form is exact.

Proof. We may suppose $p_0$ is the origin. For $p \in D$, let $L_p$ be the oriented line segment joining 0 to $p$. We may parametrize $L_p$ by
$$L_p:\ x = x(t) = tp, \qquad 0 \le t \le 1 \tag{7.39}$$
If $\omega$ is a closed form, define $f(p) = \int_{L_p} \omega$.
We shall show that $df = \omega$. In coordinates, $p = (x^1, \dots, x^n)$, $\omega = \sum a_i\,dx^i$, and by (7.39)
$$f(x^1, \dots, x^n) = \int_{L_p} \sum_i a_i\,\frac{dx^i}{dt}\,dt = \int_0^1 \sum_{i=1}^n a_i(tx)\,x^i\,dt$$
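[Figure 7.4: $D$ is star shaped]

[Figure 7.5: $D$ is not star shaped]

Then, differentiating under the integral sign:
$$\frac{\partial f}{\partial x^j}(p) = \int_0^1 \sum_{i=1}^n \frac{\partial a_i}{\partial x^j}(tp)\,t\,x^i\,dt + \int_0^1 \sum_{i=1}^n a_i(tp)\,\frac{\partial x^i}{\partial x^j}\,dt$$
where the second integral is just $\int_0^1 a_j(tp)\,dt$. Now, using the compatibility equations, the integrand of the first integral takes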
the form
$$\sum_i \frac{\partial a_i}{\partial x^j}(tp)\,x^i\,t = \sum_i \frac{\partial a_j}{\partial x^i}(tp)\,x^i\,t = \frac{d}{dt}\big[a_j(tp)\big] \cdot t$$
We can now compute the first integral by integration by parts:
$$\int_0^1 \frac{d}{dt}\big[a_j(tp)\big]\,t\,dt = a_j(tp) \cdot t\,\Big|_0^1 - \int_0^1 a_j(tp)\,dt$$
Thus
$$\frac{\partial f}{\partial x^j}(p) = a_j(p) \cdot 1 - \int_0^1 a_j(tp)\,dt + \int_0^1 a_j(tp)\,dt = a_j(p)$$
and the proof of Poincaré's lemma is concluded.

Poincaré's lemma serves to indicate the nature of the solution to the basic question: when are closed forms exact? It depends on the shape of the domain. If the domain is a ball, or a cube, or any "star-shaped" domain, then every differential form which satisfies the compatibility differential equations (7.38) is the differential of a function. On the other hand, if the domain has holes (as does $R^2 - \{0\}$), there are closed forms which are not exact. We have seen, to be precise, in the discussion following Example 18 that on $R^2 - \{0\}$ the dimension of the space of closed forms exceeds the dimension of the space of exact forms by one. Problems 15 and 33 are devoted to showing that when we remove a finite number of points from $R^2$ this excess dimension on the remaining space is the same as the number of removed points. These examples suggest that domains with holes are not just defective in the closed-exact problem, but further that the solution to this problem gives a measure of the defect. This striking relationship between the shape, or topology, of the domain and the analytic question of integrability persists when we move to more complicated domains, or surfaces, and even into higher dimensions. The shape of a pretzel is accurately reflected in the closed vs. exact controversy on its surface. The general theorem relating this analysis to the topology of the domain is de Rham's theorem, and it is one of the cornerstones of the modern subject of differential topology.
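The construction used in the proof of Poincaré's lemma, $f(p) = \int_0^1 \sum_i a_i(tp)\,x^i\,dt$, can be carried out numerically on a star-shaped domain. The following sketch makes several assumptions for illustration: the closed form $\omega = 2xy\,dx + (x^2 + 3y^2)\,dy$ on $R^2$ is a hypothetical example (its exact potential is $x^2y + y^3$), and the names `a`, `potential`, and the sample point are not from the text. It computes $f$ by a midpoint rule and compares a finite-difference partial derivative of $f$ with the coefficient $a_1$.

```python
def a(point):
    # Coefficients (a_1, a_2) of the hypothetical closed form
    # omega = 2xy dx + (x^2 + 3y^2) dy on R^2.
    x, y = point
    return (2 * x * y, x * x + 3 * y * y)


def potential(point, n=2000):
    # f(p) = integral_0^1 sum_i a_i(t p) x^i dt, by the midpoint rule.
    x, y = point
    dt = 1.0 / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        a1, a2 = a((t * x, t * y))
        total += (a1 * x + a2 * y) * dt
    return total


pt = (1.5, -0.5)
f_val = potential(pt)                      # exact value: x^2 y + y^3 = -1.25
h = 1e-4
fx_num = (potential((pt[0] + h, pt[1])) - potential((pt[0] - h, pt[1]))) / (2 * h)
fx_exact = a(pt)[0]                        # a_1(p) = 2xy = -1.5
print(f_val, fx_num, fx_exact)
```

The same code applied to a form that is not closed would still return a number for `potential`, but its partial derivatives would no longer reproduce the coefficients.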
Now, back in one dimension, the fundamental theorem of calculus relates the values of a function on the boundary of an interval with the integral of its derivative over the interval:

$$f(b) - f(a) = \int_a^b df = \int_a^b \frac{df}{dt}\, dt \qquad (7.40)$$
The analog of this theorem for differential forms in $R^2$ is Green's theorem; there are many analogs in higher dimensions and we shall study some of these in the next chapter. For the remainder of the present chapter we shall study only the two-variable case.

Suppose $D$ is a domain in $R^2$, and the boundary of $D$ is made up of a finite collection of curves (see Figure 7.6). We make the boundary into an oriented path by choosing the direction of motion so that the domain $D$ is always on the left. If $T \to N$ is the (right-handed) tangent-normal frame along the boundary, then the normal $N$ always points into the domain (see Figure 7.7). We shall refer to the boundary of $D$, when so oriented, as $\partial D$. Now Green's theorem simply says this: if $\omega$ is a $C^1$ differential form defined on a neighborhood of $D$, then

$$\int_{\partial D} \omega = \int_D d\omega \qquad (7.41)$$

[Figure 7.6]
[Figure 7.7]

If we consider the boundary of the interval in (7.40) as oriented in some appropriate way, then (7.41) appears to be a direct generalization of (7.40). In order to see why (7.41) is true, we must first assume that $D$ is of a special form. We say that the domain $D$ is regular if it can be expressed in both of the following forms:

$$D = \{(x, y) \in R^2 : a \le x \le b,\ f(x) \le y \le g(x)\} \qquad (7.42)$$

$$\phantom{D} = \{(x, y) \in R^2 : \alpha \le y \le \beta,\ \varphi(y) \le x \le \psi(y)\} \qquad (7.43)$$

(see Figure 7.8). For regular domains, Green's theorem follows easily from the fundamental theorem of calculus. Let $\omega$ be a given $C^1$ form, and write $\omega = p\, dx + q\, dy$.
[Figure 7.8: a regular domain and an irregular domain]
Then

$$\int_D d\omega = \int_D (q_x - p_y)\, dx\, dy = \int_D q_x\, dx\, dy - \int_D p_y\, dy\, dx$$

We perform these integrations by iteration: use $x$ first for the first integral, $y$ first for the second.

$$\int_D q_x\, dx\, dy = \int_\alpha^\beta \left[ \int_{\varphi(y)}^{\psi(y)} \frac{\partial q}{\partial x}\, dx \right] dy = \int_\alpha^\beta q(\psi(y), y)\, dy - \int_\alpha^\beta q(\varphi(y), y)\, dy \qquad (7.44)$$

Now, we can parametrize $\partial D$ in two parts as $\partial D = \Gamma_1 + \Gamma_2$:

$$\Gamma_1 : \mathbf{x}(t) = (\psi(t), t), \quad \alpha \le t \le \beta$$

$$-\Gamma_2 : \mathbf{x}(t) = (\varphi(t), t), \quad \alpha \le t \le \beta$$

Thus

$$\int_{\partial D} q\, dy = \int_{\Gamma_1} q\, dy - \int_{-\Gamma_2} q\, dy = \int_\alpha^\beta q(\psi(t), t)\, dt - \int_\alpha^\beta q(\varphi(t), t)\, dt \qquad (7.45)$$

Comparing (7.44) and (7.45) we deduce that

$$\int_D q_x\, dx\, dy = \int_{\partial D} q\, dy \qquad (7.46)$$

We leave it to the reader to verify by the same kind of argument that

$$\int_D p_y\, dx\, dy = -\int_{\partial D} p\, dx \qquad (7.47)$$

(Problem 25). Equations (7.46) and (7.47) together give Green's theorem.

Now, not every domain can be represented in both the required ways; in fact, in general neither is possible. However, for most domains $D$ it is true that $D$ can be covered by finitely many disks $\Delta_1, \ldots, \Delta_s$ so that $D \cap \Delta_i = D_i$ is regular for every $i$. Clearly, if $D$ is bounded by finitely many polygonal curves this is true. All but the most pathological domains that we have seen have this property. The above argument generalizes easily to these types of domains. We shall now call any such domain regular.
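Definition 9. A domain $D$ is regular if its boundary is a path and if $D$ can be covered by disks $\Delta_1, \ldots, \Delta_s$ such that each $D \cap \Delta_i$ can be represented in both forms (7.42) and (7.43).

Theorem 7.6. (Green's Theorem) Let $D$ be a regular domain and $\omega$ a differential form defined on a neighborhood of $D$. Then

$$\int_{\partial D} \omega = \int_D d\omega$$

Proof. Let $D_i = D \cap \Delta_i$, where the disks $\Delta_1, \ldots, \Delta_s$ are as given in the definition. In particular, by the preceding arguments, Green's theorem is true on $D_i$ for each $i$. Let $\rho_1, \ldots, \rho_s$ be a partition of unity subordinate to the covering $\Delta_1, \ldots, \Delta_s$. Then the $\rho_i$ are $C^\infty$ functions, $\sum \rho_i = 1$ on $D$, and $\rho_i$ is nonzero only inside $\Delta_i$. Now, by Green's theorem on $D_i$,

$$\int_{\partial D_i} \rho_i \omega = \int_{D_i} d(\rho_i \omega)$$

Since $\rho_i \omega$ is zero off $D_i$,

$$\int_{D_i} d(\rho_i \omega) = \int_D d(\rho_i \omega)$$

But also $\rho_i \omega$ is nonzero only on the part of each of the curves $\partial D$, $\partial D_i$ which is common to both; thus also

$$\int_{\partial D_i} \rho_i \omega = \int_{\partial D} \rho_i \omega$$

Thus

$$\int_{\partial D} \rho_i \omega = \int_D d(\rho_i \omega), \qquad 1 \le i \le s$$

Adding these equations, we obtain Green's theorem for $D$ since $\sum \rho_i = 1$:

$$\int_{\partial D} \omega = \int_{\partial D} \sum \rho_i \omega = \sum \int_{\partial D} \rho_i \omega = \sum \int_D d(\rho_i \omega) = \int_D d\Big(\sum \rho_i \omega\Big) = \int_D d\omega$$

Examples

25. Let $D$ be the unit rectangle $[(0, 0), (1, 1)]$. Then, by Green's theorem

$$\int_{\partial D} x^2 y\, dx + (x - y)\, dy = \int_D (1 - x^2)\, dx\, dy = \int_0^1 (1 - x^2)\, dx = \frac{2}{3}$$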
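Example 25 can be checked numerically. The following is only a sketch, using simple midpoint rules; the helper names `line_integral` and `double_integral` are illustrative and not part of the text. It integrates $\omega = x^2 y\, dx + (x - y)\, dy$ around the counterclockwise boundary of the unit square and compares the result with the double integral of $1 - x^2$.

```python
def line_integral(p, q, path, n=20000):
    """Midpoint-rule approximation of the integral of p dx + q dy along a
    straight segment parametrized by path(t), 0 <= t <= 1."""
    total = 0.0
    h = 1.0 / n
    for k in range(n):
        x0, y0 = path(k * h)
        x1, y1 = path((k + 1) * h)
        xm, ym = 0.5 * (x0 + x1), 0.5 * (y0 + y1)
        total += p(xm, ym) * (x1 - x0) + q(xm, ym) * (y1 - y0)
    return total


def double_integral(g, n=400):
    """Midpoint-rule approximation of the integral of g over the unit square."""
    h = 1.0 / n
    return sum(g((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


# omega = p dx + q dy from Example 25
p = lambda x, y: x * x * y
q = lambda x, y: x - y

# The four edges of the unit square, traversed with the square on the left.
edges = [
    lambda t: (t, 0.0),
    lambda t: (1.0, t),
    lambda t: (1.0 - t, 1.0),
    lambda t: (0.0, 1.0 - t),
]

boundary = sum(line_integral(p, q, e) for e in edges)
interior = double_integral(lambda x, y: 1.0 - x * x)  # q_x - p_y = 1 - x^2
print(boundary, interior)
```

Both numbers should agree with the value $2/3$ up to the discretization error of the midpoint rule.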
26. The integral of $\omega = \cos xy\, dx + y \cos x\, dy$ over the boundary of the domain $D = \{(x, y) : 0 \le x^2 \le y \le 1\}$ is

$$\int_{\partial D} \omega = \int_D [-y \sin x + x \sin xy]\, dx\, dy = \int_{-1}^{1} \left[ \int_{x^2}^{1} (-y \sin x + x \sin xy)\, dy \right] dx$$

Green's theorem is also convenient for transforming double integrals into line integrals. Noticing that $dx\, dy$ arises as $d(-y\, dx)$ or $d(x\, dy)$ in Green's theorem, we may compute areas of domains by line integrals.

27. Find the area bounded by the curves $y = 1 - x^4$ and $y = 1 - x^6$ in the upper half plane:

$$\text{area} = \int_D dx\, dy = -\int_{\partial D} y\, dx = -\int_{-1}^{1} (1 - x^4)\, dx + \int_{-1}^{1} (1 - x^6)\, dx = \frac{4}{35}$$

28. Find the area inside the ellipse

$$E : \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$$

We can parametrize $E$ by the polar angle: $x = a \cos\theta$, $y = b \sin\theta$. Thus

$$\text{area} = \int_E x\, dy = ab \int_0^{2\pi} \cos^2\theta\, d\theta = \pi a b$$

EXERCISES

14. Compute the line integrals of differential forms arising out of the work problems in Exercise 12(a), (b) using Green's theorem.

15. Compute $\int_\Gamma \omega$ for given $\omega$ and $\Gamma$ (using Green's theorem if convenient).
7.5 Integration of Differential Forms 573 (a) ω = ζ dx + χ dy + у dz Г: closed oriented polygon with successive vertices (0, 0, 0), (0, 1, 1), (1, 0, 0), (-1, -1, -1). (b) ω = x2y dx + y2x dy Γ: the ellipse a2x2 + b2y2 = 1. (c) ω = (χ + у) dx + (χ2 + у2) dy Г: the triangle with successive vertices (0, 0), (4, 0), (2, 3). (d) ω = χ2 dy + 2xy dx Γ: ζ = e(1 + "' from ( =0 to t = 2. (e) ω = (χ + у) dx + (у + ζ) dy + (ζ + χ) dx Γ: the circle χ2 + ζ2 = 1, у = 3. 16. Compute, using Green's theorem the area of the domain D: (a) D = {(x,y): 0<sinx<;v<tanx< 1} (b) D is the domain in the upper half plane bounded by the ellipse x2 + 2y2 = 1 and the parabola χ = 2y2 (c) D is the quadrilateral with vertices at (0, 0), (1, 0), (7, 3), (2, 5). (d) Inside the curve χ = cos" t, у = sin" t η > 0. PROBLEMS 25. Verify Equation (7.47) in the text and conclude the proof of Green's theorem. 26. Using Green's theorem prove that if ω is a closed differential form in all of R2, then ω is exact. 27. A differential form is called radial, if it is of the form <F, dx} where F is a radial vector field (see Problem 24). Show that if ω is radial, it is of the form /(r) dr. 28. Show that if ω is a compactly supported (that is, it is identically zero outside some large disk) form on the plane that (a) f 2da>=0 (b) f ω = ί dm 29. Show that if ω is a compactly supported closed form in R2, it is the differential of a compactly supported function. 30. If ω is a differential form, define *ω as follows: if ω = <F, dx> *ω = <*F, dx} (a) Show that if ω =p dx +q dy, *ω = —q dx +pdy. (b) Show that (in a disk) *df is also exact if and only if/is harmonic. (c) Show, using complex notation *ω(Τ) = ω(ίΤ)
574 7 Line Integrals and Green s Theorem (d) If/is harmonic, let/* be such that df* = *df. Show that/+ //* satisfies the Cauchy-Riemann equations. 31. Let Γ be an oriented curve in R", with tangent Τ and normal N. If /"is a differentiable function we define these derivatives of/along Γ: ^ = rf/(T) = <V/T> ^ = rf/(N) = <V/,N> Show that /γΉγ*ϊλ Lm*=L*« 32. Suppose that £> is a regular domain and / g are twice differentiable functions defined on a neighborhood of D. Verify these formulas (using Green's theorem): iLds^O aDST (a) f •Id (b) \j^ds=\\Xixi~%dix)dxdy (d) Lss*=JJ>** (e) L ^ έ ^=ίί„ [^Δ/ + <Vg-v/> ] rfx dy (f) Ц^-^1)л-Яв^-^)л 7.6 Applications of Green's Theorem Several of the exercises at the end of the previous section have indicated the uses of Green's theorem. The rest of this chapter is devoted to the application of this theorem to some of the topics we have been developing. We shall leave aside until the next section its more profound uses in the study of complex differentiable functions.
The Shape of the Domain

The most immediate implication of Green's theorem is the suggestion of the relationship of the shape of a domain to the question of the exactness of closed forms. If every closed curve in the domain $D$ is the boundary of a subdomain in $D$, then every closed form is exact. For, suppose $\omega$ is a closed form. By Theorem 7.4 (i), to show that $\omega$ is exact, we need only verify that its integral over any closed curve is zero. If $\Gamma$ is such a curve, then by hypothesis it is the boundary of a subdomain $E$. Then, by Green's theorem

$$\int_\Gamma \omega = \int_E d\omega = 0$$

since $d\omega = 0$.

We can say that a domain $D$ "has no holes" if every closed curve in $D$ is the boundary of a subdomain of $D$. This is intuitively clear: we can draw a loop around any hole which will bound the hole, and this is not a subdomain in $D$. The further study and precision of these notions is a rather difficult branch of mathematics and falls within the domain of topology. It turns out that there is a precise relation between this vague geometric study and the question of exactness. The number of "holes" in the domain is the same as the number of independent closed but nonexact forms. We already saw that in Section 7.2 for $R^2 - \{0\}$ and in Problem 15 for $R^2 - \{0, 1\}$. That argument easily generalizes to the case of the complement of finitely many points $p_1, \ldots, p_s$. Let $\theta_i(z) = \arg(z - p_i)$. Although $\theta_i$ is not a well-defined function on $R^2 - \{p_1, \ldots, p_s\}$, $d\theta_i$ is a well-defined form. Clearly, $d\theta_1, \ldots, d\theta_s$ are independent, so there are at least $s$ independent closed nonexact forms on $R^2 - \{p_1, \ldots, p_s\}$. Now, let $\omega$ be any closed form and define

$$c_i(\omega) = \frac{1}{2\pi} \int_{C_i} \omega$$

where $C_i$ is a small circle centered at $p_i$. Since $\int_{C_i} d\theta_i = 2\pi$, the form

$$\omega' = \omega - \sum_{i=1}^{s} c_i(\omega)\, d\theta_i$$

is exact. This can be proven by verifying condition (i) of Theorem 7.4 by Green's theorem (see Problem 33). Thus if $\omega$ is any closed form it is, but for an exact form, a linear combination of the $d\theta_i$.
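The role of the forms $d\theta_i$ can be illustrated numerically. The sketch below (the function name `integrate_dtheta` is an illustrative choice, not from the text) integrates $d\theta = (x\, dy - y\, dx)/(x^2 + y^2)$ over two counterclockwise circles: one surrounding the removed point $0$ and one that does not. A nonzero integral over a closed curve shows that the closed form $d\theta$ cannot be exact on $R^2 - \{0\}$.

```python
import math


def integrate_dtheta(cx, cy, r, px=0.0, py=0.0, n=20000):
    """Midpoint-rule integral of d(arg(z - p)) over the counterclockwise
    circle with center (cx, cy) and radius r, where p = (px, py)."""
    total = 0.0
    h = 2.0 * math.pi / n
    for k in range(n):
        t = (k + 0.5) * h
        x = cx + r * math.cos(t)
        y = cy + r * math.sin(t)
        dx = -r * math.sin(t) * h
        dy = r * math.cos(t) * h
        u, v = x - px, y - py
        total += (u * dy - v * dx) / (u * u + v * v)
    return total


around_hole = integrate_dtheta(0.0, 0.0, 1.0)   # circle surrounds the point 0
away_from_hole = integrate_dtheta(3.0, 0.0, 1.0)  # circle does not surround 0
print(around_hole, away_from_hole)
```

The first value should be close to $2\pi$ and the second close to $0$, in agreement with the discussion above: the circle that misses the hole bounds a subdomain, so Green's theorem applies to it.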
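Area Computation

Now, as in Examples 27, 28, we can compute areas by boundary integrals: if $D$ is a regular domain,

$$\text{area of } D = \iint_D dx\, dy = \int_{\partial D} x\, dy = -\int_{\partial D} y\, dx = \frac{1}{2} \int_{\partial D} x\, dy - y\, dx \qquad (7.48)$$

Example

29. The area of a trapezoid is $\frac{1}{2}(b_1 + b_2)h$ (see Figure 7.9). Since $dy = 0$ on the two horizontal sides, only the slanted sides $L_1$, $L_2$ contribute:

$$\text{area} = \int_{\partial D} x\, dy = \int_{L_1} x\, dy + \int_{L_2} x\, dy$$

where

$$L_1 : y = \frac{h}{a + b_2 - b_1}(x - b_1), \quad x \text{ from } b_1 \text{ to } a + b_2$$

$$L_2 : y = \frac{h}{a}\, x, \quad x \text{ from } a \text{ to } 0$$

Thus

$$\text{area} = \int_{b_1}^{a + b_2} x\, \frac{h}{a + b_2 - b_1}\, dx + \int_a^0 x\, \frac{h}{a}\, dx = \frac{h}{2} \left[ \frac{(a + b_2)^2 - b_1^2}{a + b_2 - b_1} - \frac{a^2}{a} \right] = \frac{h}{2}[a + b_2 + b_1 - a] = \frac{1}{2}(b_1 + b_2)h$$

[Figure 7.9: trapezoid with labeled vertices $(a + b_2, h)$ and $(b_1, 0)$]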
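For a polygon, the boundary integral $\frac{1}{2}\int_{\partial D} x\, dy - y\, dx$ of (7.48) can be evaluated exactly edge by edge, since on the segment from $(x_0, y_0)$ to $(x_1, y_1)$ the integral of $x\, dy - y\, dx$ equals $x_0 y_1 - x_1 y_0$. The sketch below applies this to a trapezoid; the function name `polygon_area` and the sample dimensions are illustrative assumptions, and the vertex order is taken counterclockwise as in Example 29.

```python
def polygon_area(vertices):
    """Area of a polygon whose vertices are listed counterclockwise,
    computed as (1/2) * (boundary integral of x dy - y dx)."""
    total = 0.0
    n = len(vertices)
    for k in range(n):
        x0, y0 = vertices[k]
        x1, y1 = vertices[(k + 1) % n]
        # Exact value of the integral of x dy - y dx over this edge.
        total += x0 * y1 - x1 * y0
    return 0.5 * total


# Sample trapezoid: lower base b1, upper base b2, height h, offset a.
a, b1, b2, h = 1.0, 5.0, 3.0, 2.0
trapezoid = [(0.0, 0.0), (b1, 0.0), (a + b2, h), (a, h)]
area = polygon_area(trapezoid)
print(area, 0.5 * (b1 + b2) * h)
```

Listing the vertices clockwise instead would reverse the orientation of the boundary and change the sign of the result.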
Integration after a Change of Variable

A line integral of a differential form is the same, no matter what coordinates are used to compute it (recall Proposition 4). Using this knowledge and the preceding computational techniques we can find a formula for computing double integrals by a coordinate change. Suppose that $F$ is a nonsingular differentiable transformation of the domain $D$ onto the domain $E$ (that is, $F$ maps $D$ one-to-one onto $E$ and $dF$ is everywhere nonsingular). Let us write $F$ in terms of coordinates:

$$F : \begin{array}{l} u = u(x, y) \\ v = v(x, y) \end{array} \quad (x, y) \in D \qquad\qquad F^{-1} : \begin{array}{l} x = x(u, v) \\ y = y(u, v) \end{array} \quad (u, v) \in E \qquad (7.49)$$

If $\Gamma$ is a path in $D$, then $F(\Gamma)$ is a path in $E$. If $\omega = p\, dx + q\, dy$ is a differential form defined on $D$, we may associate to it a form on $E$: $\tilde\omega = \alpha\, du + \beta\, dv$, where the coefficients are given (see (7.24)) by the coordinate change $(x, y) \to (u, v)$. Then $\int_\Gamma \omega = \int_{F(\Gamma)} \tilde\omega$, since they represent the same integration relative to two different coordinate sets. Now, if $\Gamma$ bounds a domain $\Delta$, $F(\Gamma)$ bounds $F(\Delta)$, and if we apply Green's theorem to both sides we will obtain a relation between the double integrals. However, to apply Green's theorem we must be sure that both $\Gamma$ and $F(\Gamma)$ are oriented as the boundary of the domains $\Delta$, $F(\Delta)$, respectively. That is not necessarily the case.

Example

30. The transformation $u = y$, $v = x$ amounts to reflection in the line $x = y$. If $\Gamma$ is a circle centered on that line, $\Gamma$ and $F(\Gamma)$ are the same curve, but oriented in opposite directions (see Figure 7.10).

[Figure 7.10]

This difficulty may be overcome by restricting attention exclusively to transformations that preserve the sense of orientation around a curve. This will be guaranteed if the sense of "counterclockwise" rotation about corresponding points is the same. Thus, if we rotate the $xy$ plane about the point $p$ in the clockwise sense, the induced motion under the transformation $F$ must also be clockwise. This will be the case if it is so for the linear approximation $dF(p)$, and that is guaranteed by

$$\frac{\partial(x, y)}{\partial(u, v)}(p) = \det \begin{pmatrix} \dfrac{\partial x}{\partial u}(p) & \dfrac{\partial x}{\partial v}(p) \\[2mm] \dfrac{\partial y}{\partial u}(p) & \dfrac{\partial y}{\partial v}(p) \end{pmatrix} > 0 \qquad (7.50)$$

These remarks are not completely obvious, but we shall not pause to verify them. It is intuitively clear that the sense of rotation at a point is the same for the transformation and its differential. What is not so clear, and more difficult to obtain, is that this local criterion assures that the sense of orientation of any boundary is the same in the two coordinate systems. All these geometric considerations can be avoided by replacing them with appropriate algebraic considerations. We shall see further illustrations of the difficulty in a purely geometric, rather than algebraic, approach in the next chapter.

In any event, if (7.49) defines a change of variables satisfying condition (7.50), then for any subdomain $\Delta$ of $D$, $\partial\Delta$ and $\partial F^{-1}(\Delta)$ define the same orientation on the boundary of $\Delta$. Thus, if $\omega$ is any differential form

$$\int_{\partial\Delta} \omega = \int_{\partial F^{-1}(\Delta)} \tilde\omega$$
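In particular,

$$\text{area}(\Delta) = \int_{\partial\Delta} x\, dy = \int_{\partial F^{-1}(\Delta)} x\, dy = \int_{\partial F^{-1}(\Delta)} x \frac{\partial y}{\partial u}\, du + x \frac{\partial y}{\partial v}\, dv$$

$$= \int_{F^{-1}(\Delta)} \left[ \frac{\partial}{\partial u}\Big( x \frac{\partial y}{\partial v} \Big) - \frac{\partial}{\partial v}\Big( x \frac{\partial y}{\partial u} \Big) \right] du\, dv = \int_{F^{-1}(\Delta)} \frac{\partial(x, y)}{\partial(u, v)}\, du\, dv$$

A more important formula is that allowing us to compute double integrals with respect to the new coordinates $(u, v)$.

Theorem 7.7. Let $D$ be a domain in the plane, and suppose

$$x = x(u, v) \qquad y = y(u, v)$$

is an orientation-preserving change of coordinates (that is, $\partial(x, y)/\partial(u, v) > 0$). Let $E$ be the domain in $(u, v)$ variables corresponding to $D$. If $f$ is a function defined on a rectangle containing $D$, then $\int_D f$ can be computed in terms of the $(u, v)$ coordinates:

$$\int_D f = \int_E f(x(u, v), y(u, v)) \det \frac{\partial(x, y)}{\partial(u, v)}(u, v)\, du\, dv \qquad (7.51)$$

Proof. Let $R = [(a, b), (\alpha, \beta)]$ be the rectangle and define

$$F(x, y) = \int_a^x f(t, y)\, dt \quad \text{for } (x, y) \in R$$

Thus $F(x, y)$ is a $C^1$ differentiable function on $R$ such that $\partial F/\partial x = f$. Now, by Green's theorem

$$\int_D f\, dx\, dy = \int_{\partial D} F\, dy$$

We can compute the integral over $\partial D$ in the $(u, v)$ coordinates:

$$\int_{\partial D} F\, dy = \int_{\partial E} F\, dy = \int_{\partial E} F \frac{\partial y}{\partial u}\, du + F \frac{\partial y}{\partial v}\, dv$$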
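By Green's theorem (in the $(u, v)$ variables), the last integral is

$$\int_E \left[ \frac{\partial}{\partial u}\Big( F \frac{\partial y}{\partial v} \Big) - \frac{\partial}{\partial v}\Big( F \frac{\partial y}{\partial u} \Big) \right] du\, dv$$

$$= \int_E \left[ \Big( \frac{\partial F}{\partial x}\frac{\partial x}{\partial u} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial u} \Big)\frac{\partial y}{\partial v} + F \frac{\partial^2 y}{\partial u\, \partial v} - \Big( \frac{\partial F}{\partial x}\frac{\partial x}{\partial v} + \frac{\partial F}{\partial y}\frac{\partial y}{\partial v} \Big)\frac{\partial y}{\partial u} - F \frac{\partial^2 y}{\partial v\, \partial u} \right] du\, dv$$

$$= \int_E f(x(u, v), y(u, v)) \det \frac{\partial(x, y)}{\partial(u, v)}\, du\, dv$$

Thus (7.51) is proven.

Examples

31. $$\int_{x^2 + y^2 \le 1} \frac{dx\, dy}{(x^2 + y^2)^{1/2}} = \int_{r \le 1} \frac{r\, dr\, d\theta}{r} = \int_0^1 \left[ \int_0^{2\pi} d\theta \right] dr = 2\pi$$

32. $$\int_{R^2} \exp[-(x^2 + y^2)]\, dx\, dy = \int \exp(-r^2)\, r\, dr\, d\theta = 2\pi \int_0^\infty \exp(-r^2)\, r\, dr = \pi \big[ -\exp(-r^2) \big]_0^\infty = \pi$$

Notice that

$$\left( \int_{-\infty}^{\infty} \exp(-t^2)\, dt \right)^2 = \int_{-\infty}^{\infty} \exp(-x^2)\, dx \cdot \int_{-\infty}^{\infty} \exp(-y^2)\, dy = \int_{R^2} \exp[-(x^2 + y^2)]\, dx\, dy$$

Thus

$$\int_{-\infty}^{\infty} \exp(-t^2)\, dt = \sqrt{\pi}$$

a computation that would have been impossible without the change of variable to polar coordinates.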
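The two sides of Example 32 can be compared numerically. This is a rough sketch using midpoint rules on truncated intervals; the cutoffs `R` and `L` and the function names are illustrative assumptions, chosen so that the neglected tails are far below the discretization error.

```python
import math


def radial_integral(R=8.0, n=50000):
    """Midpoint rule for the integral of exp(-r^2) * r over 0 <= r <= R.
    The extra factor r is the Jacobian of polar coordinates."""
    h = R / n
    return sum(math.exp(-((k + 0.5) * h) ** 2) * (k + 0.5) * h
               for k in range(n)) * h


def line_gaussian(L=8.0, n=50000):
    """Midpoint rule for the integral of exp(-t^2) over -L <= t <= L."""
    h = 2.0 * L / n
    return sum(math.exp(-(-L + (k + 0.5) * h) ** 2) for k in range(n)) * h


plane = 2.0 * math.pi * radial_integral()  # polar-coordinate side
one_dim = line_gaussian()                  # one-variable integral
print(plane, one_dim ** 2, math.pi)
```

The printed values should agree: the polar computation gives $\pi$ for the integral over the plane, and the square of the one-variable integral matches it, which is the identity used above to obtain $\sqrt{\pi}$.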
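The Divergence Theorem

The general form of Green's theorem first came up in the study of fluid flows and the theory of potentials. In this study it arises in the form of the divergence theorem, which we shall now discuss in two variables.

Let $\mathbf{v} = (v, w)$ be a vector field defined in some open set in the plane and let $\mathbf{x} = \mathbf{x}(\mathbf{x}_0, t)$ be the equations of the associated flow (that flow with velocity field $\mathbf{v}$). Let $D$ be a domain on which the flow takes place. The fluid which at time $t = 0$ occupies $D$ has moved, after a time $t$, to a domain $D_t$ given by

$$D_t = \{\mathbf{x} : \mathbf{x} = \mathbf{x}(\mathbf{x}_0, t),\ \mathbf{x}_0 \in D\}$$

The area of $D_t$ is

$$\text{area}(D_t) = \int_{D_t} dx\, dy = \int_D \frac{\partial(x_t, y_t)}{\partial(x_0, y_0)}\, dx_0\, dy_0$$

where we have rewritten the equations of flow as

$$\mathbf{x} = \mathbf{x}(\mathbf{x}_0, t) = (x_t(x_0, y_0),\ y_t(x_0, y_0))$$

The rate of change of the area of $D_t$ is

$$\frac{d}{dt}\, \text{area}(D_t) = \int_D \frac{\partial}{\partial t}\left[ \frac{\partial(x_t, y_t)}{\partial(x_0, y_0)} \right] dx_0\, dy_0 \qquad (7.52)$$

Now let us evaluate this at time $t = 0$. Remembering that $\mathbf{x}(\mathbf{x}_0, 0) = \mathbf{x}_0$, so that at $t = 0$ we have $\partial x_t/\partial x_0 = \partial y_t/\partial y_0 = 1$ and $\partial x_t/\partial y_0 = \partial y_t/\partial x_0 = 0$, we have

$$\frac{\partial}{\partial t}\left[ \frac{\partial x_t}{\partial x_0}\frac{\partial y_t}{\partial y_0} - \frac{\partial x_t}{\partial y_0}\frac{\partial y_t}{\partial x_0} \right]_{t=0} = \frac{\partial}{\partial x_0}\Big( \frac{\partial x}{\partial t} \Big) + \frac{\partial}{\partial y_0}\Big( \frac{\partial y}{\partial t} \Big) = \frac{\partial v}{\partial x} + \frac{\partial w}{\partial y}$$

Thus the instantaneous rate of change of the area of $D$ (Equation (7.52)) is given by

$$\int_D \Big( \frac{\partial v}{\partial x} + \frac{\partial w}{\partial y} \Big)\, dx\, dy \qquad (7.53)$$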
The integrand is called the divergence of the flow and is denoted $\operatorname{div} \mathbf{v}$. The divergence theorem says that this integral can be computed by a boundary integral. To put it physically: the rate of expansion of $D$ is the same as the rate at which fluid flows into $D$. We will now try to compute that latter amount. Let $\partial D$ have the frame $T \to N$ so that $N$ points into the domain (see Figure 7.11). The amount of fluid passing into $D$ through a small piece of the boundary (of length $\Delta s$) in a time $\Delta t$ is

$$\langle \mathbf{v}, N \rangle\, \Delta s\, \Delta t$$

The total amount passing through $\partial D$ is thus well approximated by a Riemann sum for the integral

$$\left[ \int_{\partial D} \langle \mathbf{v}, N \rangle\, ds \right] \Delta t$$

Thus the rate at which fluid passes into $D$ can be thought to be given by

$$\int_{\partial D} \langle \mathbf{v}, N \rangle\, ds$$

[Figure 7.11]

Using the notation of Exercise 29 this is the same as

$$\int_{\partial D} \langle *\mathbf{v}, d\mathbf{x} \rangle$$

By Green's theorem this is the same as (7.53). Thus the divergence theorem is verified:

$$\int_{\partial D} \langle \mathbf{v}, N \rangle\, ds = \int_D \operatorname{div} \mathbf{v} \qquad (7.54)$$

If $\mathbf{v}$ is a conservative field it has a potential function $f$, and $\langle \mathbf{v}, d\mathbf{x} \rangle = df$. Then $\langle *\mathbf{v}, d\mathbf{x} \rangle = *df$ and (7.54) becomes

$$\int_{\partial D} *df = \int_D d{*}df = \int_D \Delta f$$

Thus, if $f$ is the potential function for a conservative and incompressible (divergence free) flow, $f$ must be a harmonic function. Dirichlet's problem (to find a harmonic function with given boundary values) may be restated as: find the conservative incompressible flow with given boundary potential levels.

The Cauchy Theorem

This last remark leads directly to the study of complex analysis. Suppose that $f$ is a complex-valued complex differentiable $C^1$ function defined on a domain in the plane. Then

$$\lim_{h \to 0} \frac{f(z + h) - f(z)}{h} = f'(z)$$

exists for all $z$ and (what is the same assertion) the Cauchy-Riemann equations hold:

$$\frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y}$$

It follows that the form $f(z)\, dz$ is a closed complex-valued form:

$$f\, dz = f\, dx + i f\, dy \qquad d(f\, dz) = (i f_x - f_y)\, dx\, dy = 0$$

Theorem 7.8. (Cauchy's Theorem) If $f$ is a $C^1$ complex differentiable function defined in the regular domain $D$, then

$$\int_{\partial D} f\, dz = 0$$
584 7 Line Integrals and Green's Theorem Proof. By Green's theorem \ f dz=\ d(f dz) = 0 •>6D •'D •>dD EXERCISES 17. Compute the area of these domains: (a) x* + У < a* (b) x2y<l,0<x<a (c) r < 1 + 2 cos 0 (each section) (d) r ^ e\ 0 < θ ^ 2π (e) The domain {u2 + v2 < 1/2}, where и = x(l + χ cos υ) υ = y(l +;vcosx) (f) The domain {0<«^1, 0^υ<1}, where и = xy, ν = χ2 — у2 18. Compute div ν for these flows: (a) x(xo,0=exp (-! -:>: Xo (b) x=x0(.l + t),y=y0(.\~t2) (c) vO, j<) = (x2 - y, y2 - x) (d) v(x, >) = (x + y, χ - y) PROBLEMS 33 Let D = R2 — {pi, . , p,}, where pi,..., ps are s distinct points in the plane. Show that there is an j-dimensional space L of closed, but not exact forms defined on D such that every closed form can be written df+ω, with ω eL. 34. Let ω be a closed form in R2 — {(0, 0)}. Show that if ω is exact in some annulus {a< \z\ <,b], then it is exact. 35. Let / be a complex-valued differentiable function defined in the domain Ε Show that /is complex analytic if and only if Ud fd2 = 0 for all subdomains D of Ε {Hint: d(fdz)=0 is the same as the Cauchy- Riemann equations) 7.7 The Cauchy Integral Formula In Chapter 5 we introduced the power series development of functions in order to effectively compute solutions to certain differential equations. Those functions which admit an expansion into a power series are called analytic. We saw that this is the most computable class of functions. We
saw that such functions are differentiable in the complex sense, and that the differential equations can be interpreted in the sense of complex variables. In Chapter 6 we found that if a function is the sum of a convergent power series in the closed unit disk, it can be computed by means of an integral around the circle: if

$$f(\zeta) = \sum_{n=0}^{\infty} a_n \zeta^n, \qquad |\zeta| \le 1$$

then

$$f(\zeta) = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(e^{i\theta})}{1 - \zeta e^{-i\theta}}\, d\theta = \frac{1}{2\pi} \int_0^{2\pi} \frac{f(e^{i\theta})\, e^{i\theta}}{e^{i\theta} - \zeta}\, d\theta$$

for $|\zeta| < 1$. The integral may be rewritten as a line integral:

$$f(\zeta) = \frac{1}{2\pi i} \int_{|z| = 1} \frac{f(z)\, dz}{z - \zeta}$$

The Cauchy integral formula is a great generalization of this. It weakens the hypothesis to that of complex differentiability and strengthens the conclusion by replacing the unit circle by the boundary of any regular domain.

Theorem 7.9. (Cauchy Integral Formula) Suppose that $f$ is a $C^1$ complex-valued complex differentiable function defined in a neighborhood of the regular domain $D$. Then, for $\zeta \in D$,

$$f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta} \qquad (7.55)$$

Proof. Let $\Delta_n = \{z : |z - \zeta| < n^{-1}\}$. If $n$ is large enough, $\Delta_n$ is contained in $D$ (see Figure 7.12) and $f(z)(z - \zeta)^{-1}$ is complex differentiable in $D - \Delta_n$. This is because the product of complex differentiable functions is complex differentiable. Thus $f(z)(z - \zeta)^{-1}\, dz$ is closed, so that

$$\int_{\partial(D - \Delta_n)} \frac{f(z)\, dz}{z - \zeta} = 0$$

Thus

$$\int_{\partial D} \frac{f(z)\, dz}{z - \zeta} = \int_{\partial \Delta_n} \frac{f(z)\, dz}{z - \zeta} = \int_0^{2\pi} \frac{f(\zeta + n^{-1} e^{i\theta})}{n^{-1} e^{i\theta}}\, i n^{-1} e^{i\theta}\, d\theta = i \int_0^{2\pi} f(\zeta + n^{-1} e^{i\theta})\, d\theta$$
[Figure 7.12]

But as $n \to \infty$, $f(\zeta + n^{-1} e^{i\theta}) \to f(\zeta)$ uniformly on the circle, just because $f$ is continuous at $\zeta$. Since $n$ is arbitrary (but large),

$$\int_{\partial D} \frac{f(z)\, dz}{z - \zeta} = \lim_{n \to \infty} i \int_0^{2\pi} f(\zeta + n^{-1} e^{i\theta})\, d\theta = i \int_0^{2\pi} f(\zeta)\, d\theta = 2\pi i f(\zeta)$$

and thus (7.55) is proven.

The Cauchy integral formula implies that complex differentiable functions are extremely well behaved; after all, a function certainly must be quite special for it to be completely and explicitly determined within a domain by its boundary values. Here are a few corollaries of Theorem 7.9 which demonstrate this. For simplicity of notation we shall write $f \in A(D)$ to mean that $f$ is a $C^1$ complex differentiable function on a regular domain $D$.

Proposition 5. (The Maximum Principle) Let $f$ be in $A(D)$. The maximum of $|f|$ on $D$ is attained on $\partial D$.

Proof. Since $D$ is compact, the maximum of $|f|$ is attained at some point $\zeta \in D$. If there is no point on $\partial D$ at which $|f|$ attains its maximum, then not only is $\zeta \notin \partial D$, but

$$|f(\zeta)| > \max\{|f(z)| : z \in \partial D\}$$
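We shall show that this assumption leads to a contradiction. Define

$$g(z) = \frac{f(z)}{f(\zeta)}$$

Then $g \in A(D)$ also, $g(\zeta) = 1$ and $\|g\|_{\partial D} < 1$. Then $g^n \to 0$ uniformly on $\partial D$ as $n \to \infty$. Thus

$$\int_{\partial D} \frac{g^n(z)\, dz}{z - \zeta} \to 0$$

as $n \to \infty$. But, by the Cauchy integral formula, that integral is $2\pi i\, g^n(\zeta) = 2\pi i$, which does not tend to zero.

Proposition 6. Suppose $f_n$, $f$ are all in $A(D)$ and $\lim f_n = f$ uniformly on $\partial D$. Then $\lim f_n = f$ uniformly in $D$.

Proof. By assumption, $\|f_n - f\|_{\partial D} \to 0$ as $n \to \infty$. But since $f_n - f \in A(D)$, by the maximum principle, $\|f_n - f\|_D = \|f_n - f\|_{\partial D}$, so $\|f_n - f\|_D \to 0$ as $n \to \infty$ also.

Proposition 7. (Liouville's Theorem) If $f$ is bounded and complex differentiable on the entire plane, $f$ is constant.

Proof. Let $M$ be an upper bound for $|f(z)|$. Let $\zeta_1$, $\zeta_2$ be any two points on the plane. For $R$ larger than $|\zeta_1|$ and $|\zeta_2|$,

$$|f(\zeta_1) - f(\zeta_2)| = \left| \frac{1}{2\pi i} \int_{|z| = R} \left[ \frac{1}{z - \zeta_1} - \frac{1}{z - \zeta_2} \right] f(z)\, dz \right| = \frac{|\zeta_1 - \zeta_2|}{2\pi} \left| \int_{|z| = R} \frac{f(z)\, dz}{(z - \zeta_1)(z - \zeta_2)} \right|$$

$$\le \frac{M |\zeta_1 - \zeta_2|}{2\pi R} \int_0^{2\pi} \frac{d\theta}{|e^{i\theta} - \zeta_1 R^{-1}|\, |e^{i\theta} - \zeta_2 R^{-1}|}$$

As $R \to \infty$, the integrand converges to 1. Thus the entire expression on the right becomes arbitrarily small as $R \to \infty$. On the other hand, the left-hand side is independent of $R$, hence must be zero. Thus, $f(\zeta_1) = f(\zeta_2)$ for any $\zeta_1$, $\zeta_2$.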
The most important property of complex differentiable functions is that they are analytic, that is, they can be expressed as the sum of a convergent power series about any point. The following theorem brings together all the notions of analyticity and summarizes the basic properties of analytic functions.

Theorem 7.10. Let $f$ be a $C^1$ complex-valued function defined in a neighborhood of the regular domain $D$. The following assertions are equivalent:

(i) For any $\zeta \in D$, and $R$ such that the disk $\Delta(\zeta, R)$ is contained in $D$, $f$ is the sum in $\Delta(\zeta, R)$ of a convergent power series:

$$f(z) = \sum_{n=0}^{\infty} a_n (z - \zeta)^n \qquad (7.56)$$

(ii) $f$ is complex differentiable.

(iii) $f$ satisfies the Cauchy-Riemann equations:

$$\frac{\partial f}{\partial x} = -i \frac{\partial f}{\partial y}$$

(iv) $f\, dz$ is closed.

(v) For any $\zeta \in D$,

$$f(\zeta) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - \zeta}$$

In case $f$ has these properties the coefficients $a_n$ of (7.56) are given by

$$a_n = \frac{f^{(n)}(\zeta)}{n!} = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{(z - \zeta)^{n+1}} \qquad (7.57)$$

Proof. The implications (i) ⇒ (ii), (ii) ⇒ (iii) were observed in Chapter 5, (iii) ⇒ (iv) in the preceding section, and (iv) ⇒ (v) is the Cauchy integral formula (Theorem 7.9). That leaves only the implication (v) ⇒ (i), and the first part of the theorem will be proven. Suppose then that (v) holds, and $\Delta(\zeta, R) \subset D$. We have to show that $f$ can be expanded in a power series centered at $\zeta$. By hypothesis, $\min\{|z - \zeta| : z \in \partial D\} \ge R$. Thus for $w \in \Delta(\zeta, R)$,

$$\left| \frac{w - \zeta}{z - \zeta} \right| < 1$$

for all $z \in \partial D$. Thus

$$\frac{1}{z - w} = \frac{1}{z - \zeta - (w - \zeta)} = \frac{1}{z - \zeta} \Big( 1 - \frac{w - \zeta}{z - \zeta} \Big)^{-1} = \sum_{n=0}^{\infty} \frac{(w - \zeta)^n}{(z - \zeta)^{n+1}}$$

uniformly for $z \in \partial D$. We can thus substitute this sum for the term $(z - w)^{-1}$ in the Cauchy integral and, by the uniform convergence, integrate term by term:

$$f(w) = \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{z - w} = \frac{1}{2\pi i} \int_{\partial D} f(z) \sum_{n=0}^{\infty} \frac{(w - \zeta)^n}{(z - \zeta)^{n+1}}\, dz = \sum_{n=0}^{\infty} \left[ \frac{1}{2\pi i} \int_{\partial D} \frac{f(z)\, dz}{(z - \zeta)^{n+1}} \right] (w - \zeta)^n$$

Thus $f$ is represented by a power series whose coefficients are given by the integrals in (7.57). That the coefficients also are given by the successive derivatives as in (7.57) was already observed as part of Taylor's formula. Thus, the theorem is completely proven.

Examples

33. If $f$ is analytic in the disk $\Delta(\zeta, R)$, then the power series representing $f$ near $\zeta$ actually converges to $f$ in the entire disk $\Delta(\zeta, R)$. For, by Theorem 7.10, $f$ is, in this whole disk, the sum of a power series centered at $\zeta$, but such a power series is uniquely determined by $f$, so must be the given one. In particular, if $f$ is analytic in the entire plane it can be expanded in a power series converging everywhere.

34. Suppose that $f$ is analytic near $\zeta$. Then

$$\frac{f(z) - f(\zeta)}{z - \zeta} \qquad (7.58)$$

is also analytic near $\zeta$. For, we can easily factor the Taylor expansion of $f(z) - f(\zeta)$. If $f(z) = \sum_{n=0}^{\infty} a_n (z - \zeta)^n$, then

$$f(z) - f(\zeta) = \sum_{n=1}^{\infty} a_n (z - \zeta)^n = (z - \zeta) \sum_{n=0}^{\infty} a_{n+1} (z - \zeta)^n$$

so (7.58) is given by $\sum_{n=0}^{\infty} a_{n+1} (z - \zeta)^n$. In particular, $z^{-1} \sin z$ is analytic on the whole plane, and has the Taylor expansion

$$z^{-1} \sin z = \sum_{n=0}^{\infty} (-1)^n \frac{z^{2n}}{(2n + 1)!}$$
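The coefficient formula (7.57) lends itself to a numerical check. In the sketch below (the function name `taylor_coefficient` and the sample count are illustrative assumptions), the boundary integral is taken over a circle $z = \zeta + r e^{i\theta}$, where $dz = i r e^{i\theta}\, d\theta$, and approximated by an equally spaced sum in $\theta$. It is applied to $z^{-1} \sin z$ from Example 34.

```python
import cmath
import math


def taylor_coefficient(f, n, center=0.0, radius=1.0, samples=256):
    """Approximate (1/(2 pi i)) * integral of f(z) dz / (z - center)^(n+1)
    over the circle |z - center| = radius, using equally spaced points.
    With w = z - center, the integrand reduces to f(center + w) / w^n
    times d(theta) / (2 pi)."""
    total = 0.0
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        w = radius * cmath.exp(1j * theta)
        total += f(center + w) / w ** n
    return total / samples


f = lambda z: cmath.sin(z) / z  # never evaluated at z = 0 on the circle
coeffs = [taylor_coefficient(f, n) for n in range(7)]
for n, c in enumerate(coeffs):
    print(n, c)
```

The even-indexed values should reproduce $1, -1/3!, 1/5!, -1/7!$ and the odd-indexed ones should vanish, matching the Taylor expansion of $z^{-1} \sin z$ given in Example 34.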
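35. $$\int_{|z| = 1} \frac{\tan z\, dz}{z^2} = 2\pi i \left. \frac{\tan z}{z} \right|_{z = 0} = 2\pi i$$

36. $$\int_{|z - i| = 1} \frac{dz}{z^2 + 1} = \int_{|z - i| = 1} \frac{dz}{(z + i)(z - i)} = 2\pi i \left. \frac{1}{z + i} \right|_{z = i} = 2\pi i\, \frac{1}{2i} = \pi$$

37. $$\int_{|z| = 2} \frac{\sin z}{z^n}\, dz = \frac{2\pi i}{(n - 1)!} (\sin z)^{(n-1)} \Big|_{z = 0} = \begin{cases} 0 & n \text{ odd} \\[1mm] \dfrac{(-1)^{(n/2) - 1}\, 2\pi i}{(n - 1)!} & n \text{ even} \end{cases}$$

38. $$\int_{|z - \zeta| = 1} \frac{e^z}{(z - \zeta)^n}\, dz = \frac{2\pi i}{(n - 1)!} (e^z)^{(n-1)} \Big|_{z = \zeta} = \frac{2\pi i\, e^\zeta}{(n - 1)!}$$

39. $$\int_0^{2\pi} \frac{d\theta}{1 - 2a \cos\theta + a^2}, \qquad |a| < 1$$

This integral can be computed by means of Cauchy's theorem by interpreting it as an integral over the unit circle. Since

$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} = \frac{z + z^{-1}}{2}, \qquad dz = i e^{i\theta}\, d\theta = iz\, d\theta$$

on the unit circle, we may rewrite the integral as

$$\int_{|z| = 1} \Big( 1 - 2a\, \frac{z + z^{-1}}{2} + a^2 \Big)^{-1} \frac{dz}{iz} = \frac{1}{i} \int_{|z| = 1} \frac{dz}{z - az^2 - a + a^2 z}$$

$$= -\frac{1}{ia} \int_{|z| = 1} \frac{dz}{z^2 - (a + (1/a))z + 1} = -\frac{1}{ia} \int_{|z| = 1} \frac{dz}{(z - a)(z - a^{-1})} \qquad (7.59)$$

Since $|a| < 1$, the function $(z - a^{-1})^{-1}$ is analytic on the unit disk and the integral (7.59) can be computed by the Cauchy integral formula:

$$\int_{|z| = 1} \frac{dz}{(z - a^{-1})(z - a)} = 2\pi i\, \frac{1}{a - a^{-1}}$$

Thus

$$\int_0^{2\pi} \frac{d\theta}{1 - 2a \cos\theta + a^2} = -\frac{1}{ia}\, \frac{2\pi i}{a - a^{-1}} = \frac{2\pi}{1 - a^2}$$

Theory of Residues

There are many definite integrals which may be computed in similar fashion. The integral formulas of complex analysis provide a powerful technique for computing such definite integrals called the residue calculus. We shall give a brief introduction to these methods. First, a few more illustrations.

40. $$\int_0^{2\pi} \cos^6\theta\, d\theta = \int_{|z| = 1} \Big( \frac{z + z^{-1}}{2} \Big)^6 \frac{dz}{iz} = 2\pi\, \frac{5 \cdot 3 \cdot 1}{6 \cdot 4 \cdot 2}$$

41. $$\int_0^{2\pi} \frac{d\theta}{1 + \cos^2\theta} = \int_{|z| = 1} \Big[ 1 + \Big( \frac{z + z^{-1}}{2} \Big)^2 \Big]^{-1} \frac{dz}{iz} = \frac{4}{i} \int_{|z| = 1} \frac{z\, dz}{z^4 + 6z^2 + 1} \qquad (7.60)$$

We are now not in a very good position, for we cannot recognize the integrand as a Cauchy integrand. To do so we should be able to write it in the form $f(z)(z - \zeta)^{-n}$ for some function $f$ analytic on the unit disk, and $\zeta$ in the disk. But it is not of that form. The integrand is

$$\frac{z}{(z^2 + 3 + 2\sqrt{2})(z^2 + 3 - 2\sqrt{2})} = \frac{z}{z^2 + (3 + 2\sqrt{2})} \cdot \frac{1}{z + (-3 + 2\sqrt{2})^{1/2}} \cdot \frac{1}{z - (-3 + 2\sqrt{2})^{1/2}}$$

which has the form $f(z)(z - \alpha)^{-1}(z - \beta)^{-1}$ for two points $\alpha$, $\beta$ in the disk. However, we can still compute this integral by returning to the proof of Cauchy's integral formula. If $\Delta_1$, $\Delta_2$ are two small disks centered at $\alpha$, $\beta$, respectively, then $f(z)(z - \alpha)^{-1}(z - \beta)^{-1}$ is analytic in $\Delta - (\Delta_1 \cup \Delta_2)$, where $\Delta$ is the unit disk, so by Cauchy's theorem

$$\int_{\partial[\Delta - (\Delta_1 \cup \Delta_2)]} \frac{f(z)\, dz}{(z - \alpha)(z - \beta)} = 0$$

Thus, the integral (7.60) is the same as $4/i$ times

$$\int_{\partial\Delta_1} \frac{z\, dz}{(z^2 + 3 + 2\sqrt{2})(z - \beta)(z - \alpha)} + \int_{\partial\Delta_2} \frac{z\, dz}{(z^2 + 3 + 2\sqrt{2})(z - \alpha)(z - \beta)} \qquad (7.61)$$

Now these integrands are of the form $f(z)(z - \zeta)^{-1}$ with $f$ analytic on the disk and $\zeta$ in the disk, and can be evaluated by Cauchy's integral formula. (7.61) is thus

$$2\pi i \left[ \frac{\alpha}{(\alpha^2 + 3 + 2\sqrt{2})(\alpha - \beta)} + \frac{\beta}{(\beta^2 + 3 + 2\sqrt{2})(\beta - \alpha)} \right]$$

Since $\alpha = -(-3 + 2\sqrt{2})^{1/2}$, $\beta = (-3 + 2\sqrt{2})^{1/2}$, we have $\alpha^2 = \beta^2 = -3 + 2\sqrt{2}$ and $\alpha - \beta = 2\alpha$, so each term in the bracket equals $1/(8\sqrt{2})$, and we obtain the result

$$\int_0^{2\pi} \frac{d\theta}{1 + \cos^2\theta} = \frac{4}{i} \cdot 2\pi i \cdot \frac{2}{2(-3 + 2\sqrt{2} + 3 + 2\sqrt{2})} = \pi\sqrt{2}$$

The above idea of suitably generalizing the integral formula so as to accommodate a larger class of integrals is called the residue theorem. We shall now prove it in general.

Definition 10. Suppose that $f$ is analytic in a neighborhood of the point $\zeta$, except perhaps at $\zeta$. We say that $f$ has an isolated singularity at $\zeta$. The residue of such a function $f$ at $\zeta$ is defined to be

$$\operatorname{Res}(f, \zeta) = \lim_{\varepsilon \to 0} \frac{1}{2\pi i} \int_{|z - \zeta| = \varepsilon} f(z)\, dz$$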
Of course, we do not a priori know that this limit exists, and therefore that the residue is well defined. However, there is no problem: for any $\varepsilon$ and $\varepsilon'$, we have

$$\int_{|z - \zeta| = \varepsilon} f(z)\, dz = \int_{|z - \zeta| = \varepsilon'} f(z)\, dz$$

by Cauchy's theorem, since $f$ is analytic in the (regular) domain bounded by these two circles. Thus the limit certainly exists since it is independent of $\varepsilon$. Now the residue theorem says that the boundary integral of a function analytic but for isolated singularities is given by its residues, which we may calculate by the integral formula, or other available local means.

Theorem 7.11. (Residue Theorem) Suppose that $f$ is analytic on the regular domain $D$ but for isolated singularities at $\zeta_1, \ldots, \zeta_n$ in $D$. Then

$$\int_{\partial D} f(z)\, dz = 2\pi i \sum_{i=1}^{n} \operatorname{Res}(f, \zeta_i) \qquad (7.62)$$

Proof. Let $\Delta_1, \ldots, \Delta_n$ be disjoint disks centered at $\zeta_1, \ldots, \zeta_n$, respectively. Then since $f$ is analytic in $D - \bigcup_{i=1}^{n} \Delta_i$, by Cauchy's theorem

$$\int_{\partial D} f(z)\, dz = \sum_{i=1}^{n} \int_{\partial\Delta_i} f(z)\, dz$$

But the sum is just (7.62) by the definition of residue.

Examples

42. $$\int_{-\pi}^{\pi} \frac{\cos^2\theta}{1 + \sin^2\theta}\, d\theta = \int_{|z| = 1} \frac{\frac{1}{4}(z + (1/z))^2}{1 - \frac{1}{4}(z - (1/z))^2}\, \frac{dz}{iz} = \int_{|z| = 1} -\frac{1}{iz}\, \frac{z^4 + 2z^2 + 1}{z^4 - 6z^2 + 1}\, dz$$

Now the roots of the denominator are $0$ and the solutions of $z^2 = 3 \pm 2\sqrt{2}$. Since $3 - 2\sqrt{2} = (\sqrt{2} - 1)^2$, the integrand can be rewritten as

$$f(z) = \frac{i}{z}\, \frac{z^4 + 2z^2 + 1}{(z^2 - 3 - 2\sqrt{2})(z + \sqrt{2} - 1)(z - \sqrt{2} + 1)}$$

The residues to be computed are those at $0$, $\pm(\sqrt{2} - 1)$, the roots which lie inside the unit circle. The integral around each singularity is a Cauchy integral, so we need only evaluate the relevant function at the point in question. Writing $r = \sqrt{2} - 1$, so that $r^2 = 3 - 2\sqrt{2}$ and $(r^2 + 1)^2 = 8r^2$,

$$\operatorname{Res}_0 f = i$$

$$\operatorname{Res}_{r} f = \operatorname{Res}_{-r} f = \frac{i (r^2 + 1)^2}{2r^2 (r^2 - 3 - 2\sqrt{2})} = \frac{8i}{2(-4\sqrt{2})} = -\frac{i}{\sqrt{2}}$$

Thus, our integral is

$$2\pi i \Big( i - \frac{2i}{\sqrt{2}} \Big) = 2\pi(\sqrt{2} - 1)$$

It is clear that any integral of the form

$$\int_{-\pi}^{\pi} R(\cos\theta, \sin\theta)\, d\theta$$

where $R$ is a quotient of polynomials, can be handled in this way by the substitutions

$$\cos\theta = \frac{1}{2}\Big( z + \frac{1}{z} \Big) \qquad \sin\theta = \frac{1}{2i}\Big( z - \frac{1}{z} \Big) \qquad d\theta = \frac{dz}{iz}$$

The integrand then becomes a quotient of polynomials in $z$, and we need only compute the residues at the roots of the denominator which lie inside the unit circle. At such a root $r$, the integrand takes the form

$$\frac{f(z)}{g(z)(z - r)^k}$$
where $f/g$ is analytic near $r$. Thus the residue is, by Cauchy's formula,

$$\frac{1}{2\pi i} \int \frac{f(z)\, dz}{g(z)(z - r)^k} = \frac{1}{(k - 1)!} \Big( \frac{f}{g} \Big)^{(k-1)}(r)$$

The cases we have considered so far are those where $k = 1$. Here is an illustration of the more general case.

43. $$\int_{-\pi}^{\pi} \frac{d\theta}{(2 + \cos\theta)^2} = \int_{|z| = 1} \frac{1}{[2 + \frac{1}{2}(z + (1/z))]^2}\, \frac{dz}{iz} = \frac{4}{i} \int_{|z| = 1} \frac{z\, dz}{(z^2 + 4z + 1)^2}$$

The roots of the denominator of the integrand are

$$-2 + \sqrt{3} \qquad -2 - \sqrt{3}$$

These are both double roots. We need not be concerned with the root $-2 - \sqrt{3}$, since it is outside the unit disk. The integral is conveniently rewritten as

$$\frac{4}{i} \int_{|z| = 1} \frac{z\, dz}{(z + 2 + \sqrt{3})^2 (z + 2 - \sqrt{3})^2}$$

By Cauchy's formula the integral amounts to evaluating the derivative of $f(z) = z(z + 2 + \sqrt{3})^{-2}$ at $-2 + \sqrt{3}$. Now

$$f'(-2 + \sqrt{3}) = -\frac{-2 + \sqrt{3} - 2 - \sqrt{3}}{(-2 + \sqrt{3} + 2 + \sqrt{3})^3} = \frac{1}{2(27)^{1/2}}$$

Therefore, our integral is

$$\frac{4}{i} \cdot 2\pi i \cdot \frac{1}{2(27)^{1/2}} = \frac{4\pi}{(27)^{1/2}}$$

44. Occasionally, the integrand does not obligingly form itself into a Cauchy integral, and we must play around a little more:

$$\int_{-\pi}^{\pi} e^{2\cos\theta}\, d\theta = \int_{|z| = 1} \exp\Big( z + \frac{1}{z} \Big) \frac{dz}{iz} = \frac{1}{i} \int_{|z| = 1} \frac{e^z e^{1/z}}{z}\, dz$$
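The only singularity is at 0, but we cannot rearrange this in the form $f(z)\cdot z^{-k}$ with $f$ analytic near 0. Thus we must compute the integral directly by some other means. Since

$$e^{z} = \sum_{n=0}^{\infty} \frac{z^n}{n!} \qquad e^{1/z} = \sum_{m=0}^{\infty} \frac{z^{-m}}{m!}$$

we have

$$e^{z}e^{1/z} = \sum_{n\ge 0}\sum_{m\ge 0} \frac{z^{n-m}}{n!\,m!}$$

Thus

$$\int_{-\pi}^{\pi} e^{2\cos\theta}\,d\theta = \frac{1}{i}\sum_{n\ge 0}\sum_{m\ge 0} \frac{1}{n!\,m!}\int_{|z|=1} z^{n-m-1}\,dz$$

But that last integral is zero unless $n = m$, in which case it is $2\pi i$. We conclude that

$$\int_{-\pi}^{\pi} e^{2\cos\theta}\,d\theta = 2\pi\sum_{n\ge 0} \frac{1}{(n!)^2}$$

Integrals from $-\infty$ to $+\infty$

The techniques of residue calculus also apply to suitable integrals of the form

$$\int_{-\infty}^{\infty} F(x)\,dx \tag{7.63}$$

If, say, $F(z)$ is analytic but for isolated singularities at $z_1, \ldots, z_k$ in the upper half plane, then

$$\int_{\partial D} F(z)\,dz = 2\pi i\sum_{i=1}^{k} \operatorname{Res}_{z_i}(F)$$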
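whenever $D$ is a regular domain in the upper half plane containing $z_1, \ldots, z_k$. Choosing $D = D_R = \{z : |z| < R,\ \operatorname{Im} z > 0\}$, the integral is

$$\int_{-R}^{R} F(x)\,dx + \int_{H_R} F(z)\,dz$$

where $H_R$ is the boundary in the upper half plane of the disk of radius $R$. Now if $F(z) \to 0$ as $|z| \to \infty$ fast enough, the integral over $H_R$ will tend to zero and the integral from $-R$ to $R$ will tend to (7.63). We shall say that $F$ is dissipative in the upper half plane in this case. Thus we conclude that when $F$ is dissipative in the half plane $\Pi$,

$$\int_{-\infty}^{\infty} F(x)\,dx = 2\pi i\sum_{z\in\Pi} \operatorname{Res}_z(F)$$

45.
$$\int_{-\infty}^{\infty} \frac{x^2\,dx}{x^4+1} = \lim_{R\to\infty}\int_{\partial D_R} \frac{z^2\,dz}{z^4+1}$$

For

$$\left|\int_{H_R} \frac{z^2\,dz}{z^4+1}\right| = \left|\int_0^{\pi} \frac{R^2e^{2i\theta}\,iRe^{i\theta}\,d\theta}{R^4e^{4i\theta}+1}\right| \le \frac{\pi R^3}{R^4-1} \to 0$$

as $R \to \infty$. Now the roots of $z^4+1$ are $(\pm1\pm i)/\sqrt2$. Those in the upper half plane are

$$a = \frac{1+i}{\sqrt2} \qquad b = \frac{-1+i}{\sqrt2}$$

Thus

$$\operatorname{Res}_a\left(\frac{z^2}{z^4+1}\right) = \frac{a^2}{(a-b)(a-\bar a)(a-\bar b)} = \frac{i}{2\sqrt2\,(i-1)} = \frac{1-i}{4\sqrt2}$$

$$\operatorname{Res}_b\left(\frac{z^2}{z^4+1}\right) = \frac{b^2}{(b-a)(b-\bar a)(b-\bar b)} = \frac{-i}{2\sqrt2\,(1+i)} = \frac{-1-i}{4\sqrt2}$$

Finally,

$$\int_{-\infty}^{\infty} \frac{x^2\,dx}{x^4+1} = 2\pi i\left[\frac{1-i}{4\sqrt2} + \frac{-1-i}{4\sqrt2}\right] = \frac{\pi}{\sqrt2}$$

A condition on $F$ that guarantees that it is dissipative is that $F$ is the quotient of two polynomials such that the degree of the denominator is at least two more than that of the numerator (see Problem 37).

46. Compute

$$\int_{-\infty}^{\infty} \frac{e^{-iax}\,dx}{1+x^2} \qquad a > 0 \tag{7.64}$$

Now, we would hope to apply the residue theorem to $e^{-iaz}(1+z^2)^{-1}$. For $z = x+iy$, this becomes

$$\frac{e^{-iax}e^{ay}}{1+(x+iy)^2}$$

which is hardly dissipative for $y > 0$. But it is dissipative in the lower half plane:

$$\left|\int_{|z|=R,\ y<0} \frac{e^{-iaz}\,dz}{1+z^2}\right| = \left|\int_{-\pi}^{0} \frac{\exp[-iaR(\cos\theta+i\sin\theta)]\,iRe^{i\theta}\,d\theta}{1+R^2e^{2i\theta}}\right| \le \frac{R}{R^2-1}\int_{-\pi}^{0} \exp(aR\sin\theta)\,d\theta \le \frac{\pi R}{R^2-1} \to 0$$

as $R \to \infty$. Thus we compute (7.64) by residues over the lower half plane:

$$\int_{-\infty}^{\infty} \frac{e^{-iax}\,dx}{1+x^2} = -2\pi i\operatorname{Res}_{-i}\left(\frac{e^{-iaz}}{1+z^2}\right) = -2\pi i\,\frac{e^{-a}}{-2i} = \frac{\pi}{e^{a}}$$

(The sign changes since the $x$ axis is oriented opposite to the orientation it obtains as boundary of the lower half plane.) Notice, by the way, that

$$\int_{-\infty}^{\infty} \frac{\cos ax\,dx}{1+x^2} = \operatorname{Re}\int_{-\infty}^{\infty} \frac{e^{-iax}\,dx}{1+x^2} = \operatorname{Re}\int_{-\infty}^{\infty} \frac{e^{iax}\,dx}{1+x^2} = \frac{\pi}{e^{a}}$$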
7.7 The Cauchy Integral Formula 599 Since r°° sin ax dx л1" sin ax ax J-» 1 + x2 (the integrand is an odd function), we obtain " eiax dx f00 e~iax π ; dx = — a > 0 л·" e— αχ ρ1" e — ·>-«, 1 +x2 ~ J-» 1 +x2 EXERCISES 19. Perform the indicated integrations by residues: ο Γ άθ W Ln cos2 6> + 2 sin2 6> Γ" άθ (b) -L (cos2 6> + 2 sin2 6>)2 (c) LiW^r*2 r e'z dz (d) L-.wFTi) (e) Jo 5^4 cos 0 (f) ΓΊ 5'a<1 Jo 1 + a Sin Ρ cos χ dx fe) Lx(x2+a2)(x2 + b2) r" e"dx (h) J_.TT3? 0) L^ + l) χ sin χ dx
600 7 Line Integrals and Green's Theorem 20. fw(0 (к) (1) r°° dx J-» x2 + 3x + 2 r°° dx •L„l+x10 Suppose that / is analytic in a neighborhood of ζ, and = 0 Show that α(Λ = № 0<j<k -/(0 "W (z-tf is an analytic function. 21. Suppose that/is analytic in a disk centered at ζ, and all derivatives of/vanish at ζ. Then/is identically zero. 22. Suppose that /is analytic in the punctured disk 0 < \z — ζ\ < R and bounded. Then, defining / at ζ by /(Q = lim/(z) the extended function is analytic. PROBLEMS 36. If {/,} is a convergent sequence of analytic functions in the domain D, then the limit function is also analytic. 37. If P{z) where P, Q are polynomials, then F is dissipative if the degree of Q is 2 more than that of P. 38. Suppose that /is analytic in the punctured disk 0 < \z — ζ0\ < R. (a) Show that •Ίζ|=>· (■ dz 0<r<R ζ-ζο)" is independent of r.
7.7 The Cauchy Integral Formula 601 (b) Fix some r0 < R. Show that if r0 < | ζ - ζ0 \ < R, Ζπ1 JI{-{0I=J« Ζ— 4 2ττΖ J|{-{0|=r0 2— ζ (c) Expand /in a series of the form /(£) = Σ α„(ζ-ζ0Υ П= — 00 called the Laurent expansion of/, by noticing that ζ ζ-ζο (z-ζο). -1 oo = Σ (ζ-ζοΥ „tb (z - ζ0)" + 1 for |ζ-ζ0Ι=Λ, |ζ-ζ0|<Λ, and 1 f (z-ζοΥ (7.65) ζ-ζ .£*«;-ίο)"1 for |z - ζ0| = r, and |ζ - ζ0| > г. (d) Show that Resc /= α_ι. 39. Equation (7.65) can be verified in another way. Expand / in a Fourier series around each circle \z— ζ0\ =r: f(z)= Σ a&W z = re" (7.66) (a) The Cauchy-Riemann equations imply that a/ a/ (b) Differentiating (7.66), we obtain 00 0=2 (ran - node'"" n= ~ oo Conclude that a„(r) = A„ r". Thus (7.66) becomes /(z)= Σ A„r"e'"° = A0+ 2{A-z- + A„z")
40. Suppose that $f$ is one-to-one in the domain $D$. Then by the residue theorem

$$\frac{1}{2\pi i}\int_{\partial D} \frac{z\,f'(z)\,dz}{f(z)-w} = 0$$

if $w$ is not a value of $f$ in $D$. Suppose $f(a) = w$. Then

$$a = f^{-1}(w) = \frac{1}{2\pi i}\int_{\partial D} \frac{z\,f'(z)\,dz}{f(z)-w}$$

Conclude that the inverse of a one-to-one analytic function is again analytic.

7.8 Summary

Let $p \in R^n$ and suppose $f$ is an $R^m$-valued function defined in a neighborhood of $p$. $f$ is differentiable at $p$ if there is a linear transformation $T: R^n \to R^m$ such that

$$\frac{\|f(p+v)-f(p)-T(v)\|}{\|v\|} \to 0 \quad\text{as } v \to 0$$

$T$ is called the differential of $f$ at $p$ and is denoted $df(p)$. The differential is linear in the function $f$ and also satisfies

$$d\langle f, g\rangle = \langle df, g\rangle + \langle f, dg\rangle$$

Let $U$ be a domain in $R^n$. A system of coordinates on $U$ is an $n$-tuple of $C^1$ functions $y = (y^1, \ldots, y^n)$ such that
(i) if $p \ne q$, $y(p) \ne y(q)$
(ii) $dy(p)$ is nonsingular at all $p \in U$

The matrix

$$\frac{\partial(y^1, \ldots, y^n)}{\partial(x^1, \ldots, x^n)} = \left(\frac{\partial y^i}{\partial x^j}\right)$$

is called the Jacobian of the coordinate change.
The chain rule. The differentials of composed mappings compose as linear transformations:

$$d(g\circ f)(p) = dg(f(p))\circ df(p)$$

Inverse mapping theorem. Suppose $F$ is a $C^1$ $R^n$-valued function defined in a neighborhood of $p_0$ such that $dF(p_0)$ is nonsingular. Then there are neighborhoods $N$ of $p_0$ and $U$ of $F(p_0)$ and a $C^1$ mapping $G: U \to N$ such that $G = F^{-1}$.

Let $D$ be a domain in $R^n$. A differential form on $D$ is a function which associates to each point $p$ in $D$ a linear function $\omega(p)$ on $R^n$. A differential form has the form

$$\omega(p) = \sum_{i=1}^{n} a_i(p)\,dx^i(p)$$

$\omega$ is said to be $C^k$ on $D$ if all the functions $a_1, \ldots, a_n$ are $C^k$. If $\omega$ is the differential of a function we must have

$$\frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i} \qquad 1 \le i, j \le n \tag{7.67}$$

A differential form is exact if it is the differential of a function, and closed if (7.67) holds.

Suppose that $F$ is a force field defined in a domain $D$ in $R^n$, and $\Gamma$ is an oriented path defined in $D$. The work required to move a unit mass along $\Gamma$ is

$$W(\Gamma, F) = -\int_a^b \langle F(g(t)), g'(t)\rangle\,dt$$

where $g$ furnishes a parametrization of $\Gamma$. A field is conservative if $W(\Gamma, F) = 0$ over all closed paths $\Gamma$. A potential function for a field $F$ is a real-valued function $\Pi$ such that

$$W(\Gamma, F) = \Pi(p') - \Pi(p)$$

for every oriented path $\Gamma$ from $p$ to $p'$. Suppose $D$ is a domain such that any two points can be joined by a path in $D$. Then
(i) every field, conservative in $D$, has a potential function
(ii) two potentials of a given field differ by a constant
(iii) if $F = (f_1, \ldots, f_n)$ has the potential $\Pi$,

$$d\Pi = -\sum_{i=1}^{n} f_i\,dx^i$$
Line integral of a differential form. Let $\Gamma$ be an oriented path in a domain on which the form $\omega$ is defined. If $\Gamma = \sum_{i=1}^{k} \Gamma_i$, where $\Gamma_i$ is parametrized by $g_i$ on the interval $[a_i, b_i]$, define

$$\int_\Gamma \omega = \sum_{i=1}^{k} \int_{a_i}^{b_i} \omega(g_i(t))(g_i'(t))\,dt$$

If $T$ is the tangent to $\Gamma$,

$$\int_\Gamma \omega = \int_\Gamma \omega(T)\,ds$$

Let $\omega = \sum a_i\,dx^i$ be a $C^1$ differential form defined on $D$. $\omega = df$ for some function $f$
(i) if and only if the field $(a_1, \ldots, a_n)$ is conservative
(ii) if and only if $\int_\Gamma \omega = 0$ for all closed curves
(iii) only if

$$\frac{\partial a_i}{\partial x^j} = \frac{\partial a_j}{\partial x^i} \quad\text{for all } i, j$$

throughout $D$.

Poincaré's lemma. Suppose that $D$ is a domain such that for some fixed point $p_0$ in $D$ and every $p \in D$, the line segment joining $p_0$ to $p$ is contained in $D$. Then every closed form is exact in $D$.

In two dimensions a differential form has the form $\omega = p\,dx + q\,dy$. If $\omega$ is $C^1$ we shall denote the function

$$\frac{\partial q}{\partial x} - \frac{\partial p}{\partial y}$$

by $d\omega$. A regular domain in $R^2$ is bounded by a piecewise $C^1$ curve. We orient this curve so that its principal normal points into $D$ (it winds counterclockwise around $D$). When so oriented we shall denote the bounding path by $\partial D$.

Green's theorem. If $\omega$ is a $C^1$ differential form defined on the regular domain $D$,

$$\int_{\partial D} \omega = \int_D d\omega$$
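Green's theorem is easy to test numerically. The following Python sketch is our own illustration, not part of the original text: the form $\omega = -x^2y\,dx + xy^2\,dy$ on the unit disk, the function names, and the grid sizes are all assumptions chosen for convenience. For this form $d\omega = x^2 + y^2$, so both sides should come out near $\pi/2$.

```python
import math

def p(x, y):
    # coefficient of dx in the sample form (our choice)
    return -x * x * y

def q(x, y):
    # coefficient of dy in the sample form (our choice)
    return x * y * y

def boundary_integral(n=2000):
    """Integral of p dx + q dy over the unit circle, counterclockwise."""
    total = 0.0
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        theta = k * dtheta
        x, y = math.cos(theta), math.sin(theta)
        dx, dy = -math.sin(theta) * dtheta, math.cos(theta) * dtheta
        total += p(x, y) * dx + q(x, y) * dy
    return total

def d_omega(x, y, h=1e-5):
    """dq/dx - dp/dy by central differences."""
    dq_dx = (q(x + h, y) - q(x - h, y)) / (2.0 * h)
    dp_dy = (p(x, y + h) - p(x, y - h)) / (2.0 * h)
    return dq_dx - dp_dy

def interior_integral(nr=300, ntheta=300):
    """Integral of d(omega) over the unit disk, midpoint rule in polar coordinates."""
    total = 0.0
    dr, dtheta = 1.0 / nr, 2.0 * math.pi / ntheta
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(ntheta):
            theta = (j + 0.5) * dtheta
            total += d_omega(r * math.cos(theta), r * math.sin(theta)) * r * dr * dtheta
    return total

lhs = boundary_integral()   # integral of omega over the boundary
rhs = interior_integral()   # integral of d(omega) over the disk
```

The two numbers agree only up to the discretization error of the midpoint rule, so they should be compared with a tolerance rather than for equality.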
Integration under a coordinate change. Suppose

$$x = x(u, v) \qquad y = y(u, v)$$

is a coordinate change on the domain $D$ in $R^2$. Let $E$ be the domain in the $uv$ plane corresponding to $D$. If $f$ is continuous on $D$, then

$$\int_D f = \int_E f(x(u, v), y(u, v))\left|\det\frac{\partial(x, y)}{\partial(u, v)}\right| du\,dv$$

Let $\mathbf{v} = (v, w)$ be a $C^1$ vector field. The divergence of $\mathbf{v}$ is

$$\operatorname{div}\mathbf{v} = \frac{\partial v}{\partial x} + \frac{\partial w}{\partial y}$$

Divergence theorem. If $\mathbf{v}$ is a $C^1$ vector field defined on the regular domain $D$,

$$\int_{\partial D} \langle\mathbf{v}, N\rangle\,ds = \int_D \operatorname{div}\mathbf{v}$$

A $C^2$ function $f$ is the potential of a conservative divergence-free flow if and only if it is harmonic.

Cauchy's theorem. If $f$ is a $C^1$ complex differentiable function defined on the regular domain $D$, then

$$\int_{\partial D} f\,dz = 0$$

Cauchy integral formula. Under the same hypotheses on $f$, if $\zeta \in D$,

$$f(\zeta) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(z)\,dz}{z-\zeta}$$

Maximum principle. If $f$ is analytic on $D$, $|f|$ attains its maximum on $\partial D$.

Theorem. Let $f$ be a $C^1$ complex-valued function defined on the regular domain $D$. The following assertions are equivalent:
(i) for any $\zeta \in D$, and some $R$ such that $\Delta(\zeta, R) \subset D$, $f$ is the sum in $\Delta(\zeta, R)$
of a convergent power series

$$f(z) = \sum_{n=0}^{\infty} a_n(z-\zeta)^n \tag{7.68}$$

(ii) replace the word some in (i) by any
(iii) $f$ is complex differentiable
(iv) $f$ satisfies the Cauchy-Riemann equations

$$\frac{\partial f}{\partial y} = i\,\frac{\partial f}{\partial x}$$

(v) $f\,dz$ is closed
(vi) for any $\zeta \in D$

$$f(\zeta) = \frac{1}{2\pi i}\int_{\partial D} \frac{f(z)\,dz}{z-\zeta}$$

In case $f$ has these properties ($f$ is analytic), the coefficients $a_n$ of (7.68) are given by

$$a_n = \frac{f^{(n)}(\zeta)}{n!} = \frac{1}{2\pi i}\int_{\partial D} \frac{f(z)\,dz}{(z-\zeta)^{n+1}}$$

If $f$ is analytic in $\{0 < |z-z_0| < R\}$, we say that $f$ has an isolated singularity at $z_0$. In this case the integrals

$$\frac{1}{2\pi i}\int_{|z-z_0|=r} f(z)\,dz$$

are all the same for $0 < r < R$. Their common value is the residue of $f$ at $z_0$, denoted $\operatorname{Res}(f, z_0)$.

Residue theorem. If $f$ is an analytic function on the regular domain $D$, except for isolated singularities at $z_1, \ldots, z_n$ in $D$, then

$$\int_{\partial D} f(z)\,dz = 2\pi i\sum_{i=1}^{n} \operatorname{Res}(f, z_i)$$
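As a sanity check on the residue theorem, the contour integral of Example 43 can be evaluated numerically and compared with the value obtained from the residue at $-2+\sqrt3$. The Python sketch below is ours, not the author's: the helper name `contour_integral` and the number of sample points are assumptions, and the trapezoid rule in the angle is simply one convenient way to approximate an integral over a circle.

```python
import cmath
import math

def contour_integral(f, center=0j, radius=1.0, n=2000):
    """Approximate the integral of f over |z - center| = radius,
    traversed counterclockwise, by the trapezoid rule in the angle."""
    total = 0j
    dtheta = 2.0 * math.pi / n
    for k in range(n):
        w = cmath.exp(1j * k * dtheta)
        z = center + radius * w
        dz = 1j * radius * w * dtheta
        total += f(z) * dz
    return total

def integrand(z):
    # Example 43 after the substitution z = exp(i*theta):
    # (4/i) * z / (z^2 + 4z + 1)^2 on the unit circle
    return (4.0 / 1j) * z / (z * z + 4.0 * z + 1.0) ** 2

numeric = contour_integral(integrand)

# Residue at the double root r = -2 + sqrt(3): (4/i) * f'(r),
# where f(z) = z * (z + 2 + sqrt(3))**(-2).
r = -2.0 + math.sqrt(3.0)
f_prime = (2.0 + math.sqrt(3.0) - r) / (r + 2.0 + math.sqrt(3.0)) ** 3
by_residues = 2j * math.pi * (4.0 / 1j) * f_prime

exact = 4.0 * math.pi / math.sqrt(27.0)
```

Here `numeric` and `by_residues` are complex numbers whose imaginary parts should vanish up to rounding; both should agree with `exact`.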
FURTHER READING

The general theorems on differentiation in $R^n$ are fully discussed in:

H. K. Nickerson, N. Steenrod, D. C. Spencer, Advanced Calculus, D. Van Nostrand Company, Inc., Princeton, N.J., 1957.
M. E. Munroe, Modern Multidimensional Calculus, Addison-Wesley, Reading, Mass., 1963.
L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading, Mass., 1968.

For further information on complex analytic functions see:

Z. Nehari, Introduction to Complex Analysis, Allyn and Bacon, Inc., Boston, 1961.
H. Cartan, Elementary Theory of Analytic Functions of One or Several Complex Variables, Addison-Wesley, Reading, Mass., 1963.
E. Hille, Analytic Function Theory, Ginn and Company, Boston, 1959.
L. Ahlfors, Complex Analysis, McGraw-Hill, New York, 1953.

MISCELLANEOUS PROBLEMS

41. Prove the assertion concerning integration under a coordinate change as given in the summary (where no reference to the orientation is made).

42. Show that if $\omega$ is a differential form of compact support in $R^2$, then

$$\int_{R^2} d\omega = 0$$

43. Recall the definition of connectedness given in Problem 78 of Chapter 2. Show that a domain in $R^2$ is connected if and only if it is pathwise connected.

44. If $\omega = p\,dx + q\,dy$ is a $C^1$ form, define $*\omega = -q\,dx + p\,dy$.
(a) Show that for any regular domain $D$,

$$\int_{\partial D} \omega(N)\,ds = \int_D d{*\omega}$$

where $N$ is the interior normal to $D$.
(b) Show that the function $u$ is harmonic if and only if $d{*du} = 0$.
(c) $\omega$ is (locally) the differential of a harmonic function if and only if $d\omega = 0$, $d{*\omega} = 0$.

45. If $u$ is a harmonic function in the domain $D$ and if $*du$ is exact in $D$, then $u$ is the real part of an analytic function in $D$.
46. If $u$ is harmonic in $D$, and $\Gamma$ is a closed path in $D$, the integral

$$\frac{1}{2\pi}\int_\Gamma {*du}$$

is called the period of $u$ about $\Gamma$. Show that $u$ has zero periods about all paths if and only if $u$ is the real part of an analytic function. Show that $\exp(u)$ is the modulus of an analytic function if and only if $u$ has integer periods.

47. Let $D = R^2 - \{p_1, \ldots, p_s\}$, where $p_1, \ldots, p_s$ are $s$ distinct points in the plane. Show that there is an $s$-dimensional space $L$ of harmonic functions which are not the real part of an analytic function in $D$ such that every harmonic function has the form

$$u = u_1 + \operatorname{Re} f, \qquad u_1 \text{ in } L, \quad f \text{ analytic in } D$$

(Recall Problem 33.)

48. The Gamma function. Define

$$\Gamma(z) = \int_0^\infty \exp[(z-1)\ln t - t]\,dt = \int_0^\infty t^{z-1}e^{-t}\,dt$$

(a) Show that $\Gamma(n+1) = n!$
(b) Show by integration by parts that $\Gamma(z+1) = z\Gamma(z)$
(c) Show that $\Gamma$ is an analytic function in the half plane $\{\operatorname{Re} z > 1\}$ (differentiate under the integral sign).

49. (a) Show that for any $a > 0$ the function

$$\Gamma_a(z) = \int_a^\infty t^{z-1}e^{-t}\,dt$$

is analytic on the entire plane.
(b) Substitute

$$e^{-t} = \sum_{n=0}^{\infty} \frac{(-1)^n t^n}{n!}$$

into the integral

$$\int_0^a t^{z-1}e^{-t}\,dt$$
7.8 Summary 609 to obtain the formula r(z)=2 тгтЧ + г^) n=o n!(z+ n) Justify that substitution. (c) If Re ζ > 1, does lim Γ„(ζ) = Γ(ζ) as a -* 0 ? (d) Use the result of part (b) to extend Γ to a function analytic on the entire plane, but for isolated singularities at 0, — 1, -2, .... (e) Calculate the residue of Γ at those points. 50. Find the residue at the origin of exp^z + -j 51. Compute the Fourier transform of (1 + x2)-1: find 1 f°° e'lx (use Example 46). 52. Compute the Fourier transforms of these functions: (a) (1+x*)-1. (b) (l+xTl(a2 + x2)-1· (c) (ΤΤΪν- cos χ (d) oner 53. Suppose {/,} is a sequence of analytic functions in D, and lim/, =/ uniformly in D. Show that /is analytic. 54. Prove: If /is C" in D and /dz = 0 for all disks Δ contained in D, J SA then /is analytic. 55. Morera's theorem. Suppose / is a continuous complex-valued function defined in D such that ί fdz = 0 over every closed path Γ in D. Then /is analytic. (#mr: Let F be a potential function for fdz and show that F is complex differentiable.) 56. If/= и + ίν is an analytic function in the domain D, then и is the potential of a divergence-free velocity field. Show that the curves {v = constant} are the path lines of the associated flow.
610 7 Line Integrals and Green's Theorem 57. Let / be analytic in the domain {0 < \z — z01 < Щ f is said to be meromorphic at z0 if there is a function g analytic in a neighborhood of z0 such that/· g extends analytically across z0. Verify that these are equivalent conditions for meromorphicity. (a) the Laurent expansion (7.65) of / about z0 has only finitely many negative terms. (b) there is an η such that (z — z0)n/ extends analytically across z0. 58. Show that if/is analytic in the domain D except for isolated singularities aXpi,...,p,, where it is meromorphic, then there is a polynomial Ρ such that /· Ρ extends analytically to all of D. 59. If/is meromorphic at z0, is exp(/) also meromorphic there? 60. Schwarz's lemma. Suppose that / is analytic on the disk {z e C: |z|^l}, and (i) max{|/(z)|:|z| = l} = M (n)/(0)=0 Show that for any ζ in that disk \f(z)\<M\z\ (Hint: Apply the maximum principle to z"1/) 61. Under the same hypotheses as above show that 1/'(0)|<П and if |/'(0)| = 1, then/(z) = cz for some constant с of modulus 1. 62. Let / be in S(R), and suppose that fit) = 0 for negative t. Show that V2wJo is an analytic function for ζ in the upper half plane. Notice that f(\y) = 63. Suppose that/is analytic and dissipative in the upper half plane and f is in S(R) on the real axis Show that there is a function g e S(R) with git) = 0 for negative t such that /(z) = giz). (Hint: Let m=h(j j^e'l"dt) Then, by Fourier inversion, g and /are analytic in the upper half plane and have the same values on the real axis. Verify that g(t) = 0 for negative t by Cauchy's theorem.)
Chapter 8

POTENTIAL THEORY IN THREE DIMENSIONS

The theory of the preceding chapter, when generalized to three or more dimensions, becomes considerably more complicated. The development of this theory during the 19th century was motivated to a considerable extent by physical intuition. The study of fields of force and velocity of fluid flows led to the theorems on integration in several variables which are in this chapter. More modern expositions of this material lean heavily on algebraic developments of the late 19th and early 20th centuries. Although the mathematics has significantly improved with the introduction of the notions of differential forms and invariance, the intuition provided by concrete interpretations has been lost. We shall lean heavily on the interpretation by fluid flows, thereby sacrificing some mathematical rigor for a little bit of concreteness.

We certainly should point out that the importance of the subject of differential forms by far transcends its use in putting the divergence theorem on firm ground. This theory has had major impact on all branches of modern research mathematics and physics. We have, however, chosen to complete our story rather than begin to suggest a new one.

A fluid flow is given by a function $\phi(\mathbf{x}_0, t)$ defined for $\mathbf{x}_0$ in some domain $D$ in $R^3$ and $t$ on an interval in $R$ about the origin. We require that
(i) $\phi$ is continuously differentiable in all variables,
(ii) $\phi(\mathbf{x}_0, 0) = \mathbf{x}_0$, all $\mathbf{x}_0 \in D$,
(iii) for fixed $t$, the transformation $\mathbf{x}_0 \to \phi(\mathbf{x}_0, t)$ is one-to-one and has a nonsingular differential.
The value $\phi(\mathbf{x}_0, t)$ represents the space position at time $t$ of the particle which was at $\mathbf{x}_0$ at time $t = 0$. We shall refer to $\mathbf{x}_0$ as the particle coordinate and to $\mathbf{x} = \phi(\mathbf{x}_0, t)$ as the space coordinate. Condition (ii) asserts that the particle and space coordinates coincide at $t = 0$. Condition (iii) asserts that the relation between particle and space coordinates at any time $t$ is invertible: we can recapture the initial position of a particle from its position at any time. We shall denote the inverse of $\phi$ by $\psi$: $\mathbf{x} = \phi(\mathbf{x}_0, t)$ if and only if $\mathbf{x}_0 = \psi(\mathbf{x}, t)$.

The curve given by $\mathbf{x} = \phi(\mathbf{x}_0, t)$ is the path of motion of the particle $\mathbf{x}_0$. The velocity of $\mathbf{x}_0$ at time $t$ is, of course, $(\partial\phi/\partial t)(\mathbf{x}_0, t)$. If we fix the time $t$, the collection of velocity vectors forms a field, denoted by $\mathbf{v}(\mathbf{x}, t)$ (referring of course to spatial coordinates), called the velocity field of the flow. $\mathbf{v}(\mathbf{x}, t)$ is the velocity of the particle at $\mathbf{x}$ at time $t$. We have already noted that

$$\mathbf{v}(\mathbf{x}, t) = \left.\frac{\partial\phi(\mathbf{x}_0, t)}{\partial t}\right|_{\mathbf{x}_0 = \psi(\mathbf{x}, t)} \tag{8.1}$$

If the velocity field is independent of time, we say that the flow is steady. The velocity field of a flow completely determines the flow: the path of motion $\mathbf{x} = \mathbf{u}(t)$ of a particle $\mathbf{x}_0$ is the solution of the differential equation

$$\frac{d\mathbf{u}}{dt} = \mathbf{v}(\mathbf{u}, t) \qquad \mathbf{u}(0) = \mathbf{x}_0 \tag{8.2}$$

By (8.1) the solution is given by $\mathbf{u}(t) = \phi(\mathbf{x}_0, t)$, for (8.1) can be rewritten as

$$\mathbf{v}(\phi(\mathbf{x}_0, t), t) = \frac{\partial\phi(\mathbf{x}_0, t)}{\partial t}$$

Thus the equation of flow is recaptured from the velocity field by solving Equation (8.2).

This introduction recapitulates what we have already learned about fluid flows. In the subsequent section we shall develop the mathematics required to study the evolution through time of a given mass of fluid. We shall see that the various laws of conservation of physics (mass, energy) correspond to mathematical theorems (divergence theorem, Stokes' theorem).
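The recovery of a flow from its velocity field can be tried out numerically. The sketch below is our own illustration under stated assumptions: the steady field $\mathbf{v}(x, y, z) = (-y, x, 1)$ is a hypothetical example (rigid rotation about the $z$ axis plus a unit drift), and the classical fourth-order Runge-Kutta method is just one standard way to solve the initial value problem $d\mathbf{u}/dt = \mathbf{v}(\mathbf{u}, t)$, $\mathbf{u}(0) = \mathbf{x}_0$. For this field the flow is known in closed form, which gives something to compare against.

```python
import math

def velocity(x, t):
    # hypothetical steady velocity field: rotation about the z axis plus unit drift
    return (-x[1], x[0], 1.0)

def flow(x0, t, steps=1000):
    """Approximate phi(x0, t) by integrating du/dt = v(u, t), u(0) = x0,
    with the classical fourth-order Runge-Kutta method."""
    h = t / steps
    u = tuple(x0)
    s = 0.0
    for _ in range(steps):
        k1 = velocity(u, s)
        k2 = velocity(tuple(a + 0.5 * h * b for a, b in zip(u, k1)), s + 0.5 * h)
        k3 = velocity(tuple(a + 0.5 * h * b for a, b in zip(u, k2)), s + 0.5 * h)
        k4 = velocity(tuple(a + h * b for a, b in zip(u, k3)), s + h)
        u = tuple(a + h * (b + 2.0 * c + 2.0 * d + e) / 6.0
                  for a, b, c, d, e in zip(u, k1, k2, k3, k4))
        s += h
    return u

def exact_flow(x0, t):
    # closed-form flow of the sample field, used only for comparison
    c, s = math.cos(t), math.sin(t)
    return (x0[0] * c - x0[1] * s, x0[0] * s + x0[1] * c, x0[2] + t)

approx = flow((1.0, 0.0, 0.0), 1.0)
expected = exact_flow((1.0, 0.0, 0.0), 1.0)
```

Note that `flow(x0, 0.0)` returns `x0` itself, which is condition (ii) in the definition of a fluid flow.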
8.1 Divergence and the Equation of Continuity

Let us begin with a fluid flowing through a domain in $R^3$ according to the equation $\mathbf{x} = \phi(\mathbf{x}_0, t)$. According to reasonable physical assumptions, if we define the density at a point $p$ as the limit

$$\rho(p) = \lim_{\Delta\to p} \frac{\text{mass } \Delta}{\text{vol } \Delta}$$

as the domain $\Delta$ shrinks uniformly down to $p$, then the mass of any domain is given by integration of the density function $\rho$. In our case, that of a fluid in motion, we shall express the density of the fluid at the point $\mathbf{x}$ at time $t$ as $\rho(\mathbf{x}, t)$. Thus, for any domain $D$, the mass of fluid in $D$ at time $t$ is

$$\int_D \rho(\mathbf{x}, t)\,dV$$

We can also consider the density at a particle: $\rho(\phi(\mathbf{x}_0, t), t)$ is the density of the fluid at time $t$ at the particle (originally at) $\mathbf{x}_0$. (More generally, we always have this option of referring measurable quantities to either the spatial or the particle coordinates. This option is a source of some confusion, but it also deepens our understanding.)

The law of conservation of matter asserts that the mass of a given object is independent of time. If we fix a domain $D$, the space occupied at time $t$ by the fluid originally in $D$ is the domain $D_t = \{\phi(\mathbf{x}_0, t) : \mathbf{x}_0 \in D\}$. The mass of fluid in $D_t$ is

$$\int_{D_t} \rho(\mathbf{x}, t)\,dV$$

Since mass must be conserved, this must be independent of $t$. Thus the law of conservation of mass can be expressed by this equation:

$$\frac{d}{dt}\int_{D_t} \rho(\mathbf{x}, t)\,dV = 0 \tag{8.3}$$

for any domain $D$. We would prefer to state this as an equation involving functions of points, rather than domains. In order to do that we must know how to carry through the differentiation implied in (8.3). The problem with
(8.3) is that we have a variable domain of integration. This can be solved by replacing that integral by one over $D$. We shall now briefly interrupt this discussion with a description of the formula for change of variables in an integral. This will allow us to compute (8.3).

Suppose now that we are given a one-to-one transformation $\mathbf{y} = F(\mathbf{x})$ of a domain $D$ onto a domain $\Delta$. We assume that $F$ is continuously differentiable, and its differential is everywhere nonsingular. We shall require also that $dF(\mathbf{x})$ is orientation-preserving: that is, that it maps the standard basis $E_1 \to E_2 \to E_3$ into a right-handed system. Writing $\mathbf{x} = (x^1, x^2, x^3)$, $\mathbf{y} = (y^1, y^2, y^3)$, the image of $E_i$ under the linear transformation $dF(\mathbf{x})$ is just $(\partial F/\partial x^i)(\mathbf{x})$. Thus we require that

$$\frac{\partial F}{\partial x^1}(\mathbf{x}) \to \frac{\partial F}{\partial x^2}(\mathbf{x}) \to \frac{\partial F}{\partial x^3}(\mathbf{x})$$

be a right-handed system, which is the same as asking that

$$\det\frac{\partial(y^1, y^2, y^3)}{\partial(x^1, x^2, x^3)} > 0$$

With these hypotheses we have the following formula for integration under the change of variable $F$. If $f$ is an integrable function on $\Delta$, then

$$\int_\Delta f(\mathbf{y})\,dV = \int_D f(F(\mathbf{x}))\det\frac{\partial(y^1, y^2, y^3)}{\partial(x^1, x^2, x^3)}\,dV \tag{8.4}$$

We shall defer the derivation of this formula to the end of this section. The motivating idea is that it is true in the small: if the function $f$ is constant, the transformation $F$ is a linear transformation, and $D$ is a rectangle, then (8.4) just says that the volume of the parallelepiped $F(D)$ is $\det F\cdot\operatorname{vol}(D)$ (an easily verified fact). The general case follows by locally approximating by this case and summing over the whole domain.

Examples

1. Find $\int_B x^2y^4\,dV$, where $B$ is the unit ball. We use spherical coordinates for this computation:

$$x = r\sin\theta\cos\varphi \qquad y = r\sin\theta\sin\varphi \qquad z = r\cos\theta$$

$$\frac{\partial(x, y, z)}{\partial(r, \theta, \varphi)} = \begin{pmatrix} \sin\theta\cos\varphi & r\cos\theta\cos\varphi & -r\sin\theta\sin\varphi \\ \sin\theta\sin\varphi & r\cos\theta\sin\varphi & r\sin\theta\cos\varphi \\ \cos\theta & -r\sin\theta & 0 \end{pmatrix}$$
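so

$$\det\frac{\partial(x, y, z)}{\partial(r, \theta, \varphi)} = r^2\sin\theta$$

Since $x^2y^4 = r^6\sin^6\theta\cos^2\varphi\sin^4\varphi$,

$$\int_B x^2y^4\,dV = \int_0^1\int_{-\pi}^{\pi}\int_0^{\pi} r^8\sin^7\theta\cos^2\varphi\sin^4\varphi\,d\theta\,d\varphi\,dr = \int_0^1 r^8\,dr\cdot\int_{-\pi}^{\pi} \cos^2\varphi\sin^4\varphi\,d\varphi\cdot\int_0^{\pi} \sin^7\theta\,d\theta = \frac19\cdot\frac{\pi}{8}\cdot\frac{32}{35} = \frac{4\pi}{315}$$

2. $\int_D (x^2-y^2)\,dx\,dy$, where $D = \{0 < x < 1,\ x-1 < y < x\}$, becomes

$$\frac12\int_0^1\int_{-u}^{2-u} uv\,dv\,du = \frac16$$

under the change of variable $u = x-y$, $v = x+y$. (The factor $\frac12$ is the determinant of $\partial(x, y)/\partial(u, v)$.)

3. $\int_B (x^2+y^2+z^2)\,dx\,dy\,dz$, where $B$ is the domain

$$B = \{x^2+y^2 \le 1,\ 0 \le z \le 2\}$$

This can be easily computed in cylindrical coordinates:

$$\int_B (x^2+y^2+z^2)\,dx\,dy\,dz = \int_0^2\int_0^1\int_0^{2\pi} (r^2+z^2)\,r\,d\theta\,dr\,dz = 2\pi\int_0^1\left(2r^3+\frac83 r\right)dr = \frac{11\pi}{3}$$

We return to our fluid flow given by $\mathbf{x} = \phi(\mathbf{x}_0, t)$. We shall express it, for the sake of computation, in coordinates:

$$(x^1, x^2, x^3) = \phi(x_0^1, x_0^2, x_0^3, t) \tag{8.5}$$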
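Since (8.5) reduces to the identity for $t = 0$, we have

$$\left.\frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}\right|_{t=0} = I \tag{8.6}$$

Thus, the determinant

$$J_t(\mathbf{x}_0) = \det\frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}$$

is positive for all small $t$, so we can apply the change of variable formula to the computation of (8.3) for fixed small $t$. We now have the mass conservation law expressed by

$$0 = \frac{d}{dt}\int_{D_t} \rho(\mathbf{x}, t)\,dV = \frac{d}{dt}\int_D \rho(\phi(\mathbf{x}_0, t), t)\,J_t\,dV = \int_D \frac{\partial}{\partial t}(\rho J_t)\,dV$$

(The final equation follows since differentiation under the integral is now allowable.) Since this must be true for every domain $D$, the integrand is identically zero:

$$\frac{\partial}{\partial t}(\rho J_t) = 0 \tag{8.7}$$

We can explicitly compute that derivative for $t = 0$, using (8.6). First, let us consider

$$\left.\frac{\partial J_t}{\partial t}\right|_{t=0} = \left.\frac{\partial}{\partial t}\det\frac{\partial(x^1, x^2, x^3)}{\partial(x_0^1, x_0^2, x_0^3)}\right|_{t=0} \tag{8.8}$$

The determinant is the usual sum of products of the various partial derivatives $\partial x^i/\partial x_0^j$. The derivative of such a product will have three terms, in each one of which only one factor is differentiated with respect to $t$. Each term is of the form

$$\frac{\partial}{\partial t}\left(\frac{\partial r^1}{\partial s^1}\right)\frac{\partial r^2}{\partial s^2}\,\frac{\partial r^3}{\partial s^3} \tag{8.9}$$

where $\{r^1, r^2, r^3\}$ is a permutation of $\{x^1, x^2, x^3\}$, and $\{s^1, s^2, s^3\}$ a permutation of $\{x_0^1, x_0^2, x_0^3\}$. According to (8.6), at $t = 0$,

$$\frac{\partial r}{\partial s} = 0 \ \text{ if } s \ne r_0 \qquad \frac{\partial r}{\partial s} = 1 \ \text{ if } s = r_0$$

where $r_0$ denotes the particle coordinate corresponding to the space coordinate $r$. Thus the only relevant terms (8.9) are those where $s^2 = r_0^2$, $s^3 = r_0^3$ and, a fortiori, $s^1 = r_0^1$. Finally, by the equality of mixed partial derivatives,

$$\left.\frac{\partial}{\partial t}\left(\frac{\partial x^i}{\partial x_0^i}\right)\right|_{t=0} = \left.\frac{\partial}{\partial x_0^i}\left(\frac{\partial x^i}{\partial t}\right)\right|_{t=0} = \frac{\partial v^i}{\partial x_0^i}$$

where $\mathbf{v} = (v^1, v^2, v^3)$ is the velocity field of the flow (recall Equation (8.1)). Thus, the computation of (8.8) is complete: there are only three relevant terms, for $r^1 = x^1, x^2, x^3$, respectively, and we have

$$\left.\frac{\partial J_t}{\partial t}\right|_{t=0} = \frac{\partial v^1}{\partial x_0^1} + \frac{\partial v^2}{\partial x_0^2} + \frac{\partial v^3}{\partial x_0^3} \tag{8.10}$$

Definition 1. Let $\mathbf{v} = (v^1, v^2, v^3)$ be a differentiable vector field defined in a domain in $R^3$. The divergence of $\mathbf{v}$ is the function defined by

$$\operatorname{div}\mathbf{v} = \sum_{i=1}^{3} \frac{\partial v^i}{\partial x^i}$$

The name will appear presently to be justified. We now summarize our discussion in the following assertion.

Proposition 1. (Equation of Continuity) Let $\mathbf{v}(\mathbf{x}, t)$ be the velocity field of a fluid flow, and $\rho(\mathbf{x}, t)$ its density. The law of mass conservation takes this form:

$$\frac{\partial\rho}{\partial t} + \operatorname{div}(\rho\mathbf{v}) = \frac{\partial\rho}{\partial t} + \sum_{j=1}^{3} v^j\frac{\partial\rho}{\partial x^j} + \rho\operatorname{div}\mathbf{v} = 0 \tag{8.11}$$

Proof. Referring to the preceding discussion we have seen from (8.7) that the law of mass conservation asserts that

$$\frac{\partial}{\partial t}\bigl(\rho(\phi(\mathbf{x}_0, t), t)\,J_t(\mathbf{x}_0)\bigr) = 0$$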
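for all $t$, $\mathbf{x}_0$. Evaluating at $t = 0$, this becomes

$$0 = \left.\frac{\partial}{\partial t}\bigl(\rho(\phi(\mathbf{x}_0, t), t)\bigr)\right|_{t=0}\cdot J_0(\mathbf{x}_0) + \rho(\phi(\mathbf{x}_0, 0), 0)\cdot\left.\frac{\partial J_t}{\partial t}(\mathbf{x}_0)\right|_{t=0}$$

$$= \sum_{j=1}^{3} \frac{\partial\rho}{\partial x^j}(\mathbf{x}_0, 0)\,\frac{\partial x^j}{\partial t}(\mathbf{x}_0, 0) + \frac{\partial\rho}{\partial t}(\mathbf{x}_0, 0) + \rho(\mathbf{x}_0, 0)\operatorname{div}\mathbf{v}(\mathbf{x}_0, 0) \tag{8.12}$$

The second expression follows from our computation above terminating in (8.10), and the fact that $J_0(\mathbf{x}_0) = 1$, $\mathbf{x}_0 = \phi(\mathbf{x}_0, 0)$. Now, we could have started our clock at any time; there is nothing special about the time $t = 0$ except that our formulas are most easily computed there. Thus, (8.12) must hold for all $(\mathbf{x}, t)$ since it is valid for all $(\mathbf{x}_0, 0)$. Thus (8.11) is true. We leave the first equality as an exercise.

Equation (8.11) can be referred to the particle coordinates of the motion:

$$\left.\frac{\partial\rho}{\partial t}\right|_{\mathbf{x}=\phi(\mathbf{x}_0, t)} + \sum_{j=1}^{3} \frac{\partial x^j}{\partial t}\left.\frac{\partial\rho}{\partial x^j}\right|_{\mathbf{x}=\phi(\mathbf{x}_0, t)} + \rho(\phi(\mathbf{x}_0, t), t)\operatorname{div}\frac{\partial\mathbf{x}}{\partial t}(\mathbf{x}_0, t) = 0$$

which compresses into

$$\frac{d}{dt}\rho(\phi(\mathbf{x}_0, t), t) + \rho(\phi(\mathbf{x}_0, t), t)\operatorname{div}\frac{\partial\mathbf{x}}{\partial t}(\mathbf{x}_0, t) = 0 \tag{8.13}$$

This relates the time rate of change of density at a particle with the rate of change of its position.

A fluid flow is called incompressible if the same mass always occupies the same volume. For an incompressible fluid flow we must therefore have that $\int_{D_t} dV$ is constant for any initial domain $D$. Thus

$$0 = \frac{d}{dt}\int_{D_t} dV = \frac{d}{dt}\int_D J_t\,dV = \int_D \frac{\partial J_t}{\partial t}\,dV = \int_D \operatorname{div}\mathbf{v}\,dV \tag{8.14}$$

for every domain $D$. Thus $\operatorname{div}\mathbf{v} = 0$ is the necessary and sufficient condition for a flow to be incompressible. By the equation of continuity (in the form (8.13)) this is the same as asking that the density at a particle is also independent of time.

Corollary 1. $\mathbf{v}$ is the velocity of flow of an incompressible fluid if and only if $\operatorname{div}\mathbf{v} = 0$.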
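The identity (8.10), which says that the time derivative of the Jacobian determinant $J_t$ at $t = 0$ is the divergence of the velocity field, can be checked by finite differences. The Python sketch below is ours and rests on assumptions: the sample flow `phi` is a hypothetical choice, the step sizes are ad hoc, and all derivatives are approximated by central differences, so agreement is only up to a small numerical error.

```python
import math

def phi(x0, t):
    # hypothetical sample flow; phi(x0, 0) = x0 as required of a fluid flow
    x, y, z = x0
    return (x * (1.0 + t) + t * y, y * (1.0 - t) + t * x, z * math.exp(t))

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def jacobian_det(x0, t, h=1e-5):
    """J_t(x0): determinant of the matrix of partials of phi in x0 (central differences)."""
    cols = []
    for j in range(3):
        xp, xm = list(x0), list(x0)
        xp[j] += h
        xm[j] -= h
        fp, fm = phi(xp, t), phi(xm, t)
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fp, fm)])
    # cols holds the transpose of the Jacobian; the determinant is unchanged
    return det3(cols)

def dJ_dt_at_t0(x0, h=1e-4):
    return (jacobian_det(x0, h) - jacobian_det(x0, -h)) / (2.0 * h)

def velocity_at_t0(x0, h=1e-5):
    """v(x0, 0): time derivative of phi at t = 0."""
    fp, fm = phi(x0, h), phi(x0, -h)
    return tuple((a - b) / (2.0 * h) for a, b in zip(fp, fm))

def divergence_at_t0(x0, h=1e-4):
    total = 0.0
    for j in range(3):
        xp, xm = list(x0), list(x0)
        xp[j] += h
        xm[j] -= h
        total += (velocity_at_t0(xp)[j] - velocity_at_t0(xm)[j]) / (2.0 * h)
    return total

point = (0.3, -0.7, 1.2)
lhs = dJ_dt_at_t0(point)        # left side of (8.10)
rhs = divergence_at_t0(point)   # right side of (8.10)
```

For this particular flow both quantities should be close to 1, since $J_t = e^t(1-2t^2)$ and $\mathbf{v}(\mathbf{x}_0, 0) = (x_0+y_0,\ x_0-y_0,\ z_0)$.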
Corollary 2. The fluid is incompressible if and only if the density at a particle is constant under all flows of the fluid.

Now the integral $\int_D \operatorname{div}\mathbf{v}\,dV$ is the rate of expansion of the fluid in $D$, according to our computation (8.14). (Hence, the name divergence.) We could also calculate the "infinitesimal expansion" of $D$ by calculating the amount of fluid which enters during an "infinitesimal" amount of time, and subtracting from it the amount of fluid that leaves. The mathematical expression of this will be an integral over the boundary of the domain $D$. The fact that this is the same as $\int_D \operatorname{div}\mathbf{v}\,dV$ is the divergence theorem, which is a fundamental fact in calculus. We shall return to this theorem and its implications in Section 8.5.

Examples

4. Consider the flow given by the equations

$$x = x_0(1+t) + ty_0 \qquad y = y_0(1-t) + tx_0 \qquad z = z_0e^t$$

If $D$ is the original position of a mass of fluid,

$$D_t = \{(x_0(1+t) + ty_0,\ y_0(1-t) + tx_0,\ z_0e^t) : (x_0, y_0, z_0) \in D\}$$

and the volume of $D_t$ is

$$\int_{D_t} dV = \int_D \det\frac{\partial(x, y, z)}{\partial(x_0, y_0, z_0)}\,dV = \int_D e^t(1-2t^2)\,dV = e^t(1-2t^2)\operatorname{vol}(D)$$

Now $\frac{d}{dt}\operatorname{vol}(D_t) = \int_{D_t} \operatorname{div}\mathbf{v}\,dV$ for every domain $D$. Since the flow is linear in $\mathbf{x}_0$, $\operatorname{div}\mathbf{v}$ depends only on $t$, so that $\frac{d}{dt}\operatorname{vol}(D_t) = \operatorname{div}\mathbf{v}\cdot\operatorname{vol}(D_t)$. Thus, for small $t$,

$$\operatorname{div}\mathbf{v}(\mathbf{x}, t) = \frac{\frac{d}{dt}\bigl[e^t(1-2t^2)\bigr]}{e^t(1-2t^2)} = \frac{1-4t-2t^2}{1-2t^2}$$

5. For this flow:

$$x = x_0e^t \qquad y = y_0e^{-t} \qquad z = z_0e^t + x_0(1-e^t)$$
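we have

$$\frac{\partial\mathbf{x}}{\partial t} = (x_0e^t,\ -y_0e^{-t},\ z_0e^t - x_0e^t)$$

so $\mathbf{v}(\mathbf{x}, t) = (x,\ -y,\ z - xe^{-t})$ and $\operatorname{div}\mathbf{v} = 1$. Thus, for any domain $D$, $(d/dt)\operatorname{vol}(D_t) = 1\cdot\operatorname{vol}(D_t)$, so $\operatorname{vol}(D_t) = e^t\operatorname{vol}(D)$. If $\rho(\mathbf{x}, t)$ is the density function at time $t$, the equation of continuity allows us to find $\rho$ in terms of its initial values. Let $\rho(\mathbf{x}_0, 0) = \rho(\mathbf{x}_0)$ be given. Then, according to (8.13), if $\rho(\mathbf{x}_0, t)$ is the particle density, we have

$$\frac{d\rho}{dt} = -\rho \qquad \rho(\mathbf{x}_0, 0) = \rho(\mathbf{x}_0)$$

Thus

$$\rho(\mathbf{x}_0, t) = \rho(\mathbf{x}_0)e^{-t}$$

and, in space coordinates,

$$\rho(\mathbf{x}, t) = e^{-t}\rho\bigl(xe^{-t},\ ye^{t},\ e^{-t}(z - x(e^{-t}-1))\bigr)$$

6. Suppose an incompressible fluid flows steadily in the direction $\mathbf{a} = (a^1, a^2, a^3)$. That is, the path lines are parallel to the vector $\mathbf{a}$. Then the speed is constant along the paths. For the velocity field is

$$\mathbf{v}(\mathbf{x}, t) = \phi(\mathbf{x})\mathbf{a}$$

where $\phi$ is a scalar function (the speed), and since $\mathbf{v}$ is divergence free, we have

$$\operatorname{div}\mathbf{v} = \frac{\partial\phi}{\partial x^1}a^1 + \frac{\partial\phi}{\partial x^2}a^2 + \frac{\partial\phi}{\partial x^3}a^3 = 0$$

But then

$$d\phi(\mathbf{x})(\mathbf{a}) = \langle\nabla\phi(\mathbf{x}), \mathbf{a}\rangle = 0$$

for all $\mathbf{x}$, so $\phi$ is constant along the lines parallel to $\mathbf{a}$; but these are the paths of motion.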
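The density found in Example 5 can be tested against the equation of continuity (8.11) directly. The Python sketch below is our own check, not part of the original text: the initial density `rho0` is a hypothetical smooth function, the spatial density is taken to be $\rho(\mathbf{x}, t) = e^{-t}\rho_0\bigl(xe^{-t},\ ye^{t},\ e^{-t}(z - x(e^{-t}-1))\bigr)$, and the derivatives in $\partial\rho/\partial t + \operatorname{div}(\rho\mathbf{v})$ are approximated by central differences, so the residual should be small but not exactly zero.

```python
import math

def rho0(a, b, c):
    # hypothetical smooth, positive initial density
    return 2.0 + math.sin(a) * math.cos(b) + 0.5 * math.sin(c)

def rho(x, y, z, t):
    """Spatial density of Example 5: e^{-t} * rho0(particle coordinates of (x, y, z))."""
    e = math.exp(-t)
    return e * rho0(x * e, y / e, e * (z - x * (e - 1.0)))

def velocity(x, y, z, t):
    # velocity field of Example 5 in space coordinates
    return (x, -y, z - x * math.exp(-t))

def continuity_residual(x, y, z, t, h=1e-4):
    """Central-difference approximation of d(rho)/dt + div(rho * v)."""
    d_rho_dt = (rho(x, y, z, t + h) - rho(x, y, z, t - h)) / (2.0 * h)

    def flux(i, px, py, pz):
        return rho(px, py, pz, t) * velocity(px, py, pz, t)[i]

    div_flux = ((flux(0, x + h, y, z) - flux(0, x - h, y, z))
                + (flux(1, x, y + h, z) - flux(1, x, y - h, z))
                + (flux(2, x, y, z + h) - flux(2, x, y, z - h))) / (2.0 * h)
    return d_rho_dt + div_flux

residual = continuity_residual(0.4, -0.3, 0.8, 0.5)
```

A residual of order 1 would indicate a wrong density formula; with the formula above it is limited by the finite-difference error.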
Integration Under a Coordinate Change

Theorem 8.1. Let $(u, v, w) = F(x, y, z)$ be an orientation-preserving change of coordinates valid in the domain $D$ in $x, y, z$ space. Let $\Delta = \{F(x, y, z) : (x, y, z) \in D\}$. If $g$ is a function continuous on $D$, then

$$\int_D g(x, y, z)\,dx\,dy\,dz = \int_\Delta g(F^{-1}(u, v, w))\det\frac{\partial(x, y, z)}{\partial(u, v, w)}\,du\,dv\,dw$$

Proof. The proof consists in a series of reductions terminating in the one-variable case. It is enough to show that for any point $p \in D$, this theorem is true for some rectangle centered at $p$. For, once this is shown, we may cover $D$ by finitely many such rectangles $R_1, \ldots, R_n$. If $\{\rho_1, \ldots, \rho_n\}$ is a partition of unity subordinate to $\{R_1, \ldots, R_n\}$, then $\rho_i\cdot g$ is zero outside $R_i$. The theorem is thus true for each $\rho_i\cdot g$. Summing over $i$, we obtain the general result.

Thus we may concentrate our attention on a particular point $p_0$ in $D$, which we take to be the origin. If the theorem is valid for the coordinate changes $\mathbf{u} = F(\mathbf{x})$, $\mathbf{y} = G(\mathbf{u})$, then it is also true for the composed mapping $\mathbf{y} = G(F(\mathbf{x}))$, simply because

$$\frac{\partial(x^1, x^2, x^3)}{\partial(y^1, y^2, y^3)} = \frac{\partial(x^1, x^2, x^3)}{\partial(u^1, u^2, u^3)}\cdot\frac{\partial(u^1, u^2, u^3)}{\partial(y^1, y^2, y^3)}$$

We will decompose our mapping into a composition of four special cases, for each of which the theorem is easy. The general result will follow by composing these mappings. First of all, let $T$ be the linear mapping

$$T = \left.\frac{\partial(x, y, z)}{\partial(u, v, w)}\right|_{(u, v, w)=0}$$

Then $F = (F\circ T)\circ T^{-1}$ and $F\circ T$ has the property that its Jacobian at 0 is the identity. The theorem is easily seen to be true for a linear mapping (Problem 5), so we need only prove it for $F\circ T$.

Our situation is now this: we are given a change of coordinates $(u, v, w) = G(x, y, z)$ defined at the origin such that

$$\frac{\partial(u, v, w)}{\partial(x, y, z)}(0) = I$$

It follows that

$$\frac{\partial(u, y, z)}{\partial(x, y, z)}(0) = I \qquad \frac{\partial(u, v, z)}{\partial(x, y, z)}(0) = I$$
622 8 Potential Theory in Three Dimensions Thus, by the inverse mapping theorem there is a neighborhood В of 0 in which (x, y, z), (u, y, z), (и, ν, ζ), (и, ν, w) are all bona fide orientation-preserving coordinate systems. If we denote the respective coordinate changes as follow FiO, y, z) = (и, у, z) F2(«, y, z) = Ο, υ, z) F3(tt, v, z) = (u, v, w) then F = F3 ° F2 ° Fi. Each Fi changes only one coordinate at a time, and we need only to prove the theorem for each Fi. Since the proof of each case is the same, we shall do it only once. Now, here we do our computation. Let и = h(x, y, z) v=y w = ζ be a coordinate change defined on a rectangle R = {-a<,x<a, -b<y^b,-c^z^c} centered at the origin. Let Δ = {(«, v, w): и = h(x, v, w), — a <, χ < a, —b<,v<,b, —c<,w<^c} If now g is a continuous function on R, J g(x, y, z) dx dy dz = j j J g(x, y, z) dx dy dz (8.15) Now, according to the theorem of change of variable in one dimension eh-1 ~o .hia.f.z) dh~l g(x, y, z)dx= g(h~l(x, y, z, v, w)) — (и, у, z) J-a Jln-a.y.z) OU Thus (8.15) becomes du ал-1 g(h~l(u, ν, w, v, w)) —r- (u, v, w) du du do dw J3(x, y, z) g(h~\u, v, w, v, w)) det — du dv dw A (U, V, W)
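The last equation follows from

$$\frac{\partial(u, v, w)}{\partial(x, y, z)} = \begin{pmatrix} \dfrac{\partial h}{\partial x} & \dfrac{\partial h}{\partial y} & \dfrac{\partial h}{\partial z} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix}$$

$$\det\left(\frac{\partial(x, y, z)}{\partial(u, v, w)}\right) = \left[\det\left(\frac{\partial(u, v, w)}{\partial(x, y, z)}\right)\right]^{-1} = \left(\frac{\partial h}{\partial x}\right)^{-1} = \frac{\partial h^{-1}}{\partial u}$$

EXERCISES

1. Compute the area of these domains, using either spherical or cylindrical coordinates:
(a) $x^2 + y^2 + z^2 \ge xyz$
(b) $1 \ge x^2 + y^2 - z^2 \ge 0$
(c) $x^2 + y^2 \le z \le 1$
(d) $a^2x^2 + b^2y^2 + c^2z^2 \le 1$

2. Integrate $f$ over the domain $D$:
(a) $f(\mathbf{x}) = $ [illegible], $D = \{x^2 + y^2 + z^2 \le 1\}$
(b) $f(\mathbf{x}) = xyz$, $D = \{x^2 + y^2 \le 1,\ 0 \le z \le 1\}$
(c) $f(\mathbf{x}) = x^2 + y^2 - z^2$, $D = \{a^2x^2 + b^2y^2 \le 1,\ 0 \le z \le x^2 + y^2\}$
(d) $f(\mathbf{x}) = r\sin^2\theta\cos^2\varphi$, $D = \{0 \le x(x^2 + y^2 + z^2) \le 1\}$

3. What is the mass of a parabolic section $0 \le z \le a(x^2 + y^2)$ whose density is proportional to the distance from the $xy$ plane?

4. Find the mass of the ball of radius 1, whose density is $\rho(\mathbf{x}) = (1 + r)^{-1}$.

5. Let

$$x = x_0 + t y_0 \qquad y = y_0 e^{t} - t z_0 \qquad z = z_0 e^{-t} + t x_0$$

be the equations of a flow in space.
(a) Compute the velocity field $\mathbf{v}(\mathbf{x}, t)$.
(b) Compute the divergence of the flow.
(c) Assuming an initial density function which is constant, find the density function $\rho(\mathbf{x}, t)$.
(d) What is the mass of the fluid in the unit cube at time $t = 1$?

6. Which of these fluid flows is incompressible?
(a) $\mathbf{v}(\mathbf{x}, t) = (-z, x, y)$
(b) $\mathbf{v}(\mathbf{x}, t) = (z^2 - x^2, z - y, z)$
(c) $x = x_0 e^{t} + (1 - t)y_0$, $y = y_0 e^{-t/2} + (1 - t)z_0$, $z = e^{-t/2} z_0$
(d) $x = x_0\cos t + y_0\sin t$, $y = y_0\cos t - x_0\sin t$, $z = a z_0(1 + t)$
(e) $\mathbf{v}(\mathbf{x}, t) = (x\cos t, xy\sin t, ze^{t})$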
7. Find the volume at time $t = 1$ of the mass of fluid originally in the unit sphere under these flows:
(a) Exercise 6(a).
(b) Exercise 6(c).
(c) Exercise 6(d).

8. Show that a $C^2$ function in $R^3$ is harmonic if and only if it is the potential of the vector field of an incompressible flow. (Hint: $\operatorname{div} \nabla f = \Delta f$.)

PROBLEMS

1. A radial field is a field of the form

$$\mathbf{v}(\mathbf{x}) = \varphi(\|\mathbf{x}\|)\,\mathbf{x}$$

Find all incompressible radial fields.

2. If $L$ is a line in $R^3$, a flow around the axis $L$ is one whose velocity field at any point is tangent to the cylinder with central line $L$. Show that the flow of Exercise 6(d) is a flow around the $z$ axis. Find another such flow which is incompressible.

3. Find the incompressible flow whose path lines are the curves

$$x = x_0 + u \qquad y = y_0 + \sin u \qquad z = z_0$$

4. Find the incompressible flow whose path lines are the curves (in cylindrical coordinates)

$$z = Cr^{-1} \qquad \theta = \theta_0$$

(see Figure 8.1).

5. Prove Theorem 8.1 for the coordinate change $\mathbf{u} = T(\mathbf{x})$, where $T$ is a nonsingular linear transformation.

6. In the proof of Theorem 8.1, a function $u = h(x, y, z)$ was found. It was tacitly assumed that $\partial h/\partial x > 0$. Why is that so? Express $\partial h^{-1}/\partial u$ in terms of the original functions $(u, v, w) = G(x, y, z)$.

8.2 Curl and Rotation

The divergence of the velocity field of a flow measures the rate of expansion of the fluid in flow, as we have seen. We shall now compute an indicator of its rotation around a given axis. Suppose

$$\mathbf{x} = \boldsymbol{\phi}(\mathbf{x}_0, t) \tag{8.16}$$
is the equation of motion of the flow. Let $\mathbf{x}_0$ be any point, and $\mathbf{n}$ a direction (unit) vector at the point $\mathbf{x}_0$. We shall compute the average angular velocity in the plane orthogonal to $\mathbf{n}$ at the point $\mathbf{x}_0$ in terms of the velocity field $\mathbf{v}$. We take $\mathbf{x}_0 = \mathbf{0}$ for convenience. Since we are interested in the motion around the axis $\mathbf{n}$, relative to the motion of $\mathbf{0}$, we must work in coordinates relative to $\mathbf{0}$. What is the same, we shall subtract from the above motion a motion of translation by the image of $\mathbf{0}$, so that $\mathbf{0}$ remains fixed. Since translation involves no rotation, our computation will be valid for the original motion. Thus we replace (8.16) by the flow

$$\mathbf{x} = \boldsymbol{\psi}(\mathbf{x}_0, t) = \boldsymbol{\phi}(\mathbf{x}_0, t) - \boldsymbol{\phi}(\mathbf{0}, t) \tag{8.17}$$

so that in our new motion the origin is fixed.

Figure 8.1

Let $C_r$ be a circle of radius $r$ centered at $\mathbf{0}$ lying in the plane $\Pi(\mathbf{n})$ orthogonal to $\mathbf{n}$. Let $\mathbf{a}$ be a point on $C_r$. After a time $t$, the particle originally at $\mathbf{a}$ has moved to $\boldsymbol{\psi}(\mathbf{a}, t)$. Let $\mathbf{L}$ be the projection of $\boldsymbol{\psi}(\mathbf{a}, t) - \mathbf{a}$ onto the line tangent to $C_r$ at $\mathbf{a}$ (see Figure 8.2). Let $\theta(t)$ be the angle at $\mathbf{0}$ in $\Pi(\mathbf{n})$ between $\mathbf{a}$ and $\mathbf{a} + \mathbf{L}$. Thus $\theta(t)$ is the angle in the plane orthogonal to $\mathbf{n}$ through which $\mathbf{a}$
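has moved (relative to $\mathbf{0}$) during the time $t$. Thus

$$\theta(t) = \sin^{-1}\frac{\langle \boldsymbol{\psi}(\mathbf{a}, t) - \mathbf{a}, \mathbf{T}\rangle}{r}$$

where $\mathbf{T}$ is the unit tangent vector to $C_r$ at $\mathbf{a}$.

Figure 8.2

Dividing by $t$ and letting $t \to 0$, we obtain the angular velocity of the particle $\mathbf{a}$ in $\Pi(\mathbf{n})$, multiplied by the radius $r$, as

$$r\,\frac{d\theta}{dt}\bigg|_{t=0} = \left\langle \frac{\partial \boldsymbol{\psi}}{\partial t}(\mathbf{a}, 0), \mathbf{T}\right\rangle = \langle \mathbf{v}(\mathbf{a}, 0) - \mathbf{v}(\mathbf{0}, 0), \mathbf{T}\rangle$$

according to (8.17). The sum over all of $C_r$ of this quantity is called the total circulation of the flow about $C_r$ and is denoted $\operatorname{circ}(C_r)$. Thus

$$\operatorname{circ}(C_r) = \int_{C_r} \langle \mathbf{v}(\mathbf{a}, 0) - \mathbf{v}(\mathbf{0}, 0), \mathbf{T}\rangle\,ds \tag{8.18}$$

This number, calculated for small $r$, gives us some idea of the instantaneous rotation of the flow around $\mathbf{n}$ at $\mathbf{0}$. If we suitably normalize ((8.18) tends to zero as fast as $r^2 \to 0$), and take the limit as $r \to 0$, we will have the same kind of information, but it will be given by a point function, rather than a function of circles.

Definition 2. Let $\mathbf{v}$ be the velocity field of a flow in a domain $D$. For each point $\mathbf{x}_0$ in $D$, and unit vector $\mathbf{n}$, define the curl of the flow about $\mathbf{n}$ at $\mathbf{x}_0$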
to be

$$\operatorname{curl} \mathbf{v}(\mathbf{x}_0, \mathbf{n}) = \lim_{r \to 0} \frac{1}{\pi r^2}\operatorname{circ}(C_r) \tag{8.19}$$

where $C_r$ is the circle of radius $r$ centered at $\mathbf{x}_0$ in the plane orthogonal to $\mathbf{n}$.

Example 7. Consider the flow (Figure 8.3)

$$x = x_0\cos t + y_0\sin t \qquad y = y_0\cos t - x_0\sin t \qquad z = z_0 + t$$

Let us take $\mathbf{x}_0 = (1, 0, 0)$ and $\mathbf{n} = \mathbf{E}_3$. Then, as we have already seen,

$$\mathbf{v}(\mathbf{x}, t) = (y, -x, 1)$$

Figure 8.3
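If we take

$$C_r = \left\{x = 1 + r\cos\frac{s}{r},\ y = r\sin\frac{s}{r},\ z = 0\right\}$$

then

$$\operatorname{circ}(C_r) = \int_{C_r} \langle \mathbf{v}(\mathbf{x}) - \mathbf{v}(1, 0, 0), \mathbf{T}\rangle\,ds = \int_0^{2\pi r} \left\langle \left(r\sin\frac{s}{r}, -r\cos\frac{s}{r}, 0\right), \left(-\sin\frac{s}{r}, \cos\frac{s}{r}, 0\right)\right\rangle ds = \int_0^{2\pi} (-r)\,r\,d\theta = -2\pi r^2$$

Thus the $xy$ plane rotates around $(1, 0, 0)$ in the negative sense (with constant angular velocity), as $t$ changes. If now we take $\mathbf{n} = \mathbf{E}_1$ we have

$$C_r = \left\{x = 1,\ y = r\cos\frac{s}{r},\ z = r\sin\frac{s}{r}\right\}$$

$$\operatorname{circ}(C_r) = \int_0^{2\pi r} \left\langle \left(r\cos\frac{s}{r}, -1, 1\right), \left(0, -\sin\frac{s}{r}, \cos\frac{s}{r}\right)\right\rangle ds = 0$$

(the constant vector $\mathbf{v}(1, 0, 0)$ contributes nothing to an integral around the closed curve). Thus there is no rotation in this plane.

Now, we shall compute the curl explicitly in terms of the velocity field $\mathbf{v}$. Again take $\mathbf{x}_0 = \mathbf{0}$ and let $\boldsymbol{\alpha} = (\alpha^1, \alpha^2, \alpha^3)$, $\boldsymbol{\beta} = (\beta^1, \beta^2, \beta^3)$ be two unit vectors in the plane orthogonal to $\mathbf{n}$ so that $\boldsymbol{\alpha} \to \boldsymbol{\beta} \to \mathbf{n}$ is a right-handed orthonormal basis. Thus $\mathbf{n} = \boldsymbol{\alpha} \times \boldsymbol{\beta}$, so

$$\mathbf{n} = (\alpha^2\beta^3 - \alpha^3\beta^2,\ \alpha^3\beta^1 - \alpha^1\beta^3,\ \alpha^1\beta^2 - \alpha^2\beta^1) \tag{8.20}$$

For a time we shall compute relative to this basis. $C_r$ has this parametrization:

$$\mathbf{x} = \mathbf{x}(s) = r\cos\frac{s}{r}\cdot\boldsymbol{\alpha} + r\sin\frac{s}{r}\cdot\boldsymbol{\beta} \tag{8.21}$$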
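The tangent vector is

$$\mathbf{T}(s) = -\sin\frac{s}{r}\cdot\boldsymbol{\alpha} + \cos\frac{s}{r}\cdot\boldsymbol{\beta}$$

Expanding the velocity field in terms of this basis:

$$\mathbf{v}(\mathbf{x}, 0) - \mathbf{v}(\mathbf{0}, 0) = v^{\alpha}(\mathbf{x})\boldsymbol{\alpha} + v^{\beta}(\mathbf{x})\boldsymbol{\beta} + v^{n}(\mathbf{x})\mathbf{n}$$

Then

$$\operatorname{circ}(C_r) = \int_{C_r} \langle \mathbf{v}(\mathbf{x}, 0) - \mathbf{v}(\mathbf{0}, 0), \mathbf{T}(\mathbf{x})\rangle\,ds = \int_0^{2\pi r} \left(-v^{\alpha}(\mathbf{x}(s))\sin\frac{s}{r} + v^{\beta}(\mathbf{x}(s))\cos\frac{s}{r}\right) ds \tag{8.22}$$

Now, substitute $\theta = s/r$ in the integral and approximate the $v^{\nu}$ ($\nu = \alpha, \beta$) by their differentials:

$$v^{\nu}(\mathbf{x}(\theta)) = v^{\nu}(\mathbf{0}) + dv^{\nu}(\mathbf{0})(\mathbf{x}(\theta)) + \varepsilon^{\nu}(\|\mathbf{x}\|)$$

where

$$\|\mathbf{x}\|^{-1}\varepsilon^{\nu}(\mathbf{x}) \to 0 \quad \text{as } \|\mathbf{x}\| \to 0 \tag{8.23}$$

Since $v^{\nu}(\mathbf{0}) = 0$, using (8.21) for $\mathbf{x}(\theta)$, we have

$$v^{\nu}(\mathbf{x}(\theta)) = r\cos\theta\cdot dv^{\nu}(\mathbf{0})(\boldsymbol{\alpha}) + r\sin\theta\cdot dv^{\nu}(\mathbf{0})(\boldsymbol{\beta}) + \varepsilon^{\nu}(\|\mathbf{x}\|)$$

Substituting these expressions into (8.22), we obtain

$$\begin{aligned}
\operatorname{circ}(C_r) &= \int_0^{2\pi} \left[-dv^{\alpha}(\mathbf{0})(\boldsymbol{\alpha}) + dv^{\beta}(\mathbf{0})(\boldsymbol{\beta})\right] r^2\cos\theta\sin\theta\,d\theta \\
&\quad + \int_0^{2\pi} \left[-dv^{\alpha}(\mathbf{0})(\boldsymbol{\beta})\sin^2\theta + dv^{\beta}(\mathbf{0})(\boldsymbol{\alpha})\cos^2\theta\right] r^2\,d\theta \\
&\quad + \int_0^{2\pi} \left[-\varepsilon^{\alpha}(\mathbf{x})\sin\theta + \varepsilon^{\beta}(\mathbf{x})\cos\theta\right] r\,d\theta \\
&= \pi r^2\left[-dv^{\alpha}(\mathbf{0})(\boldsymbol{\beta}) + dv^{\beta}(\mathbf{0})(\boldsymbol{\alpha})\right] + r\int_0^{2\pi} \left[-\varepsilon^{\alpha}(\mathbf{x})\sin\theta + \varepsilon^{\beta}(\mathbf{x})\cos\theta\right] d\theta
\end{aligned}$$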
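Dividing by $\pi r^2$, and letting $r \to 0$, the second term disappears because of (8.23) and we obtain

$$\operatorname{curl} \mathbf{v}(\mathbf{0}, \mathbf{n}) = dv^{\beta}(\mathbf{0})(\boldsymbol{\alpha}) - dv^{\alpha}(\mathbf{0})(\boldsymbol{\beta}) \tag{8.24}$$

This can be rewritten in terms of the vector $\mathbf{n}$. Let $\mathbf{v} = (v^1, v^2, v^3)$ in terms of the standard Euclidean coordinates. Then

$$v^{\alpha}(\mathbf{x}, 0) = \langle \mathbf{v}(\mathbf{x}, 0) - \mathbf{v}(\mathbf{0}, 0), \boldsymbol{\alpha}\rangle = \sum_{i=1}^{3} \left[v^i(\mathbf{x}, 0) - v^i(\mathbf{0}, 0)\right]\alpha^i$$

so

$$dv^{\alpha}(\mathbf{0})(\boldsymbol{\beta}) = \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j}\,\alpha^i\beta^j$$

Similarly,

$$dv^{\beta}(\mathbf{0})(\boldsymbol{\alpha}) = \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j}\,\beta^i\alpha^j$$

(8.24) can be expanded out as

$$\begin{aligned}
\operatorname{curl} \mathbf{v}(\mathbf{0}, \mathbf{n}) &= \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j}\,\beta^i\alpha^j - \sum_{i,j=1}^{3} \frac{\partial v^i}{\partial x^j}\,\alpha^i\beta^j \\
&= \left(\frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3}\right)(\alpha^2\beta^3 - \alpha^3\beta^2) + \left(\frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1}\right)(\alpha^3\beta^1 - \alpha^1\beta^3) \\
&\quad + \left(\frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2}\right)(\alpha^1\beta^2 - \alpha^2\beta^1)
\end{aligned} \tag{8.25}$$

Referring back to (8.20) we see that this is the inner product of a vector derived from $\mathbf{v}$ with the given unit vector $\mathbf{n}$. We collect these results in a definition and a proposition.

Definition 3. If $\mathbf{v} = (v^1, v^2, v^3)$ is a vector field defined in a domain in $R^3$, we define the vector field $\operatorname{curl} \mathbf{v}$ by

$$\operatorname{curl} \mathbf{v} = \left(\frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3},\ \frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1},\ \frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2}\right) \tag{8.26}$$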
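The agreement between the limit (8.19) and the formula (8.26) can be checked numerically. The following is only a sketch, using the velocity field $(y, -x, 1)$ of Example 7; the function names `curl`, `circulation`, and `field`, the step size `h`, and the radius `r` are illustrative assumptions, not part of the text. Derivatives are approximated by central differences and the circulation integral (8.18) by a midpoint rule on the circle (8.21).

```python
import math

def curl(v, p, h=1e-5):
    """Central-difference approximation of curl v at the point p, following (8.26)."""
    def d(i, j):
        # approximate partial derivative of the component v^i with respect to x^j
        plus, minus = list(p), list(p)
        plus[j] += h
        minus[j] -= h
        return (v(plus)[i] - v(minus)[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def circulation(v, x0, alpha, beta, r, steps=2000):
    """Midpoint-rule value of circ(C_r) from (8.18) on the circle (8.21) about x0."""
    v0 = v(x0)
    total = 0.0
    for k in range(steps):
        th = 2 * math.pi * (k + 0.5) / steps
        c, s = math.cos(th), math.sin(th)
        x = [x0[i] + r * (c * alpha[i] + s * beta[i]) for i in range(3)]
        tangent = [-s * alpha[i] + c * beta[i] for i in range(3)]
        vx = v(x)
        total += sum((vx[i] - v0[i]) * tangent[i] for i in range(3))
    return total * r * 2 * math.pi / steps  # ds = r d(theta)

def field(x):
    """Velocity field of Example 7."""
    return (x[1], -x[0], 1.0)

x0 = (1.0, 0.0, 0.0)
E1, E2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)  # alpha, beta with alpha x beta = E3
r = 1e-2
curl_formula = curl(field, x0)
curl_limit = circulation(field, x0, E1, E2, r) / (math.pi * r * r)
print(curl_formula, curl_limit)
```

For this linear field the quotient $\operatorname{circ}(C_r)/\pi r^2$ does not depend on $r$, so a single small radius suffices; for a general field one would let $r$ decrease and watch the quotient settle.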
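Proposition 2. If $\mathbf{v}$ is the velocity field of a fluid flow, the curl of $\mathbf{v}$ at $\mathbf{x}_0$ around the direction $\mathbf{n}$ at time $t$ is given by $\langle \operatorname{curl} \mathbf{v}(\mathbf{x}_0, t), \mathbf{n}\rangle$.

Proof. Equation (8.25) is just $\langle \operatorname{curl} \mathbf{v}, \mathbf{n}\rangle$.

Definition 4. A flow with velocity field $\mathbf{v}$ is called irrotational if $\operatorname{curl} \mathbf{v} = \mathbf{0}$.

Examples

8. Let $\mathbf{v}(\mathbf{x}) = (y, -x, 1)$ (as in Example 6). Then

$$\operatorname{curl} \mathbf{v} = (0, 0, -2)$$

Thus for any plane $\Pi = \{\mathbf{p} : \langle \mathbf{p} - \mathbf{x}, \mathbf{n}\rangle = 0\}$ through $\mathbf{x}$, the rotation in that plane has angular velocity $-2\langle \mathbf{n}, \mathbf{E}_3\rangle$. Thus the maximum rotation is about the $z$ axis. In general, $\operatorname{curl} \mathbf{v}(\mathbf{x})$ spans the axis of the "infinitesimal" rotation about $\mathbf{x}$ and its magnitude is the angular velocity.

9. Let

$$x = x_0(1 + t) + y_0(1 - e^{t}) \qquad y = y_0 e^{-t} \qquad z = z_0(1 + t)$$

be the equations of a flow. Since $y_0 = ye^{t}$, $z_0 = z/(1 + t)$, and $x_0 = (x - ye^{t} + ye^{2t})/(1 + t)$, the velocity field $(x_0 - y_0 e^{t},\ -y_0 e^{-t},\ z_0)$ is

$$\mathbf{v}(\mathbf{x}, t) = \left(\frac{x - ye^{t} - t\,ye^{2t}}{1 + t},\ -y,\ \frac{z}{1 + t}\right)$$

thus

$$\operatorname{curl} \mathbf{v}(\mathbf{x}, t) = \left(0,\ 0,\ \frac{e^{t} + t\,e^{2t}}{1 + t}\right)$$

so again the rotation at any point is about the $z$ axis. Notice that the equations break down at $t = -1$. We can consider that as the initial point of the motion: the fluid came, at $t = -1$, spinning off the $xy$ plane with infinite angular velocity.

The form of $\operatorname{curl} \mathbf{v}$ recalls the discussion of closed and exact forms in the previous chapter. If we consider the differential 1-form $\omega = \langle \mathbf{v}, d\mathbf{x}\rangle$ associated to the vector field $\mathbf{v}$, then $\operatorname{curl} \mathbf{v} = \mathbf{0}$ is the necessary condition for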
$\omega$ to be the differential of a function (and by Poincaré's lemma it is locally sufficient). In particular, if the field is conservative, then the flow induced by the field is irrotational. We can make physical sense of this statement by referring it to the acceleration field $\mathbf{a} = d\mathbf{v}/dt$ of the flow rather than the velocity field. By Newton's law this is essentially the field of forces which generates the flow. As we have seen, if this field is conservative, then the work done by the flow in moving a mass from one point to another is precisely what is needed; it is the same as the change in energy level. For this to be the case no work can be expended in wastefully rotating the mass; hence the field is irrotational.

In the theory of electromagnetism the existence of two fields, the electric $\mathbf{E}$ and the magnetic $\mathbf{H}$, is postulated. Certain relations between these fields, corroborated by experimental evidence, form the basic laws of the subject. These are Maxwell's equations. Two of these are

$$\operatorname{curl} \mathbf{E} + \sigma\frac{\partial \mathbf{H}}{\partial t} = \mathbf{0}, \qquad \operatorname{div} \mathbf{H} = 0$$

($\sigma$ a suitable constant), which state that the rate of change of the magnetic field is determined by the rotation of the electric field, and that the "magnetic flow" is incompressible.

Here are several important relations between the gradient, curl, and divergence which are easily derived.

$$\operatorname{curl} \nabla f = \mathbf{0} \tag{8.27}$$

$$\operatorname{div} \operatorname{curl} \mathbf{v} = 0 \tag{8.28}$$

$$\operatorname{div} \nabla f = \Delta f \tag{8.29}$$

$$\operatorname{curl} f\mathbf{v} = f \operatorname{curl} \mathbf{v} + \nabla f \times \mathbf{v} \tag{8.30}$$

$$\operatorname{div}(f\mathbf{v}) = f \operatorname{div} \mathbf{v} + \langle \nabla f, \mathbf{v}\rangle \tag{8.31}$$

Example 10. Suppose

$$\mathbf{A} = (-x, 0, y)$$

is the acceleration field of a fluid in motion. Find the equations of motion, assuming an initial velocity field of $(0, 1, 0)$, and find the divergence and curl of the flow.
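If $\mathbf{x} = \boldsymbol{\phi}(\mathbf{x}_0, t)$ is the equation of motion, we have

$$\boldsymbol{\phi}(\mathbf{x}_0, 0) = \mathbf{x}_0 \qquad \frac{\partial \boldsymbol{\phi}}{\partial t}(\mathbf{x}_0, 0) = (0, 1, 0)$$

and $\boldsymbol{\phi}(\mathbf{x}_0, t)$ solves the differential equation

$$(x, y, z)'' = (-x, 0, y)$$

The general solutions are

$$x = A_0\cos t + B_0\sin t \qquad y = A_1 + B_1 t \qquad z = A_2 + B_2 t + \frac{A_1}{2}t^2 + \frac{B_1}{6}t^3$$

The initial conditions give these as the equations of motion:

$$x = x_0\cos t \qquad y = y_0 + t \qquad z = z_0 + y_0\frac{t^2}{2} + \frac{t^3}{6}$$

Differentiating in $t$ and eliminating $x_0 = x/\cos t$ and $y_0 = y - t$, the velocity field is

$$\mathbf{V}(\mathbf{x}, t) = \left(-x\tan t,\ 1,\ ty - \frac{t^2}{2}\right)$$

$$\operatorname{div} \mathbf{V}(\mathbf{x}, t) = -\tan t$$

$$\operatorname{curl} \mathbf{V}(\mathbf{x}, t) = (t, 0, 0)$$

Notice that at $t = \pi/2$ the holocaust arrives. Before that moment, our fluid is moving generally in the positive $y$ direction, rotating around the line parallel to the $x$ axis (in the negative sense for $t < 0$, in the positive sense for $t > 0$) and spinning away from it ($t < 0$) and back again toward it when $t > 0$.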
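The computations of Example 10 can be cross-checked numerically. The sketch below is an illustration under stated assumptions, not part of the original text: the helper names (`rk4_step`, `flow_numeric`, `flow_closed`, `velocity`, `div_and_curl`), the starting point, and the time $t = 0.7$ are all hypothetical choices. It integrates $(x, y, z)'' = (-x, 0, y)$ with initial velocity $(0, 1, 0)$ by a Runge-Kutta scheme, compares the result with the closed-form equations of motion, and then differentiates the velocity field by central differences.

```python
import math

def rk4_step(state, dt):
    """One RK4 step for (x, y, z)'' = (-x, 0, y); state = (x, y, z, vx, vy, vz)."""
    def f(s):
        x, y, z, vx, vy, vz = s
        return (vx, vy, vz, -x, 0.0, y)
    def shifted(s, k, c):
        return tuple(a + c * b for a, b in zip(s, k))
    k1 = f(state)
    k2 = f(shifted(state, k1, dt / 2))
    k3 = f(shifted(state, k2, dt / 2))
    k4 = f(shifted(state, k3, dt))
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def flow_numeric(x0, t, steps=1000):
    """Position at time t of the particle starting at x0 with velocity (0, 1, 0)."""
    state = (*x0, 0.0, 1.0, 0.0)
    dt = t / steps
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state[:3]

def flow_closed(x0, t):
    """Closed-form equations of motion derived in Example 10."""
    a, b, c = x0
    return (a * math.cos(t), b + t, c + b * t * t / 2 + t ** 3 / 6)

def velocity(x, t):
    """Velocity field in space coordinates, after eliminating x0 from the flow."""
    return (-x[0] * math.tan(t), 1.0, t * x[1] - t * t / 2)

def div_and_curl(x, t, h=1e-5):
    """Central-difference divergence and curl of the velocity field at (x, t)."""
    def d(i, j):
        p, m = list(x), list(x)
        p[j] += h
        m[j] -= h
        return (velocity(p, t)[i] - velocity(m, t)[i]) / (2 * h)
    div = d(0, 0) + d(1, 1) + d(2, 2)
    curl = (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))
    return div, curl

start, t = (1.0, 2.0, 3.0), 0.7
numeric = flow_numeric(start, t)
closed = flow_closed(start, t)
div, curl = div_and_curl(closed, t)
print(numeric, closed, div, curl)
```

With these choices the divergence comes out as $-\tan t$ and the only nonzero component of the curl is the first, equal to $t$; the check is valid only for $|t| < \pi/2$, where $\tan t$ is finite.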
EXERCISES

9. Compute the curl for these fluid flows:
(a) $x = x_0 + ty_0$, $y = y_0 e^{t} - tz_0$, $z = z_0 e^{-t} + tx_0$
(b) $\mathbf{v}(x, y, z) = (-z, x, y)$
(c) $\mathbf{v}(x, y, z) = (y, z, x)$
(d) The flow described in Exercise 6(b).
(e) The flow of Exercise 6(c).
(f) The flow of Exercise 6(e).

10. Verify Equations (8.27)-(8.31).

11. Find the equations of motion and analyze the flow as in Example 10 given this acceleration field and initial velocity:
(a) $\mathbf{A} = (-y, x, 1)$, $\mathbf{V}(\mathbf{x}_0) = \mathbf{0}$
(b) $\mathbf{A} = (x, z, x)$, $\mathbf{V}(\mathbf{x}_0) = (0, 0, 1)$

12. Compute the rotation at $\mathbf{x}_0$ about the $\mathbf{E}_2$ axis for the flow of Example 6.

PROBLEMS

7. Suppose we are given a time-independent field of forces $\mathbf{F}$ in a medium of constant density (say $= 1$). By Newton's law the fluid will flow according to the equation $\mathbf{F} = \mathbf{A}$. Let $D$ be a small ball of fluid. The kinetic energy of $D$ at time $t$ is

$$\frac{1}{2}\int_{D_t} \|\mathbf{v}\|^2\,dV$$

where $\mathbf{v}$ is the velocity field of the flow and $D_t$ is the region occupied by $D$ at time $t$. Show that the work done by $\mathbf{F}$ in moving $D$ to $D_t$ is equal to the change in kinetic energy. (Hint: $\frac{\partial}{\partial t}\left(\frac{1}{2}\|\mathbf{v}\|^2\right) = \langle \mathbf{v}, \mathbf{F}\rangle$.)

8. Verify these identities:
(a) $\operatorname{curl} g\nabla f = \nabla g \times \nabla f$
(b) $\operatorname{curl} f\nabla f = \mathbf{0}$

9. Show that if $\mathbf{u}$, $\mathbf{v}$ are curl-free vector fields, then $\mathbf{u} \times \mathbf{v}$ is divergence free.

10. Show that in a ball, a vector field is a gradient if and only if its curl is zero.

11. Let $M$ be a $3 \times 3$ matrix, and consider the flow

$$\mathbf{x} = \exp(Mt)\,\mathbf{x}_0$$

(a) Compute the divergence and curl of the velocity field of the flow.
(b) Show that the flow is divergence free if and only if $\operatorname{tr} M = 0$.
(c) Show that the flow is curl free if and only if $M$ is symmetric.
12. Consider the flow

$$\mathbf{x} = \exp(Mt)\,\mathbf{x}_0$$

where $M$ is a symmetric matrix.
(a) Show that the velocity field of the flow is conservative and has the potential function

$$\Pi(\mathbf{x}) = \tfrac{1}{2}\langle M\mathbf{x}, \mathbf{x}\rangle$$

(b) Show that the flow in an eigenspace with eigenvalue $a$ is in a straight line either toward the origin ($a < 0$), or away from the origin ($a > 0$).
(c) Diagram the flow lines for such a flow in the plane in case the eigenvalues (i) are the same; (ii) have the same sign; (iii) have opposite signs.

8.3 Surfaces

A surface in $R^3$ is (as we have been using the notion in this text) a subset of $R^3$ which is two dimensional. By this we mean that every point has some neighborhood which can be put into one-to-one correspondence with a domain in the plane. We shall assume that this correspondence is smooth. It is given by a continuously differentiable mapping with a nonsingularity condition on its differential.

Definition 5. A surface patch in $R^3$ is the image of a domain $D$ in $R^2$ under a map

$$\mathbf{x} = \mathbf{x}(u, v)$$

with these properties:
(i) $\mathbf{x}$ is one-to-one.
(ii) $\mathbf{x}$ is continuously differentiable.
(iii) The vectors $\partial\mathbf{x}/\partial u$, $\partial\mathbf{x}/\partial v$ are independent at every point.

$(u, v)$ are called the parameters for the surface patch. The curves $u = $ constant and $v = $ constant are called the parametric curves. A surface is a set $\Sigma$ in $R^3$ which can be covered by surface patches, that is, every point $\mathbf{p}$ on $\Sigma$ has a neighborhood $N$ such that $\Sigma \cap N$ is a surface patch.

Notice that if we fix $u = c$, then the function $\boldsymbol{\varphi}(v) = \mathbf{x}(c, v)$ parametrizes a
curve (since $\boldsymbol{\varphi}$ is also one-to-one and

$$\frac{d\boldsymbol{\varphi}}{dv} = \frac{\partial\mathbf{x}}{\partial v}$$

is everywhere nonzero). The vector $\partial\mathbf{x}/\partial v$ is thus the tangent vector to the parametric curve $u = $ constant. Condition (iii) asks that the curves $u = c$, $v = c'$ at any point have independent tangents. Another way of phrasing (iii) is that the $2 \times 3$ matrix

$$\begin{pmatrix} \dfrac{\partial\mathbf{x}}{\partial u} \\[2mm] \dfrac{\partial\mathbf{x}}{\partial v} \end{pmatrix}$$

has rank 2.

Examples

11. The sphere: $x^2 + y^2 + z^2 = 1$ (Figure 8.4). Near the point $(0, 0, 1)$ we can write $z$ as a function of $x$ and $y$ on the plane: $z = (1 - x^2 - y^2)^{1/2}$. Thus we can use $x$, $y$ to define a surface patch surrounding $(0, 0, 1)$:

$$\mathbf{x} = \mathbf{x}(u, v) = (u, v, (1 - u^2 - v^2)^{1/2})$$

Figure 8.4
which coordinatizes the upper hemisphere as $u$, $v$ range through the disk $u^2 + v^2 < 1$. Since

$$\mathbf{x}_u = \frac{\partial\mathbf{x}}{\partial u} = (1, 0, -u(1 - u^2 - v^2)^{-1/2})$$

$$\mathbf{x}_v = \frac{\partial\mathbf{x}}{\partial v} = (0, 1, -v(1 - u^2 - v^2)^{-1/2})$$

these vectors are independent. Every point on the sphere can be put in such a surface patch, by permuting the roles of $(x, y, z)$ above. For example, the point $(-1, 0, 0)$ lies in the surface patch given by

$$\mathbf{x} = \mathbf{x}(u, v) = (-(1 - u^2 - v^2)^{1/2}, u, v) \qquad u^2 + v^2 < 1$$

Spherical coordinates can be used to coordinatize the whole sphere except for the points $(0, 0, \pm 1)$:

$$\mathbf{x} = \mathbf{x}(\theta, \varphi) = (\cos\theta\cos\varphi, \cos\theta\sin\varphi, \sin\theta)$$

12. The ellipsoid (Figure 8.5)

$$a^2x^2 + b^2y^2 + c^2z^2 = 1$$

is also easily parametrized by spherical coordinates (again except for $z = \pm c^{-1}$):

$$\mathbf{x} = \mathbf{x}(u, v) = \left(\frac{\cos u\cos v}{a}, \frac{\cos u\sin v}{b}, \frac{\sin u}{c}\right)$$

Figure 8.5
Figure 8.6

13. The paraboloid $z = x^2 + y^2$ (Figure 8.6) is a surface patch: it is coordinatized by

$$\mathbf{x} = \mathbf{x}(u, v) = (u, v, u^2 + v^2)$$

Since $\mathbf{x}_u = (1, 0, 2u)$, $\mathbf{x}_v = (0, 1, 2v)$, they are independent.

14. The cone $z = (x^2 + y^2)^{1/2}$ (Figure 8.7) can be coordinatized, except for the vertex, by

$$\mathbf{x} = \mathbf{x}(u, v) = (u, v, (u^2 + v^2)^{1/2}) \qquad (u, v) \ne (0, 0)$$

We might ask if there is any way to coordinatize a neighborhood of the vertex of the cone. It is quite difficult to show that there exists no function which does so, but there is one important implication of the differentiability of such a function which is easy to check out. The differentiability implies good approximability by linear functions; thus we should anticipate the existence of a linear surface (a plane) which comes "nearest" the surface at a given point. This is the tangent plane, which we shall now describe by limiting arguments as in the case of the tangent line to a curve.

Suppose $\mathbf{p}$ is a point on a surface $\Sigma$ and $\mathbf{q}$, $\mathbf{r}$ are two nearby points. The three points $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ (in general) determine a plane. As $\mathbf{q}$, $\mathbf{r}$ tend to $\mathbf{p}$, this plane will (in general) attain a limiting position: this is the tangent plane. We now compute this process with coordinates.

Suppose the function $\mathbf{x} = \mathbf{x}(u^1, u^2)$, $(u^1, u^2) \in D$, coordinatizes $\Sigma$ near $\mathbf{p}$. We may assume $\mathbf{p} = \mathbf{x}(0, 0) = \mathbf{0}$. Let $\mathbf{q} = \mathbf{x}(u^1, u^2)$, $\mathbf{r} = \mathbf{x}(v^1, v^2)$. The plane $\Pi(\mathbf{q}, \mathbf{r})$ through $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ is then
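the set of all vectors perpendicular to

$$\mathbf{q} \times \mathbf{r} = \mathbf{x}(u^1, u^2) \times \mathbf{x}(v^1, v^2) \tag{8.32}$$

In order to take the limit we approximate $\mathbf{x}$ by its differential

$$\mathbf{x}(u^1, u^2) = \mathbf{x}_1(0)u^1 + \mathbf{x}_2(0)u^2 + \boldsymbol{\varepsilon}(\|\mathbf{u}\|)$$

where $t^{-1}\boldsymbol{\varepsilon}(t) \to 0$ as $t \to 0$. Equation (8.32) becomes

$$\mathbf{q} \times \mathbf{r} = (\mathbf{x}_1(0) \times \mathbf{x}_2(0))(u^1v^2 - u^2v^1) + \mathbf{R} \tag{8.33}$$

where we have combined all the error terms in the expression $\mathbf{R}$. The important behavior of $\mathbf{R}$ is this:

$$\mathbf{R}(\mathbf{u}, \mathbf{v}) = \|\mathbf{u}\|\,\varepsilon_1(\|\mathbf{v}\|) + \|\mathbf{v}\|\,\varepsilon_2(\|\mathbf{u}\|) + \varepsilon_3(\|\mathbf{u}\|)\,\varepsilon_4(\|\mathbf{v}\|)$$

where the $\varepsilon_i$ all have the same behavior: $t^{-1}\varepsilon_i(t) \to 0$ as $t \to 0$.

Figure 8.7

Now, so as to treat the remainder $\mathbf{R}$ as an insignificant remainder, we must be careful with the term $u^1v^2 - u^2v^1$. It may, for example, be zero, in which case the remainder becomes very significant. Thus we must assume that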
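this term tends to zero more slowly than $\mathbf{R}$ as $\mathbf{q}, \mathbf{r} \to \mathbf{p}$. Since

$$u^1v^2 - u^2v^1 = \sin\theta\,\|\mathbf{u}\|\,\|\mathbf{v}\|$$

where $\theta$ is the angle between the coordinate vectors $\mathbf{u}$ and $\mathbf{v}$, it suffices to assume that the angle between the coordinate vectors does not tend to zero as $\mathbf{q}, \mathbf{r} \to \mathbf{p}$. Then, under this assumption, we can divide (8.33) by $u^1v^2 - u^2v^1$, obtaining $\Pi(\mathbf{q}, \mathbf{r})$ as the plane through $\mathbf{p}$ orthogonal to the vector

$$\mathbf{x}_1(0) \times \mathbf{x}_2(0) + \mathbf{R}^1$$

where $\mathbf{R}^1 \to 0$ as $\mathbf{q}, \mathbf{r} \to \mathbf{p}$. Thus the limiting position of $\Pi(\mathbf{q}, \mathbf{r})$ is the plane orthogonal to $(\partial\mathbf{x}/\partial u^1) \times (\partial\mathbf{x}/\partial u^2)$ at $\mathbf{p}$: it is the plane spanned by

$$\frac{\partial\mathbf{x}}{\partial u^1}, \qquad \frac{\partial\mathbf{x}}{\partial u^2}$$

Definition 6. Let $\mathbf{p}$ be a point on a surface $\Sigma$ coordinatized by $\mathbf{x} = \mathbf{x}(u^1, u^2)$. The tangent plane to $\Sigma$ at $\mathbf{p}$ is the plane spanned by the vectors $\partial\mathbf{x}/\partial u^1$, $\partial\mathbf{x}/\partial u^2$ at $\mathbf{p}$.

Proposition 3. Let $\mathbf{p}$ be a point on the surface $\Sigma$, and let $\Pi(\mathbf{q}, \mathbf{r})$ be the plane through $\mathbf{p}$ and two points $\mathbf{q}$, $\mathbf{r}$ on $\Sigma$ chosen so that the angle between $\mathbf{q} - \mathbf{p}$ and $\mathbf{r} - \mathbf{p}$ is nonzero. If $\mathbf{q}, \mathbf{r} \to \mathbf{p}$ so that this angle remains bounded away from zero, then $\Pi(\mathbf{q}, \mathbf{r})$ tends to the plane tangent to $\Sigma$ at $\mathbf{p}$.

Of course the angle assumption is crucial; Problem 28 exhibits the difficulty obtained without it.

Examples

15. There is no tangent plane to the cone $z = (x^2 + y^2)^{1/2}$ at its vertex (Figure 8.7). For, if we take $\mathbf{q}_1 = (t, 0, t)$, $\mathbf{q}_2 = (0, t, t)$, the plane spanned by $\mathbf{q}_1$ and $\mathbf{q}_2$ is the plane spanned by $(1, 0, 1)$, $(0, 1, 1)$ for all $t > 0$. Thus this is a candidate for the tangent plane. However, if we consider now the points $\mathbf{q}_1 = (-t, 0, t)$, $\mathbf{q}_2 = (0, -t, t)$ for $t > 0$, the candidate we obtain is the plane spanned by $(-1, 0, 1)$, $(0, -1, 1)$. Since these two planes are distinct, there can be no tangent plane (Figure 8.8).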
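The two candidate planes of Example 15 can be compared by their normal vectors. The short sketch below is offered only as an illustration; the helper names `cross`, `on_cone`, and `parallel` and the sample values of $t$ are assumptions made here. It forms the normal $\mathbf{q}_1 \times \mathbf{q}_2$ of each secant plane through the vertex and checks that the normals within each family are parallel while the two families are not.

```python
def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def on_cone(q, tol=1e-12):
    """True if q lies on the cone z = (x^2 + y^2)^(1/2)."""
    return abs(q[2] - (q[0] ** 2 + q[1] ** 2) ** 0.5) < tol

def parallel(a, b, tol=1e-12):
    """True if a and b span the same line (their cross product vanishes)."""
    return all(abs(c) < tol for c in cross(a, b))

normals_a, normals_b = [], []
for t in (1.0, 0.1, 0.01):
    qa = ((t, 0.0, t), (0.0, t, t))      # first pair of points on the cone
    qb = ((-t, 0.0, t), (0.0, -t, t))    # second pair of points on the cone
    assert all(on_cone(q) for q in qa + qb)
    normals_a.append(cross(*qa))         # normal to the plane through 0, q1, q2
    normals_b.append(cross(*qb))

print(normals_a[0], normals_b[0])
```

The first family of planes has normal direction $(-1, -1, 1)$ and the second $(1, 1, 1)$ for every $t$, so the secant planes do not approach a common limit as $t \to 0$.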
Figure 8.8

16. The cylinder $x^2 + y^2 = 1$ is a surface. It can be coordinatized by using cylindrical coordinates:

$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u, \sin u, v)$$

$$\mathbf{x}_u = (-\sin u, \cos u, 0) \qquad \mathbf{x}_v = (0, 0, 1)$$

The tangent plane at $\mathbf{x}(u, v)$ is the plane orthogonal to the vector $\mathbf{x}_u \times \mathbf{x}_v = (\cos u, \sin u, 0)$.

17. If $\mathbf{x} = \mathbf{x}(s)$ is the equation of a curve, the "surface swept out" by its family of tangent lines is a surface. It is parametrized by

$$\mathbf{x} = \mathbf{x}(s, t) = \mathbf{x}(s) + t\mathbf{T}(s)$$
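We have

$$\mathbf{x}_s = \mathbf{T}(s) + t\kappa\mathbf{N}(s) \qquad \mathbf{x}_t = \mathbf{T}(s)$$

Thus, so long as $\kappa \ne 0$, $s$, $t$ are patch coordinates for all $s$, $t > 0$. This surface is called the developable defined by the curve. Its tangent plane at the point $(s, t)$ is the same as the osculating plane to the curve at $\mathbf{x}(s)$.

Let $\Sigma$ be a surface, and $\mathbf{p}$ a point on the surface. We shall denote the tangent plane to $\Sigma$ at $\mathbf{p}$ by $T(\mathbf{p})$. If $\mathbf{x} = \mathbf{x}(u, v)$ parametrizes $\Sigma$ in a neighborhood of $\mathbf{p}$, with $\mathbf{p} = \mathbf{x}(u_0, v_0)$, then the vectors $\partial\mathbf{x}/\partial u\,(u_0, v_0)$, $\partial\mathbf{x}/\partial v\,(u_0, v_0)$ span the plane $T(\mathbf{p})$. The inner product on $R^3$ induces an inner product on this plane just by restriction. It will be valuable to us to see how to express this inner product in terms of the basis $\mathbf{x}_u$, $\mathbf{x}_v$. If $\mathbf{t} = a\mathbf{x}_u + b\mathbf{x}_v$ is a vector in $T(\mathbf{p})$ its length is given by

$$\|\mathbf{t}\|^2 = \langle \mathbf{t}, \mathbf{t}\rangle = a^2\langle \mathbf{x}_u, \mathbf{x}_u\rangle + 2ab\langle \mathbf{x}_u, \mathbf{x}_v\rangle + b^2\langle \mathbf{x}_v, \mathbf{x}_v\rangle$$

Suppose that $C$ is a curve on $\Sigma$. Choose a parametrization of $C$:

$$\mathbf{x} = \mathbf{g}(s) \qquad 0 \le s \le L \tag{8.34}$$

Let $(u(s), v(s))$ be the $(u, v)$ coordinates of $\mathbf{g}(s)$. Then (8.34) is the same as

$$\mathbf{x} = \mathbf{x}(u(s), v(s)) \tag{8.35}$$

and by the chain rule, the tangent to $C$ is

$$\mathbf{T} = \mathbf{x}_u\frac{du}{ds} + \mathbf{x}_v\frac{dv}{ds}$$

and

$$\|\mathbf{T}\|^2 = \langle \mathbf{x}_u, \mathbf{x}_u\rangle\left(\frac{du}{ds}\right)^2 + 2\langle \mathbf{x}_u, \mathbf{x}_v\rangle\frac{du}{ds}\frac{dv}{ds} + \langle \mathbf{x}_v, \mathbf{x}_v\rangle\left(\frac{dv}{ds}\right)^2 \tag{8.36}$$

We shall use the following notational conventions relative to coordinates on $\Sigma$:

$$E = \langle \mathbf{x}_u, \mathbf{x}_u\rangle \qquad F = \langle \mathbf{x}_u, \mathbf{x}_v\rangle \qquad G = \langle \mathbf{x}_v, \mathbf{x}_v\rangle \tag{8.37}$$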
In terms of this notation we have this way, intrinsic to the surface, for computing the lengths of curves on $\Sigma$:

Proposition 4. Let $\Sigma$ be a surface patch parametrized by $\mathbf{x} = \mathbf{x}(u, v)$. Let $C$ be a curve on $\Sigma$ parametrized by $\mathbf{x} = \mathbf{x}(u(t), v(t))$, $a \le t \le b$. Then the length of $C$ is

$$\int_a^b \left[E\left(\frac{du}{dt}\right)^2 + 2F\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2\right]^{1/2} dt \tag{8.38}$$

Proof. The length of $C$ is

$$\int_a^b \|\mathbf{T}\|\,dt$$

which is, by (8.36), given by (8.38).

We shall adopt the convention (borrowed from the differential form notation) that $ds$ is the integrand which gives arc length along a curve. This means just that the length of any curve $C$ is $\int_C ds$. According to (8.38) we can be assured that

$$ds = \left[E\left(\frac{du}{dt}\right)^2 + 2F\frac{du}{dt}\frac{dv}{dt} + G\left(\frac{dv}{dt}\right)^2\right]^{1/2} dt$$

for any parameter $t$ along $C$. We can also write this as

$$ds^2 = E\,du^2 + 2F\,du\,dv + G\,dv^2 \tag{8.39}$$

Definition 7. The form (8.39), where $E$, $F$, $G$ are given by (8.37) relative to a parametrization $\mathbf{x} = \mathbf{x}(u, v)$ on $\Sigma$, is called the first fundamental form of $\Sigma$.

If $C_1$, $C_2$ are two curves given parametrically by

$$C_1: u = u_1(s) \quad v = v_1(s) \qquad C_2: u = u_2(s) \quad v = v_2(s)$$

then their tangents are

$$\mathbf{T}_1 = \mathbf{x}_u\frac{du_1}{ds} + \mathbf{x}_v\frac{dv_1}{ds} \qquad \mathbf{T}_2 = \mathbf{x}_u\frac{du_2}{ds} + \mathbf{x}_v\frac{dv_2}{ds}$$
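At a point of intersection $\mathbf{p}$ the vectors $\mathbf{T}_1(\mathbf{p})$, $\mathbf{T}_2(\mathbf{p})$ lie in the tangent plane at $\mathbf{p}$ and their inner product is

$$\langle \mathbf{T}_1, \mathbf{T}_2\rangle = E\frac{du_1}{ds}\frac{du_2}{ds} + F\left(\frac{du_1}{ds}\frac{dv_2}{ds} + \frac{du_2}{ds}\frac{dv_1}{ds}\right) + G\frac{dv_1}{ds}\frac{dv_2}{ds}$$

The curves are orthogonal at $\mathbf{p}$ if $\langle \mathbf{T}_1, \mathbf{T}_2\rangle = 0$.

Proposition 5. The parametric curves $u = $ constant, $v = $ constant on a surface patch are orthogonal if and only if $F = 0$.

Proof. The tangent line to $u = c$ is spanned by $\mathbf{x}_v$; the tangent line to $v = c$ is spanned by $\mathbf{x}_u$. These lines are orthogonal if and only if $\langle \mathbf{x}_u, \mathbf{x}_v\rangle = F = 0$.

Examples

18. The plane $z = 0$. In the standard rectangular coordinates we have $ds^2 = dx^2 + dy^2$. If $x = f(t)$, $y = g(t)$, $0 \le t \le L$, is any curve joining $\mathbf{a}$ to $\mathbf{b}$ we have (as in Chapter 5) the length of the curve is

$$\int_0^L \left(f'(t)^2 + g'(t)^2\right)^{1/2} dt$$

If we parametrize this curve by $x$ we obtain the length as

$$\int_a^b \left[1 + \left(\frac{dy}{dx}\right)^2\right]^{1/2} dx$$

This is minimized (once the axes are rotated so that $\mathbf{a}$ and $\mathbf{b}$ lie on the $x$ axis) when $dy/dx = 0$; that is, when the curve is a straight line. This conforms with known facts.

19. The cylinder

$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u, \sin u, v)$$

Here $\mathbf{x}_u = (-\sin u, \cos u, 0)$, $\mathbf{x}_v = (0, 0, 1)$. Thus $E = 1 = G$, $F = 0$, so

$$ds^2 = du^2 + dv^2$$

Again, the length of a curve given as $v = v(u)$ is

$$\int \left[1 + \left(\frac{dv}{du}\right)^2\right]^{1/2} du$$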
so the curves of minimal length (called geodesics) on the cylinder are those represented by straight lines in the $u$, $v$ coordinates. Thus the typical geodesic on the cylinder is the helix

$$\mathbf{x} = (\cos t, \sin t, at)$$

20. For the sphere

$$\mathbf{x} = \mathbf{x}(u, v) = (\cos u\cos v, \cos u\sin v, \sin u)$$

we have $E = 1$, $F = 0$, $G = \cos^2 u$. Thus

$$ds^2 = du^2 + \cos^2 u\,dv^2$$

Once again, we discover the geodesics by minimizing the integral $\int_\gamma ds$. Let $\mathbf{a}$, $\mathbf{b}$ be two points on the sphere; by rotating the sphere we may suppose that $\mathbf{a}$, $\mathbf{b}$ lie on the longitude $v = 0$. If $\gamma$ is any curve joining $\mathbf{a}$ to $\mathbf{b}$, the length of $\gamma$ is

$$\int_\gamma ds = \int_\gamma (du^2 + \cos^2 u\,dv^2)^{1/2} \tag{8.40}$$

The length of the longitude ($v = 0$) is

$$\int_a^b (du^2)^{1/2} = \int_a^b du \tag{8.41}$$

Now (8.40) is always larger than (8.41) unless $dv = 0$ along $\gamma$; that is, $v$ is constant. Thus it is the longitude which is the curve of the shortest distance between $\mathbf{a}$ and $\mathbf{b}$. By rotating back again we conclude that the geodesics on the sphere are the sections by diametric planes: the great circles.

Geodesics

The problem of finding the geodesics on any surface is more difficult, because the general form

$$E\,du^2 + 2F\,du\,dv + G\,dv^2$$

is harder to analyze. One way to proceed is to try to find coordinates so that the first fundamental form looks like the above examples: it has the
form

$$ds^2 = du^2 + G\,dv^2 \tag{8.42}$$

When this is the case we can verify that the curves $v = $ constant are geodesics (Problem 17). However, in order to find such coordinates, we must know what we are looking for; that is, we must know how to find geodesics in the first place. Thus, this line of reasoning has to be supplemented by the discovery of a characteristic property of geodesics. We seek such a characteristic property by trying to understand the "infinitesimal" behavior of a geodesic: this (we hope) leads to a differential equation which is solvable. Then we can carry out our original plan: solving the differential equations will provide a convenient coordinate system in which we can discover the curves of minimal length. We shall, however, not carry through the entire program here; we shall only derive the basic property.

If $\gamma$ is a geodesic, a curve of minimal length, on the surface $\Sigma$, then, relative to $\Sigma$, it is a straight line. That is, it would have to be as close to a straight line as it could be: it should bend only as much as it must in order to remain on $\Sigma$. Thus the rate of change of the tangent, relative to $\Sigma$, should be zero. Infinitesimally this says that the normal to the curve has no component on the tangent plane to $\Sigma$. We shall now show that a geodesic has this property.

Theorem 8.2. Let $\gamma$ be a geodesic (curve of minimal length) on the surface $\Sigma$. Then, at any point $\mathbf{p}$ on $\gamma$, the normal to $\gamma$ is orthogonal to the tangent plane of $\Sigma$.

Proof. Let $\mathbf{p} \in \gamma$ and let $u$, $v$ be coordinates for $\Sigma$ near $\mathbf{p}$ so that $\mathbf{p} = \mathbf{x}(0, 0)$. We may choose these coordinates so that $\gamma$ is the curve $v = 0$ and so that the coordinates are everywhere orthogonal (see Problems 9 and 10). Now let $a$ be small enough so that the interval from $(-a, 0)$ to $(a, 0)$ in the $uv$ plane lies in the domain $D$ of the coordinates.
If $\Gamma: v = f(u)$ defines a curve lying in $D$ and joining $(-a, 0)$ to $(a, 0)$, then $\mathbf{x} = \mathbf{x}(u, f(u))$, $-a \le u \le a$, gives another curve on $\Sigma$, joining two points of $\gamma$ (Figure 8.9). The length of $\Gamma$ is no less than that of $\gamma$, since $\gamma$ is a geodesic.

Figure 8.9
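We have not yet done enough to investigate the local behavior of $\gamma$; we must consider a whole family of curves including $\gamma$ rather than just one other. But that is easy to do: let $\Gamma_t$ be the curve parametrized by

$$\Gamma_t: \mathbf{x} = \mathbf{x}(u, tf(u)) \qquad -a \le u \le a$$

for $-1 \le t \le 1$. $\gamma$ is $\Gamma_0$ and $\Gamma$ is $\Gamma_1$. Let $F(t)$ be the length of $\Gamma_t$. Then $F(t)$ has a minimum at $t = 0$, so (if it is differentiable) $F'(0) = 0$. We now compute this:

$$F(t) = \int_{-a}^{a} \|\mathbf{x}_u + \mathbf{x}_v tf'(u)\|\,du$$

is certainly a differentiable function of $t$, and

$$F'(t) = \int_{-a}^{a} \frac{\partial}{\partial t}\|\mathbf{x}_u + \mathbf{x}_v tf'(u)\|\,du$$

Now, at $t = 0$, the integrand is

$$\frac{\partial}{\partial t}\langle \mathbf{x}_u + \mathbf{x}_v tf'(u), \mathbf{x}_u + \mathbf{x}_v tf'(u)\rangle^{1/2}\bigg|_{t=0} = \frac{1}{2\|\mathbf{x}_u\|}\,2\langle \mathbf{x}_{uv}f(u) + \mathbf{x}_v f'(u), \mathbf{x}_u\rangle = -\frac{\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle}{\|\mathbf{x}_u\|}f(u) \tag{8.43}$$

The last equation follows from the assumption that the coordinates are orthogonal: $\langle \mathbf{x}_u, \mathbf{x}_v\rangle = 0$. First, the second term drops out; secondly, the expression (8.43) derives from

$$0 = \frac{\partial}{\partial u}\langle \mathbf{x}_u, \mathbf{x}_v\rangle = \langle \mathbf{x}_{uu}, \mathbf{x}_v\rangle + \langle \mathbf{x}_u, \mathbf{x}_{vu}\rangle$$

Therefore, from $F'(0) = 0$, we obtain

$$\int_{-a}^{a} \frac{\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle}{\|\mathbf{x}_u\|}f(u)\,du = 0$$

This equation must hold for all differentiable functions $f$ such that $f(-a) = f(a) = 0$. We conclude then that

$$\langle \mathbf{x}_v, \mathbf{x}_{uu}\rangle = 0$$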
along $\gamma$ (see Miscellaneous Problem 41 of Chapter 2). Now, the normal $\mathbf{N}$ to $\gamma$ is in the plane spanned by $\mathbf{x}_u$ and $\mathbf{x}_{uu}$. Since these are both orthogonal to $\mathbf{x}_v$, $\mathbf{N} \perp \mathbf{x}_v$. Further, $\mathbf{N}$ is orthogonal to the tangent line of $\gamma$, which is spanned by $\mathbf{x}_u$. Thus $\mathbf{N}$ is orthogonal to both $\mathbf{x}_u$ and $\mathbf{x}_v$, so is orthogonal to the tangent plane of $\Sigma$.

Examples

21. Find the geodesics on the surface

$$\Sigma: y = x^2$$

We parametrize $\Sigma$ by $\mathbf{x} = \mathbf{x}(u, v) = (u, u^2, v)$. Let $u = u(s)$, $v = v(s)$ parametrize a geodesic $\Gamma$ on $\Sigma$. Then $\Gamma$ has the form

$$\mathbf{x} = (u(s), u^2(s), v(s))$$

and

$$\mathbf{x}_s = (u', 2uu', v')$$

$$\mathbf{x}_{ss} = \mathbf{N} = (u'', 2(u')^2 + 2uu'', v'')$$

For $\Gamma$ to be a geodesic, this must be orthogonal to both

$$\mathbf{x}_u = (1, 2u, 0) \qquad \mathbf{x}_v = (0, 0, 1)$$

Thus, the functions $u(s)$, $v(s)$ parametrizing the geodesic $\Gamma$ satisfy these differential equations:

$$u'' + 2u[2(u')^2 + 2uu''] = 0 \qquad v'' = 0$$

Notice that from Picard's theorem the equations

$$u'' = \frac{-4u(u')^2}{1 + 4u^2} \qquad v'' = 0$$

have unique solutions given the initial values of $u$, $v$, $u'$, $v'$. Thus, there exists a curve of minimal length in every direction, at every point.
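The equations of Example 21 are easy to integrate numerically. The following sketch is an illustration only; the names `geodesic_rhs`, `rk4`, and `speed_squared`, the initial data, and the step size are assumptions chosen here. It solves $u'' = -4u(u')^2/(1 + 4u^2)$, $v'' = 0$ by a Runge-Kutta scheme and monitors $\|\mathbf{x}_s\|^2 = E(u')^2 + G(v')^2$ with $E = 1 + 4u^2$, $F = 0$, $G = 1$, which must stay constant along a solution because $\mathbf{x}_{ss}$ is orthogonal to the tangent plane and hence to $\mathbf{x}_s$.

```python
def geodesic_rhs(state):
    """First-order form of u'' = -4u(u')^2/(1+4u^2), v'' = 0; state = (u, v, u', v')."""
    u, v, du, dv = state
    return (du, dv, -4 * u * du * du / (1 + 4 * u * u), 0.0)

def rk4(state, ds, steps):
    """Classical fourth-order Runge-Kutta integration in the parameter s."""
    def shifted(s, k, c):
        return tuple(a + c * b for a, b in zip(s, k))
    for _ in range(steps):
        k1 = geodesic_rhs(state)
        k2 = geodesic_rhs(shifted(state, k1, ds / 2))
        k3 = geodesic_rhs(shifted(state, k2, ds / 2))
        k4 = geodesic_rhs(shifted(state, k3, ds))
        state = tuple(s + ds / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def speed_squared(state):
    """||x_s||^2 on the surface x = (u, u^2, v): (1 + 4u^2) u'^2 + v'^2."""
    u, v, du, dv = state
    return (1 + 4 * u * u) * du * du + dv * dv

# start at x(0, 0) with a unit tangent direction (u', v') = (0.6, 0.8)
start = (0.0, 0.0, 0.6, 0.8)
end = rk4(start, 1e-3, 2000)   # integrate up to s = 2
print(end, speed_squared(end))
```

Since $v'' = 0$, the coordinate $v$ grows linearly in $s$, while $u$ slows down as the parabola steepens; the printed speed stays at 1 up to integration error.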
22. Find the geodesics on the cone

$$\Sigma: z^2 = x^2 + y^2$$

Notice that any plane $z + x\cos\alpha + y\sin\alpha = b$ intersects $\Sigma$ at right angles (Figure 8.10). Thus the normal to the curve of intersection is orthogonal to the surface, and such a plane always intersects $\Sigma$ in a geodesic. More generally, we can compute the equations for any geodesic using Theorem 8.2.

$$\Sigma: \mathbf{x} = \mathbf{x}(u, v) = (v\cos u, v\sin u, v)$$

$$\mathbf{x}_u = (-v\sin u, v\cos u, 0) \qquad \mathbf{x}_v = (\cos u, \sin u, 1)$$

Figure 8.10
If $u = u(s)$, $v = v(s)$ parametrizes a geodesic $\Gamma$, then on $\Gamma$

$$\mathbf{x}_s = (v'\cos u - vu'\sin u,\ v'\sin u + vu'\cos u,\ v')$$

$$\mathbf{x}_{ss} = (v''\cos u - 2v'u'\sin u - vu''\sin u - v(u')^2\cos u,\ v''\sin u + 2v'u'\cos u + vu''\cos u - v(u')^2\sin u,\ v'')$$

The differential equations are readily computed (and hardly solved explicitly) by expressing $\langle \mathbf{x}_{ss}, \mathbf{x}_u\rangle = 0$, $\langle \mathbf{x}_{ss}, \mathbf{x}_v\rangle = 0$.

Surface Area

We would like now to define the area of a surface in a way analogous to the definition of the length of a curve. We select a collection of points $\mathbf{x}_1, \ldots, \mathbf{x}_k$ on $\Sigma$ and replace $\Sigma$ by the polygonal surface $\Sigma'$ whose vertices are $\mathbf{x}_1, \ldots, \mathbf{x}_k$. If the points $\mathbf{x}_1, \ldots, \mathbf{x}_k$ are very numerous and close to each other, then the sum of the areas of the faces of $\Sigma'$ is a good approximation to the area of $\Sigma$. We can then try to define the area of $\Sigma$ to be the limit of such sums as the set of points $\mathbf{x}_1, \ldots, \mathbf{x}_k$ becomes infinitely numerous and everywhere dense. Now this definition unfortunately does not work: there are ways of so partitioning a surface so as to obtain any desired area (for a fuller account see Spivak, pp. 128-130). Rather than give it all up as a hopeless task because of this phenomenon, we try a different approach. First, we study the approximation of area in the small, hoping to generate a plausible formula for surface area (by plausible I mean that approximations to our formula are also approximations to our notion of area). If the formula turns out to be intrinsic, that is, independent of parametrizations, then it will define a relevant measure, which we shall call surface area.

Figure 8.11

Returning to the above "approximation," let $F$ be one of the faces of $\Sigma'$, and $\mathbf{x}_0$ one of its vertices. Let $F_0$ be the projection of $F$ onto $\Sigma$ (see Figure 8.11) and $F_t$ the projection onto the tangent plane $T(\mathbf{x}_0)$. If the surface is very smooth, then for small $F$ these three surfaces have essentially the same area, and we can confuse the three. We may suppose that $F_0$ lies in a patch parametrized by $\mathbf{x} = \mathbf{x}(u, v)$ with $\mathbf{x}_0 = \mathbf{x}(u_0, v_0)$. Let $D$ be such that

$$F_0 = \{\mathbf{x}(u, v) : (u, v) \in D\}$$

Confusing the surface with $F_t$, we may take $\mathbf{x}$ to be the linear map

$$\mathbf{x}(u, v) = \mathbf{x}_0 + \mathbf{x}_u(u_0, v_0)u + \mathbf{x}_v(u_0, v_0)v$$

Now, we know how to compute area on the image of a linear map:

$$\operatorname{area}(F_t) = \|\mathbf{x}_u(u_0, v_0) \times \mathbf{x}_v(u_0, v_0)\|\,\operatorname{area} D$$

This is true because it is true for rectangles, as we have seen in Proposition 28 of Chapter 1. Thus, at least on this coordinate patch, the area of $\Sigma'$ is very close to

$$\sum_i \|\mathbf{x}_u(u_i, v_i) \times \mathbf{x}_v(u_i, v_i)\|\,\operatorname{area}(D_i)$$

where the $\{D_i\}$ partition the coordinate domain $D$ and $(u_i, v_i) \in D_i$. The limit of such sums is

$$\int_D \|\mathbf{x}_u \times \mathbf{x}_v\|\,du\,dv$$

We take this to be the definition of surface area.

Definition 8. Let $\Sigma$ be a surface patch with coordinates $u$, $v$ ranging through $D$ in $R^2$. The area of $\Sigma$ is

$$\int_D \|\mathbf{x}_u \times \mathbf{x}_v\|\,du\,dv$$

If $\Sigma$ is a surface, partition $\Sigma$ into pieces $D_1, \ldots, D_k$ such that each $D_i$ is a surface patch. Define

$$\operatorname{area}(\Sigma) = \sum_i \operatorname{area}(D_i)$$
We must show that this definition is independent of the particular partition.

Proposition 6. The above definition is independent of the partition of $\Sigma$ chosen.

Proof. Suppose we also partition $\Sigma$ another way: $\Sigma = E_1 \cup \cdots \cup E_n$. Then

\[ \Sigma = (D_1 \cap E_1) \cup (D_1 \cap E_2) \cup \cdots \cup (D_k \cap E_n) \]

is still a third partition. Clearly,

\[ \operatorname{area}(E_i) = \sum_j \operatorname{area}(D_j \cap E_i) \qquad (8.44) \]

\[ \operatorname{area}(D_j) = \sum_i \operatorname{area}(D_j \cap E_i) \qquad (8.45) \]

since in each case we are computing relative to the same coordinates. We leave it to the reader (see Problem 23) to verify that the computation of the area of $D_j \cap E_i$ is the same whether it is done in the $D_j$ or $E_i$ coordinates. Then, summing (8.44) over $i$ and (8.45) over $j$, the right-hand sides are the same, and so are the left-hand sides, as desired.

In accordance with our convention to denote $ds$ as the integrand for arc length, we shall let $dS$ denote the integrand for surface area. Thus, in terms of any coordinate system $u, v$ we have $dS = H\,du\,dv$, where $H = \|\mathbf{x}_u \times \mathbf{x}_v\|$. It follows from Lagrange's identity (Chapter 1) that also $H = (EG - F^2)^{1/2}$.

Examples

23. Find the area of the sphere $\{x^2 + y^2 + z^2 = R^2\}$. We use spherical coordinates:

\[ \mathbf{x} = (R\cos u\cos v,\ R\cos u\sin v,\ R\sin u) \]
\[ \mathbf{x}_u = (-R\sin u\cos v,\ -R\sin u\sin v,\ R\cos u) \]
\[ \mathbf{x}_v = (-R\cos u\sin v,\ R\cos u\cos v,\ 0) \]

so $H = [EG - F^2]^{1/2} = R^2|\cos u|$. The area is

\[ \int_{-\pi}^{\pi}\int_{-\pi/2}^{\pi/2} R^2\cos u \, du\, dv = 4\pi R^2 \]
24. Find the area of the piece of the paraboloid $\{z = x^2 + y^2,\ 0 < z < 1\}$. The parametrization is

\[ \mathbf{x} = (r\cos u,\ r\sin u,\ r^2) \qquad 0 < r < 1,\ 0 < u < 2\pi \]
\[ \mathbf{x}_r = (\cos u,\ \sin u,\ 2r) \]
\[ \mathbf{x}_u = (-r\sin u,\ r\cos u,\ 0) \]

so $E = 1 + 4r^2$, $F = 0$, $G = r^2$, $H = r(1 + 4r^2)^{1/2}$. The area is

\[ \int_0^{2\pi}\int_0^1 r(1 + 4r^2)^{1/2} \, dr\, du = \frac{\pi}{6}\left(5\sqrt{5} - 1\right) \]

• EXERCISES

13. Let $f$ be a $C^1$ function defined in a domain $D$ in $R^2$.
(a) Show that $\Sigma: \{z = f(x, y)\}$ is a surface patch with coordinates $x, y$.
(b) Compute the first fundamental form and the area element for $f$.
(c) Show that the element of area is given by $\sec\gamma \, dx\, dy$, where $\gamma$ is the angle between the normal to $\Sigma$ and the $z$ axis.

14. Find the tangent plane, first fundamental form and area element for these surfaces:
(a) The paraboloid $x = y^2 + z^2$.
(b) The cone $z^2 = x^2 + y^2$.
(c) The hyperboloid $z = x^2 - y^2$.
(d) $\Sigma: \mathbf{x}(u, v) = (u + v^2,\ v + u^2,\ uv)$.

15. Find the length of the intersection of these surfaces:
(a) $x^2 + y^2 + z^2 = 1$, $[\ldots]\,x^2 + 2y^2 + 2z^2 = 1$
(b) $z^2 = 2x^2 + y^2$, $z = x^2 + 2y^2$

16. Find the angle between the parametric curves at a general point for the surface given in Exercise 14(d).

17. Find the area cut off the tip of the paraboloid $x^2 = y^2 + z^2$ by the plane $x + z = 1$.

18. Find the area of these surfaces:
(a) The cone $z^2 = x^2 + y^2$, $0 < z < a$.
(b) $\Sigma: \mathbf{x} = (u,\ \cos u,\ v)$, $0 < v < \pi$, $-\pi < u < \pi$.
(c) The part of the hyperboloid $z = x^2 - y^2$ inside the unit ball.
(d) The ellipsoid $x^2 + y^2 + 4z^2 = 4$.
654 8 Potential Theory in Three Dimensions • PROBLEMS 13. Recall that a differential form Mdu + Ndv determines a family of curves: those curves along which Μ du + N dv = 0. If ds2 = Ε du2 + 2F du dv + G dv2 is the first fundamental form of a surface patch show that the family of curves orthogonal to the family denned by Μ du + N dv = 0 is determined by (EN- FM) du + (FN- GM) dv=0 14 Let po be a point on the surface Σ. Show that we can find a surface patch near po so that the parametric curves are orthogonal. (Hint: Let u, ν be coordinates near p0 and explicitly find the family of curves и = u(t, c), ν = v(t, c) orthogonal to the curves dv = 0 such that «(0, c) = 0, υ(0, с) = ν0. Show that v, с are orthogonal coordinates.) 15. Let у be a curve on the surface Σ. Find orthogonal coordinates u, υ at a point p0 on γ so that (ι) γ is the curve ν = 0, (ii) и is arc length along y. 16. Show that a cube is not a surface along its edges. 17. Is ds a differential 1-form? 18. Find the differential equations for the geodesies on the torus (Figure 8.12): x = (\ — cos<^)sm θ y = (\ — cos^)cos θ ζ = Sin φ 19. Find those planes which intersect the ellipse x2 + a2y2 + b2zi = 1 in a geodesic. 20. Let {(«, v) e R2: и > 0, ν > 0} parametrize a surface with first fundamental form ds2 = v2 duz + u2 dv2 Find the equation of the family of Figure 8.12
8.3 Surfaces 655 curves orthogonal to the curves uv = constant, and express the fundamental form m terms of these new coordinates. 21. Find the geodesies on the Surface with first fundamental form ds2 = du2 + f(u) dv2 22. Show that the curves υ = constant on a surface with first fundamental form Edit2 + G dv2 are geodesies if and only if BE/dv = 0. 23. Let Σ be a surface path with two different coordinates: Σ: χ = х(и, ν) (и, ν) е D Σ: χ = x(r, s) (r, s) е Δ Show that Эх Эх — X — ей ev dudv = Эх ЙХ — X — er Bs {Hint: Define и = u(s, t), ν = v(s, t) by this property: χ = x(r, s) if and only if χ = x(«, v) with « = u(r, s), ν = v(r, s). Show that ax ax /ax ax\ a(«, υ)\ ar * es ~~ \aii x au/ e(r, s)) The following problems use the normal to a surface Σ: this is a unit vector N orthogonal to the tangent plane. 24. Let у be a curve on the surface Σ. Let N represent the normal to Σ, and Τ the tangent to y. The unit surface normal to у is the vector N, = NxT. (a) Show that у is a geodesic on Σ if and only if <Ny, dT/ds} = 0 (b) In general, the inner produce кд = <Ny, dT/ds> is called the geodesic curvature of у on Σ. Suppose («, v) are orthogonal coordinates on Σ and к,1, Kg2 are the geodesic curvatures of the lines υ = constant, « = constant, respectively. Verify Liouville's formula: the geodesic curvature of the curve у is given by άθ Kg= h Kg1 COS θ + Kg2 SHI θ ds where θ is the angle between the tangent to у and the direction x„. (Hint: Write Τ = Τι cos θ + T2 sin θ where Ti, T2 are the tangents to the curves ν = constant, и = constant.
8 Potential Theory in Three Dimensions Then dT ί/Т, n аТг n άθ — = -г- cos θ + —- sin θ + (-Τι sm 0 + Τ2 cos 0) — as as as ds Substitute these expressions into ,ΝχΊΊ _ /^1 ■ and evaluate at θ = 0, θ = π/2.) 25. If у is a curve on the surface Σ we can decompose dT/ds into its components tangent to and orthogonal to the surface: dT — =κ9Νϊ + /<:Λ,Ν as where κΝ is called the normal curvature to T. (a) Show that the curvature of у is (кд2 + κΝ2)112 (b) Show that the normal curvature of a curve у depends only on the tangent to у and is the same as the curvature of the curve of intersection of Σ with the plane through Τ and N. (c) Show that the curvature of the curve у is given by к«(Т) sec Θ, where θ is the angle between dT/ds and N. 26. Using Liouville's formula find the geodesic curvature of a gerjeral curve on the surface obtained by revolving the curve ζ = exp(— x2) around the ζ axis. 27. Let Σ be a surface such that at every point every curve on Σ has zero normal curvature. Show that Σ is a piece of a plane. 28. Let ρ be a point on a surface Σ and let q, r be two nearby points. It is possible to select q, r tending to zero So that the plane determined by p, q, r does not converge to the tangent plane (unless Σ is itself a plane). For example, if у is the curve intersection of Some plane Π with Σ and if r follows q along у then the plane determined by p, q, r is always П, which need not be the tangent plane to Σ. Furthermore, if we move q slightly off у we can be sure of the same behavior with the requirement that the angle between q and r (in some parametrization) is not zero (however, it must tend to zero). Here is an explicit example. Σ is the surface ζ = χ2 parametrized by x(«, v) = («, v, u2). The tangent plane at p, the origin, is the xy plane. However, if q = (2r,0,4r2), r = (r,r2,r2) then the plane determined by p, q, r tends (as r -*■ 0) to the plane orthogonal to (0, 1, 1).
8.4 Surface Integrals and Stokes' Theorem

Suppose that $f$ is a continuous function defined in a domain $D$ in $R^3$, and $\Sigma$ is a surface in $D$. We can verify by an argument identical to that in Proposition 6 that the following definition makes sense independently of the coordinate choices involved.

Definition 9. Partition $\Sigma$ into subsets $\Sigma_1, \ldots, \Sigma_n$ of surface patches on $\Sigma$. Define the integral of $f$ over $\Sigma$ to be

\[ \int_\Sigma f \, dS = \sum_i \int_{\Sigma_i} f H \, du\, dv \]

where $H\,du\,dv$ is the surface area element in the patch containing $\Sigma_i$.

Examples

25. $I = \int_\Sigma x^2 y^2 z \, dS$, where $\Sigma$ is the hemisphere $\Sigma: \{(x, y, z): x^2 + y^2 + z^2 = 1,\ z > 0\}$. Using the same parametrization as in Example 23, we have

\[ I = \int_{-\pi}^{\pi}\int_0^{\pi/2} \cos^5 u \cos^2 v \sin^2 v \sin u \, du\, dv = \frac{1}{4}\int_{-\pi}^{\pi} \sin^2 2v \, dv \int_0^{\pi/2} \cos^5 u \sin u \, du = \frac{\pi}{24} \]

26. $I = \int_\Sigma (x + y^2) \, dS$, where $\Sigma$ is the piece of the paraboloid given in Example 24, parametrized by $\mathbf{x} = (r\cos u,\ r\sin u,\ r^2)$, for which $H = r(1 + 4r^2)^{1/2}$. Then

\[ I = \int_0^{2\pi}\int_0^1 \left[r\cos u + r^2\sin^2 u\right] r(1 + 4r^2)^{1/2} \, dr\, du = \pi\int_0^1 r^3(1 + 4r^2)^{1/2} \, dr = \frac{\pi\left(25\sqrt{5} + 1\right)}{120} \]

Normal and Orientation

Let $\Sigma$ be a surface in $R^3$. The tangent plane to $\Sigma$ at a point $\mathbf{x}_0$ is a two-dimensional plane, thus its orthogonal complement is a line, called the normal line to $\Sigma$ at $\mathbf{x}_0$. The normal vector $\mathbf{N}$ is a choice of unit vector lying on this line which varies continuously with the point. Such a choice is always possible locally, but is not always possible over the whole surface.
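Definition 9 reduces a surface integral to an ordinary double integral of $fH$ over the coordinate domain. As a hedged illustration (not part of the original text), the sketch below evaluates the integral of Example 25 by a midpoint rule, computing $H$ from the first fundamental form as $(EG - F^2)^{1/2}$. The names `surface_integral`, `hemisphere`, `hemisphere_u`, `hemisphere_v` and the grid size are assumptions made for this example.

```python
import math

def dot(a, b):
    """Inner product of two 3-vectors given as tuples."""
    return sum(p * q for p, q in zip(a, b))

def surface_integral(f, x, x_u, x_v, u_range, v_range, n=200):
    """Midpoint-rule approximation of the integral of f * H over the
    coordinate rectangle, where H = sqrt(E*G - F**2)."""
    (u0, u1), (v0, v1) = u_range, v_range
    hu, hv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * hu
        for j in range(n):
            v = v0 + (j + 0.5) * hv
            a, b = x_u(u, v), x_v(u, v)
            E, F, G = dot(a, a), dot(a, b), dot(b, b)
            H = math.sqrt(max(E * G - F * F, 0.0))  # guard against round-off
            total += f(*x(u, v)) * H * hu * hv
    return total

# Unit hemisphere z > 0 in the coordinates of Example 23 (R = 1).
def hemisphere(u, v):
    return (math.cos(u) * math.cos(v), math.cos(u) * math.sin(v), math.sin(u))

def hemisphere_u(u, v):
    return (-math.sin(u) * math.cos(v), -math.sin(u) * math.sin(v), math.cos(u))

def hemisphere_v(u, v):
    return (-math.cos(u) * math.sin(v), math.cos(u) * math.cos(v), 0.0)

# Illustrative check against the value pi/24 obtained in Example 25.
I = surface_integral(lambda x, y, z: x * x * y * y * z,
                     hemisphere, hemisphere_u, hemisphere_v,
                     (0.0, math.pi / 2), (-math.pi, math.pi))
print(I, math.pi / 24)
```

Supplying the partial derivatives analytically keeps the sketch short; finite differences would serve equally well when they are not available in closed form.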
Figure 8.13 Moebius band

Consider the surface depicted in Figure 8.13 (called the Moebius band). This is obtained from a rectangle (Figure 8.14) by gluing together the vertical sides so that vertices with corresponding labels abut. There is no way to continuously select a normal vector to this surface which does not point in the opposite direction when traced around the circle in Figure 8.13. Notice that the same kind of phenomenon is put in evidence by Figure 8.14: a right-handed basis gets transformed into a left-handed basis when we cross the vertical line. We express this by saying that the Moebius band is not orientable. Thus in two dimensions we find a problem which does not exist in one dimension. We ran into the same problem in the discussion of integration under a change of variable in the plane, and we successfully sidestepped it then. But we cannot avoid it now.

We shall refer to an orientation on a surface $\Sigma$ in $R^3$ as a choice of a sense of positive rotation in the tangent plane at every point. This choice is assumed to vary continuously: that is, if $\mathbf{v}_1, \mathbf{v}_2$ are nowhere collinear continuous vector fields defined on the surface and the rotation $\mathbf{v}_1 \to \mathbf{v}_2$ is positive at $\mathbf{x}_0$, it must be so in a neighborhood of $\mathbf{x}_0$. A choice of orientation is equivalent to a choice of normal vector. For, if a normal $\mathbf{N}$ is chosen we define positive rotation in the tangent plane as follows: $\mathbf{v}_1 \to \mathbf{v}_2$ is positive if $\mathbf{v}_1 \to \mathbf{v}_2 \to \mathbf{N}$ is a right-handed system. Conversely, if an orientation is chosen we can define $\mathbf{N} = \mathbf{v}_1 \times \mathbf{v}_2$, where $\mathbf{v}_1, \mathbf{v}_2$ are orthogonal unit vectors and the rotation $\mathbf{v}_1 \to \mathbf{v}_2$ is positive. If $\Sigma$ is oriented and $(u, v)$ are coordinates on a patch in $\Sigma$, we shall say that $(u, v)$ is a positively
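oriented coordinate system if the rotation $\mathbf{x}_u \to \mathbf{x}_v$ is positive. Here is a fact relating positively oriented coordinate systems which completes the discussion.

Proposition 7. If $(u, v)$ and $(u', v')$ are two positively oriented coordinate systems defined on the oriented surface $\Sigma$, then

\[ \frac{\partial(u, v)}{\partial(u', v')} > 0 \]

Examples

27. If $f$ is a $C^1$ function defined on a domain $D$ in the $xy$ plane, then the graph $\Gamma(f): z = f(x, y)$ is a surface patch. We consider it oriented so that the rotation from $\mathbf{x}_x = (1, 0, \partial f/\partial x)$ to $\mathbf{x}_y = (0, 1, \partial f/\partial y)$ is positive. Then the normal vector $\mathbf{N}$ always points upward out of the surface ($N^3 > 0$):

\[ \mathbf{N} = \left(1 + \left(\frac{\partial f}{\partial x}\right)^2 + \left(\frac{\partial f}{\partial y}\right)^2\right)^{-1/2}\left(-\frac{\partial f}{\partial x},\ -\frac{\partial f}{\partial y},\ 1\right) \]

28. More generally, we can always orient a surface patch $\Sigma: \mathbf{x} = \mathbf{x}(u, v)$, $(u, v) \in D$, by transferring the orientation from the $u, v$ plane. That is, we take $\mathbf{x}_u \to \mathbf{x}_v$ as the positive sense of orientation. Then the normal to $\Sigma$ is

\[ \mathbf{N} = \frac{\mathbf{x}_u \times \mathbf{x}_v}{\|\mathbf{x}_u \times \mathbf{x}_v\|} \]

Figure 8.14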
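The two descriptions of the normal in Examples 27 and 28 can be compared directly. The short sketch below is an illustration only (not part of the original text): it evaluates the closed-form upward normal of a graph and the normalized vector product of the coordinate tangents at one point and confirms that they agree. The function names and the sample surface $f(x, y) = x^2 - y^2$ at the point $(1, 2)$ are assumptions made for the example.

```python
import math

def graph_normal(fx, fy):
    """Upward unit normal of z = f(x, y), given the partials f_x, f_y at a point."""
    s = math.sqrt(1.0 + fx * fx + fy * fy)
    return (-fx / s, -fy / s, 1.0 / s)

def patch_normal(xu, xv):
    """Unit normal (x_u x x_v) / ||x_u x x_v|| from the two coordinate tangents."""
    c = (xu[1] * xv[2] - xu[2] * xv[1],
         xu[2] * xv[0] - xu[0] * xv[2],
         xu[0] * xv[1] - xu[1] * xv[0])
    s = math.sqrt(sum(t * t for t in c))
    return tuple(t / s for t in c)

# Hypothetical sample: f(x, y) = x**2 - y**2 at (1, 2), so f_x = 2 and f_y = -4.
fx, fy = 2.0, -4.0
n_graph = graph_normal(fx, fy)
# For a graph the coordinate tangents are x_x = (1, 0, f_x) and x_y = (0, 1, f_y).
n_patch = patch_normal((1.0, 0.0, fx), (0.0, 1.0, fy))
print(n_graph, n_patch)
```

Swapping the two tangents in `patch_normal` reverses the sign of the result, which is the computational counterpart of choosing the opposite orientation.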
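Just as, in the case of curves, we introduced the "vector length element" $d\mathbf{x} = \mathbf{T}\,ds$ along the curve, we introduce the vector area element $d\mathbf{S} = \mathbf{N}\,dS$ on a surface. Notice that, in terms of coordinates,

\[ d\mathbf{S} = \mathbf{N}\,\|\mathbf{x}_u \times \mathbf{x}_v\| \, du\, dv = \mathbf{x}_u\,du \times \mathbf{x}_v\,dv \]

(i.e., it is the vector product of the length elements along the parametric curves). In this way we can integrate vector fields along oriented surfaces:

Definition 10. If $\Sigma$ is an oriented surface and $\mathbf{v}$ is a vector field defined around $\Sigma$, define the flux of $\mathbf{v}$ across $\Sigma$ by

\[ \int_\Sigma \langle \mathbf{v}, d\mathbf{S}\rangle = \int_\Sigma \langle \mathbf{v}, \mathbf{N}\rangle \, dS \]

The significance of the word flux will become apparent in the next section.

Example

29. Compute the flux of $\mathbf{v}(x, y, z) = (xy, yz, zx)$ across the graph of

\[ f(x, y) = x^2 + 2y^2 \qquad x^2 + y^2 \le 1 \]

We take $x, y$ as coordinates on $\Sigma$. Then

\[ \int_\Sigma \langle \mathbf{v}, \mathbf{N}\rangle \, dS = \int_{x^2+y^2\le 1} \left\langle \mathbf{v},\ \frac{\partial\mathbf{x}}{\partial x} \times \frac{\partial\mathbf{x}}{\partial y}\right\rangle dx\, dy = \int_{x^2+y^2\le 1} \det\begin{pmatrix} xy & y(x^2+2y^2) & (x^2+2y^2)x \\ 1 & 0 & 2x \\ 0 & 1 & 4y \end{pmatrix} dx\, dy \]

\[ = \int_{x^2+y^2\le 1} \left[x^3 + 2y^2x - 2x^2y - 4x^2y^2 - 8y^4\right] dx\, dy \]

The terms odd in $x$ or in $y$ integrate to zero over the disk, so in polar coordinates this is

\[ -\int_0^{2\pi}\int_0^1 r^5\left(4\cos^2\theta\sin^2\theta + 8\sin^4\theta\right) dr\, d\theta = -\frac{\pi}{6} - \pi = -\frac{7\pi}{6} \]

Suppose that $\mathbf{v}$ is the velocity field of a flow and $C$ is a closed path (oriented closed curve). In Section 8.2 we defined the circulation around a circle; we could use the same definition to define the circulation around $C$:

\[ \operatorname{circ}(C) = \int_C \langle \mathbf{v}, \mathbf{T}\rangle \, ds \qquad (8.46) \]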
($s$ = arc length along $C$, and $\mathbf{T}$ is the tangent vector to $C$). In Section 8.2 we used this idea to define curl $\mathbf{v}$, the "infinitesimal circulation" about a point; now we ask if we can recapture the total circulation from the infinitesimal. A clue is obtained by recognizing the integrand of (8.46) as the differential form associated to $\mathbf{v}$. If $\mathbf{v} = (v^1, v^2, v^3)$, then, on the curve, $\langle \mathbf{v}, \mathbf{T}\rangle\,ds = \sum_i v^i\,dx^i = \langle \mathbf{v}, d\mathbf{x}\rangle$. What we are then asking for is the analog for surfaces of Green's theorem. Since the curl plays the same role in three variables that $d\omega$ plays in two, it is no accident that such a theorem exists.

Stokes' Theorem

Suppose now that $\Sigma$ is an oriented surface lying in the domain of the vector field $\mathbf{v}$, and $D$ is a subset of $\Sigma$ bounded by a curve $\Gamma$. For the purposes of integration we must choose an orientation of $\Gamma$. It will be the natural one corresponding to the given orientation of $\Sigma$: $\Gamma$ winds counterclockwise around $D$. To be more precise, we shall define the positively directed tangent. Let $p \in \Gamma$ and consider a small path $\gamma$, with tangent vector $\mathbf{t}$ at $p$, which crosses $\Gamma$ and is directed so that it enters $D$. Then the tangent vector we wish to choose is that one $\mathbf{T}$ such that the rotation $\mathbf{T} \to \mathbf{t}$ is positive (see Figure 8.15). This corresponds to the counterclockwise sense of rotation about the normal to the tangent plane. When the boundary of $D$ is so oriented it is a path, denoted $\partial D$. Now the theorem we have in mind (Stokes' theorem) asserts that the circulation around $\partial D$ is given by

\[ \int_D \langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle \, dS \]

Figure 8.15
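The assertion can be checked numerically on a simple case before it is proved. The following is a minimal sketch, not part of the original text: the field $\mathbf{v} = (z, x, y)$ (whose curl is $(1, 1, 1)$), the paraboloid cap $z = x^2 + y^2$ over the unit disk with upward normal, and all function names are assumptions chosen for illustration. Both sides are approximated by midpoint rules; for a graph with upward normal the vector area element is $(-f_x, -f_y, 1)\,dx\,dy$, and the boundary circle is traversed counterclockwise as seen from above.

```python
import math

def circulation(v, curve, dcurve, n=2000):
    """Midpoint-rule approximation of the line integral of <v, dx> over a
    closed curve parametrized on [0, 2*pi]; dcurve is the derivative of curve."""
    h = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        vx, vy, vz = v(*curve(t))
        dx, dy, dz = dcurve(t)
        total += (vx * dx + vy * dy + vz * dz) * h
    return total

def curl_flux(curl_v, f, f_x, f_y, n=300):
    """Flux of curl v through the graph z = f(x, y) over the unit disk,
    upward normal, using dS = (-f_x, -f_y, 1) dx dy in polar coordinates."""
    hr = 1.0 / n
    ht = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * hr
        for j in range(n):
            t = (j + 0.5) * ht
            x, y = r * math.cos(t), r * math.sin(t)
            cx, cy, cz = curl_v(x, y, f(x, y))
            total += (-cx * f_x(x, y) - cy * f_y(x, y) + cz) * r * hr * ht
    return total

def v(x, y, z):
    return (z, x, y)

def curl_v(x, y, z):
    return (1.0, 1.0, 1.0)  # curl of (z, x, y), computed by hand

def f(x, y):
    return x * x + y * y

def f_x(x, y):
    return 2.0 * x

def f_y(x, y):
    return 2.0 * y

def boundary(t):
    return (math.cos(t), math.sin(t), 1.0)  # the circle z = 1, counterclockwise

def dboundary(t):
    return (-math.sin(t), math.cos(t), 0.0)

lhs = circulation(v, boundary, dboundary)
rhs = curl_flux(curl_v, f, f_x, f_y)
print(lhs, rhs)  # the two numbers should agree; by hand both equal pi
```

Reversing the direction of the boundary parametrization changes the sign of the circulation only, which is why the orientation convention just described is needed.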
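In order to derive this theorem from Green's theorem we must ensure that the conditions of Green's theorem will be met. Hence the following notion of a regular domain.

Definition 11. Let $\Sigma$ be a surface in $R^3$. A subset $D$ of $\Sigma$ will be called a regular domain if it can be partitioned into finitely many subsets of surface patches which correspond to regular domains in the plane in the particular coordinate representation.

Theorem 8.3. Let $\mathbf{v}$ be a vector field defined in a domain $U$ in $R^3$, and suppose $\Sigma$ is an oriented surface lying in $U$ with normal $\mathbf{N}$. Let $D$ be a regular domain in $\Sigma$ whose boundary $\partial D$ is a curve. Then

\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{T}\rangle \, ds = \int_D \langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle \, dS \qquad (8.47) \]

Proof. Since $D$ is regular, there are coordinate patches $\Sigma_1, \ldots, \Sigma_n$ and a partition $D = D_1 \cup \cdots \cup D_n$ of $D$ such that $D_i \subset \Sigma_i$ and $D_i$ corresponds to a regular domain in the $\Sigma_i$ coordinates. Now let $B_1, \ldots, B_m$ be balls in $R^3$ such that $D \subset B_1 \cup \cdots \cup B_m$ and each $B_j$ lies completely inside one of the coordinate patches $\Sigma_i$. Let $\rho_1, \ldots, \rho_m$ be a partition of unity subordinate to this cover. Then, since $\sum_j \rho_j = 1$ on $D$,

\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{T}\rangle \, ds = \sum_j \int_{\partial D} \langle \rho_j\mathbf{v}, \mathbf{T}\rangle \, ds = \sum_{i,j} \int_{\partial D_i} \langle \rho_j\mathbf{v}, \mathbf{T}\rangle \, ds \]

since each part of $\partial D_i$ which is not on $\partial D$ appears as part of $\partial D_k$ for some $k \ne i$ and with the opposite orientation. Similarly,

\[ \int_D \langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle \, dS = \sum_j \int_D \langle \operatorname{curl}(\rho_j\mathbf{v}), \mathbf{N}\rangle \, dS = \sum_{i,j} \int_{D_i} \langle \operatorname{curl}(\rho_j\mathbf{v}), \mathbf{N}\rangle \, dS \]

Thus, we only need to show that the right-hand sides are equal termwise: we may assume that we are in a coordinate patch. This is now our situation. Let $\Sigma$ be a surface patch coordinatized by

\[ \mathbf{x} = (x^1(u, v),\ x^2(u, v),\ x^3(u, v)) \qquad (u, v) \in N \subset R^2 \]

and suppose $\Delta$ is a regular domain in $N$ and $D$ is the subdomain of $\Sigma$ corresponding to $\Delta$: $D = \{\mathbf{x}(u, v) : (u, v) \in \Delta\}$. Let $\mathbf{v} = (v^1, v^2, v^3)$ be a vector field defined on $\Sigma$. Then we must verify

\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{T}\rangle \, ds = \int_D \langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle \, dS \]

This is just the computation that $\langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle\,dS = (d\omega_{\mathbf{v}})\,du\,dv$ under the change of variables. First, we study the left integral:

\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{T}\rangle \, ds = \int_{\partial D} \sum_i v^i\,dx^i = \int_{\partial\Delta} \left(\sum_i v^i\frac{\partial x^i}{\partial u}\right) du + \left(\sum_i v^i\frac{\partial x^i}{\partial v}\right) dv \]

By Green's theorem this is

\[ \int_\Delta \left[\frac{\partial}{\partial u}\left(\sum_i v^i\frac{\partial x^i}{\partial v}\right) - \frac{\partial}{\partial v}\left(\sum_i v^i\frac{\partial x^i}{\partial u}\right)\right] du\, dv \qquad (8.48) \]

Now

\[ \frac{\partial}{\partial u}\left(v^i\frac{\partial x^i}{\partial v}\right) = \sum_j \frac{\partial v^i}{\partial x^j}\frac{\partial x^j}{\partial u}\frac{\partial x^i}{\partial v} + v^i\frac{\partial^2 x^i}{\partial u\,\partial v} \]

\[ \frac{\partial}{\partial v}\left(v^i\frac{\partial x^i}{\partial u}\right) = \sum_j \frac{\partial v^i}{\partial x^j}\frac{\partial x^j}{\partial v}\frac{\partial x^i}{\partial u} + v^i\frac{\partial^2 x^i}{\partial v\,\partial u} \]

The second-derivative terms cancel, so the integrand in (8.48) is

\[ \sum_{i,j} \frac{\partial v^i}{\partial x^j}\left(\frac{\partial x^j}{\partial u}\frac{\partial x^i}{\partial v} - \frac{\partial x^j}{\partial v}\frac{\partial x^i}{\partial u}\right) = \sum_{i<j} \left(\frac{\partial v^j}{\partial x^i} - \frac{\partial v^i}{\partial x^j}\right)\left(\frac{\partial x^i}{\partial u}\frac{\partial x^j}{\partial v} - \frac{\partial x^i}{\partial v}\frac{\partial x^j}{\partial u}\right) = \langle \operatorname{curl} \mathbf{v},\ \mathbf{x}_u \times \mathbf{x}_v\rangle \]

Hence, after Green's theorem the left integral becomes

\[ \int_\Delta \langle \operatorname{curl} \mathbf{v},\ \mathbf{x}_u \times \mathbf{x}_v\rangle \, du\, dv \]

But the right integral is

\[ \int_\Delta \langle \operatorname{curl} \mathbf{v}, \mathbf{N}\rangle\,\|\mathbf{x}_u \times \mathbf{x}_v\| \, du\, dv = \int_\Delta \langle \operatorname{curl} \mathbf{v},\ \mathbf{x}_u \times \mathbf{x}_v\rangle \, du\, dv \]

since $\mathbf{x}_u \times \mathbf{x}_v = \|\mathbf{x}_u \times \mathbf{x}_v\|\,\mathbf{N}$. The proof is concluded.

Examples

30. Calculate $\int_{\partial\Sigma} \langle \mathbf{v}, d\mathbf{x}\rangle$, where $\Sigma$ is the surface $\Sigma: z = x^2$, $0 < x < 1$, $0 < y < 1$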
664 8 Potential Theory in Three Dimensions and v(x, y, z) = — (y, z, x). We make the computations: curlv= -(1,1,0 χ, = (1,0,2χ) χ, = (ο, ι, о) dS = (χ,,, χ x,,) dx dy = (—2x, 0, 1) dx dy f <v, dx} + f <curl v, dS} = f f (2χ + 1) dx dy = 0 ■>6ς ·Έ ·Ό "Ό 31. Let Σ be the surface patch χ = х(и, v) = (и cos v, и sin v, и cos 6i>) 0 < и < 1 0 < ι> < 2π Let Ν = (JV1, Ν2, Ν3) be the normal to Σ. Then ί (Ν1 + N2 + Ν3) dS = Γ <curl ν, dS} where ν = (у, ζ, χ). Thus, the sought-for integral can be computed as <v, dx}, where у is the curve и = 1: V: χ = x(i;) = (cos v, sin i>, cos 6v) <v, i/x> = ( — sin2 ι> + cos ucos 6v — 6 cos ysin 6v)dv= —π •'у -Ό EXERCISES 19. Calculate L/dS, where (a) fix, y, z) = x2 + 2y X:z2=x2+y2 0<z<l (b) f{x, y, z) = xy + yz + zx H:z = x2+y2 O^z^l (c) f(x,y,z) = xyz Σ: \(u, v) = («cos и, и sin υ, ν sin и) 0^й^2тг 0^υ^2ττ 20. Calculate J <v, t/S>, where (a) v(x, y, z) = (xy, yz, zx) Σ:ζ = β*' O^x^l O^^^l (b) y(x,y,z) = {\,-y,x) Z;x2+y2 + z2 = \ (c) v(x,y,z) = (l,0,y) Z:z = x2-y2 x2 + j»2 ^ 1
8.4 Surface Integrals and Stokes' Theorem 665 21. Suppose ν is a vector field defined in a neighborhood of the domain D. Show that f <curl v, dS> = 0 22. (a) Suppose that D is a regular domain on a surface Σ. Verify that for any vector a, - ί <a χ x, dx> = ί <a, dS> (b) Just as we integrated vector functions on the interval, we can integrate vector functions on lines and surfaces (and m space). Show that, for a regular domain D these vectors are the same: ί dS = - ί χ χ dx *D 3 JfiB (Hint: This follows from part (a).) 23. Show that if u, ν are C1 functions on the regular domain D that ί <,uVv, dx> = ί <V« X Vv, dS} Jen J d • PROBLEMS 29. If ω is a closed form defined in a neighborhood of the unit sphere in R3, show that there is a function / such that a> = dfon the sphere. 30. Consider the torus T: χ = x(«, v) = 2 cos и + cos ν χ = x(«, v) = (2 + cos i;)cos u, (2 + cos «)sin «, sin v) (a) Show that the differentials du, dv are well-defined differential forms on T. (b) If ω is a closed form defined on T, show that the integrals Jf •'ν are constant as Γ ranges over all circles ν = constant, and у ranges over all circles и = constant. (c) If ω is a closed form there are constants cu c2 and a differentiable function/such that ω = Ci du + c2 dv + df
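(Hint: Take $c_1 = \int_\Gamma \omega / 2\pi$, $c_2 = \int_\gamma \omega / 2\pi$, where the integrals are taken as defined in part (b).)

31. State and prove a fact like that in Problem 30(c) when $T$ is replaced by a cylinder.

32. Verify this restatement of Stokes' theorem: Let $\mathbf{v} = (F, G, H)$ be a vector field defined in a domain $U$ in $R^3$ and suppose that $\Sigma$ is an oriented surface lying in $U$ with normal $\mathbf{N} = (\cos\alpha, \cos\beta, \cos\gamma)$. If $D$ is a regular domain in $\Sigma$, then

\[ \int_{\partial D} F\,dx + G\,dy + H\,dz = \int_D \left[\left(\frac{\partial H}{\partial y} - \frac{\partial G}{\partial z}\right)\cos\alpha + \left(\frac{\partial F}{\partial z} - \frac{\partial H}{\partial x}\right)\cos\beta + \left(\frac{\partial G}{\partial x} - \frac{\partial F}{\partial y}\right)\cos\gamma\right] dS \]

33. Let $D$ be a regular domain on the oriented surface $\Sigma$. Show that if $\mathbf{v}$ is a vector field defined on $\Sigma$,

\[ \int_{\partial D} \langle \mathbf{v}, \mathbf{N}_\gamma\rangle \, ds = \int_D \langle \operatorname{curl}(\mathbf{v} \times \mathbf{N}), \mathbf{N}\rangle \, dS \]

where $\mathbf{N}_\gamma$ is the unit surface normal to $\partial D$ (see Problem 24).

8.5 The Divergence Theorem

Let $\mathbf{v}$ be a vector field defined in a domain $U \subset R^3$, and $\mathbf{x} = \boldsymbol{\phi}(\mathbf{x}_0, t)$ the associated steady flow. Let $D$ be a domain whose closure is contained in $U$ such that $\partial D$ is a sufficiently differentiable surface. Notice that $\partial D$ is orientable, since we can choose as normal vector the unit vector $\mathbf{N}$ which is exterior to the domain $D$. We shall assume throughout this section that this is the chosen normal.

For a small interval of time $\Delta t$, let us attempt to calculate the amount of fluid that passes through $\partial D$. For $\mathbf{x}_0 \in \partial D$, the particle at the point $\boldsymbol{\phi}(\mathbf{x}_0, -t)$ at time 0, for $0 < t < \Delta t$, passes through $\mathbf{x}_0$, since $\boldsymbol{\phi}(\mathbf{x}_0, -t + t) = \boldsymbol{\phi}(\mathbf{x}_0, 0) = \mathbf{x}_0$. Thus, the volume of the fluid passing through $\partial D$ in the time $\Delta t$ is the volume of the domain

\[ D_{\Delta t} = \{\mathbf{x} : \mathbf{x} = \boldsymbol{\phi}(\mathbf{x}_0, -t),\ 0 < t < \Delta t,\ \mathbf{x}_0 \in \partial D\} \]

We shall approximate this volume by linearizing locally. That is, we cover $\partial D$ by small neighborhoods $U_i$, and replace $U_i \cap \partial D$ by the piece $T_i$ of the tangent plane to $\partial D$ with the same area at some point in $U_i$. We assume also that $\boldsymbol{\phi}$ is a pure translation through $T_i$. Then the volume which passes through $T_i$ is a parallelepiped of volume

\[ \langle \boldsymbol{\phi}(\mathbf{x}_i, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}_i, 0),\ \mathbf{N}\rangle\,\Delta A_i \]

where $\mathbf{x}_i$ is some point in $U_i \cap \partial D$, and $\Delta A_i$ is the area of $T_i$. Let us point out that this is a signed volume, the sign being positive if the flow is into $D$ (since $\mathbf{N}$ is the exterior normal, and if $\boldsymbol{\phi}(\mathbf{x}_i, -\Delta t)$ is on the same side of $\partial D$ as $\mathbf{N}$, then $\langle \boldsymbol{\phi}(\mathbf{x}_i, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}_i, 0), \mathbf{N}\rangle$ is positive). This is in fact what we want, for we want to discover the flow into $D$ rather than the flow through $\partial D$. It follows that an approximation to the volume of $D_{\Delta t}$ is

\[ \sum_i \langle \boldsymbol{\phi}(\mathbf{x}_i, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}_i, 0),\ \mathbf{N}\rangle\,\Delta A_i \]

and by letting the covering get arbitrarily fine, we may replace this by an integral:

\[ \int_{\partial D} \langle \boldsymbol{\phi}(\mathbf{x}, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}, 0),\ \mathbf{N}\rangle \, dS \qquad (8.49) \]

The limit of $1/\Delta t$ times (8.49) as $\Delta t \to 0$ through positive values is the instantaneous flow into $D$, or the flux into $D$ at time $t = 0$.

Proposition 8. The flux out of $D$ at time $t = 0$ is

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle \]

Proof. The flux out of $D$ is

\[ -\lim_{\Delta t\to 0} \frac{1}{\Delta t}\int_{\partial D} \langle \boldsymbol{\phi}(\mathbf{x}, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}, 0),\ \mathbf{N}\rangle \, dS = \lim_{\Delta t\to 0} \frac{1}{-\Delta t}\int_{\partial D} \langle \boldsymbol{\phi}(\mathbf{x}, -\Delta t) - \boldsymbol{\phi}(\mathbf{x}, 0),\ d\mathbf{S}\rangle \]

\[ = \int_{\partial D} \left\langle \lim_{\Delta t\to 0} \frac{1}{\Delta t}\left[\boldsymbol{\phi}(\mathbf{x}, \Delta t) - \boldsymbol{\phi}(\mathbf{x}, 0)\right],\ d\mathbf{S}\right\rangle = \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle \]

Now the flux out of $D$ is the instantaneous rate of flow of fluid out of $D$. On physical grounds this should be identical to the instantaneous rate of expansion of the fluid in $D$, which is (as in Section 8.1) $\int_D \operatorname{div} \mathbf{v}\,dV$. Thus, we should expect

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle = \int_D \operatorname{div} \mathbf{v} \, dV \qquad (8.50) \]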
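and in fact this is the case. Equation (8.50) is known as the divergence theorem. For suitable domains it is an easy consequence of the fundamental theorem of calculus. As in the case of Green's theorem, we shall call such domains, or finite unions of such domains, regular domains. Many domains in $R^3$ are regular, but by no means are all regular. The general theorem, for an arbitrary domain, is not easy to prove and we shall here avoid the issue.

Definition 12. A domain $D$ in $R^3$ is regular if it can be expressed in each of these ways:

\[ D = \{(x, y, z) : (x, y) \in D_1,\ f(x, y) < z < g(x, y)\} \]
\[ \phantom{D} = \{(x, y, z) : (x, z) \in D_2,\ r(x, z) < y < s(x, z)\} \]
\[ \phantom{D} = \{(x, y, z) : (y, z) \in D_3,\ u(y, z) < x < v(y, z)\} \]

where all functions are continuously differentiable.

Lemma. If $\mathbf{v}$ is a differentiable vector field defined in a neighborhood of the regular domain $D$, then

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle = \int_D \operatorname{div} \mathbf{v} \, dV \]

Proof. Let $\mathbf{v} = v^1\mathbf{E}_1 + v^2\mathbf{E}_2 + v^3\mathbf{E}_3$. Then

\[ \operatorname{div} \mathbf{v} = \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3} \]

We shall show that for each $i$,

\[ \int_{\partial D} \langle v^i\mathbf{E}_i, d\mathbf{S}\rangle = \int_D \frac{\partial v^i}{\partial x^i} \, dV \]

Then the lemma will follow by summing over $i$. To prove the $i$th case, we use the appropriate representation of the domain. Since all cases are then the same, we shall only verify one case, say the third. Now, using the expression

\[ D = \{(x, y, z) : (x, y) \in D_1,\ f(x, y) \le z \le g(x, y)\} \]

the boundary of $D$ consists of the part $\Sigma_0$ lying over $\partial D_1$ and the two surfaces

\[ \Sigma_1: z = f(x, y) \quad (x, y) \in D_1 \qquad\qquad \Sigma_2: z = g(x, y) \quad (x, y) \in D_1 \]

Since $\mathbf{E}_3$ is tangent to the surface lying over $\partial D_1$ at every point, the left-hand integral over $\Sigma_0$ vanishes. Now $\Sigma_1$ has the parametrization

\[ \mathbf{x} = (x,\ y,\ f(x, y)) \qquad (x, y) \in D_1 \]

Since the domain lies above this surface, the exterior normal points downward, so is determined by $-\mathbf{x}_x \times \mathbf{x}_y$ (see Figure 8.16). Now

\[ \mathbf{x}_x = (1, 0, f_x) \qquad \mathbf{x}_y = (0, 1, f_y) \]

so we have $d\mathbf{S} = (f_x, f_y, -1)\,dx\,dy$. Then

\[ \int_{\Sigma_1} \langle v^3\mathbf{E}_3, d\mathbf{S}\rangle = -\int_{D_1} v^3(x, y, f(x, y)) \, dx\, dy \]

Figure 8.16

A similar computation produces

\[ \int_{\Sigma_2} \langle v^3\mathbf{E}_3, d\mathbf{S}\rangle = \int_{D_1} v^3(x, y, g(x, y)) \, dx\, dy \]

Now, we compute $\int_D (\partial v^3/\partial z)\,dV$ by Fubini's theorem:

\[ \int_D \frac{\partial v^3}{\partial z} \, dV = \int_{D_1}\left[\int_{f(x,y)}^{g(x,y)} \frac{\partial v^3}{\partial z}(x, y, z) \, dz\right] dx\, dy = \int_{D_1} \left[v^3(x, y, g(x, y)) - v^3(x, y, f(x, y))\right] dx\, dy \]

by the fundamental theorem of calculus. But this is, according to our previous calculations, the same as $\int_{\partial D} \langle v^3\mathbf{E}_3, d\mathbf{S}\rangle$. Thus the lemma is verified.

Theorem 8.4. (Divergence Theorem) Let $\mathbf{v}$ be a continuously differentiable vector field defined in a domain $D$ in $R^3$. Suppose $D$ can be covered by finitely many balls $B_1, \ldots, B_n$ such that each $D \cap B_i$ is a regular domain. Then

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle = \int_D \operatorname{div} \mathbf{v} \, dV \]

Proof. Let $\rho_1, \ldots, \rho_n$ be a partition of unity subordinate to $B_1, \ldots, B_n$. Then

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle = \sum_i \int_{\partial D} \langle \rho_i\mathbf{v}, d\mathbf{S}\rangle = \sum_i \int_{\partial(D\cap B_i)} \langle \rho_i\mathbf{v}, d\mathbf{S}\rangle \]

\[ \int_D \operatorname{div} \mathbf{v} \, dV = \sum_i \int_D \operatorname{div}(\rho_i\mathbf{v}) \, dV = \sum_i \int_{D\cap B_i} \operatorname{div}(\rho_i\mathbf{v}) \, dV \]

for the customary reasons: $\sum_i \rho_i = 1$ and $\rho_i = 0$ outside $B_i$. By the lemma, the right-hand sides are the same termwise, so the left-hand sides are the same.

We shall henceforth describe domains of the type referred to in Theorem 8.4 as regular.

Examples

32. First of all, the result of Exercise 21 follows easily from the divergence theorem, since $\operatorname{div} \operatorname{curl} \mathbf{v} = 0$. For then

\[ \int_{\partial D} \langle \operatorname{curl} \mathbf{v}, d\mathbf{S}\rangle = \int_D \operatorname{div} \operatorname{curl} \mathbf{v} \, dV = 0 \]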
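33. Let $D = \{x^2 + y^2 + z^2 \le 1\}$, $f(x, y, z) = x^2 + y^2 + z^2$. Then

\[ \int_{\partial D} \langle \nabla f, d\mathbf{S}\rangle = \int_D \operatorname{div} \nabla f \, dV = 6\int_D dV = 8\pi \]

34. Let $D$ be the domain $\{1 \ge z \ge x^2 + y^2\}$, and let $\mathbf{v}(x, y, z) = (xy, yz, x)$. Then $\operatorname{div} \mathbf{v} = y + z$ and

\[ \int_D \operatorname{div} \mathbf{v} \, dV = \int_0^1\left[\int_{x^2+y^2\le z} (y + z) \, dx\, dy\right] dz = \pi\int_0^1 z^2 \, dz = \frac{\pi}{3} \]

On the other hand, taking the upward normal on both pieces of the boundary,

\[ \int_{\partial D} \langle \mathbf{v}, d\mathbf{S}\rangle = \int_{z=1} \langle \mathbf{v}, d\mathbf{S}\rangle - \int_{z=x^2+y^2} \langle \mathbf{v}, d\mathbf{S}\rangle \]

\[ = \int_{x^2+y^2\le 1} x \, dx\, dy - \int_{x^2+y^2\le 1} \left\langle (xy,\ y(x^2+y^2),\ x),\ (-2x, -2y, 1)\right\rangle dx\, dy = 2\int_{x^2+y^2\le 1} (y^2x^2 + y^4) \, dx\, dy = \frac{\pi}{3} \]

The Heat Equation

In Chapter 6 in our discussion of the heat equation we postponed its derivation in dimensions greater than one. We had to await the divergence theorem; with that we can carry through our argument just as in the one-dimensional case. Thus, we suppose a homogeneous metallic object $U$ in $R^3$ has at time $t$ a temperature distribution $u(\mathbf{x}, t)$. According to the laws of thermodynamics, the vector field $\mathbf{q}$ associated to the flow of heat energy is proportional to the gradient of the temperature, but for sign:

\[ \mathbf{q} + c\,\nabla u = 0 \qquad (8.51) \]

Another basic principle is this: The increase in temperature of a unit mass is proportional to the increase in heat energy. More specifically, the change in energy in any given domain $D$ in a time interval $\Delta t$ is given by

\[ k\rho\int_D \Delta u \, dV \]

where $\Delta u(\mathbf{x}, \Delta t)$ is the change in temperature at $\mathbf{x}$ over the period $\Delta t$, $\rho$ is the density, and $k$ is the proportionality constant (the specific heat). Thus, the rate of increase of heat energy in $D$ is

\[ k\rho\int_D \frac{\partial u}{\partial t} \, dV \]

Now, we can compute (using the law of conservation of energy) the rate of increase of energy in $D$; it is the flux into $D$ across the boundary. Thus we obtain this basic equation for every domain $D$:

\[ -\int_{\partial D} \langle \mathbf{q}, d\mathbf{S}\rangle = k\rho\int_D \frac{\partial u}{\partial t} \, dV \]

By the divergence theorem and (8.51) the left-hand side is $c\int_D \operatorname{div} \nabla u\,dV$, so we have

\[ \int_D \operatorname{div} \nabla u \, dV = \frac{k\rho}{c}\int_D \frac{\partial u}{\partial t} \, dV \]

for every domain $D$. Thus the two integrands must be the same, and we obtain the heat equation:

\[ \Delta u = \left(\frac{k\rho}{c}\right)\frac{\partial u}{\partial t} \]

As we saw in Chapter 6, the steady state (or equilibrium) temperature distribution solves Laplace's equation: $\operatorname{div} \nabla u = 0$, that is,

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = 0 \]

• EXERCISES

24. If $\Sigma$ is an oriented surface with normal $\mathbf{N}$ and $f$ is a $C^2$ function defined near $\Sigma$, we denote $\langle \nabla f, \mathbf{N}\rangle$ by $\partial f/\partial N$. Show that

\[ \int_{\partial D} \frac{\partial f}{\partial N} \, dS = \int_D \Delta f \, dV \]

for any regular domain $D$.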
8.5 The Divergence Theorem 673 25. If ν is a vector field such that div ν = 1, then for any regular domain D vol(i») = f <v, dS> JeD In particular, we may make any one of these choices for v: (x, 0, 0) (0, y, 0) (0, 0, z) Find the volume of these domains, using the divergence theorem. (a) The cap z^ax2 + by2 0 <ζ<3. (b) The cone z2 ^ ax2 + by2 0 <, ζ ^ 1. (c) The tetrahedron bounded by the planes z = 0 x + y + z=l x = 2y,y = 0. 26. Verify this formula for any regular domain: a\ \MdV= \ ||x||<x,dS> jd JeD 27. Here is another way of expressing the divergence theorem, which is free of vector notation. Express N in terms of its direction cosines: N = (cos a, cos β, cos y) Then for any three functions F, G, H, Г г leF 8G ёН\ 28. Compute (a) U <(*2> У2> z2)> rfS> where Σ is the (oriented) surface of the cube with side edge 2, and center at the origin. (b) J (x cos α - у cos β - ζ cos у) dS over the sphere S: x2 + y2 + (z — l)2 = 1, where (cos a, cos β, cos y) is the normal. • PROBLEMS 34. Let Σ be a surface which intersects each ray from the origin in at most one point. The set of rays which intersect Σ will pierce the unit sphere in a set S. The area of S is the solid angle subtended by Σ. Show that the solid angle is given by г <*, </S) Jx H3
674 8 Potential Theory in Three Dimensions 35. Vector-valued functions can easily be integrated over any domain, coordinate by coordinate. Verify these formulas for a regular domain D: \ ν X dS = f curl ν dV f fdS=( VfdV f NdS = 0 Jan 36. Let ν be a divergence-free vector field defined in a domain U. Show that if у is a closed curve defined in U, then for any regular domain Лопа surface Σ such that 8D = y, the integral f <v,</S> •>d always has the same value. 37. Show that the function/is harmonic in the domain D if and only if, for every ball В <= D, f <V/,dS> = 0 JdB 38. Suppose there is given a flow in R3 with these properties: (a) The flow has constant velocity outside of some large bounded set. (b) The flow on the {z = 0} plane remains along that plane (no fluid passes from the upper half space to the lower half space ),Show that L diwi/K=0 where Η is the half space {z ;> 0}. 8.6 Dirichlet's Principle Let D be a domain in R3, and suppose ν is the velocity field of a flow through D which is steady (time independent). The total kinetic energy of the flow is given by the integral 2JD' p\\y\\2 dV (8.52)
where $\rho$ is the density of the fluid (we shall here take $\rho$ to be constant). An important physical problem is this: find the flow which minimizes the energy (8.52) subject to certain conditions being fixed on $\partial D$. For example, we may assume that the normal component of the flow $\langle \mathbf{v}, \mathbf{N}\rangle$ through the boundary is fixed. Or we may assume that the flow is conservative, that is, $\mathbf{v}$ has a potential function, and the values of the potential are fixed on the boundary. These problems are analogous to Neumann's and Dirichlet's problems respectively (see Chapter 6). Dirichlet's principle is that the flow which minimizes the energy is the gradient of a harmonic function (solution of Laplace's equation). In this section we shall derive Dirichlet's principle and indicate how the techniques involved can be used to discover the solution to the problems. In order to do this, let us make these problems precise. Let $D$ be a domain in $R^3$, and $f$ a function defined on $\partial D$.

I. (Dirichlet's Problem) Among all $C^2$ functions $u$ defined on $D$ which have the boundary values $f$, find the one which minimizes the integral

\[ \int_D \|\nabla u\|^2 \, dV \qquad (8.53) \]

II. (Neumann's Problem) Among all $C^2$ functions $u$ defined on $D$ such that $\langle \nabla u, \mathbf{N}\rangle = f$ on $\partial D$, find the one which minimizes the integral (8.53).

In order to study these problems we need (i) to relate boundary data to the integral (8.53), (ii) to discover an interpretation of (8.53) which will suggest a technique for minimizing that integral. The first need is filled by the divergence theorem, which will take the form of Green's identities (given below). The interpretation requested in (ii) is that of Euclidean vector spaces and the technique will be orthogonal projection. Let us describe this idea more fully. Let $C^2(D)$ represent the collection of functions which are twice continuously differentiable on $D$.
We can make this vector space into a Euclidean vector space by defining on it the inner product

\[ E\langle u, v\rangle = \int_D \langle \nabla u, \nabla v\rangle \, dV \qquad (8.54) \]

Then (8.53) is the square of the length of $\nabla u$ in terms of this inner product. We shall denote (8.53) by $E^2\langle u\rangle$. Our problem is to minimize this length among all functions with the given boundary value $f$. Let $M_f$ be the space of functions in $C^2(D)$ with boundary value $f$. Then $M_f$ is a translate of the space $M_0$: if $u$ is a function with boundary value $f$, then $M_f = \{u + g : g \in M_0\}$. Now it is a simple principle of Euclidean vector spaces that the vector in $M_f$ which is closest to 0 is orthogonal to every difference of two vectors of $M_f$, that is, orthogonal to $M_0$.
The solution to our problem will then be that function in $M_f \cap M_0^{\perp}$. Finally, we can identify $M_0^{\perp}$ as the space of harmonic functions.

There is one fault with our reasoning. The "simple principle" above is one about finite-dimensional Euclidean vector spaces (recall Chapter 1), and it is not necessarily true in the infinite-dimensional case (of which ours is a prime example). The problem is that there need not be any point in $M_f \cap M_0^{\perp}$; and our argument will be complete once this problem of existence is resolved. The mid-19th century mathematicians such as Dirichlet and Riemann were little troubled by such problems; it was during the late 19th century that mathematicians began to think of existence questions as crucial (with good reason). And it was not until the last decade of that century that the existence problem was effectively solved. (The reader is referred to the history by Kellogg (pp. 277-286) for a fuller account.)

The link between the geometry described above and the subject of harmonic functions comes out of certain computations involving the divergence theorem (Green's identities). We now present these. We shall adopt one more notational convention before proceeding (already foreseen in the problems): if $u$ is defined on the oriented surface $\Sigma$, then $\langle \nabla u, N\rangle$ is the directional derivative of $u$ in the direction normal to $\Sigma$. We shall denote it by $\partial u/\partial N$.

Theorem 8.5. (Green's Identities) Let $f$, $g$ be two $C^2$ functions defined on a regular domain $D$. Then
\[ \int_{\partial D} f \frac{\partial g}{\partial N} \, dS = \int_D \left[ f \Delta g + \langle \nabla f, \nabla g\rangle \right] dV \tag{8.55} \]

Proof.
\[ \int_{\partial D} f \frac{\partial g}{\partial N} \, dS = \int_{\partial D} \langle f \nabla g, N\rangle \, dS = \int_D \operatorname{div}(f \nabla g) \, dV \]
But, as is easily computed (see Exercise 10),
\[ \operatorname{div}(f \nabla g) = f \operatorname{div} \nabla g + \langle \nabla f, \nabla g\rangle \]
so Theorem 8.5 is proven.

Corollary 1.
(i) If $g$ is harmonic, $\int_{\partial D} f (\partial g/\partial N) \, dS = E\langle f, g\rangle$.
(ii) If $f \in M_0$ and $g$ is harmonic, $E\langle f, g\rangle = 0$.
(iii) If $f$ and $g$ are harmonic,
\[ \int_{\partial D} f \frac{\partial g}{\partial N} \, dS = \int_{\partial D} g \frac{\partial f}{\partial N} \, dS \tag{8.56} \]
(iv) If $f$ is orthogonal to every function in $M_0$, $f$ is harmonic.

Proof. (i) If $g$ is harmonic, then $\Delta g = 0$, so by (8.55) we have
\[ \int_{\partial D} f \frac{\partial g}{\partial N} \, dS = \int_D \langle \nabla f, \nabla g\rangle \, dV = E\langle f, g\rangle \tag{8.57} \]
(ii) Now, if $f \in M_0$, $f$ has boundary values $0$, so the integral on the left of (8.57) also vanishes, and thus $E\langle f, g\rangle = 0$.

(iii) If $g$ is harmonic, we have (8.57). If $f$ is also harmonic we may interchange the roles of $f$ and $g$ in (8.57), obtaining
\[ \int_{\partial D} g \frac{\partial f}{\partial N} \, dS = E\langle g, f\rangle \]
Thus (8.56) results since $E\langle g, f\rangle = E\langle f, g\rangle$.

(iv) If $g \in M_0$, then by (8.55) (interchanging $f$ and $g$), we have
\[ \int_D g \Delta f \, dV + E\langle f, g\rangle = 0 \]
Now if $f$ is orthogonal to $M_0$, then $\int_D g \Delta f \, dV = 0$ for every $g$ with boundary value zero. This implies that $\Delta f = 0$ everywhere. For suppose $\Delta f(p) > 0$ for some $p$ in $D$. Let $B$ be a ball in $D$ centered at $p$ in which $\Delta f > 0$, and let $\rho$ be a nonnegative $C^2$ function such that $\rho(p) = 1$ and $\rho = 0$ off $B$. Then $\rho \in M_0$, so
\[ \int_D \rho \Delta f \, dV = \int_B \rho \Delta f \, dV = 0 \]
Since $\rho \Delta f \ge 0$ in $B$, it must be zero. Thus $\Delta f(p) = \rho(p) \Delta f(p) = 0$, a contradiction. (The case $\Delta f(p) < 0$ is handled in the same way.)

Corollary 2. The orthogonal complement of $M_0$ in $C^2(D)$ with the inner product $E\langle f, g\rangle$ is the space $H$ of harmonic functions.

Theorem 8.6. (Dirichlet's Principle) Let $D$ be a regular domain in $R^3$ and suppose $f$ is a continuous function on $\partial D$. Let $M_f$ be the class of functions in $C^2(D)$ with boundary value $f$.
(i) If there is a harmonic function in $M_f$, it minimizes the energy integral.
(ii) If there is a function in $C^2(D)$ which minimizes the energy integral, it must be harmonic.

Proof. These facts follow from the same reasoning as in Euclidean geometry.

(i) Let $u \in M_f$ be such that $\Delta u = 0$. If $g$ is another function in $M_f$, then $g - u = 0$ on $\partial D$, so $g - u \in M_0$. Hence
\[ E^2\langle g\rangle = E^2\langle g - u + u\rangle = E^2\langle g - u\rangle + 2E\langle g - u, u\rangle + E^2\langle u\rangle = E^2\langle g - u\rangle + E^2\langle u\rangle \]
since $u \perp M_0$. Thus $E^2\langle g\rangle \ge E^2\langle u\rangle$ for every $g \in M_f$.

(ii) If $u \in C^2(D)$ minimizes the energy integral in $M_f$, it must be orthogonal to $M_0$. For if $g \in M_0$, then $u \pm g$ are both in $M_f$, and thus $E^2\langle u \pm g\rangle \ge E^2\langle u\rangle$. But
\[ E^2\langle u \pm g\rangle = E^2\langle u\rangle \pm 2E\langle u, g\rangle + E^2\langle g\rangle \]
so $0 \le \pm 2E\langle u, g\rangle + E^2\langle g\rangle$ for all $g \in M_0$. Consider for $t \in R$ the function
\[ \phi(t) = 2E\langle u, tg\rangle + E^2\langle tg\rangle \]
Since $\phi(t) \ge 0$ for all $t$ (positive or negative), and $\phi(0) = 0$, we must have $\phi'(0) = 0$. But $\phi'(0) = 2E\langle u, g\rangle$. Thus $u \perp M_0$, so, by Corollary 1(iv), $u$ is harmonic.

In order to solve Dirichlet's problem by his principle it remains to show that there exists a function in $C^2(D)$ which minimizes the energy integral. The technique for carrying this through was finally accomplished by Hermann Weyl (1926), and his methods have had far-reaching effect in a wide class of boundary value problems for partial differential equations.

Harmonic Functions

We can use Green's identities and Dirichlet's principle in order to derive the basic properties of harmonic functions (analogous to those in two dimensions given in Chapter 6). Out of this will come a hint for solving the Dirichlet problem.

Proposition 9. Let $f$ be a $C^2$ function defined on the boundary of a regular domain $D$. There is at most one harmonic function with boundary value $f$.
Proof. If $u$, $v$ are both harmonic and have the boundary values $f$, then $u - v$ is at the same time harmonic and in $M_0$. Thus $E\langle u - v, u - v\rangle = 0$. But
\[ E\langle u - v, u - v\rangle = \int_D \|\nabla(u - v)\|^2 \, dV \]
so we must have $\nabla(u - v) = 0$ in $D$. Thus $u - v$ is constant. Since $u = v$ on $\partial D$, $u$ is identical to $v$.

The gravitational field of a particle of unit mass situated at the point $p$ is, according to Newton, given as
\[ -\frac{1}{\|x - p\|^2} \, \frac{x - p}{\|x - p\|} \tag{8.58} \]
This field is easily seen to be conservative and divergence free, thus it is the gradient of a harmonic function, called Newton's gravitational potential. Writing (8.58) out in coordinates, we have
\[ \frac{-(x^1 - p^1, \; x^2 - p^2, \; x^3 - p^3)}{\left[(x^1 - p^1)^2 + (x^2 - p^2)^2 + (x^3 - p^3)^2\right]^{3/2}} \]
and it is not hard to see that this is the gradient of
\[ \Pi_p(x) = \|x - p\|^{-1} = \left[(x^1 - p^1)^2 + (x^2 - p^2)^2 + (x^3 - p^3)^2\right]^{-1/2} \]
This particular function stands at the beginning of a sequence of ideas which lead to a technique, due to Green, for solving Dirichlet's problem. These steps were motivated by an inquiry into the nature of gravitational fields (due to masses more general than that of a particle), the point being to show that every harmonic function arises as the potential of a gravitational field. Green's first result is an easy consequence (reminiscent of the Cauchy integral formula) of his identities.

Proposition 10. Let $D$ be a regular domain, and $h$ a function harmonic on $D$. Let $p \in D$. Then
\[ h(p) = -\frac{1}{4\pi} \int_{\partial D} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.59} \]

Proof. Once again we first remove a small ball $B(p, \varepsilon)$ centered at $p$ and contained in $D$. Since both $h$, $\Pi_p$ are harmonic in $D - B(p, \varepsilon)$, Corollary 1 (iii) applies
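in that domain. Thus,
\[ \int_{\partial(D - B(p, \varepsilon))} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS = 0 \]
This implies that
\[ \int_{\partial D} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS = \int_{\partial B(p, \varepsilon)} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.60} \]
Now the second integral can be computed using spherical coordinates centered at $p$:
\[ x = p + (r \cos\theta \cos\phi, \; r \sin\theta \cos\phi, \; r \sin\phi) \]
Then $\Pi_p(x) = r^{-1}$. The sphere $S(p, \varepsilon)$ is given by
\[ x = p + \varepsilon(\cos\theta \cos\phi, \; \sin\theta \cos\phi, \; \sin\phi) \]
and its exterior normal is the radial vector, so $\partial/\partial N = \partial/\partial r$. The element of area on $\partial B(p, \varepsilon)$ is $dS = \varepsilon^2 \cos\phi \, d\theta \, d\phi$. Thus the right-hand side of (8.60) is
\[ \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \left[ -\frac{h(x)}{\varepsilon^2} - \frac{1}{\varepsilon} \frac{\partial h}{\partial r} \right] \varepsilon^2 \cos\phi \, d\phi \, d\theta = -\int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} h(x) \cos\phi \, d\phi \, d\theta - \varepsilon \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \frac{\partial h}{\partial r} \cos\phi \, d\phi \, d\theta \]
Since $|\partial h/\partial r| \le \|\nabla h\|$, the second integrand is bounded as $\varepsilon \to 0$. Thus the second term will vanish as $\varepsilon \to 0$. As for the first term, $x \to p$ as $\varepsilon \to 0$, so $h(x) \to h(p)$. Thus, letting $\varepsilon \to 0$, our integral tends to
\[ -h(p) \int_{-\pi}^{\pi} \int_{-\pi/2}^{\pi/2} \cos\phi \, d\phi \, d\theta = -4\pi h(p) \]
which is what was desired.

Now, if $D$ is the ball of radius $R$ centered at $p$, then $\Pi_p(x) = \|x - p\|^{-1}$, so on $\partial D$, $\Pi_p = R^{-1}$ and $\partial \Pi_p/\partial N = -R^{-2}$. Equation (8.59) becomes
\[ h(p) = \frac{1}{4\pi R^2} \int_{\partial D} h \, dS + \frac{1}{4\pi R} \int_{\partial D} \frac{\partial h}{\partial N} \, dS \]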
Since $h$ is harmonic in $D$ the second integral vanishes (Problem 47) and we obtain the mean value property for harmonic functions in three variables.

Proposition 11. (Gauss' Theorem) If $h$ is harmonic in a neighborhood of $B(p, R)$, then $h$ satisfies the mean value property:
\[ h(p) = \frac{1}{4\pi R^2} \int_{\|x - p\| = R} h \, dS \]

Green's Function

Now, by Corollary 1(iii), if $k$ is any function harmonic on $D$, then (8.59) can be modified by $k$:
\[ h(p) = -\frac{1}{4\pi} \int_{\partial D} \left[ h \frac{\partial (\Pi_p - k)}{\partial N} - (\Pi_p - k) \frac{\partial h}{\partial N} \right] dS \tag{8.61} \]
Thus, if $k$ is chosen so as to solve Dirichlet's problem with the boundary value $\Pi_p$, the second term will vanish and we obtain an integral formula for $h$ in terms only of its boundary values. Finally, we could use that formula to solve Dirichlet's problem with any boundary values. Thus (8.59) allows us to reduce the general problem to that for a certain family $\{\Pi_p\}$ of specific functions, and for many regular domains that solution is easily found.

Definition 13. Let $D$ be a domain in $R^3$. If $k_p$ solves Dirichlet's problem with the boundary values $\Pi_p$, we shall call the function $G_p = k_p - \Pi_p$ the Green's function with singularity at $p$.

Theorem 8.7. Suppose $D$ is a regular domain such that there is a Green's function for every point $p$ in $D$. Then if $h$ is harmonic on $D$, $h$ can be found in terms of its boundary values:
\[ h(p) = \frac{1}{4\pi} \int_{\partial D} h \frac{\partial G_p}{\partial N} \, dS \]

Proof. By (8.61), but the second integral vanishes since $k_p - \Pi_p = 0$ on $\partial D$.
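The mean value property of Proposition 11 lends itself to a quick numerical check. The following is a minimal sketch, not part of the text: it approximates the surface average by a midpoint rule in the spherical coordinates used in the proof of Proposition 10 (with $dS = R^2 \cos\phi \, d\theta \, d\phi$). The test functions, the center `p`, the radius, and the grid sizes are arbitrary choices made for illustration.

```python
import math


def sphere_mean(h, p, R, n_theta=200, n_phi=100):
    """Midpoint-rule estimate of (1 / (4 pi R^2)) * integral of h over ||x - p|| = R."""
    d_theta = 2 * math.pi / n_theta
    d_phi = math.pi / n_phi
    total = 0.0
    for i in range(n_theta):
        theta = -math.pi + (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = -math.pi / 2 + (j + 0.5) * d_phi
            x = (p[0] + R * math.cos(theta) * math.cos(phi),
                 p[1] + R * math.sin(theta) * math.cos(phi),
                 p[2] + R * math.sin(phi))
            # Element of area on the sphere: R^2 cos(phi) dtheta dphi.
            total += h(x) * R * R * math.cos(phi) * d_theta * d_phi
    return total / (4 * math.pi * R * R)


def h_harmonic(x):
    """A polynomial with zero Laplacian (2 + 2 - 4 = 0)."""
    return x[0] ** 2 + x[1] ** 2 - 2 * x[2] ** 2 + 3 * x[0] * x[1] - x[2]


def h_not_harmonic(x):
    """||x||^2, whose Laplacian is 6, for contrast."""
    return x[0] ** 2 + x[1] ** 2 + x[2] ** 2


p = (0.3, -0.2, 0.5)
mean_harmonic = sphere_mean(h_harmonic, p, R=0.7)
mean_other = sphere_mean(h_not_harmonic, p, R=0.7)

print("harmonic:     mean =", mean_harmonic, " value at center =", h_harmonic(p))
print("not harmonic: mean =", mean_other, " value at center =", h_not_harmonic(p))
```

For the harmonic polynomial the two printed numbers should agree up to quadrature error; for $\|x\|^2$ the surface average exceeds the central value by $R^2$.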
Example 35. Let us take $D$ to be the upper half space $D = \{(x, y, z) : z > 0\}$. Then $\partial D = \{(x, y, 0) : (x, y) \in R^2\}$. Since the domain is infinite we have to restrict attention to functions for which the integrals make sense. If $H_l$ is a large hemisphere,
\[ H_l = \{(x, y, z) : x^2 + y^2 + z^2 \le l^2, \; z \ge 0\} \]
then (8.61) holds for functions $h$ harmonic on $D$:
\[ h(p) = -\frac{1}{4\pi} \int_{\partial B(0, l), \, z \ge 0} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS - \frac{1}{4\pi} \int_{\{z = 0\}} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \tag{8.62} \]
We shall call the function $h$ dissipative if the first integral tends to $0$ as $l \to \infty$, and the second integral converges. For example, if $\|x\|^2 h(x)$ and $\|x\|^2 \nabla h(x)$ are bounded functions on $D$, $h$ is dissipative (Problem 48). This is true for $\Pi_p$, $p$ not on the $xy$ plane. Now if $h$ is dissipative we can let $l \to \infty$ in (8.62) and obtain
\[ h(p) = -\frac{1}{4\pi} \int_{\{z = 0\}} \left[ h \frac{\partial \Pi_p}{\partial N} - \Pi_p \frac{\partial h}{\partial N} \right] dS \]
Now if $p = (x_0, y_0, z_0)$,
\[ \Pi_p(x) = \left[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2\right]^{-1/2} \]
and its boundary values ($z = 0$) are the same as those for $\Pi_q$, where $q = (x_0, y_0, -z_0)$. Since $\Pi_q$ is harmonic in $D$ and dissipative, there is a Green's function. Thus, the Green's function for $p = (x_0, y_0, z_0)$ is
\[ G_p(x) = \Pi_q(x) - \Pi_p(x) = \frac{1}{\left[(x - x_0)^2 + (y - y_0)^2 + (z + z_0)^2\right]^{1/2}} - \frac{1}{\left[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2\right]^{1/2}} \]
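The reflection construction of this Green's function can be sanity-checked numerically. The sketch below is an illustration only: it confirms that $G_p$ vanishes on the plane $z = 0$ and compares a central-difference estimate of the exterior normal derivative $-\partial G_p/\partial z$ with the closed form $2z_0 / [(x - x_0)^2 + (y - y_0)^2 + z_0^2]^{3/2}$ obtained by differentiating $G_p$ directly. The point `p`, the boundary point, and the step size are arbitrary choices, and the helper names are hypothetical.

```python
import math


def newton_potential(p):
    """Return the function Pi_p(x) = 1 / ||x - p||."""
    def potential(x):
        return 1.0 / math.dist(x, p)
    return potential


def green_half_space(p):
    """Return G_p = Pi_q - Pi_p, where q is the reflection of p in the plane z = 0."""
    q = (p[0], p[1], -p[2])
    pi_p, pi_q = newton_potential(p), newton_potential(q)
    return lambda x: pi_q(x) - pi_p(x)


p = (0.4, -1.0, 0.8)
G = green_half_space(p)

# G_p should vanish at any point of the boundary plane z = 0.
x, y = 2.0, 1.5
boundary_value = G((x, y, 0.0))

# The exterior normal of the upper half space points downward, so d/dN = -d/dz.
# G_p is smooth across z = 0, so a central difference in z is legitimate here.
step = 1e-5
normal_derivative = -(G((x, y, step)) - G((x, y, -step))) / (2 * step)
closed_form = 2 * p[2] / ((x - p[0]) ** 2 + (y - p[1]) ** 2 + p[2] ** 2) ** 1.5

print("G_p on the boundary:", boundary_value)
print("finite difference:", normal_derivative, " closed form:", closed_form)
```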
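Now the exterior normal to the plane is the downward vertical, so $\partial/\partial N = -\partial/\partial z$. A final computation gives
\[ \frac{\partial G_p}{\partial N}(x) = -\frac{\partial G_p}{\partial z}(x, y, 0) = \frac{2 z_0}{\left[(x - x_0)^2 + (y - y_0)^2 + z_0^2\right]^{3/2}} \]
Thus, if $h$ is harmonic and dissipative in the upper half space, we have for any $z_0 > 0$
\[ h(x_0, y_0, z_0) = \frac{z_0}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{h(x, y, 0)}{\left[(x - x_0)^2 + (y - y_0)^2 + z_0^2\right]^{3/2}} \, dx \, dy \tag{8.63} \]

Finally, we remark that (8.59) can be used to solve Neumann's problem in the same sense. If there is a harmonic function $k_p$ for each $p$ in $D$ such that
\[ \frac{\partial k_p}{\partial N} = \frac{\partial \Pi_p}{\partial N} \quad \text{on } \partial D \]
then for any function $h$ harmonic on $D$ we have, by (8.61),
\[ h(p) = \frac{1}{4\pi} \int_{\partial D} (\Pi_p - k_p) \frac{\partial h}{\partial N} \, dS \]
Thus $h$ is determined by its normal derivative on the boundary.

• PROBLEMS

39. Prove Corollary 2 of Theorem 8.5.

Green's Function for a Ball

40. Using a little bit of plane geometry it is possible to discover the Green's function for the unit ball. If $P$ is a point inside the ball, let $Q$ be the point inverse to $P$ in the sphere.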
Now let $X$ be a point on the sphere. Verify that the triangles (see Figure 8.17) $OPX$ and $OXQ$ are similar (since the angles $POX$ and $QOX$ are the same and $|QO| = 1/|PO|$, or
\[ \frac{|QO|}{|OX|} = \frac{|OX|}{|PO|} \Big). \]
Conclude that
\[ \frac{|QX|}{|PX|} = \frac{|OQ|}{|OX|} \]

41. From the above problem we deduce that, for $x$ on the unit sphere,
\[ \Pi_p(x) = \frac{1}{\|p\|} \Pi_q(x) \]
where $q$ is the point inverse to $p$ in the unit sphere. Since $\Pi_q(x)$ is harmonic in the unit ball $B$, the Green's function for $B$ is
\[ G_p(x) = \frac{\Pi_q(x)}{\|p\|} - \Pi_p(x) = \frac{1}{\|q - x\| \, \|p\|} - \frac{1}{\|p - x\|} \]

[Figure 8.17]
Calculate the precise form of Theorem 8.7 (known as Poisson's formula for the ball) if $h$ is harmonic on the unit ball:
\[ h(p) = \frac{1}{4\pi} \int_{\|x\| = 1} \frac{1 - \|p\|^2}{\|x - p\|^3} \, h(x) \, dS \]

42. Solve Dirichlet's problem for the ball.

43. Solve Neumann's problem for the ball.

44. Find the steady state temperature distribution in the ball if the surface temperature on the sphere is maintained at
(a) $\cos\phi$, where $\phi$ is the angle between the point and the north pole.
(b) $A(x + 2y)$, $A$ a constant.
(c) $x^2 + y^2 - 2z^2$
(d) $\cos 4\theta \sin 2\phi$, where $\theta$, $\phi$ are spherical coordinates.

45. Suppose $D$ is a domain for which there exists a Green's function $G_p$ for all $p \in D$. Show that if $p \ne p'$,
\[ G_p(p') = G_{p'}(p) \]
(Hint: Show, by Green's identity, that the integral
\[ \int_{\partial B} \left[ G_p \frac{\partial G_{p'}}{\partial N} - G_{p'} \frac{\partial G_p}{\partial N} \right] dS \]
is the same as
\[ \int_{\partial B'} \left[ G_{p'} \frac{\partial G_p}{\partial N} - G_p \frac{\partial G_{p'}}{\partial N} \right] dS \]
where $B$, $B'$ are balls of radius $\varepsilon$ centered at $p$, $p'$, respectively. Now, using the fact that
\[ G_p = -\frac{1}{r} + \text{harmonic} \]
compute the limits as $\varepsilon \to 0$.)

46. Suppose $D$, $D'$ are domains with Green's functions $G_D$, $G_{D'}$ and $D \supset D'$. Show that for $p \in D'$
\[ G_{D,p}(x) \le G_{D',p}(x) \quad \text{for all } x \text{ in } D' \]

47. Show that for $h$ harmonic in the ball of radius $R$ centered at $p$,
\[ \int_{\|x - p\| = R} \frac{\partial h}{\partial N} \, dS = 0 \]
48. Show that the function $h$ defined on the upper half space $D = \{z \ge 0\}$ is dissipative if $\|x\|^2 h(x)$ and $\|x\|^2 \nabla h(x)$ are bounded.

49. Show that if $h$ is harmonic and dissipative in the upper half space and zero on the $z = 0$ plane, then $h$ is identically zero.

50. Suppose that $h(x, y)$ is dissipative on the plane. Prove that there exists a unique dissipative function $u$ continuous on the upper half space $\{z \ge 0\}$ and harmonic for $\{z > 0\}$ which attains the boundary values $h$. $u$ is given by
\[ u(x_0, y_0, z_0) = \frac{z_0}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \frac{h(x, y)}{\left[(x - x_0)^2 + (y - y_0)^2 + z_0^2\right]^{3/2}} \, dx \, dy \]

51. Find the steady state dissipative temperature distribution on the upper half space if the temperature on the plane $z = 0$ is maintained at $\exp(x^2 + y^2)^{-1}$.

8.7 Summary

A fluid flow is given by a $C^1$ $R^3$-valued function $\phi(x_0, t)$ defined for $x_0$ in some domain $D$ in $R^3$ and $t$ on an interval in $R$ about the origin. $\phi$ has these properties:
(i) $\phi(x_0, 0) = x_0$ for all $x_0 \in D$.
(ii) For fixed $t$, $x_0 \to \phi(x_0, t)$ is one-to-one and has a nonsingular differential.

The vector field
\[ v(x, t) = \frac{\partial \phi}{\partial t}(x_0, t) \Big|_{x_0 = \phi^{-1}(x, t)} \]
is the velocity field of the flow. The flow is steady if $v$ is independent of $t$.

If $v = (v^1, v^2, v^3)$ is a differentiable vector field, its divergence is the function
\[ \operatorname{div} v = \frac{\partial v^1}{\partial x^1} + \frac{\partial v^2}{\partial x^2} + \frac{\partial v^3}{\partial x^3} \]
EQUATION OF CONTINUITY. If $v(x, t)$ is the velocity field of a flow and $\rho(x, t)$ is its density, the law of conservation of mass implies
\[ \frac{\partial \rho}{\partial t} + \operatorname{div}(\rho v) = \frac{\partial \rho}{\partial t} + \sum_i v^i \frac{\partial \rho}{\partial x^i} + \rho \operatorname{div} v = 0 \]
A flow is incompressible if the same mass always occupies the same volume. The necessary and sufficient condition for this is $\operatorname{div} v = 0$, where $v$ is the velocity field of the flow. The fluid is incompressible if and only if the density at a particle is constant under all flows of the fluid.

INTEGRATION UNDER A COORDINATE CHANGE. Let $(u, v, w) = F(x, y, z)$ be a change of coordinates taking a domain $D$ onto the domain $\Delta$. If $g$ is continuous on $D$,
\[ \int_D g(x, y, z) \, dx \, dy \, dz = \int_\Delta g(F^{-1}(u, v, w)) \left| \det \frac{\partial(x, y, z)}{\partial(u, v, w)} \right| du \, dv \, dw \]

If $v$ is the velocity field of a flow, the circulation around a curve $C$ is defined as
\[ \operatorname{circ}(C) = \int_C \langle v, T\rangle \, ds \]
If we fix the point $x_0$ and vector $n$ at $x_0$, let $C_r$ be the circle in the plane perpendicular to $n$ of radius $r$ centered at $x_0$. The curl of the flow about $n$ at $x_0$ is
\[ \operatorname{curl} v(x_0, n) = \lim_{r \to 0} \frac{\operatorname{circ}(C_r)}{\pi r^2} \]
If $v = (v^1, v^2, v^3)$ define
\[ \operatorname{curl} v = \left( \frac{\partial v^3}{\partial x^2} - \frac{\partial v^2}{\partial x^3}, \; \frac{\partial v^1}{\partial x^3} - \frac{\partial v^3}{\partial x^1}, \; \frac{\partial v^2}{\partial x^1} - \frac{\partial v^1}{\partial x^2} \right) \]
Then $\operatorname{curl} v(x_0, n) = \langle \operatorname{curl} v(x_0), n\rangle$. A flow is irrotational if $\operatorname{curl} v = 0$.

A surface patch in $R^3$ is the image of a domain $D$ in $R^2$ under a $C^1$ map $x = x(u, v)$ with these properties:
(i) $x$ is one-to-one
(ii) the vectors $x_u = \partial x/\partial u$, $x_v = \partial x/\partial v$ are independent.

$(u, v)$ are called parameters or coordinates for the surface patch. A surface is a set $\Sigma$ in $R^3$ which can be covered by surface patches. The tangent plane to $\Sigma$ is the plane spanned by the vectors $x_u$, $x_v$ (this is independent of the particular coordinates). The normal $N$ to a surface is a unit vector defined for each point and orthogonal to the tangent plane there.

The form
\[ ds^2 = E \, du^2 + 2F \, du \, dv + G \, dv^2 \]
defined on a surface $\Sigma$ with coordinates $(u, v)$ by
\[ E = \langle x_u, x_u\rangle, \qquad F = \langle x_u, x_v\rangle, \qquad G = \langle x_v, x_v\rangle \]
is the first fundamental form of the surface. The parametric curves are orthogonal if $F = 0$. The length of a curve on $\Sigma$ given by $u = u(t)$, $v = v(t)$ is
\[ \int ds = \int \left[ E \left( \frac{du}{dt} \right)^2 + 2F \frac{du}{dt} \frac{dv}{dt} + G \left( \frac{dv}{dt} \right)^2 \right]^{1/2} dt \]
A geodesic is a curve of minimal length. If $\gamma$ is a geodesic on $\Sigma$, then at any point $p$ on $\gamma$ the normal to $\gamma$ is orthogonal to the tangent plane of $\Sigma$.

The area of a domain $D$ on a surface $\Sigma$ is defined by
\[ \int_D dS = \int \|x_u \times x_v\| \, du \, dv \]
The integral of a continuous function $f$ defined on $D$ is
\[ \int_D f \, dS = \int f \, \|x_u \times x_v\| \, du \, dv \]
These definitions are independent of the parameters chosen. If $\Sigma$ is an oriented surface and $v$ is a vector field defined around $\Sigma$, the flux of $v$ across $\Sigma$ is
\[ \int_\Sigma \langle v, N\rangle \, dS \]
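The area formula $\int \|x_u \times x_v\| \, du \, dv$ can be evaluated numerically for a concrete parametrization. The sketch below is an illustration under stated assumptions: it uses the torus $x(u, v) = ((2 + \cos u)\cos v, (2 + \cos u)\sin v, \sin u)$ on $0 \le u, v \le 2\pi$, approximates $x_u$ and $x_v$ by central differences, and integrates with a midpoint rule. The helper names, grid size, and step are hypothetical choices; the classical value $8\pi^2$ for this torus is quoted for comparison.

```python
import math


def cross(a, b):
    """Cross product of two vectors in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(sum(c * c for c in a))


def surface_integral(x, f, u_range, v_range, n=100, step=1e-6):
    """Midpoint-rule estimate of the integral of f ||x_u x x_v|| du dv over a patch."""
    (u0, u1), (v0, v1) = u_range, v_range
    du, dv = (u1 - u0) / n, (v1 - v0) / n
    total = 0.0
    for i in range(n):
        u = u0 + (i + 0.5) * du
        for j in range(n):
            v = v0 + (j + 0.5) * dv
            # Central-difference approximations of the tangent vectors x_u and x_v.
            x_u = tuple((a - b) / (2 * step)
                        for a, b in zip(x(u + step, v), x(u - step, v)))
            x_v = tuple((a - b) / (2 * step)
                        for a, b in zip(x(u, v + step), x(u, v - step)))
            total += f(x(u, v)) * norm(cross(x_u, x_v)) * du * dv
    return total


def torus(u, v):
    """Torus with center-circle radius 2 and tube radius 1."""
    return ((2 + math.cos(u)) * math.cos(v),
            (2 + math.cos(u)) * math.sin(v),
            math.sin(u))


area = surface_integral(torus, lambda point: 1.0,
                        (0.0, 2 * math.pi), (0.0, 2 * math.pi))
print("numerical area:", area, " 8 pi^2:", 8 * math.pi ** 2)
```

Passing a different `f` gives the integral of a function over the patch, and taking `f` to be $\langle v, N\rangle$ would give a flux.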
STOKES' THEOREM. If $v$ is a $C^1$ vector field defined in a domain $U$, and $\Sigma$ is an oriented surface in $U$ and $D$ is a regular domain on $\Sigma$, then
\[ \int_{\partial D} \langle v, T\rangle \, ds = \int_D \langle \operatorname{curl} v, N\rangle \, dS \]

DIVERGENCE THEOREM. If $v$ is a $C^1$ vector field defined in a neighborhood of a regular domain $D$ in $R^3$ then, with the exterior normal orientation on $\partial D$,
\[ \int_{\partial D} \langle v, N\rangle \, dS = \int_D \operatorname{div} v \, dV \]

GREEN'S IDENTITIES. Let $f$, $g$ be two $C^2$ functions defined on a regular domain $D$. Then
\[ \int_{\partial D} f \frac{\partial g}{\partial N} \, dS = \int_D \left[ f \Delta g + \langle \nabla f, \nabla g\rangle \right] dV \]

DIRICHLET'S PRINCIPLE. Let $D$ be a regular domain in $R^3$ and suppose $f$ is a continuous function on $\partial D$. Let $M_f$ be the class of $C^2$ functions on $D$ with boundary values given by $f$.
(i) If there is a harmonic function in $M_f$, it minimizes the energy integral
\[ E^2\langle u\rangle = \int_D \|\nabla u\|^2 \, dV \]
(ii) If there is a $C^2$ function which minimizes the energy integral, it must be harmonic.

• FURTHER READING

In order to continue the study of the divergence theorem and further related topics one must turn to the notations and ideas of differential forms. The small book

M. Spivak, Calculus on Manifolds, W. A. Benjamin, Inc., New York, 1965,

gives a clear and direct account of this subject. The book

H. K. Nickerson, D. C. Spencer, and N. Steenrod, Advanced Calculus, D. Van Nostrand Company, New York, 1957,

was the first to give a complete account of this subject on an advanced calculus level. For a more recent account, with a chapter on potential theory in $R^n$, see

L. Loomis and S. Sternberg, Advanced Calculus, Addison-Wesley, Reading, Mass., 1968.
Other references are

M. E. Munroe, Modern Multidimensional Calculus, Addison-Wesley, Reading, Mass., 1963.
E. Butkov, Mathematical Physics, Addison-Wesley, Reading, Mass., 1968.

For further study of differential geometry we recommend

D. J. Struik, Lectures on Classical Differential Geometry, Addison-Wesley, Reading, Mass., 1950.
H. Guggenheimer, Differential Geometry, McGraw-Hill, New York, N.Y., 1963.

MISCELLANEOUS PROBLEMS

52. Suppose that $F$ is a $C^1$ function defined in a neighborhood of $p_0$ in $R^3$ such that $F(p_0) = 0$ and $dF(p_0) \ne 0$. Show that the set
\[ \Sigma = \{p : F(p) = 0\} \]
is a surface patch in some neighborhood of $p_0$. (Hint: Choose coordinates $x$, $y$, $z$ so that the forms $dF(p_0)$, $dx(p_0)$, $dy(p_0)$ are independent. Then the transformation $\mathbf{F}(p) = (x(p), y(p), F(p))$ is invertible. If $\mathbf{G}$ is the inverse to $\mathbf{F}$, the function $\phi(u, v) = \mathbf{G}(u, v, 0)$ parametrizes $\Sigma$.)

53. A family of surfaces in a domain $D$ in $R^3$ is given implicitly by the equation
\[ F(p) = c \tag{8.64} \]
where $F$ is $C^1$ in $D$ and $dF(p) \ne 0$. For each $c$, the set (8.64) determines a surface. Show that the vector field $\nabla F$ is the velocity field of a flow whose path lines intersect each surface orthogonally.

54. Find the family of curves which are orthogonal to these families of surfaces:
(a) $x^2 + 2y^2 + z^2 = c$
(b) $z^2 x^2 = c^2$
(c) $x^2 + y^2 = c(z + c)$
(d) $z = c \cos y$

55. Given a family $\mathcal{F}$ of curves in space, there may not exist a family of surfaces orthogonal to $\mathcal{F}$. If, say, $v$ is a vector field tangent to the family $\mathcal{F}$ and $\{F(p) = c\}$ is the family of orthogonal surfaces, show that $\nabla F$ must be collinear with $v$. The condition that $v$ must be collinear with a gradient must be satisfied in order for the path lines associated to $v$ to have an orthogonal family of surfaces. Show that this condition may be written $\langle \operatorname{curl} v, v\rangle = 0$.

56. Show that the family of path lines of the helical flow
\[ (x, y, z) = (x_0 \cos t + y_0 \sin t, \; -x_0 \sin t + y_0 \cos t, \; z_0 + t) \]
does not admit an orthogonal family of surfaces.
57. Show that if the vector field $v$ is conservative, the family of surfaces $\{\Pi(p) = c\}$, where $\Pi$ is a potential for $v$, is orthogonal to the path lines.

58. Show that, although the vector field $v(x, y, z) = (yx, y, 0)$ is not conservative, the path lines of its associated flow do admit a family of orthogonal surfaces.

59. Suppose that $D$ is a star-shaped domain in $R^3$ centered at the origin. That is, if $p \in D$, then so is the line segment joining $0$ to $p$ in $D$. Suppose that $v$ is a $C^1$ vector field defined on $D$ such that $\operatorname{div} v = 0$. Define the vector field $u$ by
\[ u(p) = \int_0^1 \left[ v(tp) \times tp \right] dt \]
Show that $\operatorname{curl} u = v$. (Hint: Recall Poincaré's lemma (see Theorem 7.5); this is just a generalization. Differentiate under the integral sign, use the condition $\operatorname{div} v = 0$ and then integrate by parts.)

60. Suppose that $u$ is a $C^1$ vector field defined in a neighborhood of a sphere $S$. Show that
\[ \int_S \langle \operatorname{curl} u, N\rangle \, dS = 0 \]
(Use Stokes' theorem one hemisphere at a time.)

61. Every curl-free vector field defined on $R^3 - \{0\}$ is a gradient; however, there is a divergence-free vector field defined there which is not a curl. For example, take
\[ v_0(p) = \frac{p}{\|p\|^3} \]
Then $\operatorname{div} v_0 = 0$, but if $S$ is a sphere centered at the origin
\[ \int_S \langle v_0, N\rangle \, dS = 4\pi \]
so by Problem 60, $v_0$ is not a curl. It can be shown that if $v$ is any divergence-free field defined in $R^3 - \{0\}$, there is a vector field $u$ and a constant $c$ such that
\[ v = \operatorname{curl} u + c v_0 \]
Can you suggest how to define $c$ and $u$?
Normal Curvature

62. Let $\Sigma$ be a surface patch in $R^3$ coordinatized by $x = x(u, v)$. Let $\mathbf{N}$ be the normal to $\Sigma$ so chosen that $x_u, x_v, \mathbf{N}$ is right handed. $\mathbf{N}$ can be viewed as a differentiable function of $u$, $v$. For $p$ on $\Sigma$, $d\mathbf{N}(p)$ is thus an $R^3$-valued linear map of $R^2$. By defining
\[ d\mathbf{N}(p)(x_u) = \frac{\partial \mathbf{N}}{\partial u}(p), \qquad d\mathbf{N}(p)(x_v) = \frac{\partial \mathbf{N}}{\partial v}(p) \]
we may consider $d\mathbf{N}$ as a mapping of the tangent space $T(\Sigma)_p$ into $R^3$.
(a) Show that the range of $d\mathbf{N}(p)$ is orthogonal to $\mathbf{N}(p)$. (Hint: $\mathbf{N}$ is a unit vector.)
(b) Because of (a), $d\mathbf{N}(p)$ can be considered as a linear transformation of $T(\Sigma)_p$ to $T(\Sigma)_p$. Show that $d\mathbf{N}(p)$ is symmetric:
\[ \langle d\mathbf{N}(p) v, w\rangle = \langle v, d\mathbf{N}(p) w\rangle \]
(Hint: You need only show that
\[ \langle d\mathbf{N}(p)(x_u), x_v\rangle = \langle x_u, d\mathbf{N}(p)(x_v)\rangle \big) \]
(c) Show that $d\mathbf{N}(p)(v)$ is $\kappa_N(v)$, where $\kappa_N(v)$ is the normal curvature (see Problem 25) of the curve of intersection of the plane through $\mathbf{N}$ and $v$ with $\Sigma$.

Since $d\mathbf{N}(p)$ is symmetric on $T(\Sigma)_p$, it has two real eigenvalues and the corresponding eigenspaces are orthogonal. The eigenvalues are called the principal curvatures of $\Sigma$ at $p$, and the eigendirections are the principal directions.

63. The second fundamental form on a surface is the form
\[ \mathrm{II}(v) = \langle d\mathbf{N}(p) v, v\rangle \quad \text{for } v \in T(\Sigma)_p \]
Show that $\mathrm{II}$ can be expressed as
\[ \mathrm{II} = L \, du^2 + 2M \, du \, dv + N \, dv^2 \tag{8.65} \]
where
\[ L = \left\langle \frac{\partial \mathbf{N}}{\partial u}, \frac{\partial x}{\partial u} \right\rangle \]
\[ M = \left\langle \frac{\partial \mathbf{N}}{\partial u}, \frac{\partial x}{\partial v} \right\rangle = \left\langle \frac{\partial \mathbf{N}}{\partial v}, \frac{\partial x}{\partial u} \right\rangle, \qquad N = \left\langle \frac{\partial \mathbf{N}}{\partial v}, \frac{\partial x}{\partial v} \right\rangle \]
(here $\mathbf{N}$ is the normal to the surface and $N$ is the third coefficient of the form).

64. Compute the second fundamental form and find the principal directions on these surfaces:
(a) $x^2 + 2y^2 + z^2 = 1$
(b) $2y = x^2$
(c) $x^2 - y = z^2$
(d) $x^2 - y^2 + z^2 = 0$

65. (Rodrigues' Formula) Show that a curve $\Gamma$ on a surface $\Sigma$ is tangent to a principal direction at every point if and only if
\[ d\mathbf{N} + \kappa_N \, dx = 0 \]
along $\Gamma$. (Such curves are called lines of curvature.)

66. Find the lines of curvature on the surface $\Sigma$:
(a) $\Sigma$ is the cylinder given by $x(u, v) = (\cos u, \sin u, v)$.
(b) $\Sigma$ is the torus $x(u, v) = ((2 + \cos u) \cos v, \; (2 + \cos u) \sin v, \; \sin u)$.
(c) $\Sigma$ is the sphere $x^2 + y^2 + z^2 = 1$.

67. A point $p$ on a surface $\Sigma$ is called an elliptic point if the principal curvatures have the same sign, a hyperbolic point if the principal curvatures have different signs, and a parabolic point if one principal curvature is zero. Find examples of all three kinds of points on a torus. Show that $p$ is elliptic, hyperbolic, parabolic as $LN - M^2 > 0$, $< 0$, $= 0$.

68. Show that at a hyperbolic point a surface intersects its tangent plane in two curves with zero normal curvature.
ANSWERS TO SELECTED EXERCISES Chapter 1 SECTION 1.1 1. (a) (1,-7) (f) (4-2y-3z,y,z) (b) (4,3/4) (g) (4,3) (c) (8,6,1) (h) (5,2) (d) (0,1,2,1) (i) (5-y,y,-l-w,w) (e) (11,13,-2) (j) no solutions 2. (a) (-ζ,Ο,ζ) (b) (0,-z,z) (c) (-z, —w, z, w) (c) /1 2 0 Γ 0 0 10 0 0 0 1 ,0 0 0 0/ (b) 0 1 0 0 0 0 1 0 694
Chapter 1 (a) (32/78, -5/78, 35/78) (b) (-3-5x5,2 + (10/3)x5,-(7/2) (c) (-3-5x5,8/3 + (10/3)x5. -4- (d) no solutions (e) no solutions -4x5, 1/2, x5) Ax5, 1, x5) SECTION 1.2 11. (a) (120/52,4/52) (b) (4/5,-8/5) (c) (-51/22,29/22) (d) (-39/5,-43/5) 12. (a) Ъх+1у = -\ (b) x-y = 8 (c) y-2x = U SECTION 1.3 13. (a) /l (b) 14. (a) (c) (24) (d) 2 1 ч -1 / 24 24 12 8 12 12 6 4 -6 -6 -3 -2 48 48 24 16 \ 42 42 21 14 4\ 2 V (b) (d) doesn't exist 15. (a) 0 3 -20/78 0 -1 7/78 (c) / 0 -1/2 0 1/6 1/12 0 -1/6 5/12 1/8 -1/2 1/4 0
696 Answers to Selected Exercises (b) / 1 0 0 0 \ J -1 1/3 0 0 \ I 1 0-1 0 I \ 0 0 0 1/2/ 16. (a) no conditions (b) ^=0 (c) 2b2 + b3 + b'=0 ί»1 + 362 — 26* = 0 4b2 - b" + b5 =0 (d) -Abl - 25b2 + 1463 + 10b* = 0 17. Since the index d of A is at most n, if Ρ row reduces A we obtain Ρ Ax =Pb. Since (at least) the last m — d rows of PA are zero, b must satisfy the (non- vacuous) conditions that the last m — d entries of Pb are zero. 19. \ϊχ = ΣχίΕι,Τ(χ)=ΣχίΤ{Ε)=0. 20. If χ =2 x[Ei, T(x) = 2ϊ=ί x'El+1 + xnEu so Τ is uniquely determined by the conditions. section 1.4 21. (a) 4 (b) 3 (c) 3 (d) 3 22. (a) 3 (b) 3 (c) 3 23. (a) {xe R*; xl + x2 = x3 + x*} (b) {xeR*; -Зх1 + x3 =0,2x1 - x2 - x« =0} (c) {xe R3;xi + 2x2-x3=0} (d) {xeR5;x3 = x5,x2=0, x4=0} 24. (a) No (b) Yes (c) No 25. (a) c(—4υι — v2 — 6υ3 + 5υ„) = 0 (b) α(υ2 - υΛ) + b(2.Vi + υ, + υ„) = 0 (c) αοι + b(2v2 — 2ьъ + νΛ) = 0 26. The given vectors form a basis for R5. 27. (a) (0,-1/2,1,0,0) (-1,1/2,0,0,1) (b) (1,2,1,0) (0,-1,0,1) SECTION 1.5 29. (а) К = [бх1 = 17x4, χ2 = -2x\ хъ = x4/3} R = R3 (b) K = {x1=0, x2=0} r = {6x2 = \2x« - \\x\ 2x3 = 4x* - 39л:1}
Chapter 1 697 (c) К = {х1=0,3х* + 2х*=0,3х3 + 4хл=0} R = {xi-x2 + x3~x*=0} (d) K = {Sx1 + x* + 6x5=0, 8x2 + 5x* + 7x5=0, x3 + 2x4 + 2x5=0} R = R3 30. (a) K: (17/6, -2, 1/3, 1), R: Eu Ε,, E3 (b) K: (0, 0, 1, 0), (0, 0, 0, 1), R: (1, -11/6, -39/2, 0), (0, 2, 4, 1) (c) K: (0, -2/3, -4/3, 1), Я: (1, 1, 0, 0), (-1, 0, 1, 0), (1, 0, 0, 1) (d) K:(-l, -5, -16,8,0), (-3, -7, -8, 0, 4), R: ЕиЕг,Еъ 31. If/is nonzero, its range is all of R, so its rank is 1. Thus its nullity is и-1. 32. {Ei-E,: i = 2,...,n] SECTION 1.6 (d) 1/9 1/6 1/9 2/9 1/9 0 2/9 -2/9 -1/3 1/6 0 6/9 1/9 -1/12 -1/9 5/18 (b) (c) /0 1 1 1 0 -1 0 о о 34. (a) (0,2,1) (c) (9/8,1/2,-7/8) (b) (-17,5,1) (d) (-1/2,1/2,1) 35. By induction we can show that A" has the property that its (i, j) entries are zero for all ί <j + k — 1. Once k > n, these are all the entries. 36. (7+Λ)-^ |(-1)U n = l Σ( ( = 0 SECTION 1.7 37. Eigenvalue Eigenvectors (a) 2 (0, 1, -2) 3 (1, -2, -4) -1 (1,1,-4)
698 Answers to Selected Exercises (b) 1 (1,0,1.-1) -1 (2,0,2,-3) 0 (0,0,1,-1) 2 (0, 2, 0, 0) (c) 1 (1,0,0,0), (0,0, 1,-1) 4 (0,1,0,0) no basis of eigenvectors (d) 2 (1,0, 1), (0, 1, 1) -2 (1, 1, -2) 39. If G represents the standard basis, we use Exercise 38 to find Aar, AGE and use the fact that -AGF(AG*)- 0 0 0 72 0 0 -1 -1/2 1/2 0 0 1/2 0 1/2 1 -1/2 5 -1\ (c) 1 2 4 1/ (b) / 1/2 -1/2 -1/2 \ 3/16 -5/16 —7/16 J \-l/2 1/2 -1/2 / 40. (a) /o -2 2\ (b) /-3 -5 0\ 2 3-1 2 3 0 \2 2 0/ \ 0 4 1/ 41. If TE =cl, when £ is a basis of eigenvectors, then for any basis F, TF = {Ae^TeA/ = c(Aef)-\Aef) = cl SECTION 1.8 42. (a) (5+30/34 (d) cis(-2/3)/4 (b) l+i (e) cis(-7) (c) (3-0/Ю 43. zz = \ if and only if z~1=z 45. (а) ±(-1 + 0Л/2 (b) cis(A:7r/5) к = \,Ъ, 5, 7, 9 (c) ±1, ±/(21/8)cis(7r/16) (d) i,(±l + IV3)/2
Chapter 1 (e) (±5)1/2cis[Jarctan(-i)] (f) 2501'6 cis[£tt/3 + (l/3)arctan(l/3)] к = 0, 2, 4 46. (a) (1,0,(1,-0 (d) (1,-4, 5,1), (1,3,0, 5), (( -3+V21)/4, -2 ((-3 -л/21)/4, -2 -V21/2, 0, 1) SECTION 1.9 47. 48. 49. 50. Table of <υ,, υ,>: υ2 «з υι 13 24 v2 41 υ3 υ4 υ5 V6 Table of υ ι χ Vj υ2 οι (6,-5,-5) «2 ν3 V5 νι (-6,2,5) υ2 (-12,4,10) ν3 (-15,5,21) υ4 (-15,5,20) ν5 ν6 (37, 16, -28)/17 (a) 9χι + 2χ2 - (b) 12*1 - 4χ2 - (с) Зх1 - χ2 = С (d) Зх1 + 9х2 = ν» v5 20 5 40 0 67 7 0 υ3 . (5, 4, -7) (-5,13,7) «6 (9,2,-10) (-1,-3,0) (0,-7,0) (-2,-6,0) (3,-1,0) 10х3=0 - 10х3 = 0 I 0 v6 νη 2 17 4 34 5 0 5 -15 0 0 21 ν* (9,2,-10) (3, 9, 0) (10,-5,-14) Vl (11,-72,25) (-41, -123,0) (-25, -222, 35) (-67,-201,0) (63, -21, 50) (-5,-15,0) 51. Vi.: χ = z, 2y = z vi: χ + Ъу = 0, ζ + Ay = 0 v3: у =0,5x = 7z υΛ: χ + Ъу = 0, 2z + 5y = 0 v5: ζ =0,3x=y v6:x=0,y = 0 υ-,: χ + Ъу = 0, Ix + 5z = 0
700 Answers to Selected Exercises 52. Ix1 - x2 - 2x3 = 17 53. Ax1 + x2 - 3x3 = 2 54. (a) <x,E2-E1>=0 = <,x,E3-E1> (b) x1 + x1 + 3x3=0,2x1 + 2x2 + 5x3=0 (c) x2=0, x3=0 55. The planes are given by the equations (a) x + y + z=l (b) x=y (c) x = 0 The intersection of (a) and (b) is given by equations (a) and (b), etc. 56. (a) x + у = 2z (b) у = ζ (с) ζ = 0 58. The area of a parallelogram of side lengths a, b is ab sin Θ, where θ is (either) included angle. 59. False if и is perpendicular to ν and w, but υ is not perpendicular to w. 60. Apply the equation (a, b χ c> = det I ft I to each pair, and observe that there are always two rows the same. section 1.11 61. (a) open (b) neither (c) closed (d) closed (e) closed (f) open (g) open (h) closed (i) open (j) open (k) open 62. (29,-3,26)/14 63. (Ill,-22, lll)/34 64. (0, 1, 1)/21/2, (1, -1, 0/31'2 (b) (0, 1, 0, D/21'2, (1, 0, 1, 0)/2^, (-1, -2, 1, 2)110»' (c) (0, 1, 0, 0, 0), (0, 0, 0, 1, 0), (0, 0, 2, 0, lys1" (d) (1, 2, 3, 4)/301'2, (2, 1,0,-1)/6·", (1, -3, 3, -1)/»1" 65. (a) ΛΓ: (10, -16, 16, 10/4771'2 R: (0, 0, 1, 0), (1, 0, 0, D/21'2, (1, 2, 0, -l)^1'2 (b) K: (1, 0,-1, 0)1Г'\ (0, 1, 0, - D/21'2 R: (-1, -2, 1, 0W\ (3, 12, -3, 2Э/1561'2
Chapter 2 701 Chapter 2 SECTION 2.1 1. (a) does not exist (b) 0 (c) 0 (d) no limit (e) 1 (f) 1 2. 4. 5. (g) 1 No, take Yes 0 SECTION 2.2 8. -1/2 (h) x„ = 3 = η 9. Form the new sum in this way: at any stage, if the sum is 1, add the first negative term not yet used, and if the sum is less than 1, add positive terms until the sum is 1. The resulting series is lllllllllj_j_ 1 1 1 2+4 + 4-2 + 8 + 8+8 + 8_4+Ϊ6 + Ϊ6 + Ϊ6 + Ϊ6~4 + "' The terms come in blocks. The first bracket encloses the first block, the second bracket begins the second block. The nth block consists of 2"_1 copies of 11111 yi ' 2"+2 2"+2 2"+2 2"+2 10. Yes. Since the sum of the positive terms is +co, and the sum of the negative terms is — oo, we can rearrange so that at any stage, if the sum is less than 10,000 we add positive terms until 10,000 is passed, and if the sum is not less than 10,000 add negative terms until 10,000 is passed. section 2.3 11. (b), (d), (f), (g), (h), (1), (m) converge (a), (c), (e), (i), (j), (k), (n) diverge 14. (a) |z|<l (b) |z|<l (c) all ζ (d) all ζ (e) all ζ (f) z=0 (g) |z|£l (h) |z|<l (i) |z|< 1 (j) |1 + *|<1
702 Answers to Selected Exercises SECTION 2.7 15. (a) π (b) 2/3 (c) 2/3 (d) 4/3 17. (a) 0 (b) 1/16 (c) β-(ΐ/2β)-3/2 (d) 1/4 (e) 5/6 (f) 1 /3) JS'4 [(1 + sec2 0)3'2 - 1 ] dd 18. (a) 1/6 (b) 1/60 (c) 1/10 (d) π/10 19. (а) Зтг/16 (b) 1/48 (c) 1 (d) 1/24 SECTION 2.8 20. df/дх df\dy df/dz (a) yz xz xy (b) у cos(xy) χ cos(xy) (c) y'x1"'-1' /z/-'Inx xyZy'\nx\ny (d) 2xy + y2 x2 + 2yx 21. ххХ[хх-1 + х]пх + хх(Ых)2] 23. Since VA = (й/г/йх1, .. , dh/дх") for any function h, we need only show that a a# a/ The proof is just as for functions of one variable. 24. By Exercise 23 o = v(/.i)=iv/+/v(i) 25. 5/9 26. 101/2/(l + 101'2) SECTION 2.9 29. (a) 0 (b) 0 (c) 1 (d) 0 (e) 1 30. (b), (c), (f) converge; (a), (d), (e) diverge section 2.11 31. (a) x„ = (i)(x,,-i + a/xn-i) (b) x„=2x„_i/3 + a/3x„2
Chapter 3 703 32. (a) **, l-^i±^zi±l 3 3 3xS_1 + 2x„_1+ 1 *2-i-l (b) *-2£7=7·*=? , _ xl-1 — 2x2_ ι — 3x„_! + 2 (c) x„ — x„_! —— — ixl-1 — 4x„_i — 3 <л\ 4 , л4*-1 +5 (d) )ί,=-)ί,-ι + 4- 5 5χί_! — 1 33. (a) all points except on the line χ = 0 (b) ι) all except (1,-1) (ii) all points (iii) no points d dF 8F 34. 0 = - F(x, <?(*)) = — (*, <?(*)) + — (x, <7(х))<?'(х) 35. (a) — (ду + tan ху)\хг (b) -sin(x + y)\{\ + sin(x + y)) (C) -J</* (d) -ye°>\{xe"> - 1) Chapter 3 SECTION 3.1 1. (a) ce" (b) (—sin t, cos f, 1) (c) (—asm t, boost) (d) (2f,3i2) (e) (1,2ί,3ί2) (f) (cos t, -sin t, 0) 2. (a) |c|exp[(Rec)f], argc (b) 21'2, тг/2 (c) (az sin2 f + 62 cos2 01/2 (d) |i|(4 + 9f2)1/2, arccos(4f + 18f2)/2 |i|[(4+ 9i2)(l + 9i2)]1/2 (e) (l + 4f2 + 9f4)1/2 (f) 1 тг/2
704 Answers to Selected Exercises The tangent to (a) at the point e" is parallel to the tangent to (b) at the point (a cos t, b sin t) precisely when 1 lb \ ■■ arctan I - tan t) Im с \a J 4. 5. 6. 7. 8. 10. 11. Never (flfr)1' 2 (-1, 12 1),(-1,1) mm(l/a, \/b, 1/c) (a) (b) (c) (d) (a) (b) Eigenvalues 11.516 -4.516 14 10 3 + 4(2)1/2 3 - 4(2)1/2 1 + 2C10)1'2 1 - 2(10)1/2 2 + 21'2 1 2 - 21'2 7.411 0.313 -1.724 Eigenvectors (0.685, 0.729) (0.729, -0.685) (1,1) (-1,1) (0.383, 0.924) (0.924, -0.383) (0.585,0.811) (0.811, -0.585) (-0.383, 0, 0.924) (0, 1, 0) (0.924, 0, 0.383) (-0.501, -0.382, -0.777) (0.838, -0.438, -0.325) (-0.216, -0.814, 0.540) SECTION 3.2 12. x-x3/3 + x5/5, 1-х + х2-хъ + х« 13. 0.1987 14. 1.7320 16. [-0.0312,0.0322], |x|< 0.1 17. |*|^0.125 SECTION 3.4 18. (a) ;y=(-l/2)exp(-x2) + 3/2 (b) у = — χ cos χ + sin χ (c) x=f2/2,;y=f3/3+l,z = f4/4 (d) ζ = -ie" + (1 + i)2'3/3 + 1 + ι
Chapter 3 705 19. (a) y={c-x)~1 (b) tan у + sec у = K(ta.n χ + sec x) (c) x3+/ = C (d) ;v=sin(x-(l/3)x3+ C) (e) y=Kfexp(t2/2)dt+C (f) у = С exp(x + x3/3) (g) >' = [1п(1-х)-х-сГ1 (h) > = — ln(c — <?*) (ι) tan j< + sec у = К ехр(— 2 cos χ) 20. (a) >- = exp(-x2/2) Joexp(i2/2)cosii/i (b) у = (sec χ — cos x)/2 (c) ^ = x-exp(-x2/2)/Sexp(i2/2)i/i (d) у = exp(/(x)) Ji exp(-/(i) - lit) Λ + exp(l/l -1) where/(x) = (exp(l - ΐ)χ)\{\ - ί) (e) у = ln(x + e - 1) 21. (а) у = -ex/2 + e~xl6 + e2x/3 (b) у = e'(coshV2t + smbVlt/Vl) 22. (a) (b) a>'0 (с) α ^ 0 /*\ = ci exp(4 + 2ί)ίί j) + c2 exp(4 - 2ι)/ Γ') (у1) =*«ρα +^)'(_ν^) +C2 exp(1 -V^'(v^) a=0 Ы / ** X \Λ/ V'C-Cji + Ci)/ афО Г1) = cie'( -1 I + c2exp(l + V2a)iΙα/λ/2α I W \ 1/ W + c3 exp(l - Vla)t I -a/V2a I \-а/л/2я/ Uj =[(cJ + c1)/e, + Cie'] |0 + c2e'jl +сэ<?'(0 23. (a) ci = exp[(l - 0/2] _ _ (b) d = 1/2, c2 = -(1 -Л/2Ф0/4, сз =(-1 -л/2я/а)/4 (c) >ί = [(1 + Oexpfl - 0' + (1 - ')ехр(1 + 00/2 Уг = [(1 + 0ехр(1 - 0' - (1 - ОехрО + О' W
706 Answers to Selected Exercises l5+Vu\ (d) /y^ W 9Λ/17- 17 : 34 exp(l+Vl7)i/2 4 5+Λ/Ϊ7 / /5-Vn\ 9Λ/17 — 17 34 exp(l-Vl7)i/2 4 5-Λ/Ϊ7 / 24. (a) ft)—(- (d) /M (e) + c2e5 + c2e4t 1 \ + c3e- 1 \y2) = Cl ^P^6 + /Зл/^' 2 + ί3λ/5 + сг ехр(6 - ι3λ/5)ί 2 _ ,3л/5 (f) >ί = He exp(4 — 7ί)' + с ехр(4 + 7ι)ί) 1 уг = — (с ехр(4 — li)t — с ехр(4 + 7ι)ί) (g) >ί = Re(c exp(3 - 2ί)) j<2 = Im(ci exp(3 — 2/)) j-3 = Re(c2 exp(l - 2i)) j<4 = Im(c2 exp(l — 2/)) (h) M\ I >21 = е'(сз ί 2/2 + c2 f + οΟΕΊ + <?'(c3 f + c2)£2 + c3 е'Яз
Chapter 4 707 (0 M\ ><2 = cie'Ei + c2 e'E2 + e'[(d - c2)f + d]E3 SECTION 3.7 26. y = e-*/2 jt2e' dt+ d + de'x У = x2/2 - χ + ci + c2 e~x 27 (a) cie2' + c2e-2,-l/4 (b) dex+c3e-x-d-(^-\-6x)ll (c) Cie-*+ с2<г2* +(sin x — 3 cos x)/10 (d) d ex + Cix (e) dx2 + dx3 + (In x)(—x2 + x3) 28. j< = 4x2/9 + 5/9x + 2x2 In x/3 29. ;v = CiX + c2x Jexp[(l — t)e']/t2 dt 30. > = e_1{exp[(l - x)e*] + χ J" e' exp[(l - f )<?'] Λ} + χ - 1 Chapter 4 SECTION 4.1 1. x(0) = (cos Θ, 0, sin 0) 0 < θ < 2тг 1 +V5 2. χ(0) = —-— (cos θ, sin 0,1) 0 < 6» < 2π 3. θ = arc cos((-_' — 1) с > i parametrizes the curve in the upper half-plane by taking 0 < θ < π; in the lower half-plane by taking — π < θ <0. 5. z(f) = a cos f<?"" = e"'b(-b sinl+i cos i)/(cos2 f + ft2 sm2 f)1'2 6. (a) (1-/,/) (b) (1 + /, sm 1 - ( cos 1) (c) (1 +/, 1 -/,/) (d) (2a, at, t) (e) (M,/) (f) (1 + 2/,-2M +/) 7. (a) x axis (b) ^ axis (c) ^ axis (d) x=y, z = 0
708 Answers to Selected Exercises section 4.2 ds _ (1 + 2a cos θ + α2)1'2 ' (a) dl= (1 + a cos Θ)2 (b) ~ = (5 + 4 cos 0)1'2 (c) (6a)i = (l-e-')/21/2 Λ (χ4 + cos2(l/x))1'2 (6d) ds = $ (a2 + cos2(0/2))1/2 dB (6/)j=J(8i2+ l)1'2* Г U (7φ=ί - (,2+1)l,2 dt f+1 10. (a) aN=21'2e,=aT 4+18i2 (9^ + 16)1'2 W flr=(4 + 9,2)1/2sgiW,fl„=3|i| 4+9i2 (d) aw = [(1 - sin i)/2]1/2 ar = -[(l + sin/)/2]1'a 11. (a) aN = |sm f 1(2/2 + cos 2f )1/2 αΓ = —sin 2i/(2 + cos 2f)1/2 (c) ar = //(2 + i2)1'2, aN = [(t* + 5f2 + 8)/(i2 + 2)]1'2 SECTION 4.5 12. (a) y=-xy' (b) x/ + y=0 (c) χ exp(-j</*/) = 1 (d) sin x(sin у + χ/ cos j<) = χ sin .y cos χ (e) j< = exp(x + y) y'jy{\ + y') (f) l+/=0 13. (a) ^=cx (b) x + ^2/2 = c (c) 2х + у2 = с (d) х = сехр[-/у (7 +sin ί)-1]<# (e) c(;y — 1) = csc(7r/4 — x)
Chapter 5 709 14. x' + (y-c)2=c2 (y2-x2)y' + 2xy=0 (x + y)2 (y-x-ir , 2a2 + 2«2-l _I 16. y2=cx 17. (a) x2-y2=c (b) y2-x2=c (с) у=х + с 1 18. (а).СЬ)^-^ + 2ч/Здс,=с (с) y = -^-^x-^j 19. (a) x^=0 (b) ду=0 (c) ^=0 (d) sinx=0 (e) (x + >)j-=0 20. (a) y = ±l (d) > = ±e* (e) r = \,r = cos θ (f) 0=O,r = ±l SECTION 4.6 21. (a) xy=c (b) x2 + y2=c2 (c) xz = c,;yz=i/ (d) (x, y, z) = (ae\ b-t, ce') 22. (a) (l+'^)^ (b) в (с) (х,-хЧ2) (d) (l.^.Vl-z2) 23. (a) (-χ,ΐ,-z) (b) ((1 + f)*, (1 + t)y, 2t(x2 + y2))l(\ + t)2) (c) (0,1, -z tan 0 (d) (-*, -л -z(l + tan /)) 24. (a) exp(f 2/2)(x0, у о, ζ0) (b) (xo cos f — yo sin f, jo cos t + xo sin t, t + z0) (c) (x0e~\ y0e~', z0e') Chapter 5 SECTION 5.1 1. (a) |*| < 1/2 (b) |x| >1 (c) x<0 (d) -19<x<-17 (e) never (f) |*| < 1 2. (a) |z|<l (b) all ζ (с) lmz>0
710 Answers to Selected Exercises 3. (a) both (b) both (c) both (d) integrated for all x, differentiate for χ/2π not an integer 4. (a) f 2(n+l)(-x)2n n = 0 (b) 2 2 nx-1 qo v2i + l (c) 2 πΐΌ(2ι'+1)ι! qo фАП+1 (d) 2(-l)" „=o 4n+ 1 SECTION 5.3 5. (a) -e2x + xe2x + ex (b) (SIA)e-x + 2xe-x + (\l4)e3 (c) e2x - 2xe2x + (5/2)xV* (d) 3ex - xex (e) - Re[(l - 2i)e'x + (-1 - i)xe'x] (f) (6/5)<?* - (7/40)xe* - (1 /40)х<?-3* (g) (l/2)[exp(21/2x) + exp(-21'2x)] - e~ x* x1 x10 9. х+т^+^Г,+ 12 504 40360 X3 JC8 10·Τ + 20Ϊ6 2ft,+i ft, 11. (a) a„+2 = η + 2 (η + 1)(л + 2) 3* 1*,|<-г л! 2α„+ι α„-ι 1 (b) й-+2=77^-Л, , 1V_ , ,-» + -} |0ηΙ< η + 2 (/ι+ΙΧβ + 2) η! 4ΛΓ [η/2)»
Chapter 5 711 -ak (d) (e) SECTION 14. (a) (b) (c) (d) (e) (f) (g) (и + 1) · · · (η + к) К kl<- ап + к2а„ (η + 2)(η + 1) К \а«\<- Оп-1 ^ „+1 Ы*ЫЫ 5.5 оо 72п Σ — „to n! - (i + On-(i-O" Ζ ... г" αο Σ π = 0 ■[»/2](_ΐ)*- Λ (2/t)! ζ" α> ν2η+1 Σ »ΐο(2η+1)η! 00 Σ π = 1 π J-1 1 ν ν A(foy(„_y)!,t ν" Λ 00 7·2Π У — „to (2η)! 00 Σ π = 0 ζ2η + 1 2η +1)! 16. cos(z + w) = cos ζ cos w — sin ζ sin w cosh(z + w) = cosh ζ cosh w + smh ζ smh w sin(z + w) = sin ζ cos w + cos ζ sin w sinh(z + v) = sinh ζ cosh w + cosh ζ sinh w
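The four addition formulas of Exercise 16, namely $\cos(z+w) = \cos z\cos w - \sin z\sin w$, $\cosh(z+w) = \cosh z\cosh w + \sinh z\sinh w$, $\sin(z+w) = \sin z\cos w + \cos z\sin w$ and $\sinh(z+w) = \sinh z\cosh w + \cosh z\sinh w$, can be spot-checked for complex arguments. The following is only a numerical sanity check at one arbitrarily chosen pair of points, not a proof; the function name `addition_formula_errors` is hypothetical.

```python
import cmath


def addition_formula_errors(z, w):
    """Absolute error of each addition formula at the complex points z, w."""
    return {
        "cos": abs(cmath.cos(z + w)
                   - (cmath.cos(z) * cmath.cos(w) - cmath.sin(z) * cmath.sin(w))),
        "cosh": abs(cmath.cosh(z + w)
                    - (cmath.cosh(z) * cmath.cosh(w) + cmath.sinh(z) * cmath.sinh(w))),
        "sin": abs(cmath.sin(z + w)
                   - (cmath.sin(z) * cmath.cos(w) + cmath.cos(z) * cmath.sin(w))),
        "sinh": abs(cmath.sinh(z + w)
                    - (cmath.sinh(z) * cmath.cosh(w) + cmath.cosh(z) * cmath.sinh(w))),
    }


# Arbitrary test points, chosen only for illustration.
errors = addition_formula_errors(0.3 + 0.7j, -1.1 + 0.4j)
for name, err in errors.items():
    print(name, err)  # each error should be at rounding-error level
```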
712 Answers to Selected Exercises SECTION 5.7 17. α„{ Σ ϊΓΤ^τ-, Π IW + 1) - CM/ + l)]*2" + 1 \n=i(2n+ 1)! j-o 1 "=i + «ι Σ (2и+1)1 Π [*<* + 1) - (2/ + 1)(2/ + 2)]χ2"+1 + * 18. α0 Σ 2" "-1 (2η)! ;=ο χ2"+ 1 + «1 Σ 2" "-1 -—ПСУ+1-А) (2η)! j=o χ2+1 + χ 19. (а) я0 = 1, αϊ =2, α„+2 1 _ α, :(n+l)(n+2)1+f=„7! (b) y=x2/4ory=0 (c) по solution (d) у = (2x2 - с)- = -(1/ci) Σ (2/d)"x2 20. (a) 10 (b) 460 (c) 3 21. (а) к odd integer, a0 = 0 or к even integer, αϊ = 0 Chapter 6 SECTION 6.1 1. (а) /(0) = 7г2/3 /(n) = (-l)"4 η (b) /(5) =/(-5) = 1/32 /(3)=/(-3) = 5/32 /(l)=/(-l) = 10/32 /(n)=0 allother η (-1)" e1"" - <r "■" (-1)" sin(fiTr) (c) /(n) = — = , μ not an integer 2πι μ — η μ — η
Chapter 6 713 (d) /(0) = π/8 /(л)=0 rfn = 4fc = l/πη2 if η is odd = 2/ττη2 !fn = 4k + 2 (e) /(n)=0 η odd = 2/тг(1 — η2) η even (/) /(1)= (1-0/2 /(-l)=(l + ,)2 /(n)=0 all other η (g) /(0) = 1/2 /(n)=0 η odd or n=4&, A:^0 /(η) = -2/τΓΐη „=4Α: + 2 (h) /(„)=(-iy (0 /(") = />Я д — Я 2тг(1 - m) 2 + 2me"(-l)n 1 + n2 2. (a) (Re z)3 + (Im z)3 2 « /_iy (b) 3 Σΐ-g-l (rVM + ^ e-2i»\n 1 « sin ρ . . , „ (с) - Σ (-1)"—ί-r'V"1' 7Γ π=-α> μ- — П π 1 (d) - + - о я 2 (е) - Re 7Г α> „Ι2Π+1Ι J oo (-212П + 11 „έ«,(2η+1)2" ' 2тг„=еа)(2п+1)2 КН"й] (f) (i + *)2 SECTION 6.2 3. (a) -2 + ;Imlnll-zj (b) ~(z 2 + ζ"2)
714 Answers to Selected Exercises 2π2 ^, r""i? (с) — + 2Σ(-1)" — 3 π *0 Π 1 ^ ч Sin μπ π „^-ю μ — η (d) - 2 (-1)""-^—(-"V"9 (e) (1 + ζ)2 4. (a) r sin θ + r2 cos 20 (b) I H"1 /V 7г„^о |nl \n2 2(-l)"-2\ (-1)- + ^—), 1 2 oo z2n+i (c) - Im 2 -——— + - π „=o (2n+ l)2 2 SECTION 6.3 5. (a) (35/2 +28 cos 26»+ 14 cos 40 + 4 cos 66» + (1/2) cos 80)/64 (b) 2/ДI . I(- 1)J sin(A: - 2;)0 к odd 2?o П(- DJ cos(A: - 2j)6 + (-1)" /Μ л even 1 2 » sm(2n + 1)0 (c) г + " Ζ , , <— ζ 7Γπ = ο ζη+1 1__2 » sm(4A: + 2)0 ( ) 2 π „t-o 2/fc + 1 (e) (cos 50 + 5 cos 30 + 10 cos 0)/16 6. See Problem 27, Section 6.5 7. cosine series sine series (a) 1 (b) (1 - cos 4тгх)/2 (c) cos(2itx) 4 " sm(nnx) π π = ι Π sin(27rx) 4 » (2n + l)sm(2n + 1)πχ π,ίΌ (2n+3)(2n+l)
Chapter 6 715 H, ! , 2 ν (-l)ncos(2n+l)7rx 2 - 1 (d) 2 + ^Jo (2^ΓΪ) i^-C^+D™ + 2 sm(4n + 2)тгх + sm(4n + 3)πχ) (e) (1 - cos 2тгх)/2 sin πχ 1 8 ^ cos(4n+2)7rx 4_ ^ sin(2n+l)7rx u 4 π2 „и (4„ + 2)> π* Λ( r (2n+l)2 (g) (1 + 2 cos πχ — cos 2тгх)/2 4 « 2n + 1 2 sm πχ + - 2 ,, , ,v, rr sm(2n + ί)πχ π „ = o (2n + 3)(2n— 1) 8. /(^) = [/(&) +/(-0)]/2 + [/(в) -/(-0)]/2 9. 2 2 [Л2„ cos(2n)0 + Я2п+1 sin(2n + 1)0] n = 0 SECTION 6.4 .„ , ч »/sin(7rn-l) sm(7rn+l)\ 10. (a) > I — — I cos nnt sin πηχ π = ι \ πη — 1 7ГП+ 1 / 1 64 « η2 ... (b) - sin nt sm πχ + — 2 ZT^—»w. ,—ττ cos 2πη( sin 2πηχ π π „ = i (4n2 — 9)(4n2 — 1) —8 °° 1 (c) "Τ Σ „ , ,чз cos(2" + !)"' sin<2" + 1)7Γ* πό Л = о \2п + I) 1 8 £ " (d) - sm ττί sin тгх + - 2, η—7 cos 2π"'sin 2π"χ 4 ' ττ π „ = .ι 4η2 — 1 128 « η(η2-2) СеЧ У (— 1У-1 : r Sin πηί Sin πηΧ Κ) ττ2 Α Κ ' {An1 - 1)(4η2 - 9) » / — 7Γ2η2Λ l™x\l "■ (a) I^bzrM'nl ττ2η2ί\ /7rnx\/sin(7rn —L) sin(7rn + L)\
716 Answers to Selected Exercises I + 1)πχ -8L2 » /-·ττ2ί \ 1 (In (c) —— Σ exp —— (In + l)2 sin — π „=o \ 4i/ J (2n+ 1) /-7г2Л 7ГХ /-25тг2А (d) exp^—js.n-^ + Sexp^-jsi 5πχ sm-^— 2 —4e » sin(2n+Ιπ-χ) 12. (c) 1 + (e - \)x + 2 ехр(-тг2(2п + l)2f) \ } 7Γ п = 0 2П+ I « n(2-e(-l)n) + Σ exp(—ττ2η2ί) — — sin πηχ n = l 1 + ГГ + 7Γ2 14. (a) - exp /—7Г2Л 7ГХ [-UFJ sin Τ 2L » /-irVA/ 1 \ 2πηχ sm- L 15. The general solution is of the form f (A sin(n2 + l)1/2f + B„ cos(n2 + l)1'2) sin nx Fl = l where the A, ft are determined by the sine series of the initial data. 16. On the interval [—π, π] Σ (е2'"'-1)(А„ + В„еш) where the A„, Вп are determined by the Fourier series of the initial data. SECTION 1 17. (a) (b) (c) (d) 5.5 27tt/64 4 ^, ηζ 7Γ Λ („2 _ μ2)2 " 2тЛ1 + Д 2r2" J 2яг 2μπ — sin 2μπ 2μ 1 + Г2 1 — г2
Chapter 7 717 1 (e) 2π 2 п = 1 П 2тг5 » l (f) ^ + 16-2^ SECTION 6.6 19. (a) span ofexp(3;0), exp(-5i0) (b) span of exp(±3i0), exp(±i0) (c) span of e±'" 20. (a) (sm 56» + cos 50)/576 + Л sin 0 + Я cos 0 2тг2 (—11V"9 (b) ^7 + 2 2 ( } 27 „to n2(9 - n2 + 6/и) (с) — exp(cos θ) χ m Ч SECTION 6.7 21. (а) 13тг/4 (b) 0 Ar<n Chapter 7 SECTION 7.1 1. (a) [—у sin χ + ζ cos(zx)] dx + cos xdy+ x cos(2x) dz (b) — [<?* +y sin(e* +y) + e> sin(xey)] ^ - [<?* +y sm(x + j<) + xe" sm(xey)] dy (c) <?<*·"> <Ae, a> (d) <Ле, <?<*·<■>> + <x, <?<*·■> ><Λ,α> (e) (2x + ζ) φ- + 2j< i/y + x dz (f) e*+y[l + x-;y]i/x+i?*+y[x-;v-l]i/.y А(п*'У (g) Σ Π*' )dxJ 2. (a) 2/1000 (c) 2/1000 (e) 2/1000e (b) l/1000e (d) 2/5000 (f) l/1000e 3. ||p 1Г/Ю00 if ||p || < 2, \\p Ц/500 if \\p || ^ 2.
718 Answers to Selected Exercises SECTION 7.2 4. (a) (b) / e" xey\ [ye1 e*) хуф\ χ1, χ2, χ3 all different (c) /2x -2y\ \y x) <*»*° (d) / 2x 2y 2z \ i-y/x2 1/x 0 J хфО \-z/x2 0 l/x/ (e) / 1 0 0 -хг1(х1)г l/x1 0 -x3/(x')2 0 l/x1 0 -x"/^1)2 0 x^O χ" — ^4- +0 ; д(х\ ..., Их") χ" φ 0 and the hu ...,h„ coordinates. К 2 n 5. -. 2 и' du' + 2 tt1 f = 2 1 а («О2 и1 — У —- 1 = 2 Ui du1 6. (a) duju wv du (b) (c) l + w2 — v2 l + v2 — w2 ' + 7T-;—г-;—κϊ ""'* + ; 1 + υ2 + w2 (1 + w2 + υ2)2 (1 + w2 + υ2): vudw 1 1+υ+ w 2 [«(1 + υ2 + w2)] 7. с/2 с = fixed edge Tjidu + u1/2w(w — v) *i*+ u1/2v(v — w) (l+v2 + w2)3'2 (1 + v2 + w2)3'2 <Λν section 7.3 9. (a), (b), (d), (g), (ι) are closed (c), (e), (f), (h), (j) are not closed 10. The real part is exact, the imaginary part is not.
Chapter 7 719 11. (a) e> (c) -\tf (e) exp(-x-y) (b) e\y (d) 1/jc (f) l/cosx SECTION 7.4 12. (a) 0 (b) 1 (c) -ecos 1 -02sm l)/4 - e2/4 + 5/4 (d) -f-f2/2 (e) -a2/2 - 2k2a5/5 (f) 21/2 13 (b), (c), (f) are conservative (a), (d), (e), (g) are not conservative SECTION 7.5 14. (a) 0 (b) 1 15. (a) 0 7Γ W Tab (c) 18 "1 - Γ Ρ α2. (d) e6(cos2 2)(sm 2) (e) π 16. (a) 1 + In 21'2 - 2"1'2 (b) Let sm 0o = (51'2 - l)/2 The area is (3 sin 2θ0 + θ0- sin3 θ0)/6(2γι2 (c) 16 (d) If η is even there is no inside. If η is odd, the area is (^)T SECTION 7.6 17. (a) Set (d) (e**-l)/4 (b) oo _ (e) 4/3 (c) (2тг + 5л/3)/3 18. (a) 2 (b) l-2f-3f2 (c) 2x + 2;y (d) 0 SECTION 7.7 19. (а) 21/27г (b) Зтг/23'2 (с) e'[5i - 3]/6 (d) πι cos(l/2)/2 (e) 2тг/3 _ (f) 2тг/(1 - a2)1'^ (g) _ne~°(l + а)12с? (h) 7r[cos(V2/2) + яп(л/2/2)]/л/2 ехр(-л/2/2) (0 т/3 (j) тг/e (к) тг[2 sm(7r/10) + 2 δΐη(3π/10) + 1]/5
720 Answers to Selected Exercises Chapter 8 SECTION 8.1 1. (а) со (b) 2π (с) тг/2 (d) Λπ/ЗаЬс 2. (а) 2тг/495 (b) 0 (c) Let A =a~\ B = b~\ The integral is π(ΑΒ)1Ι2[-5(Α3 + В3) - ЪАВ{А + В) + 2ДА2 + 6АВ + 24£2]/3(2)6 3. /bra2r6/6 4. 4тг(1п2-1/2) 5. (а) (уе~г + tz - t2x, y(l - t1) + (t - l)(ze· - txe1), -z(l + f2) + (1 + /)(* ~ tye')) (b) -3i2 (c) fcfl-/1)-1 (d) oo 6. (a), (c) are incompressible. 7. (а) 4тг/3 (b) 4тг/3 (с) 8tt/3 SECTION 8.2 9. (a) (<?'(1 - f)+ <?""'('+1), -l.'eO-O-*"")/!-'3 (b) (1,-1,1) (c) -(1,1,1) (d) (-l,2z,0) (e) (e''2(l + 0/2, e'(l - 00?'(1 - 0 + 1), e"2(2 - t)) (f) (0, 0, у sin 0 11. HM = (a'j), div=ai1 + a22 + fl33 curl = (a32 — агъ, л3 — α3\ аг1 — fli2) 12. О SECTION 8.3 13. (b) dS>=(l+fx2)dx> + 2fxfydxdy + (l+tf)dy> dS = (\+tf+f,iyi1dxdy 14. Tangent plane (a) <p, (1, -2j<, -2z> =0 (1 + 4/) φ-2 + Syz dy dz + (1 + 4z2) ife2 (2x2 + y*) dx* + 2xy dx dy + (x2 + 2/) dy2 (b) <p, (-*, ->-, z)> =0 ——
Chapter 8 721 (с) </>, (-2x, -2у,1)У=0 (1 + 4x2) dx2 - 8xy dx dy + (l + 4y2) dy2 Area element (a) (1 + Ay' + 4Z2)1'2 dy dz (b) 21/2dxdy (c) (1 + 4x2 + Лу2)1'2 dx dy 15. (а) 4тг/31/2 16. cos θ = (2v + lu + uv)/(l + Ли2 + v2)1/2(l + и2 + 4υ2)1/2 18. (а) 2тг[(1 + а2)3'2 - 1]/3 (b) π J"_„ (1 + sin2 и)1'2 ί/й (d) 2тг 4 + 1 /л/4 + л/3\ λ/ϊ \л/4-л/3/ SECTION 8.4 19. (а) г1'2^ (b) О 20. (а) е-2 (Ь) -4тг/3 (с) 0 SECTION 8.5 25. (а) 9п/2(аЬУ'2 (Ь) тг/З^)1'* (с) 1/18 28. (а) 0 (Ь) -4тг/3
INDEX Absolutely convergent series, 140 Absolutely integrable, 197 Abstract vector space, 107 Acceleration, 254 normal, 335 particle, 335 tangential, 335 Addition in the plane, 21 in $R^n$, 28 Adjoint matrix of an entry, 72 Adjoint of a linear transformation, 123 Algebra of linear operations, 59 of n × n matrices, 60 Analytic continuation, 450 Analytic function, 400, 441, 534, 584 Angular velocity, 626 Annihilator of a subspace, 122 Approximation, 126 Arc length definition of, 331 on a surface, 643 (Problem 22), 258 Area, 572 surface, definition of, 651 Argument, 87 Axiom of the least upper bound, 134 Ball, 111 Basis, 47 Bessel's inequality, 497 Bilinear function, 122 Binormal, 359 Biotic matrix, 253 Boundary of a set, 123 Cannes, 185 Cartesian product, 16 Cauchy uniformly, 204 theorem of, 205 Cauchy criterion, 132, 401 Cauchy formula, the disk, 510 Cauchy integral formula, 585 Cauchy theorem, 583 Cauchy-Riemann equation, 432 Cayley-Hamilton theorem, 76 Chain rule, 168, 229, 233, 540
Change of basis, 82 Characteristic polynomial of L, 277, 411 of M, 76 Circular motion, 338 Circulation of a flow, 626 Clairaut's equation, 372 Closed, 112, 156 Closed and bounded, 158 Closed path, 555 Closure of a set, 123 Coefficients Fourier, 454 Fourier cosine, 481 Fourier sine, 481 real Fourier, 476 Taylor, 246 Cofactor, 65 expansion, 72 Collinear, 24 Compact set in $R^n$, 156 Comparison test, 147, 198, 402 theorem of, 147 for integrals, 198 Complex derivative, 428 Complex eigenvalue, 89 Complex number (Section 1.8), 85 argument, 87 modulus, 88 polar form, 87 Compound interest example of, 251 Conditionally convergent series, 140 Connected, pathwise, 557 Connected set (Problem 78), 224 Conservative field, 557, 632 Constant coefficient differential operators, 410 equation, homogeneous, 262 linear differential equation (Section 5.3), 410 linear differential operator, characteristic polynomial, 411 Contained in, 16 Continuity (Section 2.5), 159 Continuous, 160, 201, 206 Continuously differentiable, k-times, 240 Contraction, 213, 268 lemma, theorem, 213 Convergence Cauchy criterion, 154 in $R^n$, 153 mean square, 496 of a sequence, 131 of a series of functions, 401 uniform, 203 Convolution (Problem 25), 508 Convolution transform, 520 Coordinate axes, 18 Coordinate particle (of a flow), 612 Coordinate relative to a basis, 108 Coordinate space (of a flow), 612 Coordinates cylindrical, 536, 635 in $R^3$, 94 polar, 535 spherical, 536 Coplanar vectors, 93 Counterclockwise, 577 Cramer's rule, 72, 74 Curl of a flow about n, 626 of a vector field, 630 Curvature geodesic (Problem 24), 655 lines of (Problem 65), 693 normal, 692 (Problem 25), 656 of a curve, 336 principal (Problem 62), 692 Curve, 313 binormal, 359 curvature, 336, 352 Frenet-Serret formula, 360
implicitly defined, 317 length of, 331 moving trihedron, 359 normal line, 350 plane, 359 vector, 351 of a minimal length, 646 osculating plane, 350 parametrization of, 313 piecewise continuously differentiable, 555 principal normal, 335 rectifying plane, 359 tangent line, 324 torsion, 360 unit tangent, 322 Curves, family of, 365 Density, 183, 613 de Rham's theorem, 566 Derivative, 166 directional, 192 partial, 187 Determinant, 61 cofactor expansion of, 67, 69 Developable surface, 642 Diagonalization, 77 Differentiable, 166, 228 definition of, 166 function on $R^m$, 527 Differential, 193, 527 Differential equations (Section 3.3), 250 of a family of curves, 370 of the tangent family of a vector field, 383 on the circle (Section 6.6), 505 Differential form (Section 7.3), 547 closed, 548 exact, 548 radial (Problem 27), 573 Differential operator, linear, 276 Differentiation, 166 complex (Section 5.6), 428 partial, 187 Dimension, 40 of an abstract vector space, 107 Direct sum, 122 Directional derivative, 192 Dirichlet principle, 674 Dirichlet problem, 471 on the disk, 469 on the half plane, 524 Dirichlet theorem, 674 Dissipative functions, 597, 682 Distance in $C(X)$, 203 in $R^n$, 111 mean square, 496 Divergence of the flow, 582, 619 Divergence theorem (Exercise 27), 673 in $R^2$, 581 in $R^3$, 670 Domain of a function, 17 regular, in $R^2$, 568, 571 in $R^3$, 662, 668 ∈, ∉, 16 ε-δ criterion (Proposition 13), 160 Eigenspace, 85 Eigenvalues, 76 of a skew symmetric matrix (Problem 32), 302 of a symmetric matrix (Example 8), 236 Eigenvectors, 76 of linear systems of differential equations, 280 Elementary matrices, 32, 63 Elementary transformations, 32 Empty set, ∅, 16
Energy conservation of, 612 integral, 674 kinetic, 501 potential, 557 Envelope of a family of curves, 376 Equation heat, 467, 489 Laplace's, 467 of continuity, 617 wave, 482 Equations of motion, 311 Equations of a particle, 335 Euclidean inner product, 114 in $R^n$, 111 in $R^3$, 97 Euclidean vector space, 113 Expansion Fourier, 454 Laurent (Problem 38), 601 Taylor, 246 Exponential, 262 of a matrix, 279 (Problem 93), 226 Exponential function, definition of (Proposition 25), 210 Exponential order, 521 Family of curves, 365 differential equation of, 370 envelope of, 376 explicit form, 366 implicit form, 366 orthogonal family, 374 tangent to a vector field, 383 Fetah, 253 Field, 306 conservative, 557 First fundamental form on a surface, 643 First-order linear equations, 264 Fixed point, 209 Fixed point theorem (Section 2.11), 211 (Theorem 2.15), 213 Fluid flows, 385 circulation, 626, 660 curl about n, 626 divergence of, 582, 619 equations of motion, 310, 611 incompressible, 618 irrotation, 631 steady, 612 velocity field, 612, 385 Flux of a field, 660, 667 Force field, 255, 306 Fourier coefficient, 454 real, 476 Fourier series (Chapter 5), 452 real, 476 sine, 476, 481 cosine, 476, 481 transform, 523 Frenet-Serret formula, 360 Frenet-Serret frame, in $R^n$ (Problem 66), 398 Fubini's theorem, 177, 184, 189 Function, 17 analytic, 400, 441, 534 continuous, 160, 201 differentiable, 527 dissipative, 596, 682 domain of, 17 infinitely differentiable, 443 flat, 443 invertible, 17 harmonic, 468 linear, 1 Lipschitz, 269 meromorphic, 616 odd, 478 one-to-one, 17 onto, 17 periodic, 452
range, 17 Schwartz test, 523 Fundamental existence and uniqueness theorems, 271, 273 Fundamental theorem of algebra, 61, 86 (Theorem 5.2), 408 Fundamental theorem of calculus, 172 Gamma function (Problem 48), 608 Garib, 253 Gauss' theorem (Proposition 11), 681 Geodesics, 645 curvature (Problem 24), 655 Geometric series, 137 Gradient, 193 Gram-Schmidt process, 117 Great circles, 645 Greatest common divisor, 410 Green's identities (Theorem 8.5), 676 Green's function (Definition 13), 681 for a half space, 682 for a ball (Problems 40 and 41), 683, 684 Green's theorem, 567, 571 Harmonic functions, 468, 676, 677 Harnack's principle (Problem 36), 518 Heat equation (Section 6.4) in $R^2$, 482 in $R^3$, 671 Helix, 315 Homogeneous constant coefficient equation, 262 Hooke's law (Problem 17), 256 Hyperbolic cosine, 424 Hyperbolic sine, 424 i, 85 Implicit function theorem, 216 Implicitly defined curves, 317 Incompressible flow, 618 Independence, 43 Index, 10 Infinitely differentiable, 166, 443 flat, 443 Inner product in an abstract vector space, 114 in $R^n$, 111 in $R^3$, 97 Integrable function, 174, 179 absolutely, 197 Integral definite, 170 Dirichlet's, 519 energy, 674 improper (Section 2.9), 195 indefinite, 169 iterated, 177 multiple, 173 of a vector function, 259 on a surface, 657 test, 199 Integrating factor, 549 Integration, 170 formula for change of variable, 614, 579, 621 multiple (Section 2.7), 173 of differential form, 563 Intermediate value theorem, 162 Intersection, 16 Inverse function theorem (Theorem 7.2), 541 of a function, 17, 168 Invertible function, 168 Invertible matrix, 61 Irrotational flow, 631 Isolated singularity, 592 Iterated integral, 177
Jacobian, 537 Jordan canonical form, 81 Jump discontinuity (Problem 30), 517 k-times differentiable, 240 Kepler's laws (Problem 68), 399 Kernel of a linear transformation, 53 Poisson, 458 Kinetic energy, 501 \\ (Problem 89), 225 L (Problem 90), 225 Lagrange multipliers, 234 Laplace transform, 521 Laplace's equation, 467, 672 Laurent expansion (Problem 38), 601 Legendre's equation, 450 polynomials (Problem 60), 450 Length, 203 of a curve, 331 Leibniz's theorem, 141 Limit, 165, 196 Linear differential equations (Section 3.6), 275 first order, 264 second order (Section 3.7), 289 systems, 278 Linear differential operator, 276 Linear function, 1 Linear span, 40 Linear subspace, 40 basis, 47 dimension, 40 Linear systems of differential equations, 278 Linear transformation, 28 adjoint, 123 eigenspace, 85 eigenvalue, 77 complex, 89 eigenvector, 77 kernel, 55 nullity, 53 range, 53 rank, 53 self-adjoint, 125 spectral theorem, 125 Lines of curvature (Problem 65), 693 Lines of force, 309 of a fluid flow, 386 Liouville's formula, 655 Liouville's theorem (Proposition 7), 587 Lipschitz function, 269 Local, 112 Logarithm (Example 17), 247 Mass, conservation of, 612 Matrix, 8 change of basis, 82 characteristic polynomial, 76 column index, 8 diagonal (Problem 22), 39 diagonalization of, 77 eigenvalue of, 77 eigenvector of, 77 index (Definition 1), 10 invertible, 61 Jordan canonical form of, 81 multiplication, 31 orthogonal (Problem 29), 289 row index, 8 symmetric, 123 transpose, 123 Maximum principle, 449, 586 for analytic functions, 512 for harmonic functions, 472 Maxwell's equations, 632 Mean square convergence, 496 Mean square distance, 496 Mean value property (Proposition 2), 471
Mean value theorem, 166 Meromorphic function, 610 Mixed partial derivatives (Theorem 2.13), 189 Modulus, 88, 111 Morera's theorem (Problem 55), 609 Moving trihedron, 359 Multiplication, matrix, 31 Multiplicity, 409 Neighborhood, 111 Neumann's problem, 473, 675 Newton's gravitational potential, 679 Newton's law, 255, 340 Newton's method, 213 Normal to a curve in $R^2$, 335 to a surface, 655 Normal acceleration, 335 Normal curvature (Problems 25 and 62), 656, 692 Normal line to a curve, 350 Normal line to a surface, 657 Nullity, 53 One-to-one, 17 Open set, 111 Operator, integral, 507 Order, 187 exponential, 521 Orientation in $R^2$, 579 in $R^3$, 614 of a curve, 321 of a surface, 658 of the boundary in $R^2$, 567 of the boundary in $R^3$, 666 of the boundary of a curve on a surface, 661 Oriented path, 555 Origin, 18 Orthogonal, 97, 111 Orthogonal curves on a surface, 644 Orthogonal family to a family of curves, 374 Orthogonal matrix (Problem 29), 289 Orthogonal projection, 113 Orthonormal functions, 491 Orthonormal set of vectors, 114 Osculating circle, 357 Osculating plane, 350 Pain, 435 Parametrization, 313, 321 Parseval's equality, 498 Partial derivative, 187 Partial differentiation, 187 Particle motion, 254, 335 Partition of unity, 444, 451 Path closed, 555 integral, 562 of motion (of a flow), 612 oriented, 555 Period of a harmonic function (Problem 46), 608 Periodic function, 452 Permutation, 68 even, odd, 69 interchange, 68 Picard's theorem (Theorem 3.3), 271 (Theorem 3.4), 273 global version (Proposition 9), 286 Plane in $R^3$, 97 with chosen point, 20 Planetary motion, 390 Plane geometry, 18
Poincaré's lemma, 564 Point on a surface (Problem 67) elliptic, 693 hyperbolic, 693 parabolic, 693 Poisson kernel, 458 Poisson transform, 458 Polar coordinates, 535 Polynomial functions, 406 Population, 252 Positive integers, P, 13 Positively oriented coordinates, 658 Potential energy, 555 Potential function, 555 Power series, 149 addition and multiplication (Proposition 3), 426 and Taylor expansions, 246 radius of convergence, 150, 421 Principal curvatures (Problem 62), 692 Principal normal vector, 335 Principle of Mathematical Induction, 13 Product, matrix, 31 Projection, 113 Properties of analytic functions (Theorem 7.10), 588 Radius of convergence, 150, 421 Range, 17 Rank, 53 Ratio test, 151 Rational numbers, Q, 15 Real numbers, R, 15 Rearrangement of series, 143 Rectangle, 16 closed, 173 volume of, 173 Rectifying plane, 359 Regrouping of series, 143 Regular domain, 568, 571, 662, 668 Residue theorem, 592, 593 Riemann integrable, 170 $R^n$, 16 addition in, 28 linear subspace of, 40 scalar multiplication, 28 Rodrigues's formula (Problem 65), 693 Root, 409 Root test, 151 Roots of unity, 406 Row operation, 9 transformation corresponding to, 29 Row-reduced matrix, 9 Row reduction, 7 Scalar multiplication in the plane, 20 in $R^n$, 28 Schwartz test functions, 523 Schwarz's inequality, 120, 223 Schwarz's lemma (Problem 60), 610 Second fundamental form (Problem 63), 692 Second-order linear equations, 289 Self-adjoint transformation, 125 Separation of variables, 260, 290 Sequence, 129 convergence of, 131 subsequence of, 130 Series, 137 absolutely convergent, 140 comparison test, 147 conditionally convergent, 140 convergence, 137 Fourier cosine, 481 Fourier sine, 481 geometric, 137 of functions, 401 power, 149
ratio test, 151 root test, 151 Taylor, 246, 509 with positive terms (Proposition 4), 139 Simultaneous linear equations, 2 homogeneous system, 11 Singular solution, 373 Skew-symmetric matrix (Problem 32), 302 Solid angle (Problem 34), 673 Spectral theorem for self-adjoint operators, 125 Spherical coordinates, 536 Steady flow, 385 Stokes' theorem, 662, 666 Straight line, 24 Subtraction, 22 Successive approximations, 266 Surface, 635 arc length, 643 area, definition of, 651 definition, 635 developable, 642 elliptic point, 693 first fundamental form, 643 geodesics, 645 hyperbolic point, 693 normal, 655 orientable, 658 orientation, 658 orthogonal curves, 644 parabolic point, 693 patch, 635 second fundamental form, 692 tangent plane, 640 Symmetric bilinear form, 123 Symmetric matrix, 123 System of coordinates, 534 Tangent to a curve, 322, 324 Tangent plane to a curve, 350 to a surface, 640 Tangential acceleration, 335 Taylor expansion, 246 Taylor's formula (Theorem 3.1), 242 Tests comparison, 147, 198 integral, 199 ratio, 151 root, 151 Topological terms, 111 Topology, 575 Torsion, 360 Transform Fourier, 523 Convolution, 520 Laplace, 521-22 Poisson, 458, 524 Transpose of a matrix, 123 Triangle inequality, 120 Trigonometric functions, Taylor expansion, 246 Trigonometric polynomial, 454 Uniform convergence, 198, 204, 205 Union, 6 Unity nth roots of, 406 partition of, 444, 451 Variation of parameters, 295 Vector, 20, 28 addition in $R^2$, 21 addition in $R^n$, 28 field, 306, 381 independent set, 43 in the plane, 20 product, 100 subtraction in $R^2$, 22
Vector field, 306, 381 curl, 630 divergence, 581, 670 flux across a surface, 660 radial, 560, 624 Vector space, 107, 108 Velocity, 254 field of a flow, 612 of a fluid flow, 385 of a particle, 335 Volume, 173 Wave equation, 482 Weierstrass approximation theorem (Problem 6), 467 Work, 554 Wronskian, 293