Author: Zeidler E.
Tags: mathematics mathematical physics nonlinear functional analysis variational methods
Year: 1984
Eberhard Zeidler
Nonlinear
Functional Analysis
and its Applications
Variational Methods
and Optimization
Leonhard Euler (1707-1783)
Eberhard Zeidler
Nonlinear
Functional Analysis
and its Applications
III: Variational Methods
and Optimization
Translated by Leo F. Boron
With 111 Illustrations
Eberhard Zeidler
Sektion Mathematik
Karl-Marx-Platz
7010 Leipzig
German Democratic Republic
Leo F. Boron (Translator)
Department of Mathematics
and Applied Statistics
University of Idaho
Moscow, ID 83843
U.S.A.
AMS Classification: 58-01, 58-CXX, 58-EXX
Library of Congress Cataloging in Publication Data
Zeidler, Eberhard.
Nonlinear functional analysis and its applications.
Bibliography: p.
Includes index.
Contents: —pt. 3. Variational methods and
optimization.
1. Nonlinear functional analysis—Addresses, essays,
lectures. I. Title.
QA321.5.Z4513 1984 515.7 83-20455
© 1985 by Springer-Verlag New York Inc.
All rights reserved. No part of this book may be translated or reproduced in
any form without written permission from Springer-Verlag, 175 Fifth Avenue,
New York, New York 10010, U.S.A.
Typeset by Science Typographers, Inc., Medford, New York.
Printed and bound by R. R. Donnelley & Sons, Harrisonburg, Virginia.
Printed in the United States of America.
987654321
Dedicated in gratitude to my teacher
Professor Herbert Beckert
Preface
As long as a branch of knowledge offers an abundance of problems, it is full
of vitality.
David Hilbert
Over the last 15 years I have given lectures on a variety of problems in
nonlinear functional analysis and its applications. In doing this, I have
recommended to my students a number of excellent monographs devoted to
specialized topics, but there was no complete survey-type exposition of
nonlinear functional analysis making available a quick survey to the wide
range of readers including mathematicians, natural scientists, and engineers
who have only an elementary knowledge of linear functional analysis. I have
tried to close this gap with my five-part lecture notes, the first three parts of
which have been published in the Teubner-Texte series by Teubner-Verlag,
Leipzig, 1976, 1977, and 1978. The present English edition was translated
from a completely rewritten manuscript which is significantly longer than
the original version in the Teubner-Texte series. The material is organized in
the following way:
Part I: Fixed Point Theorems.
Part II: Monotone Operators.
Part III: Variational Methods and Optimization.
Parts IV/V: Applications to Mathematical Physics.
The exposition is guided by the following considerations:
(α) What are the supporting basic ideas and what intrinsic interrelations
exist between them?
(β) In what relation do the basic ideas stand to the known propositions of
classical analysis and linear functional analysis?
(γ) What typical applications are there?
Special emphasis is placed on motivation. The reader should always have
the feeling that the theory is not developed for its own sake but rather for
the effective solution of concrete problems. At the same time I try to outline
a variegated picture of the subject matter which ranges from the
fundamental questions of set theory (the Bourbaki-Kneser fixed point theorem) to
concrete numerical methods, encompassing numerous applications to
physics, chemistry, biology, and economics. The reader should see
mathematics as a unified whole, with no separation between pure and applied
mathematics. At the same time we show how deep mathematical tools can
be used in the natural sciences, engineering, and economics. The
development of nonlinear functional analysis has been influenced in an essential
way by complicated natural scientific questions; the close contact with the
natural sciences and other sciences will also be of great significance for the
development of nonlinear functional analysis. In our exposition, the use of
analytic tools stands in the foreground, but we also seek to show
connections with algebraic and differential topology. For instance, Sections 37.27
and 37.28 contain an introduction to Morse theory as well as to singularity
and catastrophe theory. To reach the largest possible readership and to
fashion a self-contained exposition, important tools from linear functional
analysis are provided in the appendices to Parts I and II. These are
presented so that readers with a skimpy background can familiarize
themselves with this material. We forego, at the outset, the greatest possible
generality, but rather seek to expose the simple intrinsic nucleus without
trivializing it. According to the author's experience, it is easier for the
student to generalize familiar mathematical ideas to a more general situation
than to elicit the basic idea from a theorem that is formulated very generally
and burdened with many technical details. The teacher must help him in
that task. In order to make it easier for the reader to grasp the central
results, a number of propositions have been listed in a separate section
called List of Theorems to be found on page 643. It is clear that this
procedure is not entirely free of arbitrariness. However, we hope that the
lists of Theorems for Parts I-V provide an overview of the essential
substance of nonlinear functional analysis. Furthermore, since, in the
experience of the author, it is frequently difficult, because of a flood of details,
for the student to recognize the interrelationships between different
questions and the general strategies for the solution of problems, special
emphasis is placed on these interrelationships.
We have given a general overview of the content of Parts I-V and the
basic idea of nonlinear functional analysis in the Preface and in the
introduction to Part I. The present Part III consists of the following topics:
(α) Introduction to the subject.
(β) Two fundamental existence and uniqueness principles.
(γ) Extremal problems without side conditions.
(δ) Extremal problems with smooth side conditions.
(ε) Extremal problems with general side conditions.
(ζ) Saddle points and duality.
(η) Variational inequalities.
In the introduction, and in the schematic survey in Fig. 37.1 on page 3, we
give an overview of the interrelationships between various extremal
problems. In the comprehensive introductory Chapter 37, we present many
simple, but typical, examples that are representative of those concrete
problems that have played a central role in the historical development of the
subject. In order to obtain an impression of the extraordinary variety of
problems involved, the reader should glance at the list of subjects for
Chapter 37 that appears in the Contents. In the immediately following
chapters it is our chief concern to show the reader that these problems can
be handled with the aid of a unified theory of extremal problems. The
essence of this unified theory consists of a small number of fundamental
principles of functional analysis. The title of Part III, Variational Methods
and Optimization, indicates that we consider aspects of the classical calculus
of variations as well as modern optimization theory and their
interrelationships. By working out the supporting ideas and general fundamental
principles, we also wish to help the reader obtain an understanding of the
substance of the extraordinarily comprehensive and turbulently
accumulating literature on extremal problems, to classify these works according to
their ideas, and to note the emergence of new ideas.
Each of the 21 chapters is self-contained. Each begins with motivations,
heuristic considerations, and indications of the typical problems to be
investigated and contains the most important theorems and definitions
together with elucidating examples, figures, and typical applications. We
also do not shun citing very simple examples in the interest of the reader.
Furthermore, we always try to penetrate as quickly as possible to the heart
of the matter. We try to achieve the situation where the reader knows at
each phase of the book what concrete applications the general
considerations allow. In general, a very careful selection of the material had to be
made because one could write each chapter as a special monograph and, to
some extent, such monographs already exist. Here, we describe the
applications to nonlinear differential and integral equations, differential
inequalities, one-dimensional and multidimensional variational problems, linear and
convex optimization problems, problems in approximation theory and game
theory, continuous and discrete control problems for ordinary and partial
differential equations, and also consider important approximation methods.
In particular, in Section 37.29, we explain the basic ideas of 10 important
methods and principles for the construction of approximation methods. In
the introduction to Part I we have already pointed out that in numerical
methods the devil rides high on detail. However, general principles and
theoretical investigations of approximation methods within the setting of
numerical functional analysis are useful for recognizing the basic ideas and
for arranging the abundance of concrete numerical methods into a unified
point of view. We examine a number of more profound applications of
nonlinear functional analysis to mathematical physics in Parts IV and V.
At the end of each chapter the reader will find problems and references to
the literature. The problems vary considerably in their degree of difficulty:
(α) Problems without asterisks serve as drills in the material presented and
require no additional tools.
(β) Problems with asterisks are more difficult; additional ideas are
required to solve them.
(γ) Problems with double asterisks are very difficult; one needs substantial
additional information to solve them.
Each problem contains either a solution or a precise reference to the
monograph or original work in which the solution can be found. Moreover,
we try to clarify the meaning of the results with explanatory remarks. The
problems with one or two asterisks are in part so devised that they present
targeted references to the literature on important extensions of results or
they serve to extend the reader's mathematical horizon. A number of topics
will be treated supplementarily in the problem collections. These topics are
particularly extensive in Chapter 40, where we try to sketch for the reader a
line of development from the classical calculus of variations and from
geometrical optics up to the modern theory of Fourier integral operators. In
this we let ourselves be led by the experience that the penetration of a
complicated theory is made easier for the student when she/he has an
ultimate goal from the beginning and knows the connection between the
goal and the simpler questions familiar to her/him.
The references to the literature at the end of each chapter are styled as
follows: Krasnoselskii (1956, M, B, H), etc. The year refers to the list of
literature at the end of the book. Furthermore, the capital Latin letters
mean:
M: monograph;
L: lecture notes;
S: survey;
P: proceedings;
B: the cited work contains a comprehensive bibliography;
H: the cited work contains references to the historical development of the
subject.
In this connection, the references to the literature are at the same time
supplied with clarifying captions which explain the interrelationship
between the works cited. On page 166 one finds "Recent trends". From the
abundance of available literature we have made a careful but necessarily
subjectively biased selection, which in the author's opinion will easily afford
the reader as comprehensive a picture as possible concerning the
further-reaching results. In this, the emphasis lies naturally on the surveys
and monographs. However, we also cite a number of classical works which
were of special significance for the development of the subject. We
recommend that the reader glance at several of these works in order to obtain an
active impression of the genesis of new results and of the historical
development of mathematics. Unfortunately, in order to keep the list of literature
within tolerable bounds, we had to forego listing many important
references.
In the choice of the presentation it was taken into consideration that in
general no book is read completely from beginning to end. We hope that
even a quick skimming of the text will suffice for one to grasp the essential
contents. To this end, we recommend reading the introductions to the
individual chapters, the definitions, the theorems (without proofs), and the
examples (without proofs) as well as the comments in the text between these
definitions, theorems, etc., which point out the meaning of the individual
results. The reader who does not have time to solve the problems should,
however, briefly scrutinize the captions to the problems and the adjoining
remarks, which elucidate the meaning of the formulation of the problems
and the interrelationships. The reader who is interested in supplementary
problem material can try to prove independently all of the examples in the
text without referring to the given proof. Moreover, in the references to the
literature in Section 37.29, books are cited in which the reader will find
comprehensive collections of exercises that as a rule are not too difficult. All
hypotheses both in the theorems and in the examples are explicitly stated so
that the reader avoids a time-consuming search for the assumptions in the
antecedent text. We have taken pains to reduce the number of definitions to
a minimum in order not to burden the reader with too many concepts. On
page xii one finds a list of the most important definitions. In order to clarify
interrelationships, several assertions that belong together are at times
combined into a single theorem. In this form of exposition, we have also kept in
mind the natural scientist and the engineer who want primarily to gain
information on which mathematical tools are available for the various
nonlinear problems. We recommend Chapter 37 to the reader who wishes to
examine the class of problems which the general theory allows one to treat.
However, it suffices to glance at this comprehensive chapter, because
references will later be made at the appropriate places. The reader whose
priority is to become acquainted with the theoretical framework can
immediately begin with Chapter 38 and, on first reading, omit the sections in
the individual chapters that are devoted to applications.
Grasping the individual steps in the proofs as well as the essential ideas of
the proofs is made easier by the careful organization of the proofs. It is a
truism that only by a precise study of the proofs can one penetrate more
deeply into a mathematical theory.
Part III is to a large extent independent of the other parts. However,
where necessary, we do refer to particular results of the other parts. Note
that several auxiliary tools are made available in Parts I and II (basic
information concerning linear functional analysis, Sobolev spaces, etc.). We
formulate a number of results for locally convex spaces. The reader who is
not familiar with this material can orient himself by reading the appendix to
Part I or replace the concept of a locally convex space by that of a Banach
or Hilbert space. Dual pairs are important for duality theory. We explain
this concept in the appendix to Part III. The reference A_i (20) relates to (20)
in the appendix to the ith part. (37.20) is formula (20) in Chapter 37.
Within a particular chapter, we forego giving the chapter number of the
equation. In each chapter, theorems are distinguished by capital letters, so
that, for instance, "Theorem 57.B in Section 57.5" means the second
theorem in Chapter 57, located in Section 5 of that chapter. Propositions,
lemmas, corollaries, definitions, remarks, conventions, counterexamples,
standard examples, and examples are numbered consecutively in each
chapter—for example, in Chapter 41 one finds Definition 41.1, Proposition
41.2, Corollary 41.3, etc., in that order. The end of a proof is indicated by
the symbol □. We subdivide the chapters among the five separate parts of
this work in the following way:
Part I: Chapters 1-17.
Part II: Chapters 18-36.
Part III: Chapters 37-57.
Part IV: Chapters 58-79.
Part V: Chapters 80-100.
A list of symbols used can be found on page 637. We have taken pains to
employ the notation that is generally used. To avoid confusion, we point out
several peculiarities at the beginning of the list of symbols on page 637. A
detailed subject index can be found on page 651. As far as abbreviations are
concerned, we use only B-space (respectively, H-space) for Banach space
(respectively, Hilbert space), F-derivative (respectively, G-derivative) for
Fréchet derivative (respectively, Gâteaux derivative) as well as M-S
sequence for Moore-Smith sequence and L-S deformation for
Ljusternik-Schnirelman deformation.
I have taken pains to write as interesting and diverse a book as possible.
Of course, whether or not I have succeeded in this only the reader can
decide.
I am indebted to numerous colleagues for interesting conversations and
letters as well as for sending me articles and books—I thank them all
heartily. I am especially grateful to my mentor Professor Herbert Beckert
for all that I learned from him as a scientist and as a human being. I should
like to dedicate the present volume to him. I cordially thank Paul H.
Rabinowitz and the Department of Mathematics of the University of
Wisconsin, Madison, for the invitation as guest resident scholar during the
fall semester 1978. The very stimulating atmosphere in Madison influenced
the final form of the exposition in an essential way. In the tasks of typing
the manuscript and of making copies, I was supported in an amiable way by
a number of colleagues, both male and female. I should like to very heartily
thank Ursula Abraham, Sonja Bruchholz, Elvira Krakowitzki, Heidi Kühn,
Hiltraud Lehmann, Karin Quasthoff, Werner Berndt, and Rainer Schumann.
I would especially like to thank Rainer Schumann for a critical perusal of
parts of the manuscript. The understanding and extensive support shown to
me by the librarian of our institute, Frau Ina Letzel, was of great value to
me. Furthermore, I thank the administrators of the Mathematics Section
of the Karl Marx University, Leipzig, and its director, Professor Horst
Schumann, for supporting this project.
I would also like to thank the translator, Professor Leo F. Boron,
University of Idaho, Moscow, for his excellent work. I am very indebted to
him for valuable suggestions and remarks. Finally, my special thanks go to
Springer-Verlag for the harmonious collaboration and the understanding
approach to all my wishes.
Eberhard Zeidler
Leipzig
Spring 1984
Contents
Introduction to the Subject 1
General Basic Ideas 4
CHAPTER 37
Introductory Typical Examples 12
§37.1. Real Functions in R^1 13
§37.2. Convex Functions in R^1 15
§37.3. Real Functions in R^N, Lagrange Multipliers, Saddle Points, and
Critical Points 16
§37.4. One-Dimensional Classical Variational Problems and Ordinary
Differential Equations, Legendre Transformations, the
Hamilton-Jacobi Differential Equation, and the Classical
Maximum Principle 20
§37.5. Multidimensional Classical Variational Problems and Elliptic
Partial Differential Equations 41
§37.6. Eigenvalue Problems for Elliptic Differential Equations and
Lagrange Multipliers 43
§37.7. Differential Inequalities and Variational Inequalities 44
§37.8. Game Theory and Saddle Points, Nash Equilibrium Points and
Pareto Optimization 47
§37.9. Duality between the Methods of Ritz and Trefftz, Two-Sided
Error Estimates 50
§37.10. Linear Optimization in R^N, Lagrange Multipliers, and Duality 51
§37.11. Convex Optimization and Kuhn-Tucker Theory 55
§37.12. Approximation Theory, the Least-Squares Method, Deterministic
and Stochastic Compensation Analysis 58
§37.13. Approximation Theory and Control Problems 64
§37.14. Pseudoinverses, Ill-Posed Problems and Tihonov Regularization 65
§37.15. Parameter Identification 71
§37.16. Chebyshev Approximation and Rational Approximation 73
§37.17. Linear Optimization in Infinite-Dimensional Spaces, Chebyshev
Approximation, and Approximate Solutions for Partial
Differential Equations 76
§37.18. Splines and Finite Elements 79
§37.19. Optimal Quadrature Formulas 80
§37.20. Control Problems, Dynamic Optimization, and the Bellman
Optimization Principle 84
§37.21. Control Problems, the Pontrjagin Maximum Principle, and the
Bang-Bang Principle 89
§37.22. The Synthesis Problem for Optimal Control 92
§37.23. Elementary Provable Special Case of the Pontrjagin Maximum
Principle 93
§37.24. Control with the Aid of Partial Differential Equations 96
§37.25. Extremal Problems with Stochastic Influences 97
§37.26. The Courant Maximum-Minimum Principle, Eigenvalues,
Critical Points, and the Basic Ideas of the Ljusternik-Schnirelman
Theory 102
§37.27. Critical Points and the Basic Ideas of the Morse Theory 105
§37.28. Singularities and Catastrophe Theory 115
§37.29. Basic Ideas for the Construction of Approximate Methods for
Extremal Problems 132
TWO FUNDAMENTAL EXISTENCE AND UNIQUENESS
PRINCIPLES
CHAPTER 38
Compactness and Extremal Principles 145
§38.1. Weak Convergence and Weak* Convergence 147
§38.2. Sequential Lower Semicontinuous and Lower Semicontinuous
Functionals 149
§38.3. Main Theorem for Extremal Problems 151
§38.4. Strict Convexity and Uniqueness 152
§38.5. Variants of the Main Theorem 153
§38.6. Application to Quadratic Variational Problems 155
§38.7. Application to Linear Optimization and the Role of Extreme
Points 157
§38.8. Quasisolutions of Minimum Problems 158
§38.9. Application to a Fixed-Point Theorem 161
§38.10. The Palais-Smale Condition and a General Minimum Principle 161
§38.11. The Abstract Entropy Principle 163
CHAPTER 39
Convexity and Extremal Principles 168
§39.1. The Fundamental Principle of Geometric Functional Analysis 170
§39.2. Duality and the Role of Extreme Points in Linear Approximation
Theory 172
§39.3. Interpolation Property of Subspaces and Uniqueness 175
§39.4. Ascent Method and the Abstract Alternation Theorem 177
§39.5. Application to Chebyshev Approximation 180
EXTREMAL PROBLEMS WITHOUT SIDE CONDITIONS
CHAPTER 40
Free Local Extrema of Differentiable Functionals and the Calculus
of Variations 189
§40.1. nth Variations, G-Derivative, and F-Derivative 191
§40.2. Necessary and Sufficient Conditions for Free Local Extrema 193
§40.3. Sufficient Conditions by Means of Comparison Functionals and
Abstract Field Theory 195
§40.4. Application to Real Functions in R^N 195
§40.5. Application to Classical Multidimensional Variational Problems
in Spaces of Continuously Differentiable Functions 196
§40.6. Accessory Quadratic Variational Problems and Sufficient
Eigenvalue Criteria for Local Extrema 200
§40.7. Application to Necessary and Sufficient Conditions for Local
Extrema for Classical One-Dimensional Variational Problems 203
CHAPTER 41
Potential Operators 229
§41.1. Minimal Sequences 232
§41.2. Solution of Operator Equations by Solving Extremal Problems 233
§41.3. Criteria for Potential Operators 234
§41.4. Criteria for the Weak Sequential Lower Semicontinuity of
Functionals 235
§41.5. Application to Abstract Hammerstein Equations with Symmetric
Kernel Operators 237
§41.6. Application to Hammerstein Integral Equations 239
CHAPTER 42
Free Minima for Convex Functionals, Ritz Method and the
Gradient Method 244
§42.1. Convex Functionals and Convex Sets 245
§42.2. Real Convex Functions 246
§42.3. Convexity of F, Monotonicity of F', and the Definiteness of the
Second Variation 247
§42.4. Monotone Potential Operators 249
§42.5. Free Convex Minimum Problems and the Ritz Method 250
§42.6. Free Convex Minimum Problems and the Gradient Method 252
§42.7. Application to Variational Problems and Quasilinear Elliptic
Differential Equations in Sobolev Spaces 255
EXTREMAL PROBLEMS WITH SMOOTH SIDE CONDITIONS
CHAPTER 43
Lagrange Multipliers and Eigenvalue Problems 273
§43.1. The Abstract Basic Idea of Lagrange Multipliers 274
§43.2. Local Extrema with Side Conditions 276
§43.3. Existence of an Eigenvector Via a Minimum Problem 278
§43.4. Existence of a Bifurcation Point Via a Maximum Problem 279
§43.5. The Galerkin Method for Eigenvalue Problems 281
§43.6. The Generalized Implicit Function Theorem and Manifolds in
B-Spaces 282
§43.7. Proof of Theorem 43.C 288
§43.8. Lagrange Multipliers 289
§43.9. Critical Points and Lagrange Multipliers 291
§43.10. Application to Real Functions in R^N 293
§43.11. Application to Information Theory 294
§43.12. Application to Statistical Physics. Temperature as a Lagrange
Multiplier 296
§43.13. Application to Variational Problems with Integral Side Conditions 299
§43.14. Application to Variational Problems with Differential Equations
as Side Conditions 300
CHAPTER 44
Ljusternik-Schnirelman Theory and the Existence of
Several Eigenvectors 313
§44.1. The Courant Maximum-Minimum Principle 314
§44.2. The Weak and the Strong Ljusternik Maximum-Minimum
Principle for the Construction of Critical Points 316
§44.3. The Genus of Symmetric Sets 319
§44.4. The Palais-Smale Condition 321
§44.5. The Main Theorem for Eigenvalue Problems in Infinite-
Dimensional B-spaces 324
§44.6. A Typical Example 328
§44.7. Proof of the Main Theorem 330
§44.8. The Main Theorem for Eigenvalue Problems in Finite-
Dimensional B-Spaces 335
§44.9. Application to Eigenvalue Problems for Quasilinear Elliptic
Differential Equations 336
§44.10. Application to Eigenvalue Problems for Abstract Hammerstein
Equations with Symmetric Kernel Operators 337
§44.11. Application to Hammerstein Integral Equations 339
§44.12. The Mountain Pass Theorem 339
CHAPTER 45
Bifurcation for Potential Operators 351
§45.1. Krasnoselskii's Theorem 351
§45.2. The Main Theorem 352
§45.3. Proof of the Main Theorem 354
EXTREMAL PROBLEMS WITH GENERAL SIDE CONDITIONS
CHAPTER 46
Differentiable Functionals on Convex Sets 363
§46.1. Variational Inequalities as Necessary and Sufficient Extremal
Conditions 363
§46.2. Quadratic Variational Problems on Convex Sets and Variational
Inequalities 364
§46.3. Application to Partial Differential Inequalities 365
§46.4. Projections on Convex Sets 366
§46.5. The Ritz Method 367
§46.6. The Projected Gradient Method 368
§46.7. The Penalty Functional Method 370
§46.8. Regularization of Linear Problems 372
§46.9. Regularization of Nonlinear Problems 375
CHAPTER 47
Convex Functionals on Convex Sets and Convex Analysis 379
§47.1. The Epigraph 380
§47.2. Continuity of Convex Functionals 383
§47.3. Subgradient and Subdifferential 385
§47.4. Subgradient and the Extremal Principle 386
§47.5. Subgradient and the G-Derivative 387
§47.6. Existence Theorem for Subgradients 387
§47.7. The Sum Rule 388
§47.8. The Main Theorem of Convex Optimization 390
§47.9. The Main Theorem of Convex Approximation Theory 392
§47.10. Generalized Kuhn-Tucker Theory 392
§47.11. Maximal Monotonicity, Cyclic Monotonicity, and Subgradients 396
§47.12. Application to the Duality Mapping 399
CHAPTER 48
General Lagrange Multipliers (Dubovickii-Miljutin Theory) 407
§48.1. Cone and Dual Cone 408
§48.2. The Dubovickii-Miljutin Lemma 411
§48.3. The Main Theorem on Necessary and Sufficient Extremal
Conditions for General Side Conditions 413
§48.4. Application to Minimum Problems with Side Conditions in the
Form of Equalities and Inequalities 416
§48.5. Proof of Theorem 48.B 419
§48.6. Application to Control Problems (Pontrjagin's Maximum
Principle) 422
§48.7. Proof of the Pontrjagin Maximum Principle 426
§48.8. The Maximum Principle and Classical Calculus of Variations 433
§48.9. Modifications of the Maximum Principle 435
§48.10. Return of a Spaceship to Earth 437
SADDLE POINTS AND DUALITY
CHAPTER 49
General Duality Principle by Means of Lagrange Functions
and Their Saddle Points 457
§49.1. Existence of Saddle Points 457
§49.2. Main Theorem of Duality Theory 460
§49.3. Application to Linear Optimization Problems in B-Spaces 463
CHAPTER 50
Duality and the Generalized Kuhn-Tucker Theory 479
§50.1. Side Conditions in Operator Form 479
§50.2. Side Conditions in the Form of Inequalities 482
CHAPTER 51
Duality, Conjugate Functionals, Monotone Operators and Elliptic
Differential Equations 487
§51.1. Conjugate Functionals 489
§51.2. Functionals Conjugate to Differentiable Convex Functionals 492
§51.3. Properties of Conjugate Functionals 493
§51.4. Conjugate Functionals and the Lagrange Function 496
§51.5. Monotone Potential Operators and Duality 499
§51.6. Applications to Linear Elliptic Differential Equations, Trefftz's
Duality 502
§51.7. Application to Quasilinear Elliptic Differential Equations 506
CHAPTER 52
General Duality Principle by Means of Perturbed Problems and
Conjugate Functionals 512
§52.1. The S-Functional, Stability, and Duality 513
§52.2. Proof of Theorem 52.A 515
§52.3. Duality Propositions of Fenchel-Rockafellar Type 517
§52.4. Application to Linear Optimization Problems in Locally Convex
Spaces 519
§52.5. The Bellman Differential Inequality and Duality for Nonconvex
Control Problems 521
§52.6. Application to a Generalized Problem of Geometrical Optics 525
CHAPTER 53
Conjugate Functionals and Orlicz Spaces 538
§53.1. Young Functions 538
§53.2. Orlicz Spaces and Their Properties 539
§53.3. Linear Integral Operators in Orlicz Spaces 541
§53.4. The Nemyckii Operator in Orlicz Spaces 542
§53.5. Application to Hammerstein Integral Equations with Strong
Nonlinearities 542
§53.6. Sobolev-Orlicz Spaces 544
VARIATIONAL INEQUALITIES
CHAPTER 54
Elliptic Variational Inequalities 551
§54.1. The Main Theorem 551
§54.2. Application to Coercive Quadratic Variational Inequalities 552
§54.3. Semicoercive Variational Inequalities 553
§54.4. Variational Inequalities and Control Problems 556
§54.5. Application to Bilinear Forms 558
§54.6. Application to Control Problems with Elliptic Differential
Equations 559
§54.7. Semigroups and Control of Evolution Equations 560
§54.8. Application to the Synthesis Problem for Linear Regulators 561
§54.9. Application to Control Problems with Parabolic Differential
Equations 562
CHAPTER 55
Evolution Variational Inequalities of First Order in H-Spaces 568
§55.1. The Resolvent of Maximal Monotone Operators 569
§55.2. The Nonlinear Yosida Approximation 570
§55.3. The Main Theorem for Inhomogeneous Problems 570
§55.4. Application to Quadratic Evolution Variational Inequalities of
First Order 572
CHAPTER 56
Evolution Variational Inequalities of Second Order in H-Spaces 577
§56.1. The Main Theorem 577
§56.2. Application to Quadratic Evolution Variational Inequalities of
Second Order 578
CHAPTER 57
Accretive Operators and Multivalued First-Order Evolution
Equations in B-Spaces 581
§57.1. Generalized Inner Products on B-Spaces 582
§57.2. Accretive Operators 583
§57.3. The Main Theorem for Inhomogeneous Problems with
m-Accretive Operators 584
§57.4. Proof of the Main Theorem 585
§57.5. Application to Nonexpansive Semigroups in B-Spaces 593
§57.6. Application to Partial Differential Equations 594
Appendix 599
References 606
List of Symbols 637
List of Theorems 643
List of the Most Important Definitions 647
Index 651
Introduction to the Subject
I love mathematics not only because it is applicable to technology but also
because it is beautiful.
Rózsa Péter
Science is a first class piece of furniture for the bel etage—as long as common
sense reigns on the ground floor.
Oliver Wendell Holmes
Extremal problems play an extraordinarily large role in the application of
mathematics to practical problems, for example:
(α) in mathematical physics (mechanics and celestial mechanics,
geometrical optics, elasticity theory, hydrodynamics, rheology, relativity theory,
etc.);
(β) in geometry (geodesics, minimal surfaces, etc.);
(γ) in mathematical economics (transport problems, optimal warehouse
maintenance);
(δ) in regulation technology (optimal control of general regulation systems,
e.g., industrial installations, spaceships, etc.);
(ε) in chemistry, geophysics, technology, etc. (optimal determination of
unknown data from measurements);
(ζ) in numerical mathematics (optimal structuring of approximation
processes, etc.);
(η) in the theory of probability (optimal control of stochastic processes,
optimal estimation of unknown parameters, optimal construction of
airplanes, water-power networks, etc.).
In this connection, we exploit the fact that many processes in nature
proceed according to extremal principles, for example:
(a) the principle of stationary action in mechanics, relativity theory,
electrodynamics, etc.;
(b) the principle of minimal potential energy in stable mechanical
equilibrium states;
(c) Fermat's principle of least time in light propagation in geometrical
optics;
(d) Einstein's principle of the motion of mass along four-dimensional
geodesics in general relativity theory.
Moreover, for economic reasons, we are interested in the optimal modelling
of production procedures and other regulation processes.
The history of extremal problems comprises four distinct stages:
(i) The solution of extremal problems for real functions with the aid of
the differential and integral calculus that was invented about 300 years ago.
(ii) Classical calculus of variations that originated about 300 years ago in
connection with mechanical problems.
(iii) Optimization that came into being because of economic and
regulation-technical questions and that has been intensively advanced during
approximately the last 30 years (linear optimization, Kuhn-Tucker theory,
Bellman dynamic optimization, Pontrjagin's maximum principle).
(iv) The theory of variational inequalities and quasivariational inequalities
with its applications to mathematical physics and the deterministic and
stochastic optimization theory that has existed for about the last 15 years.
Figure 37.1 gives a general view. In this connection, we generally
distinguish:
(a) Problems without side conditions (free problems).
(b) Problems with side conditions (bound problems).
Side conditions in the form of equations are typical for the classical calculus
of variations. For example, the shortest path joining two points on a sphere
must satisfy the equation of the sphere. On the other hand, side conditions
in the form of inequalities are typical for optimization. For example, it can
be a matter of bounds for the fuel supply under optimal control of a rocket
or the bounds for the warehouse capacity under optimal warehouse
maintenance.
In the comprehensive introductory chapter (Chapter 37), we give as an
explanation of Fig. 37.1 a survey of diverse concrete formulations of
problems, the calculus of variations, and optimization theory. In the
following chapters we show that these seemingly very disparate problems can be
treated in a unified way within the framework of a functional analytical
theory with the aid of only a few general fundamental principles.
In the following we shall go into several of these interrelationships.
Figure 37.1. Overview of extremal problems. [Diagram labels: nonlinear functional analysis; extremal problems; variational problems; Euler differential equations (e.g., boundary and boundary eigenvalue problems for quasilinear elliptic differential equations); variational inequalities (e.g., differential inequalities); Hammerstein integral equations; optimization, discrete (dynamic optimization and the discrete maximum principle) and continuous (Pontrjagin maximum principle and the Bellman equation); convex optimization (Kuhn-Tucker theory); linear optimization; game theory; approximation theory; parameter identification; stochastic optimization.]
General Basic Ideas
By extremal problems we mean:
(i) minimum and maximum problems (extremal problems in the narrower
sense);
(ii) saddle point problems and minimax problems (game theory, duality
theory and error estimates, approximation theory);
(iii) determination of critical points (eigenvalue problems,
Ljusternik-Schnirelman theory, and Morse theory);
(iv) determination of noncooperative equilibrium points in the sense of
Nash, Pareto optimization, Walras equilibria (economics models);
(v) solution of variational inequalities.
Here, (iv) [respectively, (v)] is related to (ii) [respectively, (i)]. The concept of
critical point is of central significance for variational problems and their
applications. The functional $F$ has a critical point with respect to a
neighborhood $U$ of $u_0$ in case, roughly speaking, the following holds: The
difference $F(u_0 + h) - F(u_0)$ is of order greater than the first with respect to
all $h$ such that $u_0 + h \in U$. For a real function $F\colon \mathbb{R} \to \mathbb{R}$ this means that
$F'(u_0) = 0$ provided $U = \mathbb{R}$. The precise definition of a critical point can be
found in Section 43.9. The intuitive meaning of a critical point is explained
in Sections 37.1 and 37.2 for real functions of one and several variables as
well as for free variational problems (respectively, variational problems with
side conditions) in 37.4b (respectively, 37.4/). In Section 43.9 we go into the
connection between critical points and Lagrange multipliers. If $F$ has a
critical point at $u_0$, then we also say that $F$ is stationary at $u_0$. We symbolize
the problem of discovering critical points $u$ of $F$ with respect to $U$ by
$$F(u) = \text{stationary!}, \quad u \in U.$$
Many equations of mathematical physics are obtained from such
formulations of the problem (principle of stationary action). Moreover, the
symmetry properties of F lead, via the Noether theorem (cf. 37.4k), to physical
conservation quantities (energy, spin, etc.) and transformation properties of
the field equations (tensors, spinors, and gauge transformations). This is
especially important if one wishes, for example, to obtain an overview of the
possible field equations on the basis of physical symmetry and invariance
ideas for interacting quantum fields of elementary particles.
Figure 37.2
To fix the terminology, we now recall several well-known concepts. We
designate minima and maxima as extrema. A general minimum problem has
the form
$$\inf_{u \in U} F(u) = \alpha. \qquad (1^\circ)$$
Here, $F\colon U \to [-\infty, \infty]$ is a mapping that can take on the two values $\pm\infty$
besides all real values. In the introduction to Chapter 47 we explain why
taking these two improper values into consideration is very expedient.
Posing the problem in the form (1°) means that the infimum $\alpha$ of $F$ is
sought on $U$. This infimum always exists in $[-\infty, \infty]$. By definition, it is
equal to the greatest lower bound of the values of $F$ on $U$. The point $u_0$ in $U$
is called a solution of (1°) or, also, a minimal point of $F$ on $U$ if and only if
$F(u_0) = \alpha$. In this case, we call $\alpha$ the minimal value of $F$ on $U$. Moreover,
we say that $F$ possesses a minimum on $U$.
If we wish to emphasize that we are seeking a minimal point, then instead
of (1°) we write
$$\min_{u \in U} F(u) = \alpha. \qquad (2^\circ)$$
This way of formulating the problem thus entails the determination of the
infimum $\alpha$ of $F$ on $U$ and discovering a $u_0$ in $U$ such that $F(u_0) = \alpha$. Figure
37.2 refers to the important situation that a solution $u_0$ of (1°) [respectively,
(2°)] need not always exist. For (1°) [respectively, (2°)] we occasionally
write
$$F(u) = \inf!, \quad u \in U; \qquad \text{respectively} \qquad F(u) = \min!, \quad u \in U.$$
Maximum problems of the form
$$\sup_{u \in U} F(u) = \beta, \qquad (3^\circ)$$
$$\max_{u \in U} F(u) = \beta \qquad (4^\circ)$$
are to be understood analogously. By definition, we set the infimum
(respectively, the supremum) over the empty set $U = \emptyset$ in (1°) [respectively, (3°)]
equal to $\alpha = +\infty$ (respectively, $\beta = -\infty$). We shall frequently be concerned
with minimum problems only, since, because of the relation
$$\sup_{u \in U} F(u) = -\inf_{u \in U} \bigl(-F(u)\bigr), \qquad (5^\circ)$$
every maximum problem can be changed into a minimum problem by
switching from $F$ to $-F$.
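The conventions for (1°) to (5°) can be tried out on a finite admissible set. The following is a minimal sketch; the functional `F`, the set `U`, and the helper names `infimum` and `supremum` are illustrative assumptions, not taken from the text. On a finite set the infimum is always attained, so the sketch shows only the bookkeeping (empty-set convention and the relation (5°)), not the phenomenon of a missing minimal point.

```python
import math

def infimum(values):
    """Infimum of finitely many extended-real values; +inf for the empty set."""
    return min(values, default=math.inf)

def supremum(values):
    """Supremum of finitely many extended-real values; -inf for the empty set."""
    return max(values, default=-math.inf)

def F(u):
    # Hypothetical sample functional on the real line.
    return (u - 1.0) ** 2

U = [-1.0, 0.0, 0.5, 2.0]             # hypothetical finite admissible set
alpha = infimum(F(u) for u in U)       # minimal value, as in (1°)
beta = supremum(F(u) for u in U)       # maximal value, as in (3°)
minimal_points = [u for u in U if F(u) == alpha]

# Relation (5°): sup F = -inf(-F).
beta_via_inf = -infimum(-F(u) for u in U)
```

Here `minimal_points` collects every solution of (2°) in the sample set, and `beta_via_inf` recomputes the supremum through a minimum problem for $-F$.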
We designate problems of the type
$$\min_{x \in A} \Bigl( \sup_{y \in B} L(x, y) \Bigr) = \gamma \qquad (6^\circ)$$
as min-sup problems. The point $\bar{x}$ is a solution of (6°) if and only if, parallel
to our conventions for (1°) and (2°), we have
$$\gamma \stackrel{\text{def}}{=} \inf_{x \in A} \Bigl( \sup_{y \in B} L(x, y) \Bigr)$$
and
$$\sup_{y \in B} L(\bar{x}, y) = \gamma, \quad \bar{x} \in A. \qquad (6a^\circ)$$
If we replace the symbol "sup" by "max" in (6°), then $(\bar{x}, \bar{y})$ is naturally
called a solution of (6°) if and only if $\bar{y}$ is a solution of (6a°). Max-inf
problems, etc., are handled analogously. It is thus clear what is to be
understood by a solution $(\bar{x}, \bar{y})$ of
$$\min_{x \in A} \sup_{y \in B} L(x, y) = \max_{y \in B} \inf_{x \in A} L(x, y), \qquad (7^\circ)$$
namely,
$$\gamma \stackrel{\text{def}}{=} \inf_{x \in A} \sup_{y \in B} L(x, y) = \sup_{y \in B} \inf_{x \in A} L(x, y)$$
and
$$\sup_{y \in B} L(\bar{x}, y) = \gamma, \quad \inf_{x \in A} L(x, \bar{y}) = \gamma, \quad (\bar{x}, \bar{y}) \in A \times B. \qquad (7a^\circ)$$
One is led to (7°) when determining the saddle points $(\bar{x}, \bar{y})$ for $L$. If the
symbols "max" and "min" appear instead of "sup" and "inf" in (7°), then
$(\bar{x}, \bar{y})$ is called a solution of the corresponding minimax problem if and only
if (7a°) holds for $\gamma = L(\bar{x}, \bar{y})$. Problems of the form (6°) appear, for
example, in approximation theory, for one can write the problem
$$\min_{x \in A} \|b - x\| = \gamma$$
as
$$\min_{x \in A} \max_{y \in B} \langle y, b - x \rangle = \gamma,$$
where $A \subseteq X$ and $B = \{y \in X^*\colon \|y\| = 1\}$ in case $X$ is a B-space. (7°) and the
corresponding minimax problems are basic to the game theory discussed in
Section 37.8 and to the duality theory in Chapter 49.
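For finite sets $A$ and $B$ the function $L$ is a matrix, and (6°), (7°), and (7a°) can be checked by enumeration. The sketch below is an illustration under that assumption; the two sample matrices and the function names are hypothetical. Rows play the role of $x \in A$ and columns the role of $y \in B$.

```python
def min_sup(L):
    """Value of (6°): minimize over rows x the maximum over columns y."""
    return min(max(row) for row in L)

def max_inf(L):
    """Right-hand side of (7°): maximize over columns y the minimum over rows x."""
    rows, cols = range(len(L)), range(len(L[0]))
    return max(min(L[i][j] for i in rows) for j in cols)

def saddle_points(L):
    """All index pairs (i, j) satisfying (7a°); empty if (7°) fails."""
    gamma = min_sup(L)
    if gamma != max_inf(L):
        return []
    rows, cols = range(len(L)), range(len(L[0]))
    return [(i, j) for i in rows for j in cols
            if max(L[i]) == gamma and min(L[k][j] for k in rows) == gamma]

with_saddle = [[3, 1, 4],
               [2, 0, 1]]          # hypothetical example with a saddle point
without_saddle = [[1, -1],
                  [-1, 1]]         # hypothetical example without one

saddles = saddle_points(with_saddle)
gap = min_sup(without_saddle) - max_inf(without_saddle)
```

In the first matrix the pair `(1, 0)` is a saddle point with $\gamma = 2$. In the second matrix the two sides of (7°) differ by `gap`, so (7°) has no solution even though both sides exist.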
In Part III, we shall investigate the following central questions for
extremal problems:
(a) Existence and uniqueness of extremal solutions (minimal and maximal
points, saddle points, equilibrium points, critical points).
(b) Necessary and sufficient conditions for characterizing extremal
solutions.
(c) Construction of approximation methods for calculating the extremal
values a, /3, y and the extremal solutions, obtaining error estimates.
(d) Connections between various extremal problems by means of duality
theory.
(e) Estimates for the number of critical points (Morse theory and
Ljusternik-Schnirelman theory).
In this connection, let us elucidate several fundamental notions. In Parts I
and II we placed the fixed point theorems of Banach, Schauder, and
Bourbaki-Kneser at the pinnacle of nonlinear functional analysis. The
existence propositions for extremal solutions are based on:
(α) Compactness (generalized Weierstrass theorem).
(β) Convexity (separation of convex sets, Hahn-Banach theorem).
We carry this out more precisely in Chapters 38 and 39. The compactness
arguments in Chapter 38 generalize the classical Weierstrass theorem: A
continuous real function on a closed bounded interval has a minimum and a
maximum. Existence propositions that are based on convexity arguments as
in Chapter 39 are frequently intimately connected with duality theory. In
this connection, together with a given minimum problem, we consider a
corresponding maximum problem. The prototype for this is shown in Fig.
37.3. The original problem reads as follows:
$$\min_{u \in U} \|b - u\| = \alpha, \qquad (8a^\circ)$$
i.e., we seek the minimal Euclidean distance of the point $b$ in $\mathbb{R}^3$ from the
straight line $U$. The corresponding dual problem reads as follows:
$$\max_{H} \operatorname{dist}(b, H) = \beta, \qquad (8b^\circ)$$
i.e., we seek the maximal Euclidean distance of the point $b$ from all planes
$H$ that pass through the straight line $U$. In the present case, $\alpha = \beta$. For
extremal problems in infinite-dimensional spaces, it is frequently the case
that for two given mutually dual problems one can obtain existence
propositions for one of the problems by a compactness argument and for the other
by a convexity argument. However, it is also possible that the given problem
has no solution, but that the dual problem does. This makes the
construction of generalized solutions for the original problem possible. One exploits
this situation, for instance, in the theory of minimal surfaces (Chapter 52).
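The equality $\alpha = \beta$ for the prototype (8a°), (8b°) can be checked numerically. The following sketch makes several simplifying assumptions that are not in the text: the line $U$ passes through the origin with unit direction `d`, the point `b` is a sample value, and the planes through $U$ are sampled on a grid of unit normals orthogonal to `d`.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def dist_to_line(b, d):
    """Distance from b to the line through the origin with unit direction d."""
    t = dot(b, d)
    return norm([bi - t * di for bi, di in zip(b, d)])

def dist_to_plane(b, n):
    """Distance from b to the plane through the origin with unit normal n."""
    return abs(dot(b, n))

d = (0.0, 0.0, 1.0)                      # direction of the line U (assumed)
e1, e2 = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)  # orthonormal basis of the normals
b = (3.0, 4.0, 5.0)                      # sample point (assumed)

alpha = dist_to_line(b, d)               # primal value of (8a°)

steps = 3600
plane_distances = []
for k in range(steps):
    theta = math.pi * k / steps
    n = [math.cos(theta) * p + math.sin(theta) * q for p, q in zip(e1, e2)]
    plane_distances.append(dist_to_plane(b, n))
beta = max(plane_distances)              # sampled dual value of (8b°)
```

Every sampled plane distance stays below `alpha`, which is the weak duality inequality, and the sampled maximum `beta` approaches `alpha` as the grid is refined.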
Uniqueness propositions for the minimum problem
$$\inf_{u \in U} F(u) = \alpha \qquad (9^\circ)$$
are based in general on one of the following two principles:
(α) Condition on $F$ (strict convexity).
(β) Condition on $U$ (interpolation property).
Figure 37.4 (panels (a) and (b))
In Fig. 37.4(a), $F$ is strictly convex and has, in contrast to Fig. 37.4(b), a
uniquely determined minimal point. In order to elucidate the prototype for
conditions on $U$, we consider (9°), with $u = (\xi, \eta)$, $\|u\| = \max(|\xi|, |\eta|)$, $F(u) =
\|u\|$. Let $U$ be a straight line. The set $\partial Q = \{u \in \mathbb{R}^2\colon \|u\| = 1\}$ is the boundary
of the unit square. In Fig. 37.5(a), (9°) has exactly one solution, whereas in
Fig. 37.5(b) there exist infinitely many solutions. The solutions are exactly
all the points of $\partial Q$ that lie on $U$. Moreover, $\alpha = 1$. In Section 39.2 we
explain the connection with the so-called interpolation property of $U$. In
classical Chebyshev interpolation, the interpolation property is known as
the Haar condition.
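The dependence of uniqueness on the position of the line $U$ can be reproduced on a grid. The two lines below are hypothetical choices in the spirit of Fig. 37.5 (the figure itself is not available here): one touches $\partial Q$ in a corner, the other runs along an edge. Both are parametrized as $u(t) = p + t\,d$.

```python
def max_norm(u):
    """The norm ||u|| = max(|xi|, |eta|) on R^2."""
    return max(abs(u[0]), abs(u[1]))

def minimizers_on_line(point, direction, ts, tol=1e-12):
    """Return (alpha, minimizing parameters t) for F(u) = ||u|| on u(t) = point + t*direction."""
    values = [(t, max_norm((point[0] + t * direction[0],
                            point[1] + t * direction[1]))) for t in ts]
    alpha = min(v for _, v in values)
    return alpha, [t for t, v in values if v - alpha <= tol]

ts = [k / 100 for k in range(-300, 301)]   # parameter grid (assumed)

# Line through (0, 2) with direction (1, -1): touches the square only at (1, 1).
alpha_a, sols_a = minimizers_on_line((0.0, 2.0), (1.0, -1.0), ts)

# Line xi = 1: contains a whole edge of the square.
alpha_b, sols_b = minimizers_on_line((1.0, 0.0), (0.0, 1.0), ts)
```

In both cases the minimal value is 1, but the first line yields a single grid minimizer while the second yields every grid point with $|t| \le 1$.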
The necessary conditions for solutions u of the minimum problem (9°)
can, to begin with, be split, roughly speaking, into two classes:
(α) the operator equation $F'(u) = 0$ (free minimum, $u$ is an interior point of
$U$);
(β) the Lagrange multiplier rule (minimum with side conditions).
Furthermore, there are, in addition:
(γ) subgradient condition $0 \in \partial F(u)$;
(δ) variational inequalities;
(ε) characterization of solutions by means of dual problems.
In Parts I and II we were greatly concerned with the solution of operator
equations which one can always write in the form
Bu = 0. (10°)
The connection with extremal problems is roughly the following: If F has a
derivative $F'$, then for an interior point $u_0$ of $U$ we have: If $u_0$ is a solution
of (9°), then $F'(u_0) = 0$.
Figure 37.5 (panels (a) and (b))
It follows from this that there is an important method for the solution of
the operator equation (10°) which, for example, can represent a differential
equation, an integral equation, or a system of real equations: We seek a
functional F such that B = F' and solve the minimum problem (9°) or a
corresponding maximum problem. However, it suffices that u0 be a critical
point, for instance, a saddle point. Then we also have $F'(u_0) = 0$. In any
case, it must be emphasized that not all operators B can be written in the
form B = F' but rather only the so-called potential operators. In a real
Hilbert space, of the continuous linear operators it is precisely and solely
the self-adjoint operators that are also potential operators. We give general
criteria for an operator to be a potential operator in Section 41.3.
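In $\mathbb{R}^n$ the statement about self-adjoint operators can be made concrete with matrices. The sketch below assumes the natural candidate potential $F(u) = \tfrac12 \langle Bu, u \rangle$, whose gradient is $\tfrac12 (B + B^{T})u$, and compares a numerical gradient with $Bu$. The matrices, the test point, and the helper names are illustrative assumptions.

```python
def matvec(B, u):
    return [sum(B[i][j] * u[j] for j in range(len(u))) for i in range(len(B))]

def quadratic_potential(B, u):
    """Candidate potential F(u) = 1/2 <Bu, u>."""
    return 0.5 * sum(ui * vi for ui, vi in zip(u, matvec(B, u)))

def numerical_gradient(F, u, h=1e-6):
    """Central-difference approximation of F'(u)."""
    grad = []
    for i in range(len(u)):
        up, um = list(u), list(u)
        up[i] += h
        um[i] -= h
        grad.append((F(up) - F(um)) / (2 * h))
    return grad

def gradient_matches_operator(B, u, tol=1e-6):
    """True if the gradient of the candidate potential equals Bu at u."""
    grad = numerical_gradient(lambda v: quadratic_potential(B, v), u)
    return all(abs(g - w) <= tol for g, w in zip(grad, matvec(B, u)))

symmetric = [[2.0, 1.0], [1.0, 3.0]]    # self-adjoint on R^2
skew = [[0.0, 1.0], [-1.0, 0.0]]        # rotation by a right angle, not self-adjoint
u = [1.0, -2.0]

symmetric_ok = gradient_matches_operator(symmetric, u)
skew_ok = gradient_matches_operator(skew, u)
```

For the symmetric matrix the gradient reproduces $Bu$. For the skew matrix the candidate potential vanishes identically, so its gradient cannot equal $Bu$. This checks only the one candidate $F$; the general fact used in the background is that a smooth operator on $\mathbb{R}^n$ is a gradient only if its Jacobian matrix is symmetric.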
Especially strong propositions can be proved for the minimum problem
(9°) in the case where $F$ is convex. $F'$ is then a monotone operator. We studied
the theory of monotone operators in Part II. Not every monotone operator
is a potential operator. However, for monotone potential operators, one can
prove additional propositions, for instance, propositions for eigenvalue
and bifurcation problems. We discuss this in Chapters 43-45.
During the last 15 years, in connection with optimization problems, a
convex analysis for convex, but not necessarily differentiable, functionals
has been formulated. At the center of this theory there stands a calculus for
the subdifferential $\partial F(u)$ which appears in place of the derivative $F'(u)$.
The basic idea, which leads to the definition of $\partial F(u)$, is elucidated in
Section 37.2. The necessary condition $F'(u) = 0$ is then replaced by $0 \in
\partial F(u)$. Chapters 47 and 51-53 are devoted to convex analysis. There we
also work out the interrelationship between conjugate functionals and
duality theory.
If the minimum problem (9°) has a solution u0 which is not an interior
point of U, then more complicated necessary conditions appear for u0,
which in many cases can be summarized in a unified way under the concept
of the Lagrange multiplier rule. In general, one is led to Lagrange
multipliers if the side conditions occur in the form of equations or inequalities.
Prototypes for this are:
(α) Eigenvalue problems.
(β) Linear or convex optimization problems.
We elucidate these prototypes in Sections 37.3, 37.6, 37.10, and 37.11.
Moreover, in Section 37.10 we also obtain the connection between the
Lagrange multiplier method and duality theory. We delve more deeply into
this interrelationship in Chapter 50. Furthermore, in Chapter 43
(respectively, Chapter 48) we justify the Lagrange multiplier method for smooth
(respectively, more general) side conditions. At this point we already note
the important situation that the Lagrange multiplier rule in the narrower
sense is tied up with certain nondegeneracy conditions. The purely formal
application of the Lagrange multiplier rule which one frequently finds in
physics textbooks can lead to false results. One finds a counterexample to
this in Section 43.1.
If $U$ in (9°) is convex and $F'$ exists, then the variational inequality
$$\langle F'(u_0), v - u_0 \rangle \ge 0 \quad \text{for all } v \in U \qquad (11^\circ)$$
holds for a solution $u_0$ of (9°). In Section 37.1 we explain that this is a
matter of the generalization of a well-known necessary condition for the
existence of minima of real functions. A quasivariational inequality is
present when, in addition, $U$ in (11°) depends on $u_0$. Instead of (11°) we
shall consider more general problems, e.g.,
$$\langle Au_0 - b, v - u_0 \rangle \ge h(v) - h(u_0) \quad \text{for all } v \in U,$$
where the operator A is not necessarily a potential operator. The theory of
variational inequalities that has been developed over the last 15 years
combines the theory of extremal problems and the theory of monotone
operators under a unified viewpoint. In Chapter 9 we explained the
important connections with the theory of multivalued mappings. We concern
ourselves with variational inequalities in Chapters 46 and 54-57.
The sufficient conditions for the existence of solutions of the minimum
problem (9°) can be roughly classified as follows:
(α) Positive definiteness of the second variation.
(β) Characterization of solutions by means of the dual problem.
(γ) Comparison functionals (abstract field theory).
(δ) The method of dynamic optimization.
(ε) In case of convex problems, the necessary conditions are generally
sufficient.
The criteria that use the second variation are discussed in Section 40.2 (free
local minima) and in Section 43.8 (Lagrange multiplier rule). In this
connection, this is a matter of a generalization of the known classical
criterion for real functions: $F''(u_0) > 0$ implies the existence of a local
minimum for $F$ at $u_0$ in the case $F'(u_0) = 0$. We point out the advantages of
dual problems for the characterization of solutions in Section 37.29f. In
Section 37.20b (respectively, Section 40.3), we treat the method of dynamic
optimization (respectively, the method of comparison functionals). In
Section 40.7 we elucidate the connection between the abstract results and the
field theory of classical calculus of variations.
In order to make it easier for the reader to learn the essential ideas in the
construction of approximation methods for extremal problems, we present,
in Section 37.29, the basic ideas of various important approximation
methods:
(i) The Ritz method (projection method),
(ii) The gradient method (descent method),
(iii) Ascent method,
(iv) Penalty method.
(v) Regularization.
(vi) Duality method.
(vii) Dynamic optimization,
(viii) Decomposition,
(ix) Equivalence and combination principle.
These basic ideas are delved into more deeply later.
In conclusion, we summarize the advantages of duality theory:
(a) Necessary and sufficient conditions for the characterization of extremal
solutions.
(b) Existence propositions when properties of the corresponding dual
problems that are frequently easy to verify are at hand.
(c) The construction of generalized solutions for unsolvable problems with
the aid of solutions of the dual problem and the so-called extremal
relation.
(d) The construction of approximation methods with two-sided error
estimates for the extremal values and error estimates for the extremal
solutions.
(e) The side conditions of the dual problem may have a simpler structure
than that of the original problem; therefore, it is occasionally more
propitious to solve the dual problem and to obtain solutions for the
original problem by means of the extremal relation.
We explain this in Section 37.29f.
The basic ideas of duality theory can be found in Chapter 39.
Furthermore, we take up duality theory in detail in Chapters 49-53.
In Part I, the topological essence of fixed point theory concentrated on
the concept of the fixed point index (mapping degree). In the theory of
extremal problems, topological tools will be used to obtain, within the
framework of the Morse theory and the Ljusternik-Schnirelman theory,
estimates for the smallest number of critical points and to guarantee the
existence of at least one saddle point in indefinite problems. From this we
obtain, for example, propositions concerning the number of eigensolutions
for nonlinear differential and integral equations or concerning the number
of geodesics (Chapter 44) as well as concerning the existence of solutions of
nonlinear differential equations or the existence of periodic solutions of
dynamical systems (Chapter 49).
In the preceding overview it is already apparent that the solutions
of convex minimum problems have especially propitious properties. A goal
of current research consists in making the propitious behavior of convex
problems useful also for classes of nonconvex problems by introducing
generalized concepts of a solution. We discuss this in Chapters 42 and 48.
Finally, we would like to point out that in general we follow the strategy
of reducing propositions on extremal problems for functionals to those for
classical real functions. This becomes especially clear in the introduction to
Chapter 40.
CHAPTER 37
Introductory Typical Examples
When I was a student it was fashionable to give courses called "Elementary
Mathematics from the Higher Point of View." But what I needed was a
few courses called "Higher Mathematics from the Elementary Point of View."
Joel Franklin
In the occupation with mathematical problems, a more important role than
generalization is played—I believe—by specialization.
David Hilbert
There are two ways to teach mathematics. One is to take real pains toward
creating understanding—visual aids, that sort of thing. The other is the old
British system of teaching until you're blue in the face.
James R. Newman, compiler of the 2,535 page The World of Mathematics,
quoted in the New York Times, Sept. 30, 1956
In the following we wish to present many concrete examples, foregoing
extensive technical details, whose solutions have contributed essentially to
the development of a general theory of extremal problems. A glance at the
organization of this chapter in the Contents shows the variety of different
problems one encounters. In this connection, an especially central position
is assumed by Section 37.4, where we discuss a number of fundamental
ideas from the classical calculus of variations. The ideas of the calculus of
variations have influenced the modern theory of extremal problems in an
essential way, and knowledge of these classical ideas is indispensable for a
thorough understanding of the modern development.
In the references to the literature at the end of each section of this
chapter, we restrict ourselves to a few introductory expositions and standard
works. The later chapters are provided with detailed lists. If the reader
concentrates his attention on the works introduced in the references to the
literature in this chapter under the caption "classical works," then he can
obtain a quick survey of the historical development of the subject.
This chapter addresses itself to readers who are interested in a detailed
motivation of the general theory by means of simple but typical examples.
In the following chapters, we will show how these examples fit into a general
functional analysis theory. In this connection, the reader is often referred to
the corresponding sections of the present chapter. For this reason, a cursory
perusal of this chapter on first reading will suffice. A reader who wishes to
get acquainted immediately with the foundational principles of the theory of
extremal problems can begin with Chapters 38 and 39. There we explain the
role of compactness and convexity for existence propositions, give two
important uniqueness criteria, and treat some fundamental principles of
duality theory.
37.1. Real Functions in $\mathbb{R}^1$
One can already observe numerous phenomena that are typical for extremal
problems in the study of real-valued functions of a real variable. Later we
shall often reduce the investigation of general functionals $x \mapsto F(x)$ on a
locally convex space $X$ to the investigation of real-valued functions $t \mapsto \varphi(t)$
of a real variable $t$, where we set $\varphi(t) = F(x(t))$. Here, $t \mapsto x(t)$ is a curve
in $X$.
Let $F\colon [a, b] \to \mathbb{R}$ be a real function defined on the bounded interval
$[a, b]$. By definition, $F$ has a local minimum at $x_0$ if and only if there exists a
neighborhood $U(x_0)$ of $x_0$ such that
$$F(x) \ge F(x_0) \quad \text{for all } x \in U(x_0) \cap [a, b], \text{ where } x \ne x_0. \qquad (12)$$
If $F$ possesses a local minimum at $x_0$ and the derivative $F'(x_0)$ exists, then
one must distinguish two cases:
(i) If $x_0$ is an interior point of $[a, b]$, then
$$F'(x_0) = 0. \qquad (13a)$$
(ii) If $x_0$ is a boundary point of $[a, b]$, then
$$F'(x_0)(x - x_0) \ge 0 \quad \text{for all } x \in [a, b]. \qquad (13b)$$
The condition (13b) is equivalent to $F'(x_0) \ge 0$ [respectively, $F'(x_0) \le 0$]
for $x_0 = a$ (respectively, $x_0 = b$) (see Fig. 37.6). Obviously, (13a) is a special
case of (13b).
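The inequality (13b) is easy to test on a grid. The following sketch uses two hypothetical functions on $[0, 1]$, given through their derivatives: one whose minimum lies at the boundary point $b = 1$ and one whose minimum lies in the interior. The function name `satisfies_13b` and the grid size are assumptions made for illustration.

```python
def satisfies_13b(dF, x0, a, b, samples=1000, tol=1e-12):
    """Check F'(x0) * (x - x0) >= 0 for grid points x in [a, b]."""
    slope = dF(x0)
    grid = (a + (b - a) * k / samples for k in range(samples + 1))
    return all(slope * (x - x0) >= -tol for x in grid)

def dF_boundary(x):
    # Derivative of F(x) = (x - 2)^2, which decreases on [0, 1].
    return 2.0 * (x - 2.0)

def dF_interior(x):
    # Derivative of F(x) = (x - 0.5)^2, minimal at the interior point 0.5.
    return 2.0 * (x - 0.5)

a, b = 0.0, 1.0
at_right_endpoint = satisfies_13b(dF_boundary, 1.0, a, b)   # bound minimum at b
at_left_endpoint = satisfies_13b(dF_boundary, 0.0, a, b)    # not a minimum
at_interior_min = satisfies_13b(dF_interior, 0.5, a, b)     # free minimum, (13a)
at_interior_other = satisfies_13b(dF_boundary, 0.5, a, b)   # slope nonzero inside
```

At an interior point the check can succeed only when the slope vanishes, which is how (13a) sits inside (13b). At the endpoint $b$ a negative slope is admissible because only points to the left of $b$ compete.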
If $F\colon D(F) \subseteq X \to \mathbb{R}$ is a functional, for instance, on the B-space $X$, then
in place of (13a) we have an operator equation (Theorem 40.B in Section
40.3) and in place of (13b) we have a variational inequality (Theorem 46.A
in Section 46.1).
Figure 37.6
In case (i), because $x_0 \in\, ]a, b[$, a full neighborhood of $x_0$ is allowed in the
competition in (12), whereas in case (ii) only one-sided neighborhoods are
taken into consideration in (12). For that reason, we speak, in (i)
[respectively, (ii)] of a free local minimum (respectively, of a bound local
minimum).
If the sign "$>$" holds instead of "$\ge$" in (12), then by definition it is a
matter of a strict local minimum. In Fig. 37.4(a) a strict minimum is
depicted in contrast to Fig. 37.4(b). We say that $x_1$ is a global minimum in
case $F(x) \ge F(x_1)$ for all $x \in [a, b]$. In Fig. 37.6, $F$ has local minima at $x_0$
and $x = a$. On the other hand, $F$ possesses a global minimum at $x_1 = b$.
A central concept for extremal problems is that of a critical point. If
$F'(x_0)$ exists, then by definition $F\colon [a, b] \to \mathbb{R}$ has a critical point at $x_0$,
$x_0 \in [a, b]$, if and only if $F'(x_0) = 0$, i.e., the tangent line is horizontal. The
following are critical points: local minima and maxima and horizontal
inflection points (see Fig. 37.7). An important aid for the study of the local
behavior of F in a neighborhood of x0 is the Taylor expansion of F,
provided F is differentiable a sufficient number of times.
Example 37.1. If $F'(x_0) = 0$, $F''(x_0) > 0$, then
$$F(x) = F(x_0) + F''(x_0)\frac{(x - x_0)^2}{2} + \cdots, \qquad (14)$$
i.e., $F$ behaves in a neighborhood of $x_0$ as the quadratic polynomial on the
right-hand side of (14). Consequently, $F$ has a strict local minimum at $x_0$.
The precise assumptions for this are: $F'(x_0) = 0$, $F''(x_0) > 0$, and $F''$ is
continuous at $x_0$. This follows from the form of the remainder term in (14).
Figure 37.7
Example 37.2. If $F^{(n)}(x_0) = 0$ for $n = 1, 2, 3, 4$ and $F^{(5)}(x_0) \ne 0$, then
$$F(x) = F(x_0) + F^{(5)}(x_0)\frac{(x - x_0)^5}{5!} + \cdots.$$
Therefore, $x_0$ is a horizontal inflection point, for $F$ behaves locally as the
fifth-degree polynomial on the right-hand side of the last equation.
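Examples 37.1 and 37.2 are two instances of the usual test by the first nonvanishing derivative. The sketch below encodes that test; it assumes the derivative values at $x_0$ are supplied by the caller, and the function name and return strings are illustrative.

```python
def classify_point(derivatives):
    """Classify x0 from [F'(x0), F''(x0), F'''(x0), ...].

    The first nonvanishing derivative decides: odd order >= 3 gives a
    horizontal inflection point, even order gives a strict local extremum.
    """
    for order, value in enumerate(derivatives, start=1):
        if value == 0:
            continue
        if order == 1:
            return "not a critical point"
        if order % 2 == 1:
            return "horizontal inflection point"
        return "strict local minimum" if value > 0 else "strict local maximum"
    return "undecided"

# F(x) = x^2 at x0 = 0: F' = 0, F'' = 2 (the situation of Example 37.1).
case_quadratic = classify_point([0, 2])

# F(x) = x^5 at x0 = 0: F' = ... = F'''' = 0, F^(5) = 120 (Example 37.2).
case_quintic = classify_point([0, 0, 0, 0, 120])

# F(x) = -x^4 at x0 = 0: first nonvanishing derivative is F'''' = -24.
case_quartic = classify_point([0, 0, 0, -24])
```

If every supplied derivative vanishes, the test gives no information, which is why the function then returns `"undecided"`.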
The material of this section is contained in any textbook of differential
and integral calculus.
37.2. Convex Functions in $\mathbb{R}^1$
A function $F\colon [a, b] \to \mathbb{R}$ is said to be convex if and only if each chord lies
above the corresponding arc of the curve (see Fig. 37.8). In contrast to
arbitrary real functions, convex functions possess a number of remarkable
properties of which we list three here:
(a) If F has a local minimum at x0, then F also has a global minimum at x0.
(b) The necessary conditions (13a) [respectively, (13b)] for local minima are
also sufficient for global minima.
(c) If $F'$ exists on $[a, b]$, then $F$ is convex on $[a, b]$ if and only if $F'$ is
monotonically increasing.
In Chapter 42, property (c) yields the connection between convex
functionals and monotone operators. A convex function possesses only minima as
critical points.
Figure 37.8
Figure 37.9
If $F\colon [a, b] \to \mathbb{R}$ is a convex but not necessarily differentiable function,
then a global minimum at $x_0$ can be characterized by
$$0 \in \partial F(x_0) \qquad (15)$$
instead of by $F'(x_0) = 0$ or (13b). Here, the so-called subdifferential $\partial F(x_0)$
equals the set of all slopes $m$ of the straight lines through $(x_0, F(x_0))$ which
lie beneath the curve determined by $F$ (see Fig. 37.9). (15) is the starting
point for the convex analysis that we develop in Chapter 47.
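For a convex function of one real variable, the admissible slopes at an interior point $x_0$ fill the interval between the left and right derivatives. The sketch below approximates that interval by one-sided difference quotients. This is an assumption-laden illustration: for a general convex $F$ the quotients only enclose the true interval from outside, and they are exact here only because the sample function $F(x) = |x|$ is piecewise linear. The step `h` and the function names are hypothetical.

```python
def subdifferential_interval(F, x0, h=2.0 ** -20):
    """Approximate [left derivative, right derivative] of a convex F at x0."""
    left = (F(x0) - F(x0 - h)) / h
    right = (F(x0 + h) - F(x0)) / h
    return left, right

def satisfies_15(F, x0):
    """Check the condition 0 in dF(x0) from (15) on the approximate interval."""
    left, right = subdifferential_interval(F, x0)
    return left <= 0.0 <= right

slopes_at_kink = subdifferential_interval(abs, 0.0)   # F(x) = |x| at its kink
slopes_at_one = subdifferential_interval(abs, 1.0)    # F is differentiable here

kink_is_minimum = satisfies_15(abs, 0.0)
one_is_minimum = satisfies_15(abs, 1.0)
```

At the kink the interval of slopes is $[-1, 1]$, which contains 0, so $x_0 = 0$ is a global minimum although $F'(0)$ does not exist. At $x_0 = 1$ the interval collapses to the single slope 1.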
References to the Literature
Convex analysis and convex sets: Rockafellar (1970, M, H, B) and Roberts,
Varberg (1973, M, B, H) (standard works for $\mathbb{R}^n$); Holmes (1972, L), (1975,
M); Ekeland and Temam (1974, M); Marti (1977, M).
37.3. Real Functions in $\mathbb{R}^N$, Lagrange Multipliers,
Saddle Points, and Critical Points
We consider the minimum problem:
$$F(x) = \min! \qquad (16)$$
subject to the side conditions
$$G_i(x) = 0, \quad i = 1, \ldots, M, \qquad (17)$$
where $x = (\xi_1, \ldots, \xi_N)$, $D_j = \partial/\partial \xi_j$. Here, $F$ and all the $G_i$'s are real-valued
functions of the real variables $\xi_1, \ldots, \xi_N$, and $M < N$.
We denote the corresponding Lagrange function by
$$L(x, \lambda) = \lambda_0 F(x) - \sum_{i=1}^{M} \lambda_i G_i(x).$$
All the $\lambda_i$'s are real numbers and are called Lagrange multipliers.
Without the side condition (17), the classical necessary condition for a
solution $x_0$ of (16) reads as follows:
$$D_j F(x_0) = 0, \quad j = 1, \ldots, N, \qquad (18)$$
in the case where $x_0$ is an interior point of $D(F)$ and the derivatives exist.
Now, the Lagrange multiplier rule asserts that in the presence of the side
conditions (17) one needs merely to replace $F$ by $L$ in the necessary
conditions (18) for suitable fixed $\lambda = (\lambda_0, \lambda_1, \ldots, \lambda_M)$, $\lambda \ne 0$, i.e.,
$$D_j L(x_0, \lambda) = 0, \quad j = 1, \ldots, N, \qquad (19)$$
or, in detail,
$$\lambda_0 D_j F(x_0) - \sum_{i=1}^{M} \lambda_i D_j G_i(x_0) = 0, \quad j = 1, \ldots, N. \qquad (19a)$$
Here we assume that all first partial derivatives of $F$ and the $G_i$ are
continuous in an open neighborhood of $x_0$.
A large role is played by the so-called nondegeneracy condition which
requires that the rank of the matrix $(D_j G_i(x_0))$ be maximal, hence equal to
$M$. If this condition is violated, then (19a) can be trivially satisfied with
$\lambda_0 = 0$, for one can then determine $(\lambda_1, \ldots, \lambda_M) \ne 0$ as the solution of the
corresponding system of linear equations in (19a). It is crucial that, in the
nondegenerate case, (19a) holds for $\lambda_0 = 1$. We then speak of the Lagrange
multiplier rule in the narrower sense. We give the proof of this in Section
43.10.
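The role of $\lambda_0$ can be seen in two small planar problems with $M = 1$. Both are standard textbook illustrations chosen here as assumptions; they are not claimed to be the examples used later in the book. In the first, $F = \xi_1 + \xi_2$ is minimized on the unit circle and (19a) holds with $\lambda_0 = 1$. In the second, $F = \xi_1$ is minimized on the cusp $\xi_2^2 - \xi_1^3 = 0$; the minimum lies at the origin, where the gradient of $G$ vanishes, so (19a) forces $\lambda_0 = 0$.

```python
import math

def multiplier_residual(grad_F, grad_G, lam0, lam1):
    """Left-hand side of (19a) for M = 1: lam0 * DF(x0) - lam1 * DG(x0)."""
    return [lam0 * f - lam1 * g for f, g in zip(grad_F, grad_G)]

# Nondegenerate case: F = x + y on the circle G = x^2 + y^2 - 1 = 0.
s = 1.0 / math.sqrt(2.0)
x0 = (-s, -s)                                  # minimal point on the circle
grad_F_circle = (1.0, 1.0)
grad_G_circle = (2.0 * x0[0], 2.0 * x0[1])     # nonzero, so the rank is M = 1
residual_circle = multiplier_residual(grad_F_circle, grad_G_circle, 1.0, -s)

# Degenerate case: F = x on the cusp G = y^2 - x^3 = 0, minimal at the origin.
grad_F_cusp = (1.0, 0.0)
grad_G_cusp = (0.0, 0.0)                       # DG vanishes at the origin
residual_cusp_lam0_one = multiplier_residual(grad_F_cusp, grad_G_cusp, 1.0, 5.0)
residual_cusp_lam0_zero = multiplier_residual(grad_F_cusp, grad_G_cusp, 0.0, 1.0)
```

In the cusp problem the residual with $\lambda_0 = 1$ stays equal to $(1, 0)$ whatever value is tried for $\lambda_1$ (the value `5.0` above is arbitrary), whereas $\lambda_0 = 0$, $\lambda_1 = 1$ satisfies (19a) trivially. A purely formal use of the rule with $\lambda_0 = 1$ would therefore miss this minimum.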
When $M = 1$, we obtain the following eigenvalue problem as a special case
of (19a):
$$\lambda_0 D_j F(x_0) - \lambda_1 D_j G_1(x_0) = 0, \quad j = 1, \ldots, N, \qquad (20)$$
where $\lambda_0^2 + \lambda_1^2 \ne 0$. In the nondegenerate case, i.e., in the case when not all
$D_j G_1(x_0)$ are simultaneously zero, we can choose $\lambda_0 = 1$.
The simplest variant for gaining a sufficient condition for (16) and (17) to
hold with the aid of the Lagrange multipliers consists in the following. We
consider the modified problem
$$L(x, \lambda) = \min! \qquad (16a)$$
with respect to $x$. No side conditions appear in (16a). We assume that for
fixed $\lambda$, where $\lambda_0 = 1$, $x_0$ is a solution of (16a) and $x_0$ satisfies the side
condition (17). Since
$$L(x, \lambda) = F(x)$$
for all $x$ that satisfy the side condition (17), then $x_0$ is also a solution of (16),
i.e., it is a solution of $F(x) = \min!$ with the side condition (17). The classical
condition that $x_0$ be a solution of (16a) reads as follows:
(a) (19) holds.
(b) The quadratic form associated with the Hessian matrix $(D_k D_j L(x_0, \lambda))$
is positive definite.
If all the $G_i$'s are linear, then $D_k D_j L(x_0, \lambda) = D_k D_j F(x_0)$, i.e., $\lambda$ does not
even appear in the Hessian matrix. One must frequently deal with linear
$G_i$'s when studying problems of the type (16) and (17) for determining
thermodynamic equilibrium states (cf. Part IV).
We will make use of this simple method in Section 37.4/ to investigate
variational problems with side conditions. In Section 43.10 we shall prove a
refined sufficiency criterion for (16) and (17).
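The sufficiency argument via (16a) can be tried out on a toy problem. The functions `F` and `G` below, and the form $L(x, \lambda) = F(x) - \lambda G(x)$ with $\lambda_0 = 1$ (chosen to match the signs in (19a)), are assumptions made for this sketch only. Since `G` is linear, the Hessian of $L$ equals that of $F$, which is positive definite here.

```python
def F(x):
    return x[0] ** 2 + x[1] ** 2

def G(x):
    # linear side condition G(x) = 0 (illustrative assumption)
    return x[0] + x[1] - 1.0

def Lagr(x, lam):
    # L(x, lambda) = F(x) - lambda * G(x), with lambda_0 = 1
    return F(x) - lam * G(x)

lam = 1.0
x0 = (0.5, 0.5)   # solves D_j L(x, lam) = 2 x_j - lam = 0

# Compare F(x0) with F on a sample of feasible points x = (t, 1 - t)
feasible = [(t / 100.0, 1.0 - t / 100.0) for t in range(-200, 301)]
gap = min(F(x) - F(x0) for x in feasible)
```

The point `x0` minimizes the unconstrained function `Lagr(., lam)` and happens to satisfy the side condition, so no sampled feasible point gives a smaller value of `F`.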
We now explain the connection with critical points. By definition,
$F$ possesses a critical point at $x_0$ relative to the side condition (17) if and
only if:
(i) If we set $f(t) = F(x(t))$, then $f$ has a critical point at $t = 0$, i.e.,
$f'(0) = 0$.
(ii) Here we consider precisely all curves $t \mapsto x(t)$ which satisfy the
side condition (17) in a neighborhood of $t = 0$. Moreover, $x'(0)$ must
exist. Furthermore, $x(0) = x_0$.
In Chapter 43 we shall show that under appropriate regularity
requirements on $F$ and $G_i$, the existence of a critical point for $F$ relative to (17) in
the nondegenerate case is equivalent to (19) and $\lambda_0 = 1$. If no side condition
(17) is at hand, then we talk about a free critical point. If $F$ possesses
continuous first partial derivatives at $x_0$, then we can choose $x(t)$ in (ii) to
have the form $x_0 + th$ and obtain from $f'(0) = 0$, according to the chain rule,
that $\sum_j D_j F(x_0) h_j = 0$ for all $h \in \mathbb{R}^N$, and therefore $D_j F(x_0) = 0$, $j = 1, \dots, N$;
but this is (18). We can thus characterize a free critical point $x_0$ as follows:
(a) Geometric condition: The tangent plane at $x_0$ is horizontal.
(b) Analytic condition: The linear terms in the Taylor expansion vanish at
the point $x_0$.
(c) Degeneracy property: The linear approximation $F'(x_0): \mathbb{R}^N \to \mathbb{R}$ of
$F: U(x_0) \subseteq \mathbb{R}^N \to \mathbb{R}$ is not surjective. Observe that
$F'(x_0)h = D_1 F(x_0) h_1 + \dots + D_N F(x_0) h_N$ holds.
In the theory of manifolds, the definition of a critical point is based on
(c).
Local minima and maxima and saddle points are critical points.
The use of the concept of a saddle point is not uniform in the literature.
By "saddle point" we shall mean any critical point which does not
correspond to a local minimum or a local maximum (cf. Section 43.9). In the
regular case, this means intuitively, in (i) and (ii) above, that there exist two
curves $t \mapsto x_1(t)$ and $t \mapsto x_2(t)$ such that for $f_i(t) := F(x_i(t))$: $f_1$ has a local
minimum at $t = 0$, and $f_2$ has a local maximum at $t = 0$. For example, for
the quadratic function $F: \mathbb{R}^2 \to \mathbb{R}$, $F(x) = a\xi_1^2 + b\xi_2^2$ with $x = (\xi_1, \xi_2)$, the following
assertions hold:
($\alpha$) $F$ possesses a minimum at $x = 0$ when $a, b > 0$.
($\beta$) $F$ possesses a maximum at $x = 0$ when $a, b < 0$.
($\gamma$) $F$ possesses a saddle point at $x = 0$ when $ab < 0$.
For instance, if $a > 0$, $b < 0$, then $\xi_1 \mapsto F(\xi_1, \xi_2)$ has a minimum at $\xi_1 = \xi_2 = 0$
and $\xi_2 \mapsto F(\xi_1, \xi_2)$ has a maximum at this point (see Fig. 49.1).
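The three cases for the quadratic function can be written down as a short sketch. The function names and the sample values of `a`, `b`, and `t` are assumptions for illustration; the two test curves run along the coordinate axes through $x = 0$.

```python
def classify(a, b):
    # Critical point x = 0 of F(xi1, xi2) = a*xi1**2 + b*xi2**2
    if a > 0 and b > 0:
        return "minimum"
    if a < 0 and b < 0:
        return "maximum"
    if a * b < 0:
        return "saddle point"
    return "degenerate"  # a*b == 0 is not covered by the three cases

def F(a, b, xi1, xi2):
    return a * xi1 ** 2 + b * xi2 ** 2

a, b, t = 1.0, -2.0, 1e-3
along_xi1 = F(a, b, t, 0.0) - F(a, b, 0.0, 0.0)   # > 0: minimum along xi1
along_xi2 = F(a, b, 0.0, t) - F(a, b, 0.0, 0.0)   # < 0: maximum along xi2
```

For $a > 0 > b$ the function increases along one curve and decreases along the other, which is exactly the intuitive description of a saddle point given above.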
Besides, in connection with duality theory and game theory, we use the
narrower concept of a saddle point with respect to a product set. In this
connection, compare Section 49.1.
In general, one can study the local behavior in a neighborhood of a
critical point with the aid of the Taylor expansion, as in Examples 37.1
and 37.2. Morse theory provides normal forms for critical points
(cf. Section 37.26).
Saddle points are significant for game theory and duality theory.
Equation (20) shows that one obtains existence propositions for eigenvalue
problems by means of the study of the critical points of functions or, more
generally, of functionals. Existence statements for nonlinear equations of
the type (18) are obtained by constructing free critical points. In Section
44.12 we consider the so-called mountain pass theorem. This theorem
asserts the existence of a nontrivial free critical point.
Estimates for the number of critical points are obtained by using
topological tools within the framework of the Morse theory and the
Ljusternik-Schnirelman theory (cf. Sections 37.26, 37.27, and Chapter 44).
In Section 49.1 we treat a general existence theorem for saddle points with
respect to product sets. In the Problems for Chapter 49, we delve further
into existence propositions for critical points and their applications. In
particular, we deal with a general topological existence principle for saddle
points, the so-called linking principle.
The justification of the Lagrange multiplier method for general situations
is an important concern in Part III. In this connection, compare Chapters
43, 47, 48, and 50.
The saddle point condition of the Kuhn-Tucker theory for convex
optimization problems (Sections 37.11, 47.10, 48.4, 50.1, and 50.2) and the
Pontrjagin maximum principle for control problems (Sections 37.21 and
48.6) are important contemporary extensions of the classical Lagrange
multiplier rule for problems with side conditions that are not necessarily
smooth.
References to the Literature
Sharp Lagrange multiplier rules in $\mathbb{R}^N$: Hestenes (1966, M); Boltjanskii
(1976, M).
37.4. One-Dimensional Classical Variational Problems
and Ordinary Differential Equations, Legendre
Transformations, the Hamilton-Jacobi
Differential Equation, and the Classical
Maximum Principle
This section is of central significance for a deep understanding of many
assertions here in Part III. We present the results of the classical calculus of
variations in such a way that the reader will later see the connections with
convex analysis and control theory very clearly. The Hamilton-Jacobi
formalism is the focal point—this formalism is generalized in many ways in
the later chapters.
($\alpha$) Generalization of the canonical equations and of the classical maximum
principle: Pontrjagin's maximum principle and control problems.
($\beta$) Generalization of the Hamilton-Jacobi differential equation: Bellman's
principle of dynamic optimization in control theory, Bellman's
differential equation, and duality theory for nonconvex problems.
($\gamma$) Generalization of the Legendre transformation: conjugate functional
in convex analysis and duality theory.
($\delta$) Generalization of the Hamilton-Jacobi action function $S$: duality by
means of the stability of perturbed problems and the $S$-functional.
The Hamilton-Jacobi theory represents a general framework for the
mathematical description of the propagation of actions in nature and the
optimal modelling of control processes in economics and technology. In
the Problems in Chapter 40, we point out a number of deep physical and
mathematical connections: geometrical optics, characteristics, bicharacteristics
and electromagnetic waves, hyperbolic partial differential equations,
Huygens' principle, diffraction theory, asymptotic expansions and the
Maslov index, Fourier integral operators, symplectic geometry, quasiclassical
asymptotic expansions in quantum mechanics, the Feynman integral in
quantum mechanics and its connection with classical mechanics, integrable
canonical systems and perturbation theory in celestial mechanics, infinite-dimensional
canonical equations and nonlinear wave equations, etc. In Part
V we delve into the connection between classical mechanics, statistical
physics, and ergodic theory and explain the derivation of the fundamental
equations of mathematical physics on the basis of variational principles as
well as the meaning of symmetry principles and Lie groups in order to
obtain conservation quantities, similarity assertions, and gauge field
theories, which have achieved a basic significance in elementary particle physics.
This presentation, which is far from complete, should nonetheless facilitate
a feeling for the focal position of the Hamilton-Jacobi theory.
In the following we assume that all functions are sufficiently smooth.
37.4a. The Variational Problem
We set
$J(u) := \int_{x_0}^{x_1} L(x, u(x), u'(x))\, dx$
and consider the problem
$J(u) = \min!, \quad u(x_0) = u_0, \quad u(x_1) = u_1,$ (21)
i.e., for given fixed real numbers $x_0$, $x_1$, $u_0$, and $u_1$, we seek the minimum of
the integral, where all functions $u: [x_0, x_1] \to \mathbb{R}$ satisfying the boundary
conditions given in (21) are admitted to the competition. $L$ is called the Lagrange
function. In many investigations, it is important that $L$ be convex with
respect to $u'$.
Example 37.3. The problem of the shortest curve connecting the two points
$(x_0, u_0)$ and $(x_1, u_1)$ leads to (21) with $L = \sqrt{1 + u'^2}$ (see Fig. 37.10).
The following example is of great physical significance. Henceforth we
shall constantly refer to it in order to depict the general results intuitively.
Example 37.4 (Fermat's Principle in Geometrical Optics). A ray of light
propagates in a medium of the $(x, u)$-plane so that the time required for it
to travel from the point $(x_0, u_0)$ to the point $(x_1, u_1)$ is minimal, i.e.,
$\int dt = \min!$ (22)
If we represent the path of the ray of light in the form $x \mapsto u(x)$, then the
precise formulation of this principle coincides with (21), where
$L = n(x, u) c^{-1} \sqrt{1 + u'^2}.$ (23)
Here, $n$ is a given function with $n(x, u) > 0$ for all $(x, u) \in \mathbb{R}^2$. The number
$n(x, u)$ is called the index of refraction at the point $(x, u)$, and $c$ is the
velocity of light in vacuum. In particular, for $n = \text{constant}$, we obtain a
problem that is equivalent to Example 37.3.
[Figure 37.10]
In order to obtain this variational problem from (22), note that for given
$n(\cdot, \cdot)$, by definition the velocity $s'(t)$ at the time $t$ of a ray of light
$t \mapsto (x(t), u(t))$ is given by
$s'(t) = \frac{c}{n(x(t), u(t))}.$
Here, $s'(t) = \sqrt{x'^2(t) + u'^2(t)}$; therefore,
$dt = n c^{-1}\, ds = n c^{-1} \sqrt{1 + u'^2(x)}\, dx.$
37.4b. The Euler Equation as a Necessary Condition
If $u$ is a solution of (21), then the so-called Euler equation is valid on
$]x_0, x_1[$:
$\frac{d}{dx} L_{u'}(x, u(x), u'(x)) = L_u(x, u(x), u'(x)).$ (24)
The simple proof uses arguments that are typical of
all of the calculus of variations. We choose a function $h$ such that
$h(x_0) = h(x_1) = 0$. Then $\bar{u} := u + \varepsilon h$ satisfies the boundary condition in (21) for all
real $\varepsilon$ (see Fig. 37.10). If we set $\varphi(\varepsilon) = J(u + \varepsilon h)$, then the real function $\varphi$
has a minimum at $\varepsilon = 0$, and consequently $\varphi'(0) = 0$, i.e.,
$\int_{x_0}^{x_1} [L_u(x, u, u') h + L_{u'}(x, u, u') h']\, dx = 0.$
Since $h(x_0) = h(x_1) = 0$, integration by parts immediately yields
$\int_{x_0}^{x_1} (L_u - (L_{u'})')\, h\, dx = 0,$
where the prime on $L_{u'}$ denotes the derivative with respect to $x$.
This relation holds in particular for all $h \in C_0^\infty(x_0, x_1)$. According to the
variation lemma (Proposition 18.2), this implies (24).
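The argument with $\varphi(\varepsilon) = J(u + \varepsilon h)$ can be checked numerically for the shortest-curve problem of Example 37.3. The following is a minimal sketch under stated assumptions: the end points $(0,0)$ and $(1,2)$, the variation `h`, and the midpoint-type quadrature are choices made here for illustration only.

```python
import math

def J(u, x0=0.0, x1=1.0, n=2000):
    # Approximation of J(u) = int sqrt(1 + u'(x)^2) dx,
    # with u' replaced by a difference quotient on each subinterval.
    dx = (x1 - x0) / n
    total = 0.0
    for i in range(n):
        a, b = x0 + i * dx, x0 + (i + 1) * dx
        slope = (u(b) - u(a)) / dx
        total += math.sqrt(1.0 + slope * slope) * dx
    return total

u = lambda x: 2.0 * x                    # straight line from (0,0) to (1,2)
h = lambda x: math.sin(math.pi * x)      # h(0) = h(1) = 0

def phi(eps):
    return J(lambda x: u(x) + eps * h(x))

eps = 1e-3
dphi0 = (phi(eps) - phi(-eps)) / (2 * eps)   # central difference for phi'(0)
```

For the straight line, which solves the Euler equation $u'' = 0$, the difference quotient `dphi0` is zero up to discretization error, while `phi(0.1)` exceeds `phi(0.0)`.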
$\delta^n J(u; h) := \varphi^{(n)}(0)$ is called the $n$th variation of $J$ in $u$ in the direction $h$.
By a solution of the problem
$J(u) = \text{stationary}!, \quad u(x_0) = u_0, \quad u(x_1) = u_1$ (24a)
we understand any $u$ such that $\delta J(u; h) = \varphi'(0) = 0$ for all $h \in C_0^\infty(x_0, x_1)$.
Then $u$ is called a critical point. The above derivation shows that the critical
points are precisely the solutions of the Euler equation.
Many problems in mechanics are not of type (21) but rather of type (24a),
although this is often not mentioned in theoretical physics textbooks (cf.
Counterexample 40.9).
37.4c. Legendre Transformation and Canonical Equations as
Necessary Conditions
Our goal is to go from $(x, u, u')$ to new variables $(x, u, p)$ and to replace $L$
by $H$ by means of the formulas
$p = L_{u'}(x, u, u'),$ (25)
$H(x, u, p) = p\, u'(x, u, p) - L(x, u, u'(x, u, p)).$ (26)
This transformation is called the Legendre transformation. In this
connection, we assume that (25) can be solved for $u'$; i.e., $u' = u'(x, u, p)$. Locally,
this is possible by the implicit function theorem provided $L_{u'u'}(x, u, u') \neq 0$.
We give global solvability conditions in Section 37.4d. From (25) and (26) it
follows that
$H_u = p\, u'_u - L_u - L_{u'} u'_u = -L_u,$
$H_p = u' + p\, u'_p - L_{u'} u'_p = u'.$
Thus, if $x \mapsto u(x)$ satisfies the Euler equation (24), then
$p'(x) = -H_u(x, u(x), p(x)),$
$u'(x) = H_p(x, u(x), p(x)).$ (27)
These are the so-called Hamilton canonical equations. Conversely, from (27)
and (25) it follows that (24) holds.
An advantage of (27) over (24) is that one can read off conservation
quantities directly from (27). If $H$ does not depend on $p$ (respectively, $u$),
then $u(x) = \text{constant}$ [respectively, $p(x) = \text{constant}$]. If $H$ is independent of
$x$, then by (27)
$\frac{d}{dx} H(u(x), p(x)) = H_u u' + H_p p' = H_u H_p - H_p H_u = 0;$
therefore, $H(x, u(x), p(x)) = \text{constant}$.
The canonical equations have proved their worth in the difficult problems
of celestial mechanics. The deeper reason lies in the fact that the integration
of (27) can be made essentially easier by means of the canonical
transformations (cf. Section 37.4n).
At the same time, the canonical formalism yields the framework for
general field theories in physics. If $u$ and $p$ are interpreted as operators in an
H-space with the commutator relation $pu - up = \hbar/i$, then the
canonical formalism leads from classical mechanics to quantum mechanics.
The process of the so-called second quantization then yields quantum field
theories which describe the interaction between elementary particles.
Example 37.5 (Geometrical Optics). If we choose $L = n(x, u) c^{-1} \sqrt{1 + u'^2}$
as in Example 37.4, then we obtain
$p = L_{u'} = \frac{n(x, u) c^{-1} u'}{\sqrt{1 + u'^2}},$
$H = p u' - L = -\sqrt{n^2(x, u) c^{-2} - p^2}.$
37.4d. Classical Maximum Principle and Necessary Conditions
We assume that
$L_{u'u'}(x, u, u') > 0$ for all $(x, u, u') \in \mathbb{R}^3$,
$\frac{L(x, u, u')}{|u'|} \to +\infty$ as $|u'| \to \infty$
and define
$\mathcal{H}(x, u, u', p) = p u' - L(x, u, u').$
As functions of $u'$, the graphs of $L$, $\mathcal{H}$, and $L_{u'}$ have the form shown in Fig.
37.11. In particular, $u' \mapsto L_{u'}(x, u, u')$ is strictly monotonically increasing and
$L_{u'}(x, u, u') \to \pm\infty$ as $u' \to \pm\infty$.
Therefore, for fixed $x$, $u$, and $p$, the maximum problem
$\max_{v \in \mathbb{R}} \mathcal{H}(x, u, v, p)$
always has exactly one solution $u'$. For this solution,
$\mathcal{H}_{u'}(x, u, u', p) = 0;$
therefore,
$p = L_{u'}(x, u, u').$ (25a)
According to Fig. 37.11, for each $p$ there exists exactly one $u'$ for which
(25a) holds, i.e., we can solve the Legendre transformation (25a) globally
for $u'$.
[Figure 37.11]
We thus obtain the classical maximum principle: If $u(\cdot)$ is a solution of
the variational problem (21) and we set
$p(x) = L_{u'}(x, u(x), u'(x)),$
then for all $x \in [x_0, x_1]$, we have
$\max_{v \in \mathbb{R}} \mathcal{H}(x, u(x), v, p(x)) = \mathcal{H}(x, u(x), u'(x), p(x)).$
Besides, $p(\cdot)$ and $u(\cdot)$ satisfy the differential equations
$p' = -\mathcal{H}_u, \quad u' = \mathcal{H}_p.$
These equations result from the Euler equation (24) and $\mathcal{H}_u = -L_u$, $\mathcal{H}_p = u'$.
Furthermore, from (25) and (26) we obtain
$H(x, u, p) = \max_{v \in \mathbb{R}} \mathcal{H}(x, u, v, p).$ (28)
In Section 51.1 we show that this relation simply means that $H$ is the
conjugate function to $L$. Thus, the Legendre transformation turns out to be
a duality transformation.
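Relation (28) can be checked numerically for the quadratic Lagrange function $L = (u'^2 + a u^2)/2$, for which $L_{u'u'} = 1 > 0$ and $L/|u'| \to +\infty$. The sketch below computes the maximum over $v$ by a ternary search and compares it with $H = (p^2 - a u^2)/2$. The search bounds, the sample point, and all function names are assumptions made for illustration.

```python
def L(x, u, v, a=1.0):
    # Quadratic Lagrange function L = (v^2 + a u^2) / 2, where v stands for u'
    return 0.5 * (v * v + a * u * u)

def script_H(x, u, v, p):
    # The function p*v - L(x, u, v) that is maximized over v in (28)
    return p * v - L(x, u, v)

def H_numeric(x, u, p, lo=-50.0, hi=50.0, iters=200):
    # Ternary search for the maximum over v; this is valid here because
    # v -> p*v - L(x, u, v) is strictly concave.
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if script_H(x, u, m1, p) < script_H(x, u, m2, p):
            lo = m1
        else:
            hi = m2
    v_star = 0.5 * (lo + hi)
    return script_H(x, u, v_star, p), v_star

H_val, v_star = H_numeric(0.0, 0.7, 1.3)
H_closed = 0.5 * (1.3 ** 2 - 1.0 * 0.7 ** 2)   # H = (p^2 - a u^2) / 2
```

The maximizer `v_star` agrees with $u' = p$, which is (25a) for this choice of $L$.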
The deep Pontrjagin maximum principle in Chapter 48 is a fundamental
generalization of the above maximum principle to control problems. In
Section 48.8 we show that the following necessary conditions are obtained
by an application of the Pontrjagin maximum principle to the variational
problem (21):
($\alpha$) The Euler equation.
($\beta$) The Legendre condition.
($\gamma$) The Weierstrass condition.
($\delta$) The Weierstrass-Erdmann corner condition for solutions $u$ with corners
(jumps in the first derivatives).
Therefore, the Pontrjagin maximum principle is also of central
significance for the classical calculus of variations.
37.4e. Sufficient Conditions
The difference between weak and strong minima is important in the
variational problem (21).
(i) By definition, $J$ has a weak local minimum at $u$ if and only if there
exists $\varepsilon > 0$ such that $J(u) \le J(\bar{u})$ for all $\bar{u}$ satisfying
$|\bar{u}(x) - u(x)| < \varepsilon, \quad |\bar{u}'(x) - u'(x)| < \varepsilon$ on $[x_0, x_1]$. (29)
Besides, all these $\bar{u}$'s should satisfy the boundary condition $\bar{u}(x_0) = u_0$,
$\bar{u}(x_1) = u_1$ [see Fig. 37.12(a)].
(ii) If the condition on $\bar{u}'$ is absent in (29), then we speak of a strong local
minimum [see Fig. 37.12(b)].
[Figure 37.12, panels (a) and (b)]
In (i) we require that not only the function values but also the derivative
values of $\bar{u}$ differ only slightly from the corresponding values of $u$. Here, the
$\bar{u}$ are the functions that are admitted to the competition. On the other hand,
in (ii) one forgoes the closeness of the derivative values. Thus, every strong
local minimum is also a weak local minimum. As the derivation in Section
37.4b shows, the Euler equation is necessary for a weak local minimum.
A crucial problem reads as follows: When is a solution of the Euler
equation (24) also a solution of the variational problem (21)? Such sufficient
conditions are proved in the classical calculus of variations with the aid of:
($\alpha$) the second variation $\delta^2 J$ (accessory variational problem, the Jacobi
condition for weak local minima);
($\beta$) field theory (Hilbert's invariant integral, Weierstrass' criterion for strong
local minima with the aid of the E-function).
We discuss this in Section 40.7. There we also explain the connection with
general necessary and sufficient criteria for extremal problems. In particular,
we explain the role of the eigenvalue criteria in connection with the second
variation.
37.4f. Perturbed Variational Problems and the
Hamilton-Jacobi Differential Equation
We study the perturbed problem associated with (21):
$\int_{x_0}^{\xi} L(x, u(x), u'(x))\, dx = \min!, \quad u(x_0) = u_0, \quad u(\xi) = a.$
Here, perturbation means that we replace $(x_1, u_1)$ by $(\xi, a)$. We hold
$(x_0, u_0)$ fixed and for variable $(\xi, a)$ we set the minimum value equal to
$S(\xi, a)$, i.e.,
$S(\xi, a) := \min \int_{x_0}^{\xi} L(x, u(x), u'(x))\, dx, \quad u(x_0) = u_0, \quad u(\xi) = a,$
where the existence of a solution $u$ of the corresponding variational problem
is assumed. In Section 37.4i we show that then, under natural assumptions,
$S$ satisfies the so-called Hamilton-Jacobi differential equation
$S_x(x, u) + H(x, u, S_u(x, u)) = 0.$ (30)
In Chapter 52 we use the idea of perturbed variational problems to prove
duality propositions. From (28), we can write (30) in the form
$S_x(x, u) + \max_{v \in \mathbb{R}} \mathcal{H}(x, u, v, S_u(x, u)) = 0.$ (31)
This equation is called the Bellman differential equation. In Section 37.20
we explain the connection between a discretized form of (31) and the
Bellman optimality principle of dynamic optimization. In Chapter 52 we use
(31) to construct a duality theory for general nonconvex control problems,
which yields estimates for the minimal values and sufficient solvability
conditions.
Example 37.6 (Geometrical Optics). In the special case
$L = n(x, u) c^{-1} \sqrt{1 + u'^2}$ of Example 37.4, $S(\xi, a)$ is the time required by a light
ray to reach the point $(\xi, a)$ from $(x_0, u_0)$. Here $S$ is called the eikonal. Since
$H = -\sqrt{n^2 c^{-2} - p^2}$, the Hamilton-Jacobi differential equation (30) is
transformed into
$S_x^2 + S_u^2 = \frac{n^2}{c^2}.$
This equation is called the eikonal equation. The curves $S(x, u) = \text{constant}$
are called wave fronts. We shall explain the more exact physical meaning in
Problem 40.10.
Example 37.7 (Quadratic Variational Problem). Let $L = 2^{-1}(u'^2 + a u^2)$.
Then
$p = L_{u'} = u', \quad H = p u' - L = 2^{-1}(p^2 - a u^2).$
The Hamilton-Jacobi differential equation (30) reads as follows:
$S_x + 2^{-1}(S_u^2 - a u^2) = 0.$
The substitution $S = r(x) u^2$ leads to the Riccati differential equation
$r' + 2r^2 - a/2 = 0$. Therein lies the deeper reason why the Riccati equation
plays an important role in control problems with a quadratic objective
functional (cf. Sections 37.20 and 54.8).
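The reduction to the Riccati equation can be verified numerically. For $a > 0$, one particular solution of $r' + 2r^2 - a/2 = 0$ is $r(x) = (\sqrt{a}/2) \tanh(\sqrt{a}\, x)$; this particular solution, the value of `a`, and the sample points are choices made here for illustration. The sketch evaluates the residual of the Hamilton-Jacobi equation for $S = r(x) u^2$ with central difference quotients.

```python
import math

a = 4.0  # illustrative value of the coefficient a > 0 in L = (u'^2 + a u^2)/2

def r(x):
    # One particular solution of r' + 2 r^2 - a/2 = 0 (the one with r(0) = 0)
    return 0.5 * math.sqrt(a) * math.tanh(math.sqrt(a) * x)

def S(x, u):
    return r(x) * u * u

def hj_residual(x, u, d=1e-5):
    # S_x + (S_u^2 - a u^2)/2 with central difference quotients
    S_x = (S(x + d, u) - S(x - d, u)) / (2 * d)
    S_u = (S(x, u + d) - S(x, u - d)) / (2 * d)
    return S_x + 0.5 * (S_u * S_u - a * u * u)

residuals = [hj_residual(x, u) for x in (0.1, 0.5, 1.0) for u in (-1.0, 0.3, 2.0)]
```

The residuals vanish up to the error of the difference quotients, since $S_x = r' u^2$ and $S_u^2 = 4 r^2 u^2$.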
In the following two subsections we show:
($\alpha$) how one can obtain solutions of the canonical equations, and thus of
the Euler equation, from solutions $S$ of the Hamilton-Jacobi
differential equations;
($\beta$) how, conversely, solutions of the Hamilton-Jacobi differential
equations are obtained from solutions of the canonical equations.
From the standpoint of geometrical optics, this connection is very natural.
Light rays correspond to the solutions of the canonical equations, whereas
the solutions $S$ of the Hamilton-Jacobi differential equation yield wave
fronts by $S(x, u) = \text{constant}$, to which (by Section 37.4j) the light rays are
perpendicular. One expects that there exists a mutual correspondence
between light rays and wave fronts.
37.4g. Complete Integral of the Hamilton-Jacobi Equation
and the Solution of the Canonical Equations
If we know a so-called complete integral, i.e., a solution $S = S(x, u, \alpha) + C$
of the Hamilton-Jacobi equation (30), which depends on two constants $\alpha$
and $C$, then by means of
$S_\alpha(x, u(x), \alpha) = \beta, \quad p(x) = S_u(x, u(x), \alpha)$ (32)
we obtain a solution of the canonical equations
$u' = H_p, \quad p' = -H_u,$ (33)
provided that for fixed $\beta$ the first equation in (32) can be solved for $u$ and
thus $x \mapsto u(x)$ results. Let $S_{\alpha u}(x, u, \alpha) \neq 0$ for the corresponding
$(x, u, \alpha)$-values. Then $u(\cdot)$ and $p(\cdot)$ depend on the two constants $\alpha$ and $\beta$ and, under
suitable regularity assumptions, represent the general solution of (33).
For the proof, we differentiate (32) with respect to $x$ and (30) with respect
to $\alpha$ (respectively, $u$). We get
$S_{\alpha x} + S_{\alpha u} u' = 0, \quad p' = S_{ux} + S_{uu} u',$
$S_{x\alpha} + H_p S_{u\alpha} = 0, \quad S_{xu} + H_u + H_p S_{uu} = 0.$
(33) follows immediately from this.
If $u(\cdot)$, $p(\cdot)$ is an arbitrary solution of (33), then $S_\alpha(x, u(x), \alpha) = \text{constant}$,
i.e., $S_\alpha$ is a so-called conservation quantity, since
$\frac{d}{dx} S_\alpha(x, u(x), \alpha) = S_{\alpha x} + S_{\alpha u} u' = S_{\alpha x} + S_{\alpha u} H_p = 0.$
Example 37.8 (Harmonic Oscillator). If $u(x)$ is the displacement of an
oscillating spring at time $x$, then the Newton equation of motion
$m u'' = -k u$ (34)
holds, where $m$ is the mass and $k$ is the spring constant; $p = m u'$ is the
momentum. If we choose $H = p^2/2m + k u^2/2$, then we can write (34) in the
form
$u' = H_p, \quad p' = -H_u,$ (35)
with the Hamilton-Jacobi differential equation
$2m S_x(x, u) + S_u^2(x, u) + k m u^2 = 0.$ (36)
$H$ is interpreted as the energy. The substitution $S = -\alpha x + T(u)$ yields
$S = -\alpha x + \int_0^u \sqrt{2m\alpha - k m v^2}\, dv$
as a solution of (36). Finally, $S_\alpha(x, u(x), \alpha) = \beta$ means
$-x + \int_0^u m (2m\alpha - k m v^2)^{-1/2}\, dv = \beta$
with the solution
$u = \sqrt{\frac{2\alpha}{k}} \sin\left(\sqrt{\frac{k}{m}}\,(x + \beta)\right).$
If we take into account that $p = S_u$ and $S_x + H(u, S_u) = 0$, then we obtain
$\alpha = H(u, p)$, i.e., $\alpha$ equals the energy.
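The closed-form solution of the oscillator can be checked directly. In the sketch below the numerical values of the mass, spring constant, and the two constants of the complete integral are assumptions chosen for illustration. The code evaluates the Newton equation with a second difference quotient and the energy $H(u, p)$ along the solution.

```python
import math

m, k = 2.0, 8.0          # illustrative mass and spring constant
alpha, beta = 3.0, 0.25  # the two constants of the complete integral

def u(x):
    return math.sqrt(2 * alpha / k) * math.sin(math.sqrt(k / m) * (x + beta))

def p(x):
    # p = m u'
    omega = math.sqrt(k / m)
    return m * math.sqrt(2 * alpha / k) * omega * math.cos(omega * (x + beta))

def H(u_val, p_val):
    return p_val ** 2 / (2 * m) + k * u_val ** 2 / 2

def newton_residual(x, d=1e-4):
    # m u'' + k u via a second central difference
    u_xx = (u(x + d) - 2 * u(x) + u(x - d)) / (d * d)
    return m * u_xx + k * u(x)

energies = [H(u(x), p(x)) for x in (0.0, 0.7, 1.9, 4.2)]
residuals = [newton_residual(x) for x in (0.0, 0.7, 1.9, 4.2)]
```

The energy equals `alpha` at every sample point, in agreement with $\alpha = H(u, p)$.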
In the above example, we can also obtain the solution directly in a simple
way. The advantage of the method described above first shows up in more
complicated problems. In this connection, compare Landau and Lifsic
(1962, M), Volumes I, II.
37.4h. Solutions of Canonical Equations and the Initial Value
Problem for the Hamilton-Jacobi Differential Equation
To solve the initial value problem
$S_x(x, u) + H(x, u, S_u(x, u)) = 0,$
$S(0, u) = \varphi(u)$ (37)
for given $\varphi$, we consider solutions of the so-called characteristic differential
equation system
$u'(x) = H_p(x, u(x), p(x)), \quad u(0) = v,$
$p'(x) = -H_u(x, u(x), p(x)), \quad p(0) = \varphi'(v),$ (38)
$\sigma'(x) = p(x) u'(x) - H(x, u(x), p(x)), \quad \sigma(0) = \varphi(v).$
In the following, one should take into account that the solutions of (38)
depend on $x$ and $v$. We symbolize partial derivatives with respect to $x$ by a
prime. If we set
$S(x, u(x, v)) = \sigma(x, v),$ (39)
then we obtain a solution $S$ of (37) provided the following important
nondegeneracy condition is fulfilled (see Fig. 37.13):
(H) If $I$ denotes an interval on the $u$-axis of the $(x, u)$-plane, then exactly
one solution curve $x \mapsto u(x, v)$ of (38) for a corresponding $v$-value goes
through each point $(x, u)$ of a suitable neighborhood of $I$.
[Figure 37.13]
In the language of geometrical optics, (H) means the following: The light
rays belonging to the curves $x \mapsto u(x, v)$ do not intersect, i.e., there are no
foci.
For the proof, we differentiate (39) with respect to $x$; therefore,
$S_x + S_u u' = \sigma' = p u' - H.$
Differentiation of (39) with respect to $v$ yields
$S_u u_v = \sigma_v.$ (40)
We obtain (37) from (40) and (H) provided we show that $\sigma_v = p u_v$, because
then $S_u = p$. In this connection, we set $w = \sigma_v - p u_v$. According to (38), we
have:
$w' = \sigma_v' - p' u_v - p u_v'$
$= p_v u' + p u_v' - H_u u_v - H_p p_v - p' u_v - p u_v' = 0.$
From the initial conditions in (38) it follows that $w(0, v) = 0$. Therefore,
$w(x, v) = 0$; hence $\sigma_v = p u_v$.
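The construction of $S$ from the characteristic system can be traced through numerically. The sketch below assumes the simple Hamiltonian $H = p^2/2$ and the initial function $\varphi(v) = v^2/2$, neither of which is taken from the text. For this choice the characteristics are $u = v(1 + x)$, so condition (H) holds for $x > -1$, and (39) gives $S(x, u) = u^2 / (2(1 + x))$. The system (38) is integrated with the classical Runge-Kutta method.

```python
def H(x, u, p):
    return 0.5 * p * p          # illustrative Hamiltonian (assumption)

def H_u(x, u, p):
    return 0.0

def H_p(x, u, p):
    return p

def phi(v):
    return 0.5 * v * v          # illustrative initial data S(0, u) = u^2 / 2

def dphi(v):
    return v

def rhs(x, state):
    # Right-hand sides of the characteristic system for (u, p, sigma)
    u, p, sigma = state
    du = H_p(x, u, p)
    return (du, -H_u(x, u, p), p * du - H(x, u, p))

def characteristic(v, x_end, steps=200):
    # Classical RK4, started at u(0) = v, p(0) = phi'(v), sigma(0) = phi(v)
    state, x, hstep = (v, dphi(v), phi(v)), 0.0, x_end / steps
    for _ in range(steps):
        k1 = rhs(x, state)
        k2 = rhs(x + hstep / 2, tuple(s + hstep / 2 * k for s, k in zip(state, k1)))
        k3 = rhs(x + hstep / 2, tuple(s + hstep / 2 * k for s, k in zip(state, k2)))
        k4 = rhs(x + hstep, tuple(s + hstep * k for s, k in zip(state, k3)))
        state = tuple(s + hstep / 6 * (a + 2 * b + 2 * c + d)
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
        x += hstep
    return state

def S_exact(x, u):
    # Closed form of S for this particular H and phi
    return u * u / (2.0 * (1.0 + x))

u_end, p_end, sigma_end = characteristic(v=0.8, x_end=1.5)
```

At the end point of the characteristic, `sigma_end` agrees with `S_exact(1.5, u_end)`, and `p_end` agrees with $S_u = u/(1 + x)$, as the proof predicts.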
37.4i. General Form of the First Variation and the
Hamilton-Jacobi Differential Equation
Our target is a general formula for the change of the integral
$J_I(u) = \int_I L(x, u(x), u'(x))\, dx$
when $u$ and the interval of integration $I$ are changed. This formula is
$J_{\bar{I}}(\bar{u}) - J_I(u) = \int_I [L_u - (L_{u'})'](\bar{u}(x) - u(x))\, dx$
$+ [L_{u'}\,\delta u + (L - L_{u'} u')\,\delta x]\big|_{x_0}^{x_1} + o(\rho(\bar{u}, u)), \quad \rho \to 0.$ (41)
The arguments of $L$, $L_u$, $L_{u'}$ are $(x, u(x), u'(x))$. The expression appearing on
the right-hand side of (41) without the remainder term $o(\rho)$ is frequently
denoted by $\delta J$. We make use of the usual symbolism
$f\,\delta x\,\big|_{x_0}^{x_1} = f(x_1)\,\delta x_1 - f(x_0)\,\delta x_0$
and assume that:
(i) $I = [x_0, x_1]$ and $\bar{I} = [\bar{x}_0, \bar{x}_1]$ are bounded intervals;
(ii) $u \in C^1(I)$, $\bar{u} \in C^1(\bar{I})$. These functions will be extended linearly to
$C^1$-functions on $\mathbb{R}$.
(iii) We set $\delta x_i = \bar{x}_i - x_i$, $\delta u_i = \bar{u}(\bar{x}_i) - u(x_i)$, i.e., $\delta x_i$, $\delta u_i$ are the differences
of the coordinates of the end points of the curves for $\bar{u}$ and $u$.
Furthermore, let
$\rho(\bar{u}, u) = \max_{x \in I \cup \bar{I}} |\bar{u}(x) - u(x)| + \max_{x \in I \cup \bar{I}} |\bar{u}'(x) - u'(x)| + \sum_{i=0}^{1} (|\delta x_i| + |\delta u_i|).$
Formula (41) is obtained in a way parallel to Section 37.4b. The simple
calculation, which uses only the Taylor theorem, can be found in Gelfand and
Fomin (1961, M), page 55. In the following, the Hamilton-Jacobi
equation, the so-called transversality condition, and the Noether
theorem are obtained from (41). We recommend that the reader prove (41)
as an exercise.
In order to derive the Hamilton-Jacobi differential equation from (41),
we assume that $u$ (respectively, $\bar{u}$) is a solution of
$J_I(u) = \min!, \quad u(x_0) = u_0, \quad u(x_1) = u_1,$
respectively,
$J_{\bar{I}}(\bar{u}) = \min!, \quad \bar{u}(x_0) = u_0, \quad \bar{u}(\bar{x}_1) = \bar{u}_1,$
where $I = [x_0, x_1]$, $\bar{I} = [x_0, \bar{x}_1]$. We assume that
$\rho(\bar{u}, u) = O(|\Delta u| + |\Delta x|)$
holds, where $\Delta u = \bar{u}_1 - u_1$, $\Delta x = \bar{x}_1 - x_1$. By the definition of $S$ in Section
37.4f, from (41) we obtain
$S(\bar{x}_1, \bar{u}_1) - S(x_1, u_1) = L_{u'}(P)\,\Delta u$
$+ [L(P) - L_{u'}(P) u'(x_1)]\,\Delta x + o(|\Delta u| + |\Delta x|),$
where $P = (x_1, u_1, u'(x_1))$. One takes into account that the integral in (41)
vanishes because of the Euler equation, and $\bar{x}_0 = x_0$, $\bar{u}_0 = u_0$. Thus,
$S_u(x_1, u_1) = L_{u'}(P), \quad S_x(x_1, u_1) = L(P) - L_{u'}(P) u'(x_1).$
Since $p(x_1) = L_{u'}(P)$ and $H = p u' - L$, it follows that
$S_x(x_1, u_1) + H(x_1, u_1, S_u(x_1, u_1)) = 0.$
37.4j. Problems with Free End Point and Transversality
Condition
We consider the variational problem
$\int_{x_0}^{x_1(\tau)} L(x, u(x), u'(x))\, dx = \min!, \quad u(x_0) = u_0, \quad u(x_1(\tau)) = u_1(\tau),$ (42)
i.e., all $u(\cdot)$ are admitted to the competition which pass through a fixed left
end point $(x_0, u_0)$, whereas the right end point lies on the curve
$\tau \mapsto (x_1(\tau), u_1(\tau))$ (see Fig. 37.14).
If $u$ is a solution, say, for $\tau = \tau_0$, then the Euler equation
$\frac{d}{dx} L_{u'}(x, u(x), u'(x)) = L_u(x, u(x), u'(x))$ (43)
holds on $]x_0, x_1(\tau_0)[$, and the so-called transversality condition
$L_{u'}(P)\, u_1'(\tau_0) + [L(P) - L_{u'}(P) u'(x_1)]\, x_1'(\tau_0) = 0,$ (44)
where $P = (x_1, u(x_1), u'(x_1))$, holds for the right end point $x_1 = x_1(\tau_0)$.
In order to prove this, we first consider only comparison curves $\bar{u}$ which
have the same right end point in common with $u$. The same argument as in
Section 37.4b yields (43). In order to prove (44), we assume that $\tau_0 = 0$ and
choose admissible comparison curves $\bar{u}$ that pass through $(x_1(\tau), u_1(\tau))$, where
$\rho(\bar{u}, u) = O(\tau)$. Here we assume that such comparison curves exist.
Furthermore, let $\varphi(\tau) = J_{I(\tau)}(\bar{u})$, $I(\tau) = [x_0, x_1(\tau)]$. We have:
$\delta u_0 = 0, \quad \delta u_1 = u_1(\tau) - u_1(0) = u_1'(0)\tau + o(\tau),$
$\delta x_0 = 0, \quad \delta x_1 = x_1(\tau) - x_1(0) = x_1'(0)\tau + o(\tau).$
Since $J$ has a minimum at $u$, $\varphi'(0) = 0$. Thus (44) follows from (41) and (43).
Example 37.9 (Geometrical Optics). In the special case
$L = n(x, u) c^{-1} \sqrt{1 + u'^2}$ of Example 37.4, (44) reads as follows:
$u'(x_1)\, u_1'(\tau_0) + x_1'(\tau_0) = 0,$
i.e., the light ray $x \mapsto u(x)$ is perpendicular to the curve $\tau \mapsto (x_1(\tau), u_1(\tau))$.
This explains the designation transversality condition for (44). If one
chooses the curve in Fig. 37.14 to be in the form of a wave front
$S(x, u) = \text{constant}$, then by the construction of $S$ we obtain that all the light rays that
emanate from the fixed point $(x_0, u_0)$ intersect the wave front
perpendicularly.
[Figure 37.14]
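The perpendicularity statement can be tested in the simplest setting, a constant index of refraction. The sketch below makes the following assumptions for illustration: the fixed point is $(x_0, u_0) = (0, 0)$, the rays are straight lines, and the target curve is the straight line $\tau \mapsto (2 + \tau, 1 + 3\tau)$. The optimal end point is found by a ternary search over $\tau$.

```python
import math

def endpoint(tau):
    # Illustrative target curve tau -> (x_1(tau), u_1(tau)) (assumption)
    return 2.0 + tau, 1.0 + 3.0 * tau

def endpoint_velocity(tau):
    # (x_1'(tau), u_1'(tau)) for the curve above
    return 1.0, 3.0

def path_length(tau):
    # For constant n, rays from (0, 0) are straight lines, so the
    # travel time is proportional to the Euclidean distance.
    x1, u1 = endpoint(tau)
    return math.hypot(x1, u1)

# Ternary search: path_length is convex in tau for a straight target curve
lo, hi = -5.0, 5.0
for _ in range(200):
    m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
    if path_length(m1) > path_length(m2):
        lo = m1
    else:
        hi = m2
tau0 = 0.5 * (lo + hi)

x1, u1 = endpoint(tau0)
slope = u1 / x1                       # u'(x_1) of the straight ray
dx1, du1 = endpoint_velocity(tau0)
transversality = slope * du1 + dx1    # u'(x_1) u_1'(tau_0) + x_1'(tau_0)
```

At the minimizing parameter `tau0` the quantity `transversality` vanishes up to the accuracy of the search, i.e., the ray meets the target curve at a right angle.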
37.4k. Noether's Theorem
We will derive the remarkable fact that the existence of a conservation
quantity (45) below follows from an invariance property of the variational
integral.
In this connection, we use the notation of Section 37.4i and make the
following assumptions:
(i) The variational integral $J_I$ is invariant with respect to transition from $u$
to $\bar{u}$ and from $I$ to $\bar{I}$, i.e., $J_I(u) = J_{\bar{I}}(\bar{u})$; therefore,
$\int_I L(x, u(x), u'(x))\, dx = \int_{\bar{I}} L(\bar{x}, \bar{u}(\bar{x}), \bar{u}'(\bar{x}))\, d\bar{x}.$
(ii) This sufficiently smooth transformation depends on a parameter $\alpha$ such
that
$\bar{x} = x + \alpha\varphi(x) + o(\alpha), \quad \bar{u}(\bar{x}) = u(x) + \alpha\psi(x) + o(\alpha).$
The terms $o(\alpha)$ also depend on $x$, and $o(\alpha)/\alpha \to 0$ holds as $\alpha \to 0$
uniformly with respect to $x$ on $I = [x_0, x_1]$.
(iii) $u$ satisfies the Euler equation on $I$.
Then it follows directly from (41) that
$[L_{u'}\,\alpha\psi + (L - L_{u'} u')\,\alpha\varphi]\big|_{x_0}^{x_1} + o(\alpha) = 0$ as $\alpha \to 0$.
If the assumptions are fulfilled for all $\alpha$ in a neighborhood of zero and for
all $x_0$, $x_1$, then, after division by $\alpha$, we obtain, as $\alpha \to 0$,
$L_{u'}(P)\psi(x) + (L(P) - L_{u'}(P) u'(x))\varphi(x) = \text{constant},$ (45)
where $P = (x, u(x), u'(x))$.
Example 37.10 (Energy as a Conservation Quantity). If $L$ does not depend
on $x$, then (i) holds for $\bar{x} = x + \alpha$, $\bar{u}(\bar{x}) = u(x)$ (translation invariance).
According to (45), $L(P) - L_{u'}(P) u'(x) = \text{constant}$; therefore,
$H(u(x), p(x)) = \text{constant}$. In applications to mechanics, $H$ is the energy and the
theorem on the conservation of mechanical energy is obtained.
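The conservation law of this example can be observed numerically on a problem whose Euler equation has no elementary closed-form solution. The pendulum-type Lagrange function $L = u'^2/2 + \cos u$, the initial values, and the step size below are assumptions made for this sketch. The Euler equation $u'' = -\sin u$ is integrated with the classical Runge-Kutta method, and the quantity $L - L_{u'} u'$ is monitored.

```python
import math

def L(u, v):
    # x-independent Lagrange function (an assumption): L = v^2/2 + cos u, v = u'
    return 0.5 * v * v + math.cos(u)

def L_v(u, v):
    return v

def accel(u):
    # Euler equation d/dx L_{u'} = L_u, i.e. u'' = -sin u
    return -math.sin(u)

def conserved(u, v):
    # L - L_{u'} u', which equals -H
    return L(u, v) - L_v(u, v) * v

def rk4_step(u, v, h):
    k1u, k1v = v, accel(u)
    k2u, k2v = v + h / 2 * k1v, accel(u + h / 2 * k1u)
    k3u, k3v = v + h / 2 * k2v, accel(u + h / 2 * k2u)
    k4u, k4v = v + h * k3v, accel(u + h * k3u)
    return (u + h / 6 * (k1u + 2 * k2u + 2 * k3u + k4u),
            v + h / 6 * (k1v + 2 * k2v + 2 * k3v + k4v))

u, v, h = 1.0, 0.0, 1e-3
c0 = conserved(u, v)
drift = 0.0
for _ in range(5000):
    u, v = rk4_step(u, v, h)
    drift = max(drift, abs(conserved(u, v) - c0))
```

The recorded `drift` stays at the level of the integration error, which reflects the translation invariance of $L$ in $x$.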
We shall occupy ourselves with generalizations and the important
physical applications of the Noether theorem in Part V.
37.4l. Variational Problems with Side Conditions and the
Lagrange Multiplier Rule
We consider the variational problem
$\int_{x_0}^{x_1} L(x, u, u', v, v')\, dx = \min!,$
$u(x_0) = u_0, \quad u(x_1) = u_1, \quad v(x_0) = v_0, \quad v(x_1) = v_1$ (46)
for fixed $x_i$, $u_i$, $v_i$, $i = 0, 1$, subject to one of the following side conditions:
(i) Integral side condition (isoperimetric problem):
$\int_{x_0}^{x_1} M(x, u, u', v, v')\, dx = \text{constant}.$
(ii) Equation as a side condition:
$M(x, u, v) = 0.$
(iii) Differential equation as a side condition:
$M(x, u, v, u', v') = 0.$
Without the side conditions, the necessary conditions for a solution $u, v$ of
(46) read as follows:
$\frac{d}{dx} L_{u'}(P) = L_u(P), \quad \frac{d}{dx} L_{v'}(P) = L_v(P)$ on $]x_0, x_1[$, (47)
where $P = (x, u(x), u'(x), v(x), v'(x))$. These Euler equations are
obtained in a way analogous to the method of Section 37.4b.
Lagrange Multiplier Rule as a Necessary Condition. This important rule
reads as follows:
(L) If $u, v \in C^1[x_0, x_1]$ is a solution of (46) with one of the side
conditions (i), (ii), or (iii), then (47) holds, in which connection
one must replace $L$ by $\lambda_0 L + \lambda M$. Here, $\lambda_0$ is a real number and
$\lambda$ is a $C^1$-function on $[x_0, x_1]$, where $\lambda_0^2 + \lambda^2(x) \neq 0$.
We make the assumptions precise:
Ad (i) (L) holds with $\lambda_0 = 1$ provided $M$ does not satisfy (47) with $L$
replaced by $M$ (nondegenerate case). Otherwise, (L) is trivially fulfilled with
$\lambda_0 = 0$, $\lambda = 1$.
Ad (ii) (L) holds with $\lambda_0 = 1$ provided the rank of the matrix
$(M_u(P), M_v(P))$
is maximal, therefore equal to 1 for all $P$, i.e., $M_u^2(P) + M_v^2(P) \neq 0$ for all
$x \in [x_0, x_1]$ (nondegenerate case). For $M_u^2(P) + M_v^2(P) = 0$, (L) is trivially
fulfilled for $\lambda_0 = 0$, $\lambda = 1$.
Ad (iii) (L) holds provided the rank of the matrix
$(M_{u'}(P), M_{v'}(P))$
is maximal, therefore equal to 1 for all $P$.
The functions $L$ and $M$ are assumed to be sufficiently smooth. If we
denote by $C$ the arc that belongs to the solution $u = u(x)$, $v = v(x)$, then it
suffices that $L$ and $M$ are $C^1$-functions in a neighborhood $V$ of $C$. In this
connection, to be precise, $V$ is the set of all $(x, u, u', v, v') \in \mathbb{R}^5$ with $x \in [x_0, x_1]$
and
$|u - u(x)|, \ |u' - u'(x)|, \ |v - v(x)|, \ |v' - v'(x)| \le \delta$
for fixed $\delta > 0$.
Idea of the Proof. The classical proofs of (i), (ii), and (iii) can be found in
Courant and Hilbert (1953, M), Vol. I, page 216, Gelfand and Fomin (1961,
M), page 42, and Funk (1962, M), page 275, respectively. The multiplier rule
is also investigated in detail in Bolza (1949, M). We sketch the ideas of the
proof. In (i) and (ii) we make use of the multiplier rule for real functions
and make inferences analogous to those in Section 37.4b. In the more
difficult case (iii) we use an artifice of Bliss that first reduces the problem to
a Mayer problem and then applies the main theorem on underdetermined
systems of differential equations.
(i) We replace $u$ (respectively, $v$) by $u + \varepsilon_1 h_1$ (respectively, $v + \varepsilon_2 h_2$),
where $h_1$, $h_2$ vanish at the boundary points $x_0$, $x_1$. We denote the left-hand
side in (46) [respectively, (i)] by $F(\varepsilon_1, \varepsilon_2)$ [respectively, $G(\varepsilon_1, \varepsilon_2)$]. Then $F$
has a minimum at $(0, 0)$ under the side condition $G(\varepsilon_1, \varepsilon_2) = c$. From the
multiplier rule for real functions it follows that
$\lambda_0 F_{\varepsilon_i}(0, 0) + \lambda G_{\varepsilon_i}(0, 0) = 0, \quad i = 1, 2.$
(L) follows from this in a way analogous to Section 37.4b (cf. Courant and
Hilbert (1953, M), page 216).
(ii) First we note that in this case (L) is of a purely local nature. This
follows from the fact that each solution of (46), (ii) is also a solution of the
problem that results from (46) by replacing $[x_0, x_1]$ by a smaller interval
$[\bar{x}_0, \bar{x}_1]$ and modifying the boundary conditions correspondingly. If, say,
$M_u(P^*) \neq 0$, then we can solve (ii) for u in a neighborhood of
$(x^*, u(x^*), v(x^*))$, and we get u = g(x, v). This expression is substituted in
(46), possibly with $[\bar{x}_0, \bar{x}_1]$ instead of $[x_0, x_1]$, and we write the Euler
equation for this situation. Then we obtain (L) (cf. Courant and Hilbert
(1953, M), Vol. I, page 219).
Exercises. Write out these proofs explicitly.
(iii) If u, v is a solution of (46), (iii) and m denotes the minimal value,
then u, v, w is a solution of
$$w' = L(x, u, u', v, v'), \quad M(x, u, u', v, v') = 0$$
with the boundary conditions
$$w(x_0) = 0, \quad w(x_1) = m,$$
$$u(x_0) = u_0, \quad u(x_1) = u_1, \quad v(x_0) = v_0, \quad v(x_1) = v_1.$$
It is crucial that, because of the choice of m, the present problem has no
solution provided we replace m by $m - \delta$, $\delta > 0$. This means that u, v, w is a
bound arc in the sense of Problem 43.9. According to the theorem proved in
Problem 43.9, there then exist $C^1$-functions $\lambda_0$, $\lambda$ on $[x_0, x_1]$ such that
$\lambda_0^2 + \lambda^2 \neq 0$ and
$$\lambda_0' = 0, \quad (\lambda_0 L_{u'} + \lambda M_{u'})' - (\lambda_0 L_u + \lambda M_u) = 0$$
and a corresponding relation for v instead of u. Consequently, $\lambda_0$ = constant
and we obtain (L).
Lagrange Multipliers and Sufficient Conditions. In a manner parallel to the
considerations for real functions in Section 37.3, simple sufficient conditions
can be formulated for variational problems with side conditions provided
one uses Lagrange multipliers. In this connection, we consider the following
problem which is a modification of (46):
$$\int_{x_0}^{x_1} (L + \lambda M)\,dx = \min!, \tag{46*}$$
$$u(x_0) = u_0, \quad u(x_1) = u_1, \quad v(x_0) = v_0, \quad v(x_1) = v_1.$$
If u, v is a solution of (46*) for fixed $\lambda$ and this solution satisfies the side
condition (i), then u, v is obviously also a solution of (46) with the side
condition (i). The situation is analogous for the side conditions (ii)
and (iii).
In Chapter 40 we prove sufficiency criteria for problems of type
(46*), thus for problems without side conditions. If one finds a $\lambda$ such that
these sufficiency criteria are applicable to (46*), then one immediately
obtains sufficient conditions for (46), with one of the side conditions (i), (ii),
or (iii).
Critical Points. The multiplier rule (L) also holds in the case where "min!" is
replaced by "stationary!" in (46), i.e., in the case where we are seeking critical
points u, v. Roughly speaking, in this connection, a critical point means: If
we replace u and v by $u + k_1$, $v + k_2$, respectively, which also satisfy the side
conditions, then the change in the integral expression in (46) is of higher
than first order in $k_1$, $k_2$. Then, in the proofs of (L) sketched above, the real
function $(\varepsilon_1, \varepsilon_2) \mapsto F(\varepsilon_1, \varepsilon_2)$ has a critical point at (0,0) with respect to the
side conditions $G(\varepsilon_1, \varepsilon_2) = c$, etc. However, the Lagrange multiplier rule for
real functions holds in general for critical points, not only for extrema (cf.
Corollary 43.25). We give the precise definition of a critical point in Section
43.9.
Example 37.11 (Hanging Rope). We seek the form u = u(x) of a rope of
fixed length a and constant density which hangs between two fixed points
$(x_0, u_0)$ and $(x_1, u_1)$. The variational problem reads as follows:
$$\int_{x_0}^{x_1} u \sqrt{1 + u'^2}\,dx = \min!, \quad u(x_0) = u_0, \quad u(x_1) = u_1 \tag{48}$$
with the side condition
$$\int_{x_0}^{x_1} \sqrt{1 + u'^2}\,dx = a. \tag{49}$$
(48) comprises the requirement for minimal potential energy. In order to
motivate this, we think of the potential energy of a mass point in the
linearized gravitational field of the earth as being equal to weight times
height. If we subdivide the rope into small parts, then the potential energy of
each part is approximately equal to $u\,\Delta s$ (s = arc length), and (48) is obtained by
summation and passing to the limit as $\Delta s \to 0$. The necessary condition for
a solution u reads as follows:
$$\frac{d}{dx}(\lambda_0 L + \lambda M)_{u'} - (\lambda_0 L + \lambda M)_u = 0,$$
$$L = u\sqrt{1 + u'^2}, \quad M = \sqrt{1 + u'^2}.$$
In the nondegenerate case, $\lambda_0 = 1$; therefore, $(u + \lambda)/\sqrt{1 + u'^2} = c$, i.e.,
$$u + \lambda = c \cosh(c^{-1} x + c_1).$$
This is the so-called catenary. The constants $\lambda$, c, and $c_1$ are determined
from the boundary conditions and the side condition.
Degeneracy occurs if $a = x_1 - x_0$. Then, according to (49), we must have
$u' \equiv 0$; therefore $u \equiv 0$ where $u_0 = u_1 = 0$. Here, we can choose $\lambda_0 = 0$, $\lambda = 1$.
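The determination of the catenary constants can be sketched numerically. The following is a minimal illustration, not part of the original text, under simplifying assumptions: the endpoints are placed symmetrically at $x = \pm h$ with $u_0 = u_1 = 0$, so that $c_1 = 0$ by symmetry, and the rope length satisfies $a > 2h$ (nondegenerate case). The function name `catenary_constants` and the bisection bracket are choices made for this sketch only.

```python
import math

def catenary_constants(h, a, tol=1e-12):
    """Return (c, lam) with u(x) + lam = c*cosh(x/c) on [-h, h], u(-h) = u(h) = 0.

    Assumes a > 2*h, i.e. the rope is longer than the span (nondegenerate case).
    """
    if a <= 2 * h:
        raise ValueError("need a > 2*h")
    # The arc length of the catenary over [-h, h] is 2*c*sinh(h/c). It decreases
    # monotonically from +infinity to 2*h as c grows, so bisection applies.
    length = lambda c: 2 * c * math.sinh(h / c)
    lo, hi = h / 50.0, max(1.0, h)
    while length(hi) > a:          # enlarge the bracket until length(hi) <= a
        hi *= 2
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if length(mid) > a:
            lo = mid
        else:
            hi = mid
    c = 0.5 * (lo + hi)
    lam = c * math.cosh(h / c)     # from the boundary condition u(h) = 0
    return c, lam

c, lam = catenary_constants(h=1.0, a=3.0)
u = lambda x: c * math.cosh(x / c) - lam     # rope shape
du = lambda x: math.sinh(x / c)              # its derivative
# The first integral (u + lam)/sqrt(1 + u'^2) should equal c at every x.
first_integral = (u(0.3) + lam) / math.sqrt(1 + du(0.3) ** 2)
```

For general endpoints the three constants $\lambda$, c, $c_1$ have to be found from three nonlinear equations; the symmetric case is used here only because it reduces to a single scalar equation.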
Example 37.12 (Geodesics). We seek geodesics on the surface M(x, u, v) =
0, i.e., $x \mapsto (u(x), v(x))$ must connect two fixed points and realize the
shortest distance between these two points. Then we obtain the problem:
$$\int_{x_0}^{x_1} \sqrt{1 + u'^2 + v'^2}\,dx = \min!, \tag{50}$$
$$u(x_0) = u_0, \quad u(x_1) = u_1, \quad v(x_0) = v_0, \quad v(x_1) = v_1$$
with the side condition
$$M(x, u, v) = 0. \tag{51}$$
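The necessary solvability conditions for u, v are as follows. By (L) with $L = \sqrt{1 + u'^2 + v'^2}$ and M = M(x, u, v), they read
$$\frac{d}{dx}\left(\lambda_0 \frac{u'}{\sqrt{1 + u'^2 + v'^2}}\right) - \lambda M_u = 0, \quad \frac{d}{dx}\left(\lambda_0 \frac{v'}{\sqrt{1 + u'^2 + v'^2}}\right) - \lambda M_v = 0.$$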
In the nondegenerate case, $\lambda_0 = 1$. A simple calculation shows that we then
have: The principal normals of geodesics are normals to the surface (cf.
Smirnov (1956, M), Vol. IV, Section 70).
37.4m. The Trick of Introducing Lagrange Coordinates of the
Second Kind
Variational problems with side conditions arise very frequently in
mechanics when the principle of stationary action is applied. Concerning side
conditions in equation form, we distinguish between holonomic
(respectively, nonholonomic) side conditions when no derivatives occur
(respectively, derivatives do occur). For example, holonomic side conditions
describe the motion of mass points on surfaces. Nonholonomic conditions
occur in the motion of a ship or of a skater.
The Lagrange multiplier rule yields additional terms in the Euler
differential equations—these additional terms correspond to constraining forces in
mechanics, which, e.g., maintain the mass points on the prescribed surface.
In the case of holonomic conditions, there exists an important trick: One
introduces new coordinates so that the side conditions are automatically
fulfilled. Then, in these new coordinates (Lagrange coordinates of the
second kind), one obtains a variational problem without side conditions. In
mechanics, the Euler equations that result are called Lagrange equations of
the second kind.
Example 37.13. In Example 37.12, we introduce surface coordinates t, s on
the surface M(x, u, v) = 0. Then x = x(t, s), u = u(t, s), and v = v(t, s)
automatically satisfy (51). If we transform (50) to t, s, then we obtain
problem (50) without side conditions.
We explain the significance of geodesics on Riemannian manifolds for
general relativity theory in Part IV.
37.4n. Canonical Transformations and the Hamilton-Jacobi
Differential Equations
In order to solve the canonical equations
$$p'(x) = -H_u(x, u(x), p(x)), \quad u'(x) = H_p(x, u(x), p(x)), \tag{52}$$
one can try to pass to new coordinates by means of a transformation
$$P = P(x, u, p), \quad U = U(x, u, p) \tag{53}$$
so that after the transformation, the solutions of (52) satisfy the new
equations
$$P'(x) = -H^*_U(x, U(x), P(x)), \quad U'(x) = H^*_P(x, U(x), P(x)). \tag{54}$$
Such transformations that preserve the form of the canonical equations are
called canonical transformations. If, e.g., $H^* \equiv 0$, then the solutions of (54)
are P(x) = constant, U(x) = constant, and the solutions of (52) are easily
obtained from (53).
We show: If $\psi = \psi(x, u, U)$ is a given function and
$$p = \psi_u(x, u, U), \quad P = -\psi_U(x, u, U)$$
can be solved in the form (53), then there results a canonical transformation
with
$$H^*(x, U, P) = H(x, u, p) + \psi_x(x, u, U).$$
$\psi$ is called a generating function.
In order to prove this, for $\psi(x) \overset{\text{def}}{=} \psi(x, u(x), U(x))$ we write
$$\psi'(x) = \psi_x + \psi_u u' + \psi_U U' = H^* - H + pu' - PU' \tag{55}$$
and consider
$$\int_{x_0}^{x_1} [pu' - H(x, u, p)]\,dx = \text{stationary!}, \tag{56a}$$
$$\int_{x_0}^{x_1} [PU' - H^*(x, U, P)]\,dx = \text{stationary!} \tag{56b}$$
Furthermore, in (56a) and (56b) one must adjoin further fixed boundary
conditions on the functions u, p (respectively, P, U). As a result of (55), the
two integrands in (56) differ only by the derivative $\psi'$, so the two integrals
differ only by the constant $\psi(x_1) - \psi(x_0)$. They thus possess the same
critical points. By Section 37.4b, these are, however, equivalent to solutions
of the corresponding Euler equations (52) [respectively, (54)] (cf. (47)).
The fact that variational integrals differ by only a constant in the case
where one adjoins a derivative in the integrand or a divergence expression in
a multiple integral is a basic trick of the calculus of variations which is
exploited, for instance, in field theory (cf. Section 40.7).
Example 37.14. As in Section 37.4g, let S(x, u, a) be a complete integral of
the Hamilton-Jacobi differential equation
$$S_x + H(x, u, S_u) = 0. \tag{57}$$
We choose S to be the generating function; thus, $\psi(x, u, U) \overset{\text{def}}{=} S(x, u, U)$.
From (57), $H^* \equiv 0$; therefore, P(x) = constant, U(x) = constant. We thus
obtain the solution of (52) by
$$p = S_u(x, u, U), \quad P = -S_U(x, u, U),$$
where P and U are constants. This is precisely the method for the solution
of canonical equations that we have already used in Section 37.4g.
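The method can be checked on a toy case. The sketch below is an illustration that is not taken from the text: it assumes the free-particle Hamiltonian $H = p^2/2$ and the candidate complete integral $S(x, u, a) = au - a^2 x/2$. It verifies (57) by central differences and then recovers a solution of (52) by setting $S_a$ equal to a constant b (the sign convention for that constant does not matter, since it is arbitrary). The helper names `d`, `hj_residual`, and `trajectory` are hypothetical.

```python
def H(x, u, p):
    return 0.5 * p * p                 # assumed example Hamiltonian

def S(x, u, a):
    return a * u - 0.5 * a * a * x     # candidate complete integral of (57)

def d(fun, args, k, h=1e-5):
    """Central difference of fun with respect to its k-th argument."""
    lo, hi = list(args), list(args)
    lo[k] -= h
    hi[k] += h
    return (fun(*hi) - fun(*lo)) / (2 * h)

def hj_residual(x, u, a):
    """Left-hand side of (57): S_x + H(x, u, S_u)."""
    return d(S, (x, u, a), 0) + H(x, u, d(S, (x, u, a), 1))

def trajectory(x, a, b):
    """Solve S_a(x, u, a) = b for u and set p = S_u(x, u, a)."""
    u = b + a * x      # S_a = u - a*x = b
    p = a              # S_u = a
    return u, p

residual = hj_residual(0.3, 1.2, 0.7)
u0, p0 = trajectory(0.5, a=0.7, b=0.2)
# Slope of u along the trajectory; (52) requires u' = H_p = p and p' = -H_u = 0.
slope = (trajectory(0.5 + 1e-6, 0.7, 0.2)[0] - trajectory(0.5 - 1e-6, 0.7, 0.2)[0]) / 2e-6
```

Here the two constants a and b play the roles of the constant values of U and of $S_U$, and the resulting straight lines $u = b + ax$ are the expected motions of a free particle.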
At the same time we thus obtain a new interpretation of the
Hamilton-Jacobi equation as an equation for an especially propitious
generating function of a canonical transformation.
In Problem 40.8 we treat a deep-lying application of canonical
transformations.
In celestial mechanics, in the consideration of the perturbation action of
planets in (52), perturbed Hamiltonian functions of the form $H + \varepsilon H_1$
appear instead of H. The classical method consists in carrying out a
canonical transformation with respect to the unperturbed function H
analogous to Example 37.14. Then (54) is obtained with $H^*(x, U, P) =
\varepsilon H_1(x, u, p)$. The classical perturbation calculus for obtaining approximate
solutions for small $\varepsilon$ is now based on power series expansions for the
solutions of (54) with $H^* = \varepsilon H_1$.
References to the Literature
As a survey of classical works of the calculus of variations by J. Bernoulli
(1667-1748), Euler (1707-1783), Lagrange (1736-1813), Legendre
(1752-1833), Jacobi (1804-1851), Weierstrass (1815-1897), and Hilbert
(1862-1943), we recommend Funk (1962, M), Petrov (1977, M), and
Goldstine (1980, M). For the connection between the classical theory and
modern control theory, we recommend McShane (1978, M).
Introduction: Courant and Hilbert (1953, M), Volumes I, II; Gelfand and
Fomin (1961, M); Bliss (1951, M); and Funk (1962, M).
Hamilton-Jacobi theory: Rund (1966, M); Klotzler (1971, M).
Calculus of variations and first-order partial differential equations,
Lie theory of contact transformations: Caratheodory (1935, M); Frank and
von Mises (1961, M), Vol. I.
Lagrange multiplier rule: Bolza (1949, M); Funk (1962, M); Ioffe and
Tihomirov (1974, M).
Global generalized solutions of the Hamilton-Jacobi differential
equation: Lions, Jr. (1982, L).
Applications to mechanics: Sommerfeld (1962, M), Vol. I; Landau and
Lifsic (1962, M), Volumes I, II; Arnold (1974, M) (modern presentation).
Application of the canonical formalism in all branches of theoretical
physics: Landau and Lifsic (1962, M), Volumes I-IX.
(Also, cf. the references to the literature for Chapters 40 and 43.)
37.5. Multidimensional Classical Variational
Problems and Elliptic Partial Differential
Equations
As a generalization of Section 37.4, we consider the minimum problem
$$\int_G L(x, y, u, u_x, u_y)\,dx\,dy = \min!, \quad u = g \text{ on } \partial G, \tag{58}$$
where g is given. Let G be a bounded region in $\mathbb{R}^2$. As in Section 37.4b, we
obtain that a sufficiently smooth solution satisfies the Euler equation
$$\frac{\partial}{\partial x} L_{u_x}(P) + \frac{\partial}{\partial y} L_{u_y}(P) - L_u(P) = 0 \tag{59}$$
on G, where $P = (x, y, u(x, y), u_x(x, y), u_y(x, y))$. In contrast to
one-dimensional variational problems, this is a partial differential equation. We
treat general multidimensional problems in Sections 40.5 and 40.6.
Example 37.15. In Section 18.3 of Part II we have already seen that for a
solution $u \in C^1(\bar{G})$ of
$$\int_G (u_x^2 + u_y^2 - 2fu)\,dx\,dy = \min!, \quad u = g \text{ on } \partial G, \tag{60}$$
the relation
$$\int_G (u_x v_x + u_y v_y - fv)\,dx\,dy = 0 \quad \text{for all } v \in C_0^\infty(G) \tag{61}$$
always holds. Furthermore, in case $u \in C^2(\bar{G})$,
$$G:\ -u_{xx} - u_{yy} = f; \qquad \partial G:\ u = g. \tag{62}$$
Thus, the first boundary value problem for the Poisson equation appears
here as the Euler equation for (60). The relation (61) is called a variational
equation or the generalized problem for (62) and is, as we saw in Chapter
22, the point of departure for the modern functional analysis treatment of
boundary value problems in Sobolev spaces.
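The link between the minimum problem and the Poisson equation can be made concrete in a discrete setting. The following sketch is an illustration only and uses a one-dimensional analogue chosen for brevity: $-u'' = f$ on (0, 1) with $u(0) = u(1) = 0$, whose discrete energy is one half of the analogue of (60) (the factor does not change the minimizer). The function names and the finite-difference discretization are assumptions of this sketch.

```python
def solve_poisson_1d(f, n):
    """Solve -u'' = f on (0,1), u(0) = u(1) = 0, at n interior nodes (Thomas algorithm)."""
    h = 1.0 / (n + 1)
    # Tridiagonal system: -u[i-1] + 2*u[i] - u[i+1] = h^2 * f(x_i)
    diag = [2.0] * n
    rhs = [h * h * f((i + 1) * h) for i in range(n)]
    for i in range(1, n):                 # forward elimination, off-diagonals are -1
        m = -1.0 / diag[i - 1]
        diag[i] -= m * (-1.0)
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        u[i] = (rhs[i] + u[i + 1]) / diag[i]
    return u

def energy(u, f, n):
    """Discrete analogue of the integral of (u'^2/2 - f*u) with zero boundary values."""
    h = 1.0 / (n + 1)
    full = [0.0] + list(u) + [0.0]
    grad = sum((full[i + 1] - full[i]) ** 2 for i in range(n + 1)) / (2 * h)
    load = h * sum(f((i + 1) * h) * u[i] for i in range(n))
    return grad - load

n = 9
f = lambda x: 1.0
u = solve_poisson_1d(f, n)
perturbed = [ui + 0.01 for ui in u]       # any other admissible grid function
```

Setting the gradient of `energy` with respect to each `u[i]` to zero reproduces exactly the tridiagonal system solved above, which is the discrete counterpart of passing from (60) to (62).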
In the introductory remarks before Chapter 18 we explained in detail that
for general regions G and boundary functions g, one cannot expect that
solutions $u \in C^2(\bar{G})$ of (58) exist which also satisfy (59). In Section 42.7 we
treat the existence theory for (58). In this connection, the following items
are crucial:
(i) The solutions of (58) are proved to exist in Sobolev spaces and have
only generalized first derivatives.
(ii) The solutions satisfy
$$\int_G [L_{u_x}(P) v_x + L_{u_y}(P) v_y + L_u(P) v]\,dx\,dy = 0 \quad \text{for all } v \in C_0^\infty(G). \tag{61a}$$
This equation is called the generalized equation for the classical Euler
equation (59), and (61a) means that the first variation of (58) vanishes.
In contrast to (59), (61a) contains only first derivatives. In applications
to elasticity theory, (61a) corresponds to the principle of virtual work.
We explain this in Part IV.
(iii) Under appropriate regularity assumptions on L, $\partial G$, and g, it can be
shown with the aid of ingenious estimates that the solutions of the
generalized problem (61a) are also solutions of the Euler equation (59).
This difficult regularity theory can be found in Ladyzenskaja and
Uralceva (1964, M) and Morrey (1966, M). We also recommend
Giaquinta (1981, L) and Necas (1983, L).
(iv) A fundamental assumption of existence theory is the convexity of L
with respect to the first derivatives ux, uy. Regarding weakening this
assumption, we refer to the Problems in Chapter 42.
In Section 18.4 we pointed out the situation that is fundamental for
applications in mathematical physics that, for certain variational problems,
boundary conditions appear as necessary conditions which are not
formulated in the original variational problem. One then speaks of natural
boundary conditions.
Example 37.16. If we forego the boundary condition "u = g on $\partial G$" in
(60), i.e., we consider (60) without any boundary condition, then, from
Section 18.4, we obtain the equation (62) with the natural boundary
condition $\partial G$: $\partial u/\partial n = 0$ instead of $\partial G$: u = g.
References to the Literature
Introduction: Courant and Hilbert (1953, M), Vol. I; Gelfand and Fomin
(1961, M); Klotzler (1971, M).
Hamilton-Jacobi theory and field theory: Rund (1966, M); Klotzler
(1971, M).
Standard works on existence and regularity theory: Ladyzenskaja and
Uralceva (1964, M); Morrey (1966, M); Gilbarg and Trudinger (1977, M).
Recent results on regularity: Giaquinta (1981, L); Frehse (1982, S); Necas
(1983, L).
Quadratic variational problems: Michlin (1962, M).
Minimal surfaces: Nitsche (1975, M); Gilbarg and Trudinger (1977, M).
Historical survey: Ladyzenskaja and Uralceva (1964, M); Funk (1962,
M); Goldstine (1980, M).
Aleksandrov (1969, S); and Browder (1976, S). (Hilbert's 19th and 20th
problems).
37.6. Eigenvalue Problems for Elliptic Differential
Equations and Lagrange Multipliers
Instead of (58) we now consider
$$\int_G L(x, y, u, u_x, u_y)\,dx\,dy = \min!, \quad u = g \text{ on } \partial G \tag{63a}$$
with the integral side condition
$$\int_G M(x, y, u, u_x, u_y)\,dx\,dy = \text{constant}. \tag{63b}$$
The Lagrange multiplier rule asserts that the necessary condition for (63)
is obtained by replacing the function L in (59) by $\lambda_0 L + \lambda M$, where
$\lambda_0^2 + \lambda^2 \neq 0$, i.e.,
$$\frac{\partial}{\partial x}(\lambda_0 L_{u_x} + \lambda M_{u_x}) + \frac{\partial}{\partial y}(\lambda_0 L_{u_y} + \lambda M_{u_y}) - (\lambda_0 L_u + \lambda M_u) = 0 \tag{64}$$
on G. The argument of L and M is $(x, y, u(x, y), u_x(x, y), u_y(x, y))$. The
numbers $\lambda_0$ and $\lambda$ are real.
The degenerate case occurs provided (64) holds on G with $\lambda_0 = 0$ and
$\lambda = 1$. In the nondegenerate case, one can choose $\lambda_0 = 1$. We shall make this
precise in Section 43.13.
One obtains the generalized equation for (64) from (61a) by replacing L
everywhere by $\lambda_0 L + \lambda M$, i.e.,
$$\int_G [(\lambda_0 L + \lambda M)_{u_x} v_x + (\lambda_0 L + \lambda M)_{u_y} v_y + (\lambda_0 L + \lambda M)_u v]\,dx\,dy = 0 \tag{65}$$
for all $v \in C_0^\infty(G)$.
If one replaces "min!" by "stationary!" in (63a), then (65) is equivalent to
(63) provided, roughly speaking, the just-mentioned nondegeneracy
condition holds. If $u \in C^2(\bar{G})$, then the Euler equation (64) follows from (65).
Example 37.17. As in Section 18.5, we consider the problem
$$\int_G [u_x^2 + u_y^2]\,dx\,dy = \min!, \quad u = 0 \text{ on } \partial G, \tag{66}$$
$$\int_G u^2\,dx\,dy = 1.$$
If $u \in C^2(\bar{G})$ is a solution of (66), then
$$G:\ -u_{xx} - u_{yy} - \lambda u = 0; \qquad \partial G:\ u = 0 \tag{67}$$
with the corresponding generalized equation
$$\int_G [u_x v_x + u_y v_y - \lambda uv]\,dx\,dy = 0 \quad \text{for all } v \in C_0^\infty(G). \tag{68}$$
The above-mentioned nondegeneracy condition is fulfilled because $u \neq 0$.
By means of the minimum problem (66), one obtains only the smallest
eigenvalue $\lambda = \lambda_1$. But it is known that to (67) there corresponds a sequence
$(\lambda_n)$ of eigenvalues such that $0 < \lambda_1 \le \lambda_2 \le \cdots$ and $\lambda_n \to +\infty$ as $n \to \infty$. In
order to obtain $\lambda_n$ for $n \ge 2$, we replace "min!" by "stationary!" in (66).
Then the corresponding critical points u are the solutions of
(68) with $\lambda = \lambda_n$, and they yield the classical eigensolutions of (67) for $\lambda_n$ for
sufficiently smooth boundary $\partial G$.
Therefore the critical points are of fundamental significance for
discovering the eigensolutions for the higher eigenvalues. The Ljusternik-
Schnirelman theory makes available topological tools for demonstrating
the existence of critical points in connection with nonlinear eigenvalue
problems (cf. Section 37.27).
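The statement that the minimum problem yields only the smallest eigenvalue, while higher eigenfunctions are merely critical points, can be illustrated with Rayleigh quotients. The sketch below is not from the text; it assumes the one-dimensional analogue $-u'' = \lambda u$ on $(0, \pi)$ with $u(0) = u(\pi) = 0$, whose eigenvalues are $n^2$ with eigenfunctions $\sin(nx)$, and it uses a simple midpoint rule. Normalizing u so that the side condition in (66) holds does not change the quotient.

```python
import math

def rayleigh_quotient(u, du, a=0.0, b=math.pi, n=20000):
    """Midpoint-rule approximation of (integral of u'^2) / (integral of u^2) on (a, b)."""
    h = (b - a) / n
    num = den = 0.0
    for i in range(n):
        x = a + (i + 0.5) * h
        num += du(x) ** 2
        den += u(x) ** 2
    return num / den

# First and second eigenfunctions of the assumed 1D model problem
r1 = rayleigh_quotient(math.sin, math.cos)
r2 = rayleigh_quotient(lambda x: math.sin(2 * x), lambda x: 2 * math.cos(2 * x))

# A trial function that vanishes at 0 and pi but is not an eigenfunction
r_trial = rayleigh_quotient(lambda x: x * (math.pi - x), lambda x: math.pi - 2 * x)
```

The trial function gives the value $10/\pi^2$, slightly above the smallest eigenvalue 1, as the minimum principle predicts; the quotient of the second eigenfunction equals 4, which is a stationary value but not the minimum.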
References to the Literature
Courant and Hilbert (1953, M), Vol. I; Klotzler (1971, M); Ioffe and
Tihomirov (1974, M) (functional analysis treatment of Lagrange multipliers).
Eigenvalue problems in physics and engineering: Michlin (1962, M);
Collatz (1963, M).
37.7. Differential Inequalities and Variational
Inequalities
We consider the following boundary value problem:
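$$-\Delta u + cu = f \text{ on } G, \quad u \in C^2(\bar{G}), \tag{69}$$
$$\frac{\partial u}{\partial n} - g \ge 0, \quad u \ge 0, \quad \left(\frac{\partial u}{\partial n} - g\right) u = 0 \text{ on } \partial G.$$
Here, $f \in C(\bar{G})$, $g \in C(\partial G)$ and the constant c are given; $\partial/\partial n$ denotes the
exterior normal derivative. The boundary condition can also be written in
the form
$$\frac{\partial u}{\partial n} - g \in F(u), \tag{69a}$$
where
$$F(u) = \begin{cases} \{0\} & \text{for } u > 0, \\ \mathbb{R}_+ & \text{for } u = 0, \\ \emptyset & \text{for } u < 0. \end{cases}$$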
In contrast to the classical boundary value problem, here there appear
inequalities (respectively, multivalued conditions). Such boundary
conditions result from a number of physical problems with one-sided bounds. As
examples we mention:
(i) sliding boundaries in elastic media (the Signorini problem in elasticity
theory);
(ii) diffusion (respectively, heat transfer) in media with semipermeable
(respectively, thermally insulated) walls.
We shall consider (i) in Part IV. In order to physically motivate problem
(69) in a simple way, we interpret u as the temperature of a medium in a
region G. The differential equation in (69) describes a stationary
temperature state with the heat source $f - cu$ that depends on temperature. Here,
$f(x) - cu(x) > 0$ is the heat intake at the point x. The walls $\partial G$ of the
medium are to act in a thermally insulating way against the environment
which has temperature u = 0. Let, say, $g \equiv 0$. Then $\partial u/\partial n \ge 0$ on $\partial G$
means that there is no flow of heat to the outside (cf. Section 69.2 in Part
IV). Besides, we require that for u(x) > 0 we always have $\partial u(x)/\partial n = 0$ at
a boundary point x, i.e., the heat at x can flow to the wall only tangentially.
Normally, because the outside temperature is u = 0, heat would flow to the
outside, but the insulated wall prevents this.
If $\partial_1 G$ is the set of all boundary points x at which u(x) = 0, then:
$$\partial G - \partial_1 G:\ \frac{\partial u}{\partial n} - g = 0; \qquad \partial_1 G:\ u = 0.$$
Thus, the first boundary value problem applies to $\partial_1 G$ and the second
boundary value problem applies to $\partial G - \partial_1 G$. In fact, to begin with, $\partial_1 G$ is
unknown and cannot be easily prescribed. One therefore speaks of a free
boundary value problem. It is characteristic of free boundary value
problems that together with the solution one must further determine a set (the
form of the boundary, part of the boundary, interior subset, etc.) which is of
special physical interest. For example, in the melting of a block of ice or of
metal, one is interested in the advance of the fusion zone (Stefan's problem).
It is hard to investigate the problem in the form (69). It is much more
convenient to consider, for $u \in M$, an equivalent variational inequality
$$a(u, v - u) \ge b(v - u) \quad \text{for all } v \in M \tag{70}$$
where
$$M = \{u \in C^2(\bar{G}):\ u \ge 0 \text{ on } \partial G\},$$
$$a(u, v) = \int_G \left(\sum_{i=1}^N D_i u\, D_i v + cuv\right) dx,$$
$$b(v) = \int_G fv\,dx + \int_{\partial G} gv\,dO$$
and the corresponding variational problem
$$2^{-1} a(u, u) - b(u) = \min!, \quad u \in M. \tag{71}$$
Proposition 37.18. If G is a bounded region in $\mathbb{R}^N$, $N \ge 1$, having a piecewise
smooth boundary, i.e., $\partial G \in C^{0,1}$, then the following hold:
(1) Equivalence. The problems (69), (70), and (71) are mutually equivalent.
(2) Uniqueness. Each of these problems has at most one solution.
In order to recognize the connection with variational problems, recall
that, from Section 18.2, relation (70) with the equality sign and thus the
second boundary value problem
$$G:\ -\Delta u + cu = f, \qquad \partial G:\ \frac{\partial u}{\partial n} = g$$
follows from (71) in case $M = C^2(\bar{G})$.
PROOF. (1) (70) $\Leftrightarrow$ (71). If F denotes the left-hand side in (71), then we set
$\varphi(t) = F(u + t(v - u))$ for $t \ge 0$ and fixed $u, v \in M$. Then u is a solution of
(71) if and only if the convex function $\varphi$: $[0, \infty[ \to \mathbb{R}$ has a minimum at
t = 0, i.e., $\varphi'(0) \ge 0$. This is (70).
(70) $\Leftrightarrow$ (72). If we set v = 2u, v = 0, and v = u + w for $w \in M$ in (70), then we obtain
that (70) is equivalent to
$$a(u, u) = b(u), \quad a(u, w) \ge b(w) \quad \text{for all } w \in M, \tag{72}$$
where $u \in M$ is sought.
(69) $\Rightarrow$ (72). Multiplication of the differential equation in (69) by $w \in M$
and subsequent integration by parts yield
$$\int_G \left(\sum_{i=1}^N D_i u\, D_i w + cuw\right) dx - \int_{\partial G} \frac{\partial u}{\partial n} w\,dO = \int_G fw\,dx.$$
The boundary conditions in (69) then yield (72).
(72) $\Rightarrow$ (69). By integration by parts, it follows from (72) that for all
$w \in M$ we have:
$$\int_G (-\Delta u + cu - f) w\,dx + \int_{\partial G} \left(\frac{\partial u}{\partial n} - g\right) w\,dO \ge 0,$$
$$\int_G (-\Delta u + cu - f) u\,dx + \int_{\partial G} \left(\frac{\partial u}{\partial n} - g\right) u\,dO = 0.$$
Then, for $w \in C_0^\infty(G)$, we first obtain $-\Delta u + cu = f$ on G. The choice of
an arbitrary $w \in M$ then yields the boundary conditions in (69).
(2) If $u_1$, $u_2$ are solutions of (70), then we have $a(u_i, v - u_i) \ge b(v - u_i)$
for all $v \in M$. For $v = u_2$ and $v = u_1$, respectively, we obtain
$$a(u_1, u_2 - u_1) \ge b(u_2 - u_1),$$
$$a(u_2, u_1 - u_2) \ge b(u_1 - u_2).$$
Addition yields $a(u_1 - u_2, u_1 - u_2) \le 0$; thus, $u_1 = u_2$. □
Example 37.19. Let N = 1 and G = ]-1, 1[, $f \equiv 1$, $g \equiv 0$. For c > 0, $u \equiv 1/c$
is the unique solution of (69).
For c = 0, $u = -2^{-1} x^2 + C_1 x + C_2$ is the general solution of the
differential equation in (69), and it can easily be verified that (69) possesses no
solution: since $\partial u/\partial n$ passes into $\pm u'(\pm 1)$, the boundary conditions would require both $C_1 \ge 1$ and $C_1 \le -1$.
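A discrete version of the variational problem (71) for this example can be solved by a projected Gauss-Seidel iteration. The sketch below is an illustration under assumptions not made in the text: G = ]-1, 1[ is replaced by a uniform grid, the integrals are replaced by finite differences with trapezoidal weights, f, g, and c are constants, and the function name `solve_unilateral_1d` is hypothetical. Each sweep minimizes the discrete energy with respect to one nodal value at a time and projects the two boundary values onto $u \ge 0$.

```python
import math

def solve_unilateral_1d(f, c, g=0.0, n=20, sweeps=5000):
    """Projected Gauss-Seidel for a discretization of (71) on ]-1, 1[.

    Minimizes  sum (u[i+1]-u[i])^2/(2h) + h*sum w_i*(c*u_i^2/2 - f*u_i) - g*(u[0]+u[n])
    subject to u[0] >= 0 and u[n] >= 0, with trapezoidal weights w_i.
    """
    h = 2.0 / n
    u = [0.0] * (n + 1)
    for _ in range(sweeps):
        # boundary node x = -1: exact one-dimensional minimization, then projection
        u[0] = max(0.0, (u[1] / h + 0.5 * h * f + g) / (1.0 / h + 0.5 * c * h))
        for i in range(1, n):
            u[i] = ((u[i - 1] + u[i + 1]) / h + h * f) / (2.0 / h + c * h)
        # boundary node x = +1
        u[n] = max(0.0, (u[n - 1] / h + 0.5 * h * f + g) / (1.0 / h + 0.5 * c * h))
    return u

# Data of the example: f = 1, g = 0, here with c = 2, so that u = 1/c = 0.5 is expected.
u_example = solve_unilateral_1d(f=1.0, c=2.0)

# A variation chosen for this sketch only: a heat sink f = -1 with c = 1 makes the
# constraint active, and the boundary values are held at u = 0.
u_active = solve_unilateral_1d(f=-1.0, c=1.0)
```

In the first run the constraint is inactive and the second boundary value problem applies on all of the boundary; in the second run both boundary points belong to the set where u = 0, which illustrates how the contact set is determined together with the solution.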
In Section 46.3, for c > 0, we shall construct a generalized solution of (70)
and hence of (69) while replacing $C^2(\bar{G})$ by the Sobolev space $W_2^1(G)$.
There, c > 0 yields the coerciveness of $a(\cdot, \cdot)$. In Chapter 54 we shall
consider semicoercive problems, to which, e.g., the Signorini problem leads.
Variational inequalities are the appropriate tool for handling a number of
free boundary value problems (Signorini problem, flows of ground water,
and the Stefan problem). We discuss this in Part IV. Many applications of
variational inequalities to mathematical physics can be found in Duvaut and
Lions (1972, M), and Friedman (1982, M).
References to the Literature
Classical works on variational inequalities: Fichera (1964) (solution of the
Signorini problem); Stampacchia (1965) (elliptic differential equations with
discontinuous coefficients); Hartman and Stampacchia (1966) and Browder
(1966) (variational inequalities with nonlinear monotone operators).
Introduction: Lions (1969, M), (1971, M); Kinderlehrer and Stampacchia
(1980, M)
Applications of variational inequalities: Lions (1971, M) (control
problems); Duvaut and Lions (1972, M) (mechanics); Baiocchi and Capelo
(1978, M) (free boundary value problems); Bensoussan and Lions (1978, M)
(stochastic optimization); Aubin (1979, M) (mathematical economics);
Kinderlehrer and Stampacchia (1980, M); Friedman (1982, M, B) (free
boundary value problems).
Numerical methods: Glowinski, Lions, and Tremolieres (1976, M).
37.8. Game Theory and Saddle Points, Nash
Equilibrium Points and Pareto Optimization
We have already taken up saddle points and their game-theoretical
applications in Chapter 9. In this section we place these considerations in a
more general context.
We consider two players, $P_1$ and $P_2$, having the strategy sets P and Q,
respectively, i.e., each element p in P (respectively, q in Q) symbolizes a
decision of $P_1$ (respectively, $P_2$). Let f(p, q) [respectively, g(p, q)] denote
the winnings of $P_1$ (respectively, $P_2$). If f(p, q) < 0, then the negative
winning of $P_1$ means a loss for $P_1$. At the beginning, each player $P_i$ will first
determine his individual game value $v_i$. By definition, this is:
$$v_1 = \sup_{p \in P} \inf_{q \in Q} f(p, q), \tag{73a}$$
$$v_2 = \sup_{q \in Q} \inf_{p \in P} g(p, q). \tag{73b}$$
For the player $P_i$, $v_i$ is an optimal lower bound on winnings. To see this, let
us consider, say, $v_1$: The infimum in (73a) corresponds to the minimal gain
of $P_1$ in case he plays p. Now he tries to make this minimal gain as large as
possible by a suitable choice of p.
The next thing that each player should ask himself is whether he can
realize the winning $v_i$, i.e., $P_1$ (respectively, $P_2$) seeks a solution $\bar{p}$
(respectively, $\bar{q}$) of (73a) [respectively, (73b)]. These solutions are called
conservative strategies. Thus, in game theory one is led in a natural way to the
solution of max-inf problems, e.g., $\bar{p}$ is a solution of
$$v_1 = \max_{p \in P} \left( \inf_{q \in Q} f(p, q) \right),$$
i.e.,
$$v_1 = \inf_{q \in Q} f(\bar{p}, q).$$
Now we consider strategy pairs $(\bar{p}, \bar{q})$ which are propitious for both
players. $(\bar{p}, \bar{q})$ is called a Nash equilibrium point if and only if
$$f(\bar{p}, \bar{q}) = \max_{p \in P} f(p, \bar{q}), \tag{74}$$
$$g(\bar{p}, \bar{q}) = \max_{q \in Q} g(\bar{p}, q).$$
In this case, obviously neither player has occasion to change his
strategy, provided his opponent does not vary his strategy, for each player
realizes his maximal possible payoff against the strategy chosen by his
opponent. It is, however, quite possible that there is a strategy pair (p, q) for the
players that is more advantageous than $(\bar{p}, \bar{q})$, i.e.,
$$f(p, q) > f(\bar{p}, \bar{q}), \quad g(p, q) > g(\bar{p}, \bar{q}). \tag{75}$$
We call an arbitrary strategy pair $(\bar{p}, \bar{q})$ a Pareto maximum if and only if
there is no strategy pair (p, q) for which (75) holds.
Naturally, both players will seek strategy pairs $(\bar{p}, \bar{q})$ which are
simultaneously equilibrium points and Pareto maxima. If this is not possible, then
one restricts oneself to strategy pairs $(\bar{p}, \bar{q})$ which have the following
properties:
(i) $(\bar{p}, \bar{q})$ is a Pareto maximum;
(ii) $f(\bar{p}, \bar{q}) > v_1$, $g(\bar{p}, \bar{q}) > v_2$.
By definition, all these $(\bar{p}, \bar{q})$ form the core of the game.
Example 37.19. We consider the game situation presented in Table 37.1. In
the cell $(p_i, q_j)$ there appears $(f(p_i, q_j), g(p_i, q_j))$. We can assume that this
game models economic decisions of $P_1$ and $P_2$ (production, sales,
purchasing, warehousing, etc.), which are related, e.g., in terms of dollars with profit
or loss. One now easily verifies the following: We have $v_1 = -3$, $v_2 = -2$.
The strategies $p_1$, $q_2$ are conservative. There exists no
equilibrium point, and the core of the game is given by (p2,1i), (Pnli)-
Thus, this strategy pair is appropriate for both players.
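The game values, conservative strategies, and equilibrium points of this example can be recomputed by direct enumeration. The sketch below is an illustration that reads the payoff pairs from Table 37.1, interpreting its two rows as $p_1$, $p_2$ and its three columns as $q_1$, $q_2$, $q_3$ (an assumption about the table layout). Indices in the code are 0-based, and the Pareto and core computations are deliberately left out.

```python
# Payoffs (f, g) from Table 37.1: rows are the strategies p1, p2 of player P1,
# columns are the strategies q1, q2, q3 of player P2.
TABLE = [
    [(6, -3), (-3, 0), (3, -3)],
    [(-3, 2), (5, -2), (-4, -7)],
]
ROWS, COLS = range(2), range(3)

def f(i, j):
    return TABLE[i][j][0]

def g(i, j):
    return TABLE[i][j][1]

# Individual game values (73a), (73b)
v1 = max(min(f(i, j) for j in COLS) for i in ROWS)
v2 = max(min(g(i, j) for i in ROWS) for j in COLS)

# Conservative strategies: maximizers of the inner infimum (0-based indices)
conservative_p = [i for i in ROWS if min(f(i, j) for j in COLS) == v1]
conservative_q = [j for j in COLS if min(g(i, j) for i in ROWS) == v2]

# Nash equilibrium points in the sense of (74)
nash = [(i, j) for i in ROWS for j in COLS
        if f(i, j) == max(f(k, j) for k in ROWS)
        and g(i, j) == max(g(i, l) for l in COLS)]
```

The enumeration gives `v1 == -3`, `v2 == -2`, the conservative strategies with indices 0 and 1 (that is, $p_1$ and $q_2$), and an empty list of equilibrium points, in agreement with the values stated in the example.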
We will now discuss the connection with the zero-sum games discussed in
Chapter 9. In this case, f = -g. From (74) it follows immediately that
$(\bar{p}, \bar{q})$ is a Nash equilibrium point if and only if $(\bar{p}, \bar{q})$ is a saddle point of g
with respect to $P \times Q$, i.e.,
$$g(\bar{p}, q) \le g(\bar{p}, \bar{q}) \le g(p, \bar{q}) \quad \text{for all } (p, q) \in P \times Q. \tag{76}$$
In Corollary 9.16 we showed that $(\bar{p}, \bar{q})$ satisfies (76) if and only if $\bar{p}$, $\bar{q}$
are conservative strategies and $-v_1 = v_2$. Then, in addition, $-v_1 = v_2 = g(\bar{p}, \bar{q})$.
We can express this briefly by asserting that
$$\max_{q \in Q} \inf_{p \in P} g(p, q) = \min_{p \in P} \sup_{q \in Q} g(p, q) = g(\bar{p}, \bar{q}). \tag{77}$$
In a two-person zero-sum game, the individual game values are thus equal,
up to sign, to the winning of $P_2$. Since f = -g, each strategy pair (p, q) is trivially a
Pareto maximum as well.
Therefore, an important mathematical problem consists in verifying the
existence of saddle points. In Section 9.6 we proved the fundamental
existence theorem of J. von Neumann and several of its generalizations. In
this connection, P and Q must be convex sets. This condition is not fulfilled,
e.g., for finite sets. However, in Section 9.7 we have shown that the
convexity of P and Q can be effected by having each of the players make
his decisions only with certain probabilities. We delve into the solution of
max-inf and min-sup problems in the construction of conservative
strategies in Problem 49.14.
The concept of a Nash equilibrium point can easily be extended to n
players, parallel to (74). We shall consider this definition in Chapter 77 in
Part IV in connection with the important Nash existence theorem. In
mathematical economics there are a number of other definitions of
"equilibrium" which suit the various models, for instance, the Walras
equilibrium. In Chapter 77, we shall prove the main theorem on the existence of
Walras equilibria in connection with the fundamental Ky Fan inequality. A
detailed investigation of these questions can be found in Aubin (1979, M).

Table 37.1
(f, g)     q1         q2         q3
p1         (6,-3)     (-3,0)     (3,-3)
p2         (-3,2)     (5,-2)     (-4,-7)
In Chapter 49 we shall show that saddle points are of central importance
not only in game theory but also in duality theory.
References to the Literature
Classical works: von Neumann (1928); von Neumann and Morgenstern
(1944, M).
Introduction: Collatz and Wetterling (1966, M) (connection with the
theory of linear optimization).
Burger (1959, M); Owen (1968, M); Vorobjov (1970, S); Friedman (1971,
M), (1974, M) (differential games); Friedman (1975, M) (stochastic games);
Aubin (1979, M).
Applications to mathematical economics: von Neumann and
Morgenstern (1944, M); Karlin (1959, M); Aubin (1979, M).
History of game theory: Vorobjov (1975, M).
37.9. Duality between the Methods of Ritz and
Trefftz, Two-Sided Error Estimates
As in Section 37.5, we proceed from the minimum problem
$$\min_u\, J(u) - b(u) = \alpha, \quad u = 0 \text{ on } \partial G, \tag{78}$$
where
$$J(u) = \int_G 2^{-1} \sum_{i=1}^N (D_i u)^2\,dx,$$
$$b(u) = \int_G fu\,dx.$$
For a solution $u \in C^2(\bar{G})$, the following holds:
$$G:\ -\Delta u = f; \qquad \partial G:\ u = 0. \tag{79}$$
According to Trefftz, we consider, parallel to (78), the maximum problem
$$\max_v\, (-J(v)) = \beta, \quad -\Delta v = f \text{ on } G. \tag{78*}$$
To begin with, there exists a formal duality between these two problems:
(i) (78) contains the boundary condition in (79) as a side condition.
(ii) (78*) contains the differential equation in (79) as a side condition.
In Section 51.6 we shall prove the following within the context of a general
duality theory:
$$-J(v) \le \alpha = \beta \le J(u) - b(u), \tag{80}$$
$$C \int_G (u - \bar{u})^2\,dx \le J(u) - b(u) + J(v).$$
This holds for all u, v with
$$u, v \in C^2(\bar{G}), \quad u = 0 \text{ on } \partial G, \quad -\Delta v = f \text{ on } G. \tag{80a}$$
$\bar{u}$ denotes the solution of (78) and (79), and it turns out that $\bar{u}$ also solves
(78*). In (80), C > 0 is a constant.
From (80) we obtain practical error estimates for $\bar{u}$ and the minimal value
$\alpha$ by making use of test functions u and v for which (80a) holds. These error
estimates can be improved by calculating u (respectively, v) with the aid of
the Ritz method for (78) [respectively, (78*)] (cf. Chapter 18). The Ritz
method for (78*) is called the Trefftz method. The particularity of (80) and
(80a) is that one obtains lower bounds for $\alpha$ with the aid of (78*).
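The two-sided estimate (80) can be tried out on a small model case. The sketch below is an illustration under assumptions not made in the text: it uses the one-dimensional analogue $-u'' = f$ on (0, 1) with $f \equiv 1$ and $u(0) = u(1) = 0$, for which the solution is $\bar{u} = x(1 - x)/2$ and $\alpha = -1/24$. The Ritz test function vanishes on the boundary, while the Trefftz test function satisfies the differential equation but not the boundary condition; both choices are arbitrary.

```python
import math

def integrate(fun, n=20000):
    """Midpoint rule on (0, 1)."""
    h = 1.0 / n
    return h * sum(fun((i + 0.5) * h) for i in range(n))

f = lambda x: 1.0

def J(du):
    """J for a function with derivative du (one-dimensional analogue)."""
    return integrate(lambda x: 0.5 * du(x) ** 2)

def b(u):
    return integrate(lambda x: f(x) * u(x))

# Ritz side: u(x) = t*sin(pi*x) satisfies u = 0 at x = 0 and x = 1
t = 4.0 / math.pi ** 3
u = lambda x: t * math.sin(math.pi * x)
du = lambda x: t * math.pi * math.cos(math.pi * x)
upper = J(du) - b(u)

# Trefftz side: v(x) = -x^2/2 + 0.4*x satisfies -v'' = 1, but not v = 0 on the boundary
dv = lambda x: -x + 0.4
lower = -J(dv)

alpha = -1.0 / 24.0    # exact minimal value of the model problem
```

With these test functions the exact value $\alpha \approx -0.04167$ is enclosed between `lower` (about -0.04667) and `upper` (about -0.04106); better test functions on either side tighten the enclosure.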
In Section 51.7 we discuss similar results for quasilinear elliptic
differential equations, which result from general duality theory.
References to the Literature
Classical works: Trefftz (1927); Friedrichs (1929).
Courant and Hilbert (1953, M), Vol. I; Michlin (1962, M), (1969, M);
Michlin and Smolizki (1969, M) (numerical methods); Velte (1976, M).
37.10. Linear Optimization in UN, Lagrange
Multipliers, and Duality
We consider the linear optimization problem
N
inf £ c,«,- = a, u<=Ul, (81)
N
bj-lLdjMKO, j = 1,..., M.
(=i
Parallel to this, for existence theory, it turns out to be basic to study the
following dual problem:
$$\sup_\lambda \sum_{j=1}^M b_j \lambda_j = \beta, \qquad \lambda \in \mathbb{R}^M_+, \tag{81*}$$
$$\sum_{j=1}^M d_{ji} \lambda_j - c_i \le 0, \qquad i = 1,\dots,N.$$
Here, $u = (u_1,\dots,u_N)$, $\lambda = (\lambda_1,\dots,\lambda_M)$, and $u \in \mathbb{R}^N_+$ means that $u_i \ge 0$ for all
$i$. All $c_i$, $b_j$, and $d_{ji}$ are given real numbers; $u$ and $\lambda$ are to be found. The
manner of writing (81), (81*) is so chosen that in the next section the
connection with convex optimization becomes clear.
If we write $u \ge 0$ for $u \in \mathbb{R}^N_+$, then in matrix notation (81), (81*) read briefly
as follows:
$$\inf_u\,(c|u) = \alpha, \qquad u \ge 0, \qquad b - Du \le 0, \tag{81}$$
$$\sup_\lambda\,(b|\lambda) = \beta, \qquad \lambda \ge 0, \qquad D^*\lambda - c \le 0. \tag{81*}$$
If, after multiplying by $-1$, we formulate (81*) [respectively, (81)] as a
minimum problem (respectively, as a maximum problem), then, because
$D^{**} = D$, one immediately recognizes that (81) is the problem dual to (81*).
As the admissible region $U$ (respectively, $\Lambda$) of (81) [respectively, (81*)], we
denote the set of all $u$ (respectively, $\lambda$) that satisfy the side conditions in (81)
[respectively, (81*)].
We will now call the reader's attention to several phenomena that will
later lead to important generalizations.
Meaning of the Vertices of U. The geometric meaning of the problem
$$\min_u\, u_1 - 2u_2 + 4 = \alpha(\varepsilon), \tag{81a}$$
$$u_1 + u_2 \le 1 - \varepsilon, \qquad u \in \mathbb{R}^2_+$$
is as follows: One determines the shortest distance of the plane $E$: $z = u_1 - 2u_2 + 4$
from the $(u_1, u_2)$-plane over the admissible region $U$, which is a
triangle here (see Fig. 37.15).
Figure 37.15
It is intuitively clear that this minimal value is taken on at a vertex of $U$.
If we check all three vertices of $U$, then we obtain the solution
$$u(\varepsilon) = (0, 1 - \varepsilon), \qquad \alpha(\varepsilon) = 2 + 2\varepsilon \qquad \text{for all } \varepsilon \in [-2^{-1}, 2^{-1}].$$
That the minimal value is attained at vertices of the feasible region is typical
of linear optimization problems and forms the point of departure for
Dantzig's fundamental simplex algorithm. Here, the idea is that one
proceeds from one vertex to another so that the value of the objective
functional is always decreased. In this connection, compare the standard
work of Dantzig (1963, M).
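The vertex check for (81a) can be sketched in a few lines. This is a minimal illustration that simply enumerates the three vertices of the triangle; it is not the simplex algorithm itself, and the function names are hypothetical.

```python
def objective(u):
    """Objective functional of (81a)."""
    u1, u2 = u
    return u1 - 2 * u2 + 4

def vertices(eps):
    """Vertices of the triangle u1 >= 0, u2 >= 0, u1 + u2 <= 1 - eps (requires eps < 1)."""
    s = 1 - eps
    return [(0.0, 0.0), (s, 0.0), (0.0, s)]

def solve_by_vertices(eps):
    """Return the vertex with the smallest objective value and that value."""
    best = min(vertices(eps), key=objective)
    return best, objective(best)

for eps in (-0.5, 0.0, 0.5):
    u, alpha = solve_by_vertices(eps)
    print(eps, u, alpha)   # alpha agrees with 2 + 2*eps
```

For problems with many side conditions the number of vertices grows rapidly, which is why one moves from vertex to neighboring vertex instead of enumerating all of them.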
In Section 38.7 we generalize to linear optimization problems in locally
convex spaces the observation that the minimal value is attained at vertices.
To this end we shall use extreme points of convex compact sets U.
Stability of Perturbed Problems. Example (81a) is also remarkable in that
the minimal value $\alpha(\varepsilon)$ depends continuously differentiably on $\varepsilon$ in a
neighborhood of $\varepsilon = 0$. In Section 52.1 this phenomenon is the starting point
for the Rockafellar theory of stable optimization problems. Here, the role of
the $S$-function of the classical Hamilton-Jacobi theory in Section 37.4 is
taken over by $\alpha(\cdot)$.
In order to see the connection with the general formulation in Sections
52.1 and 52.2, we set
$$F(u) \overset{\text{def}}{=} \begin{cases} u_1 - 2u_2 + 4 & \text{for } u \in \mathbb{R}^2_+, \\ +\infty & \text{for } u \notin \mathbb{R}^2_+, \end{cases} \qquad H(v) \overset{\text{def}}{=} \begin{cases} 0 & \text{for } v \in \mathbb{R}^1_+, \\ +\infty & \text{for } v \notin \mathbb{R}^1_+. \end{cases}$$
With $S(\varepsilon) = \alpha(\varepsilon)$, (81a) is equivalent to
$$\min_{u \in \mathbb{R}^2} F(u) + H(1 - \varepsilon - u_1 - u_2) = S(\varepsilon).$$
Thus, upon introducing $F$ and $H$, there arises a problem over the entire
space. Later we shall use this device systematically.
Consistency, Existence, and Duality. An optimization problem is said to be
consistent if and only if the feasible region is not empty. This is a trivial
requirement for the existence of a solution. The question arises whether the
following is valid:
Consistency => Existence.
The simple example in $\mathbb{R}^1$,
$$-u = \min!, \qquad u \ge 0$$
shows that a consistent problem need not have a solution. However, the
following main theorem of linear optimization shows that the existence of
solutions for both problems follows from the consistency of the original
problem and of the dual problem.
Theorem 37.A. The following three assertions are equivalent:
(i) The original problem (81) has a solution.
(ii) The dual problem (81*) has a solution,
(iii) Both problems are consistent.
If any one of these conditions holds, then, moreover, $\alpha = \beta$.
We go into a short proof that follows from a separation theorem via
Farkas' lemma in Problem 50.4.
The duality assertion in Theorem 37.A is the model for a general duality
theory that we develop in Chapters 49-52, together with numerous
applications. The assertion is not preserved in the strong form given above in
infinite-dimensional spaces and in singular finite-dimensional situations. For
example, duality gaps may occur, i.e., it may happen that $\alpha > \beta$ or one of the
mutually dual problems has no solution. We give examples of this in
Problem 52.2. In Section 52.1 we establish the following general stability
principle:
Consistency of (P), (P*), Stability of (P*)
=> Existence of (P) and equality of the extreme values of (P) and (P*).
Here, (P) [respectively, (P*)] denotes the original (respectively, dual)
problem.
Lagrange Multiplier Method. We construct the Lagrange function
$$L(u, \lambda) = (c|u) + (\lambda|b - Du),$$
i.e., we add a term to the objective functional $(c|u)$, which takes the side
condition $b - Du \le 0$ into account, and instead of (81) we consider the new
minimum problem
$$\inf_u L(u, \lambda) = \alpha, \qquad u \in \mathbb{R}^N_+, \tag{82}$$
in which the side condition $b - Du \le 0$ no longer appears. The components
$\lambda_j$ of $\lambda$ are called the Lagrange multipliers.
Saddle Point Theorem. The following two assertions are equivalent:
(i) $u$ is a solution of the original problem (81), $\lambda$ is a solution of the dual
problem (81*), and for the extreme values we have $\alpha = \beta$.
(ii) $L$ has a saddle point $(u, \lambda)$ with respect to $\mathbb{R}^N_+ \times \mathbb{R}^M_+$, i.e., $(u, \lambda) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+$ and
$$L(u, \mu) \le L(u, \lambda) \le L(v, \lambda) \qquad \text{for all } (v, \mu) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+.$$
If either one of these conditions is fulfilled, then $u$ is a solution of (82).
We give the proof in Section 49.3 in a more general setting. This theorem
shows that one can also apply the Lagrange multiplier method to minimum
problems with inequalities as side conditions.
Furthermore, an interesting interpretation of the dual problem results: Its
solutions are precisely the Lagrange multipliers of the original problem.
We shall place a saddle point theorem of the above form at the pinnacle
of duality theory in Chapter 49.
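The saddle point property can be checked numerically on problem (81a). The following is a minimal sketch under stated assumptions: the constant 4 in the objective is dropped so that the problem has the form (81) with $c = (1, -2)$, $D = (-1, -1)$, $b = \varepsilon - 1$; the candidate multiplier $\lambda = 2$ comes from solving the dual side conditions $-\lambda - 1 \le 0$, $-\lambda + 2 \le 0$ by hand; and the inequalities are tested only on a finite grid, which is an illustration and not a proof.

```python
import itertools

C = (1.0, -2.0)    # objective coefficients of (81a) without the constant 4
D = (-1.0, -1.0)   # single row of the matrix D (M = 1, N = 2)

def lagrangian(u, lam, eps):
    """L(u, lambda) = (c|u) + lambda * (b - D u) with b = eps - 1."""
    b = eps - 1.0  # b - D u <= 0 is the same as u1 + u2 <= 1 - eps
    du = D[0] * u[0] + D[1] * u[1]
    return C[0] * u[0] + C[1] * u[1] + lam * (b - du)

def check_saddle(eps, grid=21, span=3.0):
    """Test L(u, mu) <= L(u, lam) <= L(v, lam) on a grid in R^2_+ x R_+."""
    u_bar, lam_bar = (0.0, 1.0 - eps), 2.0   # candidate saddle point
    mid = lagrangian(u_bar, lam_bar, eps)
    pts = [span * k / (grid - 1) for k in range(grid)]
    left = all(lagrangian(u_bar, mu, eps) <= mid + 1e-12 for mu in pts)
    right = all(mid <= lagrangian(v, lam_bar, eps) + 1e-12
                for v in itertools.product(pts, pts))
    return left and right, mid

ok, value = check_saddle(0.25)
primal = -2.0 * (1.0 - 0.25)   # (c|u) at u = (0, 1 - eps)
dual = (0.25 - 1.0) * 2.0      # (b|lambda) at lambda = 2
print(ok, value, primal, dual)
```

In this example the multiplier $\lambda = 2$ also equals the derivative of $\alpha(\varepsilon) = 2 + 2\varepsilon$ with respect to $\varepsilon$, which hints at the role of multipliers in the stability theory mentioned above.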
Linear optimization has numerous interesting applications to economics
and the natural sciences. In this connection, we recommend Dantzig (1963,
M), Collatz and Wetterling (1966, M), and Bronstein and Semendjaev
(1979, S).
References to the Literature
Classical work: Dantzig (1949) (simplex algorithm). The elements of linear
optimization theory are already contained in the book by Kantorovic (1939,
M), which has remained unnoticed for a long time.
Introduction: Collatz and Wetterling (1966, M) (emphasis on
applications); Bronstein and Semendjaev (1979, S) (handbook article).
Linear optimization and its applications: Dantzig (1963, M, B, H)
(standard work); Vogel (1967, M), Suhovickii and Avdejeva (1969, M); Glashoff
and Gustafson (1978, M); Foulds (1981, M).
37.11. Convex Optimization and Kuhn-Tucker
Theory
Parallel to the linear optimization problem (81), we consider the convex
optimization problem
$$\inf_u F(u) = \alpha, \qquad u \in \mathbb{R}^N_+, \tag{83}$$
$$F_j(u) \le 0, \qquad j = 1,\dots,M.$$
We assume that all $F, F_1,\dots,F_M\colon \mathbb{R}^N \to \mathbb{R}$ are convex. Motivated by Section
37.10, we construct the Lagrange function
$$L(u, \lambda) = \lambda_0 F(u) + \sum_{j=1}^M \lambda_j F_j(u),$$
where $\lambda = (\lambda_1,\dots,\lambda_M)$. All $\lambda_j$ are real numbers; they are called Lagrange
multipliers. In the nondegenerate case, $\lambda_0 = 1$. Therefore, we do not write
out the dependence of the function $L$ on $\lambda_0$ explicitly. The point of
departure for the theory is the saddle point formula
$$L(u, \mu) \le L(u, \lambda) \le L(v, \lambda) \qquad \text{for all } (v, \mu) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+. \tag{84}$$
Furthermore, the so-called Slater condition is of central significance:
There exists a $u_0$ in $\mathbb{R}^N_+$ such that $F_j(u_0) < 0$ for all $j$. (SC)
This condition assures the nondegenerate case $\lambda_0 = 1$.
Theorem 37.B (The Kuhn-Tucker Saddle Point Theorem (1951)). If (SC)
holds, then the following two assertions are equivalent:
(i) $u$ is a solution of the original problem (83).
(ii) $L$, with $\lambda_0 = 1$, has a saddle point $(u, \lambda)$ with respect to $\mathbb{R}^N_+ \times \mathbb{R}^M_+$, i.e.,
(84) holds and $(u, \lambda) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+$.
Corollary 37.21. If (SC) does not hold, then (i) still follows from
(ii). But (ii) follows from (i) only in a modified form, in that we replace $\lambda_0 = 1$
by
$$\lambda_0 \ge 0, \qquad \lambda_0^2 + \lambda_1^2 + \cdots + \lambda_M^2 \ne 0.$$
This means that $\lambda_0 = 0$ is possible, but not all the multipliers $\lambda_j$ are
simultaneously equal to zero.
We give the proof, which is based on a separation theorem, in
Section 47.10 in a more general context. With a view to later generalizations,
we will now give various equivalent formulations of (84). In this connection,
$\lambda_0$ can be chosen arbitrarily. (SC) is not assumed.
Let $(u, \lambda) \in \mathbb{R}^N_+ \times \mathbb{R}^M_+$. Then $(u, \lambda)$ is a saddle point of $L$ with respect to
$\mathbb{R}^N_+ \times \mathbb{R}^M_+$ if and only if any one of the following three conditions is fulfilled:
(1) Minimum problem without inequalities as side conditions: $u$ is a
solution of
$$\inf_{v \in \mathbb{R}^N_+} L(v, \lambda) = \alpha_1, \tag{85a}$$
where, in addition, the following holds:
$$\lambda_j F_j(u) = 0, \qquad F_j(u) \le 0, \qquad j = 1,\dots,M. \tag{85b}$$
(2) Local Kuhn-Tucker condition (variational inequalities):
$$\langle L_u(u, \lambda)|v - u\rangle \ge 0 \qquad \text{for all } v \in \mathbb{R}^N_+,$$
$$\langle L_\lambda(u, \lambda)|\mu - \lambda\rangle \le 0 \qquad \text{for all } \mu \in \mathbb{R}^M_+.$$
(3) Local Kuhn-Tucker condition (inequalities):
$$L_u(u, \lambda) \ge 0, \qquad L_\lambda(u, \lambda) \le 0,$$
$$\langle L_u(u, \lambda)|u\rangle = \langle L_\lambda(u, \lambda)|\lambda\rangle = 0.$$
In addition, in (2) and (3) it is assumed that $F, F_1,\dots,F_M$ have continuous
first partial derivatives; therefore, the F-derivatives $F', F_j'$ exist. Then we
have:
$$L_u(u, \lambda) = \lambda_0 F'(u) + \sum_{j=1}^M \lambda_j F_j'(u),$$
$$L_\lambda(u, \lambda) = (F_1(u),\dots,F_M(u))$$
and
$$F'(u) = (D_1 F(u),\dots,D_N F(u)), \qquad D_i = \frac{\partial}{\partial u_i}.$$
The proof of (1) is completely elementary. (85a) [respectively, (85b)]
follows from $L(u, \lambda) \le L(v, \lambda)$ [respectively, $L(u, \mu) \le L(u, \lambda)$] in (84).
The condition $\lambda_j F_j(u) = 0$ in (85b) means that $\lambda_j = 0$ when $F_j(u) < 0$. Then
one says that $\lambda_j$ is inactive. (2) is a special case of Theorem 46.A, (2) in
Section 46.1. Here, (2) is obtained immediately and directly if one sets
$$\varphi(t) = L(u + t(v - u), \lambda), \qquad \psi(t) = L(u, \lambda + t(\mu - \lambda)),$$
and takes into account the relations $\varphi'(0) \ge 0$, $\psi'(0) \le 0$ due to (84). The
equivalence of (2) and (3) again follows in a completely elementary way. To
this end, we choose $v = w + u$, $w \in \mathbb{R}^N_+$, $v = 2u$, $v = 0$, and analogously for $\mu$.
The conditions for $L_\lambda$ in (3) are equivalent to (85b). We recommend that the
reader carry out all these proofs as an exercise. We shall give these proofs
later in a more general setting.
The role of $\lambda_j$ as a Lagrange multiplier is clear in (85a). In contrast to the
original problem (83), the inequalities $F_j(v) \le 0$ do not appear as side
conditions, but instead $F$ is replaced by $L$.
The local Kuhn-Tucker condition in the form of the variational
inequality (2) has the advantage that it can also be applied to nonconvex problems.
We shall prove a general proposition in this direction in Section 48.4.
Roughly speaking, we get the following result:
(a) The local Kuhn-Tucker condition (in the form of variational
inequalities) is necessary for a solution of the original problem (83).
(b) This condition, with $\lambda_0 = 1$, is sufficient provided all functions are
convex.
(c) The Slater condition is needed in (a) to guarantee the nondegeneracy
$\lambda_0 = 1$.
We shall take up generalizations of the Kuhn-Tucker theory in Section
47.10 (connection with convex analysis), in Section 48.4 (general Lagrange
multiplier rule), and in Chapter 50 (general duality theory).
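The local Kuhn-Tucker condition (3) and the saddle point formula (84) can be tested on a small convex problem. The following is a minimal sketch; the objective $F(u) = (u_1 - 1)^2 + (u_2 - 2)^2$, the side condition $F_1(u) = u_1 + u_2 - 1 \le 0$, and the candidate pair $u = (0, 1)$, $\lambda = 2$ are hypothetical choices made for illustration (the Slater condition holds, e.g., with $u_0 = (1/4, 1/4)$). The saddle inequalities are checked only on a finite grid.

```python
import itertools

def F(u):
    return (u[0] - 1.0) ** 2 + (u[1] - 2.0) ** 2   # convex objective (assumed example)

def F1(u):
    return u[0] + u[1] - 1.0                        # convex side condition F1(u) <= 0

def L(u, lam):
    return F(u) + lam * F1(u)                       # Lagrange function with lambda_0 = 1

def kuhn_tucker_holds(u, lam, tol=1e-12):
    """Condition (3): L_u >= 0, L_lambda <= 0, <L_u|u> = <L_lambda|lambda> = 0."""
    L_u = (2.0 * (u[0] - 1.0) + lam, 2.0 * (u[1] - 2.0) + lam)
    L_lam = F1(u)
    return (min(L_u) >= -tol and L_lam <= tol
            and abs(L_u[0] * u[0] + L_u[1] * u[1]) <= tol
            and abs(L_lam * lam) <= tol)

def saddle_holds(u, lam, grid=31, span=3.0, tol=1e-12):
    """Test (84) on a finite grid of points v in R^2_+ and mu in R_+."""
    pts = [span * k / (grid - 1) for k in range(grid)]
    mid = L(u, lam)
    return (all(L(u, mu) <= mid + tol for mu in pts)
            and all(mid <= L(v, lam) + tol for v in itertools.product(pts, pts)))

u_star, lam_star = (0.0, 1.0), 2.0
print(kuhn_tucker_holds(u_star, lam_star), saddle_holds(u_star, lam_star))
```

Here the side condition is active at the solution ($F_1(u) = 0$), so the multiplier may be positive; a point such as $u = (0, 0)$ with $\lambda = 0$ fails the test because $L_u$ has negative components there.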
References to the Literature
Classical works: John (1948); Kuhn and Tucker (1951).
Introduction: Collatz and Wetterling (1966, M); Dixon (1980, M) (state
of the art).
Arrow, Hurwicz, and Uzawa (1958, M); Hadley (1963, M); Stoer and
Witzgall (1970, M); Kreko (1974, M); Martos (1975, M); Blum and Oettli
(1975, M,B); Elster (1977, M,B).
Numerical methods: Polak (1971, M); Grossmann and Kleinmichel (1976,
L); Psenicnyi and Danilin (1979, M); Fletcher (1980, M), Vols. I, II
(standard work) (also, cf. the references to the literature in Section 37.29).
Applications to mathematical economics: Karlin (1959, M); Aubin (1979,
M).
Nonlinear optimization and nonlinear approximation theory: Collatz and
Krabs (1973, M); Krabs (1975, M).
37.12. Approximation Theory, the Least-Squares
Method, Deterministic and Stochastic
Compensation Analysis
A fundamental problem of approximation theory reads as follows:
$$\min_{u \in M} \|b - u\| = \alpha, \tag{86}$$
i.e., we seek an element $u$ in the subset $M$ of the B-space $X$ that has minimal
distance from a given fixed element $b$ in $X$ (see Fig. 37.16). The following
are important problems in approximation theory:
(α) Characterization of the solution $u$.
(β) Determination of $\alpha$.
(γ) Construction of approximation methods and obtaining error estimates
for $\alpha$ and $u$.
In this connection, duality theory plays a special role (cf. Chapter 39). We
give numerous important examples of (86) in this section and in Sections
37.13-37.19.
Figure 37.16
The general significance of approximation theory in practice is that it
allows the optimal modelling of approximation processes, which form the
foundation of all numerical methods. A central problem in applied
mathematics is, say, the approximation of functions by simpler expressions, e.g.,
by polynomials or rational functions, in order to be able to calculate them
on computers. As we shall see, this is a special case of (86). As
a further example, we mention the construction of optimal quadrature
formulas for the approximate calculation of integrals (cf. Section 37.19). If
in (86) $M$ is the solution set of a differential or integral equation, then we
are dealing with a class of control problems, e.g., the control of a regulation
system with minimal expenditure of energy (cf. Section 37.13). For a general
control problem, the expression $\|b - u\|$ in (86) is replaced by a general
functional $F(u)$ (cf. Chapters 48 and 54). Also, many problems of
parameter identification that are of importance in engineering can be reduced to
(86) (cf. Section 37.15).
In this section we consider, as a special case of (86), the important
least-squares method. Let $u_1,\dots,u_n$ be fixed linearly independent elements
in a real H-space $X$ with the inner product $(\cdot|\cdot)$. Furthermore, let $M =
\operatorname{span}\{u_1,\dots,u_n\}$. Then (86) is equivalent to
$$\min_{u \in M} \|b - u\|^2 = \alpha^2, \tag{87a}$$
where
$$u = \sum_{i=1}^n c_i u_i, \qquad c_1,\dots,c_n \in \mathbb{R}. \tag{87b}$$
This is the abstract formulation of the least-squares method. It follows from
Theorem 22.A in Section 22.1 or from the results in Sections 39.2 and 39.3
that (87a) has exactly one solution $u$. If we set
$$F(c) \overset{\text{def}}{=} \Bigl(b - \sum_{i=1}^n c_i u_i \Bigm| b - \sum_{i=1}^n c_i u_i\Bigr),$$
then (87a) is equivalent to
$$F(c) = \min!, \qquad c \in \mathbb{R}^n.$$
If $c$ is a solution, then all first partial derivatives of $F$ vanish at this point,
i.e.,
$$\Bigl(b - \sum_{i=1}^n c_i u_i \Bigm| u_j\Bigr) = 0, \qquad j = 1,\dots,n. \tag{88}$$
This is a system of linear equations for determining $c_1,\dots,c_n$. The coefficient
determinant $G = \det\{(u_i|u_j)\}$ is called the Gram determinant. Because of
the linear independence of the $u_i$, $G \ne 0$, i.e., (88) has a unique solution $c$. If
$(u_i)$ forms an orthonormal system, i.e., $(u_i|u_j) = \delta_{ij}$, $i, j = 1,\dots,n$, then from
(88) we obtain $c_j = (b|u_j)$. Thus, for the solution of (87) we have
$$u = \sum_{j=1}^n (b|u_j)\,u_j. \tag{89}$$
Equation (88) means that $b - u$ is perpendicular to all $u_j$ and therefore is
perpendicular to $M$, i.e., the solution $u$ is the orthogonal projection of $b$ on
$M$ (see Fig. 37.17).
Figure 37.17
We now consider four typical applications of the functional analytic
results to (87).
Example 37.22 (Deterministic Compensation Analysis). The problem is
$$\sum_{r=1}^k \bigl(b_r - u(x_r)\bigr)^2 = \min!, \tag{90}$$
where
$$u(x) = \sum_{i=1}^n c_i u_i(x),$$
and has the following interpretation: Suppose $k$ measurement data $(x_r, b_r)$
are given. We seek a function $y = u(x)$ as a linear combination of the
functions $y = u_i(x)$ which optimally fits the measurement data in the sense
of (90) (see Fig. 37.18). Here we are dealing with a special case of (87) with
$$X = \mathbb{R}^k, \qquad b = (b_1,\dots,b_k), \qquad u_i = (u_i(x_1),\dots,u_i(x_k)).$$
Figure 37.18
Equation (88) for determining $c_1,\dots,c_n$ reads as follows:
$$\sum_{r=1}^k \Bigl(b_r - \sum_{i=1}^n c_i u_i(x_r)\Bigr) u_j(x_r) = 0, \qquad j = 1,\dots,n.$$
This method is very frequently applied in all areas of the natural sciences,
engineering, medicine, economics, social sciences, etc. The abundance of
empirical laws which have been discovered by the adjustment of
measurement data in astronomy is fascinating. For example, one can infer from
the period-brightness relation of periodically luminous $\delta$ Cephei stars
the distance of galaxies up to $10^6$ light years away. With the aid of the
cosmological red shift that follows from general relativity theory and the
empirically determined Hubble constant, even distances of up to $10^{10}$ light
years have been measured (recession of the galaxies). Furthermore, the
calculation of double star trajectories is based on compensation analysis.
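The normal equations (88) for fitting a straight line $u(x) = c_1 + c_2 x$ to measurement data can be sketched as follows. This is a minimal illustration with $n = 2$, $u_1(x) = 1$, $u_2(x) = x$; the data values and function names are hypothetical.

```python
def inner(p, q):
    """Inner product in R^k."""
    return sum(pi * qi for pi, qi in zip(p, q))

def fit_two_functions(u1, u2, b):
    """Solve the 2x2 normal equations (88) for basis vectors u1, u2 in R^k."""
    g11, g12, g22 = inner(u1, u1), inner(u1, u2), inner(u2, u2)
    r1, r2 = inner(b, u1), inner(b, u2)
    gram_det = g11 * g22 - g12 * g12          # Gram determinant G
    if gram_det == 0:
        raise ValueError("u1 and u2 are linearly dependent")
    c1 = (r1 * g22 - r2 * g12) / gram_det     # Cramer's rule
    c2 = (g11 * r2 - g12 * r1) / gram_det
    return c1, c2

xs = [0.0, 1.0, 2.0, 3.0]          # hypothetical measurement points x_r
bs = [1.1, 2.9, 5.2, 6.8]          # hypothetical measured values b_r
u1 = [1.0 for _ in xs]             # u_1(x) = 1 evaluated at the x_r
u2 = list(xs)                      # u_2(x) = x evaluated at the x_r
c1, c2 = fit_two_functions(u1, u2, bs)
residual = [b - (c1 + c2 * x) for x, b in zip(xs, bs)]
print(c1, c2, inner(residual, u1), inner(residual, u2))
```

The last two printed numbers are zero up to rounding, which reflects the fact that the residual $b - u$ is perpendicular to $M = \operatorname{span}\{u_1, u_2\}$.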
Example 37.23 (Fourier Series). The continuous analogue of (90) reads as
follows:
$$\int_\alpha^\beta \bigl(b(x) - u(x)\bigr)^2\,dx = \min!, \tag{91a}$$
where
$$u(x) = \sum_{i=1}^n c_i u_i(x), \tag{91b}$$
i.e., the function $b$ is to be optimally approximated by a linear combination
of the functions $u_i$ in the sense of (91a). This problem corresponds to (87)
for
$$X = L_2(\alpha, \beta), \qquad (u|v) = \int_\alpha^\beta uv\,dx.$$
In the classical special case $\alpha = 0$, $\beta = 2\pi$, with $(u_1,\dots,u_{2k+1}) =
\pi^{-1/2}(2^{-1/2}, \sin x,\dots,\sin kx, \cos x,\dots,\cos kx)$, we have $(u_i|u_j) = \delta_{ij}$, and
the solution (91b) with $c_i = (b|u_i)$ corresponds to a partial sum of the
Fourier series for $b$ (here $n = 2k + 1$).
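The coefficients $c_i = (b|u_i)$ with respect to the orthonormal trigonometric system can be approximated by numerical quadrature. The following is a minimal sketch, assuming the hypothetical function $b(x) = x$ on $[0, 2\pi]$ and a simple midpoint rule; the function names and the number of quadrature points are illustrative choices.

```python
import math

def inner(f, g, a=0.0, b=2.0 * math.pi, n=20000):
    """Midpoint-rule approximation of (f|g) = int_a^b f(x) g(x) dx."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) * g(a + (i + 0.5) * h) for i in range(n))

def basis(k):
    """Orthonormal system pi^(-1/2) * (2^(-1/2), sin x, ..., sin kx, cos x, ..., cos kx)."""
    s = 1.0 / math.sqrt(math.pi)
    fs = [lambda x: s / math.sqrt(2.0)]
    fs += [lambda x, m=m: s * math.sin(m * x) for m in range(1, k + 1)]
    fs += [lambda x, m=m: s * math.cos(m * x) for m in range(1, k + 1)]
    return fs

def partial_sum(b, k):
    """Return the best approximation (89) and its coefficients c_i = (b|u_i)."""
    us = basis(k)
    cs = [inner(b, u) for u in us]
    return (lambda x: sum(c * u(x) for c, u in zip(cs, us))), cs

approx, coeffs = partial_sum(lambda x: x, 3)
print(coeffs)
```

For $b(x) = x$ the exact coefficient of $\pi^{-1/2}\sin x$ is $-2\sqrt{\pi}$, and the cosine coefficients vanish, which the computed values reproduce up to the quadrature error.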
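Example 37.24 (Compensation Analysis for Random Variables). We now
consider (87) with
$$X = L_2(\Omega, \mu), \qquad (u|v) = \int_\Omega uv\,d\mu;$$
thus
$$\int_\Omega \bigl(b(\omega) - u(\omega)\bigr)^2\,d\mu = \min!, \tag{92}$$
$$u(\omega) = \sum_{i=1}^n c_i u_i(\omega).$$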
In order to explain the probability theoretic meaning, we remind the reader
of several fundamental concepts from probability. A probability space
$(\Omega, \mathfrak{A}, \mu)$ consists of the set $\Omega$, a $\sigma$-algebra $\mathfrak{A}$ of subsets of $\Omega$, and a measure
$\mu$ on the sets of $\mathfrak{A}$ such that $0 \le \mu(A) \le 1$ for all $A \in \mathfrak{A}$ and $\mu(\Omega) = 1$.
Elements $\omega$ of $\Omega$ are called elementary events and are interpreted as
possible results of a random experiment. $\mu(A)$ is the probability that in the
random experiment one of the outcomes $\omega$ in $A$ occurs. The sets $A$ in $\mathfrak{A}$ are
called events. For example, if a homogeneous (fair) die is tossed, $\Omega =
\{\omega_1,\dots,\omega_6\}$, $\mu(\omega_i) = \frac{1}{6}$. Here, $\omega_n$ means that the number $n$ appears ($n =
1,\dots,6$). The set $A = \{\omega_1, \omega_2\}$ with $\mu(A) = \frac{1}{3}$ corresponds to the event that 1
or 2 appears. Here $\mathfrak{A}$ is equal to the set of all subsets of $\Omega$. If a needle is
tossed onto a square $Q$, $\Omega = Q$, the points of $Q$ are the elementary events
(targets of the point of the needle), $\mu$ equals the Lebesgue measure, and $\mathfrak{A}$
consists of all Lebesgue-measurable subsets of $Q$.
The measurability of functions $f\colon \Omega \to \mathbb{R}$ and the integral $\int_\Omega f\,d\mu$ are
explained analogously to $A_2(4)$ and $A_2(13)$, respectively. Parallel to $L_2(G)$,
$L_2(\Omega, \mu)$ consists of exactly all measurable functions $f\colon \Omega \to \mathbb{R}$ such that
$$\int_\Omega f^2\,d\mu < \infty.$$
$L_2(\Omega, \mu)$ is an H-space with the inner product
$$(f|g) = \int_\Omega fg\,d\mu,$$
where functions that differ on a set of $\mu$-measure zero on $\Omega$ are identified.
The measurable functions $f\colon \Omega \to \mathbb{R}$ are called probabilistic (or stochastic)
random variables. $f(\omega)$ is interpreted as the observed measurement value of
$f$ when the elementary event $\omega$ occurs. For instance, in the die experiment,
the number $n$ shown by the die is a random variable, i.e., $f(\omega_n) = n$. Let
$$A = \{\omega \in \Omega\colon a \le f(\omega) \le b\}.$$
Then $\mu(A)$ is the probability that the measurement outcome $f(\omega)$ lies in
$[a, b]$. We define the expected value $E[f]$ and the dispersion $D^2[f]$ of a
random variable $f$ to be
$$E[f] = \int_\Omega f\,d\mu, \qquad D^2[f] = \int_\Omega (f - E[f])^2\,d\mu.$$
The basic significance of these two information quantities of $f$ results from
the Chebyshev inequality:
$$\mu(A) \ge 1 - a^{-2} D^2[f], \qquad A \overset{\text{def}}{=} \{\omega \in \Omega\colon |f(\omega) - E[f]| < a\}$$
for all $a > 0$. This means: The probability that the measurement value $f(\omega)$
differs by less than $a$ from the expected value $E[f]$ is at least
$1 - a^{-2} D^2[f]$. The dispersion $D^2[f]$ is also designated as the variance,
$\operatorname{Var}[f]$. If $f, g\colon \Omega \to \mathbb{R}$ are two random variables, then their covariance,
$\operatorname{Cov}(f, g)$, is defined to be the number
$$\operatorname{Cov}(f, g) = (f - E[f]\,|\,g - E[g]).$$
Note that $\operatorname{Cov}(f, f) = \operatorname{Var}[f]$.
Problem (92) thus means that one must approximate a random variable $b$
by a linear combination $u$ of random variables $u_i$ so that $\operatorname{Var}[u - b]$ is
minimal. In the special case $n = 2$, $u_1 = 1$, $u_2$ arbitrary, the solution of (92)
leads to
$$u = a + r\sigma\sigma_2^{-1}(u_2 - a_2).$$
Here, $a$ (respectively, $a_2$) [as well as $\sigma^2$ (respectively, $\sigma_2^2$)] is the expected
value (as well as the dispersion) of $b$ (respectively, $u_2$), and the number
$$r \overset{\text{def}}{=} \sigma^{-1}\sigma_2^{-1} E[(b - a)(u_2 - a_2)]$$
is called the correlation coefficient. This number $r$, with $-1 \le r \le 1$, is a
basic measure in applications of the extent to which $b$ depends linearly on $u_2$.
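On a finite probability space the quantities above reduce to finite sums, so the regression formula can be checked directly. The following is a minimal sketch for the fair die; the random variable $b(\omega_n) = n^2$ to be approximated and the function names are hypothetical choices for illustration.

```python
import math

OMEGA = range(1, 7)                   # outcomes of a fair die
MU = {w: 1.0 / 6.0 for w in OMEGA}    # probability measure mu

def expect(f):
    """Expected value E[f] as a finite sum."""
    return sum(f(w) * MU[w] for w in OMEGA)

def cov(f, g):
    """Covariance Cov(f, g) = (f - E[f] | g - E[g])."""
    ef, eg = expect(f), expect(g)
    return expect(lambda w: (f(w) - ef) * (g(w) - eg))

def regression(b, u2):
    """Best approximation u = a + r*sigma/sigma2*(u2 - a2) and the correlation coefficient r."""
    a, a2 = expect(b), expect(u2)
    sigma, sigma2 = math.sqrt(cov(b, b)), math.sqrt(cov(u2, u2))
    r = cov(b, u2) / (sigma * sigma2)
    return (lambda w: a + r * sigma / sigma2 * (u2(w) - a2)), r

b = lambda w: float(w * w)    # hypothetical random variable to be approximated
u2 = lambda w: float(w)       # number shown by the die
u, r = regression(b, u2)
print(r, expect(lambda w: b(w) - u(w)), expect(lambda w: (b(w) - u(w)) * u2(w)))
```

The last two printed values vanish up to rounding: the error $b - u$ is orthogonal to $u_1 = 1$ and to $u_2$ in $L_2(\Omega, \mu)$, which is exactly equation (88) for this example.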
Example 37.25 (Compensation Analysis for Stochastic Processes). We
consider the basic model (87) with $X = L_2(\alpha, \beta)$, i.e.,
$$\int_\alpha^\beta \bigl(b(t;\omega) - u(t;\omega)\bigr)^2\,dt = \min!, \tag{93}$$
$$u(t;\omega) = \sum_{i=1}^n c_i(\omega)\,u_i(t) \qquad \text{for all } \omega \in \Omega.$$
Here, $b$ is a given stochastic process which is to be approximated by the
stochastic process $u$, and $u_i$ does not depend on randomness. We recall that
a stochastic process $b\colon [\alpha, \beta] \times \Omega \to \mathbb{R}$ is understood to be a mapping which
is a random variable for each fixed $t$. If $\omega$ is kept fixed, then $t \mapsto b(t;\omega)$ can
be interpreted as the measurement curve of a random process that depends
on time (e.g., daily temperature change). For that reason, one also
designates stochastic processes as random functions. Dependence on chance is
emphasized by the dependence on $\omega$. In conjunction with (87), the solution
of (93) reads as follows:
$$c_j(\omega) = \sum_{i=1}^n a_{ij} \int_\alpha^\beta u_i(t)\,b(t;\omega)\,dt.$$
Here, all $a_{ij}$ are independent of $\omega$ by (88), i.e., independent of chance.
Consequently, under appropriate regularity assumptions on $b$, the following
holds for the expected values:
$$E[c_j] = \sum_{i=1}^n a_{ij} \int_\alpha^\beta u_i(t)\,E[b(t)]\,dt.$$
Therefore, as an approximation to $b$, one chooses the average measurement
curve:
$$u(t) = \sum_{i=1}^n E[c_i]\,u_i(t).$$
In Section 37.25 we treat additional methods for the approximation of
stochastic processes that are basic in practice.
References to the Literature
Approximation theory: Cheney (1966, M); Holmes (1972, M); Laurent
(1972, M); Collatz and Krabs (1973, M); Dreszer (1975, M) (handbook
article).
Least-squares method: Linnik (1961, M); Schmetterer (1966, M)
(statistics); Luenberger (1969, M); Rozanov (1975, M).
Compensation analysis and applications: Grossmann (1969, M); Ludwig
(1969, M).
Factor analysis and its applications in statistics: Uberla (1968, M); Focke
(1984, S).
Applications in meteorology: Bengtsson (1981, P).
(Compare, also, the references to the literature in Section 37.25.)
37.13. Approximation Theory and Control Problems
In order to explain the basic idea, we consider the problem:
$$\int_0^T w^2(t)\,dt = \min!, \tag{94a}$$
$$m x''(t) + a x'(t) = w(t), \tag{94b}$$
$$x(0) = x'(0) = 0, \qquad x(T) = x_0, \qquad x'(T) = 0$$
with the following interpretation: The function $t \mapsto x(t)$ describes the
motion of a mass point (e.g., a car) of mass $m$ under the influence of a
control force $w$. Here, $a$ denotes friction. In the sense of (94a), with minimal
expenditure of force, the situation is to be achieved that a point which is at
rest at time $t = 0$ is to arrive at a given fixed time $T$ at $x_0$ with the velocity
zero. We seek a control force $w$ with this property. For the sake of
simplification, we set $a = m = T = x_0 = 1$.
Example 37.26. The optimal force function is
$$w(t) = \frac{1 + e - 2e^t}{3 - e}.$$
PROOF. Let $X = L_2(0, 1)$. The solution of (94b) yields
$$x(1) = \int_0^1 yw\,dt, \qquad y(t) = 1 - e^{t-1},$$
$$x'(1) = \int_0^1 zw\,dt, \qquad z(t) = e^{t-1}.$$
Thus, (94) reads as follows:
$$\|w\|^2 = \min!, \qquad (y|w) = 1, \qquad (z|w) = 0.$$
In order to obtain homogeneous side conditions, we choose $b \in X$ such that
$(y|b) = 1$, $(z|b) = 0$. Let $N = \operatorname{span}\{y, z\}$ and let $N^\perp$ denote the orthogonal
complement to $N$ in $X$; then, with $w = b - v$, the problem that arises is
$$\|b - v\|^2 = \min!, \qquad v \in N^\perp.$$
According to Section 37.12, this problem has exactly one solution $v$, where
$w = b - v$ is perpendicular to $N^\perp$; therefore, it belongs to $N$, i.e., $w = c_1 y +
c_2 z$. We thus obtain a problem of the type $F(c_1, c_2) = \min!$ Setting the first
partial derivatives equal to zero yields the assertion. □
It is left to the reader to carry out the calculations as an exercise. In
Section 37.21 we consider a more complicated control problem, where a
completely different optimal control (the bang-bang principle) arises.
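The boundary conditions for a candidate control can be verified by integrating (94b) numerically. The following is a minimal sketch with $m = a = T = x_0 = 1$; the candidate $w(t) = (1 + e - 2e^t)/(3 - e)$ was obtained here by carrying out the calculation $w = c_1 y + c_2 z$ with $(y|w) = 1$, $(z|w) = 0$ and should be regarded as an assumption of this sketch, as are the function names and the step count.

```python
import math

def w(t):
    """Candidate optimal force for m = a = T = x0 = 1 (assumed closed form)."""
    return (1.0 + math.e - 2.0 * math.exp(t)) / (3.0 - math.e)

def simulate(force, steps=2000):
    """Integrate x'' + x' = force(t) on [0, 1] by classical RK4 with x(0) = x'(0) = 0."""
    h = 1.0 / steps
    x, v = 0.0, 0.0

    def rhs(t, x, v):
        # First-order system: x' = v, v' = force(t) - v
        return v, force(t) - v

    for i in range(steps):
        t = i * h
        k1 = rhs(t, x, v)
        k2 = rhs(t + h / 2, x + h / 2 * k1[0], v + h / 2 * k1[1])
        k3 = rhs(t + h / 2, x + h / 2 * k2[0], v + h / 2 * k2[1])
        k4 = rhs(t + h, x + h * k3[0], v + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        v += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, v

x_end, v_end = simulate(w)
print(x_end, v_end)   # should be close to x(1) = 1 and x'(1) = 0
```

The force is positive at first and negative towards the end: the point is accelerated and then braked so that it arrives at rest.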
References to the Literature
Luenberger (1969, M) (cf. also, Sections 37.21 and 37.24).
37.14. Pseudoinverses, Ill-Posed Problems and
Tihonov Regularization
37.14a. Pseudoinverses
We proceed from the operator equation
$$Au = b. \tag{95}$$
In this connection, let $A\colon D(A) \subseteq X \to Y$ be a closed linear operator. Let $X$
and $Y$ be real H-spaces. We recall that every continuous linear operator $A\colon
X \to Y$ is closed (cf. $A_1(39)$). In order to make the following general
considerations concrete, we formulate two important special cases of (95).
Example 37.27. $A$ is a real $m \times n$ matrix and
$$X = \mathbb{R}^n, \qquad Y = \mathbb{R}^m.$$
Example 37.28. $A\colon X \to X$ is an integral operator of the first kind,
$$\int_0^1 k(t, s)\,u(s)\,ds = b(t) \qquad \text{for all } t \in [0, 1], \tag{96}$$
with continuous kernel $k\colon [0,1] \times [0,1] \to \mathbb{R}$. Here, let $X = L_2(0, 1)$. If the
upper limit of integration is replaced by $t$, then the result is a Volterra
integral equation of the first kind.
Our goal is to construct solutions and generalized solutions of (95) with
the aid of the least-squares method and to present the basic idea of a
numerically stable method for the solution of unstable problems by means
of the Tihonov regularization method. Such unstable or ill-posed problems
occur frequently when one possesses too much or too little information
about the object being investigated. For this purpose, instead of (95), we
consider
$$\min_{u \in D(A)} \|Au - b\|^2 = \alpha. \tag{97}$$
Furthermore, we designate by $P\colon Y \to \overline{R(A)}$ the orthogonal projection
operator of $Y$ on $\overline{R(A)}$, and formulate the new problem
$$Au = Pb, \qquad u \in X. \tag{98}$$
Finally, we set $D(A^+) = \{b \in Y\colon Pb \in R(A)\}$.
Proposition 37.29. With the assumptions made above for $A$, $X$, and $Y$, (97)
and (98) are mutually equivalent for all $b \in D(A^+)$ and possess a nonempty
convex closed solution set $L$. Therefore, $L$ contains exactly one element $u_R$ with
minimal norm.
Definition 37.30. We set $A^+ b = u_R$ and call the operator $A^+\colon D(A^+) \subseteq Y \to X$
the pseudoinverse of $A$. Furthermore, $u_R$ is called the normal solution of
(95).
For $b \in R(A)$, $u_R$ is obviously a solution of (95).
Proof. Instead of (97), we study
$$\min_{v \in \overline{R(A)}} \|v - b\|^2 = \alpha.$$
According to Proposition 21.28, there exists exactly one solution $v$ and
$v = Pb$. Furthermore, $L = \{u \in D(A)\colon Au = v\}$ and $L$ is convex and closed.
By virtue of Proposition 38.15 and Theorem 39.B in Section 39.2, the
problem
$$\min_{u \in L} \|u\| = \beta$$
has exactly one solution $u_R$. □
The concept of a pseudoinverse plays a central role in modern numerical
mathematics. The designation is justified by the fact that, on $R(A)$, $A^+$ is
equal to the inverse operator when the latter exists. However, $A^+$ is also
defined in more general cases, e.g., for rectangular matrices or for systems of
equations that do not have a solution at all. In this case the normal solution
$u_R = A^+ b$ is a generalized solution of (95).
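Example 37.31. Let $X = Y = \mathbb{R}^2$ and
$$A = \begin{pmatrix} 1 & 1 \\ 0 & 0 \end{pmatrix}, \qquad u = \begin{pmatrix} x \\ y \end{pmatrix}, \qquad b = \begin{pmatrix} a \\ a \end{pmatrix}, \qquad Pb = \begin{pmatrix} a \\ 0 \end{pmatrix},$$
where $a \ne 0$. Then the equation $Au = b$ has no solution $u$. All solutions of
$Au = Pb$ are obtained by $u = (a - y, y)$ for arbitrary $y$. The normal solution
$u_R = (a/2, a/2)$ follows from $\|u\|^2 = (a - y)^2 + y^2$.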
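The normal solution can be checked by comparing norms along the solution set of $Au = Pb$. The following is a minimal sketch; the matrix $A$ with rows $(1, 1)$ and $(0, 0)$ and the right-hand side $b = (a, a)$ are taken as assumptions of this sketch, and the value $a = 3$ and the function names are illustrative.

```python
def residual_sq(u, a):
    """||Au - b||^2 for A = [[1, 1], [0, 0]] and b = (a, a) (assumed data)."""
    return (u[0] + u[1] - a) ** 2 + (0.0 - a) ** 2

def norm_sq(u):
    return u[0] ** 2 + u[1] ** 2

def normal_solution(a):
    """Minimal-norm element of the solution set {(a - y, y)} of x + y = a."""
    # ||u||^2 = (a - y)^2 + y^2 is minimal where -2(a - y) + 2y = 0, i.e., y = a/2
    y = a / 2.0
    return (a - y, y)

a = 3.0
u_R = normal_solution(a)
others = [(a - y, y) for y in (-1.0, 0.0, 1.0, 2.0, 4.0)]
print(u_R, residual_sq(u_R, a), [norm_sq(u) for u in others])
```

All points $(a - y, y)$ give the same minimal residual $a^2$, so the least-squares problem alone does not single out a solution; the additional requirement of minimal norm does.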
Numerous applications of pseudoinverses to integral equations of the first
kind, control problems, parameter identification, linear optimization, game
theory, networks, statistics, and compensation analysis can be found in
Nashed (1976, M, B). In Section 37.15 we will explain the connection with
parameter identification.
37.14b. Well-Posed and Ill-Posed Problems
We begin with a basic definition.
Definition 37.32. The problem $Au = b$, $u \in D(A)$, is well posed if and only if
the linear or nonlinear operator $A\colon D(A) \subseteq X \to Y$ is stable, i.e., there is a
number $c > 0$ such that
$$\|Au - Av\| \ge c\|u - v\| \qquad \text{for all } u, v \in D(A). \tag{99}$$
In this connection, let $X$ and $Y$ be B-spaces. Otherwise we say that the
problem is ill posed.
This is a central concept of numerical mathematics.
From (99) it follows directly that the equation $Au = b$ has, for each
$b \in R(A)$, exactly one solution and the solution is stable, i.e., for each $\varepsilon > 0$
there exists a $\delta(\varepsilon) > 0$ such that $\|b_1 - b_2\| < \delta(\varepsilon)$, where $b_1, b_2 \in R(A)$,
always implies that $\|u_1 - u_2\| < \varepsilon$ for the corresponding solutions $u_1, u_2$.
Furthermore, (99) shows that $R(A)$ is closed.
First we introduce two prototypes for a well-posed and an ill-posed
problem and consider, for this purpose,
$$Au = b, \qquad u \in X. \tag{100}$$
Proposition 37.33. Let $X$ and $Y$ be B-spaces and let $A, B\colon X \to Y$ be
continuous linear operators. Then:
(1) The problem (100) is well posed when $A$ is bijective. If we replace the
operator $A$ in (100) by $A + B$ with $\|B\| < \|A^{-1}\|^{-1}$, then the resulting
problem is also well posed.
(2) The problem (100) is ill posed when $A$ is compact and $\dim R(A) = \infty$.
Proof.
(1) According to the open mapping theorem, $A_1(36)$, $A^{-1}\colon Y \to X$ is a
continuous linear operator. The rest follows from Problem 1.7.
(2) It suffices to prove that $R(A)$ is not closed. Suppose, on the contrary,
that $R(A)$ is closed. Because of the fact that the null space $N(A)$ is closed,
the factor space $X/N(A)$ is a B-space. The elements of $X/N(A)$ are the sets
$[u] \overset{\text{def}}{=} u + N(A)$ having the norm
$$\|[u]\| = \inf_{v \in [u]} \|v\|. \tag{101}$$
If we set $A_1[u] \overset{\text{def}}{=} Au$, then $A_1\colon X/N(A) \to R(A)$ is linear, continuous, and
bijective. Consequently, $A_1^{-1}\colon R(A) \to X/N(A)$ is also continuous.
Therefore, $A_1^{-1}(K)$ is bounded, where $K = \{u \in R(A)\colon \|u\| \le 1\}$. Thus, by (101),
there exists a bounded set $M$ in $X$ with $A(M) = K$. Since $A$ is compact, $K$ is
compact. By virtue of $A_1(37)$, $\dim R(A) < \infty$. However, this is a
contradiction. □
Example 37.34. Linear systems of equations with nonsquare coefficient matrices
or with square noninvertible coefficient matrices are ill-posed problems.
Example 37.35. According to Proposition 37.33, (2), the integral equation
problems of the first kind in Example 37.28 are also ill posed. We explain
this explicitly on the basis of the simple special case
$$\int_0^t \frac{(t - s)^{n-1}}{(n - 1)!}\,u(s)\,ds = b(t) \qquad \text{for all } t \in [0, 1], \tag{102a}$$
where $n = 1, 2, \dots$. For $u \in C[0, 1]$, this problem is equivalent to
$$b^{(n)}(t) = u(t), \qquad b(0) = b'(0) = \cdots = b^{(n-1)}(0) = 0. \tag{102b}$$
The process of differentiation in (102b) is unstable to a high degree. Small
changes in $b$ can cause large changes in $b^{(n)}$ and thus in $u$. One recognizes
this immediately for large $N$ in the example $b(t) = \sin Nt$, $b'(t) = N\cos Nt$.
If, on the basis of the round-off error, $b$ does not lie in $C^n[0, 1]$, then (102a)
has no solution $u \in C[0, 1]$ at all.
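The instability of differentiation can be made quantitative with the example $\sin Nt$. The following is a minimal sketch that compares the sup norm of a perturbation $\delta \sin Nt$ with the sup norm of its derivative on $[0, 1]$; the amplitude $\delta$, the sampling grid, and the function names are illustrative assumptions.

```python
import math

def sup_norm(f, n=10001):
    """Approximate the sup norm on [0, 1] by sampling on a uniform grid."""
    return max(abs(f(i / (n - 1))) for i in range(n))

def amplification(N, delta=1e-3):
    """Ratio of the sup norms of the perturbation's derivative and the perturbation."""
    pert = lambda t: delta * math.sin(N * t)         # perturbation of b
    dpert = lambda t: delta * N * math.cos(N * t)    # resulting perturbation of b'
    return sup_norm(dpert) / sup_norm(pert)

for N in (10, 100, 1000):
    print(N, amplification(N))
```

The ratio grows like $N$, independently of $\delta$: an arbitrarily small perturbation of the data $b$ can change $b'$ by an arbitrarily large factor, so no constant $c > 0$ as in (99) exists for this problem.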
As a result of the numerical instability, in practice one cannot solve an
ill-posed problem directly by means of an approximation method. The
round-off errors that appear will completely falsify the result in general. For
that reason, for a long time only well-posed problems were considered.
However, there exist numerous important problems in the natural sciences
that are ill posed. These include, for instance, inverse
problems in which one wishes to infer intrinsic properties of a system
from observation data or to infer the state of the system at an earlier point
in time. To these belong the problems of prospecting for the earth's resources
by measurements on the surface or the determination of the temperature
field in a body at time $t = 0$, knowing the temperature field at the time
$t = t_0$, $t_0 > 0$. In Section 37.15 we shall discuss the large class of problems of
parameter identification.
37.14c. Tihonov Regularization
It is now extraordinarily remarkable that with the aid of the so-called
Tihonov regularization usable approximation methods can be given for
ill-posed problems. The simple basic idea, which we shall carry out more
exactly in Section 46.8, is the following. Instead of the original equation
$$Au = b, \qquad u \in X, \tag{103a}$$
we consider the problem perturbed, say, by round-off errors,
$$Au_\delta = b_\delta, \qquad u_\delta \in X, \tag{103b}$$
where $\|b - b_\delta\| \le \delta$, and the corresponding regularized problem
$$\min_{u_\delta \in X} 2^{-1}\|Au_\delta - b_\delta\|^2 + \gamma F(u_\delta) = \alpha, \tag{103c}$$
where $\gamma > 0$. Here, $A\colon X \to Y$ is a continuous linear operator, $X$ and $Y$ are
real H-spaces, and $F\colon X \to \mathbb{R}$ is a G-differentiable functional. For example,
one can choose $F(u) \overset{\text{def}}{=} 2^{-1}\|u\|^2$. If one replaces $u_\delta$ in (103c) by $u_\delta + tv$
parallel to Section 18.3, differentiates with respect to $t$ at $t = 0$, and sets this
expression equal to zero, then one obtains
$$(A^*Au_\delta - A^*b_\delta|v) + \gamma F'(u_\delta)v = 0 \qquad \text{for all } v \in X;$$
therefore,
$$A^*Au_\delta + \gamma F'(u_\delta) = A^*b_\delta, \qquad u_\delta \in X. \tag{103d}$$
Here, $A^*$ denotes the operator adjoint to $A$.
In the special case $F(u) = 2^{-1}\|u\|^2$, we obtain
$$(A^*A + \gamma I)u_\delta = A^*b_\delta, \qquad u_\delta \in X \tag{103e}$$
for the solution $u_\delta$ of (103c). For $\gamma = 0$, (103e) results from (103b) upon
multiplication by $A^*$. However, it is now crucial that the term $\gamma I$, with $\gamma \ne 0$,
occurs. Also, in the case when $A^*A$ possesses no inverse operator, there
exists an inverse operator for $\gamma I + A^*A$, $\gamma > 0$, since $\gamma I + A^*A$ is self-adjoint
and strongly positive. As a rule, one chooses the parameter $\gamma$ so that the
defect $\|Au_\delta - b_\delta\|$ is as small as possible. To this end, one calculates $u_\delta$ for
different values of $\gamma$.
The task of the theory consists in proving that for a suitable choice of
$\gamma = \gamma(\delta)$, the sequence $(u_\delta)$ converges as $\delta \to +0$. Moreover, one must
clarify in which sense the limiting element is a solution of the original
problem (103a). We deal with this in Chapter 46. Since, in numerical
investigations, $A$ is also known only imprecisely, one has to replace $A$ by
$A_\delta$, where $\|A - A_\delta\| \le \delta$.
Here, on the basis of a simple example, we will only show which typical
effects appear in regularization.
Example 37.36. We consider the system of equations $Au = b$, i.e., explicitly,
\[ x + y = a, \qquad cy = a, \tag{104} \]
with the solutions
\[ x = a - \frac{a}{c}, \quad y = \frac{a}{c} \quad \text{for } c \ne 0 \text{ (classical solution)}, \qquad x = y = \frac{a}{2} \quad \text{for } c = 0 \text{ (normal solution)}. \tag{104a} \]
Both solutions correspond to the pseudoinverse. For $c \to 0$ we recognize the
instability of the construction of the pseudoinverse. The regularized
problem (103e) now reads as follows:
\[ x + y + \gamma x = a, \qquad x + (1 + c^2)y + \gamma y = a + ac, \]
with the solutions
\[ y = \frac{a(c + \gamma + c\gamma)}{c^2(1+\gamma) + 2\gamma + \gamma^2}, \qquad x = \frac{a - y}{1 + \gamma}. \tag{104b} \]
If $c = 0$, then $(x, y) \to (a/2, a/2)$ as $\gamma \to +0$, i.e., the regularized solution
tends to the normal solution. The same holds in the case where $c \ne 0$: the
regularized solution then tends to the classical solution. If we
assume that $a$ and $c$ are burdened with an error, i.e., if we replace $a$
(respectively, $c$) by $a + \delta$ (respectively, $c + \delta$), then one recognizes that for
the choice $\gamma = \delta$ the regularized solution in (104b) differs from (104a) for
fixed $a$, $c$ only by an error of the order of magnitude $\delta$. In contrast to (104a),
no singular behavior arises in the singular case $c = 0$ in (104b).
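The behavior described in Example 37.36 can be checked numerically. The following minimal sketch solves the $2 \times 2$ regularized normal equations $(A^*A + \gamma I)u = A^*b$ for $A = \begin{pmatrix} 1 & 1 \\ 0 & c \end{pmatrix}$, $b = (a, a)$ by Cramer's rule; the function name `regularized_solution` and the sample values of $a$, $c$, $\gamma$ are illustrative assumptions.

```python
def regularized_solution(a, c, gamma):
    """Solve (A^T A + gamma I) u = A^T b for A = [[1, 1], [0, c]], b = (a, a)."""
    # Normal equations: (1 + gamma) x + y = a,  x + (1 + c^2 + gamma) y = a + a c
    m11, m12 = 1.0 + gamma, 1.0
    m21, m22 = 1.0, 1.0 + c * c + gamma
    r1, r2 = a, a + a * c
    det = m11 * m22 - m12 * m21   # equals c^2 (1 + gamma) + 2 gamma + gamma^2 > 0
    x = (r1 * m22 - m12 * r2) / det
    y = (m11 * r2 - m21 * r1) / det
    return x, y

a = 1.0
for c in (4.0, 1e-3, 0.0):
    for gamma in (1e-2, 1e-4, 1e-6):
        print(c, gamma, regularized_solution(a, c, gamma))
```

For $c = 0$ the computed pairs approach $(a/2, a/2)$ as $\gamma$ decreases, and for $c = 4$ they approach the classical solution $(3a/4, a/4)$; in both cases the determinant stays positive, so no division by zero occurs.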
In general we thereby obtain a unified, numerically stable method for
the investigation of regular, singular, and badly conditioned equations. In
Example 37.36, $c = 0$ indicates singular behavior, and for small $c$, a bad
condition arises. This general method also functions when $A^*A$ is singular,
in contrast to Example 37.36. As an example, the system of equations $x = a$,
$cy = a$ possesses the solutions $x = a$, $y = a/c$ for $c \ne 0$ and the normal
solution $x = a$, $y = 0$ for $c = 0$, while the solution of the regularized problem
(103e) reads as follows:
\[ x = \frac{a}{1 + \gamma}, \qquad y = \frac{ac}{c^2 + \gamma}. \]
We discuss the regularization of integral equations of the first kind in the
next section.
References to the Literature
Classical works: Hadamard (1902), (1932, M) (well-posed problems); Picard
(1910) (integral equations of the first kind); Moore (1920) and Penrose
(1955), (1956) (pseudoinverse for matrices); Tihonov (1963) (regularization)
[also, cf. Tihonov and Arsenin (1977, M, B)].
Pseudoinverses: Luenberger (1969, M) (introductory); Ben-Israel and
Greville (1973, M); Marcuk and Kuznecov (1975, S,B) (iterative calculation
of pseudoinverses of matrices); Nashed (1976, P,H,B) (comprehensive
exposition with numerous applications, bibliography listing over 1700 works
with explanatory commentaries).
Ill-posed and inverse problems: Lavrentjev, Romanov, and Vasilev (1969,
M), Lattes and Lions (1969, M), and Payne (1975, M) (partial differential
equations); Tihonov and Arsenin (1977, M) (integral equations of the first
kind); Anger (1979, P) (also, cf. the references to the literature in Section
37.15).
Regularization: Lions (1969, M); Lattes and Lions (1969, M); Morozov
(1973, S, B) (linear and nonlinear deterministic or stochastic problems);
Tihonov and Arsenin (1977, M); Ivanov, Tanana, and Vasin (1978, M);
Kluge (1979, M); Anger (1979, P); Vainikko (1980) (also, cf. the references
to the literature in Section 37.29).
37.15. Parameter Identification
In order to explain the very important method of parameter identification
for numerous problems in the natural sciences and in engineering by a
simple example, we consider the problem
\[ mx''(t) + ax'(t) = w(t), \qquad x(0) = 0, \quad x'(0) = c. \tag{105} \]
If we interpret $x(t)$ as the coordinate of a point mass of mass $m$ at the time
$t$, then (105) describes the motion of this point on the $x$-axis under the
influence of the external force $w$ and the friction force $-ax'$. Let $m = 1$.
Example 37.37. We set $w(t) = 0$ and assume that we have at our disposal $n$
measurement data $(t_i, x_i)$; from this we wish to determine the friction
constant $a$ and the initial velocity $c$. The solution of (105) reads as follows:
\[ x(t) = ca^{-1}(1 - e^{-at}). \]
To determine $a$ and $c$, we use the least-squares method, i.e.,
\[ \varphi(a, c) = \sum_{i=1}^{n} (x(t_i) - x_i)^2 = \min! \]
From this one obtains the nonlinear system of equations
\[ \varphi_a(a, c) = 0, \qquad \varphi_c(a, c) = 0. \]
One can solve this system with the aid, say, of the Newton method, by
determining an initial approximation for $(a, c)$ from $x(t_i) = x_i$, $i = 1, 2, 3$,
using (105) and replacing the differential quotients by difference quotients.
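As an illustration of the least-squares fit, here is a minimal sketch that uses the Gauss-Newton iteration (a common substitute for the full Newton method on $\varphi_a = \varphi_c = 0$, not the method prescribed in the text). The synthetic data, the "true" parameters, and the starting values are all hypothetical.

```python
import math

def model(t, a, c):
    """Solution x(t) = c (1 - exp(-a t)) / a of (105) with w = 0 and m = 1."""
    return c / a * (1.0 - math.exp(-a * t))

def gauss_newton(ts, xs, a, c, steps=20):
    """Gauss-Newton iteration for the least-squares fit of (a, c)."""
    for _ in range(steps):
        r, ja, jc = [], [], []           # residuals and Jacobian columns
        for t, x in zip(ts, xs):
            e = math.exp(-a * t)
            r.append(model(t, a, c) - x)
            ja.append(c * (a * t * e - (1.0 - e)) / (a * a))   # d(model)/da
            jc.append((1.0 - e) / a)                           # d(model)/dc
        # Normal equations (J^T J) d = -J^T r for the two unknowns
        s11 = sum(v * v for v in ja)
        s12 = sum(u * v for u, v in zip(ja, jc))
        s22 = sum(v * v for v in jc)
        g1 = sum(u * v for u, v in zip(ja, r))
        g2 = sum(u * v for u, v in zip(jc, r))
        det = s11 * s22 - s12 * s12
        a -= (s22 * g1 - s12 * g2) / det
        c -= (s11 * g2 - s12 * g1) / det
    return a, c

true_a, true_c = 2.0, 3.0                      # hypothetical parameters behind the data
ts = [0.1 * k for k in range(1, 21)]
xs = [model(t, true_a, true_c) for t in ts]    # noise-free synthetic measurements
a_fit, c_fit = gauss_newton(ts, xs, a=1.5, c=2.5)
print(a_fit, c_fit)
```

With noisy data the same iteration applies; only the residuals at the minimum no longer vanish.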
Example 37.38. Now, for the sake of simplicity, let $a = c = 0$. Then the
solution of (105) reads as follows:
\[ x(t) = \int_0^1 k(t, s)\,w(s)\,ds, \tag{106} \]
where
\[ k(t, s) = \begin{cases} t - s & \text{if } 0 \le s \le t \le 1, \\ 0 & \text{if } 0 \le t < s \le 1. \end{cases} \]
We assume that we know a measurement curve $t \mapsto x(t)$ on the time interval
$[0,1]$, and from this we wish to determine the force $w$ that acts on the point
mass. Then we have to solve the integral equation (106). According to
Section 37.14, this is an ill-posed problem. Since the measurement curve
$x(\cdot)$ is burdened with measurement errors, we determine $w$ by the
least-squares method:
\[ \|x_\delta - Aw\|_X^2 + \gamma\|w\|_X^2 = \min!, \qquad w \in X, \tag{107} \]
where $X = L_2(0,1)$ and $\|x - x_\delta\|_X \le \delta$. We denote the integral operator on
the right-hand side of (106) by $A$. Parallel to (103c), we have added a
regularizing term $\gamma\|w\|^2$ with $\gamma > 0$. If $x \in A(X)$, then for $\gamma = \delta$, by
Theorem 46.E in Section 46.8, (107) has exactly one solution $w_\delta \in X$ and $w_\delta \to w_0$
as $\delta \to +0$. Here, $Aw_0 = x$. The function $w_\delta$ is obtained by virtue of Section
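46.8 from the integral equation of the second kind:
\[ \int_0^1 k_1(t, s)\,w_\delta(s)\,ds + \delta w_\delta(t) = \int_0^1 k(\tau, t)\,x_\delta(\tau)\,d\tau \]
with the iterated kernel
\[ k_1(t, s) = \int_0^1 k(\tau, t)\,k(\tau, s)\,d\tau. \]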
A classical example of a parameter identification is the rediscovery of the
planetoid Ceres by Gauss in 1801, who determined the path solely from
knowledge of 9° of the path arc. This led to an eighth-degree equation,
which Gauss solved in an ingenious way.
At present there are still very many open problems in the area of
parameter identification, e.g., in partial differential equations.
References to the Literature
Introduction: Kalaba, Spingarn (1982, M).
Parameter identification for partial differential equations: Polis and
Goodson (1974, S); Kubrusly (1977, S); Seidman (1977), (1979, S) (diffusion
equation); Kluge (1979, M, B) (abstract methods); Nürnberg (1979)
(viscosity properties of incompressible fluids); conference volumes: IFIP
conferences (1978, P), (1979, P); Kluge (1978, P); Anger (1979, P).
Numerical methods: Deuflhard and Hairer (1983, P).
Parameter identification and pseudoinverses: Nashed (1976, P).
Engineering applications: Tzafestas (1980, P).
Identification of rate constants in chemical reactions: Bock and Schlöder
(1983).
37.16. Chebyshev Approximation and Rational
Approximation
In Section 37.12 we approximated functions $f$ using the least-squares
method:
\[ \int_a^b \Bigl| f(x) - \sum_{i=0}^{n} c_i u_i(x) \Bigr|^2 dx = \min! \]
In this connection, the approximation may be very bad at a single point $x$.
For this reason, in practice one generally uses the principle
\[ \max_{a \le x \le b} \Bigl| f(x) - \sum_{i=0}^{n} c_i u_i(x) \Bigr| = \min!, \tag{108} \]
where $-\infty < a < b < \infty$. We call this uniform approximation or Chebyshev
approximation. If we use the space $X = C[a, b]$, then (108) can be written in
the form
\[ \|f - u\|_X = \min!, \qquad u \in M, \tag{108a} \]
where $M = \operatorname{span}\{u_0, \ldots, u_n\}$.
If $u_k(x) = x^k$, then it is a matter of the polynomial approximation
problem. For this, in Section 39.5 we obtain the following fundamental
classical theorem as a special case of more general results.
Theorem 37.C (Alternation Theorem). For $f \in C[a, b]$, there exists exactly
one solution of (108).
$u$ is a solution if and only if there are $n + 2$ points $x_i$, $a \le x_0 < \cdots < x_{n+1} \le b$,
such that the error $f(x) - u(x)$ in absolute value takes on its maximum at
all $x_k$, and for $x_0, x_1, \ldots$ the signs of the errors alternate.
Example 37.39. With the aid of the alternation theorem, one can easily
verify that $x + 8^{-1}$ is the best uniform approximation of $\sqrt{x}$ by first-degree
polynomials on $[0,1]$. The alternation points are $x_0 = 0$, $x_1 = \frac14$, $x_2 = 1$ with the
maximal error $\frac18$.
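The alternation property in Example 37.39 is easy to check numerically. The sketch below evaluates the error $\sqrt{x} - (x + \frac18)$ on a uniform grid (the grid size is an arbitrary choice) and at the three alternation points.

```python
import math

def error(x):
    """Error sqrt(x) - (x + 1/8) of the candidate best approximation."""
    return math.sqrt(x) - (x + 0.125)

grid = [k / 10000 for k in range(10001)]
max_error = max(abs(error(x)) for x in grid)          # maximal error on [0, 1]
alternation = [error(0.0), error(0.25), error(1.0)]   # errors at x_0, x_1, x_2
print(max_error, alternation)
```

The three errors have equal absolute value and alternating signs, which is exactly the condition of Theorem 37.C for $n = 1$ (that is, $n + 2 = 3$ points).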
We point out approximation methods, in particular the Remes algorithm,
in Section 37.29c. A crucial difficulty of Chebyshev approximation is that
one cannot apply methods of differential calculus to (108). It is a matter of a
typical convex nondifferentiable problem. For that reason, no Euler
equation appears here, but rather the condition given in the alternation theorem
in which it is remarkable that only finitely many points $x_i$
suffice for the characterization of solutions. In Chapter 39 we shall show
that such problems can be handled with the aid of geometric functional
analysis and duality theory.
The restriction to polynomial approximation is not always expedient from
the practical standpoint. For instance, one can use rational functions. Then
M in (108a) consists of all continuous rational functions on [a, b] the degree
of whose denominator and numerator are bounded by certain numbers. In
this case, M is not a linear subspace of C[a, b]. For this reason, we call the
problem a nonlinear approximation problem.
Example 37.40. For $e^x$, the function
\[ 1.008757 + 0.854740x + 0.846029x^2 \]
is the best uniform approximation with respect to all second-degree
polynomials on $[0,1]$, with an error of $8.78 \cdot 10^{-3}$. The best approximation with
respect to all rational functions whose denominator and numerator have
degree one is
\[ \frac{0.995705 + 0.668203x}{1 - 0.388848x} \]
with an error of $4.32 \cdot 10^{-3}$, which is only half as large as that above.
In connection with the rational approximation, we have the Padé
approximation:
\[ \Bigl| f(x) - \frac{P_n(x)}{Q_m(x)} \Bigr| \le \text{constant}\,|x|^k \quad \text{for all } x \in [-a, a]. \]
Here, $P_n$ and $Q_m$ are polynomials of degree $n$ and $m$, respectively. If $f$
possesses continuous derivatives of order up to and including $n + m + 1$ in a
neighborhood of $x = 0$, then there exists a Padé approximation for $f$ of the
above type with $k \ge n + 1$. In many cases, $k = n + m + 1$. In Cheney (1966,
M) one finds algorithms for the calculation of the Padé approximation.
Example 37.41. In the case $m = 1$, $n = 0, 1, 2$, Padé approximations for $e^{-x}$
are the rational functions $1/(1 + x)$, $(2 - x)/(2 + x)$, $(6 - 4x + x^2)/(6 + 2x)$.
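The three approximants of Example 37.41 can be verified with exact rational arithmetic. The sketch below checks that the power series of $Q(x)e^{-x} - P(x)$ starts with the term $x^{n+m+1} = x^{n+2}$; the helper names are illustrative, and polynomials are stored as coefficient lists in ascending powers.

```python
from fractions import Fraction
from math import factorial

def exp_minus_x_coeffs(order):
    """Taylor coefficients of exp(-x) up to x^order."""
    return [Fraction((-1) ** k, factorial(k)) for k in range(order + 1)]

def residual_coeffs(p, q, order):
    """Coefficients of Q(x) exp(-x) - P(x) up to x^order."""
    e = exp_minus_x_coeffs(order)
    out = []
    for k in range(order + 1):
        conv = sum(q[j] * e[k - j] for j in range(min(k, len(q) - 1) + 1))
        out.append(conv - (p[k] if k < len(p) else 0))
    return out

candidates = {
    0: ([1], [1, 1]),          # 1 / (1 + x)
    1: ([2, -1], [2, 1]),      # (2 - x) / (2 + x)
    2: ([6, -4, 1], [6, 2]),   # (6 - 4x + x^2) / (6 + 2x)
}
for n, (p, q) in candidates.items():
    print(n, residual_coeffs(p, q, n + 2))
```

In each case all coefficients up to order $n + 1$ vanish and the coefficient of $x^{n+2}$ does not, which corresponds to $k = n + m + 1$ with $m = 1$.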
In Varga (1962, M), these Pade approximations are used for the
construction of difference methods. In recent years, the Pade approximation has
proved to be an important auxiliary tool, e.g., in problems of quantum
physics and quantum chemistry. In this connection, the fact that one can
approximate singularities that are important in physics by means of rational
approximations plays a fundamental role, while in the polynomial
approximation no singularities appear in principle.
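In the approximation of functions one frequently uses continued fractions
\[ b_0 + \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2 + \cdots}}. \]
By this we understand the iteration prescription
\[ x_0 = b_0, \qquad x_1 = b_0 + \frac{a_1}{b_1}, \qquad x_2 = b_0 + \cfrac{a_1}{b_1 + \cfrac{a_2}{b_2}}, \quad \ldots, \]
i.e., in going over from $x_k$ to $x_{k+1}$, $b_k$ is replaced by $b_k + (a_{k+1}/b_{k+1})$. The
continued fraction expansion for $\tan z$ stems from Gauss:
\[ f_n(z) = \cfrac{z}{1 - \cfrac{z^2}{3 - \cfrac{z^2}{5 - \cdots - \cfrac{z^2}{2n+1}}}}. \]
While the power series for $\tan z$ converges (respectively, diverges) when
$|z| < \pi/2$ (respectively, $|z| > \pi/2$), $f_n(z) \to \tan z$ as $n \to \infty$ for all $z \in \mathbb{C}$.
For example,
\[ \Bigl| f_2\Bigl(\frac{\pi}{4}\Bigr) - \tan\Bigl(\frac{\pi}{4}\Bigr) \Bigr| \le 3 \times 10^{-4}, \]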
i.e., the convergence is very rapid. The application of this method to
computing is described in the article by Stoer and Bulirsch in Sauer and
Szabo (1967, M), Vol. III. There it is pointed out that the approximation of
functions by polynomials with the aid of the Taylor theorem is frequently
numerically unsuitable.
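The rapid convergence of the continued fraction for $\tan z$ can be observed with a few lines of code. This sketch assumes the truncation $f_n(z) = z/(1 - z^2/(3 - \cdots - z^2/(2n+1)))$ and evaluates it from the innermost denominator outward; the name `tan_cf` is illustrative.

```python
import math

def tan_cf(z, n):
    """Evaluate f_n(z) = z / (1 - z^2 / (3 - ... - z^2 / (2n + 1))) from the bottom up."""
    value = 2 * n + 1
    for k in range(n - 1, -1, -1):
        value = (2 * k + 1) - z * z / value
    return z / value

# Inside the disk of convergence of the power series
print(abs(tan_cf(math.pi / 4, 2) - math.tan(math.pi / 4)))
# Outside it (|z| > pi/2) the continued fraction still converges
for n in (2, 4, 6, 10):
    print(n, abs(tan_cf(2.0, n) - math.tan(2.0)))
```

The second loop illustrates the statement of the text that the continued fraction remains usable for $|z| > \pi/2$, where the Taylor series of $\tan z$ diverges.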
References to the Literature
Classical work: Chebyshev (1859).
History of approximation theory: Cheney (1966, M), pages 224-233.
Chebyshev approximation and rational approximation: Cheney (1966,
M, B, H) (introduction); Meinardus (1964, M); Collatz and Krabs (1973, M).
Pade approximation: Cheney (1966, M, B,H); Baker and Gammel (1970,
P); and Saff and Varga (1977, P) (applications to quantum physics); Baker
and Morris (1981, M).
Approximation of functions on computers: Luke (1975, M).
(Also, cf. references to the literature for Chapter 39.)
37.17. Linear Optimization in Infinite-Dimensional
Spaces, Chebyshev Approximation, and
Approximate Solutions for Partial
Differential Equations
The fundamental problem of Chebyshev approximation (108) can obviously
be written in the form
\[ c_{n+1} = \min!, \]
\[ \pm\Bigl( f(x) - \sum_{i=0}^{n} c_i u_i(x) \Bigr) \le c_{n+1} \quad \text{for all } x \in [a, b], \]
\[ 0 \le c_{n+1}. \tag{108b} \]
If one compares this problem with Section 37.10, then (108b) can be
conceived of as a linear optimization problem for $c = (c_0, \ldots, c_{n+1})$ with an
infinite number of side conditions in the form of inequalities. We are thus
led in a natural way to linear optimization problems in infinite-dimensional
spaces, which we shall investigate in Section 52.4 within the context of
general duality theory. A simple approximation method for (108b) consists
in considering the side conditions at only a finite number of points $x_1, \ldots, x_N$.
Then there arises a linear optimization problem analogous to Section 37.10
to which the known simplex algorithm can be applied. Typically for this
approximation method, the minimal value of the approximation problem is
smaller than that of (108b). Adding additional points $x_i$, one approaches the
minimal value from below, and we thus speak of an ascent method.
In Section 37.29f we shall treat the effective Remes algorithm for
Chebyshev approximation which is based on the alternation theorem.
We will now show by two examples how one can use the Chebyshev
approximation in connection with the maximum principle for
approximative solution of differential equations.
Example 37.42 (Boundary Maximum Principle). Let $E$ be the open unit
disk. We consider the boundary value problem
\[ E\colon\ -\Delta u = 0; \qquad \partial E\colon\ u = f, \tag{109} \]
where $f$ is a given continuous function on $\partial E$. In polar coordinates we use
trial functions of the form
\[ u_n(r, \varphi) = a_0 + \sum_{k=1}^{n} (a_k r^k \cos k\varphi + b_k r^k \sin k\varphi). \]
$u_n$ satisfies the differential equation. We determine the coefficients $a_j$, $b_j$
from
\[ \max_{0 \le \varphi \le 2\pi} |f(\varphi) - u_n(1, \varphi)| = \min! \]
If $u_n$ is a solution with the minimal value $\alpha$, then from the maximum
principle for the Laplace equation one obtains the error estimate
\[ |u(r, \varphi) - u_n(r, \varphi)| \le \alpha \quad \text{on } E \]
for the solution $u$ of (109).
Example 37.43 (Problems of Monotone Type). We study the nonlinear
boundary value problem
\[ K\colon\ -\Delta u + f(u) = 0; \qquad \partial K\colon\ u - 1 = 0. \tag{110} \]
Let $K$ be the open unit ball in $\mathbb{R}^3$. In order to elucidate a general important
approximation principle, we set $Lu = -\Delta u + f(u)$, $Mu = u - 1$. The
problem is said to be of monotone type if and only if
\[ Lv \le Lw \text{ on } K; \qquad Mv \le Mw \text{ on } \partial K \tag{111} \]
and $v, w \in C^2(\overline{K})$ always implies that $v \le w$ on $K$. Briefly, for (111) we write
$Lv \le Lw$, $Mv \le Mw$. This property, which we have already investigated in
Section 7.10, has an important practical consequence: If $u$ is a solution of
(110), then:
\[ Lv \ge 0,\ Mv \ge 0 \quad \text{implies} \quad v \ge u, \]
\[ Lw \le 0,\ Mw \le 0 \quad \text{implies} \quad w \le u. \]
In order to exploit this, we proceed from the substitutions
\[ v = 1 + (1 - r^2)(a + br^2), \qquad w = 1 + (1 - r^2)(c + dr^2), \]
where $r^2 = x^2 + y^2 + z^2$. Then $Mv = Mw = 0$ on $\partial K$, i.e., the boundary condition is
fulfilled automatically. We determine the unknown coefficients $a$, $b$, $c$, and $d$
by
\[ \xi = \min!, \qquad 0 \le Lv \le \xi \text{ on } K, \]
\[ \eta = \min!, \qquad -\eta \le Lw \le 0 \text{ on } K. \tag{112} \]
These are nonlinear optimization problems (one-sided Chebyshev
approximation). Then $w \le u \le v$, and our method minimizes the one-sided
defects.
We have yet to give conditions for (110) to be of monotone type. This is
the case for $f \in C^1(\mathbb{R})$ and $f'(u) \ge 0$ on $\mathbb{R}$. In order to show this, we set
$h = w - v$. Since $f(w) - f(v) = f'(\tilde v)(w - v)$ for a suitable intermediate value
$\tilde v$, it follows immediately from (111) that
\[ K\colon\ -\Delta h + f'(\tilde v)h \ge 0; \qquad \partial K\colon\ h \ge 0. \tag{113} \]
If we take $f'(\tilde v) \ge 0$ into account, then from the maximum principle it
follows that $h \ge 0$ on $K$ (cf. Problem 7.2).
As a numerical example, we consider (110) with $f(u) = u^2$. Here we have
$f'(u) \ge 0$ only for $u \ge 0$. However, we can apply the same method in the
case where we know that $u, v, w \ge 0$ on $K$. For this, one takes into account
(113) with $h = v - u,\ u - w$ and the coefficient $f'(\tilde v) = v + u,\ u + w \ge 0$. According
to Collatz and Krabs (1973, M), page 19, we obtain
\[ v = 1 - (1 - r^2)(0.13545 + 0.01263r^2), \]
\[ w = 1 - (1 - r^2)(0.13691 + 0.01275r^2) \]
as a solution of (112). We have $v, w \ge 0$ on $K$. Now, if (110) with $f(u) = u^2$
has a solution $u$ with $u \ge 0$ on $K$, then $w \le u \le v$ on $K$; therefore, in
particular,
\[ 0.86309 \le u(0, 0, 0) \le 0.86455. \]
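The one-sided defects of the two bounding functions can be evaluated directly. The sketch below uses the facts that, for radial functions in $\mathbb{R}^3$, $\Delta(r^2) = 6$ and $\Delta(r^4) = 20r^2$; the function name `radial_defect` and the grid are illustrative. Because the quoted coefficients are rounded to five decimals, the sign conditions in (112) can only be expected to hold up to rounding errors of that size.

```python
def radial_defect(A, B, r):
    """L v = -Laplace(v) + v^2 for v = 1 - (1 - r^2)(A + B r^2) in R^3."""
    s = r * r
    v = 1.0 - (1.0 - s) * (A + B * s)          # equals 1 - A + (A - B) s + B s^2
    laplace_v = 6.0 * (A - B) + 20.0 * B * s   # Laplace(r^2) = 6, Laplace(r^4) = 20 r^2
    return -laplace_v + v * v

grid = [k / 1000 for k in range(1001)]
defect_v = [radial_defect(0.13545, 0.01263, r) for r in grid]   # should lie in [0, xi]
defect_w = [radial_defect(0.13691, 0.01275, r) for r in grid]   # should lie in [-eta, 0]
print(min(defect_v), max(defect_v))
print(min(defect_w), max(defect_w))
```

The defect of $v$ oscillates between approximately zero and its maximum, and the defect of $w$ between its minimum and approximately zero, as one expects from a one-sided Chebyshev approximation.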
EXERCISE. Show that (110) with $f(u) = u^2$ has exactly one solution $u$, $u \ge 0$, on $K$.
Solution. Replace $u^2$ by $u^2 + \varepsilon u$, $\varepsilon > 0$. We choose the subsolution $v_1 = 0$ and the
supersolution $v_2 = 1$. Analogous to Example 7.39, one shows that (110) always has a
solution $u_\varepsilon$ for all $\varepsilon > 0$ with $v_1 \le u_\varepsilon \le v_2$ on $K$. The a priori estimates for elliptic
equations show that the set of all $u_\varepsilon$ is bounded in $C^{2,\alpha}(\overline{K})$, $0 < \alpha < 1$; therefore it is
relatively compact in $C^2(\overline{K})$ [cf. (6.11)]. Thus, as $\varepsilon \to +0$, a subsequence of $(u_\varepsilon)$
tends to a solution of (110) with $f(u) = u^2$. We have introduced the regularizing term
$\varepsilon u$ in order to guarantee that $0 \in G$ in Theorem 7.E in Section 7.10. The uniqueness
follows from the fact that the problem is of monotone type.
One can obviously apply this method in an analogous way to all
formulations of problems for which one has the maximum principle at his
disposal. In the problems of Chapter 7 we have formulated such maximum
principles for ordinary differential equations as well as for second-order
elliptic and parabolic partial differential equations. A number of examples
of this method can be found in the references to the literature.
References to the Literature
Collatz (1964, M); Collatz and Wetterling (1966, M); Collatz and Krabs
(1973, M); Krabs (1975, M); Collatz (1976, S).
37.18. Splines and Finite Elements
In the Appendix of Part II we constructed finite elements. These are
piecewise polynomial functions with certain smoothness properties at the
juncture points. The crucial advantage of these finite elements is that they
represent flexible basis functions for the Ritz method and for the Galerkin
method for the approximate solution of partial differential equations in
Sobolev spaces. We discussed this in detail in the introduction to Part II and
in Chapter 22.
Definition 37.44. Let $g \in C^1[a, b]$, $-\infty < a < b < \infty$, and a partition
$a = x_0 < x_1 < \cdots < x_n = b$ be given. By a corresponding cubic spline we
understand a function $S_g$ with the following properties:
(i) $S_g(x_i) = g(x_i)$ for all $i$.
(ii) $S_g'(a) = g'(a)$, $S_g'(b) = g'(b)$.
(iii) On each open interval $]x_i, x_{i+1}[$, $S_g$ is a polynomial of degree at most
three and $S_g \in C^2[a, b]$.
Proposition 37.45. There exists exactly one $S_g$ with these properties.
Proof. The continuity conditions for the first and second derivatives,
together with (i) and (ii), yield a linear system of equations with the same
number of equations as unknowns for the polynomial coefficients. To the
associated homogeneous system there corresponds the case $g(x_i) = 0$ for all
$i$ and $g'(a) = g'(b) = 0$ with the unique solution $S_g = 0$. However, for a square
linear system, uniqueness implies the existence of exactly one solution. □
We now explain the connection with the variational problem
\[ \int_a^b [f''(x)]^2\,dx = \min!, \qquad f \in C^2[a, b], \tag{114} \]
\[ f(x_i) = g(x_i),\ i = 0, \ldots, n; \qquad f'(a) = g'(a),\ f'(b) = g'(b). \]
Proposition 37.46. For $g \in C^1[a, b]$, $S_g$ is the only solution of (114).
The name "spline" stems from the fact that (114) can be interpreted
physically as follows: We have to determine the equilibrium position of a
thin rod which passes through given points $(x_i, g(x_i))$ and is forced (by a
special device) to have given directions at the endpoints $x = a, b$. Designers
use such an instrument to draw curves through points. $f(x)$ denotes the
displacement of the rod at $x$. The condition (114) means that the expression
for the potential energy (within the context of a linearized theory) is to be
minimized.
Proof. Let $e \overset{\text{def}}{=} f - S_f$, $f \in C^2[a, b]$. For all piecewise linear continuous
functions $\varphi$, with respect to the partition of $[a, b]$ considered in (114),
\[ \int_a^b e''\varphi\,dx = 0. \tag{115} \]
This is easily proved using integration by parts, taking into account that
\[ e'(a) = e'(b) = 0, \qquad e(x_i) = 0. \]
If $f$ fulfills the side conditions in (114), then $S_f = S_g$ because of
Proposition 37.45; thus, $S_f'' = S_g''$ and
\[ \int_a^b (f'')^2\,dx = \int_a^b \bigl[(f'' - S_g'')^2 + (S_g'')^2\bigr]\,dx \ge \int_a^b (S_g'')^2\,dx. \]
One takes into account that $f'' = (f'' - S_g'') + S_g''$ and uses (115) with
$\varphi = S_g''$. Consequently, $S_g$ is a solution, and for each additional solution $f$,
we have $f'' = S_g''$; therefore, $f = S_g$ by the construction of $S_g$. □
In approximating a function g by polynomials, one observes the following
unfavorable effect: If g behaves badly locally, then, as a rule, the global
approximation by polynomials is also bad. This disadvantage of polynomial
approximation is essentially improved by the spline approximation.
Numerical examples to demonstrate this can be found in de Boor (1978, M).
There, on page 68, it is furthermore shown that
\[ \max_{a \le x \le b} |g(x) - S_g(x)| \le \frac{5}{384}\,h^4 \max_{a \le x \le b} |g^{(4)}(x)| \]
when $g \in C^4[a, b]$. Here $h$ denotes the length of the largest subinterval.
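The spline of Definition 37.44 and the error bound above can be tried out on a simple example. The following sketch assumes a uniform partition and parametrizes the spline by its slopes $m_i = S_g'(x_i)$, which satisfy the tridiagonal system $m_{i-1} + 4m_i + m_{i+1} = 3(g(x_{i+1}) - g(x_{i-1}))/h$; this standard formulation, the test function $g = \sin$ on $[0, \pi]$, and all names are choices made for illustration.

```python
import math

def clamped_spline_slopes(y, h, d_left, d_right):
    """Slopes m_i = S'(x_i) of the cubic spline with prescribed end slopes (uniform grid)."""
    n = len(y) - 1
    rhs = [3.0 * (y[i + 1] - y[i - 1]) / h for i in range(1, n)]
    rhs[0] -= d_left          # known slope at x_0 moved to the right-hand side
    rhs[-1] -= d_right        # known slope at x_n moved to the right-hand side
    diag = [4.0] * (n - 1)
    for i in range(1, n - 1):                 # forward elimination (Thomas algorithm)
        factor = 1.0 / diag[i - 1]
        diag[i] -= factor
        rhs[i] -= factor * rhs[i - 1]
    m = [0.0] * (n - 1)
    m[-1] = rhs[-1] / diag[-1]
    for i in range(n - 3, -1, -1):            # back substitution
        m[i] = (rhs[i] - m[i + 1]) / diag[i]
    return [d_left] + m + [d_right]

def spline_value(x, a, h, y, m):
    """Evaluate the piecewise cubic in Hermite form on the interval containing x."""
    i = min(int((x - a) / h), len(y) - 2)
    t = (x - (a + i * h)) / h
    h00 = (1 + 2 * t) * (1 - t) ** 2
    h10 = t * (1 - t) ** 2
    h01 = t * t * (3 - 2 * t)
    h11 = t * t * (t - 1)
    return h00 * y[i] + h10 * h * m[i] + h01 * y[i + 1] + h11 * h * m[i + 1]

a, b, n = 0.0, math.pi, 10
h = (b - a) / n
y = [math.sin(a + i * h) for i in range(n + 1)]
m = clamped_spline_slopes(y, h, math.cos(a), math.cos(b))
samples = [a + (b - a) * k / 2000 for k in range(2001)]
max_error = max(abs(math.sin(x) - spline_value(x, a, h, y, m)) for x in samples)
bound = 5.0 / 384.0 * h ** 4      # max |g''''| = 1 for g = sin
print(max_error, bound)
```

The observed maximal error should stay below the bound $\frac{5}{384}h^4$, since $\max|g^{(4)}| = 1$ for $g = \sin$.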
References to the Literature
Classical works: Courant (1943) (finite elements); Schoenberg (1946)
(splines).
Splines: Varga (1971, S); Laurent (1972, M); Schultz (1973, M); de Boor
(1978, M) (numerical methods with computer programs).
Finite elements: Ciarlet (1977, M,B).
(Also, cf. the references to the literature in the Appendix to Part II.)
37.19. Optimal Quadrature Formulas
In this section we shall prove that the determination of optimal quadrature
formulas for function classes in (117a) below leads to an approximation
problem of the general type considered in Section 37.12. For the
approximate determination of the integral
\[ F(f) = \int_a^b f(x)\,dx, \qquad -\infty < a < b < \infty, \]
one uses formulas of the form
\[ G(f) = \sum_{j=1}^{n} c_j F_j(f) \tag{116} \]
with $F_j(f) = f(x_j)$. In this connection, in a suitable way we can dispose of
the support points $x_j$, $a \le x_1 < x_2 < \cdots < x_n \le b$, and the real coefficients
$c_1, \ldots, c_n$. Let $X = C[a, b]$ and let $M$ be a linear subspace of $X$ with the norm
$\|\cdot\|_M$, which need not be equal to the max norm on $X$. Our goal is to choose
$x_j$ and $c_j$ so that
\[ |F(f) - G(f)| \le \alpha\|f\|_M \quad \text{for all } f \in M \tag{117} \]
and where $\alpha$ is the smallest possible such value. We then speak of an
optimal quadrature formula on $M$. We assume that $\|f\|_X \le \text{constant}\,\|f\|_M$
for all $f \in M$. Then $F$ and $G$ are continuous linear functionals on $M$. If $\|F\|_*$
denotes the norm of the functional $F \in M^*$, then problem (117) is
equivalent to the approximation problem
\[ \min_{G \in N^*} \|F - G\|_* = \alpha, \tag{117a} \]
where $N^*$ is the set of all $G$ of the form (116).
Example 37.47. We seek an optimal quadrature formula on $[0,1]$ with two
support points $0 \le x_1 < x_2 \le 1$ for the class of continuously differentiable
functions on $[0,1]$ with the additional property that the formula is exact for
constant functions.
We assert that this formula is
\[ G(f) = 2^{-1}\bigl(f(\tfrac14) + f(\tfrac34)\bigr) \]
with the error estimate
\[ \Bigl| \int_0^1 f\,dx - G(f) \Bigr| \le \frac18 \max_{0 \le x \le 1} |f'(x)| \tag{118} \]
for all $f \in C^1[0,1]$.
Proof. We proceed from the starting point
\[ G(f) = c_1 f(x_1) + c_2 f(x_2). \]
For $f \equiv 1$, we should have $F(f) = G(f)$; therefore, $1 = c_1 + c_2$. We choose
$M \overset{\text{def}}{=} \{f \in C^1[0,1] : f(0) = 0\}$ with $\|f\|_M = \max_{0 \le x \le 1} |f'(x)|$. Each function
in $C^1[0,1]$ differs from a function in $M$ only by a constant. Since the
formula is to be exact for constant functions, we can restrict ourselves to $M$
instead of to $C^1[0,1]$. Since $c_1 + c_2 = 1$, for $f \in M$, it easily follows from
\[ F(f) = \int_0^1 f(x)\,dx = \int_0^1 (1 - t)f'(t)\,dt, \qquad f(x_j) = \int_0^{x_j} f'(t)\,dt \]
that
\[ F(f) - G(f) = \int_0^1 K(t)f'(t)\,dt, \]
where
\[ K(t) = \begin{cases} -t & \text{if } 0 \le t < x_1, \\ -t + c_1 & \text{if } x_1 \le t \le x_2, \\ 1 - t & \text{if } x_2 < t \le 1; \end{cases} \]
therefore
\[ |F(f) - G(f)| \le \Bigl( \int_0^1 |K(t)|\,dt \Bigr) \|f\|_M. \]
If one represents $K$ graphically, then one recognizes without difficulty that,
for each $\varepsilon > 0$, there exists an $f \in M$ with $\|f\|_M \le 1$ and
\[ \int_0^1 K(t)f'(t)\,dt \ge \int_0^1 |K(t)|\,dt - \varepsilon \]
(cf. Berezin and Zidkov (1966, M), Vol. 1, 3.9). Consequently,
\[ \|F - G\|_* = \sup\{|F(f) - G(f)| : f \in M,\ \|f\|_M \le 1\} = \int_0^1 |K(t)|\,dt. \]
In order to minimize $\|F - G\|_*$, it suffices to note that
\[ \int_0^1 |K(t)|\,dt \ge 4^{-1}\bigl[2x_1^2 + (x_2 - x_1)^2 + 2(1 - x_2)^2\bigr] \ge \frac18 \]
for $0 \le x_1 < x_2 \le 1$ and the value $\frac18$ is actually assumed for $x_1 = \frac14$, $x_2 = \frac34$,
$c_1 = c_2 = \frac12$. □
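The quantity minimized in this proof, the integral of $|K(t)|$, can be computed numerically for any choice of support points and weights. The sketch below uses the midpoint rule with $c_2 = 1 - c_1$; the function name `kernel_norm`, the grid size, and the comparison points are illustrative choices.

```python
def kernel_norm(x1, x2, c1, n_grid=4000):
    """Midpoint-rule value of the integral of |K(t)| over [0, 1], with c2 = 1 - c1."""
    total = 0.0
    for k in range(n_grid):
        t = (k + 0.5) / n_grid
        if t < x1:
            K = -t
        elif t <= x2:
            K = -t + c1
        else:
            K = 1.0 - t
        total += abs(K)
    return total / n_grid

print(kernel_norm(0.25, 0.75, 0.5))   # the asserted optimum, value 1/8
print(kernel_norm(0.20, 0.80, 0.5))   # other support points give a larger value
print(kernel_norm(0.25, 0.75, 0.4))   # so do other weights
```

The first value reproduces the constant $\frac18$ in (118); the other two are larger, in agreement with the lower bound at the end of the proof.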
Functions which, e.g., have no second derivative also belong to the class
$M$ of the preceding example. It is to be expected that better approximations
exist for smoother functions. In (116) we have $2n$ parameters $x_j$, $c_j$ to
dispose of freely. We will choose these parameters so that the formulas are
exact for all polynomials up to and including the $m$th degree. By (116),
there arise $m + 1$ equations with $2n$ unknowns. Thus, we expect $m = 2n - 1$.
Gauss could show that this nonlinear system of equations can be solved
when the zeros of the Legendre polynomials $d^n(x^2 - 1)^n/dx^n$ are chosen as
the support points $x_j$, with $a = -1$, $b = 1$. Linear systems of equations for
the $c_j$ then arise.
Example 37.48. In order to have a comparison with Example 37.47, we
consider the problem of calculating the Gauss formula directly, for two
support points $x_1, x_2$ on $[0,1]$. After a short calculation, substituting and
equating coefficients, we have (cf. Collatz and Albrecht (1972, M), page 107)
\[ G(f) = 2^{-1}\Bigl[ f\Bigl(\frac12 - \frac{\sqrt3}{6}\Bigr) + f\Bigl(\frac12 + \frac{\sqrt3}{6}\Bigr) \Bigr]. \]
From Berezin and Zidkov (1966, M), 3.5.2, we take the error formula
\[ \Bigl| \int_0^1 f\,dx - G(f) \Bigr| \le [(135)(32)]^{-1} \max_{0 \le x \le 1} |f^{(4)}(x)| \quad \text{for all } f \in C^4[0,1]. \]
A comparison with (118) shows that the error factor is now essentially
smaller.
Figure 37.19
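The two-point formulas of Examples 37.47 and 37.48 can be compared on a smooth integrand. The sketch below assumes the standard two-point Gauss-Legendre nodes $\frac12 \pm \frac{\sqrt3}{6}$ with weights $\frac12$ on $[0,1]$ and the nodes $\frac14$, $\frac34$ of Example 37.47; the test integrand $e^x$ and the function names are illustrative.

```python
import math

def two_point_rule(f, x1, x2):
    """Quadrature formula with equal weights 1/2 at the support points x1, x2."""
    return 0.5 * (f(x1) + f(x2))

def gauss2(f):
    """Two-point Gauss formula on [0, 1]."""
    shift = math.sqrt(3.0) / 6.0
    return two_point_rule(f, 0.5 - shift, 0.5 + shift)

def optimal_c1(f):
    """Optimal formula of Example 37.47 for continuously differentiable functions."""
    return two_point_rule(f, 0.25, 0.75)

exact = math.e - 1.0   # integral of exp over [0, 1]
print(abs(gauss2(math.exp) - exact), math.e / (135 * 32))   # error and its bound
print(abs(optimal_c1(math.exp) - exact), math.e / 8)        # error and the bound (118)
```

For this smooth integrand the Gauss formula is far more accurate, while the formula of Example 37.47 is designed for the worst case over all of $C^1[0,1]$.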
Multiple integrals, which appear frequently, for instance, in quantum
chemistry and quantum physics, are often calculated by the Monte Carlo
method. For one-dimensional integrals $J = \int_0^1 f\,dx$ with $0 \le f(x) \le 1$, the
basic idea, which can also be directly carried over to multiple integrals, is
the following: If one throws a needle perpendicularly onto a unit square $N$
times, then $J$ is approximately equal to $K/N$, where $K$ is the number of
trials in which the needle point remains stuck in the hatched area in Fig.
37.19. In practice, computer-generated random numbers take the place of
the needle positions.
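The needle experiment translates directly into a hit-or-miss Monte Carlo estimate. In this minimal sketch the integrand $f(x) = x^2$, the number of trials, and the fixed seed are illustrative assumptions; a trial counts as a hit when the random point $(x, y)$ of the unit square lies below the graph of $f$.

```python
import random

def hit_or_miss(f, trials, seed=0):
    """Estimate the integral of f over [0, 1], assuming 0 <= f <= 1, by K / N."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        x, y = rng.random(), rng.random()   # random point in the unit square
        if y <= f(x):                       # the point lies in the area under the graph
            hits += 1
    return hits / trials

estimate = hit_or_miss(lambda x: x * x, trials=100000)
print(estimate, 1.0 / 3.0)
```

The statistical error decreases only like $N^{-1/2}$, independently of the dimension, which is why the method is attractive mainly for multiple integrals.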
Optimal quadrature formulas for multiple integrals, i.e., so-called cubature
formulas for classes of functions in Sobolev spaces, can be found in
Sobolev (1974, M) and Levin and Girsovic (1975, L, B). There are cases
where the application of these formulas is more propitious than the Monte
Carlo method.
The intimate connection between quadrature formulas and splines is
explained in Karlin (1971) and Levin and Girsovic (1979, L,B).
References to the Literature
Introduction: Levin and Girsovic (1979, L,B), Engels (1980, M) (standard
work).
Optimal quadrature formulas: Berezin and Zidkov (1966, M), Vol. 1;
Krylov (1967, M); Karlin (1971); Kiesewetter (1973, M) (the application of
algorithms of approximation theory); Sobolev (1974, M), Levin and Girsovic
(1979, L,B) (multiple integrals), Engels (1980, M).
Application to ordinary differential equations: Stroud (1974, M).
Simulation and Monte Carlo methods: Piehler and Zschiesche (1976, M)
(introduction); Sobol (1971, M); Yakowitz (1977, M).
37.20. Control Problems, Dynamic Optimization, and
the Bellman Optimization Principle
By the Bellman optimization principle, one understands the assertion that in
an optimal process all the subprocesses considered by themselves must run
their course optimally. We make this idea precise in the following for
discrete and for continuous control problems. The basic procedure of
dynamic optimization consists in studying the behavior of the minimal value
S of the control problem when the system parameters (e.g., the initial state
and initial time) are changed. The result is the so-called Bellman equation
for S. A comparison with the classical variational methods in Section 37.4
shows:
(i) Hamilton-Jacobi equation => Bellman equation.
(ii) Canonical equations => Pontrjagin's maximum principle.
Thus, one often refers to this equation as the Hamilton-Jacobi-
Bellman equation. An essential disadvantage of (i) relative to (ii) is that the
smoothness assumptions, which one needs in continuous control problems
to derive the Bellman equation, are frequently not fulfilled. However, in
Theorem 37.E in Section 37.20b, we shall give the fundamental principle of
dynamic optimization a form such that it is more general than the Bellman
equation and independent of the regularity assumptions. One advantage of
(i) is that sufficient conditions are obtained (cf. Example 37.52) and, after
discretizing continuous control problems, an effective approximation method
is at one's disposal (cf. Remark 37.50). Furthermore, an essential advantage
of (i) is that the optimal control is immediately obtained in the feedback
control form which is important in engineering (cf. Remark 37.53).
37.20a. Discrete Control Problems
As an example, we consider a production process as depicted in Fig. 37.20.
At stage 1, from the initial state $x_1$ under the influence of the control
magnitude $u_1$, the state $x_2 = \varphi_1(x_1, u_1)$ arises with the cost expenditure
$k_1(x_1, u_1)$, etc. Let $x_i \in \mathbb{R}^N$, $u_i \in \mathbb{R}^M$. Here $x_i$ means, e.g., the mass provision
of $N$ chemical substances. If the process runs from the $r$th through the $n$th
stage with the initial state $x_r$, then the total cost is equal to
\[ K_r(x_r; u_r, \ldots, u_n) = \sum_{i=r}^{n} k_i(x_i, u_i), \]
where
\[ x_{j+1} = \varphi_j(x_j, u_j), \qquad j = r, \ldots, n. \tag{119} \]
Figure 37.20
Then the problem of the optimization of total cost reads as follows:
\[ \inf_{u_r, \ldots, u_n} K_r(x_r; u_r, \ldots, u_n) = S_r(x_r), \tag{120a} \]
\[ u_j \in U_j, \qquad j = r, \ldots, n. \tag{120b} \]
The sets $U_j$ in $\mathbb{R}^M$ are given control restrictions. Furthermore, $x_r$ is given.
Through dynamic optimization, we study the minimal value $S_r(x_r)$,
defined by (120a), depending on $r$ and $x_r$, i.e., we also study the
perturbations of a fixed problem. The function $(x, r) \mapsto S_r(x)$ is called the Bellman
function and is analogous to the $S$-function in the Hamilton-Jacobi theory
in Section 37.4f. The so-called Bellman equation
\[ S_r(x_r) = \inf_{u_r \in U_r} \bigl[ k_r(x_r, u_r) + S_{r+1}(\varphi_r(x_r, u_r)) \bigr] \tag{121} \]
for $r = 1, \ldots, n$ with $S_{n+1} \overset{\text{def}}{=} 0$ and
\[ S_i(x_i) = k_i(x_i, u_i) + S_{i+1}(x_{i+1}), \qquad i = r, \ldots, n, \tag{122} \]
is crucial.
Theorem 37.D. (1) The Bellman function satisfies (121).
(2) $(u_r, \ldots, u_n)$, $(x_r, \ldots, x_n)$ are solutions of (120) if and only if (122) holds
and the side conditions (119) and (120b) are fulfilled.
Corollary 37.49 (Bellman Optimality Principle). The following two assertions
are equivalent:
(i) $(u_1, \ldots, u_n)$, $(x_1, \ldots, x_n)$ is a solution of (120) with $r = 1$.
(ii) $(u_m, \ldots, u_n)$, $(x_m, \ldots, x_n)$ is a solution of (120) with $r = m$ for all
$m = 1, \ldots, n$.
This corollary asserts that the total process is optimal if and only if all its
subprocesses are optimal.
Proof. (1) Obviously
\[ K_r(x_r; u_r, \ldots, u_n) = k_r(x_r, u_r) + K_{r+1}(\varphi_r(x_r, u_r); u_{r+1}, \ldots, u_n) \]
holds for the costs. Now, (121) follows immediately from (120) because
\[ \inf_{u_r, \ldots, u_n} \cdots = \inf_{u_r} \Bigl[ \inf_{u_{r+1}, \ldots, u_n} \cdots \Bigr]. \]
(2) For all $(u_r, \ldots, u_n)$ with the corresponding $(x_r, \ldots, x_n)$, given by (119),
we have
\[ S_r(x_r) \le k_r(x_r, u_r) + S_{r+1}(x_{r+1}) \le k_r + (k_{r+1} + S_{r+2}) \le \cdots \le k_r(x_r, u_r) + \cdots + k_n(x_n, u_n) = K_r(x_r; u_r, \ldots, u_n) \]
because of (121). For $S_r = K_r$, the equality sign must appear for each of the
subestimates, i.e., (122) holds.
Corollary 37.49 follows immediately from (2). □
Remark 37.50 (Bellman's Method). Theorem 37.D can be exploited to
calculate conveniently an optimal solution $(\bar u_1, \dots, \bar u_n)$, $(\bar x_1, \dots, \bar x_n)$ of the
original problem (120) for a given initial state $\bar x_1$.
Step 1. Solving (121) against the direction of the process, $r = n,
n-1, \dots, 1$. For an arbitrary given initial state $x_r$ of the $r$th stage, we calculate
the control $u_r(x_r)$ and the value $S_r(x_r)$ as a solution of the minimum problem (121).
Step 2. Solving (119) in the direction of the process. We set
$$\bar u_1 \stackrel{\mathrm{def}}{=} u_1(\bar x_1), \quad \bar x_2 \stackrel{\mathrm{def}}{=} \varphi_1(\bar x_1, \bar u_1), \quad \bar u_2 \stackrel{\mathrm{def}}{=} u_2(\bar x_2), \quad \bar x_3 \stackrel{\mathrm{def}}{=} \varphi_2(\bar x_2, \bar u_2), \quad \text{etc.}$$
By the construction of $S_i$, $\bar x_i$, (122) holds for $i = 1, \dots, n$, i.e., according to
Theorem 37.D, all $\bar u_i$, $\bar x_i$ represent an optimal solution.
Continuous control problems that depend on time can be handled, by
time discretization, similarly to discrete problems.
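The two steps of Bellman's method can be sketched in Python for a small finite problem. This is only an illustration under stated assumptions: the integer state grid, the transition $\varphi_j(x, u) = x + u$, the stage cost $k_j(x, u) = x^2 + u^2$, and all function names are hypothetical choices, not taken from the text. A brute-force minimization over all control sequences is included as a cross-check.

```python
import itertools


def bellman_backward(states, controls, phi, k, n):
    """Step 1: solve the Bellman equation (121) for r = n, n-1, ..., 1."""
    S = {n + 1: {x: 0.0 for x in states}}      # S_{n+1} = 0 by definition
    policy = {}
    for r in range(n, 0, -1):
        S[r], policy[r] = {}, {}
        for x in states:
            best_u, best_val = None, float("inf")
            for u in controls:
                x_next = phi(r, x, u)
                if x_next not in S[r + 1]:
                    continue                     # control would leave the grid
                val = k(r, x, u) + S[r + 1][x_next]
                if val < best_val:
                    best_u, best_val = u, val
            S[r][x], policy[r][x] = best_val, best_u
    return S, policy


def roll_forward(x1, policy, phi, n):
    """Step 2: run (119) forward with the minimizing controls from Step 1."""
    xs, us = [x1], []
    for r in range(1, n + 1):
        u = policy[r][xs[-1]]
        us.append(u)
        xs.append(phi(r, xs[-1], u))
    return xs, us


def brute_force(x1, controls, phi, k, n):
    """Reference value: minimize K_1 over all control sequences directly."""
    best = float("inf")
    for seq in itertools.product(controls, repeat=n):
        x, total = x1, 0.0
        for r, u in enumerate(seq, start=1):
            total += k(r, x, u)
            x = phi(r, x, u)
        best = min(best, total)
    return best


# Hypothetical toy data (assumptions for illustration only).
STATES = range(-5, 6)
CONTROLS = (-1, 0, 1)
phi = lambda r, x, u: x + u
k = lambda r, x, u: float(x * x + u * u)

S, policy = bellman_backward(STATES, CONTROLS, phi, k, n=3)
xs, us = roll_forward(2, policy, phi, n=3)
```

The backward pass costs on the order of (number of stages) x (number of states) x (number of controls) evaluations, whereas the brute-force search grows exponentially in the number of stages.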
37.20b. Continuous Control Problems
We consider the problem
$$F(t_1, z(t_1)) = \min!, \tag{123}$$
$$z'(t) = f(t, z(t), u(t)) \ \text{on } ]t_0, t_1[ \quad \text{(control equation)}, \tag{123a}$$
$$z(t_0) = z_0 \quad \text{(initial condition)}, \tag{123b}$$
$$t_1 \in T_1, \quad z(t_1) \in Z_1 \quad \text{(end condition)}, \tag{123c}$$
$$u(t) \in U_1 \ \text{on } ]t_0, t_1[ \quad \text{(control restriction)}. \tag{123d}$$
Let $Z$ and $U$ be B-spaces with subsets $Z_1 \subseteq Z$, $U_1 \subseteq U$. Furthermore, $z(t)$
from $Z$ [respectively, $u(t)$ from $U$] is the state (respectively, the control) of
the system at the time $t$. The initial time $t_0$, the initial state $z_0 \in Z$, and the
set $T_1$ (respectively, $Z_1$) of the possible end times and end states are given.
For $T_1 = \{T\}$, the end time is fixed, i.e., $t_1 = T$. We denote by $C(t_0, z_0)$ the
set of all admissible pairs $(u, z)$, i.e., $u: [t_0, t_1] \to U$ is piecewise continuous
and $u$, $z$ satisfy all the side conditions (123a)-(123d). The end time $t_1$
depends on $(u, z)$. Then our problem (123) reads briefly as follows:
$$S(t_0, z_0) = \inf_{(u, z) \in C(t_0, z_0)} F(t_1, z(t_1)). \tag{124}$$
Here, by definition, $S(t_0, z_0)$ denotes the infimum. We now vary $(t_0, z_0)$ and
study the behavior of the Bellman function $S$. To this end, in preparation
we note that:
(a) $t \mapsto S(t, z(t))$ is monotonically increasing on $[t_0, t_1]$ for all admissible
pairs $(u, z)$.
(b) $t \mapsto S(t, z^*(t))$ is constant on $[t_0, t_1^*]$ for the admissible pair $(u^*, z^*)$.
(c) $S(t_1, z_1) = F(t_1, z_1)$ for all $t_1 \in T_1$, $z_1 \in Z_1$.
Here, (c) is only the fixing of the interpretation of the control problem in
case we start with $(t_1, z_1) \in T_1 \times Z_1$.
Theorem 37.E. (1) Necessary condition. If $(u^*, z^*)$ is a solution of (123),
then (a)-(c) hold for the function $S$ defined in (124).
(2) Sufficient condition. If there exists a function $S$ that satisfies (a)-(c),
then $(u^*, z^*)$ is a solution of (123).
Proof. (1) Let $t_0 < \tau_1 < \tau_2 < t_1$. If $(u, z)$ is admissible on $[t_0, t_1]$ and $(\tilde u, \tilde z)$ is
admissible on $[\tau_2, \tilde t_1]$ such that $z(\tau_2) = \tilde z(\tau_2)$, then there arises an admissible
pair on $[\tau_1, \tilde t_1]$ which coincides with $(u, z)$ on $[\tau_1, \tau_2]$ and with $(\tilde u, \tilde z)$ on
$[\tau_2, \tilde t_1]$. It then follows that $S(\tau_1, z(\tau_1)) \le S(\tau_2, z(\tau_2))$; therefore (a) holds.
(b) is obtained from
$$S(t_0, z_0) = F(t_1^*, z^*(t_1^*)) = S(t_1^*, z^*(t_1^*))$$
and (a).
(2) For all admissible $(u, z)$,
$$S(t_0, z_0) \le F(t_1, z(t_1)),$$
and, for $(u^*, z^*)$,
$$S(t_0, z_0) = S(t_1^*, z^*(t_1^*)) = F(t_1^*, z^*(t_1^*)).$$
□
Remark 37.51 (The Bellman Equation). If the Bellman function $S$ is
sufficiently regular, then from (a) by differentiation with respect to $t$, it
follows that
$$S_t(t, z(t)) + S_z(t, z(t)) f(t, z(t), u(t)) \ge 0$$
for all admissible $(u, z)$ and all $t \in [t_0, t_1]$. If $(u, z)$ is optimal, then, by
Theorem 37.E, (1), the equality sign holds. If, for fixed $t$ and $\zeta$, there exists
an admissible pair $(u, z)$ with $u(t) = v$, $z(t) = \zeta$ for each $v \in U_1$, then
$$\inf_{v \in U_1} \bigl[ S_t(t, \zeta) + S_z(t, \zeta) f(t, \zeta, v) \bigr] = 0.$$
This is the so-called Bellman equation.
Example 37.52. We consider the simplest case of the so-called linear
regulator problem:
$$\int_{t_0}^{t_1} \bigl( x^2(t) + u^2(t) \bigr)\, dt = \min!, \tag{125}$$
$$x'(t) = a x(t) + b u(t), \qquad x(t_0) = x_0, \tag{125a}$$
where $t_0$, $t_1$, $x_0$, $a$, $b$ are given real numbers. We assert that if $c$ is a solution
of the Riccati equation
$$c'(t) = -2a c(t) + b^2 c^2(t) - 1, \qquad t_0 < t < t_1,$$
where $c(t_1) = 0$, and we replace $u$ in (125a) by
$$u(t) = -c(t) b x(t), \tag{126}$$
then we obtain the optimal state $x$ from (125a) and the optimal control $u$
from (126).
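A small numerical experiment can make the assertion of Example 37.52 plausible. The following Python sketch integrates the Riccati equation backward from $c(t_1) = 0$ with a classical Runge-Kutta step and then simulates (125a) with the feedback (126) by the explicit Euler method. The parameter values ($a = 0$, $b = 1$, $t_0 = 0$, $t_1 = 1$, $x_0 = 1$), the step counts, and the function names are assumptions made for illustration. For this choice the Riccati equation reduces to $c' = c^2 - 1$, whose solution with $c(t_1) = 0$ is $c(t) = \tanh(t_1 - t)$, and the feedback cost should be close to $c(t_0) x_0^2$.

```python
import math


def riccati_backward(a, b, t0, t1, steps):
    """Integrate c' = -2ac + b^2 c^2 - 1 backward from c(t1) = 0 (RK4)."""
    h = (t1 - t0) / steps
    rhs = lambda c: -2.0 * a * c + b * b * c * c - 1.0
    c = [0.0] * (steps + 1)              # c[i] approximates c(t0 + i*h)
    for i in range(steps, 0, -1):
        k1 = rhs(c[i])
        k2 = rhs(c[i] - 0.5 * h * k1)
        k3 = rhs(c[i] - 0.5 * h * k2)
        k4 = rhs(c[i] - h * k3)
        c[i - 1] = c[i] - h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return c


def closed_loop_cost(a, b, x0, t0, t1, gain, steps):
    """Euler simulation of x' = ax + bu with u = -gain(i)*b*x; returns the cost (125)."""
    h = (t1 - t0) / steps
    x, cost = x0, 0.0
    for i in range(steps):
        u = -gain(i) * b * x
        cost += h * (x * x + u * u)
        x += h * (a * x + b * u)
    return cost


# Assumed test data, not from the text.
A_COEF, B_COEF, T0, T1, X0, STEPS = 0.0, 1.0, 0.0, 1.0, 1.0, 2000

c = riccati_backward(A_COEF, B_COEF, T0, T1, STEPS)
cost_feedback = closed_loop_cost(A_COEF, B_COEF, X0, T0, T1, lambda i: c[i], STEPS)
cost_zero = closed_loop_cost(A_COEF, B_COEF, X0, T0, T1, lambda i: 0.0, STEPS)
closed_form_c0 = math.tanh(T1 - T0)      # exact c(t0) for a = 0, b = 1
```

Comparing `cost_feedback` with `cost_zero` (the cost of the control $u \equiv 0$) shows the gain obtained from the feedback law in this particular instance; it does not replace the proof.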
Remark 37.53 (Synthesis Problem). It is quite remarkable that the
solution yields the so-called feedback control by (126). This is especially
important for the construction of regulating systems: In (126) the control
$u(t)$ depends on the state $x(t)$ at the same point in time and can be
constantly regulated. In contrast to this, if one knows only $u(t)$ as a function
of time, then it is difficult to realize this control technically. Say, in a lunar
landing one constantly measures the height and velocity of the landing craft
and accordingly regulates the action of the brakes of the rocket in order to
land softly on the moon with minimal fuel consumption (cf. Problem 48.5).
The construction of optimal controls in the form of the feedback control is
designated as the solution of the synthesis problem (cf. Section 37.22).
Proof. We first write (125) in the form (123). This is a frequently used trick
and is done by introducing a new variable $y$. In this connection, we get
$$y(t_1) = \min!$$
with
$$y'(t) = x^2(t) + u^2(t), \qquad y(t_0) = 0,$$
$$x'(t) = a x(t) + b u(t), \qquad x(t_0) = x_0. \tag{127}$$
We set $S(t, x, y) = c(t) x^2 + y$. From (127) and the differential equation for
$c$, it follows that
$$\frac{d}{dt} S(t, x(t), y(t)) = \bigl( u(t) + c(t) b x(t) \bigr)^2 \ge 0,$$
i.e., $t \mapsto S(t, x(t), y(t))$ is monotonically increasing and constant for $u(t) =
-c(t) b x(t)$. Furthermore, $S(t_1, x(t_1), y(t_1)) = y(t_1)$. Then Theorem 37.E,
(2) yields the assertion. □
References to the Literature
Classical works: Bellman (1953, M), (1954, S).
Fundamental monograph: Bellman (1957, M).
Elementary survey article: Bellman and Lee (1978, S, B).
Further expositions: Bellman (1961, M) (application to regulation
processes); Hadley (1963, M); Bellman (1967, M), Volumes I, II; White
(1969, M); Bellman and Angel (1972, M); and Cheung (1978, S)
(application to partial differential equations); Fleming and Rishel (1975, M)
(application to stochastic optimization); Aubin (1979, M) (connection with
quasivariational equations).
Quadratic control problems: Casti (1980, S).
Connection with the continuous Pontrjagin maximum principle:
Pontrjagin (1961, M); Fleming and Rishel (1975, M).
Connection with duality and the discrete Pontrjagin maximum principle:
Focke and Klotzler (1978).
Applications to economics: Pan-Tai-Lui (1980, P).
Hamilton-Jacobi-Bellman equation and optimal control: Lions, Jr.
(1982, L).
37.21. Control Problems, the Pontrjagin Maximum
Principle, and the Bang-Bang Principle
As a simple example for illustrating the Pontrjagin maximum principle, we
consider the following problem: At time $t_1 = 0$, suppose that a point mass of
mass $m = 1$, provided with a motor, is at rest at $x = -x_0$ (Fig. 37.21). The
point is to move under the influence of the motor force $w(t)$ in such a way
that it arrives as quickly as possible at the time $t_2$ at $x_0$ with velocity zero,
i.e.,
$$x''(t) = w(t), \tag{128}$$
$$x(t_1) = -x_0, \quad x'(t_1) = 0, \quad x(t_2) = x_0, \quad x'(t_2) = 0.$$
Here, it is decisive that we observe that the motor force is subject to
restrictions. We require, say, that
$$|w(t)| \le 1 \quad \text{for all } t.$$
Figure 37.21
Example 37.54. A solution of the problem must necessarily have the
following form: One accelerates maximally, i.e., $w = 1$, until one arrives at
$x = 0$ and then brakes maximally, i.e., $w = -1$.
This control principle, which can be easily realized technically, is called
the bang-bang principle because of the abrupt change of the control
situation.
Proof. We will apply the Pontrjagin maximum principle from Section 48.6.
To this end, we set
$$y_1 \stackrel{\mathrm{def}}{=} x, \qquad y_2 \stackrel{\mathrm{def}}{=} x'$$
and write the problem in the form
$$\int_{t_1}^{t_2} dt = \min!,$$
$$y_1'(t) = y_2(t), \qquad y_2'(t) = w(t),$$
$$y_1(t_1) = -x_0, \quad y_2(t_1) = 0, \quad y_1(t_2) = x_0, \quad y_2(t_2) = 0.$$
Let $w(\cdot)$ be piecewise continuous on $[t_1, t_2]$, where $w(\cdot)$ has jumps only in
the interior of $[t_1, t_2]$ and is, say, continuous from the right. Furthermore, let
$$\mathcal{H}(y, w, p, \lambda_0) \stackrel{\mathrm{def}}{=} p_1 y_2 + p_2 w - \lambda_0.$$
If $(y, w)$ is a solution, then, by Section 48.6, the following holds: There
exist real numbers $\lambda_0 \ge 0$, $\alpha_1$, $\alpha_2$ which are not all simultaneously zero and
functions $p_1$, $p_2$, such that the following holds at all points of continuity $t$ of
$w(\cdot)$:
$$p_0(t) \stackrel{\mathrm{def}}{=} \mathcal{H}(y(t), w(t), p(t), \lambda_0) = \max_{|u| \le 1} \mathcal{H}(y(t), u, p(t), \lambda_0)$$
and
$$p'(t) = -\mathcal{H}_y, \ \text{i.e., } p_1'(t) = 0, \ p_2'(t) = -p_1(t), \qquad p_0'(t) - \mathcal{H}_t = 0,$$
$$p_1(t_2) = -\alpha_1, \quad p_2(t_2) = -\alpha_2, \quad p_0(t_2) = 0.$$
From this it follows that $p_1(t) = -\alpha_1$, $p_2(t) = \alpha_1 (t - t_2) - \alpha_2$, $p_0(t) = 0$.
Now if $\alpha_1, \alpha_2 = 0$, thus $p_1, p_2 \equiv 0$, then, because $p_0(t) = \mathcal{H} = -\lambda_0$, we
would immediately have $\lambda_0 = 0$, contradicting the fact that $\alpha_1, \alpha_2, \lambda_0$ are
not simultaneously equal to zero. Consequently, $p_2 \not\equiv 0$. Now the maximum
relation (15) for $\mathcal{H}$ in Section 48.6 reads as follows:
$$p_2(t) w(t) - \lambda_0 = \max_{|u| \le 1} p_2(t) u - \lambda_0.$$
Thus,
$$w(t) = 1 \ \text{for } p_2(t) > 0, \qquad w(t) = -1 \ \text{for } p_2(t) < 0.$$
Since $p_2$ is a linear function, $w(\cdot)$ can change its sign at most once.
Furthermore, it is intuitively clear that a braking process must occur at time
$t_2$. Thus,
$$w(t) = 1 \ \text{for } 0 \le t < t_w, \qquad w(t) = -1 \ \text{for } t_w \le t \le t_2.$$
From (128) we obtain
$$x(t) = 2^{-1} t^2 - x_0 \quad \text{for } 0 \le t < t_w,$$
$$x(t) = -2^{-1} (t - t_2)^2 + x_0 \quad \text{for } t_w \le t \le t_2.$$
At the switching time $t_w$, the position and velocity of both solutions must
coincide; therefore, $t_2 = 2 t_w$, $t_w = \sqrt{2 x_0}$, $x(t_w) = 0$. □
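The result of Example 37.54 can be checked by a direct simulation. The Python sketch below applies $w = 1$ up to the switching time $t_w = \sqrt{2 x_0}$ and $w = -1$ afterwards, and integrates $x'' = w$ from $x = -x_0$, $x' = 0$ over $[0, 2 t_w]$. The value $x_0 = 2$, the step count, and the function names are assumptions chosen for illustration; the control is sampled at the midpoint of each step so that the switch falls exactly on a step boundary.

```python
import math


def bang_bang_control(t, x0):
    """w(t) = 1 before the switching time t_w = sqrt(2*x0), w(t) = -1 afterwards."""
    t_w = math.sqrt(2.0 * x0)
    return 1.0 if t < t_w else -1.0


def simulate(x0, steps=20000):
    """Integrate x'' = w from (x, x') = (-x0, 0) over [0, t2] with t2 = 2*t_w."""
    t_w = math.sqrt(2.0 * x0)
    t2 = 2.0 * t_w
    h = t2 / steps
    x, v = -x0, 0.0
    for i in range(steps):
        w = bang_bang_control((i + 0.5) * h, x0)   # control value on this step
        x += h * v + 0.5 * h * h * w               # exact update for constant w
        v += h * w
    return x, v, t2


x_end, v_end, t2 = simulate(2.0)
```

For $x_0 = 2$ one has $t_w = 2$ and $t_2 = 4$, and the simulated end state should agree with $(x_0, 0)$ up to rounding errors.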
In Pontrjagin (1961), Section 5, it is shown that the control found here in
fact solves the problem, i.e., the necessary condition is also sufficient.
The designation maximum principle is due to the fundamental maximum
relation for the Pontrjagin function $\mathcal{H}$.
References to the Literature
Classical works: Boltjanskii, Gamkrelidze, Pontrjagin (1956); Gamkrelidze
(1958) (linear systems); Boltjanskii (1958) (proof of the maximum principle);
Pontrjagin (1959, S). A variant of the maximum principle was already given
by Hestenes (1950) in a paper that remained in obscurity.
Historical survey of the calculus of variations and the maximum
principle: McShane (1978, S,H).
Introduction: Macki and Strauss (1982, M).
Bronstein and Semendjaev (1979, S) (handbook article); Frank (1969, M);
Fleming and Rishel (1975, M); Petrov (1977, M); Leitmann
(1981, M).
Pontrjagin (1961, M); Lee and Markus (1967, M); Boltjanskii (1971, M);
Russell (1979, M); Cesari (1983, M) (also, cf. the comprehensive references
to the literature for Chapter 48).
Application of control theory to classical problems of the calculus of
variations: McShane (1978).
Discrete maximum principle: Boltjanskii (1976, M).
Connection of the maximum principle with dynamic optimization:
Compare the references to the literature in Section 37.20.
37.22. The Synthesis Problem for Optimal Control
In Remark 37.53 we have already pointed out the great technical
significance of feedback control in optimal control theory and thereby the role of
the synthesis problem. To explain this further, parallel to Section 37.21, we
consider the problem
$$x''(t) = w(t),$$
$$x(t_1) = x_0, \quad x'(t_1) = x_0', \quad x(t_2) = x'(t_2) = 0, \tag{129}$$
$$|w(t)| \le 1 \quad \text{for all } t,$$
i.e., for a given initial state, we wish to attain $x = 0$ as rapidly as possible,
where the velocity equals zero at the arrival time. Figure 37.22 shows the
phase diagram in the $(x, x')$-plane. The curve $OB$ (respectively, $OA$) is a part
of the parabola $2x = -x'^2$ (respectively, $2x = x'^2$). We set
$$W(x, x') = \begin{cases} -1 & \text{above } AOB \text{ and on } BO, \\ +1 & \text{below } AOB \text{ and on } AO. \end{cases}$$
Example 37.55. The optimal solution $(x, w)$ necessarily has the form
$$w(t) = W(x(t), x'(t)), \qquad x''(t) = w(t). \tag{130}$$
This is a feedback control in the sense of Remark 37.53. For the form of
the optimal solution in the phase plane, in Fig. 37.22, we have: One starts at
$(x_0, x_0')$, follows the drawn-in trajectories $2x = \pm x'^2 + \text{constant}$ to $AOB$,
and then moves along $AOB$ up to the origin. By $w = W(x, x')$ the value
$w = \pm 1$ of the optimal control depends only on the state of the system, i.e.,
on the point in the phase space. A regulating system therefore need only
measure the position $x$ and the velocity $x'$ in order to determine the
control $w$.
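The feedback law $W$ lends itself to a short Python sketch. The switching curve $AOB$ is encoded here by the sign of $s = x + \frac{1}{2} x' |x'|$, which vanishes exactly on the two parabola branches $2x = -x'^2$ ($x' > 0$) and $2x = x'^2$ ($x' < 0$); this encoding, the explicit Euler discretization, the stopping tolerance, and the function names are assumptions made for illustration. A discretized feedback loop only follows the switching curve approximately, so the computed arrival time is approximate as well.

```python
import math


def W(x, v):
    """Feedback law of Section 37.22; v stands for the velocity x'."""
    s = x + 0.5 * v * abs(v)          # s = 0 describes the switching curve AOB
    if s > 0:
        return -1.0                   # above AOB
    if s < 0:
        return 1.0                    # below AOB
    return -1.0 if v > 0 else 1.0     # on BO (v > 0) or on AO (v < 0)


def time_to_origin(x, v, h=1e-3, tol=0.02, t_max=50.0):
    """Explicit Euler simulation of x'' = W(x, x') until the state is near (0, 0)."""
    t = 0.0
    while math.hypot(x, v) > tol and t < t_max:
        w = W(x, v)
        x, v = x + h * v, v + h * w
        t += h
    return t


t_arrival = time_to_origin(1.0, 0.0)
```

Starting from rest at $x = 1$, the exact minimal time is $2$ (brake-free acceleration with $w = -1$ for one time unit, then $w = +1$ for one time unit), and `t_arrival` should be close to this value.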
Figure 37.22
Exercise
Prove Example 37.55 parallel to Example 37.54. The solution can be found in
Pontrjagin (1961, M), Section 5. There it is also shown that (130) is not only
necessary but also sufficient, i.e., there exists exactly one optimal solution and this is
given by (130).
References to the Literature
Pontrjagin (1961, M); Lee and Markus (1967, M,B).
37.23. Elementary Provable Special Case of the
Pontrjagin Maximum Principle
We consider the abstract control problem
$$F(y, w) = \min!, \qquad G(y, w) = 0, \qquad w \in W \tag{131}$$
with the control $w$ and state $y$. If one writes this problem as
$$F(y, w) = \min!, \qquad (y, w) \in Z,$$
where $Z \stackrel{\mathrm{def}}{=} \{(y, w): G(y, w) = 0,\ w \in W\}$, then for favorable properties of
$Z$ (convexity, closedness) and $F$, one can apply the wide range of methods
to treat minimum problems (cf. Section 52.4).
An especially favorable case occurs when $G(y, w) = 0$ can be uniquely
solved for $y$ for each $w \in W$, i.e., $y = y(w)$. If we set $J(w) = F(y(w), w)$,
then (131) passes into
$$J(w) = \min!, \qquad w \in W. \tag{132}$$
Let $w_0$ be a solution of (132). Propositions of the type of the Pontrjagin
maximum principle can be obtained by using the following simple method:
(i) One studies the change of $J$ in going from $w_0$ to a suitable $w_\varepsilon$ (e.g.,
so-called needle variations $w_\varepsilon$).
(ii) One uses $[J(w_\varepsilon) - J(w_0)]/\varepsilon \ge 0$ for $\varepsilon > 0$ and recasts the limiting value
relation as $\varepsilon \to +0$ with the aid of the adjoint state $p$.
To explain this, we consider as a simple example:
$$F(y(T)) = \min!, \tag{133a}$$
$$y'(t) = A y(t) + b(t) + g(w(t)), \qquad 0 < t < T, \tag{133b}$$
$$y(0) = a, \qquad w(t) \in W.$$
Let $F, g, b: \mathbb{R} \to \mathbb{R}$ be continuously differentiable. The initial state $a \in \mathbb{R}$,
the end time $T \in \mathbb{R}$, and the set of control restrictions $W$ from $\mathbb{R}$ are given.
Suppose the control function $w: [0, T] \to \mathbb{R}$ is piecewise continuous. To each
such $w$ there belongs exactly one solution of (133b). According to a known
classical theorem, this solution has the form
$$y(t) = a + \int_0^t G(t, s) \bigl( b(s) + g(w(s)) \bigr)\, ds. \tag{134}$$
Here, $G$ is continuous and $G(t, t) = 1$ for all $t \in \mathbb{R}$ (cf., e.g., Günther (1972,
M), Vol. IV, page 63 for systems). We define the Pontrjagin function by
$$\mathcal{H}(t, y, w, p) = p \bigl( A y + b(t) + g(w) \bigr).$$
Now the maximum relation
$$\mathcal{H}(\tau, y_0(\tau), w_0(\tau), p(\tau)) = \max_{w \in W} \mathcal{H}(\tau, y_0(\tau), w, p(\tau)) \tag{135}$$
is crucial. Here, $p$ is determined uniquely from the so-called adjoint state
equation $p' = -\mathcal{H}_y$, i.e.,
$$p'(t) = -p(t) A \ \text{for } 0 < t < T, \qquad p(T) = -F'(y_0(T)). \tag{136}$$
Proposition 37.56. If $(y_0, w_0)$ is a solution of (133), then (135) and (136) hold
at all points of continuity $\tau \in [0, T[$ of $w_0(\cdot)$.
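To see the maximum relation (135) at work, one may look at a scalar instance in Python. The data below ($F(y) = y$, $g(w) = w$, $b \equiv 0$, $A = 0.5$, $a = 1$, $T = 1$, a five-point grid for $W = [-1, 1]$) and the function names are hypothetical choices for illustration. Since $F'(y) = 1$, the adjoint state from (136) is $p(t) = -e^{A(T - t)} < 0$, and only the term $p\,g(w)$ of $\mathcal{H}$ depends on $w$, so the maximizing control is $w = -1$. The sketch compares the resulting end state with those of other constant controls, using an explicit Euler scheme.

```python
import math


def adjoint(t, A, T, dF_at_end):
    """Solution of p' = -pA with p(T) = -F'(y0(T)), cf. (136)."""
    return -dF_at_end * math.exp(A * (T - t))


def maximizing_control(p, g, W_grid):
    """Maximize the w-dependent part p*g(w) of the Pontrjagin function over a grid."""
    return max(W_grid, key=lambda w: p * g(w))


def final_state(control, A, b, g, a, T, steps=2000):
    """Explicit Euler approximation of y(T) for y' = Ay + b(t) + g(w(t)), y(0) = a."""
    h = T / steps
    y = a
    for i in range(steps):
        t = i * h
        y += h * (A * y + b(t) + g(control(t)))
    return y


# Assumed illustrative data, not from the text.
A, T, a = 0.5, 1.0, 1.0
g = lambda w: w
b = lambda t: 0.0
W_GRID = [-1.0, -0.5, 0.0, 0.5, 1.0]

# F(y) = y, hence F'(y0(T)) = 1 independently of the optimal state.
w_star = {t: maximizing_control(adjoint(t, A, T, 1.0), g, W_GRID)
          for t in (0.0, 0.5, 0.9)}
y_star = final_state(lambda t: -1.0, A, b, g, a, T)
y_others = [final_state(lambda t, w=w: w, A, b, g, a, T) for w in W_GRID]
```

In this linear example the control selected by (135) is also the minimizer of $F(y(T))$; in general, Proposition 37.56 only states a necessary condition.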
Remark 37.57. (133) is a problem with fixed initial point $y(0)$ and free end
point $y(T)$ (see Fig. 37.23). If one also fixes the end point by means of the
additional condition y(T)=c, then not every control w(-) yields a path
which ends at c. Then a more complicated case occurs, for which the proof
of the maximum principle with the aid of the Lagrange multipliers is
considerably more difficult (cf. Section 48.7).
The following proof is completely analogous for systems of equations in
(133b). Then A is a matrix. Many general nonlinear problems with a free
end point can be handled with the same idea for the proof. In this
connection, compare Ioffe and Tihomirov (1974, M), page 149. Application
of similar methods to partial differential equations and integral equations
are cited in Section 37.24.
Figure 37.23
Figure 37.24
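Proof. (I) Needle variations (see Fig. 37.24). Let $w_0(\cdot)$ be continuous at
$\tau \in [0, T[$. For small $\varepsilon > 0$, we set
$$w_\varepsilon(t) \stackrel{\mathrm{def}}{=} \begin{cases} w_0(t) & \text{if } t \notin [\tau, \tau + \varepsilon], \\ w \in W & \text{if } t \in [\tau, \tau + \varepsilon]. \end{cases}$$
According to (134), let the state $y_\varepsilon(\cdot)$ belong to the control $w_\varepsilon(\cdot)$. Then, for
$t > \tau$,
$$y_\varepsilon(t) = a + \int_0^t G(t, s) \bigl( b(s) + g(w_0(s)) \bigr)\, ds + \int_\tau^{\tau + \varepsilon} G(t, s) \bigl[ g(w) - g(w_0(s)) \bigr]\, ds$$
$$= y_0(t) + \varepsilon\, G(t, \tau) \bigl[ g(w) - g(w_0(\tau)) \bigr] + o(\varepsilon) \quad \text{as } \varepsilon \to +0.$$
(II) The function $z$. We set
$$z(\tau) \stackrel{\mathrm{def}}{=} \lim_{\varepsilon \to +0} \varepsilon^{-1} \bigl[ y_\varepsilon(\tau + \varepsilon) - y_0(\tau + \varepsilon) \bigr],$$
$$z(t) \stackrel{\mathrm{def}}{=} \lim_{\varepsilon \to +0} \varepsilon^{-1} \bigl[ y_\varepsilon(t) - y_0(t) \bigr] \quad \text{for all } t > \tau$$
and show that
$$z(\tau) = g(w) - g(w_0(\tau)), \tag{137a}$$
$$z'(t) = A z(t) \quad \text{for all } t > \tau. \tag{137b}$$
(137a) follows from (I) and $G(\tau, \tau) = 1$. Since
$$y_\varepsilon(t) = y_\varepsilon(\tau + \varepsilon) + \int_{\tau + \varepsilon}^t \bigl[ A y_\varepsilon(s) + b(s) + g(w_\varepsilon(s)) \bigr]\, ds,$$
$$y_0(t) = y_0(\tau + \varepsilon) + \int_{\tau + \varepsilon}^t \bigl[ A y_0(s) + b(s) + g(w_0(s)) \bigr]\, ds,$$
and $w_\varepsilon(s) = w_0(s)$ for $s > \tau + \varepsilon$, by subtraction we have
$$z(t) = z(\tau) + \int_\tau^t A z(s)\, ds \quad \text{for all } t > \tau.$$
This yields (137b).
(III) First variation of $F$. From
$$F(y_\varepsilon(T)) - F(y_0(T)) \ge 0$$
after division by $\varepsilon > 0$ and letting $\varepsilon \to +0$, by the chain rule, it follows that
$$F'(y_0(T))\, z(T) \ge 0.$$
(IV) Adjoint state $p$. The adjoint state equation (136) yields
$$-p(T) z(T) \ge 0$$
and
$$(pz)'(t) = p'z + pz' = -p(t) A z(t) + p(t) A z(t) = 0$$
for all $t > \tau$, according to (137b). For this reason, $pz$ is constant on $[\tau, T]$;
therefore,
$$-p(\tau) z(\tau) \ge 0.$$
Since $z(\tau) = g(w) - g(w_0(\tau))$, this is (135). Observe that in the
construction of $w_\varepsilon$ we can choose $w \in W$ to be arbitrary. □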
37.24. Control with the Aid of Partial Differential
Equations
In many technological processes, the state of the system is described by
quantities that depend on the position variables and also on time. For
instance, think of the temperature or concentration in chemical processes,
elastic vibrations, electromagnetic fields in plasma, etc. In general, one
speaks of distributed parameters. In contrast to Section 37.21, in the
optimal control of such processes there appear partial differential equations
as the control equations (optimal modelling of metallurgical tempering
processes and smelting processes, of chemical processes, etc.). Control
theory for partial differential equations is a very comprehensive area which
is developing rapidly; however, there are still many open questions.
Existence proofs for optimal controls are mathematically difficult in
complicated cases. For the engineer, however, the calculation of optimal
controls and the construction of optimal regulation systems stand in the
forefront (solution of the synthesis problem). Here, only stable regulation
systems are of practical significance. However, since stability investigations
are already very difficult for uncontrolled complicated processes, there still
remains much to do in this area.
Interesting technological examples can be found in Butkovskii (1965, M),
(1975, M) and in Lurje (1975, M). In the derivation of the Pontrjagin
maximum principle, one can use the method of needle variation that was
used in that connection in Section 37.23 (cf. Butkovskii (1965, M), Chapter
1, Section 2; Lurje (1975, M), Chapter 1, Section 7; Bittner (1975);
von Wolfersdorf (1975), (1976)). In Lions (1971, M) the Hilbert space
methods for linear partial differential equations presented in Part II are
applied to control problems. Control problems will play a special role in the
solution of the problem of the century, that of nuclear fusion.
References to the Literature
Butkovskii (1965, M), (1975, M); Lions (1971, M); Lurje (1975, M); Ahmed
and Teo (1981, M,B). (Also, cf. the references to the literature for Chapter
54).
37.25. Extremal Problems with Stochastic Influences
Many processes in nature, in engineering, in economics, and in medicine are
subject to influences of chance or are a priori of a purely random nature. To
model them optimally one must reckon essentially with stochastic aspects.
To this end, we give several examples.
(1) First we again consider the production process of Fig. 37.20 in Section
37.20 and now assume that the initial quantities $x_i$ are perturbed randomly.
Then one attempts to control this process so that instead of the cost it is the
expected value of the cost that turns out to be as small as possible.
(2) In order to keep the cost minimal for a large warehouse one must take
into account that the demand is subject to laws of randomness. At certain
discrete points in time, e.g., every month, the warehouse has to be filled so
that the demand will be covered, but on the other hand, the wares should
remain in the warehouse the shortest possible time.
(3) Say that a taxi enterprise has to decide each quarter which taxis
should be repaired or replaced by new ones. In this connection, the cost is
to be kept minimal but with maximal safety.
In (1)-(3), it is a matter of the optimization of discrete stochastic decision
processes. In this connection, compare Ross (1970, M), Astrom (1970, M)
and Girlich (1973, M). In the last reference, linear optimization problems
with coefficients that are random variables are also investigated. In
continuous control processes, stochastic differential equations frequently appear as
control equations. For this, we present two examples.
(4) In Astrom (1970, M), page 188, it is described in detail how the
production process in a paper mill, which is subject to stochastic
perturbations, is controlled by a process computer in such a way that the quality of
the paper produced varies as little as possible from a prescribed correct
value.
(5) In the proceedings edited by De Giorgi (1979), on pages 339-360,
465-492, 583-662, the complicated optimal modelling of water power
networks, e.g., in France is investigated. Here, among other things, the
demand for electricity and the water level in the storage reservoir are
stochastic. One can control the work of turbines. The expected value of the
cost of generating energy is to be minimized.
We will now briefly point out several mathematical aspects.
37.25a. Filtering and Prognosis of Stationary Stochastic
Processes According to N. Wiener
In order to have something concrete at hand, we consider the basic technical
problem of the optimal filtering and prognosis of signals. The point of
departure is the formula
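$$\lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} \bigl[ g(t + h) - x(t) \bigr]^2\, dt = \min!, \tag{138a}$$
where
$$x(t) = \int_0^\infty K(\tau) f(t - \tau)\, d\tau, \tag{138b}$$
as well as the Wiener-Hopf integral equation
$$C_{fg}(t + h) - \int_0^\infty K(\tau)\, C_{ff}(t - \tau)\, d\tau = 0, \qquad t > 0, \tag{139}$$
where
$$C_{fg}(t) \stackrel{\mathrm{def}}{=} \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{T} f(\tau)\, g(t + \tau)\, d\tau.$$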
Schematically we consider a regulation system such as in Fig. 37.25 with
the input signal $f$ and the output signal $x$. The quantities $f$ and $x$ are
related by (138b), i.e., $x(t)$ depends linearly on the values $f(s)$ for all earlier
times $s \le t$. This situation occurs, e.g., in regulation systems for which $x$ and
$f$ are connected by a linear ordinary differential equation $Lx = f$ with
constant coefficients. We now assume that the input signal $f$ is composed of
a signal $g$ and the perturbation $f - g$. The problem (138a) reads as follows:
A regulating system is sought, i.e., we seek a function $K$ such that the time
average value of $[g(t + h) - x(t)]^2$, with fixed $h$, is minimal. The
construction of this regulation system guarantees the engineer that the measured
output signal $x(t)$ optimally approximates the signal sought, $g(t + h)$.
For $h = 0$ (respectively, $h > 0$) this is a filtering problem (respectively, a
prognosis problem).
After several calculations, one obtains the integral equation (139) as a
necessary and sufficient condition for a solution of the original problem
(138). This integral equation was solved by N. Wiener with the aid of the
Fourier transform. Details can be found in Wiener (1949, M). Technical
applications are described in Wiener (1949, M) and in Solodovnikov (1965,
M). The investigations of Wiener were made independently of the earlier
work of Kolmogorov (1941), who considered the discrete case.
Figure 37.25
In the integral equation (139) it is crucial that to determine $K$, the signals
$f$, $g$ themselves are not required but rather only the cross correlation
function $C_{fg}$ and the autocorrelation function $C_{ff}$. These are statistical
characteristics of the signals.
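The time-average definition of the correlation functions can be tried out numerically. The Python sketch below approximates $\frac{1}{2T} \int_{-T}^{T} f(\tau)\, g(t + \tau)\, d\tau$ for a large but finite $T$ by the midpoint rule; the symmetric normalization $1/(2T)$, the truncation $T = 200$, the test signal $f = g = \sin$, and the function name are assumptions for illustration. For the sine signal the limit is $C_{ff}(t) = \frac{1}{2} \cos t$.

```python
import math


def time_average_correlation(f, g, t, T=200.0, steps=200000):
    """Midpoint-rule approximation of (1/2T) * integral_{-T}^{T} f(tau) g(t+tau) dtau."""
    h = 2.0 * T / steps
    total = 0.0
    for i in range(steps):
        tau = -T + (i + 0.5) * h
        total += f(tau) * g(t + tau)
    return total * h / (2.0 * T)


# Autocorrelation of a pure sine signal at two time shifts.
c0 = time_average_correlation(math.sin, math.sin, 0.0)
c1 = time_average_correlation(math.sin, math.sin, 1.0)
```

The finite truncation leaves an error of order $1/T$, so the computed values agree with $\frac{1}{2} \cos t$ only approximately.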
The optimal regulation systems are identical for all $f$, $g$ with the same
$C_{fg}$, $C_{ff}$. In the following, by a stochastic process we understand precisely
the conceptual formulation that we defined measure theoretically in Section
37.12. No stochastic processes occur explicitly in the Wiener theory. There
one works with functions $f$, $g$ and it is only required that $C_{ff}$, $C_{gg}$ and
$C_{fg}$ exist. Due to the conceptual simplicity of this notion, it is applied in
many practice-oriented expositions. However, stationary ergodic stochastic
processes lurk in the background. Stationarity means that all distribution
functions are invariant with respect to time translations; in particular, the
expected values $E[f(t)]$, $E[g(t)]$ are temporally constant, and the covariances
$\mathrm{Cov}(f(t), f(s))$, $\mathrm{Cov}(f(t), g(s))$, and $\mathrm{Cov}(g(t), g(s))$ depend only on
the time difference $t - s$. Ergodicity means that the expected values can be
replaced by the time average values.
The Wiener method for stationary stochastic processes is presented within
the framework of the spectral theory of such processes in Rozanov (1975,
M), Chapter IV, 2.1. Also, compare Brillinger (1975, M).
37.25b. The Kalman-Bucy Filter for Nonstationary
Stochastic Processes
In the following we restrict ourselves to the heuristic description of a
number of basic concepts in the theory of stochastic processes. The precise
definitions together with physical motivations can be found, e.g., in the
introductory exposition by Arnold (1973, M).
The point of departure is the situation depicted in Fig. 37.26, together
with the formulas:
$$g'(t) + A(t) g(t) = \sigma(t) w'(t), \tag{140a}$$
$$f'(t) + B(t) g(t) = \sigma_1(t) w_1'(t), \tag{140b}$$
$$\hat g'(t) + C(t) \hat g(t) = D(t) f'(t). \tag{140c}$$
We seek the best possible approximation $\hat g$ for $g$. In this connection, $g$ arises
in a system that is described by (140a). However, we do not know $g$, but can
only measure $f$ by (140b). It is crucial that stochastic terms $\sigma w'$, $\sigma_1 w_1'$ appear
in (140a) and (140b). Under suitable assumptions, it is possible to show that
the best (in a certain sense) approximation $\hat g$ satisfies the differential
equation (140c), where $C$ and $D$ can be calculated. With the aid of (140c),
one can construct a dynamic regulation system that filters out $g$ from $f$. This
is the Kalman-Bucy filter. The precise formulation and proof can be found
in Fleming and Rishel (1975, M), page 136. In this connection, duality with
respect to the deterministic linear regulator problem in Example 37.52 is
exploited in a remarkable way.
Figure 37.26 (block diagram: the system (140a) with perturbation $\sigma w'$ produces $g$; the measurement (140b) with perturbation $\sigma_1 w_1'$ produces $f$; $f$ enters the filter (140c))
We are dealing with so-called white noise in the stochastic perturbations
$w'$, $w_1'$. Roughly speaking, these are strongly fluctuating disturbances which
are mutually independent at different times. Formally, $w'$ and $w_1'$ are
obtained as derivatives of Wiener processes $w$, $w_1$, i.e., one can intuitively
picture $t \mapsto w(t)$ and $t \mapsto w_1(t)$ to be the paths of particles in Brownian
motion under a microscope. These particles execute very strong quivering
movements. The exact interpretation of the stochastic differential equation
(140a) is obtained with the aid of
$$g(t) = g(t_0) - \int_{t_0}^t A(s) g(s)\, ds + \int_{t_0}^t \sigma(s)\, dw(s). \tag{140a'}$$
Here, the second integral on the right-hand side is to be understood as the
Ito integral. It is defined by an approximation process with the aid of
suitable step functions. A rapid approach to all this is contained in Rozanov
(1975, M) and Arnold (1973, M).
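The step-function approximation behind (140a') suggests a simple simulation scheme, often called the Euler-Maruyama method (a name and technique not used in the text). In the Python sketch below the coefficient $\sigma$ is frozen at the left endpoint of each subinterval and multiplied by an independent Gaussian increment of the Wiener process with variance equal to the step length. The coefficients $A \equiv 1$, $\sigma \equiv 0.5$, the seed, the sample sizes, and the function name are assumptions for illustration.

```python
import math
import random


def euler_maruyama(g0, A, sigma, t0, t1, steps, rng):
    """Approximate g(t1) for g(t) = g(t0) - int A g ds + int sigma dw, cf. (140a')."""
    h = (t1 - t0) / steps
    g, t = g0, t0
    for _ in range(steps):
        dw = rng.gauss(0.0, math.sqrt(h))      # Wiener increment w(t+h) - w(t)
        g += -A(t) * g * h + sigma(t) * dw     # sigma evaluated at the left endpoint
        t += h
    return g


rng = random.Random(0)                          # fixed seed for reproducibility

# Without noise the scheme reduces to the explicit Euler method for g' = -g.
deterministic = euler_maruyama(1.0, lambda t: 1.0, lambda t: 0.0, 0.0, 1.0, 1000, rng)

# With noise, the sample mean of g(1) should be close to exp(-1).
samples = [euler_maruyama(1.0, lambda t: 1.0, lambda t: 0.5, 0.0, 1.0, 200, rng)
           for _ in range(2000)]
sample_mean = sum(samples) / len(samples)
```

Evaluating the integrand at the left endpoint is what distinguishes the Ito integral from other stochastic integrals; the sketch makes no claim about convergence rates.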
If one wishes to define white noise exactly, then one must consider
generalized stochastic processes (cf. Arnold (1973, M), 3.2). These are
generalized functions (distributions) which depend on chance. Such
generalized stochastic processes play an important role in models of quantum
field theory (cf. Simon (1974, M)).
37.25c. Optimal Regulation of Stochastic Dynamic Systems
A typical example for this is
$$E \int_{t_0}^{t_1} L(s, g(s), u(s))\, ds = \min!,$$
$$g'(t) + A(t, g(t), u(t)) = \sigma(t, g(t), u(t))\, w'(t), \qquad g(t_0) = a.$$
The stochastic differential equation for $g$ is again to be understood in the
sense of (140a'). Here it is a question of minimizing the expected value of an
integral, where the control equation is stochastic. The term with the white
noise $w'$ describes stochastic perturbations. An optimal control $u(\cdot)$ is
sought. In this connection, it is important that $u(\cdot)$ can be determined in
correspondence to the stochastic nature of the dynamical system in
feedback control form. The details can be found in Fleming and Rishel (1975,
M), Chapter VI.
In the consideration of discrete time states, there arise control problems
for Markov chains (cf. Astrom (1970, M); Ross (1970, M); and Girlich
(1973, M)). As an introduction to the application of dynamic optimization
to stochastic decision processes, we recommend Bellman (1957, M) and
White (1969, M).
References to the Literature
Classical works on filter theory and prognosis theory: Kolmogorov (1941);
Wiener (1949, M); Kalman and Bucy (1961).
General survey: Control theory and topics in functional analysis (1976,
M), Vol. Ill (proceedings of an international seminar in Trieste).
Introduction to stochastic control theory: Astrom (1970, M,B,H);
Fleming and Rishel (1975, M,B); Balakrishnan (1975, M).
General expositions: Bellman (1957, M) and White (1969, M) (dynamic
optimization); Bryson and Ho (1969, M); Meditch (1969, M); Girlich (1973,
M) (discrete decision models); Gihman and Skorohod (1977, M) (abstract
methods).
Prognosis and filtering: Kroschel (1973, L), Part 2 (introduction); Wiener
(1949, M); Bucy and Joseph (1968, M); Kalman, Falb, and Arbib (1969, M)
(general systems theory); Astrom (1970, M,H,B); Bensoussan (1971, M);
Arnold (1973, M); Fleming and Rishel (1975, M); Lipcer and Sirjajev (1977,
M), Kallianpur (1980, M).
Technical applications: Solodovnikov (1965, M); Schlitt (1968, M);
Kroschel (1973, L).
Time series: Jenkins and Watts (1968, M); Box and Jenkins (1970, M);
Hannan (1970, M) (multiple time series); Konig and Wolters (1972, M);
Brillinger (1975, M); Priestley (1981, M).
Stochastic optimization and variational inequalities: van Moerbeke (1974,
S), (1976); Friedman (1975, M), (1979, S); Bensoussan and Lions (1978, M);
Bensoussan (1981, M) (recommended as an introduction).
Applications: Solodovnikov (1965, M); Bucy and Joseph (1968, M);
Bryson and Ho (1969, M); Astrom (1970, M); Ross (1970, M); Girlich
(1973, M); van Moerbeke (1974, M); De Giorgi (1979, P).
General References to the Literature
Probability theory and stochastic processes: Gnedenko (1962, M); Feller
(1968, M); Arnold (1973, M); and Rozanov (1975, M) (introductions);
Prohorov and Rozanov (1969, M) (handbook); Doob (1953, M); Meyer
(1966, M); Karlin (1968, M); Karlin and Taylor (1980, M); Gihman and
Skorohod (1969, M), (1971, M), Volumes I-III; Loeve (1978, M); Wentzell
(1979, M).
Generalized stochastic processes: Arnold (1973, M) (introduction);
Gelfand and Vilenkin (1964, M); Balakrishnan (1975, M); Simon (1974, M)
(applications in quantum field theory).
Stochastic differential equations: Arnold (1973, M) and Rozanov (1975,
M) (introduction); Gihman and Skorohod (1972, M); Friedman (1975, M);
Wentzell (1979, M); Ladde and Lakshmikantham (1980, M).
Handbook of queuing theory: Gnedenko and König (1983).
37.26. The Courant Maximum-Minimum Principle.
Eigenvalues, Critical Points, and the Basic
Ideas of the Ljusternik-Schnirelman Theory
A fundamental problem of the theory of extremal problems consists in
finding estimates for the number of critical points of functionals using
topological tools. For this purpose, we have two different methods at our
disposal:
(i) The Ljusternik-Schnirelman theory,
(ii) Morse theory.
We treat the basic ideas in this section and the next one. Together with the
theory of the fixed point index from Part I, (i) and (ii) present the
topological heart of nonlinear functional analysis.
As a point of departure we choose the linear eigenvalue problem
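$Au = \lambda u, \quad u \in \mathbb{R}^N, \ \lambda \in \mathbb{R}, \ \|u\| = 1,$ (141)
where $A: \mathbb{R}^N \to \mathbb{R}^N$ is a symmetric $N \times N$ matrix. Let $\|\cdot\|$ be the Euclidean
norm. If we set $A = (a_{ij})$, $u = (\xi_1, \dots, \xi_N)$, and
$F(u) = 2^{-1} \sum_{i,j=1}^{N} a_{ij} \xi_i \xi_j,$ (142)
i.e., $F(u) = 2^{-1}(Au|u)$, then $F' = A$, and we can write (141) in the form
$F'(u) = \lambda u, \quad u \in X, \ \lambda \in \mathbb{R}, \ \|u\| = 1,$ (143)
where $X = \mathbb{R}^N$. The goal of the Ljusternik-Schnirelman theory is the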
investigation of (143) for nonlinear operators F' in B-spaces X. In order to
get an idea of the results that one can expect, we first formulate a number of
known propositions for (141) and (142).
Proposition 37.58. The following five assertions hold for (141):
(1) There exist at least N eigenvector pairs (u, -u).
(2) If $\lambda$ is a k-fold eigenvalue, then the corresponding eigenvectors lie on a
(k - 1)-dimensional sphere.
(3) The eigenvectors are exactly the critical points of F with respect to the unit
sphere S.
(4) A minimum (respectively, maximum) of F on S corresponds to the smallest
(respectively, largest) eigenvalue. All the other eigenvalues correspond to
saddle points.
(5) If the eigenvalues are ordered in the form $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_N$
corresponding to their multiplicities, then
$\lambda_m / 2 = \sup_{S_m \in \mathscr{S}_m} \inf_{u \in S_m} F(u), \quad m = 1, \dots, N.$ (144)
Here, $\mathscr{S}_m$ is the class of all (m - 1)-dimensional spheres, i.e., $S_m = S \cap X_m$,
where $X_m$ is an arbitrary m-dimensional subspace of $\mathbb{R}^N$.
The important characterization (144) of $\lambda_m$ is due to E. Fischer and H.
Weyl at the beginning of this century; because of the further development
of this principle by R. Courant, it is frequently referred to as the Courant
maximum-minimum principle. (144) is the starting point for the
Ljusternik-Schnirelman theory. In this connection, $\mathscr{S}_m$ is replaced by a
more comprehensive class $\mathscr{K}_m$.
Proof. (1) and (2) This is a classical proposition of linear algebra.
(3) Apply the Lagrange multiplier rule described in Section 37.3, first
formally to the problem
$F(u) = \text{stationary!}, \quad 2^{-1} \sum_{i=1}^{N} \xi_i^2 = 2^{-1},$
and then obtain (143) as a necessary condition. (3) is obtained rigorously
from Proposition 43.23.
(4) From linear algebra it is known that there exists a rotation u = Tv such
that A passes to diagonal form and F to a sum of squares
$F_1(v) = 2^{-1} \sum_{i=1}^{N} \lambda_i \eta_i^2, \quad \lambda_1 \le \dots \le \lambda_N,$ (145)
where $F_1(v) = F(Tv)$, $v = (\eta_1, \dots, \eta_N)$. The numbers $\lambda_i$ are the eigenvalues of
A. We get the eigenvectors on S corresponding to $\lambda_i$ by considering the
intersection points between the $\eta_i$-axis and S, i.e., $v_i = (0, \dots, 0, \pm 1, 0, \dots, 0)$,
where ±1 occupies the ith position. Since a rotation leaves the critical
points unaltered, the assertion follows easily from (145). As an illustration,
we consider the special case
$F_1(v) = 2^{-1}(\lambda_1 \eta_1^2 + \lambda_2 \eta_2^2 + \lambda_3 \eta_3^2), \quad \lambda_1 < \lambda_2 < \lambda_3.$
$F_1$ has a maximum (respectively, a minimum) on S at the points $\pm v_3 = (0, 0, \pm 1)$
[respectively, $\pm v_1 = (\pm 1, 0, 0)$]. On the other hand, $\pm v_2 = (0, \pm 1, 0)$
is a saddle point, for, when $\eta_1 = 0$ (respectively, $\eta_3 = 0$), then $F_1$
has a minimum (respectively, a maximum) on the corresponding circle
$\eta_2^2 + \eta_3^2 = 1$ (respectively, $\eta_1^2 + \eta_2^2 = 1$) at $\eta_2 = \pm 1$, $\eta_3 = 0$ (respectively, $\eta_1 = 0$,
$\eta_2 = \pm 1$).
(5) This is obtained in a manner analogous to the proof of Theorem 22.E
in Section 22.9. □
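The following Python sketch offers a numerical check of (144) in the simplest case N = 2, assuming the two eigenvalues are ordered with the largest first. The sample matrix and all function names are illustrative choices and do not come from the text. For m = 1 each admissible sphere is a pair {u, -u}, so the sup-inf reduces to the maximum of F on the unit circle; for m = 2 the only admissible sphere is S itself, so it reduces to the minimum.

```python
import math

def half_quadratic_form(a11, a12, a22, t):
    """F(u) = (1/2)(Au|u) at the unit vector u = (cos t, sin t)."""
    x, y = math.cos(t), math.sin(t)
    return 0.5 * (a11 * x * x + 2.0 * a12 * x * y + a22 * y * y)

def eigenvalues_2x2(a11, a12, a22):
    """Eigenvalues of the symmetric matrix [[a11, a12], [a12, a22]], largest first."""
    mean = 0.5 * (a11 + a22)
    radius = math.hypot(0.5 * (a11 - a22), a12)
    return mean + radius, mean - radius

# Hypothetical sample matrix (an assumption for illustration only).
a11, a12, a22 = 2.0, 1.0, -1.0
lam_1, lam_2 = eigenvalues_2x2(a11, a12, a22)

# Sample F on the unit circle S.
n = 20000
samples = [half_quadratic_form(a11, a12, a22, 2.0 * math.pi * k / n)
           for k in range(n)]

# m = 1: F is even, so inf F over {u, -u} is F(u); take the sup over all u.
sup_inf_m1 = max(samples)
# m = 2: the only 1-dimensional sphere is S itself; take inf F over S.
sup_inf_m2 = min(samples)

print(sup_inf_m1, lam_1 / 2.0)
print(sup_inf_m2, lam_2 / 2.0)
```

Up to the sampling error of the grid, the two printed pairs should agree, which is the content of (144) for this small example.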
In order to generalize Proposition 37.58 to nonlinear problems within the
context of the Ljusternik-Schnirelman theory, we replace (144) by
$c_m = \sup_{K \in \mathscr{K}_m} \inf_{u \in K} F(u), \quad m = 1, \dots, N.$ (146)
Here, $\mathscr{K}_m$ is the class of all compact symmetric sets K on the unit sphere S
with $\operatorname{gen} K \ge m$. The number gen K denotes the so-called genus of K that we
shall consider in Section 44.3. In particular, $\operatorname{gen} S_m = m$; therefore, $\mathscr{K}_m \supseteq \mathscr{S}_m$,
i.e., the spheres in (144) are replaced by more general sets. However, one
can show that $c_m = \lambda_m / 2$ in the special case (141), i.e., for $F(u) =
2^{-1}(Au|u)$. The first main result of the Ljusternik-Schnirelman theory is
the following proposition.
Proposition 37.59 (Ljusternik (1930) and Schnirelman (1930)). If the
function $F: \mathbb{R}^N \to \mathbb{R}$ is even and F possesses continuous first partial derivatives,
then the system of equations
$\frac{\partial F}{\partial \xi_i}(u) = \lambda \xi_i, \quad i = 1, \dots, N,$ (147)
where $u \in \mathbb{R}^N$, $\lambda \in \mathbb{R}$, $\|u\| = 1$, has at least N pairs of eigenvectors (u, -u).
This proposition is a special case of Theorem 44.B in Section 44.9.
Equation (147) corresponds to $F'(u) = \lambda u$. The basic idea of the proof is
that for each cm in (146), one finds a critical point u of F with respect to S
with F(u)= cm. According to the Lagrange multiplier rule, an eigenvector of
F' corresponds to this point u. If all of the cm's are mutually distinct, then
one obtains N pairs (u, — u) of eigenvectors in this way. Otherwise, if
$c_m = c_{m+1} = \dots = c_{m+p}$ for $p \ge 1$, then the genus of the set of all
eigenvectors of (147) on S with $F(u) = c_m$ is greater than or equal to p + 1. In
particular, there then exist infinitely many eigenvectors on S. This result
generalizes the multiplicity assertion in Proposition 37.58, (2).
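As a small illustration of Proposition 37.59 for N = 2, the following Python sketch counts the critical points on the unit circle of one particular even function, F(x, y) = x^4 + 2y^4. This function, the grid size, and all names are assumptions chosen for illustration; the text does not discuss this example. Critical points of F with respect to S are located as sign changes of the derivative of t -> F(cos t, sin t).

```python
import math

def tangential_derivative(t):
    """d/dt of f(t) = F(cos t, sin t) for the even function F(x, y) = x**4 + 2*y**4."""
    x, y = math.cos(t), math.sin(t)
    # Chain rule: f'(t) = F_x * (-sin t) + F_y * cos t.
    return 4.0 * x**3 * (-y) + 8.0 * y**3 * x

def count_critical_points_on_circle(derivative, n=3600):
    """Count sign changes of the tangential derivative on a cyclic grid.

    The grid is shifted by half a step so that it avoids the zeros of the
    derivative that lie on the coordinate axes.
    """
    grid = [2.0 * math.pi * (k + 0.5) / n for k in range(n)]
    signs = [derivative(t) > 0.0 for t in grid]
    return sum(1 for k in range(n) if signs[k] != signs[(k + 1) % n])

critical_points = count_critical_points_on_circle(tangential_derivative)
pairs = critical_points // 2  # critical points come in pairs (u, -u)
print(critical_points, pairs)
```

For this sample function the count gives four pairs (u, -u), which is consistent with the lower bound of N = 2 pairs asserted by the proposition.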
The following generalization of Proposition 37.59 to infinite-dimensional
spaces is important for nonlinear functional analysis.
Proposition 37.60 (Ljusternik (1939)). The equation
$F'(u) = \lambda u, \quad u \in X, \ \lambda \in \mathbb{R}, \ \|u\| = 1$
possesses an infinite number of pairs (u, -u) of eigenvectors in case the
following three assertions hold:
(i) X is a real separable H-space such that $\dim X = \infty$.
(ii) $F: X \to \mathbb{R}$ is even, $F \in C^1(X, \mathbb{R})$, and F' is compact.
(iii) F(0) = F'(0) = 0, and $u \ne 0$ implies that $F(u) \ne 0$, $F'(u) \ne 0$.
This proposition, which was proved in weaker form by Ljusternik (1939),
is a special case of Proposition 44.17. If $A: X \to X$ is a compact linear
operator and one sets $F(u) = 2^{-1}(Au|u)$, then $F' = A$, and Proposition
37.60 is transformed into a known proposition of linear functional analysis
(cf. Theorem 22.E in Section 22.5).
References to the Literature
Ljusternik (1930) (classical work); Krasnoselskii (1956, M); Vainberg (1956,
M); Schwartz (1969, L); Rabinowitz (1974, S).
37.27. Critical Points and the Basic Ideas of the
Morse Theory
The Morse theory investigates the local and global behavior of critical
points of functions $F: M \to \mathbb{R}$. In this connection, the quadratic terms in the
Taylor expansion of F and their nondegeneracy play a crucial role in the
local investigation. The global propositions connect the topological
properties of M and the number of nondegenerate critical points of F. In
particular, one obtains estimates for the number of nondegenerate local
maxima, local minima, and saddle points.
Many results can also be generalized to infinite-dimensional spaces or
manifolds M. For instance, these generalizations play an important role in
the investigation of geodesics on surfaces or on more general manifolds.
In this book we do not go into the proofs of Morse theory since a deep
knowledge of algebraic topology and differential geometry is needed for a
profound understanding of this theory and its applications, an
understanding not expected of the reader. We content ourselves with working out the
basic ideas in the most elementary way possible and giving a number of
references to the literature for an effective study. In this way, we hope to
arouse the interest of the reader in this excellent mathematical theory,
which has left an essential imprint on modern analysis. The connection
of Morse theory to the central modern concepts of
(α) singularity,
(β) transversality,
(γ) generic property,
(δ) stable unfolding or deformation of singularities
is treated in the next section under the caption "catastrophe theory."
In order to describe the basic ideas under unified assumptions, we
assume, for the sake of simplicity, that the functions are arbitrarily often
continuously differentiable, i.e., we consider only $C^\infty$-functions. Diffeomorphisms,
which we have defined in Section 4.13, will play a central role.
37.27a. The Simplest Situation in $\mathbb{R}^1$
Let $F: \mathbb{R} \to \mathbb{R}$ be a $C^\infty$-function. $u_0$ is a critical point of F if and only if
$F'(u_0) = 0$. Then the Taylor expansion of F is
$F(u) = F(u_0) + a(u - u_0)^2 + o(|u - u_0|^2), \quad u \to u_0,$ (148)
where $a = F''(u_0)/2$. The critical point $u_0$ is said to be nondegenerate if
and only if $a \ne 0$. The Morse index of $u_0$ is equal to 1 for a < 0 (local
maximum) and equal to 0 for a > 0 (local minimum). The functions $F(u) =
u^k$, $k \ge 3$, for example, have a critical point at u = 0, which, however, is
degenerate. The following formulations are so selected that they carry over
to essentially more general situations. In (i) and (ii) below, it is assumed that
$u_0$ is a nondegenerate critical point.
(i) Normal form. There exists a transformation of coordinates $u = \varphi(v)$,
with $u_0 = \varphi(0)$, such that
$F(\varphi(v)) = F(u_0) + (\operatorname{sgn} a)\, v^2.$ (149)
Here, $\varphi$ is a $C^\infty$-diffeomorphism in a neighborhood of zero.
(ii) Stability. Each transformation of coordinates $u = \varphi(v)$, with $u_0 =
\varphi(v_0)$, which is a $C^\infty$-diffeomorphism in a neighborhood of $v_0$, leaves invariant
the property of $u_0$ being a nondegenerate critical point. Since
$\varphi'(v_0) \ne 0$, this fact follows easily from the chain rule.
Furthermore, for given $\varepsilon > 0$, each $C^\infty$-function $G: \mathbb{R} \to \mathbb{R}$ also possesses
a nondegenerate critical point $v_0$ with $|u_0 - v_0| < \varepsilon$ when $|F - G|$, $|F' - G'|$,
and $|F'' - G''|$ are sufficiently small on a suitable neighborhood of $u_0$.
Moreover, $v_0$ has the same Morse index as $u_0$.
Degenerate critical points $u_0$ do not have this stability property. For
example, because $F_0'(0) = F_0''(0) = 0$, $F_t(u) = u^3 + tu$ has a degenerate
critical point at u = 0 for t = 0, whereas the perturbed function $F_t$ has no critical
point at all for t > 0 because $F_t'(u) = 3u^2 + t > 0$ (see Fig. 37.34).
(iii) Morse-Sard Theorem. The number c is called a critical value of F or
a critical level if and only if there exists a critical point $u_1$ such that
$F(u_1) = c$. The following holds: The set of critical values of F has measure
zero in $\mathbb{R}$. Intuitively, this assertion means that the critical values are rare
exceptions.
(iv) Morse functions. A $C^\infty$-function $G: \mathbb{R} \to \mathbb{R}$ is called a Morse function
if and only if it possesses only nondegenerate critical points. These points
cannot have a finite limit point. Let $F: \mathbb{R} \to \mathbb{R}$ be a $C^\infty$-function. We set
$G(u) = F(u) + au$. Then G is a Morse function for almost all $a \in \mathbb{R}$. This
proposition intuitively asserts that the majority of all functions are Morse
functions.
(v) Level sets. We set $M_c = \{u \in \mathbb{R}: F(u) \le c\}$.
Case 1: Let $-\infty < a < b < +\infty$. If the set $F^{-1}[a, b]$ is compact and
contains no critical points, then the set $M_a$ is $C^\infty$-diffeomorphic to $M_b$. In
Fig. 37.27(a), $M_b$ arises from $M_a$ by means of a simple deformation.
Case 2: Let $u_0$ be a nondegenerate critical point of F such that $F(u_0) = c$.
If $F^{-1}[c - \varepsilon, c + \varepsilon]$ is compact for some $\varepsilon > 0$ and $u_0$ is the only critical point
in this set, then $M_{c+\varepsilon}$ is $C^\infty$-diffeomorphic to a set which arises from $M_{c-\varepsilon}$
by the adjunction of an interval. In Fig. 37.27(b), $M_{c-\varepsilon} = \emptyset$ while $M_{c+\varepsilon}$ is an
interval.
Roughly speaking, the level sets change their structure significantly only
upon passing through a critical level.
(vi) Global estimates for the number of critical points (Morse inequalities).
The function F(u) = u has no critical points on $\mathbb{R}$; but if it is known that
the $C^2$-function $F: \mathbb{R} \to \mathbb{R}$ possesses only nondegenerate critical points, and
$F(u) \to +\infty$ as $|u| \to \infty$, then
$M_0 \ge 1, \quad M_1 - M_0 = -1.$ (150)
Here, $M_i$ denotes the number of critical points having Morse index i, i.e., $M_0$
and $M_1$ are the number of local minima and maxima, respectively. These
estimates are obtained easily from the fact that a maximum must lie
between any two minima.
The estimates of the type (150) depend crucially on the topology of the
set on which F is defined. For example, let $F: S^1 \to \mathbb{R}$ be a $C^2$-function on
the boundary $S^1$ of the unit disk in $\mathbb{R}^2$. Critical points are defined in an
analogous way using local coordinates. If F has only nondegenerate
critical points, then, in contrast to (150),
$M_0 \ge 1, \quad M_1 - M_0 = 0$ (151)
holds.
[Figure 37.27, panels (a) and (b)]
Analogous results for closed surfaces in $\mathbb{R}^3$ will be given below in
Example 37.63.
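The statements on stability in (ii) and on the counting relations (150) and (151) can be tried out numerically. The Python sketch below counts local minima and maxima from sign changes of F' on a grid. The sample functions u^4 - 2u^2 on the line and cos(2θ) + 0.5 cos(θ) on the circle, the interval [-3, 3], and all names are assumptions made for illustration; only the family u^3 + tu comes from the text.

```python
import math

def count_extrema(derivative, grid, cyclic=False):
    """Return (minima, maxima) from sign changes of F' along the grid.

    A change from negative to positive marks a local minimum, the reverse a
    local maximum. Grid points are assumed not to hit a zero of F' exactly.
    """
    signs = [derivative(u) > 0.0 for u in grid]
    steps = list(zip(signs, signs[1:]))
    if cyclic:
        steps.append((signs[-1], signs[0]))  # close the circle
    minima = sum(1 for before, after in steps if not before and after)
    maxima = sum(1 for before, after in steps if before and not after)
    return minima, maxima

n = 6000
line = [-3.0 + 6.0 * (k + 0.5) / n for k in range(n)]        # grid on [-3, 3]
circle = [2.0 * math.pi * (k + 0.5) / n for k in range(n)]   # grid on S^1

# (ii): F_t(u) = u**3 + t*u, with F_t'(u) = 3*u**2 + t.
perturbed = {t: count_extrema(lambda u, t=t: 3.0 * u * u + t, line)
             for t in (-1.0, 0.0, 1.0)}

# (150): F(u) = u**4 - 2*u**2 tends to +infinity as |u| grows.
m0_line, m1_line = count_extrema(lambda u: 4.0 * u**3 - 4.0 * u, line)

# (151): F(theta) = cos(2*theta) + 0.5*cos(theta) on the circle.
m0_circle, m1_circle = count_extrema(
    lambda th: -2.0 * math.sin(2.0 * th) - 0.5 * math.sin(th),
    circle, cyclic=True)

print(perturbed)
print(m0_line, m1_line, m1_line - m0_line)
print(m0_circle, m1_circle, m1_circle - m0_circle)
```

For t < 0 the family has one minimum and one maximum, for t > 0 it has none, and at t = 0 the sign-change count sees no extremum at the degenerate point u = 0. Note that u^3 + tu does not satisfy the growth condition of (150), so only the second and third examples are meant to illustrate (150) and (151).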
In order to describe generalizations, we first need some concepts
concerning quadratic forms.
37.27b. Quadratic Forms
Let $Q: X \times X \to \mathbb{R}$ be a symmetric bounded bilinear form, where X is a real
B-space. Then there exists a linear continuous operator $A: X \to X^*$ such
that Q(h,k) = (Ah,k) for all $h, k \in X$. By the null index of Q, we
understand dim N(A). The Morse index of Q is defined to be the maximal
dimension of all subspaces of X on which Q(h,h) < 0 for all $h \ne 0$.
Furthermore, Q is said to be nondegenerate (respectively, weakly
nondegenerate) if and only if A is bijective (respectively, injective).
Example 37.61. Let $A = (a_{ij})$ be a real symmetric $N \times N$ matrix. We set
$Q(h,k) = \sum_{i,j=1}^{N} a_{ij} h_i k_j$ for all $h, k \in \mathbb{R}^N$.
Then the Morse index (respectively, the null index) of Q is equal to the
number of negative eigenvalues of A taking their multiplicities into account
[respectively, equal to dim N(A)]. There exists a regular linear
transformation of coordinates $h = Th'$, $k = Tk'$ such that Q(h,k) passes into a sum of
squares, i.e.,
$Q(Th', Tk') = \sum_{i=1}^{N} \varepsilon_i h_i' k_i',$
where $\varepsilon_i = \pm 1, 0$. The Morse index (respectively, null index) of Q is equal to
the number of $\varepsilon_i$ with $\varepsilon_i = -1$ (respectively, $\varepsilon_i = 0$). Furthermore, Q is
nondegenerate if and only if $\det(a_{ij}) \ne 0$.
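A minimal Python sketch of Example 37.61 follows: it counts the negative and the zero eigenvalues of a symmetric matrix. The eigenvalue routine is a plain cyclic Jacobi iteration, chosen here only to keep the example self-contained; it is not a method named in the text, and the sample matrices, tolerances, and function names are assumptions.

```python
import math

def jacobi_eigenvalues(matrix, sweeps=50, tol=1e-24):
    """Eigenvalues of a real symmetric matrix via cyclic Jacobi rotations."""
    a = [list(map(float, row)) for row in matrix]
    n = len(a)
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(n) if i != j)
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Rotation angle that annihilates the entry a[p][q].
                if a[p][p] == a[q][q]:
                    theta = math.pi / 4.0
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / (a[q][q] - a[p][p]))
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # columns p and q of A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # rows p and q of J^T (A J)
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
    return sorted(a[i][i] for i in range(n))

def morse_and_null_index(matrix, eps=1e-9):
    """(Morse index, null index) of Q(h, k) = sum a_ij h_i k_j."""
    eigenvalues = jacobi_eigenvalues(matrix)
    morse = sum(1 for lam in eigenvalues if lam < -eps)
    null = sum(1 for lam in eigenvalues if abs(lam) <= eps)
    return morse, null

nondegenerate = [[2, 1, 0], [1, 2, 0], [0, 0, -3]]   # eigenvalues -3, 1, 3
degenerate = [[1, 1, 0], [1, 1, 0], [0, 0, -2]]      # eigenvalues -2, 0, 2
print(morse_and_null_index(nondegenerate))
print(morse_and_null_index(degenerate))
```

In the second sample matrix the determinant vanishes, so the null index is positive and Q is degenerate, in line with the criterion at the end of the example.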
Example 37.62. Let $Q: X \times X \to \mathbb{R}$ be a symmetric bounded bilinear form
defined on the real H-space X with $\dim X = \infty$. Suppose there exists a
strictly positive, compact, and symmetric bilinear form $b: X \times X \to \mathbb{R}$ such
that
$Q(h,h) \ge c\|h\|_X^2 - d \cdot b(h,h)$ for all $h \in X$
and fixed constants c, d > 0. Furthermore, we consider the eigenvalue
problem
$Q(h,k) = \mu\, b(h,k)$ for all $k \in X$. (152)
Then the Morse index (respectively, null index) of Q is equal to the number
of negative eigenvalues $\mu$ taking their multiplicities into account
(respectively, equal to the multiplicity of $\mu = 0$). Q is nondegenerate if and only if
$\mu = 0$ is not an eigenvalue in (152).
Proof. The quadratic form $e \overset{\text{def}}{=} Q + d \cdot b$ is strongly positive and symmetric.
Therefore, Q(h,k) = (Ah,k) and $A = E - d \cdot B$, where e(h,k) = (Eh,k),
b(h,k) = (Bh,k) for all $h, k \in X$. Here, E is strongly positive and
symmetric. B is compact and symmetric. Thus, A is a Fredholm operator of index
zero (see Section 22.7). Consequently, A is bijective if and only if dim N(A)
= 0, i.e., $\mu = 0$ is not an eigenvalue in (152). Furthermore, dim N(A) equals
the multiplicity of $\mu = 0$ in (152). If we write (152) in the form $e(h,k) = (\mu
+ d)\, b(h,k)$, then we can apply Proposition 22.31 and obtain the existence
of eigenvectors $u_1, u_2, \dots$ and eigenvalues $\mu_1 \le \mu_2 \le \cdots$ with $\mu_n \to +\infty$ as
$n \to \infty$ as well as $b(u_i, u_j) = \delta_{ij}$ for (152). For each $h \in X$,
$h = \sum_{i=1}^{\infty} b(h, u_i)\, u_i;$
therefore,
$Q(h,h) = \sum_{i=1}^{\infty} \mu_i\, b(h, u_i)^2.$
The assertion of the example follows easily from this. □
37.27c. Generalizations
The results of Section 37.27a can be generalized to a large extent to
functions $F: M \to \mathbb{R}$, where M is an open set in $\mathbb{R}^n$ or a finite-dimensional
manifold or an infinite-dimensional manifold. For local results it suffices to
know the generalizations to open sets M of $\mathbb{R}^N$ or of B-spaces, inasmuch as
manifolds behave locally as do those spaces. The global results, however, are
based on a detailed knowledge of the global topology of M. We give the
exact definition of the concept of a manifold in Chapter 43. The reader can
think of manifolds as sufficiently smooth curves and surfaces, on which local
coordinates can be introduced, which lie in $\mathbb{R}^N$ or in B-spaces. The essential
strategy of the theory of manifolds is that one calculates in terms of local
coordinates, but applies only those concepts that are independent of the
local coordinate system chosen. In Problem 44.12 we point out a number of
generalizations of the Morse theory to infinite-dimensional manifolds.
First, let $F: M \subseteq \mathbb{R}^N \to \mathbb{R}$ be a $C^\infty$-function defined on an open set M,
with $0 \in M$. The point $u_0 = 0$ is a critical point of F if and only if F'(0) = 0,
i.e., all the first partial derivatives of F vanish. Then the Taylor expansion
for F reads as follows:
$F(u) = F(0) + \tfrac{1}{2} F''(0)u^2 + o(\|u\|^2)$ as $u \to 0$, (153)
where
$F''(0)u^2 = \sum_{i,j=1}^{N} a_{ij} \xi_i \xi_j.$ (154)
The critical point $u_0 = 0$ is called nondegenerate if and only if the quadratic
form in (154) is nondegenerate. By means of a regular linear transformation
of coordinates, one can then attain the situation that $a_{ij} = \varepsilon_i \delta_{ij}$ holds, where
$\varepsilon_i = \pm 1$. By definition, the Morse index of $u_0 = 0$ is equal to the Morse
index of the quadratic form in (154), i.e., it is equal to the number of the $\varepsilon_i$
with $\varepsilon_i = -1$. In particular, if $u_0 = 0$ is a nondegenerate critical point in $\mathbb{R}^2$,
then, by means of a regular linear transformation of coordinates, one can
always attain the situation that
$F(u) = F(0) + \varepsilon_1 \xi_1^2 + \varepsilon_2 \xi_2^2 + o(\|u\|^2)$ as $u \to 0$ (155)
holds, where the following are special cases:
$\varepsilon_1 = \varepsilon_2 = 1$, local minimum, Morse index i = 0;
$\varepsilon_1 = \varepsilon_2 = -1$, local maximum, Morse index i = 2;
$\varepsilon_1 = 1$, $\varepsilon_2 = -1$, saddle point, Morse index i = 1.
Analogous to (i) in Section 37.27a, the Morse lemma asserts that by means
of a suitable coordinate transformation, one can attain the situation that the
term $o(\|u\|^2)$ in (153) and (155) vanishes. In Theorem 73.E in Section 73.12
we shall prove a more general result in B-spaces. An interesting application
of the Morse lemma pertains to asymptotic formulas for integrals of the
type
$\int a(y)\, e^{ik\varphi(y)}\, dy$
for large k (the method of stationary phase). Such formulas, which are of
great significance in geometrical optics, can be found in Guillemin and
Sternberg (1977, M), page 16. Here, a decisive role is played by the critical
points of $\varphi$.
We give a generalization of the Morse-Sard theorem (iii) in Section
37.27a to finite-dimensional and infinite-dimensional spaces in Problem
44.12. The generalization of assertion (iv) reads as follows: If $F: \mathbb{R}^N \to \mathbb{R}$ is
a Morse function, then $u \mapsto F(u) + (a|u)$ is also a Morse function for almost all
$a \in \mathbb{R}^N$ (cf. Guillemin and Pollack (1974), page 43).
The generalization of (v) in Section 37.27a concerning the structure of
level sets to infinite-dimensional Hilbert manifolds can be found in Schwartz
(1969, L), Propositions 4.67 and 4.87. Then, in Case 2, instead of intervals,
one has to adjoin balls whose dimension is connected with the index of the
associated critical points. Similar assertions are valid for the homotopy
equivalence of $M_{c+\varepsilon}$ and extensions of $M_{c-\varepsilon}$ by means of balls (cf. Milnor
(1963, M), page 14 for finite-dimensional manifolds and Skrypnik (1973,
M), Chapter 5, for infinite-dimensional Hilbert manifolds).
General Morse inequalities are given in Milnor (1963, M), page 30 and
Kahn (1980, M) (finite-dimensional manifolds) and in Schwartz (1969, L),
Proposition 4.89 (infinite-dimensional Hilbert manifolds). These inequalities
are of the type that the alternating sum of the $M_i$ (the number of
nondegenerate critical points of Morse index i) is estimated against the alternating
sum of topological invariants (Betti numbers).
Example 37.63 (Morse Inequalities on the Torus). As an illustration, we
consider a sufficiently smooth function $F: M \to \mathbb{R}$ on a closed, orientable,
and sufficiently smooth surface M in three-dimensional space. Each such
well-behaved surface is homeomorphic to a sphere with p handles (p =
0, 1, 2, ...). The number p is called the genus of M. Figure 37.28(b) shows
the case p = 1. This surface is homeomorphic to the torus in Fig. 37.28(a).
To each neighborhood of a point P belong local coordinates $(u_1, u_2)$. If F
has a nondegenerate critical point at P, then the Taylor expansion of F
coincides with (155) in a neighborhood of P. Now the crucial result reads as
follows: If F has only nondegenerate critical points on M, then
$M_0 \ge 1, \quad M_2 \ge 1, \quad -M_0 + M_1 - M_2 = 2p - 2$
($M_0$, $M_1$, and $M_2$ equal the number of minima, saddle points, and maxima,
respectively). In particular, $M_1 \ge 2p$; therefore, F has at least two saddle
points on the torus with p = 1.
If we consider, say, the function $P \mapsto z(P)$ that assigns to each point P of
the torus in Fig. 37.28(a) the corresponding z-value, then this function has a
minimum (respectively, a maximum) at m (respectively, M) and has two
saddle points at the two points S. In Milnor (1963, M), page 1, the intuitive
form of the level sets of this function and its homotopy types are discussed.
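The height function on the torus can be worked through explicitly. The Python sketch below assumes a torus of revolution whose axis is horizontal, parametrized by angles (θ, φ) with z = (R + r cos θ) sin φ and R > r > 0. This parametrization, the radii, and all names are assumptions made for illustration; they are not taken from Fig. 37.28. The gradient of z vanishes only where cos φ = 0 and sin θ = 0, which gives four critical points, and the Morse index is read off from the Hessian in the local coordinates.

```python
import math

R, r = 2.0, 1.0  # hypothetical radii with R > r > 0

def gradient(theta, phi):
    """First partial derivatives of z = (R + r cos(theta)) sin(phi)."""
    return (-r * math.sin(theta) * math.sin(phi),
            (R + r * math.cos(theta)) * math.cos(phi))

def hessian(theta, phi):
    """Second partial derivatives (z_tt, z_tp, z_pp) in the local coordinates."""
    z_tt = -r * math.cos(theta) * math.sin(phi)
    z_tp = -r * math.sin(theta) * math.cos(phi)
    z_pp = -(R + r * math.cos(theta)) * math.sin(phi)
    return z_tt, z_tp, z_pp

def morse_index(theta, phi):
    """Number of negative eigenvalues of the symmetric 2x2 Hessian."""
    z_tt, z_tp, z_pp = hessian(theta, phi)
    mean = 0.5 * (z_tt + z_pp)
    radius = math.hypot(0.5 * (z_tt - z_pp), z_tp)
    return sum(1 for lam in (mean - radius, mean + radius) if lam < 0.0)

# Because R > r, grad z = 0 forces cos(phi) = 0 and sin(theta) = 0.
critical_points = [(t, p) for p in (math.pi / 2, -math.pi / 2)
                   for t in (0.0, math.pi)]

counts = [0, 0, 0]
for t, p in critical_points:
    assert all(abs(g) < 1e-12 for g in gradient(t, p))
    counts[morse_index(t, p)] += 1

M0, M1, M2 = counts
print(M0, M1, M2, -M0 + M1 - M2)
```

With p = 1 the relation reads -M0 + M1 - M2 = 2p - 2 = 0, and the sketch finds one minimum, two saddle points, and one maximum, matching the description of the points m, S, and M.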
If F is a continuously differentiable real function on the torus, then F
has at least three critical points, provided we drop the assumption that all
critical points are nondegenerate. This result does not follow from Morse
theory but from the Ljusternik-Schnirelman theory (cf. Problem 44.13d).
We shall study easily formulated special cases of Morse inequalities in
H-spaces in Problem 44.12.
We treat additional important generalizations of Section 37.27a in Section
37.28.
[Figure 37.28: (a) torus with the points m, M, and S marked; (b) surface of genus p = 1]
37.27d. Index Theorem for Geodesics
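First, as a simple example we consider the unit sphere $S^2$ in $\mathbb{R}^3$ in order to
explain the intuitive meaning of the Morse index and of the null index for
geodesics. As usual, let $\varphi$ (respectively, $\vartheta$) be the coordinates of
geographical longitude (respectively, geographical latitude), with $0 \le \varphi < 2\pi$,
$0 \le \vartheta \le \pi$. The curve $\vartheta = \pi/2$ corresponds to the equator. We state the variational
problem
$F(\vartheta) \overset{\text{def}}{=} \int_0^a \sqrt{\vartheta'^2 + \sin^2\vartheta}\; d\varphi = \min!, \quad \vartheta(0) = \vartheta(a) = \frac{\pi}{2},$ (156)
with the corresponding Euler equation
$\frac{d}{d\varphi} \frac{\vartheta'}{\sqrt{\vartheta'^2 + \sin^2\vartheta}} = \frac{\sin\vartheta \cos\vartheta}{\sqrt{\vartheta'^2 + \sin^2\vartheta}}$ (157)
and the solution $\vartheta_0(\varphi) \equiv \pi/2$. Furthermore, we state the second variation
of F:
$\delta^2 F(\vartheta_0; \vartheta_1, \vartheta_2) = \int_0^a (\vartheta_1' \vartheta_2' - \vartheta_1 \vartheta_2)\, d\varphi$ for all $\vartheta_1, \vartheta_2 \in C_0^\infty(0, a)$, (158)
as well as the eigenvalue equation
$\delta^2 F(\vartheta_0; \vartheta_1, \vartheta_2) = \mu\, b(\vartheta_1, \vartheta_2)$ for all $\vartheta_2 \in C_0^\infty(0, a)$, (159)
where $\vartheta_1 \in \mathring{W}_2^1(0, a)$ and $b(\vartheta_1, \vartheta_2) \overset{\text{def}}{=} \int_0^a \vartheta_1 \vartheta_2\, d\varphi$, and the corresponding
classical eigenvalue equation
$-\vartheta_1'' - \vartheta_1 = \mu \vartheta_1, \quad \vartheta_1(0) = \vartheta_1(a) = 0.$ (160)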
(156) describes the problem of determining the shortest curve joining two
points $P_0$ and $P_1$ on the equator in the form $\vartheta = \vartheta(\varphi)$ (see Fig. 37.29). The
solution $\vartheta_0$ corresponds to the shortest arc of the equator between $P_0$ and
$P_1$. According to (4.16), (158) arises by setting $\psi(t, s) = F(\vartheta_0 + t\vartheta_1 + s\vartheta_2)$
and calculating the derivative $\psi_{ts}(0, 0)$. Then, (159) is the generalized problem
corresponding to (160). It arises by multiplying (160) by $\vartheta_2 \in C_0^\infty(0, a)$
and then integrating by parts. However, it follows from regularity
considerations that (159) and (160) are equivalent. We think of $\delta^2 F$ in (158) as a
bilinear form on $X \times X$, where $X = \mathring{W}_2^1(0, a)$. Let the Morse index
(respectively, the null index) of $\vartheta_0$ be by definition equal to the corresponding
index of $\delta^2 F$. According to Example 37.62, the null index (respectively, the
Morse index) is equal to the multiplicity of $\mu = 0$ (respectively, the sum of
the multiplicities of the negative eigenvalues). Moreover, the assertions in
Table 37.2 hold.

[Figure 37.29: the points $P_0$, $P_1$ on the equator of the unit sphere]

Table 37.2

a                    Null index of the     Are the initial point and    Morse index
                     geodesic $\vartheta_0$          end point conjugate?         of $\vartheta_0$
$0 < a < \pi$          0                     no                           0
$a = \pi$              1                     yes (multiplicity = 1)       0
$\pi < a < 2\pi$       0                     no                           1
This easily results by considering the eigenfunctions $\sin(n\pi\varphi/a)$ of (160).
The points $P_0$, $P_1$ are called mutually conjugate if and only if the null index
of the corresponding geodesic $\vartheta_0$ is not equal to zero (degenerate critical
point). By Table 37.2, this occurs for $a = \pi$; therefore, it occurs for the two
antipodal points in Fig. 37.29. That the point is degenerate finds its
geometric expression in the fact that several geodesics pass through these
two antipodal points. By definition, the multiplicity of conjugate points is
equal to the null index of the corresponding geodesic. We have thus
obtained the following in our special case:
The Morse index of a geodesic is equal to the number of points in the
interior of the geodesic that are conjugate to the initial point, where
the points are counted according to their multiplicity. This proposition
(the Morse index theorem) holds in general for Riemannian manifolds
(cf. Milnor (1963, M), page 83).
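The entries of Table 37.2 can be recomputed from the eigenvalues of (160). Inserting the eigenfunctions sin(nπφ/a) into (160) gives μ_n = (nπ/a)^2 - 1, n = 1, 2, ...; this closed form is derived here from the equation and is not quoted from the text. The Python sketch below, with hypothetical function names and sample values of a, counts negative and zero eigenvalues and compares the Morse index with the number of interior points φ = kπ at which the null index would be nonzero.

```python
import math

def eigenvalue(n, a):
    """mu_n = (n*pi/a)**2 - 1, belonging to the eigenfunction sin(n*pi*phi/a)."""
    return (n * math.pi / a) ** 2 - 1.0

def morse_and_null_index(a, eps=1e-12):
    """Count negative and zero eigenvalues; mu_n > 0 as soon as n > a/pi."""
    n_max = int(a / math.pi) + 1
    mus = [eigenvalue(n, a) for n in range(1, n_max + 1)]
    morse = sum(1 for mu in mus if mu < -eps)
    null = sum(1 for mu in mus if abs(mu) <= eps)
    return morse, null

def interior_conjugate_points(a):
    """Number of points phi = k*pi, k = 1, 2, ..., with 0 < phi < a."""
    k = 0
    while (k + 1) * math.pi < a:
        k += 1
    return k

# Sample end points, one for each row of Table 37.2 (values are assumptions).
table = {label: morse_and_null_index(a)
         for label, a in (("0 < a < pi", 2.0),
                          ("a = pi", math.pi),
                          ("pi < a < 2 pi", 5.0))}
for label, (morse, null) in table.items():
    print(label, "Morse index:", morse, "null index:", null)
```

For each sample value the Morse index agrees with the number of interior conjugate points, which is the statement of the index theorem in this special case.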
37.27e. Existence of Geodesics
Morse theory and Ljusternik-Schnirelman theory present the basic tools for
proving the existence of geodesics on Riemannian manifolds M and therefore,
in particular, on surfaces in $\mathbb{R}^3$. Here, the idea is the following: If $P_0$, $P_1$ are
two points on M, then we denote by $M(P_0, P_1)$ the space of all piecewise
continuous curves C which join $P_0$ and $P_1$ on M (see Fig. 37.29). Let L(C)
be the length of the curve C. By introducing a suitable metric, $M(P_0, P_1)$
becomes an infinite-dimensional metric space. Then the critical points of L
correspond to geodesics. We cite three important results:
(a) (Morse) On a Riemannian $C^\infty$-manifold that is homeomorphic to the
unit sphere in $\mathbb{R}^n$, $n \ge 3$, there exists an infinite number of joining geodesics
between the two points $P_0$, $P_1$ (cf. Seifert and Threlfall (1938, M), Section
20).
(b) (Ljusternik and Schnirelman) On a closed surface in $\mathbb{R}^3$ that is
$C^\infty$-diffeomorphic to the unit sphere in $\mathbb{R}^3$ there exist three closed geodesics
without self-intersections (cf. Klingenberg (1978, M), page 214). One can specify
ellipsoids that contain exactly three such closed geodesics.
(c) (Ljusternik and Fet) There exists a closed geodesic on each compact
Riemannian $C^\infty$-manifold (cf. Klingenberg (1978, M), page 207).
Here, by a geodesic we understand not only a shortest joining curve, but
all solutions of the Euler differential equation for the shortest curve
variational problem. In Fig. 37.29, e.g., every curve which winds around the
equator several times is also a geodesic. (a) is to be understood in this sense.
The Morse index theorem in Section 37.27d holds in general for such
geodesics.
While (a) follows from Morse theory, (b) and (c) are obtained from the
Ljusternik-Schnirelman theory. (a) is based on the fact that the space
$M(P_0, P_1)$ has an infinitely high connectivity.
37.27f. Comparison of the Morse Theory and the
Ljusternik-Schnirelman Theory
In contrast to Morse theory, the Ljusternik-Schnirelman theory has the
advantage that the estimates for the number of critical points are obtained
without making any assumptions whatsoever on the nondegeneracy or the
isolation of the critical points. To begin with, the Morse theory is connected
with nondegenerate critical points; but it can also be applied to degenerate
critical points with the aid of type numbers (cf. Seifert and Threlfall (1938,
M) and Berger (1977, M) as an introduction). However, in contrast to the
Ljusternik-Schnirelman theory on infinite-dimensional manifolds in the
case of degenerate critical points, the Morse estimates for the type numbers
yield no especially sharp estimates for the number of critical points. In the
case of geodesics, however, the estimates are sufficient to verify the existence
of infinitely many geodesics (cf. Seifert and Threlfall (1938, M)).
References to the Literature
Classical works: Morse (1925), (1934, M).
Introduction: Seifert and Threlfall (1938, M); Milnor (1963, M); Kahn
(1980, M); Bott (1982, S). The topological tools needed are available in an
elementary setting in the above-cited monograph by Seifert and Threlfall.
Conjugate points, the calculus of variations and sufficient conditions for
extrema: Morse (1934, M), (1972, M).
Critical points and global analysis: Morse and Cairns (1969, M); Smale
(1977, S); Kahn (1980, M); Bott (1982, S).
Morse theory on infinite-dimensional manifolds: Berger (1977, M)
(introductory); Palais (1963), Palais and Smale (1964); Schwartz (1969, L); Rothe
(1973); Skrypnik (1973, M); Tromba (1977), (1977a); Klingenberg (1978,
M); Marsden (1981, L).
Applications to minimal surfaces: Tromba (1977), (1977a), (1977b), (1980).
Application to geodesics: Morse (1934, M); Seifert and Threlfall (1938,
M); Milnor (1963, M); Schwartz (1969, L); Klingenberg (1978, M).
Application to nonlinear elliptic differential equations: Skrypnik (1973,
M); Berger (1977, M).
Application to asymptotic integral formulas in geometrical optics:
Guillemin and Sternberg (1977, M).
Application to homotopy theory: Milnor (1963, M) (Freudenthal's
suspension theorem, Bott's periodicity theorem as the basis for the
Atiyah-Singer index theory for elliptic differential equations on manifolds).
Application to differential topology: Hirsch (1976, M) (classification of
closed surfaces in R3).
Degenerate critical points and their type numbers: Morse (1934, M);
Seifert and Threlfall (1938, M); Berger (1977, M).
Generalized Morse index of Conley and dynamic systems: Conley (1978,
M); Amann and Zehnder (1980) (application to differential equations);
Smoller (1983, M) (shock waves).
Infinite-dimensional Morse-Sard theorem: Fučík, Nečas, and Souček
(1973, L); Berger (1977, M); Tromba (1977b).
37.28. Singularities and Catastrophe Theory
This section supplements and generalizes parts of the Morse theory
discussed in the preceding section. The reader who wishes to become acquainted
as rapidly as possible with the fundamental ideas of catastrophe theory can
immediately begin with Section 37.28f after studying Section 37.28a. We
pursue the same aims as those described in the introduction to Section
37.27.
The purely calculational aspect of the theory which is important for
applications is explained briefly in Section 37.28k. In Chapter 73 we shall
study the following questions in a more general setting.
37.28a. Singularities
Let $F: U \subseteq \mathbb{R}^N \to \mathbb{R}^M$ be a $C^\infty$-mapping defined on the open set U in $\mathbb{R}^N$. By
definition, F has a singularity or a critical point at $u_0$ if and only if the
linearization $F'(u_0): \mathbb{R}^N \to \mathbb{R}^M$ is not surjective. This is, e.g., always the case
for N < M. The number c is called a critical value of F if and only if there
exists a critical point $u_0$ such that $F(u_0) = c$.
This is obviously a generalization of the
corresponding concepts of Morse theory for functions. F is called a submersion
(respectively, an immersion) at u if and only if $F'(u): \mathbb{R}^N \to \mathbb{R}^M$ is surjective
(respectively, injective). If this property holds for all $u \in D(F)$, then we
speak simply of a submersion (respectively, immersion).
The Morse-Sard theorem asserts that critical values are rare. To be
precise, the following holds: The set of critical values of a $C^\infty$-function
$F: \mathbb{R}^N \to \mathbb{R}^M$ has measure zero in $\mathbb{R}^M$.
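Example 37.64. The mapping $F: \mathbb{R}^2 \to \mathbb{R}^2$, where $F(\xi, \eta) = (\xi, \eta^2)$, has a
critical point at (0,0) because
$F'(u) = \begin{pmatrix} 1 & 0 \\ 0 & 2\eta \end{pmatrix}, \quad u = (\xi, \eta),$
and det F'(0) = 0.
Furthermore, $F: \mathbb{R}^2 \to \mathbb{R}^2$, where $F(\xi, \eta) = (\xi, \eta^3 - \xi\eta)$, has a critical
point at (0,0).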
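For mappings of the plane into itself, the linearization fails to be surjective exactly where its determinant vanishes, so the singular points of the two mappings can be found from det F'. The Python sketch below assumes that the second mapping reads (ξ, η^3 - ξη); the function names, including the customary labels "fold" and "cusp", are not used in the text and serve only as identifiers here.

```python
def fold_jacobian_det(xi, eta):
    """det F'(xi, eta) for F(xi, eta) = (xi, eta**2)."""
    # F' = [[1, 0], [0, 2*eta]]
    return 1.0 * (2.0 * eta) - 0.0 * 0.0

def cusp_jacobian_det(xi, eta):
    """det F'(xi, eta) for F(xi, eta) = (xi, eta**3 - xi*eta) (assumed form)."""
    # F' = [[1, 0], [-eta, 3*eta**2 - xi]]
    return 1.0 * (3.0 * eta**2 - xi) - 0.0 * (-eta)

def is_singular(det, tol=1e-12):
    """A point is critical when the 2x2 linearization is not surjective."""
    return abs(det) <= tol

print(is_singular(fold_jacobian_det(0.0, 0.0)))    # origin, first mapping
print(is_singular(cusp_jacobian_det(0.0, 0.0)))    # origin, second mapping
print(is_singular(cusp_jacobian_det(0.75, 0.5)))   # a point on xi = 3*eta**2
print(is_singular(fold_jacobian_det(0.3, 0.1)))    # a regular point
```

Under these assumptions the singular set of the first mapping is the line η = 0 and that of the second is the parabola ξ = 3η^2; both pass through the origin.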
Example 37.65. The function $F: \mathbb{R}^2 \to \mathbb{R}$, where $F(\xi, \eta)$ is equal to one of
the following expressions: $\xi^2 + \eta^2$, $\xi^2 - \eta^2$, $-\xi^2 - \eta^2$, has a critical point at
(0,0).
Roughly speaking, Morse theory asserts that in most cases the critical
points of functions $F: \mathbb{R}^2 \to \mathbb{R}$ have the structure of Example 37.65, i.e., one
can usually attain these normal forms by means of a transformation of
coordinates, which is a local diffeomorphism, and the addition of a
constant. A fundamental result due to Whitney (1955) asserts that in most cases
the mappings $F: \mathbb{R}^2 \to \mathbb{R}^2$ have only the two singularities given in Example
37.64, i.e., usually one can attain one of these two normal forms by means
of transformations of coordinates of the dependent and independent
variables, which are local diffeomorphisms.
In singularity theory one attempts to classify the possible singularities by
producing normal forms. In this, one restricts oneself to such singularities
which, roughly speaking, occur in most cases and are stable with respect to
perturbations. As examples, we shall give the exact formulation of the
Whitney classification theorem and the Thom classification of elementary
catastrophes, where the latter case entails stable deformations of
singularities.
The significance of singularity theory for the natural sciences is that on
the basis of many examples one is convinced that essential phenomena in
nature are frequently connected with stable singularities. Thus, knowledge
of normal forms affords a survey of the possible wealth of structures in
nature, and one obtains hints for the mathematical modelling of natural-scientific
phenomena. However, despite the employment of deep-lying tools,
until now, we have succeeded in finding such normal forms only in simple
cases, and we know natural situations, i.e., $k$-parametric deformations,
$k > 6$, for which an infinite number of normal forms already exists. Besides,
the simple example of the gravitational potential of the sun, which has a pole at
the midpoint of the sun, already shows that the classification of singularities
of smooth mappings cannot be sufficient. The singularities in elementary
particle theory behave considerably worse still.
37.28b. Transversality
To a large extent transversality theory generalizes the following elementary
fact: For two smooth curves in the plane, there are exactly three
possibilities at a point $u$:
(a) The curves contact one another (see Fig. 37.30(a)).
(b) The curves intersect each other transversally (see Fig. 37.30(b)).
(c) The curves do not intersect (see Fig. 37.30(c)).
Two curves are said to be transversal at $u$ if and only if (b) or (c) holds.
The following observation is crucial: (b) and (c) are stable relative to small
perturbations. On the other hand, in (a) the smallest perturbations suffice in
order to attain (b) or (c) (see Fig. 37.30(d)). Therefore, it is intuitively
evident that transversality occurs in most cases. The transversality theorem
of René Thom that we shall formulate in Section 37.28c makes this precise
in more general form.
First we define the concept of transversality which is of central
significance in modern differential topology.
(a) If $X$ and $Y$ are two $C^1$-manifolds in $\mathbb{R}^N$, then they are said to be
transversal at the point $u$ with respect to $\mathbb{R}^N$ if and only if one of the
following two cases occurs:
Case 1: $u \notin X \cap Y$, i.e., $X$ and $Y$ do not intersect at $u$.
Case 2: $u \in X \cap Y$, and the two tangent spaces at $u$ span $\mathbb{R}^N$, i.e.,
$$TX_u + TY_u = \mathbb{R}^N.$$
Figure 37.30. Panels (a)-(d).
37. Introductory Typical Examples
For example, one has transversality with respect to $\mathbb{R}^2$ (respectively, $\mathbb{R}^3$)
in Fig. 37.30(b) (respectively, Fig. 37.31). If two curves in $\mathbb{R}^3$ intersect, then
one never has transversality at the intersection point, because the tangent
spaces are one-dimensional and thus cannot span $\mathbb{R}^3$. We explain $TX_u$ in
Definition 43.8. Intuitively, the tangent space $TX_u$ to a curve (respectively,
to a surface) is obtained by translating the tangent line
(respectively, the tangent plane) to the origin.
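The spanning condition in Case 2 can be checked numerically once tangent vectors at the intersection point are known. The following is a minimal sketch under that assumption; the helper names `rank` and `transversal_at_intersection` are hypothetical and not taken from the text.

```python
def rank(vectors, tol=1e-12):
    """Rank of a list of equal-length vectors, by Gaussian elimination."""
    rows = [[float(x) for x in v] for v in vectors]
    if not rows:
        return 0
    r = 0
    for col in range(len(rows[0])):
        # Choose the row (from position r downward) with the largest entry in this column.
        pivot = max(range(r, len(rows)), key=lambda i: abs(rows[i][col]))
        if abs(rows[pivot][col]) <= tol:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r


def transversal_at_intersection(tangents_x, tangents_y, n):
    """Case 2 of the definition: the two tangent spaces together span R^n.

    tangents_x, tangents_y: spanning vectors of the tangent spaces of X and Y
    at a common point (assumed to be supplied by the caller).
    """
    return rank(list(tangents_x) + list(tangents_y)) == n
```

For instance, two lines in the plane with tangent vectors (1, 1) and (1, 0) pass the test, two curves with the common tangent (1, 0) do not (contact), and two intersecting curves in three-dimensional space never do, in agreement with the remark above.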
(b) If $F\colon \mathbb{R}^N \to \mathbb{R}^M$ is a $C^1$-mapping and $Y$ is a $C^1$-manifold in the image
space $\mathbb{R}^M$, then $F$ is said to be transversal to $Y$ at $u$ with respect to $\mathbb{R}^M$ if
and only if one of the following two cases occurs:
Case 1: $F(u) \notin Y$.
Case 2: $F(u) \in Y$, and for the linearization $F'(u)$ we have
$$R(F'(u)) + TY_{F(u)} = \mathbb{R}^M.$$
If $\mathbb{R}^N$ is replaced by a $C^1$-manifold $X$ in $\mathbb{R}^N$, then for the linearization one
has to choose the tangential mapping $TF(u)$ instead of $F'(u)$ (cf. Definition
43.18). We speak of transversality when it holds at each point.
First we treat two typical examples which will show how already known
nondegeneracies are to be conceived in a unified manner with the aid of the
concept of transversality.
Example 37.66 (Nondegenerate Zero). Let $F\colon \mathbb{R} \to \mathbb{R}$ be a $C^\infty$-function
such that $F(u_0) = 0$. The zero $u_0$ is said to be nondegenerate if and only if $F$
is a submersion at $u_0$, i.e., $F'(u_0) \neq 0$. This can also be expressed as follows:
$F$ is transversal to $\{0\}$ at $u_0$ with respect to the image space $\mathbb{R}$.
Figure 37.31
Figure 37.32
We shall now use the concept of transversality to formulate the situation
that $F$ has only nondegenerate zero points. To this end, we introduce the
$k$-jet coordinates $J^kF(u) = (u, F(u), F'(u), \ldots, F^{(k)}(u))$ and the $k$-jet space
$$J^k(\mathbb{R},\mathbb{R}) \overset{\text{def}}{=} \mathbb{R}^{k+2}.$$
Then
$$J^kF\colon \mathbb{R} \to J^k(\mathbb{R},\mathbb{R})$$
is a mapping belonging to $F\colon \mathbb{R} \to \mathbb{R}$. Linearization yields
$$(J^kF)'(u)h = (h, F'(u)h, \ldots, F^{(k+1)}(u)h). \tag{161}$$
Therefore, the following holds:
(A) $F\colon \mathbb{R} \to \mathbb{R}$ has only nondegenerate zeros if and only if $J^0F$ is
transversal to the straight line $X = \{(\xi,0) \in \mathbb{R}^2 : \xi \in \mathbb{R}\}$ relative to $\mathbb{R}^2$ (see
Fig. 37.32). Intuitively, $J^0F$ is the graph of $F$ in $\mathbb{R}^2$.
Furthermore, Fig. 37.33 shows that a degenerate zero can be changed into
a nondegenerate zero by means of a small perturbation. We have already
essentially used this idea in Chapter 12 in the construction of the fixed point
index. We will give a precise formulation of this perturbation proposition in
Section 37.28c, (ii).
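A simple illustration of this perturbation idea, chosen here as an example and not taken from the text, is the function $F(u) = u^2$ with its degenerate zero at $u = 0$:

```latex
F_\varepsilon(u) = u^2 - \varepsilon, \qquad
F_\varepsilon(u) = 0 \iff u = \pm\sqrt{\varepsilon} \quad (\varepsilon > 0), \qquad
F_\varepsilon'(\pm\sqrt{\varepsilon}) = \pm 2\sqrt{\varepsilon} \neq 0 .
```

For every $\varepsilon > 0$ the perturbed function has two nondegenerate zeros, and for $\varepsilon < 0$ it has no zeros at all, so in both cases the graph of $F_\varepsilon$ is transversal to the line $X$ from (A).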
Example 37.67 (Nondegenerate Critical Point). The $C^\infty$-function $F\colon \mathbb{R} \to \mathbb{R}$
possesses a critical point at $u_0$ if and only if $F'(u_0) = 0$. The critical point is
called nondegenerate if and only if $F''(u_0) \neq 0$. According to (161), this is
equivalent to stating that $J^1F$ is transversal at $u_0$ to the plane
$X = \{(\xi,\eta,0) \in \mathbb{R}^3 : (\xi,\eta) \in \mathbb{R}^2\}$ relative to $J^1(\mathbb{R},\mathbb{R}) = \mathbb{R}^3$.
(B) $F$ possesses only nondegenerate critical points, i.e., $F$ is a Morse
function, if and only if $J^1F$ is transversal to $X$ with respect to $J^1(\mathbb{R},\mathbb{R})$.
Figure 37.33
Exercise
Verify explicitly that (A) and (B) result from (161) and the definition of
transversality, and visualize (B) by means of a surface in $\mathbb{R}^3$ that corresponds to $J^1F$.
In order to explain the connection between $J^kF$ and the Taylor expansion
of $F$, we denote by $j^k_uF$ the function that results from the Taylor expansion
of $F$ at the point $u$ if one takes into account only the terms up to and
including the $k$th order; therefore,
$$j^k_uF(v) = F(u) + F'(u)v + \cdots + \frac{F^{(k)}(u)v^k}{k!}.$$
Then $J^kF$ results from $u$ and the expansion coefficients without taking the
corresponding factorials into account.
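As a concrete illustration (the function is chosen here for the example and does not appear in the text), take $F(u) = \sin u$, the point $u = 0$, and $k = 3$:

```latex
J^3F(0) = \bigl(0,\ \sin 0,\ \cos 0,\ -\sin 0,\ -\cos 0\bigr) = (0,\ 0,\ 1,\ 0,\ -1), \qquad
j^3_0F(v) = v - \frac{v^3}{3!} = v - \frac{v^3}{6}.
```

The third-order jet coordinate is $F'''(0) = -1$, whereas the corresponding Taylor coefficient is $-1/6$; the two differ exactly by the factorial $3!$.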
As the first important application of the transversality concept, we
consider the equation
$$F(u) = y. \tag{162}$$
Let $F\colon \mathbb{R}^N \to \mathbb{R}^M$ be a $C^\infty$-mapping. We ask: when do the
solutions $u$ of (162) form a $C^\infty$-manifold as $y$ ranges over a
$C^\infty$-manifold $Y$ in $\mathbb{R}^M$? The answer is: $F^{-1}(Y)$ is a $C^\infty$-manifold when $F$ is
transversal to $Y$ with respect to $\mathbb{R}^M$ (cf. Theorem 73.F in Section 73.13).
For example, if $Y$ consists of only one point $y$, then $TY_y = \{0\}$ holds, and
$F^{-1}(y)$ is a $C^\infty$-manifold when $R(F'(u)) = \mathbb{R}^M$ for all $u \in \mathbb{R}^N$, i.e., $F$ is a
submersion. In order to retain this proposition in B-spaces, the concept of
submersion must be modified (cf. Section 43.6).
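A standard illustration, supplied here as an example, is the function $F(\xi,\eta) = \xi^2 + \eta^2$ on $\mathbb{R}^2$ with $M = 1$:

```latex
F'(u) = (2\xi,\ 2\eta), \qquad
R(F'(u)) = \mathbb{R} \iff u \neq 0, \qquad
F^{-1}(1) = \{(\xi,\eta) : \xi^2 + \eta^2 = 1\}.
```

Every point of $F^{-1}(1)$ is different from the origin, so $F$ is transversal to $Y = \{1\}$ and the solution set is a $C^\infty$-manifold, namely the unit circle. For $y = 0$ the only solution is $u = 0$, where $F'(0) = 0$; transversality to $\{0\}$ fails there and the criterion gives no information.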
37.28c. Generic Properties
By such a property we shall heuristically understand that it is one that
occurs in the majority of cases and is stable relative to perturbations.
The precise definition reads as follows: A property of a $C^\infty$-mapping
$F\colon \mathbb{R}^N \to \mathbb{R}^M$ is said to be generic if and only if there exists a set $A$ that is
open and dense in $C^\infty(\mathbb{R}^N,\mathbb{R}^M)$ such that all mappings from $A$ have this
property. In this connection, $C^\infty(\mathbb{R}^N,\mathbb{R}^M)$ is provided with the so-called
$C^\infty$-Whitney topology (cf. $A_1$(69) and Golubitsky and Guillemin (1973, M),
Chapter 2, Section 3).
As important examples, we assert that the following properties are generic
for $C^\infty$-mappings $F$:
(i) $F\colon \mathbb{R}^N \to \mathbb{R}$ possesses only nondegenerate critical points, i.e., $F$ is a
Morse function.
(ii) $F\colon \mathbb{R}^N \to \mathbb{R}^M$ possesses only nondegenerate zeros.
(iii) $F\colon \mathbb{R}^N \to \mathbb{R}^M$ is an immersion when $M \ge 2N$.
(iv) $F\colon \mathbb{R}^N \to \mathbb{R}^M$ is transversal to a fixed closed $C^\infty$-manifold in $\mathbb{R}^M$.
(v) For fixed $k$, the $k$-jet mapping $J^kF$ to $F\colon \mathbb{R}^N \to \mathbb{R}^M$ is transversal to a
fixed closed $C^\infty$-manifold in $J^k(\mathbb{R}^N,\mathbb{R}^M)$.
The genericity of (iv) and (v) is a special case of the general transversality
theorem of R. Thom. (i)-(iii) follow from (iv) and (v). In (v), $J^kF(u)$ means
the tuple $(u, F(u), D^\alpha F(u))$, where $D^\alpha$ ranges over all the partial derivatives
of $F$ up to and including the order $k$. Suppose the number of components of
this tuple is $K$. We set
$$J^k(\mathbb{R}^N,\mathbb{R}^M) = \mathbb{R}^K.$$
Then $J^kF$ is a mapping of $\mathbb{R}^N$ into the $k$-jet space $J^k(\mathbb{R}^N,\mathbb{R}^M)$. In
connection with the Whitney theorem (iii), we give two further basic results
due to Whitney:
(a) The set of injective immersions $F\colon \mathbb{R}^N \to \mathbb{R}^M$ is dense in $C^\infty(\mathbb{R}^N,\mathbb{R}^M)$
when $M \ge 2N + 1$.
(b) Every $N$-dimensional manifold $X$, i.e., every $C^\infty$-manifold with a
countable basis that is modelled over $\mathbb{R}^N$, can be embedded in $\mathbb{R}^{2N+1}$, i.e.,
there exists a $C^\infty$-immersion $f\colon X \to \mathbb{R}^{2N+1}$ which is simultaneously a
homeomorphism onto $f(X)$ (Whitney's embedding theorem).
The proofs of all these properties can be found in Golubitsky and
Guillemin (1973, M), Chapter 2.
37.28d. Equivalence
By the concept of equivalent mappings, we wish to think heuristically of
mappings which, by a well-behaved change of the dependent and
independent variables, go from one into the other and thus, roughly speaking,
possess the same structure.
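Two $C^\infty$-mappings $F, G\colon \mathbb{R}^N \to \mathbb{R}^M$ are said to be equivalent if and only if
there exist mappings $\varphi$, $\psi$ such that the following diagram commutes:
$$\begin{array}{ccc}
\mathbb{R}^N & \xrightarrow{\;F\;} & \mathbb{R}^M \\
{\scriptstyle\varphi}\downarrow & & \uparrow{\scriptstyle\psi} \\
\mathbb{R}^N & \xrightarrow{\;G\;} & \mathbb{R}^M
\end{array}$$
i.e., $F = \psi \circ G \circ \varphi$, where $\varphi$ and $\psi$ are $C^\infty$-diffeomorphisms.
If $\varphi$ and $\psi$ are only local diffeomorphisms with
$v_0 = \varphi(u_0)$ and $F(u_0) = \psi(G(v_0))$, then $F$ and $G$ are said to be locally
equivalent at $u_0$ and $v_0$, respectively.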
Example 37.68. For $C^\infty$-functions $F\colon \mathbb{R} \to \mathbb{R}$, the following two assertions
hold:
(i) $u \mapsto F(u)$ and $u \mapsto u$ are locally equivalent at $u_0$ when $F'(u_0) \neq 0$.
(ii) $u \mapsto F(u)$ and $u \mapsto u^2$ are locally equivalent at $u_0$ and 0, respectively,
when $F$ has a nondegenerate critical point at $u_0$.
Exercise
Prove this and give examples of concrete functions which are mutually equivalent.
For example, $u \mapsto u$ and $u \mapsto \sinh u$ are mutually equivalent on $\mathbb{R}$, but $u \mapsto u$ and
$u \mapsto u^3$ are not, because the function inverse to $u \mapsto u^3$ is not a $C^\infty$-function. (i)
results from considering inverse functions, whereas (ii) follows from Morse
theory.
Example 37.69 (Normal Form of Submersions and Immersions). Let
$F\colon U(u_0) \subset \mathbb{R}^N \to \mathbb{R}^M$ be a $C^\infty$-mapping on an open neighborhood of $u_0$. Then
the following holds:
If $F$ is a submersion (respectively, an immersion) at $u_0$, then, at $u_0$, $F$ is
locally equivalent to $G\colon U(0) \subset \mathbb{R}^N \to \mathbb{R}^M$ at 0, where
$$G(\xi_1,\ldots,\xi_N) \overset{\text{def}}{=} (\xi_1,\ldots,\xi_M), \qquad M \le N,$$
(respectively,
$$G(\xi_1,\ldots,\xi_N) \overset{\text{def}}{=} (\xi_1,\ldots,\xi_N,0,\ldots,0), \qquad N \le M).$$
The proof follows directly from the rank theorem in Problem 4.4.
Example 37.70 (Whitney's Classification Theorem). There exists an open
and dense subset $A$ of $C^\infty(\mathbb{R}^2,\mathbb{R}^2)$ such that each $F$ in $A$ is locally
equivalent at each point $u$ to one of the following mappings at zero:
$$(\xi,\eta) \mapsto (\xi,\eta), \quad (\xi,\eta^2), \quad (\xi,\eta^3 - \xi\eta)$$
(cf. Bröcker and Lander (1975, M), Chapter 8). Generalizations can be
found in Golubitsky and Guillemin (1973, M), Chapter 7, Section 4 (Morin
singularities).
37.28e. Structural Stability
With this concept we associate the heuristic picture of functions which
preserve their essential structure under perturbations.
A function $F \in C^\infty(\mathbb{R}^N,\mathbb{R}^M)$ is said to be structurally stable if and only if
there is a neighborhood $U(F)$ of $F$ in $C^\infty(\mathbb{R}^N,\mathbb{R}^M)$ in which each $G \in U(F)$
is equivalent to $F$.
The structural stability of $C^\infty$-mappings $F\colon X \to Y$ is explained in an
analogous way in the case where $X$, $Y$ are finite-dimensional $C^\infty$-manifolds.
Example 37.71. If $X$ is a compact $C^\infty$-manifold in $\mathbb{R}^n$, then the
$C^\infty$-function $F\colon X \to \mathbb{R}$ is structurally stable if and only if $F$ is a Morse function that
takes on distinct values at distinct critical points.
$F\colon X \to \mathbb{R}^M$ is structurally stable when $F$ is a submersion or an injective
immersion (cf. Golubitsky and Guillemin (1973, M), Chapter 3).
37.28f. The First Elementary Catastrophe
We consider the function $F(u) = u^3$. Then $F$ has a degenerate critical point
at $u = 0$ which is not stable, for one can consider the family $F_t(u) = u^3 + tu$
that depends on the parameter $t$ and has the behavior shown in Fig. 37.34.
For $t \neq 0$, the critical point of $F$ at 0 vanishes.
However, the following question is crucial for the so-called elementary
catastrophe theory: Is the family $\{F_t\}$ which describes the deformation of
the singularity above stable? The answer, which we shall make precise in
Section 37.28h, is as follows: In principle, there is only one stable
deformation of $u \mapsto u^3$ in a neighborhood of zero and it is given by $(u,t) \mapsto u^3 + tu$.
This deformation, or unfolding, is called the first elementary catastrophe.
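The behavior of the family can be made concrete by listing the critical points of $u \mapsto u^3 + tu$ for a given $t$, i.e., the real roots of $3u^2 + t = 0$. The following is a minimal sketch; the function names `fold_critical_points` and `classify` are hypothetical.

```python
import math


def fold_critical_points(t):
    """Real critical points of F_t(u) = u**3 + t*u (real roots of 3*u**2 + t = 0)."""
    if t > 0:
        return []            # F_t is strictly increasing: no critical point
    if t == 0:
        return [0.0]         # the degenerate critical point of u**3
    r = math.sqrt(-t / 3.0)
    return [-r, r]           # two nondegenerate critical points


def classify(u):
    """Classify a critical point of F_t by the sign of F_t''(u) = 6*u."""
    second = 6.0 * u
    if second > 0:
        return "local minimum"
    if second < 0:
        return "local maximum"
    return "degenerate"
```

For $t > 0$ the list is empty, for $t = 0$ it contains only the degenerate point 0, and for $t < 0$ a local maximum and a local minimum appear. In each case with $t \neq 0$ the degenerate critical point at 0 has disappeared.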
Example 37.72 (Isotherms of a van der Waals Gas). We consider a gas in a
container of volume $V$ and with a temperature $T$. Suppose the pressure $p$
acts on the gas (see Fig. 37.35). We choose the state equation to be the van
der Waals equation
$$p = RT(V - b)^{-1} - aV^{-2},$$
where $a$, $b$, and $R$ are positive constants. The isotherms, i.e., the curves
$T$ = constant, have the form shown in Fig. 37.36(a). For a critical
temperature $T_{\text{crit}}$, the isotherm has exactly one critical point. At this point,
liquefaction (condensation) occurs. Figure 37.36(a) shows the deformation, or
unfolding, of this singularity, and this deformation has the structure of the
first elementary catastrophe. The isotherms located above the critical
isotherm ($T > T_{\text{crit}}$) have no critical points. They describe the gaseous phase.
The isotherms located below the critical isotherm ($T < T_{\text{crit}}$) have a local
minimum and a local maximum. They contain an unstable region with
$dp/dV > 0$, which in principle is not realizable experimentally. In fact, one
must correct the isotherms by a straight line $AB$ as in Fig. 37.36(b). Along
this straight line the (vapor) pressure remains constant. Here, the gas and
liquid are in thermodynamic equilibrium. With the aid of the equilibrium
condition for free enthalpy, one can show that the straight line must be such
that the hatched regions above and below the straight line $AB$ in Fig.
37.36(b) have the same area. On the isotherm, only gas (respectively, only
liquid) occurs to the right of $B$ (respectively, to the left of $A$). A detailed
physical discussion can be found in Sommerfeld (1962, M), Vol. V, Section
10.
Figure 37.34. The family $F_t(u) = u^3 + tu$.
Figure 37.35
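The critical isotherm can be located explicitly from the van der Waals equation. The sketch below uses the standard closed-form critical values $V = 3b$, $T = 8a/(27Rb)$, $p = a/(27b^2)$, which are not stated in the text; the function names are hypothetical, and the constants are passed in by the caller.

```python
def pressure(V, T, a, b, R):
    """van der Waals pressure p = R*T/(V - b) - a/V**2 (requires V > b)."""
    return R * T / (V - b) - a / V**2


def dp_dV(V, T, a, b, R):
    """Slope of the isotherm T = constant in the (V, p)-plane."""
    return -R * T / (V - b) ** 2 + 2.0 * a / V**3


def critical_point(a, b, R):
    """Volume, temperature and pressure at the critical point."""
    return 3.0 * b, 8.0 * a / (27.0 * R * b), a / (27.0 * b**2)


def has_unstable_region(T, a, b, R):
    """True if the isotherm at temperature T has a part with dp/dV > 0.

    dp/dV > 0 means R*T < 2*a*(V - b)**2 / V**3, and the right-hand side
    is largest at V = 3*b, so it suffices to test the slope there.
    """
    return dp_dV(3.0 * b, T, a, b, R) > 0.0
```

With $a = b = R = 1$, for example, the critical temperature is $8/27$; isotherms below it have an unstable part and isotherms above it do not.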
Figure 37.36. (a) Isotherms $T$ = constant, including the critical isotherm; (b) corrected isotherm $T$ = constant with the regions liquid, gas and liquid, and gas.
Figure 37.37. The curve $C$ in the $(s,t)$-plane.
37.28g. The Second Elementary Catastrophe or the Cusp
Catastrophe
We consider the function $F(u) = u^4$ which has a degenerate critical point at
0. Parallel to the situation in the preceding section, to this function there
corresponds a stable deformation in a neighborhood of zero given by the
two-parameter family
$$F_{s,t}(u) = u^4 - su^2 + tu.$$
This deformation, which we also call the second elementary catastrophe or
the cusp catastrophe, occurs frequently. One reason for this is that the
function $u \mapsto u^4$ corresponds to the simplest form of a degenerate minimum.
We treat an application in Section 37.28j.
In order to acquire an intuitive picture of $F_{s,t}$, we determine the critical
points of $F_{s,t}$ for constant $(s,t)$ by $F'_{s,t}(u) = 0$, i.e.,
$$4u^3 - 2su + t = 0. \tag{163}$$
Multiple solutions of this equation occur for $F''_{s,t}(u) = 0$, i.e., $12u^2 - 2s = 0$;
therefore, $8s^3 = 27t^2$. This curve $C$ in the $(s,t)$-plane is shown in Fig. 37.37.
$C$ splits the $(s,t)$-plane into three parts in which the function $u \mapsto F_{s,t}(u)$
has 3, 2, and 1 critical points (see Fig. 37.37). If we keep $s$ fixed and
consider the changes in the family of functions $u \mapsto F_{s,t}(u)$ relative to the
parameter $t$, then we obtain the situations pictured in Fig. 37.38.
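The count of critical points can be read off from the sign of $8s^3 - 27t^2$, which is, up to the positive factor 16, the discriminant of the cubic in (163). The following is a minimal sketch; the function name `count_critical_points` is hypothetical.

```python
def count_critical_points(s, t):
    """Number of distinct real roots of 4*u**3 - 2*s*u + t = 0.

    The discriminant of this cubic equals 16*(8*s**3 - 27*t**2), so its sign
    decides on which side of the curve C: 8*s**3 = 27*t**2 the point (s, t) lies.
    """
    disc = 8 * s**3 - 27 * t**2
    if disc > 0:
        return 3          # inside the cusp: three simple roots
    if disc < 0:
        return 1          # outside the cusp: one simple root
    if s == 0 and t == 0:
        return 1          # cusp point: the triple root u = 0
    return 2              # on C away from the cusp point: a double and a simple root
```

For example, $(s,t) = (6,8)$ lies on $C$, and there $4u^3 - 12u + 8 = 4(u-1)^2(u+2)$ has exactly two distinct roots.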
Figure 37.38
37.28h. The Seven Elementary Catastrophes of R. Thom
Of importance for the following are the transformation formulas:
$$F(u) = \pm F_1(\varphi(u)) - \text{constant} - Q(u,u), \qquad u = (\xi,\eta,\ldots), \tag{164}$$
$$H_1(u,p) = H(\psi(u,p),\zeta(p)) + K(p), \tag{165}$$
and $H(u,0) = F(u)$. The new concepts appearing in the following theorem
will be explained in the next section.
Theorem 37.F (Classification Theorem). (1) Normal form. If $F_1\colon U_1(0) \subset \mathbb{R}^N \to \mathbb{R}$
is a $C^\infty$-function on the neighborhood of zero, $U_1(0)$, which has a critical
point at $u = 0$ with $\operatorname{codim} F_1 \le 4$, then one can always attain one of the normal
forms $F$ given in Table 37.3 by means of a transformation of the form (164).
In this connection, $\varphi$ is a $C^\infty$-diffeomorphism on a neighborhood of zero with
$\varphi(0) = 0$, and $Q$ is a nondegenerate quadratic form of the components of $u$
which do not appear in $F$.
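Table 37.3
codim $F_1$    Normal form $F$           Universal stable deformation $H$ (unfolding) of $F$; $t$, $s$, $v$, $w$ are parameters
1              $\xi^3$                   $\xi^3 + t\xi$
2              $\xi^4$                   $\xi^4 - s\xi^2 + t\xi$
3              $\xi^5$                   $\xi^5 + v\xi^3 + s\xi^2 + t\xi$
3              $\xi^3 + \eta^3$          $\xi^3 + \eta^3 + v\xi\eta - s\eta - t\xi$
3              $\xi^3 - \xi\eta^2$       $\xi^3 - \xi\eta^2 + v(\xi^2 + \eta^2) - s\eta - t\xi$
4              $\xi^6$                   $\xi^6 + w\xi^4 + v\xi^3 + s\xi^2 + t\xi$
4              $\xi^2\eta + \eta^4$      $\xi^2\eta + \eta^4 + w\xi^2 + v\eta^2 - s\eta - t\xi$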
(2) Stable deformation. To each normal form $F$ there belongs the stable
$k$-parameter deformation $H$ shown in Table 37.3.
The proof, together with graphical representations of the elementary
catastrophes, can be found in Bröcker and Lander (1975, M), Chapters 15
and 17. The stable deformations $H$ in Table 37.3 have a universal character:
Every other stable deformation $H_1$ of $F$ in a neighborhood of zero cannot
contain fewer parameters and, with the aid of a coordinate and parameter
transformation according to (165), can be represented by $H$ in a natural
way. Here, $\psi$, $\zeta$, and $K$ have the natural properties given below in (ii),
where, however, $\zeta$ need only be a $C^\infty$-function.
Furthermore, it is important that the expressions $H$ given in Table 37.3
are typical for the behavior of parameter families in a neighborhood of zero
with no more than four parameters. Roughly speaking, this means that as a
rule (in the sense of a generic property) one obtains the functions $H$ given in
Table 37.3 by means of suitable coordinate and parameter transformations.
In Chapter 73 we shall study this in greater detail.
37.28i. Stable Deformations and Codimension
In the following, let $U$, $U_1$, $V$, and $V_1$ be open neighborhoods of zero.
(i) Deformation. Let $F\colon U \subset \mathbb{R}^N \to \mathbb{R}$ be a $C^\infty$-function. By a $k$-parameter
deformation, or an unfolding, of $F$ we understand a $C^\infty$-mapping of the
form $(u,p) \mapsto H(u,p)$, where $H(u,0) = F(u)$ on $V$. More precisely,
$$H\colon V \times V_1 \subset \mathbb{R}^N \times \mathbb{R}^k \to \mathbb{R}.$$
We think of $p$ in $\mathbb{R}^k$ as a parameter.
(ii) Stable deformation. In addition, $H$ in (i) is called a stable $k$-parameter
deformation of $F$ if and only if the following holds: For each sufficiently
small neighborhood of zero, $U_1 \times U_2$, in $\mathbb{R}^N \times \mathbb{R}^k$, there exists a
neighborhood $W(H)$ of $H$ in $C^\infty(U_1 \times U_2, \mathbb{R})$ such that each $H_1 \in W(H)$ can be
obtained from $H$ by means of a coordinate and parameter transformation
according to:
$$H_1(u,p) = H(\psi(u,p),\zeta(p)) + K(p) \quad \text{on } U_1 \times U_2.$$
This transformation has the following natural properties:
(a) $\psi(\cdot,p)\colon U_1 \to U_1$ is a $C^\infty$-diffeomorphism for all $p \in U_2$ with $\psi(u,0) = u$
and $\psi \in C^\infty(U_1 \times U_2, \mathbb{R}^N)$.
(b) $\zeta\colon U_2 \to U_2$ is a $C^\infty$-diffeomorphism with $\zeta(0) = 0$ and $K \in C^\infty(U_2, \mathbb{R})$.
(iii) Germs. Two $C^\infty$-functions $F_i\colon U_i \subset \mathbb{R}^N \to \mathbb{R}$, $i = 1, 2$, are said to be
equivalent if and only if they coincide in a neighborhood of zero. The
corresponding equivalence classes are called germs. One can define addition,
multiplication, and differentiation of germs in a natural way by carrying out
these operations on representatives and taking into account that the result is
independent of the choice of the representatives. The germ structure plays a
central role in the construction of normal forms, because one can make use
of the methods of commutative algebra (ideal theory) and algebraic
geometry. In this connection, the theory of local rings and Malgrange's
preparation theorem are especially important. The latter generalizes the Weierstrass
preparation theorem (cf. Problem 8.1).
In this connection, study Bröcker and Lander (1975, M), Chapter 6.
(iv) Codimension. Let $G_N$ be the real vector space of germs of $C^\infty$-functions
$F\colon U \subset \mathbb{R}^N \to \mathbb{R}$, and let $F_1\colon U_1 \subset \mathbb{R}^N \to \mathbb{R}$ be a fixed $C^\infty$-function. We
set
$$\operatorname{codim} F_1 = \dim G_N/(j^1F_1).$$
Here, $(j^1F_1)$ is the linear subspace of $G_N$ that consists of all germs
belonging to
$$a_0 + \sum_{i=1}^{N} a_i(u)\,\frac{\partial F_1(u)}{\partial \xi_i},$$
where $u = (\xi_1,\ldots,\xi_N)$. Furthermore, $a_0 \in \mathbb{R}$ and all $a_i$ are $C^\infty$-functions in a
neighborhood of zero. The factor space $G_N/(j^1F_1)$ results in the usual way
by identifying elements of $G_N$ which differ by an element in $(j^1F_1)$.
Example 37.73. If $F_1$ has no critical point or a nondegenerate critical point
at $u = 0$, then $\operatorname{codim} F_1 = 0$.
Example 37.74. If $F_1\colon U \subset \mathbb{R} \to \mathbb{R}$ has the form $F_1(u) = u^mG(u)$ with
$G(0) \neq 0$ and $m \ge 2$, then $\operatorname{codim} F_1 = m - 2$. In this case, a basis for
$G_N/(j^1F_1)$ is formed by the residue classes which belong to $u, u^2, \ldots, u^{m-2}$.
Exercise. Prove Example 37.74 (cf. Golubitsky (1978, S), p. 360). One can
conceive of $\operatorname{codim} F_1$ as the measure of degeneracy of a critical point at $u = 0$.
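As a worked instance of Example 37.74, added here for illustration, take $N = 1$ and $F_1(u) = u^4$, i.e., $m = 4$ and $G \equiv 1$. The subspace to be factored out consists of the germs $a_0 + a_1(u)\,4u^3$, and every germ $g$ can be split by Taylor's formula with a smooth remainder $r$:

```latex
g(u) = \underbrace{g(0) + u^3 r(u)}_{\text{element of the subspace}}
       \;+\; g'(0)\,u \;+\; \tfrac{1}{2}\,g''(0)\,u^2 .
```

The residue classes of $u$ and $u^2$ therefore form a basis of the factor space, so that $\operatorname{codim} F_1 = 2 = m - 2$. This agrees with the entry for $\xi^4$ in Table 37.3.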
37.28j. Perturbed Bifurcation and Catastrophe Theory
As an illustration, we consider the buckling of a rod of length $\pi$ under the
influence of the external force $\lambda$ (see Fig. 37.39). Let $y(x)$ be the
displacement at the point $x$. Here, let $x$ denote arc length. If $u$ denotes the maximal
displacement, then theoretically one obtains the bifurcation diagram shown
in Fig. 37.40(a), i.e., buckling occurs only for a critical force $\lambda_0$. Here, $u > 0$
(respectively, $u < 0$) means that the buckling is upward (respectively,
downward). In practice, however, one obtains diagrams that correspond to Fig.
37.40 (b), (c) and which one can interpret to mean that the ideal situation is
disturbed, e.g., by an additional small weight $a$ as in Fig. 37.41. This
situation includes the lower branch in Fig. 37.40(b).
Figure 37.40. Panels (a), (b), (c).
We will now show how catastrophe theory can explain the structure of
this perturbation diagram. In order to expose the heart of the matter clearly,
we forego concrete calculations.
Figure 37.41
From the principle of stationary potential energy, we get the following
variational problem for $y$:
$$V_\lambda \overset{\text{def}}{=} \int_0^\pi L(y,y',y'';\lambda)\,dx = \text{stationary!}, \tag{166}$$
$$y(0) = y(\pi) = y''(0) = y''(\pi) = 0,$$
with the corresponding Euler equation
$$G(y,y',\ldots,y^{(4)},\lambda) = 0.$$
This, together with the boundary condition in (166), is a bifurcation
problem for determining $y$ and $\lambda$. We assume that with the aid of the
method from Section 8.10 we obtain a unique bifurcation branch of the
form
$$y(x) = uy_0(x) + O(u^2), \qquad \lambda = \lambda_0 + O(u), \qquad u \to 0.$$
We substitute $y$ in $V_\lambda$. Now suppose that we obtain
$$V_{\lambda_0}(u) = au^4 + O(u^5), \qquad u \to 0,$$
where $a > 0$, i.e., $u = 0$ is a critical point of codimension 2 (cf. Example
37.74). According to Section 37.28h, there is a change of variable from $u$ to $\xi$
such that
$$V_{\lambda_0} = \xi^4.$$
By Table 37.3, the universal stable deformation
$$V_{s,t}(\xi) = \xi^4 - s\xi^2 + t\xi \tag{167}$$
corresponds to this. This means that if we are interested in stable
perturbations of the potential, then we can reduce these to (167) by means of
suitable coordinate and parameter transformations. The crucial information
that catastrophe theory provides us is that we need at least two parameters
to describe the stable deformation of the potential. Therefore, it does not
suffice to consider only the force parameter $\lambda$.
The equilibrium state of the rod is now determined by the requirement
that the potential energy is stationary for fixed parameters, i.e.,
$$V'_{s,t}(\xi) = 4\xi^3 - 2s\xi + t = 0.$$
For fixed $t = 0$, $t > 0$, $t < 0$, we obtain the structure of the diagram in Fig.
37.40 (a), (b), (c) when we set $u = \xi$, $\lambda = \lambda_0 + s$ there.
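The diagrams can be traced numerically by solving the equilibrium condition for $s$ as a function of $\xi$, which is possible whenever $\xi \neq 0$. The following is a minimal sketch; the function names are hypothetical, and the turning-point formula is derived here from $ds/d\xi = 0$ rather than quoted from the text.

```python
import math


def load_parameter(xi, t):
    """Solve 4*xi**3 - 2*s*xi + t = 0 for s (requires xi != 0)."""
    if xi == 0:
        raise ValueError("xi = 0 lies on the equilibrium set only for t = 0")
    return 2.0 * xi**2 + t / (2.0 * xi)


def equilibrium_residual(xi, s, t):
    """Left-hand side of the equilibrium condition V'_{s,t}(xi) = 0."""
    return 4.0 * xi**3 - 2.0 * s * xi + t


def turning_point(t):
    """Turning point of the perturbed branch for t != 0.

    From s = 2*xi**2 + t/(2*xi), the condition ds/dxi = 0 gives xi**3 = t/8,
    and then s = 6*xi**2.
    """
    xi = math.copysign(abs(t) ** (1.0 / 3.0), t) / 2.0
    return xi, 6.0 * xi**2
```

For $t = 0$ the nontrivial branch is the parabola $s = 2\xi^2$, which together with the trivial branch $\xi = 0$ gives the unperturbed diagram. For $t \neq 0$ the formula yields two separate branches, one for $\xi > 0$ and one for $\xi < 0$, and the turning point lies on the curve $8s^3 = 27t^2$.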
Concrete calculations can be found in Golubitsky (1978, S). There,
y0(x) = sin x. For perturbed and many-parameter bifurcations, we
recommend Hale (1976, S), Chow and Hale (1982, M), Reiss (1977), and
Golubitsky and Schaeffer (1979).
37.28k. Taylor Expansion and Numerical Calculation of
Normal Forms
In the following, let $F, G\colon U(0) \subset \mathbb{R}^N \to \mathbb{R}$ be $C^\infty$-functions on a
neighborhood of zero. We denote the polynomial which consists of the terms of the
Taylor expansion of $F$ at zero up to and including the $k$th-order term by
$j^kF$. Furthermore, let $u = (\xi_1,\ldots,\xi_N)$. We call $F$ and $G$ locally
right-equivalent if and only if
$$G(u) = F(\varphi(u)) + \text{constant}$$
holds on a neighborhood of zero in $\mathbb{R}^N$, where $\varphi$ is a local $C^\infty$-diffeomorphism
with $\varphi(0) = 0$.
It is of great significance for practical problems that one answers the
following questions: How must one choose $k$ so that $j^kF$ expresses the
essential behavior of $F$, and how can one construct such special
deformations of $F$ that yield all deformations of $F$ in a neighborhood of zero up to
coordinate and parameter transformations? A summary of results in this
direction together with a computer program can be found in Poston and
Stewart (1978, M), Chapter 8. In this connection, a central role is played by
$k$-determinacy. The function $F$ is called $k$-determined if and only if it follows
from $j^kF = j^kG$ that $F$ and $G$ are locally right-equivalent.
Example 37.75. Let $F'(0) \neq 0$, i.e., not all the linear terms vanish in the
Taylor expansion of $F$ at zero. Then $F$ and $H$, with $H(u) = \xi_1$, are locally
right-equivalent. Consequently, $F$ is one-determined.
Now let $F'(0) = 0$, but suppose that the critical point $u = 0$ is not
degenerate, i.e., all linear terms in the Taylor expansion of $F$ vanish at zero;
however, the quadratic terms constitute a nondegenerate quadratic form.
According to Morse's lemma in Section 37.27c, $F$ and $H$ are locally
right-equivalent, where $H(u) = \xi_1^2 + \cdots + \xi_r^2 - \xi_{r+1}^2 - \cdots - \xi_N^2$. Thus, $F$ is
two-determined.
The situation is more complicated in the investigation of degenerate
critical points $u = 0$ for which terms of order higher than two in the Taylor
expansion are crucial. For example, there is no number $k$ such that
$F(u) = \xi_1^2\xi_2$ is $k$-determined. Roughly speaking, this follows from the fact
that for $G = \xi_1^2\xi_2 + \xi_2^{2k+1}$, $F$ and $G$ are not locally right-equivalent,
because the zeros of $F$ lie on two straight lines, but those of $G$ lie only on
one straight line.
As an illustration of the structure of criteria for $k$-determinacy, we give
the following proposition: $F$ is $k$-determined if one obtains each
homogeneous polynomial of the $(k+1)$-st degree in $N$ variables by forming
$$P_1\,j^k(D_1F) + \cdots + P_N\,j^k(D_NF)$$
with arbitrary polynomials $P_i$ of degree greater than or equal to two and
discarding terms of order higher than $k + 1$. Here, $D_i = \partial/\partial\xi_i$.
Exercise
With the aid of this criterion show that $F(u) = \xi_1^3 + \xi_1\xi_2^2$ is three-determined.
A further important criterion due to Mather reads as follows: $F$ is
$k$-determined for some $k$ if and only if $\operatorname{codim} F$ is finite. In this case,
$k \le \operatorname{codim} F + 2$.
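The bound in Mather's criterion cannot be improved in general, as the following one-dimensional illustration (added here, using Example 37.74) suggests. For $N = 1$ and $F(u) = u^m$ with $m \ge 2$:

```latex
\operatorname{codim} F = m - 2, \qquad
k \le \operatorname{codim} F + 2 = m, \qquad
j^{m-1}F = 0 = j^{m-1}G \quad \text{for } G \equiv 0 .
```

Since $u^m$ and the zero function are not locally right-equivalent, $F$ is not $(m-1)$-determined. On the other hand, $F$ is $m$-determined, so the smallest admissible $k$ equals $\operatorname{codim} F + 2$ in this case.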
In the natural sciences, one frequently uses approximations that are
obtained by truncating the Taylor expansion, i.e., one replaces F by jkF.
However, if $k$ is chosen poorly, then grave errors can arise. The
significance of the theory of $k$-determinacy is precisely that one obtains
propositions concerning an appropriate choice of $k$. In Chapter 73 we shall
consider these questions in greater detail.
37.28l. Applications of Catastrophe Theory
Numerous applications in hydrodynamics, theory of elasticity,
thermodynamics, laser technology, biology, ecology, sociology, and numerical
mathematics are described in Poston and Stewart (1978, M) and Gilmore
(1981, M). We recommend that the reader study these monographs. There
one also finds a detailed bibliography.
References to the Literature
Classical work on catastrophe theory: Thom (1972, M).
Introduction: Lu (1976, L); Poston and Stewart (1978, M,B) and Gilmore
(1981, M) (elementary expositions with numerous applications); Golubitsky
(1978, S); Triebel (1981, S).
Elementary transversality theory and applications: Guillemin and Pollack
(1974, M).
General singularity theory: Golubitsky and Guillemin (1973, M), Arnold
(1981, S).
Deformations and elementary catastrophes: Bröcker and Lander (1975,
M).
Classification of singularities and applications to the bifurcation theory of
dynamical systems: Arnold (1971, M), Vol. II.
Classification of critical points: Arnold (1975, S), (1983, S).
Phase integrals of geometrical optics and catastrophe theory: Duistermaat
(1974); Arnold (1975, S), (1983a, S).
Transversality and generic properties of dynamical systems: Abraham
and Robbin (1967, M).
Imperfect bifurcation and catastrophe theory: Golubitsky and Schaeffer
(1979), Chow and Hale (1982, M).
Applications to the natural sciences: Poston and Stewart (1978, M) and
Gilmore (1981, M) (comprehensive expositions); Thom (1972, M) (biology);
Zeeman (1974, S); Hilton (1974, P); Golubitsky (1978, S); Güttinger and
Eikemeier (1979, P); Ursprung (1982, L) (applications in economics);
Thompson (1982, M).
37.29. Basic Ideas for the Construction of Approximation Methods for Extremal Problems
In this section we shall give a summary of the basic ideas of a number of
methods for the approximate solution of extremal problems:
(α) The Ritz method.
(β) Gradient method (descent method).
(γ) Ascent method.
(δ) Penalty method.
(ε) Regularization and perturbation analysis.
(ζ) Duality method.
(η) Dynamic optimization.
(θ) Decomposition.
Moreover, we discuss two important principles for the construction of
further approximation methods:
(1) Equivalence principle.
(2) Combination principle.
In the following list we summarize several typical difficulties which appear
in the numerical treatment of extremal problems. In parentheses, we give
the methods by means of which these difficulties can in principle be
overcome.
(α) Infinite-dimensional problems (the Ritz method and more general
projection methods, e.g., in the case of variational inequalities).
(β) Side conditions (penalty method).
(γ) Multiple solutions (regularization).
(δ) Multivalued expressions (regularization).
(ε) Unstable, i.e., ill-posed problems (regularization).
(ζ) Large number of equations (decomposition).
If several of the above-named difficulties occur, then one must form a
combination of several of these methods. This is the combination principle.
In Chapter 25 we have, for instance, combined the projection and iteration
methods to form the projection-iteration method. Some abstract results for
combined methods can be found in Kluge (1979, M), page 204.
By the equivalence principle we understand:
(i) The reduction of extremal problems to operator equations. For
example, a necessary condition for a solution of
$$F(u) = \min!$$
is the Euler equation
$$F'(u) = 0.$$
(ii) The reduction of operator equations to extremal problems. For
example,
$$Au = b$$
is equivalent to
$$\|Au - b\|^2 = \min!$$
In (i), for the solution of extremal problems, we also have at our disposal
the methods for the solution of operator equations, which we have already
made available in Parts I and II. We give a survey regarding this in the
references to the literature at the end of this section.
37.29a. Ritz's Method
For the functional $F\colon M \subset X \to \mathbb{R}$ on the real B-space $X$, we consider the
minimum problem
$$\min_{u \in M} F(u) = \alpha. \tag{P}$$
The basic idea of Ritz's method consists in solving the modified problem
$$\min_{u \in M \cap X_n} F(u) = \alpha_n \tag{$P_n$}$$
instead of (P), where $X_n$ is a finite-dimensional subspace of $X$. If $\{w_1,\ldots,w_n\}$
forms a basis in $X_n$, then each $u \in X_n$ can be represented as
$u = c_1w_1 + \cdots + c_nw_n$, $c_i \in \mathbb{R}$. Consequently, $(P_n)$ is a minimum problem for a
real-valued function of the real variables $c_1,\ldots,c_n$.
If $X_1 \subset X_2 \subset \cdots \subset X$ with $\overline{\bigcup_n X_n} = X$, i.e., as $n$ increases, $X$ is
approximated better and better by $X_n$, then under suitable assumptions on $M$ and
$F$, it can be shown that when $n \to \infty$:
(a) The extremal values $\alpha_n$ converge to $\alpha$.
(b) The solutions $u_n$ of $(P_n)$ converge in a certain sense to a solution of (P)
(strong or weak convergence, subsequence convergence; cf. Sections 42.5
and 46.5).
We have already explained the connection with the more general Galerkin
method (projection method) and given numerous examples in Chapters
18-22 of Part II.
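The convergence of the Ritz values can be observed on a simple model problem that is chosen here for illustration and does not appear in the text: minimize $F(u) = \int_0^1 (\tfrac12 u'^2 - u)\,dx$ subject to $u(0) = u(1) = 0$, with the assumed basis functions $w_k(x) = \sin(k\pi x)$. Because the derivatives of these functions are mutually orthogonal on $[0,1]$, the finite-dimensional problem decouples. The function names below are hypothetical.

```python
import math


def _load(k):
    """b_k = integral over [0, 1] of sin(k*pi*x)."""
    return (1.0 - math.cos(k * math.pi)) / (k * math.pi)


def _stiffness(k):
    """a_k = integral over [0, 1] of (d/dx sin(k*pi*x))**2 = (k*pi)**2 / 2."""
    return (k * math.pi) ** 2 / 2.0


def ritz_coefficients(n):
    """Minimizer of sum_k (0.5*a_k*c_k**2 - b_k*c_k): c_k = b_k / a_k."""
    return [_load(k) / _stiffness(k) for k in range(1, n + 1)]


def ritz_value(n):
    """Ritz value alpha_n = -sum_k b_k**2 / (2*a_k) for the n-dimensional subspace."""
    return -sum(_load(k) ** 2 / (2.0 * _stiffness(k)) for k in range(1, n + 1))


def ritz_solution(x, n):
    """Ritz approximation u_n(x) = sum_k c_k * sin(k*pi*x)."""
    return sum(c * math.sin(k * math.pi * x)
               for k, c in enumerate(ritz_coefficients(n), start=1))
```

For this model problem the exact minimizer is $u(x) = x(1-x)/2$ with minimal value $-1/24$. The Ritz values are nonincreasing in $n$ and approach this value from above, which illustrates assertion (a) and the descent character of the method.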
37.29b. Gradient Method or the Method of Steepest Descent
If one wishes to reach the lowest point of a valley in the mountains, then
one must simply walk continually downhill, and indeed as continually as
possible, in the direction of steepest descent. Suppose that the functional
$F: X \to \mathbb{R}$ is given. We will use this idea to formulate an iteration method
for the minimum problem
$$\min_{u \in X} F(u) = \alpha. \tag{P}$$
We choose, say, $X = \mathbb{R}^2$, $u = (\xi, \eta)$, start with $u_0$, and construct a sequence
$(u_n)$ recursively by
$$u_{n+1} = u_n + k_n, \qquad n = 0, 1, \ldots.$$
The direction $k_n$ is to be so chosen that $F(u_{n+1}) < F(u_n)$ holds. We show
that
$$k_n \overset{\text{def}}{=} -t_n F'(u_n), \qquad t_n > 0$$
is a propitious choice. To this end, we set
$$\varphi(t) = F(u_0 + th) \quad \text{for all } t \in \mathbb{R},$$
i.e., we consider $F$ on the straight line $t \mapsto u_0 + th$. We have
$$\varphi'(t) = F_\xi(u_0 + th)h_1 + F_\eta(u_0 + th)h_2 = F'(u_0 + th)h,$$
so that $\varphi'(0) = F'(u_0)h$.
$\varphi'(0)$ is smallest if, to within a normalization constant, we choose
$$h = -F'(u_0),$$
i.e., $h = -\operatorname{grad} F$. This is the direction of steepest descent.
If the mountain landscape, i.e., the surface belonging to $F$, is sufficiently
well behaved, then $(u_n)$ converges to a solution of (P) when the step size $t_n$
is appropriately prescribed (cf. Sections 42.6 and 46.6).
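The iteration can be sketched in a few lines. The quadratic functional on $\mathbb{R}^2$ and the constant step size $t = 0.1$ below are assumptions made for illustration only; the text leaves the choice of step sizes to Sections 42.6 and 46.6.

```python
def gradient_descent(grad, u0, t, n_steps):
    """Iterate u_{n+1} = u_n - t * grad(u_n) with a fixed step size t > 0
    and return the list of all iterates."""
    u = list(u0)
    path = [tuple(u)]
    for _ in range(n_steps):
        g = grad(u)
        u = [ui - t * gi for ui, gi in zip(u, g)]
        path.append(tuple(u))
    return path

def F(u):
    # Hypothetical test functional with its minimum at (1, -0.5).
    xi, eta = u
    return (xi - 1.0) ** 2 + 2.0 * (eta + 0.5) ** 2

def grad_F(u):
    # F'(u) = (F_xi, F_eta) for the functional above.
    xi, eta = u
    return [2.0 * (xi - 1.0), 4.0 * (eta + 0.5)]

path = gradient_descent(grad_F, (0.0, 0.0), 0.1, 200)
u_final = path[-1]
print(u_final, F(u_final))
```

For this functional the step size $0.1$ contracts the two error components by the factors $0.8$ and $0.6$ per step, so the values $F(u_n)$ decrease and the iterates approach $(1, -0.5)$.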
37.29c. Ascent Methods and Remes Algorithms
In contrast to the so-called descent methods, in which the minimal value is
approached from above, in ascent methods this occurs from below. The Ritz
method and the gradient method are typical descent methods. On the other
hand, the ascent method is frequently used in approximation theory.
Prototypes are the Remes algorithms for uniform polynomial approximation. To
this end, we consider the problem of Chebyshev approximation
$$\min_{u \in M} \Bigl( \max_{c \le t \le d} |f(t) - u(t)| \Bigr) = \alpha \tag{P}$$
and, parallel to this, the discretized problem
$$\min_{u \in M} \Bigl( \max_{1 \le i \le m} |f(t_i) - u(t_i)| \Bigr) = \beta, \tag{$P_d$}$$
where $-\infty < c \le t_1 < t_2 < \cdots < t_m \le d < \infty$. Let $M$ be the set of polynomials
of degree at most $n - 1$. Obviously, $\beta \le \alpha$.
In the first Remes algorithm one successively increases the number of
subdivision points in ($P_d$) so that $\beta$ constantly becomes larger: One begins
with $m = 1$ and determines $u_1$ as a solution of ($P_d$). Then the point $t_2$ is so
chosen that $|f(t) - u_1(t)|$ assumes its maximum on $[c, d]$ at $t_2$. If $t_2 = t_1$,
then $\alpha \le \beta$, i.e., $\alpha = \beta$, and $u_1$ is also a solution of (P). Otherwise, one
determines $u_2$ as a solution of ($P_d$) with $m = 2$, and so forth.
EXERCISE. Prove the convergence of this algorithm (cf. Cheney (1966, M), page 96).
In using this algorithm, it is possible that $m$ becomes very large. In the
refined second Remes algorithm, one works with a fixed $m = n + 1$ and
makes use of the equation
$$f(t_i) - u(t_i) = (-1)^{i+1}\gamma, \qquad i = 1, \ldots, n + 1. \tag{168}$$
According to the alternation theorem in Section 37.16, $u$ is a solution of (P)
precisely when (168) holds, with $|\gamma| = \|f - u\|$. From Corollary 39.14 it
easily follows that the unique solution of (168) is equal to the solution of
($P_d$); thus, $\beta = |\gamma|$. For this reason, for each solution of (168), the error
estimate
$$|\gamma| \le \alpha \le \|f - u\|$$
holds. Thus, the idea consists in solving (168) and increasing the number $|\gamma|$
by interchanging the $t_i$ until we obtain $|\gamma| = \|f - u\|$.
Example 37.76. We will approximate $f(t) = e^t$ uniformly in an optimal way
on $[-1, 1]$ by means of a first-degree polynomial $u = a + bt$. We set $g = f - u$.
According to the alternation theorem in Section 37.16, there exists exactly
one solution $u$, and this solution is characterized by the fact that there exist
three points $-1 \le t_1 < t_2 < t_3 \le 1$ such that
$$g(t_1) = -g(t_2) = g(t_3), \qquad |g(t_i)| = \|g\| \text{ for all } i.$$
We start with $t_1 = -1$, $t_2 = 0$, and $t_3 = +1$. From (168) we obtain
$$u = 1.272 + 1.175t, \qquad \gamma = 0.272.$$
Subdivision of $[-1, 1]$ with step size 0.1 then yields $\|g\| \ge 0.286$ and
$|g(\bar t)| = 0.286$ for $\bar t = 0.2$. Due to the alternating property,
we interchange $t_2$ and $\bar t$. With the changed points $t_1$, $\bar t$, and $t_3$, we again solve
(168). This yields
$$u = 1.264 + 1.175t, \qquad \gamma = 0.278. \tag{169}$$
Now, $\|g\| = 0.279$ and $|g(0.16)| = 0.279$. The polynomial (169) satisfies (168)
with the points $-1$, $0.16$, and $+1$. For this reason, the $u$ in (169) is the
solution within the limits of the accuracy used, and $-1$, $0.16$, and $1$ are the
alternation points.
EXERCISE. The reader is asked to depict our calculations graphically.
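The calculation of Example 37.76 can be reproduced with a short program. The following is only a sketch under stated assumptions: it treats the special case of a first-degree polynomial (three points), searches for the maximum of $|g|$ on a finite grid instead of on all of $[c, d]$, and uses a simple one-point exchange rule; the function names `solve3`, `exchange`, and `remes_linear` are hypothetical.

```python
import math

def solve3(A, y):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    M = [row[:] + [rhs] for row, rhs in zip(A, y)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, 3):
            factor = M[r][col] / M[col][col]
            for c in range(col, 4):
                M[r][c] -= factor * M[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = M[r][3] - sum(M[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / M[r][r]
    return x

def exchange(pts, g, t_new):
    """Swap t_new into the sorted point list so that the signs of g still alternate."""
    pts = list(pts)
    positive = g(t_new) > 0
    if t_new < pts[0]:
        if (g(pts[0]) > 0) == positive:
            pts[0] = t_new
        else:
            pts = [t_new] + pts[:-1]
    elif t_new > pts[-1]:
        if (g(pts[-1]) > 0) == positive:
            pts[-1] = t_new
        else:
            pts = pts[1:] + [t_new]
    else:
        k = max(i for i, t in enumerate(pts) if t <= t_new)
        if (g(pts[k]) > 0) == positive:
            pts[k] = t_new
        else:
            pts[k + 1] = t_new
    return pts

def remes_linear(f, pts, grid, max_iter=20, tol=1e-12):
    """Second Remes algorithm for u(t) = a + b*t with three alternation points."""
    pts = list(pts)
    for _ in range(max_iter):
        # Equation (168): a + b*t_i + (-1)**(i+1) * gamma = f(t_i), i = 1, 2, 3.
        A = [[1.0, t, (-1.0) ** j] for j, t in enumerate(pts)]
        a, b, gamma = solve3(A, [f(t) for t in pts])

        def g(t):
            return f(t) - (a + b * t)

        t_bar = max(grid, key=lambda t: abs(g(t)))
        if abs(g(t_bar)) - abs(gamma) <= tol:
            break  # |gamma| equals the grid maximum of |f - u|
        pts = exchange(pts, g, t_bar)
    return a, b, gamma, pts

grid = [-1.0 + 2.0 * k / 2000 for k in range(2001)]
a, b, gamma, pts = remes_linear(math.exp, (-1.0, 0.0, 1.0), grid)
print(a, b, gamma, pts)
```

Calling `remes_linear` with `max_iter=1` returns the coefficients of the first step of the example; with the finer grid used here the middle point settles near $0.16$.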
The proof of the convergence of the second Remes algorithm, in which
the interchange of the $t_i$ must be organized appropriately, can be found in
Cheney (1966, M), page 97 and, in greater generality, in Meinardus (1964,
M), page 102. This method converges linearly (respectively, quadratically
for smooth $f$), i.e., at the $k$th step the error is of the order of magnitude $q^k$
(respectively, $q^{2^k}$) for fixed $q \in\, ]0, 1[$. The rapid convergence for smooth $f$
results from the fact that in this case the algorithm can be reduced to a
Newton method.
Figure 37.42
37.29d. Penalty Method
The basic idea consists in approximating minimum problems with side
conditions by such problems having no side conditions.
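To explain this, for a given continuous function $F: \mathbb{R} \to \mathbb{R}$, we consider
the minimum problem with side condition,
$$\min_{0 \le x \le 1} F(x) = \alpha \tag{P}$$
and parallel to it, the minimum problem without side condition,
$$\min_{x \in \mathbb{R}} F_n(x) = \alpha_n. \tag{S}$$
In addition, let
$$F_n(x) = F(x) + a_n (A(x)x)^2 + b_n (B(x)(1 - x))^2;$$
$$A(x) = \begin{cases} 0 & \text{if } x \ge 0, \\ 1 & \text{if } x < 0; \end{cases} \qquad
B(x) = \begin{cases} 0 & \text{if } x \le 1, \\ 1 & \text{if } x > 1. \end{cases}$$
Here, $(a_n)$ and $(b_n)$ are monotonically increasing sequences of positive
numbers which tend to $+\infty$ as $n \to \infty$.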
Outside $[0, 1]$, for increasing $n$, the functions $F_n$ are increasingly steep (see
Fig. 37.42). For this reason, it is plausible that, for large $n$, the solutions of
(S) lie in or close to $[0, 1]$ and approximate solutions of (P).
We shall investigate this method in greater detail in Section 46.7. The
name "penalty" functional has its origin in the situation that one adjoins
additional terms (the so-called penalty terms) to $F$ with $F_n(x) > F(x)$ for
$x \notin [0, 1]$. We say that violation of the side condition is penalized. This
penalty increases with increasing $n$.
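A numerical sketch of the penalty functions $F_n$ follows. The choices $F(x) = (x - 2)^2$, the weights $a_n = b_n = 10^n$, and the golden-section search on $[-5, 5]$ are assumptions made for illustration; the helper name `golden_min` is hypothetical. For this $F$ the constrained minimum on $[0, 1]$ lies at $x = 1$, and the minimizer of $F_n$ is $(2 + b_n)/(1 + b_n)$.

```python
import math

def golden_min(fun, lo, hi, iters=80):
    """Golden-section search for the minimizer of a unimodal function on [lo, hi]."""
    r = (math.sqrt(5.0) - 1.0) / 2.0
    x1 = hi - r * (hi - lo)
    x2 = lo + r * (hi - lo)
    f1, f2 = fun(x1), fun(x2)
    for _ in range(iters):
        if f1 < f2:
            hi, x2, f2 = x2, x1, f1
            x1 = hi - r * (hi - lo)
            f1 = fun(x1)
        else:
            lo, x1, f1 = x1, x2, f2
            x2 = lo + r * (hi - lo)
            f2 = fun(x2)
    return 0.5 * (lo + hi)

def F(x):
    # Hypothetical objective; its unconstrained minimum x = 2 violates 0 <= x <= 1.
    return (x - 2.0) ** 2

def F_n(x, a_n, b_n):
    A = 1.0 if x < 0.0 else 0.0   # switches the left penalty term on
    B = 1.0 if x > 1.0 else 0.0   # switches the right penalty term on
    return F(x) + a_n * (A * x) ** 2 + b_n * (B * (1.0 - x)) ** 2

weights = [1.0, 10.0, 100.0, 1000.0]
minimizers = [golden_min(lambda x, w=w: F_n(x, w, w), -5.0, 5.0) for w in weights]
for w, x in zip(weights, minimizers):
    print(w, x)
```

In this particular example the computed minimizers of $F_n$ lie slightly to the right of $1$ and move toward the constrained minimizer $x = 1$ as the weights grow.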
37.29e. Regularization and Perturbation Calculus
We have already pointed out the significance of regularization methods in
Sections 37.14 and 37.15 and in the introduction to Section 37.29. The
general idea consists in replacing an equation
$$Au = b \tag{P}$$
by a regularized equation
$$A_\varepsilon u = b, \tag{$P_\varepsilon$}$$
which one can treat more easily. Following this idea, one tries to construct
solutions for (P) from the solutions of ($P_\varepsilon$). We applied this method, e.g., in
an essential way in the Yosida approximation in Chapter 31 and we will use
it in Chapter 54.
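A very simple finite-dimensional illustration: the text does not fix a particular regularization, so the choice $A_\varepsilon = A + \varepsilon I$ below is an assumption, as are the singular matrix $A$ and the right-hand side $b$. The original equation has infinitely many solutions, whereas each regularized equation has exactly one, and these converge to the solution $(1, 1)$ of smallest norm as $\varepsilon \to 0$.

```python
def solve_2x2(a11, a12, a21, a22, b1, b2):
    """Solve a 2x2 linear system by Cramer's rule (the determinant must be nonzero)."""
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)

def regularized_solution(eps):
    # A = [[1, 1], [1, 1]] is singular; A_eps = A + eps * I is invertible for eps > 0.
    # Right-hand side b = (2, 2), so A u = b means u_1 + u_2 = 2.
    return solve_2x2(1.0 + eps, 1.0, 1.0, 1.0 + eps, 2.0, 2.0)

eps_values = (1.0, 0.1, 0.01, 0.001)
solutions = [regularized_solution(eps) for eps in eps_values]
for eps, u in zip(eps_values, solutions):
    print(eps, u)
```

Here both components of the regularized solution equal $2/(2 + \varepsilon)$, which makes the limit behaviour easy to check by hand.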
Numerous methods for regularization can be found in Lions (1969, M),
(1973, L), (1983, M), and Lions, Jr. (1982, L). A procedure for improving
the behavior of the difference method for gas dynamics problems according
to an idea of J. von Neumann by introducing artificial viscosities is
described in Richtmyer and Morton (1967, M).
In perturbation analysis one tries to represent the solutions of ($P_\varepsilon$) as
expansions in terms of the small parameter $\varepsilon$. We shall delve into these
fundamental techniques, which are basic for theoretical physics, and their
peculiarities in Part V.
37.29f. Duality Method
The basic idea of duality theory consists in considering a given minimum
problem
$$\inf_{u \in A} F(u) = \alpha \tag{P}$$
together with a maximum problem
$$\sup_{p \in B} G(p) = \beta, \tag{P*}$$
where $\beta \le \alpha$. In the following, we shall explain the advantages that accrue
from this.
(i) Two-sided bounds for $\alpha$. If one chooses some $u \in A$, then $F(u) \ge \alpha$
results from (P). However, knowing (P*), one also obtains a lower bound
for $\alpha$, for it follows from (P*) that
$$G(p) \le \beta \le \alpha \le F(u) \quad \text{for } u \in A,\ p \in B. \tag{170}$$
(ii) Sufficiency criterion for solvability. By (170), it follows immediately
from
$$F(u) = G(p) \quad \text{for fixed } u \in A,\ p \in B \tag{171}$$
that $F(u) = \alpha$, i.e., $u$ is a solution of (P), and $p$ is a solution of (P*).
Furthermore, $\alpha = \beta$.
(iii) Error estimate for the solution $u$ of (P). For all $u, v \in A$ and fixed
$c, \rho > 0$, we assume that the estimate
$$c\|v - u\|^\rho \le F(v) - F(u) \tag{172}$$
holds. We give conditions for this in Section 41.3. Now, if $u$ is a solution of
(P), then the error estimate
$$c\|v - u\|^\rho \le F(v) - G(p) \quad \text{for } v \in A,\ p \in B \tag{173}$$
follows immediately from (172) and (170).
(iv) Approximation method. If sequences $(u_n)$ and $(p_n)$ with $F(u_n) \to \alpha$
and $G(p_n) \to \beta$ as $n \to \infty$ are constructed, for example, by using a Ritz
method or a gradient method for (P) and (P*), then from (173) we
immediately obtain:
$$c\|u_n - u\|^\rho \le F(u_n) - G(p_n), \tag{174a}$$
$$G(p_n) \le \beta \le \alpha \le F(u_n). \tag{174b}$$
(v) No duality gaps. We say that there is a duality gap when $\beta < \alpha$. In this
case, because $F(u_n) - G(p_n) \ge \alpha - \beta$, the estimate for $\alpha$ in (174b) and for
$\|u_n - u\|$ in (174a) cannot be arbitrarily precise for fundamental reasons. On
the other hand, for $\alpha = \beta$, the right-hand side of (174a) tends to zero. Thus,
one is very much interested in the condition $\alpha = \beta$. We shall discuss this in
Chapters 49-52.
When $\alpha = \beta$, by (170), one can also formulate a simple necessary and
sufficient criterion for a solution: If $p$ is a fixed solution to the dual problem
(P*), then $u$ is a solution to the original problem (P) if and only if
$F(u) = G(p)$.
(vi) Extremal relations. In many cases one can give relations, say, of the
form $E(u, p) = 0$ between the solutions $u$ (respectively, $p$) of (P)
[respectively, (P*)], which are called extremal relations. Indeed, if $u = J(p)$, then
the following can be exploited: First one determines a solution $p$ of the dual
problem (P*) and then obtains a solution $u$ of the original problem (P) by
$u = J(p)$.
This method is applied, e.g., in the case where (P*) can be solved more
easily than (P). If only the dual problem (P*) has a solution, then by
u = J(p) one can construct generalized solutions to the original problem
(P). For example, this method is applied in the theory of minimal surfaces
(cf. Problem 52.1).
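The two-sided bounds (170), the criterion (171), the error estimate (173), and an extremal relation $u = J(p)$ can all be checked on a tiny example. The primal-dual pair below is an assumption chosen for illustration and does not come from the text: $F(u) = \tfrac12 |u|^2$ on $A = \{u \in \mathbb{R}^2 : a \cdot u = 1\}$ with $a = (3, 4)$, and the Lagrangian dual $G(p) = p - \tfrac12 p^2 |a|^2$ on $B = \mathbb{R}$, with $J(p) = pa$. For this pair, (172) holds with $c = \tfrac12$ and $\rho = 2$.

```python
a = (3.0, 4.0)
a_sq = sum(ai * ai for ai in a)          # |a|^2 = 25

def F(u):
    """Primal functional F(u) = |u|^2 / 2, to be minimized subject to a.u = 1."""
    return 0.5 * sum(ui * ui for ui in u)

def G(p):
    """Dual functional obtained by minimizing the Lagrangian over u."""
    return p - 0.5 * p * p * a_sq

def J(p):
    """Extremal relation u = J(p) = p * a for this example."""
    return tuple(p * ai for ai in a)

def feasible(u, tol=1e-12):
    return abs(sum(ai * ui for ai, ui in zip(a, u)) - 1.0) <= tol

# (i) Two-sided bounds from an arbitrary feasible u and an arbitrary p.
u_trial, p_trial = (1.0 / 3.0, 0.0), 0.03
lower, upper = G(p_trial), F(u_trial)

# (vi) Solve the (one-dimensional) dual problem, then recover u = J(p).
p_star = 1.0 / a_sq
u_star = J(p_star)

# (iii) Error estimate (173) with c = 1/2 and exponent 2.
err_sq = sum((x - y) ** 2 for x, y in zip(u_trial, u_star))
print(lower, upper, F(u_star), G(p_star), 0.5 * err_sq)
```

Since $F(u^*) = G(p^*)$ here, there is no duality gap, and the common value is $\alpha = \beta = 1/(2|a|^2) = 0.02$.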
37.29g. Dynamic Optimization
We have already discussed the pertinent algorithm for discrete problems in
Section 37.20a. For continuous control problems, one obtains approximate
solutions by discretizing and then applying this algorithm.
37.29h. Decomposition
Very large systems of equations occur frequently in practical problems. In
order to treat these systems effectively, one tries to break the problem up
into a number of subproblems. The idea is that in optimizing a complex of
factories, an approximate solution is obtained by optimizing the individual
factories (principle of decomposition). Thus, in order to achieve a good
approximation, one must assume that the interaction between the factories,
i.e., the subsystems, is weak. A comprehensive survey of the different
methods can be found in Bensoussan, Lions and Temam (1972, S). In this
connection, not only extremal problems, but also operator equations and
evolution equations (fractional step methods) are considered.
References to the Literature
Survey of approximation methods: Courant (1943, S) (classical work);
Collatz (1964, M); Cea (1971, M); Polak (1973, S,H,B); Hlavacek (1979, S);
Dixon (1980, P).
General expositions of approximation methods: Luenberger (1969, M)
and Varga (1971, L) (introductory); Collatz (1964, M); Kantorovič and
Akilov (1964, M); Sauer and Szabo (1967, M) (handbook for engineers);
Krasnoselskii (1973, M) (comprehensive exposition); Gajewski, Groger, and
Zacharias (1974, M); Langenbach (1976, M); Auslender (1976, M); Glowin-
ski, Lions and Tremolieres (1976, M) (variational inequalities); Berger
(1977, M); Kluge (1979, M); Glowinski (1980, L); Hlavacek and Necas
(1981, M).
Algorithms: Cea (1971, M); Polak (1971, M); Grossmann and Kleinmichel
(1976, L); Auslender (1976, M); Psenicnyi and Danilin (1979, M); Dixon
(1980, P); Marcuk (1980, M), (1982, M).
Ritz's method: Ritz (1909) (classical work); Michlin (1969, M); Ciarlet
(1977, M) (finite elements) (also, cf. the references to the literature for
Chapter 22 and to the Appendix for Part II).
Projection method and difference method: Galerkin (1915) (classical
work); Krasnoselskii (1973, M); Richtmyer and Morton (1967, M); Birkhoff
(1971, L); Temam (1977, M) (also, cf. the comprehensive references to the
literature for Chapters 20, 21, 34, and 35).
Gradient method: Cauchy (1847) (classical work); Powell (1971) (good
algorithm); Ljubic (1970, S); Cea (1971, M); Vainberg (1972, M); Gopfert
(1973, M); Fletcher (1980, M) (also, cf. the references to the literature on
algorithms above).
Ascent methods in approximation theory: Remes (1934) (classical work);
Meinardus (1964, M); Cheney (1966, M); Laurent (1972, M).
Connection with the Newton method: Collatz (1964, M); Meinardus
(1964, M) (also, cf. the references to the literature for Chapter 39).
Penalty method: Courant (1943) (classical work); Cea (1971, M);
Grossmann and Kaplan (1979, L, B).
Problems with side conditions: Poljak (1974, S,B); Glowinski, Lions, and
Tremolieres (1976, M) (numerous physical applications); Psenicnyi and
Danilin (1979, M); Fletcher (1980, M).
Regularization: Tihonov (1963) (classical work); Cea (1971, M); Morozov
(1973, S); Tihonov and Arsenin (1977, M,H,B); Ivanov, Tanana, and Vasin
(1978, M).
Regularization of monotone operators: Browder (1968/76, M, B); Lions
(1969, M); Gajewski, Groger and Zacharias (1974, M); Pascali (1974, M);
Hess (1974); Pascali and Sburlan (1978, M).
Regularization of difference methods of gas dynamics: J. von Neumann
and Richtmyer (1950) (classical work on artificial viscosity); Lax and
Wendroff (1960), (1964); Richtmyer and Morton (1967, M).
Regularization of partial differential equations: Oleinik (1957, S)
(nonlinear hyperbolic differential equations); Lions (1969, M), (1973, L), (1983, M)
(comprehensive expositions); Lions and Lattes (1969, M) (quasireversibility);
Oleinik (1971, S) (degenerate differential equations).
Regularization of the Hamilton-Jacobi equation: Lions, Jr. (1982, L).
Perturbation theory: Compare the references to the literature in Chapters
8 and 79.
Perturbed variational problems, asymptotics, homogenization: Bensous-
san, Lions and Papanicolaou (1978, M); Lions (1980, S), (1983, M).
Duality method: Ekeland and Temam (1974, M); Glowinski, Lions, and
Tremolieres (1976, M).
Minimax problems: Auslender (1972, L); Demjanov and Malozemov
(1975, M).
Decomposition: Cea (1971, M); Bensoussan, Lions and Temam (1972,
S,B).
Combination principle: Browder (1966); Kluge (1979, M).
General surveys: Jacobs (1976, P) (state of the art in numerical
mathematics); Dixon (1980, P) (state of the art in optimization); Marcuk (1980,
M), (1982, M).
Symposia on optimization problems and their applications: Bensoussan
and Lions (1975, P); Marcuk (1975, P); Cea (1976, P); IFIP Conferences
(1978, P), (1978a, P), (1979, P); Glowinski and Lions (1980, P); Tzafestas
(1980, P). Pursue the IFIP Conferences further.
Modern numerical methods: Pursue the book series, International Series
in Numerical Analysis, Birkhäuser, Basel, Volumes 1-60, and further
volumes. Furthermore, pursue the conference series Glowinski and Lions
(1980).
Complexity of numerical algorithms: Traub (1976, P); Traub and
Wozniakowski (1980, M); Smale (1981, S).
Encyclopedia of mathematics and its applications (1976/oo).
Russian mathematical encyclopedia (1977).
Handbook of applicable mathematics in 6 volumes (1980).
New directions in applied mathematics: Hilton and Young (1980, M).
Further references to the literature on approximation methods can be
found in the following:
Iteration method (Chapter 1);
Numerical method for determination of fixed points (Chapter 1);
Semidiscretization, line method, and Rothe's method (Chapter 3);
Newton's method, secant method, quasilinearization, shooting methods,
and invariant embedding (Chapter 5);
Continuation by a parameter (Chapter 6);
Problems of monotone type (Chapter 7);
Approximation methods for bifurcation problems (Chapter 15);
Collocation method (Chapter 21);
Projection-iteration method (Chapter 25);
Fractional step method (Chapter 30);
Approximate solutions for control problems (Chapter 48).
The references to the literature in Chapter 1 contain a summary of
monographs on numerical mathematics.
Exercise collections and monographs with comprehensive exercise sections.
Calculus of variations: Krasnov (1975, M) (collection of exercises with
solutions); Bolza (1949, M); Gelfand and Fomin (1961, M); Ioffe and
Tihomirov (1974, M).
Optimal control: Oleinikov (1969, M) (collection of exercises with
solutions); Lee and Markus (1967, M); Bryson and Ho (1969, M); Ioffe and
Tihomirov (1974, M); Fleming and Rishel (1975, M); Leitmann (1981, M).
Dynamic optimization: Bellman (1957, M), (1967, M).
Optimization and approximation theory: Collatz and Albrecht (1972, M),
Volumes I, II (collection of exercises with solutions); Dantzig (1963, M);
Luenberger (1969, M); Holmes (1972, L), (1975, M); Foulds (1981, M).
Approximation theory: Cheney (1966, M); Collatz and Krabs (1973, M);
de Boor (1978, M).
Stochastic optimization: Astrom (1970, M).
Game theory: Karlin (1959, M); Owen (1968, M).
Convex functions: Roberts and Varberg (1973, M).
Several optimization techniques and their applications: Foulds (1981, M)
TWO FUNDAMENTAL EXISTENCE
AND UNIQUENESS PRINCIPLES
It is a splendid feeling to realize the unity of a complex of phenomena that by
physical perception appear to be completely separated.
Albert Einstein
Having become acquainted with an abundance of concrete and very
diversified examples in the preceding chapter, we will work out, in the two
following chapters, two general principles at our disposal for existence and
uniqueness proofs. To be precise, these principles entail:
(α) compactness
(β) and convexity
in existence proofs and the significance of:
(α) strong convexity of functionals
(β) and the interpolation property of subspaces
in uniqueness proofs. We have already explained the basic ideas in the
section 'Introduction to the Subject.'
In Chapter 38 we attach value to presenting the connections between
different formulations of the generalized theorem of Weierstrass, which are
available in the literature.
CHAPTER 38
Compactness and Extremal Principles
Before you generalize, formalize, and axiomatize, there must be mathematical
substance.
Hermann Weyl
Another characteristic of mathematical thought is that it can have no success
where it cannot generalize.
Charles Sanders Peirce
Of the greatest importance is the sharp distinction that Weierstrass draws
according to whether a function attains a value at a point or whether it comes
only arbitrarily close to this value.
David Hilbert 1897
In this chapter we give a far-reaching generalization of the following
classical theorem of Weierstrass using compactness arguments: A
continuous function $F: [a, b] \to \mathbb{R}$, $-\infty < a < b < \infty$, has a maximum and a
minimum (see Fig. 38.1). Here, lower semicontinuous functionals and weak
sequentially lower semicontinuous functionals play a crucial role. In this
connection, we exploit, e.g., the fact that the continuity of $F: [a, b] \to \mathbb{R}$ is
not needed for the existence of a minimum of $F$, but only the lower
semicontinuity. Due to its fundamental importance, we will explain the
crucial argument in its simplest form. Let $F: [a, b] \to \mathbb{R}$ be a real function
defined on the closed bounded interval $[a, b]$ with the property
$$F(u) \le \liminf_{n \to \infty} F(u_n) \quad \text{for } u = \lim_{n \to \infty} u_n,$$
which we shall elucidate intuitively in Example 38.11 and Fig. 38.2(a). We
assert that $F$ has a minimum on $[a, b]$. To prove this, let $\alpha$ be the infimum of
$F$ on $[a, b]$, and let $(u_n)$ be a sequence in $[a, b]$ such that $F(u_n) \to \alpha$ as
$n \to \infty$. Since $[a, b]$ is bounded, there exists a convergent subsequence $(u_{n'})$
such that $u_{n'} \to u$ as $n' \to \infty$, and because $[a, b]$ is closed, $u \in [a, b]$. From
$$F(u) \le \liminf_{n' \to \infty} F(u_{n'}) = \alpha$$
it follows that $F(u) = \alpha$, i.e., $F$ has a minimum at $u$.
Figure 38.1
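The role of the inequality $F(u) \le \liminf F(u_n)$ can be seen on two hypothetical step-like functions on $[-1, 1]$, which are not taken from the text. Both equal $|u|$ away from $0$; the first has its value at $0$ below the neighbouring values and attains its minimum, the second has its value at $0$ above them, so its infimum $0$ is approached along $u_n = 1/n$ but never attained.

```python
def F(u):
    # Lower semicontinuous at 0: F(0) = -1 lies below the nearby values |u|.
    return -1.0 if u == 0.0 else abs(u)

def H(u):
    # Not lower semicontinuous at 0: H(0) = 1 lies above the nearby values |u|.
    return 1.0 if u == 0.0 else abs(u)

seq = [1.0 / n for n in range(1, 10001)]            # u_n -> 0
grid = [k / 1000.0 for k in range(-1000, 1001)]     # sample points of [-1, 1]

# Along the sequence, both functions take the values 1/n, which tend to 0.
tail_min_F = min(F(u) for u in seq)
tail_min_H = min(H(u) for u in seq)

print(F(0.0) <= tail_min_F)   # the limit point does not lose against the sequence
print(H(0.0) <= tail_min_H)   # here it does: the values 1/n undercut H(0)
print(min(F(u) for u in grid), min(H(u) for u in grid))
```

On the sample grid the smallest value of `F` is attained at $0$, while the smallest sampled value of `H` is only as small as the grid spacing allows.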
In this chapter we shall get acquainted with different variants of this
argument. The crucial point is that a bounded sequence in an infinite-
dimensional B-space does not necessarily contain a convergent subsequence;
but in a reflexive B-space there always exists a weakly convergent
subsequence. For this reason, weak convergence plays a central role in extremal
problems. In nonreflexive B-spaces one can use weak* convergence in
certain cases.
The main theorem in Section 38.3 represents a central tool for existence
propositions in extremal problems. In connection with the development of
functional analysis, a long historical development process of mathematics
was required before the abstract existence principles that are presented here
were worked out clearly in this century. In this connection, the Dirichlet
principle played a fundamental role. This principle is a method of deduction
which Dirichlet (1805-1859), inspired by an idea of Gauss, used to solve the
first boundary value problem in the plane:
$$G: u_{xx} + u_{yy} = 0; \qquad \partial G: u = g.$$
Dirichlet assumed the existence of a smooth solution $u$ of the variational
problem
$$\int_G (u_x^2 + u_y^2)\,dx\,dy = \min!, \qquad u = g \text{ on } \partial G.$$
Since the first boundary value problem corresponds to the Euler equation of
this variational problem, one immediately finds that u is the solution of the
first boundary value problem. In the middle of the last century, Riemann
placed this principle at the pinnacle of complex variable theory and used it
to construct his deep theory of Abelian integrals and Riemann surfaces.
However, the Dirichlet principle was subjected to sharp criticism by
Weierstrass. He pointed out that the existence of the solution of a variational
problem was in no way evident. In addition, he constructed a simple
variational problem which does indeed possess an infimum but where there
exists no function that realizes this infimum (cf. Problem 38.5). The
justification of the Dirichlet principle thus became a famous problem of the second
half of the nineteenth century. At first, C. Neumann, H. A. Schwarz, and H.
Poincaré bypassed the difficulties of the Dirichlet principle by developing
new methods which made possible the direct solution of the first boundary
value problem without going down the path of a variational problem. The
Dirichlet principle was first rigorously proved by Hilbert (1904). At the
same time, moreover, he created the so-called direct method of the calculus
of variations which works parallel to the above proof for the existence of a
minimum of a real function F, i.e., convergent minimal sequences are
constructed. On the other hand, by the indirect method one understands the
solution of a variational problem with the aid of the solution of the
corresponding Euler differential equation. The significance of lower semi-
continuity for existence questions in classical variational problems was
pointed out emphatically by Tonelli (1921, M). As we shall see, for lower
semicontinuity arguments, the concept of convexity plays a central
background role.
In this connection, we would also like to point out an important situation
which we have already discussed in the introduction to Chapter 18. If the
function $F$ considered above is defined only on the rational numbers in
$[a, b]$, then it need not possess a minimum. In the proof this is expressed by
the fact that the limiting element $u$ of the subsequence $(u_{n'})$ need not be a
rational number. It is only the completion of the rational numbers by the
irrational numbers that makes the above existence theorem for a minimum
possible. An analogous difficulty arose in classical variational problems. For
a general existence theory, it turned out to be necessary, parallel to the
adjunction of the irrational numbers, to adjoin certain ideal elements within
the context of a completion procedure. To these ideal elements there
correspond functions with generalized derivatives, i.e., functions from
Sobolev spaces. Formally, this comes from the fact that the spaces $C^k(G)$ of
smooth functions are not reflexive, while the corresponding Sobolev spaces
$W_p^k(G)$, $1 < p < \infty$, are reflexive, and thus bounded sequences always
contain weakly convergent subsequences.
38.1. Weak Convergence and Weak* Convergence
Here we repeat several frequently used definitions and propositions that are
discussed in detail in the Appendix to Part I. There the connection with
topology is also pointed out, and the difference, say, between weak
continuity and weak sequential continuity is explained to help the reader avoid
errors.
Definition 38.1. Let $X$ be a B-space. We define weak convergence as $n \to \infty$
of a sequence $(u_n)$ in $X$ by
$$u_n \rightharpoonup u \quad \text{iff} \quad \lim_{n \to \infty} \langle v, u_n \rangle = \langle v, u \rangle \text{ for all } v \in X^*. \tag{1}$$
We define weak* convergence as $n \to \infty$ of a sequence $(v_n)$ in $X^*$ by
$$v_n \overset{*}{\rightharpoonup} v \quad \text{iff} \quad \lim_{n \to \infty} \langle v_n, w \rangle = \langle v, w \rangle \text{ for all } w \in X. \tag{2}$$
Norm convergence, also called strong convergence, in $X$ (respectively,
$X^*$) is denoted by $u_n \to u$ (respectively, $v_n \to v$).
Proposition 38.2. The following assertions hold in a B-space $X$:
(1) $u_n \rightharpoonup u$ as $n \to \infty$ implies $u \in M$ when all $u_n$ belong to $M$ and $M$ is a closed
convex set in $X$.
(2) If $X$ is reflexive, then every bounded sequence in $X$ has a weakly
convergent subsequence.
(3) If $X$ is separable, then every bounded sequence in $X^*$ has a weak*
convergent subsequence.
(4) If $X$ is reflexive, then on $X^*$, weak* convergence and weak convergence
coincide. In particular, every H-space and every finite-dimensional B-space
are reflexive.
(5) If $\dim X < \infty$, then strong convergence, weak convergence, and weak*
convergence coincide.
(6) When $n \to \infty$, then we have the following two limiting relations:
$$v_n \to v \text{ in } X^*, \quad u_n \rightharpoonup u \text{ in } X \quad \text{implies} \quad \langle v_n, u_n \rangle \to \langle v, u \rangle, \tag{3}$$
and
$$v_n \overset{*}{\rightharpoonup} v \text{ in } X^*, \quad u_n \to u \text{ in } X \quad \text{implies} \quad \langle v_n, u_n \rangle \to \langle v, u \rangle. \tag{4a}$$
In a reflexive B-space $X$, it follows by assertion (4) that:
$$v_n \rightharpoonup v \text{ in } X^*, \quad u_n \to u \text{ in } X \quad \text{implies} \quad \langle v_n, u_n \rangle \to \langle v, u \rangle. \tag{4b}$$
The proofs of these standard results can be found in Dunford and Schwartz
(1958, M), Vol. I, Yosida (1965, M), and Mukherjea and Pothoven (1978,
M).
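A standard example of a sequence that converges weakly but not strongly is $u_n(x) = \sin(n\pi x)$ in the H-space $L_2(0, 1)$. The sketch below checks this numerically for one test function only, so it illustrates rather than proves weak convergence; the test function $v(x) = x(1 - x)$, the midpoint quadrature, and the helper names are assumptions made for the illustration.

```python
import math

def inner(f, g, n_cells=20000):
    """Midpoint-rule approximation of the L2(0,1) inner product int_0^1 f(x) g(x) dx."""
    h = 1.0 / n_cells
    return h * sum(f((j + 0.5) * h) * g((j + 0.5) * h) for j in range(n_cells))

def u(n):
    # The sequence u_n(x) = sin(n*pi*x).
    return lambda x: math.sin(n * math.pi * x)

def v(x):
    # One fixed test function; weak convergence requires this for every v.
    return x * (1.0 - x)

indices = (1, 5, 25, 125)
pairings = [inner(v, u(n)) for n in indices]      # tends to 0
norms_sq = [inner(u(n), u(n)) for n in indices]   # stays at 1/2

for n, p, s in zip(indices, pairings, norms_sq):
    print(n, p, s)
```

The pairings tend to zero while the squared norms stay at $1/2$, so $(u_n)$ cannot converge strongly to its weak limit $0$.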
Example 38.3. According to Proposition 38.2, (4), weak* convergence plays
a special role only in the dual spaces $X^*$ of nonreflexive B-spaces $X$. A
prototype is the nonreflexive B-space $X = L_1(G)$, where $G$ is a bounded
region in $\mathbb{R}^N$. By $A_2$(40), $X^* = L_\infty(G)$ and weak* convergence $v_n \overset{*}{\rightharpoonup} v$ in $X^*$
means that
$$\int_G v_n u\,dx \to \int_G v u\,dx$$
as $n \to \infty$ for all $u \in X$. The space $L_1(G)$ is separable. Therefore,
Proposition 38.2, (3) can be applied.
38.2. Sequential Lower Semicontinuous and Lower
Semicontinuous Functionals
The point of departure is the formula
$$F(u) \le \liminf_{n \to \infty} F(u_n). \tag{5}$$
Definition 38.4. Let $F: M \subseteq X \to [-\infty, \infty]$ be given. Let $X$ be a B-space.
The functional $F$ is said to be sequentially lower semicontinuous at the point
$u \in M$ if and only if (5) holds for each sequence $(u_n)$ in $M$ such that $u_n \to u$
as $n \to \infty$.
Similarly, $F$ is said to be weak sequentially lower semicontinuous
(respectively, weak* sequentially lower semicontinuous) at the point $u \in M$ if and
only if (5) holds for each sequence $(u_n)$ in $M$ such that $u_n \rightharpoonup u$ (respectively,
$u_n \overset{*}{\rightharpoonup} u$) as $n \to \infty$.
$F$ is said to be weak sequentially continuous (respectively, weak*
sequentially continuous) at the point $u \in M$ if and only if $F(u) = \lim_{n \to \infty} F(u_n)$
holds for all sequences $(u_n)$ in $M$ such that $u_n \rightharpoonup u$ (respectively, $u_n \overset{*}{\rightharpoonup} u$) as
$n \to \infty$.
In weak* sequential lower semicontinuity, we naturally have to assume
that $X = Y^*$ holds, where $Y$ is a B-space. $F$ is said to be sequentially lower
semicontinuous on $M$ when $F$ is sequentially lower semicontinuous for all
$u \in M$. We proceed analogously for the other concepts in Definition 38.4. In
connection with (5) we recall the known definition of $\liminf F(u_n)$. The
number $h$, where $-\infty \le h \le \infty$, is called a limiting value of the sequence
$(F(u_n))$ if and only if there is a subsequence which converges to $h$. Then
$\liminf F(u_n)$ is the smallest limiting value of $(F(u_n))$, which always exists in
$[-\infty, \infty]$.
Together with Definition 38.4, we consider a parallel definition. It is
based on the properties of the set
$$M_r \overset{\text{def}}{=} \{u \in M : F(u) \le r\}.$$
Definition 38.5. Let $F: M \subseteq X \to [-\infty, \infty]$ be given.
If $X$ is a linear space, then the functional $F$ is said to be quasiconvex if
and only if $M_r$ is convex for all $r \in \mathbb{R}$.
If $X$ is a topological space, then $F$ is said to be lower semicontinuous if and
only if $M_r$ is closed relative to $M$ for all $r \in \mathbb{R}$.
$F$ is said to be lower semicompact if and only if $M_r$ is compact for all
$r \in \mathbb{R}$.
$F$ is said to be upper semicontinuous (respectively, quasiconcave) if and
only if $-F$ is lower semicontinuous (respectively, quasiconvex).
For a closed set $M$, by $A_1$(9), $M_r$ is closed relative to $M$ if and only if $M_r$ is
closed. In a B-space $X$, the closedness of $M_r$ relative to $M$ means that if $(u_n)$
is a sequence in $M_r$, then from $u_n \to u$ as $n \to \infty$ and $u \in M$, it always
follows that $u \in M_r$.
Example 38.6. For $F: M \subseteq X \to [-\infty, \infty]$:
(1) $F$ and $M$ are convex implies $F$ is quasiconvex.
(2) $F$ is continuous if and only if $F$ is both lower and upper semicontinuous.
The proof is obtained almost directly from the corresponding definitions
(cf. Problem 38.1). We recall the definition of convexity in Section 42.1 as
well as in Section 47.1 for functionals $F: M \to [-\infty, \infty]$ with infinite values.
The continuity of $F: M \to [-\infty, \infty]$ is explained in the usual way by the
situation that for each $u \in M$ and each neighborhood $U(F(u))$, there exists
a neighborhood $V(u)$ such that $F(V(u)) \subseteq U(F(u))$. Here, $U(-\infty)$
[respectively, $U(+\infty)$] is a set that contains $[-\infty, a]$ (respectively, $[a, +\infty]$) for a
fixed $a \in \mathbb{R}$.
In the following proposition we investigate the connection between the
following assertions in B-spaces:
(i) F is lower semicontinuous on M.
(ii) F is sequentially lower semicontinuous on M.
(iii) F is weak sequentially lower semicontinuous on M.
Proposition 38.7. For $F: M \subseteq X \to [-\infty, \infty]$ on the B-space $X$:
(1) Assertions (i) and (ii) are equivalent.
(2) If $M$ is closed and convex and $F$ is convex, then (i), (ii), and (iii) are
mutually equivalent.
(3) Let $u \in M$ and $F(u) \ne \pm\infty$. Then $F$ is sequentially lower semicontinuous
at $u$ if and only if, for each $\varepsilon > 0$, there exists a $\delta(\varepsilon) > 0$ such that
$$\|v - u\| < \delta(\varepsilon) \quad \text{implies} \quad F(u) < F(v) + \varepsilon \tag{6}$$
holds for all $v \in M$.
We treat the simple proofs in Problem 38.3. In particular, assertion (2)
shows that especially propitious relations occur for convex functionals. In
(3), sequential lower semicontinuity can be replaced by lower semicontinuity.
If $F$ is continuous at $u$, where $F(u) \ne \pm\infty$, then
$$\|v - u\| < \delta(\varepsilon) \quad \text{implies} \quad -\varepsilon < F(u) - F(v) < \varepsilon.$$
A comparison with (6) motivates the designation "lower semicontinuity."
38.3. Main Theorem for Extremal Problems
We study the minimum problem
$$\min_{u \in M} F(u) = \alpha. \tag{7}$$
The corresponding maximum problem can be reduced to (7) by passing to
$-F$. We are interested in existence propositions.
Theorem 38.A. For the functional $F: M \subseteq X \to [-\infty, \infty]$ with $M \ne \emptyset$, (7) has
a solution in case the following hold:
(i) $X$ is a real reflexive B-space.
(ii) $M$ is bounded and weak sequentially closed, i.e., by definition, for each
sequence $(u_n)$ in $M$ such that $u_n \rightharpoonup u$ as $n \to \infty$, we always have $u \in M$.
(iii) $F$ is weak sequentially lower semicontinuous on $M$.
Corollary 38.8. With the assumption (i), the condition (ii) holds when one of
the following two conditions holds:
(ii') $M$ is bounded, closed, and convex.
(ii'') For fixed $r > 0$, we set $M = \{u \in X : G(u) = r\}$. Here, the functional $G$:
$X \to \mathbb{R}$ is weak sequentially continuous and
$$\lim_{\|u\| \to \infty} G(u) = +\infty.$$
This corollary describes two situations which are important for
applications.
Corollary 38.9. The functional $F: M \subseteq X \to [-\infty, \infty]$, $M \ne \emptyset$, has a
minimum and a maximum on $M$ when (i) and (ii) hold and $F$ is weak sequentially
continuous on $M$.
Proof. Let $\alpha \overset{\text{def}}{=} \inf_{u \in M} F(u)$. We choose a sequence $(u_n)$ in $M$ such that
$F(u_n) \to \alpha$. Since $M$ is bounded and $X$ is reflexive, by Proposition 38.2, (2),
there exists a weakly convergent subsequence $(u_{n'})$ such that $u_{n'} \rightharpoonup u$. From (ii)
it follows that $u \in M$; therefore,
$$F(u) \le \liminf_{n' \to \infty} F(u_{n'}) = \alpha$$
according to (iii). Since $\alpha \le F(u)$, we have $F(u) = \alpha$, i.e., $u$ is a solution of
(7). This proves Theorem 38.A.
Corollary 38.8, (ii') is identical to Proposition 38.2, (1). We now prove
Corollary 38.8, (ii''). Since $G(u) \to +\infty$ as $\|u\| \to \infty$, $M$ is bounded.
Furthermore, it immediately follows from $G(u_n) = r$ for all $n \in \mathbb{N}$ and $u_n \rightharpoonup u$
that $G(u) = r$.
Corollary 38.9 follows by applying Theorem 38.A to $-F$ and taking
Example 38.6 into account. □
In Problem 38.4 we will show that Theorem 38.A is a special case of the
following more general result, which is frequently designated as the
generalized Weierstrass theorem.
Theorem 38.B (Main Theorem). Let $X$ be a topological space. For the
functional $F: M \subseteq X \to [-\infty, \infty]$, $M \ne \emptyset$, the minimum problem (7) has a
solution in case one of the following two conditions holds:
(i) $F$ is lower semicompact.
(ii) $F$ is lower semicontinuous on the compact set $M$.
Corollary 38.10. The functional $F: M \to [-\infty, \infty]$, $M \ne \emptyset$, has a minimum
and a maximum on $M$ when $F$ is continuous on the compact set $M$.
Proof. (i) By assumption, $M_r \overset{\text{def}}{=} \{u \in M: F(u) \le r\}$ is compact for all
$r \in \mathbb{R}$. Let $\alpha \overset{\text{def}}{=} \inf_{u \in M} F(u)$. For $\alpha = +\infty$, the assertion is trivial because
$F \equiv +\infty$. Therefore, let $\alpha < r_0 < \infty$ for a fixed $r_0$. The set $M_{r_0}$ is compact.
Since the intersection of a finite number of $M_r$'s with $\alpha < r \le r_0$ is always
nonempty, it follows from $A_1$(11g) (finite intersection property) that there is
a $u_0$ such that
$u_0 \in \bigcap_{\alpha < r \le r_0} M_r.$
Obviously, $F(u_0) = \alpha$, i.e., $u_0$ is a solution of (7).
(ii) This is a special case of (i).
Corollary 38.10 follows from (ii) upon applying it to $-F$ and taking
Example 38.6 into account. □
38.4. Strict Convexity and Uniqueness
The following uniqueness criterion is used very frequently.
Theorem 38.C. The functional $F: M \subseteq X \to \mathbb{R}$ has at most one minimum on
M in case the following hold:
(i) M is a convex subset of the linear space X.
(ii) F is strictly convex, i.e.,
$F((1 - t)u + tv) < (1 - t)F(u) + tF(v)$ (8)
holds for all $u, v \in M$, $u \neq v$, and all $t \in\, ]0,1[$.
[Figure 38.2, panels (a), (b), (c)]
We give another general uniqueness principle in Theorem 39.B in Section
39.2.
Proof. By (8) with $t = 1/2$, we arrive at a contradiction for $F(u) = F(v) = \min_{w \in M} F(w)$
and $u \neq v$. □
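Written out, the argument is a one-line estimate. The following worked step restates the proof; $\alpha$ denotes the minimal value of F on M and $w$ the midpoint of two assumed minimizers (both symbols are introduced here only for illustration):

```latex
% Assume u \neq v are both minimizers, F(u) = F(v) = \alpha, and set w = (u+v)/2.
% M is convex, so w \in M; strict convexity (8) with t = 1/2 gives
F(w) = F\bigl(\tfrac12 u + \tfrac12 v\bigr)
     < \tfrac12 F(u) + \tfrac12 F(v)
     = \alpha = \min_{z \in M} F(z),
% which is impossible, since no value of F on M lies below the minimum.
```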
Example 38.11. The real function $F: [a,b] \to \mathbb{R}$ with
$-\infty < a < b < \infty$ in Fig. 38.2(a) is sequentially lower semicontinuous at every
$u \in [a,b]$, for it follows from $u_n \to u$ that
$F(u) \le \liminf_{n \to \infty} F(u_n).$
Since strong and weak convergence coincide on $\mathbb{R}$, F is also weak
sequentially lower semicontinuous. Furthermore, according to Proposition 38.7,
(1), F is lower semicontinuous.
The existence of a minimum of F on [a, b] in Fig. 38.2(a) is a special case
of Theorems 38.A and 38.B in Section 38.3. In Fig. 38.2(b), F is strictly
convex, i.e., by (8), the interior points of the chord lie properly above the
curve belonging to F. In Fig. 38.2(c), F is convex, i.e., (8) holds with "$\le$"
instead of "<". Figure 38.2(c) shows that convexity does not suffice to
assure the uniqueness of the minimum.
38.5. Variants of the Main Theorem
As a preview of later applications, we first present a summary of a number
of criteria for a minimum. After this, we explain two important tricks for
the treatment of minimum problems in unbounded and nonconvex sets.
Proposition 38.12. The functional $F: M \subseteq X \to [-\infty, \infty]$, $M \neq \emptyset$, has a
minimum on M when one of the following six conditions is fulfilled:
(a) X is a topological space and F is lower semicompact.
(a*) X is a topological space, M is compact, and F is lower semicontinuous.
(b) $X = \mathbb{R}^N$, $N \ge 1$, M is closed and bounded, and F is lower semicontinuous.
(c) X is a reflexive B-space, M is closed, bounded, and convex, F is lower
semicontinuous and convex or, more generally, lower semicontinuous and
quasiconvex.
(d) X is a B-space, F is weak sequentially lower semicontinuous on M, and
M is weak sequentially compact, i.e., by definition:
Each sequence in M possesses a weakly convergent subsequence
with limit value in M. (9)
For example, in a reflexive B-space X, every closed, bounded, and
convex set M is also weak sequentially compact.
(d*) $X = Y^*$, Y is a B-space, F is weak* sequentially lower semicontinuous on
M, and M is weak* sequentially compact, i.e., by definition, (9) holds,
where "weak" is replaced by "weak*."
For example, the ball $M = \{v \in Y^*: \|v\| \le R\}$ is weak* sequentially
compact in $Y^*$ when Y is separable.
Corollary 38.13. The functional $F: M \subseteq X \to [-\infty, \infty]$ possesses a maximum
and a minimum on M when in Proposition 38.12, (a*), (b), (d), (d*), we
replace "lower semicontinuous" (respectively, "sequentially lower
semicontinuous") with "continuous" (respectively, "sequentially continuous").
Proof. (a) and (a*) correspond to Theorem 38.B in Section 38.3, and (b) is
a special case of (a*). Furthermore, (c) is a special case of (a). The set $M_r$,
$r \in \mathbb{R}$, is closed, bounded, and convex (see Definition 38.5). If X is equipped
with the weak topology, then $M_r$ is weakly compact; therefore, F is weakly
lower semicompact.
(d) and (d*) are proved analogously to Theorem 38.A in Section 38.3.
Corollary 38.13 is obtained from Proposition 38.12 by passing from F
to $-F$. □
Trick for Unbounded Sets. The boundedness of M plays an important role
in Proposition 38.12. We now explain a frequently used trick which reduces
the minimum problem (7) on the unbounded set M of the B-space X to an
equivalent minimum problem
$\min_{u \in M \cap \bar U(u_0, R)} F(u) = \alpha,$ (10)
where $\bar U(u_0, R) \overset{\text{def}}{=} \{u \in X: \|u - u_0\| \le R\}$, i.e., $M \cap \bar U(u_0, R)$ is bounded.
Corollary 38.14. For the functional $F: M \subseteq X \to [-\infty, \infty]$, where $u_0 \in M$,
the minimum problem over M is equivalent to (10) when
$F(u) \to +\infty$ as $\|u\| \to \infty$, $u \in M$ (11)
and R is chosen sufficiently large.
Proof. Let $F \not\equiv +\infty$. By (11) there exists an R > 0 such that $F(u) > F(u_0)$
holds for all $u \in M$ with $\|u - u_0\| > R$. □
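The reduction to a bounded set can be tried out numerically. The following is a minimal sketch, assuming the hypothetical example $M = X = \mathbb{R}$, $F(u) = u^4 - 3u$, $u_0 = 0$ and $R = 2$ (none of these appear in the text); for this F one checks by hand that $F(u) > F(u_0) = 0$ whenever $|u| > 2$. A grid search stands in for the exact minimization.

```python
def F(u: float) -> float:
    """Hypothetical coercive functional on M = X = R: F(u) -> +inf as |u| -> inf."""
    return u ** 4 - 3.0 * u


def grid_min(lo: float, hi: float, step: float = 1e-4):
    """Return (minimizer, minimum) of F over the grid lo, lo + step, ..., hi."""
    n = int(round((hi - lo) / step))
    points = (lo + k * step for k in range(n + 1))
    best = min(points, key=F)
    return best, F(best)


u0, R = 0.0, 2.0  # assumed choice: F(u) > F(u0) = 0 whenever |u - u0| > R

# Problem (10): minimize over the bounded set M ∩ closed ball(u0, R).
u_ball, val_ball = grid_min(u0 - R, u0 + R)

# Stand-in for the problem on all of M: a much larger search interval.
u_wide, val_wide = grid_min(-10.0, 10.0)

print(u_ball, val_ball)
print(u_wide, val_wide)
```

Both searches return the same minimizer up to grid resolution, which is what the corollary predicts once R is large enough.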
From Proposition 38.12, we thus obtain the following frequently used
existence proposition as a prototype.
Proposition 38.15. A functional $F: M \subseteq X \to [-\infty, \infty]$ on the convex, closed,
and nonempty subset M of the real reflexive B-space X satisfying (11) possesses
a minimum when, in addition, one of the following two conditions holds:
(a) F is convex and continuous or, more generally, convex and lower
semicontinuous (respectively, quasiconvex and lower semicontinuous).
(b) F is weak sequentially lower semicontinuous.
In case (a), the set of minimal points is closed, convex, and bounded.
In particular, we can choose M = X.
Proof. The solution set is equal to $L = \{u \in M: F(u) \le \alpha\}$, where $\alpha$ is the
minimal value. The boundedness of L follows from (11). Furthermore, L is
closed and convex because of the lower semicontinuity and convexity (or the
quasiconvexity) of F. □
Trick for Nonconvex Sets. The convexity of M also plays an important role
in Proposition 38.12. In the proof of Theorem 43.B in Section 43.4 we shall
use a simple trick for nonconvex sets. In place of the minimum problem on
M, one considers the corresponding problem on the closed convex hull of
M, $\overline{\mathrm{co}}\,M$, and shows that the minimum over $\overline{\mathrm{co}}\,M$ is in fact taken on M. In
Theorem 43.B, M is, for instance, the boundary of a ball and thus $\overline{\mathrm{co}}\,M$ is
the closed ball. In addition, we have already used this argument in the proof
of Theorem 22.E in Section 22.5.
38.6. Application to Quadratic Variational Problems
We consider the minimum problem
$\min_{u \in M} \left( 2^{-1} a(u,u) - b(u) \right) = \alpha.$ (12)
Example 38.16. Let M be a closed convex nonempty set of the real reflexive
B-space X. Let $a: X \times X \to \mathbb{R}$ be bilinear and bounded. Furthermore, let
$b \in X^*$. We set
$F(u) \overset{\text{def}}{=} 2^{-1} a(u,u) - b(u).$
Then:
(1) If a is positive (respectively, strictly positive), then F is convex
(respectively, strictly convex) and continuous on M; therefore, it is lower
semicontinuous and, by Proposition 38.7, also weak sequentially lower
semicontinuous on M.
(2) If a is strongly positive, then $F(u) \to +\infty$ and $F(u)/\|u\| \to +\infty$ as
$\|u\| \to \infty$.
(3) If the bilinear form a is compact, then F is weak sequentially
continuous.
The properties of the bilinear form $a(\cdot,\cdot)$ used here were defined in
Section 21.5. If $A: X \to X^*$ is a continuous linear operator and if we set
$a(u,v) \overset{\text{def}}{=} \langle Au, v \rangle$ for all $u, v \in X$, then, by Section 21.5, a is positive,
strictly positive, strongly positive, or compact, respectively, if and only if A has
the corresponding property. If X is an H-space with the inner product $(\cdot\,|\,\cdot)$,
then, by Section 21.4, we can set $X = X^*$ and $\langle w, v \rangle = (w\,|\,v)$ for all
$w, v \in X$.
Proof. (1) Let $\varphi(t) \overset{\text{def}}{=} F(u + t(v - u))$ for all $t \in \mathbb{R}$ and fixed $u, v \in M$.
Then $\varphi$ is a quadratic polynomial such that the coefficient of $t^2$ is
$2^{-1} a(v - u, v - u)$. If a is strictly positive, then $a(v - u, v - u) > 0$ for
$u \neq v$. Figure 38.3 yields
$\varphi(t) < \varphi(0) + t(\varphi(1) - \varphi(0))$ for all $t \in\, ]0,1[$. (13)
This is precisely the strict convexity (8) of F. If a is positive, then (13) holds
with "$\le$" in place of "<," i.e., F is convex.
(2) One takes into account that $a(u,u) \ge c\|u\|^2$ and $|b(u)| \le \|b\|\,\|u\|$ for
all $u \in X$ and fixed c > 0.
(3) $u_n \rightharpoonup u$ implies that $a(u_n, u_n) \to a(u,u)$ and $b(u_n) \to b(u)$; therefore,
$F(u_n) \to F(u)$. □
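For readers who want to check (13) without appealing to the figure, the polynomial can be expanded directly. The sketch below uses only the bilinearity of a and the linearity of b; the abbreviation $w = v - u$ is introduced here for convenience. The last line records the estimate behind part (2), with c the constant of strong positivity.

```latex
% w := v - u
\varphi(t) = F(u) + t\,\bigl[\,2^{-1}\bigl(a(u,w) + a(w,u)\bigr) - b(w)\bigr] + t^2\, 2^{-1} a(w,w),
\\
\varphi(0) + t\bigl(\varphi(1) - \varphi(0)\bigr) - \varphi(t) = 2^{-1}\, t(1-t)\, a(w,w)
\quad \text{for all } t \in \mathbb{R},
\\
F(u) \ge 2^{-1} c\,\|u\|^2 - \|b\|\,\|u\| \quad \text{for all } u \in X .
```

The right-hand side of the second line is positive for $t \in\, ]0,1[$ when $a(w,w) > 0$ and nonnegative when $a(w,w) \ge 0$, which gives (13) and its non-strict version.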
Thus, from Propositions 38.12, (c) and 38.15 and Theorem 38.B, it
immediately follows that the following existence and uniqueness proposition
holds.
[Figure 38.3]
Proposition 38.17. With the assumptions of Example 38.16, (12) has a
solution when one of the following two additional conditions is fulfilled:
(i) a is positive and M is bounded,
(ii) a is strongly positive.
If a is strictly positive, then (12) has at most one solution.
We treat applications in Chapter 46.
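A finite-dimensional instance may help to fix ideas. The following is a minimal sketch, assuming $X = M = \mathbb{R}^2$, $a(u,v) = u^T A v$ with a hypothetical symmetric positive definite matrix A, and $b(u) = b^T u$. In this symmetric case the minimizer solves the linear system $Au = b$; that characterization is a standard fact and is not derived in this section.

```python
import random

# Hypothetical data: A is symmetric positive definite, so a is strongly positive.
A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]


def a_form(u, v):
    """Bilinear form a(u, v) = u^T A v."""
    return sum(u[i] * A[i][j] * v[j] for i in range(2) for j in range(2))


def F(u):
    """Quadratic functional F(u) = a(u, u) / 2 - b(u) from (12)."""
    return 0.5 * a_form(u, u) - (b[0] * u[0] + b[1] * u[1])


def solve_2x2(mat, rhs):
    """Solve mat * x = rhs by Cramer's rule (2 x 2 case only)."""
    det = mat[0][0] * mat[1][1] - mat[0][1] * mat[1][0]
    return [
        (mat[1][1] * rhs[0] - mat[0][1] * rhs[1]) / det,
        (mat[0][0] * rhs[1] - mat[1][0] * rhs[0]) / det,
    ]


u_star = solve_2x2(A, b)  # candidate minimizer: A u = b

# Compare with randomly perturbed points; F should be strictly larger there.
rng = random.Random(0)
perturbed = [
    [u_star[0] + rng.uniform(-1.0, 1.0), u_star[1] + rng.uniform(-1.0, 1.0)]
    for _ in range(1000)
]
print(u_star, F(u_star), min(F(p) for p in perturbed))
```

Uniqueness of the minimizer reflects the strict positivity of a, in line with the last sentence of Proposition 38.17.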
38.7. Application to Linear Optimization and the
Role of Extreme Points
Definition 38.18. Let M be a subset of the linear space X. Then u in M is
called an extreme point of M if and only if u is not an interior point of a
segment whose end points belong to M, i.e.,
$u = t u_1 + (1 - t) u_2$, $u_1, u_2 \in M$, $u_1 \neq u_2$, $0 < t < 1$ (14)
does not hold.
Example 38.19. In Fig. 38.4, with $X = \mathbb{R}^2$, the vertices of M are precisely
the extreme points.
At the same time, Fig. 38.4 depicts the fundamental Krein-Milman
theorem: In a real locally convex space, a compact convex set M is the
closed convex hull of its extreme points. A proof of this theorem can be
found, for example, in Holmes (1975, M), page 74.
We apply this result to the linear optimization problem
$\min_{u \in M} F(u) = \alpha.$ (15)
Theorem 38.D (Main Theorem of Linear Optimization). (15) has a solution
u, where u is an extreme point of M, provided one of the following two
conditions is satisfied:
(1) $F: M \subseteq X \to \mathbb{R}$, $M \neq \emptyset$, is a continuous linear functional on the compact
convex set M of the real locally convex space X.
(2) $F: M \subseteq X \to \mathbb{R}$, $M \neq \emptyset$, is a continuous linear functional on the closed
bounded convex set M of the real reflexive B-space X.
Proof. (1) The existence assertion follows from Theorem 38.B in Section
38.3. Let A be the set of solutions of (15). A is convex because F is linear. A
is closed because, from $F(u_\beta) = \alpha$ for all $\beta$ and the MS sequential
convergence $u_\beta \to u$, we obtain $F(u) = \alpha$ due to the continuity of F (cf.
$A_1$(17e)). As a closed subset of M, A is compact. By the Krein-Milman
theorem, A has an extreme point $u \in A$. We shall show that u is an extreme
point of M, too. If, on the contrary, (14) holds, then we have:
$\alpha = F(u) = tF(u_1) + (1 - t)F(u_2) \ge t\alpha + (1 - t)\alpha = \alpha,$
i.e., $F(u_1) = F(u_2) = \alpha$; therefore, $u_1, u_2 \in A$ and (14) holds. Consequently,
u is not an extreme point of A; but this is a contradiction.
(2) If we equip X with the weak topology, then M is weakly compact
(cf. $A_1$(42), $A_1$(44)). Therefore, (2) is a special case of (1). □
[Figure 38.4]
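In the plane the theorem says that a linear functional on a convex polygon attains its minimum at a vertex. The following is a minimal sketch, assuming a hypothetical polygon M (given by its vertices) and a hypothetical cost vector c; points of M are sampled as random convex combinations of the vertices.

```python
import random

# Hypothetical data: vertices of a convex polygon M in R^2 and a cost vector c.
vertices = [(0.0, 0.0), (4.0, 0.0), (5.0, 3.0), (2.0, 5.0), (-1.0, 2.0)]
c = (1.0, -2.0)


def F(u):
    """Linear functional F(u) = <c, u>."""
    return c[0] * u[0] + c[1] * u[1]


# By Theorem 38.D it suffices to compare the values at the extreme points.
best_vertex = min(vertices, key=F)
alpha = F(best_vertex)


def random_point_in_M(rng):
    """Random convex combination of the vertices, hence a point of M."""
    w = [rng.random() for _ in vertices]
    s = sum(w)
    x = sum(wi * v[0] for wi, v in zip(w, vertices)) / s
    y = sum(wi * v[1] for wi, v in zip(w, vertices)) / s
    return (x, y)


rng = random.Random(0)
samples = [random_point_in_M(rng) for _ in range(2000)]
print(best_vertex, alpha, min(F(p) for p in samples))
```

No sampled point of M undercuts the best vertex value, which is the content of the theorem in this small case.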
38.8. Quasisolutions of Minimum Problems
In the immediately following sections, we consider several general existence
principles which play an important role in numerous current investigations.
Our point of departure is the minimum problem
$F(u) = \min!, \quad u \in X.$ (16)
If we cannot apply the generalized Weierstrass theorem from Section 38.3
or 38.5, then the natural question arises: Can we at least prove the existence
of approximate solutions or of quasisolutions? To this end, parallel to (16),
we consider the regularized problem
$F(v) + \varepsilon d(u,v) = \min!, \quad v \in X.$ (17)
Here, d is a metric. Obviously, because d(u, u) = 0, each solution u of (16) is
also a solution of (17) with the same minimal value. This motivates the
following definition.
Definition 38.20. We call each solution u of (17) an $\varepsilon$-quasisolution of (16).
The following theorem guarantees the existence of quasisolutions. Our
assumptions are:
(H1) X is a complete metric space with the metric d (e.g., X is a B-space or a
closed set in a B-space with $d(u,v) = \|u - v\|$).
(H2) The functional $F: X \to\, ]-\infty, \infty]$ is lower semicontinuous, bounded
below, and $F \not\equiv +\infty$.
Theorem 38.E (Ekeland (1974)). If (H1) and (H2) hold, then for each $\varepsilon > 0$,
the minimum problem (16) has an $\varepsilon$-quasisolution.
Corollary 38.21. Let X be a B-space and suppose that the functional $F: X \to \mathbb{R}$
is lower semicontinuous, G-differentiable, and bounded below. Then for each
$\varepsilon > 0$ there exists a $u \in X$ such that the following is valid:
$F(u) \le \inf_{v \in X} F(v) + \varepsilon,$ (18)
$\|F'(u)\| \le \varepsilon.$ (19)
We introduced the concepts of G-differentiability and F-differentiability
in Chapter 4. We recall these definitions again in Section 40.1. Corollary
38.21 is especially suggestive, for, as we shall see in Theorem 40.B, F'(u) = 0
is a necessary condition for the existence of a solution u of (16). According
to (18) and (19), this condition can now always be fulfilled at least
approximately. The proofs of the two assertions just given follow below as
special cases of the following general result.
Proposition 38.22. We assume that (H1) and (H2) are satisfied. For given
positive numbers $\varepsilon, \lambda$, we choose a $u_0 \in X$ such that
$F(u_0) \le \inf_{v \in X} F(v) + \varepsilon.$
Then there exists a $u \in X$ with the following three properties:
$F(u) \le F(u_0),$ (20)
$d(u, u_0) \le \lambda,$ (21)
$F(v) > F(u) - \varepsilon \lambda^{-1} d(u,v)$ for all $v \in X$, $v \neq u$. (22)
One frequently chooses $\lambda = 1$ or $\lambda = \sqrt{\varepsilon}$. Theorem 38.E is obviously a special
case of (22) with $\lambda = 1$. In the next two sections we treat, as an application, a
general existence principle for minimum problems (Theorem 38.F) and a
fixed point theorem. Moreover, in Section 38.11 we consider a
generalization of Theorem 38.E (abstract entropy principle). Additional important
applications can be found in Ekeland (1979, S): fixed point theorems,
Kuhn-Tucker theory, and the Pontrjagin maximum principle under weak
smoothness assumptions, geodesic curves, geometry of B-spaces, and
nonexpansive semigroups. In addition, we also recommend a calculus for
generalized derivatives of locally Lipschitz-continuous functions, which one can
find in Clarke (1981, S), (1984, M) and Rockafellar (1981, L).
Proof of Proposition 38.22. It suffices to assume that $\lambda = 1$ since we can
pass from d to $d/\lambda$. We inductively define a sequence $(u_n)$ for n = 0, 1, ....
If we know $u_n \in X$, then we construct $u_{n+1}$ in the following way:
Case 1: $F(v) > F(u_n) - \varepsilon d(u_n, v)$ for all $v \in X$, $v \neq u_n$. Then let $u_{n+1} = u_n$.
Case 2: $F(v) \le F(u_n) - \varepsilon d(u_n, v)$ for a $v \in X$, $v \neq u_n$. Let $S_n$ be the set of all these
points v and let $\alpha_n \overset{\text{def}}{=} \inf_{S_n} F$. We then choose a $u_{n+1} \in S_n$ with
$F(u_{n+1}) - \alpha_n \le 2^{-1}[F(u_n) - \alpha_n].$ (23)
Our construction is so constituted that all the $F(u_n)$ form a monotone
decreasing sequence, which by (H2) is bounded below and hence
convergent. We shall show that $(u_n)$ also converges. By construction,
$\varepsilon d(u_n, u_{n+1}) \le F(u_n) - F(u_{n+1})$
holds for all n. Addition yields
$\varepsilon d(u_n, u_m) \le F(u_n) - F(u_m)$ for all $m \ge n$. (24)
Therefore $(u_n)$ is a Cauchy sequence and thus a convergent sequence. Let
$u_n \to u$ as $n \to \infty$. From the lower semicontinuity of F, it follows that
$F(u) \le \lim_{n \to \infty} F(u_n).$ (25)
We shall show that u has all the desired properties.
Proof of (20). From (25) and $F(u_n) \le F(u_0)$ for all n, it follows that
$F(u) \le F(u_0)$.
Proof of (21). For n = 0 and $m \to \infty$ in (24), we obtain
$\varepsilon d(u_0, u) \le F(u_0) - \inf_X F \le \varepsilon.$
Proof of (22). On the contrary, suppose that (22) is false. Then there exists
a v, $v \neq u$, such that
$F(v) \le F(u) - \varepsilon d(u,v).$ (26)
As $m \to \infty$, from (24) and (25) we obtain
$F(u) \le F(u_n) - \varepsilon d(u_n, u).$
The triangle inequality yields
$F(v) \le F(u_n) - \varepsilon d(u_n, v).$
Thus $v \in S_n$ for all n. Hence, from (23) it follows that
$2F(u_{n+1}) - F(u_n) \le \alpha_n \le F(v).$
Let $F(u_n) \to \beta$ as $n \to \infty$. Then $\beta \le F(v)$. From (25) it follows that $F(u) \le \beta$.
Thus, $F(u) \le F(v)$. This contradicts (26). □
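The iteration used in the proof can be run on a finite metric space, where every infimum is attained and the sequence becomes stationary after finitely many steps. The following is a minimal sketch under these assumptions: X is a hypothetical grid in [-3, 3] with the usual distance, F is a hypothetical double-well function, $\varepsilon = 0.5$, and $\lambda = 1$. Choosing the minimizer of F over $S_n$ is one admissible way to satisfy (23), not the only one.

```python
def ekeland_point(X, F, d, eps, u0):
    """Finite-set sketch of the iteration in the proof (case lambda = 1)."""
    u = u0
    while True:
        # S_n: all v != u_n with F(v) <= F(u_n) - eps * d(u_n, v)  (Case 2).
        S = [v for v in X if v != u and F(v) <= F(u) - eps * d(u, v)]
        if not S:
            return u  # Case 1: property (22) holds at u
        # The minimizer over S_n attains alpha_n, so (23) holds trivially.
        u = min(S, key=F)


# Hypothetical data for the illustration.
X = [k / 100.0 for k in range(-300, 301)]


def F(u):
    return (u * u - 1.0) ** 2 + 0.3 * u


def d(u, v):
    return abs(u - v)


eps = 0.5
F_min = min(F(v) for v in X)
u0 = next(v for v in X if F(v) <= F_min + eps)  # F(u0) <= inf F + eps

u = ekeland_point(X, F, d, eps, u0)
print(u0, u, F(u0), F(u))
```

The loop terminates because F strictly decreases along the sequence and X is finite; in the general complete metric space the same role is played by the Cauchy argument around (24).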
Proof of Corollary 38.21. We set $d(u,v) = \|u - v\|$. By (22) with $\lambda = 1$, there exists
a $u \in X$ such that
$F(v) \ge F(u) - \varepsilon \|u - v\|$ for all $v \in X$.
We choose $v = u + tw$ with t > 0. Then
$t^{-1}(F(u + tw) - F(u)) \ge -\varepsilon \|w\|.$
As $t \to +0$, we obtain $\langle F'(u), w \rangle \ge -\varepsilon \|w\|$. That is, with $z = -w$, $\langle F'(u), z \rangle \le \varepsilon \|z\|$ for all
$z \in X$. Thus $\|F'(u)\| \le \varepsilon$. Finally, (18) follows from (20) and the choice of $u_0$. □
38.9. Application to a Fixed-Point Theorem
Concerning the following fixed-point theorem, it is remarkable that the
operator T need not be continuous. The following generalized contraction
condition is important:
$d(u, Tu) \le S(u) - S(Tu)$ for all $u \in X$. (27)
Proposition 38.23 (Caristi (1976)). Let $T: X \to X$ be a mapping of the
complete metric space X into itself for which (27) holds. Here, let $S: X \to \mathbb{R}$ be
lower semicontinuous and bounded below. Then T has a fixed point.
Proof. If we apply (22) with $\varepsilon = 1/2$ and $\lambda = 1$ to S, then we obtain a $u \in X$ with
$S(v) \ge S(u) - 2^{-1} d(u,v)$ for all $v \in X$.
For v = Tu, it follows that
$S(Tu) \ge S(u) - 2^{-1} d(u, Tu).$
From (27) it then follows that $d(u, Tu) = 0$, i.e., Tu = u. □
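As an illustration of how (27) arises, a standard observation (not taken from this section) is that every k-contraction satisfies (27) for a suitable S, so that Proposition 38.23 contains the existence part of the Banach fixed-point theorem. The constant k and the particular choice of S below are assumptions made for the example:

```latex
% Assume d(Tu, Tv) \le k\, d(u,v) for all u, v \in X with fixed 0 \le k < 1, and set
S(u) := (1-k)^{-1}\, d(u, Tu).
% Then S is continuous, bounded below by 0, and
S(u) - S(Tu) = (1-k)^{-1}\bigl[d(u,Tu) - d(Tu, T^2u)\bigr]
            \ge (1-k)^{-1}\bigl[d(u,Tu) - k\, d(u,Tu)\bigr]
            = d(u, Tu),
% which is exactly condition (27).
```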
38.10. The Palais-Smale Condition and a General
Minimum Principle
Together with the minimum problem
$F(u) = \min!, \quad u \in X,$ (28)
we consider the operator equation
$F'(u) = 0.$ (29)
The following existence assertion is based on an important compactness
property of functionals, which we shall first define.
Definition 38.24. Let the functional $F: X \to \mathbb{R}$ be G-differentiable on the
B-space X. Then F satisfies the Palais-Smale condition (PS) if and only if
the following holds:
If $(u_n)$ is a sequence in X with the two properties
(i) $(F(u_n))$ is bounded,
(ii) $\|F'(u_n)\| \to 0$ as $n \to \infty$,
then $(u_n)$ has a convergent subsequence.
Theorem 38.F. Let $F: X \to \mathbb{R}$ be an F-differentiable functional on the B-space
X which is bounded below and satisfies (PS).
Then the minimum problem (28) has a solution u which also satisfies the
operator equation (29).
Instead of F-differentiability, it suffices to require that F is lower
semicontinuous and G-differentiable.
Proof. Let $\alpha \overset{\text{def}}{=} \inf_X F$. According to Corollary 38.21, for $\varepsilon = 1/n$, there
exists a sequence $(u_n)$ with $F(u_n) \to \alpha$ and $\|F'(u_n)\| \to 0$ as $n \to \infty$. The
functional F is continuous (Proposition 4.8). Due to (PS), there exists a
subsequence $(u_{n'})$ with $u_{n'} \to u$ as $n' \to \infty$. Consequently, $F(u) = \alpha$. Thus, u
is a solution of (28). By Theorem 40.B in Section 40.2, u is then a solution of
(29) as well. □
The prototype for an F-differentiable functional F with (PS) is a function
$F: \mathbb{R}^N \to \mathbb{R}$ with continuous first partial derivatives and the weak
coerciveness condition $F(u) \to +\infty$ as $\|u\| \to \infty$. From the boundedness of $(F(u_n))$
it then follows that $(u_n)$ is bounded and hence that there exists a convergent
subsequence.
On the other hand, the function $F: \mathbb{R} \to \mathbb{R}$ with $F(u) = \cos u$ does not
satisfy (PS). Consider $(u_n)$ with $u_n = n\pi$.
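The counterexample can be checked numerically. The following is a small sketch that evaluates $F(u) = \cos u$ and $F'(u) = -\sin u$ along $u_n = n\pi$ for the first 50 indices (the cutoff 50 is an arbitrary choice for the illustration): the values stay bounded, the derivatives vanish up to rounding, yet any two members of the sequence are at least $\pi$ apart, so no subsequence can converge.

```python
import math

N = 50  # arbitrary number of sequence members to inspect
u = [n * math.pi for n in range(1, N + 1)]

values = [math.cos(x) for x in u]          # F(u_n) = (-1)^n, hence bounded
grads = [abs(-math.sin(x)) for x in u]     # |F'(u_n)|, zero up to rounding

# Smallest distance between two distinct members of the sequence.
min_gap = min(abs(u[i] - u[j]) for i in range(N) for j in range(i + 1, N))

print(max(abs(v) for v in values), max(grads), min_gap)
```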
In B-spaces one has the following prototype for (PS).
Example 38.25. The G-differentiable functional $F: X \to \mathbb{R}$ with $F' = A + C$
on the B-space X satisfies (PS) when, in addition, the following holds:
(i) $F(u) \to +\infty$ as $\|u\| \to \infty$.
(ii) $A: X \to X^*$ has a continuous inverse operator $A^{-1}$ on $X^*$, and
$C: X \to X^*$ is compact.
In an H-space X, one can, e.g., identify X with X* and choose A to be
equal to the identity operator I.
We have already encountered such compact perturbations of the identity,
I + C, in dealing with the mapping degree in Part I and with the Fredholm
alternatives in Part II.
Proof. If $(F(u_n))$ is bounded, then, by (i), $(u_n)$ is also bounded.
Furthermore, let $Au_n + Cu_n \to 0$ as $n \to \infty$. Due to the compactness of C, there
exists a subsequence $(u_{n'})$ such that $(Cu_{n'})$ converges. Consequently, $(Au_{n'})$
also converges and hence $(u_{n'})$ converges because of (ii). □
In Chapter 44 we shall generalize the definition of (PS) and consider
additional prototypes. There we shall also show that (PS) plays an
important role in the Ljusternik-Schnirelman theory in the proof of the
existence of critical points for functionals and for eigenvalue problems. In
Section 44.12 we shall treat the mountain pass theorem, which can be
thought of as an important supplement to Theorem 38.F. The mountain
pass theorem guarantees the existence of a critical point u of F to which
there corresponds no minimum; that is, u is a solution of (29), but not of
(28), and is therefore different from the solution whose existence was proved
in Theorem 38.F. (PS) is also crucial for the mountain pass theorem. One
can frequently, but not always, verify the condition (PS) for nonlinear
partial differential equations. Then Theorem 38.F and the mountain pass
theorem yield existence assertions.
The investigation of nonlinear elliptic differential equations and the
verification of periodic solutions of nonlinear hyperbolic differential
equations and of Hamiltonian systems with such variational methods is presently
the object of intensive investigations. In this connection, see Nirenberg
(1981, S), Chow and Hale (1982, M), and the references to the
literature in Chapter 49.
38.11. The Abstract Entropy Principle
Our goal is an assertion of the form:
$u \le v$ implies $S(u) = S(v)$. (30)
In this connection, we work in an ordered set, which we have defined in
Section 11.8. Roughly speaking, the set X is ordered if, for certain $u, v \in X$,
there is defined a relation $u \le v$ with which one can calculate in the usual
way. Our assumptions read as follows:
(H1) X is an ordered set. Each monotone increasing sequence in X has an
upper bound.
(H2) $S: X \to [-\infty, \infty[$ is a monotone increasing function that is bounded
above.
It is evident that (H1) means that $u_n \le u_{n+1}$ for all $n \in \mathbb{N}$ always implies
the existence of a $v \in X$ such that $u_n \le v$ for all $n \in \mathbb{N}$. Furthermore, (H2)
means that $u \le v$ always implies that $S(u) \le S(v)$ and that there exists a real
number C such that $S(u) \le C$ for all $u \in X$.
Theorem 38.G (Brezis and Browder (1976)). If (H1) and (H2) hold, then
there exists a $u \in X$ such that (30) holds for all $v \in X$.
This theorem permits a simple thermodynamic interpretation. We recall
that, by the second law of thermodynamics, for each closed system the
entropy S is a monotone increasing function of time. Therefore, S tends to a
maximum as t -> + oo. The states of maximal entropy have a special
physical meaning. Roughly speaking, to these states there correspond stable
equilibrium states of the system. We now interpret u, v in X as possible
states of a system. The relation $u \le v$ means that the system in the state u
can pass into the state v at a later time. Thus, to a monotone increasing
sequence $u_1 \le u_2 \le u_3 \le \cdots$, there corresponds a possible time development
of the system. (H2) models the fact that the entropy S is a monotone
increasing function of time. Now, Theorem 38.G yields the existence of a
stable equilibrium state u. If the system is in u, then S can no longer
increase.
In Problem 57.6 we treat an important application of Theorem 38.G to
invariant sets of nonexpansive semigroups. Additional applications can be
found in Brezis and Browder (1976). There it is also shown that one can
obtain assertions concerning quasisolutions of the type considered in
Section 38.8 from Theorem 38.G.
Proof of Theorem 38.G. Our argument is analogous to that in the proof
of Proposition 38.22.
The idea of the proof is obtained directly from the physical interpretation
of the theorem.
We choose an arbitrary but fixed element $u_0 \in X$ and inductively
construct a monotone increasing sequence $(u_n)$. Let $u_n$ be known. We set
$M_n = \{u \in X: u_n \le u\}$ and $\beta_n = \sup_{M_n} S$. If (30) holds for $u = u_n$, then we are
finished. Otherwise, we have $\beta_n > S(u_n)$ and we can choose a $u_{n+1} \in M_n$ such that
$\beta_n - S(u_{n+1}) \le 2^{-1}[\beta_n - S(u_n)].$ (31)
In this way we obtain a monotone increasing sequence $(u_n)$ which, by (H1),
has an upper bound u, i.e.,
$u_n \le u$ for all n. (32)
We shall show that u is the desired solution.
Let us assume that u does not satisfy (30). Then there exists a v such that
$u \le v$ and $S(u) < S(v)$. The sequence $(S(u_n))$ is convergent, for it is
monotone increasing and bounded above, by (H2). From (32) and the
monotonicity of S, it follows that
$\lim_{n \to \infty} S(u_n) \le S(u).$ (33)
Due to (32), $v \in M_n$ for all n. Therefore, from (31) it follows that
$2S(u_{n+1}) - S(u_n) \ge \beta_n \ge S(v)$ for all n.
As $n \to \infty$, we now obtain the contradiction $S(v) \le S(u)$ by (33). □
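The construction in the proof can be followed on a small finite ordered set. The following is a minimal sketch under hypothetical choices: X = {1, ..., 60}, ordered by divisibility (u ≤ v if and only if u divides v), and S(u) the number of prime factors of u counted with multiplicity, which is monotone increasing for this order. Picking a maximizer of S over $M_n$ is one admissible way to satisfy (31).

```python
def omega(n: int) -> int:
    """Number of prime factors of n counted with multiplicity (omega(1) = 0)."""
    count, p = 0, 2
    while n > 1:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count


X = range(1, 61)  # hypothetical finite ordered set


def leq(u: int, v: int) -> bool:
    """Order relation: u <= v if and only if u divides v."""
    return v % u == 0


def entropy_point(u0: int) -> int:
    """Follow the proof: climb in the order until S cannot increase any more."""
    u = u0
    while True:
        M_n = [v for v in X if leq(u, v)]        # all states reachable from u_n
        beta_n = max(omega(v) for v in M_n)      # beta_n = sup of S over M_n
        if beta_n == omega(u):
            return u                             # (30) holds for u
        u = max(M_n, key=omega)                  # attains beta_n, so (31) holds


u = entropy_point(1)
print(u, omega(u))
```

Starting from $u_0 = 1$ the sketch stops at an element none of whose multiples in X has larger S, which is the "equilibrium state" of the thermodynamic interpretation.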
Problems
38.1. Proof of Example 38.6. Hint: Compare Example 9.12 and Dieudonné (1975,
M), Vol. II, 12.7.
38.2. Properties of lower semicontinuous functionals. Show: If $F, G, F_\alpha: M \to [-\infty, \infty]$
are lower semicontinuous, then $F + G$, $\sup(F, G)$, $\inf(F, G)$, and $\sup_\alpha F_\alpha$ are
also lower semicontinuous. $FG$ is lower semicontinuous when $F, G \ge 0$. For
$F + G$ and $FG$, we require that these expressions be defined, i.e., the cases
$-\infty + \infty$ and $0 \cdot \infty$ are excluded. Hint: Compare Dieudonné (1975, M), Vol.
II, 12.7.
38.3. Proof of Proposition 38.7. Solution: Ad (1) (i) ⇒ (ii) If F is not sequentially
lower semicontinuous at u, then there exists a sequence $(u_n)$ in M such that
$u_n \to u$ and $F(u) > \liminf_{n \to \infty} F(u_n)$. Then, passing to a subsequence if necessary,
there exist numbers r and $n_0$ such
that $F(u) > r > F(u_n)$ for all $n \ge n_0$. Since $u_n \to u$, this contradicts the fact
that $M_r$ is relatively closed.
(ii) ⇒ (i) From $u_n \in M_r$ for all $n \in \mathbb{N}$ and $u_n \to u$, it follows that
$F(u) \le \liminf F(u_n) \le r$; therefore, $u \in M_r$.
Ad (2) (i) ⇔ (iii) Follow a line of reasoning analogous to (1). Observe that
$M_r$ is convex and closed, and thus from $u_n \in M_r$ for all $n \ge n_0$ and $u_n \rightharpoonup u$, it
always follows that $u \in M_r$.
Ad (3) Use the definition of $\liminf$ and a suitable subsequence.
38.4. Theorem 38.A is a special case of Theorem 38.B.
Solution: We equip X with the weak topology. M is weak sequentially
closed; thus, by Problem 32.3, it is also weakly closed. According to $A_1$(44),
$\overline{\mathrm{co}}\,M$ is weakly compact, i.e., by $A_1$(12d), M is also weakly compact.
If u belongs to the weak closure $\overline{M_r}$ of $M_r$, then, by Problem 32.3, there
exists a sequence $(u_n)$ in $M_r$ such that $u_n \rightharpoonup u$; therefore, $F(u) \le \liminf F(u_n) \le r$
and hence $u \in M_r$. Consequently, $M_r$ is weakly closed. Furthermore, F is
weakly lower semicontinuous.
38.5. Weierstrass' classical counterexample. Show that the variational problem
$\inf_{u \in M} \int_{-1}^{1} (x u'(x))^2 \, dx = \alpha,$
where $M \overset{\text{def}}{=} \{u \in C^1[-1,1]: u(-1) = 0, u(1) = 1\}$, has no solution u.
Solution: If one chooses
$u_n(x) = \frac{1}{2} + \frac{1}{2} \frac{\arctan nx}{\arctan n}, \quad n = 1, 2, \ldots,$
then $u_n \in M$ and one obtains $\alpha = 0$. If u is a solution, then $x u'(x) = 0$ on
$[-1,1]$; therefore, u = constant, in contradiction to $u(-1) = 0$, $u(1) = 1$.
This example was given by Weierstrass to show that a minimum problem
in the calculus of variations need not always have a solution (cf. Funk (1962,
M), page 220 for historical comments).
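The claim that the infimum is 0 can be checked numerically. The following is a minimal sketch, assuming the integrand $(x u'(x))^2$ and the minimizing sequence $u_n(x) = 1/2 + \arctan(nx) / (2 \arctan n)$; the closed form below comes from the substitution $y = nx$ and is cross-checked against a midpoint rule. Function names and the number of quadrature cells are choices made for this illustration.

```python
import math


def energy_closed_form(n: int) -> float:
    """Integral of (x u_n'(x))^2 over [-1, 1], via the substitution y = n x."""
    atan_n = math.atan(n)
    return (atan_n - n / (1.0 + n * n)) / (4.0 * n * atan_n ** 2)


def energy_midpoint(n: int, cells: int = 20000) -> float:
    """Same integral by the composite midpoint rule (numerical cross-check)."""
    h = 2.0 / cells
    atan_n = math.atan(n)
    total = 0.0
    for k in range(cells):
        x = -1.0 + (k + 0.5) * h
        du = n / (2.0 * atan_n * (1.0 + (n * x) ** 2))  # u_n'(x)
        total += (x * du) ** 2
    return total * h


energies = [energy_closed_form(n) for n in (1, 10, 100, 1000)]
print(energies)
print(energy_midpoint(100))
```

The values decrease toward 0 as n grows, while no admissible u reaches the value 0, in line with the argument of the solution.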
38.6. Theorem 38.G implies Theorem 38.E. Hint: Set $u \le v$ if and only if
$F(u) - d(u,v) \ge F(v)$.
References to the Literature
Classical works: Hilbert (1904) (establishing the Dirichlet principle and the
direct method of the calculus of variations); Tonelli (1921, M) (lower
semicontinuity).
Introduction: Vainberg (1956, M); Luenberger (1969, M); Girsanov (1972,
L); Holmes (1972, L); Fucik, Necas, and Soucek (1977, L); Hlavacek and
Necas (1981, M).
Selection of monographs on the functional analysis treatment of various
aspects of the theory of extremal problems: Vainberg (1956, M), (1972, M);
Krasnosel'skii (1956, M), (1975, M); Lions (1971, M), (1983, M); Klötzler
(1971, M); Cea (1971, M); Duvaut and Lions (1972, M); Ioffe and Tihomirov
(1974, M); Ekeland and Temam (1974, M); Holmes (1975, M); Glowinski,
Lions, and Tremolieres (1976, M); Langenbach (1976, M); Berger (1977,
M); Kluge (1979, M); Aubin (1979, M); Kinderlehrer and Stampacchia
(1980, M).
Weak convergence: Dunford and Schwartz (1958, M), Vol. I; Yosida
(1965, M); Mukherjea and Pothoven (1978, M).
Lower semicontinuous functionals: Dieudonne (1975, M), Vol. II (also,
cf. the references to the literature for Chapter 22).
Lower semicontinuity and existence theory of variational problems with
multiple integrals: Morrey (1966, M), Giaquinta (1981, L), Necas (1983, L).
Recent trends
General survey: Berkeley (1983, P).
Variational problems, (PS), and nonlinear differential equations:
Nirenberg (1981, S) (also see the references to the literature in Chapter 49).
Nonconvex problems: Ekeland (1979, S); Ekeland and Temam (1974, M);
Rockafellar (1981, L); Demjanov and Vasiljev (1981, M).
Abstract entropy principle: Brezis and Browder (1976).
Nonsmooth problems: Ekeland (1979, S); Clarke (1976a), (1981), (1984,
M); Rockafellar (1981, L); Demjanov and Vasiljev (1981, M) (see Problems
47.10 and 48.8c).
Nonconvex problems, their stochastic interpretation, and generalized
solutions: McShane (1978, S); Gamkrelidze (1978, M) (see Problem 42.14).
Duality for nonconvex problems: Klötzler (1983, S).
Stochastic control and quasivariational differential equations: Fleming
and Rishel (1975, M); Bensoussan (1982, M).
Regularity of the solutions of variational inequalities: Kinderlehrer and
Stampacchia (1980, M), Friedman (1982, M).
Control problems governed by partial differential equations: Lions (1971,
M), (1976, S), (1977, S), (1983, M); Ahmed and Teo (1981, M).
Global generalized solutions of the Hamilton-Jacobi differential equation:
Lions, Jr. (1982, L).
Perturbed variational problems, asymptotics, homogenization:
Bensoussan, Lions, and Papanicolaou (1978, M); Lions (1980, S), (1983, M).
Global analysis and the existence of solutions in the generic case: Tromba
(1977, S) (minimal surfaces); Ekeland (1979, S) (geodesics) (see Problem
52.1h).
Minimal surfaces: Tromba (1977, S); Bohme (1980/81, S); Fomenko
(1982, M); Hildebrandt (1983, S), Almgren (1984, M).
Complexity of numerical algorithms in the generic case: Smale (1981).
Optimal algorithms: Traub and Wozniakowski (1980, M).
Global analysis and mathematical economics: Smale (1983, S).
Morse theory: Tromba (1977), (1977a); Bott (1982, S).
Shock waves, reaction-diffusion and the generalized Morse index of
Conley: Smoller (1983, M).
Global analysis, infinite-dimensional Hamiltonian systems, symplectic
geometry, and mathematical physics: Chernoff and Marsden (1974, L);
Marsden (1974, L), (1981, L).
Geometrical optics, asymptotic expansions of the solutions of partial
differential equations, symplectic geometry, geometric quantization, Maslov
index, and Fourier integral operators: Guillemin and Sternberg (1977, M);
Leray (1978, M); Hörmander (1983, M); Beals, Fefferman and Grossman
(1983, S).
Microlocal analysis: Kashiwara, Kawai and Sato (1973, L); Sato, Miwa
and Jimbo (1980, S) (holonomic quantum fields); Garding (1981, S);
Fefferman (1983, S).
Global analysis and control theory: Givens and Millman (1982, S).
Solitons: Bullough and Caudrey (1980, P); Zaharov (1980, M); Calogero
and Degasperis (1982, M).
Gauge field theory and elementary particles: See the references to the
literature in Chapter 40.
Modern development of general relativity: Held (1980, P); Marsden
(1981, L).
Applications of catastrophe theory to the natural sciences and
engineering: Poston and Stewart (1978, M); Gilmore (1981, M).
Classification of singularities: Arnold (1981, S), (1983, S), (1983a, S).
Optimization and operations research: Dixon (1980, M); Korte (1982,
M).
Nonlinear elasticity: Ball (1977), Necas (1983, L).
Plasticity: Temam (1983, M).
Variational inequalities and free boundary value problems: Friedman
(1982, M).
Capillarity: Finn (1984, M).
Inverse problems and parameter identification: Deuflhard and Hairer
(1983, P).
CHAPTER 39
Convexity and Extremal Principles
It seems to me that the notion of convex function is just as fundamental as
positive or increasing function. If I am not mistaken in this, the notion ought
to find its place in elementary expositions of the theory of real functions.
J. L. W. V. Jensen, 1906
The study of convex sets is a branch of geometry, analysis, and linear algebra
that has numerous connections with other areas of mathematics and serves to
unify many apparently diverse mathematical phenomena.
Victor Klee, 1950
In the preceding chapter we showed how existence propositions for extremal
problems are obtained with the aid of compactness arguments. A second
basic strategy for obtaining existence propositions consists in considering
convexity instead of compactness. Figure 39.1 shows the logical
connections. We place the Hahn-Banach theorem at the pinnacle; in the final
analysis this theorem goes back to the central fixed point theorem of
Bourbaki and Kneser, in Chapter 11, via Zorn's lemma. The separation
theorems for convex sets and the Krein extension theorem for positive
functionals follow from the Hahn-Banach theorem. These three theorems
are standard results of functional analysis. We summarize them in Section
39.1 without proofs. The proofs can be found, e.g., in Edwards (1965, M). In
fact these three theorems, which are framed in Fig. 39.1, are mutually
equivalent if they are appropriately formulated. They represent different
conceptions of a general fundamental principle of geometric functional
analysis, which finds its most suggestive geometrical form in the separation
theorems for convex sets. These equivalences are discussed in Holmes (1975,
M), page 95 (cf. Problem 39.13). In toto, there are nine important
propositions that are equivalent to the Hahn-Banach theorem. In Problem 39.14
we point out the interesting fact that the existence of nontrivial quantum
fields can be proved with the aid of the Hahn-Banach theorem.
[Figure 39.1. Convexity and extremal problems. Recoverable labels: Bourbaki-Kneser fixed-point theorem; Zorn's lemma; Chebyshev approximation; saddle points and game theory; Kuhn-Tucker theory; Pontrjagin maximum principle.]
In Section 39.2, as a prototype for the application of the Hahn-Banach
theorem to extremal problems, we treat the duality principle for linear
approximation theory. In this connection, we use a common proof strategy:
(α) The dual problem is solved with a convexity argument without using
compactness.
(β) The original problem is then solved using a compactness argument.
In this chapter, as an application of general linear approximation theory,
we treat Chebyshev approximation. From the standpoint of the classical
calculus of variations, this approximation problem is difficult. To
characterize the solutions, one cannot use the methods of the differential
calculus as in the next chapter because the norm on C[a, b] is not differen-
tiable, and the simple uniqueness argument in Theorem 38.C in Section
38.4, with the aid of the strict convexity of the functional, fails because
C[a, b] is not a strictly convex B-space. In fact, it is a matter of a typical
convex optimization problem which is uniquely solvable only in certain
cases. We shall characterize the solutions in terms of extreme points of the
unit ball of the dual space. This results in a natural way from the fact that
the dual problem is a linear optimization problem on a convex set, and,
according to Section 38.7, the extreme points of convex sets are crucial.
These extreme points also play an important role in uniqueness
propositions.
We discuss the propositions of Fig. 39.1 in later chapters of Part III.
39.1. The Fundamental Principle of Geometric
Functional Analysis
A functional $p\colon X \to \mathbb{R}$ is called sublinear if and only if $p(tu) = t\,p(u)$ and
$p(u + v) \le p(u) + p(v)$ for all $u, v \in X$ and $t \in \mathbb{R}$, $t \ge 0$. Furthermore, $p$ is
called a seminorm provided that $p$ is sublinear and in addition $p(tu) = |t|\,p(u)$
for all $u \in X$, $t \in \mathbb{R}$.
Proposition 39.1 (General Hahn-Banach Theorem). Let $f\colon M \subseteq X \to \mathbb{R}$ be a
linear functional on the linear subspace $M$ of the real linear space $X$ such that
$$f(u) \le p(u) \quad \text{for all } u \in M, \tag{1a}$$
where $p\colon X \to \mathbb{R}$ is sublinear. Then $f$ can be extended to a linear functional
$f\colon X \to \mathbb{R}$ such that
$$f(u) \le p(u) \quad \text{for all } u \in X. \tag{1b}$$
The proof is, in principle, simple and can be found in Edwards (1965, M),
1.7.1. This fundamental theorem goes back to Hahn (1926) and Banach
(1929). The discovery of the Hahn-Banach theorem was closely related to
the famous classical moment problem. The interesting history of this
theorem is discussed in Dieudonne (1981, M). If $X$ is a locally convex space
and $p$ is a continuous seminorm on $X$, then by passing from $u$ to $-u$, it
immediately follows from (1b) that $|f(u)| \le p(u)$ for all $u \in X$. Since
$p(0) = 0$ and $p$ is continuous, $f$ is continuous at $u = 0$. A translation shows
that $f$ is also continuous on $X$. In particular, the following Hahn-Banach
theorem in B-spaces follows immediately.
Corollary 39.2. Let $f\colon M \subseteq X \to \mathbb{R}$ be a linear functional on the linear
subspace $M$ of the real B-space $X$ such that $\|f\| < \infty$, i.e., $|f(u)| \le \|f\|\,\|u\|$ for
all $u \in M$. Then $f$ can be extended to a continuous linear functional $f\colon X \to \mathbb{R}$
with preservation of the norm $\|f\|$.
Definition 39.3. Let $A$, $B$ be nonempty sets in the real locally convex space
$X$. Then $A$ and $B$ can be separated if and only if there exist a continuous
linear functional $f\colon X \to \mathbb{R}$, $f \ne 0$, and a real number $\alpha$ so that
$$f(u) \le \alpha \le f(v) \quad \text{for all } u \in A,\ v \in B. \tag{2}$$
$A$ and $B$ can be strictly separated if and only if "$\le$" can be replaced
everywhere in (2) by "$<$".
If we designate the set $H \overset{\mathrm{def}}{=} \{u \in X : f(u) = \alpha\}$ for fixed $f \in X^*$, $\alpha \in \mathbb{R}$,
$f \ne 0$ as a closed hyperplane, then when $X = \mathbb{R}^2$ we have separation in Fig.
39.2(a) and strict separation in Figs. 39.2(b) and 39.2(c). It is extraordinarily
remarkable that the simple geometric situation that occurs in these figures
holds very generally.
[Figure 39.2, panels (a), (b), (c)]
Proposition 39.4 (Separation of Convex Sets). If $A$, $B$ are nonempty convex
sets in the real locally convex space $X$, then:
(1) $A$ and $B$ can be separated provided
$$B \cap \operatorname{int} A = \emptyset, \qquad \operatorname{int} A \ne \emptyset$$
[see Fig. 39.2(a)]. Then, in addition to (2), above, $f(u) < \alpha$ for all $u \in \operatorname{int} A$.
(2) $A$ and $B$ can be strictly separated provided $A \cap B = \emptyset$ and one of the
following two conditions is fulfilled:
(i) $A$ and $B$ are open [see Fig. 39.2(b)].
(ii) $A$ is closed and $B$ is compact [see Fig. 39.2(c)].
The proofs, which easily follow from the Hahn-Banach theorem, can be
found in Edwards (1965, M), 2.1. Sharper formulations, which, however, we
do not need here, are contained in Holmes (1975, M), HE. The separation
theorem will play a crucial role in Chapter 47 in the construction of convex
analysis.
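As a small numerical illustration of Proposition 39.4, (2)(ii), the following sketch separates a point from the closed unit disk in $\mathbb{R}^2$. It is only a sketch under stated assumptions: the separating functional is built from the Euclidean nearest-point projection, which is available in this special case and is not the Hahn-Banach argument; the function names and the sample point are hypothetical.

```python
import math

def project_onto_unit_disk(p):
    """Nearest point of the closed unit disk A to p (Euclidean norm)."""
    r = math.hypot(p[0], p[1])
    return p if r <= 1.0 else (p[0] / r, p[1] / r)

def separating_functional(p):
    """For p outside the disk, return (f, alpha) with f(u) < alpha < f(p) on A."""
    q = project_onto_unit_disk(p)
    f = (p[0] - q[0], p[1] - q[1])            # normal vector of the hyperplane
    # alpha is the value of f at the midpoint between p and its projection q.
    alpha = 0.5 * (f[0] * (p[0] + q[0]) + f[1] * (p[1] + q[1]))
    return f, alpha

p = (2.0, 1.0)                                # B = {p} is compact, A is closed
f, alpha = separating_functional(p)
sup_over_A = math.hypot(f[0], f[1])           # sup of f over the unit disk is ||f||_2
value_at_p = f[0] * p[0] + f[1] * p[1]
print(sup_over_A < alpha < value_at_p)
```

Here the hyperplane $\{u : f(u) = \alpha\}$ passes through the midpoint between the point and its projection, so both inequalities in (2) are strict.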
By a convex cone we understand a convex set $K$ having the property that $u \in K$,
$t > 0$ implies $tu \in K$.
Proposition 39.5 (Krein's Extension Theorem). Suppose that the following
two conditions hold:
(i) $X$ is a real locally convex space, $K$ is a convex cone in $X$, and $L$ is a linear
subspace of $X$ such that $L \cap \operatorname{int} K \ne \emptyset$.
(ii) $f\colon L \to \mathbb{R}$ is a linear functional such that $f(u) \ge 0$ for all $u \in L \cap K$.
Then $f$ can be extended to a continuous linear functional $f\colon X \to \mathbb{R}$ such that
$f(u) \ge 0$ for all $u \in K$.
The proof, which again easily follows from the Hahn-Banach theorem,
can be found in Edwards (1965, M), 2.5.1. Proposition 39.5 is used in
Chapter 48 in an essential way to prove the Dubovickii-Miljutin lemma.
We shall base general optimality criteria on that lemma.
39.2. Duality and the Role of Extreme Points in
Linear Approximation Theory
In order to describe a typical application of the Hahn-Banach theorem, we
consider the original problem
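$$\inf_{u \in M} \|b - u\| = \alpha \tag{3}$$
together with the dual problem corresponding to it,
$$\sup_{f \in K(M^{\perp})} \langle f, b\rangle = \beta, \tag{4}$$
where
$$M^{\perp} \overset{\mathrm{def}}{=} \{f \in X^* : \langle f, u\rangle = 0 \ \text{for all } u \in M\}, \qquad K(M^{\perp}) = \{f \in M^{\perp} : \|f\| \le 1\}.$$
If $Y$ is a normed space, then in general we denote by
$$K(Y) \overset{\mathrm{def}}{=} \{u \in Y : \|u\| \le 1\}$$
(respectively,
$$S(Y) \overset{\mathrm{def}}{=} \{u \in Y : \|u\| = 1\})$$
the closed unit ball in $Y$ (respectively, the boundary of the unit ball in $Y$).
The following relation is crucial for the characterization of the solutions:
$$\langle f, b - u\rangle = \|b - u\|, \qquad u \in M,\ f \in K(M^{\perp}). \tag{5}$$
When $\|b - u\| \ne 0$, $\|f\| = 1$ must automatically hold, i.e., $f \in S(M^{\perp})$.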
Theorem 39.A. Let $M$ be a linear subspace of the real B-space $X$ and let $b$ be
a fixed given element in $X$. Then the following hold:
(1) Dual problem. (4) has a solution $f$ and $\alpha = \beta$.
(2) Original problem. (3) has a solution $u$ provided one of the following two
conditions is fulfilled:
(i) $X$ is reflexive and $M$ is closed,
(ii) $\dim M < \infty$.
Then the solution set is bounded, closed, and convex.
(3) Characterization. $u$ is a solution of (3) above if and only if there exists
an $f$ such that (5) holds.
The next corollary follows directly from $\alpha = \beta$.
Corollary 39.6 (Error Estimate). Let $u$ in $M$ and $f$ in $K(M^{\perp})$ be arbitrary but
fixed. Then
$$\langle f, b\rangle \le \alpha \le \|b - u\| \tag{6}$$
and
$$\langle f, b\rangle = \|b - u\| \iff u \text{ is a solution of problem (3) and } f \text{ is a solution of (4)}. \tag{7}$$
Theorem 39.A is a prototype for the propositions of duality theory. It is
interesting that here the dual problem is always solvable, while the original
problem need not have a solution. The existence assertions for (4)
(respectively, (3)) are based on the Hahn-Banach theorem (respectively, on a
compactness argument). Corollary 39.6 is the prototype for error estimates
and solution characterizations with the aid of duality theory. The following
example contains the intuitive meaning of Theorem 39.A.
Example 39.7. Let $X = \mathbb{R}^2$ with the Euclidean inner product $(\cdot \mid \cdot)$ and let $M$
be a straight line through the origin. We identify $X$ with $X^*$; therefore,
$\langle f, u\rangle = (f \mid u)$. Then $\alpha = \beta$ asserts that the distance between $b$ and $M$ equals
the maximal distance between $b$ and all planes through $M$ (see Fig. 39.3).
$M^{\perp}$ corresponds to the orthogonal complement to $M$. The orthogonal
projection $u$ of $b$ on $M$ is a solution of the approximation problem (3),
whereas, for $b \notin M$, the unit vector $f = \|b - u\|^{-1}(b - u)$ is a solution of the
dual problem (4). These assertions are also valid when $M$ is a closed
subspace of a real H-space $X$.
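The assertions of Example 39.7 can be checked numerically. The following is a minimal sketch, assuming a hypothetical line $M = \operatorname{span}\{m\}$ and a hypothetical point $b$ in $\mathbb{R}^2$; it computes $\alpha$ from the orthogonal projection and $\beta$ from the unit vector $f$, and also evaluates the error estimate (6) for an arbitrary element $v$ of $M$.

```python
import math

def dot(x, y):
    return sum(a * c for a, c in zip(x, y))

def norm(x):
    return math.sqrt(dot(x, x))

# Hypothetical data: M = span{m} is a line through the origin in R^2.
m = (2.0, 1.0)
b = (1.0, 3.0)

# Original problem (3): the orthogonal projection u of b onto M.
t = dot(b, m) / dot(m, m)
u = tuple(t * mi for mi in m)
residual = tuple(bi - ui for bi, ui in zip(b, u))
alpha = norm(residual)

# Dual problem (4): f = (b - u)/||b - u|| has norm 1 and vanishes on M.
f = tuple(r / alpha for r in residual)
beta = dot(f, b)

# Error estimate (6) for an arbitrary v in M: <f, b> <= alpha <= ||b - v||.
v = tuple(-0.3 * mi for mi in m)
upper = norm(tuple(bi - vi for bi, vi in zip(b, v)))
print(alpha, beta, upper)
```

Any feasible pair $(f, v)$ brackets $\alpha$ in this way; the bracket closes exactly when $v$ solves (3) and $f$ solves (4).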
[Figure 39.3]
Proof of Theorem 39.A. (1) For $f \in K(M^{\perp})$,
$$\langle f, b\rangle = \langle f, b - u\rangle \le \|b - u\| \quad \text{for all } u \in M;$$
thus $\alpha \ge \beta$. For $b \in M$, we have $\alpha = 0$, and $u = b$ (respectively, $f = 0$) solve
(3) [respectively, (4)] and (5) holds. Now let $b \notin M$. Each $x$ in $\operatorname{span}\{b, M\}$
can be represented in the form $x = db + u$, where $d \in \mathbb{R}$, $u \in M$ are uniquely
determined. We set $f(x) = d\alpha$. For $d \ne 0$ we have
$$|f(x)| = |d|\,\alpha \le |d|\,\|b - (-d^{-1}u)\| = \|x\|;$$
hence $\|f\| = 1$. According to the Hahn-Banach theorem (Corollary 39.2), $f$
can be extended to a continuous linear functional $f\colon X \to \mathbb{R}$ with $\|f\| = 1$. By
construction, $f \in M^{\perp}$ and $\langle f, b\rangle = \alpha$; thus $\alpha = \beta$.
(2) This assertion follows directly from Proposition 38.15.
(3) This is another formulation of Corollary 39.6 taking proposition (1)
into account. □
The dual problem (4) is a linear optimization problem on a convex set. In
Theorem 38.D in Section 38.7 we saw that essential simplifications appear in
such problems because the minimum is taken on at an extreme point. We
will now formulate the corresponding result for the approximation problem
(3) where dimM < oo.
Corollary 39.8. Let $M$ be an $n$-dimensional subspace of the real B-space $X$
where $1 \le n < \infty$ and $b \notin M$. Then, for $u$ in $M$, the following two assertions are
equivalent:
(i) $u$ is a solution of the approximation problem (3).
(ii) There exist $m$ numbers $\lambda_1,\dots,\lambda_m > 0$ such that $\lambda_1 + \dots + \lambda_m = 1$ and
$1 \le m \le n+1$ as well as $m$ linearly independent extreme points $f_1,\dots,f_m$
of $K(X^*)$ such that
$$\langle f_i, b - u\rangle = \|b - u\|, \qquad i = 1,\dots,m,$$
and
$$\sum_{i=1}^{m} \lambda_i f_i \in M^{\perp}.$$
Proof. (ii) ⇒ (i) (5) holds for $f \overset{\mathrm{def}}{=} \sum_{i=1}^{m} \lambda_i f_i$. Theorem 39.A, (3) yields the
assertion.
(i) ⇒ (ii) Let $N \overset{\mathrm{def}}{=} \operatorname{span}\{b, M\}$. According to Theorem 39.A, (3), with $N$
instead of $X$, there exists an $f \in S(N^*) \cap M^{\perp}$ such that $\langle f, b - u\rangle = \|b - u\|$.
The functional $f$ can be represented as a convex linear combination of $m$
linearly independent extreme points $f_1,\dots,f_m$ of $K(N^*)$ with $m \le n+1$, i.e.,
$f = \sum_{i=1}^{m} \lambda_i f_i$ (cf. Problem 39.1). Each $f_i$ can be extended to a functional in $X^*$,
again denoted by $f_i$, that is an extreme point of $K(X^*)$ (cf. Problem 39.3). From $\|f_i\| = 1$ it
follows that $\langle f_i, b - u\rangle \le \|b - u\|$. Since
$$\Big\langle \sum_{i} \lambda_i f_i,\ b - u \Big\rangle = \|b - u\|, \qquad \lambda_i > 0 \ \text{for all } i,$$
and $\lambda_1 + \dots + \lambda_m = 1$, we have $\langle f_i, b - u\rangle = \|b - u\|$ for all $i$. □
39.3. Interpolation Property of Subspaces and
Uniqueness
The following theorem contains an important uniqueness assertion for
approximation problems.
Theorem 39.B. Let $M$ be a convex set in the real B-space $X$ and let $b$ be given
and fixed in $X$. Then the approximation problem
$$\inf_{u \in M} \|b - u\| = \alpha \tag{8}$$
has at most one solution $u$ when one of the following two conditions holds:
(i) $X$ is strictly convex, i.e., for all $u, v \in X$ and $r > 0$, we have:
$$\|u\| = \|v\| = r,\ u \ne v \quad \text{implies} \quad \|2^{-1}(u + v)\| < r.$$
(ii) $M$ is a linear $n$-dimensional subspace and possesses the interpolation
property, i.e., for arbitrary fixed $\alpha_1,\dots,\alpha_n \in \mathbb{R}$, the system of equations
$$f_i(u) = \alpha_i, \qquad i = 1,\dots,n, \quad u \in M$$
has exactly one solution $u$ when $f_1,\dots,f_n$ are linearly independent extreme
points of $K(X^*)$.
Proof. (i) For two solutions $u_1, u_2$ of (8) with $u_1 \ne u_2$ one immediately
obtains the contradiction:
$$\|b - 2^{-1}(u_1 + u_2)\| = \|2^{-1}[(b - u_1) + (b - u_2)]\| < \alpha, \qquad 2^{-1}(u_1 + u_2) \in M.$$
(ii) Let $b \notin M$. If $u_1, u_2$ are solutions of (8), then $u \overset{\mathrm{def}}{=} 2^{-1}(u_1 + u_2)$ is also
a solution of (8) because of the convexity of the norm. In Corollary 39.8, $m$
must be equal to $n+1$, for, if $m$ were less than $n+1$, then we would
complete $f_1,\dots,f_m$ to $n$ linearly independent extreme points $f_1,\dots,f_n$ of
$K(X^*)$. Since $\dim X \ge n+1$, this is always possible according to the
Krein-Milman theorem in Section 38.7. Thus, by (ii) there exists a $u_0 \in M$
such that $f_i(u_0) = 1$, $i = 1,\dots,n$. Therefore, $\sum_{j=1}^{m} \lambda_j f_j(u_0) \ne 0$, which
contradicts $\sum_{j=1}^{m} \lambda_j f_j \in M^{\perp}$. Furthermore, by Corollary 39.8,
$$f_i(b - u) = \|b - u\| = \alpha, \qquad i = 1,\dots,n+1;$$
hence
$$2^{-1} f_i(b - u_1) + 2^{-1} f_i(b - u_2) = \alpha, \qquad i = 1,\dots,n+1.$$
From $\|f_i\| = 1$ it follows that $f_i(b - u_j) \le \|b - u_j\| = \alpha$ for $j = 1, 2$.
Consequently, $f_i(b - u_1) = f_i(b - u_2)$, $i = 1,\dots,n+1$. The interpolation property
gives $u_1 = u_2$. □
Example 39.9. According to Section 10.1, every uniformly convex B-space
$X$ is also strictly convex. Consequently, H-spaces as well as Lebesgue spaces
$L_p(G)$ and Sobolev spaces $W_p^m(G)$, $1 < p < \infty$, $m = 1, 2, \dots$, are strictly
convex. On the other hand, the B-space $C[a, b]$, $-\infty < a < b < \infty$, is an
important example of a B-space that is not strictly convex. This can be seen
very easily: for instance, $u \equiv 1$ and $v(t) = (t - a)/(b - a)$ both have norm 1, and
so does $2^{-1}(u + v)$. In Section 39.5 we shall prove uniqueness propositions for
Chebyshev approximation using Theorem 39.B, (ii). Then the interpolation
property means that the functions in the n-dimensional approximation
space M are uniquely determined by giving their values at n distinct points.
Theorem 39.B describes two basically distinct strategies for obtaining
uniqueness propositions. In (i) a property of the functional to be minimized
is exploited. This criterion is intimately connected with Theorem 38.C in
Section 38.4. In (ii) the properties of the set M over which one minimizes
play a central role. The following example contains the intuitive meaning.
Example 39.10. Let $X = \mathbb{R}^2$ with the norm $\|\cdot\|_j$, $1 \le j \le \infty$. We consider
$$\inf_{u \in M} \|u\|_j = \alpha \tag{9}$$
and set
$$S_j \overset{\mathrm{def}}{=} \{u \in \mathbb{R}^2 : \|u\|_j = 1\}, \qquad K_j \overset{\mathrm{def}}{=} \{u \in \mathbb{R}^2 : \|u\|_j \le 1\}.$$
[Figure 39.4, panels (a), (b), (c)]
Let $M$ be a straight line of support for $K_j$ through a point of $S_j$, i.e., $M$ goes
through a boundary point and $K_j$ lies entirely on one side of $M$. Thus, $\alpha = 1$.
For $j = 2$, $\|\cdot\|_2$ is the Euclidean norm, $S_2$ is the boundary of the unit disk,
and (9) has exactly one solution $u$ (see Fig. 39.4(a)). The space $X$ with the
norm $\|\cdot\|_2$ is strictly convex because $S_2$ does not contain a straight line
segment.
For $j = \infty$, and therefore for $\|u\|_\infty = \max(|\xi|, |\eta|)$, $u = (\xi, \eta)$, (9) need not
have a unique solution (see Fig. 39.4(b)). Here, $S_\infty$ is the boundary of the
unit square and $X$ with the norm $\|\cdot\|_\infty$ is not strictly convex. Then a solution
$u$ of (9) is unique if and only if $M$ passes through exactly one vertex of $S_\infty$
(see Fig. 39.4(c)). We will show that it is precisely then that $M$ has the
interpolation property. The norm on $X^*$ is $\|\cdot\|_1$, where $\|u\|_1 = |\xi| + |\eta|$. The
extreme points of $K_1$ are $(\pm 1, 0)$, $(0, \pm 1)$ (see Fig. 39.4(c)). As one can easily
verify, the interpolation property means that each point of $M$ is uniquely
determined by its projection on one of the directions from the origin to
$(\pm 1, 0)$, $(0, \pm 1)$.
In contrast to Theorem 39.B, here M is not a linear subspace of X, but
rather it is only parallel to a linear subspace. However, the situation in
Theorem 39.B is obtained by a translation.
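The three situations of Example 39.10 can be reproduced by a crude grid search. The sketch below is illustrative only: the lines are hypothetical choices of supporting lines (one through the edge point (1, 0), one through the vertex (1, 1) of the unit square), and the grid resolution and tolerance are assumptions.

```python
def norm_inf(p):
    return max(abs(p[0]), abs(p[1]))

def norm_2(p):
    return (p[0] ** 2 + p[1] ** 2) ** 0.5

def minimizers(point, direction, norm, steps=2001, span=2.0, tol=1e-12):
    """Grid search for min ||point + s*direction|| over s in [-span, span]."""
    params = [-span + 2.0 * span * k / (steps - 1) for k in range(steps)]
    values = [norm((point[0] + s * direction[0], point[1] + s * direction[1]))
              for s in params]
    best = min(values)
    return best, [s for s, v in zip(params, values) if v <= best + tol]

# M = {(1, s)}: supporting line of both unit balls through the point (1, 0).
edge_inf = minimizers((1.0, 0.0), (0.0, 1.0), norm_inf)   # whole edge minimizes
edge_2 = minimizers((1.0, 0.0), (0.0, 1.0), norm_2)       # unique minimizer
# M = {(1 + s, 1 - s)}: supporting line of the unit square through one vertex.
vertex_inf = minimizers((1.0, 1.0), (1.0, -1.0), norm_inf)

print(len(edge_inf[1]), len(edge_2[1]), len(vertex_inf[1]))
```

In all three cases the minimal value is 1; only the number of grid minimizers differs, reflecting the loss of uniqueness for the maximum norm along an edge.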
39.4. Ascent Method and the Abstract Alternation
Theorem
In this section we give results that are of fundamental significance for
obtaining approximate solutions for linear approximation problems. At the
same time, we generalize the considerations in Section 37.29c. To this end,
we consider the original problem
$$\min_{u \in M} \|b - u\| = \alpha \tag{10}$$
together with the so-called discrete problem
$$\min_{u \in M} \|b - u\|_F = \beta \tag{11}$$
with the discrete seminorm
$$\|b - u\|_F \overset{\mathrm{def}}{=} \max_{1 \le i \le m} |f_i(b - u)| \tag{12}$$
and make the following assumptions:
(H1) $M$ is an $n$-dimensional subspace of the real B-space $X$, $1 \le n < \infty$,
and $b$ is a fixed element in $X$ such that $b \notin M$.
Definition 39.11. By a reference $F = (f_1,\dots,f_m)$, we understand a tuple of
functionals $f_1,\dots,f_m \in X^*$ such that $\|f_i\| = 1$ for $f_i \ne 0$. Furthermore, $F$ is
called regular when $m = n+1$, and for a choice of a fixed basis $\{u_1,\dots,u_n\}$
in $M$, we have:
$$\sum_{i=1}^{n+1} |h_i|\,\|f_i\| > 0. \tag{13}$$
Here, $h_i$ denotes the so-called Haar determinant:
$$h_i \overset{\mathrm{def}}{=} \begin{vmatrix} f_1(u_1) & \cdots & f_1(u_n) \\ \vdots & & \vdots \\ f_{i-1}(u_1) & \cdots & f_{i-1}(u_n) \\ f_{i+1}(u_1) & \cdots & f_{i+1}(u_n) \\ \vdots & & \vdots \\ f_{n+1}(u_1) & \cdots & f_{n+1}(u_n) \end{vmatrix},$$
which results from $(f_k(u_j))$ by eliminating the $i$th row.
Example 39.12. In the special case of Chebyshev approximation, which we
shall consider more precisely in the next section, we have $X = C(T)$, and
one can choose $f_i$ to be $f_i(u) = u(t_i)$ for all $u \in X$ and fixed $t_i \in T$.
Proposition 39.13 (Error Estimate). Assume (H1) holds. Then if $u$ is a
solution of the discrete problem (11) for a fixed reference $F$,
$$\|b - u\|_F \le \alpha \le \|b - u\|, \tag{14}$$
and this $u$ is a solution of the original problem (10) when
$$\|b - u\|_F = \|b - u\|.$$
Proof. From $\|f_i\| \le 1$ it follows that $\|b - v\|_F \le \|b - v\|$ for all $v \in M$;
therefore, $\beta \le \alpha$. □
We now give an effective method for solving (11).
Corollary 39.14. With the assumption (H1), the following two assertions hold:
(1) If $F$ is a regular reference, then the linear system of equations
$$(-1)^i \rho(h_i)\,\|f_i\|\,\beta + \sum_{k=1}^{n} a_k f_i(u_k) = f_i(b), \qquad i = 1,\dots,n+1, \tag{15}$$
always has a solution $(a_1,\dots,a_n,\beta)$. In this connection, we choose
$\rho(h_i) \overset{\mathrm{def}}{=} \operatorname{sgn} h_i$ for $h_i \ne 0$, and for $h_i = 0$, let $\rho(h_i)$ be a fixed real number with
$|\rho(h_i)| \le 1$. If we set $u = \sum_{k=1}^{n} a_k u_k$, then $u$ is a solution of (11).
(2) For each solution $u = \sum_{k=1}^{n} a_k u_k$ of (10) there exists a regular reference
$F$ such that (15) holds with $\beta = \alpha$.
The way for an approximation method to solve the original problem (10)
is thus indicated:
(a) One determines u by (15) or, more generally, as any solution of (11).
Then the error estimate (14) holds.
(b) If $\|b - u\|_F = \|b - u\|$, then $u$ is already a solution of (10).
(c) Otherwise one tries to increase the discrete seminorm $\|b - u\|_F$ by
changing the reference $F$.
This method can be conceived of as an abstract form of the Remes
algorithm which we described in Section 37.29c for the Chebyshev
approximation. In this connection, the situation of Example 39.12 is present.
Here, changing the reference $F$ means changing the points $t_i$. General
algorithms for (c) are given, for instance, in Laurent (1972, M), 8.5 and in
Kiesewetter (1973, M). We refer to these as ascent methods because $\|b - u\|_F$
approaches the value $\alpha$ from below.
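Steps (a) to (c) can be sketched for the situation of Example 39.12 on a finite grid. The following is a minimal illustration, not the algorithm of the cited monographs: it assumes the monomial basis $1, t, \dots, t^{n-1}$, point functionals at grid points, the sign factor $\rho(h_i) = 1$ in (15) (the Haar determinants are positive Vandermonde determinants for increasing points), and the common single-point exchange rule for step (c). All function names are hypothetical.

```python
import math

def solve(A, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(A)
    M = [row[:] + [r] for row, r in zip(A, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= factor * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x

def reference_step(ts, bs, ref, n):
    """Step (a): solve (15), i.e. sum_k a_k t_i^k + (-1)^i beta = b(t_i)."""
    A = [[ts[j] ** k for k in range(n)] + [(-1) ** i] for i, j in enumerate(ref)]
    *coeffs, beta = solve(A, [bs[j] for j in ref])
    return coeffs, beta

def exchange(ref, err, j):
    """Step (c): insert grid index j so that the error signs keep alternating."""
    same = lambda i: err[i] * err[j] > 0
    if j < ref[0]:
        return [j] + ref[1:] if same(ref[0]) else [j] + ref[:-1]
    if j > ref[-1]:
        return ref[:-1] + [j] if same(ref[-1]) else ref[1:] + [j]
    i = max(k for k, r in enumerate(ref) if r < j)   # ref[i] < j < ref[i + 1]
    new = ref[:]
    new[i if same(ref[i]) else i + 1] = j
    return new

def ascent(ts, bs, n, tol=1e-10, max_iter=100):
    ref = [round(i * (len(ts) - 1) / n) for i in range(n + 1)]  # initial reference
    for _ in range(max_iter):
        coeffs, beta = reference_step(ts, bs, ref, n)
        err = [b - sum(c * t ** k for k, c in enumerate(coeffs))
               for t, b in zip(ts, bs)]
        j = max(range(len(ts)), key=lambda i: abs(err[i]))
        if abs(err[j]) <= abs(beta) + tol:            # step (b): bounds in (14) meet
            return coeffs, abs(beta), abs(err[j]), ref
        ref = exchange(ref, err, j)
    raise RuntimeError("no convergence within max_iter")

ts = [k / 100 for k in range(101)]                    # grid on [0, 1]
bs = [math.exp(t) for t in ts]
coeffs, lower, upper, ref = ascent(ts, bs, n=2)       # best line for exp on the grid
print(coeffs, lower, upper, ref)
```

The returned pair `lower`, `upper` corresponds to the two sides of the error estimate (14) on the grid; the iteration stops as soon as they agree up to the tolerance.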
Proof. (1) The determinant of coefficients of (15) is equal to
$-\gamma \overset{\mathrm{def}}{=} -\sum_{i=1}^{n+1} |h_i|\,\|f_i\| < 0$. From (15) it follows that
$$f_i(b - u) = (-1)^i \rho(h_i)\,\|f_i\|\,\beta; \tag{16}$$
therefore, $\|b - u\|_F = \beta$ because of (13). We define
$$f(v) \overset{\mathrm{def}}{=} \gamma^{-1} \sum_{i=1}^{n+1} (-1)^i h_i f_i(v). \tag{17}$$
According to the construction of $h_i$ and the theorem on the expansion of
determinants, $f(u_k) = 0$, $k = 1,\dots,n$; thus, $f(v) = 0$ for all $v \in M$. From (16)
it follows that
$$f(b) = f(b - u) = \beta.$$
Thus, for all $v \in M$,
$$\|b - u\|_F = \beta = f(b - v) \le \gamma^{-1} \sum_{i} |h_i|\,|f_i(b - v)| \le \|b - v\|_F.$$
(2) This follows from the equivalence of (16) to (15) and Theorem 39.C
below. □
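The construction (17) can be made concrete in the setting of Example 39.12. The sketch below assumes point functionals $f_i(v) = v(t_i)$ at four hypothetical points and the monomial basis $1, t, t^2$ (so $n = 3$); it computes the Haar determinants of Definition 39.11 and checks that the functional (17) vanishes on $M$. The determinant routine uses Laplace expansion, which is adequate only for such tiny matrices.

```python
import math

def det(A):
    """Determinant by Laplace expansion along the first row (tiny matrices only)."""
    if len(A) == 1:
        return A[0][0]
    return sum((-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
               for j in range(len(A)))

def haar_determinants(points, basis):
    """h_i: determinant of (f_k(u_j)) with the i-th row removed, f_k(u) = u(t_k)."""
    rows = [[u(t) for u in basis] for t in points]
    return [det(rows[:i] + rows[i + 1:]) for i in range(len(rows))]

basis = [lambda t: 1.0, lambda t: t, lambda t: t * t]   # basis of M, n = 3
points = [0.0, 0.25, 0.5, 1.0]                          # reference of n + 1 points
h = haar_determinants(points, basis)
gamma = sum(abs(x) for x in h)                          # (13) with ||f_i|| = 1

def f(v):
    """Functional (17): gamma^{-1} * sum_i (-1)^i h_i v(t_i), i = 1, ..., n + 1."""
    return sum((-1) ** i * hi * v(t)
               for i, (hi, t) in enumerate(zip(h, points), start=1)) / gamma

print(h, [f(u) for u in basis])
```

Since the weights $|h_i|/\gamma$ sum to 1, this $f$ also satisfies $|f(v)| \le \max_t |v(t)|$, which is the norm bound used in the proof of Theorem 39.C.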
In the following, $\rho(h_i)$ has the same meaning as in Corollary 39.14 above.
Theorem 39.C (Abstract Alternation Theorem). With the assumption (H1),
for $u$ in $M$, the following two assertions are equivalent:
(i) $u$ is a solution of the approximation problem (10).
(ii) There exists a regular reference $F = (f_1,\dots,f_{n+1})$ such that
$$f_i(b - u) = (-1)^i \rho(h_i)\,\|f_i\|\,\|b - u\|, \qquad i = 1,\dots,n+1. \tag{18}$$
Proof. We make use of Theorem 39.A, (3) in Section 39.2 in an essential
way.
(i) ⇒ (ii) According to Theorem 39.A, (3), there exists an $f \in M^{\perp}$ such
that $\|f\| = 1$ and $f(b - u) = \|b - u\|$. We choose a basis $\{u_1,\dots,u_n\}$ in $M$
and, according to the Hahn-Banach theorem, we construct functionals
$f_1,\dots,f_n \in X^*$ such that $f_i(u_j) = \delta_{ij}$ and $\|f_i\| = 1$ for $i, j = 1,\dots,n$. For the
reference $F = (f_1,\dots,f_n,f_{n+1})$, where $f_{n+1} = (-1)^{n+1} f$, we have $h_{n+1} = 1$
and $h_1 = \dots = h_n = 0$ since $f(u) = 0$ for all $u \in M$, i.e., $F$ is regular. If we set
$\rho(h_{n+1}) \overset{\mathrm{def}}{=} 1$ and $\rho(h_i) \overset{\mathrm{def}}{=} f_i(b - u)/\big((-1)^i \|b - u\|\big)$ for $1 \le i \le n$, then $|\rho(h_i)| \le 1$
and (18) holds.
(ii) ⇒ (i) We construct $f$ according to (17). Then $f \in M^{\perp}$, $|f(v)| \le \|v\|$ for
all $v \in X$, and $f(b - u) = \|b - u\|$ because of (18). According to Theorem
39.A, (3), $u$ is a solution of (10). □
39.5. Application to Chebyshev Approximation
We consider the problem
$$\min_{u \in M} \|b - u\| = \alpha \tag{19}$$
under the following assumptions:
(H1) $T$ is a compact set in $\mathbb{R}^N$, $T \ne \emptyset$ and $1 \le N < \infty$. Let $X = C(T)$.
Here, $C(T)$ denotes the B-space of continuous functions $v\colon T \to \mathbb{R}$ with the
max norm,
$$\|v\| \overset{\mathrm{def}}{=} \max_{t \in T} |v(t)| \quad \text{for all } v \in X.$$
Suppose $u_1,\dots,u_n \in X$ are linearly independent functions. Let
$$M \overset{\mathrm{def}}{=} \operatorname{span}\{u_1,\dots,u_n\},$$
i.e., $M$ consists of all real linear combinations $c_1u_1 + \dots + c_nu_n$.
Furthermore, $b$ is given and fixed, $b \notin M$.
The classical special case is obtained in the following way:
(H2) $T$ is the interval $[c, d]$ in $\mathbb{R}^1$, $-\infty < c < d < \infty$, and $M$ is the space
of polynomials of degree at most $n - 1$; thus, $u_k(t) \overset{\mathrm{def}}{=} t^{k-1}$, $k = 1,\dots,n$. Finally,
$b\colon [c, d] \to \mathbb{R}$ is a given fixed continuous function, $b \notin M$.
It is crucial that one knows the extreme points of the unit ball in $C(T)^*$.
According to Problem 39.5, they consist of precisely the functionals $\pm\delta_{t_0}$
for arbitrary fixed $t_0 \in T$, where $\delta_{t_0}(u) = u(t_0)$ for all $u \in C(T)$. The
interpolation property in Theorem 39.B, (ii) in Section 39.3 is equivalent to the
following condition:
(H3) For prescribed function values at $n$ distinct points in $T$, there exists
exactly one function $u$ in $M$ that takes on these values.
The condition (H3) is always fulfilled in the classical special case (H2).
Here, the determination of the interpolating polynomial leads to a linear
system of equations whose coefficient determinant is the nonvanishing
Vandermonde determinant. According to Corollary 39.8, with $f_j = \pm\delta_{t_j}$, and
with the proof of Theorem 39.B, (ii), we immediately obtain the following
proposition.
Proposition 39.15. If (H1) holds, then $u$ in $M$ is a solution of (19) if and only
if there exist nonzero real numbers $a_1,\dots,a_m$ and $m$ distinct points $t_1,\dots,t_m$ in
$T$ such that $1 \le m \le n+1$ and
$$|b(t_j) - u(t_j)| = \|b - u\|, \qquad \operatorname{sgn} a_j = \operatorname{sgn}\big(b(t_j) - u(t_j)\big) \tag{20}$$
for $j = 1,\dots,m$ as well as
$$\sum_{j=1}^{m} a_j u_i(t_j) = 0, \qquad i = 1,\dots,n.$$
If, together with (H1), (H3) is also valid, then $m = n+1$, and (19) possesses
exactly one solution $u$.
We sharpen this proposition in the classical special case of polynomial
approximation (H2) and state in addition:
$$|b(t_j) - u(t_j)| = \|b - u\|, \qquad b(t_j) - u(t_j) = (-1)^{j+1}\big(b(t_1) - u(t_1)\big), \qquad j = 1,\dots,n+1. \tag{21}$$
Proposition 39.16 (Alternation Theorem). If (H2) holds, then $u \in M$ is a
solution of (19) if and only if there exist $n+1$ points $t_j$ such that $c \le t_1 < t_2 <
\dots < t_{n+1} \le d$ and (21) holds.
This classical alternation theorem for Chebyshev approximation asserts
that the error curve $t \mapsto b(t) - u(t)$ takes on in at least $n+1$ points values
that are largest in absolute value, and the signs of these values alternate
according to (21). We treat a simple application in Problem 39.6.
Proof. If $u$ is a solution, then (20) holds with $m = n+1$; thus,
$$\sum_{j=1}^{n} a_j u_i(t_j) = -a_{n+1} u_i(t_{n+1}), \qquad i = 1,\dots,n.$$
Cramer's rule yields
$$a_j = \frac{(-1)^{n+1-j}\, a_{n+1}\, h_j}{h_{n+1}}. \tag{22}$$
However, here the Haar determinants $h_i$ in Definition 39.11 with $f_j(u_i) =
u_i(t_j)$ are Vandermonde determinants and hence are all positive. (21)
follows from this.
Conversely, if (21) holds, then (20) follows with $a_{n+1} \overset{\mathrm{def}}{=} \operatorname{sgn}\big(b(t_{n+1}) -
u(t_{n+1})\big)$ and the $a_j$'s defined by (22) for $j = 1,\dots,n$. Thus, by Proposition
39.15, $u$ is a solution of (19). □
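The alternation condition (21) is easy to test on a classical example that is not taken from the text: for $b(t) = t^3$ on $[-1, 1]$ and $M$ the polynomials of degree at most 2 (so $n = 3$), the candidate $u(t) = \tfrac{3}{4}t$ leaves the error $t^3 - \tfrac{3}{4}t$, one quarter of the Chebyshev polynomial $T_3$. The sketch below checks (21) at four points; the norm $\|b - u\|$ is only estimated on a grid, which is an assumption of the illustration.

```python
def b(t):
    return t ** 3

def u(t):
    return 0.75 * t            # candidate best approximation from span{1, t, t^2}

points = [-1.0, -0.5, 0.5, 1.0]                     # n + 1 = 4 alternation points
errors = [b(t) - u(t) for t in points]

grid = [-1.0 + 2.0 * k / 1000 for k in range(1001)]
sup_norm = max(abs(b(t) - u(t)) for t in grid)      # grid estimate of ||b - u||

# First half of (21): the error attains the norm at every alternation point.
equal_magnitude = all(abs(abs(e) - sup_norm) < 1e-12 for e in errors)
# Second half of (21): signs alternate relative to the first point.
alternating = all(abs(errors[j] - (-1) ** j * errors[0]) < 1e-12 for j in range(4))

print(equal_magnitude, alternating, sup_norm)
```

By Proposition 39.16, both checks succeeding is exactly the statement that $u$ solves (19) in this example.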
Problems
39.1. Convex linear combinations. Show: Every $x$ in $S(\mathbb{R}^n)$ can be represented
as a convex linear combination of at most $n$ extreme points of $K(\mathbb{R}^n)$, i.e.,
$x$ lies in the convex hull of these points. For $x \in \operatorname{int} K(\mathbb{R}^n)$, one needs at
most $n+1$ such points.
Hint: Use induction on the dimension. Compare Holmes (1972, L),
page 82.
39.2.* Convex sets in R", systems of inequalities, positive solutions. In this
connection, study Appendices A3(l)-A3(7), interpret these results
geometrically, and infer ideas for the proof from these interpretations.
Hint: Compare Marti (1977, M), pages 28, 208, and Vogel (1967, M),
page 49, for A3(7) (also, cf. Problem 50.4).
39.3. Extension of functionals. Show: If $N$ is a linear subspace of the real
B-space $X$, then each $f\colon N \to \mathbb{R}$ that is an extreme point of $K(N^*)$ has an
extension $f_0\colon X \to \mathbb{R}$ which is an extreme point of $K(X^*)$.
Solution: According to the Alaoglu-Bourbaki theorem (cf. A3(20)),
$K(X^*)$ is weak* compact. We denote the set of all extensions $f_0$ of $f$ such
that $f_0 \in S(X^*)$ by $A$. By the Hahn-Banach theorem, $A \ne \emptyset$. M-S
sequences immediately show that $A$ is a weak* closed subset of $K(X^*)$, i.e.,
$A$ is weak* compact and, by virtue of the Krein-Milman theorem, it has
an extreme point $f_1$. Indirectly by considering restrictions one now easily
shows that $f_1$ is also an extreme point of $K(X^*)$.
39.4. Continuity of functionals. Let $f\colon X \to \mathbb{R}$ be a linear functional on the real
locally convex space $X$ and let $\alpha$ be a fixed real number. Show:
(i) If $f(u) \ge \alpha$ on a neighborhood of a fixed point, then $f$ is continuous.
(ii) If $f(u) \ge 0$ on a neighborhood of zero, then $f = 0$.
Solution: (i) By translation one obtains the boundedness of $f$ on a
neighborhood of zero. Then the homogeneity of $f$ yields the continuity of $f$
at the zero point and hence on $X$.
(ii) By passing from $u$ to $-u$, it follows that $f$ is equal to zero on a
neighborhood of zero.
39.5.* Extreme points in $C(T)^*$. Let $T$ be a compact set in $\mathbb{R}^N$, $T \ne \emptyset$. To each
point $t \in T$ one can assign the $\delta_t$-functional such that $\delta_t(F) \overset{\mathrm{def}}{=} F(t)$ for all
$F \in C(T)$. Show: The set of all extreme points of the unit ball in $C(T)^*$ is
equal to $\{\pm\delta_t : t \in T\}$.
Hint: Compare Holmes (1972, L), page 50. Use the Krein-Milman
theorem and the bipolar theorem A3(21).
39.6. Chebyshev approximation in $\mathbb{R}^1$. Let $f \in C^2[c, d]$, with $-\infty < c < d < \infty$
and $f''(t) > 0$ on $[c, d]$. With the aid of the alternation theorem (Proposition
39.16), determine the Chebyshev approximation of $f$ with respect to
first-degree polynomials.
Solution: $u(t) = a_0 + a_1 t$, where
$$a_1 = \frac{f(d) - f(c)}{d - c}, \qquad a_0 = \tfrac{1}{2}\big(f(c) + f(t_2)\big) - \tfrac{1}{2}(c + t_2)\,a_1.$$
Here, $t_2$ is a solution of $f'(t_2) - u'(t_2) = 0$ (see Fig. 39.5). Compare
Collatz and Albrecht (1972, M), page 123. There one will find numerous
additional exercises.
[Figure 39.5]
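The solution formulas of Problem 39.6 can be evaluated directly. The sketch below is a hypothetical helper, assuming the derivative $f'$ is supplied by the caller and is increasing (which follows from $f'' > 0$); it locates $t_2$ by bisection on $f'(t) = a_1$ and is applied to the sample choice $f = \exp$ on $[0, 1]$.

```python
import math

def best_line(f, df, c, d, iters=200):
    """Chebyshev-optimal line a0 + a1*t for a strictly convex f on [c, d]."""
    a1 = (f(d) - f(c)) / (d - c)            # slope of the chord
    lo, hi = c, d
    for _ in range(iters):                   # bisection for f'(t2) = a1
        mid = 0.5 * (lo + hi)
        if df(mid) < a1:
            lo = mid
        else:
            hi = mid
    t2 = 0.5 * (lo + hi)
    a0 = 0.5 * (f(c) + f(t2)) - 0.5 * (c + t2) * a1
    return a0, a1, t2

a0, a1, t2 = best_line(math.exp, math.exp, 0.0, 1.0)

def err(t):
    return math.exp(t) - a0 - a1 * t

# Alternation at c, t2, d: equal magnitudes with signs +, -, +.
print(err(0.0), err(t2), err(1.0))
```

The three printed values have the same magnitude and alternating signs, which is condition (21) for $n = 2$ with the points $c < t_2 < d$.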
39.7.* The Remes algorithm. We have already described the basic idea of this
algorithm in Section 37.29c. In this connection, study the convergence
proof in Cheney (1966, M), page 95 and Meinardus (1964, M), page 98,
and especially the connection with the Newton method mentioned in
Section 37.29c, which is presented in Meinardus (1964, M), page 105.
39.8.* Chebyshev approximation in $\mathbb{R}^N$. We consider the problem
$$\min_{u \in M} \|u - b\| = \alpha \tag{23}$$
of Chebyshev approximation on a compact set $T$ in $\mathbb{R}^N$ under the
assumptions (H1) in Section 39.4. In particular, $\dim M = n$.
39.8a. Kolmogorov's criterion. $u$ is a solution of (23) if and only if the following
holds: For each $v \in M$ there exists a $t \in T$ such that
$$|u(t) - b(t)| = \|u - b\|, \qquad \big(u(t) - b(t)\big)\,v(t) \ge 0.$$
Hint: Compare Meinardus (1964, M), page 15 and Schonhage (1971, M).
39.8b. Haar's uniqueness theorem. (23) has a unique solution if and only if $M$
possesses the interpolation property, i.e., for arbitrary numbers $\alpha_1,\dots,\alpha_n
\in \mathbb{R}$ and $n$ distinct points $t_1,\dots,t_n \in T$, there exists exactly one $u \in M$
such that $u(t_i) = \alpha_i$, $i = 1,\dots,n$.
Hint: Compare Laurent (1972, M), 3.4.6.
39.8c. Alternation theorem. $u$ in $M$ is a solution of (23) if and only if there exist
$n+1$ distinct points $t_1,\dots,t_{n+1} \in T$ satisfying the alternation condition
$$b(t_j) - u(t_j) = \tau\,(-1)^j \rho(h_j)\,\|b - u\|, \qquad j = 1,\dots,n+1.$$
In this connection, $\tau$ is uniformly equal to 1 or $-1$ for all $j$. Furthermore,
$h_j$ is the Haar determinant which results from the $(n+1) \times n$ matrix $(a_{ik})$,
where $a_{ik} \overset{\mathrm{def}}{=} u_k(t_i)$, by eliminating the $j$th row. Here, $\{u_1,\dots,u_n\}$ is a
basis in $M$. Furthermore, $\rho(h_j) \overset{\mathrm{def}}{=} \operatorname{sgn} h_j$ for $h_j \ne 0$. Otherwise, $\rho(h_j)$ lies
in $[-1, 1]$.
Hint: Compare Kiesewetter (1973, M), page 170.
39.9. Discrete Chebyshev approximation, compensation analysis, and linear
optimization. If one has $n$ measurement values $(x_i, y_i)$ and would like to
produce a linear connection $y = Cx$ by means of compensation, then to
determine $C$ one can also use the method of discrete Chebyshev
approximation instead of the least-squares method of Section 37.12, i.e., one
considers the minimum problem
$$\|y - Cx\| = \min!, \tag{24}$$
where $\|y - Cx\| \overset{\mathrm{def}}{=} \max_{1 \le i \le n} |y_i - Cx_i|$. This problem can be written as a
linear optimization problem of the form
$$f(a, C) = \min!, \qquad y_i - Cx_i \le a, \quad y_i - Cx_i \ge -a \quad \text{for all } i, \tag{25}$$
where $f(a, C) \overset{\mathrm{def}}{=} a$. The simplex algorithm can be applied to the latter
problem.
Show graphically that for $x = (2, 4, 5, 6)$, $y = (1.2, 2.1, 2.6, 3.1)$, the
solution is $C = 0.54$.
Hint: Compare Cheney (1966, M), page 30. There one also finds
algorithms and solution propositions for the case where $C$ is a matrix.
Then $\|y - Cx\|$ equals the corresponding norm $\|\cdot\|_\infty$ in $\mathbb{R}^n$, and, analogous
to the pseudoinverses in Section 37.14, each solution of (24) is a
generalized solution of $y = Cx$. In Section 37.17 we have already described
the application of discrete Chebyshev approximation to the approximate
solution of ordinary and partial differential equations.
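For the data of this problem the minimum in (24) can also be found by a short computation instead of graphically. The sketch below does not use the simplex method; it relies on the elementary observation that $C \mapsto \max_i |y_i - Cx_i|$ is convex and piecewise linear, so that a minimum is attained where two of the lines $\pm(y_i - Cx_i)$ intersect, and it simply enumerates these intersection points. The function names are hypothetical.

```python
def max_deviation(C, xs, ys):
    """Discrete Chebyshev norm ||y - Cx|| from (24)."""
    return max(abs(y - C * x) for x, y in zip(xs, ys))

def discrete_chebyshev_slope(xs, ys):
    """Minimize max_i |y_i - C x_i| over C by enumerating all breakpoints."""
    candidates = []
    for i in range(len(xs)):
        for j in range(i, len(xs)):
            if xs[i] + xs[j] != 0:      # y_i - C x_i = -(y_j - C x_j)
                candidates.append((ys[i] + ys[j]) / (xs[i] + xs[j]))
            if xs[i] != xs[j]:          # y_i - C x_i = y_j - C x_j
                candidates.append((ys[i] - ys[j]) / (xs[i] - xs[j]))
    return min(candidates, key=lambda C: max_deviation(C, xs, ys))

xs = [2.0, 4.0, 5.0, 6.0]
ys = [1.2, 2.1, 2.6, 3.1]
C = discrete_chebyshev_slope(xs, ys)
a = max_deviation(C, xs, ys)            # optimal value of the linear program (25)
print(C, a)
```

The computed slope balances the residuals of the first and last measurement with opposite signs; rounded to two decimals it agrees with the value $C = 0.54$ stated in the problem.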
39.10. Iteration methods for solving linear regular and singular systems of equations.
In this connection, study Marcuk and Kuznecov (1975, S,B). There one
can find extensive material for solving $Ax = y$. Here, $A$ can be a rectangular
matrix, and the solution is to be understood in the sense of $\|Ax - y\|_2
= \min!$, i.e., it is a matter of generalized solutions (pseudoinverses).
39.11. General uniqueness theorem of linear approximation theory. Show: If $M$ is
an $n$-dimensional linear subspace of a normed real linear space $X$, then,
for each $b$ in $X$, (23) has exactly one solution provided the following
condition is not fulfilled:
$$f_j(x) = \|x\| = \|x - y\|, \qquad j = 1,\dots,m,$$
$$\sum_{j=1}^{m} \lambda_j f_j \in S(M^{\perp}).$$
Here, $x \in X$, $y \in M$, $y \ne 0$, $\lambda_1,\dots,\lambda_m > 0$, $\lambda_1 + \dots + \lambda_m = 1$, $m \le n$,
$f_1,\dots,f_m \in X^*$, and all the $f_j$'s are extreme points of the unit ball in $X^*$.
Hint: Compare Holmes (1972, L), page 111. There one also finds a
number of applications, e.g., a proof of the Haar uniqueness theorem of
Problem 39.8b. Give a geometric interpretation of this result (see Fig.
39.6).
[Figure 39.6]
39.12.* Applications of approximation theory. Study the examples in Sections
37.12-37.19 and the literature for these sections. Especially numerous
examples—in particular, applications to partial differential equations—can
be found in Collatz and Krabs (1973, M).
39.13.* The fundamental principle of geometric functional analysis. Study Holmes
(1975, M), page 95. There it is shown that the following propositions are
mutually equivalent: the Hahn-Banach theorem, separation theorems for
convex sets, the support theorem, Krein's extension theorem, the theorem
on subdifferentiability, Tuy's inconsistency theorem, the
Farkas-Minkowski lemma, the Hurwicz saddle point theorem, Golstein's duality theorem,
and the Dubovickii-Miljutin lemma.
39.14.** The Hahn-Banach theorem and the existence of nontrivial quantum fields.
In this connection, study Hofmann (1981). There it is shown that there are
nontrivial quantum fields that satisfy the quantum field theory axioms. The
idea consists in constructing (with the help of extension theorems for
functionals) fields whose topological properties are different from those of
known fields for free particles.
References to the Literature
Classical works: Hahn (1926); Banach (1929); Krein (1938).
History of the Hahn-Banach theorem: Dieudonne (1981, M).
Geometric functional analysis and optimization: Holmes (1975, M).
Survey of separation properties: Klee (1969, S).
Compact convex sets and their applications in functional analysis: Asimow
and Ellis (1982, M) (Krein-Milman theory, Choquet theory, etc.).
Convex cones: Fuchssteiner and Lusky (1981, M).
Geometry of Banach spaces: Beauzamy (1982, M) (cf. also the references
to the literature in the appendix).
Introduction to approximation theory and its numerical methods: Collatz
(1964, M), Sections 19, 25, and 26; Meinardus (1964, M); Cheney (1966,
M,H,B); Collatz and Krabs (1973, M) (many examples of applications).
Approximation of functions by computers: Sauer and Szabo (1967, M);
Vol. III (article by Bulirsch and Stoer); Thacher and Witzgall (1968, M);
Luke (1975, M) (handbook).
Functional analysis and approximation theory: Luenberger (1969, M);
Varga (1971, M) and Holmes (1972, L) (introductions); Singer (1970, M,B)
(standard work); Laurent (1972, M,B); Kiesewetter (1973, M); Holmes
(1975, M).
Convex analysis and approximation theory: Holmes (1972, M)
(introduction); Laurent (1972, M,B); Krabs (1975, M).
Optimal quadrature formulas: Berezin and Zidkov (1966, M), Vol. I,
Chapter 3; Isaacson and Keller (1966, M); Kiesewetter (1973, M); Sobolev
(1974, M) (multiple integrals); Engels (1980, M).
Chebyshev approximation: Meinardus (1964, M); Cheney (1966, M,H,B);
Achiezer (1967, M); Remes (1969, M); Rivlin (1969, M); Schonhage (1971,
M); Laurent (1972, M); Collatz and Krabs (1973, M); Dzjadyk (1977,
M, H, B) (references in the introduction to numerous other monographs).
Rational approximation and Pade approximation: Meinardus (1964, M);
Cheney (1966, M); Collatz and Krabs (1973, M); Baker and Gammel (1970,
P); and Saff and Varga (1977, P) (applications to quantum physics); Baker
(1975, M); Baker and Morris (1981, M).
Nonlinear approximation theory: Collatz and Krabs (1973, M); Krabs
(1975, M).
Approximation theory and splines: Varga (1971, M); Laurent (1972,
M,B); Schultz (1973, M); Prenter (1975, M); de Boor (1978, M) (methods
on computers).
Pseudoinverses in linear equations: Cheney (1966, M); Ben-Israel and
Greville (1973, M); Marcuk and Kuznecov (1975, S,B) (numerous iteration
methods for systems of linear equations); Nashed (1976, P,B).
Approximation of random quantities: Karlin and Studden (1966, M);
Luenberger (1969, M); Rozanov (1975, M) (also, cf. the references to the
literature on stochastic optimization in Section 37.25).
Separation of convex sets and solution of systems of inequalities:
Rockafellar (1970, M); Holmes (1975, M); Marti (1977, M); Gwinner (1981,
S); Konig (1982, S).
Systems of inequalities and approximation theory: Cheney (1966, M).
Hahn-Banach theorem and the existence of quantum fields: Hofmann
(1981).
Generalized Hahn-Banach theorem and basic concepts of geometric
functional analysis: Konig (1982, S).
Survey of the modern development of approximation theory,
optimization theory, and numerical mathematics: International Series of
Numerical Mathematics, Volumes 1-60, Birkhauser, Basel. Pursue this series.
EXTREMAL PROBLEMS WITHOUT
SIDE CONDITIONS
The shortest distance between people is a smile.
In the following three chapters we investigate how the classical condition
$$F'(u) = 0, \tag{N}$$
which is necessary for a local solution of
$$F(u) = \min! \tag{M}$$
for real functions F, carries over to more general problems.
In Chapter 40 we concern ourselves with (N). In Chapter 41 we ask the
question, what operator equations
Bu = 0 (E)
can be written in the form (N)? In combination with (M) there result
existence propositions for (E). In Chapter 42 we clarify the connection
between convex functionals F and monotone operators F'.
The applications deal with:
(α) Classical variational problems for one-dimensional and
multidimensional integrals (Chapters 40 and 42).
(β) Quasilinear elliptic differential equations (Chapter 42).
(γ) Hammerstein integral equations (Chapter 41).
CHAPTER 40
Free Local Extrema of Differentiable
Functionals and the Calculus of
Variations
Only he is driven to method for whom empiricism is burdensome.
Johann Wolfgang von Goethe
Besides, it is an error to believe that rigor in proof is a foe of simplicity... But
the shocking example for my assertion is the calculus of variations. The
treatment of the first and second variations of definite integrals brought with
it to some extent extremely complicated calculations, and the appropriate
development of the old mathematicians avoided rigor. Weierstrass showed us
the way to a new and secure foundation of the calculus of variations.
David Hilbert
(in his Paris lecture, 1900)
In this chapter, in an elementary way, we generalize the known classical
criteria, mentioned in Section 37.1, for free local extrema of differentiable
real functions to functionals. Theorem 40.A in Section 40.2 forms the
foundation of the classical calculus of variations.
A crucial device is this: the study of real functionals $F\colon D(F) \subseteq X \to \mathbb{R}$ on
a real locally convex space X is reduced to the study of real functions $\varphi_h$ of
the real variable t by setting
$$\varphi_h(t) := F(u_0 + th), \qquad t \in \mathbb{R}. \tag{1}$$
Example 40.1. Let $F\colon \mathbb{R}^2 \to \mathbb{R}$ be given as in Fig. 40.1. To $\varphi_h$ there
corresponds the curve which lies above the straight line $t \mapsto u_0 + th$ on the
surface belonging to F.
One obtains information about $\varphi_h$ in a neighborhood of t = 0 from the
classical Taylor theorem
$$\varphi_h(t) = \varphi_h(0) + \sum_{k=1}^{n} \frac{t^k}{k!}\,\varphi_h^{(k)}(0) + R_n. \tag{2}$$
[Figure 40.1]
If $\varphi_h$ is n-times differentiable on $]-t_0, t_0[$, $t_0 > 0$, then (2) holds for all
$t \in\, ]-t_0, t_0[$, where we have
$$R_n = o(t^n) \quad \text{as } t \to 0 \tag{3}$$
for the remainder term $R_n$. This means that $R_n/t^n \to 0$ as $t \to 0$. The proof
can be found, e.g., in Fichtenholz (1972, M), Vol. I, pages 229, 235. To be
precise, $R_n$ has the form
$$n!\,R_n = t^n\bigl(\varphi_h^{(n)}(\vartheta t) - \varphi_h^{(n)}(0)\bigr), \tag{3a}$$
where the number $\vartheta$, $0 < \vartheta < 1$, depends on t and h.
As an illustration of the general method, we assume that F has a local
minimum at $u_0$, i.e., $F(u) \ge F(u_0)$ for all $u \in U(u_0)$ (see Fig. 40.1). Then,
obviously, $\varphi_h$ also has a local minimum at t = 0, i.e., the known classical
condition yields
$$\varphi_h'(0) = 0, \qquad \varphi_h''(0) \ge 0 \tag{4}$$
provided these derivatives exist. (4) follows directly from (2) and (3) with
n = 1,2. Now, (4) already contains the fundamental necessary conditions for
the existence of free local minima that we shall formulate in Theorems 40.A
and 40.B in Section 40.2, only in a somewhat different form. In Sections
40.5 and 40.7 we shall show that the functional analysis results are
generalizations of results from the classical calculus of variations.
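The reduction to a real function of one variable can be tried out numerically. The following is a minimal sketch, assuming a hypothetical quadratic sample functional on the plane with a minimum at the origin; the derivatives of the restricted function at t = 0 are approximated by central differences, so the values are approximations and not exact derivatives.

```python
# phi_h(t) = F(u0 + t*h) for a sample F on R^2 with a minimum at u0 = (0, 0).

def F(u):
    x, y = u
    return x * x + 3.0 * y * y + x * y   # hypothetical sample functional

def phi(F, u0, h, t):
    # restriction of F to the line t -> u0 + t*h
    return F([a + t * b for a, b in zip(u0, h)])

def d1(F, u0, h, eps=1e-4):
    # central difference approximating the first derivative of phi_h at t = 0
    return (phi(F, u0, h, eps) - phi(F, u0, h, -eps)) / (2 * eps)

def d2(F, u0, h, eps=1e-4):
    # central difference approximating the second derivative of phi_h at t = 0
    return (phi(F, u0, h, eps) - 2 * phi(F, u0, h, 0.0) + phi(F, u0, h, -eps)) / eps ** 2

u0 = [0.0, 0.0]
directions = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0], [-2.0, 0.5]]
checks = [(d1(F, u0, h), d2(F, u0, h)) for h in directions]
for h, (first, second) in zip(directions, checks):
    print(h, first, second)
```

For every sampled direction the first derivative is (numerically) zero and the second is nonnegative, which is what the necessary condition (4) requires at a local minimum.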
Furthermore, in this chapter we explain two important general methods
for obtaining sufficient criteria for the existence of free local minima:
(i) Investigation of the second variation,
(ii) Construction of comparison functionals.
In connection with (i) we consider accessory quadratic variational problems
and eigenvalue criteria. In Section 40.7 we show that classical sufficient
conditions for one-dimensional variational problems [Jacobi's criterion
(respectively, the criterion of field theory)] are obtained from (i) [respectively,
(ii)]. Furthermore, we treat applications to multidimensional variational
problems. In particular, in Section 40.6 we elucidate the relationship
between eigenvalue problems and sufficient conditions for minima.
In this chapter we frequently use the following assumption:
(H1) $F\colon D(F) \subseteq X \to \mathbb{R}$ is a functional on the real locally convex space X,
and $u_0$ is a given fixed interior point of D(F).
We delve into a number of deep important applications in Problems
40.7-40.14. We handle further important physical applications in Part IV.
40.1. nth Variations, G-Derivative, and F-Derivative
The preceding considerations show that
$$\varphi_h^{(n)}(0) = \left.\frac{d^n F(u_0 + th)}{dt^n}\right|_{t=0} \tag{5}$$
plays an important role in the study of extremal problems.
Definition 40.2. If (H1) holds, then we define the nth variation of F at the
point $u_0$ in the direction h by
$$\delta^n F(u_0; h) := \varphi_h^{(n)}(0), \tag{6}$$
for $h \in X$, when the derivative appearing on the right-hand side exists. We
write $\delta$ for $\delta^1$.
If the right-sided derivative $(\varphi_h')_+(0)$ exists, then we define the one-sided
directional derivative of F at $u_0$ in the direction h by
$$\delta_+ F(u_0; h) := (\varphi_h')_+(0). \tag{7}$$
Here,
$$(\varphi_h')_+(0) = \lim_{t \to +0} \frac{F(u_0 + th) - F(u_0)}{t}.$$
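A one-sided directional derivative can exist where the first variation does not. As a hedged illustration, the sketch below takes the Euclidean norm on the plane as a sample functional (an assumption made for this example only) and approximates the right-sided and left-sided derivatives of the restricted function at the origin by difference quotients.

```python
import math

def F(u):
    # sample functional: Euclidean norm on R^2 (not differentiable at the origin)
    return math.hypot(u[0], u[1])

def one_sided(F, u0, h, t=1e-6):
    # forward difference quotient approximating the right-sided derivative at t = 0
    shifted = [a + t * b for a, b in zip(u0, h)]
    return (F(shifted) - F(u0)) / t

u0 = [0.0, 0.0]
h = [3.0, 4.0]
right = one_sided(F, u0, h)                   # approximately the norm of h
left = -one_sided(F, u0, [-c for c in h])     # left-sided derivative of t -> F(u0 + t*h)
print(right, left)
```

The two one-sided values differ in sign, so the two-sided derivative of the restricted function at t = 0 does not exist for this direction, although the one-sided directional derivative does.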
In the following, for F under the assumption (H1), we recall a number of
definitions and propositions that we presented in Chapter 4 in a more
general setting. The functional F is G-differentiable at $u_0$ if and only if there
exists a continuous linear functional $a \in X^*$, that we denote by $F'(u_0)$,
such that
$$\lim_{t \to 0} \frac{F(u_0 + th) - F(u_0)}{t} = \langle F'(u_0), h\rangle \quad \text{for all } h \in X. \tag{8}$$
$F'(u_0)$ is called the G-derivative (or Gâteaux derivative) of F at $u_0$. We also
briefly write $F'(u_0)h$ for $\langle F'(u_0), h\rangle$. The G-derivative $F'(u_0)$ exists if and
only if $\delta F(u_0; h)$ exists for all $h \in X$ and $h \mapsto \delta F(u_0; h)$ is a continuous
linear functional on X. Then
$$\delta F(u_0; h) = \langle F'(u_0), h\rangle \quad \text{for all } h \in X.$$
Let the X in (H1) be a normed space. The functional F is F-differentiable
at $u_0$ if and only if there exists a continuous linear functional $a \in X^*$, that
we denote by $F'(u_0)$, such that an expansion of the form
$$F(u_0 + h) = F(u_0) + \langle F'(u_0), h\rangle + o(\|h\|) \quad \text{as } h \to 0$$
holds for all h in a neighborhood of zero. $F'(u_0)$ is called the F-derivative
(or Fréchet derivative) of F at $u_0$. The F-differential of F at $u_0$ in the
direction h is defined by $dF(u_0; h) = \langle F'(u_0), h\rangle$.
We again point out that we speak of the existence of the G-derivative
(respectively, of the F-derivative) of F at the point $u_0$ only when F is defined
in a neighborhood of $u_0$. However, for the sake of simplicity, in the
following we will frequently forego an explicit formulation of this fact. The
assertion that F is, say, G-differentiable on M thus always tacitly includes
$M \subseteq \operatorname{int} D(F)$.
Every F-derivative $F'(u_0)$ is also a G-derivative and
$$\delta F(u_0;h) = dF(u_0;h) = \langle F'(u_0),h\rangle \quad \text{for all } h \in X. \tag{9}$$
Conversely, if the G-derivative $F'(u)$ of F is defined for all u in a
neighborhood of $u_0$, $U(u_0)$, and if $F'\colon U(u_0) \subseteq X \to X^*$ is continuous at $u_0$,
then $F'(u_0)$ is also the F-derivative.
If $F'(u_0)$ exists as the F-derivative, then F is continuous at $u_0$.
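That a G-derivative need not be an F-derivative can be seen from a standard textbook-style counterexample on the plane, which is not taken from this chapter and is used here as an assumption: F(x, y) = x^3 y / (x^6 + y^2) with F(0, 0) = 0. The sketch evaluates the difference quotients in a few directions and the values of F along the curve y = x^3.

```python
def F(x, y):
    # sample function: all directional derivatives at the origin vanish,
    # but F is not continuous at the origin
    if x == 0.0 and y == 0.0:
        return 0.0
    return x ** 3 * y / (x ** 6 + y ** 2)

def quotient(h, t):
    # difference quotient (F(0 + t*h) - F(0)) / t
    return F(t * h[0], t * h[1]) / t

ts = [10.0 ** (-k) for k in range(2, 7)]
sample_directions = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, -1.0)]
dir_quotients = {h: [quotient(h, t) for t in ts] for h in sample_directions}
along_cubic = [F(x, x ** 3) for x in ts]   # values on the curve y = x^3

for h in sample_directions:
    print(h, dir_quotients[h][-1])
print(along_cubic)
```

The quotients tend to zero in each sampled direction, consistent with a vanishing G-derivative at the origin, while F stays near 1/2 along the cubic curve. Hence F is not continuous at the origin and cannot be F-differentiable there.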
Furthermore, we recommend that the reader study Chapter 4 regarding
the definition of higher derivatives and higher differentials because we shall
frequently work with these concepts in the sequel. In particular, the nth
variation $\delta^n F(u_0; h)$ coincides with the nth G-differential $d^n F(u_0; h,\dots,h)$
of F at $u_0$ in the direction h. For example, we recall the following
proposition. If X is a B-space, then:
$$\text{the nth F-derivative } F^{(n)}(u_0) \text{ exists} \iff d^n F(u_0; h_1,\dots,h_n) \text{ exists for all } h_1,\dots,h_n \in X. \tag{10}$$
Here, $d^n F$ is the nth F-differential. If one of the conditions in (10) is
fulfilled, then $\delta^n F(u_0; h)$ also exists for all $h \in X$ and
$$\delta^n F(u_0;h) = d^n F(u_0;h) = F^{(n)}(u_0)h^n. \tag{11}$$
Here, $F^{(n)}(u_0)h^n$ stands for $F^{(n)}(u_0)h\cdots h$. In conclusion, we recall the
Taylor formula
$$F(u_0 + h) = F(u_0) + \sum_{k=1}^{n} \frac{\delta^k F(u_0;h)}{k!} + o(\|h\|^n) \quad \text{as } h \to 0 \tag{12}$$
for all h in a suitable neighborhood of zero. It is assumed that F is n-times
F-differentiable in an open ball about $u_0$ and that $F^{(n)}$ is continuous at $u_0$.
In Problem 40.1 we show that (12) follows from the classical Taylor formula
(2).
40.2. Necessary and Sufficient Conditions for Free
Local Extrema
Definition 40.3. Assume (H1) holds. In particular, $u_0 \in \operatorname{int} D(F)$. Then the
functional F has a free local minimum at $u_0$ if and only if there exists a
neighborhood of $u_0$, $U(u_0)$, such that
$$F(u) \ge F(u_0) \quad \text{for all } u \in U(u_0). \tag{13}$$
If ">" holds instead of "≥" for $u \neq u_0$, then we speak of a strict local
minimum.
The adjunct "free" points out that in (13) the neighborhood $U(u_0)$ is not
restricted by side conditions as is the case in Definition 43.1. The
corresponding definitions for local maxima are obtained in an obvious way by
replacing "≥" everywhere with "≤". Figure 40.2 clarifies the definition.
In the following two theorems we formulate necessary and sufficient
conditions for the existence of free local minima, first using variations and
then derivatives. In many applications it is easier to verify the existence of
variations than the existence of derivatives. The corresponding assertions for
local maxima are obtained by replacing F with — F.
Theorem 40.A. Let X be a real locally convex space. Let $F\colon D(F) \subseteq X \to \mathbb{R}$
be given and let $u_0 \in \operatorname{int} D(F)$. Then the following assertions hold:
(1) Necessary conditions. If F has a free local minimum at $u_0$, then:
$$\delta F(u_0;h) = 0 \tag{14}$$
$$\delta^2 F(u_0;h) \ge 0 \tag{15}$$
for all $h \in X$ when these variations exist. For (14) to hold, it suffices that
$\delta F(u_0; h)$ exist for all $h \in X$.
(2) Sufficient condition. Let n be an even number, $n \ge 2$, and let X be a
B-space. Then F has a free strict local minimum at $u_0$ provided the following
hold:
(i) For all $h \in X$ and fixed c > 0,
$$\delta^k F(u_0;h) = 0, \quad k = 1,\dots,n-1, \tag{16}$$
$$\delta^n F(u_0;h) \ge c\|h\|^n. \tag{17}$$
(ii) $u \mapsto \delta^n F(u; h)$ is continuous at $u_0$ and indeed uniformly continuous with
respect to h, i.e., to be precise, for each $\varepsilon > 0$ there exists an $\eta(\varepsilon) > 0$ such that
$$|\delta^n F(u; h) - \delta^n F(u_0; h)| \le \varepsilon\|h\|^n \tag{18}$$
for all $h \in X$ and all $u \in X$ such that $\|u - u_0\| < \eta(\varepsilon)$. Here, it is assumed that
all variations that appear exist.
[Figure 40.2]
In concrete classical variational problems, where F(u) is an integral
expression, the Euler equation (respectively, the Legendre condition)
corresponds to (14) (respectively, (15)). In Section 40.6, within the framework of
the so-called accessory variational problem, we treat a method for verifying
(17) for n = 2.
Proof. (1) follows immediately from (4). To prove (2), let $\varphi_h(t) := F(u_0 + th)$;
therefore, $\varphi_h^{(k)}(t) = \delta^k F(u_0 + th; h)$ for all h in a neighborhood of zero.
The classical Taylor theorem (2), (3a) for t = 1 yields
$$F(u_0 + h) - F(u_0) = \varphi_h(1) - \varphi_h(0) = \frac{\delta^n F(u_0 + \vartheta h; h)}{n!}, \qquad 0 < \vartheta < 1.$$
By (17) and (18) with $\varepsilon = c/2$, for h with $\|h\| < \eta(c/2)$ we thus have:
$$F(u_0 + h) - F(u_0) = \frac{\delta^n F(u_0 + \vartheta h; h)}{n!} \ge \frac{c\,\|h\|^n}{2\,n!}.$$
□
Theorem 40.B. Let X be a real B-space. Let $F\colon D(F) \subseteq X \to \mathbb{R}$ be given and
let $u_0 \in \operatorname{int} D(F)$. Then:
(1) Necessary condition. If F has a free local minimum at $u_0$, then
$$F'(u_0) = 0 \quad \text{(generalized Euler equation)} \tag{19}$$
when $F'(u_0)$ exists as a G-derivative or as an F-derivative.
(2) Sufficient condition. Let n be an even number, $n \ge 2$. Then, F has a
free strict local minimum at $u_0$ when the following two conditions are fulfilled:
(i) For all $h \in X$ and fixed c > 0,
$$F^{(k)}(u_0) = 0, \quad k = 1,\dots,n-1, \tag{20}$$
$$F^{(n)}(u_0)h^n \ge c\|h\|^n. \tag{21}$$
(ii) F is n-times F-differentiable in a neighborhood of $u_0$ and $F^{(n)}$ is
continuous at $u_0$.
Proof. (1) By (14) and (9), $\delta F(u_0; h) = \langle F'(u_0), h\rangle = 0$ for all $h \in X$;
therefore $F'(u_0) = 0$.
(2) This is a special case of Theorem 40.A, (2) taking into consideration
that $\delta^k F(u_0; h) = F^{(k)}(u_0)h^k$ by (11) and
$$|\delta^n F(u; h) - \delta^n F(u_0; h)| \le \|F^{(n)}(u) - F^{(n)}(u_0)\|\,\|h\|^n.$$
□
40.3. Sufficient Conditions by Means of Comparison
Functionals and Abstract Field Theory
Up until now we obtained sufficient conditions for local extrema by
investigating, e.g., the second variation. In the following we describe another
important method. Together with the original problem
$$\min_{u \in M} F(u) = \alpha, \tag{22}$$
we study the comparison problem
$$\min_{u \in M} K(u) = \beta. \tag{23}$$
Theorem 40.C. Let $F, K\colon M \to \mathbb{R}$ be given with $F(u) \ge K(u)$ on the set M. If
(23) has a solution $u_0$ such that $F(u_0) = K(u_0)$, then $u_0$ is also a solution of
(22).
Proof. For all $u \in M$, we have $F(u) \ge K(u) \ge K(u_0) = F(u_0)$. □
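The comparison argument of Theorem 40.C can be checked on a toy case. The sketch below assumes two hypothetical real functions on a finite sample grid that stands in for the set M; it is an illustration of the three hypotheses and the conclusion, not a proof.

```python
def F(u):
    return u ** 4 + u ** 2 + 1.0   # hypothetical original functional

def K(u):
    return u ** 2 + 1.0            # hypothetical comparison functional, K <= F

M = [i / 100.0 for i in range(-300, 301)]   # sample grid standing in for the set M
u0 = min(M, key=K)                          # minimizer of the comparison problem (23)
dominates = all(F(u) >= K(u) for u in M)    # F >= K on M
touches = F(u0) == K(u0)                    # F(u0) = K(u0)
u0_minimizes_F = all(F(u) >= F(u0) for u in M)
print(u0, dominates, touches, u0_minimizes_F)
```

Once the first two flags hold, the third follows from the chain F(u) ≥ K(u) ≥ K(u0) = F(u0), exactly as in the proof.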
This simple idea is the basis for obtaining important sufficient conditions
for minima in classical variational problems within the context of so-called
field theory. Here the crucial step is the construction of K by means of
invariant integrals. We will discuss this in Section 40.7.
40.4. Application to Real Functions in $\mathbb{R}^N$
Example 40.4. Suppose the function $F\colon U(u_0) \subseteq \mathbb{R}^N \to \mathbb{R}$ possesses
continuous partial derivatives of order up to and including n on an open
neighborhood of $u_0$, $U(u_0)$.
According to Example 4.18, F then has continuous F-derivatives up to
and including order n, and for all $u \in U(u_0)$, $h \in \mathbb{R}^N$ and $k = 1,\dots,n$, we
have
$$\delta^k F(u;h) = F^{(k)}(u)h^k = \sum D_{i_1}\cdots D_{i_k}F(u)\,h_{i_1}\cdots h_{i_k}.$$
The summation is over all $i_1,\dots,i_k$ from 1 to N. Furthermore, $u = (\xi_1,\dots,\xi_N)$,
$h = (h_1,\dots,h_N)$ and $D_i = \partial/\partial\xi_i$.
Theorem 40.A in Section 40.2 yields necessary and sufficient conditions
for the existence of a local minimum for F at $u_0$. In particular, from (14) for
n = 1 it follows that if F has a local minimum at $u_0$, then
$$D_i F(u_0) = 0, \quad i = 1,\dots,N. \tag{24}$$
From (16) and (17) for n = 2 it follows that if (24) holds and $\delta^2 F(u_0; h)$ is
positive definite with respect to h, then F has a free strict local minimum at
$u_0$.
By the way, the above formula for $\delta^k F(u_0; h)$ is also obtained directly by
a k-fold differentiation of $F(u_0 + th)$ with respect to t and an evaluation at
t = 0.
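For N = 2 the two conditions of this example, vanishing first partial derivatives as in (24) and a positive definite second variation, can be tested numerically. The following sketch assumes a hypothetical quadratic sample function and approximates the gradient and the Hessian by central differences; positive definiteness of the 2×2 Hessian is checked through its leading principal minors.

```python
def F(u):
    # hypothetical sample function with its minimum at (1, -0.5)
    x, y = u
    return (x - 1.0) ** 2 + 2.0 * (y + 0.5) ** 2 + (x - 1.0) * (y + 0.5)

def gradient(F, u, eps=1e-5):
    # central differences for the first partial derivatives D_i F(u)
    g = []
    for i in range(len(u)):
        up = list(u); up[i] += eps
        dn = list(u); dn[i] -= eps
        g.append((F(up) - F(dn)) / (2 * eps))
    return g

def hessian(F, u, eps=1e-4):
    # central differences for the second partial derivatives D_i D_j F(u)
    n = len(u)
    H = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                v = list(u)
                v[i] += si * eps
                v[j] += sj * eps
                return F(v)
            H[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * eps ** 2)
    return H

def is_positive_definite_2x2(H):
    # Sylvester criterion: both leading principal minors are positive
    return H[0][0] > 0 and H[0][0] * H[1][1] - H[0][1] * H[1][0] > 0

u0 = [1.0, -0.5]
g = gradient(F, u0)
H = hessian(F, u0)
print(g, H, is_positive_definite_2x2(H))
```

The second variation in direction h is the quadratic form of H, so a positive definite H at a point where the gradient vanishes indicates a strict local minimum.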
40.5. Application to Classical Multidimensional
Variational Problems in Spaces of Continuously
Differentiable Functions
We consider the classical variational problem
$$F(u) := \int_G L(x, Du(x))\,dx = \min!, \quad u \in X, \tag{25}$$
where
$$X := \{h \in C^{2m}(\bar G) : D^\beta h = 0 \text{ on } \partial G \text{ for all } \beta,\ |\beta| \le m-1\}.$$
Thus, homogeneous boundary conditions are contained in the condition
$u \in X$. Our goal is to establish the following necessary condition for (25) to
hold:
$$G\colon\ \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha L_{D^\alpha}(x, Du(x)) = 0; \qquad \partial G\colon\ D^\beta u = 0 \ \text{ for all } \beta,\ |\beta| \le m-1. \tag{26}$$
This differential equation is called the Euler equation for (25). The solutions
of (26) without the boundary conditions are called extremals of (25). In
addition, we wish to justify the following formulas for the first and second
variations for arbitrary $u, h \in X$:
$$\delta F(u;h) = \int_G \sum_{|\alpha| \le m} L_{D^\alpha}(x, Du(x))\,D^\alpha h\,dx,$$
$$\delta^2 F(u;h) = \int_G \sum_{|\alpha|,|\beta| \le m} L_{D^\alpha D^\beta}(x, Du(x))\,D^\alpha h\,D^\beta h\,dx. \tag{27}$$
First, we explain the notation. We introduced the symbol $D^\alpha$ for the partial
derivative in Section 21.1. Here, $|\alpha|$ is the order of $D^\alpha$. Let $D^0 u := u$. The
function L is to depend on x and all partial derivatives $D^\alpha u$ up to and
including order m. Let Du be the tuple $(D^\alpha u)_{|\alpha| \le m}$. In particular, for
$G \subset \mathbb{R}^1$, $x \in \mathbb{R}^1$, we have $Du = (u, u', \dots, u^{(m)})$. In order to simplify the
notation, we think of L as a function of x and D, where $D = (D^\alpha)_{|\alpha| \le m}$, and
$D^\alpha \in \mathbb{R}$, $D \in \mathbb{R}^d$. Furthermore, $L_{D^\alpha}$ is the partial derivative of L with respect
to $D^\alpha$. One could also write $L_{D^\alpha u}$ for this. In particular, $L_{D^0} = L_u$.
In the literature oriented toward physical applications, one frequently
uses $\delta u$ instead of h in (27). Integrating by parts in (27), according to
Section 18.2, we obtain
$$\delta F(u;h) = \int_G \Bigl(\sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha L_{D^\alpha}(x, Du(x))\Bigr)\,h\,dx. \tag{28}$$
For this, with $h = \delta u$, one also writes
$$\delta F(u;\delta u) = \int_G \frac{\delta F}{\delta u(x)}\,\delta u(x)\,dx.$$
Therefore, the differential equation in (26) reads briefly as follows:
$\delta F/\delta u(x) = 0$. Here $\delta F/\delta u(x)$ is called the variational derivative and is
frequently applied in mathematical physics.
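The two expressions for the first variation, before and after integration by parts, can be compared on a one-dimensional sample problem. The sketch below assumes the hypothetical integrand L(x, u, u') = u'^2/2 - f u on ]0, 1[ with f = 1, the function u(x) = x(1 - x), and the direction h(x) = sin(πx); integrals are approximated by a midpoint rule, so all values are approximate.

```python
import math

def integrate(g, a=0.0, b=1.0, n=2000):
    # composite midpoint rule
    dx = (b - a) / n
    return sum(g(a + (i + 0.5) * dx) for i in range(n)) * dx

def f(x):
    return 1.0

def u(x):
    return x * (1.0 - x)

def du(x):
    return 1.0 - 2.0 * x

def ddu(x):
    return -2.0

def h(x):
    return math.sin(math.pi * x)      # vanishes at x = 0 and x = 1

def dh(x):
    return math.pi * math.cos(math.pi * x)

def F_along(t):
    # F(u + t*h) for the sample integrand L = u'^2 / 2 - f*u
    return integrate(lambda x: 0.5 * (du(x) + t * dh(x)) ** 2 - f(x) * (u(x) + t * h(x)))

eps = 1e-5
by_definition = (F_along(eps) - F_along(-eps)) / (2 * eps)          # derivative in t at t = 0
before_parts = integrate(lambda x: du(x) * dh(x) - f(x) * h(x))     # L_{u'} h' + L_u h
after_parts = integrate(lambda x: (-ddu(x) - f(x)) * h(x))          # (L_u - (L_{u'})') h
print(by_definition, before_parts, after_parts)
```

All three numbers agree up to quadrature error. The factor multiplying h in the last integral is the variational derivative for this sample integrand, here -u'' - f.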
In (25), for X we introduce either the norm
$$\|u\|_{C^m} = \sum_{|\alpha| \le m} \max_{x \in \bar G} |D^\alpha u(x)|$$
or the norm
$$\|u\|_C = \max_{x \in \bar G} |u(x)|.$$
Definition 40.5. The function u in X is called a weak (respectively, strong)
minimal of (25) if and only if $F\colon X \to \mathbb{R}$ has a free local minimum at u with
respect to the norm $\|\cdot\|_{C^m}$ (respectively, $\|\cdot\|_C$) on X.
The corresponding minimum is said to be weak (respectively, strong).
We have already given the intuitive interpretation of this definition in
Section 37.4 in connection with Fig. 37.12.
Proposition 40.6. Let G be a bounded region in $\mathbb{R}^N$, $N \ge 1$, whose boundary is
piecewise smooth, i.e., $\partial G \in C^{0,1}$. Furthermore, let $L \in C^{m+2}(\bar G \times \mathbb{R}^d)$.
Then the first and second variations exist and (27) and (28) hold for all
$u, h \in X$.
If u is a weak minimal of (25), then
$$\delta F(u;h) = 0, \qquad \delta^2 F(u;h) \ge 0 \quad \text{for all } h \in X \tag{29}$$
and (26) holds.
We treat applications of this proposition to mechanics and elasticity
theory in Part IV. In Problem 40.3, we consider the case where L does not
depend only on one function u, but on m functions $u_1,\dots,u_m$.
Proof. Observe that $\delta^k F(u;h) = \varphi_h^{(k)}(0)$, where $\varphi_h(t) = F(u + th)$. The
differentiation can be carried out under the integral sign because of the
smoothness of all functions. (29) follows from Theorem 40.A in Section
40.2. In particular, (29) yields $\delta F(u;h) = 0$ for all $h \in C_0^\infty(G)$. Now we
obtain (26) from (28) and the lemma on variations in Section 18.1. □
Remark 40.7 (Reduction trick). If one has inhomogeneous boundary
conditions, $\partial G\colon D^\beta u = g$, then one can reduce them to homogeneous ones with
respect to v by replacing u with u = v + w, where w is a fixed function
satisfying the boundary conditions. By the inverse transformation from v to
u, one obtains the differential equation (26) with the corresponding
inhomogeneous boundary conditions. One can also prove this directly in a way
parallel to Theorem 40.A in Section 40.2.
The following very simple example aims to prepare the reader for general
considerations in the next section. The use of the norm $\|\cdot\|_{1,2}$ is important.
Example 40.8. The problem of finding the shortest curve $x \mapsto u(x)$ between
the two points (0,0) and (1,0) in the (x, u)-plane leads to
$$F(u) = \int_0^1 \sqrt{1 + u'^2}\,dx = \min!, \quad u \in X,$$
where $X := \{u \in C^2[0,1] : u(0) = u(1) = 0\}$. According to (26), the
corresponding Euler equation reads as follows:
$$\frac{d}{dx}\,\frac{u'}{\sqrt{1 + u'^2}} = 0, \qquad u(0) = u(1) = 0. \tag{30}$$
A solution of (30) is $u_0 = 0$. The straight lines u = cx + d with arbitrary
constants c, d are extremals.
We will show that $u_0$ is a weak minimal. To this end, we write the
following two norms on X:
$$\|u\|_{C^1} = \max_{0 \le x \le 1} |u(x)| + \max_{0 \le x \le 1} |u'(x)|,$$
$$\|u\|_{1,2} = \Bigl(\int_0^1 (u^2 + u'^2)\,dx\Bigr)^{1/2}.$$
We have $L(u') = \sqrt{1 + u'^2}$ and
$$\delta^2 F(u;h) = \int_0^1 L_{u'u'}(u'(x))\,h'^2\,dx \quad \text{for all } u, h \in X.$$
From the continuity of $L_{u'u'}$ it follows that
$$|\delta^2 F(u;h) - \delta^2 F(u_0;h)| \le \int_0^1 \varepsilon h'^2\,dx \le \varepsilon\|h\|_{1,2}^2,$$
where $\|u - u_0\|_{C^1} < \eta(\varepsilon)$. Since $L_{u'u'}(u_0'(x)) = 1$, we have
$$\delta^2 F(u_0;h) = \int_0^1 h'^2\,dx \ge c\|h\|_{1,2}^2$$
for all $h \in X$ and fixed c > 0. In this connection, we make use of the
Poincaré inequality from Problem 22.1. If we choose $\varepsilon$ sufficiently small,
then
$$\delta^2 F(u;h) \ge 2^{-1}c\|h\|_{1,2}^2 \tag{31}$$
for all $h \in X$ and all $u \in X$, where $\|u - u_0\|_{C^1} < \eta$. Now let $\varphi_h(t) :=
F(u_0 + th)$. Then we have $\varphi_h'(t) = \delta F(u_0 + th; h)$, $\varphi_h''(t) = \delta^2 F(u_0 + th; h)$.
Moreover, $\varphi_h'(0) = 0$. The classical Taylor theorem yields
$$\varphi_h(1) = \varphi_h(0) + 2^{-1}\varphi_h''(\vartheta), \qquad 0 < \vartheta < 1.$$
Then the desired assertion follows immediately from (31) with $u = u_0 + h$:
$$F(u) \ge F(u_0) + 4^{-1}c\|u - u_0\|_{1,2}^2 \quad \text{for all } u \text{ such that } \|u - u_0\|_{C^1} < \eta.$$
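The conclusion of Example 40.8 can be observed numerically. The sketch below assumes the hypothetical perturbations u(x) = a sin(πx) of the straight segment and evaluates the arc length functional by a midpoint rule. It then compares the excess length for a small amplitude with half the second variation at the straight line, which for h = a sin(πx) equals a²π²/4.

```python
import math

def arc_length(du, n=4000):
    # midpoint rule for the integral of sqrt(1 + u'(x)^2) over [0, 1]
    dx = 1.0 / n
    return sum(math.sqrt(1.0 + du((i + 0.5) * dx) ** 2) for i in range(n)) * dx

def perturbation(a, k=1):
    # derivative of u(x) = a*sin(k*pi*x), which satisfies u(0) = u(1) = 0
    return lambda x: a * k * math.pi * math.cos(k * math.pi * x)

F0 = arc_length(lambda x: 0.0)                      # length of the straight segment
lengths = {a: arc_length(perturbation(a)) for a in (0.2, 0.05, 0.01, -0.01)}
second_order = 0.01 ** 2 * math.pi ** 2 / 4         # half the integral of h'^2 for a = 0.01
print(F0, lengths, second_order)
```

Every perturbed curve is longer than the segment, and for the small amplitude the excess is close to the quadratic term predicted by the second variation.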
Counterexample 40.9. We consider the minimum problem
$$F(u) = 2^{-1}\int_G (u_t^2 - u_x^2 - 2fu)\,dx\,dt = \min!, \quad u \in X, \tag{32}$$
where $X := \{u \in C^2(\bar G) : u = 0 \text{ on } \partial G\}$. Here, $G := \{(x,t) : 0 < x < 1,\ 0 < t < t_0\}$
is a rectangle in the (x, t)-plane. The Euler equation reads as
follows:
$$u_{tt} - u_{xx} + f = 0. \tag{33}$$
Furthermore,
$$\delta^2 F(u;h) = \int_G (h_t^2 - h_x^2)\,dx\,dt. \tag{34}$$
We can never have $\delta^2 F(u; h) \ge 0$ for all $h \in X$. For this reason (32)
possesses no local minimum. Analogously one can show that no local
maximum exists.
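The indefiniteness of the second variation (34) can be made concrete with product test functions. The sketch below assumes the hypothetical directions h(x, t) = sin(kπx) sin(jπt/t0), which vanish on the boundary of the rectangle, and evaluates (34) by a midpoint rule with t0 = 1.

```python
import math

def second_variation(k, j, t0=1.0, n=200):
    # midpoint quadrature of the integral of (h_t^2 - h_x^2) over G for
    # h(x, t) = sin(k*pi*x) * sin(j*pi*t/t0), which vanishes on the boundary of G
    dx, dt = 1.0 / n, t0 / n
    total = 0.0
    for a in range(n):
        x = (a + 0.5) * dx
        for b in range(n):
            t = (b + 0.5) * dt
            h_t = math.sin(k * math.pi * x) * (j * math.pi / t0) * math.cos(j * math.pi * t / t0)
            h_x = k * math.pi * math.cos(k * math.pi * x) * math.sin(j * math.pi * t / t0)
            total += (h_t ** 2 - h_x ** 2) * dx * dt
    return total

time_like = second_variation(k=1, j=3)    # direction oscillating faster in t
space_like = second_variation(k=3, j=1)   # direction oscillating faster in x
print(time_like, space_like)
```

For these directions the exact value is (π²/4)(j²/t0 - k² t0), so the sign depends on which variable carries the faster oscillation. The second variation takes both signs, which rules out a local minimum as well as a local maximum.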
If we interpret u(x, t) as the displacement of a string at the time t at the
point x, then (33) describes the equation of the vibrating string under the
influence of the exterior force f. (32) comprises the Hamilton principle of
least action, which stands at the pinnacle of mechanics. However, our
considerations show that this principle is not well posed in the form (32).
Indeed, by the principle of least action physicists do not mean (32) but the
fact that the first variation $\delta F(u; h)$ is equal to zero for all $h \in X$. This is
equivalent to (33). In the sense of Section 43.9, this means that F has a
critical point at u. More correctly, therefore, one must replace "min!" by
"stationary!" in (32) and speak of the principle of stationary action.
Counterexample 40.9 shows that one can solve hyperbolic partial
differential equations by seeking critical points of appropriate functionals.
However, problems for critical points are generally more difficult to solve than
minimum and maximum problems. We shall delve into this in Chapters
44 and 49. In particular, for hyperbolic partial differential
equations, we recommend the works of Rabinowitz (1978a), (1978b), Benci and
Rabinowitz (1979), Brezis, Coron and Nirenberg (1980), and Amann and
Zehnder (1980), which employ deeper-lying topological methods, namely
the Fadell-Rabinowitz index, which generalizes the genus from Chapter 44
and the generalized Morse index of Conley (1978, M). In these works, from
critical points of functionals, periodic solutions of the canonical Hamilton
equations also arise (cf. the Problems for Chapter 49).
Remark 40.10 (Generalized solutions). In the introductory remarks before
the first section of Chapter 18 we have already referred to the fact that
spaces of smooth functions are not appropriate for a general existence
theory for the minimum problem (25). For example, one cannot apply the
basic existence propositions from Section 38.3, since the space X in (25) is
not reflexive. In order to build up an existence theory, one must replace
$C^{2m}(\bar G)$ by the Sobolev spaces $W_p^m(G)$. In this connection, one must ensure
that the integrals appearing above exist and that in the calculation of
$\delta^k F(u_0;h)$ one can carry out the differentiation under the integral sign
according to A2(25). To this end, one needs restricting growth conditions on
L and $L_{D^\alpha}$. We discuss this in Section 42.7.
40.6. Accessory Quadratic Variational Problems and
Sufficient Eigenvalue Criteria for Local Extrema
We shall build up the idea used in Example 40.8 to a general functional
analytical sufficient condition and apply it to multidimensional variational
problems. In this connection, we proceed from the minimum problem
$$F(u) = \min!, \quad u \in X \tag{35}$$
with the corresponding Euler equation
$$\delta F(u;h) = 0 \quad \text{for all } h \in X. \tag{36}$$
Let $u_0$ be a solution of (36). We assume that the second variation can be
written in the form $\delta^2 F(u_0; h) = a(h,h)$, where $a(\cdot,\cdot)$ is bilinear. We
designate the accessory variational problem by
$$\min_{h \in M} a(h,h) = \gamma, \qquad M := \{h \in Y : \|h\|_Z = 1\} \tag{37}$$
with the corresponding so-called Jacobi eigenvalue equation
$$a(h,v) = \mu(h|v)_Z \quad \text{for all } v \in Y. \tag{38}$$
We seek $h \in Y$, $h \neq 0$, and $\mu \in \mathbb{R}$. In Example 40.8, we considered the
following special situation: $X = \{u \in C^2[0,1] : u(0) = u(1) = 0\}$, $\|\cdot\|_X = \|\cdot\|_{C^1}$,
$Y = \mathring{W}_2^1(0,1)$, $Z = L_2(0,1)$. We formulate the following assumptions, where
the following so-called Gårding inequality is crucial:
$$\delta^2 F(u_0;h) \ge c_0\|h\|_Y^2 - d_0\|h\|_Z^2 \tag{39}$$
for all $h \in Y$ and fixed $c_0 > 0$, $d_0 \ge 0$.
(H1) Spaces: $X \subseteq Y \subseteq Z$. Here, X is a real normed space, Y and Z are real
H-spaces and the embedding $Y \subseteq Z$ is compact. $u_0 \in X$ is given and fixed.
(H2) Gårding's inequality. For $F\colon U(u_0) \subseteq X \to \mathbb{R}$, (39) holds. Here,
$U(u_0)$ is a fixed open neighborhood in X.
(H3) Uniform continuity of the second variation. For each $\varepsilon > 0$, there exists
an $\eta(\varepsilon) > 0$ such that
$$|\delta^2 F(u;h) - \delta^2 F(u_0;h)| \le \varepsilon\|h\|_Y^2$$
for all $u, h \in X$ such that $\|u - u_0\|_X < \eta(\varepsilon)$. Here, it is naturally assumed
that $\delta^2 F(u; h)$ exists for all $u \in U(u_0)$ and all $h \in X$.
(H4) Bilinear form. There exists a bounded symmetric bilinear form
$a\colon Y \times Y \to \mathbb{R}$ such that $\delta^2 F(u_0; h) = a(h, h)$ for all $h \in Y$.
Theorem 40.D. Assume (H1)-(H4) hold. If $u_0$ is a solution of the Euler
equation (36), then F has a free strict local minimum on X at $u_0$ when any one
of the following mutually equivalent conditions is fulfilled:
(i) a is strongly positive, i.e., (39) holds for $d_0 = 0$.
(ii) a is strictly positive, i.e., $\delta^2 F(u_0; h) > 0$ for all $h \in Y$, $h \neq 0$.
(iii) All eigenvalues $\mu$ in (38) are positive.
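Before proving this theorem we apply it to the general multidimensional
variational problem (25) considered in the preceding section. Then:
$$F(u) = \int_G L(x, Du(x))\,dx. \tag{L}$$
Furthermore, we choose
$$a(h,v) = \int_G \sum_{|\alpha|,|\beta| \le m} L_{D^\alpha D^\beta}(x, Du_0(x))\,D^\beta h\,D^\alpha v\,dx$$
and $Y = \mathring{W}_2^m(G)$, $Z = L_2(G)$,
$$X := \{u \in C^{2m}(\bar G) : D^\beta u = 0 \text{ on } \partial G \text{ for all } \beta,\ |\beta| \le m-1\}$$
with the norm $\|\cdot\|_X = \|\cdot\|_{C^m}$. Parallel to the eigenvalue problem (38), we
formulate the classical Jacobi eigenvalue equation
$$G\colon\ \sum_{|\alpha|,|\beta| \le m} (-1)^{|\alpha|} D^\alpha\bigl[L_{D^\alpha D^\beta}(x, Du_0(x))\,D^\beta h\bigr] = \mu h; \qquad \partial G\colon\ D^\beta h = 0 \ \text{ for all } \beta,\ |\beta| \le m-1. \tag{40}$$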
Problem (38) is exactly the generalized problem corresponding to (40) in
the sense of Part II. Each classical solution h of (40) satisfies (38). To prove
this, one has to multiply (40) by $v \in C_0^\infty(G)$ or, more generally, by $v \in
\mathring{W}_2^m(G)$ and then integrate by parts. If the data are sufficiently smooth,
then, according to the regularity theory, (38) and (40) are mutually
equivalent. This always holds under the assumptions of the following proposition
when we are dealing with a one-dimensional variational problem, i.e.,
$G = \,]a,b[$, $x \in \mathbb{R}^1$.
Proposition 40.11. If $u_0 \in X$ is an extremal of F defined in (L), i.e., $u_0$
satisfies the corresponding Euler equation (26), then $u_0$ is also a strict weak
minimal of F when the following three conditions are fulfilled:
(a) The assumptions of Proposition 40.6 hold.
(b) The differential operator appearing on the left-hand side of (40) is strongly
elliptic in the sense of Section 22.14, i.e., there exists a c > 0 such that
$$\sum_{|\alpha|,|\beta| = m} L_{D^\alpha D^\beta}(x, Du_0(x))\,D^\alpha D^\beta \ge c \sum_{|\alpha| = m} |D^\alpha|^2$$
holds for all $x \in \bar G$. The real variables $D^\alpha$, $D^\beta$ appearing here can take on
arbitrary real values.
(c) One of the conditions (i), (ii), or (iii) of Theorem 40.D holds.
Strong ellipticity means that the symmetric matrix $(L_{D^\alpha D^\beta}(x, Du_0(x)))$ of
second partial derivatives is positive definite with a uniform constant c for
all $x \in \bar G$. The proof of Proposition 40.11 follows directly from Theorem
40.D if one observes that the Gårding inequality follows, according to
Lemma 22.39, from the strong ellipticity. In Problem 40.4 we treat an
application of Proposition 40.11 to the minimal surface problem. One
proves the fact that $u_0$ is a strong minimal in the sense of Definition 40.5
with the aid of field theory. We discuss this in the next section.
Proof of Theorem 40.D. Ad (i) Use the same argument as in Example
40.8.
(ii) ⇒ (i) We define $c(\cdot,\cdot)$ by $a(h,v) = c(h,v) - d_0(h|v)_Z$. According to
(39), c is strongly positive on Y. The bilinear form $(h,v) \mapsto (h|v)_Z$ is
compact on Y because of the compact embedding $Y \subseteq Z$. By Hestenes'
theorem (Problem 22.11), (i) follows from (ii).
For the sake of completeness, we give a direct proof here. According to
Example 38.16, $h \mapsto c(h, h)$ is weakly sequentially lower semicontinuous on
Y and $h \mapsto -d_0(h|h)_Z$ is weakly sequentially continuous on Y, i.e., $h \mapsto
a(h,h)$ is weakly sequentially lower semicontinuous on Y. If (i) does not
hold, then there exists a sequence $(h_n)$ from Y such that $\|h_n\|_Y = 1$ for all
$n \in \mathbb{N}$ and $a(h_n, h_n) \to 0$. Due to the compact embedding $Y \subseteq Z$, $h_n \rightharpoonup h$ in Y
and $h_n \to h$ in Z, possibly only for a subsequence; therefore, $a(h_n,h) \to
a(h, h)$ and
$$c_0\|h_n - h\|_Y^2 - d_0\|h_n - h\|_Z^2 \le a(h_n - h, h_n - h) \to -a(h, h);$$
consequently, $-a(h,h) \ge 0$, i.e., h = 0 according to (ii). However, $h_n \to 0$ in
Y contradicts $\|h_n\|_Y = 1$ for all $n \in \mathbb{N}$.
(iii) ⇒ (ii) It suffices to show that $\gamma > 0$ in (37). First we solve (37).
According to (39), $a(h,h) \ge -d_0$ for all $h \in M$. If $(h_n)$ is a minimal
sequence for (37), then by (39) it is also bounded in Y. Therefore, $h_n \rightharpoonup h$ in
Y, with the possible transition to a subsequence, and hence $h_n \to h$ in Z.
This yields $h \in M$ and $a(h,h) \le \lim a(h_n,h_n) = \gamma$. Consequently, h is a
solution of (37) and, according to the argument in Section 18.5 or directly
by Proposition 43.6, h is also a solution of (38). Therefore, $\mu = a(h, h) = \gamma$.
For an arbitrary solution $h \in M$ of (38), $\mu = a(h,h)$; therefore, $\mu \ge \gamma$.
Thus, $\gamma$ is the smallest eigenvalue of (38), i.e., $\gamma > 0$ according to (iii).
(ii) ⇒ (iii) Now we have $\gamma > 0$. For this reason, according to the preceding
considerations, the smallest eigenvalue of (38) is positive. □
40.7. Application to Necessary and Sufficient
Conditions for Local Extrema for Classical
One-Dimensional Variational Problems
Parallel to Section 40.5, we study the one-dimensional variational problem
$$F(u) := \int_a^b L(x,u(x),u'(x))\,dx = \min!, \quad u \in X, \tag{41}$$
where $-\infty < a < b < \infty$,
$$X := \{u \in C^2[a,b] : u(a) = u(b) = 0\}, \quad L \in C^3(\mathbb{R}^3), \tag{42}$$
and the Euler equation
$$\frac{d}{dx}L_{u'}(x,u(x),u'(x)) - L_u(x,u(x),u'(x)) = 0. \tag{43}$$
We will show how one can obtain a number of known classical criteria
from our previous considerations. In particular, we are interested in
sufficient criteria for weak and strong minima in the sense of Definition 40.5
with m = 1. In this connection, we attach no value to a derivation of the
results under the weakest possible assumptions, but rather we will work
out the simple basic ideas as clearly as possible. The essential results
that we present here are contained in Fig. 40.3. The arrows are to be
understood as implications. They indicate necessary (respectively, sufficient)
criteria for minima. In particular, we will show that with the sufficiency
criteria the convexity of L with respect to u' plays an important role. In
addition to the following considerations, we consider the Weierstrass-
Erdmann corner condition and the necessity of the Weierstrass E-condition
in Section 48.7 in connection with the Pontrjagin maximum principle.

[Figure 40.3. Diagram of implications with the labels: free local minimum; weak minimum; strong minimum; Euler equation, Legendre condition; strong Legendre condition, eigenvalue criterion, respectively the Jacobi conjugate points criterion; field theory, Weierstrass E-function.]

First, we note several important expressions. The second variation for (41) reads
as follows:
$$\delta^2F(u;h) = \int_a^b \left[L_{u'u'}(Q)h'^2 + 2L_{u'u}(Q)h'h\right]dx + \int_a^b L_{uu}(Q)h^2\,dx \quad \text{for all } u,h \in X, \tag{44}$$
where $Q := (x,u(x),u'(x))$ and the corresponding Jacobi eigenvalue
equation is
$$-\frac{d}{dx}(Rh') + Ph = \mu h, \quad h \in X. \tag{45}$$
Here,
$$R(x) := L_{u'u'}(Q), \quad P(x) := L_{uu}(Q) - \frac{d}{dx}L_{u'u}(Q).$$
If we observe that $(h^2)' = 2h'h$, then integration by parts in (44)
immediately yields
$$\delta^2F(u;h) = \int_a^b (Rh'^2 + Ph^2)\,dx \quad \text{for all } u,h \in X. \tag{46}$$
Furthermore, we define the Weierstrass E-function
$$E(x,u,u',v') := L(x,u,v') - L(x,u,u') - L_{u'}(x,u,u')(v'-u').$$
By Section 42.3, the convexity of L with respect to u' is equivalent to
$$E(x,u,u',v') \ge 0 \quad \text{for all } x,u,u',v' \in \mathbb{R} \tag{47}$$
(respectively, to
$$L_{u'u'}(x,u,u') \ge 0 \quad \text{for all } x,u,u' \in \mathbb{R}). \tag{48}$$
These two conditions will play a central role in the following.
As an illustration of our results, we use the following example.
Standard Example 40.12. Let $L(x,u,u') = n(x,u)\sqrt{1+u'^2}$, with n > 0 on
$\mathbb{R}^2$. According to Section 37.4a, the solutions $x \mapsto u(x)$ of (41) are the paths
of light rays in a medium with the refraction index n(x,u) at the point
(x,u), where we set the velocity of light equal to one.
Proposition 40.13 (Necessary Condition). Suppose (42) holds. If (41) has a
weak minimum at u, then the Euler equation (43) and the Legendre condition
$$L_{u'u'}(x,u(x),u'(x)) \ge 0 \quad \text{for all } x \in [a,b] \tag{49}$$
hold.
Proof. Proposition 40.6 yields the Euler equation and $\delta^2F(u;h) \ge 0$ for all
$h \in X$. Thus, (49) follows from (46). Namely, if $R(x_0) < 0$ for an $x_0 \in [a,b]$,
then one can choose an $h \in X$ having very large $h'(x_0)$ and small $h(x_0)$, so
that $\delta^2F(u;h) < 0$ holds. However, this is impossible. □
(49) is always fulfilled in the Standard Example 40.12, since L is convex
with respect to u'. Explicitly, $L_{u'u'} = n(1+u'^2)^{-3/2}$ holds.
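The convexity conditions (47) and (48) for the integrand of Standard Example 40.12 can be checked numerically. The following is only a minimal sketch: the value n = 1.5, the grid of slopes, and all function names are illustrative assumptions and do not come from the text.

```python
import math

def lagrangian(n, du):
    """L = n * sqrt(1 + u'^2) for a fixed value n = n(x, u) > 0."""
    return n * math.sqrt(1.0 + du * du)

def lagrangian_du(n, du):
    """Partial derivative L_{u'} = n u' / sqrt(1 + u'^2)."""
    return n * du / math.sqrt(1.0 + du * du)

def lagrangian_dudu(n, du):
    """Closed form L_{u'u'} = n (1 + u'^2)^(-3/2)."""
    return n * (1.0 + du * du) ** (-1.5)

def weierstrass_e(n, du, dv):
    """E = L(v') - L(u') - L_{u'}(u') (v' - u')."""
    return lagrangian(n, dv) - lagrangian(n, du) - lagrangian_du(n, du) * (dv - du)

def second_difference(n, du, step=1e-4):
    """Central second difference of L in u', to compare with the closed form."""
    return (lagrangian(n, du + step) - 2.0 * lagrangian(n, du)
            + lagrangian(n, du - step)) / step ** 2

slopes = [0.25 * k for k in range(-20, 21)]
# Smallest value of E on the grid; convexity predicts a nonnegative value.
min_e = min(weierstrass_e(1.5, du, dv) for du in slopes for dv in slopes)
# Largest deviation between the finite difference and n (1 + u'^2)^(-3/2).
max_gap = max(abs(second_difference(1.5, du) - lagrangian_dudu(1.5, du))
              for du in slopes)
print(min_e, max_gap)
```

On this grid the E-function is expected to vanish only for v' = u', in agreement with the strict positivity of $L_{u'u'}$.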
Proposition 40.14 (The Jacobi Sufficiency Criterion). Suppose (42) holds. If
$u \in X$ is an extremal, i.e., a solution of the Euler equation (43) with the strong
Legendre condition
$$L_{u'u'}(x,u(x),u'(x)) > 0 \quad \text{for all } x \in [a,b], \tag{50}$$
then u is a weak minimal of (41) when one of the following three conditions is
fulfilled:
(i) $\delta^2F(u;h) \ge c\|h\|_{1,2}^2$ for all $h \in X$ and fixed c > 0.
(ii) All eigenvalues $\mu$ of the Jacobi equation (45) are greater than zero.
(iii) If one solves the initial value problem h(a) = 0, h'(a) = 1 for (45) with
$\mu = 0$, then the solution h possesses no zeros on ]a, b].
The zeros $x_k$ of h in (iii) are called conjugate points of a. In this
connection, in (iii), one can use every initial value problem of the form
$h(a) = 0$, $h'(a) = \alpha$, $\alpha \ne 0$, because of the linearity of (45).
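Criterion (iii) lends itself to a numerical check. The following minimal sketch assumes that the coefficient functions R > 0 and P of (45) are available as Python callables; it integrates (45) with $\mu = 0$ as the first-order system h' = p/R, p' = Ph by the classical Runge-Kutta method and reports the first zero of h. The function name and the test integrand $L = u'^2 - u^2$ (so that R = 2, P = -2 and h = sin x) are illustrative assumptions, not taken from the text.

```python
import math

def first_conjugate_point(R, P, a, b, steps=20000):
    """Integrate (R h')' = P h with h(a) = 0, h'(a) = 1.

    Returns the first zero of h in ]a, b], or None if h has no zero there.
    """
    dx = (b - a) / steps
    x, h, p = a, 0.0, R(a)          # p = R h', so p(a) = R(a) * 1

    def rhs(x, h, p):
        return p / R(x), P(x) * h

    for _ in range(steps):
        k1h, k1p = rhs(x, h, p)
        k2h, k2p = rhs(x + dx / 2, h + dx / 2 * k1h, p + dx / 2 * k1p)
        k3h, k3p = rhs(x + dx / 2, h + dx / 2 * k2h, p + dx / 2 * k2p)
        k4h, k4p = rhs(x + dx, h + dx * k3h, p + dx * k3p)
        h_new = h + dx / 6 * (k1h + 2 * k2h + 2 * k3h + k4h)
        p_new = p + dx / 6 * (k1p + 2 * k2p + 2 * k3p + k4p)
        if h > 0.0 and h_new <= 0.0:
            # Locate the sign change by linear interpolation inside the step.
            return x + dx * h / (h - h_new)
        x, h, p = x + dx, h_new, p_new
    return None

# Illustrative integrand L = u'^2 - u^2: the first conjugate point of a = 0 is pi.
x_k = first_conjugate_point(lambda x: 2.0, lambda x: -2.0, 0.0, 4.0)
print(x_k)
```

For b below the returned value the function yields None, which corresponds to the case in which (iii) is fulfilled.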
Example 40.15. If we choose $n = \sqrt{1+u}$ in the Standard Example 40.12,
then $u = 4^{-1}(1+\alpha^2)x^2 + \alpha x$ is a family of extremals. From $u = u(x,\alpha)$,
$u_\alpha(x,\alpha) = 0$, and by the elimination of $\alpha$, we obtain the envelope
$v = 4^{-1}x^2 - 1$ of this family of parabolas (cf. Fig. 40.4).
If we choose a = 0 and a parabola of this family that passes through the
origin, then a short calculation shows that the first conjugate point $x_k$ is the
abscissa of the contact point of the parabola with the envelope v. For b with
$a < b < x_k$, the segment of the parabola over [a, b] is a weak minimal by
Proposition 40.14. In geometrical optics, the envelope corresponds to the
envelope of light rays. This is the so-called caustic. In general, from the
standpoint of the calculus of variations, the points of the caustic are points
having singular behavior. To be more precise, the following occurs: $u(\cdot,\alpha)$
satisfies the Euler equation (43). If one differentiates this equation with
respect to $\alpha$ and sets $h(x) := u_\alpha(x,\alpha)$ for fixed $\alpha$, then h is a solution of the
Jacobi equation (45) with $\mu = 0$ and $u(\cdot,\alpha)$ instead of u. We have h(0) = 0,
h'(0) = 1. Let $\alpha \ne 0$. Then $h(x_k) = 0$ is equivalent to $u_\alpha(x_k,\alpha) = 0$.
According to the theory of envelopes, $u(\cdot,\alpha)$ and v are in contact at $x_k$. These
considerations for the determination of conjugate points can obviously be
extensively generalized. Moreover, we obtain a simple interpretation of the
Jacobi equation.
[Figure 40.4]
Strictly speaking, the assumption $L \in C^3(\mathbb{R}^3)$ made in (42) is not
fulfilled, since n is defined only for u > -1. However, for a given extremal
$u = 4^{-1}(1+\alpha^2)x^2 + \alpha x$, one can modify n for $u < c(\alpha)$, where $c(\alpha)$ is a
suitable constant with $c(\alpha) > -1$, such that $L \in C^3(\mathbb{R}^3)$ holds, u
still remains a solution of the Euler equation, and the Jacobi equation is not
changed (cf. Fig. 40.4).
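The statements of Example 40.15 can be verified by direct evaluation. The following sketch uses only the formulas of the example; the function names and the sample values of $\alpha$ are assumptions made for illustration. It checks that at $x_k = -2/\alpha$ the Jacobi solution $h = u_\alpha$ vanishes and that the parabola touches the envelope there. It also checks the first integral $(1+u)/(1+u'^2) = 1/(1+\alpha^2)$, which holds because L does not depend on x.

```python
def parabola(x, alpha):
    """Extremal u(x, alpha) = (1 + alpha^2) x^2 / 4 + alpha x."""
    return 0.25 * (1.0 + alpha ** 2) * x ** 2 + alpha * x

def parabola_slope(x, alpha):
    """Derivative of the extremal with respect to x."""
    return 0.5 * (1.0 + alpha ** 2) * x + alpha

def jacobi_solution(x, alpha):
    """h(x) = u_alpha(x, alpha) = alpha x^2 / 2 + x, with h(0) = 0, h'(0) = 1."""
    return 0.5 * alpha * x ** 2 + x

def envelope(x):
    """Envelope v(x) = x^2 / 4 - 1 of the family."""
    return 0.25 * x ** 2 - 1.0

def envelope_slope(x):
    return 0.5 * x

def conjugate_point(alpha):
    """Nonzero root of h, defined for alpha != 0."""
    return -2.0 / alpha

for alpha in (-1.0, -0.5):
    x_k = conjugate_point(alpha)
    print(alpha, x_k,
          jacobi_solution(x_k, alpha),                    # zero of h
          parabola(x_k, alpha) - envelope(x_k),           # common point
          parabola_slope(x_k, alpha) - envelope_slope(x_k))  # common tangent
```

For $\alpha < 0$ the conjugate point lies to the right of a = 0, so that it is relevant for intervals [0, b].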
Proof of Proposition 40.14. The sufficiency of (i) as well as (i) $\Leftrightarrow$ (ii)
follow from Proposition 40.11. The strong ellipticity of (45) follows from
(50).
(iii) $\Rightarrow$ (ii) Let $h_0$ be the solution of the initial value problem $h_0(a) = 0$,
$h_0'(a) = 1$ for (45) with $\mu = 0$. We set $w := -h_0'R/h_0$. From (46) we obtain
$$\delta^2F(u;h) = \int_a^b R\left(h' + \frac{wh}{R}\right)^2 dx \ge 0 \quad \text{for all } h \in C_0^\infty(a,b). \tag{51}$$
In this connection, we observe that $R(P + w') = w^2$ by (45) and
$$\int_a^b (h^2w' + 2hh'w)\,dx = \int_a^b (h^2w)'\,dx = 0.$$
$C_0^\infty(a,b)$ is dense in X; therefore (51) also holds for all $h \in X$. If h is a
solution of (45), then, by integration by parts, we obtain
$$\mu\int_a^b h^2\,dx = \delta^2F(u;h);$$
therefore $\mu \ge 0$. However, because of (iii), $\mu = 0$ cannot be an eigenvalue. □
[Figure 40.5]
The following basic definition comprises the situation (depicted
intuitively in Fig. 40.5) that a light ray is embedded in a family of light rays,
where there is no intersection or touching; therefore, in particular, no
caustic occurs.
Definition 40.16. Let $u_0$ be an extremal, i.e., a solution of (43). $u_0$ can be
embedded in a field of extremals if and only if the following three conditions
hold:
(a) There exists an open neighborhood U of the extremal $u_0$ in the (x, u)-
plane and a family of extremals $(u_\alpha)$, where $\alpha$ varies in a neighborhood
of zero.
(b) Exactly one curve $u_\alpha$ of the family passes through each point $(x,u) \in U$;
therefore, $\alpha = \alpha(x,u)$.
(c) The function $\psi$ defined on U by $\psi(x,u) := u_\alpha'(x)$, $\alpha = \alpha(x,u)$, is called a
descent function of the field. We require that $\psi \in C^1(U)$.
In the case where $U = \mathbb{R}^2$, we call the field global. $\psi(x,u)$ is the value of
the derivative with respect to x of the curve $u_\alpha$ through (x, u) at this point.
Proposition 40.17 (Sufficiency Criterion of Field Theory). Suppose (42) holds
and let $u_0 \in X$ be an extremal which can be embedded in a field of extremals.
Then $u_0$ is a strong minimal when L is convex with respect to u', i.e., (47) or
(48) holds.
If the field is global, then $u_0$ yields an absolute minimum for the original
problem (41).
Example 40.18. We consider the Standard Example 40.12 with n = 1. Then
the problem is to find the shortest path in the plane connecting the points
(a,0) and (b,0). The Euler equation reads as follows:
$$\frac{d}{dx}\,\frac{u'}{\sqrt{1+u'^2}} = 0;$$
therefore, $u'/\sqrt{1+u'^2}$ = constant and thus u' = constant, i.e., all nonvertical
straight lines are extremals. The boundary condition u(a) = u(b) = 0 leads
to $u_0(x) = 0$. As the global field we choose $u_\alpha(x) = \alpha$. By Proposition 40.17,
$u_0$ yields a global minimum, as was naturally to be expected.
Proof of Proposition 40.17. Let $U_\varepsilon := \{u \in X : \|u - u_0\|_C < \varepsilon\}$. We
choose $\varepsilon > 0$ so small that all curves belonging to u in $U_\varepsilon$ lie in U (see Fig.
40.5). Furthermore, we define K by
$$F(u) - K(u) = \int_a^b E(x,u(x),\psi(x,u(x)),u'(x))\,dx. \tag{52}$$
From (47) it follows that $F(u) \ge K(u)$ for all $u \in U_\varepsilon$. The point is that we
can write K as a line integral:
$$K(u) = \int_{(a,0)}^{(b,0)} M\,dx + N\,du.$$
With the aid of the Euler equation (43) for $u_\alpha$ and because
$u_\alpha'(x) = \psi(x,u_\alpha(x))$, one easily verifies that
$$M_u = N_x \quad \text{in } U \tag{53}$$
holds (cf. Problem 40.5). Consequently, we obtain the crucial property that
K(u) does not depend on the path, but rather only on the boundary values;
therefore, $K(u) = K(u_0)$ for all $u \in U_\varepsilon$. According to the construction of $\psi$
and E,
$$u_0'(x) = \psi(x,u_0(x))$$
and thus
$$E(x,u_0(x),\psi(x,u_0(x)),u_0'(x)) = 0;$$
therefore, $F(u_0) - K(u_0) = 0$ by (52). From this we obtain
$$F(u) \ge K(u) = K(u_0) = F(u_0) \quad \text{for all } u \in U_\varepsilon.$$
□
K is the comparison functional from Theorem 40.C in Section 40.3.
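For the shortest-path problem of Example 40.18 the comparison functional can be written down explicitly, which gives a small numerical illustration of the chain $F(u) \ge K(u) = K(u_0) = F(u_0)$. With the field $u_\alpha(x) = \alpha$ we have $\psi = 0$, hence $M = L(x,u,0) = 1$ and $N = L_{u'}(x,u,0) = 0$, so that K(u) = b - a for every admissible u. The sketch below assumes the interval [0, 1] and the comparison curve u(x) = 0.3 sin(pi x); both choices and the function names are illustrative only.

```python
import math

def arc_length(u_prime, a, b, steps=2000):
    """Midpoint-rule approximation of F(u) = int_a^b sqrt(1 + u'(x)^2) dx."""
    dx = (b - a) / steps
    total = 0.0
    for i in range(steps):
        slope = u_prime(a + (i + 0.5) * dx)
        total += math.sqrt(1.0 + slope * slope)
    return total * dx

def invariant_integral(a, b):
    """K(u) for the field u_alpha = alpha: psi = 0 gives M = 1, N = 0."""
    return b - a

# Extremal u_0 = 0 and a comparison curve with the same boundary values.
length_extremal = arc_length(lambda x: 0.0, 0.0, 1.0)
length_comparison = arc_length(lambda x: 0.3 * math.pi * math.cos(math.pi * x), 0.0, 1.0)
print(length_extremal, length_comparison, invariant_integral(0.0, 1.0))
```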
Problems and Supplements
40.1. Taylor's formula. Prove (12). Solution: From (2), with t = 1, and (3a), (11),
it follows that:
$$n!\,|R_n| = |\delta^nF(u_0 + \theta h; h) - \delta^nF(u_0; h)| \le \|F^{(n)}(u_0 + \theta h) - F^{(n)}(u_0)\|\,\|h\|^n;$$
therefore, $R_n = o(\|h\|^n)$ as $h \to 0$ because $0 < \theta < 1$ and because of the
continuity of $F^{(n)}$ at $u_0$.
40.2. Quadratic functionals. Let X be a real B-space, and let $a: X \times X \to \mathbb{R}$
be bilinear, symmetric, and bounded (cf. Section 21.5). We set $F(u) =
2^{-1}a(u,u) - b(u)$, where $b \in X^*$. Calculate $\delta^nF(u;h)$ and the F-derivative
$F^{(n)}$. Solution:
$$\delta F(u;h) = a(u,h) - b(h), \quad \delta^2F(u;h) = a(h,h), \tag{54}$$
$$\delta^kF(u;h) = 0 \quad \text{for } k \ge 3.$$
Let b = 0. According to Section 21.5, a can be written as $a(u,v) =
\langle Au,v\rangle$ for all $u,v \in X$, where $A: X \to X^*$ is linear and continuous.
Furthermore, according to Proposition 4.19,
$$F'(u)h = \langle Au,h\rangle, \quad F''(u)hk = \langle Ah,k\rangle; \tag{55}$$
therefore, $F' = A$, $F''(u)$ = constant, and $F^{(n)} = 0$ for $n \ge 3$.
If we combine these results with Theorems 40.A and 40.B in Section
40.2, then we obtain an overview of the local minimum problems for
quadratic functionals. The definiteness condition (17) [respectively, (21)]
with n = 2 is identical to the strong positiveness of a. Then, by
Proposition 38.17, F possesses even a strict global minimum on X when X is
reflexive.
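Formula (54) can be illustrated in the finite-dimensional case $X = \mathbb{R}^2$. In the following sketch the matrix A, the vector b and the points u, h are arbitrary illustrative choices, and the function names are assumptions. Since F is quadratic, central difference quotients reproduce the first and second variation up to rounding errors.

```python
def bilinear(A, u, v):
    """a(u, v) = <A u, v> for a symmetric matrix A."""
    n = len(u)
    return sum(A[i][j] * u[j] * v[i] for i in range(n) for j in range(n))

def functional(A, b, u):
    """F(u) = a(u, u) / 2 - b(u)."""
    return 0.5 * bilinear(A, u, u) - sum(bi * ui for bi, ui in zip(b, u))

def first_variation(A, b, u, h):
    """delta F(u; h) = a(u, h) - b(h), see (54)."""
    return bilinear(A, u, h) - sum(bi * hi for bi, hi in zip(b, h))

def second_variation(A, h):
    """delta^2 F(u; h) = a(h, h), independent of u."""
    return bilinear(A, h, h)

def shifted(u, h, t):
    return [ui + t * hi for ui, hi in zip(u, h)]

A = [[2.0, 1.0], [1.0, 3.0]]
b = [1.0, -1.0]
u = [0.5, -2.0]
h = [1.0, 0.25]
t = 1e-3

# Difference quotients of t -> F(u + t h) at t = 0.
fd_first = (functional(A, b, shifted(u, h, t)) - functional(A, b, shifted(u, h, -t))) / (2 * t)
fd_second = (functional(A, b, shifted(u, h, t)) - 2 * functional(A, b, u)
             + functional(A, b, shifted(u, h, -t))) / t ** 2
print(fd_first, first_variation(A, b, u, h))
print(fd_second, second_variation(A, h))
```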
40.3. Variational problems for several sought functions. Formulate Proposition
40.6 for the case when L depends not only on a function u, but also on k
functions $u_1,\dots,u_k$ and their derivatives up to and including the mth
order. Let $D^\beta u_j = 0$ on $\partial G$ for all j and all $\beta$ such that $|\beta| \le m-1$.
Hint: Replace u and h everywhere by $u = (u_1,\dots,u_k)$ and $h =
(h_1,\dots,h_k)$, respectively. Instead of the Euler equation (26), one obtains a
system where for each $u_j$, $j = 1,\dots,k$, there arises a corresponding
equation which is formally obtained by ignoring $u_j$ as well as all the other $u_i$ in
forming the derivatives; therefore,
$$G: \quad \sum_{|\alpha| \le m} (-1)^{|\alpha|}D^\alpha L_{D^\alpha u_j}(x,Du(x)) = 0, \quad j = 1,\dots,k,$$
$$\partial G: \quad D^\beta u_j = 0 \quad \text{for all } \beta \text{ such that } |\beta| \le m-1.$$
40.4. Minimal surface problem. We consider
$$\int_G \sqrt{1+v_x^2+v_y^2}\,dx\,dy = \min!, \quad v = g \text{ on } \partial G, \quad v \in C^2(\bar G). \tag{56}$$
We have already considered this problem and interpreted it physically in
Problem 6.5. Here it is a question of finding a surface v = v(x, y) with the
smallest area which passes through a given space curve. Let G be a
bounded region in $\mathbb{R}^2$ with $\partial G \in C^{0,1}$, and let $g \in C^2(\bar G)$. Derive the Euler
equation and show that every solution of the Euler equation with
$v \in C^2(\bar G)$ and v = g on $\partial G$ is a weak minimal.
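Solution: We set v = u + g, $X := \{u \in C^2(\bar G) : u = 0 \text{ on } \partial G\}$ and
obtain
$$F(u) := \int_G \sqrt{1+v_x^2+v_y^2}\,dx\,dy = \min!, \quad u \in X.$$
According to Proposition 40.6, the Euler equation reads as follows:
$$\sum_{i=1}^2 \frac{\partial}{\partial x_i}\left(\frac{v_{x_i}}{\sqrt{1+v_x^2+v_y^2}}\right) = 0 \quad \text{in } G,$$
and
$$\delta^2F(u;h) = \int_G \sum_{i,j=1}^2 D_iD_jL\,h_{x_i}h_{x_j}\,dx_1\,dx_2 \quad \text{for all } h \in X,$$
where
$$L = \sqrt{1+v_x^2+v_y^2}, \quad x_1 = x, \quad x_2 = y, \quad D_i = \frac{\partial}{\partial v_{x_i}}.$$
Hence
$$D_iD_jL = (1+v_x^2+v_y^2)^{-3/2}\left(\delta_{ij}(1+v_x^2+v_y^2) - v_{x_i}v_{x_j}\right).$$
The eigenvalues of the matrix $(D_iD_jL)$ are greater than or equal to
$(1+v_x^2+v_y^2)^{-3/2}$; thus they are greater than or equal to $\alpha$ for all $x \in \bar G$ for
a suitable $\alpha > 0$. For this reason,
$$\delta^2F(u;h) \ge c(u)\|h\|_{1,2}^2 \tag{57}$$
for all $h \in X$ and fixed c(u) > 0. Now Proposition 40.11 yields the
assertion.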
The eigenvalues of $(D_iD_jL)$ tend toward zero as $(1+v_x^2+v_y^2) \to \infty$. For
this reason (57) does not hold uniformly for all $u \in X$. This essentially
causes the difficulties of the existence theory for (56) within the context of
Sobolev spaces. We discuss this in Problem 52.1.
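The behavior of the eigenvalues can be made concrete. The following sketch computes the matrix of second derivatives of $L = \sqrt{1+v_x^2+v_y^2}$ with respect to the gradient variables and its two eigenvalues in closed form for a symmetric 2 x 2 matrix. With $w = 1+v_x^2+v_y^2$, a short calculation (trace $w^{-3/2}(w+1)$, determinant $w^{-2}$) gives the eigenvalues $w^{-3/2}$ and $w^{-1/2}$. The function names and sample gradients are illustrative assumptions.

```python
import math

def hessian(vx, vy):
    """Matrix (D_i D_j L) for L = sqrt(1 + vx^2 + vy^2)."""
    w = 1.0 + vx * vx + vy * vy
    s = w ** (-1.5)
    return [[s * (w - vx * vx), -s * vx * vy],
            [-s * vx * vy, s * (w - vy * vy)]]

def eigenvalues_sym2(m):
    """Eigenvalues (smaller, larger) of a symmetric 2 x 2 matrix."""
    mean = 0.5 * (m[0][0] + m[1][1])
    radius = math.hypot(0.5 * (m[0][0] - m[1][1]), m[0][1])
    return mean - radius, mean + radius

for vx, vy in [(0.0, 0.0), (3.0, 4.0), (30.0, 40.0)]:
    w = 1.0 + vx * vx + vy * vy
    small, large = eigenvalues_sym2(hessian(vx, vy))
    # Compare with the closed forms w^(-3/2) and w^(-1/2).
    print((vx, vy), small, w ** (-1.5), large, w ** (-0.5))
```

Both eigenvalues are positive for every gradient but decay as the gradient grows, which is the loss of uniformity described above.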
40.5. Proof of (53). Solution: From $u_\alpha'(x) = \psi(x,u_\alpha(x))$ it follows that
$$u_\alpha''(x) = \psi_x(x,u_\alpha(x)) + \psi_u(x,u_\alpha(x))u_\alpha'(x).$$
The Euler equation (43) yields
$$L_{u'u'}(Q)u_\alpha'' + L_{u'u}(Q)u_\alpha' + L_{u'x}(Q) - L_u(Q) = 0,$$
where $Q = (x,u_\alpha(x),u_\alpha'(x))$. Since a curve $u_\alpha$ goes through each point
(x, u) of U, we obtain
$$L_{u'u'}(P)\psi_x(S) + L_{u'x}(P) = -L_{u'u'}(P)\psi(S)\psi_u(S) - L_{u'u}(P)\psi(S) + L_u(P),$$
where $P = (x,u,\psi(x,u))$, S = (x, u). Since
$$N(S) = L_{u'}(P), \quad M(S) = L(P) - \psi(S)L_{u'}(P),$$
this is equivalent to $N_x = M_u$.
40.6.* Field theory for multidimensional integrals. For our present purpose, study
Klötzler (1971, M), Chapter V. There it is shown that an intimate
connection exists between the construction of invariant integrals that
depend only on the boundary values, the Legendre transformation, and
the Hamilton-Jacobi equation. The variegated connections between field
theory, geometrical optics, and other areas of theoretical physics can be
found in Rund (1966, M).
40.7.** Canonical formalism, symplectic geometry, the Legendre transformation, Lie
algebras, and differential forms. In $\mathbb{R}^{2n}$ one can generate a so-called
symplectic structure by means of the skew-symmetric inner product
$$[x,y] = -\sum_{i=1}^n x_iy_{n+i} + \sum_{i=n+1}^{2n} x_iy_{i-n}.$$
If $q = (q_1,\dots,q_n)$ and $p = (p_1,\dots,p_n)$ are the position coordinates and the
generalized impulse coordinates, respectively, of a mechanical system,
then by means of [x, y] with x = (p, q), there arises a symplectic structure
which is significant for a deep understanding of the canonical Hamilton
formalism, which we have described in its classical form in Section 37.4.
To this end, study Arnold (1974, M), Chapters 8, 9, and Abraham and
Marsden (1978, M).
If we denote by M the n-dimensional manifold of the position
coordinates $q = (q_1,\dots,q_n)$ of a mechanical system, then we can first assign to
each point the corresponding tangent space with tangent space
coordinates $q' = (q_1',\dots,q_n')$, which can be interpreted as the velocity
coordinates. If one varies (q, q'), then a 2n-dimensional manifold TM arises: the
so-called tangent bundle. If one makes all possible first-order differential
forms $\omega = p_1\,dq_1 + \cdots + p_n\,dq_n$ correspond to each point q, then one
obtains the cotangent bundle TM* with the coordinates (p, q). Here p
can be interpreted as the generalized momentum. The Legendre
transformation signifies the transition from the tangent bundle TM to the
cotangent bundle TM*. A symplectic structure can be introduced in TM* in a
natural way, which is locally generated by the alternating differential form
$dp \wedge dq = dp_1 \wedge dq_1 + \cdots + dp_n \wedge dq_n$. The apparatus of differential forms
on manifolds then permits an elegant formulation of the canonical
formalism which is important in the modern theory of linear partial
differential equations within the context of pseudodifferential operators and
Fourier integral operators (cf. Problem 40.12).
Applications of symplectic geometry to field theories in physics can be
found in Kijowski and Tulczyjew (1979, M). Furthermore, we recommend
Chernoff and Marsden (1974, L), Marsden (1974, L), (1981, L), Abraham
and Marsden (1978, M), and Guillemin and Sternberg (1977, M)
(applications of infinite-dimensional canonical systems and symplectic geometry
in mathematical physics).
40.8.** Canonical formalism and perturbation theory for mechanical systems. The
Hamilton canonical equations for i = 1,...,n read as follows:
$$p_i'(t) = -H_{q_i}(p(t),q(t)), \quad q_i'(t) = H_{p_i}(p(t),q(t)). \tag{58a}$$
A basic method for solving these equations consists of passing to new
coordinates
$$I_j = I_j(p,q), \quad \varphi_j = \varphi_j(p,q)$$
so that the structure of the canonical equations is preserved, i.e.,
$$I_i'(t) = -H^*_{\varphi_i}(I(t),\varphi(t)), \quad \varphi_i'(t) = H^*_{I_i}(I(t),\varphi(t)) \tag{58b}$$
for i = 1,...,n. Such transformations are called canonical transformations
when one is dealing with diffeomorphisms. In Arnold (1974, M), Chapter
10, conditions that assure that a canonical transformation exists with
$H^* = H^*(I)$ (Liouville's theorem) are given. Then we obtain an especially
simple solution from (58b):
$$I_i = \text{constant}, \quad \varphi_i(t) = \omega_i(I)t + \varphi_i(0), \quad i = 1,\dots,n, \tag{59}$$
where $\omega_i = \partial H^*/\partial I_i$. Moreover, all $\varphi_i$ are interpreted as angular
coordinates, i.e., for integer values of k, $\varphi_i$ and $\varphi_i + 2\pi k$ describe the same
position of the system. Thus the paths $t \mapsto \varphi_i(t)$ are to be considered on
an n-dimensional torus which consists of all points $(\varphi_1,\dots,\varphi_n)$, where $\varphi_i$
and $\varphi_i + 2\pi k$ are identified for integer values of k. Figure 40.6 shows the
situation for n = 2.
Motions of the form (59) are called quasiperiodic because each
coordinate $\varphi_i$ executes a periodic motion with the angular frequency $\omega_i$. Thus the
physical content of Liouville's theorem is that the complex motion of
numerous mechanical systems can be reduced to simple vibrations with
the choice of appropriate coordinates. $I_j$ and $\varphi_j$ are called the action
variable and the angle variable, respectively. Systems for which the
reduction of (58a) to (58b) with $H^* = H^*(I)$ is possible are called
integrable. The crucial sufficient condition for integrability consists in that
the following three assertions hold for (58a):
(i) There exist n integrals $F_1,\dots,F_n$ of motion with $F_1 = H$ which are in
involution, i.e., the Poisson brackets $\{F_i,F_j\}$ are identically equal to
zero for i, j = 1,..., n. In this connection, by definition,
$$\{F,G\} := \sum_{k=1}^n \left(F_{q_k}G_{p_k} - F_{p_k}G_{q_k}\right).$$
(ii) If we set Ma = {(//, ¢) eR2": F,(p,q)= a,, / =1,...,n) for fixed a,.
then all the F, are independent on Ma, i.e., all the differentials dFt are
linearly independent at each point of Ma.
(iii) Ma is compact and connected.
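Condition (i) can be tested numerically for a given system. The sketch below approximates the Poisson bracket by central differences. The sign convention $\sum_k (F_{q_k}G_{p_k} - F_{p_k}G_{q_k})$ used in the code is an assumption (the involution property does not depend on it), and the example system, two uncoupled harmonic oscillators with $F_1 = H$ and $F_2$ the energy of the first oscillator, is an illustrative choice that does not come from the text.

```python
def partial(f, z, k, step=1e-5):
    """Central difference approximation of the k-th partial derivative of f at z."""
    zp, zm = list(z), list(z)
    zp[k] += step
    zm[k] -= step
    return (f(zp) - f(zm)) / (2.0 * step)

def poisson_bracket(f, g, p, q):
    """{f, g} with the assumed convention sum_k (f_qk g_pk - f_pk g_qk).

    The functions f and g take one list z = p + q of length 2n.
    """
    n = len(p)
    z = list(p) + list(q)
    total = 0.0
    for k in range(n):
        f_p, f_q = partial(f, z, k), partial(f, z, n + k)
        g_p, g_q = partial(g, z, k), partial(g, z, n + k)
        total += f_q * g_p - f_p * g_q
    return total

def first_energy(z):
    """F_2: energy of the first oscillator, z = (p1, p2, q1, q2)."""
    return 0.5 * (z[0] ** 2 + z[2] ** 2)

def hamiltonian(z):
    """F_1 = H: total energy of two uncoupled harmonic oscillators."""
    return first_energy(z) + 0.5 * (z[1] ** 2 + z[3] ** 2)

p, q = [0.3, -1.2], [0.7, 0.4]
print(poisson_bracket(hamiltonian, first_energy, p, q))          # involution
print(poisson_bracket(lambda z: z[2], lambda z: z[0], p, q))     # {q1, p1}
```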
The next crucial step is to consider (58b) with $H^* = H_0(I) + \varepsilon H_1(I,\varphi)$.
Such perturbations appear, e.g., in celestial mechanics because of the
perturbing influence of other planets on the motion of a given planet in the
gravitational field of the sun. The Kolmogorov-Arnold-Moser theory
deals with this and more general perturbations. Roughly speaking, an
essential result reads as follows: In the majority of cases, quasiperiodic
motions again arise as a result of the perturbation of quasiperiodic
motions. In this connection, study Siegel and Moser (1971, M) and Arnold
(1974, M), Appendix 8 as well as Arnold (1963, S) and Sternberg (1969,
M). There one also finds important applications to the stability problem of
our planetary system, which, however, we have not yet been able to solve
in complete generality.
The abstract context for the investigation of perturbed problems is laid
out in the Moser-Nash theorem (the hard implicit function theorem). We
have already discussed this in Problem 5.9. The special difficulties in the
present perturbation problem are that resonances can appear, i.e., for the
unperturbed problem the connection between $\omega$ and I is not bijective, and
formally posed perturbation series contain small divisors (cf. Problem 5.9c
and the works of Kolmogorov, Arnold, and Moser cited in Chapter 5).
Also, in Arnold (1974, M), Chapter 10, a general heuristic principle for
the treatment of perturbed problems is described. The idea, which goes
back to Gauss, is that instead of
$$I_i' = \varepsilon g_i(I,\varphi), \quad \varphi_i' = \omega_i(I) + \varepsilon f_i(I,\varphi), \quad i = 1,\dots,n,$$
one considers the averaged equation
$$J_i' = \varepsilon\bar g_i(J), \quad i = 1,\dots,n,$$
where
$$\bar g_i = (2\pi)^{-n}\int_0^{2\pi}\cdots\int_0^{2\pi} g_i(J,\varphi)\,d\varphi_1\cdots d\varphi_n.$$
One expects that J(t) and I(t) differ very little on [0, T] for $T = 1/\varepsilon$ and
small $\varepsilon$. For n = 1, this can be rigorously proved under suitable
assumptions.
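The averaging principle can be tried out on a simple model with n = 1. The following sketch assumes the illustrative perturbation $g(I,\varphi) = -I + \cos\varphi$, $\omega(I) = 1$, $f = 0$, whose average over $\varphi$ is $\bar g(J) = -J$. None of these choices come from the text. Both systems are integrated with the classical Runge-Kutta method on $[0, 1/\varepsilon]$, and the largest distance between I(t) and J(t) is recorded.

```python
import math

def perturbed_rhs(eps, action, angle):
    """Right-hand sides I' = eps * g(I, phi) and phi' = omega(I) + eps * f(I, phi)."""
    return eps * (-action + math.cos(angle)), 1.0

def averaged_rhs(eps, action):
    """Averaged equation J' = eps * g_bar(J) with g_bar(J) = -J."""
    return -eps * action

def max_gap(eps, steps_per_unit=100):
    """Largest |I(t) - J(t)| on [0, 1/eps] for I(0) = J(0) = 1, phi(0) = 0."""
    horizon = 1.0 / eps
    steps = int(round(horizon * steps_per_unit))
    dt = horizon / steps
    action, angle, averaged = 1.0, 0.0, 1.0
    gap = 0.0
    for _ in range(steps):
        # One RK4 step for the perturbed system (I, phi).
        k1 = perturbed_rhs(eps, action, angle)
        k2 = perturbed_rhs(eps, action + dt / 2 * k1[0], angle + dt / 2 * k1[1])
        k3 = perturbed_rhs(eps, action + dt / 2 * k2[0], angle + dt / 2 * k2[1])
        k4 = perturbed_rhs(eps, action + dt * k3[0], angle + dt * k3[1])
        action += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        angle += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        # One RK4 step for the averaged equation.
        j1 = averaged_rhs(eps, averaged)
        j2 = averaged_rhs(eps, averaged + dt / 2 * j1)
        j3 = averaged_rhs(eps, averaged + dt / 2 * j2)
        j4 = averaged_rhs(eps, averaged + dt * j3)
        averaged += dt / 6 * (j1 + 2 * j2 + 2 * j3 + j4)
        gap = max(gap, abs(action - averaged))
    return gap

for eps in (0.05, 0.01):
    print(eps, max_gap(eps))
```

In this model the gap is of the order of $\varepsilon$ on the whole interval, which is the behavior one expects for n = 1.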
40.9.** Solitons, infinite-dimensional canonical systems, and nonlinear wave
equations. We have already dealt with methods for the solution of the
Korteweg-de Vries equation
$$u_t = 6uu_x - u_{xxx} \tag{60}$$
in Problem 33.7. The significance of this and similar nonlinear wave
equations is that they possess solitary waves, i.e., so-called solitons as
solutions, which exhibit stable behavior upon interaction with other
solitons (cf. Figs. 33.2 and 33.3). For this reason, one is interested in solitons
in many branches of physics (hydrodynamics, crystallography, plasma
physics, short optical impulses, elementary particle physics, general
relativity theory, etc.). The methods for the solution of (60) are based on the
following two important observations:
(i) Inverse scattering theory. If we set D = d/dx and
$$L(t)v(x) = (-D^2 + u(x,t))v(x),$$
$$A(t)v(x) = [4D^3 - 3(uD + Du)]v(x),$$
then, because $L'(t) = u_t$, one can write (60) in the form
$$L' = LA - AL, \tag{61}$$
where $A^* = -A$ for suitable specification of the domain of definition
of A in the H-space $L_2(-\infty,\infty)$. Now, if u is a solution of (60), then
(61) holds. From this it follows that $U^{-1}(t)L(t)U(t)$ is constant with
respect to t, where U(t) = exp tA. The operator U(t) is unitary.
Consequently, e.g., the eigenvalues $\lambda(t)$ of $L(t)v = \lambda(t)v$ are not
dependent on t, i.e., they are conservation quantities for (60), and
there arises the initially surprising fact that (60) possesses an infinite
number of conservation quantities. The application of inverse
scattering theory to the solution of the initial value problem for (60) is also
based on the unitary equivalence of L(0) and L(t). We have explained
the basic idea in Problem 33.7. Details can be found in Lax (1968).
We call (L, A) a Lax pair.
(ii) Canonical equations. One can also write (60) in the form of an
infinite-dimensional or continuous canonical system:
$$u_t = \frac{\partial}{\partial x}\,\frac{\delta H}{\delta u}, \tag{62}$$
where
$$H(u) = \int_{-\infty}^{\infty} (2^{-1}u_x^2 + u^3)\,dx.$$
In fact, for all $h \in C_0^\infty(\mathbb{R})$, one obtains
$$\delta H(u;h) = \int (u_xh_x + 3u^2h)\,dx = \int (-u_{xx} + 3u^2)h\,dx;$$
therefore, $\delta H/\delta u = -u_{xx} + 3u^2$. Zaharov and Faddeev (1971)
succeeded in verifying that (62) is integrable and in giving the action
variables and angle variables analogous to Problem 40.8. Here, one
finds a very natural explanation of the existence of an infinite number
of conservation quantities for (60). The formalism is built up further
in Zaharov and Sabat (1974) and a perturbation formalism for a class
of nonintegrable wave problems is developed in Zaharov (1974).
Together with the above-mentioned works, for this area of problems,
study the monographs of Zaharov (1980, M), and Calogero (1982, M), and
the proceedings by Bullough and Caudrey (1980, P), where numerous
physical applications can be found, as well as the survey articles by
Gelfand and Dikii (1975, S) and Miura (1976, S). As an introduction, we
recommend Lamb (1980, L), and Ablowitz and Segur (1981, M).
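That the Korteweg-de Vries equation in the form (60) possesses solitary waves can be checked by direct substitution. The explicit one-soliton profile $u = -(c/2)\,\mathrm{sech}^2(\sqrt{c}\,(x - ct)/2)$ used in the following sketch is the standard formula adapted to the sign convention of (60); it is not stated in the text and is to be regarded as an assumption of the example, as are the function names and the sample points. The residual $u_t - 6uu_x + u_{xxx}$ is approximated by central differences.

```python
import math

def soliton(x, t, c=1.0):
    """Assumed one-soliton profile for u_t = 6 u u_x - u_xxx with speed c > 0."""
    return -0.5 * c / math.cosh(0.5 * math.sqrt(c) * (x - c * t)) ** 2

def kdv_residual(u, x, t, step=1e-3):
    """Central difference approximation of u_t - 6 u u_x + u_xxx at (x, t)."""
    u_t = (u(x, t + step) - u(x, t - step)) / (2.0 * step)
    u_x = (u(x + step, t) - u(x - step, t)) / (2.0 * step)
    u_xxx = (u(x + 2 * step, t) - 2.0 * u(x + step, t)
             + 2.0 * u(x - step, t) - u(x - 2 * step, t)) / (2.0 * step ** 3)
    return u_t - 6.0 * u(x, t) * u_x + u_xxx

points = [-5.0 + 0.5 * k for k in range(21)]
worst = max(abs(kdv_residual(soliton, x, 0.7)) for x in points)
print(worst)
```

The profile is negative, so that the operator $-D^2 + u$ of the Lax pair has a potential well; the residual stays at the level of the discretization error.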
40.10.* Wave equation, eikonal equation, canonical equations, asymptotic
expansions, Huygens' principle, and geometrical optics. In conjunction with
Section 37.4, we will explain a number of further important connections
which have their origin in geometrical optics and which are of fundamental
significance in mathematical physics as well as in the modern theory of
partial differential equations (cf. Problem 40.12).
40.10a. Variational problem and ordinary differential equations. We proceed from
$$\int L(x,q,q')\,dx = \min! \tag{63}$$
for given fixed boundary values for q. Using the Legendre transformation,
as in Section 37.4, we obtain the canonical equations
$$p' = -H_q, \quad q' = H_p. \tag{64}$$
40.10b. First-order partial differential equations. The Hamilton-Jacobi differential
equation corresponding to (64) reads as follows:
$$S_x + H(x,q,S_q) = 0. \tag{65}$$
In Section 37.4 we have seen that there exists a very intimate relationship
between (64) and (65). The initial value problem for (65) can be solved
with the aid of (64). Conversely, from the solutions of (65), one obtains the
corresponding solutions of (64). From the standpoint of the general theory
of first-order partial differential equations, which one finds, e.g., in Courant
and Hilbert (1953, M), Vol. I or in Caratheodory (1935, M), (64) is a part
of the characteristic system for (65). The solution of the initial value
problem for (65) with the aid of the Cauchy method of characteristics also
leads naturally to (64).
In the special case of geometrical optics, $L = n(x,q)c^{-1}\sqrt{1+q'^2}$ and
$H = -\sqrt{(n/c)^2 - p^2}$. Here, c/n(x, q) is interpreted as the velocity of
light at the point (x, q). Then the eikonal equation
$$S_x^2 + S_q^2 = \left(\frac{n}{c}\right)^2 \tag{66}$$
results from (65).
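The Hamiltonian of geometrical optics follows from L by the Legendre transformation $p = L_{q'}$, $H = pq' - L$. The following sketch compares this with the closed form $H = -\sqrt{(n/c)^2 - p^2}$ for a fixed value of n/c and checks that $S_q = p$, $S_x = -H$ satisfy the eikonal equation (66). The value n/c = 1.3, the sample slopes and the function names are illustrative assumptions.

```python
import math

def lagrangian(n_over_c, dq):
    """L = (n/c) sqrt(1 + q'^2) at a fixed point (x, q)."""
    return n_over_c * math.sqrt(1.0 + dq * dq)

def momentum(n_over_c, dq):
    """p = L_{q'} = (n/c) q' / sqrt(1 + q'^2)."""
    return n_over_c * dq / math.sqrt(1.0 + dq * dq)

def hamiltonian_legendre(n_over_c, dq):
    """H = p q' - L, evaluated through the slope q'."""
    return momentum(n_over_c, dq) * dq - lagrangian(n_over_c, dq)

def hamiltonian_closed(n_over_c, p):
    """Closed form H = -sqrt((n/c)^2 - p^2), valid for |p| < n/c."""
    return -math.sqrt(n_over_c ** 2 - p * p)

n_over_c = 1.3
for dq in (-2.0, -0.4, 0.0, 0.9, 5.0):
    p = momentum(n_over_c, dq)
    h = hamiltonian_legendre(n_over_c, dq)
    # With S_q = p and S_x = -H, (65) turns into S_x^2 + S_q^2 = (n/c)^2.
    print(dq, h, hamiltonian_closed(n_over_c, p), h * h + p * p)
```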
40.10c. Second-order wave equation. We study the wave equation
$$\left(\frac{n}{c}\right)^2u_{tt} - u_{xx} - u_{qq} = 0 \tag{67a}$$
with the corresponding Helmholtz equation
$$k^2v + v_{xx} + v_{qq} = 0, \quad k^2 = \left(\frac{n\omega}{c}\right)^2 \tag{67b}$$
and the so-called characteristic equation
$$\left(\frac{n}{c}\right)^2\psi_t^2 - \psi_x^2 - \psi_q^2 = 0. \tag{67c}$$
If $\psi$ is a solution of (67c), then the surface $\psi(t,x,q)$ = constant in
(t, x, q)-space is called a characteristic. The curves $\tau \mapsto (t(\tau),x(\tau),q(\tau))$
described by the differential equations
$$t'(\tau) = \left(\frac{n}{c}\right)^2\psi_t, \quad x'(\tau) = -\psi_x, \quad q'(\tau) = -\psi_q \tag{67d}$$
are called the bicharacteristics associated with the characteristic $\psi$. If, in
particular, we seek characteristics of the form $\psi(t,x,q) = t - S(x,q)$, then
the eikonal equation (66) for S results from (67c). It is well known that one
is led to characteristics if one seeks surfaces $\psi$ in (t, x, q)-space that
contain the possible discontinuities of the solutions u of the wave equation
(67a).
In order to determine the solutions of the wave equation (67a), we
proceed from the substitution
$$u(x,q,t) = e^{-i\omega t}v(P), \quad v(P) = \sum_{r=0}^{\infty} v_r(P)(-i\omega)^{-r}e^{i\omega S(P)}, \quad P = (x,q). \tag{68}$$
Then v is a solution of the Helmholtz equation (67b). Equating coefficients,
we obtain the eikonal equation (66) for S and the so-called transport
equations
$$2\nabla S\cdot\nabla v_r + v_r\,\Delta S = \Delta v_{r-1}, \quad r = 0,1,2,\dots,$$
with $v_{-1} = 0$, for the amplitudes $v_r$. If one is seeking real solutions, then
the real and imaginary parts of (68) must be considered. Therefore, as a
first approximation, the solution of (67a) has, for large $\omega$, the form
$$u = v_0e^{i\omega S - i\omega t} + \cdots.$$
In order to recognize the significance of (68), we set the refraction index
n = 1. Then
$$u = e^{-i\omega t + 2\pi ix/\lambda}, \quad \lambda = 2\pi c/\omega,$$
and thus the real part $u = \cos(2\pi\lambda^{-1}x - \omega t)$ is a solution of (67a). To this
solution there corresponds a plane wave with wavelength $\lambda$. Therefore, in
(68) we are dealing, roughly speaking, with an asymptotic expansion in
terms of small wave lengths.
If S is a solution of the eikonal equation (66), then the curve S(x, q) =
constant is called a geometric wave front. The corresponding characteristic
$\psi(t,x,q)$ = constant, with $\psi(t,x,q) = t - S(x,q)$, describes geometric
wave fronts which move with variable time t. For the bicharacteristics
which correspond according to (67d), the result is
$$x'(\tau) = S_x, \quad q'(\tau) = S_q, \quad t'(\tau) = \frac{n^2}{c^2}. \tag{69}$$
These curves stand perpendicularly to the wave front (see Fig. 40.7).
[Figure 40.7]
We will show that these curves can be interpreted as light rays, i.e., they are
extremals of the variational problem (63). In order to arrive at symmetric
formulas, we write (63) with $L = n(x,q)c^{-1}\sqrt{1+q'^2}$ in the parametric
form
$$\int nc^{-1}\sqrt{x'^2+q'^2}\,dt = \min!$$
for given fixed boundary values. For this variational problem, the Euler
equations read as follows (according to Problem 40.3):
$$\frac{d}{dt}\,\frac{nx'}{\sqrt{x'^2+q'^2}} = n_x\sqrt{x'^2+q'^2}, \quad \frac{d}{dt}\,\frac{nq'}{\sqrt{x'^2+q'^2}} = n_q\sqrt{x'^2+q'^2}. \tag{70}$$
If we introduce the arc length s, with $ds/dt = \sqrt{x'^2+q'^2}$, we then obtain
$$\frac{d}{ds}\left(n\frac{dx}{ds}\right) = n_x, \quad \frac{d}{ds}\left(n\frac{dq}{ds}\right) = n_q. \tag{70a}$$
Now suppose we are given a solution of (69). Then
$$\frac{dx}{dt} = \frac{c^2S_x}{n^2}, \quad \frac{dq}{dt} = \frac{c^2S_q}{n^2}. \tag{71}$$
(70a) easily follows from this if one observes that $S_x^2 + S_q^2 = n^2/c^2$. Hence,
ds/dt = c/n and $S_xS_{xx} + S_qS_{qx} = nn_x/c^2$.
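The ray equations (71) can be integrated numerically for an explicit solution of the eikonal equation. The following sketch assumes c = 1 and the illustrative refraction index $n(x,q) = e^x$, for which $S(x,q) = e^x$ satisfies (66). These choices and the function names do not come from the text. The code checks that the ray speed equals c/n and records the increase of S along the ray over a time interval of length 2.

```python
import math

def index_over_c(x, q):
    """Assumed ratio n/c = exp(x) of the illustrative medium."""
    return math.exp(x)

def eikonal_s(x, q):
    """S(x, q) = exp(x) solves S_x^2 + S_q^2 = (n/c)^2 for this medium."""
    return math.exp(x)

def grad_s(x, q):
    return math.exp(x), 0.0

def ray_velocity(x, q):
    """Right-hand side of (71): (dx/dt, dq/dt) = (c/n)^2 (S_x, S_q)."""
    s_x, s_q = grad_s(x, q)
    scale = 1.0 / index_over_c(x, q) ** 2
    return scale * s_x, scale * s_q

def trace_ray(x, q, duration, steps=1000):
    """Integrate (71) with the classical Runge-Kutta method."""
    dt = duration / steps
    for _ in range(steps):
        k1 = ray_velocity(x, q)
        k2 = ray_velocity(x + dt / 2 * k1[0], q + dt / 2 * k1[1])
        k3 = ray_velocity(x + dt / 2 * k2[0], q + dt / 2 * k2[1])
        k4 = ray_velocity(x + dt * k3[0], q + dt * k3[1])
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        q += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x, q

x_end, q_end = trace_ray(0.0, 0.3, 2.0)
speed = math.hypot(*ray_velocity(0.5, 0.3))
print(eikonal_s(x_end, q_end) - eikonal_s(0.0, 0.3))  # increase of S along the ray
print(speed, 1.0 / index_over_c(0.5, 0.3))            # ds/dt compared with c/n
```

In this example S grows along the ray by exactly the elapsed time, so that the level curves of S are reached by all rays simultaneously.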
In order to motivate the expression "wave front," let S be a solution of
the eikonal equation (66), and calculate the change in S along a light ray
$t \mapsto (x(t),q(t))$ which satisfies (71). From (71) and (66), we obtain
$$\frac{d}{dt}S(x(t),q(t)) = S_xx' + S_qq' = 1,$$
i.e.,
$$S(x(t_1),q(t_1)) - S(x(t_0),q(t_0)) = t_1 - t_0.$$
From this it results that if light rays start from a wave front at the time $t_0$
as in Fig. 40.7, then they arrive at time $t_1$ at a new wave front when the
family of curves has sufficiently regular behavior. This is the precise form
of Huygens' principle in geometrical optics.
The bicharacteristics play an important role in the solution of the initial
value problem of the wave equation (67a). In this connection, we seek a
solution u of (67a) for which u and the normal derivative of u take on
prescribed values on a surface F in (x, q, t)-space. In the special case when
the plane t = 0 corresponds to F, the initial value problem reads as
follows:
$$u(x,q,0) = u_0(x,q), \quad u_t(x,q,0) = u_1(x,q) \quad \text{for all } (x,q) \in \mathbb{R}^2 \tag{72}$$
and for given fixed functions $u_0$ and $u_1$. Difficulties arise in the initial value
problem when one bicharacteristic contacts F. This case does not occur in
(72). In this connection, study Garding, Kotake, and Leray (1964). Here a
Riemann hypersurface is constructed for the description of the singularities
of the solutions of analytic hyperbolic differential equations. In this
connection, it is a matter of a far-reaching generalization of the classical
idea of conceiving of the singularities and multiple valuedness of analytic
functions, i.e., of the solutions of the Cauchy-Riemann differential
equations as Riemann surfaces.
[Figure 40.8]
Asymptotic series of the form (68), which play a fundamental role in
geometrical optics and quantum theory, are considered in detail in Babic
and Buldyrev (1972, M). In this connection, optical foci cause special
difficulties. These are points at which light rays $x \mapsto q(x)$ intersect or come
into contact (see Fig. 40.8). The envelope of light rays is called a caustic. In
Problem 40.11 we refer to a method developed by Maslov, which allows
one to extend asymptotic expansions beyond optic foci. Important
asymptotic formulas for small wave lengths for the refraction of light on a convex
body were derived by Buslaev (1964).
40.10d. Electromagnetic waves and geometrical optics. The real physical background
for the asymptotic expansion (68) is the transition from electromagnetic
waves to the limiting case when the wavelength tends to zero. Geometrical
optics corresponds to this limiting case. In order to clarify this, we first
note the Maxwell equations in the international MKSA system:
$$\operatorname{curl} E = -\mu H_t, \quad \operatorname{curl} H = \varepsilon E_t, \quad \operatorname{div} \varepsilon E = 0, \quad \operatorname{div} \mu H = 0. \tag{73}$$
E and H are the three-dimensional vectors of the electric and magnetic
field strength, respectively. We consider (73) in a region in which there is a
homogeneous medium, but no charges or currents. The material constants
$\varepsilon$ and $\mu$ depend on the position coordinates x, y, z and are called the
dielectric and permeability constants, respectively. An analogous
substitution for E and H as in (68), i.e.,
$$E = E_0(x,y,z)e^{-i\omega t + i\omega S(x,y,z)} + \cdots, \quad H = H_0e^{-i\omega t + i\omega S} + \cdots$$
yields the eikonal equation
$$S_x^2 + S_y^2 + S_z^2 = \frac{n^2}{c^2}, \quad \frac{n^2}{c^2} = \varepsilon\mu. \tag{74}$$
Here, c is the velocity of light in a vacuum. The expansions for E and H in
the form (68) are expansions in terms of small wavelengths $\lambda$. By geometrical
optics, one understands that of all the physical quantities, only the first
approximation with respect to small $\lambda$ or large $\omega$ is considered. Now,
what is the physical definition of a light ray? In order to give a physically
motivated meaningful definition, we consider the vector P = E × H of
energy flow density. This vector describes the flow of energy. If we
designate the surfaces S = constant corresponding to E and H as wave
fronts, then, as a first approximation, the time averaged vector P stands
perpendicular to the wave front. Now, we designate as light rays the curves
which have the first approximation to P as tangent vectors, i.e., which
intersect the wave front perpendicularly. In the language of
hydrodynamics, light rays are thus the stream lines of the vector field of the energy
flow density and, indeed, in the first approximation.
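The passage from (73) to (74), and the statement that P is perpendicular to the wave front, can be checked at leading order in ω. The following sketch is only a formal computation: it assumes the phase normalization E = E_0 e^{-iωt + iωS}, H = H_0 e^{-iωt + iωS} with real amplitudes E_0, H_0 (an assumption about the precise form of (68)) and keeps only the terms of highest order in ω.

```latex
\begin{align*}
% highest order in \omega: \partial_t \mapsto -i\omega, \quad \nabla \mapsto i\omega\,\nabla S
&\nabla S \times E_0 = \mu H_0, \qquad
 \nabla S \times H_0 = -\varepsilon E_0, \qquad
 \nabla S \cdot E_0 = 0, \\
% eliminate H_0 from the first two relations
&\nabla S \times (\nabla S \times E_0)
   = (\nabla S \cdot E_0)\,\nabla S - |\nabla S|^2 E_0
   = -\varepsilon\mu\, E_0
 \;\Longrightarrow\; |\nabla S|^2 = \varepsilon\mu, \\
% direction of the energy flow
&E_0 \times H_0
   = \frac{1}{\mu}\, E_0 \times (\nabla S \times E_0)
   = \frac{|E_0|^2}{\mu}\,\nabla S .
\end{align*}
```

The second line is the eikonal equation (74); the third line shows that, in this approximation, E_0 × H_0 is parallel to grad S and hence normal to the surfaces S = constant.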
In this connection, study the comprehensive standard work on
geometrical optics by Born and Wolf (1959, M) (in particular, Chapter 3). There,
within the context of the above discussion, one also finds a proof of the
Fermat principle of shortest light path that we postulated in Section 37.4.
Furthermore, study Luneburg (1964, M).
One is also led to the eikonal equation (74) if one looks for surfaces
t = S(x, y, z) on which jumps of the solutions E and H of the Maxwell
equations (73) appear. In this connection, study Courant and Hilbert
(1953, M), Vol. II, Smirnov (1956, M), Vol. IV, Section 167, and Jeffrey
and Taniuti (1964, M). There it is shown, in general, how one can
investigate the structure of wave phenomena for the equations of
mathematical physics (hydrodynamics, magnetohydrodynamics, gas dynamics,
elasticity theory, etc.) without solving the equations, by considering the
characteristics, i.e., the surfaces that can contain the discontinuities of the
solutions.
40.10e. Huygens' principle. This important heuristic physical principle, that was
developed for the qualitative description of refraction processes in light
waves and other wave phenomena, reads as follows: Every point of a wave
front is the starting point of elementary waves (spherical waves), where the
new wave front appears as the envelope of these elementary waves (see Fig.
40.9).
Figure 40.9
We have already given an exact formulation of this principle in
geometrical optics right after (71). For the wave equation
$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} - \sum_{i=1}^{3}\frac{\partial^2 u}{\partial \xi_i^2} = f, \tag{75}$$
where x = (ξ1, ξ2, ξ3), there results an exact formulation from the basic
Kirchhoff formula
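$$u(x_0,t) = \frac{1}{4\pi}\int_{\partial G}\left(\frac{1}{r}\left[\frac{\partial u}{\partial n}\right] + \frac{1}{cr}\left[\frac{\partial u}{\partial t}\right]\frac{\partial r}{\partial n} - [u]\,\frac{\partial}{\partial n}\frac{1}{r}\right)dO + \int_G \frac{[f]}{4\pi r}\,dV, \qquad x_0 \in G,\ t > 0. \tag{76}$$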
In this connection, let r = \x — x0\ and let G be a bounded region in R3
with a sufficiently smooth boundary. Let [f] denote f(x, t − r/c). For
sufficiently smooth f and each sufficiently smooth solution u of (75) in G,
the formula (76) holds (cf. Smirnov (1956, M), Vol. II, Section 202). If we
set f = 0 and proceed from the substitution
$$u(x,t) = v(x)\,e^{i\omega t}, \tag{77}$$
then v satisfies the Helmholtz equation, and from (76) we immediately
obtain
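$$v(x_0) = \frac{1}{4\pi}\int_{\partial G}\left(\frac{e^{-i\omega r/c}}{r}\,\frac{\partial v}{\partial n} - v\,\frac{\partial}{\partial n}\frac{e^{-i\omega r/c}}{r}\right)dO \tag{78}$$
for x0 ∈ G, t > 0. If we interpret r^{-1} e^{-iωr/c} as a spherical wave, then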
according to (78), the function u(x0,t) results from the superposition of
spherical waves.
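The step from the Kirchhoff formula to the time-harmonic representation can be made explicit. The sketch below assumes the retarded-value notation [g] = g(x, t − r/c), the time dependence u(x,t) = v(x) e^{iωt}, the exterior normal n on ∂G, and the abbreviation k = ω/c (the symbol k is introduced here only for illustration).

```latex
\begin{align*}
&\Delta v + k^2 v = 0 \quad \text{(Helmholtz equation, from } \tfrac{1}{c^2}u_{tt} - \Delta u = 0\text{)}, \\
&[u] = v\,e^{i\omega t}e^{-ikr}, \qquad
 \Bigl[\frac{\partial u}{\partial n}\Bigr] = \frac{\partial v}{\partial n}\,e^{i\omega t}e^{-ikr}, \qquad
 \Bigl[\frac{\partial u}{\partial t}\Bigr] = i\omega\, v\,e^{i\omega t}e^{-ikr}, \\
&\frac{1}{r}\Bigl[\frac{\partial u}{\partial n}\Bigr]
 + \frac{1}{cr}\Bigl[\frac{\partial u}{\partial t}\Bigr]\frac{\partial r}{\partial n}
 - [u]\,\frac{\partial}{\partial n}\frac{1}{r}
 = e^{i\omega t}\Bigl(\frac{e^{-ikr}}{r}\frac{\partial v}{\partial n}
   - v\,\frac{\partial}{\partial n}\frac{e^{-ikr}}{r}\Bigr), \\
&\text{since}\quad
 \frac{\partial}{\partial n}\frac{e^{-ikr}}{r}
 = e^{-ikr}\,\frac{\partial}{\partial n}\frac{1}{r} - \frac{ik}{r}\,e^{-ikr}\,\frac{\partial r}{\partial n}.
\end{align*}
```

Dividing by e^{iωt} and integrating over ∂G with the factor 1/(4π) gives a surface representation of v(x0) as a superposition of the spherical waves e^{-ikr}/r and their normal derivatives.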
The exact formulation of Huygens' principle (78) is the starting point for
important approximation formulas of diffraction theory. In this
connection, for example, in the diffraction by passage of light through a slit,
certain physically motivated approximations concerning the exterior
normal derivative ∂v/∂n are made in the right-hand side of (78). To this end,
study Born and Wolf (1959, M), page 377. An overview of diffraction
theory can be found in Babic (1967, M,B). The investigation of diffraction
problems with the aid of integral equations is carried out in Kupradze
(1956, M). There, the point of departure is the substitution for v in (77) of
the form of a single layer or double layer potential
$$v(x_0) = \int_{\partial\Omega} \rho(x)\,\frac{e^{-i\omega r/c}}{r}\,dO \tag{79}$$
or
$$v(x_0) = \int_{\partial\Omega} \rho(x)\,\frac{\partial}{\partial n}\frac{e^{-i\omega r/c}}{r}\,dO,$$
respectively. In this connection, also compare Smirnov (1956, M), Vol. IV,
Section 228 ff.
The connection with the Maxwell equations (73) results from the fact
that in a homogeneous medium without charges and currents, with ε = constant and
μ = constant, all components of E and H satisfy the wave equation (75)
with f = 0 and c = 1/√(εμ) (where c is the velocity of light in the medium). In this
connection, compare Born and Wolf (1959, M), page 10.
40.10f. Weak Huygens principle and the initial value problem. By the weak Huygens
principle, we mean the following: A sharply localized perturbation
propagates wavelike in such a way that a sharp anterior and a sharp posterior
wave front are present (see Fig. 40.10 with m = 3). This is the case for
sound and light. On the other hand, if one throws a stone into water, then
no sharp wave fronts appear (see Fig. 40.10 for m = 2).
Figure 40.10
In order to alter this principle into an exact form for the wave equation,
we consider the initial value problem
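$$\frac{1}{c^2}\frac{\partial^2 u}{\partial t^2} - \sum_{i=1}^{m}\frac{\partial^2 u}{\partial \xi_i^2} = 0, \qquad u(x,0) = u_0(x), \quad u_t(x,0) = u_1(x), \tag{80}$$
where x = (ξ1, ..., ξm). For sufficiently smooth initial data u0 and u1, (80) is
uniquely solvable, and the solution is given for all x0 ∈ R^m, t > 0, by the
following formulas:
(i) m = 1:
$$u(x_0,t) = \frac{1}{2}\bigl(u_0(x_0 - ct) + u_0(x_0 + ct)\bigr) + \frac{1}{2c}\int_{x_0 - ct}^{x_0 + ct} u_1(x)\,dx. \tag{81a}$$
(ii) m = 2:
$$u(x_0,t) = \frac{1}{2\pi c}\int_{K(x_0,t)} \frac{u_1}{\sqrt{c^2t^2 - r^2}}\,d\xi_1\,d\xi_2 + \frac{\partial}{\partial t}\,\frac{1}{2\pi c}\int_{K(x_0,t)} \frac{u_0}{\sqrt{c^2t^2 - r^2}}\,d\xi_1\,d\xi_2. \tag{81b}$$
(iii) m = 3:
$$u(x_0,t) = \int_{\partial K(x_0,t)} \left(\frac{1}{4\pi r^2}\,\frac{\partial (r u_0)}{\partial r} + \frac{u_1}{4\pi r c}\right)dO. \tag{81c}$$
Here,
$$K(x_0,t) = \{x \in \mathbb{R}^m : r \le ct\}, \qquad r = |x - x_0|.$$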
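The one-dimensional representation formula can be tried out numerically. The following is a minimal sketch, not part of the original text: it assumes the d'Alembert form of (81a), that is, the mean of u0 at x0 ± ct plus 1/(2c) times the integral of u1 over [x0 − ct, x0 + ct], and uses the hypothetical test data u0 = sin, u1 = c·cos, for which the exact solution is sin(x + ct). The helper names `simpson` and `dalembert` are illustrative.

```python
import math

def simpson(g, a, b, n=200):
    """Composite Simpson rule for the integral of g over [a, b] (n must be even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3

def dalembert(u0, u1, c, x0, t):
    """Evaluate the d'Alembert formula for m = 1 at the point (x0, t)."""
    mean_part = 0.5 * (u0(x0 - c * t) + u0(x0 + c * t))
    integral_part = simpson(u1, x0 - c * t, x0 + c * t) / (2 * c)
    return mean_part + integral_part

# Hypothetical test data: u0 = sin, u1 = c * cos, exact solution sin(x + c t).
c = 2.0
x0, t = 0.3, 0.7
u = dalembert(math.sin, lambda x: c * math.cos(x), c, x0, t)
exact = math.sin(x0 + c * t)
error = abs(u - exact)
print(error)
```

The value at (x0, t) uses the initial data only on the interval [x0 − ct, x0 + ct], which is the dependence region K(x0, t) for m = 1.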
In the formulas (81) it is immediately noticeable that the integration is
over different regions. In (81c) the value u(x0, t) depends only on the
values of u and its first derivatives at the time t = 0 on ∂K(x0, t). This is a
strict form of the causality principle, for ∂K(x0, t) consists exactly of the
points x with the property that a signal which at the time t = 0 starts from x
towards x0 arrives at the point x0 at exactly the time t.
In contrast to this, u(x0, t) in (81a) and (81b) depends on the values of u
and its first derivatives at the time t = 0 on K(x0, t). This is a weak form of
the causality principle, for K(x0, t) consists of precisely the points x having
the property that a signal which starts at the time t = 0 from x towards x0
arrives at x0 in the time interval [0, t]. However, in this case, no sharp signal
transfer is possible. In R^2 it would be impossible to receive radio messages
meaningfully since in the receiver the signals, which are sent at different
times, constantly superimpose.
By the weak Huygens principle for (80), we mean that the value of the
solution u(x0, t) for x0 ∈ R^m, t > 0, depends only on the values of u and
its first derivatives on ∂K(x0, t). This is the case for all odd m, m ≥ 3.
A geometric interpretation of the different dependence regions A is
shown in Fig. 40.11 with A = ∂K(x0, t) for m = 3 and A = K(x0, t) for
m = 1, 2. The cone C with the vertex (x0, t) is called the characteristic
conoid. C is a characteristic. Its generating lines are bicharacteristics of (80).
The characteristic conoid cuts the boundary of the dependency region A
out of the initial surface t = 0.
In order to elucidate the connection with the weak heuristic Huygens
principle that we formulated earlier, we assume that the initial values u0
and u1 are concentrated only in a small neighborhood of x. Then, at time
t > 0, the solution u for m = 3, according to (81c), is also different from
zero only in a small neighborhood of the surface of the ball with center at
x and radius ct, i.e., there exist sharp anterior and posterior fronts for the
perturbation propagation. In contrast to this, the solution u for m = 2 at
time t is, in general, different from zero in a whole neighborhood of the disk
with center at x and radius ct, i.e., the echo effect occurs (see Fig. 40.10).
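The difference between sharp fronts for m = 3 and the echo effect for m = 2 can be made visible with a small computation. The sketch below is illustrative only and rests on stated assumptions: u0 = 0, u1 is the indicator function of the ball (respectively disk) of radius a around the origin, and the observation point x0 has distance d > 0 from the origin. For m = 3 the spherical-mean form of the solution reduces to t times the area fraction of a spherical cap. For m = 2 the Poisson-type integral is rewritten in polar coordinates around x0 with the substitution r = ct·sin(φ). The function names `u3` and `u2` are hypothetical.

```python
import math

def clamp(v):
    """Restrict v to [-1, 1] so that acos is always defined."""
    return max(-1.0, min(1.0, v))

def u3(d, t, a=1.0, c=1.0):
    """m = 3: t times the fraction of the sphere |x - x0| = c t inside the ball |x| <= a."""
    R = c * t
    if R <= 0.0:
        return 0.0
    cos_theta0 = clamp((R * R + d * d - a * a) / (2 * R * d))
    return t * (1.0 - cos_theta0) / 2.0  # spherical cap area divided by sphere area

def u2(d, t, a=1.0, c=1.0, n=4000):
    """m = 2: (t / pi) * integral over [0, pi/2] of theta0(c t sin(phi)) sin(phi) dphi.

    2 * theta0(r) is the opening angle of the arc of the circle |x - x0| = r
    that lies inside the disk |x| <= a. Midpoint rule with n subintervals.
    """
    if t <= 0.0:
        return 0.0
    h = (math.pi / 2) / n
    total = 0.0
    for k in range(n):
        phi = (k + 0.5) * h
        r = c * t * math.sin(phi)
        theta0 = math.acos(clamp((r * r + d * d - a * a) / (2 * r * d)))
        total += theta0 * math.sin(phi)
    return (t / math.pi) * total * h

d = 5.0  # distance of the observer from the centre of the initial perturbation
for t in (3.0, 5.0, 7.0, 20.0):
    print(t, u3(d, t), u2(d, t))
```

With a = c = 1 and d = 5, the signal for m = 3 is nonzero only for times between d − a and d + a, whereas for m = 2 it stays positive for all times after the front has arrived.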
Figure 40.11 (panels m = 1, m = 2, m = 3: characteristic conoid with vertex (x0, t) and dependence region A in the initial surface t = 0)
As a modern standard work for the solution of the initial value problem
for general linear hyperbolic differential equations, we recommend
Friedlander (1976, M). There these differential equations are connected
with the Riemannian geometry of a four-dimensional space-time universe.
Then the characteristic conoid in Fig. 40.11 is curved. The characteristics
and bicharacteristics are null hypersurfaces and null geodesics, respectively.
Instead of the representation formulas (81), more complicated
expressions which contain distributions appear there. From these formulas
one can infer the criteria for the validity of the weak Huygens principle,
which refer back to the classical monograph on the initial value problem
by Hadamard (1932, M). Accordingly, the logarithmic term must vanish in
the Hadamard fundamental solution. The validity of Huygens' principle
could be verified for a number of physically important equations. In this
connection, study Günther and Wünsch (1974) (Maxwell's equations in
general relativity theory), Günther (1965) (metric of plane gravitation
waves), Schimming (1977) (general tensor and spinor fields), and Wünsch
(1976) (spinor fields). A survey can be found in Schimming (1978). In the
literature, the weak Huygens principle is frequently designated briefly as
the Huygens principle.
Furthermore, concerning the initial value problems for hyperbolic
differential equations, study Courant and Hilbert (1953, M), Petrovskii (1955,
M), and Garabedian (1964, M). More complicated questions are treated in
Leray (1952, L), Garding, Kotake, and Leray (1964) (singularities and
Riemann hypersurfaces), Lichnerowicz (1967, M) (relativistic
magnetohydrodynamics), Atiyah, Bott, and Garding (1970) (lacunary regions),
Hawking and Ellis (1973, M) (Einstein's equations for general relativity
theory), Marsden (1974, L), (1981, L), and Smoller (1983, M) (shock
waves).
40.10g. Axiomatic construction of action propagation. In this connection, study
Gelfand and Fomin (1961, M), Appendix I. There it is shown that the
Hamilton-Jacobi differential equation and the canonical equations are
obtained from very few plausible assumptions concerning the propagation
of action.
40.10h. Global generalized solutions of the Hamilton-Jacobi equation. (Cf. Problem
52.9.)
40.11.** The Maslov WKB method. In Problem 40.10c we pointed out that the
investigation of the asymptotic series (68) encounters difficulties when, in
the language of geometrical optics, foci appear. A method for surmounting
these difficulties is due to Maslov. In this connection, there appear
additional terms in the asymptotic expansions which proceed from the foci and
are connected with the so-called Morse index of these foci. In the general
case, the index introduced by Maslov plays a crucial role.
The basic idea of Maslov's results for the Schrodinger equation can be
found in Arnold (1974, M), Appendix 11. There it is explained how
asymptotic expansions of solutions of the Schrodinger equation according
to the Planck action quantum h (quasiclassical approximations for the
motion of a quantum mechanical particle) are related to the corresponding
classical motion described by the canonical equations in phase space.
Parallel to (68), the substitution for the solution of the Schrodinger
equation has the form
$$u(x,t) = v(x)\,e^{iS/h} + O(h), \qquad h \to 0. \tag{82}$$
S is the action function of the corresponding classical motion. If foci
occur, this estimate has to be modified. The physical background of this
so-called WKB-method can be found in Landau and Lifsic (1962, M), Vol.
III, Chapter 7. Its connection with geometrical optics is that the
Schrodinger equation can be conceived of as an equation for electron waves
(electron optics). The passage to the limit h → 0 corresponds to the
transition λ → 0, where λ is the wavelength. For λ = 0, the corresponding
Fermat principle yields the motion of electrons as classical particles (cf.
Born and Wolf (1959, M), page 740).
As an introduction to the Maslov theory, we recommend Eckmann and
Seneor (1976), where the harmonic and anharmonic oscillators of quantum
mechanics are treated in detail. Furthermore, we recommend Guillemin
and Sternberg (1977, M). Numerous results can be found in Maslov (1972,
M), (1977, M). In the latter monograph, nonlinear equations are also
considered.
40.12.* Fourier integral operators. This theory investigates how one can give
meaning to integral expressions of the form
$$\int e^{i\varphi(x,\theta)}\,a(x,\theta,\omega)\,d\theta \tag{83}$$
within the framework of distribution theory. For example, in quantum
field theory, in the description of action propagations, there appear
expressions of the form (83), which are extremely singular and are not
functions but distributions. Under suitable restricting assumptions, the
theory of Fourier integral operators allows one to define the products of
such distributions, which are important in quantum field theory. In this
connection, study Reed and Simon (1971, M), Vol. II, Chapter IX and
Bogoljubov and Sirkov (1973, M).
One is led to integrals of the form (83) in classical optics if, over the
caustic, an attempt is made to globally extend the asymptotic expansion,
given locally for the wave equation, with the aid of the Fresnel integrals. As
an introduction to this, we recommend Combet (1975, L), (1982, L).
Parallel to Problem 40.11, in the global continuation one uses methods
developed by Maslov (1972, M), which essentially are based on the Maslov
index. In the global continuation of expressions that are given locally,
cohomology theory frequently plays a crucial role. The prototype for this is
the construction of analytic functions of several variables with prescribed
zeros or poles (the Cousin problems). In this connection, study Maurin
(1967, M) and Hormander (1973, M). Also, in the Atiyah-Singer index
theorem concerning the structure of elliptic differential operators on
manifolds, K-theory, which is a cohomology theory for vector bundles, plays an
essential role (cf. Booss (1977, L)). Arnold could in fact show that the
Maslov index is connected with a cohomology class on a suitable manifold
(cf. the appendix to Maslov (1972, M)).
The theory of Fourier integral operators created by Hormander (1971)
represents a powerful tool for the investigation of general linear differential
equations, where the connections given in Problems 40.7, 40.10, and 40.11
are extensively generalized. In particular, the concept of the wave front of
a solution in the sense of distribution theory can be introduced and the
propagation of singularities and the regularity behavior of the solutions
can be described thereby. In this connection, study the standard work on
linear partial differential equations, Hormander (1983, M), and Guillemin
and Sternberg (1977, M). In the latter monograph, one can admire the
interplay of various mathematical resources: Morse theory, symplectic
geometry on manifolds, classical mechanics, geometrical optics, cohomol-
ogy and differential forms, Lie algebras, geometric quantum theory,
asymptotic expansions for (83), pseudodifferential operators, etc.
Moreover, also study Treves (1980, M), and Taylor (1981, M)
(pseudo-differential operators).
40.13.** Connection between classical mechanics and quantum mechanics by way of
the Feynman integral and the Wiener integral. In this connection, study
Reed and Simon (1971, M), Volume II, Section X.11. There it is shown
how the solution of the quantum mechanical Schrodinger equation can be
represented as an integral that one can conceive of as averaging over the
paths of classical particles (Feynman integral). In the averaging for
imaginary time, the Wiener measure and the Wiener integral play an important
role which is connected with Brownian motion (the Feynman-Kac
representation formula). This formula is motivated in Reed and Simon (1971,
M) with the aid of the action function of the corresponding classical
motion, parallel to (82). Concerning the Feynman integral (respectively,
the Wiener integral), we recommend Albeverio and Hoegh-Krohn (1976,
L) [respectively, Simon (1979, M) and Glimm and Jaffe (1981, M)]. In the
framework of the so-called Euclidean quantum field theory, the crucial
discovery was the fact that the theory becomes easier when we start with
imaginary time. Then the solutions for the real time can be obtained by
analytic continuation. From the mathematical point of view, this Euclidean
theory has the advantage that we can use the Wiener integral instead of the
Feynman integral. Up until now it was not possible to rigorously justify
the Feynman integral in the necessary generality. By the Wiener integral
we understand an integral over function spaces or spaces of distributions
with an appropriate measure (functional integral or path integral). In
recent years a number of essential connections between quantum field
theory, operator algebras, stochastic processes, and stochastic physics have
been discovered. In this connection, study Simon (1974, M), (1979, M),
Glimm and Jaffe (1981, M), and Streit (1980, P).
40.14.** Canonical equations, statistical physics, and ergodic theory. In classical
statistical physics, one considers, say, the motion of a gas which consists of
10^23 molecules, as the motion of a point in a high-dimensional phase
space. This motion is described by means of the canonical equations
$$q_i' = H_{p_i}, \qquad p_i' = -H_{q_i}, \qquad i = 1,\dots,M.$$
Statistical average values first arise by means of time average values.
However, the crucial trick of statistical physics consists of replacing this
time average value by the so-called ensemble average value. The latter are
average values with respect to appropriate measures in the phase space.
Ergodic theory attempts to justify this procedure rigorously. In this
connection, as an introduction, study Reed and Simon (1971, M), Vol. I, II.5,
and Arnold and Avez (1968, M). Furthermore, we recommend Walters
(1982, M), as an introduction, and the standard work Cornfeld, Fomin,
and Sinai (1982, M).
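The replacement of time averages by ensemble averages can be illustrated on the simplest canonical system. The sketch below is a toy example under stated assumptions, not a model of a gas: M = 1, the hypothetical Hamiltonian H(q, p) = (p² + q²)/2, a leapfrog discretization of q' = H_p, p' = −H_q, and, as ensemble, the measure on the energy curve H = 1/2 that is uniform in the angle.

```python
import math

def leapfrog(grad_q, grad_p, q, p, dt, steps):
    """Integrate q' = H_p, p' = -H_q for a separable Hamiltonian (leapfrog scheme)."""
    traj = []
    for _ in range(steps):
        p -= 0.5 * dt * grad_q(q)   # half step in p:  p' = -H_q
        q += dt * grad_p(p)         # full step in q:  q' =  H_p
        p -= 0.5 * dt * grad_q(q)   # second half step in p
        traj.append((q, p))
    return traj

def H(q, p):
    """Assumed model Hamiltonian of the harmonic oscillator."""
    return 0.5 * (p * p + q * q)

# H_q = q and H_p = p for the model Hamiltonian; start on the energy curve H = 1/2.
traj = leapfrog(lambda q: q, lambda p: p, q=1.0, p=0.0, dt=0.01, steps=200_000)

# Time average of the observable q^2 along the computed trajectory.
time_avg = sum(q * q for q, _ in traj) / len(traj)

# Ensemble average of q^2 over the energy curve, uniform in the angle.
n = 10_000
ensemble_avg = sum(math.cos(2 * math.pi * k / n) ** 2 for k in range(n)) / n

# Largest deviation of the energy along the discrete trajectory.
energy_drift = max(abs(H(q, p) - 0.5) for q, p in traj)

print(time_avg, ensemble_avg, energy_drift)
```

For this system both averages agree up to discretization error; for systems with many degrees of freedom, justifying such an agreement is exactly the task of ergodic theory.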
40.15. Principle of stationary action, symmetry, Noether's theorem, conservation
quantities, and gauge field theories. We also delve into these topics in Part
V. There we show that many basic equations of mathematical physics
result from variational problems (e.g., principle of stationary action).
The Noether theorem is closely related to so-called gauge field theories
which play a fundamental role in the modern theory of elementary
particles. It seems that gauge field theories are the right tool for building
up a unified theory of elementary particles which includes all kinds of
known interactions. For example, the Weinberg-Salam theory unifies weak
and electromagnetic interaction. In the framework of strong interactions,
hypothetical particles (so-called colored quarks) and the quanta of the
related gauge fields (so-called gluons) are important.
In this connection, study Weinberg (1974, S), Faddeev and Slavnov
(1980, M), and Becher (1981, M) (physical point of view) and Eguchi et al.
(1980, S); Jaffe and Taubes (1980, M); Rund (1981, S), and Manin (1984,
M) (mathematical point of view). As an introduction, we recommend the
masterful Fermi lectures of Sir Michael Atiyah (1979, L) (Yang-Mills
equation, connections in principal fiber bundles, complex projective spaces
and Penrose twistors, holomorphic vector bundles).
Gauge field theories are a continuation of Einstein's concept of
describing physical effects mathematically in terms of differential geometry. In
general relativity, the curvature of the four-dimensional space-time
manifold is responsible for gravitation. In the gauge field theories of elementary
particle physics, curved manifolds (fiber bundles) occur, whose structure is
determined by internal symmetries of the elementary particles (the groups
SU(n)). In Atiyah (1979, L) it is shown how deep tools of differential
geometry and algebraic geometry can be used to obtain the exact number
of solutions of certain complex nonlinear differential equations arising
from gauge field theory (number of the so-called instantons).
40.16. History of the calculus of variations. Study the history of the Euler,
Lagrange, Jacobi, and Weierstrass necessary and sufficient conditions in
Funk (1962, M), and Goldstine (1980, M). This development is
characterized by a number of classical error deductions and their criticism. The
difficult path of the development of rigorous mathematical deductions will
then be clear.
References to the Literature
Functional analysis and classical calculus of variations: Fucik, Necas, and
Soucek (1977, L) (introduction); Klotzler (1971, M) (multiple integrals);
Ioffe and Tihomirov (1974, M).
Calculus of variations: Courant and Hilbert (1953, M), Vols. I, II;
Gelfand and Fomin (1961, M) (introduction); Funk (1962, M,H); Rund
(1966, M); Hestenes (1966, M); Morrey (1967, M) (standard work on
existence theory and regularity theory); Young (1969, M); Klotzler (1971,
M).
Field theory: Caratheodory (1935, M); Rund (1966, M); Klotzler (1971,
M).
Collection of exercises for the calculus of variations: Krasnov (1975, M).
Minimal surfaces: Fucik, Necas, and Soucek (1977, L) (introduction);
Nitsche (1975, M) (standard work); Ekeland and Temam (1974, M); Gilbarg
and Trudinger (1977, M); Tromba (1977, S), (1980); Hildebrandt and
Nitsche (1979), (1980/1981); Fomenko (1982, M).
Canonical formalism and perturbation theory: Arnold (1963, S), (1974,
M); Siegel and Moser (1971, M) (cf. Problem 40.8).
Solitons and infinite-dimensional canonical systems: Lamb (1980, L)
(introduction); Zaharov (1980, M); Bullough and Caudrey (1980, P);
Ablowitz and Segur (1981, M); Calogero and Degasperis (1982, M, H, B)
(cf. Problem 40.9).
Symplectic geometry, canonical systems, and mathematical physics:
Arnold (1974, M) (introduction); Guillemin and Sternberg (1977, M);
Abraham and Marsden (1978, M); Kijowski and Tulczyjew (1979, L);
Marsden (1981, L).
Infinite-dimensional canonical systems and mathematics: Chernoff and
Marsden (1974, L); Marsden (1974, L), (1981, L); Abraham and Marsden
(1978, M).
Geometrical optics: Sommerfeld (1962, M), Vol. IV; Born and Wolf
(1959, M); Luneburg (1964, M).
Geometrical optics and asymptotic expansions: Buslaev (1964); Babic
and Buldyrev (1972, M); Guillemin and Sternberg (1977, M); Babic and
Kirpicnikova (1979, M).
Maslov's WKB-method: Maslov (1972, M), (1977, M); Eckmann and
Seneor (1976) (introduction); Leray (1978, M).
Fourier integral operators and pseudodifferential operators: Hormander
(1983, M) (standard work); Reed and Simon (1971, M), Vol. II
(introduction); Guillemin and Sternberg (1977, M); Treves (1980, M); Taylor (1981,
M).
Diffraction theory: Kupradze (1956, M) (integral equations); Born and
Wolf (1959, M); Babic (1967, S, B) (survey).
Scattering theory: Lax and Phillips (1967, M); Reed and Simon (1971,
M), Vol. Ill; Amrein (1981, M); Baumgartel and Wollenberg (1983, M).
Huygens' principle: Born and Wolf (1959, M); Rund (1966, M).
Weak Huygens principle: Hadamard (1932, M); Courant and Hilbert
(1953, M), Vol. II; Friedlander (1976, M); Günther (1965); Günther and
Wünsch (1974); Wünsch (1976); Ibragimov (1976); Schimming (1977),
(1978, S, B) (cf. Problem 40.10f).
Global generalized solutions of the Hamilton-Jacobi differential
equation: Benton (1977, M); Lions, Jr. (1982, L), Crandall and Lions, Jr. (1983).
Initial value problem for hyperbolic differential equations: Friedlander
(1976, M) (standard work), Courant and Hilbert (1953, M), Vol. II;
Petrovskii (1955, M); Garabedian (1964, M).
Complicated problems in hyperbolic differential equations: Leray (1952,
M); Garding, Kotake, and Leray (1964); Lichnerowicz (1967, M); Atiyah,
Bott, and Garding (1970); Hormander (1971); Hawking and Ellis (1973,
M); Marsden (1974, L), (1981, L); Smoller (1983, M) (shock waves and
reaction-diffusion) (cf. Problem 40.10f).
Feynman integrals, Wiener integrals, stochastic processes, and quantum
field theory: Reed and Simon (1971, M), Vol. II; Albeverio and
Hoegh-Krohn (1976, L); Simon (1974, M), (1979, M); Glimm and Jaffe (1981, M).
Statistical physics and ergodic theory: Reed and Simon (1971, M), Vol. I,
and Walters (1982, L) (introduction); Cornfeld et al. (1982, M).
Principle of stationary action for obtaining fundamental equations in
mathematical physics: Sommerfeld (1962, M), Vol. I (mechanics); Landau
and Lifsic (1962, M), Volumes I-IX; Schweber (1961, M); Bogoljubov and
Sirkov (1973, M), (1980, M) (quantum field theory); Courant and Hilbert
(1953, M), Vol. I; Gelfand and Fomin (1961, M); Rund (1966, M).
Variational principles, differential geometry and gauge theory: Atiyah
(1979, L); Rund (1981, S); Eguchi et al. (1980, S); Manin (1984, M).
Applications of gauge field theory to elementary particle physics:
Weinberg (1974, S); Faddeev and Slavnov (1980, M); Bogoljubov and
Sirkov (1980, M); Jaffe and Taubes (1980, M); Becher (1981, M, B)
(introduction from the physical point of view).
CHAPTER 41
Potential Operators
To wit, since the plan of the universe is the most perfect, there can be no
doubt that all actions in the world can be determined from the observed
phenomena and the causes with the aid of the method of maxima and
minima.
Leonhard Euler (1707-1783)
He was a great scholar and a gracious human being.
(Inscription on the Euler memorial tablet in Riehen, Switzerland, near Basel,
where Euler spent his childhood.)
Above all, I think one must study the masters rather than the disciples if one
wishes to make progress in mathematics.
Niels Henrik Abel (1802-1829)
Together with the minimum problem
$$\inf_{u \in M}\,\bigl(F(u) - \langle b, u\rangle\bigr) = \alpha, \tag{1}$$
we consider the Euler equation
$$F'(u) - b = 0 \tag{2}$$
and ask the following questions:
(α) How can one obtain the solutions for (2) from the solutions of (1)?
(β) Which operator equations Au − b = 0 can be written in the form (2),
i.e., when is F' = A?
(γ) In what manner are the properties of F connected with those of F'?
We give the answers in this and in the next chapter. In particular, in
Chapter 42 we show that F is convex if and only if F' is monotone. This
yields the connection with the theory of monotone operators in Part II.
The applications relate to the Hammerstein integral equations in Section
41.6 and to quasilinear elliptic partial differential equations in Section 42.7.
It has been 200 years since the death of Leonhard Euler (1707-1783), the
man who created the calculus of variations, pursuing the first works of the
Bernoulli brothers, and produced works that were crucial to the
development of mathematical physics. For this reason, at the beginning of this
chapter, in which the abstract form of the Euler equation is the focal point,
we should like to make the reader aware of this by means of some
quotations concerning his work. For the interested reader, we recommend
the Euler biographies by Juskevic (1971) and Thiele (1982, M,B). More
precisely, one would have to designate the Euler equation as the Euler-
Lagrange equation, since Lagrange was the first to derive these equations for
integrals with several variables. The original geometric methods of Euler
could not yet accomplish this. We have described Lagrange's analytic
methods of variations in a more precise form in Section 37.4b and have used
them in an essential way in Chapter 40. One can find a compilation of the
important classical works on the calculus of variations due to Johann and
Jakob Bernoulli, Euler, Lagrange, Legendre, and Jacobi in Stockel (1894,
M). One can obtain an overview of the fundamental works of Hilbert on the
calculus of variations at the beginning of this century by reading his
"Collected Works," Hilbert (1932), Vol. 3, pp. 10-54. In that volume, one
can also find Hilbert's recollections of Weierstrass and Minkowski as well as
his life history. The essential impulse and results, which for the development
of the calculus of variations in our century emanated from Hilbert's Paris
address in 1900 (Problems 19, 20, and 23), are described in the collection
volumes of Aleksandrov (1971) and Browder (1976). These volumes are
devoted to all 23 Hilbert problems. One also finds much material
concerning the history of the calculus of variations in Funk (1962, M) and
Goldstine (1980, M). Furthermore, we recommend that the reader glance at
the collected works of Euler (1911), in particular, his books on the calculus
of variations, differential and integral calculus, algebra, mechanics, and
optics. One will be astonished to see how smoothly Euler's books read even
today and surprised by the detailed presentation of very elementary things.
Mathematics knows, besides the exclusive era of the Greeks, no luckier
constellation than the one under which Leonhard Euler was born. It was up to
him to give mathematics a completely changed form and to shape it into the
powerful edifice that it is today.
Andreas Speiser (1885-1970)
The Euler "Calculus of Variations" of the year 1744 is one of the most
beautiful mathematical works that has ever been written.
Constantin Caratheodory (1873-1950)
I have recently again made a lengthier study of Euler's "Integral Calculus"
and have anew wondered how this work of over 70 years has maintained its
freshness, while the contemporary d'Alembert is entirely impossible to read. It
appears to me that the reason lies in Euler's examples.
Carl Gustav Jacob Jacobi (1804-1851)
Euler's textbook "Complete instruction in algebra" which appeared in 1770
stems from the time of his blindness. Euler dictated the work to an
uneducated young tailor with the intention of testing the comprehensibility
directly. The pedagogical experiment was a success, according to reports of
Euler's son, Johann Albrecht, for the tailor could solve difficult algebraic
problems without outside help. The book proceeds in small steps up to
difficult problems.
Rudiger Thiele (1982)
Read Euler, he is the master of us all.
Marquis de Pierre Simon Laplace (1749-1827)
Euler truly did not sour his life with limiting value considerations,
convergence and continuity criteria and he could not and did not wish to bother
about the logical foundation of analysis, but rather he relied—only on
occasion unsuccessfully—on his astonishing certitude of instinct and
algorithmic power.
Emil Alfred Fellman (born 1927)
Seen statistically, Euler must have made a discovery every week.... About
1911, G. Enestrom published an almost complete (from today's viewpoint) list
of works with 866 titles. Of the 72 volumes of his "Collected Works" all but
three have appeared as of today. Euler's correspondence with nearly 300
colleagues is estimated to constitute 4500 to 5000 letters, of which perhaps a
third appear to have been lost. These letters are to appear in 13 volumes.
Euler was not only one of the greatest mathematicians, but also in general one
of the most creative human beings. His indefatigable scientific activity, which
could not be impaired even by his blindness, was limited however not only to
mathematics: Euler, who was indeed called the personified analysis, engaged
in a similar way in comprehensive technological application of science as also
in fundamental questions in the theory of cognition.
Rudiger Thiele (1982)
One needs to have delved but little into the principles of differential calculus
to know the method of how to determine the greatest and least ordinates of
curves. But there are maxima or minima problems of a higher order, which in
fact depend on the same method, which however can not be subjected to this
method. These are the problems where it is a matter of finding the curves
themselves.
The first problem of this type, which the geometers solved, is that of the
brachistochrone or the curve of fastest fall, which Johann Bernoulli proposed
toward the end of the preceding century. One attained this only in special
ways, and it was only some time later and on the occasion of the
investigations concerning isoperimetric problems that the great geometer of whom
we just spoke and his extraordinary brother Jacob Bernoulli gave some rules
in order to solve several other problems of this type. But since these rules were
not of sufficient generality, the famous Euler undertook to refer all
investigations of this type to a general method. But even as sophisticated and fruitful
as his method is, one must nevertheless confess that it is not sufficiently
simple.... Now, here one finds a method which requires only a simple use of
the principles of differential and integral calculus; above all, I must call
attention to the fact that I have introduced in my calculations a new
characteristic δ since this method requires that the same quantities vary in two
different ways.
Comte de Joseph Louis Lagrange, 1762
As I see, your analytic solution of the isoperimetric problem contains all that
one can wish for in this situation and I am very happy that this theory which I
have treated since the first attempts almost alone, has been brought precisely
by you to the highest degree of perfection. The importance of the situation
has occasioned me with the help of your new insights to myself conceive of an
analytic solution, but which I shall not make known before you have
published your deliberations, in order not to deprive you of the least part of the
fame due you.
Euler, in a letter to Lagrange
41.1. Minimal Sequences
Definition 41.1. A sequence $(u_n)$ in $M$ is called a minimal sequence for (1) if
and only if $F(u_n) - \langle b, u_n \rangle \to \alpha$ as $n \to \infty$.
The following proposition is important for the existence theory for (1)
and for the construction of approximation methods.
Proposition 41.2. Suppose the functional $F\colon M \subseteq X \to \mathbb{R}$, $M \neq \emptyset$, satisfies the
following four assumptions:
(i) $X$ is a real reflexive B-space.
(ii) $F$ is weak sequentially lower semicontinuous.
(iii) $M$ is weak sequentially closed (e.g., $M$ is closed and convex).
(iv) Either $M$ is bounded or for each sequence $(u_n)$ in $M$ such that $\|u_n\| \to \infty$
as $n \to \infty$, we have
$\lim_{n \to \infty} F(u_n) - \langle b, u_n \rangle = +\infty.$
Then the following three assertions hold:
(a) For each $b \in X^*$, (1) possesses a solution $u$.
(b) For (1) there always exists a minimal sequence. Each minimal sequence
has a subsequence which converges weakly to a solution of (1).
(c) If (1) possesses exactly one solution, then every minimal sequence of (1)
converges weakly to the solution of (1).
Corollary 41.3. In (b) and (c) weak convergence can be replaced by strong
convergence when one replaces the assumptions (ii) and (iii) in Proposition
41.2 by the following two conditions (but retaining the other assumptions):
(ii') $F$ is continuous. $F'$ exists on $M$ as a G-derivative and satisfies the
condition $(S)_+$, i.e., for each sequence $(u_n)$ in $M$, as $n \to \infty$ we have:
$u_n \rightharpoonup u, \quad \limsup \langle F'(u_n), u_n - u \rangle \le 0 \;\Rightarrow\; u_n \to u.$
(iii') $M$ is closed and convex.
According to Fig. 27.1 in Part II, $(S)_+$ is fulfilled, e.g., when $F' = A + V$
holds for the operators $A, V\colon X \to X^*$, where $A$ is uniformly monotone and
$V$ is compact.
One proves Proposition 41.2 in a way analogous to Theorem 38.A in
Section 38.3. Observe that each minimal sequence is bounded, by (iv).
Assertion (c) follows from the convergence principle Proposition 10.13, (2).
We treat the proof of Corollary 41.3 in Problem 41.1.
41.2. Solution of Operator Equations by Solving
Extremal Problems
Theorem 41.A. For each $b \in X^*$, the operator equation (2) has a solution
when the following two conditions hold:
(i) The functional $F\colon M \subseteq X \to \mathbb{R}$, $M \neq \emptyset$, is weak sequentially lower
semicontinuous. $X$ is a reflexive real B-space.
(ii) The G-derivative $F'\colon M \subseteq X \to X^*$ of $F$ exists, and one of the following
three conditions holds:
(H1) $M = \{u \in X : \|u\| \le R\}$, $R > 0$;
$\langle F'(u) - b, u \rangle > 0$ for all $u \in \partial M$.
(H2) $M = X$, $F(u) - \langle b, u \rangle \to +\infty$ as $\|u\| \to \infty$.
(H3) $M = X$, $\langle F'(u), u \rangle / \|u\| \to +\infty$ as $\|u\| \to \infty$.
Proof. Ad (H1), (H2). According to Proposition 41.2, (a), (1) possesses a
solution $u_0$. For $G(u) := F(u) - \langle b, u \rangle$, we have $G'(u) = F'(u) - b$. We need only
show that $u_0 \in \operatorname{int} M$. Then it will follow that $G'(u_0) = 0$ by Theorem 40.B,
(1) in Section 40.3. For (H2), $u_0 \in \operatorname{int} M$ is trivial. If $u_0 \notin \operatorname{int} M$ for (H1),
then $\|u_0\| = R$ and $G(u) \ge G(u_0)$ for all $u \in M$. However, from this we
obtain the contradiction
$0 > \langle G'(u_0), -u_0 \rangle = \lim_{t \to +0} \frac{G(u_0 - t u_0) - G(u_0)}{t} \ge 0.$
Ad (H3). For large $R$, this is a special case of (H1). □
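As a one-dimensional illustration of Theorem 41.A under (H2), one may take $X = \mathbb{R}$ and the functional $F(u) = u^4/4 + u^2/2$. This functional, the step size, and the iteration count are our own assumptions and do not come from the text. Then $F'(u) = u^3 + u$, and the equation $F'(u) = b$ is solved by minimizing $G(u) = F(u) - bu$. The following minimal Python sketch uses plain gradient descent as a stand-in for a generic minimization method:

```python
# Sketch: solve F'(u) = b on X = R by minimizing G(u) = F(u) - b*u.
# F, the step size and the iteration count are illustrative assumptions.

def F(u):
    return u**4 / 4 + u**2 / 2

def dF(u):
    # G-derivative of F; on X = R this is the ordinary derivative
    return u**3 + u

def solve_by_minimization(b, step=0.01, iters=20000):
    u = 0.0
    for _ in range(iters):
        # descent step for G(u) = F(u) - b*u, whose derivative is F'(u) - b
        u -= step * (dF(u) - b)
    return u

b = 10.0
u_star = solve_by_minimization(b)
residual = dF(u_star) - b   # how well the Euler equation F'(u) = b is met
```

For $b = 10$ the equation $u^3 + u = 10$ has the real solution $u = 2$, so the computed minimizer can be compared with it directly.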
41.3. Criteria for Potential Operators
Definition 41.4. Let $X$ be a real B-space. The operator $A\colon X \to X^*$ is called
a potential operator if and only if there exists a G-differentiable functional
$F\colon X \to \mathbb{R}$ such that $A = F'$. Then $F$ is called a potential of $A$.
If $A$ is hemicontinuous, then we define $F_A\colon X \to \mathbb{R}$ by
$F_A(u) = \int_0^1 \langle A(tu), u \rangle \, dt$
and call $F_A$ a pseudopotential of $A$. The hemicontinuity of $A$ guarantees the
continuity of the integrand.
Typical examples of potential operators are the Nemyckii operator in
Section 41.6 and operators which belong to the generalized boundary value
problems for quasilinear elliptic differential equations in Section 42.7. Of
the important criteria for potential operators that we shall formulate
directly, the following two equations play a crucial role:
$F_A(u) - F_A(v) = \int_0^1 \langle A(v + t(u - v)), u - v \rangle \, dt$ for all $u, v \in X$, (3)
$\langle A'(u)v, w \rangle = \langle A'(u)w, v \rangle$ for all $u, v, w \in X$. (4)
The following condition belongs to (4):
$(t, s) \mapsto \langle A'(w + tu + sv)x, y \rangle$ is continuous on $[0,1] \times [0,1]$
for all $u, v, w, x, y \in X$. (5)
Proposition 41.5. If $A\colon X \to X^*$ is a hemicontinuous operator on the real
B-space $X$, then the following two assertions hold:
(1) Integral criterion. $A$ is a potential operator if and only if (3) holds. Then
the pseudopotential $F_A$ is a potential, and an arbitrary potential for $A$ differs
from $F_A$ only by a constant.
(2) Derivative criterion. If $A'$ exists on $X$ as a G-derivative, with (5), then $A$
is a potential operator if and only if (4) holds.
Example 41.6. Let $X = \mathbb{R}^3$, $u = (\xi, \eta, \zeta)$, $X = X^*$. Then we can interpret
$A = (a, b, c)$ to be a three-dimensional force field. If $A$ is a potential
operator, then $A = \operatorname{grad} F$ holds in the sense of vector analysis, i.e., on $\mathbb{R}^3$,
$a = F_\xi, \quad b = F_\eta, \quad c = F_\zeta.$ (6)
It can easily be shown that (4) is equivalent to $\operatorname{curl} A = 0$, i.e., $a_\eta = b_\xi$,
$a_\zeta = c_\xi$, $b_\zeta = c_\eta$. These are the known integrability conditions which follow
from (6) and $F_{\xi\eta} = F_{\eta\xi}$, etc. (cf. Problem 41.2). Furthermore, it can easily be
verified that (3) simply means that $\oint A \, du = 0$ in the sense of a classical line
integral when one integrates around a triangle. Furthermore,
$F_A(u) = \int A \, dv,$
where the integration is along the segment from $0$ to $u$.
Consequently, Proposition 41.5 generalizes known assertions from vector
analysis.
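The triangle interpretation of (3) in Example 41.6 can be checked numerically. The sketch below integrates $\langle A, du \rangle$ around a triangle with vertex $0$ for two fields chosen by us for illustration (they are not taken from the text): the gradient field of $F(\xi,\eta,\zeta) = \xi^2 \eta + \zeta$, and the rotational field $(-\eta, \xi, 0)$, whose curl does not vanish. The composite Simpson rule is an assumed quadrature; it is exact here because the integrands are polynomials of degree at most two in the curve parameter.

```python
def segment_integral(A, p, q, n=50):
    """Composite Simpson rule for the line integral of <A(x), dx> along p -> q (n even)."""
    d = [qi - pi for pi, qi in zip(p, q)]

    def g(t):
        x = [pi + t * di for pi, di in zip(p, d)]
        return sum(ai * di for ai, di in zip(A(x), d))

    h = 1.0 / n
    total = g(0.0) + g(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * g(i * h)
    return total * h / 3

def circulation(A, u, v):
    """Line integral of A around the triangle 0 -> u -> v -> 0."""
    o = [0.0, 0.0, 0.0]
    return (segment_integral(A, o, u)
            + segment_integral(A, u, v)
            + segment_integral(A, v, o))

def grad_field(x):
    # A = grad F for the illustrative potential F = xi^2 * eta + zeta
    return [2 * x[0] * x[1], x[0] ** 2, 1.0]

def rot_field(x):
    # curl A = (0, 0, 2), so A is not a potential operator
    return [-x[1], x[0], 0.0]

circ_grad = circulation(grad_field, [1.0, 2.0, 0.5], [-1.0, 0.3, 2.0])
circ_rot = circulation(rot_field, [1.0, 0.0, 0.0], [0.0, 1.0, 0.0])
```

For the gradient field the circulation vanishes up to rounding, in agreement with (3). For the rotational field it equals twice the area of the triangle in the $\xi\eta$-plane, so (3) fails.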
Example 41.7. Let $A\colon X \to X^*$ be a continuous linear operator on the real
B-space $X$. We set $Bu = Au - b$ for all $u \in X$ and fixed $b \in X^*$. Then
$B'(u) = A$ for all $u \in X$, and from (4) it follows that:
$B$ is a potential operator if and only if $A$ is symmetric. Then the potential
$F_B$ is equal to
$F_B(u) = \int_0^1 \langle A(tu) - b, u \rangle \, dt = 2^{-1} \langle Au, u \rangle - \langle b, u \rangle.$
By Theorem 41.A in Section 41.2, the equation $Au - b = 0$ can then be
obtained only as the Euler equation of an extremal problem when $A$ is
symmetric, i.e., $\langle Au, v \rangle = \langle Av, u \rangle$ for all $u, v \in X$. The meaning of this
assertion for partial differential equations was elucidated in Section 22.5.
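The role of symmetry in Example 41.7 can be made concrete in $X = \mathbb{R}^2$. The sketch below (matrices, vectors, and the finite-difference step are our own illustrative choices) evaluates the pseudopotential $F_B$ by a midpoint rule and compares its numerical gradient with $Bu = Au - b$. The two agree when $A$ is symmetric and differ otherwise, because the gradient of $2^{-1}\langle Au, u \rangle$ is the symmetric part of $A$ applied to $u$.

```python
def matvec(A, x):
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def pseudopotential(A, b, u, n=100):
    """Midpoint rule for F_B(u) = int_0^1 <A(t u) - b, u> dt."""
    total = 0.0
    for i in range(n):
        t = (i + 0.5) / n
        Bu_t = [ai - bi for ai, bi in zip(matvec(A, [t * uj for uj in u]), b)]
        total += dot(Bu_t, u) / n
    return total

def numerical_gradient(f, u, h=1e-5):
    """Central differences; exact up to rounding for quadratic f."""
    grad = []
    for i in range(len(u)):
        up = list(u)
        um = list(u)
        up[i] += h
        um[i] -= h
        grad.append((f(up) - f(um)) / (2 * h))
    return grad

def mismatch(A, b, u):
    """Largest component of F_B'(u) - (A u - b)."""
    grad = numerical_gradient(lambda x: pseudopotential(A, b, x), u)
    Bu = [ai - bi for ai, bi in zip(matvec(A, u), b)]
    return max(abs(g - v) for g, v in zip(grad, Bu))

b = [1.0, -1.0]
u = [0.5, 2.0]
A_sym = [[2.0, 1.0], [1.0, 3.0]]
A_non = [[2.0, 1.0], [0.0, 3.0]]
sym_gap = mismatch(A_sym, b, u)      # symmetric A: F_B is a potential of B
nonsym_gap = mismatch(A_non, b, u)   # nonsymmetric A: F_B' differs from B
closed_form = 0.5 * dot(matvec(A_sym, u), u) - dot(b, u)
```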
In Problem 41.3 we prove Proposition 41.5 by means of the known
proposition on the integrability conditions and the independence of path for
line integrals in $\mathbb{R}^2$.
41.4. Criteria for the Weak Sequential Lower
Semicontinuity of Functionals
In Sections 38.3, 38.5, 41.1, and 41.2, we have already learned that the
concept of the weak sequential lower semicontinuity of functionals is of
fundamental significance in the existence theory in minimum problems. For
this reason it is important to know a number of criteria for this.
Proposition 41.8. The functional $F\colon X \to \mathbb{R}$ is weakly sequentially lower
semicontinuous on the real reflexive B-space $X$ when one of the following six
conditions is fulfilled:
(H1) $F$ is convex and lower semicontinuous.
(H2) $\delta^2 F(u; h) \ge 0$ for all $u, h \in X$, $F'$ exists as a G-derivative.
(H3) $F'$ is monotone.
(H4) $F'$ is pseudomonotone and locally bounded.
(H5) $F'$ is demicontinuous and satisfies $(S)_+$.
(H6) $F'$ is locally bounded and satisfies (P), i.e.,
$u_n \rightharpoonup u \;\Rightarrow\; \limsup \langle F'(u_n), u_n - u \rangle \ge 0$
as $n \to \infty$.
In the conditions where $\delta^2 F$ or $F'$ appears, the existence of these expressions
on $X$ is assumed, where $F'$ denotes the G-derivative.
In Fig. 27.1 in Part II one finds prototypes for (H4)-(H6). Let $F' = A + V$
with the operators $A, V\colon X \to X^*$. Then (H4) holds when $A$ is monotone and
hemicontinuous and $V$ is strongly continuous. (H5) occurs when $A$ is
uniformly monotone and hemicontinuous and $V$ is compact or strongly
continuous. From this we obtain an intimate relationship with the theory of
monotone operators.
Numerous classes of generalized boundary value problems for quasilinear
elliptic differential equations lead to (H4), (H5).
Corollary 41.9. The functional $F\colon X \to \mathbb{R}$ is weakly sequentially continuous on
the real reflexive B-space $X$ when $F'$ exists on $X$ as a G-derivative and is
strongly continuous or, more generally, only compact.
Proof. Here we shall use several results that will be proved in the next
chapter. (HI) follows from Proposition 38.7. According to Fig. 27.1, (H3),
(H4), and (H5) are special cases of (H6). In this connection, one observes
that, by Proposition 42.6, every monotone potential operator is demicontinuous.
By Proposition 42.6 and Corollary 42.8, (H2) is a special case of
(H3).
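Therefore, it suffices to prove (H6) by contradiction. Let us assume that $F$
is not weakly sequentially lower semicontinuous at $u$. Then there exist a
number $d > 0$ and a sequence $(u_n)$ with $u_n \rightharpoonup u$ such that
$\lim_{n \to \infty} F(u_n) - F(u) < -d$ and $F(u_n) - F(u) < -d$ for all $n \in \mathbb{N}$. (7)
Let $\varphi(t) := F(u + th)$. The classical mean value theorem yields $\varphi(1) - \varphi(0)
= \varphi'(\vartheta)$, i.e.,
$F(u + h) - F(u) = \langle F'(u + \vartheta h), h \rangle, \quad 0 < \vartheta < 1.$ (8)
By (7),
$[F(u_n) - F(u + \varepsilon(u_n - u))] + [F(u + \varepsilon(u_n - u)) - F(u)] < -d.$ (9)
For suitable $\vartheta_n \in \, ]0,1[$, depending on $\varepsilon$, it follows from (8) that
$\Delta_n := F(u + \varepsilon(u_n - u)) - F(u) = \varepsilon \langle F'(u + \vartheta_n \varepsilon (u_n - u)), u_n - u \rangle.$
Since $u_n \rightharpoonup u$, $(u_n - u)$ is bounded. The local boundedness of $F'$ at $u$ then
guarantees:
$|\Delta_n| \le \varepsilon K$ for all $n \in \mathbb{N}$ and $\varepsilon$ such that $|\varepsilon| \le \varepsilon_0$ (10)
for fixed $K$, $\varepsilon_0$, with $\varepsilon_0 < 1/2$. If we choose $\varepsilon > 0$ fixed but sufficiently small,
then, by (9) and (10), we have
$\Sigma_n := F(u_n) - F(u + \varepsilon(u_n - u)) < -\frac{d}{2}$ for all $n \in \mathbb{N}$. (11)
If we apply (8) to (11), with base point $u_n$ and $h = -(1 - \varepsilon)(u_n - u)$, we obtain
$\Sigma_n = \langle F'(w_n), (1 - \varepsilon)(u_n - u) \rangle = (1 - \varepsilon)\gamma_n^{-1} \langle F'(w_n), w_n - u \rangle,$
where $w_n := u_n - \vartheta_n (1 - \varepsilon)(u_n - u)$, $\gamma_n := 1 - \vartheta_n (1 - \varepsilon)$, $0 < \vartheta_n < 1$, so that
$w_n - u = \gamma_n (u_n - u)$ and $\varepsilon < \gamma_n < 1$. For this
reason, $w_n \rightharpoonup u$ as $n \to \infty$, and it follows from (11) that
$\limsup \langle F'(w_n), w_n - u \rangle = \limsup \gamma_n (1 - \varepsilon)^{-1} \Sigma_n < 0.$
But this contradicts (P). □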
Proof of Corollary 41.9. Suppose $u_n \rightharpoonup u$ as $n \to \infty$. If $F(u_n) \to F(u)$
does not hold, then there exist an $\varepsilon_0 > 0$ and a subsequence of $(u_n)$, which
we also denote by $(u_n)$, such that
$0 < \varepsilon_0 \le |F(u_n) - F(u)|$ for all $n \in \mathbb{N}$.
For $0 < \vartheta_n < 1$, (8) yields
$0 < \varepsilon_0 \le |\langle F'(u_n + \vartheta_n (u - u_n)), u - u_n \rangle|.$ (12)
The sequence $(u - u_n)$ is bounded. $F'$ is compact. For this reason, there
exists a subsequence, which we again denote by $(u_n)$, such that
$F'(u_n + \vartheta_n (u - u_n)) \to z$ as $n \to \infty$. Since $u_n \rightharpoonup u$, the right-hand side of (12) tends to zero.
But this is a contradiction. □
41.5. Application to Abstract Hammerstein Equations
with Symmetric Kernel Operators
Here, in conjunction with Chapter 28, we deal with the operator equation
$u + KF(u) = 0, \quad u \in X^*.$ (13)
Theorem 41.B. (13) has a solution when the following three conditions hold:
(i) $K\colon X \to X^*$ is linear, monotone, and symmetric. $X$ is a real reflexive
B-space.
(ii) $F\colon X^* \to X$ is a potential operator with the potential $\varphi\colon X^* \to \mathbb{R}$ and $\varphi$
satisfies the growth condition
$\varphi(u) \ge -a\|u\|^2 - b\|u\|^\beta - c$ for all $u \in X^*$. (14)
Here, $a$, $b$, $c$, and $\beta$ are constants, $a, b, c \ge 0$ and $0 < \beta < 2$, $2a\|K\| < 1$.
(iii) $\varphi$ is either weak sequentially lower semicontinuous on $X^*$ or $K$ is
compact and $F$ is continuous.
We recall that, according to Fig. 27.1 in Part II, every linear monotone
operator K is also continuous. The symmetry of the kernel operator K
means that $\langle Ku, v \rangle_X = \langle Kv, u \rangle_X$ for all $u, v \in X$. In (ii), we use $X^{**} = X$.
Proof. By Proposition 28.1, there exists a real H-space $(H, (\cdot|\cdot))$ and a
continuous linear mapping $S\colon X \to H$, where $K = S^*S$ holds and $S^*\colon
H \to X^*$ is injective. In this connection, we set $H = H^*$. Moreover, $\|S^*\|^2 \le
\|K\|$, $\|S\| = \|S^*\|$.
Instead of (13), we consider
$v + SFS^*v = 0, \quad v \in H.$ (15)
If $v$ is a solution of (15), then $u = S^*v$ is a solution of (13). Therefore, it
suffices to solve (15). To this end, we consider the minimum problem
$\min_{v \in H} h(v) = \alpha,$ (16)
where $h(v) := 2^{-1}(v|v) + \varphi(S^*v)$. We shall show that:
(a) $h$ is weak sequentially lower semicontinuous on $H$.
(b) $h(v) \to +\infty$ as $\|v\| \to \infty$.
(c) The G-derivative $h' = I + SFS^*$ exists.
Then Theorem 41.A in Section 41.2 yields the existence of a solution of
(16).
Case 1. <p is weak sequentially lower semicontinuous.
(a) $S^*$ is linear and continuous and, therefore, also weak sequentially
continuous according to Fig. 27.1, i.e.,
$v_n \rightharpoonup v \;\Rightarrow\; S^*v_n \rightharpoonup S^*v \;\Rightarrow\; \varphi(S^*v) \le \liminf \varphi(S^*v_n).$ (17)
Furthermore, by Example 38.16, (1), $v \mapsto 2^{-1}(v|v)$ is weak sequentially
lower semicontinuous.
(b) From (14) it follows that
$h(v) \ge 2^{-1}(v|v) - a\|S^*v\|^2 - b\|S^*v\|^\beta - c$
$\ge 2^{-1}(1 - 2a\|K\|)\|v\|^2 - b\|K\|^{\beta/2}\|v\|^\beta - c.$
(c) For all $v, w \in H$ and $t \to +0$, we have
$h'(v)w = \lim t^{-1}(h(v + tw) - h(v))$
$= (v|w) + \langle \varphi'(S^*v), S^*w \rangle$
$= (v + S\varphi'(S^*v) \,|\, w).$
Case 2. $K$ is compact and $F$ is continuous.
By assumption, $\varphi' = F$ as a G-derivative. $\varphi'$ is continuous and
consequently, by Section 40.1, $\varphi$ is also continuous. $K$ is linear and compact.
Hence, by Proposition 28.1, $S$ is also compact. In general, from the
compactness of $S$, it follows that $S^*$ is also compact. Figure 27.1 shows that
$S^*$ is strongly continuous. Therefore, instead of (17), the following holds:
$v_n \rightharpoonup v \;\Rightarrow\; S^*v_n \to S^*v \;\Rightarrow\; \varphi(S^*v_n) \to \varphi(S^*v).$
The inferences proceed now as in Case 1. □
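The reduction used in this proof can be tried out in finite dimensions. In the sketch below we assume $X = H = \mathbb{R}^2$, a matrix $S$ with $K = S^{\mathsf T} S$, and the potential $\varphi(u) = \sum_i (u_i^4/4 - g_i u_i)$, so that $F(u)_i = u_i^3 - g_i$. All of these choices, as well as the step size of the gradient descent, are illustrative assumptions and not part of the text. The auxiliary functional $h(v) = 2^{-1}(v|v) + \varphi(S^*v)$ is minimized, and $u = S^*v$ is then tested in (13).

```python
# Finite-dimensional sketch of the proof idea of Theorem 41.B.
S = [[1.0, 0.5],
     [0.0, 1.0]]          # S: X -> H, hence K = S^T S is symmetric and monotone
g = [1.0, -2.0]           # data of the illustrative potential phi

def matvec(M, x):
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

def transpose(M):
    return [list(col) for col in zip(*M)]

ST = transpose(S)         # plays the role of S*
K = [[sum(ST[i][k] * S[k][j] for k in range(2)) for j in range(2)]
     for i in range(2)]

def F(u):
    # F = phi' for phi(u) = sum(u_i^4 / 4 - g_i * u_i)
    return [ui ** 3 - gi for ui, gi in zip(u, g)]

def grad_h(v):
    # h(v) = |v|^2 / 2 + phi(S* v), so h'(v) = v + S F(S* v), cf. (c)
    SFu = matvec(S, F(matvec(ST, v)))
    return [vi + si for vi, si in zip(v, SFu)]

v = [0.0, 0.0]
for _ in range(5000):
    v = [vi - 0.02 * di for vi, di in zip(v, grad_h(v))]

u = matvec(ST, v)                                        # u = S* v
residual = [ui + ki for ui, ki in zip(u, matvec(K, F(u)))]  # u + K F(u)
```

The residual of (13) is small exactly when $v$ is close to a critical point of $h$, since $u + KF(u) = S^*(v + SFS^*v)$.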
41.6. Application to Hammerstein Integral Equations
Parallel to Section 28.4, we consider the integral equation
$u(x) + \mu \int_G k(x, y) f(y, u(y)) \, dy = 0,$ (18)
where $\mu \in \mathbb{R}$. We set $X = L_q(G)$ and write (18) in the form
$u + \mu KFu = 0, \quad u \in X^*,$ (18')
where the Nemyckii operator $F\colon X^* \to X$ is generated by $f$ and the kernel
operator $K\colon X \to X^*$ is generated by $k$ by virtue of
$(Fu)(x) := f(x, u(x)), \quad (Kv)(x) := \int_G k(x, y) v(y) \, dy.$
In this connection, we make the following four assumptions:
(H1) $G$ is a bounded region in $\mathbb{R}^N$ with $N \ge 1$ and $1 < p, q < \infty$,
$p^{-1} + q^{-1} = 1$. Then as is known $X^* = L_p(G)$ holds.
(H2) The kernel operator $K\colon X \to X^*$ is linear, monotone, compact, and
symmetric.
If we consider
$\langle Ku, v \rangle_X = \int_G \left( \int_G k(x, y) u(y) \, dy \right) v(x) \, dx,$
then by Section 28.4 these assumptions are fulfilled when:
(i) $k\colon G \times G \to \mathbb{R}$ is measurable (e.g., continuous) and
$\int_{G \times G} |k(x, y)|^p \, dx \, dy < \infty.$
(ii) $k(x, y) = k(y, x)$ for all $x, y \in G$.
(iii) $\langle Ku, u \rangle_X \ge 0$ for all $u \in X$.
In this connection, (i) implies the compactness of $K$, (ii) yields the symmetry
of $K$, i.e., $\langle Ku, v \rangle = \langle Kv, u \rangle$ for all $u, v \in X$, and (iii) is identical to the
monotonicity of $K$. In particular, if $p = 2$, then (iii) follows from (i) and (ii)
when all eigenvalues of $K$ are nonnegative. Furthermore, with regard to
(H4) below, we mention the known fact that $\|K\|$ is then equal to the largest
eigenvalue of $K$.
(H3) $f\colon G \times \mathbb{R} \to \mathbb{R}$ satisfies the Caratheodory condition (e.g., $f$ is
continuous), and the growth condition
$|f(x, u)| \le |a(x)| + b|u|^{p-1}$ for all $(x, u) \in G \times \mathbb{R}$
holds for fixed $a \in L_q(G)$ and $b \in \mathbb{R}_+$.
(H4) The Hammerstein growth condition
$\int_0^u f(x, v) \, dv \ge -c|u|^2 - |d(x)| \, |u|^{2-\gamma} - |e(x)|$
holds for all $(x, u) \in G \times \mathbb{R}$, where $0 < \gamma < 2$, $p \ge 2$, $d \in L_{2/\gamma}(G)$, $e \in
L_1(G)$, $c \in \mathbb{R}_+$, and
$c (\operatorname{mes} G)^{(p-2)/p} \|K\| < 1.$
Proposition 41.10. (1) If (H1) and (H3) hold, then the Nemyckii operator $F$
is a potential operator from $X^*$ into $X$ with the potential
$\varphi(u) := \int_G \left( \int_0^{u(x)} f(x, v) \, dv \right) dx$ for all $u \in X^*$.
(2) If (H1)-(H4) hold, then (18) possesses a solution for $\mu = 1$.
Corollary 41.11 (Eigensolutions). If (H1)-(H3) hold and if $f(x, 0) \equiv 0$ as
well as $\varphi(u) \neq 0$ and $KFu \neq 0$ for all $u \in X - \{0\}$, then (18) has an eigensolution
$u \neq 0$ for every $\alpha > 0$, where $\int_G u w \, dx = \alpha$ for all $w \in K^{-1}(u)$. If, in
addition, $f$ is odd with respect to $u$, then for each $\alpha > 0$ there exist at least $m$
such eigenvector pairs $(u, -u)$ with $m = \dim K(X)$, $1 \le m \le \infty$. For
$\dim K(X) = \infty$, there are infinitely many characteristic numbers $\mu_n$ with
$\mu_n^{-1} \to 0$ as $n \to \infty$.
Proof. (1) For fixed $u, w \in X^*$ and all $t \in [-t_0, t_0]$, we set
$g(t, x) := \int_0^{u(x) + t w(x)} f(x, v) \, dv;$ (19)
therefore, $\varphi(u + tw) = \int_G g(t, x) \, dx$, and we show that
$\left. \frac{d \varphi(u + tw)}{dt} \right|_{t=0} = \int_G f(x, u(x)) w(x) \, dx.$ (20)
Then $\langle \varphi'(u), w \rangle = \langle Fu, w \rangle$; thus $\varphi' = F$. Formal calculations yield (20)
immediately. In the following we justify this with the aid of the theorem on
the differentiation of integrals with a parameter A2(25) and verify the
assumptions.
(19) exists for almost all $x \in G$, for $u \mapsto f(x, u)$ is continuous for almost
all $x \in G$ because of the Caratheodory condition. For all $t \in [-t_0, t_0]$, from
(H3) and A2(30b) it follows that
$|g(t, x)| \le |a(x)| (|u(x)| + t_0 |w(x)|) + (\text{constant}) (|u(x)|^p + t_0^p |w(x)|^p).$
(19) yields $g_t(t, x) = f(x, u(x) + t w(x)) w(x)$ for all $t \in \, ]-t_0, t_0[$ and for
almost all $x \in G$. By (H3), for these $t$,
$|g_t(t, x)| \le \left[ |a(x)| + (\text{constant}) (|u(x)|^{p-1} + t_0^{p-1} |w(x)|^{p-1}) \right] |w(x)|.$
Now, because $u, w \in L_p(G)$ and, therefore, $a, |u|^{p-1}, |w|^{p-1} \in L_q(G)$, the
right-hand side is integrable over $G$ by the Holder inequality A2(29).
(2) $F\colon X^* \to X$ is continuous according to Proposition 26.5. By the Holder
inequality, from (H4) it follows that
$\varphi(u) \ge -c\|u\|_2^2 - \|d\|_{2/\gamma} \|u\|_2^{2-\gamma} - \text{constant}.$
Furthermore, the Holder inequality with $X^* = L_p(G)$ yields:
$\|u\|_2 = \left( \int_G |u|^2 \, dx \right)^{1/2} \le (\operatorname{mes} G)^{(p-2)/2p} \|u\|_p.$
Now Theorem 41.B in Section 41.5 yields the assertion. □
Corollary 41.11 is a special case of Theorem 44.C in Section 44.10 that we
shall prove later (Ljusternik-Schnirelman theory).
Problems
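41.1. Proof of Corollary 41.3. Solution: Let $G(u) := F(u) - \langle b, u \rangle$, $\varphi(t) := G(u + t(v
- u))$. Then $\varphi'(t) = \langle G'(u + t(v - u)), v - u \rangle$. The classical mean value
theorem yields $\varphi(1) - \varphi(0) = \varphi'(\vartheta)$, where $0 < \vartheta < 1$; therefore
$G(v) - G(u) = \langle G'(u + \vartheta(v - u)), v - u \rangle$ for all $u, v \in M$. (21)
Here $\vartheta$ depends on $u, v$. Let $(u_n)$ be a sequence in $M$ such that $G(u_n) \to \alpha$ and
$\alpha = \inf_{u \in M} G(u)$. By (iv), $(u_n)$ is bounded. $X$ is reflexive; consequently, $u_n \rightharpoonup u$
as $n \to \infty$, possibly after passing to a subsequence. We shall show that $u_n \to u$.
Then the continuity of $G$ yields $G(u) = \alpha$, and Proposition 41.2, (c), with
strong convergence instead of weak convergence, follows from the
convergence principle (Proposition 10.13, (1)).
Let $0 < \varepsilon \le 1/2$. From (21) we obtain
$\Delta_n := G(u_n) - G(u + \varepsilon(u_n - u)) = \langle G'(w_n), (1 - \varepsilon)(u_n - u) \rangle,$ (22)
where
$w_n := u_n - \vartheta_n (1 - \varepsilon)(u_n - u), \quad 0 < \vartheta_n < 1;$
therefore, since $w_n - u = (1 - \vartheta_n (1 - \varepsilon))(u_n - u)$,
$\Delta_n = (1 - \varepsilon)(1 - \vartheta_n (1 - \varepsilon))^{-1} \langle G'(w_n), w_n - u \rangle.$
Obviously, $w_n \rightharpoonup u$ as $n \to \infty$ and
$\limsup \langle G'(w_n), w_n - u \rangle = \limsup \Delta_n (1 - \vartheta_n (1 - \varepsilon))(1 - \varepsilon)^{-1}.$
We shall show that
$\limsup \Delta_n = \limsup G(u_n) - G(u + \varepsilon(u_n - u)) \le 0$ (23)
as $n \to \infty$. Then
$\limsup \langle G'(w_n), w_n - u \rangle \le 0,$
and $(S)_+$ yields $w_n \to u$, i.e.,
$u_n - u = (1 - \vartheta_n (1 - \varepsilon))^{-1} (w_n - u) \to 0.$
Proof of (23): Since $u + \varepsilon(u_n - u) \in M$, we have $G(u + \varepsilon(u_n - u)) \ge \alpha$ for
all $n \in \mathbb{N}$ and $G(u_n) \to \alpha$. For $\alpha > -\infty$, (23) follows. Let $\alpha = -\infty$. Recall that
$(u_n)$ is bounded. We choose $\varepsilon > 0$ so small that the sequence of the $G(u + \varepsilon(u_n
- u))$ is bounded because of the continuity of $G$. Then (23) also holds.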
41.2. Proof of Example 41.6. Solution: If, for example, we set $v = (h, 0, 0)$, $w =
(0, h, 0)$, $h \neq 0$, then $A'(u)v = (a_\xi(u)h, b_\xi(u)h, c_\xi(u)h)$ and we obtain $A'(u)w$
by replacing the derivative with respect to $\xi$ by the derivative with respect to $\eta$.
Thus, $\langle A'(u)v, w \rangle = \langle A'(u)w, v \rangle$ means $b_\xi(u) = a_\eta(u)$.
41.3. Proof of Proposition 41.5. (I) (3) is necessary. Let $A = F'$. For $\varphi(t) := F(v +
t(u - v))$ for all $t \in \mathbb{R}$, we have
$\varphi'(t) = \langle A(v + t(u - v)), u - v \rangle;$
therefore,
$F(u) - F(v) = \varphi(1) - \varphi(0) = \int_0^1 \langle A(v + t(u - v)), u - v \rangle \, dt$
and thus $F(u) = F(0) + F_A(u)$.
(II) (3) is sufficient. For $s \to 0$, we have
$\langle F_A'(v), w \rangle = \lim s^{-1} (F_A(v + sw) - F_A(v))$
$= \lim \int_0^1 \langle A(v + tsw), w \rangle \, dt = \langle A(v), w \rangle$
according to (3), i.e., $F_A' = A$. Passage to the limit and integration can be
interchanged because the integrand is continuous, for $A$ is hemicontinuous.
(III) (4) is necessary. Let $W(t, s) := F(w + tu + sv)$ for all $t, s \in \mathbb{R}$;
therefore,
$W_{ts}(t, s) = \langle A'(w + tu + sv)v, u \rangle,$
$W_{st}(t, s) = \langle A'(w + tu + sv)u, v \rangle.$
Since, by (5), the derivatives are continuous, we have $W_{ts}(0,0) = W_{st}(0,0)$
according to a known classical theorem. But this is (4).
(IV) (4) is sufficient. Let
$U(t, s) := \langle A(tv + su), v \rangle,$
$V(t, s) := \langle A(tv + su), u \rangle.$
(4) means that $U_s(t, s) = V_t(t, s)$ for all $s, t \in \mathbb{R}$. Therefore, for the classical
line integral,
$\oint U(t, s) \, dt + V(t, s) \, ds = 0$
holds. In particular, if we choose the triangle in Fig. 41.1, then we obtain (3),
i.e., $A$ is a potential operator, by (II).
Figure 41.1
References to the Literature
Potential operators: Vainberg (1956, M), (1972, M); Gajewski, Groger and
Zacharias (1974, M); Langenbach (1976, M); Berger (1977, M).
Weak lower semicontinuity: Browder (1970b); Hess (1971); Zeidler (1976).
Weak sequential lower semicontinuity of integral expressions and
existence theory: Morrey (1966, M) (standard work); Olech (1969) and Cesari
(1983, M) (control theory); Ball (1977) (fundamental paper on nonlinear
elasticity); Dacorogna (1982, L); Giaquinta (1981, L); Necas (1983, L).
Hammerstein integral equations: Hammerstein (1930) (classical work);
Vainberg (1956, M), (1972, M); Krasnoselskii (1956, M), (1975, M), Gupta
(1970), Fucik, Necas, and Soucek (1977, L); Pascali and Sburlan (1978, M)
(also, cf. the references to the literature in Chapter 28).
CHAPTER 42
Free Minima for Convex Functionals,
Ritz Method and the Gradient Method
Our science, in contrast to others, is not founded on a single period of human
history, but has accompanied the development of culture through all its
stages. Mathematics is as much interwoven with Greek culture as with the
most modern problems in engineering. It not only lends a hand to the
progressive natural sciences but participates at the same time in the abstract
investigations of logicians and philosophers.
Felix Klein (1849-1925)
In this chapter we show the intimate connection between the convexity of
the functional F and the monotonicity of the operator F', which fully
corresponds to the known connection in the case of real functions F:
$\mathbb{R} \to \mathbb{R}$. In this way we obtain an approach to the theory of monotone
operators F' by means of convex minimum problems. In contrast to general
minimum problems, convex minimum problems have a number of crucial
advantages:
(i) According to the main theorem and its variants in Sections 38.3 and
38.5, there result simple existence propositions in reflexive B-spaces.
(ii) By Theorem 38.C, it follows from the strict convexity of F that the
minimum point is unique,
(iii) Local minima are always global minima.
(iv) The Euler equation $F'(u) = 0$, where $u \in \operatorname{int} D(F)$, is not only a
necessary condition but also a sufficient condition for a free local
minimum of F at u.
(v) One has productive approximation methods at one's disposal in the
Ritz and gradient methods.
42.1. Convex Functionals and Convex Sets
Definition 42.1. Let $X$ be a linear space and let $F\colon M \subseteq X \to \mathbb{R}$ be a
functional.
The set $M$ is said to be convex if and only if
$u, v \in M$, $t \in [0,1]$ implies $(1 - t)u + tv \in M$.
If $M$ is convex, then $F$ is said to be convex if and only if
$F((1 - t)u + tv) \le (1 - t)F(u) + tF(v)$ for all $u, v \in M$, $t \in \, ]0,1[$. (1)
$F$ is called strictly convex if and only if (1) holds with "$<$" instead of "$\le$"
whenever $u \neq v$.
$F$ is called concave if and only if $-F$ is convex.
Example 42.2. The set $M$ in Fig. 42.1(a), with $X = \mathbb{R}^2$, is convex, for
whenever $u, v$ belong to $M$, then the segment joining them also belongs to
$M$. The function $\varphi\colon \mathbb{R} \to \mathbb{R}$ in Figs. 42.1(b) and 42.1(c) is convex, for the
chords always lie above the curve belonging to $\varphi$. In Fig. 42.1(b), $\varphi$ is also
strictly convex, i.e., the interior points of the chords lie properly above the
curve. In Fig. 42.1(c), $\varphi$ is not strictly convex.
Figure 42.1
Proposition 42.3. Let $F\colon M \subseteq X \to \mathbb{R}$ be a convex functional on the convex set
$M$ in the real locally convex space $X$. If $F$ has a local minimum at $u_0$, i.e.,
$F(u) \ge F(u_0)$ for all $u \in U(u_0) \cap M$, (2)
where $U(u_0)$ is an appropriate neighborhood of $u_0$, then $u_0$ is a global
minimum, i.e.,
$F(u) \ge F(u_0)$ for all $u \in M$. (3)
Proof. Let $u \in M$, where $u \neq u_0$. Then there exists a $\lambda \in \, ]0,1]$ such that
$u_0 + \lambda(u - u_0) \in U(u_0) \cap M$. By (2),
$F(u_0) \le F(u_0 + \lambda(u - u_0)) \le \lambda F(u) + (1 - \lambda)F(u_0);$
therefore $F(u_0) \le F(u)$. □
42.2. Real Convex Functions
Proposition 42.4 (Convexity Criterion). As in Definition 42.1, let $F$ be given.
We set $\varphi(t) = F(u + t(v - u))$. Then:
$F$ is (strictly) convex $\Leftrightarrow$ $\varphi$ is (strictly) convex on $[0,1]$ for all $u, v \in M$.
We recommend that the reader give the proof as an easy exercise.
Geometrically this proposition means that a convex functional F is also
convex over every segment in M, and conversely. Proposition 42.4 allows
one to reduce the investigation of convex functionals F to the investigation
of real convex functions <p. For this reason we first summarize classical
results concerning <p and then apply these results in the next section.
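The reduction to segments in Proposition 42.4 suggests a simple numerical screening test. The sketch below samples $\varphi(t) = F(u + t(v - u))$ on a grid and reports whether inequality (1) is violated. The two functionals on $\mathbb{R}^2$, the grid size, and the tolerance are our own illustrative choices. Note that such sampling can only detect a violation; it cannot prove convexity.

```python
import math

def violates_convexity(F, u, v, samples=21, tol=1e-12):
    """Return True if inequality (1) fails at some sampled points of the segment [u, v]."""
    def phi(t):
        return F([ui + t * (vi - ui) for ui, vi in zip(u, v)])

    ts = [i / (samples - 1) for i in range(samples)]
    for t1 in ts:
        for t2 in ts:
            for s in ts:
                mid = (1 - s) * t1 + s * t2
                # convexity of phi requires phi(mid) <= (1-s) phi(t1) + s phi(t2)
                if phi(mid) > (1 - s) * phi(t1) + s * phi(t2) + tol:
                    return True
    return False

def F_convex(x):
    return math.exp(x[0]) + x[1] ** 2

def F_nonconvex(x):
    return math.sin(x[0]) + x[1] ** 2

u = [0.0, 0.0]
v = [math.pi, 1.0]
convex_ok = not violates_convexity(F_convex, u, v)
nonconvex_detected = violates_convexity(F_nonconvex, u, v)
```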
Proposition 42.5. The following assertions hold for the real function $\varphi\colon
[a, b] \to \mathbb{R}$, where $-\infty < a < b < \infty$:
(a) $\varphi$ is convex implies $\varphi'_-(t) \le \varphi'_+(t)$ for all $t \in \, ]a, b[$. (4)
(b) $\varphi$ is convex implies $\varphi$ is continuous on $]a, b[$. (5)
(c) If $\varphi'$ exists on $[a, b]$, then: (6)
$\varphi$ is (strictly) convex on $[a, b]$ $\Leftrightarrow$ $\varphi'$ is (strictly)
monotonically increasing on $[a, b]$.
$\varphi$ is convex on $[a, b]$ $\Rightarrow$ $\varphi'$ is continuous on $]a, b[$.
(d) If $\varphi''$ exists on $[a, b]$, then: (7)
$\varphi$ is convex on $[a, b]$ $\Leftrightarrow$ $\varphi'' \ge 0$ on $[a, b]$.
$\varphi$ is strictly convex on $[a, b]$ $\Leftarrow$ $\varphi'' > 0$ on $[a, b]$.
Assertion (a) also includes the existence of one-sided derivatives $\varphi'_\pm(t)$.
Figure 42.2 shows that $\varphi$ in (b) need not be continuous on $[a, b]$. Moreover,
one obtains an intuitive interpretation of (a). We treat the proof in Problem
42.3.
Figure 42.2
Let $\varphi\colon [a, b] \to \mathbb{R}$ be convex. In the proof of (4), it follows in particular
from the monotonicity relation (53) below with $t_1 = a$, $t_3 = b$ that:
$\varphi(b) - \varphi(a) \ge (b - a) \varphi'_+(a),$ (4a)
when $\varphi'_+(a)$ exists. By (a) this is the case if $\varphi$ is convex in a full
neighborhood of the point $a$. We shall often make use of relation (4a). We give an
intuitive interpretation in Example 42.9 and in Fig. 42.3.
42.3. Convexity of F, Monotonicity of F', and the
Definiteness of the Second Variation
We now generalize Proposition 42.5. In this connection, we use the
definition of monotone operators from Section 25.3.
Proposition 42.6. Let $F\colon X \to \mathbb{R}$ be a functional on the real B-space $X$.
Suppose the G-derivative $F'\colon X \to X^*$ exists on $X$. Then the following hold:
(1) The following three assertions are equivalent:
(i) $F$ is convex on $X$.
(ii) $F'$ is monotone on $X$.
(iii) $F(v) - F(u) \ge \langle F'(u), v - u \rangle$ for all $u, v \in X$.
(2) If $F$ is convex on $X$ and $X$ is reflexive, then $F'$ is monotone and
demicontinuous on $X$.
(3) The following three assertions are equivalent:
(i) $F$ is strictly convex on $X$.
(ii) $F'$ is strictly monotone on $X$.
(iii) $F(v) - F(u) > \langle F'(u), v - u \rangle$ for all $u, v \in X$ such that $u \neq v$.
A functional $F\colon X \to \mathbb{R}$ on the B-space $X$ is called coercive (respectively,
weakly coercive) if and only if
$\frac{F(u)}{\|u\|} \to +\infty$ as $\|u\| \to \infty$
(respectively,
$F(u) \to +\infty$ as $\|u\| \to \infty$).
Corollary 42.7. If, under the assumptions of Proposition 42.6, $F'$ is uniformly
monotone, i.e., to be precise, if for fixed numbers $p > 1$, $c > 0$,
$\langle F'(v) - F'(u), v - u \rangle \ge c\|v - u\|^p$ for all $u, v \in X$, (8)
then
$F(v) - F(u) \ge \langle F'(u), v - u \rangle + c p^{-1} \|v - u\|^p$ for all $u, v \in X$. (9)
By Proposition 42.6, (3)(iii) it follows that $F'$ is strictly monotone and,
furthermore, that $F$ is coercive.
(9) is significant for error estimates, for, if $F$ has a minimum at $u$, then
$F'(u) = 0$ and thus
$F(v) - F(u) \ge c p^{-1} \|v - u\|^p$ for all $v \in X$. (9a)
From information about $F$, one can thus estimate $\|v - u\|$. We shall discuss
this in Remark 42.13. Furthermore, for $v = u_n$, (9) allows us to immediately
infer from $u_n \rightharpoonup u$ and $F(u_n) \to F(u)$ as $n \to \infty$ that $u_n \to u$.
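Estimate (9) can be tested on a simple uniformly monotone example. In the sketch below we assume $X = \mathbb{R}^3$ with the Euclidean norm and $F(v) = \sum_i (v_i^4/4 + v_i^2/2)$, for which (8) holds with $c = 1$ and $p = 2$. The functional, the sampling range, and the seed are our own choices for illustration. The function `gap` returns the left-hand side of (9) minus its right-hand side, which should be nonnegative.

```python
import random

def F(v):
    return sum(x ** 4 / 4 + x ** 2 / 2 for x in v)

def dF(v):
    # gradient of F; <F'(v) - F'(u), v - u> >= |v - u|^2, i.e. (8) with c = 1, p = 2
    return [x ** 3 + x for x in v]

def gap(u, v, c=1.0, p=2.0):
    """Left side minus right side of (9); nonnegative if (9) holds."""
    diff = [vi - ui for ui, vi in zip(u, v)]
    norm = sum(d * d for d in diff) ** 0.5
    linear = sum(g * d for g, d in zip(dF(u), diff))
    return F(v) - F(u) - linear - c / p * norm ** p

random.seed(0)
gaps = [gap([random.uniform(-2, 2) for _ in range(3)],
            [random.uniform(-2, 2) for _ in range(3)])
        for _ in range(1000)]
smallest_gap = min(gaps)

# Special case (9a): F has its minimum at u = 0, so F(v) - F(0) >= |v|^2 / 2.
v = [0.3, -1.2, 0.7]
error_bound_ok = F(v) - F([0.0, 0.0, 0.0]) >= 0.5 * sum(x * x for x in v)
```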
In applications, the convexity of F is often obtained conveniently by
investigating the second variation. In preparation for this, we summarize as
follows:
$\delta^2 F(u; h) \ge 0$ for all $u, h \in X$. (10)
$\delta^2 F(u; h) > 0$ for all $u, h \in X$, $h \neq 0$. (11)
$\delta^2 F(u; h) \ge c\|h\|^p$ for all $u, h \in X$ and fixed $p > 1$, $c > 0$, (12)
$t \mapsto \delta^2 F(u + t(v - u); v - u)$ is continuous on $[0,1]$ for all $u, v \in X$.
Corollary 42.8. If, in addition to the assumptions in Proposition 42.6, the
second variation $\delta^2 F(u; h)$ exists for all $u, h \in X$, then one has the following
criteria for convexity:
(i) (10) $\Leftrightarrow$ $F$ is convex on $X$.
(ii) (11) $\Rightarrow$ $F$ is strictly convex on $X$.
(iii) (12) $\Rightarrow$ (8) holds and thus the assertions of Corollary 42.7 are valid.
The proofs, which follow easily from Section 42.2, will be treated in
Problem 42.4. As a special case of Corollary 42.8, we explain the application
to quadratic variational problems and thereby sharpen Example 38.16.
Example 42.9. Let $X$ be a real B-space. We set $F(u) = 2^{-1} a(u, u) - b(u)$.
Here, let $a\colon X \times X \to \mathbb{R}$ be bilinear, bounded, and symmetric and let
$b \in X^*$. According to Problem 40.2,
$\delta^2 F(u; h) = a(h, h)$ for all $u, h \in X$
and
$\langle F'(u), h \rangle = a(u, h) - b(h)$ for all $u, h \in X$.
From Corollary 42.8 it follows that:
(i) $a$ is positive $\Leftrightarrow$ $F$ convex.
(ii) $a$ is strictly positive $\Rightarrow$ $F$ strictly convex.
(iii) $a$ is strongly positive $\Rightarrow$ $F$ strictly convex and coercive, and (9) holds for
$p = 2$.
The relation
$F(v) - F(u) \ge \langle F'(u), v - u \rangle$ for all $u, v \in X$ (13)
in Proposition 42.6, (1) intuitively signifies for $F\colon \mathbb{R} \to \mathbb{R}$, because $\langle a, b \rangle =
ab$, that for a differentiable convex function $F$, the corresponding curve lies
over the tangent through $u$ (see Fig. 42.3).
Figure 42.3
The following proposition is obtained directly from Theorem 40.B in
Section 40.3 and (13).
Proposition 42.10. If $F\colon X \to \mathbb{R}$ is a convex G-differentiable functional on the
real B-space $X$, then:
$F$ has a minimum at $u$ $\Leftrightarrow$ $F'(u) = 0$.
42.4. Monotone Potential Operators
Due to the significance of monotone operators for numerous applications,
we again present a summary of a number of important propositions for
these operators, which we obtained in Chapter 41 and in the preceding
sections. To this end, we note the following relations:
$\int_0^1 \langle A(tu), u \rangle \, dt - \int_0^1 \langle A(tv), v \rangle \, dt
= \int_0^1 \langle A(v + t(u - v)), u - v \rangle \, dt$ for all $u, v \in X$. (14)
$\langle A'(w)u, v \rangle = \langle A'(w)v, u \rangle$ for all $u, v, w \in X$,
$\langle A'(w)h, h \rangle \ge 0$ for all $w, h \in X$. (15)
Proposition 42.11. Let $A\colon X \to X^*$ be an operator on the real reflexive
B-space X.
(1) The following assertions are equivalent:
(i) A is a monotone potential operator,
(ii) A is monotone hemicontinuous and (14) holds.
(Hi) A is a potential operator, i.e., A = F', and F is convex and weakly
sequentially lower semicontinuous on X.
(2) If the demicontinuous G-derivative A' exists on X, then the following two
assertions are equivalent:
(i) A is a monotone potential operator,
(ii) (15) holds.
(3) If A is a monotone potential operator, then A is demicontinuous.
By Proposition 38.7, one can replace the condition "$F$ is weakly
sequentially lower semicontinuous on $X$" by "$F$ is lower semicontinuous on $X$" in
(iii). In (2) the reflexivity of $X$ is not needed.
Example 42.12. If A: X-> X* is a continuous linear operator on a real
B-space, then A'(w) = A for all weX, and it follows from Proposition
42.11, (2) that:
A is a monotone potential operator <=> A is positive and symmetric.
42.5. Free Convex Minimum Problems and the
Ritz Method
We now generalize the assertions in Theorem 22.A in Section 22.1 for
quadratic variational problems to convex variational problems. To this end,
we study the minimum problem
$$\min_{u \in X} F(u) - \langle b, u\rangle_X = \alpha \tag{16}$$
with the Euler equation
$$F'(u) - b = 0. \tag{16a}$$
For the construction of approximate solutions,
$$u_n = \sum_{k=1}^{n} c_{kn} w_k,$$
we set $X_n = \operatorname{span}\{w_1, \ldots, w_n\}$ and study the Ritz approximation problem
$$\min_{u_n \in X_n} F(u_n) - \langle b, u_n\rangle_X = \alpha_n \tag{17}$$
with the corresponding Ritz equations
$$\langle F'(u_n) - b, w_k\rangle_X = 0, \quad k = 1, \ldots, n. \tag{17a}$$
This is a system of nonlinear equations in the real numbers $c_{1n}, \ldots, c_{nn}$, for
whose iterative solution we prepare a gradient method in the next section.
In preparation, we note further:
$$\langle F'(v) - F'(w), v - w\rangle \ge c\,\|v - w\|^p \quad \text{for all } v, w \in X \text{ and fixed } p > 1,\ c > 0. \tag{18}$$
If this condition is fulfilled, then $F'$ is uniformly monotone.
Theorem 42.A. Suppose that the following three conditions are satisfied:
(i) $X$ is a real separable reflexive B-space with $\dim X = \infty$ and $\{w_1, w_2, \ldots\}$
is a basis in $X$.
(ii) The convex functional $F\colon X \to \mathbb{R}$ possesses a G-derivative $F'\colon X \to X^*$
on $X$, which is coercive, i.e.,
$$\frac{\langle F'(u), u\rangle}{\|u\|} \to +\infty \quad \text{as } \|u\| \to \infty.$$
(iii) $b$ is a fixed element in $X^*$.
Then:
(a) Equivalence. (16) and (16a) as well as (17) and (17a) are mutually
equivalent problems. Moreover, $F'$ is monotone and demicontinuous.
(b) Existence. (16) possesses a solution $u$, and for each $n \in \mathbb{N}$ equation (17)
possesses a solution $u_n$. If $(u_n)$ is a sequence of approximate solutions, then
there exists a subsequence which converges weakly to a solution $u$ of (16).
(c) Uniqueness. If $F'$ is strictly monotone, then all the solutions in (b) are
unique and $u_n \rightharpoonup u$ as $n \to \infty$.
(d) Strong convergence of the Ritz method. If $F'$ is uniformly monotone,
then (16) as well as (17) have exactly one solution for all $n \in \mathbb{N}$, and $(u_n)$
converges strongly to the solution $u$ of (16).
(e) Error estimate for the solution of (16). If (18) holds, then (d) occurs,
and for all $n \in \mathbb{N}$ we have
$$c\,p^{-1}\|u_n - u\|^p \le \alpha_n - \alpha. \tag{19}$$
(f) Convergence of minimal values. If $F$ is continuous, then $\alpha_n \to \alpha$ as
$n \to \infty$.
Remark 42.13. If one knows a lower bound $\beta$ for the minimal value $\alpha$ in
(16), then $\alpha_n - \alpha \le \alpha_n - \beta$, and from (19) we obtain an estimate for $\|u_n - u\|$.
Such lower bounds $\beta$ are obtained with the aid of the dual maximum
problem. We discuss this in Theorem 51.A in Section 51.3.
We treat applications of Theorem 42.A to quasilinear elliptic differential
equations in Section 42.7. Theorem 42.A is very intimately connected with
the main theorem on monotone operators in Section 26.2. In this
connection, the Ritz equations for F' are identical with the Galerkin equations.
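The Ritz method of Theorem 42.A can be illustrated on a quadratic model problem in which everything is available in closed form. The following is only a minimal sketch under assumptions that are not part of the text: $F(u) = \frac{1}{2}\int_0^1 u'(x)^2\,dx$, $\langle b, u\rangle = \int_0^1 u\,dx$, and the basis $w_k(x) = \sin(k\pi x)$, for which the Ritz equations (17a) decouple. For this choice the minimal value is $\alpha = -1/24$ (attained at $u(x) = x(1-x)/2$), and the sketch computes the Ritz values $\alpha_n$.

```python
import math

def ritz_minimum(n):
    """Ritz value alpha_n for the assumed model problem
    F(u) = 1/2 * int_0^1 u'(x)^2 dx,  <b, u> = int_0^1 u(x) dx,
    with the (assumed) basis w_k(x) = sin(k*pi*x), k = 1..n."""
    alpha_n = 0.0
    coefficients = []
    for k in range(1, n + 1):
        stiffness = (k * math.pi) ** 2 / 2.0                    # int_0^1 w_k'(x)^2 dx
        load = (1.0 - math.cos(k * math.pi)) / (k * math.pi)     # int_0^1 w_k(x) dx
        c_k = load / stiffness          # Ritz equation (17a); the basis is orthogonal
        coefficients.append(c_k)
        alpha_n += 0.5 * stiffness * c_k ** 2 - load * c_k       # contribution to F(u_n) - <b, u_n>
    return alpha_n, coefficients

if __name__ == "__main__":
    exact = -1.0 / 24.0
    for n in (1, 3, 9, 27, 81):
        alpha_n, _ = ritz_minimum(n)
        print(n, alpha_n, alpha_n - exact)
```

In this quadratic case (18) holds with $p = 2$ and $c = 1$ in the norm $\|u\|^2 = \int_0^1 u'^2\,dx$, and the printed gap $\alpha_n - \alpha$ is exactly $\frac{1}{2}\|u_n - u\|^2$, so the error estimate (19) is attained with equality.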
If the F in (16) is not differentiable, then we immediately obtain the
following corollary from Proposition 38.15 and Theorem 38.C in Section
38.4.
Corollary 42.14. If $F\colon X \to \mathbb{R}$ is convex and lower semicontinuous on the real
reflexive B-space $X$, and if, for fixed $b \in X^*$,
$$F(u) - \langle b, u\rangle \to +\infty \quad \text{as } \|u\| \to \infty, \tag{20}$$
then (16) has a solution, and the solution set is closed, bounded, and convex. If
$F$ is strictly convex, then (16) has exactly one solution.
Proof of Theorem 42.A. Let $G(u) := F(u) - \langle b, u\rangle$. Then $G'(u) = F'(u) - b$.
Here, $F$ and $G$ are convex. (a) follows directly from Propositions 42.10 and
42.11. Furthermore, (b), (c), and (d) are obtained from the main theorem on
monotone operators (Theorem 26.A in Section 26.3) and from (a). Observe
that $F'$ is demicontinuous by Proposition 42.11, (3). Finally, (e) follows
from (9a).
We prove (f). Let $u$ be a solution of (16). By hypothesis, $\{w_1, w_2, \ldots\}$ is a
basis in $X$. This means that $X_1 \subset X_2 \subset \cdots \subset X$ and $\overline{\bigcup_n X_n} = X$. Thus, there
exists a sequence of natural numbers $(n')$ and elements $u_{n'} \in X_{n'}$ such that
$u_{n'} \to u$ as $n' \to \infty$. Furthermore, $X_1 \subset X_2 \subset \cdots \subset X$ yields $\alpha_n \ge \alpha$ for all $n$;
$(\alpha_n)$ decreases monotonically and thus converges. From $\alpha = G(u) \le \alpha_{n'} \le
G(u_{n'}) \to G(u)$, we obtain $\alpha_n \to \alpha$. □
42.6. Free Convex Minimum Problems and the
Gradient Method
We elucidated the basic idea of the gradient method in Section 37.29. Here
we use this method to solve, by successive approximations, the minimum
problem
$$\min_{u \in X} F(u) - \langle b, u\rangle_X = \alpha \tag{21}$$
and the corresponding Euler equation
$$F'(u) - b = 0 \tag{22}$$
with the aid of the iteration method
$$u_{n+1} = u_n - t_n U(F'(u_n) - b), \quad n = 0, 1, \ldots. \tag{23}$$
In this connection, we make the following assumptions:
(H1) $X$ is a real separable reflexive B-space, and $b$ is a fixed element in
$X^*$. The functional $F\colon X \to \mathbb{R}$ possesses the G-derivative $F'\colon X \to X^*$.
(H2) $F'$ is uniformly monotone. To be precise, for all $v, w \in X$ and fixed
$p > 1$, $c > 0$, we have:
$$\langle F'(v) - F'(w), v - w\rangle \ge c\,\|v - w\|^p.$$
(H3) $F'$ is locally Lipschitz-continuous, i.e., for each $r > 0$, there exists a
number $M(r) > 0$ such that, for all $v, w \in X$,
$$\|v\|, \|w\| \le r \quad \text{implies} \quad \|F'(v) - F'(w)\| \le M(r)\,\|v - w\|.$$
(H4) $U\colon X^* \to X$ is a fixed operator with the property that, for all $v \in X^*$,
$$\langle v, Uv\rangle_X = \|v\|^2, \qquad \|Uv\| = \|v\|.$$
In an H-space $X$, $U$ is the duality mapping from $X^*$ into $X^{**} = X$. If we
identify $X$ with $X^*$, then $U$ is equal to the identity mapping $I$ (cf. Section
21.4). If $X$ is a strictly convex real reflexive B-space, then $U$ is the duality
mapping from $X^*$ into $X^{**} = X$. In this connection, $Uv = H'(v)$, and
$H(v) = 2^{-1}\|v\|^2$ (cf. Section 47.12).
(H5) In the gradient method (23), we obtain the $t_n$ in the following way:
We start with a fixed element $u_0 \in X$ and, for $n = 0, 1, \ldots$ successively, we
choose the numbers $r_n$, $M_n$, $t_n$ such that
$$r_n = \|u_n\| + \|F'(u_n) - b\|, \qquad M_n = \max(1, M(r_n)),$$
1
Theorem 42.B. Suppose (H1)-(H5) hold. Then (21) has exactly one
solution $u$.
The gradient method (23) converges to $u$ as $n \to \infty$. For all $n = 0, 1, \ldots$, one
has the error estimates
$$c\,\|u_n - u\|^{p-1} \le \|F'(u_n) - b\|, \tag{24}$$
$$c^2\,\|u_n - u\|^{2(p-1)} \le 2 t_n^{-1}\bigl[F(u_n) - F(u_{n+1}) + \langle b, u_{n+1} - u_n\rangle\bigr]. \tag{25}$$
Theorem 42.B is intimately connected with the generalized gradient
method of Theorem 26.B in Section 26.2.
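The iteration (23) can be tried out in the simplest setting $X = \mathbb{R}^3$ with the Euclidean norm, where $U$ is the identity. The sketch below is illustrative only and rests on assumptions not taken from the text: the functional $F(u) = \sum_i (u_i^2/2 + u_i^4/4)$, for which (H2) holds with $p = 2$, $c = 1$ and (H3) holds with $M(r) = 1 + 3r^2$, and the step rule $t_n = 1/(2M_n)$, chosen so that $t_n \le 2^{-1}$ and $t_n M_n \le 2^{-1}$ as used in the proof of (28).

```python
import math

def F(u):
    # Assumed test functional: F(u) = sum(u_i^2 / 2 + u_i^4 / 4)
    return sum(0.5 * x * x + 0.25 * x ** 4 for x in u)

def grad_F(u):
    # F'(u)_i = u_i + u_i^3
    return [x + x ** 3 for x in u]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

def lipschitz_bound(r):
    # On the ball ||v|| <= r the Jacobian diag(1 + 3 v_i^2) has norm <= 1 + 3 r^2.
    return 1.0 + 3.0 * r * r

def gradient_method(b, u0, steps):
    """Iteration (23) with U = identity and the assumed step rule t_n = 1 / (2 M_n)."""
    u = list(u0)
    history = []  # entries: (u_n, ||F'(u_n) - b||, d_n = F(u_n) - <b, u_n>)
    for _ in range(steps):
        residual = [g - bi for g, bi in zip(grad_F(u), b)]
        d_n = F(u) - sum(bi * ui for bi, ui in zip(b, u))
        history.append((list(u), norm(residual), d_n))
        r_n = norm(u) + norm(residual)
        M_n = max(1.0, lipschitz_bound(r_n))
        t_n = 1.0 / (2.0 * M_n)            # assumption, see lead-in
        u = [ui - t_n * ri for ui, ri in zip(u, residual)]
    return u, history

if __name__ == "__main__":
    # For b = (2, 10, 0) the Euler equation u_i + u_i^3 = b_i has the solution (1, 2, 0).
    u, history = gradient_method([2.0, 10.0, 0.0], [0.0, 0.0, 0.0], 2000)
    print(u)
```

With $p = 2$ and $c = 1$, the estimate (24) reads $\|u_n - u\| \le \|F'(u_n) - b\|$, so the residual norm stored in `history` is a computable upper bound for the error, and the values $d_n$ are nonincreasing as in (28).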
Proof. The existence and uniqueness assertion follows from Theorem 42.A
in Section 42.5. We now investigate $(u_n)$ and set $d_n := F(u_n) - \langle b, u_n\rangle$. The
following two relations are crucial:
$$\langle F'(u_{n+1}) - b, u_n - u_{n+1}\rangle \le d_n - d_{n+1}, \tag{26}$$
$$c\,\|u_n - u\|^p \le \langle F'(u_n) - b, u_n - u\rangle \le \|F'(u_n) - b\|\,\|u_n - u\|. \tag{27}$$
(26) follows from Proposition 42.6, 1 (iii) and the monotonicity of $F'$. (27)
results from (H2) and $F'(u) = b$. (24) is a direct consequence of (27). Below
we shall prove:
$$2^{-1} t_n \|F'(u_n) - b\|^2 \le d_n - d_{n+1}, \tag{28}$$
$$(u_n) \text{ is bounded.} \tag{29}$$
The proof follows easily from this, for, by (28), $(d_n)$ is monotonically
decreasing. $d_n \ge \alpha$ for all $n \in \mathbb{N}$ yields the convergence of $(d_n)$. Due to (29) and
(H5), $\inf_n t_n > 0$. Thus, (28) guarantees that $F'(u_n) - b \to 0$ as $n \to \infty$;
therefore, $u_n \to u$ according to (27). Furthermore, we obtain (25) from (28) and
(24).
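Proof of (28). Since $t_n \le 2^{-1}$, by (23) and (H4) we obtain:
$$\|u_n\| \le r_n,$$
$$\|u_{n+1}\| \le \|u_n\| + 2^{-1}\|F'(u_n) - b\|,$$
$$\|u_n - u_{n+1}\| \le t_n \|F'(u_n) - b\|,$$
$$\langle F'(u_n) - b, u_n - u_{n+1}\rangle = t_n \langle F'(u_n) - b, U(F'(u_n) - b)\rangle = t_n \|F'(u_n) - b\|^2.$$
Thus, from (26) it follows that:
$$\begin{aligned}
d_n - d_{n+1} &\ge \langle F'(u_n) - b, u_n - u_{n+1}\rangle + \langle F'(u_{n+1}) - F'(u_n), u_n - u_{n+1}\rangle \\
&\ge t_n \|F'(u_n) - b\|^2 - \|F'(u_{n+1}) - F'(u_n)\|\,\|u_n - u_{n+1}\| \\
&\ge t_n \|F'(u_n) - b\|^2 - M_n \|u_n - u_{n+1}\|^2 \\
&\ge t_n (1 - t_n M_n)\,\|F'(u_n) - b\|^2.
\end{aligned}$$
Proof of (29). By Corollary 42.7, $F(u)/\|u\| \to +\infty$ as $\|u\| \to \infty$; therefore,
$F(u) - \langle b, u\rangle \to +\infty$ as $\|u\| \to \infty$. Now (29) follows from the boundedness
of $(d_n)$. □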
Additional propositions concerning the gradient method can be found in
Problem 42.8.
42.7. Application to Variational Problems and
Quasilinear Elliptic Differential Equations in
Sobolev Spaces
Our goal is to give existence proofs for the classical variational problems
considered in Section 40.5. We first explain the strategy that we shall pursue
in doing this.
In Section 40.5 we considered variational problems for multidimensional
integral expressions on spaces of smooth functions. Here, for certain classes
of such problems, which correspond to perturbed convex problems, we
prove the existence of generalized solutions in Sobolev spaces. In this
connection we also obtain generalized solutions of the Euler equations that
correspond to quasilinear elliptic differential equations. The existence
propositions for these differential equations are special cases of the results that
we obtained in Chapter 26 with the aid of the theory of monotone
operators, and which resulted from Chapter 27 within the context of the
theory of pseudomonotone operators. In Chapters 26 and 27 we studied
generalized boundary value problems for quasilinear elliptic differential
equations of the form
$$a(u, h) - \langle b, h\rangle = 0 \quad \text{for all } h \in X. \tag{30}$$
There we showed that (30) is equivalent to the operator equation
$$Au - b = 0, \quad u \in X, \tag{31}$$
where $\langle Au, h\rangle = a(u, h)$. In this section, we consider the special case
$A = F'$, i.e., $A$ is a potential operator. Then (31) is the Euler equation for the
minimum problem
$$\min_{u \in X} F(u) - \langle b, u\rangle = \alpha. \tag{32}$$
If we set $G(u) := F(u) - \langle b, u\rangle$ and take into account that $\delta G(u; h) =
\langle F'u - b, h\rangle$, then (30) is identical to the vanishing of the first variation,
i.e., $\delta G(u; h) = 0$ for all $h \in X$. The Galerkin equations for (30),
$$a(u_n, w_k) - \langle b, w_k\rangle = 0, \quad k = 1, \ldots, n \tag{33}$$
for $u_n \in X_n$, where $X_n := \operatorname{span}\{w_1, \ldots, w_n\}$, are identical to the Ritz equations
for (32) in Section 42.5. For our applications, the following situation of a
perturbed convex problem occurs:
(i) $F = F_1 + F_2$, the functional $F_1\colon X \to \mathbb{R}$ is convex and continuous, $F_2\colon
X \to \mathbb{R}$ is weakly sequentially continuous.
(ii) $F(u) - \langle b, u\rangle \to +\infty$ as $\|u\| \to +\infty$ for fixed $b \in X^*$.
According to Proposition 38.7, (2), $F$ is then weakly sequentially lower
semicontinuous. From Proposition 38.15 it follows that:
If (i) and (ii) hold and $X$ is a real reflexive B-space,
then (32) has a solution. (34)
We fulfill condition (i) by seeing that the integrand of $F_1$ is convex with
respect to $u$ and all its partial derivatives and that, in comparison with $F_1$,
$F_2$ contains only derivatives of lower order. To these belong the growth
conditions which assure the existence of the integrals and which give the
continuity of $F_1$ because of the continuity of the Nemyckii operator. We
guarantee (ii) by means of the coerciveness condition on the integrands.
With differentiability conditions and growth conditions for the integrands
of $F_1$ and $F_2$, we obtain the existence of the F-derivatives $F_1'$ and $F_2'$, where
$F_1'$ is monotone and continuous and $F_2'$ is strongly continuous, i.e., $A = F_1'
+ F_2'$ is pseudomonotone.
42.7a. A Second-Order Differential Equation
First we explain the preceding considerations on the basis of a simple
example:
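$$\int_G \Bigl( p^{-1} \sum_{i=1}^{N} |D_i u|^p + g(u) - f u \Bigr)\,dx = \min!, \quad u = 0 \text{ on } \partial G. \tag{35}$$
Let $G$ be a bounded region in $\mathbb{R}^N$ and let $p \ge 2$. Furthermore, let $x =
(\xi_1, \ldots, \xi_N)$, $D_i = \partial/\partial\xi_i$. If $u \in C^2(\bar G)$ is a solution of (35), then, by Section
40.5,
$$G\colon\ -\sum_{i=1}^{N} D_i\bigl(|D_i u|^{p-2} D_i u\bigr) + g'(u) = f, \qquad \partial G\colon\ u = 0. \tag{36}$$
Observe that the real function $\varphi(t) = p^{-1}|t|^p$ on $\mathbb{R}$ has the derivative
$\varphi'(t) = |t|^{p-2} t$. Since $\varphi''(t) = (p-1)|t|^{p-2}$, $\varphi''(t) \ge 0$. Therefore, $\varphi$ is
convex. We set
$$F_j(u) := \int_G L^{(j)}\,dx, \quad j = 1, 2, \qquad L^{(1)} := p^{-1} \sum_{i=1}^{N} |D_i u|^p, \qquad L^{(2)} := g(u),$$
and we choose $X := \mathring{W}_p^1(G)$. Let $f \in L_q(G)$ be given, where $p^{-1} + q^{-1} = 1$.
According to (22.1°), there then exists a functional $b \in X^*$ such that
$$\langle b, u\rangle = \int_G f u\,dx \quad \text{for all } u \in X.$$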
As the generalized problem for (35), we now consider
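$$F(u) - \langle b, u\rangle = \min!, \quad u \in X, \tag{37}$$
where $F := F_1 + F_2$. For $u \in X$, we have $u = 0$ on $\partial G$ in the sense of
generalized boundary values. According to Section 40.5, we expect the first
variation to be
$$\langle F'u, h\rangle = \int_G \Bigl( \sum_{i=1}^{N} |D_i u|^{p-2} D_i u\, D_i h + g'(u)\,h \Bigr)\,dx \quad \text{for all } h \in X. \tag{38}$$
To (36) there corresponds the generalized boundary value problem
$$\langle F'u, h\rangle - \langle b, h\rangle = 0 \quad \text{for all } h \in X \tag{39}$$
with the corresponding Ritz equations
$$\langle F'u_n, w_k\rangle - \langle b, w_k\rangle = 0, \quad k = 1, \ldots, n \tag{39a}$$
for $u_n \in X_n := \operatorname{span}\{w_1, \ldots, w_n\}$. The $w_1, w_2, \ldots$ form a basis in $X$. In the
appendix to Part II, we gave a number of possibilities for this (cf. A2(56)).
For $g$, we assume that:
$$g \in C^1(\mathbb{R}), \quad g(u) \ge -(\text{constant})\,|u| - \text{constant} \quad \text{for all } u \in \mathbb{R}, \tag{40}$$
$$|g(u)| \le (\text{constant})\,(1 + |u|^p) \quad \text{for all } u \in \mathbb{R}, \tag{41}$$
$$|g'(u)| \le (\text{constant})\,(1 + |u|^{p-1}) \quad \text{for all } u \in \mathbb{R}. \tag{42}$$
For example, one can choose $g(u) = |u|^p$.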
Example 42.15. With the assumptions made above, (37) has a solution that
satisfies (39) with (38).
If g = 0, then all the assertions of Theorem 42.A in Section 42.5 hold,
including the consequences from (18). In particular, (37) has exactly one
solution $u$, and the uniquely determined Ritz approximations $u_n$ converge
strongly to $u$ in $X$ as $n \to \infty$.
In Section 51.6 we shall discuss the duality theory for (37) with g = 0 and
the error estimates that follow.
Proof. We carry out the proof so that it can immediately be carried over
to a more general situation in the next section. We set $D_0 u = u$,
$D = (D_0, D_1, \ldots, D_N)$. The estimate
$$|L^{(j)}(Du)| \le (\text{constant}) \Bigl( 1 + \sum_{i=0}^{N} |D_i u|^p \Bigr), \quad j = 1, 2 \tag{43}$$
is essential.
(I) Existence. $F_1$ is continuous on $X$, for, according to (43) and
Proposition 26.4, it follows that the Nemyckii operator belonging to $L^{(1)}$ is a
continuous operator from $X = \mathring{W}_p^1(G)$ into $L_1(G)$. Thus, from $u_n \to u$ in $X$
it follows that $L^{(1)}(Du_n) \to L^{(1)}(Du)$ in $L_1(G)$; therefore, $F_1(u_n) \to F_1(u)$.
$F_1$ is convex because of the convexity of $\varphi$.
$F_2$ is weakly sequentially continuous. First, as above for $F_1$, it follows that
$F_2$ is continuous on $X$. However, by virtue of (41) and Proposition 26.4, $F_2$ is
also continuous on $L_p(G)$. The embedding $X \subset L_p(G)$ is compact. Hence,
$$u_n \rightharpoonup u \text{ in } X \ \Rightarrow\ u_n \to u \text{ in } L_p(G) \ \Rightarrow\ F_2(u_n) \to F_2(u) \text{ as } n \to \infty.$$
Now, we shall show that $F(u) - \langle b, u\rangle \to +\infty$ as $\|u\| \to \infty$. According to
(40),
$$F(u) \ge \|u\|_{1,p,0}^p - (\text{constant})\,\|u\|_1 - \text{constant},$$
where $\|u\|_{1,p,0}^p := F_1(u)$ and $\|\cdot\|_1$ denotes the norm on $L_1(G)$. By A2(53b),
$\|\cdot\|_{1,p,0}$ is an equivalent norm on $X$. Due to the continuous embeddings
$X \subset L_p(G) \subset L_1(G)$,
$$F(u) \ge c\,\|u\|^p - (\text{constant})\,\|u\| - \text{constant}$$
for all $u \in X$ and positive constant $c$. Moreover, $|\langle b, u\rangle| \le \|b\|\,\|u\|$.
Now, the existence assertion follows from (34).
(II) Proof of (38). Let $L = L^{(1)} + L^{(2)}$. For $u, h \in X$ and all $t \in [-t_0, t_0]$,
$$\frac{d}{dt} \int_G L(D(u + th))\,dx = \int_G \sum_{i=0}^{N} L_{D_i}(D(u + th))\, D_i h\,dx. \tag{44}$$
In order to justify the differentiation under the integral sign, one has to
estimate the integrand on the right-hand side, uniformly with respect to $t$,
against an integrable function, according to A2(25). However, from (42)
and A2(30b) it follows that
$$\Bigl| \sum_{i=0}^{N} L_{D_i}(D(u + th))\, D_i h \Bigr| \le (\text{constant}) \sum_{i=0}^{N} \Bigl[ 1 + \sum_{j=0}^{N} \bigl( |D_j u|^{p-1} + |D_j h|^{p-1} \bigr) \Bigr] |D_i h|.$$
By Proposition 26.4, the expression in the square brackets, $[\ldots]$, belongs to
$L_q(G)$ and $D_i h$ lies in $L_p(G)$. Consequently, by the Hölder inequality,
$[\ldots]\,|D_i h|$ belongs to $L_1(G)$. Now (44), with $t = 0$, yields $\delta F(u; h)$.
Furthermore, the Hölder inequality assures that $h \mapsto \delta F(u; h)$ is a continuous linear
functional on $X$. Consequently, the G-derivative $F'$ exists, and (38) holds.
As in the proof of Proposition 26.7, one shows the continuity of F' with
the aid of the continuity of the Nemyckii operator. Hence, F' exists even as
an F-derivative.
Theorem 40.B, (1) in Section 40.2 yields (38a).
(III) If $g = 0$, then all the assumptions of Theorem 42.A in Section 42.5
are fulfilled. In particular,
$$\langle F_1'(u) - F_1'(v), u - v\rangle \ge c\,\|u - v\|^p \quad \text{for all } u, v \in X$$
follows from inequality (25.45), i.e., from
$$(|a|^{p-2} a - |b|^{p-2} b)(a - b) \ge c_1 |a - b|^p \quad \text{for all } a, b \in \mathbb{R},$$
where $c_1 > 0$ is fixed, as well as from the fact that $\|\cdot\|_{1,p,0}$ is an equivalent
norm on $X$. □
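The scalar inequality used in step (III) is easy to probe numerically. The sketch below checks it on a grid for several exponents $p \ge 2$. The constant $c_1 = 2^{2-p}$ is an assumption taken from the standard form of this inequality, not from the text, which only states that some fixed $c_1 > 0$ exists.

```python
def monotonicity_gap(a, b, p):
    """Return (|a|^(p-2) a - |b|^(p-2) b)(a - b) - c1 |a - b|^p
    with the assumed constant c1 = 2^(2 - p), for p >= 2."""
    lhs = (abs(a) ** (p - 2) * a - abs(b) ** (p - 2) * b) * (a - b)
    rhs = 2.0 ** (2 - p) * abs(a - b) ** p
    return lhs - rhs

def smallest_gap(p, radius=3.0, steps=24):
    """Smallest gap over a uniform grid of (a, b) in [-radius, radius]^2."""
    grid = [-radius + 2.0 * radius * i / steps for i in range(steps + 1)]
    return min(monotonicity_gap(a, b, p) for a in grid for b in grid)

if __name__ == "__main__":
    for p in (2.0, 2.5, 3.0, 4.0, 6.0):
        print(p, smallest_gap(p))
```

A nonnegative smallest gap (up to rounding) is consistent with the inequality; the value $0$ is attained for $a = b$ and, with this particular constant, also for $b = -a$.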
42.7b. Differential Equations of Order 2m
We study the variational problem
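$$\int_G L(x, Du(x))\,dx - \int_G f u\,dx = \min!, \qquad D^\beta u = 0 \text{ on } \partial G \text{ for all } \beta,\ |\beta| \le m - 1. \tag{45}$$
$L$ depends on $x$ and on all partial derivatives $D^\alpha u$ up to and including
order $m$. In this connection, $D^0 u := u$. We conceive of $L$ as a real function
of $x$ and $D$, where $D = (D^\alpha)_{|\alpha| \le m}$ and $D^\alpha \in \mathbb{R}$, $D \in \mathbb{R}^d$. Furthermore, $L_{D^\alpha}$
and $L_{D^\alpha D^\beta}$ denote partial derivatives. $|\alpha|$ is the order of the differential
operator $D^\alpha$.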
According to Section 40.5, the following boundary value problem
formally belongs to (45):
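$$G\colon\ \sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha A_\alpha(x, Du(x)) = f(x), \qquad \partial G\colon\ D^\beta u = 0 \text{ for all } \beta,\ |\beta| \le m - 1, \tag{46}$$
with
$$A_\alpha = L_{D^\alpha} \quad \text{for all } \alpha,\ |\alpha| \le m. \tag{46a}$$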
We have already dealt with problems of type (46) independently of (46a) in
Chapters 26 and 27 within the context of the theory of monotone and
pseudomonotone operators. If (46) is given, then the following question
naturally arises: When is this problem a variational problem? The first
formal answer reads as follows: (46a) must hold. In order to formulate a
handier criterion, we imagine that we always have $L_{D^\alpha D^\beta} = L_{D^\beta D^\alpha}$ for
smooth $L$. Thus, from (46a) it follows that
$$(A_\alpha)_{D^\beta} = (A_\beta)_{D^\alpha} \quad \text{for all } \alpha, \beta \text{ such that } |\alpha|, |\beta| \le m. \tag{47}$$
If, say, all $A_\alpha$ belong to $C^1(G \times \mathbb{R}^d)$ and (47) holds on $G \times \mathbb{R}^d$, then by a
classical theorem there exists an $L$ such that (46a) holds on $G \times \mathbb{R}^d$ when $G$
is a simply connected region in $\mathbb{R}^N$. Thus, in order to decide if (46) belongs
to a variational problem, one will first verify (47).
In order to treat (45) as a perturbed convex problem on a Sobolev space,
parallel to the preceding section, we set
$$L(x, D) = L^{(1)}(x, D) + L^{(2)}(x, D)$$
and make the following assumptions:
(H1) $G$ is a bounded region in $\mathbb{R}^N$, $N \ge 1$, and $1 < p < \infty$, $p^{-1} + q^{-1} = 1$,
$m \ge 1$. We set $X = \mathring{W}_p^m(G)$. Let $f \in L_q(G)$ be given and fixed.
(H2) Growth condition. $L \in C(G \times \mathbb{R}^d)$ and
$$|L(x, D)| \le c_1 \Bigl( |a_1(x)| + \sum_{|\gamma| \le m} |D^\gamma|^p \Bigr).$$
To be precise, these conditions are to hold for $L^{(1)}$ and $L^{(2)}$ separately.
(H3) Coerciveness condition.
$$L(x, D) \ge c_2 \sum_{|\gamma| = m} |D^\gamma|^p - c_3 |D^0| - a_2(x).$$
(H4) Convexity. $D \mapsto L^{(1)}(x, D)$ is convex on $\mathbb{R}^d$ for all $x \in G$.
(H5) Degenerate perturbation. $L^{(2)}$ depends only on $x$ and all partial
derivatives up to and including order $m - 1$.
(H6) Growth condition for $A_\alpha$. Let $L \in C^1(G \times \mathbb{R}^d)$ and
$$|L_{D^\alpha}(x, D)| \le c_4 \Bigl( |b(x)| + \sum_{|\gamma| \le m} |D^\gamma|^{p-1} \Bigr).$$
To be exact, this condition is to hold for $L^{(1)}$ and $L^{(2)}$ separately.
The inequalities above are to be fulfilled for all arguments, i.e., for all
$(x, D) \in G \times \mathbb{R}^d$ and $\alpha$ with $|\alpha| \le m$. Here, $c_j$ denotes a positive constant.
Furthermore, let $a_i \in L_1(G)$, $b \in L_q(G)$.
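If $L^{(1)} \in C^1(G \times \mathbb{R}^d)$, then, by Problem 42.6, (H4) is equivalent to
$$\sum_{|\alpha| \le m} \bigl( L^{(1)}_{D^\alpha}(x, D) - L^{(1)}_{D^\alpha}(x, D') \bigr) (D^\alpha - D'^\alpha) \ge 0 \quad \text{for all } D, D' \in \mathbb{R}^d,\ x \in G.$$
This is the monotonicity condition on $A_\alpha$ that we assumed in Section 26.5. If
$L^{(1)} \in C^2(G \times \mathbb{R}^d)$, then (H4) is equivalent to
$$\sum_{|\alpha|, |\beta| \le m} L^{(1)}_{D^\alpha D^\beta}(x, D)\, D'^\alpha D'^\beta \ge 0 \quad \text{for all } D, D' \in \mathbb{R}^d,\ x \in G,$$
i.e., the eigenvalues of the symmetric Hessian matrix
$$\bigl( L^{(1)}_{D^\alpha D^\beta}(x, D) \bigr)$$
are nonnegative for all $(x, D) \in G \times \mathbb{R}^d$. In generalizing the monotonicity
condition, we formulate the following.
(H7) Uniform monotonicity condition. We have
$$\sum_{|\alpha| \le m} \bigl( L^{(1)}_{D^\alpha}(x, D) - L^{(1)}_{D^\alpha}(x, D') \bigr) (D^\alpha - D'^\alpha) \ge c_5 \sum_{|\gamma| = m} |D^\gamma - D'^\gamma|^p$$
for all $D, D' \in \mathbb{R}^d$, $x \in G$ and fixed $c_5 > 0$.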
Parallel to Section 42.7a, we now set
$$F_i(u) := \int_G L^{(i)}(x, Du(x))\,dx, \quad i = 1, 2, \qquad F := F_1 + F_2, \qquad \langle b, u\rangle_X := \int_G f u\,dx$$
and consider the generalized variational problem
$$F(u) - \langle b, u\rangle = \min!, \quad u \in X \tag{48}$$
instead of (45).
According to Section 40.5, we expect the first variation to be
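$$\langle F'(u), h\rangle = \int_G \sum_{|\alpha| \le m} L_{D^\alpha}(x, Du)\, D^\alpha h\,dx, \quad u, h \in X. \tag{49}$$
(46) corresponds to a generalized boundary value problem
$$\langle F'(u), h\rangle - \langle b, h\rangle = 0 \quad \text{for all } h \in X \tag{50}$$
with the corresponding Ritz equations
$$\langle F'(u_n), w_k\rangle - \langle b, w_k\rangle = 0, \quad k = 1, \ldots, n \tag{50a}$$
for $u_n \in X_n$, where $X_n := \operatorname{span}\{w_1, w_2, \ldots, w_n\}$, and $w_1, w_2, \ldots$ form a basis in
$X$. One can find examples for this in the Appendix to Part II (cf. A2(56)).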
Proposition 42.16. If (H1)-(H5) hold, then the variational problem (48) has
a solution $u$. Furthermore, $F_1$ is convex and continuous on $X$, $F_2$ is weakly
sequentially continuous on $X$, and $F$ is coercive.
If, in addition, (H6) is fulfilled, then the continuous F-derivatives $F_1'$, $F_2'$, $F'$
exist and (49) holds, and $u$ satisfies the generalized boundary value problem
(50). Furthermore, $F_1'$ is monotone and $F_2'$ is strongly continuous. For $L^{(2)} = 0$,
(48) and (50) are mutually equivalent.
Corollary 42.17. If (H1)-(H7) are fulfilled and $L^{(2)} = 0$, then all the
assertions of Theorem 42.A in Section 42.5 are valid. In particular, (48) and (50)
have exactly one solution $u$ and the sequence $(u_n)$ of the uniquely determined
Ritz approximations converges strongly to $u \in X$ as $n \to \infty$. Furthermore, $F'$ is
uniformly monotone, i.e.,
$$\langle F'(u) - F'(v), u - v\rangle \ge c\,\|u - v\|^p$$
for all $u, v \in X$ and fixed $c > 0$.
The proof is obtained in a way parallel to Example 42.15, taking
Proposition 26.12 into account.
Remark 42.18. Proposition 42.16 holds with the same proof if one weakens
the regularity assumptions on $L^{(i)}$ with respect to $x$. Instead of $L^{(i)} \in C(G
\times \mathbb{R}^d)$ for $i = 1, 2$ in (H2), it suffices that $L^{(i)}$ satisfy a Carathéodory
condition, i.e., $x \mapsto L^{(i)}(x, D)$ is measurable on $G$ for all $D \in \mathbb{R}^d$, and
$D \mapsto L^{(i)}(x, D)$ is continuous for almost all $x \in G$. Analogously, in (H6),
instead of $L^{(i)} \in C^1(G \times \mathbb{R}^d)$, it suffices that $L^{(i)}$ and all $L^{(i)}_{D^\alpha}$ satisfy a
Carathéodory condition.
Furthermore, analogously to Section 27.4, one can weaken the growth
conditions with the aid of the Sobolev embedding theorems. In Browder
(1970) and Lions (1969, M), one finds general conditions on $L$ and $L_{D^\alpha}$
which guarantee that $F'$ is pseudomonotone (respectively, satisfies the $(S)_+$
condition). In general, for existence theory we recommend Morrey (1966,
M). An important problem, to which many recent works have been devoted,
consists in verifying that the generalized solutions in Sobolev spaces are in
fact classical solutions. In this connection, it is a matter of a new conception
of Hilbert's nineteenth problem. Standard works on regularity theory are
Ladyženskaja and Ural'ceva (1964, M) and Morrey (1966, M). For recent
results we recommend Giaquinta (1981, L), Frehse (1982, S) (capacity
methods) and Nečas (1983, L). A survey of several important results can be
found in Fučik, Nečas, and Souček (1977, M), page 72. Compare also
Problem 42.13 for an important weakening of the convexity conditions.
Problems
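42.1. Convex functionals. Show: If $F\colon M \subseteq X \to \mathbb{R}$ is convex on a convex set $M$,
then
$$F\Bigl( \sum_{i=1}^{n} t_i u_i \Bigr) \le \sum_{i=1}^{n} t_i F(u_i) \tag{51}$$
for all $u_1, \ldots, u_n \in M$, $0 \le t_1, \ldots, t_n \le 1$, $\sum_i t_i = 1$.
Solution: (51) follows from (1) by induction.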
42.2. Proof of Proposition 42.4. Solution: Simple calculations using Definition
42.1.
42.3. Proof of Proposition 42.5. Solution: (a) For $a < t_1 < t_2 < t_3 < b$, it follows
from the convexity of $\varphi$ that
$$(t_3 - t_1)\,\varphi(t_2) \le (t_3 - t_2)\,\varphi(t_1) + (t_2 - t_1)\,\varphi(t_3). \tag{52}$$
If we set $g(\tau, t) := (\varphi(t) - \varphi(\tau))/(t - \tau)$ for $t > \tau$, then from (52) it follows
that
$$g(t_1, t_2) \le g(t_1, t_3) \le g(t_2, t_3) \tag{53}$$
and thus
$$g(t - h, t) \le g(t, t + h) \quad \text{for } h > 0,\ t \in\, ]a, b[. \tag{54}$$
Furthermore, by (53), the right-hand side (respectively, the left-hand side) in
(54) is monotonically decreasing (respectively, monotonically increasing) as
$h \to +0$. Thus, the limits in (54) exist as $h \to +0$ and yield $\varphi'_-(t) \le \varphi'_+(t)$.
(b) The existence of $\varphi'_\pm(t)$ implies the continuity of $\varphi$ at $t$.
All the remaining assertions except (5) can be found in any textbook on
differential and integral calculus [cf., e.g., Fichtenholz (1972, M), Vol. 1].
(5) Since $\varphi'$ is monotone by (4), we first have
$$\lim_{\tau \to t+0} \varphi'(\tau) \ge \varphi'(t) \quad \text{for all } t \in\, ]a, b[. \tag{55}$$
By (b), $\varphi$ is continuous on $]a, b[$; therefore, since $g(\tau, \sigma) \ge \varphi'(\tau)$,
$$g(t, \sigma) = \lim_{\tau \to t+0} g(\tau, \sigma) \ge \lim_{\tau \to t+0} \varphi'(\tau)$$
for $t < \tau < \sigma$. Then, as $\sigma \to t + 0$, we obtain (55) with "$\le$," i.e., (55) holds
with "$=$" instead of "$\ge$." Consequently, $\varphi'$ is continuous from the right.
One proves the continuity from the left analogously.
42.4. Proof of the assertions in Section 42.3. Let $u, v \in X$ be fixed. For $\varphi(t) :=
F(u + t(v - u))$ and all $t \in [0, 1]$,
$$\varphi'(t) = \langle F'(u + t(v - u)), v - u\rangle,$$
$$\varphi''(t) = \delta^2 F(u + t(v - u); v - u).$$
Ad (1) (I) $F$ is convex on $X$
$\iff$ $\varphi$ is convex on $[0, 1]$ for all $u, v \in X$ (Proposition 42.4)
$\iff$ $\varphi'$ is monotonically increasing on $[0, 1]$ for all $u, v \in X$ (Proposition 42.5)
$\iff$ $F'$ is monotone on $X$ (Example 25.6).
(II) $F'$ is monotone on $X$
$\Rightarrow$ $\varphi'$ is monotonically increasing on $[0, 1]$
$\Rightarrow$ $\varphi(1) - \varphi(0) = \varphi'(\vartheta) \ge \varphi'(0)$, $0 < \vartheta < 1$
$\Rightarrow$ $F(v) - F(u) \ge \langle F'(u), v - u\rangle$ for all $u, v \in X$
$\Rightarrow$ $F(u) - F(v) \ge \langle F'(v), u - v\rangle$
$\Rightarrow$ $0 \ge \langle F'(u) - F'(v), v - u\rangle$
$\Rightarrow$ $F'$ is monotone on $X$.
Ad (2) If $F$ is convex, then $\varphi'$ is continuous on $]0, 1[$ by Proposition 42.5.
Since $u$ and $v$ are arbitrary, $\varphi'$ is continuous on $[0, 1]$, i.e., $F'$ is hemicontinuous.
$F'$ is monotone by (1) and thus demicontinuous as well, according to
Fig. 27.1.
Ad (3) Use the same line of reasoning as in (1).
Proof of Corollary 42.7.
$$\varphi(1) - \varphi(0) = \int_0^1 \varphi'(t)\,dt = \varphi'(0) + \int_0^1 \bigl( \varphi'(t) - \varphi'(0) \bigr)\,dt \ge \varphi'(0) + \int_0^1 c\,t^{p-1} \|v - u\|^p\,dt.$$
Proof of Corollary 42.8. (I) $F$ is convex on $X$ $\iff$ $\varphi$ is convex on $[0, 1]$ for all
$u, v \in X$ $\iff$ $\varphi''(t) \ge 0$ for all $t \in [0, 1]$ (Proposition 42.5).
(II) $\varphi''(t) > 0$ for all $t \in [0, 1]$ $\Rightarrow$ $\varphi$ is strictly convex on $[0, 1]$ (Proposition
42.5) $\Rightarrow$ $F$ is strictly convex (Proposition 42.4).
(III) Finally,
$$\varphi'(1) - \varphi'(0) = \int_0^1 \varphi''(t)\,dt = \int_0^1 \delta^2 F(u + t(v - u); v - u)\,dt \ge \int_0^1 c\,\|v - u\|^p\,dt.$$
42.5. Proof of Proposition 42.11. Verify that this proposition follows from the
results in Sections 41.3, 41.4, and 42.3. Note that $A = F'$, $\delta^2 F(u; h) =
\langle F''(u)h, h\rangle$.
42.6. Criteria for convex functions. Formulate necessary and sufficient conditions
for $F\colon \mathbb{R}^N \to \mathbb{R}$ to be convex.
Solution: If $F \in C^1(\mathbb{R}^N)$, then $F$ is convex if and only if
$$\sum_{i=1}^{N} \bigl( D_i F(x) - D_i F(\bar x) \bigr) (\xi_i - \bar\xi_i) \ge 0 \quad \text{for all } x, \bar x \in \mathbb{R}^N. \tag{56}$$
If $F \in C^2(\mathbb{R}^N)$, then $F$ is convex if and only if
$$\sum_{i,j=1}^{N} D_i D_j F(x)\, \bar\xi_i \bar\xi_j \ge 0 \quad \text{for all } x, \bar x \in \mathbb{R}^N. \tag{57}$$
Here, $x = (\xi_1, \ldots, \xi_N)$, $D_i = \partial/\partial\xi_i$. (56) means that $F'$ is monotone. (57)
corresponds to $\delta^2 F(x; \bar x) \ge 0$ for all $x, \bar x \in \mathbb{R}^N$. Take Section 42.3 and
Example 40.4 into account.
These criteria are also valid for convex functions on open convex sets of
$\mathbb{R}^N$. A detailed exposition of the properties of convex functions is contained
in Rockafellar (1970, M) and Roberts and Varberg (1973, M).
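The first-order criterion (56) lends itself to a quick randomized sanity check. The following sketch is illustrative only; the two test functions are assumptions chosen for the example: the convex function $F(x) = \log \sum_i e^{\xi_i}$, whose gradient is the softmax map, and the concave function $F(x) = -\sum_i \xi_i^2$, for which (56) must fail.

```python
import math
import random

def grad_logsumexp(x):
    """Gradient of the convex function F(x) = log(sum_i exp(x_i)) (softmax of x)."""
    shift = max(x)                                   # shift for numerical stability
    weights = [math.exp(xi - shift) for xi in x]
    total = sum(weights)
    return [w / total for w in weights]

def grad_concave(x):
    """Gradient of the non-convex (concave) function F(x) = -sum_i x_i^2."""
    return [-2.0 * xi for xi in x]

def monotonicity_defect(grad, x, y):
    """Left-hand side of criterion (56): sum_i (D_i F(x) - D_i F(y)) (x_i - y_i)."""
    gx, gy = grad(x), grad(y)
    return sum((a - b) * (xi - yi) for a, b, xi, yi in zip(gx, gy, x, y))

def sample_min_defect(grad, dim=4, trials=2000, seed=0):
    """Smallest value of the left-hand side of (56) over random pairs of points."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        y = [rng.uniform(-5.0, 5.0) for _ in range(dim)]
        worst = min(worst, monotonicity_defect(grad, x, y))
    return worst

if __name__ == "__main__":
    print(sample_min_defect(grad_logsumexp))   # nonnegative up to rounding
    print(sample_min_defect(grad_concave))     # strictly negative
```

Such sampling can only refute convexity, never prove it; a negative sample value exhibits a pair $x, \bar x$ violating (56).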
42.7. Strongly convex functionals. $F\colon X \to \mathbb{R}$ defined on the real H-space $X$ is said to
be strongly convex if and only if for all $u, v \in X$, $t \in [0, 1]$ and fixed $m > 0$,
$$2^{-1}(1 - t)\,t\,m\,\|u - v\|^2 \le (1 - t)F(u) + tF(v) - F((1 - t)u + tv).$$
$F$ is said to be bounded convex if and only if for all $u, v \in X$, $t \in [0, 1]$ and
fixed $M > 0$,
$$(1 - t)F(u) + tF(v) - F((1 - t)u + tv) \le 2^{-1}(1 - t)\,t\,M\,\|u - v\|^2.$$
Show: If $F$ is continuous on $X$, then the following two assertions are
equivalent:
(i) $F$ is strongly convex and bounded convex.
(ii) $F'$ exists on $X$ as an F-derivative, and for all $u, v \in X$ and fixed $m$,
$M > 0$,
$$m\,\|u - v\|^2 \le (F'(u) - F'(v)\,|\,u - v) \le M\,\|u - v\|^2,$$
i.e., $F'$ is strongly monotone and Lipschitz continuous.
Hint: Compare Göpfert (1973, M), page 173.
42.8.* Gradient method. For
$$\min_{u \in X} F(u) = \alpha \tag{58}$$
we consider the gradient method
$$u_{k+1} = u_k + t_k h_k, \quad k = 0, 1, \ldots.$$
Here, start with $u_0 \in X$. If $u_k$ is known, then the method breaks off, by
definition, for $F'(u_k) = 0$. If $F'(u_k) \ne 0$, choose $h_k$ so that $(F'(u_k)\,|\,h_k) < 0$
(e.g., $h_k = -F'(u_k)$). Furthermore, determine an optimal step size $t_k$ by
$$F(u_k + t_k h_k) = \min_{t > 0} F(u_k + t h_k).$$
Show: If $F\colon X \to \mathbb{R}$ is continuous, strongly convex, and bounded convex on
the real H-space $X$, then (58) has exactly one solution $u$ and $F'(u) = 0$.
$(u_k)$ converges to $u \in X$ as $k \to \infty$ if and only if the series $\sum_k c_k^2$ diverges. In
this connection, $c_k := |(F'(u_k)\,|\,h_k)| / (\|F'(u_k)\|\,\|h_k\|)$ is a coefficient measuring
the quality of the approximation ($c_k = \infty$ when $F'(u_k) = 0$).
For k = 1,2,... with F'(uk)i= 0 and ^(1^)^0, the following error
estimates hold:
/c-l
F(«,)-«< (F(«0)-a) EI {l-mM^cj),
/c-1
m2\\uk - u\\2 < ||F'(«o)ll2 EI {l-m2M-\:}).
$m$ and $M$ are taken from Problem 42.7. If $h_k = -F'(u_k)$, then $c_k = 1$, and the
convergence is linear (cf. Section 1.3).
Hint: Compare Gopfert (1973, M), page 180. The case of underrelaxation
is also treated there, i.e., tk is smaller than the optimal step size. Therefore, in
the general case, the zigzag path of the iteration method is smoothed.
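For a quadratic functional the optimal step size of the gradient method can be written down explicitly, which makes the linear convergence easy to observe. The sketch below is only an illustration under assumptions not contained in the text: $X = \mathbb{R}^N$, $F(u) = \frac{1}{2}\sum_i a_i u_i^2 - \sum_i b_i u_i$ with $a_i > 0$, so that $m = \min_i a_i$ and $M = \max_i a_i$ in the sense of Problem 42.7, and the direction $h_k = -F'(u_k)$. It records the gaps $F(u_k) - \alpha$, which for this choice shrink at least by the classical factor $1 - m/M$ per step.

```python
def steepest_descent_quadratic(diag, b, u0, steps):
    """Gradient method with exact line search for the assumed quadratic model
    F(u) = 1/2 * sum_i a_i u_i^2 - sum_i b_i u_i, a_i = diag[i] > 0."""
    def F(u):
        return sum(0.5 * a * x * x - bi * x for a, bi, x in zip(diag, b, u))

    alpha = F([bi / a for a, bi in zip(diag, b)])           # minimal value of F
    u = list(u0)
    gaps = [F(u) - alpha]
    for _ in range(steps):
        g = [a * x - bi for a, bi, x in zip(diag, b, u)]    # F'(u_k)
        gg = sum(x * x for x in g)
        if gg == 0.0:
            break                                           # F'(u_k) = 0: method breaks off
        # Optimal step for h_k = -F'(u_k): minimizes t -> F(u_k - t g)
        t = gg / sum(a * x * x for a, x in zip(diag, g))
        u = [x - t * gx for x, gx in zip(u, g)]
        gaps.append(F(u) - alpha)
    return u, gaps

if __name__ == "__main__":
    u, gaps = steepest_descent_quadratic([1.0, 10.0], [1.0, 1.0], [0.0, 0.0], 20)
    print(u)
    print([gaps[k + 1] / gaps[k] for k in range(len(gaps) - 1)])
```

The printed ratios show the zigzag behaviour typical of the optimal step size: successive gradients are orthogonal, and the contraction factor depends on the ratio $m/M$.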
42.9.* Gradient method and optimal positioning of factories. Study Beckert (1971).
42.10.* Convex functionals and problems of elasticity and plasticity theory. Study
Langenbach (1976, M). Cf. also Part IV.
42.11. An existence theorem. Show: The minimum problem
$$\min_{u \in X} F(u) = \alpha \tag{59}$$
has a solution when $F\colon X \to \mathbb{R}$ is G-differentiable in the real reflexive B-space
$X$, and, for all $u, h \in X$, $\delta^2 F(u; h)$ exists such that
$$\delta^2 F(u; h) \ge \|h\|\,a(\|h\|) \quad \text{for all } u, h \in X, \tag{60}$$
where $a\colon [0, \infty[ \to [0, \infty[$ is a continuous function with
$$R^{-1} \int_0^R a(t)\,dt \to +\infty \quad \text{as } R \to +\infty. \tag{60a}$$
If $a(t) > 0$ for $t > 0$, then the solution of (59) is unique.
Hint: By (60) and (60a), $F$ is convex and weakly sequentially lower
semicontinuous (cf. Corollary 42.8 and Proposition 41.8). Parallel to Problem
42.4, show that (60) and (60a) imply the relation $F(u) \to +\infty$ as $\|u\| \to \infty$,
and apply Theorem 41.A in Section 41.2. Compare Fučik, Nečas, and Souček
(1977, L), page 25. The uniqueness follows from the strict convexity of $F$.
42.12. A variational problem. Apply the result of Problem 42.11 to the variational
problem
$$\int_G L(x, Du(x))\,dx = \min!, \quad u \in \mathring{W}_2^m(G) \tag{61}$$
and explicitly formulate the assumptions needed for $L$.
Solution: (61) has exactly one solution when the following two conditions
hold:
(i) $G$ is a bounded region in $\mathbb{R}^N$, $N, m \ge 1$.
(ii) $L \in C^2(G \times \mathbb{R}^d)$ and there exist a function $a \in C(G)$, where $a \ge 0$ on
$G$, and constants $c_1, c_2 > 0$ such that for all $x \in G$, $D, D' \in \mathbb{R}^d$, the following
growth conditions are satisfied:
$$|L(x, D)| \le a(x) + c_1 \sum_{|\gamma| \le m} |D^\gamma|^2,$$
$$|L_{D^\alpha}(x, D)| \le a(x) + c_1 \sum_{|\gamma| \le m} |D^\gamma| \quad \text{for all } \alpha,\ |\alpha| \le m,$$
$$|L_{D^\alpha D^\beta}(x, D)| \le a(x) \quad \text{for all } \alpha, \beta \text{ such that } |\alpha|, |\beta| \le m$$
as well as the definiteness condition
$$\sum_{|\alpha|, |\beta| \le m} L_{D^\alpha D^\beta}(x, D)\, D'^\alpha D'^\beta \ge c_2 \sum_{|\gamma| \le m} |D'^\gamma|^2.$$
Compare Fučik, Nečas, and Souček (1977, L), page 63. Instead of $a \in C(G)$,
it suffices to have $a \in L_1(G)$. Furthermore, instead of $L \in C^2$, for
$L$, $L_{D^\alpha}$, $L_{D^\alpha D^\beta}$ one needs only the Carathéodory conditions.
42.13. A variational problem in which the integrand is convex only with respect to the
highest derivatives. In Section 42.7 we used a decomposition of $F$ of the form
$F_1 + F_2$, where $F_1$ is convex. However, it suffices to have $F$ convex with
respect to the highest partial derivatives that are present. In this connection,
we consider
$$\int_G L(x, D'u, D''u)\,dx = \min!, \quad u \in \mathring{W}_p^m(G), \tag{62}$$
where $D'u = (D^\alpha u)_{|\alpha| \le m-1}$ and $D''u = (D^\alpha u)_{|\alpha| = m}$.
Show: (62) has a solution when the following three conditions hold:
(i) $G$ is a bounded region in $\mathbb{R}^N$, $N, m \ge 1$, $1 < p < \infty$.
(ii) $L \in C^1(G \times \mathbb{R}^d)$, and $L(x, D', D'')$ is convex with respect to $D''$ for all
fixed $x$, $D'$.
(iii) $L$ satisfies the growth conditions (H2) and (H6) as well as the coerciveness
condition (H3) in Section 42.7b.
Instead of $L \in C^1(G \times \mathbb{R}^d)$ it suffices to require that $L$ and all $L_{D^\alpha}$
belong to $C(G \times \mathbb{R}^d)$.
Hint: Compare Berger (1977, M), page 307. The crucial point is the
verification of the weak sequential lower semicontinuity of the functional in
(62). To this end, use Egorov's theorem from measure theory. Important
generalizations of the above existence proposition can be found in Ekeland
and Temam (1974, M), Chapter VIII, Theorem 2.2 and Morrey (1966, M),
Theorem 1.9.1.
Ladyženskaja and Ural'ceva (1964, M) and Morrey (1966, M) contain
propositions on regularity for $m = 1$, $N$ arbitrary.
42.14.** Nonconvex variational problems. Up until now, we have assumed that the
integrands have certain convexity properties with respect to the derivatives.
We now describe two methods for dealing with more general problems with
the aid of a generalized setup of the problems.
42.14a. Generalized solutions using measures (stochastic interpretation). In order to
explain the difficulties, we consider the simple example
$$F(x) = \int_0^1 \bigl[ x^2(t) + (x'^2(t) - 1)^2 \bigr]\,dt = \min!, \qquad x(0) = x(1) = 0. \tag{63}$$
Let $x(\cdot)$ be continuous and piecewise continuously differentiable or, more
generally, absolutely continuous on $[0, 1]$. The integrand is not convex with
respect to $x'$. First we show that (63) does not have a solution. The lower
bound of the integral is equal to zero. To prove this, we decompose $[0, 1]$ into
$2n$ equal subintervals and construct $x_n$ as the polygonal path with $x_n(0) = 0$
and $x_n'(t) = 1$ (respectively, $x_n'(t) = -1$) on adjacent subintervals (see Fig.
42.4). Then $F(x_n) = 1/(12n^2)$, i.e., the infimum of $F$ is equal to zero. However,
from $F(x) = 0$ it follows that $x'(t) = \pm 1$, $x(t) = 0$; hence (63) cannot have a
solution.
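The value $F(x_n) = 1/(12n^2)$ for the polygonal paths can be confirmed by exact rational arithmetic. The sketch below is a minimal illustration; the function names are hypothetical. It builds the zigzag path with slopes $\pm 1$ on $2n$ equal subintervals and integrates $x_n^2$ with Simpson's rule, which is exact here because $x_n^2$ is a quadratic polynomial on each subinterval and the term $(x_n'^2 - 1)^2$ vanishes.

```python
from fractions import Fraction

def zigzag(n, t):
    """Polygonal path x_n on [0, 1]: 2n equal subintervals, slopes +1 and -1 alternating."""
    h = Fraction(1, 2 * n)
    j = min(int(t / h), 2 * n - 1)      # index of a subinterval containing t
    s = t - j * h                       # offset inside that subinterval
    return s if j % 2 == 0 else h - s

def functional_value(n):
    """F(x_n) = int_0^1 [x_n^2 + (x_n'^2 - 1)^2] dt; the second term is zero since |x_n'| = 1."""
    h = Fraction(1, 2 * n)
    total = Fraction(0)
    for j in range(2 * n):
        a, b = j * h, (j + 1) * h
        mid = (a + b) / 2
        # Simpson's rule, exact for the quadratic integrand x_n(t)^2 on [a, b]
        total += (b - a) / 6 * (zigzag(n, a) ** 2 + 4 * zigzag(n, mid) ** 2 + zigzag(n, b) ** 2)
    return total

if __name__ == "__main__":
    for n in (1, 2, 5, 10):
        print(n, functional_value(n), Fraction(1, 12 * n * n))
```

The maximal height of $x_n$ is $1/(2n)$, so the paths converge uniformly to zero while their slopes keep the values $\pm 1$.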
Figure 42.4
In order to obtain generalized solutions which are connected with $x_n$,
instead of (63) we consider the generalized problem
    $\int_0^1 \left\{ x^2(t) + \int_{\mathbb{R}} (v^2 - 1)^2 \, d\mu_t(v) \right\} dt = \min!$,    (63a)
    $x(t) = \int_0^t \left\{ \int_{\mathbb{R}} v \, d\mu_s(v) \right\} ds$,  $x(1) = 0$.
We seek a continuous function $x: [0,1] \to \mathbb{R}$ and, for each $t \in [0,1]$, a
probability measure $\mu_t$ on $\mathbb{R}$. If $x(\cdot)$ is a continuously differentiable function
on [0,1] and we choose $\mu_t$ equal to the Dirac measure $\delta_{x'(t)}$, i.e.,
    $\mu_t(M) = 1$ if $x'(t) \in M$,  $\mu_t(M) = 0$ if $x'(t) \notin M$,
then
    $\int_{\mathbb{R}} f(v) \, d\mu_t(v) = f(x'(t))$,
and we obtain the classical expressions in (63a). Now it is easy to verify that
by
    $\mu_t = 2^{-1}(\delta_1 + \delta_{-1})$    (64)
we obtain a solution of (63a), with $x(t) = 0$. Note that the lower bound of
the functional in (63a) to be minimized equals zero, and for $\mu_t$ in (64) and
$x(t) = 0$, because
    $\int_{\mathbb{R}} f(v) \, d\mu_t(v) = 2^{-1}(f(-1) + f(1))$,
this lower bound is attained. (64) permits the interpretation that the
generalized solution takes on the derivative values $x'(t) = 1$, or $x'(t) = -1$, with
probability 1/2 at any given time. If we consider the motivation for this to be
the sequence of polygonal paths $(x_n)$ constructed above, then $(x_n)$ does
indeed converge uniformly to zero as $n \to \infty$, but $x(t) = 0$ is not a solution of
(63). However, the following two assertions hold:
(i) The integral value $F(x_n)$ tends to zero as $n \to \infty$, i.e., $(x_n)$ is a minimal
sequence.
(ii) The probability that $x_n'(t) = +1$, or $x_n'(t) = -1$, is equal to 1/2.
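The verification that the measure in (64) solves the generalized problem can be written out in two lines. The following worked computation is a sketch using only the definitions in (63a) and (64); the dummy variable $s$ is introduced here for clarity.

```latex
% Insert \mu_t = \tfrac12(\delta_1 + \delta_{-1}) into the two integrals of (63a).
\int_{\mathbb{R}} (v^2-1)^2 \, d\mu_t(v)
   = \tfrac12\bigl[(1^2-1)^2 + ((-1)^2-1)^2\bigr] = 0,
\qquad
\int_{\mathbb{R}} v \, d\mu_t(v) = \tfrac12\bigl[1 + (-1)\bigr] = 0 .
% Hence x(t) = \int_0^t 0 \, ds = 0, the side condition x(1) = 0 holds, and
\int_0^1 \Bigl\{ x^2(t) + \int_{\mathbb{R}} (v^2-1)^2 \, d\mu_t(v) \Bigr\}\, dt
   = \int_0^1 \{0 + 0\}\, dt = 0 .
```

Since the functional in (63a) is nonnegative, the value zero is its minimum.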
Analogously one can explain generalized problems for more general
variational problems. The essential advantage of this theory of generalized
solutions, which is due to L. C. Young and McShane, is that for the measures
$\mu_t$, general existence propositions are obtained with the aid of compactness
arguments with respect to appropriate topologies. As an introduction to this
subject, we recommend McShane (1978, S,H). There, for more general
control problems as above, necessary conditions connected with the Pontrjagin
maximum principle as well as existence propositions are given and
applied to four important classical problems of the calculus of variations.
A detailed exposition is contained in Young (1969, M) and Gamkrelidze
(1978, M).
Generalized solutions with the aid of convex regularization. Together with the
original problem
    $\inf_{u \in U} \int_a^b L(x, u, u') \, dx = \alpha$,    (65)
we consider the generalized problem
    $\inf_{u \in U} \int_a^b L^{**}(x, u, u') \, dx = \beta$.    (65**)
U describes suitable side conditions. Here, $L^{**}$ relates to $u'$. The exact
definition of $L^{**}$ is given in Section 51.1. Intuitively, $u' \mapsto L^{**}(x, u, u')$
denotes the convex lower semicontinuous function which best approaches
$u' \mapsto L(x, u, u')$ from below (cf. Fig. 42.5 and Example 51.7).
Figure 42.5
If $u' \mapsto L(x, u, u')$ is not convex with respect to $u'$, then one can instead
consider (65**), where $u' \mapsto L^{**}(x, u, u')$ is convex. Therefore, the existence
propositions of Problem 42.13 can be applied to (65**). In Ekeland and
Temam (1974, M), Chapters IX, X it is shown in a sophisticated way that
under suitable assumptions for the multidimensional problems corresponding
to (65) and (65**), the following holds: $\alpha = \beta$ and the solutions of (65**) are
the limiting values of the minimal sequences for (65). In this sense, the
solutions of (65**) are generalized solutions of (65).
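To make the passage from $L$ to $L^{**}$ concrete, one can compute a discrete biconjugate for the derivative part $(v^2-1)^2$ of the integrand in (63). This is only a sketch under stated assumptions: the grids `V` and `P`, the names `L_star` and `L_bi`, and the use of a double discrete Legendre transform as a stand-in for the definition in Section 51.1 are all choices made here, not taken from the text.

```python
def L(v):
    """Nonconvex model integrand in the derivative variable v = u'."""
    return (v * v - 1.0) ** 2


# Hypothetical grids: v in [-2, 2] with step 0.01, slopes p in [-30, 30] with step 0.1.
V = [i / 100.0 for i in range(-200, 201)]
P = [j / 10.0 for j in range(-300, 301)]

# First discrete Legendre transform: L*(p) = max_v (p v - L(v)).
L_star = [max(p * v - L(v) for v in V) for p in P]


def L_bi(v):
    """Second transform: L**(v) = max_p (p v - L*(p)), the discrete convex envelope."""
    return max(p * v - ls for p, ls in zip(P, L_star))


for v in (0.0, 0.5, 1.0, 1.5):
    print(v, L(v), L_bi(v))
```

On this grid the envelope vanishes for $|v| \le 1$ and agrees with $L$ for $|v| \ge 1$, which is the flattening of the nonconvex hump that Fig. 42.5 depicts.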
References to the Literature
Monotone potential operators: Vainberg (1972, M,H); Ekeland and Temam
(1974, M); Gajewski, Gröger, and Zacharias (1974, M); Langenbach (1976,
M); Kluge (1979, M).
Ritz's method: Michlin (1969, M); Vainberg (1972, M); Ciarlet (1977, M)
(finite elements).
Gradient method: Ljubic (1970, S); Cea (1971, M); Vainberg (1972, M);
Göpfert (1973, M); Berger (1977, M) (cf., also, the references to the
literature on general approximation methods in Section 37.29 and in Chapter
21 as well as in the Appendix to Part II on finite elements).
Convex functions: Rockafellar (1970, M), Roberts and Varberg (1973,
M, H) (standard works).
Convex functionals: Kluge (1979, M).
Existence theory for multidimensional variational problems: Morrey
(1966, M) (standard work); Browder (1970, S), Ekeland and Temam (1974,
M); Berger (1977, M), Ball (1977), (1981) (cf., also, the references to the
literature in Chapters 26 and 27).
Regularity of generalized solutions: Ladyzenskaja and Uralceva (1964,
M,H); Morrey (1966, M); Giaquinta (1981, L); Frehse (1982, S);
Hildebrandt (1983, S); Nečas (1983, L) (cf., also, the references to the
literature in Chapter 21).
Nonconvex variational problems and generalized solutions: McShane
(1978, S,H) (introduction); Young (1969, M); Ekeland and Temam (1974,
M); Gamkrelidze (1978, M).
Hilbert problems and the calculus of variations: Aleksandrov (1971,
M,H,B); Browder (1976, M,H,B).
EXTREMAL PROBLEMS WITH
SMOOTH SIDE CONDITIONS
Everyone knows what a curve is, until he has studied enough mathematics to
become confused by the countless exceptions.
Felix Klein
In the following three chapters we consider problems of the type
    $\min_{u \in M} F(u) = \alpha$,
where the side condition M is given by an equation
    $G(u) = 0$,
i.e., $M = \{u \in D(G) : G(u) = 0\}$. In Chapter 43 we justify the Lagrange
multiplier rule and apply these results to eigenvalue problems. In particular,
we treat:
(α) Existence of an eigenvector (Chapter 43).
(β) Existence of bifurcation points (Chapters 43 and 45).
(γ) Existence of several eigenvectors (Chapter 44).
Chapter 44 is devoted to the Ljusternik-Schnirelman theory. Chapter 45
contains a fundamental bifurcation result for potential operators. The
applications are related to:
(α) Real functions.
(β) Information theory.
(γ) Statistical physics.
(δ) Variational problems with side conditions.
(ε) Quasilinear elliptic differential equations.
(ζ) Hammerstein integral equations.
In Part IV we elucidate the connection with the principle of virtual
displacement in mechanics as well as with thermodynamic equilibrium, and
we treat applications to elasticity theory. Constraining forces in mechanics
and absolute temperature are examples of the physical interpretation of
Lagrange multipliers.
CHAPTER 43
Lagrange Multipliers and Eigenvalue
Problems
By generalizing Euler's method, Lagrange got the idea for his remarkable
formulas, where in a single line there is contained the solution of all problems
of analytic mechanics.
C. G. J. Jacobi
Returning to the concepts of maximum and minimum, it is a nuisance that
there reigns such confusion in the use of these words. One says that an
expression is a maximum or a minimum if one simply wishes to say that its
variation vanishes (critical point), also in the case when neither a maximum
nor a minimum occurs.
C. G. J. Jacobi, 1837
In this chapter we shall show what nondegeneracy condition is necessary to
justify the Lagrange multiplier rule in the narrower sense for smooth side
conditions. Moreover, we will interpret this condition geometrically and
explain the connection with manifolds in B-spaces. In this connection, a
generalization of the implicit function theorem is the focal point (Theorem
43.C). The central concepts are:
(α) Tangent vector, tangent space, and submersion.
(β) Regular point of a set.
(γ) Manifold.
(δ) Tangential mapping.
(ε) Critical point of a functional.
The desired nondegeneracy condition leads to submersions. Furthermore,
we discuss the connection between critical points, Lagrange multipliers, and
eigenvalue problems. Roughly speaking, we shall obtain the following
important result:
(L) If the smooth side condition $G(u) = 0$ describes a manifold, then the
Lagrange multiplier rule can be applied.
43.1. The Abstract Basic Idea of Lagrange Multipliers
The basic idea of the Lagrange multiplier rule for sufficiently smooth side
conditions is based on the following proposition:
Proposition 43.1. Assume that the following two conditions hold:
(i) X and Y are B-spaces over $\mathbb{K}$, where $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$.
(ii) $A: X \to Y$ and $B: X \to \mathbb{K}$ are continuous linear operators and R(A) is
closed.
Then if
    $Bh = 0$ for all $h \in X$ such that $Ah = 0$    (1)
holds, there exists a $\Lambda \in Y^*$ such that
    $\lambda_0 Bk + \Lambda(Ak) = 0$ for all $k \in X$,    (2)
with $\lambda_0 = 1$. For $R(A) = Y$, $\Lambda$ is unique.
Corollary 43.2. If $R(A) \ne Y$, then, by the assumptions (i) and (ii), there
exists a $\Lambda \in Y^*$, $\Lambda \ne 0$, such that (2) holds with $\lambda_0 = 0$.
In every case, $\lambda_0$ and $\Lambda$ in (2) are not simultaneously zero.
Proof. We use the closed range theorem (cf. the Appendix). According to that
theorem, $R(A^*) = N(A)^{\perp}$. By assumption, $B \in N(A)^{\perp}$. Consequently,
$B = -A^*\Lambda$ for some $\Lambda \in Y^*$; therefore,
    $\langle B, k \rangle = -\langle A^*\Lambda, k \rangle = -\langle \Lambda, Ak \rangle$ for all $k \in X$.
This yields (2) with $\lambda_0 = 1$. If $R(A) = Y$, then $N(A^*) = R(A)^{\perp} = \{0\}$, i.e.,
$A^*$ is injective. Consequently, $\Lambda$ is determined uniquely by B.
In the case of Corollary 43.2, there exists a $v \in Y$ with $v \notin R(A)$.
Since R(A) is closed, according to the Hahn-Banach theorem one can construct
a $\Lambda \in Y^*$ such that $\Lambda(v) = 1$ and $\Lambda(w) = 0$ for all $w \in R(A)$. □
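A finite-dimensional instance makes relation (2) tangible. In the sketch below, the spaces $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$, the matrix `A`, the functional `B`, and the multiplier `Lam` are all hypothetical choices made for illustration; here `Lam` is found by hand from $B = -A^*\Lambda$.

```python
# A: R^3 -> R^2 is surjective, with null space spanned by (1, -1, 1).
A = [[1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0]]
# B vanishes on the null space of A: 2 - 5 + 3 = 0, so condition (1) holds.
B = [2.0, 5.0, 3.0]
# Multiplier solving B + Lam*A = 0 componentwise (found by hand).
Lam = [-2.0, -3.0]


def apply_A(k):
    """Return the vector A k."""
    return [sum(a * x for a, x in zip(row, k)) for row in A]


def apply_B(k):
    """Return the number B k."""
    return sum(b * x for b, x in zip(B, k))


def residual(k):
    """Left-hand side of (2) with lambda_0 = 1: B k + Lam(A k)."""
    return apply_B(k) + sum(l * y for l, y in zip(Lam, apply_A(k)))


null_vector = [1.0, -1.0, 1.0]
print(apply_A(null_vector), apply_B(null_vector))   # both vanish
print([residual(e) for e in ([1, 0, 0], [0, 1, 0], [0, 0, 1])])
```

Because `A` is surjective in this example, the multiplier is unique, in line with the last statement of Proposition 43.1.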
As a typical application, we consider the minimum problem
    $F(u) = \min!$,  $G(u) = 0$.
Let $u_0$ be a solution. We restrict ourselves to formal observations which we
shall make precise in Section 43.8. A curve $t \mapsto u(t)$ such that $u(0) = u_0$ is
said to be admissible when $G(u(t)) = 0$ for all t in a neighborhood of zero
and $u'(0)$ exists. Then we call $u'(0)$ a tangent vector. From $G(u(t)) = 0$ it
follows that $G'(u_0)u'(0) = 0$. If we set $\varphi(t) = F(u(t))$, then $\varphi$ has a
minimum at $t = 0$; therefore $\varphi'(0) = 0$, i.e., $F'(u_0)u'(0) = 0$. Thus,
    $F'(u_0)h = 0$,  $G'(u_0)h = 0$,
where $h = u'(0)$. The following two requirements are now crucial for the
application of Proposition 43.1:
(i) Every h such that $G'(u_0)h = 0$, which we designate as a virtual
displacement, is a tangent vector of an admissible curve.
(ii) $R(G'(u_0))$ is closed.
Then, according to Proposition 43.1, the following holds:
    $\lambda_0 F'(u_0)u + \Lambda G'(u_0)u = 0$ for all $u \in X$,    (3)
where $\lambda_0 = 1$. This is the Lagrange multiplier rule in the narrower sense. If
only (ii) holds, whereby, however, $G'(u_0)$ is not surjective, then we obtain
(3) with $\lambda_0 = 0$ and $\Lambda \ne 0$ according to Corollary 43.2. In this degenerate
case we need no admissible curves.
In abstract form, these considerations contain the principle of virtual
displacements of mechanics that we shall delve into in Chapter 58.
For multidimensional variational problems, (ii) can cause difficulties. For
this reason, one does not always succeed in verifying that the Lagrange
multiplier rule is a necessary condition. However, the simple method that we
described in Section 37.4/ allows us to use Lagrange multipliers to obtain
sufficient conditions.
On the basis of a simple example we shall show the meaning of condition
(i).
Counterexample 43.3. We consider
    $F(\xi, \eta) = \min!$,  $G(\xi, \eta) = 0$,
where $F(\xi, \eta) = \exp(\xi + \eta)$ and $G(\xi, \eta) = \xi^2 + \eta^2$. The solution is $u_0 = (0,0)$,
since $u_0$ is the only point where $G(\xi, \eta) = 0$. The formal application of the
Lagrange multiplier rule in the narrower sense yields the existence of a
number $\lambda$ such that
    $F_\xi(0,0) - \lambda G_\xi(0,0) = 0$,  $F_\eta(0,0) - \lambda G_\eta(0,0) = 0$.
However, this is a contradiction because $G_\xi(0,0) = 0$ and $F_\xi(0,0) = 1$. The
reason is the following: $h = (h_1, h_2)$ is a virtual displacement if and only if
$G'(u_0)h = 0$, i.e., $G_\xi(0,0)h_1 + G_\eta(0,0)h_2 = 0$. Hence, every h in $\mathbb{R}^2$ is a
virtual displacement. The only admissible curve $t \mapsto u(t)$ is, however, $u(t) \equiv 0$
with the tangent vector $u'(0) = 0$. For this reason the side condition is
designated as rigid. Thus, not every virtual displacement is the tangent
vector of an admissible curve.
However, if the necessary condition is written in the form
    $\lambda_0 F_\xi(0,0) - \lambda G_\xi(0,0) = 0$,  $\lambda_0 F_\eta(0,0) - \lambda G_\eta(0,0) = 0$,
where $\lambda_0^2 + \lambda^2 \ne 0$, then no contradiction arises (choose $\lambda_0 = 0$, $\lambda = 1$).
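The failure of the narrower rule in Counterexample 43.3 can also be seen numerically. The sketch below approximates the gradients by central differences; the helper names `grad` and `residual` and the step size `h` are assumptions introduced here.

```python
import math


def F(xi, eta):
    return math.exp(xi + eta)


def G(xi, eta):
    return xi ** 2 + eta ** 2


def grad(f, xi, eta, h=1e-6):
    """Central-difference approximation of the gradient of f at (xi, eta)."""
    return ((f(xi + h, eta) - f(xi - h, eta)) / (2 * h),
            (f(xi, eta + h) - f(xi, eta - h)) / (2 * h))


gF = grad(F, 0.0, 0.0)   # approximately (1, 1)
gG = grad(G, 0.0, 0.0)   # exactly (0, 0): the side condition is rigid


def residual(lam0, lam):
    """Components of lam0 * F'(0,0) - lam * G'(0,0)."""
    return tuple(lam0 * a - lam * b for a, b in zip(gF, gG))


print(residual(1.0, 123.0))  # stays near (1, 1) for every lam: no multiplier exists
print(residual(0.0, 1.0))    # vanishes: the degenerate form of the rule holds
```

Since $G'(0,0) = 0$, the value of $\lambda$ has no influence on the residual when $\lambda_0 = 1$, which is exactly the contradiction described in the text.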
43.2. Local Extrema with Side Conditions
Definition 43.4. Let $F: D(F) \subseteq X \to \mathbb{R}$ be a functional on a real locally
convex space X. Let M be a subset of D(F) which we will call a side
condition. Let $u_0 \in D(F)$.
F has a bound local minimum with respect to the side condition M at $u_0$ if
and only if there is a neighborhood $U(u_0)$ of $u_0$ in X such that
    $F(u) \ge F(u_0)$ for all $u \in U(u_0) \cap M$.    (4)
If ">" holds in (4) for $u \ne u_0$, then we say that F has a bound strict local
minimum with respect to M.
The corresponding notions for maxima are explained in an obvious way
by reversing the inequality sign.
Example 43.5. In Fig. 43.1, the function $F: \mathbb{R} \to \mathbb{R}$ has a bound strict local
minimum with respect to the side condition $a \le u \le b$ at the point $u_0 = a$,
i.e., $M = [a, b]$. However, there is no free local minimum present at $u_0 = a$
because $F(u) > F(a)$ does indeed hold in a right-hand neighborhood
of a, but it does not hold in a full neighborhood of a.
Figure 43.1
As a prototype for the application of minima and maxima with side
conditions to eigenvalue problems, we consider
    $F(u) = \min!$,  $G(u) = 0$    (5a)
together with the necessary condition
    $\lambda_0 F'(u_0) - \lambda G'(u_0) = 0$.    (5b)
Proposition 43.6. There exist real numbers $\lambda_0$, $\lambda$ with $\lambda_0^2 + \lambda^2 \ne 0$ such that
(5b) holds when the following two conditions are satisfied:
(i) F has at $u_0$ a bound local minimum or maximum with respect to the side
condition $M = \{u \in D(G) : G(u) = 0\}$.
(ii) $F, G: U(u_0) \subseteq X \to \mathbb{R}$ are functionals on the real B-space X, and $F'(u_0)$
and $G'(u_0)$ exist as F-derivatives.
If the nondegeneracy condition
    $G'(u_0) \ne 0$,  G is continuous on $U(u_0)$    (5c)
holds, then $\lambda_0 = 1$.
(5b) results formally from $L'(u_0) = 0$ where $L = \lambda_0 F - \lambda G$. Therefore, it
is a matter of a Lagrange multiplier rule. Without the side condition, the
necessary condition for a local extremum of F is equal to (5b) with $\lambda_0 = 1$,
$\lambda = 0$. The nondegeneracy condition (5c) is weaker than a corresponding
condition which results from the general Theorem 43.D in Section 43.8
below. For this reason we give an independent proof which is formulated in
such a way that it will later in Chapter 64 be applicable to variational
inequalities as well.
Proof. (I) Degenerate case. If $G'(u_0) = 0$, then (5b) holds for $\lambda_0 = 0$, $\lambda = 1$.
(II) Nondegenerate case. We choose an $h_1$ such that $\langle G'(u_0), h_1 \rangle > 0$. The
functional G is F-differentiable at $u_0$. Thus, from $G(u_0) = 0$ it follows that
    $G(u_0 + k) = \langle G'(u_0), k \rangle + o(\|k\|)$ as $k \to 0$.    (6)
We set $g_\alpha(\delta) = G(u_0 + \alpha(\beta_0 + \delta)h_1 + \alpha h)$ and fix $\beta_0(h)$ by
    $\langle G'(u_0), h_1 \rangle \beta_0 + \langle G'(u_0), h \rangle = 0$.
By (6),
    $g_\alpha(\pm n^{-1}) = \pm \alpha n^{-1} \langle G'(u_0), h_1 \rangle + \alpha\, o(1)$,  $\alpha \to 0$.
For this reason, for each $n \in \mathbb{N}$, there exists an $\alpha_n > 0$ such that $\alpha_n \to 0$ as
$n \to \infty$ and $g_{\alpha_n}(\pm n^{-1}) \gtrless 0$. Since G is continuous on $U(u_0)$, the intermediate
value theorem thus yields a $\delta_n \in [-n^{-1}, n^{-1}]$ such that $g_{\alpha_n}(\delta_n) = 0$; therefore,
    $G(u) = 0$ for $u = u_0 + \alpha_n(\beta_0 + \delta_n)h_1 + \alpha_n h$.
By assumption, F has a local minimum with respect to M at $u_0$, i.e.,
    $F(u_0) - F(u_0 + \alpha_n(\beta_0 + \delta_n)h_1 + \alpha_n h) \le 0$.
Dividing by $\alpha_n$ and letting $n \to \infty$, we obtain $\langle F'(u_0), \beta_0 h_1 + h \rangle \ge 0$
analogous to (6), i.e.,
    $\langle F'(u_0) - \lambda G'(u_0), h \rangle \ge 0$ for all $h \in X$,
where $\lambda = \langle F'(u_0), h_1 \rangle / \langle G'(u_0), h_1 \rangle$. Consequently, (5b) holds by Problem
39.4. □
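The formula $\lambda = \langle F'(u_0), h_1 \rangle / \langle G'(u_0), h_1 \rangle$ from the proof can be tried on a small example. The sketch below uses a hypothetical problem in $\mathbb{R}^2$ (minimize $\xi + \eta$ on the unit circle); the functions, the point `u0`, and the choice $h_1 = G'(u_0)$ are assumptions made here, not data from the text.

```python
import math

# Hypothetical data: F(xi, eta) = xi + eta, G(xi, eta) = xi^2 + eta^2 - 1.
# The constrained minimum of F on the unit circle is attained at u0 below.
u0 = (-1.0 / math.sqrt(2.0), -1.0 / math.sqrt(2.0))


def dF(u):
    """Gradient of F; constant for this linear functional."""
    return (1.0, 1.0)


def dG(u):
    """Gradient of G at u."""
    return (2.0 * u[0], 2.0 * u[1])


def pair(a, b):
    """Euclidean pairing <a, b>."""
    return a[0] * b[0] + a[1] * b[1]


h1 = dG(u0)                                  # any h1 with <G'(u0), h1> > 0 will do
lam = pair(dF(u0), h1) / pair(dG(u0), h1)    # multiplier formula from the proof
residual = tuple(f - lam * g for f, g in zip(dF(u0), dG(u0)))

print(lam)        # close to -1/sqrt(2)
print(residual)   # close to (0, 0), i.e. (5b) with lambda_0 = 1
```

Here $G'(u_0) \ne 0$, so the nondegeneracy condition (5c) holds and the rule applies with $\lambda_0 = 1$.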
The following possibilities are available for producing existence
propositions for eigenvectors:
(α) Minimum problem (cf. Section 43.3).
(β) Maximum problem (cf. Section 43.4).
(γ) sup-min problem (cf. Section 44.5).
However, we shall see that the approach via the maximum problem even
yields a bifurcation point.
43.3. Existence of an Eigenvector Via a Minimum
Problem
We consider the minimum problem with a side condition
    $\min_{u \in N_\alpha} F(u) = F(u_\alpha)$,    (7a)
where $N_\alpha = \{u \in X : G(u) = \alpha\}$, and the eigenvalue problem which
corresponds to it according to the Lagrange multiplier rule:
    $F'(u_\alpha) = \lambda_\alpha G'(u_\alpha)$,  $\lambda_\alpha \ne 0$,  $u_\alpha \ne 0$.    (7b)
Theorem 43.A. Suppose that the following six conditions hold:
(i) X is a real reflexive B-space.
(ii) $F: X \to \mathbb{R}$ is weakly sequentially lower semicontinuous.
(iii) $G: X \to \mathbb{R}$ is weakly sequentially continuous and $G(0) = 0$.
(iv) $F', G': X \to X^*$ exist as F-derivatives.
(v) $G'(u) = 0$, or $F'(u) = 0$, implies $u = 0$.
(vi) $F(u) \to +\infty$ as $\|u\| \to \infty$.
Then there exists a real number $\alpha$, $\alpha \ne 0$, such that $N_\alpha \ne \emptyset$. For each such $\alpha$,
(7a) has a solution $u_\alpha$ and $u_\alpha$ is an eigensolution of (7b).
We discuss applications to quasilinear elliptic differential equations in
Section 44.9.
Proof. By (v), $G \not\equiv 0$. Consequently, there exists a $u \in X$ such that $G(u) \ne 0$.
Let $\alpha = G(u)$. By (iii), the level surface $N_\alpha$ is weak sequentially
closed, for it follows from $G(u_n) = \alpha$, $u_n \rightharpoonup u$ that $G(u) = \alpha$. According to
Proposition 41.2, (7a) possesses a solution $u_\alpha$. Since $G(0) = 0$ and $G(u_\alpha) = \alpha$,
we have $u_\alpha \ne 0$ and thus $G'(u_\alpha) \ne 0$, by (v). Proposition 43.6 yields (7b). If
$\lambda_\alpha$ were equal to zero, then, by (v), $F'(u_\alpha) = 0$, $u_\alpha \ne 0$ would yield a
contradiction. □
43.4. Existence of a Bifurcation Point Via a Maximum
Problem
Parallel to (7), we now study the maximum problem with a side condition
    $\max_{u \in N_\alpha} G(u) = G(u_\alpha)$,    (8)
where $N_\alpha = \{u \in X : F(u) = \alpha\}$, and the corresponding eigenvalue problem
    $G'(u_\alpha) = \lambda_\alpha F'(u_\alpha)$,  $\lambda_\alpha \ne 0$,  $u_\alpha \ne 0$.    (9)
Retaining the assumptions of Section 43.3, observe that we have
interchanged the roles of F and G. In the following, F will be assumed to be only
weak sequentially lower semicontinuous, while we assume G to have the
stronger property of weak sequential continuity. The following condition is
important for the bifurcation proposition:
(H1) $(X, (\cdot|\cdot))$ is a real H-space and $G'$ is compact. $G''(0)$ exists as a
second F-derivative, and there exists a $w \in X$ such that $(G''(0)w|w) > 0$. We
specialize F to $F(u) = 2^{-1}(u|u)$ for all $u \in X$.
If we identify X with $X^*$, then $F'(u) = u$. By virtue of Proposition 7.33
and Problem 4.3, $G''(0): X \to X$ is a compact symmetric operator. In this
special case, (9) reads as follows:
    $G'(u_\alpha) = \lambda_\alpha u_\alpha$,  $\lambda_\alpha \ne 0$,  $u_\alpha \ne 0$    (9a)
with the linearized eigenvalue problem
    $G''(0)v_1 = \lambda_0 v_1$,  $\lambda_0 > 0$,  $v_1 \ne 0$.    (10)
Theorem 43.B (Krasnoselskii (1956)). Suppose that the following seven
conditions hold:
(i) X is a real reflexive B-space.
(ii) $F: X \to \mathbb{R}$ is weak sequentially lower semicontinuous.
(iii) $G: X \to \mathbb{R}$ is weak sequentially continuous.
(iv) $F', G': X \to X^*$ exist as F-derivatives.
(v) $F'(u) = 0$ implies $u = 0$; $G'(u) = 0$ implies $G(u) = 0$; and $G(0) = G'(0)
= F(0) = 0$.
(vi) $F(u) \to +\infty$ as $\|u\| \to \infty$.
(vii) There exists a sequence $(w_n)$ in X such that $G(w_n) > 0$ for all $n \in \mathbb{N}$ and
$w_n \to 0$ as $n \to \infty$.
Then:
(1) Eigensolution. For each $\alpha > 0$, (8) has a solution $u_\alpha$ and $u_\alpha$ satisfies (9).
(2) Bifurcation point. If (H1) holds, then
    $(u_\alpha, \lambda_\alpha) \to (0, \lambda_0)$ as $\alpha \to 0$ with $\lambda_0 > 0$.    (11)
Here, $\lambda_0$ is the largest eigenvalue of $G''(0)$.
(11) shows that $(0, \lambda_0)$ is a bifurcation point of $G'(u) = \lambda u$. Theorem
45.A in Section 45.2 contains an important generalization of assertion (2).
Proof. (1) The crucial trick is to first consider, instead of (8), the variational
problem
    $\max_{u \in M_\alpha} G(u) = G(u_\alpha)$,  $M_\alpha = \{u \in X : F(u) \le \alpha\}$,    (8a)
which is easier to solve. Let $\alpha > 0$. The set $M_\alpha$ is bounded because $F(u) \to
+\infty$ as $\|u\| \to \infty$. Furthermore, $M_\alpha$ is weak sequentially closed, for it
follows from $F(u_n) \le \alpha$, $u_n \rightharpoonup u$ that $F(u) \le \liminf F(u_n) \le \alpha$. According to
Proposition 38.12(d), (8a) has a solution $u_\alpha$.
We show that $u_\alpha$ is also a solution of (8), i.e., $F(u_\alpha) = \alpha$. To begin with, F
is continuous at $u = 0$. Therefore, there exists a $w_{n_0} \in M_\alpha$. From $w_{n_0} \in M_\alpha$,
$G(w_{n_0}) > 0$ it follows that $G(u_\alpha) > 0$. If we had $F(u_\alpha) < \alpha$, then we would
also have $u_\alpha \in \operatorname{int} M_\alpha$ because of the continuity of F; therefore, $G'(u_\alpha) = 0$
according to Theorem 40.B in Section 40.2. By (v), this yields $G(u_\alpha) = 0$, in
contradiction to $G(u_\alpha) > 0$.
We now prove (9). Indeed, $F(u_\alpha) = \alpha$ and $\alpha > 0$ assure that $u_\alpha \ne 0$
and $F'(u_\alpha) \ne 0$. Proposition 43.6 yields (9). Here, $\lambda_\alpha \ne 0$, for $\lambda_\alpha = 0$ would
yield $G'(u_\alpha) = 0$ and thus $G(u_\alpha) = 0$. This is impossible.
(2) We compare (8a) with the quadratic variational problem
    $\max_{v \in M_\alpha} H(v) = \beta$,    (11a)
where $H(v) = 2^{-1}(G''(0)v|v)$. If $v_1$ is a solution of (11a) for $\alpha = 1$, then $\sqrt{\alpha}\, v_1$
is a solution for $\alpha > 0$.
(I) Solution of (11a) for $\alpha = 1$. By Corollary 21.23, it follows from the
compactness of $G''(0)$ that H is weak sequentially continuous. An argument
analogous to that for (8a) yields the existence of a solution $v_1$ of (11a) and
    $H'(v_1) = \lambda_0 v_1$,  $2^{-1}(v_1|v_1) = 1$.
Since $H'(v_1) = G''(0)v_1$, $\lambda_0 = H(v_1)$. By (11a) with $\alpha = 1$, $\lambda_0$ is thus the
largest eigenvalue of $G''(0)$.
(II) We show:
    $G(u) = H(u) + o(\|u\|^2)$ as $u \to 0$,    (12)
    $G(u) = 2^{-1}(G'(u)|u) + o(\|u\|^2)$ as $u \to 0$.    (13)
To this end, we set $g(t) = G(tu) - 2^{-1}t^2(G''(0)u|u)$. Since $G'(0) = 0$, the
Taylor theorem immediately yields
    $G'(u) = G''(0)u + o(\|u\|)$ as $u \to 0$.    (14)
The mean value theorem implies $g(1) = g'(\vartheta)$, $0 < \vartheta < 1$. This, together with
(14), yields (12) and (13).
(III) Since $u_\alpha \in N_\alpha$, and $2^{-1}(u_\alpha|u_\alpha) = \alpha$, we have $u_\alpha \to 0$ as $\alpha \to 0$. Now,
by (12) and (13), $\lambda_\alpha \to \lambda_0$ as $\alpha \to 0$ is obtained from
    $\lambda_\alpha \alpha = 2^{-1}(G'(u_\alpha)|u_\alpha) = G(u_\alpha) + o(\alpha)$,
    $\lambda_0 \alpha = H(\sqrt{\alpha}\, v_1) = G(\sqrt{\alpha}\, v_1) + o(\alpha) \le G(u_\alpha) + o(\alpha)$
        $= H(u_\alpha) + o(\alpha) \le H(\sqrt{\alpha}\, v_1) + o(\alpha)$
        $= \lambda_0 \alpha + o(\alpha)$.
Now take into consideration $u_\alpha, \sqrt{\alpha}\, v_1 \in M_\alpha$ and (8a), (11a). □
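The convergence $\lambda_\alpha \to \lambda_0$ can be observed in a two-dimensional toy model. In the sketch below, $X = \mathbb{R}^2$, $F(u) = 2^{-1}(u|u)$, and the functional $G(\xi, \eta) = 2^{-1}(2\xi^2 + \eta^2) + 4^{-1}\xi^4$ is a hypothetical choice with $G''(0) = \operatorname{diag}(2, 1)$, so that $\lambda_0 = 2$. The maximizer on the sphere is located by a plain grid search over the angle, which is an illustration device and not the method of the proof.

```python
import math


def G(xi, eta):
    """Hypothetical functional with G(0) = 0, G'(0) = 0, G''(0) = diag(2, 1)."""
    return 0.5 * (2.0 * xi * xi + eta * eta) + 0.25 * xi ** 4


def grad_G(xi, eta):
    return (2.0 * xi + xi ** 3, eta)


def eigenpair(alpha, m=3600):
    """Maximize G on the sphere (u|u)/2 = alpha by scanning m angles; return (u, lambda)."""
    r = math.sqrt(2.0 * alpha)
    points = [(r * math.cos(2.0 * math.pi * k / m), r * math.sin(2.0 * math.pi * k / m))
              for k in range(m)]
    u = max(points, key=lambda p: G(*p))
    g = grad_G(*u)
    # G'(u) = lambda * u at the maximizer, so lambda = (G'(u)|u) / (u|u).
    lam = (g[0] * u[0] + g[1] * u[1]) / (u[0] ** 2 + u[1] ** 2)
    return u, lam


for alpha in (1e-1, 1e-2, 1e-3, 1e-4):
    print(alpha, eigenpair(alpha)[1])   # tends to lambda_0 = 2 as alpha -> 0
```

For this particular $G$ the maximizer lies on the $\xi$-axis and $\lambda_\alpha = 2 + 2\alpha$, so the branch $(u_\alpha, \lambda_\alpha)$ emanates from $(0, 2)$ as in (11).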
43.5. The Galerkin Method for Eigenvalue Problems
For an approximate solution of the eigenvalue problem
    $\mu Au = Bu$,  $u \in X$,    (15)
for $k = 1, 2, \ldots$, we consider the Galerkin equations
    $\mu_k \langle Au_k, w_i \rangle = \langle Bu_k, w_i \rangle$,  $i = 1, \ldots, k$,    (16)
where $u_k = \sum_{i=1}^{k} c_{ki} w_i$. We seek $\mu_k \in \mathbb{R}$ and $c_{ki} \in \mathbb{R}$. Here, (16) represents a
nonlinear eigenvalue problem in $\mathbb{R}^k$.
Proposition 43.7. Suppose that the following three conditions hold:
(i) X is a real separable reflexive B-space with $\dim X = \infty$, and $\{w_1, w_2, \ldots\}$
is a basis in X.
(ii) $A: X \to X^*$ is compact.
(iii) $B: X \to X^*$ is continuous, bounded, and satisfies $(S)_0$.
Then, if for each $k \in \mathbb{N}$, (16) has a solution $(u_k, \mu_k)$ such that $\sup_k(\|u_k\| +
|\mu_k|) < \infty$, (15) has a solution.
The convergence of the subsequence $u_{k'} \rightharpoonup u$, $\mu_{k'} \to \mu$ as $k' \to \infty$ implies
strong convergence $u_{k'} \to u$, and $(u, \mu)$ is a solution of (15).
$(S)_0$ means that for $n \to \infty$:
    $v_n \rightharpoonup v$,  $Bv_n \rightharpoonup w$,  $\langle Bv_n, v_n \rangle \to \langle w, v \rangle$  implies  $v_n \to v$.
According to Fig. 27.1, B satisfies, for instance, the condition $(S)_0$ when
$B = C + D$, where $C: X \to X^*$ is uniformly monotone and $D: X \to X^*$ is
compact.
Proof. Since $(u_k)$ and $(\mu_k)$ are bounded and A is compact, there exist
convergent subsequences which we denote in the same way such that $u_k \rightharpoonup u$,
$\mu_k \to \mu$, $Au_k \to z$. Let $X_m = \operatorname{span}\{w_1, \ldots, w_m\}$. From (16) it follows that for
all $v \in X_m$, $m \le k$:
    $\mu_k \langle Au_k, v \rangle = \langle Bu_k, v \rangle$.    (17)
Thus, $\langle Bu_k, v \rangle \to \langle \mu z, v \rangle$ for all $v \in \bigcup_m X_m$.
Since $X = \overline{\bigcup_m X_m}$ and $(Bu_k)$ is bounded, we even have $Bu_k \rightharpoonup \mu z$ (cf.
$A_1$(31d)). Furthermore, from (17) it follows that
    $\langle Bu_k, u_k \rangle = \mu_k \langle Au_k, u_k \rangle \to \langle \mu z, u \rangle$.
Moreover, $Bu_k \rightharpoonup \mu z$, $u_k \rightharpoonup u$. The condition $(S)_0$ yields $u_k \to u$, i.e., $Au_k \to Au$.
Therefore, $Au = z$. Hence, $Bu = \mu Au$. □
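The structure of the Galerkin equations (16) can be illustrated in a linear stand-in, where the nonlinear problem in $\mathbb{R}^k$ reduces to a matrix eigenvalue problem. Everything concrete below is an assumption made for this sketch: the orthonormal basis $w_i = e_i$, the choice $B = I$, the symmetric entries `a(i, j)` standing for $\langle Aw_j, w_i \rangle$, and the power iteration used to solve each truncated problem.

```python
import math


def a(i, j):
    """Hypothetical entry <A w_j, w_i> (indices from 1): a decaying diagonal plus a rank-one part."""
    return (1.0 / i ** 2 if i == j else 0.0) + 0.5 / (i * j)


def galerkin(k, iters=300):
    """Solve mu_k * A_k c = c on span{w_1, ..., w_k} by power iteration on A_k."""
    A = [[a(i, j) for j in range(1, k + 1)] for i in range(1, k + 1)]
    c = [1.0] * k
    for _ in range(iters):
        y = [sum(A[i][j] * c[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(v * v for v in y))
        c = [v / norm for v in y]
    y = [sum(A[i][j] * c[j] for j in range(k)) for i in range(k)]
    nu = sum(yi * ci for yi, ci in zip(y, c))   # dominant eigenvalue of A_k (Rayleigh quotient)
    mu = 1.0 / nu                               # then mu * A_k c = c, which is (16) with B = I
    residual = max(abs(mu * yi - ci) for yi, ci in zip(y, c))
    return mu, c, residual


for k in (1, 2, 4, 8, 16):
    mu, c, res = galerkin(k)
    print(k, mu, res)
```

In this toy setting the coefficient vectors are normalized and the values $\mu_k$ stay bounded, which mirrors the boundedness hypothesis $\sup_k(\|u_k\| + |\mu_k|) < \infty$ of Proposition 43.7.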
43.6. The Generalized Implicit Function Theorem
and Manifolds in B-Spaces
We consider the equation
    $G(u) = 0$    (18)
with the known solution $u_0$. Let $G: U(u_0) \subseteq X \to Y$ be a mapping, where X
and Y are B-spaces over $\mathbb{K}$, where $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$. We seek a parametric
representation of the form
    $u = \varphi(h)$,  $h \in V_\delta$    (18a)
for all solutions of (18) in a neighborhood of $u_0$. Here,
    $V_\delta = \{h \in X : G'(u_0)h = 0,\ \|h\| < \delta\}$.
Our goal is the following proposition.
    If G is a submersion at $u_0$, then there exist numbers $\delta$,
    $\varepsilon > 0$ and a homeomorphism $\varphi$ from $V_\delta$ into X such that all
    u from (18a) are solutions of (18). Conversely, every
    solution of (18) with $\|u - u_0\| < \varepsilon$ can be represented in the
    form (18a). Moreover, $\varphi$ is continuously F-differentiable
    on $V_\delta$ with $\varphi(h) = u_0 + h + o(\|h\|)$ as $h \to 0$.    (18b)
Submersions will be explained below in Definition 43.15.
(18b) is a variant of the implicit function theorem (Theorem 4.B in
Section 4.7), which is obtained as a special case of Theorem 43.C in Section
43.6. In order to interpret (18b) geometrically, we first introduce several
concepts from differential geometry in B-spaces, which are of fundamental
interest. These include tangent vectors, tangent spaces, and manifolds. In
this connection, one generalizes well-known concepts of differential
geometry in $\mathbb{R}^3$. We explain the geometrical core in Example 43.12 and Fig. 43.3
below. In Chapter 73 we shall study Banach manifolds in greater detail.
43.6a. Tangent Vectors and Tangent Spaces
In the following one can always think of M in connection with (18) as the
set
    $M = \{u \in D(G) : G(u) = 0\}$.
Definition 43.8. Let X be a locally convex space over $\mathbb{K}$ ($\mathbb{K}$ is $\mathbb{R}$ or $\mathbb{C}$) and
let M be a subset of X. Let $u_0$ be a fixed point in M.
(1) An admissible curve in M through $u_0$ is understood to be a mapping
$t \mapsto u(t)$ with $u(0) = u_0$ and $u(t) \in M$ for all t in a neighborhood of zero in
$\mathbb{R}^1$. Moreover, the derivative $u'(0)$ is assumed to exist.
(2) h is called a tangent vector to M at $u_0$ if and only if there exists an
admissible curve as in (1) such that $u'(0) = h$.
(3) If the set of all tangent vectors to M at $u_0$ forms a linear space over $\mathbb{K}$,
then we denote it by $TM_{u_0}$ and call it the tangent space to M at $u_0$.
Furthermore, $u_0 + TM_{u_0}$ is called the tangent plane to M at $u_0$.
43.6b. Manifolds
We shall now make use of the tangent space to introduce local coordinates
on M in a neighborhood of $u_0$, with the aid of a mapping $\varphi$. All topological
concepts for M are relative to the induced topology on M (cf. $A_1$(9)). In
particular, W is an open neighborhood of $u_0$ in M if and only if $W = M \cap
U(u_0)$, where $U(u_0)$ is an open set in X containing $u_0$.
Definition 43.9. A point $u_0$ in M is said to be regular if and only if the
following two conditions hold:
(i) The tangent space $TM_{u_0}$ exists and is closed.
(ii) There exists an open neighborhood of zero V in $TM_{u_0}$ and a mapping $\varphi$:
$V \subseteq TM_{u_0} \to M$ which maps V homeomorphically onto an open
neighborhood of $u_0$ in M.
Each u in $\varphi(V)$ can be represented as $u = \varphi(h)$, where $h \in V$. Here, h is
called a local coordinate of u, and $(V, \varphi)$ is designated as a local
parametrization of M at $u_0$. We now deal with change of local coordinates. If $u_1$ and
$u_2$ are two regular points in M with the corresponding local
parametrizations $(V_1, \varphi_1)$ and $(V_2, \varphi_2)$, respectively, and we set $W_i = \varphi_i(V_i)$, then each
point u in $W_1 \cap W_2$ has with respect to $V_1$ and $V_2$ the local coordinates
$h_1 = \varphi_1^{-1}(u)$ and $h_2 = \varphi_2^{-1}(u)$, respectively. The change of coordinates is
thus described by means of $h_1 = \varphi_1^{-1}(\varphi_2(h_2))$ (see Fig. 43.2).
Figure 43.2
Definition 43.10. Let M be a subset of the B-space X over $\mathbb{K}$. Then M is
called a manifold if and only if every point of M is regular. M is called a
$C^r$-manifold if and only if M is a manifold and all the mappings $\varphi_1^{-1} \circ \varphi_2$
which describe the change of local coordinates are $C^r$-mappings.
If M is a manifold, then it immediately follows that all the mappings
$\varphi_1^{-1} \circ \varphi_2$ are homeomorphisms. If M is a given $C^r$-manifold, then all the
$\varphi_1^{-1} \circ \varphi_2$ are $C^r$-diffeomorphisms by the inverse mapping theorem in Section
4.13.
In Problem 43.4 we study the more general concept of manifolds that are
modelled locally on B-spaces but need not necessarily lie in a fixed B-space
(Banach manifolds). Since from experience we know that the abstract
concept of a tangent space for general Banach manifolds causes the reader
difficulties (cf. Problem 43.4c), we have preferred to first consider only the
manifolds introduced above, which, moreover, are very well suited to our
applications to Lagrange multipliers and, as the following examples show,
are helpful in direct geometric interpretation.
43.6c. Examples
In the following examples we shall show how surfaces, curves, and points in
$\mathbb{R}^3$ submit to the concept of a manifold.
Example 43.11. Let $G: \mathbb{R}^3 \to \mathbb{R}^n$ be a mapping with $G(u_0) = 0$. Let $G =
(G_1, \ldots, G_n)$, $u = (\xi_1, \xi_2, \xi_3)$, and $D_j = \partial/\partial\xi_j$. All $G_i$ are assumed to possess
continuous first partial derivatives. Then the F-derivative $G'(u)$ exists and
we have
    $G'(u)h = (G_1'(u)h, \ldots, G_n'(u)h)$,
where
    $G_i'(u)h = \sum_{j=1}^{3} D_j G_i(u) h_j$.
We now consider the equation $G(u) = 0$, i.e.,
    $G_1(u) = 0, \ldots, G_n(u) = 0$.    (20)
We denote the set of all points u that satisfy (20) by M. Then $u_0$ lies on M.
The crucial requirement reads as follows:
    $R(G'(u_0)) = \mathbb{R}^n$,    (20a)
i.e., the linearization of G at $u_0$ is surjective. Then G is a submersion at $u_0$
(cf. Definition 43.15 below). (20a) is equivalent to
    $\operatorname{rank}(D_j G_i(u_0)) = n$.    (20b)
Figure 43.3
Example 43.12 (Surface). Let $n = 1$. Then M is a surface in $\mathbb{R}^3$. The
nondegeneracy condition (20a), (20b) now reads explicitly as follows:
    $\sum_{j=1}^{3} [D_j G_1(u_0)]^2 \ne 0$.    (20c)
The following propositions are intuitively manifest (see Fig. 43.3). They
result rigorously from Theorem 43.C below.
(i) Admissible curves through $u_0$ are curves on the surface M that pass
through $u_0$. The normal vector $(D_1 G_1(u_0), D_2 G_1(u_0), D_3 G_1(u_0))$ exists at $u_0$,
and according to (20c) it is different from zero. The tangent vectors are
equal to the intuitive tangent vectors to the curve on the surface.
(ii) The tangent space $TM_{u_0}$ consists precisely of all $h \in \mathbb{R}^3$ such that
$G'(u_0)h = 0$, i.e., h is perpendicular to the normal vector. The tangent plane
$u_0 + TM_{u_0}$ coincides with the intuitive tangent plane at $u_0$ in Fig. 43.3.
$TM_{u_0}$ arises from this plane by translation to the zero point. $TM_{u_0}$ is
homeomorphic to $\mathbb{R}^2$.
(iii) $u_0$ is a regular point, i.e., a neighborhood of $u_0$ in the tangent plane is
homeomorphic to a neighborhood of $u_0$ on the surface by virtue of the
mapping $u_0 + h \mapsto \varphi(h)$. Moreover, (18b) holds.
(iv) M is a $C^1$-manifold when (20c) holds for all $u_0 \in \mathbb{R}^3$ such that
$G(u_0) = 0$. If, in addition, all $G_i$ possess continuous partial derivatives up to
and including the rth order, then M is a $C^r$-manifold.
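The assertions (i) to (iii) can be checked on the unit sphere. In the following sketch the surface $G_1(u) = \xi_1^2 + \xi_2^2 + \xi_3^2 - 1$, the base point $u_0 = (0, 0, 1)$, and the explicit parametrization `phi` are hypothetical choices made for illustration; for a general surface no such closed form is available.

```python
import math

u0 = (0.0, 0.0, 1.0)                     # north pole of the unit sphere


def G(u):
    return u[0] ** 2 + u[1] ** 2 + u[2] ** 2 - 1.0


def grad_G(u):
    """Normal vector (D_1 G_1, D_2 G_1, D_3 G_1) at u."""
    return (2.0 * u[0], 2.0 * u[1], 2.0 * u[2])


def phi(h1, h2):
    """Local parametrization over tangent vectors h = (h1, h2, 0), i.e. G'(u0) h = 0."""
    return (h1, h2, math.sqrt(1.0 - h1 * h1 - h2 * h2))


def relative_remainder(h1, h2):
    """Return |phi(h) - u0 - h| / |h|, which should tend to zero as h -> 0."""
    u = phi(h1, h2)
    d = (u[0] - u0[0] - h1, u[1] - u0[1] - h2, u[2] - u0[2])
    return math.sqrt(sum(x * x for x in d)) / math.hypot(h1, h2)


print(sum(g * g for g in grad_G(u0)))    # condition (20c): nonzero
print(G(phi(0.3, -0.2)))                 # phi(h) lies on M
for s in (1e-1, 1e-2, 1e-3):
    print(s, relative_remainder(s, 0.0)) # decays roughly like s/2
```

The decay of the relative remainder is the numerical counterpart of $\varphi(h) = u_0 + h + o(\|h\|)$ in (18b).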
Example 43.13 (Curve, Point). For $n = 2$ or $n = 3$, M is a curve through $u_0$
or $u_0$ is an isolated point, respectively. As an exercise, the reader may
explicitly formulate the assertions of Theorem 43.C below and, parallel to
Example 43.12, interpret them intuitively. We only mention that if the
nondegeneracy condition (20a), (20b) holds for all $u_0 \in \mathbb{R}^3$, where $G(u_0) = 0$,
then M is a $C^1$-manifold.
Example 43.14. Let U be an open set in the B-space X, and let $u_0 \in U$. All
$t \mapsto u(t)$ with $u(t) = u_0 + th$ are admissible curves. For this reason, each h in
X is a tangent vector. Consequently, $TU_{u_0} = X$. One can choose $\varphi(h) = u_0
+ h$ as a local coordinate mapping. Thus, U is a $C^\infty$-manifold.
43.6d. Submersions
This concept is basic to the implicit construction of manifolds in Theorem
43.C below.
Definition 43.15. If X, Y are B-spaces over $\mathbb{K}$, then a mapping $G: D(G) \subseteq X
\to Y$ is called a submersion at $u_0$ (respectively, an immersion at $u_0$) if and
only if the following hold:
(i) G is continuously F-differentiable in a neighborhood of $u_0$.
(ii) $G'(u_0): X \to Y$ is surjective, i.e., $R(G'(u_0)) = Y$ (respectively, $G'(u_0)$ is
injective).
(iii) The null space $N(G'(u_0))$ splits X, i.e., there exists a continuous
projection operator P of X on $N(G'(u_0))$; therefore,
    $X = N(G'(u_0)) \oplus (I - P)(X)$
(respectively, $R(G'(u_0))$ splits Y).
Example 43.16. $N(G'(u_0))$ splits X when $\dim N(G'(u_0)) < \infty$ holds or
$\operatorname{codim} N(G'(u_0)) < \infty$ or X is an H-space. Note that $N(G'(u_0))$ is always a
closed subspace. Then in an H-space X there even exists an orthogonal
projection operator of X on $N(G'(u_0))$.
43.6e. Construction of Manifolds
We now formulate the main result of this section.
Theorem 43.C (Ljusternik (1934)). Suppose that the following two conditions
hold:
(i) G: D(G)c. X -» Y is a submersion at u0 with G(u0) — Q,and Xand Yare
B-spaces over K, where U =U or C.
(ii) We set M= {« e Z>(G): G(«) = 0}.
43.6. The Generalized Implicit Function Theorem and Manifolds in B-Spaces 287
Then:
(1) Tangent space, h is a tangent vector of M at u0 if and only if
G'(u0)h = 0, i.e., TMUo= N(G'(u0)).
(2) Local structure. There exists a homeomorphism ip: Fc TMUi -+ M of an
open neighborhood of zero, V, in TMUa onto an open neighborhood of u0 in M.
Furthermore, <p is continuously F-differentiable and(p(h) — u0 + h + o{\\h\\) as
h -* 0 on V.
(3) Manifold. If D(G) is open and G is a submersion for all u0 from D(G)
such that G(u0) = Q, then M is a ^-manifold. If, in addition, G is a
C-mapping on D(G), then M forms a C-manifold.
(4) Isolated solution. For N(G'(u0)) = {0}, u0 is an isolated solution of the
equation G(u) = 0.
Note that assertion (2) contains assertion (18b) above of the generalized
implicit function theorem and the fact that u0 is a regular point of M as a
particular case. The proof of Theorem 43.C will be given in Section 43.7.
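To make assertions (1) and (2) concrete, the following minimal sketch treats an example that is not in the text: the unit sphere in $\mathbb{R}^3$, written as the zero set of $G(u) = \|u\|^2 - 1$, with base point $u_0 = (0,0,1)$. Here $N(G'(u_0))$ is the plane $h_3 = 0$, and the parametrization `phi` (a hypothetical name) is obtained by solving $G = 0$ for the third coordinate. The script checks numerically that $G(\varphi(h)) = 0$ and that the remainder $\varphi(h) - u_0 - h$ is $o(\|h\|)$.

```python
import math

U0 = (0.0, 0.0, 1.0)  # base point on the sphere, G(U0) = 0


def G(u):
    """Constraint map G(u) = |u|^2 - 1; M = {G = 0} is the unit sphere."""
    return sum(x * x for x in u) - 1.0


def phi(h):
    """Local parametrization of M near U0 over the tangent plane {h3 = 0}.

    h = (h1, h2) stands for the tangent vector (h1, h2, 0). The correction
    psi(h) acts in the complementary direction e3 and satisfies psi(0) = 0.
    """
    r2 = h[0] ** 2 + h[1] ** 2
    psi = math.sqrt(1.0 - r2) - 1.0
    return (U0[0] + h[0], U0[1] + h[1], U0[2] + psi)


def remainder_ratio(h):
    """Return |phi(h) - U0 - h| / |h|, expected to tend to 0 as h -> 0."""
    p = phi(h)
    rem = math.sqrt(
        (p[0] - U0[0] - h[0]) ** 2
        + (p[1] - U0[1] - h[1]) ** 2
        + (p[2] - U0[2]) ** 2
    )
    return rem / math.hypot(h[0], h[1])


for scale in (1e-1, 1e-2, 1e-3):
    h = (0.6 * scale, 0.8 * scale)  # |h| = scale
    print(scale, abs(G(phi(h))), remainder_ratio(h))
```

In this example the ratio behaves like $\|h\|/2$, which is consistent with $\varphi(h) = u_0 + h + o(\|h\|)$.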
43.6f. Tangential Mapping
We will now use $\varphi$ to describe the local behavior of mappings defined on a
neighborhood of $u_0$ in $M$.
Corollary 43.17. Let $F: U(u_0) \subseteq X \to Z$ be F-differentiable at $u_0$. Let
$X$, $Y$, $Z$ be B-spaces over $\mathbb{K}$ and suppose that $G$ satisfies the
assumptions of Theorem 43.C. Then
$F(\varphi(h)) = F(u_0) + F'(u_0)h + o(\|h\|)$ as $h \to 0$
holds for all $h$ in a neighborhood of zero of $TM_{u_0}$.
This decomposition gives occasion for the following definition.
Definition 43.18. Let the mapping $F: U(u_0) \subseteq X \to Z$ be F-differentiable
at $u_0$, where $X$ and $Z$ are B-spaces over $\mathbb{K}$. Let $M$ be a set in $X$
which has a tangent space $TM_{u_0}$ at $u_0$. Then we set
$TF(u_0)h = F'(u_0)h$ for all $h \in TM_{u_0}$
and we call the continuous linear operator $TF(u_0): TM_{u_0} \to Z$ the tangent
mapping or the differential of $F$ at $u_0$ with respect to $M$.
In Section 44.4, the tangent mapping will play a focal role in the
formulation of the Palais-Smale condition. $TF(u_0)$ is nothing other than
the restriction of $F'(u_0)$ to the tangent space $TM_{u_0}$. Instead of
$TF(u_0)$, one also uses $TF_{u_0}$ or $T_{u_0}F$.
43.7. Proof of Theorem 43.C
The intuitive content of the proof is contained in Fig. 43.3.
Let $X_1 = N(G'(u_0))$; therefore, $X_1 = PX$, and let $X_2 = (I - P)X$. We
decompose $X$ by $X = X_1 \oplus X_2$. Then each $u$ in $X$ can be uniquely
represented as
$u = u_1 + u_2$, $u_1 \in X_1$, $u_2 \in X_2$.
In this connection, $u_1 = Pu$, $u_2 = (I - P)u$. To begin with, we assume
$X_1 \ne \{0\}$.
(Ad 2) The simple idea of the proof is to solve the equation
$F(u_1, u_2) := G(u_0 + u_1 + u_2) = 0$
with the aid of the implicit function theorem in the form $u_2 = \psi(u_1)$ in a
neighborhood of zero in $X_1$. To this end, we verify the assumptions of the
implicit function theorem (Theorem 4.B in Section 4.7).
(I) We have $F: U(0,0) \subseteq X_1 \times X_2 \to Y$ and $F(0,0) = G(u_0) = 0$.
Let $U(0,0)$ be an appropriate open neighborhood of $(0,0)$.
(II) According to the chain rule (Proposition 4.10),
$F_{u_i}(u_1, u_2)h = G'(u_0 + u_1 + u_2)h$ for all $h \in X_i$, $i = 1, 2$,   (21)
holds on $U(0,0)$. Here, $F_{u_1}$, $F_{u_2}$ are continuous on $U(0,0)$;
consequently, $F$ is continuously F-differentiable on $U(0,0)$ (Proposition 4.14).
(III) We set $A := F_{u_2}(0,0)$ and show that $A: X_2 \to Y$ is linear,
continuous, and bijective, for, according to (21), $A$ equals the restriction of
$G'(u_0)$ to $X_2$ and
$G'(u_0)h = 0$, $h \in X_2 \Rightarrow h \in X_1 \Rightarrow h = 0$
as well as $R(A) = R(G'(u_0)) = Y$.
According to Theorem 4.B in Section 4.7, there exist a number $\delta > 0$ and
a neighborhood of zero in $X_1 \times X_2$, $W(0,0)$, such that for each
$u_1 \in X_1$ with $\|u_1\| < \delta$ there is exactly one $\psi(u_1) \in X_2$ with
$(u_1, \psi(u_1)) \in W(0,0)$ and
$F(u_1, \psi(u_1)) = 0$.   (22)
Moreover, $\psi$ is continuously F-differentiable. From (22), by partial
differentiation with respect to $u_1$, it follows that
$F_{u_1}(0,0) + F_{u_2}(0,0)\psi'(0) = 0$.
(21) shows that $F_{u_1}(0,0) = 0$; therefore $A\psi'(0) = 0$, i.e., $\psi'(0) = 0$.
Since $\psi(0) = \psi'(0) = 0$, we have $\psi(u_1) = o(\|u_1\|)$ as $u_1 \to 0$.
Now we set $\varphi(u_1) := u_0 + u_1 + \psi(u_1)$. Then $\varphi$ is a
homeomorphism of a sufficiently small neighborhood of zero in $X_1$ onto a
neighborhood of $u_0$ in
$M$. The inverse mapping of $\varphi$ is realized by the projection $P$. We
observe that for sufficiently small $\varepsilon > 0$,
$\|u - u_0\| < \varepsilon$, $G(u) = 0 \Rightarrow u = u_0 + u_1 + u_2$, $u_1 = P(u - u_0)$
$\Rightarrow \|u_1\| < \delta \Rightarrow u_2 = \psi(u_1) \Rightarrow u = \varphi(u_1)$.
If $G$ is a $C^k$-mapping in an open neighborhood of $u_0$, then $F$ is a
$C^k$-mapping in a neighborhood of zero, and thus so are $\psi$, $\varphi$,
according to Theorem 4.B in Section 4.7.
(Ad 1) If $h$ is a tangent vector and $u(\cdot)$ is the corresponding admissible
curve, then from $G(u(t)) = 0$ and $u(0) = u_0$, $u'(0) = h$ by differentiation
with respect to $t$ at $t = 0$, it follows immediately that $G'(u_0)h = 0$.
Conversely, if $G'(u_0)h = 0$ holds, then $u(t) = \varphi(th)$ is an admissible
curve according to assertion (2).
(Ad 3) Let $\bar{u}_0$ together with $u_0$ be another point for which
$G(\bar{u}_0) = 0$ and let the corresponding mapping be $\bar{\varphi}$. If $u$ in
$M$ possesses the two representations
$u = \varphi(h)$, $h \in TM_{u_0}$, and $u = \bar{\varphi}(\bar{h})$, $\bar{h} \in TM_{\bar{u}_0}$,
then by the construction of $\varphi$, the relation
$h = P(u - u_0) = P(\bar{\varphi}(\bar{h}) - u_0)$
follows. Since $P \circ \bar{\varphi}$ is continuously F-differentiable, $M$ is a
$C^1$-manifold. If $G$ is a $C^k$-mapping, then so are $\bar{\varphi}$ and
$P \circ \bar{\varphi}$, i.e., $M$ is a $C^k$-manifold.
We now assume that $X_1 = N(G'(u_0)) = \{0\}$. According to the inverse
function theorem in Section 4.13, $G$ is a local $C^1$-diffeomorphism.
Consequently, $u_0$ is an isolated solution of $G(u) = 0$ in $X$. Thus, assertion
(4) holds. The only admissible curve in $M$ through $u_0$ is $u(t) = u_0$;
therefore, $TM_{u_0} = \{0\} = N(G'(u_0))$. This is (1). Assertion (2) becomes
trivial with $\varphi(h) = u_0$ for all $h \in TM_{u_0}$, i.e., $h = 0$. (3) is
proved as above. Thus Theorem 43.C in Section 43.6 is proved.
Corollary 43.17 follows immediately from
$F(u_0 + k) = F(u_0) + F'(u_0)k + o(\|k\|)$ as $k \to 0$ on $X$
and
$u_0 + k = \varphi(h) = u_0 + h + o(\|h\|)$ as $h \to 0$.
43.8. Lagrange Multipliers
We consider the minimum problem
$F(u) = \min!$   (23a)
with the side condition
$G(u) = 0$   (23b)
and $G: D(G) \subseteq X \to Y$. For fixed $u_0$, let $G(u_0) = 0$. We again set
$M = \{u \in D(G): G(u) = 0\}$. Condition (24) below is crucial.
Theorem 43.D (Ljusternik (1934)). Suppose that the following two conditions
are satisfied:
(i) $F: U(u_0) \subseteq X \to \mathbb{R}$ is F-differentiable at $u_0$, and $X$ and
$Y$ are real B-spaces.
(ii) $G$ is a submersion at $u_0$.
Then:
(1) Necessary condition. If $F$ has a bound local minimum at $u_0$ with respect
to $M$, then there exists a $\Lambda \in Y^*$ such that
$F'(u_0)k - \Lambda(G'(u_0)k) = 0$ for all $k \in X$.   (24)
(2) Sufficient condition. $F$ has a bound strict local minimum at $u_0$ with
respect to $M$ when the following two conditions are fulfilled:
(a) $F$ and $G$ are $n$-times continuously F-differentiable in an open
neighborhood of $u_0$, where $n$ is an even integer, $n \ge 2$.
(b) There exist a number $c > 0$ and a functional $\Lambda \in Y^*$ such that
$F^{(r)}(u_0)k^r - \Lambda(G^{(r)}(u_0)k^r) = 0$, $r = 1, \dots, n-1$,
$F^{(n)}(u_0)h^n - \Lambda(G^{(n)}(u_0)h^n) \ge c\|h\|^n$
for all $k \in X$ and all $h \in X$ such that $G'(u_0)h = 0$.
$\Lambda$ is called a Lagrange multiplier. Analogous assertions hold for a
maximum. Then, one has only to replace "$\ge c\|h\|^n$" by "$\le -c\|h\|^n$" in (b).
Proof. (1) According to Theorem 43.C in Section 43.6, $TM_{u_0} = N(G'(u_0))$.
For this reason, for each $h$ in $N(G'(u_0))$ there exists an admissible curve
$t \mapsto u(t)$ in $M$ passing through $u_0$; therefore, $G(u(t)) = 0$,
$u(0) = u_0$, $u'(0) = h$. Differentiation yields $G'(u_0)h = 0$. Let
$f(t) := F(u(t))$. Then $f$ possesses a local minimum at $t = 0$; therefore,
$f'(0) = 0$, i.e., $F'(u_0)h = 0$ for all $h$ with $G'(u_0)h = 0$. Now,
Proposition 43.1 yields the assertion.
(2) Let $n = 2$. The proof proceeds analogously for $n > 2$. We set
$H(u) = F(u) - \Lambda(G(u))$.
For $r = 1, 2$,
$H^{(r)}(u_0)k^r = F^{(r)}(u_0)k^r - \Lambda(G^{(r)}(u_0)k^r)$.
By the Taylor theorem, because $H'(u_0)k = 0$, we have
$H(u) - H(u_0) = \tfrac{1}{2}H''(u_0)(u - u_0)^2 + o(\|u - u_0\|^2)$ as $u \to u_0$.   (25)
Parallel to Theorem 43.C, (2) in Section 43.6, we choose $u = \varphi(h)$ for $h$
in a small neighborhood of zero, $V$, in $TM_{u_0}$. Then $G(u) = 0$; therefore
$H(u) = F(u)$. Since $\varphi(h) = u_0 + h + o(\|h\|)$, from (25) we thus obtain
$F(\varphi(h)) - F(u_0) = \tfrac{1}{2}H''(u_0)h^2 + o(\|h\|^2) \ge \tfrac{c}{2}\|h\|^2 + o(\|h\|^2)$ as $h \to 0$.
Since $\varphi(V)$ is a neighborhood of $u_0$ on $M$, $F$ has a strict local
minimum at $u_0$. □
We now generalize Theorem 43.D by weakening the assumptions on $G$.
The formula
$\lambda_0 F'(u_0)k - \Lambda(G'(u_0)k) = 0$ for all $k \in X$   (26)
is the focal point.
Proposition 43.19. Suppose that the following three conditions hold:
(i) $F: U(u_0) \subseteq X \to \mathbb{R}$ is F-differentiable at $u_0$. Here, $X$
and $Y$ are real B-spaces.
(ii) $G: U(u_0) \subseteq X \to Y$ is F-differentiable in an open neighborhood of
$u_0$ and $G'$ is continuous at $u_0$.
(iii) $R(G'(u_0))$ is closed.
Then if $F$ has a bound local minimum with respect to the side condition
$M = \{u \in D(G): G(u) = 0\}$ at $u_0$, there exist a $\lambda_0$ in $\mathbb{R}$ and
a $\Lambda$ in $Y^*$ which are not both equal to zero such that (26) holds.
We have $\lambda_0 = 1$ when the nondegeneracy condition $R(G'(u_0)) = Y$ is
fulfilled.
Proof. In the degenerate case $R(G'(u_0)) \ne Y$, (26) holds with
$\lambda_0 = 0$ and $\Lambda \ne 0$ by Corollary 43.2.
If $R(G'(u_0)) = Y$, then the assertion follows from Theorem 43.D in the
case where $G$ is a submersion, i.e., $G$ is continuously differentiable in an
open neighborhood of $u_0$ and $N(G'(u_0))$ splits $X$.
Analogous to the proof of Theorem 43.D, (1), the proof of the assertion
under the present weaker assumptions follows because $TM_{u_0} = N(G'(u_0))$
according to Problem 43.2. □
43.9. Critical Points and Lagrange Multipliers
The point of departure for the definition of critical points is the formula
$\frac{d}{dt}F(u(t))\Big|_{t=0} = 0$.   (27)
Definition 43.20. Let $X$ be a real locally convex space. The functional
$F: D(F) \subseteq X \to \mathbb{R}$ has a critical point with respect to $M$ at
$u_0$ if and only if the following two conditions hold:
(i) $M$ is a subset of $X$ such that $u_0 \in M$, and $D(F)$ contains an open
neighborhood of $u_0$ in $M$.
(ii) (27) holds for all admissible curves $t \mapsto u(t)$ in $M$ that pass
through $u_0$, i.e., $u(0) = u_0$, $u(t) \in M$ for all $t$ in a neighborhood of
zero in $\mathbb{R}$, and $u'(0)$ exists.
If $u_0 \in \operatorname{int} M$, i.e., $M$ contains an open $X$-neighborhood of
$u_0$, then $u_0$ is called a free critical point of $F$.
For the sake of convenience, in the case of a free critical point we shall
agree that in (27) only straight lines $u(t) = u_0 + th$ for arbitrary $h$ in $X$
will be considered. All of these straight lines are admissible. Then, by
definition of the first variation, the following holds: If
$u_0 \in \operatorname{int} M$, then $F$ has a free critical point at $u_0$ if and
only if
$\delta F(u_0; h) = 0$ for all $h \in X$.   (28)
A critical point which corresponds neither to a local maximum nor a local
minimum is called a saddle point. In particular, a critical point u0 of F is a
saddle point with respect to M if for each neighborhood U(u0) there exist
points v and w on M n U(u0) such that F(v) < F(u0) < F(w).
We have already explained the intuitive meaning of critical points in
Section 37.3. If F has a critical point with respect to M at u0, then we also
say that $F$ is stationary with respect to $M$ at $u_0$. If $X$ is a real
B-space, the F-derivative $F'(u_0)$ exists, and $M$ has a tangent space
$TM_{u_0}$ at $u_0$, then we have the following criterion: $u_0$ is a critical
point of $F$ with respect to $M$ if and only if
$F'(u_0)h = 0$ for all $h \in TM_{u_0}$.   (29)
This is equivalent to the assertion that the tangential mapping
$TF(u_0): TM_{u_0} \to \mathbb{R}$ equals zero. (29) follows from (28), the chain
rule, and the definition of $TM_{u_0}$.
We now consider the problem
$F(u) = \text{stationary}!$, $G(u) = 0$   (30)
and its connection with Lagrange multipliers. If we set
$M = \{u \in D(G): G(u) = 0\}$, then (30) means that we seek critical points of
$F$ with respect to $M$. Parallel to (30), we note the following crucial
condition:
$\lambda_0 F'(u_0)k - \Lambda(G'(u_0)k) = 0$ for all $k \in X$.   (31)
Proposition 43.21. Suppose that the following two conditions hold:
(i) $F: U(u_0) \subseteq X \to \mathbb{R}$ is F-differentiable at $u_0$; $X$ and
$Y$ are real B-spaces.
(ii) $G: D(G) \subseteq X \to Y$ is a submersion at $u_0$ such that $G(u_0) = 0$.
Then $F$ has a critical point with respect to $M$ at $u_0$ if and only if (31)
holds for a fixed $\Lambda \in Y^*$ and $\lambda_0 = 1$.
Corollary 43.22. If F satisfies the assumptions of Proposition 43.21 and G
satisfies the weaker (relative to Proposition 43.21) assumptions of Proposition
43.19, then the following holds: If $F$ has a critical point with respect to
$M$ at $u_0$, then there exist a $\lambda_0$ in $\mathbb{R}$ and a $\Lambda$ in
$Y^*$, where not both are simultaneously equal to zero, such that (31) holds.
In the nondegenerate case, $R(G'(u_0)) = Y$, we can choose $\lambda_0 = 1$.
Proof. We use only the fact that $TM_{u_0} = N(G'(u_0))$. This relation follows
from Theorem 43.C in Section 43.6.
If $u_0$ is a critical point, then from (29) it immediately follows that
$F'(u_0)h = 0$ for all $h$ such that $G'(u_0)h = 0$; therefore, (31) holds
according to Proposition 43.1.
Conversely, if (31) holds and $t \mapsto u(t)$ is an admissible curve in $M$
that passes through $u_0$, then $G(u(t)) = 0$ and $G'(u_0)u'(0) = 0$; therefore,
$F'(u_0)u'(0) = 0$ according to (31) and thus we have (27). □
Corollary 43.22 is proved analogously with the aid of $TM_{u_0} = N(G'(u_0))$
for $R(G'(u_0)) = Y$ according to Problem 43.2. If $R(G'(u_0)) \ne Y$, then we
set $\lambda_0 = 0$ and use Corollary 43.2.
43.10. Application to Real Functions in $\mathbb{R}^N$
We consider the minimum problem with side conditions:
$F(u) = \min!$,   (32a)
$G_i(u) = 0$, $i = 1, \dots, K$; $K < N$.   (32b)
The following two conditions are crucial.
(i) There exist real numbers $\lambda_1, \dots, \lambda_K$ and $\lambda_0 = 1$ such
that
$\lambda_0 D_j F(u_0) - \sum_{i=1}^{K} \lambda_i D_j G_i(u_0) = 0$, $j = 1, \dots, N$.   (33)
(ii) For all $h \in \mathbb{R}^N$ such that $G_i'(u_0)h = 0$, $i = 1, \dots, K$, and
fixed $c > 0$,
$F''(u_0)h^2 - \sum_{i=1}^{K} \lambda_i G_i''(u_0)h^2 \ge c\|h\|^2$.   (34)
To be precise, we assume:
(H1) The functions $F, G_1, \dots, G_K: U(u_0) \subseteq \mathbb{R}^N \to \mathbb{R}$,
$N \ge 1$, possess continuous partial derivatives up to and including order $n$
in the open neighborhood $U(u_0)$. Let $u = (\xi_1, \dots, \xi_N)$ and
$D_j = \partial/\partial\xi_j$.
(H2) The rank of the $K \times N$ matrix $(D_j G_i(u_0))$ is maximal, i.e., equal
to $K$.
From (H1) it immediately follows that $F, G_1, \dots, G_K$ are $n$-times
continuously F-differentiable on $U(u_0)$ and
$F^{(r)}(u_0)h^r = \sum D_{j_1} \cdots D_{j_r} F(u_0)\, h_{j_1} \cdots h_{j_r}$
for $r = 1, \dots, n$. The summation is over $j_1, \dots, j_r$ from 1 to $N$.
An analogous formula holds for the $G_i$. According to (H2), $G$ is a
submersion at $u_0$, i.e., $R(G'(u_0)) = \mathbb{R}^K$.
Proposition 43.23. Suppose (H1) and (H2) are fulfilled and that $u_0$ satisfies
the side condition (32b). Then the following hold:
(1) Let $n = 1$ in (H1). If $F$ has a bound local minimum with respect to the
side condition (32b), then (i) holds.
Condition (i) is necessary and sufficient for $F$ to have a critical point with
respect to (32b) at $u_0$.
(2) Let $n = 2$ in (H1). Then (i) and (ii) are sufficient for $F$ to possess a
bound strict local minimum with respect to (32b) at $u_0$.
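As an illustration of conditions (33) and (34), the following sketch works through a small example that is not taken from the text: $F(u) = \xi_1 + \xi_2$ with the single side condition $G(u) = \xi_1^2 + \xi_2^2 - 2 = 0$ and candidate point $u_0 = (-1, -1)$. All function names are hypothetical, and derivatives are approximated by central differences rather than computed exactly. The script recovers the multiplier from (33) and evaluates the quadratic form in (34) on the tangent direction $h = (1, -1)$.

```python
import math


def F(u):
    """Objective function F(u) = xi_1 + xi_2."""
    return u[0] + u[1]


def G(u):
    """Side condition G(u) = xi_1^2 + xi_2^2 - 2."""
    return u[0] ** 2 + u[1] ** 2 - 2.0


def grad(f, u, eps=1e-6):
    """Central-difference approximation of (D_1 f(u), ..., D_N f(u))."""
    g = []
    for j in range(len(u)):
        up, um = list(u), list(u)
        up[j] += eps
        um[j] -= eps
        g.append((f(up) - f(um)) / (2.0 * eps))
    return g


def second_directional(f, u, h, eps=1e-4):
    """Central-difference approximation of f''(u) h^2."""
    up = [a + eps * b for a, b in zip(u, h)]
    um = [a - eps * b for a, b in zip(u, h)]
    return (f(up) - 2.0 * f(u) + f(um)) / eps ** 2


u0 = [-1.0, -1.0]
dF, dG = grad(F, u0), grad(G, u0)

# Multiplier from (33) with lambda_0 = 1, obtained by least squares.
lam = sum(a * b for a, b in zip(dF, dG)) / sum(b * b for b in dG)
residual = [a - lam * b for a, b in zip(dF, dG)]

# h spans the null space of G'(u0); evaluate the quadratic form in (34).
h = [1.0, -1.0]
quad = second_directional(F, u0, h) - lam * second_directional(G, u0, h)

# Direct comparison along the constraint circle near u0.
theta0 = 5.0 * math.pi / 4.0
nearby = [
    F([math.sqrt(2.0) * math.cos(theta0 + t), math.sqrt(2.0) * math.sin(theta0 + t)])
    for t in (-0.2, -0.1, 0.1, 0.2)
]
print(lam, residual, quad, min(nearby) > F(u0))
```

In this example the multiplier is $\lambda_1 = -1/2$ and the quadratic form equals $\|h\|^2$, so (34) holds with $c = 1$.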
Now we consider (32a) without the side condition (32b).
Corollary 43.24. If $F$ satisfies the assumption (H1) for $n = 1$, then $u_0$ is
a free critical point of $F$ if and only if (33) holds for $\lambda_0 = 1$,
$\lambda_1 = \cdots = \lambda_K = 0$.
Proof. We set $G(u) = (G_1(u), \dots, G_K(u))$ and apply Theorem 43.D in
Section 43.8 as well as Proposition 43.21 with $X = \mathbb{R}^N$,
$Y = \mathbb{R}^K$, $\Lambda = (\lambda_1, \dots, \lambda_K)$. Corollary 43.24
follows from (28). □
If the nondegeneracy condition (H2) is violated, then, in general, one can
formulate the Lagrange multiplier rule as follows.
Corollary 43.25. If $F$ satisfies the assumption (H1) with $n = 1$ and $F$ has a
bound local minimum or a critical point with respect to the side condition (32b)
at $u_0$, then there exist numbers $\lambda_0, \lambda_1, \dots, \lambda_K$, which
are not all simultaneously equal to zero, such that (33) holds.
Proof. If (H2) is not fulfilled, then the assertion follows from Proposition
43.19 with $\lambda_0 = 0$ and Corollary 43.22. □
43.11. Application to Information Theory
We consider an experiment $e$ having the possible results $e_1, \dots, e_n$. Let
$p_i$ be the probability for the occurrence of $e_i$. We define the entropy $S$
of $e$ by
$S = -k \sum_{i=1}^{n} p_i \ln p_i$.   (35)
Here, we agree to set $p_i \ln p_i = 0$ for $p_i = 0$. Thus, $S$ is continuous on
$M := \{p \in \mathbb{R}^n: p_1 + \cdots + p_n = 1,\ 0 \le p_i \le 1,\ i = 1, \dots, n\}$.
$k$ is a constant that is freely at our disposal. In statistical physics, $k$
equals the Boltzmann constant, i.e., $k = 1.380 \cdot 10^{-23}$ Ws/K. In
information theory, we choose $k$ so that
$S = -\sum_{i=1}^{n} p_i \log_2 p_i$.   (36)
Then the unit of $S$ is called a bit.
Before we motivate the definition (35), as an application of the Lagrange
multiplier rule, we prove the following simple assertion.
Proposition 43.26. $S$ assumes its maximum on $M$ at exactly the point
$p^0 = (p_1^0, \dots, p_n^0)$, where $p_1^0 = \cdots = p_n^0$; thus, $p_i^0 = 1/n$
for $i = 1, \dots, n$. Furthermore, $S(p^0) = k \ln n$.
Proof. The existence of a maximum point $p^0$ follows from the fact that $S$ is
continuous on the compact set $M$. Let $p^0 \in \operatorname{int} K$ where
$K := \{p \in \mathbb{R}^n: 0 \le p_i \le 1,\ i = 1, \dots, n\}$. From Proposition
43.23, due to the side condition $p_1 + \cdots + p_n = 1$, it immediately follows
that $\partial S(p^0)/\partial p_i = \lambda$, $i = 1, \dots, n$, i.e.,
$-k(\ln p_i^0 + 1) = \lambda$, thus $p_1^0 = \cdots = p_n^0$.
Now, by induction on $n$, one easily shows that $p^0 \in \partial K$ is
impossible. □
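Proposition 43.26 can be checked numerically. The sketch below is an illustration only, with hypothetical function names and the arbitrary choices $k = 1$ and $n = 5$: it evaluates (35) at the uniform distribution and compares the result with the entropy of randomly sampled points of $M$.

```python
import math
import random


def entropy(p, k=1.0):
    """S = -k * sum_i p_i ln p_i, with the convention p_i ln p_i = 0 for p_i = 0."""
    return -k * sum(x * math.log(x) for x in p if x > 0.0)


def random_distribution(n, rng):
    """Sample a point of M by normalizing n positive random weights."""
    w = [rng.random() + 1e-12 for _ in range(n)]
    s = sum(w)
    return [x / s for x in w]


n = 5
uniform = [1.0 / n] * n
rng = random.Random(0)  # fixed seed so that the run is reproducible
samples = [random_distribution(n, rng) for _ in range(1000)]
max_sample_entropy = max(entropy(p) for p in samples)

print("S(uniform)      =", entropy(uniform))
print("k ln n          =", math.log(n))
print("max over sample =", max_sample_entropy)
```

Random sampling does not prove the proposition; it only confirms that no sampled distribution exceeds the value $k \ln n$ attained at $p_i = 1/n$.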
Example 43.27. We consider the special case $n = 2$ and set
$I(p_1) = S(p_1, 1 - p_1)$. Then $I$ has the form of Fig. 43.4.
Now we motivate the definition of $S$. Heuristically, $S$ is a measure of the
uncertainty of the outcome of the trial. We first consider the case $n = 2$ in
Fig. 43.4. For $p_1 = 1$, $p_2 = 0$ and $p_1 = 0$, $p_2 = 1$, the outcome is
absolutely certain and $S$ equals zero. For $p_1 = p_2 = 1/2$ the outcome is the
most uncertain and $S$ is greatest. We also designate $S$ as information. In
this connection, we take the following standpoint: If one carries out an
experiment, then the information obtained is greatest when the outcome of the
trial is most uncertain.
[Figure 43.4: graph of $I(p_1)$ for $0 \le p_1 \le 1$, with the maximum at $p_1 = 1/2$.]
Table 43.1
Trial    Result                         Probability
$e$      $e_i$, $i = 1, \dots, n$       $p_i = 1/n$
$f$      $f_j$, $j = 1, \dots, m$       $q_j = 1/m$
$ef$     $e_i f_j$                      $p_{ij} = 1/(nm)$
In order to motivate the concrete form of $S$, we consider three
experiments $e$, $f$, and $ef$ as given in Table 43.1.
Here, $ef$ consists of simultaneously carrying out $e$ and $f$, where we
assume $e$ and $f$ to be mutually independent. For this reason the probabilities
are multiplied. We set $f(n) = S(p_1, \dots, p_n)$, where $p_i = 1/n$ for all $i$.
Then the entropy of $e$, $f$, and $ef$ is equal to $f(n)$, $f(m)$, and $f(nm)$,
respectively. We now require
$f(nm) = f(n) + f(m)$.   (37)
This condition lies at the heart of the following intuitive idea: We can
convey the information to a remote experimenter by means of a channel or
even over two channels concerning $e$ and $f$. In this connection we now
expect that the data add up. According to Problem 43.7a one obtains all
continuous functions $f: \,]0, \infty[\, \to \mathbb{R}$ such that
$f(xy) = f(x) + f(y)$ for all $x, y \in \,]0, \infty[$ by $f(x) = k \ln x$, where
$k \in \mathbb{R}$ is fixed. For this reason we set $f(n) = k \ln n$; therefore,
$S(p, \dots, p) = -k \ln p$ where $p = 1/n$. This is (35).
In case of distinct probabilities $p_i$, $S$ in (35) is the expected value of
$-k \ln p$.
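The additivity requirement (37) is easy to verify for the entropy (35). The following sketch, in which the values $n = 4$, $m = 6$ and all names are illustrative assumptions, builds the joint distribution of two independent experiments as in Table 43.1 and compares $S(ef)$ with $S(e) + S(f)$. It also evaluates (36) by choosing $k = 1/\ln 2$.

```python
import math


def entropy(p, k=1.0):
    """S = -k * sum_i p_i ln p_i, with p_i ln p_i = 0 for p_i = 0."""
    return -k * sum(x * math.log(x) for x in p if x > 0.0)


def joint(p, q):
    """Probabilities of the combined experiment for independent p and q."""
    return [pi * qj for pi in p for qj in q]


n, m = 4, 6
e = [1.0 / n] * n          # experiment e with n equally likely results
f = [1.0 / m] * m          # experiment f with m equally likely results
ef = joint(e, f)           # n*m results, each with probability 1/(n*m)

s_e, s_f, s_ef = entropy(e), entropy(f), entropy(ef)
k_bits = 1.0 / math.log(2.0)  # this choice of k turns (35) into (36)

print("S(e) + S(f) =", s_e + s_f)
print("S(ef)       =", s_ef)
print("S(ef) in bits =", entropy(ef, k_bits))

# Additivity also holds for independent experiments with distinct probabilities.
e2, f2 = [0.5, 0.3, 0.2], [0.9, 0.1]
gap = entropy(joint(e2, f2)) - (entropy(e2) + entropy(f2))
print("gap for non-uniform case =", gap)
```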
In Problem 43.7c we give a deeper information-theoretical interpretation
of the S in (36). Roughly speaking, it turns out that NS is equal to the
average number of questions that one must ask in order to determine the
result of a trial sequence of N trials in the case where the questions are
answered solely by "yes" or "no."
43.12. Application to Statistical Physics. Temperature
as a Lagrange Multiplier
We consider the following basic model. Suppose a system has the possible
states $Z_1, \dots, Z_n$ with the corresponding energies $E_1, \dots, E_n$,
$n \ge 3$, where not all of the $E_j$'s are equal. Let $p_i$ be the probability
that the system is in the state $Z_i$. We set $p := (p_1, \dots, p_n)$ as well as
$K := \{p \in \mathbb{R}^n: 0 \le p_i \le 1,\ i = 1, \dots, n\}$,
$S(p) := -k \sum_{i=1}^{n} p_i \ln p_i$
and determine $p_i$ from the requirement
$S(p) = \max!$, $p \in K$,
$\sum_{i=1}^{n} p_i = 1$,   (38)
$\sum_{i=1}^{n} p_i E_i = E$
for fixed $E$. This fundamental problem of statistical physics has the
following physical interpretation. We assume that the system $\Sigma$ is part of
a very large system $\Sigma_0$. Suppose $\Sigma$ stands in energy exchange with
$\Sigma_0$. However, suppose the interaction is so weak that one can attribute an
average energy $E$ to $\Sigma$. For example, one can imagine $\Sigma$ to be a gas
in a container and $\Sigma_0$ to be the Earth together with the Earth's
atmosphere. (38) means that the entropy is maximal. By the second law of
thermodynamics, the entropy $S$ cannot decrease. The state which realizes
maximal entropy is a stationary final state. From the standpoint of
information theory in Section 43.11, nature seeks to realize states with
maximal information.
Proposition 43.28. If (38) possesses a solution with $p \in \operatorname{int} K$,
then there exist real numbers $C$, $\lambda$ such that
$p_i = C \exp(\lambda E_i)$, $i = 1, \dots, n$.
Proof. According to Proposition 43.23, we have
$S_{p_i}(p) - \lambda_1 - \lambda_2 E_i = 0$;
therefore, $-k(\ln p_i + 1) = \lambda_1 + \lambda_2 E_i$. □
$C$ and $\lambda$ are determined by the side conditions in (38); thus,
$p_i = \dfrac{\exp(\lambda E_i)}{\sum_{j=1}^{n} \exp(\lambda E_j)}$, $i = 1, \dots, n$.   (39)
In statistical physics, one sets $\lambda = -1/kT$. Then $T$ turns out to be the
absolute thermodynamic temperature. The connection between $T$ and $E$ is
obtained from $E = \sum_{i=1}^{n} p_i E_i$.
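The relation between $\lambda$ and $E$ in (39) can be made explicit numerically. The following sketch is an illustration under assumed data (energies $0, 1, 2$, mean energy $E = 0.8$, $k = 1$) with hypothetical function names. It determines $\lambda$ by bisection, using the fact that the mean energy of the distribution (39) is strictly increasing in $\lambda$ when not all energies are equal, and then compares the entropy with that of a nearby distribution satisfying the same two side conditions.

```python
import math


def gibbs(lam, energies):
    """Distribution (39): p_i = exp(lam * E_i) / sum_j exp(lam * E_j)."""
    shift = max(lam * e for e in energies)  # subtract the maximum for stability
    w = [math.exp(lam * e - shift) for e in energies]
    z = sum(w)
    return [x / z for x in w]


def mean_energy(p, energies):
    return sum(pi * e for pi, e in zip(p, energies))


def entropy(p, k=1.0):
    return -k * sum(x * math.log(x) for x in p if x > 0.0)


def solve_lambda(target, energies, lo=-50.0, hi=50.0, iters=200):
    """Bisection for lam such that the mean energy of gibbs(lam) equals target."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if mean_energy(gibbs(mid, energies), energies) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


energies = [0.0, 1.0, 2.0]   # assumed example data
E_target = 0.8               # assumed mean energy
lam = solve_lambda(E_target, energies)
p = gibbs(lam, energies)
T = -1.0 / lam               # from lam = -1/(kT) with k = 1

# d sums to zero and has zero mean energy, so p + t*d satisfies both side conditions.
d = [1.0, -2.0, 1.0]
q = [pi + 0.01 * di for pi, di in zip(p, d)]

print("lambda =", lam, " T =", T)
print("p =", p)
print("S(p) =", entropy(p), " S(q) =", entropy(q))
```

With these data $\lambda$ is negative, so $T$ is positive, and the perturbed distribution has smaller entropy, as the maximum property in (38) requires.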
We now extend the model by assuming that to a state $Z_i$ of the system,
there belongs not only the energy $E_i$ but also a second quantity $N_i$, where
$N_i$, in general, represents the number of particles. We assume that it is
meaningful to speak about the average number $N$ of particles of the system
$\Sigma$, since
the exchange with the more comprehensive system $\Sigma_0$ is rather weak. Then
we have yet to add the side condition
$\sum_{i=1}^{n} p_i N_i = N$   (38a)
in (38) where $N$ is fixed. Let $p$ be a solution of (38), (38a), with
$p \in \operatorname{int} K$. If the nondegenerate case occurs, i.e., the rank of
the matrix
$\begin{pmatrix} 1 & 1 & \cdots & 1 \\ E_1 & E_2 & \cdots & E_n \\ N_1 & N_2 & \cdots & N_n \end{pmatrix}$
is equal to 3, then from Proposition 43.23 it follows that there exist real
numbers $\lambda_1, \lambda_2, \lambda_3$ such that
$S_{p_i}(p) - \lambda_1 - \lambda_2 E_i - \lambda_3 N_i = 0$;
therefore, $p_i = C \exp(\lambda E_i + \mu N_i)$ and consequently
$p_i = \dfrac{\exp(\lambda E_i + \mu N_i)}{\sum_{j=1}^{n} \exp(\lambda E_j + \mu N_j)}$, $i = 1, \dots, n$.   (40)
In statistical physics one chooses
$\lambda = -\dfrac{1}{kT}$, $\mu = \dfrac{\zeta}{kT}$.
Here, $T$ is the absolute temperature and $\zeta$ is called the chemical
potential per molecule. This plays an important role in physical chemistry. The
connection between $T$, $\zeta$ and $E$, $N$ is obtained from
$E = \sum_{i=1}^{n} p_i E_i$, $N = \sum_{i=1}^{n} p_i N_i$
and (40).
Formulas (39) and (40) are of fundamental significance in statistical
physics. The model pertaining to (39) or (40) is called the canonical
ensemble or the grand canonical ensemble, respectively. We shall treat
physical applications in Chapter 68. For example, we shall consider Planck's
radiation law and the evolution of the universe after the Big Bang. The
crucial physical problem consists in making the concept of state $Z_i$ precise
and calculating $E_i$, $N_i$ for a state $Z_i$. Here, one has various
possibilities (classical Gibbs statistics in phase space, Einstein-Bose
statistics, Fermi-Dirac statistics). In this connection, quantum theory plays a
crucial role.
In Chapter 67 it will be shown that the determination of thermodynamic
equilibrium states of physical and chemical systems leads to extremal
problems for real functions (thermodynamic potentials) with side conditions
and can be treated with the aid of Lagrange multipliers. For example, the
Gibbs phase rule and the fundamental mass action law in chemistry are
obtained in this way.
43.13. Application to Variational Problems with
Integral Side Conditions
We consider the minimum problem
$\int_\Omega L(x, Du(x))\,dx = \min!$,
$D^\alpha u = 0$ on $\partial\Omega$ for all $\alpha$, $|\alpha| \le m - 1$,   (41)
$\int_\Omega H(x, Du(x))\,dx = c$,
where $c$ is some constant. Here, we use the notation from Section 40.5. In
particular, $Du$ symbolizes the function $u$ and all its partial derivatives up
to and including order $m$. Let $d$ be the number of components of $Du$.
According to Section 40.5, the Euler equation for $L$ reads as follows:
$\Omega$: $\sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha L_{D^\alpha u}(x, Du(x)) = 0$.   (42)
We assume:
(H1) $\Omega$ is a bounded region in $\mathbb{R}^N$ and
$L, H \in C^{m+1}(\bar\Omega \times \mathbb{R}^d)$, $N, m \ge 1$.
By the generalized or distributional form of (42) we understand
$\int_\Omega \sum_{|\alpha| \le m} L_{D^\alpha u}(x, Du(x))\, D^\alpha h\,dx = 0$ for all $h \in C_0^\infty(\Omega)$.   (42a)
This relation follows from (42) by multiplication by $h$ and subsequent
integration by parts. If $u \in C^{2m}(\bar\Omega)$, then (42) follows, conversely,
from (42a) in reverse by integration by parts. In order to formulate (41)
functional analytically, we set
$X = \{u \in C^m(\bar\Omega): D^\alpha u = 0$ on $\partial\Omega$ for all $\alpha$, $|\alpha| \le m - 1\}$,
$F(u) := \int_\Omega L(x, Du)\,dx$, $G(u) := \int_\Omega H(x, Du)\,dx - c$.
Then (41) reads as follows:
$F(u) = \min!$, $u \in X$, $G(u) = 0$.   (43)
Proposition 43.29. Suppose (H1) holds. If, with respect to the side conditions
in (43), $F$ has a bound local minimum or a critical point, then there exist real
numbers $\lambda_0$ and $\lambda$ which are not both zero such that (42) holds in
the generalized sense (42a) with $\mathcal{L} = \lambda_0 L + \lambda H$ instead of
$L$.
We have $\lambda_0 = 1$ when the side condition is not degenerate, i.e., (42) does
not hold in the generalized sense with $H$ instead of $L$.
Proof. We have $G: X \to Y$, where $Y = \mathbb{R}$. If we denote the left-hand
side of (42a) by $\{L\}$, then
$F'(u)h = \{L\}$, $G'(u)h = \{H\}$,
where $R(G'(u)) = \mathbb{R}$ or $R(G'(u)) = \{0\}$. In both cases, $R(G'(u))$ is
closed. From Proposition 43.19 or Corollary 43.22 it follows that
$\lambda_0 F'(u)h + \lambda G'(u)h = 0$ for all $h \in X$,
where $\lambda_0^2 + \lambda^2 \ne 0$ and $\lambda_0 = 1$ for $R(G'(u)) = \mathbb{R}$. □
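A classical special case may help to fix ideas. The sketch below is an illustration under assumptions that are not in the text: $N = m = 1$, $\Omega = \,]0,1[$, $L = (u')^2$, $H = u^2$, $c = 1$. For $\mathcal{L} = L + \lambda H$ the Euler equation (42) becomes $2\lambda u - 2u'' = 0$, and the candidate $u(x) = \sqrt{2}\sin \pi x$ satisfies it with $\lambda = -\pi^2$. The script checks the side condition, the Euler equation, and the value $F(u) = \pi^2$ numerically, and compares with a perturbed function; derivatives are finite-difference approximations and all names are hypothetical.

```python
import math


def u(x):
    """Candidate minimizer u(x) = sqrt(2) sin(pi x), with u(0) = u(1) = 0."""
    return math.sqrt(2.0) * math.sin(math.pi * x)


def v(x):
    """Perturbed comparison function with the same boundary values."""
    return u(x) + 0.1 * math.sqrt(2.0) * math.sin(2.0 * math.pi * x)


def simpson(f, a, b, n=200):
    """Composite Simpson rule with an even number n of subintervals."""
    step = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(a + i * step)
    return total * step / 3.0


def d1(f, x, eps=1e-5):
    """Central-difference first derivative."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def d2(f, x, eps=1e-4):
    """Central-difference second derivative."""
    return (f(x + eps) - 2.0 * f(x) + f(x - eps)) / eps ** 2


def rayleigh(w):
    """Quotient of int (w')^2 dx and int w^2 dx over ]0,1[."""
    num = simpson(lambda x: d1(w, x) ** 2, 0.0, 1.0)
    den = simpson(lambda x: w(x) ** 2, 0.0, 1.0)
    return num / den


lam = -math.pi ** 2
constraint = simpson(lambda x: u(x) ** 2, 0.0, 1.0) - 1.0      # G(u)
F_u = simpson(lambda x: d1(u, x) ** 2, 0.0, 1.0)               # F(u)
residual = max(
    abs(2.0 * lam * u(x) - 2.0 * d2(u, x)) for x in [i / 20.0 for i in range(1, 20)]
)

print("G(u) =", constraint)
print("F(u) =", F_u, " pi^2 =", math.pi ** 2)
print("max Euler residual =", residual)
print("quotient for u, v  =", rayleigh(u), rayleigh(v))
```

After normalization, the perturbed function satisfies the side condition and gives a larger value of $F$, in agreement with the minimum property.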
Proposition 43.29 refers to smooth solutions of the variational problem
(41). In an analogous way, one can handle nonsmooth solutions that lie in
Sobolev spaces. Parallel to Section 42.7, one then has to take into account that
the integrands in (41) satisfy growth conditions so that the integrals exist.
43.14. Application to Variational Problems with
Differential Equations as Side Conditions
We now replace the integral side condition (41) with differential equations,
i.e., we consider the problem
$\int_\Omega L(x, Du(x))\,dx = \min!$,   (44)
$D^\alpha u = 0$ on $\partial\Omega$ for all $\alpha$, $|\alpha| \le m - 1$,
$G_k(x, Du(x)) = 0$ on $\Omega$ for $k = 1, \dots, K$.
In this connection, $u = (u_1, \dots, u_J)$, and $Du$ symbolizes the partial
derivatives of all $u_j$ up to and including order $m$. Let the number of
components of $Du$ be $J \cdot d$. The Euler equations belonging to $L$ read as
follows:
$\Omega$: $\sum_{|\alpha| \le m} (-1)^{|\alpha|} D^\alpha L_{D^\alpha u_j}(x, Du(x)) = 0$, $j = 1, \dots, J$,   (45)
with the corresponding generalized (or distributional) form
$\int_\Omega \sum_{|\alpha| \le m} L_{D^\alpha u_j}(x, Du(x))\, D^\alpha h_j(x)\,dx = 0$ for all $h_j \in C_0^\infty(\Omega)$,   (45a)
$j = 1, \dots, J$. An important role is played by the inhomogeneous linearized
form of the side conditions
$\sum_{j=1}^{J} \sum_{|\alpha| \le m} (G_k)_{D^\alpha u_j}(x, Du(x))\, D^\alpha h_j(x) = g_k(x)$   (46)
with $k = 1, \dots, K$. We write (44) as a functional analysis problem in the form
$F(u) = \min!$, $u \in X$, $G(u) = 0$.   (47)
Here, $F(u)$ is the integral in (44) and
$G(u)(x) := (G_1(x, Du(x)), \dots, G_K(x, Du(x)))$.
Let $X$ be the B-space of all $u = (u_1, \dots, u_J)$ with
$u_j \in C^m(\bar\Omega)$ and $D^\alpha u_j = 0$ on $\partial\Omega$
for all $j = 1, \dots, J$ and all $\alpha$, $|\alpha| \le m - 1$, equipped with the
usual norm.
We now make the following assumptions, where $u$ is a fixed solution of (47):
(H1) $\Omega$ is a bounded region in $\mathbb{R}^N$ and
$L, G_1, \dots, G_K \in C^{m+1}(\bar\Omega \times \mathbb{R}^{Jd})$, $N, m \ge 1$.
(H2) Variations with respect to $u$. For each $h \in X$ such that $G'(u)h = 0$,
i.e., (46) with $g_k = 0$ for all $k$, there exists a function
$(x, t) \mapsto \tilde{u}(x, t)$ with the following properties:
(i) $\tilde{u} \in C^m(\bar\Omega \times [-t_0, t_0])$ for fixed $t_0 > 0$.
(ii) $\tilde{u}(x, 0) = u(x)$, $\tilde{u}_t(x, 0) = h(x)$ on $\bar\Omega$.
(iii) For each $t \in [-t_0, t_0]$, $\tilde{u}(\cdot, t)$ satisfies the boundary
and side conditions in (44).
Then relative to the side condition $G(v) = 0$, to $\tilde{u}$ there corresponds
an admissible curve in $X$ passing through $u$ and having $h$ as tangent vector.
(H3) Closedness. Let
$Z := \prod_{j=1}^{J} \mathring{W}_2^m(\Omega)$, $Y := \prod_{k=1}^{K} L_2(\Omega)$.
If $h$ runs through the space $Z$, then the right-hand sides
$g = (g_1, \dots, g_K)$ in (46) form a closed set in $Y$.
Proposition 43.30. Suppose $F$ has a bound local minimum or a critical point
relative to the side condition in (47) at $u$ in $X$, and (H1)-(H3) hold for $u$.
Then there exist functions $\lambda_1, \dots, \lambda_K$ in $L_2(\Omega)$ such that
the Euler equation (45) holds in the generalized sense (45a) in the case where
$L$ is replaced by $\mathcal{L} = L + \sum_{k=1}^{K} \lambda_k G_k$.
Proof. If we replace $u$ in (44) by $\tilde{u}$, then the derivative of the
integral at $t = 0$ must equal zero. Since $\tilde{u}_t(x, 0) = h(x)$, we have
$F'(u)h = 0$, $G_k'(u)h = 0$. Moreover, the left-hand sides of (45a) and (46)
correspond to $F'(u)h$ and $G_k'(u)h$, respectively. Thus, according to (H2),
the following holds:
$F'(u)h = 0$ for all $h \in X$ such that $G'(u)h = 0$.   (48)
Since $X$ is dense in $Z$, by passing to the limit in (48), we obtain
$F'(u)h = 0$ for all $h \in Z$ such that $G'(u)h = 0$.   (48a)
Furthermore, the operators $F'(u): X \to \mathbb{R}$ and $G'(u): X \to Y$ can be
extended to continuous linear operators $F'(u): Z \to \mathbb{R}$ and
$G'(u): Z \to Y$. Assumption (H3) means that $R(G'(u))$ is closed in $Y$.
Now from (48a) and Proposition 43.1 we obtain the existence of a
$\Lambda \in Y^*$ such that
$F'(u)h + \Lambda G'(u)h = 0$ for all $h \in Z$.
$Y$ is an H-space. By the Riesz theorem (Proposition 21.17),
$\Lambda = (\lambda_1, \dots, \lambda_K)$, where $\lambda_k \in L_2(\Omega)$ for all
$k$, i.e.,
$F'(u)h + \int_\Omega \sum_{k} \lambda_k G_k'(u)h\,dx = 0$ for all $h \in Z$.
For $h = (0, \dots, h_j, 0, \dots, 0)$, where $h_j \in C_0^\infty(\Omega)$, (45a)
follows from this immediately with $L$ replaced by $\mathcal{L}$. □
In multidimensional variational problems, the closedness condition (H3)
can cause difficulties. In principle, the condition R(G'(u))=Y can be
characterized according to a general surjectivity theorem by a priori
estimates for the operator adjoint to G'(u) (cf. Problem 43.3).
If one does not succeed in verifying (H3), then one cannot apply
Proposition 43.30 in order to verify the Lagrange multiplier rule as a
necessary condition. However, independent of this, there exists the
possibility of using Lagrange multipliers in a simple way to obtain sufficient
conditions. To this end, let it be assumed that we know:
(a) a tuple of functions $u = (u_1, \dots, u_J)$ that satisfies the side
conditions in (44)
(b) and Lagrange multipliers, i.e., functions $\lambda_1, \dots, \lambda_K$ that
satisfy the Euler equation (45) (or, more generally, (45a)) with
$\mathcal{L} = L + \lambda_1 G_1 + \cdots + \lambda_K G_K$ instead of $L$.
Then instead of (44) we consider the problem
$\int_\Omega \Big(L + \sum_{k=1}^{K} \lambda_k G_k\Big)\,dx = \min!$,   (44*)
$D^\alpha u = 0$ on $\partial\Omega$ for all $\alpha$, $|\alpha| \le m - 1$.
If we can verify, for example, with the aid of the sufficiency criteria given in
Section 40.2 with respect to the second variation, that u is a solution of
(44*), then we have obviously obtained a solution u of (44).
In Proposition 43.30, we considered a smooth solution $u$. These
considerations can be extended directly to solutions in the Sobolev space
$W_2^m(\Omega)$ that are not necessarily smooth, provided certain growth
conditions are placed on the functions $L$ and $G_k$, parallel to Section 42.7.
Then, with the assumptions (H1) and (H3), Proposition 43.19 can be applied to
$F: Z \to \mathbb{R}$, $G: Z \to Y$. Here, (H3) guarantees that $R(G'(u))$ is
closed. (H2) need not be assumed here.
If the side conditions $G_k(x, u(x)) = 0$ do not depend on the derivatives,
then the closedness condition (H3) cannot be fulfilled at first. However, by
differentiation with respect to $x$, side conditions that contain derivatives
result. For example, $G_x + \sum_j G_{u_j} Du_j = 0$ follows from
$G(x, u(x)) = 0$.
Problems
In particular we recommend that the reader study the set of Problems 43.4 where
general Banach manifolds are considered.
43.1. The quotient theorem. Let $X$, $Y$, and $Z$ be B-spaces over $\mathbb{K}$.
We consider the diagram

          B
     X ------> Z
      \       ^
     A \     / Q          (49)
        v   /
          Y

where $A \in L(X, Y)$, $B \in L(X, Z)$, and $R(A) = Y$, $N(A) \subseteq N(B)$.
Show: There exists exactly one operator $Q$ in $L(Y, Z)$ such that
$B = Q \circ A$, i.e., (49) is commutative.
Hint: Compare Ioffe and Tihomirov (1974, M), 0.1.4. Use the open
mapping theorem (cf. $A_1$(36)). This proposition generalizes Proposition
43.1. One can think of $Q$ as a quotient.
43.2.* General theorem on tangent vectors. Prove:
$TM_{u_0} = N(G'(u_0))$
for $M := \{u \in D(G): G(u) = 0\}$ provided:
(i) $X$, $Y$ are B-spaces over $\mathbb{K}$.
(ii) $G: D(G) \subseteq X \to Y$ is F-differentiable in an open neighborhood of
$u_0$, and $G'$ is continuous at $u_0$.
(iii) $R(G'(u_0)) = Y$.
These assumptions are weaker than those in Theorem 43.C in Section 43.6.
In particular, here we forego the splitting property for $N(G'(u_0))$.
Hint: Compare Ioffe and Tihomirov (1974, M), 0.2.4. Use the Banach
fixed point theorem for multivalued mappings (Theorem 9.A in Section 9.1).
43.3. Surjectivity theorem. $R(G'(u_0))$ closed and the condition
$R(G'(u_0)) = Y$ play an important role in the applicability of the Lagrange
multiplier rule. We give a general criterion for this. Prove: Let
$A: D(A) \subseteq X \to Y$ be a closed linear operator with dense domain of
definition $D(A)$. Let $X$ and $Y$ be B-spaces. Then $R(A) = Y$ if and only if
$A^*$ has a continuous inverse, i.e., there exists a number $c > 0$ such that
$\|A^* y^*\| \ge c\|y^*\|$ for all $y^* \in D(A^*)$.   (50)
Use the closed range theorem ($A_1$(39)).
Hint: Compare Yosida (1965, M), VII, 5, Consequence 1.
In particular, every continuous linear operator $A: X \to Y$ is also closed. If
$A$ is the closure of a differential operator, then the a priori estimate (50)
implies the solvability of the differential equation A u = / for all / in Y. In
this connection, compare the general investigations of Browder (1959),
In order to prove that R(G'(u0)) is closed, one can appeal to Proposition
8,14: A + B is a Fredholm operator, i.e., in particular, R(A + B) is closed
provided A,B e L(X,Y), A is a Fredholm operator, and B is compact.
Generalizations can be found in Kato (1966, M), Theorems 5,22, 5,26, etc,
43.4. Banach manifolds. In Definition 43.10 we considered manifolds in a fixed
B-space. For numerous applications this assumption is too restrictive. For
instance, if one considers the Riemann surface of an analytic function, then
this is a topological space on which one can calculate only locally as in the
complex plane. The general concept of a Banach manifold comprises
topological spaces on which one can calculate locally as in $\mathbb{R}^N$ or, more
generally, as in a B-space. This concept is of central significance in the
natural sciences. It corresponds to the picture that one can describe objects
in natural science locally by parameters. Important manifolds in physics are
curves and surfaces in phase spaces (mechanics, statistical physics),
Riemannian manifolds (general relativity theory; events are described in
local coordinate systems by three space coordinates and a time coordinate),
and Lie groups for describing symmetries and conservation quantities as
well as fiber bundles (gauge field theories in elementary particle physics).
In many cases one can abstractly embed given manifolds in a single
space. According to Whitney, an $n$-dimensional $C^\infty$-manifold with a
countable basis, which is thus described locally by $\mathbb{R}^n$, can be embedded as a
surface in $\mathbb{R}^{2n+1}$ (cf. Section 73.21). In Part IV we shall study the theory of
Banach manifolds in greater detail.
43.4a. Definition. We generalize Definition 43.10 in a natural way. A topological
space $M$ is called a $C^r$-Banach manifold if and only if the following two
conditions hold:
(i) For each point $u \in M$ there exist a B-space $X_u$ over $\mathbb{K}$, an open set $V_u$ in
$X_u$, and a mapping $\varphi_u: V_u \subseteq X_u \to M$ which maps $V_u$ homeomorphically
onto an open neighborhood of $u$ in $M$ (see Fig. 43.5).
(ii) The mappings, which describe the change of local coordinates parallel
to Definition 43.10, are $C^r$-mappings.
If all $X_u$ are equal to the same B-space $X$, then one says that $M$ is a
manifold which is modelled on the B-space $X$.
43.4b. General strategy of the theory of manifolds. By a change of local coordinates,
we understand that instead of the local coordinates $\varphi_u^{-1}(w)$ from $V_u$, to the
point $w$ we assign the local coordinates $\varphi_v^{-1}(w)$ from $V_v$, provided $w$ lies in
$\varphi_v(V_v)$ (see Fig. 43.5). The general strategy is that one calculates in local
coordinates and takes into account the concepts that remain invariant
relative to change of local coordinates. In this connection, the chain rule
plays a central role. Only these concepts possess a general coordinate-free
meaning for the manifold, i.e., they represent geometrical properties of the
manifold.
Figure 43.5
For example, on a $C^r$-manifold $M$, one can easily define when a mapping
$F: M \to \mathbb{R}$ is a $C^r$-mapping. For this it suffices that $F$ be a $C^r$-mapping
relative to $V_u$ for all $u \in M$; to be precise, $F \circ \varphi_u \in C^r(V_u, \mathbb{R})$ must
hold. This property is preserved under a change of local coordinates. One can
convince oneself of this explicitly with the help of the chain rule.
Analogously, one can define $C^r$-maps $F: M \to N$ between two $C^r$-Banach
manifolds $M$ and $N$.
As our next example, we consider tangent vectors.
43.4c. Tangent space. In Definition 43.8 tangent vectors appeared in a simple
intuitive way because $M$ was situated in a B-space $X$. We will now define
tangent vectors at $u$ and the tangent space $TM_u$ at $u$ in an invariant way so
that $TM_u \cong X_u$ holds (isomorphism of linear spaces).
(i) Let $M$ be a $C^1$-manifold. By definition, a differentiable curve $C$ on $M$
that passes through the point $u \in M$ is a $C^1$-map $y: U(0) \subseteq \mathbb{R} \to M$ with
$y(0) = u$. Now consider local coordinates in $V_u$. Then $u$ has the local
coordinate $\bar u = \varphi_u^{-1}(u)$. Furthermore, the curve $C$ corresponds to the
curve $C_u$ on $V_u$ with the tangent vector $t_u$ at $\bar u$. To be more precise, $C_u$ is
given by $x(\tau) = \varphi_u^{-1}(y(\tau))$ and $t_u = x'(0)$ (see Fig. 43.5).
(ii) By definition, two differentiable curves C in (i) are in contact at u if and
only if they have the same tangent vector tu with respect to Vu.
(iii) By definition, a tangent vector $t$ of $M$ at $u$ is the collection of all
differentiable curves passing through $u$ which are in contact at $u$.
We designate tu as the representative of t with respect to Vu.
Sometimes one calls the representative $t_u$ of $t$ a concrete tangent vector
and $t$ an abstract tangent vector of the manifold. Obviously, $t_u$ lies in
the B-space $X_u$ (see Fig. 43.5). With the aid of the chain rule, show that
the concepts of a differentiable curve, of contact, and of a tangent
vector introduced in (i)-(iii) are invariant relative to a change of local
coordinate systems. Observe that the curve $\tau \mapsto x(\tau)$ in $X_u$ passes to
$\tau \mapsto \psi(x(\tau))$ in the space $X_v$, where $\psi$ denotes the change of local coordinates.
(iv) The set of all tangent vectors $t$ at $u$ forms a linear space $TM_u$ which is
linearly isomorphic to $X_u$. In this connection, the linear operations on
tangent vectors are defined by the corresponding operations on the
representatives in $X_u$.
Write the transformation rule for the representatives of the tangent
vectors and derive that the linear operations for tangent vectors are
independent of the local coordinate system. Solution: $\psi(x(\tau))' =
\psi'(x(\tau))\,x'(\tau)$; therefore, $t_v = \psi'(\bar u)\,t_u$ and
$\alpha t_v^{(1)} + \beta t_v^{(2)} = \psi'(\bar u)\bigl(\alpha t_u^{(1)} + \beta t_u^{(2)}\bigr)$.
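The transformation rule $t_v = \psi'(\bar u)\,t_u$ can be checked numerically in the simplest setting, two charts with values in $\mathbb{R}^2$. The sketch below is only an illustration: the curve `x`, the coordinate change `psi`, and its hand-computed Jacobian `dpsi` are hypothetical choices, and derivatives of curves are approximated by central differences.

```python
import math

def x(tau):
    # Curve in the first chart (coordinates in R^2); x(0) = (1, 0).
    return (math.cos(tau), math.sin(tau) + tau)

def psi(p):
    # Hypothetical smooth change of local coordinates R^2 -> R^2.
    a, b = p
    return (a + b * b, b * math.exp(a))

def dpsi(p):
    # Jacobian matrix of psi at p, computed by hand.
    a, b = p
    return ((1.0, 2.0 * b), (b * math.exp(a), math.exp(a)))

def derivative_at_zero(curve, h=1e-6):
    # Central difference approximation of curve'(0).
    plus, minus = curve(h), curve(-h)
    return tuple((p - m) / (2.0 * h) for p, m in zip(plus, minus))

t_u = derivative_at_zero(x)                                # representative in the first chart
t_v_direct = derivative_at_zero(lambda tau: psi(x(tau)))   # differentiate the transformed curve
J = dpsi(x(0.0))
t_v_rule = tuple(J[i][0] * t_u[0] + J[i][1] * t_u[1] for i in range(2))  # psi'(x(0)) t_u

print(t_u, t_v_direct, t_v_rule)
```

Both ways of computing the representative in the second chart agree up to discretization error. This is the chain rule statement that makes the tangent vector independent of the chart.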
43.4d. The tangential mapping $TF(u)$. Let $M$, $N$ be $C^1$-manifolds and let
    $F: M \to N$
be a $C^1$-mapping. The tangential mapping
    $TF(u): TM_u \to TN_{F(u)}$, $TF(u)t \overset{\text{def}}{=} s$ (51)
assigned to $F$ is a linear mapping between the corresponding tangent spaces
which results in a natural way when one refers $F$ to local coordinates and
linearizes (forming the F-derivative). This linearization acts on the
representatives $t_u$ of the tangent vectors $t$. Using the chain rule, one can easily
convince oneself that the assignment $t \mapsto s = TF(u)t$ is independent of
the choice of the representatives of $t$ and $s$ (also, cf. Section 72.6).
$u$ is called a critical point of $F$ when $TF(u)$ is not surjective. For
$C^1$-functionals $F: M \to \mathbb{R}$, this coincides with a definition parallel to
Definition 43.20. However, Definition 43.20 can also be applied to functionals
which are not of the type $C^1$.
43.4e. The tangent bundle $TM$. Let $M$ be a $C^r$-manifold modelled on the B-space $X$
with $r \ge 1$. Show that the collection $TM$ of all pairs $(u,t)$, where $u \in M$,
$t \in TM_u$, forms a $C^{r-1}$-manifold which is modelled on $X \times X$ (tangent
bundle). $TM$ plays a central role in modern differential geometry.
Intuitively, $TM$ results when one attaches the tangent plane $TM_u$ at each point
$u \in M$ (see Fig. 43.6).
Solution: One assigns the set $V_u \times X$ to $(u, t)$. The local coordinates of
$(u, t)$ are $(\varphi_u^{-1}(u), t_u)$. Here, $\varphi_u^{-1}(u)$ is the local coordinate of $u$ and $t_u$ is
the representative of $t$ with respect to $V_u$. In the transition to $V_v \times X$, one
chooses the corresponding coordinates with respect to $V_v$, i.e., $(\varphi_v^{-1}(u), t_v)$
(see Fig. 43.5).
In a natural way, the tangential mappings TF(u) for u e M in Problem
43.4d yield a mapping TF: TM -»TN of the tangent bundles.
As an introduction to the theory of Banach manifolds and their
applications, we recommend Marsden (1974, L). An excellent introduction to the
theory of manifolds in $\mathbb{R}^n$ that is conceptually very intuitive is Guillemin
and Pollack (1974, M).
43.5. Generalizations to Sobolev spaces. Use the hints given in Sections 43.12 and
43.13 to generalize the results of Propositions 43.29 and 43.30 to Sobolev
spaces. Furthermore, study the applications to the Zermelo navigation
problem and the form of the surface of a fluid under the influence of surface
tension considered in Klötzler (1971, M), page 102.
Figure 43.6
43.6. Lagrange multipliers for one-dimensional variational problems. Explicitly carry
out the proofs sketched in Section 37.4f. Numerous physically interesting
applications can be found in Bolza (1949, M) and Krasnov (1975, M)
(collection of exercises).
43.7. Information theory. In conjunction with Section 43.10 the following
exercises should enable one to obtain a deeper understanding of information
theory.
43.7a. A functional equation. Prove: If $f: \,]0,\infty[\, \to \mathbb{R}$ is a continuous function such
that $f(xy) = f(x) + f(y)$ for all $x, y \in \,]0,\infty[$, then $f(x) = -k \ln x$ for fixed
$k \in \mathbb{R}$.
Hint: Compare Fichtenholz (1972, M), Vol. I, Section 75.
43.7b. Number of trial results. Suppose that an experiment $e$ yields the outcomes
$e_1,\dots,e_n$ with the corresponding probabilities $p_1,\dots,p_n$. We perform the
experiment $N$ times. By a possible outcome of the experiment we mean
    $e_{i_1} e_{i_2} \cdots e_{i_N}$, (52)
where each $e_j$ in (52) appears according to its probability, i.e., $p_j N$ times.
The number $A(N)$ of all possible outcomes of the experiment is
    $A(N) = \dfrac{N!}{(p_1 N)! \cdots (p_n N)!}$.
Show that $N^{-1} \ln A(N) \to S/k$ as $N \to \infty$. Thus there results a further
interpretation of the information $S$.
Hint: Make use of the Stirling formula. Compare Brillouin (1956, M),
page 7.
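The limit in Problem 43.7b can be observed numerically before it is proved. The sketch below assumes the normalization $S = -k \sum_j p_j \ln p_j$, so that $S/k$ is the entropy in natural units; that normalization, the probabilities, and the function names are assumptions made for illustration. Factorials are evaluated through the log-gamma function, $\ln(m!) = \ln\Gamma(m+1)$.

```python
import math

def log_count_per_trial(probs, n_trials):
    """Return ln(A(N)) / N for A(N) = N! / ((p_1 N)! ... (p_n N)!)."""
    log_a = math.lgamma(n_trials + 1)
    for p in probs:
        log_a -= math.lgamma(p * n_trials + 1)
    return log_a / n_trials

def entropy_nats(probs):
    """-sum p_j ln p_j, i.e. S/k under the assumed normalization of S."""
    return -sum(p * math.log(p) for p in probs)

probs = [0.75, 0.25]  # illustrative choice; p_j * N is an integer for the N below
for n_trials in (40, 4000, 400000):
    print(n_trials, log_count_per_trial(probs, n_trials), entropy_nats(probs))
```

The gap between the two printed columns shrinks as $N$ grows, which is consistent with the correction term of order $(\ln N)/N$ coming from the Stirling formula.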
43.7c.* Average number of optimal questions and the first fundamental theorem of
information theory. Suppose an experimentalist $E$ has obtained the outcome
of the experiment (52) for fixed $N$. We wish to establish this result using the
smallest number of questions, where $E$ can answer each question by "yes"
or "no." Let $H(N)$ be the average number of questions that are needed in
order to establish experimental outcomes of the form (52). Show:
    $\lim_{N \to \infty} \dfrac{H(N)}{N} = -\sum_{i=1}^{n} p_i \log_2 p_i$. (53)
This assertion is called the first fundamental theorem of information theory.
(53) shows that to establish the outcome of a sequence of experiments
consisting of $N$ mutually independent individual experiments for large $N$,
we need approximately $N \cdot S$ questions, where $S$ is the information
appearing on the right-hand side of (53).
Hint: Compare Topsøe (1974, M), Chapter 1. We will explain and make
this problem precise by means of an example with $n = 2$. Suppose $E$ draws a
card. $e_1$ and $e_2$ mean "no trump" and "trump," respectively. The
probabilities are $p_1 = 3/4$ and $p_2 = 1/4$, respectively. Now $E$ draws $N$ cards, where after
each drawing the card is replaced and the deck shuffled, so that the
drawings are mutually independent. Let $N = 2$. Table 43.2 shows the
probabilities for the outcomes of the experiment.
Table 43.2
Outcome      Probability $p_i p_j$    Questions            Number of questions
$e_1 e_1$    $9 \cdot 4^{-2}$         (i)                  1
$e_1 e_2$    $3 \cdot 4^{-2}$         (i), (ii)            2
$e_2 e_1$    $3 \cdot 4^{-2}$         (i), (ii), (iii)     3
$e_2 e_2$    $1 \cdot 4^{-2}$         (i), (ii), (iii)     3
From this we deduce the following strategy of questioning:
(i) Is it $e_1 e_1$?
(ii) Is it $e_1 e_2$?
(iii) Is it $e_2 e_1$?
That is, we put the questions in this sequence and stop when the answer is
"yes" for the first time. Obviously, it is not optimal to ask for $e_2 e_2$ in the
first step because, compared to (i), the probability is greater that the answer
is "no." In the last column Table 43.2 shows the number of questions that
one needs in order to establish the corresponding outcome $e_i e_j$. If $E$ repeats
the experiment in Table 43.2 very often and if $m$ is the number of
experiments, then the outcome $e_i e_j$ occurs $m p_i p_j$ times on the average and
the number of questions that we must ask in order to establish the correct
outcome in all cases is equal to
    $1 \cdot m p_1 p_1 + 2 \cdot m p_1 p_2 + 3 \cdot m p_2 p_1 + 3 \cdot m p_2 p_2 = 1.688\,m$.
By definition, $H(N)$ with $N = 2$ is the average number of questions, thus
equal to 1.688. Consequently, $H(2)/2 = 0.844$. By (53), $H(N)/N \to S$ as
$N \to \infty$ and $S = 0.811$.
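The arithmetic of this example is easy to reproduce. The following sketch recomputes the average number of questions for $N = 2$ from the last column of Table 43.2 and evaluates the right-hand side of (53) for $p_1 = 3/4$, $p_2 = 1/4$. The variable names are illustrative; exact fractions are used for the average so that no rounding enters.

```python
from fractions import Fraction
import math

# Probabilities of "no trump" (index 1) and "trump" (index 2).
p = {1: Fraction(3, 4), 2: Fraction(1, 4)}

# Number of questions needed for each outcome e_i e_j (last column of Table 43.2).
questions = {(1, 1): 1, (1, 2): 2, (2, 1): 3, (2, 2): 3}

# Average number of questions H(2) under the questioning strategy (i), (ii), (iii).
h2 = sum(p[i] * p[j] * q for (i, j), q in questions.items())

# Right-hand side of (53): -sum p_i log2 p_i.
entropy_bits = -sum(float(pi) * math.log2(float(pi)) for pi in p.values())

print(h2, float(h2) / 2, entropy_bits)
```

The value $H(2)/2$ lies above the entropy, as it must: (53) describes the limit $N \to \infty$, and for $N = 2$ the questioning scheme is still noticeably less efficient.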
Application of information theory to physics, biology, medicine, linguistics, and
communications technology. For this purpose, study Brillouin (1956, M) and
Topsøe (1974, M) as well as the classical work of Shannon (1948).
Axiomatics for S. One can show that $S$ is uniquely determined by a few
axioms. Such an axiom system can be found in Shannon (1948). The
significance of the axioms is discussed in detail in Jaglom (1960, M), page
80. The proof of uniqueness under very weak assumptions can be found in
Rényi (1977, M). It is recommended that the reader study this literature.
Existence of free critical points. Study the Problems in Chapter 49. There we
explain a number of different methods.
Lagrange multipliers, bound arcs, and the main theorem on underdetermined
systems of differential equations. We have already used the result below in an
essential way in Section 37.4f to obtain a simple derivation of the Lagrange
multiplier rule for one-dimensional variational problems with differential
equations as side conditions.
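For $y = (y_1,\dots,y_n)$ we consider the boundary value problem
    $F_\alpha(x, y, y') = 0$, $\alpha = 1,\dots,m$, (54a)
    $y(x_0) = y_0$, $y(x_1) = y_1$, (54b)
where $0 < m < n$ and $-\infty < x_0 < x_1 < \infty$. Parallel to this, we are interested
in the perturbed boundary condition
    $y(x_0) = y_0$, $y(x_1) = y_1 + b$. (54c)
Furthermore, in preparation we write the system
    $\sum_{\alpha=1}^{m} \bigl[(a_{\alpha i}\lambda_\alpha)' - b_{\alpha i}\lambda_\alpha\bigr] = 0$, $i = 1,\dots,n$, (55)
for $\lambda_\alpha$, where
    $a_{\alpha i}(x) = \dfrac{\partial F_\alpha}{\partial y_i'}(x, y(x), y'(x))$,
    $b_{\alpha i}(x) = \dfrac{\partial F_\alpha}{\partial y_i}(x, y(x), y'(x))$.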
Let the function $y$ in $C^1[x_0, x_1]$ be a solution of (54a) and (54b). We denote
the corresponding arc of the curve by $C$. By definition, $C$ is said to be free if
and only if the perturbed boundary value problem (54a) and (54c) has a
solution for each $b$ in a fixed neighborhood of the origin in $\mathbb{R}^n$. Otherwise,
we say that $C$ is bound.
Show: If $C$ is bound, then there exist $C^1$-functions $\lambda_1,\dots,\lambda_m$ on $[x_0, x_1]$
which are not all identically zero and satisfy (55) when the following
regularity assumptions are fulfilled:
(i) All $F_\alpha$ are $C^1$-functions on a neighborhood $V$ of $C$. Here,
    $V = \{(x, y, y') \in \mathbb{R}^{2n+1}: x \in [x_0, x_1],\ |y - y(x)| < \delta,\ |y' - y'(x)| < \delta\}$
for a fixed $\delta > 0$.
(ii) Along $C$ the rank of the functional determinant
    $\dfrac{\partial(F_1,\dots,F_m)}{\partial(y_1',\dots,y_n')}$
is maximal, i.e., equal to $m$ for all points $(x, y(x), y'(x))$, where
$x \in [x_0, x_1]$.
Solution: In the following, let $\alpha = 1,\dots,m$, $\beta = m+1,\dots,n$; $i, j, k, r =
1,\dots,n$. One always sums over two equal indices. Our point of departure is
the system of differential equations
    $F_\alpha(x, y, y') = 0$, (56)
    $F_\beta(x, y, y') = c_\beta(x) + \varepsilon_j c_{\beta j}(x)$
for suitable $F_\beta$, $c_\beta$, and $c_{\beta j}$. In this connection, we choose all $F_\beta$ as
$C^1$-functions on $V$ so that
    $\dfrac{\partial(F_1,\dots,F_n)}{\partial(y_1',\dots,y_n')} \ne 0$ along $C$. (57)
Due to (ii), this can be achieved by choosing $F_\beta$ to be linear in all the
variables $y_j'$ and with coefficients depending on $x$. Then (57) can be realized
first locally and then globally by means of a partition of unity. Furthermore,
let $c_{\beta j} \in C^1[x_0, x_1]$ and let $c_\beta$ be so determined that $y$ is a solution of (56).
We denote by $y = y(x, \varepsilon)$ the solution of (56) that satisfies the initial
condition $y(x_0, \varepsilon) = y_0$. Suppose the parameter $\varepsilon$ lies in a neighborhood of
the origin in $\mathbb{R}^n$. Then solutions of the perturbed boundary value problem
(54a) and (54c) are obtained from
    $y(x_1, \varepsilon) = y(x_1) + b$. (58)
To begin with, this equation has the trivial solution $\varepsilon = 0$, $b = 0$. However,
since $C$ is bound, (58) cannot be solved in the form $\varepsilon = \varepsilon(b)$ in a
neighborhood of $b = 0$. Therefore, by the implicit function theorem from Section 4.8,
we have
    $\dfrac{\partial(y_1,\dots,y_n)}{\partial(\varepsilon_1,\dots,\varepsilon_n)}(x_1, 0) = 0$.
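If we define $d_{ij}(x) \overset{\text{def}}{=} \partial y_i(x, 0)/\partial \varepsilon_j$, then $\det(d_{ij}(x_1)) = 0$. If we set $y =
y(x, \varepsilon)$ in (56), then by partial differentiation with respect to $\varepsilon_j$ we
immediately obtain the system
    $a_{\alpha i} d_{ij}' + b_{\alpha i} d_{ij} = 0$, (59)
    $a_{\beta i} d_{ij}' + b_{\beta i} d_{ij} = c_{\beta j}$.
Since $y(x_0, \varepsilon) \equiv y_0$, we have $d_{ij}(x_0) = 0$. Parallel to (59), on $[x_0, x_1]$ we
consider the so-called adjoint system
    $(\lambda_k a_{ki})' - b_{ki}\lambda_k = 0$. (59*)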
(57) is equivalent to $\det(a_{kj}(x)) \ne 0$ on $[x_0, x_1]$. Therefore, (59*) has $n$
linearly independent solutions $\lambda_{k,r}$. We are finished if, say, $\lambda_{\beta,1} \equiv 0$ for all
$\beta = m+1,\dots,n$. Due to the linear independence of the $\lambda_{k,r}$, not all $\lambda_{\alpha,1}$,
$\alpha = 1,\dots,m$, are identically equal to zero on $[x_0, x_1]$, and we obtain our
assertion from (59*) with $\lambda_\alpha = \lambda_{\alpha,1}$.
In order to prove that $\lambda_{\beta,1} \equiv 0$, we multiply (59) by $\lambda_{k,r}$ and sum over $k$.
Then from (59*) it follows that
    $(a_{ki} d_{ij} \lambda_{k,r})' = c_{\beta j}\lambda_{\beta,r}$;
thus
    $(a_{ki} d_{ij} \lambda_{k,r})(x_1) = \int_{x_0}^{x_1} c_{\beta j}\lambda_{\beta,r}\,dx$.
Since $\det(d_{ij}(x_1)) = 0$, the multiplication theorem for determinants
immediately yields
    $\det\left(\int_{x_0}^{x_1} c_{\beta j}\lambda_{\beta,r}\,dx\right) = 0$.
By (56) and our derivation, this relation holds for all $c_{\beta j} \in C^1[x_0, x_1]$. Thus,
the corresponding $n \times n$ matrix has rank less than $n$. Consequently, the first
row, say, is linearly dependent on the remaining rows, i.e.,
    $\int_{x_0}^{x_1} \lambda_{\beta,1} h_\beta\,dx = 0$.
Since $c_{\beta j}$ is arbitrary, this holds for all $h_\beta \in C^1[x_0, x_1]$; therefore, $\lambda_{\beta,1} \equiv 0$
according to the variational lemma (Proposition 18.2).
In this proof the reader should pay very careful attention to the fact that,
for all $\varepsilon$ in a neighborhood of the origin in $\mathbb{R}^n$, we obtain a global solution
$y = y(x, \varepsilon)$ of (56) on $[x_0, x_1]$ having continuous partial derivatives with
respect to $\varepsilon_j$. To this end, use (57) and the implicit function theorem as well
as the theorem on the dependence of solutions of ordinary differential
equations on the parameters (cf. Section 4.11). These theorems first give
local solutions. Due to the compactness of $V$, however, there exist global
extensions. In this connection, one uses the Picard-Lindelöf theorem with
the corresponding assertions concerning the magnitude of the solution
region in Section 3.1.
References to the Literature
Classical work: Ljusternik (1934).
Lagrange multipliers: Vainberg (1956, M); Krasnoselskii (1956, M)
(bifurcation); Browder (1965); Ioffe and Tihomirov (1974, M) (general results);
Maurin (1976, M), Vol. I.
Galerkin method: Browder (1968), (1970a); Vainikko (1979, S,B) (linear
problems); Zeidler (1980) (also, cf. the references to the literature in
Chapter 22).
Variational problems with side conditions: Bolza (1949, M,H); Courant
and Hilbert (1953, M), Vol. I; Gelfand and Fomin (1961, M); Funk (1962,
M,H); Hestenes (1966, M); Klötzler (1971, M) (multiple integrals); Ioffe
and Tihomirov (1974, M).
Banach manifolds: Marsden (1974, L) (introduction); Lang (1972, M)
(standard work); Schwartz (1969, M); Klingenberg (1978, M) (Hilbert
manifolds); Abraham and Robbin (1967, M) (manifolds and dynamical
systems).
Finite-dimensional manifolds: Guillemin and Pollack (1974)
(introduction); Schwartz (1964, L); Warner (1971, M); Dieudonné (1975, M), Volumes
III-VI.
Manifolds in mathematical physics: Marsden (1974, L); Guillemin and
Sternberg (1977, M); Abraham and Marsden (1978, M); Choquet-Bruhat
et al. (1982, M).
Information theory: Shannon (1948) (classical work); Jaglom (1960, M)
(introduction); Feinstein (1958, M).
Emphasis on the applications: Brillouin (1956, M); Topsøe (1974, M).
Information theory and ergodic theory: Billingsley (1965, M).
Statistical physics: Sommerfeld (1962, M), Vol. V; Landau and Lifsic
(1962, M), Vol. V; Kittel (1973, M).
Modern algebraic approach to quantum statistics: Ruelle (1969, M);
Bratteli and Robinson (1979, M) (also, cf. the references to the literature in
Chapter 7 concerning quantum field theory).
Statistical physics and ergodic theory: Reed and Simon (1971, M,B), Vol.
I (introduction).
(Also, cf. the references to the literature in Chapters 44, 45, and 49
concerning the existence of critical points and bifurcation points.)
CHAPTER 44
Ljusternik-Schnirelman Theory and
the Existence of Several Eigenvectors
The theory of eigenvalues of quadratic forms developed by R. Courant
enables one to discern their existence and reality without calculations. We
shall generalize this theory to arbitrary functions having continuous second
partial derivatives.
Lazar Aronovic Ljusternik (1930)
In Chapter 43 we proved the existence of an eigenvector. Now we concern
ourselves with the eigenvalue problem
    $Au = \lambda Bu$, $u \in X$, $\lambda \in \mathbb{R}$, (1)
and we will prove the existence of several eigenvectors for (1) within the
generalized context of the Courant maximum-minimum principle. In this
connection, in an essential way, we use the fact that $A$ and $B$ are odd
potential operators, i.e., $A = F'$, $B = G'$, and $A(-u) = -A(u)$, $B(-u) =
-B(u)$ for all $u \in X$. We have already explained the basic idea of the
Ljusternik-Schnirelman theory in Section 37.26, and we recommend that
the reader first study Section 37.26 again.
In Part I we became acquainted with the fixed point index and the
mapping degree, important topological tools for obtaining fixed point
theorems; we now make use of the concept of the genus of a set in order to
obtain propositions concerning eigenvalues. It is easier to work with this
conceptualization than with the category concept applied originally, which
we shall study in Problem 44.13. In order to prove the existence of
sufficiently many eigenvectors for (1), it is crucial that the genus of spheres
be equal to the dimension of the space. However, this assertion is obtained
with the aid of the mapping degree—to be precise, it follows from the
Borsuk-Ulam theorem and thus, in the final analysis, from the Borsuk
antipodal theorem in Part I. Whereas for the fixed point index in Chapter 12
the compactness of the operators played a central role, in this chapter we
use essentially the local Palais-Smale condition $(PS)_c$ for functionals. This is
also a compactness condition.
In Theorem 44.A in Section 44.5 our goal is to formulate such
propositions for the nonlinear problem (1) which are optimal in comparison with
the corresponding linear problem (1).
Not only can the Ljusternik-Schnirelman theory be applied to eigenvalue
problems but, also, it yields, in principle, a method for proving the existence
of one or several critical points of a functional F. If it is a matter of a free
critical point, then by Section 43.9, solutions of the equation $F'(u) = 0$
result. If they are critical points with respect to a surface which is given by
an equation $G(u) = a$, then one obtains solutions of $F'(u) - \lambda G'(u) = 0$, i.e.,
of (1), when $G$ is a functional. We work out the simple fundamental
principles of the Ljusternik-Schnirelman theory axiomatically in Section
44.2. In order to be able to apply these basic ideas to concrete problems, the
construction of so-called Ljusternik-Schnirelman deformations (in short:
L-S deformations) in each individual case plays a central role. We treat
methods for this in the proof of the main theorem in Section 44.7 as well as
in Problems 44.13g and 49.7. In the general case, for this purpose
pseudogradient vector fields and the Palais-Smale condition are crucial.
The applications concern systems of nonlinear equations, quasilinear
elliptic differential equations, and Hammerstein integral equations. In the
Problems in Chapter 49 we consider periodic solutions of nonlinear
hyperbolic equations and Hamiltonian systems. There, we present a number of
general methods for constructing free critical points in connection with their
applications. In Section 37.27f we pointed out the significance of the Morse
theory and the Ljusternik-Schnirelman theory for the existence of geodesics
on surfaces and manifolds. In the Problems in this chapter, we will discuss
some important generalizations of the Ljusternik-Schnirelman theory and
the Morse theory, in addition to the material of this chapter.
44.1. The Courant Maximum-Minimum Principle
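We study the linear eigenvalue problem
    $Au = \lambda u$, $u \in X$, $\lambda \in \mathbb{R}$ (2)
with the aid of
    $\pm\dfrac{\lambda_m^{\pm}}{2} = \sup_{S_k \in \mathcal{L}_m^{\pm}} \inf_{u \in S_k} \pm F(u)$ if $\mathcal{L}_m^{\pm} \ne \emptyset$, and $\lambda_m^{\pm} = 0$ for $\mathcal{L}_m^{\pm} = \emptyset$, (3)
for $m = 1, 2, \dots$. In this connection, we assume: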
(H1) $X$ is a real separable H-space with the inner product $(\cdot|\cdot)$ and
$\dim X = \infty$. The operator $A: X \to X$ is linear, symmetric, and compact,
$A \ne 0$. We define
    $F(u) \overset{\text{def}}{=} 2^{-1}(Au|u)$, $G(u) \overset{\text{def}}{=} 2^{-1}(u|u)$.
(H2) $S \overset{\text{def}}{=} \{u \in X: \|u\| = 1\}$ is the boundary of the unit ball. $S_k$ denotes the
boundary of an arbitrary $k$-dimensional unit ball in $X$, i.e., $S_k = S \cap X_k$,
where $X_k$ is an arbitrary $k$-dimensional linear subspace of $X$.
(H3) Let $\mathcal{L}_m$ denote the set of all $S_k$ with $k \ge m$. Furthermore, let
    $\mathcal{L}_m^{\pm} \overset{\text{def}}{=} \{S_k \in \mathcal{L}_m: \pm F(u) > 0 \text{ on } S_k\}$.
If we identify $X$ with $X^*$ according to Section 21.4, then $F(u) =
2^{-1}\langle Au, u\rangle$, $G(u) = 2^{-1}\langle u, u\rangle$. Thus, $F' = A$, $G' = I$. Obviously,
    $\pm\lambda_1^{\pm} \ge \pm\lambda_2^{\pm} \ge \cdots \ge 0$.
Proposition 44.1. With the assumptions (H1)-(H3), let $\pm\lambda_m^{\pm} > 0$ for $+$ or
$-$. Then the following four assertions hold:
(a) $\lambda = \lambda_m^{\pm}$ is an eigenvalue of $A$. All eigenvalues $\lambda \ne 0$ of $A$ can be obtained
in this way with the aid of (3).
(b) The multiplicity of $\lambda$ is equal to the number of indices $k$ for which $\lambda_k^{\pm} = \lambda$.
(c) There exist eigenvectors $u_1,\dots,u_m$ of $A$ such that $(u_i|u_j) = \delta_{ij}$ for $i, j =
1,\dots,m$ and such that
    $\pm\dfrac{\lambda_m^{\pm}}{2} = \min_{u \in S_m} \pm F(u)$,
where $S_m = S \cap \operatorname{span}\{u_1,\dots,u_m\} \in \mathcal{L}_m^{\pm}$.
(d) $\lambda_m^{\pm} \to 0$ as $m \to \infty$.
The proof runs parallel to the proof of Theorem 22.E in Section 22.11.
There we assumed only that A is positive just to be able to formulate the
results more simply (cf. Problem 44.1). Our goal is to give a far-reaching
generalization of Proposition 44.1 to nonlinear problems of the type $Au =
\lambda Bu$. This is accomplished in Section 44.5. There, the basic idea is to replace
the class $\mathcal{L}_m^{\pm}$ in (3) by a more comprehensive class $\mathcal{K}_m$. In order to develop
this basic idea clearly, we give an abstract axiomatic approach in the next
section.
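A finite-dimensional toy case may help to see how the maximum-minimum characterization selects eigenvalues. The sketch below is not the setting of Proposition 44.1, which assumes $\dim X = \infty$ and $A$ compact; it takes $X = \mathbb{R}^2$ with a hypothetical symmetric positive definite matrix and reads (3) as $\lambda_m/2 = \sup_{S_k} \min_{u \in S_k} F(u)$ over spheres $S_k$ of dimension $k \ge m$ (an assumption about the formula). For $m = 1$ the supremum is attained on a sphere $S_1 = \{u, -u\}$, and for $m = 2$ only the full unit circle is admissible.

```python
import math

A = ((2.0, 1.0), (1.0, 2.0))  # hypothetical symmetric matrix with eigenvalues 3 and 1

def two_F(u):
    # 2 F(u) = (Au | u)
    return sum(A[i][j] * u[i] * u[j] for i in range(2) for j in range(2))

# Sample points of the unit circle S in R^2.
n = 3600
circle = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n)) for k in range(n)]

# m = 1: every S_1 = {u, -u} is admissible and F is even, so the min over S_1 is F(u);
# the sup over all admissible spheres is then the maximum of 2F on the circle.
lam1 = max(two_F(u) for u in circle)

# m = 2: the only admissible sphere is the whole circle, so one takes the minimum of 2F.
lam2 = min(two_F(u) for u in circle)

print(lam1, lam2)
```

The two numbers reproduce the eigenvalues 3 and 1 of the matrix, in decreasing order, which matches the ordering $\lambda_1^+ \ge \lambda_2^+$ stated above.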
44.2. The Weak and the Strong Ljusternik
Maximum-Minimum Principle for the
Construction of Critical Points
44.2a. The Weak Principle for the Existence of a Critical Point
As a point of departure for the construction of critical points of the
functional $F$, we choose
    $c = \sup_{K \in \mathcal{K}} \inf_{u \in K} F(u)$ (4)
and assume:
(H1) $F: M \subseteq X \to \mathbb{R}$ is a functional on the real B-space $X$ and $M \ne \emptyset$.
Definition 44.2. We denote the set of critical points of $F$ with respect to $M$
such that $F(u) = c$ by $\operatorname{crit}_{M,c} F$. If this set is nonempty, then $c$ is called a
critical value or a critical level of $F$ with respect to $M$.
(H2) $\mathcal{K}$ is a nonempty class of nonempty subsets of $M$. The number $c$
constructed in (4) is finite.
(H3) $M$ allows L-S deformations with respect to $F$ and $c$. By definition,
this means: For each open set $U$ in $X$ such that $U \supseteq \operatorname{crit}_{M,c} F$ there exists a
number $\varepsilon(U) > 0$ and a continuous mapping $d: M \times [0,1] \to M$ such that
$d(u,0) = u$ on $M$ and
    $F(u) \ge c - \varepsilon$, $u \in M - U$ implies $F(d(u,1)) \ge c + \varepsilon$. (5)
(H4) If $\operatorname{crit}_{M,c} F = \emptyset$, then $\mathcal{K}$ is invariant with respect to $d$ in (H3) with
$U = \emptyset$, i.e.,
    $K \in \mathcal{K}$ implies $d(K,1) \in \mathcal{K}$. (6)
Proposition 44.3 (Ljusternik (1930)). With the assumptions (H1)-(H4),
$\operatorname{crit}_{M,c} F \ne \emptyset$.
For this theorem, we need (H3) only in the special case $\operatorname{crit}_{M,c} F = \emptyset$,
$U = \emptyset$. We need the full concept of an L-S deformation in the next section.
Corollary 44.4. An analogous proposition holds for
    $c = \inf_{K \in \mathcal{K}} \sup_{u \in K} F(u)$ (4a)
provided (5) is replaced by
    $F(u) \le c + \varepsilon$, $u \in M - U$ implies $F(d(u,1)) \le c - \varepsilon$. (5a)
Proof. Let us assume that $\operatorname{crit}_{M,c} F = \emptyset$. By (H3) there is a $d$ such that (5)
holds with $U = \emptyset$, i.e.,
    $F(u) \ge c - \varepsilon$, $u \in M$ implies $F(d(u,1)) \ge c + \varepsilon$.
We choose a $K \in \mathcal{K}$ such that
    $\inf_{u \in K} F(u) \ge c - \varepsilon$; thus, $\inf_{u \in d(K,1)} F(u) \ge c + \varepsilon$.
Due to (H4), $d(K,1) \in \mathcal{K}$, i.e.,
    $\inf_{u \in d(K,1)} F(u) \le c$,
by (4). But this is a contradiction. Corollary 44.4 is proved analogously. □
We now briefly discuss (H2). Let $\inf_{u \in K} F(u) > -\infty$ for some fixed
$K \in \mathcal{K}$ (e.g., $F$ is continuous on the compact set $K$). Then (H2) holds, i.e.,
the number $c$ is finite, when one of the following two conditions is fulfilled:
(i) $F$ is bounded above on $M$.
(ii) Every $K \in \mathcal{K}$ intersects a fixed set $M_0$ on which $F$ is bounded above.
Condition (ii) is the point of departure of the important linking principle
for the construction of critical points which we shall consider in Problem
49.7.
44.2b. The Strong Principle for the Construction of Several
Critical Points
In order to obtain the existence of several critical points for $F$, we consider
    $c_m = \sup_{K \in \mathcal{K}_m} \inf_{u \in K} F(u)$, $m = 1, 2, \dots$, (7)
where
    $\mathcal{K}_m \overset{\text{def}}{=}$ class of all compact subsets $K$ of $M$ such that $\operatorname{ind} K \ge m$.
In this connection, $\operatorname{ind} K$ is a topological index whose properties we shall
describe axiomatically in conjunction with (A) below.
Definition 44.5. Let $X$ be a real B-space. A subset $K$ of $X$ is said to be
symmetric if and only if $u \in K$ always implies $-u \in K$. We denote by $\operatorname{sym} X$
the class of all closed symmetric subsets $K$ of $X$ such that $0 \notin K$ (see Fig.
44.1).
Figure 44.1
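Henceforth we assume the following:
(A0) Each $K$ in $\operatorname{sym} X$ is assigned an integer $m$, $0 \le m < \infty$, or $\infty$, which
we designate by $\operatorname{ind} K$, such that for all $K, K_1,\dots,K_k \in \operatorname{sym} X$ the following
hold:
(A1) $\operatorname{ind} K = 0$ if and only if $K = \emptyset$.
(A2) If $K$ is a finite nonempty set, then $\operatorname{ind} K = 1$.
(A3) $\operatorname{ind}(K_1 \cup \cdots \cup K_k) \le \operatorname{ind} K_1 + \cdots + \operatorname{ind} K_k$.
(A4) $\operatorname{ind} K_1 \le \operatorname{ind} K_2$ provided $K_1 \subseteq K_2$ or, more generally, there exists an
odd continuous mapping $\varphi: K_1 \to K_2$.
(A5) If $K$ is compact, then $\operatorname{ind} K < \infty$, and there exists an open set $U$ such
that $K \subseteq U$, $\overline{U} \in \operatorname{sym} X$ and $\operatorname{ind} K = \operatorname{ind} \overline{U}$.
(B0) $F: M \subseteq X \to \mathbb{R}$ is an even functional on the nonempty symmetric set
$M$ of the real B-space $X$, i.e., $F(-u) = F(u)$ on $M$.
(B1) For a fixed $m \in \mathbb{N}$, $\mathcal{K}_m$ is nonempty and $-\infty < c_m < \infty$.
(B2) $\operatorname{crit}_{M,c_m} F$ is compact and does not contain the zero point.
(B3) On $M$ there exist L-S deformations with respect to $F$ and $c_m$, where
all $d$ in (5) with $c = c_m$ are odd as functions of $u$.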
Proposition 44.6 (Ljusternik (1930)). With the assumptions (A) and (B)
above, the following hold:
(a) $\operatorname{crit}_{M,c_m} F \ne \emptyset$.
(b) For $c_m = c_{m+1} = \cdots = c_{m+p}$ with $p \ge 1$, $\operatorname{ind} \operatorname{crit}_{M,c_m} F \ge p + 1$. In
particular, $\operatorname{crit}_{M,c_m} F$ contains infinitely many pairs $(u, -u)$.
In this connection, in (b) it is tacitly assumed that $\mathcal{K}_m,\dots,\mathcal{K}_{m+p} \ne \emptyset$.
This proposition allows an important consequence: If the assumptions (A)
and (B) are fulfilled for $c_1,\dots,c_k$, then there exist at least $k$ pairs $(u, -u)$ to
which critical points of $F$ with respect to $M$ correspond, for, if all $c_1,\dots,c_k$
are distinct, then (a) can be applied. Otherwise, one uses (b).
As the proof of Proposition 44.6 will show, its assertions hold if $\mathcal{K}_m$ is
replaced by $\mathcal{K}_m^+$, where $\mathcal{K}_m^+$ contains precisely all $K \in \mathcal{K}_m$ such that $F(u) > 0$
on $K$. Then, however, from the L-S deformations $d$ in (5), one has to require
$K \in \mathcal{K}_m^+ \Rightarrow d(K,1) \in \mathcal{K}_m^+$ in order to guarantee (H4) in the proof.
Corollary 44.7. A proposition analogous to Proposition 44.6 holds for
    $c_m = \inf_{K \in \mathcal{K}_m} \sup_{u \in K} F(u)$ (7a)
when the relation (5) in (B3) is replaced by (5a) with $c = c_m$.
Proof. (a) This follows from Proposition 44.3. The condition (H4) (in
Section 44.2a) results from (B3), (A4), and the definition of $\mathcal{K}_m$ above.
(b) Let $K \overset{\text{def}}{=} \operatorname{crit}_{M,c_m} F$. The set $K$ is symmetric because $M$ is symmetric
and $F$ is even. Let it be assumed that $\operatorname{ind} K \le p$. If we choose $U$ as in (A5),
thus, in particular, $\operatorname{ind} \overline{U} = \operatorname{ind} K \le p$, then by (B3) we have:
    $F(u) \ge c_m - \varepsilon$, $u \in M - U$ implies $F(d(u,1)) \ge c_m + \varepsilon$. (8)
Since $c_m = c_{m+p}$, there exists an $L \in \mathcal{K}_{m+p}$ such that
    $\inf_{u \in L} F(u) \ge c_m - \varepsilon$; thus, $\inf_{u \in d(L-U,1)} F(u) \ge c_m + \varepsilon$ (9)
when $L - U \ne \emptyset$. By (A3), since $L \subseteq (L - U) \cup \overline{U}$, we have
    $\operatorname{ind}(L - U) \ge \operatorname{ind} L - \operatorname{ind} \overline{U} \ge m + p - p = m$;
therefore, $L - U \in \mathcal{K}_m$ and $L - U \ne \emptyset$ by (A1). The conditions (B3) and
(A4) yield $d(L - U, 1) \in \mathcal{K}_m$, i.e.,
    $\inf_{u \in d(L-U,1)} F(u) \le c_m$
because of (7). But this is in contradiction to (9). One proves Corollary 44.7
analogously. □
In order to be able to apply the preceding two general existence principles
for critical points to concrete problems, we need L-S deformations and a
topological index:
($\alpha$) In Section 44.3 we construct $\operatorname{ind} K$.
($\beta$) In Section 44.4 we explain the Palais-Smale condition. In particular, it
yields the compactness of $\operatorname{crit}_{M,c} F$ and plays a central role in the
construction of L-S deformations.
($\gamma$) We construct L-S deformations explicitly in the course of the proof of
Theorem 44.A in Section 44.7 as well as in Problems 44.13g and 49.7
with the aid of ordinary differential equations and pseudogradient
vector fields.
If one attentively studies the proof of Proposition 44.6, then one
recognizes that the same considerations can also be applied if one has at one's
disposal a suitable index that need not necessarily be defined for symmetric
sets. In particular, category is one such index. In this connection, compare
Problems 44.13f,g. The construction of topological indices that have more
propitious properties than does genus and correspond to more general
symmetries can be found in Fadell and Rabinowitz (1977), (1978), together
with applications to bifurcation theory and the existence of periodic
solutions of Hamiltonian systems.
44.3. The Genus of Symmetric Sets
Our point of departure is
f-.K-tM"-^}, / is odd and continuous. (10)
Definition 44.8. Let X be a real B-space. To each set K in sym^ we assign a
number gen K (that we call the genus of K) in the following way:
(i) gen 0 = 0.
(ii) If $K \ne \emptyset$, then let $\operatorname{gen} K$ be the smallest natural number $n \ge 1$ for which
a zero-free mapping $f$ of the form (10) exists.
(iii) If for $K \ne \emptyset$ there does not exist such an $n$, then let $\operatorname{gen} K = +\infty$.
Example 44.9. If $K$ is the boundary of the unit disk in $\mathbb{R}^2$, then $\operatorname{gen} K = 2$.
To begin with, the identity mapping satisfies (10) for $n = 2$. However, for
$n = 1$ there exists no $f$ such that (10) holds. If we assume that $f$ is such a
mapping, then for each $u \in K$ we have $f(u) \ne 0$ and thus also $f(-u) =
-f(u) \ne 0$, so that $f(u)$ and $f(-u)$ have opposite signs. Since $K$ is connected,
the classical intermediate value theorem yields a $v \in K$ such that
$f(v) = 0$. But this is a contradiction.
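The argument in Example 44.9 can be tried out numerically. The following is a minimal sketch, not part of the original text: the map `odd_map` is a hypothetical odd continuous function on the circle, and the bisection uses only the sign change $g(\theta + \pi) = -g(\theta)$ that oddness forces.

```python
import math

def odd_map(x, y):
    # Hypothetical odd, continuous, real-valued map (an assumption for illustration).
    return x ** 3 + 2.0 * y - 0.5 * x

def find_zero_on_circle(f, tol=1e-12):
    """Locate an angle t in [0, pi] with f(cos t, sin t) = 0 for an odd map f.

    Oddness gives g(pi) = -g(0) for g(t) = f(cos t, sin t), so g changes sign
    on [0, pi] and bisection applies (intermediate value theorem).
    """
    g = lambda t: f(math.cos(t), math.sin(t))
    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo, g_lo = mid, g_mid   # same sign as the left end: move left end
        else:
            hi = mid                # sign change lies in [lo, mid]
    return 0.5 * (lo + hi)
```

Any odd continuous real-valued map on the circle is caught this way, which is the reason no admissible $f$ exists for $n = 1$.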
The following theorem generalizes this example.
Proposition 44.10 (The Genus of Spheres). For the sphere $S = \{u \in X : \|u\| = 1\}$
in the real B-space X, we have $\operatorname{gen} S = \dim X$.
Proof. Let $0 < \dim X < \infty$. The identity mapping satisfies (10) with $n =
\dim X$. However, according to the Borsuk-Ulam theorem (Theorem 16.D in
Section 16.5), there exists no $f$ for which (10) holds with $n < \dim X$.
Let $\dim X = \infty$. If $X_n$ is an n-dimensional subspace of X, then, by the
definition of genus, $\operatorname{gen} S \ge \operatorname{gen}(S \cap X_n)$; thus, $\operatorname{gen} S \ge n$ for all $n \in \mathbb{N}$, i.e.,
$\operatorname{gen} S = \infty$.
For $X = \{0\}$, $S = \emptyset$ and $\operatorname{gen} S = 0$. □
Proposition 44.11. The genus genK has the properties (A0)-(A5) in Section
44.2b. Therefore, we can use genK as a topological index in Section 44.2b.
Corollary 44.12. For all $K, K_1, K_2 \in \operatorname{sym}_X$, the following four assertions hold:
(1) $\operatorname{gen} K_1 = \operatorname{gen} K_2$ provided $K_1$ and $K_2$ are homeomorphic with respect to an
odd homeomorphism.
(2) $\operatorname{gen}(K_2 - K_1) \ge \operatorname{gen} K_2 - \operatorname{gen} K_1$ provided $\operatorname{gen} K_1 < \infty$.
(3) $\operatorname{gen} K \le \dim X$.
(4) From $\operatorname{gen} K > m$, $1 \le m < \infty$, it follows that $K \cap (I - P)(X) \ne \emptyset$ when
$P : X \to X_1$ is a continuous linear projection operator onto the m-dimensional
subspace $X_1$ of X.
Proof. In an essential way we make use of the Tietze-Dugundji extension
theorem (Proposition 2.1). We do not discuss the trivial special cases
separately (empty set, etc.).
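(A1), (A4). Compare Definition 44.8 with $f \circ \varphi$ instead of $f$ in (10).
(A2). We choose $f(\pm u) \overset{\mathrm{def}}{=} \pm 1$ and $n = 1$ in (10).
(A3). Let $\operatorname{gen} K_i = n_i < \infty$, $i = 1, 2$, and let
$f_i : K_i \to \mathbb{R}^{n_i} - \{0\}$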
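be a continuous odd mapping. According to Proposition 2.1, $f_i$ can be
extended continuously to $f_i : X \to \mathbb{R}^{n_i}$. We set
$g(u) \overset{\mathrm{def}}{=} (f_1(u) - f_1(-u), f_2(u) - f_2(-u))$ for all $u \in K_1 \cup K_2$.
Then $g : K_1 \cup K_2 \to \mathbb{R}^{n_1 + n_2} - \{0\}$ is odd and continuous; therefore
$\operatorname{gen}(K_1 \cup K_2) \le n_1 + n_2$. Now, (A3) is obtained for arbitrary k by induction.
(A5). Let $\bar U(u; R) \overset{\mathrm{def}}{=} \{v \in X : \|v - u\| \le R\}$. For $u \ne 0$, $0 < R < \|u\|$, we
have $\bar U(u; R) \cap \bar U(-u; R) = \emptyset$. Let $L \overset{\mathrm{def}}{=} \bar U(u; R) \cup \bar U(-u; R)$ be a ball pair.
If we choose $f(v) \overset{\mathrm{def}}{=} \pm 1$ for $v \in \bar U(\pm u; R)$, then (10) holds with $n = 1$ and $L$
instead of $K$; thus, $\operatorname{gen} L = 1$. Since $0 \notin K$, the compact set $K$ can be
covered by a finite number of such ball pairs: $L_1, \ldots, L_k$. According to (A3),
$\operatorname{gen} K \le \operatorname{gen} L_1 + \cdots + \operatorname{gen} L_k = k < \infty$.
We now construct $U$. Let $\operatorname{gen} K = n$. As in the proof of (A3), there exists
an odd continuous mapping $h : X \to \mathbb{R}^n$ such that $0 \notin h(K)$, and $h(K)$ is
compact. Therefore, $h(K)$ is at a positive distance from 0, and we can
construct the ball pairs $L_1, \ldots, L_k$ by the choice of a suitable cover so that
$0 \notin h(L_j)$ holds for all $j$. Let $U \overset{\mathrm{def}}{=} \operatorname{int} L_1 \cup \cdots \cup \operatorname{int} L_k$. Since $K \subset U$, we
have $\operatorname{gen} K \le \operatorname{gen} \bar U$. On the other hand, from $0 \notin h(\bar U)$ it directly follows
that $\operatorname{gen} \bar U \le n$, i.e., $\operatorname{gen} K = \operatorname{gen} \bar U$. □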
We prove Corollary 44.12 in Problem 44.2.
44.4. The Palais-Smale Condition
The following condition is crucial:
Each sequence $(u_n)$ in $M$ such that
$\|TF(u_n)\| \to 0$, $F(u_n) \to c$ as $n \to \infty$ (11)
has a convergent subsequence.
Definition 44.13. Let $M$ be a set in the real B-space X and let $F : D(F) \subseteq X
\to \mathbb{R}$ be a functional that has a tangential mapping $TF(u)$ with respect to $M$
at each point $u$ in $M$.
$F$ satisfies the local Palais-Smale condition $(PS)_c$ with respect to $M$ if and
only if (11) holds for a fixed $c \in \mathbb{R}$.
$F$ satisfies the Palais-Smale condition (PS) (respectively, $(PS)^+$, $(PS)^-$) if
and only if $(PS)_c$ holds for all $c \in \mathbb{R}$ (respectively, $c > 0$, $c < 0$).
According to Definition 43.18, the existence of $TF(u)$ means that the
tangent space $TM_u$ and the F-derivative $F'(u)$ exist, where $u \in \operatorname{int} D(F)$.
Then $TF(u)h = F'(u)h$ for all $h \in TM_u$. We first treat two typical examples
in connection with the eigenvalue problem
$Au = \lambda u$, $u \in X$, $\lambda \in \mathbb{R}$ (12)
and with the operator equation
$Au = 0$, $u \in X$. (13)
Example 44.14 (Eigenvalue Problem). Let $M = \{u \in X : \|u\| = r\}$ be a
sphere in the real H-space X, $r > 0$. Let $A : X \to X$ be a linear, compact, and
symmetric operator. We set $F(u) = 2^{-1}(Au|u)$ for all $u \in X$; thus, $F' = A$.
Then the following assertions hold:
(1) The critical points of $F$ with respect to $M$ are precisely the eigenvectors
of $A$ on $M$.
(2) $F$ satisfies $(PS)_c$ for all $c \ne 0$ with respect to $M$.
(3) If $\dim X < \infty$, then $F$ also satisfies (PS) with respect to $M$.
(4) If $\dim X = \infty$, then $F$ does not satisfy (PS) with respect to $M$.
Proof. (1) We set
$Du \overset{\mathrm{def}}{=} Au - \lambda(u)u$, $\lambda(u) \overset{\mathrm{def}}{=} r^{-2}(Au|u)$.
Below we shall show that $Du$ is an extension of $TF(u)$ on X such that
$\|TF(u)\| = \|Du\|$. According to Section 43.9, $u$ is a critical point of $F$ with
respect to $M$ if and only if $TF(u) = 0$, i.e., $Du = 0$. Consequently, (1) holds.
Now we prove that $\|TF(u)\| = \|Du\|$. According to Theorem 43.C in Section
43.6, with $G(u) \overset{\mathrm{def}}{=} (u|u) - r^2$, we have $TM_u = \{h \in X : (u|h) = 0\}$; therefore,
$TF(u)h = F'(u)h = (Du|h)$ for all $h \in TM_u$,
i.e., $\|TF(u)\| \le \|Du\|$. Let $P : X \to TM_u$ be the orthogonal projection operator
onto $TM_u$. Since $(Du|u) = 0$ for $u \in M$, for all $v \in X$ we have:
$|(Du|v)| = |(Du|Pv)| \le \|TF(u)\| \, \|Pv\| \le \|TF(u)\| \, \|v\|$,
i.e., $\|Du\| \le \|TF(u)\|$.
(2) From $\|TF(u_n)\| \to 0$, $F(u_n) \to c$, and $u_n \in M$ for all $n$, it follows that
$Du_n \to 0$ and $\lambda(u_n) = 2r^{-2}F(u_n) \to \lambda_0 \overset{\mathrm{def}}{=} 2r^{-2}c$, where $\lambda_0 \ne 0$ because $c \ne 0$.
The operator $A$ is compact; therefore, possibly after passing to a subsequence,
$Au_n \to v$. Finally, $Du_n \to 0$ yields $u_n \to \lambda_0^{-1}v$.
(3) $M$ is compact.
(4) We assume that X is separable. Otherwise, it suffices to consider a
subspace. Let $(u_n)$ be a complete orthonormal system in X, $r = 1$. Then
$u_n \rightharpoonup 0$ as $n \to \infty$ (cf. $A_1$(52)). Thus, $Au_n \to 0$, $\lambda(u_n) = (Au_n|u_n) \to 0$;
therefore, $\|TF(u_n)\| = \|Du_n\| \to 0$ and $F(u_n) \to 0$, but $(u_n)$ has no convergent
subsequence because $\|u_k - u_m\|^2 = 2$ for $k \ne m$. □
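The quantities in this proof are easy to inspect in a finite-dimensional model. The sketch below is an illustration under stated assumptions, not the author's construction: the matrix `A` is a hypothetical symmetric matrix on $\mathbb{R}^3$ standing in for the operator $A$, and `tangential_gradient` computes $Du = Au - \lambda(u)u$ with $\lambda(u) = (Au|u)/(u|u)$, which equals $r^{-2}(Au|u)$ on the sphere of radius $r$.

```python
import math

def matvec(A, u):
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def tangential_gradient(A, u):
    """Return (Du, lambda(u)) with Du = Au - lambda(u) u and lambda(u) = (Au|u)/(u|u)."""
    Au = matvec(A, u)
    lam = dot(Au, u) / dot(u, u)
    return [a - lam * x for a, x in zip(Au, u)], lam

# Hypothetical symmetric matrix (an assumption chosen for illustration).
A = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -2.0]]

# At an eigenvector on the sphere of radius r = 2, Du vanishes (critical point).
Du_eig, lam_eig = tangential_gradient(A, [2.0, 0.0, 0.0])

# At a generic point, Du is nonzero but orthogonal to u, i.e. tangential.
u = [1.0, 1.0, 1.0]
Du, lam = tangential_gradient(A, u)

# Two distinct orthonormal vectors have squared distance 2, the obstruction
# to convergent subsequences used in part (4).
e1, e2 = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
dist_sq = sum((x - y) ** 2 for x, y in zip(e1, e2))
```

In finite dimensions the sphere is compact, so the failure described in (4) cannot be reproduced here; only the distance computation carries over.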
Example 44.15 (Operator Equation). Let $F : X \to \mathbb{R}$ be an F-differentiable
functional on the real B-space X. If we set $M = X$, then, according to
Example 43.14, $TM_u = X$ and thus $TF(u) = F'(u)$ for all $u \in X$. Furthermore:
(1) The critical points $u$ of $F$ with respect to $M$ are precisely the solutions
of $F'(u) = 0$.
(2) $F$ satisfies (PS) with respect to X when $F'$ is proper and $0 \in R(F')$.
This is fulfilled, in particular, when $F' : X \to X^*$ is bijective and $(F')^{-1}$ is
continuous.
Proof. (1) Compare Section 43.9.
(2) Compare Definition 11.10 and Example 11.11. □
The significance of (PS)C for the Ljusternik-Schnirelman theory results
from the following proposition.
Proposition 44.16. Suppose that the following two conditions hold:
(i) The functional $F : D(F) \subseteq X \to \mathbb{R}$ satisfies $(PS)_c$ with respect to the
closed set $M$ in the B-space X.
(ii) If $(u_n)$ is a sequence on $M$ such that $u_n \to u$ and $\|TF(u_n)\| \to 0$ as
$n \to \infty$, then $TF(u) = 0$.
Then:
(1) The set $\operatorname{crit}_{M,c} F$ is compact.
(2) For each open set $U$ in X such that $U \supseteq \operatorname{crit}_{M,c} F$, there exist numbers
$\gamma, \delta > 0$ such that
$\|TF(u)\| \ge \gamma$ for all $u \in M - U$ for which $|F(u) - c| < \delta$.
We shall use these propositions in an essential way in the construction of
L-S deformations in Section 44.7a.
Proof. (1) Take (11) into account and observe that $u \in \operatorname{crit}_{M,c} F \Leftrightarrow TF(u) =
0$, $F(u) = c$, by Section 43.9. Due to the existence of the F-derivative of $F$ on
$M$ contained in $(PS)_c$, $F$ is continuous on $M$.
(2) If the assertion were false, then we should have a sequence $(u_n)$ in
$M - U$ such that $F(u_n) \to c$ and $\|TF(u_n)\| \to 0$; therefore, after passing to a
subsequence, $u_n \to u$ because of $(PS)_c$, and thus $TF(u) = 0$ and $F(u) = c$, i.e.,
$u \in \operatorname{crit}_{M,c} F$. But this contradicts $u \in M - U$. Note that $M - U$ is closed. □
To explain the significance of (PS)C for the existence of critical points, we
now formulate two theorems that are valid as prototypes for more general
results. In this connection, we state:
$F'(u) = \lambda u$, $u \in M$, $\lambda \in \mathbb{R}$ (14)
and
$F'(u) = 0$, $u \in X$. (15)
(H1) X is a real H-space with $\dim X = \infty$. Let $M = \{u \in X : \|u\| = r\}$ for
$r > 0$.
(H2) $F$ is an even functional with $F \in C^1(X, \mathbb{R})$.
(H3a) $F$ satisfies one of the following three conditions:
(i) $F$ is bounded below (or bounded above) on $M$ and satisfies (PS) with
respect to $M$.
(ii) $F$ is bounded as well as greater than (respectively, less than) zero on $M$
and satisfies $(PS)^+$ (respectively, $(PS)^-$) with respect to $M$.
(iii) $F' : X \to X$ is compact, $F(0) = F'(0) = 0$ and $u \ne 0$ implies $F(u) \ne 0$,
$F'(u) \ne 0$.
Analogous to Example 44.14, we prove that $(PS)^{\pm}$ follows from (iii).
Furthermore, since $M$ is connected, $F > 0$ or $F < 0$ on $M$.
Thus (iii) is a special case of (ii).
Proposition 44.17 (Eigenvalue Problem). If (H1), (H2), and (H3a) hold,
then with respect to $M$, $F$ possesses infinitely many pairs $(u, -u)$ of critical
points to which eigenvectors of (14) correspond.
We shall give the proof in Section 44.7d in conjunction with the proof of
Theorem 44.A.
(H3b) We choose $\mathcal{K}_m$ to be the class of all compact sets $K \in \operatorname{sym}_X$ with
$\operatorname{gen} K \ge m$, $m = 1, 2, \ldots$ and we set
$c_m = \inf_{K \in \mathcal{K}_m} \sup_{u \in K} F(u)$. (16)
$F$ satisfies $(PS)^-$ with respect to X and $F(0) = 0$.
Proposition 44.18 (Operator Equation). If (H1), (H2), and (H3b) hold and
$-\infty < c_m < 0$, then $F$ has a pair of critical points $(u, -u)$ on X such that
$F(\pm u) = c_m$ to which solutions of (15) correspond.
If $-\infty < c_m = c_{m+1} = \cdots = c_{m+p} < 0$, $p \ge 1$, then $\operatorname{gen} \operatorname{crit}_{X,c} F \ge p + 1$ for $c = c_m$, and
$F$ thus has an infinite number of pairs $(u, -u)$ of critical points such that
$F(u) = c_m$ to which solutions of (15) correspond.
This proposition is a special case of Corollary 44.7. The L-S deformations
which represent the heart of the proof are obtained from Problem 49.7.
There, we also treat applications to semilinear elliptic differential equations.
44.5. The Main Theorem for Eigenvalue Problems in
Infinite-Dimensional B-Spaces
For fixed a > 0, we consider the eigenvalue problem
$F'(u) = \lambda G'(u)$, $u \in N_a$, $\lambda \in \mathbb{R}$ (17)
with the level set
$N_a \overset{\mathrm{def}}{=} \{u \in X : G(u) = a\}$.
The following assumptions turn out to be natural generalizations of the
classical linear eigenvalue problem Au = Xu in the Standard Example 44.19
that follows below.
(H1) Functionals F, G. The space X is a real reflexive separable B-space
with $\dim X = \infty$, and $F, G : X \to \mathbb{R}$ are even functionals such that $F, G \in
C^1(X, \mathbb{R})$ and $F(0) = G(0) = 0$. In particular, it follows from this that $F'$ and
$G'$ are odd potential operators.
(H2) The operator $F'$ is strongly continuous and $F(u) \ne 0$, $u \in \overline{\operatorname{co}}\, N_a$
implies $F'(u) \ne 0$.
(H3) The operator $G'$ is uniformly continuous on bounded sets and
satisfies $(S)_1$, i.e.,
$u_n \rightharpoonup u$, $G'(u_n) \to v$ implies $u_n \to u$ as $n \to \infty$.
(H4) The level set $N_a$ is bounded and
$u \ne 0$ implies $\langle G'(u), u \rangle > 0$, $\lim_{t \to +\infty} G(tu) = +\infty$,
and
$\inf_{u \in N_a} \langle G'(u), u \rangle > 0$.
The boundedness of $N_a$ follows, for instance, from $G(u) \to +\infty$ as
$\|u\| \to \infty$. Due to (H4), $0 \notin N_a$. From (H2) it follows that, to each
eigenvector $u$ of (17) such that $F(u) \ne 0$, there belongs an eigenvalue $\lambda \ne 0$.
Condition (H4) shows that $G'(u) \ne 0$ on $N_a$. Therefore, $G$ is a submersion at each
$u \in N_a$. We construct the projection operator $P : X \to N(G'(u))$ in this
connection in the proof of Lemma 44.31. According to Proposition 43.21,
the following therefore holds:
$u$ is a solution of (17)
$\Leftrightarrow$ $u$ is a critical point of $F$ with respect to $N_a$.
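Now, the construction of
$\pm c_m^{\pm} \overset{\mathrm{def}}{=} \sup_{K \in \mathcal{K}_m^{\pm}} \inf_{u \in K} \pm F(u)$ if $\mathcal{K}_m^{\pm} \ne \emptyset$, and $c_m^{\pm} \overset{\mathrm{def}}{=} 0$ if $\mathcal{K}_m^{\pm} = \emptyset$, (18)
for $m = 1, 2, \ldots$ is crucial. Here, $\mathcal{K}_m^{\pm}$ denotes the class of all compact
symmetric subsets $K$ of $N_a$ such that $\operatorname{gen} K \ge m$ and $\pm F(u) > 0$ on $K$.
Furthermore, we define a global multiplicity $\chi_{\pm}$ by:
$\chi_{\pm} \overset{\mathrm{def}}{=}$ the supremum over all $m$ such that $\pm c_m^{\pm} > 0$, and $\chi_{\pm} \overset{\mathrm{def}}{=} 0$ if $c_1^{\pm} = 0$.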
Standard Example 44.19. The assumptions (H1)-(H4) are fulfilled
provided the following hold:
(i) X is a real separable H-space with $\dim X = \infty$. We identify X with $X^*$.
(ii) $A : X \to X$ is a linear, compact, and symmetric operator. We set $F(u) =
2^{-1}(Au|u)$ and $G(u) = 2^{-1}(u|u)$.
Then $F' = A$, $G' = I$, $N_a$ is a sphere, and (17) corresponds to
$Au = \lambda u$, $\lambda \in \mathbb{R}$
with the normalizing condition $G(u) = a$, i.e., $u \in N_a$. It can be shown that
$c_m^{\pm} = a\lambda_m^{\pm}$ when $\pm c_m^{\pm} > 0$. (18a)
Here, $\lambda_m^{\pm}$ is the eigenvalue of $A$ that we constructed in Section 44.1 with the
aid of the Courant maximum-minimum principle (cf. Problem 44.5). In this
way all eigenvalues of $A$ that are different from zero are obtained from $c_m^{\pm}$
according to their multiplicity. Therefore, $A$ has at least $\chi_+ + \chi_-$ pairs
$(u, -u)$ of eigenvectors on $N_a$ with the corresponding eigenvalues that are
different from zero. If $\lambda_m^{\pm}$ has the multiplicity $p + 1$, i.e., $\lambda_m^{\pm} = \lambda_{m+1}^{\pm} = \cdots =
\lambda_{m+p}^{\pm}$, then the corresponding eigenvectors on $N_a$ form a p-dimensional
sphere and the genus of this set is $p + 1$ according to Proposition 44.10.
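Relation (18a) can be checked by hand in a two-dimensional analogue. The sketch below rests on assumptions that are not in the original text: X is replaced by $\mathbb{R}^2$ (so this is closer to the finite-dimensional setting of Section 44.8), the matrix `A` is a hypothetical symmetric matrix with eigenvalues 3 and 1, and we use that a compact symmetric subset of the circle with genus 2 must be the whole circle. Under these assumptions the first level is the maximum of $F$ on $N_a$ and the second level is the minimum.

```python
import math

# Hypothetical symmetric matrix with eigenvalues 3 and 1 (illustration only).
A = [[2.0, 1.0], [1.0, 2.0]]
a = 2.0                                # level: N_a = {u : (u|u)/2 = a}
radius = math.sqrt(2.0 * a)            # N_a is the circle of this radius

def F(u):
    """F(u) = (Au|u)/2."""
    Au = [A[0][0] * u[0] + A[0][1] * u[1], A[1][0] * u[0] + A[1][1] * u[1]]
    return 0.5 * (Au[0] * u[0] + Au[1] * u[1])

# Sample F on the circle N_a.
samples = [
    F((radius * math.cos(2.0 * math.pi * k / 3600),
       radius * math.sin(2.0 * math.pi * k / 3600)))
    for k in range(3600)
]

c1 = max(samples)   # sup over sets of genus >= 1 of inf F (a point pair suffices)
c2 = min(samples)   # the only set of genus 2 on the circle is the circle itself
```

With these choices `c1` and `c2` agree with $a \cdot 3$ and $a \cdot 1$ up to rounding, as (18a) predicts for positive levels.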
The following theorem generalizes these results to nonlinear problems.
Theorem 44.A (Main Theorem). With the assumptions (H1)-(H4), the
following five assertions hold:
(1) Existence of an eigenvalue. If $\pm c_m^{\pm} > 0$ (+ or -), then (17) possesses
a pair $(u_m^{\pm}, -u_m^{\pm})$ of eigenvectors with the eigenvalue $\lambda_m^{\pm} \ne 0$ and $F(u_m^{\pm}) = c_m^{\pm}$.
If $F'$ and $G'$ are positive homogeneous, i.e., $F'(tu) = tF'(u)$ and $G'(tu) =
tG'(u)$ for all $u \in X$, $t > 0$, then $c_m^{\pm} = a\lambda_m^{\pm}$.
(2) Multiplicity. (17) has at least $\chi_+ + \chi_-$ pairs $(u, -u)$ of eigenvectors
with eigenvalues that are different from zero.
If $\pm c_m^{\pm} = \pm c_{m+1}^{\pm} = \cdots = \pm c_{m+p}^{\pm} > 0$, $p \ge 1$ (+ or -), then the set of all
eigenvectors of (17) such that $F(u) = c_m^{\pm}$ has genus greater than or equal to
$p + 1$. In particular, this set is infinite.
(3) Critical levels. $+\infty > \pm c_1^{\pm} \ge \pm c_2^{\pm} \ge \cdots \ge 0$ and $c_m^{\pm} \to 0$ as $m \to \infty$.
(4) Infinitely many eigenvalues. If $\chi_+ = \infty$ or $\chi_- = \infty$, and $F(u) = 0$,
$u \in \overline{\operatorname{co}}\, N_a$ implies $\langle F'(u), u \rangle = 0$, then there exists a sequence $(\lambda_m)$ of
infinitely many distinct eigenvalues for (17) such that $\lambda_m \to 0$ as $m \to \infty$.
(5) Weak convergence of eigenvectors. Assume that $F(u) = 0$, $u \in \overline{\operatorname{co}}\, N_a$
implies $u = 0$. Then $\max\{\chi_+, \chi_-\} = \infty$ and there exists a sequence of
eigensolutions $(u_m, \lambda_m)$ of (17) such that $u_m \rightharpoonup 0$, $\lambda_m \to 0$ as $m \to \infty$ and $\lambda_m \ne 0$ for
all $m$.
Corollary 44.20 (Existence of an Eigensolution for Functionals F, G That
Are Not Necessarily Even). With the assumptions (H1)-(H4), the following
holds when one foregoes the evenness of $F$ and $G$: If there exists an element
$u_0 \in N_a$ (+ or -) such that $\pm F(u_0) > 0$, then (17) has an eigensolution
$(u^{\pm}, \lambda^{\pm})$ such that $\lambda^{\pm} \ne 0$ and
$\pm F(u^{\pm}) = \max_{u \in N_a} \pm F(u)$.
We give the proofs in Section 44.7. In this connection, in Lemma 44.28 it
turns out that, by virtue of a radial projection, the level set $N_a$ is
homeomorphic to the unit sphere in X and $0 \notin N_a$. This homeomorphism is
odd. According to Proposition 44.10 and Corollary 44.12, on $N_a$ for each
$m \in \mathbb{N}$, there thus exists a compact symmetric set $K$ such that $\operatorname{gen} K = m$;
therefore, $\operatorname{gen} N_a = \infty$. From this there easily result estimates for $\chi_{\pm}$ which
are important for the multiplicity assertion in Theorem 44.A, (2).
Corollary 44.21 (Calculation of the multiplicity $\chi_{\pm}$). With the assumptions
(H1)-(H4), the following hold:
(a) $\chi_{\pm} = \infty$ when $\pm F > 0$ on $N_a$ (+ or -).
(b) $\chi_{\pm} \ge \dim X_1$ provided there exists a linear subspace $X_1$ of X such that
$\pm F > 0$ on $N_a \cap X_1$ (+ or -).
(c) $\chi_+ = \infty$ or $\chi_- = \infty$ when the set of zeros $N_a^0 \overset{\mathrm{def}}{=} \{u \in N_a : F(u) = 0\}$ is
compact or, more generally, there exists a closed linear subspace $X_1$ of X
such that $\dim(X/X_1) = \infty$ and $\operatorname{dist}(\|u\|^{-1}u, X_1) < \eta$ for all $u \in N_a^0$ and
fixed $\eta \in \, ]0, 1[$.
Proof. (b) Let $K \overset{\mathrm{def}}{=} N_a \cap X_1$. By Lemma 44.28 in Section 44.7a, $K$ is
homeomorphic to the unit sphere in $X_1$. Since this homeomorphism is odd,
we have $\operatorname{gen} K = \dim X_1$ by virtue of Proposition 44.10 and Corollary 44.12.
Since $\pm F > 0$ on $K$, $\pm c_m^{\pm} > 0$ for all $m \le \dim X_1$.
(a) This is a consequence of (b) because $\dim X = \infty$.
(c) If one uses Problem 44.3 and the fact that the homeomorphism from
the unit sphere $S$ onto $N_a$ is odd, then for each $m \in \mathbb{N}$, there exists a
compact symmetric set $K$ on $N_a$ such that $\operatorname{gen} K \ge m$ and $K \cap N_a^0 = \emptyset$. Let
$K_{\pm} \overset{\mathrm{def}}{=} \{u \in K : \pm F(u) > 0\}$. Then $K_{\pm}$ is compact and symmetric. Since
$K = K_+ \cup K_-$, $K_+ \cap K_- = \emptyset$, we have $\operatorname{gen} K_{\pm} \ge m$ (+ or -); hence,
$\pm c_m^{\pm} > 0$. □
Remark 44.22 (Optimality of the Main Theorem). The prototype of
Theorem 44.A that we gave in Proposition 37.60 was originated by Ljusternik
(1939). Numerous authors contributed to the further development. In the
present general form, Theorem 44.A can be found in Zeidler (1980). A
careful comparison with the linear case in Standard Example 44.19 shows
that Theorem 44.A is optimal in a certain sense. In this connection, compare
Zeidler (1980). In the following remarks, we point out possible
generalizations.
Remark 44.23 (Weakened Continuity of G'). In (H3) we required that $G'$
be uniformly continuous on bounded sets. Our goal is to weaken this
condition by requiring only the continuity of $G'$. To this end, we state:
(H2') $F'$ is strongly continuous on X and $\langle F'(u), u \rangle = 0$, $u \in \overline{\operatorname{co}}\, N_a$
implies $F(u) = 0$.
(H3') $G'$ is continuous, bounded, and satisfies $(S)_0$, i.e., as $n \to \infty$,
$u_n \rightharpoonup u$, $G'(u_n) \rightharpoonup v$, $\langle G'(u_n), u_n \rangle \to \langle v, u \rangle$ implies $u_n \to u$.
Then, with the assumptions (H1), (H2'), (H3'), and (H4), Theorem 44.A and
Corollary 44.20 remain valid, where only Theorem 44.A, (2) is to be
replaced by the following weakened proposition: For $\chi_+ = \infty$ or $\chi_- = \infty$,
(17) has infinitely many pairs $(u, -u)$ of eigenvectors, where the
corresponding eigenvalues are not equal to zero.
The proof can be found in Zeidler (1980). Browder's Galerkin method
from Section 43.5 is used in that proof. We explain the basic idea in
Problem 44.11.
Remark 44.24 (Hyperboloids). Theorem 44.A refers to $F'(u) = \lambda G'(u)$,
$u \in N_a$, $\lambda \in \mathbb{R}$, where $G'$ is definite but $F'$ is not necessarily definite. The
similar problem for indefinite $F'$ and $G'$ is considered in general form in
Zeidler (1979a). Whereas in Theorem 44.A the level set $N_a$ has
approximately the form of a sphere, for indefinite $G'$, $N_a$ can, e.g., have the
structure of a hyperboloid (cf. Problems 44.9 and 44.10).
Remark 44.25 (Perturbation of Evenness). For Theorem 44.A it is
important that $F, G$ be even and that, correspondingly, $F', G'$ be odd. In
Zeidler (1980), in conjunction with Krasnoselskii (1956, M) the general form
of the perturbation case $F = F_1 + \varepsilon F_2$ for small $\varepsilon$ is considered, where only
$F_1$ is even (cf. Problem 44.8).
The papers of the author cited in the preceding remarks also contain
applications to Hammerstein integral equations and quasilinear elliptic
differential equations and comprehensive bibliographical references. As an
introduction to this class of problems, we recommend Krasnoselskii (1956,
M), Chapter 6.
44.6. A Typical Example
We consider
$F'(u) = \lambda G'(u)$, $u \in X$, $\lambda \in \mathbb{R}$, $G(u) = a$ (19)
and formulate an important special case of Theorem 44.A in connection
with the theory of monotone operators.
Proposition 44.26. Suppose that the following four conditions hold:
(i) X is a real reflexive separable B-space with $\dim X = \infty$.
(ii) $F, G : X \to \mathbb{R}$ are functionals, $F, G \in C^1(X, \mathbb{R})$, and $F(0) = G(0) = 0$.
(iii) $F'$ is strongly continuous and $\langle F'(u), u \rangle > 0$ for all $u \ne 0$ in X.
(iv) $G'$ is continuous, uniformly monotone, bounded, and $G'(0) = 0$.
Then:
(1) For each $a > 0$, equation (19) has an eigensolution $(u, \lambda)$ with $u \ne 0$,
$\lambda > 0$.
(2) If $F', G'$ are odd, then for fixed $a > 0$ and for each $m \in \mathbb{N}$, (19) has an
eigensolution $(u_m, \lambda_m)$ with $u_m \ne 0$, $\lambda_m > 0$ and $u_m \rightharpoonup 0$, $\lambda_m \to +0$ as $m \to \infty$,
so that there exist infinitely many distinct eigenvectors and eigenvalues.
We treat applications to quasilinear elliptic differential equations in
Section 44.9.
Proof. First, let G' be uniformly continuous on bounded sets and assume
that F and G are even. We verify the assumptions (H1)-(H4) of Theorem
44.A in Section 44.5 and use the connections between operator properties
shown in Fig. 27.1.
(HI) This is obviously fulfilled.
(H2), (H2') We have $F(u) = 0 \Leftrightarrow \langle F'(u), u \rangle = 0 \Leftrightarrow u = 0$. This follows
from $\langle F'(u), u \rangle > 0$ for $u \ne 0$ and
$F(u) = \int_0^1 \langle F'(tu), u \rangle \, dt$.
(H3), (H3') $G'$ satisfies $(S)_0$, $(S)_1$ by Fig. 27.1.
(H4)
(I) $\|u\| \to \infty$ implies that $G(u) \to +\infty$, i.e., $N_a$ is bounded. To prove this,
let $\varphi(t) \overset{\mathrm{def}}{=} \langle G'(tu), u \rangle$ for $u \ne 0$ and $t \ge 0$. Since $G'(0) = 0$ and since $G'$
is uniformly monotone, we have $\varphi(t) > 0$ for $t > 0$, and $\varphi$ is strictly
monotonically increasing on $[0, 1]$ (Example 25.6). Consequently,
$G(u) = \int_0^1 \varphi(t) \, dt \ge \int_{1/2}^1 \varphi(t) \, dt \ge \langle G'(2^{-1}u), 2^{-1}u \rangle \to +\infty$
as $\|u\| \to \infty$, because $G'$ is coercive.
(II) $\varphi(1) > 0$ yields $\langle G'(u), u \rangle > 0$ for $u \ne 0$.
(III) We prove $\inf_{u \in N_a} \langle G'(u), u \rangle > 0$. Since $G$ is continuous at $u = 0$ and
$G(0) = 0$, there exists an $r > 0$ such that $u \notin N_a$ for all $u$, $\|u\| < r$. The
uniform monotonicity of $G'$ and $G'(0) = 0$ assure that
$\langle G'(u), u \rangle \ge a(\|u\|)\|u\| \ge a(r)r > 0$
for all $u \in N_a$ (cf. Definition 25.2).
Now, assertion (2) follows from Theorem 44.A, (5), and (1) is obtained
from Corollary 44.20. If $G'$ is only continuous, then one uses Remark 44.23.
That $\lambda$ is positive follows from $\lambda = \langle F'(u), u \rangle / \langle G'(u), u \rangle$. □
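The potential formula $F(u) = \int_0^1 \langle F'(tu), u \rangle \, dt$ used in the proof can be checked numerically in a toy case. The following sketch makes assumptions that are not in the original: X is replaced by $\mathbb{R}^3$, and `F`, `F_prime` are a hypothetical potential and its gradient chosen so that $\langle F'(u), u \rangle > 0$ for $u \ne 0$.

```python
def simpson(h, n=100):
    """Composite Simpson rule for h on [0, 1] with n (even) subintervals."""
    step = 1.0 / n
    total = h(0.0) + h(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * h(k * step)
    return total * step / 3.0

def F(u):
    # Hypothetical potential F(u) = (1/4) * sum u_i^4 (illustration only).
    return 0.25 * sum(x ** 4 for x in u)

def F_prime(u):
    # Its gradient F'(u) = (u_i^3).
    return [x ** 3 for x in u]

def pairing(w, u):
    return sum(p * q for p, q in zip(w, u))

u = [1.5, -0.5, 2.0]
# Integrate t -> <F'(tu), u> over [0, 1]; this should reproduce F(u).
integral = simpson(lambda t: pairing(F_prime([t * x for x in u]), u))
```

For this choice the integrand is a cubic in $t$, so Simpson's rule is exact up to rounding.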
44.7. Proof of the Main Theorem
The crucial aspect of the proof is the application of the strong Ljusternik
maximum-minimum principle from Section 44.2b with $\operatorname{ind}(\cdot) = \operatorname{gen}(\cdot)$. At
the heart of the proof are the local Palais-Smale condition and the
elementary explicit construction of L-S deformations.
44.7a. Preliminaries
We assume (H1)-(H4) in Section 44.5 and first make available propositions
on the duality mapping $J$, the structure of the level set $N_a$, the local
Palais-Smale condition, L-S deformations, and projection operators. The
proofs are especially simple when X is an H-space. If in this case we identify
X with $X^*$, then $J$ is equal to the identity mapping on X. Further
simplifications result when $G(u) = 2^{-1}(u|u)$; therefore, $G' = I$. Then $N_a$ is a
sphere. We shall frequently use the connections between operator properties
depicted in Fig. 27.1 of Part II. In particular, from Corollary 41.9 and Fig.
27.1 it follows that the following lemma holds.
Lemma 44.27. F is strongly continuous. Furthermore, F, F', and G' are
bounded and uniformly continuous on bounded sets.
Step 1: Duality Mapping J. According to the Kadec-Troyanski theorem,
every reflexive B-space has an equivalent norm such that X and $X^*$ are
locally uniformly convex ($A_3$(29)). Since our proof will be invariant relative
to equivalent normings, we can assume X and $X^*$ to be locally uniformly
convex at the start. By virtue of Proposition 47.19, since $X^{**} = X$, there
then exists an odd continuous mapping $J : X^* \to X$ such that
$\langle w, Jw \rangle = \|w\|^2$, $\|Jw\| = \|w\|$ for all $w \in X^*$. (20)
$J$ is the duality mapping of $X^*$ into $X^{**}$.
Step 2: Level Surface $N_a$ for fixed $a > 0$
Lemma 44.28
(1) There exist numbers $0 < R_0 < R_1$ such that $R_0 \le \|v\| \le R_1$ for all
$v \in N_a$.
(2) For each $u \ne 0$ in X, there exists exactly one $r(u) > 0$ such that $G(r(u)u)
= a$; therefore, $r(u) = 1$ on $N_a$.
(3) The mapping $r : X - \{0\} \to \mathbb{R}$ is even and continuously F-differentiable,
such that
$r'(u) = -\dfrac{r(u)^2}{\langle G'(r(u)u), r(u)u \rangle} \, G'(r(u)u)$ for all $u \ne 0$. (21)
Since $r(u)u \in N_a$ for $u \ne 0$ and (H4) holds, the denominator in (21) is not
equal to zero.
(4) $r$ and $r'$ are uniformly continuous and bounded on bounded sets outside a
neighborhood of zero.
(5) The radial mapping $u \mapsto r(u)u$ is an odd homeomorphism of the unit
sphere $S$ onto $N_a$ (see Fig. 44.2).
We treat the proof in Problem 44.6.
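The radial factor $r(u)$ of Lemma 44.28 can be computed by bisection when $t \mapsto G(tu)$ is increasing. The sketch below is an illustration under assumptions that are not in the original: X is replaced by $\mathbb{R}^n$, and `G` is a hypothetical even functional with $G(0) = 0$ and $\langle G'(u), u \rangle > 0$ for $u \ne 0$. The function `radial_gradient` evaluates the right-hand side of (21) so that it can be compared with a finite difference.

```python
def G(u):
    # Hypothetical level functional (illustration only): even, G(0) = 0.
    return 0.5 * sum(x * x for x in u) + 0.25 * sum(x ** 4 for x in u)

def G_prime(u):
    # Gradient of G; <G'(u), u> > 0 for u != 0.
    return [x + x ** 3 for x in u]

def radial_factor(u, a, tol=1e-13):
    """Return r(u) > 0 with G(r(u) u) = a, found by bisection on t -> G(t u)."""
    lo, hi = 0.0, 1.0
    while G([hi * x for x in u]) < a:      # enlarge the bracket until G(hi u) >= a
        hi *= 2.0
    while hi - lo > tol * hi:
        mid = 0.5 * (lo + hi)
        if G([mid * x for x in u]) < a:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def radial_gradient(u, a):
    """Right-hand side of (21): -r^2 G'(r u) / <G'(r u), r u>."""
    r = radial_factor(u, a)
    w = G_prime([r * x for x in u])
    denom = sum(p * (r * x) for p, x in zip(w, u))
    return [-(r * r / denom) * p for p in w]
```

Because `G` is even, `radial_factor(u, a)` and `radial_factor([-x for x in u], a)` run through identical arithmetic, which mirrors the evenness of $r$ in assertion (3).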
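Step 3: L-S Deformations on $N_a$. For all $u \in N_a$, we set
$Du \overset{\mathrm{def}}{=} F'(u) - \lambda(u)G'(u)$, $\lambda(u) \overset{\mathrm{def}}{=} \dfrac{\langle F'(u), u \rangle}{\langle G'(u), u \rangle}$,
$Eu \overset{\mathrm{def}}{=} JDu - \dfrac{\langle G'(u), JDu \rangle}{\langle G'(u), u \rangle} \, u$.
Lemma 44.29. There exists a continuous mapping $d : N_a \times [0, 1] \to N_a$ and a
number $\tau_0 > 0$ such that
$F(d(u, 1)) \ge F(u) + 2^{-1}\tau_0 \|Du\|^2$ for all $u \in N_a$. (22)
Moreover, $d(u, 0) = u$ on $N_a$ and $d$ is odd with respect to $u$.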
This yields the following crucial proposition.
Corollary 44.30. For each $c \ne 0$ and each open set $U \supseteq \operatorname{crit}_{N_a, c} F$, there exists
a number $\varepsilon > 0$ such that $F(u) \ge c - \varepsilon$, $u \in N_a - U$ implies $F(d(u, 1)) \ge c + \varepsilon$.
Figure 44.2
Proof of Corollary 44.30. The proof of Lemma 44.31 in the following
step shows that, for each $c \ne 0$, $F$ satisfies the local Palais-Smale condition
$(PS)_c$ with respect to $N_a$. Moreover, there exists a constant $c_1 > 0$ such that
$\|TF(u)\| \le \|Du\| \le (1 + c_1)\|TF(u)\|$ on $N_a$. The assertion now follows
directly from (22) and Proposition 44.16. □
Proof of Lemma 44.29. By (21), the relations
$\langle Du, u \rangle = \langle G'(u), Eu \rangle = 0$, $\langle r'(u), Eu \rangle = 0$,
$\langle F'(u), Eu \rangle = \langle Du, Eu \rangle = \langle Du, JDu \rangle = \|Du\|^2$
hold on $N_a$. The operators $D$ and $E$ are continuous and bounded on $N_a$.
Therefore, by Lemma 44.28, (1), there exists a $\tau_0 > 0$ such that
$\inf_{u \in N_a} \|u + \tau Eu\| > 0$ for all $\tau \in [-\tau_0, \tau_0]$.
Now, the following definitions are important:
$g(u, \tau) \overset{\mathrm{def}}{=} r(u + \tau Eu)(u + \tau Eu)$,
$\psi(u, \tau) \overset{\mathrm{def}}{=} F(g(u, \tau))$, $d(u, t) \overset{\mathrm{def}}{=} g(u, \tau_0 t)$
for all $u \in N_a$, $\tau \in [-\tau_0, \tau_0]$, and $t \in [0, 1]$. We verify (22). First, $g$ is a
mapping of $N_a \times [-\tau_0, \tau_0]$ into $N_a$ and
$g_\tau(u, \tau) = \langle r'(u + \tau Eu), Eu \rangle (u + \tau Eu) + r(u + \tau Eu)Eu$,
$g(u, 0) = u$, $g_\tau(u, 0) = Eu$,
$\psi_\tau(u, \tau) = \langle F'(g(u, \tau)), g_\tau(u, \tau) \rangle$,
$\psi(u, 0) = F(u)$, $\psi_\tau(u, 0) = \langle F'(u), Eu \rangle = \|Du\|^2$.
According to Lemma 44.28, $g$ and $g_\tau$ are bounded. $F$ and $F'$ are bounded
and uniformly continuous on bounded sets. By Lemma 44.28, the mappings
$\tau \mapsto \psi(u, \tau)$ and $\tau \mapsto \psi_\tau(u, \tau)$ are thus equicontinuous on $[-\tau_0, \tau_0]$ with
respect to all $u \in N_a$. Thus, for sufficiently small $\tau_0$, by the mean value
theorem,
$\psi(u, \tau_0) \ge \psi(u, 0) + 2^{-1}\tau_0 \psi_\tau(u, 0)$ for all $u \in N_a$.
(22) results directly from this. □
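A single deformation step can be written out in the simplest situation mentioned in Section 44.7a. The sketch below is an illustration under assumptions that are not in the original: X is $\mathbb{R}^3$ with $J$ the identity, $G(u) = 2^{-1}(u|u)$ so that $r(v) = \sqrt{2a}/\|v\|$, $F(u) = 2^{-1}(Au|u)$ with a hypothetical symmetric matrix `A`, and the step size `tau0 = 0.05` is an arbitrary small value rather than the constant of Lemma 44.29.

```python
import math

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def matvec(A, u):
    return [sum(p * x for p, x in zip(row, u)) for row in A]

# Hypothetical data (illustration only).
A = [[3.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, -2.0]]
a = 0.5                      # N_a = {u : (u|u)/2 = a} is the unit sphere

def F(u):
    return 0.5 * dot(matvec(A, u), u)

def D(u):
    """Du = F'(u) - lambda(u) G'(u) with F' = A, G' = I."""
    Au = matvec(A, u)
    lam = dot(Au, u) / dot(u, u)
    return [p - lam * q for p, q in zip(Au, u)]

def E(u):
    """Eu = JDu - (<G'(u), JDu> / <G'(u), u>) u with J = I, G' = I."""
    Du = D(u)
    coeff = dot(u, Du) / dot(u, u)
    return [p - coeff * q for p, q in zip(Du, u)]

def deform(u, t, tau0=0.05):
    """d(u, t) = r(v) v with v = u + tau0 * t * Eu, projected back onto N_a."""
    v = [p + tau0 * t * q for p, q in zip(u, E(u))]
    scale = math.sqrt(2.0 * a / dot(v, v))   # r(v) for G(u) = (u|u)/2
    return [scale * p for p in v]

s = 1.0 / math.sqrt(3.0)
u0 = [s, s, s]
u1 = deform(u0, 1.0)
gain_bound = F(u0) + 0.5 * 0.05 * dot(D(u0), D(u0))
```

For this starting point `F(u1)` exceeds `gain_bound`, in line with (22); a smaller step may be needed for other data.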
Step 4: The Local Palais-Smale Condition
Lemma 44.31. For each $c \ne 0$, $F$ satisfies the condition $(PS)_c$ with respect to
$N_a$.
Proof. (I) Connection between $TF(u)$ and $Du$. Let $N(G'(u)) \overset{\mathrm{def}}{=} \{h \in X :
\langle G'(u), h \rangle = 0\}$ for fixed $u \in N_a$. Then $P_u$ defined by
$P_u v \overset{\mathrm{def}}{=} v - \dfrac{\langle G'(u), v \rangle}{\langle G'(u), u \rangle} \, u$
is a continuous linear projection operator from X onto $N(G'(u))$. Since
$\langle G'(u), u \rangle \ge \beta > 0$, $\|G'(u)\| \le \gamma$, and $\|u\| \le R_1$ for all $u \in N_a$, where $\beta$, $\gamma$,
and $R_1$ are suitable constants, we have $\|P_u v\| \le (1 + \beta^{-1}\gamma R_1)\|v\|$. Therefore,
$\|P_u\| \le$ constant for all $u \in N_a$. According to Theorem 43.C in Section 43.6,
$TM_u = N(G'(u))$ for $M = N_a$.
Now, one proves analogously to Example 44.14 that $Du$ is an extension of
$TF(u)$ to the space X and that $\|TF(u)\| \le \|Du\| \le (\text{constant}) \|TF(u)\|$ on $N_a$.
(II) $(PS)_c$. Due to (I), it suffices to show the following: If $(u_n)$ is a
sequence in $N_a$ and $Du_n \to 0$, $F(u_n) \to c$ as $n \to \infty$ for $c \ne 0$, then $(u_n)$ has a
convergent subsequence. To prove this, let
$Du_n = F'(u_n) - \lambda(u_n)G'(u_n) \to 0$,
$\lambda(u_n) = \dfrac{\langle F'(u_n), u_n \rangle}{\langle G'(u_n), u_n \rangle}$.
$(u_n)$ is bounded; therefore, the sequences $(F'(u_n))$, $(G'(u_n))$, and $(\lambda(u_n))$
are also bounded by virtue of (H4) and Lemma 44.27. Consequently, there
exists a subsequence $(u_{n'})$ such that $u_{n'} \rightharpoonup u$, $\lambda(u_{n'}) \to \lambda_0$, i.e., $F'(u_{n'}) \to F'(u)$
and $F(u_{n'}) \to F(u)$ (Lemma 44.27). $F(u) = c \ne 0$, $u \in \overline{\operatorname{co}}\, N_a$ yields $F'(u) \ne 0$
by (H2) and thus $\lambda_0 \ne 0$, since $F'(u_{n'}) - \lambda(u_{n'})G'(u_{n'}) \to 0$ and $(G'(u_{n'}))$ is
bounded. Therefore, $u_{n'} \rightharpoonup u$ and
$G'(u_{n'}) \to \lambda_0^{-1}F'(u)$. Now $(S)_1$ in (H3) assures that $u_{n'} \to u$. □
Step 5: Generalized Nonlinear Orthogonal Projection Operators
Lemma 44.32. Let X be a real separable reflexive B-space. Then, for each
$n \in \mathbb{N}$, there exists a finite-dimensional linear subspace $X_n$ of X and an odd
continuous operator $P_n : X \to X_n$ such that
$u_n \rightharpoonup u$ implies $P_n u_n \rightharpoonup u$ as $n \to \infty$,
$\|P_n u\| \le \|u\|$ for all $u \in X$, $n \in \mathbb{N}$.
Proof. If X is a real H-space, then we choose a complete orthonormal
system $(e_n)$ and set $P_n u \overset{\mathrm{def}}{=} \sum_{i=1}^{n} (u|e_i)e_i$. The proof is more complicated for
B-spaces (cf. Problem 44.7). □
Corollary 44.33. If $(n')$ is a subsequence of the sequence of natural numbers,
then $u_{n'} \rightharpoonup u$ implies that $P_{n'}u_{n'} \rightharpoonup u$ as $n' \to \infty$.
Proof. We set $u_m = u_{n'}$ for $m = n'$ and $u_m = u$ otherwise. □
44.7b Proof of Theorem 44.A.
It suffices to consider the case " +."
(Ad (1), (2)) We use the strong Ljusternik maximum-minimum principle
from Section 44.2b with $\operatorname{ind} K \overset{\mathrm{def}}{=} \operatorname{gen} K$. To this end, we verify the
assumptions made in Section 44.2b. Since $F$ is bounded on $N_a$, $0 \le c_m^+ < \infty$. The condition
$(PS)_{c_m^+}$, when $c_m^+ > 0$, implies that $\operatorname{crit}_{N_a, c_m^+} F$ is compact by virtue of
Proposition 44.16. The L-S deformations $d$ were constructed in Corollary
44.30. Finally, $F(d(u, 1)) \ge F(u)$ on $N_a$ yields $K \in \mathcal{K}_m^+ \Rightarrow d(K, 1) \in \mathcal{K}_m^+$.
If $F'$ and $G'$ are positive homogeneous, then
$F(u) = \int_0^1 \langle F'(tu), u \rangle \, dt = 2^{-1}\langle F'(u), u \rangle$, $G(u) = 2^{-1}\langle G'(u), u \rangle$.
Thus, from $F'(u) = \lambda G'(u)$, $G(u) = a$ it follows that $\lambda = F(u)/G(u) =
a^{-1}F(u)$.
(Ad 3) In order to show that $c_m^+ \to 0$ as $m \to \infty$, we use the operators $P_n$
from Lemma 44.32.
(I) For each $\varepsilon > 0$ there exist numbers $\delta(\varepsilon) > 0$ and $n_0(\varepsilon) \in \mathbb{N}$ such that:
$|F(u)| \ge \varepsilon$, $u \in N_a$ implies $\|P_{n_0}u\| \ge \delta$.
Otherwise there exist an $\varepsilon_0 > 0$ and a sequence $(u_n)$ on $N_a$ such that
$|F(u_n)| \ge \varepsilon_0$ and $\|P_n u_n\| \le n^{-1}$ for all $n \in \mathbb{N}$. If we choose a subsequence
$(u_{n'})$ such that $u_{n'} \rightharpoonup u$, then $P_{n'}u_{n'} \rightharpoonup u$, according to Corollary 44.33; hence
$u = 0$. Consequently, $F(u_{n'}) \to 0$, which contradicts $|F(u_n)| \ge \varepsilon_0$.
(II) From $K \subseteq N_a$ and $\operatorname{gen} K \ge m_0 + 1$, where $m_0 = \dim X_{n_0(\varepsilon)}$, it follows
that $\inf_{u \in K} |F(u)| < \varepsilon$.
Otherwise, $\|P_{n_0}u\| > 0$ on $K$, by (I). The set $P_{n_0}(K)$ is compact and
symmetric. Corollary 44.12 yields the contradiction $\operatorname{gen} P_{n_0}(K) \le m_0$.
(III) From (II) it follows that $0 \le c^+_{m_0(\varepsilon)+1} \le \varepsilon$ for all $\varepsilon > 0$. The sequence
$(c_m^+)$ is monotonically decreasing; hence, $c_m^+ \to 0$ as $m \to \infty$.
(Ad 4) Let $\chi_+ = \infty$. By assertion (1), for each $m \in \mathbb{N}$ there exists a $u_m$
such that $F'(u_m) = \lambda_m G'(u_m)$, where $u_m \in N_a$, $F(u_m) = c_m^+$, and $\lambda_m \ne 0$.
The sequence $(u_m)$ is bounded. We choose a subsequence $(u_{m'})$ such that
$u_{m'} \rightharpoonup u$. Hence, $F(u) = 0$ since $c_m^+ \to 0$. Furthermore, $u \in \overline{\operatorname{co}}\, N_a$. Thus
$\langle F'(u), u \rangle = 0$ by assumption. Therefore,
$\lambda_{m'} = \dfrac{\langle F'(u_{m'}), u_{m'} \rangle}{\langle G'(u_{m'}), u_{m'} \rangle} \to 0$.
(Ad 5) If $F(u) = 0$, $u \in \overline{\operatorname{co}}\, N_a$ implies $u = 0$, then $F \ne 0$ on $N_a$. However,
$N_a$ is connected; consequently, $F > 0$ or $F < 0$ on $N_a$. Corollary 44.21 yields
$\chi_+ = \infty$ or $\chi_- = \infty$. The assertion now follows as in the proof of assertion
(4). □
44.7c Proof of Corollary 44.20.
Now, in contrast to Section 44.7b, use the weak Ljusternik maximum-minimum
principle (Proposition 44.3), where $\mathcal{K}$ is the class of singleton
subsets of $N_a$.
44.7d Proof of Proposition 44.17
(Ad i) Apply the strong Ljusternik maximum-minimum principle
(Proposition 44.6). The L-S deformations are obtained by virtue of Lemma 44.29
and Corollary 44.30 when $F'$ is uniformly continuous on the ball $\{u \in X :
\|u\| \le r\}$.
If this condition is not fulfilled, then one must construct the L-S
deformations more carefully (see Problem 49.7).
(Ad ii) Use the classes $\mathcal{K}_m^{\pm}$ as in (18) and follow a line of reasoning
analogous to (i). □
44.8. The Main Theorem for Eigenvalue Problems in
Finite-Dimensional B-Spaces
For fixed a > 0, we consider the eigenvalue problem
$F'(u) = \lambda G'(u)$, $u \in N_a$, $\lambda \in \mathbb{R}$, (23)
where $N_a \overset{\mathrm{def}}{=} \{u \in X : G(u) = a\}$, with the following assumptions:
(H1) $F, G : X \to \mathbb{R}$ are functionals on the real finite-dimensional B-space
X, where $F, G \in C^1(X, \mathbb{R})$ and $G(0) = 0$.
(H2) For each $u \ne 0$, $\langle G'(u), u \rangle > 0$ and there exists a number $r(u) > 0$
such that $G(r(u)u) = a$, i.e., each ray through the origin intersects $N_a$.
It easily follows that $u \in N_a$ implies $u \ne 0$. If $F'(u) \ne 0$ holds on $N_a$, then
all eigenvalues $\lambda$ in (23) are different from zero. We set
$c_m \overset{\mathrm{def}}{=} \sup_{K \in \mathcal{K}_m} \inf_{u \in K} F(u)$, $m = 1, \ldots, \dim X$.
Let $\mathcal{K}_m$ be the class of all compact symmetric sets $K$ in $N_a$ with $\operatorname{gen} K \ge m$.
According to Lemma 44.28, there exists an odd homeomorphism from the
unit sphere $S$ in X onto $N_a$. Then Proposition 44.10 and Corollary 44.12, (1)
immediately yield $\mathcal{K}_m \ne \emptyset$ for $m = 1, \ldots, \dim X$. Furthermore, from the
compactness of $N_a$ and the continuity of $F$, it follows that $-\infty < c_m < \infty$.
Theorem 44.B. With the assumptions (HI) and (//2), the following two
assertions hold:
(1) (23) has an eigensolution (u,X).
(2) If F and G are even, then there exist at least dim X distinct eigenvector
pairs (u,-u) in (23).
When cm = cm+1 = ■ ■ • = cm+p,p ^1, the genus of the set of all eigenvectors
u of (23) such that F(u) = cm is greater than or equal to p +1. In particular,
there then exist infinitely many eigenvectors for (23).
Proof. (1) On the compact set $N_a$, $F$ attains a maximum, and to this maximum
there corresponds a critical point of $F$ on $N_a$. Furthermore, $G$ is a submersion
at each point $u \in N_a$. Now Proposition 43.21 yields the assertion.
(2) Since $N_a$ is compact, $F$ satisfies the condition $(PS)_c$ with respect to $N_a$
for all $c \in \mathbb{R}$ and not only for $c \ne 0$ as under the assumptions of Theorem
44.A in Section 44.5. Therefore, the existence assertion for L-S deformations
in Corollary 44.30 holds for all $c \in \mathbb{R}$. Now Proposition 44.6, with $\operatorname{ind} K =
\operatorname{gen} K$, yields the assertion. □
Since $\dim X < \infty$, (23) is an eigenvalue problem for a
system of nonlinear equations. We have already formulated a special case in
Proposition 37.59.
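To see what Theorem 44.B says in the most familiar situation, the following sketch works out the linear special case. The symmetric matrix $A$, its eigenvalues $\lambda_1 \ge \dots \ge \lambda_n$, and the orthonormal eigenvectors $e_1, \dots, e_n$ are assumptions introduced only for this illustration; the identification of $c_m$ rests on the classical Courant-Fischer maximum-minimum principle.

```latex
% Illustrative special case (assumption): X = R^n with inner product (.|.),
% A a real symmetric n x n matrix, A e_m = \lambda_m e_m, \lambda_1 \ge ... \ge \lambda_n.
\[
  F(u) = \tfrac12 (Au \mid u), \qquad G(u) = \tfrac12 (u \mid u), \qquad
  F'(u) = Au, \qquad G'(u) = u .
\]
% (H1): F, G are C^1 and G(0) = 0.
% (H2): (G'(u) | u) = |u|^2 > 0 for u \ne 0, and r(u) = \sqrt{2a}/|u| gives G(r(u)u) = a.
\[
  N_a = \{\, u \in \mathbb{R}^n : |u|^2 = 2a \,\}, \qquad
  F'(u) = \lambda G'(u) \iff Au = \lambda u .
\]
% F and G are even, so assertion (2) predicts n eigenvector pairs; here they are explicit:
\[
  u_m = \pm \sqrt{2a}\, e_m , \qquad
  F(u_m) = \tfrac12 \lambda_m |u_m|^2 = a \lambda_m , \qquad m = 1, \dots, n .
\]
% The sphere of radius \sqrt{2a} in span{e_1,...,e_m} has genus m, and every K with
% gen K \ge m meets span{e_m,...,e_n}; hence the critical levels are
\[
  c_m = \sup_{K \in \mathcal{K}_m} \inf_{u \in K} F(u) = a \lambda_m .
\]
```

If $\lambda_m = \lambda_{m+1}$ in this example, the eigenvectors on the level $c_m$ fill at least a circle in $N_a$, which agrees with the genus statement of the theorem.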
44.9. Application to Eigenvalue Problems for
Quasilinear Elliptic Differential Equations
As an application of the results in Section 44.6, we consider the classical
boundary eigenvalue problem
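$$\Omega\colon\ -\lambda \sum_{i=1}^{N} D_i\bigl(|D_i u(x)|^{p-2} D_i u(x)\bigr) = g'(u(x)), \qquad \partial\Omega\colon\ u = 0. \tag{24}$$
(H1) Let $\Omega$ be a bounded region in $\mathbb{R}^N$, $N \ge 1$. Furthermore, let $p \ge 2$. We
set $x = (\xi_1, \dots, \xi_N)$, $D_i = \partial/\partial\xi_i$.
(H2) $g\colon \mathbb{R} \to \mathbb{R}$ is continuously differentiable, with $g(0) = 0$ and $g'(u)u > 0$
for all real numbers $u \ne 0$. There exist constants $c, d > 0$ such that the
following growth condition holds for all $u \in \mathbb{R}$:
$$|g(u)| \le c(1 + |u|^p), \qquad |g'(u)| \le d(1 + |u|^{p-1}).$$
Definition 44.34. Let $X = \mathring{W}_p^1(\Omega)$. The generalized problem for (24) reads as
follows: We seek $u \in X$, $\lambda \in \mathbb{R}$ such that
$$\lambda b(u,v) = a(u,v) \quad \text{for all } v \in X, \qquad G(u) = a \tag{24a}$$
for fixed $a > 0$. Here,
$$G(u) = p^{-1}\int_\Omega \sum_{i=1}^{N} |D_i u|^p\,dx, \qquad F(u) = \int_\Omega g(u)\,dx,$$
$$b(u,v) = \int_\Omega \sum_{i=1}^{N} |D_i u|^{p-2} D_i u\, D_i v\,dx,$$
$$a(u,v) = \int_\Omega g'(u)v\,dx.$$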
(24a) results formally from (24) by multiplication by $v \in C_0^\infty(\Omega)$ and
subsequent integration by parts.
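The formal step just mentioned can be written out as follows. This is only a sketch under the assumption that $u$ is smooth enough for the integration by parts to be justified; the symbols are those of (24) and (24a).

```latex
% Multiply (24) by a test function v \in C_0^\infty(\Omega) and integrate over \Omega:
\[
  -\lambda \int_\Omega \sum_{i=1}^{N} D_i\bigl(|D_i u|^{p-2} D_i u\bigr)\, v \,dx
  = \int_\Omega g'(u)\, v \,dx .
\]
% Integrate by parts in each term; the boundary integral vanishes because v = 0 near \partial\Omega:
\[
  \lambda \int_\Omega \sum_{i=1}^{N} |D_i u|^{p-2} D_i u \, D_i v \,dx
  = \int_\Omega g'(u)\, v \,dx ,
  \qquad\text{i.e.}\qquad \lambda\, b(u,v) = a(u,v) .
\]
% The same bilinear expressions arise as derivatives of G and F, since
% \frac{d}{ds}\bigl(p^{-1}|s|^p\bigr) = |s|^{p-2}s for p \ge 2:
\[
  \frac{d}{dt}\, G(u + tv)\Big|_{t=0} = b(u,v), \qquad
  \frac{d}{dt}\, F(u + tv)\Big|_{t=0} = a(u,v) .
\]
```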
Proposition 44.35. With the assumptions (H1) and (H2), the following two
assertions hold:
(1) (24a) has an eigensolution $(u, \lambda)$, with $u \ne 0$, $\lambda > 0$.
(2) If $g$ is even, then (24a) has infinitely many eigensolutions $(u_m, \lambda_m)$, with
$u_m \ne 0$, $\lambda_m > 0$ for all $m \in \mathbb{N}$ such that $u_m \rightharpoonup 0$ (weakly) in $X$ as well as
$\lambda_m \to 0$ as $m \to \infty$.
Proof. According to Proposition 42.16, $F$ and $G$ are continuously
F-differentiable on $X$, where $\langle G'(u), v\rangle = b(u, v)$ and $\langle F'(u), v\rangle = a(u, v)$ for
all $u, v \in X$. Therefore, (24a) is equivalent to $F'(u) = \lambda G'(u)$, $u \in X$, $\lambda \in \mathbb{R}$,
$G(u) = a$. By Proposition 42.16 and Corollary 42.17, $F'$ is strongly
continuous, $G'$ is continuous, uniformly monotone, and bounded. (H2) yields
$\langle F'(u), u\rangle > 0$ for all $u \ne 0$ in $X$.
The assertion now follows from Proposition 44.26. □
By applying Proposition 42.16, one can treat essentially more general
boundary eigenvalue problems than (24) in a manner analogous to the
above with the aid of Proposition 44.26. In this connection, also compare
Problem 44.9.
44.10. Application to Eigenvalue Problems for
Abstract Hammerstein Equations with
Symmetric Kernel Operators
We consider the eigenvalue problem
$$u = \mu KF(u), \qquad u \in X^*,\ \mu \in \mathbb{R} \tag{25a}$$
together with the normalization condition
$$\langle u, w\rangle_X = a \quad \text{for all } w \in K^{-1}(u) \tag{25b}$$
for fixed $a > 0$. For a solution of (25), we always have $u \ne 0$, $\mu \ne 0$.
Theorem 44.C (Amann (1972)). Suppose that the following three conditions
are satisfied:
(i) $X$ is a real reflexive separable B-space.
(ii) $K\colon X \to X^*$ is linear, compact, monotone, and symmetric.
(iii) $F\colon X^* \to X$ is a continuous potential operator with potential $\varphi$. Here,
$\varphi(0) = F(0) = 0$ and $\varphi(u) \ne 0$, $KF(u) \ne 0$ for all $u$ in $X^*$, $u \ne 0$.
Then:
(1) (25) has an eigenvector.
(2) If $F$ is odd, then at least $m$ distinct eigenvector pairs $(u, -u)$ belong to
(25), where $m = \dim K(X)$.
When $m = \infty$, there exist infinitely many distinct characteristic numbers $\mu_n$
such that $\mu_n^{-1} \to 0$ as $n \to \infty$.
Proof. Our goal is to reduce (25a) to the problem
$$u = \mu S^*SF(u), \qquad u \in X^*,\ \mu \in \mathbb{R} \tag{26}$$
and
$$G'(v) = \mu \Phi'(v), \qquad v \in H,\ \mu \in \mathbb{R},\ G(v) = \frac{a}{2}, \tag{27}$$
where
$$\Phi(v) \overset{\mathrm{def}}{=} \varphi(S^*v), \qquad G(v) \overset{\mathrm{def}}{=} 2^{-1}(v \mid v) \quad \text{for all } v \in H.$$
We then apply Theorems 44.A and 44.B to (27).
(I) Equation (26). According to Proposition 28.1, there exist a real
separable H-space $(H, (\cdot \mid \cdot))$ and a linear compact mapping $S\colon X \to H$,
where $S^*\colon H \to X^*$ is injective and $K = S^*S$, $\overline{S(X)} = H$.
From $K = S^*S$ it follows that $K(X) \subseteq S^*(H)$. Furthermore, $S^*(H)
\subseteq \overline{K(X)}$; for, because $\overline{S(X)} = H$ and $v \in H$, from $w = S^*(v)$ it follows that
there exists a sequence $(u_n)$ such that $v_n = Su_n \to v$; hence, $Ku_n = S^*Su_n \to w$.
Consequently, $\dim H = \dim S^*(H) = \dim K(X)$.
(II) Equation (27). If we set $v = S^{*-1}u$, then (26) is equivalent to
$v = \mu SF(S^*v)$, $v \in H$, $\mu \in \mathbb{R}$. Since $\varphi' = F$, this is equivalent to (27) with
$\Phi' = SFS^*$. Note that $\langle \Phi'(v), w\rangle = \langle \varphi'(S^*v), S^*w\rangle = (SF(S^*v) \mid w)$.
$\Phi'\colon H \to H$ is strongly continuous; for, if $S$ is linear and compact then so
is $S^*$ which, therefore, is also strongly continuous, and $F$ is continuous.
From $\Phi(v) \ne 0$ it follows that $S^*v \ne 0$; hence $KF(S^*v) \ne 0$, i.e., $\Phi'(v) \ne 0$.
Furthermore, $\Phi(v) = 0$ if and only if $v = 0$, by (iii).
(III) Existence. Corollary 44.20 and Theorem 44.A, (5) in Section 44.5
(respectively, Theorem 44.B in Section 44.8) guarantee, for each $a > 0$, the
existence of an eigenvector $v$ of (27) (respectively, $\dim H$ distinct eigenvector
pairs $(v, -v)$ for odd $\Phi'$) satisfying the condition concerning the limiting
value formulated in assertion (2).
(IV) Inverse transformation. If $v$ is a solution of (27), then $u = S^*v$ is a
solution of (25a). Furthermore, (25b) also holds, for it follows from $u \in R(K)$
that $u = Kw = S^*Sw$ for some $w$; hence, $v = Sw$ and thus
$$a = (v \mid v) = (Sw \mid Sw) = \langle Kw, w\rangle = \langle u, w\rangle$$
for all $w \in K^{-1}(u)$. □
44.11. Application to Hammerstein Integral Equations
We have already formulated the applications of Theorem 44.C in Section
44.10 to concrete integral equations in Corollary 41.11.
44.12. The Mountain Pass Theorem
To conclude this chapter, we treat an important existence principle for a
free critical point which is very intuitive. Our assumptions read as follows:
(H1) Let $X$ be a real B-space. The functional $F\colon X \to \mathbb{R}$ is continuously
F-differentiable and satisfies (PS).
(H2) There exist positive constants $R$ and $a$ such that $F(u) \ge a$ for all
$u \in X$ with $\|u\| = R$.
(H3) There exists a point $u_1 \in X$ with $\|u_1\| > R$ and $F(u_1), F(0) < a$.
(H4) We denote by $\mathcal{K}$ the set of all continuous mappings $p\colon [0,1] \to X$ with
$p(0) = 0$ and $p(1) = u_1$.
Furthermore, we set
$$c \overset{\mathrm{def}}{=} \inf_{p \in \mathcal{K}} \sup_{0 \le t \le 1} F(p(t)).$$
If $X = \mathbb{R}^2$, then we can think of $F(u)$ as the height of a mountainous
landscape at the point $u$. We shall designate the points $u$ with $\|u\| = R$ as a
mountain chain $S$. Then, by (H3), valleys occur at the points $u = 0$ and
$u = u_1$. To each $p$ there corresponds a path which connects the two valleys
over the mountain chain $S$. Intuitively, one now expects that there exists a
saddle point of our landscape at height $c$.
Theorem 44.D (Ambrosetti and Rabinowitz (1973)). If (H1)-(H4) hold,
then $F$ possesses a critical point $u$, with $F(u) = c$, $c \ge a$.
We give the proof in Problem 49.10a in connection with the general
linking principle. This proof follows from the Ljusternik weak maximum-
minimum principle in Section 44.2a. In this connection, one obtains the
required L-S deformations from a general result concerning such
deformations which we furnish in Problem 49.6. In the problems in Chapter 49 we
give an overview of various methods and principles for the construction of
free critical points. Applications of Theorem 44.D to nonlinear elliptic
and hyperbolic partial differential equations can be found in Ambrosetti
and Rabinowitz (1973), Brezis, Coron, and Nirenberg (1980), and Chow and
Hale (1982, M).
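A small worked example may help to make the quantities $R$, $a$, $u_1$, and $c$ concrete. The functional below is an assumption chosen for illustration only (it is not taken from the text); it is a sketch of how the hypotheses of Theorem 44.D can be checked by hand in $X = \mathbb{R}^2$.

```latex
% Illustrative example (assumption): X = R^2 with the Euclidean norm.
\[
  F(u) = \tfrac12 \|u\|^2 - \tfrac14 \|u\|^4 , \qquad
  F'(u) = \bigl(1 - \|u\|^2\bigr)\, u .
\]
% (H1): F is C^1, and \|F'(u)\| = \|u\| \, |1 - \|u\|^2| \to \infty as \|u\| \to \infty,
%       so every sequence with F'(u_n) \to 0 is bounded; in finite dimensions this gives (PS).
% (H2): R = 1, a = 1/4, because F(u) = 1/2 - 1/4 = 1/4 whenever \|u\| = 1.
% (H3): u_1 = (2,0) has \|u_1\| = 2 > R, F(u_1) = 2 - 4 = -2 < a, and F(0) = 0 < a.
\[
  c = \inf_{p \in \mathcal{K}} \sup_{0 \le t \le 1} F(p(t)) = \tfrac14 .
\]
% Lower bound: every path from 0 to u_1 crosses the circle \|u\| = 1, where F = 1/4.
% Upper bound: along p(t) = t u_1 the function r \mapsto r^2/2 - r^4/4, 0 \le r \le 2,
%              has its maximum 1/4 at r = 1.
% Critical points at level c: every u with \|u\| = 1 satisfies F'(u) = 0 and F(u) = 1/4 = c \ge a.
```

In this example the critical level coincides with the height of the mountain chain, so $c = a$; the theorem only asserts $c \ge a$.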
Problems
We consider general results of the Morse theory and the Ljusternik-Schnirelman
theory on infinite-dimensional manifolds in Problems 44.12 and 44.13. Additional
material concerning the Ljusternik-Schnirelman theory can be found in the
Problems to Chapter 49, in connection with the construction of free critical points.
44.1. Proof of Proposition 44.1. Compare Zeidler (1979), page 185.
44.2. Proof of Corollary 44.12. Solution:
(1) Compare (A4).
(2) Use $K_2 \subseteq K_1 \cup \overline{K_2 - K_1}$ and (A3).
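(3) Let $\dim X = n$, $K \ne \emptyset$. If we identify $X$ with $\mathbb{R}^n$, then, since $0 \notin K$,
relation (10) holds with $f = I$; hence, $\operatorname{gen} K \le n$.
(4) Suppose $K \cap (I - P)(X) = \emptyset$. Then $0 \notin P(K)$. The operator $P\colon K \to
X_1 - \{0\}$ is odd and continuous. If we identify $X_1$ with $\mathbb{R}^m$, then
$\operatorname{gen} K \le m$, in contradiction to $\operatorname{gen} K > m$.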
44.3.* A property of gen K. Show: For some set $M$ and each $m \in \mathbb{N}$, there exists
a compact symmetric set $K_m$ such that $K_m \subseteq S - M$ and $\operatorname{gen} K_m \ge m$,
provided the following two conditions hold:
(i) $S$ is the unit sphere in the real B-space $X$ with $\dim X = \infty$. The set $M$
is a subset of $S$.
(ii) There exists a closed linear subspace $X_1$ of $X$ such that $\dim(X/X_1) =
\infty$ and $\operatorname{dist}(u, X_1) < \eta$ for all $u \in M$ and for fixed $\eta \in\, ]0,1[$.
If $M$ is compact, then (ii) follows from (i). Hint: Compare Zeidler
(1980), page 457. Use the Michael selection theorem (cf. Problem 9.3).
The proof is elementary in an H-space. In this special case, one applies
the orthogonal projection operator of Xonto X±- and Proposition 44.1'•
44.4. Direct proofs of the main results for special cases.
44.4a. n-dimensional spheres. In order to convince oneself whether one has
understood the simple basic ideas of the Ljusternik-Schnirelman theory,
one should give a direct proof of Proposition 37.59 using all possible
simplifications: An even $C^1$-functional $F\colon \mathbb{R}^{n+1} \to \mathbb{R}$ has at least $n$ pairs
$(u, -u)$ of critical points on the $n$-dimensional unit sphere $S^n$ that
correspond to the eigenvectors of $F'(u) = \lambda u$, $u \in S^n$.
Solution: Make use of Proposition 44.6. Construct the L-S
deformations needed in (5) as in Lemma 44.29 and Corollary 44.30, taking into
consideration the essential simplifications that appear. Proposition 43.21
yields the connection between critical points and eigenvectors.
44.4b. Infinite-dimensional spheres. Explicitly verify that the proof of Theorem
44.A in Section 44.5 becomes especially simple when $X$ is an H-space
and $G(u) \overset{\mathrm{def}}{=} 2^{-1}(u \mid u)$, i.e., $N_a$ is a sphere.
44.5.* Comparison with the linear case; optimality of the main theorem. Prove
(18a) using Theorem 44.A, (1) in Section 44.5. Hint: Compare Zeidler
(1979), page 202. Furthermore, one should convince oneself of the
optimality of the main theorem, emphasized in Remark 44.22. Compare
Zeidler (1980), Theorem 1.
44.6. Proof of Lemma 44.28. Solution: (1) By (H4), $N_a$ is bounded. The
functional $G$ is continuous at the zero point and $G(0) = 0$.
(2) Let $\varphi(t, u) \overset{\mathrm{def}}{=} G(tu)$. By (H4), $\varphi_t(t, u) = \langle G'(tu), u\rangle > 0$ for $u \ne 0$,
$t > 0$. Therefore, $t \mapsto \varphi(t, u)$ is strictly monotonically increasing on $[0, \infty[$
and $\varphi(t, u) \to +\infty$ as $t \to +\infty$ and $u \ne 0$. Thus the equation $G(tu) = a$
has exactly one solution $t = r(u) > 0$ for $u \ne 0$.
(3) By the chain rule in Section 4.3, $\varphi$ is continuously F-differentiable.
Since $\varphi(r(u), u) = a$ and $\varphi_t(r(u), u) > 0$ for $u \ne 0$, the implicit function
theorem (Theorem 4.B in Section 4.7) assures the continuous
F-differentiability of $r$ on $X - \{0\}$; thus
$$0 = G(r(u)u)' = \langle G'(r(u)u), u\rangle\, r'(u) + r(u)\, G'(r(u)u).$$
(21) follows. Since $G$ is even, $r$ is even. Consequently, $r'$ is odd.
(4) By assertion (1), $r$ is bounded on bounded sets which lie outside
some neighborhood of zero. Then, by (21), (H3) and (H4), $r'$ has the
same property. Let $\rho > 0$ be fixed. For all $u, v \in X$ such that $\|u\|, \|v\| > \rho$,
$\|u - v\| \le \rho/2$, the mean value theorem yields
$$|r(u) - r(v)| \le |\langle r'(u + \vartheta(v - u)), v - u\rangle| \le \|r'(u + \vartheta(v - u))\|\, \|v - u\|$$
for suitable $\vartheta \in\, ]0,1[$. Since $\|u + \vartheta(v - u)\| > \rho/2$, $r$ is uniformly
continuous on bounded sets of $\{u \in X : \|u\| > \rho\}$.
Then, according to (21), (H3), and (H4), $r'$ has the same property.
(5) The mapping inverse to $u \mapsto r(u)u$ from $S$ onto $N_a$ is $v \mapsto \|v\|^{-1}v$
(see Fig. 44.2 in Section 44.7a).
44.7.* Proof of Lemma 44.32 for B-spaces. Use the weak topology and a
partition of unity. Compare Dancer (1976).
44.8.** Perturbed eigenvalue problems and stable critical points. We consider
$$F'(u) + \varepsilon F_1'(u) = \lambda u, \qquad u \in X,\ \lambda \in \mathbb{R},\ \|u\| = 1. \tag{28}$$
Show: For each $n \in \mathbb{N}$, there exists an $\varepsilon_0(n) > 0$ such that (28) has at
least $n$ eigenvectors for each $\varepsilon$, $|\varepsilon| < \varepsilon_0(n)$, provided the following three
conditions hold:
(i) $F, F_1\colon X \to \mathbb{R}$ are $C^1$-functionals on the real separable H-space $X$,
where $\dim X = \infty$.
(ii) $F'$, $F_1'$ are strongly continuous.
(iii) $F$ is even, with $F(0) = 0$ and $F(u) \ne 0$, $F'(u) \ne 0$ for $u \ne 0$.
The perturbation $F_1$ need not be even. Consequently, $F'$ is odd, but $F_1'$
need not be odd. Hint: Compare Krasnoselskii and Zabreiko (1975, M),
Section 57.3. Generalizations to indefinite problems with applications to
partial differential equations can be found in Zeidler (1980) (cf. Problem
44.9). There, perturbed Hammerstein equations
$$\lambda u = K(F_1 u + \varepsilon F_2 u), \tag{29}$$
together with applications to Hammerstein integral equations are also
considered.
44.9. Indefinite eigenvalue problems for quasilinear elliptic differential equations.
Parallel to Section 44.9, we consider the boundary eigenvalue problem
$$\Omega\colon\ \lambda(-\Delta u + g'(u)) = \psi(x)u + \varepsilon f_1'(u); \qquad \partial\Omega\colon\ u = 0. \tag{30}$$
In the Sobolev space $\mathring{W}_2^1(\Omega)$, we write (30) in the form
$$\lambda G'(u) = F'(u) + \varepsilon F_1'(u), \tag{31}$$
where
$$G(u) = \int_\Omega \Bigl[\, 2^{-1}\sum_{i=1}^{N} (D_i u)^2 + g(u) \Bigr]dx,$$
$$F(u) = 2^{-1}\int_\Omega \psi u^2\,dx, \qquad F_1(u) = \int_\Omega f_1(u)\,dx.$$
Then
$$\langle G'(u), u\rangle = \int_\Omega \Bigl[\, \sum_{i=1}^{N} (D_i u)^2 + g'(u)u \Bigr]dx.$$
Investigate this problem for a bounded region $\Omega$ in $\mathbb{R}^N$. Indefinitenesses
of $F'$ and $G'$ arise by the change of sign in $\psi$ and violation of the
condition
$$g'(u)u \ge 0 \quad \text{for all } u \in \mathbb{R}. \tag{32}$$
If (32) holds, then $\langle G'(u), u\rangle > 0$ for all $u \in \mathring{W}_2^1(\Omega)$, where $u \ne 0$.
Case 1: Let $g = 0$, $\varepsilon = 0$, $\psi \in C(\overline{\Omega})$. Apply Theorem 44.A in Section
44.5 and show: If $\psi(x) \ne 0$ for some $x \in \Omega$, then (31) has infinitely many
eigenvectors in $\mathring{W}_2^1(\Omega)$ such that $G(u) = a$ for fixed $a > 0$.
Besides (32), what conditions does one need in order to obtain this
assertion for $g \ne 0$ as well?
Case 2: Let $g = 0$, $\varepsilon \ne 0$, $\psi \in C(\overline{\Omega})$.
Apply Problem 44.8. Generalizations can be found in Zeidler (1980).
Case 3: Let $\varepsilon > 0$ and suppose that (32) is violated, e.g., let $g'(u) = mu$,
where $m < 0$.
Then $\langle G'(u), u\rangle > 0$ does not necessarily hold for $G(u) = a$, i.e., the
level set $N_a = \{u : G(u) = a\}$ does not have to behave approximately like
a sphere (cf. Problem 44.10).
Parallel to Section 42.7, consider more general nonlinear equations
instead of (30). Hint: Compare Zeidler (1979a), (1980).
44.10.* Eigenvalue problems with unbounded level sets. Study (31) with $\varepsilon = 0$ in
Krasnoselskii (1956, M), Chapter 6 and in Zeidler (1979a) for the case
when the level set $N_a$ is a hyperboloid or, more generally, unbounded and
$F'$ is indefinite.
44.11.* Ljusternik-Schnirelman theory and the Galerkin method. The proof of the
main theorem (Theorem 44.A in Section 44.5) is based on the
investigation of
$$\pm c_m^{\pm} = \sup_{K \in \mathcal{K}_m^{\pm}} \inf_{u \in K} \pm F(u) \tag{33}$$
with the aid of L-S deformations of the set $K$ on the level set $N_a$. If $N_a$
does not have sufficiently regular properties, then the construction of
these L-S deformations is difficult, and we need another method. The
idea of the Galerkin method due to Browder consists in using an
increasing sequence of finite-dimensional subspaces $X_1 \subseteq X_2 \subseteq \dots \subseteq X$
and considering in (33) only $K$ from $\mathcal{K}_m^{\pm}$ such that $K \subseteq X_k$. Then,
instead of $c_m^{\pm}$, we get $c_{m,k}^{\pm}$. Now we can apply the main theorem in
finite-dimensional B-spaces (Theorem 44.B in Section 44.8) to these
modified problems. By means of an approximation argument, one shows
that $c_{m,k}^{\pm} \to c_m^{\pm}$ as $k \to \infty$. One then proves the convergence of the
eigensolutions as $k \to \infty$ with the aid of Section 43.5. In particular, from
this it follows that the eigenvectors of $F'(u) = \lambda G'(u)$ exist, where
$F(u) = c_m^{\pm}$. Furthermore, analogous to the proof of Theorem 44.A in
Section 44.5, it follows that $c_m^{\pm} \to 0$ as $m \to \infty$. It then follows that there
exist infinitely many eigenvectors on $N_a$.
Use this idea to prove the weakening of the continuity assumptions on
$G'$ given in Remark 44.23. Hint: Compare Zeidler (1980), page 477.
44.12. Morse theory. We generalize several of the important propositions in
Section 37.27a in $\mathbb{R}^N$ to H-spaces and B-spaces.
44.12a.* Morse lemma on normal forms. Let $F\colon U(0) \subseteq X \to \mathbb{R}$ be a $C^r$-functional,
$r \ge 3$, on a neighborhood of zero of the B-space $X$. Moreover, let $u = 0$
be a nondegenerate critical point of $F$, i.e., $F'(0) = 0$ and $(h, k) \mapsto
d^2F(0; h, k)$ is not degenerate. Show:
$$F(\varphi(v)) = F(0) + 2^{-1} d^2F(0; v, v) \quad \text{for all } v \in V(0), \tag{34}$$
i.e., to be precise, there exists a neighborhood of zero, $V(0)$, in $X$ and a
$C^{r-2}$-diffeomorphism $\varphi\colon V(0) \to U(0)$, with $\varphi(0) = 0$, $\varphi'(0) = I$ (identity),
such that (34) holds.
Hint: Cf. Section 73.12.
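For orientation, the normal form can be computed explicitly in one dimension. The function below is an assumption chosen for illustration; the auxiliary map $\psi$ is a name introduced here and does not occur in the text.

```latex
% Illustrative example (assumption): X = R, F(u) = u^2 + u^3 near u = 0.
\[
  F'(0) = 0, \qquad d^2F(0; h, k) = 2hk \quad (\text{nondegenerate}), \qquad
  F(0) + 2^{-1} d^2F(0; v, v) = v^2 .
\]
% Define \psi on |u| < 1 so that \psi(u)^2 = u^2 (1 + u) = F(u):
\[
  \psi(u) = u\sqrt{1 + u}, \qquad
  \psi'(u) = \sqrt{1 + u} + \frac{u}{2\sqrt{1 + u}}, \qquad \psi(0) = 0,\ \psi'(0) = 1 .
\]
% By the inverse function theorem, \psi has a smooth local inverse \varphi = \psi^{-1}
% with \varphi(0) = 0, \varphi'(0) = 1, and
\[
  F(\varphi(v)) = \psi(\varphi(v))^2 = v^2 \qquad \text{for all } v \text{ near } 0 ,
\]
% which is (34) for this F.
```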
44.12b.** Morse inequalities for the number of critical points. Show: If $M_i$ denotes
the number of critical points of $F$ with Morse index $i$, then the estimates
$$M_0 \ge 1, \qquad M_1 - M_0 \ge -1, \qquad \sum_{i=0}^{k} (-1)^{k-i} M_i \ge (-1)^k, \quad k = 2, 3, \dots,$$
$$\sum_{i=0}^{\infty} (-1)^i M_i = 1$$
hold provided the following assumptions are fulfilled:
(i) $F\colon X \to \mathbb{R}$ is a $C^2$-functional on the H-space $X$.
(ii) All critical points are nondegenerate and all $M_i$ are finite.
(iii) $\inf_{u \in X} F(u) > -\infty$, and $F$ satisfies (PS), i.e., for all $c \in \mathbb{R}$, from
$F(u_n) \to c$, $\|F'(u_n)\| \to 0$ as $n \to \infty$ it follows that $(u_n)$ has a
convergent subsequence.
Hint: Compare Berger (1977, M), page 361. The assertion follows from
more general topological results. General Morse estimates for functionals
satisfying (PS) on complete Hilbert manifolds can be found in Schwartz
(1969, L), Theorem 4.89 and in Rothe (1973).
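A one-dimensional check of these estimates is sketched below. The double-well function is an assumption chosen for illustration, and the inequalities are read here in the alternating-sum form $\sum_{i=0}^{k} (-1)^{k-i} M_i \ge (-1)^k$.

```latex
% Illustrative example (assumption): X = R, F(u) = (u^2 - 1)^2.
\[
  F'(u) = 4u(u^2 - 1), \qquad F''(u) = 12u^2 - 4 .
\]
% Critical points: u = -1, 0, 1, all nondegenerate.
%   F''(\pm 1) = 8 > 0  -> Morse index 0 (two minima),
%   F''(0) = -4 < 0     -> Morse index 1 (one maximum).
\[
  M_0 = 2, \qquad M_1 = 1, \qquad M_i = 0 \ \text{for } i \ge 2 .
\]
% Assumptions: inf F = 0 > -\infty; |F'(u)| \to \infty as |u| \to \infty, so (PS) holds.
\[
  M_0 = 2 \ge 1, \qquad M_1 - M_0 = -1 \ge -1, \qquad
  M_2 - M_1 + M_0 = 1 \ge 1, \qquad
  \sum_{i=0}^{\infty} (-1)^i M_i = 2 - 1 = 1 .
\]
```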
44.12c.* The Morse-Sard theorem in $\mathbb{R}^N$. Show:
(i) If $F\colon U \subseteq \mathbb{R}^N \to \mathbb{R}$ is a $C^N$-function on the open set $U$ in $\mathbb{R}^N$, then
the set of critical values of $F$ has measure zero.
(ii) If $F\colon U \subseteq \mathbb{R}^N \to \mathbb{R}^M$ is a $C^r$-function on the open set $U$ in $\mathbb{R}^N$, then
the set of critical values of $F$ has measure zero in $\mathbb{R}^M$ provided
$r > \max(N - M, 0)$.
In this connection, $c$ is called a critical value of $F$ if and only if there
exists a $u \in U$ such that $F(u) = c$ and $F'(u)\colon \mathbb{R}^N \to \mathbb{R}^M$ is not surjective.
(i) is a special case of (ii). Hint: Cf. Section 73.20 for a more general
result.
44.12d.* Generalization to infinite-dimensional spaces. Show:
(i) The set of critical values of $F\colon X \to \mathbb{R}$ is at most countable provided
the following holds: $F$ is analytic on the real separable B-space $X$
and $F'$ is a Fredholm mapping.
(ii) The set of critical values of $F\colon X \to Y$ is of first Baire category in $Y$
when the following holds: $X, Y$ are separable B-spaces, $F$ is a
$C^r$-Fredholm mapping and $r > \max(\operatorname{ind} F, 0)$. The set of regular
values of $F$ is open and dense in $Y$.
In this connection, $F\colon X \to Y$ is called a Fredholm mapping when
$F \in C^1(X, Y)$ and $F'(u)\colon X \to Y$ is a linear Fredholm operator for all
$u \in X$ (cf. Definition 8.13). Then $\operatorname{ind} F'(u)$ is constant for all $u \in X$, and
we simply denote it by $\operatorname{ind} F$.
Hint: (i) Compare Fucik, Necas, and Soucek (1973, L), page 150. (ii)
Cf. Section 78.8 for a more general result.
As a simple application, prove: For $F$ in (ii), with $\operatorname{ind} F < 0$, we have
$\operatorname{int} F(X) = \emptyset$, i.e., the problem $F(u) = v$ is ill posed. If there exists a
solution for $v_0$, then in each neighborhood of $v_0$ there exists a $v$ for which
there exists no solution. Hint: Compare Berger (1977, M), page 126.
44.12e.* Applications of Morse theory to quasilinear elliptic differential equations. In
this connection, consider the following problem given in Berger (1977,
M), page 363:
$$\Omega\colon\ \varepsilon^2 \Delta u + u - g(x)u^3 = 0; \qquad \partial\Omega\colon\ u = 0,$$
and study Skrypnik (1973, M), Chapter 5.
44.12f.* Application to the Ljusternik-Schnirelman theory. As an application of
Problem 44.12d, (i), prove the following result for the eigenvalue
problem
$$F'(u) = \lambda u, \qquad u \in X,\ \lambda \in \mathbb{R},\ \|u\|^2 = r \tag{35}$$
for fixed $r > 0$: The set $T$ of the values of $F$ that correspond to
eigenvectors of (35) is at most countable; to be precise, $T - \{0\}$ consists
only of isolated points when the following conditions hold:
(a) $F\colon X \to \mathbb{R}$ is analytic on the real separable H-space $X$ and $F'$ is
strongly continuous.
(b) $F(0) = F'(0) = 0$; $u \ne 0$ implies $F'(u) \ne 0$, $F(u) > 0$.
When $\dim X = \infty$, then from Theorem 44.A in Section 44.5 for odd $F'$,
it follows that $T$ consists of precisely a countable number of values that
accumulate only at zero.
Hint: Compare Fucik, Necas, and Soucek (1973, L), page 161. There,
one also finds more general results.
44.12g. Connection with eigenvalues. Show: For $r = 2$, the set $T$ in Problem 44.12f
consists of exactly the eigenvalues $\lambda$ in (35) when $F'$ is positive
homogeneous, i.e., $F'(tu) = tF'(u)$ for all $t > 0$ and for all $u \in X$.
Solution: $F(u) = \int_0^1 (F'(tu) \mid u)\,dt = 2^{-1}(F'(u) \mid u)$. From (35) it
follows that $\lambda r = (F'(u) \mid u) = 2F(u)$.
44.13. Ljusternik-Schnirelman theory and category. Category is a topological
index which in very general cases yields a lower bound for the number of critical
points of a functional (see Problems 44.13f and 44.13g). The basic idea is
the same as in Section 44.2b. However, we do not postulate any
symmetry conditions. In preparation for the following, we first make available
absolute neighborhood extensors and Finsler manifolds.
44.13a. Absolute neighborhood extensors. A topological space Y is called an
absolute neighborhood extensor for metric spaces (in brief: ANE) if and
only if the following extension property holds for continuous mappings:
If $f\colon M \to Y$ is continuous on the closed set $M$ of a metric space $X$, then
there exists an open set $U$ such that $M \subseteq U \subseteq X$ and a continuous
extension $\bar f\colon U \to Y$ of $f$.
(i) Example. Every normed or, more generally, every locally convex
space is an ANE. This is the Tietze-Dugundji theorem (Proposition 2.1).
(ii) Borsuk's homotopy extension theorem. One of the most important
properties of ANE's is the possibility of extending a homotopy, i.e., a
continuous mapping
$$H\colon M \times [0,1] \to Y, \qquad M \subseteq X,$$
to a continuous mapping
$$\bar H\colon X \times [0,1] \to Y.$$
The precise assumptions are: $Y$ is an ANE, $M$ is a closed set in the
metric space $X$, and $\bar H(\cdot, 0)\colon X \to Y$ is given as a continuous extension of
$H(\cdot, 0)\colon M \to Y$. Hint: Follow a line of reasoning similar to that in the
proof of Theorem 16.A in Section 16.2. Compare Borsuk (1966, M), page
94. This monograph is a standard work for retracts and extensors.
44.13b. Finsler manifolds. A $C^1$-manifold $M$ modelled on a B-space $X$ is called a
Finsler manifold if and only if the following assertions hold:
(i) On every tangent space $TM_u$, a norm $\|\cdot\|_u$ can be introduced that is
equivalent to the norm on $X$.
(ii) If $U(u)$ is a $u$-neighborhood on $M$ such that the points of the
tangent bundle $TM$ over $U(u)$ can be represented as $U(u) \times X$ (local
trivialization), then for each $k > 1$ there exists a smaller $u$-neighborhood,
$U_k(u)$, such that
$$k^{-1}\|h\|_v \le \|h\|_u \le k\|h\|_v \quad \text{for all } (v, h) \in U_k(u) \times X.$$
In connection with (i), one must note that we have indeed verified the
linear isomorphism $TM_u \cong X$ in Problem 43.4c. However, this isomorphism
depends on the local coordinate system and thus a norm is not
given a priori on $TM_u$ in an invariant way. On a Finsler manifold $M$, for
a continuously differentiable curve $x\colon [0,1] \to M$, one can define the
curve length by
$$\int_0^1 \|x'(t)\|_{x(t)}\,dt.$$
For two points $P_1, P_2$ in a component of $M$, we define the distance
$\rho(P_1, P_2)$ to be the infimum of the lengths of all continuously
differentiable curves that join $P_1$ and $P_2$.
(iii) Every component of $M$ with the metric $\rho(\cdot, \cdot)$ becomes a metric
space and we require that this metric induce the initial topology on $M$.
This holds, e.g., in the case where $M$ is a regular topological space. $M$ is
said to be complete as a metric space, provided each component of $M$
is complete.
Trivially, every B-space $X$ is itself a complete $C^\infty$-Finsler manifold.
Moreover, sufficiently regular surfaces in B-spaces are Finsler manifolds.
44.13c. Topological index. If $A$ and $B$ are subsets of a topological space $M$, then
$B$ is called a deformation of $A$ (in symbols: $A \sim B$) if and only if there
exists a continuous mapping $d\colon A \times [0,1] \to M$, i.e., a deformation with
$d(A, 1) = B$. The set $A$ is said to be contractible in $M$ if and only if
$A \sim \{u_0\}$ holds for a fixed $u_0$. In Fig. 44.3 we have simultaneously
drawn the deformation paths $t \mapsto d(u, t)$ for $u \in A$. For a topological
index $i(\cdot)$ on the topological space $M$, we require that to each closed set
$A$ there corresponds an integer $i(A)$, $0 \le i(A) \le \infty$, having the
following four properties:
(a) $i(\emptyset) = 0$, and $i(A) = 1$ when $A$ consists of a single point.
(b) $A \subseteq B$ or $A \sim B$ implies $i(A) \le i(B)$.
(c) $i(A \cup B) \le i(A) + i(B)$.
(d) For each compact set $A$ in $M$ there exists a neighborhood $U$ of $A$
such that $i(U) = i(A)$.
Figure 44.3
In particular, from (a) and (c) it follows that $A$ contains at least $i(A)$
points. By convention $A_1(8)$, all topological spaces are assumed to be
separated.
44.13d. Category. We define the category $\operatorname{cat}_M(A)$ for closed sets $A$ in the
topological space $M$ in the following way:
(1) $\operatorname{cat}_M A = 1$ if and only if $A$ is contractible in $M$.
(2) $\operatorname{cat}_M A$ is equal to the smallest number of first category sets that
cover $A$. In particular, $\operatorname{cat}_M \emptyset = 0$. Furthermore, $\operatorname{cat}_M A = \infty$ holds
when there exists no finite cover of that sort.
Show:
(i) $\operatorname{cat}_{\mathbb{R}^n} A = 1$ for a closed ball $A$ in $\mathbb{R}^n$, $n \ge 1$ (see Fig. 44.4).
(ii) $\operatorname{cat}_{S^{n-1}} S^{n-1} = 2$ for the boundary, $S^{n-1}$, of the unit ball in $\mathbb{R}^n$,
$n \ge 1$ (see Fig. 44.5).
(iii)* $\operatorname{cat}_{P^n} P^n = n + 1$ for the $n$-dimensional projective space $P^n$ which
arises from $S^n$, $n \ge 1$, by identifying antipodal points (see Fig.
44.6).
Figure 44.4
Figure 44.5
Figure 44.6
(iv) $\operatorname{cat}_{P^\infty} P^\infty = \infty$ for the infinite-dimensional projective space $P^\infty$
which arises from the boundary $S$ of the unit ball in an
infinite-dimensional B-space by identifying antipodal points.
(v)** $\operatorname{cat}_T T = 3$ where $T$ is a torus in $\mathbb{R}^3$.
Hint for the solution of (iii): The original proof of Schnirelman (1930)
used deep-lying algebraic topological tools. However, one can give a
completely elementary and very short proof using the
Ljusternik-Schnirelman-Borsuk covering theorem in Section 16.5. Compare
Tihomirov (1976, M), page 88. (iv) follows from (iii).
We briefly explain the significance of (iii) for the classical
Ljusternik-Schnirelman theory. Let $f\colon S^n \to \mathbb{R}$ be continuously
differentiable and even. According to Problem 44.13g, the number of critical
points of $f$ on $S^n$ is greater than or equal to the category of $S^n$; thus, by
(ii), it is greater than or equal to two. First of all, this result is trivial.
However, since $f$ is even, we can think of $f$ as a continuous mapping of
$P^n$ into $\mathbb{R}$. Then, by (iii), $f$ has at least $n + 1$ critical points to which $n + 1$
pairs $(u, -u)$ of critical points of $f$ on $S^n$ correspond.
Hint for the solution of (v): For $n$-dimensional $C^\infty$-manifolds, we have
$\operatorname{cat}_M M \ge 1 +$ cup length of $M$. Here, "cup length" is a concept from
cohomology theory. This yields (v). Compare Schwartz (1969, M),
Section 5.15.
From Problem 44.13g below, it thus follows that the number of critical
points of a $C^1$-function $f\colon T \to \mathbb{R}$ is at least equal to three. In an
elementary way, prove this assertion with the aid of Section 44.2a. Hint:
Compare Ljusternik and Schnirelman (1934, M), page 38.
44.13e. Properties of category. Show:
(i) Category has the properties (a)-(c) of a topological index in
Problem 44.13c.
(ii) If $M$ is a metric space and an ANE, and every point of $M$ has a
closed neighborhood which is contractible to this point, then (d)
also holds.
(iii) Category is maximal in the following sense: $i(A) \le \operatorname{cat}_M A$ holds for
each topological index in Problem 44.13c. The next problem shows
the significance of this maximality assertion.
Hint: (i) Compare Schwartz (1969, M), page 156. (ii) Compare
Tihomirov (1976, M), page 88 and Browder (1970a), page 13. Use the
homotopy extension theorem from Problem 44.13a. (iii) Let $\operatorname{cat}_M A = 1$.
Since $A \sim \{u_0\}$, then, by (b) and (a), we have $i(A) \le i(\{u_0\}) = 1$. Now
the assertion follows by using a cover and (c).
44.13f. Abstract main theorem on the smallest number of critical points. Parallel to
Section 44.2b, show that the functional $F\colon M \to \mathbb{R}$ has at least $i(M)$
critical points when the following four conditions hold:
(i) There exists a topological index on $M$ as in Problem 44.13c.
(ii) The L-S deformations (5) are defined for all $c \in \mathbb{R}$.
(iii) $\operatorname{crit}_{M,c} F$ is compact for all $c \in \mathbb{R}$.
(iv) The numbers
$$c_m = \sup_{K \in \mathcal{K}_m} \inf_{u \in K} F(u)$$
or
$$d_m = \inf_{K \in \mathcal{K}_m} \sup_{u \in K} F(u)$$
are finite for all $m = 1, 2, \dots$, with $1 \le m \le i(M)$. Here, $\mathcal{K}_m$ denotes
the class of all subsets $K$ of $M$ such that $i(K) \ge m$.
We tacitly assume that the concept of a critical point on $M$ is well
defined. Let $M$ be, say, a $C^1$-Banach manifold.
44.13g.* Concrete main theorem on the smallest number of critical points. Show:
The number of critical points of $F\colon M \to \mathbb{R}$ is greater than or equal to
$\operatorname{cat}_M M$ when the following hold:
(i) $F$ is a $C^1$-functional, $F$ satisfies (PS) and is bounded from above or
from below on $M$.
(ii) $M$ is a complete $C^2$-Finsler manifold.
Parallel to Section 44.4, the condition (PS) means that each sequence
$(u_n)$ on $M$ such that $\|TF(u_n)\| \to 0$, $F(u_n) \to c$ as $n \to \infty$, for
arbitrary $c \in \mathbb{R}$, has a subsequence which converges in the metric of $M$.
Here, $\|TF(u)\|$ is the operator norm of $TF(u)\colon TM_u \to \mathbb{R}$ which is
induced by the norm on the tangent space $TM_u$.
Hint: We use Problem 44.13f with $i(A) \overset{\mathrm{def}}{=} \operatorname{cat}_M A$ and first assume
that:
(H) All $c_m$, $m = 1, 2, \dots$, $1 \le m \le \operatorname{cat}_M M$, are finite.
Then the assertion follows easily from the assertion in Problem 44.13f
when we can construct the L-S deformations (5).
These L-S deformations are obtained by solving ordinary differential
equations on $M$ and from (PS). If the norms $\|\cdot\|_u$ on the tangent spaces
$TM_u$ are produced by inner products, then one can easily construct the
L-S deformations by solving
$$u'(t) = \alpha(\|TF(u(t))\|)\,TF(u(t)), \qquad u(0) = u_0 \tag{36}$$
and $d(u_0, t) \overset{\mathrm{def}}{=} u(t)$. Here, (5) results from
$$F(u(t))' = (TF(u(t)) \mid u'(t)) = \alpha(\|TF(u(t))\|)\,\|TF(u(t))\|^2$$
and Proposition 44.16, provided one chooses $\alpha\colon [0, \infty[ \to \mathbb{R}$ as a
$C^\infty$-function with $\alpha(t) = 1$ for $0 \le t \le 1$ and $\alpha(t) = 2/t^2$ for $t \ge 2$. In
addition, $t \mapsto t^2\alpha(t)$ should be monotonically increasing for $t \ge 0$. Compare
Berger (1977, M), page 367.
In the general case, the right-hand side of (36) must be replaced by
pseudogradient vector fields. In this connection, compare Browder
(1970a), page 17, Zeidler (1979a), Chow and Hale (1982, M) and
Problem 49.7.
Parallel to Section 44.2b, (H) yields assertions concerning multiplicity.
If (H) is not fulfilled, then one can use a general argument from Browder
(1970a), pages 10, 33.
44.14.* Ljusternik-Schnirelman theory and free critical points. We consider this
complex of problems in Problem 49.7 in connection with other methods
for the construction of free critical points.
44.15.** Existence of geodesics. Study the literature cited in Section 37.27e.
References to the Literature
Classical works: Ljusternik (1930), (1939); Schnirelman (1930).
General theory: Ljusternik and Schnirelman (1934, M); Krasnoselskii
(1956, M); Vainberg (1956, M); Palais (1966); Schwartz (1969, L); Coffman
(1969); Alber (1970, S); Browder (1970a); Clark (1972); Amann (1972);
Fucik, Necas, and Soucek (1973, L); Rabinowitz (1974, S); Berger (1977,
M); Klingenberg (1978, M); Zeidler (1979), (1979a), (1980, S, B).
General symmetries: Browder (1970a), Fadell and Rabinowitz (1978).
Genus and topological index: Krasnoselskii (1956, M); Coffman (1969);
Browder (1970a) (category); Rabinowitz (1974); Fadell and Rabinowitz
(1977), (1978); Zeidler (1980, B).
Perturbation of critical points: Krasnoselskii (1956, M); Krasnoselskii
and Zabreiko (1975, M); Zeidler (1980, B).
Indefinite problems: Krasnoselskii (1956, M); Zeidler (1979a), (1980).
Morse-Ljusternik-Schnirelman theory and global differential geometry
on Hilbert manifolds: Klingenberg (1978, M).
Application to elliptic partial differential equations: Browder (1970),
(1970a); Ambrosetti and Rabinowitz (1973); Rabinowitz (1974, S); Zeidler
(1979a), (1980); Struwe (1980), (1982).
Application to hyperbolic differential equations and periodic solutions of
dynamical systems: Fadell and Rabinowitz (1978); Rabinowitz (1978a),
(1978b); Chow and Hale (1982, M).
Application to Hammerstein integral equations: Krasnoselskii (1956, M);
Vainberg (1956, M); Coffman (1969); Amann (1972); Zeidler (1979a),
(1980).
Application to the existence of geodesics: Ljusternik and Schnirelman
(1929) (classical work); Ljusternik and Schnirelman (1947, S); Schwartz
(1969, L); Flaschel and Klingenberg (1972, L); Klingenberg (1978, M,B,H).
(Also, cf. the references to the literature in Chapter 49 concerning the
existence of free critical points and their applications.)
CHAPTER 45
Bifurcation for Potential Operators
One can fully justify the linearization principle of bifurcation theory for
potential operators.
Mark Aleksandrovic Krasnoselskii (1956)
In this chapter we shall show that especially favorable bifurcation relations
are present in the case of potential operators. In the proof of the main
theorem, we shall essentially make use of Lagrange multipliers and the
Ljusternik-Schnirelman theory. We introduced the basic concepts of
bifurcation theory in Section 8.1.
45.1. Krasnoselskii's Theorem
We consider the nonlinear eigenvalue problem
$$\mu(Lu + Nu) = u, \qquad u \in X,\ \mu \in \mathbb{R}, \tag{1}$$
together with the linearized problem
$$\mu_0 Lu = u \tag{2}$$
under the following assumptions:
(H1) $L\colon X \to X$ is linear and compact on the real B-space $X$.
(H2) $N\colon U(0) \subseteq X \to X$ is compact on the neighborhood of zero $U(0)$,
where $\|Nu\|/\|u\| \to 0$ as $\|u\| \to 0$.
According to Theorem 15.B in Section 15.6, the following two assertions
hold:
(i) Necessary condition: If $(\mu_0, 0)$ is a bifurcation point of (1), then $\mu_0$ is a
characteristic number of (2).
(ii) Sufficient condition: If $\mu_0$ is a characteristic number of (2) with odd
algebraic multiplicity, then $(\mu_0, 0)$ is a bifurcation point of (1).
To obtain an essential sharpening of (ii), we make the following
additional assumptions:
(H3) $X$ is an H-space, $L$ is symmetric, and $N$ is a potential operator, i.e.,
there exists a functional $F\colon U(0) \subseteq X \to \mathbb{R}$ such that $F' = N$ on $U(0)$.
(H4) $N$ is continuously F-differentiable on the open neighborhood of zero
$U(0)$.
Since $L$ is symmetric, it is also a potential operator.
Proposition 45.1 (Krasnoselskii (1956)). If (H1)-(H4) hold, then the
following two assertions are equivalent:
(a) $(\mu_0, 0)$ is a bifurcation point of (1).
(b) $\mu_0$ is a characteristic number of (2).
Thus the heuristic linearization principle for the determination of
bifurcation points for potential operators is rigorously justified. As we have seen in
Counterexample 8.2, (b) $\Rightarrow$ (a) does not always hold when no potential
operators are present. The bifurcation branches can then correspond to
complex values of the parameter.
The significance of Proposition 45.1 lies in the fact that, for example, in
elasticity theory there appear problems of type (1), say, in the determination
of the buckling of beams, plates, and shells. Since in this connection it is a
matter of equations that arise from variational problems, potential operators
are present. We shall delve into such applications in Part IV.
Proposition 45.1 is a special case of Theorem 45.A in the next section. In
Krasnoselskii (1956, M), Chapter VI, Theorem 2.2, the conditions (H3) and
(H4) are replaced by a modification of (H3).
45.2. The Main Theorem
We investigate the equation
$$Bu + Nu = \varepsilon(u + Mu), \tag{3}$$
where $B$ is a linear operator and $Mu, Nu = o(\|u\|)$ as $u \to 0$, in order to
determine when (3) has another (nontrivial) solution in a neighborhood of
$(0,0)$ besides the trivial solution $u = 0$, $\varepsilon$ arbitrary (see Fig. 45.1). Our
assumptions read as follows:
(H1) $B\colon X \to X$ is a linear, continuous, and symmetric operator on the
real H-space $X$ with the inner product $(\cdot \mid \cdot)$.
(H2) $R(B)$ is closed and $0 < \dim N(B) < \infty$.
Figure 45.1 (axes: $\varepsilon$ and $u$)
(H3) $M, N\colon U(0) \subseteq X \to X$ are continuously F-differentiable potential
operators on the open neighborhood of zero, $U(0)$, such that $\|Nu\|/\|u\| \to 0$ and
$\|Mu\|/\|u\| \to 0$ as $u \to 0$.
Then, because of its symmetry, $B$ is also a potential operator.
Theorem 45.A. Under the assumptions (H1)-(H3), the following assertions
hold:
(1) The point $(0,0)$ is a bifurcation point of (3).
(2) If $M$ and $N$ are odd, then there exist $n$ solution branches of (3) that pass
through $(0,0)$, where $n = \dim N(B)$.
To be precise, there exists an $r_0 > 0$ such that for each number $r$ with
$0 < r < r_0$ there exist at least $n$ distinct solution pairs $(\varepsilon_r, u_r)$, $(\varepsilon_r, -u_r)$ of
(3), where
$$g(u_r) = r \quad \text{and} \quad (\varepsilon_r, u_r) \to (0,0) \ \text{as } r \to 0.$$
Here, $g(u) = 2^{-1}(u \mid u) + a(u)$, where $a'(u) = Mu$ on $U(0)$, $a(0) = 0$.
The intuitive interpretation of Theorem 45.A is clear from the following
example.
Example 45.2. We consider
$$Lu + Nu = \lambda(u + Mu) \tag{4}$$
and assume:
(i) $L\colon X \to X$ is linear, continuous, and symmetric on the real H-space $X$.
The operators $M$ and $N$ satisfy (H3).
(ii) $\lambda_0$ is an isolated eigenvalue of $L$ having finite multiplicity $n$.
Then the following hold:
(a) $(\lambda_0, 0)$ is a bifurcation point of (4).
(b) If $M$ and $N$ are odd, then there exist $n$ solution branches, i.e., for each $r$ with
$0 < r < r_0$ there are at least $n$ distinct solution pairs $(\lambda_r, u_r)$, $(\lambda_r, -u_r)$
such that $g(u_r) = r$ and $(\lambda_r, u_r) \to (\lambda_0, 0)$ as $r \to 0$.
Proof. We set $\lambda = \lambda_0 + \varepsilon$, $B = L - \lambda_0 I$. According to Problem 45.2, from
(ii) it follows that $R(B) = \overline{R(B)}$. Then Theorem 45.A immediately yields the
assertion. □
Figure 45.2, panels (a) and (b)
Figure 45.2 illustrates situation (b) for $X = \mathbb{R}^2$, $n = 2$, $M = 0$. The
linearized problem $Lu = \lambda u$ possesses two linearly independent eigenvectors;
therefore, two solution vector pairs $(u, -u)$ lie on each small sphere having
center at zero [see Fig. 45.2(a)]. If a perturbation $N$ appears in (4), then we
expect that the solution branches are deformed as in Fig. 45.2(b). Our
assertion corresponds to this representation in weakened form. For $M \neq 0$,
the level set $g(u) = r$ describes a perturbed sphere.
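The situation of Example 45.2(b) can be checked numerically in the plane. The following is only a minimal sketch under assumptions that are not in the text: $X = \mathbb{R}^2$, $L = \lambda_0 I$ with $\lambda_0 = 1$ (so $n = 2$), $M = 0$, and the hypothetical odd potential operator $N(u) = (u_1^3, 2u_2^3)$, which is the gradient of $(u_1^4 + 2u_2^4)/4$. For $M = 0$ the level set $g(u) = r$ is the circle $|u|^2 = 2r$, and the two pairs along the coordinate axes solve (4) with $\lambda_r \to \lambda_0$.

```python
import math

LAMBDA0 = 1.0  # assumed double eigenvalue of L = LAMBDA0 * I on R^2


def N(u):
    """Assumed odd potential operator: gradient of (u1**4 + 2*u2**4) / 4."""
    return (u[0] ** 3, 2.0 * u[1] ** 3)


def residual(lam, u):
    """Euclidean norm of L u + N u - lam * u for L = LAMBDA0 * I (M = 0)."""
    n = N(u)
    r = [LAMBDA0 * u[i] + n[i] - lam * u[i] for i in range(2)]
    return math.hypot(r[0], r[1])


def solution_pairs(r):
    """Two solutions (lam, u) on the level set g(u) = |u|^2 / 2 = r.

    Each u comes with its partner -u because N is odd.
    """
    rho = math.sqrt(2.0 * r)
    return [
        (LAMBDA0 + rho ** 2, (rho, 0.0)),        # branch along the first axis
        (LAMBDA0 + 2.0 * rho ** 2, (0.0, rho)),  # branch along the second axis
    ]


for r in (1e-1, 1e-2, 1e-3):
    for lam, u in solution_pairs(r):
        minus_u = (-u[0], -u[1])
        print(r, lam, residual(lam, u), residual(lam, minus_u))
```

The printed residuals are zero up to rounding, and the eigenvalue parameter of both branches tends to $\lambda_0$ as $r \to 0$, in line with the statement of Example 45.2(b) for this particular choice of $N$.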
45.3. Proof of the Main Theorem
First, we briefly explain the basic idea of the proof. Let $a' = M$, $b' = N$. We
set
$$f(u) := 2^{-1}(Bu \mid u) + b(u), \qquad g(u) := 2^{-1}(u \mid u) + a(u).$$
Then the original problem (3) reads as follows:
$$f'(u) = \varepsilon g'(u). \tag{5}$$
Due to the Lagrange multiplier rule, it is natural to make use of the
variational problem
$$f(u) = \min!, \qquad g(u) = r \tag{6}$$
to solve (5). However, this method does not lead directly to our goal. The
idea of our proof is a modification which requires that we consider (6) only
for special $u$ of the form
$$u = v + w(\varepsilon, v), \qquad \varepsilon = \varepsilon(v), \qquad v \in N(B). \tag{7}$$
Since $\dim N(B) < \infty$, there are no difficulties with compactness in the
solution of (6). Our procedure is very natural, since, with the aid of the
branching equation, it will turn out that the solutions of (5) must necessarily
have the form (7). From (6) and (7) we first obtain a solution of the
branching equation and then, in the usual way, a solution of the original
problem (5).
In the proof we shall repeatedly apply the implicit function theorem
(Theorem 4.B in Section 4.7). In order to clearly work out the simple essence
of the proof, we shall assume that M and N are analytic in a neighborhood
of zero, i.e., in the sense of Section 8.3,
$$Mu = a_2u^2 + a_3u^3 + \cdots, \qquad Nu = b_2u^2 + b_3u^3 + \cdots \tag{8}$$
holds. Then we can apply the implicit function theorem in its analytic
version given in Section 8.3 and obtain the solutions in terms of series
expansions as well. The decisive advantage is that by substitutions and
equating coefficients or by successive approximation we can immediately see
the order of magnitude of the solutions in a simple way (see, e.g., (10)). For
the proof it is crucial that certain expansion terms do not appear in the
solutions. In Problem 45.3 we recommend that the reader verify all steps in
the proof with the aid of Theorem 4.B in Section 4.7 for the case when M
and $N$ are only $C^1$-mappings.
We shall consider and solve all equations in a neighborhood of zero.
Step 1: The Branching Equation in the Sense of Ljapunov. We decompose $X$
by
$$X = N(B) \oplus N(B)^\perp$$
and then, parallel to this, consider the decomposition
$$u = v + w, \qquad v \in N(B), \quad w \in N(B)^\perp$$
for all $u \in X$. Here, $N(B)^\perp$ denotes the orthogonal complement of $N(B)$.
We denote the orthogonal projection operators of $X$ onto $N(B)$ and onto
$N(B)^\perp$ by $P$ and $P^\perp$, respectively. Then $P + P^\perp = I$. Parallel to Section 8.6,
we now decompose equation (5) in the form
$$P^\perp f'(u) = \varepsilon P^\perp g'(u), \qquad u = v + w, \tag{9a}$$
$$P f'(u) = \varepsilon P g'(u). \tag{9b}$$
We first solve (9a). According to the closed range theorem (cf. $A_1(39)$),
$R(B) = N(B)^\perp$. Consequently, $B\colon N(B)^\perp \to N(B)^\perp$ is surjective.
Furthermore, this operator is also bijective because $Bu = 0$, $u \in N(B)^\perp$ implies
$u = 0$. According to $A_1(36)$, $B^{-1}$ thus exists on $N(B)^\perp$ as a linear
continuous operator. Since $f' = B + N$, $g' = I + M$, $P^\perp B = B$, $Bv = 0$, $P^\perp v = 0$,
(9a) then reads as follows:
$$w = B^{-1}P^\perp[\varepsilon(w + Mu) - Nu], \qquad u = v + w.$$
If we write this equation in the form $G(\varepsilon, v, w) = 0$, then $G(0,0,0) = 0$,
$G_w(0,0,0) = I$. According to the implicit function theorem from Section 8.3,
we obtain a solution $w = w(\varepsilon, v)$, where
$$w(\varepsilon, v) = c_{02}v^2 + \varepsilon c_{12}v^2 + c_{03}v^3 + \cdots \tag{10}$$
because of (8). The ellipsis dots denote terms of higher order. It is important
that each term contains at least the square of $v$.
Now, if we set $u = v + w(\varepsilon, v)$ in (9b), we obtain the so-called branching
equation. Conversely, if one knows a solution $(\varepsilon, v)$ of the branching
equation, then setting $w = w(\varepsilon, v)$ one immediately obtains a solution of (5)
by adding (9a) and (9b).
Therefore, it only remains to solve the branching equation.
Step 2: Condition on $\varepsilon$. Forming the inner product of the branching
equation (9b) with $v$ immediately leads to
$$(Bu + Nu \mid v) = \varepsilon(u + Mu \mid v), \qquad u = v + w(\varepsilon, v),$$
because $f' = B + N$ and $g' = I + M$. Taking into account that $Bv = 0$, $(v \mid w) = 0$,
this yields
$$(Bw(\varepsilon, v) + N(v + w(\varepsilon, v)) \mid v) = \varepsilon(v \mid v) + \varepsilon(M(v + w(\varepsilon, v)) \mid v). \tag{11}$$
It is crucial that after division by $(v \mid v)$, for $v \neq 0$, this equation assumes the
form
$$\varepsilon = d_{01}v + \varepsilon d_{11}v + d_{02}v^2 + \cdots, \tag{11a}$$
where the right-hand member is analytic in a neighborhood of zero. Here,
one must keep in mind (10), (11) and $\|cv^k\|/(v \mid v) \le \|c\|\,\|v\|^{k-2}$ for $k \ge 3$ and
$v \neq 0$. The implicit function theorem from Section 8.3 guarantees a solution
$\varepsilon = \varepsilon(v)$ of (11a), where
$$\varepsilon(v) = \alpha_1 v + \alpha_2 v^2 + \cdots.$$
We thus arrive at the following lemma.
Lemma 45.3. If we set $\varphi(v) = w(\varepsilon(v), v)$, then
$$P^\perp f'(u) = \varepsilon P^\perp g'(u), \qquad (f'(u) \mid v) = \varepsilon(g'(u) \mid v)$$
holds, where $\varepsilon = \varepsilon(v)$, $u = v + \varphi(v)$ for all $v$ in a neighborhood of zero in
$N(B)$.
Step 3: Variational Problem. As we stated after (6), we now consider the
problem
$$f(v + \varphi(v)) = \min!, \qquad g(v + \varphi(v)) = r, \qquad v \in V. \tag{12}$$
Here, $V$ is a fixed sufficiently small open neighborhood of zero in $N(B)$.
Lemma 45.4. For any fixed sufficiently small $r > 0$, (12) has a solution $v$ and
this solution satisfies
$$(f'(v + \varphi(v)) \mid z + \varphi'(v)z) = \mu\,(g'(v + \varphi(v)) \mid z + \varphi'(v)z) \tag{13}$$
for all $z \in N(B)$.
Corollary 45.5. If $M$ and $N$ are odd, then there exist at least $\dim N(B)$
solution pairs $(v, -v)$ of (12), and (13) holds for them.
Proof. Let $c(v) := g(v + \varphi(v))$, $V_r := \{v \in V : c(v) = r\}$. Since $\varphi(v) = O(\|v\|^2)$, $v \to 0$, we have
$$(c'(v) \mid v) = (g'(v + \varphi(v)) \mid v + \varphi'(v)v) = (v \mid v) + O(\|v\|^3), \qquad v \to 0;$$
$$c(v) = 2^{-1}(v \mid v) + O(\|v\|^3), \qquad v \to 0;$$
therefore, $(c'(v) \mid v) \neq 0$ for all $v \in V_r$ and sufficiently small $r$. Thus Lemma
45.4 follows from the Lagrange multiplier rule (cf. Section 43.3). One must
take into account that for all sufficiently small $r > 0$, the level set $V_r$ is a
compact subset of a sufficiently small open neighborhood of zero, $V$.
Corollary 45.5 follows from the Ljusternik-Schnirelman theory (cf. Section
44.8). Note that $f$ and $g$ are even and thus $v \mapsto \varepsilon(v)$ is even, while
$v \mapsto w(\varepsilon, v)$ and $v \mapsto \varphi(v)$ are odd. Consequently, $c$ is even. Since $(c'(v) \mid v) \neq 0$ for all $v \in V_r$
and sufficiently small $r > 0$, the level set $V_r$ is diffeomorphic to the sphere
$2^{-1}(v \mid v) = r$. This is proved analogously to Lemma 44.28. □
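The essential trick in the proof of Theorem 45.A in Section 45.2 depends
on the following lemma.
Lemma 45.6. In (13), $\mu = \varepsilon(v)$.
Proof. Let $z := v$ in (13). Since $\varphi(v) \in N(B)^\perp$ and therefore $\varphi'(v)z \in N(B)^\perp$,
from (13) and Lemma 45.3 it immediately follows that
$$(\varepsilon(v) - \mu)\,(g'(v + \varphi(v)) \mid v + \varphi'(v)v) = 0.$$
The second factor equals $(c'(v) \mid v)$, which does not vanish on $V_r$ for sufficiently
small $r$; hence $\mu = \varepsilon(v)$.
□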
Step 4: Solution of the Branching Equation. Using (13) with $\mu = \varepsilon(v)$, from
Lemma 45.3 it follows that
$$(f'(v + \varphi(v)) \mid z) = \varepsilon(v)\,(g'(v + \varphi(v)) \mid z) \quad \text{for all } z \in N(B);$$
therefore,
$$P f'(v + \varphi(v)) = \varepsilon(v) P g'(v + \varphi(v)). \tag{14}$$
Step 5: Solution of the Original Problem. According to Lemma 45.3, (14)
also holds if one replaces $P$ by $P^\perp$. Since $P + P^\perp = I$, addition of these two
relations yields the equation (14) without $P$. Consequently, $u = v + \varphi(v)$,
$\varepsilon = \varepsilon(v)$ is a solution of $f'(u) = \varepsilon g'(u)$.
This completes the proof of Theorem 45.A in Section 45.2.
Problems
45.1. Proof of Proposition 45.1. Solution: Set $\lambda = \mu^{-1}$, $\lambda_0 = \mu_0^{-1}$, $\lambda = \lambda_0 + \varepsilon$,
$B = L - \lambda_0 I$ and apply Theorem 45.A in Section 45.2. By Section 8.4, $B$ is a
Fredholm operator; hence $R(B) = \overline{R(B)}$.
45.2. Closed range. Prove that $R(B) = \overline{R(B)}$, where $B = L - \lambda_0 I$ in Example
45.2.
Solution: First, from the symmetry of $B$ it immediately follows that
$R(B) \subseteq N(B)^\perp$; hence $P^\perp B = B$. We will be finished when we show that
$R(B) = N(B)^\perp$. For this purpose we use:
$$Bu = 0,\ u \in N(B)^\perp \quad \text{implies} \quad u = 0. \tag{15}$$
By assumption, $\lambda = 0$ is an isolated eigenvalue of $B$. The spectral family
$\{E_\lambda\}$ of $B$ is thus constant in a left-hand-sided and right-hand-sided
neighborhood of $\lambda = 0$. Then $\{P^\perp E_\lambda P^\perp\}$ possesses the same property. The
latter is the spectral family of the operator $P^\perp B P^\perp\colon N(B)^\perp \to N(B)^\perp$.
For this reason, $\lambda = 0$ is an eigenvalue of $P^\perp B P^\perp$ or $\lambda = 0$ belongs to the
resolvent set of $P^\perp B P^\perp$. In the second case, $R(B) = N(B)^\perp$. Due to (15),
the first case cannot occur.
45.3. Weakened smoothness in the proof of Theorem 45.A. Making use of the hints
in the introduction to Section 45.3, show that the proof of Theorem 45.A is
also valid for $C^1$-mappings $M$ and $N$.
Hint: Use Theorem 4.B in Section 4.7. Caution in using (11) is
recommended. For this purpose, we write (11) in the form
$$H(\varepsilon, v) = \varepsilon(v \mid v).$$
Division by $(v \mid v)$ yields
$$G(\varepsilon, v) = \varepsilon \tag{16}$$
for $v \neq 0$. Now one must note the fact that for $v \to 0$, the limiting value
relation $G(\varepsilon, v) \to 0$ holds. If one sets $G(0,0) := 0$, then $G$ is continuous on a
neighborhood of zero. Moreover, one shows that $G_\varepsilon(0,0) = 0$. From (16) it
then follows that $\varepsilon = \varepsilon(v)$, and Theorem 4.B in Section 4.7 yields the
existence of $\varepsilon'(v)$ for $v \neq 0$. Compare this with Rabinowitz (1974, S), page
181.
45.4a. Important variants of Theorem 45.A.
We consider the equation
$$Lu + Nu = (\lambda_0 + \varepsilon)u, \qquad \varepsilon \in \mathbb{R},\ u \in X, \tag{17}$$
subject to the following assumptions:
(H1) $X$ is a real H-space. $L\colon X \to X$ is linear, continuous, and symmetric.
(H2) $N\colon U(0) \subseteq X \to X$ is continuously F-differentiable on an open
neighborhood of zero, $U(0)$, where $\|Nu\|/\|u\| \to 0$ as $u \to 0$. Furthermore, $N$
is an odd potential operator.
(H3) $\lambda_0$ is an isolated eigenvalue of $L$ with finite multiplicity $n$.
Show that $(0,0)$ is a bifurcation point of (17) and exactly one of the
following two cases occurs:
(i) $(0,0)$ is not an isolated solution of (17) in $\{0\} \times X$.
(ii) There exist a left-hand-sided neighborhood of zero, $U_l$, and a
right-hand-sided neighborhood of zero, $U_r$, in $\mathbb{R}$ and integers $n_l$, $n_r$ such that
$n_l + n_r \ge n$ and the following holds: For each $\varepsilon \in U_l$ (respectively,
$\varepsilon \in U_r$), $\varepsilon \neq 0$, (17) has at least $n_l$ (respectively, $n_r$) solution pairs $(\varepsilon, u)$,
$(\varepsilon, -u)$, $u \neq 0$, and as $\varepsilon \to 0$ these nontrivial solutions converge to $(0,0)$.
Hint: Compare Fadell and Rabinowitz (1977). In an essential way the
proof makes use of a topological index, which is explained with
the aid of the Cech cohomology theory and of arguments from the
Ljusternik-Schnirelman theory. This index is related to the concept of
genus defined in Chapter 44. However, with regard to the present problem,
it possesses more convenient properties than does the genus.
In Theorem 45.A in Section 45.2 we characterized the number of
solutions by stating how many solutions lie on small spheres with center at the
zero point of $X$ or the surfaces related to this. The significance of the above
variant of Theorem 45.A is that the number of solutions is
characterized as dependent on $\varepsilon$. This characterization corresponds to our
representation of solution branches of the form $\varepsilon \mapsto u(\varepsilon)$. Roughly speaking,
(ii) asserts that to the left (respectively, to the right) of $\varepsilon = 0$ there lie at least
$n_l$ (respectively, $n_r$) solution branches $\varepsilon \mapsto u(\varepsilon)$, and the total number of
these branches is at least equal to the multiplicity of the eigenvalue $\lambda_0$.
Assertions for the case when N is not odd may be found in Rabinowitz
(1977).
45.4b. Study the proof in Exercise 45.4a and show that assertions (i) and (ii) can be
carried over to the equation
$$F_{01}u + \varepsilon F_{11}u + F_{02}u^2 + \sum_{k,r} \varepsilon^k F_{kr}u^r = 0, \qquad \varepsilon \in \mathbb{R},\ u \in X, \tag{17a}$$
when the following assumptions are satisfied. Here, $F(\varepsilon, u)$ denotes the
left-hand side of (17a):
(H1) $X$ is a real H-space.
(H2) $F\colon U(0,0) \subseteq \mathbb{R} \times X \to X$ is analytic in a neighborhood of $(0,0)$. In
(17a) the summation is over all integers $k, r$, where $k + r \ge 3$, $k \ge 0$,
$r \ge 1$. Thus, $F(\varepsilon, 0) = 0$.
(H3) For all $\varepsilon$ in a neighborhood of zero, $u \mapsto F(\varepsilon, u)$ is an odd potential
operator.
(H4) $F_{01}$ is a Fredholm operator with $0 < n < \infty$, where $n = \dim N(F_{01})$.
(H5) $(F_{11}u \mid u) > 0$ for all $u \neq 0$ in $N(F_{01})$.
Hint: Compare Ackermann (1979), page 100. Equations of the type (17a)
appear frequently in nonlinear elasticity theory.
45.4c. Important assertions concerning the structure of equations of type (17) and
(17a), where it is not necessary to have potential operators, can be found in
Böhme (1972). There, in the case of analytic equations, one finds more
precise assertions concerning the number of analytic solution branches.
The proofs use the reduction method of Section 45.3. The branching
equations are then investigated with the aid of deeper topological and
analytical statements (Adams' theorem concerning the number of linearly
independent vector fields on spheres within the framework of K-theory, the
curve selection theorem for analytical sets, etc.).
45.5.* Application to elliptic partial differential equations. We consider the nonlinear
boundary eigenvalue problem
$$-\lambda\Delta u = u + p(u) \ \text{ on } G, \qquad u = 0 \ \text{ on } \partial G. \tag{18}$$
Let $G$ be a bounded region in $\mathbb{R}^N$ with a sufficiently smooth boundary, and
let $p\colon [-a, a] \to \mathbb{R}$ be continuously differentiable for a fixed $a > 0$, where
$p(u) = o(u)$ as $u \to 0$.
Show that to each eigenvalue $\lambda_0$ of the linearized problem, i.e., (18) with
$p = 0$, there corresponds a bifurcation point $(\lambda_0, 0)$ of (18). Stated precisely:
There exists an $\varepsilon_0 > 0$ such that to each $\lambda$ with $0 < |\lambda - \lambda_0| < \varepsilon_0$ there
corresponds a classical solution $u_\lambda \neq 0$ of (18), where $u_\lambda(x) \to 0$ uniformly
on $G$ as $\lambda \to \lambda_0$.
Hint: Let $X = \mathring{W}_2^1(G)$. We extend $p$ to a $C^1$-function $p\colon \mathbb{R} \to \mathbb{R}$ so that $p$
is bounded globally. The generalized problem for (18) reads as follows:
$$f'(u) = \lambda u, \qquad \lambda \in \mathbb{R},\ u \in X, \tag{18a}$$
where
$$f(u) = \int_G 2^{-1}(u^2 + P(u))\,dx$$
and $P' = p$. Now apply Theorem 45.A in Section 45.2 to (18a) with
$\lambda = \lambda_0 + \varepsilon$. For the solutions of (18a), regularity propositions for
elliptic partial differential equations then yield that $\|u\|_X \to 0$ implies
$\max|u(x)| \to 0$. Therefore, the generalized solutions are also classical
solutions of (18) with the original $p$. Compare Rabinowitz (1974, S), pages 187,
160.
References to the Literature
Classical work: Krasnoselskii (1956, M).
Böhme (1972); Rabinowitz (1974, S), (1977); Fadell and Rabinowitz
(1977); McLeod and Turner (1976); Stuart (1977), (1979, S); Berger
(1977, M); Ackermann (1979); Chow and Hale (1982, M).
Kirchgässner (1971), (1976); Grundmann (1974); Ize (1976, S).
EXTREMAL PROBLEMS
WITH GENERAL SIDE CONDITIONS
We cannot get more out of the mathematical mill than we put into it, though
we may get it in a form infinitely more useful for our purpose.
John Hopkinson
Do not imagine that mathematics is hard and crabbed, and repulsive to
common sense. It is merely the etherealization of common sense.
Lord Kelvin
In the following three chapters, convex sets, convex cones, variational
inequalities, and general Lagrange multipliers play a crucial role.
CHAPTER 46
Differentiable Functionals on Convex Sets
In this chapter, by generalizing the results of Section 40.2, we show that in
the case of a convex set $M$ each solution $u$ of
$$\min_{u \in M} F(u) - \langle b, u\rangle = \alpha \tag{1}$$
satisfies a variational inequality, i.e.,
$$\langle F'(u) - b, v - u\rangle \ge 0 \quad \text{for all } v \in M. \tag{2}$$
If $u$ is an interior point of $M$, then, by Problem 39.4, (2) passes to the Euler
equation
$$F'(u) - b = 0. \tag{3}$$
46.1. Variational Inequalities as Necessary and
Sufficient Extremal Conditions
We formulate the following assumptions for the existence propositions:
(H1) $F\colon M \subseteq X \to \mathbb{R}$ is weakly sequentially lower semicontinuous. $X$ is a real
reflexive B-space.
(H2) $M$ is closed, convex, and not empty.
(H3) $M$ is bounded or, for each sequence $(u_n)$ from $M$, where $\|u_n\| \to \infty$ as
$n \to \infty$, we have
$$\lim_{n \to \infty} F(u_n) - \langle b, u_n\rangle = +\infty.$$
Theorem 46.A. Let $F\colon M \subseteq X \to \mathbb{R}$ be a functional on the convex nonempty
set $M$ of the real locally convex space $X$, and let $b$ in $X^*$ be a given
element. Then:
(a) Necessary condition. If $u$ is a solution of (1), then
$$\delta_+F(u; v - u) - \langle b, v - u\rangle \ge 0 \quad \text{for all } v \in M \tag{4}$$
in the case where the left-hand side exists. If $F'$ exists as a G-derivative on $M$,
then (4) is identical to the variational inequality (2).
(b) Equivalence. If $F$ is convex and $F'$ exists as a G-derivative on $M$, then
the minimum problem (1) and the variational inequality (2) are mutually
equivalent.
(c) Uniqueness. If $F$ is strictly convex on $M$, then (1) and (2) have at most
one solution.
(d) Existence. If (H1)-(H3) are satisfied, then (1) has a solution. For
convex $F$, the solution set of (1) is closed, convex, and bounded.
As the proof shows, assertion (a) also holds for local minima of $u \mapsto F(u) - \langle b, u\rangle$
with respect to $M$.
Proof. (a) For fixed $v \in M$, we set
$$\varphi(t) := F(u + t(v - u)) - \langle b, u + t(v - u)\rangle.$$
For all $t \in [0,1]$, $\varphi(t) \ge \varphi(0)$; therefore, $\varphi'_+(0) \ge 0$, but this is (4).
(b) Let $u$ be a solution of (2); therefore, $\varphi'(0) \ge 0$. The function $\varphi$ is
convex on $[0,1]$; consequently, $\varphi'$ is monotone, i.e.,
$$\varphi(1) - \varphi(0) = \varphi'(\vartheta) \ge \varphi'(0), \qquad 0 < \vartheta < 1.$$
This yields $F(v) - \langle b, v\rangle \ge F(u) - \langle b, u\rangle$ for all $v \in M$ and, therefore, (1).
(c) and (d) are special cases of Theorem 38.C in Section 38.4, Proposition
38.15, and Proposition 41.2. □
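A one-dimensional illustration shows how the variational inequality (2) replaces the Euler equation (3) when the minimum lies on the boundary of $M$. The sketch below assumes $X = \mathbb{R}$, $M = [0, 1]$, $F(u) = u^2/2$ and $\langle b, u\rangle = 2u$; these data, and the sampling grid used to stand in for $M$, are chosen purely for illustration.

```python
def F_prime(u):
    """Derivative of the assumed functional F(u) = u**2 / 2."""
    return u


B = 2.0                                 # assumed linear term <b, u> = B * u
GRID = [i / 1000 for i in range(1001)]  # sample points of M = [0, 1]


def objective(u):
    """F(u) - <b, u> for the assumed data."""
    return 0.5 * u * u - B * u


# Minimizer over the sampled set M.
u_min = min(GRID, key=objective)

# Variational inequality (2): (F'(u) - b) * (v - u) >= 0 for all v in M.
vi_holds = all((F_prime(u_min) - B) * (v - u_min) >= 0 for v in GRID)

# Residual of the Euler equation (3); it need not vanish on the boundary.
euler_residual = F_prime(u_min) - B

print(u_min, vi_holds, euler_residual)
```

Here the minimizer is the boundary point $u = 1$, inequality (2) holds for every sampled $v$, and the Euler residual $F'(u) - b = -1$ is nonzero, as expected for a point that is not interior to $M$.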
46.2. Quadratic Variational Problems on Convex Sets
and Variational Inequalities
Parallel to the minimum problem
$$\min_{u \in M} 2^{-1}a(u,u) - b(u) = \alpha, \tag{5}$$
we consider the variational inequality
$$a(u, v - u) \ge b(v - u) \quad \text{for all } v \in M. \tag{6}$$
We seek $u \in M$.
Proposition 46.1. The two problems (5) and (6) are mutually equivalent and
possess exactly one solution when the following three conditions hold:
(i) $M$ is a closed convex nonempty set in the real H-space $X$.
(ii) $a\colon X \times X \to \mathbb{R}$ is bilinear, bounded, symmetric, and strongly positive.
(iii) $b\colon X \to \mathbb{R}$ is linear and continuous.
Corollary 46.2. If $M$ is a closed convex cone, then (6) is equivalent to the
determination of $u \in M$ by means of
$$a(u,w) \ge b(w) \quad \text{for all } w \in M, \qquad a(u,u) = b(u). \tag{7}$$
We recall the known definition of a convex cone in Section 48.1. We treat
the proofs in Problems 46.1 and 46.2.
46.3. Application to Partial Differential Inequalities
Parallel to Section 37.7, we consider the problem
$$-\Delta u + cu = f \quad \text{on } G, \tag{8}$$
$$u \ge 0, \qquad \frac{\partial u}{\partial n} - g \ge 0, \qquad \Big(\frac{\partial u}{\partial n} - g\Big)u = 0 \quad \text{on } \partial G,$$
where $c$ is a positive constant. Let $G$ be a bounded region in $\mathbb{R}^N$, $N \ge 1$, with
piecewise smooth boundary, i.e., $\partial G \in C^{0,1}$. We choose $X := W_2^1(G)$ and set
$$M := \{u \in X : u(x) \ge 0 \text{ almost everywhere on } \partial G\}.$$
We recall that each $u$ in $X$ possesses generalized boundary values in $L_2(\partial G)$.
These generalized boundary values appear in $M$.
Definition 46.3. The generalized problem of (8) reads as follows: $f \in L_2(G)$
and $g \in L_2(\partial G)$ are given. An element $u \in M$ is sought such that
$$a(u,w) \ge b(w) \quad \text{for all } w \in M, \qquad a(u,u) = b(u),$$
where
$$a(u,v) := \int_G \Big(\sum_{i=1}^N D_iu\,D_iv + cuv\Big)dx, \qquad b(v) := \int_{\partial G} gv\,dO + \int_G fv\,dx,$$
and $x = (\xi_1, \ldots, \xi_N)$, $D_i = \partial/\partial\xi_i$.
This definition is motivated by the discussion in Section 37.7.
Example 46.4. The generalized problem for (8) has exactly one solution.
We treat the proof in Problem 46.3.
46.4. Projections on Convex Sets
We now study the minimum problem
$$\min_{u \in M} \|u - c\| = \alpha, \tag{9}$$
i.e., we seek points $u$ in $M$ which are at the least distance from the given
point $c$ (see Fig. 46.1).
Proposition 46.5 (Moreau (1962)). If $M$ is a closed convex nonempty set in
the real H-space $X$ with the inner product $(\cdot \mid \cdot)$, then:
(1) For each $c$ in $X$, (9) has exactly one solution $u$.
(2) If we denote this solution by $Pc$, then the operator $P\colon X \to M$ is monotone
and nonexpansive.
(3) $u = Pc$ if and only if
$$(u - c \mid v - u) \ge 0 \quad \text{for all } v \in M. \tag{10}$$
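(4) If $M$ is a closed convex cone, then
$$\|Pc\|^2 = (c \mid Pc) \quad \text{and} \quad (Pc - c \mid w) \ge 0 \quad \text{for all } c \in X,\ w \in M. \tag{11}$$
If we set
$$M_+ := \{v \in X : (v \mid w) \ge 0 \text{ for all } w \in M\},$$
then each $c$ in $X$ has exactly one decomposition of the form
$$c = u - u_+, \qquad u \in M, \quad u_+ \in M_+, \quad (u \mid u_+) = 0. \tag{12}$$
Here, $u = Pc$.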
Figure 46.1 (the convex set $M$)
Geometrically, condition (10) asserts that, for all $v \in M$, $u - c$ and $v - u$
form an acute angle (see Fig. 46.1). In the special case when $M$ is a closed
linear subspace, $M_+$ is the orthogonal complement of $M$, i.e.,
$$M_+ = \{v \in X : (v \mid w) = 0 \text{ for all } w \in M\}.$$
Then $P$ is equal to the orthogonal projection operator on $M$, and (12)
represents the known orthogonal decomposition of $c$. For this reason, in the
general case, we call $P$ the projection operator of $X$ on the convex set $M$. We
shall apply Proposition 46.5 in an essential way in Section 46.6.
Proof. Ad (1), (3). We set
$$a(u,v) := (u \mid v), \qquad b(u) := (c \mid u).$$
Since
$$\|u - c\|^2 = a(u,u) - 2b(u) + (c \mid c),$$
(9) is equivalent to
$$\min_{u \in M} 2^{-1}a(u,u) - b(u) = \beta.$$
Now the assertion follows from Proposition 46.1.
Ad (4). Relation (11) follows from Corollary 46.2.
From $c = Pc - u_+$ it follows that $(Pc \mid u_+) = 0$ according to (11) and
$u_+ \in M_+$ according to (10). If (12) holds, then (10) is also satisfied and thus
$u = Pc$.
Ad (2). From (10) with $v = Pd$ and $v = Pc$, we obtain
$$(Pc - c \mid Pd - Pc) \ge 0 \quad \text{and} \quad (Pd - d \mid Pc - Pd) \ge 0,$$
respectively; therefore,
$$\|Pc - Pd\|^2 \le (Pc - Pd \mid c - d) \le \|Pc - Pd\|\,\|c - d\|.$$
□
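The statements of Proposition 46.5 can be tested numerically on a simple cone. The following sketch assumes $X = \mathbb{R}^4$ with the Euclidean inner product and the cone $M = \{x : x_i \ge 0\}$, for which the projection is the componentwise positive part; the dimension, the random sampling, and all function names are illustrative choices, not part of the text.

```python
import random


def project(c):
    """Projection onto the assumed cone M = {x : x_i >= 0} in R^n."""
    return [max(ci, 0.0) for ci in c]


def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))


def sub(x, y):
    return [xi - yi for xi, yi in zip(x, y)]


def check(c, d, w):
    """Quantities from Proposition 46.5 for points c, d in X and w in M."""
    u, pd = project(c), project(d)
    u_plus = sub(u, c)  # decomposition c = u - u_plus as in (12)
    diff_p, diff_c = sub(u, pd), sub(c, d)
    return {
        "vi_10": dot(sub(u, c), sub(w, u)),            # >= 0 by (10)
        "norm_identity": dot(u, u) - dot(c, u),        # = 0, first part of (11)
        "orthogonality": dot(u, u_plus),               # = 0 by (12)
        "u_plus_in_dual": dot(u_plus, w),              # >= 0, i.e. u_plus in M_+
        "monotone": dot(diff_p, diff_c),               # >= 0
        "nonexpansive": dot(diff_c, diff_c) - dot(diff_p, diff_p),  # >= 0
    }


random.seed(0)
worst = {}
for _ in range(1000):
    c = [random.uniform(-1, 1) for _ in range(4)]
    d = [random.uniform(-1, 1) for _ in range(4)]
    w = [random.uniform(0, 1) for _ in range(4)]  # an element of M
    for name, value in check(c, d, w).items():
        worst[name] = min(worst.get(name, value), value)

print(worst)  # smallest observed value of each quantity
```

For this cone $M_+ = M$, so the decomposition (12) is the familiar splitting of a vector into its positive and negative parts.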
46.5. The Ritz Method
For an approximate solution of
$$\min_{u \in M} F(u) - \langle b, u\rangle = \alpha, \tag{13}$$
we consider the Ritz approximation problem
$$\min_{u_n \in M \cap X_n} F(u_n) - \langle b, u_n\rangle = \alpha_n, \qquad n = 1, 2, \ldots. \tag{14}$$
(14) is a finite-dimensional optimization problem. We refer to Problem 46.5
for methods for handling such problems. We assume:
(H1) $F\colon X \to \mathbb{R}$ is continuous on the real separable H-space $X$, $\dim X = \infty$.
$b$ is a fixed element in $X^*$.
(H2) $M$ is a closed convex nonempty set in $X$. Furthermore, $M$ is
bounded or, for each sequence $(v_n)$ in $M$ such that $\|v_n\| \to \infty$ as $n \to \infty$, we
have
$$\lim_{n \to \infty} F(v_n) - \langle b, v_n\rangle = +\infty.$$
(H3) $F'\colon X \to X^*$ exists on $X$ as a G-derivative. $F'$ is demicontinuous and
satisfies $(S)_+$. According to Fig. 27.1, this is satisfied, for instance, when
$F' = A + V$ holds for the operators $A, V\colon X \to X^*$, where $A$ is uniformly
monotone and $V$ is compact.
(H4) $\{w_1, w_2, \ldots\}$ is a basis for $X$. If we set $X_n = \operatorname{span}\{w_1, \ldots, w_n\}$, then
$M \cap X_n \neq \emptyset$ for all $n \in \mathbb{N}$ and the closure of $\bigcup_{n=1}^\infty M \cap X_n$ is equal to $M$.
Theorem 46.B. With the assumptions (H1)-(H4), the following two assertions
hold:
(1) The Ritz equations have a solution $u_n$ for each $n$, and $(u_n)$ has a
subsequence which converges to a solution $u$ of (13). Moreover, $\alpha_n \to \alpha$ as
$n \to \infty$.
(2) If $F$ is strictly convex, then for each $n$, (13) and (14) possess exactly one
solution $u$ and $u_n$, respectively, and $u_n \to u$ as $n \to \infty$.
Proof. Ad (1). According to Proposition 41.8, $F$ is weakly sequentially lower
semicontinuous. The existence assertions for (13) and (14) follow from
Proposition 41.2. As in the proof of Theorem 42.A in Section 42.5, the
continuity of $F$ and the fact that $\overline{\bigcup_n X_n \cap M} = M$ imply $\alpha_n \to \alpha$ as $n \to \infty$.
Thus, $(u_n)$ is a minimal sequence, and Corollary 41.3 yields the assertion of
the theorem.
Ad (2). The uniqueness follows from Theorem 38.C in Section 38.4. Then
we obtain the assertion by again using Corollary 41.3. □
46.6. The Projected Gradient Method
We consider the minimum problem
$$\min_{u \in M} F(u) - \langle b, u\rangle = \alpha \tag{15}$$
with the corresponding variational inequality
$$\langle F'(u) - b, v - u\rangle \ge 0 \quad \text{for all } v \in M. \tag{16}$$
Parallel to this, we also investigate the more general variational inequality
$$\langle Au - b, v - u\rangle \ge 0 \quad \text{for all } v \in M. \tag{17}$$
We seek $u \in M$.
The basic idea for dealing with (16) and (17) is that we construct the
operator
$$L_t u := P(u - tJ^{-1}(Au - b)) \tag{18}$$
and, instead of (17), study the fixed-point problem
$$u = L_t u, \qquad u \in X, \tag{19}$$
with the corresponding iteration method
$$u_{n+1} = L_t u_n, \qquad u_0 \in M, \quad n = 0, 1, \ldots. \tag{20}$$
Thereby, we will apply the combined monotonicity and contractivity trick
from Section 25.4. We assume:
(H1) $A\colon X \to X^*$ is a strongly monotone and Lipschitz-continuous
operator on the real separable H-space $X$, i.e., there exist numbers $a, m > 0$ such
that for all $u, v \in X$,
$$m\|u - v\|^2 \le \langle Au - Av, u - v\rangle,$$
$$\|Au - Av\| \le a\|u - v\|.$$
We choose $t$ so that $0 < t < 2m/a^2$. Furthermore, $b$ is a fixed element in $X^*$.
(H2) $M$ is a closed, convex, and nonempty set in $X$.
(H3) $P\colon X \to M$ is the projection operator from $X$ on $M$ defined in Section
46.4. Furthermore, $J\colon X \to X^*$ is the duality mapping explained in Section
21.4. If we identify $X$ with $X^*$, then $J$ passes to the identity mapping
$I\colon X \to X$.
The following contractivity condition, which follows from Section 25.4, is
crucial:
$$\|L_t u - L_t v\| \le k\|u - v\| \quad \text{for all } u, v \in X,$$
where $k^2 := 1 - 2mt + t^2a^2$. According to the choice of $t$ in (H1), $0 \le k < 1$.
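The value of the contraction constant can be verified by a short computation. The derivation below is a sketch under the identification of $X$ with $X^*$ (so that $J = I$), with the auxiliary operator $Bu := u - t(Au - b)$; it uses only the two inequalities in (H1) and the nonexpansivity of $P$ from Proposition 46.5.

```latex
\begin{align*}
\|Bu - Bv\|^2
  &= \|u - v\|^2 - 2t\,\langle Au - Av,\, u - v\rangle + t^2\,\|Au - Av\|^2 \\
  &\le \bigl(1 - 2mt + t^2a^2\bigr)\,\|u - v\|^2 , \\
\|L_t u - L_t v\|
  &= \|PBu - PBv\| \le \|Bu - Bv\| \le k\,\|u - v\| ,
  \qquad k^2 = 1 - 2mt + t^2a^2 .
\end{align*}
```

Since $t^2a^2 - 2mt = t(ta^2 - 2m)$ is negative exactly when $0 < t < 2m/a^2$, this range of $t$ gives $k < 1$. The choice $t = m/a^2$ minimizes $k^2$, with minimal value $1 - m^2/a^2$.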
Theorem 46.C. With the assumptions (H1)-(H3), the following two assertions
hold:
(1) $L_t$ is a $k$-contractive operator on $X$.
(2) The variational inequality (17) and the fixed-point problem (19) are
mutually equivalent. (19) has exactly one solution $u$ on $M$ and the iteration
process (20) converges to $u$ as $n \to \infty$, with the error estimate
$$\|u_n - u\| \le k^n(1 - k)^{-1}\|u_1 - u_0\|, \qquad n = 1, 2, \ldots.$$
Remark 46.6. (19) is precisely of the type that we considered in Theorem
25.A in Section 25.1.
Therefore, for an approximate solution of (19), one can use not only
iteration methods, but also projection methods and projection-iteration
methods, and all these methods converge according to Theorem 25.A.
If one compares (20) with the gradient method (42.23), with $U = J^{-1}$,
then because of the additional appearance of the projection operator P in
(18) and (20), the designation projected gradient method becomes
understandable. P provides for the situation that each solution of (19) is
automatically in M. The main difficulty of the method lies in the construction of
P in concrete problems.
If $F\colon X \to \mathbb{R}$ is convex and $F'$ exists on $X$ as a G-derivative, then (15) and
(16) are mutually equivalent according to Theorem 46.A in Section 46.1 and
Theorem 46.C can be applied to solve (15) and (16) with A = F'.
Proof. Ad (1). Let $Bu := u - tJ^{-1}(Au - b)$. According to Section 25.4, $B$ is
$k$-contractive. Since, by Proposition 46.5, $P$ is nonexpansive, we obtain
$$\|PBu - PBv\| \le \|Bu - Bv\| \le k\|u - v\|.$$
Ad (2). Equation $u = L_t u$ is equivalent to $u = PBu$ and this is equivalent to
$$(u - Bu \mid v - u) \ge 0 \quad \text{for all } v \in M,$$
by Proposition 46.5, (3). However, because $t > 0$, this is equivalent to
$$0 \le (J^{-1}(Au - b) \mid v - u) = \langle Au - b, v - u\rangle \quad \text{for all } v \in M.$$
The assertions for (19) and (20) follow immediately from the Banach
fixed-point theorem (Theorem 1.A in Section 1.1). □
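The iteration (20) and its error estimate can be tried out in a finite-dimensional setting. The sketch below assumes $X = \mathbb{R}^2$ identified with $X^*$ (so $J = I$), the linear operator $Au = Qu$ with a hypothetical symmetric matrix $Q$ whose eigenvalues are 1 and 3 (hence $m = 1$, $a = 3$), a hypothetical right-hand side $b$, and the cone $M = \{u : u_1 \ge 0,\ u_2 \ge 0\}$. The contraction constant is computed from $k^2 = 1 - 2mt + t^2a^2$, and the solution $u = (1/2, 0)$ of the variational inequality (17) for these data was found by hand.

```python
import math

Q = [[2.0, 1.0], [1.0, 2.0]]  # assumed symmetric matrix, eigenvalues 1 and 3
b = [1.0, -2.0]               # assumed right-hand side
m, a = 1.0, 3.0               # monotonicity and Lipschitz constants of A = Q
t = m / a ** 2                # any t in (0, 2*m/a**2) is admissible
k = math.sqrt(1.0 - 2.0 * m * t + (t * a) ** 2)


def A(u):
    """Linear operator A u = Q u."""
    return [Q[0][0] * u[0] + Q[0][1] * u[1], Q[1][0] * u[0] + Q[1][1] * u[1]]


def P(u):
    """Projection onto M = {u : u_1 >= 0, u_2 >= 0}."""
    return [max(u[0], 0.0), max(u[1], 0.0)]


def L_t(u):
    """Operator (18) with J = I: projected gradient-type step."""
    Au = A(u)
    return P([u[i] - t * (Au[i] - b[i]) for i in range(2)])


def norm(u):
    return math.hypot(u[0], u[1])


def iterate(u0, steps):
    """Return the list u_0, u_1, ..., u_steps of the iteration (20)."""
    us = [u0]
    for _ in range(steps):
        us.append(L_t(us[-1]))
    return us


iterates = iterate([0.0, 0.0], 200)
u_star = [0.5, 0.0]  # solution of (17) for these data, found by hand
step0 = norm([iterates[1][i] - iterates[0][i] for i in range(2)])

for n in (1, 10, 50, 200):
    err = norm([iterates[n][i] - u_star[i] for i in range(2)])
    bound = k ** n / (1.0 - k) * step0  # a priori estimate of Theorem 46.C
    print(n, err, bound)
```

In this example the observed error decays faster than the a priori bound, which is consistent with the estimate being a worst-case statement. The projection $P$ is trivial to evaluate here; as the text notes, constructing $P$ is the main difficulty in concrete problems.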
46.7. The Penalty Functional Method
We study the minimum problem
$$F(u) = \min!, \qquad u \in X \tag{21a}$$
subject to the side conditions
$$F_i(u) \le 0, \quad i = 1, \ldots, p, \qquad F_j(u) = 0, \quad j = p+1, \ldots, N. \tag{21b}$$
Here, we also allow that the inequalities or the equalities do not even
appear. This can be attained formally by $F_k = 0$. The penalty method for the
approximate solution of (21) reads as follows:
$$F(u) + k_n\Big(\sum_{i=1}^p (\overline{F}_i(u))^2 + \sum_{j=p+1}^N \|F_j(u)\|^2\Big) = \min!, \qquad u \in X, \tag{22}$$
where $\overline{F}_i(u) := \max\{F_i(u), 0\}$, or
$$F(u) + k_n\Big(\sum_{i=1}^p (F_i(u) - t^{(i)})^2 + \sum_{j=p+1}^N \|F_j(u)\|^2\Big) = \min!, \tag{23}$$
where $(u, t) \in X \times \mathbb{R}^p$ and $t = (t^{(1)}, \ldots, t^{(p)})$.
We have already explained the basic idea of the penalty method in
Section 37.29d. Here, it is important to note that in (22) and (23) we are
dealing with a free minimum problem, in contrast to (21). The advantage of
(23) over (22) is that for differentiable $F_i$, $i = 1, \ldots, p$, the corresponding
penalty term is also differentiable. We assume:
(H1) $X$ and $Y$ are real reflexive B-spaces.
(H2) $F\colon X \to \mathbb{R}$ is weakly sequentially lower semicontinuous and $F(u) \to +\infty$
as $\|u\| \to \infty$.
(H3) $F_i\colon X \to \mathbb{R}$, $i = 1, \ldots, p$, is weakly sequentially continuous.
(H4) $F_j\colon X \to Y$, $j = p+1, \ldots, N$, is weakly sequentially continuous.
(H5) There exists a $u \in X$ that satisfies the side condition (21b).
Furthermore, $(k_n)$ is a sequence of positive numbers such that $k_n \to \infty$ as
$n \to \infty$.
Theorem 46.D. With the assumptions (H1)-(H5), the following two
assertions hold:
(1) For each $n = 1,2,\dots$, the penalty problem (23) has a solution $(u_n, t_n) \in
X \times \mathbb{R}^p_-$. There exists a subsequence $(u_{n'})$ of $(u_n)$ which converges weakly
to a solution $u$ of the original problem (21), and $F(u_{n'}) \to F(u)$ as $n \to \infty$.
(2) If (21) has exactly one solution $u$, then $u_n \rightharpoonup u$ and $F(u_n) \to F(u)$ as
$n \to \infty$.
Corollary 46.7. There is an analogous assertion for (22). If $F$ possesses a
uniformly monotone G-derivative on $X$, i.e.,
$$\langle F'(u) - F'(v), u - v\rangle \ge c\|u - v\|^p$$
for all $u, v \in X$ and fixed $c > 0$, $p > 1$, then one can replace weak convergence
in Theorem 46.D by strong convergence.
Proof. Ad (1) If we denote the left-hand side of (23) by $G_n$, then
$$G_n(u,t) \to +\infty \quad \text{as } \|u\| + \|t\| \to \infty \tag{24}$$
holds. For $\|u\| \to \infty$, this follows from $G_n(u,t) \ge F(u) \to +\infty$. By (H3), $F_i$,
$i = 1,\dots,p$, is strongly continuous and thus bounded according to Fig. 27.1.
For this reason, (24) is also valid for sequences for which $\|u\|$ remains
bounded and $\|t\| \to \infty$ holds. $F$ and $G_n$ are weakly sequentially lower
semicontinuous. Proposition 41.2 yields the existence of a solution $(u_n, t_n)$
of (23). Obviously, $t_n^{(i)}$ must be nonpositive.
If $U$ denotes the set of all $u \in X$ where (21b) holds, then $U$ is weakly
sequentially closed. Proposition 41.2 yields the existence of a solution $v$ of
(21). If we set $t^{(i)} = F_i(v)$, then, because of (21b), $G_n(u_n, t_n) \le G_n(v, t) \le F(v)$
holds; therefore,
$$G_n(u_n,t_n) = F(u_n) + k_n\Bigl(\sum_{i=1}^{p}\bigl(F_i(u_n) - t_n^{(i)}\bigr)^2 + \sum_{j=p+1}^{N}\|F_j(u_n)\|^2\Bigr) \le F(v). \tag{25}$$
From this it follows that
$$\begin{aligned}
F(u_n) &\le F(v),\\
\bigl(F_i(u_n) - t_n^{(i)}\bigr)^2 &\le k_n^{-1}\bigl(F(v) - F(u_n)\bigr), \quad i = 1,\dots,p,\\
\|F_j(u_n)\|^2 &\le k_n^{-1}\bigl(F(v) - F(u_n)\bigr), \quad j = p+1,\dots,N.
\end{aligned} \tag{26}$$
(24) and (25) yield the a priori estimate $\sup_n(\|u_n\| + \|t_n\|) < \infty$.
Consequently, there exist subsequences which we denote by $(u_n)$, $(t_n)$ such that
$u_n \rightharpoonup u$ and $t_n \to t$ as $n \to \infty$; therefore,
$$F(u) \le \liminf_{n\to\infty} F(u_n) \le F(v). \tag{27}$$
(26) yields
$$\bigl(F_i(u) - t^{(i)}\bigr)^2 \le 0, \qquad \|F_j(u)\| \le 0.$$
Since $t^{(i)} \le 0$, $u$ satisfies the side condition (21b). According to (27), $u$ is a
solution of (21), for $F(v)$ is equal to the minimal value.
Ad (2) Use the convergence principle (Proposition 10.13, (2)). □
The first part of Corollary 46.7 is proved analogously. By Corollary 42.7,
the second part follows from
$$F(u_n) - F(u) \ge \langle F'(u), u_n - u\rangle + c\,p^{-1}\|u_n - u\|^p$$
and
$$F(u_n) \to F(u), \quad u_n \rightharpoonup u \quad \text{as } n \to \infty.$$
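The behaviour described in Theorem 46.D and Corollary 46.7 can be observed on a one-dimensional toy problem. The sketch below is an illustration under assumptions of our own choosing, not an example from the text: $X = \mathbb{R}$, $F(u) = (u-2)^2$, and the single side condition $F_1(u) = u - 1 \le 0$, treated by the penalty functional (22). The free minimum problem is solved by a ternary search, which is adequate here because the penalized functional is strictly convex; the helper names are hypothetical.

```python
def F(u):
    return (u - 2.0) ** 2


def F1(u):
    # Side condition F1(u) <= 0, i.e. u <= 1.
    return u - 1.0


def penalized(u, k):
    # Penalty functional (22) with one inequality and no equalities.
    return F(u) + k * max(F1(u), 0.0) ** 2


def minimize_convex(g, lo=-10.0, hi=10.0, iters=200):
    # Ternary search for the minimizer of a strictly convex function on [lo, hi].
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) <= g(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)


penalty_params = [1.0, 10.0, 100.0, 1000.0]
minimizers = [minimize_convex(lambda u, k=k: penalized(u, k))
              for k in penalty_params]
```

For this particular problem the penalized minimizer can be computed by hand as $(2+k)/(1+k)$, so the approximations approach the constrained solution $u = 1$ from the infeasible side as $k \to \infty$.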
46.8. Regularization of Linear Problems
In combination with Section 37.14, we consider the linear equation
$$Au = b, \quad u \in X, \tag{28}$$
with the regularization method
$$\min_{v \in X}\, \|A_\delta v - b_\delta\|^2 + \delta\|v\|^2 = \alpha \tag{29}$$
and the corresponding Euler equation
$$(A_\delta^* A_\delta + \delta I)u_\delta = A_\delta^* b_\delta, \quad u_\delta \in X. \tag{30}$$
$A^*$ denotes the adjoint operator to $A$.
Here, we assume:
(H1) $A\colon X \to Y$ is a linear continuous operator. $X$ and $Y$ are real H-spaces.
If $P$ denotes the orthogonal projection operator from $Y$ onto $\overline{R(A)}$, then let
$b$ be a fixed element in $Y$ such that $Pb \in R(A)$.
According to Proposition 37.29, equation (28) then has a normal solution
$u_R$ which is equal to the uniquely determined solution of $Au = Pb$, $u \in X$,
with the smallest norm.
(H2) For each $\delta$: $0 < \delta \le \delta_0$ there exists a continuous linear operator $A_\delta\colon
X \to Y$ and an element $b_\delta$ in $Y$ such that $\|A - A_\delta\| \le \delta$ and $\|b - b_\delta\| \le \delta$.
$A_\delta$ and $b_\delta$ arise from $A$ and $b$, respectively, in practical problems on the
basis of round-off errors.
(H3) If $P_\delta$ denotes the orthogonal projection operator from $Y$ onto
$\overline{R(A_\delta)}$, then, in the case where $b \notin R(A)$, we consider only those operators
$A_\delta$ having the property
$$\|Pb - P_\delta b_\delta\| \le (\text{constant})\,\delta \quad \text{for all } \delta,\ 0 < \delta \le \delta_0. \tag{31}$$
Theorem 46.E. With the assumptions (H1)-(H3), the following two
assertions hold:
(1) The regularized problem (30) has exactly one solution $u_\delta$. The problems
(30) and (29) are mutually equivalent.
(2) If $\delta \to 0$, then $u_\delta \to u_R$.
The operator $B = A_\delta^* A_\delta + \delta I$ is self-adjoint and all its eigenvalues are
greater than or equal to $\delta$ because
$$(Bu \mid u) = (A_\delta u \mid A_\delta u) + \delta(u \mid u) \ge \delta(u \mid u).$$
(2) asserts that the solutions $u_\delta$ of the stable problem (30) with the strongly
monotone operator $B$ tend to the normal solution $u_R$ of the original problem
(28) as $\delta \to 0$. If we set $A = A_\delta$, $b = b_\delta$, then, according to Section 46.7, (29)
is the penalty method for $\|u\|^2 = \min!$ with the side condition $\|Au - b\|^2 = 0$.
The significance of Theorem 46.E for the solution of ill-posed problems was
explained in Section 37.14. We have already considered applications in
Section 37.15.
We delve into the iterative determination of the normal solution and the
corresponding error estimates in Problem 46.5.
Proof. Ad (1) This follows immediately from Theorem 42.A in Section 42.5
and (37.103d). A perusal of the proof of Theorem 42.A shows that the
separability of $X$ required there is not necessary for the assertions needed
here.
Ad (2) First, we show that when $b \in R(A)$, (31) follows from (H2), for we
then have
$$\begin{aligned}
\|Pb - P_\delta b_\delta\| = \|b - P_\delta b_\delta\| &\le \|b - b_\delta\| + \|b_\delta - P_\delta b_\delta\|\\
&\le \|b - b_\delta\| + \|b_\delta - A_\delta u_R\|\\
&\le \|b - b_\delta\| + \|b_\delta - b\| + \|Au_R - A_\delta u_R\|\\
&\le 2\|b - b_\delta\| + \|A - A_\delta\|\,\|u_R\| \le 2\delta + \delta\|u_R\|.
\end{aligned} \tag{32}$$
Note that $\|b_\delta - P_\delta b_\delta\| \le \|b_\delta - A_\delta u\|$ for all $u \in X$ holds because of the
construction of $P_\delta$.
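(I) Weak convergence of $(u_\delta)$. Since $P_\delta A_\delta v = A_\delta v$,
$$\|A_\delta v - b_\delta\|^2 = \|A_\delta v - P_\delta b_\delta\|^2 + \|(I - P_\delta)b_\delta\|^2 \quad \text{for all } v \in X.$$
(This is the theorem of Pythagoras.) Consequently, $u_\delta$ is also a solution of
problem (29) if one replaces $b_\delta$ by $P_\delta b_\delta$; therefore,
$$\|A_\delta u_\delta - P_\delta b_\delta\|^2 + \delta\|u_\delta\|^2 \le \|A_\delta u_R - P_\delta b_\delta\|^2 + \delta\|u_R\|^2. \tag{33}$$
The next relation is important:
$$\|A_\delta u_R - P_\delta b_\delta\| = O(\delta) \quad \text{as } \delta \to 0. \tag{34}$$
According to (H2) and (H3), this follows from $Au_R = Pb$ and
$$\|A_\delta u_R - P_\delta b_\delta\| \le \|A_\delta u_R - Au_R\| + \|Pb - P_\delta b_\delta\|.$$
Therefore, after division by $\delta$, (33) yields the a priori estimate:
$$\|u_\delta\|^2 \le \|u_R\|^2 + O(\delta) \quad \text{as } \delta \to 0. \tag{35}$$
Consequently, there exists a weakly convergent subsequence which we shall
again briefly denote by $(u_\delta)$; therefore, $u_\delta \rightharpoonup v$ as $\delta \to 0$, and
$$\|v\| \le \liminf_{\delta \to 0} \|u_\delta\| \le \|u_R\|. \tag{36}$$
Furthermore, according to (H2), (H3), and (33), we have
$$\|Au_\delta - Pb\| \le \|Au_\delta - A_\delta u_\delta\| + \|A_\delta u_\delta - P_\delta b_\delta\| + \|P_\delta b_\delta - Pb\| \to 0 \quad \text{as } \delta \to 0.$$
$A$ is weakly sequentially continuous (cf. Fig. 27.1); therefore, $Av = Pb$.
By (36), $\|v\| \le \|u_R\|$, i.e., $v = u_R$ by the construction of $u_R$.
Since $v$ is uniquely determined, it follows [by the convergence principle
(Proposition 10.13, (2))] that the entire sequence $(u_\delta)$ converges weakly to $u_R$
as $\delta \to 0$.
(II) Strong convergence of $(u_\delta)$. From (35) and (36), with $v = u_R$, it
follows that $\|u_\delta\| \to \|u_R\|$ as $\delta \to 0$. Moreover, $u_\delta \rightharpoonup u_R$ as $\delta \to 0$. Now, $X$ is an
H-space. It thus follows that $u_\delta \to u_R$ as $\delta \to 0$ (cf. Problem 46.4).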
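To see Theorem 46.E at work, consider a small finite-dimensional sketch. Everything concrete here is an assumption made for illustration: $X = Y = \mathbb{R}^2$, the singular matrix $A$ with all entries $1$, the right-hand side $b = (2,2) \in R(A)$ with normal solution $u_R = (1,1)$, and a particular perturbation $A_\delta$, $b_\delta$ satisfying (H2). The function `regularized_solution` solves the Euler equation (30) by Cramer's rule, and `naive_solution` solves the perturbed system $A_\delta u = b_\delta$ directly for comparison.

```python
def perturbed_data(delta):
    # Hypothetical perturbation: ||A - A_delta|| = delta, ||b - b_delta|| = delta/2.
    A_d = ((1.0 + delta, 1.0), (1.0, 1.0))
    b_d = (2.0 + 0.5 * delta, 2.0)
    return A_d, b_d


def regularized_solution(delta):
    # Solves (A_d^T A_d + delta*I) u = A_d^T b_d, i.e. equation (30), in R^2.
    (a11, a12), (a21, a22) = perturbed_data(delta)[0]
    b1, b2 = perturbed_data(delta)[1]
    m11 = a11 * a11 + a21 * a21 + delta
    m12 = a11 * a12 + a21 * a22
    m22 = a12 * a12 + a22 * a22 + delta
    r1 = a11 * b1 + a21 * b2
    r2 = a12 * b1 + a22 * b2
    det = m11 * m22 - m12 * m12
    return ((r1 * m22 - m12 * r2) / det, (m11 * r2 - m12 * r1) / det)


def naive_solution(delta):
    # Solves the perturbed (invertible but ill-conditioned) system directly.
    (a11, a12), (a21, a22) = perturbed_data(delta)[0]
    b1, b2 = perturbed_data(delta)[1]
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - a21 * b1) / det)


def distance_to_normal_solution(u):
    # Euclidean distance to the normal solution u_R = (1, 1).
    return ((u[0] - 1.0) ** 2 + (u[1] - 1.0) ** 2) ** 0.5


deltas = [1e-1, 1e-2, 1e-3, 1e-4]
errors = [distance_to_normal_solution(regularized_solution(d)) for d in deltas]
```

In this example the regularized solutions approach $u_R$ as $\delta \to 0$, whereas the directly computed solutions of the perturbed system stay at a fixed distance from it; this is the stabilizing effect that the theorem describes.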
46.9. Regularization of Nonlinear Problems
In connection with the nonlinear operator equation
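$$Au = b, \quad u \in Z, \tag{37}$$
we study the regularized problem
$$\inf_{z \in Z}\, \|Az - b_\delta\|_Y^2 + \delta\cdot F(z) = \alpha. \tag{38}$$
Our goal is an assertion of the form:
$$\|u - u_\delta\|_X < \varepsilon \quad \text{for } \|b - b_\delta\|_Y \le \delta,\ 0 < \delta \le \delta(\varepsilon). \tag{39}$$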
To this end, we assume:
(H1) $A\colon X \to Y$ is an injective continuous operator. $X$, $Y$, and $Z$ are real
reflexive B-spaces. The embedding $Z \subseteq X$ is compact.
(H2) $F\colon Z \to [0, \infty[$ is weakly sequentially lower semicontinuous. For each
$r > 0$, $F^{-1}([0, r])$ is bounded in $Z$.
(H3) $b$ is a given fixed element in $A(Z)$, and $u$ denotes the solution of (37).
Theorem 46.F. If (H1)-(H3) hold, then for each $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$
such that for each $b_\delta \in Y$ with $\|b - b_\delta\|_Y \le \delta$ and $0 < \delta \le \delta(\varepsilon)$, the regularized
problem (38) has a solution $u_\delta$ for which $\|u - u_\delta\|_X < \varepsilon$.
Example 46.8. $F(u) \overset{\mathrm{def}}{=} \|u\|_Z^2$ can be chosen as a prototype for $F$.
Proof. (I) Existence of $u_\delta$ for fixed $\delta > 0$. We have $\alpha \ge 0$. Let $(u_n)$ be a
minimal sequence of (38), i.e.,
$$\|Au_n - b_\delta\|_Y^2 + \delta\cdot F(u_n) \to \alpha \quad \text{as } n \to \infty.$$
Due to the boundedness of the sequence $(F(u_n))$, according to (H2) the
sequence $(u_n)$ is also bounded (in $Z$). Consequently, there exists a
subsequence which we shall again denote by $(u_n)$ such that $u_n \rightharpoonup u_\delta$ as $n \to \infty$ in $Z$
and thus $u_n \to u_\delta$ in $X$; therefore,
$$\|Au_\delta - b_\delta\|_Y^2 + \delta\cdot F(u_\delta) \le \alpha.$$
(II) Proof of (39). Let $\|b - b_\delta\| \le \delta$. Since $Au = b$ and $u \in Z$, from (38) it
follows that
$$\|Au_\delta - b_\delta\|_Y^2 + \delta\cdot F(u_\delta) \le \|Au - b_\delta\|_Y^2 + \delta\cdot F(u) = O(\delta) \tag{40}$$
as $\delta \to 0$. Consequently, $F(u_\delta) \le r$ for all $\delta$ in a neighborhood of zero for
appropriate $r$, i.e., $u_\delta \in F^{-1}([0, r])$.
The set $M \overset{\mathrm{def}}{=} F^{-1}([0,r]) \cup \{u\}$ is bounded in $Z$ (according to (H2)), and
thus it is relatively compact in $X$. The closure $\overline{M}$ of $M$ in $X$ is therefore
compact. The operator $A\colon \overline{M} \to Y$ is continuous and injective. Consequently,
$A^{-1}\colon A(\overline{M}) \to \overline{M}$ exists as a continuous operator (cf. $A_1$(12e)). We shall
show that
$$\|Au - Au_\delta\|_Y = O(\delta^{1/2}) \quad \text{as } \delta \to 0. \tag{41}$$
Since $u, u_\delta \in \overline{M}$, the relation (39) immediately follows. However, (41)
follows directly from
$$\|b - Au_\delta\| \le \|b - b_\delta\| + \|b_\delta - Au_\delta\|$$
and (40). □
Problems
46.1. Proof of Proposition 46.1. Solution: Let $F(u) \overset{\mathrm{def}}{=} 2^{-1}a(u, u)$. The functional
$F$ is weakly sequentially lower semicontinuous and strongly convex, according
to Example 38.16. Furthermore,
$$\langle F'(u), v\rangle = \lim_{t \to 0} t^{-1}\bigl(F(u + tv) - F(u)\bigr) = a(u,v)$$
holds and
$$F(u) - b(u) \ge c\|u\|^2 - \|b\|\,\|u\| \to +\infty \quad \text{as } \|u\| \to \infty.$$
Theorem 46.A in Section 46.1 yields the assertion.
46.2. Proof of Corollary 46.2. Solution: (6) obviously follows from (7) by
subtraction. Conversely, we obtain (7) from (6) upon choosing
$$v \overset{\mathrm{def}}{=} u + w \ \text{ for } w \in M \quad \text{and} \quad v \overset{\mathrm{def}}{=} 2u, \quad v \overset{\mathrm{def}}{=} 0.$$
Here, $v$ is always an element of $M$.
46.3. Proof of Example 46.4. Solution: $M$ is closed, for if $u_n \to u$ in $X$, then for the
generalized boundary values on $\partial G$, we have the convergence $u_n \to u$ in
$L_2(\partial G)$. Furthermore, $M$ is obviously a convex cone. The mapping $a\colon
X \times X \to \mathbb{R}$ is bilinear, bounded, symmetric, and strongly positive. This
follows from the fact that $|\cdot|$ with $|u| = a(u,u)^{1/2}$ represents an equivalent
norm on $X$ because $c > 0$. Corollary 46.2 yields the assertion. Take (22.1')
and Section 21.2 into account.
46.4. Criterion for strong convergence. Show: In an H-space or, more generally, in
a locally uniformly convex space (cf. $A_3$(21a)), it follows from the weak
convergence $u_n \rightharpoonup u$ and the convergence of the norms $\|u_n\| \to \|u\|$ as $n \to \infty$
that we have the strong convergence $u_n \to u$.
Hint: Compare Pascali and Sburlan (1978, M), page 5. In a real H-space,
the assertion follows immediately from
$$\|u_n - u\|^2 = \|u_n\|^2 - 2(u_n \mid u) + \|u\|^2.$$
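For the H-space case the hint can be completed in one line. The following is a sketch of the limit passage; it uses only that weak convergence gives $(u_n \mid u) \to (u \mid u)$ for the fixed element $u$, together with the assumed convergence of the norms.

```latex
\|u_n - u\|^2
  = \|u_n\|^2 - 2\,(u_n \mid u) + \|u\|^2
  \;\longrightarrow\; \|u\|^2 - 2\,(u \mid u) + \|u\|^2 = 0
  \qquad (n \to \infty).
```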
46.5.* Methods for the solution of nonlinear optimization problems in $\mathbb{R}^N$. For the
numerical treatment of such problems, we have, in principle, three classes of
methods at our disposal:
(i) Method of feasible directions,
(ii) Penalty methods,
(iii) Method of cutting hyperplanes.
(i) is a matter of gradient methods where it is important to find descent
directions which are compatible with the side conditions. Moreover, one
must control the step length appropriately in order to prevent the method
from oscillating near a vertex of the feasible region, where this vertex is not
an optimal point. That is, we must avoid the situation where the method
stays at an incorrect vertex.
In (iii), the idea is to represent convex sets as the intersection of
half-spaces and then to appropriately approximate the feasible region in a
neighborhood of an optimal point.
For an introduction to the modern algorithmic treatment of these ideas,
we recommend Foulds (1981, M), Grossmann and Kleinmichel (1976, L), and
Pšeničnyi and Danilin (1979, M), as well as the literature that is to be
found in the references to the literature for this chapter under the caption
"Algorithms in $\mathbb{R}^N$".
46.6.* Regularization and iteration methods. Suppose given the equation $Au = b$,
where $A\colon X \to Y$ is a linear continuous operator, $X$ and $Y$ are H-spaces, and
$b \in R(A)$. Let $u_R$ be the normal solution of $Au = b$. For the determination
of $u_R$ by iteration, we make use of the iteration method
$$u_n = (I - A_\delta^* A_\delta)u_{n-1} + A_\delta^* b_\delta, \quad n = 1,2,\dots, \quad u_0 = 0,$$
where $\|A_\delta - A\| \le \delta$, $\|b_\delta - b\| \le \delta$, and we also make use of the
normalization condition $\|A_\delta\| \le 1$, $\|A\| \le 1$. The operator $A_\delta\colon X \to Y$ is to be linear and
continuous. We stop the iteration at $n = n(\delta)$ when $\|u_n - u_{n-1}\| \le \delta$. Show:
$$\|u_{n(\delta)} - u_R\| \to 0 \quad \text{and} \quad \delta\, n(\delta) \to 0 \quad \text{as } \delta \to 0.$$
If $u_R = (A^*A)^p v$, $p > 0$, and $\|v\| \le r$, then one has the more precise
estimates
$$\|u_{n(\delta)} - u_R\| \le c\,\delta^{p/(p+1)}, \qquad n(\delta) \le d\,\delta^{-1/(p+1)}.$$
The constants $c$ and $d$ depend only on $p$ and $r$.
Hint: Compare Vainikko (1980), (1982, L). A number of further results
can be found there.
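The iteration and its stopping rule can be sketched in a few lines. The data below are assumptions chosen only for illustration: $X = Y = \mathbb{R}^2$, a diagonal matrix $A$ with $\|A\| = 0.8 \le 1$ and a nontrivial null space, exact right-hand side $b = (1.6, 0)$ with normal solution $u_R = (2, 0)$, and a perturbed $b_\delta$ with $\|b_\delta - b\| = \delta$ (here $A_\delta = A$). The function name `landweber` is our own label for the scheme.

```python
import math


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def landweber(A_d, b_d, delta, max_iter=10000):
    # u_n = (I - A_d^T A_d) u_{n-1} + A_d^T b_d, started at u_0 = 0 and
    # stopped at the first n with ||u_n - u_{n-1}|| <= delta.
    At = transpose(A_d)
    u = [0.0] * len(A_d[0])
    for n in range(1, max_iter + 1):
        residual = [y - bi for y, bi in zip(matvec(A_d, u), b_d)]
        correction = matvec(At, residual)
        u_new = [ui - ci for ui, ci in zip(u, correction)]
        step = math.sqrt(sum((p - q) ** 2 for p, q in zip(u_new, u)))
        u = u_new
        if step <= delta:
            return u, n
    return u, max_iter


delta = 1e-3
A = [[0.8, 0.0], [0.0, 0.0]]                   # hypothetical operator
b_delta = [1.6 + 0.6 * delta, 0.8 * delta]     # ||b_delta - b|| = delta
u_stop, n_stop = landweber(A, b_delta, delta)
error = math.sqrt((u_stop[0] - 2.0) ** 2 + u_stop[1] ** 2)
```

In this example the stopped iterate lies within a small multiple of $\delta$ of $u_R$, and the component of the data error orthogonal to $R(A)$ is never fed back into the iterate.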
46.7. Regularization for a not necessarily uniquely solvable equation. Together with
the not necessarily uniquely solvable equation
$$Au = f, \tag{42}$$
we consider the regularized uniquely solvable equation
$$Au + n^{-1}Bu = f + n^{-1}g. \tag{43}$$
Let $A, B\colon X \to X^*$ be operators on the real reflexive separable B-space $X$ for
which the following hold:
(i) $A$ is hemicontinuous, monotone, and coercive,
(ii) $B$ is hemicontinuous, strictly monotone, and bounded.
Let $f, g \in X^*$ be given and fixed. Show: For each $n \in \mathbb{N}$, (43) possesses
exactly one solution $u_n$. We have $u_n \rightharpoonup u_0$ as $n \to \infty$, and $u_0$ is a solution of
(42).
The solution set $L$ of (42) is convex, bounded, and closed. The solution $u_0$
is uniquely characterized by $\langle Bu_0 - g, v - u_0\rangle \ge 0$ for all $v \in L$. If $B$
satisfies condition (S), then $u_n \to u_0$ as $n \to \infty$ in $X$.
According to Section 47.12, one can choose B to be the duality mapping
when X and X* are strictly convex and separable.
Hint: Make use of (25.20°). Compare Gajewski, Gröger, and Zacharias
(1974, M), page 87. Further regularization methods in this direction can be
found in Pascali (1974, M) and in Pascali and Sburlan (1978, M).
Regularization methods for semicoercive problems are contained in Hess
(1974). We discuss this in Problem 54.5.
References to the Literature
Variational inequalities: Lions (1971, M); Kinderlehrer and Stampacchia
(1980, M) (cf., also, the more detailed references to the literature in Chapter
54).
Projection on convex sets: Moreau (1962), (1965); Zarantonello (1971a).
Approximation methods for problems with side conditions: Cea (1971,
M) (recommended as an introduction); Poljak (1974, S); Glowinski, Lions,
and Tremolieres (1976, M); Kluge (1979, M); Fletcher (1980, M), Vol. 2.
Algorithms in $\mathbb{R}^N$: Fletcher (1980, M), Vols. 1-2 (standard work); Polak
(1971, M), (1973, S); Grossmann and Kleinmichel (1976, L);
Grossmann and Kaplan (1979, L) (penalty method); Pšeničnyi and Danilin
(1979, M); Dixon (1980, P) (state of the art); Foulds (1981, M) (emphasizing
practical applications; recommended as an introduction).
Regularization: Cea (1971, M); Morozov (1973, S); Tihonov and Arsenin
(1977, M); Vainikko (1980), (1982, L) (also, cf. the references to the
literature in Section 37.29).
CHAPTER 47
Convex Functionals on Convex Sets and
Convex Analysis
His number-theoretic investigations led Minkowski (1864-1909) for the first
time to the realization that the concept of a convex body is a fundamental
concept in our science.
David Hilbert, 1910
The foundations of the general theory of convex sets and functions were laid
around the turn of the century, chiefly by Minkowski.
R. Tyrrell Rockafellar, 1970
Over the last 20 years, parallel to the theory of monotone operators, a
calculus for the investigation of convex functionals designated by convex
analysis has emerged, which allows one to solve a number of problems in a
simple way. To this calculus belong:
(α) The subgradient $\partial F$ (a generalization of the classical concept of
derivative).
(β) The conjugate functional $F^*$ (duality theory).
In this chapter we deal with the subgradient. We delve into the topics of
conjugate functionals and their applications in Chapters 51 and 52.
If a G-differentiable functional $F\colon X \to \mathbb{R}$ has a minimum at $u$, then
$$F'(u) = 0 \tag{1a}$$
and $\partial F(u) = \{F'(u)\}$. In this chapter we generalize this condition for
nondifferentiable functionals to
$$0 \in \partial F(u). \tag{1b}$$
Furthermore, from the important sum rule for subgradients,
$$\partial(F + G)(u) = \partial F(u) + \partial G(u),$$
we shall obtain, for example, the Kuhn-Tucker theory and the main
theorem of convex approximation theory in a simple way.
We allow F to take on the values ± oo. By means of this, and with the
help of a simple trick, it is possible to change optimization problems with
side conditions into problems without side conditions. In order to elucidate
this, we consider the minimum problem
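$$\min_{u \in M} F(u) = \alpha, \tag{2a}$$
where $F\colon M \subseteq X \to \mathbb{R}$ is a functional on the subset $M$ of the linear space $X$.
If we now set
$$\tilde F(u) \overset{\mathrm{def}}{=} \begin{cases} F(u) & \text{if } u \in M,\\ +\infty & \text{if } u \in X - M, \end{cases} \tag{3}$$
then (2a) is equivalent to the free minimum problem
$$\min_{u \in X} \tilde F(u) = \alpha. \tag{2b}$$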
The proofs of the central propositions of convex analysis are all based
essentially on the separation theorems for convex sets which we have
summarized in Section 39.1. In this connection, we essentially exploit the
fact that with the aid of epigraphs one can reduce the investigation of
convex functionals to the consideration of convex sets.
We handle the applications of the subgradient to variational inequalities
in Chapters 54-56. In Part IV we use the subgradient essentially in
plasticity theory in order to formulate multivalued stress-strain relations.
47.1. The Epigraph
Definition 47.1. Let $F\colon X \to [-\infty, \infty]$ be a functional on the linear space $X$.
(a) $F$ is called convex if and only if
$$F((1-t)u + tv) \le (1 - t)F(u) + tF(v)$$
for all $u, v \in X$, $t \in\, ]0,1[$ for which the right-hand side is meaningful.
Therefore, precisely all $u, v \in X$ for which $F(u)$ and $F(v)$ are simultaneously
infinite with opposite sign are not to be considered.
(b) The effective domain of definition of $F$, $\operatorname{dom} F$, and the epigraph of $F$,
$\operatorname{epi} F$, are defined by the sets
$$\operatorname{dom} F = \{u \in X\colon F(u) < +\infty\},$$
$$\operatorname{epi} F = \{(u,a) \in X \times \mathbb{R}\colon F(u) \le a\}.$$
Furthermore, we recall that $F\colon X \to [-\infty, \infty]$ on the topological space $X$
is said to be lower semicontinuous if and only if the set $\{u \in X\colon F(u) \le r\}$ is
closed for all $r \in \mathbb{R}$.
[Figure 47.1: the epigraph $\operatorname{epi} F$]
Figure 47.1 shows $\operatorname{epi} F$ for $X = \mathbb{R}$. Obviously, (a) coincides with
Definition 42.1 for $F\colon X \to \mathbb{R}$.
Example 47.2. If $M$ is a subset of a locally convex space $X$, then we define
the indicator function of $M$ by
$$\chi_M(u) = \begin{cases} 0 & \text{if } u \in M,\\ +\infty & \text{if } u \in X - M. \end{cases}$$
The following assertions hold:
(i) $M$ is convex if and only if $\chi_M$ is convex.
(ii) $M$ is closed if and only if $\chi_M$ is lower semicontinuous.
For $F\colon M \to \mathbb{R}$ and the extended functional $\tilde F$ as in (3), we have:
(iii) $F$ is convex and $M$ is convex if and only if $\tilde F$ is convex,
(iv) $F$ is lower semicontinuous and $M$ is closed implies $\tilde F$ is lower
semicontinuous.
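The device of admitting the value $+\infty$ is easy to mimic numerically. The following sketch is an illustration under our own assumptions: $X = \mathbb{R}$, $M = [0,1]$, the functional $F(u) = (u-3)^2$, and a finite grid standing in for $X$. It represents $+\infty$ by `math.inf` and compares the constrained minimum over $M$ with the free minimum of $F + \chi_M$. It also checks the convexity inequality for $\chi_M$ on sample pairs, where an inequality of the form $\infty \le \infty$ counts as satisfied.

```python
import math


def indicator(lo, hi):
    # Indicator function of the interval M = [lo, hi]: 0 on M, +inf off M.
    return lambda u: 0.0 if lo <= u <= hi else math.inf


def F(u):
    return (u - 3.0) ** 2


chi = indicator(0.0, 1.0)
grid = [i / 100.0 for i in range(-200, 501)]

# Minimum with the side condition u in M versus the free minimum of F + chi_M.
constrained_min = min(F(u) for u in grid if 0.0 <= u <= 1.0)
free_min = min(F(u) + chi(u) for u in grid)

# Midpoint convexity of chi_M on a coarse sample (t = 1/2 only).
coarse = [i / 10.0 for i in range(-20, 31)]
chi_is_midpoint_convex = all(
    chi(0.5 * u + 0.5 * v) <= 0.5 * chi(u) + 0.5 * chi(v)
    for u in coarse for v in coarse
)
```

Both minima coincide, which is the equivalence of the problem with side condition and the free problem on this grid.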
We now summarize several properties of the epigraph.
Proposition 47.3. If $X$ is a real locally convex space, then the following hold
for $F\colon X \to [-\infty, \infty]$:
(1) $F$ is convex if and only if $\operatorname{epi} F$ is convex.
(2) $F$ is lower semicontinuous if and only if $\operatorname{epi} F$ is closed.
(3) $F$ is continuous at $u$ and $F(u) \ne \pm\infty$ implies $\operatorname{int}\operatorname{epi} F \ne \emptyset$.
(4) $F \not\equiv +\infty$ implies $\operatorname{epi} F \ne \emptyset$.
(5) $F$ is convex implies $\operatorname{dom} F$ is convex.
Proof. (1) (I) ⇒: Let $F$ be convex. From $(u, a), (v, b) \in \operatorname{epi} F$ it follows
that
$$F(tu + (1 - t)v) \le tF(u) + (1 - t)F(v) \le ta + (1 - t)b;$$
therefore, $(tu + (1 - t)v,\ ta + (1 - t)b) \in \operatorname{epi} F$ for all $t \in\, ]0,1[$.
(II) ⇐: Let $\operatorname{epi} F$ be convex. Suppose $-\infty < F(u), F(v) < \infty$. The
remaining cases are handled similarly. From $(u, F(u)), (v, F(v)) \in \operatorname{epi} F$, it follows
that
$$(tu + (1-t)v,\ tF(u) + (1-t)F(v)) \in \operatorname{epi} F \quad \text{for all } t \in\, ]0,1[.$$
This yields the convexity of $F$.
(2) First, let $X$ be a B-space.
(III) ⇒: Let $F$ be lower semicontinuous. From $(u_n, a_n) \in \operatorname{epi} F$ for all $n$
and $(u_n, a_n) \to (u, a)$ as $n \to \infty$, it follows that $F(u_n) \le a_n$ and $a_n \le a + \varepsilon$ for
$n \ge n_0(\varepsilon)$.
The lower semicontinuity of $F$ assures that $F(u) \le a + \varepsilon$. This holds for all
$\varepsilon > 0$; therefore, $F(u) \le a$, i.e., $(u, a) \in \operatorname{epi} F$.
(IV) ⇐: Let $\operatorname{epi} F$ be closed. From $F(u_n) \le r$ for all $n$ and $u_n \to u$ as
$n \to \infty$, it follows that $(u_n, r) \in \operatorname{epi} F$ and thus $(u, r) \in \operatorname{epi} F$, i.e., $F(u) \le r$.
If $X$ is a locally convex space, then one uses M-S sequences instead of
sequences.
(3) There exist neighborhoods $U(u) \subseteq X$, $V(0) \subseteq \mathbb{R}$ such that $F(v) < F(u)
+ 1 + \varepsilon$ for all $(v, \varepsilon) \in U(u) \times V(0)$; therefore, $(u, F(u) + 1) \in \operatorname{int}\operatorname{epi} F$.
(4), (5) Compare Definition 47.1. □
We treat calculation rules for lower semicontinuous and convex functionals
in Problems 38.2 and 47.1.
As a typical application of the separation theorems, we obtain thi.1
following lemma which we shall use frequently.
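Lemma 47.4. Let $F\colon X \to [-\infty, \infty]$ be convex and lower semicontinuous on
the real locally convex space $X$ and suppose there exists a $u$ such that
$$-\infty < a < F(u), \quad u \in \overline{\operatorname{dom} F}.$$
Then there exists $(u^*, \alpha) \in X^* \times \mathbb{R}$ such that
$$\langle u^*, u\rangle - a > \alpha > \langle u^*, v\rangle - F(v) \quad \text{for all } v \in X.$$
In particular, if $F(u) \ne \pm\infty$, then we obtain
$$F(v) > a + \langle u^*, v - u\rangle$$
for all $v \in X$ such that $F(v) > -\infty$, i.e., $F$ can be estimated from below
relative to an affine function. One also says that $F$ is supported by an affine
function at $u$.
Proof. Every continuous linear functional $z^* \in (X \times \mathbb{R})^*$ has the form
$$\langle z^*, (v, b)\rangle = \langle w^*, v\rangle + a^* b \quad \text{for all } (v, b) \in X \times \mathbb{R}$$
for fixed $(w^*, a^*) \in X^* \times \mathbb{R}$.
For $F \equiv +\infty$, the assertion is trivial because we can choose $u^* = 0$. Let
$F \not\equiv +\infty$. According to Proposition 47.3, $\operatorname{epi} F$ is convex, closed, and
nonempty. Moreover, $(u, a) \notin \operatorname{epi} F$. According to the separation theorem
(Proposition 39.4, (2ii)), we can strongly separate $(u, a)$ and $\operatorname{epi} F$ in $X \times \mathbb{R}$,
i.e., there exist $z^* \in (X \times \mathbb{R})^*$ and $\beta \in \mathbb{R}$ such that $z^* \ne 0$ as well as
$$\langle w^*, u\rangle + a^* a > \beta > \langle w^*, v\rangle + a^* b \quad \text{for all } (v, b) \in \operatorname{epi} F.$$
For $v \in \operatorname{dom} F$, we have $(v, F(v)) \in \operatorname{epi} F$; therefore,
$$\langle w^*, u\rangle + a^* a > \beta > \langle w^*, v\rangle + a^* F(v) \tag{4}$$
for all $v \in \operatorname{dom} F$.
We shall show that $a^* < 0$. Then we obtain the assertion with $u^* =
(-a^*)^{-1} w^*$.
Assume, on the contrary, that $a^* \ge 0$. Since $u \in \overline{\operatorname{dom} F}$, there exists an
M-S sequence $(v_\alpha)$ from $\operatorname{dom} F$ such that $v_\alpha \to u$; therefore,
$$\langle w^*, u\rangle + a^* a > \beta \ge \langle w^*, u\rangle + a^* F(u),$$
by (4) with $v = v_\alpha$ and the lower semicontinuity of $F$. But this contradicts $a < F(u)$. □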
47.2. Continuity of Convex Functionals
We shall show that convex functionals are already continuous under very
weak assumptions.
Proposition 47.5. If $F\colon X \to [-\infty, \infty]$ is convex on the real locally convex
space $X$, then:
(1) The following two assertions are equivalent:
(i) $F$ is continuous at $u$ and finite,
(ii) $F$ is bounded above on a neighborhood of $u$.
(2) $F$ is continuous on the open set $M$ when $F$ is finite on $M$ and continuous
at some point of $M$.
Corollary 47.6. Every convex function $F\colon M \subseteq \mathbb{R}^N \to \mathbb{R}$ on an open convex set
$M$ is continuous. Here $N \ge 1$.
Corollary 47.7. Every convex lower semicontinuous functional $F\colon M \subseteq X \to \mathbb{R}$
on a closed convex set $M$ of the real B-space $X$ is continuous on $\operatorname{int} M$.
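Proof. (Ad 1) (i) ⇒ (ii) This is a direct consequence of the definition of
continuity.
(ii) ⇒ (i) Without loss of generality, let $u = 0$, $F(0) = 0$, and let $U$ be a
neighborhood of zero such that $a \overset{\mathrm{def}}{=} \sup_{v \in U} F(v) < \infty$. For all $\varepsilon \in\, ]0,1[$,
$$v = (1 - \varepsilon)\cdot 0 + \varepsilon\Bigl(\frac{v}{\varepsilon}\Bigr), \qquad 0 = (1 + \varepsilon)^{-1} v + \varepsilon(1 + \varepsilon)^{-1}\Bigl(-\frac{v}{\varepsilon}\Bigr).$$
The convexity of $F$ yields
$$v \in \varepsilon U \ \Rightarrow\ F(v) \le (1 - \varepsilon)F(0) + \varepsilon F\Bigl(\frac{v}{\varepsilon}\Bigr) \le \varepsilon a,$$
$$v \in (-\varepsilon U) \ \Rightarrow\ F(v) \ge (1 + \varepsilon)F(0) - \varepsilon F\Bigl(-\frac{v}{\varepsilon}\Bigr) \ge -\varepsilon a;$$
therefore, $|F(v)| \le \varepsilon a$ for all $v \in \varepsilon U \cap (-\varepsilon U)$, i.e., $F$ is continuous at $u = 0$.
(Ad 2) Without loss of generality, let $F$ be continuous at $u = 0$ and let
$0 \in M$; therefore, $F(v) \le a$ for all $v$ in a neighborhood of zero, $U$, according
to (1). Let $u \in M$. We choose $\rho > 1$ so that $\rho u \in M$.
The mapping $h$ defined by
$$h(v) = (1 - \rho^{-1})v + \rho^{-1}(\rho u)$$
is a homeomorphism. Since $h(0) = u$, $h$ maps a neighborhood of zero, $V$,
with $V \subseteq U$ onto a $u$-neighborhood $h(V)$. The functional $F$ is bounded above on
$h(V)$ because
$$F(h(v)) \le (1 - \rho^{-1})F(v) + \rho^{-1}F(\rho u) \le (1 - \rho^{-1})a + \rho^{-1}F(\rho u).$$
Thus, by (1), $F$ is continuous at $u$. □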
Proof of Corollary 47.6. We set $F(v) = +\infty$ for $v \notin M$. Then $F\colon
\mathbb{R}^N \to\, ]-\infty, \infty]$ is convex. Let $u \in M$ and let, say, $N = 2$. We choose a
triangle $D$ that is spanned by $\{a, b, c\}$ with $u \in \operatorname{int} D \subseteq M$. Each $v \in D$ has
a representation of the form
$$v = \alpha a + \beta b + \gamma c, \quad 0 \le \alpha, \beta, \gamma \le 1, \quad \alpha + \beta + \gamma = 1.$$
Due to the convexity of $F$,
$$F(v) \le \alpha F(a) + \beta F(b) + \gamma F(c) \quad \text{for all } v \in D,$$
i.e., $F$ is bounded from above on $D$ and thus is continuous at $u$. □
Proof of Corollary 47.7. As above we extend $F$ in a convex way on $X$ by
defining $F(v) = +\infty$ for $v \notin M$. Without loss of generality, let $0 \in \operatorname{int} M$.
We choose a number $a$ such that $a > F(0)$. Let
$$T \overset{\mathrm{def}}{=} \{u \in M\colon F(u) \le a\}.$$
We shall show that $T \cap (-T)$ is a neighborhood of zero. Then, by
Proposition 47.5, it follows that $F$ is continuous on $\operatorname{int} M$.
$T$ is convex and closed since $F$ is convex and lower semicontinuous on $M$.
According to Corollary 47.6, $F$ is continuous on a neighborhood of zero of
the straight line $t \mapsto tv$. Since $F(0) < a$, for each $v \in X$ there thus exists a
$t > 0$ such that $F(\pm tv) < a$; hence, $\pm tv \in T$. Therefore, the set $T \cap (-T)$ is
a barrel; thus it is a neighborhood of zero since every B-space is barrelled
(cf. Yosida (1965, M), Appendix V.2). □
47.3. Subgradient and Subdifferential
Subgradients generalize the classical concept of a derivative. In this
connection,
$$F(v) \ge F(u) + \langle u^*, v - u\rangle \quad \text{for all } v \in X \tag{5}$$
is crucial.
Definition 47.8. Let $F\colon X \to [-\infty, \infty]$ be a functional on the real locally
convex space $X$.
$u^*$ in $X^*$ is called a subgradient of $F$ at $u$ if and only if $F(u) \ne \pm\infty$ and
(5) holds.
The set of all subgradients of $F$ at $u$ is called the subdifferential $\partial F(u)$. If
no subgradient exists at $u$, then we set $\partial F(u) = \emptyset$. This is the case for
$F(u) = \pm\infty$.
If $\partial F(u) \ne \emptyset$, then, by (5), $F > -\infty$.
Example 47.9. For $F\colon \mathbb{R} \to \mathbb{R}$, the subdifferential $\partial F(u)$ equals the set of all
slopes $u^* \in \mathbb{R}$ of straight lines through $(u, F(u))$ which lie below the curve
belonging to $F$ (generalized tangents in Fig. 47.2).
If $F'(u)$ exists, then $\partial F(u) = \{F'(u)\}$. We generalize this in Proposition
47.13.
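Example 47.9 can be explored numerically for the standard nondifferentiable example $F(u) = |u|$ on $\mathbb{R}$, which is our choice of illustration and not one made in the text. The sketch tests the subgradient inequality (5) on a finite set of points, so it can only refute a candidate slope, not prove that it is a subgradient; the helper name `is_subgradient` and the small tolerance are assumptions of the sketch.

```python
def is_subgradient(F, u, slope, test_points, tol=1e-12):
    # Checks inequality (5), F(v) >= F(u) + slope*(v - u), on sample points only.
    return all(F(v) >= F(u) + slope * (v - u) - tol for v in test_points)


points = [i / 10.0 for i in range(-50, 51)]      # sample of the real line
slopes = [i / 4.0 for i in range(-8, 9)]         # candidates -2.0, -1.75, ..., 2.0

# Candidate slopes surviving the test at the kink u = 0 and at the smooth point u = 1.
sub_at_0 = [s for s in slopes if is_subgradient(abs, 0.0, s, points)]
sub_at_1 = [s for s in slopes if is_subgradient(abs, 1.0, s, points)]
```

At the kink the surviving slopes fill the interval $[-1, 1]$, while at $u = 1$ only the classical derivative $1$ remains, in agreement with $\partial F(u) = \{F'(u)\}$ at points of differentiability.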
Example 47.10 (Support Functional). Let $M$ be a convex set in the real
locally convex space $X$ with the indicator function $\chi_M$. By a support functional
to $M$ at the point $u$, we understand a functional $u^*$ in $X^*$ such that
$$\langle u^*, u\rangle \ge \langle u^*, v\rangle \quad \text{for all } v \in M \tag{6}$$
(see Fig. 47.3). According to (5), taking $F = \chi_M$ and $\chi_M(u) = 0$ for $u \in M$
and $\chi_M(u) = +\infty$ for $u \notin M$, we have:
$$\partial\chi_M(u) = \begin{cases} \text{set of all support functionals to } M \text{ at the point } u & \text{where } u \in M,\\ \emptyset & \text{where } u \notin M. \end{cases}$$
[Figure 47.2: generalized tangents to the graph of $F$]
[Figure 47.3: support functional to $M$ at the point $u$]
We already considered the support functional mapping $u \mapsto \partial\chi_M(u)$ in
Section 32.2 and used it in Section 32.6 to handle variational inequalities.
Let us discuss $\partial\chi_M$. By (6), we always have $0 \in \partial\chi_M(u)$ for $u \in M$.
If $u \in \operatorname{int} M$, then $\partial\chi_M(u) = \{0\}$. For $u$ on the boundary of $M$ and $\operatorname{int} M \ne \emptyset$, by
separating $u$ and $\operatorname{int} M$, according to Proposition 39.4, (1), one obtains a functional
$u^* \ne 0$ for which (6) holds, i.e., $u^* \in \partial\chi_M(u)$.
If $M$ is a linear subspace, then from (6) it immediately follows that
$$\partial\chi_M(u) = M^{\perp} \quad \text{for } u \in M,$$
i.e., $\partial\chi_M(u) = \{u^* \in X^*\colon \langle u^*, w\rangle = 0 \text{ for all } w \in M\}$ when $u \in M$.
Example 47.11 (Subdifferential of the Square of the Norm). Let $X$ be a real
normed space. For
$$F(u) \overset{\mathrm{def}}{=} 2^{-1}\|u - b\|^2$$
with fixed $b \in X$ we have:
$$u^* \in \partial F(u) \ \Leftrightarrow\ \langle u^*, u - b\rangle = \|u^*\|\,\|u - b\| \ \text{ and } \ \|u^*\| = \|u - b\|.$$
Therefore, in a real H-space $X$, $\partial F(u) = \{u - b\}$ when we identify $X$ with $X^*$.
We treat the proof in Problem 47.3. Let us discuss Example 47.11.
According to Theorem 47.A in Section 47.6 that follows below, we have
$\partial F(u) \ne \emptyset$. One can also easily verify this with the aid of the Hahn-Banach
theorem. We shall use this example in a crucial way in Section 47.8 (main
theorem of convex optimization) and in Section 47.12 (duality mapping).
One denotes $\partial F(u)$ above by $J(u - b)$.
47.4. Subgradient and the Extremal Principle
We consider the minimum problem
$$\inf_{u \in X} F(u) = \alpha \tag{7}$$
and the corresponding Euler equation
$$0 \in \partial F(u). \tag{8}$$
Proposition 47.12. If $F\colon X \to\, ]-\infty, \infty]$ is a functional on the real locally
convex space $X$ with $F \not\equiv +\infty$, then $u$ is a solution of (7) if and only if (8)
holds.
From this result, which holds by (5) in a trivial way, we shall easily obtain
nontrivial propositions for convex optimization problems in Sections
47.8-47.10 with the aid of the sum rule.
47.5. Subgradient and the G-Derivative
We now justify the relation
$$\partial F(u) = \{F'(u)\}. \tag{9}$$
Proposition 47.13. If $F\colon X \to [-\infty, \infty]$ is a convex functional on the real
locally convex space $X$ which is finite at the point $u$, then:
(i) If $F'(u)$ exists as a G-derivative, then (9) holds.
(ii) If $F$ is continuous at $u$ and $\partial F(u)$ consists of exactly one element, then
$F'(u)$ exists as a G-derivative and (9) holds.
Proof. (i) Let $\varphi(t) \overset{\mathrm{def}}{=} F(u + t(v - u))$. According to (42.4a), for $\varphi$ we have
$\varphi(1) - \varphi(0) \ge \varphi'(0)$, i.e., $F(v) - F(u) \ge \langle F'(u), v - u\rangle$ for all $v \in X$;
therefore, $F'(u) \in \partial F(u)$.
From $u^* \in \partial F(u)$ it follows that $F(v) - F(u) \ge \langle u^*, v - u\rangle$ for all $v \in X$.
Thus, for $v = u + th$ and $t \to +0$ we have $\langle F'(u) - u^*, h\rangle \ge 0$ for all $h \in X$,
i.e., $F'(u) = u^*$.
(ii) Compare Problem 47.4. □
47.6. Existence Theorem for Subgradients
The continuity of a real function $F\colon \mathbb{R} \to \mathbb{R}$ at $u$ does not imply its
differentiability at $u$. For convex functionals, however, such an assertion
holds for the subgradient (cf. Fig. 47.2 in Section 47.3).
Theorem 47.A. If $F\colon X \to [-\infty, \infty]$ is a convex functional on the real locally
convex space $X$, then:
(1) $\partial F(u)$ is convex and weak* closed.
(2) If $F$ is finite and continuous at $u$, then $\partial F(u)$ is nonempty and weak*
compact.
In (1), $\partial F(u) = \emptyset$ is possible. The existence assertion $\partial F(u) \ne \emptyset$ in (2)
follows from a separation theorem. From Proposition 47.5 it follows that
under the assumptions in (2), the subgradient $\partial F(v)$ exists for all $v$ in
$\operatorname{int}(\operatorname{dom} F)$.
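Proof. (1) The convexity of $\partial F(u)$ follows easily from (5). To show the
weak* closedness, we choose an M-S sequence $(u^*_\alpha)$ in $\partial F(u)$ such that
$u^*_\alpha \to u^*$ in the weak* topology on $X^*$ (cf. A^l)).
Passage to the limit in
$$F(v) \ge F(u) + \langle u^*_\alpha, v - u\rangle \quad \text{for all } v \in X$$
yields (5) and thus $u^* \in \partial F(u)$.
(2) (I) We show that $\partial F(u) \ne \emptyset$. According to Proposition 47.3, $\operatorname{epi} F$ is
convex and $\operatorname{int}\operatorname{epi} F \ne \emptyset$. Moreover, $(u, F(u)) \notin \operatorname{int}\operatorname{epi} F$. Thus the point
$(u, F(u))$ and the set $\operatorname{epi} F$ can be separated in $X \times \mathbb{R}$ (Proposition 39.4, (1)),
i.e., there exist $(w^*, a^*) \ne 0$ from $X^* \times \mathbb{R}$ and $\beta \in \mathbb{R}$ such that
$$\langle w^*, u\rangle + a^* F(u) \ge \beta \ge \langle w^*, v\rangle + a^* a \quad \text{for all } (v, a) \in \operatorname{epi} F \tag{10}$$
(cf. the proof of Lemma 47.4). We shall show $a^* < 0$ below. Let
$u^* = (-a^*)^{-1} w^*$. Since $(v, F(v)) \in \operatorname{epi} F$ for $F(v) \ne \pm\infty$ and $(v, a) \in \operatorname{epi} F$
for all $a \in \mathbb{R}$ when $F(v) = -\infty$, from (10) we then obtain
$$\langle u^*, u\rangle - F(u) \ge \langle u^*, v\rangle - F(v) \quad \text{for all } v \in \operatorname{dom} F,$$
i.e., $u^* \in \partial F(u)$; therefore, $\partial F(u) \ne \emptyset$.
We still must prove that $a^* < 0$. To this end, we use (10). First, since
$(u, F(u) + 1) \in \operatorname{epi} F$, $a^* \le 0$ holds. $a^* = 0$ yields $\langle w^*, u - v\rangle \ge 0$ for all
$v \in \operatorname{dom} F$. Due to the continuity of $F$ at $u$, $\operatorname{dom} F$ contains a neighborhood
of $u$. Therefore, $w^* = 0$ in contradiction to $(w^*, a^*) \ne 0$.
(II) We shall show that $\partial F(u)$ is weak* compact. Let $U \overset{\mathrm{def}}{=} \{h \in X\colon
F(u + h) - F(u) \le 1\}$. Due to the continuity of $F$ at $u$, $U$ is a neighborhood
of zero. For all $h \in U$, $u^* \in \partial F(u)$, we have
$$\langle u^*, h\rangle \le F(u + h) - F(u) \le 1,$$
i.e., $\partial F(u) \subseteq U^{\circ}$. According to $A_3$(18) and $A_3$(19), the polar $U^{\circ}$ is weak*
compact. Due to (1), $\partial F(u)$ is also weak* compact. □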
47.7. The Sum Rule
For the functionals $F, F_1, F_2\colon X \to [-\infty, \infty]$ and $\lambda > 0$, from the definition
of the subgradient it immediately follows that for all $u \in X$:
$$\partial(\lambda F)(u) = \lambda\,\partial F(u),$$
$$\partial(F_1 + F_2)(u) \supseteq \partial F_1(u) + \partial F_2(u).$$
Our goal is the stronger assertion:
$$\partial(F_1 + \dots + F_n)(u) = \partial F_1(u) + \dots + \partial F_n(u). \tag{11}$$
In this connection, we use, as usual, $A + B = \{a + b\colon a \in A,\ b \in B\}$ and
$A + \emptyset = \emptyset$. Equation (11) generalizes the sum rule of the classical
differential calculus. We treat important applications in the following three sections.
Theorem 47.B (Moreau and Rockafellar). (11) holds for all $u \in X$ when the
following assumptions are fulfilled:
(i) $F_1, \dots, F_n\colon X \to\, ]-\infty, \infty]$ are convex functionals on the real locally convex
space $X$, $n \ge 2$.
(ii) There is a $u_0$ in $X$ such that all $F_i$ are finite at $u_0$ and all $F_i$, with the
exception of $F_n$, are continuous at $u_0$.
The proof is based on a separation theorem, as is the case for all
important propositions in convex analysis.
Proof. Let $n = 2$. The assertion follows by induction for $n > 2$.
With "$\supseteq$" in place of "$=$", (11) is obtained directly by adding the
defining relations (5) for $u_j^* \in \partial F_j(u)$. We shall prove "$\subseteq$".
Let $u^* \in \partial(F_1 + F_2)(u)$; therefore, $F_1(u), F_2(u) < \infty$ and
  $F_2(u) - F_2(v) \le F_1(v) - F_1(u) - \langle u^*, v - u\rangle =: G(v)$ for all $v \in X$.  (12)
We construct subsets $A$, $B$ of $X \times \mathbb{R}$ by
  $A := \{(v, a) \in X \times \mathbb{R} : G(v) \le a\} = \operatorname{epi} G$,
  $B := \{(w, b) \in X \times \mathbb{R} : b \le F_2(u) - F_2(w)\}$.
According to Proposition 47.3, $A$ is convex and $\operatorname{int} A \ne \emptyset$. $B$ is also convex.
Furthermore, $B \cap \operatorname{int} A = \emptyset$. For it follows from $(v, a) \in B \cap \operatorname{int} A$ that
  $G(v) \le a \le F_2(u) - F_2(v)$,
i.e., $a = G(v)$, according to (12). Moreover, $(v, a) \in \operatorname{int} A$ means that $(v, a - \varepsilon) \in A$
for all small $\varepsilon > 0$, i.e., $G(v) \le a - \varepsilon$, in contradiction to $G(v) = a$.
Consequently, we can separate the sets $A$ and $B$ in $X \times \mathbb{R}$ (Proposition
39.4, (1)). Therefore, analogous to the proof of Lemma 47.4, there exists a
$(w^*, a^*) \ne 0$ in $X^* \times \mathbb{R}$ and an $\alpha \in \mathbb{R}$ such that
  $\langle w^*, v\rangle + a^*a \le \alpha \le \langle w^*, w\rangle + a^*b$  (13)
for all $(v, a) \in A$, $(w, b) \in B$. Furthermore, Proposition 39.4, (1) yields
"$< \alpha \le$" for all $(v, a) \in \operatorname{int} A$.
Below we shall show that $a^* < 0$. Thus, by a change of $w^*, \alpha$, we can
assume, say, $a^* = -1$. Since $(u, 0) \in A \cap B$, it follows that $\alpha = \langle w^*, u\rangle$ by
(13). With $a^* = -1$, from (13) and for appropriate choices of $a$ and $b$, we
obtain
  $\langle w^*, v\rangle - G(v) \le \langle w^*, u\rangle \le \langle w^*, w\rangle - (F_2(u) - F_2(w))$
for all $v \in \operatorname{dom} G$, $w \in \operatorname{dom} F_2$. Taking (12) into account, this yields:
  $-w^* \in \partial F_2(u)$, $u^* + w^* \in \partial F_1(u)$.
Consequently,
  $u^* = (u^* + w^*) + (-w^*) \in \partial F_1(u) + \partial F_2(u)$.
This is the desired assertion,
  $\partial(F_1 + F_2)(u) \subseteq \partial F_1(u) + \partial F_2(u)$.
We still must prove that $a^* < 0$. By assumption, $G$ is continuous at $u_0$;
therefore, $(u_0, G(u_0) + 1) \in \operatorname{int} A$. By (13), from $(u_0, F_2(u) - F_2(u_0)) \in B$,
we obtain
  $\langle w^*, u_0\rangle + a^*(G(u_0) + 1) < \alpha \le \langle w^*, u_0\rangle + a^*(F_2(u) - F_2(u_0))$.
Now (12) yields $a^* < 0$. □
47.8. The Main Theorem of Convex Optimization
We consider the minimum problem
  $\inf_{u \in M} F(u) = \alpha$.  (14)
Parallel to this, we write the solvability condition
  $0 \in \partial F(u) + \partial\chi_M(u)$  (15)
and the variational inequality
  $\delta_+F(u; v - u) \ge 0$ for all $v \in M$.  (16)
Under the assumptions of the following theorem, the existence of the
left-hand side in (16) is always assured. One can think of (15) as a
generalized Lagrange multiplier rule, because, for the trivial side condition
$M = X$, and because of $\chi_X \equiv 0$, $\partial\chi_X(u) = \{0\}$, (15) passes into the Euler
equation $0 \in \partial F(u)$. We calculated $\partial\chi_M(u)$ in Example 47.10. Therefore,
with the assumptions of the following theorem, (15) is equivalent to the
following condition:
There exists a $u^*$ in $X^*$ such that  (17)
  $F(v) \ge F(u) + \langle u^*, v - u\rangle$ and
  $\langle u^*, v - u\rangle \ge 0$ for all $v \in M$.
Theorem 47.C. Suppose that the following two conditions hold:
(i) $F: M \subseteq X \to \mathbb{R}$ is convex; $F$ is extended to $X$ by setting $F(v) = +\infty$ for
$v \notin M$.
(ii) $M$ is a convex nonempty subset of the real locally convex space $X$.
Then:
(1) Characterization of the solution. Each of the three conditions (15), (16),
and (17) is necessary and sufficient for $u \in M$ to be a solution of the minimum
problem (14).
(2) Structure of the solution set $\mathcal{S}$ of (14).
(i) $\mathcal{S}$ is convex.
(ii) $\mathcal{S}$ is closed when $F$ is lower semicontinuous and $M$ is closed.
(iii) Every local minimum of $F$ on $M$ is also a global minimum of $F$ on $M$.
(3) Uniqueness. (14) has at most one solution when $F$ is strictly convex
on $M$.
Proof. (Ad 1), (17) If $u$ is a solution of (14), then (17) holds with $u^* = 0$.
Conversely, from (17) it follows that $u$ is a solution of (14).
(Ad 1), (16) Let $\varphi(t) := F(u + t(v - u))$ for all $t \in [0, 1]$. Then $\varphi$ is convex
if $u, v \in M$. According to (42.4a), $\varphi'_+(0) = \delta_+F(u; v - u) \ge -\infty$ exists and
$\varphi(1) - \varphi(0) \ge \varphi'_+(0)$; therefore,
  $F(v) - F(u) \ge \delta_+F(u; v - u)$ for all $u, v \in M$.  (18)
If $u$ is a solution of (14), then $\varphi(t) \ge \varphi(0)$ for $t \in [0, 1]$ with fixed $v \in M$;
thus, $\varphi'_+(0) \ge 0$. This is (16).
Conversely, if (16) holds, then from (18) it follows that $u$ is a solution of
(14).
(Ad 2), (3) We have already proved these assertions. □
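The characterization (16) can be tried out on a one-dimensional toy problem. The following sketch is an illustration under assumptions not taken from the text: $X = \mathbb{R}$, $M = [0, 1]$ sampled on a grid, $F(x) = (x - 2)^2$, and $\delta_+F$ approximated by a one-sided difference quotient. All function names are hypothetical.

```python
def directional_derivative(F, u, d, t=1e-7):
    """One-sided difference quotient approximating delta_+ F(u; d)."""
    return (F(u + t * d) - F(u)) / t


def satisfies_variational_inequality(F, u, candidates, tol=1e-6):
    """Check delta_+ F(u; v - u) >= 0 for every sampled v in M."""
    return all(directional_derivative(F, u, v - u) >= -tol for v in candidates)


def F(x):
    return (x - 2.0) ** 2


# Sample points of the convex set M = [0, 1].
M = [i / 100.0 for i in range(101)]

minimizer = min(M, key=F)                              # brute-force minimizer on the grid
print(minimizer)                                       # 1.0
print(satisfies_variational_inequality(F, 1.0, M))     # True
print(satisfies_variational_inequality(F, 0.5, M))     # False
```

The boundary point $u = 1$ satisfies (16) although $F'(1) \ne 0$, whereas the interior point $u = 0.5$ violates it in the direction $v - u$ with $v = 1$.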
We shall show that under stronger assumptions, condition (15) also
follows from the sum rule. Indeed, (14) is equivalent to
  $\inf_{u \in X} F(u) + \chi_M(u) = \alpha$.  (14a)
According to Proposition 47.12, $u$ is a solution of (14a) if and only if
$0 \in \partial(F + \chi_M)(u)$. If $\operatorname{int} M \ne \emptyset$ or $F$ is continuous at a point $u_0$ of $M$, then
from the sum rule it follows that $\partial(F + \chi_M)(u) = \partial F(u) + \partial\chi_M(u)$ for all
$u \in X$. This yields (15).
47.9. The Main Theorem of Convex Approximation
Theory
We shall now generalize the results of Section 39.2 to convex approximation
problems of the form
  $\inf_{u \in M} \|u - b\| = \alpha$  (19)
and note the following solvability condition:
There exists a $u^*$ in $X^*$ such that  (20)
  $\langle u^*, u - b\rangle = \|u - b\|$, $\|u^*\| = 1$,
  $\langle u^*, v - u\rangle \ge 0$ for all $v \in M$.
Theorem 47.D. The following three assertions hold for a convex nonempty set
$M$ in the real normed space $X$ for fixed $b \in X$, $b \notin M$:
(1) Characterization of the solutions. (20) is a necessary and sufficient
condition for $u$ in $M$ to be a solution of (19).
(2) Existence. (19) has a solution when $M$ is closed ($M = \overline{M}$) and $X$ is reflexive.
(3) Uniqueness. (19) has at most one solution when $X$ is strictly convex.
Proof. (1) The minimum problem (19) is equivalent to
  $\inf_{u \in M} F(u) = \beta$,  (19a)
where $F(u) := 2^{-1}\|u - b\|^2$. According to Theorem 47.C in Section 47.8, for
$u \in M$:
  $u$ is a solution of (19a)
  $\Leftrightarrow 0 \in \partial F(u) + \partial\chi_M(u)$
  $\Leftrightarrow$ there exists a $u^* \in \partial F(u)$ such that $-u^* \in \partial\chi_M(u)$.
Now the assertion follows immediately from Examples 47.11 and 47.10.
Note that because $b \notin M$, $\|u - b\| \ne 0$ always holds for a solution $u \in M$.
(2), (3) Compare Proposition 38.15 and Theorem 39.B in Section 39.2. □
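Condition (20) is easy to inspect in the Euclidean plane. The sketch below is an illustration only and rests on assumptions that are not in the text: $X = \mathbb{R}^2$ with the Euclidean norm (so that functionals are identified with vectors), $M$ the closed unit disc sampled on a polar grid, and $b = (3, 4)$. The candidate $u = b/\|b\|$ and the functional $u^* = (u - b)/\|u - b\|$ are hypothetical choices that are then checked against (20).

```python
import math


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def norm(a):
    return math.hypot(a[0], a[1])


b = (3.0, 4.0)                               # point outside M
u = (b[0] / norm(b), b[1] / norm(b))         # candidate best approximation in M
r = (u[0] - b[0], u[1] - b[1])               # u - b
u_star = (r[0] / norm(r), r[1] / norm(r))    # candidate functional for (20)

# Sample points v of the closed unit disc M (polar grid).
samples = [(rho / 10.0 * math.cos(2 * math.pi * k / 60),
            rho / 10.0 * math.sin(2 * math.pi * k / 60))
           for rho in range(11) for k in range(60)]

cond_a = abs(dot(u_star, r) - norm(r)) < 1e-12        # <u*, u - b> = ||u - b||
cond_b = abs(norm(u_star) - 1.0) < 1e-12              # ||u*|| = 1
cond_c = all(dot(u_star, (v[0] - u[0], v[1] - u[1])) >= -1e-12
             for v in samples)                        # <u*, v - u> >= 0 on M
print(cond_a, cond_b, cond_c)  # True True True
```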
47.10. Generalized Kuhn-Tucker Theory
We study the minimum problem
  $\inf_u F(u) = \alpha$,  (21)
  $F_i(u) \le 0$, $i = 1, \ldots, n$, $u \in A$,
with the side conditions in the form of inequalities and $u \in A$. We have
already explained the basic ideas in Section 37.11. Our goal is to establish a
Lagrange multiplier rule for (21) and at the same time to explain the
connection between various formulations of the Kuhn-Tucker theory. To
this end, for $\lambda = (\lambda_1, \ldots, \lambda_n) \in \mathbb{R}^n$ and $\lambda_0 \in \mathbb{R}$, we construct the Lagrange
function
  $L(u, \lambda) := \lambda_0F(u) + \lambda_1F_1(u) + \cdots + \lambda_nF_n(u)$,
where we forego giving the dependence on $\lambda_0$ explicitly, because $\lambda_0 = 1$
holds in the nondegenerate case. We now formulate a number of
propositions and give their range of validity in Theorem 47.E below. At the
pinnacle of the entire theory stands the saddle point assertion (A2) from
which (A3)-(A5) are also obtained in a simple way following the usual
pattern.
(A1) $u$ is a solution of the original problem (21).
(A2) $L$, with $\lambda_0 = 1$, has a saddle point $(u, \lambda)$ with respect
to $A \times \mathbb{R}^n_+$, i.e., $(u, \lambda) \in A \times \mathbb{R}^n_+$ and
  $L(u, \mu) \le L(u, \lambda) \le L(v, \lambda)$
for all $(v, \mu) \in A \times \mathbb{R}^n_+$.
(A3) $u$ is a solution of
  $\inf_{u \in A} L(u, \lambda) = \alpha_1$  (22)
for a fixed $(\lambda, \lambda_0)$ with $\lambda_0 = 1$, where, in addition,
  $\lambda_i \ge 0$, $\lambda_iF_i(u) = 0$,  (23)
  $F_i(u) \le 0$, $i = 1, \ldots, n$, $u \in A$.
In contrast to (21), (22) contains no inequalities as side conditions.
Instead of this, the Lagrange multiplier $\lambda$ appears which thus, roughly
speaking, eliminates the inequalities. In the nondegenerate case $\lambda_0 = 1$, we
have $\alpha = \alpha_1$.
(22) and (23) are obtained in a simple way from $L(u, \lambda) \le L(v, \lambda)$ and
$L(u, \mu) \le L(u, \lambda)$, respectively, in (A2). Note that from $\mu_ia \le \lambda_ia$ for all
$\mu_i \ge 0$ and fixed $\lambda_i \ge 0$, $a \in \mathbb{R}$, it follows that $a \le 0$ and $\lambda_ia = 0$ always hold.
(A2) is the expression of a duality principle:
$u$ is obtained as a solution of a minimum problem and one obtains the
Lagrange multiplier $\lambda$ as a solution of a maximum problem. We explain the
connection with the general duality theory in Section 50.2. Condition (23)
for $\lambda_i$ simply means that $\lambda_i \ge 0$ and $\lambda_i = 0$ for $F_i(u) < 0$. Here one says that
$\lambda_i$ is not active when $F_i(u) < 0$.
(A4) $u$ satisfies
  $0 \in \lambda_0\partial F(u) + \lambda_1\partial F_1(u) + \cdots + \lambda_n\partial F_n(u) + \partial\chi_A(u)$  (24)
for a fixed $(\lambda, \lambda_0)$, with $\lambda_0 = 1$, where, in addition, (23) holds.
(A5) $u$ satisfies the variational inequality
  $\langle \lambda_0F'(u) + \lambda_1F_1'(u) + \cdots + \lambda_nF_n'(u), v - u\rangle \ge 0$ for all $v \in A$  (25)
for a fixed $(\lambda, \lambda_0)$, with $\lambda_0 = 1$, where, in addition, (23) holds.
In the special case $A = X$, (25) is equivalent to
  $\lambda_0F'(u) + \lambda_1F_1'(u) + \cdots + \lambda_nF_n'(u) = 0$,  (25a)
i.e., $L_u(u, \lambda) = 0$. Here it is again especially clear that one is dealing with a
Lagrange multiplier rule.
Furthermore, the so-called Slater condition plays an important role:
  There exists an element $u_0$ in $A$
  such that $F_i(u_0) < 0$ for all $i$.  (SC)
This condition guarantees the nondegenerate case $\lambda_0 = 1$.
Theorem 47.E. Suppose that the following two conditions are satisfied:
(i) $F, F_1, \ldots, F_n: X \to \mathbb{R}$ are convex on the real locally convex space $X$.
(ii) $A$ is a convex subset of $X$ and (SC) is fulfilled.
Then:
(1) Lagrange function. We have (A1) $\Leftrightarrow$ (A2) $\Leftrightarrow$ (A3). Moreover, the
extreme values in (21) and (22) coincide.
(2) Subgradients. If $F, F_1, \ldots, F_n$ are continuous at a fixed point in $A$, then
(A1) $\Leftrightarrow$ (A4).
(3) Variational inequality. If the G-derivatives $F', F_1', \ldots, F_n'$ exist on $A$,
then (A1) $\Leftrightarrow$ (A5).
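The saddle point assertion (A2) and the complementarity condition (23) can be inspected on a small example. The sketch below is an illustration under assumptions that are not part of the text: $X = A = \mathbb{R}$, $F(x) = x^2$, a single side condition $F_1(x) = 1 - x \le 0$, the candidate solution $u = 1$ with multiplier $\lambda = 2$, and finite grids standing in for $A$ and $\mathbb{R}_+$. The Slater point $u_0 = 2$ is likewise a hypothetical choice.

```python
def F(x):
    return x * x


def F1(x):
    return 1.0 - x


def lagrangian(x, lam):
    """L(x, lam) = F(x) + lam * F1(x), i.e. lambda_0 = 1."""
    return F(x) + lam * F1(x)


u, lam = 1.0, 2.0                             # candidate saddle point
xs = [i / 50.0 - 3.0 for i in range(301)]     # grid sample of A = R on [-3, 3]
mus = [j / 10.0 for j in range(101)]          # grid sample of mu >= 0 on [0, 10]

slater = F1(2.0) < 0                          # (SC) with u_0 = 2
left = all(lagrangian(u, mu) <= lagrangian(u, lam) + 1e-12 for mu in mus)
right = all(lagrangian(u, lam) <= lagrangian(v, lam) + 1e-12 for v in xs)
complementary = lam >= 0 and lam * F1(u) == 0 and F1(u) <= 0   # condition (23)
print(slater, left, right, complementary)  # True True True True
```

Here $L(v, \lambda) = (v - 1)^2 + 1$, so the unconstrained problem (22) has the same solution and the same extreme value $\alpha_1 = \alpha = 1$ as the constrained problem.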
If the Slater condition (SC) is absent, then one obtains only weaker
propositions. It can no longer be guaranteed that $\lambda_0 = 1$. In the following,
we denote by (Aj)' the assertion which results from (Aj) when $\lambda_0 = 1$ is
replaced by
  $\lambda_0 \ge 0$, $\lambda_0^2 + \lambda_1^2 + \cdots + \lambda_n^2 \ne 0$,
i.e., $\lambda_0 = 0$ is possible, but not all multipliers are simultaneously zero.
Corollary 47.14. If the assumptions of Theorem 47.E are satisfied but (SC) is
absent, then for (A1) we have:
(1) (A3) $\Leftrightarrow$ (A2) $\Rightarrow$ (A1).
(2) (A1) $\Rightarrow$ (A3)' $\Leftrightarrow$ (A2)'.
(3) (A4) $\Rightarrow$ (A1) $\Rightarrow$ (A4)' when $F, F_1, \ldots, F_n$ are continuous at a fixed point in
the set $A$.
(4) (A5) $\Rightarrow$ (A1) $\Rightarrow$ (A5)' when the G-derivatives $F', F_1', \ldots, F_n'$ exist in $A$.
Thus the weakening relative to Theorem 47.E pertains to the necessary
conditions for (Al) (the existence of a solution of the original problem).
We study the situation that is more general than (21) in Section 48.4,
where the Fs are not convex and operator equations appear as side
conditions.
Proof. The crucial step is the proof of (A1) $\Rightarrow$ (A3) with the aid of a
separation theorem. All the other assertions are then obtained in a simple
way. The reader should convince himself that we obtain Corollary 47.14 at
the same time.
(Ad 1) (A2) $\Rightarrow$ (A3) and (A3) $\Rightarrow$ (A2), (A1). This is trivial if one takes the
remark in conjunction with (A3) into account. Here, (SC) is not needed.
(A1) $\Rightarrow$ (A3). Let $u$ be a solution of (21). We construct a subset $C$ of
$\mathbb{R}^{n+1}$. By definition, the point $(\mu_0, \ldots, \mu_n)$ in $\mathbb{R}^{n+1}$ belongs to $C$ if and only
if
  $F(v) - F(u) \le \mu_0$,  (26)
  $F_i(v) \le \mu_i$ for all $i = 1, \ldots, n$ and a fixed $v \in A$.
$C$ has the following properties:
(i) $C$ is convex since $F$, $F_i$, and $A$ are convex.
(ii) $\operatorname{int} C \ne \emptyset$, for (26) holds with $v = u$, $\mu_0 = \mu_1 = \cdots = \mu_n > 0$.
(iii) $0 \notin \operatorname{int} C$, for $u$ is a solution of (21); consequently, $\mu_0 < 0$ is impossible
in (26) when $\mu_1, \ldots, \mu_n \le 0$.
We can thus separate the point 0 and the set $C$ in $\mathbb{R}^{n+1}$ (Proposition 39.4,
(1)), i.e., there exists a $\lambda' = (\lambda_0, \lambda)$ in $\mathbb{R}^{n+1}$ with $\lambda' \ne 0$ and $(\lambda'|\mu) \ge 0$ for all
$\mu \in C$, i.e.,
  $\sum_{i=0}^{n} \lambda_i\mu_i \ge 0$ for all $\mu \in C$.  (27)
$\lambda'$ has the following properties:
(a) $\lambda_0, \ldots, \lambda_n \ge 0$, for $\operatorname{int} \mathbb{R}^{n+1}_+ \subseteq C$, by (ii). Observe (27).
(b) $\lambda_jF_j(u) = 0$, for, from $F_j(u) \le 0$, we obtain $(0, \ldots, 0, F_j(u), 0, \ldots, 0) \in C$;
therefore, $\lambda_jF_j(u) \ge 0$, by (27).
(c) $L(v, \lambda) = \lambda_0F(v) + \sum_i \lambda_iF_i(v) \ge \lambda_0F(u) = L(u, \lambda)$ for all $v \in A$. By (27)
this follows immediately from
  $(F(v) - F(u), F_1(v), \ldots, F_n(v)) \in C$ for all $v \in A$.
(d) From the Slater condition (SC) one obtains $\lambda_0 > 0$. This follows from
(27) because
  $(F(u_0) - F(u), F_1(u_0), \ldots, F_n(u_0)) \in C$
and $F_i(u_0) < 0$ for all $i$ as well as $\lambda' \ne 0$.
We can thus assume that $\lambda_0 = 1$. Then $L(u, \lambda) = F(u)$, by (b). Therefore,
$\alpha = \alpha_1$, by (c).
(Ad 2) We now show that (A3) $\Leftrightarrow$ (A4). The minimum problem (22) is
equivalent to
  $\inf_{u \in X} L(u, \lambda) + \chi_A(u) = \alpha_1$.
The assertion now follows from the Euler equation
  $0 \in \partial(L + \chi_A)(u)$  (28)
according to Proposition 47.12 and by the sum rule
in Section 47.7. Here one takes into account that
$\chi_A(u_0) < \infty$ and $\partial(\lambda_iF_i)(u) = \lambda_i\partial F_i(u)$ for $\lambda_i \ge 0$.
(Ad 3) (A3) $\Leftrightarrow$ (A5) This follows from Theorem 46.A, (b) in Section 46.1. □
47.11. Maximal Monotonicity, Cyclic Monotonicity,
and Subgradients
In this section we fully explain the connection between subgradients and the
theory of monotone multivalued mappings. In this connection, we
generalize earlier results from Section 42.4 on differentiable convex functionals
and monotone potential operators. In doing so, we find that:
($\alpha$) differentiability will be generalized by the subdifferential,
($\beta$) independence of path of integrals will be generalized by the sum
condition of cyclic monotonicity.
This section is intimately connected with Chapter 32. Here we repeat several
definitions from Section 32.1 and add the concept of cyclic monotonicity.
Definition 47.15. Let $X$ be a real B-space and let $T: X \to 2^{X^*}$ be a
multivalued mapping, i.e., to each $u \in X$ there is assigned a subset $T(u)$ of $X^*$. The
graph of $T$, $G(T)$, consists of all
  $(u, u^*) \in X \times X^*$ such that $u^* \in T(u)$.
$T$ is called monotone if and only if
  $\langle u^* - v^*, u - v\rangle \ge 0$ for all $(u, u^*), (v, v^*) \in G(T)$.
$T$ is called cyclic monotone if and only if
  $\langle u_1^*, u_1 - u_2\rangle + \langle u_2^*, u_2 - u_3\rangle + \cdots + \langle u_n^*, u_n - u_{n+1}\rangle \ge 0$
for all $(u_i, u_i^*) \in G(T)$, $i = 1, \ldots, n$, and all $n$. Here, we set $u_{n+1} = u_1$.
$T$ is called maximal monotone if and only if $T$ is monotone and there is no
monotone mapping $T_1: X \to 2^{X^*}$ such that $G(T) \subset G(T_1)$.
Maximal cyclic monotone mappings are defined analogously. In
preparation for the following, we now formulate the condition:
(H) $F: X \to\ ]-\infty, \infty]$ is convex lower semicontinuous and $F \not\equiv +\infty$.
Theorem 47.F (Rockafellar (1970a)). In a real B-space $X$, the following
propositions for characterizing subgradients hold:
(1) $\partial F$ is maximal monotone when (H) holds.
(2) For a mapping $T: X \to 2^{X^*}$, the following assertions are equivalent:
(i) $T = \partial F$ and $F$ satisfies (H).
(ii) $T$ is maximal cyclic monotone.
In Problem 55.6 we show that $F$ in (2) is uniquely determined to within a
constant by $T$. In particular, (2) generalizes the integral criterion in Section
4114.
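The difference between monotonicity and cyclic monotonicity in Definition 47.15 already shows up in $\mathbb{R}^2$. The sketch below is an illustration with assumptions that are not part of the text: single-valued maps on the Euclidean plane, with the identity (the gradient of $u \mapsto 2^{-1}\|u\|^2$) and the rotation by 90 degrees as hypothetical test cases, and an arbitrarily chosen cycle of three points.

```python
def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]


def cyclic_sum(T, points):
    """<T(u_1), u_1 - u_2> + ... + <T(u_n), u_n - u_1> for a single-valued T."""
    n = len(points)
    total = 0.0
    for i in range(n):
        u, u_next = points[i], points[(i + 1) % n]   # u_{n+1} = u_1
        total += dot(T(u), (u[0] - u_next[0], u[1] - u_next[1]))
    return total


def gradient_of_half_norm_squared(u):
    return u                      # gradient of u -> ||u||^2 / 2


def rotation(u):
    return (-u[1], u[0])          # rotation by 90 degrees, not a gradient


cycle = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
print(cyclic_sum(gradient_of_half_norm_squared, cycle))   # 4.0
print(cyclic_sum(rotation, cycle))                        # -2.0
print(cyclic_sum(rotation, cycle[:2]))                    # 0.0
```

The rotation passes the two-point test (it is monotone) but fails on a cycle of three points, which is consistent with Theorem 47.F, (2): a maximal monotone map that is not cyclic monotone cannot be a subdifferential.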
Proof. We restrict ourselves to the case when $X$ is reflexive. The proof for
nonreflexive B-spaces can be found in Rockafellar (1970a).
(Ad 1) We show that $\partial F$ is monotone. From $(u, u^*), (v, v^*) \in G(\partial F)$, it
follows that $u^* \in \partial F(u)$, $v^* \in \partial F(v)$; therefore,
  $F(v) - F(u) \ge \langle u^*, v - u\rangle$,
  $F(u) - F(v) \ge \langle v^*, u - v\rangle$.
Now, addition yields $\langle u^* - v^*, u - v\rangle \ge 0$.
We show that $\partial F$ is maximal monotone.
(I) Let $(u_0, u_0^*) \in X \times X^*$ with
  $\langle u_0^* - u^*, u_0 - u\rangle \ge 0$ for all $(u, u^*) \in G(\partial F)$.  (29)
We prove that $u_0^* \in \partial F(u_0)$; therefore $(u_0, u_0^*) \in G(\partial F)$. The maximal
monotonicity of $\partial F$ follows from this.
Here, in a crucial way, we use assertion (II) given below. Accordingly,
because $R(J + \partial F) = X^*$, there exist elements $u$ and $u^*$ for which
  $J(u) + u^* = J(u_0) + u_0^*$, $u^* \in \partial F(u)$.  (30)
From (29) it follows that
  $\langle J(u_0) - J(u), u_0 - u\rangle \le 0$;
therefore, $u_0 = u$ because of the strict monotonicity of $J$.
(30) shows that $u_0^* = u^*$, i.e., $u_0^* \in \partial F(u_0)$.
(II) We still must prove that there exists a strictly monotone operator $J$:
$X \to X^*$ such that $R(J + \partial F) = X^*$. To prove this, we consider the
variational problem
  $\inf_{u \in X} \varphi(u) = \beta$,  (31)
where $\varphi(u) = H(u) + F(u) - \langle u^*, u\rangle$, and $H(u) = 2^{-1}\|u\|^2$ for all $u \in X$
with fixed $u^* \in X^*$. We set $J(u) := H'(u)$. In order to guarantee the
existence of the F-derivative $H'$ on $X$, we equip $X$ with an equivalent norm
so that $X$ is locally uniformly convex. Then $X$ is also strictly convex, $H$ is
strictly convex, and $u \mapsto \|u\|$ is F-differentiable on $X - \{0\}$ (cf.
A3(21)-A3(31)). From this it follows that the F-derivative $H'$ exists on $X$
and $H'(0) = 0$. According to Proposition 42.6, $H'$ is strictly monotone
because of the strict convexity of $H$. Now $\varphi$ possesses the following
properties:
(a) $\varphi: X \to\ ]-\infty, \infty]$ is convex and lower semicontinuous, since it is the sum
of functionals with these properties, and $\varphi \not\equiv +\infty$.
(b) $\varphi(u) \to +\infty$ as $\|u\| \to \infty$, since by Lemma 47.4, there exist $u_1^* \in X^*$,
$a \in \mathbb{R}$ such that
  $F(u) \ge \langle u_1^*, u\rangle - a$ for all $u \in X$;
therefore,
  $\varphi(u) \ge H(u) - \|u\|(\|u^*\| + \|u_1^*\|) - a \to +\infty$ as $\|u\| \to \infty$.
From Proposition 38.15 it follows that (31) has a solution $u$. Proposition
47.12 yields
  $0 \in \partial\varphi(u)$.  (32)
The sum rule in Section 47.7 assures that
  $\partial\varphi(u) = \partial H(u) + \partial F(u) - u^* = J(u) + \partial F(u) - u^*$;
hence, $R(J + \partial F) = X^*$, by (32), because $u^*$ is arbitrary.
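(Ad 2) (i) $\Rightarrow$ (ii) By the definition of $\partial F$,
  $F(u_{j+1}) - F(u_j) \ge \langle u_j^*, u_{j+1} - u_j\rangle$
for $u_j^* \in \partial F(u_j)$, $j = 1, \ldots, n$ and $u_{n+1} := u_1$. Addition yields
  $0 \le \sum_{j=1}^{n} \langle u_j^*, u_j - u_{j+1}\rangle$,
i.e., $\partial F$ is cyclic monotone.
We show that $\partial F$ is maximally cyclic monotone. To this end, let $E$ be a
cyclic monotone extension of $\partial F$ and let $u_1^* \in \partial F(u_1)$ and $u_2^* \in E(u_2)$. The
mapping $E$ is cyclic monotone; therefore
  $\langle u_1^*, u_1 - u_2\rangle + \langle u_2^*, u_2 - u_1\rangle \ge 0$.
This means that
  $\langle u_2^* - u_1^*, u_2 - u_1\rangle \ge 0$ for all $(u_1, u_1^*) \in G(\partial F)$.
$\partial F$ is maximally monotone; therefore $(u_2, u_2^*) \in G(\partial F)$. Thus, $E = \partial F$.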
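(ii) $\Rightarrow$ (i) We set
  $\varphi_n(u) := \sum_{j=1}^{n} \langle u_j^*, u_{j+1} - u_j\rangle + \langle u_{n+1}^*, u - u_{n+1}\rangle$,
where $(u_j, u_j^*) \in G(T)$, $j = 1, \ldots, n + 1$. We fix $(u_1, u_1^*)$ and define
  $F(u) = \sup \varphi_n(u)$
for variable $n$ and variable $(u_j, u_j^*) \in G(T)$, $j = 2, \ldots, n + 1$. Now $F$ has all
the desired properties:
(a) $F$ is convex and lower semicontinuous, since it is the supremum of
continuous affine functionals.
(b) $F \not\equiv +\infty$ because the cyclic monotonicity of $T$ yields $\varphi_n(u_1) \le 0$; thus,
$F(u_1) \le 0$.
(c) $\partial F$ is an extension of $T$. To prove this, let $(u_{n+2}, u_{n+2}^*) \in G(T)$. By the
construction of $F$ and the arbitrary choice of $n$,
  $\sum_{j=1}^{n+1} \langle u_j^*, u_{j+1} - u_j\rangle + \langle u_{n+2}^*, u - u_{n+2}\rangle \le F(u)$,
i.e.,
  $\varphi_n(u_{n+2}) + \langle u_{n+2}^*, u - u_{n+2}\rangle \le F(u)$;
therefore
  $F(u_{n+2}) + \langle u_{n+2}^*, u - u_{n+2}\rangle \le F(u)$ for all $u \in X$.
Thus, $u_{n+2}^* \in \partial F(u_{n+2})$. Take into account that $F(u_{n+2}) < \infty$ because of
property (b).
(d) $\partial F = T$. This follows from (c) and the fact that $T$ and $\partial F$ are maximal
cyclic monotone. □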
47.12. Application to the Duality Mapping
In this section, we will show how the results of this chapter directly yield
numerous important properties of the duality mapping which we have
already used in Chapter 44 in the proof of the main theorem of the
Ljusternik-Schnirelman theory as well as implicitly in the proof of Theorem
47.F in Section 47.11.
Definition 47.16. Let $X$ be a real B-space. We set
  $F(u) = 2^{-1}\|u\|^2$.
The duality mapping $J: X \to 2^{X^*}$ is defined by $J(u) = \partial F(u)$.
If $J(u)$ is a singleton, i.e., $J(u) = \{v_u\}$ for all $u \in X$, then we identify $J(u)$
with $v_u$; we thus set $J(u) = v_u$ and write $J: X \to X^*$.
Proposition 47.17. $J(u)$ consists of exactly all $u^* \in X^*$ such that
  $\langle u^*, u\rangle = \|u^*\|\,\|u\|$, $\|u^*\| = \|u\|$.
$J(u)$ is nonempty and convex as well as weak* compact.
This follows directly from Example 47.11 and Theorem 47.A in Section
47.6 since $F$ is convex. If $F'(u)$ exists as a G-derivative, then from
Proposition 47.13 it immediately follows that $J(u) = \{F'(u)\}$. In particular,
in an H-space, we obtain $\langle F'(u), h\rangle = (u|h)$; therefore, $\langle J(u), h\rangle = (u|h)$
for all $h \in X$. In addition, $\|J(u)\| = \|u\|$. Thus, in H-spaces, $J$ coincides with
the duality mapping introduced in Section 21.3.
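The two conditions of Proposition 47.17 can be checked on a concrete non-Hilbert example. The sketch below is an illustration under assumptions that do not come from the text: $X = \mathbb{R}^n$ with the $p$-norm, $1 < p < \infty$, whose dual norm is the $q$-norm with $1/p + 1/q = 1$, and the well-known candidate formula $J(u)_i = \|u\|_p^{2-p}\,|u_i|^{p-1}\operatorname{sign}(u_i)$. The function names are hypothetical.

```python
import math


def p_norm(u, p):
    return sum(abs(x) ** p for x in u) ** (1.0 / p)


def duality_map(u, p):
    """Candidate J(u) on (R^n, p-norm), 1 < p < infinity (assumed formula)."""
    nu = p_norm(u, p)
    if nu == 0.0:
        return [0.0 for _ in u]
    return [nu ** (2.0 - p) * abs(x) ** (p - 1.0) * math.copysign(1.0, x)
            for x in u]


p = 3.0
q = p / (p - 1.0)                 # conjugate exponent: dual norm of the p-norm
u = [1.0, -2.0, 2.0]
ju = duality_map(u, p)

pairing = sum(a * b for a, b in zip(ju, u))
print(abs(pairing - p_norm(u, p) ** 2) < 1e-12)    # <J(u), u> = ||u||^2: True
print(abs(p_norm(ju, q) - p_norm(u, p)) < 1e-12)   # ||J(u)|| = ||u||:   True
```

For $p = 2$ the formula reduces to $J(u) = u$, in agreement with the remark on H-spaces above.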
The next proposition follows directly from Theorem 47.F in Section
47.11.
Proposition 47.18. $J: X \to 2^{X^*}$ is maximal monotone and maximal cyclic
monotone.
If the B-spaces $X$ and $X^*$ have additional properties, then $J$ also has
additional properties. The next proposition shows this.
Proposition 47.19. Let $X$ be a real B-space, and let $F(u) = 2^{-1}\|u\|^2$.
(1) If $X^*$ is strictly convex, then $F'(u)$ exists as a G-derivative and
$J(u) = F'(u)$ for all $u \in X$, i.e., $J$ is single valued.
(2) If $X^*$ is uniformly convex, then the following hold:
(i) $F'(u)$ exists as an F-derivative and $J(u) = F'(u)$ for all $u \in X$.
(ii) $J: X \to X^*$ is uniformly continuous on each bounded set of $X$.
(3) If $X$ and $X^*$ are separable, reflexive, and strictly convex, then the
following hold:
(i) $J: X \to X^*$ is bijective, strictly monotone, coercive, bounded,
demicontinuous, and odd.
(ii) $J^{-1}: X^* \to X$ is equal to the duality mapping from $X^*$ onto $X^{**} = X$.
(iii) If $X$ and $X^*$ are even locally uniformly convex, then $J$ and $J^{-1}$ are
continuous on $X$ and $X^*$, respectively.
According to the Kadec-Troyanski theorem (A3(29)), the proof of which
can be found in Troyanski (1971) and in Cioranescu (1974, M), page 98, an
equivalent norm can be introduced on every reflexive B-space $X$ so that $X$
and $X^*$ are both locally uniformly convex and therefore strictly convex as
well. In considerations that are independent relative to passing to equivalent
norms, one can thus always assume that $J$ possesses the propitious
properties given in Proposition 47.19, (3).
We recommend that the reader study Appendices A3(21)-A3(31)
concerning the geometry of B-spaces and then try to give the proof of
Proposition 47.19, which is a simple consequence of these known
propositions given in the Appendices together with results of the theory of
monotone operators from Part II. We give the proof in Problem 47.6. With
this method for the proof, we would like to point out the intimate
connection between the geometry of B-spaces and the duality mapping. With an
independent line of reasoning, the reader can check to see whether he has an
understanding of the important propositions on monotone operators,
potential operators, and differentiation rules which we gave earlier.
The proofs of the propositions in the Appendix can be found in the
comprehensive monograph of Cioranescu (1974, M) and in the lecture notes
of Diestel (1974). Also, compare Beauzamy (1982, M).
Problems
47.1. Convex functionals. Show: If $F, G, F_\alpha: X \to [-\infty, \infty]$ are convex on the
linear space $X$, then $F + G$, $tF$, $0 < t < \infty$, $\sup(F, G)$, and $\sup_\alpha F_\alpha$ are also
convex. In this connection, we agree that $F(u) + G(u) := +\infty$ if $F(u) =
-G(u) = +\infty$.
47.2. A singular case. If $F: X \to [-\infty, \infty]$ is convex and lower semicontinuous on
the real locally convex space $X$ and $F(u) = -\infty$ for some $u \in X$, then $F$
takes on no finite values.
Solution: Compare Lemma 47.4.
47.3. Proof of Example 47.11. Solution: It suffices to consider the case $b = 0$. Let
$F(u) = 2^{-1}\|u\|^2$. For $u = 0$, we have $F'(0) = 0$; therefore, by Proposition
47.13, $\partial F(u) = \{0\}$. Now let $u \ne 0$. From $u^* \in \partial F(u)$ it follows that
  $2^{-1}(\|v\|^2 - \|z\|^2) \ge \langle u^*, v - z\rangle$ for all $v \in X$,  (33)
where $z = u$; consequently,
  $\langle u^*, u\rangle \ge \langle u^*, v\rangle$ for all $v \in X$, where $\|v\| = \|u\|$,
i.e.,
  $\|u^*\|\,\|u\| = \sup\{\langle u^*, v\rangle : v \in X, \|v\| = \|u\|\} = \langle u^*, u\rangle$.
We set
  $v := (s \pm r)w$, $z := sw$, $\|w\| = 1$, $s := \|u\|$.
From (33), for $r > 0$, it follows that
  $2^{-1}(s^2 - (s - r)^2) \le r\langle u^*, w\rangle \le 2^{-1}((s + r)^2 - s^2)$.
When $r \to +0$, we get $\langle u^*, w\rangle = s$ for all $w$, $\|w\| = 1$, i.e., $\|u^*\| = s = \|u\|$.
Conversely, let $\langle u^*, u\rangle = \|u^*\|\,\|u\|$ and $\|u^*\| = \|u\|$. From this, for all
$v \in X$, it follows that
  $\langle u^*, v - u\rangle \le \|u\|\,\|v\| - \|u\|^2 \le 2^{-1}(\|v\|^2 - \|u\|^2)$.
Observe that $ab \le 2^{-1}(a^2 + b^2)$. Thus, we have $u^* \in \partial F(u)$.
47.4. Proof of Proposition 47.13, (i). Hint: Compare Ekeland and Temam (1974,
M), Chapter I, 5.2. Use a separation theorem.
47.5. Application of Theorem 47.D to concrete approximation problems. In this
connection, study Holmes (1972, L), Chapter 3.
47.6. Proof of Proposition 47.19. Solution: We set
  $F(u) := 2^{-1}\|u\|^2$, $G(u) := \|u\|$.
(1) According to A3(26), $G$ is G-differentiable on $X - \{0\}$. Consequently,
$F'(u) = G'(u)\|u\|$ for $u \ne 0$. Trivially, $F'(0) = 0$.
2(i) Follow the line of reasoning of (1), using A3(25).
2(ii) According to A3(25), $G$ is uniformly F-differentiable on $S := \{u \in X:
\|u\| = 1\}$, i.e., for each $\varepsilon > 0$ there exists a $\delta(\varepsilon) > 0$ such that
  $-\varepsilon\|h\| \le \|u_i + h\| - \|u_i\| - \langle G'(u_i), h\rangle \le \varepsilon\|h\|$  (34)
for all $h$, $\|h\| \le \delta(\varepsilon)$, and $u_i \in S$.
(I) $G'$ is uniformly continuous on $S$. To see this we choose $u_1, u_2 \in S$
with $\|u_1 - u_2\| \le \varepsilon\delta(\varepsilon)$. Hence
  $|\,\|u_1\| - \|u_2\|\,| \le \varepsilon\delta(\varepsilon)$, $|\,\|u_1 + h\| - \|u_2 + h\|\,| \le \varepsilon\delta(\varepsilon)$.
Subtraction in (34) immediately yields
  $|\langle G'(u_1) - G'(u_2), h\rangle| \le 4\varepsilon\delta(\varepsilon)$ for all $h$, $\|h\| \le \delta(\varepsilon)$;
therefore,
  $\|G'(u_1) - G'(u_2)\| \le 4\varepsilon$.
(II) $G'$ is thus also bounded on $S$, by Fig. 27.1.
(III) $u \mapsto G'(\|u\|^{-1}u)$ is uniformly continuous on bounded sets that lie
outside some neighborhood of zero. This follows from (I) and the
corresponding property for $u \mapsto \|u\|^{-1}u$.
Since $G(tu) = tG(u)$ for $t > 0$, we have $G'(tu) = G'(u)$, i.e., $G'(\|u\|^{-1}u)
= G'(u)$ for $u \ne 0$. Now the uniform continuity of $F'$ on bounded sets
follows by a suitable decomposition of $F'(u) - F'(v)$, taking into
consideration
  $F'(u) = G'(\|u\|^{-1}u)\|u\|$ for $u \ne 0$, $F'(0) = 0$
and (I)-(III).
3(i) $X$ is strictly convex; therefore, $F$ is also strictly convex by A3(31).
Thus, for $J = F'$, the following hold:
(a) $J$ is strictly monotone and demicontinuous by Section 42.3.
(b) $J$ is odd because $F$ is even.
(c) $J$ is coercive because $\langle Ju, u\rangle/\|u\| = \|u\|$; therefore
  $\langle Ju, u\rangle/\|u\| \to +\infty$ as $\|u\| \to \infty$.
(d) $J$ is bounded since $\|Ju\| = \|u\|$.
(e) $J$ is bijective by Theorem 26.A.
3(ii) Since $X$ is strictly convex, it follows from assertion (1) that the
duality mapping $\tilde{J}: X^* \to X^{**}$ is single valued because $X^{**} = X$.
Proposition 47.17 shows that $\tilde{J} = J^{-1}$.
3(iii) $J$ is continuous because, from $u_n \to u$ as $n \to \infty$, it follows that
  $J(u_n) \rightharpoonup J(u)$, $\|J(u_n)\| \to \|J(u)\|$
because of the demicontinuity of $J$ and $\|J(v)\| = \|v\|$. Now A3(30) yields
  $J(u_n) \to J(u)$.
The continuity of $J^{-1} = \tilde{J}$ is obtained in an analogous manner.
47.7.* Chain rule for subdifferentials. Prove that:
  $\partial(F \circ L)(u) = L^*\partial F(Lu)$ for all $u \in X$
provided $F: Y \to \mathbb{R}$ is convex and lower semicontinuous, $L: X \to Y$ is linear
and continuous, $X$ and $Y$ are real locally convex spaces, and $F$ is finite and
continuous at some point.
Hint: Compare Ekeland and Temam (1974, M), Proposition 5.7. Use a
separation theorem.
47.8. Approximate minimal solutions. Let $F: X \to \mathbb{R}$ be lower semicontinuous and
G-differentiable on the B-space $X$, and suppose $u$ satisfies
  $F(u) \le \inf_{v \in X} F(v) + \varepsilon$.
Show: For this $F$ there exists an approximate minimal solution $u_\varepsilon$, i.e.,
  $F(u_\varepsilon) \le F(u)$, $\|u - u_\varepsilon\| \le \sqrt{\varepsilon}$, $\|F'(u_\varepsilon)\| \le \sqrt{\varepsilon}$.
Hint: Use Proposition 38.22. Further generalizations for subgradients can
be found in Ekeland and Temam (1974, M), 6.3.
47.9.* Local $\varepsilon$-subdifferentiability. Let $F: X \to\ ]-\infty, \infty]$ be convex and lower
semicontinuous on the real B-space $X$. Let $0 < \varepsilon < \infty$. The functional $F$ is said to
be locally $\varepsilon$-subdifferentiable at $u$ if and only if there exist a $u^* \in X^*$ and a
number $\eta > 0$ such that
  $F(v) \ge F(u) + \langle u^*, v - u\rangle - \varepsilon\|v - u\|$ for all
  $v$ for which $\|v - u\| \le \eta$.
Show: If $X$ is an H-space and $F \not\equiv +\infty$, then the set of points at which $F$
is locally $\varepsilon$-subdifferentiable is dense in $\operatorname{dom} F$; hence, it is dense in $X$ for
$-\infty < F < \infty$.
Hint: Compare Aubin (1979, M), page 125. There, one also finds further
results. See, also, Ekeland (1979, S) and the detailed exposition in Demjanov
and Vasiljev (1981, M).
47.10. Generalized gradients of locally Lipschitz continuous functionals. Let $f: U \subseteq X
\to \mathbb{R}$ be Lipschitz continuous on the open set $U$ of the B-space $X$. The
generalized directional derivative of $f$ at $x$ is given by
  $\delta_+f(x; h) := \limsup\,[f(y + th) - f(y)]\,t^{-1}$ as $y \to x$, $t \to +0$.
Here $t > 0$. By definition, the generalized gradient $\partial f(x)$ of $f$ at $x$ is the set
of all $x^* \in X^*$ such that
  $\delta_+f(x; h) \ge \langle x^*, h\rangle$ for all $h \in X$.
Show: $\partial f(x)$ is a nonempty convex bounded subset of $X^*$.
Hint: Compare Clarke (1981), page 54. This new calculus, which can be
found in Clarke (1981), Demjanov and Vasiljev (1981, M), and in
Rockafellar (1981, L), is very useful for treating nonsmooth and nonconvex
problems. Interesting applications to general optimization and control problems
are contained in Clarke (1976), (1976a), (1983), Ekeland (1979, S) and
Rockafellar (1981, L) (also, see Problem 48.8c). This approach is closely
related to that in Section 38.8.
As a typical result we mention the following: Suppose $f, g: X \to \mathbb{R}$ are
locally Lipschitz continuous functions on the B-space $X$. If $u$ is a solution of
  $f(u) = \min!$, $g(u) \le 0$,
then there exist nonnegative numbers $\lambda_0$ and $\lambda$, not both zero, such that
  $0 \in \lambda_0\partial f(u) + \lambda\partial g(u)$,
  $0 = \lambda g(u)$.
If $f$ and $g$ are convex, then $\partial f(u)$ and $\partial g(u)$ are the subdifferentials of $f$ and
$g$, respectively, at $u$. If $f$ and $g$ are G-differentiable, then $\partial f(u) = \{f'(u)\}$
and $\partial g(u) = \{g'(u)\}$ and we obtain the classical condition
  $0 = \lambda_0f'(u) + \lambda g'(u)$.
Compare Clarke (1976b).
47.11. Applied problems for convex optimization theory. Numerous interesting
practical problems can be found in Luenberger (1969, M). Study these
applications.
47.12.* Convex analysis and mathematical economics. In this connection, study the
detailed exposition in Aubin (1979, M).
47.13. A simple sufficient Lagrange multiplier rule for nonconvex problems. We
consider the problem
  $F(x) = \min!$,  (35)
  $f_j(x) \le 0$ for all $j = 1, \ldots, m$,
with the Lagrange function
  $L(x, \lambda) = F(x) + \sum_{j \in J} \lambda_jf_j(x)$.
We make the following three assumptions.
(i) $F, f_j: U(x_0) \subseteq \mathbb{R}^n \to \mathbb{R}$ are $C^2$-functions.
(ii) The point $x_0$ satisfies the side conditions.
(iii) Denote by $J$ the set of all $j$ for which $f_j(x_0) = 0$.
Suppose that for each $j \in J$ there is a positive number $\lambda_j$ such that
  $L_x(x_0, \lambda) = 0$  (36)
and
  $L_{xx}(x_0, \lambda)h^2 > 0$  (37)
for all nonzero $h \in \mathbb{R}^n$ with $f_j'(x_0)h = 0$ for all $j \in J$.
Show: $x_0$ is a strict local solution of (35).
Solution: If $x_0$ is not such a solution, then there is a sequence $(t_n)$ of
positive numbers with $t_n \to 0$ as $n \to \infty$ and a sequence $(h_n)$ of unit vectors
in $\mathbb{R}^n$ such that
  $F(x_0 + t_nh_n) - F(x_0) \le 0$,
  $f_j(x_0 + t_nh_n) - f_j(x_0) \le 0$ for all $j \in J$.
Furthermore, by a compactness argument, we can assume that $h_n \to h$ as
$n \to \infty$. Hence
  $F'(x_0)h \le 0$
and
  $f_j'(x_0)h \le 0$ for all $j \in J$.
By (36) and $\lambda_j > 0$, we obtain
  $F'(x_0)h = 0$
and
  $f_j'(x_0)h = 0$ for all $j \in J$.
Set $x_n = x_0 + t_nh_n$. By Taylor's theorem and (36),
  $0 \ge L(x_n, \lambda) - L(x_0, \lambda) = 2^{-1}L_{xx}(x_0, \lambda)(x_n - x_0)^2 + o(\|x_n - x_0\|^2)$.
Dividing by $t_n^2$ and letting $n \to \infty$, we get $0 \ge 2^{-1}L_{xx}(x_0, \lambda)h^2$. This contradicts (37).
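Conditions (36) and (37) can be checked by hand on a small nonconvex example. The sketch below is an illustration with assumptions that are not part of the text: $n = 2$, the objective $F(x, y) = x^2 - y^2 - y$, one side condition $f(x, y) = y \le 0$, the point $x_0 = (0, 0)$ and the multiplier $\lambda = 1$. The derivatives are entered by hand, and strictness is only tested on a finite grid around $x_0$.

```python
def F(x, y):
    return x * x - y * y - y      # nonconvex objective (illustrative choice)


def f(x, y):
    return y                      # side condition f(x, y) <= 0, active at x0


lam = 1.0                                     # candidate multiplier
# L(x, y) = F(x, y) + lam * f(x, y); derivatives computed by hand at x0 = (0, 0).
grad_L = (2 * 0.0, -2 * 0.0 - 1.0 + lam)      # condition (36): L_x(x0) = 0
hess_L = ((2.0, 0.0), (0.0, -2.0))            # L_xx(x0), indefinite on R^2

# Directions h with f'(x0) h = 0 are the multiples of (1, 0).
h = (1.0, 0.0)
second_order = (hess_L[0][0] * h[0] * h[0] + 2 * hess_L[0][1] * h[0] * h[1]
                + hess_L[1][1] * h[1] * h[1])  # condition (37): must be > 0

grid = [(i / 20.0 - 0.5, j / 20.0 - 0.5) for i in range(21) for j in range(21)]
feasible = [(x, y) for (x, y) in grid if f(x, y) <= 0 and (x, y) != (0.0, 0.0)]
strict_min = all(F(x, y) > F(0.0, 0.0) for (x, y) in feasible)
print(grad_L, second_order, strict_min)  # (0.0, 0.0) 2.0 True
```

The Hessian of $L$ is indefinite on all of $\mathbb{R}^2$, so positivity is only required, and only holds, on the directions tangent to the active side condition.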
References to the Literature
Classical works: Minkowski (1910), (1911) (convex functions, convex
bodies, and number theory); Bonnesen and Fenchel (1934, M) (convex bodies);
John (1948) and Kuhn and Tucker (1951) (minimum problem with
inequalities as side conditions).
Convex analysis in $\mathbb{R}^n$: Rockafellar (1970, M, B, H) and Roberts and
Varberg (1973, M, B, H) (standard works); Marti (1977, M).
Convex analysis in infinite-dimensional spaces: Moreau (1966, L);
Rockafellar (1968, S), (1970a); Ekeland and Temam (1974, M); Ioffe and
Tihomirov (1974, M); Barbu and Precupanu (1978, M) and Aubin (1979,
M) (comprehensive exposition of calculus).
Duality mapping: Cioranescu (1974, M) (comprehensive presentation),
Pascali and Sburlan (1978, M) (connection with the theory of monotone
operators).
Convex analysis and monotone operators: Rockafellar (1968, S); Browder
(1968/76, M); Brezis (1973, L) (H-spaces); Gajewski, Groger, and Zacharias
(1974, M); Pascali and Sburlan (1978, M); Barbu and Precupanu (1978, M);
Kluge (1979, M).
Convex analysis, multivalued functions, and measure theory: Castaing
and Valadier (1977, L).
Convex analysis and the calculus of variations: Ekeland and Temam
(1974, M); Ioffe and Tihomirov (1974, M).
Local convex analysis and control theory: Ioffe and Tihomirov (1974, M).
Convex analysis and approximation theory: Holmes (1972, L).
Convex analysis and geometric functional analysis: Holmes (1975, M).
Applications of convex optimization theory in B-spaces: Luenberger
(1969, M); Ekeland and Temam (1974, M); Ioffe and Tihomirov (1974, M);
Holmes (1975, M); Barbu and Precupanu (1978, M); Aubin (1979, M).
Convex analysis and mechanics: Duvaut and Lions (1972, M); Ekeland
and Temam (1974, M); Moreau (1976, S); Groger (1979, S); Hlaváček and
Nečas (1981, M); Temam (1983, M).
Convex analysis and mathematical economics: Aubin (1979, M).
Generalized gradients for locally Lipschitz continuous functionals and
applications: Clarke (1976), (1976a), (1976b), (1981), (1984, M); Ekeland
(1979, S); Rockafellar (1981, L); Demjanov and Vasiljev (1981, M, B).
Optimization for nonsmooth functionals: Demjanov and Vasiljev (1981,
M, B); Rockafellar (1981, L); Clarke (1984, M).
$\varepsilon$-Subgradients, quasi-derivatives, and numerical methods: Demjanov and
Vasiljev (1981, M, B, H).
Convex sets: Valentine (1964, M); Holmes (1975, M); Leichtweiss (1980,
M); Fuchssteiner and Lusky (1981, M).
Generalized Hahn-Banach theorem and basic concepts of convex
analysis: Konig (1982, S).
CHAPTER 48
General Lagrange Multipliers
(Dubovickii-Miljutin Theory)
True optimization is the revolutionary contribution of modern research to
decision processes.
George Bernhard Dantzig
(born 1914)
In Chapter 43 (eigenvalue problems) and in Section 47.10 (Kuhn-Tucker
theory), we became acquainted with the Lagrange multiplier method for
handling extremal problems. In this chapter we prove a very general
formulation of this method (Theorem 48.A in Section 48.3). In this
connection, the direction cone and the positive functionals that exist on it play the
crucial role.
The basic idea is very simple. Let $F: \overline{G} \subseteq \mathbb{R}^2 \to \mathbb{R}$ be a real function on
the closure of a region $G$. For $F$ to have a minimum at a boundary point
$u_0 \in \partial G$ we must, roughly speaking, have
  $K_0 \cap K_1 = \emptyset$.  (1)
Here, $K_0$ and $K_1$ have the following meanings:
(i) $K_0$ is the set of all directions that emanate from $u_0$ and in which $F$ is
strictly decreasing,
(ii) $K_1$ is the set of all directions that point from $u_0$ into the region $G$.
Therefore, $K_0 \cap K_1 = \emptyset$ means, in other words, that there exists no
direction that points from $u_0$ into the region and in which $F$ is strictly
decreasing.
Thus, our problem consists of stating conditions under which $K_0 \cap K_1 =
\emptyset$, with the aid of separation theorems. This occurs in Section 48.2
(Dubovickii-Miljutin lemma). To this end, we use the Krein extension
theorem for positive functionals from Section 39.1, which is obtained from a
separation theorem.
As applications we consider:
(α) extremal problems with side conditions in the form of equations and
inequalities (generalized local Kuhn-Tucker conditions);
(β) control problems, classical variational problems, and the Pontrjagin
maximum principle.
This theory was influenced in an essential way by the attempts around
1965 to create a general theory of extremal problems with side conditions
that also encompassed the Pontrjagin maximum principle within the context
of a Lagrange multiplier rule.
In Section 48.10 we treat an application of the maximum principle to the
optimal control of a spaceship in its return to earth. In Problems 48.5-48.7
we study additional practical control problems (optimal moon landing,
optimal start of a rocket, etc.).
48.1. Cone and Dual Cone
In general, cones play a crucial role in optimization theory.
Definition 48.1. Let $K$ be a subset of the real locally convex space $X$.
$K$ is called a cone if and only if the following holds:
$$u \in K,\ a > 0 \quad\text{implies}\quad au \in K. \tag{2}$$
By the dual cone $K^+$ to $K$ we mean
$$K^+ \stackrel{\text{def}}{=} \{f \in X^* : f(u) \ge 0 \text{ on } K\}.$$
Example 48.2. Figure 48.1 shows a cone for $X = \mathbb{R}^2$. We do not require that
the apex 0 belong to $K$ or that $K$ be closed or that the angle of the cone be
acute.
By definition, $K^+$ consists of all continuous linear functionals that are
nonnegative on $K$. Obviously, $K^+$ is a convex cone with $0 \in K^+$. For $K = \emptyset$,
$K^+ = X^*$. Furthermore, $K_1 \subseteq K_2$ always yields $K_2^+ \subseteq K_1^+$.
Figure 48.1
Convention 48.3. For reasons of symmetry, in optimization theory one
frequently sets $K^* = K^+$. We adhere to this convention later in Chapter 49,
but we point out a danger of confusion: For $K = X$, we have $K^+ = \{0\}$;
therefore, the dual cone $K^*$ is not equal to the dual space $X^*$ when
$X^* \neq \{0\}$.
Example 48.4. Let $X = \mathbb{R}^N$ and
$$\mathbb{R}^N_+ = \{(\xi_1,\dots,\xi_N) \in \mathbb{R}^N : \xi_1,\dots,\xi_N \ge 0\}.$$
If we set $K = \mathbb{R}^N_+$, then $K$ is a closed convex cone in $X$ with $K^+ = K$ (see
Fig. 48.2).
Proof. Every continuous linear functional $f \in X^*$ has the form
$$f(x) = \eta_1\xi_1 + \dots + \eta_N\xi_N$$
for all $x = (\xi_1,\dots,\xi_N) \in X$, where $y = (\eta_1,\dots,\eta_N) \in X$ is fixed. The
condition $f(x) \ge 0$ for all $x \in K$ is equivalent to $\eta_1,\dots,\eta_N \ge 0$; hence $y \in K$. If we
identify $f$ with $y$, then we obtain $K^+ = K$. □
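As our next example, we investigate the cones:
$$K_= \stackrel{\text{def}}{=} \{u \in X : f(u) = 0\},$$
$$K_< \stackrel{\text{def}}{=} \{u \in X : f(u) < 0\},$$
$$K_\le \stackrel{\text{def}}{=} \{u \in X : f(u) \le 0\}.$$
Example 48.5. If $X$ is a real locally convex space and $f \in X^*$ with $f \neq 0$,
then:
$$K_=^+ = \{\lambda f : \lambda \in \mathbb{R}\},$$
$$K_<^+ = \{\lambda f : \lambda \in \mathbb{R},\ \lambda \le 0\}, \qquad K_\le^+ = K_<^+.$$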
We treat the proof in Problem 48.1. We shall apply this example in Section
48.5.
Figure 48.2
Our next goal is to establish the formula
$$K^{++} = K, \tag{3}$$
which is important in duality theory. In this connection, we set
$K^{++} \stackrel{\text{def}}{=} (K^+)^+$.
Proposition 48.6. If $K$ is a cone in the real locally convex space $X$, and
$(X, X^*)$ forms a dual pair, then the following four assertions hold:
(a) $K^+$ is convex, closed, and nonempty.
(b) $K^{++} = K$ if and only if $K$ is convex, closed, and nonempty.
(c) $K^{++} = \overline{\operatorname{co}}\, K$ when $K \neq \emptyset$.
(d) If $K \neq \emptyset$, then:
$$\inf_{w \in K} \langle v, w\rangle = \begin{cases} 0 & \text{if } v \in K^+, \\ -\infty & \text{if } v \notin K^+. \end{cases} \tag{4}$$
Later we shall frequently make use of (4) in optimization theory to
calculate Lagrange functions and conjugate functionals. Dual pairs will be
defined in the Appendix. In particular, (X, X*) is a dual pair when X is a
reflexive B-space and X* is the dual B-space. The proof that K++ = K is
based on a separation theorem.
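Formula (4) can be illustrated numerically. The following is only a minimal sketch, assuming $K = \mathbb{R}^2_+$ from Example 48.4 and sampling the infimum along finitely many rays $a\,w$ through the two generators of $K$; the helper names `in_dual_cone` and `inf_on_rays` are hypothetical and introduced for this illustration only.

```python
import numpy as np

def in_dual_cone(v):
    """For K = R^N_+ the dual cone K^+ coincides with K (Example 48.4)."""
    return bool(np.all(v >= 0))

def inf_on_rays(v, generators, scales):
    """Smallest sampled value of <v, a*w> for a > 0 and w a generator of K.

    This only samples finitely many points of K, so it indicates the trend
    of the infimum in (4) rather than computing it.
    """
    return min(float(a * np.dot(v, w)) for w in generators for a in scales)

generators = np.eye(2)                       # K = R^2_+ is generated by e_1, e_2
scales = [10.0 ** k for k in range(-6, 7)]   # a -> +0 and a -> +infinity

v_in = np.array([1.0, 2.0])    # v in K^+: sampled values shrink towards 0
v_out = np.array([1.0, -2.0])  # v not in K^+: sampled values are unbounded below

print(in_dual_cone(v_in), inf_on_rays(v_in, generators, scales))
print(in_dual_cone(v_out), inf_on_rays(v_out, generators, scales))
```

For `v_in` the sampled minimum is attained at the smallest scale and tends to 0 as the scale shrinks; for `v_out` it is attained at the largest scale and decreases without bound, in line with the two cases of (4).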
Proof. (a) $K^+$ is closed. In this connection, let $(v_\alpha)$ be an M-S sequence
from $K^+$ such that $v_\alpha \to v$. From $\langle v_\alpha, u\rangle \ge 0$ for all $u \in K$, it immediately
follows that $\langle v, u\rangle \ge 0$ for all $u \in K$, i.e., $v \in K^+$.
(d) Let $v \in K^+$. Either $\langle v, w\rangle = 0$ for all $w \in K$ or $\langle v, w\rangle > 0$ for some
$w \in K$. In the second case, we also have $\langle v, aw\rangle > 0$ for all $a > 0$. Now
$a \to +0$ yields (d).
Let $v \notin K^+$. Then $\langle v, w\rangle < 0$ for some $w \in K$; therefore, we also have
$\langle v, aw\rangle < 0$ for all $a > 0$. Thus, $a \to +\infty$ yields (d).
Note that $w \in K$, $a > 0$ implies $aw \in K$.
(b) If $(K^+)^+ = K$ holds, then by (a) it follows that $K$ is closed, convex,
and nonempty.
Now let $K$ be closed, convex, and nonempty. We show that $(K^+)^+ = K$.
Since $X^{**} = X$, this is equivalent to
$$\langle v, u\rangle \ge 0 \ \text{ for all } v \in K^+ \iff u \in K. \tag{5}$$
The assertion for $\Leftarrow$ follows immediately from the definition of $K^+$. We
show the assertion for $\Rightarrow$ when $K$ is convex, closed, and nonempty. If we
had $u \notin K$, then we could strictly separate $u$ and $K$ (Proposition 39.4, 2(ii)).
Thus, there exist elements $v \in X^*$ and an $\alpha \in \mathbb{R}$ such that
$$\langle v, u\rangle < \alpha = \inf_{w \in K} \langle v, w\rangle.$$
By (4), $\alpha = 0$; therefore, $v \in K^+$, which contradicts $\langle v, u\rangle \ge 0$ by (5).
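(c) $K \subseteq K^{++}$ follows from (5). Let $K_1 \stackrel{\text{def}}{=} \overline{\operatorname{co}}\, K$. By (a), $K^{++}$ is convex and
closed; hence, $K_1 \subseteq K^{++}$.
Furthermore, from $K \subseteq K_1$ it follows that $K_1^+ \subseteq K^+$ and thus $K^{++} \subseteq K_1^{++}$.
By (b), $K_1^{++} = K_1$; therefore $K_1 = K^{++}$. □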
48.2. The Dubovickii-Miljutin Lemma
The following Dubovickii-Miljutin lemma is crucial for the proof of the
main theorem in the next section. Our goal is to find a characterization for
$$\bigcap_{i=0}^{n+1} K_i = \emptyset \tag{6}$$
with the aid of
$$f_0 + \dots + f_{n+1} = 0. \tag{6*}$$
Our assumptions read as follows:
(A1) $K_0, K_1, \dots, K_{n+1}$ are convex cones in the real locally convex space $X$,
where $n \ge 0$.
(A2) $K_0, \dots, K_n$ are open and $K_0 \neq \emptyset$.
Lemma 48.7. With the assumptions (A1) and (A2), the following two
assertions are equivalent:
(i) (6) holds.
(ii) There exist functionals $f_i \in K_i^+$, $i = 0, \dots, n+1$, which are not all
simultaneously equal to zero, such that (6*) holds.
Proof. (ii) ⇒ (i) Suppose that there exists a $u$ such that
$$u \in \bigcap_{i=0}^{n+1} K_i.$$
By (6*), it is impossible to have $f_0 = \dots = f_n = 0$. Therefore, let, say, $f_0 \neq 0$,
so that $f_0(v) \neq 0$ for some $v \in X$. The set $K_0$ is open; consequently, for all $\lambda$ in a neighborhood
of zero, $f_0(u + \lambda v) \ge 0$, i.e., $f_0(u) + \lambda f_0(v) \ge 0$; hence, $f_0(u) > 0$. By (6*),
because $f_i \in K_i^+$, this yields the contradiction
$$0 = f_0(u) + \dots + f_{n+1}(u) \ge f_0(u).$$
(i) ⇒ (ii) We use a separation theorem and the important formula
$$\Bigl(\bigcap_{i=0}^{m} K_i\Bigr)^{+} = K_0^+ + \dots + K_m^+, \qquad m \le n, \tag{7}$$
which we shall prove below with the aid of the Krein extension theorem.
Since $K_0 \neq \emptyset$ and (6) holds, there exists an $m \le n$ such that
$$K \stackrel{\text{def}}{=} \bigcap_{i=0}^{m} K_i \neq \emptyset, \qquad K \cap K_{m+1} = \emptyset.$$
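Since $K$ is open, we can separate $K$ and $K_{m+1}$ (Proposition 39.4, (1)). Thus,
there are an $f \in X^*$ and an $\alpha \in \mathbb{R}$ such that $f \neq 0$ and
$$f(u) \le \alpha \le f(v) \quad \text{for all } u \in K_{m+1},\ v \in K.$$
By (4), $f \in K^+$, $-f \in K_{m+1}^+$. If we set $f_{m+1} \stackrel{\text{def}}{=} -f$, $f_{m+2} \stackrel{\text{def}}{=} \dots \stackrel{\text{def}}{=} f_{n+1} = 0$,
then
$$f + f_{m+1} + \dots + f_{n+1} = 0.$$
From $f \in K^+$ and (7) it follows that
$$f = f_0 + \dots + f_m$$
for suitable $f_i \in K_i^+$. This yields (6*).
We must still prove (7). To this end, we set
$$Y \stackrel{\text{def}}{=} \{(u_0, \dots, u_m) : u_i \in X \text{ for all } i\},$$
$$L \stackrel{\text{def}}{=} \{(v, \dots, v) : v \in X\},$$
$$C \stackrel{\text{def}}{=} \{(u_0, \dots, u_m) : u_i \in K_i \text{ for all } i\},$$
i.e.,
$$Y = \prod_{i=0}^{m} X, \qquad C = \prod_{i=0}^{m} K_i,$$
and $L$ is the so-called diagonal of $Y$. Here, $C$ is an open convex cone in $Y$
such that $C \cap L \neq \emptyset$ because $K \neq \emptyset$. Each $F \in Y^*$ has the representation
$$F(u) = \sum_{i=0}^{m} f_i(u_i) \tag{8}$$
for all $u = (u_0, \dots, u_m) \in Y$, where $f_i \in X^*$ for all $i$.
(7) with $\supseteq$ follows immediately from the definition of the dual cone.
In order to prove (7) with $\subseteq$, let
$$f \in \Bigl(\bigcap_{i=0}^{m} K_i\Bigr)^{+}. \tag{9}$$
Our trick for proving this consists in defining $F$ on $L$ by
$$F(u) \stackrel{\text{def}}{=} f(v) \quad \text{for all } u = (v, \dots, v) \in L.$$
Due to (9), $F(u) \ge 0$ for all $u \in C \cap L$. According to the Krein extension
theorem (Proposition 39.5), $F$ on $L$ can be extended to a continuous linear
functional $F$ on $Y$ such that $F(u) \ge 0$ on $C$. Therefore, there exist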
$f_i \in X^*$ such that (8) holds and
$$F(u) = \sum_{i=0}^{m} f_i(u_i) \ge 0$$
for all $u \in C$, i.e., for all $u_i \in K_i$.
$v \in K_i$, $a > 0$ implies $av \in K_i$. Therefore, $f_i(u_i) \ge 0$ for all $u_i \in K_i$, i.e.,
$f_i \in K_i^+$. The construction of $F$ yields $f = f_0 + \dots + f_m$ on $X$. □
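To make Lemma 48.7 concrete, here is a small sketch for the simplest case $n = 0$ in $X = \mathbb{R}^2$, assuming two open half-plane cones $K_0 = \{h : \langle c, h\rangle < 0\}$ and $K_1 = \{h : \langle a, h\rangle < 0\}$ with $c, a \neq 0$. By Example 48.5 their dual cones are $\{-\lambda_0 c : \lambda_0 \ge 0\}$ and $\{-\lambda_1 a : \lambda_1 \ge 0\}$, so (6*) amounts to $\lambda_0 c + \lambda_1 a = 0$. The function names are hypothetical, and the emptiness test is a crude sampling check, not a proof.

```python
import numpy as np

def cones_intersect(c, a, samples=3600):
    """Sample unit directions h and test whether some h lies in both open half-planes."""
    angles = np.linspace(0.0, 2.0 * np.pi, samples, endpoint=False)
    H = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    return bool(np.any((H @ c < 0) & (H @ a < 0)))

def multipliers(c, a, tol=1e-12):
    """Return (lam0, lam1) >= 0, not both zero, with lam0*c + lam1*a = 0, or None.

    Such multipliers exist exactly when the nonzero vectors c and a are antiparallel.
    """
    cross = c[0] * a[1] - c[1] * a[0]
    if abs(cross) > tol or np.dot(c, a) >= 0:
        return None
    return (float(np.linalg.norm(a)), float(np.linalg.norm(c)))

c = np.array([1.0, 0.0])
a_anti = np.array([-2.0, 0.0])   # antiparallel to c: K_0 and K_1 are disjoint
a_gen = np.array([0.0, 1.0])     # generic position: K_0 and K_1 overlap

print(cones_intersect(c, a_anti), multipliers(c, a_anti))
print(cones_intersect(c, a_gen), multipliers(c, a_gen))
```

In the antiparallel case the sampled intersection is empty and nontrivial multipliers exist; in the generic case the cones overlap and no such multipliers exist, matching the equivalence (i) ⇔ (ii).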
48.3. The Main Theorem on Necessary and Sufficient
Extremal Conditions for General Side
Conditions
We consider the general minimum problem with side conditions:
$$F_0(u) = \min!, \tag{10}$$
(a) Side conditions of the type of inequalities:
$$u \in N_j, \quad j = 1, \dots, n.$$
(b) Side conditions of the type of equations:
$$u \in N_{n+1}.$$
Our goal is a necessary condition for solvability of the form
$$f_0 + f_1 + \dots + f_{n+1} = 0, \qquad f_i \in K_i^+.$$
In this connection, our assumptions read as follows:
(H1) $F_0\colon D(F_0) \subseteq X \to \mathbb{R}$ is a functional on a neighborhood of $u_0$ in the
real locally convex space $X$.
(H2) All $N_1, \dots, N_{n+1}$ are subsets of $X$ such that $\operatorname{int} N_j \neq \emptyset$ for $j = 1, \dots, n$.
We thus designate the side conditions $u \in N_j$, $j = 1, \dots, n$, as side
conditions of the type of inequalities. In contrast to this, $\operatorname{int} N_{n+1} = \emptyset$ is possible
for $N_{n+1}$. This situation occurs for side conditions in the form of equations.
Now we associate certain direction cones at the point $u_0$ with the side
conditions:
(α) The cone $K_0$ of the regular descent directions of $F_0$ at $u_0$.
(β) The cone $K_j$ of the admissible directions at $u_0$ with respect to $N_j$ for
$j = 1, \dots, n$.
(γ) The cone $K_{n+1}$ of one-sided tangential directions at $u_0$ with respect to
$N_{n+1}$.
We give the precise definitions below.
(H3) $K_0, \dots, K_{n+1}$ are convex and $K_0 \neq \emptyset$.
In the following, we designate $u_0$ as a local solution of (10) when $F_0$ has a
bound local minimum at $u_0$ with respect to the side conditions in (10).
Theorem 48.A (Dubovickii and Miljutin (1965)). With the assumptions
(H1)-(H3), the following three assertions hold:
(1) Necessary condition. If $u_0$ is a local solution of (10), then there exist
continuous linear functionals $f_i \in K_i^+$, $i = 0, 1, \dots, n+1$, which are not all
simultaneously zero, such that
$$f_0 + \dots + f_{n+1} = 0. \tag{11}$$
(2) Nondegeneracy. We have $f_k \neq 0$ when
$$\bigcap_{\substack{i=0 \\ i \neq k}}^{n+1} K_i \neq \emptyset.$$
(3) Sufficient condition. The necessary condition in assertion (1) is sufficient
for $u_0$ to be a solution of (10) provided, in addition, the following hold:
(i) $F_0\colon X \to \mathbb{R}$ is convex and continuous.
(ii) $N_1, \dots, N_{n+1}$ are convex and there exists an $h$ such that $h \in \operatorname{int} N_j$ for
$j = 1, \dots, n$ and $h \in N_{n+1}$ (Slater condition).
We designate the $f_i$ as generalized Lagrange multipliers. We call (11) an
abstract Euler-Lagrange equation. Especially important is the nondegenerate
case $f_0 \neq 0$, which is guaranteed by
$$\bigcap_{i=1}^{n+1} K_i \neq \emptyset.$$
In the next section it will be clear from an important example that all this
notation is chosen in a meaningful way. In (10) if one wants to forego the
side condition $u \in N_i$ for fixed $i$, then one can choose $N_i = X$. Then $K_i = X$
by Definition 48.8 below. Therefore, $K_i^+ = \{0\}$ and thus $f_i = 0$.
We now make the definition of Kt precise.
Definition 48.8. $h$ in $X$ is called a regular descent direction of $F_0$ at $u_0$ if and
only if there exist numbers $a > 0$, $\varepsilon_0 > 0$, and a neighborhood of zero, $U(0)$,
such that
$$\frac{F_0(u_0 + \varepsilon(h + r)) - F_0(u_0)}{\varepsilon} \le -a$$
holds for all $\varepsilon \in\, ]0, \varepsilon_0[$, $r \in U(0)$. We denote the set of all these $h$ by $K_0$.
$h$ in $X$ is called an admissible direction at $u_0$ with respect to $N_j$ for
$j = 1, \dots, n$ if and only if there exist a number $\varepsilon_0 > 0$ and a neighborhood of
zero, $U(0)$, such that
$$u_0 + \varepsilon(h + r) \in N_j$$
holds for all $\varepsilon \in [0, \varepsilon_0[$, $r \in U(0)$. We denote the set of all these $h$ by $K_j$ [see
Fig. 48.3(a)].
$h$ in $X$ is called a one-sided tangential direction at $u_0$ with respect to $N_{n+1}$
if and only if there exist a number $t_0 > 0$ and a curve $t \mapsto u(t)$ such that
$$u(t) = u_0 + t(h + s(t)) \in N_{n+1}$$
for all $t \in [0, t_0[$. Here, we must have $s(t) \to 0$ as $t \to +0$. We denote the set
of all these $h$ by $K_{n+1}$ [see Fig. 48.3(b)].
Figure 48.3
It can easily be verified that all the $K_i$'s are cones because if $U(0)$ is a
neighborhood of zero, so is $\lambda U(0)$ with $\lambda > 0$. In order to be able to apply
Theorem 48.A, one must then calculate the direction cone $K_i$ and the dual
cone $K_i^+$ to it. We explain this in Section 48.5.
Proof of Theorem 48.A. (1) Without any difficulties, from Definition 48.8
it follows that all $K_0, \dots, K_n$ are open, due to (H2). We set
$$A \stackrel{\text{def}}{=} \bigcap_{i=0}^{n+1} K_i$$
and show that $A = \emptyset$. Then the assertion follows immediately from Lemma
48.7.
Suppose that $h \in A$. By the construction of $K_i$, there then exist a
neighborhood of zero, $U(0)$, and numbers $\varepsilon_0 > 0$, $a > 0$ such that
$$\frac{F_0(u_0 + \varepsilon(h + r)) - F_0(u_0)}{\varepsilon} \le -a$$
and
$$u_0 + \varepsilon(h + r) \in N_1, \dots, N_n$$
for all $\varepsilon \in\, ]0, \varepsilon_0[$, $r \in U(0)$. Furthermore, for sufficiently small $\varepsilon > 0$, we have
$$u(\varepsilon) = u_0 + \varepsilon(h + s(\varepsilon)) \in N_{n+1}$$
and $s(\varepsilon) \in U(0)$. Therefore,
$$F_0(u(\varepsilon)) - F_0(u_0) \le -\varepsilon a,$$
$$u(\varepsilon) \in N_1, \dots, N_{n+1}.$$
If $V(u_0)$ is a $u_0$-neighborhood, then, for sufficiently small $\varepsilon > 0$, we can
always achieve
$$u(\varepsilon) \in V(u_0).$$
This contradicts the fact that $F_0$ has a bound local minimum at $u_0$.
(2) We consider, say, the case $k = 0$.
If $\bigcap_{i=1}^{n+1} K_i \neq \emptyset$ and $f_0 = 0$, then, from $f_1 + \dots + f_{n+1} = 0$ and Lemma
48.7, we arrive at the contradiction
$$\bigcap_{i=1}^{n+1} K_i = \emptyset.$$
(3) Suppose there exists a $u_1$ such that $F_0(u_1) < F_0(u_0)$ and $u_1 \in N_j$,
$j = 1, \dots, n+1$. Let $u = t u_1 + (1 - t)h$ for $0 < t < 1$. Since $h \in \operatorname{int} N_j$, $j =
1, \dots, n$, $h \in N_{n+1}$, and all $N_i$ are convex, we have $u \in \operatorname{int} N_j$, $j = 1, \dots, n$, and
$u \in N_{n+1}$.
The construction of $K_i$ yields $u - u_0 \in K_i$ for $i \ge 1$. Take into account that
$u_0 + \varepsilon(u - u_0) \in N_{n+1} \cap \operatorname{int} N_j$ for $j = 1, \dots, n$ and $0 < \varepsilon < 1$. Below we show
that $u - u_0 \in K_0$. Then $u - u_0$ belongs to the intersection of all $K_0, \dots, K_{n+1}$.
However, from (11), according to Lemma 48.7, it follows that this
intersection is empty. This is the contradiction sought.
We must still prove that $u - u_0 \in K_0$. Due to the continuity of $F_0$ and
since $F_0(u_1) < F_0(u_0)$, we can choose $t \in\, ]0, 1[$ sufficiently close to 1 (so that $u$ is
close to $u_1$) such that $F_0(u) < F_0(u_0)$.
For $0 < \varepsilon < 1$, we have
$$F_0(u_0 + \varepsilon(u - u_0)) \le \varepsilon F_0(u) + (1 - \varepsilon)F_0(u_0);$$
therefore,
$$\lim_{\varepsilon \to +0} \frac{F_0(u_0 + \varepsilon(u - u_0)) - F_0(u_0)}{\varepsilon} \le F_0(u) - F_0(u_0) < 0,$$
i.e., $u - u_0 \in K_0$. The limiting value on the left-hand side exists because of
the convexity of $F_0$. □
48.4. Application to Minimum Problems with Side
Conditions in the Form of Equalities and
Inequalities
As an application of Theorem 48.A, we study the minimum problem
$$F_0(u) = \min!, \tag{12}$$
$$F_j(u) \le 0, \quad j = 1, \dots, n-1, \qquad u \in N_n,$$
$$F_{n+1}(u) = 0,$$
where n > 2.
Our goal is a necessary and sufficient solvability condition in the form of
a Lagrange multiplier rule:
$$\sum_{j=0}^{n-1} \lambda_j F_j'(u_0)(u - u_0) + \langle y^*, F_{n+1}'(u_0)(u - u_0)\rangle \ge 0 \quad \text{for all } u \in N_n; \tag{13a}$$
$$\lambda_0, \dots, \lambda_{n-1} \ge 0, \quad y^* \in Y^*, \qquad \lambda_j F_j(u_0) = 0, \quad j = 1, \dots, n-1; \tag{13b}$$
$$F_j(u_0) \le 0, \quad j = 1, \dots, n-1, \qquad u_0 \in N_n, \quad F_{n+1}(u_0) = 0. \tag{13c}$$
We designate $\lambda_j$ and $y^*$ as Lagrange multipliers. The variational inequality
(13a) represents the Euler equation or the Euler-Lagrange equation for (12).
(13b) and (13c) yield additional conditions for $\lambda_j$. To be exact, $\lambda_j = 0$ for
$F_j(u_0) < 0$, $j = 1, \dots, n-1$, i.e., $\lambda_j$ is inactive in the case of the strict
inequality $F_j(u_0) < 0$.
Our assumptions read as follows:
(H1) $X$ and $Y$ are real B-spaces.
(H2) $F_0, \dots, F_{n-1}\colon U(u_0) \subseteq X \to \mathbb{R}$ are F-differentiable functionals on an
open neighborhood of $u_0$, $U(u_0)$.
(H3) $N_n$ is a convex set in $X$ with $\operatorname{int} N_n \neq \emptyset$.
(H4) $F_{n+1}\colon U(u_0) \subseteq X \to Y$ is a continuously F-differentiable operator.
(H5) Regularity: The range $R(F_{n+1}'(u_0))$ is closed in $Y$.
Assumption (H5) plays an important role for the necessary solvability
condition. In the following we call $u_0$ a local solution of (12) if and only if
there exists a $u_0$-neighborhood $V$ such that $F_0(u) \ge F_0(u_0)$ for all $u \in V$ that
satisfy the side conditions in (12).
Theorem 48.B (Generalized Kuhn-Tucker Theory).
(1) Necessary condition. If (H1)-(H5) hold and $u_0$ is a local solution of
(12), then there exist real numbers $\lambda_0, \dots, \lambda_{n-1}$ and a functional $y^* \in Y^*$
which are not all simultaneously equal to zero and satisfy (13).
(2) Sufficient condition. If (H1)-(H4) hold, then from (13) it follows that
$u_0$ is a solution of (12) when the following two additional conditions are
satisfied:
(i) $\lambda_0 > 0$;
(ii) $F_0, \dots, F_{n-1}$ and $u \mapsto \langle y^*, F_{n+1}(u)\rangle$ are convex on $X$.
The following additional results with respect to assertion (1) are
important for many applications.
Corollary 48.9 (Special Cases). We consider the situation of Theorem 48.B,
(1).
If the side condition $u \in N_n$ is eliminated, i.e., $N_n = X$, then the equality sign
holds in (13a) for all $u \in X$. Consequently,
$$\lambda_0 F_0'(u_0) + \dots + \lambda_{n-1} F_{n-1}'(u_0) + [F_{n+1}'(u_0)]^* y^* = 0.$$
If the inequalities are eliminated from (12) or the equation is eliminated in
(12), then the corresponding terms in (13) drop out and $\lambda_0, y^*$ or
$\lambda_0, \lambda_1, \dots, \lambda_{n-1}$, respectively, are not simultaneously zero.
Corollary 48.10 (Nondegenerate Case). In Theorem 48.B, (1), we have
$\lambda_0 > 0$ when one of the following two conditions is fulfilled:
(i) $F_0'(u_0) = 0$.
(ii) $F_0'(u_0) \neq 0$ and there is an $h \in X$ such that the Slater condition
$$F_j'(u_0)h < 0, \quad j = 1, \dots, n-1,$$
$$u_0 + h \in \operatorname{int} N_n,$$
$$F_{n+1}'(u_0)h = 0,$$
as well as
$$R(F_{n+1}'(u_0)) = Y,$$
is satisfied.
If the inequalities are eliminated from (12) or the equation is eliminated in
(12), then in (ii) the corresponding conditions on $F_1, \dots, F_{n-1}$ or $F_{n+1}$,
respectively, drop out.
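Finally, we explain the connection with the Lagrange function
$$L(u; \lambda_0, \lambda, y^*) = \sum_{j=0}^{n-1} \lambda_j F_j(u) + \langle y^*, F_{n+1}(u)\rangle.$$
Here, $\lambda = (\lambda_1, \dots, \lambda_{n-1})$. A short calculation similar to that in Section 37.11
shows that for $\lambda_0 \ge 0$, $\lambda \in \mathbb{R}^{n-1}_+$, $y^* \in Y^*$, $u_0 \in N_n$, condition (13) is
equivalent to
$$L_u(u_0; \lambda_0, \lambda, y^*)(u - u_0) \ge 0 \quad \text{for all } u \in N_n, \tag{14}$$
$$L_\lambda(u_0; \lambda_0, \lambda, y^*)(\mu - \lambda) \le 0 \quad \text{for all } \mu \in \mathbb{R}^{n-1}_+,$$
$$L_{y^*}(u_0; \lambda_0, \lambda, y^*) = 0.$$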
From this it is clear that in (13) we are dealing with a local Kuhn-Tucker
condition.
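As a hedged illustration of the multiplier rule (13), consider a toy problem that is not taken from the text: minimize $F_0(u) = u_1^2 + u_2^2$ on $\mathbb{R}^2$ subject to $F_1(u) = 1 - u_1 - u_2 \le 0$, with $N_n = X$ and no equation. By Corollary 48.9, (13a) then reduces to $\lambda_0 F_0'(u_0) + \lambda_1 F_1'(u_0) = 0$. The sketch below checks this together with (13b) and (13c) at the candidate $u_0 = (1/2, 1/2)$ with $\lambda_0 = \lambda_1 = 1$, and compares $F_0(u_0)$ with randomly sampled feasible points.

```python
import numpy as np

# Toy data (assumed for illustration): convex objective, one affine inequality.
def F0(u): return float(u[0] ** 2 + u[1] ** 2)
def F1(u): return float(1.0 - u[0] - u[1])
def dF0(u): return np.array([2.0 * u[0], 2.0 * u[1]])
def dF1(u): return np.array([-1.0, -1.0])

u0 = np.array([0.5, 0.5])
lam0, lam1 = 1.0, 1.0

stationarity = lam0 * dF0(u0) + lam1 * dF1(u0)   # (13a) with N_n = X
complementarity = lam1 * F1(u0)                  # (13b): lam1 * F1(u0) = 0
feasibility = F1(u0) <= 0.0                      # (13c)

# Sufficiency check by sampling: no sampled feasible point beats u0.
rng = np.random.default_rng(0)
samples = rng.uniform(-3.0, 3.0, size=(2000, 2))
feasible = [u for u in samples if F1(u) <= 0.0]
best_sampled = min(F0(u) for u in feasible)

print(stationarity, complementarity, feasibility)
print(F0(u0), best_sampled)
```

Since $\lambda_0 > 0$ and both functionals are convex, Theorem 48.B, (2) applies to this toy problem, which is consistent with the sampled comparison.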
Proof of Theorem 48.B, (2). The proof of the sufficiency of the Lagrange
multiplier rule is, as always, completely elementary. To this end, we set
$$\varphi(t) \stackrel{\text{def}}{=} L(u_0 + t(u - u_0); \lambda_0, \lambda, y^*),$$
where $t \in [0, 1]$ and $u \in N_n$ is fixed.
$\varphi$ is convex. (13a) means that $\varphi'(0) \ge 0$; therefore, $\varphi$ has a minimum at
$t = 0$ relative to $[0, 1]$. Thus,
$$L(u_0; \lambda_0, \lambda, y^*) \le L(u; \lambda_0, \lambda, y^*) \quad \text{for all } u \in N_n.$$
Since $\lambda_0 > 0$, by multiplying $\lambda_0$, $\lambda$, and $y^*$ by a suitable positive number, we can
always assume that $\lambda_0 = 1$. From (13b) and (13c), it follows that
$$F_0(u_0) = L(u_0; \lambda_0, \lambda, y^*).$$
For all $u$ that satisfy the side conditions in (12), we always have
$$L(u; \lambda_0, \lambda, y^*) \le F_0(u)$$
because $\lambda_j \ge 0$; therefore, $F_0(u_0) \le F_0(u)$. □
We give the proof of the necessity of the Lagrange multiplier rule in the
next section.
48.5. Proof of Theorem 48.B
We shall prove Theorem 48.B, (1), with the aid of Theorem 48.A in Section
48.3 and assume that u0 is a local solution of (12) and that (H1)-(H5) hold.
The proof facilitates deeper insight into the mechanics of the Lagrange
multiplier rule.
Step 1: Trivial Special Cases. If $F_0'(u_0) = 0$, then (13) holds with $\lambda_0 = 1$,
$\lambda_j = 0$ for $j = 1, \dots, n-1$ and $y^* = 0$.
If $F_j'(u_0) = 0$, $F_j(u_0) = 0$ for some fixed $j = 1, \dots, n-1$, then (13) holds
with $\lambda_j = 1$, $\lambda_k = 0$ for all $k \neq j$ and $y^* = 0$.
Let $A \stackrel{\text{def}}{=} F_{n+1}'(u_0)$ and suppose $R(A) \neq Y$. According to the closed range
theorem A$_1$(39), $R(A) = N(A^*)^{\perp}$ because of (H5). Hence, there exists a
$y^* \in N(A^*)$ such that $y^* \neq 0$; therefore,
$$\langle y^*, F_{n+1}'(u_0)h\rangle = \langle A^* y^*, h\rangle = 0 \quad \text{for all } h \in X.$$
Consequently, (13) holds for $\lambda_0 = \dots = \lambda_{n-1} = 0$.
Step 2: Calculation of $K_i$ and $K_i^+$ when the Above Special Cases Do Not
Occur. Parallel to Section 48.3, we set
$$N_j \stackrel{\text{def}}{=} \{u \in X : F_j(u) \le 0\}, \quad j = 1, \dots, n-1,$$
$$N_{n+1} \stackrel{\text{def}}{=} \{u \in X : F_{n+1}(u) = 0\},$$
and investigate the direction cones $K_i$ which we introduced in Definition
48.8. At this point the reader is advised to study this definition again.
Lemma 48.11. For $F_0'(u_0) \neq 0$, we have
$$K_0 = \{h \in X : F_0'(u_0)h < 0\},$$
$$K_0^+ = \{-\lambda_0 F_0'(u_0) : \lambda_0 \ge 0\}.$$
Proof. By Definition 48.8, if $h \in K_0$, then $F_0'(u_0)h < 0$. (Use $\varepsilon \to 0$.)
Conversely, from $F_0'(u_0)h < 0$ and the F-differentiability of $F_0$ at $u_0$, it follows
that
$$F_0(u_0 + k) = F_0(u_0) + F_0'(u_0)k + o(\|k\|) \quad \text{as } k \to 0.$$
For $k = \varepsilon(h + r)$, it easily follows that $h$ is a regular descent direction, i.e.,
$h \in K_0$.
The formula for $K_0^+$ follows from Example 48.5. □
Lemma 48.12. If $F_j(u_0) < 0$ for a fixed $j = 1, \dots, n-1$, then $K_j = X$;
therefore, $K_j^+ = \{0\}$, i.e., $K_j^+ = \{-\lambda_j F_j'(u_0) : \lambda_j = 0\}$.
Proof. Take into account the continuity of $F_j$ at $u_0$. □
Lemma 48.13. If $F_j'(u_0) \neq 0$ and $F_j(u_0) = 0$ for a fixed $j = 1, \dots, n-1$, then
$$K_j = \{h \in X : F_j'(u_0)h < 0\},$$
$$K_j^+ = \{-\lambda_j F_j'(u_0) : \lambda_j \ge 0\}.$$
Proof. From $h \in K_j$ it follows that
$$F_j(u_0 + \varepsilon(h + r)) - F_j(u_0) \le 0$$
for all $\varepsilon \in\, ]0, \varepsilon_0[$, $r \in U(0)$; therefore,
$$F_j'(u_0)(h + r) \le 0,$$
i.e., $F_j'(u_0)h < 0$.
Conversely, from $F_j'(u_0)h < 0$ it follows that $h \in K_j$, analogous to the
proof of Lemma 48.11. □
Lemma 48.14. The following hold:
$$K_n = \{h \in X : h = a(u - u_0),\ u \in \operatorname{int} N_n,\ a > 0\},$$
$$K_n^+ = \{f \in X^* : f(u - u_0) \ge 0 \text{ for all } u \in N_n\}.$$
Proof. These formulas follow directly from Definition 48.8 and the
definition of the dual cone. □
Lemma 48.15. For $R(F_{n+1}'(u_0)) = Y$, we have
$$K_{n+1} = \{h \in X : F_{n+1}'(u_0)h = 0\},$$
$$K_{n+1}^+ = [F_{n+1}'(u_0)]^*(Y^*),$$
i.e., $K_{n+1}^+$ is equal to the set of all $f \in X^*$ that can be represented as
$$f(h) = -\langle y^*, F_{n+1}'(u_0)h\rangle$$
for all $h \in X$ and fixed $y^* \in Y^*$.
Proof. Let $A \stackrel{\text{def}}{=} F_{n+1}'(u_0)$. If $h$ is a one-sided tangential direction, then from
Definition 48.8 it directly follows that $Ah = 0$; but, according to Theorem
43.C in Section 43.6, each such $h$ is a tangential vector; therefore, $K_{n+1} =
N(A)$. In this connection, one must observe that by Problem 43.2 we can
forego having $N(F_{n+1}'(u_0))$ split the space $X$.
For $K_{n+1}^+$, by the closed range theorem A$_1$(39) and because $R(A^*) =
{}^{\perp}N(A)$, it follows that
$$f \in K_{n+1}^+ \iff f(h) = 0 \text{ on } N(A) \iff f = A^*(-y^*) \text{ for some } y^* \in Y^*.$$
□
One now easily convinces oneself that under the assumptions made in the
lemmas, $K_0, \dots, K_{n+1}$ are convex and $K_0, \dots, K_n$ are open. Furthermore,
$K_0 \neq \emptyset$.
Step 3: Proof of Theorem 48.B, (1). If none of the special cases considered in
Step 1 is present, then from the assertions of Step 2 and by Theorem 48.A in
Section 48.3, it follows that there exist $f_i \in K_i^+$, $i = 0, \dots, n+1$, which are
not all simultaneously equal to zero, such that
$$-f_0 - f_1 - \dots - f_{n-1} - f_{n+1} = f_n.$$
But this is (13a).
We now prove (13b). If $F_j(u_0) < 0$ for a fixed $j = 1, \dots, n-1$, then because
$f_j \in K_j^+$, we have $\lambda_j = 0$ by Lemma 48.12.
Corollary 48.9 is obtained analogously. In order to show Corollary 48.10
(that is, that $\lambda_0 > 0$ and thus that $f_0 \neq 0$), according to Theorem 48.A,
(2), we have to verify the condition
$$\bigcap_{i=1}^{n+1} K_i \neq \emptyset.$$
However, the element $h$ in Corollary 48.10 belongs to this intersection.
This completes the proof of Theorem 48.B, (1) in Section 48.4.
48.6. Application to Control Problems (Pontrjagin's
Maximum Principle)
As an important application of Theorem 48.B in Section 48.4, we consider
the following control problem (P):
(a) Control functional:
$$\int_{t_1}^{t_2} f(y(t), w(t), t)\,dt = \min!$$
(b) Control equations:
$$y_i(t) = a_i + \int_{t_1}^{t} g_i(y(s), w(s), s)\,ds, \quad i = 1, \dots, N.$$
(c) Boundary conditions at the end point $t_2$:
$$h_i(t_2, y(t_2)) = 0, \quad i = 1, \dots, N.$$
(d) Control restriction:
$$w(t) \in W \quad \text{for all } t \in [t_1, t_2].$$
Here we admit:
(α) all finite time intervals $[t_1, t_2]$ for fixed initial time $t_1$ and variable
terminal time $t_2 > t_1$;
(β) all paths or states $y(\cdot)$ and controls $w(\cdot)$ such that
$$y_i \in C[t_1, t_2], \qquad w_k \in L_\infty^c(t_1, t_2)$$
for all $i = 1, \dots, N$; $k = 1, \dots, M$.
Here,
$$y(t) = (y_1(t), \dots, y_N(t)) \in \mathbb{R}^N,$$
$$w(t) = (w_1(t), \dots, w_M(t)) \in \mathbb{R}^M.$$
We understand $L_\infty^c(t_1, t_2)$ to be the set of all piecewise continuous real
functions on $[t_1, t_2]$, i.e., these functions are bounded and continuous up to
a finite number of jumps (see Fig. 48.4). As usual, $C[t_1, t_2]$ denotes the set
of all continuous real functions on $[t_1, t_2]$.
Figure 48.4
Comment. We have already considered a special case of (P) in Section
37.21. There, the bang-bang principle shows that it is not meaningful to
restrict oneself to continuous controls. The control equation (b) for all
$t \in [t_1, t_2]$ describes the connection between the control $w$ and the path or
state quantity $y$. Frequently the control equations occur in the form of
differential equations:
$$y_i'(t) = g_i(y(t), w(t), t), \qquad y_i(t_1) = a_i.$$
Then integration yields (b). The boundary conditions for $t_2$ comprise, e.g.,
the following two special cases:
(i) No boundary condition for $y$ at $t_2$, i.e., $h \equiv 0$.
(ii) $y(t_2) = b$ for fixed $b$, i.e., $h = y - b$.
Our natural assumptions read as follows:
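(H1) All $f$, $g_i$, and $h_i$ have continuous first partial derivatives with respect
to all arguments.
(H2) The admissible control region $W$ is given as a subset of $\mathbb{R}^M$.
(H3) The initial values $a_1, \dots, a_N \in \mathbb{R}$ of $y$ are given.
The construction of the Pontrjagin function
$$\mathcal{H}(y, w, p, t, \lambda_0) \stackrel{\text{def}}{=} \sum_{i=1}^{N} p_i g_i(y, w, t) - \lambda_0 f(y, w, t)$$
is crucial for the formulation of necessary solvability conditions.
Furthermore, in preparation, we state the maximum principle
$$\mathcal{H}(y(t), w(t), p(t), t, \lambda_0) = \max_{w^* \in W} \mathcal{H}(y(t), w^*, p(t), t, \lambda_0) \tag{15}$$
as well as the generalized canonical equations
$$p_i' = -\mathcal{H}_{y_i}, \qquad y_i' = \mathcal{H}_{p_i}, \quad i = 1, \dots, N, \tag{16}$$
and the so-called transversality condition at the end point $t_2$:
$$p_i(t_2) = -\sum_{j=1}^{N} \frac{\partial h_j}{\partial y_i}(t_2, y(t_2))\,\alpha_j, \quad i = 1, \dots, N. \tag{17a}$$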
Furthermore, from the control equation (b), it follows that at the initial
point $t_1$:
$$y_i(t_1) = a_i, \quad i = 1, \dots, N. \tag{17b}$$
The following theorem is called the Pontrjagin maximum principle. This
important principle was conjectured by Pontrjagin around 1955. The
rigorous proof of the maximum principle was given by Boltjanskii (1958). A
variant of the maximum principle was already given by Hestenes (1950) in a
technical report which remained in obscurity.
Theorem 48.C. Suppose (H1)-(H3) are satisfied. If $y, w, t_2$ is a solution of
the original problem (P), then there exist real numbers $\lambda_0, \alpha_1, \dots, \alpha_N$, where
$\lambda_0 \ge 0$, that are not all simultaneously zero, and functions $p_1, \dots, p_N$ which are
continuous on $[t_1, t_2]$ such that equations (15) and (16) hold at all points of
continuity $t$ of the optimal control $w$. Moreover, (17) is satisfied.
Corollary 48.16. Furthermore, there exists a continuous function $p_0$ on $[t_1, t_2]$
such that at all points $t$ of continuity of $w$,
$$p_0(t) = \mathcal{H}(y(t), w(t), p(t), t, \lambda_0)$$
holds as well as
$$p_0' = \mathcal{H}_t, \tag{18}$$
with the transversality condition at the end point $t_2$,
$$p_0(t_2) = \sum_{j=1}^{N} \frac{\partial h_j}{\partial t}(t_2, y(t_2))\,\alpha_j. \tag{19}$$
$\lambda_0$ can always be chosen to be 1 or 0. In the case where $h \equiv 0$, i.e., for a free
right boundary, $\lambda_0 = 1$.
We have made use of an abbreviated form in (16) and (18) in order to
formulate the conditions in a suggestive way. Written out in detail, e.g.,
$p_i' = -\mathcal{H}_{y_i}$ reads as follows:
$$p_i'(t) = -\sum_{j=1}^{N} p_j(t)\,\frac{\partial g_j}{\partial y_i}(y(t), w(t), t) + \lambda_0 \frac{\partial f}{\partial y_i}(y(t), w(t), t).$$
In (16), $y_i' = \mathcal{H}_{p_i}$ is nothing other than the control equation $y_i'(t) =
g_i(y(t), w(t), t)$. Furthermore, $\mathcal{H}$ generalizes the Hamilton function $H$. We
shall explain the connection with the classical calculus of variations in
Section 48.8. Then $w = y'$; therefore, $g_i = w_i$ and $W = \mathbb{R}^M$.
The Pontrjagin maximum principle generalizes the classical maximum
principle in Section 37.4 and represents a basic tool for handling variational
and control problems.
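The maximum condition (15) can be tried out on a scalar toy example that is assumed here purely for illustration and does not come from the text: $N = M = 1$, $g(y, w, t) = w$, $f(y, w, t) = w^2/2$, and $W = [-1, 1]$. Then $\mathcal{H} = p\,w - \lambda_0 w^2/2$, and for $\lambda_0 = 1$ the maximizer over $W$ is the projection of $p$ onto $[-1, 1]$. The sketch compares a brute-force grid maximization with this closed form; `pontryagin_H` and `argmax_on_grid` are hypothetical helper names.

```python
import numpy as np

def pontryagin_H(y, w, p, t, lam0):
    """Pontrjagin function for the assumed toy data g = w, f = w^2 / 2."""
    g = w
    f = 0.5 * w ** 2
    return p * g - lam0 * f

def argmax_on_grid(p, lam0, W_grid):
    """Maximize H over a grid of admissible controls (y and t do not enter here)."""
    values = pontryagin_H(0.0, W_grid, p, 0.0, lam0)
    return float(W_grid[np.argmax(values)])

W_grid = np.linspace(-1.0, 1.0, 2001)    # discretization of W = [-1, 1]
for p in (-3.0, -0.4, 0.25, 2.0):
    w_grid = argmax_on_grid(p, 1.0, W_grid)
    w_closed = float(np.clip(p, -1.0, 1.0))
    print(p, w_grid, w_closed)
```

For $|p| > 1$ the maximizing control sits on the boundary of $W$, which is the situation in which the control restriction (d) is active and (15) cannot be replaced by a stationarity equation in $w$.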
Remark 48.17 (Analysis of the Maximum Principle). We consider, say, the
frequently occurring case of fixed end conditions, i.e., $y_i(t_2) = b_i$, $i = 1, \dots, N$;
thus, $h_i = y_i - b_i$. Then, to determine the functions $y, w, p$ and the end time
$t_2$, we have at our disposal the generalized canonical equations
$$p_i' = -\mathcal{H}_{y_i}, \qquad y_i' = \mathcal{H}_{p_i} = g_i, \quad i = 1, \dots, N,$$
the boundary conditions
$$y_i(t_1) = a_i, \qquad y_i(t_2) = b_i, \quad i = 1, \dots, N,$$
the boundary condition
$$\mathcal{H}(y(t_2), w(t_2), p(t_2), t_2, \lambda_0) = 0$$
that follows from (19), as well as the Pontrjagin maximum condition (15).
This maximum condition asserts that the optimal control imparts to the
function $\mathcal{H}$ a maximum in comparison with all other possible controls. If no
control restrictions are present, i.e., $W = \mathbb{R}^M$, then the system of equations
$$\mathcal{H}_{w_k}(y(t), w(t), p(t), t, \lambda_0) = 0, \quad k = 1, \dots, M \tag{15a}$$
results from the maximum condition (15). We thus obtain exactly $2N$
first-order differential equations with $2N + 1$ boundary conditions to
determine the $2N$ functions $y_i, p_i$ and the end time $t_2$. To determine the $M$
functions $w_k$, one also uses the $M$ equations (15a). Thus we get exactly the
number of condition equations that are needed for well-behaved problems
in order to calculate $y$, $p$, $w$, and $t_2$ uniquely. For the numerical treatment,
one can employ shooting methods (cf. Section 48.10). Furthermore,
Theorem 48.C yields $p_i(t_2) = -\alpha_i$, where $(\lambda_0, \alpha_1, \dots, \alpha_N) \neq 0$, $\lambda_0 = 1$ or $\lambda_0 = 0$.
From this we obtain the additional information
$$\lambda_0 = 1 \quad\text{or}\quad \lambda_0 = 0,\ \ \sum_{i=1}^{N} p_i^2(t_2) \neq 0.$$
These assertions can be used together with the remaining conditions to
exclude the degenerate case $\lambda_0 = 0$. We give an example of this in Section
48.8. In physical problems, the following heuristic consideration is very
useful: If one expects that the optimal control depends on the form of the
integral to be minimized, i.e., it depends on $f$, then we must have $\lambda_0 = 1$. In
the opposite case $\lambda_0 = 0$, all condition equations for
determining the optimal control would be independent of $f$, i.e., one would not
even need the information contained in $f$.
We treat an application of these considerations to the problem of the
return of a spaceship to the earth in Section 48.10.
48.7. Proof of the Pontrjagin Maximum Principle
With the aid of Theorem 48.B in Section 48.4, we wish to give a simple
proof of the Pontrjagin maximum principle, which uses only completely
elementary transformations and which makes the mechanism of the
maximum principle very clear. The proof consists of the following main steps:
(i) By a time transformation from $t$ to $\tau$ there results a problem to which
we can apply the Lagrange multiplier rule of Theorem 48.B.
(ii) By Theorem 48.B, we obtain a variational inequality which we simplify
by introducing the auxiliary functions $\varphi$ and $\psi$ by means of an adjoint
problem.
(iii) A time-reversal transformation yields the Pontrjagin maximum
principle. At the same time, $p_0$, $p$ result from $\varphi$, $\psi$.
Here, in contrast to the side condition $w \in W$, in which $W$ neither needs to
be convex nor needs to contain interior points, a side condition $u \in N_1$, in
which $N_1$ is convex and $\operatorname{int} N_1 \neq \emptyset$, is obtained by carrying out the time
transformation. An additional advantage is that the new $\tau$-time lies in the
fixed interval $[0, 1]$, whereas the original $t$-time ranges over the variable
interval $[t_1, t_2]$.
Merely to simplify we set $N = M = 1$ and $y_1 = y$, $g_1 = g$, $a_1 = a$, $p_1 = p$,
and $h_1 = h$. In addition, we agree on the following notation:
$C \stackrel{\text{def}}{=} C[0, 1]$ (continuous functions on $[0, 1]$).
$C_+ \stackrel{\text{def}}{=}$ the set of all nonnegative continuous functions $v \not\equiv 0$ on
$[0, 1]$ whose zeros are concentrated on at most finitely many
intervals of positive length (see Fig. 48.5).
$S \stackrel{\text{def}}{=} \{\tau \in [0, 1] : w \text{ is continuous at } \tau\}$.
$P \stackrel{\text{def}}{=} (\bar z(\tau), w(\tau), \bar t(\tau))$, $\quad \bar P \stackrel{\text{def}}{=} (\bar y(t), \bar w(t), t)$.
$Q \stackrel{\text{def}}{=} (\bar t(1), \bar z(1))$.
Obviously, $C_+$ is a convex subset of $C$ and $\operatorname{int} C_+ \neq \emptyset$, since $v \equiv 1$
belongs to $\operatorname{int} C_+$. We shall introduce $w$, $\bar z$, and $\bar t$ below.
Step 1: Time Transformation. To each $v \in C_+$ we assign a time
transformation, i.e., a transition from $t$ to $\tau$ with
$$t(\tau) = t_1 + \int_0^{\tau} v(s)\,ds \quad \text{for all } \tau \in [0, 1].$$
Figure 48.5
The reverse transformation reads as follows:
$$\tau(t^*) = \min\{\tau \in [0, 1] : t(\tau) = t^*\}$$
(cf. Fig. 48.5). Since $t'(\tau) \ge 0$, on $\tau$-intervals where $v(\tau) > 0$, the
transformation is injective, i.e., 1-1. On $\tau$-intervals where $v(\tau) = 0$, $t(\cdot)$ remains
constant.
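The time transformation of Step 1 can be sketched numerically. The example below assumes $t_1 = 0$ and one particular element of $C_+$, namely $v(\tau) = 16\max(0, |\tau - 1/2| - 1/4)$, which vanishes on $[1/4, 3/4]$ and integrates to 1; both the choice of $v$ and the helper names `time_map` and `reverse_map` are assumptions made for this illustration.

```python
import numpy as np

def v(tau):
    """A nonnegative continuous function that vanishes exactly on [0.25, 0.75]."""
    return 16.0 * np.maximum(0.0, np.abs(tau - 0.5) - 0.25)

def time_map(t1, taus):
    """t(tau) = t1 + integral of v from 0 to tau, via the trapezoidal rule on the grid."""
    vals = v(taus)
    increments = 0.5 * (vals[1:] + vals[:-1]) * np.diff(taus)
    return t1 + np.concatenate([[0.0], np.cumsum(increments)])

def reverse_map(t_star, taus, ts, tol=1e-9):
    """tau(t*) = min{tau : t(tau) = t*}, approximated by the first grid point reaching t*."""
    idx = np.nonzero(ts >= t_star - tol)[0]
    return float(taus[idx[0]])

taus = np.linspace(0.0, 1.0, 1001)
ts = time_map(0.0, taus)

print(ts[-1])                       # terminal time t(1)
print(ts[250], ts[500], ts[750])    # t stays constant where v = 0
print(reverse_map(0.5, taus, ts))   # smallest tau with t(tau) = 0.5
```

On the interval where $v$ vanishes, $t(\cdot)$ is constant, so the $\tau$-time can advance while the original time stands still; the reverse map selects the left end point of that interval.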
Step 2: Transformation of the Problem. Purely formally, from the original
problem (P) in Section 48.6, a transition from $t$ to $\tau$ yields
$$\int_0^1 f(z(\tau), w(\tau), t(\tau))\,v(\tau)\,d\tau = \min!, \tag{P'}$$
$$(z, t, v) \in C \times C \times C_+,$$
$$z(\tau) - a - \int_0^{\tau} g(z(s), w(s), t(s))\,v(s)\,ds = 0,$$
$$t(\tau) - t_1 - \int_0^{\tau} v(s)\,ds = 0,$$
$$h(t(1), z(1)) = 0.$$
We write this problem in the operator form:
$$F_0(u) = \min!, \quad u \in N_1, \tag{P''}$$
$$F_2(u) = 0.$$
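In this connection, we use the following notation:
$$u \stackrel{\text{def}}{=} (z, t, v) \in X,$$
$$X \stackrel{\text{def}}{=} C \times C \times C,$$
$$N_1 \stackrel{\text{def}}{=} C \times C \times C_+,$$
$$Y \stackrel{\text{def}}{=} C \times C \times \mathbb{R}.$$
$F_2$ maps $X$ into $Y$. Now we justify this formal procedure.
Lemma 48.18. Let $\bar y, \bar w, [t_1, \bar t_2]$ be a solution of the original problem (P) in
Section 48.6. If we construct the element $\bar u = (\bar z, \bar t, \bar v)$ for given $\bar v \in C_+$,
$w^* \in W$ by
$$\bar t(\tau) \stackrel{\text{def}}{=} t_1 + \int_0^{\tau} \bar v(s)\,ds, \qquad \bar z(\tau) \stackrel{\text{def}}{=} \bar y(\bar t(\tau))$$
and set
$$w(\tau) \stackrel{\text{def}}{=} \begin{cases} \bar w(\bar t(\tau)) & \text{for } \tau \text{ such that } \bar v(\tau) > 0, \\ w^* & \text{for } \tau \text{ such that } \bar v(\tau) = 0, \end{cases}$$
then $\bar u$ is a solution of (P') provided $\bar v$ is chosen so that $\bar t(1) = \bar t_2$.
Proof. $w$ has only a finite number of points of discontinuity, i.e., $[0, 1] - S$ is
finite. In addition, $\bar t, \bar z \in C$. From (P) it easily follows that $\bar u$ satisfies the side
conditions in (P'). Moreover, after transformation to $t$,
$$F_0(\bar u) = \int_{t_1}^{\bar t_2} f(\bar y(t), \bar w(t), t)\,dt. \tag{20}$$
Now, if an element $u = (z, t, v)$ satisfies the side conditions in (P'), then by
a transformation to $t$ according to
$$t(\tau) = t_1 + \int_0^{\tau} v(s)\,ds,$$
we obtain from $z, w$ two functions
$$t \mapsto y(t), \quad t \mapsto \tilde w(t) \quad \text{on } [t_1, t_2]$$
that satisfy the side conditions in (P) with $\bar t_2$ replaced by $t_2$. In addition,
$$F_0(u) = \int_{t_1}^{t_2} f(y(t), \tilde w(t), t)\,dt.$$
Since $\bar y, \bar w$ is a solution of (P), by (20) we have $F_0(u) \ge F_0(\bar u)$. □
In the following, let $\bar u$, $w$ be fixed. In an essential way we shall use the fact
that $\bar v$, $w^*$ are arbitrary only in the last step.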
Step 3: Necessary Condition for (P$''$).
Lemma 48.19. There exist $\lambda_0 \in \mathbb{R}$, $y^* \in Y^*$ which are not both simultaneously
equal to zero such that $\lambda_0 \ge 0$ and for all $u \in N_1$ the crucial variational
inequality
$$\lambda_0 F_0'(\bar u)(u - \bar u) + \langle y^*, F_2'(\bar u)(u - \bar u) \rangle \ge 0 \tag{21}$$
holds.
This follows from Theorem 48.B in Section 48.4 because, there, the
assumptions (H1)-(H4) are obviously fulfilled. The basic regularity
condition (H5) results from the next lemma.
Lemma 48.20. The range $R(F_2'(\bar u))$ is closed.
We will carry out the proof in such a way that it carries over completely
analogously to the case $N, M \ge 1$.
Proof. The equation
$$F_2'(\bar u)(u - \bar u) = b$$
corresponds to the inhomogeneous equations in (P$'$), where one linearizes
on the left-hand side, i.e.,
$$z - \bar z - \int_0^\tau g_y(\bar P)\, \bar v\, (z - \bar z)\, d\sigma - \int_0^\tau \bigl[g_t(\bar P)\, \bar v\, (t - \bar t) + g(\bar P)(v - \bar v)\bigr]\, d\sigma = b_1, \tag{22a}$$
$$t - \bar t - \int_0^\tau (v - \bar v)\, d\sigma = b_2;$$
$$h_t(\bar Q)(t(1) - \bar t(1)) + h_y(\bar Q)(z(1) - \bar z(1)) = b_3. \tag{22b}$$
Below we denote the left-hand side in (22b) by $b_3(u)$. Here, $u = (z, t, v)$
holds. Of importance is the fact that for each fixed $v \in C$, (22a) represents a
system of Volterra integral equations which by Section 1.9 has exactly one
solution $(z, t) \in C \times C$ that depends continuously on $(b_1, b_2)$, for each
right-hand side $(b_1, b_2) \in C \times C$.
We shall show that $R(F_2'(\bar u))$ consists of exactly all $b$ such that
$$b_1 \in C, \quad b_2 \in C, \quad b_3 = b_3(u(b_1, b_2)) + \gamma, \tag{23}$$
where $\gamma$ ranges over a fixed linear subspace $\mathcal{L}$ in $\mathbb{R}$ and $(b_1, b_2) \mapsto u(b_1, b_2)$
is continuous on $C \times C$. Since $u \mapsto b_3(u)$ is also continuous on $X$, it follows
that because $\mathcal{L}$ is closed, $R(F_2'(\bar u))$ is also closed.
To prove (23), we denote the set of all solutions of (22a) with $b_1 = b_2 = 0$
by $u_h$. All $u_h - \bar u$ form a linear space and thus all $b_3(u_h - \bar u)$ also form a
linear space that we denote by $\mathcal{L}$. Furthermore, for fixed $(b_1, b_2)$, (22a) has
exactly one solution for $v = \bar v$, which we denote by $u(b_1, b_2)$. Each solution $u$
of (22a) now has the form $u = u(b_1, b_2) + u_h$. Thus, (23) holds.
Step 4: The Lagrange Multiplier $y^*$ in (21). Since $y^* \in Y^*$ and $Y = C \times C \times \mathbb{R}$,
we have:
$$y^* = (y_1^*, y_2^*, \alpha) \in C^* \times C^* \times \mathbb{R}.$$
Lemma 48.21. $\lambda_0^2 + \alpha^2 \neq 0$ holds.
Proof. Assume, to the contrary, that $\lambda_0 = \alpha = 0$. For $b_1 = b_2 = 0$ and fixed
$v = v_1$ with $v_1 \equiv 1$, we construct a solution $u_1 = (z, t, v_1)$ of (22a). From (21)
it follows that
$$\langle y^*, F_2'(\bar u)(u_1 - \bar u) \rangle = 0.$$
Since $v_1 \in \operatorname{int} C_+$, $u_1 \in \operatorname{int} N_1$. Therefore, from (21) with $\lambda_0 = 0$ it follows
that
$$\langle y^*, F_2'(\bar u)(u - \bar u) \rangle = 0 \quad \text{for all } u \in X.$$
Since (23) holds and $y^* = (y_1^*, y_2^*, 0)$, we then have $y_1^* = y_2^* = 0$. This means
that $\lambda_0 = \alpha = 0$ and $y^* = 0$, which contradicts Lemma 48.19.
Lemma 48.22. For $h \equiv 0$, $\lambda_0 > 0$. This means we can set $\lambda_0 = 1$ after changing
$y^*, \alpha$.
Proof. If $h \equiv 0$, the last equation with $h$ in (P$'$) drops out. Then $Y = C \times C$
and $y^* = (y_1^*, y_2^*)$. Now one follows a line of reasoning analogous to that in
Lemma 48.21 above.
We now conclude the proof of the maximum principle by specializing $\bar u$
and $u$ in (21).
Step 5: Specializing $u$ in (21). For each $v \in C_+$ and $b_1 = b_2 = 0$, we choose
the unique solution $u$ of (22a). Thus, from (21) we obtain
$$\lambda_0 \int_0^1 \bigl[f_y(\bar P)\, \bar v\, (z - \bar z) + f_t(\bar P)\, \bar v\, (t - \bar t) + f(\bar P)(v - \bar v)\bigr]\, d\tau + \alpha h_t(\bar Q)(t(1) - \bar t(1)) + \alpha h_y(\bar Q)(z(1) - \bar z(1)) \ge 0$$
and (22a) with $b_1 = b_2 = 0$. We can differentiate (22a) with respect to $\tau$ at
the points of continuity of the integrand, thus at $\tau \in S$. This yields
$$z' - \bar z' = g_y(\bar P)\, \bar v\, (z - \bar z) + g_t(\bar P)\, \bar v\, (t - \bar t) + g(\bar P)(v - \bar v) \quad \text{for all } \tau \in S, \tag{24}$$
$$t' - \bar t' = v - \bar v \quad \text{for all } \tau \in [0,1],$$
$$z(0) = \bar z(0), \quad t(0) = \bar t(0) = t_1.$$
We forego explicitly stating the $\tau$-dependence of $z'(\tau)$, $t'(\tau)$, etc.
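Step 6: The Trick of Introducing $\varphi$ and $\psi$ in Order to Eliminate $t$ and $z$. Our
goal is the variational inequality
$$\int_0^1 \bigl[\lambda_0 f(\bar P) - \psi g(\bar P) + \varphi\bigr](v - \bar v)\, d\tau \ge 0 \quad \text{for all } v \in C_+. \tag{25}$$
To this end, we introduce $\varphi$ and $\psi$ by
$$\psi'(\tau) = \bigl(\lambda_0 f_y(\bar P) - g_y(\bar P)\psi(\tau)\bigr)\bar v(\tau), \tag{26}$$
$$\varphi'(\tau) = -\bigl(\lambda_0 f_t(\bar P) - g_t(\bar P)\psi(\tau)\bigr)\bar v(\tau) \quad \text{for all } \tau \in S$$
and
$$\psi(1) = -h_y(\bar Q)\alpha, \quad \varphi(1) = h_t(\bar Q)\alpha.$$
The existence of $\varphi$ and $\psi$ is easily obtained by integrating (26) over $[1, \tau]$,
solving the resulting system of Volterra integral equations on $[0,1]$ by
continuous functions $\varphi, \psi$, and finally differentiating at the points of
continuity of the integrand, i.e., at $\tau$ in $S$.
$\varphi$ and $\psi$ are introduced in such a way that one can apply the product rule
to the relations in Step 5. Namely, for all $\tau \in S$,
$$\lambda_0 \bar v \bigl[f_y(\bar P)(z - \bar z) + f_t(\bar P)(t - \bar t)\bigr] = [(z - \bar z)\psi]' - [(t - \bar t)\varphi]' - \bigl[\psi g(\bar P) - \varphi\bigr](v - \bar v).$$
Taking the inequality in Step 5 into account, integration yields
$$\int_0^1 \bigl[\lambda_0 f(\bar P) - \psi g(\bar P) + \varphi\bigr](v - \bar v)\, d\tau + \Bigl[(z - \bar z)\psi - (t - \bar t)\varphi\Bigr]_{\tau = 0}^{\tau = 1} + \alpha h_t(\bar Q)(t(1) - \bar t(1)) + \alpha h_y(\bar Q)(z(1) - \bar z(1)) \ge 0.$$
Now (25) follows from this.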
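The product-rule identity of Step 6 can be checked by a direct computation. The following sketch uses only the differentiated equations (24) and the defining equations (26); the abbreviations $\Delta z = z - \bar z$, $\Delta t = t - \bar t$, $\Delta v = v - \bar v$ are introduced here for brevity and are not notation of the original text.

```latex
\begin{align*}
[\Delta z\,\psi]' &= \Delta z'\,\psi + \Delta z\,\psi' \\
  &= \bigl[g_y(\bar P)\bar v\,\Delta z + g_t(\bar P)\bar v\,\Delta t + g(\bar P)\Delta v\bigr]\psi
     + \Delta z\,\bigl(\lambda_0 f_y(\bar P) - g_y(\bar P)\psi\bigr)\bar v \\
  &= \lambda_0 f_y(\bar P)\bar v\,\Delta z + g_t(\bar P)\bar v\,\psi\,\Delta t + \psi\,g(\bar P)\,\Delta v, \\[4pt]
[\Delta t\,\varphi]' &= \Delta t'\,\varphi + \Delta t\,\varphi'
   = \varphi\,\Delta v - \lambda_0 f_t(\bar P)\bar v\,\Delta t + g_t(\bar P)\bar v\,\psi\,\Delta t .
\end{align*}
% Subtracting the second line from the first, the g_t-terms cancel:
\[
[\Delta z\,\psi]' - [\Delta t\,\varphi]'
  = \lambda_0 \bar v\,\bigl[f_y(\bar P)\Delta z + f_t(\bar P)\Delta t\bigr]
    + \bigl[\psi\,g(\bar P) - \varphi\bigr]\Delta v .
\]
```

After integration over $[0,1]$, the boundary term vanishes at $\tau = 0$ because $\Delta z(0) = \Delta t(0) = 0$. At $\tau = 1$ it equals $\psi(1)\Delta z(1) - \varphi(1)\Delta t(1)$, which cancels the two $\alpha$-terms precisely because of the terminal conditions imposed on $\psi$ and $\varphi$.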
The introduction of $\varphi$ and $\psi$ is a trick which, under the catchphrase
"introduction of adjoint states," plays an important role in all variants of
the maximum principle (cf. Sections 37.23, 54.4, and 54.7).
Step 7: Simplification of (25). In order to change the integral inequality (25)
into a pointwise relation, we set
$$\Lambda(\tau) \overset{\text{def}}{=} \lambda_0 f(\bar P) - \psi(\tau) g(\bar P) + \varphi(\tau).$$
Since $\bar P = (\bar z(\tau), \bar w(\tau), \bar t(\tau))$, from (25), for all points of continuity of
$\tau \mapsto \bar w(\tau)$, i.e., for $\tau \in S$, it follows that
$$\Lambda(\tau) = 0 \quad \text{for all } \tau \in S \text{ where } \bar v(\tau) > 0,$$
$$\Lambda(\tau) \ge 0 \quad \text{for all } \tau \in S \text{ where } \bar v(\tau) = 0. \tag{27}$$
The opposite assumption easily leads to a contradiction because of the
continuity of $\Lambda$ in $\tau$ for a suitable choice of $v \in C_+$.
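Step 8: Time-Reversal Transformation of $\tau$ to $t$ for Obtaining $p$, $p_0$ from $\psi$, $\varphi$.
We integrate (26) over $[1, \tau]$; therefore,
$$\psi(\tau) = \psi(1) + \int_1^\tau \bigl[\lambda_0 f_y(\bar P) - g_y(\bar P)\psi\bigr]\bar v\, d\sigma,$$
$$\varphi(\tau) = \varphi(1) - \int_1^\tau \bigl[\lambda_0 f_t(\bar P) - g_t(\bar P)\psi\bigr]\bar v\, d\sigma.$$
Changing variables from $\tau$ to $t$ by $t = \bar t(\tau)$ and
$$\bar t(\tau) \overset{\text{def}}{=} t_1 + \int_0^\tau \bar v(\sigma)\, d\sigma$$
yields the functions $p$, $p_0$ from $\psi$, $\varphi$, where
$$p(t) = p(\bar t_2) + \int_{\bar t_2}^t \bigl[\lambda_0 f_y(P) - g_y(P)p\bigr]\, ds,$$
$$p(\bar t_2) = \psi(1) = -h_y(\bar t_2, \bar y(\bar t_2))\alpha$$
and
$$p_0(t) = p_0(\bar t_2) - \int_{\bar t_2}^t \bigl[\lambda_0 f_t(P) - g_t(P)p\bigr]\, ds,$$
$$p_0(\bar t_2) = \varphi(1) = h_t(\bar t_2, \bar y(\bar t_2))\alpha$$
for all $t \in [t_1, \bar t_2]$. Take into account that $\bar Q = (\bar t(1), \bar z(1))$, $P = (\bar y(t), \bar w(t), t)$.
Thus, $p$, $p_0$ are continuous on $[t_1, \bar t_2]$.
Differentiation with respect to $t$ yields the differential equations for $p$ and
$p_0$ given in Theorem 48.C in Section 48.6 and Corollary 48.16.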
Step 9: Interpretation of (27) by Specialization of $\bar v$. We now observe that the
$\bar w$ in Lemma 48.18 depends on $\bar v \in C_+$ and $w^* \in W$.
(I) Relation between $p_0$ and $\mathcal{H}$. We set $\bar v(\tau) \overset{\text{def}}{=} \bar t_2 - t_1$. After a time-reversal
transformation, from $\Lambda(\tau) = 0$ in (27) for all points of continuity of the
optimal control $t \mapsto \bar w(t)$, it follows that
$$\lambda_0 f(P) - p(t) g(P) + p_0(t) = 0.$$
This is identical to
$$p_0(t) = \mathcal{H}(\bar y(t), \bar w(t), p(t), t, \lambda_0) \tag{28}$$
in Corollary 48.16.
(II) Maximum principle. We choose $\bar v$ according to Fig. 48.5. From
$\Lambda(\tau) \ge 0$ in (27), after a time-reversal transformation, it follows that
$$\lambda_0 f(\bar y(t^*), w^*, t^*) - p(t^*) g(\bar y(t^*), w^*, t^*) + p_0(t^*) \ge 0 \tag{29}$$
for all $w^* \in W$, $t^* \in [t_1, \bar t_2]$. Observe that according to Fig. 48.5 one can
obtain each $t^* \in [t_1, \bar t_2]$ by an appropriate choice of $\bar v$. In addition, one must
take into account the construction of $\bar w$ in Lemma 48.18.
However, because of (28), relation (29) is precisely the maximum
principle
$$\mathcal{H}(\bar y(t^*), \bar w(t^*), p(t^*), \lambda_0, t^*) \ge \mathcal{H}(\bar y(t^*), w^*, p(t^*), \lambda_0, t^*)$$
for all $w^* \in W$, $t^* \in [t_1, \bar t_2]$.
This concludes the proof of Theorem 48.C in Section 48.6 and Corollary
48.16.
48.8. The Maximum Principle and Classical Calculus
of Variations
We consider the classical variational problem
$$\int_{t_1}^{t_2} L(u(t), t, u'(t))\, dt = \min!, \qquad u(t_1) = a, \quad u(t_2) = b, \tag{30}$$
where $u = (u_1, \dots, u_M)$ and
u,eC[t1,t2],u'leLg[t1,t2])i-'l,...,M,
i.e., all derivatives $u_i'$ are piecewise continuous. We suppose that the finite
interval $[t_1, t_2]$ and $a, b \in \mathbb{R}^M$ are given and fixed. We set
$$p_i(t) \overset{\text{def}}{=} L_{u_i'}(u(t), t, u'(t)), \quad i = 1, \dots, M, \tag{31}$$
$$H(u, t, u') \overset{\text{def}}{=} \sum_{i=1}^{M} L_{u_i'}(u, t, u')\, u_i' - L(u, t, u').$$
In the classical calculus of variations, one knows the following necessary
solvability conditions for (30):
(a) Euler equation:
$$p_i'(t) = L_{u_i}(u(t), t, u'(t)), \quad i = 1, \dots, M.$$
(b) Legendre condition:
$$\sum_{i,j=1}^{M} L_{u_i' u_j'}(u(t), t, u'(t))\, w_i w_j \ge 0 \quad \text{for all } w \in \mathbb{R}^M.$$
(c) Weierstrass condition:
$$L(u(t), t, w) - L(u(t), t, u'(t)) \ge \sum_{i=1}^{M} p_i(t)\bigl(w_i - u_i'(t)\bigr) \quad \text{for all } w \in \mathbb{R}^M.$$
(d) Weierstrass-Erdmann corner condition:
$$L_{u_i'}(Q_+) = L_{u_i'}(Q_-), \quad H(Q_+) = H(Q_-), \quad i = 1, \dots, M.$$
Here, $t$ is an arbitrary point of discontinuity of $u'$ in $]t_1, t_2[$ and
$Q_\pm \overset{\text{def}}{=} (u(t), t, u'(t \pm 0))$. Thus, (d) contains conditions on the jumps of the
derivative of a solution of (30).
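To see what the corner condition (d) and the Weierstrass condition (c) say in a concrete case, the following sketch evaluates them for the integrand $L(u, t, u') = (u'^2 - 1)^2$ with $M = 1$. This integrand is a standard illustration that is not taken from the present section; it is an assumption chosen because its minimizers may be zigzag curves whose slope jumps between $+1$ and $-1$.

```python
def L(u, t, du):
    """Hypothetical integrand L(u, t, u') = (u'^2 - 1)^2 (illustration only)."""
    return (du ** 2 - 1.0) ** 2


def L_du(u, t, du):
    """Partial derivative of L with respect to u'."""
    return 4.0 * du * (du ** 2 - 1.0)


def H(u, t, du):
    """H = L_{u'} u' - L, as defined after (31)."""
    return L_du(u, t, du) * du - L(u, t, du)


def weierstrass_excess(u, t, du, w):
    """Left side minus right side of the Weierstrass condition (c)."""
    return L(u, t, w) - L(u, t, du) - L_du(u, t, du) * (w - du)


# At a corner the slope jumps from +1 to -1; (u, t) are the same on both sides.
u0, t0 = 0.0, 0.0
corner_ok = (L_du(u0, t0, 1.0) == L_du(u0, t0, -1.0)
             and H(u0, t0, 1.0) == H(u0, t0, -1.0))
excess_ok = all(weierstrass_excess(u0, t0, 1.0, k / 10.0) >= 0.0
                for k in range(-30, 31))
```

Both one-sided slopes give $L_{u'} = 0$ and $H = 0$, so the jump is admissible in the sense of (d), and along a piece with slope $1$ the Weierstrass excess reduces to $(w^2 - 1)^2 \ge 0$.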
We shall prove that all these conditions result from the Pontrjagin
maximum principle. In this way the central position of this maximum
principle for the classical calculus of variations becomes clear.
Theorem 48.D. Suppose $L$ has continuous first partial derivatives with respect
to all arguments and that $u(\cdot)$ is a solution of (30).
Then (a), (b), and (c) hold at all points of continuity $t$ of $u'$. Moreover, (d)
holds.
In addition, in (b) it is naturally assumed that $L$ has continuous second
partial derivatives with respect to all arguments.
Proof. The idea of the proof is to write (30) as a control problem as in
Section 48.6 with the control variable $w = u'$ and $w \in \mathbb{R}^M$. In order to
guarantee a fixed end time $t_2$, we introduce an additional state variable
$y_{M+1} = t$, i.e., we set $y = (u_1, \dots, u_M, t)$. Then (30) reads as follows:
$$\int_{t_1}^{t_2} L(y(t), w(t))\, dt = \min!$$
with the control equations
$$y_i(t) = a_i + \int_{t_1}^{t} w_i(s)\, ds, \quad i = 1, \dots, M,$$
$$y_{M+1}(t) = a_{M+1} + \int_{t_1}^{t} ds,$$
the control constraints
$$w(t) \in \mathbb{R}^M \quad \text{for all } t \in [t_1, t_2],$$
and the boundary conditions
$$y_i(t_2) = b_i, \quad i = 1, \dots, M,$$
$$y_{M+1}(t_2) = b_{M+1}.$$
The last condition fixes $t_2$ because $b_{M+1}$ is prescribed. Furthermore, let
yic^CUnh]' w;eLS(^i.^) for all/,/c.
We have stated the condition $w(t) \in \mathbb{R}^M$ solely to obtain complete
parallelism to Section 48.6. In fact, this requirement places no restriction whatsoever
on $w(\cdot)$.
The Pontrjagin function reads as follows:
$$\mathcal{H}(y, w, p) = \sum_{i=1}^{M} p_i w_i + p_{M+1} - \lambda_0 L(y, w).$$
According to Theorem 48.C in Section 48.6, there exist numbers $\lambda_0 \in \mathbb{R}$,
$\alpha \in \mathbb{R}^{M+1}$ such that $\lambda_0 \ge 0$ which are not all simultaneously zero and
functions $p_1, \dots, p_{M+1}$ which are continuous on $[t_1, t_2]$ such that
$$p_k' = -\mathcal{H}_{y_k}, \quad p_k(t_2) = -\alpha_k, \quad k = 1, \dots, M+1, \tag{32}$$
$$\mathcal{H}(y(t), w(t), p(t)) = \max_{w \in \mathbb{R}^M} \mathcal{H}(y(t), w, p(t)). \tag{33}$$
Furthermore, by Corollary 48.16, there exists a continuous function $p_0$ on
$[t_1, t_2]$ with
$$p_0(t) = \mathcal{H}(y(t), w(t), p(t)), \tag{34}$$
$$p_0' = \mathcal{H}_t, \quad p_0(t_2) = 0. \tag{35}$$
These relations hold for all points of continuity $t$ of $w = u'$.
Since $\mathcal{H}$ does not depend explicitly on $t$, from (35) it follows that $p_0 \equiv 0$. Furthermore, $\lambda_0 > 0$; for, it would follow
from $\lambda_0 = 0$ that $p_k \equiv -\alpha_k$ by (32). Then
$$\mathcal{H} = -\sum_{i=1}^{M} \alpha_i w_i - \alpha_{M+1}.$$
Since $p_0 \equiv 0$ and (33) and (34) hold, the maximum of the linear function
$w \mapsto \mathcal{H}(w)$ is equal to zero on $\mathbb{R}^M$, i.e., $\mathcal{H} \equiv 0$; thus, $\alpha_1 = \dots = \alpha_{M+1} = 0$.
This is in contradiction to the fact that $\lambda_0$ and $\alpha$ are not simultaneously
zero. Therefore, we can assume $\lambda_0 = 1$, perhaps after a change in $\alpha$, $p$.
Now the Weierstrass condition follows directly from the maximum
principle (33). Furthermore, (33) immediately yields
$$\mathcal{H}_{w_i}(y(t), w(t), p(t)) = 0, \tag{36}$$
$$\sum_{i,j=1}^{M} \mathcal{H}_{w_i w_j}(y(t), w(t), p(t))\, w_i w_j \le 0 \quad \text{for all } w \in \mathbb{R}^M. \tag{37}$$
(37) is the Legendre condition. (36) corresponds to $p_i = L_{u_i'}$, hence to (31).
This, together with $p_i' = -\mathcal{H}_{y_i}$, yields the Euler equation.
The Weierstrass-Erdmann corner condition results from the continuity of
$p_0, \dots, p_{M+1}$, together with $p_i = L_{u_i'}$ and $p_0 = \mathcal{H} = H + p_{M+1}$. $\square$
48.9. Modifications of the Maximum Principle
We shall first call the reader's attention to several transformations which
allow one to reduce certain classes of problems to the normal form considered
in Section 48.6.
(a) Fixed end time. In Section 48.6 the end time $t_2$ is variable. If $t_2$ is to be
fixed, then, as in Section 48.8, one introduces $y_j = t$ as a new state variable
with the boundary condition $y_j(t_2) = b_j$ for fixed $b_j$.
(b) Integral side condition. We consider, for instance, the problem
$$\int_{t_1}^{t_2} f(u(t), t, u'(t))\, dt = \min!,$$
$$\int_{t_1}^{t_2} g(u(t), t, u'(t))\, dt = c,$$
$$u(t_1) = a, \quad u(t_2) = b$$
for fixed $a, b, c \in \mathbb{R}$. By introducing a new state variable $y_j$, the integral
side condition can be written as the control equation
$$y_j(t) = a_j + \int_{t_1}^{t} g(u(s), s, w(s))\, ds$$
with $a_j = 0$, the boundary condition $y_j(t_2) = c$, and $w = u'$.
As an exercise we recommend that the reader treat this problem parallel
to Section 48.8 with the aid of the Pontrjagin maximum principle and show
that the classical Lagrange multiplier rule which we formulated in Section
37.4/ results. In this connection, one must take (a) into consideration.
(c) Bolza's problem. If in place of an integral to be minimized there
appears the more general expression
$$F(y(t_2)) + \int_{t_1}^{t_2} f(y(t), w(t), t)\, dt = \min! \tag{38}$$
with the control equations
$$y_i(t) = a_i + \int_{t_1}^{t} g_i(y(s), w(s), s)\, ds, \quad i = 1, \dots, M,$$
then one sets
$$h(t) \overset{\text{def}}{=} F(y(t)) + \int_{t_1}^{t} f(y(s), w(s), s)\, ds.$$
(38) is equivalent to $h(t_2) - h(t_1) = \min!$ for fixed $h(t_1) = F(y(t_1)) = F(a)$, or
$$\int_{t_1}^{t_2} h'(t)\, dt = \min!$$
Furthermore, if we set
$$\tilde f(y, w, t) = \sum_{i=1}^{M} F_{y_i}(y)\, g_i(y, w, t) + f(y, w, t),$$
then $h'(t) = \tilde f(y(t), w(t), t)$ and (38) passes into
$$\int_{t_1}^{t_2} \tilde f(y(t), w(t), t)\, dt = \min! \tag{38a}$$
If $f \equiv 0$ or $F \equiv 0$ in (38), then, by definition, this is a matter of a Mayer
problem or a Lagrange problem, respectively. All these problems and their
natural generalizations can be carried over from one to another by means of
simple substitutions of the kind given above.
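The reduction of Bolza's problem to a pure integral functional can be checked numerically on a small example. The sketch below is an illustration under assumptions that are not in the text: one state variable with $y' = w$, the terminal cost $F(y) = y^2$, the integrand $f = w^2 + t$, the control $w(t) = \cos t$ on $[0, 2]$, and the initial value $a = 0.5$. It compares the Bolza functional with $F(a)$ plus the integral of the modified integrand.

```python
import math


def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3.0


# Hypothetical data (illustration only).
A0, T1, T2 = 0.5, 0.0, 2.0
w = math.cos                                   # control w(t)
y = lambda t: A0 + math.sin(t) - math.sin(T1)  # solves y' = w, y(T1) = A0
F = lambda yv: yv ** 2                         # terminal cost
F_y = lambda yv: 2.0 * yv                      # its derivative
f = lambda yv, wv, t: wv ** 2 + t              # running cost
g = lambda yv, wv, t: wv                       # right-hand side of y' = g


def f_tilde(yv, wv, t):
    """Modified integrand F_y(y) g + f, so that h'(t) = f_tilde along the path."""
    return F_y(yv) * g(yv, wv, t) + f(yv, wv, t)


bolza = F(y(T2)) + simpson(lambda t: f(y(t), w(t), t), T1, T2)
lagrange = simpson(lambda t: f_tilde(y(t), w(t), t), T1, T2)
difference = bolza - (F(A0) + lagrange)        # should vanish up to quadrature error
```

The two functionals differ only by the constant $F(a)$, which does not influence the minimization; this is the content of the passage from (38) to (38a).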
Problems with Phase Restrictions. In Section 48.6 the paths are not subject
to any restrictions, i.e., there are no so-called phase restrictions. Problems
with phase restrictions, i.e., with additional side conditions of the type
$$G_j(y(t), t) \le 0,$$
can be handled with a method of proof that is analogous to Section 48.7. In
this connection, compare Girsanov (1972, M), Lesson 14. Other methods for
the detailed investigation of this class of problems can be found in Ioffe
and Tihomirov (1974, M), Section 5.2 and Neustadt (1976, M). Here, it is
essential that in place of the generalized canonical equations for $p$ in Section
48.6, there appear integral relations, where the integrals are of the
Lebesgue-Stieltjes type, i.e., they contain measures.
48.10. Return of a Spaceship to Earth
In this section we consider an application of the Pontrjagin maximum
principle to space travel problems. Here we deal with the calculation of the
optimal control of an Apollo spacecraft which returns to earth as described
in Stoer and Bulirsch (1978, M). Here, the braking process is to be
controlled so that the heating of the spaceship remains minimal. For this
problem, space engineers set up the following somewhat simplified control
problem:
$$\int_{t_1}^{t_2} 10\, y_1^3 \sqrt{\rho}\; dt = \min!,$$
$$y_i' = g_i(y, w), \quad i = 1, 2, 3 \quad \text{(control equations)},$$
$$y_i(t_1) = a_i, \quad y_i(t_2) = b_i \quad \text{(boundary conditions)},$$
$$w \in \mathbb{R} \quad \text{(no control restrictions)}.$$
In this connection, we use the following notation (see Fig. 48.6):
$y_1$ = tangential velocity;
$y_2 + \pi/2$ = path angle of inclination $\varphi$ with respect to the ray joining the
spaceship and the center of the earth, in arc measure;
$y_3 = h/R$ ($h$ is the distance of the spaceship above the earth's surface, $R$
is the radius of the earth);
$w$ = control parameter (related to the brake system of the spaceship);
$\rho = \rho_0 \exp(-\beta R y_3)$ (atmospheric density by the barometric height
formula).
Figure 48.6
The boundary conditions are chosen so that at time $t_1$ the spaceship enters
the earth's atmosphere. In this connection, the following hold:
$$y_1(t_1) = 10.8 \text{ km/sec}, \quad y_2(t_1) = -0.045\pi, \quad y_3(t_1) = 120 \text{ km}/R.$$
The point $P_1$ in Fig. 48.6 corresponds to this situation. For the desired end
time $t_2$, the following must hold:
$$y_1(t_2) = 8.1 \text{ km/sec}, \quad y_2(t_2) = 0, \quad y_3(t_2) = 75 \text{ km}/R.$$
Here, $y_1(t_1)$ and $y_1(t_2)$ are approximately equal to the second and first
cosmic velocities, respectively (i.e., the minimal velocity required for leaving
the earth and for attaining an orbit, respectively). The end time $t_2$
corresponds to the point $P_2$ in Fig. 48.6. Upon attaining the data $y_i(t_2)$, the entry
maneuver in the earth's atmosphere is finished and the landing maneuver
can begin. Here, the condition $y_2(t_2) = 0$ means that at time $t_2$, the path
runs parallel to the earth's surface, i.e., an orbit has been achieved.
The integral to be minimized describes the heating process of the spaceship,
i.e., to be precise, the convective heat transfer. Here the great influence
of the velocity is expressed by the appearance of $y_1^3$. The quantities $g_i$ have
the following meaning:
$$g_1 = -\frac{F \rho\, y_1^2}{2}\, C_1(w) - \frac{g \sin y_2}{(1 + y_3)^2},$$
$$g_2 = \frac{F \rho\, y_1}{2}\, C_2(w) + \frac{y_1 \cos y_2}{R(1 + y_3)} - \frac{g \cos y_2}{y_1 (1 + y_3)^2},$$
$$g_3 = R^{-1} y_1 \sin y_2,$$
where
$F$ = frontal surface/mass of the spaceship;
$g$ = acceleration due to gravity;
$C_1(w) = 1.174 - 0.9 \cos w$ (aerodynamic coefficient of resistance);
$C_2(w) = 0.6 \sin w$ (aerodynamic coefficient of lift).
In toto, the following numerical values are obtained with the choice of a
suitable system of units (in $10^5$ ft, etc.): $a_1 = 0.36$, $a_2 = -0.045\pi$, $a_3 = 4/R$,
$R = 209$, $b_1 = 0.27$, $b_2 = 0$, $b_3 = 2.5/R$, $F = 53200$, $\rho_0 = 2.704 \times 10^{-3}$, $\beta = 4.26$,
$g = 3.2172 \times 10^{-4}$.
The following simplifying assumptions are made in the derivation of the
differential equations:
($\alpha$) The earth is a ball at rest.
($\beta$) The flight path lies in the plane of a great circle.
($\gamma$) The astronauts' load capacity can be arbitrarily high.
The appearance of the terms with $C_1$, $C_2$, which describe the influence of the
atmosphere on the spaceship that is flying at approximately 11 km/sec, is
crucial. If the $y_i$ are known, then the distance $d$ on the earth's surface (see Fig.
48.6) is given by the differential equation
$$d' = y_1 (1 + y_3)^{-1} \cos y_2.$$
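For numerical experiments, the right-hand sides $g_i$ and the heating integrand can be coded directly. The following is a minimal sketch that assumes the form of $g_1$, $g_2$, $g_3$ and the numerical constants quoted in this section (in units of $10^5$ ft and sec); the function names `rhs` and `heating_rate` are hypothetical and introduced only for illustration.

```python
import math

# Constants as quoted in the text (units: 10^5 ft, sec).
R = 209.0          # radius of the earth
BETA = 4.26        # exponent in the barometric height formula
RHO0 = 2.704e-3    # atmospheric density at the surface
G = 3.2172e-4      # acceleration due to gravity
F = 53200.0        # frontal surface / mass of the spaceship


def density(y3):
    """rho = rho0 * exp(-beta * R * y3)."""
    return RHO0 * math.exp(-BETA * R * y3)


def rhs(y, w):
    """Right-hand sides (g1, g2, g3) of the control equations y_i' = g_i(y, w)."""
    y1, y2, y3 = y
    rho = density(y3)
    c1 = 1.174 - 0.9 * math.cos(w)   # coefficient of resistance
    c2 = 0.6 * math.sin(w)           # coefficient of lift
    g1 = -0.5 * F * rho * y1 ** 2 * c1 - G * math.sin(y2) / (1.0 + y3) ** 2
    g2 = (0.5 * F * rho * y1 * c2
          + y1 * math.cos(y2) / (R * (1.0 + y3))
          - G * math.cos(y2) / (y1 * (1.0 + y3) ** 2))
    g3 = y1 * math.sin(y2) / R
    return (g1, g2, g3)


def heating_rate(y):
    """Integrand 10 * y1^3 * sqrt(rho) of the functional to be minimized."""
    return 10.0 * y[0] ** 3 * math.sqrt(density(y[2]))


# Entry data a_1, a_2, a_3 from the text; the control value 0.0 is arbitrary.
y_entry = (0.36, -0.045 * math.pi, 4.0 / R)
g_entry = rhs(y_entry, 0.0)
```

At the entry point the third component is negative, i.e., the altitude initially decreases, as one expects for a path inclined toward the earth. The division by $y_1$ in $g_2$ is one source of the numerical sensitivity of the problem.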
We can apply the Pontrjagin maximum principle in the form of Remark
48.17 to the present problem. To this end, we first construct the Pontrjagin
function
$$\mathcal{H}(y, w, p) = \sum_{i=1}^{3} p_i\, g_i(y, w) - \lambda_0\, 10\, y_1^3 \sqrt{\rho}.$$
Since we expect that the optimal control depends in an essential way on the
integral expression to be minimized, according to our heuristic
considerations in Remark 48.17, we immediately set $\lambda_0 = 1$ and forego the discussion
that $\lambda_0 = 0$ is impossible. In addition to the differential equations and
boundary conditions for $y_i$, we thus obtain as necessary conditions the
differential equations
$$p_i' = -\mathcal{H}_{y_i}, \quad i = 1, 2, 3$$
with the additional boundary condition
$$\mathcal{H}(y(t_2), w(t_2), p(t_2)) = 0$$
and the maximum principle
$$\mathcal{H}(y(t), w(t), p(t)) = \max_{w \in \mathbb{R}} \mathcal{H}(y(t), w, p(t)).$$
The last condition immediately yields
$$\sin w = -0.6\, a^{-1} p_2, \qquad \cos w = -0.9\, a^{-1} y_1 p_1, \qquad a = \sqrt{(0.6\, p_2)^2 + (0.9\, y_1 p_1)^2}.$$
If we replace $w$ in the equations above by this expression, then, for the
determination of the six functions $y_i$, $p_i$ and the end time $t_2$, we obtain
exactly six first-order differential equations with seven boundary conditions.
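The elimination of the control rests on an elementary fact about expressions of the form $A \cos w + B \sin w$. Assuming that the $w$-dependent part of the Pontrjagin function has this form, with $A$ proportional to $0.9\, y_1 p_1$ and $B$ proportional to $0.6\, p_2$, the stationarity condition only fixes $\tan w$. It admits two branches: $(\cos w, \sin w) = (A, B)/a$ gives the maximum, and the opposite sign gives the minimum. Which branch matches a printed formula depends on the sign convention used for the adjoint variables, so it is worth checking in an implementation. The helper name below is hypothetical.

```python
import math


def extremal_angles(A, B):
    """Return (w_max, w_min) for phi(w) = A*cos(w) + B*sin(w).

    w_max satisfies cos w = A/a, sin w = B/a with a = sqrt(A^2 + B^2);
    w_min satisfies cos w = -A/a, sin w = -B/a.
    """
    w_max = math.atan2(B, A)
    w_min = math.atan2(-B, -A)
    return w_max, w_min


# Arbitrary sample coefficients (illustration only).
A, B = 0.3, -0.7
w_max, w_min = extremal_angles(A, B)
phi = lambda w: A * math.cos(w) + B * math.sin(w)
amplitude = math.hypot(A, B)   # phi(w_max) = amplitude, phi(w_min) = -amplitude
```

Using `atan2` instead of a plain arctangent selects the correct quadrant, which is exactly the information lost when only $\tan w$ is prescribed.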
The numerical solution of this complex nonlinear boundary value
problem can be carried out by using shooting methods whose basic idea was
explained in Problem 5.4. Due to the denominator in $g_i$, which can lead to
singularities, the problem turns out to be very sensitive from the numerical
viewpoint. The physical reason for this is that for the entry maneuver only a
narrow corridor is favorable. If this corridor is missed, then the spaceship
falls or is tossed back into space by the cushion of the earth's atmosphere. As
mathematical investigations of the boundary value problems have shown,
there exist, in fact, differentiable solutions only for a narrow region of
boundary data.
Figure 48.7 shows the qualitative behavior of the solution. If we set $t_1 = 0$,
then for the end time we obtain $t_2 = 224.9$ sec. Therefore, the critical phase
when the spaceship penetrates the earth's atmosphere lasts approximately 4
min. Of interest in the optimal solution is the fact that the spaceship
penetrates the earth's atmosphere rather deeply (from 120 km to 50 km) and
then climbs again to the prescribed altitude of 75 km. On the other hand, the
velocity falls almost monotonically.
Due to the sensitivity of the problem it was of the greatest importance for
the numerical calculation to possess a good initial approximation. Such data
were given to the mathematicians by space engineers on the basis of their
experiences and their practical instinct. Additional details concerning these
calculations can be found in Stoer and Bulirsch (1978, M).
This example shows very clearly how valuable it is for the two
groups, engineers and mathematicians, to contribute their specific
knowledge and experience to the joint solution of practical problems. At
the same time, one sees that with concrete problems, despite the
presence of a general theory, crucial difficulties can still arise from the
specifics of the problem, with which the practitioner must struggle at the
front line.
Figure 48.7 (curves: altitude and speed; horizontal axis: time $t$)
Problems
48.1. Proof of Example 48.5. Solution: Concerning $K_=^+$: Let $g \in K_=^+$, i.e.,
$g(\pm u) \ge 0$ for all $u \in K_=$; therefore, $g(u) = 0$ for all $u \in K_=$. Let $f(u_0) = -1$.
Then $v \overset{\text{def}}{=} u + f(u)u_0 \in K_=$ for all $u \in X$, i.e., $g(v) = 0$; therefore,
$g(u) = -f(u)g(u_0)$ for all $u \in X$. Thus, $g = \lambda f$.
Conversely, from $g = \lambda f$ it immediately follows that $g \in K_=^+$.
Concerning K<: Let g e if+ , i.e., g(u) > 0 for all u e #<. Since #< c
if=Uif<, the proof proceeds as above for Kt, taking u0e.K< and
g(«o) ^ 0 into consideration.
For Kt- the argument is the same as for K%.
48.2. Calculus for dual cones. In the following, let $X$ and $Y$ be real locally convex
spaces, and let $K$, $K_\alpha$, and $K_Y$ be cones with $K, K_\alpha \subseteq X$, $K_Y \subseteq Y$.
Furthermore, $(X, X^*)$ and $(Y, Y^*)$ form dual pairs. Suppose the operator $B\colon X \to Y$
is linear and continuous. Prove:
48.2a. Union. $\bigl(\bigcup_{\alpha \in A} K_\alpha\bigr)^+ = \bigcap_{\alpha \in A} K_\alpha^+$.
48.2b. Intersection. If all $K_\alpha$ are convex, closed, and nonempty, then:
( n kX=™ z k- (39)
Furthermore,
$$\operatorname{int} K \cap L \neq \emptyset \;\Rightarrow\; (K \cap L)^+ = K^+ + L^+ \tag{40}$$
when $L$ is a linear subspace of $X$ and $K$ is convex.
To show this, one uses Proposition 48.6 for (39) and the Krein extension
theorem (Proposition 39.5) for (40).
48.2c. Subspaces. For a linear subspace $L$ of $X$, we have
$$L^+ = \{f \in X^* : f(x) = 0 \text{ on } L\}.$$
48.2d. Generalized diagonal. For
$$C \overset{\text{def}}{=} \{(x, y) \in X \times Y : Bx = y\},$$
we have
$$C^+ = \{(x^*, y^*) \in X^* \times Y^* : x^* = -B^* y^*\}.$$
To show this, use Problem 48.2c.
48.2e. Farkas' lemma. For
$$C \overset{\text{def}}{=} \{x \in X : Bx \in K_Y\} = B^{-1}(K_Y),$$
we have
$$C^+ = B^*(K_Y^+)$$
when $K_Y$ is convex and either one of the following two conditions is
satisfied:
(i) There exists an $x_0 \in X$ such that $Bx_0 \in \operatorname{int} K_Y$ (the Slater condition).
(ii) $X = \mathbb{R}^N$, $Y = \mathbb{R}^M$, $K_Y = \mathbb{R}^M_+$.
Therefore, in case (ii), we have $C^+ = \{B^* y : y \in \mathbb{R}^M_+\}$ because $K_Y^+ = K_Y$.
This relation can also be expressed as follows: The system
$$B^* y = b, \quad y \ge 0$$
has a solution $y$ if and only if $b \in C^+$, i.e., if and only if
$$(b \mid x) \ge 0 \quad \text{for all } x \text{ with } Bx \ge 0.$$
In this form, the Farkas lemma stands at the pinnacle of linear optimization
theory in $\mathbb{R}^N$ (cf. Problem 50.4a for a short proof). This lemma is also often
referred to as the Farkas-Minkowski lemma.
In case (i), apply (40) to the sets
$$\{(x, y) \in X \times Y : y \in K_Y\} \quad \text{and} \quad \{(x, y) \in X \times Y : Bx = y\}.$$
A special case. If $C$ consists of all $x \in \mathbb{R}^N$ with
$$(a_i \mid x) \ge 0 \quad \text{for } i = 1, \dots, k,$$
$$(a_i \mid x) = 0 \quad \text{for } i = k+1, \dots, M$$
for fixed $a_i \in \mathbb{R}^N$, then $C^+$ consists of all $f \in \mathbb{R}^N$ with
$$f = \sum_{i=1}^{M} \lambda_i a_i, \quad \lambda_1, \dots, \lambda_k \ge 0$$
and $\lambda_i \in \mathbb{R}$.
Hint: See Girsanov (1972, M), Lessons 5, 10.
48.3. Existence of optimal controls. In the following, we shall refer to prototypes
for existence assertions. Additional material is found in the references to the
literature in this chapter under the headings "Existence of optimal controls"
and "Generalized solutions." Many existence theorems are contained in
Cesari (1983, M).
48.3a. Classical optimal controls (Filippov's theorem). Parallel to Section 48.6, we
consider the time-optimal control problem with fixed endpoint:
$$t_2 - t_1 = \min!, \tag{41a}$$
$$y'(t) = g(y(t), w(t), t) \quad \text{for almost all } t \in [t_1, t_2], \tag{41b}$$
$$y(t_1) = a, \quad y(t_2) = b,$$
$$w(t) \in W \quad \text{for all } t \in [t_1, t_2].$$
In this connection, let $t_1 \in \mathbb{R}$ and $a, b \in \mathbb{R}^N$ be fixed. Suppose the control
$w(\cdot)$ is measurable and that $y(\cdot)$ is absolutely continuous.
Show: The control problem (41) has a solution when the following hold:
(H1) Regularity. The components of $g\colon \mathbb{R}^N \times \mathbb{R}^M \times \mathbb{R} \to \mathbb{R}^N$ possess
continuous first partial derivatives.
(H2) Compactness of the control region. $W$ is a compact subset of $\mathbb{R}^M$.
(H3) Growth restriction. For all $(y, w, t) \in \mathbb{R}^N \times W \times \mathbb{R}$, we have
$$|g(y, w, t)| \le c(t)|y|,$$
where $c$ is a fixed continuous function.
(H4) Consistency. The problem is well posed, i.e., the side condition (41b)
can be fulfilled for a $t_2 > t_1$ and some $y(\cdot)$, $w(\cdot)$.
(H5) Convexity condition. The set
$$\{g(y, w, t) : w \in W\}$$
is convex for all $(y, t) \in \mathbb{R}^N \times \mathbb{R}$.
Hint: Compare Gamkrelidze (1978, M), page 151. There the assertion
follows from the result of the next problem. Similar existence results which
depend on the existence principle of lower semicontinuity in Chapter 38 can
be found in Fleming and Rishel (1975, M), Chapter III.
No solution need exist without the convexity condition (H5). In this case,
one can use the assertion of the next problem.
48.3b. Generalized controls (relaxed controls). In Problem 42.14a, we have already
pointed out that it is meaningful to consider generalized variational
problems within the framework of a probability interpretation. The generalized
control problem parallel to (41) reads as follows:
$$t_2 - t_1 = \min!, \tag{41a$^*$}$$
$$y'(t) = \int_W g(y(t), w, t)\, d\mu_t(w) \quad \text{for almost all } t \in [t_1, t_2], \tag{41b$^*$}$$
$$\{\mu_t\} \text{ is admissible with respect to } W.$$
In this connection, let $t_1 \in \mathbb{R}$, $a, b \in \mathbb{R}^N$ be fixed. Let $y(\cdot)$ be absolutely
continuous. In the differential equation in (41b$^*$), in contrast to (41b), we
average over all controls $w \in W$ with the aid of a measure $\mu_t$. By the
admissibility of the family of measures $\{\mu_t\}$ we understand the following:
(i) For each $t \in [t_1, t_2]$, $\mu_t$ is a probability measure on $\mathbb{R}^M$ which is
concentrated on $W$; that is, $\mu_t$ is a measure on the smallest $\sigma$-algebra $\mathfrak{A}$ of
$\mathbb{R}^M$ which is generated by the open sets, with $0 \le \mu_t(A) \le 1$ for all $A \in \mathfrak{A}$
and $\mu_t(W) = 1$ as well as $\mu_t(A) = 0$ for $A \cap W = \emptyset$, $A \in \mathfrak{A}$.
(ii) The function $h$ defined by $h(t) = \int_W H(w, t)\, d\mu_t(w)$ is Lebesgue-measurable
for all continuous functions $H\colon W \times [t_1, t_2] \to \mathbb{R}$.
Show: The control problem (41$^*$) possesses a solution when the following
two conditions hold:
(a) The conditions (H1) (regularity), (H2) (compactness of $W$), and (H3)
(growth restriction) from Problem 48.3a are fulfilled.
(b) The problem is consistent, i.e., the side condition (41b$^*$) can be
fulfilled for a $t_2 > t_1$ and some $y(\cdot)$, $\{\mu_t\}$.
Furthermore, one can show that the solution satisfies the differential
equation
$$y'(t) = \sum_{i=0}^{N} \lambda_i(t)\, g(y(t), w_i(t), t),$$
where $\lambda_i(t) \ge 0$ for all $i$ and $\lambda_0(t) + \dots + \lambda_N(t) = 1$. This type of control
by means of a convex averaging is called chattering control.
Hint: Compare Gamkrelidze (1978, M), page 147. The essential
advantage of using measures is that a convexity condition analogous to (H5) is
automatically fulfilled. The proof is based on the existence principle in
Chapter 38 (generalized Weierstrass theorem). In this connection, the choice
of weakly convergent subsequences of measures is exploited (weak* convergence
in $C(W)^*$; cf. Proposition 38.2, (3)).
One can think of Problem 48.3b as a convexification of Problem 48.3a. It
is shown in McShane (1978, S) how one can use generalized control
problems in order to obtain very detailed assertions for three classical
problems of the calculus of variations.
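The idea behind chattering controls can be made tangible with a toy system that is not taken from the text: $y' = w$ with the nonconvex control set $W = \{-1, +1\}$. The relaxed control that puts weight $1/2$ on each value produces $y' = 0$, hence $y \equiv 0$, which no ordinary control achieves. Ordinary controls that switch rapidly between $+1$ and $-1$ approximate this trajectory. The function name and all parameters in the following sketch are assumptions made for illustration.

```python
def max_deviation(n_switch, steps_per_phase=50, T=1.0):
    """Integrate y' = w(t), y(0) = 0, on [0, T] with explicit Euler steps.

    The control alternates between +1 and -1 on n_switch phases of equal
    length. Returns max |y(t)|, i.e., the distance to the relaxed
    trajectory y = 0 generated by the measure (delta_{+1} + delta_{-1}) / 2.
    """
    dt = T / (n_switch * steps_per_phase)
    y, y_max = 0.0, 0.0
    for phase in range(n_switch):
        w = 1.0 if phase % 2 == 0 else -1.0
        for _ in range(steps_per_phase):
            y += w * dt
            y_max = max(y_max, abs(y))
    return y_max


coarse = max_deviation(10)     # few switches: deviation about T / 10
fine = max_deviation(1000)     # many switches: deviation about T / 1000
```

The deviation equals the length of one phase, so it tends to zero as the switching frequency grows. In this sense the relaxed trajectory is a limit of ordinary trajectories, although it is not itself generated by a control with values in $W$.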
48.4. The Ljapunov theorem on vector measures and the bang-bang principle. Let $M$
be a set and let $\mathfrak{A}$ be a $\sigma$-algebra of subsets of $M$, where $M \in \mathfrak{A}$. Suppose
that, with respect to $(M, \mathfrak{A})$ there are given finite measures $\mu_1, \dots, \mu_n$ which
have no atoms, i.e., for each $A$ with $\mu_i(A) > 0$ there exists a subset $B \subset A$
with $0 < \mu_i(B) < \mu_i(A)$. The Ljapunov theorem asserts: The set
$$\{(\mu_1(A), \dots, \mu_n(A)) : A \in \mathfrak{A}\}$$
is compact and convex in $\mathbb{R}^n$.
The proof based on the Krein-Milman theorem can be found in Holmes
(1975, M), page 108 or Aubin (1979, M), page 580. Study this proof. The
significance of the Ljapunov theorem for control theory is that it yields the
key to the bang-bang principle for certain classes of problems. This
principle, a special case of which we have already become acquainted with in
Section 37.21, asserts: If the controls vary in a compact polyhedron in $\mathbb{R}^M$,
then the optimal control can be so chosen that it assumes only values that
correspond to the vertices of the polyhedron, i.e., one always controls with
full power. In this connection, study Holmes (1975, M), page 117 for a
special case, Macki and Strauss (1982, M), and Cesari (1983, M).
Ljapunov's theorem is closely related to the theory of measurable
multivalued mappings. Here, we refer to Ioffe and Tihomirov (1974, M), Chapter
8 and Castaing and Valadier (1977, L).
48.5. Soft moon landings with minimal fuel consumption. This control problem
reads as follows:
$$\int_{t_1}^{t_2} k\, w(t)\, dt = \min!, \tag{42a}$$
$$m h'' = -g m + w, \tag{42b}$$
$$m' = -k w, \tag{42c}$$
$$h(t_1) = h_1, \quad h'(t_1) = v_1, \quad h(t_2) = h'(t_2) = 0, \quad m(t_1) = m_1, \tag{42d}$$
$$0 \le w(t) \le a. \tag{42e}$$
Here, $h(t)$ and $m(t)$ denote the distance above the moon's surface and the
mass of the moon landing ferry at the time $t$, respectively. The boundary
condition (42d) means that at the initial time $t_1$, the landing ferry is at the
altitude $h_1$, has the velocity $v_1$ and the mass $m_1$. At the unknown landing
time $t_2$, the altitude and the velocity are to be equal to zero (for a soft
landing). (42b) is the Newtonian equation of motion with the force of lunar
gravity $-gm$ and the braking power $w$ of the rocket ($g$ = acceleration due
to gravity on the moon). According to (42e), the braking power of the
rocket is bounded above, depending on fuel supply. By (42c), we set the rate
of change in mass proportional to the braking power ($k$ is a constant).
According to (42c), requirement (42a) is equivalent to the situation that the
loss of mass $m(t_1) - m(t_2)$, therefore the consumption of fuel, remains
minimal.
Determine an optimal control $w(t)$ of the moon landing ferry. Moreover,
in the sense of Section 37.22, solve the synthesis problem, i.e., calculate $w$ in
the feedback control form
$$w = W(h, h').$$
In this way, the optimal control $w$ is obtained in terms of the momentary
altitude and velocity. Parallel to Section 37.21, show that the optimal
control corresponds to a bang-bang principle: First, there is no braking at
all, and at a later point in time, which can be calculated, the total braking
power $w = a$ of the rocket is switched on.
Hint: Compare Fleming and Rishel (1975, M), page 28. There this
exercise is treated as a Mayer problem. However, one can also make use of
Section 48.6 directly. The first-order system used there results from the
introduction of $v = h'$.
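The bang-bang structure claimed in this problem (free fall first, then full braking power) can be explored numerically. The sketch below does not prove optimality; it only assumes that structure and searches by bisection for the switching time at which the ferry comes to rest exactly at altitude zero. All parameter values are hypothetical nondimensional numbers, and the function names are introduced here for illustration.

```python
def altitude_at_stop(t_switch, h1=1.0, v1=0.0, m1=1.0,
                     g=1.0, a=2.0, k=0.1, dt=1e-4):
    """Simulate (42b), (42c) with w = 0 up to t_switch and w = a afterwards.

    Returns (altitude at the moment the velocity stops being negative,
    fuel consumed k * a * burn_time).
    """
    # Phase 1 (w = 0): free fall in closed form, mass unchanged.
    h = h1 + v1 * t_switch - 0.5 * g * t_switch ** 2
    v = v1 - g * t_switch
    m = m1
    burn_time = 0.0
    # Phase 2 (w = a): explicit Euler steps until the descent is stopped.
    while v < 0.0:
        h += v * dt
        v += (-g + a / m) * dt
        m -= k * a * dt
        burn_time += dt
    return h, k * a * burn_time


def find_switch_time(lo=0.0, hi=1.4, iters=40):
    """Bisection on the switching time so that the ferry stops at h = 0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        h_stop, _ = altitude_at_stop(mid)
        if h_stop > 0.0:
            lo = mid      # stopped above the surface: switch later
        else:
            hi = mid      # would hit the surface: switch earlier
    return 0.5 * (lo + hi)


t_switch = find_switch_time()
h_final, fuel_used = altitude_at_stop(t_switch)
```

Switching earlier than `t_switch` stops the ferry above the surface, and switching later leads to an impact. This is the numerical counterpart of the switching curve $w = W(h, h')$ asked for in the synthesis problem.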
48.6. Start of a rocket in a homogeneous gravitational field with minimal
consumption of fuel. Parallel to Problem 48.5, we obtain the following control
problem:
$$\int_{t_1}^{t_2} k\, w(t)\, dt = \min!,$$
$$m x'' = -mg + w \cos\varphi, \quad m z'' = w \sin\varphi, \quad m' = -k w,$$
$$x(t_1) = x'(t_1) = z(t_1) = z'(t_1) = 0, \quad m(t_1) = m_1, \quad x(t_2) = x_2, \quad z(t_2) = z_2,$$
$$0 \le w(t) \le a.$$
The control parameters are the thrust of the rocket, $w$, and the climbing
angle, $\varphi$, where $\varphi$ is subject to no restrictions (see Fig. 48.8). The start
commences at time $t_1$. At the unknown terminal time $t_2$, a given point
$P_2(x_2, z_2)$ is to be reached.
Determine an optimal control and the form of the path. Show that the
optimal control $w$ corresponds to a bang-bang principle: Begin with the
maximal thrust $w = a$ and then completely switch off the rocket drive at a
critical point in time which one can calculate, i.e., $w = 0$.
Hint: Compare Frank (1969, M), page 183. Also, study Leitmann
(1981, M).
Figure 48.8
Further applications of control theory. In this connection, study the following
problems:
($\alpha$) Control of a spaceship in interplanetary space with minimal consumption
of fuel (cf. Lee and Markus (1967, M), 7.4).
(/?) Numerous concrete space-travel problems (cf. Control in space (197<i.
P))-
(y) Control of oscillating systems and chemical reactions (cf. Lee ai'd
Markus (1967, M), Chapter 1).
(S) Control of electric motors (cf. Petrov (1977, M), Chapter 6).
(e) Optimal strategies in education (cf. Frank (1969, M), page 113).
(f) Stochastic optimization of the production process of a paper mill (U
Astrom (1970, M), page 188).
We recommend Leitmann (1981, M) and Cesari (1983, M) as an
introduction to applications of control theory. The book by Petrov (1977, Mi
contains a bibliography on engineering-technical applications. There aic
also numerous engineering-technical examples in Csaki (1972, M) (this L<- .i
handbook of over 1000 pages on control systems). Additional application-
can be found in Bryson and Ho (1969, M). Nearly 100 exercises an-
contained in the collection of exercises by Oleinikov (1969, M).
Furthermore, study the duality between the deterministic linear contii>l
problem and the stochastic Kalman-Bucy filter in Astrom (1970, Mi.
Chapter 7, page 238 and Fleming and Rishel (1975, M), page 133.
General Lagrange multiplier rules and other proofs of the Pontrjagin maximum
principle. In order to prove the maximum principle, various abstract multiplier
rules have been developed in the literature. Here, we shall point out
two conceptions which differ from our procedure.
Side conditions in the form of a finite number of equations. We consider the
minimum problem:
$$F_1(w) = \min!, \quad w \in S, \tag{43}$$
$$F_i(w) = 0, \quad i = 2, \ldots, N,$$
where the functionals $F_i: S \to \mathbb{R}$, $i = 1, \ldots, N$, are defined on the set $S$. Let
$F = (F_1, \ldots, F_N)$. Our goal is to find a necessary solvability condition in the
form
$$(\lambda \mid d) \le 0 \quad \text{for all } d \in D(w_0), \tag{44}$$
$$\lambda \in \mathbb{R}^N, \quad \lambda \ne 0, \quad \lambda_1 \le 0.$$
The following decomposition formula is crucial for the construction of the
set $D(w_0)$, which we designate as the cone of variations:
$$F(w(x)) = F(w_0) + \sum_{i=1}^{N} d_i x_i + o(|x|) \quad \text{for all } x \in \Delta_\delta \tag{45}$$
as $x \to 0$, where $x = (x_1, \ldots, x_N)$. In this connection, $\Delta_\delta$ denotes a simplex
in $\mathbb{R}^N$ which is spanned by $0, \delta e_1, \ldots, \delta e_N$ where $\delta > 0$. Here, $e_i$ is the unit
vector in the direction of the $i$th coordinate axis, i.e., $e_1 = (1, 0, \ldots, 0)$, etc.
(see Fig. 48.9). To be exact, we assume that there is an $N$-dimensional
convex cone $D(w_0)$ in $\mathbb{R}^N$ having the following property: If $d_1, \ldots, d_N \in D(w_0)$
are linearly independent, then there exists a $\delta > 0$ and a mapping
$w(\cdot): \Delta_\delta \to S$ such that (45) holds. Moreover, the composite mapping
$F(w(\cdot))$ should be continuous and $w(0) = w_0$ should hold.
Show: If w0 is a solution of the original problem (43), then there exists a
Lagrange multiplier X such that (44) holds.
Hint: The proof uses a separation theorem in RN and the Brouwer
fixed-point theorem. Compare Fleming and Rishel (1975, M), page 46.
There as well the above proposition is the main tool for giving a proof of
the Pontrjagin maximum principle without the use of functional analysis as
an auxiliary means. To explain its basic idea, we consider the control
problem in the Mayer form:
$$F_1(y(t_2)) = \min!,$$
$$y' = f(y, w, t), \quad y(t_1) = a_1, \quad F_2(y(t_2)) = 0,$$
$$w(t) \in W.$$
If we denote by $S$ the set of all control functions $w(\cdot)$ with the property that
the corresponding paths $y(\cdot)$ on $[t_1, t_2]$ satisfy the differential equation and
the initial condition above, then we obtain precisely (43). The crucial point
in the proof is to construct a suitable cone of variations D. By means of
transformations of (44) with the aid of solutions of the adjoint equation,
one then obtains the maximum principle.
Figure 48.9
Figure 48.9 shows the intuitive meaning of the cone of variations $D(w_0)$
for the case $N = 2$: If one prescribes $d_1, d_2$ and denotes by $\Delta_\rho$ the triangle
spanned by $0, \rho d_1, \rho d_2$, where $\rho > 0$, then there exists a "curvilinear
triangle" $w(\Delta_\delta)$ such that its image $F(w(\Delta_\delta))$ is approximated to within the
first order by means of $F(w_0) + \Delta_\rho$ when $\rho$ and $\delta$ are sufficiently small.
Show: If $S = \mathbb{R}^M$ and the F-derivative $F'(w_0)$ exists, then $D(w_0) = \mathbb{R}^N$
holds when the rank of the matrix $F'(w_0)$ equals $N$. Otherwise, no such
$D(w_0)$ exists.
Generalizations of the above multiplier rule with applications to general
control problems can be found in Neustadt (1976, M).
Side conditions in the form of equalities and inequalities with control restrictions.
As a generalization of Section 48.4, we consider the problem:
$$F_0(y, w) = \min!, \tag{46}$$
$$F_j(y, w) \le 0, \quad j = 1, \ldots, n-1,$$
$$F(y, w) = 0, \quad w \in N.$$
The essential difference between this and Section 48.4 is that the set $N$ of
control restrictions is now not subject to any additional conditions. By
definition, the Lagrange function reads as follows:
$$L(y, w, \lambda, z^*) \overset{\text{def}}{=} \sum_{j=0}^{n-1} \lambda_j F_j(y, w) + \langle z^*, F(y, w) \rangle.$$
$y_0, w_0$ is called a local solution of (46) when all $y, w$ which satisfy the side
conditions are admitted and $y$ varies in a fixed neighborhood of $y_0$. Our goal
is to find a necessary solvability condition of the form
$$L_y(y_0, w_0, \lambda, z^*) = 0, \tag{47}$$
$$L(y_0, w_0, \lambda, z^*) = \min_{w \in N} L(y_0, w, \lambda, z^*).$$
Show: If $y_0, w_0$ is a local solution of (46), then there exist Lagrange
multipliers $\lambda \in \mathbb{R}^n$, $z^* \in Z^*$, which are not all zero, such that (47) holds for
$\lambda_0, \ldots, \lambda_{n-1} \ge 0$ when the following assumptions are fulfilled:
(H1) The spaces $Y$ and $Z$ are real B-spaces, and $N$ is a set.
(H2) Regularity. For each fixed $w$, all mappings $F_j: Y \times N \to \mathbb{R}$ and
$F: Y \times N \to Z$ are continuously F-differentiable as functions of $y$ in a
neighborhood of $y_0$.
(H3) Convexity. For $w_1, w_2 \in N$, $\alpha \in [0,1]$, and $y$ in a fixed neighborhood
of $y_0$, there always exists a $w \in N$ such that
$$F_j(y, w) \le \alpha F_j(y, w_1) + (1-\alpha) F_j(y, w_2), \quad j = 0, \ldots, n-1,$$
$$F(y, w) = \alpha F(y, w_1) + (1-\alpha) F(y, w_2).$$
(H4) Range condition. The dimension of the factor space $Z / R(F_y(y_0, w_0))$
is finite.
Hint: Use a separation theorem analogous to the proof of Theorem 47.1'
in Section 47.10. Compare Ioffe and Tihomirov (1974, M), Section 1.4; in
Chapter 5 of that monograph some generalizations of the above proposition
(local convexity) can be found, which lead to a proof of the maximum
principle for control problems with phase restrictions.
48.8c. Quasisolutions and Pontrjagin's maximum principle for nonsmooth problems.
In this chapter our goal was to show the connection between an abstract
multiplier rule, the Kuhn-Tucker theory, variational inequalities, and the
maximum principle. There exists another interesting approach to the
maximum principle based on the theory of quasisolutions in Section 38.8. In this
way it is possible to obtain a simple direct proof of Pontrjagin's maximum
principle under weak regularity hypotheses. Study Clarke (1976a) and
Ekeland (1979, S).
48.9. Tangential cones, nonlinear approximation theory, and the generalized
Kolmogorov criterion. Together with the minimum problem
$$F(u) = \min!, \quad u \in M, \tag{48}$$
we consider the modified minimum problem
$$F(u) = \min!, \quad u \in u_0 + dM(u_0). \tag{49}$$
Let $M$ be a set in the B-space $X$. We denote by $dM(u_0)$ the so-called
tangential cone to $M$ at the point $u_0$ (see Fig. 48.10), i.e., by definition,
$dM(u_0)$ consists of exactly all $h \in X$ for which the following holds: There
exists a sequence $(u_n)$ in $M$ and a sequence of positive real numbers $(t_n)$
such that
$$t_n^{-1}(u_n - u_0) \to h, \quad u_n \to u_0 \quad \text{as } n \to \infty.$$
Show: If $F: X \to \mathbb{R}$ is a convex continuous functional, then each solution
of (48) is also a solution of (49).
Hint: Compare Collatz and Krabs (1973, M), page 57; in that monograph
and in Krabs (1975, M) applications to nonlinear approximation problems
are treated. In particular, derive the following generalized Kolmogorov
criterion:
Let $C(T)$ be the B-space of all real continuous functions on the compact
nonempty set $T$ in $\mathbb{R}^N$ with the usual max norm. Parallel to Problem 39.8a,
we study the approximation problem:
$$\min_{u \in M} \|u - b\| = \alpha. \tag{50}$$
Let the function $b \in C(T)$ and the nonempty subset $M$ of $C(T)$ be given.
Figure 48.10
Then, if $u$ is a solution of (50), we have:
For each $h \in dM(u)$ there exists a $t \in T$ such that $|u(t) - b(t)| = \|u - b\|$
and $(u(t) - b(t))h(t) \ge 0$. (K)
Conversely, if the condition (K) is fulfilled for a $u \in M$, where $M$ is convex,
then $u$ is a solution of (50).
Hint: Compare Collatz and Krabs (1973, M), page 60, and Krabs (1975,
M), page 152.
Motivation for the Pontrjagin maximum principle. In this connection, study
Frank (1969, M), page 131. There the connection between the maximum
principle and the classical Lagrange multiplier rule is motivated.
The structure of linear control problems. We consider the control equation
$$x'(t) = Ax(t) + Bu(t), \quad x(0) = x_0, \tag{51}$$
where $A: \mathbb{R}^n \to \mathbb{R}^n$ and $B: \mathbb{R}^m \to \mathbb{R}^n$ are matrices. Here, $x(t)$ and $u(t)$
denote the state and the control at time $t$, respectively. All components $u_j$ of
the control are assumed to be measurable functions on the time interval
$[0, t_1]$ where $t_1 > 0$ depends on $u$. For the corresponding time-optimal
problem below, (52), we want to point out that the structure of the solutions
is based on simple algebraic properties of $A$ and $B$.
Controllability. Let the state $x_1 = 0$ be the target. Furthermore, let $C$ denote
the so-called controllable set, i.e., the set of all initial points $x_0 \in \mathbb{R}^n$ which
can be steered to the target by (51). The controllability matrix is defined as
$M = \{B, AB, A^2B, \ldots, A^{n-1}B\}$. Show:
(i) $C$ is convex, symmetric and arcwise connected.
(ii) $C$ is open $\iff$ the target $0 \in \operatorname{int} C$ $\iff$ $\operatorname{rank} M = n$.
(iii) $C = \mathbb{R}^n$ if and only if $\operatorname{rank} M = n$ and no eigenvalue of $A$ has positive
real part.
Hint: Compare Macki and Strauss (1982, M), Chapter 2.
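The rank condition in (ii) and (iii) can be checked mechanically. The following is a minimal sketch, not taken from the text: the helper names `matmul`, `rank`, and `controllability_matrix` are hypothetical, and the two test systems (the double integrator $x'' = u$ and a diagonal system with an uncontrolled second component) are assumptions chosen only for illustration. Exact rational arithmetic is used so that the rank is not affected by rounding.

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank via exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c] / rows[r][c]
                rows[i] = [x - f * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def controllability_matrix(A, B):
    """Return [B, AB, ..., A^(n-1) B] as one n x (n*m) matrix."""
    n = len(A)
    blocks, block = [], B
    for _ in range(n):
        blocks.append(block)
        block = matmul(A, block)
    return [sum((blk[i] for blk in blocks), []) for i in range(n)]

# Double integrator x'' = u written as a first-order system (assumed example).
A1, B1 = [[0, 1], [0, 0]], [[0], [1]]
# Diagonal system whose second component is not influenced by the control.
A2, B2 = [[1, 0], [0, 2]], [[1], [0]]

rank1 = rank(controllability_matrix(A1, B1))  # full rank n = 2
rank2 = rank(controllability_matrix(A2, B2))  # rank deficient
```

For the double integrator the rank equals $n = 2$ and both eigenvalues of $A$ are zero, so by (iii) one expects $C = \mathbb{R}^2$; for the second system the rank is 1, so $C$ is not open.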
Time-optimal control and normality. We consider the following time-optimal
control problem:
$$t_1 = \min!, \tag{52}$$
$$x'(t) = Ax(t) + Bu(t), \quad 0 < t < t_1,$$
$$x(0) = x_0, \quad x(t_1) = 0,$$
$$u(t) \in U.$$
Here, $U$ is the closed unit cube in $\mathbb{R}^m$. Suppose that no column $b_j$ of $B$ is
zero. We call (52) normal if and only if the vectors $\{b_j, Ab_j, \ldots, A^{n-1}b_j\}$ are
linearly independent for all $j = 1, \ldots, m$. Suppose that there exists a
successful control steering $x_0$ to $0$ by (51). Show:
(i) There exists at least one bang-bang time-optimal control which is
measurable but not necessarily piecewise constant.
(ii) A time-optimal control $u$ on $[0, t_1]$ satisfies the maximum principle:
there is a constant vector $h \ne 0$ such that
$$(h \mid e^{-tA}Bu(t)) = \sup_{v \in U} (h \mid e^{-tA}Bv) \quad \text{for all } t \in [0, t_1].$$
(iii) If (52) is normal, then the time-optimal control is unique, bang-bang,
and piecewise constant.
(iv) If (52) is normal, then the converse of (ii) is valid: any successful
control which satisfies the maximum principle is in fact time-optimal.
Hint: Compare Macki and Strauss (1982, M), Chapter 3.
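To see what (ii) and (iii) say in the simplest case, the following sketch evaluates the maximum principle for the double integrator $x'' = u$ with $m = 1$. All names are hypothetical, and we assume that the unit cube is $U = [-1, 1]$ and that the vector $h$ is chosen arbitrarily; the sketch does not compute the optimal $h$ for a given $x_0$. Since $A^2 = 0$, one has $e^{-tA} = I - tA$, so the switching function $(h \mid e^{-tA}b)$ is affine in $t$ and the maximizing control changes sign at most once.

```python
def switching_function(h, t):
    """(h | e^{-tA} b) for the double integrator x'' = u.

    Here A = [[0, 1], [0, 0]] and b = (0, 1), so A^2 = 0 and
    e^{-tA} = I - tA, which gives e^{-tA} b = (-t, 1).
    """
    return -h[0] * t + h[1]

def bang_bang_control(h, t):
    """A maximizer of v -> (h | e^{-tA} b) v over the assumed set U = [-1, 1]."""
    s = switching_function(h, t)
    return 1.0 if s > 0 else -1.0

def count_switches(h, t1, steps=1000):
    """Count sign changes of the control on a uniform grid of [0, t1]."""
    values = [bang_bang_control(h, t1 * k / steps) for k in range(steps + 1)]
    return sum(1 for a, b in zip(values, values[1:]) if a != b)

# Normality: b and Ab = (1, 0) are linearly independent, i.e. det [b, Ab] != 0.
b, Ab = (0, 1), (1, 0)
det = b[0] * Ab[1] - b[1] * Ab[0]
is_normal = det != 0

# An arbitrary (assumed) multiplier h; the control switches once, at t = 1.5.
switches = count_switches(h=(1.0, 1.5), t1=3.0)
```

The design choice of sampling on a grid is only for illustration; for this system the switching time can be read off directly from the zero of the affine switching function.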
The results summarized above are prototypes of more general results for
nonlinear and infinite-dimensional control problems. Compare Macki and
Strauss (1982, M), Balakrishnan (1975, M), and Givens and Millman
(1982, S).
References to the Literature
Classical works on the maximum principle: Boltjanskii, Gamkrelidze, and
Pontrjagin (1956); Gamkrelidze (1958) (linear systems); Boltjanskii (1958)
(first proof of the maximum principle); Pontrjagin (1959, S). A variant of
the maximum principle was already given by Hestenes (1950) in a work
which remained in obscurity.
Classical work on the general theory of extremal problems: Dubovickii
and Miljutin (1965).
Introduction: Frank (1969, M); Leitmann (1981, M); Macki and Strauss
(1982, M).
General survey of control theory: Control Theory (1976, P) (this is a
three-volume proceedings of an international seminar in Trieste).
General abstract Lagrange multiplier rule: Girsanov (1972, L,H)
(introductory); Dubovickii and Miljutin (1965), (1971, M); Halkin (1970); Ioffe
and Tihomirov (1974, M); Boltjanskii (1975); Neustadt (1976, M,H); Clarke
(1976), (1976a); Aubin (1979, M) (applications to economics).
General theory of extremal problems: Girsanov (1972, L) (introductory);
Dubovickii and Miljutin (1965, M), (1971, M); Ioffe and Tihomirov (1974,
M).
Applications to approximation theory: Laurent (1972, M); Collatz and
Krabs (1973, M); Krabs (1975, M).
Introduction to deterministic and stochastic control theory: Fleming and
Rishel (1975, M).
Control by means of ordinary differential equations: Cesari (1983, M, B)
(standard work); Pontrjagin (1961, M); Bellman (1961, M), (1967, M);
Hestenes (1966, M); Lee and Markus (1967, M); Hermes and La Salle
(1969, M); Boltjanskii (1971, M); Berkovitz (1974, M); Ioffe and Tihomirov
(1974, M); Fleming and Rishel (1975, M); Neustadt (1976, M); Gabasov
and Kirillova (1976, S,B). Elementary presentations: Frank (1969, M);
Petrov (1977, M); Leitmann (1981, M); Macki and Strauss (1982, M).
Control and stability of engineering-technical systems: Csaki (1972, M,B)
(this is a handbook of over 1000 pages).
Applications of control theory: Lee and Markus (1967, M); Bryson and
Ho (1969, M) (comprehensive presentation); Frank (1969, M); Control in
space (1970, M) (space travel); Petrov (1977, M); IFIP Conference (1978a,
P), (1979, P); Leitmann (1981, M); Cesari (1983, M).
Collection of exercises with solutions: Oleinikov (1969, M).
Linear systems: Macki and Strauss (1982, M) (introductory); Pontrjagin
(1961, M); Lee and Markus (1967, M); Krasovskii (1968, M) (method of
moments); Kalman, Falb and Arbib (1969, M) (general systems theory);
Chen (1970, M); Eveleigh (1972, M); Balakrishnan (1975, M) (functional
analysis methods); Aoki (1976, M) (applications to economics); Russel
(1978, S), (1979, M); Curtain and Pritchard (1978, L) (infinite-dimensional
linear systems); Givens and Millman (1982, S) (applications of global
analysis).
Maximum principle under minimal hypotheses: Clarke (1976a).
Global characterization of optimal solutions: Phú (1984), (1984a)
(applications to the buckling of rods).
Properties of cones in B-spaces: Fuchsteiner and Lusky (1981, M).
Control by means of partial differential equations and integral equations:
Compare the references to the literature in Chapter 54.
Stochastic control theory: Compare the references to the literature in
Section 37.25.
Dynamic optimization: Compare the references to the literature in
Section 37.20.
Discrete maximum principle: Bittner (1968); Boltjanskii (1976, M); Focke
and Klotzler (1978).
Existence of optimal controls: Fleming and Rishel (1975, M)
(introductory); Cesari (1966), (1975), (1983, M, B) (standard work); Olech (1969a),
(1969); Ioffe and Tihomirov (1974, M); Rockafellar (1975); Klotzler (1976);
Morduhovič (1976, S,B); Ahmed and Teo (1981, M).
Generalized solutions in the sense of a stochastic interpretation (relaxed
control): McShane (1978, S) and Gamkrelidze (1978, M) (introductory);
Young (1969, M); Warga (1972, M); Berkovitz (1974, M); Morduhovič
(1976, S,B).
Numerical methods: Dyer and McReynolds (1970, M) (introductory);
Balakrishnan and Neustadt (1964, M); Butkovskii (1965, M); Polak (1971,
M), (1973, S); Moisseev (1975, M); Fedorenko (1978, M) (handbook).
Connection between classical calculus of variations and control theory:
Pontrjagin (1961, M); Hestenes (1966, M); Ioffe and Tihomirov (1974, M);
McShane (1978, S); Leitmann (1981, M); Cesari (1983, M).
Historical survey: McShane (1978, S); Bennett (1979, M).
SADDLE POINTS AND DUALITY
It is true that a mathematician, who is not somewhat of a poet, will never be a
perfect mathematician.
Karl Weierstrass (1815-1897)
The mathematician is perfect only in so far as he is a perfect being, in so far
as he perceives the beauty of truth; only then will his work be thorough,
transparent, comprehensive, pure, clear, attractive, and even elegant. All this
is necessary to resemble Lagrange.
Johann Wolfgang von Goethe (1749-1832)
(Wilhelm Meisters Wanderjahre)
The basic idea of duality theory is that, together with the original problem
$$\inf_u F(u) = \alpha, \tag{$1^\circ$}$$
we consider a dual maximum problem
$$\sup_p G(p) = \beta, \tag{$2^\circ$}$$
where $\beta \le \alpha$. In Section 37.29f we have already given a detailed presentation
of the advantages derived from this approach, and we very strongly
recommend that the reader again peruse Section 37.29f before studying Chapters
49-53. In particular, there we have explained the meaning of $\alpha = \beta$, i.e.,
there are no duality gaps, in contrast to the case where $\beta < \alpha$. In Chapter 61
of Part IV we shall describe the physical meaning of mutually dual
problems on the basis of elasticity theory. Then displacements and stress
correspond to $u$ in (1°) and to $p$ in (2°), respectively. The extremal relation
between the solutions $u$ of (1°) and $p$ of (2°) is nothing other than the
known stress-strain relationship of elasticity theory.
In the following chapters we deal with the construction of dual problems
as well as the corresponding existence propositions, extremal relations, and
error estimates, all as generalizations of Chapter 39.
(i) In Chapter 49 we place at the pinnacle of duality theory the concept of
the Lagrange function L; we construct dual problems with the help of L and
show how existence propositions for mutually dual problems arise directly
from saddle-point propositions for L.
(ii) In the chapters following Chapter 49 we show how one obtains such
Lagrange functions for linear and convex optimization problems as well as
for quasilinear elliptic partial differential equations.
(iii) In Chapter 51 we introduce the concept of a conjugate functional and
explain its connection with the Lagrange function, dual problems, and the
theory of monotone operators. Conjugate functionals generalize the classical
Legendre transformation.
(iv) The concept of a conjugate functional is used in Chapter 52 to prove
the Rockafellar duality theorem on the stability of perturbed problems. In
this connection, the place of differentiability conditions for the classical
action function S of the Hamilton-Jacobi theory is taken by a more general
condition for the subdifferential of an S-functional.
(v) In Section 52.5, using the Bellman differential equation, we develop a
duality theory for nonconvex problems.
(vi) Chapter 53 is devoted to the study of the connection between
conjugate functionals and Orlicz spaces. The point here is that Orlicz spaces
are used to treat differential equations and integral equations having strongly
growing coefficients by functional analysis methods.
The applications deal with:
(α) Linear optimization.
(β) Convex optimization (Kuhn-Tucker theory).
(γ) Quasilinear elliptic partial differential equations.
(δ) Minimal surfaces.
(ε) Hammerstein integral equations.
We have already become acquainted with applications of duality theory to
problems of approximation theory in Chapter 39.
In Problems 50.2 and 50.3, we treat the Uzawa and Arrow-Hurwicz
methods, two approximation methods which are based on duality
theory. In Part IV we explain the significance of variants of this method for
the numerical treatment of the Navier-Stokes differential equations, which
describe the motion of viscous fluids. In Section 51.6 appears the classical
duality between the Ritz and Trefftz methods as a special case of more
general results.
Together with coerciveness conditions (cf. Theorems 49.A and 49.B), a
special role for existence propositions is played by nondegeneracy conditions
which we denote briefly as the Slater condition (SC) (cf. Theorems
50.A, 52.A, 52.B, and 52.C). However, as Section 49.3 shows, there is a close
connection between the two conditions of coerciveness and nondegeneracy.
We attach particular value to ensuring that the connection between
duality theory and the classical Hamilton-Jacobi theory is clear to the
reader. In particular, we point this out in Chapter 51 (conjugate functionals
and the Legendre transformation) as well as in Chapter 52 (the
Hamilton-Jacobi theory, Rockafellar's stability principle, Bellman's
differential equation, and duality for nonconvex problems).
Duality appears in the most varied forms in concrete variational problems
and optimization problems. The approach to duality theory that we have
chosen here should, however, make clear that one can understand all these
different manifestations with the aid of a simple unifying principle that we
will present in Section 49.2. Duality arises in many branches of
mathematics. It is one of the fundamental concepts of mathematics.
CHAPTER 49
General Duality Principle by Means of
Lagrange Functions and Their Saddle
Points
A mathematician, like a painter or poet, is a maker of patterns. If his patterns
are more permanent than theirs, it is because they are made with ideas.
Godfrey Harold Hardy (1877-1947)
In this chapter we set Lagrange functions and a related general duality
principle at the pinnacle of duality theory. We treat important examples of
Lagrange functions in:
(α) Section 49.3 (linear optimization).
(β) Section 50.1 (Kuhn-Tucker theory).
(γ) Section 51.6 (Trefftz duality for linear elliptic partial differential
equations).
(δ) Section 51.7 (quasilinear elliptic partial differential equations).
In Section 51.4 we explain a general method for constructing Lagrange
functions with the aid of conjugate functionals. The general duality
principle in Section 49.2 leads us, in Section 51.4, to the formulation of dual
problems of Fenchel type which we investigate in Section 51.5 within the
framework of the theory of monotone operators and in Section 52.2 as an
application of the Rockafellar stability principle.
In the Problems, at the end of this chapter, we explain a number of
general methods for constructing critical points and saddle points, in
particular.
49.1. Existence of Saddle Points
In Section 43.9 we defined saddle points to be critical points that are neither
local minima nor local maxima. Proceeding from
$$\max_{p \in B} L(\bar u, p) = L(\bar u, \bar p) = \min_{u \in A} L(u, \bar p), \tag{3}$$
we shall now define a saddle point with respect to the product set $A \times B$.
The reader will do well to distinguish these two concepts. For connections
between these two concepts, we refer the reader to Example 49.3.
Definition 49.1. Let $L: A \times B \to \mathbb{R}$ be given; here $A$ and $B$ are nonempty
sets. The point $(\bar u, \bar p)$ is called a saddle point of $L$ with respect to $A \times B$ if and
only if $(\bar u, \bar p) \in A \times B$ and (3) holds.
Example 49.2 (Prototype). Let $L: \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ be given, where $L(u, p) = u^2 - p^2$.
Then $(0,0)$ is a saddle point of $L$ with respect to $\mathbb{R} \times \mathbb{R}$ (see Fig. 49.1).
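The prototype can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes $L(u, p) = u^2 - p^2$, replaces $A = B = \mathbb{R}$ by a hypothetical finite grid on $[-1, 1]$ containing 0, and verifies relation (3) at $(0, 0)$ on that grid.

```python
def L(u, p):
    """Prototype Lagrange function L(u, p) = u^2 - p^2."""
    return u * u - p * p

# Hypothetical discretization of A = B = [-1, 1]; the grid contains 0.
grid = [k / 10 for k in range(-10, 11)]

# alpha = min over u of max over p, beta = max over p of min over u.
alpha = min(max(L(u, p) for p in grid) for u in grid)
beta = max(min(L(u, p) for u in grid) for p in grid)

# Saddle-point inequalities L(0, p) <= L(0, 0) <= L(u, 0) on the grid.
is_saddle = all(L(0, p) <= L(0, 0) <= L(u, 0) for u in grid for p in grid)
```

On this grid both values coincide with $L(0, 0) = 0$, which is the content of (3) for this example.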
Example 49.3. We assume that:
(i) $L: A \times B \subseteq X \times Y \to \mathbb{R}$ is F-differentiable at $(\bar u, \bar p)$.
(ii) $X, Y$ are real B-spaces, and $A$ and $B$ are neighborhoods of $\bar u$ and $\bar p$,
respectively.
Then the following holds: If $(\bar u, \bar p)$ is a saddle point of $L$ with respect to
$A \times B$, then from (3) it immediately follows that
$$L_u(\bar u, \bar p) = L_p(\bar u, \bar p) = 0; \tag{4}$$
therefore, $L'(\bar u, \bar p) = 0$, as well.
In particular, then, $(\bar u, \bar p)$ is a free critical point of $L$, to which, however,
there corresponds a local minimum and therefore no saddle point in the
special case $L(u, p) = \text{constant}$. However, as a rule, $(\bar u, \bar p)$ will be a saddle
point (see Fig. 49.1). The concept of a saddle point, as a critical point of $L$,
is connected with certain differentiability properties, according to Section
43.9. Definition 49.1 above is independent of this and is global in nature in
contrast to the definition of a saddle point in Section 43.9.
In order to prove a focal existence principle for saddle points with respect
to a product set $A \times B$, we formulate the following assumptions:
(H1) $X$ and $Y$ are real reflexive B-spaces and $A \subseteq X$, $B \subseteq Y$. Here, $A$ and $B$
are convex, closed, and nonempty.
Figure 49.1
(H2) The functional $L: A \times B \to \mathbb{R}$ has the following two properties:
(i) $u \mapsto L(u, p)$ is convex and lower semicontinuous on $A$ for all $p \in B$.
(ii) $p \mapsto -L(u, p)$ is convex and lower semicontinuous on $B$ for all $u \in A$.
(H3) $A$ is bounded or there exists a $p_0 \in B$ such that $L(u, p_0) \to +\infty$ as
$\|u\| \to \infty$ on $A$.
(H3*) $B$ is bounded or there exists a $u_0 \in A$ such that $-L(u_0, p) \to +\infty$
as $\|p\| \to \infty$ on $B$.
The assumptions are very symmetric. In passing from $u$ to $p$, one must
replace $L$ by $-L$. Condition (ii) can also be formulated as follows:
$p \mapsto L(u, p)$ is concave and upper semicontinuous on $B$ for all $u \in A$.
Functionals $(u, p) \mapsto L(u, p)$ which are convex with respect to $u$ and
concave with respect to $p$ are also said to be convex-concave. The limiting
value relations in (H3) and (H3*) are weak coerciveness conditions for $L$.
One can depict the following existence assertion by Fig. 49.1.
Theorem 49.A. With the assumptions (H1)-(H3), (H3*), $L$ possesses a
saddle point with respect to $A \times B$.
The significance of this theorem for game theory was already explained in
Chapter 9. The important connection with duality theory appears in the
next section.
Proof. (I) If $A$ and $B$ are bounded, then the assertion follows from
Theorem 9.D in Section 9.6 (John von Neumann's minimax theorem).
(II) In the unbounded case, we use a limiting value argument. To this end,
we set
$$A_n \overset{\text{def}}{=} \{u \in A : \|u\| \le n\}, \quad B_n \overset{\text{def}}{=} \{p \in B : \|p\| \le n\}.$$
$A_n$ and $B_n$ are bounded and nonempty for sufficiently large $n$; for this
reason, according to (I), for these $n$, $L$ has a saddle point $(u_n, p_n)$ with
respect to $A_n \times B_n$; therefore,
$$L(u_n, p) \le L(u_n, p_n) \le L(u, p_n) \quad \text{for all } (u, p) \in A_n \times B_n.$$
If we choose $p = p_0$, $u = u_0$, then from (H3), (H3*), and Proposition 38.12
it follows that the sequences $(u_n)$, $(p_n)$ and $(L(u_n, p_n))$ are bounded.
Consequently, possibly after going over to subsequences,
$$u_n \rightharpoonup \bar u, \quad p_n \rightharpoonup \bar p, \quad L(u_n, p_n) \to \gamma \quad \text{as } n \to \infty$$
and $(\bar u, \bar p) \in A \times B$ by (H1). According to Proposition 38.7, the assumed
convexity and lower semicontinuity assures weak sequential lower semicontinuity;
therefore,
$$L(\bar u, p) \le \liminf_{n \to \infty} L(u_n, p) \le \gamma \le \limsup_{n \to \infty} L(u, p_n) \le L(u, \bar p)$$
for all $(u, p) \in A \times B$. In particular, $L(\bar u, \bar p) = \gamma$. Consequently, (3) holds. □
49.2. Main Theorem of Duality Theory
The point of departure for the formulation of dual problems is the
symmetric pair of formulas:
$$\inf_{u \in A} \Big( \sup_{p \in B} L(u, p) \Big) = \alpha, \quad -\infty \le \alpha \le \infty, \tag{5}$$
$$\sup_{p \in B} \Big( \inf_{u \in A} L(u, p) \Big) = \beta, \quad -\infty \le \beta \le \infty. \tag{5*}$$
$L$ is called a Lagrange function. In order to apply this dualizing procedure
to the minimum problem
$$\inf_{u \in A} F(u) = \alpha,$$
we must assume the existence of a function $L$ such that $F$ can be represented
in the form
$$F(u) = \sup_{p \in B} L(u, p) \quad \text{for all } u \in A.$$
Motivated by (5*), we set
$$G(p) \overset{\text{def}}{=} \inf_{u \in A} L(u, p) \quad \text{for all } p \in B,$$
and obtain the maximum problem
$$\sup_{p \in B} G(p) = \beta$$
as the dual problem. Here, $p$ plays the role of an abstract Lagrange
multiplier.
For a meaningful dualizing process one requires that double dualization
yields the original problem. Now, obviously, (5*) and (5) are equivalent to
(5a*) and (5a), respectively, where
$$\inf_{p \in B} \sup_{u \in A} -L(u, p) = -\beta, \tag{5a*}$$
$$\sup_{u \in A} \inf_{p \in B} -L(u, p) = -\alpha. \tag{5a}$$
In this sense, (5) is in fact the dual problem to (5*) with respect to the
Lagrange function $-L$.
Theorem 49.B. If $L: A \times B \to \mathbb{R}$ is a function on the product of the nonempty
sets $A$ and $B$, then the following hold:
(1) Weak duality assertion. $\beta \le \alpha$ holds.
(2) Strongest duality assertion. The following two statements are equivalent.
(i) $(\bar u, \bar p)$ is a saddle point of $L$ with respect to $A \times B$.
(ii) $\bar u$ is a solution of (5), $\bar p$ is a solution of (5*), and $\alpha = \beta$.
Moreover, the so-called extremal relation then holds for $(\bar u, \bar p)$:
$$\alpha = F(\bar u) = L(\bar u, \bar p) = G(\bar p) = \beta. \tag{6}$$
The existence of a saddle point is guaranteed when (H1)-(H3), (H3*) in
Section 49.1 are fulfilled.
(3) Strong duality assertion.
(i) The original problem (5) has a solution $\bar u$ and $\alpha = \beta$ when (H1), (H2),
and (H3) in Section 49.1 hold and $\alpha < +\infty$.
(ii) The dual problem (5*) has a solution $\bar p$ and $\alpha = \beta$ when (H1), (H2), and
(H3*) in Section 49.1 hold and $\beta > -\infty$.
This theorem yields important information about the behavior of the
solutions of mutually dual problems. In particular, it is possible that only
one of the two problems has a solution. The following two corollaries are an
immediate consequence of assertion (1).
Corollary 49.4 (Error Estimation). For $u \in A$, $p \in B$, we have
$G(p) \le \beta \le \alpha \le F(u)$.
Corollary 49.5 (Sufficient Solvability Criterion). From $F(u) = G(p)$ for fixed
elements $u \in A$, $p \in B$, it follows that $u$ is a solution of (5) and that $p$ is a
solution of (5*) as well as $\alpha = \beta$.
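When $A$ and $B$ are finite sets, $L$ is simply a matrix and the quantities in (5) and (5*) can be computed directly. The following sketch is an illustration only (the function names and the two matrices are assumptions, not taken from the text): rows play the role of $u$, columns the role of $p$. It shows one matrix with a duality gap $\beta < \alpha$ and no saddle point, and one matrix with $\alpha = \beta$ and a saddle point, in agreement with assertions (1) and (2).

```python
def primal_value(L):
    """alpha = min over rows u of max over columns p of L[u][p]."""
    return min(max(row) for row in L)

def dual_value(L):
    """beta = max over columns p of min over rows u of L[u][p]."""
    return max(min(col) for col in zip(*L))

def saddle_points(L):
    """All (u, p) with L[u][q] <= L[u][p] <= L[v][p] for every v, q."""
    return [(u, p)
            for u, row in enumerate(L)
            for p, val in enumerate(row)
            if val == max(row) and val == min(r[p] for r in L)]

with_gap = [[1, -1], [-1, 1]]   # beta = -1 < alpha = 1, no saddle point
no_gap = [[1, 2], [3, 4]]       # alpha = beta = 2, saddle point at (0, 1)

gap_values = (primal_value(with_gap), dual_value(with_gap))
no_gap_values = (primal_value(no_gap), dual_value(no_gap))
```

In the second matrix the entry at row 0, column 1 is the largest in its row and the smallest in its column, which is exactly relation (3); the extremal relation (6) then reads $\alpha = L(\bar u, \bar p) = \beta = 2$.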
We extend the existence assertions in Theorem 49.B to monotone
operators in Theorem 51.B and to the Rockafellar stability principle in Theorem
52.A. In this way we shall obtain additional criteria that are important for
applications.
Proof of Theorem 49.B. We write $\inf_u$ for $\inf_{u \in A}$ and $\sup_p$ for $\sup_{p \in B}$.
We mark inequalities that are obtained directly from the definition of inf
and sup by $\overset{\cdot}{\le}$.
(Ad 1) From $\sup_p \inf_u L(u, p) \overset{\cdot}{\le} \sup_p L(v, p)$ for all $v \in A$, it immediately
follows that $\beta = \sup_p \inf_u L(u, p) \le \inf_v \sup_p L(v, p) = \alpha$.
(Ad 2) (ii) $\Rightarrow$ (i) From (ii) it follows that
$$\beta = G(\bar p) = \inf_u L(u, \bar p) \overset{\cdot}{\le} L(\bar u, \bar p) \overset{\cdot}{\le} \sup_p L(\bar u, p) = F(\bar u) = \alpha = \beta.$$
Thus, the equality sign appears everywhere, i.e., $(\bar u, \bar p)$ is a saddle point of
$L$ with respect to $A \times B$.
(i) $\Rightarrow$ (ii) If $(\bar u, \bar p)$ is a saddle point with respect to $A \times B$, then
$$\sup_p L(\bar u, p) = L(\bar u, \bar p) = \inf_u L(u, \bar p);$$
therefore,
$$\alpha = \inf_u \sup_p L(u, p) \overset{\cdot}{\le} \sup_p L(\bar u, p) = L(\bar u, \bar p) = \inf_u L(u, \bar p) \overset{\cdot}{\le} \sup_p \inf_u L(u, p) = \beta \le \alpha.$$
Thus, the equality sign appears everywhere. This is (ii).
The existence assertion in (2) follows from Theorem 49.A in Section 49.1.
Ad 3(ii). (I) If $A$ is bounded, then (H3) also holds, and the assertion
follows from (2).
(II) Now suppose that $A$ is unbounded. We use the regularized function
$$L_n(u, p) \overset{\text{def}}{=} L(u, p) + n^{-1}\|u\|^2,$$
where $n = 1, 2, \ldots$, and a limit argument as $n \to \infty$ similar to that in the proof
of Theorem 49.A in Section 49.1.
According to Lemma 47.4, for fixed $p_0$ there exists a $(u^*, a) \in X^* \times \mathbb{R}$
such that
$$L(u, p_0) \ge a + \langle u^*, u - u_0 \rangle.$$
For this reason, $L_n(u, p_0) \to +\infty$ as $\|u\| \to \infty$ on $A$, and we can apply
Theorem 49.A to $L_n$. Accordingly, $L_n$ has a saddle point $(u_n, p_n)$ with
respect to $A \times B$, i.e.,
$$L(u_n, p) + n^{-1}\|u_n\|^2 \le L(u_n, p_n) + n^{-1}\|u_n\|^2 \le L(u, p_n) + n^{-1}\|u\|^2 \tag{7}$$
for all $(u, p) \in A \times B$; therefore,
$$\alpha = \inf_u \sup_p L(u, p) \overset{\cdot}{\le} \sup_p L(u_n, p) \le L(u_n, p_n) \le L(u_0, p_n) + n^{-1}\|u_0\|^2. \tag{8}$$
Since $\beta \le \alpha$, we have $-\infty < \alpha$. Condition (H3*) yields the boundedness of
$(p_n)$. Thus, $p_n \rightharpoonup \bar p$ as $n \to \infty$, possibly after passing to a subsequence.
According to (8) and (7),
$$\alpha \le \limsup_{n \to \infty} L(u_n, p_n) \le \limsup_{n \to \infty} L(u, p_n) \le L(u, \bar p);$$
therefore,
$$\beta = \sup_p \inf_u L(u, p) \overset{\cdot}{\ge} \inf_u L(u, \bar p) \ge \alpha.$$
From $\beta \le \alpha$ it follows that $\alpha = \beta$, i.e., $G(\bar p) = \beta$, and $\bar p$ solves (5*).
Ad 3(i). Pass from $L$ to $-L$, think of (5) as the dual problem to (5*), and
apply 3(ii) (cf. (5a)). □
49.3. Application to Linear Optimization Problems
in B-Spaces
In this section we shall be concerned with working out the connection
between the weak coerciveness conditions (H3) and (H3*) in Theorem 49.B
in Section 49.2 and the Slater condition (SC) of linear optimization.
Together with the original problem
$$\inf_u \langle c, u \rangle_X = \alpha, \quad u \in K_X, \quad Du - b \in K_Y, \tag{9}$$
where $K_X \subseteq X$, $K_Y \subseteq Y$, we shall consider
$$\sup_p \langle p, b \rangle_Y = \beta, \quad p \in K_Y^*, \quad c - D^*p \in K_X^*, \tag{9*}$$
where $K_X^* \subseteq X^*$, $K_Y^* \subseteq Y^*$. Moreover, we set
$$L(u, p) \overset{\text{def}}{=} \langle c, u \rangle_X - \langle p, Du - b \rangle_Y \quad \text{for all } (u, p) \in K_X \times K_Y^*.$$
This Lagrange function was constructed in a way corresponding exactly to
our general formal procedure: We adjoin the side condition multiplied by a
Lagrange multiplier $p$ to the original functional $\langle c, u \rangle$; therefore, we get
$-\langle p, Du - b \rangle$. Our assumptions are:
(A1) $X, Y$ are real reflexive B-spaces.
(A2) $K_X$ and $K_Y$ are convex closed nonempty cones in $X$ and $Y$,
respectively.
(A3) $D: X \to Y$ is linear and continuous.
Proposition 49.6. For fixed $c \in X^*$, $b \in Y$, problem (9*) is dual to (9) with
respect to the Lagrange function $L$ on $K_X \times K_Y^*$ when (A1)-(A3) hold. For
this reason, one can apply all the assertions of Theorem 49.B in Section 49.2
with $A = K_X$, $B = K_Y^*$.
We give the simple proof in Problem 49.14. The reader can convince
himself that the transition from (9) to (9*) occurs in a very symmetric way.
In order to strengthen this symmetry, we write $K^*$ in place of $K^+$ for the
dual cone introduced in Section 48.1. However, when $K = X$, one has to
take $K^* = K^+ = \{0\}$ into account. In this special case, note that $K^*$ is not
equal to the dual space $X^*$ when $X \ne \{0\}$.
If one writes (9*) as a minimum problem, by replacing $\langle p, b \rangle$ by
$-\langle p, b \rangle$, then it easily follows that (9) is the dual problem for (9*). In this
connection, we observe that $X^{**} = X$, $Y^{**} = Y$, $D^{**} = D$, and $K_X^{**} = K_X$,
$K_Y^{**} = K_Y$ (Proposition 48.6).
Example 49.7. In the special case
$$X = \mathbb{R}^N, \quad Y = \mathbb{R}^M, \quad K_X = \mathbb{R}^N_+, \quad K_Y = \mathbb{R}^M_+,$$
where $N, M \in \mathbb{N}$, the following hold:
$$X^* = X, \quad Y^* = Y, \quad K_X^* = K_X, \quad K_Y^* = K_Y.$$
$D$ is an $M \times N$ matrix and (9) corresponds to the classical linear
optimization problem (81) in Section 37.10, with $p = \lambda$.
Example 49.8. In the special case
$$X = \mathbb{R}^N, \quad K_X = \mathbb{R}^N_+, \quad \dim Y = \infty,$$
we have $X^* = X$, $K_X^* = K_X$, and the following holds:
(i) In general (9) contains infinitely many side conditions described by
$Du - b \in K_Y$.
(ii) (9*) contains only a finite number of side conditions described by
$c - D^*p \in K_X^*$.
For this reason, in the numerical treatment it is frequently advisable to
solve the dual problem (9*) approximately, instead of the original
problem, and to exploit the connection between (9) and (9*) which we shall
describe in the following two corollaries.
From Proposition 49.6 and the saddle-point characterization in Theorem
49.A, (2), it immediately follows, by a short calculation, from (6) that the
following corollary holds.
Corollary 49.9 (Characterization of the Solution). With the assumptions
(A1)-(A3), the following three assertions are equivalent:
(i) $\bar u$ solves (9), $\bar p$ solves (9*), and $\alpha = \beta$.
(ii) $(\bar u, \bar p)$ is a saddle point of $L$ with respect to $K_X \times K_Y^*$, i.e.,
$$L(\bar u, p) \le L(\bar u, \bar p) \le L(u, \bar p) \quad \text{for all } (u, p) \in K_X \times K_Y^*.$$
(iii) $\bar u$ and $\bar p$ satisfy the side conditions in (9) and (9*), respectively, and we
have:
$$\langle c - D^*\bar p, \bar u\rangle = \langle \bar p, D\bar u - b\rangle = 0. \tag{10}$$
If any one of these conditions is fulfilled, then $L(\bar u, \bar p) = \alpha$.
(10) can also be written in the derivative form:
Lu(u,p) = Lp(u,p) = 0.
This is equivalent to $L'(u, p) = 0$. This, in turn, is equivalent to the fact that
$(u, p)$ is a free critical point of $L$ on $X \times Y^*$.
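The finite-dimensional situation of Example 49.7 can be checked by hand on a very small instance. The following sketch is only an illustration under stated assumptions: the data `D`, `c`, `b` are hypothetical, and the helper `solve_2d_lp` is a toy vertex enumeration for two variables (it is not a general LP solver and presupposes that the optimum is attained at a vertex). It computes $\alpha$ and $\beta$ for (9) and (9*) with $K_X = K_Y = \mathbb{R}^2_+$ and evaluates the two inner products in (10).

```python
from fractions import Fraction
from itertools import combinations


def solve_2d_lp(obj, G, h, maximize=False):
    """Optimize <obj, x> over {x in R^2 : G x >= h} by enumerating vertices.

    Toy method: valid only when the optimum is attained at a vertex.
    """
    best = None
    rows = list(zip(G, h))
    for (g1, h1), (g2, h2) in combinations(rows, 2):
        det = g1[0] * g2[1] - g1[1] * g2[0]
        if det == 0:
            continue  # parallel constraints define no vertex
        # Cramer's rule for the system g1.x = h1, g2.x = h2
        x = (Fraction(h1 * g2[1] - g1[1] * h2, det),
             Fraction(g1[0] * h2 - h1 * g2[0], det))
        if all(g[0] * x[0] + g[1] * x[1] >= hi for g, hi in rows):
            val = obj[0] * x[0] + obj[1] * x[1]
            if best is None or (val > best[0] if maximize else val < best[0]):
                best = (val, x)
    return best


# Hypothetical data for (9): inf <c,u>, u >= 0, Du - b >= 0.
D = [(1, 1), (2, 1)]
c = (1, 3)
b = (4, 5)
identity = [(1, 0), (0, 1)]

alpha, u = solve_2d_lp(c, D + identity, list(b) + [0, 0])

# Dual (9*): sup <p,b>, p >= 0, c - D^T p >= 0, i.e. (-D^T) p >= -c.
DT = list(zip(*D))
neg_DT = [tuple(-a for a in row) for row in DT]
beta, p = solve_2d_lp(b, neg_DT + identity, [-ci for ci in c] + [0, 0],
                      maximize=True)

# The two inner products appearing in the extremal relations (10).
reduced_cost = [c[j] - sum(D[i][j] * p[i] for i in range(2)) for j in range(2)]
slack = [sum(D[i][j] * u[j] for j in range(2)) - b[i] for i in range(2)]
gap_u = sum(r * uj for r, uj in zip(reduced_cost, u))
gap_p = sum(pi * s for pi, s in zip(p, slack))
print(alpha, beta, u, p, gap_u, gap_p)
```

For this instance the point $u_0 = (5, 5)$ satisfies $Du_0 - b = (6, 10)$, which lies in the interior of $\mathbb{R}^2_+$, so the data also fulfil a Slater-type condition.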
In order to formulate the existence assertion that follows from Theorem
49.B, (3) in Section 49.2 in a form that is convenient for applications, we
note the following so-called Slater conditions:
There exists a $u_0 \in K_X$ with $Du_0 - b \in \operatorname{int} K_Y$. (SC)
There exists a $p_0 \in K_Y^*$ with $c - D^*p_0 \in \operatorname{int} K_X^*$. (SC*)
In the special case of finite-dimensional linear optimization in Example
49.7, the side condition $u \in K_X$, $Du - b \in K_Y$ denotes a system of
inequalities. Here, (SC) asserts that in all inequalities containing $D$ the equality sign
never appears for $u_0$.
(SC) implies the coerciveness condition
$$L(u_0, p) \to -\infty \quad \text{as } \|p\| \to \infty \text{ on } K_Y^*.$$
We show this in the proof of Theorem 50.A in Section 50.1 in a more
general context. Now, from Theorem 49.B, (3) in Section 49.2, we obtain the
following result.
Corollary 49.10 (Existence). With the assumptions (A1)-(A3), the following
two assertions hold:
(1) From (SC) it follows that the dual problem (9*) has a solution and $\alpha = \beta$.
(2) From (SC*) it follows that the original problem (9) has a solution and
$\alpha = \beta$.
In Theorem 49.B, (3) it is assumed that $\beta > -\infty$. However, assertion (1)
also holds for $\beta = -\infty$. Then $p = 0$ is a trivial solution of (9*). In this
connection, compare the proof of Theorem 50.A in Section 50.1.
Furthermore, assertion (2) follows from assertion (1) since (9) is the dual
problem to (9*).
What is interesting about Corollary 49.10 is that an assertion concerning
the structure of the side conditions of the original problem allows an
existence assertion for the dual problem, and conversely.
In Section 52.3 we shall consider linear optimization problems in locally
convex spaces as an application of the Rockafellar stability principle.
In Problem 50.4 we treat the existence and duality theorems for linear
optimization problems in $\mathbb{R}^n$. In this connection, Corollaries 49.9 and 49.10
will be essentially sharpened.
Problems
The main goal of the following set of problems is to familiarize the reader with a
number of important methods for proving the existence of free critical points and to
point out applications.
The identification of critical points for indefinite functionals which are bounded
neither below nor above causes special difficulties. However, such problems occur if
one wishes to prove the existence of periodic solutions of systems of differential
equations or of hyperbolic differential equations (cf. Problem 49.1). As the real
function $F(u) = u$ shows, indefinite functionals need not have a critical point. We
recommend that the reader depict the following results using simple examples of
functionals $F: \mathbb{R}^N \to \mathbb{R}$ with $N = 1, 2$.
49.1. Critical points and prototypes of differential equations. Give differential
equations which are necessary conditions for the following problems:
$$\int_G \left[2^{-1}(u_\xi^2 + u_\eta^2) - f(u)\right] d\xi\, d\eta = \text{stationary!}, \tag{11}$$
$$u = 0 \text{ on } \partial G, \quad u \in C^2(\bar G);$$
$$\int_0^T\!\!\int_0^1 \left[2^{-1}(u_t^2 - u_\xi^2) - f(u)\right] d\xi\, dt = \text{stationary!}, \tag{12}$$
$$u(\xi, t) = 0 \text{ for } \xi = 0, 1, \quad u \in C^2([0,1] \times \mathbb{R}), \quad u \text{ is } T\text{-periodic with respect to } t;$$
$$\int_0^T \left[p\,q' - H(p, q)\right] dt = \text{stationary!}, \tag{13}$$
$$p(0) = q(0) = 0, \quad p, q \in C^1(\mathbb{R}), \quad p, q \text{ are } T\text{-periodic}.$$
Solution: Parallel to Section 40.5, we obtain:
$$-\Delta u = f'(u) \quad \text{(elliptic equation)}, \tag{11a}$$
$$u_{\xi\xi} - u_{tt} = f'(u) \quad \text{(hyperbolic equation)}, \tag{12a}$$
$$p' = -H_q, \quad q' = H_p \quad \text{(canonical equation)}. \tag{13a}$$
Thus, in principle, it is possible to solve these types of
equations by determining critical points of the corresponding
functionals. However, in order to obtain problems that fit an abstract theory, we
have to replace spaces of smooth functions by Sobolev spaces, or one
writes each of the equations (11a), (12a), and (13a) in the form
$$Au = F'(u), \quad u \in D(A) \subset V, \tag{14a}$$
where $V$ is a real H-space (Lebesgue space) and $A$ is an unbounded
self-adjoint operator. The corresponding variational problem for (14a)
reads as follows:
$$2^{-1}(Au \mid u) - F(u) = \text{stationary!}, \quad u \in D(A). \tag{14}$$
(12a) and (13a) are always indefinite problems. Then the functionals in
(12), (13), and (14) are bounded neither below nor above.
49.2. G-differentiability and saddle points. Let $L: A \times B \to \mathbb{R}$ be given and have
the following properties:
(i) $A$ and $B$ are convex sets of the real B-spaces $X$ and $Y$, respectively.
$L$ is G-differentiable.
(ii) $u \mapsto L(u, p)$ is convex on $A$ for all $p \in B$.
(iii) $p \mapsto -L(u, p)$ is convex on $B$ for all $u \in A$.
Show that for $(\bar u, \bar p)$ in $A \times B$, the following three assertions are
equivalent:
(a) $(\bar u, \bar p)$ is a saddle point of $L$ with respect to $A \times B$.
(b) For all $u \in A$, $p \in B$,
$$\langle L_u(\bar u, \bar p), u - \bar u\rangle \ge 0, \quad \langle L_p(\bar u, \bar p), p - \bar p\rangle \le 0.$$
(c) $L(\bar u, \bar p) = \min_{u \in A} \sup_{p \in B} L(u, p) = \max_{p \in B} \inf_{u \in A} L(u, p)$.
Solution: (a) $\Leftrightarrow$ (b) follows from Theorem 46.A in Section 46.1. (a) $\Leftrightarrow$
(c) is the assertion of Theorem 49.B, (2) in Section 49.2. Generalizations
to nondifferentiable functionals can be found in Ekeland and Temam
(1974, M), Chapter VI, Proposition 1.7 and in Barbu and Precupanu
(1978, M), Chapter 2, Section 3. All assertions in Chapters 42 and 47
concerning convex functions can be carried over to saddle points in a
way analogous to the above.
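The equality of the min-sup and max-inf values in Problem 49.2(c) can be observed on a simple convex-concave function. The sketch below is only an illustration: the function `L` and the grid on $A = B = [-1, 1]$ are hypothetical choices, and the grid computation approximates the true infimum and supremum. Here $(0, 0)$ is the saddle point, since $L_u(0,0) = L_p(0,0) = 0$.

```python
# Hypothetical convex-concave function on A = B = [-1, 1]:
# convex in u for fixed p, concave in p for fixed u.
def L(u, p):
    return u * u + u * p - p * p


grid = [i / 10 for i in range(-10, 11)]

# Grid versions of min_u sup_p L(u, p) and max_p inf_u L(u, p).
min_sup = min(max(L(u, p) for p in grid) for u in grid)
max_inf = max(min(L(u, p) for u in grid) for p in grid)

# Saddle inequalities L(0, p) <= L(0, 0) <= L(u, 0) on the grid.
is_saddle = all(L(0.0, p) <= L(0.0, 0.0) <= L(u, 0.0)
                for u in grid for p in grid)
print(min_sup, max_inf, is_saddle)
```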
49.3. Monotone operators and saddle points. Let $L: X \times Y \to \mathbb{R}$ be given and
have the following properties:
(i) $X$ and $Y$ are real B-spaces. $L$ is G-differentiable.
(ii) $u \mapsto L_u(u, p)$ is monotone on $X$ for all $p \in Y$.
(iii) $p \mapsto -L_p(u, p)$ is monotone on $Y$ for all $u \in X$.
Show: $(\bar u, \bar p)$ in $X \times Y$ is a saddle point of $L$ with respect to $X \times Y$ if
and only if $L_u(\bar u, \bar p) = L_p(\bar u, \bar p) = 0$.
Solution: Compare Proposition 42.6 and Problem 49.2.
49.4. Strongly monotone operators and saddle points of families of functions. Let
$L: X \times Y \times Z \to \mathbb{R}$ be a $C^1$-function $(u, p, z) \mapsto L(u, p; z)$, where we
think of $z$ as a parameter. $X$, $Y$, and $Z$ are fixed real H-spaces. Suppose
that for fixed $c > 0$ and all $u_i, u \in X$, $p_i, p \in Y$, $z \in Z$, the following
conditions hold:
$$(L_u(u_1, p; z) - L_u(u_2, p; z) \mid u_1 - u_2) \ge c\|u_1 - u_2\|^2,$$
$$-(L_p(u, p_1; z) - L_p(u, p_2; z) \mid p_1 - p_2) \ge c\|p_1 - p_2\|^2.$$
Show: For each $z \in Z$, $L$ has exactly one saddle point with respect to
$X \times Y$ which we denote by
$$s(z) \overset{\text{def}}{=} (u(z), p(z)),$$
where $s \in C(Z, X \times Y)$ and
$$\frac{dL(s(z); z)}{dz} = L_z(s(z); z) \quad \text{for all } z \in Z.$$
Hint: Use Problem 49.2 and the main theorem on monotone operators
(Theorem 26.A in Section 26.2). Compare Amann (1979), page 132.
49.5. The method of saddle point reduction
49.5a. Basic idea. Assume that we seek a critical point of $G$ on the real H-space
$V$. We decompose $V = X \oplus Y \oplus Z$ and set $L(u, p, z) \overset{\text{def}}{=} G(u + p + z)$,
$u \in X$, $p \in Y$, $z \in Z$. If we apply the result of Problem 49.4 to this
situation, then we obtain $(u(z), p(z))$ with
$$L_u(u(z), p(z), z) = L_p(u(z), p(z), z) = 0.$$
Let $g(z) \overset{\text{def}}{=} L(u(z), p(z), z)$. Furthermore, if we succeed in identifying a
critical point $z_0$ of $g$, then $g'(z_0) = 0$; therefore, $L_z(u(z_0), p(z_0), z_0) = 0$
and thus $L'(u(z_0), p(z_0), z_0) = 0$. Consequently, $u(z_0) + p(z_0) + z_0$ is a
critical point of $G$.
49.5b.** Applications. Study Amann (1979) and Amann and Zehnder (1980).
There this method is applied to equations of the form (14a) and the
corresponding differential equation problem in Problem 49.1. Of
particular interest is the use of isolating blocks and of a generalized Morse index
(the homotopy index) of Conley for dynamical systems (see Conley
(1978, M)).
A typical result for
$$G: -\Delta u = g(u); \quad \partial G: u = 0 \tag{15}$$
with the corresponding eigenvalue problem
$$G: -\Delta u = \lambda u; \quad \partial G: u = 0 \tag{16}$$
reads as follows: Let $G$ be a bounded region in $\mathbb{R}^N$ with sufficiently
smooth boundary, and assume that, for $g \in C^1(\mathbb{R}, \mathbb{R})$,
$$g'(\infty) \overset{\text{def}}{=} \lim_{|u| \to \infty} g'(u)$$
exists. Then (15) has a solution when $g'(\infty)$ is not an eigenvalue of (16).
If $g(0) = 0$, then (15) has a nontrivial solution when there is an
eigenvalue $\lambda$ of (16) such that $g'(0) < \lambda < g'(\infty)$ or $g'(\infty) < \lambda < g'(0)$.
49.6.* L-S deformations. In Section 44.2 we pointed out the great significance of
these deformations for the Ljusternik-Schnirelman theory. We now give
an important existence criterion. To this end, let $X$ be a real B-space. Let
$M$ be a subset in $X$, and let $c$ be a fixed real number. Let $F: X \to \mathbb{R}$ be a
functional.
Show: For each open set $U$ in $X$ with $U \supseteq \operatorname{crit}_{M,c} F$ and for each
$\varepsilon_0 > 0$, there exists a number $\varepsilon$, $0 < \varepsilon < \varepsilon_0$, and a continuous mapping $d$:
$M \times [0,1] \to M$ with the following five properties:
(i) $d(u, 0) = u$ on $M$.
(ii) $d(u, 1) = u$ for all $u \in M$ with $F(u) \notin [c - \varepsilon_0, c + \varepsilon_0]$.
(iii) $F(u) \ge c - \varepsilon$, $u \in M - U$ implies $F(d(u, 1)) \ge c + \varepsilon$.
(iv) $F(d(u, t)) \ge F(u)$ on $M \times [0,1]$.
(v) $d$ is even with respect to $u$ when $F$ is even. If case 2, below, occurs,
then $G$ must be even as well.
In this connection, we assume that one of the two cases occurs.
Case 1: $M$ is equal to the B-space $X$, $F \in C^1(X, \mathbb{R})$ and $F$ satisfies
(PS)$_c$.
Case 2: $M$ is equal to the level set $N_a$, where $N_a \overset{\text{def}}{=} \{u \in X: G(u) = a\}$
for fixed real $a$. The following hold for $F$ and $G$:
(a) $F, G \in C^1(X, \mathbb{R})$, $F$ satisfies (PS)$_c$ and $F^{-1}(A) \cap N_a$ is bounded when
$A$ is bounded.
(b) $G': X \to X^*$ is bounded and locally Lipschitz continuous on bounded
sets of $N_a$.
(c) $\inf_{u \in K} |\langle G'(u), u\rangle| > 0$ on bounded sets $K$ of $N_a$.
Assertions (i)-(v) also hold when $\ge$ is replaced everywhere by $\le$ and
$\varepsilon$ is replaced by $-\varepsilon$.
Hint: Compare Rabinowitz (1974, S) and Chow and Hale (1981, M),
page 134 for case 1 and Zeidler (1979a) for case 2. The basic idea in the
construction of case 1 involves the solution of the ordinary differential
equation
$$v' = \varphi(v)\, h(\|p(v)\|)\, p(v), \quad v(0) = u.$$
Then one sets $d(u, t) \overset{\text{def}}{=} v(t)$. Here, $\varphi$ and $h$ are appropriate functions.
The heart of the construction is that $p(\cdot)$ is a so-called pseudogradient
vector field, i.e., for all $u \in X$ such that $F'(u) \ne 0$,
$$\|p(u)\| \le 2\|F'(u)\|, \quad \langle F'(u), p(u)\rangle \ge \|F'(u)\|^2.$$
In an H-space $X$, one can choose $p(u) = F'(u)$ (gradient vector field).
An analogous construction is used in case 2.
49.7. Ljusternik-Schnirelman theory for free critical points.
49.7a. Abstract result. Prove Proposition 44.18.
Solution: Use Section 44.2b and the L-S deformation $d$ from Problem
49.6, case 1.
49.7b.* Application to semilinear elliptic differential equations. We consider
$$G: -\Delta u = \lambda g(u); \quad \partial G: u = 0, \tag{17}$$
together with the linearized problem
$$G: -\Delta u = \lambda g'(0) u; \quad \partial G: u = 0. \tag{18}$$
With the aid of Proposition 44.18 and (11), show that for $\lambda \in\, ]\lambda_n, \lambda_{n+1}[$,
(17) has at least $n$ solution pairs $(u, -u)$ when the following
conditions hold:
(i) $G$ is a bounded region in $\mathbb{R}^N$ with a sufficiently smooth boundary.
$0 < \lambda_1 < \lambda_2 \le \cdots$ are the eigenvalues of (18).
(ii) $g \in C^1(\mathbb{R}, \mathbb{R})$, $g(0) = 0$, $g'(0) > 0$, $g(u) < 0$ for a fixed $u > 0$, $g$ is odd.
Hint: Compare Rabinowitz (1974, S), page 164. First, treat (11) in
$\mathring{W}_2^1(G)$ and then show that the weak solution is also a classical solution
upon application of the regularization theorems. No growth conditions
are needed for $g$ since, with the aid of the maximum principle, a priori
estimates for the solutions of (17) can be given.
Additional results can be found in Rabinowitz (1974, S).
49.8. Ljusternik-Schnirelman theory on general level sets. Use Section 44.2 and
Problem 49.6, case 2 to formulate results for eigenvalue problems of the
form $F'(u) = \lambda G'(u)$, $u \in N_a$. Here, $N_a$ need not be bounded.
Hint: Compare Zeidler (1979a). There one also finds applications to
differential and integral equations (cf., also, Problems 44.9 and 44.10).
49.9. Elementary linking theorems. This problem serves as a preparation for the
linking principle in Problem 49.10. A class $\mathcal{K}$ of sets is said to be linked
with a set $M$ if and only if $K \cap M \ne \emptyset$ for all $K \in \mathcal{K}$. Figure 49.2
motivates the designation "linking". Every sufficiently regular surface $K$
with $\partial K = K_0$ intersects $M$.
[Figure 49.2]
49.9a. In a B-space $X$, every continuous curve $g: [0,1] \to X$ intersects the
boundary of the unit ball when $\|g(0)\| < 1 < \|g(1)\|$ [Fig. 49.3(a)].
Solution: Apply the intermediate value theorem to $t \mapsto \|g(t)\|$.
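The crossing asserted in Problem 49.9a can also be located numerically, since the continuous real function $t \mapsto \|g(t)\| - 1$ changes sign on $[0,1]$. The following sketch uses bisection for a hypothetical curve `g` in $\mathbb{R}^2$; the curve and the iteration count are assumptions made only for illustration.

```python
import math


def g(t):
    # Hypothetical continuous curve in R^2 with |g(0)| < 1 < |g(1)|.
    return (2.0 * t, t * t)


def norm_minus_one(t):
    x, y = g(t)
    return math.hypot(x, y) - 1.0


lo, hi = 0.0, 1.0  # norm_minus_one(lo) < 0 < norm_minus_one(hi)
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if norm_minus_one(mid) < 0.0:
        lo = mid
    else:
        hi = mid

t_star = 0.5 * (lo + hi)  # parameter at which the curve meets the unit sphere
print(t_star, math.hypot(*g(t_star)))
```

For this particular curve the crossing parameter solves $4t^2 + t^4 = 1$, that is, $t = \sqrt{\sqrt{5} - 2}$.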
49.9b. Every continuous curve $g: [-1,1] \to \mathbb{R}^2$ such that $g(\pm 1) = (\pm 1, 0)$
intersects the $\eta$-axis [see Fig. 49.3(b)].
Solution: Use the intermediate value theorem or Problem 49.9c.
49.9c. Let $B \overset{\text{def}}{=} [-1,1]$ and suppose $H: B \times [0,1] \to \mathbb{R}^2$ is continuous. If $P$:
$\mathbb{R}^2 \to \mathbb{R}$ denotes the orthogonal projection operator on the $\xi$-axis, i.e.,
$P(\xi, \eta) = \xi$, then we suppose that
$$PH(\xi, 0) = \xi, \quad PH(\xi, t) \ne 0 \quad \text{for all } \xi \in \partial B,\ t \in\, ]0,1].$$
Show: The curve belonging to $\xi \mapsto H(\xi, t)$ intersects the $\eta$-axis for all
$t \in [0,1]$ [see Fig. 49.3(c)].
[Figure 49.3, panels (a)-(d)]
Formulate and prove an analogous result for the unit disk $B$ in $\mathbb{R}^2$ (see
Fig. 49.3(d)) and for balls in H-spaces.
Solution: We use the fixed-point index from Chapter 12. Let
$\varphi(\xi, t) \overset{\text{def}}{=} \xi - PH(\xi, t)$. Since $\varphi(\xi, 0) = 0$ on $\partial B$, the fixed-point index of
$\varphi(\cdot, 0)$ is equal to 1 on $B$. The invariance of the fixed-point index under
homotopy yields the same assertion for $\varphi(\cdot, t)$; therefore, $\varphi(\cdot, t)$ has a
fixed point on $B$.
In an H-space $X$ one must require the compactness of $\varphi: B \times [0,1] \to X$.
Then $B$ is a ball in a closed linear subspace $X_1$ of $X$. Instead of the $\xi$-axis
and the $\eta$-axis there appear $X_1$ and $X_1^\perp$, respectively.
49.10. Ljusternik-Schnirelman theory and the linking principle.
49.10a. Basic idea. In order to identify a critical point by Section 44.2a, with the
aid of
$$c \overset{\text{def}}{=} \inf_{K \in \mathcal{K}} \sup_{u \in K} F(u) \tag{19}$$
one needs the following items: L-S deformations, the invariance of $\mathcal{K}$
with respect to L-S deformations, and $-\infty < c < \infty$. The linking
principle permits one to establish that $-\infty < c < \infty$. To this end, we choose a
set $M$ such that
$$\inf_{u \in M} F(u) > -\infty, \quad K \cap M \ne \emptyset \quad \text{for all } K \in \mathcal{K}. \tag{20}$$
From this it immediately follows that $c > -\infty$. If $F$ is bounded above on
some $K$ in $\mathcal{K}$, then $c < \infty$ also holds. If one combines this idea with Fig.
49.3, then a number of important results are obtained.
49.10b. Figure 49.3(a). We assume:
(i) $X$ is a real B-space, $F \in C^1(X, \mathbb{R})$ and $F$ satisfies (PS)$^+$.
(ii) Let $U(0, R) = \{u \in X: \|u\| < R\}$. There exists an $R > 0$ with $F(0) = 0$,
$F(u) > 0$ on $U(0, R) - \{0\}$ and $a \overset{\text{def}}{=} \inf_{u \in \partial U(0,R)} F(u) > 0$.
(iii) There exists a $u_1 \ne 0$ for which $F(u_1) < a$.
(iv) $\mathcal{K}$ is the class of all $K \overset{\text{def}}{=} g([0,1])$, where $g: [0,1] \to X$ is a continuous
mapping (curve) with $g(0) = 0$, $g(1) = u_1$.
Show: In addition to the local minimum $u = 0$, $F$ has another critical
point $\bar u \ne 0$ with $F(\bar u) = c$, $c \ge a$.
Figure 49.4 makes the structure of $F$ clear.
[Figure 49.4]
Solution: Let, say, $F(u_1) \le 0$. Now follow the line of reasoning in
Section 44.2a. According to Problem 49.9a, every curve $g$ intersects the
boundary $\partial U(0, R)$; therefore, $c \ge a$. If $c$ is not a critical value, then we
choose an L-S deformation $d$ according to Problem 49.6, case 1, with
$U = \emptyset$, $\varepsilon_0 = a/2$. In particular, $0 < \varepsilon < a/2$ and
$$F(u) \le c + \varepsilon \quad \text{implies} \quad F(d(u, 1)) \le c - \varepsilon, \tag{21}$$
$$d(u, 1) = u \quad \text{for } F(u) \notin \left[c - \tfrac{a}{2},\ c + \tfrac{a}{2}\right]. \tag{22}$$
By (19), there exists a $K \in \mathcal{K}$ such that
$$\sup_{u \in K} F(u) \le c + \varepsilon.$$
From $F(u) \le 0$ for $u = 0, u_1$ as well as $c \ge a$ and (22), it follows that
$d(g(t), 1) = g(t)$ for $t = 0, 1$, i.e., $d(K, 1) \in \mathcal{K}$. For this reason,
$$\sup_{u \in d(K,1)} F(u) \le c - \varepsilon$$
according to (21), in contradiction to (19).
At the same time we have proven the mountain-pass theorem
(Theorem 44.D in Section 44.12).
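A one-dimensional illustration may help to see how the minimax value $c$ of Problem 49.10a picks out the second critical point in Problem 49.10b. In the sketch below the functional `F`, the radius `R`, and the endpoint `u1` are hypothetical choices on $X = \mathbb{R}$. In one dimension every curve from $0$ to $u_1$ covers the segment $[0, u_1]$, so $c$ reduces to the maximum of $F$ on that segment, which is approximated here on a grid.

```python
def F(u):
    # Hypothetical functional on X = R: F(0) = 0, F > 0 for 0 < |u| < sqrt(2).
    return 0.5 * u * u - 0.25 * u ** 4


def dF(u):
    return u - u ** 3


R, u1 = 1.0, 2.0
a = min(F(-R), F(R))  # infimum of F on the boundary {-R, R} of U(0, R)
assert F(u1) < a      # assumption (iii) of Problem 49.10b

# Every curve from 0 to u1 covers [0, u1], and the straight segment is itself
# admissible, so c equals the maximum of F on [0, u1].
n = 2000
grid = [u1 * i / n for i in range(n + 1)]
c, u_bar = max((F(u), u) for u in grid)
print(a, c, u_bar, dF(u_bar))
```

In this example $c = a = 1/4$ is attained at $\bar u = 1$, where $F'(\bar u) = 0$, in agreement with $c \ge a$.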
49.10c. Figure 49.3(b). Show: $F$ has a critical point when:
(i) $F \in C^1(X, \mathbb{R})$, $F$ satisfies (PS), and $X$ is a real B-space.
(ii) $X_1$ is a linear subspace of $X$ with $\dim X_1 < \infty$, and $P: X \to X_1$ is a
continuous linear projection operator on $X_1$. We set $X_2 \overset{\text{def}}{=}
(I - P)(X)$; therefore, $X = X_1 \oplus X_2$.
(iii) $F(u) \ge b_2 > -\infty$ on $X_2$ and $F(u) \le b_1 < b_2$ on the boundary $\partial U(0)$
of a bounded neighborhood of zero, $U(0)$, in $X_1$.
Solution: We consider all $g \in C(\bar U, X_1 \oplus X_2)$ with $g(u) = u$ on $\partial U$ [see
Fig. 49.3(b)]. Let $\mathcal{K}$ be the class of all $K = g(\bar U)$. Now follow a line of
reasoning as in Problem 49.10b. By Problems 49.9b and 49.9c, it is
crucial that $K \cap X_2 \ne \emptyset$. Hence, $c \ge b_2$. Compare Rabinowitz (1978),
page 162.
49.10d.* Figure 49.3(d). In this connection, study the existence propositions for
critical points in Benci and Rabinowitz (1979) with applications to
periodic solutions of Hamiltonian systems.
49.10e.** Critical points and the Galerkin method, intersection theory. In this
connection, study Rabinowitz (1978b). There the existence of periodic
solutions for semilinear hyperbolic differential equations (12a) is proved.
Here, the linking principle is based on the application of intersection
theory from algebraic topology. This intersection theory, which can be
found in Dold (1972, M), is the appropriate tool for proving deeper
linking theorems.
49.11. Ordinary differential equations and critical points. An important method
for the solution of $A(u) = 0$ is to consider the differential equation
$u'(t) = A(u(t))$ and verify that $u(t) \to u_0$ and $u'(t) \to 0$ as $t \to +\infty$.
Then $A(u_0) = 0$ (cf. Problem 6.7g). Show that $F$ has a critical point $u_0$
when:
(i) $F \in C^1(X, \mathbb{R})$, $F'$ is Lipschitz continuous, and $X$ is a real H-space.
(ii) $F$ is bounded below on $X$.
(iii) $F^{-1}(B)$ is bounded when $B$ is bounded.
(iv) $u_n \rightharpoonup u$, $F'(u_n) \to v$ as $n \to \infty$ implies that $v = F'(u)$.
Hint: Compare Berger (1977, M), page 131. In Berger's monograph
one can find a number of further results concerning critical points
together with physical applications.
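The method of Problem 49.11 can be tried out in finite dimensions with $A = -F'$. The sketch below integrates $u'(t) = -F'(u(t))$ by the explicit Euler scheme; the functional, the step size, and the starting point are hypothetical choices, and the explicit Euler discretization is an assumption of this illustration rather than part of the problem. The chosen $F$ is bounded below, has bounded sublevel sets, and has a globally Lipschitz gradient.

```python
import math


def grad_F(u):
    # F(u) = |u|^2 / 2 - cos(u_1) on R^2 (hypothetical example):
    # bounded below, coercive, with globally Lipschitz gradient.
    return (u[0] + math.sin(u[0]), u[1])


def flow(u, step=0.1, n_steps=2000):
    """Explicit Euler scheme for u'(t) = A(u(t)) with A = -F'."""
    for _ in range(n_steps):
        g = grad_F(u)
        u = (u[0] - step * g[0], u[1] - step * g[1])
    return u


u0 = flow((2.0, -1.5))
residual = math.hypot(*grad_F(u0))  # size of F'(u0) at the computed limit
print(u0, residual)
```

The iterates approach $u_0 = (0, 0)$, the only critical point of this particular $F$.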
49.12. The fixed-point index and critical points. We use the notation from
Section 12.3.
49.12a.* Local fixed-point index for a minimal point. Suppose $g: U(u_0) \subset X \to \mathbb{R}$
has a local minimum at $u_0$ to which an isolated critical point
corresponds. Show that $i(I - g', u_0) = 1$ when the following hold:
(i) $U(u_0)$ is an open bounded neighborhood of $u_0$ in the real
H-space $X$.
(ii) $g \in C^1(\bar U(u_0), \mathbb{R})$, and $I - g'$ is compact on $\bar U(u_0)$.
Hint: Compare Rabinowitz (1975).
49.12b. Existence of three critical points. $g$ has at least three critical points on
$U(u_0)$ when the following hold, in addition to the assumptions in
Problem 49.12a: $u_0 \ne 0$, $u = 0$ is a nondegenerate critical point of $g$ (i.e.,
$g'(0) = 0$, $g''(0)$ exists as F-derivative, and $g''(0)^{-1}$ exists on $X^*$), and
$i(I - g', U(u_0)) = 1$.
Solution: According to Section 14.2, $i(I - g', 0) = \pm 1$. If, in addition
to $u_0$ and $u = 0$, there exist no further critical points, then we obtain the
contradiction
$$i(I - g', U(u_0)) = i(I - g', 0) + i(I - g', u_0).$$
In Rabinowitz (1975) one can also find an application to shell theory.
Additional important results on the existence of three critical points
via fixed-point index can be found in Problem 14.4.
49.12c. Existence of critical points and Morse theory. Compare Problem 44.12.
49.13. Weak sequential lower semicontinuity and saddle points. The proof of
Theorem 49.A in Section 49.1 is based on Theorem 9.D, which we have
proved with the aid of fixed-point theorems for multivalued operators.
Using Theorem 38.A, give a direct proof of Theorem 49.A for bounded
$A$, $B$, which exploits only the weak sequential lower semicontinuity of $u \mapsto L(u, p)$
and $p \mapsto -L(u, p)$.
Hint: Compare Ekeland and Temam (1974, M), Chapter VI, 2.1.
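49.14. Proof of Proposition 49.6. Solution: For
$$L(u, p) = \langle c, u\rangle_X - \langle p, Du - b\rangle_Y$$
by Section 48.4, we have
$$\sup_{p \in K_Y^*} L(u, p) = \begin{cases} \langle c, u\rangle & \text{if } Du - b \in K_Y, \\ +\infty & \text{if } Du - b \notin K_Y. \end{cases}$$
Hence,
$$\inf_{u \in K_X} \sup_{p \in K_Y^*} L(u, p) = \alpha$$
is equivalent to
$$\inf_u \langle c, u\rangle = \alpha, \quad u \in K_X, \quad Du - b \in K_Y.$$
Furthermore, $L$ is equal to
$$L(u, p) = \langle p, b\rangle_Y + \langle c - D^*p, u\rangle_X;$$
therefore, by (48.4),
$$\inf_{u \in K_X} L(u, p) = \begin{cases} \langle p, b\rangle & \text{if } c - D^*p \in K_X^*, \\ -\infty & \text{if } c - D^*p \notin K_X^*. \end{cases}$$
Consequently,
$$\sup_{p \in K_Y^*} \inf_{u \in K_X} L(u, p) = \beta$$
is equivalent to
$$\sup_p \langle p, b\rangle = \beta, \quad p \in K_Y^*, \quad c - D^*p \in K_X^*.$$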
49.15. Generalization of the duality assertions of Theorem 49.B. We again
consider
$$\alpha = \inf_{u \in A} \left(\sup_{p \in B} L(u, p)\right), \tag{23}$$
$$\beta = \sup_{p \in B} \left(\inf_{u \in A} L(u, p)\right) \tag{24}$$
and make the following five assumptions:
(H1) $L: A \times B \subseteq X \times Y \to \mathbb{R}$ is a given functional. $A$ and $B$ are
convex nonempty subsets of the real locally convex spaces $X$
and $Y$, respectively.
(H2) $u \mapsto L(u, p)$ is convex on $A$ for all $p \in B$.
(H3) $p \mapsto L(u, p)$ is concave on $B$ for all $u \in A$.
(H4) $u \mapsto L(u, p)$ is lower semicontinuous on $A$ for all $p \in B$.
(H5) $u \mapsto L(u, p_0)$ is lower semicompact on $A$ for a fixed $p_0 \in B$.
Prove the following assertions:
49.15a. min-sup problem. (23) has a solution $\bar u$ when (H1), (H4), and (H5) hold.
Solution: One easily shows that $u \mapsto \sup_{p \in B} L(u, p)$ is lower
semicompact. Then use Proposition 38.12, (a).
49.15b. min-sup problem with $\alpha = \beta$. (23) has a solution $\bar u$ and $\alpha = \beta$ when
(H1)-(H5) hold.
Solution: Follow a line of reasoning similar to that in the proof of
Theorem 49.B, (3) in Section 49.2. Another way of proving this can be
found in Aubin (1979, M), page 216.
By passing from $L$ to $-L$, one also immediately obtains assertions
about the solutions of (24) with $\alpha = \beta$.
49.15c* Applications to mathematical economics. In this connection, study Aubin
(1979, M). There the special role of the Ky Fan inequality for existence
propositions is pointed out. We will come back to Ky Fan's inequality
and its applications in Chapter 77 of Part IV.
49.16.* Applications to nonlinear differential equations. Study Nirenberg (1981).
In this survey article it is shown how the Palais-Smale condition and
variational principles can be applied to prove the existence of solutions
of semilinear elliptic partial differential equations or the existence of
periodic solutions of semilinear wave equations or Hamiltonian
(canonical) systems. Compare, also, the corresponding works in the references to
this chapter. There are still many open questions in this field.
In this connection let us consider the following three typical examples
which are prototypes of more general important results first proved by
Ambrosetti and Rabinowitz (1973), and Rabinowitz (1978a), (1978b) by
variational methods.
49.16a. Semilinear elliptic equations. Show that the following elliptic boundary
value problem
$$G: -\Delta u = g(u), \quad \partial G: u = 0$$
has a positive solution when the following four conditions are satisfied.
(i) $G$ is a bounded domain in $\mathbb{R}^n$ with smooth boundary.
(ii) $g: \mathbb{R} \to \mathbb{R}$ is a $C^\infty$-function with $g(u) > 0$ for all $u > 0$.
(iii) $g(u) = o(u)$ as $u \to 0$.
(iv) $g(u) = a u^k$ for large $u$. Here, $a > 0$ and $1 < k < (n+2)/(n-2)$
(superlinear case).
Hint: Use the mountain pass theorem in Section 44.12. Cf. Nirenberg
(1981, S), page 279.
49.16b. Hamiltonian systems. Show that the Hamiltonian system
$$p' = -H_q, \quad q' = H_p$$
has a nontrivial periodic solution on the level set
$$L \overset{\text{def}}{=} \{(p, q) \in \mathbb{R}^{2n}: H(p, q) = 1\}$$
when the following two conditions are satisfied.
(i) $H: \mathbb{R}^{2n} \to \mathbb{R}$ is $C^\infty$, and $H'(p, q) \ne 0$ on $L$.
(ii) $L$ is compact and strictly star shaped about the origin (i.e., any ray
from the origin hits $L$ at just one point and nontangentially).
Hint: Cf. Nirenberg (1981, S), page 285.
49.16c. Nonlinear vibrating strings. Let $r$ be a given positive rational number.
Show that the following hyperbolic problem
$$u_{tt} - u_{xx} + u^3 = 0, \quad 0 < x < \pi, \quad t > 0,$$
$$u(0, t) = u(\pi, t) = 0$$
has a nontrivial solution $u \in L_\infty(\mathbb{R}^2)$ with time period $2\pi r$.
Hint: Use the mountain pass theorem in Section 44.12. Cf. Nirenberg
(1981, S), page 293. We need $r$ rational in order to avoid bad resonance
effects. More general results can be found in Brezis (1983, S). There, the
proofs make use of a duality principle which we describe in the following
problem.
49.17. The duality principle of Clarke and Ekeland. We are given a fixed $b$ in $X$.
We consider the equation
$$Au + F'(u) = b, \quad u \in X \tag{25}$$
together with the corresponding variational principle
$$2^{-1}(Au \mid u) + F(u) - (b \mid u) = \text{stationary!} \tag{26}$$
Furthermore, we consider the so-called dual variational principle
$$2^{-1}(A^{-1}v \mid v) + F^*(v + b) = \text{stationary!}, \quad v \in R(A). \tag{26*}$$
We assume:
(i) The operator $A: D(A) \subseteq X \to X$ is linear and self-adjoint on the
H-space $X$. The range $R(A)$ is closed.
(ii) The functional $F: X \to \mathbb{R}$ is convex and $C^1$.
Thus $X$ admits an orthogonal decomposition $X = R(A) \oplus N(A)$.
Consequently, the inverse operator $A^{-1}: R(A) \to R(A)$ is well defined and
bounded. Note that $A$ can be unbounded. $F^*$ denotes the conjugate
functional defined in Section 51.1.
Obviously, the problems (25) and (26) are equivalent. Use Proposition
51.5 in order to show that the solutions of (26*) are solutions of (25)
when $v = F'(u) - b$.
Solution: (25) is equivalent to
$$v \in R(A), \quad A^{-1}v + F'^{-1}(v + b) \in N(A).$$
Now observe $(F^*)' = F'^{-1}$ by Proposition 51.5.
In practice it is much easier to find solutions of (26*) than of (26).
This duality principle has turned out to be extremely useful for
applications to Hamiltonian systems and nonlinear vibrating strings. Cf. Brezis
(1983, M), and Clarke and Ekeland (1982).
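The passage from the dual variable $v$ back to a solution $u$ of (25) can be followed in the simplest possible setting. The sketch below assumes hypothetical one-dimensional data: $X = \mathbb{R}$, $Au = au$ with $a = 1$, $F(u) = u^4/4$, and $b = 2$, so that $N(A) = \{0\}$, $F'(u) = u^3$, and $(F^*)'(w) = F'^{-1}(w)$ is the real cube root of $w$. The stationarity condition of the dual functional is solved by bisection, and $u$ is recovered from $v = F'(u) - b$.

```python
import math


def cbrt(w):
    """Signed real cube root."""
    return math.copysign(abs(w) ** (1.0 / 3.0), w)


# Hypothetical one-dimensional data: X = R, Au = a*u, F(u) = u^4 / 4, so that
# F'(u) = u^3, F*(w) = (3/4)|w|^(4/3) and (F*)'(w) = cbrt(w) = (F')^{-1}(w).
a, b = 1.0, 2.0


def dual_derivative(v):
    # Derivative of v -> (A^{-1} v | v) / 2 + F*(v + b); here N(A) = {0}.
    return v / a + cbrt(v + b)


lo, hi = -2.0, 0.5  # dual_derivative(lo) < 0 < dual_derivative(hi)
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if dual_derivative(mid) > 0.0:
        hi = mid
    else:
        lo = mid
v = 0.5 * (lo + hi)

u = cbrt(v + b)                # recover u from v = F'(u) - b
residual = a * u + u ** 3 - b  # residual of the original equation (25)
print(v, u, residual)
```

For these data the dual critical point is $v = -1$ and the recovered solution of (25) is $u = 1$.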
49.18.* Representation of maximal monotone operators by saddle functions. The
following results show why, roughly speaking, maximal monotone
operators behave like the subgradients of convex functionals. At the same time
we get a close connection between the theory of maximal monotone
operators and convex analysis. Let $X$ be a real B-space. The function
$S: X \times X \to\, ]-\infty, \infty]$ is said to be a proper saddle function if and only if
$S \not\equiv +\infty$, and $S$ is concave (respectively, convex) with respect to the first
(respectively, second) argument.
(i) Let $S: X \times X \to\, ]-\infty, \infty]$ be a proper saddle function. Set
$$y \in Tx \quad \text{iff} \quad (-y, y) \in \partial S(x, x).$$
Show that the mapping $T: X \to 2^{X^*}$ is maximal monotone.
(ii) Conversely, let $T: X \to 2^{X^*}$ be maximal monotone. Show that there
is a saddle function $S$ such that $T$ can be represented by (i).
Hint: Cf. Krauss (1984). For example, this approach can be used to
derive general sum theorems for maximal monotone operators.
References to the Literature
Classical work: Farkas (1902) (Farkas' Lemma on inequalities); J. von
Neumann (1928) (existence of saddle points and game theory).
Saddle points and optimization in $\mathbb{R}^N$: Rockafellar (1970, M, B, H); Stoer
and Witzgall (1970, M); Elster (1977, M).
Introduction to linear optimization and duality: Franklin (1980, M).
General minimax theorems: J. von Neumann (1928); Ky Fan (1952);
Sion (1958); Browder (1968a); Brezis, Nirenberg, and Stampacchia (1972);
Holmes (1975, M); Aubin (1979, M).
Saddle points, multivalued mappings, and variational inequalities:
Browder (1968a); Mosco (1976, S); Gwinner (1981, S).
Saddle points and convex analysis: Ekeland and Temam (1974, M);
Barbu and Precupanu (1978, M); Aubin (1979, M).
Saddle points and geometric functional analysis: Holmes (1975, M).
Saddle points and duality theory: Ekeland and Temam (1974, M)
(recommended as an introduction, numerous applications); Pšeničnyi (1972, M);
Göpfert (1973, M); Sander (1973, M); Ioffe and Tihomirov (1974, M);
Golstein (1975, M) (generalized concept of solution); Krabs (1975, M);
Barbu and Precupanu (1978, M).
Saddle functions and maximal monotone operators: Krauss (1984).
Critical points and nonlinear differential equations: Nirenberg (1981, S);
Berkeley (1983, P).
Critical points and semilinear elliptic differential equations: Clark (1972);
Ambrosetti and Rabinowitz (1973); Rabinowitz (1974, S), (1978); Ahmad,
Lazer, and Paul (1976); Berger (1977, M); Amann (1979); Amann and
Zehnder (1980); Hess (1980); Struwe (1980), (1982).
Critical points and semilinear wave equations: Rabinowitz (1978b);
Amann (1979); Amann and Zehnder (1980); Brezis, Coron, and Nirenberg
(1980); Brezis (1983) (duality principle).
Critical points and periodic solutions of Hamiltonian systems: Berger
(1977, M); Rabinowitz (1978a), (1980); Benci and Rabinowitz (1979);
Amann (1979); Amann and Zehnder (1980); Nirenberg (1981, S); Clarke
and Ekeland (1980), (1981).
Saddle points and applications to economics: Aubin (1979, M)
(comprehensive presentation).
Saddle points and game theory: Compare the references to the literature
in Chapter 9 and in Section 37.8.
Critical points and Ljusternik-Schnirelman theory as well as Morse
theory: Compare the references to the literature in Chapter 44.
Approximation methods for determining saddle points: Auslender (1972,
L), (1976, M); Ekeland and Temam (1974, M) (Uzawa's algorithm);
Demjanov and Malozemov (1975, M); Belenkii and Volkonskii (1976, M)
(handbook); Glowinski, Lions, and Tremolieres (1976, M).
CHAPTER 50
Duality and the Generalized
Kuhn-Tucker Theory
Bees ... by virtue of a certain geometrical forethought... know that the
hexagon is greater than the square and the triangle and will hold more honey for
the same expenditure of material.
Pappus of Alexandria
In this chapter we consider convex minimum problems with a finite or
infinite number of side conditions and their generalizations. It turns out that
the results of the classical Kuhn-Tucker theory can be carried over
completely to this situation.
50.1. Side Conditions in Operator Form
In order to be able to formulate the side conditions in a convenient form,
parallel to Chapter 7, we agree to the following notation: We write $u \le v$ if
and only if $v - u \in K$, where $K$ is a closed convex cone in the B-space $Y$.
Our original problem reads as follows:
$$\inf_u F(u) = \alpha, \quad u \in A, \quad Nu \le 0. \tag{1}$$
Below we show that (1) is equivalent to
$$\inf_{u \in A} \sup_{p \in K^*} L(u, p) = \alpha \tag{2}$$
with the Lagrange function
$$L(u, p) \overset{\text{def}}{=} F(u) + \langle p, Nu\rangle \quad \text{for all } (u, p) \in A \times K^*.$$
$L$ arises from (1) corresponding to our general formal rule: The side
conditions multiplied by a Lagrange multiplier $p$, thus $\langle p, Nu\rangle$, are added
to the functional $F(u)$ to be minimized.
The problem dual to (1) then has the following form on the basis of our
general duality principle from Section 49.2:
$$\sup_{p \in K^*} \left(\inf_{u \in A} L(u, p)\right) = \beta. \tag{1*}$$
Our assumptions read as follows:
(H1) $X$ and $Y$ are real reflexive B-spaces.
(H2) $A \subset X$, $A$ is closed, convex, and nonempty.
(H3) $K \subset Y$, $K$ is a closed convex cone.
(H4) $F: A \subset X \to \mathbb{R}$ is convex and lower semicontinuous.
(H5) $N: A \subset X \to Y$ is convex, i.e.,
$$N(tu + (1 - t)v) \le tNu + (1 - t)Nv \quad \text{for all } u, v \in A,\ t \in [0,1].$$
Furthermore, $N$ is lower semicontinuous in the sense that the functionals
$$u \mapsto \langle p, Nu\rangle_Y$$
are lower semicontinuous on $A$ for all $p \in K^*$. This condition is always
fulfilled for continuous $N$.
Furthermore, the Slater condition is crucial:
There exists a $u_0 \in A$ such that $-Nu_0 \in \operatorname{int} K$. (SC)
As the following proof shows, (SC) guarantees a weak coerciveness
condition for $L$.
Theorem 50.A. With the assumptions (H1)-(H5) and (SC), the following
hold:
(a) Solution of the dual problem. (1*) has a solution $\bar p$ and $\alpha = \beta$.
(b) Solution of the original problem. The following two assertions are
equivalent:
(i) $\bar u$ is a solution of the original problem (1).
(ii) $L$ has a saddle point $(\bar u, \bar p)$ with respect to $A \times K^*$.
If (ii) is satisfied, then $\bar p$ is a solution of the dual problem (1*) and the
extremal relation
$$\langle \bar p, N\bar u\rangle = 0 \tag{3}$$
holds.
Assertion (b) yields a saddle-point characterization of $\bar u$. Moreover,
assertions concerning the Lagrange multiplier $\bar p$ are put forth. In addition,
we introduce a characterization which makes use of the minimum problem
    $L(\bar u, \bar p) = \min_{u \in A} L(u, \bar p), \quad \bar u \in A, \quad N\bar u \le 0, \quad \langle \bar p, N\bar u \rangle = 0.$    (4)
(4) results from (1) by replacing $F$ by $L$ and adding the extremal condition
$\langle \bar p, N\bar u \rangle = 0$.
Corollary 50.1. With the assumptions (H1)-(H5) and (SC), the following two
assertions are equivalent:
(i) $\bar u$ is a solution of the original problem (1).
(ii) There exists a $\bar p \in K^*$ such that $\bar u$ and $\bar p$ satisfy (4).
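Proof of Theorem 50.A. The equivalence between (1) and (2) follows
from
    $\sup_{p \in K^*} \bigl[ F(u) - \langle p, -Nu \rangle \bigr] = \begin{cases} F(u) & \text{if } -Nu \in K, \\ +\infty & \text{if } -Nu \notin K \end{cases}$
according to (48.4) because $K^{**} = K$.
(Ad a) We first prove that from (SC) it follows that
    $L(u_0, p) = F(u_0) - \langle p, -Nu_0 \rangle \to -\infty$    (5)
as $\|p\| \to \infty$ on $K^*$. Indeed, because $-Nu_0 \in \operatorname{int} K$, some neighborhood of
$-Nu_0$ also belongs to $\operatorname{int} K$. For this reason, there exists an $r > 0$ such that
    $\langle p, -Nu_0 - rv \rangle \ge 0$
for all $p \in K^*$ and all $v \in Y$ with $\|v\| \le 1$. Now, (5) follows from
    $r\|p\| = \sup_{\|v\| = 1} \langle p, rv \rangle \le \langle p, -Nu_0 \rangle$ for all $p \in K^*$.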
If $\beta > -\infty$, then (a) is obtained from Theorem 49.B, (3) in Section 49.2.
When $\beta = -\infty$,
    $\alpha = \inf_{u \in A} F(u) = \inf_{u \in A} L(u, 0) \le \beta$;
therefore, $\alpha = \beta$ and $\bar p = 0$ is a solution of (1*).
(Ad b) Use (a) and Theorem 49.B, (2). The relation $\langle \bar p, N\bar u \rangle = 0$ is
obtained, according to (49.6), from
    $F(\bar u) = L(\bar u, \bar p) = F(\bar u) + \langle \bar p, N\bar u \rangle$.
□
We shall prove Corollary 50.1 in Problem 50.1.
50.2. Side Conditions in the Form of Inequalities
In this section we will explain the connection between the general duality
theory and the Kuhn-Tucker theory of Section 47.10. We shall see that the
essential results of Section 47.10 are also obtained as special cases from
Section 50.1. For this purpose, we set $p = \lambda$, $\lambda = (\lambda_1, \ldots, \lambda_n)$ and consider
our original problem to be
    $\inf_{u} F(u) = \alpha, \quad u \in A,$    (6)
    $F_i(u) \le 0, \quad i = 1, \ldots, n,$
with the corresponding dual problem
    $\sup_{\lambda \in \mathbb{R}^n_+} \bigl( \inf_{u \in A} L(u, \lambda) \bigr) = \beta$    (6*)
and the Lagrange function
    $L(u, \lambda) \overset{\mathrm{def}}{=} F(u) + \sum_{i=1}^{n} \lambda_i F_i(u)$.
The Slater condition from Section 50.1 reads as follows:
    There exists a $u_0$ in $A$ such that $F_i(u_0) < 0$ for all $i$.    (SC)
This is a requirement on the side conditions in the original problem (6).
Proposition 50.2. Suppose the following two conditions hold:
(i) $A$ is a closed convex nonempty set in the real reflexive B-space $X$.
(ii) The functionals $F, F_1, \ldots, F_n\colon A \subseteq X \to \mathbb{R}$ are convex and lower
semicontinuous.
Then:
(1) Solution of the dual problem. (6*) always has a solution $\bar\lambda$ and $\alpha = \beta$.
(2) Solution of the original problem. The following two assertions are
equivalent:
(a) $\bar u$ is a solution of the original problem (6).
(b) $L$ has a saddle point $(\bar u, \bar\lambda)$ with respect to $A \times \mathbb{R}^n_+$.
If (b) is satisfied, then $\bar\lambda \in \mathbb{R}^n_+$ is a solution of the dual problem (6*), and
    $\bar\lambda_i F_i(\bar u) = 0, \quad i = 1, \ldots, n.$    (7)
Proof. The assertions are a special case of Section 50.1, with $Y = \mathbb{R}^n$,
$Nu = (F_1(u), \ldots, F_n(u))$, $K = \mathbb{R}^n_+ = K^*$, $p = \lambda$, and $\langle p, Nu \rangle = \sum_{i=1}^{n} \lambda_i F_i(u)$.
Here (7) corresponds to the extremal relation $\langle \bar p, N\bar u \rangle = 0$. □
A comparison with Section 47.10 shows complete agreement. The
Lagrange multiplier which occurs there appears here as a solution of the
dual problem. Furthermore, according to Section 50.1, the Lagrange
function constructed in Section 47.10 is, at the same time, a Lagrange function
for our general duality theory.
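The statements of Proposition 50.2 can be checked numerically on a toy problem. The following is only a sketch under assumptions that do not come from the text: $X = A = \mathbb{R}$, $F(u) = u^2$, and one side condition $F_1(u) = 1 - u \le 0$ (the Slater condition holds with $u_0 = 2$). All function names and the grid search are illustrative.

```python
# Toy instance (assumed, not from the text): minimize u^2 subject to 1 - u <= 0.

def F(u):
    # objective functional
    return u * u

def F1(u):
    # side condition F1(u) <= 0, i.e. u >= 1
    return 1.0 - u

def lagrangian(u, lam):
    # L(u, lam) = F(u) + lam * F1(u)
    return F(u) + lam * F1(u)

def dual_function(lam):
    # inf over u of u^2 + lam * (1 - u) is attained at u = lam / 2
    return lagrangian(lam / 2.0, lam)

def solve_dual(lo=0.0, hi=10.0, steps=100_000):
    # crude grid search for sup over lam >= 0 of the dual function
    best_lam, best_val = lo, dual_function(lo)
    for k in range(1, steps + 1):
        lam = lo + (hi - lo) * k / steps
        val = dual_function(lam)
        if val > best_val:
            best_lam, best_val = lam, val
    return best_lam, best_val

lam_bar, beta = solve_dual()   # solution of the dual problem (6*)
u_bar = lam_bar / 2.0          # minimizer of L(., lam_bar)
alpha = F(u_bar)               # value of the original problem (6)
slackness = lam_bar * F1(u_bar)  # left-hand side of relation (7)
```

In this instance the dual value equals the primal value and the product in (7) vanishes, in line with assertions (1) and (2) of the proposition.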
Problems
50.1. Proof of Corollary 50.1. (i) $\Rightarrow$ (ii) According to Theorem 50.A, (b), $(\bar u, \bar p)$ is a
saddle point of $L$ with respect to $A \times K^*$, i.e., for all $u \in A$, $p \in K^*$, we have
    $F(\bar u) + \langle p, N\bar u \rangle \le F(\bar u) + \langle \bar p, N\bar u \rangle \le F(u) + \langle \bar p, Nu \rangle.$    (8)
This yields (4).
(ii) $\Rightarrow$ (i) (8) follows from (4) because $N\bar u \le 0$; therefore, $\langle p, N\bar u \rangle \le 0$ for
all $p \in K^*$. Hence, $(\bar u, \bar p)$ is a saddle point of $L$ with respect to $A \times K^*$.
According to Theorem 50.A, (b), $\bar u$ is a solution of (1).
50.2.* Uzawa's algorithm. Suppose we are given the minimum problem
    $\inf_{u \in A} \bigl( \sup_{p \in B} L(u, p) \bigr) = \alpha$    (9)
with $L(u, p) = F(u) + (p \mid G(u))$. The iteration process reads as follows:
    $p_{n+1} = P(p_n + tG(u_n)), \quad p_0 \in B,$
where $t$ is sufficiently small and positive. We calculate $p_0, u_0, p_1, u_1, \ldots$
successively, and $u_n$ is obtained from
    $L(u_n, p_n) = \min_{u \in A} L(u, p_n).$    (10)
Show: $(u_n)$ converges as $n \to \infty$ to a solution of (9) when the following
four conditions hold:
(i) $A$ and $B$ are closed, convex, and nonempty sets in the real H-spaces $X$
and $Y$, respectively.
(ii) $P\colon Y \to B$ is the projection operator from Section 46.4 on the bounded
set $B$.
(iii) $F\colon A \to \mathbb{R}$ is G-differentiable and $(F'(u) - F'(v) \mid u - v) \ge c\|u - v\|^2$ for
all $u, v \in A$ and fixed $c > 0$.
(iv) $G\colon A \to Y$ is Lipschitz continuous. The functional $u \mapsto (p \mid G(u))$ is
convex and lower semicontinuous on $A$ for all $p \in B$.
The reader should think about the situation for which one can conceive of
this method as the gradient method for the dual problem
    $\sup_{p \in B} \bigl( \inf_{u \in A} L(u, p) \bigr) = \beta$
for sufficiently regular data.
Hint: Compare Ekeland and Temam (1974, M), Chapter VII.
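The iteration of Problem 50.2 can be sketched in a few lines. The data below are assumptions chosen for illustration, not taken from the text: $X = \mathbb{R}^2$, $Y = \mathbb{R}$, $F(u) = \tfrac12\|u\|^2 - (c \mid u)$, $G(u) = (g \mid u) - d$, and $B = [0, p_{\max}]$, so that $P$ is clipping to this interval. For this quadratic $F$ the minimization step (10) has the closed form $u_n = c - p_n g$.

```python
def uzawa(c, g, d, t=0.25, p_max=10.0, iterations=60):
    """Uzawa iteration for L(u, p) = 0.5*|u|^2 - (c|u) + p*((g|u) - d).

    c, g, d, t, p_max are illustrative inputs, not notation from the text.
    """
    p = 0.0            # p_0 in B = [0, p_max]
    u = list(c)
    for _ in range(iterations):
        # step (10): u_n minimizes L(., p_n); closed form for this quadratic F
        u = [ci - p * gi for ci, gi in zip(c, g)]
        # value of the side condition G(u_n)
        G_u = sum(gi * ui for gi, ui in zip(g, u)) - d
        # multiplier update p_{n+1} = P(p_n + t*G(u_n)), P = projection onto B
        p = min(max(p + t * G_u, 0.0), p_max)
    return u, p

# minimize 0.5*|u|^2 - (c|u) subject to u_1 + u_2 <= 1
u, p = uzawa(c=[2.0, 1.0], g=[1.0, 1.0], d=1.0)
```

With these data the multiplier error is halved in every step, so after 60 steps `u` is numerically the projection of `c` onto the half-plane $u_1 + u_2 \le 1$. The bound `p_max` only has to exceed the multiplier of the solution.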
50.3.* The Arrow-Hurwicz algorithm. In this variant of Problem 50.2, $u_n$ is not
determined by (10) but rather by the iteration procedure
    $u_{n+1} = u_n - s(\Phi u_n - \varphi + G^* p_n).$
Show: For suitable $t, s > 0$, $(u_n)$ converges as $n \to \infty$ to a solution of (9)
when the following conditions hold, in addition to the assumptions in
Problem 50.2:
(iii*) $F(u) \overset{\mathrm{def}}{=} (\Phi u \mid u) - 2(\varphi \mid u)$ for all $u \in A$; the operator $\Phi\colon X \to X$ is linear,
continuous, self-adjoint, and strongly positive; $\varphi \in X$.
(iv*) $G\colon X \to Y$ is linear and continuous.
(iii) and (iv) follow automatically from (iii*), (iv*).
Hint: Compare Ekeland and Temam (1974, M), Chapter VII. In Temam
(1977, M), the Uzawa and Arrow-Hurwicz algorithms are applied to the
Navier-Stokes equations.
50.4. Simple proof of the main theorem of linear optimization in $\mathbb{R}^N$. In this set of
exercises, we give the reader appropriate hints so that he can prove the main
theorem (Theorem 37.A in Section 37.10) independently. In this connection,
only a separation theorem is used in Problem 50.4a. All other considerations
are obtained by means of simple calculations with matrices. In the following,
let $A\colon \mathbb{R}^N \to \mathbb{R}^M$ be a linear operator, i.e., $A$ is equal to a real matrix $(a_{ij})$.
The adjoint matrix corresponds to $A^*$. Furthermore, we recall that for
$u, v \in \mathbb{R}^n$, we have
    $(u \mid v) = \sum_{i=1}^{n} u_i v_i.$
$u \ge 0$ and $u \gg 0$ mean $u_i \ge 0$ and $u_i > 0$, respectively, for all $i$.
50.4a. Farkas' lemma. The problem
    $Au = a, \quad u \ge 0$    (11)
has a solution $u$ if and only if
    $(v \mid a) \ge 0$ for all $v$ with $A^* v \ge 0$.    (12)
This is a generalization of the known criterion for the solution of a system of
linear equations:
    $Au = a$ has a solution $u$
    $\Leftrightarrow$ $(v \mid a) = 0$ for all $v$ with $A^* v = 0$.
Hint: (11) $\Rightarrow$ (12): $(A^* v \mid u) = (v \mid Au) = (v \mid a)$.
(12) $\Rightarrow$ (11): Separate $a$ and the cone $K = A(\mathbb{R}^N_+)$. Compare Vogel (1967, M), page
49, and Franklin (1980, M).
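Farkas' lemma says that exactly one of two certificates exists: a nonnegative solution $u$ of (11), or a vector $v$ with $A^* v \ge 0$ and $(v \mid a) < 0$ that violates (12). The following sketch only verifies given certificates on a small hand-made matrix. The matrix, the vectors, and the helper names are assumptions for illustration. No solver is involved.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def mat_vec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

def solves_11(A, a, u, tol=1e-12):
    """True if u is a solution of (11): A u = a and u >= 0."""
    nonnegative = all(ui >= -tol for ui in u)
    residual_ok = all(abs(r - ai) <= tol for r, ai in zip(mat_vec(A, u), a))
    return nonnegative and residual_ok

def violates_12(A, a, v, tol=1e-12):
    """True if v shows that (12) fails: A* v >= 0 but (v|a) < 0."""
    cone_ok = all(x >= -tol for x in mat_vec(transpose(A), v))
    return cone_ok and dot(v, a) < -tol

A = [[1.0, 2.0],
     [0.0, 1.0]]

solvable_rhs = [3.0, 1.0]      # A u = a has the nonnegative solution u = (1, 1)
unsolvable_rhs = [1.0, -1.0]   # forces u_2 = -1, so no nonnegative solution

has_solution = solves_11(A, solvable_rhs, [1.0, 1.0])
has_certificate = violates_12(A, unsolvable_rhs, [0.0, 1.0])
```

For the second right-hand side, $v = (0, 1)$ gives $A^* v = (0, 1) \ge 0$ and $(v \mid a) = -1 < 0$, which rules out a solution of (11) by the easy direction of the lemma.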
50.4b. Alternative theorem of Farkas (1902). Either (11) has a solution $u$ or
    $A^* v \ge 0, \quad (v \mid a) < 0$    (11*)
has a solution $v$.
Solution: This is another formulation of Farkas' lemma.
50.4c. Tucker's existence proposition for inequalities. The two systems $Au = 0$, $u \ge 0$
and $A^* v \ge 0$ possess solutions $u$, $v$ with $A^* v + u \gg 0$.
Hint: Apply Problem 50.4b to
    $A_k u_1 = -a_k, \quad u_1 \ge 0$
and
    $A_k^* v_1 \ge 0, \quad (v_1 \mid -a_k) < 0.$
$A_k$ results from $A$ by eliminating the $k$th column $a_k$. Compare Collatz and
Wetterling (1966, M), page 69.
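50.4d. Inequality for skew-symmetric matrices. If $B\colon \mathbb{R}^N \to \mathbb{R}^N$ is a matrix with
$B^* = -B$, then
    $Ba \ge 0, \quad a \ge 0, \quad Ba + a \gg 0$    (13)
has a solution $a$.
Hint: Apply Problem 50.4c to
    $(I, -B)\begin{pmatrix} w \\ z \end{pmatrix} = 0, \quad \begin{pmatrix} w \\ z \end{pmatrix} \ge 0, \quad \text{and} \quad \begin{pmatrix} v \\ Bv \end{pmatrix} \ge 0$
and set $a = v + z$. Here the symbol $I$ denotes the unit matrix. Compare
Collatz and Wetterling (1966, M), page 70.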
50.4e. Main theorem of linear optimization in $\mathbb{R}^N$. Consider
    $\inf_{u} (p \mid u) = \alpha, \quad Au \ge b, \quad u \ge 0,$    (14)
    $\sup_{v} (b \mid v) = \beta, \quad A^* v \le p, \quad v \ge 0.$    (14*)
$u$ and $v$ are called admissible vectors for (14) and (14*), respectively, if and
only if $u$ and $v$ satisfy the corresponding side conditions.
Show: If $u$ and $v$ are admissible, then
    $(b \mid v) \le (v \mid Au) \le (p \mid u)$
holds, i.e., $\beta \le \alpha$.
Solution: A simple calculation.
Also show: Exactly one of the following two cases occurs:
Regular case: (14) and (14*) have a solution pair $u$, $v$ such that
    $(b \mid v) = (p \mid u)$.
Singular case: The following assertions hold for (14) and (14*):
(i) At least one of the two problems has no admissible vectors.
(ii) If the set $M$ of admissible vectors of one of the two problems is not
empty, then $M$ is unbounded and the objective function is also
unbounded on $M$.
(iii) Neither of the two problems has a solution.
This is another formulation of the main theorem in Section 37.10.
Hint: The trick is to apply Problem 50.4d to
    $B = \begin{pmatrix} 0 & A & -b \\ -A^* & 0 & p \\ b^* & -p^* & 0 \end{pmatrix}, \quad a = \begin{pmatrix} v \\ u \\ t \end{pmatrix} \ge 0,$
and discuss (13). Here, $t > 0$ and $t = 0$ yield the regular and singular cases,
respectively. Compare Collatz and Wetterling (1966, M), page 71. A similar
simple approach can be found in Franklin (1980, M), page 62.
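The chain $(b \mid v) \le (v \mid Au) \le (p \mid u)$ and the skew-symmetric block matrix of the hint can be checked on a small instance. Everything below is an assumed toy example: $A = (1 \; 1)$, $b = (1)$, $p = (1, 2)$, with the solution pair $u = (1, 0)$, $v = (1)$ found by hand. The ordering $a = (v, u, t)$ of the unknowns in (13) is also an assumption, chosen so that the block rows read $Au \ge tb$, $A^* v \le tp$, and $(b \mid v) \ge (p \mid u)$.

```python
def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def mat_vec(M, x):
    return [dot(row, x) for row in M]

def transpose(M):
    return [list(col) for col in zip(*M)]

A = [[1.0, 1.0]]   # one side condition, two unknowns
b = [1.0]
p = [1.0, 2.0]

def admissible_primal(u):
    # side conditions of (14): A u >= b, u >= 0
    return all(x >= 0 for x in u) and all(r >= bi for r, bi in zip(mat_vec(A, u), b))

def admissible_dual(v):
    # side conditions of (14*): A* v <= p, v >= 0
    return all(y >= 0 for y in v) and all(
        r <= pj for r, pj in zip(mat_vec(transpose(A), v), p))

def duality_chain(u, v):
    # the three numbers (b|v), (v|Au), (p|u)
    return dot(b, v), dot(v, mat_vec(A, u)), dot(p, u)

def skew_matrix():
    # block matrix [[0, A, -b], [-A*, 0, p], [b*, -p*, 0]]
    M, N = len(A), len(A[0])
    At = transpose(A)
    rows = [[0.0] * M + A[i] + [-b[i]] for i in range(M)]
    rows += [[-x for x in At[j]] + [0.0] * N + [p[j]] for j in range(N)]
    rows.append(b + [-x for x in p] + [0.0])
    return rows

u_bar, v_bar = [1.0, 0.0], [1.0]
B = skew_matrix()
a_vec = v_bar + u_bar + [1.0]   # assumed ordering (v, u, t) with t = 1
Ba = mat_vec(B, a_vec)
```

Here `duality_chain(u_bar, v_bar)` returns three equal numbers, which is the regular case, and `Ba`, `a_vec`, and their sum satisfy (13) with $t = 1 > 0$.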
50.5.* Tuy's inconsistency theorem for inequalities. Under appropriate assumptions,
from the solvability of
    $S(u) < 0, \quad u \in A$    (15)
and the nonsolvability of
    $S(u) < 0, \quad T(u) < 0, \quad u \in A$    (16)
this very general theorem of convex analysis infers the existence of linear
continuous positive functionals $f$, $g$, with $g \ne 0$, and
    $\langle f, S(u) \rangle + \langle g, T(u) \rangle > 0$ for all $u \in A$.
Here, $A$ is convex and $S\colon A \to X$ and $T\colon A \to Y$ are convex mappings in the
linear topological spaces $X$ and $Y$, respectively.
In this connection, study Holmes (1975, M), page 90. In particular, the
concept of a solution for (15) and (16) with the aid of regularizing sets is
made precise there. Furthermore, it is shown that from this inconsistency
theorem one can obtain a number of important general propositions of
convex analysis: the Minkowski-Farkas lemma, the Hurwicz saddle-point
theorem, and the Golstein duality theorem concerning generalized solutions
of convex optimization problems. This concept of generalized solutions is so
contrived that one obtains very intuitive duality propositions. In this
connection, also compare Golstein (1975, M).
References to the Literature
John (1948) and Kuhn and Tucker (1951) (classical works); Arrow, Hurwicz,
and Uzawa (1958, M); Ekeland and Temam (1974, M); Holmes (1975, M);
Barbu and Precupanu (1978, M) (cf., also, the references to the literature in
Chapters 49 and 47).
CHAPTER 51
Duality, Conjugate Functionals,
Monotone Operators and Elliptic
Differential Equations
But we are all led and guided by the passion to perceive and to understand,
whereby we consider ourselves to be admirably distinguished.
Euler's motto for his "Meditationes super problemate nautico"
(Considerations on nautical problems).
In Chapter 49 we showed how one arrives at general duality propositions
knowing a Lagrange function L. In this chapter, given a functional F, we
define a so-called conjugate functional F*, and in Section 51.4 we explain
how one can construct a Lagrange function for a given convex minimum
problem with respect to F by means of F*. In this connection, the
generalized Young inequality
    $F^*(u^*) + F(u) \ge \langle u^*, u \rangle,$    (1a)
    $F^*(u^*) + F(u) = \langle u^*, u \rangle \Leftrightarrow u^* \in \partial F(u)$    (1b)
and the relation
    $F^{**} = F$    (2)
play a crucial role. One can use
    $(F^*)' = (F')^{-1}$    (3)
or the more general relation
    $u^* \in \partial F(u) \Leftrightarrow u \in \partial F^*(u^*)$    (3a)
to calculate $F^*$. Together with the propositions furnished in Chapter 47
concerning subgradients, the calculus of conjugate functionals, whose
principal parts are comprised in (1a)-(3a), forms the "crossing frog" of convex
analysis. [For readers who are not familiar with railways, a "crossing frog"
is a device on railroad tracks for keeping cars on the proper rails at
intersections or switches.]
The concept of a conjugate functional has two classical roots:
($\alpha$) The Legendre transformation in the calculus of variations.
($\beta$) The Young inequality.
We elucidate this more precisely in Section 51.1. The Young inequality
reads as follows:
    $u u^* \le \dfrac{|u|^p}{p} + \dfrac{|u^*|^q}{q}$    (4)
for all $u, u^* \in \mathbb{R}$, where $p, q > 1$ and $p^{-1} + q^{-1} = 1$. The calculus of
conjugate functionals is best understood if one proceeds from (4). Here, the
correspondence between $F$ and $F^*$ occurs by means of
    $F(u) = \dfrac{|u|^p}{p}, \quad F^*(u^*) = \dfrac{|u^*|^q}{q}.$
Now, one easily verifies that $F^{**} = F$ and $(F^*)' = (F')^{-1}$. Here, $(F')^{-1}$
denotes the inverse function of $F'$ with $F'(u) = |u|^{p-1} \operatorname{sgn} u$. The inequality
(4) is nothing other than (1a), and (1b) asserts that the equality sign appears
in (4) precisely when $u^* = F'(u)$, i.e., $u^* = |u|^{p-1} \operatorname{sgn} u$.
The derivative F' of a convex functional F is a monotone operator. In
this chapter, we make use of the generalized Young inequality as well as
$F^{**} = F$ and $(F^*)' = (F')^{-1}$ in order to prove a general duality theorem
for monotone potential operators in Section 51.5, which also justifies
approximation procedures with practical error estimates. We apply this
theorem to linear and quasilinear elliptic differential equations. In this
connection, we obtain, in particular, the important Trefftz duality between
the Ritz and Trefftz methods of Section 37.9. We thus recognize that, with
regard to the concept of a conjugate functional, the classical Legendre
transformation and the Trefftz duality for linear elliptic differential
equations are mutually interconnected, although at first glance it appears that we
are dealing with unrelated objects.
In Chapter 52, we generalize the $S$-function of the classical
Hamilton-Jacobi theory and, with the aid of conjugate functionals, obtain a
general duality principle which does not explicitly use Lagrange functions
(Rockafellar's stability principle). Also, in this connection, the generalized
Young inequality plays a central role.
The Young inequality (4) is responsible for the duality between the
Lebesgue spaces $L_p(G)$ and $L_q(G)$. In Chapter 53, as a generalization of
this duality, we elucidate the connection between conjugate functions and
Orlicz spaces, which play an important role in the treatment of differential
and integral equations with strongly increasing nonlinearities.
We also admit infinite values for F. For this reason, we can apply the
formalism described in the introduction to Chapter 47, which allows us to
change a minimum problem with side conditions into a free minimum
problem over the entire space.
In order to obtain symmetric duality propositions, we make use of dual
pairs (X, X*) of locally convex spaces. We explain this concept in detail in
the Appendix. In this connection, as usual, X* is the set of all continuous
linear functionals on X. The designation "dual pair" indicates that a locally
convex topology is defined on X* which yields the crucial duality relation
(X*)* = X. The prototype for a dual pair is a reflexive B-space X and its
dual space $X^*$, together with the norm defined on it. The reader who does
not feel confident with the theory of locally convex spaces can think of a
dual pair as this prototype in all the theorems. We have explained the
concept of locally convex spaces in the Appendix to Part I.
51.1. Conjugate Functionals
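Our starting point is
    $F^*(u^*) \overset{\mathrm{def}}{=} \sup_{u \in X} \bigl[ \langle u^*, u \rangle_X - F(u) \bigr]$ for all $u^* \in X^*$.    (5)
Definition 51.1. Let $F\colon X \to [-\infty, \infty]$ be a functional on the locally convex
space $X$. The conjugate functional $F^*\colon X^* \to [-\infty, \infty]$ to $F$ is defined by (5).
Obviously,
    $F^*(u^*) = \sup_{u \in \operatorname{dom} F} \bigl[ \langle u^*, u \rangle - F(u) \bigr].$    (6)
In this connection, we observe that $F \not\equiv +\infty$ for $\operatorname{dom} F \ne \emptyset$. When $\operatorname{dom} F
= \emptyset$, $F \equiv +\infty$; therefore, $F^* \equiv -\infty$. This is (6) because of our earlier
agreement that the supremum (respectively, infimum) taken over the empty
set equals $-\infty$ (respectively, $+\infty$). In summary, the following situation
results for infinite values:
    $F(u) = -\infty$ for some $u \in X$ implies $F^* \equiv +\infty$.    (7a)
    $F \equiv +\infty$ implies $F^* \equiv -\infty$.    (7b)
    $F > -\infty$, $F \not\equiv +\infty$ implies $F^* > -\infty$, $F^* \not\equiv +\infty$.    (7c)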
The last assertion is obtained in the proof of Proposition 51.6, (4). We give
an intuitive geometric interpretation of F*(u*) in Example 51.3 and in
Problem 51.2a.
Proposition 51.2 (Generalized Young Inequality). The following hold:
    $F^*(u^*) + F(u) \ge \langle u^*, u \rangle,$    (8a)
    $F^*(u^*) + F(u) = \langle u^*, u \rangle \Leftrightarrow u^* \in \partial F(u)$    (8b)
for all $u \in X$, $u^* \in X^*$, for which the left-hand side is meaningful, i.e., $\infty - \infty$
does not occur.
These singular situations never arise in the special case (7c).
Proof. (Ad 8a) Compare (5).
(Ad 8b) $u^* \in \partial F(u)$ is equivalent to
    $\langle u^*, v - u \rangle \le F(v) - F(u)$ for all $v \in X$ and $-\infty < F(u) < \infty$,
i.e.,
    $F^*(u^*) \le \langle u^*, u \rangle - F(u), \quad -\infty < F(u) < \infty.$
(8a) yields the assertion. □
As typical examples, we now explain the connection between conjugate
functions and the Young inequality as well as the classical Legendre
transformation.
Example 51.3 (Young's Inequality). Let $X = \mathbb{R}$; therefore, $X^* = \mathbb{R}$ as well.
If, in Fig. 51.1, we draw a secant through $(0,0)$ with slope $u^*$, then $F^*(u^*)$
is the largest possible difference in ordinates between the secant and the
curve belonging to $F$.
If, in particular, we choose
    $F(u) \overset{\mathrm{def}}{=} \dfrac{|u|^p}{p}$ for all $u \in \mathbb{R}$ and fixed $p > 1$,
then
    $F^*(u^*) = \dfrac{|u^*|^q}{q}$
for all $u^* \in \mathbb{R}$, where $p^{-1} + q^{-1} = 1$.
Figure 51.1
This results from (5) by means of a simple calculation, by setting
$g(u) \overset{\mathrm{def}}{=} u^* u - F(u)$ and determining the maximum of $g$ with the aid of
$g'(u) = 0$.
The classical Young inequality (4) that we have already discussed in the
introduction to this chapter follows from (8).
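The formula for $F^*$ in Example 51.3 can be compared with a brute-force evaluation of the supremum in (5). This is a sketch under assumptions: the exponent $p = 3$, the interval $[-5, 5]$, and the grid size are arbitrary illustrative choices, and the supremum over $\mathbb{R}$ is replaced by a maximum over grid points.

```python
def F(u, p=3.0):
    # F(u) = |u|^p / p
    return abs(u) ** p / p

def conjugate_on_grid(u_star, p=3.0, lo=-5.0, hi=5.0, steps=10_000):
    # approximates sup_u [u_star * u - F(u)] by a maximum over a uniform grid
    h = (hi - lo) / steps
    return max(u_star * (lo + k * h) - F(lo + k * h, p) for k in range(steps + 1))

def conjugate_closed_form(u_star, p=3.0):
    # |u_star|^q / q with 1/p + 1/q = 1
    q = p / (p - 1.0)
    return abs(u_star) ** q / q

def young_gap(u, u_star, p=3.0):
    # F*(u*) + F(u) - u*u, nonnegative by (8a)
    return conjugate_closed_form(u_star, p) + F(u, p) - u_star * u
```

The gap returned by `young_gap` vanishes exactly when $u^* = |u|^{p-1} \operatorname{sgn} u$, which is the equality case (8b). The grid maximum is only meaningful for slopes whose maximizer lies inside the chosen interval.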
Example 51.4 (Legendre Transformation). Our goal is to verify the
following relation for $q' \mapsto L(t, q; q')$:
    $L^*(t, q; p) = H(t, q; p),$    (9)
i.e., we will show that the conjugate function $L^*$ to the Lagrange function $L$
with respect to $q'$ is the Hamilton function $H$. In this connection, $p$ is
obtained from
    $p = L_{q'}(t, q; q').$    (10)
To this end, we assume:
(i) Regularity. The Lagrange function $L\colon \mathbb{R}^3 \to \mathbb{R}$ has continuous second
partial derivatives.
(ii) Convexity. For all $t, q, q' \in \mathbb{R}$,
    $L_{q'q'}(t, q; q') > 0.$
(iii) Coerciveness. For all $t, q \in \mathbb{R}$,
    $\dfrac{L(t, q; q')}{|q'|} \to +\infty$ as $|q'| \to \infty$.
Then the following assertions result:
(1) For fixed $t, q \in \mathbb{R}$ and for each $p \in \mathbb{R}$, equation (10), which describes
the Legendre transformation, has exactly one solution $q' = q'(t, q, p)$.
(2) If we set
    $\mathcal{H}(t, q, q', p) \overset{\mathrm{def}}{=} pq' - L(t, q; q'),$
    $H(t, q; p) \overset{\mathrm{def}}{=} \mathcal{H}(t, q, q'(t, q, p), p),$
then (9) holds.
Proof. (Ad 1) As a function of $q'$, $L_{q'}$ is continuous and strictly monotone
increasing because of (ii), and $L_{q'} \to \pm\infty$ as $q' \to \pm\infty$ because of (iii).
Therefore, $L_{q'}$ has the structure shown in Fig. 51.2.
(Ad 2) As a function of $q'$, $-\mathcal{H}$ is continuous and strictly convex because
$-\mathcal{H}_{q'q'} = L_{q'q'} > 0$, and $-\mathcal{H}/|q'| \to +\infty$ as $|q'| \to \infty$. For this reason, $\mathcal{H}$
has the form given in Fig. 51.2. Consequently, for fixed $t, q, p$, the function
$\mathcal{H}$ possesses exactly one maximum which, indeed, is in $q'$ with
$\mathcal{H}_{q'}(t, q, q', p) = 0$; therefore, (10), i.e., $q' = q'(t, q, p)$. Thus,
    $L^*(t, q; p) = \max_{q' \in \mathbb{R}} \mathcal{H}(t, q, q', p) = \mathcal{H}(t, q, q'(t, q, p), p) = H(t, q; p).$    (11)
Figure 51.2
In Section 37.4 we have already explained the central significance of the
Legendre transformation for the classical calculus of variations. (11)
comprises the classical maximum principle, whose generalization to control
problems is the Pontrjagin maximum principle.
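As a worked instance of Example 51.4, one may take the Lagrange function of a point mass in a potential. This choice is an illustration and not part of the text: $L(t, q; q') = \tfrac{m}{2} q'^2 - V(q)$ with an assumed constant $m > 0$ and an assumed twice continuously differentiable potential $V$, so that (i)-(iii) hold.

```latex
% Illustrative choice (assumption): L(t,q;q') = (m/2) q'^2 - V(q), m > 0, V in C^2.
\begin{align*}
L_{q'q'}(t,q;q') &= m > 0, \qquad
  \frac{L(t,q;q')}{|q'|} = \frac{m}{2}\,|q'| - \frac{V(q)}{|q'|} \to +\infty
  \quad (|q'| \to \infty),\\
p &= L_{q'}(t,q;q') = m q'
  \;\Longrightarrow\; q'(t,q,p) = \frac{p}{m},\\
\mathcal{H}(t,q,q',p) &= p q' - \frac{m}{2}\,q'^2 + V(q),\\
L^*(t,q;p) &= \max_{q' \in \mathbb{R}} \mathcal{H}(t,q,q',p)
  = \mathcal{H}\Bigl(t,q,\frac{p}{m},p\Bigr)
  = \frac{p^2}{2m} + V(q) = H(t,q;p).
\end{align*}
```

The conjugate of the Lagrange function with respect to the velocity is thus the familiar sum of kinetic and potential energy, expressed through the momentum $p$.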
51.2. Functionals Conjugate to Differentiable Convex
Functionals
In order to conveniently calculate conjugate functionals for a class of
frequently occurring functionals, we now justify the central formula
    $(F^*)' = (F')^{-1}.$    (12)
To be precise, we show that
    $F(u) = F(0) + \int_0^1 \langle F'(tu), u \rangle_X \, dt$ for all $u \in X$,    (13)
    $F^*(u^*) = F^*(0) + \int_0^1 \langle u^*, F'^{-1}(tu^*) \rangle_X \, dt$ for all $u^* \in X^*$,    (14)
    $F^*(0) = -F(F'^{-1}(0)).$
The corresponding generalized Young inequality reads as follows: For all
$u^* \in X^*$ and $u \in X$,
    $F^*(u^*) + F(u) \ge \langle u^*, u \rangle,$    (15a)
    $F^*(u^*) + F(u) = \langle u^*, u \rangle \Leftrightarrow u^* = F'(u).$    (15b)
(12)-(14) result from this in a simple way.
Proposition 51.5. Formulas (12)-(15b) hold when $F\colon X \to \mathbb{R}$ is G-differentiable
on the real reflexive separable B-space $X$ and $F'\colon X \to X^*$ is strictly
monotone and coercive.
In particular, the strict convexity of $F$ follows from the strict monotonicity
of $F'$. If $F$ is not differentiable, then a natural generalization of (12) with
the aid of the subgradient reads as follows:
    $u^* \in \partial F(u) \Leftrightarrow u \in \partial F^*(u^*).$    (16)
We shall justify this formula in Theorem 51.A.
Proof. (15) Since $\partial F(u) = \{F'(u)\}$, this follows immediately from (8).
(13) According to Section 42.4, $F'$ is demicontinuous and (13) holds.
(12) Relation (15b) is the key here, i.e.,
    $F^*(F'(u)) + F(u) = \langle F'(u), u \rangle$ for all $u \in X$.    (17)
According to Theorem 26.A in Section 26.1, the inverse operator $F'^{-1}\colon
X^* \to X$ exists and is strictly monotone as well as demicontinuous. We set
$w \overset{\mathrm{def}}{=} F'(y)$, $v \overset{\mathrm{def}}{=} F'(x)$. Then from (17) and Section 42.13 it follows that
    $F^*(w) - F^*(v) = F(x) - F(y) - \langle F'(y), x - y \rangle + \langle F'(y) - F'(x), x \rangle$
    $\ge \langle F'(y) - F'(x), x \rangle = \langle w - v, F'^{-1}(v) \rangle$;
therefore,
    $\langle w - v, F'^{-1}(v) \rangle \le F^*(w) - F^*(v) \le \langle w - v, F'^{-1}(w) \rangle.$
The second inequality is obtained from the first by interchanging $v$ and $w$.
Finally, this yields
    $\langle F^{*\prime}(v), h \rangle_{X^*} = \lim_{t \to +0} t^{-1} [F^*(v + th) - F^*(v)]$
    $= \langle h, F'^{-1}(v) \rangle_X = \langle F'^{-1}(v), h \rangle_{X^*}$;
therefore, $(F^*)' = (F')^{-1}$.
(14) According to (17), $F^*(0) = -F(F'^{-1}(0))$. Due to $(F^*)' = (F')^{-1}$
and the monotonicity of $(F')^{-1}$, $(F^*)'$ is a monotone potential operator.
Then, analogous to (13), (14) follows from Section 42.4. □
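Formula (14) can be tested on the scalar functional from the Young inequality. The following worked computation assumes $X = \mathbb{R}$ and $F(u) = |u|^p/p$ with $p > 1$ and $p^{-1} + q^{-1} = 1$; it is meant as an illustration of Proposition 51.5, not as part of its proof.

```latex
% Worked check of (12) and (14) for F(u) = |u|^p / p on X = R (illustrative).
\begin{align*}
F'(u) &= |u|^{p-1}\operatorname{sgn} u, \qquad
  F'^{-1}(v) = |v|^{q-1}\operatorname{sgn} v
  \quad\text{because } (p-1)(q-1) = 1,\\
F^*(0) &= -F\bigl(F'^{-1}(0)\bigr) = -F(0) = 0,\\
F^*(u^*) &= F^*(0) + \int_0^1 u^*\,|t u^*|^{q-1}\operatorname{sgn}(t u^*)\,dt
  = |u^*|^{q}\int_0^1 t^{q-1}\,dt = \frac{|u^*|^{q}}{q},\\
(F^*)'(u^*) &= |u^*|^{q-1}\operatorname{sgn} u^* = F'^{-1}(u^*).
\end{align*}
```

The result agrees with the conjugate functional obtained directly from the definition in Example 51.3.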
51.3. Properties of Conjugate Functionals
In this section, we assume in general that:
(H) X is a real locally convex space and (X, X*) forms a dual pair.
This condition is fulfilled, e.g., when $X$ is a real reflexive B-space and $X^*$
is the dual space with the usual norm topology. Then, because $(X^*)^* = X$,
for $F\colon X \to [-\infty, \infty]$, the relation
    $(F^*)^*(u) = \sup_{u^* \in X^*} \bigl[ \langle u^*, u \rangle_X - F^*(u^*) \bigr]$ for all $u \in X$    (18)
holds by Definition 51.1. We set $F^{**} = (F^*)^*$. A simple geometric
interpretation of $F^{**}$ as a $\Gamma$-regularization is given in Problem 51.2e. We now ask,
when does $F = F^{**}$ hold?
Theorem 51.A (Fenchel and Moreau). For $F\colon X \to \,]-\infty, \infty]$, assuming (H),
we have:
(i) $F = F^{**}$ if and only if $F$ is convex and lower semicontinuous.
(ii) $u^* \in \partial F(u)$ if and only if $u \in \partial F^*(u^*)$ when $F$ is convex and lower
semicontinuous.
This theorem follows easily from (3), (4), and (6) in the following
proposition. Here, $F \le G$ and $F > -\infty$ are equivalent to $F(u) \le G(u)$ and
$F(u) > -\infty$, respectively, for all $u \in X$.
Proposition 51.6. For the functionals $F, G\colon X \to [-\infty, \infty]$, assuming (H), we
have:
(1) $G \le F$ implies $F^* \le G^*$.
(2) $F^*$ is convex and lower semicontinuous on $X^*$.
(3) $F^{**} \le F$ and $F^{**}$ is convex and lower semicontinuous on $X$.
(4) $F = F^{**}$ when $F$ is convex and lower semicontinuous, with $F > -\infty$ or
$F \equiv -\infty$.
(5) $F^{**} \le G \le F$ implies $F^{**} = G$ when $G$ is convex and lower
semicontinuous and $F, G > -\infty$ or $G \equiv -\infty$.
(6) $\partial F(u) \ne \emptyset$, $u^* \in \partial F(u)$ implies $u \in \partial F^*(u^*)$, $F(u) = F^{**}(u)$.
(7) $F(u) = F^{**}(u)$ implies $\partial F(u) = \partial F^{**}(u)$.
Example 51.7. For $F\colon \mathbb{R} \to \mathbb{R}$, (3) and (5) mean that $F^{**}$ is that convex
lower semicontinuous function which best approaches $F$ from below (see
Fig. 51.3).
From this observation it is meaningful to make the problem
    $\inf_{u \in X} F^{**}(u) = \beta,$    (19)
as a generalized problem, correspond to a nonconvex minimum problem
    $\inf_{u \in X} F(u) = \alpha.$    (20)
Figure 51.3
$F^{**}$ is convex and lower semicontinuous. We then designate the solutions
of (19) as generalized solutions of (20). We have already noted this
possibility in Problem 42.14b.
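The passage from $F$ to $F^{**}$ can be made concrete on a grid. The sketch below uses an assumed nonconvex example, the double-well function $F(u) = (u^2 - 1)^2$ on $\mathbb{R}$, and replaces both suprema in (5) and (18) by maxima over finite grids. The grids and names are illustrative choices.

```python
def F(u):
    # nonconvex double-well function (illustrative choice)
    return (u * u - 1.0) ** 2

# grids for the variable u and for the slopes u* (assumed ranges)
U = [k / 100.0 for k in range(-200, 201)]   # u in [-2, 2]
S = [j / 20.0 for j in range(-200, 201)]    # u* in [-10, 10]

# discrete version of (5): F*(s) = max over u of [s*u - F(u)]
F_star = [max(s * u - F(u) for u in U) for s in S]

def F_bistar(u):
    # discrete version of (18): F**(u) = max over s of [s*u - F*(s)]
    return max(s * u - fs for s, fs in zip(S, F_star))
```

On the interval $[-1, 1]$ the biconjugate is zero, so it fills in the hump of $F$ between the two wells, while for $|u| \ge 1$ it coincides with $F$ as long as the required slope lies in the grid `S`. Every point of $[-1, 1]$ is then a generalized solution in the sense of (19), whereas (20) has only the two solutions $u = \pm 1$.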
Proof of Proposition 51.6. The proof of assertion (4), which is based on
a separation theorem, is crucial. All the other assertions result in a very
simple way.
(Ad 1) Compare Definition 51.1.
(Ad 2) $F^*$ is obtained as the supremum of a family of continuous linear
functionals (cf. Problems 38.2 and 47.1).
(Ad 3) $F^{**} \le F$ follows from (18) and the Young inequality (8a).
Furthermore, take $F^{**} = (F^*)^*$ and assertion (2) into account.
(Ad 4) (I) Distinguishing cases. For $F$ we have:
(a) $F(u) = -\infty$ for some $u$ implies $F^* \equiv +\infty$.
(b) $F \equiv +\infty$ implies $F^* \equiv -\infty$.
(c) $F > -\infty$, $F \not\equiv +\infty$ implies $F^* > -\infty$, $F^* \not\equiv +\infty$.
(a) and (b) follow directly from Definition 51.1. We shall prove (c). First,
$F(u) < +\infty$ for some $u$. According to Lemma 47.4, there exists a $u^* \in X^*$
such that
    $\langle u^*, u \rangle - (F(u) - 1) > \langle u^*, v \rangle - F(v)$ for all $v \in X$;
therefore, $\langle u^*, u \rangle - (F(u) - 1) \ge F^*(u^*)$, by equation (6).
For $F \equiv -\infty$, we have $F^* \equiv +\infty$, $F^{**} \equiv -\infty$; therefore, $F = F^{**}$.
Likewise, from $F \equiv +\infty$, it follows that $F = F^{**}$.
(II) Suppose (c) holds. Since $F^{**} \le F$, it suffices to show that $F \le F^{**}$.
Suppose $F(u) > F^{**}(u)$ for some fixed $u$.
(II$_1$) We first prove that $u \in \overline{\operatorname{dom} F}$; for, otherwise, because $u \notin \overline{\operatorname{dom} F}$, we
can strictly separate the point $u$ and the closed convex set $\overline{\operatorname{dom} F}$
(Proposition 39.4, 2(ii)), i.e., there exist $w \in X^*$, $\beta \in \mathbb{R}$ such that
    $\langle w, u \rangle > \beta > \langle w, v \rangle$ for all $v \in \operatorname{dom} F$.
Thus, for a $z$ such that $F^*(z) < +\infty$, by (c) and for all $t > 0$, we have
    $F^*(z + tw) = \sup_{v \in \operatorname{dom} F} \bigl[ \langle z + tw, v \rangle - F(v) \bigr] \le F^*(z) + t\beta,$
    $F^{**}(u) \ge \langle z + tw, u \rangle - F^*(z + tw)$ (by (18))
    $\ge \langle z, u \rangle + t(\langle w, u \rangle - \beta) - F^*(z) \to +\infty$ as $t \to +\infty$.
Consequently, $F^{**}(u) = +\infty$, in contradiction to $F^{**}(u) < F(u)$.
(II$_2$) Now, we have $F^{**}(u) < F(u)$ and $u \in \overline{\operatorname{dom} F}$. By Lemma 47.4 there
exist elements $u^* \in X^*$, $a \in \mathbb{R}$ such that
    $\langle u^*, u \rangle - F^{**}(u) > a > \langle u^*, v \rangle - F(v)$ for all $v \in \operatorname{dom} F$;
therefore,
    $\langle u^*, u \rangle - F^{**}(u) > a \ge F^*(u^*),$
in contradiction to (18).
(Ad 5) By assertions (2) and (4), $F^{***} = (F^*)^{**} = F^*$. Thus, from
$F^{**} \le G \le F$ it follows that $F^* \le G^* \le F^{***} = F^*$, by assertion (1).
According to assertion (4), this yields $F^* = G^*$, $F^{**} = G^{**} = G$.
(Ad 6) $u^* \in \partial F(u)$ implies $F(u) \ne \pm\infty$. The Young equality (8b) and
(18) yield
    $F(u) = \langle u^*, u \rangle - F^*(u^*) \le F^{**}(u).$
Since $F^{**} \le F$, we have $F^{**}(u) = F(u)$; therefore
    $\langle u^*, u \rangle = F^{**}(u) + F^*(u^*)$
and thus $u \in \partial F^*(u^*)$, again by (8b).
(Ad 7) Compare Problem 51.2d. □
51.4. Conjugate Functionals and the Lagrange
Function
In this section we show how one can construct Lagrange functions for a
comprehensive class of problems by means of conjugate functionals and the
formula F** = F. A more detailed investigation of the dual problems is
found in the following section and in Section 52.2.
We consider the minimum problem
    $\inf_{u \in X} \bigl[ F(u) + H(Du - a) \bigr] = \alpha,$    (21)
and we will give the cases for which the dual problems read as follows:
    $\sup_{p \in Y^*} \bigl[ -F^*(-D^* p) - H^*(p) - \langle p, a \rangle \bigr] = \beta,$    (21a*)
    $\sup_{p} \bigl[ -H^*(p) - \langle p, a \rangle \bigr] = \beta, \quad D^* p = b, \quad p \in Y^*.$    (21b*)
This duality is called the Fenchel duality. We choose the Lagrange function
to be
    $L(u, p) \overset{\mathrm{def}}{=} F(u) + \langle p, Du - a \rangle - H^*(p)$
for $u \in X_0$, $p \in Y_0^*$, where $X_0 \overset{\mathrm{def}}{=} \operatorname{dom} F$ and $Y_0^* \overset{\mathrm{def}}{=} \operatorname{dom} H^*$. Our goal is to
write (21) in the form
    $\inf_{u \in X_0} \sup_{p \in Y_0^*} L(u, p) = \alpha.$    (22)
Then the dual problem reads as follows:
    $\sup_{p \in Y_0^*} \inf_{u \in X_0} L(u, p) = \beta.$    (22*)
In this connection, our assumptions read as follows:
(H1) $X$ and $Y$ are real locally convex spaces. $(X, X^*)$ and $(Y, Y^*)$ form
dual pairs.
(H2) $F\colon X \to \,]-\infty, \infty]$ is a functional such that $F \not\equiv +\infty$.
(H3) $H\colon Y \to \,]-\infty, \infty]$ is convex lower semicontinuous and $H \not\equiv +\infty$.
(H4) $D\colon X \to Y$ is a given operator and $a$ is a fixed element in $Y$.
Proposition 51.8. With the assumptions (H1)-(H4), the minimum problems
(21) and (22) are mutually equivalent.
Proof. According to Theorem 51.A in Section 51.3, $H^{**} = H$; therefore,
$H(Du - a) = \sup_{p \in Y^*} \bigl[ \langle p, Du - a \rangle - H^*(p) \bigr]$. □
Corollary 51.9. If, in addition to (H1)-(H4), the operator $D\colon X \to Y$ is linear
and continuous, then the dual problem (22*) is equivalent to (21a*).
In the special case $F(u) \overset{\mathrm{def}}{=} -\langle b, u \rangle$ for all $u \in X$ and fixed $b \in X^*$, (21a*)
passes into (21b*).
We treat the simple proof in Problem 51.3. In the following Examples
51.10-51.13, we show that very different important problems can be reduced
to (21) by means of a suitable choice of F, H, and D.
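Before turning to the examples, the Fenchel duality between (21) and (21a*) can be checked on a one-dimensional instance. The data are assumptions made for this sketch: $X = Y = \mathbb{R}$, $F(u) = \tfrac12 (u - c)^2$, $H(v) = \tfrac12 v^2$, $Du = d \cdot u$, so that $D^* = d$, and the conjugates $F^*(w) = wc + \tfrac12 w^2$, $H^*(p) = \tfrac12 p^2$ are entered in closed form. Both problems are solved by a crude grid search.

```python
# illustrative data (assumptions, not from the text)
c, d, a = 1.0, 2.0, 0.5

def F(u):
    return 0.5 * (u - c) ** 2

def H(v):
    return 0.5 * v ** 2

def F_conj(w):
    # F*(w) = sup_u [w*u - F(u)] = w*c + w^2/2
    return w * c + 0.5 * w ** 2

def H_conj(p):
    # H*(p) = p^2/2
    return 0.5 * p ** 2

def primal(u):
    # functional in (21): F(u) + H(Du - a)
    return F(u) + H(d * u - a)

def dual(p):
    # functional in (21a*): -F*(-D* p) - H*(p) - <p, a>
    return -F_conj(-d * p) - H_conj(p) - p * a

grid = [k / 1000.0 for k in range(-3000, 3001)]
alpha = min(primal(u) for u in grid)
beta = max(dual(p) for p in grid)
u_bar = min(grid, key=primal)
p_bar = max(grid, key=dual)
```

For these data both values equal $(dc - a)^2 / (2(1 + d^2)) = 0.225$, and the maximizer satisfies $\bar p = H'(D\bar u - a) = d\bar u - a$.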
Example 51.10 (Minimum Problem). The problem
    $\inf_{u \in A} \bigl[ F(u) + H(u) \bigr] = \alpha,$
where $A \subseteq X$, passes into (21) when we set $X = Y$, $D = I$, $a = 0$, and extend
the functionals $F, H\colon A \to \mathbb{R}$ to $X$ by $F(u), H(u) \overset{\mathrm{def}}{=} +\infty$ for $u \in X - A$.
Example 51.11 (Classical Variational Problem). Let $X = \mathring{W}_2^1(G)$. The
variational problem
    $\inf_{u \in X} \Bigl[ -\int_G f u \, dx + 2^{-1} \int_G \sum_{i=1}^{N} (D_i u)^2 \, dx \Bigr] = \alpha$
belonging to the boundary value problem
    $G\colon -\Delta u = f; \quad \partial G\colon u = 0$
passes into (21), i.e., into
    $\inf_{u \in X} \bigl[ F(u) + H(Du) \bigr] = \alpha,$
where
    $F(u) = -\int_G f u \, dx, \quad H(v) = 2^{-1} \int_G \sum_{i=1}^{N} v_i^2 \, dx,$
$Du = (D_1 u, \ldots, D_N u)$, $v = (v_1, \ldots, v_N)$. Then, naturally, $Y = \prod_{i=1}^{N} L_2(G)$ and
$a = 0$.
We shall deal with more general problems of this sort for linear and
quasilinear elliptic differential equations in Sections 51.6 and 51.7.
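Example 51.12 (Generalized Linear Optimization Problem). The minimum
problem
    $\inf_{u} F(u) = \alpha, \quad u \in A, \quad Du - a \in B$    (23)
for $F\colon A \to \mathbb{R}$ and $A \subseteq X$, $B \subseteq Y$, can be written in the form (21) by setting
    $H(v) \overset{\mathrm{def}}{=} \begin{cases} 0 & \text{if } v \in B, \\ +\infty & \text{if } v \in Y - B \end{cases}$
and extending $F$ from $A$ to $X$ by $F(u) \overset{\mathrm{def}}{=} +\infty$ for all $u \in X - A$.
(23) contains linear optimization problems in locally convex spaces as a
special case, which we shall treat in Section 52.3.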
Example 51.13 (Spline Functions). The fundamental problem of the theory
of spline functions reads as follows: A real function $u\colon [a, b] \to \mathbb{R}$ is sought
as a solution of the minimum problem
    $\inf \int_a^b [u^{(q)}(t)]^2 \, dt = \alpha$
with the side condition
    $u(t_i) = r_i, \quad i = 1, \ldots, n, \quad u \in W_2^q(a, b),$    (24)
where $1 \le q \le n$. Here, the $n$ points $t_i$ in the compact interval $[a, b]$,
$a \le t_1 < \cdots < t_n \le b$, and the real numbers $r_i$ are given. This problem can be
written in the form (21), i.e.,
    $\inf_{u \in X} \bigl[ F(u) + H(Du) \bigr] = \alpha,$
by setting
    $X = W_2^q(a, b), \quad Y = L_2(a, b),$
    $H(v) = \|v\|_Y^2, \quad Du = u^{(q)}$ ($q$th derivative),
    $F(u) = \begin{cases} 0 & \text{if } u(t_i) = r_i \text{ for } i = 1, \ldots, n, \\ +\infty & \text{otherwise.} \end{cases}$
The solution $\bar u$ of this problem is unique. Here, $\bar u$ is the only function
with (24) that has the following properties:
(i) $\bar u$ is a polynomial of degree $2q - 1$ on $]t_i, t_{i+1}[$, $i = 1, \ldots, n - 1$.
(ii) $\bar u$ is a polynomial of degree $q - 1$ on $[a, t_1[$ and $]t_n, b]$.
(iii) $\bar u^{(2q-2)}$ is continuous.
Cf. Problem 51.5. We already treated a special case in Section 37.18.
51.5. Monotone Potential Operators and Duality
Our task is a detailed consideration of the minimum problem
    $\inf_{u \in X} \bigl[ H(Du) - b(u) \bigr] = \alpha,$    (25)
which plays an important role in many problems of elasticity and plasticity
theory. According to (21b*), the dual problem reads as follows:
    $\sup_{p \in K} \bigl[ -H^*(p) \bigr] = \beta,$    (25*)
where
    $K = \{ p \in Y^* : \langle p, Dv \rangle = b(v) \text{ for all } v \in X \}.$
This is equivalent to $K = \{ p \in Y^* : D^* p = b \}$. The Euler equation for (25)
is
    $a(u, v) = b(v)$ for all $v \in X$,    (26)
where $a(u, v) \overset{\mathrm{def}}{=} \langle H'(Du), Dv \rangle$. We seek $u \in X$. Observing that $H^{*\prime} = H'^{-1}$,
the Euler equation for (25*) can be written as
    $\langle H'^{-1}(p), q - p \rangle \ge 0$ for all $q \in K$.    (26*)
We seek $p \in K$.
In the applications in the following two sections, (26) is the generalized
problem for a linear or quasilinear elliptic boundary value problem. Then
(25) is the variational problem belonging to (26). The operator H' is
generated by the coefficient functions of the differential equation and, in
applications to mechanics, it depends on the properties of the materials
involved. Our assumptions read as follows:
(H1) $X$ and $Y$ are real separable reflexive B-spaces.
(H2) $H\colon Y \to \mathbb{R}$ is G-differentiable and $H'\colon Y \to Y^*$ is strictly monotone
and coercive.
(H3) $D\colon X \to Y$ is linear and isometric, i.e., $\|Dv\| = \|v\|$ for all $v \in X$. The
element $b \in X^*$ is given and fixed.
According to (14), we obtain the following basic formula for calculating the
dual problem (25*):
    $H^*(p) = -H(H'^{-1}(0)) + \int_0^1 \langle p, H'^{-1}(tp) \rangle \, dt$ for all $p \in Y^*$.    (27)
According to the main theorem on monotone operators (Theorem 26.A),
$H'^{-1}\colon Y^* \to Y$ exists as a strictly monotone operator. $H$ is strictly convex
because of the strict monotonicity of $H'$. According to Section 51.4, the
Lagrange function reads as follows:
    $L(u, p) = \langle p, Du \rangle - \langle b, u \rangle - H^*(p).$
Theorem 51.B. With the assumptions (H1)-(H4), the following four
assertions hold:
(1) Existence and uniqueness. (25) and (25*) have exactly one solution $\bar u$
and $\bar p$, respectively, and $\alpha = \beta$.
(2) Euler equations. (25) and (26) are mutually equivalent problems. The
same is true for (25*) and (26*).
(3) Extremal relation. $\bar p = H'(D\bar u)$ holds.
(4) Error estimates for $\bar u$ and $\alpha$. For all $u \in X$, $p \in K$,
    $-H^*(p) \le \alpha \le H(Du) - b(u).$    (28)
If there exist numbers $\gamma > 1$, $c > 0$ such that
    $\langle H'(r) - H'(q), r - q \rangle \ge c\|r - q\|^\gamma$ for all $r, q \in Y$,    (29)
then
    $\gamma^{-1} c \|u - \bar u\|^\gamma \le H(Du) - b(u) + H^*(p)$    (30)
for all $u \in X$, $p \in K$.
Remark 51.14. For $\gamma = 2$, assumption (29) is equivalent to the strong
monotonicity of $H'$.
If, e.g., with the aid of a Ritz method or a gradient method, one
constructs minimizing sequences $(u_n)$ and $(p_n)$ for (25) and (25*),
respectively, i.e.,
$H(Du_n) - b(u_n) \to \alpha$,
$-H^*(p_n) \to \beta$
as $n \to \infty$, then, since $\alpha = \beta$, (30) yields an error estimate for the solution $\bar u$ with
$\gamma^{-1} c \|u_n - \bar u\|^\gamma \le H(Du_n) - b(u_n) + H^*(p_n) \to 0$ as $n \to \infty$.
In applications to differential equations, (27) allows one to determine the
dual problem, for H' depends only on the coefficient functions of the
differential equations (cf. Sections 51.6 and 51.7). Since in applications to
elasticity and plasticity theory these coefficient functions are essentially
determined by the material law, the dual problem thus crucially depends on
the material law.
In elasticity theory, u is interpreted as displacement and p as stress. The
extremal relation p = H'(Du) is then exactly the stress-strain relationship
which is described, say, by Hooke's law. We shall discuss this in Chapter 62.
The principle of minimal potential energy and the Castigliano principle of
maximal dual energy correspond to (25) and (25*), respectively. (26) means
the principle of virtual work. Furthermore, $p \in K$ expresses the so-called
equilibrium condition between the stresses and external forces.
Proof. We make use of Section 51.2 in a crucial way.
(I) Solution of (25). We set $F(u) \overset{\mathrm{def}}{=} H(Du)$ for all $u \in X$. Then
$F'(u) = D^*H'(Du)$ holds since, for all $u, v \in X$,
$\langle F'(u), v \rangle = \lim_{t \to 0} t^{-1}[H(Du + tDv) - H(Du)] = \langle H'(Du), Dv \rangle$.
$F'$ is strictly monotone since $H'$ is strictly monotone and $Du \ne Dv$ if
$u \ne v$; therefore,
$\langle F'(u) - F'(v), u - v \rangle = \langle H'(Du) - H'(Dv), Du - Dv \rangle > 0$ for $u \ne v$.
$F'$ is coercive since $H'$ is coercive; indeed, because $\|Du\| = \|u\| \to \infty$, we
immediately have
$\langle F'(u), u \rangle / \|u\| = \langle H'(Du), Du \rangle / \|Du\| \to +\infty$
as $\|u\| \to \infty$.
According to Theorem 42.A in Section 42.5, (25) thus has exactly one
solution $\bar u$.
(II) Solution of (26). According to Theorem 42.A in Section 42.5, (25) and
(26) are mutually equivalent. Consequently, $\bar u$ is also a solution of (26).
(III) Solution of (25*). If we set $\bar p \overset{\mathrm{def}}{=} H'(D\bar u)$, then, by (26),
$\langle \bar p, Dv \rangle = b(v)$ for all $v \in X$; therefore, $\bar p \in K$. The generalized Young inequality (1)
yields
$H(Du) + H^*(p) \ge \langle p, Du \rangle = b(u)$ for all $(u, p) \in X \times K$,
$H(D\bar u) + H^*(\bar p) = \langle \bar p, D\bar u \rangle = b(\bar u)$.
The first line shows that $\beta \le \alpha$, and the second line yields $\bar p$ as solution of
(25*) with $\alpha = \beta$.
The uniqueness of $\bar p$ follows from the strict convexity of $H^*$, for, by (12),
$H^{*\prime} = H'^{-1}$, and $H'^{-1}$ is strictly monotone. Moreover, $K$ is convex.
(IV) Solution of (26*). According to Theorem 46.A, (b) in Section 46.1,
(25*) and (26*) are mutually equivalent.
(V) Error estimate. (29) yields
$\langle F'(u) - F'(v), u - v \rangle \ge c \|Du - Dv\|^\gamma = c \|u - v\|^\gamma$.
By (42.9), it follows that
$\gamma^{-1} c \|u - \bar u\|^\gamma + \langle F'(\bar u), u - \bar u \rangle \le F(u) - F(\bar u)$.
Now, one obtains (30) from $F(u) = H(Du)$, $F'(\bar u) = b$ as well as
$F(\bar u) - b(\bar u) = \alpha = \beta \ge -H^*(p)$.
□
51.6. Applications to Linear Elliptic Differential
Equations, Trefftz's Duality
We will apply the results of Section 51.5 to the minimum problem
$\inf_{u \in X} \int_G \Big[ 2^{-1} \sum_{i,j=1}^N a_{ij}(x) D_i u D_j u - f u \Big] dx = \alpha$, (31)
where $X = \mathring{W}_2^1(G)$, and for this purpose in preparation we note the dual
problem
$\sup_{p \in K} \Big[ -2^{-1} \int_G \sum_{i,j=1}^N a_{ij}^{(-1)}(x) p_i p_j \, dx \Big] = \beta$. (31*)
Here, $K$ is the set of all $p = (p_1, \ldots, p_N)$ such that $p_j \in L_2(G)$ for all $j$ and
$\int_G \sum_{j=1}^N p_j D_j v \, dx = \int_G f v \, dx$ for all $v \in X$. (31a*)
The matrix $(a_{ij}^{(-1)})$ is the inverse of $(a_{ij})$. If we set
$Lu \overset{\mathrm{def}}{=} -\sum_{i,j=1}^N D_i(a_{ij} D_j u)$,
then the classical boundary value problem
$G: Lu = f$; $\partial G: u = 0$ (32)
belongs to (31) as the Euler equation, with the corresponding generalized
problem: We seek $u \in X$ so that
$\int_G \sum_{i,j=1}^N a_{ij} D_j u D_i v \, dx = \int_G f v \, dx$ for all $v \in X$. (33)
In this connection, (33) results from (32) purely formally upon
multiplication by $v \in C_0^\infty(G)$ and integration by parts.
In order to clarify the connection with the Trefftz duality, we set
$S \overset{\mathrm{def}}{=} \{u \in C^2(\bar G): u = 0 \text{ on } \partial G\}$,
$T \overset{\mathrm{def}}{=} \{v \in C^2(\bar G): Lv = f \text{ on } G\}$
and, parallel to Section 37.9, consider the two problems
$\inf_{u \in S} H(Du) - b(u) = \alpha$, (34)
$\sup_{v \in T} [-H(Dv)] = \beta$ (34*)
with
$H(Du) \overset{\mathrm{def}}{=} 2^{-1} \int_G \sum_{i,j=1}^N a_{ij} D_i u D_j u \, dx$,
$b(u) \overset{\mathrm{def}}{=} \int_G f u \, dx$.
Below we shall show that
$H^*(p) = 2^{-1} \int_G \sum_{i,j=1}^N a_{ij}^{(-1)}(x) p_i p_j \, dx$ (35)
for all $p \in Y^*$ with $Y = \prod_{i=1}^N L_2(G)$ and $Y^* = Y$. In order to be able to
interpret (31a*) intuitively, we note that when $p_j \in C^1(\bar G)$ for all $j$, from the
equation (31a*), by integration by parts, it follows that
$\int_G \Big( \sum_{i=1}^N D_i p_i + f \Big) v \, dx = 0$ for all $v \in C_0^\infty(G)$,
i.e.,
$-\sum_{i=1}^N D_i p_i = f$ on $G$. (36)
In the general case, (31a*) indicates that (36) holds in the sense of
distributions.
Proposition 51.15. Suppose that the following four conditions hold:
(i) $G$ is a bounded region in $\mathbb{R}^N$, $N \ge 1$.
(ii) All $a_{ij}: G \to \mathbb{R}$ are measurable (for example, continuous), bounded,
and symmetric, i.e., $a_{ij} = a_{ji}$ for all $i, j$.
(iii) $L$ is strongly elliptic, i.e.,
$\sum_{i,j=1}^N a_{ij}(x) d_i d_j \ge c \sum_{i=1}^N d_i^2$
for all $d \in \mathbb{R}^N$, $x \in G$ and fixed $c > 0$.
(iv) $f$ is a fixed element in $L_2(G)$.
Then:
(1) The problems (31) and (31*) have exactly one solution $\bar u$ and $\bar p$,
respectively, and $\alpha = \beta$.
(2) $\bar u$ is also the unique solution of (33).
(3) The following extremal relation holds:
$\bar p_j = \sum_{i=1}^N a_{ij} D_i \bar u$ for all $j$.
(4) The following error estimates hold for $\bar u$ and $\alpha$, for all $u \in X$, $p \in K$:
$-H^*(p) \le \alpha \le H(Du) - b(u)$,
$2^{-1} c \|u - \bar u\|_X^2 \le H(Du) - b(u) + H^*(p)$.
Remark 51.16. In assertion (4) we make use of
$\|u - \bar u\|_X^2 = \int_G \sum_{i=1}^N (D_i u - D_i \bar u)^2 \, dx$. (37)
If we apply the Poincaré inequality
$c_1 \int_G |u - \bar u|^2 \, dx \le \|u - \bar u\|_X^2$ for all $u \in X$,
then we get an estimate for the integral appearing on the left-hand side. We
gave estimates for the constant $c_1 > 0$ in Problem 22.1.
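The two-sided bounds in Proposition 51.15, (4) can be tried out on a one-dimensional model problem. The following is only a minimal sketch under assumptions that are not in the text: $G = ]0,1[$, $a_{11} = 1$ (so $c = 1$), $f = 1$, with exact solution $\bar u(x) = x(1-x)/2$ and $\alpha = -1/24$. The trial function $u = c_{\mathrm{trial}} \sin \pi x$ and the dual trial function $p(x) = k - x$ (which satisfies $-p' = f$, i.e., $p \in K$) are illustrative choices, and `simpson` is a helper written for this sketch.

```python
import math

def simpson(g, a, b, n=2000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

f = lambda x: 1.0

# Exact solution of -u'' = 1, u(0) = u(1) = 0, and the minimal value alpha.
u_exact = lambda x: 0.5 * x * (1 - x)
du_exact = lambda x: 0.5 - x
alpha = simpson(lambda x: 0.5 * du_exact(x) ** 2 - f(x) * u_exact(x), 0, 1)

# Primal trial function (one-term Ritz ansatz): upper bound H(Du) - b(u).
c_trial = 4 / math.pi ** 3
u = lambda x: c_trial * math.sin(math.pi * x)
du = lambda x: c_trial * math.pi * math.cos(math.pi * x)
upper = simpson(lambda x: 0.5 * du(x) ** 2 - f(x) * u(x), 0, 1)

# Dual trial function p with -p' = f: lower bound -H*(p).
k = 0.4
p = lambda x: k - x
H_star = simpson(lambda x: 0.5 * p(x) ** 2, 0, 1)
lower = -H_star

# Left-hand side of the error estimate with c = 1: (1/2) * ||u - u_exact||_X^2.
err_sq = simpson(lambda x: (du(x) - du_exact(x)) ** 2, 0, 1)
gap = upper - lower

print(lower, alpha, upper)
print(0.5 * err_sq, gap)
```

The first printed line should bracket $\alpha$ from below and above; the second compares the energy-norm error with the computable duality gap $H(Du) - b(u) + H^*(p)$, which needs no knowledge of $\bar u$.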
Corollary 51.17 (Classical Solution). If in Proposition 51.15, the functions $f$,
$a_{ij}$, and the boundary $\partial G$ are sufficiently smooth, then $\bar u \in C^2(\bar G)$ and $\bar u$ is a
solution of the classical boundary value problem (32) and of the classical
variational problems (34) and (34*), with $\alpha = \beta$. Furthermore, the following
error estimates hold for $\bar u$ and $\alpha$:
$2^{-1} c c_1 \int_G |u - \bar u|^2 \, dx \le 2^{-1} c \|u - \bar u\|_X^2 \le H(Du) - b(u) + H(Dv)$,
$-H(Dv) \le \alpha \le H(Du) - b(u)$
for all $u \in S$, $v \in T$.
This justifies Section 37.9. As the following proof shows, strongly elliptic
differential equations of order 2m can be handled in an analogous manner.
Proof. We will apply Theorem 51.B in Section 51.5 and to this end we set
$H(v) = 2^{-1} \int_G \sum_{i,j=1}^N a_{ij} v_i v_j \, dx$ for all $v \in Y$,
$Du \overset{\mathrm{def}}{=} (D_1 u, \ldots, D_N u)$ for all $u \in X$.
Since $X = \mathring{W}_2^1(G)$ and $Y = \prod_{i=1}^N L_2(G)$, the operator $D: X \to Y$ is linear and
continuous, with $\|Du\|_Y = \|u\|_X$ for all $u \in X$. Note the definition of $\|\cdot\|_X$ in
(37) and that
$\|p\|_Y^2 = \int_G \sum_{i=1}^N p_i^2 \, dx$.
If we set
$A(r, q) = \int_G \sum_{i,j=1}^N a_{ij} r_i q_j \, dx$ for all $r, q \in Y$,
then there results
$\langle H'(r), q \rangle = \frac{d}{dt} H(r + tq) \Big|_{t=0} = A(r, q)$.
Thus, from the strong ellipticity it follows that
$\langle H'(q) - H'(r), q - r \rangle = A(q - r, q - r) \ge c \|q - r\|_Y^2$ for all $q, r \in Y$.
This means that $H'$ is strongly monotone and hence coercive as well (see
Fig. 27.1).
We now calculate $H^*(p)$ by Section 51.5 with the aid of the formula
$H^*(p) = -H(H'^{-1}(0)) + \int_0^1 \langle p, H'^{-1}(tp) \rangle_Y \, dt$. (38)
First, since $H'(0) = 0$, we have $H(H'^{-1}(0)) = 0$. Let $w = H'(r)$. For almost
all $x \in G$,
$w_j(x) = \sum_{i=1}^N a_{ij}(x) r_i(x)$.
This follows from
$\langle w, q \rangle_Y = \int_G \sum_{j=1}^N w_j q_j \, dx = \langle H'(r), q \rangle = \int_G \sum_{i,j=1}^N a_{ij}(x) r_i q_j \, dx$ for all $q \in Y$.
Thus,
$r_i(x) = \sum_{j=1}^N a_{ij}^{(-1)}(x) w_j(x)$,
and from (38) it follows that
$H^*(p) = \int_0^1 t \, dt \int_G \sum_{i,j=1}^N a_{ij}^{(-1)} p_i p_j \, dx = 2^{-1} \int_G \sum_{i,j=1}^N a_{ij}^{(-1)} p_i p_j \, dx$.
Now, Proposition 51.15 follows immediately from Theorem 51.B in
Section 51.5. □
Proof of Corollary 51.17. $\bar u \in C^2(\bar G)$ and (32) are obtained from the
regularity assertions which we gave in Problem 22.8.
Ad (34). $u \in S$ implies $u \in X$. Since $\bar u \in S$, one thus needs only to minimize
over $S$ in (31).
Ad (34*). Let
$P = \Big\{ (p_1, \ldots, p_N): p_j = \sum_{i=1}^N a_{ij} D_i v, \ v \in T \Big\}$.
From $p \in P$ it follows that
$-\sum_{j=1}^N D_j p_j = Lv = f$ on $G$,
because $v \in T$; therefore, $p \in K$. According to Proposition 51.15, (3), we
obtain $\bar p \in P$. Hence, in (31*) one needs to maximize only over $p \in P$.
Furthermore, by (35),
$H^*(p) = H(Dv)$ for all $p \in P$. (39)
By virtue of Proposition 51.15, (3), $\bar u$ is thus a solution of (34*).
The error estimates result from Proposition 51.15 and (39). □
51.7. Application to Quasilinear Elliptic Differential
Equations
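Parallel to Section 51.6, we consider the minimum problem
$\inf_{u \in X} \int_G \Big[ p^{-1} \sum_{i=1}^N |D_i u|^p - f u \Big] dx = \alpha$, (40)
where $X = \mathring{W}_p^1(G)$, $p \ge 2$. In preparation, we formulate the dual problem
$\sup_{p \in K} \Big[ -\sigma^{-1} \int_G \sum_{i=1}^N |p_i|^\sigma \, dx \Big] = \beta$, (40*)
where $p^{-1} + \sigma^{-1} = 1$. Here, $K$ is the set of all $p = (p_1, \ldots, p_N)$ with $p_j \in L_\sigma(G)$
for all $j$ and
$\int_G \sum_{i=1}^N p_i D_i v \, dx = \int_G f v \, dx$ for all $v \in X$.
(In the exponents and in $p^{-1}$, $p$ denotes the fixed number $p \ge 2$; elsewhere $p = (p_1, \ldots, p_N)$ is the dual variable.)
To (40) formally belongs the classical boundary value problem
$G: -\sum_{i=1}^N D_i(|D_i u|^{p-2} D_i u) = f$; $\partial G: u = 0$ (41)
with the corresponding generalized problem: We seek a $u \in X$ such that
$\int_G \sum_{i=1}^N |D_i u|^{p-2} D_i u D_i v \, dx = \int_G f v \, dx$ for all $v \in X$. (42)
Here, (42) formally results from (41) upon multiplication by $v \in C_0^\infty(G)$ and
subsequent integration by parts.
For the following error estimates, we recall inequality (25.45), i.e.,
$(|y|^{p-2} y - |z|^{p-2} z)(y - z) \ge c |y - z|^p$
for all $y, z \in \mathbb{R}$ and fixed $c > 0$. Furthermore, in the proof, it turns out that
$H^*(p) = \sigma^{-1} \int_G \sum_{i=1}^N |p_i|^\sigma \, dx$
for all $p \in Y^*$, where $Y = \prod_{i=1}^N L_p(G)$ and thus $Y^* = \prod_{i=1}^N L_\sigma(G)$.
Moreover, we note that
$H(Du) = p^{-1} \int_G \sum_{i=1}^N |D_i u|^p \, dx$ for all $u \in X$,
$b(u) = \int_G f u \, dx$,
$H(v) = p^{-1} \int_G \sum_{i=1}^N |v_i|^p \, dx$ for all $v \in Y$; $Du = (D_1 u, \ldots, D_N u)$;
$\|u\|_X^p = \int_G \sum_{i=1}^N |D_i u|^p \, dx$,
$\|p\|_{Y^*}^\sigma = \int_G \sum_{i=1}^N |p_i|^\sigma \, dx$.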
Proposition 51.18. Let $G$ be a bounded region in $\mathbb{R}^N$, $N \ge 1$, and let $f$ be a
fixed element from $L_\sigma(G)$. Then the following assertions hold:
(1) The problems (40) and (40*) have exactly one solution $\bar u$ and $\bar p$,
respectively, and $\alpha = \beta$.
(2) $\bar u$ is the unique solution of (42).
(3) The extremal relation
$\bar p_j = |D_j \bar u|^{p-2} D_j \bar u$, $j = 1, \ldots, N$,
between $\bar u$ and $\bar p$ holds.
(4) For $\bar u$ and $\alpha$, the following error estimates hold:
$-H^*(p) \le \alpha \le H(Du) - b(u)$,
$p^{-1} c d \int_G |u - \bar u|^p \, dx \le p^{-1} c \|u - \bar u\|_X^p \le H(Du) - b(u) + H^*(p)$
for all $u \in X$, $p \in K$ where $d > 0$ is some constant.
The proof proceeds parallel to Section 51.6 (cf. Problem 51.4). In an
analogous manner, one can handle more general quasilinear elliptic
differential equations which lead to uniformly monotone potential operators in the
sense of Section 42.7b.
Problems
51.1. Calculation. Prove the following calculation rules:
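(i) $(F + a)^* = F^* - a$ for all $a \in \mathbb{R}$.
(ii) $(\lambda F)^*(u^*) = \lambda F^*(\lambda^{-1} u^*)$ for all $\lambda > 0$.
(iii) $G^*(u^*) = F^*(u^*) + \langle u^*, v \rangle$ when $G(u) = F(u - v)$.
(iv) If $(F_\alpha)$ is a family of functionals, then
$(\inf_\alpha F_\alpha)^* = \sup_\alpha F_\alpha^*$,
$(\sup_\alpha F_\alpha)^* \le \inf_\alpha F_\alpha^*$.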
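Rules (i)-(iii) can be checked numerically on the real line before proving them. The sketch below is an illustration only: it assumes $X = \mathbb{R}$, $F(u) = u^2/2$ (so $F^*(u^*) = (u^*)^2/2$), and replaces the supremum in the definition of the conjugate by a maximum over a finite grid; the helper `conjugate` and all numerical values are choices made for this example.

```python
def conjugate(F, u_star, grid):
    """Discrete Legendre-Fenchel transform: max over the grid of u* u - F(u)."""
    return max(u_star * u - F(u) for u in grid)

grid = [-10 + 0.01 * k for k in range(2001)]   # grid on [-10, 10]
F = lambda u: 0.5 * u * u
F_star = lambda w: conjugate(F, w, grid)

w, a, lam, v = 1.5, 0.7, 3.0, 2.0

# Each pair is (left-hand side, right-hand side) of the corresponding rule.
rule_i = (conjugate(lambda u: F(u) + a, w, grid), F_star(w) - a)
rule_ii = (conjugate(lambda u: lam * F(u), w, grid), lam * F_star(w / lam))
rule_iii = (conjugate(lambda u: F(u - v), w, grid), F_star(w) + w * v)

for name, (lhs, rhs) in [("(i)", rule_i), ("(ii)", rule_ii), ("(iii)", rule_iii)]:
    print(name, lhs, rhs)
```

The grid is chosen so that the maximizers lie (up to rounding) on grid points; for other data the two sides agree only up to the discretization error.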
51.2. Geometric interpretation of $F^*$, $F^{**}$, $\partial F$, and the $\Gamma$-regularization. Let $F$:
$X \to [-\infty, \infty]$ be a functional on the real locally convex space $X$ and
suppose $(X, X^*)$ forms a dual pair. We first give several definitions:
(i) $H: X \to \mathbb{R}$ is said to be affine and continuous if and only if $H(v) =
\langle u^*, v \rangle + a$ for all $v \in X$ and fixed $u^* \in X^*$, $a \in \mathbb{R}$. We denote the set
of these $H$ by $\mathfrak{H}$.
(ii) $H$ is called an affine minorant of $F$ if and only if $H \in \mathfrak{H}$ and $F(v) \ge
H(v)$ for all $v \in X$. Let $\mathfrak{M}$ denote the set of all these minorants.
(iii) By the $\Gamma$-regularization of $F$, $\Gamma(F)$, we understand
$\Gamma(F) \overset{\mathrm{def}}{=} \sup_{H \in \mathfrak{M}} H$;
therefore $\Gamma(F) = -\infty$ for $\mathfrak{M} = \emptyset$.
(iv) $H$ is called a hyperplane of support for $F$ at $u$ if and only if $H \in \mathfrak{M}$ and
$H(u) = F(u)$ (see Fig. 51.4).
Prove the following assertions for $H(v) = \langle u^*, v \rangle + a$:
51.2a. Interpretation of $F^*(u^*)$. If $H \in \mathfrak{M}$, then the maximal possible $a$ is
$-F^*(u^*)$ when $F^*(u^*)$ is finite. For $F^*(u^*) = +\infty$, there is no $a$, i.e.,
$\mathfrak{M} = \emptyset$, and when $F^*(u^*) = -\infty$, every $a \in \mathbb{R}$ is possible.
51.2b. Subgradient. $H$ is a hyperplane of support for $F$ at $u$ if and only if
$u^* \in \partial F(u)$.
Figure 51.4
51.2c. Interpretation of $F^{**}(u)$. If $H$ is a hyperplane of support for $F$ at $u$, then
$a = F(u) - \langle u^*, u \rangle = -F^*(u^*)$, $H$ is a hyperplane of support for $F^{**}$ at $u$
and $F(u) = F^{**}(u)$.
51.2d. Meaning of $F(u) = F^{**}(u)$. If $F(u) = F^{**}(u)$, then the hyperplanes of
support to $F$ and $F^{**}$ at $u$ are equal, i.e., $\partial F(u) = \partial F^{**}(u)$.
51.2e. Interpretation of $F^{**}$. We always have $\Gamma(F) = F^{**}$.
Solution: Concerning Problems 51.2a and 51.2b, compare the definition
of $F^*$ and $\partial F(u)$.
Ad 51.2c. By the definition of $F^{**}$,
$\langle u^*, v \rangle - F^*(u^*) \le F^{**}(v)$;
therefore, $H(v) \le F^{**}(v) \le F(v)$ for all $v \in X$, according to Proposition
51.6, (3). Now, $H(u) = F(u)$ yields $F^{**}(u) = F(u)$.
Ad 51.2d. Compare Problem 51.2c.
Ad 51.2e. One can restrict oneself to those $H$ with maximal $a$, i.e.,
$H(v) = \langle u^*, v \rangle - F^*(u^*)$. Now, (18) yields the assertion.
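51.3. Proof of Corollary 51.9. Solution:
$\inf_{u \in X_0} L(u, p) = \inf_{u \in X_0} F(u) + \langle p, Du - a \rangle - H^*(p)$
$= -\sup_{u \in X_0} [\langle -D^*p, u \rangle - F(u)] - H^*(p) - \langle p, a \rangle$
$= -F^*(-D^*p) - H^*(p) - \langle p, a \rangle$.
In this connection, take note of $X_0 = \operatorname{dom} F$ and (6). For $F(u) = -\langle b, u \rangle$,
take into account that
$\inf_{u \in X} L(u, p) = \inf_{u \in X} \langle D^*p - b, u \rangle - \langle p, a \rangle - H^*(p)$
$= \begin{cases} -\langle p, a \rangle - H^*(p) & \text{if } D^*p - b = 0, \\ -\infty & \text{if } D^*p - b \ne 0. \end{cases}$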
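51.4. Proof of Proposition 51.18. Solution: Follow a line of reasoning analogous to
the proof of Proposition 51.15. In particular,
$\langle H'(r), q \rangle_Y = \int_G \sum_{i=1}^N |r_i|^{p-2} r_i q_i \, dx$ for all $q \in Y$.
From $w = H'(r)$, we obtain that for almost all $x \in G$,
$w_i(x) = |r_i(x)|^{p-2} r_i(x)$; therefore $r_i(x) = |w_i(x)|^{\sigma-2} w_i(x)$,
and thus
$H^*(p) = -H(H'^{-1}(0)) + \int_0^1 \langle p, H'^{-1}(tp) \rangle_Y \, dt$
$= \int_0^1 t^{\sigma-1} \, dt \int_G \sum_{i=1}^N |p_i|^\sigma \, dx$
$= \sigma^{-1} \int_G \sum_{i=1}^N |p_i|^\sigma \, dx$.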
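The computation of $H^*$ via the integral formula can be tested pointwise. The following sketch assumes the scalar model $H(r) = |r|^\rho / \rho$ on $\mathbb{R}$, where `rho` stands for the exponent written $p$ above (renamed here only to avoid a clash with the dual variable `p`); it evaluates $\int_0^1 p \, H'^{-1}(tp) \, dt$ by a midpoint rule and compares the result with $|p|^\sigma / \sigma$. The function names and the sample values are hypothetical.

```python
def h_prime_inverse(w, rho):
    """Inverse of H'(r) = |r|^(rho-2) r on the real line, for rho > 1."""
    sigma = rho / (rho - 1)          # conjugate exponent: 1/rho + 1/sigma = 1
    return abs(w) ** (sigma - 2) * w if w != 0 else 0.0

def h_star_via_formula(p, rho, n=20000):
    """Midpoint rule for int_0^1 p * H'^{-1}(t p) dt; here H(H'^{-1}(0)) = 0."""
    dt = 1.0 / n
    return sum(p * h_prime_inverse((k + 0.5) * dt * p, rho) for k in range(n)) * dt

rho = 3.0
sigma = rho / (rho - 1)
p = -1.7

approx = h_star_via_formula(p, rho)
exact = abs(p) ** sigma / sigma      # closed form |p|^sigma / sigma
print(approx, exact)
```

Both printed numbers should agree up to the quadrature error of the midpoint rule.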
51.5.* Spline functions. Prove the assertions in Example 51.13. Hint: Compare
Laurent (1972, M), Theorems 4.1.2 and 4.1.3. That monograph contains a
detailed treatment of spline functions. For numerical questions we
recommend de Boor (1978, M).
51.6. Special conjugate functionals for bilinear and linear forms. Calculate $F^*$, $G^*$
for $F(u) = 2^{-1}(Au|u)$, $G(u) = (b|u)$. Here, $A: X \to X$ is a symmetric
strongly positive operator on the real H-space $X$ and $b \in X$.
Solution: From (14) it follows that $F' = A$, $F^*(u) = 2^{-1}(A^{-1}u|u)$ and
$G^*(u) = \begin{cases} 0 & \text{if } u = b, \\ +\infty & \text{if } u \ne b. \end{cases}$
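The formula $F^*(u) = 2^{-1}(A^{-1}u|u)$ can be illustrated in $X = \mathbb{R}^2$. The sketch below assumes a concrete symmetric positive definite $2 \times 2$ matrix `A` (an arbitrary choice) and checks two things: the generalized Young inequality $F(u) + F^*(w) \ge (w|u)$ at random points, and equality at $u = A^{-1}w$, where $F'(u) = Au = w$. The helpers `dot` and `apply` are written for this example only.

```python
import random

A = [[2.0, 1.0], [1.0, 3.0]]          # symmetric, positive definite (assumed example)
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]

def dot(x, y):
    return x[0] * y[0] + x[1] * y[1]

def apply(M, x):
    return [M[0][0] * x[0] + M[0][1] * x[1], M[1][0] * x[0] + M[1][1] * x[1]]

F = lambda u: 0.5 * dot(apply(A, u), u)
F_star = lambda w: 0.5 * dot(apply(A_inv, w), w)

w = [1.0, -2.0]
u_max = apply(A_inv, w)               # maximizer of (w|u) - F(u)
young_gap_at_max = F(u_max) + F_star(w) - dot(w, u_max)

random.seed(0)
samples = [[random.uniform(-5, 5), random.uniform(-5, 5)] for _ in range(1000)]
young_gaps = [F(u) + F_star(w) - dot(w, u) for u in samples]

print(young_gap_at_max, min(young_gaps))
```

The gap vanishes (up to rounding) at the maximizer and is nonnegative at all sampled points.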
51.7.* Conjugate functionals for integral expressions. Let
$F(u) = \int_G f(x, u(x)) \, dx$ for all $u \in \prod_{i=1}^n L_{p_i}(G)$.
Show that the following convenient formula holds:
$F^*(u^*) = \int_G f^*(x, u^*(x)) \, dx$ for all $u^* \in \prod_{i=1}^n L_{q_i}(G)$.
Here, $f^*$ is the conjugate function to $f$ with respect to $u$, i.e.,
$f^*(x, u^*) = \sup_{u \in \mathbb{R}^n} \sum_{i=1}^n u_i^* u_i - f(x, u)$
for almost all $x \in G$.
The assumptions are:
(i) $G$ is an open bounded nonempty set in $\mathbb{R}^n$, $n \ge 1$.
(ii) $f: G \times \mathbb{R}^n \to \mathbb{R}$ satisfies the Carathéodory condition (e.g., $f$ is
continuous).
(iii) For all $(x, u) \in G \times \mathbb{R}^n$, the growth condition is satisfied:
$|f(x, u)| \le a(x) + b \sum_{i=1}^n |u_i|^{p_i}$
for fixed $a \in L_1(G)$, $b > 0$, $1 < p_i < \infty$.
(iv) $p_i^{-1} + q_i^{-1} = 1$, $i = 1, \ldots, n$.
Hint: Compare Ekeland and Temam (1974, M), Chapter IX, Proposition
2.1. The proof depends on appropriate approximations. For this problem,
also study Rockafellar (1971), (1976, S).
51.8. Criterion for lower semicompactness. Show: Let $F: \mathbb{R}^N \to ]-\infty, \infty]$, $F \not\equiv +\infty$,
be lower semicontinuous. Then $F$ is lower semicompact when $F^*$ is
continuous at zero or $F(u)/\|u\| \to a$ as $\|u\| \to \infty$, $a > 0$.
Hint: Compare Aubin (1979, M), pages 76-78. Theorem 38.B in Section
38.3 shows the meaning of lower semicompactness.
51.9. Torsion of a rod of nonlinear elastic material. For
$G: -\sum_{j=1}^N D_j[\varphi(|\operatorname{grad} u|^2) D_j u] = f$; $\partial G: u = 0$,
where $|\operatorname{grad} u|^2 = \sum_{j=1}^N (D_j u)^2$, set up the corresponding variational problem
and the dual problem in the sense of Section 51.5. Assume that the function
$\varphi: [0, \infty[ \to \mathbb{R}$ is strongly monotone and Lipschitz continuous.
Hint: Compare Gajewski (1970). Further similar problems from
nonlinear elasticity theory can be found in Langenbach (1976, M).
Moreover, Gajewski (1970) contains references to the literature for
appropriate problems in rheology.
References to the Literature
Classical works: Young (1912) (Young's inequality); Fenchel (1949), (1951,
L) (conjugate functionals were first introduced in these papers).
Calculation with conjugate functionals: Rockafellar (1970, M) (in $\mathbb{R}^N$);
Moreau (1971); Ekeland and Temam (1974, M); Ioffe and Tihomirov (1974,
M); Barbu and Precupanu (1978, M); Aubin (1979, M).
Duality for monotone operators: Gajewski, Gröger, and Zacharias (1974,
M); Kluge (1979, M).
Applications to rheology: Gajewski (1970).
Duality and partial differential equations: Ekeland and Temam
(1974, M).
Numerical methods: Glowinski, Lions, and Tremolieres (1976, M).
Duality for integral functionals: Rockafellar (1971), (1976); Ekeland and
Temam (1974, M) (application to nonconvex variational problems).
(Compare, also, the references to the literature in Section 37.9.)
CHAPTER 52
General Duality Principle by Means of
Perturbed Problems and Conjugate
Functionals
The study of mathematics, like the Nile, begins in minuteness, but ends in
magnificence.
C. C. Colton (1820)
In Section 37.10 we observed the following general principle for linear
optimization problems in $\mathbb{R}^N$:
The consistency of (P) and (P*) implies that (P) and (P*)
are solvable.
Here, (P) and (P*) are the original and the dual problems, respectively. This
very convenient existence principle no longer holds for more general
optimization problems. However, in Section 52.1 we shall justify the following
principle:
The consistency of (P) and (P*), as well as the stability of
(P*), implies that (P) is solvable.
More precisely, the following holds when (P) and (P*) are consistent;
The stability of (P*) is equivalent to the solvability of (P)
and inf (P) = sup (P*).
Furthermore, (P) and (P*) can be interchanged everywhere. This clarifies
the basic meaning of the stability concept in the sense of Rockafellar for the
existence of solutions.
The basic idea is that, together with the given problem, one also
studies perturbed problems. Parallel to the construction of the S-function in
the classical Hamilton-Jacobi theory and the Bellman dynamic optimization,
S-functionals are considered. The dual problems arise by means of
conjugate functionals without the explicit use of Lagrange functions.
We treat Fenchel duality and linear optimization problems as
applications. In Section 52.5 we describe a duality principle for general, not
necessarily convex, control problems, which is based on the Bellman
differential equation.
52.1. The S-Functional, Stability, and Duality
Together with the original problem
$\inf_{u \in X} F(u) = \alpha$, (P)
we consider the dual problem
$\sup_{q^* \in Q^*} [-M^*(0, q^*)] = \beta$. (P*)
In this connection, the functional $M: X \times Q \to ]-\infty, \infty]$ arises from the
requirement that the problems
$\inf_{u \in X} M(u, q) = S(q)$ (1)
represent a perturbation of (P), i.e.,
$M(u, 0) = F(u)$ for all $u \in X$.
We consider
$\sup_{q^* \in Q^*} [-M^*(u^*, q^*)] = -S_*(u^*)$ (2)
as the perturbed problem for (P*). By definition, $S(q)$ and $-S_*(u^*)$ are
the perturbed extremal values with $S(0) = \alpha$ and $-S_*(0) = \beta$, respectively.
We denote the functional conjugate to $M$ by $M^*$. The functional $S_*$ on $X^*$ is to be distinguished from the conjugate functional $S^*$ of $S$ on $Q^*$.
Definition 52.1. (P) and (P*) are called stable if and only if $\partial S(0) \ne \emptyset$ and
$\partial S_*(0) \ne \emptyset$, respectively, hold. (P) is said to be normal if and only if
$-\infty < \beta = \alpha < +\infty$.
This stability concept generalizes the differentiability properties of the
S-function of classical Hamilton-Jacobi theory. The classical S-function
was introduced in Section 37.4 by means of perturbed variational problems.
The Bellman S-function was obtained in the same way in Section 37.20. If
(P) is normal, then, by definition, no duality gaps occur.
Our assumptions read as follows:
(H1) $X$ and $Q$ are real locally convex spaces. $(X, X^*)$ and $(Q, Q^*)$ form
dual pairs.
(H2) $M: X \times Q \to ]-\infty, \infty]$ is convex and lower semicontinuous, where
$M(u, 0) = F(u)$ for all $u \in X$.
(H3) Consistency. There exist $u_0 \in X$, $q_0^* \in Q^*$ such that $F(u_0)$, $M^*(0, q_0^*)
\ne +\infty$.
Theorem 52.A (Rockafellar (1967)). With the assumptions (H1)-(H3), the
following five assertions hold:
(a) Weak duality. $-\infty < \beta \le \alpha < +\infty$ holds.
(b) Solution of (P). The following statements are equivalent:
(i) (P) has a solution and $\alpha = \beta$.
(ii) (P*) is stable.
(c) Solution of (P*). The following statements are equivalent:
(i) (P*) has a solution and $\alpha = \beta$.
(ii) (P) is stable.
(d) Solution set. If $\alpha = \beta$, then the solution sets of (P) and (P*) are equal
to $\partial S_*(0)$ and $\partial S(0)$, respectively.
(e) Extremal relation. The following statements are equivalent:
(i) $u$ solves (P), $q^*$ solves (P*), and $\alpha = \beta$.
(ii) $M(u, 0) + M^*(0, q^*) = 0$.
(iii) $(0, q^*) \in \partial M(u, 0)$.
According to this theorem, it is important to know conditions which
assure the stability of (P) or (P*). This assurance is provided by the
following Slater conditions:
(SC) $q \mapsto M(u_1, q)$ is continuous at $q = 0$ for a fixed $u_1 \in X$.
(SC*) $u^* \mapsto M^*(u^*, q^*)$ is continuous at $u^* = 0$ for a fixed $q^* \in Q^*$.
Corollary 52.2. With the assumptions (H1)-(H3), the following holds: (P)
and (P*) are stable when (SC) and (SC*), respectively, hold.
We sharpen Theorem 52.A in Problem 52.3. We explain the connection
with saddle points of a Lagrange function in Problem 52.4.
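The role of the S-functional can be made concrete in a scalar example. The sketch below is an illustration under assumptions that are not in the text: $X = Q = \mathbb{R}$, $M(u, q) = u^2/2 + (u - a - q)^2/2$ with $a = 1$, and all infima and suprema replaced by minima and maxima over finite grids. It computes $S(q) = \inf_u M(u, q)$, the value $\alpha = S(0)$, the conjugate $S^*$, the value $\beta = \sup [-S^*(q^*)]$, and compares the slope of $S$ at $q = 0$ with the maximizing $q^*$, in the spirit of assertion (d).

```python
a = 1.0
M = lambda u, q: 0.5 * u * u + 0.5 * (u - a - q) ** 2   # M(u, 0) = F(u)

u_grid = [-4 + 0.01 * k for k in range(801)]
q_grid = [-4 + 0.02 * k for k in range(401)]

# S(q) = inf_u M(u, q), approximated by a minimum over the u-grid.
S = [min(M(u, q) for u in u_grid) for q in q_grid]

i0 = 200                                   # index of the grid point q = 0
alpha = S[i0]                              # alpha = S(0)
slope_at_0 = (S[i0 + 1] - S[i0 - 1]) / 0.04   # central difference for S'(0)

# S*(q*) = sup_q q* q - S(q); beta = sup_{q*} [-S*(q*)].
qs_grid = [-1 + 0.01 * k for k in range(301)]
S_conj = [max(qs * q - s for q, s in zip(q_grid, S)) for qs in qs_grid]
beta = max(-v for v in S_conj)
q_star = qs_grid[max(range(len(qs_grid)), key=lambda k: -S_conj[k])]

print(alpha, beta)          # no duality gap expected in this convex example
print(slope_at_0, q_star)   # the subgradient of S at 0 and the dual solution
```

In this smooth convex example $S$ is differentiable at $0$, so $\partial S(0)$ is the single point $S'(0)$, which should coincide with the computed dual solution.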
52.2. Proof of Theorem 52.A
The proof rests on the repeated application of convex analysis. In particular,
we make use of the definition of
$M^*(u^*, q^*) = \sup_{(u,q) \in X \times Q} \langle u^*, u \rangle_X + \langle q^*, q \rangle_Q - M(u, q)$,
the generalized Young inequality
$M(u, 0) + M^*(0, q^*) \ge \langle 0, u \rangle_X + \langle q^*, 0 \rangle_Q = 0$, (3)
$M(u, 0) + M^*(0, q^*) = 0$ if and only if $(0, q^*) \in \partial M(u, 0)$, (4)
as well as
$S^*(q^*) = M^*(0, q^*)$, (5)
$S(0) = \alpha$, $S^{**}(0) = \beta$, (6)
$M^{**} = M$, (7)
and the generalized Young inequality for $S$.
Step 1: Justification of (3)-(7). By hypothesis, $(X, X^*)$, $(Q, Q^*)$ form dual
pairs. Then $(X \times Q, X^* \times Q^*)$ also forms a dual pair with
$\langle (u^*, q^*), (u, q) \rangle_{X \times Q} = \langle u^*, u \rangle_X + \langle q^*, q \rangle_Q$.
Thus, the above formula for $M^*$ is well defined. (3) and (4) follow from
Proposition 51.2. Relation (5) follows from
$S^*(q^*) = \sup_{q \in Q} \langle q^*, q \rangle - S(q)$
$= \sup_{q \in Q} \sup_{u \in X} \langle q^*, q \rangle - M(u, q)$
$= \sup_{(u,q) \in X \times Q} \langle q^*, q \rangle - M(u, q) = M^*(0, q^*)$.
From this we immediately obtain (6) as well, because
$S^{**}(0) = \sup_{q^* \in Q^*} [-S^*(q^*)]$
$= \sup_{q^* \in Q^*} [-M^*(0, q^*)] = \beta$.
Step 2: Double Dualization, (P**) = (P). The dual problem (P*) is
equivalent to
$\inf_{q^* \in Q^*} M^*(0, q^*) = -\beta$. (P*)
As the corresponding perturbed problem, we consider
$\inf_{q^* \in Q^*} M^*(u^*, q^*) = S_*(u^*)$.
According to our general construction principle, because $X^{**} = X$, $Q^{**} = Q$,
the problem dual to (P*) is the same as
$\sup_{u \in X} [-M^{**}(u, 0)] = -\gamma$. (P**)
If one notes that $M^{**} = M$, then this is equivalent to the original problem
$\inf_{u \in X} M(u, 0) = \alpha$, (P)
with $\alpha = \gamma$. In order to be able to apply this double dualization below, we
note that the same assumptions are satisfied for (P*) as for (P); for, from
(51.7c) and Proposition 51.6 it follows that:
(H1*) $(X^*, X^{**})$, $(Q^*, Q^{**})$ form dual pairs.
(H2*) $M^*: X^* \times Q^* \to ]-\infty, \infty]$ is convex and lower semicontinuous.
(H3*) $M^*(0, q_0^*)$, $M^{**}(u_0, 0) \ne +\infty$.
Step 3: $S$ is Convex. Given $q_1, q_2 \in Q$, $\varepsilon > 0$, according to (1), there exist
elements $u_1, u_2 \in X$ such that
$S(q_i) \le M(u_i, q_i) \le S(q_i) + \varepsilon$, $i = 1, 2$,
when $S(q_i) > -\infty$, $i = 1, 2$. For all $t \in ]0, 1[$, $\varepsilon > 0$, the convexity of $M$ yields
the relation
$S(tq_1 + (1-t)q_2) \le M(tu_1 + (1-t)u_2, tq_1 + (1-t)q_2)$
$\le t(S(q_1) + \varepsilon) + (1-t)(S(q_2) + \varepsilon)$.
Taking the limit as $\varepsilon \to +0$ yields the assertion.
For $S(q_i) = -\infty$, $i = 1, 2$, we replace $S(q_i) + \varepsilon$ by an arbitrary real number
and choose a suitable $u_i$. By Definition 47.1, we need not consider the case
$S(q_1) = -\infty$, $S(q_2) = +\infty$.
Step 4: Proof of Theorem 52.A.
(Ad a) From the consistency condition (H3) it follows that $\alpha < +\infty$,
$\beta > -\infty$. Now (3) yields $\beta \le \alpha$.
(Ad e) This follows immediately from (3) and (4).
(Ad d) Let $\alpha = \beta$. Then $q^*$ solves (P*) if and only if $\alpha = -M^*(0, q^*)$;
therefore, by (5) and (6),
$S(0) = -S^*(q^*)$.
By the generalized Young inequality in Proposition 51.2, this is equivalent
to $q^* \in \partial S(0)$.
The corresponding assertion for (P) follows from (P) = (P**).
(Ad c) (i) ⇒ (ii). By (d), $\partial S(0) \ne \emptyset$.
(ii) ⇒ (i). It follows from $\partial S(0) \ne \emptyset$ that $\alpha = \beta$, by (6), for, Proposition
51.6, (6) yields $S(0) = S^{**}(0)$. Furthermore, (d) shows that $\partial S(0)$ is the
solution set of (P*).
(Ad b) Use (c) and (P) = (P**).
Step 5: Proof of Corollary 52.2.
Ad (SC). The continuity of $q \mapsto M(u_1, q)$ at $q = 0$ implies the boundedness
on a neighborhood of zero, $U(0)$, i.e., for a fixed $r > 0$,
$S(q) = \inf_{u \in X} M(u, q) \le M(u_1, q) \le r$ for all $q \in U(0)$.
Proposition 47.5 yields the continuity of $S$ at $q = 0$. Then Theorem 47.A in
Section 47.6 guarantees that $\partial S(0) \ne \emptyset$.
Ad (SC*). The deduction is analogous to that for (SC). □
52.3. Duality Propositions of Fenchel-Rockafellar
Type
We will apply Theorem 52.A to the minimum problem
$\inf_{u \in X} F(u) + H(Du - a) = \alpha$, (P)
together with the dual problem (according to Section 51.4)
$\sup_{q^* \in Q^*} [\langle q^*, a \rangle - F^*(D^*q^*) - H^*(-q^*)] = \beta$, (P*)
where we replace $p$ by $-q^*$. According to Section 51.4, the corresponding
Lagrange function $L_1$ reads as follows:
$L_1(u, q^*) = F(u) - \langle q^*, Du - a \rangle - H^*(-q^*)$.
With the assumptions of the following theorem, $L_1$ is finite on the nonempty
set $A \times B$, where
$A \overset{\mathrm{def}}{=} \{u \in X: F(u) < +\infty\}$,
$B \overset{\mathrm{def}}{=} \{q^* \in Q^*: H^*(-q^*) < +\infty\}$.
In order to be able to apply the perturbation formalism from Section
52.1, we set
$M(u, q) \overset{\mathrm{def}}{=} F(u) + H(Du - a - q)$,
i.e., we perturb $a$ by $a + q$ and consider the two problems:
$\inf_{u \in X} M(u, q) = S(q)$, (8)
$\sup_{q^* \in Q^*} [-M^*(u^*, q^*)] = -S_*(u^*)$. (8*)
In Problem 52.5 we shall show that
$-M^*(u^*, q^*) = \langle q^*, a \rangle - F^*(D^*q^* + u^*) - H^*(-q^*)$;
for this reason, (P) coincides with (8) for $q = 0$ and (P*) coincides with (8*)
for $u^* = 0$, and we can indeed apply Theorem 52.A to (P) and (P*). Our
assumptions read as follows:
(H1) $X$ and $Q$ are real locally convex spaces. $(X, X^*)$ and $(Q, Q^*)$ form
dual pairs.
(H2) The functionals $F: X \to ]-\infty, \infty]$ and $H: Q \to ]-\infty, \infty]$ are convex
and lower semicontinuous.
(H3) $D: X \to Q$ is linear and continuous, $a$ is a fixed element in $Q$.
(H4) Consistency. There exist points $u_0 \in X$, $q_0^* \in Q^*$ such that $F(u_0)$,
$H(Du_0 - a)$, $F^*(D^*q_0^*)$, and $H^*(-q_0^*)$ are all different from $+\infty$.
Theorem 52.B (Fenchel (1951) and Rockafellar (1967)). With the
assumptions (H1)-(H4), the following four assertions hold:
(1) Weak duality. $-\infty < \beta \le \alpha < \infty$ holds.
(2) Solution of (P). The following two statements are equivalent:
(i) (P) is solvable and $\alpha = \beta$.
(ii) (P*) is stable, i.e., $\partial S_*(0) \ne \emptyset$.
If (ii) holds, then the solution set of (P) equals $\partial S_*(0)$.
(3) Solution of (P*). The following two statements are equivalent:
(i) (P*) is solvable and $\alpha = \beta$.
(ii) (P) is stable, i.e., $\partial S(0) \ne \emptyset$.
If (ii) holds, then the solution set of (P*) equals $\partial S(0)$.
(4) Characterization of solutions by a saddle point. The following three
statements are equivalent:
(i) $u$ is a solution of (P), $q^*$ is a solution of (P*), and $\alpha = \beta$.
(ii) $(u, q^*)$ is a saddle point of $L_1$ with respect to $A \times B$.
(iii) $D^*q^* \in \partial F(u)$, $-q^* \in \partial H(Du - a)$.
One can use the following Slater conditions to guarantee stability:
$H$ is continuous at $Du_0 - a$. (SC)
$F^*$ is continuous at $D^*q_0^*$. (SC*)
Corollary 52.3. With the assumptions (H1)-(H4), the problems (P) and (P*)
are stable when (SC) and (SC*), respectively, hold.
Theorem 52.B generalizes Theorem 51.B in Section 51.5. According to
Section 51.4, the formulation of problem (P) is very general. For example, it
encompasses variational problems (cf. Sections 51.6 and 51.7) and
optimization problems (cf. Section 52.4).
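Theorem 52.B can be illustrated by a scalar example with a nonsmooth term. The following sketch assumes $X = Q = \mathbb{R}$, $D$ the identity, $a = 3$, $F(u) = u^2/2$ and $H(r) = |r|$, with the conjugates $F^*(w) = w^2/2$ and $H^*(w) = 0$ for $|w| \le 1$, $+\infty$ otherwise, entered in closed form. These data and the grid search are assumptions made for illustration, not part of the text.

```python
a = 3.0
F = lambda u: 0.5 * u * u
H = lambda r: abs(r)
F_conj = lambda w: 0.5 * w * w                              # F*(w) = w^2 / 2
H_conj = lambda w: 0.0 if abs(w) <= 1 else float("inf")     # indicator of [-1, 1]

u_grid = [k / 1000 for k in range(-5000, 5001)]
q_grid = [k / 1000 for k in range(-2000, 2001)]

primal = lambda u: F(u) + H(u - a)                  # F(u) + H(Du - a), D = identity
dual = lambda q: q * a - F_conj(q) - H_conj(-q)     # <q*, a> - F*(D*q*) - H*(-q*)

u_bar = min(u_grid, key=primal)
q_bar = max(q_grid, key=dual)
alpha, beta = primal(u_bar), dual(q_bar)

print(alpha, beta)     # equal values: no duality gap
print(u_bar, q_bar)    # extremal relations: q* = F'(u) = u and -q* = sign(u - a)
```

For these data the extremal relations of assertion (4), (iii) read $q^* = u$ and $-q^* \in \partial |\cdot|(u - a)$, which the computed pair should satisfy.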
Proof. Ad (1), (2), (3). The functional $M: X \times Q \to ]-\infty, \infty]$ is convex and
lower semicontinuous. Theorem 52.A in Section 52.1 yields the assertion.
Ad (4). (i) ⇔ (ii). This follows from Theorem 49.B, (2) in Section 49.2.
(i) ⇔ (iii). By Theorem 52.A, (i) is equivalent to
$M(u, 0) + M^*(0, q^*) = 0$,
i.e.,
$[F(u) + F^*(D^*q^*) - \langle D^*q^*, u \rangle]$
$= -[H(Du - a) + H^*(-q^*) - \langle -q^*, Du - a \rangle]$.
By the generalized Young inequality in Proposition 51.2, both of the
expressions in square brackets are non-negative, and therefore they are
equal to zero. Thus, Proposition 51.2 yields $D^*q^* \in \partial F(u)$, $-q^* \in \partial H(Du - a)$. □
Corollary 52.2 yields Corollary 52.3.
52.4. Application to Linear Optimization Problems
in Locally Convex Spaces
As a special case of Section 52.3, we consider, as in Section 49.3, the
minimum problem
$\inf_u \langle c, u \rangle_X = \alpha(b)$, $u \in K_X$, $Du - b \in K_Q$ (P)
with the dual problem
$\sup_{q^*} \langle q^*, b \rangle_Q = \beta(c)$, $q^* \in K_Q^*$, $c - D^*q^* \in K_X^*$ (P*)
and the corresponding Lagrange function
$L(u, q^*) = \langle c, u \rangle_X + \langle q^*, b - Du \rangle_Q$.
Our assumptions read as follows:
(H1) $X$ and $Q$ are real locally convex spaces. $(X, X^*)$ and $(Q, Q^*)$ each
forms a dual pair.
(H2) $K_X$ and $K_Q$ are convex closed nonempty cones in $X$ and $Q$,
respectively. We denote the corresponding dual cones by $K_X^*$ and $K_Q^*$.
(H3) $D: X \to Q$ is linear and continuous. Furthermore, $b \in Q$, $c \in X^*$ are
fixed elements.
(H4) Consistency. There exist points $u_0$ and $q_0^*$ which satisfy the side
conditions in (P) and (P*), respectively.
$\alpha(\cdot)$ and $\beta(\cdot)$ are functions on $Q$ and $X^*$, respectively. The following
theorem yields exhaustive information about the behavior of the solutions of
(P) and (P*).
Theorem 52.C. With the assumptions (H1)-(H4), the following four
assertions hold:
(1) Weak duality. $-\infty < \beta(c) \le \alpha(b) < +\infty$ holds.
(2) Solution of (P). The following two statements are equivalent:
(i) (P) is solvable and $\alpha(b) = \beta(c)$.
(ii) $\partial\beta(c) \ne \emptyset$.
If (ii) holds, then the solution set of (P) equals $\partial\beta(c)$.
(3) Solution of (P*). The following two statements are equivalent:
(i) (P*) is solvable and $\alpha(b) = \beta(c)$.
(ii) $\partial\alpha(b) \ne \emptyset$.
If (ii) holds, then the solution set of (P*) equals $\partial\alpha(b)$.
(4) Characterization of solutions by a saddle point. The following three
statements are equivalent:
(i) $u$ solves (P), $q^*$ solves (P*), and $\alpha(b) = \beta(c)$.
(ii) $(u, q^*)$ is a saddle point of $L$ with respect to $K_X \times K_Q^*$.
(iii) $u$ and $q^*$ satisfy the side conditions of (P) and (P*), respectively,
and
$\langle c - D^*q^*, u \rangle = \langle q^*, Du - b \rangle = 0$.
The following solvability criterion results from assertion (1): If $u$ and $q^*$
satisfy the side conditions in (P) and (P*), respectively, with $\langle c, u \rangle = \langle q^*, b \rangle$,
then $u$ and $q^*$ are solutions of (P) and (P*), respectively.
Corollary 52.4 (Slater Conditions). With the assumptions (H1)-(H4), the
following assertions hold:
$\partial\alpha(b) \ne \emptyset$ when $Du_0 - b \in \operatorname{int} K_Q$.
$\partial\beta(c) \ne \emptyset$ when $c - D^*q_0^* \in \operatorname{int} K_X^*$.
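A small finite-dimensional instance shows how weak duality and the complementarity conditions in (4), (iii) look in practice. The sketch assumes $X = \mathbb{R}^2$, $Q = \mathbb{R}$, $K_X = \mathbb{R}^2_+$, $K_Q = \mathbb{R}_+$, $c = (1, 2)$, $D = (1 \ 1)$ and $b = 1$; these data, the sampling grid, and the candidate pair `u_bar`, `q_bar` are hypothetical choices for illustration.

```python
c = (1.0, 2.0)      # cost vector in X* = R^2
D = (1.0, 1.0)      # the 1x2 matrix D: R^2 -> R
b = 1.0

def primal_feasible(u):
    # u in K_X = R^2_+ and Du - b in K_Q = R_+
    return u[0] >= 0 and u[1] >= 0 and D[0] * u[0] + D[1] * u[1] - b >= 0

def dual_feasible(q):
    # q* in K_Q^* = R_+ and c - D^T q* in K_X^* = R^2_+
    return q >= 0 and c[0] - D[0] * q >= 0 and c[1] - D[1] * q >= 0

primal_value = lambda u: c[0] * u[0] + c[1] * u[1]
dual_value = lambda q: q * b

grid = [k / 20 for k in range(61)]                  # 0, 0.05, ..., 3
primal_pts = [(x, y) for x in grid for y in grid if primal_feasible((x, y))]
dual_pts = [q for q in grid if dual_feasible(q)]

alpha_grid = min(primal_value(u) for u in primal_pts)
beta_grid = max(dual_value(q) for q in dual_pts)

# Candidate solutions and the two complementarity terms from (4), (iii).
u_bar, q_bar = (1.0, 0.0), 1.0
slack_x = (c[0] - D[0] * q_bar) * u_bar[0] + (c[1] - D[1] * q_bar) * u_bar[1]
slack_q = q_bar * (D[0] * u_bar[0] + D[1] * u_bar[1] - b)

print(alpha_grid, beta_grid, slack_x, slack_q)
```

Since the candidate pair is feasible and both complementarity terms vanish, the solvability criterion following assertion (1) identifies it as a pair of solutions; the grid values merely confirm this on a sample.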
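Proof. As in Section 49.3, assertion (4) follows from Theorem 49.B in
Section 49.2. If we set
$F(u) \overset{\mathrm{def}}{=} \begin{cases} \langle c, u \rangle & \text{for } u \in K_X, \\ +\infty & \text{for } u \notin K_X \end{cases}$
and
$H(q) \overset{\mathrm{def}}{=} \begin{cases} 0 & \text{for } q \in K_Q, \\ +\infty & \text{for } q \notin K_Q, \end{cases}$
then, by (48.4), we obtain
$F^*(u^*) = \sup_{u \in X} \langle u^*, u \rangle - F(u) = \begin{cases} 0 & \text{for } c - u^* \in K_X^*, \\ +\infty & \text{for } c - u^* \notin K_X^* \end{cases}$
and
$H^*(-q^*) = \sup_{q \in Q} \langle -q^*, q \rangle - H(q) = \begin{cases} 0 & \text{for } q^* \in K_Q^*, \\ +\infty & \text{for } q^* \notin K_Q^*. \end{cases}$
Therefore, (P) and (P*) above are equivalent to (P) and (P*), respectively,
in Section 52.3, with $a$ replaced by $b$. Moreover,
$S(q) = \alpha(b + q)$, $\partial S(0) = \partial\alpha(b)$,
$S_*(u^*) = -\beta(c - u^*)$, $\partial S_*(0) = \partial\beta(c)$.
Now all the assertions follow from Section 52.3. □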
52.5. The Bellman Differential Inequality and Duality
for Nonconvex Control Problems
For a general class of control problems which also comprise classical
variational problems, our goal is to prove duality propositions without
convexity assumptions. In the present case, duality gaps can occur, i.e., we
can have $\beta < \alpha$ in Theorem 52.D. At the same time we obtain a
generalization of the classical Hamilton-Jacobi theory, where, in place of
the Hamilton-Jacobi first-order partial differential equation, there appears a
differential inequality which we designate as the Bellman-Hamilton-Jacobi
differential inequality or, briefly, as the Bellman differential inequality. This
inequality allows two-sided error estimates for the minimal values and
allows the formulation of sufficient conditions for solvability. The following
duality principle is entirely elementary and rests only on the formula for
integration by parts. In Section 52.6, we explain its connection with
geometrical optics and delve into the construction of approximation methods.
522
52. General Duality Principle by Means of Perturbed Problems
Our minimum problem reads as follows:
inf f f(t,y{t),u(t))dt = a (9)
y,»JG
with the following constraints:
(a) State constraint for y:
(t,y(t))<=Z forallfeG.
(b) Control constraint for u:
u(t)^U forallfeG.
(c) Control equation on G:
DiyJ(t) = giJ(t,y(t),u(t)), i = l,...,N, / = 1,...,M.
(d) Boundary condition for the states:
yj = hj on dG for/=1,...,M.
(e) Piecewise smoothness:
yj,uk&D\G), / = 1,...,M, k = l,...,K.
In this connection, we assume more precisely:
(H1) $G$ is a bounded region in $\mathbb{R}^N$ with piecewise smooth boundary, i.e.,
$\partial G \in C^{0,1}$ and $N \ge 1$.
We denote by $D^1(\bar G)$ the set of all continuous functions $\varphi\colon \bar G \to \mathbb{R}$ that are
piecewise continuously differentiable on G. We forego a detailed description
of this concept and content ourselves with the remark that the
discontinuities of the first partial derivatives can be found on sufficiently well-behaved
sets M, with dim M < dimG, so that parallel to Section 21.1, we can apply
the formula for integration by parts below in the proof of Theorem 52.D.
(H2) We set
$$u = (u_1,\dots,u_K), \qquad y = (y_1,\dots,y_M), \qquad t = (t_1,\dots,t_N).$$
The sets $Z$ and $U$ are given fixed subsets of $\mathbb{R}^N \times \mathbb{R}^M$ and $\mathbb{R}^K$, respectively.
(H3) Let the given fixed functions $f, g_{ij}\colon Z \times U \to \mathbb{R}$ be continuous for all
$i, j$. Furthermore, we set
$$F(y,u) \overset{\text{def}}{=} \int_G f(t, y(t), u(t))\,dt.$$
Example 52.5. If we choose
$$u = (u_{ij}), \qquad g_{ij}(t, y, u) = u_{ij}$$
and
$$Z = \mathbb{R}^N \times \mathbb{R}^M, \qquad U = \mathbb{R}^K, \qquad K = N \times M,$$
then $u_{ij} = D_i y_j$, and (9) represents a problem of the classical calculus of
variations.
By definition, the problem dual to the original problem (9) reads as
follows:
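$$\sup_S \Phi(S) = \beta, \qquad S_1,\dots,S_N \in D^1(Z), \tag{9*}$$
with
$$\Phi(S) \overset{\text{def}}{=} \int_G -\sup_{y \in Q(t)} d_S(t,y)\,dt + \int_{\partial G} \sum_{i=1}^N S_i(t, h(t))\,n_i(t)\,dO.$$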
Here, the quantities which occur have the following meaning:
(i) Pontrjagin function:
$$\mathcal{H}(t, y, u, p) \overset{\text{def}}{=} \sum_{i,j} p_{ij}\, g_{ij}(t, y, u) - f(t, y, u).$$
(ii) Bellman defect:
$$d_S(t, y) = \sum_i D_i S_i(t, y) + \sup_{v \in U} \mathcal{H}(t, y, v, S_y(t, y)).$$
The summation is over $i = 1,\dots,N$ and $j = 1,\dots,M$. We denote by $S_y$ the
matrix of the first derivatives $\partial S_i/\partial y_j$.
(iii) Cross section of the state constraint:
$$Q(t) \overset{\text{def}}{=} \{y \in \mathbb{R}^M : (t, y) \in Z\}.$$
Let $n(t)$ be the vector of the exterior unit normal at the boundary point
$t \in \partial G$, with components $n_i(t)$.
The differential equation
$$d_S(t,y) = 0$$
is called the Bellman differential equation for $S = (S_1,\dots,S_N)$. It generalizes
the Hamilton-Jacobi partial differential equation of the classical calculus of
variations for multiple integrals to control problems.
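As a check on the last statement, the following worked special case may help. It is only a sketch under assumptions that are not in the text: $N = M = K = 1$, $U = \mathbb{R}$, control equation $y' = u$ (so $g(t,y,u) = u$), and the illustrative integrand $f(t,y,u) = \tfrac12 u^2$.

```latex
\begin{align*}
\mathcal{H}(t,y,v,p) &= p\,v - \tfrac12 v^2, \\
\sup_{v \in \mathbb{R}} \mathcal{H}(t,y,v,p) &= \tfrac12 p^2
   \quad (\text{attained at } v = p), \\
d_S(t,y) &= S_t(t,y) + \tfrac12 S_y(t,y)^2 .
\end{align*}
```

Under these assumptions, $d_S = 0$ is the classical Hamilton-Jacobi equation $S_t + \tfrac12 S_y^2 = 0$ belonging to the Lagrangian $\tfrac12 \dot y^2$.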
Theorem 52.D (Klötzler (1978)). With the assumptions (H1)-(H3), the
following two assertions hold:
(1) Weak duality. $\beta \le \alpha$ holds for the extremal values of the original
problem (9) and of the corresponding dual problem (9*).
(2) Two-sided error bounds for the minimal value $\alpha$. We have
$$F(y,u) - (\operatorname{meas} G) \sup_{(t,y) \in Z} d_S(t, y) \le \alpha \le F(y,u)$$
when the following conditions are satisfied:
(i) $y, u$ satisfy the side conditions of (9).
(ii) $S_1,\dots,S_N \in D^1(Z)$ and, with the exception of the points of discontinuity
of the first derivatives of $S_i$, the differential equation
$$\sum_{i=1}^N D_i[S_i(t, y(t))] = f(t, y(t), u(t)) \tag{10}$$
holds on $G$.
Remark 52.6. As a consequence of Theorem 52.D, in the following we
explain the meaning of the Bellman differential inequality
$$d_S(t, y) \le 0 \quad \text{for all } (t, y) \in Z \tag{11}$$
for obtaining the error estimates and sufficient conditions for solvability.
First, we consider assertion (1). From $\beta \le \alpha$ it immediately follows that
$$\Phi(S) \le \alpha \le F(y,u)$$
for all $(y, u)$ and $S$ which satisfy the constraints in (9) and (9*),
respectively. If, in addition, $S$ is a solution of (11), then, according to the
construction of $\Phi$:
$$\int_{\partial G} \sum_{i=1}^N S_i(t, h(t))\,n_i(t)\,dO \le \alpha \le F(y,u). \tag{12}$$
If the left-hand and right-hand sides are equal in (12), then $y, u$ is a solution
of the original problem (9).
We now turn to assertion (2). Here, we can exploit the degree of freedom
we have in the choice of $S_i$. For example, we can proceed from the linear
substitution with respect to $y$:
$$S_i(t, y) = a_{i0}(t) + \sum_{j=1}^M a_{ij}(t)\,y_j.$$
If we substitute this expression in (10), then we obtain a first-order
differential equation for determining the $a_{ik}$. If all the $a_{ik}$ satisfy this differential
equation, then assertion (2) yields an error estimate for $\alpha$. If, in addition, $S$
satisfies the Bellman differential inequality (11), then, by assertion (2),
$F(y, u) = \alpha$, i.e., $y, u$ solves the original problem (9).
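To make the interplay of (11), (12) and the linear substitution concrete, here is a small numerical sketch. Everything specific in it is an assumption chosen for illustration and not taken from the text: $N = M = 1$, $G = (0,1)$, $f = \tfrac12 u^2$, control equation $y' = u$, boundary values $y(0) = 0$, $y(1) = 1$, and the ansatz $S(t,y) = -\tfrac12 c^2 t + c\,y$, for which the Bellman defect $d_S = S_t + \tfrac12 S_y^2$ vanishes identically. The function names are hypothetical.

```python
# Sketch under stated assumptions: f(t, y, u) = u**2 / 2 on G = (0, 1),
# y' = u, y(0) = 0, y(1) = 1.  For S(t, y) = -c**2 * t / 2 + c * y the
# Bellman defect d_S = S_t + S_y**2 / 2 is identically zero, so (11) holds.

def lower_bound(c):
    """Boundary term of (12): S(1, y(1)) - S(0, y(0)) with y(0)=0, y(1)=1."""
    s_at_1 = -0.5 * c * c * 1.0 + c * 1.0
    s_at_0 = 0.0
    return s_at_1 - s_at_0

def upper_bound(k, steps=20000):
    """F(y, u) for the admissible path y(t) = t**k, u = y', by the midpoint rule."""
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        u = k * t ** (k - 1)
        total += 0.5 * u * u * h
    return total

best_lower = max(lower_bound(j / 100.0) for j in range(0, 201))  # c in [0, 2]
print("best lower bound:", best_lower)
print("upper bounds:", upper_bound(1), upper_bound(2))
```

In this toy setting the choice $c = 1$ gives the lower bound $\tfrac12$, which coincides with $F(y,u)$ for $y(t) = t$; the two sides of (12) then agree, so this path is optimal under the stated assumptions.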
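Proof of Theorem 52.D. (1) If $y, u$ satisfy the side conditions in (9) and
$S_1,\dots,S_N \in D^1(Z)$, then, by the chain rule, we have
$$D_i[S_i(t, y(t))] = (D_i S_i)(t, y(t)) + \sum_j \frac{\partial S_i}{\partial y_j}(t, y(t))\,D_i y_j(t).$$
Furthermore, elementary transformations and integration by parts yield:
$$\begin{aligned}
F(y,u) &= \int_G \Big[-\mathcal{H}(t, y(t), u(t), S_y(t, y(t))) + \sum_{i,j} g_{ij}\,\frac{\partial S_i}{\partial y_j}\Big]\,dt \\
&\ge \int_G \Big[-\sup_{v \in U} \mathcal{H}(t, y(t), v, S_y) + \sum_{i,j} D_i y_j\,\frac{\partial S_i}{\partial y_j}\Big]\,dt \\
&= \int_G \Big[-d_S(t, y(t)) + \sum_i D_i[S_i(t, y(t))]\Big]\,dt \\
&= \int_G -d_S(t, y(t))\,dt + \int_{\partial G} \sum_i S_i(t, h(t))\,n_i(t)\,dO \ \ge\ \Phi(S).
\end{aligned}$$
(2) Using integration by parts, it follows from (ii) that for all $\bar y, \bar u$ which
satisfy the side conditions in (9),
$$\begin{aligned}
F(y, u) &= \int_G f(t, y(t), u(t))\,dt = \int_G \sum_i D_i[S_i(t, y(t))]\,dt \\
&= \int_{\partial G} \sum_i S_i(t, h(t))\,n_i(t)\,dO \\
&\le F(\bar y, \bar u) + \int_G d_S(t, \bar y(t))\,dt \\
&\le F(\bar y, \bar u) + (\operatorname{meas} G) \sup_{(t,y) \in Z} d_S(t, y). \qquad \square
\end{aligned}$$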
52.6. Application to a Generalized Problem of
Geometrical Optics
Parallel to Section 37.4, we study the problem
$$\inf \int_0^a n(y)\sqrt{y_1'^2 + y_2'^2}\,d\tau = \alpha \tag{13}$$
for $y = y(\tau)$, subject to the following constraints:
(a) Path constraints:
$$y(\tau) \in A \quad \text{for all } \tau \in [0, a].$$
(b) Boundary constraints:
$$y(0) = y_0, \qquad y(a) = y_a.$$
(c) Piecewise smoothness of the path:
$$y_1, y_2 \in D^1([0, a]).$$
Here, $y = (y_1, y_2)$. Let the function $n\colon \mathbb{R}^2 \to \mathbb{R}$ be continuous and positive.
Let the points $y_0, y_a \in \mathbb{R}^2$ and the real number $a > 0$ be given and fixed.
Furthermore, let $A$ be the closure of a fixed region in $\mathbb{R}^2$.
This problem has the following physical interpretation: We seek the path
$y(\cdot)$ of a ray of light which moves in the shortest time from $y_0$ to $y_a$, where it
cannot leave the set A (Fig. 52.1). For the sake of simplicity, we set the
velocity of light c equal to 1. Here, n(y) is the index of refraction at the
point y.
The Hamilton-Jacobi partial differential inequality
$$S_{y_1}^2(y) + S_{y_2}^2(y) \le n^2(y) \quad \text{for all } y \in A \tag{14}$$
is crucial for handling this problem. (14) is equivalent to the fact that
$$|e \cdot \operatorname{grad} S(y)| \le n(y) \quad \text{for all } y \in A \tag{15}$$
holds for all unit vectors $e \in \mathbb{R}^2$, i.e., the magnitudes of all directional
derivatives of $S$ at the point $y$ are less than or equal to the index of
refraction $n(y)$. Our goal is the error estimate
$$S(y_a) - S(y_0) \le \alpha \le F(y) \tag{16}$$
for the minimal value $\alpha$, where
$$F(y) = \int_0^a n(y)\sqrt{y_1'^2 + y_2'^2}\,d\tau.$$
Proposition 52.7. The error estimate (16) holds when $y(\cdot)$ satisfies the
constraints in (13) and $S\colon A \to \mathbb{R}$ is a continuous and piecewise continuously
differentiable function which satisfies (14).
Figure 52.1
The proof of this assertion, which is a special case of Theorem 52.D in
Section 52.5, is given in Problem 52.6. If the assumptions of Proposition
52.7 hold and the right-hand and left-hand sides of (16) are equal, then
$F(y) = \alpha$, and $y(\cdot)$ is a solution of (13). Thus, Proposition 52.7 yields a
simple sufficiency criterion for the solvability of the original problem (13).
Example 52.8. In order to be able to explain several general peculiarities in
the simplest way, we consider the special case $n(y) \equiv 1$. Then in (13) we seek
the shortest path connecting the points $y_0$ and $y_a$ which lies entirely in $A$.
Case 1: No constraint on the path.
Let $A = \mathbb{R}^2$. Furthermore, let, say, $y_0 = (0,0)$, $y_a = (a,0)$. A solution of the
Hamilton-Jacobi differential equation, i.e., of (14) with "$=$" instead of
"$\le$," is $S(y) = y_1$. For $y(\tau) = (\tau, 0)$, we obtain
$$S(y_a) - S(y_0) = a = F(y);$$
therefore the straight line $y(\cdot)$ is a solution, as could be expected.
Case 2: With path constraint.
Now, let $A \subset \mathbb{R}^2$. Then, as a rule, we have a proper path constraint, and
we need inequality (14).
We choose $A$, say, as in Fig. 52.2, with the triangulation given there. We
will use this triangulation to simultaneously explain the basic ideas of a
general approximation method which will approximate the minimal curve
length $\alpha$ as well as possible. In the following, when we speak of nodes we
will always mean the nodes of the triangulation in Fig. 52.2.
Let $\tau \mapsto y(\tau)$ be an arbitrary polygonal path which connects the points $y_0$
and $y_a$ in the set $A$ and whose vertices are nodes. Then $F(y) \ge \alpha$, i.e., we
obtain an upper bound for $\alpha$.
In order to also obtain a lower bound for $\alpha$ by Proposition 52.7, we now
construct the function $S$ by prescribing the values $S(y)$ at the node points
$y$ and then extending $S$ to the set $A$ by means of linear interpolation. Then,
as a finite element, $S$ is piecewise continuously differentiable, analogous to
$A_2$(59). In order to satisfy the differential inequality (14), the node values
$S(y)$ must be chosen according to (15) so that no directional derivative
exceeds 1 in magnitude. Furthermore, in order to make the error estimate (16) as sharp
as possible, we strive to achieve the situation that the positive differences of
the functional values of $S$ along the polygonal paths $\tau \mapsto y(\tau)$ under
investigation are as large as possible. This leads us to the following
construction: We begin with $S(y_0) = 0$. We set $S(y) = 1$ at nodes $y$ that are
adjacent to $y_0$. Furthermore, let $S(\bar y) = 2$ at the nodes $\bar y$ that are adjacent to
$y$, which as yet have no $S$ value, etc. If $y$ and $\bar y$ are two adjacent nodes, then
we always have $|S(\bar y) - S(y)| \ge |\bar y - y|/\sqrt{2}$. If we follow the nodes along a
polygonal path $\tau \mapsto y(\tau)$, then $S(y_a) - S(y_0) \ge \gamma F(y)$, where $\gamma = 1/\sqrt{2}$.
Thus, by Proposition 52.7,
$$\gamma F(y) \le \alpha \le F(y). \tag{17}$$
Figure 52.2
In contrast to the triangulation in Fig. 52.2, if one chooses a triangulation
with equilateral triangles, then (17) holds with the better estimate $\gamma = \sqrt{3}/2$.
In this connection, one notes that the sides of a regular hexagon, which
is inscribed in the unit circle, are at a distance $\sqrt{3}/2$ from the center.
Therefore, our result reads as follows: For a triangulation by means of
equilateral triangles, the relative error of $F(y)$ is always less than or equal to
7% relative to the true value $\alpha$ because of (17) with $\gamma = \sqrt{3}/2$.
For the original problem (13), these considerations motivate the following
general approximation method for determining the minimal value a.
Step 1: Triangulation and Choice of a Polygonal Path. We triangulate the set
$A$ and determine a polygonal path $\tau \mapsto y(\tau)$ whose vertices are nodes. Then
by $F(y) \ge \alpha$ we obtain an upper bound for $\alpha$. The polygonal path $y(\cdot)$ can
be determined optimally by means of the requirement:
$$F(y) = \min!,$$
where all possible polygonal paths $y(\cdot)$ are admitted. This is a problem of
so-called transport optimization on the graph belonging to the triangulation
of $A$. For these problems, we have at our disposal simple algorithms (cf., e.g.,
Berge and Ghouila-Houri (1969, M)).
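Step 1 amounts to a shortest-path search on a weighted graph. The sketch below uses Dijkstra's algorithm, which is one standard choice and not necessarily the algorithm the cited reference has in mind. The node coordinates, the edge list and the constant index of refraction $n \equiv 1$ (so that edge weights are Euclidean lengths) are made up for illustration.

```python
import heapq
import math

# Hypothetical triangulation nodes (coordinates) and edges; n(y) = 1 is assumed,
# so the weight of an edge is its Euclidean length.
nodes = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0), 4: (2.0, 1.0)}
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (1, 4), (3, 4)]

def edge_length(a, b):
    (x1, y1), (x2, y2) = nodes[a], nodes[b]
    return math.hypot(x2 - x1, y2 - y1)

def shortest_polygonal_path(start, goal):
    """Dijkstra's algorithm; returns (F(y), list of nodes) for the best node path."""
    adjacency = {v: [] for v in nodes}
    for a, b in edges:
        w = edge_length(a, b)
        adjacency[a].append((b, w))
        adjacency[b].append((a, w))
    dist = {v: math.inf for v in nodes}
    prev = {}
    dist[start] = 0.0
    queue = [(0.0, start)]
    while queue:
        d, v = heapq.heappop(queue)
        if d > dist[v]:
            continue  # stale queue entry
        for neighbor, w in adjacency[v]:
            if d + w < dist[neighbor]:
                dist[neighbor] = d + w
                prev[neighbor] = v
                heapq.heappush(queue, (d + w, neighbor))
    # Walk the predecessor links back from the goal (graph assumed connected).
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return dist[goal], path[::-1]

upper_bound, path = shortest_polygonal_path(0, 4)
print(upper_bound, path)
```

The returned length is the value $F(y)$ of the best polygonal path through nodes, i.e., the upper bound for $\alpha$ in the sense of Step 1, for this hypothetical mesh.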
Step 2: Construction of the S-Function. We assign $S$ values to the nodes and
interpolate linearly so that the following hold for the $S$-function that arises:
$$S(y_a) - S(y_0) = \max!, \tag{18a}$$
$$-n(y) \le e \cdot \operatorname{grad} S(y) \le n(y) \tag{18b}$$
for all $y \in A$ and all $e \in \mathbb{R}^2$, $|e| = 1$. This is a linear optimization problem
with an infinite number of side conditions. We can solve (18)
approximately by varying only those nodes $y$ and all $e$ in (18b) that correspond to
edge directions. If one knows a solution $S$ of (18), then, by Proposition 52.7,
$$S(y_a) - S(y_0) \le \alpha \le F(y). \tag{19}$$
The algorithmic formulation can be found in Klötzler (1979), Part II.
There it is also shown how one can exploit the second step to delimit a
subregion of $A$ in which the polygonal path $y(\cdot)$ of step 1 lies.
Furthermore, in Klötzler (1978) it is shown that no duality gaps appear
for the original problem (13) in Proposition 52.7, i.e., one can find a
function $S$ with $S(y_a) - S(y_0) = \alpha$. Therefore, the estimate for $\alpha$ in (19) can
be, in principle, made arbitrarily precise.
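If (18b) is imposed only along the edges of the triangulation, it reduces to $|S(y) - S(\bar y)| \le w(y, \bar y)$ for adjacent nodes $y, \bar y$, where $w$ denotes the optical length of the edge. For this relaxed problem the graph distance from $y_0$ is a feasible and optimal choice of node values (the usual linear-programming duality for shortest paths). The sketch below, with a made-up graph and the assumption $n \equiv 1$, computes these values by repeated relaxation and checks the edge constraints. It illustrates the relaxed problem only: after linear interpolation the gradient inside a triangle may still violate (14), so these node values need not yield a valid lower bound in (19).

```python
import math

# Hypothetical nodes and edges of a small triangulation; n(y) = 1 is assumed.
nodes = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0), 4: (2.0, 1.0)}
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3), (1, 4), (3, 4)]

def edge_length(a, b):
    (x1, y1), (x2, y2) = nodes[a], nodes[b]
    return math.hypot(x2 - x1, y2 - y1)

def node_values(start):
    """Graph distance from `start`, computed by Bellman-Ford style relaxation."""
    s = {v: math.inf for v in nodes}
    s[start] = 0.0
    for _ in range(len(nodes) - 1):
        for a, b in edges:
            w = edge_length(a, b)
            s[b] = min(s[b], s[a] + w)
            s[a] = min(s[a], s[b] + w)
    return s

S = node_values(0)
# Edge-relaxed form of (18b): |S(a) - S(b)| <= length of the edge (a, b).
feasible = all(abs(S[a] - S[b]) <= edge_length(a, b) + 1e-12 for a, b in edges)
print(S[4] - S[0], feasible)
```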
Problems
52.1. The minimal surface problem. The goal of this set of problems is to call the
reader's attention to several important results. The minimal surface
problem is a difficult and very diversified problem which played, and still
plays, an important role in the development of the calculus of variations.
As a standard work for the classical theory, we recommend Nitsche (1975,
M,B,H).
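Parallel to Problems 6.5 and 40.4, we consider the minimum problem:
$$F(z) = \int_G \sqrt{1 + z_x^2 + z_y^2}\,dx\,dy = \min!, \tag{20a}$$
$$z = g \quad \text{on } \partial G \tag{20b}$$
with the corresponding Euler equation
$$G\colon\ \frac{\partial}{\partial x}\left(\frac{z_x}{\sqrt{1 + z_x^2 + z_y^2}}\right) + \frac{\partial}{\partial y}\left(\frac{z_y}{\sqrt{1 + z_x^2 + z_y^2}}\right) = 0; \tag{21}$$
$$\partial G\colon\ z = g.$$
This is equivalent to:
$$G\colon\ (1 + z_y^2)z_{xx} + (1 + z_x^2)z_{yy} - 2z_x z_y z_{xy} = 0; \tag{21a}$$
$$\partial G\colon\ z = g.$$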
In this connection, let $G$ be a bounded region in $\mathbb{R}^2$, and let $g$ be a given
function on the boundary $\partial G$. Geometrically, Problem (20) means that we
seek a surface $z = z(x, y)$ in $\mathbb{R}^3$ which passes through a fixed spatial curve
$C$ of the form (20b) and in this connection has the smallest possible
surface area (Fig. 52.3). According to (20b), the projection of $C$ on the
$(x, y)$-plane is equal to $\partial G$.
Experimentally, one can realize this minimal surface by dipping a wire
loop having the form C into a soap solution. Then inside C, a soap
membrane which corresponds approximately to the solution of (20) is
formed. Namely, (20a) means that the potential energy of the soap
membrane, neglecting gravity, is minimal.
Figure 52.3
52.1a. Peculiarities of the problem. Due to the physical interpretation of our
problem, it is natural to conjecture that a classical solution need not exist
for every curve $C$. If the basic region is not convex, then the danger arises
that the soap membrane ruptures. In fact, in 1912 Bernstein gave a
nonconvex region $G$ and a curve $C$ for which (20) has no classical
solution, $z \in C^2(G) \cap C(\bar G)$ (cf. Fučík, Nečas, and Souček (1977, M), page
162). The mathematical difficulties in the treatment of (20) originate in the
fact that the relation
$$F(z) \to +\infty \quad \text{as } \|z\|_X \to \infty \tag{22}$$
does not hold in the case of the reflexive B-space $X = W_p^1(G)$, $1 < p < \infty$.
If, on the other hand, we consider the Sobolev space $W_1^1(G)$, then this
space is not reflexive. In both cases we cannot apply the important
existence principle of Proposition 38.15. These difficulties are expressed in
the minimal surface equation (21a) by the situation that it is not
uniformly elliptic, i.e., there exist no constants $c, d > 0$ such that
$$c(\xi^2 + \eta^2) \le (1 + q^2)\xi^2 + (1 + p^2)\eta^2 - 2pq\,\xi\eta \le d(\xi^2 + \eta^2)$$
holds for all real numbers $\xi, \eta, p$, and $q$.
Therefore, for the existence proof for (20), general functional analysis
propositions do not suffice. As an essential element one needs nontrivial
a priori estimates which depend on the specifics of the problem. We delve
into this further below in Problems 52.1c and 52.1d.
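The failure of uniform ellipticity can be made explicit by a short computation, added here as a sketch. With $p = z_x$ and $q = z_y$, the quadratic form in Problem 52.1a belongs to the symmetric matrix $A(p,q)$ below; the eigenvalues follow from its trace and determinant.

```latex
\[
A(p,q) = \begin{pmatrix} 1 + q^2 & -pq \\ -pq & 1 + p^2 \end{pmatrix},
\qquad
\operatorname{tr} A = 2 + p^2 + q^2, \quad \det A = 1 + p^2 + q^2 ,
\]
\[
\lambda_{\min} = 1, \qquad \lambda_{\max} = 1 + p^2 + q^2 .
\]
```

Hence the form is bounded below by $\xi^2 + \eta^2$, but no upper constant $d$ works for all $p, q$, since $\lambda_{\max} \to \infty$ as $p^2 + q^2 \to \infty$.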
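52.1b. Convexity and lower semicontinuity; elementary solution of a modified
problem. As in Section 6.2, we denote by $C^{0,1}(\bar G)$ the B-space of all
Lipschitz-continuous functions $z\colon \bar G \to \mathbb{R}$. Furthermore, we set
$$K(g, R) \overset{\text{def}}{=} \{z \in C^{0,1}(\bar G) : \|z\|_{0,1} \le R,\ z = g \text{ on } \partial G\}.$$
Here,
$$\|z\|_{0,1} \overset{\text{def}}{=} \max_{P \in \bar G} |z(P)| + L(z),$$
where $L(z)$ denotes the Lipschitz constant of $z$ on $\bar G$, i.e.,
$$|z(P) - z(Q)| \le L(z)\operatorname{dist}(P, Q) \quad \text{for all } P, Q \in \bar G.$$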
For all $z \in K(g, R)$, $F(z)$ is explained meaningfully in (20). In this
connection, we make use of a theorem due to Rademacher (1919), which
asserts that every function $z$ in $C^{0,1}(\bar G)$ possesses classical first partial
derivatives almost everywhere on $G$, and these derivatives are measurable
and bounded with the Lipschitz constant as bound, i.e., for $z \in K(g, R)$,
we have $z_x, z_y \in L_\infty(G)$ and $\|z_x\|_\infty, \|z_y\|_\infty \le R$.
Show: The modified problem corresponding to (20),
$$F(z) = \min!, \quad z \in K(g, R), \tag{23}$$
has a solution when $K(g, R) \ne \emptyset$.
Solution: Let $(z_n)$ be a minimal sequence for (23). From $z_n \in K(g, R)$,
for all $n \in \mathbb{N}$, it follows that the Lipschitz constants for all $z_n$ are less than
or equal to $R$. According to the Arzelà-Ascoli theorem in $A_1$(24g), we can
choose a subsequence that we again denote by $(z_n)$ which converges
uniformly on $\bar G$ to a function $z \in K(g, R)$. In conformity with this, we
show that
$$F(z) \le \lim_{n \to \infty} F(z_n). \tag{24}$$
From this, as in the introduction to Chapter 38, it follows that $z$ is a
solution of (23).
Proof of (24). Let $f(u,v) = \sqrt{1 + u^2 + v^2}$. The function $f\colon \mathbb{R}^2 \to \mathbb{R}$
is convex. Therefore, according to Proposition 42.6,
$$f(u_n, v_n) - f(u, v) \ge f_u(u, v)(u_n - u) + f_v(u, v)(v_n - v)$$
holds. We set $u = z_x$, $v = z_y$, $u_n = (z_n)_x$, and $v_n = (z_n)_y$. Then
$$u_n \rightharpoonup u, \quad v_n \rightharpoonup v \quad \text{in } L_2(G) \text{ as } n \to \infty. \tag{25}$$
By integration by parts this follows immediately from
$$\int_G \varphi(u_n - u)\,dx\,dy = -\int_G \varphi_x(z_n - z)\,dx\,dy \to 0 \quad \text{as } n \to \infty$$
for all $\varphi \in C_0^\infty(G)$ as well as from the fact that $C_0^\infty(G)$ is dense in $L_2(G)$
and the boundedness of $(u_n)$, $(v_n)$ in $L_2(G)$ (cf. $A_1$(31d)).
From (25), for $n \to \infty$, we obtain
$$F(z_n) - F(z) \ge \int_G [f_u(u, v)(u_n - u) + f_v(u, v)(v_n - v)]\,dx\,dy \to 0. \tag{26}$$
(24) follows directly from this.
The crucial assertion (26) for subsequences can also be obtained from
the fact that $(u_n)$, $(v_n)$ are bounded in $L_\infty(G)$ and thus possess weak*
convergent subsequences in $L_\infty(G)$ (cf. Example 38.3).
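The gradient inequality for the convex integrand $f(u,v) = \sqrt{1 + u^2 + v^2}$, on which the proof of (24) rests, can be spot-checked numerically. The sketch below samples random points (the sampling range and the seed are arbitrary choices, and the function names are hypothetical) and verifies $f(a,b) - f(u,v) \ge f_u(u,v)(a-u) + f_v(u,v)(b-v)$.

```python
import math
import random

def f(u, v):
    return math.sqrt(1.0 + u * u + v * v)

def grad_f(u, v):
    r = f(u, v)
    return u / r, v / r  # partial derivatives f_u and f_v

def gradient_inequality_gap(u, v, a, b):
    """f(a, b) - f(u, v) - [f_u (a - u) + f_v (b - v)]; nonnegative for convex f."""
    fu, fv = grad_f(u, v)
    return f(a, b) - f(u, v) - (fu * (a - u) + fv * (b - v))

random.seed(0)
gaps = [
    gradient_inequality_gap(*(random.uniform(-5.0, 5.0) for _ in range(4)))
    for _ in range(1000)
]
print(min(gaps) >= -1e-12)
```

A numerical check of this kind is no substitute for Proposition 42.6; it only illustrates the pointwise inequality that is integrated over $G$ to obtain (26).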
52.1c.* A priori estimates and the classical existence proof due to Haar (1927). In
this paper by Haar, the study of which we recommend to the reader, the
following is proved: The original problem (20) possesses a solution
$z \in C^{0,1}(\bar G)$ when the following two assertions hold:
(a) $G$ is a bounded convex region of $\mathbb{R}^2$.
(b) The spatial curve $C$ corresponding to $g\colon \partial G \to \mathbb{R}$ satisfies a so-called
three-point condition, i.e., there exists a number $d > 0$ such that
$a^2 + b^2 \le d^2$ holds for each plane $z = ax + by + c$ which passes through
three points of $C$.
Moreover, the solution is then analytic in $G$ and satisfies the minimal
surface equation (21).
We split a sketch of the proof into four steps.
Step 1. Solution of the modified problem (23). We have given the
elementary proof in Problem 52.1b.
Step 2. A priori estimate. With the aid of the three-point condition, one
shows that there exists a number $R > 0$ such that each solution $z \in C^{0,1}(\bar G)$
of (20) lies in $K(g, R)$ and $K(g, R) \ne \emptyset$. Then the solution of the
modified problem (23) with this $R$ is also a solution of the original
problem (20) in the space $C^{0,1}(\bar G)$.
Hidden behind this a priori estimate is the geometric fact that one can
estimate the Lipschitz constant for a solution of (20) against the constant
$d$ in the three-point condition. Otherwise, one could construct a surface
with smaller surface area.
Step 3. Analyticity of the solution. A difficulty consists in that up until
now we have proved only the existence of the first derivatives of the
solution of (20), but in the minimal surface equation there appear second
derivatives. In order to overcome this difficulty one uses a typical
deduction due to Haar (Haar's lemma). In this connection, one deduces from
the vanishing of the first variation in (20) a first-order system of
differential equations (D) which contains additional auxiliary functions. From (D)
and the theorem asserting that Lipschitz-continuous solutions of the
Cauchy-Riemann differential equations are analytic, it then follows that
the solution of (20) is analytic.
Step 4. The solution satisfies the minimal surface equation. This follows
easily from (D) and the fact that the solution is analytic.
Concerning the present set of problems, we also recommend Nitsche
(1975, M), page 587 ff.
52.1d.* A functional analysis existence proof for (20) in $W_1^1(G)$. In this connection,
study Fučík, Nečas, and Souček (1977, M), page 146 ff. In modified form
this proof contains the first and second steps of Problem 52.1c. The
a priori estimate in the second step follows from the maximum principle
for the minimal surface equation and the so-called bounded slope
condition which is related to the three-point condition. A uniqueness assertion
also follows from the maximum principle.
52.1e.** Sharp existence assertion for the minimal surface equation. In this
connection, study Gilbarg and Trudinger (1977, M), Chapter 15. There the
following is shown:
(i) Let $G$ be a bounded region in $\mathbb{R}^2$ with $\partial G \in C^2$. Then the Dirichlet
problem possesses a solution for the minimal surface equation (21) for all
$g \in C(\partial G)$ if and only if the curvature of the boundary $\partial G$ is everywhere
non-negative.
(ii) Let $G$ be a bounded region in $\mathbb{R}^2$ with $\partial G \in C^{2,\alpha}$, $0 < \alpha < 1$. Let the
curvature of $\partial G$ be everywhere non-negative. Then for each $g \in C^{2,\alpha}(\partial G)$
(respectively, $g \in C(\partial G)$), (21) possesses exactly one solution $z \in C^{2,\alpha}(\bar G)$
(respectively, $z \in C^2(G) \cap C(\bar G)$).
(iii) If $z \in C^2(\mathbb{R}^2)$ satisfies the minimal surface equation (21) on $\mathbb{R}^2$,
then $z$ is a linear function.
Assertion (iii) is a classical theorem due to Bernstein. It shows that
despite formal similarity the minimal surface equation behaves essentially
differently than the Laplace equation.
The proofs follow from sharp a priori estimates and the continuation
according to a parameter method discussed in Chapter 6 (the
Leray-Schauder principle).
52.1f.** The parametric minimal surface problem. In (20) we sought minimal
surfaces in the special form z = z(x, y). Since not every surface can be
written in this special form, the more general problem arises of
determining minimal surfaces in the parametric form x = x(u,v), y = y(u,v),
z = z(u, v). The solution of this famous classical Plateau problem can be
found in Nitsche (1975, M), Chapter V, where one also finds detailed
historical comments.
52.1g.* Duality and generalized solutions of the minimal surface problem. In the
preceding we have seen that the form of the basic region G plays an
important role in the construction of solutions of the minimal surface
problem, e.g., we needed the convexity of G. However, duality theory
offers the possibility of constructing generalized solutions for general
bounded regions G. In this connection, study Ekeland and Temam (1974,
M), Chapter V. The basic idea is the following:
(i) Let $G$ be a bounded open set in $\mathbb{R}^2$ and let $g$ be in $W_1^1(G)$, more
precisely, in the closure of $C^\infty(\bar G)$ in $W_1^1(G)$. We set
$$\mathcal{N} \overset{\text{def}}{=} \{z \in W_1^1(G) : z = g + u,\ u \in \mathring{W}_1^1(G)\},$$
$$\mathcal{N}^* \overset{\text{def}}{=} \{p^* \in L_\infty(G) \times L_\infty(G) : \operatorname{div} p^* = 0 \text{ on } G \text{ and } |p^*(x)| \le 1 \text{ almost everywhere on } G\}.$$
Then the problem dual to
$$\inf_{z \in \mathcal{N}} F(z) = \alpha \tag{27}$$
reads as follows:
$$\sup_{p^* \in \mathcal{N}^*} H(p^*) = \beta, \tag{27*}$$
where
$$H(p^*) = \int_G \big\{-p^*(x) \cdot \operatorname{grad} g(x) + [1 - |p^*(x)|^2]^{1/2}\big\}\,dx.$$
(ii) $\alpha = \beta$ and the dual problem (27*) has exactly one solution $p^*$.
(iii) If the original problem (27) has a solution $z$, then the extremal
relation
$$\operatorname{grad} z(x) = \frac{-p^*(x)}{(1 - |p^*(x)|^2)^{1/2}} \tag{28}$$
holds and $|p^*(x)| < 1$ almost everywhere on $G$. Here, grad and div are
always to be understood in the sense of distributions.
(iv) If the original problem (27) has no solution, then one can construct
a generalized solution $z$ of (27) by means of the extremal relation (28). It
is essential to investigate the regularity properties of this generalized
solution. One can find a discussion of this in Ekeland and Temam (1974,
M).
52.1h. Finitely many solutions in the generic case. For a long time it was believed
that for all sufficiently smooth curves there are only a finite number of
minimal surfaces which they bound. In Böhme and Tromba (1977) and
Tromba (1977), it was proved that, roughly speaking, there exists an open
dense set of curves in $\mathbb{R}^3$ which bound only a finite number of classical
minimal surfaces of the disk type. The proof is based on Morse theory.
This result is closely connected to a recent trend in analysis stemming
from global analysis. We do not consider the most general case, which is
burdened with all kinds of pathologies, but rather we consider only the
generic case and prove very natural results for this. As another example,
we consider geodesics on a sphere. In most cases there exists a unique
curve of shortest length between two points. An important generalization
of this observation reads as follows:
Let $V$ be a (possibly infinite-dimensional) complete connected Riemannian
manifold. Let any $u \in V$ be given. Then there is a residual subset $R$ of
$V$ such that every point $v \in R$ can be joined to $u$ by a unique minimal
geodesic (cf. Ekeland (1979, S), page 470).
Note that a residual set is the complement of a set of first Baire
category. Such residual sets are "big."
Also, compare the survey article Hildebrandt (1983) and Almgren
(1984, M).
52.2. Examples of linear optimization problems with unfavorable solution behavior.
52.2a. Duality gaps. For the minimum problem
$$u_3 = \min!, \quad u \in K, \quad Du - b \in K \tag{29}$$
with
$$u \overset{\text{def}}{=} (u_1, u_2, u_3), \quad Du \overset{\text{def}}{=} (0, u_3, u_1), \quad b \overset{\text{def}}{=} (0, -1, 0),$$
$$K \overset{\text{def}}{=} \{u \in \mathbb{R}^3 : u_1, u_2 \ge 0,\ u_1 u_2 \ge u_3^2\},$$
construct the dual problem (29*) and show that both problems have a
solution but that the extremal values do not coincide.
Hint: Compare Fan (1970) and Göpfert (1973, M), page 205.
52.2b. Unsolvable original problem. For the minimum problem
$$u_2 = \min!, \quad (u_1, u_2) \in \mathbb{R}^2, \tag{30}$$
$$t^2 u_1 + u_2 \ge t \quad \text{for all } t \in [0, 1],$$
construct the dual problem (30*) and show that the extremal values are
equal and (30*) has a solution, whereas (30) has no solution.
Hint: Compare Krabs (1975, M), page 34.
52.3. Sharpening of Theorem 52.A. We use the notation from Section 52.1. Now
our assumptions read as follows:
(B1) $X$ and $Q$ are real locally convex spaces. $(X, X^*)$ and $(Q, Q^*)$ form
dual pairs.
(B2) $M\colon X \times Q \to\ ]-\infty, \infty]$ is convex and lower semicontinuous, with
$M \not\equiv +\infty$. Furthermore, $M(u, 0) = F(u)$ on $X$.
Show:
52.3a. The following three assertions are equivalent:
(i) $-\infty < \inf(P) = \sup(P^*) < \infty$.
(ii) S(0) is finite and S is lower semicontinuous at zero.
(iii) S*(0) is finite and S* is lower semicontinuous at zero.
In the cases (ii) and (iii), one says that the problems (P) and (P*),
respectively, are normal.
52.3b. The following assertions are equivalent:
(i) (P) is solvable and $-\infty < \inf(P) = \sup(P^*) < \infty$.
(ii) (P*) is stable.
An analogous equivalence holds if (P) and (P*) are interchanged.
Hint: Use Problem 51.2. Compare Ekeland and Temam (1974, M),
Chapter III, 2.
52.4. Connection with a Lagrange function. We again make use of the notation
of Section 52.1 and assume (Bl) and (B2) from Problem 52.3 hold. We
define $L\colon X \times Q^* \to [-\infty, \infty]$ by
$$-L(u, q^*) \overset{\text{def}}{=} \sup_{q \in Q} \langle q^*, q\rangle - M(u, q),$$
i.e., $-L$ is the function conjugate to $q \mapsto M(u, q)$.
52.4a. Show that the following two assertions are equivalent:
(i) u is a solution of (P), q* is a solution of (P*), and inf(P) = sup(P*).
(ii) (u, q*) is a saddle point of L with respect to X X Q*.
52.4b. Show that for stable (P) the following two assertions are equivalent:
(i) u is a solution of (P).
(ii) L has a saddle point (u,q*) with respect to X X Q*.
Hint: Compare Ekeland and Temam (1974, M), Chapter III, 3.
52.4c. Show that the following three assertions are equivalent:
(i) (P) and (P*) are stable.
(ii) $L$ has a saddle point $(u, q^*)$ with respect to $X \times Q^*$.
(iii) (P) has a solution $u$, (P*) has a solution $q^*$, and
$-\infty < \inf(P) = \sup(P^*) < \infty$.
Solution: Use Problems 52.4a, 52.4b, and 52.3b.
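52.5. Calculation of $M^*$ in Section 52.3.
Solution: Let
$$\gamma \overset{\text{def}}{=} \sup_{q \in Q} \langle q^*, q\rangle - H(Du - a - q).$$
For $p \overset{\text{def}}{=} Du - a - q$,
$$\begin{aligned}
\gamma &= \sup_{p \in Q} \langle q^*, Du - a - p\rangle - H(p) \\
&= \sup_{p \in Q} \langle D^*q^*, u\rangle + \langle -q^*, a + p\rangle - H(p) \\
&= \langle D^*q^*, u\rangle + H^*(-q^*) - \langle q^*, a\rangle.
\end{aligned}$$
From this it follows that
$$\begin{aligned}
M^*(u^*, q^*) &= \sup_{(u,q) \in X \times Q} \langle u^*, u\rangle + \langle q^*, q\rangle - F(u) - H(Du - a - q) \\
&= \sup_{u \in X} \langle u^*, u\rangle - F(u) + \gamma = F^*(D^*q^* + u^*) + H^*(-q^*) - \langle q^*, a\rangle.
\end{aligned}$$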
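52.6. Proof of Proposition 52.7. Solution: We write (13) in the form
$$F(y, u) = \int_0^a n(y(t))\sqrt{u_1^2(t) + u_2^2(t)}\,dt = \min!,$$
$$y_i'(t) = u_i(t), \quad i = 1, 2,$$
$$(t, y(t)) \in [0, a] \times A \quad \text{for all } t \in [0, a]$$
and apply Section 52.5. Then
$$\mathcal{H}(y, u, p) = p_1 u_1 + p_2 u_2 - n(y)\sqrt{u_1^2 + u_2^2}$$
with
$$\sup_{v \in \mathbb{R}^2} \mathcal{H}(y, v, p) = \begin{cases} 0 & \text{if } p_1^2 + p_2^2 \le n^2(y), \\ +\infty & \text{otherwise} \end{cases}$$
and, for $S$ satisfying (14), $d_S = S_t$; therefore, $d_S = 0$ when we choose $S$ to be independent of $t$.
(12) yields the assertion.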
52.7. Algorithm for the application of the duality principle in Section 52.5. In this
connection, study Klötzler (1979), Part II.
52.8. Duality principle, discrete control problems, discrete maximum principle, and
dynamic optimization. In this connection, study Focke and Klötzler (1978).
There the reader will find the duality principle from Section 52.5 applied
to discrete problems.
52.9. Generalized solutions of the Hamilton-Jacobi equation. Study the
comprehensive representation of recent results in Lions, Jr. (1982, L). Also, study
Crandall and Lions (1983). There, a regularization method is used to
obtain existence theorems for generalized solutions of the Hamilton-Jacobi
equation.
52.10. Capillary Equilibrium Surfaces. In this connection, many very interesting
and deep recent results can be found in the comprehensive monograph
Finn (1984).
References to the Literature
Classical works: Fenchel (1951); Rockafellar (1967).
General presentations: Göpfert (1973, M); Ekeland and Temam (1974,
M); Barbu and Precupanu (1978, M).
Applications of optimization theory in infinite-dimensional vector spaces:
Collatz and Krabs (1973, M); Göpfert (1973, M); Krabs (1975, M).
Applications to partial differential equations: Ekeland and Temam (1974,
M).
Duality for nonconvex problems: Ekeland and Temam (1974, M);
Rockafellar (1975); Klötzler (1978), (1979), (1983, S).
Minimal surfaces: Nitsche (1975, M,B,H) (this is a standard work with a
very comprehensive bibliography); Courant (1950, M); Ekeland and Temam
(1974, M); Gilbarg and Trudinger (1977, M); Fučík, Nečas, and Souček
(1977, L).
Recent trends in the theory of minimal surfaces: Tromba (1977, S);
Böhme (1981/82, S); Fomenko (1982, M); Hildebrandt (1983, S); Almgren
(1984, M).
Capillary equilibrium surfaces: Finn (1984, M, B, H) (standard work).
Generalized solutions of the Hamilton-Jacobi equations: Lions, Jr. (1982,
L,B); Crandall and Lions, Jr. (1983).
CHAPTER 53
Conjugate Functionals and Orlicz Spaces
The secret to wearying consists in saying everything.
Voltaire
In this chapter, we consider the Orlicz spaces $L_H$ and $L_{H^*}$ as generalizations
of the Lebesgue spaces $L_p$ and $L_q$, respectively, where $p, q > 1$, $p^{-1} + q^{-1} = 1$,
and explain the connection with conjugate functionals. Orlicz spaces were
introduced by Orlicz in 1932.
Whereas Lebesgue spaces and the corresponding Sobolev spaces are
appropriate for the treatment of nonlinear differential equations and
integral equations with nonlinearities which do not grow more rapidly than
certain polynomials for large functional values, one uses Orlicz spaces and
Sobolev-Orlicz spaces when the growth is more rapid, for example, for
exponential growth.
Corresponding to our general strategy concerning function spaces, which
we have already pursued in Part II, we merely summarize the important
facts about these spaces and concentrate on a typical application in Section
53.5.
53.1. Young Functions
Definition 53.1. Let $H\colon \mathbb{R} \to \mathbb{R}$ be a fixed function. $H$ is called a Young
function if and only if:
(i) $H(t) = \int_0^{|t|} h(s)\,ds$ for all $t \in \mathbb{R}$.
(ii) $h\colon \mathbb{R}_+ \to \mathbb{R}_+$ is continuous and strictly monotonically increasing.
(iii) $h(0) = 0$ and $h(s) \to +\infty$ as $s \to +\infty$.
$H$ satisfies the condition $\Delta_2$ if and only if, for fixed $t_0, c > 0$,
$$H(2t) \le cH(t) \quad \text{for all } t \ge t_0.$$
$H$ satisfies the condition $\Delta^2$ if and only if, for fixed $t_0$, $c > 1$,
$$H^2(t) \le H(ct) \quad \text{for all } t \ge t_0.$$
Proposition 53.2. If $H\colon \mathbb{R} \to \mathbb{R}$ is a Young function, then one obtains the
conjugate function $H^*$ on $\mathbb{R}$ by
$$H^*(t) = \int_0^{|t|} h^{-1}(s)\,ds.$$
Here, $h^{-1}$ is the function inverse to $h$.
This follows immediately from Proposition 51.5. For this reason, $H^*$ is
also a Young function and $H^{**} = H$. According to Proposition 51.2, the
Young inequality holds:
$$tt^* \le H(t) + H^*(t^*) \quad \text{for all } t, t^* \in \mathbb{R}, \tag{1}$$
where the equality sign occurs if and only if $t^* = H'(t)$, that is, for
$t^* = h(|t|)\operatorname{sgn} t$.
Proposition 53.3. If $H$ is a Young function, then
$$H \text{ satisfies } \Delta^2 \implies H^* \text{ satisfies } \Delta_2.$$
Proof. Compare Problem 53.1.
Example 53.4. Let $h(s) = s^{p-1}$ for all $s \ge 0$ with fixed $p$, $1 < p < \infty$. Then:
$$H(t) = p^{-1}|t|^p, \qquad H^*(t) = q^{-1}|t|^q$$
on $\mathbb{R}$, where $p^{-1} + q^{-1} = 1$. Here, $H$ and $H^*$ are Young functions that
obviously satisfy $\Delta_2$.
Example 53.5. Let $h(s) = ps^{p-1}\exp s^p$ for all $s \ge 0$ with fixed $p$, $1 < p < \infty$.
Then
$$H(t) = (\exp|t|^p) - 1$$
is a Young function on $\mathbb{R}$. Obviously, $H$ satisfies $\Delta^2$, and hence, by Proposition 53.3, $H^*$ satisfies $\Delta_2$.
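For the power functions of Example 53.4, the Young inequality (1) and its equality case $t^* = h(|t|)\operatorname{sgn} t$ can be checked numerically. The following sketch does this for the arbitrarily chosen exponent $p = 3$; the function names and the sampling range are illustrative assumptions.

```python
import random

def young_gap(t, t_star, p):
    """H(t) + H*(t*) - t t* for H(t) = |t|**p / p and H*(t) = |t|**q / q."""
    q = p / (p - 1.0)  # conjugate exponent, 1/p + 1/q = 1
    return abs(t) ** p / p + abs(t_star) ** q / q - t * t_star

def equality_point(t, p):
    """t* = h(|t|) sgn t with h(s) = s**(p - 1)."""
    sign = (t > 0) - (t < 0)
    return abs(t) ** (p - 1.0) * sign

random.seed(1)
p = 3.0
samples = [(random.uniform(-4.0, 4.0), random.uniform(-4.0, 4.0)) for _ in range(1000)]
print(min(young_gap(t, s, p) for t, s in samples) >= -1e-12)
print(abs(young_gap(1.7, equality_point(1.7, p), p)) < 1e-12)
```

The gap $H(t) + H^*(t^*) - tt^*$ is nonnegative on all samples and vanishes (up to rounding) at the equality point, as (1) predicts.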
53.2. Orlicz Spaces and their Properties
Definition 53.6. Let G be an open bounded nonempty set in $\mathbb{R}^N$ with $N \ge 1$.
Let $H: \mathbb{R} \to \mathbb{R}$ be a Young function. We set
$\rho_H(u) = \int_G H(u(x))\,dx$.
The Orlicz class $\mathcal{L}_H(G)$ is the set of all measurable functions $u: G \to \mathbb{R}$ for
which $\rho_H(u) < \infty$.
The Orlicz space $L_H(G)$ is the set of all u such that
$\alpha(u)\,u \in \mathcal{L}_H(G)$, $\alpha(u) > 0$
for an appropriate number $\alpha(u)$, which can depend on u.
$E_H(G)$ is the set of all u such that $\alpha u \in \mathcal{L}_H(G)$ for all real numbers $\alpha > 0$.
Functions which differ on a set of N-dimensional measure zero are
identified.
In the following we summarize important properties of Orlicz spaces.
Proposition 53.7. Let $L_H(G)$ be the real linear hull of $\mathcal{L}_H(G)$. Then $L_H(G)$ is
a real B-space with the norm
$\|u\|_H = \inf_{\alpha > 0} \alpha^{-1}\Big(1 + \int_G H(\alpha u(x))\,dx\Big)$.
The generalized Hölder inequality
$\Big|\int_G uv\,dx\Big| \le \|u\|_H \|v\|_{H^*}$ (2)
holds for all $u \in L_H(G)$, $v \in L_{H^*}(G)$.
Proof. Compare Problem 53.2.
Corollary 53.8. $E_H(G)$ is a closed separable subspace of $L_H(G)$. To be
precise, $E_H(G) = \overline{L_\infty(G)}$ (this is the closure in the space $L_H(G)$).
Furthermore, $L_\infty(G) \subseteq E_H(G) \subseteq \mathcal{L}_H(G) \subseteq L_H(G) \subseteq L_1(G)$.
Corollary 53.9 (The Role of $\Delta_2$). If H satisfies the $\Delta_2$ condition, then:
(a) $E_H(G) = \mathcal{L}_H(G) = L_H(G)$ and $L_H(G)$ is separable.
(b) As $n \to \infty$,
$\|u - u_n\|_H \to 0 \iff \rho_H(u - u_n) \to 0$.
(c) A set M is bounded in $L_H(G)$ if and only if $\sup_{u \in M} \rho_H(u) < \infty$.
(d) $L_H(G)^* = L_{H^*}(G)$.
To be precise, relation (d) means that for each linear continuous functional
$u^* \in L_H(G)^*$, there exists exactly one $\bar{u}^* \in L_{H^*}(G)$ such that
$u^*(u) = \int_G \bar{u}^* u\,dx$ for all $u \in L_H(G)$,
and in this way each $\bar{u}^* \in L_{H^*}(G)$ generates a $u^* \in L_H(G)^*$.
Corollary 53.10. $L_H(G)$ is reflexive if and only if H and $H^*$ satisfy $\Delta_2$.
All proofs can be found in Krasnoselskii and Rutickii (1958, M), (1958a,
S) and Kufner, John, and Fučík (1977, M).
Example 53.11. For the Young function $H(t) = p^{-1}|t|^p$, $1 < p < \infty$, we
have
$L_H(G) = L_p(G)$, $L_{H^*}(G) = L_q(G)$,
where $p^{-1} + q^{-1} = 1$ and
$\|u\|_H = q^{1/q}\|u\|_{L_p(G)}$, $\|u\|_{H^*} = p^{1/p}\|u\|_{L_q(G)}$.
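The norm formula of Example 53.11 can be verified numerically in the simplest case of a constant function $u \equiv a$ on a set G of given measure. The sketch below assumes the norm of Proposition 53.7 in the form $\inf_{\alpha>0}\alpha^{-1}(1+\rho_H(\alpha u))$; the names `orlicz_norm_constant` and `closed_form`, and the use of a ternary search in $\log\alpha$, are illustrative choices and not part of the text.

```python
import math


def orlicz_norm_constant(a: float, p: float, measure: float = 1.0) -> float:
    """Approximate inf over alpha > 0 of (1 + rho_H(alpha*u)) / alpha for u = a on G.

    Here H(t) = |t|^p / p, so rho_H(alpha*u) = measure * (alpha*|a|)^p / p.
    The objective is unimodal in s = log(alpha), so ternary search applies.
    """
    def g(s: float) -> float:
        alpha = math.exp(s)
        rho = measure * (alpha * abs(a)) ** p / p
        return (1.0 + rho) / alpha

    lo, hi = -20.0, 20.0
    for _ in range(200):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if g(m1) < g(m2):
            hi = m2
        else:
            lo = m1
    return g(0.5 * (lo + hi))


def closed_form(a: float, p: float, measure: float = 1.0) -> float:
    """q^(1/q) * ||u||_{L_p(G)} with ||u||_{L_p(G)} = |a| * measure^(1/p)."""
    q = p / (p - 1.0)
    return q ** (1.0 / q) * abs(a) * measure ** (1.0 / p)
```

For p = 2 and a = 1 on a set of measure 1, the objective is $1/\alpha + \alpha/2$, whose minimum $\sqrt{2}$ agrees with $q^{1/q}\|u\|_{L_2}$.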
53.3. Linear Integral Operators in Orlicz Spaces
We investigate the linear integral operator
$(Ku)(x) := \int_G k(x, y)u(y)\,dy$
under the following assumptions:
(H1) G is an open bounded nonempty set in $\mathbb{R}^N$, $N \ge 1$.
(H2) $H: \mathbb{R} \to \mathbb{R}$ is a Young function satisfying the $\Delta^2$ condition, e.g.,
$H(t) = (\exp|t|^p) - 1$, $1 < p < \infty$.
(H3) $k \in L_H(G \times G)$, i.e., there exists an $\alpha > 0$ for which
$\int_{G \times G} H(\alpha k(x, y))\,dx\,dy < \infty$.
(H4) We set $X = L_{H^*}(G)$.
Then $X^* = L_H(G)$; for, according to Proposition 53.3, $H^*$ satisfies the $\Delta_2$
condition and Corollary 53.9 yields $X^* = L_{H^{**}}(G)$. Furthermore, $H^{**} = H$
by Proposition 53.2.
Proposition 53.12. With the assumptions (H1)-(H4), the operator $K: X \to X^*$
is linear and continuous. Moreover, K is compact when (H3) is satisfied for all
$\alpha > 0$.
The proof can be found in Krasnoselskii and Rutickii (1958, M),
Theorems 6.6, 15.4B, 16.5.
53.4. The Nemyckii Operator in Orlicz Spaces
We shall study the Nemyckii operator F generated by
$F(u)(x) := f(x, u(x))$.
Proposition 53.13. $F: X^* \to X$ is continuous and bounded provided the
following three conditions hold:
(i) G, H, and X satisfy the same conditions as in Section 53.3.
(ii) $f: G \times \mathbb{R} \to \mathbb{R}$ satisfies a Caratheodory condition, e.g., f is continuous.
(iii) f satisfies the growth condition:
$|f(x, u)| \le b(x) + \Gamma(|u|)$ for all $(x, u) \in G \times \mathbb{R}$,
where $b \in L_{H^*}(G)$, the function $\Gamma: \mathbb{R}_+ \to \mathbb{R}_+$ is continuous monotonely
increasing and for each $c > 0$ there exists an $s_0(c) > 0$ such that
$\Gamma(cs) \le H(s)$ for all $s \ge s_0(c)$. (3)
Proof. According to Krasnoselskii and Rutickii (1958, M), Theorem 6.3,
Section 6.5, the $\Delta^2$-condition for H implies, for fixed $r, s_1 > 0$, the estimate
$H^*(H(s)) \le H(rs)$ for all $s \ge s_1$.
Since $H^*$, being a Young function, is monotonely increasing on $\mathbb{R}_+$, by (3),
for each $c > 0$ there exists a $t_0(c) > 0$ such that
$H^*(\Gamma(t/c)) \le H^*(H(t/r)) \le H(t)$ for all $t \ge t_0(c)$.
Now an analogous line of reasoning to that in the proof of Theorem 4.2 in
Krasnoselskii and Rutickii (1958a) yields the assertion. □
Example 53.14. For the real functions
$H(t) := (\exp|t|^p) - 1$, $1 < p < \infty$,
$\Gamma(t) := \exp\beta|t|$, $\beta > 0$,
all the assumptions on H and $\Gamma$ in Proposition 53.13 are satisfied.
53.5. Application to Hammerstein Integral Equations
with Strong Nonlinearities
We consider the Hammerstein integral equation
$u(x) + \int_G k(x, y) f(y, u(y))\,dy = 0$. (4)
In this connection, we allow / to grow exponentially with respect to u. The
following proposition is a typical example of the application of Orlicz
spaces. In the proof, we shall make use of many of the auxiliary means
prepared in the preceding sections.
Proposition 53.15. Equation (4) has exactly one measurable solution $u: G \to \mathbb{R}$
such that
$\int_G \exp|u(x)|^p\,dx < \infty$ for all $p > 1$,
where modification of u on an N-dimensional set of measure zero is permitted,
when the following assumptions are satisfied:
(i) G is a bounded open nonempty set in $\mathbb{R}^N$, $N \ge 1$.
(ii) The kernel $k: G \times G \to \mathbb{R}$ is measurable, bounded, and symmetric, i.e.,
$k(x, y) = k(y, x)$ for all $x, y \in G$.
(iii) k is positive in the sense that
$\int_G \Big(\int_G k(x, y)v(y)\,dy\Big) v(x)\,dx \ge 0$ for all $v \in L_\infty(G)$.
(iv) $f: G \times \mathbb{R} \to \mathbb{R}$ satisfies a Caratheodory condition, e.g., f is continuous.
(v) f is monotonely increasing with respect to u and satisfies the growth
condition
$|f(x, u)| \le a + b e^{\beta|u|}$
for all $(x, u) \in G \times \mathbb{R}$, where $a, b, \beta > 0$ are fixed numbers.
Proof. We will apply Theorem 28.A in Part II. To this end, we choose the
Young function
$H(t) := (\exp|t|^p) - 1$, $p > 1$,
and set $X = L_{H^*}(G)$. The functions H and $H^*$ satisfy the $\Delta^2$- and
$\Delta_2$-conditions, respectively. According to Corollary 53.9, $X = E_{H^*}(G)$, and X is real
and separable. We write (4) in the form
$u + KFu = 0$, $u \in X^*$, (4a)
where K and F are generated by k and f, respectively. The operator
$K: X \to X^*$ is linear and continuous by Proposition 53.12.
We show that K is monotone. Since $L_\infty(G)$ is dense in $E_{H^*}(G)$ and
$X = E_{H^*}(G)$, it immediately follows from (iii) that $\langle Kv, v\rangle_X \ge 0$ for all
$v \in X$.
Analogously, from (ii) it follows that $\langle Kv, w\rangle_X = \langle Kw, v\rangle_X$ for all $v, w \in X$.
According to Proposition 53.13 and Example 53.14, the operator
$F: X^* \to X$ is continuous. Moreover, F is a monotone operator; for, because of
the monotonicity of f with respect to u, for all $u, v \in X^*$, we have
$\langle u - v, F(u) - F(v)\rangle_X = \int_G [u(x) - v(x)][f(x, u(x)) - f(x, v(x))]\,dx \ge 0$.
Here we think of X as a subset of $X^{**}$.
Now, according to Theorem 28.A, for each $p > 1$, (4a) has exactly one
solution $u \in X^*$. Since $X^* = L_H(G)$, we thus have
$\int_G \exp\big(\lambda_p(u)|u(x)|^p\big)\,dx < \infty$
for an appropriate $\lambda_p(u) > 0$. The estimate
$\exp|t|^{p'} \le C(\lambda, p')\exp(\lambda|t|^p)$
for all $\lambda > 0$, $p'$ such that $1 < p' < p$, and all $t \in \mathbb{R}$, shows that this existence
proposition is equivalent to the assertion of Proposition 53.15. □
53.6. Sobolev-Orlicz Spaces
Let G be an open bounded nonempty set in $\mathbb{R}^N$, $N \ge 1$. We denote by
$W^m L_H(G)$ the collection of all $u \in L_H(G)$ which have generalized
derivatives $D^\alpha u$ up to and including order m in the sense of Definition 21.2, where
$D^\alpha u \in L_H(G)$ holds for all $\alpha$ such that $|\alpha| \le m$. We set
$\|u\|_{m,H} := \Big(\sum_{|\alpha| \le m} \|D^\alpha u\|_H^2\Big)^{1/2}$. (5)
Proposition 53.16. Relative to the norm (5), $W^m L_H(G)$ is a real B-space (it is
a so-called Sobolev-Orlicz space).
The proof is analogous to that of Proposition 21.10. Parallel to Section
53.5, Sobolev-Orlicz spaces play an important role in the treatment of
nonlinear partial differential equations. However, in this connection, one
must note the fact that, as a rule, these spaces are not reflexive. We
recommend Gossez (1974), (1979, S) and Schumann (1982). Also, compare
Problem 53.5.
Problems
53.1. Proof of Proposition 53.3. Hint: Compare Krasnoselskii and Rutickii (1958,
M), Lemma 5.1, Theorem 6.6.
53.2.* Proof of Proposition 53.7. Hint: Compare Krasnoselskii and Rutickii (1958,
M), Chapter 2.
53.3. Proof of Example 53.11. Solution: A short calculation.
53.4.* Proof of Propositions 53.12 and 53.13. Study the references to the literature
given in the text.
53.5.** Application to partial differential equations. We consider the boundary value
problem
$-\sum_{i=1}^{N} D_i[h(D_i u(x))] = f(x)$, $x \in G$, (6)
$u(x) = 0$ on $\partial G$,
under the following assumptions:
(i) G is a bounded region in $\mathbb{R}^N$ with $\partial G \in C^{0,1}$ and $N \ge 1$.
(ii) $h: \mathbb{R} \to \mathbb{R}$ is continuous, monotonely increasing, and odd, and $h(s) \to +\infty$
as $s \to +\infty$. We set $H(t) = \int_0^{|t|} h(s)\,ds$.
(iii) $f \in E_{H^*}(G)$.
Show: (6) has a generalized solution $u \in \mathring{W}^1 L_H(G)$, i.e.,
$\int_G \sum_{i=1}^{N} h(D_i u) D_i v\,dx = \int_G f v\,dx$
for all $v \in \mathring{W}^1 E_H(G)$. The solution u is uniquely determined when h is
strictly monotonely increasing.
Here, $\mathring{W}^1 L_H(G)$ and $\mathring{W}^1 E_H(G)$ denote the closure of $C_0^\infty(G)$ in
$W^1 L_H(G)$ and $W^1 E_H(G)$, respectively, with respect to suitable
topologies. The boundary condition $u = 0$ on $\partial G$ is by this means taken into
account in a generalized way.
Hint: Compare Gossez (1974), (1979, S). There one will find the
development of a general theory of generalized pseudomonotone
operators in so-called complementary systems of Sobolev-Orlicz spaces.
Consider the following special cases:
(i) Strong growth of the coefficient function h:
$h(t) = t e^{|t|}$, $H(t) = (|t| - 1)e^{|t|} + 1$.
(ii) Weak growth:
$h(t) = \operatorname{sgn} t \cdot \ln(1 + |t|)$,
$H(t) = (1 + |t|)\ln(1 + |t|) - |t|$.
(iii) Polynomial growth:
$h(t) = |t|^{p-2}t$, $1 < p < \infty$.
In (iii), $\mathring{W}^1 L_H(G) = \mathring{W}^1 E_H(G) = \mathring{W}_p^1(G)$ and $E_{H^*}(G) = L_q(G)$, $p^{-1} + q^{-1} = 1$.
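The closed forms for H in the special cases (i) and (ii) can be checked against the definition $H(t) = \int_0^{|t|} h(s)\,ds$ by numerical quadrature. The following is a small sketch using composite Simpson quadrature; all function names are illustrative and not taken from the text.

```python
import math


def simpson(func, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    step = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * step)
    return total * step / 3.0


def H_from_h(h, t: float) -> float:
    """H(t) = integral of h over [0, |t|], as in assumption (ii)."""
    return simpson(h, 0.0, abs(t))


def h_strong(s: float) -> float:
    return s * math.exp(abs(s))


def H_strong(t: float) -> float:
    return (abs(t) - 1.0) * math.exp(abs(t)) + 1.0


def h_weak(s: float) -> float:
    return math.copysign(1.0, s) * math.log(1.0 + abs(s))


def H_weak(t: float) -> float:
    return (1.0 + abs(t)) * math.log(1.0 + abs(t)) - abs(t)
```

Comparing `H_from_h(h_strong, t)` with `H_strong(t)`, and likewise for the weak-growth pair, at a few sample values of t gives agreement up to quadrature error.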
Approximation methods for such differential equations can be found in
Schumann (1982).
References to the Literature
Classical work: Orlicz (1932).
Orlicz spaces: Krasnoselskii and Rutickii (1958, M,B,H) (this is a
standard work); Kufner, John, and Fučík (1977, M).
Application to nonlinear integral equations: Krasnoselskii and Rutickii
(1958, M), (1958a, S); Amann (1969).
Application to nonlinear elliptic partial differential equations: Gossez
(1974), (1979, S); Schumann (1982) (approximation methods).
VARIATIONAL INEQUALITIES
In most sciences one generation tears down what another has built, and what
one has established another undoes. In mathematics alone each generation
builds a new story to the old structure.
Hermann Hankel, 1839-1873
In Parts I and II, as well as in the preceding chapters, we have already
encountered variational inequalities several times. In Chapter 9, we
explained the connection between variational inequalities and the fixed-point
theory for multivalued mappings. In Chapter 32, existence propositions for
variational inequalities resulted from a direct application of the main
theorem on maximal monotone operators. Here in Part III, we have so far
encountered variational inequalities as necessary and also partly as sufficient
conditions for solutions of minimum problems on convex sets. For example,
in Section 47.10 we learned that there is a close connection between the
Kuhn-Tucker theory and variational inequalities. Furthermore, the proof of
Pontrjagin's maximum principle in Section 48.7 was based on the
investigation of a variational inequality.
In the next four chapters, parallel to the treatment of monotone operator
equations and first- and second-order evolution equations in Part II, we will
consider the corresponding variational inequalities. In this connection, in
Chapters 54-56, we pursue the unified strategy of reducing variational
inequalities with the aid of the subgradient $\partial\varphi$ to multivalued operator
equations and evolution equations, which need not necessarily be related to
variational problems, i.e., it is not absolutely necessary for potential
operators to appear. We explain this by an example involving the variational
inequality:
$\langle b - Au, v - u\rangle + \varphi(u) \le \varphi(v)$ for all $v \in X$. (1°)
By the definition of the subgradient $\partial\varphi$, when $\varphi(u) \ne +\infty$, this is equivalent
to
$b - Au \in \partial\varphi(u)$
or
$Au + \partial\varphi(u) \ni b$. (2°)
If $\varphi$ equals the indicator function $\chi_M$, i.e., $\varphi(v) = 0$ if $v \in M$ and $\varphi(v) = +\infty$
if $v \notin M$, then (1°) passes into
$\langle b - Au, v - u\rangle \le 0$, $v \in M$. (3°)
We already encountered such problems in Chapter 46, with $A = F'$. Now,
however, A need not be a potential operator. At the focal point of our
existence proofs stands the concept of a maximal monotone operator and
the main theorem for maximal monotone operators (Theorem 32.A in
Section 32.3). In this connection, in an essential way we make use of the fact
that, for a convex lower semicontinuous functional $\varphi$, the mapping $\partial\varphi$ is
maximal monotone (Theorem 47.F in Section 47.11). In order to enable the
reader to compare the results, we specialize the main theorems in Chapters
54-56 to quadratic variational inequalities.
In Chapter 57 we treat multivalued first-order evolution equations in
B-spaces. In place of maximal monotone operators in H-spaces there appear
m-accretive operators. There we also explain the connection with nonexpan-
sive semigroups as a generalization of Chapter 31. In this connection, a
generalized concept of the solution is essential, i.e., we consider so-called
integral solutions.
Equation (2°), i.e., $Au + \partial\varphi(u) \ni b$, contains the operator equation $Au = b$
and the Euler equation $\varphi'(u) = 0$ as special cases. Thus, (2°) represents a
coalescence of the theory of operator equations with the calculus of
variations.
Another important strategy for handling variational inequalities is offered
by the Galerkin method, parallel to Part II. We shall not delve into this. A
detailed presentation can be found in Lions (1969, M) and Duvaut and
Lions (1972, M). In this connection, one frequently combines the Galerkin
method with a regularization. For example, one can replace $Au + \partial\varphi(u) \ni b$
in (2°) by
$Au + \varphi_\mu'(u) = b$, (4°)
where $\varphi_\mu$ is a regularization of $\varphi$ for small $\mu$ and the F-derivative $\varphi_\mu'$
represents the Yosida approximation of $\partial\varphi$. We deal with the Yosida
approximation of multivalued maximal monotone operators in Section 55.2
where we generalize results of Chapter 31.
A comprehensive investigation of numerical methods for handling
variational inequalities is contained in Glowinski, Lions, and Tremolieres
(1976, M).
The theory of variational inequalities has been developed over the last 20
years in intimate connection with physical applications in elasticity and
plasticity theory, hydrodynamics, etc. In this connection, it is frequently a
question of problems with one-sided constraints (one-sided conditions in
elasticity theory, flow through walls, which permit transfer of matter or heat
in only one direction, etc.). In Section 37.7 we have already pointed out
important applications in connection with free boundary value problems
(determination of the dampness region caused by leakage of water through a
dam, fusion zone of ice, etc.). We shall discuss several of these applications,
e.g., in modern plasticity theory, in Part IV. In general, we recommend the
following for applications: Duvaut and Lions (1972, M), Baiocchi and
Capelo (1978, M), Gröger (1979, S), Kinderlehrer and Stampacchia (1980,
M), Hlaváček and Nečas (1981, M), and Friedman (1982, M).
An additional main area of application for variational inequalities arises
in control problems with a quadratic objective functional, where the control
equations are partial differential equations. A detailed discussion of this can
be found in Lions (1971, M). The connection between control problems and
quasivariational inequalities is presented in Aubin (1979, M). Finally, there
exist intimate interconnections between variational inequalities, stochastic
differential equations, and stochastic optimization. One can find this in
Friedman (1975, M), (1979, S), Bensoussan and Lions (1978, M), and
Bensoussan (1982, M). The last reference is recommended as an
introduction to this field.
In Sections 54.4-54.9 we elucidate several methods for the investigation
of control problems for partial differential equations and integral equations.
The strategy, which we have already mentioned in Section 37.23, consists in
the following:
(α) The control problem is reduced to a minimum problem over a subset of
the product of the state space and the control space, or by elimination
of the state there results a minimum problem over the control set.
(β) The variational inequality that arises is simplified by the introduction
of adjoint states.
Another possibility is to apply the method of so-called needle variations
described in Section 37.23 (cf. Problem 54.7).
The investigation of the smoothness of the solutions of variational
inequalities presents a difficult analytic problem. Simple physical examples
already show that, in contrast to the solutions of equations, one has to deal
with weaker regularity (cf. Problem 54.4). A thorough investigation of these
problems can be found in Brezis (1972), Kinderlehrer and Stampacchia
(1980, M), and Friedman (1982, M).
In Chapter 64 in Part IV we delve into bifurcation problems for
variational inequalities and their applications in elasticity theory. In Chapter 77
in Part IV we consider the connection between quasivariational inequalities
and mathematical economics.
CHAPTER 54
Elliptic Variational Inequalities
Mathematics takes us still further from what is human, into the region of
absolute necessity, to which not only the actual world, but every possible
world must conform.
Bertrand Russell
54.1. The Main Theorem
We consider the variational inequality
$\langle b - Au, v - u\rangle + \varphi(u) \le \varphi(v)$ for all $v \in M$ (5)
for $u \in M$ and, parallel to this, the multivalued operator equation
$Au + \partial\varphi(u) \ni b$, $u \in M$ (6)
under the following assumptions:
(H1) X is a real separable reflexive B-space.
(H2) M is a convex closed nonempty subset of X.
(H3) $\varphi: M \to\ ]-\infty, \infty]$ is convex lower semicontinuous and $\varphi \not\equiv +\infty$.
In the following, we think of $\varphi$ as extended to X by $\varphi(v) := +\infty$ for
$v \in X - M$. Then $\varphi: X \to\ ]-\infty, \infty]$ is likewise convex and lower
semicontinuous.
(H4) $A: M \subseteq X \to X^*$ is pseudomonotone, demicontinuous, and bounded.
For instance, these assumptions are fulfilled when $A: M \subseteq X \to X^*$ is
monotone, hemicontinuous, and bounded.
(H5) Coerciveness. If M is unbounded, then there exist $u_0 \in M$, $v_0 \in X^*$
such that $v_0 \in \partial\varphi(u_0)$, i.e., $\varphi(u_0) < +\infty$ and
$\varphi(u_0) + \langle v_0, v - u_0\rangle \le \varphi(v)$ for all $v \in M$
as well as
$\frac{\langle Au, u - u_0\rangle}{\|u\|} \to +\infty$ as $\|u\| \to \infty$, $u \in M$.
(H6) b is a fixed element in $X^*$.
The condition for $\varphi$ in (H5) is fulfilled, e.g., when $v_0 = \varphi'(u_0)$ and $\varphi'(u_0)$
exists as a G-derivative.
Theorem 54.A. With the assumptions (H1)-(H6), the following two assertions
hold:
(a) Equivalence. (5) and (6) are mutually equivalent.
(b) Existence. (5) has a solution.
Proof. (a) This is a direct consequence of the definition of $\partial\varphi$.
(b) According to Theorem 47.F in Section 47.11, the mapping
$\partial\varphi: X \to 2^{X^*}$ is maximal monotone. Then the restriction of $\partial\varphi$ to M is also
maximal monotone. Theorem 32.A in Section 32.1 yields a solution u for
(6). □
In Theorem 32.C in Section 32.5 we have already shown that for
monotone A and $\varphi = 0$ on M, the solution set of (5) is bounded, closed, and
convex. If A is strictly monotone and $\varphi = 0$ on M, then the solution of (5) is
unique.
54.2. Application to Coercive Quadratic Variational
Inequalities
We consider the quadratic variational inequality
$b(v - u) - a(u, v - u) + \varphi(u) \le \varphi(v)$ for all $v \in M$. (7)
We seek $u \in M$. A frequently occurring special case results when $\varphi = 0$ on
M. An important assumption is the strong positiveness of $a(\cdot, \cdot)$, i.e.,
$a(u, u) \ge c\|u\|^2$ for all $u \in X$,
where c is a positive constant. In the next section we partially free ourselves
from this restriction.
Proposition 54.1. Problem (7) with $\varphi \equiv 0$ on M has exactly one solution
provided the following four conditions hold:
(i) X is a real separable reflexive B-space.
(ii) M is a closed convex nonempty set in X.
(iii) $a: X \times X \to \mathbb{R}$ is bilinear, bounded, and strongly positive.
(iv) $b: X \to \mathbb{R}$ is linear and continuous.
Corollary 54.2. (7) has a solution if, in addition to (i)-(iv), the following two
conditions hold:
(v) $\varphi: M \to\ ]-\infty, \infty]$ is convex, lower semicontinuous, and $\varphi \not\equiv +\infty$.
(vi) If M is unbounded, then there exist $u_0 \in M$, $v_0 \in X^*$ such that
$v_0 \in \partial\varphi(u_0)$, i.e., $\varphi(u_0) < +\infty$ and
$\varphi(u_0) + \langle v_0, v - u_0\rangle \le \varphi(v)$ for all $v \in M$.
We shall give the proofs in Problem 54.1. In connection with these results,
compare Section 46.6.
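In finite dimensions, the situation of Proposition 54.1 can be illustrated by a projection iteration. The sketch below takes $X = \mathbb{R}^n$, $a(u, v) = \langle Au, v\rangle$ with a not necessarily symmetric matrix A, and M a box, and iterates $u \leftarrow P_M(u - \rho(Au - b))$. This classical scheme is not the proof given in the text; it is only one way to compute the solution, and it is assumed here that the step size satisfies $0 < \rho < 2c/\|A\|^2$. All names are illustrative.

```python
def project_box(u, lower, upper):
    """Orthogonal projection of u onto the box M = [lower, upper]."""
    return [min(max(ui, lo), hi) for ui, lo, hi in zip(u, lower, upper)]


def matvec(A, u):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(aij * uj for aij, uj in zip(row, u)) for row in A]


def solve_vi(A, b, lower, upper, rho=0.2, iters=500):
    """Fixed-point iteration u <- P_M(u - rho * (A u - b))."""
    u = list(lower)
    for _ in range(iters):
        r = [ri - bi for ri, bi in zip(matvec(A, u), b)]
        u = project_box([ui - rho * ri for ui, ri in zip(u, r)], lower, upper)
    return u


def vi_residual(A, b, u, v):
    """a(u, v - u) - b(v - u); (7) with phi = 0 requires this to be >= 0 on M."""
    r = [ri - bi for ri, bi in zip(matvec(A, u), b)]
    return sum(ri * (vi - ui) for ri, vi, ui in zip(r, v, u))
```

For example, with A = [[2, 1], [-1, 2]] (so that $a(u, u) = 2\|u\|^2$), b = (3, -1), and M = [0, 1] x [0, 1], the unconstrained equation Au = b has the solution (1.4, 0.2), which lies outside M, whereas the variational inequality has its solution on the boundary of M.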
54.3. Semicoercive Variational Inequalities
In the following we concern ourselves with the quadratic variational
problem
$b(v - u) \le a(u, v - u)$ for all $v \in M$. (8)
We seek $u \in M$. According to Section 46.2, for symmetric $a(\cdot, \cdot)$, the
variational problem associated with (8) reads as follows:
$\min_{u \in M} 2^{-1}a(u, u) - b(u) = \alpha$. (9)
However, now we do not assume $a(\cdot, \cdot)$ to be strongly positive on the entire
space but rather on a subspace. It is crucial that in this connection there
appears an additional side condition for b. Proposition 54.3 constitutes the
basis for the handling of the Signorini problem of elasticity theory. We shall
discuss this in Chapter 63 in Part IV.
Our assumptions are:
(H1) X is a real H-space.
(H2) M is a closed convex nonempty subset of X.
(H3) $a: X \times X \to \mathbb{R}$ is bilinear, bounded, positive, and symmetric.
(H4) $b: X \to \mathbb{R}$ is linear and continuous.
In order to be able to formulate the important additional conditions, we
define:
$N_a := \{u \in X : a(u, u) = 0\}$, $N_b := \{u \in X : b(u) = 0\}$.
In Problem 54.2 we show that $N_a = N(A)$, where $a(u, v) = (Au|v)$.
Therefore, $N_a$ and $N_b$ are closed linear subspaces. Thus, there exist orthogonal
surjective projection operators
$P: X \to N_a$; $Q: X \to N_a \cap N_b$.
Now the following conditions are crucial:
(H5) $\dim N_a < \infty$, and $(I - Q)(M)$ is closed.
(H6) Semicoerciveness. There exists a $c > 0$ such that
$a(v, v) \ge c\|(I - P)v\|^2$ for all $v \in X$.
(H7) Compatibility condition. We have
$b(v) \le 0$ for all $v \in N_a \cap M$.
In applications to mechanics, this is a side condition for the external forces.
Proposition 54.3. With the assumptions (H1)-(H6), the following assertions
hold:
(1) Equivalence. Problems (8) and (9) are mutually equivalent.
(2) Uniqueness. If $u, u_1$ are two solutions of (8), then $u - u_1 \in N_a$.
(3) Existence. (8) has a solution when M is bounded. If M is a cone, then (8)
has a solution if and only if (H7) holds.
Proof. (1) Compare Theorem 46.A in Section 46.1.
(2) The addition of
$b(u_1 - u) \le a(u, u_1 - u)$, $b(u - u_1) \le a(u_1, u - u_1)$
yields
$0 \ge a(u - u_1, u - u_1) \ge c\|(I - P)(u - u_1)\|^2$.
(3) We set
$F(u) := 2^{-1}a(u, u) - b(u)$
and
$S := I - Q$, $T := P - Q$, $U := I - P$.
Then the orthogonal decomposition holds:
$S(X) = T(X) \oplus U(X)$. (10)
T is an orthogonal projection operator onto $N_a \ominus (N_a \cap N_b)$. In order to solve
(8), by assertion (1), it suffices to find a solution of
$\min_{u \in M} F(u) = \alpha$. (11)
Let $(u_n)$ be a minimal sequence of (11), i.e., $F(u_n) \to \alpha$, and let $v_n := Su_n$.
(I) We show: If $(v_n)$ is bounded, then (11) possesses a solution. After
possibly passing to a subsequence, $v_n \rightharpoonup v$ as $n \to \infty$. From $v_n \in S(M)$ and the
fact that S(M) is closed and convex, it follows that $v \in S(M)$; therefore,
$v = Su$ holds for some $u \in M$. $N_a = N(A)$ in Problem 54.2 yields $F(z) = F(Sz)$ for all $z \in X$.
Due to the weak lower semicontinuity of F, we obtain
$F(u) = F(v) \le \liminf_{n \to \infty} F(v_n) = \lim_{n \to \infty} F(u_n) = \alpha$;
therefore, $F(u) = \alpha$.
(II) We show that $(v_n)$ is bounded. This is trivial when M is bounded.
Thus, let M now be a cone. It is assumed that $(v_n)$ is unbounded. Then,
after possibly passing to a subsequence, $\|v_n\| \to \infty$. We set $w_n := \alpha_n^{-1}u_n$,
$\alpha_n := \|v_n\|$. Then $Sw_n = \alpha_n^{-1}v_n$ and $\|Sw_n\| = 1$.
Semicoerciveness (H6) yields
$c\|Uu_n\|^2 \le a(u_n, u_n) = 2F(u_n) + 2b(v_n)$. (12)
Observe that $b(u_n) = b(v_n)$ because $b(Qz) = 0$ for all $z \in X$. From (12) and
$S = T + U$ it follows that
$c\alpha_n^2\|Uw_n\|^2 \le \text{constant} + 2\|b\|\alpha_n$, (12a)
$2^{-1}c\,\alpha_n\|Uw_n\|^2 \le \alpha_n^{-1}F(u_n) + b(Tw_n) + b(Uw_n)$. (12b)
(12a) shows that $Uw_n \to 0$ as $n \to \infty$ since $\alpha_n \to +\infty$. Below we shall prove:
For a subsequence $(w_{n'})$ we have $Tw_{n'} \to z$, where $b(z) < 0$. (13)
Then the desired contradiction is obtained from (12b), for the right-hand
side in (12b) tends to the negative value b(z).
(III) Proof of (13). From (10) it follows that
$1 = \|Sw_n\|^2 = \|Tw_n\|^2 + \|Uw_n\|^2$;
therefore, $\|Tw_n\| \to 1$ as $n \to \infty$. Since $Tw_n \in N_a$ and $\dim N_a < \infty$, there exists
a subsequence $(w_{n'})$ such that $Tw_{n'} \to z$ and $\|z\| = 1$. From $w_{n'} \in M$,
$Sw_{n'} = Tw_{n'} + Uw_{n'}$, and the fact that S(M) is closed, it follows that $z \in S(M)$, i.e.,
$z = (I - Q)w$ for a $w \in M$; thus, $b(z) = b(w)$.
Since $R(T) \subseteq N_a$ and $\dim N_a < \infty$, we have $z \in N_a$; thus,
$w = (w - z) + z = Qw + z \in N_a$, i.e., $w \in N_a \cap M$. The compatibility condition (H7) yields
$b(w) \le 0$.
We will show $b(z) < 0$. If we had $b(z) = 0$, then $b(w) = 0$, i.e., $w \in N_a \cap N_b$;
consequently, $Qw = w$, $z = 0$, in contradiction to $\|z\| = 1$. (13) is thus
proven.
(IV) (H7) is necessary for a solution of (8) for a cone M. Let u in M be a
solution of (8), i.e.,
$b(v - u) \le a(u, v - u)$ for all $v \in M$.
Furthermore, let $w \in N_a \cap M$. Since $N_a = N(A)$, $a(u, w) = 0$ by Problem
54.2. From $tw \in M$, for all $t \ge 0$, it follows that
$t\,b(w) - b(u) \le a(u, tw - u) = -a(u, u)$.
As $t \to +\infty$, we obtain $b(w) \le 0$. □
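The role of the compatibility condition (H7) can be seen in a two-dimensional example. The following worked example is an illustration added here under the stated choices of X, a, b, and M; it is not taken from the text.

```latex
% Illustrative example: X = R^2, M a closed half-plane (a cone).
\[
  X = \mathbb{R}^2, \qquad a(u,v) = u_1 v_1, \qquad
  b(u) = \beta_1 u_1 + \beta_2 u_2, \qquad
  M = \{u \in \mathbb{R}^2 : u_2 \ge 0\}.
\]
% Null space of a and semicoerciveness (H6) with c = 1:
\[
  N_a = \{(0,s) : s \in \mathbb{R}\}, \qquad
  P u = (0,u_2), \qquad
  a(v,v) = v_1^2 = \|(I-P)v\|^2 .
\]
% Compatibility condition (H7) on N_a \cap M = \{(0,s) : s \ge 0\}:
\[
  b(0,s) = \beta_2 s \le 0 \ \text{ for all } s \ge 0
  \quad\Longleftrightarrow\quad \beta_2 \le 0 .
\]
% The functional of problem (9):
\[
  F(u) = \tfrac12 u_1^2 - \beta_1 u_1 - \beta_2 u_2 , \qquad u \in M .
\]
% Three cases:
\[
  \begin{cases}
    \beta_2 < 0: & \text{unique minimizer } u = (\beta_1, 0),\\
    \beta_2 = 0: & \text{minimizers } u = (\beta_1, s),\ s \ge 0,
                   \text{ which differ by elements of } N_a,\\
    \beta_2 > 0: & \inf_{u \in M} F(u) = -\infty, \text{ no solution.}
  \end{cases}
\]
```

The three cases agree with assertions (2) and (3) of Proposition 54.3: a solution exists exactly when (H7) holds, and two solutions differ by an element of $N_a$.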
54.4. Variational Inequalities and Control Problems
We consider the control problem
$F(z, u) = \min!$, $u \in V$, $z \in Z$, (14)
$Az = Bu + f$,
with the state quantity z and the control quantity u. A frequently used
method for reducing control problems to purely minimum problems consists
of introducing the set X of all admissible (z, u), i.e.,
$X := \{(z, u) \in Z \times U : Az = Bu + f,\ z \in D(A),\ u \in V\}$.
Then (14) passes to the equivalent problem
$F(z, u) = \min!$, $(z, u) \in X$. (15)
The whole apparatus we have developed for minimum problems can now be
applied to this problem. A frequently used trick consists of introducing an
adjoint state p that simplifies the variational inequality resulting from (15):
$\langle F_z(z, u), y - z\rangle + \langle F_u(z, u), v - u\rangle \ge 0$ for all $(y, v) \in X$. (16)
As we shall see, we then obtain
$A^*p = F_z(z, u)$, (17a)
$\langle B^*p + F_u(z, u), u - v\rangle \le 0$ for all $v \in W$. (17b)
Here, W is the set of all control quantities $v \in V$ that correspond to a state z
such that $Az = Bv + f$. The variational inequality (17b) is a simple form of
Pontrjagin's maximum principle; for, the expression appearing on the
left-hand side in (17b) takes on its maximum for $v = u$. Our assumptions
read as follows:
(H1) Z, Y, and U are real reflexive B-spaces. V is a closed convex subset of
U. Here, V describes the control restrictions.
(H2) $B: U \to Y$ is linear and continuous, and $A: D(A) \subseteq Z \to Y$ is linear and
closed, with $\overline{D(A)} = Z$ and closed range R(A) (cf. $A_1(39)$).
(H3) f is a fixed element in Y and $X \ne \emptyset$.
(H4) $F: Z \times U \to \mathbb{R}$ is convex and lower semicontinuous.
Theorem 54.B. With the assumptions (H1)-(H4), the following two assertions
hold:
(1) Existence and uniqueness. (14) has a solution (z, u) provided X is
bounded or
$F(z, u) \to +\infty$ as $\|z\| + \|u\| \to \infty$, $(z, u) \in X$.
The solution is unique when F is strictly convex on X.
(2) Characterization of the solution. If, in addition, F is F-differentiable on
$Z \times U$, then (z, u) is a solution of (14) if and only if there exists a p satisfying
(17a) and (17b).
This theorem permits numerous applications to control problems, where
linear differential or integral equations correspond to the control equation
$Az = Bu + f$.
Proof. (1) X is closed and convex according to (H2). Section 38.5 yields (1).
(2) According to Theorem 46.A, (z, u) is a solution of (15) if and only if
(16) holds. For $v = u$, $y = z + w$, $w \in N(A)$, it follows from (16) that
$\langle F_z(z, u), w\rangle = 0$ for all $w \in N(A)$.
Since $R(A^*) = N(A)^\perp$ by $A_1(39)$, then $A^*p = F_z(z, u)$ has a solution p. Now
(17b) follows directly from (16). Observe that $Ay - Az = Bv - Bu$ for $(y, v) \in X$.
Conversely, (16) follows from (17). □
If the control equations Az = Bu + f are nonlinear, then, in an analogous
way, we obtain existence propositions by solving the minimum problem
with respect to F over X. Then we have to make use of structure
propositions on X. Such propositions, in connection with the theory of monotone
operators in application to parameter identification problems, can be found
in Kluge (1979a, S) and Nürnberg (1979).
54.5. Application to Bilinear Forms
As a special case of (14), we study the control problem
$2^{-1}[c(z - z_0, z - z_0) + d(u, u)] = \min!$, (18a)
$u \in V$, $z \in Z$, (18b)
$a(z, w) = b(u, w) + g(w)$ for all $w \in Z$.
In preparation for this, we also state:
$a(w, p) = c(z - z_0, w)$ for all $w \in Z$, (19)
$b(u - v, p) + d(u, u - v) \le 0$ for all $v \in V$,
p is a fixed element in Z.
Proposition 54.4. The control problem (18) possesses a solution (u, z), and this
solution is characterized by (18b) and (19), provided the following assumptions
are fulfilled:
(i) Z and U are real separable H-spaces. V is a closed, convex, bounded,
and nonempty subset of U.
(ii) The bilinear forms $a, c: Z \times Z \to \mathbb{R}$, $b: U \times Z \to \mathbb{R}$, and $d: U \times U \to \mathbb{R}$
are bounded. Furthermore, a is strongly positive, and c and d are positive
and symmetric.
(iii) $z_0 \in Z$, $g \in Z^*$ are fixed.
The solution is unique when c or d is strictly positive.
The characterization of the solutions is to be understood in the following
sense. If u, z is a solution of (18), then there exists a p in Z for which (18b)
and (19) hold. Conversely, if one has a p in Z such that (18b) and (19) hold,
then u, z is a solution of (18).
Proof. For (18b), the representation formulas in Section 21.5 yield the
equation $Az = Bu + f$ with
$a(z, w) = \langle Az, w\rangle$, $b(u, w) = \langle Bu, w\rangle$, $\langle f, w\rangle = g(w)$.
$A: Z \to Z^*$ is strongly monotone; therefore $R(A) = Z^*$ by Theorem 26.A in
Section 26.2, and $A^{-1}$ exists. Then the assertion follows from Theorem 54.B
in Section 54.4 with $Y = Z^*$. □
The following corollary also follows immediately from Theorem 54.B.
Corollary 54.5. (18) has exactly one solution (u, z), and this solution is
characterized by (18b), (19) provided (i)-(iii) hold with the following
modifications:
(a) V is not necessarily bounded.
(b) $d(\cdot, \cdot)$ is strongly positive.
54.6. Application to Control Problems with Elliptic
Differential Equations
In Chapter 22 we discussed in detail how one formulates generalized
boundary value problems for elliptic differential equations with the aid of
equations for bilinear forms. An abundance of examples can be obtained
thereby from Section 54.5. As a simple problem, we consider
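$2^{-1}\int_G (z - z_0)^2\,dx = \min!$, (20a)
$G: -\Delta z = u$; $\partial G: z = 0$, (20b)
$u \in V$, $z \in \mathring{W}_2^1(G)$.
In addition, we state
$G: -\Delta p = z - z_0$; $\partial G: p = 0$, (21)
$\int_G p(u - v)\,dx \le 0$ for all $v \in V$,
$p \in \mathring{W}_2^1(G)$.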
Problem (20) has the following simple physical interpretation. Let z be the
steady temperature distribution in a region G, and let u be an external heat
source. Suppose u varies over a set V and is to be so determined that z
arbitrarily closely approaches a desired temperature distribution z0 in the
sense of root mean square.
The assumptions read as follows:
(H1) G is a bounded region in $\mathbb{R}^N$, $N \ge 1$.
(H2) V is a convex closed bounded nonempty subset of $L_2(G)$.
(H3) $z_0 \in \mathring{W}_2^1(G)$ is given and fixed.
We think of the boundary value problems (20b) and (21) as in Section 22.2
in the generalized sense, i.e., (20b) means
$a(z, w) = b(u, w)$ for all $w \in \mathring{W}_2^1(G)$,
where
$a(z, w) = \int_G \sum_{i=1}^{N} D_i z\,D_i w\,dx$, $b(u, w) = \int_G uw\,dx$.
The following proposition now follows directly from Proposition 54.4 and
Section 22.2 with $U = L_2(G)$, $Z = \mathring{W}_2^1(G)$.
Proposition 54.6. If (H1)-(H3) hold, then (20) has exactly one solution (u, z)
and this solution is characterized by (20b), (21).
54.7. Semigroups and Control of Evolution Equations
We consider
$\int_0^T [(z(t)|z(t)) + (u(t)|u(t))]\,dt = \min!$, (22a)
$u \in L_2(0, T; U)$, $z \in L_2(0, T; Z)$,
$z'(t) = Az(t) + Bu(t)$, $0 < t \le T$, (22b)
$z(0) = z_0$.
This problem is called a linear regulator problem. It is frequently employed
in engineering as an approximation for more complicated nonlinear control
problems. The solution of this problem is based on
$p'(t) = -A^*p(t) - z(t)$, $p(T) = 0$, (22c)
$z'(t) = Az(t) - BB^*p(t)$, $z(0) = z_0$,
$u(t) = -B^*p(t)$
for all t, $0 < t < T$. Here, u is the control quantity and z is the state quantity.
We interpret t as time. We call p the adjoint state. The introduction of p
essentially simplifies the characterization of the solution.
Also, in order to be able to handle the control equation (22b) easily, to
which correspond unbounded operators A and thus, say, parabolic
differential equations, we write (22b) in the form
$z(t) = S(t)z_0 + \int_0^t S(t - s)Bu(s)\,ds$ (22b*)
and (22c) in the form
$p(t) = \int_t^T S^*(s - t)z(s)\,ds$, (22c*)
$z(t) = S(t)z_0 - \int_0^t S(t - s)BB^*p(s)\,ds$,
$u(t) = -B^*p(t)$
for all t, $0 < t < T$. In this connection, compare this with Example 54.7
below. Our assumptions read as follows:
(HI) Z and U are real H-spaces. The fixed end time T, 0 < T< oo, and the
initial state z0eZ are given.
(H2) B: U -* Z is linear and bounded.
(H3) {S(t): t>0} is a linear continuous semigroup, i.e., S(t): Z-> Z is
linear and continuous for each t > 0, S(t + s) = S(t)S(s) for all
t,s >0, 5(0) = 1 and S(t)z-* z as t -* + 0 for all z (= X.
We denote the corresponding adjoint operators by
$A^*: D(A^*) \subseteq Z \to Z, \quad S^*(t): Z \to Z, \quad B^*: Z \to U.$
Thus for the sake of simplicity, we forego the notation S*\ etc., given in the
List of Symbols to distinguish between dual and adjoint operators.
Theorem 54.C. If (H1)-(H3) hold, then the control problem (22a), (22b*)
has exactly one solution which is characterized by (22c*).
Proof. We make use of some elementary properties of semigroups which
can be found, e.g., in Balakrishnan (1975, M), Chapter 4. The situation
when the special case of Example 54.7 below occurs is particularly intuitive.
Let $X := L_2(0,T;Z)$ and $Y := L_2(0,T;U)$. We write (22b*) briefly in the
form
$z = Mz_0 + Lu.$ (23)
Here $L: Y \to X$ is linear and continuous. Now the following minimum problem
results from (22a):
$Q(u) := (Mz_0 + Lu \,|\, Mz_0 + Lu) + (u|u) = \min!, \quad u \in Y.$
A short calculation yields
$Q(v) - Q(u) = ((I + L^*L)(v-u) \,|\, v-u)$ for all $v \in Y$,
where
$u := -(I + L^*L)^{-1}L^*Mz_0.$
Since $I + L^*L$ is strongly positive, the inverse operator exists. Therefore,
this $u$ is the unique solution of the minimum problem. For $u$ we have
$u + L^*(Lu + Mz_0) = 0;$ (24)
therefore, $u = -L^*z$ because of (23). If we define
$p(t) := \int_t^T S^*(s-t)z(s) \, ds,$
then $u = -L^*z = -B^*p$ (cf. Problem 54.6), and we obtain (22c*). □
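The "short calculation" in the proof can be written out as follows. This is only a sketch; the increment $h := v - u$ is a name introduced here for convenience and does not appear in the text.

```latex
\begin{align*}
Q(u+h) - Q(u)
  &= 2\,(Mz_0 + Lu \,|\, Lh) + (Lh \,|\, Lh) + 2\,(u \,|\, h) + (h \,|\, h) \\
  &= 2\,\bigl(u + L^*(Lu + Mz_0) \,\big|\, h\bigr)
     + \bigl((I + L^*L)h \,\big|\, h\bigr).
\end{align*}
```

For $u = -(I + L^*L)^{-1}L^*Mz_0$ the first term vanishes, which is exactly (24), and the remaining term satisfies $((I + L^*L)h|h) \ge \|h\|^2$, so it is positive for $h \ne 0$.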
54.8. Application to the Synthesis Problem for Linear
Regulators
We consider two typical examples for Theorem 54.C.
Example 54.7 (Bounded A and the Synthesis Problem). If $A: Z \to Z$ is a
continuous linear operator, then we can choose $S(t) := \exp(tA)$, and (22b) and
(22c) are equivalent to (22b*) and (22c*), respectively, as one can easily
verify. This situation occurs, e.g., in the case where $Z = \mathbb{R}^n$, $U = \mathbb{R}^m$. Then
the control equations (22b) are systems of ordinary differential equations. A
and B are matrices. In addition, we will explain how one obtains the
feedback control that is important in engineering for optimal control
(solution of the synthesis problem). To this end, we write
u(t) = -B*P(t)z(t) (25)
with the corresponding so-called Riccati equation
$P'(t) = -I - A^*P(t) - P(t)A + P(t)BB^*P(t), \quad P(T) = 0.$ (26)
We assert: If $P(\cdot)$ is a continuously differentiable solution of (26) on $[0, T]$,
then one obtains the optimal control $u(\cdot)$ from
z'(t) = Az(t)-BB*P(t)z(t), z(0) = z0
and (25).
The proof is very simple. We set p(t) = P(t)z(t). Then the product rule
yields (22c) and the assertion follows from Theorem 54.C.
It is remarkable that P(t) satisfies the nonlinear equation (26). This
nonlinearity occurs because the objective functional in (22a) is quadratic. In
general, it is a nontrivial problem to solve the Riccati equation (26). In
Problem 54.11 we give some hints for this existence problem.
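As an illustration of the synthesis procedure in Example 54.7, the following sketch treats the scalar case $Z = U = \mathbb{R}$, where $A$ and $B$ are real numbers `a` and `b`. The function names, the Runge-Kutta scheme for (26), and the Heun scheme for the closed-loop equation are choices made here and are not taken from the text. For `a = 0`, `b = 1` the Riccati equation reduces to $P' = P^2 - 1$, $P(T) = 0$, whose solution is $P(t) = \tanh(T - t)$; this gives a simple consistency check.

```python
import math


def solve_riccati_scalar(a, b, T, steps=2000):
    """Integrate P' = -1 - 2 a P + b^2 P^2, P(T) = 0, backward in time (RK4).

    Returns the step size h and a list P with P[i] approximating P(i * h).
    """
    def rhs(P):
        return -1.0 - 2.0 * a * P + (b * P) ** 2

    h = T / steps
    P = 0.0                      # terminal condition P(T) = 0
    values = [P]
    for _ in range(steps):
        # one RK4 step with step size -h (from t to t - h)
        k1 = rhs(P)
        k2 = rhs(P - 0.5 * h * k1)
        k3 = rhs(P - 0.5 * h * k2)
        k4 = rhs(P - h * k3)
        P -= (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        values.append(P)
    values.reverse()             # now values[i] belongs to t = i * h
    return h, values


def closed_loop(a, b, z0, h, P):
    """Integrate z' = (a - b^2 P(t)) z, z(0) = z0 (Heun) and return z and u = -b P z."""
    z = [z0]
    for i in range(len(P) - 1):
        f0 = (a - b * b * P[i]) * z[i]
        predictor = z[i] + h * f0
        f1 = (a - b * b * P[i + 1]) * predictor
        z.append(z[i] + 0.5 * h * (f0 + f1))
    u = [-b * P[i] * z[i] for i in range(len(P))]   # feedback law (25)
    return z, u


h, P = solve_riccati_scalar(a=0.0, b=1.0, T=1.0)
z, u = closed_loop(a=0.0, b=1.0, z0=1.0, h=h, P=P)
print("P(0) =", P[0], " z(T) =", z[-1], " u(0) =", u[0])
```

The feedback form (25) needs only the current state $z(t)$ and the precomputed function $P(\cdot)$, which is why it is preferred in engineering over the open-loop characterization (22c).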
Example 54.8 (Unbounded A). Let $A: D(A) \subseteq Z \to Z$ be a linear
operator, let $-A$ be monotone, and let $R(I - A) = Z$. According to Theorem
31.A in Section 31.1, the operator $A$ generates a continuous linear
semigroup $\{S(t): t \ge 0\}$. Then one can think of (22b*) and (22c*) as
generalized formulations of (22b) and (22c), respectively.
To solve the synthesis problem, one must write the Riccati equation (26)
in the generalized form. This, together with existence propositions for the
Riccati equation, can be found in Balakrishnan (1975, M), 5.2.
54.9. Application to Control Problems with Parabolic
Differential Equations
We consider
$\int_0^T \Big( \int_G \big[ z(x,t)^2 + u(x,t)^2 \big] \, dx \Big) dt = \min!$ (27)
with the boundary-initial-value problem
$z_t(x,t) = \Delta z(x,t) + u(x,t)$ on $G \times \,]0, T]$, (28)
$z(x,t) = 0$ on $\partial G \times [0,T]$,
$z(x,0) = z_0(x)$ on $G$
as control equation. Let $G$ be a bounded region in $\mathbb{R}^N$. In order to write (28)
in the generalized form in the style
$z'(t) = Az(t) + u(t), \quad z(0) = z_0,$ (29)
we set $Z := L_2(G)$ and define $B := -\Delta$, $D(B) = C_0^\infty(G)$. According to
Section 31.4, $B$ has a self-adjoint extension $B_F$ in $Z$ (Friedrichs' extension).
Finally, we now set $A := -B_F$. Again, by Section 31.4, the situation of
Example 54.8 is at hand and we can apply the results of Section 54.7 with
$U = L_2(G)$. Then, in place of (29), the integral equation (22b*) occurs. This
is a generalization of (28) and (29) as well.
Problems
54.1. Proof of the results in Section 54.2. Solution: By Section 21.5, there exists an
operator $A: X \to X^*$ which is linear and strongly positive. Now use
Theorem 54.A.
54.2. A special subspace. Show: $N_a$ in Section 54.3 is a closed linear subspace.
Solution: According to Section 21.5, there exists a continuous positive
self-adjoint linear operator $A: X \to X$ such that $a(u,v) = (Au|v)$ for all
$u, v \in X$. The operator $A$ has a square root $A^{1/2}$; thus, $a(u,u) =
(A^{1/2}u|A^{1/2}u)$. From this it easily follows that $N_a = N(A)$.
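54.3. Complementarity problem. Let $F: \mathbb{R}^N_+ \to \mathbb{R}^N$ be given. We seek a $u \in \mathbb{R}^N_+$
such that
$(F(u)|u) = 0, \quad F(u) \in \mathbb{R}^N_+.$ (30)
Show that for $u \in \mathbb{R}^N_+$:
$u$ is a solution of (30) $\Leftrightarrow$ $(F(u)|v - u) \ge 0$ for all $v \in \mathbb{R}^N_+$.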
Hint: Compare Kinderlehrer and Stampacchia (1980, M), page 17.
Generalize this proposition to B-spaces and formulate the corresponding existence
propositions. Compare Barbu and Precupanu (1978, M), pages 127,168. For
the connection between the complementarity problem and numerous
problems of linear and nonlinear optimization theory, the reader is referred to
Karamardian (1969) and Göpfert (1973, M).
54.4.* Obstacle problem and regularity. We consider
$\int_0^1 \big[ 2^{-1}u'^2 - uf \big] \, dt = \min!, \quad u \in K,$
where
$K := \{u \in \mathring{W}_2^1(0,1): u \ge g \text{ on } ]0,1[\}.$
The functions $f \in L_2(0,1)$ and $g \in W_2^1(0,1)$ are given. Furthermore, let $g(0)$,
$g(1) < 0$, and suppose we are given an $x_0 \in [0,1]$ such that $g(x_0) > 0$.
Moreover, let $K \ne \emptyset$.
Figure 54.1
This problem allows the following interpretation:
$u(x)$ is the displacement at $x$ of a string which is fastened at the
boundaries, $g$ describes an obstacle (see Fig. 54.1).
Show: There exists exactly one solution. Formulate the corresponding
variational inequality and prove that u(x) > g(x) implies the continuity of
u' at x.
Hint: Compare Kinderlehrer and Stampacchia (1980, M), page 47. There
one also finds detailed considerations concerning generalizations to R N.
54.5.* General semicoercive problems. Let $M$ be a closed convex set in the real
reflexive B-space $X$, with $0 \in M$. We consider the variational inequality
$\langle b - Au, v - u \rangle \le 0$ for all $v \in M$. (31)
Let $A: X \to X^*$ be pseudomonotone, demicontinuous, and semicoercive, i.e.,
$\langle Au, u \rangle \ge c[p(u)]^q$ for all $u \in X$,
where $c > 0$, $q > 1$ are fixed. Here, $p$ is a seminorm on $X$. Let $X$ be
compactly embedded in a real B-space $Y$, where $p(\cdot) + \|\cdot\|_Y$ is an equivalent
norm on $X$. Furthermore, let
$N := \{u \in X: p(u) = 0\}.$
Show:
(i) If $M \cap N$ is bounded, then (31) has a solution $u \in M$ for each $b \in X^*$.
(ii) If $M \cap N$ is unbounded, then (31) has a solution $u \in M$ provided
$b \in X^*$ and $\langle b, v \rangle < 0$ for all $v \in M \cap N$ with $v \ne 0$.
Hint: Compare Hess (1974). There one also finds generalizations and
applications. The idea for the proof consists of reducing the problem to the
coercive case.
54.6. The operator $L^*$. Show that $L^*: X \to Y$ does in fact have the form given in
the proof of Theorem 54.C.
Solution: Let
$z(t) = \int_0^t S(t-s)Bu(s) \, ds = (Lu)(t),$
$y(t) = B^* \int_t^T S^*(s-t)w(s) \, ds.$
It must be shown that $L^*w = y$, i.e.,
$\int_0^T (z(t)|w(t)) \, dt = \int_0^T (u(t)|y(t)) \, dt.$ (32)
First, we consider the situation of Example 54.7; thus, $S(t) = \exp(tA)$. Then:
$z'(t) = Az(t) + Bu(t), \quad y'(t) = -B^*A^*y(t) - w(t).$ (33)
Now (32) follows first for $B = I$ by the integration of
$(z(t)|y(t))' = (u(t)|y(t)) - (z(t)|w(t)),$
taking into account that $z(0) = y(T) = 0$. For $B \ne I$, one obtains (32) by
replacing $u$ with $Bu$.
In the general case, one first uses polynomials for u,w. Then (33) holds,
where A is the infinitesimal generator of the semigroup (cf. Balakrishnan
(1975, M), 4.8). Now, (32) follows from this as above. For general u, w, one
uses a passage to the limit in (32).
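The identity (32) can also be checked numerically in the simplest situation. The sketch below assumes the scalar case $Z = U = \mathbb{R}$ with $S(t) = e^{at}$ and $B = b$; the test functions `u`, `w`, the parameter values, and the trapezoidal quadrature are arbitrary choices made here for illustration.

```python
import math


def trapezoid(values, h):
    """Composite trapezoidal rule for equally spaced samples (0 for a single sample)."""
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))


def adjoint_identity_sides(a=-0.7, b=1.3, T=1.0, n=400):
    """Return both sides of (32) for S(t) = exp(a t), B = b in the scalar case."""
    h = T / n
    t = [i * h for i in range(n + 1)]
    u = [math.sin(3.0 * ti) for ti in t]      # arbitrary control sample
    w = [ti * ti for ti in t]                 # arbitrary test function

    # z(t_i) = int_0^{t_i} exp(a (t_i - s)) b u(s) ds = (Lu)(t_i)
    z = [trapezoid([math.exp(a * (t[i] - t[j])) * b * u[j] for j in range(i + 1)], h)
         for i in range(n + 1)]
    # y(t_i) = b int_{t_i}^T exp(a (s - t_i)) w(s) ds, the candidate for (L* w)(t_i)
    y = [b * trapezoid([math.exp(a * (t[j] - t[i])) * w[j] for j in range(i, n + 1)], h)
         for i in range(n + 1)]

    lhs = trapezoid([z[i] * w[i] for i in range(n + 1)], h)   # int (z | w) dt
    rhs = trapezoid([u[i] * y[i] for i in range(n + 1)], h)   # int (u | y) dt
    return lhs, rhs


lhs, rhs = adjoint_identity_sides()
print("int (Lu|w) dt =", lhs, "  int (u|y) dt =", rhs)
```

Both sides agree up to quadrature error, which reflects the interchange of the order of integration behind (32).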
54.7. Needle variations and control problems for partial differential equations and
integral equations. In this connection, study Butkovskii (1965, M), Chapter
1, Section 2 and Lurje (1965,M), Chapter 1, Section 7. There the Pontrjagin
maximum principle is derived in an elementary way with the aid of the
method of so-called needle variations described in Section 37.23.
In this connection, also study Bittner (1975) and von Wolfersdorf (1976)
(nonlinear integral equations). In von Wolfersdorf (1975) a maximum
principle is formulated for a class of heating processes. In this connection,
the semilinear parabolic differential equations are reduced to nonlinear
integral equations by means of Green's functions.
54.8.* Control problems for linear elliptic, parabolic, and hyperbolic differential
equations with quadratic objective functional. In this connection, study Lions
(1971, M). There one finds numerous examples. The simple general strategy
consists of using the Hilbert-space methods presented in Chapters 22-24,
eliminating the state quantities, and simplifying the variational inequalities
that arise by introducing adjoint states. For time-dependent problems one
also uses the formula for integration by parts with respect to time.
54.9.* Semigroups and control problems. In this connection, study Balakrishnan
(1975, M). There one can also find the investigation of control problems in
which stochastic differential equations appear.
54.10.* Duality between the linear regulator problem and the Kalman-Bucy filter. As
we have already explained in Section 37.25, the Kalman-Bucy filter plays
an important role in the filtering of nonstationary stochastic processes. The
duality referred to in this problem heading is exploited to reduce the
investigation of the Kalman-Bucy filter to the linear regulator problem. For
this, study Fleming and Rishel (1975, M), pages 133-141 and Astrom (1970,
M), page 242.
54.11.* Dynamic optimization and the linear regulator problem. An advantage of
dynamic optimization is that it allows one to formulate sufficiency criteria.
In this connection, study Fleming and Rishel (1975, M), pages 88, 165.
There, these sufficiency criteria are used in order to determine optimal
controls for deterministic and stochastic linear regulators.
Linear regulators have diversified technical applications. For this, study
Lee and Markus (1967, M) and Astrom (1970, M), as well as the literature
given in the references to the literature in Chapter 48 under the caption
"Linear systems."
54.12.* Solution of the Riccati equation and the synthesis problem. Give conditions
that guarantee the existence of solutions of the Riccati equation (26).
According to Example 54.7, knowledge of these solutions is basic to the
solution of the synthesis problem.
Hint: Compare Fleming and Rishel (1975, M), page 89 (the case of $\mathbb{R}^N$),
Lions (1971, M), Chapter 3, 4.3, and Balakrishnan (1975, M). Additional
references to the literature can be found in Barbu and Precupanu (1978, M),
page 299.
54.13.* Linear regulators, Bellman's equation, dynamic optimization, synthesis
problem, Riccati equation, stopping time problems, quasivariational inequality of
Bensoussan and Lions, and impulse control. As an introduction to this series
of problems, study Aubin (1979, M), pages 500-520. There applications to
economics are also pointed out. The theory of quasivariational inequalities
arose in connection with these questions. The Bensoussan-Lions
quasivariational inequality is intimately connected with the Bellman equation, which
in turn is a generalization of the Hamilton-Jacobi differential equation. In
addition, for stochastic control theory, study Friedman (1979, S),
von Moerbeke (1974, S), (1976), Bensoussan and Lions (1978, M), and
Bensoussan (1982, M).
54.14.* Control problems with partial differential equations and engineering
applications. In this connection, study Butkovskii (1965, M), (1975, M) and Lurje
(1975, M).
References to the Literature
Classical works: Compare Section 37.7.
Introduction: Lions (1971, M); Kinderlehrer and Stampacchia (1980, M).
General presentations: Browder (1966), (1968/76, M); Lions (1969, M);
Duvaut and Lions (1972, M); Mosco (1973, S), (1976, S); Ekeland and
Temam (1974, M); Barbu (1976, M); Barbu and Precupanu (1978, M);
Pascali and Sburlan (1978, M); Kluge (1979, M); Aubin (1979, M).
Semicoercive problems: Fichera (1964), (1973); Hess (1974); Kinderlehrer
and Stampacchia (1980, M).
Applications: Fichera (1973, S); Duvaut and Lions (1972, M); Baiocchi
and Capelo (1978, M); De Giorgi (1978, P); Groger (1979, S); Aubin (1979,
M); Kinderlehrer and Stampacchia (1980, M); Hlavacek and Necas (1981,
M); Friedman (1982, M).
Approximation methods: Mosco (1973, S); Glowinski, Lions, and
Tremolieres (1976, M,B); Glowinski (1980, L).
Parabolic differential equations and the bang-bang principle: Glashoff
(1976), (1977).
Stochastic differential equations, stochastic optimization, and variational
inequalities: von Moerbeke (1974, S), (1976); Friedman (1975, M), (1979,
S); Bensoussan and Lions (1978, M); Bensoussan (1982, M) (recommended
as an introduction).
Control with partial differential equations and integral equations.
Introduction: Butkovskii (1965, M), (1975, M); Ahmed and Teo (1981,
M,B).
Hilbert space methods: Lions (1971, M) (standard work).
Survey articles: Wang (1964, S); Butkovskii, Egorov, and Lurje (1968, S);
Robinson (1971, S); Lions (1976, S), (1977, S), (1980, S).
Monographs especially emphasizing engineering applications: Butkovskii
(1965, M), (1975, M); Lurje (1975, M); Sirazetdinov (1977, M)
(aerodynamics); Egorov (1978, M) (nuclear reactors); Ray and Lainiotis (1978,
M).
Conference reports on applications: Anger (1979, P); Tzafestas (1980, P);
IFIP conferences (1978, P), (1978a, P), (1979, P).
Selected works and monographs: Beckert (1972), (1977) (control of
stability of elastic systems); Warga (1972, M) (generalized solutions, relaxed
control); Bittner (1975); Balakrishnan (1975, M) (application of semigroups);
von Wolfersdorf (1975), (1975a), (1976); Seidman (1977); Goebel and von
Wolfersdorf (1978); Barbu and Precupanu (1978, M); Walker (1980, M);
Lions, Jr. (1982, L); Lions (1983, M).
(Also, cf. the references to the literature in Chapter 48 on existence
theory.)
CHAPTER 55
Evolution Variational Inequalities of First
Order in H-Spaces
I have hardly ever known a mathematician who was able to reason.
Plato, 370 B.C.
A great science is mathematics, but mathematicians are often only blockheads.
Georg Christoph Lichtenberg, 1799
I had a feeling about Mathematics—that I saw it all. Depth beyond Depth
was revealed to me—the Byss and the Abyss. I saw as one might see the
transit of Venus or even the Lord Mayor's Show—a quantity passing through
infinity and changing its sign from plus to minus. I saw exactly how it
happened and why the tergiversation was inevitable—but it was after dinner
and I let it go.
Winston S. Churchill (1874-1965)
In this chapter we generalize the results of Chapter 31 to problems of the
form
$u'(t) + Au(t) - \omega u(t) \ni b(t), \quad 0 < t < T,$
$u(0) = u_0$
in an H-space. In this connection we use the results that we established in
Chapter 23 (generalized derivatives and evolution triples) and Chapter 32
(maximal monotone operators). As an important methodological tool, we
make use of the Yosida approximation. In Chapter 66 we shall consider
applications to plasticity theory.
55.1. The Resolvent of Maximal Monotone Operators
For the multivalued operator $A: H \to 2^H$, we consider the equation
$b \in u + \mu Au, \quad u \in H$ (1)
for fixed $\mu > 0$. We define the resolvent of $A$ by $R_\mu := (I + \mu A)^{-1}$, which
always exists as a multivalued operator. We now explain the fundamental
connection between the monotonicity of $A$ and the accretiveness of $R_\mu$.
Proposition 55.1. Let $A: H \to 2^H$ be a multivalued operator on the real
H-space $H$. Then the following hold:
(A) The following properties are equivalent:
(i) $A$ is monotone.
(ii) $A$ is accretive, i.e., $R_\mu$ is single-valued and nonexpansive for all $\mu > 0$.
(B) The following properties are equivalent:
(a) $A$ is maximal monotone.
(b) $A$ is monotone and $R(I + A) = H$.
(c) $A$ is m-accretive, i.e., $A$ is accretive and $R_\mu$ exists for every $\mu > 0$ on $H$.
In particular, it follows from (B) that (1) has a solution for each $b \in H$
provided $A$ is maximal monotone.
Proof. (A) We write the identity
$\|u_1 - u_2 + \mu(v_1 - v_2)\|^2$ (2)
$= \|u_1 - u_2\|^2 + 2\mu(v_1 - v_2|u_1 - u_2) + \mu^2\|v_1 - v_2\|^2.$
If $A$ is monotone, then
$(v_1 - v_2|u_1 - u_2) \ge 0$ for all $(u_1, v_1), (u_2, v_2) \in A$; (3)
therefore,
$\|u_1 - u_2\| \le \|u_1 - u_2 + \mu(v_1 - v_2)\|$ (4)
for all $(u_i, v_i) \in A$. Thus, (ii) holds. Conversely, (3) follows from (2) and (4).
(B) (a) $\Rightarrow$ (b) This is a special case of Theorem 32.A.
(b) $\Rightarrow$ (a) If $A$ is not maximal monotone, then there exists a $(u', v') \notin A$
such that
$(v - v'|u - u') \ge 0$ for all $(u, v) \in A$. (5)
Since $R(I + A) = H$, there exists a $(u, v) \in A$ such that $u + v = u' + v'$;
therefore, $u' = u$, $v' = v$ by (5), in contradiction to $(u', v') \notin A$.
(a) $\Leftrightarrow$ (c) Observe that if $A$ is maximal monotone, then so is $\mu A$ for $\mu > 0$.
□
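A concrete one-dimensional instance may help to visualize Proposition 55.1. The sketch below assumes $H = \mathbb{R}$ and takes for $A$ the subdifferential of $|\cdot|$, i.e., $Au = \{\operatorname{sign} u\}$ for $u \ne 0$ and $A0 = [-1, 1]$; this example operator and the function names are chosen here and do not come from the text. For this $A$ the resolvent is the familiar soft-thresholding map, and the code checks the inclusion $b \in u + \mu Au$ for $u = R_\mu b$ together with the nonexpansiveness of $R_\mu$ on a grid.

```python
def resolvent_abs(b, mu):
    """R_mu b = (I + mu A)^{-1} b for A = subdifferential of |.| on the real line."""
    if b > mu:
        return b - mu
    if b < -mu:
        return b + mu
    return 0.0


def in_A(u, v, tol=1e-9):
    """Test v in Au, where Au = {sign u} for u != 0 and A0 = [-1, 1]."""
    if u > 0:
        return abs(v - 1.0) <= tol
    if u < 0:
        return abs(v + 1.0) <= tol
    return -1.0 - tol <= v <= 1.0 + tol


def check_resolvent(grid, mus):
    """Check equation (1) and nonexpansiveness of R_mu on the given sample points."""
    for mu in mus:
        images = [resolvent_abs(b, mu) for b in grid]
        for b, u in zip(grid, images):
            if not in_A(u, (b - u) / mu):          # b in u + mu * Au
                return False
        for b1, u1 in zip(grid, images):
            for b2, u2 in zip(grid, images):
                if abs(u1 - u2) > abs(b1 - b2) + 1e-12:
                    return False
    return True


grid = [0.25 * i for i in range(-12, 13)]
print(check_resolvent(grid, mus=(0.5, 1.0, 2.0)))
```

Although $A$ is multivalued at $u = 0$, the resolvent is single-valued everywhere, as part (A) of the proposition asserts.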
55.2. The Nonlinear Yosida Approximation
Let $A: H \to 2^H$ be a maximal monotone operator in the real H-space $H$. For
all $\mu > 0$ we define the Yosida approximation $A_\mu$ by
$A_\mu := \mu^{-1}(I - R_\mu).$
Furthermore, for $\mu = 0$ and $u \in D(A)$, we set:
$A_0 u :=$ the element with smallest norm in $Au$.
This convention is meaningful because, for each $u \in D(A)$, the set $Au$ is
convex, closed, and nonempty. Otherwise, one could easily construct a
proper monotone extension of $A$. By Section 46.4, $A_0 u$ is uniquely
determined. For $\mu \ge 0$, the mapping $A_\mu: D(A_\mu) \subseteq H \to H$ is single-valued with
$D(A_0) = D(A)$ and $D(A_\mu) = H$ for $\mu > 0$. In the sequel, $\uparrow$ and $\downarrow$ mean
monotone convergence.
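Proposition 55.2. For all $\mu, \lambda > 0$ and $u, v \in H$, we have:
(a) $A_\mu u \in AR_\mu u$.
(b) $\|A_\mu u - A_\mu v\| \le \mu^{-1}\|u - v\|$.
(c) $A_\mu$ is maximal monotone.
(d) $(A_\mu)_\lambda = A_{\mu+\lambda}$.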
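Corollary 55.3. For all $u \in D(A)$,
$A_\mu u \to A_0 u, \quad \|A_\mu u\| \uparrow \|A_0 u\|$ as $\mu \downarrow 0$ (6)
and
$\|A_\mu u - A_0 u\|^2 \le \|A_0 u\|^2 - \|A_\mu u\|^2$ for all $\mu > 0$. (7)
For $u \notin D(A)$, $\|A_\mu u\| \uparrow \infty$ as $\mu \downarrow 0$.
For $u \in D(A)$, $R_\mu u \to u$ as $\mu \to +0$.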
We treat the proofs in Problems 55.1 and 55.2.
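The statements of Corollary 55.3 can be observed on a simple example. The sketch below again assumes $H = \mathbb{R}$ and $A = \partial|\cdot|$ (an example chosen here, not taken from the text), for which $R_\mu$ is soft thresholding, $A_0 u = \operatorname{sign} u$ for $u \ne 0$, and $A_0 0 = 0$. It evaluates $A_\mu u = \mu^{-1}(u - R_\mu u)$ for decreasing $\mu$ and tests the inequality (7).

```python
def resolvent_abs(b, mu):
    """R_mu b = (I + mu A)^{-1} b for A = subdifferential of |.| on the real line."""
    if b > mu:
        return b - mu
    if b < -mu:
        return b + mu
    return 0.0


def yosida_abs(u, mu):
    """Yosida approximation A_mu u = (u - R_mu u) / mu."""
    return (u - resolvent_abs(u, mu)) / mu


def minimal_section_abs(u):
    """A_0 u: the element of smallest absolute value in Au."""
    if u > 0:
        return 1.0
    if u < 0:
        return -1.0
    return 0.0


def satisfies_inequality_7(u, mu, tol=1e-12):
    """Check |A_mu u - A_0 u|^2 <= |A_0 u|^2 - |A_mu u|^2."""
    a_mu, a_0 = yosida_abs(u, mu), minimal_section_abs(u)
    return (a_mu - a_0) ** 2 <= a_0 ** 2 - a_mu ** 2 + tol


u = 0.5
for mu in (2.0, 1.0, 0.5, 0.25):
    print(mu, yosida_abs(u, mu), satisfies_inequality_7(u, mu))
```

For this operator $A_\mu u$ equals $u/\mu$ as long as $|u| \le \mu$ and $\operatorname{sign} u$ afterwards, so $|A_\mu u|$ increases to $|A_0 u|$ as $\mu$ decreases to $0$.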
55.3. The Main Theorem for Inhomogeneous
Problems
As a generalization of Chapter 31, we study the inhomogeneous problem
$u'(t) + Au(t) - \omega u(t) \ni b(t),$ (8)
$u(0) = u_0$
for almost all $t \in \,]0, T[$, where $T$ is fixed, $0 < T < \infty$.
Theorem 55.A. For fixed given quantities
$u_0 \in D(A), \quad b \in W_2^1(0,T;H), \quad \omega \in \mathbb{R},$
problem (8) has exactly one solution $u \in W_2^1(0,T;H)$ provided $A: H \to 2^H$ is a
maximal monotone operator on the real separable H-space $H$.
Corollary 55.4. Moreover, the solution $u$ is Lipschitz continuous and $u'(t)$
exists for almost all $t \in \,]0, T[$ in the sense of the classical derivative as the limit
of the difference quotient, i.e., $u' \in L_\infty(0,T;H)$.
Here, $u \in W_2^1(0,T;H)$ means $u, u' \in L_2(0,T;H)$ (cf. Section 23.5). The
space $W_2^1(0,T;H)$ corresponds to the Sobolev space $W_2^1(0,T;V,H)$ in
Section 23.6 with $V = H$.
Proof. The proof proceeds with the aid of the results of Sections 55.1 and
55.2 in a way parallel to the proof of Theorem 31.A in Section 31.1. Instead
of (8), one first considers the regularized problems
$u_\mu'(t) + A_\mu u_\mu(t) - \omega u_\mu(t) = b(t), \quad u_\mu(0) = u_0,$ (9)
where $\mu > 0$. By assumption, $b$ is continuous on $[0, T]$. The operator $A_\mu$ is
Lipschitz continuous. Now, (9) is solved as in Section 31.2. It is only to
obtain the a priori estimate
$\|u_\mu'(t)\| \le C$ for all $\mu > 0$, $t \in [0,T]$ (10)
that we need an additional trick. To this end, we differentiate (9) again;
thus,
$u_\mu''(t) + g_\mu'(t) - \omega u_\mu'(t) = b'(t)$ (11)
for almost all $t$, where $g_\mu(t) := A_\mu u_\mu(t)$. All the derivatives in (11) exist. From
the monotonicity of $A_\mu$ it follows that
$(A_\mu u_\mu(t+h) - A_\mu u_\mu(t) \,|\, u_\mu(t+h) - u_\mu(t)) \ge 0;$
thus, $(g_\mu'(t)|u_\mu'(t)) \ge 0$. Therefore, (11) yields
$(u_\mu''(t)|u_\mu'(t)) \le (b'(t)|u_\mu'(t)) + \omega(u_\mu'(t)|u_\mu'(t))$
$\le \|b'(t)\| \, \|u_\mu'(t)\| + |\omega| \, \|u_\mu'(t)\|^2$
$\le 2^{-1}\|b'(t)\|^2 + (2^{-1} + |\omega|)\|u_\mu'(t)\|^2.$
Integration by parts yields
$\|u_\mu'(t)\|^2 - \|u_\mu'(0)\|^2 = 2\int_0^t (u_\mu''(s)|u_\mu'(s)) \, ds$
$\le \|b\|_Y^2 + (1 + 2|\omega|)\int_0^t \|u_\mu'(s)\|^2 \, ds,$
where $Y = W_2^1(0,T;H)$. The Gronwall lemma from Section 3.5 assures that
$\|u_\mu'(t)\|^2 \le (\text{constant})(\|u_\mu'(0)\|^2 + \|b\|_Y^2).$
Since
$u_\mu'(0) = -A_\mu u_0 + \omega u_0 + b(0)$
and $\|A_\mu u_0\| \le \|A_0 u_0\|$, it follows from the above that (10) holds. □
55.4. Application to Quadratic Evolution Variational
Inequalities of First Order
We consider the following problem for $u(\cdot)$:
$(u'(t)|v - u(t))_H + a(u(t), v - u(t))$ (12a)
$- \langle b(t), v - u(t)\rangle_V + \varphi(v) \ge \varphi(u(t))$
for all $v \in V$ and almost all $t \in [0, T]$, where $T$ is fixed, $0 < T < \infty$, and with
the initial condition
$u(0) = u_0.$ (12b)
Furthermore, let $\varphi(u(t)) < \infty$ for almost all $t \in [0, T]$.
Proposition 55.5. Problem (12) has exactly one solution $u \in W_2^1(0,T;H)$
provided the following four conditions are satisfied:
(i) $H$ is a real separable H-space and "$V \subseteq H \subseteq V^*$" is an evolution triple.
(ii) The bilinear form $a: V \times V \to \mathbb{R}$ is bounded, and there exist real numbers
$\omega$ and $\beta > 0$ such that
$a(v,v) + \omega\|v\|_H^2 \ge \beta\|v\|_V^2$ for all $v \in V$.
(iii) $\varphi: V \to \,]-\infty, \infty]$ is convex and lower semicontinuous, $\varphi \not\equiv +\infty$.
(iv) $b \in W_2^1(0,T;H)$ and $u_0 \in V$ are given and have the property that
$\varphi(u_0) < +\infty$ and
$a(u_0, v - u_0) + \varphi(v) \ge \varphi(u_0) + (g|v - u_0)_H$ (13)
for all $v \in V$ with fixed $g \in H$.
Example 55.6. If $M$ is a closed convex nonempty set in $V$ and $\varphi = \chi_M$, i.e.,
$\varphi(v) = 0$ for $v \in M$ and $\varphi(v) = +\infty$ for $v \notin M$, then (iii) holds and (12a)
passes to the following problem: We seek a $u$ such that $u(t) \in M$ for almost
all $t \in [0, T]$ with
$(u'(t)|v - u(t))_H + a(u(t), v - u(t)) - \langle b(t), v - u(t)\rangle_V \ge 0$
for all $v \in M$.
Proof of Proposition 55.5. By (ii), there exists a linear continuous
strongly positive operator $A: V \to V^*$ such that
$\langle Au - \omega u, v\rangle_V = a(u,v)$ for all $u, v \in V$.
In this connection, consider $a_1(u,v) := a(u,v) + \omega(u|v)_H$. Now, (12a) reads
as follows:
$\langle -u'(t) - Au(t) + \omega u(t) + b(t), v - u(t)\rangle_V + \varphi(u(t)) \le \varphi(v)$
for all $v \in V$,
i.e.,
$u'(t) + Au(t) + \partial\varphi(u(t)) - \omega u(t) \ni b(t).$
We denote by $B: H \to 2^H$ the restriction of $A + \partial\varphi$ to $H$ in the sense of
Problem 55.4. We then obtain
$u'(t) + Bu(t) - \omega u(t) \ni b(t).$
The fact that, by Problem 55.4, the mapping $B$ is maximal monotone is
crucial. (13) yields $u_0 \in D(B)$ because $g - Au_0 + \omega u_0 \in \partial\varphi(u_0)$, i.e.,
$\{Au_0 + \partial\varphi(u_0)\} \cap H \ne \emptyset$.
Theorem 55.A in Section 55.3 yields the assertion. □
Proposition 55.5 can be generalized directly to nonlinear operators with
the aid of Problem 55.4.
Problems
55.1. Proof of Proposition 55.2. Solution: (a) Observe that $w = R_\mu u \Rightarrow u \in (I +
\mu A)w \Rightarrow \mu A_\mu u = u - w \in \mu Aw$.
(b) By hypothesis, $A$ is monotone. From (a) it follows that for all $\mu > 0$:
$(A_\mu u - A_\mu v \,|\, R_\mu u - R_\mu v) \ge 0$ for all $u, v \in H$.
Thus:
$\|A_\mu u - A_\mu v\| \, \|u - v\| \ge (A_\mu u - A_\mu v \,|\, u - v)$
$= (A_\mu u - A_\mu v \,|\, \mu A_\mu u - \mu A_\mu v) + (A_\mu u - A_\mu v \,|\, R_\mu u - R_\mu v)$
$\ge \mu\|A_\mu u - A_\mu v\|^2.$
(c) By the last chain of inequalities, $A_\mu$ is monotone and continuous. Now
Example 32.4 yields the assertion.
(d) Use the definition of $A_\mu$.
55.2. Proof of Corollary 55.3. Ad (6) Let $u \in D(A)$. The sequence $(\|A_\mu u\|)$ is
bounded as $\mu \to +0$; for, it follows from the monotonicity of $A$ and
Proposition 55.2, (a) that
$(A_0 u - A_\mu u \,|\, u - R_\mu u) \ge 0$ for all $u \in D(A)$.
Since $u - R_\mu u = \mu A_\mu u$, we thus have
$\|A_\mu u\|^2 \le (A_0 u|A_\mu u) \le \|A_0 u\| \, \|A_\mu u\|, \quad \mu > 0,$
$\|A_\mu u\| \le \|A_0 u\|.$
The sequence $(\|A_\mu u\|)$ is monotonically increasing as $\mu \downarrow 0$; for, if we replace
$A$ by $A_\lambda$, then, because $A_{\lambda+\mu} = (A_\lambda)_\mu$, we immediately obtain
$\|A_{\lambda+\mu}u\|^2 \le (A_\lambda u|A_{\lambda+\mu}u), \quad \|A_{\lambda+\mu}u\| \le \|A_\lambda u\|;$
therefore,
$\|A_{\lambda+\mu}u - A_\lambda u\|^2 \le \|A_\lambda u\|^2 - \|A_{\lambda+\mu}u\|^2.$ (14)
Consequently, $(\|A_\mu u\|)$ is convergent as $\mu \downarrow 0$. By (14), $A_\mu u \to y$ as $\mu \downarrow 0$.
From
$(z - A_\mu u \,|\, v - R_\mu u) \ge 0$ for all $(v, z) \in A$
and $u - R_\mu u = \mu A_\mu u \to 0$, it follows that
$(z - y \,|\, v - u) \ge 0$ for all $(v, z) \in A$,
i.e., $y \in Au$ because of the maximal monotonicity of $A$. Furthermore, $\|A_\mu u\|
\le \|A_0 u\|$ implies $\|y\| \le \|A_0 u\|$; therefore, $y = A_0 u$ by the definition of $A_0$.
Ad (7) Consequence of (14).
The remaining assertions are obtained in an elementary way. Compare
Brezis (1973, M), pages 27, 28.
55.3. Proof of Theorem 55.A. Carry out the proof completely with the aid of the
hints given in Section 55.3.
55.4. Important methods for constructing maximal monotone operators. We set
$Bu := \{Nu + \partial\varphi(u)\} \cap H$ for $u \in V$,
$Bu := \emptyset$ for $u \in H - V$.
Show: $B: H \to 2^H$ is maximal monotone when the following four conditions
hold:
(i) "$V \subseteq H \subseteq V^*$" is an evolution triple, $H$ is a real H-space.
(ii) $N: V \to V^*$ is monotone, hemicontinuous, and bounded.
(iii) $\varphi: V \to \,]-\infty, \infty]$ is convex and lower semicontinuous, where $\varphi \not\equiv +\infty$.
(iv) There exists a $u_0 \in V$ such that $\partial\varphi(u_0) \ne \emptyset$ and
$\langle Nu, u - u_0\rangle_V / \|u\|_V \to +\infty$ as $\|u\|_V \to \infty$.
Solution: According to Theorem 47.F in Section 47.11, $\partial\varphi$ is maximal
monotone. Theorem 32.A in Section 32.3 yields $R(N + \partial\varphi) = V^*$. Let
$N_1 u := u + Nu$ for all $u \in V$. Since
$\langle N_1 u, v\rangle_V = (u|v)_H + \langle Nu, v\rangle_V,$
and by Section 23.4, (iv) holds with $N_1$ instead of $N$. Theorem 32.A again
yields $R(N_1 + \partial\varphi) = V^*$, i.e., $R(I + B) = H$; thus, $B$ is maximal monotone
by Proposition 55.1.
55.5.* Regularization of functionals. Let $\varphi: H \to \,]-\infty, \infty]$ be a convex lower
semicontinuous functional on the real H-space $H$ with $\varphi \not\equiv +\infty$. For $\lambda > 0$, we set
$\varphi_\lambda(u) := \min_{v \in H} \varphi(v) + (2\lambda)^{-1}\|v - u\|^2.$ (15)
Show:
(i) $\varphi_\lambda(u) \to \varphi(u)$ as $\lambda \to +0$ and for all $u \in H$.
(ii) $\varphi_\lambda$ is convex, lower semicontinuous, and F-differentiable for all $\lambda > 0$.
(iii) $\varphi_\lambda'$ is the Yosida approximation of $\partial\varphi$ for all $\lambda > 0$.
Hint: Compare Brezis (1973, M), page 39. An analogous assertion holds
when $H$ is a reflexive B-space and $H$ and $H^*$ are strictly convex. Then, in (ii),
F-differentiability is to be replaced by G-differentiability. Compare Barbu
and Precupanu (1978, M), page 107.
55.6. A property of subdifferentials. Show: From $\partial\varphi(u) = \partial\psi(u)$ on $X$ it follows
that $\varphi(u) = \psi(u) + \text{constant}$ on $X$ provided:
(a) $X$ is a reflexive B-space.
(b) The functionals $\varphi, \psi: X \to \,]-\infty, \infty]$ are convex and lower
semicontinuous, where $\varphi, \psi \not\equiv +\infty$.
Solution: According to Problem 55.5, $\varphi_\lambda' = \psi_\lambda'$; thus,
$\varphi_\lambda(u) - \psi_\lambda(u) = \text{constant} = \varphi_\lambda(u_0) - \psi_\lambda(u_0).$
Passing to the limit as $\lambda \to +0$ yields the assertion. The assumptions in
Problem 55.5 can be fulfilled by transition to an equivalent norm in
accordance with A 3 (29).
55.7. Example for the Yosida approximation. Let $\varphi: \mathbb{R} \to \,]-\infty, \infty]$ be a function
such that $\varphi(u) = 0$ for $|u| \le 1$ and $\varphi(u) = +\infty$ for $|u| > 1$. Determine $\varphi_\lambda$ as
well as $A = \partial\varphi$ and $A_\lambda$ (cf. (15)).
Solution: Compare Fig. 55.1.
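Since the figure is not reproduced here, the following worked formulas may serve as a sketch of the solution. They are derived directly from (15) and from $A_\lambda = \lambda^{-1}(I - R_\lambda)$; the symbol $P$ for the projection onto $[-1, 1]$ is introduced here and is not part of the text.

```latex
\begin{align*}
Au = \partial\varphi(u) &=
  \begin{cases}
    \{0\}          & \text{if } |u| < 1, \\
    [0, \infty[    & \text{if } u = 1, \\
    ]-\infty, 0]   & \text{if } u = -1, \\
    \emptyset      & \text{if } |u| > 1,
  \end{cases} \\
R_\lambda u = (I + \lambda A)^{-1}u &= Pu := \max\{-1, \min\{1, u\}\}, \\
A_\lambda u = \lambda^{-1}(u - Pu) &=
  \begin{cases}
    0                     & \text{if } |u| \le 1, \\
    \lambda^{-1}(u - 1)   & \text{if } u > 1, \\
    \lambda^{-1}(u + 1)   & \text{if } u < -1,
  \end{cases} \\
\varphi_\lambda(u) = \min_{|v| \le 1} (2\lambda)^{-1}|v - u|^2
  &= (2\lambda)^{-1}\bigl(\max\{|u| - 1, 0\}\bigr)^2 .
\end{align*}
```

In particular $\varphi_\lambda' = A_\lambda$, in agreement with Problem 55.5, (iii), and $\varphi_\lambda(u) \to \varphi(u)$ as $\lambda \to +0$.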
55.8.* Galerkin method for evolution variational inequalities. In this connection, study
Duvaut and Lions (1972, M).
Figure 55.1
References to the Literature
Lions (1969, M); Brezis (1972) (regularity of solutions), (1973, M,B,H);
Barbu (1976, M,B); Browder and Brezis (1980).
Galerkin methods and applications to mechanics: Duvaut and Lions
(1972, M); Naumann (1984, L).
Applications to plasticity theory: Groger (1979, S); Hlavacek and Necas
(1981, M).
Numerical methods: Glowinski, Lions, and Tremolieres (1976, M).
Yosida approximation: Brezis (1973, M) (H-spaces); Barbu and
Precupanu (1978, M) (B-spaces).
CHAPTER 56
Evolution Variational Inequalities of
Second Order in H-Spaces
Mathematics contains much that will neither hurt one if one does not know it
nor help one if one does know it.
J. B. Mencken, 1715
The second-order equation
$u'' + Nu' + Lu = b$ (1)
can be transformed by means of the substitution u'=v into a first-order
equation
v'+Nv + LSv=b.
This is the path we took in Chapters 32 and 33. However, (1) can also be
changed into a first-order system
u'=v, v'+Nv + Lu = b
and then Theorem 55.A in Section 55.3 can be used. We apply this method
in the present chapter, using tools from Chapter 23.
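The passage to a first-order system is also the usual starting point for numerical integration. The following sketch assumes the simplest scalar linear case, in which $N$ and $L$ are real numbers `n` and `l` and $b$ is an ordinary function of time; the function name and the classical Runge-Kutta scheme are choices made here and are not part of the text.

```python
import math


def integrate_second_order(n, l, b, u0, v0, T, steps=1000):
    """Integrate u'' + n u' + l u = b(t) via the system u' = v, v' = b(t) - n v - l u.

    Uses the classical fourth-order Runge-Kutta scheme and returns (u(T), v(T)).
    """
    def rhs(t, u, v):
        return v, b(t) - n * v - l * u

    h = T / steps
    t, u, v = 0.0, u0, v0
    for _ in range(steps):
        k1u, k1v = rhs(t, u, v)
        k2u, k2v = rhs(t + 0.5 * h, u + 0.5 * h * k1u, v + 0.5 * h * k1v)
        k3u, k3v = rhs(t + 0.5 * h, u + 0.5 * h * k2u, v + 0.5 * h * k2v)
        k4u, k4v = rhs(t + h, u + h * k3u, v + h * k3v)
        u += (h / 6.0) * (k1u + 2.0 * k2u + 2.0 * k3u + k4u)
        v += (h / 6.0) * (k1v + 2.0 * k2v + 2.0 * k3v + k4v)
        t += h
    return u, v


# u'' + u = 0, u(0) = 1, u'(0) = 0 has the solution u(t) = cos t.
u_T, v_T = integrate_second_order(n=0.0, l=1.0, b=lambda t: 0.0, u0=1.0, v0=0.0, T=1.0)
print(u_T, math.cos(1.0))
```

In the chapter itself the pair $(u, v)$ is treated as a single unknown in a product space, which is what allows Theorem 55.A to be applied.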
56.1. The Main Theorem
We consider the equation for $u$:
$u''(t) + Nu'(t) + Au(t) - \omega u(t) \ni b(t)$ (2a)
for almost all $t \in [0, T]$, where $T$ is fixed, $0 < T < \infty$, with the initial
conditions
$u(0) = u_0, \quad u'(0) = v_0$ (2b)
under the following assumptions:
(H1) $H$ is a real separable H-space and "$V \subseteq H \subseteq V^*$" is an evolution
triple.
(H2) $N: V \to 2^{V^*}$ is maximal monotone.
(H3) $A: V \to V^*$ is linear, continuous, symmetric, and strongly positive.
(H4) $b$, $u_0$, $v_0$, $\omega$ are given with $\omega \in \mathbb{R}$ and
$b \in W_2^1(0,T;H), \quad u_0, v_0 \in V, \quad \{Au_0 + Nv_0\} \cap H \ne \emptyset.$
Theorem 56.A. If (H1)-(H4) hold, then (2) has exactly one solution such that
$u \in C([0,T];V),$
$u' \in L_\infty(0,T;V) \cap C([0,T];H), \quad u'' \in L_\infty(0,T;H).$
We recommend that the reader independently carry out the proof using
the idea mentioned in the introduction. In this connection, use the trick of
identically adjoining terms with a parameter a and introduce a suitable
inner product on the product space V X H with the aid of A. We shall give
the proof in Problem 56.1.
56.2. Application to Quadratic Evolution Variational
Inequalities of Second Order
We study the following variational inequality for $u$:
$(u''(t)|v - u'(t))_H + a(u(t), v - u'(t))$
$- \langle b(t), v - u'(t)\rangle_V + \varphi(v) \ge \varphi(u'(t))$ (3a)
for all $v \in V$ and almost all $t \in [0, T]$, where $T$ is fixed, $0 < T < \infty$, with the
initial conditions
$u(0) = u_0, \quad u'(0) = v_0.$ (3b)
Furthermore, let $\varphi(u'(t)) < \infty$ for almost all $t \in [0, T]$.
Proposition 56.1. Problem (3) has exactly one solution $u$ with the properties
given in Theorem 56.A provided the following four conditions hold:
(i) $H$ is a real separable H-space and "$V \subseteq H \subseteq V^*$" is an evolution triple.
(ii) $a: V \times V \to \mathbb{R}$ is bilinear, symmetric, and bounded, and there exist real
numbers $\omega$ and $\beta > 0$ such that
$a(v,v) + \omega\|v\|_H^2 \ge \beta\|v\|_V^2$ for all $v \in V$.
(iii) $\varphi: V \to \,]-\infty, \infty]$ is convex, lower semicontinuous, and $\varphi \not\equiv +\infty$.
(iv) $b \in W_2^1(0,T;H)$, and $u_0, v_0 \in V$ are given such that $\varphi(v_0) < \infty$ and
$a(u_0, v - v_0) + \varphi(v) \ge \varphi(v_0) + (g|v - v_0)_H$ (4)
for all $v \in V$ and fixed $g \in H$.
The proof proceeds analogously to Section 55.4. We shall give the proof
in Problem 56.2.
Problems
56.1. Proof of Theorem 56.A. Solution: We write (2) in the form
$u' + [\alpha u - v] - \alpha u = 0,$ (5a)
$v' + [Au - \omega u + Nv + \alpha v] - \alpha v \ni b,$
$u(0) = u_0, \quad v(0) = v_0.$ (5b)
In this connection, we forego giving the argument $t$ in (5a). We shall dispose
of $\alpha$ later.
We write (5) in the form
$U'(t) + BU(t) - \alpha U(t) \ni F(t),$ (6)
$U(0) = (u_0, v_0),$
where $U = (u, v)$, $F = (0, b)$, and
$BU := (\alpha u - v, Au - \omega u + Nv + \alpha v),$
$D(B) := \{(u, v) \in X: \{Au + Nv\} \cap H \ne \emptyset\}$
as well as $X := V \times H$. Then $X$ becomes an H-space with the inner product
$(U_1|U_2) = \langle Au_1, u_2\rangle_V + (v_1|v_2)_H.$
Note that because of (H3), $\langle Au_1, u_2\rangle_V$ generates an equivalent inner product
on $V$. We now verify the assumptions of Theorem 55.A in Section 55.3.
(I) $B: X \to 2^X$ is monotone for a suitable choice of $\alpha$; for, a short
calculation shows that, for $U_1, U_2 \in D(B)$, we have:
$\Delta := (BU_1 - BU_2|U_1 - U_2) = \alpha\|U_1 - U_2\|_X^2$
$- \omega(u_1 - u_2|v_1 - v_2)_H + \langle Nv_1 - Nv_2, v_1 - v_2\rangle_V.$
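The short calculation can be spelled out as follows. This is only a sketch: the abbreviations $\delta u := u_1 - u_2$, $\delta v := v_1 - v_2$ and the selections $n_i \in Nv_i$ are notation introduced here, and the inner product on $X$ is the one defined above.

```latex
\begin{align*}
(BU_1 - BU_2 \,|\, U_1 - U_2)
  &= \langle A(\alpha\,\delta u - \delta v), \delta u\rangle_V
     + \bigl(A\,\delta u - \omega\,\delta u + n_1 - n_2 + \alpha\,\delta v \,\big|\, \delta v\bigr)_H \\
  &= \alpha\langle A\,\delta u, \delta u\rangle_V - \langle A\,\delta v, \delta u\rangle_V
     + \langle A\,\delta u, \delta v\rangle_V - \omega(\delta u \,|\, \delta v)_H
     + \langle n_1 - n_2, \delta v\rangle_V + \alpha\|\delta v\|_H^2 \\
  &= \alpha\|U_1 - U_2\|_X^2 - \omega(\delta u \,|\, \delta v)_H
     + \langle n_1 - n_2, \delta v\rangle_V .
\end{align*}
```

The two mixed terms cancel because $A$ is symmetric, and the $H$-inner product has been rewritten as the duality pairing on $V$, which is legitimate since $\delta v \in V$.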
Take into account that $\langle x, y\rangle_V = (x|y)_H$ for $y \in V$, $x \in H$ by Section 23.5
and $v_i \in V$ because $U_i \in D(B)$, as well as the symmetry of $A$. To simplify the
notation, let $Nv_i$ denote an element of the set $Nv_i$. Now, for sufficiently large
$\alpha$, $\Delta \ge 0$ follows from the monotonicity of $N$ and from
$|(u_1 - u_2|v_1 - v_2)_H| \le 2^{-1}(\|u_1 - u_2\|_H^2 + \|v_1 - v_2\|_H^2)$
$\le C(\|u_1 - u_2\|_V^2 + \|v_1 - v_2\|_H^2) \le C_1\|U_1 - U_2\|_X^2.$
(II) $B$ is maximal monotone. According to Proposition 55.1, we have to
show that $R(I + B) = X$. The equation $(I + B)U \ni W$ with $W \in X$, $W =
(w, z)$ means
$u + (\alpha u - v) = w,$ (7a)
$v + (Au - \omega u + Nv + \alpha v) \ni z$ (7b)
for fixed $(w, z) \in V \times H$; therefore, $u = (1 + \alpha)^{-1}(w + v)$ and
$B_1 v + Nv \ni z - (1 + \alpha)^{-1}(Aw - \omega w),$ (8)
where $B_1 v := (1 + \alpha)v + (1 + \alpha)^{-1}(Av - \omega v)$. By (H3) the operator $B_1: V \to
V^*$ is linear, continuous, and strongly positive for sufficiently large $\alpha$.
According to Theorem 32.A, $R(B_1 + N) = V^*$, i.e., (8) has a solution $v \in V$,
and (7b) shows that $U = (u, v) \in D(B)$.
(III) By assumption, $U_0 = (u_0, v_0) \in D(B)$. Furthermore, from $b \in
W_2^1(0,T;H)$ it follows that $F \in W_2^1(0,T;X)$.
Now Theorem 56.A follows from Theorem 55.A and Corollary 55.4.
Accordingly, (6) has exactly one solution $U$ such that
$U \in C([0,T];X), \quad U' \in L_\infty(0,T;X).$
Observe that such a solution always lies in $W_2^1(0,T;X)$. Since $X = V \times H$,
this means that
$u \in C([0,T];V), \quad v \in C([0,T];H),$
$u' \in L_\infty(0,T;V), \quad v' \in L_\infty(0,T;H).$
Now take into consideration that $u' = v$.
56.2. Proof of Proposition 56.1. Solution: Parallel to Section 55.4, from (3a), taking
Section 23.5 into account, we have the relation
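\[
\langle -u''(t) - Au(t) + \omega u(t) + b(t),\, v - u'(t)\rangle_V + \varphi(u'(t)) \le \varphi(v)
\quad \text{for all } v \in V,
\]
i.e.,
\[ u''(t) + \partial\varphi(u'(t)) + Au(t) - \omega u(t) \ni b(t). \]
By virtue of Theorem 47.F in Section 47.11, the mapping $\partial\varphi: V \to 2^{V^*}$ is
maximal monotone. Now Theorem 56.A with $N = \partial\varphi$ yields the assertion.
Note that condition (4) asserts that $g - Au_0 + \omega u_0 \in \partial\varphi(v_0)$, i.e.,
$\{Au_0 + Nv_0\} \cap H \ne \emptyset$.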
56.3.* Regularity of solutions. In this connection, study the fundamental work of
Brezis (1972).
References to the Literature
Lions (1969, M); Brezis (1972); Barbu (1976, M) (cf., also, the references to
the literature in Chapters 54 and 55).
CHAPTER 57
Accretive Operators and Multivalued
First-Order Evolution Equations in
B-Spaces
The moving power of mathematics is not reasoning but imagination.
A. De Morgan
There is an astonishing imagination, even in the science of mathematics.
... We repeat, there was far more imagination in the head of Archimedes than
in that of Homer.
Voltaire
In this chapter, in a manner parallel to Chapter 55, we study the initial
value problem
\[ u'(t) + Au(t) \ni f(t), \quad 0 < t < T, \tag{1} \]
\[ u(0) = u_0, \]
where u(t) lies in a real B-space X and A is multivalued. In Chapter 55,
where X was a real H-space, the following two properties played a crucial
role:
(i) A is monotone,
(ii) A is maximal monotone.
In a real B-space X, in place of (i) and (ii) there appear the following
generalizations:
(i') A is accretive,
(ii') A is m-accretive.
In Chapter 55 we solved (1) in H-spaces while we thought of the derivative
u' in a generalized sense. In the case of a B-space, we again generalize the
solution concept in an essential way and proceed as follows:
($\alpha$) We state the difference method belonging to (1) (backward differences).
($\beta$) The proof of convergence is obtained by constructing majorants for ($\alpha$)
with the aid of the difference method for a classical first-order partial
differential equation.
($\gamma$) The limiting value is a generalized solution of (1), i.e., a so-called
integral solution.
($\delta$) We prove the uniqueness of integral solutions.
The uniqueness proof assures that each classical solution of (1) is also an
integral solution.
If one considers the difference method for (1) in Section 57.3, then one
finds that m-accretive operators allow the stable solvability of the
corresponding difference equations in a very natural way. This motivates the
central meaning of m-accretive operators for evolution equations.
In Section 57.5 we show that for $f = 0$ in (1) and variable $u_0 \in \overline{D(A)}$, the
integral solution
\[ u(t) = S(t)u_0 \]
yields a nonexpansive semigroup $\{S(t)\}$ belonging to (1).
57.1. Generalized Inner Products on B-Spaces
Definition 57.1. Let $X$ be a real B-space. For all $x, y \in X$ and real $\mu \ne 0$, we
define:
\[ [x, y]_\mu = \frac{\|x + \mu y\| - \|x\|}{\mu}, \]
\[ [x, y]_\pm = \lim_{\mu \to \pm 0} [x, y]_\mu. \]
Thus, [x, y]± is nothing other than the directional derivative of the norm.
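As a small numerical sketch of Definition 57.1 (an illustration added here, not part of the original text), the following Python snippet evaluates the difference quotient $[x, y]_\mu$ in $\mathbb{R}^2$ with the maximum norm and approximates the one-sided limits by a small step. The helper names `max_norm` and `bracket`, the chosen vectors, and the step size are assumptions made only for this example.

```python
def max_norm(v):
    """Maximum norm on R^n, a norm that does not come from an inner product."""
    return max(abs(c) for c in v)


def bracket(x, y, mu, norm=max_norm):
    """Difference quotient [x, y]_mu = (||x + mu*y|| - ||x||) / mu for mu != 0."""
    shifted = [a + mu * b for a, b in zip(x, y)]
    return (norm(shifted) - norm(x)) / mu


x = [1.0, 1.0]
y = [1.0, -2.0]

# One-sided limits, approximated with a small step (an assumption of this sketch).
plus = bracket(x, y, 1e-6)    # approximates [x, y]_+
minus = bracket(x, y, -1e-6)  # approximates [x, y]_-

# Quotients for increasing mu > 0; convexity of the norm makes them nondecreasing.
quotients = [bracket(x, y, mu) for mu in (0.1, 1.0, 2.0, 10.0)]
```

For these vectors the two one-sided limits differ (roughly 1 and -2), which illustrates that the maximum norm is not differentiable at $x$ in the direction $y$.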
Proposition 57.2. (a) The expression $[x, y]_\pm$ is well defined, and for all
$x, y, z \in X$ and $\mu, \lambda, t > 0$, the following hold:
\[ [x, y]_+ = \inf_{\mu > 0} [x, y]_\mu, \qquad [x, y]_- = \sup_{\mu < 0} [x, y]_\mu. \tag{2} \]
\[ [x, y]_- \le [x, y]_+, \qquad [x, y]_+ \le [x, y]_\mu, \qquad [x, -y]_- = -[x, y]_+. \tag{3} \]
\[ [x, y + z]_+ \le [x, y]_+ + [x, z]_+, \qquad [x, y + z]_- \le [x, y]_- + [x, z]_+. \tag{4} \]
\[ |[x, y]_\pm| \le \|y\|, \qquad [y, y]_\pm = \|y\|, \qquad [tx, \lambda y]_+ = \lambda [x, y]_+. \tag{5} \]
(b) We have
\[ \|u(t)\|' = [u(t), u'(t)]_+ = [u(t), u'(t)]_-, \tag{6} \]
provided all these derivatives exist.
We treat the proof in Problem 57.1. The following example shows that
[x, y]± is a kind of generalized inner product.
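Example 57.3. In a real H-space $X$ with the inner product $(\cdot \mid \cdot)$, we have:
\[
[x, y]_\pm =
\begin{cases}
(x \mid y)/\|x\| & \text{if } \|x\| \ne 0, \\
\pm\|y\| & \text{if } \|x\| = 0.
\end{cases}
\]
Proof. We set $\varphi(\mu) = \|x + \mu y\| = \sqrt{(x + \mu y \mid x + \mu y)}$ and take into account
that $[x, y]_\pm = \varphi'_\pm(0)$. □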
57.2. Accretive Operators
Definition 57.4. Let $X$ be a real B-space and let $A: D(A) \subseteq X \to 2^X$ be a
multivalued operator. We denote the resolvent, which always exists as a
multivalued operator, by $R_\mu \overset{\text{def}}{=} (I + \mu A)^{-1}$.
$A$ is called accretive if and only if, for all $\mu > 0$, $R_\mu$ is single valued and
nonexpansive.
$A$ is called m-accretive if and only if $A$ is accretive and $R(I + \mu A) = X$ for
all $\mu > 0$.
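To make Definition 57.4 concrete, here is a minimal sketch for one specific operator that is not taken from the text: the multivalued sign operator on $X = \mathbb{R}$, with $A(u) = \{1\}$ for $u > 0$, $A(u) = \{-1\}$ for $u < 0$, and $A(0) = [-1, 1]$. For this operator the resolvent is the well-known soft-thresholding map; the function name `resolvent_sign` and the sample grid are assumptions of the example.

```python
def resolvent_sign(w, mu):
    """Resolvent (I + mu*A)^{-1} w for the multivalued sign operator A on R.

    Solving u + mu*A(u) ∋ w gives soft thresholding: shrink w toward 0 by mu.
    """
    if w > mu:
        return w - mu
    if w < -mu:
        return w + mu
    return 0.0


# Sample check that the resolvent is nonexpansive: |R(a) - R(b)| <= |a - b|.
points = [k / 10.0 for k in range(-30, 31)]
mu = 0.7
nonexpansive = all(
    abs(resolvent_sign(a, mu) - resolvent_sign(b, mu)) <= abs(a - b) + 1e-12
    for a in points
    for b in points
)
```

Since `resolvent_sign` is defined for every real `w`, the range condition $R(I + \mu A) = X$ holds as well, so this particular operator is m-accretive on $\mathbb{R}$.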
Proposition 57.5 (Characterization). If $X$ is a real B-space and $A: D(A) \subseteq X
\to 2^X$ is a multivalued operator, then the following hold:
(a) $A$ is accretive if and only if $[u_1 - u_2, v_1 - v_2]_+ \ge 0$ for all
$(u_1, v_1), (u_2, v_2) \in A$.
(b) If $X$ is a separable H-space, then:
$A$ is accretive $\Leftrightarrow$ $A$ is monotone.
$A$ is m-accretive $\Leftrightarrow$ $A$ is maximal monotone.
If, in addition, A is linear and continuous, then the concepts accretive and
m-accretive coincide.
Proof. (a) By (2),
\[ \|x\| \le \|x + \mu y\| \ \text{ for all } \mu > 0 \quad \Leftrightarrow \quad [x, y]_+ \ge 0. \tag{7} \]
$A$ is accretive if and only if, for all $\mu > 0$,
\[ \|R_\mu u - R_\mu v\| \le \|u - v\| \quad \text{for all } u, v \in R(I + \mu A). \]
This is equivalent to
\[ \|u_1 - u_2\| \le \|u_1 + \mu v_1 - (u_2 + \mu v_2)\| \]
for all $\mu > 0$, $(u_i, v_i) \in A$. Now (7), with $x = u_1 - u_2$, $y = v_1 - v_2$, yields (a).
(b) Compare Proposition 55.1. If $A$ is linear and continuous, then the
following holds: If $A$ is accretive then it is also monotone. The operator
$I + \mu A$ is thus strongly monotone for $\mu > 0$ and, by Theorem 26.A in
Section 26.2, it is surjective; therefore $A$ is m-accretive. □
57.3. The Main Theorem for Inhomogeneous
Problems with m-Accretive Operators
We consider the initial value problem
\[ u'(t) + Au(t) \ni f(t), \quad 0 < t < T, \tag{8} \]
\[ u(0) = u_0 \]
with the following assumptions:
(H1) The mapping $A: D(A) \subseteq X \to 2^X$ is m-accretive. $X$ is a real B-space.
(H2) $f \in L_1(0, T; X)$ is given and fixed for fixed $T$, $0 < T < \infty$.
(H3) $u_0 \in \overline{D(A)}$ is given and fixed.
Definition 57.6. $u: [0, T] \to X$ is called an integral solution of (8) if and only
if $u$ is continuous and
\[ \|u(t) - x\| - \|u(s) - x\| \le \int_s^t [u(\tau) - x, f(\tau) - y]_+ \, d\tau \tag{9} \]
for all $t, s$ such that $0 \le s \le t \le T$ and all $(x, y) \in A$.
These integral solutions are obtained in a natural way in Section 57.4
below as limiting values of the following difference method for (8):
\[ (\Delta_n t)^{-1}(x_k^n - x_{k-1}^n) + A x_k^n \ni f_k^n \tag{10} \]
with $k = 1, 2, \ldots, n$. In this connection, we set
\[ t_k^n = k \Delta_n t, \qquad \Delta_n t = \frac{T}{n}. \]
Proceeding from (10), we construct the piecewise constant functions
\[ u_n(t) = x_k^n, \qquad f_n(t) = f_k^n \qquad \text{for } t \in \,]t_{k-1}^n, t_k^n] \]
and $k = 1, 2, \ldots, n$. Moreover, let $u_n(0) = x_0^n$, $f_n(0) = 0$.
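The backward difference method (10) can be sketched in a few lines for a concrete case. The example below is an illustration under stated assumptions, not the book's construction in general: it takes $X = \mathbb{R}$, the multivalued sign operator $A$ (whose resolvent is soft thresholding), and samples $f$ at the grid points, $f_k^n = f(t_k^n)$, whereas the text only requires $f_n \to f$ in $L_1$. The names `resolvent_sign` and `backward_differences` are hypothetical.

```python
def resolvent_sign(w, mu):
    """Resolvent (I + mu*A)^{-1} w of the multivalued sign operator A on R."""
    if w > mu:
        return w - mu
    if w < -mu:
        return w + mu
    return 0.0


def backward_differences(u0, f, T, n):
    """Grid values x_0, ..., x_n of scheme (10) for A = sign on R.

    Each step solves x_k + h*A(x_k) ∋ x_{k-1} + h*f_k with h = T/n,
    which is (10) multiplied by the step size.
    """
    h = T / n
    xs = [u0]
    for k in range(1, n + 1):
        xs.append(resolvent_sign(xs[-1] + h * f(k * h), h))
    return xs


# Homogeneous problem u' + sign(u) ∋ 0, u(0) = 1 on [0, 2] with 8 steps.
xs = backward_differences(u0=1.0, f=lambda t: 0.0, T=2.0, n=8)
# → [1.0, 0.75, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]
```

In this particular case the grid values coincide with the exact solution $u(t) = \max(1 - t, 0)$ at the grid points; in general one only expects convergence as $n \to \infty$.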
Theorem 57.A. With the assumptions (H1)-(H3), the following hold:
(a) Existence and uniqueness. (8) has exactly one integral solution.
(b) Permanence. Each continuous solution $u: [0, T] \to X$ of (8) that has a
generalized derivative $u' \in L_1(0, T; X)$ is also an integral solution of (8).
(c) Convergence. If
\[ f_n \to f \ \text{ in } L_1(0, T; X) \quad \text{and} \quad x_0^n \to u_0 \ \text{ in } X \]
as $n \to \infty$, then $(u_n)$ converges uniformly on $[0, T]$ to the integral solution $u$ of
(8).
(d) Comparison assertion. If $v$ is an arbitrary integral solution of (8), then
\[ \|v(t) - u(t)\| \le \|v(s) - u(s)\| \]
holds for all $s, t$, $0 \le s \le t \le T$.
This theorem shows that the concept of an integral solution provides an
appropriate solution concept for (8). At the same time, (c) yields an
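approximation method for the solution of (8). Equation (10) is equivalent to
\[ x_k^n + \Delta_n t \cdot A x_k^n \ni x_{k-1}^n + \Delta_n t \cdot f_k^n, \]
and because $A$ is m-accretive, this can always be uniquely solved for $x_k^n$.
Since $x_0^n$ is known, we obtain $x_1^n, x_2^n, \ldots$ successively.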
57.4. Proof of the Main Theorem
The main idea of the proof has already been described in the introduction
to this chapter.
Proof of Theorem 57.A, (b). From $u \in C([0, T]; X)$ and $u' \in L_1(0, T; X)$
it follows that
\[ u(t) = u(0) + \int_0^t u'(\tau)\, d\tau \quad \text{for all } t \in [0, T] \tag{11} \]
and, for almost all $t \in [0, T]$,
\[ h^{-1}(u(t + h) - u(t)) \to u'(t) \quad \text{as } h \to 0 \]
(cf. Brezis (1973, M), pages 140, 154). Equation (11) yields
\[ \|u(t_1) - u(t_2)\| \le \int_{t_1}^{t_2} \|u'(\tau)\|\, d\tau, \quad 0 \le t_1 \le t_2 \le T. \]
For each $\varepsilon > 0$ there thus exists a $\delta(\varepsilon) > 0$ such that for disjoint intervals
$]t_i, t_{i+1}]$ having total length less than $\delta(\varepsilon)$, we always have:
\[
\sum_i \bigl|\, \|u(t_{i+1}) - x\| - \|u(t_i) - x\| \,\bigr|
\le \sum_i \|u(t_{i+1}) - u(t_i)\| \le \sum_i \int_{t_i}^{t_{i+1}} \|u'(\tau)\|\, d\tau < \varepsilon
\]
(cf. $A_2$(20)). Thus, $t \mapsto \|u(t) - x\|$ is absolutely continuous on $[0, T]$. Then,
according to a known theorem from Lebesgue theory, the derivative of
$t \mapsto \|u(t) - x\|$ exists almost everywhere on $[0, T]$, and we have
\[ \|u(t) - x\| - \|u(s) - x\| = \int_s^t \|u(\tau) - x\|'\, d\tau \tag{12} \]
for $0 \le s \le t \le T$ (cf. Riesz-Nagy (1956, M), No. 25).
Let $y \in Ax$. By (6), the following holds for almost all $\tau \in [0, T]$:
\[
\begin{aligned}
\|u(\tau) - x\|' &= [u(\tau) - x, u'(\tau)]_- \\
&= [u(\tau) - x, (u'(\tau) - f(\tau) + y) + (f(\tau) - y)]_- \\
&\le -[u(\tau) - x, (f(\tau) - u'(\tau)) - y]_+ + [u(\tau) - x, f(\tau) - y]_+ \\
&\le [u(\tau) - x, f(\tau) - y]_+.
\end{aligned}
\]
In this connection, one takes Proposition 57.2 into account, as well as the
fact that $f(\tau) - u'(\tau) \in Au(\tau)$, $y \in Ax$, and the accretiveness of $A$
(Proposition 57.5). Equation (12) yields
\[ \|u(t) - x\| - \|u(s) - x\| \le \int_s^t [u(\tau) - x, f(\tau) - y]_+\, d\tau. \]
The existence of the integral appearing on the right-hand side results from
\[ |[u(\tau) - x, f(\tau) - y]_+| \le \|f(\tau) - y\| \le \|f(\tau)\| + \|y\| \]
by (5) and the Lebesgue dominated convergence criterion (cf. $A_2$(19)).
Proof of Theorem 57.A, (c). Here, (14), (15) and (17), (18) below are
crucial.
Step 1: Estimation for Accretive Operators A
Lemma 57.7. From
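\[ \alpha^{-1}(x - \hat{x}) + Ax \ni f, \qquad \beta^{-1}(y - \hat{y}) + Ay \ni g \]
and $\alpha, \beta > 0$, it follows that
\[ \|x - y\| \le (\alpha + \beta)^{-1}\{\alpha\|x - \hat{y}\| + \beta\|\hat{x} - y\| + \alpha\beta\|f - g\|\}. \tag{13} \]
Proof. Proposition 57.5 yields
\[
\begin{aligned}
0 &\le [x - y, (f - \alpha^{-1}(x - \hat{x})) - (g - \beta^{-1}(y - \hat{y}))]_+ \\
&\le [x - y, f - g]_+ + [x - y, \alpha^{-1}(\hat{x} - x)]_+ + [x - y, \beta^{-1}(y - \hat{y})]_+ \\
&\le [x - y, f - g]_+ + \alpha^{-1}(\|\hat{x} - y\| - \|x - y\|) + \beta^{-1}(\|x - \hat{y}\| - \|x - y\|).
\end{aligned}
\]
Take (2) into account. From this it follows that (13) holds, taking
\[ |[x - y, f - g]_+| \le \|f - g\| \]
into consideration. □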
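The estimate (13) can be checked numerically in a simple special case. The following sketch is an added illustration, not part of the proof: it uses $X = \mathbb{R}$ with the multivalued sign operator as $A$, draws random data, solves both inclusions of Lemma 57.7 through the resolvent, and records the largest violation of (13). All names (`resolvent_sign`, `solve_step`, `worst`) and the sampling ranges are assumptions of this example.

```python
import random


def resolvent_sign(w, mu):
    """Resolvent (I + mu*A)^{-1} w of the multivalued sign operator A on R."""
    if w > mu:
        return w - mu
    if w < -mu:
        return w + mu
    return 0.0


def solve_step(prev, rhs, step):
    """Solve step^{-1}(x - prev) + A x ∋ rhs, i.e. x + step*A x ∋ prev + step*rhs."""
    return resolvent_sign(prev + step * rhs, step)


random.seed(0)
worst = float("-inf")  # largest value of (left side of (13)) - (right side of (13))
for _ in range(10000):
    alpha, beta = random.uniform(0.01, 2.0), random.uniform(0.01, 2.0)
    x_hat, y_hat = random.uniform(-3.0, 3.0), random.uniform(-3.0, 3.0)
    f, g = random.uniform(-2.0, 2.0), random.uniform(-2.0, 2.0)
    x = solve_step(x_hat, f, alpha)
    y = solve_step(y_hat, g, beta)
    bound = (alpha * abs(x - y_hat) + beta * abs(x_hat - y)
             + alpha * beta * abs(f - g)) / (alpha + beta)
    worst = max(worst, abs(x - y) - bound)
```

Up to rounding, `worst` should never be positive, in agreement with the lemma; a random test of this kind is of course no substitute for the proof.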
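Step 2: Estimation of the differences. We set
\[ a_{j,k} \overset{\text{def}}{=} \|x_j^m - x_k^n\|, \]
\[
\omega^{m,n}(t - s) \overset{\text{def}}{=} \int_0^{|t - s|} (\|f(\tau)\| + \|y\|)\, d\tau
+ \|x_0^n - x\| + \|x_0^m - x\| + \|f - f_n\|_1 + \|f - f_m\|_1.
\]
Here, $\|\cdot\|_1$ denotes the norm in $L_1(0, T; X)$.
Lemma 57.8. For all $j = 1, \ldots, m$ and $k = 1, \ldots, n$,
\[
a_{j,k} \le \bigl(a_{j-1,k}\,\Delta_n t + a_{j,k-1}\,\Delta_m t + \Delta_m t\,\Delta_n t\,\|f_j^m - f_k^n\|\bigr)(\Delta_m t + \Delta_n t)^{-1},
\tag{14}
\]
and for $j = 0$ or $k = 0$, as well as $y \in Ax$,
\[ a_{j,k} \le \omega^{m,n}(t_k^n - t_j^m). \tag{15} \]
Proof. (14) Write (10) for $m$ and $n$. Then (14) follows from Lemma 57.7.
(15) Let $y \in Ax$. The operator $A$ is accretive, i.e., $(I + \Delta_n t \cdot A)^{-1}$ is
nonexpansive. Thus, from (10) it follows that
\[
\begin{aligned}
\|x_k^n - x\| &\le \|\Delta_n t \cdot f_k^n + x_{k-1}^n - (x + \Delta_n t \cdot y)\| \\
&\le \|x_{k-1}^n - x\| + \Delta_n t\,(\|f_k^n\| + \|y\|) \\
&\le \|x_{k-1}^n - x\| + \int_{t_{k-1}^n}^{t_k^n} (\|f(\tau)\| + \|y\| + \|f(\tau) - f_n(\tau)\|)\, d\tau.
\end{aligned}
\]
From this it follows that
\[ \|x_k^n - x\| \le \|x_0^n - x\| + \int_0^{t_k^n} (\|f(\tau)\| + \|y\|)\, d\tau + \|f - f_n\|_1. \]
Now, for $j = 0$, assertion (15) follows from
\[ \|x_k^n - x_0^m\| \le \|x_k^n - x\| + \|x - x_0^m\| \]
and $t_0^m = 0$. In an analogous way, we get (15) for $k = 0$. □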
Step 3. Difference method. We consider the difference method that is
parallel to (14), (15) for the real first-order differential equation
\[ U_s(s, t) + U_t(s, t) = h(s, t), \quad 0 < s, t < T \tag{16a} \]
with the initial condition
\[ U(s, t) = \omega(t - s) \tag{16b} \]
for $t = 0$, $s \in [0, T]$ or $s = 0$, $t \in [0, T]$ (see Fig. 57.1).
[Figure 57.1]
For smooth $\omega$ and $h$, (16) has the solution $U = G(\omega, h)$, where
\[
G(\omega, h)(s, t) = \omega(t - s) +
\begin{cases}
\displaystyle\int_0^s h(\tau, t - s + \tau)\, d\tau & \text{for } t \ge s, \\[2ex]
\displaystyle\int_0^t h(s - t + \tau, \tau)\, d\tau & \text{for } s \ge t.
\end{cases}
\]
One checks this by integrating (16a) along the characteristics $s - t = $ constant
(see Fig. 57.1). If $\omega$ and $h$ are not sufficiently smooth, we think of $G(\omega, h)$ as
the generalized solution of (16).
The difference method corresponding to (16) reads as follows:
\[ \frac{U_{j,k} - U_{j-1,k}}{\Delta_m t} + \frac{U_{j,k} - U_{j,k-1}}{\Delta_n t} = h_{j,k} \]
for $k = 1, \ldots, n$, $j = 1, \ldots, m$, and
\[ U_{j,k} = \omega(t_k^n - s_j^m) \quad \text{for } j = 0 \text{ or } k = 0. \]
The expression "for $j = 0$ or $k = 0$" is a natural abbreviation of the
expression "for $j = 0$, $k = 0, \ldots, n$ or $k = 0$, $j = 0, \ldots, m$." Furthermore, we
use the notation $t_k^n = k\Delta_n t$, $k = 0, \ldots, n$ and $s_j^m = j\Delta_m t$, $j = 0, \ldots, m$ as well
as $U_{j,k} = U(s_j^m, t_k^n)$.
If we replace $h$ by $h^{m,n}$ and $\omega$ by $\omega^{m,n}$, then we obtain
\[ U_{j,k} = \{U_{j-1,k}\,\Delta_n t + U_{j,k-1}\,\Delta_m t + \Delta_m t\,\Delta_n t \cdot h_{j,k}^{m,n}\}(\Delta_m t + \Delta_n t)^{-1}, \tag{17} \]
\[ U_{j,k} = \omega^{m,n}(t_k^n - s_j^m) \quad \text{for } j = 0 \text{ or } k = 0. \tag{18} \]
The crucial basic idea of the proof is the comparison of (17) and (18) with the
assertion of Lemma 57.8.
We denote by $H(\omega^{m,n}, h^{m,n})$ the function that equals $U_{j,k}$ at the grid
point $(s_j^m, t_k^n)$ and is constant on the square $]s_{j-1}^m, s_j^m] \times ]t_{k-1}^n, t_k^n]$. The
following lemma contains a convergence proposition for the difference
method. To this end, for the function $h: [0, T] \times [0, T] \to \mathbb{R}$, we introduce the
following norm:
\[ \|h\|_* = \inf\{\|g\|_1 + \|f\|_1\}. \]
Here, the infimum varies over all $g, f \in L_1(0, T)$ such that $|h(s, t)| \le g(s) +
f(t)$ almost everywhere on $[0, T] \times [0, T]$. We denote the completion of
$C([0, T] \times [0, T])$ with respect to this norm by $C^*$.
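The difference method for (16) is easy to try out. The sketch below is an added illustration under stated assumptions: it implements the recursion in the solved form (17) with boundary values as in (18), takes the smooth data $\omega(r) = r^2$ and $h = 0$ (so that the exact solution is $G(\omega, 0)(s, t) = (t - s)^2$), and measures the maximal grid error for two mesh sizes. The function names and the chosen data are hypothetical.

```python
def difference_scheme(omega, h, T, m, n):
    """Grid values U[j][k] approximating U_s + U_t = h with U = omega(t - s) on the axes."""
    ds, dt = T / m, T / n  # step sizes in s (index j) and in t (index k)
    U = [[0.0] * (n + 1) for _ in range(m + 1)]
    for j in range(m + 1):
        for k in range(n + 1):
            s, t = j * ds, k * dt
            if j == 0 or k == 0:
                U[j][k] = omega(t - s)  # boundary values on the two axes
            else:
                # Solved form of the scheme: a convex combination plus a source term.
                U[j][k] = (U[j - 1][k] * dt + U[j][k - 1] * ds
                           + ds * dt * h(s, t)) / (ds + dt)
    return U


def max_error(size):
    """Maximal grid error against the exact solution (t - s)^2 for omega(r) = r^2, h = 0."""
    omega = lambda r: r * r
    U = difference_scheme(omega, lambda s, t: 0.0, 1.0, size, size)
    step = 1.0 / size
    return max(abs(U[j][k] - omega((k - j) * step))
               for j in range(size + 1) for k in range(size + 1))


err_coarse = max_error(50)
err_fine = max_error(200)
```

The error decreases under mesh refinement, consistent with the first-order accuracy one expects from backward differences; the positivity of the coefficients in the solved form is what makes the majorant comparison with Lemma 57.8 possible.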
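Lemma 57.9. In the $L_\infty$-norm on $[0, T] \times [0, T]$,
\[ H(\omega^{m,n}, h^{m,n}) \to G(\omega, h) \quad \text{as } (n, m) \to \infty, \tag{19} \]
provided the following three conditions hold:
(i) $\omega^{m,n}, \omega \in C[-T, T]$, $h^{m,n}, h \in C^*$ for all $n, m$.
(ii) $h^{m,n}$ is piecewise constant on the grid, i.e., it is constant on each square
$]s_{j-1}^m, s_j^m] \times ]t_{k-1}^n, t_k^n]$.
(iii) $\|\omega^{m,n} - \omega\|_{C([0,T] \times [0,T])} \to 0$ and $\|h^{m,n} - h\|_* \to 0$ as $(n, m) \to \infty$.
Here, $(n, m) \to \infty$ means that $\min(n, m) \to \infty$.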
The proof makes use of standard techniques for difference methods,
which we presented in Chapter 20 (cf. Problem 57.2).
Step 4: Majorant Method and Uniform Convergence of $(u_n)$ on $[0, T]$. We set
\[ h(s, t) \overset{\text{def}}{=} \|f(s) - f(t)\|, \qquad h^{m,n}(s, t) \overset{\text{def}}{=} \|f_m(s) - f_n(t)\|. \]
If we compare (14) and (15) with (17) and (18), then, because of the fact
that the coefficients are positive, we obtain the key relation of our proof:
\[ a_{j,k} \le U_{j,k}, \quad j = 0, 1, \ldots, m; \ k = 0, 1, \ldots, n. \]
Therefore, for the corresponding piecewise constant functions, we have:
\[ \|u_m(s) - u_n(t)\| \le H(\omega^{m,n}, h^{m,n})(s, t). \tag{20} \]
By assumption,
\[
\bigl| h^{m,n}(s, t) - \|f(s) - f(t)\| \bigr|
\le \|f_m(s) - f(s)\| + \|f_n(t) - f(t)\| \to 0 \quad \text{as } (n, m) \to \infty.
\]
From this it follows that $\|h^{m,n} - h\|_* \to 0$ as $(n, m) \to \infty$. Let
\[ \omega(t - s) = \int_0^{|t - s|} (\|f(\tau)\| + \|y\|)\, d\tau + 2\|u_0 - x\|. \]
Since $x_0^n, x_0^m \to u_0$ as $(n, m) \to \infty$, we have
\[ \omega^{m,n}(t - s) \to \omega(t - s) \quad \text{uniformly on } [0, T] \times [0, T] \]
as $(n, m) \to \infty$ for fixed $(x, y) \in A$.
According to Lemma 57.9 and (20), we have
\[ \limsup_{(n, m) \to \infty} \|u_n(t) - u_m(t)\| \le G(\omega, h)(t, t) = 2\|u_0 - x\| \]
for all $x \in D(A)$. Since $u_0 \in \overline{D(A)}$, it follows from this that $(u_n)$ converges
uniformly as $n \to \infty$ on $[0, T]$ to a function $u$.
Again, by Lemma 57.9 and (20), for $0 \le s \le t \le T$, we have
\[
\begin{aligned}
\|u(s) - u(t)\| &= \lim_{n \to \infty} \|u_n(s) - u_n(t)\| \le G(\omega, h)(s, t) \\
&= \int_0^{|t - s|} (\|f(\tau)\| + \|y\|)\, d\tau + 2\|u_0 - x\|
+ \int_0^s \|f(t - s + \tau) - f(\tau)\|\, d\tau.
\end{aligned}
\]
Therefore, $u$ is continuous on $[0, T]$. In fact, the first integral is small for
$|t - s|$ small because of the absolute continuity of the integral. For the
second integral, one uses the mean continuity that follows from $\|f(\cdot)\| \in
L_1(0, T)$. Since $u_0 \in \overline{D(A)}$, $\|u_0 - x\|$ can be made arbitrarily small for a
suitable choice of $x \in D(A)$.
Step 5: $u$ is an Integral Solution. By (10), we have
\[ f_k^n + (\Delta_n t)^{-1}(x_{k-1}^n - x_k^n) \in A x_k^n, \quad k = 1, 2, \ldots, n. \]
$A$ is accretive. Therefore, according to Proposition 57.5, for all $(x, y) \in A$,
we have
\[
\begin{aligned}
0 &\le [x_k^n - x, f_k^n + (\Delta_n t)^{-1}(x_{k-1}^n - x_k^n) - y]_+ \\
&\le [x_k^n - x, f_k^n - y]_+ + [x_k^n - x, (\Delta_n t)^{-1}(x - x_k^n)]_+
+ [x_k^n - x, (\Delta_n t)^{-1}(x_{k-1}^n - x)]_+;
\end{aligned}
\]
therefore, by (3) and (5),
\[
\|x_k^n - x\| - \|x_{k-1}^n - x\| \le \Delta_n t\,[x_k^n - x, f_k^n - y]_+
\le \Delta_n t\,[x_k^n - x, f_k^n - y]_\mu \quad \text{for all } \mu > 0.
\]
Addition of these equations for the various $k$ yields
\[ \|u_n(t) - x\| - \|u_n(s) - x\| \le \int_s^t [u_n(\tau) - x, f_n(\tau) - y]_\mu\, d\tau. \]
Since
\[ |[a, b]_\mu - [c, d]_\mu| \le 2\mu^{-1}\|a - c\| + \|b - d\|, \]
we may replace the quantities $u_n$ and $f_n$ by $u$ and $f$, respectively, for $n \to \infty$.
Then, as $\mu \to +0$, we obtain (9). In this connection, take into consideration
that
\[ [u(\tau) - x, f(\tau) - y]_\mu \to [u(\tau) - x, f(\tau) - y]_+ \quad \text{as } \mu \to +0, \tag{21} \]
\[ |[u(\tau) - x, f(\tau) - y]_\mu| \le \|f(\tau) - y\| \quad \text{for all } \mu > 0 \]
as well as that $u, f \in L_1(0, T; X)$ and the Lebesgue dominated convergence
theorem (cf. $A_2$(19)). □
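Proof of Theorem 57.A, (d). In an essential way, we now make use of
distributions $u \in \mathcal{D}'(\Omega)$ over $C_0^\infty(\Omega)$ with values in $\mathbb{R}$ (cf. $A_2$(64)). Here, $\Omega$
denotes a region in $\mathbb{R}^N$. The following lemma is important for this.
Lemma 57.10. The function $g: [0, T] \to \mathbb{R}$ is monotonely increasing if and only
if $g' \ge 0$ in $\mathcal{D}'(0, T)$.
Here, $u \ge 0$ in $\mathcal{D}'(\Omega)$ means that $u(\varphi) \ge 0$ for all $\varphi \in C_0^\infty(\Omega)$, where $\varphi \ge 0$
on $\Omega$. The proof of Lemma 57.10 can be found in Schwartz (1950, M), pages
29, 54.
Now, let $v$ be an integral solution, i.e., by Definition 57.6, the function
\[ g(t) = \int_0^t [v(\tau) - x, f(\tau) - y]_+\, d\tau - \|v(t) - x\| \]
is monotonely increasing on $[0, T]$; therefore,
\[ \frac{d}{dt} g(t) = [v(t) - x, f(t) - y]_+ - \frac{d}{dt}\|v(t) - x\| \ge 0 \tag{22} \]
in $\mathcal{D}'(0, T)$ for all $(x, y) \in A$. We choose
\[ x \overset{\text{def}}{=} x_k^n, \qquad y \overset{\text{def}}{=} f_k^n + (\Delta_n t)^{-1}(x_{k-1}^n - x_k^n). \]
According to Proposition 57.2, for $\mu > 0$, (22) yields
\[
\begin{aligned}
\frac{d}{dt}\|v(t) - x_k^n\| &\le [v(t) - x_k^n, f(t) - f_k^n]_\mu
+ [v(t) - x_k^n, (\Delta_n t)^{-1}(x_k^n - x_{k-1}^n)]_{\Delta_n t} \\
&= [v(t) - x_k^n, f(t) - f_k^n]_\mu
+ (\Delta_n t)^{-1}\bigl(\|v(t) - x_{k-1}^n\| - \|v(t) - x_k^n\|\bigr).
\end{aligned}
\tag{23}
\]
We set
\[ w_n(t, s) \overset{\text{def}}{=} \|v(t) - u_n(s)\|, \qquad w(t, s) \overset{\text{def}}{=} \|v(t) - u(s)\|, \]
\[ h_n(t, s) \overset{\text{def}}{=} [v(t) - u_n(s), f(t) - f_n(s)]_\mu, \]
\[ h(t, s) \overset{\text{def}}{=} [v(t) - u(s), f(t) - f(s)]_\mu \]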
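as well as
\[ g_n(t, s) \overset{\text{def}}{=} (\Delta_n t)^{-1}\bigl(\|v(t) - x_k^n\| - \|v(t) - x_{k-1}^n\|\bigr) \]
for $s \in \,]t_{k-1}^n, t_k^n]$, $k = 1, 2, \ldots, n$.
Then, with $G \overset{\text{def}}{=} \,]0, T[\, \times \,]0, T[$, inequality (23) reads as follows:
\[ \frac{\partial}{\partial t} w_n(t, s) + g_n(t, s) \le h_n(t, s) \quad \text{in } \mathcal{D}'(G). \tag{24} \]
Furthermore, we construct $\tilde{w}_n(t, s)$ on $[0, T] \times [0, T]$ by
\[ \tilde{w}_n(t, s) \overset{\text{def}}{=} \|v(t) - x_k^n\| \quad \text{for } s = t_k^n, \ k = 0, 1, \ldots, n \]
and interpolate linearly with respect to $s$. Parallel to Example 21.6,
integration by parts yields
\[ \int_G \tilde{w}_n \varphi_s\, dt\, ds = -\int_G g_n \varphi\, dt\, ds \quad \text{for all } \varphi \in C_0^\infty(G), \]
i.e., $\partial\tilde{w}_n/\partial s = g_n$ in $\mathcal{D}'(G)$. Thus, by (24),
\[ \frac{\partial}{\partial t} w_n + \frac{\partial}{\partial s} \tilde{w}_n \le h_n \quad \text{in } \mathcal{D}'(G). \tag{25} \]
As $n \to \infty$, we have $w_n, \tilde{w}_n \to w$ and $h_n \to h$ in $L_1(G)$; therefore,
\[ \frac{\partial}{\partial t} w + \frac{\partial}{\partial s} w \le h \quad \text{in } \mathcal{D}'(G). \tag{26} \]
This is so because, for all $\varphi \in C_0^\infty(G)$, we have, e.g.,
\[ -\int_G w_n \varphi_t\, dt\, ds \to -\int_G w \varphi_t\, dt\, ds \quad \text{as } n \to \infty. \]
Lemma 57.11. The following inequality holds:
\[ \frac{d}{dt}\|v(t) - u(t)\| \le 0 \quad \text{in } \mathcal{D}'(0, T). \tag{27} \]
Proof. Compare Problem 57.3.
Then, according to Lemma 57.10, the desired assertion (d) follows
immediately.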
Proof of Theorem 57.A, (a). The uniqueness assertion follows from
Theorem 57.A, (d) with v(0)=u(0), and Theorem 57.A, (c) yields the
existence assertion. D
57.5. Application to Nonexpansive Semigroups in
B-Spaces
Let $C$ be a nonempty subset of a B-space. By a nonexpansive semigroup on
$C$, we understand a family $\{S(t): 0 \le t < \infty\}$ of operators $S(t): C \to C$ such
that for all $u, v \in C$ and $t, s \ge 0$, the following hold:
(i) $S(t + s)u = S(t)S(s)u$.
(ii) $S(0)u = u$.
(iii) $S(t)u \to u$ as $t \to +0$.
(iv) $\|S(t)u - S(t)v\| \le \|u - v\|$.
By the infinitesimal generator of the semigroup $\{S(t)\}$, we understand the
operator $B: D(B) \subseteq X \to X$ defined by
\[ Bu \overset{\text{def}}{=} \lim_{h \to +0} h^{-1}(S(h)u - u). \tag{28} \]
Here, $D(B)$ is the set of all $u \in X$ for which the limiting value (28) exists in
the sense of norm convergence. The semigroup $\{S(t)\}$ is called linear if all
$S(t)$ are linear.
We now explain the connection between the initial value problem
\[ u'(t) + Au(t) \ni 0, \quad 0 < t < \infty, \tag{29} \]
\[ u(0) = u_0 \]
and nonexpansive semigroups. Here, $A$ is a multivalued mapping.
According to Theorem 57.A in Section 57.3, problem (29) has exactly one integral
solution $u(\cdot)$ for each $u_0 \in \overline{D(A)}$. We define $S(t)$ by $S(t)u_0 = u(t)$.
Theorem 57.B. If $X$ is a real B-space and $A: D(A) \subseteq X \to 2^X$ is an m-accretive
operator, then $\{S(t): 0 \le t < \infty\}$ forms a nonexpansive semigroup on
$\overline{D(A)}$.
Proof. By the construction of $u(t)$ according to Theorem 57.A, we obtain
$u(t) \in \overline{D(A)}$ for all $t \ge 0$.
In order to show that $S(t + s)u_0 = S(t)S(s)u_0$, we set $v(t) \overset{\text{def}}{=} u(t + s)$ for
fixed $s \ge 0$. Then $v$ is an integral solution of (29) on $[0, T]$ for arbitrary
$T > 0$ with $v(0) = u(s)$; thus, $v(t) = S(t)u(s)$.
According to Theorem 57.A, (d), $S(t)$ is nonexpansive, and from the
continuity of $t \mapsto u(t)$ it follows that $S(t)u_0 \to u_0$ as $t \to +0$. □
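The semigroup properties (i)-(iv) can be observed on an explicit example. The sketch below is an added illustration, assuming $X = \mathbb{R}$ and the multivalued sign operator $A$; for this choice the solution of $u' + Au \ni 0$, $u(0) = u_0$ is $u(t) = \operatorname{sgn}(u_0)\max(|u_0| - t, 0)$, which is a classical solution almost everywhere and hence an integral solution by Theorem 57.A, (b). The function name `S` and the sample points are assumptions of the example.

```python
def S(t, u0):
    """Semigroup of u' + sign(u) ∋ 0 on R: move u0 toward 0 with unit speed, then stay at 0."""
    magnitude = max(abs(u0) - t, 0.0)
    return magnitude if u0 >= 0 else -magnitude


points = [k / 4.0 for k in range(-12, 13)]
times = [0.0, 0.25, 0.5, 1.0, 2.5]

# (i) semigroup property S(t + s)u = S(t)S(s)u on the sample grid.
semigroup_ok = all(
    abs(S(t + s, u) - S(t, S(s, u))) <= 1e-12
    for u in points for t in times for s in times
)

# (ii) S(0)u = u.
identity_ok = all(S(0.0, u) == u for u in points)

# (iv) nonexpansiveness |S(t)u - S(t)v| <= |u - v|.
nonexpansive_ok = all(
    abs(S(t, u) - S(t, v)) <= abs(u - v) + 1e-12
    for u in points for v in points for t in times
)
```

Note that every trajectory reaches 0 in finite time and then stays there, which a linear semigroup could not do; this is one of the typical nonlinear effects covered by Theorem 57.B.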
In Problem 57.4 we point out the Hille-Yosida theory for linear
nonexpansive semigroups and their nonlinear generalization by Crandall and
Pazy.
57.6. Application to Partial Differential Equations
We consider the quasilinear differential equation
\[ u_t + F(u)_x = 0, \quad 0 < x < 1, \ t > 0 \tag{30a} \]
with the initial condition
\[ u(x, 0) = u_0(x), \quad 0 < x < 1 \tag{30b} \]
and the boundary condition at $x = 0$,
\[ u(0, t) = 0, \quad t > 0. \tag{30c} \]
Equation (30a) describes a conservation law, because, for a smooth
solution of (30a), we have:
\[ \frac{d}{dt} \int_a^b u(x, t)\, dx = F(u(a, t)) - F(u(b, t)). \]
In the following, let $X = L_1(0, 1)$ and let $u(t)$ denote $x \mapsto u(x, t)$ conceived
of as an element of $X$ for fixed $t$.
Definition 57.12. By the generalized problem for (30), we understand the
differential equation
\[ u'(t) + Au(t) = 0, \quad 0 < t < \infty, \tag{31} \]
\[ u(0) = u_0 \]
in $X$ with
\[ D(A) \overset{\text{def}}{=} \{v \in C[0, 1]: v(0) = 0, \ F \circ v \in W_1^1(0, 1)\} \]
and
\[ (Av)(x) \overset{\text{def}}{=} \frac{d}{dx} F(v(x)) \]
on $]0, 1[$ for all $v \in D(A)$. Recall that $(F \circ v)(x) = F(v(x))$.
The boundary condition (30c) is contained in the definition of $D(A)$. The
operator $A$ in (31) acts on the function $x \mapsto u(x, t)$.
Proposition 57.13. Let $F: \mathbb{R} \to \mathbb{R}$ be continuously differentiable and strictly
increasing with $F(0) = 0$, $F(\mathbb{R}) = \mathbb{R}$. Then the following two assertions hold:
(a) The operator $A: D(A) \subseteq X \to X$ is m-accretive with $\overline{D(A)} = X$.
(b) For each $u_0 \in X$, (31) has exactly one integral solution $u$.
If we set $S(t)u_0 = u(t)$, then $\{S(t): 0 \le t < \infty\}$ forms a nonexpansive
semigroup on $X$.
We show (a) in Problem 57.5. Then (b) follows from Theorems 57.A and
57.B.
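The nonexpansiveness in $L_1(0, 1)$ asserted in Proposition 57.13 has a discrete counterpart that is easy to observe. The sketch below is an added illustration and uses a different technique from the text: an explicit first-order upwind discretization of (30), not the implicit difference method (10). The flux $F(u) = u + u^3/3$, the grid, the step ratio, and the initial data are assumptions; with $|u| \le 1$ we have $F'(u) \le 2$, so the ratio $0.4$ keeps the scheme monotone.

```python
import math


def flux(u):
    """Hypothetical flux: continuously differentiable, strictly increasing, F(0) = 0, F(R) = R."""
    return u + u ** 3 / 3.0


def upwind_step(u, ratio):
    """One explicit upwind step with ratio = dt/dx; boundary value u(0, t) = 0 as in (30c)."""
    new = []
    left = 0.0  # value at the inflow boundary x = 0
    for value in u:
        new.append(value - ratio * (flux(value) - flux(left)))
        left = value  # old value of the left neighbor for the next cell
    return new


def l1_distance(u, v, dx):
    """Discrete L_1(0, 1) distance of two grid functions."""
    return dx * sum(abs(a - b) for a, b in zip(u, v))


cells = 100
dx = 1.0 / cells
ratio = 0.4  # dt/dx; ratio * max F' <= 0.8 for data bounded by 1

u = [math.sin(2.0 * math.pi * (i + 1) * dx) for i in range(cells)]
v = [0.5 * (i + 1) * dx for i in range(cells)]

distances = [l1_distance(u, v, dx)]
for _ in range(50):
    u, v = upwind_step(u, ratio), upwind_step(v, ratio)
    distances.append(l1_distance(u, v, dx))

# The discrete L_1 distance of two solutions should never increase.
contractive = all(later <= earlier + 1e-12
                  for earlier, later in zip(distances, distances[1:]))
```

The check mirrors property (iv) of a nonexpansive semigroup for two discrete solutions; it says nothing about convergence of the explicit scheme to the integral solution, which is outside the scope of this sketch.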
Problems
57.1. Proof of Proposition 57.2. Solution: Let $\varphi(\mu) \overset{\text{def}}{=} \|x + \mu y\|$. Then $\varphi$ is convex.
Consequently, the one-sided derivatives $\varphi'_\pm(0) = [x, y]_\pm$ with $\varphi'_-(0) \le
\varphi'_+(0)$ exist. In addition, the difference quotients $[x, y]_\mu$ as $\mu \to +0$ and as
$\mu \to -0$ are monotonely decreasing and monotonely increasing, respectively
(cf. Problem 42.3). Now use these assertions and the triangle inequality. For
example, it follows immediately from
\[ \|x + \lambda(y + z)\| \le 2^{-1}(\|x + 2\lambda y\| + \|x + 2\lambda z\|) \]
that
\[ [x, y + z]_+ \le [x, y]_+ + [x, z]_+. \]
On the other hand, from
\[ \|x + \lambda y\| \le 2^{-1}(\|x + 2\lambda(y + z)\| + \|x - 2\lambda z\|) \]
one immediately obtains
\[ [x, y]_+ \le [x, y + z]_+ - [x, z]_-. \]
Together with $[a, b]_+ = -[a, -b]_-$, this yields
\[ [x, y + z]_- \le [x, y]_- + [x, z]_+. \]
If we set $x = u(t)$, $y = u'(t)$, then
\[ x + \mu y = u(t + \mu) + o(\mu) \quad \text{as } \mu \to 0. \]
Now, (6) follows.
57.2.* Proof of Lemma 57.9. Hint: Compare Crandall and Evans (1975).
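57.3. Proof of Lemma 57.11. Solution: For continuously differentiable functions,
(27) results directly by considering (26) for $s = t$ and observing that $h(t, t)
= 0$.
In the case of distributions, (26) means
\[ -\int_G w(\varphi_t + \varphi_s)\, dt\, ds \le \int_G h\varphi\, dt\, ds \tag{32} \]
for all $\varphi \in C_0^\infty(G)$, where $\varphi \ge 0$. Here, $G = \,]0, T[\, \times \,]0, T[$. We introduce new
variables $\tau, \sigma$ by $s = \tau + \sigma$, $t = \tau - \sigma$ and set $\varphi(\tau, \sigma) \overset{\text{def}}{=} \alpha(\tau)\beta(\sigma)$, where
\[ \alpha \in C_0^\infty(0, T), \quad \beta \in C_0^\infty(-\varepsilon, \varepsilon), \quad \alpha, \beta \ge 0, \quad \int_{-\varepsilon}^{\varepsilon} \beta(\sigma)\, d\sigma = 1. \]
Then $\varphi$ is concentrated on a small neighborhood of the main diagonal in the
$(t, s)$-plane. We have $\alpha'\beta = \varphi_t + \varphi_s$. From (32) it follows that
\[
-\int_{-\varepsilon}^{\varepsilon} \Bigl( \int_0^T \|v(\tau - \sigma) - u(\tau + \sigma)\|\,\alpha'(\tau)\, d\tau \Bigr) \beta(\sigma)\, d\sigma
\le \int_{-\varepsilon}^{\varepsilon} \Bigl( \int_0^T \|f(\tau - \sigma) - f(\tau + \sigma)\|\,\alpha(\tau)\, d\tau \Bigr) \beta(\sigma)\, d\sigma.
\]
For $\varepsilon \to 0$, the mean continuity of $f$ and the continuity of $u, v$ yield
\[ -\int_0^T \|v(\tau) - u(\tau)\|\,\alpha'(\tau)\, d\tau \le 0 \]
for all $\alpha \in C_0^\infty(0, T)$ with $\alpha \ge 0$. This is (27).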
57.4.* Characterization of nonexpansive semigroups. We give a complete survey of
the structure of:
($\alpha$) linear nonexpansive semigroups in B-spaces (Problem 57.4a);
($\beta$) nonexpansive semigroups in H-spaces (Problem 57.4b).
In this connection, also compare the introduction to Chapter 31.
57.4a.* Linear Hille-Yosida theory. Let $X$ be a linear B-space. Show:
(i) If $A$ is an operator such that
\[ A: D(A) \subseteq X \to X \text{ is linear, m-accretive, and } \overline{D(A)} = X, \]
then
\[ S(t)u_0 \overset{\text{def}}{=} \lim_{\mu \to +0} \bigl(\exp(-tA_\mu)\bigr)u_0 \]
for all $u_0 \in X$ yields a linear nonexpansive semigroup on $X$. Here, $A_\mu$
denotes the Yosida approximation of $A$, and $-A$ is the infinitesimal
generator of the semigroup in the sense of Section 57.5.
(ii) Every linear nonexpansive semigroup is obtained as in (i).
Hint: Compare Riesz and Nagy (1956, M), No. 143, page 385.
57.4b.* Nonlinear Hille-Yosida theory of Crandall and Pazy. Let X be a real
H-space. Show:
(i) If $A$ is a multivalued mapping such that
\[ A: D(A) \subseteq X \to 2^X \text{ and } A \text{ is m-accretive}, \]
then, according to Theorem 57.B in Section 57.5, $A$ generates a
nonexpansive semigroup $\{S(t): 0 \le t < \infty\}$ on $\overline{D(A)}$. The infinitesimal generator of
this semigroup is $-A^0$. Here, $D(A^0) = D(A)$ and $A^0 u$ is the uniquely
determined element with the smallest norm in the closed convex set $Au$.
(ii) Every nonexpansive semigroup on a closed convex nonempty set is
obtained as in (i) with the aid of a mapping $A$.
The connection with the theory of monotone operators is obtained by the
following relation which is valid in a real H-space:
\[ A \text{ is m-accretive} \ \Leftrightarrow \ A \text{ is maximal monotone}. \]
Hint: Compare Crandall and Pazy (1969) and Brezis (1973, M), page 114.
57.5. Proof of Proposition 57.13, (a). Solution:
(I) We show that $A$ is accretive. To this end, we use a regularizing
method. Let $v, w \in D(A)$, $\lambda > 0$. We must show that
\[ \|v - w + \lambda(Av - Aw)\|_X \ge \|v - w\|_X. \tag{33} \]
For this we set
\[
p_n(s) \overset{\text{def}}{=}
\begin{cases}
ns & \text{for } |s| \le n^{-1}, \\
\operatorname{sgn} s & \text{for } |s| > n^{-1};
\end{cases}
\qquad
q_n(s) \overset{\text{def}}{=} \int_0^s p_n(t)\, dt.
\]
Then the following crucial relation holds:
\[
\begin{aligned}
\int_0^1 (Av - Aw)\, p_n(F(v) - F(w))\, dx
&= \int_0^1 (F(v) - F(w))'\, p_n(F(v) - F(w))\, dx \\
&= \int_0^1 \frac{d}{dx} q_n(F(v) - F(w))\, dx = q_n\bigl(F(v(1)) - F(w(1))\bigr) \ge 0.
\end{aligned}
\tag{34}
\]
To be precise, one should write $F(v(x))$, $F(w(x))$. Furthermore, one must
observe the following: For $v, w \in D(A)$, the function $x \mapsto F(v(x))$ belongs
to $C[0, 1] \cap W_1^1(0, 1)$ and is consequently absolutely continuous (cf. Smirnow
(1956, M), Vol. V, Section 110). The function $q_n$ is Lipschitz
continuous. Thus, $x \mapsto q_n(F(v(x)) - F(w(x)))$ is also absolutely continuous.
Consequently, the last line in (34) is meaningful (cf. Riesz and Nagy (1956, M),
No. 25).
Since $F$ and $p_n$ are monotonely increasing, we have
\[ |v - w|\,|p_n(F(v) - F(w))| = (v - w)\, p_n(F(v) - F(w)). \]
Since $|p_n| \le 1$ and (34) holds, then
\[
\begin{aligned}
\int_0^1 |v - w + \lambda(Av - Aw)|\, dx
&\ge \int_0^1 (v - w)\, p_n(F(v) - F(w))\, dx
+ \lambda \int_0^1 (Av - Aw)\, p_n(F(v) - F(w))\, dx \\
&\ge \int_0^1 |v - w|\,|p_n(F(v) - F(w))|\, dx \to \int_0^1 |v - w|\, dx \quad \text{as } n \to \infty.
\end{aligned}
\]
This is (33). Use the Lebesgue dominated convergence theorem.
(II) $A$ is m-accretive. Let $h \in X$. We must show that $R(I + \lambda A) = X$ for
all $\lambda > 0$, i.e., the ordinary differential equation
\[ v(x) + \lambda F(v(x))' = h(x), \quad 0 < x < 1, \tag{35} \]
has a solution $v \in D(A)$. If $G$ is the inverse of the function $u \mapsto \lambda F(u)$, then
it suffices to find a function $w \in C[0, 1] \cap W_1^1(0, 1)$ such that $w(0) = 0$ and
\[ G(w(x)) + w'(x) = h(x) \tag{36} \]
almost everywhere on $]0, 1[$. Then $v(x) \overset{\text{def}}{=} G(w(x))$ is a solution of (35) and
$v \in D(A)$.
A solution of (36) is obtained by solving the integral equation
\[ w(x) = \int_0^x \bigl(h(\xi) - G(w(\xi))\bigr)\, d\xi, \quad 0 \le x \le 1 \]
on $C[0, 1]$ using the Schauder-Leray principle (Theorem 6.A in Section 6.8).
Take into consideration that for each solution $v \in D(A)$ of $v + \lambda Av = h$,
the inequality $\|v\|_X \le \|h\|_X$ holds because of (33) and $A(0) = 0$; thus,
$\|G(w)\|_X \le \|h\|_X$ for each solution of (36).
(III) $\overline{D(A)} = X$ because the $C^\infty$-functions that vanish at $x = 0$ belong to
$D(A)$ and form a dense subset of $X = L_1(0, 1)$.
57.6.* Invariant sets for nonexpansive semigroups. Let {S(t)} be a nonexpansive
semigroup on the complete metric space $X$, where the trajectories $t \mapsto S(t)u$
are continuous on $\mathbb{R}_+$ for all $u \in X$. Show: If $M$ is a closed subset of $X$
and $C \ge 0$ is a constant such that
\[ \lim_{t \to +0} t^{-1} d(S(t)u, M) \le C \quad \text{for all } u \in M, \]
then $d(S(t)u, M) \le Ct$ for all $u \in M$ and $t \ge 0$.
In the special case $C = 0$, we find that $M$ is an invariant set for the
semigroup, i.e., if $u \in M$, then $S(t)u \in M$ for all $t \in \mathbb{R}_+$.
Hint: Compare Brezis and Browder (1976) and Ekeland (1979). There
one also finds further material. Use the abstract entropy principle (Theorem
38.G in Section 38.11) with the entropy function $(u, t) \mapsto t$ and the
ordering:
\[ (u, t) \le (v, s) \ \Leftrightarrow \ t \le s \ \text{ and } \ d(S(s - t)u, v) \le C(s - t). \]
References to the Literature
Crandall and Evans (1975); Kobayashi (1975); Crandall (1976, S).
Nonlinear semigroups: Crandall and Pazy (1969); Brezis (1973, M,B);
Brezis and Browder (1976); Barbu (1976, M, B); Walker (1980, M); Berkeley
(1983, P). (Also, cf. the references to the literature in Chapter 31.)
Appendix
Intelligence consists of this: that we recognize the similarity of different things
and the difference between similar ones.
Montesquieu
In this Appendix we give fundamental propositions concerning:
($\alpha$) Properties of convex sets in $\mathbb{R}^n$ and systems of inequalities.
($\beta$) Dual pairs of locally convex spaces.
($\gamma$) Smoothness and convexity properties of the norm in B-spaces.
In this connection, we assume the basic concepts of linear and locally
convex spaces as well as those of Hilbert spaces (briefly, H-spaces) and
Banach spaces (briefly, B-spaces) which we summarized in the Appendix to
Part I. In general, the following scheme holds:
H-space → B-space (norm topology) → locally convex space → linear space,
B-space (weak topology) → locally convex space → linear space.
Every H-space is a B-space, etc.
Convex Sets in $\mathbb{R}^n$ and Systems of Inequalities
(1) Carathéodory's representation theorem. Let $M$ be a set in $\mathbb{R}^n$ and
$x \in \operatorname{co} M$. Then there exist points $x_1, \ldots, x_k$ in $M$, $1 \le k \le n + 1$, such that
$x \in \operatorname{co}\{x_1, \ldots, x_k\}$.
(2) The convex hull of a compact set in $\mathbb{R}^n$ is again compact.
(3) Helly's intersection theorem. Let $\{K_1, \ldots, K_m\}$ be a finite family of
compact convex sets $K_i$ in $\mathbb{R}^n$. Then the intersection of all these sets is
nonempty if and only if every subfamily of at most $n + 1$ sets $K_i$ has a
nonempty intersection.
(4) Linear inequalities. Let $K$ be a compact set in $\mathbb{R}^n$. Then the system of
inequalities
\[ (z \mid u) < 0 \quad \text{for all } u \in K \]
has a solution $z \in \mathbb{R}^n$ if and only if $0 \notin \operatorname{co} K$.
(5) Convex inequalities. Let $f_j: M \subseteq \mathbb{R}^n \to \mathbb{R}$, $j = 1, \ldots, m$, be convex
functions on the convex nonempty set $M$. Then the system
\[ f_j(x) < 0, \quad j = 1, \ldots, m \]
has a solution $x_0 \in M$ if and only if there is no vector $y \in \mathbb{R}_+^m - \{0\}$ having
at most $n + 1$ nonvanishing components such that
\[ \sum_{j=1}^m y_j f_j(x) \ge 0 \quad \text{for all } x \in M. \]
(6) Strongly positive solutions of linear equations. Let $A$ be a real $m \times n$
matrix. Then the problem
\[ Ax = 0, \quad x \gg 0, \quad x \in \mathbb{R}^n \]
has a solution $x$ if and only if the problem
\[ A^*y \ge 0, \quad A^*y \ne 0, \quad y \in \mathbb{R}^m \]
has no solution $y$.
Here, $x \gg z$ means that all components of $x$ are greater than the
corresponding components of $z$, i.e., $x - z \in \operatorname{int} \mathbb{R}_+^n$.
(7) Farkas' lemma. Let $A$ be a real $m \times n$ matrix and let $b$ be a vector in
$\mathbb{R}^m$. Then
\[ Ax = b, \quad x \ge 0, \quad x \in \mathbb{R}^n \]
has a solution $x$ if and only if
\[ (y \mid b) \ge 0 \quad \text{for all } y \in \mathbb{R}^m \text{ such that } A^*y \ge 0. \]
Dual Pairs of Locally Convex Spaces
The concept of a dual pair allows the formulation of a symmetric duality
theory.
(8) Dual space. Let $X$ be a locally convex space over $\mathbb{K}$ ($= \mathbb{R}, \mathbb{C}$) with
topology $\tau$ and let $X^*$ be the set of all continuous linear functionals on $X$.
Then $X^*$ is called the dual space of $X$. For $x^* \in X^*$, we write
\[ \langle x^*, x\rangle_X \overset{\text{def}}{=} x^*(x). \]
We assume that $X^*$ is transformed into a locally convex space over $\mathbb{K}$
with the topology $\tau^*$ by means of a system of seminorms.
(8a) Dual pair. $(X, X^*)$ forms a dual pair if and only if $\tau, \tau^*$ are so
constituted that the following hold:
(i) If, for fixed $x \in X$, we set
\[ f_x(x^*) \overset{\text{def}}{=} \langle x^*, x\rangle_X \quad \text{for all } x^* \in X^*, \]
then $f_x$ is a continuous linear functional on $X^*$.
(ii) All continuous linear functionals on $X^*$ are obtained in this way.
(8b) Identification. Since $x \ne y$ always implies $f_x \ne f_y$, we can identify $x$
with $f_x$. In this sense, $(X^*)^* = X$, and, for all $x \in X$, $x^* \in X^*$,
\[ \langle x^*, x\rangle_X = \langle x, x^*\rangle_{X^*}. \]
(9) Standard Example 1. If $X$ is a reflexive B-space, then $(X, X^*)$ forms a
dual pair if the topologies $\tau$ and $\tau^*$ on $X$ and $X^*$, respectively, are generated
by the norms in the usual way.
In this sense $(\mathbb{K}^n, \mathbb{K}^n)$, with $\mathbb{K} = \mathbb{R}$ or $\mathbb{C}$, is a dual pair.
(10) Standard Example 2. If a locally convex space $X$ is equipped with the
weak topology $\tau_w$ and $X^*$ with the weak* topology $\tau_w^*$ (cf. $A_1$(41)), then
$(X, X^*)$ is a dual pair.
(11) Standard construction of dual pairs. Let $X$ and $Y$ be linear spaces
over $\mathbb{K}$ and let $b: X \times Y \to \mathbb{K}$ be a bilinear form with the properties:
(i) $b(x, y) = 0$ for all $y \in Y$ implies $x = 0$.
(ii) $b(x, y) = 0$ for all $x \in X$ implies $y = 0$.
Then we call $(X, Y)$ an algebraic dual pair with respect to $b$.
For all $x \in X$, $y \in Y$, we set
\[ p_y(x) \overset{\text{def}}{=} |b(x, y)|, \qquad p_x^*(y) \overset{\text{def}}{=} |b(x, y)|. \]
Equipped with the system of seminorms $\{p_y: y \in Y\}$, $X$ becomes a locally
convex space over $\mathbb{K}$. We denote this topology by $\sigma(X, Y)$.
Regarding $\sigma(X, Y)$, we have $X^* = Y$ in the sense that precisely all
continuous linear functionals on $X$ are obtained by means of $x \mapsto b(x, y)$,
provided $y$ ranges over the set $Y$.
With the system of seminorms $\{p_x^*: x \in X\}$, $Y$ becomes a locally convex
space over $\mathbb{K}$. We denote this topology by $\sigma(Y, X)$.
Regarding $\sigma(Y, X)$, we have $Y^* = X$ in the sense that precisely all
continuous linear functionals on $Y$ are obtained by means of $y \mapsto b(x, y)$,
provided $x$ ranges over the set $X$.
Henceforth, the following holds: (X, X*), with X* = Y, forms a dual pair
with respect to the topologies (a(X,Y), a(Y, X)), and for all x&X,
x* e X*, we have (x*,x)x = b(x,x*).
(12) Example 1. Let X= C[a, b], Y= C"[a, b] and
b(x,y) = (bx(t)y(t)dt,
where - oo < a < b < oo, n = 0,1,.... Obviously, the situation (11) is
obtained.
This example demonstrates the significance of dual pairs. If we provide
$X = C[a, b]$ with the usual norm, then $X$ is a B-space and $X^*$ is a normed
space in which one can no longer work very comfortably. By means of the
construction in (11), $X$ is equipped with a locally convex topology such that
$X^* = Y$ holds. In this way dual optimization problems that run their course
in $X^*$ can be handled more easily.
Since $(X^*)^* = X$, there exists a complete symmetry between $X$ and $X^*$.
(13) Example 2. Let $X$ be locally convex. We set $Y \overset{\text{def}}{=} X^*$,
$b(x, y) \overset{\text{def}}{=} \langle y, x\rangle_X$. Then the construction in (11) yields
$$\sigma(X, X^*) = \tau_w, \qquad \sigma(X^*, X) = \tau_w^*.$$
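(14) In a dual pair $(X, X^*)$, the following holds for M-S sequences:
$x_\alpha \to x$ in $X$ implies $\langle x^*, x_\alpha\rangle_X \to \langle x^*, x\rangle_X$ for all $x^* \in X^*$,
$x_\alpha^* \to x^*$ in $X^*$ implies $\langle x_\alpha^*, x\rangle_X \to \langle x^*, x\rangle_X$ for all $x \in X$.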
(15) The Mackey topology $\tau(X, Y)$. Suppose $(X, Y)$ forms an algebraic
dual pair with respect to $b$ as in (11). We set
$$p_C(x) \overset{\text{def}}{=} \sup_{y \in C} |b(x, y)| \quad \text{for all } x \in X.$$
The system of seminorms
$$\{p_C : C \text{ is } \sigma(Y, X)\text{-compact and convex in } Y\}$$
transforms $X$ into a locally convex space. We denote this topology by
$\tau(X, Y)$.
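The seminorm $p_C$ in (15) can be made concrete in finite dimensions. The following is only a minimal sketch under stated assumptions: $X = Y = \mathbb{R}^2$, $b$ is the dot product, and $C$ is the convex hull of finitely many points. The names `b`, `p_C`, and `vertices` are illustrative and do not come from the text. Because $y \mapsto |b(x, y)|$ is convex, its supremum over such a $C$ is attained at one of the generating points, so a maximum over the vertices suffices.

```python
def b(x, y):
    """Assumed bilinear form: the dot product on R^2."""
    return sum(a * c for a, c in zip(x, y))


def p_C(x, vertices):
    """Seminorm p_C(x) = sup of |b(x, y)| over the convex hull of `vertices`.

    A convex function attains its maximum over a polytope at a vertex,
    so it is enough to scan the generating points.
    """
    return max(abs(b(x, v)) for v in vertices)


# Hypothetical compact convex set C = co{(1,0), (0,2), (-1,-1)}.
vertices = [(1.0, 0.0), (0.0, 2.0), (-1.0, -1.0)]

value = p_C((1.0, 1.0), vertices)      # pairings are 1, 2, -2
scaled = p_C((2.0, 2.0), vertices)     # homogeneity: twice the value above
summed = p_C((2.0, 0.0), vertices)     # (1,1) + (1,-1), for the triangle inequality
```

In finite dimensions every Hausdorff locally convex topology coincides with the norm topology, so the sketch illustrates only the seminorm itself, not the distinction between $\sigma(X, Y)$ and $\tau(X, Y)$.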
(15a) The Mackey-Arens theorem. The locally convex topology $\mu$
transforms the linear space $X$ into a locally convex space with $X^* = Y$ if and only
if
(15b) $\sigma(X, Y) \subseteq \mu \subseteq \tau(X, Y)$.
Furthermore, for $\mu$ in (15b) and $M \subseteq X$, we have:
(15c) $M$ is convex and $\mu$-closed $\Leftrightarrow$ $M$ is convex and $\tau(X, Y)$-closed.
(15d) $M$ is $\mu$-bounded $\Leftrightarrow$ $M$ is $\tau(X, Y)$-bounded.
(15e) Example. Let $X$ be a B-space. We set $Y \overset{\text{def}}{=} X^*$, $b(x, y) \overset{\text{def}}{=} \langle y, x\rangle_X$.
Then the Mackey topology $\tau(X, Y)$ is generated by the norm on $X$, and
$\sigma(X, Y)$ is equal to the weak topology $\tau_w$ on $X$.
For the B-space $X$, (15b) describes all locally convex topologies $\mu$ that
yield the same continuous linear functionals on $X$ as does the norm
topology on $X$. In particular, for $M \subseteq X$, (15c) and (15d) assert that:
(i) $M$ is convex and weakly closed $\Leftrightarrow$ $M$ is convex and closed,
(ii) $M$ is weakly bounded $\Leftrightarrow$ $M$ is bounded.
(16) Product spaces. If $X$ and $P$ are locally convex spaces over $\mathbb{K}$, then
$X \times P$ is also a locally convex space over $\mathbb{K}$.
In addition, $(X \times P)^* = X^* \times P^*$, i.e., each continuous linear functional
$f \in (X \times P)^*$ has the form $f = (x^*, p^*)$, where $x^* \in X^*$, $p^* \in P^*$, and
$$\langle (x^*, p^*), (x, p)\rangle_{X \times P} = \langle x^*, x\rangle_X + \langle p^*, p\rangle_P$$
for all $(x, p) \in X \times P$, and all elements of $(X \times P)^*$ are obtained in this
way.
If $(X, X^*)$, $(P, P^*)$ are dual pairs, then $(X \times P, X^* \times P^*)$ also forms a
dual pair with respect to the corresponding product topologies.
(17) Dual operator. Let $(X, X^*)$ and $(Y, Y^*)$ be dual pairs over $\mathbb{K}$. For
each continuous linear mapping $A\colon X \to Y$, there exists a uniquely
determined continuous linear mapping $A^*\colon Y^* \to X^*$ such that
$$\langle y^*, Ax\rangle_Y = \langle A^* y^*, x\rangle_X \quad \text{for all } y^* \in Y^*,\ x \in X.$$
$A^*$ is called the dual mapping, or the dual operator, to $A$.
Since $(X^*)^* = X$, $(Y^*)^* = Y$, we have $(A^*)^* = A$.
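In the finite-dimensional real case the dual operator of (17) is simply the transposed matrix. The sketch below assumes $X = \mathbb{R}^3$, $Y = \mathbb{R}^2$ with the dot product as pairing; the matrix `A`, the vectors, and the helper names are hypothetical choices made only for illustration.

```python
def dot(u, v):
    """Pairing <u, v> on R^n, assumed to be the dot product."""
    return sum(a * b for a, b in zip(u, v))


def apply(matrix, vec):
    """Matrix-vector product with the matrix stored as a list of rows."""
    return [dot(row, vec) for row in matrix]


def transpose(matrix):
    """Transposed matrix, which plays the role of the dual operator here."""
    return [list(col) for col in zip(*matrix)]


A = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]        # A: R^3 -> R^2
A_star = transpose(A)          # A*: R^2 -> R^3

x = [1.0, -2.0, 0.5]
y_star = [2.0, 1.0]

lhs = dot(y_star, apply(A, x))         # <y*, A x>_Y
rhs = dot(apply(A_star, y_star), x)    # <A* y*, x>_X
```

Transposing twice returns the original matrix, which mirrors $(A^*)^* = A$.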
(18) The polar $M^\circ$ of $M$. For a set $M$ in the real locally convex space $X$
we set
$$M^\circ = \{x^* \in X^* : \langle x^*, x\rangle_X \le 1 \text{ for all } x \in M\}.$$
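For a finite set $M \subset \mathbb{R}^2$, membership in the polar of (18) reduces to finitely many inequalities. The following sketch assumes the dot product as pairing; the set `M` and the names `pairing` and `in_polar` are illustrative assumptions, not notation from the text.

```python
def pairing(x_star, x):
    """Pairing <x*, x> on R^2, assumed to be the dot product."""
    return sum(a * b for a, b in zip(x_star, x))


def in_polar(x_star, M, tol=1e-12):
    """Return True if <x*, x> <= 1 for every x in the finite set M."""
    return all(pairing(x_star, x) <= 1.0 + tol for x in M)


# Hypothetical set M = {(1, 0), (0, 1)}; its polar is {x*: x*_1 <= 1, x*_2 <= 1}.
M = [(1.0, 0.0), (0.0, 1.0)]

inside_corner = in_polar((1.0, 1.0), M)     # both pairings equal 1
inside_far = in_polar((-5.0, 0.5), M)       # the polar is unbounded
outside = in_polar((1.5, 0.0), M)           # first pairing is 1.5 > 1
```

The example shows that the polar of a bounded set need not be bounded: it is an intersection of half-spaces, one for each element of $M$.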
(19) The Alaoglu-Bourbaki theorem. $U^\circ$ is weak* compact in $X^*$
provided $U$ is a neighborhood of zero in $X$.
(20) Bipolar theorem. $(M^\circ)^\circ = \overline{\mathrm{co}}(M \cup \{0\})$ provided $(X, X^*)$ is a dual
pair.
Convexity and Smoothness Properties of the Norm in B-Spaces
In (21)-(32), let all spaces be real. X denotes a real B-space.
(21) Definitions. The following definitions refer to smoothness and
convexity properties of the boundary of the unit ball in X.
(21a) $X$ is called locally uniformly convex if and only if for each $\varepsilon$,
$0 < \varepsilon \le 2$, and for each $x$, $\|x\| = 1$, there exists a $\delta(\varepsilon, x) > 0$ such that the
following holds for all $y \in X$:
$$\|x - y\| \ge \varepsilon,\ \|x\| = \|y\| = 1 \quad \text{implies} \quad \|2^{-1}(x + y)\| \le 1 - \delta(\varepsilon, x).$$
(21b) $X$ is called uniformly convex if and only if $X$ is locally uniformly
convex and $\delta$ can be chosen to be independent of $x$.
We explained the geometric meaning in Fig. 10.1.
(21c) $X$ is called strictly convex if and only if the following holds:
$$\|tx + (1 - t)y\| < 1 \quad \text{provided } x \neq y,\ \|x\| = \|y\| = 1,\ t \in\, ]0, 1[.$$
(21d) $X$ is called smooth if and only if $f$ is G-differentiable for all
$x \in X - \{0\}$, where $f(x) \overset{\text{def}}{=} \|x\|$.
(21e) $X$ is called uniformly smooth if and only if $f$ is F-differentiable for
all $x \in X - \{0\}$ and
$$\|x + h\| = \|x\| + \langle f'(x), h\rangle_X + \varepsilon(h)\|h\|$$
for all $h \in X$, where $f(x) \overset{\text{def}}{=} \|x\|$. Here, we have $\varepsilon(h) \to 0$ as $h \to 0$ and
indeed uniformly for all $x$, $\|x\| = 1$.
(22) $X$ is uniformly convex $\Rightarrow$ $X$ is locally uniformly convex $\Rightarrow$ $X$ is
strictly convex.
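The condition in (21c) can be probed already in $\mathbb{R}^2$. The following minimal sketch (the helper names `p_norm` and `midpoint_norm` are illustrative assumptions) evaluates the midpoint of two distinct unit vectors. Under the Euclidean norm the midpoint lies strictly inside the unit ball, as (21c) requires, whereas under the 1-norm it stays on the unit sphere, so the 1-norm is not strictly convex.

```python
def p_norm(v, p):
    """The p-norm of a vector v in R^n for 1 <= p < infinity."""
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def midpoint_norm(x, y, p):
    """Norm of the midpoint (x + y) / 2, i.e. the case t = 1/2 in (21c)."""
    mid = [(a + b) / 2.0 for a, b in zip(x, y)]
    return p_norm(mid, p)


# Two distinct vectors that have norm 1 for every p.
x = (1.0, 0.0)
y = (0.0, 1.0)

euclidean_mid = midpoint_norm(x, y, 2)   # strictly less than 1
taxicab_mid = midpoint_norm(x, y, 1)     # equal to 1: the segment lies on the sphere
```

A single pair of points can only refute strict convexity, as it does for the 1-norm; it does not prove it for the Euclidean norm, which follows from (24).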
(23) $X$ is uniformly convex $\Rightarrow$ $X$ is reflexive. $X$ is uniformly smooth $\Rightarrow$ $X$ is
smooth.
(24) Example. Every real H-space $X$ is uniformly convex and uniformly
smooth and, with $f(x) = \|x\|$ as in (21d),
$$f'(x) = \|x\|^{-1} x \quad \text{for all } x \in X - \{0\}.$$
The Lebesgue spaces $L_p(G)$ and the Sobolev spaces $W_p^m(G)$ are
uniformly convex provided $1 < p < \infty$ and $G$ is a nonempty open set in $\mathbb{R}^N$.
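The derivative formula for the norm of an H-space in (24) can be checked numerically in $\mathbb{R}^3$ with the Euclidean inner product. The sketch below compares a central difference quotient of $t \mapsto \|x + th\|$ at $t = 0$ with the pairing $\langle f'(x), h\rangle = \|x\|^{-1}(x \mid h)$; the vectors, the step size, and the function names are hypothetical choices for illustration.

```python
import math


def norm(v):
    """Euclidean norm on R^n."""
    return math.sqrt(sum(c * c for c in v))


def directional_derivative(x, h, t=1e-6):
    """Central difference approximation of d/dt ||x + t h|| at t = 0."""
    plus = norm([a + t * b for a, b in zip(x, h)])
    minus = norm([a - t * b for a, b in zip(x, h)])
    return (plus - minus) / (2.0 * t)


def gradient_pairing(x, h):
    """The pairing <f'(x), h> with f'(x) = x / ||x||, valid for x != 0."""
    return sum(a * b for a, b in zip(x, h)) / norm(x)


x = [3.0, 4.0, 0.0]    # ||x|| = 5
h = [1.0, 1.0, 1.0]

exact = gradient_pairing(x, h)             # (3 + 4 + 0) / 5
approx = directional_derivative(x, h)      # finite-difference estimate
```

The two values agree up to the discretization and rounding error of the difference quotient; at $x = 0$ the formula does not apply, in line with the restriction to $X - \{0\}$.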
(25) $X$ is uniformly convex (respectively, uniformly smooth) $\Leftrightarrow$ $X^*$ is
uniformly smooth (respectively, uniformly convex).
(26) $X^*$ is strictly convex (respectively, smooth) $\Rightarrow$ $X$ is smooth
(respectively, strictly convex).
(27) For a real reflexive B-space $X$, the following then holds:
$X$ is strictly convex (respectively, smooth) $\Leftrightarrow$ $X^*$ is smooth (respectively,
strictly convex).
(28) $X^*$ is locally uniformly convex $\Rightarrow$ $f$ is F-differentiable on $X - \{0\}$,
where $f(x) \overset{\text{def}}{=} \|x\|$.
(29) The Kadec-Troyanski theorem. In every reflexive B-space, an
equivalent norm can be introduced so that X, X* are locally uniformly convex
and thus also strictly convex. Then, according to (28), the corresponding
norms on X-{0} and X* -{0} are F-differentiable.
(30) In a locally uniformly convex space,
$$x_n \rightharpoonup x,\ \|x_n\| \to \|x\| \quad \text{implies} \quad x_n \to x$$
as $n \to \infty$.
(31) Characterization of strictly convex spaces. The following five
assertions are mutually equivalent:
(i) $X$ is strictly convex.
(ii) $\{x \in X : \|x\| = 1\}$ contains no segments.
(iii) Every boundary point of $B \overset{\text{def}}{=} \{x \in X : \|x\| \le 1\}$ is an extreme point
of $B$.
(iv) If the equals sign holds in the triangle inequality, i.e.,
$$\|x - y\| = \|x - z\| + \|z - y\|,$$
and $z \neq x$, $z \neq y$, then $z = tx + (1 - t)y$ holds for some $t \in\, ]0, 1[$.
(v) The functional $x \mapsto \|x\|^2$ is strictly convex on $X$.
(32) Maxima of functionals.
(32a) In a real B-space $X$, for all continuous linear functionals $f \in X^*$, we
have:
$$\|f\| = \sup_{\|x\| \le 1} f(x).$$
(32b) $X$ is strictly convex $\Leftrightarrow$ each $f \in X^*$ takes on its maximum on $B$ in at
most one point.
(32c) (James) $X$ is reflexive $\Leftrightarrow$ each $f \in X^*$ has a maximum on $B$.
(32d) (Bishop-Phelps) Let the set M in X be bounded, closed, convex,
and nonempty. Let X be a real B-space. Then the set of all / e X* that have
a maximum on M is dense in X*.
(32e) (James) Let the set $M$ in $X$ be bounded, weakly closed, and
nonempty. Let $X$ be a real B-space. Then:
$M$ is weakly compact $\Leftrightarrow$ each $f \in X^*$ has a maximum on $M$.
(32f) (Ekeland and Lebourg). Every B-space with an F-differentiable
norm off the origin is an Asplund space, i.e., every continuous convex real
function on the space is F-differentiable at every point of a residual set.
A residual set is the complement of a set of first Baire category. Such sets
are "big."
Many of the theorems on the geometry of B-spaces stated above are
profound propositions whose proofs are difficult.
References to the Literature
Convex sets in $\mathbb{R}^N$: Valentine (1964, M); Rockafellar (1970, M,B)
(standard work); Marti (1977, M,B).
Linear inequalities: Vogel (1967, M); Marti (1977, M).
Convexity and inequalities: Marti (1977, M).
Dual pairs: Edwards (1965, M); Schaefer (1966, M).
Geometry of B-spaces: Köthe (1960, M); Cioranescu (1974, M)
(comprehensive exposition); Diestel (1974, L); Holmes (1975, M); Ekeland (1979);
Beauzamy (1982, M).
References
In this literature list, for example, "In: Fucik, S. and Kufner, A. [eds.] (1979),
59-94" without further instructions indicates that the article is to be found in
"Fucik, S. and Kufner, A. [eds.] (1979)."
Ablowitz, M. and Segur, H. (1981): Solitons and the Inverse Scattering Transform.
SIAM, Philadelphia.
Abraham, R. and Robbin, J. (1967): Transversal Mappings and Flows. Benjamin,
New York.
Abraham, R. and Marsden, J. (1978): Foundations of Mechanics. Benjamin,
Reading, MA.
Achieser, N. (1967): Vorlesungen über Approximationstheorie. Akademie-Verlag,
Berlin.
Ackermann, S. (1979): Axiomatische Bifurkationstheorie und Verzweigung bei
ungeraden Potentialoperatoren. Dissertation, Leipzig.
Ahmad, S., Lazer, C., and Paul, J. (1976): Elementary critical point theory and
perturbation of elliptic boundary value problems at resonance. Indiana Univ.
Math. J. 25 (933-944).
Ahmed, N. and Teo, K. (1981): Optimal Control of Distributed Parameter Systems.
North-Holland, New York.
Alber, S. (1970): The topology of functional manifolds and the calculus of variations in
the large. Uspehi Mat. Nauk 25, 4 (57-122) (Russian).
Albeverio, S. and Hoegh-Krohn, R. (1976): Mathematical theory of Feynman
integrals. Lecture Notes in Mathematics, Vol. 523. Springer-Verlag, Berlin.
Aleksandrov, P. [ed.] (1971): Die Hilbertschen Probleme. Geest & Portig, Leipzig.
Almgren, F. (1984): Q-valued Functions Minimizing Dirichlet's Integral and the
Regularity of Area Minimizing Rectifiable Currents up to Codimension Two.
(monograph to appear).
Amann, H. (1969): Ein Existenz- und Eindeutigkeitssatz für die Hammersteinsche
Gleichung in Banachräumen. Math. Z. 111 (175-190).
Amann, H. (1972): Ljusternik-Schnirelman theory and nonlinear eigenvalue
problems. Math. Ann. 199 (55-72).
Amann, H. (1979): Saddle points and multiple solutions of differential equations.
Math. Z. 169 (127-166).
Amann, H. and Zehnder, E. (1980): Nontrivial solutions for a class of nonresonance
problems and applications to nonlinear differential equations. Ann. Scuola Norm.
Sup. Pisa Cl. Sci. (4) 7 (539-603).
Ambrosetti, A. and Rabinowitz, P. (1973): Dual variational methods in critical point
theory and applications. J. Funct. Anal. 14 (349-380).
Amrein, O. (1981): Non-relativistic Quantum Dynamics. Reidel, Dordrecht.
Angel, E. and Bellman, R. (1972): Dynamic Programming and Partial Differential
Equations. Academic, New York.
Anger, G. [ed.] (1979): Inverse and Improperly Posed Problems in Differential
Equations. Akademie-Verlag, Berlin.
Aoki, M. (1976): Optimal Control and System Theory in Dynamic Economic Analysis.
North-Holland, Amsterdam.
Arnold, L. (1973): Stochastische Differentialgleichungen. Oldenbourg, München.
(English edition: Stochastic Differential Equations: Theory and Applications.
Wiley, New York, 1974.)
Arnold, V. (1963): Small denominators and problems of stability of motion in classical
and celestial mechanics. Uspehi Mat. Nauk 18, 6 (91-196) (Russian).
Arnold, V. and Avez, A. (1968): Ergodic Problems of Classical Mechanics. Benjamin,
New York.
Arnold, V. (1971): Ordinary Differential Equations. Nauka, Moscow, 1971-1978.
Vols. 1,2. (Russian). (English edition: MIT Press, Cambridge, MA, 1978.)
Arnold, V. (1974): Mathematical Methods of Classical Mechanics. Nauka, Moscow
(Russian). (English edition: Springer-Verlag, Berlin, 1978.)
Arnold, V. (1975): Critical points of functions. Uspehi Mat. Nauk 30, 5 (3-65)
(Russian).
Arnold, V. (1981): Singularity Theory. Selected Papers. Cambridge University Press,
Cambridge, England.
Arnold, V. (1983): Singularities of ray systems. Uspehi Mat. Nauk 38, 2 (77-147)
(Russian).
Arnold, V. (1983a): Singularities in variational calculus. Itogi nauki sovremennye
problemy matematiki, Vol. 22. Moscow (Russian).
Arnold, V. (1983b): Geometrical Methods in the Theory of Ordinary Differential
Equations. Springer-Verlag, New York.
Arrow, K., Hurwicz, L., and Uzawa, H. (1958): Studies in Linear and Nonlinear
Programming. Stanford University Press, Stanford, CA.
Arrow, K., and Intriligator, A. [eds.] (1983): Handbook of Mathematical Economics.
North-Holland, New York (to appear).
Asimow, L. and Ellis, A. (1982): Convexity Theory and its Applications in Functional
Analysis. Academic, New York.
Astrom, K. (1970): Introduction to Stochastic Control Theory. Academic, New York.
Atiyah, M., Bott, R., and Gårding, L. (1970): Lacunas for hyperbolic differential
operators with constant coefficients, I; II. Acta Math. 124 (109-189); 131 (1973),
(145-206).
Atiyah, M. (1979): Geometry of Yang-Mills Fields. Scuola Normale Superiore, Pisa
(Lecture Notes).
Aubin, J. (1979): Mathematical Methods of Game and Economic Theory.
North-Holland, Amsterdam.
Auslender, A. (1972): Problèmes de minimax via l'analyse convexe et les inégalités
variationnelles. Lecture Notes in Economics, Vol. 77. Springer-Verlag, Berlin.
Auslender, A. (1976): Optimisation: méthodes numériques. Masson, Paris.
Babic, V., Michlin, S., Kapilevic, M., Natanson, G., Riz, P., Slobodeckii, L., and
Smirnov, M. (1967): Lineare Differentialgleichungen der mathematischen Physik.
Akademie-Verlag, Berlin. (Russian edition: Nauka, Moscow, 1964. English
edition: Holt, Rinehart and Winston, New York, 1967.)
Babic, V. and Buldyrev, V. (1972): Asymptotic Methods in Diffraction Problems of
Short Waves. Nauka, Moscow (Russian).
Babic, V. and Kirpicnikova, N. (1979): The Boundary Layer Methods in Diffraction
Problems. Springer-Verlag, New York.
Bacry, H. (1977): Lectures on Group Theory and Particle Theory. Gordon and
Breach, London.
Baiocchi, C. and Capelo, A. (1978): Disequazioni variazionali e quasivariazionali,
Vols. 1, 2. Pitagora, Bologna.
Baker, G. and Gammel, J. (1970): The Padé Approximation in Theoretical Physics.
Academic, New York.
Baker, G. (1975): Essentials of Padé Approximants. Academic, New York.
Baker, G. and Morris, P. (1981): Padé Approximants, Vols. 1, 2. Addison-Wesley,
New York.
Balakrishnan, A. and Neustadt, L. (1964): Computing Methods in Optimization
Problems. Academic, New York.
Balakrishnan, A. (1975): Applied Functional Analysis. Springer-Verlag, New York.
Ball, J. (1977): Convexity conditions and existence theorems in nonlinear elasticity.
Arch. Rat. Mech. Anal. 63 (337-403).
Ball, J., Currie, J., and Olver, P. (1981): Null Lagrangians, weak continuity, and
variational problems of arbitrary order. J. Funct. Analysis 41 (135-174).
Banach, S. (1929): Sur les fonctionnelles linéaires, I; II. Studia Math. 1 (1929),
211-216; 223-239.
Barbu, V. (1976): Nonlinear Semigroups and Differential Equations in Banach Spaces.
Noordhoff, Leyden; Ed. Acad., Bucuresti.
Barbu, V., and Precupanu, T. (1978): Convexity and Optimization in Banach Spaces.
Ed. Acad., Bucuresti; Sijthoff & Noordhoff, Leyden.
Baumgärtel, H. and Wollenberg, M. (1983): Mathematical Scattering Theory.
Akademie-Verlag, Berlin.
Bazley, N. (1974): Existence and bounds for the lowest critical energy of the Hartree
operator. In: Ordinary and Partial Differential Equations. Sleeman, B. et al.
[eds.]. Lecture Notes in Mathematics, Vol. 415. Springer-Verlag, Berlin, 1974,
23-34.
Beals, M., Fefferman, C., and Grossman, R. (1983): Strictly pseudoconvex domains in
$\mathbb{C}^n$. Bull. Amer. Math. Soc. (N.S.) 8 (125-322).
Beauzamy, B. (1982): Introduction to Banach Spaces and Their Geometry.
North-Holland, Amsterdam.
Becher, P., Böhm, M., and Joos, H. (1981): Eichtheorien der starken und
elektroschwachen Wechselwirkung. Teubner, Stuttgart.
Beckert, H. (1971): Über die Konvergenz des Gradientenverfahrens mit Anwendungen
auf Standortprobleme und das Ritzsche Verfahren. ZAMM 51 (333-341).
Beckert, H. (1972): Zur Steuerung der Stabilität in elastischen Körpern. ZAMM 52
(617-622).
Beckert, H. (1977): Bemerkungen zur Theorie der Stabilität. Sitzungsber. Sächs.
Akad. Wiss. Leipzig, Math.-nat. Kl. 113, 2.
Belenkii, V. and Volkonskii, V. [eds.] (1974): Iterative Methods in Game Theory and
Optimization. Nauka, Moscow (Russian).
Bellman, R. (1953): An introduction to the theory of dynamic programming. The Rand
Corporation, Santa Monica, CA.
Bellman, R. (1954): The theory of dynamic programming. Bull. Amer. Math. Soc. 60
(503-516).
Bellman, R. (1957): Dynamic Programming. Princeton University Press, Princeton,
N.J.
Bellman, R. (1961): Adaptive Control Processes. Princeton University Press,
Princeton, N.J.
Bellman, R. (1967): Introduction to the Mathematical Theory of Control Processes,
Vols. 1, 2. Academic, New York, 1967-1971.
Bellman, R. and Angel, E. (1972): Dynamic Programming and Partial Differential
Equations. Academic, New York.
Bellman, R. and Lee, E. (1978): Functional equations in dynamic programming.
Aequationes Math. 17 (1-18).
Benci, V. and Rabinowitz, P. (1979): Critical point theorems for indefinite functionals.
Inventiones Math. 52 (241-273).
Bengtsson, L. et al. [eds.] (1981): Dynamic Meteorology. Springer-Verlag, New York.
Ben-Israel, A. and Greville, T. (1973): Generalized Inverses. Wiley, New York.
Bennett, S. (1979): A History of Control Engineering. Peter Peregrinus, Stevenage,
England.
Bensoussan, A. (1971): Filtrage optimal des systèmes linéaires. Dunod, Paris.
Bensoussan, A. (1982): Stochastic Control by Functional Analysis Methods.
North-Holland, Amsterdam.
Bensoussan, A., Lions, J., and Temam, R. (1972): Method of decomposition,
decentralization, coordination and its applications. In: Lions, J. and Marcuk, G.
[eds.] (1975), 144-274 (Russian).
Bensoussan, A. and Lions, J. (1975): Control Theory, Numerical Methods and
Computer System Modelling. Lecture Notes in Economics, Vol. 107.
Springer-Verlag, Berlin.
Bensoussan, A. and Lions, J. [eds.] (1978): Applications des inéquations
variationnelles en contrôle stochastique. Dunod, Paris; Bordas, Paris. (English edition:
North-Holland, Amsterdam, 1981.)
Bensoussan, A., Lions, J., and Papanicolaou, G. (1978): Asymptotic Methods in
Periodic Structures. North-Holland, Amsterdam.
Benton, S. (1977): The Hamilton-Jacobi Equation: a global approach. Academic,
New York.
Berezin, I. and Zidkov, N. (1966): Numerical Methods. Nauka, Moscow (Russian).
(German edition: VEB Dt. Verl. d. Wiss., Vols. 1, 2. Berlin, 1970-1971.)
Berge, C. and Ghouila-Houri, A. (1969): Programme, Spiele, Transportnetze.
Teubner, Leipzig. (French edition: Dunod, Paris 1962.)
Berger, M. (1977): Nonlinearity and Functional Analysis. Academic, New York.
Berkeley (1983): Proceedings of a Summer Institute of the American Mathematical
Society on Nonlinear Functional Analysis (to appear).
Berkovitz, L. (1974): Optimal Control Theory. Springer-Verlag, New York.
Billingsley, P. (1965): Ergodic Theory and Information. Wiley, New York.
Birkhoff, G. (1971): The Numerical Solution of Elliptic Equations. Regional
Conference Series in Applied Mathematics, Vol. 11. SIAM, Philadelphia.
Bittner, L. (1968): Begründung des sogenannten diskreten Maximumprinzips. Z.
Wahrsch. Verw. Gebiete 10 (289-301).
Bittner, L. (1975): On optimal control of processes governed by abstract functional,
integral and hyperbolic differential equations. Math. Operationsforsch. Statist. 6
(107-134).
Bliss, G. (1925): Calculus of Variations. Open Court, Chicago.
Bliss, G. (1951): Lectures on the Calculus of Variations. University of Chicago Press,
Chicago.
Blum, E. and Oettli, W. (1975): Mathematische Optimierung. Springer-Verlag, New
York.
Bogoljubov, N. and Sirkov, D. (1973): Introduction to Quantum Field Theory. Nauka,
Moscow (Russian). (English edition: Wiley, New York, 1979.)
Bogoljubov, N. and Sirkov, D. (1980): Quantum Fields. Nauka, Moscow (Russian).
Böhme, R. (1972): Die Lösung der Verzweigungsgleichungen für nichtlineare
Eigenwertprobleme. Math. Z. 127 (105-126).
Böhme, R. (1981/1982): New results on the classical problem of Plateau on the
existence of many solutions. Séminaire Bourbaki No. 579.
Böhme, R. and Tromba, A. (1977): The number of solutions to the classical Plateau
problem is generically finite. Bull. Amer. Math. Soc. 83 (1043-1044).
Boltjanskii, V., Gamkrelidze, R., and Pontrjagin, L. (1956): On the theory of optimal
processes. Doklady Akad. Nauk SSSR 110 (7-10) (Russian).
Boltjanskii, V. (1958): The maximum principle and the theory of optimal processes.
Doklady Akad. Nauk SSSR 119 (1070-1073) (Russian).
Boltjanskii, V. (1971): Mathematische Methoden der optimalen Steuerung. Geest &
Portig, Leipzig. (English edition: Mathematical Methods of Optimal Control.
Holt, Rinehart and Winston, New York, 1971.)
Boltjanskii, V. (1975): The tent method in the theory of extremal problems. Uspehi
Mat. Nauk 30, 3 (3-55) (Russian).
Boltjanskii, V. (1976): Optimale Steuerung diskreter Systeme. Geest & Portig,
Leipzig. (English edition: Optimal Control of Discrete Systems. Halsted, New York,
1978.)
Bolza, O. (1949): Vorlesungen über Variationsrechnung. Koehler and Amelang,
Leipzig.
Bonnesen, T. and Fenchel, W. (1934): Theorie der konvexen Körper. Springer-Verlag,
Berlin.
Boor, C. de (1978): A Practical Guide to Splines. Springer-Verlag, New York.
Booss, B. (1977): Topologie und Analysis. Springer-Verlag, Berlin.
Borisovic, Ju., Zvjagin, V., and Sapronov, Ju. (1977): Nonlinear Fredholm mappings
and Leray-Schauder theory. Uspehi Mat. Nauk 32, 4 (3-54).
Borisovic, Ju., Zvjagin, V., and Serman, P. (1978): Topological Methods in the Theory
of Nonlinear Fredholm Operators. Voronez University Press, Voronez (Russian).
Born, M. and Wolf, E. (1959): Principles of Optics. Pergamon, New York.
Borsuk, K. (1966): Theory of Retracts. PWN, Warsaw.
Bott, R. (1982): Lectures on Morse theory, old and new. Bull. Amer. Math. Soc.
(N.S.) 7 (331-358).
Box, G. and Jenkins, G. (1970): Time-Series Analysis: Forecasting and Control.
Holden-Day, San Francisco.
Bratteli, O. and Robinson, D. (1979): Operator Algebras and Quantum Statistical
Mechanics. Springer-Verlag, New York.
Brézis, H. (1972): Problèmes unilatéraux. J. Math. Pures Appl. 51 (1-168).
Brézis, H., Nirenberg, L., and Stampacchia, G. (1972): Remark on Ky Fan's
min-max theorem. Boll. Un. Mat. Ital. 4, 6 (293-300).
Brézis, H. (1973): Opérateurs maximaux monotones. North-Holland, Amsterdam.
Brézis, H. and Browder, F. (1976): A general ordering principle in nonlinear functional
analysis. Advances in Math. 21 (355-364).
Brézis, H., Coron, J., and Nirenberg, L. (1980): Free vibrations of nonlinear wave
equations and a theorem of P. Rabinowitz. Comm. Pure Appl. Math. 33
(667-689).
Brézis, H. (1983): Periodic solutions of nonlinear vibrating strings and duality
principles. Bull. Amer. Math. Soc. (N.S.) 8 (409-426).
Brillinger, D. (1975): Time Series. Holt, Rinehart and Winston, New York.
Brillouin, L. (1956): Science and Information Theory. Academic, New York.
Bröcker, T. and Lander, L. (1975): Differentiable Germs and Catastrophes.
Cambridge University Press, Cambridge, England.
Bronštein, I. and Semendjaev, K. (1979): Taschenbuch der Mathematik, Vols. 1, 2.
Teubner, Leipzig.
Browder, F. (1959): Functional analysis and partial differential equations, I; II. Math.
Ann. 138 (1959), 55-79; 145 (1961/62), 81-226.
Browder, F. (1965): Variational methods for nonlinear elliptic eigenvalue problems.
Bull. Amer. Math. Soc. 71 (176-183).
Browder, F. (1965a): Non-linear monotone operators and convex sets in Banach
spaces. Bull. Amer. Math. Soc. 71 (780-785).
Browder, F. (1966): On the unification of the calculus of variations and the theory of
monotone nonlinear operators in Banach spaces. Proc. Nat. Acad. Sci. USA 56
(419-425).
Browder, F. (1968): Non-linear eigenvalue problems and Galerkin approximation.
Bull. Amer. Math. Soc. 74 (651-656).
Browder, F. (1968a): The fixed point theory of multivalued mappings in topological
spaces. Math. Ann. 177 (283-301).
Browder, F. (1968/1976): Nonlinear Operators and Nonlinear Equations of Evolution
in Banach Spaces. Proc. Symp. Pure Math., Vol. 18, 2. American Mathematical
Society, Providence, RI, 1976. Preprint version, 1968.
Browder, F. (1970): Existence theorems for nonlinear partial differential equations. In:
Global Analysis. American Mathematical Society, Providence, RI, 1-62.
Browder, F. (1970a): Non-linear eigenvalue problems and group invariance. In:
Functional Analysis and Related Fields. F. Browder [ed.] (1970), Springer-Verlag,
Berlin, 1-58.
Browder, F. (1970b): Pseudomonotone operators and the direct method of the calculus
of variations. Arch. Rat. Mech. Anal. 38 (268-277).
Browder, F. [ed.] (1976): Mathematical Developments Arising from Hilbert's
Problems. American Mathematical Society, New York.
Browder, F. and Brézis, H. (1980): Strongly nonlinear parabolic variational
inequalities. Proc. Nat. Acad. Sci. USA 77 (713-715).
Bryson, A. and Ho, Y. (1969): Applied Optimal Control. Blaisdell, New York.
Bucy, R. and Joseph, P. (1968): Filtering for Stochastic Processes with Applications to
Guidance. Interscience, New York.
Bullough, R. and Caudrey, P. [eds.] (1980): Solitons. Springer-Verlag, New York.
Burger, E. (1959): Einführung in die Theorie der Spiele. De Gruyter, Berlin.
Buslaev, V. (1964): The asymptotics for short waves in the diffraction problem for
smooth convex contours. Trudy Mat. Inst. Steklova 73 (14-117) (Russian).
Butkovskii, A. (1965): The Theory of the Optimal Control of Systems with Distributed
Parameters. Nauka, Moscow (Russian). (English edition: American Elsevier,
New York, 1969.)
Butkovskii, A., Egorov, A., and Lurje, K. (1968): Optimal control of distributed
systems. SIAM J. Control 6 (437-476).
Butkovskii, A. (1975): Methods of Control of Systems with Distributed Parameters.
Nauka, Moscow (Russian).
Calogero, F. and Degasperis, A. (1982): Spectral Transform and Solitons.
North-Holland, New York.
Carathéodory, C. (1935): Variationsrechnung und partielle Differentialgleichungen
erster Ordnung. Teubner, Leipzig, 1935, 1956.
Caristi, J. (1976): Fixed point theorems for mappings satisfying inwardness conditions.
Trans. Amer. Math. Soc. 215 (241-251).
Castaing, C. and Valadier, M. (1977): Convex Analysis and Measurable
Multifunctions. Lecture Notes in Math., Vol. 580. Springer-Verlag, Berlin.
Casti, J. (1980): The quadratic control problem. SIAM Rev. 22 (442-458).
Cauchy, A. (1847): Méthode générale pour la résolution des systèmes d'équations
simultanées. C. R. Acad. Sci. Paris 25 (536-538).
Céa, J. (1971): Optimisation: théorie et algorithmes. Dunod, Paris.
Céa, J. [ed.] (1976): Optimization Techniques: Modelling and Optimization in the
Service of Man. Lecture Notes in Computer Science, Vol. 40/41.
Springer-Verlag, Berlin.
Cernousko, F. and Kolmanovskii, V. (1977): Numerical methods of optimal control.
Itogi nauki i tehniki, Mat. analiz 14 (101-166) (Russian).
Cesari, L. (1966): Existence theorems for weak and usual optimal solutions in
Lagrange problems with unilateral constraints. Trans. Amer. Math. Soc. 124
(369-412).
Cesari, L. (1975): Geometric and analytic views in existence theorems for optimal
control. J. Optim. Theory Appl. 15 (467-497).
Cesari, L. (1983): Optimization —Theory and Applications. Problems with Ordinary
Differential Equations. Springer-Verlag, New York.
Chebyshev, P. (see Tchebycheff, P.)
Chen, C. (1970): Introduction to Linear System Theory. Holt, Rinehart and Winston,
New York.
Cheney, E. (1966): Introduction to Approximation Theory. McGraw-Hill, New York.
Chernoff, P. and Marsden, J. (1974): Properties of Infinite-Dimensional Hamiltonian
Systems. Lecture Notes in Math., Vol. 425. Springer-Verlag, Berlin.
Cheung, T. (1978): Recent developments in the numerical solution of partial differential
equations by linear programming. SIAM Rev. 20 (139-167).
Choquet-Bruhat, Y., Dewitt-Morette, C., and Dillard-Bleick, M. (1982): Analysis,
Manifolds and Physics. North-Holland, Amsterdam.
Chow, S. and Hale, J. (1982): Methods of Bifurcation Theory. Springer-Verlag, New
York.
Ciarlet, P. (1977): Numerical Analysis of the Finite Element Method for Elliptic
Boundary Value Problems. North-Holland, Amsterdam.
Cioranescu, I. (1974): Aplicatii de dualitate in analiza functionala neliniara. Ed.
Acad., Bucuresti.
Clark, D. (1972): A variant of the Ljusternik-Schnirelman theory. Indiana Univ.
Math. J. 22 (65-74).
Clarke, F. (1976): Necessary conditions for a general control problem. In: Calculus of
Variations and Control Theory. D. Russell [ed.], Academic, New York, 257-278.
Clarke, F. (1976a): The maximum principle under minimal hypotheses. SIAM J.
Control Optim. 14 (1078-1091).
Clarke, F. (1976b): A new approach to Lagrange multipliers. Math. Oper. Res. 1
(165-174).
Clarke, F. (1981): Generalized gradients of Lipschitz functionals. Advances in Math.
40 (52-67).
Clarke, F. (1984): Nonsmooth analysis and optimization (to appear).
Clarke, F. and Ekeland, I. (1980): Hamiltonian trajectories having prescribed minimal
period. Comm. Pure Appl. Math. 33 (103-116).
Clarke, F. and Ekeland, I. (1982): Nonlinear oscillations and boundary value problems
for Hamiltonian systems. Arch. Rat. Mech. Anal. 78 (315-337).
Coffman, C. (1969): A minimum-maximum principle for a class of non-linear integral
equations. J. d'analyse mathem. 22 (391-418).
Collatz, L. (1963): Eigenwertaufgaben mit technischen Anwendungen. Geest & Portig,
Leipzig.
Collatz, L. (1964): Funktionalanalysis und numerische Mathematik. Springer-Verlag,
Berlin.
Collatz, L. and Wetterling, W. (1966): Optimierungsaufgaben. Springer-Verlag, Berlin
(Second enlarged edition, 1971; English edition: Springer, New York, 1975).
Collatz, L. and Albrecht, J. (1972): Aufgaben aus der angewandten Mathematik. Vols.
1, 2. Akademie-Verlag, Berlin.
Collatz, L. and Krabs, W. (1973): Approximationstheorie. Teubner, Stuttgart.
Collatz, L., Günther, H., and Sprekels, J. (1976): Vergleich zwischen
Diskretisierungsverfahren und parametrischen Methoden an einfachen Testbeispielen.
ZAMM 56 (1-11).
Collatz, L. [ed.] (1979): Numerical Methods of Approximation Theory. ISNM, Vol. 52.
Birkhäuser, Basel.
Combet, E. (1975): Équations aux dérivées partielles. Univ. Claude-Bernard, Lyon,
1975-1976 (Lecture Notes).
Combet, E. (1982): Intégrales exponentielles. Lecture Notes in Math., Vol. 937.
Springer-Verlag, Berlin.
Conley, C. (1978): Isolated Invariant Sets and the Morse Index. Regional Conference
Series in Math., Vol. 38. American Mathematical Society, Providence, RI.
Control in Space (1970): Proc. 3rd Internat. Symposium IFAC, Vols. 1, 2. Toulouse,
France.
Control Theory and Topics in Functional Analysis (1976): International seminar
course, Trieste, 1974. International Atomic Energy Agency, Vienna, 1976.
Cornfeld, I., Fomin, S., and Sinai, Yu. (1982): Ergodic Theory. Springer-Verlag, New
York.
Courant, R. (1943): Variational methods for the solution of problems of equilibrium
and vibrations. Bull. Amer. Math. Soc. 49 (1-23).
Courant, R. (1950): Dirichlet's Principle, Conformal Mapping and Minimal Surfaces.
Interscience, New York.
Courant, R. and Hilbert, D. (1953): Methods of Mathematical Physics, Vols. 1, 2.
Interscience, New York, 1953-1962.
Crandall, M. and Pazy, A. (1969): Semi-groups of nonlinear contractions and
dissipative sets. J. Funct. Anal. 3 (376-418).
Crandall, M. and Evans, L. (1975): On the relation of the operator d/ds + d/dt to
evolution governed by accretive operators. Israel J. Math. 21 (261-278).
Crandall, M. (1976): Evolutionary equations. In: Dynamical Systems, Vol. 1,
Academic, New York, 131-165.
Crandall, M. and Lions, P. (1983): Viscosity solutions of Hamilton-Jacobi equations.
Transact. Amer. Math. Soc. 277 (1-42).
Csáki, F. (1972): Modern Control Theories. Akad. Kiadó, Budapest.
Curtain, R. and Pritchard, A. (1978): Infinite Dimensional Linear Systems Theory.
Lecture Notes in Control and Information Sciences, Vol. 8. Springer-Verlag,
Berlin.
Dacorogna, B. (1982): Weak Continuity and Weak Lower Semi-continuity of
Nonlinear Functionals. Lecture Notes in Math., Vol. 922. Springer-Verlag, Berlin.
Dancer, N. (1976): A note on a paper of Fucik and Necas. Math. Nachr. 73
(151-153).
Dantzig, G. (1949): Programming of interdependent activities. Econometrica 17
(200-211). (Compare, also, the works of Dantzig in: Koopmans, T. [ed.]
(1951).)
Dantzig, G. (1963): Linear Programming and Extensions. Princeton University Press,
Princeton, NJ.
De Giorgi, E. (see Giorgi, E. De).
Demjanov, V. and Malozemov, V. (1975): Einführung in die Minimaxprobleme.
Geest & Portig, Leipzig.
Demjanov, V. and Vasiljev, L. (1981): Nondifferentiable Optimization. Nauka,
Moscow (Russian).
Deuflhard, P. and Hairer, E. [eds.] (1983): Workshop on Numerical Treatment of
Inverse Problems in Differential and Integral Equations (to appear).
Diestel, J. (1974): Geometry of Banach Spaces. Lecture Notes in Mathematics, Vol.
485. Springer-Verlag, Berlin.
Dieudonné, J. (1975): Grundzüge der modernen Analysis, Vols. 1-9. VEB Dt. Verlag
der Wiss., Berlin 1975 ff. (English edition: Foundations of Modern Analysis.
Academic, New York, 1960 ff. French edition: Gauthier-Villars, Paris, 1968 ff.
Russian edition: Mir, Moscow, 1964 ff.)
Dieudonne, J. (1981): History of Functional Analysis. North-Holland, Amsterdam.
Dixon, L., Spedicato, E., and Szego, G. (1980): Nonlinear Optimization. Theory and
Algorithms. Birkhauser, Boston.
Dold, A. (1972): Lectures on Algebraic Topology. Springer-Verlag, Berlin.
Doob, J. (1953): Stochastic Processes. Wiley, New York, 1953, 1967.
Dreszer, J. (1975): Mathematik-Handbuch fur Technik und Naturwissenschaft. Fach-
buchverlag, Leipzig.
Dubin, D. (1974): Solvable Models in Algebraic Statistical Mechanics. Clarendon,
Oxford.
Dubovickii, A. and Miljutin, A. (1965): Extremal problems with side conditions. Z.
Vycisl. Mat. i Mat. Fiz. 5 (395-453) (Russian). (English edition: USSR
Comput. Math. Math. Phys. 5 (1965), pp. 1-80.)
Dubovickii, A. and Miljutin, A. (1971): Necessary Conditions for a Weak Extremum
for the General Problem of Optimal Control. Nauka, Moscow (Russian).
Duistermaat, J. and Hormander, L. (1972): Fourier integral operators II. Acta Math.
128 (183-269).
Duistermaat, J. (1974): Oscillatory integrals, Lagrange immersions and unfolding of
singularities. Comm. Pure Appl. Math. 27 (207-281).
Dunford, N. and Schwartz, J. (1958): Linear Operators, Vols. 1-3. Interscience, New
York, 1958-1971. (Russian edition: IL, Moscow, 1964.)
Duvaut, G. and Lions, J. (1972): Les inequations en mecanique et en physique.
Dunod, Paris.
Dyer, P. and McReynolds, S. (1970): The Computation and Theory of Optimal
Control. Academic, New York.
Dzjadyk, V. (1977): Introduction to the Theory of Uniform Approximation of
Functions by Polynomials. Nauka, Moscow (Russian).
Eckmann, J. and Seneor, R. (1976): The Maslov-WKB method for the (an)-harmonic
oscillator. Arch. Rat. Mech. Anal. 61 (153-173).
Edwards, R. (1965): Functional Analysis. Holt, Rinehart and Winston, New York.
(Russian edition: Mir, Moscow, 1969.)
Egorov, A. (1966): Necessary optimality conditions for systems with distributed
parameters. Mat. Sb. 69 (371-421) (Russian).
Egorov, A. (1978): Optimization of Heating and Diffusion Processes. Nauka, Moscow
(Russian).
Eguchi, T., Gilkey, P., and Hanson, A. (1980): Gravitation, Gauge Theories and
Differential Geometry. Physics Reports 66 (213-393).
Ekeland, I. (1974): On the variational principle. J. Math. Anal. Appl. 47 (324-353).
Ekeland, I. (1979): Nonconvex minimization problems. Bull. Amer. Math. Soc. (N.S.)
1 (443-474).
Ekeland, I. and Temam, R. (1974): Analyse convexe et problemes variationnels.
Dunod, Paris. (English edition: North-Holland, Amsterdam, 1976.)
Elster, K. et al. (1977): Einfuhrung in die nichtlineare Optimierung. Teubner, Leipzig.
Encyclopedia of Mathematics and Its Applications (1976): Edited by G.-C. Rota. Vols.
1 ff. Addison-Wesley, Reading, MA, 1976 ff.
Engels, H. (1980): Numerical Quadrature and Cubature. Academic, New York.
Euler, L. (1911): Opera omnia (Collected papers). Leipzig-Berlin, later Basel-Zurich,
Vols. 1-72. (There will also appear 15 volumes containing letters.)
Eveleigh, V. (1972): Introduction to Control Systems Design. McGraw-Hill, New
York.
Faddeev, L. and Slavnov, A. (1980): Gauge Fields. Addison-Wesley, Reading, MA.
Fadell, E. and Rabinowitz, P. (1977): Bifurcation for odd potential operators and an
alternative topological index. J. Funct. Anal. 26 (48-67).
Fadell, E. and Rabinowitz, P. (1978): Generalized cohomological index theories for
Lie group actions with an application to bifurcation questions for Hamiltonian
systems. Invent. Math. 45 (139-174).
Fan, Ky (see Ky Fan).
Farkas, J. (1902): Uber die Theorie der einfachen Ungleichungen. J. Reine Angew.
Math. 124 (1-24).
Faurre, P. (1971): Navigation inertielle optimale et filtrage statistique. Bordas, Paris.
Fedorenko, R. (1978): Approximative Solution of Optimal Control Problems. Nauka,
Moscow (Russian).
Fefferman, C. (1983): The uncertainty principle. Bull. Amer. Math. Soc. (N.S.) 9
(122-206).
Feinstein, A. (1958): Foundations of Information Theory. McGraw-Hill, New York.
Feller, W. (1968): Modern Probability Theory, Vols. 1, 2. Wiley, New York.
Fenchel, W. (1949): On conjugate convex functions. Canad. J. Math. 1 (73-77).
Fenchel, W. (1951): Convex Cones, Sets and Functions. Princeton University Press,
Princeton (Lecture Notes).
Fichera, G. (1964): Problemi elastostatici con vincoli unilaterali: il problema di
Signorini con ambigue condizioni al contorno. Atti. Accad. Naz. Lincei Mem. Cl.
Sci. Fis. Mat. Natur. Sez. 1(8) 7 (91-140).
Fichera, G. (1973): Boundary value problems of elasticity with unilateral constraints.
In: Encyclopedia of Physics, Vol. VIa/2. S. Flugge [ed.] Springer-Verlag, Berlin.
Fichtenholz, G. (1972): Differential- und Integralrechnung, Vols. 1-3. VEB Dt. Verl.
d. Wiss., Berlin.
Finn, R. (1963): New estimates for equations of minimal surface type. Arch. Rat.
Mech. Anal. 14 (337-375).
Finn, R. (1984): Equilibrium Capillary Surfaces. Springer-Verlag, New York (to
appear).
Flaschel, P. and Klingenberg, W. (1972): Riemannsche Hilbertmannigfaltigkeiten.
Periodische Geodatische. Lecture Notes in Mathematics, Vol. 228. Springer-
Verlag, Berlin.
Fleming, W. and Rishel, R. (1975): Deterministic and Stochastic Optimal Control.
Springer-Verlag, Berlin.
Fletcher, R. (1980): Practical Methods of Optimization. Vols. 1, 2. Wiley, Chichester.
Focke, J. (1969): Symmetrische n-Orbiformen kleinsten Inhalts. Acta Math. Hung. 20
(39-68).
Focke, J. and Klotzler, R. (1978): Zur Grundkonzeption der dynamischen
Optimierung. Wiss. Z. Karl-Marx-Univ. Leipzig, Math.-nat. Reihe 27 (447-462).
Focke, J. (1984): Maximum-Likelihood-Schatzungen bei semidefiniten Faktormodel-
len. Mathem. Operationsforschung und Statistik, Ser. Statistics (to appear).
Fomenko, A. (1982): Variational Methods in Topology. Nauka, Moscow (Russian).
Foulds, L. (1981): Optimization Techniques. Springer-Verlag, New York.
Frank, P. and Mises, R. von (1961): Die Differential- und Integralgleichungen der
Mechanik und Physik. Dover, New York; Vieweg, Braunschweig.
Frank, W. (1969): Mathematische Grundlagen der Optimierung. Oldenbourg,
Munchen.
Franklin, J. (1980): Methods of Mathematical Economics. Springer-Verlag, New
York.
Frehse, J. (1982): Capacity methods in the theory of partial differential equations.
Jahresbericht der Deutschen Mathematikervereinigung 84 (1-44).
Friedlander, F. (1976): The Wave Equation on a Curved Space-Time. Cambridge
University Press, Cambridge, England.
Friedman, A. (1971): Differential Games. Wiley, New York.
Friedman, A. (1974): Differential Games. American Mathematical Society,
Providence, RI.
Friedman, A. (1975): Stochastic Differential Equations and Applications, Vols. 1, 2.
Academic, New York, 1975-1976.
Friedman, A. (1979): Optimal stopping problems in stochastic control. SIAM Rev. 21
(71-80).
Friedman, A. (1982): Variational Principles and Free Boundary Value Problems.
Wiley, New York.
Friedrichs, K. (1929): Ein Verfahren der Variationsrechnung, das Minimum eines
Integrals als das Maximum eines anderen Ausdrucks darzustellen. Nachr. Ges.
Wiss. Gottingen, Math.-phys. Kl. 13-27.
Fuchsteiner, B. and Lusky, W. (1981): Convex Cones. North-Holland, Amsterdam.
Fucik, S., Necas, J., Soucek, J., and Soucek, V. (1973): Spectral Analysis of Nonlinear
Operators. Lecture Notes in Mathematics, Vol. 346. Springer-Verlag, Berlin.
Fucik, S., Necas, J., and Soucek, V. (1977): Einfuhrung in die Variationsrechnung.
Teubner, Leipzig.
Fucik, S. and Kufner, A. [eds.] (1979): Nonlinear Analysis, Function Spaces and
Applications. Teubner, Leipzig.
Fucik, S. and Kufner, A. (1980): Nonlinear Differential Equations. Elsevier, New
York; SNTL, Prague.
Funk, P. (1962): Variationsrechnung und ihre Anwendung in Physik und Technik.
Springer-Verlag, Berlin.
Gabasov, R. and Kirillova, F. (1976): Methods of optimal control. Itogi nauki i
tehniki, Sovremennye problemy matematiki 6 (133-206) (Russian).
Gajewski, H. (1970): Uber einige Fehlerabschatzungen bei Gleichungen mit monotonen
Potentialoperatoren in Banach-Raumen. Monatsber. Dt. Akad. d. Wiss. Berlin
12 (571-579).
Gajewski, H., Groger, K., and Zacharias, K. (1974): Nichtlineare Operatorgleichungen
und Operatordifferentialgleichungen. Akademie-Verlag, Berlin.
Galerkin, B. (1915): Rods and plates. Vestnik Inzenerov 19 (Russian).
Gamkrelidze, R. (1958): Theory of time-optimal processes for linear systems. Izv.
Akad. Nauk SSSR, ser. mat. 22 (449-474) (Russian).
Gamkrelidze, R. (1978): Principles of Optimal Control Theory. Plenum, New York.
Garding, L., Kotake, T., and Leray, J. (1964): Uniformisation et developpement
asymptotique de la solution du probleme de Cauchy lineaire. Bull. Soc. Math.
France 92 (263-361).
Garding, L. (1981): Microlocal Analysis of Distributions. Jahresbericht der Deutschen
Mathematikervereinigung 83 (32-44).
Garabedian, P. (1964): Partial Differential Equations. Wiley, New York.
Gelfand, I. and Fomin, S. (1961): Calculus of Variations. Nauka, Moscow (Russian).
(English edition: Prentice-Hall, Englewood Cliffs, NJ, 1965.)
Gelfand, I. and Vilenkin, N. (1964): Generalized Functions, Vol. 4. Academic, New
York.
Gelfand, I. and Dikii, L. (1975): The asymptotics of the resolvent of the Sturm-
Liouville equation and the algebra of the Korteweg-de Vries equation. Uspehi
Mat. Nauk 30, 5 (67-100) (Russian).
Giaquinta, M. (1981): Multiple Integrals in the Calculus of Variations and Nonlinear
Elliptic Systems. University of Bonn, Lecture Notes No. 443, Sonder-
forschungsbereich 72, Bonn, Germany.
Gihman, I. and Skorohod, A. (1969): Introduction to the Theory of Random Processes.
Saunders, Philadelphia.
Gihman, I. and Skorohod, A. (1971): Theory of Stochastic Processes, Vols. 1-3.
Nauka, Moscow, 1971-1975. (Russian). (English edition: Springer-Verlag,
Berlin, 1975.)
Gihman, I. and Skorohod, A. (1972): Stochastic Differential Equations. Springer-
Verlag, Berlin.
Gihman, I. and Skorohod, A. (1977): The Control of Random Processes. Naukova
Dumka, Kiev (Russian).
Gilbarg, D. and Trudinger, N. (1977): Elliptic Partial Differential Equations of
Second Order. Springer-Verlag, Berlin (second enlarged edition, 1984).
Gilkey, P. (1974): The Index Theorem and the Heat Equation. Publish or Perish,
Boston.
Gilmore, R. (1981): Catastrophe Theory for Scientists. Wiley, New York.
Giorgi, E. De, Magenes, E., and Mosco, U. [eds.] (1979): Proc. of the Internat.
Meeting on Recent Methods in Nonlinear Analysis. Pitagora, Bologna.
Girlich, H. (1973): Stochastische Entscheidungsprozesse. Teubner, Leipzig.
Girsanov, N. (1972): Lectures on the Mathematical Theory of Extremum Problems.
Lecture Notes in Economics, Vol. 67. Springer-Verlag, Berlin.
Givens, C. and Millman, R. (1982): Review of Hermann, R. (1979). Bull. Amer.
Math. Soc. (N.S.) 6 (467-477).
Glashoff, K. and Weck, N. (1976): Boundary control of parabolic differential
equations. SIAM J. Control Optim. 14 (662-681).
Glashoff, K. and Sachs, E. (1977): On theoretical and numerical aspects of the
bang-bang principle. Num. Math. 29 (93-113).
Glashoff, K. and Gustafson, S. (1978): Einfuhrung in die lineare Optimierung. Wiss.
Buchges., Darmstadt.
Glimm, J. and Jaffe, A. (1981): Quantum Physics. Springer-Verlag, New York.
Glowinski, R. (1980): Lectures on Numerical Methods for Nonlinear Variational
Problems. Tata Institute, Bombay.
Glowinski, R. and Lions, J. [eds.] (1974): Computing Methods in Applied Sciences
and Engineering. Lecture Notes in Computer Science, Vols. 10, 11. Springer-
Verlag, Berlin.
Glowinski, R. and Lions, J. [eds.] (1980): Computing Methods in Applied Sciences
and Engineering. North-Holland, Amsterdam.
Glowinski, R., Lions, J., and Tremolieres, R. (1976): Analyse numerique des
inequations variationnelles, Vols. 1, 2. Gauthier-Villars, Paris.
Gnedenko, B. (1962): Lehrbuch der Wahrscheinlichkeitsrechnung. Akademie-Verlag,
Berlin.
Gnedenko, B. and Konig, D. (1983): Handbuch der Bedienungstheorie. Akademie-
Verlag, Berlin.
Goebel, M. and Wolfersdorf, L. von (1978): Optimale Steuerprobleme bei
Noetherschen Operatorgleichungen. III. Math. Nachr. 82 (77-85).
Goldstine, H. (1980): A History of the Calculus of Variations. From the 17th Century
through the 19th Century. Springer-Verlag, New York.
Golstein, E. (1975): Dualitatstheorie in der nichtlinearen Optimierung und ihre
Anwendung. Akademie-Verlag, Berlin.
Golubitsky, M. and Guillemin, V. (1973): Stable Mappings and Their Singularities.
Springer-Verlag, Berlin.
Golubitsky, M. (1978): An introduction to catastrophe theory and its applications.
SIAM Rev. 20 (352-387).
Golubitsky, M. and Schaeffer, D. (1979): A theory for imperfect bifurcation via
singularity theory. Comm. Pure Appl. Math. 32 (21-98).
Gopfert, A. (1973): Mathematische Optimierung in allgemeinen Vektorraumen.
Teubner, Leipzig.
Gossez, J. (1974): Nonlinear elliptic boundary value problems for equations with
rapidly or slowly increasing coefficients. Trans. Amer. Math. Soc. 190 (163-205).
Gossez, J. (1979): Orlicz-Sobolev spaces and nonlinear elliptic boundary value
problems. In: Fucik, S. and Kufner, A. [eds.] (1979), 59-94.
Groger, K. (1979): Initial value problems for elasto-viscoplastic systems. In: Fucik, S.
and Kufner, A. [eds.] (1979), 95-127.
Gromoll, D., Klingenberg, W., and Meyer, W. (1968): Riemannsche Geometrie im
Grossen. Lecture Notes in Mathematics, Vol. 55. Springer-Verlag, Berlin.
Grossmann, C. and Kleinmichel, H. (1976): Verfahren der nichtlinearen Optimierung.
Teubner, Leipzig.
Grossmann, C. and Kaplan, A. (1979): Strafmethoden und modifizierte Lagrange-
funktionen in der nichtlinearen Optimierung. Teubner, Leipzig.
Grossmann, W. (1969): Grundzuge der Ausgleichsrechnung. Springer-Verlag, Berlin.
Grundmann, A. (1974): Der topologische Abbildungsgrad homogener Polynomoper-
atoren. Dissertation, Stuttgart.
Guillemin, V. and Pollack, A. (1974): Differential Topology. Prentice-Hall, En-
glewood Cliffs, NJ.
Guillemin, V. and Sternberg, S. (1977): Geometric Asymptotics. Mathematical
Surveys, Vol. 14. American Mathematical Society, Providence, RI.
Gunther, P. (1965): Beispiel einer nichttrivialen Huygensschen Differentialgleichung
mit vier unabhangigen Variablen. Arch. Rat. Mech. Anal. 18 (103-106).
Gunther, P., Beyer, K., Gottwald, S., and Wunsch, V. (1972): Grundkurs Analysis,
Vols. 1-4. Teubner, Leipzig, 1972-1974.
Gunther, P. and Wunsch, V. (1976): Maxwellsche Gleichungen und Huygenssches
Prinzip. I, II. Math. Nachr. 63 (1974), 97-121; 73 (1976), 37-58.
Gupta, C. (1970): On the existence of solutions of non-linear integral equations of
Hammerstein type in a Banach space. J. Math. Anal. Appl. 32 (617-620).
Guttinger, W. and Eikemeier, H. [eds.] (1979): Structural Stability in Physics.
Springer-Verlag, New York.
Gwinner, J. (1981): On fixed points and variational inequalities: A circular tour.
Nonlinear Analysis 5 (565-583).
Haar, A. (1927): Uber das Plateausche Problem. Math. Ann. 97 (124-158).
Hadamard, J. (1902): Sur les problemes aux derivees partielles et leur signification
physique. Bull. Univ. Princeton, 49-52.
Hadamard, J. (1932): Lectures on Cauchy's Problem. Yale University Press, New
Haven, CT, 1923. (French edition: Le probleme de Cauchy et les equations aux
derivees partielles lineaires hyperboliques. Hermann, Paris, 1932.)
Hadley, G. (1963): Nonlinear and Dynamic Programming. Addison-Wesley, Reading,
MA.
Hahn, H. (1926): Uber lineare Gleichungssysteme in linearen Raumen. J. Reine
Angew. Math. 157 (214-229).
Hale, J. (1976): Lectures on generic bifurcation. In: Symposium on Nonlinear Analysis
and Mechanics. R. Knops [ed.], Pitman, New York, 1976.
Halkin, H. (1970): A satisfactory treatment of equality and operator constraints in the
Dubovickii-Miljutin optimization formalism. J. Optim. Theory Appl. 6
(138-149).
Hammerstein, A. (1930): Nichtlineare Integralgleichungen nebst Anwendungen. Acta
Math. 54 (117-176).
Handbook of Applicable Mathematics (1980): Edited by W. Ledermann. Vols. 1-6.
Wiley, Chichester, 1980 ff.
Hannan, E. (1960): Time Series Analysis. Methuen, London.
Hannan, E. (1970): Multiple Time Series. Wiley, New York.
Hartman, P. and Stampacchia, G. (1966): On some non-linear elliptic differential
functional equations. Acta Math. 115 (271-310).
Hawking, S. and Ellis, G. (1973): The Large Scale Structure of Space Time.
Cambridge University Press, Cambridge, England.
Held, A. [ed.] (1980): General Relativity and Gravitation, Vols. 1, 2. Plenum, New
York.
Hermann, R. (1979): Cartanian geometry, nonlinear waves, and control theory.
Interdisciplinary Mathematics authored by R. Hermann, Vols. 20, 21.
Mathematical Science Press, Brookline, MA.
Hermes, H. and Lasalle, J. (1969): Functional Analysis and Time Optimal Control.
Academic, New York.
Hess, P. (1971): A variational approach to a class of nonlinear eigenvalue problems.
Proc. Amer. Math. Soc. 29 (272-276).
Hess, P. (1974): On semi-coercive nonlinear problems. Indiana Univ. Math. J. 23
(646-654).
Hess, P. (1980): On nontrivial solutions of a nonlinear elliptic boundary value problem.
Conferenze del Seminario di Matematica dell' Universita di Bari, Vol. 173.
Laterza & Figli, Bari.
Hestenes, M. (1950): A general problem in the calculus of variations with applications
to paths of least time. The Rand Corporation, Santa Monica, CA.
Hestenes, M. (1966): Calculus of Variations and Optimal Control Theory. Wiley, New
York.
Hilbert, D. (1904): Uber das Dirichletsche Prinzip. Math. Ann. 59 (161-186).
Hilbert, D. (1932): Gesammelte Abhandlungen, Vols. 1-3. Springer-Verlag, Berlin,
1932-1935.
Hildebrandt, S. and Nitsche, J. (1979): Minimal surfaces with free boundaries. Acta
Math. 143 (251-272).
Hildebrandt, S. (1980): Optimal boundary regularity for minimal surfaces with a free
boundary. Manuscripta Math. 33 (357-364).
Hildebrandt, S. (1983): Partielle Differentialgleichungen und Differentialgeometrie.
Jahresbericht der Deutschen Mathematikervereinigung 85 (129-145).
Hilton, P. [ed.] (1974): Structural Stability, the Theory of Catastrophes, and
Applications in the Sciences. Lecture Notes in Math., Vol. 525. Springer-Verlag, Berlin.
Hilton, P. and Young, G. [eds.] (1980): New Directions in Applied Mathematics.
Springer-Verlag, New York.
Hirsch, M. (1976): Differential Topology. Springer-Verlag, New York.
Hlavacek, I. (1979): Some variational methods for nonlinear mechanics. In: Fucik, S.
and Kufner, A. [eds.] (1979), 128-148.
Hlavacek, I. and Necas, J. (1981): Mathematical Theory of Elastic and Elasto-plastic
Bodies. Elsevier, Amsterdam.
Hofmann, G. (1981): On the existence of quantum fields in space time dimension 4.
Rep. Math. Phys. 18, 2 (129-141).
Holmes, R. (1972): A Course on Optimization and Best Approximation. Lecture
Notes in Mathematics, Vol. 257. Springer-Verlag, Berlin.
Holmes, R. (1975): Geometrical Functional Analysis. Springer-Verlag, Berlin.
Holtzman, J. (1970): Nonlinear System Theory: A Functional Analysis Approach.
Prentice-Hall, Englewood Cliffs, NJ.
Hormander, L. (1971): Fourier integral operators. I, II. Acta Math. 127 (79-183);
128 (183-269).
Hormander, L. (1973): An Introduction to Complex Analysis. North-Holland,
Amsterdam.
Hormander, L. (1983): The Analysis of Linear Partial Differential Operators. Vols.
1-3. Springer-Verlag, New York.
Ibragimov, N. (1976): Huygens' principle. Amer. Math. Soc. Transl. (2) 104
(141-152).
Ibragimov, N. and Ovsjannikov, L. [eds.] (1978): Group Theoretical Methods in
Mechanics. Proc. of the joint IUTAM/IMU symp.—Novosibirsk: Akad. Nauk
SSSR, Sib. otd. (Russian).
IFIP Conference (1978): Distributed Parameter Systems, Modelling and
Identification. Lecture Notes in Control and Information Science, Vol. 1. Springer-Verlag,
Berlin.
IFIP Conference (1978a): Optimization Techniques. Lecture Notes in Control and
Information Science, Vol. 6/7. Springer-Verlag, Berlin.
IFIP Conference (1979): Optimization Techniques. Lecture Notes in Control and
Information Science, Vol. 22/23. Springer-Verlag, Berlin.
Ikeda, I. and Watanabe, S. (1981): Stochastic Differential Equations and Diffusion
Processes. North-Holland, New York.
Ioffe, A. and Tihomirov, V. (1974): Theory of Extremal Problems. Nauka, Moscow
(Russian). (German edition: VEB Dt. Verl. d. Wiss., Berlin, 1979. English
edition: North-Holland, New York, 1978.)
Iooss, G. and Joseph, D. (1980): Elementary Stability and Bifurcation Theory.
Springer-Verlag, New York.
Isaacson, E. and Keller, H. (1966): Analysis of Numerical Methods. Wiley, New
York. (German edition: Edition Leipzig, 1972.)
Ivanov, B. et al. (1978): Theory of Linear Regularization and Its Applications. Nauka,
Moscow (Russian).
Ize, J. (1976): Bifurcation Theory for Fredholm Operators. Memoirs of the Amer.
Math. Soc., Vol. 174. American Mathematical Society, Providence, RI.
Jacobs, D. [ed.] (1976): The State of the Art in Numerical Analysis. Academic,
London.
Jacobs, O. et al. [eds.] (1980): Analysis and Optimization of Stochastic Systems.
Academic, New York.
Jaffe, A. and Taubes, C. (1980): Vortices and Monopoles. Structure of Static Gauge
Field Theories. Birkhauser, Basel.
Jaglom, A. and Jaglom, I. (1960): Wahrscheinlichkeit und Information. VEB Dt. Verl.
d. Wiss., Berlin.
Jeffrey, A. and Taniuti, T. (1964): Nonlinear Wave Propagation with Applications to
Physics and Magnetohydrodynamics. Academic, New York.
Jenkins, G. and Watts, D. (1968): Spectral Analysis and Its Applications. Holden-Day,
San Francisco.
John, F. (1948): Extremum problems with inequalities as subsidiary conditions. In:
Studies and Essays Presented to R. Courant. Interscience, New York, 187-204.
Juskevic, A. (1971): Leonhard Euler. In: Dictionary of Scientific Biography, Vol. 4,
Scribners, New York, 467-484.
Kahn, D. (1980): Introduction to Global Analysis. Academic, New York.
Kalaba, R. and Spingarn, K. (1982): Control, Identification and Input Optimization.
Plenum, New York.
Kallianpur, G. (1980): Stochastic Filtering Theory. Springer-Verlag, New York.
Kalman, R. and Bucy, R. (1961): New results in linear filtering and prediction theory.
Trans. ASME, Ser. D 83 (95-107).
Kalman, R., Falb, P., and Arbib, M. (1969): Topics in Mathematical System Theory.
McGraw-Hill, New York.
Kantorovic, L. (1939): Mathematical methods in the organization and planning of
production. Leningrad. (English version in: Management Sci. 6 (1960), 366-422).
Kantorovic, L. and Akilov, G. (1964): Funktionalanalysis in normierten Raumen.
Akademie-Verlag, Berlin. (English edition: Pergamon, Oxford, 1964.)
Karamardian, S. (1969): The nonlinear complementarity problem with applications. J.
Optim. Theory Appl. 4 (87-98, 167-181).
Karlin, S. (1959): Mathematical Methods and Theory in Games, Programming and
Economics, Vols. 1, 2. Addison-Wesley, Reading, MA.
Karlin, S. and Studden, W. (1966): Tchebycheff Systems with Applications in Analysis
and Statistics. Interscience, New York.
Karlin, S. (1968): A First Course in Stochastic Processes. Academic, New York.
Karlin, S. (1971): Best quadrature formulas and splines. J. Approx. Theory 4 (59-90).
Karlin, S. and Taylor, M. (1980): A Second Course in Stochastic Processes. Academic,
New York.
Karpman, V. (1977): Nichtlineare Wellen. Akademie-Verlag, Berlin.
Kashiwara, M., Kawai, T., and Sato, M. (1973): Microfunctions and Pseudodifferen-
tial Equations. Lecture Notes in Math., Vol. 287. Springer-Verlag, Berlin.
Kato, T. (1966): Perturbation Theory for Linear Operators. Springer-Verlag, Berlin.
Kiesewetter, H. (1973): Vorlesungen uber lineare Approximation. VEB Dt. Verl. d.
Wiss., Berlin.
Kijowski, J. and Tulczyjew, W. (1979): A Symplectic Framework for Field Theories.
Lecture Notes in Physics, Vol. 107. Springer-Verlag, New York.
Kinderlehrer, D. and Stampacchia, G. (1980): An Introduction to Variational
Inequalities and Their Application. Academic, New York.
Kirchgassner, K. (1971): Multiple eigenvalue bifurcation for holomorphic mappings.
In: Zarantonello, E. [ed.], (1971a), 69-100.
Kirchgassner, K. (1976): Instability phenomena in fluid mechanics. In: SYNSPADE
1975. Hubbard, Z. [ed.], Academic, New York, 1976, 349-371.
Kittel, C. (1973): Physik der Warme. Geest & Portig, Leipzig. (English edition:
Thermal Physics. Wiley, New York, 1969.)
Klee, V. (1969): Separation and support properties of convex sets: a survey. In:
Control Theory and Calculus of Variations. Balakrishnan, A. [ed.], Academic,
New York, 1969, 235-304.
Klingenberg, W. (1978): Lectures on Closed Geodesics. Springer-Verlag, Berlin.
Klotzler, R. (1971): Mehrdimensionale Variationsrechnung. VEB Dt. Verl. d. Wiss.,
Berlin.
Klotzler, R. (1976): On Pontrjagin's maximum principle for multiple integrals.
Beitrage zur Analysis 8 (67-75).
Klotzler, R. (1978): A generalization of the duality in optimal control and some
numerical conclusions. In: IFIP Conference (1978a), Part 1, 313-320.
Klotzler, R. (1979): A priori Abschatzungen von Optimalwerten zu
Steuerungsproblemen. I, II. Math. Operationsforsch. Statist. Ser. Optim. 10 (101-110, 335-344).
Klotzler, R. (1983): Globale Optimierung in der Steuerungstheorie. ZAMM 63
(305-312).
Kluge, R. [ed.] (1978): Theory of Nonlinear Operators. Akademie-Verlag, Berlin.
Kluge, R. (1979): Nichtlineare Variationsungleichungen und Extremalaufgaben. VEB
Dt. Verl. d. Wiss., Berlin.
Kluge, R. (1979a): On some inverse problems in variational and quasivariational
inequalities. In: Anger, G. [ed.] (1979), 141-149.
Knops, R. [ed.] (1976): Symposium on Nonlinear Analysis and Mechanics, Vols. 1-4.
Pitman, London, 1976-1979.
Kobayashi, Y. (1975): Difference approximations of Cauchy problems for quasi-dis-
sipative operators and generation of nonlinear semigroups. J. Math. Soc. Japan 27
(640-665).
Kolmogorov, A. (1941): Interpolation and extrapolation of stationary random
sequences. Bull. Acad. Sci. USSR, Ser. math. 5 (3-14) (Russian).
Konig, H. and Wolters, J. (1972): Einfuhrung in die Spektralanalyse okonomischer
Zeitreihen. Meisenheim am Glan: A. Hain.
Konig, H. (1982): On basic concepts in convex analysis. In: Korte [ed.] (1982),
107-144.
Koopmans, T. [ed.] (1951): Activity Analysis of Production and Allocation. Wiley,
New York.
Korte, B. [ed.] (1982): Modern Applied Mathematics: Optimization and Operations
Research. North-Holland, Amsterdam.
Kothe, G. (1960): Topologische lineare Raume, Vols. 1, 2. Springer-Verlag, Berlin,
1960-1979. (English edition: Topological Vector Spaces, Vols. 1, 2. Springer-
Verlag, New York, 1969-1979.)
Krabs, W. (1975): Optimierung und Approximation. Teubner, Stuttgart. (English
edition: Optimization and Approximation. Wiley, New York, 1979.)
Krasnoselskii, M. (1956): Topological Methods in the Theory of Nonlinear Integral
Equations. Gostehizdat, Moscow (Russian). (English edition: Pergamon,
Oxford, New York, 1964.)
Krasnoselskii, M. and Rutickii, J. (1958): Convex Functions and Orlicz Spaces.
Fizmatgiz, Moscow (Russian). (English edition: Noordhoff, Groningen, 1961.)
Krasnoselskii, M. and Rutickii, J. (1958a): Orlicz spaces and nonlinear integral
equations. Trudy Mosk. Mat. Obsc. 7 (63-120) (Russian).
Krasnoselskii, M., Vainikko, G., Zabreiko, P., Rutickii, Ja., and Stecenko, V. (1973):
Naherungsverfahren zur Losung von Operatorgleichungen. Akademie-Verlag,
Berlin.
Krasnoselskii, M. and Zabreiko, P. (1975): Geometric Methods of Nonlinear Analysis.
Nauka, Moscow (Russian). (English edition: in preparation).
Krasnov, M., Kiselev, G., and Makarenko, I. (1975): Problems and Exercises in the
Calculus of Variations. Mir, Moscow.
Krasovskii, N. (1968): Theory of Control of the Motion of Linear Systems. Nauka,
Moscow (Russian).
Krauss, E. (1984): A representation of maximal monotone operators by saddle
functions. Revue Roumaine de Math. Pures et Appl. (to appear).
Krein, M. (1938): On positive functionals in linear normed spaces. In: Achieser, N.
and Krein, M., On Some Problems of the Theory of Moments. GONTI,
Charkov (Russian).
Kreko, B. (1974): Optimierung—nichtlineare Modelle. VEB Dt. Verl. d. Wiss.,
Berlin.
Kreyszig, E. (1957): Differentialgeometrie. Geest & Portig, Leipzig.
Kroschel, K. (1973): Statistische Nachrichtentheorie, Vols. 1, 2. Springer-Verlag,
New York, 1973-1974.
Krylov, V. (1967): Approximate Computation of Integrals. Nauka, Moscow (Russian).
Kubrusly, C. (1977): Distributed parameter system identification: a survey. Int. J.
Control 26 (509-535).
Kufner, A., John, O., and Fucik, S. (1977): Function Spaces. Academia, Prague and
Noordhoff, Leyden.
Kuhn, H. and Tucker, A. (1951): Nonlinear programming. In: Proc. Second Berkeley
Symp. on Math. Statistics and Probability. University of Calif. Press, Berkeley,
481-492.
Kupradze, V. (1956): Randwertaufgaben der Schwingungstheorie und Integral-
gleichungen. VEB Dt. Verl. d. Wiss., Berlin.
Ky Fan (1952): Fixed point and minimax theorems in locally convex linear spaces.
Proc. Nat. Acad. Sci. USA 38 (121-126).
Ky Fan (1970): Asymptotic cones and duality of linear relations. In: Inequalities, Vol.
2. O. Shisha [ed.], Academic, London, 179-186.
Ladde, G. and Lakshmikantham, V. (1980): Random Differential Inequalities.
Academic, New York.
Ladyzenskaja, O. and Uralceva, N. (1964): Linear and Quasilinear Equations of
Elliptic Type, 2nd ed., 1973, Nauka, Moscow (Russian). (English edition:
Academic, New York, 1968.)
Ladyzenskaja, O. (1973): Boundary Value Problems of Mathematical Physics. Nauka,
Moscow (Russian).
Lamb, G. (1980): Elements of Solitons. Wiley, New York.
Landau, L. and Lifsic, E. (1962): Course of Theoretical Physics. Pergamon, Oxford.
(German edition: Lehrbuch der theoretischen Physik, Vols. 1-10. Akademie-
Verlag, Berlin, 1962 ff.)
Lang, S. (1972): Differential Manifolds. Addison-Wesley, Reading, MA.
Langenbach, A. (1976): Monotone Potentialoperatoren in Theorie und Anwendung.
VEB Dt. Verl. d. Wiss., Berlin.
Lattes, R. and Lions, J. (1969): The Method of Quasireversibility. Gordon and
Breach, New York.
Laurent, P. (1972): Approximation et optimisation. Hermann, Paris.
Lavrentjev, M., Romanov, V., and Vasiljev, Z. (1969): Multidimensional Inverse
Problems for Differential Equations. Nauka, Novosibirsk (Russian). [See, also:
Lecture Notes in Mathematics, Vol. 167. Springer-Verlag, Berlin, 1970.]
Lax, P. and Wendroff, B. (1960): Systems of conservation laws. Comm. Pure Appl.
Math. 13 (217-238).
Lax, P. and Wendroff, B. (1964): Difference schemes for hyperbolic equations with
high order of accuracy. Comm. Pure Appl. Math. 17 (381-398).
Lax, P. and Phillips, R. (1967): Scattering Theory. Academic, New York.
Lax, P. (1968): Integrals of nonlinear equations of evolution and solitary waves. Comm.
Pure Appl. Math. 21 (467-490).
Lee, E. and Markus, L. (1967): Foundations of Optimal Control Theory. Wiley, New
York.
Leichtweiss, K. (1980): Konvexe Mengen. VEB Dt. Verl. d. Wiss., Berlin.
Leitmann, G. (1981): The Calculus of Variations and Optimal Control. Plenum, New
York.
Leray, J. (1952): Lectures on Hyperbolic Equations with Variable Coefficients.
Institute for Advanced Study, Princeton, NJ.
Leray, J. (1978): Analyse Lagrangienne et mecanique quantique. Strasbourg, France.
Leray, J. (1981): The meaning of Maslov's asymptotic method: the need of Planck's
constant in mathematics. Bull. Amer. Math. Soc. (N.S.) 5 (15-27).
Levin, M. and Girsovic, J. (1979): Optimal Quadrature Formulas. Teubner, Leipzig.
Levinson, N. (1966): Minimax, Ljapunov, and bang-bang. J. Diff. Equations 2
(218-241).
Lichnerowicz, A. (1967): Relativistic Hydrodynamics and Magnetohydrodynamics.
Benjamin, New York.
Linnik, J. (1961): Die Methode der kleinsten Quadrate. VEB Dt. Verl. d. Wiss.,
Berlin.
Lions, J. and Magenes, E. (1968): Problemes aux limites non homogenes et
applications, Vols. 1-3. Dunod, Paris, 1968-1970. (English edition: Springer-Verlag,
Berlin, 1972-1973.)
Lions, J. (1969): Quelques methodes de resolution des problemes aux limites non
lineaires. Dunod, Paris; Gauthier-Villars, Paris.
Lions, J. (1971): Optimal Control of Systems Governed by Partial Differential
Equations. Springer-Verlag, Berlin.
Lions, J. (1973): Perturbations singulieres dans les problemes aux limites et en controle
optimale. Lecture Notes in Math., Vol. 323, Springer-Verlag, Berlin.
Lions, J. and Marcuk, G. (1975): Methods of Numerical Mathematics. Nauka,
Novosibirsk (Russian).
Lions, J. (1976): Various topics in the theory of optimal control of distributed systems.
In: Optimal Control Theory and Its Applications, Vol. 1. B. Kirby [ed.]. Lecture
Notes in Economics, Vol. 105. Springer-Verlag, Berlin, 1976, 166-309.
Lions, J. (1977): Remarks on the theory of optimal control of distributed systems. In:
Control Theory of Systems Governed by Partial Differential Equations. Academic,
New York, 1-103.
Lions, J. (1980): Asymptotic calculus of variations. In: Meyer, R. and Parter, S. [eds.]
(1980), 277-296.
Lions, P. (1982): Generalized solutions of Hamilton-Jacobi equations. Pitman, London.
Liptser, R. and Sirjaev, A. (1977): Statistics of Random Processes, Vols. 1, 2.
Springer-Verlag, Berlin, 1977-1978.
Ljubic, J. and Maistrovskii, G. (1970): General theory of relaxation processes for
convex functionals. Uspehi Mat. Nauk 25, 1 (57-112) (Russian).
Ljusternik, L. and Schnirelman, L. (1929): Sur le probleme de trois geodesiques
fermees sur les surfaces de genre zero. C.R. Acad. Sci. Paris 189 (269-317).
Ljusternik, L. and Schnirelman, L. (1934): Methodes topologiques dans les problemes
variationnels. Hermann, Paris.
Ljusternik, L. and Schnirelman, L. (1947): Topological methods in variational
problems and their application to the differential geometry of surfaces. Uspehi Mat.
Nauk 2, 1 (166-217) (Russian).
Ljusternik, L. (1930): Topologische Grundlagen der allgemeinen Eigenwerttheorie.
Monatsh. Math. Phys. 37 (125-130).
Ljusternik, L. (1934): On constrained extrema of functionals. Mat. Sb. 41 (390-401)
(Russian).
Ljusternik, L. (1939): On a class of nonlinear operators in Hilbert space. Izv. Akad.
Nauk SSSR, ser. mat. 100 (257-264) (Russian).
Loeve, M. (1978): Probability Theory. Vols. 1, 2. Springer-Verlag, Berlin.
Lovelock, D. and Rund, H. (1975): Tensors, Differential Forms and Variational
Principles. Wiley, New York.
Lu, Y. (1976): Singularity Theory and an Introduction to Catastrophe Theory.
Springer-Verlag, Berlin.
Ludwig, R. (1969): Methoden der Fehler- und Ausgleichsrechnung. Springer-Verlag,
Berlin.
Luenberger, D. (1969): Optimization by Vector Space Methods. Wiley, New York.
Luke, Y. (1975): Mathematical Functions and Their Approximations. Academic, New
York.
Luneburg, R. (1964): Mathematical Theory of Optics. University of California Press,
Berkeley.
Lurje, K. (1975): Optimal Control in Problems of Mathematical Physics. Nauka,
Moscow (Russian).
Macki, J. and Strauss, A. (1982): Introduction to Optimal Control Theory. Springer-Verlag,
Berlin.
Manin, Yu. (1984): Gauge Fields and Complex Geometry. Nauka, Moscow (to
appear).
Marcuk, G. [ed.] (1975): Optimization Techniques. IFIP Technical Conferences.
Lecture Notes in Computer Science, Vol. 27. Springer-Verlag, Berlin.
Marcuk, G. and Kuznecov, J. (1975): Iteration methods and quadratic functionals
(Russian). In: Lions, J. and Marcuk, G. [eds.] (1975), 4-143.
Marcuk, G. (1980): Methodes de calcul numerique. Mir, Moscow (French). (Russian
edition: Mir, Moscow, 1977. English edition: Springer, New York, 1982.)
Marsden, J. (1974): Applications of Global Analysis in Mathematical Physics. Publish
or Perish, Boston.
Marsden, J. (1981): Lectures on Geometric Methods in Mathematical Physics. SIAM,
Philadelphia.
Marti, J. (1977): Konvexe Analysis. Birkhauser, Basel.
Martos, B. (1975): Nonlinear Programming: Theory and Methods. Akad. Kiado,
Budapest.
Maslov, V. (1972): Theorie des perturbations et methodes asymptotiques. Dunod,
Paris; Gauthier-Villars, Paris.
Maslov, V. (1977): The Complex WKB-Method in Nonlinear Equations. Nauka,
Moscow (Russian).
Maurin, K. (1967): Methods of Hilbert Spaces. PWN, Warsaw.
Maurin, K. (1976): Analysis, Vols. 1, 2. PWN, Warsaw; Reidel, Boston, 1976-1980.
McEliece, R. (1977): The Theory of Information and Coding. Encyclopedia of Math.
and Appl., Vol. 3. Addison-Wesley, Reading, MA.
McLeod, J. and Turner, R. (1976): Bifurcation for non-differentiable operators with an
application to elasticity. Arch. Rat. Mech. Anal. 63 (1-45).
McShane, E. (1978): The calculus of variations from the beginning through optimal
control theory. In: Optimal Control and Differential Equations. A. Schwarzkopf
[ed.], Academic, New York, 3-49.
Meditch, J. (1969): Stochastic Optimal Linear Estimation and Control. McGraw-Hill,
New York.
Meinardus, G. (1964): Approximation von Funktionen und ihre numerische Behand-
lung. Springer-Verlag, Berlin. (English edition: Approximation of Functions:
Theory and Numerical Methods. Springer-Verlag, New York, 1967.)
Meyer, P. and Dellacherie, C. (1966): Probabilités et potentiel. Hermann, Paris.
(English edition: Probabilities and Potential. Elsevier, New York, 1978.)
Meyer, R. and Parter, S. [eds.] (1980): Singular Perturbations and Asymptotics.
Academic, New York.
Michlin, S. (1962): Variationsmethoden der mathematischen Physik. Akademie-Verlag,
Berlin.
Michlin, S. (1969): Numerische Realisierung von Variationsmethoden. Akademie-
Verlag, Berlin.
Michlin, S. and Smolickij, C. (1969): Naherungsmethoden zur Losung von
Differential- und Integralgleichungen. Teubner, Leipzig.
Miersemann, E. (1975): Verzweigungsprobleme fur Variationsungleichungen. Math.
Nachr. 65 (187-209).
Miersemann, E. (1981): Eigenvalue problems for variational inequalities.
Contemporary Mathematics 4 (25-43).
Milnor, J. (1963): Morse Theory. Princeton University Press, Princeton, NJ.
Minkowski, H. (1910): Geometrie der Zahlen. Teubner, Leipzig.
Minkowski, H. (1911): Theorie der konvexen Korper. In: Minkowski: Gesammelte
Abhandlungen, Vol. 2. Teubner, Leipzig. 131-229.
Miura, R. (1976): The Korteweg-de Vries equation: a survey of results. SIAM Rev. 18
(412-459).
Moerbeke, P. van (1974): Optimal stopping and free boundary problems. Rocky
Mountain J. Math. 4 (539-578).
Moerbeke, P. van (1976): On optimal stopping time and free boundary problems. Arch.
Rat. Mech. Anal. 60 (101-148).
Moiseev, N. (1975): Elements of the Theory of Optimal Systems. Nauka, Moscow
(Russian).
Moiseev, N. (1979): Optimization and Operations Research. Nauka, Moscow
(Russian).
Moore, E. (1920): On the reciprocal of the general matrix. Bull. Amer. Math. Soc. 26
(394-395).
Morduhovic, V. (1976): Existence of optimal controls. In: Supplement to Gabasov, R.
and Kirillova, F. (1976), 207-261 (Russian).
Moreau, J. (1962): Decomposition orthogonale dans un espace hilbertien selon deux
cones mutuellement polaires. C. R. Acad. Sci., Paris 255 (238-240).
Moreau, J. (1965): Proximite et dualite dans un espace hilbertien. Bull. Soc. Math.
France 93 (273-299).
Moreau, J. (1966): Fonctionnelles convexes: seminaire equations aux derivees par-
tielles. College de France, Paris.
Moreau, J. (1971): Weak and strong solutions of dual problems. In: Zarantonello, E.
[ed.] (1971a), 181-214.
Moreau, J. (1976): Application of convex analysis to the treatment of elastoplastic
systems. In: Applications of Methods of Functional Analysis to Problems in
Mechanics. Lecture Notes in Mathematics, Vol. 503, 56-89. Springer-Verlag,
Berlin.
Morozov, V. (1973): Linear and nonlinear ill-posed problems. Itogi Nauki i Tehniki,
Matemat. Analiz 11 (129-178) (Russian).
Morrey, C. (1966): Multiple Integrals in the Calculus of Variations. Springer-Verlag,
Berlin.
Morse, M. (1925): Relations between the critical points of a real function of n
independent variables. Trans. Amer. Math. Soc. 27 (345-396).
Morse, M. (1934): The Calculus of Variations in the Large. Colloquium Publ., Vol.
18. American Mathematical Society, Providence, RI.
Morse, M. and Cairns, S. (1969): Critical Point Theory in Global Analysis. Academic,
New York.
Morse, M. (1972): Variational Analysis: Critical Extremals and Sturmian Extensions.
Wiley, New York.
Morse, P. and Feshbach, H. (1953): Methods of Theoretical Physics. Vols. 1, 2.
McGraw-Hill, New York.
Mosco, U. (1969): Convergence of sets and solutions of variational inequalities. Adv. in
Math. 3 (510-585).
Mosco, U. (1973): An introduction to the approximate solution of variational
inequalities. In: Constructive Aspects of Functional Analysis. Cremonese, Roma, 497-684.
Mosco, U. (1976): Implicit variational problems and quasi-variational inequalities. In:
Nonlinear Operators and the Calculus of Variations. J. Gossez et al. [eds.],
Lecture Notes in Mathematics, Vol. 543, 82-156. Springer-Verlag, Berlin.
Mukherjea, A. and Pothoven, K. (1978): Real and Functional Analysis. Plenum, New
York, London.
Nashed, M. [ed.] (1976): Generalized Inverses and Applications. Academic, New
York.
Naumann, J. (1984): Parabolische Variationsungleichungen. Teubner, Leipzig (to
appear).
Necas, J. and Hlavacek, I. (1981) (see Hlavacek, I. and Necas, J.).
Necas, J. (1983): Introduction to the Theory of Nonlinear Elliptic Partial Differential
Equations. Teubner, Leipzig.
Neumann, J. von (1928): Zur Theorie der Gesellschaftsspiele. Math. Ann. 100
(295-320).
Neumann, J. von and Morgenstern, O. (1944): Theory of Games and Economic
Behavior. Princeton University Press, Princeton, NJ.
Neumann, J. von and Richtmyer, R. (1950): A method for the numerical calculation of
hydrodynamics. J. Appl. Phys. 21 (232-237).
Neustadt, L. (1976): Optimization: A Theory of Necessary Conditions. Princeton
University Press, Princeton, NJ.
Nirenberg, L. (1981): Variational and topological methods in nonlinear problems. Bull.
Amer. Math. Soc. (N.S.) 4 (267-302).
Nitsche, J. (1975): Vorlesungen uber Minimalflachen. Springer-Verlag, Berlin.
Novozilov, Ju. (1972): Introduction to the Theory of Elementary Particles. Nauka,
Moscow (Russian).
Nürnberg, R. (1979): On the determination of functional parameters in nonlinear
evolution equations of the Navier-Stokes type. In: Anger, G. [ed.] (1979),
189-196.
Olech, C. (1969): Existence theorems for optimal control problems involving multiple
integrals. J. Diff. Equations 6 (512-524).
Olech, C. (1969a): Existence theorems for optimal problems with vector-valued cost
function. Trans. Amer. Math. Soc. 136 (159-180).
Oleinik, O. (1957): Discontinuous solutions of nonlinear equations. Uspehi Mat. Nauk
12, 3 (3-73) (Russian).
Oleinik, O. and Radkevic, E. (1971): Second order equations with non-negative
characteristic form. Itogi nauki i tehniki, Matemat. Analiz (1969). VINITI,
Moscow, 1971, 7-252 (Russian).
Oleinikov, V., et al. (1969): Collection of Problems and Examples for the Theory of
Automatic Control. Vyssaja Skola, Moscow (Russian).
Orlicz, W. (1932): Uber eine gewisse Klasse von Raumen vom Typus B. Bull. Int.
Acad. Polon. Sci. A 8/9 (207-220).
Owen, G. (1968): Game Theory. Saunders, Philadelphia. (German edition: Springer-
Verlag, Berlin, 1971.)
Palais, R. (1963): Morse theory on Hilbert manifolds. Topology 2 (299-340).
Palais, R. and Smale, S. (1964): A generalized Morse theory. Bull. Amer. Math. Soc.
70 (165-171).
Palais, R. (1965): Seminar on the Atiyah-Singer Index Theorem. Princeton
University Press, Princeton, NJ.
Palais, R. (1966): Ljusternik-Schnirelman theory on Banach manifolds. Topology 5
(115-132).
Palais, R. (1967): Foundations of Global Nonlinear Analysis. Benjamin, Reading, MA.
Pan-Tai Liu [ed.] (1980): Dynamic Optimization and Mathematical Economics.
Plenum, New York.
Pascali, D. (1974): Operatori neliniari. Ed. Acad., Bucuresti.
Pascali, D. and Sburlan, S. (1978): Nonlinear Mappings of Monotone Type. Sijthoff &
Noordhoff, Alphen a. d. Rijn.
Payne, L. (1975): Improperly Posed Problems in Partial Differential Equations. SIAM,
Philadelphia.
Penrose, R. (1955): A generalized inverse for matrices. Proc. Cambridge Phil. Soc. 51
(406-413).
Penrose, R. (1956): On best approximate solutions of linear matrix equations. Proc.
Cambridge Phil. Soc. 52 (17-19).
Petrov, J. (1977): Variational Methods of the Theory of Optimal Control. Energija,
Leningrad (Russian). (English edition: Academic, New York, 1968.)
Petrovskii, I. (1955): Partielle Differentialgleichungen. Teubner, Leipzig. (English
edition: Academic, New York, 1955. Russian (3rd) edition: Gosizdatfizmatlit,
Moscow, 1961.)
Phu, H. (1984): Zur Losung des linearisierten Knickstabproblems mit beschrankter
Ausbiegung. ZAMM (to appear).
Phu, H. (1984a): Losung einer regularen Aufgabe der optimalen Steuerung mit engem
Zustandsbereich anhand der Methode der Bereichsanalyse. Math. Operations-
forschung und Statistik, Ser. Optimization (to appear).
Picard, E. (1910): Sur un theoreme general relatif aux equations integrales de
premiere espece et sur quelques problemes de physique mathematique. Rend. Circ.
Mat. Palermo 29 (615-619).
Piehler, J. and Zschiesche, H. (1976): Simulationsmethoden. Teubner, Leipzig.
Polak, E. (1971): Computational Methods in Optimization. Academic, New York.
Polak, E. (1973): An historical survey of computational methods in optimal control.
SIAM Rev. 15 (553-584).
Polis, M. and Goodson, R. (1974): Parameter identification in distributed systems: a
synthesizing overview. In: Identification of Parameters in Distributed Systems. R.
Goodson and M. Polis [eds.]. American Society of Mechanical Engineers, New
York, 1974, 1-30.
Poljak, B. (1974): Methods of minimization with presence of side conditions. Itogi
Nauki i Tehniki, Matemat. Analiz 12 (147-197) (Russian).
Pontrjagin, L. (1959): Optimal control processes. Uspehi Mat. Nauk 14, 1 (3-20)
(Russian).
Pontrjagin, L., Boltjanskii, V., Gamkrelidze, R., and Miscenko, E. (1961):
Mathematical Theory of Optimal Processes. Fizmatgiz, Moscow (Russian). (German
edition: VEB Dt. Verl. d. Wiss., Berlin, 1964. English edition: Wiley, New
York, 1962.)
Poston, T. and Stewart, I. (1978): Catastrophe Theory and Its Applications. Pitman,
London.
Powell, M. (1971): Recent advances in unconstrained optimization. Math.
Programming 1 (26-57).
Prenter, P. (1975): Splines and Variational Methods. Wiley, New York.
Priestley, M. (1981): Spectral Analysis and Time Series, Vols. 1, 2. Academic, New
York.
Prohorov, J. and Rozanov, J. (1969): Probability Theory. Springer-Verlag, Berlin.
Psenicnyi, B. (1972): Notwendige Optimalitatsbedingungen. Teubner, Leipzig.
Psenicnyi, B. and Danilin, J. (1979): Numerical Methods in Extremal Problems. Mir,
Moscow (Russian).
Rabinowitz, P. (1974): Variational methods for nonlinear eigenvalue problems. In:
Eigenvalues of Nonlinear Problems. G. Prodi [ed.]. Cremonese, Roma, 141-195.
Rabinowitz, P. (1975): A note on topological degree for potential operators. J. Math.
Anal. Appl. 51 (483-492).
Rabinowitz, P. (1977): A bifurcation theorem for potential operators. J. Funct. Anal.
25 (412-416).
Rabinowitz, P. (1978): Some minimax theorems and applications to nonlinear
differential equations. In: Nonlinear Analysis. L. Cesari et al. [eds.], Academic, New
York, 161-177.
Rabinowitz, P. (1978a): Periodic solutions of Hamiltonian systems. Comm. Pure
Appl. Math. 31 (157-184).
Rabinowitz, P. (1978b): Free vibrations for a semilinear wave equation. Comm. Pure
Appl. Math. 31 (31-68).
Rabinowitz, P. (1980): On subharmonic solutions of Hamiltonian systems. Comm.
Pure Appl. Math. 33 (609-633).
Rademacher, H. (1919): Uber partielle und totale Differenzierbarkeit von Funktionen
mehrerer Variabler. I, II. Math. Ann. 79 (1919), 340-359; 81 (1920), 52-63.
Ray, W. and Lainiotis, D. [eds.] (1978): Distributed Parameter Systems,
Identification, Estimation and Control. Dekker, New York.
Razygraev, A. (1977): Foundations of Control of the Flight of Cosmic Apparatuses and
Spaceships. Masinostroenie, Moscow (Russian).
Reed, M. and Simon, B. (1971): Methods of Modern Mathematical Physics, Vols.
1-4. Academic, New York, 1971-1980. (Russian edition: Mir, Moscow, 1977.)
Reiss, E. (1977): Imperfect bifurcation. In: Applications of Bifurcation Theory. P.
Rabinowitz [ed.], Academic, New York. 1977, 37-72.
Remes, E. (1934): Sur un procede convergent d'approximation successive pour
determiner les polynômes d'approximation. C. R. Acad. Sci. Paris 198
(2063-2065); 199 (337-340).
Remes, E. (1969): The Foundations of Numerical Methods of the Chebyshev
Approximation. Naukova Dumka, Kiev (Russian).
Renyi, A. (1977): Wahrscheinlichkeitsrechnung. VEB Dt. Verl. d. Wiss., Berlin.
Richtmyer, R. and Morton, K. (1967): Difference Methods for Initial Value Problems.
Interscience, New York.
Riesz, F. and Sz.-Nagy, B. (1956): Vorlesungen uber Funktionalanalysis. VEB Dt.
Verl. d. Wiss., Berlin. (English edition: Functional Analysis, Frederick Ungar,
New York, 1955. French edition: Akademiai Kiado, Budapest, 1952, 1953,
1955, 1965.)
Ritz, W. (1909): Uber eine neue Methode zur Losung gewisser Variationsprobleme der
mathematischen Physik. J. Reine Angew. Math. 135 (1-61).
Rivlin, T. (1969): An Introduction to the Approximation of Functions. Blaisdell,
Waltham, MA.
Roberts, A. and Varberg, D. (1973): Convex Functions. Academic, New York.
Robinson, A. (1971): A survey of optimal control of distributed parameter systems.
Automatica 7 (371-388).
Rockafellar, R. (1967): Duality and stability in extremum problems involving convex
functions. Pacific J. Math. 21 (167-187).
Rockafellar, R. (1968): Convex functions, monotone operators and variational
inequalities. In: Theory and Applications of Monotone Operators. A. Ghizetti [ed.], Ed.
Oderisi, Gubbio, 35-65.
Rockafellar, R. (1970): Convex Analysis. Princeton University Press, Princeton, NJ.
Rockafellar, R. (1970a): On the maximal monotonicity of subdifferential mappings.
Pacific J. Math. 33 (209-216).
Rockafellar, R. (1970b): Conjugate functions in optimal control and the calculus of
variations. J. Math. Anal. Appl. 32 (174-222).
Rockafellar, R. (1971): Convex integral functionals and duality. In: Zarantonello, E.
[ed.] (1971a), 215-236.
Rockafellar, R. (1975): Existence theorems for general control problems of Bolza and
Lagrange. Adv. in Math. 15 (312-337).
Rockafellar, R. (1976): Integral functionals, normal integrands and measurable
selections. In: Nonlinear Operators and the Calculus of Variations. J. Gossez et al.
[eds.], Lecture Notes in Mathematics, Vol. 543, 167-207. Springer-Verlag,
Berlin.
Rockafellar, R. (1981): The Theory of Subgradients and Its Applications to Problems
of Optimization. Convex and Nonconvex Problems. Heldermann, Berlin.
Ross, S. (1970): Applied Probability Models with Optimization Applications. Holden-
Day, San Francisco.
Rothe, E. (1973): Morse theory. Rocky Mountain J. Math. 3 (251-274).
Rozanov, J. (1975): Stochastische Prozesse. Akademie-Verlag, Berlin.
Ruelle, D. (1969): Statistical Mechanics. Benjamin, New York.
Rund, H. (1966): The Hamilton-Jacobi theory in the calculus of variations. Van
Nostrand, London.
Rund, H. (1981): Differential Geometric and Variational Background of Classical
Gauge Field Theories, Parts 1, 2. University of Arizona Press, Tucson (preprint).
Russell, D. (1978): Controllability and stabilizability: theory for linear partial
differential equations. SIAM Rev. 20 (635-739).
Russell, D. (1979): Mathematics of Finite-Dimensional Control Systems. Dekker, New
York.
Russian Encyclopedia of Mathematics (1977): Edited by I. Vinogradov, Vol. 1ff.
Sovetskaja Encyclopedia, Moscow (Russian).
Saaty, T. (1978): Optimization in integers and related extremal problems. McGraw-
Hill, New York.
Saff, E. and Varga, R. [eds.] (1977): Pade and Rational Approximation. Academic,
New York.
Sander, H. (1973): Dualitat bei Optimierungsaufgaben. Oldenbourg, München.
Sato, M., Miwa, T., and Jimbo, M. (1980): Aspects of holonomic quantum fields,
isomonodromic deformation and Ising model. In: Lecture Notes in Phys., Vol. 126,
429-491. Springer-Verlag, New York.
Sauer, R. and Szabo, I. (1967): Mathematische Hilfsmittel des Ingenieurs, Vols. 1-4.
Springer-Verlag, Berlin, 1967-1970.
Schaefer, H. (1966): Topological Vector Spaces. Macmillan, London.
Schimming, R. (1977): Das Huygenssche Prinzip bei linearen hyperbolischen
Differentialgleichungen 2. Ordnung fur allgemeine Felder. Beitrage zur Analysis 11
(45-90).
Schimming, R. (1978): A review of Huygens' principle for linear hyperbolic differential
equations. In: Ibragimov, N. and Ovsjannikov, L. [eds.] (1978), 214-225.
Schlitt, H. (1968): Stochastische Vorgange in linearen und nichtlinearen Regelkreisen.
Vieweg, Braunschweig; Verl. Technik, Berlin.
Schloder, J. and Bock, H. (1983): Identification of rate constants in bistable chemical
reactions. In: Deuflhard, P. and Hairer, E. [eds.] (1983) (to appear).
Schmetterer, L. (1966): Einfuhrung in die mathematische Statistik. Springer-Verlag,
Wien.
Schoenberg, I. (1946): Contributions to the problem of approximation of equidistant
data by analytic functions. Quart. Appl. Math. 4 (45-99; 112-141).
Schonhage, A. (1971): Approximationstheorie. De Gruyter, Berlin.
Schultz, M. (1973): Spline Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Schumann, R. (1982): Approximation methods for quasilinear elliptic equations with
rapidly or slowly increasing coefficients. Zeitschrift fur Analysis und ihre
Anwendungen 1, 4 (73-85).
Schwartz, J. (1964): Differential Geometry and Topology. Gordon and Breach, New
York.
Schwartz, J. (1969): Nonlinear Functional Analysis. Gordon and Breach, New York.
Schwartz, L. (1950): Theorie des distributions, Vol. 1, 2. Hermann, Paris, 1950-1951.
Schweber, S. (1961): An Introduction to Relativistic Quantum Field Theory. Row,
Peterson, Elmsford, New York.
Seidman, T. (1977): Observation and prediction for the heat equation. SIAM J.
Control Optim. 15 (412-427).
Seidman, T. (1979): Ill-posed problems arising in boundary control and observation for
diffusion equations. In: Anger, G. [ed.] (1979), 233-247.
Seifert, H. and Threlfall, W. (1938): Variationsrechnung im Grossen. Teubner,
Leipzig.
Shannon, C. (1948): A mathematical theory of communication. Bell System Techn. J.
27 (379-423, 623-656).
Siegel, C. and Moser, J. (1971): Lectures on Celestial Mechanics. Springer-Verlag,
Berlin.
Simon, B. (1974): The P(φ)_2 Euclidean Quantum Field Theory. Princeton University
Press, Princeton, NJ.
Simon, B. (1979): Functional Integration and Quantum Physics. Academic, New
York.
Singer, I. (1970): Best Approximation in Normed Linear Spaces by Elements of Linear
Subspaces. Springer-Verlag, Berlin.
Sion, M. (1958): On general min-max theorems. Pacific J. Math. 8 (171-176).
Sirazetdinov, I. (1977): Optimization of Distributed Systems. Nauka, Moscow
(Russian).
Skrypnik, I. (1973): Nonlinear Elliptic Equations of Higher Order. Naukova Dumka,
Kiev (Russian).
Smale, S. (1977): Global variational analysis. Bull. Amer. Math. Soc. 83 (683-693).
Smale, S. (1980): The Mathematics of Time. Springer-Verlag, New York.
Smale, S. (1981): The fundamental theorem of algebra and complexity theory. Bull.
Amer. Math. Soc. (N.S.) 4 (1-36).
Smale, S. (1983): Global analysis and economics. In: Arrow and Intriligator [eds.]
(1983).
Smirnow, W. (1956): Lehrgang der höheren Mathematik, Vols. 1-5. VEB Dt. Verl. d.
Wiss., Berlin, 1956-1962. (English edition: A Course in Higher Mathematics,
Vols. 1-5, Addison-Wesley, Reading, MA, 1964. Russian edition: Vols. 1-5,
Fizmatgiz, Moscow, Leningrad, 1951-1959.)
Smoller, J. (1983): Shock Waves and Reaction Diffusion Equations. Springer-Verlag,
Berlin.
Sobol, I. (1971): Die Monte-Carlo Methode. VEB Dt. Verl. d. Wiss., Berlin.
Sobolev, S. (1974): Introduction to the Theory of Cubature Formulas. Nauka, Moscow
(Russian).
Solodovnikov, V. (1965): Statistical Dynamics of Linear Automatic Control Systems.
Dover, New York.
Sommerfeld, A. (1962): Vorlesungen über theoretische Physik, Vols. 1-6. Geest &
Portig, Leipzig.
Stackel, P. [ed.] (1894): Abhandlungen über Variationsrechnung. Teil 1: Johann und
Jacob Bernoulli, Euler. Teil 2: Lagrange, Legendre, Jacobi. Ostwalds Klassiker
der exakten Wissenschaften, No. 46/47. Engelmann, Leipzig, 1894-1911.
Stampacchia, G. (1963): On some regular multiple integral problems in the calculus of
variations. Comm. Pure Appl. Math. 16 (383-421).
Stampacchia, G. (1965): Le probleme de Dirichlet pour les equations elliptiques du
second ordre a coefficients discontinus. Ann. Inst. Fourier 15 (189-259).
Sternberg, S. (1969): Celestial Mechanics, Vols. 1, 2. Benjamin, New York.
Stoer, J. and Witzgall, C. (1970): Convexity and Optimization in Finite Dimensions.
Springer-Verlag, Berlin.
Stoer, J. and Bulirsch, R. (1978): Einfuhrung in die numerische Mathematik, Vol. 2.
Springer-Verlag, Berlin. (English edition: Vols. 1, 2 in one volume, Springer-
Verlag, New York, 1980.)
Streit, L. [ed.] (1980): Quantum fields: Algebras, Processes. Springer-Verlag, Wien.
Stroud, A. (1974): Numerical Quadrature and Solution of Ordinary Differential
Equations. Springer-Verlag, New York.
Struwe, M. (1980): Infinitely many critical points for functionals which are not even
and applications to superlinear boundary value problems. Manuscripta Math. 32
(355-364).
Struwe, M. (1982): Multiple solutions of differential equations without the Palais-Smale
condition. Math. Ann. 261 (399-412).
Stuart, C. (1977): Three Fundamental Theorems on Bifurcation. Ecole Polytechnique,
Lausanne (Lecture Notes).
Stuart, C. (1979): An introduction to bifurcation theory based on differential calculus.
In: Knops, R. [ed.] (1976), Vol. 4, 76-135.
Stumpff, K. (1959): Himmelsmechanik, Vols. 1-3. VEB Dt. Verl. d. Wiss., Berlin,
1959-1973.
Suhovickii, S. and Avdeeva, L. (1969): Lineare und konvexe Programmierung.
Oldenbourg, München.
Tartar, L. (1978): Une nouvelle methode de resolution d'equations aux derivees
partielles non lineaires. In: Journees d'analyse non lineaire. Benilan, P. and
Robert, J. [eds.]. Lecture Notes in Mathematics, Vol. 665, 228-241. Springer-
Verlag, Berlin.
Taylor, M. (1981): Pseudodifferential Operators. Princeton University Press,
Princeton, NJ.
Tchebycheff, P. (1859): Sur les questions de minima qui se rattachent a la
representation approximative des fonctions. In: Tchebycheff: Oeuvres. St. Petersburg, 1899;
Chelsea, New York, 1962, Vol. 1, 273-378.
Temam, R. (1977): Navier-Stokes Equations: Theory and Numerical Analysis.
North-Holland, Amsterdam.
Temam, R. (1983): Problemes mathematiques en plasticite. Dunod, Paris.
Thacher, H. and Witzgall, C. (1968): Computer Approximations. Wiley, New York.
Thiele, R. (1982): Leonhard Euler. Teubner, Leipzig.
Thom, R. (1972): Stabilite structurelle et morphogenese. Inter-Editions, Paris.
(English edition: Structural Stability and Morphogenesis. An Outline of a General
Theory of Models, 2nd printing. Benjamin, Reading, MA, 1976.)
Thompson, J. (1982): Instabilities and Catastrophes in Science and Engineering.
Wiley, New York.
Tihomirov, V. (1976): Some Problems of Approximation Theory. Moscow University
Press, Moscow (Russian).
Tihomirov, V. (1982): Grundprinzipien der Theorie der Extremalaufgaben. Teubner,
Leipzig.
Tihonov, A. (1963): On the solution of ill-posed problems. Doklady Akad. Nauk SSSR
151 (501-504) (Russian).
Tihonov, A. and Arsenin, V. (1977): Solution of Ill-Posed Problems. Wiley, New
York.
Tikhonov, A. (see Tihonov, A.).
Tonelli, L. (1921): Fondamenti di calcolo delle variazioni, Vols. 1, 2. Bologna,
1921-1923.
Topsoe, F. (1974): Informationstheorie. Teubner, Stuttgart.
Traub, J. [ed.] (1976): Analytic Computation Complexity. Academic, New York.
Traub, J. and Wozniakowski, H. (1980): A General Theory of Optimal Algorithms.
Academic, New York.
Trefftz, E. (1927): Ein Gegenstuck zum Ritzschen Verfahren. In: Verh. des II. Intern.
Kongresses fur Technische Mechanik, Zurich, p. 131.
Treves, F. (1980): Introduction to Pseudo-Differential and Fourier Integral Operators,
Vols. 1, 2. Plenum, New York.
Triebel, H. (1981): Analysis und mathematische Physik. Teubner, Leipzig.
Tromba, A. (1977): On the Number of Simply Connected Minimal Surfaces Spanning
a Curve. Memoirs of the American Mathematical Society, Vol. 194. American
Mathematical Society, Providence, RI.
Tromba, A. (1977a): A general approach to Morse theory. J. Diff. Geometry 12
(47-85).
Tromba, A. (1977b): The Morse-Sard-Brown theorem and the problem of Plateau.
Amer. J. Math. 99 (1251-1256).
Tromba, A. (1980): On the structure of the set of curves bounding minimal surfaces of
prescribed degeneracy. J. Reine Angew. Math. 316 (31-43).
Troyanski, S. (1971): On locally uniformly convex and differentiable norms in certain
non-separable spaces. Studia Math. 37 (173-180).
Tychonov, A. (see Tihonov, A.)
Tzafestas, S. [ed.] (1980): Simulation of Distributed Parameters and Large-Scale
Systems. North-Holland, Amsterdam.
Uberla, K. (1968): Faktoranalyse. Springer-Verlag, Berlin.
Ursprung, H. (1982): Die elementare Katastrophentheorie: Eine Darstellung aus der
Sicht der Okonomie. Lecture Notes in Economics and Mathematical Systems,
Vol. 195. Springer-Verlag, Berlin.
Vainberg, M. (1956): The Variational Method in the Investigation of Nonlinear
Operators. Gostehizdat, Moscow (Russian). (English edition: Holden-Day, San
Francisco, 1964.)
Vainberg, M. (1972): Variational Method and Method of Monotone Operators in the
Theory of Nonlinear Equations. Nauka, Moscow (Russian). (English edition:
Wiley, New York, 1973.)
Vainikko, G. (1979): Regular convergence of operators and the approximate solution of
equations. Itogi Nauki i Tehniki, Matemat. Analiz 16 (5-53) (Russian).
Vainikko, G. (1980): Error estimates for the method of successive approximations in
ill-posed problems. Avtomat. i Telemeh. 3 (84-91) (Russian).
Vainikko, G. (1982): Methods for the Solution of Ill-Posed Problems in Hilbert
Spaces. Tartu University Press, Tartu, SSSR (Russian).
Valentine, F. (1964): Convex Sets. McGraw-Hill, New York.
Van der Waerden, B. (1965): Mathematische Statistik. Springer-Verlag, Berlin.
Varga, R. (1962): Matrix Iterative Analysis. Prentice-Hall, Englewood Cliffs, NJ.
Varga, R. (1971): Functional Analysis and Approximation Theory in Numerical
Analysis. SIAM, Philadelphia.
Velte, W. (1976): Direkte Methoden der Variationsrechnung. Teubner, Stuttgart.
Vogel, W. (1967): Lineares Optimieren. Geest & Portig, Leipzig.
Vorobjov, N. (1970): The present state of game theory. Uspehi Mat. Nauk 25, 2
(81-140) (Russian).
Vorobjov, N. (1975) (see Worobjow, N. (1975)).
Walker, J. (1980): Dynamical Systems and Evolution Equations. Plenum, New York.
Walters, P. (1982): An Introduction to Ergodic Theory. Springer-Verlag, New York.
Wang, P. (1964): Control of distributed parameter systems. In: Advances in Control
Systems. Academic, New York, 75-172.
Warga, J. (1972): Optimal Control of Differential and Functional Equations. Academic,
New York. (Russian edition: Nauka, Moscow, 1977.)
Warner, F. (1971): Foundations of Differential Manifolds and Lie Groups. Scott,
Foresman, London.
Weinberg, S. (1972): Gravitation and Cosmology. Wiley, New York.
Weinberg, S. (1974): Recent progress in gauge theories of the weak, electromagnetic
and strong interactions. Rev. Mod. Phys. 46 (255-277).
Wentzell, A. (1979): Theorie zufalliger Prozesse. Akademie-Verlag, Berlin.
Westenholz, C. von (1981): Differential Forms in Mathematical Physics.
North-Holland, Amsterdam.
White, D. (1969): Dynamic Programming. Holden-Day, San Francisco.
Whitney, H. (1955): On singularities of mappings of Euclidean spaces. Ann. Math. 62
(374-410).
Wiener, N. (1948): Cybernetics or Control and Communication in the Animal and the
Machine. Wiley, New York.
Wiener, N. (1949): Extrapolation, Interpolation, and Smoothing of Stationary Time
Series. Technology Press, Cambridge, MA.
Wolfersdorf, L. von (1975): Optimale Steuerung einer Klasse nichtlinearer Aufhei-
zungsprozesse. ZAMM 55 (353-362).
Wolfersdorf, L. von (1975a): Optimal control problems governed by equations with
closed and normally resolvable operators. Math. Nachr. 65 (331-333).
Wolfersdorf, L. von (1976): Optimal control of a class of processes governed by general
integral equations of Hammerstein type. Math. Nachr. 71 (115-141).
Worobjow, N. (1975): Entwicklung der Spieltheorie. VEB Dt. Verlag der Wiss.,
Berlin.
Wünsch, V. (1976): Sur la validite du principe de Huygens pour les equations de champ
spinoriel. C. R. Acad. Sci. Paris A 283 (983-986).
Yakowitz, S. (1977): Computational Probability and Simulation. Addison-Wesley,
Reading, MA.
Yosida, K. (1965): Functional Analysis. Springer-Verlag, Berlin (6th edition, 1980).
(Russian edition: Mir, Moscow 1967.)
Young, L. (1969): Lectures on the Calculus of Variations and Optimal Control Theory.
Saunders, Philadelphia. (Russian edition: Mir, Moscow, 1974.)
Young, W. (1912): On classes of summable functions and their Fourier series. Proc.
Royal Soc. London A 87 (225-229).
Zaharov, V. and Faddeev, L. (1971): The Korteweg-de Vries equation —a totally
integrable system. Funkcional. Anal. i Prilozen. 5 (18-27) (Russian).
Zaharov, V. (1974): The Hamiltonian formalism for waves in nonlinear media with
dispersion. Izv. Vyss. Ucebn. Zaved. Radiofizika 17 (431-453) (Russian).
Zaharov, V. and Sabat, A. (1974): Scheme of integration of equations of mathematical
physics. Funkcional. Anal. i Prilozen. 8 (43-53) (Russian).
Zaharov, V., Manakov, S., Novikov, S. and Pitaevskii, L. (1980): Theory of Solitons.
Nauka, Moscow (Russian).
Zarantonello, E. (1971): Projections on convex sets in Hilbert spaces and spectral
theory. In: Zarantonello, E. [ed.] (1971a), 237-424.
Zarantonello, E. [ed.] (1971a): Contributions to Nonlinear Functional Analysis.
Academic, New York.
Zeeman, E. (1974): Levels of structure in catastrophe theory illustrated by applications
in the social and biological sciences. In: Proc. Internat. Congress of
Mathematicians, Vancouver, 1974, Vol. 2, 533-546.
Zeidler, E. (1976): Lokale und globale Verzweigungsresultate für
Variationsungleichungen. Math. Nachr. 71 (37-63).
Zeidler, E. (1979): Lectures on Ljusternik-Schnirelman theory for indefinite nonlinear
eigenvalue problems and its applications. In: Fucik, S. and Kufner, A. [eds.]
(1979), 176-219.
Zeidler, E. (1979a): Ljusternik-Schnirelman Theory on General Level Sets.
Mathematics Research Center Techn. Report No. 1910. University of Wisconsin
Press, Madison (to appear in Math. Nachr.).
Zeidler, E. (1980): The Ljusternik-Schnirelman theory for indefinite and not
necessarily odd nonlinear operators and its applications. Nonlinear Anal. 4 (451-489).
List of Symbols
We use the following abbreviations:
B-space Banach space
H-space Hilbert space
M-S sequence Moore-Smith sequence
F-derivative Fréchet derivative
G-derivative Gâteaux derivative
$A_1(10)$ means (10) in the Appendix to Part I.
In perusing the following symbols, the reader should pay attention to the
possible danger of confusion. The precise definitions can be found in Part I.
$X^*$    dual space to X
$A^*$    dual operator to A in a B-space, transposed matrix
$A^{*\prime}$    adjoint operator to A in an H-space, adjoint matrix (transposed and complex conjugate)
Observe that if the continuous linear operator $A\colon X \to X$ is defined on the
H-space X, then the operators $A^*\colon X^* \to X^*$ and $A^{*\prime}\colon X \to X$ are defined
on different spaces.
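The difference between the two operators can be made explicit through their defining relations. The following is a sketch for a real H-space, written with the symbols of this list; the map $J$ (the Riesz identification of $X$ with $X^*$) is an assumption introduced only for this illustration and is not part of the list.

```latex
% Dual operator A^*: X^* -> X^*, defined through the pairing <f, x> = f(x)
\langle A^* f, x \rangle = \langle f, A x \rangle
  \qquad \text{for all } f \in X^*,\ x \in X .
% Adjoint operator A^{*'}: X -> X, defined through the inner product (x|y)
(A^{*\prime} x \,|\, y) = (x \,|\, A y)
  \qquad \text{for all } x, y \in X .
% Riesz map J: X -> X^* (assumed notation), <Jx, y> = (x|y)
\langle J x, y \rangle = (x \,|\, y)
  \quad\Longrightarrow\quad
  A^{*\prime} = J^{-1} A^* J .
```

Under these assumptions the two operators correspond to each other via $J$, although they act on different spaces.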
$F'$    F-derivative or G-derivative of the operator F (the text always states which of the two is meant)
$(x|y)$    inner product in an H-space
$(x, y)$    ordered pair, an element of the product set $X \times Y$
$\langle x|y \rangle$    inner product in $\mathbb{R}^N$, $\mathbb{C}^N$
$\langle f, x \rangle$    value of the linear functional f at x, f(x)
$x_n \to x$    convergence in norm
$x_n \rightharpoonup x$    weak convergence
$f_n \overset{*}{\rightharpoonup} f$    weak* convergence of functionals
$x \mapsto f(x)$    another notation for the mapping f
$f(\cdot)$    another notation for the mapping f
$(x_{n'})$    subsequence of $(x_n)$
$\|x\|$    norm of x
$|x|$    Euclidean norm of x
General Notation
$\mathcal{A} \Rightarrow \mathcal{B}$    $\mathcal{A}$ implies $\mathcal{B}$
iff    if and only if
$\mathcal{A} \Leftrightarrow \mathcal{B}$    $\mathcal{A}$ iff $\mathcal{B}$
$f(x) \overset{\text{def}}{=} 2x$    $f(x) = 2x$ by definition
$a \in A$    a is an element of the set A
$\{x\colon \ldots\}$    set of all x with the property ...
$A \subseteq B$    the set A is contained in the set B
$A \subset B$    A is properly contained in B
$\cap, \cup, -$    intersection, union, difference
$\emptyset$    empty set
$2^A$    set of all subsets of A
$A \times B$    product set
$\mathbb{N}$    set of the natural numbers 1, 2, ...
$\mathbb{R}, \mathbb{C}, \mathbb{Q}, \mathbb{Z}$    set of the real, complex, rational, integer numbers
$\mathbb{K}$    $\mathbb{R}$ or $\mathbb{C}$
$\mathbb{R}_+$    nonnegative real numbers
$\mathbb{R}^N$    set of all real N-tuples $x = (\xi_1, \ldots, \xi_N)$
$\mathbb{R}^N_+$    set of all $x \in \mathbb{R}^N$ with $\xi_i \ge 0$ for all i
$D_i$    partial derivative with respect to $\xi_i$
$[a,b], ]a,b[, ]a,b]$    closed, open, half-open interval
meas G    Lebesgue measure of G
I    identity mapping
$f\colon A \subseteq X \to Y$    single-valued mapping from A into Y with $A \subseteq X$
f surjective    mapping onto Y, i.e., f(A) = Y
f injective    one-to-one mapping
f bijective    one-to-one mapping onto Y
$f(A)$    image of A
$f^{-1}(B)$    preimage of B
$f|_{\mathcal{A}}$    restriction of the map f to the set $\mathcal{A}$
$D(f)$    domain of f
$R(f)$    range of f
$N(f)$    null space, $N(f) = \{x\colon f(x) = 0\}$
$f \circ g$    f applied to g, $(f \circ g)(x) = f(g(x))$
$f\colon A \to 2^B$    multivalued mapping
$S^N$    surface of the unit ball in $\mathbb{R}^{N+1}$, the N-sphere
sgn a    signum of a
det M, rank M    determinant, rank of the matrix M
$\square$    end of proof
Notation Introduced in Part I
$\partial A$    boundary of the set A
$\overline{A}$    closure of A
int A    interior of A
$U(x)$    neighborhood of the point x
$U(x, R)$    open ball with center x and radius R
diam A    diameter of A
dist(x, A)    distance of the point x from the set A
dist(A, B)    distance between the sets A and B
$\underline{\lim}, \overline{\lim}$    lower, upper limit
$A + B, \lambda A$    sum of the sets A and B, product of the set A by the number $\lambda$
span A    linear hull of A
co A    convex hull of A
$\overline{\text{co}}\, A$    closed convex hull of A
dim L    dimension of the linear space L
supp f    support of the function f
$X \times Y$    product space
$|x|_p$    p-norm in $\mathbb{R}^N$, $\mathbb{C}^N$
$X \oplus Y$    direct sum
$L(X, Y)$    space of linear continuous operators from X into Y
$C^k(G)$    space of k-times continuously differentiable real functions
$C(G)$    space of continuous real functions on G
$C^k[a, b], C[a, b]$    stands for $C^k(G)$, $C(G)$ with G = [a, b]
$C^k(M, Y)$    space of k-fold continuously F-differentiable mappings $f\colon M \to Y$
$C^{k,\alpha}(G)$    space of k-fold Hölder-continuously differentiable real functions
$C^{k,\alpha}(\partial G)$    space of real functions on the boundary $\partial G$
$\partial G \in C^{k,\alpha}$    boundary property of the set G
Notation Introduced in Part II
$X = X^*$    identification of an H-space X with its dual space
$\alpha = (\alpha_1, \ldots, \alpha_N)$    multi-index
$|\alpha| = \alpha_1 + \cdots + \alpha_N$    order of $\alpha$
$D^\alpha = D_1^{\alpha_1} D_2^{\alpha_2} \cdots D_N^{\alpha_N}$    derivative in multi-index notation, $D_i = \partial/\partial\xi_i$
$\partial/\partial n$    derivative in the direction of the exterior normal
$\Delta = \sum_{i=1}^{N} D_i^2$    Laplace operator, $D_i = \partial/\partial\xi_i$
$dO$    surface differential
$ds$    differential of the arc length ($dO = ds$ in $\mathbb{R}^2$)
$C_0^\infty(G)$    space of infinitely differentiable real functions with compact support
$\partial G \in C^{0,1}$    piecewise smooth boundary
$L_p(G)$    Lebesgue space
$\|\cdot\|_p$    norm on $L_p(G)$
$L_p(\partial G)$    Lebesgue space on $\partial G$
$\|\cdot\|_{p,\partial G}$    norm on $L_p(\partial G)$
$W_p^m(G), \mathring{W}_p^m(G)$    Sobolev spaces
$\|\cdot\|_{m,p}$    norm on $W_p^m(G)$
$\|\cdot\|_{m,p,0}$    norm on $\mathring{W}_p^m(G)$
$(\cdot|\cdot)_{m,2}$    inner product in $W_2^m(G)$
"$V \subseteq H \subseteq V^*$"    evolution triple
$W_p^1(0, T; V, H)$    Sobolev space with respect to "$V \subseteq H \subseteq V^*$"
$L_p(0, T; X)$    Lebesgue space of functions with values in X
$(P), (S), (S)_+, (S)_0$    conditions for mappings
Notation Introduced in the Present Part III
Symbol    Meaning    Page
$\inf_{u \in A} F(u)$    infimum of F on A    5
$\sup_{u \in A} F(u)$    supremum of F on A    5
$\min_{u \in A} F(u)$    minimum of F on A    5
$\max_{u \in A} F(u)$    maximum of F on A    5
$\min_{u \in A} F(u) = \alpha$, $F(u) = \min!$    minimum problem    5
$F(u) = \text{stationary}!$    problem for determining a critical point    292
$F'(u)$    F-derivative or G-derivative at the point u (the text always states which of the two is meant)    191
$\delta F(u; h)$    variation of F at the point u in the direction h    191
$\delta^n F(u; h)$    n-th variation of F at u in the direction h    191
$d^n F(u; h_1, \ldots, h_n)$    n-th F-differential of F at u in the directions $h_1, \ldots, h_n$    192
$d^n F(u; h)$    identical with $d^n F(u; h, \ldots, h)$
$F'(u)h$    $F'(u)$ applied to h
$F''(u)hk$    identical with $(F''(u)h)k$
$F''(u)h^2$    identical with $F''(u)hh$
$\delta_+ F(u; h)$    directional derivative of F at the point u in the direction h    191
$\partial F(u)$    subdifferential of F at the point u    385
dom F    effective domain of F    380
epi F    epigraph of F    380
$\chi_M$    indicator function of the set M    381
$F^*$    conjugate functional to F    489
$K^+$    dual cone to K    408
$K^*$    identical with $K^+$    408
$K(Y)$    closed unit ball in the space Y    111
$S(Y)$    boundary of $K(Y)$    111
$M^\perp$    generalized orthogonal complement in B-spaces    172
$(X, X^*), (X, Y)$    dual pair of locally convex spaces    601
$TM_u$    tangent space to M at the point u    283
$TF(u)$    tangential mapping to F at the point u    287
^M,cF    set of critical points    316
sym X    class of symmetric sets in the space X    318
ind M    topological index of the set M    318
gen M    genus of M    319
(PS)    Palais-Smale condition    161
$(PS)_\pm$, (PS),    local Palais-Smale conditions    321
$x_\pm$    global multiplicity of eigenvectors    325
Lcx(a,b)    space of piecewise continuous functions on [a, b]    422
$L_H(G)$    Orlicz space    540
$\mathcal{L}_H(G)$    Orlicz class    540
$E_H(G)$    subspace of $L_H(G)$    540
$W^m L_H(G)$    Sobolev-Orlicz space    544
$[x, y]_p, [x, y]_\pm$    generalized inner products on B-spaces    582
List of Theorems
Every science is, among other things, the ordering, the simplifying, the making
digestible what is undigestible for the spirit.
Hermann Hesse
Theorem 37.A (Consistency, existence, and duality in linear
optimization problems) 54
Theorem 37.B (The Kuhn-Tucker saddle point theorem of
convex optimization) 56
Theorem 37.C (Alternation theorem of classical Chebyshev
approximation) 74
Theorem 37.D (Dynamic optimization of discrete control
problems and the Bellman optimization principle) 85
Theorem 37.E (Dynamic optimization of continuous control
problems) 87
Theorem 37.F (Thom's classification theorem of catastrophe
theory) 126
Theorem 38.A (Main theorem for extremal problems in B-spaces;
compactness and existence of extremal solutions;
generalized Weierstrass theorem) 151
Theorem 38.B (Main theorem for extremal problems in
topological spaces) 152
Theorem 38.C (Strict convexity and uniqueness of extremal
solutions) 152
Theorem 38.D (Main theorem of linear optimization on compact
convex sets in locally convex spaces, the role of the
extreme points) 157
Theorem 38.E (Quasisolutions for minimum problems) 158
Theorem 38.G (Abstract entropy principle) 163
Theorem 39.A (Main theorem of linear approximation theory,
existence, and duality) 172
Theorem 39.B (Interpolation properties of subspaces and the
uniqueness of extremal solutions) 175
Theorem 39.C (Abstract alternation theorem of linear
approximation theory) 179
Theorem 40.A (Necessary and sufficient conditions for free, local
extrema expressed in terms of variations) 193
Theorem 40.B (Necessary and sufficient conditions for free, local
extrema expressed in terms of derivatives) 194
Theorem 40.C (Sufficient conditions for global minima using
comparison functionals; basic idea of field theory) 195
Theorem 40.D (Accessory quadratic minimum problems and
sufficient eigenvalue criteria for free, local minima) 201
Theorem 41.A (Solution of operator equations by solving
extremal problems) 233
Theorem 41.B (Solution of abstract Hammerstein equations
whose kernel operator is symmetric and whose
Nemyckii operator is a potential operator) 237
Theorem 42.A (Free, convex minimum problems and the Ritz
method) 251
Theorem 42.B (Free, convex minimum problems and the gradient
method) 253
Theorem 43.A (Existence of an eigenvector via a minimum
problem with side conditions) 278
Theorem 43.B (Existence of a bifurcation point via a maximum
problem with side conditions) 279
Theorem 43.C (Tangent vectors, submersions, and the generalized
implicit function theorem) 286
Theorem 43.D (Existence of Lagrange multipliers for smooth side
conditions, necessary and sufficient conditions) 290
Theorem 44.A (Main theorem of the Ljusternik-Schnirelman
theory in infinite-dimensional B-spaces on the
existence of finitely many or infinitely many
eigenvectors) 326
Theorem 44.B (Main theorem of the Ljusternik-Schnirelman
theory in finite-dimensional B-spaces) 335
Theorem 44.C (Existence of several eigenvectors for abstract
Hammerstein equations) 337
Theorem 44.D (The mountain pass theorem for constructing a
free critical point) 339
Theorem 45.A (Main theorem of bifurcation theory for potential
operators) 353
Theorem 46.A (Variational inequalities as necessary and sufficient
conditions for minimum problems on
convex sets) 364
Theorem 46.B (The Ritz method for minimum problems on
convex sets) 368
Theorem 46.C (Solution of minimum problems and variational
inequalities on convex sets by means of a
projected gradient method) 369
Theorem 46.D (Convergence of the penalty functional method) 371
Theorem 46.E (Regularization of linear operator equations) 373
Theorem 46.F (Regularization of nonlinear operator equations) 375
Theorem 47.A (Existence of subgradients) 387
Theorem 47.B (Sum rule for subgradients) 389
Theorem 47.C (Main theorem of convex optimization) 391
Theorem 47.D (Main theorem of convex approximation theory) 392
Theorem 47.E (Generalized Kuhn-Tucker theory for side
conditions in the form of inequalities; saddle points,
Lagrange multipliers, subgradients, variational
inequalities) 394
Theorem 47.F (Maximal monotonicity of the subgradient,
characterization of maximal cyclic monotone
mappings) 397
Theorem 48.A (Main theorem on necessary and sufficient
extremal conditions for minimum problems with
general side conditions) 414
Theorem 48.B (Necessary and sufficient extremal conditions for
minimum problems with operator equations and
inequalities as side conditions) 417
Theorem 48.C (Pontrjagin maximum principle) 424
Theorem 48.D (The Euler, Legendre, and Weierstrass necessary
conditions for classical variational problems as a
consequence of the Pontrjagin maximum
principle) 434
Theorem 49.A (Existence of saddle points) 459
Theorem 49.B (Main theorem of general duality theory, Lagrange
functions, and their saddle points) 460
Theorem 50.A (Minimum problems with operator inequalities as
side conditions and duality) 480
Theorem 51.A (Properties of conjugate functionals, F**=> F) 494
Theorem 51.B (Duality propositions for monotone potential
operators) 500
Theorem 52.A (Duality propositions of Rockafellar type for
stable problems) 514
Theorem 52.B (Duality propositions of Fenchel type) 518
Theorem 52.C (Main theorem for linear optimization problems in
locally convex spaces, duality, and stability) 520
Theorem 52.D (Bellman differential equation and duality in non-
convex control problems) 524
Theorem 54.A (Main theorem for elliptic variational inequalities
with pseudomonotone operators) 552
Theorem 54.B (Maximum principle for control problems with
linear operator equations as control equations) 557
Theorem 54.C (Semigroups and control problems with evolution
equations as control equations) 561
Theorem 55.A (Multivalued inhomogeneous evolution equations
of the first order in H-spaces and maximal
monotone operators) 571
Theorem 56.A (Multivalued inhomogeneous evolution equations
of the second order in H-spaces and maximal
monotone operators) 578
Theorem 57.A (Multivalued inhomogeneous evolution equations
of the first order in B-spaces and m-accretive
operators) 585
Theorem 57.B (Nonexpansive semigroups in B-spaces) 593
List of the Most Important Definitions
Weak convergence 148
Weak sequentially continuous functional 149
Weak sequentially lower semicontinuous functional 149
Lower semicontinuous and upper semicontinuous functionals 150
Lower semicompact functional 150
Palais-Smale condition 161, 321
n-th variation, G-derivative, F-derivative 191
Potential operator 234
Convex set 245
Convex and concave functional 245
Cone and dual cone 408
Minimum and maximum (local, free, bound) 5, 193, 276
Saddle point 292
Saddle point with respect to a product set 458
Free extremum 193
Bound extremum 276
Critical point 291
Subgradient and subdifferential 385
Conjugate functional 489
A list of fundamental definitions of (compact, strongly continuous,
demicontinuous, monotone, pseudomonotone, etc.) operators can be found
in Section 27.5 of Part II.
Schematic Overviews
Interrelationship between various extremal problems 3
Convexity and extremal problems 169
Interrelationships between the classical necessary and sufficient
conditions for weak and strong minima in the calculus of
variations 204
Interrelationship between important properties of nonlinear operators (see
Figure 27.1 of Part II)
General References to the Literature
Theory of probability, stochastic processes, stochastic differential
equations 102
Approximation methods 140
Exercise collections and monographs with comprehensive
exercise sections 142
Functional analysis and extremal problems 166
Recent trends 166
For the history of the theory of extremal problems compare the literature
under the caption "classical works" in the individual sections of Chapter 37.
Auxiliary Means in the Appendix to Part I
Topological spaces
Moore-Smith sequence (M-S sequence)
Foundations of linear functional analysis in linear spaces, Banach
spaces, Hilbert spaces, and locally convex spaces
Auxiliary Means in Part II
Lebesgue integral
Integration by parts
Generalized derivatives and distributions
Lebesgue spaces and Sobolev spaces
Concrete Galerkin schemes for the Ritz method (e.g., the method of
finite elements)
Auxiliary Means in the Appendix of the Present
Part III
Convex sets in R^N and systems of inequalities
Dual pairs of locally convex spaces
Geometry of Banach spaces (convexity and smoothness properties
of the norm)
Index
The reader should also consult the detailed Contents of this volume on page xv, the
List of Theorems on page 643 and the indexes in Parts I and II. Important
properties of operators or functionals can be found under the catchword "operator."
absolute neighborhood extensors
(ANE) 345
absolute neighborhood retracts (ANR)
(see Part I)
abstract entropy principle 163
abstract Hammerstein equations 237,
337
accessory variational problem 200
admissible curve 283
admissible direction 414
Alaoglu-Bourbaki theorem 603
alternation theorem
abstract 177
concrete 74, 181
approximation
Chebyshev 73, 76, 180, 182, 184
Padé 74
rational 74
Yosida 570, 574
approximation theory
basic ideas of 58
compensation analysis and 58
control theory and 64
convex 392
duality and 172
Haar's uniqueness theorem 183
Kolmogorov's criterion, classical
183
Kolmogorov's criterion, generalized
449
linear 172,183
partial differential equations and
76
approximative method of
Arrow-Hurwicz 454, 483
ascent 135, 177, 183
basic ideas of 132
Bellman 86
combination 133
cutting hyperplanes 376
Dantzig 53
decomposition 139
duality 138
dynamic optimization 86
equivalent problems 133
feasible directions 376
Galerkin 281, 343, 472, 548
gradients 134, 252
iteration 483
least-squares 58
projected gradients 368
projection (see Ritz, Galerkin)
projection-iteration 483
regularization 69, 372, 375, 377
Remes (see ascent)
Ritz 134, 250, 367
simplicial algorithm (see Dantzig)
steepest descent (see gradients)
Trefftz 50, 502
Uzawa 454, 483
approximative methods, references
to the literature 140
at almost all points (see Part II)
Banach manifolds
in Banach spaces 284
general definition of 304
Lagrange multipliers and 286, 290
Banach space (B-space) (see Part I)
dual (see Part I)
locally uniform convex 603
reflexive (see Part I)
separable (see Part I)
uniform convex 604
Bang-Bang principle 90, 444, 450
basic ideas 4
Bellman's
differential equation 27, 87, 523
differential inequality 524
equation 85
optimization principle 85
bicharacteristics 217
bifurcation 279, 351, 358ff
bifurcation point (see Part I)
bilinear form (see Part II)
compact (see Part II)
nondegenerate 108
positive (see Part II)
strongly positive (see Part II)
symmetric (see Part II)
bipolar theorem 603
Bolza's problem 436
canonical formalism
classical 23
in control theory 423
infinite-dimensional 213
perturbation theory and 211
symplectic geometry and 211
Carathéodory
condition (see Part II)
representation theorem 600
catastrophe theory 115
category 347
Cauchy sequence (see Part I)
characteristic number (see Part I)
characteristics 215
Chebyshev approximation 73, 76,
180, 182
discrete 184
classical calculus of variations 20ff,
41, 43, 197, 203, 433
history of the 146ff, 189, 226, 230ff
caustic 218
codimension of a singularity 128
coercive
functional 247
operator (see Part II)
compact
bilinear form (see Part II)
operator (see Part I)
set (see Part I)
compactness principles 145
compensation analysis
deterministic 58
stochastic 61
concave functional 245
condition
Carathéodory (see Part II)
Jacobi 205
Legendre 205, 433
Palais-Smale 161, 321
Slater, classical 56
Slater, generalized 394, 414, 482,
519, 520
Weierstrass 433
Weierstrass-Erdmann 434
cone 408
cone, dual 408, 441
conjugate functionals
basic ideas of 487
definition of 489
duality and 487, 512
for bilinear forms 510
for differentiable functionals 508
for integral operators 510
geometric interpretation of 508
Orlicz spaces and 538
properties of 493, 508ff
Γ-regularization and 508
consistency and existence 53, 514
controllability 450
control problems
discrete 84, 537
dynamic optimization and 84
existence theorems for 442
linear 450
needle variations and 93, 565
nonconvex 521
Pontrjagin maximum principle and
422
quadratic 88, 559, 560, 562
Riccati equation and 27, 566
space ships and 437, 444
time-optimal 450
control problems with
Bang-Bang 90, 444, 450
elliptic differential equations 559
evolution equations 560
impulse control 576
ordinary differential equations 89,
92, 93, 422, 433, 437, 450
parabolic differential equations
562
partial differential equations 96,
559, 560
regulators 88, 561, 566
stochastic influences 97
stopping time 566
synthesis 88, 92, 561, 566
convergence
norm (see Part I)
of M-S sequences (see Part I)
strong 148
weak* 148
convex
analysis 379
approximation theory 392
function 15, 246, 264
functional 245, 380
hull (see Part I)
optimization 55, 390
set 245
sets, properties in R^N 600
convexity
principles 169
properties of the norm 603
critical level of a functional 316
critical points
basic definition of 291
characterization of 292
elementary meaning of 14, 18
existence of 102, 105, 316, 324,
326, 339, 340ff, 465ff
free 292
in R 14
in R^N 18
Lagrange multipliers and 292
Ljusternik-Schnirelman theory and
102, 316, 324, 326, 339, 469ff
Morse theory and 105, 343
necessary and sufficient conditions
for 292
nonlinear differential equations and
466ff
Palais-Smale condition and the
existence of 161, 324
of a function 14, 18
of a functional 291
of a smooth mapping 116
periodic solutions and 466, 475
stable 341
critical value
of a functional 316
of a smooth mapping 116
cyclic monotone 396
deformation 346
derivative 191
diffeomorphism (see Part I)
differential 192
on manifolds 287
differential equations
Bellman 27, 87, 523, 524
conservation laws 594
control problems with ordinary
422, 435
control problems with partial 96,
559, 561, 562, 565ff
eigenvalue problems 342, 360, 468,
469
elliptic 41, 43, 50, 76, 255, 336,
342, 360, 466, 468, 469, 475, 502,
506, 545, 559
Euler 22, 41, 197, 209
Euler-Lagrange 34, 43, 299, 300
Hamilton-Jacobi 20, 27ff, 38, 215,
537
Helmholtz 271
Huygens' principle for hyperbolic
214, 219ff
hyperbolic 213ff, 466, 476
Korteweg-de Vries equation 213
ordinary 20ff, 422ff
parabolic 562
periodic solutions of 466, 475
wave equation 213ff
with strong nonlinearities 545
differential inequalities 44, 365, 524
directional derivative 191
dual
Banach space (see Part I)
cone 408, 441
pair 601
duality
approximation theory and 172
basic ideas of 7, 11, 138, 453
conjugate functionals and 487ff,
512ff
convex optimization and 56, 392,
482
elliptic differential equations 50,
502, 506
Fenchel-Rockafellar 518
gaps 139, 534
general principle of 460
in Banach spaces (see Part I)
in locally convex spaces 600
Kuhn-Tucker theory and 56, 392,
482
Lagrange functions and 454, 460
Lagrange multipliers and 460
linear optimization and 54, 463,
519
minimal surface problems and 533
monotone potential operators and
499
nonconvex control problems and
521
principle of Clarke-Ekeland 476
Ritz 50, 502
Trefftz 50, 502
duality mapping
in a Banach space 399
in a Hilbert space (see Part II)
dynamic optimization
continuous 86
discrete 84
effective domain of definition 380
eigenvalue problems
abstract 278, 279, 335
bifurcation 279, 351
elliptic differential equations 43,
299, 336, 342ff
existence of several eigenvectors
324, 335
Hammerstein integral equations
240, 342
Hammerstein operator equations
337
in R^N 17, 335
Ljusternik-Schnirelman theory and
102, 324, 341ff
eikonal 27
elementary catastrophes 123, 125
embedding (see Part II)
epigraph 380
equation
Bellman 27, 85, 87, 523, 524
canonical 23, 211, 213, 214
eikonal 27
Euler 22, 41, 197, 209
Euler-Lagrange 34, 43
generalized canonical 213, 423
generalized Euler 193ff, 229, 386
generalized Euler-Lagrange 290,
414
Hamilton-Jacobi 20, 27ff, 38, 215,
537
Hamilton-Jacobi-Bellman (see
Bellman)
Helmholtz 217
Jacobi 201
Korteweg-de Vries 213
equilibrium
Nash 49 (see also Part IV)
Walras 49 (see also Part IV)
equivalent
mappings 121
norms (see Part I)
ergodic theory 225
error estimates, two-sided
approximation theory and 173
basic idea of 138
duality and 138, 461
monotone potential operators and
500
nonconvex control problems and
524
Ritz method and 50, 500, 504
Trefftz method and 50, 504
Euler equation (see equation)
Euler quotations 230
evolution
equations 560, 581
triple (see Part II)
variational inequalities 568, 577
exercise collections, references to 142
existence principles
basic ideas of 7, 145, 168
compactness and 145ff, 151, 152,
154, 161, 232
convexity and 168ff, 172
history of 146
Palais-Smale condition and 161,
324
existence principles for
bifurcation points 279, 352
critical points 102, 105, 316ff, 324,
326, 339, 340ff, 465ff
eigenvectors 278, 279, 314, 324,
326, 335
maxima (see minima)
minima 151, 153, 154, 161, 232
minimal surface problems 530ff
optimal control problems 442
saddle points 457, 466, 467
variational problems 255, 266
extremal
of a variational problem 196
principles, fundamental 145ff,
168ff
problems 4
problems and operator equations
9, 229ff, 233
relations 453, 461
extreme points 157, 182
extremum 5
Fan's inequality 50 (see also Part
IV)
Farkas' lemma 441, 484
F-derivative 192
F-differential 192
Fermat's principle 21
Feynman integral 225
field theory 195, 207, 210
filtering of stochastic processes 98
finite elements (see Part II)
Finsler manifolds 345
Fourier integral operators 224
Fourier series 61
Fréchet-derivative (see F-derivative)
Fredholm mappings (see Part I)
free convex minimum problems 250,
252
free critical point 292
free local minimum 193
function
action (S-function)
generalized Lagrange 460
Hamilton (H-function) 23
k-determined 130
Lagrange 16, 21, 43, 54, 55
Morse 107
Pontrjagin (ℋ-function)
structurally stable 122
Weierstrass (E-function) 204
Young 538
functional (see operator)
Galerkin method
eigenvalue problems and 281
general (see Part II)
Ljusternik-Schnirelman theory and
343, 472
Galerkin scheme (see Part II)
game theory 47
Gâteaux-derivative (see G-derivative)
gauge field theory 226
G-derivative 191
generalized
canonical equations 213, 423
Euler equations 193ff, 229, 386
Euler-Lagrange equations 290,
414
gradients of locally Lipschitz
continuous functionals 403
implicit function theorem 286
inner products on Banach spaces
582
Kolmogorov criterion 449
Kuhn-Tucker theory 392, 416,
448, 479
Lagrange function 454, 460
Lagrange multipliers 290, 414,
460
linear optimization 157, 463, 519
Ljusternik-Schnirelman theory
340, 469
Morse theory 343
problem in geometrical optics 525
Slater condition 394, 414, 482,
519, 520
generalized solutions of
abstract minimum problems 158,
403, 449
control problems 443
convex regularization and 268
linear equations 67
minimal surface problems 533
partial differential equations (see
Part II)
variational problems 267, 268
generic properties 120, 534
genus of symmetric sets 319
geodesics 37, 112ff
geometrical optics 24, 27, 215, 216,
217, 218, 525
geometric functional analysis 168,
170, 185
germs of functions 127
gradient method 134, 252, 265
Haar determinant 178
Haar's uniqueness theorem 183
Hahn-Banach theorem 170
Hamiltonian systems 475
Hammerstein integral equations 239
with strong nonlinearities 542
Helly's intersection theorem 600
Hestenes' theorem 202
Hilbert space (H-space) (see Part I)
Hölder's inequality (see Part II)
homeomorphism (see Part I)
hull (see Part I)
closed
convex
linear
Huygens' principle 214, 219
weak 220
immersions 116, 286
indefinite variational problems 342,
466
index theorem for geodesics 112
indicator function 381
inequality of
Fan 50 (see also Part IV)
Farkas 484
Gårding 201
Hölder (see Part II)
Tucker 484
inequalities, systems of 600
infimum 5
information theory 294, 307
integrable dynamical systems 212
infinite-dimensional 214
integration by parts (see Part II)
interpolation property of subspaces
175
inverse scattering theory 213
Kadec-Troyanski theorem 604
Kalman-Bucy filter 99, 565
Kolmogorov's criterion 183, 449
Krasnoselskii's bifurcation theorem
352
Krein's extension theorem 171
Krein-Milman theorem 157
Kuhn-Tucker theory
classical 55
generalized 392, 416, 448, 479
generalized gradients and 404
simple sufficient rule 404
Lagrange functions
calculus of variations and 21, 34
conjugate functional and 496
convex optimization 55
duality and 454, 460
generalized 460
in R^N 16
linear optimization and 55
Lagrange multipliers
basic ideas of 16, 274
constraining forces as 272
convex optimization 56, 393, 480
critical points and 291
eigenvalue problems and 43, 278,
279, 324, 335
generalized 290, 414, 446
in R^N 16, 293
linear optimization and 54, 463,
519
nondegeneracy conditions for 9,
17, 273
Pontrjagin maximum principle and
429, 446
temperature as 296
variational problems and 34, 36,
43, 299, 300
Lax pair 214
least-squares method 58
Lebesgue integral (see Part II)
Lebesgue space (see Part II)
Legendre condition 205, 433
Legendre transformation
classical 23
conjugate functionals and 491
linear space (see Part I)
linking principle 317, 469, 471
list of the
abbreviations 637
auxiliary means 648
general references to the literature
648
most important definitions 647
symbols 437
theorems 643
Ljapunov's theorem on vector measures
444
Ljusternik's theorems 104, 105, 286,
290
Ljusternik-Schnirelman theory
approach by genus 319
approach by category 347
basic ideas of 316
classical results of 102
constrained case and eigenvalue
problems 324
free (unconstrained) case 339,
465ff
Hammerstein equations and 337,
339, 342
indefinite problems and 342
linking principle and 317, 469,
471
monotone operators and 328
mountain pass theorem and 339
on Banach manifolds 349
on Finsler manifolds 349
on general level sets 469
on topological spaces 345
periodic solutions and 466, 475
partial differential equations and
336, 342, 466ff
perturbed problems and 341
locally convex spaces (see Part I)
lower semicompact functionals
criterion for 510
definition of 150
lower semicontinuous functionals
150, 380
L-S deformations (Ljusternik-
Schnirelman deformations)
definition of 316
existence of 468
Mackey-Arens theorem 602
Mackey topology 603
manifolds in a Banach space 284
mapping (see operator)
Maslov WKB method 223
mathematical economics 50, 475,
549 (see also Part IV)
maximum (see minimum)
maximum principle
classical 25
Pontrjagin 422, 557
maximum-minimum principle
Courant 102, 314
Ljusternik 316
Mayer's problem 437
measurable function (see Part II)
minimal of variational problems
strong 197
weak 197
minimum
bound 276
free 244
strong 197
weak 197
minimal
point 5
sequence 232
surface problem 209, 529
value 5
minimax theorem 459
min-sup problem 6, 474
moon landing 444
Morse
function 107
index, classical 108
index, generalized (see Part V)
inequalities 107, 111, 343
lemma 110, 343
Morse-Sard theorem
classical 344
generalized 344
Morse theory
classical 105
generalized 343
topology and (see Part V)
mountain pass theorem 339
M-S sequence (Moore-Smith
sequence) (see Part I)
Nash equilibrium point 48
necessary conditions for extrema
basic ideas of 8
classical maximum principle 25
differentiable functionals 193ff
Dubovickii-Miljutin theory 414
dynamic optimization 85, 87
Euler equation, abstract 193ff, 229,
386
Euler equation, variational problems
22, 41, 197, 209
Euler-Lagrange equation, abstract
290, 414
Euler-Lagrange equation,
variational problems 34, 43
functionals on convex sets 363
general side conditions 414
in ℝ 13
in ℝ^N 16, 195, 294
Kuhn-Tucker theory 56, 394, 417,
480, 482
Lagrange multipliers 17, 34, 290,
292, 294
Legendre condition 205, 433
Pontrjagin maximum principle 422
subgradient condition 386
variational inequalities 363
Weierstrass condition 433
Weierstrass-Erdmann corner
neighborhood (see Part I)
Nemyckii operator (see Part II)
Noether's theorem 33, 226
nonexpansive semigroups 593
norm (see Part I)
normality of control problems 450
normal solution 66
nonconvex
control problems 521
minimal surface problems 533
problems and convex regularization
268
variational problems 267, 268
nonexpansive semigroups
characterization of 596
construction of 593
Crandall-Pazy theory for nonlinear
596
definition of 593
Hille-Yosida theory for linear 596
invariant sets for 598
obstacle problem and regularization
563
operator (functional, mapping)
accretive 583
adjoint (see Part II)
bijective (see List of Symbols)
bilinear (see Part I)
bounded (see Part II)
coercive (functional) 247
coercive (operator) (see Part II)
compact (see Part II)
concave 245
condition (P), (S), (S)+, (S)0 (see
Part II)
condition A2,A2 539
conjugate 489
continuous (see Part I)
contractive (see Part I)
convex 245, 380
cyclic monotone 396
demicontinuous (see Part II)
dual (see Part I)
equicontinuous (see Part I)
F-differentiable 192
Fredholm (see Part I)
G-differentiable 192
linear (see Part I)
linear bounded (see Part I)
Lipschitz-continuous (see Part
II)
locally Lipschitz-continuous (see
Part II)
lower semicontinuous 150, 380
lower semicompact 150
m-accretive 583
maximal cyclic monotone 396
maximal monotone 396
measurable (see Part II)
monotone (see Part II)
monotone potential 249
nonexpansive (see Part II)
potential 234
proper (see Part I)
pseudomonotone (see Part II)
quasiconcave 150
quasiconvex 150
self-adjoint (see Part I)
sequentially lower semicontinuous
149
strictly convex 245
strictly monotone (see Part II)
surjective (see List of Symbols)
symmetric (see Part II)
uniformly continuous (see Part II)
upper semicontinuous 150
weak sequentially continuous 149
weak sequentially lower continuous
149
weak* sequentially lower continuous
149
weakly coercive (functional) 247
optimal quadrature formulas 80
optimization problems
approximative methods in ℝ^N for
376
convex 55, 390, 392, 416, 479
dynamic 84
linear 51, 157, 463, 485
nonconvex 521
Pareto 48
Pontrjagin maximum principle for
422
Orlicz
class 540
space 540
orthonormal system (see Part I)
Palais-Smale condition 161, 321
parameter identification 71
Pareto optimization 48
penalty method 137, 370
periodic solutions 466, 475
perturbation theory 137, 211
perturbed
bifurcation and catastrophe theory
128
minimum problems and Rockafellar
theory 513
variational problems and
Hamilton-Jacobi theory 26
polar 603
Pontrjagin maximum principle
abstract version of 557
alternative proofs of 446, 449
basic idea of 93
Bolza's problem and 436
calculus of variations and 433
discrete 537
elementary application of 89
elementary provable special case
of 93
Lagrange multipliers and 429,
446
main theorem of 422
Mayer's problem and 437
modifications of 435
nonsmooth version of 449
proof of 426
spaceships and 437ff, 444, 445
variational inequalities and 429,
557
with operator equations as control
557
with phase restrictions 437
potential operators 229
criteria for 234
definition of 234
properties of monotone 9, 249
problem
ill-posed 67
well-posed 67
product space (see Part I)
projected gradient method 368
projection
operator (see Part II)
on convex sets 366
pseudoinverses 66
quadratic
control problems 88, 558, 559,
560, 561, 566
elliptic variational inequalities 364,
552, 553
evolution variational inequalities
572, 578
forms 108, 209, 235
variational inequalities, concrete
44, 365, 559, 561, 562
variational problems, abstract 155,
200, 314, 364
variational problems, concrete 27,
41, 43, 44, 50, 365, 502
quantum field theory 226
quasiconcave functionals 150
quasiconvex functionals 150
quasisolutions of minimum problems
158, 449
quasivariational inequalities 10, 549,
566 (see also Part IV)
quotient theorem 303
recent trends
references to the literature of 166
reference 177
reflexive Banach space (see Part I)
region (see Part I)
regular descent direction 414
regular point of a set 283
regularization
iteration and 377
linear 372ff, 377
nonlinear 375ff
of functionals 508, 574
of generalized solutions 267, 269,
563
Tihonov 69
regulator problem 88, 561, 565, 566
Remes algorithms 135, 183
resolvent (see Part I)
of maximal monotone operators
569
return of a spaceship to earth 437
Riccati differential equation 27, 566
Ritz's method 134, 250, 367
Rockafellar's stability principle 512,
514
saddle points
convex optimization and 56, 393,
480
definition of 292
duality and 453, 460
existence of 457, 466, 467
game theory and 47
in ℝ^N 18
linear optimization and 54, 463,
519
monotone operators and 467, 476
with respect to a product set 458
second variation
convexity and 247
definition of 191
eigenvalue criteria for minima and
200
necessary conditions and 10, 194
sufficient conditions and 10, 194,
200
theorem of Hestenes and 202
semigroups
control problems and 560, 565
nonexpansive 593, 596, 598
seminorm 170
separation of convex sets 171
set
closed (see Part I)
compact (see Part I)
convex 245
dense (see Part II)
measurable (see Part II)
open (see Part I)
relatively compact (see Part I)
symmetric 317
weak sequentially closed 151
weak sequentially compact 154
side conditions in the form of
convex sets 363
equations 34, 446, 479
equations in ℝ^N 293, 335
evolution equations 560
general relations 413
inequalities and equalities 55, 392,
416, 448, 479, 482
integrals 34, 299
linear differential equations 450,
559, 562
linear inequalities 51, 463, 519
nonlinear ordinary differential
equations 34, 422
nonlinear partial differential
equations 300
operator equations 417, 479, 556
smooth equations 277, 290, 324
simplicial algorithm 53
singularities of smooth mappings 116
Sobolev's embedding theorems (see
Part II)
Sobolev spaces (see Part II)
Sobolev-Orlicz spaces 544
solitons 213
splines 79, 498
stability
of perturbed minimum problems
513
structural 122
stable deformations 127
statistical physics 296
stochastic interpretation of
control problems 443
variational problems 267
stochastic processes
control of 97, 549, 565
compensation analysis for 63
definition of 63
filtering of 98
prognosis of 98
strongly convex functional 264
subdifferential 385
subgradients
basic ideas of 379, 487
chain rule for 403
definition of 385
existence of 387
extremal principle and 386
G-derivative and 387
in ℝ 16
properties of 379ff, 386, 387, 388,
396, 399, 403, 487ff, 490, 493
sum rule for 388
Young's inequality and 490
submersions 116, 286
sufficient conditions for minima
accessory problems 200
basic ideas of 10
convex functionals 246
differentiable functionals 193
duality 138, 461
Dubovickii-Miljutin theory 414
dynamic optimization 85, 87
eigenvalue criteria 200
field theory 195, 207, 210
general side conditions 414
in ℝ^N 17, 195, 294
Jacobi condition 205
Kuhn-Tucker theory 56, 394, 417,
480, 482
Lagrange multipliers 17, 36, 290,
294
linear optimization 54, 464, 520
nonconvex control problems 524
second variation 193, 195, 200,
290, 294
2n-th variation 193, 195, 290, 294
simple rule for nonconvex problems
with inequalities as side
conditions 404
variational inequalities 363
sum rule 388
support functional 385
supremum 5
surjectivity theorem 303
synthesis problem 88, 92, 561, 566
tangent
bundle 306
plane 283
space 283, 305
vector 283
tangential
cones 449
direction 415
mapping 287, 306
theory of
Crandall-Pazy 596
Dubovickii-Miljutin 407
Hille-Yosida 596
Kolmogorov-Arnold-Moser 212
Kolmogorov-Wiener 98
Kuhn-Tucker 55, 392, 416, 448,
479
Ljusternik-Schnirelman 102, 313,
340, 469
Morse 105, 343
Rockafellar 514
Thom's classification theorem 126
Tihonov regularization 69
topological
index 317, 346
space (see Part I)
transversality
of manifolds and mappings 117
condition 32
Tuy's inconsistency theorem 468
unfolding of mappings 123
uniqueness principles
strict convexity 152
interpolation property 175
Uzawa's algorithm 454, 483
variation 191
variational inequalities
basic ideas of 363, 547
bifurcation problems for (see Part
IV)
coercive 552
control problems and 549, 556
differential inequalities and 44,
365, 524
elliptic 551
evolution 568, 577
in ℝ 13
Kuhn-Tucker theory and 56, 394,
417
mathematical physics and 45 (see
also Part IV)
maximal monotone operators and
548ff
Pontrjagin maximum principle and
429, 557
quadratic 364, 552, 553, 565, 572,
578
regularity of solutions of 549
semicoercive 553, 564
(see also quasivariational
inequalities)
variational problems
as control problems 433
basic ideas of classical 20ff, 203ff
duality for 50, 502
existence theorems for 255, 266
field theory 195, 207, 210
generalized solutions of nonconvex
267, 268
multi-dimensional 41, 43, 50,
196ff, 255ff, 266, 299, 300, 502
necessary conditions for 22, 41,
43, 197, 209, 433
one-dimensional 20ff, 203ff, 433
sufficient conditions for 200ff, 207,
210
Walras equilibrium 49 (see also Part
IV)
warehouse maintenance 97
weak sequentially lower
semicontinuous functionals
criteria for 235
definition of 149
Weierstrass
counterexample of 165
E-function of 204
existence theorem, classical 145
existence theorem, generalized 151,
152, 154, 232
necessary condition 232
Wiener integral 215
Yosida approximation 570, 575
Young's inequality
classical 488
generalized 490