Lipschutz-Lipson: Schaum's Outline of Theory and Problems of Linear Algebra, 3/e | Front Matter | Preface | ©The McGraw-Hill Companies, 2004
Linear algebra has in recent years become an essential part of the mathematical background required by
mathematicians and mathematics teachers, engineers, computer scientists, physicists, economists, and
statisticians, among others. This requirement reflects the importance and wide applications of the subject matter.
This book is designed for use as a textbook for a formal course in linear algebra or as a supplement to all
current standard texts. It aims to present an introduction to linear algebra which will be found helpful to all
readers regardless of their fields of specialization. More material has been included than can be covered in most
first courses. This has been done to make the book more flexible, to provide a useful book of reference, and to
stimulate further interest in the subject.
Each chapter begins with clear statements of pertinent definitions, principles and theorems together with
illustrative and other descriptive material. This is followed by graded sets of solved and supplementary
problems. The solved problems serve to illustrate and amplify the theory, and to provide the repetition of basic
principles so vital to effective learning. Numerous proofs, especially those of all essential theorems, are included
among the solved problems. The supplementary problems serve as a complete review of the material of each
chapter.
The first three chapters treat vectors in Euclidean space, matrix algebra, and systems of linear equations.
These chapters provide the motivation and basic computational tools for the abstract investigation of vector
spaces and linear mappings which follow. After chapters on inner product spaces and orthogonality and on
determinants, there is a detailed discussion of eigenvalues and eigenvectors giving conditions for representing a
linear operator by a diagonal matrix. This naturally leads to the study of various canonical forms, specifically,
the triangular, Jordan, and rational canonical forms. Later chapters cover linear functions and the dual space V*,
and bilinear, quadratic and Hermitian forms. The last chapter treats linear operators on inner product spaces. For
completeness, there is an appendix on polynomials over a field.
The main changes in the third edition have been for pedagogical reasons rather than in content. Specifically,
the abstract notion of a linear map and its matrix representation appears before and motivates the study of
eigenvalues and eigenvectors and the diagonalization of matrices (under similarity). There are also many
additional solved and supplementary problems.
Finally, we wish to thank the staff of the McGraw-Hill Schaum's Outline Series, especially Barbara Gilson,
for their unfailing cooperation.
Seymour Lipschutz
Marc Lars Lipson
CHAPTER 1
Vectors in R^n and C^n, Spatial Vectors
1.1 INTRODUCTION
There are two ways to motivate the notion of a vector: one is by means of lists of numbers and
subscripts, and the other is by means of certain objects in physics. We discuss these two ways below.
Here we assume the reader is familiar with the elementary properties of the field of real numbers,
denoted by R. On the other hand, we will review properties of the field of complex numbers, denoted by C.
In the context of vectors, the elements of our number fields are called scalars.
Although we will restrict ourselves in this chapter to vectors whose elements come from R and then
from C, many of our operations also apply to vectors whose entries come from some arbitrary field K.
Lists of Numbers
Suppose the weights (in pounds) of eight students are listed as follows:
156, 125, 145, 134, 178, 145, 162, 193
One can denote all the values in the list using only one symbol, say w, but with different subscripts; that is,
$w_1, w_2, w_3, w_4, w_5, w_6, w_7, w_8$
Observe that each subscript denotes the position of the value in the list. For example,
$w_1 = 156$, the first number, $w_2 = 125$, the second number, ...
Such a list of values,
$w = (w_1, w_2, w_3, \ldots, w_8)$
is called a linear array or vector.
Vectors in Physics
Many physical quantities, such as temperature and speed, possess only "magnitude". These quantities
can be represented by real numbers and are called scalars. On the other hand, there are also quantities, such
as force and velocity, that possess both "magnitude" and "direction". These quantities, which can be
represented by arrows having appropriate lengths and directions and emanating from some given reference
point O, are called vectors.
Now we assume the reader is familiar with the space $\mathbf{R}^3$ where all the points in space are represented
by ordered triples of real numbers. Suppose the origin of the axes in $\mathbf{R}^3$ is chosen as the reference point O
for the vectors discussed above. Then every vector is uniquely determined by the coordinates of its
endpoint, and vice versa.
There are two important operations, vector addition and scalar multiplication, that are associated with
vectors in physics. The definition of these operations and the relationship between these operations and the
endpoints of the vectors are as follows.
(i) Vector Addition: The resultant $u + v$ of two vectors u and v is obtained by the so-called
parallelogram law; that is, $u + v$ is the diagonal of the parallelogram formed by u and v.
Furthermore, if $(a, b, c)$ and $(a', b', c')$ are the endpoints of the vectors u and v, then
$(a + a', b + b', c + c')$ is the endpoint of the vector $u + v$. These properties are pictured in Fig. 1-1(a).
[Fig. 1-1: (a) Vector Addition; (b) Scalar Multiplication]
(ii) Scalar Multiplication: The product $ku$ of a vector u by a real number k is obtained by multiplying
the magnitude of u by k and retaining the same direction if $k > 0$ or the opposite direction if $k < 0$.
Also, if $(a, b, c)$ is the endpoint of the vector u, then $(ka, kb, kc)$ is the endpoint of the vector $ku$.
These properties are pictured in Fig. 1-1(b).
Mathematically, we identify the vector u with its endpoint $(a, b, c)$ and write $u = (a, b, c)$. Moreover, we call the
ordered triple $(a, b, c)$ of real numbers a point or vector depending upon its interpretation. We generalize
this notion and call an n-tuple $(a_1, a_2, \ldots, a_n)$ of real numbers a vector. However, special notation may be
used for the vectors in $\mathbf{R}^3$, called spatial vectors (Section 1.6).
1.2 VECTORS IN R^n
The set of all n-tuples of real numbers, denoted by $\mathbf{R}^n$, is called n-space. A particular n-tuple in $\mathbf{R}^n$, say
$u = (a_1, a_2, \ldots, a_n)$
is called a point or vector. The numbers $a_i$ are called the coordinates, components, entries, or elements of u.
Moreover, when discussing the space $\mathbf{R}^n$, we use the term scalar for the elements of R.
Two vectors, u and v, are equal, written $u = v$, if they have the same number of components and if the
corresponding components are equal. Although the vectors $(1, 2, 3)$ and $(2, 3, 1)$ contain the same three
numbers, these vectors are not equal since corresponding entries are not equal.
The vector $(0, 0, \ldots, 0)$ whose entries are all 0 is called the zero vector, and is usually denoted by 0.
Example 1.1
(a) The following are vectors:
$(2, -5)$, $(7, 9)$, $(0, 0, 0)$, $(3, 4, 5)$
The first two vectors belong to $\mathbf{R}^2$ whereas the last two belong to $\mathbf{R}^3$. The third is the zero vector in $\mathbf{R}^3$.
(b) Find x, y, z such that $(x - y, x + y, z - 1) = (4, 2, 3)$.
By definition of equality of vectors, corresponding entries must be equal. Thus,
$x - y = 4$, $x + y = 2$, $z - 1 = 3$
Solving the above system of equations yields $x = 3$, $y = -1$, $z = 4$.
Column Vectors
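Sometimes a vector in n-space $\mathbf{R}^n$ is written vertically, rather than horizontally. Such a vector is called
a column vector, and, in this context, the above horizontally written vectors are called row vectors. For
example, the following are column vectors with 2, 2, 3, and 3 components, respectively:
$$\begin{bmatrix} 1 \\ 2 \end{bmatrix}, \quad \begin{bmatrix} 3 \\ -4 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 5 \\ -6 \end{bmatrix}, \quad \begin{bmatrix} 1.5 \\ \tfrac{2}{3} \\ -15 \end{bmatrix}$$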
We also note that any operation defined for row vectors is defined analogously for column vectors.
1.3 VECTOR ADDITION AND SCALAR MULTIPLICATION
Consider two vectors u and v in $\mathbf{R}^n$, say
$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$
Their sum, written $u + v$, is the vector obtained by adding corresponding components from u and v. That is,
$u + v = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$
The scalar product or, simply, product, of the vector u by a real number k, written $ku$, is the vector obtained
by multiplying each component of u by k. That is,
$ku = k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$
Observe that $u + v$ and $ku$ are also vectors in $\mathbf{R}^n$. The sum of vectors with different numbers of components
is not defined.
Negatives and subtraction are defined in $\mathbf{R}^n$ as follows:
$-u = (-1)u$ and $u - v = u + (-v)$
The vector $-u$ is called the negative of u, and $u - v$ is called the difference of u and v.
Now suppose we are given vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ and scalars $k_1, k_2, \ldots, k_m$ in R. We can
multiply the vectors by the corresponding scalars and then add the resultant scalar products to form the
vector
$v = k_1 u_1 + k_2 u_2 + \cdots + k_m u_m$
Such a vector v is called a linear combination of the vectors $u_1, u_2, \ldots, u_m$.
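The componentwise definitions of this section translate directly into code. The following is a minimal sketch, assuming vectors are stored as Python tuples; the names `add`, `scale`, and `linear_combination` are illustrative choices and do not come from the text.

```python
from typing import Sequence, Tuple

Vector = Tuple[float, ...]


def add(u: Sequence[float], v: Sequence[float]) -> Vector:
    """Componentwise sum u + v; not defined for different lengths."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return tuple(a + b for a, b in zip(u, v))


def scale(k: float, u: Sequence[float]) -> Vector:
    """Scalar product ku: multiply each component of u by k."""
    return tuple(k * a for a in u)


def linear_combination(scalars: Sequence[float],
                       vectors: Sequence[Sequence[float]]) -> Vector:
    """Return k_1 u_1 + k_2 u_2 + ... + k_m u_m."""
    if not vectors or len(scalars) != len(vectors):
        raise ValueError("need one scalar per vector and at least one vector")
    result = scale(scalars[0], vectors[0])
    for k, u in zip(scalars[1:], vectors[1:]):
        result = add(result, scale(k, u))
    return result


u = (2, 4, -5)
v = (1, -6, 9)
print(add(u, v))                            # (3, -2, 4)
print(scale(7, u))                          # (14, 28, -35)
print(linear_combination([3, -5], [u, v]))  # (1, 42, -60)
```

The subtraction $u - v$ needs no separate function, since it is the linear combination with scalars 1 and -1.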
Example 1.2
(a) Let $u = (2, 4, -5)$ and $v = (1, -6, 9)$. Then
$u + v = (2 + 1, 4 + (-6), -5 + 9) = (3, -2, 4)$
$7u = (7(2), 7(4), 7(-5)) = (14, 28, -35)$
$-v = (-1)(1, -6, 9) = (-1, 6, -9)$
$3u - 5v = (6, 12, -15) + (-5, 30, -45) = (1, 42, -60)$
(b) The zero vector $0 = (0, 0, \ldots, 0)$ in $\mathbf{R}^n$ is similar to the scalar 0 in that, for any vector $u = (a_1, a_2, \ldots, a_n)$,
$u + 0 = (a_1 + 0, a_2 + 0, \ldots, a_n + 0) = (a_1, a_2, \ldots, a_n) = u$
(c) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then
$$2u - 3v = \begin{bmatrix} 4 \\ 6 \\ -8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -5 \\ 9 \\ -2 \end{bmatrix}$$
Basic properties of vectors under the operations of vector addition and scalar multiplication are
described in the following theorem.
Theorem 1.1: For any vectors u, v, w in $\mathbf{R}^n$ and any scalars k, k' in R,
(i) $(u + v) + w = u + (v + w)$,
(ii) $u + 0 = u$,
(iii) $u + (-u) = 0$,
(iv) $u + v = v + u$,
(v) $k(u + v) = ku + kv$,
(vi) $(k + k')u = ku + k'u$,
(vii) $(kk')u = k(k'u)$,
(viii) $1u = u$.
We postpone the proof of Theorem 1.1 until Chapter 2, where it appears in the context of matrices
(Problem 2.3).
Suppose u and v are vectors in $\mathbf{R}^n$ for which $u = kv$ for some nonzero scalar k in R. Then u is called a
multiple of v. Also, u is said to be in the same or opposite direction as v according as $k > 0$ or $k < 0$.
1.4 DOT (INNER) PRODUCT
Consider arbitrary vectors u and v in $\mathbf{R}^n$; say,
$u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$
The dot product or inner product or scalar product of u and v is denoted and defined by
$u \cdot v = a_1 b_1 + a_2 b_2 + \cdots + a_n b_n$
That is, $u \cdot v$ is obtained by multiplying corresponding components and adding the resulting products. The
vectors u and v are said to be orthogonal (or perpendicular) if their dot product is zero, that is, if $u \cdot v = 0$.
Example 1.3
(a) Let $u = (1, -2, 3)$, $v = (4, 5, -1)$, $w = (2, 7, 4)$. Then:
$u \cdot v = 1(4) - 2(5) + 3(-1) = 4 - 10 - 3 = -9$
$u \cdot w = 2 - 14 + 12 = 0$, $v \cdot w = 8 + 35 - 4 = 39$
Thus u and w are orthogonal.
(b) Let $u = \begin{bmatrix} 2 \\ 3 \\ -4 \end{bmatrix}$ and $v = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Then $u \cdot v = 6 - 3 + 8 = 11$.
(c) Suppose $u = (1, 2, 3, 4)$ and $v = (6, k, -8, 2)$. Find k so that u and v are orthogonal.
First obtain $u \cdot v = 6 + 2k - 24 + 8 = -10 + 2k$. Then set $u \cdot v = 0$ and solve for k:
$-10 + 2k = 0$ or $2k = 10$ or $k = 5$
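The dot product and the orthogonality test can be sketched in a few lines. This is an illustrative sketch, not part of the text; the function names `dot` and `is_orthogonal` are assumptions, and the exact comparison with zero is only reliable for integer or rational components (floating-point data would need a tolerance).

```python
from typing import Sequence


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product: multiply corresponding components and add the products."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def is_orthogonal(u: Sequence[float], v: Sequence[float]) -> bool:
    """u and v are orthogonal when their dot product is zero."""
    return dot(u, v) == 0


u = (1, -2, 3)
v = (4, 5, -1)
w = (2, 7, 4)
print(dot(u, v))            # -9
print(dot(u, w))            # 0
print(dot(v, w))            # 39
print(is_orthogonal(u, w))  # True

# With k = 5, the vectors (1, 2, 3, 4) and (6, k, -8, 2) are orthogonal.
print(is_orthogonal((1, 2, 3, 4), (6, 5, -8, 2)))  # True
```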
Basic properties of the dot product in $\mathbf{R}^n$ (proved in Problem 1.13) follow.
Theorem 1.2: For any vectors u, v, w in $\mathbf{R}^n$ and any scalar k in R:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$,
(ii) $(ku) \cdot v = k(u \cdot v)$,
(iii) $u \cdot v = v \cdot u$,
(iv) $u \cdot u \geq 0$, and $u \cdot u = 0$ iff $u = 0$.
Note that (ii) says that we can "take k out" from the first position in an inner product. By (iii) and (ii),
u • (kv) = (kv) • u = k(v • u) = k(u • v)
That is, we can also "take k out" from the second position in an inner product.
The space $\mathbf{R}^n$ with the above operations of vector addition, scalar multiplication, and dot product is
usually called Euclidean n-space.
Norm (Length) of a Vector
The norm or length of a vector u in $\mathbf{R}^n$, denoted by $\|u\|$, is defined to be the nonnegative square root of
$u \cdot u$. In particular, if $u = (a_1, a_2, \ldots, a_n)$, then
$$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$
That is, $\|u\|$ is the square root of the sum of the squares of the components of u. Thus $\|u\| \geq 0$, and
$\|u\| = 0$ if and only if $u = 0$.
A vector u is called a unit vector if $\|u\| = 1$ or, equivalently, if $u \cdot u = 1$. For any nonzero vector v in
$\mathbf{R}^n$, the vector
$$\hat{v} = \frac{1}{\|v\|}\, v = \frac{v}{\|v\|}$$
is the unique unit vector in the same direction as v. The process of finding $\hat{v}$ from v is called normalizing v.
Example 1.4
(a) Suppose $u = (1, -2, -4, 5, 3)$. To find $\|u\|$, we can first find $\|u\|^2 = u \cdot u$ by squaring each component of u and
adding, as follows:
$\|u\|^2 = 1^2 + (-2)^2 + (-4)^2 + 5^2 + 3^2 = 1 + 4 + 16 + 25 + 9 = 55$
Then $\|u\| = \sqrt{55}$.
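(b) Let $v = (1, -3, 4, 2)$ and $w = \left(\tfrac{1}{2}, -\tfrac{1}{6}, \tfrac{5}{6}, \tfrac{1}{6}\right)$. Then
$$\|v\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30} \quad\text{and}\quad \|w\| = \sqrt{\tfrac{9}{36} + \tfrac{1}{36} + \tfrac{25}{36} + \tfrac{1}{36}} = \sqrt{\tfrac{36}{36}} = \sqrt{1} = 1$$
Thus w is a unit vector but v is not a unit vector. However, we can normalize v as follows:
$$\hat{v} = \frac{v}{\|v\|} = \left(\frac{1}{\sqrt{30}}, \frac{-3}{\sqrt{30}}, \frac{4}{\sqrt{30}}, \frac{2}{\sqrt{30}}\right)$$
This is the unique unit vector in the same direction as v.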
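As a numerical companion to the norm and to normalization, here is a minimal sketch using only the standard `math` module. The names `norm` and `normalize` are illustrative assumptions, and the results are floating-point approximations rather than exact radicals.

```python
import math
from typing import Sequence, Tuple


def norm(u: Sequence[float]) -> float:
    """Nonnegative square root of the sum of the squares of the components."""
    return math.sqrt(sum(a * a for a in u))


def normalize(v: Sequence[float]) -> Tuple[float, ...]:
    """Return the unit vector v / ||v|| in the same direction as v."""
    length = norm(v)
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return tuple(a / length for a in v)


u = (1, -2, -4, 5, 3)
print(norm(u) ** 2)       # approximately 55
v = (1, -3, 4, 2)
v_hat = normalize(v)
print(norm(v_hat))        # approximately 1
```

Because of rounding, a unit-vector check in floating point is better written with `math.isclose(norm(v_hat), 1.0)` than with an exact comparison.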
The following formula (proved in Problem 1.14) is known as the Schwarz inequality or Cauchy-
Schwarz inequality. It is used in many branches of mathematics.
Theorem 1.3 (Schwarz): For any vectors u, v in $\mathbf{R}^n$, $|u \cdot v| \leq \|u\| \|v\|$.
Using the above inequality, we also prove (Problem 1.15) the following result known as the "triangle
inequality" or Minkowski's inequality.
Theorem 1.4 (Minkowski): For any vectors u, v in $\mathbf{R}^n$, $\|u + v\| \leq \|u\| + \|v\|$.
Distance, Angles, Projections
The distance between vectors $u = (a_1, a_2, \ldots, a_n)$ and $v = (b_1, b_2, \ldots, b_n)$ in $\mathbf{R}^n$ is denoted and
defined by
$$d(u, v) = \|u - v\| = \sqrt{(a_1 - b_1)^2 + (a_2 - b_2)^2 + \cdots + (a_n - b_n)^2}$$
One can show that this definition agrees with the usual notion of distance in the Euclidean plane $\mathbf{R}^2$ or
space $\mathbf{R}^3$.
The angle $\theta$ between nonzero vectors u, v in $\mathbf{R}^n$ is defined by
$$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|}$$
This definition is well defined, since, by the Schwarz inequality (Theorem 1.3),
$$-1 \leq \frac{u \cdot v}{\|u\| \|v\|} \leq 1$$
Note that if $u \cdot v = 0$, then $\theta = 90^\circ$ (or $\theta = \pi/2$). This then agrees with our previous definition of
orthogonality.
The projection of a vector u onto a nonzero vector v is the vector denoted and defined by
$$\operatorname{proj}(u, v) = \frac{u \cdot v}{\|v\|^2}\, v$$
We show below that this agrees with the usual notion of vector projection in physics.
Example 1.5
(a) Suppose $u = (1, -2, 3)$ and $v = (2, 4, 5)$. Then
$$d(u, v) = \sqrt{(1 - 2)^2 + (-2 - 4)^2 + (3 - 5)^2} = \sqrt{1 + 36 + 4} = \sqrt{41}$$
To find $\cos\theta$, where $\theta$ is the angle between u and v, we first find
$u \cdot v = 2 - 8 + 15 = 9$, $\|u\|^2 = 1 + 4 + 9 = 14$, $\|v\|^2 = 4 + 16 + 25 = 45$
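Then
$$\cos\theta = \frac{u \cdot v}{\|u\| \|v\|} = \frac{9}{\sqrt{14}\sqrt{45}}$$
Also,
$$\operatorname{proj}(u, v) = \frac{u \cdot v}{\|v\|^2}\, v = \frac{9}{45}(2, 4, 5) = \frac{1}{5}(2, 4, 5) = \left(\frac{2}{5}, \frac{4}{5}, 1\right)$$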
(b) Consider the vectors u and v in Fig. 1-2(a) (with respective endpoints A and B). The (perpendicular) projection of
u onto v is the vector $u^*$ with magnitude
$$\|u^*\| = \|u\| \cos\theta = \|u\| \frac{u \cdot v}{\|u\| \|v\|} = \frac{u \cdot v}{\|v\|}$$
To obtain $u^*$, we multiply its magnitude by the unit vector in the direction of v, obtaining
$$u^* = \|u^*\| \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|} \frac{v}{\|v\|} = \frac{u \cdot v}{\|v\|^2}\, v$$
This is the same as the above definition of $\operatorname{proj}(u, v)$.
[Fig. 1-2: (a) Projection $u^*$ of u onto v; (b) the points $A$, $B(b_1, b_2, b_3)$, and $P(b_1 - a_1, b_2 - a_2, b_3 - a_3)$]
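The distance, angle, and projection formulas of this section can be checked numerically. The sketch below is illustrative only: the function names are assumptions, and `projection` uses `fractions.Fraction` so that integer inputs give exact rational components.

```python
import math
from fractions import Fraction
from typing import Sequence, Tuple


def dot(u: Sequence, v: Sequence):
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def distance(u: Sequence[float], v: Sequence[float]) -> float:
    """d(u, v) = ||u - v||."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cos_angle(u: Sequence[float], v: Sequence[float]) -> float:
    """cos(theta) = (u . v) / (||u|| ||v||) for nonzero u and v."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def projection(u: Sequence[int], v: Sequence[int]) -> Tuple[Fraction, ...]:
    """proj(u, v) = ((u . v) / ||v||^2) v, computed exactly with fractions."""
    if dot(v, v) == 0:
        raise ValueError("cannot project onto the zero vector")
    factor = Fraction(dot(u, v)) / Fraction(dot(v, v))
    return tuple(factor * b for b in v)


u = (1, -2, 3)
v = (2, 4, 5)
print(distance(u, v) ** 2)  # approximately 41
print(cos_angle(u, v))      # 9 / (sqrt(14) * sqrt(45)), about 0.3586
print(projection(u, v))     # (Fraction(2, 5), Fraction(4, 5), Fraction(1, 1))
```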
1.5 LOCATED VECTORS, HYPERPLANES, LINES, CURVES IN R^n
This section distinguishes between an n-tuple $P(a_i) = P(a_1, a_2, \ldots, a_n)$ viewed as a point in $\mathbf{R}^n$ and an
n-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin O to the point $C(c_1, c_2, \ldots, c_n)$.
Located Vectors
Any pair of points $A(a_i)$ and $B(b_i)$ in $\mathbf{R}^n$ defines the located vector or directed line segment from A to
B, written $\overrightarrow{AB}$. We identify $\overrightarrow{AB}$ with the vector
$u = B - A = [b_1 - a_1, b_2 - a_2, \ldots, b_n - a_n]$
since $\overrightarrow{AB}$ and u have the same magnitude and direction. This is pictured in Fig. 1-2(b) for the
points $A(a_1, a_2, a_3)$ and $B(b_1, b_2, b_3)$ in $\mathbf{R}^3$ and the vector $u = B - A$ which has the endpoint
$P(b_1 - a_1, b_2 - a_2, b_3 - a_3)$.
Hyperplanes
A hyperplane H in $\mathbf{R}^n$ is the set of points $(x_1, x_2, \ldots, x_n)$ which satisfy a linear equation
$a_1 x_1 + a_2 x_2 + \cdots + a_n x_n = b$
where the vector $u = [a_1, a_2, \ldots, a_n]$ of coefficients is not zero. Thus a hyperplane H in $\mathbf{R}^2$ is a line and a
hyperplane H in $\mathbf{R}^3$ is a plane. We show below, as pictured in Fig. 1-3(a) for $\mathbf{R}^3$, that u is orthogonal to
any directed line segment $\overrightarrow{PQ}$, where $P(p_i)$ and $Q(q_i)$ are points in H. [For this reason, we say that u is
normal to H and that H is normal to u.]
Since $P(p_i)$ and $Q(q_i)$ belong to H, they satisfy the above hyperplane equation, that is,
$a_1 p_1 + a_2 p_2 + \cdots + a_n p_n = b$ and $a_1 q_1 + a_2 q_2 + \cdots + a_n q_n = b$
Let
$v = \overrightarrow{PQ} = Q - P = [q_1 - p_1, q_2 - p_2, \ldots, q_n - p_n]$
Then
$u \cdot v = a_1(q_1 - p_1) + a_2(q_2 - p_2) + \cdots + a_n(q_n - p_n)$
$= (a_1 q_1 + a_2 q_2 + \cdots + a_n q_n) - (a_1 p_1 + a_2 p_2 + \cdots + a_n p_n) = b - b = 0$
Thus $v = \overrightarrow{PQ}$ is orthogonal to u, as claimed.
[Fig. 1-3: panels (a) and (b)]
Lines in R^n
The line L in $\mathbf{R}^n$ passing through the point $P(b_1, b_2, \ldots, b_n)$ and in the direction of a nonzero vector
$u = [a_1, a_2, \ldots, a_n]$ consists of the points $X(x_1, x_2, \ldots, x_n)$ that satisfy
$X = P + tu$, that is, $x_i = a_i t + b_i$ for $i = 1, 2, \ldots, n$
where the parameter t takes on all real values. Such a line L in $\mathbf{R}^3$ is pictured in Fig. 1-3(b).
Example 1.6
(a) Let H be the plane in $\mathbf{R}^3$ corresponding to the linear equation $2x - 5y + 7z = 4$. Observe that $P(1, 1, 1)$ and
$Q(5, 4, 2)$ are solutions of the equation. Thus P and Q and the directed line segment
$v = \overrightarrow{PQ} = Q - P = [5 - 1, 4 - 1, 2 - 1] = [4, 3, 1]$
lie on the plane H. The vector $u = [2, -5, 7]$ is normal to H, and, as expected,
$u \cdot v = [2, -5, 7] \cdot [4, 3, 1] = 8 - 15 + 7 = 0$
That is, u is orthogonal to v.
(b) Find an equation of the hyperplane H in $\mathbf{R}^4$ that passes through the point $P(1, 3, -4, 2)$ and is normal to the
vector $u = [4, -2, 5, 6]$.
The coefficients of the unknowns of an equation of H are the components of the normal vector u; hence the
equation of H must be of the form
$4x_1 - 2x_2 + 5x_3 + 6x_4 = k$
Substituting P into this equation, we obtain
$4(1) - 2(3) + 5(-4) + 6(2) = k$ or $4 - 6 - 20 + 12 = k$ or $k = -10$
Thus $4x_1 - 2x_2 + 5x_3 + 6x_4 = -10$ is the equation of H.
(c) Find the parametric representation of the line L in $\mathbf{R}^4$ passing through the point $P(1, 2, 3, -4)$ and in the direction
of $u = [5, 6, -7, 8]$. Also, find the point Q on L when $t = 1$.
Substitution in the above equation for L yields the following parametric representation:
$x_1 = 5t + 1$, $x_2 = 6t + 2$, $x_3 = -7t + 3$, $x_4 = 8t - 4$
or, equivalently,
$L(t) = (5t + 1, 6t + 2, -7t + 3, 8t - 4)$
Note that $t = 0$ yields the point P on L. Substitution of $t = 1$ yields the point $Q(6, 8, -4, 4)$ on L.
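The two computations in Example 1.6, finding the constant of a hyperplane from its normal and a point, and evaluating a parametric line, can be sketched as follows. The helper names `hyperplane_constant` and `point_on_line` are hypothetical and chosen only for this illustration.

```python
from typing import Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two vectors with the same number of components."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(a * b for a, b in zip(u, v))


def hyperplane_constant(normal: Sequence[float], point: Sequence[float]) -> float:
    """Right-hand side b of a_1 x_1 + ... + a_n x_n = b for the hyperplane
    with the given normal vector that passes through the given point."""
    return dot(normal, point)


def point_on_line(p: Sequence[float], u: Sequence[float], t: float) -> Tuple[float, ...]:
    """Point X = P + tu on the line through P in the direction u."""
    if len(p) != len(u):
        raise ValueError("point and direction must have the same length")
    return tuple(b + t * a for a, b in zip(u, p))


# Example 1.6(b): normal [4, -2, 5, 6] through P(1, 3, -4, 2).
print(hyperplane_constant((4, -2, 5, 6), (1, 3, -4, 2)))  # -10

# Example 1.6(c): line through P(1, 2, 3, -4) in the direction [5, 6, -7, 8].
print(point_on_line((1, 2, 3, -4), (5, 6, -7, 8), 0))  # (1, 2, 3, -4)
print(point_on_line((1, 2, 3, -4), (5, 6, -7, 8), 1))  # (6, 8, -4, 4)
```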
Curves in R^n
Let D be an interval (finite or infinite) on the real line R. A continuous function $F: D \to \mathbf{R}^n$ is a curve
in $\mathbf{R}^n$. Thus, to each point $t \in D$ there is assigned the following point in $\mathbf{R}^n$:
$F(t) = [F_1(t), F_2(t), \ldots, F_n(t)]$
Moreover, the derivative (if it exists) of $F(t)$ yields the vector
$$V(t) = \frac{dF(t)}{dt} = \left[\frac{dF_1(t)}{dt}, \frac{dF_2(t)}{dt}, \ldots, \frac{dF_n(t)}{dt}\right]$$
which is tangent to the curve. Normalizing $V(t)$ yields
$$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|}$$
Thus $\mathbf{T}(t)$ is the unit tangent vector to the curve. (Unit vectors with geometrical significance are often
presented in bold type.)
Example 1.7. Consider the curve $F(t) = [\sin t, \cos t, t]$ in $\mathbf{R}^3$. Taking the derivative of $F(t)$ [or each component of $F(t)$]
yields
$V(t) = [\cos t, -\sin t, 1]$
which is a vector tangent to the curve. We normalize $V(t)$. First we obtain
$\|V(t)\|^2 = \cos^2 t + \sin^2 t + 1 = 1 + 1 = 2$
Then the unit tangent vector $\mathbf{T}(t)$ to the curve follows:
$$\mathbf{T}(t) = \frac{V(t)}{\|V(t)\|} = \left[\frac{\cos t}{\sqrt{2}}, \frac{-\sin t}{\sqrt{2}}, \frac{1}{\sqrt{2}}\right]$$
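The tangent vector of Example 1.7 can be checked numerically. The sketch below hard-codes the curve $F(t) = [\sin t, \cos t, t]$ and its derivative, and compares the derivative against a central-difference estimate; the function names and the step size `h` are assumptions made for this illustration.

```python
import math
from typing import Tuple

Vec3 = Tuple[float, float, float]


def curve(t: float) -> Vec3:
    """The curve F(t) = [sin t, cos t, t]."""
    return (math.sin(t), math.cos(t), t)


def velocity(t: float) -> Vec3:
    """Componentwise derivative V(t) = [cos t, -sin t, 1]."""
    return (math.cos(t), -math.sin(t), 1.0)


def unit_tangent(t: float) -> Vec3:
    """T(t) = V(t) / ||V(t)||; here ||V(t)|| = sqrt(2) for every t."""
    v = velocity(t)
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)


def central_difference(t: float, h: float = 1e-6) -> Vec3:
    """Numerical estimate of dF/dt, used only as a sanity check."""
    ahead, behind = curve(t + h), curve(t - h)
    return tuple((a - b) / (2 * h) for a, b in zip(ahead, behind))


t = 0.7
print(velocity(t))
print(central_difference(t))   # agrees with velocity(t) to several decimals
print(unit_tangent(t))         # has length 1 up to rounding
```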
1.6 VECTORS IN R^3 (SPATIAL VECTORS), ijk NOTATION
Vectors in $\mathbf{R}^3$, called spatial vectors, appear in many applications, especially in physics. In fact, a
special notation is frequently used for such vectors as follows:
i = [1, 0, 0] denotes the unit vector in the x direction,
j = [0, 1, 0] denotes the unit vector in the y direction,
k = [0, 0, 1] denotes the unit vector in the z direction.
Then any vector $u = [a, b, c]$ in $\mathbf{R}^3$ can be expressed uniquely in the form
$u = [a, b, c] = a\mathbf{i} + b\mathbf{j} + c\mathbf{k}$
Since the vectors i, j, k are unit vectors and are mutually orthogonal, we obtain the following dot products:
$\mathbf{i} \cdot \mathbf{i} = 1$, $\mathbf{j} \cdot \mathbf{j} = 1$, $\mathbf{k} \cdot \mathbf{k} = 1$ and $\mathbf{i} \cdot \mathbf{j} = 0$, $\mathbf{i} \cdot \mathbf{k} = 0$, $\mathbf{j} \cdot \mathbf{k} = 0$
Furthermore, the vector operations discussed above may be expressed in the ijk notation as follows.
Suppose
$u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$
Then
$u + v = (a_1 + b_1)\mathbf{i} + (a_2 + b_2)\mathbf{j} + (a_3 + b_3)\mathbf{k}$ and $cu = ca_1\mathbf{i} + ca_2\mathbf{j} + ca_3\mathbf{k}$
where c is a scalar. Also,
$u \cdot v = a_1 b_1 + a_2 b_2 + a_3 b_3$ and $\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + a_3^2}$
Example 1.8 Suppose $u = 3\mathbf{i} + 5\mathbf{j} - 2\mathbf{k}$ and $v = 4\mathbf{i} - 8\mathbf{j} + 7\mathbf{k}$.
(a) To find $u + v$, add corresponding components, obtaining $u + v = 7\mathbf{i} - 3\mathbf{j} + 5\mathbf{k}$
(b) To find $3u - 2v$, first multiply by the scalars and then add:
$3u - 2v = (9\mathbf{i} + 15\mathbf{j} - 6\mathbf{k}) + (-8\mathbf{i} + 16\mathbf{j} - 14\mathbf{k}) = \mathbf{i} + 31\mathbf{j} - 20\mathbf{k}$
(c) To find $u \cdot v$, multiply corresponding components and then add:
$u \cdot v = 12 - 40 - 14 = -42$
(d) To find $\|u\|$, take the square root of the sum of the squares of the components:
$\|u\| = \sqrt{9 + 25 + 4} = \sqrt{38}$
Cross Product
There is a special operation for vectors u and v in $\mathbf{R}^3$ that is not defined in $\mathbf{R}^n$ for $n \neq 3$. This operation
is called the cross product and is denoted by $u \times v$. One way to easily remember the formula for $u \times v$ is to
use the determinant (of order two) and its negative, which are denoted and defined as follows:
$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc \quad\text{and}\quad -\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$$
Here a and d are called the diagonal elements and b and c are the nondiagonal elements. Thus the
determinant is the product ad of the diagonal elements minus the product bc of the nondiagonal elements,
but vice versa for the negative of the determinant.
Now suppose $u = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k}$ and $v = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k}$. Then
$$u \times v = (a_2 b_3 - a_3 b_2)\mathbf{i} + (a_3 b_1 - a_1 b_3)\mathbf{j} + (a_1 b_2 - a_2 b_1)\mathbf{k}$$
$$= \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}\mathbf{k}$$
That is, the three components of $u \times v$ are obtained from the array
$$\begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \end{bmatrix}$$
(which contains the components of u above the components of v) as follows:
(1) Cover the first column and take the determinant.
(2) Cover the second column and take the negative of the determinant.
(3) Cover the third column and take the determinant.
Note that $u \times v$ is a vector; hence $u \times v$ is also called the vector product or outer product of u and v.
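Example 1.8. Find $u \times v$ where: (a) $u = 4\mathbf{i} + 3\mathbf{j} + 6\mathbf{k}$, $v = 2\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}$, (b) $u = [2, -1, 5]$, $v = [3, 7, 6]$.
(a) Use $\begin{bmatrix} 4 & 3 & 6 \\ 2 & 5 & -3 \end{bmatrix}$ to get $u \times v = (-9 - 30)\mathbf{i} + (12 + 12)\mathbf{j} + (20 - 6)\mathbf{k} = -39\mathbf{i} + 24\mathbf{j} + 14\mathbf{k}$
(b) Use $\begin{bmatrix} 2 & -1 & 5 \\ 3 & 7 & 6 \end{bmatrix}$ to get $u \times v = [-6 - 35, 15 - 12, 14 + 3] = [-41, 3, 17]$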
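The "cover a column" rule amounts to three 2 by 2 determinants, which is easy to express in code. A minimal sketch follows; the name `cross` is an illustrative choice, and vectors are assumed to be 3-tuples.

```python
from typing import Sequence, Tuple


def cross(u: Sequence[float], v: Sequence[float]) -> Tuple[float, float, float]:
    """Cross product of two vectors in R^3.

    Each component is the determinant left after covering one column of the
    array [u; v], with the sign flipped for the second column.
    """
    if len(u) != 3 or len(v) != 3:
        raise ValueError("the cross product is defined only in R^3")
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2,      # cover the first column
            a3 * b1 - a1 * b3,      # cover the second column, negated
            a1 * b2 - a2 * b1)      # cover the third column


print(cross((4, 3, 6), (2, 5, -3)))   # (-39, 24, 14)
print(cross((2, -1, 5), (3, 7, 6)))   # (-41, 3, 17)
print(cross((1, 0, 0), (0, 1, 0)))    # (0, 0, 1)
```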
Remark: The cross products of the vectors i, j, k are as follows:
i X j = k, j X k = i, k X i = j
j X i = -k, k X j = -i, i X k = -j
Thus, if we view the triple (i, j, k) as a cyclic permutation, where i follows k and hence k precedes i, then
the product of two of them in the given direction is the third one, but the product of two of them in the
opposite direction is the negative of the third one.
Two important properties of the cross product are contained in the following theorem.
Theorem 1.5: Let u, v, w be vectors in $\mathbf{R}^3$.
(a) The vector $u \times v$ is orthogonal to both u and v.
(b) The absolute value of the "triple product"
$u \cdot v \times w$
represents the volume of the parallelepiped formed by the vectors u, v, w.
[See Fig. 1-4(a).]
[Fig. 1-4: (a) parallelepiped formed by u, v, w; (b) complex plane]
We note that the vectors $u$, $v$, $u \times v$ form a right-handed system, and that the following formula gives
the magnitude of $u \times v$:
$\|u \times v\| = \|u\| \|v\| \sin\theta$
where $\theta$ is the angle between u and v.
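Theorem 1.5 and the magnitude formula can be spot-checked on concrete vectors. The sketch below is a numerical illustration, not a proof; the helper names are assumptions, and the angle is recovered from $\cos\theta = u \cdot v / (\|u\| \|v\|)$.

```python
import math
from typing import Sequence, Tuple


def dot(u: Sequence[float], v: Sequence[float]) -> float:
    """Dot product of two vectors with the same number of components."""
    return sum(a * b for a, b in zip(u, v))


def cross(u: Sequence[float], v: Sequence[float]) -> Tuple[float, float, float]:
    """Cross product of two vectors in R^3."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)


def triple_product(u, v, w) -> float:
    """u . (v x w); its absolute value is the volume of the parallelepiped."""
    return dot(u, cross(v, w))


u = (4, 3, 6)
v = (2, 5, -3)
n = cross(u, v)

# (a) u x v is orthogonal to both u and v.
print(dot(u, n), dot(v, n))   # 0 0

# Magnitude formula: ||u x v|| = ||u|| ||v|| sin(theta).
norm_u, norm_v = math.sqrt(dot(u, u)), math.sqrt(dot(v, v))
theta = math.acos(dot(u, v) / (norm_u * norm_v))
print(math.sqrt(dot(n, n)), norm_u * norm_v * math.sin(theta))  # equal up to rounding

# (b) A 2 x 3 x 4 box has volume 24.
print(abs(triple_product((2, 0, 0), (0, 3, 0), (0, 0, 4))))     # 24
```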
1.7 COMPLEX NUMBERS
The set of complex numbers is denoted by C. Formally, a complex number is an ordered pair (a, b) of
real numbers where equality, addition and multiplication are defined as follows:
$(a, b) = (c, d)$ if and only if $a = c$ and $b = d$
$(a, b) + (c, d) = (a + c, b + d)$
$(a, b) \cdot (c, d) = (ac - bd, ad + bc)$
We identify the real number a with the complex number (a, 0); that is,
$a \leftrightarrow (a, 0)$
This is possible since the operations of addition and multiplication of real numbers are preserved under the
correspondence; that is,
(a,0) + (b,0) = (a + b, 0) and (a,0) ■ (b,0) = (ab,0)
Thus we view R as a subset of C, and replace (a, 0) by a whenever convenient and possible.
We note that the set C of complex numbers with the above operations of addition and multiplication is
a field of numbers, like the set R of real numbers and the set Q of rational numbers.
The complex number (0, 1) is denoted by i. It has the important property that
$i^2 = ii = (0, 1)(0, 1) = (-1, 0) = -1$ or $i = \sqrt{-1}$
Accordingly, any complex number $z = (a, b)$ can be written in the form
$z = (a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1) = a + bi$
The above notation $z = a + bi$, where $a = \operatorname{Re} z$ and $b = \operatorname{Im} z$ are called, respectively, the real and
imaginary parts of z, is more convenient than (a, b). In fact, the sum and product of complex numbers
$z = a + bi$ and $w = c + di$ can be derived by simply using the commutative and distributive laws and
$i^2 = -1$:
$z + w = (a + bi) + (c + di) = a + c + bi + di = (a + c) + (b + d)i$
$zw = (a + bi)(c + di) = ac + bci + adi + bdi^2 = (ac - bd) + (bc + ad)i$
We also define the negative of z and subtraction in C by
$-z = -1z$ and $w - z = w + (-z)$
Warning: The letter i representing $\sqrt{-1}$ has no relationship whatsoever to the vector i = [1, 0, 0] in
Section 1.6.
Complex Conjugate, Absolute Value
Consider a complex number $z = a + bi$. The conjugate of z is denoted and defined by
$\bar{z} = \overline{a + bi} = a - bi$
Then $z\bar{z} = (a + bi)(a - bi) = a^2 - b^2 i^2 = a^2 + b^2$. Note that z is real if and only if $\bar{z} = z$.
The absolute value of z, denoted by $|z|$, is defined to be the nonnegative square root of $z\bar{z}$. Namely,
$|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$
Note that $|z|$ is equal to the norm of the vector (a, b) in $\mathbf{R}^2$.
Suppose $z \neq 0$. Then the inverse $z^{-1}$ of z and division in C of w by z are given, respectively, by
$$z^{-1} = \frac{\bar{z}}{z\bar{z}} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2}\, i \quad\text{and}\quad \frac{w}{z} = \frac{w\bar{z}}{z\bar{z}} = wz^{-1}$$
Example 1.9. Suppose $z = 2 + 3i$ and $w = 5 - 2i$. Then
$z + w = (2 + 3i) + (5 - 2i) = 2 + 5 + 3i - 2i = 7 + i$
$zw = (2 + 3i)(5 - 2i) = 10 + 15i - 4i - 6i^2 = 16 + 11i$
$\bar{z} = \overline{2 + 3i} = 2 - 3i$ and $\bar{w} = \overline{5 - 2i} = 5 + 2i$
$$\frac{w}{z} = \frac{5 - 2i}{2 + 3i} = \frac{(5 - 2i)(2 - 3i)}{(2 + 3i)(2 - 3i)} = \frac{4 - 19i}{13} = \frac{4}{13} - \frac{19}{13}\, i$$
$|z| = \sqrt{4 + 9} = \sqrt{13}$ and $|w| = \sqrt{25 + 4} = \sqrt{29}$
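The ordered-pair definition of complex arithmetic can be implemented literally and compared with Python's built-in `complex` type. The sketch below is illustrative: the `c_*` function names are assumptions, and `c_div` expects integer or `Fraction` components so that the quotient stays exact.

```python
from fractions import Fraction
from typing import Tuple

Pair = Tuple[int, int]


def c_add(z: Pair, w: Pair) -> Pair:
    """(a, b) + (c, d) = (a + c, b + d)."""
    return (z[0] + w[0], z[1] + w[1])


def c_mul(z: Pair, w: Pair) -> Pair:
    """(a, b)(c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)


def c_conj(z: Pair) -> Pair:
    """Conjugate of a + bi is a - bi."""
    return (z[0], -z[1])


def c_div(w: Pair, z: Pair) -> Tuple[Fraction, Fraction]:
    """w / z = (w * conj(z)) / (z * conj(z)), for z != 0."""
    denom = z[0] * z[0] + z[1] * z[1]   # z * conj(z) = a^2 + b^2
    if denom == 0:
        raise ValueError("division by zero")
    num = c_mul(w, c_conj(z))
    return (Fraction(num[0], denom), Fraction(num[1], denom))


z, w = (2, 3), (5, -2)
print(c_add(z, w))   # (7, 1)
print(c_mul(z, w))   # (16, 11)
print(c_div(w, z))   # (Fraction(4, 13), Fraction(-19, 13))

# The built-in complex type gives the same sum and product.
print(complex(2, 3) + complex(5, -2), complex(2, 3) * complex(5, -2))
```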
Complex Plane
Recall that the real numbers R can be represented by points on a line. Analogously, the complex
numbers C can be represented by points in the plane. Specifically, we let the point (a, b) in the plane
represent the complex number $a + bi$ as shown in Fig. 1-4(b). In such a case, $|z|$ is the distance from the
origin O to the point z. The plane with this representation is called the complex plane, just like the line
representing R is called the real line.
1.8 VECTORS IN C^n
The set of all n-tuples of complex numbers, denoted by $\mathbf{C}^n$, is called complex n-space. Just as in the
real case, the elements of $\mathbf{C}^n$ are called points or vectors, the elements of C are called scalars, and vector
addition in $\mathbf{C}^n$ and scalar multiplication on $\mathbf{C}^n$ are given by
$[z_1, z_2, \ldots, z_n] + [w_1, w_2, \ldots, w_n] = [z_1 + w_1, z_2 + w_2, \ldots, z_n + w_n]$
$z[z_1, z_2, \ldots, z_n] = [zz_1, zz_2, \ldots, zz_n]$
where the $z_i$, $w_i$, and z belong to C.
Example 1.10. Consider vectors $u = [2 + 3i, 4 - i, 3]$ and $v = [3 - 2i, 5i, 4 - 6i]$ in $\mathbf{C}^3$. Then
$u + v = [2 + 3i, 4 - i, 3] + [3 - 2i, 5i, 4 - 6i] = [5 + i, 4 + 4i, 7 - 6i]$
$(5 - 2i)u = [(5 - 2i)(2 + 3i), (5 - 2i)(4 - i), (5 - 2i)(3)] = [16 + 11i, 18 - 13i, 15 - 6i]$
Dot (Inner) Product in C^n
Consider vectors $u = [z_1, z_2, \ldots, z_n]$ and $v = [w_1, w_2, \ldots, w_n]$ in $\mathbf{C}^n$. The dot or inner product of u and
v is denoted and defined by
$u \cdot v = z_1 \bar{w}_1 + z_2 \bar{w}_2 + \cdots + z_n \bar{w}_n$
This definition reduces to the real case since $\bar{w}_i = w_i$ when $w_i$ is real. The norm of u is defined by
$$\|u\| = \sqrt{u \cdot u} = \sqrt{z_1 \bar{z}_1 + z_2 \bar{z}_2 + \cdots + z_n \bar{z}_n} = \sqrt{|z_1|^2 + |z_2|^2 + \cdots + |z_n|^2}$$
We emphasize that $u \cdot u$ and so $\|u\|$ are real and positive when $u \neq 0$ and 0 when $u = 0$.
Example 1.10. Consider vectors $u = [2 + 3i, 4 - i, 3 + 5i]$ and $v = [3 - 4i, 5i, 4 - 2i]$ in $\mathbf{C}^3$. Then
$u \cdot v = (2 + 3i)\overline{(3 - 4i)} + (4 - i)\overline{(5i)} + (3 + 5i)\overline{(4 - 2i)}$
$= (2 + 3i)(3 + 4i) + (4 - i)(-5i) + (3 + 5i)(4 + 2i)$
$= (-6 + 17i) + (-5 - 20i) + (2 + 26i) = -9 + 23i$
$u \cdot u = |2 + 3i|^2 + |4 - i|^2 + |3 + 5i|^2 = 4 + 9 + 16 + 1 + 9 + 25 = 64$
$\|u\| = \sqrt{64} = 8$
The space $\mathbf{C}^n$ with the above operations of vector addition, scalar multiplication, and dot product, is
called complex Euclidean n-space. Theorem 1.2 for $\mathbf{R}^n$ also holds for $\mathbf{C}^n$ if we replace $u \cdot v = v \cdot u$ by
$u \cdot v = \overline{v \cdot u}$
On the other hand, the Schwarz inequality (Theorem 1.3) and Minkowski's inequality (Theorem 1.4) are
true for C" with no changes.
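The complex dot product differs from the real one only in the conjugate on the second factor. The following sketch uses Python's built-in `complex` type; the names `cdot` and `cnorm` are illustrative assumptions. For the vectors of the dot product example above, the first term works out to $(2 + 3i)(3 + 4i) = -6 + 17i$.

```python
import math
from typing import Sequence


def cdot(u: Sequence[complex], v: Sequence[complex]) -> complex:
    """Dot product in C^n: z_1 * conj(w_1) + ... + z_n * conj(w_n)."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    return sum(z * w.conjugate() for z, w in zip(u, v))


def cnorm(u: Sequence[complex]) -> float:
    """||u|| = sqrt(u . u); u . u is real and nonnegative."""
    return math.sqrt(cdot(u, u).real)


u = [2 + 3j, 4 - 1j, 3 + 5j]
v = [3 - 4j, 5j, 4 - 2j]
print(cdot(u, v))                # (-9+23j)
print(cdot(v, u))                # (-9-23j), the conjugate of u . v
print(cnorm(u))                  # 8.0
```

The second line illustrates the modified symmetry rule $u \cdot v = \overline{v \cdot u}$.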
Solved Problems
VECTORS IN R^n
1.1. Determine which of the following vectors are equal:
$u_1 = (1, 2, 3)$, $u_2 = (2, 3, 1)$, $u_3 = (1, 3, 2)$, $u_4 = (2, 3, 1)$
Vectors are equal only when corresponding entries are equal; hence only $u_2 = u_4$.
1.2. Let $u = (2, -7, 1)$, $v = (-3, 0, 4)$, $w = (0, 5, -8)$. Find:
(a) $3u - 4v$,
(b) $2u + 3v - 5w$.
First perform the scalar multiplication and then the vector addition.
(a) $3u - 4v = 3(2, -7, 1) - 4(-3, 0, 4) = (6, -21, 3) + (12, 0, -16) = (18, -21, -13)$
(b) $2u + 3v - 5w = (4, -14, 2) + (-9, 0, 12) + (0, -25, 40) = (-5, -39, 54)$
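1.3. Let $u = \begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -1 \\ -2 \end{bmatrix}$. Find:
(a) $5u - 2v$,
(b) $-2u + 4v - 3w$.
First perform the scalar multiplication and then the vector addition:
(a) $5u - 2v = 5\begin{bmatrix} 5 \\ 3 \\ -4 \end{bmatrix} - 2\begin{bmatrix} -1 \\ 5 \\ 2 \end{bmatrix} = \begin{bmatrix} 25 \\ 15 \\ -20 \end{bmatrix} + \begin{bmatrix} 2 \\ -10 \\ -4 \end{bmatrix} = \begin{bmatrix} 27 \\ 5 \\ -24 \end{bmatrix}$
(b) $-2u + 4v - 3w = \begin{bmatrix} -10 \\ -6 \\ 8 \end{bmatrix} + \begin{bmatrix} -4 \\ 20 \\ 8 \end{bmatrix} + \begin{bmatrix} -9 \\ 3 \\ 6 \end{bmatrix} = \begin{bmatrix} -23 \\ 17 \\ 22 \end{bmatrix}$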
1.4. Find x and y, where: (a) $(x, 3) = (2, x + y)$, (b) $(4, y) = x(2, 3)$.
(a) Since the vectors are equal, set the corresponding entries equal to each other, yielding
$x = 2$, $3 = x + y$
Solve the linear equations, obtaining $x = 2$, $y = 1$.
(b) First multiply by the scalar x to obtain $(4, y) = (2x, 3x)$. Then set corresponding entries equal to each
other to obtain
$4 = 2x$, $y = 3x$
Solve the equations to yield $x = 2$, $y = 6$.
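1.5. Write the vector $v = (1, -2, 5)$ as a linear combination of the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$, $u_3 = (2, -1, 1)$.
We want to express v in the form $v = xu_1 + yu_2 + zu_3$, with x, y, z as yet unknown. First we have
$$\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z\begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix}$$
(It is more convenient to write vectors as columns than as rows when forming linear combinations.) Set corresponding entries equal to each other to obtain
$$\begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned} \quad\text{or}\quad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 2y - z &= 4 \end{aligned} \quad\text{or}\quad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 5z &= 10 \end{aligned}$$
The unique solution of the triangular system is $x = -6$, $y = 3$, $z = 2$. Thus $v = -6u_1 + 3u_2 + 2u_3$.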
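The elimination and back-substitution steps used above can be sketched in Python. This is a minimal illustration under stated assumptions, not a general solver: the function name `combination_coefficients` is hypothetical, exact `Fraction` arithmetic is used to avoid rounding, and only the case of n vectors in $\mathbf{R}^n$ is handled.

```python
from fractions import Fraction

def combination_coefficients(vectors, target):
    """Solve x1*u1 + ... + xn*un = target for n vectors in R^n.

    Returns the list of coefficients, or None when elimination does not
    produce a unique solution (no solution or infinitely many).
    """
    n = len(target)
    if len(vectors) != n or any(len(u) != n for u in vectors):
        raise ValueError("expected n vectors with n entries each")
    # Augmented matrix: column j holds vector u_j, the last column holds target.
    rows = [[Fraction(vectors[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    # Forward elimination to a triangular system.
    for col in range(n):
        pivot = next((r for r in range(col, n) if rows[r][col] != 0), None)
        if pivot is None:
            return None
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(col + 1, n):
            factor = rows[r][col] / rows[col][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    # Back-substitution, starting from the last equation.
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(rows[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (rows[i][n] - s) / rows[i][i]
    return x

coeffs = combination_coefficients([(1, 1, 1), (1, 2, 3), (2, -1, 1)], (1, -2, 5))
print(coeffs)
```

For the data of Problem 1.5 the returned coefficients are the values x, y, z found by hand. When a pivot column is entirely zero the sketch returns `None` rather than trying to distinguish "no solution" from "infinitely many".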
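1.6. Write $v = (2, -5, 3)$ as a linear combination of $u_1 = (1, -3, 2)$, $u_2 = (2, -4, -1)$, $u_3 = (1, -5, 7)$.
Find the equivalent system of linear equations and then solve. First,
$$\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix} = x\begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix} + z\begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix} = \begin{bmatrix} x + 2y + z \\ -3x - 4y - 5z \\ 2x - y + 7z \end{bmatrix}$$
Set the corresponding entries equal to each other to obtain
$$\begin{aligned} x + 2y + z &= 2 \\ -3x - 4y - 5z &= -5 \\ 2x - y + 7z &= 3 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ -5y + 5z &= -1 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ 2y - 2z &= 1 \\ 0 &= 3 \end{aligned}$$
The third equation, $0x + 0y + 0z = 3$, indicates that the system has no solution. Thus v cannot be written as a linear combination of the vectors $u_1$, $u_2$, $u_3$.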
DOT (INNER) PRODUCT, ORTHOGONALITY, NORM IN $\mathbf{R}^n$
1.7. Find $u \cdot v$ where:
(a) $u = (2, -5, 6)$ and $v = (8, 2, -3)$
(b) $u = (4, 2, -3, 5, -1)$ and $v = (2, 6, -1, -4, 8)$.
Multiply the corresponding components and add:
(a) $u \cdot v = 2(8) - 5(2) + 6(-3) = 16 - 10 - 18 = -12$
(b) $u \cdot v = 8 + 12 + 3 - 20 - 8 = -5$
1.8. Let $u = (5, 4, 1)$, $v = (3, -4, 1)$, $w = (1, -2, 3)$. Which pairs of vectors, if any, are perpendicular (orthogonal)?
Find the dot product of each pair of vectors:
$$u \cdot v = 15 - 16 + 1 = 0, \qquad v \cdot w = 3 + 8 + 3 = 14, \qquad u \cdot w = 5 - 8 + 3 = 0$$
Thus u and v are orthogonal, u and w are orthogonal, but v and w are not.
1.9. Find k so that u and v are orthogonal, where:
(a) $u = (1, k, -3)$ and $v = (2, -5, 4)$
(b) $u = (2, 3k, -4, 1, 5)$ and $v = (6, -1, 3, 7, 2k)$.
Compute $u \cdot v$, set $u \cdot v$ equal to 0, and then solve for k:
(a) $u \cdot v = 1(2) + k(-5) - 3(4) = -5k - 10$. Then $-5k - 10 = 0$, or $k = -2$.
(b) $u \cdot v = 12 - 3k - 12 + 7 + 10k = 7k + 7$. Then $7k + 7 = 0$, or $k = -1$.
1.10. Find $\|u\|$, where: (a) $u = (3, -12, -4)$, (b) $u = (2, -3, 8, -7)$.
First find $\|u\|^2 = u \cdot u$ by squaring the entries and adding. Then $\|u\| = \sqrt{\|u\|^2}$.
(a) $\|u\|^2 = (3)^2 + (-12)^2 + (-4)^2 = 9 + 144 + 16 = 169$. Then $\|u\| = \sqrt{169} = 13$.
(b) $\|u\|^2 = 4 + 9 + 64 + 49 = 126$. Then $\|u\| = \sqrt{126}$.
1.11. Recall that normalizing a nonzero vector v means finding the unique unit vector $\hat{v}$ in the same direction as v, where
$$\hat{v} = \frac{1}{\|v\|}\, v$$
Normalize: (a) $u = (3, -4)$, (b) $v = (4, -2, -3, 8)$, (c) $w = (\tfrac{1}{2}, \tfrac{2}{3}, -\tfrac{1}{4})$.
(a) First find $\|u\| = \sqrt{9 + 16} = \sqrt{25} = 5$. Then divide each entry of u by 5, obtaining $\hat{u} = (\tfrac{3}{5}, -\tfrac{4}{5})$.
(b) Here $\|v\| = \sqrt{16 + 4 + 9 + 64} = \sqrt{93}$. Then
$$\hat{v} = \left(\frac{4}{\sqrt{93}}, \frac{-2}{\sqrt{93}}, \frac{-3}{\sqrt{93}}, \frac{8}{\sqrt{93}}\right)$$
(c) Note that w and any positive multiple of w will have the same normalized form. Hence first multiply w by 12 to "clear fractions", that is, first find $w' = 12w = (6, 8, -3)$. Then
$$\|w'\| = \sqrt{36 + 64 + 9} = \sqrt{109} \quad\text{and}\quad \hat{w} = \hat{w}' = \left(\frac{6}{\sqrt{109}}, \frac{8}{\sqrt{109}}, \frac{-3}{\sqrt{109}}\right)$$
1.12. Let $u = (1, -3, 4)$ and $v = (3, 4, 7)$. Find:
(a) $\cos\theta$, where $\theta$ is the angle between u and v;
(b) proj(u, v), the projection of u onto v;
(c) d(u, v), the distance between u and v.
First find $u \cdot v = 3 - 12 + 28 = 19$, $\|u\|^2 = 1 + 9 + 16 = 26$, $\|v\|^2 = 9 + 16 + 49 = 74$. Then
(a) $\cos\theta = \dfrac{u \cdot v}{\|u\|\,\|v\|} = \dfrac{19}{\sqrt{26}\,\sqrt{74}}$
(b) $\operatorname{proj}(u, v) = \dfrac{u \cdot v}{\|v\|^2}\, v = \dfrac{19}{74}(3, 4, 7) = \left(\dfrac{57}{74}, \dfrac{76}{74}, \dfrac{133}{74}\right) = \left(\dfrac{57}{74}, \dfrac{38}{37}, \dfrac{133}{74}\right)$
(c) $d(u, v) = \|u - v\| = \|(-2, -7, -3)\| = \sqrt{4 + 49 + 9} = \sqrt{62}$.
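The three quantities computed in Problem 1.12 translate directly into code. The sketch below is an illustration only; the function names `cos_angle`, `proj`, and `distance` are assumptions chosen here, and floating-point results will agree with the exact values only up to rounding.

```python
import math

def dot(u, v):
    """Dot product: multiply corresponding entries and add."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm: square root of u . u."""
    return math.sqrt(dot(u, u))

def cos_angle(u, v):
    """Cosine of the angle between nonzero vectors u and v."""
    return dot(u, v) / (norm(u) * norm(v))

def proj(u, v):
    """Projection of u onto a nonzero vector v: ((u . v) / ||v||^2) v."""
    c = dot(u, v) / dot(v, v)
    return [c * b for b in v]

def distance(u, v):
    """Distance between u and v: the norm of u - v."""
    return norm([a - b for a, b in zip(u, v)])

u, v = [1, -3, 4], [3, 4, 7]
print(cos_angle(u, v))
print(proj(u, v))
print(distance(u, v))
```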
1.13. Prove Theorem 1.2: For any u, v, w in $\mathbf{R}^n$ and k in $\mathbf{R}$:
(i) $(u + v) \cdot w = u \cdot w + v \cdot w$, (ii) $(ku) \cdot v = k(u \cdot v)$, (iii) $u \cdot v = v \cdot u$,
(iv) $u \cdot u \ge 0$, and $u \cdot u = 0$ iff $u = 0$.
Let $u = (u_1, u_2, \ldots, u_n)$, $v = (v_1, v_2, \ldots, v_n)$, $w = (w_1, w_2, \ldots, w_n)$.
(i) Since $u + v = (u_1 + v_1, u_2 + v_2, \ldots, u_n + v_n)$,
$$\begin{aligned} (u + v) \cdot w &= (u_1 + v_1)w_1 + (u_2 + v_2)w_2 + \cdots + (u_n + v_n)w_n \\ &= u_1w_1 + v_1w_1 + u_2w_2 + v_2w_2 + \cdots + u_nw_n + v_nw_n \\ &= (u_1w_1 + u_2w_2 + \cdots + u_nw_n) + (v_1w_1 + v_2w_2 + \cdots + v_nw_n) \\ &= u \cdot w + v \cdot w \end{aligned}$$
(ii) Since $ku = (ku_1, ku_2, \ldots, ku_n)$,
$$(ku) \cdot v = ku_1v_1 + ku_2v_2 + \cdots + ku_nv_n = k(u_1v_1 + u_2v_2 + \cdots + u_nv_n) = k(u \cdot v)$$
(iii) $u \cdot v = u_1v_1 + u_2v_2 + \cdots + u_nv_n = v_1u_1 + v_2u_2 + \cdots + v_nu_n = v \cdot u$
(iv) Since $u_i^2$ is nonnegative for each i, and since the sum of nonnegative real numbers is nonnegative,
$$u \cdot u = u_1^2 + u_2^2 + \cdots + u_n^2 \ge 0$$
Furthermore, $u \cdot u = 0$ iff $u_i = 0$ for each i, that is, iff $u = 0$.
1.14. Prove Theorem 1.3 (Schwarz): $|u \cdot v| \le \|u\|\,\|v\|$.
For any real number t, and using Theorem 1.2, we have
$$0 \le (tu + v) \cdot (tu + v) = t^2(u \cdot u) + 2t(u \cdot v) + (v \cdot v) = \|u\|^2 t^2 + 2(u \cdot v)t + \|v\|^2$$
Let $a = \|u\|^2$, $b = 2(u \cdot v)$, $c = \|v\|^2$. Then, for every value of t, $at^2 + bt + c \ge 0$. This means that the quadratic polynomial cannot have two distinct real roots. This implies that the discriminant $D = b^2 - 4ac \le 0$ or, equivalently, $b^2 \le 4ac$. Thus
$$4(u \cdot v)^2 \le 4\|u\|^2\|v\|^2$$
Dividing by 4 and taking the square root of both sides gives us our result.
1.15. Prove Theorem 1.4 (Minkowski): $\|u + v\| \le \|u\| + \|v\|$.
By the Schwarz inequality and other properties of the dot product,
$$\|u + v\|^2 = (u + v) \cdot (u + v) = (u \cdot u) + 2(u \cdot v) + (v \cdot v) \le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2$$
Taking the square root of both sides yields the desired inequality.
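A numerical spot check is no substitute for the proofs above, but it can help build intuition for the two inequalities. The following Python sketch tests them on sample vectors; the function names and the small tolerance `tol` (which absorbs floating-point rounding in the equality case) are assumptions made for this illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def satisfies_schwarz(u, v, tol=1e-12):
    """Check |u . v| <= ||u|| ||v||, allowing for rounding error."""
    return abs(dot(u, v)) <= norm(u) * norm(v) + tol

def satisfies_minkowski(u, v, tol=1e-12):
    """Check ||u + v|| <= ||u|| + ||v||, allowing for rounding error."""
    s = [a + b for a, b in zip(u, v)]
    return norm(s) <= norm(u) + norm(v) + tol

# Sample vectors; the second pair is parallel, where equality holds.
print(satisfies_schwarz([1, -3, 4], [3, 4, 7]), satisfies_minkowski([1, -3, 4], [3, 4, 7]))
print(satisfies_schwarz([1, 2], [2, 4]), satisfies_minkowski([1, 2], [2, 4]))
```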
POINTS, LINES, HYPERPLANES IN $\mathbf{R}^n$
Here we distinguish between an n-tuple $P(a_1, a_2, \ldots, a_n)$ viewed as a point in $\mathbf{R}^n$ and an n-tuple $u = [c_1, c_2, \ldots, c_n]$ viewed as a vector (arrow) from the origin O to the point $C(c_1, c_2, \ldots, c_n)$.
1.16. Find the vector u identified with the directed line segment $\overrightarrow{PQ}$ for the points:
(a) $P(1, -2, 4)$ and $Q(6, 1, -5)$ in $\mathbf{R}^3$, (b) $P(2, 3, -6, 5)$ and $Q(7, 1, 4, -8)$ in $\mathbf{R}^4$.
(a) $u = \overrightarrow{PQ} = Q - P = [6 - 1,\ 1 - (-2),\ -5 - 4] = [5, 3, -9]$
(b) $u = \overrightarrow{PQ} = Q - P = [7 - 2,\ 1 - 3,\ 4 + 6,\ -8 - 5] = [5, -2, 10, -13]$
1.17. Find an equation of the hyperplane H in $\mathbf{R}^4$ that passes through $P(3, -4, 1, -2)$ and is normal to $u = [2, 5, -6, -3]$.
The coefficients of the unknowns of an equation of H are the components of the normal vector u. Thus an equation of H is of the form $2x_1 + 5x_2 - 6x_3 - 3x_4 = k$. Substitute P into this equation to obtain $k = 2(3) + 5(-4) - 6(1) - 3(-2) = 6 - 20 - 6 + 6 = -14$.
Thus an equation of H is $2x_1 + 5x_2 - 6x_3 - 3x_4 = -14$.
1.18. Find an equation of the plane H in $\mathbf{R}^3$ that contains $P(1, -3, -4)$ and is parallel to the plane $H'$ determined by the equation $3x - 6y + 5z = 2$.
The planes H and $H'$ are parallel if and only if their normal directions are parallel or antiparallel (opposite direction). Hence an equation of H is of the form $3x - 6y + 5z = k$. Substitute P into this equation to obtain $k = 1$. Then an equation of H is $3x - 6y + 5z = 1$.
1.19. Find a parametric representation of the line L in $\mathbf{R}^4$ passing through $P(4, -2, 3, 1)$ in the direction of $u = [2, 5, -7, 8]$.
Here L consists of the points $X(x_i)$ that satisfy
$$X = P + tu \quad\text{or}\quad x_i = a_i t + b_i \quad\text{or}\quad L(t) = (a_i t + b_i)$$
where $a_i$ and $b_i$ are the components of u and P, respectively, and the parameter t takes on all real values. Thus we obtain
$$x_1 = 4 + 2t,\ x_2 = -2 + 5t,\ x_3 = 3 - 7t,\ x_4 = 1 + 8t \quad\text{or}\quad L(t) = (4 + 2t,\ -2 + 5t,\ 3 - 7t,\ 1 + 8t)$$
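Both constructions in this group of problems reduce to one-line computations: the constant k of a hyperplane is the dot product of the normal with the given point, and a point on a line is $P + tu$. The Python sketch below illustrates this; the function names `hyperplane_constant` and `line_point` are assumptions introduced here.

```python
def hyperplane_constant(normal, point):
    """Return k in a1*x1 + ... + an*xn = k for the hyperplane
    with the given normal vector that passes through `point`."""
    return sum(a * p for a, p in zip(normal, point))

def line_point(point, direction, t):
    """Return the point P + t*u on the line through P in the direction u."""
    return [p + t * d for p, d in zip(point, direction)]

# Plane parallel to 3x - 6y + 5z = 2 through P(1, -3, -4), as in Problem 1.18.
print(hyperplane_constant([3, -6, 5], [1, -3, -4]))

# Points on the line through P(4, -2, 3, 1) in the direction u = [2, 5, -7, 8].
print(line_point([4, -2, 3, 1], [2, 5, -7, 8], 0))
print(line_point([4, -2, 3, 1], [2, 5, -7, 8], 1))
```

Taking t = 0 recovers the point P itself, which is a quick sanity check on any parametric representation.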
1.20. Let C be the curve $F(t) = (t^2,\ 3t - 2,\ t^3,\ t^2 + 5)$ in $\mathbf{R}^4$, where $0 \le t \le 4$.
(a) Find the point P on C corresponding to $t = 2$.
(b) Find the initial point Q and terminal point $Q'$ of C.
(c) Find the unit tangent vector T to the curve C when $t = 2$.
(a) Substitute $t = 2$ into F(t) to get $P = F(2) = (4, 4, 8, 9)$.
(b) The parameter t ranges from $t = 0$ to $t = 4$. Hence $Q = F(0) = (0, -2, 0, 5)$ and $Q' = F(4) = (16, 10, 64, 21)$.
(c) Take the derivative of F(t), that is, of each component of F(t), to obtain a vector V that is tangent to the curve:
$$V(t) = \frac{dF(t)}{dt} = [2t,\ 3,\ 3t^2,\ 2t]$$
Now find V when $t = 2$; that is, substitute $t = 2$ in the equation for V(t) to obtain $V = V(2) = [4, 3, 12, 4]$. Then normalize V to obtain the desired unit tangent vector T. We have
$$\|V\| = \sqrt{16 + 9 + 144 + 16} = \sqrt{185} \quad\text{and}\quad T = \left[\frac{4}{\sqrt{185}}, \frac{3}{\sqrt{185}}, \frac{12}{\sqrt{185}}, \frac{4}{\sqrt{185}}\right]$$
SPATIAL VECTORS (VECTORS IN $\mathbf{R}^3$), ijk NOTATION, CROSS PRODUCT
1.21. Let $u = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$, $v = 3\mathbf{i} + \mathbf{j} - 2\mathbf{k}$, $w = \mathbf{i} + 5\mathbf{j} + 3\mathbf{k}$. Find:
(a) $u + v$, (b) $2u - 3v + 4w$, (c) $u \cdot v$ and $u \cdot w$, (d) $\|u\|$ and $\|v\|$.
Treat the coefficients of i, j, k just like the components of a vector in $\mathbf{R}^3$.
(a) Add corresponding coefficients to get $u + v = 5\mathbf{i} - 2\mathbf{j} + 2\mathbf{k}$.
(b) First perform the scalar multiplication and then the vector addition:
$$2u - 3v + 4w = (4\mathbf{i} - 6\mathbf{j} + 8\mathbf{k}) + (-9\mathbf{i} - 3\mathbf{j} + 6\mathbf{k}) + (4\mathbf{i} + 20\mathbf{j} + 12\mathbf{k}) = -\mathbf{i} + 11\mathbf{j} + 26\mathbf{k}$$
(c) Multiply corresponding coefficients and then add:
$$u \cdot v = 6 - 3 - 8 = -5 \quad\text{and}\quad u \cdot w = 2 - 15 + 12 = -1$$
(d) The norm is the square root of the sum of the squares of the coefficients:
$$\|u\| = \sqrt{4 + 9 + 16} = \sqrt{29} \quad\text{and}\quad \|v\| = \sqrt{9 + 1 + 4} = \sqrt{14}$$
1.22. Find the (parametric) equation of the line L:
(a) through the points $P(1, 3, 2)$ and $Q(2, 5, -6)$;
(b) containing the point $P(1, -2, 4)$ and perpendicular to the plane H given by the equation $3x + 5y + 7z = 15$.
(a) First find $v = \overrightarrow{PQ} = Q - P = [1, 2, -8] = \mathbf{i} + 2\mathbf{j} - 8\mathbf{k}$. Then
$$L(t) = (t + 1,\ 2t + 3,\ -8t + 2) = (t + 1)\mathbf{i} + (2t + 3)\mathbf{j} + (-8t + 2)\mathbf{k}$$
(b) Since L is perpendicular to H, the line L is in the same direction as the normal vector $N = 3\mathbf{i} + 5\mathbf{j} + 7\mathbf{k}$ to H. Thus
$$L(t) = (3t + 1,\ 5t - 2,\ 7t + 4) = (3t + 1)\mathbf{i} + (5t - 2)\mathbf{j} + (7t + 4)\mathbf{k}$$
1.23. Let S be the surface $xy^2 + 2yz = 16$ in $\mathbf{R}^3$.
(a) Find the normal vector N(x, y, z) to the surface S.
(b) Find the tangent plane H to S at the point $P(1, 2, 3)$.
(a) The formula for the normal vector to a surface $F(x, y, z) = 0$ is
$$N(x, y, z) = F_x\mathbf{i} + F_y\mathbf{j} + F_z\mathbf{k}$$
where $F_x$, $F_y$, $F_z$ are the partial derivatives. Using $F(x, y, z) = xy^2 + 2yz - 16$, we obtain
$$F_x = y^2, \qquad F_y = 2xy + 2z, \qquad F_z = 2y$$
Thus $N(x, y, z) = y^2\mathbf{i} + (2xy + 2z)\mathbf{j} + 2y\mathbf{k}$.
(b) The normal to the surface S at the point P is
$$N(P) = N(1, 2, 3) = 4\mathbf{i} + 10\mathbf{j} + 4\mathbf{k}$$
Hence $N = 2\mathbf{i} + 5\mathbf{j} + 2\mathbf{k}$ is also normal to S at P. Thus an equation of H has the form $2x + 5y + 2z = c$. Substitute P in this equation to obtain $c = 18$. Thus the tangent plane H to S at P is $2x + 5y + 2z = 18$.
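1.24. Evaluate the following determinants and negatives of determinants of order two:
(a) (i) $\begin{vmatrix} 3 & 4 \\ 5 & 9 \end{vmatrix}$, (ii) $\begin{vmatrix} 2 & -1 \\ 4 & 3 \end{vmatrix}$, (iii) $\begin{vmatrix} 4 & -5 \\ 3 & -2 \end{vmatrix}$
(b) (i) $-\begin{vmatrix} 3 & 6 \\ 4 & 2 \end{vmatrix}$, (ii) $-\begin{vmatrix} 7 & -5 \\ 3 & 2 \end{vmatrix}$, (iii) $-\begin{vmatrix} 4 & -1 \\ 8 & -3 \end{vmatrix}$
Use $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$ and $-\begin{vmatrix} a & b \\ c & d \end{vmatrix} = bc - ad$. Thus:
(a) (i) $27 - 20 = 7$, (ii) $6 + 4 = 10$, (iii) $-8 + 15 = 7$.
(b) (i) $24 - 6 = 18$, (ii) $-15 - 14 = -29$, (iii) $-8 + 12 = 4$.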
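1.25. Let $u = 2\mathbf{i} - 3\mathbf{j} + 4\mathbf{k}$, $v = 3\mathbf{i} + \mathbf{j} - 2\mathbf{k}$, $w = \mathbf{i} + 5\mathbf{j} + 3\mathbf{k}$.
Find: (a) $u \times v$, (b) $u \times w$.
(a) Use $\begin{bmatrix} 2 & -3 & 4 \\ 3 & 1 & -2 \end{bmatrix}$ to get $u \times v = (6 - 4)\mathbf{i} + (12 + 4)\mathbf{j} + (2 + 9)\mathbf{k} = 2\mathbf{i} + 16\mathbf{j} + 11\mathbf{k}$
(b) Use $\begin{bmatrix} 2 & -3 & 4 \\ 1 & 5 & 3 \end{bmatrix}$ to get $u \times w = (-9 - 20)\mathbf{i} + (4 - 6)\mathbf{j} + (10 + 3)\mathbf{k} = -29\mathbf{i} - 2\mathbf{j} + 13\mathbf{k}$
1.26. Find $u \times v$, where: (a) $u = (1, 2, 3)$, $v = (4, 5, 6)$; (b) $u = (-4, 7, 3)$, $v = (6, -5, 2)$.
(a) Use $\begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$ to get $u \times v = [12 - 15,\ 12 - 6,\ 5 - 8] = [-3, 6, -3]$
(b) Use $\begin{bmatrix} -4 & 7 & 3 \\ 6 & -5 & 2 \end{bmatrix}$ to get $u \times v = [14 + 15,\ 18 + 8,\ 20 - 42] = [29, 26, -22]$
1.27. Find a unit vector u orthogonal to $v = [1, 3, 4]$ and $w = [2, -6, -5]$.
First find $v \times w$, which is orthogonal to v and w.
The array $\begin{bmatrix} 1 & 3 & 4 \\ 2 & -6 & -5 \end{bmatrix}$ gives $v \times w = [-15 + 24,\ 8 + 5,\ -6 - 6] = [9, 13, -12]$.
Normalize $v \times w$ to get $u = [9/\sqrt{394},\ 13/\sqrt{394},\ -12/\sqrt{394}]$.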
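The cross product computations above follow a fixed pattern that is easy to express in code. The following Python sketch is an illustration under the component formula $u \times v = (a_2b_3 - a_3b_2,\ a_3b_1 - a_1b_3,\ a_1b_2 - a_2b_1)$; the function names `cross` and `unit` are assumptions chosen here.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product of two vectors in R^3, by the component formula."""
    a1, a2, a3 = u
    b1, b2, b3 = v
    return [a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1]

def unit(u):
    """Normalize a nonzero vector to length 1."""
    length = math.sqrt(dot(u, u))
    return [a / length for a in u]

v, w = [1, 3, 4], [2, -6, -5]
n = cross(v, w)
print(n)                      # a vector orthogonal to both v and w
print(dot(n, v), dot(n, w))   # both dot products vanish
print(unit(n))                # unit vector in the same direction
```

Checking that the dot products with v and w are zero is a cheap way to catch sign errors in a hand computation.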
1.28. Let $u = (a_1, a_2, a_3)$ and $v = (b_1, b_2, b_3)$ so $u \times v = (a_2b_3 - a_3b_2,\ a_3b_1 - a_1b_3,\ a_1b_2 - a_2b_1)$.
Prove:
(a) $u \times v$ is orthogonal to u and v [Theorem 1.5(i)]
(b) $\|u \times v\|^2 = (u \cdot u)(v \cdot v) - (u \cdot v)^2$ (Lagrange's identity).
(a) We have
$$\begin{aligned} u \cdot (u \times v) &= a_1(a_2b_3 - a_3b_2) + a_2(a_3b_1 - a_1b_3) + a_3(a_1b_2 - a_2b_1) \\ &= a_1a_2b_3 - a_1a_3b_2 + a_2a_3b_1 - a_1a_2b_3 + a_1a_3b_2 - a_2a_3b_1 = 0 \end{aligned}$$
Thus $u \times v$ is orthogonal to u. Similarly, $u \times v$ is orthogonal to v.
(b) We have
$$\|u \times v\|^2 = (a_2b_3 - a_3b_2)^2 + (a_3b_1 - a_1b_3)^2 + (a_1b_2 - a_2b_1)^2 \tag{1}$$
$$(u \cdot u)(v \cdot v) - (u \cdot v)^2 = (a_1^2 + a_2^2 + a_3^2)(b_1^2 + b_2^2 + b_3^2) - (a_1b_1 + a_2b_2 + a_3b_3)^2 \tag{2}$$
Expansion of the right-hand sides of (1) and (2) establishes the identity.
COMPLEX NUMBERS, VECTORS IN $\mathbf{C}^n$
1.29. Suppose $z = 5 + 3i$ and $w = 2 - 4i$. Find: (a) $z + w$, (b) $z - w$, (c) $zw$.
Use the ordinary rules of algebra together with $i^2 = -1$ to obtain a result in the standard form $a + bi$.
(a) $z + w = (5 + 3i) + (2 - 4i) = 7 - i$
(b) $z - w = (5 + 3i) - (2 - 4i) = 5 + 3i - 2 + 4i = 3 + 7i$
(c) $zw = (5 + 3i)(2 - 4i) = 10 - 14i - 12i^2 = 10 - 14i + 12 = 22 - 14i$
1.30. Simplify: (a) $(5 + 3i)(2 - 7i)$, (b) $(4 - 3i)^2$, (c) $(1 + 2i)^3$.
(a) $(5 + 3i)(2 - 7i) = 10 + 6i - 35i - 21i^2 = 31 - 29i$
(b) $(4 - 3i)^2 = 16 - 24i + 9i^2 = 7 - 24i$
(c) $(1 + 2i)^3 = 1 + 6i + 12i^2 + 8i^3 = 1 + 6i - 12 - 8i = -11 - 2i$
1.31. Simplify: (a) $i^0, i^3, i^4$, (b) $i^5, i^6, i^7, i^8$, (c) $i^{39}, i^{174}, i^{252}, i^{317}$.
(a) $i^0 = 1$, $i^3 = i^2(i) = (-1)(i) = -i$, $i^4 = (i^2)(i^2) = (-1)(-1) = 1$.
(b) $i^5 = (i^4)(i) = (1)(i) = i$, $i^6 = (i^4)(i^2) = (1)(i^2) = i^2 = -1$, $i^7 = i^3 = -i$, $i^8 = i^4 = 1$.
(c) Using $i^4 = 1$ and $i^n = i^{4q + r} = (i^4)^q i^r = 1^q i^r = i^r$, divide the exponent n by 4 to obtain the remainder r:
$$i^{39} = i^{4(9) + 3} = (i^4)^9 i^3 = 1^9 i^3 = i^3 = -i, \quad i^{174} = i^2 = -1, \quad i^{252} = i^0 = 1, \quad i^{317} = i^1 = i$$
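The remainder rule of Problem 1.31(c) can be written as a short lookup. The Python sketch below is an illustration; the function name `power_of_i` is an assumption, and Python's built-in imaginary unit `1j` stands in for i.

```python
def power_of_i(n):
    """Return i**n for an integer n, using only the remainder of n on division by 4."""
    # Remainders 0, 1, 2, 3 correspond to 1, i, -1, -i.
    return [1, 1j, -1, -1j][n % 4]

for n in (39, 174, 252, 317):
    print(n, n % 4, power_of_i(n))
```

The lookup avoids repeated multiplication entirely, which is the point of reducing the exponent modulo 4.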
1.32. Find the complex conjugate of each of the following:
(a) $6 + 4i$, $7 - 5i$, $4 + i$, $-3 - i$, (b) $6$, $-3$, $4i$, $-9i$.
(a) $\overline{6 + 4i} = 6 - 4i$, $\overline{7 - 5i} = 7 + 5i$, $\overline{4 + i} = 4 - i$, $\overline{-3 - i} = -3 + i$.
(b) $\overline{6} = 6$, $\overline{-3} = -3$, $\overline{4i} = -4i$, $\overline{-9i} = 9i$.
(Note that the conjugate of a real number is the original number, but the conjugate of a pure imaginary number is the negative of the original number.)
1.33. Find $z\bar{z}$ and $|z|$ when $z = 3 + 4i$.
For $z = a + bi$, use $z\bar{z} = a^2 + b^2$ and $|z| = \sqrt{z\bar{z}} = \sqrt{a^2 + b^2}$.
$$z\bar{z} = 9 + 16 = 25, \qquad |z| = \sqrt{25} = 5$$
1.34. Simplify $\dfrac{2 - 7i}{5 + 3i}$.
To simplify a fraction $z/w$ of complex numbers, multiply both numerator and denominator by $\bar{w}$, the conjugate of the denominator:
$$\frac{2 - 7i}{5 + 3i} = \frac{(2 - 7i)(5 - 3i)}{(5 + 3i)(5 - 3i)} = \frac{-11 - 41i}{34} = -\frac{11}{34} - \frac{41}{34}i$$
1.35. Prove: For any complex numbers $z, w \in \mathbf{C}$, (i) $\overline{z + w} = \bar{z} + \bar{w}$, (ii) $\overline{zw} = \bar{z}\,\bar{w}$, (iii) $\bar{\bar{z}} = z$.
Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$.
(i) $$\begin{aligned} \overline{z + w} &= \overline{(a + bi) + (c + di)} = \overline{(a + c) + (b + d)i} \\ &= (a + c) - (b + d)i = a + c - bi - di \\ &= (a - bi) + (c - di) = \bar{z} + \bar{w} \end{aligned}$$
(ii) $$\begin{aligned} \overline{zw} &= \overline{(a + bi)(c + di)} = \overline{(ac - bd) + (ad + bc)i} \\ &= (ac - bd) - (ad + bc)i = (a - bi)(c - di) = \bar{z}\,\bar{w} \end{aligned}$$
(iii) $\bar{\bar{z}} = \overline{\overline{a + bi}} = \overline{a - bi} = a - (-b)i = a + bi = z$
1.36. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|zw| = |z||w|$.
By (ii) of Problem 1.35,
$$|zw|^2 = (zw)(\overline{zw}) = (zw)(\bar{z}\,\bar{w}) = (z\bar{z})(w\bar{w}) = |z|^2|w|^2$$
The square root of both sides gives us the desired result.
1.37. Prove: For any complex numbers $z, w \in \mathbf{C}$, $|z + w| \le |z| + |w|$.
Suppose $z = a + bi$ and $w = c + di$ where $a, b, c, d \in \mathbf{R}$. Consider the vectors $u = (a, b)$ and $v = (c, d)$ in $\mathbf{R}^2$. Note that
$$|z| = \sqrt{a^2 + b^2} = \|u\|, \qquad |w| = \sqrt{c^2 + d^2} = \|v\|$$
and
$$|z + w| = |(a + c) + (b + d)i| = \sqrt{(a + c)^2 + (b + d)^2} = \|(a + c,\ b + d)\| = \|u + v\|$$
By Minkowski's inequality (Problem 1.15), $\|u + v\| \le \|u\| + \|v\|$, and so
$$|z + w| = \|u + v\| \le \|u\| + \|v\| = |z| + |w|$$
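1.38. Find the dot products $u \cdot v$ and $v \cdot u$ where: (a) $u = (1 - 2i,\ 3 + i)$, $v = (4 + 2i,\ 5 - 6i)$,
(b) $u = (3 - 2i,\ 4i,\ 1 + 6i)$, $v = (5 + i,\ 2 - 3i,\ 7 + 2i)$.
Recall that conjugates of the second vector appear in the dot product
$$(z_1, \ldots, z_n) \cdot (w_1, \ldots, w_n) = z_1\bar{w}_1 + \cdots + z_n\bar{w}_n$$
(a) $$\begin{aligned} u \cdot v &= (1 - 2i)\overline{(4 + 2i)} + (3 + i)\overline{(5 - 6i)} \\ &= (1 - 2i)(4 - 2i) + (3 + i)(5 + 6i) = -10i + 9 + 23i = 9 + 13i \end{aligned}$$
$$\begin{aligned} v \cdot u &= (4 + 2i)\overline{(1 - 2i)} + (5 - 6i)\overline{(3 + i)} \\ &= (4 + 2i)(1 + 2i) + (5 - 6i)(3 - i) = 10i + 9 - 23i = 9 - 13i \end{aligned}$$
(b) $$\begin{aligned} u \cdot v &= (3 - 2i)\overline{(5 + i)} + (4i)\overline{(2 - 3i)} + (1 + 6i)\overline{(7 + 2i)} \\ &= (3 - 2i)(5 - i) + (4i)(2 + 3i) + (1 + 6i)(7 - 2i) = 20 + 35i \end{aligned}$$
$$\begin{aligned} v \cdot u &= (5 + i)\overline{(3 - 2i)} + (2 - 3i)\overline{(4i)} + (7 + 2i)\overline{(1 + 6i)} \\ &= (5 + i)(3 + 2i) + (2 - 3i)(-4i) + (7 + 2i)(1 - 6i) = 20 - 35i \end{aligned}$$
In both cases, $v \cdot u = \overline{u \cdot v}$. This holds true in general, as seen in Problem 1.40.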
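1.39. Let $u = (7 - 2i,\ 2 + 5i)$ and $v = (1 + i,\ -3 - 6i)$. Find:
(a) $u + v$; (b) $2iu$; (c) $(3 - i)v$; (d) $u \cdot v$; (e) $\|u\|$ and $\|v\|$.
(a) $u + v = (7 - 2i + 1 + i,\ 2 + 5i - 3 - 6i) = (8 - i,\ -1 - i)$
(b) $2iu = (14i - 4i^2,\ 4i + 10i^2) = (4 + 14i,\ -10 + 4i)$
(c) $(3 - i)v = (3 + 3i - i - i^2,\ -9 - 18i + 3i + 6i^2) = (4 + 2i,\ -15 - 15i)$
(d) $$\begin{aligned} u \cdot v &= (7 - 2i)\overline{(1 + i)} + (2 + 5i)\overline{(-3 - 6i)} \\ &= (7 - 2i)(1 - i) + (2 + 5i)(-3 + 6i) = 5 - 9i - 36 - 3i = -31 - 12i \end{aligned}$$
(e) $\|u\| = \sqrt{7^2 + (-2)^2 + 2^2 + 5^2} = \sqrt{82}$ and $\|v\| = \sqrt{1^2 + 1^2 + (-3)^2 + (-6)^2} = \sqrt{47}$
1.40. Prove: For any vectors $u, v \in \mathbf{C}^n$ and any scalar $z \in \mathbf{C}$, (i) $u \cdot v = \overline{v \cdot u}$, (ii) $(zu) \cdot v = z(u \cdot v)$, (iii) $u \cdot (zv) = \bar{z}(u \cdot v)$.
Suppose $u = (z_1, z_2, \ldots, z_n)$ and $v = (w_1, w_2, \ldots, w_n)$.
(i) Using the properties of the conjugate,
$$\begin{aligned} \overline{v \cdot u} &= \overline{w_1\bar{z}_1 + w_2\bar{z}_2 + \cdots + w_n\bar{z}_n} = \overline{w_1\bar{z}_1} + \overline{w_2\bar{z}_2} + \cdots + \overline{w_n\bar{z}_n} \\ &= \bar{w}_1z_1 + \bar{w}_2z_2 + \cdots + \bar{w}_nz_n = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n = u \cdot v \end{aligned}$$
(ii) Since $zu = (zz_1, zz_2, \ldots, zz_n)$,
$$(zu) \cdot v = zz_1\bar{w}_1 + zz_2\bar{w}_2 + \cdots + zz_n\bar{w}_n = z(z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n) = z(u \cdot v)$$
(Compare with Theorem 1.2 on vectors in $\mathbf{R}^n$.)
(iii) Using (i) and (ii),
$$u \cdot (zv) = \overline{(zv) \cdot u} = \overline{z(v \cdot u)} = \bar{z}\,\overline{(v \cdot u)} = \bar{z}(u \cdot v)$$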
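Supplementary Problems
VECTORS IN $\mathbf{R}^n$
1.41. Let $u = (1, -2, 4)$, $v = (3, 5, 1)$, $w = (2, 1, -3)$. Find:
(a) $3u - 2v$; (b) $5u + 3v - 4w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$;
(e) $\cos\theta$, where $\theta$ is the angle between u and v; (f) d(u, v); (g) proj(u, v).
1.42. Repeat Problem 1.41 for vectors $u = \begin{bmatrix} 1 \\ 3 \\ -4 \end{bmatrix}$, $v = \begin{bmatrix} 2 \\ 1 \\ 5 \end{bmatrix}$, $w = \begin{bmatrix} 3 \\ -2 \\ 6 \end{bmatrix}$.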
1.43. Let $u = (2, -5, 4, 6, -3)$ and $v = (5, -2, 1, -7, -4)$. Find:
(a) $4u - 3v$; (b) $5u + 2v$; (c) $u \cdot v$; (d) $\|u\|$ and $\|v\|$; (e) proj(u, v); (f) d(u, v).
1.44. Normalize each vector:
(a) u = E,-l); (b) v = (\,2,-2,4)\ (c) ^^ft'"^'^)-
1.45. Let $u = (1, 2, -2)$, $v = (3, -12, 4)$, and $k = -3$.
(a) Find $\|u\|$, $\|v\|$, $\|u + v\|$, $\|ku\|$.
(b) Verify that $\|ku\| = |k|\,\|u\|$ and $\|u + v\| \le \|u\| + \|v\|$.
1.46. Find x and y where:
(a) $(x,\ y + 1) = (y - 2,\ 6)$; (b) $x(2, y) = y(1, -2)$.
1.47. Find x, y, z where $(x,\ y + 1,\ y + z) = (2x + y,\ 4,\ 3z)$.
1.48. Write $v = (2, 5)$ as a linear combination of $u_1$ and $u_2$ where:
(a) $u_1 = (1, 2)$ and $u_2 = (3, 5)$;
(b) $u_1 = (3, -4)$ and $u_2 = (2, -3)$.
1.49. Write v --
-3
16
as a linear combination of Wi
"
3
3
, U2 =
2"
5
-1
, W3 =
4"
-2
3
1.50. Find k so that u and v are orthogonal, where:
(a) $u = (3, k, -2)$, $v = (6, -4, -3)$;
(b) $u = (5, k, -4, 2)$, $v = (1, -3, 2, 2k)$;
(c) $u = (1, 7, k + 2, -2)$, $v = (3, k, -3, k)$.
LOCATED VECTORS, HYPERPLANES, LINES IN $\mathbf{R}^n$
1.51. Find the vector v identified with the directed line segment $\overrightarrow{PQ}$ for the points:
(a) $P(2, 3, -7)$ and $Q(1, -6, -5)$ in $\mathbf{R}^3$;
(b) $P(1, -8, -4, 6)$ and $Q(3, -5, 2, -4)$ in $\mathbf{R}^4$.
1.52. Find an equation of the hyperplane H in $\mathbf{R}^4$ that:
(a) contains $P(1, 2, -3, 2)$ and is normal to $u = [2, 3, -5, 6]$;
(b) contains $P(3, -1, 2, 5)$ and is parallel to $2x_1 - 3x_2 + 5x_3 - 7x_4 = 4$.
1.53. Find a parametric representation of the line in $\mathbf{R}^4$ that:
(a) passes through the points $P(1, 2, 1, 2)$ and $Q(3, -5, 7, -9)$;
(b) passes through $P(1, 1, 3, 3)$ and is perpendicular to the hyperplane $2x_1 + 4x_2 + 6x_3 - 8x_4 = 5$.
SPATIAL VECTORS (VECTORS IN $\mathbf{R}^3$), ijk NOTATION
1.54. Given $u = 3\mathbf{i} - 4\mathbf{j} + 2\mathbf{k}$, $v = 2\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}$, $w = 4\mathbf{i} + 7\mathbf{j} + 2\mathbf{k}$. Find:
(a) $2u - 3v$; (b) $3u + 4v - 2w$; (c) $u \cdot v$, $u \cdot w$, $v \cdot w$; (d) $\|u\|$, $\|v\|$, $\|w\|$.
1.55. Find the equation of the plane H:
(a) with normal $N = 3\mathbf{i} - 4\mathbf{j} + 5\mathbf{k}$ and containing the point $P(1, 2, -3)$;
(b) parallel to $4x + 3y - 2z = 11$ and containing the point $Q(2, -1, 3)$.
1.56. Find the (parametric) equation of the line L:
(a) through the point $P(2, 5, -3)$ and in the direction of $v = 4\mathbf{i} - 5\mathbf{j} + 7\mathbf{k}$;
(b) perpendicular to the plane $2x - 3y + 7z = 4$ and containing $P(1, -5, 7)$.
1.57. Consider the following curve C in $\mathbf{R}^3$ where $0 \le t \le 5$:
$$F(t) = t^3\mathbf{i} - t^2\mathbf{j} + (2t - 3)\mathbf{k}$$
(a) Find the point P on C corresponding to $t = 2$.
(b) Find the initial point Q and the terminal point $Q'$.
(c) Find the unit tangent vector T to the curve C when $t = 2$.
1.58. Consider a moving body B whose position at time t is given by $R(t) = t^2\mathbf{i} + t^3\mathbf{j} + 2t\mathbf{k}$. [Then $V(t) = dR(t)/dt$ and $A(t) = dV(t)/dt$ denote, respectively, the velocity and acceleration of B.] When $t = 1$, find:
(a) position; (b) velocity v; (c) speed s; (d) acceleration a of B.
1.59. Find a normal vector N and the tangent plane H to each surface at the given point:
(a) surface $x^2y + 3yz = 20$ and point $P(1, 3, 2)$;
(b) surface $x^2 + 3y^2 - 5z^2 = 16$ and point $P(3, -2, 1)$.
CROSS PRODUCT
1.60. Evaluate the following determinants and negative of determinants of order two:
(a)
(b)
2
5
3 6
'
6 4
7
5
3
1
'
-6
-4
'
1 -3|
2
4
-4
7
'
-2
-3
8
-6
-3
-2
1.61. Given $u = 3\mathbf{i} - 4\mathbf{j} + 2\mathbf{k}$, $v = 2\mathbf{i} + 5\mathbf{j} - 3\mathbf{k}$, $w = 4\mathbf{i} + 7\mathbf{j} + 2\mathbf{k}$, find:
(a) $u \times v$, (b) $u \times w$, (c) $v \times w$.
1.62. Given $u = [2, 1, 3]$, $v = [4, -2, 2]$, $w = [1, 1, 5]$, find:
(a) $u \times v$, (b) $u \times w$, (c) $v \times w$.
1.63. Find the volume V of the parallelepiped formed by the vectors u, v, w in:
(a) Problem 1.61, (b) Problem 1.62.
1.64. Find a unit vector u orthogonal to:
(a) $v = [1, 2, 3]$ and $w = [1, -1, 2]$;
(b) $v = 3\mathbf{i} - \mathbf{j} + 2\mathbf{k}$ and $w = 4\mathbf{i} - 2\mathbf{j} - \mathbf{k}$.
1.65. Prove the following properties of the cross product:
(a) $u \times v = -(v \times u)$
(b) $u \times u = 0$ for any vector u
(c) $(ku) \times v = k(u \times v) = u \times (kv)$
(d) $u \times (v + w) = (u \times v) + (u \times w)$
(e) $(v + w) \times u = (v \times u) + (w \times u)$
(f) $(u \times v) \times w = (u \cdot w)v - (v \cdot w)u$
COMPLEX NUMBERS
1.66. Simplify:
(a) $(4 - 7i)(9 + 2i)$; (b) $(3 - 5i)^2$; (c) $\dfrac{1}{4 - 7i}$; (d) $\dfrac{9 + 2i}{3 - 5i}$; (e) $(1 - i)^3$.
1.67. Simplify: (a) j.; (b) |±|; (c) i'^P,^^ (d) (^
-)'
1.68. Let $z = 2 - 5i$ and $w = 7 + 3i$. Find:
(a) $z + w$; (b) $zw$; (c) $z/w$; (d) $\bar{z}$, $\bar{w}$; (e) $|z|$, $|w|$.
1.69. Show that:
(a) $\operatorname{Re} z = \tfrac{1}{2}(z + \bar{z})$ (b) $\operatorname{Im} z = \tfrac{1}{2i}(z - \bar{z})$ (c) $zw = 0$ implies $z = 0$ or $w = 0$.
VECTORS IN $\mathbf{C}^n$
1.70. Let $u = (1 + 7i,\ 2 - 6i)$ and $v = (5 - 2i,\ 3 - 4i)$. Find:
(a) $u + v$ (b) $(3 + i)u$ (c) $2iu + (4 + 7i)v$ (d) $u \cdot v$ (e) $\|u\|$ and $\|v\|$.
1.71. Prove: For any vectors u, v, w in $\mathbf{C}^n$:
(a) $(u + v) \cdot w = u \cdot w + v \cdot w$, (b) $w \cdot (u + v) = w \cdot u + w \cdot v$.
1.72. Prove that the norm in $\mathbf{C}^n$ satisfies the following laws:
[N1] For any vector u, $\|u\| \ge 0$; and $\|u\| = 0$ if and only if $u = 0$.
[N2] For any vector u and complex number z, $\|zu\| = |z|\,\|u\|$.
[N3] For any vectors u and v, $\|u + v\| \le \|u\| + \|v\|$.
Answers to Supplementary Problems
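1.41. (a) $(-3, -16, 4)$; (b) $(6, 1, 35)$; (c) $-3, 12, 8$; (d) $\sqrt{21}, \sqrt{35}, \sqrt{14}$; (e) $-3/(\sqrt{21}\sqrt{35})$; (f) $\sqrt{62}$; (g) $-\tfrac{3}{35}(3, 5, 1) = (-\tfrac{9}{35}, -\tfrac{15}{35}, -\tfrac{3}{35})$
1.42. (Column vectors) (a) $(-1, 7, -22)$; (b) $(-1, 26, -29)$; (c) $-15, -27, 34$; (d) $\sqrt{26}, \sqrt{30}$; (e) $-15/(\sqrt{26}\sqrt{30})$; (f) $\sqrt{86}$; (g) $-\tfrac{15}{30}v = (-1, -\tfrac{1}{2}, -\tfrac{5}{2})$
1.43. (a) $(-13, -14, 13, 45, 0)$; (b) $(20, -29, 22, 16, -23)$; (c) $-6$; (d) $\sqrt{90}, \sqrt{95}$; (e) $-\tfrac{6}{95}v$; (f) $\sqrt{167}$
1.44. (a) $(5/\sqrt{76},\ 9/\sqrt{76})$; (b) $(\tfrac{1}{5}, \tfrac{2}{5}, -\tfrac{2}{5}, \tfrac{4}{5})$; (c) $(6/\sqrt{133},\ -4/\sqrt{133},\ 9/\sqrt{133})$
1.45. (a) $3, 13, \sqrt{120}, 9$
1.46. (a) $x = -3$, $y = 5$; (b) $x = 0$, $y = 0$, and $x = 1$, $y = 2$
1.47. $x = -3$, $y = 3$, $z = \tfrac{3}{2}$
1.48. (a) $v = 5u_1 - u_2$; (b) $v = 16u_1 - 23u_2$
1.49. $v = 3u_1 - u_2 + 2u_3$
1.50. (a) 6; (b) 3; (c) $\tfrac{3}{2}$
1.51. (a) $v = [-1, -9, 2]$; (b) $[2, 3, 6, -10]$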
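1.52. (a) $2x_1 + 3x_2 - 5x_3 + 6x_4 = 35$; (b) $2x_1 - 3x_2 + 5x_3 - 7x_4 = -16$
1.53. (a) $[2t + 1,\ -7t + 2,\ 6t + 1,\ -11t + 2]$; (b) $[2t + 1,\ 4t + 1,\ 6t + 3,\ -8t + 3]$
1.54. (a) $-23\mathbf{j} + 13\mathbf{k}$; (b) $3\mathbf{i} - 6\mathbf{j} - 10\mathbf{k}$; (c) $-20, -12, 37$; (d) $\sqrt{29}, \sqrt{38}, \sqrt{69}$
1.55. (a) $3x - 4y + 5z = -20$; (b) $4x + 3y - 2z = -1$
1.56. (a) $[4t + 2,\ -5t + 5,\ 7t - 3]$; (b) $[2t + 1,\ -3t - 5,\ 7t + 7]$
1.57. (a) $P = F(2) = 8\mathbf{i} - 4\mathbf{j} + \mathbf{k}$; (b) $Q = F(0) = -3\mathbf{k}$, $Q' = F(5) = 125\mathbf{i} - 25\mathbf{j} + 7\mathbf{k}$; (c) $T = (6\mathbf{i} - 2\mathbf{j} + \mathbf{k})/\sqrt{41}$
1.58. (a) $\mathbf{i} + \mathbf{j} + 2\mathbf{k}$; (b) $2\mathbf{i} + 3\mathbf{j} + 2\mathbf{k}$; (c) $\sqrt{17}$; (d) $2\mathbf{i} + 6\mathbf{j}$
1.59. (a) $N = 6\mathbf{i} + 7\mathbf{j} + 9\mathbf{k}$, $6x + 7y + 9z = 45$; (b) $N = 6\mathbf{i} - 12\mathbf{j} - 10\mathbf{k}$, $3x - 6y - 5z = 16$
1.60. (a) $-3, -6, 26$; (b) $-2, -10, 34$
1.61. (a) $2\mathbf{i} + 13\mathbf{j} + 23\mathbf{k}$; (b) $-22\mathbf{i} + 2\mathbf{j} + 37\mathbf{k}$; (c) $31\mathbf{i} - 16\mathbf{j} - 6\mathbf{k}$
1.62. (a) $[5, 8, -6]$; (b) $[2, -7, 1]$; (c) $[-7, -18, 5]$
1.63. (a) 143; (b) 17
1.64. (a) $(7, 1, -3)/\sqrt{59}$; (b) $(5\mathbf{i} + 11\mathbf{j} - 2\mathbf{k})/\sqrt{150}$
1.66. (a) $50 - 55i$; (b) $-16 - 30i$; (c) $\tfrac{1}{65}(4 + 7i)$; (d) $\tfrac{1}{2}(1 + 3i)$; (e) $-2 - 2i$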
1.67. (a) -\i- (b) ^E + 27/); (c) -1,/,-1; (^ ^D + 3/)
1.68. (a) $9 - 2i$; (b) $29 - 29i$; (c) $\tfrac{1}{58}(-1 - 41i)$; (d) $2 + 5i$, $7 - 3i$; (e) $\sqrt{29}, \sqrt{58}$
1.69. (c) Hint: If $zw = 0$, then $|zw| = |z||w| = |0| = 0$
1.70. (a) $(6 + 5i,\ 5 - 10i)$; (b) $(-4 + 22i,\ 12 - 16i)$; (c) $(-8 - 41i,\ -4 - 33i)$; (d) $12 + 2i$; (e) $\sqrt{90}, \sqrt{54}$
CHAPTER 2
Algebra of Matrices
2.1 INTRODUCTION
This chapter investigates matrices and algebraic operations defined on them. These matrices may be
viewed as rectangular arrays of elements where each entry depends on two subscripts (as compared with
vectors, where each entry depended on only one subscript). Systems of linear equations and their solutions
(Chapter 3) may be efficiently investigated using the language of matrices. Furthermore, certain abstract
objects introduced in later chapters, such as "change of basis", "linear transformations", and "quadratic
forms", can be represented by these matrices (rectangular arrays). On the other hand, the abstract treatment
of linear algebra presented later on will give us new insight into the structure of these matrices.
The entries in our matrices will come from some arbitrary, but fixed, field K. The elements of K are
called numbers or scalars. Nothing essential is lost if the reader assumes that K is the real field R.
2.2 MATRICES
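A matrix A over a field K or, simply, a matrix A (when K is implicit) is a rectangular array of scalars usually presented in the following form:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
The rows of such a matrix A are the m horizontal lists of scalars:
$$(a_{11}, a_{12}, \ldots, a_{1n}), \quad (a_{21}, a_{22}, \ldots, a_{2n}), \quad \ldots, \quad (a_{m1}, a_{m2}, \ldots, a_{mn})$$
and the columns of A are the n vertical lists of scalars:
$$\begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix}, \quad \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix}, \quad \ldots, \quad \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix}$$
Note that the element $a_{ij}$, called the ij-entry or ij-element, appears in row i and column j. We frequently denote such a matrix by simply writing $A = [a_{ij}]$.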
A matrix with m rows and n columns is called an m by n matrix, written $m \times n$. The pair of numbers m and n is called the size of the matrix. Two matrices A and B are equal, written $A = B$, if they have the same size and if corresponding elements are equal. Thus the equality of two $m \times n$ matrices is equivalent to a system of mn equalities, one for each corresponding pair of elements.
A matrix with only one row is called a row matrix or row vector, and a matrix with only one column is
called a column matrix or column vector. A matrix whose entries are all zero is called a zero matrix and will
usually be denoted by 0.
Matrices whose entries are all real numbers are called real matrices and are said to be matrices over R.
Analogously, matrices whose entries are all complex numbers are called complex matrices and are said to
be matrices over C. This text will be mainly concerned with such real and complex matrices.
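Example 2.1
(a) The rectangular array $A = \begin{bmatrix} 1 & -4 & 5 \\ 0 & 3 & -2 \end{bmatrix}$ is a $2 \times 3$ matrix. Its rows are $(1, -4, 5)$ and $(0, 3, -2)$, and its columns are
$$\begin{bmatrix} 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 3 \end{bmatrix}, \quad \begin{bmatrix} 5 \\ -2 \end{bmatrix}$$
(b) The $2 \times 4$ zero matrix is the matrix $0 = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}$.
(c) Find x, y, z, t such that
$$\begin{bmatrix} x + y & 2z + t \\ x - y & z - t \end{bmatrix} = \begin{bmatrix} 3 & 7 \\ 1 & 5 \end{bmatrix}$$
By definition of equality of matrices, the four corresponding entries must be equal. Thus:
$$x + y = 3, \quad x - y = 1, \quad 2z + t = 7, \quad z - t = 5$$
Solving the above system of equations yields $x = 2$, $y = 1$, $z = 4$, $t = -1$.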
2.3 MATRIX ADDITION AND SCALAR MULTIPLICATION
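Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be two matrices with the same size, say $m \times n$ matrices. The sum of A and B, written $A + B$, is the matrix obtained by adding corresponding elements from A and B. That is,
$$A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} & \cdots & a_{1n} + b_{1n} \\ a_{21} + b_{21} & a_{22} + b_{22} & \cdots & a_{2n} + b_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} + b_{m1} & a_{m2} + b_{m2} & \cdots & a_{mn} + b_{mn} \end{bmatrix}$$
The product of the matrix A by a scalar k, written $k \cdot A$ or simply $kA$, is the matrix obtained by multiplying each element of A by k. That is,
$$kA = \begin{bmatrix} ka_{11} & ka_{12} & \cdots & ka_{1n} \\ ka_{21} & ka_{22} & \cdots & ka_{2n} \\ \vdots & \vdots & & \vdots \\ ka_{m1} & ka_{m2} & \cdots & ka_{mn} \end{bmatrix}$$
Observe that $A + B$ and $kA$ are also $m \times n$ matrices. We also define
$$-A = (-1)A \quad\text{and}\quad A - B = A + (-B)$$
The matrix $-A$ is called the negative of the matrix A, and the matrix $A - B$ is called the difference of A and B. The sum of matrices with different sizes is not defined.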
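Example 2.2. Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 0 & 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 4 & 6 & 8 \\ 1 & -3 & -7 \end{bmatrix}$. Then
$$A + B = \begin{bmatrix} 1 + 4 & -2 + 6 & 3 + 8 \\ 0 + 1 & 4 + (-3) & 5 + (-7) \end{bmatrix} = \begin{bmatrix} 5 & 4 & 11 \\ 1 & 1 & -2 \end{bmatrix}$$
$$3A = \begin{bmatrix} 3(1) & 3(-2) & 3(3) \\ 3(0) & 3(4) & 3(5) \end{bmatrix} = \begin{bmatrix} 3 & -6 & 9 \\ 0 & 12 & 15 \end{bmatrix}$$
$$2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 0 & 8 & 10 \end{bmatrix} + \begin{bmatrix} -12 & -18 & -24 \\ -3 & 9 & 21 \end{bmatrix} = \begin{bmatrix} -10 & -22 & -18 \\ -3 & 17 & 31 \end{bmatrix}$$
The matrix $2A - 3B$ is called a linear combination of A and B.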
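Matrix addition and scalar multiplication act entry by entry, so they are short to express in code. The Python sketch below is an illustration that stores a matrix as a list of rows; the function names `mat_add` and `scalar_mul` are assumptions introduced here, and the matrices are those of Example 2.2.

```python
def mat_add(A, B):
    """Sum of two matrices of the same size, entry by entry."""
    if len(A) != len(B) or any(len(ra) != len(rb) for ra, rb in zip(A, B)):
        raise ValueError("the sum of matrices with different sizes is not defined")
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def scalar_mul(k, A):
    """Product of the scalar k and the matrix A."""
    return [[k * a for a in row] for row in A]

A = [[1, -2, 3], [0, 4, 5]]
B = [[4, 6, 8], [1, -3, -7]]

print(mat_add(A, B))                                   # A + B
print(scalar_mul(3, A))                                # 3A
print(mat_add(scalar_mul(2, A), scalar_mul(-3, B)))    # 2A - 3B
```

The linear combination $2A - 3B$ is formed as $2A + (-3)B$, mirroring the definition $A - B = A + (-B)$.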
Basic properties of matrices under the operations of matrix addition and scalar multiplication follow.
Theorem 2.1: Consider any matrices A, B, C (with the same size) and any scalars k and k'. Then:
(i) $(A + B) + C = A + (B + C)$,
(ii) $A + 0 = 0 + A = A$,
(iii) $A + (-A) = (-A) + A = 0$,
(iv) $A + B = B + A$,
(v) $k(A + B) = kA + kB$,
(vi) $(k + k')A = kA + k'A$,
(vii) $(kk')A = k(k'A)$,
(viii) $1 \cdot A = A$.
Note first that the 0 in (ii) and (iii) refers to the zero matrix. Also, by (i) and (iv), any sum of matrices
$$A_1 + A_2 + \cdots + A_n$$
requires no parentheses, and the sum does not depend on the order of the matrices. Furthermore, using (vi) and (viii), we also have
$$A + A = 2A, \qquad A + A + A = 3A, \qquad \ldots$$
and so on.
The proof of Theorem 2.1 reduces to showing that the ij-entries on both sides of each matrix equation are equal. (See Problem 2.3.)
Observe the similarity between Theorem 2.1 for matrices and Theorem 1.1 for vectors. In fact, the
above operations for matrices may be viewed as generalizations of the corresponding operations for
vectors.
2.4 SUMMATION SYMBOL
Before we define matrix multiplication, it will be instructive to first introduce the summation symbol $\Sigma$ (the Greek capital letter sigma).
Suppose f(k) is an algebraic expression involving the letter k. Then the expression
$$\sum_{k=1}^{n} f(k) \quad\text{or equivalently}\quad \textstyle\sum_{k=1}^{n} f(k)$$
has the following meaning. First we set $k = 1$ in f(k), obtaining
$$f(1)$$
Then we set $k = 2$ in f(k), obtaining f(2), and add this to f(1), obtaining
$$f(1) + f(2)$$
Then we set $k = 3$ in f(k), obtaining f(3), and add this to the previous sum, obtaining
$$f(1) + f(2) + f(3)$$
We continue this process until we obtain the sum
$$f(1) + f(2) + \cdots + f(n)$$
Observe that at each step we increase the value of k by 1 until we reach n. The letter k is called the index, and 1 and n are called, respectively, the lower and upper limits. Other letters frequently used as indices are i and j.
We also generalize our definition by allowing the sum to range from any integer $n_1$ to any integer $n_2$. That is, we define
$$\sum_{k=n_1}^{n_2} f(k) = f(n_1) + f(n_1 + 1) + f(n_1 + 2) + \cdots + f(n_2)$$
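Example 2.3
(a) $\sum_{k=1}^{5} x_k = x_1 + x_2 + x_3 + x_4 + x_5$ and $\sum_{i=1}^{n} a_i b_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n$
(b) $\sum_{j=2}^{5} j^2 = 2^2 + 3^2 + 4^2 + 5^2 = 54$ and $\sum_{i=0}^{n} a_i x^i = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$
(c) $\sum_{k=1}^{p} a_{ik}b_{kj} = a_{i1}b_{1j} + a_{i2}b_{2j} + a_{i3}b_{3j} + \cdots + a_{ip}b_{pj}$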
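The summation symbol translates directly into a loop over the index. The following is a minimal sketch in Python; the function name `summation` and the sample data `a_row`, `b_col` are illustrative assumptions, not part of the text.

```python
def summation(f, lower, upper):
    """Return f(lower) + f(lower + 1) + ... + f(upper)."""
    total = 0
    for k in range(lower, upper + 1):  # k plays the role of the index
        total += f(k)
    return total

# Example 2.3(b): 2^2 + 3^2 + 4^2 + 5^2
squares = summation(lambda j: j ** 2, 2, 5)

# The pattern of Example 2.3(c) with hypothetical data:
# the sum over k of a_{ik} * b_{kj} for one fixed row i and column j.
a_row = [1, 2, 3]  # a_{i1}, a_{i2}, a_{i3}
b_col = [4, 5, 6]  # b_{1j}, b_{2j}, b_{3j}
dot = summation(lambda k: a_row[k - 1] * b_col[k - 1], 1, 3)
```

Here `squares` reproduces the value 54 from Example 2.3(b), and `dot` is the kind of sum that appears in each entry of a matrix product.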
2.5 MATRIX MULTIPLICATION
The product of matrices A and B, written AB, is somewhat complicated. For this reason, we first begin
with a special case.
The product AB of a row matrix $A = [a_i]$ and a column matrix $B = [b_i]$ with the same number of
elements is defined to be the scalar (or 1 × 1 matrix) obtained by multiplying corresponding entries and
adding; that is,
\[ AB = [a_1, a_2, \ldots, a_n] \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = a_1b_1 + a_2b_2 + \cdots + a_nb_n = \sum_{k=1}^{n} a_kb_k \]
We emphasize that AB is a scalar (or a 1 × 1 matrix). The product AB is not defined when A and B have
different numbers of elements.
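Example 2.4
(a) \[ [7, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 7(3) + (-4)(2) + 5(-1) = 21 - 8 - 5 = 8 \]
(b) \[ [6, -1, 8, 3] \begin{bmatrix} 4 \\ -9 \\ -2 \\ 5 \end{bmatrix} = 24 + 9 - 16 + 15 = 32 \]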
We are now ready to define matrix multiplication in general.
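Definition: Suppose $A = [a_{ik}]$ and $B = [b_{kj}]$ are matrices such that the number of columns of A is equal
to the number of rows of B; say, A is an m × p matrix and B is a p × n matrix. Then the
product AB is the m × n matrix whose ij-entry is obtained by multiplying the ith row of A by
the jth column of B. That is,
\[ \begin{bmatrix} a_{11} & \cdots & a_{1p} \\ \vdots & & \vdots \\ a_{i1} & \cdots & a_{ip} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mp} \end{bmatrix} \begin{bmatrix} b_{11} & \cdots & b_{1j} & \cdots & b_{1n} \\ \vdots & & \vdots & & \vdots \\ b_{p1} & \cdots & b_{pj} & \cdots & b_{pn} \end{bmatrix} = \begin{bmatrix} c_{11} & \cdots & c_{1n} \\ \vdots & c_{ij} & \vdots \\ c_{m1} & \cdots & c_{mn} \end{bmatrix} \]
where $c_{ij} = a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{ip}b_{pj} = \sum_{k=1}^{p} a_{ik}b_{kj}$.
The product AB is not defined if A is an m × p matrix and B is a q × n matrix, where p ≠ q.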
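Example 2.5
(a) Find AB where $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 5 & -2 & 6 \end{bmatrix}$.
Since A is 2 × 2 and B is 2 × 3, the product AB is defined and AB is a 2 × 3 matrix. To obtain the first row of
the product matrix AB, multiply the first row [1, 3] of A by each column of B,
\[ \begin{bmatrix} 2 \\ 5 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ -2 \end{bmatrix}, \quad \begin{bmatrix} -4 \\ 6 \end{bmatrix} \]
respectively. That is,
\[ AB = \begin{bmatrix} 2 + 15 & 0 - 6 & -4 + 18 \\ & & \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ & & \end{bmatrix} \]
To obtain the second row of AB, multiply the second row [2, -1] of A by each column of B. Thus
\[ AB = \begin{bmatrix} 17 & -6 & 14 \\ 4 - 5 & 0 + 2 & -8 - 6 \end{bmatrix} = \begin{bmatrix} 17 & -6 & 14 \\ -1 & 2 & -14 \end{bmatrix} \]
(b) Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 5 & 6 \\ 0 & -2 \end{bmatrix}$. Then
\[ AB = \begin{bmatrix} 5 + 0 & 6 - 4 \\ 15 + 0 & 18 - 8 \end{bmatrix} = \begin{bmatrix} 5 & 2 \\ 15 & 10 \end{bmatrix} \quad \text{and} \quad BA = \begin{bmatrix} 5 + 18 & 10 + 24 \\ 0 - 6 & 0 - 8 \end{bmatrix} = \begin{bmatrix} 23 & 34 \\ -6 & -8 \end{bmatrix} \]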
The above example shows that matrix multiplication is not commutative, i.e. the products AB and BA
of matrices need not be equal. However, matrix multiplication does satisfy the following properties.
Theorem 2.2: Let A, B, C be matrices. Then, whenever the products and sums are defined:
(i) (AB)C = A(BC) (associative law),
(ii) A(B + C) = AB + AC (left distributive law),
(iii) (B + C)A = BA + CA (right distributive law),
(iv) k(AB) = (kA)B = A(kB), where k is a scalar.
We note that 0A = 0 and B0 = 0, where 0 is the zero matrix.
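The definition of the product and the remarks on commutativity and associativity can be checked numerically. The sketch below assumes matrices are stored as Python lists of rows; the helper name `matmul` and the third matrix `C` are illustrative choices, not notation from the text.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    m, p, n = len(A), len(B), len(B[0])
    if any(len(row) != p for row in A):
        raise ValueError("number of columns of A must equal number of rows of B")
    # The ij-entry is the sum over k of A[i][k] * B[k][j].
    return [[sum(A[i][k] * B[k][j] for k in range(p)) for j in range(n)]
            for i in range(m)]

# The matrices of Example 2.5(b)
A = [[1, 2], [3, 4]]
B = [[5, 6], [0, -2]]
AB = matmul(A, B)   # [[5, 2], [15, 10]]
BA = matmul(B, A)   # [[23, 34], [-6, -8]]

# A hypothetical third matrix, used only to illustrate Theorem 2.2(i)
C = [[1, 0], [2, 1]]
left = matmul(matmul(A, B), C)
right = matmul(A, matmul(B, C))
```

The two products `AB` and `BA` differ, while `left` and `right` agree, as the associative law predicts.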
2.6 TRANSPOSE OF A MATRIX
The transpose of a matrix A, written $A^T$, is the matrix obtained by writing the columns of A, in order,
as rows. For example,
\[ \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}^T = \begin{bmatrix} 1 & 4 \\ 2 & 5 \\ 3 & 6 \end{bmatrix} \quad \text{and} \quad [1, -3, -5]^T = \begin{bmatrix} 1 \\ -3 \\ -5 \end{bmatrix} \]
In other words, if $A = [a_{ij}]$ is an m × n matrix, then $A^T = [b_{ij}]$ is the n × m matrix where $b_{ij} = a_{ji}$.
Observe that the transpose of a row vector is a column vector. Similarly, the transpose of a column
vector is a row vector.
The next theorem lists basic properties of the transpose operation.
Theorem 2.3: Let A and B be matrices and let k be a scalar. Then, whenever the sum and product are
defined:
(i) $(A + B)^T = A^T + B^T$,
(ii) $(A^T)^T = A$,
(iii) $(kA)^T = kA^T$,
(iv) $(AB)^T = B^TA^T$.
We emphasize that, by (iv), the transpose of a product is the product of the transposes, but in the
reverse order.
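Property (iv) is easy to get backwards, so a quick numerical check may help. This is a small sketch with hypothetical matrices `A` (2 × 3) and `B` (3 × 2); the helpers `transpose` and `matmul` are illustrative names.

```python
def transpose(A):
    """Write the columns of A, in order, as rows."""
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

A = [[1, 2, 3], [4, 5, 6]]        # 2 x 3
B = [[1, 0], [2, -1], [0, 3]]     # 3 x 2

lhs = transpose(matmul(A, B))               # (AB)^T
rhs = matmul(transpose(B), transpose(A))    # B^T A^T, note the reversed order
```

Note that the product of the transposes in the original order, $A^TB^T$, is a 3 × 3 matrix here, so it cannot equal the 2 × 2 matrix $(AB)^T$.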
2.7 SQUARE MATRICES
A square matrix is a matrix with the same number of rows as columns. An n x n square matrix is said
to be of order n and is sometimes called an n-square matrix.
Recall that not every two matrices can be added or multiplied. However, if we only consider square
matrices of some given order n, then this inconvenience disappears. Specifically, the operations of addition,
multiplication, scalar multiplication, and transpose can be performed on any n × n matrices, and the result
is again an n × n matrix.
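Example 2.6. The following are square matrices of order 3:
\[ A = \begin{bmatrix} 1 & 2 & 3 \\ -4 & -4 & -4 \\ 5 & 6 & 7 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & -5 & 1 \\ 0 & 3 & -2 \\ 1 & 2 & -4 \end{bmatrix} \]
The following are also matrices of order 3:
\[ A + B = \begin{bmatrix} 3 & -3 & 4 \\ -4 & -1 & -6 \\ 6 & 8 & 3 \end{bmatrix}, \quad 2A = \begin{bmatrix} 2 & 4 & 6 \\ -8 & -8 & -8 \\ 10 & 12 & 14 \end{bmatrix}, \quad A^T = \begin{bmatrix} 1 & -4 & 5 \\ 2 & -4 & 6 \\ 3 & -4 & 7 \end{bmatrix} \]
\[ AB = \begin{bmatrix} 5 & 7 & -15 \\ -12 & 0 & 20 \\ 17 & 7 & -35 \end{bmatrix}, \quad BA = \begin{bmatrix} 27 & 30 & 33 \\ -22 & -24 & -26 \\ -27 & -30 & -33 \end{bmatrix} \]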
Diagonal and Trace
Let $A = [a_{ij}]$ be an n-square matrix. The diagonal or main diagonal of A consists of the elements with
the same subscripts, that is,
\[ a_{11}, a_{22}, a_{33}, \ldots, a_{nn} \]
The trace of A, written tr(A), is the sum of the diagonal elements. Namely,
\[ \mathrm{tr}(A) = a_{11} + a_{22} + a_{33} + \cdots + a_{nn} \]
The following theorem applies.
Theorem 2.4: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n-square matrices and k is a scalar. Then:
(i) tr(A + B) = tr(A) + tr(B), (iii) tr($A^T$) = tr(A),
(ii) tr(kA) = k tr(A), (iv) tr(AB) = tr(BA).
Example 2.7. Let A and B be the matrices A and B in Example 2.6. Then
diagonal of A = {1, -4, 7} and tr(A) = 1 - 4 + 7 = 4
diagonal of B = {2, 3, -4} and tr(B) = 2 + 3 - 4 = 1
Moreover,
tr(A + B) = 3 - 1 + 3 = 5, tr(2A) = 2 - 8 + 14 = 8, tr($A^T$) = 1 - 4 + 7 = 4
tr(AB) = 5 + 0 - 35 = -30, tr(BA) = 27 - 24 - 33 = -30
As expected from Theorem 2.4,
tr(A + B) = tr(A) + tr(B), tr($A^T$) = tr(A), tr(2A) = 2 tr(A)
Furthermore, although AB ≠ BA, the traces are equal.
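The trace identities can be verified mechanically. The sketch below uses two 3 × 3 matrices whose diagonals match those quoted in Example 2.7; the helper names `trace` and `matmul` are illustrative assumptions.

```python
def matmul(A, B):
    """Product of an m x p matrix A and a p x n matrix B (lists of rows)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    """Sum of the diagonal elements of a square matrix."""
    if any(len(row) != len(A) for row in A):
        raise ValueError("trace is only defined for square matrices")
    return sum(A[i][i] for i in range(len(A)))

A = [[1, 2, 3], [-4, -4, -4], [5, 6, 7]]
B = [[2, -5, 1], [0, 3, -2], [1, 2, -4]]

AB = matmul(A, B)
BA = matmul(B, A)
A_plus_B = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]
```

With these matrices `AB` and `BA` are different matrices, yet `trace(AB)` and `trace(BA)` coincide, which is Theorem 2.4(iv).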
Identity Matrix, Scalar Matrices
The n-square identity or unit matrix, denoted by $I_n$, or simply I, is the n-square matrix with 1's on the
diagonal and 0's elsewhere. The identity matrix I is similar to the scalar 1 in that, for any n-square matrix
A,
AI = IA = A
More generally, if B is an m × n matrix, then $BI_n = I_mB = B$.
For any scalar k, the matrix kI that contains k's on the diagonal and 0's elsewhere is called the scalar
matrix corresponding to the scalar k. Observe that
(kI)A = k(IA) = kA
That is, multiplying a matrix A by the scalar matrix kI is equivalent to multiplying A by the scalar k.
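Example 2.8. The following are the identity matrices of orders 3 and 4 and the corresponding scalar matrices for k = 5:
\[ \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \begin{bmatrix} 1 & & & \\ & 1 & & \\ & & 1 & \\ & & & 1 \end{bmatrix}, \quad \begin{bmatrix} 5 & 0 & 0 \\ 0 & 5 & 0 \\ 0 & 0 & 5 \end{bmatrix}, \quad \begin{bmatrix} 5 & & & \\ & 5 & & \\ & & 5 & \\ & & & 5 \end{bmatrix} \]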
Remark 1: It is common practice to omit blocks or patterns of 0's when there is no ambiguity, as in
the above second and fourth matrices.
Remark 2: The Kronecker delta function $\delta_{ij}$ is defined by
\[ \delta_{ij} = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} \]
Thus the identity matrix may be defined by $I = [\delta_{ij}]$.
2.8 POWERS OF MATRICES, POLYNOMIALS IN MATRICES
Let A be an n-square matrix over a field K. Powers of A are defined as follows:
\[ A^2 = AA, \quad A^3 = A^2A, \quad \ldots, \quad A^{n+1} = A^nA, \quad \ldots, \quad \text{and} \quad A^0 = I \]
Polynomials in the matrix A are also defined. Specifically, for any polynomial
\[ f(x) = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n \]
where the $a_i$ are scalars in K, f(A) is defined to be the following matrix:
\[ f(A) = a_0I + a_1A + a_2A^2 + \cdots + a_nA^n \]
[Note that f(A) is obtained from f(x) by substituting the matrix A for the variable x and substituting the
scalar matrix $a_0I$ for the scalar $a_0$.] If f(A) is the zero matrix, then A is called a zero or root of f(x).
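Example 2.8. Suppose $A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}$. Then
\[ A^2 = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \quad \text{and} \quad A^3 = A^2A = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} = \begin{bmatrix} -11 & 38 \\ 57 & -106 \end{bmatrix} \]
Suppose $f(x) = 2x^2 - 3x + 5$ and $g(x) = x^2 + 3x - 10$. Then
\[ f(A) = 2 \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} - 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} + 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 16 & -18 \\ -27 & 61 \end{bmatrix} \]
\[ g(A) = \begin{bmatrix} 7 & -6 \\ -9 & 22 \end{bmatrix} + 3 \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix} - 10 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Thus A is a zero of the polynomial g(x).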
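Evaluating a polynomial at a matrix can be sketched as follows. The code uses Horner's scheme, which is a swapped-in evaluation order (the text computes the powers of A first); the function name `poly_of_matrix` and the coefficient-list convention are assumptions made for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def poly_of_matrix(coeffs, A):
    """Return a0*I + a1*A + ... + an*A^n for coeffs = [a0, a1, ..., an].

    Horner's scheme: start from the zero matrix and repeatedly
    multiply by A and add the next coefficient times I.
    """
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for a in reversed(coeffs):
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += a   # adding a*I only touches the diagonal
    return result

A = [[1, 2], [3, -4]]
f_A = poly_of_matrix([5, -3, 2], A)    # f(x) = 2x^2 - 3x + 5
g_A = poly_of_matrix([-10, 3, 1], A)   # g(x) = x^2 + 3x - 10
```

Here `g_A` is the zero matrix, so A is a root of g(x), in agreement with the example.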
2.9 INVERTIBLE (NONSINGULAR) MATRICES
A square matrix A is said to be invertible or nonsingular if there exists a matrix B such that
AB = BA=I
where I is the identity matrix. Such a matrix B is unique. That is, if $AB_1 = B_1A = I$ and $AB_2 = B_2A = I$,
then
\[ B_1 = B_1I = B_1(AB_2) = (B_1A)B_2 = IB_2 = B_2 \]
We call such a matrix B the inverse of A and denote it by $A^{-1}$. Observe that the above relation is symmetric;
that is, if B is the inverse of A, then A is the inverse of B.
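Example 2.9. Suppose that $A = \begin{bmatrix} 2 & 5 \\ 1 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -5 \\ -1 & 2 \end{bmatrix}$. Then
\[ AB = \begin{bmatrix} 6 - 5 & -10 + 10 \\ 3 - 3 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{and} \quad BA = \begin{bmatrix} 6 - 5 & 15 - 15 \\ -2 + 2 & -5 + 6 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Thus A and B are inverses.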
It is known (Theorem 3.15) that AB = I if and only if BA = I. Thus it is necessary to test only one
product to determine whether or not two given matrices are inverses. (See Problem 2.17.)
Now suppose A and B are invertible. Then AB is invertible and $(AB)^{-1} = B^{-1}A^{-1}$. More generally, if
$A_1, A_2, \ldots, A_k$ are invertible, then their product is invertible and
\[ (A_1A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1}A_1^{-1} \]
the product of the inverses in the reverse order.
Inverse of a 2 x 2 Matrix
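Let A be an arbitrary 2 × 2 matrix, say $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. We want to derive a formula for $A^{-1}$, the inverse
of A. Specifically, we seek $2^2 = 4$ scalars, say $x_1, y_1, x_2, y_2$, such that
\[ \begin{bmatrix} a & b \\ c & d \end{bmatrix} \begin{bmatrix} x_1 & x_2 \\ y_1 & y_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} ax_1 + by_1 & ax_2 + by_2 \\ cx_1 + dy_1 & cx_2 + dy_2 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Setting the four entries equal to the corresponding entries in the identity matrix yields four equations,
which can be partitioned into two 2 × 2 systems as follows:
\[ ax_1 + by_1 = 1, \qquad ax_2 + by_2 = 0 \]
\[ cx_1 + dy_1 = 0, \qquad cx_2 + dy_2 = 1 \]
Suppose we let |A| = ad - bc (called the determinant of A). Assuming |A| ≠ 0, we can solve uniquely for
the above unknowns $x_1, y_1, x_2, y_2$, obtaining
\[ x_1 = \frac{d}{|A|}, \quad y_1 = \frac{-c}{|A|}, \quad x_2 = \frac{-b}{|A|}, \quad y_2 = \frac{a}{|A|} \]
Accordingly,
\[ A^{-1} = \begin{bmatrix} a & b \\ c & d \end{bmatrix}^{-1} = \begin{bmatrix} d/|A| & -b/|A| \\ -c/|A| & a/|A| \end{bmatrix} = \frac{1}{|A|} \begin{bmatrix} d & -b \\ -c & a \end{bmatrix} \]
In other words, when |A| ≠ 0, the inverse of a 2 × 2 matrix A may be obtained from A as follows:
(1) Interchange the two elements on the diagonal.
(2) Take the negatives of the other two elements.
(3) Multiply the resulting matrix by 1/|A| or, equivalently, divide each element by |A|.
In case |A| = 0, the matrix A is not invertible.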
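Example 2.10. Find the inverse of $A = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 \\ 2 & 6 \end{bmatrix}$.
First evaluate |A| = 2(5) - 3(4) = 10 - 12 = -2. Since |A| ≠ 0, the matrix A is invertible and
\[ A^{-1} = \frac{1}{-2} \begin{bmatrix} 5 & -3 \\ -4 & 2 \end{bmatrix} = \begin{bmatrix} -\frac{5}{2} & \frac{3}{2} \\ 2 & -1 \end{bmatrix} \]
Now evaluate |B| = 1(6) - 3(2) = 6 - 6 = 0. Since |B| = 0, the matrix B has no inverse.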
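The three-step recipe for the inverse of a 2 × 2 matrix can be sketched in a few lines. Exact rational arithmetic from Python's standard `fractions` module is used so that no rounding occurs; the function name `inverse_2x2` and the convention of returning `None` for a non-invertible matrix are assumptions of this sketch.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of [[a, b], [c, d]], or None when the determinant is 0."""
    (a, b), (c, d) = M
    det = a * d - b * c          # |A| = ad - bc
    if det == 0:
        return None              # not invertible
    # Swap the diagonal, negate the other two entries, divide by |A|.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

A_inv = inverse_2x2([[2, 3], [4, 5]])   # determinant -2
B_inv = inverse_2x2([[1, 3], [2, 6]])   # determinant 0
```

For the matrices of Example 2.10, `A_inv` has rows (-5/2, 3/2) and (2, -1), and `B_inv` is `None`.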
Remark: The above property that a matrix is invertible if and only if it has a nonzero determinant is
true for square matrices of any order. (See Chapter 8.)
Inverse of an n × n Matrix
Suppose A is an arbitrary n-square matrix. Finding its inverse $A^{-1}$ reduces, as above, to finding the
solution of a collection of n × n systems of linear equations. The solution of such systems and an efficient
way of solving such a collection of systems are treated in Chapter 3.
2.10 SPECIAL TYPES OF SQUARE MATRICES
This section describes a number of special kinds of square matrices.
Diagonal and Triangular Matrices
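A square matrix $D = [d_{ij}]$ is diagonal if its nondiagonal entries are all zero. Such a matrix is
sometimes denoted by
\[ D = \mathrm{diag}(d_{11}, d_{22}, \ldots, d_{nn}) \]
where some or all the $d_{ii}$ may be zero. For example,
\[ \begin{bmatrix} 3 & 0 & 0 \\ 0 & -7 & 0 \\ 0 & 0 & 2 \end{bmatrix}, \quad \begin{bmatrix} 4 & 0 \\ 0 & -5 \end{bmatrix}, \quad \begin{bmatrix} 6 & & & \\ & 0 & & \\ & & -9 & \\ & & & 8 \end{bmatrix} \]
are diagonal matrices, which may be represented, respectively, by
diag(3, -7, 2), diag(4, -5), diag(6, 0, -9, 8)
(Observe that patterns of 0's in the third matrix have been omitted.)
A square matrix $A = [a_{ij}]$ is upper triangular or simply triangular if all entries below the (main)
diagonal are equal to 0, that is, if $a_{ij} = 0$ for i > j. Generic upper triangular matrices of orders 2, 3, 4 are as
follows:
\[ \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \quad \begin{bmatrix} b_{11} & b_{12} & b_{13} \\ & b_{22} & b_{23} \\ & & b_{33} \end{bmatrix}, \quad \begin{bmatrix} c_{11} & c_{12} & c_{13} & c_{14} \\ & c_{22} & c_{23} & c_{24} \\ & & c_{33} & c_{34} \\ & & & c_{44} \end{bmatrix} \]
(As with diagonal matrices, it is common practice to omit patterns of 0's.)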
The following theorem applies.
Theorem 2.5: Suppose $A = [a_{ij}]$ and $B = [b_{ij}]$ are n × n (upper) triangular matrices. Then:
(i) A + B, kA, AB are triangular with respective diagonals
$(a_{11} + b_{11}, \ldots, a_{nn} + b_{nn})$, $(ka_{11}, \ldots, ka_{nn})$, $(a_{11}b_{11}, \ldots, a_{nn}b_{nn})$.
(ii) For any polynomial f(x), the matrix f(A) is triangular with diagonal
$(f(a_{11}), f(a_{22}), \ldots, f(a_{nn}))$.
(iii) A is invertible if and only if each diagonal element $a_{ii} \neq 0$, and when $A^{-1}$ exists it is
also triangular.
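Part (i) of Theorem 2.5 can be spot-checked on a concrete pair of matrices. The matrices `A` and `B` below are hypothetical examples, and `is_upper_triangular` and `diagonal` are illustrative helper names.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_upper_triangular(M):
    """True when every entry below the main diagonal is 0."""
    return all(M[i][j] == 0 for i in range(len(M)) for j in range(i))

def diagonal(M):
    return [M[i][i] for i in range(len(M))]

A = [[1, 2, 3], [0, 4, 5], [0, 0, 6]]
B = [[2, -1, 0], [0, 3, 1], [0, 0, -2]]
AB = matmul(A, B)

# Entrywise products of the two diagonals, as the theorem predicts
expected_diag = [a * b for a, b in zip(diagonal(A), diagonal(B))]
```

The product `AB` is again upper triangular and its diagonal equals `expected_diag`. A single example does not prove the theorem, but it shows what the statement asserts.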
A lower triangular matrix is a square matrix whose entries above the diagonal are all zero. We note
that Theorem 2.5 is true if we replace "triangular" by either "lower triangular" or "diagonal".
Remark: A nonempty collection 𝒜 of matrices is called an algebra (of matrices) if 𝒜 is closed under
the operations of matrix addition, scalar multiplication, and matrix multiplication. Clearly, the square
matrices with a given order form an algebra of matrices, but so do the scalar, diagonal, triangular, and
lower triangular matrices.
Special Real Square Matrices: Symmetric, Orthogonal, Normal
Suppose now A is a square matrix with real entries, that is, a real square matrix. The relationship
between A and its transpose $A^T$ yields important kinds of matrices.
(a) Symmetric Matrices
A matrix A is symmetric if $A^T = A$. Equivalently, $A = [a_{ij}]$ is symmetric if symmetric elements (mirror
elements with respect to the diagonal) are equal, that is, if each $a_{ij} = a_{ji}$.
A matrix A is skew-symmetric if $A^T = -A$ or, equivalently, if each $a_{ij} = -a_{ji}$. Clearly, the diagonal
elements of such a matrix must be zero, since $a_{ii} = -a_{ii}$ implies $a_{ii} = 0$.
(Note that a matrix A must be square if $A^T = A$ or $A^T = -A$.)
Example 2.11. Let $A = \begin{bmatrix} 2 & -3 & 5 \\ -3 & 6 & 7 \\ 5 & 7 & -8 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 & -4 \\ -3 & 0 & 5 \\ 4 & -5 & 0 \end{bmatrix}$, $C = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}$.
(a) By inspection, the symmetric elements in A are equal, or $A^T = A$. Thus A is symmetric.
(b) The diagonal elements of B are 0 and symmetric elements are negatives of each other, or $B^T = -B$. Thus B is
skew-symmetric.
(c) Since C is not square, C is neither symmetric nor skew-symmetric.
(b) Orthogonal Matrices
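A real matrix A is orthogonal if $A^T = A^{-1}$, that is, if $AA^T = A^TA = I$. Thus A must necessarily be
square and invertible.
Example 2.12. Let $A = \begin{bmatrix} \frac{1}{9} & \frac{8}{9} & -\frac{4}{9} \\ \frac{4}{9} & -\frac{4}{9} & -\frac{7}{9} \\ \frac{8}{9} & \frac{1}{9} & \frac{4}{9} \end{bmatrix}$. Multiplying A by $A^T$ yields I; that is, $AA^T = I$. This means $A^TA = I$, as well.
Thus $A^T = A^{-1}$; that is, A is orthogonal.
Now suppose A is a real orthogonal 3 × 3 matrix with rows
\[ u_1 = (a_1, a_2, a_3), \quad u_2 = (b_1, b_2, b_3), \quad u_3 = (c_1, c_2, c_3) \]
Since A is orthogonal, we must have $AA^T = I$. Namely
\[ AA^T = \begin{bmatrix} a_1 & a_2 & a_3 \\ b_1 & b_2 & b_3 \\ c_1 & c_2 & c_3 \end{bmatrix} \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]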
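Multiplying A by $A^T$ and setting each entry equal to the corresponding entry in I yields the following nine equations:
\[ a_1^2 + a_2^2 + a_3^2 = 1, \qquad a_1b_1 + a_2b_2 + a_3b_3 = 0, \qquad a_1c_1 + a_2c_2 + a_3c_3 = 0 \]
\[ b_1a_1 + b_2a_2 + b_3a_3 = 0, \qquad b_1^2 + b_2^2 + b_3^2 = 1, \qquad b_1c_1 + b_2c_2 + b_3c_3 = 0 \]
\[ c_1a_1 + c_2a_2 + c_3a_3 = 0, \qquad c_1b_1 + c_2b_2 + c_3b_3 = 0, \qquad c_1^2 + c_2^2 + c_3^2 = 1 \]
Accordingly, $u_1 \cdot u_1 = 1$, $u_2 \cdot u_2 = 1$, $u_3 \cdot u_3 = 1$, and $u_i \cdot u_j = 0$ for i ≠ j. Thus the rows $u_1, u_2, u_3$ are unit vectors
and are orthogonal to each other.
Generally speaking, vectors $u_1, u_2, \ldots, u_m$ in $\mathbf{R}^n$ are said to form an orthonormal set of vectors if the
vectors are unit vectors and are orthogonal to each other, that is,
\[ u_i \cdot u_j = \begin{cases} 0 & \text{if } i \neq j \\ 1 & \text{if } i = j \end{cases} \]
In other words, $u_i \cdot u_j = \delta_{ij}$ where $\delta_{ij}$ is the Kronecker delta function.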
We have shown that the condition $AA^T = I$ implies that the rows of A form an orthonormal set of
vectors. The condition $A^TA = I$ similarly implies that the columns of A also form an orthonormal set of
vectors. Furthermore, since each step is reversible, the converse is true.
The above results for 3 × 3 matrices are true in general. That is, the following theorem holds.
Theorem 2.6: Let A be a real square matrix. Then the following are equivalent:
(a) A is orthogonal.
(b) The rows of A form an orthonormal set.
(c) The columns of A form an orthonormal set.
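The equivalences in Theorem 2.6 suggest a direct test for orthogonality: form both products with the transpose and compare with the identity. The sketch below uses exact fractions and a matrix whose rows are orthonormal with entries in ninths; the helper names are illustrative assumptions.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def is_orthogonal(A):
    """True when A is square and A A^T = A^T A = I."""
    if any(len(row) != len(A) for row in A):
        return False
    At = transpose(A)
    I = identity(len(A))
    return matmul(A, At) == I and matmul(At, A) == I

# Each entry is a multiple of 1/9
A = [[Fraction(x, 9) for x in row]
     for row in [[1, 8, -4], [4, -4, -7], [8, 1, 4]]]
result = is_orthogonal(A)
```

Checking `matmul(A, At) == I` is the same as checking that the rows are orthonormal, since the ij-entry of $AA^T$ is the dot product of rows i and j.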
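For n = 2, we have the following result (proved in Problem 2.28).
Theorem 2.7: Let A be a real 2 × 2 orthogonal matrix. Then, for some real number θ,
\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]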
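(c) Normal Matrices
A real matrix A is normal if it commutes with its transpose $A^T$, that is, if $AA^T = A^TA$. If A is
symmetric, orthogonal, or skew-symmetric, then A is normal. There are also other normal matrices.
Example 2.13. Let $A = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix}$. Then
\[ AA^T = \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix} \quad \text{and} \quad A^TA = \begin{bmatrix} 6 & 3 \\ -3 & 6 \end{bmatrix} \begin{bmatrix} 6 & -3 \\ 3 & 6 \end{bmatrix} = \begin{bmatrix} 45 & 0 \\ 0 & 45 \end{bmatrix} \]
Since $AA^T = A^TA$, the matrix A is normal.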
2.11 COMPLEX MATRICES
Let A be a complex matrix, that is, a matrix with complex entries. Recall (Section 1.7) that if
z = a + bi is a complex number, then $\bar{z} = a - bi$ is its conjugate. The conjugate of a complex matrix A,
written $\bar{A}$, is the matrix obtained from A by taking the conjugate of each entry in A. That is, if $A = [a_{ij}]$,
then $\bar{A} = [b_{ij}]$, where $b_{ij} = \bar{a}_{ij}$. (We denote this fact by writing $\bar{A} = [\bar{a}_{ij}]$.)
The two operations of transpose and conjugation commute for any complex matrix A, and the special
notation $A^H$ is used for the conjugate transpose of A. That is,
\[ A^H = (\bar{A})^T = \overline{(A^T)} \]
Note that if A is real, then $A^H = A^T$. [Some texts use $A^*$ instead of $A^H$.]
Example 2.14. Let $A = \begin{bmatrix} 2 + 8i & 5 - 3i & 4 - 7i \\ 6i & 1 - 4i & 3 + 2i \end{bmatrix}$. Then $A^H = \begin{bmatrix} 2 - 8i & -6i \\ 5 + 3i & 1 + 4i \\ 4 + 7i & 3 - 2i \end{bmatrix}$.
Special Complex Matrices: Hermitian, Unitary, Normal
Consider a complex matrix A. The relationship between A and its conjugate transpose A^ yields
important kinds of complex matrices (which are analogous to the kinds of real matrices described above).
A complex matrix A is said to be Hermitian or skew-Hermitian according as
\[ A^H = A \quad \text{or} \quad A^H = -A \]
Clearly, $A = [a_{ij}]$ is Hermitian if and only if symmetric elements are conjugate, that is, if each $a_{ij} = \bar{a}_{ji}$, in
which case each diagonal element $a_{ii}$ must be real. Similarly, if A is skew-Hermitian, then each diagonal
element $a_{ii}$ satisfies $a_{ii} = -\bar{a}_{ii}$, so it is zero or purely imaginary. (Note that A must be square if $A^H = A$ or $A^H = -A$.)
A complex matrix A is unitary if $A^HA = AA^H = I$, that is, if
\[ A^H = A^{-1} \]
Thus A must necessarily be square and invertible. We note that a complex matrix A is unitary if and only if
its rows (columns) form an orthonormal set relative to the dot product of complex vectors.
A complex matrix A is said to be normal if it commutes with $A^H$, that is, if
\[ AA^H = A^HA \]
(Thus A must be a square matrix.) This definition reduces to that for real matrices when A is real.
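Example 2.15. Consider the following complex matrices:
\[ A = \begin{bmatrix} 3 & 1 - 2i & 4 + 7i \\ 1 + 2i & -4 & -2i \\ 4 - 7i & 2i & 5 \end{bmatrix}, \quad B = \frac{1}{2} \begin{bmatrix} 1 & -i & -1 + i \\ i & 1 & 1 + i \\ 1 + i & -1 + i & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix} \]
(a) By inspection, the diagonal elements of A are real and the symmetric elements 1 - 2i and 1 + 2i are conjugate,
4 + 7i and 4 - 7i are conjugate, and -2i and 2i are conjugate. Thus A is Hermitian.
(b) Multiplying B by $B^H$ yields I, that is, $BB^H = I$. This implies $B^HB = I$, as well. Thus $B^H = B^{-1}$, which means B
is unitary.
(c) To show C is normal, we evaluate $CC^H$ and $C^HC$:
\[ CC^H = \begin{bmatrix} 2 + 3i & 1 \\ i & 1 + 2i \end{bmatrix} \begin{bmatrix} 2 - 3i & -i \\ 1 & 1 - 2i \end{bmatrix} = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix} \]
and similarly $C^HC = \begin{bmatrix} 14 & 4 - 4i \\ 4 + 4i & 6 \end{bmatrix}$. Since $CC^H = C^HC$, the complex matrix C is normal.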
We note that when a matrix A is real, Hermitian is the same as symmetric, and unitary is the same as
orthogonal.
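The three kinds of special complex matrices can be tested with Python's built-in complex numbers. This is a minimal sketch: the helper names (`conj_transpose`, `close`, and the three predicates) are assumptions, and a small tolerance is used in comparisons because complex arithmetic is carried out in floating point.

```python
def conj_transpose(A):
    """Conjugate transpose A^H: entry (j, i) is the conjugate of A[i][j]."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def close(M, N, tol=1e-12):
    """Entrywise comparison up to a small tolerance."""
    return all(abs(x - y) <= tol for rm, rn in zip(M, N) for x, y in zip(rm, rn))

def is_hermitian(A):
    return close(conj_transpose(A), A)

def is_unitary(A):
    I = [[1 if i == j else 0 for j in range(len(A))] for i in range(len(A))]
    AH = conj_transpose(A)
    return close(matmul(A, AH), I) and close(matmul(AH, A), I)

def is_normal(A):
    AH = conj_transpose(A)
    return close(matmul(A, AH), matmul(AH, A))

A = [[3, 1 - 2j, 4 + 7j], [1 + 2j, -4, -2j], [4 - 7j, 2j, 5]]
B = [[0.5 * z for z in row]
     for row in [[1, -1j, -1 + 1j], [1j, 1, 1 + 1j], [1 + 1j, -1 + 1j, 0]]]
C = [[2 + 3j, 1], [1j, 1 + 2j]]
```

With these inputs, `is_hermitian(A)`, `is_unitary(B)`, and `is_normal(C)` all hold, while C itself is neither Hermitian nor unitary.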
2.12 BLOCK MATRICES
Using a system of horizontal and vertical (dashed) lines, we can partition a matrix A into submatrices
called blocks (or cells) of A. Clearly a given matrix may be divided into blocks in different ways. For
example:
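\[ \begin{bmatrix} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ 3 & 1 & 4 & 5 & 9 \\ 4 & 6 & -3 & 1 & 8 \end{bmatrix} \]
(The original display shows this same 4 × 5 matrix three times, each with the partitioning lines drawn in different positions.)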
The convenience of the partition of matrices, say A and B, into blocks is that the result of operations on A
and B can be obtained by carrying out the computation with the blocks, just as if they were the actual
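elements of the matrices. This is illustrated below, where the notation $A = [A_{ij}]$ will be used for a block
matrix A with blocks $A_{ij}$.
Suppose that $A = [A_{ij}]$ and $B = [B_{ij}]$ are block matrices with the same numbers of row and column
blocks, and suppose that corresponding blocks have the same size. Then adding the corresponding blocks
of A and B also adds the corresponding elements of A and B, and multiplying each block of A by a scalar k
multiplies each element of A by k. Thus
\[ A + B = \begin{bmatrix} A_{11} + B_{11} & A_{12} + B_{12} & \cdots & A_{1n} + B_{1n} \\ A_{21} + B_{21} & A_{22} + B_{22} & \cdots & A_{2n} + B_{2n} \\ \vdots & \vdots & & \vdots \\ A_{m1} + B_{m1} & A_{m2} + B_{m2} & \cdots & A_{mn} + B_{mn} \end{bmatrix} \]
and
\[ kA = \begin{bmatrix} kA_{11} & kA_{12} & \cdots & kA_{1n} \\ kA_{21} & kA_{22} & \cdots & kA_{2n} \\ \vdots & \vdots & & \vdots \\ kA_{m1} & kA_{m2} & \cdots & kA_{mn} \end{bmatrix} \]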
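The case of matrix multiplication is less obvious, but still true. That is, suppose that $U = [U_{ik}]$ and
$V = [V_{kj}]$ are block matrices such that the number of columns of each block $U_{ik}$ is equal to the number of
rows of each block $V_{kj}$. (Thus each product $U_{ik}V_{kj}$ is defined.) Then
\[ UV = \begin{bmatrix} W_{11} & W_{12} & \cdots & W_{1n} \\ W_{21} & W_{22} & \cdots & W_{2n} \\ \vdots & \vdots & & \vdots \\ W_{m1} & W_{m2} & \cdots & W_{mn} \end{bmatrix}, \quad \text{where} \quad W_{ij} = U_{i1}V_{1j} + U_{i2}V_{2j} + \cdots + U_{ip}V_{pj} \]
The proof of the above formula for UV is straightforward, but detailed and lengthy. It is left as an exercise
(Problem 2.85).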
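The block formula for UV can be checked against ordinary multiplication on a small case. In this sketch the matrices `U` and `V`, the cut positions, and the helper names `split`, `block_matmul`, and `assemble` are all hypothetical; the only requirement taken from the text is that the column partition of U matches the row partition of V.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matadd(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, row_cuts, col_cuts):
    """Partition M into blocks; cuts list the boundaries, e.g. [0, 2, 3]."""
    return [[[row[c0:c1] for row in M[r0:r1]]
             for c0, c1 in zip(col_cuts, col_cuts[1:])]
            for r0, r1 in zip(row_cuts, row_cuts[1:])]

def block_matmul(U, V):
    """W_ij = U_i1 V_1j + U_i2 V_2j + ..., treating blocks as entries."""
    W = []
    for i in range(len(U)):
        W_row = []
        for j in range(len(V[0])):
            acc = matmul(U[i][0], V[0][j])
            for k in range(1, len(V)):
                acc = matadd(acc, matmul(U[i][k], V[k][j]))
            W_row.append(acc)
        W.append(W_row)
    return W

def assemble(blocks):
    """Glue a block matrix back into an ordinary list of rows."""
    rows = []
    for block_row in blocks:
        for r in range(len(block_row[0])):
            rows.append([x for blk in block_row for x in blk[r]])
    return rows

U = [[1, 2, 0], [3, 4, 1], [0, 5, 6]]
V = [[1, 0], [2, 1], [0, 3]]
# Columns of U are cut as [0, 1, 3]; rows of V must be cut the same way.
Ub = split(U, [0, 2, 3], [0, 1, 3])
Vb = split(V, [0, 1, 3], [0, 1, 2])
via_blocks = assemble(block_matmul(Ub, Vb))
direct = matmul(U, V)
```

Both routes give the same matrix, so `via_blocks == direct`.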
Square Block Matrices
Let M be a block matrix. Then M is called a square block matrix if:
(i) M is a square matrix, (ii) The blocks form a square matrix,
(iii) The diagonal blocks are also square matrices.
The latter two conditions will occur if and only if there are the same number of horizontal and vertical
lines and they are placed symmetrically.
Consider the following two block matrices, A and B, which have the same entries but are partitioned by different
horizontal and vertical lines:
\[ \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 1 & 1 & 1 & 1 & 1 \\ 9 & 8 & 7 & 6 & 5 \\ 4 & 4 & 4 & 4 & 4 \\ 3 & 5 & 3 & 5 & 3 \end{bmatrix} \]
The block matrix A is not a square block matrix, since the second and third diagonal blocks are not square.
On the other hand, the block matrix B is a square block matrix.
Block Diagonal Matrices
Let $M = [A_{ij}]$ be a square block matrix such that the nondiagonal blocks are all zero matrices, that is,
$A_{ij} = 0$ when i ≠ j. Then M is called a block diagonal matrix. We sometimes denote such a block diagonal
matrix by writing
\[ M = \mathrm{diag}(A_{11}, A_{22}, \ldots, A_{rr}) \quad \text{or} \quad M = A_{11} \oplus A_{22} \oplus \cdots \oplus A_{rr} \]
The importance of block diagonal matrices is that the algebra of the block matrix is frequently reduced to
the algebra of the individual blocks. Specifically, suppose f(x) is a polynomial and M is the above block
diagonal matrix. Then f(M) is a block diagonal matrix and
\[ f(M) = \mathrm{diag}(f(A_{11}), f(A_{22}), \ldots, f(A_{rr})) \]
Also, M is invertible if and only if each $A_{ii}$ is invertible, and, in such a case, $M^{-1}$ is a block diagonal matrix
and
\[ M^{-1} = \mathrm{diag}(A_{11}^{-1}, A_{22}^{-1}, \ldots, A_{rr}^{-1}) \]
Analogously, a square block matrix is called a block upper triangular matrix if the blocks below the
diagonal are zero matrices, and a block lower triangular matrix if the blocks above the diagonal are zero
matrices.
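Example 2.16. Determine which of the following square block matrices are upper triangular, lower triangular, or diagonal:
\[ A = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 0 & 6 \end{array}\right], \quad B = \left[\begin{array}{c|cc|c} 1 & 0 & 0 & 0 \\ \hline 2 & 3 & 4 & 0 \\ 5 & 0 & 6 & 0 \\ \hline 0 & 7 & 8 & 9 \end{array}\right], \quad C = \left[\begin{array}{c|cc} 1 & 0 & 0 \\ \hline 0 & 2 & 3 \\ 0 & 4 & 5 \end{array}\right], \quad D = \left[\begin{array}{cc|c} 1 & 2 & 0 \\ 3 & 4 & 5 \\ \hline 0 & 6 & 7 \end{array}\right] \]
(a) A is upper triangular since the block below the diagonal is a zero block.
(b) B is lower triangular since all blocks above the diagonal are zero blocks.
(c) C is diagonal since the blocks above and below the diagonal are zero blocks.
(d) D is neither upper triangular nor lower triangular. Also, no other partitioning of D will make it into either a block
upper triangular matrix or a block lower triangular matrix.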
Solved Problems
MATRIX ADDITION AND SCALAR MULTIPLICATION
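2.1. Given $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & 5 & -6 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 0 & 2 \\ -7 & 1 & 8 \end{bmatrix}$, find:
(a) A + B, (b) 2A - 3B.
(a) Add the corresponding elements:
\[ A + B = \begin{bmatrix} 1 + 3 & -2 + 0 & 3 + 2 \\ 4 - 7 & 5 + 1 & -6 + 8 \end{bmatrix} = \begin{bmatrix} 4 & -2 & 5 \\ -3 & 6 & 2 \end{bmatrix} \]
(b) First perform the scalar multiplication and then a matrix addition:
\[ 2A - 3B = \begin{bmatrix} 2 & -4 & 6 \\ 8 & 10 & -12 \end{bmatrix} + \begin{bmatrix} -9 & 0 & -6 \\ 21 & -3 & -24 \end{bmatrix} = \begin{bmatrix} -7 & -4 & 0 \\ 29 & 7 & -36 \end{bmatrix} \]
(Note that we multiply B by -3 and then add, rather than multiplying B by 3 and subtracting. This
usually prevents errors.)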
2.2. Find x, y, z, t where $3 \begin{bmatrix} x & y \\ z & t \end{bmatrix} = \begin{bmatrix} x & 6 \\ -1 & 2t \end{bmatrix} + \begin{bmatrix} 4 & x + y \\ z + t & 3 \end{bmatrix}$.
Write each side as a single matrix:
\[ \begin{bmatrix} 3x & 3y \\ 3z & 3t \end{bmatrix} = \begin{bmatrix} x + 4 & x + y + 6 \\ z + t - 1 & 2t + 3 \end{bmatrix} \]
Set corresponding entries equal to each other to obtain the following system of four equations:
3x = x + 4, 3y = x + y + 6, 3z = z + t - 1, 3t = 2t + 3
or 2x = 4, 2y = 6 + x, 2z = t - 1, t = 3
The solution is x = 2, y = 4, z = 1, t = 3.
2.3. Prove Theorem 2.1 (i) and (v): (i) (A + B) + C =A + (B + C), (v) k(A +B) = kA + kB.
Suppose $A = [a_{ij}]$, $B = [b_{ij}]$, $C = [c_{ij}]$. The proof reduces to showing that corresponding ij-entries in each
side of each matrix equation are equal. [We only prove (i) and (v), since the other parts of Theorem 2.1 are
proved similarly.]
(i) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence the ij-entry of (A + B) + C is $(a_{ij} + b_{ij}) + c_{ij}$. On the other hand,
the ij-entry of B + C is $b_{ij} + c_{ij}$, and hence the ij-entry of A + (B + C) is $a_{ij} + (b_{ij} + c_{ij})$. However, for
scalars in K,
\[ (a_{ij} + b_{ij}) + c_{ij} = a_{ij} + (b_{ij} + c_{ij}) \]
Thus (A + B) + C and A + (B + C) have identical ij-entries. Therefore (A + B) + C = A + (B + C).
(v) The ij-entry of A + B is $a_{ij} + b_{ij}$; hence $k(a_{ij} + b_{ij})$ is the ij-entry of k(A + B). On the other hand, the
ij-entries of kA and kB are $ka_{ij}$ and $kb_{ij}$, respectively. Thus $ka_{ij} + kb_{ij}$ is the ij-entry of kA + kB. However,
for scalars in K,
\[ k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij} \]
Thus k(A + B) and kA + kB have identical ij-entries. Therefore k(A + B) = kA + kB.
MATRIX MULTIPLICATION
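2.4. Calculate: (a) $[8, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix}$, (b) $[6, -1, 7, 5] \begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix}$, (c) $[3, 8, -2, 4] \begin{bmatrix} 5 \\ -1 \\ 6 \end{bmatrix}$
(a) Multiply the corresponding entries and add:
\[ [8, -4, 5] \begin{bmatrix} 3 \\ 2 \\ -1 \end{bmatrix} = 8(3) + (-4)(2) + 5(-1) = 24 - 8 - 5 = 11 \]
(b) Multiply the corresponding entries and add:
\[ [6, -1, 7, 5] \begin{bmatrix} 4 \\ -9 \\ -3 \\ 2 \end{bmatrix} = 24 + 9 - 21 + 10 = 22 \]
(c) The product is not defined when the row matrix and the column matrix have different numbers of
elements.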
2.5. Let (r × s) denote an r × s matrix. Find the sizes of those matrix products that are defined:
(a) (2 × 3)(3 × 4), (c) (1 × 2)(3 × 1), (e) (4 × 4)(3 × 3)
(b) (4 × 1)(1 × 2), (d) (5 × 2)(2 × 3), (f) (2 × 2)(2 × 4)
In each case, the product is defined if the inner numbers are equal, and then the product will have the size
of the outer numbers in the given order.
(a) 2 × 4, (c) not defined, (e) not defined
(b) 4 × 2, (d) 5 × 3, (f) 2 × 4
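2.6. Let $A = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix}$. Find: (a) AB, (b) BA.
(a) Since A is a 2 × 2 and B a 2 × 3 matrix, the product AB is defined and is a 2 × 3 matrix. To obtain the
entries in the first row of AB, multiply the first row [1, 3] of A by the columns
$\begin{bmatrix} 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 0 \\ -2 \end{bmatrix}$, $\begin{bmatrix} -4 \\ 6 \end{bmatrix}$ of B, respectively, as follows:
\[ AB = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 2 + 9 & 0 - 6 & -4 + 18 \\ & & \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ & & \end{bmatrix} \]
To obtain the entries in the second row of AB, multiply the second row [2, -1] of A by the columns of B:
\[ AB = \begin{bmatrix} 1 & 3 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & 0 & -4 \\ 3 & -2 & 6 \end{bmatrix} = \begin{bmatrix} 11 & -6 & 14 \\ 4 - 3 & 0 + 2 & -8 - 6 \end{bmatrix} \]
Thus
\[ AB = \begin{bmatrix} 11 & -6 & 14 \\ 1 & 2 & -14 \end{bmatrix} \]
(b) The size of B is 2 × 3 and that of A is 2 × 2. The inner numbers 3 and 2 are not equal; hence the product
BA is not defined.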
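2.7. Find AB, where $A = \begin{bmatrix} 2 & 3 & -1 \\ 4 & -2 & 5 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & -1 & 0 & 6 \\ 1 & 3 & -5 & 1 \\ 4 & 1 & -2 & 2 \end{bmatrix}$.
Since A is a 2 × 3 and B a 3 × 4 matrix, the product AB is defined and is a 2 × 4 matrix. Multiply the
rows of A by the columns of B to obtain
\[ AB = \begin{bmatrix} 4 + 3 - 4 & -2 + 9 - 1 & 0 - 15 + 2 & 12 + 3 - 2 \\ 8 - 2 + 20 & -4 - 6 + 5 & 0 + 10 - 10 & 24 - 2 + 10 \end{bmatrix} = \begin{bmatrix} 3 & 6 & -13 & 13 \\ 26 & -5 & 0 & 32 \end{bmatrix} \]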
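2.8. Find: (a) $\begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -7 \end{bmatrix}$, (b) $\begin{bmatrix} 2 \\ -7 \end{bmatrix} \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$, (c) $[2, -7] \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix}$
(a) The first factor is 2 × 2 and the second is 2 × 1, so the product is defined as a 2 × 1 matrix:
\[ \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} \begin{bmatrix} 2 \\ -7 \end{bmatrix} = \begin{bmatrix} 2 - 42 \\ -6 - 35 \end{bmatrix} = \begin{bmatrix} -40 \\ -41 \end{bmatrix} \]
(b) The product is not defined, since the first factor is 2 × 1 and the second factor is 2 × 2.
(c) The first factor is 1 × 2 and the second factor is 2 × 2, so the product is defined as a 1 × 2 (row) matrix:
\[ [2, -7] \begin{bmatrix} 1 & 6 \\ -3 & 5 \end{bmatrix} = [2 + 21, 12 - 35] = [23, -23] \]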
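2.9. Clearly 0A = 0 and A0 = 0, where the 0's are zero matrices (with possibly different sizes). Find
matrices A and B with no zero entries such that AB = 0.
Let $A = \begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 6 & 2 \\ -3 & -1 \end{bmatrix}$. Then $AB = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$.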
2.10. Prove Theorem 2.2(i): (AB)C = A(BC).
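Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{kl}]$, and let $AB = S = [s_{ik}]$, $BC = T = [t_{jl}]$. Then
\[ s_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} \quad \text{and} \quad t_{jl} = \sum_{k=1}^{n} b_{jk}c_{kl} \]
Multiplying S = AB by C, the il-entry of (AB)C is
\[ s_{i1}c_{1l} + s_{i2}c_{2l} + \cdots + s_{in}c_{nl} = \sum_{k=1}^{n} s_{ik}c_{kl} = \sum_{k=1}^{n} \sum_{j=1}^{m} (a_{ij}b_{jk})c_{kl} \]
On the other hand, multiplying A by T = BC, the il-entry of A(BC) is
\[ a_{i1}t_{1l} + a_{i2}t_{2l} + \cdots + a_{im}t_{ml} = \sum_{j=1}^{m} a_{ij}t_{jl} = \sum_{j=1}^{m} \sum_{k=1}^{n} a_{ij}(b_{jk}c_{kl}) \]
The above sums are equal; that is, corresponding elements in (AB)C and A(BC) are equal. Thus
(AB)C = A(BC).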
2.11. Prove Theorem 2.2(ii): A(B + C) = AB + AC.
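Let $A = [a_{ij}]$, $B = [b_{jk}]$, $C = [c_{jk}]$, and let $D = B + C = [d_{jk}]$, $E = AB = [e_{ik}]$, $F = AC = [f_{ik}]$. Then
\[ d_{jk} = b_{jk} + c_{jk}, \qquad e_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk}, \qquad f_{ik} = \sum_{j=1}^{m} a_{ij}c_{jk} \]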
Thus the ik-entry of the matrix AB + AC is
\[ e_{ik} + f_{ik} = \sum_{j=1}^{m} a_{ij}b_{jk} + \sum_{j=1}^{m} a_{ij}c_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk}) \]
On the other hand, the ik-entry of the matrix AD = A(B + C) is
\[ a_{i1}d_{1k} + a_{i2}d_{2k} + \cdots + a_{im}d_{mk} = \sum_{j=1}^{m} a_{ij}d_{jk} = \sum_{j=1}^{m} a_{ij}(b_{jk} + c_{jk}) \]
Thus A(B + C) = AB + AC, since the corresponding elements are equal.
TRANSPOSE
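2.12. Find the transpose of each matrix:
\[ A = \begin{bmatrix} 1 & -2 & 3 \\ 7 & 8 & -9 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C = [1, -3, 5, -7], \quad D = \begin{bmatrix} 2 \\ -4 \\ 6 \end{bmatrix} \]
Rewrite the rows of each matrix as columns to obtain the transpose of the matrix:
\[ A^T = \begin{bmatrix} 1 & 7 \\ -2 & 8 \\ 3 & -9 \end{bmatrix}, \quad B^T = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 5 \\ 3 & 5 & 6 \end{bmatrix}, \quad C^T = \begin{bmatrix} 1 \\ -3 \\ 5 \\ -7 \end{bmatrix}, \quad D^T = [2, -4, 6] \]
(Note that $B^T = B$; such a matrix is said to be symmetric. Note also that the transpose of the row vector C is a
column vector, and the transpose of the column vector D is a row vector.)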
2.13. Prove Theorem 2.3(iv): $(AB)^T = B^TA^T$.
Let $A = [a_{ik}]$ and $B = [b_{kj}]$. Then the ij-entry of AB is
\[ a_{i1}b_{1j} + a_{i2}b_{2j} + \cdots + a_{im}b_{mj} \]
This is the ji-entry (reverse order) of $(AB)^T$. Now column j of B becomes row j of $B^T$, and row i of A becomes
column i of $A^T$. Thus the ji-entry of $B^TA^T$ is
\[ [b_{1j}, b_{2j}, \ldots, b_{mj}][a_{i1}, a_{i2}, \ldots, a_{im}]^T = b_{1j}a_{i1} + b_{2j}a_{i2} + \cdots + b_{mj}a_{im} \]
Thus $(AB)^T = B^TA^T$, since the corresponding entries are equal.
SQUARE MATRICES
2.14. Find the diagonal and trace of each matrix:
\[ \text{(a) } A = \begin{bmatrix} 1 & 3 & 6 \\ 2 & -5 & 8 \\ 4 & -2 & 9 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 2 & 4 & 8 \\ 3 & -7 & 9 \\ -5 & 0 & 2 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix} \]
(a) The diagonal of A consists of the elements from the upper left corner of A to the lower right corner of A
or, in other words, the elements $a_{11}, a_{22}, a_{33}$. Thus the diagonal of A consists of the numbers 1, -5, and
9. The trace of A is the sum of the diagonal elements. Thus
tr(A) = 1 - 5 + 9 = 5
(b) The diagonal of B consists of the numbers 2, -7, and 2. Hence
tr(B) = 2 - 7 + 2 = -3
(c) The diagonal and trace are only defined for square matrices.
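2.15. Let $A = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix}$, and let $f(x) = 2x^3 - 4x + 5$ and $g(x) = x^2 + 2x - 11$. Find:
(a) $A^2$, (b) $A^3$, (c) f(A), (d) g(A).
(a) \[ A^2 = AA = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} = \begin{bmatrix} 1 + 8 & 2 - 6 \\ 4 - 12 & 8 + 9 \end{bmatrix} = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} \]
(b) \[ A^3 = AA^2 = \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} = \begin{bmatrix} 9 - 16 & -4 + 34 \\ 36 + 24 & -16 - 51 \end{bmatrix} = \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix} \]
(c) First substitute A for x and 5I for the constant in f(x), obtaining
\[ f(A) = 2A^3 - 4A + 5I = 2 \begin{bmatrix} -7 & 30 \\ 60 & -67 \end{bmatrix} - 4 \begin{bmatrix} 1 & 2 \\ 4 & -3 \end{bmatrix} + 5 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \]
Now perform the scalar multiplication and then the matrix addition:
\[ f(A) = \begin{bmatrix} -14 & 60 \\ 120 & -134 \end{bmatrix} + \begin{bmatrix} -4 & -8 \\ -16 & 12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} -13 & 52 \\ 104 & -117 \end{bmatrix} \]
(d) Substitute A for x and 11I for the constant in g(x), and then calculate as follows:
\[ g(A) = A^2 + 2A - 11I = \begin{bmatrix} 9 & -4 \\ -8 & 17 \end{bmatrix} + \begin{bmatrix} 2 & 4 \\ 8 & -6 \end{bmatrix} + \begin{bmatrix} -11 & 0 \\ 0 & -11 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Since g(A) is the zero matrix, A is a root of the polynomial g(x).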
2.16. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}$. (a) Find a nonzero column vector $u = \begin{bmatrix} x \\ y \end{bmatrix}$ such that $Au = 3u$. (b) Describe all such vectors.
(a) First set up the matrix equation $Au = 3u$, and then write each side as a single matrix (column vector) as follows:
\[ \begin{bmatrix} 1 & 3 \\ 4 & -3 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = 3\begin{bmatrix} x \\ y \end{bmatrix} \quad \text{and then} \quad \begin{bmatrix} x + 3y \\ 4x - 3y \end{bmatrix} = \begin{bmatrix} 3x \\ 3y \end{bmatrix} \]
Set the corresponding elements equal to each other to obtain a system of equations:
\[ \begin{aligned} x + 3y &= 3x \\ 4x - 3y &= 3y \end{aligned} \quad \text{or} \quad \begin{aligned} 2x - 3y &= 0 \\ 4x - 6y &= 0 \end{aligned} \quad \text{or} \quad 2x - 3y = 0 \]
The system reduces to one nondegenerate linear equation in two unknowns, and so has an infinite number of solutions. To obtain a nonzero solution, let, say, $y = 2$; then $x = 3$. Thus $u = (3, 2)^T$ is a desired nonzero vector.
(b) To find the general solution, set $y = a$, where $a$ is a parameter. Substitute $y = a$ into $2x - 3y = 0$ to obtain $x = \frac{3}{2}a$. Thus $u = (\frac{3}{2}a, a)^T$ represents all such solutions.
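As a quick sanity check, the family $u = (\frac{3}{2}a, a)^T$ can be substituted back into $Au = 3u$ for a few parameter values. This is only a sketch with exact rational arithmetic; the helper name `apply` and the chosen parameter values are assumptions for illustration.

```python
from fractions import Fraction

A = [[1, 3], [4, -3]]

def apply(A, u):
    """Return the product of matrix A and column vector u (as a list)."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

# Hypothetical parameter values a; each gives u = (3a/2, a).
for a in (2, -1, Fraction(5, 7)):
    u = [Fraction(3, 2) * a, a]
    assert apply(A, u) == [3 * x for x in u]   # Au = 3u holds exactly
```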
INVERTIBLE MATRICES, INVERSES
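2.17. Show that $A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix}$ and $B = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix}$ are inverses.
Compute the product $AB$, obtaining
\[ AB = \begin{bmatrix} -11+0+12 & 2+0-2 & 2+0-2 \\ -22+4+18 & 4+0-3 & 4-1-3 \\ -44-4+48 & 8+0-8 & 8+1-8 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]
Since $AB = I$, we can conclude (Theorem 3.15) that $BA = I$. Accordingly, $A$ and $B$ are inverses.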
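2.18. Find the inverse, if possible, of each matrix:
\[ (a)\ A = \begin{bmatrix} 5 & 3 \\ 4 & 2 \end{bmatrix}, \quad (b)\ B = \begin{bmatrix} 2 & -3 \\ 1 & 3 \end{bmatrix}, \quad (c)\ C = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix} \]
Use the formula for the inverse of a 2 x 2 matrix appearing in Section 2.9.
(a) First find $|A| = 5(2) - 3(4) = 10 - 12 = -2$. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by $1/|A|$:
\[ A^{-1} = -\frac{1}{2}\begin{bmatrix} 2 & -3 \\ -4 & 5 \end{bmatrix} = \begin{bmatrix} -1 & \frac{3}{2} \\ 2 & -\frac{5}{2} \end{bmatrix} \]
(b) First find $|B| = 2(3) - (-3)(1) = 6 + 3 = 9$. Next interchange the diagonal elements, take the negatives of the nondiagonal elements, and multiply by $1/|B|$:
\[ B^{-1} = \frac{1}{9}\begin{bmatrix} 3 & 3 \\ -1 & 2 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & \frac{1}{3} \\ -\frac{1}{9} & \frac{2}{9} \end{bmatrix} \]
(c) First find $|C| = -2(-9) - 6(3) = 18 - 18 = 0$. Since $|C| = 0$, $C$ has no inverse.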
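The 2 x 2 inverse rule used above (swap the diagonal, negate the off-diagonal, divide by the determinant) can be sketched as a small function. The name `inverse_2x2` is an assumption for illustration, and integer entries are assumed so that `Fraction` gives exact results.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Return the inverse of a 2 x 2 integer matrix, or None if det = 0."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        return None                      # singular: no inverse exists
    # Swap the diagonal entries, negate the others, divide by det.
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

print(inverse_2x2([[5, 3], [4, 2]]))     # matrix A of Problem 2.18
print(inverse_2x2([[2, -3], [1, 3]]))    # matrix B
print(inverse_2x2([[-2, 6], [3, -9]]))   # matrix C
```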
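2.19. Let $A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 4 \end{bmatrix}$. Find $A^{-1} = \begin{bmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ z_1 & z_2 & z_3 \end{bmatrix}$.
Multiplying $A$ by $A^{-1}$ and setting the nine entries equal to the nine entries of the identity matrix $I$ yields the following three systems of three equations in three of the unknowns:
\[ \begin{aligned} x_1 + y_1 + z_1 &= 1 \\ y_1 + 2z_1 &= 0 \\ x_1 + 2y_1 + 4z_1 &= 0 \end{aligned} \qquad \begin{aligned} x_2 + y_2 + z_2 &= 0 \\ y_2 + 2z_2 &= 1 \\ x_2 + 2y_2 + 4z_2 &= 0 \end{aligned} \qquad \begin{aligned} x_3 + y_3 + z_3 &= 0 \\ y_3 + 2z_3 &= 0 \\ x_3 + 2y_3 + 4z_3 &= 1 \end{aligned} \]
Solving the three systems for the nine unknowns yields
\[ x_1 = 0,\ y_1 = 2,\ z_1 = -1; \quad x_2 = -2,\ y_2 = 3,\ z_2 = -1; \quad x_3 = 1,\ y_3 = -2,\ z_3 = 1 \]
Thus
\[ A^{-1} = \begin{bmatrix} 0 & -2 & 1 \\ 2 & 3 & -2 \\ -1 & -1 & 1 \end{bmatrix} \]
(Remark: The next chapter gives an efficient way for solving the three systems.)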
2.20. Let $A$ and $B$ be invertible matrices (with the same size). Show that $AB$ is also invertible and $(AB)^{-1} = B^{-1}A^{-1}$. [Thus, by induction, $(A_1A_2 \cdots A_m)^{-1} = A_m^{-1} \cdots A_2^{-1}A_1^{-1}$.]
Using the associativity of matrix multiplication, we get
\[ (AB)(B^{-1}A^{-1}) = A(BB^{-1})A^{-1} = AIA^{-1} = AA^{-1} = I \]
\[ (B^{-1}A^{-1})(AB) = B^{-1}(A^{-1}A)B = B^{-1}IB = B^{-1}B = I \]
Thus $(AB)^{-1} = B^{-1}A^{-1}$.
DIAGONAL AND TRIANGULAR MATRICES
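2.21. Write out the diagonal matrices $A = \operatorname{diag}(4, -3, 7)$, $B = \operatorname{diag}(2, -6)$, $C = \operatorname{diag}(3, -8, 0, 5)$.
Put the given scalars on the diagonal and 0's elsewhere:
\[ A = \begin{bmatrix} 4 & 0 & 0 \\ 0 & -3 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 0 \\ 0 & -6 \end{bmatrix}, \quad C = \begin{bmatrix} 3 & 0 & 0 & 0 \\ 0 & -8 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 5 \end{bmatrix} \]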
2.22. Let $A = \operatorname{diag}(2, 3, 5)$ and $B = \operatorname{diag}(7, 0, -4)$. Find:
(a) $AB$, $A^2$, $B^2$, (b) $f(A)$, where $f(x) = x^2 + 3x - 2$, (c) $A^{-1}$ and $B^{-1}$.
(a) The product matrix $AB$ is a diagonal matrix obtained by multiplying corresponding diagonal entries; hence
\[ AB = \operatorname{diag}(2(7), 3(0), 5(-4)) = \operatorname{diag}(14, 0, -20) \]
Similarly, the squares $A^2$ and $B^2$ are obtained by squaring each diagonal entry; hence
\[ A^2 = \operatorname{diag}(2^2, 3^2, 5^2) = \operatorname{diag}(4, 9, 25) \quad \text{and} \quad B^2 = \operatorname{diag}(49, 0, 16) \]
(b) $f(A)$ is a diagonal matrix obtained by evaluating $f(x)$ at each diagonal entry. We have
\[ f(2) = 4 + 6 - 2 = 8, \quad f(3) = 9 + 9 - 2 = 16, \quad f(5) = 25 + 15 - 2 = 38 \]
Thus $f(A) = \operatorname{diag}(8, 16, 38)$.
(c) The inverse of a diagonal matrix is a diagonal matrix obtained by taking the inverse (reciprocal) of each diagonal entry. Thus $A^{-1} = \operatorname{diag}(\frac{1}{2}, \frac{1}{3}, \frac{1}{5})$, but $B$ has no inverse since there is a 0 on the diagonal.
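Because every operation on diagonal matrices acts entry by entry on the diagonal, a diagonal matrix can be modeled by just the list of its diagonal entries. The sketch below takes that representation as an assumption; the names `diag_inverse`, `a`, and `b` are illustrative.

```python
from fractions import Fraction

a = [2, 3, 5]     # A = diag(2, 3, 5)
b = [7, 0, -4]    # B = diag(7, 0, -4)

ab = [x * y for x, y in zip(a, b)]       # diagonal of AB
a_sq = [x * x for x in a]                # diagonal of A^2
f_a = [x * x + 3 * x - 2 for x in a]     # diagonal of f(A), f(x) = x^2 + 3x - 2

def diag_inverse(d):
    """Reciprocal of each diagonal entry, or None if some entry is 0."""
    if any(x == 0 for x in d):
        return None
    return [Fraction(1, x) for x in d]

print(ab, a_sq, f_a, diag_inverse(a), diag_inverse(b))
```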
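2.23. Find a 2 x 2 matrix $A$ such that $A^2$ is diagonal but not $A$.
Let $A = \begin{bmatrix} 1 & 2 \\ 3 & -1 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix}$, which is diagonal.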
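2.24. Find an upper triangular matrix $A$ such that $A^3 = \begin{bmatrix} 8 & -57 \\ 0 & 27 \end{bmatrix}$.
Set $A = \begin{bmatrix} x & y \\ 0 & z \end{bmatrix}$. Then $x^3 = 8$, so $x = 2$; and $z^3 = 27$, so $z = 3$. Next calculate $A^3$ using $x = 2$ and $z = 3$:
\[ A^2 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix} = \begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix} \quad \text{and} \quad A^3 = \begin{bmatrix} 2 & y \\ 0 & 3 \end{bmatrix}\begin{bmatrix} 4 & 5y \\ 0 & 9 \end{bmatrix} = \begin{bmatrix} 8 & 19y \\ 0 & 27 \end{bmatrix} \]
Thus $19y = -57$, or $y = -3$. Accordingly, $A = \begin{bmatrix} 2 & -3 \\ 0 & 3 \end{bmatrix}$.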
2.25. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be upper triangular matrices. Prove that $AB$ is upper triangular with diagonal $a_{11}b_{11}, a_{22}b_{22}, \ldots, a_{nn}b_{nn}$.
Let $AB = [c_{ij}]$. Then $c_{ij} = \sum_{k=1}^{n} a_{ik}b_{kj}$ and $c_{ii} = \sum_{k=1}^{n} a_{ik}b_{ki}$. Suppose $i > j$. Then, for any $k$, either $i > k$ or $k > j$, so that either $a_{ik} = 0$ or $b_{kj} = 0$. Thus $c_{ij} = 0$, and $AB$ is upper triangular. Suppose $i = j$. Then, for $k < i$, we have $a_{ik} = 0$; and, for $k > i$, we have $b_{ki} = 0$. Hence $c_{ii} = a_{ii}b_{ii}$, as claimed. [This proves one part of Theorem 2.5(i); the statements for $A + B$ and $kA$ are left as exercises.]
SPECIAL REAL MATRICES: SYMMETRIC AND ORTHOGONAL
2.26. Determine whether or not each of the following matrices is symmetric, that is, $A^T = A$, or skew-symmetric, that is, $A^T = -A$:
\[ (a)\ A = \begin{bmatrix} 5 & -7 & 1 \\ -7 & 8 & 2 \\ 1 & 2 & -4 \end{bmatrix}, \quad (b)\ B = \begin{bmatrix} 0 & 4 & -3 \\ -4 & 0 & 5 \\ 3 & -5 & 0 \end{bmatrix}, \quad (c)\ C = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
(a) By inspection, the symmetric elements (mirror images in the diagonal) are -7 and -7, 1 and 1, 2 and 2. Thus $A$ is symmetric, since symmetric elements are equal.
(b) By inspection, the diagonal elements are all 0, and the symmetric elements, 4 and -4, -3 and 3, and 5 and -5, are negatives of each other. Hence $B$ is skew-symmetric.
(c) Since $C$ is not square, $C$ is neither symmetric nor skew-symmetric.
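The inspection above amounts to comparing a matrix with its transpose. A minimal sketch follows; the function name `classify` and its string labels are assumptions for illustration, and a square zero matrix (which is both symmetric and skew-symmetric) is simply reported as symmetric here.

```python
def transpose(A):
    """Turn rows into columns."""
    return [list(col) for col in zip(*A)]

def classify(A):
    """Label A as 'symmetric', 'skew-symmetric', or 'neither'."""
    if len(A) != len(A[0]):
        return "neither"                       # not square
    T = transpose(A)
    if T == A:
        return "symmetric"                     # A^T = A
    if T == [[-x for x in row] for row in A]:
        return "skew-symmetric"                # A^T = -A
    return "neither"

A = [[5, -7, 1], [-7, 8, 2], [1, 2, -4]]
B = [[0, 4, -3], [-4, 0, 5], [3, -5, 0]]
C = [[0, 0, 0], [0, 0, 0]]
print(classify(A), classify(B), classify(C))
```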
2.27. Find $x$ and $B$, if $B = \begin{bmatrix} 4 & x+2 \\ 2x-3 & x+1 \end{bmatrix}$ is symmetric.
Set the symmetric elements $x + 2$ and $2x - 3$ equal to each other, obtaining $2x - 3 = x + 2$ or $x = 5$. Hence $B = \begin{bmatrix} 4 & 7 \\ 7 & 6 \end{bmatrix}$.
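2.28. Let $A$ be an arbitrary 2 x 2 (real) orthogonal matrix.
(a) Prove: If $(a, b)$ is the first row of $A$, then $a^2 + b^2 = 1$ and
\[ A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix} \]
(b) Prove Theorem 2.7: For some real number $\theta$,
\[ A = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]
(a) Suppose $(x, y)$ is the second row of $A$. Since the rows of $A$ form an orthonormal set, we get
\[ a^2 + b^2 = 1, \quad x^2 + y^2 = 1, \quad ax + by = 0 \]
Similarly, the columns form an orthonormal set, so
\[ a^2 + x^2 = 1, \quad b^2 + y^2 = 1, \quad ab + xy = 0 \]
Therefore, $x^2 = 1 - a^2 = b^2$, whence $x = \pm b$.
Case (i): $x = b$. Then $b(a + y) = 0$, so $y = -a$.
Case (ii): $x = -b$. Then $b(y - a) = 0$, so $y = a$.
This means, as claimed,
\[ A = \begin{bmatrix} a & b \\ -b & a \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} a & b \\ b & -a \end{bmatrix} \]
(b) Since $a^2 + b^2 = 1$, we have $-1 \le a \le 1$. Let $a = \cos\theta$. Then $b^2 = 1 - \cos^2\theta$, so $b = \sin\theta$. This proves the theorem.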
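2.29. Find a 2 x 2 orthogonal matrix $A$ whose first row is a (positive) multiple of $(3, 4)$.
Normalize $(3, 4)$ to get $(\frac{3}{5}, \frac{4}{5})$. Then, by Problem 2.28,
\[ A = \begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ -\frac{4}{5} & \frac{3}{5} \end{bmatrix} \quad \text{or} \quad A = \begin{bmatrix} \frac{3}{5} & \frac{4}{5} \\ \frac{4}{5} & -\frac{3}{5} \end{bmatrix} \]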
2.30. Find a 3 x 3 orthogonal matrix $P$ whose first two rows are multiples of $u_1 = (1, 1, 1)$ and $u_2 = (0, -1, 1)$, respectively. (Note that, as required, $u_1$ and $u_2$ are orthogonal.)
First find a nonzero vector $u_3$ orthogonal to $u_1$ and $u_2$; say (cross product) $u_3 = u_1 \times u_2 = (2, -1, -1)$. Let $A$ be the matrix whose rows are $u_1, u_2, u_3$; and let $P$ be the matrix obtained from $A$ by normalizing the rows of $A$. Thus
\[ A = \begin{bmatrix} 1 & 1 & 1 \\ 0 & -1 & 1 \\ 2 & -1 & -1 \end{bmatrix} \quad \text{and} \quad P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & -1/\sqrt{2} & 1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix} \]
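The construction (cross product for the third row, then row normalization) can be sketched in floating point as follows. The helper names `cross`, `dot`, and `normalize` are illustrative assumptions, and orthogonality is checked only up to rounding error.

```python
import math

def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def normalize(u):
    """Scale u to unit length."""
    length = math.sqrt(dot(u, u))
    return tuple(a / length for a in u)

u1, u2 = (1, 1, 1), (0, -1, 1)
u3 = cross(u1, u2)                       # orthogonal to both u1 and u2
P = [normalize(u) for u in (u1, u2, u3)]

# Rows of P should be orthonormal: P P^T = I up to rounding.
ok = all(abs(dot(P[i], P[j]) - (1 if i == j else 0)) < 1e-12
         for i in range(3) for j in range(3))
print(u3, ok)
```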
COMPLEX MATRICES: HERMITIAN AND UNITARY MATRICES
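2.31. Find $A^H$ where:
\[ (a)\ A = \begin{bmatrix} 3-5i & 2+4i \\ 6+7i & 1+8i \end{bmatrix}, \quad (b)\ A = \begin{bmatrix} 2-3i & 5+8i \\ -4 & 3-7i \\ -6-i & 5i \end{bmatrix} \]
Recall that $A^H = \bar{A}^T$, the conjugate transpose of $A$. Thus
\[ (a)\ A^H = \begin{bmatrix} 3+5i & 6-7i \\ 2-4i & 1-8i \end{bmatrix}, \quad (b)\ A^H = \begin{bmatrix} 2+3i & -4 & -6+i \\ 5-8i & 3+7i & -5i \end{bmatrix} \]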
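2.32. Show that $A = \begin{bmatrix} \frac{1}{3} - \frac{2}{3}i & \frac{2}{3}i \\ -\frac{2}{3}i & -\frac{1}{3} - \frac{2}{3}i \end{bmatrix}$ is unitary.
The rows of $A$ form an orthonormal set:
\[ \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) = \left(\tfrac{1}{9} + \tfrac{4}{9}\right) + \tfrac{4}{9} = 1 \]
\[ \left(\tfrac{1}{3} - \tfrac{2}{3}i,\ \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) = \left(\tfrac{2}{9}i + \tfrac{4}{9}\right) + \left(-\tfrac{2}{9}i - \tfrac{4}{9}\right) = 0 \]
\[ \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) \cdot \left(-\tfrac{2}{3}i,\ -\tfrac{1}{3} - \tfrac{2}{3}i\right) = \tfrac{4}{9} + \left(\tfrac{1}{9} + \tfrac{4}{9}\right) = 1 \]
Thus $A$ is unitary.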
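Unitarity can also be checked by forming $AA^H$ directly with Python's built-in complex numbers. This sketch assumes the matrix of Problem 2.32 has rows $(\frac{1}{3} - \frac{2}{3}i, \frac{2}{3}i)$ and $(-\frac{2}{3}i, -\frac{1}{3} - \frac{2}{3}i)$; the helper names are illustrative, and equality with $I$ is tested only up to rounding.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def conj_transpose(A):
    """Conjugate transpose A^H."""
    return [[A[i][j].conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

# Assumed entries of the matrix in Problem 2.32.
A = [[complex(1, -2) / 3, complex(0, 2) / 3],
     [complex(0, -2) / 3, complex(-1, -2) / 3]]

P = matmul(A, conj_transpose(A))         # should be the identity
is_unitary = all(abs(P[i][j] - (1 if i == j else 0)) < 1e-12
                 for i in range(2) for j in range(2))
print(is_unitary)
```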
2.32. Prove the complex analogue of Theorem 2.6: Let $A$ be a complex matrix. Then the following are equivalent. (i) $A$ is unitary. (ii) The rows of $A$ form an orthonormal set. (iii) The columns of $A$ form an orthonormal set.
(The proof is almost identical to the proof on page 38 for the case when $A$ is a 3 x 3 real matrix.)
First recall that the vectors $u_1, u_2, \ldots, u_r$ in $\mathbf{C}^n$ form an orthonormal set if they are unit vectors and are orthogonal to each other, where the dot product in $\mathbf{C}^n$ is defined by
\[ (a_1, a_2, \ldots, a_n) \cdot (b_1, b_2, \ldots, b_n) = a_1\bar{b}_1 + a_2\bar{b}_2 + \cdots + a_n\bar{b}_n \]
Suppose $A$ is unitary, and $R_1, R_2, \ldots, R_n$ are its rows. Then $\bar{R}_1^T, \bar{R}_2^T, \ldots, \bar{R}_n^T$ are the columns of $A^H$. Let $AA^H = [c_{ij}]$. By matrix multiplication, $c_{ij} = R_i\bar{R}_j^T = R_i \cdot R_j$. Since $A$ is unitary, we have $AA^H = I$. Multiplying
$A$ by $A^H$ and setting each entry $c_{ij}$ equal to the corresponding entry in $I$ yields the following $n^2$ equations:
\[ R_1 \cdot R_1 = 1, \quad R_2 \cdot R_2 = 1, \quad \ldots, \quad R_n \cdot R_n = 1, \quad \text{and} \quad R_i \cdot R_j = 0 \ \text{for}\ i \ne j \]
Thus the rows of $A$ are unit vectors and are orthogonal to each other; hence they form an orthonormal set of vectors. The condition $A^HA = I$ similarly shows that the columns of $A$ also form an orthonormal set of vectors. Furthermore, since each step is reversible, the converse is true. This proves the theorem.
BLOCK MATRICES
2.33. Consider the following block matrices (which are partitions of the same matrix):
\[ (a)\ \left[\begin{array}{cc|cc|c} 1 & -2 & 0 & 1 & 3 \\ 2 & 3 & 5 & 7 & -2 \\ \hline 3 & 1 & 4 & 5 & 9 \end{array}\right], \quad (b)\ \left[\begin{array}{ccc|cc} 1 & -2 & 0 & 1 & 3 \\ \hline 2 & 3 & 5 & 7 & -2 \\ \hline 3 & 1 & 4 & 5 & 9 \end{array}\right] \]
Find the size of each block matrix and also the size of each block.
(a) The block matrix has two rows of matrices and three columns of matrices; hence its size is 2 x 3. The block sizes are 2 x 2, 2 x 2, and 2 x 1 for the first row; and 1 x 2, 1 x 2, and 1 x 1 for the second row.
(b) The size of the block matrix is 3 x 2; and the block sizes are 1 x 3 and 1 x 2 for each of the three rows.
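2.34. Compute $AB$ using block multiplication, where
\[ A = \left[\begin{array}{cc|c} 1 & 2 & 1 \\ 3 & 4 & 0 \\ \hline 0 & 0 & 2 \end{array}\right] \quad \text{and} \quad B = \left[\begin{array}{ccc|c} 1 & 2 & 3 & 1 \\ 4 & 5 & 6 & 1 \\ \hline 0 & 0 & 0 & 1 \end{array}\right] \]
Here $A = \begin{bmatrix} E & F \\ 0_{1 \times 2} & G \end{bmatrix}$ and $B = \begin{bmatrix} R & S \\ 0_{1 \times 3} & T \end{bmatrix}$, where $E, F, G, R, S, T$ are the given blocks, and $0_{1 \times 2}$ and $0_{1 \times 3}$ are zero matrices of the indicated sizes. Hence
\[ AB = \begin{bmatrix} ER & ES + FT \\ 0_{1 \times 3} & GT \end{bmatrix} = \left[\begin{array}{c|c} \begin{bmatrix} 9 & 12 & 15 \\ 19 & 26 & 33 \end{bmatrix} & \begin{bmatrix} 3 \\ 7 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \end{bmatrix} \\ \hline \begin{bmatrix} 0 & 0 & 0 \end{bmatrix} & 2 \end{array}\right] = \begin{bmatrix} 9 & 12 & 15 & 4 \\ 19 & 26 & 33 & 7 \\ 0 & 0 & 0 & 2 \end{bmatrix} \]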
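Block multiplication can be compared against ordinary multiplication of the assembled matrices. The sketch below uses the blocks of Problem 2.34; the helpers `matmul`, `add`, and `hstack` are illustrative assumptions written in plain Python.

```python
def matmul(A, B):
    """Multiply matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def add(A, B):
    """Entrywise sum of two matrices of the same size."""
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def hstack(X, Y):
    """Place block Y to the right of block X (same number of rows)."""
    return [x + y for x, y in zip(X, Y)]

E, F, G = [[1, 2], [3, 4]], [[1], [0]], [[2]]
R, S, T = [[1, 2, 3], [4, 5, 6]], [[1], [1]], [[1]]
Z12, Z13 = [[0, 0]], [[0, 0, 0]]          # zero blocks

# AB = [ER, ES + FT; 0, GT]; list "+" stacks the block rows vertically.
AB_blocks = (hstack(matmul(E, R), add(matmul(E, S), matmul(F, T)))
             + hstack(Z13, matmul(G, T)))

# Assemble the full matrices and multiply directly for comparison.
A = hstack(E, F) + hstack(Z12, G)
B = hstack(R, S) + hstack(Z13, T)
print(AB_blocks == matmul(A, B))
```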
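2.35. Let $M = \operatorname{diag}(A, B, C)$, where $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$, $B = [5]$, $C = \begin{bmatrix} 1 & 3 \\ 5 & 7 \end{bmatrix}$. Find $M^2$.
Since $M$ is block diagonal, square each block:
\[ A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}, \quad B^2 = [25], \quad C^2 = \begin{bmatrix} 16 & 24 \\ 40 & 64 \end{bmatrix} \]
so
\[ M^2 = \operatorname{diag}\left( \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}, [25], \begin{bmatrix} 16 & 24 \\ 40 & 64 \end{bmatrix} \right) \]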
MISCELLANEOUS PROBLEM
2.36. Let $f(x)$ and $g(x)$ be polynomials and let $A$ be a square matrix. Prove:
(a) $(f + g)(A) = f(A) + g(A)$,
(b) $(fg)(A) = f(A)g(A)$,
(c) $f(A)g(A) = g(A)f(A)$.
Suppose $f(x) = \sum_{i=1}^{r} a_i x^i$ and $g(x) = \sum_{j=1}^{s} b_j x^j$.
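(a) We can assume $r = s = n$ by adding powers of $x$ with 0 as their coefficients. Then
\[ f(x) + g(x) = \sum_{i=1}^{n} (a_i + b_i)x^i \]
Hence
\[ (f + g)(A) = \sum_{i=1}^{n} (a_i + b_i)A^i = \sum_{i=1}^{n} a_iA^i + \sum_{i=1}^{n} b_iA^i = f(A) + g(A) \]
(b) We have $f(x)g(x) = \sum_{i,j} a_ib_jx^{i+j}$. Then
\[ f(A)g(A) = \left( \sum_i a_iA^i \right)\left( \sum_j b_jA^j \right) = \sum_{i,j} a_ib_jA^{i+j} = (fg)(A) \]
(c) Using $f(x)g(x) = g(x)f(x)$, we have
\[ f(A)g(A) = (fg)(A) = (gf)(A) = g(A)f(A) \]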
Supplementary Problems
ALGEBRA OF MATRICES
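Problems 2.37-2.40 refer to the following matrices:
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & -4 \end{bmatrix}, \quad B = \begin{bmatrix} 5 & 0 \\ -6 & 7 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -3 & 4 \\ 2 & 6 & -5 \end{bmatrix}, \quad D = \begin{bmatrix} 3 & 7 & -1 \\ 4 & -8 & 9 \end{bmatrix} \]
2.37. Find: (a) $5A - 2B$, (b) $2A + 3B$, (c) $2C - 3D$.
2.38. Find: (a) $AB$ and $(AB)C$, (b) $BC$ and $A(BC)$. [Note that $(AB)C = A(BC)$.]
2.39. Find: (a) $A^2$ and $A^3$, (b) $AD$ and $BD$, (c) $CD$.
2.40. Find: (a) $A^T$, (b) $B^T$, (c) $(AB)^T$, (d) $A^TB^T$. [Note that $A^TB^T \ne (AB)^T$.]
Problems 2.41 and 2.42 refer to the following matrices:
\[ A = \begin{bmatrix} 1 & -1 & 2 \\ 0 & 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 4 & 0 & -3 \\ -1 & -2 & 3 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & -3 & 0 & 1 \\ 5 & -1 & -4 & 2 \\ -1 & 0 & 0 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 2 \\ -1 \\ 3 \end{bmatrix} \]
2.41. Find: (a) $3A - 4B$, (b) $AC$, (c) $BC$, (d) $AD$, (e) $BD$, (f) $CD$.
2.42. Find: (a) $A^T$, (b) $A^TB$, (c) $A^TC$.
2.43. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 6 \end{bmatrix}$. Find a 2 x 3 matrix $B$ with distinct nonzero entries such that $AB = 0$.
2.44. Let $e_1 = [1, 0, 0]$, $e_2 = [0, 1, 0]$, $e_3 = [0, 0, 1]$, and $A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}$. Find $e_1A$, $e_2A$, $e_3A$.
2.45. Let $e_i = [0, \ldots, 0, 1, 0, \ldots, 0]$, where 1 is the $i$th entry. Show:
(a) $e_iA = A_i$, the $i$th row of $A$.
(b) $Be_j^T = B^j$, the $j$th column of $B$.
(c) If $e_iA = e_iB$, for each $i$, then $A = B$.
(d) If $Ae_j^T = Be_j^T$, for each $j$, then $A = B$.
2.46. Prove Theorem 2.2(iii) and (iv): (iii) $(B + C)A = BA + CA$, (iv) $k(AB) = (kA)B = A(kB)$.
2.47. Prove Theorem 2.3: (i) $(A + B)^T = A^T + B^T$, (ii) $(A^T)^T = A$, (iii) $(kA)^T = kA^T$.
2.48. Show: (a) If $A$ has a zero row, then $AB$ has a zero row. (b) If $B$ has a zero column, then $AB$ has a zero column.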
SQUARE MATRICES, INVERSES
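2.49. Find the diagonal and trace of each of the following matrices:
\[ (a)\ A = \begin{bmatrix} 2 & -5 & 8 \\ 3 & -6 & -7 \\ 4 & 0 & -1 \end{bmatrix}, \quad (b)\ B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 1 & 7 \\ 2 & -5 & -1 \end{bmatrix}, \quad (c)\ C = \begin{bmatrix} 4 & 3 & -6 \\ 2 & -5 & 0 \end{bmatrix} \]
Problems 2.50-2.52 refer to $A = \begin{bmatrix} 2 & -5 \\ 3 & 1 \end{bmatrix}$, $B = \begin{bmatrix} 4 & -2 \\ 1 & -6 \end{bmatrix}$, $M = \begin{bmatrix} 6 & -4 \\ 3 & -2 \end{bmatrix}$.
2.50. Find: (a) $A^2$ and $A^3$, (b) $f(A)$ and $g(A)$, where
\[ f(x) = x^3 - 2x^2 - 5, \quad g(x) = x^2 - 3x + 17. \]
2.51. Find: (a) $B^2$ and $B^3$, (b) $f(B)$ and $g(B)$, where
\[ f(x) = x^2 + 2x - 22, \quad g(x) = x^2 - 3x - 6. \]
2.52. Find a nonzero column vector $u$ such that $Mu = 4u$.
2.53. Find the inverse of each of the following matrices (if it exists):
\[ A = \begin{bmatrix} 7 & 4 \\ 5 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 3 \\ 4 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 4 & -6 \\ -2 & 3 \end{bmatrix}, \quad D = \begin{bmatrix} 5 & -2 \\ 6 & -3 \end{bmatrix} \]
2.54. Find the inverses of $A = \begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 1 & 3 & 7 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 1 & -1 \\ 1 & 3 & -2 \end{bmatrix}$.
2.55. Suppose $A$ is invertible. Show that if $AB = AC$, then $B = C$. Give an example of a nonzero matrix $A$ such that $AB = AC$ but $B \ne C$.
2.56. Find 2 x 2 invertible matrices $A$ and $B$ such that $A + B \ne 0$ and $A + B$ is not invertible.
2.57. Show: (a) $A$ is invertible if and only if $A^T$ is invertible. (b) The operations of inversion and transpose commute, that is, $(A^T)^{-1} = (A^{-1})^T$. (c) If $A$ has a zero row or zero column, then $A$ is not invertible.
DIAGONAL AND TRIANGULAR MATRICES
2.58. Let $A = \operatorname{diag}(1, 2, -3)$ and $B = \operatorname{diag}(2, -5, 0)$. Find:
(a) $AB$, $A^2$, $B^2$, (b) $f(A)$, where $f(x) = x^2 + 4x - 3$, (c) $A^{-1}$ and $B^{-1}$.
2.59. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$. (a) Find $A^n$. (b) Find $B^n$.
2.60. Find all real triangular matrices $A$ such that $A^2 = B$, where: (a) $B = \begin{bmatrix} 4 & 21 \\ 0 & 25 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 4 \\ 0 & -9 \end{bmatrix}$.
2.61. Let $A = \begin{bmatrix} 5 & 2 \\ 0 & k \end{bmatrix}$. Find all numbers $k$ for which $A$ is a root of the polynomial:
(a) $f(x) = x^2 - 7x + 10$, (b) $g(x) = x^2 - 25$, (c) $h(x) = x^2 - 4$.
2.62. Let $B = \begin{bmatrix} 1 & 0 \\ 26 & 27 \end{bmatrix}$. Find a matrix $A$ such that $A^3 = B$.
2.63. Let $B = \begin{bmatrix} 1 & 8 & 5 \\ 0 & 9 & 5 \\ 0 & 0 & 4 \end{bmatrix}$. Find a triangular matrix $A$ with positive diagonal entries such that $A^2 = B$.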
2.64. Using only the elements 0 and 1, find the number of 3 x 3 matrices that are: (a) diagonal,
(b) upper triangular, (c) non-singular and upper triangular. Generalize to n x n matrices.
2.65. Let $D_k = kI$, the scalar matrix belonging to the scalar $k$. Show:
(a) $D_kA = kA$, (b) $BD_k = kB$, (c) $D_k + D_{k'} = D_{k+k'}$, (d) $D_kD_{k'} = D_{kk'}$.
2.66. Suppose AB = C, where A and C are upper triangular.
(a) Find 2x2 nonzero matrices A,B, C, where B is not upper triangular.
(b) Suppose A is also invertible. Show that B must also be upper triangular.
SPECIAL TYPES OF REAL MATRICES
2.67. Find x,y, z such that A is symmetric, where:
(a) $A = \begin{bmatrix} 2 & x & 3 \\ 4 & 5 & y \\ z & 1 & 7 \end{bmatrix}$,
(b) $A = \begin{bmatrix} 7 & -6 & 2x \\ y & z & -2 \\ x & -2 & 5 \end{bmatrix}$.
2.68. Suppose $A$ is a square matrix. Show: (a) $A + A^T$ is symmetric, (b) $A - A^T$ is skew-symmetric, (c) $A = B + C$, where $B$ is symmetric and $C$ is skew-symmetric.
2.69. Write $A = \begin{bmatrix} 4 & 5 \\ 1 & 3 \end{bmatrix}$ as the sum of a symmetric matrix $B$ and a skew-symmetric matrix $C$.
2.70. Suppose $A$ and $B$ are symmetric. Show that the following are also symmetric:
(a) $A + B$, (b) $kA$, for any scalar $k$, (c) $A^2$,
(d) $A^n$, for $n > 0$, (e) $f(A)$, for any polynomial $f(x)$.
2.71. Find a 2 x 2 orthogonal matrix $P$ whose first row is a multiple of:
(a) $(3, -4)$, (b) $(1, 2)$.
2.72. Find a 3 x 3 orthogonal matrix $P$ whose first two rows are multiples of:
(a) $(1, 2, 3)$ and $(0, -2, 3)$, (b) $(1, 3, 1)$ and $(1, 0, -1)$.
2.73. Suppose $A$ and $B$ are orthogonal matrices. Show that $A^T$, $A^{-1}$, $AB$ are also orthogonal.
2.74. Which of the following matrices are normal? A = \ \ B = \
-1
2 3
1 1 1
0 1 1
0 0 1
COMPLEX MATRICES
2.75. Find real numbers $x, y, z$ such that $A$ is Hermitian, where $A = \begin{bmatrix} 3 & x + 2i & yi \\ 3 - 2i & 0 & 1 + zi \\ yi & 1 - xi & -1 \end{bmatrix}$.
2.76. Suppose $A$ is a complex matrix. Show that $AA^H$ and $A^HA$ are Hermitian.
2.77. Let $A$ be a square matrix. Show that: (a) $A + A^H$ is Hermitian, (b) $A - A^H$ is skew-Hermitian, (c) $A = B + C$, where $B$ is Hermitian and $C$ is skew-Hermitian.
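2.78. Determine which of the following matrices are unitary:
\[ A = \begin{bmatrix} i/2 & -\sqrt{3}/2 \\ \sqrt{3}/2 & -i/2 \end{bmatrix}, \quad B = \frac{1}{2}\begin{bmatrix} 1+i & 1-i \\ 1-i & 1+i \end{bmatrix}, \quad C = \frac{1}{2}\begin{bmatrix} 1 & -i & -1+i \\ i & 1 & 1+i \\ 1+i & -1+i & 0 \end{bmatrix} \]
2.79. Suppose $A$ and $B$ are unitary. Show that $A^H$, $A^{-1}$, $AB$ are unitary.
2.80. Determine which of the following matrices are normal: $A = \begin{bmatrix} 3+4i & 1 \\ i & 2+3i \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ 1-i & i \end{bmatrix}$.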
BLOCK MATRICES
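2.81. Let
\[ U = \left[\begin{array}{cc|ccc} 1 & 2 & 0 & 0 & 0 \\ 3 & 4 & 0 & 0 & 0 \\ \hline 0 & 0 & 5 & 1 & 2 \\ 0 & 0 & 3 & 4 & 1 \end{array}\right] \quad \text{and} \quad V = \left[\begin{array}{cc|cc} 3 & -2 & 0 & 0 \\ 2 & 4 & 0 & 0 \\ \hline 0 & 0 & 1 & 2 \\ 0 & 0 & 2 & -3 \\ 0 & 0 & -4 & 1 \end{array}\right] \]
(a) Find $UV$ using block multiplication. (b) Are $U$ and $V$ block diagonal matrices?
(c) Is $UV$ block diagonal?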
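2.82. Partition each of the following matrices so that it becomes a square block matrix with as many diagonal blocks as possible:
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 2 \\ 0 & 0 & 3 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 0 & 0 & 0 \\ 3 & 0 & 0 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 5 & 0 & 0 \\ 0 & 0 & 0 & 0 & 6 \end{bmatrix}, \quad C = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 0 \\ 2 & 0 & 0 \end{bmatrix} \]
2.83. Find $M^2$ and $M^3$ for:
\[ (a)\ M = \left[\begin{array}{c|cc|c} 2 & 0 & 0 & 0 \\ \hline 0 & 1 & 4 & 0 \\ 0 & 2 & 1 & 0 \\ \hline 0 & 0 & 0 & 3 \end{array}\right], \quad (b)\ M = \left[\begin{array}{cc|cc} 1 & 1 & 0 & 0 \\ 2 & 3 & 0 & 0 \\ \hline 0 & 0 & 1 & 2 \\ 0 & 0 & 4 & 5 \end{array}\right] \]
2.84. For each matrix $M$ in Problem 2.83, find $f(M)$ where $f(x) = x^2 + 4x - 5$.
2.85. Suppose $U = [U_{ik}]$ and $V = [V_{kj}]$ are block matrices for which $UV$ is defined and the number of columns of each block $U_{ik}$ is equal to the number of rows of each block $V_{kj}$. Show that $UV = [W_{ij}]$, where $W_{ij} = \sum_k U_{ik}V_{kj}$.
2.86. Suppose $M$ and $N$ are block diagonal matrices where corresponding blocks have the same size, say $M = \operatorname{diag}(A_i)$ and $N = \operatorname{diag}(B_i)$. Show:
(i) $M + N = \operatorname{diag}(A_i + B_i)$,
(ii) $kM = \operatorname{diag}(kA_i)$,
(iii) $MN = \operatorname{diag}(A_iB_i)$,
(iv) $f(M) = \operatorname{diag}(f(A_i))$ for any polynomial $f(x)$.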
Answers to Supplementary Problems
Notation: $A = [R_1; R_2; \ldots]$ denotes a matrix $A$ with rows $R_1, R_2, \ldots$.
2.37. (a) [-5,10; 27,-34], (b) [17,4; -12,13], (c) [-7,-27,11; -8,36,-37]
2.38. (a) [-7,14; 39,-28], [21,105,-98; -17,-285,296];
(b) [5,-15,20; 8,60,-59], [21,105,-98; -17,-285,296]
2.39. (a) [7,-6; -9,22], [-11,38; 57,-106],
(b) [11,-9,17; -7,53,-39], [15,35,-5; 10,-98,69]; (c) not defined
2.40. (a) [1,3; 2,-4], (b) [5,-6; 0,7], (c) [-7,39; 14,-28], (d) [5,15; 10,-40]
2.41. (a) [-13,-3,18; 4,17,0], (b) [-5,-2,4,5; 11,-3,-12,18],
(c) [11,-12,0,-5; -15,5,8,4], (d) [9; 9], (e) [-1; 9], (f) not defined
2.42. (a) [1,0; -1,3; 2,4], (b) [4,0,-3; -7,-6,12; 4,-8,6], (c) not defined
2.43. [2,4,6; -1,-2,-3]
2.44. $[a_1, a_2, a_3, a_4]$, $[b_1, b_2, b_3, b_4]$, $[c_1, c_2, c_3, c_4]$
2.49. (a) 2, -6, -1, tr(A) = -5, (b) 1, 1, -1, tr(B) = 1, (c) not defined
2.50. (a) [-11,-15; 9,-14], [-67,40; -24,-59], (b) f(A) = [-50,70; -42,-36], g(A) = 0
2.51. (a) [14,4; -2,34], [60,-52; 26,-200], (b) f(B) = 0, g(B) = [-4,10; -5,46]
2.52. $u = [2a, a]^T$
2.53. [3,-4; -5,7], [-5/2,3/2; 2,-1], not defined, [1,-2/3; 2,-5/3]
2.54. [1,1,-1; 2,-5,3; -1,2,-1], [1,1,0; -1,-3,1; -1,-4,1]
2.55. A = [1,2; 1,2], B = [0,0; 1,1], C = [2,2; 0,0]
2.56. A = [1,2; 0,3]; B = [4,3; 3,0]
2.57. (c) Hint: Use Problem 2.48
2.58. (a) AB = diag(2,-10,0), $A^2$ = diag(1,4,9), $B^2$ = diag(4,25,0);
(b) f(A) = diag(2,9,-6), (c) $A^{-1}$ = diag(1, 1/2, -1/3), $B^{-1}$ does not exist
2.59. (a) [1,2n; 0,1], (b) [1,n,n(n-1)/2; 0,1,n; 0,0,1]
2.60. (a) [2,3; 0,5], [-2,-3; 0,-5], [2,-7; 0,-5], [-2,7; 0,5], (b) none
2.61. (a) k = 2, (b) k = -5, (c) none
2.62. [1,0; 2,3]
2.63. [1,2,1; 0,3,1; 0,0,2]
2.64. All entries below the diagonal must be 0 to be upper triangular, and all diagonal entries must be 1 to be non-singular. (a) 8 ($2^n$), (b) $2^6$ ($2^{n(n+1)/2}$), (c) $2^3$ ($2^{n(n-1)/2}$).
2.66. (a) A = [1,1; 0,0], B = [1,2; 3,4], C = [4,6; 0,0]
2.67. (a) x = 4, y = 1, z = 3, (b) x = 0, y = -6, z any real number
2.68. (c) Hint: Let $B = \frac{1}{2}(A + A^T)$ and $C = \frac{1}{2}(A - A^T)$
2.69. B = [4,3; 3,3], C = [0,2; -2,0]
2.71. (a) [3/5,-4/5; 4/5,3/5], (b) [1/√5, 2/√5; 2/√5, -1/√5]
2.72. (a) [1/√14, 2/√14, 3/√14; 0, -2/√13, 3/√13; 12/√157, -3/√157, -2/√157]
(b) [1/√11, 3/√11, 1/√11; 1/√2, 0, -1/√2; 3/√22, -2/√22, 3/√22]
2.74. A, C
2.75. x = 3, y = 0, z = 3
2.77. (c) Hint: Let $B = \frac{1}{2}(A + A^H)$ and $C = \frac{1}{2}(A - A^H)$
2.78. A, B, C
2.80. A
2.81. (a) UV = diag([7,6; 17,10], [-1,9; 7,-5]), (b) no, (c) yes
2.82. A: line between first and second rows (columns);
B: line between second and third rows (columns) and between fourth and fifth rows (columns);
C: C itself; no further partitioning of C is possible.
2.83. (a) $M^2$ = diag([4], [9,8; 4,9], [9]),
$M^3$ = diag([8], [25,44; 22,25], [27])
(b) $M^2$ = diag([3,4; 8,11], [9,12; 24,33]),
$M^3$ = diag([11,15; 30,41], [57,78; 156,213])
2.84. (a) diag([7], [8,24; 12,8], [16]), (b) diag([2,8; 16,18], [8,20; 40,48])
CHAPTER 3
Systems of Linear
Equations
3.1 INTRODUCTION
Systems of linear equations play an important and motivating role in the subject of linear algebra. In
fact, many problems in linear algebra reduce to finding the solution of a system of linear equations. Thus
the techniques introduced in this chapter will be applicable to abstract ideas introduced later. On the other
hand, some of the abstract results will give us new insights into the structure and properties of systems of
linear equations.
All our systems of linear equations involve scalars as both coefficients and constants, and such scalars
may come from any number field K. There is almost no loss in generality if the reader assumes that all our
scalars are real numbers, that is, that they come from the real field R.
3.2 BASIC DEFINITIONS, SOLUTIONS
This section gives basic definitions connected with the solutions of systems of linear equations. The
actual algorithms for finding such solutions will be treated later.
Linear Equation and Solutions
A linear equation in unknowns $x_1, x_2, \ldots, x_n$ is an equation that can be put in the standard form
\[ a_1x_1 + a_2x_2 + \cdots + a_nx_n = b \qquad (3.1) \]
where $a_1, a_2, \ldots, a_n$, and $b$ are constants. The constant $a_k$ is called the coefficient of $x_k$, and $b$ is called the constant term of the equation.
A solution of the linear equation (3.1) is a list of values for the unknowns or, equivalently, a vector $u$ in $K^n$, say
\[ x_1 = k_1,\ x_2 = k_2,\ \ldots,\ x_n = k_n \quad \text{or} \quad u = (k_1, k_2, \ldots, k_n) \]
such that the following statement (obtained by substituting $k_i$ for $x_i$ in the equation) is true:
\[ a_1k_1 + a_2k_2 + \cdots + a_nk_n = b \]
In such a case we say that $u$ satisfies the equation.
Remark: Equation (3.1) implicitly assumes there is an ordering of the unknowns. In order to avoid subscripts, we will usually use $x, y$ for two unknowns, $x, y, z$ for three unknowns, and $x, y, z, t$ for four unknowns, and they will be ordered as shown.
Example 3.1. Consider the following linear equation in three unknowns $x, y, z$:
\[ x + 2y - 3z = 6 \]
We note that $x = 5, y = 2, z = 1$, or, equivalently, the vector $u = (5, 2, 1)$ is a solution of the equation. That is,
\[ 5 + 2(2) - 3(1) = 6 \quad \text{or} \quad 5 + 4 - 3 = 6 \quad \text{or} \quad 6 = 6 \]
On the other hand, $w = (1, 2, 3)$ is not a solution, since, on substitution, we do not get a true statement:
\[ 1 + 2(2) - 3(3) = 6 \quad \text{or} \quad 1 + 4 - 9 = 6 \quad \text{or} \quad -4 = 6 \]
System of Linear Equations
A system of linear equations is a list of linear equations with the same unknowns. In particular, a system of $m$ linear equations $L_1, L_2, \ldots, L_m$ in $n$ unknowns $x_1, x_2, \ldots, x_n$ can be put in the standard form
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + \cdots + a_{1n}x_n &= b_1 \\ a_{21}x_1 + a_{22}x_2 + \cdots + a_{2n}x_n &= b_2 \\ \cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots\cdots & \\ a_{m1}x_1 + a_{m2}x_2 + \cdots + a_{mn}x_n &= b_m \end{aligned} \qquad (3.2) \]
where the $a_{ij}$ and $b_i$ are constants. The number $a_{ij}$ is the coefficient of the unknown $x_j$ in the equation $L_i$, and the number $b_i$ is the constant of the equation $L_i$.
The system (3.2) is called an $m \times n$ (read: $m$ by $n$) system. It is called a square system if $m = n$, that is, if the number $m$ of equations is equal to the number $n$ of unknowns.
The system (3.2) is said to be homogeneous if all the constant terms are zero, that is, if $b_1 = 0, b_2 = 0, \ldots, b_m = 0$. Otherwise the system is said to be nonhomogeneous.
A solution (or a particular solution) of the system (3.2) is a list of values for the unknowns or, equivalently, a vector $u$ in $K^n$, that is a solution of each of the equations in the system. The set of all solutions of the system is called the solution set or the general solution of the system.
Example 3.2. Consider the following system of linear equations:
\[ \begin{aligned} x_1 + x_2 + 4x_3 + 3x_4 &= 5 \\ 2x_1 + 3x_2 + x_3 - 2x_4 &= 1 \\ x_1 + 2x_2 - 5x_3 + 4x_4 &= 3 \end{aligned} \]
It is a 3 x 4 system since it has 3 equations in 4 unknowns. Determine whether (a) $u = (-8, 6, 1, 1)$ and (b) $v = (-10, 5, 1, 2)$ are solutions of the system.
(a) Substitute the values of $u$ in each equation, obtaining
\[ \begin{aligned} -8 + 6 + 4(1) + 3(1) = 5 \quad &\text{or} \quad -8 + 6 + 4 + 3 = 5 \quad \text{or} \quad 5 = 5 \\ 2(-8) + 3(6) + 1 - 2(1) = 1 \quad &\text{or} \quad -16 + 18 + 1 - 2 = 1 \quad \text{or} \quad 1 = 1 \\ -8 + 2(6) - 5(1) + 4(1) = 3 \quad &\text{or} \quad -8 + 12 - 5 + 4 = 3 \quad \text{or} \quad 3 = 3 \end{aligned} \]
Yes, $u$ is a solution of the system since it is a solution of each equation.
(b) Substitute the values of $v$ into each successive equation, obtaining
\[ \begin{aligned} -10 + 5 + 4(1) + 3(2) = 5 \quad &\text{or} \quad -10 + 5 + 4 + 6 = 5 \quad \text{or} \quad 5 = 5 \\ 2(-10) + 3(5) + 1 - 2(2) = 1 \quad &\text{or} \quad -20 + 15 + 1 - 4 = 1 \quad \text{or} \quad -8 = 1 \end{aligned} \]
No, $v$ is not a solution of the system, since it is not a solution of the second equation. (We do not need to substitute $v$ into the third equation.)
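The substitution test in Example 3.2 can be sketched as a short function that checks a candidate vector against every equation. The name `is_solution` and the separate storage of coefficients and constants are assumptions for illustration.

```python
def is_solution(coeffs, consts, u):
    """True if the vector u satisfies every equation of the system.

    coeffs[i] holds the coefficients of equation i and consts[i] its constant.
    all() stops at the first failing equation, as in part (b) above.
    """
    return all(sum(a * x for a, x in zip(row, u)) == b
               for row, b in zip(coeffs, consts))

coeffs = [[1, 1, 4, 3],
          [2, 3, 1, -2],
          [1, 2, -5, 4]]
consts = [5, 1, 3]

print(is_solution(coeffs, consts, (-8, 6, 1, 1)))    # vector u
print(is_solution(coeffs, consts, (-10, 5, 1, 2)))   # vector v
```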
The system (3.2) of linear equations is said to be consistent if it has one or more solutions, and it is said to be inconsistent if it has no solution. If the field $K$ of scalars is infinite, such as when $K$ is the real field $\mathbf{R}$ or the complex field $\mathbf{C}$, then we have the following important result.
Theorem 3.1: Suppose the field $K$ is infinite. Then any system $\mathcal{L}$ of linear equations has either:
(i) a unique solution, (ii) no solution, or (iii) an infinite number of solutions.
This situation is pictured in Fig. 3-1. The three cases have a geometrical description when the system $\mathcal{L}$ consists of two equations in two unknowns (Section 3.4).
[Fig. 3-1: A system of linear equations is either inconsistent (no solution) or consistent (unique solution or infinite number of solutions).]
Augmented and Coefficient Matrices of a System
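Consider again the general system (3.2) of $m$ equations in $n$ unknowns. Such a system has associated with it the following two matrices:
\[ M = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} & b_1 \\ a_{21} & a_{22} & \cdots & a_{2n} & b_2 \\ \cdots & \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} & b_m \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \cdots & \cdots & \cdots & \cdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \]
The first matrix $M$ is called the augmented matrix of the system, and the second matrix $A$ is called the coefficient matrix.
The coefficient matrix $A$ is simply the matrix of coefficients, which is the augmented matrix $M$ without the last column of constants. Some texts write $M = [A, B]$ to emphasize the two parts of $M$, where $B$ denotes the column vector of constants. The augmented matrix $M$ and the coefficient matrix $A$ of the system in Example 3.2 are as follows:
\[ M = \begin{bmatrix} 1 & 1 & 4 & 3 & 5 \\ 2 & 3 & 1 & -2 & 1 \\ 1 & 2 & -5 & 4 & 3 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 1 & 4 & 3 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & -5 & 4 \end{bmatrix} \]
As expected, $A$ consists of all the columns of $M$ except the last, which is the column of constants.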
Clearly, a system of linear equations is completely determined by its augmented matrix M, and vice
versa. Specifically, each row of M corresponds to an equation of the system, and each column of M
corresponds to the coefficients of an unknown, except for the last column, which corresponds to the
constants of the system.
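The correspondence between a system and its augmented matrix is simple to express in code. The following sketch builds $M = [A, B]$ for the system of Example 3.2 and then splits it again; the variable names are assumptions for illustration.

```python
# Coefficients and constants of the system in Example 3.2.
coeffs = [[1, 1, 4, 3],
          [2, 3, 1, -2],
          [1, 2, -5, 4]]
consts = [5, 1, 3]

# Augmented matrix M: each row is an equation, last column the constants.
M = [row + [b] for row, b in zip(coeffs, consts)]

# Recover the coefficient matrix A and the column of constants B from M.
A = [row[:-1] for row in M]
B = [row[-1] for row in M]
print(M)
```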
Degenerate Linear Equations
A linear equation is said to be degenerate if all the coefficients are zero, that is, if it has the form
\[ 0x_1 + 0x_2 + \cdots + 0x_n = b \qquad (3.3) \]
The solution of such an equation only depends on the value of the constant $b$. Specifically:
(i) If $b \ne 0$, then the equation has no solution.
(ii) If $b = 0$, then every vector $u = (k_1, k_2, \ldots, k_n)$ in $K^n$ is a solution.
The following theorem applies.
Theorem 3.2: Let $\mathcal{L}$ be a system of linear equations that contains a degenerate equation $L$, say with constant $b$.
(i) If $b \ne 0$, then the system $\mathcal{L}$ has no solution.
(ii) If $b = 0$, then $L$ may be deleted from the system without changing the solution set of the system.
Part (i) comes from the fact that the degenerate equation has no solution, so the system has no solution. Part (ii) comes from the fact that every element in $K^n$ is a solution of the degenerate equation.
Leading Unknown in a Nondegenerate Linear Equation
Now let $L$ be a nondegenerate linear equation. This means one or more of the coefficients of $L$ are not zero. By the leading unknown of $L$, we mean the first unknown in $L$ with a nonzero coefficient. For example, $x_3$ and $y$ are the leading unknowns, respectively, in the equations
\[ 0x_1 + 0x_2 + 5x_3 + 6x_4 + 0x_5 + 8x_6 = 7 \quad \text{and} \quad 0x + 2y - 4z = 5 \]
We frequently omit terms with zero coefficients, so the above equations would be written as
\[ 5x_3 + 6x_4 + 8x_6 = 7 \quad \text{and} \quad 2y - 4z = 5 \]
In such a case, the leading unknown appears first.
3.3 EQUIVALENT SYSTEMS, ELEMENTARY OPERATIONS
Consider the system (3.2) of $m$ linear equations in $n$ unknowns. Let $L$ be the linear equation obtained by multiplying the $m$ equations by constants $c_1, c_2, \ldots, c_m$, respectively, and then adding the resulting equations. Specifically, let $L$ be the following linear equation:
\[ (c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \]
Then $L$ is called a linear combination of the equations in the system. One can easily show (Problem 3.43) that any solution of the system (3.2) is also a solution of the linear combination $L$.
Example 3.3. Let $L_1, L_2, L_3$ denote, respectively, the three equations in Example 3.2. Let $L$ be the equation obtained by multiplying $L_1, L_2, L_3$ by 3, -2, 4, respectively, and then adding. Namely,
\[ \begin{array}{rrcrcrcrcr} 3L_1: & 3x_1 & + & 3x_2 & + & 12x_3 & + & 9x_4 & = & 15 \\ -2L_2: & -4x_1 & - & 6x_2 & - & 2x_3 & + & 4x_4 & = & -2 \\ 4L_3: & 4x_1 & + & 8x_2 & - & 20x_3 & + & 16x_4 & = & 12 \\ \hline (\text{Sum})\ L: & 3x_1 & + & 5x_2 & - & 10x_3 & + & 29x_4 & = & 25 \end{array} \]
Then $L$ is a linear combination of $L_1, L_2, L_3$. As expected, the solution $u = (-8, 6, 1, 1)$ of the system is also a solution of $L$. That is, substituting $u$ in $L$, we obtain a true statement:
\[ 3(-8) + 5(6) - 10(1) + 29(1) = 25 \quad \text{or} \quad -24 + 30 - 10 + 29 = 25 \quad \text{or} \quad 25 = 25 \]
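Forming a linear combination of equations is a column-by-column weighted sum of the rows of the augmented matrix. The sketch below reproduces the computation of Example 3.3; the variable names are assumptions for illustration.

```python
# Augmented matrix of the system in Example 3.2 (one row per equation).
M = [[1, 1, 4, 3, 5],
     [2, 3, 1, -2, 1],
     [1, 2, -5, 4, 3]]
c = (3, -2, 4)                      # multipliers for L1, L2, L3

# Weighted sum of the rows, taken column by column.
L = [sum(ci * row[j] for ci, row in zip(c, M)) for j in range(len(M[0]))]

# The solution u of the system should also satisfy L.
u = (-8, 6, 1, 1)
lhs = sum(a * x for a, x in zip(L[:-1], u))
print(L, lhs == L[-1])
```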
The following theorem holds.
Theorem 3.3: Two systems of linear equations have the same solutions if and only if each equation in
each system is a linear combination of the equations in the other system.
Two systems of linear equations are said to be equivalent if they have the same solutions. The next
subsection shows one way to obtain equivalent systems of linear equations.
Elementary Operations
The following operations on a system of linear equations L^,L2, ... ,L^ are called elementary
operations.
[Ej] Interchange two of the equations. We indicate that the equations L^ and Lj are interchanged by
writing:
"Interchange L^ and L/' or "L^ ^^ L/'
[E2] Replace an equation by a nonzero multiple of itself We indicate that equation L^ is replaced by kL^
(where ^ 7^ 0) by writing
"Replace L^ by kLj"' or "kL^ -^ L/'
[E3] Replace an equation by the sum of a multiple of another equation and itself We indicate that
equation Lj is replaced by the sum of kL^ and Lj by writing:
"Replace Lj by kL^ + Ly" or "^L^ + Lj -^ Lj''
The arrow -^ in [E2] and [E3] may be read as "replaces".
The main property of the above elementary operations is contained in the following theorem (proved in
Problem 3.45).
Theorem 3.4: Suppose a system ℳ of linear equations is obtained from a system ℒ of linear
equations by a finite sequence of elementary operations. Then ℳ and ℒ have the same
solutions.
Remark: Sometimes (say to avoid fractions when all the given scalars are integers) we may apply
[E2] and [E3] in one step, that is, we may apply the following operation:
[E] Replace equation L_j by the sum of kL_i and k'L_j (where k' ≠ 0), written
"Replace L_j by kL_i + k'L_j" or "kL_i + k'L_j → L_j"
We emphasize that in operations [E3] and [E], only equation L_j is changed.
Gaussian elimination, our main method for finding the solution of a given system of linear
equations, consists of using the above operations to transform a given system into an equivalent
system whose solution can be easily obtained.
The details of Gaussian elimination are discussed in subsequent sections.
3.4 SMALL SQUARE SYSTEMS OF LINEAR EQUATIONS
This section considers the special case of one equation in one unknown, and two equations in two
unknowns. These simple systems are treated separately since their solution sets can be described
geometrically, and their properties motivate the general case.
Linear Equation in One Unknown
The following simple basic result is proved in Problem 3.5.
Theorem 3.4: Consider the linear equation ax = b.
(i) If a ≠ 0, then x = b/a is a unique solution of ax = b.
(ii) If a = 0, but b ≠ 0, then ax = b has no solution.
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.
Example 3.4. Solve:
(a) 4x - 1 = x + 6, (b) 2x - 5 - x = x + 3, (c) 4 + x - 3 = 2x + 1 - x.
(a) Rewrite the equation in standard form, obtaining 3x = 7. Then x = 7/3 is the unique solution [Theorem 3.4(i)].
(b) Rewrite the equation in standard form, obtaining 0x = 8. The equation has no solution [Theorem 3.4(ii)].
(c) Rewrite the equation in standard form, obtaining 0x = 0. Then every scalar k is a solution [Theorem 3.4(iii)].
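The three cases of the theorem on ax = b translate directly into a small case analysis. The following Python sketch is only an illustration; the function name solve_one_unknown and its return labels are our own choices, and integer inputs are assumed so that exact fractions can be used.

```python
from fractions import Fraction

def solve_one_unknown(a, b):
    """Classify the solutions of ax = b for integers a, b.

    Returns ("unique", x), ("none", None) or ("all", None).
    """
    if a != 0:
        return ("unique", Fraction(b, a))   # case (i): x = b/a
    if b != 0:
        return ("none", None)               # case (ii): 0x = b with b nonzero
    return ("all", None)                    # case (iii): 0x = 0

# The standard forms obtained in Example 3.4.
print(solve_one_unknown(3, 7))   # ('unique', Fraction(7, 3))
print(solve_one_unknown(0, 8))   # ('none', None)
print(solve_one_unknown(0, 0))   # ('all', None)
```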
System of Two Linear Equations in Two Unknowns (2 × 2 System)
Consider a system of two nondegenerate linear equations in two unknowns x and y, which can be put
in the standard form
        A1x + B1y = C1
        A2x + B2y = C2                                        (3.4)
Since the equations are nondegenerate, A1 and B1 are not both zero, and A2 and B2 are not both zero.
The general solution of the system (3.4) belongs to one of three types as indicated in Fig. 3-1. If R is
the field of scalars, then the graph of each equation is a line in the plane R^2 and the three types may be
described geometrically as pictured in Fig. 3-2. Specifically:
(1) The system has exactly one solution.
Here the two lines intersect in one point [Fig. 3-2(a)]. This occurs when the lines have distinct slopes
or, equivalently, when the coefficients of x and y are not proportional:
        A1/A2 ≠ B1/B2 or, equivalently, A1B2 - A2B1 ≠ 0
For example, in Fig. 3-2(a), 1/3 ≠ -1/2.
[Fig. 3-2: (a) the lines L1: x - y = -1 and L2: 3x + 2y = 12 intersect in one point; (b) two parallel lines L1 and L2; (c) the coincident lines L1: x + 2y = 4 and L2: 2x + 4y = 8.]
(2) The system has no solution.
Here the two lines are parallel [Fig. 3-2(b)]. This occurs when the lines have the same slopes but
different y intercepts, or when
        A1/A2 = B1/B2 ≠ C1/C2
For example, in Fig. 3-2(b), 1/2 = 3/6 ≠ -3/8.
(3) The system has an infinite number of solutions.
Here the two lines coincide [Fig. 3-2(c)]. This occurs when the lines have the same slopes and same y
intercepts, or when the coefficients and constants are proportional,
        A1/A2 = B1/B2 = C1/C2
For example, in Fig. 3-2(c), 1/2 = 2/4 = 4/8.
Remark: The following expression and its value is called a determinant of order two:
        | A1  B1 |
        | A2  B2 |  =  A1B2 - A2B1
Determinants will be studied in Chapter 9. Thus the system (3.4) has a unique solution if and only if the
determinant of its coefficients is not zero. (We show later that this statement is true for any square system
of linear equations.)
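The three geometric cases can be told apart from the coefficients alone. The Python sketch below is an illustration under the assumption that both equations are nondegenerate; the function name classify_2x2 and its return labels are ours. It uses cross-products rather than ratios so that zero coefficients cause no division problems.

```python
def classify_2x2(A1, B1, C1, A2, B2, C2):
    """Classify A1x + B1y = C1, A2x + B2y = C2 (both nondegenerate)."""
    det = A1 * B2 - A2 * B1
    if det != 0:
        return "unique"       # coefficients of x and y are not proportional
    # det == 0: the lines are parallel or coincide. They coincide exactly
    # when the constants are proportional to the coefficients as well.
    if A1 * C2 - A2 * C1 == 0 and B1 * C2 - B2 * C1 == 0:
        return "infinite"
    return "none"

print(classify_2x2(1, -1, -1, 3, 2, 12))  # unique   (Fig. 3-2(a))
print(classify_2x2(1, 2, 4, 2, 4, 8))     # infinite (Fig. 3-2(c))
print(classify_2x2(1, -3, 4, -2, 6, 5))   # none
```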
Elimination Algorithm
The solution to system (3.4) can be obtained by the process of elimination, whereby we reduce the
system to a single equation in only one unknown. Assuming the system has a unique solution, this
elimination algorithm has two parts.
Algorithm 3.1: The input consists of two nondegenerate linear equations L1 and L2 in two unknowns
with a unique solution.
Part A. (Forward Elimination) Multiply each equation by a constant so that the resulting coefficients of
one unknown are negatives of each other, and then add the two equations to obtain a new
equation L that has only one unknown.
Part B. (Back-substitution) Solve for the unknown in the new equation L (which contains only one
unknown), substitute this value of the unknown into one of the original equations, and then
solve to obtain the value of the other unknown.
Part A of Algorithm 3.1 can be applied to any system even if the system does not have a unique
solution. In such a case, the new equation L will be degenerate and Part B will not apply.
Example 3.5 (Unique Case). Solve the system
        L1: 2x - 3y = -8
        L2: 3x + 4y = 5
The unknown x is eliminated from the equations by forming the new equation L = -3L1 + 2L2. That is, we
multiply L1 by -3 and L2 by 2 and add the resulting equations as follows:
        -3L1:       -6x + 9y = 24
         2L2:        6x + 8y = 10
        Addition:        17y = 34
We now solve the new equation for y, obtaining y = 2. We substitute y = 2 into one of the original equations, say L1,
and solve for the other unknown x, obtaining
        2x - 3(2) = -8 or 2x - 6 = -8 or 2x = -2 or x = -1
Thus x = -1, y = 2, or the pair u = (-1, 2) is the unique solution of the system. The unique solution is expected,
since 2/3 ≠ -3/4. [Geometrically, the lines corresponding to the equations intersect at the point (-1, 2).]
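Algorithm 3.1 can be sketched in a few lines of Python. This is an illustration only: the function name eliminate_2x2 is ours, integer coefficients are assumed, and the particular multipliers (-A2 for L1 and A1 for L2) are one convenient choice that makes the x coefficients negatives of each other, as in Example 3.5.

```python
from fractions import Fraction

def eliminate_2x2(eq1, eq2):
    """Solve two equations (A, B, C) meaning Ax + By = C, unique case only."""
    A1, B1, C1 = eq1
    A2, B2, C2 = eq2
    # Part A: form L = -A2*L1 + A1*L2, which contains only the unknown y.
    By = -A2 * B1 + A1 * B2
    Cy = -A2 * C1 + A1 * C2
    if By == 0:
        raise ValueError("L is degenerate: no unique solution")
    y = Fraction(Cy, By)
    # Part B: back-substitute into an original equation that contains x.
    if A1 != 0:
        x = (C1 - B1 * y) / A1
    else:
        x = (C2 - B2 * y) / A2
    return x, y

x, y = eliminate_2x2((2, -3, -8), (3, 4, 5))
print(x, y)   # -1 2
```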
Example 3.6. (Nonunique Cases)
(a) Solve the system
        L1:   x - 3y = 4
        L2: -2x + 6y = 5
We eliminate x from the equations by multiplying L1 by 2 and adding it to L2, that is, by forming the new
equation L = 2L1 + L2. This yields the degenerate equation
        0x + 0y = 13
which has a nonzero constant b = 13. Thus this equation and the system have no solution. This is expected, since
1/(-2) = -3/6 ≠ 4/5. (Geometrically, the lines corresponding to the equations are parallel.)
(b) Solve the system
        L1:   x - 3y = 4
        L2: -2x + 6y = -8
We eliminate x from the equations by multiplying L1 by 2 and adding it to L2, that is, by forming the new
equation L = 2L1 + L2. This yields the degenerate equation
        0x + 0y = 0
where the constant term is also zero. Thus the system has an infinite number of solutions, which correspond to the
solutions of either equation. This is expected, since 1/(-2) = -3/6 = 4/(-8). (Geometrically, the lines
corresponding to the equations coincide.)
To find the general solution, let y = a, and substitute into L1 to obtain
        x - 3a = 4 or x = 3a + 4
Thus the general solution of the system is
        x = 3a + 4, y = a or u = (3a + 4, a)
where a (called a parameter) is any scalar.
3.5 SYSTEMS IN TRIANGULAR AND ECHELON FORM
The main method for solving systems of linear equations, Gaussian elimination, is treated in Section
3.6. Here we consider two simple types of systems of linear equations: systems in triangular form and the
more general systems in echelon form.
Triangular Form
Consider the following system of linear equations, which is in triangular form:
        2x1 - 3x2 + 5x3 - 2x4 = 9
              5x2 -  x3 + 3x4 = 1
                    7x3 -  x4 = 3
                          2x4 = 8
That is, the first unknown x1 is the leading unknown in the first equation, the second unknown x2 is the
leading unknown in the second equation, and so on. Thus, in particular, the system is square and each
leading unknown is directly to the right of the leading unknown in the preceding equation.
Such a triangular system always has a unique solution, which may be obtained by back-substitution.
That is:
(1) First solve the last equation for the last unknown to get x4 = 4.
(2) Then substitute this value x4 = 4 in the next-to-last equation, and solve for the next-to-last unknown
x3 as follows:
        7x3 - 4 = 3 or 7x3 = 7 or x3 = 1
(3) Now substitute x3 = 1 and x4 = 4 in the second equation, and solve for the second unknown x2 as
follows:
        5x2 - 1 + 12 = 1 or 5x2 + 11 = 1 or 5x2 = -10 or x2 = -2
(4) Finally, substitute x2 = -2, x3 = 1, x4 = 4 in the first equation, and solve for the first unknown x1 as
follows:
        2x1 + 6 + 5 - 8 = 9 or 2x1 + 3 = 9 or 2x1 = 6 or x1 = 3
Thus x1 = 3, x2 = -2, x3 = 1, x4 = 4, or, equivalently, the vector u = (3, -2, 1, 4) is the unique solution
of the system.
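The back-substitution steps above follow a fixed pattern, which the following Python sketch makes explicit. It is an illustration, not the book's own code: the function name back_substitute is ours, and the system is stored as an upper-triangular coefficient list U with nonzero diagonal and a list b of constants.

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve an upper-triangular system Ux = b with nonzero diagonal entries."""
    n = len(U)
    x = [Fraction(0)] * n
    # Work from the last equation up to the first.
    for i in range(n - 1, -1, -1):
        # Contribution of the unknowns that are already determined.
        known = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = Fraction(b[i] - known) / U[i][i]
    return x

# The triangular system discussed above.
U = [[2, -3, 5, -2],
     [0, 5, -1, 3],
     [0, 0, 7, -1],
     [0, 0, 0, 2]]
b = [9, 1, 3, 8]
print([str(v) for v in back_substitute(U, b)])   # ['3', '-2', '1', '4']
```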
Remark. There is an alternative form for back-substitution (which will be used when solving a
system using the matrix format). Namely, after first finding the value of the last unknown, we substitute this
value for the last unknown in all the preceding equations before solving for the next-to-last unknown. This
yields a triangular system with one less equation and one less unknown. For example, in the above
triangular system, we substitute x4 = 4 in all the preceding equations to obtain the triangular system
        2x1 - 3x2 + 5x3 = 17
              5x2 -  x3 = -11
                    7x3 = 7
We then repeat the process using the new last equation. And so on.
Echelon Form, Pivot and Free Variables
The following system of linear equations is said to be in echelon form:
        2x1 + 6x2 - x3 + 4x4 - 2x5 = 15
                    x3 + 2x4 + 2x5 = 5
                         3x4 - 9x5 = 6
That is, no equation is degenerate and the leading unknown in each equation other than the first is to the
right of the leading unknown in the preceding equation. The leading unknowns in the system, x1, x3, x4, are
called pivot variables and the other unknowns, x2 and x5, are called free variables.
Generally speaking, an echelon system or a system in echelon form has the following form:
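        a11 x1 + a12 x2 + a13 x3 + a14 x4 + ... + a1n xn = b1
                 a_{2j_2} x_{j_2} + a_{2,j_2+1} x_{j_2+1} + ... + a2n xn = b2
                 ...................................................
                          a_{rj_r} x_{j_r} + ... + a_{rn} xn = br               (3.4)
where 1 < j_2 < ... < j_r and a11, a_{2j_2}, ..., a_{rj_r} are not zero. The pivot variables are x1, x_{j_2}, ..., x_{j_r}. Note
that r ≤ n.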
The solution set of any echelon system is described in the following theorem (proved in Problem 3.10).
Theorem 3.5: Consider a system of linear equations in echelon form, say with r equations in n
unknowns. There are two cases.
(i) r = n. That is, there are as many equations as unknowns (triangular form). Then the
system has a unique solution,
(ii) r < n. That is, there are more unknowns than equations. Then we can arbitrarily
assign values to the n - r free variables and solve uniquely for the r pivot variables,
obtaining a solution of the system.
Suppose an echelon system does contain more unknowns than equations. Assuming the field K is
infinite, the system has an infinite number of solutions, since each of the n - r free variables may be
assigned any scalar.
The general solution of a system with free variables may be described in either of two equivalent ways,
which we illustrate using the above echelon system where there are r = 3 equations and n = 5 unknowns.
One description is called the "Parametric Form" of the solution, and the other description is called the
"Free-Variable Form".
Parametric Form
Assign arbitrary values, called parameters, to the free variables x2 and x5, say x2 = a and x5 = b, and
then use back-substitution to obtain values for the pivot variables x1, x3, x4 in terms of the parameters a and
b. Specifically:
(1) Substitute x5 = b in the last equation, and solve for x4:
        3x4 - 9b = 6 or 3x4 = 6 + 9b or x4 = 2 + 3b
(2) Substitute x4 = 2 + 3b and x5 = b into the second equation, and solve for x3:
        x3 + 2(2 + 3b) + 2b = 5 or x3 + 4 + 8b = 5 or x3 = 1 - 8b
(3) Substitute x2 = a, x3 = 1 - 8b, x4 = 2 + 3b, x5 = b into the first equation, and solve for x1:
        2x1 + 6a - (1 - 8b) + 4(2 + 3b) - 2b = 15 or x1 = 4 - 3a - 9b
Accordingly, the general solution in parametric form is
        x1 = 4 - 3a - 9b, x2 = a, x3 = 1 - 8b, x4 = 2 + 3b, x5 = b
or, equivalently,
        v = (4 - 3a - 9b, a, 1 - 8b, 2 + 3b, b)
where a and b are arbitrary numbers.
Free-Variable Form
Use back-substitution to solve for the pivot variables x1, x3, x4 directly in terms of the free variables x2
and x5. That is, the last equation gives x4 = 2 + 3x5. Substitution in the second equation yields
x3 = 1 - 8x5, and then substitution in the first equation yields x1 = 4 - 3x2 - 9x5. Accordingly,
        x1 = 4 - 3x2 - 9x5, x2 = free variable, x3 = 1 - 8x5, x4 = 2 + 3x5, x5 = free variable
or, equivalently,
        v = (4 - 3x2 - 9x5, x2, 1 - 8x5, 2 + 3x5, x5)
is the free-variable form for the general solution of the system.
We emphasize that there is no difference between the above two forms of the general solution, and the
use of one or the other to represent the general solution is simply a matter of taste.
Remark: A particular solution of the above system can be found by assigning any values to the free
variables and then solving for the pivot variables by back-substitution. For example, setting x2 = 1 and
x5 = 1, we obtain
        x4 = 2 + 3 = 5, x3 = 1 - 8 = -7, x1 = 4 - 3 - 9 = -8
Thus u = (-8, 1, -7, 5, 1) is the particular solution corresponding to x2 = 1 and x5 = 1.
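The parametric form can be checked by substituting it back into the echelon system for several values of the parameters. The Python sketch below is an illustration with function names of our own choosing. It assumes that the constant of the first equation is 15, which is the value consistent with the stated solution x1 = 4 - 3a - 9b.

```python
def general_solution(a, b):
    """Parametric form of the solution: x2 = a and x5 = b are free."""
    return (4 - 3 * a - 9 * b, a, 1 - 8 * b, 2 + 3 * b, b)

def satisfies(v):
    """Check v against the three equations of the echelon system."""
    x1, x2, x3, x4, x5 = v
    return (2 * x1 + 6 * x2 - x3 + 4 * x4 - 2 * x5 == 15   # assumed constant 15
            and x3 + 2 * x4 + 2 * x5 == 5
            and 3 * x4 - 9 * x5 == 6)

print(general_solution(1, 1))             # (-8, 1, -7, 5, 1)
print(satisfies(general_solution(1, 1)))  # True
```

Since the identity holds for every choice of a and b, any grid of parameter values passes the same check.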
3.6 GAUSSIAN ELIMINATION
The main method for solving the general system (3.2) of linear equations is called Gaussian
elimination. It essentially consists of two parts:
Part A. (Forward Elimination) Step-by-step reduction of the system yielding either a degenerate equation
with no solution (which indicates the system has no solution) or an equivalent simpler system in
triangular or echelon form.
Part B. (Backward Elimination) Step-by-step back-substitution to find the solution of the simpler
system.
Part B has already been investigated in Section 3.5. Accordingly, we need only give the algorithm for
Part A, which is as follows.
Algorithm 3.2 (for Part A):
Input: The m × n system (3.2) of linear equations.
Elimination Step: Find the first unknown in the system with a nonzero coefficient (which now must be
x1).
(a) Arrange so that a11 ≠ 0. That is, if necessary, interchange equations so that the first unknown x1
appears with a nonzero coefficient in the first equation.
(b) Use a11 as a pivot to eliminate x1 from all equations except the first equation. That is, for i > 1:
        (1) Set m = -a_{i1}/a11;    (2) Replace L_i by mL1 + L_i
The system now has the following form:
        a11 x1 + a12 x2 + a13 x3 + ... + a1n xn = b1
                 a_{2j_2} x_{j_2} + ... + a2n xn = b2
                 ................................
                 a_{mj_2} x_{j_2} + ... + amn xn = bm
where x1 does not appear in any equation except the first, a11 ≠ 0, and x_{j_2} denotes the first unknown
with a nonzero coefficient in any equation other than the first.
(c) Examine each new equation L.
(1) If L has the form 0x1 + 0x2 + ... + 0xn = b with b ≠ 0, then
STOP
The system is inconsistent and has no solution.
(2) If L has the form 0x1 + 0x2 + ... + 0xn = 0 or if L is a multiple of another equation, then delete
L from the system.
Recursion Step: Repeat the Elimination Step with each new "smaller" subsystem formed by all the
equations excluding the first equation.
Output: Finally, the system is reduced to triangular or echelon form, or a degenerate equation with no
solution is obtained indicating an inconsistent system.
The next remarks refer to the Elimination Step in Algorithm 3.2.
(1) The following number m in (b) is called the multiplier:
        m = -a_{i1}/a11 = -(coefficient to be deleted)/(pivot)
(2) One could alternatively apply the following operation in (b):
        Replace L_i by -a_{i1}L1 + a11L_i
This would avoid fractions if all the scalars were originally integers.
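The Elimination and Recursion Steps of Algorithm 3.2 can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the function name forward_eliminate is ours, each equation is stored as a list [a1, ..., an, b], exact fractions are used, and an equation that is a multiple of another is removed when it reduces to 0 = 0 rather than being detected beforehand.

```python
from fractions import Fraction

def forward_eliminate(equations):
    """Return the echelon system, or None if the system is inconsistent."""
    rows = [[Fraction(v) for v in eq] for eq in equations]
    n = len(rows[0]) - 1          # number of unknowns
    echelon = []
    while rows:
        # (c) Delete equations 0 = 0; stop on 0 = b with b nonzero.
        kept = []
        for r in rows:
            if any(r[:n]):
                kept.append(r)
            elif r[n] != 0:
                return None
        rows = kept
        if not rows:
            break
        # First unknown with a nonzero coefficient in the current subsystem.
        j = min(next(k for k in range(n) if r[k] != 0) for r in rows)
        # (a) Interchange so that the first equation contains x_j.
        p = next(i for i, r in enumerate(rows) if r[j] != 0)
        rows[0], rows[p] = rows[p], rows[0]
        first = rows[0]
        # (b) Replace L_i by m*L_1 + L_i with multiplier m = -a_ij / a_1j.
        for r in rows[1:]:
            m = -r[j] / first[j]
            for k in range(n + 1):
                r[k] += m * first[k]
        echelon.append(first)
        rows = rows[1:]           # Recursion Step: drop the first equation
    return echelon

system = [[1, -3, -2, 6], [2, -4, -3, 8], [-3, 6, 8, -5]]
for row in forward_eliminate(system):
    print([str(v) for v in row])
```

For the sample system, the rows printed are the coefficients of x - 3y - 2z = 6, 2y + z = -4, and (7/2)z = 7. The fraction arises because the sketch uses the multiplier form of step (b) rather than the fraction-free alternative.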
Gaussian Elimination Example
Here we illustrate in detail Gaussian elimination using the following system of linear equations:
        L1:    x - 3y - 2z =  6
        L2:   2x - 4y - 3z =  8
        L3:  -3x + 6y + 8z = -5
Part A. We use the coefficient 1 of x in the first equation L1 as the pivot in order to eliminate x from the
second equation L2 and from the third equation L3. This is accomplished as follows:
(1) Multiply L1 by the multiplier m = -2 and add it to L2; that is, "Replace L2 by -2L1 + L2".
(2) Multiply L1 by the multiplier m = 3 and add it to L3; that is, "Replace L3 by 3L1 + L3".
These steps yield
        (-2)L1:  -2x + 6y + 4z = -12            3L1:   3x - 9y - 6z = 18
            L2:   2x - 4y - 3z =   8             L3:  -3x + 6y + 8z = -5
        New L2:        2y +  z =  -4         New L3:       -3y + 2z = 13
Thus the original system is replaced by the following system:
        L1:   x - 3y - 2z =  6
        L2:       2y +  z = -4
        L3:      -3y + 2z = 13
(Note that the equations L2 and L3 form a subsystem with one less equation and one less unknown than the
original system.)
Next we use the coefficient 2 of y in the (new) second equation L2 as the pivot in order to eliminate y
from the (new) third equation L3. This is accomplished as follows:
(3) Multiply L2 by the multiplier m = 3/2 and add it to L3; that is, "Replace L3 by (3/2)L2 + L3"
(alternately, "Replace L3 by 3L2 + 2L3", which will avoid fractions).
This step yields
        (3/2)L2:   3y + (3/2)z = -6                3L2:    6y + 3z = -12
             L3:  -3y +     2z = 13       or       2L3:   -6y + 4z =  26
         New L3:        (7/2)z =  7             New L3:         7z =  14
Thus our system is replaced by the following system:
        L1:   x - 3y - 2z =  6
        L2:       2y +  z = -4
        L3:            7z = 14   (or (7/2)z = 7)
The system is now in triangular form, so Part A is completed.
Part B. The values for the unknowns are obtained in reverse order, z,y,x, by back-substitution.
Specifically:
(1) Solve for z in L3 to get z = 2.
(2) Substitute z = 2 in L2, and solve for y to get y = -3.
(3) Substitute y = -3 and z = 2 in L1, and solve for x to get x = 1.
Thus the solution of the triangular system and hence the original system is as follows:
        x = 1, y = -3, z = 2 or, equivalently, u = (1, -3, 2).
Condensed Format
The Gaussian elimination algorithm involves rewriting systems of linear equations. Sometimes we can
avoid excessive recopying of some of the equations by adopting a "condensed format". This format for the
solution of the above system follows:
        Number     Equation                  Operation
        (1)          x - 3y - 2z =  6
        (2)         2x - 4y - 3z =  8
        (3)        -3x + 6y + 8z = -5
        (2')             2y +  z = -4        Replace L2 by -2L1 + L2
        (3')            -3y + 2z = 13        Replace L3 by 3L1 + L3
        (3'')                 7z = 14        Replace L3 by 3L2 + 2L3
That is, first we write down the number of each of the original equations. As we apply the Gaussian
elimination algorithm to the system, we only write down the new equations, and we label each new
equation using the same number as the original corresponding equation, but with an added prime. (After
each new equation, we will indicate, for instructional purposes, the elementary operation that yielded the
new equation.)
The system in triangular form consists of equations (1), (2'), and (3''), the numbers with the largest
number of primes. Applying back-substitution to these equations again yields x = 1, y = -3, z = 2.
Remark: If two equations need to be interchanged, say to obtain a nonzero coefficient as a pivot,
then this is easily accomplished in the format by simply renumbering the two equations rather than
changing their positions.
Example 3.7. Solve the following system:
         x + 2y -  3z = 1
        2x + 5y -  8z = 4
        3x + 8y - 13z = 7
We solve the system by Gaussian elimination.
Part A. (Forward Elimination) We use the coefficient 1 of x in the first equation L1 as the pivot in order to
eliminate x from the second equation L2 and from the third equation L3. This is accomplished as follows:
(1) Multiply L1 by the multiplier m = -2 and add it to L2; that is, "Replace L2 by -2L1 + L2".
(2) Multiply L1 by the multiplier m = -3 and add it to L3; that is, "Replace L3 by -3L1 + L3".
The two steps yield
        x + 2y - 3z = 1                  x + 2y - 3z = 1
             y - 2z = 2        or             y - 2z = 2
            2y - 4z = 4
(The third equation is deleted, since it is a multiple of the second equation.) The system is now in echelon form with
free variable z.
Part B. (Backward Elimination) To obtain the general solution, let the free variable z = a, and solve for x and y by
back-substitution. Substitute z = a in the second equation to obtain y = 2 + 2a. Then substitute z = a and y = 2 + 2a
into the first equation to obtain
        x + 2(2 + 2a) - 3a = 1 or x + 4 + 4a - 3a = 1 or x = -3 - a
Thus the following is the general solution where a is a parameter:
        x = -3 - a, y = 2 + 2a, z = a or u = (-3 - a, 2 + 2a, a)
Example 3.8. Solve the following system:
         x1 + 3x2 -  2x3 +  5x4 = 4
        2x1 + 8x2 -   x3 +  9x4 = 9
        3x1 + 5x2 - 12x3 + 17x4 = 7
We use Gaussian elimination.
Part A. (Forward Elimination) We use the coefficient 1 of x1 in the first equation L1 as the pivot in order to
eliminate x1 from the second equation L2 and from the third equation L3. This is accomplished by the following
operations:
(1) "Replace L2 by -2L1 + L2" and (2) "Replace L3 by -3L1 + L3"
These yield:
        x1 + 3x2 - 2x3 + 5x4 =  4
             2x2 + 3x3 -  x4 =  1
            -4x2 - 6x3 + 2x4 = -5
We now use the coefficient 2 of x2 in the second equation L2 as the pivot and the multiplier m = 2 in order to
eliminate x2 from the third equation L3. This is accomplished by the operation "Replace L3 by 2L2 + L3", which then
yields the degenerate equation
        0x1 + 0x2 + 0x3 + 0x4 = -3
This equation and, hence, the original system have no solution:
DO NOT CONTINUE
Remark 1: As in the above examples, Part A of Gaussian elimination tells us whether or not the
system has a solution, that is, whether or not the system is consistent. Accordingly, Part B need never be
applied when a system has no solution.
Remark 2: If a system of linear equations has more than four unknowns and four equations, then it
may be more convenient to use the matrix format for solving the system. This matrix format is discussed
later.
3.7 ECHELON MATRICES, ROW CANONICAL FORM, ROW EQUIVALENCE
One way to solve a system of linear equations is by working with its augmented matrix M rather than
the system itself. This section introduces the necessary matrix concepts for such a discussion. These
concepts, such as echelon matrices and elementary row operations, are also of independent interest.
Echelon Matrices
A matrix A is called an echelon matrix, or is said to be in echelon form, if the following two conditions
hold (where a leading nonzero element of a row of A is the first nonzero element in the row):
(1) All zero rows, if any, are at the bottom of the matrix.
(2) Each leading nonzero entry in a row is to the right of the leading nonzero entry in the preceding row.
That is, A = [a_{ij}] is an echelon matrix if there exist nonzero entries
        a_{1j_1}, a_{2j_2}, ..., a_{rj_r},    where j_1 < j_2 < ... < j_r
with the property that
        a_{ij} = 0    for    (i) i ≤ r, j < j_i    and    (ii) i > r
The entries a_{1j_1}, a_{2j_2}, ..., a_{rj_r}, which are the leading nonzero elements in their respective rows, are called
the pivots of the echelon matrix.
Example 3.9. The following is an echelon matrix whose pivots have been circled:
            [ 0 ② 3 4 5 9 0 7 ]
            [ 0 0 0 ③ 4 1 2 5 ]
        A = [ 0 0 0 0 0 ⑤ 7 2 ]
            [ 0 0 0 0 0 0 ⑧ 6 ]
            [ 0 0 0 0 0 0 0 0 ]
Observe that the pivots are in columns C2, C4, C6, C7, and each is to the right of the one above. Using the above
notation, the pivots are
        a_{1j_1} = 2, a_{2j_2} = 3, a_{3j_3} = 5, a_{4j_4} = 8
where j_1 = 2, j_2 = 4, j_3 = 6, j_4 = 7. Here r = 4.
Row Canonical Form
A matrix A is said to be in row canonical form if it is an echelon matrix, that is, if it satisfies the above
properties (1) and (2), and if it satisfies the following additional two properties:
(3) Each pivot (leading nonzero entry) is equal to 1.
(4) Each pivot is the only nonzero entry in its column.
The major difference between an echelon matrix and a matrix in row canonical form is that in an
echelon matrix there must be zeros below the pivots [Properties (1) and (2)], but in a matrix in row
canonical form, each pivot must also equal 1 [Property (3)] and there must also be zeros above the pivots
[Property (4)].
The zero matrix 0 of any size and the identity matrix I of any size are important special examples of
matrices in row canonical form.
Example 3.10. The following are echelon matrices whose pivots have been circled:
        [ ② 3 2 0  4 5 -6 ]      [ ① 2 3 ]      [ 0 ① 3 0 0  4 ]
        [ 0 0 0 ① -3 2  0 ]      [ 0 0 ① ]      [ 0 0 0 ① 0 -3 ]
        [ 0 0 0 0  0 ⑥  2 ]      [ 0 0 0 ]      [ 0 0 0 0 ①  2 ]
        [ 0 0 0 0  0 0  0 ]
The third matrix is also an example of a matrix in row canonical form. The second matrix is not in row canonical
form, since it does not satisfy property (4), that is, there is a nonzero entry above the second pivot in the third column.
The first matrix is not in row canonical form, since it satisfies neither property (3) nor property (4), that is, some pivots
are not equal to 1 and there are nonzero entries above the pivots.
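The defining properties of echelon matrices and of row canonical form are easy to test mechanically. The following Python sketch is an illustration only; the function names pivot_columns, is_echelon and is_row_canonical are ours, and a matrix is represented as a list of rows.

```python
def pivot_columns(M):
    """Column index of the leading nonzero entry of each row (None for a zero row)."""
    return [next((j for j, v in enumerate(row) if v != 0), None) for row in M]

def is_echelon(M):
    cols = pivot_columns(M)
    nonzero = [c for c in cols if c is not None]
    # All zero rows are at the bottom of the matrix.
    zero_rows_last = cols[:len(nonzero)] == nonzero
    # Each leading entry is to the right of the one in the preceding row.
    strictly_right = all(a < b for a, b in zip(nonzero, nonzero[1:]))
    return zero_rows_last and strictly_right

def is_row_canonical(M):
    if not is_echelon(M):
        return False
    for i, j in enumerate(pivot_columns(M)):
        if j is None:
            continue
        if M[i][j] != 1:                       # each pivot equals 1
            return False
        if any(M[k][j] != 0 for k in range(len(M)) if k != i):
            return False                       # pivot is alone in its column
    return True

B = [[1, 2, 3], [0, 0, 1], [0, 0, 0]]
print(is_echelon(B), is_row_canonical(B))   # True False
```

The matrix B is echelon but not in row canonical form, because of the entry 3 above the second pivot.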
Elementary Row Operations
Suppose A is a matrix with rows R1, R2, ..., Rm. The following operations on A are called elementary
row operations.
[E1] (Row Interchange): Interchange rows R_i and R_j. This may be written as
"Interchange R_i and R_j" or "R_i ↔ R_j"
[E2] (Row Scaling): Replace row R_i by a nonzero multiple kR_i of itself. This may be written as
"Replace R_i by kR_i (k ≠ 0)" or "kR_i → R_i"
[E3] (Row Addition): Replace row R_j by the sum of a multiple kR_i of a row R_i and itself. This may be
written as
"Replace R_j by kR_i + R_j" or "kR_i + R_j → R_j"
The arrow → in [E2] and [E3] may be read as "replaces".
Sometimes (say to avoid fractions when all the given scalars are integers) we may apply [E2] and [E3]
in one step, that is, we may apply the following operation:
[E] Replace R_j by the sum of a multiple kR_i of a row R_i and a nonzero multiple k'R_j of itself. This may be
written as
"Replace R_j by kR_i + k'R_j (k' ≠ 0)" or "kR_i + k'R_j → R_j"
We emphasize that in operations [E3] and [E] only row R_j is changed.
Row Equivalence, Rank of a Matrix
A matrix A is said to be row equivalent to a matrix B, written
        A ~ B
if B can be obtained from A by a sequence of elementary row operations. In the case that B is also an
echelon matrix, B is called an echelon form of A.
The following are two basic results on row equivalence.
Theorem 3.6: Suppose A = [a_{ij}] and B = [b_{ij}] are row equivalent echelon matrices with respective pivot
entries
        a_{1j_1}, a_{2j_2}, ..., a_{rj_r} and b_{1k_1}, b_{2k_2}, ..., b_{sk_s}
Then A and B have the same number of nonzero rows, that is, r = s, and the pivot entries
are in the same positions, that is, j_1 = k_1, j_2 = k_2, ..., j_r = k_r.
Theorem 3.7: Every matrix A is row equivalent to a unique matrix in row canonical form.
The proofs of the above theorems will be postponed to Chapter 4. The unique matrix in Theorem 3.7 is
called the row canonical form of A.
Using the above theorems, we can now give our first definition of the rank of a matrix.
Definition: The rank of a matrix A, written rank(A), is equal to the number of pivots in an echelon form
of A.
The rank is a very important property of a matrix and, depending on the context in which the
matrix is used, it will be defined in many different ways. Of course, all the definitions lead to the
same number.
The next section gives the matrix format of Gaussian elimination, which finds an echelon form of any
matrix A (and hence the rank of A), and also finds the row canonical form of A.
One can show that row equivalence is an equivalence relation. That is:
(1) A ~ A for any matrix A.
(2) If A ~ B, then B ~ A.
(3) If A ~ B and B ~ C, then A ~ C.
Property (2) comes from the fact that each elementary row operation has an inverse operation of the same
type. Namely:
(i) "Interchange R_i and R_j" is its own inverse.
(ii) "Replace R_i by kR_i" and "Replace R_i by (1/k)R_i" are inverses.
(iii) "Replace R_j by kR_i + R_j" and "Replace R_j by -kR_i + R_j" are inverses.
There is a similar result for operation [E] (Problem 3.73).
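The three elementary row operations and their inverses can be sketched in Python as follows. This is an illustration only; the function names interchange, scale and add_multiple are ours, rows are indexed from 0, and each function returns a new matrix rather than modifying its argument.

```python
from fractions import Fraction

def interchange(M, i, j):
    """[E1] Interchange rows i and j."""
    N = [row[:] for row in M]
    N[i], N[j] = N[j], N[i]
    return N

def scale(M, i, k):
    """[E2] Replace row i by k times itself, where k is nonzero."""
    if k == 0:
        raise ValueError("k must be nonzero")
    N = [row[:] for row in M]
    N[i] = [k * v for v in N[i]]
    return N

def add_multiple(M, i, j, k):
    """[E3] Replace row j by k*(row i) + (row j), for i different from j."""
    N = [row[:] for row in M]
    N[j] = [k * a + b for a, b in zip(M[i], M[j])]
    return N

A = [[Fraction(1), Fraction(2)], [Fraction(3), Fraction(4)]]
# Each operation is undone by an operation of the same type.
print(interchange(interchange(A, 0, 1), 0, 1) == A)                # True
print(scale(scale(A, 0, Fraction(5)), 0, Fraction(1, 5)) == A)     # True
print(add_multiple(add_multiple(A, 0, 1, 7), 0, 1, -7) == A)       # True
```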
3.8 GAUSSIAN ELIMINATION, MATRIX FORMULATION
This section gives two matrix algorithms that accomplish the following:
(1) Algorithm 3.3 transforms any matrix A into an echelon form.
(2) Algorithm 3.4 transforms the echelon matrix into its row canonical form.
These algorithms, which use the elementary row operations, are simply restatements of Gaussian
elimination as applied to matrices rather than to linear equations. (The term "row reduce" or simply
"reduce" will mean to transform a matrix by the elementary row operations.)
Algorithm 3.3 (Forward Elimination): The input is any matrix A. (The algorithm puts 0's below each
pivot, working from the "top-down".) The output is an echelon
form of A.
Step 1. Find the first column with a nonzero entry. Let j_1 denote this column.
(a) Arrange so that a_{1j_1} ≠ 0. That is, if necessary, interchange rows so that a nonzero entry
appears in the first row in column j_1.
(b) Use a_{1j_1} as a pivot to obtain 0's below a_{1j_1}.
Specifically, for i > 1:
        (1) Set m = -a_{ij_1}/a_{1j_1};    (2) Replace R_i by mR1 + R_i
[That is, apply the operation -(a_{ij_1}/a_{1j_1})R1 + R_i → R_i.]
Step 2. Repeat Step 1 with the submatrix formed by all the rows excluding the first row. Here we let j_2
denote the first column in the submatrix with a nonzero entry. Hence, at the end of Step 2, we
have a_{2j_2} ≠ 0.
Steps 3 to r. Continue the above process until a submatrix has only zero rows.
We emphasize that at the end of the algorithm, the pivots will be
        a_{1j_1}, a_{2j_2}, ..., a_{rj_r}
where r denotes the number of nonzero rows in the final echelon matrix.
Remark 1: The following number m in Step 1(b) is called the multiplier:
        m = -a_{ij_1}/a_{1j_1} = -(entry to be deleted)/(pivot)
Remark 2: One could replace the operation in Step 1(b) by
        Replace R_i by -a_{ij_1}R1 + a_{1j_1}R_i
This would avoid fractions if all the scalars were originally integers.
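Algorithm 3.3 can be sketched in Python as follows. This is a minimal illustration, not a definitive implementation: the names echelon_form and rank are ours, the matrix is a list of rows, exact fractions are used, and the multiplier form of Step 1(b) is applied rather than the fraction-free variant of Remark 2. The index top marks the first row of the current submatrix.

```python
from fractions import Fraction

def echelon_form(A):
    """Return an echelon form of A using elementary row operations."""
    M = [[Fraction(v) for v in row] for row in A]
    rows, cols = len(M), len(M[0])
    top = 0
    for j in range(cols):
        # Look for a nonzero entry in column j of the current submatrix.
        p = next((i for i in range(top, rows) if M[i][j] != 0), None)
        if p is None:
            continue
        M[top], M[p] = M[p], M[top]            # (a) row interchange if needed
        for i in range(top + 1, rows):         # (b) zeros below the pivot
            m = -M[i][j] / M[top][j]           # multiplier
            M[i] = [m * a + b for a, b in zip(M[top], M[i])]
        top += 1
        if top == rows:
            break
    return M

def rank(A):
    """Number of pivots (nonzero rows) in an echelon form of A."""
    return sum(1 for row in echelon_form(A) if any(row))

A = [[1, 2, -3, 1, 2], [2, 4, -4, 6, 10], [3, 6, -6, 9, 13]]
for row in echelon_form(A):
    print([str(v) for v in row])
print(rank(A))   # 3
```

For this matrix A, the echelon form produced has rows (1, 2, -3, 1, 2), (0, 0, 2, 4, 6) and (0, 0, 0, 0, -2).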
Algorithm 3.4 (Backward Elimination): The input is a matrix A = [a_{ij}] in echelon form with pivot
entries
        a_{1j_1}, a_{2j_2}, ..., a_{rj_r}
The output is the row canonical form of A.
Step 1. (a) (Use row scaling so the last pivot equals 1.) Multiply the last nonzero row R_r by 1/a_{rj_r}.
(b) (Use a_{rj_r} = 1 to obtain 0's above the pivot.) For i = r - 1, r - 2, ..., 2, 1:
        (1) Set m = -a_{ij_r};    (2) Replace R_i by mR_r + R_i
(That is, apply the operations -a_{ij_r}R_r + R_i → R_i.)
Steps 2 to r - 1. Repeat Step 1 for rows R_{r-1}, R_{r-2}, ..., R2.
Step r. (Use row scaling so the first pivot equals 1.) Multiply R1 by 1/a_{1j_1}.
There is an alternative form of Algorithm 3.4, which we describe here in words. The formal
description of this algorithm is left to the reader as a supplementary problem.
Alternative Algorithm 3.4: Puts 0's above the pivots row by row from the bottom up (rather than column
by column from right to left).
The alternative algorithm, when applied to an augmented matrix M of a system of linear equations, is
essentially the same as solving for the pivot unknowns one after the other from the bottom up.
Remark: We emphasize that Gaussian elimination is a two-stage process. Specifically:
Stage A (Algorithm 3.3). Puts 0's below each pivot, working from the top row $R_1$ down.
Stage B (Algorithm 3.4). Puts 0's above each pivot, working from the bottom row $R_r$ up.
There is another algorithm, called Gauss-Jordan, that also row reduces a matrix to its row canonical form.
The difference is that Gauss-Jordan puts 0's both below and above each pivot as it works its way from the
top row $R_1$ down. Although Gauss-Jordan may be easier to state and understand, it is much less efficient
than the two-stage Gaussian elimination algorithm.
Example 3.11. Consider the matrix
\[ A = \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 2 & 4 & -4 & 6 & 10 \\ 3 & 6 & -6 & 9 & 13 \end{bmatrix} \]
(a) Use Algorithm 3.3 to reduce A to an echelon form.
(b) Use Algorithm 3.4 to further reduce A to its row canonical form.
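(a) First use $a_{11} = 1$ as a pivot to obtain 0's below $a_{11}$, that is, apply the operations "Replace $R_2$ by $-2R_1 + R_2$" and
"Replace $R_3$ by $-3R_1 + R_3$"; and then use $a_{23} = 2$ as a pivot to obtain 0 below $a_{23}$, that is, apply the operation
"Replace $R_3$ by $-\tfrac{3}{2}R_2 + R_3$". This yields
\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 3 & 6 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & -2 \end{bmatrix} \]
The matrix is now in echelon form.
(b) Multiply $R_3$ by $-\tfrac{1}{2}$ so the pivot entry $a_{35} = 1$, and then use $a_{35} = 1$ as a pivot to obtain 0's above it by the
operations "Replace $R_2$ by $-6R_3 + R_2$" and then "Replace $R_1$ by $-2R_3 + R_1$". This yields
\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 2 \\ 0 & 0 & 2 & 4 & 6 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 2 & 4 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]
Multiply $R_2$ by $\tfrac{1}{2}$ so the pivot entry $a_{23} = 1$, and then use $a_{23} = 1$ as a pivot to obtain 0's above it by the operation
"Replace $R_1$ by $3R_2 + R_1$". This yields
\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 1 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 7 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]
The last matrix is the row canonical form of A.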
Application to Systems of Linear Equations
One way to solve a system of linear equations is by working with its augmented matrix M rather than
the equations themselves. Specifically, we reduce M to echelon form (which tells us whether the system
has a solution), and then further reduce M to its row canonical form (which essentially gives the solution of
the original system of linear equations). The justification for this process comes from the following facts:
(1) Any elementary row operation on the augmented matrix M of the system is equivalent to applying the
corresponding operation on the system itself.
(2) The system has a solution if and only if the echelon form of the augmented matrix M does not have a
row of the form $(0, 0, \ldots, 0, b)$ with $b \neq 0$.
(3) In the row canonical form of the augmented matrix M (excluding zero rows), the coefficient of each
basic variable is a pivot entry equal to 1 and it is the only nonzero entry in its respective column;
hence the free-variable form of the solution of the system of linear equations is obtained by simply
transferring the free variables to the other side.
This process is illustrated below.
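Example 3.12. Solve each of the following systems:
\[ \text{(a)}\quad \begin{aligned} x_1 + x_2 - 2x_3 + 4x_4 &= 5 \\ 2x_1 + 2x_2 - 3x_3 + x_4 &= 3 \\ 3x_1 + 3x_2 - 4x_3 - 2x_4 &= 1 \end{aligned} \qquad \text{(b)}\quad \begin{aligned} x_1 + x_2 - 2x_3 + 3x_4 &= 4 \\ 2x_1 + 3x_2 + 3x_3 - x_4 &= 3 \\ 5x_1 + 7x_2 + 4x_3 + x_4 &= 5 \end{aligned} \qquad \text{(c)}\quad \begin{aligned} x + 2y + z &= 3 \\ 2x + 5y - z &= -4 \\ 3x - 2y - z &= 5 \end{aligned} \]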
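(a) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:
\[ M = \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 2 & 2 & -3 & 1 & 3 \\ 3 & 3 & -4 & -2 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 4 & 5 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 2 & -14 & -14 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 0 & -10 & -9 \\ 0 & 0 & 1 & -7 & -7 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \]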
Rewrite the row canonical form in terms of a system of linear equations to obtain the free variable form of the
solution. That is,
\[ \begin{aligned} x_1 + x_2 - 10x_4 &= -9 \\ x_3 - 7x_4 &= -7 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x_1 &= -9 - x_2 + 10x_4 \\ x_3 &= -7 + 7x_4 \end{aligned} \]
(The zero row is omitted in the solution.) Observe that $x_1$ and $x_3$ are the pivot variables, and $x_2$ and $x_4$ are the free
variables.
(b) First reduce its augmented matrix M to echelon form as follows:
\[ M = \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 2 & 3 & 3 & -1 & 3 \\ 5 & 7 & 4 & 1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 2 & 14 & -14 & -15 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & -2 & 3 & 4 \\ 0 & 1 & 7 & -7 & -5 \\ 0 & 0 & 0 & 0 & -5 \end{bmatrix} \]
There is no need to continue to find the row canonical form of M, since the echelon form already tells us that the
system has no solution. Specifically, the third row of the echelon matrix corresponds to the degenerate equation
\[ 0x_1 + 0x_2 + 0x_3 + 0x_4 = -5 \]
which has no solution. Thus the system has no solution.
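(c) Reduce its augmented matrix M to echelon form and then to row canonical form as follows:
\[ M = \begin{bmatrix} 1 & 2 & 1 & 3 \\ 2 & 5 & -1 & -4 \\ 3 & -2 & -1 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & -8 & -4 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & -28 & -84 \end{bmatrix} \]
\[ \sim \begin{bmatrix} 1 & 2 & 1 & 3 \\ 0 & 1 & -3 & -10 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & 0 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & -1 \\ 0 & 0 & 1 & 3 \end{bmatrix} \]
Thus the system has the unique solution x = 2, y = -1, z = 3, or, equivalently, the vector u = (2, -1, 3). We
note that the echelon form of M already indicated that the solution was unique, since it corresponded to a
triangular system.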
Application to Existence and Uniqueness Theorems
This subsection gives theoretical conditions for the existence and uniqueness of a solution of a system
of linear equations using the notion of the rank of a matrix.
Theorem 3.8: Consider a system of linear equations in n unknowns with augmented matrix M = [A,B].
Then:
(a) The system has a solution if and only if rank(A) = rank(M).
(b) The solution is unique if and only if rank(A) = rank(M) = n.
Proof of (a). The system has a solution if and only if an echelon form of M = [A, B] does not have a
row of the form
\[ (0, 0, \ldots, 0, b), \quad\text{with } b \neq 0 \]
If an echelon form of M does have such a row, then b is a pivot of M but not of A, and hence
rank(M) > rank(A). Otherwise, the echelon forms of A and M have the same pivots, and hence
rank(A) = rank(M). This proves (a).
Proof of (b). The system has a unique solution if and only if an echelon form has no free variable. This
means there is a pivot for each unknown. Accordingly, n = rank(A) = rank(M). This proves (b).
The above proof uses the fact (Problem 3.74) that an echelon form of the augmented matrix
M = [A, B] also automatically yields an echelon form of A.
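As an illustration of Theorem 3.8, the following sketch classifies a system by comparing rank(A) with rank(M). The helper names `rank` and `classify` are hypothetical, and the rank is computed by counting pivots after forward elimination with exact `Fraction` arithmetic. The three systems are those of Example 3.12.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots in an echelon form, found by forward elimination."""
    A = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for j in range(len(A[0])):
        p = next((i for i in range(r, len(A)) if A[i][j] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        for i in range(r + 1, len(A)):
            mult = -A[i][j] / A[r][j]
            A[i] = [mult * a + b for a, b in zip(A[r], A[i])]
        r += 1
        if r == len(A):
            break
    return r

def classify(A, B):
    """Apply Theorem 3.8 to the system AX = B (B is the list of constants)."""
    n = len(A[0])                                  # number of unknowns
    M = [row + [b] for row, b in zip(A, B)]        # augmented matrix [A, B]
    if rank(A) != rank(M):
        return "no solution"
    return "unique solution" if rank(A) == n else "more than one solution"

results = [
    classify([[1, 1, -2, 4], [2, 2, -3, 1], [3, 3, -4, -2]], [5, 3, 1]),   # system (a)
    classify([[1, 1, -2, 3], [2, 3, 3, -1], [5, 7, 4, 1]], [4, 3, 5]),     # system (b)
    classify([[1, 2, 1], [2, 5, -1], [3, -2, -1]], [3, -4, 5]),            # system (c)
]
```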
3.9 MATRIX EQUATION OF A SYSTEM OF LINEAR EQUATIONS
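The general system (3.2) of m linear equations in n unknowns is equivalent to the matrix equation
\[ \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \qquad\text{or}\qquad AX = B \]
where $A = [a_{ij}]$ is the coefficient matrix, $X = [x_j]$ is the column vector of unknowns, and $B = [b_i]$ is the
column vector of constants. (Some texts write Ax = b rather than AX = B, in order to emphasize that x and
b are simply column vectors.)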
The statement that the system of linear equations and the matrix equation are equivalent means that
any vector solution of the system is a solution of the matrix equation, and vice versa.
Example 3.13. The following system of linear equations and matrix equation are equivalent:
\[ \begin{aligned} x_1 + 2x_2 - 4x_3 + 7x_4 &= 4 \\ 3x_1 - 5x_2 + 6x_3 - 8x_4 &= 8 \\ 4x_1 - 3x_2 - 2x_3 + 6x_4 &= 11 \end{aligned} \qquad\text{and}\qquad \begin{bmatrix} 1 & 2 & -4 & 7 \\ 3 & -5 & 6 & -8 \\ 4 & -3 & -2 & 6 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} 4 \\ 8 \\ 11 \end{bmatrix} \]
We note that $x_1 = 3$, $x_2 = 1$, $x_3 = 2$, $x_4 = 1$, or, in other words, the vector u = [3, 1, 2, 1] is a solution of the system.
Thus the (column) vector u is also a solution of the matrix equation.
The matrix form AX = B of a system of linear equations is notationally very convenient when
discussing and proving properties of systems of linear equations. This is illustrated with our first theorem
(described in Fig. 3-1), which we restate for easy reference.
Theorem 3.1: Suppose the field K is infinite. Then the system AX = B has: (a) a unique solution, (b) no
solution, or (c) an infinite number of solutions.
Proof. It suffices to show that if AX = B has more than one solution, then it has infinitely many.
Suppose u and v are distinct solutions of AX = B; that is, Au = B and Av = B. Then, for any $k \in K$,
\[ A[u + k(u - v)] = Au + k(Au - Av) = B + k(B - B) = B \]
Thus, for each $k \in K$, the vector $u + k(u - v)$ is a solution of AX = B. Since all such solutions are distinct
(Problem 3.47), AX = B has an infinite number of solutions.
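The construction in the proof can be checked numerically. The sketch below uses a hypothetical one-equation system x + y = 2 with two known solutions u and v (this system is an assumption chosen for illustration, not taken from the text) and builds several vectors of the form u + k(u - v).

```python
from fractions import Fraction

def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

# Hypothetical system x + y = 2 with two distinct solutions u and v.
A = [[1, 1]]
B = [2]
u = [1, 1]
v = [2, 0]

# Each scalar k gives the vector u + k(u - v).
scalars = (Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(3))
family = [[ui + k * (ui - vi) for ui, vi in zip(u, v)] for k in scalars]

all_solve = all(matvec(A, w) == B for w in family)
all_distinct = len({tuple(w) for w in family}) == len(scalars)
```

Both flags come out true: every member of the family solves the system, and different values of k give different vectors.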
Observe that the above theorem is true when K is the real field R (or the complex field C). Section 3.3
shows that the theorem has a geometrical description when the system consists of two equations in two
unknowns, where each equation represents a line in $\mathbf{R}^2$. The theorem also has a geometrical description
when the system consists of three nondegenerate equations in three unknowns, where the three equations
correspond to planes $H_1$, $H_2$, $H_3$ in $\mathbf{R}^3$. That is:
(a) Unique solution: Here the three planes intersect in exactly one point.
(b) No solution: Here the planes may intersect pairwise but with no common point of intersection, or two
of the planes may be parallel.
(c) Infinite number of solutions: Here the three planes may intersect in a line (one free variable), or they
may coincide (two free variables).
These three cases are pictured in Fig. 3-3.
[Fig. 3-3. Three planes in space: (a) unique solution; (b) no solutions; (c) infinite number of solutions.]
Matrix Equation of a Square System of Linear Equations
A system AX = B of linear equations is square if and only if the matrix A of coefficients is square. In
such a case, we have the following important result.
Theorem 3.9: A square system AX = B of linear equations has a unique solution if and only if the
matrix A is invertible. In such a case, $A^{-1}B$ is the unique solution of the system.
We only prove here that if A is invertible, then $A^{-1}B$ is the unique solution. If A is invertible, then
\[ A(A^{-1}B) = (AA^{-1})B = IB = B \]
and hence $A^{-1}B$ is a solution. Now suppose v is any solution, so Av = B. Then
\[ v = Iv = (A^{-1}A)v = A^{-1}(Av) = A^{-1}B \]
Thus the solution $A^{-1}B$ is unique.
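Example 3.14. Consider the following system of linear equations, whose coefficient matrix A and inverse $A^{-1}$ are also
given:
\[ \begin{aligned} x + 2y + 3z &= 1 \\ x + 3y + 6z &= 3 \\ 2x + 6y + 13z &= 5 \end{aligned} \qquad A = \begin{bmatrix} 1 & 2 & 3 \\ 1 & 3 & 6 \\ 2 & 6 & 13 \end{bmatrix}, \qquad A^{-1} = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix} \]
By Theorem 3.9, the unique solution of the system is
\[ A^{-1}B = \begin{bmatrix} 3 & -8 & 3 \\ -1 & 7 & -3 \\ 0 & -2 & 1 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} -6 \\ 5 \\ -1 \end{bmatrix} \]
That is, x = -6, y = 5, z = -1.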
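The computation in Example 3.14 can be reproduced with a few lines of Python. The helper `matvec` is a name introduced here for illustration; the matrices are those given in the example.

```python
def matvec(A, x):
    """Product of a matrix (list of rows) with a vector."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

A     = [[1, 2, 3], [1, 3, 6], [2, 6, 13]]
A_inv = [[3, -8, 3], [-1, 7, -3], [0, -2, 1]]
B     = [1, 3, 5]

X = matvec(A_inv, B)       # the candidate solution A^{-1}B
check = matvec(A, X)       # substituting back should reproduce B
```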
Remark. We emphasize that Theorem 3.9 does not usually help us to find the solution of a square
system. That is, finding the inverse of a coefficient matrix A is not usually any easier than solving the
system directly. Thus, unless we are given the inverse of a coefficient matrix A, as in Example 3.14, we
usually solve a square system by Gaussian elimination (or some iterative method whose discussion lies
beyond the scope of this text).
3.10 SYSTEMS OF LINEAR EQUATIONS AND LINEAR COMBINATIONS OF VECTORS
The general system (3.2) of linear equations may be rewritten as the following vector equation:
\[ x_1 \begin{bmatrix} a_{11} \\ a_{21} \\ \vdots \\ a_{m1} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \\ \vdots \\ a_{m2} \end{bmatrix} + \cdots + x_n \begin{bmatrix} a_{1n} \\ a_{2n} \\ \vdots \\ a_{mn} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_m \end{bmatrix} \]
Recall that a vector v in $K^n$ is said to be a linear combination of vectors $u_1, u_2, \ldots, u_m$ in $K^n$ if there exist
scalars $a_1, a_2, \ldots, a_m$ in K such that
\[ v = a_1u_1 + a_2u_2 + \cdots + a_mu_m \]
Accordingly, the general system (3.2) of linear equations and the above equivalent vector equation have a
solution if and only if the column vector of constants is a linear combination of the columns of the
coefficient matrix. We state this observation formally.
Theorem 3.10: A system AX = B of linear equations has a solution if and only if B is a linear
combination of the columns of the coefficient matrix A.
Thus the answer to the problem of expressing a given vector v in $K^n$ as a linear combination of vectors
$u_1, u_2, \ldots, u_m$ in $K^n$ reduces to solving a system of linear equations.
Linear Combination Example
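Suppose we want to write the vector v = (1, -2, 5) as a linear combination of the vectors
\[ u_1 = (1, 1, 1), \qquad u_2 = (1, 2, 3), \qquad u_3 = (2, -1, 1) \]
First we write $v = xu_1 + yu_2 + zu_3$ with unknowns x, y, z, and then we find the equivalent system of linear
equations which we solve. Specifically, we first write
\[ \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} + z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix} \tag{*} \]
Then
\[ \begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix} = \begin{bmatrix} x \\ x \\ x \end{bmatrix} + \begin{bmatrix} y \\ 2y \\ 3y \end{bmatrix} + \begin{bmatrix} 2z \\ -z \\ z \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ x + 2y - z \\ x + 3y + z \end{bmatrix} \]
Setting corresponding entries equal to each other yields the following equivalent system:
\[ \begin{aligned} x + y + 2z &= 1 \\ x + 2y - z &= -2 \\ x + 3y + z &= 5 \end{aligned} \tag{**} \]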
For notational convenience, we have written the vectors in $\mathbf{R}^n$ as columns, since it is then easier to find the
equivalent system of linear equations. In fact, one can easily go from the vector equation (*) directly to the
system (**).
Now we solve the equivalent system of linear equations by reducing the system to echelon form. This
yields
\[ \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 2y - z &= 4 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y + 2z &= 1 \\ y - 3z &= -3 \\ 5z &= 10 \end{aligned} \]
Back-substitution yields the solution x = -6, y = 3, z = 2. Thus $v = -6u_1 + 3u_2 + 2u_3$.
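The same reduction can be carried out mechanically. The sketch below is an illustration under stated assumptions: `solve_square` is a hypothetical helper that handles only a square system with a unique solution, using elimination followed by back-substitution with exact `Fraction` arithmetic. The vectors are those of the example above.

```python
from fractions import Fraction

def solve_square(A, b):
    """Solve a square system that has a unique solution:
    forward elimination, then back-substitution."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(bi)] for row, bi in zip(A, b)]
    for i in range(n):
        # interchange rows if needed so that the pivot M[i][i] is nonzero
        p = next(k for k in range(i, n) if M[k][i] != 0)
        M[i], M[p] = M[p], M[i]
        for k in range(i + 1, n):
            mult = -M[k][i] / M[i][i]
            M[k] = [mult * a + c for a, c in zip(M[i], M[k])]
    x = [Fraction(0)] * n
    for i in reversed(range(n)):                   # back-substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

u1, u2, u3 = (1, 1, 1), (1, 2, 3), (2, -1, 1)
v = (1, -2, 5)
# The vectors u1, u2, u3 become the columns of the coefficient matrix.
A = [list(row) for row in zip(u1, u2, u3)]
coeffs = solve_square(A, v)
```

The list `coeffs` holds x, y, z in that order, so it should agree with the back-substitution result in the text.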
Example 3.15
(a) Write the vector v = (4, 9, 19) as a linear combination of
\[ u_1 = (1, -2, 3), \qquad u_2 = (3, -7, 10), \qquad u_3 = (2, 1, 9) \]
Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an
echelon form. We have
\[ \begin{aligned} x + 3y + 2z &= 4 \\ -2x - 7y + z &= 9 \\ 3x + 10y + 9z &= 19 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y + 2z &= 4 \\ -y + 5z &= 17 \\ y + 3z &= 7 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 3y + 2z &= 4 \\ -y + 5z &= 17 \\ 8z &= 24 \end{aligned} \]
Back-substitution yields the solution x = 4, y = -2, z = 3. Thus v is a linear combination of $u_1, u_2, u_3$.
Specifically, $v = 4u_1 - 2u_2 + 3u_3$.
(b) Write the vector v = (2, 3, -5) as a linear combination of
\[ u_1 = (1, 2, -3), \qquad u_2 = (2, 3, -4), \qquad u_3 = (1, 3, -5) \]
Find the equivalent system of linear equations by writing $v = xu_1 + yu_2 + zu_3$, and reduce the system to an
echelon form. We have
\[ \begin{aligned} x + 2y + z &= 2 \\ 2x + 3y + 3z &= 3 \\ -3x - 4y - 5z &= -5 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ -y + z &= -1 \\ 2y - 2z &= 1 \end{aligned} \quad\text{or}\quad \begin{aligned} x + 2y + z &= 2 \\ -y + z &= -1 \\ 0 &= -1 \end{aligned} \]
The system has no solution. Thus it is impossible to write v as a linear combination of $u_1, u_2, u_3$.
Linear Combinations of Orthogonal Vectors, Fourier Coefficients
Recall first (Section 1.4) that the dot (inner) product $u \cdot v$ of vectors $u = (a_1, \ldots, a_n)$ and
$v = (b_1, \ldots, b_n)$ in $\mathbf{R}^n$ is defined by
\[ u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n \]
Furthermore, vectors u and v are said to be orthogonal if their dot product $u \cdot v = 0$.
Suppose that $u_1, u_2, \ldots, u_n$ in $\mathbf{R}^n$ are n nonzero pairwise orthogonal vectors. This means
\[ \text{(i)}\ u_i \cdot u_j = 0 \text{ for } i \neq j \qquad\text{and}\qquad \text{(ii)}\ u_i \cdot u_i \neq 0 \text{ for each } i \]
Then, for any vector v in $\mathbf{R}^n$, there is an easy way to write v as a linear combination of $u_1, u_2, \ldots, u_n$,
which is illustrated in the next example.
Example 3.16. Consider the following three vectors in $\mathbf{R}^3$:
\[ u_1 = (1, 1, 1), \qquad u_2 = (1, -3, 2), \qquad u_3 = (5, -1, -4) \]
These vectors are pairwise orthogonal; that is,
\[ u_1 \cdot u_2 = 1 - 3 + 2 = 0, \qquad u_1 \cdot u_3 = 5 - 1 - 4 = 0, \qquad u_2 \cdot u_3 = 5 + 3 - 8 = 0 \]
Suppose we want to write v = (4, 14, -9) as a linear combination of $u_1, u_2, u_3$.
Method 1. Find the equivalent system of linear equations as in Example 3.15 and then solve,
obtaining $v = 3u_1 - 4u_2 + u_3$.
Method 2. (This method uses the fact that the vectors $u_1, u_2, u_3$ are mutually orthogonal, and hence
the arithmetic is much simpler.) Set v as a linear combination of $u_1, u_2, u_3$ using unknown scalars x, y, z as
follows:
\[ (4, 14, -9) = x(1, 1, 1) + y(1, -3, 2) + z(5, -1, -4) \tag{*} \]
Take the dot product of (*) with respect to $u_1$ to get
\[ (4, 14, -9) \cdot (1, 1, 1) = x(1, 1, 1) \cdot (1, 1, 1) \quad\text{or}\quad 9 = 3x \quad\text{or}\quad x = 3 \]
(The last two terms drop out, since $u_1$ is orthogonal to $u_2$ and to $u_3$.) Next take the dot product of (*) with respect to $u_2$
to obtain
\[ (4, 14, -9) \cdot (1, -3, 2) = y(1, -3, 2) \cdot (1, -3, 2) \quad\text{or}\quad -56 = 14y \quad\text{or}\quad y = -4 \]
Finally, take the dot product of (*) with respect to $u_3$ to get
\[ (4, 14, -9) \cdot (5, -1, -4) = z(5, -1, -4) \cdot (5, -1, -4) \quad\text{or}\quad 42 = 42z \quad\text{or}\quad z = 1 \]
Thus $v = 3u_1 - 4u_2 + u_3$.
The procedure in Method 2 in Example 3.16 is valid in general. Namely:
Theorem 3.10: Suppose $u_1, u_2, \ldots, u_n$ are nonzero mutually orthogonal vectors in $\mathbf{R}^n$. Then, for any
vector v in $\mathbf{R}^n$,
\[ v = \frac{v \cdot u_1}{u_1 \cdot u_1}\,u_1 + \frac{v \cdot u_2}{u_2 \cdot u_2}\,u_2 + \cdots + \frac{v \cdot u_n}{u_n \cdot u_n}\,u_n \]
We emphasize that there must be n such orthogonal vectors $u_i$ in $\mathbf{R}^n$ for the formula to be used. Note
also that each $u_i \cdot u_i \neq 0$, since each $u_i$ is a nonzero vector.
Remark: The following scalar $k_i$ (appearing in Theorem 3.10) is called the Fourier coefficient of v
with respect to $u_i$:
\[ k_i = \frac{v \cdot u_i}{u_i \cdot u_i} = \frac{v \cdot u_i}{\|u_i\|^2} \]
It is analogous to a coefficient in the celebrated Fourier series of a function.
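The Fourier coefficient formula translates directly into code. In this sketch the function names `dot` and `fourier_coefficients` are chosen here for illustration, integer entries are assumed so that `Fraction` can form the quotients exactly, and the data is that of Example 3.16.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Coefficients k_i = (v . u_i)/(u_i . u_i) for nonzero, mutually orthogonal u_i."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

u1, u2, u3 = (1, 1, 1), (1, -3, 2), (5, -1, -4)
v = (4, 14, -9)

k = fourier_coefficients(v, [u1, u2, u3])
# Rebuild v from the coefficients as a check.
rebuilt = tuple(sum(ki * u[j] for ki, u in zip(k, (u1, u2, u3))) for j in range(3))
```

No system of equations is solved here: each coefficient costs two dot products, which is why Method 2 is so much lighter than Method 1.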
3.11 HOMOGENEOUS SYSTEMS OF LINEAR EQUATIONS
A system of linear equations is said to be homogeneous if all the constant terms are zero. Thus a
homogeneous system has the form AX = 0. Clearly, such a system always has the zero vector
0 = (0, 0, ..., 0) as a solution, called the zero or trivial solution. Accordingly, we are usually interested
in whether or not the system has a nonzero solution.
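Since a homogeneous system AX = 0 does have at least the zero solution, it can always be put in an
echelon form; say
\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + \cdots + a_{1n}x_n &= 0 \\ a_{2j_2}x_{j_2} + a_{2,j_2+1}x_{j_2+1} + \cdots + a_{2n}x_n &= 0 \\ \cdots\cdots\cdots\cdots & \\ a_{rj_r}x_{j_r} + \cdots + a_{rn}x_n &= 0 \end{aligned} \]
Here r denotes the number of equations in echelon form and n denotes the number of unknowns. Thus the
echelon system has n - r free variables.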
The question of nonzero solutions reduces to the following two cases:
(i) r = n. The system has only the zero solution,
(ii) r < n. The system has a nonzero solution.
Accordingly, if we begin with fewer equations than unknowns, then, in echelon form, r < n, and the
system has a nonzero solution. This proves the following important result.
Theorem 3.11: A homogeneous system AX = 0 with more unknowns than equations has a nonzero
solution.
Example 3.17. Determine whether or not each of the following homogeneous systems has a nonzero solution:
\[ \text{(a)}\quad \begin{aligned} x + y - z &= 0 \\ 2x - 3y + z &= 0 \\ x - 4y + 2z &= 0 \end{aligned} \qquad \text{(b)}\quad \begin{aligned} x + y - z &= 0 \\ 2x + 4y - z &= 0 \\ 3x + 2y + 2z &= 0 \end{aligned} \qquad \text{(c)}\quad \begin{aligned} x_1 + 2x_2 - 3x_3 + 4x_4 &= 0 \\ 2x_1 - 3x_2 + 5x_3 - 7x_4 &= 0 \\ 5x_1 + 6x_2 - 9x_3 + 8x_4 &= 0 \end{aligned} \]
(a) Reduce the system to echelon form as follows:
\[ \begin{aligned} x + y - z &= 0 \\ -5y + 3z &= 0 \\ -5y + 3z &= 0 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y - z &= 0 \\ -5y + 3z &= 0 \end{aligned} \]
The system has a nonzero solution, since there are only two equations in the three unknowns in echelon form.
Here z is a free variable. Let us, say, set z = 5. Then, by back-substitution, y = 3 and x = 2. Thus the vector
u = (2, 3, 5) is a particular nonzero solution.
(b) Reduce the system to echelon form as follows:
\[ \begin{aligned} x + y - z &= 0 \\ 2y + z &= 0 \\ -y + 5z &= 0 \end{aligned} \qquad\text{and then}\qquad \begin{aligned} x + y - z &= 0 \\ 2y + z &= 0 \\ 11z &= 0 \end{aligned} \]
In echelon form, there are three equations in three unknowns. Thus the system has only the zero solution.
(c) The system must have a nonzero solution (Theorem 3.11), since there are four unknowns but only three
equations. (Here we do not need to reduce the system to echelon form.)
Basis for the General Solution of a Homogeneous System
Let W denote the general solution of a homogeneous system AX = 0. A list of nonzero solution
vectors $u_1, u_2, \ldots, u_s$ of the system is said to be a basis for W if each solution vector $w \in W$ can be
expressed uniquely as a linear combination of the vectors $u_1, u_2, \ldots, u_s$, that is, there exist unique scalars
$a_1, a_2, \ldots, a_s$ such that
\[ w = a_1u_1 + a_2u_2 + \cdots + a_su_s \]
The number s of such basis vectors is equal to the number of free variables. This number s is called the
dimension of W, written dim W = s. In case W = {0}, that is, the system has only the zero solution, we
define dim W = 0.
The following theorem, proved in Chapter 5, tells us how to find such a basis.
Theorem 3.12: Let W be the general solution of a homogeneous system AX = 0, and suppose that the
echelon form of the homogeneous system has s free variables. Let $u_1, u_2, \ldots, u_s$ be the
solutions obtained by setting one of the free variables equal to 1 (or any nonzero
constant) and the remaining free variables equal to 0. Then dim W = s, and the vectors
$u_1, u_2, \ldots, u_s$ form a basis of W.
We emphasize that the general solution W may have many bases, and that Theorem 3.12 only gives us
one such basis.
Example 3.18. Find the dimension and a basis for the general solution W of the homogeneous system
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ 2x_1 + 4x_2 - 5x_3 + x_4 - 6x_5 &= 0 \\ 5x_1 + 10x_2 - 13x_3 + 4x_4 - 16x_5 &= 0 \end{aligned} \]
First reduce the system to echelon form. Apply the following operations:
"Replace $L_2$ by $-2L_1 + L_2$" and "Replace $L_3$ by $-5L_1 + L_3$", and then "Replace $L_3$ by $-2L_2 + L_3$"
These operations yield
\[ \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ x_3 - 3x_4 + 2x_5 &= 0 \\ 2x_3 - 6x_4 + 4x_5 &= 0 \end{aligned} \qquad\text{and}\qquad \begin{aligned} x_1 + 2x_2 - 3x_3 + 2x_4 - 4x_5 &= 0 \\ x_3 - 3x_4 + 2x_5 &= 0 \end{aligned} \]
The system in echelon form has three free variables, $x_2$, $x_4$, $x_5$; hence dim W = 3. Three solution vectors that form a
basis for W are obtained as follows:
(1) Set $x_2 = 1$, $x_4 = 0$, $x_5 = 0$. Back-substitution yields the solution $u_1 = (-2, 1, 0, 0, 0)$.
(2) Set $x_2 = 0$, $x_4 = 1$, $x_5 = 0$. Back-substitution yields the solution $u_2 = (7, 0, 3, 1, 0)$.
(3) Set $x_2 = 0$, $x_4 = 0$, $x_5 = 1$. Back-substitution yields the solution $u_3 = (-2, 0, -2, 0, 1)$.
The vectors $u_1 = (-2, 1, 0, 0, 0)$, $u_2 = (7, 0, 3, 1, 0)$, $u_3 = (-2, 0, -2, 0, 1)$ form a basis for W.
Remark: Any solution of the system in Example 3.18 can be written in the form
\[ au_1 + bu_2 + cu_3 = a(-2, 1, 0, 0, 0) + b(7, 0, 3, 1, 0) + c(-2, 0, -2, 0, 1) = (-2a + 7b - 2c,\ a,\ 3b - 2c,\ b,\ c) \]
or
\[ x_1 = -2a + 7b - 2c, \quad x_2 = a, \quad x_3 = 3b - 2c, \quad x_4 = b, \quad x_5 = c \]
where a, b, c are arbitrary constants. Observe that this representation is nothing more than the parametric
form of the general solution under the choice of parameters $x_2 = a$, $x_4 = b$, $x_5 = c$.
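The recipe of Theorem 3.12 (set one free variable to 1 and the others to 0) can be sketched as follows. The helper names `rref` and `solution_basis` are hypothetical, and for brevity `rref` clears entries above and below each pivot in a single pass rather than using the two-stage algorithm. The coefficient matrix is that of Example 3.18.

```python
from fractions import Fraction

def rref(rows):
    """Row canonical form of a matrix, together with the list of pivot columns."""
    A = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for j in range(len(A[0])):
        p = next((i for i in range(r, len(A)) if A[i][j] != 0), None)
        if p is None:
            continue
        A[r], A[p] = A[p], A[r]
        A[r] = [x / A[r][j] for x in A[r]]          # scale the pivot to 1
        for i in range(len(A)):
            if i != r:                              # clear the rest of column j
                mult = -A[i][j]
                A[i] = [mult * a + b for a, b in zip(A[r], A[i])]
        pivots.append(j)
        r += 1
        if r == len(A):
            break
    return A, pivots

def solution_basis(rows):
    """Basis of the general solution of AX = 0, one vector per free variable."""
    R, pivots = rref(rows)
    n = len(rows[0])
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)              # this free variable is 1, the others stay 0
        for i, pj in enumerate(pivots):
            u[pj] = -R[i][f]            # pivot variable read off from row i
        basis.append(u)
    return basis

A = [[1, 2, -3, 2, -4],
     [2, 4, -5, 1, -6],
     [5, 10, -13, 4, -16]]
basis = solution_basis(A)
```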
Nonhomogeneous and Associated Homogeneous Systems
Let AX = B be a nonhomogeneous system of linear equations. Then AX = 0 is called the associated
homogeneous system. For example,
\[ \begin{aligned} x + 2y - 4z &= 7 \\ 3x - 5y + 6z &= 8 \end{aligned} \qquad\text{and}\qquad \begin{aligned} x + 2y - 4z &= 0 \\ 3x - 5y + 6z &= 0 \end{aligned} \]
show a nonhomogeneous system and its associated homogeneous system.
The relationship between the solution U of a nonhomogeneous system AX = B and the solution W of
its associated homogeneous system AX = 0 is contained in the following theorem.
Theorem 3.13: Let $v_0$ be a particular solution of AX = B and let W be the general solution of AX = 0.
Then the following is the general solution of AX = B:
\[ U = v_0 + W = \{v_0 + w : w \in W\} \]
That is, $U = v_0 + W$ is obtained by adding $v_0$ to each element in W. We note that this theorem has a
geometrical interpretation in $\mathbf{R}^3$. Specifically, suppose W is a line through the origin O. Then, as pictured
in Fig. 3-4, $U = v_0 + W$ is the line parallel to W obtained by adding $v_0$ to each element of W. Similarly,
whenever W is a plane through the origin O, then $U = v_0 + W$ is a plane parallel to W.
Fig. 3-4
3.12 ELEMENTARY MATRICES
Let e denote an elementary row operation and let e(A) denote the result of applying the operation e to
a matrix A. Now let E be the matrix obtained by applying e to the identity matrix I, that is,
\[ E = e(I) \]
Then E is called the elementary matrix corresponding to the elementary row operation e. Note that E is
always a square matrix.
Example 3.19. Consider the following three elementary row operations:
(1) Interchange $R_2$ and $R_3$. (2) Replace $R_2$ by $-6R_2$. (3) Replace $R_3$ by $-4R_1 + R_3$.
The 3 x 3 elementary matrices corresponding to the above elementary row operations are as follows:
\[ E_1 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}, \qquad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -6 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \qquad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ -4 & 0 & 1 \end{bmatrix} \]
The following theorem, proved in Problem 3.34, holds.
Theorem 3.13: Let e be an elementary row operation and let E be the corresponding m x m elementary
matrix. Then
\[ e(A) = EA \]
where A is any m x n matrix.
In other words, the result of applying an elementary row operation e to a matrix A can be obtained by
premultiplying A by the corresponding elementary matrix E.
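The identity e(A) = EA can be checked on a small case. In this sketch the 3 x 4 matrix A is hypothetical data chosen for illustration, the helper names are introduced here, and rows are indexed from 0, so "Replace R3 by -4R1 + R3" becomes source row 0 and destination row 2.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add_multiple(A, k, src, dst):
    """Row operation e: replace row dst by k*(row src) + (row dst); rows are 0-indexed."""
    out = [row[:] for row in A]
    out[dst] = [k * a + b for a, b in zip(A[src], A[dst])]
    return out

# Hypothetical 3 x 4 matrix used only for illustration.
A = [[1, 2, 3, 4],
     [0, 1, 5, 6],
     [4, 8, 9, 7]]

E = add_multiple(identity(3), -4, 0, 2)    # E = e(I)
via_product = matmul(E, A)                 # EA
via_operation = add_multiple(A, -4, 0, 2)  # e(A)
```

The two results coincide, and E is the matrix obtained from the identity by the same row-addition operation.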
Now suppose e' is the inverse of an elementary row operation e, and let E' and E be the corresponding
matrices. We note (Problem 3.33) that E is invertible and E' is its inverse. This means, in particular, that
any product
\[ P = E_k \cdots E_2E_1 \]
of elementary matrices is invertible.
Applications of Elementary Matrices
Using Theorem 3.13, we are able to prove (Problem 3.35) the following important properties of
matrices.
Theorem 3.14: Let A be a square matrix. Then the following are equivalent:
(a) A is invertible (nonsingular).
(b) A is row equivalent to the identity matrix I.
(c) A is a product of elementary matrices.
Recall that square matrices A and B are inverses if AB = BA = I. The next theorem (proved in
Problem 3.36) shows that we need only show that one of the products is true, say AB = I, to prove that
matrices are inverses.
Theorem 3.15: Suppose AB = I. Then BA = I, and hence $B = A^{-1}$.
Row equivalence can also be defined in terms of matrix multiplication. Specifically, we will prove
(Problem 3.37) the following.
Theorem 3.16: B is row equivalent to A if and only if there exists a nonsingular matrix P such that
B = PA.
Application to Finding the Inverse of an n x n Matrix
The following algorithm finds the inverse of a matrix.
Algorithm 3.5: The input is a square matrix A. The output is the inverse of A or that the inverse does not
exist.
Step 1. Form the n x 2n (block) matrix M = [A, I], where A is the left half of M and the identity matrix I
is the right half of M.
Step 2. Row reduce M to echelon form. If the process generates a zero row in the A half of M, then
STOP: A has no inverse. (Otherwise A is in triangular form.)
Step 3. Further row reduce M to its row canonical form
\[ M \sim [I, B] \]
where the identity matrix I has replaced A in the left half of M.
Step 4. Set $A^{-1} = B$, the matrix that is now in the right half of M.
The justification for the above algorithm is as follows. Suppose A is invertible and, say, the sequence of
elementary row operations $e_1, e_2, \ldots, e_q$ applied to M = [A, I] reduces the left half of M, which is A, to the
identity matrix I. Let $E_i$ be the elementary matrix corresponding to the operation $e_i$. Then, by applying
Theorem 3.13, we get
\[ E_q \cdots E_2E_1A = I \quad\text{or}\quad (E_q \cdots E_2E_1I)A = I, \quad\text{so}\quad A^{-1} = E_q \cdots E_2E_1I \]
That is, $A^{-1}$ can be obtained by applying the elementary row operations $e_1, e_2, \ldots, e_q$ to the identity matrix
I, which appears in the right half of M. Thus $B = A^{-1}$, as claimed.
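Example 3.20. Find the inverse of the matrix
\[ A = \begin{bmatrix} 1 & 0 & 2 \\ 2 & -1 & 3 \\ 4 & 1 & 8 \end{bmatrix} \]
First form the (block) matrix M = [A, I] and row reduce M to an echelon form:
\[ M = \left[\begin{array}{rrr|rrr} 1 & 0 & 2 & 1 & 0 & 0 \\ 2 & -1 & 3 & 0 & 1 & 0 \\ 4 & 1 & 8 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 1 & 0 & -4 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 0 & 2 & 1 & 0 & 0 \\ 0 & -1 & -1 & -2 & 1 & 0 \\ 0 & 0 & -1 & -6 & 1 & 1 \end{array}\right] \]
In echelon form, the left half of M is in triangular form; hence A has an inverse. Next we further row reduce M to its
row canonical form:
\[ M \sim \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & -1 & 0 & 4 & 0 & -1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -11 & 2 & 2 \\ 0 & 1 & 0 & -4 & 0 & 1 \\ 0 & 0 & 1 & 6 & -1 & -1 \end{array}\right] \]
The identity matrix is now in the left half of the final matrix; hence the right half is $A^{-1}$. In other words,
\[ A^{-1} = \begin{bmatrix} -11 & 2 & 2 \\ -4 & 0 & 1 \\ 6 & -1 & -1 \end{bmatrix} \]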
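Algorithm 3.5 can be sketched in Python as below. This is an illustrative implementation under stated assumptions: the name `inverse` and the exact `Fraction` arithmetic are choices made here, and a singular matrix is reported by returning `None`. The matrix is the one from Example 3.20.

```python
from fractions import Fraction

def inverse(rows):
    """Inverse of a square matrix by row reducing [A, I]; returns None if A has no inverse."""
    n = len(rows)
    # Step 1: form the block matrix M = [A, I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(rows)]
    # Step 2: echelon form (0's below each pivot).
    for i in range(n):
        p = next((k for k in range(i, n) if M[k][i] != 0), None)
        if p is None:
            return None                    # no pivot in column i, so A has no inverse
        M[i], M[p] = M[p], M[i]
        for k in range(i + 1, n):
            mult = -M[k][i] / M[i][i]
            M[k] = [mult * a + b for a, b in zip(M[i], M[k])]
    # Step 3: row canonical form (scale each pivot to 1, then 0's above it).
    for i in reversed(range(n)):
        M[i] = [x / M[i][i] for x in M[i]]
        for k in range(i):
            mult = -M[k][i]
            M[k] = [mult * a + b for a, b in zip(M[i], M[k])]
    # Step 4: the right half of M is now the inverse.
    return [row[n:] for row in M]

A = [[1, 0, 2],
     [2, -1, 3],
     [4, 1, 8]]
A_inv = inverse(A)
```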
Elementary Column Operations
Now let A be a matrix with columns $C_1, C_2, \ldots, C_n$. The following operations on A, analogous to the
elementary row operations, are called elementary column operations:
[$F_1$] (Column Interchange): Interchange columns $C_i$ and $C_j$.
[$F_2$] (Column Scaling): Replace $C_i$ by $kC_i$ (where $k \neq 0$).
[$F_3$] (Column Addition): Replace $C_j$ by $kC_i + C_j$.
We may indicate each of the column operations by writing, respectively,
\[ \text{(1)}\ C_i \leftrightarrow C_j, \qquad \text{(2)}\ kC_i \to C_i, \qquad \text{(3)}\ (kC_i + C_j) \to C_j \]
Moreover, each column operation has an inverse operation of the same type, just like the corresponding
row operation.
Now let f denote an elementary column operation, and let F be the matrix obtained by applying f to
the identity matrix I, that is,
\[ F = f(I) \]
Then F is called the elementary matrix corresponding to the elementary column operation f. Note that F is
always a square matrix.
Example 3.21. Consider the following elementary column operations:
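(1) Interchange $C_1$ and $C_3$; (2) Replace $C_3$ by $-2C_3$; (3) Replace $C_3$ by $-3C_2 + C_3$
The corresponding three 3 x 3 elementary matrices are as follows:
\[ F_1 = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 1 & 0 \\ 1 & 0 & 0 \end{bmatrix}, \qquad F_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}, \qquad F_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & -3 \\ 0 & 0 & 1 \end{bmatrix} \]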
The following theorem is analogous to Theorem 3.13 for the elementary row operations.
Theorem 3.17: For any matrix A, f(A) = AF.
That is, the result of applying an elementary column operation f on a matrix A can be obtained by
postmultiplying A by the corresponding elementary matrix F.
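A quick check of f(A) = AF for a column-addition operation is sketched below. The 2 x 3 matrix A is hypothetical data, the helper names are introduced here, and columns are indexed from 0, so "Replace C3 by -3C2 + C3" uses source column 1 and destination column 2.

```python
def matmul(P, Q):
    """Product of two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q))) for j in range(len(Q[0]))]
            for i in range(len(P))]

def identity(n):
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def add_column_multiple(A, k, src, dst):
    """Column operation f: replace column dst by k*(column src) + (column dst); 0-indexed."""
    return [[row[c] + (k * row[src] if c == dst else 0) for c in range(len(row))]
            for row in A]

# Hypothetical 2 x 3 matrix used only for illustration.
A = [[1, 2, 3],
     [4, 5, 6]]

F = add_column_multiple(identity(3), -3, 1, 2)     # F = f(I)
via_product = matmul(A, F)                         # AF
via_operation = add_column_multiple(A, -3, 1, 2)   # f(A)
```

Note that F multiplies A on the right, in contrast to the row case, where the elementary matrix multiplies on the left.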
Matrix Equivalence
A matrix B is equivalent to a matrix A if B can be obtained from A by a sequence of row and column
operations. Alternatively, B is equivalent to A if there exist nonsingular matrices P and Q such that
B = PAQ. Just like row equivalence, equivalence of matrices is an equivalence relation.
The main result of this subsection (proved in Problem 3.38) is as follows.
Theorem 3.18: Every m x n matrix A is equivalent to a unique block matrix of the form
\[ \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix} \]
where $I_r$ is the r-square identity matrix.
The following definition applies.
Definition: The nonnegative integer r in Theorem 3.18 is called the rank of A, written rank(A).
Note that this definition agrees with the previous definition of the rank of a matrix.
3.13 LU DECOMPOSITION
Suppose A is a nonsingular matrix that can be brought into (upper) triangular form U using only row-addition
operations, that is, suppose A can be triangularized by the following algorithm, which we write
using computer notation.
Algorithm 3.6: The input is a matrix A and the output is a triangular matrix U.
Step 1. Repeat for i = 1, 2, ..., n - 1:
Step 2. Repeat for j = i + 1, i + 2, ..., n
(a) Set $m_{ji} := -a_{ji}/a_{ii}$.
(b) Set $R_j := m_{ji}R_i + R_j$.
[End of Step 2 inner loop.]
[End of Step 1 outer loop.]
The numbers $m_{ij}$ are called multipliers. Sometimes we keep track of these multipliers by means of the
following lower triangular matrix L:
\[ L = \begin{bmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ -m_{21} & 1 & 0 & \cdots & 0 & 0 \\ -m_{31} & -m_{32} & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ -m_{n1} & -m_{n2} & -m_{n3} & \cdots & -m_{n,n-1} & 1 \end{bmatrix} \]
That is, L has 1's on the diagonal, 0's above the diagonal, and the negative of the multiplier $m_{ij}$ as its
ij-entry below the diagonal.
The above matrix L and the triangular matrix U obtained in Algorithm 3.6 give us the classical LU
factorization of such a matrix A. Namely:
Theorem 3.19: Let A be a nonsingular matrix that can be brought into triangular form U using only row-addition
operations. Then A = LU, where L is the above lower triangular matrix with 1's
on the diagonal, and U is an upper triangular matrix with no 0's on the diagonal.
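Example 3.22. Suppose
\[ A = \begin{bmatrix} 1 & 2 & -3 \\ -3 & -4 & 13 \\ 2 & 1 & -5 \end{bmatrix} \]
We note that A may be reduced to triangular form by the operations
"Replace R2 by 3R1 + R2"; "Replace R3 by -2R1 + R3"; and then "Replace R3 by (3/2)R2 + R3"
That is,
\[ A \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & -3 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix} \]
This gives us the classical factorization A = LU, where
\[ L = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 2 & -\tfrac{3}{2} & 1 \end{bmatrix} \quad \text{and} \quad U = \begin{bmatrix} 1 & 2 & -3 \\ 0 & 2 & 4 \\ 0 & 0 & 7 \end{bmatrix} \]
We emphasize:
(1) The entries -3, 2, -3/2 in L are the negatives of the multipliers in the above elementary row operations.
(2) U is the triangular form of A.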
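The factorization in Example 3.22 can be checked mechanically. The following is a minimal sketch, not the book's own code: the function name `lu_row_addition` is hypothetical, and it assumes the multiplier for "Replace R_j by m R_i + R_j" is the negative of the entry below the pivot divided by the pivot, which is the sign convention that reproduces the operations of Example 3.22. Exact rational arithmetic is used so the entries match the hand computation.

```python
from fractions import Fraction

def lu_row_addition(A):
    """Factor A = LU using only row-addition operations (no row interchanges).

    Hypothetical helper: raises ValueError if a zero pivot is met, since then
    row-addition operations alone cannot triangularize A.
    """
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        if U[i][i] == 0:
            raise ValueError("zero pivot: row-addition operations do not suffice")
        for j in range(i + 1, n):
            m = -U[j][i] / U[i][i]      # multiplier in "Replace R_j by m R_i + R_j"
            U[j] = [m * a + b for a, b in zip(U[i], U[j])]
            L[j][i] = -m                # L stores the negative of the multiplier
    return L, U

A = [[1, 2, -3], [-3, -4, 13], [2, 1, -5]]
L, U = lu_row_addition(A)
```

Multiplying the returned L and U entry by entry recovers A, which is the content of Theorem 3.19 for this matrix.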
Application to Systems of Linear Equations
Consider a computer algorithm M. Let C(n) denote the running time of the algorithm as a function of
the size n of the input data. [The function C(n) is sometimes called the time complexity or simply the
complexity of the algorithm M.] Frequently, C(n) simply counts the number of multiplications and
divisions executed by M, but does not count the number of additions and subtractions since they take much
less time to execute.
Now consider a square system of linear equations AX = B, where
\[ A = [a_{ij}], \quad X = [x_1, \ldots, x_n]^T, \quad B = [b_1, \ldots, b_n]^T \]
and suppose A has an LU factorization. Then the system can be brought into triangular form (in order to apply back-substitution) by applying Algorithm 3.6 to the augmented matrix M = [A, B] of the system.
The time complexities of Algorithm 3.6 and back-substitution are, respectively,
\[ C(n) \approx \tfrac{1}{2}n^3 \quad \text{and} \quad C(n) \approx \tfrac{1}{2}n^2 \]
where n is the number of equations.
On the other hand, suppose we already have the factorization A = LU. Then, to triangularize the
system, we need only apply the row operations in the algorithm (retained by the matrix L) to the column
vector B. In this case, the time complexity is
\[ C(n) \approx \tfrac{1}{2}n^2 \]
Of course, obtaining the factorization A = LU requires the original algorithm, where C(n) is approximately (1/2)n^3. Thus
nothing may be gained by first finding the LU factorization when a single system is involved. However,
there are situations, illustrated below, where the LU factorization is useful.
Suppose, for a given matrix A, we need to solve the system
AX = B
repeatedly for a sequence of different constant vectors, say B_1, B_2, ..., B_k. Also, suppose some of the B_i depend upon the solution of the system obtained while using preceding vectors B_j. In such a case, it is more efficient to first find the LU factorization of A, and then to use this factorization to solve the system for each new B.
Example 3.23. Consider the following system of linear equations:
  x + 2y + z = k1
  2x + 3y + 3z = k2
  -3x + 10y + 2z = k3
or AX = B, where
\[ A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & 10 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix} \]
Suppose we want to solve the system three times where B is equal, say, to B_1, B_2, B_3. Furthermore, suppose B_1 = [1, 1, 1]^T, and suppose
\[ B_{j+1} = B_j + X_j \quad (\text{for } j = 1, 2) \]
where X_j is the solution of AX = B_j. Here it is more efficient to first obtain the LU factorization of A and then use the LU factorization to solve the system for each of the B's. (This is done in Problem 3.42.)
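The reuse described in Example 3.23 can be sketched as follows. This is an illustrative sketch only: the helper names `lu` and `solve_with_lu` are hypothetical, nonzero pivots are assumed (no row interchanges), and exact fractions are used in place of floating point. The matrix is factored once; each new right-hand side then costs only a forward substitution (LY = B) and a back-substitution (UX = Y).

```python
from fractions import Fraction

def lu(A):
    """Return (L, U) with A = LU, assuming every pivot is nonzero."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for i in range(n - 1):
        for j in range(i + 1, n):
            L[j][i] = U[j][i] / U[i][i]      # negative of the row-addition multiplier
            U[j] = [b - L[j][i] * a for a, b in zip(U[i], U[j])]
    return L, U

def solve_with_lu(L, U, B):
    """Solve AX = B given A = LU."""
    n = len(B)
    Y = []
    for i in range(n):                        # forward substitution: L Y = B
        Y.append(Fraction(B[i]) - sum(L[i][k] * Y[k] for k in range(i)))
    X = [Fraction(0)] * n
    for i in reversed(range(n)):              # back-substitution: U X = Y
        X[i] = (Y[i] - sum(U[i][k] * X[k] for k in range(i + 1, n))) / U[i][i]
    return X

A = [[1, 2, 1], [2, 3, 3], [-3, 10, 2]]
L, U = lu(A)                                  # factor once
B = [Fraction(1)] * 3                         # B_1 = [1, 1, 1]^T
solutions = []
for _ in range(3):
    X = solve_with_lu(L, U, B)
    solutions.append((B, X))
    B = [b + x for b, x in zip(B, X)]         # B_{j+1} = B_j + X_j
```

Each pair in `solutions` holds a right-hand side B_j and the corresponding solution X_j, so the dependence of B_{j+1} on X_j is handled without refactoring A.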
Solved Problems
LINEAR EQUATIONS, SOLUTIONS, 2x2 SYSTEMS
3.1. Determine whether each of the following equations is linear:
(a) 5x + 7y - 8yz = 16, (b) x + πy + ez = log 5, (c) 3x + ky - 8z = 16
(a) No, since the product yz of two unknowns is of second degree.
(b) Yes, since π, e, and log 5 are constants.
(c) As it stands, there are four unknowns: x, y, z, k. Because of the term ky it is not a linear equation. However, assuming k is a constant, the equation is linear in the unknowns x, y, z.
3.2. Determine whether the following vectors are solutions of x1 + 2x2 - 4x3 + 3x4 = 15:
(a) u = (3, 2, 1, 4) and (b) v = (1, 2, 4, 5).
(a) Substitute to obtain 3 + 2(2) - 4(1) + 3(4) = 15, or 15 = 15; yes, it is a solution.
(b) Substitute to obtain 1 + 2(2) - 4(4) + 3(5) = 15, or 4 = 15; no, it is not a solution.
3.3. Solve: (a) ex = π, (b) 3x - 4 - x = 2x + 3, (c) 7 + 2x - 4 = 3x + 3 - x
(a) Since e ≠ 0, multiply by 1/e to obtain x = π/e.
(b) Rewrite in standard form, obtaining 0x = 7. The equation has no solution.
(c) Rewrite in standard form, obtaining 0x = 0. Every scalar k is a solution.
3.4. Prove Theorem 3.4: Consider the equation ax = b.
(i) If a ≠ 0, then x = b/a is a unique solution of ax = b.
(ii) If a = 0 but b ≠ 0, then ax = b has no solution.
(iii) If a = 0 and b = 0, then every scalar k is a solution of ax = b.
Suppose a ≠ 0. Then the scalar b/a exists. Substituting b/a in ax = b yields a(b/a) = b, or b = b; hence b/a is a solution. On the other hand, suppose x0 is a solution to ax = b, so that ax0 = b. Multiplying both sides by 1/a yields x0 = b/a. Hence b/a is the unique solution of ax = b. Thus (i) is proved.
On the other hand, suppose a = 0. Then, for any scalar k, we have ak = 0k = 0. If b ≠ 0, then ak ≠ b. Accordingly, k is not a solution of ax = b, and so (ii) is proved. If b = 0, then ak = b. That is, any scalar k is a solution of ax = b, and so (iii) is proved.
3.5. Solve each of the following systems:
(a)
  2x - 5y = 11
  3x + 4y = 5
(b)
  2x - 3y = 8
  -6x + 9y = 6
(c)
  2x - 3y = 8
  -4x + 6y = -16
(a) Eliminate x from the equations by forming the new equation L = -3L1 + 2L2. This yields the equation
  23y = -23, and so y = -1
Substitute y = -1 in one of the original equations, say L1, to get
  2x - 5(-1) = 11 or 2x + 5 = 11 or 2x = 6 or x = 3
Thus x = 3, y = -1 or the pair u = (3, -1) is the unique solution of the system.
(b) Eliminate x from the equations by forming the new equation L = 3L1 + L2. This yields the equation
  0x + 0y = 30
This is a degenerate equation with a nonzero constant; hence this equation and the system have no solution. (Geometrically, the lines corresponding to the equations are parallel.)
(c) Eliminate x from the equations by forming the new equation L = 2L1 + L2. This yields the equation
  0x + 0y = 0
This is a degenerate equation where the constant term is also zero. Thus the system has an infinite number of solutions, which correspond to the solutions of either equation. (Geometrically, the lines corresponding to the equations coincide.)
To find the general solution, set y = a and substitute in L1 to obtain
  2x - 3a = 8 or 2x = 3a + 8 or x = (3/2)a + 4
Thus the general solution is
  x = (3/2)a + 4, y = a or u = ((3/2)a + 4, a)
where a is any scalar.
3.6. Consider the system
  x + ay = 4
  ax + 9y = b
(a) For which values of a does the system have a unique solution?
(b) Find those pairs of values (a, b) for which the system has more than one solution.
(a) Eliminate x from the equations by forming the new equation L = -aL1 + L2. This yields the equation
  (9 - a^2)y = b - 4a     (1)
The system has a unique solution if and only if the coefficient of y in (1) is not zero, that is, if 9 - a^2 ≠ 0 or if a ≠ ±3.
(b) The system has more than one solution if both sides of (1) are zero. The left-hand side is zero when a = ±3. When a = 3, the right-hand side is zero when b - 12 = 0 or b = 12. When a = -3, the right-hand side is zero when b + 12 = 0 or b = -12. Thus (3, 12) and (-3, -12) are the pairs for which the system has more than one solution.
SYSTEMS IN TRIANGULAR AND ECHELON FORM
3.7. Determine the pivot and free variables in each of the following systems:
(a)
  2x1 + 3x2 - 6x3 - 5x4 + 2x5 = 7
              x3 + 3x4 - 7x5 = 6
                    x4 - 2x5 = 1
(b)
  2x - 6y + 7z = 1
       4y + 3z = 8
            2z = 4
(c)
  x + 2y - 3z = 2
  2x + 3y + z = 4
  3x + 4y + 5z = 8
(a) In echelon form, the leading unknowns are the pivot variables, and the others are the free variables. Here x1, x3, x4 are the pivot variables, and x2 and x5 are the free variables.
(b) The leading unknowns are x, y, z, so they are the pivot variables. There are no free variables (as in any triangular system).
(c) The notion of pivot and free variables applies only to a system in echelon form.
3.8. Solve the triangular system in Problem 3.7(b).
Since it is a triangular system, solve by back-substitution.
(i) The last equation gives z = 2.
(ii) Substitute z = 2 in the second equation to get 4y + 6 = 8 or y = 1/2.
(iii) Substitute z = 2 and y = 1/2 in the first equation to get
  2x - 6(1/2) + 7(2) = 1 or 2x + 11 = 1 or x = -5
Thus x = -5, y = 1/2, z = 2 or u = (-5, 1/2, 2) is the unique solution to the system.
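The back-substitution steps (i) to (iii) follow a fixed pattern that can be sketched in a few lines. The function name `back_substitute` is hypothetical, and the sketch assumes an upper triangular coefficient matrix with nonzero diagonal entries; it is applied here to the triangular system of Problem 3.7(b).

```python
from fractions import Fraction

def back_substitute(U, b):
    """Solve Ux = b for upper triangular U with nonzero diagonal entries."""
    n = len(b)
    x = [Fraction(0)] * n
    for i in reversed(range(n)):              # last equation first
        known = sum(Fraction(U[i][k]) * x[k] for k in range(i + 1, n))
        x[i] = (Fraction(b[i]) - known) / Fraction(U[i][i])
    return x

# 2x - 6y + 7z = 1,  4y + 3z = 8,  2z = 4
U = [[2, -6, 7], [0, 4, 3], [0, 0, 2]]
b = [1, 8, 4]
x = back_substitute(U, b)
```

Working from the last equation upward means each unknown is computed only from values already found, exactly as in the hand calculation.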
3.9. Solve the echelon system in Problem 3.7(a).
Assign parameters to the free variables, say x2 = a and x5 = b, and solve for the pivot variables by back-substitution.
(i) Substitute x5 = b in the last equation to get x4 - 2b = 1 or x4 = 2b + 1.
(ii) Substitute x5 = b and x4 = 2b + 1 in the second equation to get
  x3 + 3(2b + 1) - 7b = 6 or x3 - b + 3 = 6 or x3 = b + 3
(iii) Substitute x5 = b, x4 = 2b + 1, x3 = b + 3, x2 = a in the first equation to get
  2x1 + 3a - 6(b + 3) - 5(2b + 1) + 2b = 7 or 2x1 + 3a - 14b - 23 = 7
  or x1 = -(3/2)a + 7b + 15
Thus
  x1 = -(3/2)a + 7b + 15, x2 = a, x3 = b + 3, x4 = 2b + 1, x5 = b
  or u = (-(3/2)a + 7b + 15, a, b + 3, 2b + 1, b)
is the parametric form of the general solution.
Alternatively, solving for the pivot variables x1, x3, x4 in terms of the free variables x2 and x5 yields the following free-variable form of the general solution:
  x1 = -(3/2)x2 + 7x5 + 15, x3 = x5 + 3, x4 = 2x5 + 1
3.10. Prove Theorem 3.5. Consider the system (3.4) of linear equations in echelon form with r equations and n unknowns.
(i) If r = n, then the system has a unique solution.
(ii) If r < n, then we can arbitrarily assign values to the n - r free variables and solve uniquely for the r pivot variables, obtaining a solution of the system.
(i) Suppose r = n. Then we have a square system AX = B where the matrix A of coefficients is (upper) triangular with nonzero diagonal elements. Thus A is invertible. By Theorem 3.9, the system has a unique solution.
(ii) Assigning values to the n - r free variables yields a triangular system in the pivot variables, which, by (i), has a unique solution.
GAUSSIAN ELIMINATION
3.11. Solve each of the following systems:
(a)
  x + 2y - 4z = -4
  2x + 5y - 9z = -10
  3x - 2y + 3z = 11
(b)
  x + 2y - 3z = -1
  -3x + y - 2z = -7
  5x + 3y - 4z = 2
(c)
  x + 2y - 3z = 1
  2x + 5y - 8z = 4
  3x + 8y - 13z = 7
Reduce each system to triangular or echelon form using Gaussian elimination:
(a) Apply "Replace L2 by -2L1 + L2" and "Replace L3 by -3L1 + L3" to eliminate x from the second and third equations, and then apply "Replace L3 by 8L2 + L3" to eliminate y from the third equation. These operations yield
  x + 2y - 4z = -4
       y - z = -2
  -8y + 15z = 23
and then
  x + 2y - 4z = -4
       y - z = -2
          7z = 7
The system is in triangular form. Solve by back-substitution to obtain the unique solution u = (2, -1, 1).
(b) Eliminate x from the second and third equations by the operations "Replace L2 by 3L1 + L2" and "Replace L3 by -5L1 + L3". This gives the equivalent system
  x + 2y - 3z = -1
     7y - 11z = -10
    -7y + 11z = 7
The operation "Replace L3 by L2 + L3" yields the following degenerate equation with a nonzero constant:
  0x + 0y + 0z = -3
This equation and hence the system have no solution.
(c) Eliminate x from the second and third equations by the operations "Replace L2 by -2L1 + L2" and "Replace L3 by -3L1 + L3". This yields the new system
  x + 2y - 3z = 1
       y - 2z = 2
      2y - 4z = 4
or
  x + 2y - 3z = 1
       y - 2z = 2
(The third equation is deleted, since it is a multiple of the second equation.) The system is in echelon form with pivot variables x and y and free variable z.
To find the parametric form of the general solution, set z = a and solve for x and y by back-substitution. Substitute z = a in the second equation to get y = 2 + 2a. Then substitute z = a and y = 2 + 2a in the first equation to get
  x + 2(2 + 2a) - 3a = 1 or x + 4 + a = 1 or x = -3 - a
Thus the general solution is
  x = -3 - a, y = 2 + 2a, z = a or u = (-3 - a, 2 + 2a, a)
where a is a parameter.
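The three outcomes in Problem 3.11 (unique solution, no solution, infinitely many solutions) can be read off the echelon form of the augmented matrix. The sketch below is illustrative, with a hypothetical function name `classify`; it interchanges rows when a pivot candidate is zero, which the hand computations above did not need, and it uses exact fractions.

```python
from fractions import Fraction

def classify(M):
    """Return 'unique', 'none', or 'infinite' for the augmented matrix M = [A, B]."""
    rows = [[Fraction(x) for x in r] for r in M]
    n_unknowns = len(rows[0]) - 1
    pivots = 0                                # number of pivot rows found so far
    for col in range(n_unknowns):
        if pivots == len(rows):
            break
        src = next((r for r in range(pivots, len(rows)) if rows[r][col] != 0), None)
        if src is None:
            continue                          # no pivot in this column: free variable
        rows[pivots], rows[src] = rows[src], rows[pivots]
        for r in range(pivots + 1, len(rows)):
            m = -rows[r][col] / rows[pivots][col]
            rows[r] = [m * a + b for a, b in zip(rows[pivots], rows[r])]
        pivots += 1
    # A degenerate row 0x + 0y + ... = c with c != 0 means no solution.
    if any(all(v == 0 for v in r[:-1]) and r[-1] != 0 for r in rows):
        return "none"
    return "unique" if pivots == n_unknowns else "infinite"

system_a = [[1, 2, -4, -4], [2, 5, -9, -10], [3, -2, 3, 11]]
system_b = [[1, 2, -3, -1], [-3, 1, -2, -7], [5, 3, -4, 2]]
system_c = [[1, 2, -3, 1], [2, 5, -8, 4], [3, 8, -13, 7]]
results = [classify(s) for s in (system_a, system_b, system_c)]
```

The test for a unique solution is that every unknown is a pivot variable, which is the condition r = n of Theorem 3.5.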
3.12. Solve each of the following systems:
(a)
  x1 - 3x2 + 2x3 - x4 + 2x5 = 2
  3x1 - 9x2 + 7x3 - x4 + 3x5 = 7
  2x1 - 6x2 + 7x3 + 4x4 - 5x5 = 7
(b)
  x1 + 2x2 - 3x3 + 4x4 = 2
  2x1 + 5x2 - 2x3 + x4 = 1
  5x1 + 12x2 - 7x3 + 6x4 = 3
Reduce each system to echelon form using Gaussian elimination:
(a) Apply "Replace L2 by -3L1 + L2" and "Replace L3 by -2L1 + L3" to eliminate x1 from the second and third equations. This yields
  x1 - 3x2 + 2x3 - x4 + 2x5 = 2
             x3 + 2x4 - 3x5 = 1
            3x3 + 6x4 - 9x5 = 3
or
  x1 - 3x2 + 2x3 - x4 + 2x5 = 2
             x3 + 2x4 - 3x5 = 1
(We delete L3, since it is a multiple of L2.) The system is in echelon form with pivot variables x1 and x3 and free variables x2, x4, x5.
To find the parametric form of the general solution, set x2 = a, x4 = b, x5 = c, where a, b, c are parameters. Back-substitution yields x3 = 1 - 2b + 3c and x1 = 3a + 5b - 8c. The general solution is
  x1 = 3a + 5b - 8c, x2 = a, x3 = 1 - 2b + 3c, x4 = b, x5 = c
or, equivalently, u = (3a + 5b - 8c, a, 1 - 2b + 3c, b, c).
(b) Eliminate x1 from the second and third equations by the operations "Replace L2 by -2L1 + L2" and "Replace L3 by -5L1 + L3". This yields the system
  x1 + 2x2 - 3x3 + 4x4 = 2
        x2 + 4x3 - 7x4 = -3
      2x2 + 8x3 - 14x4 = -7
The operation "Replace L3 by -2L2 + L3" yields the degenerate equation 0 = -1. Thus the system has no solution (even though the system has more unknowns than equations).
3.13. Solve using the condensed format:
  2y + 3z = 3
  x + y + z = 4
  4x + 8y - 3z = 35
The condensed format follows:
  Number   Equation              Operation
  (2)      2y + 3z = 3           L1 <-> L2
  (1)      x + y + z = 4         L1 <-> L2
  (3)      4x + 8y - 3z = 35
  (3')     4y - 7z = 19          Replace L3 by -4L1 + L3
  (3'')    -13z = 13             Replace L3 by -2L2 + L3
Here (1), (2), and (3'') form a triangular system. (We emphasize that the interchange of L1 and L2 is accomplished by simply renumbering L1 and L2 as above.)
Using back-substitution with the triangular system yields z = -1 from L3, y = 3 from L2, and x = 2 from L1. Thus the unique solution of the system is x = 2, y = 3, z = -1 or the triple u = (2, 3, -1).
3.14. Consider the system
  x + 2y + z = 3
  ay + 5z = 10
  2x + 7y + az = b
(a) Find those values of a for which the system has a unique solution.
(b) Find those pairs of values (a, b) for which the system has more than one solution.
Reduce the system to echelon form. That is, eliminate x from the third equation by the operation "Replace L3 by -2L1 + L3" and then eliminate y from the third equation by the operation "Replace L3 by -3L2 + aL3". This yields
  x + 2y + z = 3
  ay + 5z = 10
  3y + (a - 2)z = b - 6
and then
  x + 2y + z = 3
  ay + 5z = 10
  (a^2 - 2a - 15)z = ab - 6a - 30
Examine the last equation (a^2 - 2a - 15)z = ab - 6a - 30.
(a) The system has a unique solution if and only if the coefficient of z is not zero, that is, if
  a^2 - 2a - 15 = (a - 5)(a + 3) ≠ 0 or a ≠ 5 and a ≠ -3.
(b) The system has more than one solution if both sides are zero. The left-hand side is zero when a = 5 or a = -3. When a = 5, the right-hand side is zero when 5b - 60 = 0, or b = 12. When a = -3, the right-hand side is zero when -3b - 12 = 0, or b = -4. Thus (5, 12) and (-3, -4) are the pairs for which the system has more than one solution.
ECHELON MATRICES, ROW EQUIVALENCE, ROW CANONICAL FORM
3.15. Row reduce each of the following matrices to echelon form:
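\[ \text{(a)} \quad A = \begin{bmatrix} 1 & 2 & -3 & 0 \\ 2 & 4 & -2 & 2 \\ 3 & 6 & -4 & 3 \end{bmatrix}, \qquad \text{(b)} \quad B = \begin{bmatrix} -4 & 1 & -6 \\ 1 & 2 & -5 \\ 6 & 3 & -4 \end{bmatrix} \]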
(a) Use a_11 = 1 as a pivot to obtain 0's below a_11, that is, apply the row operations "Replace R2 by -2R1 + R2" and "Replace R3 by -3R1 + R3"; and then use a_23 = 4 as a pivot to obtain a 0 below a_23, that is, apply the row operation "Replace R3 by -5R2 + 4R3". These operations yield
\[ A \sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 5 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & 0 \\ 0 & 0 & 4 & 2 \\ 0 & 0 & 0 & 2 \end{bmatrix} \]
The matrix is now in echelon form.
(b) Hand calculations are usually simpler if the pivot element equals 1. Therefore, first interchange R1 and R2. Next apply the operations "Replace R2 by 4R1 + R2" and "Replace R3 by -6R1 + R3"; and then apply the operation "Replace R3 by R2 + R3". These operations yield
\[ B \sim \begin{bmatrix} 1 & 2 & -5 \\ -4 & 1 & -6 \\ 6 & 3 & -4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & -9 & 26 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -5 \\ 0 & 9 & -26 \\ 0 & 0 & 0 \end{bmatrix} \]
The matrix is now in echelon form.
3.16. Describe the pivoting row-reduction algorithm. Also describe the advantages, if any, of using this
pivoting algorithm.
The row-reduction algorithm becomes a pivoting algorithm if the entry in column j of greatest absolute value is chosen as the pivot a_{1 j_1} and if one uses the row operation
\[ (-a_{i j_1}/a_{1 j_1})R_1 + R_i \to R_i \]
The main advantage of the pivoting algorithm is that the above row operation involves division by the (current) pivot a_{1 j_1}, and, on the computer, roundoff errors may be substantially reduced when one divides by a number as large in absolute value as possible.
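3.17. Let
\[ A = \begin{bmatrix} 2 & -2 & 2 & 1 \\ -3 & 6 & 0 & -1 \\ 1 & -7 & 10 & 2 \end{bmatrix} \]
Reduce A to echelon form using the pivoting algorithm.
First interchange R1 and R2 so that -3 can be used as the pivot, and then apply the operations "Replace R2 by (2/3)R1 + R2" and "Replace R3 by (1/3)R1 + R3". These operations yield
\[ A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 2 & -2 & 2 & 1 \\ 1 & -7 & 10 & 2 \end{bmatrix} \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & 2 & 2 & \tfrac{1}{3} \\ 0 & -5 & 10 & \tfrac{5}{3} \end{bmatrix} \]
Now interchange R2 and R3 so that -5 can be used as the pivot, and then apply the operation "Replace R3 by (2/5)R2 + R3". We obtain
\[ A \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 2 & 2 & \tfrac{1}{3} \end{bmatrix} \sim \begin{bmatrix} -3 & 6 & 0 & -1 \\ 0 & -5 & 10 & \tfrac{5}{3} \\ 0 & 0 & 6 & 1 \end{bmatrix} \]
The matrix has been brought to echelon form using partial pivoting.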
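The pivoting algorithm of Problems 3.16 and 3.17 can be sketched as below. The function name `echelon_partial_pivoting` is hypothetical. Exact fractions are used so that the result can be compared with the hand computation; in that setting pivoting has no numerical benefit, and the roundoff advantage described in Problem 3.16 only appears with floating-point arithmetic.

```python
from fractions import Fraction

def echelon_partial_pivoting(A):
    """Row reduce A to echelon form, choosing in each column the entry of
    greatest absolute value (at or below the current row) as the pivot."""
    rows = [[Fraction(x) for x in r] for r in A]
    m, n = len(rows), len(rows[0])
    top = 0                                    # row that will receive the next pivot
    for col in range(n):
        if top == m:
            break
        best = max(range(top, m), key=lambda r: abs(rows[r][col]))
        if rows[best][col] == 0:
            continue                           # column has no pivot
        rows[top], rows[best] = rows[best], rows[top]
        for r in range(top + 1, m):
            factor = -rows[r][col] / rows[top][col]
            rows[r] = [factor * a + b for a, b in zip(rows[top], rows[r])]
        top += 1
    return rows

A = [[2, -2, 2, 1], [-3, 6, 0, -1], [1, -7, 10, 2]]
E = echelon_partial_pivoting(A)
```

For the matrix of Problem 3.17 the sketch performs the same two interchanges as the worked solution, since -3 and then -5 are the entries of greatest absolute value in their columns.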
3.18. Reduce each of the following matrices to row canonical form:
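\[ \text{(a)} \quad A = \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 4 & 4 & 1 & 10 & 13 \\ 8 & 8 & -1 & 26 & 23 \end{bmatrix}, \qquad \text{(b)} \quad B = \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 7 \end{bmatrix} \]
(a) First reduce A to echelon form by applying the operations "Replace R2 by -2R1 + R2" and "Replace R3 by -4R1 + R3", and then applying the operation "Replace R3 by -R2 + R3". These operations yield
\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 3 & 2 & 7 \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 4 & 2 \end{bmatrix} \]
Now use back-substitution on the echelon matrix to obtain the row canonical form of A. Specifically, first multiply R3 by 1/4 to obtain the pivot a_34 = 1, and then apply the operations "Replace R2 by 2R3 + R2" and "Replace R1 by -6R3 + R1". These operations yield
\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 6 & 4 \\ 0 & 0 & 3 & -2 & 5 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 3 & 0 & 6 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]
Now multiply R2 by 1/3, making the pivot a_23 = 1, and then apply "Replace R1 by R2 + R1", yielding
\[ A \sim \begin{bmatrix} 2 & 2 & -1 & 0 & 1 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 2 & 2 & 0 & 0 & 3 \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]
Finally, multiply R1 by 1/2, so the pivot a_11 = 1. Thus we obtain the following row canonical form of A:
\[ A \sim \begin{bmatrix} 1 & 1 & 0 & 0 & \tfrac{3}{2} \\ 0 & 0 & 1 & 0 & 2 \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]
(b) Since B is in echelon form, use back-substitution to obtain
\[ B \sim \begin{bmatrix} 5 & -9 & 6 \\ 0 & 2 & 3 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & -9 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 5 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
The last matrix, which is the identity matrix I, is the row canonical form of B. (This is expected, since B is invertible, and so its row canonical form must be I.)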
3.19. Describe the Gauss-Jordan elimination algorithm, which also row reduces an arbitrary matrix A to
its row canonical form.
The Gauss-Jordan algorithm is similar in some ways to the Gaussian elimination algorithm, except that here each pivot is used to place 0's both below and above the pivot, not just below the pivot, before working with the next pivot. Also, one variation of the algorithm first normalizes each row, that is, obtains a unit pivot, before it is used to produce 0's in the other rows, rather than normalizing the rows at the end of the algorithm.
3.20. Let
\[ A = \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 1 & 1 & 4 & -1 & 3 \\ 2 & 5 & 9 & -2 & 8 \end{bmatrix} \]
Use Gauss-Jordan to find the row canonical form of A.
Use a_11 = 1 as a pivot to obtain 0's below a_11 by applying the operations "Replace R2 by -R1 + R2" and "Replace R3 by -2R1 + R3". This yields
\[ A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 3 & 1 & -2 & 1 \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} \]
Multiply R2 by 1/3 to make the pivot a_22 = 1, and then produce 0's below and above a_22 by applying the
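operations "Replace R3 by -9R2 + R3" and "Replace R1 by 2R2 + R1". These operations yield
\[ A \sim \begin{bmatrix} 1 & -2 & 3 & 1 & 2 \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 9 & 3 & -4 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 2 & 1 \end{bmatrix} \]
Finally, multiply R3 by 1/2 to make the pivot a_34 = 1, and then produce 0's above a_34 by applying the operations "Replace R2 by (2/3)R3 + R2" and "Replace R1 by (1/3)R3 + R1". These operations yield
\[ A \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & -\tfrac{1}{3} & \tfrac{8}{3} \\ 0 & 1 & \tfrac{1}{3} & -\tfrac{2}{3} & \tfrac{1}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & \tfrac{11}{3} & 0 & \tfrac{17}{6} \\ 0 & 1 & \tfrac{1}{3} & 0 & \tfrac{2}{3} \\ 0 & 0 & 0 & 1 & \tfrac{1}{2} \end{bmatrix} \]
which is the row canonical form of A.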
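The Gauss-Jordan procedure used in Problem 3.20 can be sketched as follows. The function name `rref` is hypothetical, and this sketch follows the variation that normalizes each pivot row before clearing the other entries of the pivot column. Exact fractions keep the entries comparable with the hand computation.

```python
from fractions import Fraction

def rref(A):
    """Return the row canonical form of A by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in r] for r in A]
    m, n = len(rows), len(rows[0])
    top = 0                                    # row that will receive the next pivot
    for col in range(n):
        if top == m:
            break
        src = next((r for r in range(top, m) if rows[r][col] != 0), None)
        if src is None:
            continue                           # no pivot in this column
        rows[top], rows[src] = rows[src], rows[top]
        pivot = rows[top][col]
        rows[top] = [v / pivot for v in rows[top]]        # obtain a unit pivot
        for r in range(m):                                 # clear below AND above
            if r != top and rows[r][col] != 0:
                f = rows[r][col]
                rows[r] = [b - f * a for a, b in zip(rows[top], rows[r])]
        top += 1
    return rows

A = [[1, -2, 3, 1, 2], [1, 1, 4, -1, 3], [2, 5, 9, -2, 8]]
R = rref(A)
```

Because the row canonical form of a matrix is unique, any correct order of operations must arrive at the same final matrix as the worked solution.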
SYSTEMS OF LINEAR EQUATIONS IN MATRIX FORM
3.21. Find the augmented matrix M and the coefficient matrix A of the following system:
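  x + 2y - 3z = 4
  3y - 4z + 7x = 5
  6z + 8x - 9y = 1
First align the unknowns in the system, and then use the aligned system to obtain M and A. We have
  x + 2y - 3z = 4
  7x + 3y - 4z = 5
  8x - 9y + 6z = 1
then
\[ M = \begin{bmatrix} 1 & 2 & -3 & 4 \\ 7 & 3 & -4 & 5 \\ 8 & -9 & 6 & 1 \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} 1 & 2 & -3 \\ 7 & 3 & -4 \\ 8 & -9 & 6 \end{bmatrix} \]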
3.22. Solve each of the following systems using its augmented matrix M:
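(a)
  x + 2y - z = 3
  x + 3y + z = 5
  3x + 8y + 4z = 17
(b)
  x - 2y + 4z = 2
  2x - 3y + 5z = 3
  3x - 4y + 6z = 7
(c)
  x + y + 3z = 1
  2x + 3y - z = 3
  5x + 7y + z = 7
(a) Reduce the augmented matrix M to echelon form as follows:
\[ M = \begin{bmatrix} 1 & 2 & -1 & 3 \\ 1 & 3 & 1 & 5 \\ 3 & 8 & 4 & 17 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 2 & 7 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 3 & 4 \end{bmatrix} \]
Now write down the corresponding triangular system
  x + 2y - z = 3
      y + 2z = 2
          3z = 4
and solve by back-substitution to obtain the unique solution
  x = 17/3, y = -2/3, z = 4/3 or u = (17/3, -2/3, 4/3)
Alternately, reduce the echelon form of M to row canonical form, obtaining
\[ M \sim \begin{bmatrix} 1 & 2 & -1 & 3 \\ 0 & 1 & 2 & 2 \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & \tfrac{13}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 & \tfrac{17}{3} \\ 0 & 1 & 0 & -\tfrac{2}{3} \\ 0 & 0 & 1 & \tfrac{4}{3} \end{bmatrix} \]
This also corresponds to the above solution.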
(b) First reduce the augmented matrix M to echelon form as follows:
\[ M = \begin{bmatrix} 1 & -2 & 4 & 2 \\ 2 & -3 & 5 & 3 \\ 3 & -4 & 6 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 2 & -6 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & -2 & 4 & 2 \\ 0 & 1 & -3 & -1 \\ 0 & 0 & 0 & 3 \end{bmatrix} \]
The third row corresponds to the degenerate equation 0x + 0y + 0z = 3, which has no solution. Thus "DO NOT CONTINUE". The original system also has no solution. (Note that the echelon form indicates whether or not the system has a solution.)
(c) Reduce the augmented matrix M to echelon form and then to row canonical form:
\[ M = \begin{bmatrix} 1 & 1 & 3 & 1 \\ 2 & 3 & -1 & 3 \\ 5 & 7 & 1 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 3 & 1 \\ 0 & 1 & -7 & 1 \\ 0 & 2 & -14 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 10 & 0 \\ 0 & 1 & -7 & 1 \end{bmatrix} \]
(The third row of the second matrix is deleted, since it is a multiple of the second row and will result in a zero row.) Write down the system corresponding to the row canonical form of M and then transfer the free variables to the other side to obtain the free-variable form of the solution:
  x + 10z = 0
  y - 7z = 1
and
  x = -10z
  y = 1 + 7z
Here z is the only free variable. The parametric solution, using z = a, is as follows:
  x = -10a, y = 1 + 7a, z = a or u = (-10a, 1 + 7a, a)
3.23. Solve the following system using its augmented matrix M:
  x1 + 2x2 - 3x3 - 2x4 + 4x5 = 1
  2x1 + 5x2 - 8x3 - x4 + 6x5 = 4
  x1 + 4x2 - 7x3 + 5x4 + 2x5 = 8
Reduce the augmented matrix M to echelon form and then to row canonical form:
\[ M = \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 2 & 5 & -8 & -1 & 6 & 4 \\ 1 & 4 & -7 & 5 & 2 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 2 & -4 & 7 & -2 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & -3 & -2 & 4 & 1 \\ 0 & 1 & -2 & 3 & -2 & 2 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \]
\[ \sim \begin{bmatrix} 1 & 2 & -3 & 0 & 8 & 7 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 1 & 0 & 24 & 21 \\ 0 & 1 & -2 & 0 & -8 & -7 \\ 0 & 0 & 0 & 1 & 2 & 3 \end{bmatrix} \]
Write down the system corresponding to the row canonical form of M and then transfer the free variables to the other side to obtain the free-variable form of the solution:
  x1 + x3 + 24x5 = 21
  x2 - 2x3 - 8x5 = -7
  x4 + 2x5 = 3
and
  x1 = 21 - x3 - 24x5
  x2 = -7 + 2x3 + 8x5
  x4 = 3 - 2x5
Here x1, x2, x4 are the pivot variables and x3 and x5 are the free variables. Recall that the parametric form of the solution can be obtained from the free-variable form of the solution by simply setting the free variables equal to parameters, say x3 = a, x5 = b. This process yields
  x1 = 21 - a - 24b, x2 = -7 + 2a + 8b, x3 = a, x4 = 3 - 2b, x5 = b
  or u = (21 - a - 24b, -7 + 2a + 8b, a, 3 - 2b, b)
which is another form of the solution.
LINEAR COMBINATIONS, HOMOGENEOUS SYSTEMS
3.24. Write v as a linear combination of u1, u2, u3, where
(a) v = (3, 10, 7) and u1 = (1, 3, -2), u2 = (1, 4, 2), u3 = (2, 8, 1);
(b) v = (2, 7, 8) and u1 = (1, 2, 3), u2 = (1, 3, 5), u3 = (1, 5, 9);
(c) v = (1, 5, 4) and u1 = (1, 3, -2), u2 = (2, 7, -1), u3 = (1, 6, 7).
Find the equivalent system of linear equations by writing v = xu1 + yu2 + zu3. Alternatively, use the augmented matrix M of the equivalent system, where M = [u1, u2, u3, v]. (Here u1, u2, u3, v are the columns of M.)
(a) The vector equation v = xu1 + yu2 + zu3 for the given vectors is as follows:
\[ \begin{bmatrix} 3 \\ 10 \\ 7 \end{bmatrix} = x \begin{bmatrix} 1 \\ 3 \\ -2 \end{bmatrix} + y \begin{bmatrix} 1 \\ 4 \\ 2 \end{bmatrix} + z \begin{bmatrix} 2 \\ 8 \\ 1 \end{bmatrix} = \begin{bmatrix} x + y + 2z \\ 3x + 4y + 8z \\ -2x + 2y + z \end{bmatrix} \]
Form the equivalent system of linear equations by setting corresponding entries equal to each other, and then reduce the system to echelon form:
  x + y + 2z = 3
  3x + 4y + 8z = 10
  -2x + 2y + z = 7
or
  x + y + 2z = 3
      y + 2z = 1
     4y + 5z = 13
or
  x + y + 2z = 3
      y + 2z = 1
         -3z = 9
The system is in triangular form. Back-substitution yields the unique solution x = 2, y = 7, z = -3. Thus v = 2u1 + 7u2 - 3u3.
Alternatively, form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon form:
\[ M = \begin{bmatrix} 1 & 1 & 2 & 3 \\ 3 & 4 & 8 & 10 \\ -2 & 2 & 1 & 7 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 4 & 5 & 13 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 & 3 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & -3 & 9 \end{bmatrix} \]
The last matrix corresponds to a triangular system that has a unique solution. Back-substitution yields the solution x = 2, y = 7, z = -3. Thus v = 2u1 + 7u2 - 3u3.
(b) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon form:
\[ M = \begin{bmatrix} 1 & 1 & 1 & 2 \\ 2 & 3 & 5 & 7 \\ 3 & 5 & 9 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 2 & 6 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 1 & 2 \\ 0 & 1 & 3 & 3 \\ 0 & 0 & 0 & -4 \end{bmatrix} \]
The third row corresponds to the degenerate equation 0x + 0y + 0z = -4, which has no solution. Thus the system also has no solution, and v cannot be written as a linear combination of u1, u2, u3.
(c) Form the augmented matrix M = [u1, u2, u3, v] of the equivalent system, and reduce M to echelon form:
\[ M = \begin{bmatrix} 1 & 2 & 1 & 1 \\ 3 & 7 & 6 & 5 \\ -2 & -1 & 7 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 3 & 9 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 & 1 \\ 0 & 1 & 3 & 2 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
The last matrix corresponds to the following system with free variable z:
  x + 2y + z = 1
      y + 3z = 2
Thus v can be written as a linear combination of u1, u2, u3 in many ways. For example, let the free variable z = 1, and, by back-substitution, we get y = -1 and x = 2. Thus v = 2u1 - u2 + u3.
3.25. Let u1 = (1, 2, 4), u2 = (2, -3, 1), u3 = (2, 1, -1) in R^3. Show that u1, u2, u3 are orthogonal, and write v as a linear combination of u1, u2, u3, where: (a) v = (7, 16, 6), (b) v = (3, 5, 2).
Take the dot product of pairs of vectors to get
  u1·u2 = 2 - 6 + 4 = 0, u1·u3 = 2 + 2 - 4 = 0, u2·u3 = 4 - 3 - 1 = 0
Thus the three vectors in R^3 are orthogonal, and hence Fourier coefficients can be used. That is, v = xu1 + yu2 + zu3, where
  x = (v·u1)/(u1·u1), y = (v·u2)/(u2·u2), z = (v·u3)/(u3·u3)
(a) We have
  x = (7 + 32 + 24)/(1 + 4 + 16) = 63/21 = 3
  y = (14 - 48 + 6)/(4 + 9 + 1) = -28/14 = -2
  z = (14 + 16 - 6)/(4 + 1 + 1) = 24/6 = 4
Thus v = 3u1 - 2u2 + 4u3.
(b) We have
  x = (3 + 10 + 8)/(1 + 4 + 16) = 21/21 = 1
  y = (6 - 15 + 2)/(4 + 9 + 1) = -7/14 = -1/2
  z = (6 + 5 - 2)/(4 + 1 + 1) = 9/6 = 3/2
Thus v = u1 - (1/2)u2 + (3/2)u3.
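The Fourier-coefficient computation of Problem 3.25 can be sketched directly from the formula x = (v·u1)/(u1·u1) and its analogues. The function names `dot` and `fourier_coefficients` are hypothetical, and the sketch assumes integer components so that exact fractions can be formed; it first checks that the given vectors are mutually orthogonal, since the formula is valid only in that case.

```python
from fractions import Fraction

def dot(u, v):
    """Dot product of two vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """Coefficients c_i = (v . u_i)/(u_i . u_i) for mutually orthogonal u_i."""
    for i, u in enumerate(basis):
        for w in basis[i + 1:]:
            if dot(u, w) != 0:
                raise ValueError("vectors are not mutually orthogonal")
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

basis = [(1, 2, 4), (2, -3, 1), (2, 1, -1)]
coeffs_a = fourier_coefficients((7, 16, 6), basis)
coeffs_b = fourier_coefficients((3, 5, 2), basis)
```

No system of equations has to be solved here: orthogonality lets each coefficient be found independently of the others.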
3.26. Find the dimension and a basis for the general solution W of each of the following homogeneous
systems:
(a)
  2x1 + 4x2 - 5x3 + 3x4 = 0
  3x1 + 6x2 - 7x3 + 4x4 = 0
  5x1 + 10x2 - 11x3 + 6x4 = 0
(b)
  x - 2y - 3z = 0
  2x + y + 3z = 0
  3x - 4y - 2z = 0
(a) Reduce the system to echelon form using the operations "Replace L2 by -3L1 + 2L2", "Replace L3 by -5L1 + 2L3", and then "Replace L3 by -3L2 + L3". These operations yield:
  2x1 + 4x2 - 5x3 + 3x4 = 0
               x3 - x4 = 0
             3x3 - 3x4 = 0
and
  2x1 + 4x2 - 5x3 + 3x4 = 0
               x3 - x4 = 0
The system in echelon form has two free variables, x2 and x4, so dim W = 2. A basis [u1, u2] for W may be obtained as follows:
(1) Set x2 = 1, x4 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus u1 = (-2, 1, 0, 0).
(2) Set x2 = 0, x4 = 1. Back-substitution yields x3 = 1, and then x1 = 1. Thus u2 = (1, 0, 1, 1).
(b) Reduce the system to echelon form, obtaining
  x - 2y - 3z = 0
      5y + 9z = 0
      2y + 7z = 0
and
  x - 2y - 3z = 0
      5y + 9z = 0
          17z = 0
There are no free variables (the system is in triangular form). Hence dim W = 0, and W has no basis. Specifically, W consists only of the zero solution, that is, W = {0}.
3.27. Find the dimension and a basis for the general solution W of the following homogeneous system
using matrix notation:
  x1 + 2x2 + 3x3 - 2x4 + 4x5 = 0
  2x1 + 4x2 + 8x3 + x4 + 9x5 = 0
  3x1 + 6x2 + 13x3 + 4x4 + 14x5 = 0
Show how the basis gives the parametric form of the general solution of the system.
When a system is homogeneous, we represent the system by its coefficient matrix A rather than by its
augmented matrix M, since the last column of the augmented matrix M is a zero column, and it will remain a
zero column during any row-reduction process.
Reduce the coefficient matrix A to echelon form, obtaining
\[ A = \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 2 & 4 & 8 & 1 & 9 \\ 3 & 6 & 13 & 4 & 14 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \\ 0 & 0 & 4 & 10 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 2 & 5 & 1 \end{bmatrix} \]
(The third row of the second matrix is deleted, since it is a multiple of the second row and will result in a zero row.) We can now proceed in one of two ways.
(a) Write down the corresponding homogeneous system in echelon form:
  x1 + 2x2 + 3x3 - 2x4 + 4x5 = 0
              2x3 + 5x4 + x5 = 0
The system in echelon form has three free variables, x2, x4, x5, so dim W = 3. A basis [u1, u2, u3] for W may be obtained as follows:
(1) Set x2 = 1, x4 = 0, x5 = 0. Back-substitution yields x3 = 0, and then x1 = -2. Thus u1 = (-2, 1, 0, 0, 0).
(2) Set x2 = 0, x4 = 1, x5 = 0. Back-substitution yields x3 = -5/2, and then x1 = 19/2. Thus u2 = (19/2, 0, -5/2, 1, 0).
(3) Set x2 = 0, x4 = 0, x5 = 1. Back-substitution yields x3 = -1/2, and then x1 = -5/2. Thus u3 = (-5/2, 0, -1/2, 0, 1).
[One could avoid fractions in the basis by choosing x4 = 2 in (2) and x5 = 2 in (3), which yields multiples of u2 and u3.] The parametric form of the general solution is obtained from the following linear combination of the basis vectors using parameters a, b, c:
  au1 + bu2 + cu3 = (-2a + (19/2)b - (5/2)c, a, -(5/2)b - (1/2)c, b, c)
(b) Reduce the echelon form of A to row canonical form:
\[ A \sim \begin{bmatrix} 1 & 2 & 3 & -2 & 4 \\ 0 & 0 & 1 & \tfrac{5}{2} & \tfrac{1}{2} \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 & -\tfrac{19}{2} & \tfrac{5}{2} \\ 0 & 0 & 1 & \tfrac{5}{2} & \tfrac{1}{2} \end{bmatrix} \]
Write down the corresponding free-variable solution:
  x1 = -2x2 + (19/2)x4 - (5/2)x5
  x3 = -(5/2)x4 - (1/2)x5
Using these equations for the pivot variables x1 and x3, repeat the above process to obtain a basis [u1, u2, u3] for W. That is, set x2 = 1, x4 = 0, x5 = 0 to get u1; set x2 = 0, x4 = 1, x5 = 0 to get u2; and set x2 = 0, x4 = 0, x5 = 1 to get u3.
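The second method above (row canonical form, then one basis vector per free variable) can be sketched as follows. The function name `null_space_basis` is hypothetical and the sketch uses exact fractions; each basis vector is obtained by setting one free variable to 1, the other free variables to 0, and reading the pivot variables from the free-variable form of the solution.

```python
from fractions import Fraction

def null_space_basis(A):
    """Basis of the solution space W of the homogeneous system AX = 0."""
    rows = [[Fraction(x) for x in r] for r in A]
    m, n = len(rows), len(rows[0])
    pivots, top = [], 0
    for col in range(n):                        # reduce to row canonical form
        if top == m:
            break
        src = next((r for r in range(top, m) if rows[r][col] != 0), None)
        if src is None:
            continue
        rows[top], rows[src] = rows[src], rows[top]
        p = rows[top][col]
        rows[top] = [v / p for v in rows[top]]
        for r in range(m):
            if r != top:
                f = rows[r][col]
                rows[r] = [b - f * a for a, b in zip(rows[top], rows[r])]
        pivots.append(col)
        top += 1
    free = [c for c in range(n) if c not in pivots]
    basis = []
    for f in free:
        u = [Fraction(0)] * n
        u[f] = Fraction(1)                      # this free variable is 1, the others 0
        for i, p in enumerate(pivots):
            u[p] = -rows[i][f]                  # pivot variable from free-variable form
        basis.append(u)
    return basis

A = [[1, 2, 3, -2, 4], [2, 4, 8, 1, 9], [3, 6, 13, 4, 14]]
basis = null_space_basis(A)
```

The number of vectors returned equals the number of free variables, which is dim W.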
3.28. Prove Theorem 3.13. Let v0 be a particular solution of AX = B, and let W be the general solution of AX = 0. Then U = v0 + W = {v0 + w : w ∈ W} is the general solution of AX = B.
Let w be a solution of AX = 0. Then
  A(v0 + w) = Av0 + Aw = B + 0 = B
Thus the sum v0 + w is a solution of AX = B. On the other hand, suppose v is also a solution of AX = B. Then
  A(v - v0) = Av - Av0 = B - B = 0
Therefore v - v0 belongs to W. Since v = v0 + (v - v0), we find that any solution of AX = B can be obtained by adding a solution of AX = 0 to a solution of AX = B. Thus the theorem is proved.
ELEMENTARY MATRICES, APPLICATIONS
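3.29. Let $e_1, e_2, e_3$ denote, respectively, the elementary row operations
"Interchange rows $R_1$ and $R_2$", "Replace $R_3$ by $7R_3$", "Replace $R_2$ by $-3R_1 + R_2$"
Find the corresponding 3-square elementary matrices $E_1, E_2, E_3$.
Apply each operation to the $3 \times 3$ identity matrix $I_3$ to obtain
\[ E_1 = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 7 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 1 & 0 & 0 \\ -3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]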
3.30. Consider the elementary row operations in Problem 3.29.
(a) Describe the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$.
(b) Find the corresponding 3-square elementary matrices $E_1', E_2', E_3'$.
(c) What is the relationship between the matrices $E_1', E_2', E_3'$ and the matrices $E_1, E_2, E_3$?
(a) The inverses of $e_1, e_2, e_3$ are, respectively,
"Interchange rows $R_1$ and $R_2$", "Replace $R_3$ by $\frac{1}{7}R_3$", "Replace $R_2$ by $3R_1 + R_2$".
(b) Apply each inverse operation to the $3 \times 3$ identity matrix $I_3$ to obtain
\[ E_1' = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad E_2' = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & \frac{1}{7} \end{bmatrix}, \quad E_3' = \begin{bmatrix} 1 & 0 & 0 \\ 3 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
(c) The matrices $E_1', E_2', E_3'$ are, respectively, the inverses of the matrices $E_1, E_2, E_3$.
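The relationship in part (c) is easy to check numerically. The sketch below builds the three kinds of elementary matrices from the identity and multiplies each by its claimed inverse; the helper names (`interchange`, `scale`, `add_multiple`) are our own illustrative choices, and rows are indexed from 0 rather than 1.

```python
from fractions import Fraction

def identity(n):
    return [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def interchange(n, i, j):
    """Elementary matrix for 'Interchange rows R_i and R_j'."""
    E = identity(n)
    E[i], E[j] = E[j], E[i]
    return E

def scale(n, i, k):
    """Elementary matrix for 'Replace R_i by k*R_i' (k nonzero)."""
    E = identity(n)
    E[i][i] = Fraction(k)
    return E

def add_multiple(n, i, j, k):
    """Elementary matrix for 'Replace R_i by k*R_j + R_i' (i != j)."""
    E = identity(n)
    E[i][j] = Fraction(k)
    return E

# The operations of Problem 3.29 and the inverse operations of Problem 3.30.
E1, E2, E3 = interchange(3, 0, 1), scale(3, 2, 7), add_multiple(3, 1, 0, -3)
E1_inv, E2_inv, E3_inv = interchange(3, 0, 1), scale(3, 2, Fraction(1, 7)), add_multiple(3, 1, 0, 3)

for E, E_inv in [(E1, E1_inv), (E2, E2_inv), (E3, E3_inv)]:
    print(matmul(E_inv, E) == identity(3), matmul(E, E_inv) == identity(3))
```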
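3.31. Write each of the following matrices as a product of elementary matrices:
\[ \text{(a) } A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix} \]
The following three steps write a matrix $M$ as a product of elementary matrices:
Step 1. Row reduce $M$ to the identity matrix $I$, keeping track of the elementary row operations.
Step 2. Write down the inverse row operations.
Step 3. Write $M$ as the product of the elementary matrices corresponding to the inverse operations. This gives the desired result.
If a zero row appears in Step 1, then $M$ is not row equivalent to the identity matrix $I$, and $M$ cannot be written as a product of elementary matrices.
(a) (1) We have
\[ A = \begin{bmatrix} 1 & -3 \\ -2 & 4 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 \\ 0 & -2 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = I \]
where the row operations are, respectively,
"Replace $R_2$ by $2R_1 + R_2$", "Replace $R_2$ by $-\frac{1}{2}R_2$", "Replace $R_1$ by $3R_2 + R_1$"
(2) Inverse operations:
"Replace $R_2$ by $-2R_1 + R_2$", "Replace $R_2$ by $-2R_2$", "Replace $R_1$ by $-3R_2 + R_1$"
(3) \[ A = \begin{bmatrix} 1 & 0 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -2 \end{bmatrix} \begin{bmatrix} 1 & -3 \\ 0 & 1 \end{bmatrix} \]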
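(b) (1) We have
\[ B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \sim \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} = I \]
where the row operations are, respectively,
"Replace $R_2$ by $-4R_3 + R_2$", "Replace $R_1$ by $-3R_3 + R_1$", "Replace $R_1$ by $-2R_2 + R_1$"
(2) Inverse operations:
"Replace $R_2$ by $4R_3 + R_2$", "Replace $R_1$ by $3R_3 + R_1$", "Replace $R_1$ by $2R_2 + R_1$"
(3) \[ B = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 4 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
(c) (1) First row reduce $C$ to echelon form. We have
\[ C = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 8 \\ -3 & -1 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 2 & 8 \end{bmatrix} \sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 4 \\ 0 & 0 & 0 \end{bmatrix} \]
In echelon form, $C$ has a zero row. "STOP". The matrix $C$ cannot be row reduced to the identity matrix $I$, and $C$ cannot be written as a product of elementary matrices. (We note, in particular, that $C$ has no inverse.)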
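As a check on Step 3, the products of elementary matrices for $A$ and $B$ can be multiplied out. The following is a minimal sketch with integer entries; `matmul` and `product` are hypothetical helper names introduced only for this illustration.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def product(*matrices):
    """Multiply the given matrices from left to right."""
    result = matrices[0]
    for M in matrices[1:]:
        result = matmul(result, M)
    return result

# Elementary matrices of the inverse operations, in the order found in Step 2.
A_factors = ([[1, 0], [-2, 1]], [[1, 0], [0, -2]], [[1, -3], [0, 1]])
B_factors = (
    [[1, 0, 0], [0, 1, 4], [0, 0, 1]],
    [[1, 0, 3], [0, 1, 0], [0, 0, 1]],
    [[1, 2, 0], [0, 1, 0], [0, 0, 1]],
)

print(product(*A_factors))  # should reproduce A
print(product(*B_factors))  # should reproduce B
```

The order matters: the elementary matrix of the first inverse operation appears leftmost in the product.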
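3.32. Find the inverse of:
\[ \text{(a) } A = \begin{bmatrix} 1 & 2 & -4 \\ -1 & -1 & 5 \\ 2 & 7 & -3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 3 & -4 \\ 1 & 5 & -1 \\ 3 & 13 & -6 \end{bmatrix} \]
(a) Form the matrix $M = [A, I]$ and row reduce $M$ to echelon form:
\[ M = \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ -1 & -1 & 5 & 0 & 1 & 0 \\ 2 & 7 & -3 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 3 & 5 & -2 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 2 & -4 & 1 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 2 & -5 & -3 & 1 \end{array}\right] \]
In echelon form, the left half of $M$ is in triangular form; hence $A$ has an inverse. Further reduce $M$ to row canonical form:
\[ M \sim \left[\begin{array}{rrr|rrr} 1 & 2 & 0 & -9 & -6 & 2 \\ 0 & 1 & 0 & \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ 0 & 0 & 1 & -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & -16 & -11 & 3 \\ 0 & 1 & 0 & \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ 0 & 0 & 1 & -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{array}\right] \]
The final matrix has the form $[I, A^{-1}]$; that is, $A^{-1}$ is the right half of the last matrix. Thus
\[ A^{-1} = \begin{bmatrix} -16 & -11 & 3 \\ \frac{7}{2} & \frac{5}{2} & -\frac{1}{2} \\ -\frac{5}{2} & -\frac{3}{2} & \frac{1}{2} \end{bmatrix} \]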
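(b) Form the matrix $M = [B, I]$ and row reduce $M$ to echelon form:
\[ M = \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 1 & 5 & -1 & 0 & 1 & 0 \\ 3 & 13 & -6 & 0 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 4 & 6 & -3 & 0 & 1 \end{array}\right] \sim \left[\begin{array}{rrr|rrr} 1 & 3 & -4 & 1 & 0 & 0 \\ 0 & 2 & 3 & -1 & 1 & 0 \\ 0 & 0 & 0 & -1 & -2 & 1 \end{array}\right] \]
In echelon form, $M$ has a zero row in its left half; that is, $B$ is not row reducible to triangular form. Accordingly, $B$ has no inverse.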
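The $[M, I]$ reduction used in both parts can be sketched as a short routine. This is an illustrative implementation, not the text's own algorithm statement: the function name `inverse` is ours, it goes directly to row canonical form (Gauss-Jordan), and it returns `None` when no nonzero pivot can be found in a column, which corresponds to a zero row appearing in the left half.

```python
from fractions import Fraction

def inverse(A):
    """Row reduce [A, I] to [I, A^-1]; return None if A is singular."""
    n = len(A)
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(A)]
    for col in range(n):
        # Find a row at or below the diagonal with a nonzero entry in this column.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None
        M[col], M[pivot] = M[pivot], M[col]
        p = M[col][col]
        M[col] = [x / p for x in M[col]]          # make the pivot equal to 1
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

print(inverse([[1, 2, -4], [-1, -1, 5], [2, 7, -3]]))
print(inverse([[1, 3, -4], [1, 5, -1], [3, 13, -6]]))
```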
3.33. Show that every elementary matrix $E$ is invertible, and its inverse is an elementary matrix.
Let $E$ be the elementary matrix corresponding to the elementary operation $e$, that is, $e(I) = E$. Let $e'$ be the inverse operation of $e$ and let $E'$ be the corresponding elementary matrix, that is, $e'(I) = E'$. Then
\[ I = e'(e(I)) = e'(E) = E'E \qquad \text{and} \qquad I = e(e'(I)) = e(E') = EE' \]
Therefore $E'$ is the inverse of $E$.
3.34. Prove Theorem 3.13: Let $e$ be an elementary row operation and let $E$ be the corresponding $m$-square elementary matrix, that is, $E = e(I)$. Then $e(A) = EA$, where $A$ is any $m \times n$ matrix.
Let $R_i$ be row $i$ of $A$; we denote this by writing $A = [R_1, \ldots, R_m]$. If $B$ is a matrix for which $AB$ is defined, then $AB = [R_1B, \ldots, R_mB]$. We also let
\[ e_i = (0, \ldots, 0, 1, 0, \ldots, 0), \qquad \text{with 1 as the } i\text{th entry} \]
One can show (Problem 2.45) that $e_iA = R_i$. We also note that $I = [e_1, e_2, \ldots, e_m]$ is the identity matrix.
(i) Let $e$ be the elementary row operation "Interchange rows $R_i$ and $R_j$". Then, with $e_j$ in the $i$th position and $e_i$ in the $j$th position,
\[ E = e(I) = [e_1, \ldots, e_j, \ldots, e_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] \]
Thus
\[ EA = [e_1A, \ldots, e_jA, \ldots, e_iA, \ldots, e_mA] = [R_1, \ldots, R_j, \ldots, R_i, \ldots, R_m] = e(A) \]
(ii) Let $e$ be the elementary row operation "Replace $R_i$ by $kR_i$ ($k \neq 0$)". Then, with $ke_i$ in the $i$th position,
\[ E = e(I) = [e_1, \ldots, ke_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, kR_i, \ldots, R_m] \]
Thus
\[ EA = [e_1A, \ldots, ke_iA, \ldots, e_mA] = [R_1, \ldots, kR_i, \ldots, R_m] = e(A) \]
(iii) Let $e$ be the elementary row operation "Replace $R_i$ by $kR_j + R_i$". Then, with $ke_j + e_i$ in the $i$th position,
\[ E = e(I) = [e_1, \ldots, ke_j + e_i, \ldots, e_m] \]
and
\[ e(A) = [R_1, \ldots, kR_j + R_i, \ldots, R_m] \]
Using $(ke_j + e_i)A = k(e_jA) + e_iA = kR_j + R_i$, we have
\[ EA = [e_1A, \ldots, (ke_j + e_i)A, \ldots, e_mA] = [R_1, \ldots, kR_j + R_i, \ldots, R_m] = e(A) \]
3.35. Prove Theorem 3.14: Let $A$ be a square matrix. Then the following are equivalent:
(a) $A$ is invertible (nonsingular).
(b) $A$ is row equivalent to the identity matrix $I$.
(c) $A$ is a product of elementary matrices.
Suppose $A$ is invertible and suppose $A$ is row equivalent to matrix $B$ in row canonical form. Then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = B$. Since $A$ is invertible and each elementary matrix is invertible, $B$ is also invertible. But if $B \neq I$, then $B$ has a zero row; whence $B$ is not invertible. Thus $B = I$, and (a) implies (b).
If (b) holds, then there exist elementary matrices $E_1, E_2, \ldots, E_s$ such that $E_s \cdots E_2E_1A = I$. Hence $A = (E_s \cdots E_2E_1)^{-1} = E_1^{-1}E_2^{-1} \cdots E_s^{-1}$. But the $E_i^{-1}$ are also elementary matrices. Thus (b) implies (c).
If (c) holds, then $A = E_1E_2 \cdots E_s$. The $E_i$ are invertible matrices; hence their product $A$ is also invertible. Thus (c) implies (a). Accordingly, the theorem is proved.
3.36. Prove Theorem 3.15: If $AB = I$, then $BA = I$, and hence $B = A^{-1}$.
Suppose $A$ is not invertible. Then $A$ is not row equivalent to the identity matrix $I$, and so $A$ is row equivalent to a matrix with a zero row. In other words, there exist elementary matrices $E_1, \ldots, E_s$ such that $E_s \cdots E_2E_1A$ has a zero row. Hence $E_s \cdots E_2E_1AB = E_s \cdots E_2E_1$, an invertible matrix, also has a zero row. But invertible matrices cannot have zero rows; hence $A$ is invertible, with inverse $A^{-1}$. Then also,
\[ B = IB = (A^{-1}A)B = A^{-1}(AB) = A^{-1}I = A^{-1} \]
3.37. Prove Theorem 3.16: $B$ is row equivalent to $A$ (written $B \sim A$) if and only if there exists a nonsingular matrix $P$ such that $B = PA$.
If $B \sim A$, then $B = e_s(\ldots(e_2(e_1(A)))\ldots) = E_s \cdots E_2E_1A = PA$ where $P = E_s \cdots E_2E_1$ is nonsingular. Conversely, suppose $B = PA$, where $P$ is nonsingular. By Theorem 3.14, $P$ is a product of elementary matrices, and so $B$ can be obtained from $A$ by a sequence of elementary row operations, that is, $B \sim A$. Thus the theorem is proved.
3.38. Prove Theorem 3.18: Every $m \times n$ matrix $A$ is equivalent to a unique block matrix of the form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $I_r$ is the $r \times r$ identity matrix.
The proof is constructive, in the form of an algorithm.
Step 1. Row reduce $A$ to row canonical form, with leading nonzero entries $a_{1j_1}, a_{2j_2}, \ldots, a_{rj_r}$.
Step 2. Interchange $C_1$ and $C_{j_1}$, interchange $C_2$ and $C_{j_2}$, ..., and interchange $C_r$ and $C_{j_r}$. This gives a matrix in the form $\begin{bmatrix} I_r & B \\ 0 & 0 \end{bmatrix}$, with leading nonzero entries $a_{11}, a_{22}, \ldots, a_{rr}$.
Step 3. Use column operations, with the $a_{ii}$ as pivots, to replace each entry in $B$ with a zero, i.e., for $i = 1, 2, \ldots, r$ and $j = r + 1, r + 2, \ldots, n$, apply the operation $-b_{ij}C_i + C_j \to C_j$.
The final matrix has the desired form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$.
LU FACTORIZATION
3.39. Find the LU decomposition of:
\[ \text{(a) } A = \begin{bmatrix} 1 & -3 & 5 \\ 2 & -4 & 7 \\ -1 & -2 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 4 & -3 \\ 2 & 8 & 1 \\ -5 & -9 & 7 \end{bmatrix} \]
(a) Reduce $A$ to triangular form by the following operations:
"Replace $R_2$ by $-2R_1 + R_2$", "Replace $R_3$ by $R_1 + R_3$", and then "Replace $R_3$ by $\frac{5}{2}R_2 + R_3$"
These operations yield the following, where the triangular form is $U$:
\[ A \sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & -5 & 6 \end{bmatrix} \sim \begin{bmatrix} 1 & -3 & 5 \\ 0 & 2 & -3 \\ 0 & 0 & -\frac{3}{2} \end{bmatrix} = U \qquad \text{and} \qquad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\frac{5}{2} & 1 \end{bmatrix} \]
The entries $2, -1, -\frac{5}{2}$ in $L$ are the negatives of the multipliers $-2, 1, \frac{5}{2}$ in the above row operations. (As a check, multiply $L$ and $U$ to verify $A = LU$.)
(b) Reduce $B$ to triangular form by first applying the operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $5R_1 + R_3$". These operations yield
\[ B \sim \begin{bmatrix} 1 & 4 & -3 \\ 0 & 0 & 7 \\ 0 & 11 & -8 \end{bmatrix} \]
Observe that the second diagonal entry is 0. Thus $B$ cannot be brought into triangular form without row interchange operations. Accordingly, $B$ is not LU-factorable. (There does exist a PLU factorization of such a matrix $B$, where $P$ is a permutation matrix, but such a factorization lies beyond the scope of this text.)
3.40. Find the LDU factorization of the matrix $A$ in Problem 3.39.
The $A = LDU$ factorization refers to the situation where $L$ is a lower triangular matrix with 1's on the diagonal (as in the LU factorization of $A$), $D$ is a diagonal matrix, and $U$ is an upper triangular matrix with 1's on the diagonal. Thus simply factor out the diagonal entries in the matrix $U$ in the above LU factorization of $A$ to obtain $D$ and the new $U$; the matrix $L$ is unchanged. That is,
\[ L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -1 & -\frac{5}{2} & 1 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -\frac{3}{2} \end{bmatrix}, \quad U = \begin{bmatrix} 1 & -3 & 5 \\ 0 & 1 & -\frac{3}{2} \\ 0 & 0 & 1 \end{bmatrix} \]
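Factoring the diagonal out of $U$ amounts to dividing each row of $U$ by its diagonal entry. A minimal sketch, assuming all diagonal entries are nonzero (the name `split_diagonal` is ours):

```python
from fractions import Fraction

def split_diagonal(U):
    """Return (D, U1) with U = D * U1, D diagonal and U1 unit upper triangular."""
    n = len(U)
    D = [[Fraction(U[i][i]) if i == j else Fraction(0) for j in range(n)]
         for i in range(n)]
    U1 = [[Fraction(x) / Fraction(U[i][i]) for x in row] for i, row in enumerate(U)]
    return D, U1

# U from the LU factorization of A in Problem 3.39(a).
U = [[1, -3, 5], [0, 2, -3], [0, 0, Fraction(-3, 2)]]
D, U1 = split_diagonal(U)
print(D)
print(U1)
```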
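3.41. Find the LU factorization of the matrix $A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 3 & 3 \\ -3 & -10 & 2 \end{bmatrix}$.
Reduce $A$ to triangular form by the following operations:
(1) "Replace $R_2$ by $-2R_1 + R_2$", (2) "Replace $R_3$ by $3R_1 + R_3$", (3) "Replace $R_3$ by $-4R_2 + R_3$"
These operations yield the following, where the triangular form is $U$:
\[ A \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & -4 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & -1 & 1 \\ 0 & 0 & 1 \end{bmatrix} = U \qquad \text{and} \qquad L = \begin{bmatrix} 1 & 0 & 0 \\ 2 & 1 & 0 \\ -3 & 4 & 1 \end{bmatrix} \]
The entries $2, -3, 4$ in $L$ are the negatives of the multipliers $-2, 3, -4$ in the above row operations. (As a check, multiply $L$ and $U$ to verify $A = LU$.)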
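The procedure of Problems 3.39 and 3.41 (row-addition operations only, with the negatives of the multipliers stored in $L$) can be sketched as follows. The function name `lu_factor` is an assumption of this sketch, and returning `None` on a zero pivot is a simplification: it signals that a row interchange would be needed, as for $B$ in Problem 3.39.

```python
from fractions import Fraction

def lu_factor(A):
    """LU factorization without row interchanges; None if a zero pivot occurs."""
    n = len(A)
    U = [[Fraction(x) for x in row] for row in A]
    L = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n - 1):
        if U[k][k] == 0:
            return None  # would require a row interchange
        for i in range(k + 1, n):
            m = U[i][k] / U[k][k]   # the operation is "Replace R_i by -m*R_k + R_i"
            L[i][k] = m             # L stores the negative of the multiplier -m
            U[i] = [a - m * b for a, b in zip(U[i], U[k])]
    return L, U

print(lu_factor([[1, 2, 1], [2, 3, 3], [-3, -10, 2]]))    # Problem 3.41
print(lu_factor([[1, -3, 5], [2, -4, 7], [-1, -2, 1]]))   # Problem 3.39(a)
print(lu_factor([[1, 4, -3], [2, 8, 1], [-5, -9, 7]]))    # Problem 3.39(b)
```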
3.42. Let $A$ be the matrix in Problem 3.41. Find $X_1, X_2, X_3$, where $X_i$ is the solution of $AX = B_i$ for:
(a) $B_1 = (1, 1, 1)$, (b) $B_2 = B_1 + X_1$, (c) $B_3 = B_2 + X_2$.
(a) Find $L^{-1}B_1$ by applying the row operations (1), (2), and then (3) in Problem 3.41 to $B_1$:
\[ B_1 = [1, 1, 1]^T \xrightarrow{(1)\text{ and }(2)} [1, -1, 4]^T \xrightarrow{(3)} [1, -1, 8]^T \]
Solve $UX = B$ for $B = (1, -1, 8)$ by back-substitution to obtain $X_1 = (-25, 9, 8)$.
(b) First find $B_2 = B_1 + X_1 = (1, 1, 1) + (-25, 9, 8) = (-24, 10, 9)$. Then, as above,
\[ B_2 = [-24, 10, 9]^T \xrightarrow{(1)\text{ and }(2)} [-24, 58, -63]^T \xrightarrow{(3)} [-24, 58, -295]^T \]
Solve $UX = B$ for $B = (-24, 58, -295)$ by back-substitution to obtain $X_2 = (977, -353, -295)$.
(c) First find $B_3 = B_2 + X_2 = (-24, 10, 9) + (977, -353, -295) = (953, -343, -286)$. Then, as above,
\[ B_3 = [953, -343, -286]^T \xrightarrow{(1)\text{ and }(2)} [953, -2249, 2573]^T \xrightarrow{(3)} [953, -2249, 11569]^T \]
Solve $UX = B$ for $B = (953, -2249, 11569)$ by back-substitution to obtain $X_3 = (-38252, 13818, 11569)$.
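Once $L$ and $U$ are known, each new right-hand side costs only one forward substitution and one back-substitution. The sketch below repeats that two-step solve for the chain $B_1, B_2, B_3$; the names `solve_lu` and `iterate` are hypothetical, and $L$ is assumed to have 1's on its diagonal as in Problem 3.41.

```python
from fractions import Fraction

# L and U from Problem 3.41 (L has 1's on the diagonal).
L = [[1, 0, 0], [2, 1, 0], [-3, 4, 1]]
U = [[1, 2, 1], [0, -1, 1], [0, 0, 1]]

def solve_lu(L, U, b):
    """Solve LUx = b: forward substitution for Ly = b, then back-substitution for Ux = y."""
    n = len(b)
    y = []
    for i in range(n):
        y.append(Fraction(b[i]) - sum(L[i][j] * y[j] for j in range(i)))
    x = [Fraction(0)] * n
    for i in reversed(range(n)):
        s = sum(U[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (y[i] - s) / U[i][i]
    return x

def iterate(b, steps):
    """Return X_1, ..., X_steps where B_{k+1} = B_k + X_k."""
    solutions = []
    for _ in range(steps):
        x = solve_lu(L, U, b)
        solutions.append(x)
        b = [bi + xi for bi, xi in zip(b, x)]
    return solutions

X1, X2, X3 = iterate([1, 1, 1], 3)
print(X1, X2, X3)
```

Each result can be checked independently by substituting it back into $AX = B_i$.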
MISCELLANEOUS PROBLEMS
3.43. Let $L$ be a linear combination of the $m$ equations in $n$ unknowns in the system (3.2). Say $L$ is the equation
\[ (c_1a_{11} + \cdots + c_ma_{m1})x_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})x_n = c_1b_1 + \cdots + c_mb_m \qquad (1) \]
Show that any solution of the system (3.2) is also a solution of $L$.
Let $u = (k_1, \ldots, k_n)$ be a solution of (3.2). Then
\[ a_{i1}k_1 + a_{i2}k_2 + \cdots + a_{in}k_n = b_i \quad (i = 1, 2, \ldots, m) \qquad (2) \]
Substituting $u$ in the left-hand side of (1) and using (2), we get
\[ (c_1a_{11} + \cdots + c_ma_{m1})k_1 + \cdots + (c_1a_{1n} + \cdots + c_ma_{mn})k_n = c_1(a_{11}k_1 + \cdots + a_{1n}k_n) + \cdots + c_m(a_{m1}k_1 + \cdots + a_{mn}k_n) = c_1b_1 + \cdots + c_mb_m \]
This is the right-hand side of (1); hence $u$ is a solution of (1).
3.44. Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by applying an elementary operation (page 64). Show that $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.
Each equation $L$ in $\mathcal{M}$ is a linear combination of equations in $\mathcal{L}$. Hence, by Problem 3.43, any solution of $\mathcal{L}$ will also be a solution of $\mathcal{M}$. On the other hand, each elementary operation has an inverse elementary operation, so $\mathcal{L}$ can be obtained from $\mathcal{M}$ by an elementary operation. This means that any solution of $\mathcal{M}$ is a solution of $\mathcal{L}$. Thus $\mathcal{L}$ and $\mathcal{M}$ have the same solutions.
3.45. Prove Theorem 3.4: Suppose a system $\mathcal{M}$ of linear equations is obtained from a system $\mathcal{L}$ by a sequence of elementary operations. Then $\mathcal{M}$ and $\mathcal{L}$ have the same solutions.
Each step of the sequence does not change the solution set (Problem 3.44). Thus the original system $\mathcal{L}$ and the final system $\mathcal{M}$ (and any system in between) have the same solutions.
3.46. A system $\mathcal{L}$ of linear equations is said to be consistent if no linear combination of its equations is a degenerate equation $L$ with a nonzero constant. Show that $\mathcal{L}$ is consistent if and only if $\mathcal{L}$ is reducible to echelon form.
Suppose $\mathcal{L}$ is reducible to echelon form. Then $\mathcal{L}$ has a solution, which must also be a solution of every linear combination of its equations. Thus $L$, which has no solution, cannot be a linear combination of the equations in $\mathcal{L}$. Thus $\mathcal{L}$ is consistent.
On the other hand, suppose $\mathcal{L}$ is not reducible to echelon form. Then, in the reduction process, it must yield a degenerate equation $L$ with a nonzero constant, which is a linear combination of the equations in $\mathcal{L}$. Therefore, $\mathcal{L}$ is not consistent, that is, $\mathcal{L}$ is inconsistent.
3.47. Suppose $u$ and $v$ are distinct vectors. Show that, for distinct scalars $k$, the vectors $u + k(u - v)$ are distinct.
Suppose $u + k_1(u - v) = u + k_2(u - v)$. We need only show that $k_1 = k_2$. We have
\[ k_1(u - v) = k_2(u - v), \quad \text{and so} \quad (k_1 - k_2)(u - v) = 0 \]
Since $u$ and $v$ are distinct, $u - v \neq 0$. Hence $k_1 - k_2 = 0$, and so $k_1 = k_2$.
3.48. Suppose $AB$ is defined. Prove:
(a) Suppose $A$ has a zero row. Then $AB$ has a zero row.
(b) Suppose $B$ has a zero column. Then $AB$ has a zero column.
(a) Let $R_i$ be the zero row of $A$, and $C_1, \ldots, C_n$ the columns of $B$. Then the $i$th row of $AB$ is
\[ (R_iC_1, R_iC_2, \ldots, R_iC_n) = (0, 0, \ldots, 0) \]
(b) $B^T$ has a zero row, and so $B^TA^T = (AB)^T$ has a zero row. Hence $AB$ has a zero column.
Supplementary Problems
LINEAR EQUATIONS, 2x2 SYSTEMS
3.49. Determine whether each of the following systems is linear:
(a) 3x - 4y + 2yz = 8, (b) ex + 3y = π, (c) 2x - 3y + kz = 4
3.50. Solve: (a) πx = 2, (b) 3x + 2 = 5x + 7 - 2x, (c) 6x + 2 - 4x = 5 + 2x - 3
3.51. Solve each of the following systems:
(a) 2x + 3y = 1      (b) 4x - 2y = 5      (c) 2x - 4 = 3y      (d) 2x - 4y = 10
    5x + 7y = 3         -6x + 3y = 1          5y - x = 5           3x - 6y = 15
3.52. Consider each of the following systems in unknowns x and y:
(a) x - ay = 1       (b) ax + 3y = 2      (c) x + ay = 3
    ax - 4y = b          12x + ay = b         2x + 5y = b
For which values of a does each system have a unique solution, and for which pairs of values (a, b) does each system have more than one solution?
GENERAL SYSTEMS OF LINEAR EQUATIONS
3.53. Solve:
(a) x + y + 2z = 4          (b) x - 2y + 3z = 2          (c) x + 2y + 3z = 3
    2x + 3y + 6z = 10           2x - 3y + 8z = 7             2x + 3y + 8z = 4
    3x + 6y + 10z = 14          3x - 4y + 13z = 8            5x + 8y + 19z = 11
3.54. Solve:
(a) x - 2y = 5       (b) x + 2y - 3z + 2t = 2       (c) x + 2y + 4z - 5t = 3
    2x + 3y = 3          2x + 5y - 8z + 6t = 5          3x - y + 5z + 2t = 4
    3x + 2y = 7          3x + 4y - 5z + 2t = 4          5x - 4y - 6z + 9t = 2
3.55. Solve:
(a) 2x - y - 4z = 2         (b) x + 2y - z + 3t = 3
    4x - 2y - 6z = 5            2x + 4y + 4z + 3t = 9
    6x - 3y - 8z = 8            3x + 6y - z + 8t = 10
3.56. Consider each of the following systems in unknowns x, y, z:
(a) x - 2y = 1              (b) x + 2y + 2z = 1          (c) x + y + az = 1
    x - y + az = 2              x + ay + 3z = 3              x + ay + z = 4
    ay + 4z = b                 x + 11y + az = b             ax + y + z = b
For which values of a does the system have a unique solution, and for which pairs of values (a, b) does the system have more than one solution? The value of b does not have any effect on whether the system has a unique solution. Why?
LINEAR COMBINATIONS, HOMOGENEOUS SYSTEMS
3.57. Write $v$ as a linear combination of $u_1, u_2, u_3$, where:
(a) $v = (4, -9, 2)$, $u_1 = (1, 2, -1)$, $u_2 = (1, 4, 2)$, $u_3 = (1, -3, 2)$;
(b) $v = (1, 3, 2)$, $u_1 = (1, 2, 1)$, $u_2 = (2, 6, 5)$, $u_3 = (1, 7, 8)$;
(c) $v = (1, 4, 6)$, $u_1 = (1, 1, 2)$, $u_2 = (2, 3, 5)$, $u_3 = (3, 5, 8)$.
3.58. Let $u_1 = (1, 1, 2)$, $u_2 = (1, 3, -2)$, $u_3 = (4, -2, -1)$ in $\mathbf{R}^3$. Show that $u_1, u_2, u_3$ are orthogonal, and write $v$ as a linear combination of $u_1, u_2, u_3$, where: (a) $v = (5, -5, 9)$, (b) $v = (1, -3, 3)$, (c) $v = (1, 1, 1)$.
(Hint: Use Fourier coefficients.)
3.59. Find the dimension and a basis of the general solution $W$ of each of the following homogeneous systems:
(a) x - y + 2z = 0          (b) x + 2y - 3z = 0          (c) x + 2y + 3z + t = 0
    2x + y + z = 0              2x + 5y + 2z = 0             2x + 4y + 7z + 4t = 0
    5x + y + 4z = 0             3x - y - 4z = 0              3x + 6y + 10z + 5t = 0
3.60. Find the dimension and a basis of the general solution $W$ of each of the following systems:
(a) x1 + 3x2 + 2x3 - x4 - x5 = 0            (b) 2x1 - 4x2 + 3x3 - x4 + 2x5 = 0
    2x1 + 6x2 + 5x3 + x4 - x5 = 0               3x1 - 6x2 + 5x3 - 2x4 + 4x5 = 0
    5x1 + 15x2 + 12x3 + x4 - 3x5 = 0            5x1 - 10x2 + 7x3 - 3x4 + 4x5 = 0
ECHELON MATRICES, ROW CANONICAL FORM
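3.61. Reduce each of the following matrices to echelon form and then to row canonical form:
\[ \text{(a) } \begin{bmatrix} 1 & 1 & 2 \\ 2 & 4 & 9 \\ 1 & 5 & 12 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 1 & 2 & -1 & 2 & 1 \\ 2 & 4 & 1 & -2 & 3 \\ 3 & 6 & 3 & -7 & 7 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 2 & 4 & 2 & -2 & 5 & 1 \\ 3 & 6 & 2 & 2 & 0 & 4 \\ 4 & 8 & 2 & 6 & -5 & 7 \end{bmatrix} \]
3.62. Reduce each of the following matrices to echelon form and then to row canonical form:
\[ \text{(a) } \begin{bmatrix} 1 & 2 & 1 & 2 & 1 & 2 \\ 2 & 4 & 3 & 5 & 5 & 7 \\ 3 & 6 & 4 & 9 & 10 & 11 \\ 1 & 2 & 4 & 3 & 6 & 9 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 0 & 1 & 2 & 3 \\ 0 & 3 & 8 & 12 \\ 0 & 0 & 4 & 6 \\ 0 & 2 & 7 & 10 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 1 & 3 & 1 & 3 \\ 2 & 8 & 5 & 10 \\ 1 & 7 & 7 & 11 \\ 3 & 11 & 7 & 15 \end{bmatrix} \]
3.63. Using only 0's and 1's, list all possible $2 \times 2$ matrices in row canonical form.
3.64. Using only 0's and 1's, find the number $n$ of possible $3 \times 3$ matrices in row canonical form.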
ELEMENTARY MATRICES, APPLICATIONS
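3.65. Let $e_1, e_2, e_3$ denote, respectively, the following elementary row operations:
"Interchange $R_2$ and $R_3$", "Replace $R_2$ by $3R_2$", "Replace $R_1$ by $2R_3 + R_1$"
(a) Find the corresponding elementary matrices $E_1, E_2, E_3$.
(b) Find the inverse operations $e_1^{-1}, e_2^{-1}, e_3^{-1}$; their corresponding elementary matrices $E_1', E_2', E_3'$; and the relationship between them and $E_1, E_2, E_3$.
(c) Describe the corresponding elementary column operations $f_1, f_2, f_3$.
(d) Find elementary matrices $F_1, F_2, F_3$ corresponding to $f_1, f_2, f_3$, and the relationship between them and $E_1, E_2, E_3$.
3.66. Express each of the following matrices as a product of elementary matrices:
\[ A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 3 & -6 \\ -2 & 4 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & 6 \\ -3 & -7 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 2 & 0 \\ 0 & 1 & 3 \\ 3 & 8 & 7 \end{bmatrix} \]
3.67. Find the inverse of each of the following matrices (if it exists):
\[ A = \begin{bmatrix} 1 & -2 & -1 \\ 2 & -3 & 1 \\ 3 & -4 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 & 3 \\ 2 & 6 & 1 \\ 3 & 10 & -1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 3 & -2 \\ 2 & 8 & -3 \\ 1 & 7 & 1 \end{bmatrix}, \quad D = \begin{bmatrix} 2 & 1 & -1 \\ 5 & 2 & -3 \\ 0 & 2 & 1 \end{bmatrix} \]
3.68. Find the inverse of each of the following $n \times n$ matrices:
(a) $A$ has 1's on the diagonal and superdiagonal (entries directly above the diagonal) and 0's elsewhere.
(b) $B$ has 1's on and above the diagonal, and 0's elsewhere.
LU FACTORIZATION
3.69. Find the LU factorization of each of the following matrices:
\[ \text{(a) } \begin{bmatrix} 1 & -1 & -1 \\ 3 & -4 & -2 \\ 2 & -3 & -2 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 1 & 3 & -1 \\ 2 & 5 & 1 \\ 3 & 4 & 2 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 2 & 3 & 6 \\ 4 & 7 & 9 \\ 3 & 5 & 4 \end{bmatrix}, \quad \text{(d) } \begin{bmatrix} 1 & 2 & 3 \\ 2 & 4 & 7 \\ 3 & 7 & 10 \end{bmatrix} \]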
3.70. Let $A$ be the matrix in Problem 3.69(a). Find $X_1, X_2, X_3, X_4$ where:
(a) $X_1$ is the solution of $AX = B_1$, where $B_1 = (1, 1, 1)^T$.
(b) For $k > 1$, $X_k$ is the solution of $AX = B_k$, where $B_k = B_{k-1} + X_{k-1}$.
3.71. Let $B$ be the matrix in Problem 3.69(b). Find the LDU factorization of $B$.
MISCELLANEOUS PROBLEMS
3.72. Consider the following systems in unknowns x and y:
(a) ax + by = 1      (b) ax + by = 0
    cx + dy = 0          cx + dy = 1
Suppose $D = ad - bc \neq 0$. Show that each system has the unique solution:
(a) $x = d/D$, $y = -c/D$, (b) $x = -b/D$, $y = a/D$.
3.73. Find the inverse of the row operation "Replace $R_i$ by $kR_j + k'R_i$ ($k' \neq 0$)".
3.74. Prove that deleting the last column of an echelon form (respectively, the row canonical form) of an augmented matrix $M = [A, B]$ yields an echelon form (respectively, the row canonical form) of $A$.
3.75. Let $e$ be an elementary row operation and $E$ its elementary matrix, and let $f$ be the corresponding elementary column operation and $F$ its elementary matrix. Prove:
(a) $f(A) = (e(A^T))^T$, (b) $F = E^T$, (c) $f(A) = AF$.
3.76. Matrix $A$ is equivalent to matrix $B$, written $A \approx B$, if there exist nonsingular matrices $P$ and $Q$ such that $B = PAQ$. Prove that $\approx$ is an equivalence relation, that is:
(a) $A \approx A$, (b) If $A \approx B$, then $B \approx A$, (c) If $A \approx B$ and $B \approx C$, then $A \approx C$.
Answers to Supplementary Problems
Notation: $A = [R_1; R_2; \ldots]$ denotes the matrix $A$ with rows $R_1, R_2, \ldots$. The elements in each row are separated by commas (which may be omitted with single digits), the rows are separated by semicolons, and 0 denotes a zero row. For example,
\[ A = [1, 2, 3, 4;\; 5, -6, 7, -8;\; 0] = \begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & -6 & 7 & -8 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
3.49. (a) no, (b) yes, (c) linear in x, y, z, not linear in x, y, z, k
3.50. (a) x = 2/π, (b) no solution, (c) every scalar k is a solution
3.51. (a) (2, -1), (b) no solution, (c) (5, 2), (d) (5 + 2a, a)
3.52. (a) a ≠ ±2, (2, 2), (-2, -2), (b) a ≠ ±6, (6, 4), (-6, -4), (c) a ≠ 5/2, (5/2, 6)
3.53. (a) (2, 1, 1/2), (b) no solution, (c) u = (-7a - 1, 2a + 2, a)
3.54. (a) (3, -1), (b) u = (-a + 2b, 1 + 2a - 2b, a, b), (c) no solution
3.55. (a) u = (a/2 + 2, a, 1/2), (b) u = ((7 - 5b - 4a)/2, a, (1 + b)/2, b)
3.56. (a) a ≠ ±3, (3, 3), (-3, -3), (b) a ≠ 5 and a ≠ -1, (5, 7), (-1, -5), (c) a ≠ 1 and a ≠ -2, (1, -2), (-2, 5)
3.57. (a) 2, -1, 3, (b) -5, 5/2, 1, (c) not possible
3.58. (a) 3, -2, 1, (b) 2/3, -1, 1/3, (c) 2/3, 1/7, 1/21
3.59. (a) dim W = 1, u1 = (-1, 1, 1), (b) dim W = 0, no basis, (c) dim W = 2, u1 = (-2, 1, 0, 0), u2 = (5, 0, -2, 1)
3.60. (a) dim W = 3, u1 = (-3, 1, 0, 0, 0), u2 = (7, 0, -3, 1, 0), u3 = (3, 0, -1, 0, 1), (b) dim W = 2, u1 = (2, 1, 0, 0, 0), u2 = (5, 0, -5, -3, 1)
3.61. (a) [1,0,-1/2; 0,1,5/2; 0], (b) [1,2,0,0,4/3; 0,0,1,0,-11/3; 0,0,0,1,-2], (c) [1,2,0,4,-5,3; 0,0,1,-5,15/2,-5/2; 0]
3.62. (a) [1,2,0,0,-4,-2; 0,0,1,0,1,2; 0,0,0,1,2,1; 0], (b) [0,1,0,0; 0,0,1,0; 0,0,0,1; 0], (c) [1,0,0,4; 0,1,0,-1; 0,0,1,2; 0]
3.63. 5: [1,0; 0,1], [1,1; 0,0], [1,0; 0,0], [0,1; 0,0], 0
3.64. 16
3.65. (a) [1,0,0; 0,0,1; 0,1,0], [1,0,0; 0,3,0; 0,0,1], [1,0,2; 0,1,0; 0,0,1],
(b) $R_2 \leftrightarrow R_3$; $\frac{1}{3}R_2 \to R_2$; $-2R_3 + R_1 \to R_1$; each $E_i' = E_i^{-1}$,
(c) $C_2 \leftrightarrow C_3$, $3C_2 \to C_2$, $2C_3 + C_1 \to C_1$, (d) each $F_i = E_i^T$.
3.66. A = [1,0; 3,1][1,0; 0,-2][1,2; 0,1], B is not invertible,
C = [1,0; -3/2,1][1,0; 0,2][1,6; 0,1][2,0; 0,1],
D = [100; 010; 301][100; 010; 021][100; 013; 001][120; 010; 001]
3.67. $A^{-1}$ = [-8,12,-5; -5,7,-3; 1,-2,1], B has no inverse,
$C^{-1}$ = [29/2,-17/2,7/2; -5/2,3/2,-1/2; 3,-2,1], $D^{-1}$ = [8,-3,-1; -5,2,1; 10,-4,-1]
3.68. $A^{-1}$ = [1,-1,1,-1,...; 0,1,-1,1,-1,...; 0,0,1,-1,1,-1,1,...; ...; ...; 0,...,0,1]
$B^{-1}$ has 1's on diagonal, -1's on superdiagonal, and 0's elsewhere.
3.69. (a) [100; 310; 211][1,-1,-1; 0,-1,1; 0,0,-1],
(b) [100; 210; 351][1,3,-1; 0,-1,3; 0,0,-10],
(c) [100; 210; 3/2,1/2,1][2,3,6; 0,1,-3; 0,0,-7/2],
(d) There is no LU decomposition.
3.70. $X_1 = [1, 1, -1]^T$, $B_2 = [2, 2, 0]^T$, $X_2 = [6, 4, 0]^T$, $B_3 = [8, 6, 0]^T$, $X_3 = [22, 16, -2]^T$, $B_4 = [30, 22, -2]^T$, $X_4 = [86, 62, -6]^T$
3.71. B = [100; 210; 351] diag(1, -1, -10) [1,3,-1; 0,1,-3; 0,0,1]
3.73. Replace $R_i$ by $-(k/k')R_j + (1/k')R_i$
3.75. (c) $f(A) = (e(A^T))^T = (EA^T)^T = (A^T)^TE^T = AF$
3.76. (a) $A = IAI$, (b) If $A = PBQ$, then $B = P^{-1}AQ^{-1}$, (c) If $A = PBQ$ and $B = P'CQ'$, then $A = (PP')C(Q'Q)$
CHAPTER 4
Vector Spaces
4.1 INTRODUCTION
This chapter introduces the underlying structure of linear algebra, that of a finite-dimensional vector space. The definition of a vector space $V$, whose elements are called vectors, involves an arbitrary field $K$, whose elements are called scalars. The following notation will be used (unless otherwise stated or implied):
$V$                   the given vector space
$u, v, w$             vectors in $V$
$K$                   the given number field
$a, b, c$ or $k$      scalars in $K$
Almost nothing essential is lost if the reader assumes that $K$ is the real field $\mathbf{R}$ or the complex field $\mathbf{C}$.
The reader might suspect that the real line $\mathbf{R}$ has "dimension" one, the cartesian plane $\mathbf{R}^2$ has "dimension" two, and the space $\mathbf{R}^3$ has "dimension" three. This chapter formalizes the notion of "dimension", and this definition will agree with the reader's intuition.
Throughout this text, we will use the following set notation:
$a \in A$             Element $a$ belongs to set $A$
$a, b \in A$          Elements $a$ and $b$ belong to $A$
$\forall x \in A$     For every $x$ in $A$
$\exists x \in A$     There exists an $x$ in $A$
$A \subseteq B$       $A$ is a subset of $B$
$A \cap B$            Intersection of $A$ and $B$
$A \cup B$            Union of $A$ and $B$
$\emptyset$           Empty set
4.2 VECTOR SPACES
The following defines the notion of a vector space $V$ where $K$ is the field of scalars.
Definition: Let $V$ be a nonempty set with two operations:
(i) Vector Addition: This assigns to any $u, v \in V$ a sum $u + v$ in $V$.
(ii) Scalar Multiplication: This assigns to any $u \in V$, $k \in K$ a product $ku \in V$.
Then $V$ is called a vector space (over the field $K$) if the following axioms hold for any vectors $u, v, w \in V$:
[A1] $(u + v) + w = u + (v + w)$.
[A2] There is a vector in $V$, denoted by 0 and called the zero vector, such that, for any $u \in V$, $u + 0 = 0 + u = u$.
[A3] For each $u \in V$, there is a vector in $V$, denoted by $-u$, and called the negative of $u$, such that $u + (-u) = (-u) + u = 0$.
[A4] $u + v = v + u$.
[M1] $k(u + v) = ku + kv$, for any scalar $k \in K$.
[M2] $(a + b)u = au + bu$, for any scalars $a, b \in K$.
[M3] $(ab)u = a(bu)$, for any scalars $a, b \in K$.
[M4] $1u = u$, for the unit scalar $1 \in K$.
The above axioms naturally split into two sets (as indicated by the labeling of the axioms). The first four are only concerned with the additive structure of $V$, and can be summarized by saying $V$ is a commutative group under addition. This means:
(a) Any sum $v_1 + v_2 + \cdots + v_m$ of vectors requires no parentheses and does not depend on the order of the summands.
(b) The zero vector 0 is unique, and the negative $-u$ of a vector $u$ is unique.
(c) (Cancellation Law) If $u + w = v + w$, then $u = v$.
Also, subtraction in $V$ is defined by $u - v = u + (-v)$, where $-v$ is the unique negative of $v$.
On the other hand, the remaining four axioms are concerned with the "action" of the field $K$ of scalars on the vector space $V$. Using these additional axioms we prove (Problem 4.2) the following simple properties of a vector space.
Theorem 4.1: Let $V$ be a vector space over a field $K$.
(i) For any scalar $k \in K$ and $0 \in V$, $k0 = 0$.
(ii) For $0 \in K$ and any vector $u \in V$, $0u = 0$.
(iii) If $ku = 0$, where $k \in K$ and $u \in V$, then $k = 0$ or $u = 0$.
(iv) For any $k \in K$ and any $u \in V$, $(-k)u = k(-u) = -ku$.
4.3 EXAMPLES OF VECTOR SPACES
This section lists important examples of vector spaces that will be used throughout the text.
Space $K^n$
Let $K$ be an arbitrary field. The notation $K^n$ is frequently used to denote the set of all $n$-tuples of elements in $K$. Here $K^n$ is a vector space over $K$ using the following operations:
(i) Vector Addition: $(a_1, a_2, \ldots, a_n) + (b_1, b_2, \ldots, b_n) = (a_1 + b_1, a_2 + b_2, \ldots, a_n + b_n)$
(ii) Scalar Multiplication: $k(a_1, a_2, \ldots, a_n) = (ka_1, ka_2, \ldots, ka_n)$
The zero vector in $K^n$ is the $n$-tuple of zeros,
\[ 0 = (0, 0, \ldots, 0) \]
and the negative of a vector is defined by
\[ -(a_1, a_2, \ldots, a_n) = (-a_1, -a_2, \ldots, -a_n) \]
Observe that these are the same as the operations defined for $\mathbf{R}^n$ in Chapter 1. The proof that $K^n$ is a vector space is identical to the proof of Theorem 1.1, which we now regard as stating that $\mathbf{R}^n$ with the operations defined there is a vector space over $\mathbf{R}$.
Polynomial Space $\mathbf{P}(t)$
Let $\mathbf{P}(t)$ denote the set of all polynomials of the form
\[ p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s \qquad (s = 1, 2, \ldots) \]
where the coefficients $a_i$ belong to a field $K$. Then $\mathbf{P}(t)$ is a vector space over $K$ using the following operations:
(i) Vector Addition: Here $p(t) + q(t)$ in $\mathbf{P}(t)$ is the usual operation of addition of polynomials.
(ii) Scalar Multiplication: Here $kp(t)$ in $\mathbf{P}(t)$ is the usual operation of the product of a scalar $k$ and a polynomial $p(t)$.
The zero polynomial 0 is the zero vector in $\mathbf{P}(t)$.
Polynomial Space $\mathbf{P}_n(t)$
Let $\mathbf{P}_n(t)$ denote the set of all polynomials $p(t)$ over a field $K$, where the degree of $p(t)$ is less than or equal to $n$, that is,
\[ p(t) = a_0 + a_1t + a_2t^2 + \cdots + a_st^s \]
where $s \leq n$. Then $\mathbf{P}_n(t)$ is a vector space over $K$ with respect to the usual operations of addition of polynomials and of multiplication of a polynomial by a constant (just like the vector space $\mathbf{P}(t)$ above). We include the zero polynomial 0 as an element of $\mathbf{P}_n(t)$, even though its degree is undefined.
Matrix Space $\mathbf{M}_{m,n}$
The notation $\mathbf{M}_{m,n}$, or simply $\mathbf{M}$, will be used to denote the set of all $m \times n$ matrices with entries in a field $K$. Then $\mathbf{M}_{m,n}$ is a vector space over $K$ with respect to the usual operations of matrix addition and scalar multiplication of matrices, as indicated by Theorem 2.1.
Function Space $F(X)$
Let $X$ be a nonempty set and let $K$ be an arbitrary field. Let $F(X)$ denote the set of all functions of $X$ into $K$. [Note that $F(X)$ is nonempty, since $X$ is nonempty.] Then $F(X)$ is a vector space over $K$ with respect to the following operations:
(i) Vector Addition: The sum of two functions $f$ and $g$ in $F(X)$ is the function $f + g$ in $F(X)$ defined by
\[ (f + g)(x) = f(x) + g(x) \qquad \forall x \in X \]
(ii) Scalar Multiplication: The product of a scalar $k \in K$ and a function $f$ in $F(X)$ is the function $kf$ in $F(X)$ defined by
\[ (kf)(x) = kf(x) \qquad \forall x \in X \]
The zero vector in $F(X)$ is the zero function 0, which maps every $x \in X$ into the zero element $0 \in K$, that is,
\[ 0(x) = 0 \qquad \forall x \in X \]
Also, for any function $f$ in $F(X)$, the function $-f$ in $F(X)$ defined by
\[ (-f)(x) = -f(x) \qquad \forall x \in X \]
is the negative of the function $f$.
Fields and Subfields
Suppose a field $E$ is an extension of a field $K$, that is, suppose $E$ is a field that contains $K$ as a subfield. Then $E$ may be viewed as a vector space over $K$ using the following operations:
(i) Vector Addition: Here $u + v$ in $E$ is the usual addition in $E$.
(ii) Scalar Multiplication: Here $ku$ in $E$, where $k \in K$ and $u \in E$, is the usual product of $k$ and $u$ as elements of $E$.
That is, the eight axioms of a vector space are satisfied by $E$ and its subfield $K$ with respect to the above two operations.
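The pointwise operations on $F(X)$ translate directly into code when functions are treated as values. The sketch below takes $X$ and $K$ to be Python numbers purely for illustration; the helper names `add`, `scale`, `zero`, and `neg` are ours.

```python
def add(f, g):
    """Vector addition in F(X): (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)

def scale(k, f):
    """Scalar multiplication in F(X): (kf)(x) = k * f(x)."""
    return lambda x: k * f(x)

def zero(x):
    """The zero function, which maps every x to 0."""
    return 0

def neg(f):
    """The negative of f: (-f)(x) = -f(x)."""
    return scale(-1, f)

f = lambda x: x * x
g = lambda x: 3 * x
h = add(f, scale(2, g))      # h(x) = x^2 + 6x
print(h(2))                  # 16
print(add(f, neg(f))(5))     # 0, the value of the zero function at 5
```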
4.4 LINEAR COMBINATIONS, SPANNING SETS
Let F be a vector space over a field K. A vector i; in F is a linear combination of vectors Ui,U2, ... ,u^^
in V if there exist scalars a^, ^2, ..., a^ in AT such that
V = a^Ui + B2^2 + . . . + (^m^m
Alternatively, i; is a linear combination of Ui,U2, ... ,Uf^ if there is a solution to the vector equation
ijMj
■ X2 U2
where Xj, X2, ..., x^ are unknown scalars.
Example 4.1. (Linear Combinations in R^n) Suppose we want to express v = (3, 7, -4) in R^3 as a linear combination of
the vectors
u_1 = (1, 2, 3), u_2 = (2, 3, 7), u_3 = (3, 5, 6)
We seek scalars x, y, z such that v = x u_1 + y u_2 + z u_3; that is,
[ 3]     [1]     [2]     [3]
[ 7] = x [2] + y [3] + z [5]
[-4]     [3]     [7]     [6]
or
x + 2y + 3z = 3
2x + 3y + 5z = 7
3x + 7y + 6z = -4
(For notational convenience, we have written the vectors in R^3 as columns, since it is then easier to find the equivalent
system of linear equations.) Reducing the system to echelon form yields
x + 2y + 3z = 3
-y - z = 1
y - 3z = -13
and then
x + 2y + 3z = 3
-y - z = 1
-4z = -12
Back-substitution yields the solution x = 2, y = -4, z = 3. Thus v = 2u_1 - 4u_2 + 3u_3.
Remark: Generally speaking, the question of expressing a given vector v in K^n as a linear
combination of vectors u_1, u_2, ..., u_m in K^n is equivalent to solving a system AX = B of linear equations,
where v is the column B of constants, and the u's are the columns of the coefficient matrix A. Such a system
may have a unique solution (as above), many solutions, or no solution. The last case (no solution) means
that v cannot be written as a linear combination of the u's.
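As a check on Example 4.1, the system AX = B can be solved mechanically. The following is a minimal sketch, assuming a square coefficient matrix with a unique solution; the function name solve_unique is introduced here for illustration, and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction

def solve_unique(A, b):
    """Solve AX = b by Gauss-Jordan elimination for a square matrix A.

    Returns the solution as a list, or None when there is no unique solution.
    """
    n = len(A)
    # Augmented matrix [A | b] with exact rational entries.
    M = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Look for a usable pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if M[r][col] != 0), None)
        if pivot is None:
            return None  # singular: many solutions or no solution
        M[col], M[pivot] = M[pivot], M[col]
        lead = M[col][col]
        M[col] = [x / lead for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[-1] for row in M]

# The u's of Example 4.1 become the columns of the coefficient matrix.
u1, u2, u3 = (1, 2, 3), (2, 3, 7), (3, 5, 6)
v = (3, 7, -4)
A = [list(row) for row in zip(u1, u2, u3)]
coeffs = solve_unique(A, v)
print([int(c) for c in coeffs])  # the scalars x, y, z
```

A return value of None covers both remaining cases of the remark (many solutions or none); telling them apart would need a rank comparison, which this sketch does not attempt.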
Example 4.2. (Linear combinations in P(t)) Suppose we want to express the polynomial v = 3t^2 + 5t - 5 as a linear
combination of the polynomials
p_1 = t^2 + 2t + 1, p_2 = 2t^2 + 5t + 4, p_3 = t^2 + 3t + 6
We seek scalars x, y, z such that v = x p_1 + y p_2 + z p_3; that is,
3t^2 + 5t - 5 = x(t^2 + 2t + 1) + y(2t^2 + 5t + 4) + z(t^2 + 3t + 6) (*)
There are two ways to proceed from here.
(1) Expand the right-hand side of (*) obtaining:
3t^2 + 5t - 5 = xt^2 + 2xt + x + 2yt^2 + 5yt + 4y + zt^2 + 3zt + 6z
= (x + 2y + z)t^2 + (2x + 5y + 3z)t + (x + 4y + 6z)
Set coefficients of the same powers of t equal to each other, and reduce the system to echelon form:
x + 2y + z = 3
2x + 5y + 3z = 5
x + 4y + 6z = -5
or
x + 2y + z = 3
y + z = -1
2y + 5z = -8
or
x + 2y + z = 3
y + z = -1
3z = -6
The system is in triangular form and has a solution. Back-substitution yields the solution x = 3, y = 1, z = -2.
Thus
v = 3p_1 + p_2 - 2p_3
(2) The equation (*) is actually an identity in the variable t; that is, the equation holds for any value of t. We can
obtain three equations in the unknowns x, y, z by setting t equal to any three values. For example:
Set t = 0 in (*) to obtain: x + 4y + 6z = -5
Set t = 1 in (*) to obtain: 4x + 11y + 10z = 3
Set t = -1 in (*) to obtain: y + 4z = -7
Reducing this system to echelon form and solving by back-substitution again yields the solution x = 3, y = 1,
z = -2. Thus (again) v = 3p_1 + p_2 - 2p_3.
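The result of Example 4.2 can be checked both ways with a short script: by comparing coefficients, and by evaluating both sides at sample values of t. This is a sketch only; polynomials are stored as coefficient lists starting from the constant term, a representation chosen here for convenience, and the helper evaluate is an assumed name.

```python
def evaluate(coeffs, t):
    """Evaluate the polynomial [c0, c1, c2, ...] at t using Horner's rule."""
    result = 0
    for c in reversed(coeffs):
        result = result * t + c
    return result

# Coefficients are listed from the constant term upward.
v  = [-5, 5, 3]   # 3t^2 + 5t - 5
p1 = [1, 2, 1]    # t^2 + 2t + 1
p2 = [4, 5, 2]    # 2t^2 + 5t + 4
p3 = [6, 3, 1]    # t^2 + 3t + 6
x, y, z = 3, 1, -2

# Method (1): compare coefficients of x*p1 + y*p2 + z*p3 with those of v.
combo = [x * a + y * b + z * c for a, b, c in zip(p1, p2, p3)]
print(combo == v)

# Method (2): the identity (*) must hold at every value of t; spot-check a few.
samples = [0, 1, -1, 2, 10]
print(all(evaluate(combo, t) == evaluate(v, t) for t in samples))
```

Agreement at three distinct values of t already forces two polynomials of degree at most 2 to be equal; the extra sample points are only a sanity check.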
Spanning Sets
Let V be a vector space over K. Vectors u_1, u_2, ..., u_m in V are said to span V or to form a spanning
set of V if every v in V is a linear combination of the vectors u_1, u_2, ..., u_m, that is, if there exist scalars
a_1, a_2, ..., a_m in K such that
v = a_1 u_1 + a_2 u_2 + ... + a_m u_m
The following remarks follow directly from the definition.
Remark 1: Suppose u_1, u_2, ..., u_m span V. Then, for any vector w, the set w, u_1, u_2, ..., u_m also
spans V.
Remark 2: Suppose u_1, u_2, ..., u_m span V and suppose u_k is a linear combination of some of the
other u's. Then the u's without u_k also span V.
Remark 3: Suppose u_1, u_2, ..., u_m span V and suppose one of the u's is the zero vector. Then the u's
without the zero vector also span V.
Example 4.3. Consider the vector space V = R^3.
(a) We claim that the following vectors form a spanning set of R^3:
e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1)
Specifically, if v = (a, b, c) is any vector in R^3, then
v = a e_1 + b e_2 + c e_3
For example, v = (5, -6, 2) = 5e_1 - 6e_2 + 2e_3.
(b) We claim that the following vectors also form a spanning set of R^3:
w_1 = (1, 1, 1), w_2 = (1, 1, 0), w_3 = (1, 0, 0)
Specifically, if v = (a, b, c) is any vector in R^3, then (Problem 4.62)
v = (a, b, c) = c w_1 + (b - c) w_2 + (a - b) w_3
For example, v = (5, -6, 2) = 2w_1 - 8w_2 + 11w_3.
(c) One can show (Problem 3.24) that v = (2, 7, 8) cannot be written as a linear combination of the vectors
u_1 = (1, 2, 3), u_2 = (1, 3, 5), u_3 = (1, 5, 9)
Accordingly, u_1, u_2, u_3 do not span R^3.
Example 4.4. Consider the vector space V = P_n(t) consisting of all polynomials of degree ≤ n.
(a) Clearly every polynomial in P_n(t) can be expressed as a linear combination of the n + 1 polynomials
1, t, t^2, t^3, ..., t^n
Thus these powers of t (where 1 = t^0) form a spanning set for P_n(t).
(b) One can also show that, for any scalar c, the following n + 1 powers of t - c,
1, t - c, (t - c)^2, (t - c)^3, ..., (t - c)^n
(where (t - c)^0 = 1), also form a spanning set for P_n(t).
Example 4.5. Consider the vector space M = M_{2,2} consisting of all 2 x 2 matrices, and consider the following four
matrices in M:
E_11 = [1 0]    E_12 = [0 1]    E_21 = [0 0]    E_22 = [0 0]
       [0 0],          [0 0],          [1 0],          [0 1]
Then clearly any matrix A in M can be written as a linear combination of the four matrices. For example,
[5 -6]
[7  8] = 5E_11 - 6E_12 + 7E_21 + 8E_22
Accordingly, the four matrices E_11, E_12, E_21, E_22 span M.
4.5 SUBSPACES
This section introduces the important notion of a subspace.
Definition: Let V be a vector space over a field K and let W be a subset of V. Then W is a subspace of V
if W is itself a vector space over K with respect to the operations of vector addition and
scalar multiplication on V.
The way in which one shows that any set W is a vector space is to show that W satisfies the eight
axioms of a vector space. However, if W is a subset of a vector space V, then some of the axioms
automatically hold in W, since they already hold in V. Simple criteria for identifying subspaces follow.
Theorem 4.2: Suppose W is a subset of a vector space V. Then W is a subspace of V if the following two
conditions hold:
(a) The zero vector 0 belongs to W.
(b) For every u, v ∈ W, k ∈ K: (i) The sum u + v ∈ W. (ii) The multiple ku ∈ W.
Property (i) in (b) states that W is closed under vector addition, and property (ii) in (b) states that W is
closed under scalar multiplication. Both properties may be combined into the following equivalent single
statement:
(b') For every u, v ∈ W, a, b ∈ K, the linear combination au + bv ∈ W.
Now let V be any vector space. Then V automatically contains two subspaces, the set {0} consisting of
the zero vector alone and the whole space V itself. These are sometimes called the trivial subspaces of V.
Examples of nontrivial subspaces follow.
Example 4.6. Consider the vector space V = R^3.
(a) Let U consist of all vectors in R^3 whose entries are equal; that is,
U = {(a, b, c) : a = b = c}
For example, (1, 1, 1), (-3, -3, -3), and (-2, -2, -2) are vectors in U. Geometrically, U is the line
through the origin O and the point (1, 1, 1) as shown in Fig. 4-1(a). Clearly 0 = (0, 0, 0) belongs to U, since all
entries in 0 are equal. Further, suppose u and v are arbitrary vectors in U, say, u = (a, a, a) and v = (b, b, b).
Then, for any scalar k ∈ R, the following are also vectors in U:
u + v = (a + b, a + b, a + b) and ku = (ka, ka, ka)
Thus U is a subspace of R^3.
(b) Let W be any plane in R^3 passing through the origin, as pictured in Fig. 4-1(b). Then 0 = (0, 0, 0) belongs to W,
since we assumed W passes through the origin O. Further, suppose u and v are vectors in W. Then u and v may be
viewed as arrows in the plane W emanating from the origin O, as in Fig. 4-1(b). The sum u + v and any multiple
ku of u also lie in the plane W. Thus W is a subspace of R^3.
Fig. 4-1
Example 4.7
(a) Let V = M_{n,n}, the vector space of n x n matrices. Let W_1 be the subset of all (upper) triangular matrices and let
W_2 be the subset of all symmetric matrices. Then W_1 is a subspace of V, since W_1 contains the zero matrix 0 and
W_1 is closed under matrix addition and scalar multiplication, that is, the sum and scalar multiple of such
triangular matrices are also triangular. Similarly, W_2 is a subspace of V.
(b) Let V = P(t), the vector space of polynomials. Then the space P_n(t) of polynomials of degree at most n may
be viewed as a subspace of P(t). Let Q(t) be the collection of polynomials with only even powers of t. For
example, the following are polynomials in Q(t):
p_1 = 3 + 4t^2 - 5t^6 and p_2 = 6 - 7t^4 + 9t^6 + 3t^12
(We assume that any constant k = kt^0 is an even power of t.) Then Q(t) is a subspace of P(t).
(c) Let V be the vector space of real-valued functions. Then the collection W_1 of continuous functions and the
collection W_2 of differentiable functions are subspaces of V.
Intersection of Subspaces
Let U and W be subspaces of a vector space V. We show that the intersection U ∩ W is also a subspace
of V. Clearly, 0 ∈ U and 0 ∈ W, since U and W are subspaces; whence 0 ∈ U ∩ W. Now suppose u and v
belong to the intersection U ∩ W. Then u, v ∈ U and u, v ∈ W. Further, since U and W are subspaces, for
any scalars a, b ∈ K,
au + bv ∈ U and au + bv ∈ W
Thus au + bv ∈ U ∩ W. Therefore U ∩ W is a subspace of V.
The above result generalizes as follows.
Theorem 4.3: The intersection of any number of subspaces of a vector space V is a subspace of V.
Solution Space of a Homogeneous System
Consider a system AX = B of linear equations in n unknowns. Then every solution u may be viewed
as a vector in K^n. Thus the solution set of such a system is a subset of K^n. Now suppose the system is
homogeneous, that is, suppose the system has the form AX = 0. Let W be its solution set. Since A0 = 0,
the zero vector 0 ∈ W. Moreover, suppose u and v belong to W. Then u and v are solutions of AX = 0, or, in
other words, Au = 0 and Av = 0. Therefore, for any scalars a and b, we have
A(au + bv) = aAu + bAv = a0 + b0 = 0 + 0 = 0
Thus au + bv belongs to W, since it is a solution of AX = 0. Accordingly, W is a subspace of K^n.
We state the above result formally.
Theorem 4.4: The solution set W of a homogeneous system AX = 0 in n unknowns is a subspace of K^n.
We emphasize that the solution set of a nonhomogeneous system AX = B is not a subspace of K^n. In
fact, the zero vector 0 does not belong to its solution set.
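The closure argument behind Theorem 4.4 can be illustrated numerically. The matrix A, the two solutions u and v, and the right-hand side B below are hypothetical values chosen for this sketch, and the helper names mat_vec and lin_comb are likewise assumptions.

```python
from fractions import Fraction

def mat_vec(A, x):
    """Return the product Ax as a list."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def lin_comb(a, u, b, v):
    """Return the linear combination au + bv."""
    return [a * ui + b * vi for ui, vi in zip(u, v)]

A = [[1, 2, -1], [2, 4, -2]]   # hypothetical coefficient matrix
u = [1, 0, 1]                  # a solution of AX = 0
v = [0, 1, 2]                  # another solution of AX = 0
zero = [0, 0]

print(mat_vec(A, u) == zero and mat_vec(A, v) == zero)

# Any linear combination au + bv is again a solution of AX = 0.
for a, b in [(1, 1), (3, -2), (Fraction(1, 2), 5)]:
    w = lin_comb(a, u, b, v)
    print(w, mat_vec(A, w) == zero)

# For a nonhomogeneous system AX = B with B != 0, the zero vector is not a solution.
B = [1, 2]
print(mat_vec(A, [0, 0, 0]) == B)
```

A finite number of checks is not a proof, of course; the proof is the one-line computation A(au + bv) = aAu + bAv given in the text.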
4.6 LINEAR SPANS, ROW SPACE OF A MATRIX
Suppose u_1, u_2, ..., u_m are any vectors in a vector space V. Recall (Section 4.4) that any vector of the
form a_1 u_1 + a_2 u_2 + ... + a_m u_m, where the a_i are scalars, is called a linear combination of u_1, u_2, ..., u_m.
The collection of all such linear combinations, denoted by
span(u_1, u_2, ..., u_m) or span(u_i)
is called the linear span of u_1, u_2, ..., u_m.
Clearly the zero vector 0 belongs to span(u_i), since
0 = 0u_1 + 0u_2 + ... + 0u_m
Furthermore, suppose v and v' belong to span(u_i), say,
v = a_1 u_1 + a_2 u_2 + ... + a_m u_m and v' = b_1 u_1 + b_2 u_2 + ... + b_m u_m
Then, for any scalar k ∈ K, we have
v + v' = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_m + b_m)u_m
and
kv = ka_1 u_1 + ka_2 u_2 + ... + ka_m u_m
Thus v + v' and kv also belong to span(u_i). Accordingly, span(u_i) is a subspace of V.
More generally, for any subset S of V, span(S) consists of all linear combinations of vectors in S or,
when S = ∅, span(S) = {0}. Thus, in particular, S is a spanning set (Section 4.4) of span(S).
The following theorem, which was partially proved above, holds.
Theorem 4.5: Let S be a subset of a vector space V.
(i) Then span(S) is a subspace of V that contains S.
(ii) If W is a subspace of V containing S, then span(S) ⊆ W.
Condition (ii) in Theorem 4.5 may be interpreted as saying that span(S) is the "smallest" subspace of
V containing S.
Example 4.8. Consider the vector space V = R^3.
(a) Let u be any nonzero vector in R^3. Then span(u) consists of all scalar multiples of u. Geometrically, span(u) is
the line through the origin O and the endpoint of u, as shown in Fig. 4-2(a).
Fig. 4-2
(b) Let u and v be vectors in R^3 that are not multiples of each other. Then span(u, v) is the plane through the origin O
and the endpoints of u and v as shown in Fig. 4-2(b).
(c) Consider the vectors e_1 = (1, 0, 0), e_2 = (0, 1, 0), e_3 = (0, 0, 1) in R^3. Recall [Example 4.3(a)] that every vector
in R^3 is a linear combination of e_1, e_2, e_3. That is, e_1, e_2, e_3 form a spanning set of R^3. Accordingly,
span(e_1, e_2, e_3) = R^3.
Row Space of a Matrix
Let A = [a_ij] be an arbitrary m x n matrix over a field K. The rows of A,
R_1 = (a_11, a_12, ..., a_1n), R_2 = (a_21, a_22, ..., a_2n), ..., R_m = (a_m1, a_m2, ..., a_mn)
may be viewed as vectors in K^n; hence they span a subspace of K^n called the row space of A and denoted
by rowsp(A). That is,
rowsp(A) = span(R_1, R_2, ..., R_m)
Analogously, the columns of A may be viewed as vectors in K^m; hence they span a subspace of K^m called the
column space of A and denoted by colsp(A). Observe that colsp(A) = rowsp(A^T).
Recall that matrices A and B are row equivalent, written A ~ B, if B can be obtained from A by a
sequence of elementary row operations. Now suppose M is the matrix obtained by applying one of the
following elementary row operations on a matrix A:
(1) Interchange R_i and R_j. (2) Replace R_i by kR_i. (3) Replace R_j by kR_i + R_j.
Then each row of M is a row of A or a linear combination of rows of A. Hence the row space of M is
contained in the row space of A. On the other hand, we can apply the inverse elementary row operation on
M to obtain A; hence the row space of A is contained in the row space of M. Accordingly, A and M have
the same row space. This will be true each time we apply an elementary row operation. Thus we have
proved the following theorem.
Theorem 4.6: Row equivalent matrices have the same row space.
We are now able to prove (Problems 4.45-4.47) basic results on row equivalence (which first
appeared as Theorems 3.6 and 3.7 in Chapter 3).
Theorem 4.7: Suppose A = [a_ij] and B = [b_ij] are row equivalent echelon matrices with respective pivot
entries
a_{1j_1}, a_{2j_2}, ..., a_{rj_r} and b_{1k_1}, b_{2k_2}, ..., b_{sk_s}
Then A and B have the same number of nonzero rows, that is, r = s, and their pivot entries
are in the same positions, that is, j_1 = k_1, j_2 = k_2, ..., j_r = k_r.
Theorem 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row space if
and only if they have the same nonzero rows.
Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form.
We apply the above results in the next example.
Example 4.9. Consider the following two sets of vectors in R^4:
u_1 = (1, 2, -1, 3), u_2 = (2, 4, 1, -2), u_3 = (3, 6, 3, -7)
w_1 = (1, 2, -4, 11), w_2 = (2, 4, -5, 14)
Let U = span(u_i) and W = span(w_i). There are two ways to show that U = W.
(a) Show that each u_i is a linear combination of w_1 and w_2, and show that each w_i is a linear combination of u_1, u_2,
u_3. Observe that we have to show that six systems of linear equations are consistent.
(b) Form the matrix A whose rows are u_1, u_2, u_3 and row reduce A to row canonical form, and form the matrix B
whose rows are w_1 and w_2 and row reduce B to row canonical form:
    [1 2 -1  3]   [1 2 -1   3]   [1 2 0  1/3]
A = [2 4  1 -2] ~ [0 0  3  -8] ~ [0 0 1 -8/3]
    [3 6  3 -7]   [0 0  6 -16]   [0 0 0    0]

B = [1 2 -4 11] ~ [1 2 -4 11] ~ [1 2 0  1/3]
    [2 4 -5 14]   [0 0  3 -8]   [0 0 1 -8/3]
Since the nonzero rows of the matrices in row canonical form are identical, the row spaces of A and B are equal.
Therefore, U = W.
Clearly, the method in (b) is more efficient than the method in (a).
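Method (b) of Example 4.9 can be sketched as follows, under the assumption that exact rational arithmetic is acceptable. The function names rref, nonzero_rows, and same_row_space are introduced here for illustration; the comparison relies on Theorem 4.8.

```python
from fractions import Fraction

def rref(rows):
    """Return the row canonical form of a matrix given as a list of rows."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivot_row = 0
    for col in range(len(M[0])):
        # Find a nonzero entry in this column at or below the current pivot row.
        pivot = next((r for r in range(pivot_row, len(M)) if M[r][col] != 0), None)
        if pivot is None:
            continue
        M[pivot_row], M[pivot] = M[pivot], M[pivot_row]
        lead = M[pivot_row][col]
        M[pivot_row] = [x / lead for x in M[pivot_row]]      # make the pivot 1
        for r in range(len(M)):
            if r != pivot_row and M[r][col] != 0:            # clear the column
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[pivot_row])]
        pivot_row += 1
        if pivot_row == len(M):
            break
    return M

def nonzero_rows(M):
    return [row for row in M if any(x != 0 for x in row)]

def same_row_space(rows_a, rows_b):
    """Row canonical matrices have equal row spaces iff their nonzero rows agree."""
    return nonzero_rows(rref(rows_a)) == nonzero_rows(rref(rows_b))

A = [[1, 2, -1, 3], [2, 4, 1, -2], [3, 6, 3, -7]]   # rows u_1, u_2, u_3
B = [[1, 2, -4, 11], [2, 4, -5, 14]]                # rows w_1, w_2
print(same_row_space(A, B))
for row in nonzero_rows(rref(A)):
    print([str(x) for x in row])
```

Only two reductions are needed here, in contrast to the systems of equations required by method (a).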
4.7 LINEAR DEPENDENCE AND INDEPENDENCE
Let V be a vector space over a field K. The following defines the notion of linear dependence and
independence of vectors over K. (One usually suppresses mentioning K when the field is understood.) This
concept plays an essential role in the theory of linear algebra and in mathematics in general.
Definition: We say that the vectors v_1, v_2, ..., v_m in V are linearly dependent if there exist scalars
a_1, a_2, ..., a_m in K, not all of them 0, such that
a_1 v_1 + a_2 v_2 + ... + a_m v_m = 0
Otherwise, we say that the vectors are linearly independent.
The above definition may be restated as follows. Consider the vector equation
x_1 v_1 + x_2 v_2 + ... + x_m v_m = 0 (*)
where the x's are unknown scalars. This equation always has the zero solution x_1 = 0, x_2 = 0, ..., x_m = 0.
Suppose this is the only solution, that is, suppose we can show:
x_1 v_1 + x_2 v_2 + ... + x_m v_m = 0 implies x_1 = 0, x_2 = 0, ..., x_m = 0
Then the vectors v_1, v_2, ..., v_m are linearly independent. On the other hand, suppose the equation (*) has a
nonzero solution; then the vectors are linearly dependent.
A set S = {v_1, v_2, ..., v_m} of vectors in V is linearly dependent or independent according as the
vectors v_1, v_2, ..., v_m are linearly dependent or independent.
An infinite set S of vectors is linearly dependent or independent according as there do or do not exist
vectors v_1, v_2, ..., v_k in S that are linearly dependent.
Warning: The set S = {v_1, v_2, ..., v_m} above represents a list or, in other words, a finite sequence of
vectors where the vectors are ordered and repetition is permitted.
The following remarks follow directly from the above definition.
Remark 1: Suppose 0 is one of the vectors v_1, v_2, ..., v_m, say v_1 = 0. Then the vectors must be
linearly dependent, since we have the following linear combination where the coefficient of v_1 ≠ 0:
1v_1 + 0v_2 + ... + 0v_m = 1 · 0 + 0 + ... + 0 = 0
Remark 2: Suppose v is a nonzero vector. Then v, by itself, is linearly independent, since
kv = 0, v ≠ 0 implies k = 0
Remark 3: Suppose two of the vectors v_1, v_2, ..., v_m are equal or one is a scalar multiple of the
other, say v_1 = kv_2. Then the vectors must be linearly dependent, since we have the following linear
combination where the coefficient of v_1 ≠ 0:
v_1 - kv_2 + 0v_3 + ... + 0v_m = 0
Remark 4: Two vectors v_1 and v_2 are linearly dependent if and only if one of them is a multiple of
the other.
Remark 5: If the set {v_1, ..., v_m} is linearly independent, then any rearrangement of the vectors
{v_{i_1}, v_{i_2}, ..., v_{i_m}} is also linearly independent.
Remark 6: If a set S of vectors is linearly independent, then any subset of S is linearly independent.
Alternatively, if S contains a linearly dependent subset, then S is linearly dependent.
Example 4.10
(a) Let u = (1, 1, 0), v = (1, 3, 2), w = (4, 9, 5). Then u, v, w are linearly dependent, since
3u + 5v - 2w = 3(1, 1, 0) + 5(1, 3, 2) - 2(4, 9, 5) = (0, 0, 0) = 0
(b) We show that the vectors u = (1, 2, 3), v = (2, 5, 7), w = (1, 3, 5) are linearly independent. We form the vector
equation xu + yv + zw = 0, where x, y, z are unknown scalars. This yields
  [1]     [2]     [1]   [0]
x [2] + y [5] + z [3] = [0]
  [3]     [7]     [5]   [0]
or
x + 2y + z = 0
2x + 5y + 3z = 0
3x + 7y + 5z = 0
or
x + 2y + z = 0
y + z = 0
2z = 0
Back-substitution yields x = 0, y = 0, z = 0. We have shown that
xu + yv + zw = 0 implies x = 0, y = 0, z = 0
Accordingly, u, v, w are linearly independent.
(c) Let V be the vector space of functions from R into R. We show that the functions f(t) = sin t, g(t) = e^t, h(t) = t^2
are linearly independent. We form the vector (function) equation xf + yg + zh = 0, where x, y, z are unknown
scalars. This function equation means that, for every value of t,
x sin t + y e^t + z t^2 = 0
Thus, in this equation, we choose appropriate values of t to easily get x = 0, y = 0, z = 0. For example:
(i) Substitute t = 0 to obtain x(0) + y(1) + z(0) = 0 or y = 0
(ii) Substitute t = π to obtain x(0) + 0(e^π) + z(π^2) = 0 or z = 0
(iii) Substitute t = π/2 to obtain x(1) + 0(e^{π/2}) + 0(π^2/4) = 0 or x = 0
We have shown:
xf + yg + zh = 0 implies x = 0, y = 0, z = 0
Accordingly, f, g, h are linearly independent.
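Parts (a) and (b) of Example 4.10 can be checked by row reducing the matrix whose rows are the given vectors: the vectors are independent exactly when no zero row appears in an echelon form. This is a sketch under that criterion, which anticipates the rank material of Section 4.9; the names rank and linearly_independent are assumptions made here.

```python
from fractions import Fraction

def rank(vectors):
    """Count the nonzero rows of an echelon form of the matrix with the given rows."""
    M = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(M[0])):
        pivot = next((i for i in range(r, len(M)) if M[i][col] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, len(M)):
            factor = M[i][col] / M[r][col]
            M[i] = [x - factor * y for x, y in zip(M[i], M[r])]
        r += 1
        if r == len(M):
            break
    return r

def linearly_independent(vectors):
    """Vectors are independent when the echelon form has no zero row."""
    return rank(vectors) == len(vectors)

part_a = [(1, 1, 0), (1, 3, 2), (4, 9, 5)]   # Example 4.10(a)
part_b = [(1, 2, 3), (2, 5, 7), (1, 3, 5)]   # Example 4.10(b)
print(linearly_independent(part_a))
print(linearly_independent(part_b))
```

The same test does not apply directly to part (c), where the vectors are functions rather than n-tuples.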
Linear Dependence in R^3
Linear dependence in the vector space V = R^3 can be described geometrically as follows:
(a) Any two vectors u and v in R^3 are linearly dependent if and only if they lie on the same line through
the origin O, as shown in Fig. 4-3(a).
(b) Any three vectors u, v, w in R^3 are linearly dependent if and only if they lie on the same plane
through the origin O, as shown in Fig. 4-3(b).
Later, we will be able to show that any four or more vectors in R^3 are automatically linearly dependent.
Fig. 4-3: (a) u and v are linearly dependent; (b) u, v, and w are linearly dependent
Linear Dependence and Linear Combinations
The notions of linear dependence and linear combinations are closely related. Specifically, for more
than one vector, we show that the vectors v_1, v_2, ..., v_m are linearly dependent if and only if one of them is
a linear combination of the others.
Suppose, say, v_i is a linear combination of the others,
v_i = a_1 v_1 + ... + a_{i-1} v_{i-1} + a_{i+1} v_{i+1} + ... + a_m v_m
Then by adding -v_i to both sides, we obtain
a_1 v_1 + ... + a_{i-1} v_{i-1} - v_i + a_{i+1} v_{i+1} + ... + a_m v_m = 0
where the coefficient of v_i is not 0. Hence the vectors are linearly dependent. Conversely, suppose the
vectors are linearly dependent, say,
b_1 v_1 + ... + b_j v_j + ... + b_m v_m = 0, where b_j ≠ 0
Then we can solve for v_j obtaining
v_j = -b_j^{-1} b_1 v_1 - ... - b_j^{-1} b_{j-1} v_{j-1} - b_j^{-1} b_{j+1} v_{j+1} - ... - b_j^{-1} b_m v_m
and so v_j is a linear combination of the other vectors.
We now state a slightly stronger statement than the one above. This result has many important
consequences.
Lemma 4.10: Suppose two or more nonzero vectors v_1, v_2, ..., v_m are linearly dependent. Then one of
the vectors is a linear combination of the preceding vectors, that is, there exists k > 1 such
that
v_k = c_1 v_1 + c_2 v_2 + ... + c_{k-1} v_{k-1}
Linear Dependence and Echelon Matrices
Consider the following echelon matrix A, whose pivots are shown in parentheses:
    [ 0  (2)  3   4   5    6   7 ]
    [ 0   0  (4)  3   2    3   4 ]
A = [ 0   0   0   0  (7)   8   9 ]
    [ 0   0   0   0   0   (6)  7 ]
    [ 0   0   0   0   0    0   0 ]
Observe that the rows R_2, R_3, R_4 have 0's in the second column below the nonzero pivot in R_1, and hence
any linear combination of R_2, R_3, R_4 must have 0 as its second entry. Thus R_1 cannot be a linear
combination of the rows below it. Similarly, the rows R_3 and R_4 have 0's in the third column below the
nonzero pivot in R_2, and hence R_2 cannot be a linear combination of the rows below it. Finally, R_3 cannot
be a multiple of R_4, since R_4 has a 0 in the fifth column below the nonzero pivot in R_3. Viewing the
nonzero rows from the bottom up, R_4, R_3, R_2, R_1, no row is a linear combination of the preceding rows.
Thus the rows are linearly independent by Lemma 4.10.
The argument used with the above echelon matrix A can be used for the nonzero rows of any echelon
matrix. Thus we have the following very useful result.
Theorem 4.11: The nonzero rows of a matrix in echelon form are linearly independent.
4.8 BASIS AND DIMENSION
First we state two equivalent ways to define a basis of a vector space V. (The equivalence is proved in
Problem 4.28.)
Definition A: A set S = {u_1, u_2, ..., u_n} of vectors is a basis of V if it has the following two properties:
(1) S is linearly independent. (2) S spans V.
Definition B: A set S = {u_1, u_2, ..., u_n} of vectors is a basis of V if every v ∈ V can be written uniquely
as a linear combination of the basis vectors.
The following is a fundamental result in linear algebra.
Theorem 4.12: Let V be a vector space such that one basis has m elements and another basis has n
elements. Then m = n.
A vector space V is said to be of finite dimension n or n-dimensional, written
dim V = n
if V has a basis with n elements. Theorem 4.12 tells us that all bases of V have the same number of
elements, so this definition is well-defined.
The vector space {0} is defined to have dimension 0.
Suppose a vector space V does not have a finite basis. Then V is said to be of infinite dimension or to
be infinite-dimensional.
The above fundamental Theorem 4.12 is a consequence of the following "replacement lemma"
(proved in Problem 4.35).
Lemma 4.13: Suppose {v_1, v_2, ..., v_n} spans V, and suppose {w_1, w_2, ..., w_m} is linearly independent.
Then m ≤ n, and V is spanned by a set of the form
{w_1, w_2, ..., w_m, v_{i_1}, v_{i_2}, ..., v_{i_{n-m}}}
Thus, in particular, n + 1 or more vectors in V are linearly dependent.
Observe in the above lemma that we have replaced m of the vectors in the spanning set of V by the m
independent vectors and still retained a spanning set.
Examples of Bases
This subsection presents important examples of bases of some of the main vector spaces appearing in
this text.
(a) Vector space K^n: Consider the following n vectors in K^n:
e_1 = (1, 0, 0, 0, ..., 0, 0), e_2 = (0, 1, 0, 0, ..., 0, 0), ..., e_n = (0, 0, 0, 0, ..., 0, 1)
These vectors are linearly independent. (For example, they form a matrix in echelon form.)
Furthermore, any vector u = (a_1, a_2, ..., a_n) in K^n can be written as a linear combination of the
above vectors. Specifically,
u = a_1 e_1 + a_2 e_2 + ... + a_n e_n
Accordingly, the vectors form a basis of K^n called the usual or standard basis of K^n. Thus (as one
might expect) K^n has dimension n. In particular, any other basis of K^n has n elements.
(b) Vector space M = M_{r,s} of all r x s matrices: The following six matrices form a basis of the vector
space M_{2,3} of all 2 x 3 matrices over K:
[1 0 0]  [0 1 0]  [0 0 1]  [0 0 0]  [0 0 0]  [0 0 0]
[0 0 0], [0 0 0], [0 0 0], [1 0 0], [0 1 0], [0 0 1]
More generally, in the vector space M = M_{r,s} of all r x s matrices, let E_ij be the matrix with ij-entry 1
and 0's elsewhere. Then all such matrices form a basis of M_{r,s} called the usual or standard basis of
M_{r,s}. Accordingly, dim M_{r,s} = rs.
(c) Vector space P_n(t) of all polynomials of degree ≤ n: The set S = {1, t, t^2, t^3, ..., t^n} of n + 1
polynomials is a basis of P_n(t). Specifically, any polynomial f(t) of degree ≤ n can be expressed as a
linear combination of these powers of t, and one can show that these polynomials are linearly
independent. Therefore, dim P_n(t) = n + 1.
(d) Vector space P(t) of all polynomials: Consider any finite set S = {f_1(t), f_2(t), ..., f_m(t)} of
polynomials in P(t), and let m denote the largest of the degrees of the polynomials. Then any
polynomial g(t) of degree exceeding m cannot be expressed as a linear combination of the elements of
S. Thus S cannot be a basis of P(t). This means that the dimension of P(t) is infinite. We note that the
infinite set S' = {1, t, t^2, t^3, ...}, consisting of all the powers of t, spans P(t) and is linearly
independent. Accordingly, S' is an infinite basis of P(t).
Theorems on Bases
The following three theorems (proved in Problems 4.37, 4.38, and 4.39) will be used frequently.
Theorem 4.14: Let V be a vector space of finite dimension n. Then:
(i) Any n + 1 or more vectors in V are linearly dependent.
(ii) Any linearly independent set S = {u_1, u_2, ..., u_n} with n elements is a basis of V.
(iii) Any spanning set T = {v_1, v_2, ..., v_n} of V with n elements is a basis of V.
Theorem 4.15: Suppose S spans a vector space V. Then:
(i) Any maximum number of linearly independent vectors in S form a basis of V.
(ii) Suppose one deletes from S every vector that is a linear combination of preceding
vectors in S. Then the remaining vectors form a basis of V.
Theorem 4.16: Let V be a vector space of finite dimension and let S = {u_1, u_2, ..., u_r} be a set of
linearly independent vectors in V. Then S is part of a basis of V; that is, S may be
extended to a basis of V.
Example 4.11
(a) The following four vectors in R^4 form a matrix in echelon form:
(1, 1, 1, 1), (0, 1, 1, 1), (0, 0, 1, 1), (0, 0, 0, 1)
Thus the vectors are linearly independent, and, since dim R^4 = 4, the vectors form a basis of R^4.
(b) The following n + 1 polynomials in P_n(t) are of increasing degree:
1, t - 1, (t - 1)^2, ..., (t - 1)^n
Therefore no polynomial is a linear combination of preceding polynomials; hence the polynomials are linearly
independent. Furthermore, they form a basis of P_n(t), since dim P_n(t) = n + 1.
(c) Consider any four vectors in R^3; say
(257, -132, 58), (43, 0, -17), (521, -317, 94), (328, -512, -731)
By Theorem 4.14(i), the four vectors must be linearly dependent, since they come from the 3-dimensional
vector space R^3.
Dimension and Subspaces
The following theorem (proved in Problem 4.40) gives the basic relationship between the dimension of
a vector space and the dimension of a subspace.
Theorem 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In particular,
if dim W = n, then W = V.
Example 4.12. Let W be a subspace of the real space R^3. Note that dim R^3 = 3. Theorem 4.17 tells us that the dimension
of W can only be 0, 1, 2, or 3. The following cases apply:
(a) If dim W = 0, then W = {0}, a point.
(b) If dim W = 1, then W is a line through the origin O.
(c) If dim W = 2, then W is a plane through the origin O.
(d) If dim W = 3, then W is the entire space R^3.
4.9 APPLICATION TO MATRICES, RANK OF A MATRIX
Let A be any m x n matrix over a field K. Recall that the rows of A may be viewed as vectors in K^n and
that the row space of A, written rowsp(A), is the subspace of K^n spanned by the rows of A. The following
definition applies.
Definition: The rank of a matrix A, written rank(A), is equal to the maximum number of linearly
independent rows of A or, equivalently, the dimension of the row space of A.
Recall, on the other hand, that the columns of an m x n matrix A may be viewed as vectors in K^m and
that the column space of A, written colsp(A), is the subspace of K^m spanned by the columns of A. Although
m may not be equal to n, that is, the rows and columns of A may belong to different vector spaces, we do
have the following fundamental result.
Theorem 4.18: The maximum number of linearly independent rows of any matrix A is equal to the
maximum number of linearly independent columns of A. Thus the dimension of the row
space of A is equal to the dimension of the column space of A.
Accordingly, one could restate the above definition of the rank of A using column instead of row.
Basis-Finding Problems
This subsection shows how an echelon form of any matrix A gives us the solution to certain problems
about A itself. Specifically, let A and B be the following matrices, where the echelon matrix B (whose
pivots are shown in parentheses) is an echelon form of A:
    [1  2   1   3   1   2]
    [2  5   5   6   4   5]
A = [3  7   6  11   6   9]
    [1  5  10   8   9   9]
    [2  6   8  11   9  12]
and
    [(1)  2   1   3   1   2]
    [ 0  (1)  3   1   2   1]
B = [ 0   0   0  (1)  1   2]
    [ 0   0   0   0   0   0]
    [ 0   0   0   0   0   0]
We solve the following four problems about the matrix A, where C_1, C_2, ..., C_6 denote its columns:
(a) Find a basis of the row space of A.
(b) Find each column C_k of A that is a linear combination of preceding columns of A.
(c) Find a basis of the column space of A.
(d) Find the rank of A.
(a) We are given that A and B are row equivalent, so they have the same row space. Moreover, B is in
echelon form, so its nonzero rows are linearly independent and hence form a basis of the row space of
B. Thus they also form a basis of the row space of A. That is,
basis of rowsp(A): (1, 2, 1, 3, 1, 2), (0, 1, 3, 1, 2, 1), (0, 0, 0, 1, 1, 2)
(b) Let M_k = [C_1, C_2, ..., C_k], the submatrix of A consisting of the first k columns of A. Then M_{k-1} and
M_k are, respectively, the coefficient matrix and augmented matrix of the vector equation
x_1 C_1 + x_2 C_2 + ... + x_{k-1} C_{k-1} = C_k
Theorem 3.8 tells us that the system has a solution, or, equivalently, C_k is a linear combination of the
preceding columns of A if and only if rank(M_k) = rank(M_{k-1}), where rank(M_k) means the number of
pivots in an echelon form of M_k. Now the first k columns of the echelon matrix B form an echelon
form of M_k. Accordingly,
rank(M_2) = rank(M_3) = 2 and rank(M_4) = rank(M_5) = rank(M_6) = 3
Thus C_3, C_5, C_6 are each a linear combination of the preceding columns of A.
(c) The fact that the remaining columns C_1, C_2, C_4 are not linear combinations of their respective
preceding columns also tells us that they are linearly independent. Thus they form a basis of the
column space of A. That is,
basis of colsp(A): [1, 2, 3, 1, 2]^T, [2, 5, 7, 5, 6]^T, [3, 6, 11, 8, 11]^T
Observe that C_1, C_2, C_4 may also be characterized as those columns of A that contain the pivots in
any echelon form of A.
(d) Here we see that three possible definitions of the rank of A yield the same value.
(i) There are three pivots in B, which is an echelon form of A.
(ii) The three pivots in B correspond to the nonzero rows of B, which form a basis of the row space
of A.
(iii) The three pivots in B correspond to the columns of A, which form a basis of the column space
of A.
Thus rank(A) = 3.
Application to Finding a Basis for $W = \operatorname{span}(u_1, u_2, \ldots, u_r)$
Frequently, we are given a list $S = \{u_1, u_2, \ldots, u_r\}$ of vectors in $K^n$ and we want to find a basis for the
subspace $W$ of $K^n$ spanned by the given vectors, that is, a basis of
\[ W = \operatorname{span}(S) = \operatorname{span}(u_1, u_2, \ldots, u_r) \]
The following two algorithms, which are essentially described in the above subsection, find such a basis
(and hence the dimension) of $W$.
Algorithm 4.1 (Row space algorithm)
Step 1. Form the matrix M whose rows are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. Output the nonzero rows of the echelon matrix.
Sometimes we want to find a basis that only comes from the original given vectors. The next algorithm
accomplishes this task.
Algorithm 4.2 (Casting-out algorithm)
Step 1. Form the matrix M whose columns are the given vectors.
Step 2. Row reduce M to echelon form.
Step 3. For each column $C_k$ in the echelon matrix without a pivot, delete (cast out) the vector $u_k$ from the
list $S$ of given vectors.
Step 4. Output the remaining vectors in S (which correspond to columns with pivots).
We emphasize that in the first algorithm we form a matrix whose rows are the given vectors, whereas
in the second algorithm we form a matrix whose columns are the given vectors.
Example 4.13. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by the following vectors:
$u_1 = (1, 2, 1, 3, 2)$, $u_2 = (1, 3, 3, 5, 3)$, $u_3 = (3, 8, 7, 13, 8)$,
$u_4 = (1, 4, 6, 9, 7)$, $u_5 = (5, 13, 13, 25, 19)$
Find a basis of $W$ consisting of the original given vectors, and find $\dim W$.
Form the matrix $M$ whose columns are the given vectors, and reduce $M$ to echelon form:
\[
M = \begin{bmatrix}
1 & 1 & 3 & 1 & 5 \\
2 & 3 & 8 & 4 & 13 \\
1 & 3 & 7 & 6 & 13 \\
3 & 5 & 13 & 9 & 25 \\
2 & 3 & 8 & 7 & 19
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 1 & 3 & 1 & 5 \\
0 & 1 & 2 & 2 & 3 \\
0 & 0 & 0 & 1 & 2 \\
0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
The pivots in the echelon matrix appear in columns $C_1$, $C_2$, $C_4$. Accordingly, we "cast out" the vectors $u_3$ and $u_5$ from
the original five vectors. The remaining vectors $u_1$, $u_2$, $u_4$, which correspond to the columns in the echelon matrix with
pivots, form a basis of $W$. Thus, in particular, $\dim W = 3$.
Remark: The justification of the Casting-out algorithm is essentially described above, but we repeat
it here for emphasis. The fact that column $C_3$ in the echelon matrix in Example 4.13 does not have a
pivot means that the vector equation
\[ x u_1 + y u_2 = u_3 \]
has a solution, and hence $u_3$ is a linear combination of $u_1$ and $u_2$. Similarly, the fact that $C_5$ does not have a
pivot means that $u_5$ is a linear combination of the preceding vectors. We have deleted each vector in the
original spanning set that is a linear combination of preceding vectors. Thus the remaining vectors are
linearly independent and form a basis of $W$.
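The Casting-out algorithm can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `echelon` and `cast_out` are hypothetical (they do not come from the text), and exact rational arithmetic via `fractions.Fraction` is used so that zero tests are reliable. The data are the five vectors of Example 4.13.

```python
from fractions import Fraction

def echelon(rows):
    """Row reduce to an echelon form; return (matrix, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        # look for a nonzero entry in column c at or below row r
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def cast_out(vectors):
    """Keep only the vectors whose columns carry a pivot (Algorithm 4.2)."""
    M = [list(row) for row in zip(*vectors)]  # columns of M are the given vectors
    _, pivots = echelon(M)
    return [vectors[j] for j in pivots]

S = [(1, 2, 1, 3, 2), (1, 3, 3, 5, 3), (3, 8, 7, 13, 8),
     (1, 4, 6, 9, 7), (5, 13, 13, 25, 19)]
basis = cast_out(S)
print(basis, "dim W =", len(basis))
```

Because only row operations are used, the pivot columns found here are the same for every echelon form of the matrix, so the choice of elimination order does not affect which vectors are kept.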
Application to Homogeneous Systems of Linear Equations
Consider again a homogeneous system $AX = 0$ of linear equations over $K$ with $n$ unknowns. By
Theorem 4.4, the solution set $W$ of such a system is a subspace of $K^n$, and hence $W$ has a dimension. The
following theorem, whose proof is postponed until Chapter 5, holds.
Theorem 4.19: The dimension of the solution space $W$ of a homogeneous system $AX = 0$ is $n - r$,
where $n$ is the number of unknowns and $r$ is the rank of the coefficient matrix $A$.
In the case where the system $AX = 0$ is in echelon form, it has precisely $n - r$ free variables, say
$x_{i_1}, x_{i_2}, \ldots, x_{i_{n-r}}$. Let $v_j$ be the solution obtained by setting $x_{i_j} = 1$ (or any nonzero constant) and the
remaining free variables equal to 0. We show (Problem 4.50) that the solutions $v_1, v_2, \ldots, v_{n-r}$ are linearly
independent; hence they form a basis of the solution space $W$.
We have already used the above process to find a basis of the solution space $W$ of a homogeneous
system $AX = 0$ in Section 3.11. Problem 4.48 gives three other examples.
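The free-variable construction can be sketched as follows. The function names `rref` and `solution_space_basis` and the small $2 \times 4$ coefficient matrix are assumptions made for this illustration only; they are not taken from the text. The sketch reduces $A$ to row canonical form, then builds one solution per free variable by setting that variable to 1 and the other free variables to 0.

```python
from fractions import Fraction

def rref(rows):
    """Row canonical form; return (matrix, list of pivot columns)."""
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [x / piv for x in m[r]]          # make the pivot equal to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:          # clear the rest of the column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return m, pivots

def solution_space_basis(A):
    """Basis of {X : AX = 0}, one vector per free variable."""
    n = len(A[0])
    R, pivots = rref(A)
    free = [j for j in range(n) if j not in pivots]
    basis = []
    for f in free:
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for row, p in zip(R, pivots):   # each pivot variable is forced by the free ones
            v[p] = -row[f]
        basis.append(v)
    return basis

# Hypothetical system: x + 2y - z + w = 0, 2x + 4y - z + 3w = 0
A = [[1, 2, -1, 1], [2, 4, -1, 3]]
basis = solution_space_basis(A)
print(basis)  # n - r = 4 - 2 = 2 vectors
```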
4.10 SUMS AND DIRECT SUMS
Let $U$ and $W$ be subsets of a vector space $V$. The sum of $U$ and $W$, written $U + W$, consists of all sums
$u + w$ where $u \in U$ and $w \in W$. That is,
\[ U + W = \{ v : v = u + w, \text{ where } u \in U \text{ and } w \in W \} \]
Now suppose $U$ and $W$ are subspaces of $V$. Then one can easily show (Problem 4.53) that $U + W$ is a
subspace of $V$. Recall that $U \cap W$ is also a subspace of $V$. The following theorem (proved in Problem 4.58)
relates the dimensions of these subspaces.
Theorem 4.20: Suppose $U$ and $W$ are finite-dimensional subspaces of a vector space $V$. Then $U + W$
has finite dimension and
\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) \]
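Example 4.14. Let $V = M_{2,2}$, the vector space of $2 \times 2$ matrices. Let $U$ consist of those matrices whose second row is
zero, and let $W$ consist of those matrices whose second column is zero. Then
\[
U = \left\{ \begin{bmatrix} a & b \\ 0 & 0 \end{bmatrix} \right\}, \quad
W = \left\{ \begin{bmatrix} a & 0 \\ c & 0 \end{bmatrix} \right\}
\quad\text{and}\quad
U + W = \left\{ \begin{bmatrix} a & b \\ c & 0 \end{bmatrix} \right\}, \quad
U \cap W = \left\{ \begin{bmatrix} a & 0 \\ 0 & 0 \end{bmatrix} \right\}
\]
That is, $U + W$ consists of those matrices whose lower right entry is 0, and $U \cap W$ consists of those matrices whose
second row and second column are zero. Note that $\dim U = 2$, $\dim W = 2$, $\dim(U \cap W) = 1$. Also,
$\dim(U + W) = 3$, which is expected from Theorem 4.20. That is,
\[ \dim(U + W) = \dim U + \dim W - \dim(U \cap W) = 2 + 2 - 1 = 3 \]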
Direct Sums
The vector space $V$ is said to be the direct sum of its subspaces $U$ and $W$, denoted by
\[ V = U \oplus W \]
if every $v \in V$ can be written in one and only one way as $v = u + w$ where $u \in U$ and $w \in W$.
The following theorem (proved in Problem 4.59) characterizes such a decomposition.
Theorem 4.21: The vector space $V$ is the direct sum of its subspaces $U$ and $W$ if and only if:
(i) $V = U + W$, (ii) $U \cap W = \{0\}$.
Example 4.15. Consider the vector space $V = \mathbf{R}^3$.
(a) Let $U$ be the $xy$-plane and let $W$ be the $yz$-plane; that is,
$U = \{(a, b, 0) : a, b \in \mathbf{R}\}$ and $W = \{(0, b, c) : b, c \in \mathbf{R}\}$
Then $\mathbf{R}^3 = U + W$, since every vector in $\mathbf{R}^3$ is the sum of a vector in $U$ and a vector in $W$. However, $\mathbf{R}^3$ is not the
direct sum of $U$ and $W$, since such sums are not unique. For example,
$(3, 5, 7) = (3, 1, 0) + (0, 4, 7)$ and also $(3, 5, 7) = (3, -4, 0) + (0, 9, 7)$
(b) Let $U$ be the $xy$-plane and let $W$ be the $z$-axis, that is,
$U = \{(a, b, 0) : a, b \in \mathbf{R}\}$ and $W = \{(0, 0, c) : c \in \mathbf{R}\}$
Now any vector $(a, b, c) \in \mathbf{R}^3$ can be written as the sum of a vector in $U$ and a vector in $W$ in one and only one
way:
\[ (a, b, c) = (a, b, 0) + (0, 0, c) \]
Accordingly, $\mathbf{R}^3$ is the direct sum of $U$ and $W$; that is, $\mathbf{R}^3 = U \oplus W$.
General Direct Sums
The notion of a direct sum is extended to more than one factor in the obvious way. That is, $V$ is the
direct sum of subspaces $W_1, W_2, \ldots, W_r$, written
\[ V = W_1 \oplus W_2 \oplus \cdots \oplus W_r \]
if every vector $v \in V$ can be written in one and only one way as
\[ v = w_1 + w_2 + \cdots + w_r \]
where $w_1 \in W_1, w_2 \in W_2, \ldots, w_r \in W_r$.
The following theorems hold.
Theorem 4.22: Suppose $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$. Also, for each $k$, suppose $S_k$ is a linearly
independent subset of $W_k$. Then:
(a) The union $S = \bigcup_k S_k$ is linearly independent in $V$.
(b) If each $S_k$ is a basis of $W_k$, then $\bigcup_k S_k$ is a basis of $V$.
(c) $\dim V = \dim W_1 + \dim W_2 + \cdots + \dim W_r$.
Theorem 4.23: Suppose $V = W_1 + W_2 + \cdots + W_r$ and $\dim V = \sum_k \dim W_k$. Then
$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$.
4.11 COORDINATES
Let $V$ be an $n$-dimensional vector space over $K$ with basis $S = \{u_1, u_2, \ldots, u_n\}$. Then any vector $v \in V$
can be expressed uniquely as a linear combination of the basis vectors in $S$, say
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \]
These $n$ scalars $a_1, a_2, \ldots, a_n$ are called the coordinates of $v$ relative to the basis $S$, and they form a vector
$[a_1, a_2, \ldots, a_n]$ in $K^n$ called the coordinate vector of $v$ relative to $S$. We denote this vector by $[v]_S$, or
simply $[v]$, when $S$ is understood. Thus
\[ [v]_S = [a_1, a_2, \ldots, a_n] \]
For notational convenience, brackets $[\ldots]$, rather than parentheses $(\ldots)$, are used to denote the coordinate
vector.
Remark: The above $n$ scalars $a_1, a_2, \ldots, a_n$ also form the coordinate column vector
$[a_1, a_2, \ldots, a_n]^T$ of $v$ relative to $S$. The choice of the column vector rather than the row vector to represent
$v$ depends on the context in which it is used. The use of such column vectors will become clear later in
Chapter 6.
Example 4.16. Consider the vector space $P_2(t)$ of polynomials of degree $\le 2$. The polynomials
\[ p_1 = t + 1, \quad p_2 = t - 1, \quad p_3 = (t - 1)^2 = t^2 - 2t + 1 \]
form a basis $S$ of $P_2(t)$. The coordinate vector $[v]$ of $v = 2t^2 - 5t + 9$ relative to $S$ is obtained as follows.
Set $v = x p_1 + y p_2 + z p_3$ using unknown scalars $x$, $y$, $z$, and simplify:
\[
\begin{aligned}
2t^2 - 5t + 9 &= x(t + 1) + y(t - 1) + z(t^2 - 2t + 1) \\
&= xt + x + yt - y + zt^2 - 2zt + z \\
&= zt^2 + (x + y - 2z)t + (x - y + z)
\end{aligned}
\]
Then set the coefficients of the same powers of $t$ equal to each other to obtain the system
\[ z = 2, \quad x + y - 2z = -5, \quad x - y + z = 9 \]
The solution of the system is $x = 3$, $y = -4$, $z = 2$. Thus
$v = 3p_1 - 4p_2 + 2p_3$, and hence $[v] = [3, -4, 2]$
Example 4.17. Consider real space $\mathbf{R}^3$. The following vectors form a basis $S$ of $\mathbf{R}^3$:
\[ u_1 = (1, -1, 0), \quad u_2 = (1, 1, 0), \quad u_3 = (0, 1, 1) \]
The coordinates of $v = (5, 3, 4)$ relative to the basis $S$ are obtained as follows.
Set $v = x u_1 + y u_2 + z u_3$, that is, set $v$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$.
This yields:
\[
\begin{bmatrix} 5 \\ 3 \\ 4 \end{bmatrix}
= x \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}
+ z \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix}
\]
The equivalent system of linear equations is as follows:
\[ x + y = 5, \quad -x + y + z = 3, \quad z = 4 \]
The solution of the system is $x = 3$, $y = 2$, $z = 4$. Thus
$v = 3u_1 + 2u_2 + 4u_3$, and so $[v]_S = [3, 2, 4]$
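Finding a coordinate vector amounts to solving one square linear system, which can be sketched as below. The function name `coordinates` is hypothetical, and the sketch assumes the given vectors really form a basis of $K^n$ (otherwise the pivot search fails). It is run on the basis $u_1 = (1, -1, 0)$, $u_2 = (1, 1, 0)$, $u_3 = (0, 1, 1)$ and the vector $v = (5, 3, 4)$ of Example 4.17.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve x1*u1 + ... + xn*un = v exactly, assuming u1..un is a basis of K^n."""
    n = len(basis)
    # augmented matrix: the columns are the basis vectors, the last column is v
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for c in range(n):
        # a nonzero pivot exists in every column because the basis is independent
        p = next(i for i in range(c, n) if m[i][c] != 0)
        m[c], m[p] = m[p], m[c]
        piv = m[c][c]
        m[c] = [x / piv for x in m[c]]
        for i in range(n):
            if i != c:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[c])]
    return [row[-1] for row in m]

S = [(1, -1, 0), (1, 1, 0), (0, 1, 1)]
coords = coordinates(S, (5, 3, 4))
print(coords)
```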
Remark 1: There is a geometrical interpretation of the coordinates of a vector $v$ relative to a basis $S$
for the real space $\mathbf{R}^n$, which we illustrate using the basis $S$ of $\mathbf{R}^3$ in Example 4.17. First consider the space
$\mathbf{R}^3$ with the usual $x$, $y$, $z$ axes. Then the basis vectors determine a new coordinate system of $\mathbf{R}^3$, say with $x'$,
$y'$, $z'$ axes, as shown in Fig. 4-4. That is:
(1) The $x'$-axis is in the direction of $u_1$ with unit length $\|u_1\|$.
(2) The $y'$-axis is in the direction of $u_2$ with unit length $\|u_2\|$.
(3) The $z'$-axis is in the direction of $u_3$ with unit length $\|u_3\|$.
Then each vector $v = (a, b, c)$ or, equivalently, the point $P(a, b, c)$ in $\mathbf{R}^3$ will have new coordinates with
respect to the new $x'$, $y'$, $z'$ axes. These new coordinates are precisely $[v]_S$, the coordinates of $v$ with respect
to the basis $S$. Thus, as shown in Example 4.17, the coordinates of the point $P(5, 3, 4)$ with the new axes
form the vector $[3, 2, 4]$.
Remark 2: Consider the usual basis $E = \{e_1, e_2, \ldots, e_n\}$ of $K^n$ defined by
\[ e_1 = (1, 0, 0, \ldots, 0, 0), \quad e_2 = (0, 1, 0, \ldots, 0, 0), \quad \ldots, \quad e_n = (0, 0, 0, \ldots, 0, 1) \]
Let $v = (a_1, a_2, \ldots, a_n)$ be any vector in $K^n$. Then one can easily show that
$v = a_1 e_1 + a_2 e_2 + \cdots + a_n e_n$, and so $[v]_E = [a_1, a_2, \ldots, a_n]$
That is, the coordinate vector $[v]_E$ of any vector $v$ relative to the usual basis $E$ of $K^n$ is identical to the
original vector $v$.
Fig. 4-4 (figure omitted; label in figure: $v = (5, 3, 4) = [3, 2, 4]$)
Isomorphism of $V$ and $K^n$
Let $V$ be a vector space of dimension $n$ over $K$, and suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Then
each vector $v \in V$ corresponds to a unique $n$-tuple $[v]_S$ in $K^n$. On the other hand, each $n$-tuple
$[c_1, c_2, \ldots, c_n]$ in $K^n$ corresponds to a unique vector $c_1 u_1 + c_2 u_2 + \cdots + c_n u_n$ in $V$. Thus the basis $S$
induces a one-to-one correspondence between $V$ and $K^n$. Furthermore, suppose
\[ v = a_1 u_1 + a_2 u_2 + \cdots + a_n u_n \quad\text{and}\quad w = b_1 u_1 + b_2 u_2 + \cdots + b_n u_n \]
Then
\[
\begin{aligned}
v + w &= (a_1 + b_1) u_1 + (a_2 + b_2) u_2 + \cdots + (a_n + b_n) u_n \\
kv &= (k a_1) u_1 + (k a_2) u_2 + \cdots + (k a_n) u_n
\end{aligned}
\]
where $k$ is a scalar. Accordingly,
\[
\begin{aligned}
[v + w]_S &= [a_1 + b_1, \ldots, a_n + b_n] = [a_1, \ldots, a_n] + [b_1, \ldots, b_n] = [v]_S + [w]_S \\
[kv]_S &= [k a_1, k a_2, \ldots, k a_n] = k[a_1, a_2, \ldots, a_n] = k[v]_S
\end{aligned}
\]
Thus the above one-to-one correspondence between $V$ and $K^n$ preserves the vector space operations of
vector addition and scalar multiplication. We then say that $V$ and $K^n$ are isomorphic, written
\[ V \cong K^n \]
We state this result formally.
Theorem 4.24: Let $V$ be an $n$-dimensional vector space over a field $K$. Then $V$ and $K^n$ are isomorphic.
The next example gives a practical application of the above result.
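Example 4.18. Suppose we want to determine whether or not the following matrices in $V = M_{2,3}$ are linearly dependent:
\[
A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & 0 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 3 & -4 \\ 6 & 5 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 3 & 8 & -11 \\ 16 & 10 & 9 \end{bmatrix}
\]
The coordinate vectors of the matrices in the usual basis of $M_{2,3}$ are as follows:
$[A] = [1, 2, -3, 4, 0, 1]$, $[B] = [1, 3, -4, 6, 5, 4]$, $[C] = [3, 8, -11, 16, 10, 9]$
Form the matrix $M$ whose rows are the above coordinate vectors and reduce $M$ to an echelon form:
\[
M = \begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
1 & 3 & -4 & 6 & 5 & 4 \\
3 & 8 & -11 & 16 & 10 & 9
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
0 & 1 & -1 & 2 & 5 & 3 \\
0 & 2 & -2 & 4 & 10 & 6
\end{bmatrix}
\sim
\begin{bmatrix}
1 & 2 & -3 & 4 & 0 & 1 \\
0 & 1 & -1 & 2 & 5 & 3 \\
0 & 0 & 0 & 0 & 0 & 0
\end{bmatrix}
\]
Since the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a subspace of dimension
2 and so are linearly dependent. Accordingly, the original matrices $A$, $B$, $C$ are linearly dependent.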
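The use of the isomorphism in Example 4.18 can be sketched in Python: each $2 \times 3$ matrix is flattened row by row into its coordinate vector in $K^6$, and dependence is decided by a rank computation there. The helper names `flatten` and `rank` are assumptions of this sketch, not notation from the text.

```python
from fractions import Fraction

def flatten(matrix):
    """Coordinate vector of a matrix relative to the usual basis (row by row)."""
    return [x for row in matrix for x in row]

def rank(rows):
    """Number of pivots in an echelon form, computed with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

A = [[1, 2, -3], [4, 0, 1]]
B = [[1, 3, -4], [6, 5, 4]]
C = [[3, 8, -11], [16, 10, 9]]
coords = [flatten(X) for X in (A, B, C)]
dependent = rank(coords) < len(coords)
print(rank(coords), dependent)
```

Here the dependence can also be seen directly, since $C = A + 2B$ entry by entry.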
Solved Problems
VECTOR SPACES, LINEAR COMBINATIONS
4.1. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions:
(a) $E_1 = 3(2u - 4v) + 5u + 7v$, (c) $E_3 = 2uv + 3(2u + 4v)$,
(b) $E_2 = 3u - 6(3u - 5v) + 7u$, (d) $E_4 = 5u - \dfrac{3}{v} + 5u$
Multiply out and collect terms:
(a) $E_1 = 6u - 12v + 5u + 7v = 11u - 5v$
(b) $E_2 = 3u - 18u + 30v + 7u = -8u + 30v$
(c) $E_3$ is not defined since the product $uv$ of vectors is not defined.
(d) $E_4$ is not defined since division by a vector is not defined.
4.2. Prove Theorem 4.1: Let $V$ be a vector space over a field $K$.
(i) $k0 = 0$. (ii) $0u = 0$. (iii) If $ku = 0$, then $k = 0$ or $u = 0$. (iv) $(-k)u = k(-u) = -ku$.
(i) By Axiom [A2] with $u = 0$, we have $0 + 0 = 0$. Hence, by Axiom [M1], we have
\[ k0 = k(0 + 0) = k0 + k0 \]
Adding $-k0$ to both sides gives the desired result.
(ii) For scalars, $0 + 0 = 0$. Hence, by Axiom [M2], we have
\[ 0u = (0 + 0)u = 0u + 0u \]
Adding $-0u$ to both sides gives the desired result.
(iii) Suppose $ku = 0$ and $k \ne 0$. Then there exists a scalar $k^{-1}$ such that $k^{-1}k = 1$. Thus
\[ u = 1u = (k^{-1}k)u = k^{-1}(ku) = k^{-1}0 = 0 \]
(iv) Using $u + (-u) = 0$ and $k + (-k) = 0$ yields
\[ 0 = k0 = k[u + (-u)] = ku + k(-u) \quad\text{and}\quad 0 = 0u = [k + (-k)]u = ku + (-k)u \]
Adding $-ku$ to both sides of the first equation gives $-ku = k(-u)$, and adding $-ku$ to both sides of the
second equation gives $-ku = (-k)u$. Thus $(-k)u = k(-u) = -ku$.
4.3. Show that: (a) $k(u - v) = ku - kv$, (b) $u + u = 2u$.
(a) Using the definition of subtraction, that $u - v = u + (-v)$, and Theorem 4.1(iv), that $k(-v) = -kv$, we
have
\[ k(u - v) = k[u + (-v)] = ku + k(-v) = ku + (-kv) = ku - kv \]
(b) Using Axiom [M4] and then Axiom [M2], we have
\[ u + u = 1u + 1u = (1 + 1)u = 2u \]
4.4. Express $v = (1, -2, 5)$ in $\mathbf{R}^3$ as a linear combination of the vectors
\[ u_1 = (1, 1, 1), \quad u_2 = (1, 2, 3), \quad u_3 = (2, -1, 1) \]
We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus we require
\[
\begin{bmatrix} 1 \\ -2 \\ 5 \end{bmatrix}
= x \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}
+ y \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
+ z \begin{bmatrix} 2 \\ -1 \\ 1 \end{bmatrix}
\quad\text{or}\quad
\begin{array}{rcr}
x + y + 2z &=& 1 \\
x + 2y - z &=& -2 \\
x + 3y + z &=& 5
\end{array}
\]
(For notational convenience, we write the vectors in $\mathbf{R}^3$ as columns, since it is then easier to find the equivalent
system of linear equations.) Reducing the system to echelon form yields the triangular system
\[ x + y + 2z = 1, \quad y - 3z = -3, \quad 5z = 10 \]
The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -6$, $y = 3$,
$z = 2$. Thus $v = -6u_1 + 3u_2 + 2u_3$.
Alternatively, write down the augmented matrix $M$ of the equivalent system of linear equations, where $u_1$,
$u_2$, $u_3$ are the first three columns of $M$ and $v$ is the last column, and then reduce $M$ to echelon form:
\[
M = \begin{bmatrix} 1 & 1 & 2 & 1 \\ 1 & 2 & -1 & -2 \\ 1 & 3 & 1 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 2 & -1 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 & 1 \\ 0 & 1 & -3 & -3 \\ 0 & 0 & 5 & 10 \end{bmatrix}
\]
The last matrix corresponds to a triangular system, which has a solution. Solving the triangular system by
back-substitution yields the solution $x = -6$, $y = 3$, $z = 2$. Thus $v = -6u_1 + 3u_2 + 2u_3$.
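4.5. Express $v = (2, -5, 3)$ in $\mathbf{R}^3$ as a linear combination of the vectors
\[ u_1 = (1, -3, 2), \quad u_2 = (2, -4, -1), \quad u_3 = (1, -5, 7) \]
We seek scalars $x$, $y$, $z$, as yet unknown, such that $v = x u_1 + y u_2 + z u_3$. Thus we require
\[
\begin{bmatrix} 2 \\ -5 \\ 3 \end{bmatrix}
= x \begin{bmatrix} 1 \\ -3 \\ 2 \end{bmatrix}
+ y \begin{bmatrix} 2 \\ -4 \\ -1 \end{bmatrix}
+ z \begin{bmatrix} 1 \\ -5 \\ 7 \end{bmatrix}
\quad\text{or}\quad
\begin{array}{rcr}
x + 2y + z &=& 2 \\
-3x - 4y - 5z &=& -5 \\
2x - y + 7z &=& 3
\end{array}
\]
Reducing the system to echelon form yields the system
\[ x + 2y + z = 2, \quad 2y - 2z = 1, \quad 0 = 3 \]
The system is inconsistent and so has no solution. Thus $v$ cannot be written as a linear combination of
$u_1$, $u_2$, $u_3$.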
4.6. Express the polynomial $v = t^2 + 4t - 3$ in $P(t)$ as a linear combination of the polynomials
\[ p_1 = t^2 - 2t + 5, \quad p_2 = 2t^2 - 3t, \quad p_3 = t + 3 \]
Set $v$ as a linear combination of $p_1$, $p_2$, $p_3$, using unknowns $x$, $y$, $z$ to obtain
\[ t^2 + 4t - 3 = x(t^2 - 2t + 5) + y(2t^2 - 3t) + z(t + 3) \qquad (*) \]
We can proceed in two ways.
Method 1. Expand the right side of (*) and express it in terms of powers of $t$ as follows:
\[
\begin{aligned}
t^2 + 4t - 3 &= xt^2 - 2xt + 5x + 2yt^2 - 3yt + zt + 3z \\
&= (x + 2y)t^2 + (-2x - 3y + z)t + (5x + 3z)
\end{aligned}
\]
Set coefficients of the same powers of $t$ equal to each other, and reduce the system to echelon form. This
yields
\[
\begin{array}{rcr}
x + 2y &=& 1 \\
-2x - 3y + z &=& 4 \\
5x + 3z &=& -3
\end{array}
\quad\text{or}\quad
\begin{array}{rcr}
x + 2y &=& 1 \\
y + z &=& 6 \\
-10y + 3z &=& -8
\end{array}
\quad\text{or}\quad
\begin{array}{rcr}
x + 2y &=& 1 \\
y + z &=& 6 \\
13z &=& 52
\end{array}
\]
The system is consistent and has a solution. Solving by back-substitution yields the solution $x = -3$, $y = 2$,
$z = 4$. Thus $v = -3p_1 + 2p_2 + 4p_3$.
Method 2. The equation (*) is an identity in $t$; that is, the equation holds for any value of $t$. Thus we can set $t$
equal to any numbers to obtain equations in the unknowns.
(a) Set $t = 0$ in (*) to obtain the equation $-3 = 5x + 3z$.
(b) Set $t = 1$ in (*) to obtain the equation $2 = 4x - y + 4z$.
(c) Set $t = -1$ in (*) to obtain the equation $-6 = 8x + 5y + 2z$.
Solve the system of the three equations to again obtain the solution $x = -3$, $y = 2$, $z = 4$. Thus
$v = -3p_1 + 2p_2 + 4p_3$.
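4.7. Express $M$ as a linear combination of the matrices $A$, $B$, $C$, where
\[
M = \begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix}
\quad\text{and}\quad
A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad
B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}, \quad
C = \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}
\]
Set $M$ as a linear combination of $A$, $B$, $C$ using unknown scalars $x$, $y$, $z$, that is, set $M = xA + yB + zC$.
This yields
\[
\begin{bmatrix} 4 & 7 \\ 7 & 9 \end{bmatrix}
= x \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}
+ y \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}
+ z \begin{bmatrix} 1 & 1 \\ 4 & 5 \end{bmatrix}
= \begin{bmatrix} x + y + z & x + 2y + z \\ x + 3y + 4z & x + 4y + 5z \end{bmatrix}
\]
Form the equivalent system of equations by setting corresponding entries equal to each other:
\[ x + y + z = 4, \quad x + 2y + z = 7, \quad x + 3y + 4z = 7, \quad x + 4y + 5z = 9 \]
Reducing the system to echelon form yields
\[ x + y + z = 4, \quad y = 3, \quad 3z = -3, \quad 4z = -4 \]
The last equation drops out. Solving the system by back-substitution yields $z = -1$, $y = 3$, $x = 2$. Thus
$M = 2A + 3B - C$.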
SUBSPACES
4.8. Prove Theorem 4.2: $W$ is a subspace of $V$ if the following two conditions hold:
(a) $0 \in W$. (b) If $u, v \in W$, then $u + v, ku \in W$.
By (a), $W$ is nonempty, and, by (b), the operations of vector addition and scalar multiplication are well
defined for $W$. Axioms [A1], [A4], [M1], [M2], [M3], [M4] hold in $W$ since the vectors in $W$ belong to $V$. Thus
we need only show that [A2] and [A3] also hold in $W$. Now [A2] holds since the zero vector in $V$ belongs to $W$
by (a). Finally, if $v \in W$, then $(-1)v = -v \in W$, and $v + (-v) = 0$. Thus [A3] holds.
4.9. Let $V = \mathbf{R}^3$. Show that $W$ is not a subspace of $V$, where:
(a) $W = \{(a, b, c) : a \ge 0\}$, (b) $W = \{(a, b, c) : a^2 + b^2 + c^2 \le 1\}$.
In each case, show that Theorem 4.2 does not hold.
(a) $W$ consists of those vectors whose first entry is nonnegative. Thus $v = (1, 2, 3)$ belongs to $W$. Let
$k = -3$. Then $kv = (-3, -6, -9)$ does not belong to $W$, since $-3$ is negative. Thus $W$ is not a subspace
of $V$.
(b) $W$ consists of vectors whose length does not exceed 1. Hence $u = (1, 0, 0)$ and $v = (0, 1, 0)$ belong to $W$,
but $u + v = (1, 1, 0)$ does not belong to $W$, since $1^2 + 1^2 + 0^2 = 2 > 1$. Thus $W$ is not a subspace of $V$.
4.10. Let $V = P(t)$, the vector space of real polynomials. Determine whether or not $W$ is a subspace of $V$,
where:
(a) $W$ consists of all polynomials with integral coefficients.
(b) $W$ consists of all polynomials with degree $\le 6$ and the zero polynomial.
(c) $W$ consists of all polynomials with only even powers of $t$.
(a) No, since scalar multiples of polynomials in $W$ do not always belong to $W$. For example,
$f(t) = 3 + 6t + 7t^2 \in W$ but $\tfrac{1}{2} f(t) = \tfrac{3}{2} + 3t + \tfrac{7}{2} t^2 \notin W$
(b) and (c). Yes, since, in each case, $W$ contains the zero polynomial, and sums and scalar multiples of
polynomials in $W$ belong to $W$.
4.11. Let $V$ be the vector space of functions $f : \mathbf{R} \to \mathbf{R}$. Show that $W$ is a subspace of $V$, where:
(a) $W = \{f(x) : f(1) = 0\}$, all functions whose value at 1 is 0.
(b) $W = \{f(x) : f(3) = f(1)\}$, all functions assigning the same value to 3 and 1.
(c) $W = \{f(x) : f(-x) = -f(x)\}$, all odd functions.
Let $\hat{0}$ denote the zero function, so $\hat{0}(x) = 0$ for every value of $x$.
(a) $\hat{0} \in W$, since $\hat{0}(1) = 0$. Suppose $f, g \in W$. Then $f(1) = 0$ and $g(1) = 0$. Also, for scalars $a$ and $b$, we
have
\[ (af + bg)(1) = af(1) + bg(1) = a0 + b0 = 0 \]
Thus $af + bg \in W$, and hence $W$ is a subspace.
(b) $\hat{0} \in W$, since $\hat{0}(3) = 0 = \hat{0}(1)$. Suppose $f, g \in W$. Then $f(3) = f(1)$ and $g(3) = g(1)$. Thus, for any
scalars $a$ and $b$, we have
\[ (af + bg)(3) = af(3) + bg(3) = af(1) + bg(1) = (af + bg)(1) \]
Thus $af + bg \in W$, and hence $W$ is a subspace.
(c) $\hat{0} \in W$, since $\hat{0}(-x) = 0 = -0 = -\hat{0}(x)$. Suppose $f, g \in W$. Then $f(-x) = -f(x)$ and $g(-x) = -g(x)$.
Also, for scalars $a$ and $b$,
\[ (af + bg)(-x) = af(-x) + bg(-x) = -af(x) - bg(x) = -(af + bg)(x) \]
Thus $af + bg \in W$, and hence $W$ is a subspace of $V$.
4.12. Prove Theorem 4.3: The intersection of any number of subspaces of $V$ is a subspace of $V$.
Let $\{W_i : i \in I\}$ be a collection of subspaces of $V$ and let $W = \bigcap (W_i : i \in I)$. Since each $W_i$ is a subspace
of $V$, we have $0 \in W_i$, for every $i \in I$. Hence $0 \in W$. Suppose $u, v \in W$. Then $u, v \in W_i$, for every $i \in I$. Since
each $W_i$ is a subspace, $au + bv \in W_i$, for every $i \in I$. Hence $au + bv \in W$. Thus $W$ is a subspace of $V$.
LINEAR SPANS
4.13. Show that the vectors $u_1 = (1, 1, 1)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 5, 8)$ span $\mathbf{R}^3$.
We need to show that an arbitrary vector $v = (a, b, c)$ in $\mathbf{R}^3$ is a linear combination of $u_1$, $u_2$, $u_3$. Set
$v = x u_1 + y u_2 + z u_3$, that is, set
\[ (a, b, c) = x(1, 1, 1) + y(1, 2, 3) + z(1, 5, 8) = (x + y + z, \; x + 2y + 5z, \; x + 3y + 8z) \]
Form the equivalent system and reduce it to echelon form:
\[
\begin{array}{rcl}
x + y + z &=& a \\
x + 2y + 5z &=& b \\
x + 3y + 8z &=& c
\end{array}
\quad\text{or}\quad
\begin{array}{rcl}
x + y + z &=& a \\
y + 4z &=& b - a \\
2y + 7z &=& c - a
\end{array}
\quad\text{or}\quad
\begin{array}{rcl}
x + y + z &=& a \\
y + 4z &=& b - a \\
-z &=& c - 2b + a
\end{array}
\]
The above system is in echelon form and is consistent; in fact,
\[ x = -a + 5b - 3c, \quad y = 3a - 7b + 4c, \quad z = -a + 2b - c \]
is a solution. Thus $u_1$, $u_2$, $u_3$ span $\mathbf{R}^3$.
4.14. Find conditions on $a$, $b$, $c$ so that $v = (a, b, c)$ in $\mathbf{R}^3$ belongs to $W = \operatorname{span}(u_1, u_2, u_3)$, where
\[ u_1 = (1, 2, 0), \quad u_2 = (-1, 1, 2), \quad u_3 = (3, 0, -4) \]
Set $v$ as a linear combination of $u_1$, $u_2$, $u_3$ using unknowns $x$, $y$, $z$; that is, set $v = x u_1 + y u_2 + z u_3$. This
yields
\[ (a, b, c) = x(1, 2, 0) + y(-1, 1, 2) + z(3, 0, -4) = (x - y + 3z, \; 2x + y, \; 2y - 4z) \]
Form the equivalent system of linear equations and reduce it to echelon form:
\[
\begin{array}{rcl}
x - y + 3z &=& a \\
2x + y &=& b \\
2y - 4z &=& c
\end{array}
\quad\text{or}\quad
\begin{array}{rcl}
x - y + 3z &=& a \\
3y - 6z &=& b - 2a \\
2y - 4z &=& c
\end{array}
\quad\text{or}\quad
\begin{array}{rcl}
x - y + 3z &=& a \\
3y - 6z &=& b - 2a \\
0 &=& 4a - 2b + 3c
\end{array}
\]
The vector $v = (a, b, c)$ belongs to $W$ if and only if the system is consistent, and it is consistent if and only if
$4a - 2b + 3c = 0$. Note, in particular, that $u_1$, $u_2$, $u_3$ do not span the whole space $\mathbf{R}^3$.
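The membership condition of Problem 4.14 can be checked numerically with a rank comparison: $v$ lies in the span exactly when appending $v$ to the spanning vectors does not raise the rank. The helper names `rank` and `in_span` below are hypothetical, and the sketch uses the vectors $u_1 = (1, 2, 0)$, $u_2 = (-1, 1, 2)$, $u_3 = (3, 0, -4)$ of the problem.

```python
from fractions import Fraction

def rank(rows):
    """Number of pivots in an echelon form, computed with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

def in_span(vectors, v):
    """True when v is a linear combination of the given vectors."""
    return rank(list(vectors) + [v]) == rank(vectors)

U = [(1, 2, 0), (-1, 1, 2), (3, 0, -4)]
print(rank(U))                 # the three vectors span only a plane
print(in_span(U, (0, 3, 2)))   # 4*0 - 2*3 + 3*2 = 0
print(in_span(U, (1, 0, 0)))   # 4*1 - 2*0 + 3*0 = 4, not 0
```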
4.15. Show that the vector space $V = P(t)$ of real polynomials cannot be spanned by a finite number of
polynomials.
Any finite set $S$ of polynomials contains a polynomial of maximum degree, say $m$. Then the linear span
$\operatorname{span}(S)$ of $S$ cannot contain a polynomial of degree greater than $m$. Thus $\operatorname{span}(S) \ne V$, for any finite set $S$.
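4.16. Prove Theorem 4.5: Let $S$ be a subset of $V$. (i) Then $\operatorname{span}(S)$ is a subspace of $V$ containing $S$.
(ii) If $W$ is a subspace of $V$ containing $S$, then $\operatorname{span}(S) \subseteq W$.
(i) Suppose $S$ is empty. By definition, $\operatorname{span}(S) = \{0\}$. Hence $\operatorname{span}(S) = \{0\}$ is a subspace of $V$ and
$S \subseteq \operatorname{span}(S)$. Suppose $S$ is not empty and $v \in S$. Then $v = 1v \in \operatorname{span}(S)$; hence $S \subseteq \operatorname{span}(S)$. Also
$0 = 0v \in \operatorname{span}(S)$. Now suppose $u, w \in \operatorname{span}(S)$, say
\[ u = a_1 u_1 + \cdots + a_r u_r = \sum_i a_i u_i \quad\text{and}\quad w = b_1 w_1 + \cdots + b_s w_s = \sum_j b_j w_j \]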
where $u_i, w_j \in S$ and $a_i, b_j \in K$. Then
\[ u + w = \sum_i a_i u_i + \sum_j b_j w_j \quad\text{and}\quad ku = k\Big(\sum_i a_i u_i\Big) = \sum_i k a_i u_i \]
belong to $\operatorname{span}(S)$ since each is a linear combination of vectors in $S$. Thus $\operatorname{span}(S)$ is a subspace of $V$.
(ii) Suppose $u_1, u_2, \ldots, u_r \in S$. Then all the $u_i$ belong to $W$. Thus all multiples $a_1 u_1, a_2 u_2, \ldots, a_r u_r \in W$,
and so the sum $a_1 u_1 + a_2 u_2 + \cdots + a_r u_r \in W$. That is, $W$ contains all linear combinations of elements in
$S$, or, in other words, $\operatorname{span}(S) \subseteq W$, as claimed.
LINEAR DEPENDENCE
4.17. Determine whether or not $u$ and $v$ are linearly dependent, where:
(a) $u = (1, 2)$, $v = (3, -5)$, (c) $u = (1, 2, -3)$, $v = (4, 5, -6)$,
(b) $u = (1, -3)$, $v = (-2, 6)$, (d) $u = (2, 4, -8)$, $v = (3, 6, -12)$
Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) No. (b) Yes; for $v = -2u$. (c) No. (d) Yes, for $v = \tfrac{3}{2}u$.
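4.18. Determine whether or not $u$ and $v$ are linearly dependent where:
(a) $u = 2t^2 + 4t - 3$, $v = 4t^2 + 8t - 6$, (b) $u = 2t^2 - 3t + 4$, $v = 4t^2 - 3t + 2$,
(c) $u = \begin{bmatrix} 1 & 3 & -4 \\ 5 & 0 & -1 \end{bmatrix}$, $v = \begin{bmatrix} -4 & -12 & 16 \\ -20 & 0 & 4 \end{bmatrix}$,
(d) $u = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 2 & 2 \end{bmatrix}$, $v = \begin{bmatrix} 2 & 2 & 2 \\ 3 & 3 & 3 \end{bmatrix}$
Two vectors $u$ and $v$ are linearly dependent if and only if one is a multiple of the other.
(a) Yes; for $v = 2u$. (b) No. (c) Yes, for $v = -4u$. (d) No.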
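4.19. Determine whether or not the vectors $u = (1, 1, 2)$, $v = (2, 3, 1)$, $w = (4, 5, 5)$ in $\mathbf{R}^3$ are linearly
dependent.
Method 1. Set a linear combination of $u$, $v$, $w$ equal to the zero vector using unknowns $x$, $y$, $z$ to obtain the
equivalent homogeneous system of linear equations and then reduce the system to echelon form. This yields
\[
x \begin{bmatrix} 1 \\ 1 \\ 2 \end{bmatrix}
+ y \begin{bmatrix} 2 \\ 3 \\ 1 \end{bmatrix}
+ z \begin{bmatrix} 4 \\ 5 \\ 5 \end{bmatrix}
= \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}
\quad\text{or}\quad
\begin{array}{rcl}
x + 2y + 4z &=& 0 \\
x + 3y + 5z &=& 0 \\
2x + y + 5z &=& 0
\end{array}
\quad\text{or}\quad
\begin{array}{rcl}
x + 2y + 4z &=& 0 \\
y + z &=& 0
\end{array}
\]
The echelon system has only two nonzero equations in three unknowns; hence it has a free variable and a
nonzero solution. Thus $u$, $v$, $w$ are linearly dependent.
Method 2. Form the matrix $A$ whose columns are $u$, $v$, $w$ and reduce to echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 4 \\ 1 & 3 & 5 \\ 2 & 1 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & -3 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 4 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}
\]
The third column does not have a pivot; hence the third vector $w$ is a linear combination of the first two
vectors $u$ and $v$. Thus the vectors are linearly dependent. (Observe that the matrix $A$ is also the coefficient
matrix in Method 1. In other words, this method is essentially the same as the first method.)
Method 3. Form the matrix $B$ whose rows are $u$, $v$, $w$, and reduce to echelon form:
\[
B = \begin{bmatrix} 1 & 1 & 2 \\ 2 & 3 & 1 \\ 4 & 5 & 5 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 1 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & -3 \\ 0 & 0 & 0 \end{bmatrix}
\]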
Since the echelon matrix has only two nonzero rows, the three vectors are linearly dependent. (The three given
vectors span a space of dimension 2.)
4.20. Determine whether or not each of the following lists of vectors in $\mathbf{R}^3$ is linearly dependent:
(a) $u_1 = (1, 2, 5)$, $u_2 = (1, 3, 1)$, $u_3 = (2, 5, 7)$, $u_4 = (3, 1, 4)$,
(b) $u = (1, 2, 5)$, $v = (2, 5, 1)$, $w = (1, 5, 2)$,
(c) $u = (1, 2, 3)$, $v = (0, 0, 0)$, $w = (1, 5, 6)$.
(a) Yes, since any four vectors in $\mathbf{R}^3$ are linearly dependent.
(b) Use Method 2 above; that is, form the matrix $A$ whose columns are the given vectors, and reduce the
matrix to echelon form:
\[
A = \begin{bmatrix} 1 & 2 & 1 \\ 2 & 5 & 5 \\ 5 & 1 & 2 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & -9 & -3 \end{bmatrix}
\sim \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 3 \\ 0 & 0 & 24 \end{bmatrix}
\]
Every column has a pivot entry; hence no vector is a linear combination of the previous vectors. Thus the
vectors are linearly independent.
(c) Since $0 = (0, 0, 0)$ is one of the vectors, the vectors are linearly dependent.
4.21. Show that the functions $f(t) = \sin t$, $g(t) = \cos t$, $h(t) = t$ from $\mathbf{R}$ into $\mathbf{R}$ are linearly independent.
Set a linear combination of the functions equal to the zero function $0$ using unknown scalars $x$, $y$, $z$, that
is, set $xf + yg + zh = 0$; and then show $x = 0$, $y = 0$, $z = 0$. We emphasize that $xf + yg + zh = 0$ means that,
for every value of $t$, we have $xf(t) + yg(t) + zh(t) = 0$.
Thus, in the equation $x \sin t + y \cos t + zt = 0$:
(i) Set $t = 0$ to obtain $x(0) + y(1) + z(0) = 0$ or $y = 0$.
(ii) Set $t = \pi/2$ to obtain $x(1) + y(0) + z\pi/2 = 0$ or $x + \pi z/2 = 0$.
(iii) Set $t = \pi$ to obtain $x(0) + y(-1) + z(\pi) = 0$ or $-y + \pi z = 0$.
The three equations have only the zero solution, that is, $x = 0$, $y = 0$, $z = 0$. Thus $f$, $g$, $h$ are linearly
independent.
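The claim that the three sampled equations have only the zero solution can be checked numerically: the coefficient matrix obtained by evaluating $\sin t$, $\cos t$, $t$ at $t = 0$, $\pi/2$, $\pi$ has determinant $-\pi \ne 0$. The sketch below is a floating-point illustration only; the helper name `det3` is hypothetical.

```python
import math

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

points = [0.0, math.pi / 2, math.pi]
# row for each sample point t: coefficients of x, y, z in x*sin(t) + y*cos(t) + z*t = 0
M = [[math.sin(t), math.cos(t), t] for t in points]
d = det3(M)
print(d)  # close to -pi, so the homogeneous system has only the zero solution
```

A nonzero determinant at one choice of sample points is enough for independence; a zero determinant at some points would not by itself prove dependence.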
4.22. Suppose the vectors $u$, $v$, $w$ are linearly independent. Show that the vectors $u + v$, $u - v$, $u - 2v + w$
are also linearly independent.
Suppose $x(u + v) + y(u - v) + z(u - 2v + w) = 0$. Then
\[
\begin{aligned}
xu + xv + yu - yv + zu - 2zv + zw &= 0 \\
(x + y + z)u + (x - y - 2z)v + zw &= 0
\end{aligned}
\]
Since $u$, $v$, $w$ are linearly independent, the coefficients in the above equation are each 0; hence
\[ x + y + z = 0, \quad x - y - 2z = 0, \quad z = 0 \]
The only solution to the above homogeneous system is $x = 0$, $y = 0$, $z = 0$. Thus $u + v$, $u - v$, $u - 2v + w$ are
linearly independent.
4.23. Show that the vectors $u = (1 + i, 2i)$ and $w = (1, 1 + i)$ in $\mathbf{C}^2$ are linearly dependent over the
complex field $\mathbf{C}$ but linearly independent over the real field $\mathbf{R}$.
Recall that two vectors are linearly dependent (over a field $K$) if and only if one of them is a multiple of
the other (by an element in $K$). Since
\[ (1 + i)w = (1 + i)(1, 1 + i) = (1 + i, 2i) = u \]
$u$ and $w$ are linearly dependent over $\mathbf{C}$. On the other hand, $u$ and $w$ are linearly independent over $\mathbf{R}$, since no
real multiple of $w$ can equal $u$. Specifically, when $k$ is real, the first component of $kw = (k, k + ki)$ must be
real, and it can never equal the first component $1 + i$ of $u$, which is complex.
BASIS AND DIMENSION
4.24. Determine whether or not each of the following form a basis of $\mathbf{R}^3$:
(a) $(1, 1, 1)$, $(1, 0, 1)$; (c) $(1, 1, 1)$, $(1, 2, 3)$, $(2, -1, 1)$;
(b) $(1, 2, 3)$, $(1, 3, 5)$, $(1, 0, 1)$, $(2, 3, 0)$; (d) $(1, 1, 2)$, $(1, 2, 5)$, $(5, 3, 4)$.
(a and b) No, since a basis of $\mathbf{R}^3$ must contain exactly 3 elements because $\dim \mathbf{R}^3 = 3$.
(c) The three vectors form a basis if and only if they are linearly independent. Thus form the matrix whose
rows are the given vectors, and row reduce the matrix to echelon form:
\[
\begin{bmatrix} 1 & 1 & 1 \\ 1 & 2 & 3 \\ 2 & -1 & 1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & -3 & -1 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{bmatrix}
\]
The echelon matrix has no zero rows; hence the three vectors are linearly independent, and so they do
form a basis of $\mathbf{R}^3$.
(d) Form the matrix whose rows are the given vectors, and row reduce the matrix to echelon form:
\[
\begin{bmatrix} 1 & 1 & 2 \\ 1 & 2 & 5 \\ 5 & 3 & 4 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & -2 & -6 \end{bmatrix}
\sim \begin{bmatrix} 1 & 1 & 2 \\ 0 & 1 & 3 \\ 0 & 0 & 0 \end{bmatrix}
\]
The echelon matrix has a zero row; hence the three vectors are linearly dependent, and so they do not
form a basis of $\mathbf{R}^3$.
4.25. Determine whether (1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5) form a basis of R^4. If not, find
the dimension of the subspace they span.
Form the matrix whose rows are the given vectors, and row reduce to echelon form:
[1  1  1  1]   [1  1  1  1]   [1  1  1  1]   [1  1  1  1]
[1  2  3  2] ~ [0  1  2  1] ~ [0  1  2  1] ~ [0  1  2  1]
[2  5  6  4]   [0  3  4  2]   [0  0 -2 -1]   [0  0  2  1]
[2  6  8  5]   [0  4  6  3]   [0  0 -2 -1]   [0  0  0  0]
The echelon matrix has a zero row. Hence the four vectors are linearly dependent and do not form a basis of
R^4. Since the echelon matrix has three nonzero rows, the four vectors span a subspace of dimension 3.
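The row-reduction test used in Problems 4.24 and 4.25 can be sketched in a few lines. The helper `rank` below is a hypothetical name introduced here, not something from the text; it performs forward elimination with exact rational arithmetic and counts the pivots, which equals the number of nonzero rows of an echelon form.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # index of the next pivot row
    for c in range(len(m[0])):
        # look for a row at or below r with a nonzero entry in column c
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

print(rank([(1, 1, 1), (1, 2, 3), (2, -1, 1)]))   # Problem 4.24(c)
print(rank([(1, 1, 2), (1, 2, 5), (5, 3, 4)]))    # Problem 4.24(d)
print(rank([(1, 1, 1, 1), (1, 2, 3, 2), (2, 5, 6, 4), (2, 6, 8, 5)]))  # Problem 4.25
```

A set of n vectors in R^n is a basis exactly when this rank equals n; otherwise the rank is the dimension of the subspace spanned.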
4.26. Extend {u_1 = (1, 1, 1, 1), u_2 = (2, 2, 3, 4)} to a basis of R^4.
First form the matrix with rows u_1 and u_2, and reduce to echelon form:
[1  1  1  1] ~ [1  1  1  1]
[2  2  3  4]   [0  0  1  2]
Then w_1 = (1, 1, 1, 1) and w_2 = (0, 0, 1, 2) span the same set of vectors as spanned by u_1 and u_2. Let
u_3 = (0, 1, 0, 0) and u_4 = (0, 0, 0, 1). Then w_1, u_3, w_2, u_4 form a matrix in echelon form. Thus they are
linearly independent, and they form a basis of R^4. Hence u_1, u_2, u_3, u_4 also form a basis of R^4.
4.27. Consider the complex field C, which contains the real field R, which contains the rational field Q.
(Thus C is a vector space over R, and R is a vector space over Q.)
(a) Show that {1, i} is a basis of C over R; hence C is a vector space of dimension 2 over R.
(b) Show that R is a vector space of infinite dimension over Q.
(a) For any v ∈ C, we have v = a + bi = a(1) + b(i), where a, b ∈ R. Hence {1, i} spans C over R.
Furthermore, if x(1) + y(i) = 0 or x + yi = 0, where x, y ∈ R, then x = 0 and y = 0. Hence {1, i} is
linearly independent over R. Thus {1, i} is a basis for C over R.
(b) It can be shown that π is a transcendental number, that is, π is not a root of any nonzero polynomial over Q.
Thus, for any n, the n + 1 real numbers 1, π, π^2, ..., π^n are linearly independent over Q. Thus R cannot be of
dimension n over Q. Accordingly, R is of infinite dimension over Q.
4.28. Suppose S = {u_1, u_2, ..., u_n} is a subset of V. Show that the following Definitions A and B of a
basis of V are equivalent:
(A) S is linearly independent and spans V.
(B) Every v ∈ V is a unique linear combination of vectors in S.
Suppose (A) holds. Since S spans V, the vector v is a linear combination of the u_i. Suppose v has two such
representations, say
v = a_1u_1 + a_2u_2 + ... + a_nu_n and v = b_1u_1 + b_2u_2 + ... + b_nu_n
Subtracting, we get
0 = v - v = (a_1 - b_1)u_1 + (a_2 - b_2)u_2 + ... + (a_n - b_n)u_n
But the u_i are linearly independent. Hence the coefficients in the above relation are each 0:
a_1 - b_1 = 0, a_2 - b_2 = 0, ..., a_n - b_n = 0
Therefore a_1 = b_1, a_2 = b_2, ..., a_n = b_n. Hence the representation of v as a linear combination of the u_i is
unique. Thus (A) implies (B).
Suppose (B) holds. Then S spans V. Suppose
0 = c_1u_1 + c_2u_2 + ... + c_nu_n
However, we do have 0 = 0u_1 + 0u_2 + ... + 0u_n
By hypothesis, the representation of 0 as a linear combination of the u_i is unique. Hence each c_i = 0 and the u_i
are linearly independent. Thus (B) implies (A).
DIMENSION AND SUBSPACES
4.29. Find a basis and dimension of the subspace W of R^3 where:
(a) W = {(a, b, c) : a + b + c = 0}, (b) W = {(a, b, c) : a = b = c}
(a) Note that W ≠ R^3, since, e.g., (1, 2, 3) ∉ W. Thus dim W < 3. Note that u_1 = (1, 0, -1) and
u_2 = (0, 1, -1) are two independent vectors in W. Thus dim W = 2, and so u_1 and u_2 form a basis of W.
(b) The vector u = (1, 1, 1) ∈ W. Any vector w ∈ W has the form w = (k, k, k). Hence w = ku. Thus u
spans W and dim W = 1.
4.30. Let W be the subspace of R^4 spanned by the vectors
u_1 = (1, -2, 5, -3), u_2 = (2, 3, 1, -4), u_3 = (3, 8, -3, -5)
(a) Find a basis and dimension of W. (b) Extend the basis of W to a basis of R^4.
(a) Apply Algorithm 4.1, the row space algorithm. Form the matrix A whose rows are the given vectors, and
reduce it to echelon form:
[1 -2  5 -3]   [1 -2   5 -3]   [1 -2  5 -3]
[2  3  1 -4] ~ [0  7  -9  2] ~ [0  7 -9  2]
[3  8 -3 -5]   [0 14 -18  4]   [0  0  0  0]
The nonzero rows (1, -2, 5, -3) and (0, 7, -9, 2) of the echelon matrix form a basis of the row space of
A and hence of W. Thus, in particular, dim W = 2.
(b) We seek four linearly independent vectors, which include the above two vectors. The four vectors
(1, -2, 5, -3), (0, 7, -9, 2), (0, 0, 1, 0), and (0, 0, 0, 1) are linearly independent (since they form an
echelon matrix), and so they form a basis of R^4, which is an extension of the basis of W.
4.31. Let W be the subspace of R^5 spanned by u_1 = (1, 2, -1, 3, 4), u_2 = (2, 4, -2, 6, 8),
u_3 = (1, 3, 2, 2, 6), u_4 = (1, 4, 5, 1, 8), u_5 = (2, 7, 3, 3, 9). Find a subset of the vectors that
form a basis of W.
Here we use Algorithm 4.2, the casting-out algorithm. Form the matrix M whose columns (not rows) are
the given vectors, and reduce it to echelon form:
    [ 1  2  1  1  2]   [1  2  1  1  2]   [1  2  1  1  2]
    [ 2  4  3  4  7]   [0  0  1  2  3]   [0  0  1  2  3]
M = [-1 -2  2  5  3] ~ [0  0  3  6  5] ~ [0  0  0  0 -4]
    [ 3  6  2  1  3]   [0  0 -1 -2 -3]   [0  0  0  0  0]
    [ 4  8  6  8  9]   [0  0  2  4  1]   [0  0  0  0  0]
The pivot positions are in columns C_1, C_3, C_5. Hence the corresponding vectors u_1, u_3, u_5 form a basis of W,
and dim W = 3.
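The casting-out step of Problem 4.31 can be sketched as follows. The function name `pivot_columns` is an assumption made for this illustration; it places the given vectors as columns, performs forward elimination with exact fractions, and reports which columns receive a pivot (0-based indices, so index 0 corresponds to u_1).

```python
from fractions import Fraction

def pivot_columns(vectors):
    """Indices j such that vectors[j] is not a linear combination of earlier vectors."""
    # The columns of M are the given vectors, so M has len(vectors[0]) rows.
    m = [[Fraction(v[i]) for v in vectors] for i in range(len(vectors[0]))]
    pivots, r = [], 0
    for c in range(len(vectors)):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot here: this vector is cast out
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    return pivots

us = [(1, 2, -1, 3, 4), (2, 4, -2, 6, 8), (1, 3, 2, 2, 6),
      (1, 4, 5, 1, 8), (2, 7, 3, 3, 9)]
print(pivot_columns(us))  # 0-based indices of the vectors kept as a basis
```

For the five vectors above the pivots fall at indices 0, 2, 4, that is, u_1, u_3, u_5, matching the solution.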
4.32. Let V be the vector space of 2 x 2 matrices over K. Let W be the subspace of symmetric matrices.
Show that dim W = 3, by finding a basis of W.
Recall that a matrix A = [a_ij] is symmetric if A^T = A, or, equivalently, each a_ij = a_ji. Thus
A = [a  b]
    [b  d]
denotes an arbitrary 2 x 2 symmetric matrix. Setting (i) a = 1, b = 0, d = 0, (ii) a = 0, b = 1, d = 0,
(iii) a = 0, b = 0, d = 1, we obtain the respective matrices:
E_1 = [1  0],   E_2 = [0  1],   E_3 = [0  0]
      [0  0]          [1  0]          [0  1]
We claim that S = {E_1, E_2, E_3} is a basis of W; that is, (a) S spans W and (b) S is linearly independent.
(a) The above matrix A = aE_1 + bE_2 + dE_3. Thus S spans W.
(b) Suppose xE_1 + yE_2 + zE_3 = 0, where x, y, z are unknown scalars. That is, suppose
x[1  0] + y[0  1] + z[0  0] = [0  0]   or   [x  y] = [0  0]
 [0  0]    [1  0]    [0  1]   [0  0]        [y  z]   [0  0]
Setting corresponding entries equal to each other yields x = 0, y = 0, z = 0. Thus S is linearly independent.
Therefore, S is a basis of W, as claimed.
THEOREMS ON LINEAR DEPENDENCE, BASIS, AND DIMENSION
4.33. Prove Lemma 4.10: Suppose two or more nonzero vectors v_1, v_2, ..., v_m are linearly dependent.
Then one of them is a linear combination of the preceding vectors.
Since the v_i are linearly dependent, there exist scalars a_1, ..., a_m, not all 0, such that
a_1v_1 + ... + a_mv_m = 0. Let k be the largest integer such that a_k ≠ 0. Then
a_1v_1 + ... + a_kv_k + 0v_{k+1} + ... + 0v_m = 0 or a_1v_1 + ... + a_kv_k = 0
Suppose k = 1; then a_1v_1 = 0, a_1 ≠ 0, and so v_1 = 0. But the v_i are nonzero vectors. Hence k > 1 and
v_k = -a_k^{-1}a_1v_1 - ... - a_k^{-1}a_{k-1}v_{k-1}
That is, v_k is a linear combination of the preceding vectors.
4.34. Suppose S = {v_1, v_2, ..., v_m} spans a vector space V.
(a) If w ∈ V, then {w, v_1, ..., v_m} is linearly dependent and spans V.
(b) If v_i is a linear combination of v_1, ..., v_{i-1}, then S without v_i spans V.
(a) The vector w is a linear combination of the v_i, since {v_i} spans V. Accordingly, {w, v_1, ..., v_m} is linearly
dependent. Clearly, w with the v_i span V, since the v_i by themselves span V, that is, {w, v_1, ..., v_m} spans
V.
(b) Suppose v_i = k_1v_1 + ... + k_{i-1}v_{i-1}. Let u ∈ V. Since {v_i} spans V, u is a linear combination of the v_j's,
say u = a_1v_1 + ... + a_mv_m. Substituting for v_i, we obtain
u = a_1v_1 + ... + a_{i-1}v_{i-1} + a_i(k_1v_1 + ... + k_{i-1}v_{i-1}) + a_{i+1}v_{i+1} + ... + a_mv_m
  = (a_1 + a_ik_1)v_1 + ... + (a_{i-1} + a_ik_{i-1})v_{i-1} + a_{i+1}v_{i+1} + ... + a_mv_m
Thus {v_1, ..., v_{i-1}, v_{i+1}, ..., v_m} spans V. In other words, we can delete v_i from the spanning set and still
retain a spanning set.
4.35. Prove Lemma 4.13: Suppose {v_1, v_2, ..., v_n} spans V, and suppose {w_1, w_2, ..., w_m} is linearly
independent. Then m ≤ n, and V is spanned by a set of the form
{w_1, w_2, ..., w_m, v_{i_1}, v_{i_2}, ..., v_{i_{n-m}}}
Thus any n + 1 or more vectors in V are linearly dependent.
It suffices to prove the lemma in the case that the v_i are all not 0. (Prove!) Since {v_i} spans V, we have by
Problem 4.34 that
{w_1, v_1, ..., v_n}     (1)
is linearly dependent and also spans V. By Lemma 4.10, one of the vectors in (1) is a linear combination of the
preceding vectors. This vector cannot be w_1, so it must be one of the v's, say v_j. Thus by Problem 4.34, we can
delete v_j from the spanning set (1) and obtain the spanning set
{w_1, v_1, ..., v_{j-1}, v_{j+1}, ..., v_n}     (2)
Now we repeat the argument with the vector w_2. That is, since (2) spans V, the set
{w_1, w_2, v_1, ..., v_{j-1}, v_{j+1}, ..., v_n}     (3)
is linearly dependent and also spans V. Again by Lemma 4.10, one of the vectors in (3) is a linear combination
of the preceding vectors. We emphasize that this vector cannot be w_1 or w_2, since {w_1, ..., w_m} is
independent; hence it must be one of the v's, say v_k. Thus, by Problem 4.34, we can delete v_k from the
spanning set (3) and obtain the spanning set
{w_1, w_2, v_1, ..., v_{j-1}, v_{j+1}, ..., v_{k-1}, v_{k+1}, ..., v_n}
We repeat the argument with w_3, and so forth. At each step, we are able to add one of the w's and delete
one of the v's in the spanning set. If m ≤ n, then we finally obtain a spanning set of the required form:
{w_1, ..., w_m, v_{i_1}, ..., v_{i_{n-m}}}
Finally, we show that m > n is not possible. Otherwise, after n of the above steps, we obtain the spanning
set {w_1, ..., w_n}. This implies that w_{n+1} is a linear combination of w_1, ..., w_n, which contradicts the
hypothesis that {w_i} is linearly independent.
4.36. Prove Theorem 4.12: Every basis of a vector space V has the same number of elements.
Suppose {u_1, u_2, ..., u_n} is a basis of V, and suppose {v_1, v_2, ...} is another basis of V. Since {u_i} spans V,
the basis {v_1, v_2, ...} must contain n or fewer vectors, or else it is linearly dependent by Problem 4.35, that is,
Lemma 4.13. On the other hand, if the basis {v_1, v_2, ...} contains fewer than n elements, then {u_1, u_2, ..., u_n} is
linearly dependent by Problem 4.35. Thus the basis {v_1, v_2, ...} contains exactly n vectors, and so the theorem
is true.
4.37. Prove Theorem 4.14: Let V be a vector space of finite dimension n. Then:
(i) Any n + 1 or more vectors must be linearly dependent.
(ii) Any linearly independent set S = {u_1, u_2, ..., u_n} with n elements is a basis of V.
(iii) Any spanning set T = {v_1, v_2, ..., v_n} of V with n elements is a basis of V.
Suppose B = {w_1, w_2, ..., w_n} is a basis of V.
(i) Since B spans V, any n + 1 or more vectors are linearly dependent by Lemma 4.13.
(ii) By Lemma 4.13, elements from B can be adjoined to S to form a spanning set of V with n elements.
Since S already has n elements, S itself is a spanning set of V. Thus S is a basis of V.
(iii) Suppose T is linearly dependent. Then some v_i is a linear combination of the preceding vectors. By
Problem 4.34, V is spanned by the vectors in T without v_i, and there are n - 1 of them. By Lemma 4.13,
the independent set B cannot have more than n - 1 elements. This contradicts the fact that B has n
elements. Thus T is linearly independent, and hence T is a basis of V.
4.38. Prove Theorem 4.15: Suppose S spans a vector space V. Then:
(i) Any maximum number of linearly independent vectors in S form a basis of V.
(ii) Suppose one deletes from S every vector that is a linear combination of preceding vectors in
S. Then the remaining vectors form a basis of V.
(i) Suppose {v_1, ..., v_m} is a maximum linearly independent subset of S, and suppose w ∈ S. Accordingly
{v_1, ..., v_m, w} is linearly dependent. No v_k can be a linear combination of preceding vectors. Hence w is
a linear combination of the v_i. Thus w ∈ span(v_i), and hence S ⊆ span(v_i). This leads to
V = span(S) ⊆ span(v_i) ⊆ V
Thus {v_i} spans V, and, since it is linearly independent, it is a basis of V.
(ii) The remaining vectors form a maximum linearly independent subset of S; hence, by (i), it is a basis of V.
4.39. Prove Theorem 4.16: Let V be a vector space of finite dimension and let S = {u_1, u_2, ..., u_r} be a
set of linearly independent vectors in V. Then S is part of a basis of V; that is, S may be extended to
a basis of V.
Suppose B = {w_1, w_2, ..., w_n} is a basis of V. Then B spans V, and hence V is spanned by
S ∪ B = {u_1, u_2, ..., u_r, w_1, w_2, ..., w_n}
By Theorem 4.15, we can delete from S ∪ B each vector that is a linear combination of preceding vectors to
obtain a basis B' for V. Since S is linearly independent, no u_k is a linear combination of preceding vectors.
Thus B' contains every vector in S, and S is part of the basis B' for V.
4.40. Prove Theorem 4.17: Let W be a subspace of an n-dimensional vector space V. Then dim W ≤ n. In
particular, if dim W = n, then W = V.
Since V is of dimension n, any n + 1 or more vectors are linearly dependent. Furthermore, since a basis
of W consists of linearly independent vectors, it cannot contain more than n elements. Accordingly,
dim W ≤ n.
In particular, if {w_1, ..., w_n} is a basis of W, then, since it is an independent set with n elements, it is also
a basis of V. Thus W = V when dim W = n.
RANK OF A MATRIX, ROW AND COLUMN SPACES
4.41. Find the rank and basis of the row space of each of the following matrices:
        [1   2   0  -1]           [1  3   1  -2  -3]
(a) A = [2   6  -3  -3],  (b) B = [1  4   3  -1  -4]
        [3  10  -6  -5]           [2  3  -4  -7  -3]
                                  [3  8   1  -7  -8]
(a) Row reduce A to echelon form:
    [1  2   0  -1]   [1  2   0  -1]
A ~ [0  2  -3  -1] ~ [0  2  -3  -1]
    [0  4  -6  -2]   [0  0   0   0]
The two nonzero rows (1, 2, 0, -1) and (0, 2, -3, -1) of the echelon form of A form a basis for
rowsp(A). In particular, rank(A) = 2.
(b) Row reduce B to echelon form:
    [1   3   1  -2  -3]   [1  3  1  -2  -3]
B ~ [0   1   2   1  -1] ~ [0  1  2   1  -1]
    [0  -3  -6  -3   3]   [0  0  0   0   0]
    [0  -1  -2  -1   1]   [0  0  0   0   0]
The two nonzero rows (1, 3, 1, -2, -3) and (0, 1, 2, 1, -1) of the echelon form of B form a basis for
rowsp(B). In particular, rank(B) = 2.
4.42. Show that U = W, where U and W are the following subspaces of R^3:
U = span(u_1, u_2, u_3) = span{(1, 1, -1), (2, 3, -1), (3, 1, -5)}
W = span(w_1, w_2, w_3) = span{(1, -1, -3), (3, -2, -8), (2, 1, -3)}
Form the matrix A whose rows are the u_i, and row reduce A to row canonical form:
    [1  1  -1]   [1   1  -1]   [1  0  -2]
A = [2  3  -1] ~ [0   1   1] ~ [0  1   1]
    [3  1  -5]   [0  -2  -2]   [0  0   0]
Next form the matrix B whose rows are the w_j, and row reduce B to row canonical form:
    [1  -1  -3]   [1  -1  -3]   [1  0  -2]
B = [3  -2  -8] ~ [0   1   1] ~ [0  1   1]
    [2   1  -3]   [0   3   3]   [0  0   0]
Since A and B have the same row canonical form, the row spaces of A and B are equal, and so U = W.
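The comparison in Problem 4.42 relies on reducing both spanning sets to row canonical form. A minimal sketch of that reduction follows; `row_canonical` is a hypothetical helper written for this illustration, using exact fractions so the two results can be compared with `==`.

```python
from fractions import Fraction

def row_canonical(rows):
    """Nonzero rows of the row canonical (reduced echelon) form, as tuples."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]        # scale the pivot to 1
        for i in range(len(m)):
            if i != r and m[i][c] != 0:           # clear the rest of the pivot column
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return [tuple(row) for row in m[:r]]

U = [(1, 1, -1), (2, 3, -1), (3, 1, -5)]
W = [(1, -1, -3), (3, -2, -8), (2, 1, -3)]
print(row_canonical(U) == row_canonical(W))
```

Both calls return the rows (1, 0, -2) and (0, 1, 1), so the two row spaces coincide.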
4.43. Let
    [1  2  1   2   3   1]
A = [2  4  3   7   7   4]
    [1  2  2   5   5   6]
    [3  6  6  15  14  15]
(a) Find rank(M_k), for k = 1, 2, ..., 6, where M_k is the submatrix of A consisting of the first k
columns C_1, C_2, ..., C_k of A.
(b) Which columns C_{k+1} are linear combinations of preceding columns C_1, ..., C_k?
(c) Find columns of A that form a basis for the column space of A.
(d) Express column C_4 as a linear combination of the columns in part (c).
(a) Row reduce A to echelon form:
    [1  2  1  2  3   1]   [1  2  1  2  3  1]
A ~ [0  0  1  3  1   2] ~ [0  0  1  3  1  2]
    [0  0  1  3  2   5]   [0  0  0  0  1  3]
    [0  0  3  9  5  12]   [0  0  0  0  0  0]
Observe that this simultaneously reduces all the matrices M_k to echelon form; for example, the first four
columns of the echelon form of A are an echelon form of M_4. We know that rank(M_k) is equal to the
number of pivots or, equivalently, the number of nonzero rows in an echelon form of M_k. Thus
rank(M_1) = rank(M_2) = 1, rank(M_3) = rank(M_4) = 2
rank(M_5) = rank(M_6) = 3
(b) The vector equation x_1C_1 + x_2C_2 + ... + x_kC_k = C_{k+1} yields the system with coefficient matrix M_k and
augmented matrix M_{k+1}. Thus C_{k+1} is a linear combination of C_1, ..., C_k if and only if rank(M_k) = rank(M_{k+1})
or, equivalently, if C_{k+1} does not contain a pivot. Thus each of C_2, C_4, C_6 is a linear combination of
preceding columns.
(c) In the echelon form of A, the pivots are in the first, third, and fifth columns. Thus columns C_1, C_3, C_5 of
A form a basis for the column space of A. Alternatively, deleting columns C_2, C_4, C_6 from the spanning
set of columns (they are linear combinations of other columns), we obtain, again, C_1, C_3, C_5.
(d) The echelon matrix tells us that C_4 is a linear combination of columns C_1 and C_3. The augmented matrix
M of the vector equation C_4 = xC_1 + yC_3 consists of the columns C_1, C_3, C_4 of A which, when reduced
to echelon form, yields the matrix (omitting zero rows)
[1  1  2]   or   x + y = 2   or   x = -1, y = 3
[0  1  3]            y = 3
Thus C_4 = -C_1 + 3C_3 = -C_1 + 3C_3 + 0C_5.
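The column relations in Problem 4.43 can be confirmed by direct arithmetic. The sketch below uses only plain Python; the helper `combo` is a name introduced here for illustration. The coefficients for C_2 and C_4 come from the solution above, while those for C_6 are not stated in the text and were obtained here by back-substitution in the echelon form, so treat them as a worked check rather than part of the original solution.

```python
# Matrix A of Problem 4.43, row by row.
A = [
    (1, 2, 1, 2, 3, 1),
    (2, 4, 3, 7, 7, 4),
    (1, 2, 2, 5, 5, 6),
    (3, 6, 6, 15, 14, 15),
]
C1, C2, C3, C4, C5, C6 = zip(*A)  # columns of A as tuples

def combo(coeffs, cols):
    """Linear combination sum(c * col), computed entry by entry."""
    return tuple(sum(c * x for c, x in zip(coeffs, entries)) for entries in zip(*cols))

basis = (C1, C3, C5)
print(combo((2, 0, 0), basis) == C2)     # C2 = 2*C1
print(combo((-1, 3, 0), basis) == C4)    # C4 = -C1 + 3*C3, as in part (d)
print(combo((-7, -1, 3), basis) == C6)   # C6 = -7*C1 - C3 + 3*C5 (computed here)
```

Each comparison holds, which agrees with part (b): C_2, C_4, C_6 carry no pivot and depend on the pivot columns C_1, C_3, C_5.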
4.44. Suppose u = (a_1, a_2, ..., a_n) is a linear combination of the rows R_1, R_2, ..., R_m of a matrix
B = [b_ij], say u = k_1R_1 + k_2R_2 + ... + k_mR_m. Prove that
a_i = k_1b_{1i} + k_2b_{2i} + ... + k_mb_{mi}, i = 1, 2, ..., n
where b_{1i}, b_{2i}, ..., b_{mi} are the entries in the ith column of B.
We are given that u = k_1R_1 + k_2R_2 + ... + k_mR_m. Hence
(a_1, a_2, ..., a_n) = k_1(b_11, ..., b_1n) + ... + k_m(b_m1, ..., b_mn)
                     = (k_1b_11 + ... + k_mb_m1, ..., k_1b_1n + ... + k_mb_mn)
Setting corresponding components equal to each other, we obtain the desired result.
4.45. Prove Theorem 4.7: Suppose A = [a_ij] and B = [b_ij] are row equivalent echelon matrices with
respective pivot entries
a_{1j_1}, a_{2j_2}, ..., a_{rj_r} and b_{1k_1}, b_{2k_2}, ..., b_{sk_s}
(pictured in Fig. 4-5). Then A and B have the same number of nonzero rows, that is, r = s, and their
pivot entries are in the same positions, that is, j_1 = k_1, j_2 = k_2, ..., j_r = k_r.
[Fig. 4-5: the echelon matrices A and B, with pivot entries a_{1j_1}, ..., a_{rj_r} and b_{1k_1}, ..., b_{sk_s}]
Clearly A = 0 if and only if B = 0, and so we need only prove the theorem when r ≥ 1 and s ≥ 1. We
first show that j_1 = k_1. Suppose j_1 < k_1. Then the j_1th column of B is zero. Since the first row R* of A is in the
row space of B, we have R* = c_1R_1 + c_2R_2 + ... + c_mR_m, where the R_i are the rows of B. Since the j_1th
column of B is zero, we have
a_{1j_1} = c_1 0 + c_2 0 + ... + c_m 0 = 0
But this contradicts the fact that the pivot entry a_{1j_1} ≠ 0. Hence j_1 ≥ k_1 and, similarly, k_1 ≥ j_1. Thus j_1 = k_1.
Now let A' be the submatrix of A obtained by deleting the first row of A, and let B' be the submatrix of B
obtained by deleting the first row of B. We prove that A' and B' have the same row space. The theorem will
then follow by induction, since A' and B' are also echelon matrices.
Let R = (a_1, a_2, ..., a_n) be any row of A' and let R_1, ..., R_m be the rows of B. Since R is in the row space
of B, there exist scalars d_1, ..., d_m such that R = d_1R_1 + d_2R_2 + ... + d_mR_m. Since A is in echelon form and
R is not the first row of A, the j_1th entry of R is zero: a_i = 0 for i = j_1 = k_1. Furthermore, since B is in echelon
form, all the entries in the k_1th column of B are 0 except the first: b_{1k_1} ≠ 0, but b_{2k_1} = 0, ..., b_{mk_1} = 0. Thus
0 = a_{k_1} = d_1b_{1k_1} + d_2 0 + ... + d_m 0 = d_1b_{1k_1}
Now b_{1k_1} ≠ 0 and so d_1 = 0. Thus R is a linear combination of R_2, ..., R_m and so is in the row space of B'.
Since R was any row of A', the row space of A' is contained in the row space of B'. Similarly, the row space of
B' is contained in the row space of A'. Thus A' and B' have the same row space, and so the theorem is proved.
4.46. Prove Theorem 4.8: Suppose A and B are row canonical matrices. Then A and B have the same row
space if and only if they have the same nonzero rows.
Obviously, if A and B have the same nonzero rows, then they have the same row space. Thus we only
have to prove the converse.
Suppose A and B have the same row space, and suppose R ≠ 0 is the ith row of A. Then there exist scalars
c_1, ..., c_s such that
R = c_1R_1 + c_2R_2 + ... + c_sR_s     (1)
where the R_i are the nonzero rows of B. The theorem is proved if we show that R = R_i, that is, that c_i = 1 but
c_k = 0 for k ≠ i.
Let a_{ij_i} be the pivot entry in R, i.e., the first nonzero entry of R. By (1) and Problem 4.44,
a_{ij_i} = c_1b_{1j_i} + c_2b_{2j_i} + ... + c_sb_{sj_i}     (2)
But, by Problem 4.45, b_{ij_i} is a pivot entry of B, and, since B is row reduced, it is the only nonzero entry in the
j_ith column of B. Thus, from (2), we obtain a_{ij_i} = c_ib_{ij_i}. However, a_{ij_i} = 1 and b_{ij_i} = 1, since A and B are row
reduced; hence c_i = 1.
Now suppose k ≠ i, and b_{kj_k} is the pivot entry in R_k. By (1) and Problem 4.44,
a_{ij_k} = c_1b_{1j_k} + c_2b_{2j_k} + ... + c_sb_{sj_k}     (3)
Since B is row reduced, b_{kj_k} is the only nonzero entry in the j_kth column of B. Hence, by (3), a_{ij_k} = c_kb_{kj_k}.
Furthermore, by Problem 4.45, a_{kj_k} is a pivot entry of A, and since A is row reduced, a_{ij_k} = 0. Thus c_kb_{kj_k} = 0,
and since b_{kj_k} = 1, c_k = 0. Accordingly R = R_i, and the theorem is proved.
4.47. Prove Corollary 4.9: Every matrix A is row equivalent to a unique matrix in row canonical form.
Suppose A is row equivalent to matrices A_1 and A_2, where A_1 and A_2 are in row canonical form. Then
rowsp(A) = rowsp(A_1) and rowsp(A) = rowsp(A_2). Hence rowsp(A_1) = rowsp(A_2). Since A_1 and A_2 are in
row canonical form, A_1 = A_2 by Theorem 4.8. Thus the corollary is proved.
4.48. Suppose RB and AB are defined, where R is a row vector and A and B are matrices. Prove:
(a) RB is a linear combination of the rows of B.
(b) The row space of AB is contained in the row space of B.
(c) The column space of AB is contained in the column space of A.
(d) rank(AB) ≤ rank(B) and rank(AB) ≤ rank(A).
(a) Suppose R = (a_1, a_2, ..., a_m) and B = [b_ij]. Let B_1, ..., B_m denote the rows of B and B^1, ..., B^n its
columns. Then
RB = (RB^1, RB^2, ..., RB^n)
   = (a_1b_11 + a_2b_21 + ... + a_mb_m1, ..., a_1b_1n + a_2b_2n + ... + a_mb_mn)
   = a_1(b_11, b_12, ..., b_1n) + a_2(b_21, b_22, ..., b_2n) + ... + a_m(b_m1, b_m2, ..., b_mn)
   = a_1B_1 + a_2B_2 + ... + a_mB_m
Thus RB is a linear combination of the rows of B, as claimed.
(b) The rows of AB are R_iB, where R_i is the ith row of A. Thus, by part (a), each row of AB is in the row
space of B. Thus rowsp(AB) ⊆ rowsp(B), as claimed.
(c) Using part (b), we have
colsp(AB) = rowsp((AB)^T) = rowsp(B^T A^T) ⊆ rowsp(A^T) = colsp(A)
(d) The row space of AB is contained in the row space of B; hence rank(AB) ≤ rank(B). Furthermore, the
column space of AB is contained in the column space of A; hence rank(AB) ≤ rank(A).
4.49. Let A be an n-square matrix. Show that A is invertible if and only if rank(A) = n.
Note that the rows of the n-square identity matrix I_n are linearly independent, since I_n is in echelon form;
hence rank(I_n) = n. Now if A is invertible, then A is row equivalent to I_n; hence rank(A) = n. But if A is not
invertible, then A is row equivalent to a matrix with a zero row; hence rank(A) < n. That is, A is invertible if
and only if rank(A) = n.
APPLICATIONS TO LINEAR EQUATIONS
4.50. Find the dimension and a basis of the solution space W of each homogeneous system:
(a)  x + 2y + 2z - s + 3t = 0
     x + 2y + 3z + s +  t = 0
    3x + 6y + 8z + s + 5t = 0
(b)  x + 2y +  z - 2t = 0
    2x + 4y + 4z - 3t = 0
    3x + 6y + 7z - 4t = 0
(c)  x +  y + 2z = 0
    2x + 3y + 3z = 0
     x + 3y + 5z = 0
(a) Reduce the system to echelon form:
x + 2y + 2z - s + 3t = 0
          z + 2s - 2t = 0
         2z + 4s - 4t = 0
or
x + 2y + 2z - s + 3t = 0
          z + 2s - 2t = 0
The system in echelon form has two (nonzero) equations in five unknowns. Hence the system has
5 - 2 = 3 free variables, which are y, s, t. Thus dim W = 3. We obtain a basis for W:
(1) Set y = 1, s = 0, t = 0 to obtain the solution v_1 = (-2, 1, 0, 0, 0)
(2) Set y = 0, s = 1, t = 0 to obtain the solution v_2 = (5, 0, -2, 1, 0)
(3) Set y = 0, s = 0, t = 1 to obtain the solution v_3 = (-7, 0, 2, 0, 1)
The set {v_1, v_2, v_3} is a basis of the solution space W.
(b) (Here we use the matrix format of our homogeneous system.) Reduce the coefficient matrix A to echelon
form:
    [1  2  1  -2]   [1  2  1  -2]   [1  2  1  -2]
A = [2  4  4  -3] ~ [0  0  2   1] ~ [0  0  2   1]
    [3  6  7  -4]   [0  0  4   2]   [0  0  0   0]
This corresponds to the system
x + 2y + z - 2t = 0
         2z +  t = 0
The free variables are y and t, and dim W = 2.
(i) Set y = 1, t = 0 to obtain the solution u_1 = (-2, 1, 0, 0).
(ii) Set y = 0, t = 2 to obtain the solution u_2 = (5, 0, -1, 2).
Then {u_1, u_2} is a basis of W.
(c) Reduce the coefficient matrix A to echelon form:
    [1  1  2]   [1  1   2]   [1  1   2]
A = [2  3  3] ~ [0  1  -1] ~ [0  1  -1]
    [1  3  5]   [0  2   3]   [0  0   5]
This corresponds to a triangular system with no free variables. Thus 0 is the only solution, that is,
W = {0}. Hence dim W = 0.
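The procedure of Problem 4.50 (reduce, identify the free variables, set each free variable to 1 in turn) can be sketched as below. The function `solution_basis` is a hypothetical helper written for this illustration; it reduces the coefficient matrix to row canonical form with exact fractions and reads off one basis vector per free variable.

```python
from fractions import Fraction

def solution_basis(rows):
    """Basis of {x : Ax = 0}: one vector per free variable (that variable = 1, other free ones = 0)."""
    n = len(rows[0])
    m = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(n):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        m[r] = [x / m[r][c] for x in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    basis = []
    for f in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[f] = Fraction(1)
        for i, pc in enumerate(pivots):
            v[pc] = -m[i][f]  # each pivot variable is determined by the free one
        basis.append(tuple(v))
    return basis

sys_a = [(1, 2, 2, -1, 3), (1, 2, 3, 1, 1), (3, 6, 8, 1, 5)]
sys_b = [(1, 2, 1, -2), (2, 4, 4, -3), (3, 6, 7, -4)]
sys_c = [(1, 1, 2), (2, 3, 3), (1, 3, 5)]
for s in (sys_a, sys_b, sys_c):
    print(len(solution_basis(s)), solution_basis(s))
```

For system (b) the second vector comes out as (5/2, 0, -1/2, 1), which is (5, 0, -1, 2) after scaling by 2; any nonzero multiple serves equally well in a basis.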
4.51. Find a homogeneous system whose solution set W is spanned by
{u_1, u_2, u_3} = {(1, -2, 0, 3), (1, -1, -1, 4), (1, 0, -2, 5)}
Let v = (x, y, z, t). Then v ∈ W if and only if v is a linear combination of the vectors u_1, u_2, u_3 that span
W. Thus form the matrix M whose first columns are u_1, u_2, u_3 and whose last column is v, and then row reduce
M to echelon form. This yields
    [ 1   1   1  x]   [1   1   1  x      ]   [1  1  1  x          ]
M = [-2  -1   0  y] ~ [0   1   2  2x + y ] ~ [0  1  2  2x + y     ]
    [ 0  -1  -2  z]   [0  -1  -2  z      ]   [0  0  0  2x + y + z ]
    [ 3   4   5  t]   [0   1   2  -3x + t]   [0  0  0  -5x - y + t]
Then v is a linear combination of u_1, u_2, u_3 if rank(M) = rank(A), where A is the submatrix without column v.
Thus set the last two entries in the fourth column on the right equal to zero to obtain the required
homogeneous system:
2x + y + z = 0
5x + y - t = 0
4.52. Let x_{i_1}, x_{i_2}, ..., x_{i_k} be the free variables of a homogeneous system of linear equations with n
unknowns. Let v_j be the solution for which x_{i_j} = 1, and all other free variables equal 0. Show that
the solutions v_1, v_2, ..., v_k are linearly independent.
Let A be the matrix whose rows are the v_i. We interchange column 1 and column i_1, then column 2 and
column i_2, ..., then column k and column i_k, and we obtain the k x n matrix
             [1  0  0  ...  0  0  c_{1,k+1}  ...  c_{1n}]
B = [I, C] = [0  1  0  ...  0  0  c_{2,k+1}  ...  c_{2n}]
             [.......................................]
             [0  0  0  ...  0  1  c_{k,k+1}  ...  c_{kn}]
The above matrix B is in echelon form, and so its rows are independent; hence rank(B) = k. Since A and B are
column equivalent, they have the same rank, i.e., rank(A) = k. But A has k rows; hence these rows, i.e., the v_i,
are linearly independent, as claimed.
SUMS, DIRECT SUMS, INTERSECTIONS
4.53. Let U and W be subspaces of a vector space V. Show that:
(a) U + W is a subspace of V.
(b) U and W are contained in U + W.
(c) U + W is the smallest subspace containing U and W, that is, U + W = span(U, W).
(d) W + W = W.
(a) Since U and W are subspaces, 0 ∈ U and 0 ∈ W. Hence 0 = 0 + 0 belongs to U + W. Now suppose
v, v' ∈ U + W. Then v = u + w and v' = u' + w', where u, u' ∈ U and w, w' ∈ W. Then
av + bv' = (au + bu') + (aw + bw') ∈ U + W
Thus U + W is a subspace of V.
(b) Let u ∈ U. Since W is a subspace, 0 ∈ W. Hence u = u + 0 belongs to U + W. Thus U ⊆ U + W.
Similarly, W ⊆ U + W.
(c) Since U + W is a subspace of V containing U and W, it must also contain the linear span of U and W.
That is, span(U, W) ⊆ U + W.
On the other hand, if v ∈ U + W, then v = u + w = 1u + 1w, where u ∈ U and w ∈ W. Thus v is a
linear combination of elements in U ∪ W, and so v ∈ span(U, W). Hence U + W ⊆ span(U, W).
The two inclusion relations give the desired result.
(d) Since W is a subspace of V, we have that W is closed under vector addition; hence W + W ⊆ W. By
part (b), W ⊆ W + W. Hence W + W = W.
4.54. Consider the following subspaces of R^5:
U = span(u_1, u_2, u_3) = span{(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)}
W = span(w_1, w_2, w_3) = span{(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)}
Find a basis and the dimension of: (a) U + W, (b) U ∩ W.
(a) U + W is the space spanned by all six vectors. Hence form the matrix whose rows are the given six
vectors, and then row reduce to echelon form:
[1  3  -2   2  3]   [1   3  -2   2   3]   [1  3  -2  2   3]
[1  4  -3   4  2]   [0   1  -1   2  -1]   [0  1  -1  2  -1]
[2  3  -1  -2  9] ~ [0  -3   3  -6   3] ~ [0  0   1  0  -1]
[1  3   0   2  1]   [0   0   2   0  -2]   [0  0   0  0   0]
[1  5  -6   6  3]   [0   2  -4   4   0]   [0  0   0  0   0]
[2  5   3   2  1]   [0  -1   7  -2  -5]   [0  0   0  0   0]
The following three nonzero rows of the echelon matrix form a basis of U + W:
(1, 3, -2, 2, 3), (0, 1, -1, 2, -1), (0, 0, 1, 0, -1)
Thus dim(U + W) = 3.
(b) Let v = (x, y, z, s, t) denote an arbitrary element in R^5. First find, say as in Problem 4.51, homogeneous
systems whose solution sets are U and W, respectively.
Let M be the matrix whose columns are the u_i and v, and reduce M to echelon form:
    [ 1   1   2  x]   [1  1   2  x           ]
    [ 3   4   3  y]   [0  1  -3  -3x + y     ]
M = [-2  -3  -1  z] ~ [0  0   0  -x + y + z  ]
    [ 2   4  -2  s]   [0  0   0  4x - 2y + s ]
    [ 3   2   9  t]   [0  0   0  -6x + y + t ]
Set the last three entries in the last column equal to zero to obtain the following homogeneous system whose
solution set is U:
-x + y + z = 0, 4x - 2y + s = 0, -6x + y + t = 0
Now let M' be the matrix whose columns are the w_i and v, and reduce M' to echelon form:
     [1   1  2  x]   [1  1   2  x            ]
     [3   5  5  y]   [0  2  -1  -3x + y      ]
M' = [0  -6  3  z] ~ [0  0   0  -9x + 3y + z ]
     [2   6  2  s]   [0  0   0  4x - 2y + s  ]
     [1   3  1  t]   [0  0   0  2x - y + t   ]
Again set the last three entries in the last column equal to zero to obtain the following homogeneous system
whose solution set is W:
-9x + 3y + z = 0, 4x - 2y + s = 0, 2x - y + t = 0
Combine both of the above systems to obtain a homogeneous system, whose solution space is U ∩ W, and
reduce the system to echelon form, yielding
-x + y +  z           = 0
    2y + 4z +  s      = 0
         8z + 5s + 2t = 0
               s - 2t = 0
There is one free variable, which is t; hence dim(U ∩ W) = 1. Setting t = 2, we obtain the solution
u = (1, 4, -3, 4, 2), which forms our required basis of U ∩ W.
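The dimensions found in Problem 4.54 can be cross-checked with the formula dim(U ∩ W) = dim U + dim W - dim(U + W) of Theorem 4.20. The sketch below assumes a small helper `rank` (a name introduced here, computing rank by elimination over the rationals); note that `U + W` in the code is Python list concatenation, which stacks the six spanning vectors as rows.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for c in range(len(m[0])):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][c] / m[r][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1
        if r == len(m):
            break
    return r

U = [(1, 3, -2, 2, 3), (1, 4, -3, 4, 2), (2, 3, -1, -2, 9)]
W = [(1, 3, 0, 2, 1), (1, 5, -6, 6, 3), (2, 5, 3, 2, 1)]

dim_U, dim_W = rank(U), rank(W)
dim_sum = rank(U + W)                 # rows of U followed by rows of W span U + W
dim_int = dim_U + dim_W - dim_sum     # Theorem 4.20

# The claimed basis vector of the intersection must not raise either rank.
v = (1, 4, -3, 4, 2)
in_both = rank(U + [v]) == dim_U and rank(W + [v]) == dim_W
print(dim_U, dim_W, dim_sum, dim_int, in_both)
```

Here dim U = dim W = 2 and dim(U + W) = 3, so the intersection has dimension 1, and the vector (1, 4, -3, 4, 2) lies in both subspaces.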
4.55. Suppose U and W are distinct four-dimensional subspaces of a vector space V, where dim V = 6.
Find the possible dimensions of U ∩ W.
Since U and W are distinct, U + W properly contains U and W; consequently dim(U + W) > 4. But
dim(U + W) cannot be greater than 6, since dim V = 6. Hence we have two possibilities: (a)
dim(U + W) = 5 or (b) dim(U + W) = 6. By Theorem 4.20,
dim(U ∩ W) = dim U + dim W - dim(U + W) = 8 - dim(U + W)
Thus (a) dim(U ∩ W) = 3 or (b) dim(U ∩ W) = 2.
4.56. Let U and W be the following subspaces of R^3:
U = {(a, b, c) : a = b = c} and W = {(0, b, c)}
(Note that W is the yz-plane.) Show that R^3 = U ⊕ W.
First we show that U ∩ W = {0}. Suppose v = (a, b, c) ∈ U ∩ W. Then a = b = c and a = 0. Hence
a = 0, b = 0, c = 0. Thus v = 0 = (0, 0, 0).
Next we show that R^3 = U + W. For, if v = (a, b, c) ∈ R^3, then
v = (a, a, a) + (0, b - a, c - a) where (a, a, a) ∈ U and (0, b - a, c - a) ∈ W
Both conditions U ∩ W = {0} and U + W = R^3 imply that R^3 = U ⊕ W.
4.57. Suppose that U and W are subspaces of a vector space V and that S = {u_i} spans U and S' = {w_j}
spans W. Show that S ∪ S' spans U + W. (Accordingly, by induction, if S_i spans W_i, for
i = 1, 2, ..., n, then S_1 ∪ ... ∪ S_n spans W_1 + ... + W_n.)
Let v ∈ U + W. Then v = u + w, where u ∈ U and w ∈ W. Since S spans U, u is a linear combination of the
u_i, and since S' spans W, w is a linear combination of the w_j, say
u = a_1u_{i_1} + a_2u_{i_2} + ... + a_ru_{i_r} and w = b_1w_{j_1} + b_2w_{j_2} + ... + b_sw_{j_s}
where a_i, b_j ∈ K. Then
v = u + w = a_1u_{i_1} + a_2u_{i_2} + ... + a_ru_{i_r} + b_1w_{j_1} + b_2w_{j_2} + ... + b_sw_{j_s}
Accordingly, S ∪ S' = {u_i, w_j} spans U + W.
4.58. Prove Theorem 4.20: Suppose U and W are finite-dimensional subspaces of a vector space V. Then
U + W has finite dimension and
dim(U + W) = dim U + dim W - dim(U ∩ W)
Observe that U ∩ W is a subspace of both U and W. Suppose dim U = m, dim W = n, dim(U ∩ W) = r.
Suppose {v_1, ..., v_r} is a basis of U ∩ W. By Theorem 4.16, we can extend {v_i} to a basis of U and to a basis
of W; say,
{v_1, ..., v_r, u_1, ..., u_{m-r}} and {v_1, ..., v_r, w_1, ..., w_{n-r}}
are bases of U and W, respectively. Let
B = {v_1, ..., v_r, u_1, ..., u_{m-r}, w_1, ..., w_{n-r}}
Note that B has exactly m + n - r elements. Thus the theorem is proved if we can show that B is a basis
of U + W. Since {v_i, u_j} spans U and {v_i, w_k} spans W, the union B = {v_i, u_j, w_k} spans U + W. Thus it
suffices to show that B is independent.
Suppose
a_1v_1 + ... + a_rv_r + b_1u_1 + ... + b_{m-r}u_{m-r} + c_1w_1 + ... + c_{n-r}w_{n-r} = 0     (1)
where a_i, b_j, c_k are scalars. Let
v = a_1v_1 + ... + a_rv_r + b_1u_1 + ... + b_{m-r}u_{m-r}     (2)
Lipschutz-Lipson:Schaum's I 4. Vector Spaces I Text I I ©The McGraw-Hill
Outline of Theory and Companies, 2004
Problems of Linear
Algebra, 3/e
158 VECTOR SPACES [CHAP. 4
By A), we also have
V = -C^W^- ...- C^-r^^_, C)
Since {Vf, Uj} ^ U, v e U by B); and since {wj^} ^ W, v e W by C). Accordingly, v e U D W. Now {i;J is a
basis of U nW, and so there exist scalars d^,... ,dj, for which v = d^v^ -\-... -\- d^v^. Thus, by C), we have
dyVy + . . . + d^V^ + CyWi + . . . + C^_^W^_^ = 0
But {Vf, Wj^} is a basis of W, and so is independent. Hence the above equation forces c^ = 0,..., c„_^ = 0.
Substituting this into A), we obtain
aiVi + ... + Gj-Vj, + biUi + ... + b^_j,u^_j, = 0
But {vi, Uj} is a basis of U, and so is independent. Hence the above equation forces
ai = 0, ... ,a^ = 0,bi = 0,..., b^_^ = 0.
Since A) implies that the a^, bj, Cj^ are all 0, B = {v^, Uj, Wj^} is independent, and the theorem is proved.
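The dimension formula of Theorem 4.20 can be illustrated on a small example. The sketch below is not from the text: it takes $U$ to be the $xy$-plane and $W$ the $yz$-plane in $\mathbf{R}^3$ (an assumed example), finds the intersection by inspection, and uses a hypothetical helper `rank` that row reduces over the rationals.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


U = [(1, 0, 0), (0, 1, 0)]   # basis of the xy-plane
W = [(0, 1, 0), (0, 0, 1)]   # basis of the yz-plane
I = [(0, 1, 0)]              # basis of U ∩ W (the y-axis), found by inspection

dim_sum = rank(U + W)        # U + W is spanned by the union of the two bases
print(dim_sum, rank(U) + rank(W) - rank(I))
```

Here the union of the bases spans $U + W$ by Problem 4.57, so its rank is $\dim(U + W)$.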
4.59. Prove Theorem 4.21: $V = U \oplus W$ if and only if (i) $V = U + W$, (ii) $U \cap W = \{0\}$.
Suppose $V = U \oplus W$. Then any $v \in V$ can be uniquely written in the form $v = u + w$, where $u \in U$ and $w \in W$. Thus, in particular, $V = U + W$. Now suppose $v \in U \cap W$. Then
(1) $v = v + 0$, where $v \in U$, $0 \in W$; (2) $v = 0 + v$, where $0 \in U$, $v \in W$.
Since such a sum for $v$ is unique, the two expressions must agree. Thus $v = 0 + 0 = 0$ and $U \cap W = \{0\}$.
On the other hand, suppose $V = U + W$ and $U \cap W = \{0\}$. Let $v \in V$. Since $V = U + W$, there exist $u \in U$ and $w \in W$ such that $v = u + w$. We need to show that such a sum is unique. Suppose also that $v = u' + w'$, where $u' \in U$ and $w' \in W$. Then
$$u + w = u' + w', \quad\text{and so}\quad u - u' = w' - w$$
But $u - u' \in U$ and $w' - w \in W$; hence, by $U \cap W = \{0\}$,
$$u - u' = 0, \quad w' - w = 0, \quad\text{and so}\quad u = u', \quad w = w'$$
Thus such a sum for $v \in V$ is unique, and $V = U \oplus W$.
4.60. Prove Theorem 4.22 (for two factors): Suppose $V = U \oplus W$. Also, suppose $S = \{u_1, \dots, u_m\}$ and $S' = \{w_1, \dots, w_n\}$ are linearly independent subsets of $U$ and $W$, respectively. Then:
(a) The union $S \cup S'$ is linearly independent in $V$.
(b) If $S$ and $S'$ are bases of $U$ and $W$, respectively, then $S \cup S'$ is a basis of $V$.
(c) $\dim V = \dim U + \dim W$.
(a) Suppose $a_1u_1 + \dots + a_mu_m + b_1w_1 + \dots + b_nw_n = 0$, where $a_i$, $b_j$ are scalars. Then
$$(a_1u_1 + \dots + a_mu_m) + (b_1w_1 + \dots + b_nw_n) = 0 = 0 + 0$$
where $0,\ a_1u_1 + \dots + a_mu_m \in U$ and $0,\ b_1w_1 + \dots + b_nw_n \in W$. Since such a sum for 0 is unique, this leads to
$$a_1u_1 + \dots + a_mu_m = 0 \quad\text{and}\quad b_1w_1 + \dots + b_nw_n = 0$$
Since $S$ is linearly independent, each $a_i = 0$, and since $S'$ is linearly independent, each $b_j = 0$. Thus $S \cup S'$ is linearly independent.
(b) By part (a), $S \cup S'$ is linearly independent, and, by Problem 4.57, $S \cup S'$ spans $V = U + W$. Thus $S \cup S'$ is a basis of $V$.
(c) This follows directly from part (b).
COORDINATES
4.61. Relative to the basis $S = \{u_1, u_2\} = \{(1, 1), (2, 3)\}$ of $\mathbf{R}^2$, find the coordinate vector of $v$, where: (a) $v = (4, -3)$, (b) $v = (a, b)$.
In each case, set
$$v = xu_1 + yu_2 = x(1, 1) + y(2, 3) = (x + 2y,\ x + 3y)$$
and then solve for $x$ and $y$.
(a) We have
$$(4, -3) = (x + 2y,\ x + 3y) \quad\text{or}\quad x + 2y = 4, \quad x + 3y = -3$$
The solution is $x = 18$, $y = -7$. Hence $[v] = [18, -7]$.
(b) We have
$$(a, b) = (x + 2y,\ x + 3y) \quad\text{or}\quad x + 2y = a, \quad x + 3y = b$$
The solution is $x = 3a - 2b$, $y = -a + b$. Hence $[v] = [3a - 2b,\ -a + b]$.
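The formula from part (b), $x = 3a - 2b$ and $y = -a + b$, can be checked by rebuilding $v$ from its coordinates. This is a small sketch; the function names are illustrative only.

```python
U1 = (1, 1)
U2 = (2, 3)


def coords(v):
    """Coordinates [x, y] of v = (a, b) relative to the basis {(1, 1), (2, 3)}."""
    a, b = v
    return (3 * a - 2 * b, -a + b)


def from_coords(x, y):
    """Rebuild the vector x*u1 + y*u2 from its coordinates."""
    return (x * U1[0] + y * U2[0], x * U1[1] + y * U2[1])


x, y = coords((4, -3))
print((x, y), from_coords(x, y))
```

Applying `from_coords` to the output of `coords` returns the original vector, which is exactly the statement $v = xu_1 + yu_2$.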
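4.62. Find the coordinate vector of $v = (a, b, c)$ in $\mathbf{R}^3$ relative to:
(a) the usual basis $E = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$,
(b) the basis $S = \{u_1, u_2, u_3\} = \{(1, 1, 1), (1, 1, 0), (1, 0, 0)\}$.
(a) Relative to the usual basis $E$, the coordinate vector $[v]_E$ is the same as $v$ itself. That is, $[v]_E = [a, b, c]$.
(b) Set $v$ as a linear combination of $u_1$, $u_2$, $u_3$ using unknown scalars $x$, $y$, $z$. This yields
$$\begin{bmatrix} a \\ b \\ c \end{bmatrix} = x\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} + y\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + z\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} \quad\text{or}\quad \begin{aligned} x + y + z &= a \\ x + y &= b \\ x &= c \end{aligned}$$
Solving the system yields $x = c$, $y = b - c$, $z = a - b$. Thus $[v]_S = [c,\ b - c,\ a - b]$.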
4.63. Consider the vector space $P_3(t)$ of polynomials of degree $\le 3$.
(a) Show that $S = \{(t - 1)^3, (t - 1)^2, t - 1, 1\}$ is a basis of $P_3(t)$.
(b) Find the coordinate vector $[v]$ of $v = 3t^3 + 4t^2 + 2t - 5$ relative to $S$.
(a) The degree of $(t - 1)^k$ is $k$; writing the polynomials of $S$ in reverse order, we see that no polynomial is a linear combination of preceding polynomials. Thus the polynomials are linearly independent, and, since $\dim P_3(t) = 4$, they form a basis of $P_3(t)$.
(b) Set $v$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$, $s$. We have
$$\begin{aligned} v = 3t^3 + 4t^2 + 2t - 5 &= x(t - 1)^3 + y(t - 1)^2 + z(t - 1) + s(1) \\ &= x(t^3 - 3t^2 + 3t - 1) + y(t^2 - 2t + 1) + z(t - 1) + s(1) \\ &= xt^3 - 3xt^2 + 3xt - x + yt^2 - 2yt + y + zt - z + s \\ &= xt^3 + (-3x + y)t^2 + (3x - 2y + z)t + (-x + y - z + s) \end{aligned}$$
Then set coefficients of the same powers of $t$ equal to each other to obtain
$$x = 3, \quad -3x + y = 4, \quad 3x - 2y + z = 2, \quad -x + y - z + s = -5$$
Solving the system yields $x = 3$, $y = 13$, $z = 19$, $s = 4$. Thus $[v] = [3, 13, 19, 4]$.
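The coordinates relative to the powers of $t - 1$ can also be obtained by repeated synthetic division by $t - 1$, a different technique from the coefficient comparison used in the solution. The sketch below assumes coefficients are listed from the highest power down; the function name is hypothetical.

```python
def coords_in_powers_of_t_minus_1(coeffs):
    """Coordinates of a polynomial relative to {(t-1)^n, ..., (t-1), 1}.

    coeffs lists the coefficients from the highest power of t down,
    e.g. [3, 4, 2, -5] for 3t^3 + 4t^2 + 2t - 5.
    """
    work = list(coeffs)
    remainders = []
    while work:
        # Synthetic division by (t - 1): replace the coefficients by running sums.
        for i in range(1, len(work)):
            work[i] += work[i - 1]
        # The last entry is the remainder, i.e. the next coordinate from the right.
        remainders.append(work.pop())
    return remainders[::-1]


print(coords_in_powers_of_t_minus_1([3, 4, 2, -5]))
```

Each division peels off one remainder, and the remainders, read in reverse, are the coefficients of $(t-1)^3, (t-1)^2, t-1, 1$.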
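4.64. Find the coordinate vector of $A = \begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix}$ in the real vector space $M = M_{2,2}$ relative to:
(a) the basis $S = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \right\}$,
(b) the usual basis $E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$.
(a) Set $A$ as a linear combination of the basis vectors using unknown scalars $x$, $y$, $z$, $t$ as follows:
$$\begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix} = x\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} + y\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} + z\begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix} + t\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} x + z + t & x - y - z \\ x + y & x \end{bmatrix}$$
Set corresponding entries equal to each other to obtain the system
$$x + z + t = 2, \quad x - y - z = 3, \quad x + y = 4, \quad x = -7$$
Solving the system yields $x = -7$, $y = 11$, $z = -21$, $t = 30$. Thus $[A]_S = [-7, 11, -21, 30]$. (Note that the coordinate vector of $A$ is a vector in $\mathbf{R}^4$, since $\dim M = 4$.)
(b) Expressing $A$ as a linear combination of the basis matrices yields
$$\begin{bmatrix} 2 & 3 \\ 4 & -7 \end{bmatrix} = x\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + y\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + z\begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix} + t\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} x & y \\ z & t \end{bmatrix}$$
Thus $x = 2$, $y = 3$, $z = 4$, $t = -7$. Hence $[A] = [2, 3, 4, -7]$, whose components are the elements of $A$ written row by row.
Remark: This result is true in general, that is, if $A$ is any $m \times n$ matrix in $M = M_{m,n}$, then the coordinates of $A$ relative to the usual basis of $M$ are the elements of $A$ written row by row.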
4.65. In the space $M = M_{2,3}$, determine whether or not the following matrices are linearly dependent:
$$A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 2 & 4 & 7 \\ 10 & 1 & 13 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 2 & 5 \\ 8 & 2 & 11 \end{bmatrix}$$
If the matrices are linearly dependent, find the dimension and a basis of the subspace $W$ of $M$ spanned by the matrices.
The coordinate vectors of the above matrices relative to the usual basis of $M$ are as follows:
$$[A] = [1, 2, 3, 4, 0, 5], \quad [B] = [2, 4, 7, 10, 1, 13], \quad [C] = [1, 2, 5, 8, 2, 11]$$
Form the matrix $M$ whose rows are the above coordinate vectors, and reduce $M$ to echelon form:
$$M = \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 2 & 4 & 7 & 10 & 1 & 13 \\ 1 & 2 & 5 & 8 & 2 & 11 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 & 4 & 0 & 5 \\ 0 & 0 & 1 & 2 & 1 & 3 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
Since the echelon matrix has only two nonzero rows, the coordinate vectors $[A]$, $[B]$, $[C]$ span a space of dimension two, and so they are linearly dependent. Thus $A$, $B$, $C$ are linearly dependent. Furthermore, $\dim W = 2$, and the matrices
$$w_1 = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 0 & 5 \end{bmatrix} \quad\text{and}\quad w_2 = \begin{bmatrix} 0 & 0 & 1 \\ 2 & 1 & 3 \end{bmatrix}$$
corresponding to the nonzero rows of the echelon matrix form a basis of $W$.
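The dependence test of Problem 4.65 amounts to computing the rank of the matrix of coordinate vectors. A minimal sketch follows, using exact rational arithmetic; the helpers `flatten` and `rank` are illustrative names and not part of the text.

```python
from fractions import Fraction


def flatten(matrix):
    """Coordinate vector relative to the usual basis: entries written row by row."""
    return [entry for row in matrix for entry in row]


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


A = [[1, 2, 3], [4, 0, 5]]
B = [[2, 4, 7], [10, 1, 13]]
C = [[1, 2, 5], [8, 2, 11]]

coordinate_rows = [flatten(A), flatten(B), flatten(C)]
dim_W = rank(coordinate_rows)
print(dim_W, "dependent" if dim_W < len(coordinate_rows) else "independent")
```

The matrices are dependent exactly when the rank is smaller than the number of matrices.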
MISCELLANEOUS PROBLEMS
4.66. Consider a finite sequence of vectors $S = \{v_1, v_2, \dots, v_n\}$. Let $T$ be the sequence of vectors obtained from $S$ by one of the following "elementary operations": (i) interchange two vectors, (ii) multiply a vector by a nonzero scalar, (iii) add a multiple of one vector to another. Show that $S$ and $T$ span the same space $W$. Also show that $T$ is independent if and only if $S$ is independent.
Observe that, for each operation, the vectors in $T$ are linear combinations of vectors in $S$. On the other hand, each operation has an inverse of the same type (Prove!); hence the vectors in $S$ are linear combinations of vectors in $T$. Thus $S$ and $T$ span the same space $W$. Also, $T$ is independent if and only if $\dim W = n$, and this is true if and only if $S$ is also independent.
4.67. Let $A = [a_{ij}]$ and $B = [b_{ij}]$ be row equivalent $m \times n$ matrices over a field $K$, and let $v_1, \dots, v_n$ be any vectors in a vector space $V$ over $K$. Let
$$\begin{aligned} u_1 &= a_{11}v_1 + a_{12}v_2 + \dots + a_{1n}v_n & w_1 &= b_{11}v_1 + b_{12}v_2 + \dots + b_{1n}v_n \\ u_2 &= a_{21}v_1 + a_{22}v_2 + \dots + a_{2n}v_n & w_2 &= b_{21}v_1 + b_{22}v_2 + \dots + b_{2n}v_n \\ &\ \ \vdots & &\ \ \vdots \\ u_m &= a_{m1}v_1 + a_{m2}v_2 + \dots + a_{mn}v_n & w_m &= b_{m1}v_1 + b_{m2}v_2 + \dots + b_{mn}v_n \end{aligned}$$
Show that $\{u_i\}$ and $\{w_i\}$ span the same space.
Applying an "elementary operation" of Problem 4.66 to $\{u_i\}$ is equivalent to applying an elementary row operation to the matrix $A$. Since $A$ and $B$ are row equivalent, $B$ can be obtained from $A$ by a sequence of elementary row operations; hence $\{w_i\}$ can be obtained from $\{u_i\}$ by the corresponding sequence of operations. Accordingly, $\{u_i\}$ and $\{w_i\}$ span the same space.
4.68. Let $v_1, \dots, v_n$ belong to a vector space $V$ over $K$, and let $P = [a_{ij}]$ be an $n$-square matrix over $K$. Let
$$w_1 = a_{11}v_1 + a_{12}v_2 + \dots + a_{1n}v_n, \quad \dots, \quad w_n = a_{n1}v_1 + a_{n2}v_2 + \dots + a_{nn}v_n$$
(a) Suppose $P$ is invertible. Show that $\{w_i\}$ and $\{v_i\}$ span the same space; hence $\{w_i\}$ is independent if and only if $\{v_i\}$ is independent.
(b) Suppose $P$ is not invertible. Show that $\{w_i\}$ is dependent.
(c) Suppose $\{w_i\}$ is independent. Show that $P$ is invertible.
(a) Since $P$ is invertible, it is row equivalent to the identity matrix $I$. Hence, by Problem 4.67, $\{w_i\}$ and $\{v_i\}$ span the same space. Thus one is independent if and only if the other is.
(b) Since $P$ is not invertible, it is row equivalent to a matrix with a zero row. This means that $\{w_i\}$ spans a space which has a spanning set of fewer than $n$ elements. Thus $\{w_i\}$ is dependent.
(c) This is the contrapositive of the statement of (b), and so it follows from (b).
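Parts (a) and (b) of Problem 4.68 can be illustrated with two independent vectors in $\mathbf{R}^3$ and two choices of $P$, one invertible and one not. The vectors, the matrices and the helper names `rank` and `combine` below are assumptions made for this sketch only.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r


def combine(P, vectors):
    """Return the vectors w_i = sum_j P[i][j] * v_j, one for each row of P."""
    width = len(vectors[0])
    return [
        tuple(sum(row[j] * vectors[j][k] for j in range(len(vectors))) for k in range(width))
        for row in P
    ]


v = [(1, 0, 2), (0, 1, 1)]          # two independent vectors
P_invertible = [[1, 2], [3, 4]]     # determinant -2
P_singular = [[1, 2], [2, 4]]       # determinant 0

print(rank(combine(P_invertible, v)), rank(combine(P_singular, v)))
```

With the invertible matrix the new vectors still have rank 2; with the singular matrix the rank drops, so the $w_i$ are dependent.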
4.69. Suppose that $A_1, A_2, \dots$ are linearly independent sets of vectors, and that $A_1 \subseteq A_2 \subseteq \dots$. Show that the union $A = A_1 \cup A_2 \cup \dots$ is also linearly independent.
Suppose $A$ is linearly dependent. Then there exist vectors $v_1, \dots, v_n \in A$ and scalars $a_1, \dots, a_n \in K$, not all of them 0, such that
$$a_1v_1 + a_2v_2 + \dots + a_nv_n = 0 \qquad (1)$$
Since $A = \bigcup A_i$ and the $v_i \in A$, there exist sets $A_{i_1}, \dots, A_{i_n}$ such that
$$v_1 \in A_{i_1}, \quad v_2 \in A_{i_2}, \quad \dots, \quad v_n \in A_{i_n}$$
Let $k$ be the maximum index of the sets $A_{i_j}$: $k = \max(i_1, \dots, i_n)$. It follows then, since $A_1 \subseteq A_2 \subseteq \dots$, that each $A_{i_j}$ is contained in $A_k$. Hence $v_1, v_2, \dots, v_n \in A_k$, and so, by (1), $A_k$ is linearly dependent, which contradicts our hypothesis. Thus $A$ is linearly independent.
4.70. Let $K$ be a subfield of a field $L$, and let $L$ be a subfield of a field $E$. (Thus $K \subseteq L \subseteq E$, and $K$ is a subfield of $E$.) Suppose $E$ is of dimension $n$ over $L$, and $L$ is of dimension $m$ over $K$. Show that $E$ is of dimension $mn$ over $K$.
Suppose $\{v_1, \dots, v_n\}$ is a basis of $E$ over $L$ and $\{a_1, \dots, a_m\}$ is a basis of $L$ over $K$. We claim that $\{a_iv_j : i = 1, \dots, m,\ j = 1, \dots, n\}$ is a basis of $E$ over $K$. Note that $\{a_iv_j\}$ contains $mn$ elements.
Let $w$ be an arbitrary element in $E$. Since $\{v_1, \dots, v_n\}$ spans $E$ over $L$, $w$ is a linear combination of the $v_i$ with coefficients in $L$:
$$w = b_1v_1 + b_2v_2 + \dots + b_nv_n, \quad b_i \in L \qquad (1)$$
Since $\{a_1, \dots, a_m\}$ spans $L$ over $K$, each $b_i \in L$ is a linear combination of the $a_j$ with coefficients in $K$:
$$\begin{aligned} b_1 &= k_{11}a_1 + k_{12}a_2 + \dots + k_{1m}a_m \\ b_2 &= k_{21}a_1 + k_{22}a_2 + \dots + k_{2m}a_m \\ &\ \ \vdots \\ b_n &= k_{n1}a_1 + k_{n2}a_2 + \dots + k_{nm}a_m \end{aligned}$$
where $k_{ij} \in K$. Substituting in (1), we obtain
$$\begin{aligned} w &= (k_{11}a_1 + \dots + k_{1m}a_m)v_1 + (k_{21}a_1 + \dots + k_{2m}a_m)v_2 + \dots + (k_{n1}a_1 + \dots + k_{nm}a_m)v_n \\ &= k_{11}a_1v_1 + \dots + k_{1m}a_mv_1 + k_{21}a_1v_2 + \dots + k_{2m}a_mv_2 + \dots + k_{n1}a_1v_n + \dots + k_{nm}a_mv_n \\ &= \sum_{i,j} k_{ji}(a_iv_j) \end{aligned}$$
where $k_{ji} \in K$. Thus $w$ is a linear combination of the $a_iv_j$ with coefficients in $K$; hence $\{a_iv_j\}$ spans $E$ over $K$.
The proof is complete if we show that $\{a_iv_j\}$ is linearly independent over $K$. Suppose, for scalars $x_{ji} \in K$, we have $\sum_{i,j} x_{ji}(a_iv_j) = 0$; that is,
$$(x_{11}a_1v_1 + x_{12}a_2v_1 + \dots + x_{1m}a_mv_1) + \dots + (x_{n1}a_1v_n + x_{n2}a_2v_n + \dots + x_{nm}a_mv_n) = 0$$
or
$$(x_{11}a_1 + x_{12}a_2 + \dots + x_{1m}a_m)v_1 + \dots + (x_{n1}a_1 + x_{n2}a_2 + \dots + x_{nm}a_m)v_n = 0$$
Since $\{v_1, \dots, v_n\}$ is linearly independent over $L$ and since the above coefficients of the $v_i$ belong to $L$, each coefficient must be 0:
$$x_{11}a_1 + x_{12}a_2 + \dots + x_{1m}a_m = 0, \quad \dots, \quad x_{n1}a_1 + x_{n2}a_2 + \dots + x_{nm}a_m = 0$$
But $\{a_1, \dots, a_m\}$ is linearly independent over $K$; hence, since the $x_{ji} \in K$,
$$x_{11} = 0,\ x_{12} = 0,\ \dots,\ x_{1m} = 0,\ \dots,\ x_{n1} = 0,\ x_{n2} = 0,\ \dots,\ x_{nm} = 0$$
Accordingly, $\{a_iv_j\}$ is linearly independent over $K$, and the theorem is proved.
Supplementary Problems
VECTOR SPACES
4.71. Suppose $u$ and $v$ belong to a vector space $V$. Simplify each of the following expressions:
(a) $E_1 = 4(5u - 6v) + 2(3u + v)$, (c) $E_3 = 6(3u + 2v) + 5u - 7v$,
(b) $E_2 = 5(2u - 3v) + 4(7v + 8)$, (d) $E_4 = 3(5u + 2/v)$
4.72. Let $V$ be the set of ordered pairs $(a, b)$ of real numbers with addition in $V$ and scalar multiplication on $V$ defined by
$$(a, b) + (c, d) = (a + c,\ b + d) \quad\text{and}\quad k(a, b) = (ka, 0)$$
Show that $V$ satisfies all the axioms of a vector space except [M4], that is, except $1u = u$. Hence [M4] is not a consequence of the other axioms.
4.73. Show that Axiom [A4] of a vector space $V$, that is, that $u + v = v + u$, can be derived from the other axioms for $V$.
4.74. Let $V$ be the set of ordered pairs $(a, b)$ of real numbers. Show that $V$ is not a vector space over $\mathbf{R}$ with addition and scalar multiplication defined by:
(i) $(a, b) + (c, d) = (a + d,\ b + c)$ and $k(a, b) = (ka, kb)$,
(ii) $(a, b) + (c, d) = (a + c,\ b + d)$ and $k(a, b) = (a, b)$,
(iii) $(a, b) + (c, d) = (0, 0)$ and $k(a, b) = (ka, kb)$,
(iv) $(a, b) + (c, d) = (ac, bd)$ and $k(a, b) = (ka, kb)$.
4.75. Let $V$ be the set of infinite sequences $(a_1, a_2, \dots)$ in a field $K$. Show that $V$ is a vector space over $K$ with addition and scalar multiplication defined by
$$(a_1, a_2, \dots) + (b_1, b_2, \dots) = (a_1 + b_1,\ a_2 + b_2,\ \dots) \quad\text{and}\quad k(a_1, a_2, \dots) = (ka_1, ka_2, \dots)$$
4.76. Let $U$ and $W$ be vector spaces over a field $K$. Let $V$ be the set of ordered pairs $(u, w)$ where $u \in U$ and $w \in W$. Show that $V$ is a vector space over $K$ with addition in $V$ and scalar multiplication on $V$ defined by
$$(u, w) + (u', w') = (u + u',\ w + w') \quad\text{and}\quad k(u, w) = (ku, kw)$$
(This space $V$ is called the external direct product of $U$ and $W$.)
SUBSPACES
4.77. Determine whether or not $W$ is a subspace of $\mathbf{R}^3$ where $W$ consists of all vectors $(a, b, c)$ in $\mathbf{R}^3$ such that:
(a) $a = 3b$, (b) $a \le b \le c$, (c) $ab = 0$, (d) $a + b + c = 0$, (e) $b = a^2$, (f) $a = 2b = 3c$.
4.78. Let $V$ be the vector space of $n$-square matrices over a field $K$. Show that $W$ is a subspace of $V$ if $W$ consists of all matrices $A = [a_{ij}]$ that are:
(a) symmetric ($A^T = A$ or $a_{ij} = a_{ji}$), (b) (upper) triangular, (c) diagonal, (d) scalar.
4.79. Let $AX = B$ be a nonhomogeneous system of linear equations in $n$ unknowns, that is, $B \ne 0$. Show that the solution set is not a subspace of $K^n$.
4.80. Suppose $U$ and $W$ are subspaces of $V$ for which $U \cup W$ is a subspace. Show that $U \subseteq W$ or $W \subseteq U$.
4.81. Let $V$ be the vector space of all functions from the real field $\mathbf{R}$ into $\mathbf{R}$. Show that $W$ is a subspace of $V$ where $W$ consists of all: (a) bounded functions, (b) even functions. [Recall that $f: \mathbf{R} \to \mathbf{R}$ is bounded if $\exists M \in \mathbf{R}$ such that $\forall x \in \mathbf{R}$, we have $|f(x)| \le M$; and $f(x)$ is even if $f(-x) = f(x)$, $\forall x \in \mathbf{R}$.]
4.82. Let $V$ be the vector space (Problem 4.75) of infinite sequences $(a_1, a_2, \dots)$ in a field $K$. Show that $W$ is a subspace of $V$ if $W$ consists of all sequences with: (a) 0 as the first element, (b) only a finite number of nonzero elements.
LINEAR COMBINATIONS, LINEAR SPANS
4.83. Consider the vectors $u = (1, 2, 3)$ and $v = (2, 3, 1)$ in $\mathbf{R}^3$.
(a) Write $w = (1, 3, 8)$ as a linear combination of $u$ and $v$.
(b) Write $w = (2, 4, 5)$ as a linear combination of $u$ and $v$.
(c) Find $k$ so that $w = (1, k, -2)$ is a linear combination of $u$ and $v$.
(d) Find conditions on $a$, $b$, $c$ so that $w = (a, b, c)$ is a linear combination of $u$ and $v$.
4.84. Write the polynomial $f(t) = at^2 + bt + c$ as a linear combination of the polynomials $p_1 = (t - 1)^2$, $p_2 = t - 1$, $p_3 = 1$. [Thus $p_1$, $p_2$, $p_3$ span the space $P_2(t)$ of polynomials of degree $\le 2$.]
4.85. Find one vector in $\mathbf{R}^3$ that spans the intersection of $U$ and $W$ where $U$ is the $xy$-plane, i.e. $U = \{(a, b, 0)\}$, and $W$ is the space spanned by the vectors $(1, 1, 1)$ and $(1, 2, 3)$.
4.86. Prove that $\text{span}(S)$ is the intersection of all subspaces of $V$ containing $S$.
4.87. Show that $\text{span}(S) = \text{span}(S \cup \{0\})$. That is, by joining or deleting the zero vector from a set, we do not change the space spanned by the set.
4.88. Show that: (a) If $S \subseteq T$, then $\text{span}(S) \subseteq \text{span}(T)$. (b) $\text{span}[\text{span}(S)] = \text{span}(S)$.
LINEAR DEPENDENCE AND LINEAR INDEPENDENCE
4.89. Determine whether the following vectors in $\mathbf{R}^4$ are linearly dependent or independent:
(a) $(1, 2, -3, 1)$, $(3, 7, 1, -2)$, $(1, 3, 7, -4)$, (b) $(1, 3, 1, -2)$, $(2, 5, -1, 3)$, $(1, 3, 7, -2)$.
4.90. Determine whether the following polynomials $u$, $v$, $w$ in $P(t)$ are linearly dependent or independent:
(a) $u = t^3 - 4t^2 + 3t + 3$, $v = t^3 + 2t^2 + 4t - 1$, $w = 2t^3 - t^2 - 3t + 5$,
(b) $u = t^3 - 5t^2 - 2t + 3$, $v = t^3 - 4t^2 - 3t + 4$, $w = 2t^3 - 7t^2 - 7t + 9$.
4.91. Show that the following functions $f$, $g$, $h$ are linearly independent:
(a) $f(t) = e^t$, $g(t) = \sin t$, $h(t) = t^2$, (b) $f(t) = e^t$, $g(t) = e^{2t}$, $h(t) = t$.
4.92. Show that $u = (a, b)$ and $v = (c, d)$ in $K^2$ are linearly dependent if and only if $ad - bc = 0$.
4.93. Suppose $u$, $v$, $w$ are linearly independent vectors. Prove that $S$ is linearly independent where:
(a) $S = \{u + v - 2w,\ u - v - w,\ u + w\}$, (b) $S = \{u + v - 3w,\ u + 3v - w,\ v + w\}$.
4.94. Suppose $\{u_1, \dots, u_r, w_1, \dots, w_s\}$ is a linearly independent subset of $V$. Show that $\text{span}(u_i) \cap \text{span}(w_j) = \{0\}$.
4.95. Suppose $v_1, v_2, \dots, v_n$ are linearly independent. Prove that $S$ is linearly independent where
(a) $S = \{a_1v_1, a_2v_2, \dots, a_nv_n\}$ and each $a_i \ne 0$.
(b) $S = \{v_1, \dots, v_{k-1}, w, v_{k+1}, \dots, v_n\}$ and $w = \sum_i b_iv_i$ and $b_k \ne 0$.
4.96. Suppose $(a_{11}, \dots, a_{1n}), (a_{21}, \dots, a_{2n}), \dots, (a_{m1}, \dots, a_{mn})$ are linearly independent vectors in $K^n$, and suppose $v_1, v_2, \dots, v_n$ are linearly independent vectors in a vector space $V$ over $K$. Show that the following vectors are also linearly independent:
$$w_1 = a_{11}v_1 + \dots + a_{1n}v_n, \quad w_2 = a_{21}v_1 + \dots + a_{2n}v_n, \quad \dots, \quad w_m = a_{m1}v_1 + \dots + a_{mn}v_n$$
BASIS AND DIMENSION
4.97. Find a subset of $u_1, u_2, u_3, u_4$ that gives a basis for $W = \text{span}(u_i)$ of $\mathbf{R}^5$ where:
(a) $u_1 = (1, 1, 1, 2, 3)$, $u_2 = (1, 2, -1, -2, 1)$, $u_3 = (3, 5, -1, -2, 5)$, $u_4 = (1, 2, 1, -1, 4)$
(b) $u_1 = (1, -2, 1, 3, -1)$, $u_2 = (-2, 4, -2, -6, 2)$, $u_3 = (1, -3, 1, 2, 1)$, $u_4 = (3, -7, 3, 8, -1)$
(c) $u_1 = (1, 0, 1, 0, 1)$, $u_2 = (1, 1, 2, 1, 0)$, $u_3 = (1, 2, 3, 1, 1)$, $u_4 = (1, 2, 1, 1, 1)$
(d) $u_1 = (1, 0, 1, 1, 1)$, $u_2 = (2, 1, 2, 0, 1)$, $u_3 = (1, 1, 2, 3, 4)$, $u_4 = (4, 2, 5, 4, 6)$
4.98. Consider the subspaces $U = \{(a, b, c, d) : b - 2c + d = 0\}$ and $W = \{(a, b, c, d) : a = d,\ b = 2c\}$ of $\mathbf{R}^4$. Find a basis and the dimension of: (a) $U$, (b) $W$, (c) $U \cap W$.
4.99. Find a basis and the dimension of the solution space $W$ of each of the following homogeneous systems:
(a) $x + 2y - 2z + 2s - t = 0$
    $x + 2y - z + 3s - 2t = 0$
    $2x + 4y - 7z + s + t = 0$
(b) $x + 2y - z + 3s - 4t = 0$
    $2x + 4y - 2z - s + 5t = 0$
    $2x + 4y - 2z + 4s - 2t = 0$
4.100. Find a homogeneous system whose solution space is spanned by the following sets of three vectors:
(a) $(1, -2, 0, 3, -1)$, $(2, -3, 2, 5, -3)$, $(1, -2, 1, 2, -2)$,
(b) $(1, 1, 2, 1, 1)$, $(1, 2, 1, 4, 3)$, $(3, 5, 4, 9, 7)$.
4.101. Determine whether each of the following is a basis of the vector space $P_n(t)$:
(a) $\{1,\ 1 + t,\ 1 + t + t^2,\ 1 + t + t^2 + t^3,\ \dots,\ 1 + t + t^2 + \dots + t^{n-1} + t^n\}$,
(b) $\{1 + t,\ t + t^2,\ t^2 + t^3,\ \dots,\ t^{n-1} + t^n\}$.
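4.102. Find a basis and the dimension of the subspace $W$ of $P(t)$ spanned by:
(a) $u = t^3 + 2t^2 - 2t + 1$, $v = t^3 + 3t^2 - t + 4$, $w = 2t^3 + t^2 - 7t - 7$,
(b) $u = t^3 + t^2 - 3t + 2$, $v = 2t^3 + t^2 + t - 4$, $w = 4t^3 + 3t^2 - 5t + 2$.
4.103. Find a basis and the dimension of the subspace $W$ of $V = M_{2,2}$ spanned by
$$A = \begin{bmatrix} 1 & -5 \\ -4 & 2 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ -1 & 5 \end{bmatrix}, \quad C = \begin{bmatrix} 2 & -4 \\ -5 & 7 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & -7 \\ -5 & 1 \end{bmatrix}$$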
RANK OF A MATRIX, ROW AND COLUMN SPACES
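4.104. Find the rank of each of the following matrices:
(a) $\begin{bmatrix} 1 & 3 & -2 & 5 & 4 \\ 1 & 4 & 1 & 3 & 5 \\ 1 & 4 & 2 & 4 & 3 \\ 2 & 7 & -3 & 6 & 13 \end{bmatrix}$, (b) $\begin{bmatrix} 1 & 2 & -3 & -2 \\ 1 & 3 & -2 & 0 \\ 3 & 8 & -7 & -2 \\ 2 & 1 & -9 & -10 \end{bmatrix}$, (c) $\begin{bmatrix} 1 & 1 & 2 \\ 4 & 5 & 5 \\ 5 & 8 & 1 \\ -1 & -2 & 2 \end{bmatrix}$
4.105. For $k = 1, 2, \dots, 5$, find the number $n_k$ of linearly independent subsets consisting of $k$ columns for each of the following matrices:
(a) $A = \begin{bmatrix} 1 & 1 & 0 & 2 & 3 \\ 1 & 2 & 0 & 2 & 5 \\ 1 & 3 & 0 & 2 & 7 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 2 & 1 & 0 & 2 \\ 1 & 2 & 3 & 0 & 4 \\ 1 & 1 & 5 & 0 & 2 \end{bmatrix}$
4.106. Let (a) $A = \begin{bmatrix} 1 & 2 & 1 & 3 & 1 & 6 \\ 2 & 4 & 3 & 8 & 3 & 9 \\ 1 & 2 & 2 & 5 & 3 & 11 \\ 4 & 8 & 6 & 16 & 7 & 26 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 2 & 2 & 1 & 2 & 1 \\ 2 & 4 & 5 & 4 & 5 & 5 \\ 1 & 2 & 3 & 4 & 4 & 6 \\ 3 & 6 & 7 & 7 & 9 & 10 \end{bmatrix}$
For each matrix (where $C_1, \dots, C_6$ denote its columns):
(i) Find its row canonical form $M$.
(ii) Find the columns that are linear combinations of preceding columns.
(iii) Find columns (excluding $C_6$) that form a basis for the column space.
(iv) Express $C_6$ as a linear combination of the basis vectors obtained in (iii).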
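4.107. Determine which of the following matrices have the same row space:
$$A = \begin{bmatrix} 1 & -2 & -1 \\ 3 & -4 & 5 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -1 & 2 \\ 2 & 3 & -1 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -1 & 3 \\ 2 & -1 & 10 \\ 3 & -5 & 1 \end{bmatrix}$$
4.108. Determine which of the following subspaces of $\mathbf{R}^3$ are identical:
$U_1 = \text{span}[(1, 1, -1), (2, 3, -1), (3, 1, -5)]$, $U_2 = \text{span}[(1, -1, -3), (3, -2, -8), (2, 1, -3)]$,
$U_3 = \text{span}[(1, 1, 1), (1, -1, 3), (3, -1, 7)]$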
4.109. Determine which of the following subspaces of $\mathbf{R}^4$ are identical:
$U_1 = \text{span}[(1, 2, 1, 4), (2, 4, 1, 5), (3, 6, 2, 9)]$, $U_2 = \text{span}[(1, 2, 1, 2), (2, 4, 1, 3)]$,
$U_3 = \text{span}[(1, 2, 3, 10), (2, 4, 3, 11)]$
4.110. Find a basis for (i) the row space and (ii) the column space of each matrix $M$:
(a) $M = \begin{bmatrix} 0 & 0 & 3 & 1 & 4 \\ 1 & 3 & 1 & 2 & 1 \\ 3 & 9 & 4 & 5 & 2 \\ 4 & 12 & 8 & 8 & 7 \end{bmatrix}$, (b) $M = \begin{bmatrix} 1 & 2 & 1 & 0 & 1 \\ 1 & 2 & 2 & 1 & 3 \\ 3 & 6 & 5 & 2 & 7 \\ 2 & 4 & 1 & -1 & 0 \end{bmatrix}$
4.111. Show that if any row is deleted from a matrix in echelon (respectively, row canonical) form, then the resulting matrix is still in echelon (respectively, row canonical) form.
4.112. Let $A$ and $B$ be arbitrary $m \times n$ matrices. Show that $\text{rank}(A + B) \le \text{rank}(A) + \text{rank}(B)$.
4.113. Let $r = \text{rank}(A + B)$. Find $2 \times 2$ matrices $A$ and $B$ such that:
(a) $r < \text{rank}(A), \text{rank}(B)$; (b) $r = \text{rank}(A) = \text{rank}(B)$; (c) $r > \text{rank}(A), \text{rank}(B)$.
SUMS, DIRECT SUMS, INTERSECTIONS
4.114. Suppose $U$ and $W$ are two-dimensional subspaces of $\mathbf{R}^3$. Show that $U \cap W \ne \{0\}$.
4.115. Suppose $U$ and $W$ are subspaces of $V$ such that $\dim U = 4$, $\dim W = 5$, and $\dim V = 7$. Find the possible dimensions of $U \cap W$.
4.116. Let $U$ and $W$ be subspaces of $\mathbf{R}^3$ for which $\dim U = 1$, $\dim W = 2$, and $U \not\subseteq W$. Show that $\mathbf{R}^3 = U \oplus W$.
4.117. Consider the following subspaces of $\mathbf{R}^5$:
$U = \text{span}[(1, -1, -1, -2, 0), (1, -2, -2, 0, -3), (1, -1, -2, -2, 1)]$
$W = \text{span}[(1, -2, -3, 0, -2), (1, -1, -3, 2, -4), (1, -1, -2, 2, -5)]$
(a) Find two homogeneous systems whose solution spaces are $U$ and $W$, respectively.
(b) Find a basis and the dimension of $U \cap W$.
4.118. Let $U_1$, $U_2$, $U_3$ be the following subspaces of $\mathbf{R}^3$:
$$U_1 = \{(a, b, c) : a = c\}, \quad U_2 = \{(a, b, c) : a + b + c = 0\}, \quad U_3 = \{(0, 0, c)\}$$
Show that: (a) $\mathbf{R}^3 = U_1 + U_2$, (b) $\mathbf{R}^3 = U_2 + U_3$, (c) $\mathbf{R}^3 = U_1 + U_3$. When is the sum direct?
4.119. Suppose $U$, $W_1$, $W_2$ are subspaces of a vector space $V$. Show that
$$(U \cap W_1) + (U \cap W_2) \subseteq U \cap (W_1 + W_2)$$
Find subspaces of $\mathbf{R}^2$ for which equality does not hold.
4.120. Suppose $W_1, W_2, \dots, W_r$ are subspaces of a vector space $V$. Show that:
(a) $\text{span}(W_1, W_2, \dots, W_r) = W_1 + W_2 + \dots + W_r$.
(b) If $S_i$ spans $W_i$ for $i = 1, \dots, r$, then $S_1 \cup S_2 \cup \dots \cup S_r$ spans $W_1 + W_2 + \dots + W_r$.
4.121. Suppose $V = U \oplus W$. Show that $\dim V = \dim U + \dim W$.
4.122. Let $S$ and $T$ be arbitrary nonempty subsets (not necessarily subspaces) of a vector space $V$ and let $k$ be a scalar. The sum $S + T$ and the scalar product $kS$ are defined by:
$$S + T = \{u + v : u \in S,\ v \in T\}, \quad kS = \{ku : u \in S\}$$
[We also write $w + S$ for $\{w\} + S$.] Let
$$S = \{(1, 2), (2, 3)\}, \quad T = \{(1, 4), (1, 5), (2, 5)\}, \quad w = (1, 1), \quad k = 3$$
Find: (a) $S + T$, (b) $w + S$, (c) $kS$, (d) $kT$, (e) $kS + kT$, (f) $k(S + T)$.
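The set operations $S + T$ and $kS$ of Problem 4.122 translate directly into set comprehensions. The following is a minimal sketch for vectors in $\mathbf{R}^2$; the function names `set_sum` and `scale` are illustrative only.

```python
def set_sum(S, T):
    """S + T = {u + v : u in S, v in T} for sets of vectors in R^2."""
    return {(u[0] + v[0], u[1] + v[1]) for u in S for v in T}


def scale(k, S):
    """kS = {ku : u in S} for a set of vectors in R^2."""
    return {(k * u[0], k * u[1]) for u in S}


S = {(1, 2), (2, 3)}
T = {(1, 4), (1, 5), (2, 5)}
w = (1, 1)
k = 3

print(sorted(set_sum(S, T)))
print(sorted(set_sum({w}, S)))
print(sorted(scale(k, set_sum(S, T))))
```

Because the results are sets, a sum that arises from two different pairs, such as $(3, 7)$ here, is counted only once.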
4.123. Show that the above operations of $S + T$ and $kS$ satisfy:
(a) Commutative law: $S + T = T + S$.
(b) Associative law: $(S_1 + S_2) + S_3 = S_1 + (S_2 + S_3)$.
(c) Distributive law: $k(S + T) = kS + kT$.
(d) $S + \{0\} = \{0\} + S = S$ and $S + V = V + S = V$.
4.124. Let $V$ be the vector space of $n$-square matrices. Let $U$ be the subspace of upper triangular matrices, and let $W$ be the subspace of lower triangular matrices. Find: (a) $U \cap W$, (b) $U + W$.
4.125. Let $V$ be the external direct sum of vector spaces $U$ and $W$ over a field $K$. (See Problem 4.76.) Let
$$\hat{U} = \{(u, 0) : u \in U\} \quad\text{and}\quad \hat{W} = \{(0, w) : w \in W\}$$
Show that: (a) $\hat{U}$ and $\hat{W}$ are subspaces of $V$, (b) $V = \hat{U} \oplus \hat{W}$.
4.126. Suppose $V = U \oplus W$. Let $\hat{V}$ be the external direct sum of $U$ and $W$. Show that $V$ is isomorphic to $\hat{V}$ under the correspondence $v = u + w \leftrightarrow (u, w)$.
4.127. Use induction to prove: (a) Theorem 4.22, (b) Theorem 4.23.
COORDINATES
4.128. The vectors $u_1 = (1, -2)$ and $u_2 = (4, -7)$ form a basis $S$ of $\mathbf{R}^2$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where: (a) $v = (5, 3)$, (b) $v = (a, b)$.
4.129. The vectors $u_1 = (1, 2, 0)$, $u_2 = (1, 3, 2)$, $u_3 = (0, 1, 3)$ form a basis $S$ of $\mathbf{R}^3$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where: (a) $v = (2, 7, -4)$, (b) $v = (a, b, c)$.
4.130. $S = \{t^3 + t^2,\ t^2 + t,\ t + 1,\ 1\}$ is a basis of $P_3(t)$. Find the coordinate vector $[v]$ of $v$ relative to $S$ where:
(a) $v = 2t^3 + t^2 - 4t + 2$, (b) $v = at^3 + bt^2 + ct + d$.
4.131. Let V = M2 2- Find the coordinate vector [A] of A relative to S where:
r
1 1
'
1
-1"
1
'
0"
0 o_
and (a) A
[6 7,
(b) A:
a b
4.132. Find the dimension and a basis of the subspace $W$ of $P_3(t)$ spanned by
$$u = t^3 + 2t^2 - 3t + 4, \quad v = 2t^3 + 5t^2 - 4t + 7, \quad w = t^3 + 4t^2 + t + 2$$
4.133. Find the dimension and a basis of the subspace ^ of M = M2 3 spanned by
1 2 1
3 1 2
[?
4 3
5 6
[-
MISCELLANEOUS PROBLEMS
4.134. Answer true or false. If false, prove it with a counterexample.
(a) If $u_1, u_2, u_3$ span $V$, then $\dim V = 3$.
(b) If $A$ is a $4 \times 8$ matrix, then any six columns are linearly dependent.
(c) If $u_1, u_2, u_3$ are linearly independent, then $u_1, u_2, u_3, w$ are linearly dependent.
(d) If $u_1, u_2, u_3, u_4$ are linearly independent, then $\dim V \ge 4$.
(e) If $u_1, u_2, u_3$ span $V$, then $w, u_1, u_2, u_3$ span $V$.
(f) If $u_1, u_2, u_3, u_4$ are linearly independent, then $u_1, u_2, u_3$ are linearly independent.
4.135. Answer true or false. If false, prove it with a counterexample.
(a) If any column is deleted from a matrix in echelon form, then the resulting matrix is still in echelon form.
(b) If any column is deleted from a matrix in row canonical form, then the resulting matrix is still in row canonical form.
(c) If any column without a pivot is deleted from a matrix in row canonical form, then the resulting matrix is in row canonical form.
4.136. Determine the dimension of the vector space $W$ of the following $n$-square matrices:
(a) symmetric matrices, (b) antisymmetric matrices,
(c) diagonal matrices, (d) scalar matrices.
4.137. Let $t_1, t_2, \dots, t_n$ be symbols, and let $K$ be any field. Let $V$ be the following set of expressions where $a_i \in K$:
$$a_1t_1 + a_2t_2 + \dots + a_nt_n$$
Define addition in $V$ and scalar multiplication on $V$ by
$$(a_1t_1 + \dots + a_nt_n) + (b_1t_1 + \dots + b_nt_n) = (a_1 + b_1)t_1 + \dots + (a_n + b_n)t_n$$
$$k(a_1t_1 + a_2t_2 + \dots + a_nt_n) = ka_1t_1 + ka_2t_2 + \dots + ka_nt_n$$
Show that $V$ is a vector space over $K$ with the above operations. Also, show that $\{t_1, \dots, t_n\}$ is a basis of $V$, where
$$t_j = 0t_1 + \dots + 0t_{j-1} + 1t_j + 0t_{j+1} + \dots + 0t_n$$
Answers to Supplementary Problems
4.71. (a) $E_1 = 26u - 22v$. (b) The sum $7v + 8$ is not defined, so $E_2$ is not defined. (c) $E_3 = 23u + 5v$. (d) Division by $v$ is not defined, so $E_4$ is not defined.
4.77. (a) Yes. (b) No; e.g. $(1, 2, 3) \in W$ but $-2(1, 2, 3) \notin W$. (c) No; e.g. $(1, 0, 0), (0, 1, 0) \in W$ but not their sum. (d) Yes. (e) No; e.g. $(1, 1, 1) \in W$ but $2(1, 1, 1) \notin W$. (f) Yes.
4.79. The zero vector 0 is not a solution.
4.83. (a) $w = 3u - v$. (b) Impossible. (c) $k = 1$. (d) $7a - 5b + c = 0$.
4.84. Using $f = xp_1 + yp_2 + zp_3$, we get $x = a$, $y = 2a + b$, $z = a + b + c$.
4.85. $v = (2, 5, 0)$
4.89. (a) Dependent. (b) Independent.
4.90. (a) Independent. (b) Dependent.
4.97. (a) $u_1, u_2, u_4$; (b) $u_1, u_2, u_3$; (c) $u_1, u_2, u_3, u_4$; (d) $u_1, u_2, u_3$
4.98. (a) $\dim U = 3$, (b) $\dim W = 2$, (c) $\dim(U \cap W) = 1$
4.99. (a) Basis: $\{(2, -1, 0, 0, 0), (4, 0, 1, -1, 0), (3, 0, 1, 0, 1)\}$; $\dim W = 3$.
(b) Basis: $\{(2, -1, 0, 0, 0), (1, 0, 1, 0, 0)\}$; $\dim W = 2$.
4.100. (a) $5x + y - z - s = 0$, $x + y - z - t = 0$;
(b) $2x - z = 0$, $2x - 3y + s = 0$, $x - 2y + t = 0$
4.101. (a) Yes. (b) No, since $\dim P_n(t) = n + 1$, but the set contains only $n$ elements.
4.102. (a) $\dim W = 2$, (b) $\dim W = 3$
4.103. $\dim W = 2$
4.104. (a) 3, (b) 2, (c) 3
4.105.
(a) Hi =4, n2 = 5, n^ = U/^ = n^ = 0; (b) Ui
n^ =2, n^ = n^ = 0
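4.106. (a) (i) $M = [1, 2, 0, 1, 0, 3;\ 0, 0, 1, 2, 0, 1;\ 0, 0, 0, 0, 1, 2;\ 0]$;
(ii) $C_2, C_4, C_6$; (iii) $C_1, C_3, C_5$; (iv) $C_6 = 3C_1 + C_3 + 2C_5$
(b) (i) $M = [1, 2, 0, 0, 3, 1;\ 0, 0, 1, 0, -1, -1;\ 0, 0, 0, 1, 1, 2;\ 0]$;
(ii) $C_2, C_5, C_6$; (iii) $C_1, C_3, C_4$; (iv) $C_6 = C_1 - C_3 + 2C_4$
4.107. $A$ and $C$ are row equivalent to $\begin{bmatrix} 1 & 0 & 7 \\ 0 & 1 & 4 \end{bmatrix}$, but not $B$
4.108. $U_1$ and $U_2$ are row equivalent to $\begin{bmatrix} 1 & 0 & -2 \\ 0 & 1 & 1 \end{bmatrix}$, but not $U_3$
4.109. $U_1$ and $U_3$ are row equivalent to $\begin{bmatrix} 1 & 2 & 0 & 1 \\ 0 & 0 & 1 & 3 \end{bmatrix}$, but not $U_2$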
4.110. (a) (i) A,3,1,2,1), @,0,1,-1,-1), @,0,0,4,7); (ii) Q, C2, C5.
(b) (i) A, 2, 1, 0, 1), @, 0, 1, 1, 2); (ii) Ci, C3
4.113. (a) A.
(c) A--
[0 0
n 0
[0 0
' ^"
, B =
\-\ -1]
_ 0 0_
[0 0]
0 1_
{b) A:
1 0
0 0
0 2
0 0
4.115. $\dim(U \cap W) = 2$, 3, or 4
3x + 4j; - z - ^ = 0
4.117. (a) (i)
4x + 2j; - ^ = 0
4x + 2j; + ^ = 0 ^"^ 9x + 2j; + z + ^ = 0
(b) Basis: {A, -2, -5, 0, 0), @, 0, 1,0, -1)}; dim(UnW) = 2
(ii)
4.118. The sum is direct in (b) and (c)
4.119. In $\mathbf{R}^2$, let $U$, $V$, $W$ be, respectively, the line $y = x$, the $x$-axis, the $y$-axis.
4.122. (a) $\{(2, 6), (2, 7), (3, 7), (3, 8), (4, 8)\}$; (b) $\{(2, 3), (3, 4)\}$;
(c) $\{(3, 6), (6, 9)\}$; (d) $\{(3, 12), (3, 15), (6, 15)\}$;
(e and f) $\{(6, 18), (6, 21), (9, 21), (9, 24), (12, 24)\}$
4.124. (a) Diagonal matrices, (b) $V$
4.128. (a) $[-41, 11]$, (b) $[-7a - 4b,\ 2a + b]$
4.129. (a) $[-11, 13, -10]$, (b) $[c - 3b + 7a,\ -c + 3b - 6a,\ c - 2b + 4a]$
4.130. (a) $[2, -1, -2, 2]$; (b) $[a,\ b - a,\ c - b + a,\ d - c + b - a]$
4.131. (a) $[7, -1, -13, 10]$; (b) $[d,\ c - d,\ b + c - d,\ a - b - 2c + 2d]$
4.132. $\dim W = 2$; basis: $\{t^3 + 2t^2 - 3t + 4,\ t^2 + 2t - 1\}$
4.133. $\dim W = 2$; basis: $\{[1, 2, 1, 3, 1, 2],\ [0, 0, 1, 1, 3, 2]\}$
4.134. (a) False; $(1, 1)$, $(1, 2)$, $(2, 1)$ span $\mathbf{R}^2$. (b) True.
(c) False; $(1, 0, 0, 0)$, $(0, 1, 0, 0)$, $(0, 0, 1, 0)$, $w = (0, 0, 0, 1)$.
(d) True. (e) True. (f) True.
4.135. (a) True. (b) False; e.g. delete $C_2$ from $\begin{bmatrix} 1 & 0 & 3 \\ 0 & 1 & 2 \end{bmatrix}$. (c) True.
4.136. (a) $\tfrac{1}{2}n(n + 1)$, (b) $\tfrac{1}{2}n(n - 1)$, (c) $n$, (d) 1
CHAPTER 5
Linear Mappings
5.1 INTRODUCTION
The main subject matter of linear algebra is the study of linear mappings and their representation by means of matrices. This chapter introduces us to these linear maps and the next chapter shows how they can be represented by matrices. First, however, we begin with a study of mappings in general.
5.2 MAPPINGS, FUNCTIONS
Let A and B be arbitrary nonempty sets. Suppose to each element in A there is assigned a unique
element of B; the collection / of such assignments is called a mapping (or map) from A into B, and is
denoted by
f:A^B
The set A is called the domain of the mapping, and B is called the target set. We write/(a), read "/ of a ",
for the unique element of B that/ assigns to a e A.
One may also view a mapping/: ^ ^ 5 as a computer that, for each input value a e A, produces a
unique output/(a) G B.
Remark: The term function is used synonymously with the word mapping, although some texts
reserve the word "function" for a real-valued or complex-valued mapping.
Consider a mapping $f: A \to B$. If $A'$ is any subset of $A$, then $f(A')$ denotes the set of images of
elements of $A'$; and if $B'$ is any subset of $B$, then $f^{-1}(B')$ denotes the set of elements of $A$, each of whose
images lies in $B'$. That is,
$$f(A') = \{f(a) : a \in A'\}$$
and
$$f^{-1}(B') = \{a \in A : f(a) \in B'\}$$
We call $f(A')$ the image of $A'$ and $f^{-1}(B')$ the inverse image or preimage of $B'$. In particular, the set of all
images, i.e., $f(A)$, is called the image or range of $f$.
To each mapping $f: A \to B$ there corresponds the subset of $A \times B$ given by $\{(a, f(a)) : a \in A\}$. We call
this set the graph of $f$. Two mappings $f: A \to B$ and $g: A \to B$ are defined to be equal, written $f = g$, if
$f(a) = g(a)$ for every $a \in A$, that is, if they have the same graph. Thus we do not distinguish between a
function and its graph. The negation of $f = g$ is written $f \neq g$ and is the statement:
There exists an $a \in A$ for which $f(a) \neq g(a)$.
Sometimes the "barred" arrow $\mapsto$ is used to denote the image of an arbitrary element $x \in A$ under a
mapping $f: A \to B$ by writing
$$x \mapsto f(x)$$
This is illustrated in the following example.
Example 5.1
(a) Let $f: \mathbf{R} \to \mathbf{R}$ be the function that assigns to each real number $x$ its square $x^2$. We can denote this function by
writing
$$f(x) = x^2 \quad\text{or}\quad x \mapsto x^2$$
Here the image of $-3$ is 9, so we may write $f(-3) = 9$. However, $f^{-1}(9) = \{3, -3\}$. Also,
$f(\mathbf{R}) = [0, \infty) = \{x : x \geq 0\}$ is the image of $f$.
(b) Let $A = \{a, b, c, d\}$ and $B = \{x, y, z, t\}$. Then the following defines a mapping $f: A \to B$:
$$f(a) = y,\ f(b) = x,\ f(c) = z,\ f(d) = y \quad\text{or}\quad f = \{(a, y), (b, x), (c, z), (d, y)\}$$
The first defines the mapping explicitly, and the second defines the mapping by its graph. Here,
$$f(\{a, b, d\}) = \{f(a), f(b), f(d)\} = \{y, x, y\} = \{x, y\}$$
Furthermore, $f(A) = \{x, y, z\}$ is the image of $f$.
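The image and preimage computations of Example 5.1(b) can be sketched in a few lines of Python. This is only an illustration: the mapping is stored as a dictionary (that is, by its graph), and the helper names `image` and `preimage` are hypothetical, not notation from the text.

```python
# Example 5.1(b): the mapping f stored by its graph {(a, y), (b, x), (c, z), (d, y)}.
f = {"a": "y", "b": "x", "c": "z", "d": "y"}

def image(mapping, subset):
    """Return f(A') = {f(a) : a in A'}."""
    return {mapping[a] for a in subset}

def preimage(mapping, targets):
    """Return f^{-1}(B') = {a in A : f(a) in B'}."""
    return {a for a in mapping if mapping[a] in targets}

img_abd = image(f, {"a", "b", "d"})   # image of the subset {a, b, d}
img_all = image(f, f.keys())          # image (range) of f
pre_y = preimage(f, {"y"})            # preimage of the subset {y}
```

Note that `t` never appears in `img_all`, so the image of $f$ is a proper subset of the target set $B$.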
Example 5.2. Let $V$ be the vector space of polynomials over $\mathbf{R}$, and let $p(t) = 3t^2 - 5t + 2$.
(a) The derivative defines a mapping $\mathbf{D}: V \to V$ where, for any polynomial $f(t)$, we have $\mathbf{D}(f) = df/dt$. Thus
$$\mathbf{D}(p) = \mathbf{D}(3t^2 - 5t + 2) = 6t - 5$$
(b) The integral, say from 0 to 1, defines a mapping $\mathbf{J}: V \to \mathbf{R}$. That is, for any polynomial $f(t)$,
$$\mathbf{J}(f) = \int_0^1 f(t)\,dt, \quad\text{and so}\quad \mathbf{J}(p) = \int_0^1 (3t^2 - 5t + 2)\,dt = \tfrac{1}{2}$$
Observe that the mapping in (b) is from the vector space $V$ into the scalar field $\mathbf{R}$, whereas the mapping in (a) is from
the vector space $V$ into itself.
Matrix Mappings
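Let $A$ be any $m \times n$ matrix over $K$. Then $A$ determines a mapping $F_A: K^n \to K^m$ by
$$F_A(u) = Au$$
where the vectors in $K^n$ and $K^m$ are written as columns. For example, suppose
$$A = \begin{bmatrix} 1 & -4 & 5 \\ 2 & 3 & -6 \end{bmatrix} \quad\text{and}\quad u = \begin{bmatrix} 1 \\ 3 \\ -5 \end{bmatrix}$$
then
$$F_A(u) = Au = \begin{bmatrix} 1 & -4 & 5 \\ 2 & 3 & -6 \end{bmatrix} \begin{bmatrix} 1 \\ 3 \\ -5 \end{bmatrix} = \begin{bmatrix} -36 \\ 41 \end{bmatrix}$$
Remark: For notational convenience, we shall frequently denote the mapping $F_A$ by the letter $A$, the
same symbol as used for the matrix.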
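As a rough sketch, the matrix mapping $u \mapsto Au$ can be written directly from the definition of the matrix-vector product. The function name `mat_vec` and the list-of-rows representation are assumptions made for this illustration only.

```python
def mat_vec(A, u):
    """Return the product Au, where A is a list of rows and u is a list of entries."""
    if any(len(row) != len(u) for row in A):
        raise ValueError("each row of A must have as many entries as u")
    # The i-th entry of Au is the dot product of the i-th row of A with u.
    return [sum(a * x for a, x in zip(row, u)) for row in A]

A = [[1, -4, 5],
     [2, 3, -6]]
u = [1, 3, -5]
Au = mat_vec(A, u)   # the image of u under the matrix mapping
```

Here `Au` reproduces the column $(-36, 41)$ computed above: $1 - 12 - 25 = -36$ and $2 + 9 + 30 = 41$.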
Composition of Mappings
Consider two mappings $f: A \to B$ and $g: B \to C$, illustrated below:
$$A \xrightarrow{\;f\;} B \xrightarrow{\;g\;} C$$
The composition of $f$ and $g$, denoted by $g \circ f$, is the mapping $g \circ f: A \to C$ defined by
$$(g \circ f)(a) \equiv g(f(a))$$
That is, first we apply $f$ to $a \in A$, and then we apply $g$ to $f(a) \in B$ to get $g(f(a)) \in C$. Viewing $f$ and $g$ as
"computers", the composition means we first input $a \in A$ to get the output $f(a) \in B$ using $f$, and then we
input $f(a)$ to get the output $g(f(a)) \in C$ using $g$.
Our first theorem tells us that the composition of mappings satisfies the associative law.
Theorem 5.1: Let $f: A \to B$, $g: B \to C$, $h: C \to D$. Then
$$h \circ (g \circ f) = (h \circ g) \circ f$$
We prove this theorem here. Let $a \in A$. Then
$$(h \circ (g \circ f))(a) = h((g \circ f)(a)) = h(g(f(a)))$$
$$((h \circ g) \circ f)(a) = (h \circ g)(f(a)) = h(g(f(a)))$$
Thus $(h \circ (g \circ f))(a) = ((h \circ g) \circ f)(a)$ for every $a \in A$, and so $h \circ (g \circ f) = (h \circ g) \circ f$.
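The associative law of Theorem 5.1 can be spot-checked numerically. In the sketch below the three functions `f`, `g`, `h` are arbitrary choices made only for illustration, and agreement on a finite sample is of course a check, not a proof.

```python
def compose(g, f):
    """Return the composition g o f, i.e. the mapping a -> g(f(a))."""
    return lambda a: g(f(a))

# Hypothetical sample mappings on the integers.
f = lambda x: x + 1
g = lambda x: 2 * x
h = lambda x: x * x

left = compose(h, compose(g, f))    # h o (g o f)
right = compose(compose(h, g), f)   # (h o g) o f

# Both sides reduce to h(g(f(a))), so they agree on every sample input.
agree = all(left(a) == right(a) for a in range(-5, 6))
value_at_3 = left(3)                # h(g(f(3))) = h(g(4)) = h(8)
```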
One-to-One and Onto Mappings
We formally introduce some special types of mappings.
Definition: A mapping $f: A \to B$ is said to be one-to-one (or 1-1 or injective) if different elements of $A$
have distinct images; that is:
(1) If $a \neq a'$, then $f(a) \neq f(a')$.
Equivalently,
(2) If $f(a) = f(a')$, then $a = a'$.
Definition: A mapping $f: A \to B$ is said to be onto (or $f$ maps $A$ onto $B$ or surjective) if every $b \in B$ is
the image of at least one $a \in A$.
Definition: A mapping $f: A \to B$ is said to be a one-to-one correspondence between $A$ and $B$ (or
bijective) if $f$ is both one-to-one and onto.
Example 5.3. Let $f: \mathbf{R} \to \mathbf{R}$, $g: \mathbf{R} \to \mathbf{R}$, $h: \mathbf{R} \to \mathbf{R}$ be defined by
$$f(x) = 2^x, \quad g(x) = x^3 - x, \quad h(x) = x^2$$
The graphs of these functions are shown in Fig. 5-1. The function $f$ is one-to-one. Geometrically, this means that each
horizontal line does not contain more than one point of $f$. The function $g$ is onto. Geometrically, this means that each
horizontal line contains at least one point of $g$. The function $h$ is neither one-to-one nor onto. For example, both 2 and
$-2$ have the same image 4, and $-16$ has no preimage.
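For mappings between finite sets, the two definitions above can be tested by direct enumeration. The following is a minimal sketch under that assumption; the finite domains and targets are hypothetical stand-ins, since a mapping on all of $\mathbf{R}$ cannot be checked exhaustively.

```python
def is_one_to_one(f, domain):
    """True if distinct elements of the (finite) domain have distinct images."""
    images = [f(a) for a in domain]
    return len(set(images)) == len(images)

def is_onto(f, domain, target):
    """True if every element of the (finite) target is the image of some element."""
    return {f(a) for a in domain} == set(target)

domain = range(-3, 4)                        # {-3, ..., 3}
h = lambda x: x * x                          # squaring, restricted to the finite domain
h_injective = is_one_to_one(h, domain)       # fails because h(2) == h(-2)
h_onto = is_onto(h, domain, range(-9, 10))   # fails because negatives have no preimage

shift = lambda x: x + 1                      # a bijection from {-3..3} onto {-2..4}
shift_bijective = (is_one_to_one(shift, domain)
                   and is_onto(shift, domain, range(-2, 5)))
```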
Identity and Inverse Mappings
Let $A$ be any nonempty set. The mapping $f: A \to A$ defined by $f(a) = a$, that is, the function that
assigns to each element in $A$ itself, is called the identity mapping. It is usually denoted by $1_A$ or $1$ or $I$.
Thus, for any $a \in A$, we have $1_A(a) = a$.
[Fig. 5-1: graphs of $f(x) = 2^x$, $g(x) = x^3 - x$, and $h(x) = x^2$]
Now let $f: A \to B$. We call $g: B \to A$ the inverse of $f$, written $f^{-1}$, if
$$f \circ g = 1_B \quad\text{and}\quad g \circ f = 1_A$$
We emphasize that $f$ has an inverse if and only if $f$ is a one-to-one correspondence between $A$ and $B$, that
is, $f$ is one-to-one and onto (Problem 5.7). Also, if $b \in B$, then $f^{-1}(b) = a$, where $a$ is the unique element
of $A$ for which $f(a) = b$.
5.3 LINEAR MAPPINGS (LINEAR TRANSFORMATIONS)
We begin with a definition.
Definition: Let $V$ and $U$ be vector spaces over the same field $K$. A mapping $F: V \to U$ is called a linear
mapping or linear transformation if it satisfies the following two conditions:
(1) For any vectors $v, w \in V$, $F(v + w) = F(v) + F(w)$.
(2) For any scalar $k$ and vector $v \in V$, $F(kv) = kF(v)$.
Namely, $F: V \to U$ is linear if it "preserves" the two basic operations of a vector space, that of vector
addition and that of scalar multiplication.
Substituting $k = 0$ into condition (2), we obtain $F(0) = 0$. Thus, every linear mapping takes the zero
vector into the zero vector.
Now for any scalars $a, b \in K$ and any vectors $v, w \in V$, we obtain
$$F(av + bw) = F(av) + F(bw) = aF(v) + bF(w)$$
More generally, for any scalars $a_i \in K$ and any vectors $v_i \in V$, we obtain the following basic property of
linear mappings:
$$F(a_1 v_1 + a_2 v_2 + \cdots + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + \cdots + a_m F(v_m)$$
Remark 1: A linear mapping $F: V \to U$ is completely characterized by the condition
$$F(av + bw) = aF(v) + bF(w) \qquad (*)$$
and so this condition is sometimes used as its definition.
Remark 2: The term linear transformation rather than linear mapping is frequently used for linear
mappings of the form $F: \mathbf{R}^n \to \mathbf{R}^m$.
Example 5.4
(a) Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be the "projection" mapping into the $xy$-plane, that is, $F$ is the mapping defined by
$F(x, y, z) = (x, y, 0)$. We show that $F$ is linear. Let $v = (a, b, c)$ and $w = (a', b', c')$. Then
$$F(v + w) = F(a + a', b + b', c + c') = (a + a', b + b', 0)$$
$$= (a, b, 0) + (a', b', 0) = F(v) + F(w)$$
and, for any scalar $k$,
$$F(kv) = F(ka, kb, kc) = (ka, kb, 0) = k(a, b, 0) = kF(v)$$
Thus $F$ is linear.
(b) Let $G: \mathbf{R}^2 \to \mathbf{R}^2$ be the "translation" mapping defined by $G(x, y) = (x + 1, y + 2)$. [That is, $G$ adds the vector
$(1, 2)$ to any vector $v = (x, y)$ in $\mathbf{R}^2$.] Note that
$$G(0) = G(0, 0) = (1, 2) \neq 0$$
Thus the zero vector is not mapped into the zero vector. Hence $G$ is not linear.
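The two conditions in the definition of a linear mapping lend themselves to a quick numerical spot-check. The sketch below uses a hypothetical helper `looks_linear`; passing such a check on a few sample vectors is necessary for linearity but never sufficient, whereas a single failure does prove that a mapping is not linear.

```python
def add(v, w):
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    return tuple(k * a for a in v)

def looks_linear(F, samples, scalars):
    """Spot-check F(v + w) = F(v) + F(w) and F(kv) = kF(v) on the given samples."""
    for v in samples:
        for w in samples:
            if F(add(v, w)) != add(F(v), F(w)):
                return False
        for k in scalars:
            if F(scale(k, v)) != scale(k, F(v)):
                return False
    return True

F = lambda v: (v[0], v[1], 0)        # projection into the xy-plane
G = lambda v: (v[0] + 1, v[1] + 2)   # translation by the vector (1, 2)

proj_ok = looks_linear(F, [(1, 2, 3), (0, -1, 4), (2, 2, 2)], [0, 1, -3])
trans_ok = looks_linear(G, [(0, 0), (1, 2)], [0, 2])
```

The translation already fails on the zero vector, which matches the argument given in part (b).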
Example 5.5. (Derivative and Integral Mappings) Consider the vector space $V = \mathbf{P}(t)$ of polynomials over the real field
$\mathbf{R}$. Let $u(t)$ and $v(t)$ be any polynomials in $V$ and let $k$ be any scalar.
(a) Let $\mathbf{D}: V \to V$ be the derivative mapping. One proves in calculus that
$$\frac{d(u + v)}{dt} = \frac{du}{dt} + \frac{dv}{dt} \quad\text{and}\quad \frac{d(ku)}{dt} = k\frac{du}{dt}$$
That is, $\mathbf{D}(u + v) = \mathbf{D}(u) + \mathbf{D}(v)$ and $\mathbf{D}(ku) = k\mathbf{D}(u)$. Thus the derivative mapping is linear.
(b) Let $\mathbf{J}: V \to \mathbf{R}$ be an integral mapping, say
$$\mathbf{J}(f(t)) = \int_0^1 f(t)\,dt$$
One also proves in calculus that
$$\int_0^1 [u(t) + v(t)]\,dt = \int_0^1 u(t)\,dt + \int_0^1 v(t)\,dt$$
and
$$\int_0^1 ku(t)\,dt = k\int_0^1 u(t)\,dt$$
That is, $\mathbf{J}(u + v) = \mathbf{J}(u) + \mathbf{J}(v)$ and $\mathbf{J}(ku) = k\mathbf{J}(u)$. Thus the integral mapping is linear.
Example 5.6. (Zero and Identity Mappings.)
(a) Let $F: V \to U$ be the mapping that assigns the zero vector $0 \in U$ to every vector $v \in V$. Then, for any vectors
$v, w \in V$ and any scalar $k \in K$, we have
$$F(v + w) = 0 = 0 + 0 = F(v) + F(w) \quad\text{and}\quad F(kv) = 0 = k0 = kF(v)$$
Thus $F$ is linear. We call $F$ the zero mapping, and we shall usually denote it by $\mathbf{0}$.
(b) Consider the identity mapping $I: V \to V$, which maps each $v \in V$ into itself. Then, for any vectors $v, w \in V$ and
any scalars $a, b \in K$, we have
$$I(av + bw) = av + bw = aI(v) + bI(w)$$
Thus $I$ is linear.
Our next theorem (proved in Problem 5.13) gives us an abundance of examples of linear mappings. In
particular, it tells us that a linear mapping is completely determined by its values on the elements of a basis.
Theorem 5.2: Let $V$ and $U$ be vector spaces over a field $K$. Let $\{v_1, v_2, \ldots, v_n\}$ be a basis of $V$ and let
$u_1, u_2, \ldots, u_n$ be any vectors in $U$. Then there exists a unique linear mapping $F: V \to U$
such that $F(v_1) = u_1, F(v_2) = u_2, \ldots, F(v_n) = u_n$.
We emphasize that the vectors $u_1, u_2, \ldots, u_n$ in Theorem 5.2 are completely arbitrary; they may be
linearly dependent or they may even be equal to each other.
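To make Theorem 5.2 concrete, here is a small sketch for the special case where $V = K^n$ with its usual basis $e_1, \ldots, e_n$ (an assumption made for simplicity; the theorem itself holds for any basis). Given arbitrary images $u_1, \ldots, u_n$, the mapping is forced to be $F(x_1, \ldots, x_n) = x_1 u_1 + \cdots + x_n u_n$. The function name is hypothetical.

```python
def linear_map_from_basis_images(images):
    """Given u_1, ..., u_n (the prescribed images of e_1, ..., e_n),
    return the mapping F with F(x) = x_1*u_1 + ... + x_n*u_n."""
    m = len(images[0])
    def F(x):
        if len(x) != len(images):
            raise ValueError("x must have one coordinate per basis vector")
        return tuple(sum(xi * u[j] for xi, u in zip(x, images)) for j in range(m))
    return F

# The prescribed images may repeat or be zero: here u_1 = u_2 and u_3 = 0.
images = [(1, 2), (1, 2), (0, 0)]
F = linear_map_from_basis_images(images)

on_e1 = F((1, 0, 0))      # should reproduce u_1
on_e2 = F((0, 1, 0))      # should reproduce u_2
on_mixed = F((2, 3, 5))   # 2*u_1 + 3*u_2 + 5*u_3
```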
Matrices as Linear Mappings
Let $A$ be any $m \times n$ matrix over $K$. Recall that $A$ determines a mapping $F_A: K^n \to K^m$ by $F_A(u) = Au$
(where the vectors in $K^n$ and $K^m$ are written as columns). We show $F_A$ is linear. By matrix multiplication,
$$F_A(v + w) = A(v + w) = Av + Aw = F_A(v) + F_A(w)$$
$$F_A(kv) = A(kv) = k(Av) = kF_A(v)$$
In other words, using $A$ to represent the mapping, we have
$$A(v + w) = Av + Aw \quad\text{and}\quad A(kv) = k(Av)$$
Thus the matrix mapping $A$ is linear.
Vector Space Isomorphism
The notion of two vector spaces being isomorphic was defined in Chapter 4 when we investigated the
coordinates of a vector relative to a basis. We now redefine this concept.
Definition: Two vector spaces $V$ and $U$ over $K$ are isomorphic, written $V \cong U$, if there exists a bijective
(one-to-one and onto) linear mapping $F: V \to U$. The mapping $F$ is then called an
isomorphism between $V$ and $U$.
Consider any vector space $V$ of dimension $n$ and let $S$ be any basis of $V$. Then the mapping
$$v \mapsto [v]_S$$
which maps each vector $v \in V$ into its coordinate vector $[v]_S$, is an isomorphism between $V$ and $K^n$.
5.4 KERNEL AND IMAGE OF A LINEAR MAPPING
We begin by defining two concepts.
Definition: Let $F: V \to U$ be a linear mapping. The kernel of $F$, written $\operatorname{Ker} F$, is the set of elements in
$V$ that map into the zero vector $0$ in $U$; that is,
$$\operatorname{Ker} F = \{v \in V : F(v) = 0\}$$
The image (or range) of $F$, written $\operatorname{Im} F$, is the set of image points in $U$; that is,
$$\operatorname{Im} F = \{u \in U : \text{there exists } v \in V \text{ for which } F(v) = u\}$$
The following theorem is easily proved (Problem 5.22).
Theorem 5.3: Let $F: V \to U$ be a linear mapping. Then the kernel of $F$ is a subspace of $V$ and the
image of $F$ is a subspace of $U$.
Now suppose that $v_1, v_2, \ldots, v_m$ span a vector space $V$ and that $F: V \to U$ is linear. We show that
$F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$. Let $u \in \operatorname{Im} F$. Then there exists $v \in V$ such that $F(v) = u$. Since the $v_i$
span $V$ and since $v \in V$, there exist scalars $a_1, a_2, \ldots, a_m$ for which
$$v = a_1 v_1 + a_2 v_2 + \cdots + a_m v_m$$
Therefore,
$$u = F(v) = F(a_1 v_1 + a_2 v_2 + \cdots + a_m v_m) = a_1 F(v_1) + a_2 F(v_2) + \cdots + a_m F(v_m)$$
Thus the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$.
We formally state the above result.
Proposition 5.4: Suppose $v_1, v_2, \ldots, v_m$ span a vector space $V$, and suppose $F: V \to U$ is linear. Then
$F(v_1), F(v_2), \ldots, F(v_m)$ span $\operatorname{Im} F$.
Example 5.7
(a) Let $F: \mathbf{R}^3 \to \mathbf{R}^3$ be the projection of a vector $v$ into the $xy$-plane [as pictured in Fig. 5-2(a)]; that is,
$$F(x, y, z) = (x, y, 0)$$
Clearly the image of $F$ is the entire $xy$-plane, i.e., points of the form $(x, y, 0)$. Moreover, the kernel of $F$ is the
$z$-axis, i.e., points of the form $(0, 0, c)$. That is,
$$\operatorname{Im} F = \{(a, b, c) : c = 0\} = xy\text{-plane} \quad\text{and}\quad \operatorname{Ker} F = \{(a, b, c) : a = 0, b = 0\} = z\text{-axis}$$
(b) Let $G: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear mapping that rotates a vector $v$ about the $z$-axis through an angle $\theta$ [as pictured in
Fig. 5-2(b)]; that is,
$$G(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$$
Observe that the distance of a vector $v$ from the origin $O$ does not change under the rotation, and so only the zero
vector $0$ is mapped into the zero vector $0$. Thus $\operatorname{Ker} G = \{0\}$. On the other hand, every vector $u$ in $\mathbf{R}^3$ is the
image of a vector $v$ in $\mathbf{R}^3$ that can be obtained by rotating $u$ back by an angle of $\theta$. Thus $\operatorname{Im} G = \mathbf{R}^3$, the entire
space.
Example 5.8. Consider the vector space $V = \mathbf{P}(t)$ of polynomials over the real field $\mathbf{R}$, and let $H: V \to V$ be the
third-derivative operator, that is, $H[f(t)] = d^3f/dt^3$. [Sometimes the notation $\mathbf{D}^3$ is used for $H$, where $\mathbf{D}$ is the derivative
operator.] We claim that
$$\operatorname{Ker} H = \{\text{polynomials of degree} \leq 2\} = \mathbf{P}_2(t) \quad\text{and}\quad \operatorname{Im} H = V$$
The first comes from the fact that $H(at^2 + bt + c) = 0$ but $H(t^n) \neq 0$ for $n \geq 3$. The second comes from the fact that
every polynomial $g(t)$ in $V$ is the third derivative of some polynomial $f(t)$ (which can be obtained by taking the
antiderivative of $g(t)$ three times).
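The kernel claim of Example 5.8 can be illustrated with polynomials stored as coefficient lists. This representation (entry `i` holds the coefficient of $t^i$, and the empty list stands for the zero polynomial) is an assumption of the sketch, as are the function names.

```python
def derivative(coeffs):
    """Differentiate a polynomial given as [a_0, a_1, a_2, ...] (a_i multiplies t^i)."""
    return [i * c for i, c in enumerate(coeffs)][1:]

def third_derivative(coeffs):
    """Apply the derivative three times: the operator H of Example 5.8."""
    for _ in range(3):
        coeffs = derivative(coeffs)
    return coeffs

quadratic = [4, -5, 3]    # 3t^2 - 5t + 4, degree 2
cubic = [0, 0, 0, 1]      # t^3

h_quadratic = third_derivative(quadratic)   # empty list: the zero polynomial
h_cubic = third_derivative(cubic)           # the constant polynomial 6
```

The quadratic is sent to zero, so it lies in the kernel, while $t^3$ is sent to the nonzero constant 6.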
[Fig. 5-2: (a) the projection $F(v) = (a, b, 0)$ of $v = (a, b, c)$ into the $xy$-plane; (b) the rotation $G$ about the $z$-axis through the angle $\theta$]
Kernel and Image of Matrix Mappings
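Consider, say, a $3 \times 4$ matrix $A$ and the usual basis $\{e_1, e_2, e_3, e_4\}$ of $K^4$ (written as columns):
$$A = \begin{bmatrix} a_1 & a_2 & a_3 & a_4 \\ b_1 & b_2 & b_3 & b_4 \\ c_1 & c_2 & c_3 & c_4 \end{bmatrix}, \quad e_1 = \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad e_2 = \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad e_3 = \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad e_4 = \begin{bmatrix} 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}$$
Recall that $A$ may be viewed as a linear mapping $A: K^4 \to K^3$, where the vectors in $K^4$ and $K^3$ are viewed
as column vectors. Now the usual basis vectors span $K^4$, so their images $Ae_1, Ae_2, Ae_3, Ae_4$ span the image
of $A$. But the vectors $Ae_1, Ae_2, Ae_3, Ae_4$ are precisely the columns of $A$:
$$Ae_1 = [a_1, b_1, c_1]^T, \quad Ae_2 = [a_2, b_2, c_2]^T, \quad Ae_3 = [a_3, b_3, c_3]^T, \quad Ae_4 = [a_4, b_4, c_4]^T$$
Thus the image of $A$ is precisely the column space of $A$.
On the other hand, the kernel of $A$ consists of all vectors $v$ for which $Av = 0$. This means that the
kernel of $A$ is the solution space of the homogeneous system $AX = 0$, called the null space of $A$.
We state the above results formally.
Proposition 5.5: Let $A$ be any $m \times n$ matrix over a field $K$ viewed as a linear map $A: K^n \to K^m$. Then
$$\operatorname{Ker} A = \operatorname{nullsp}(A) \quad\text{and}\quad \operatorname{Im} A = \operatorname{colsp}(A)$$
Here $\operatorname{colsp}(A)$ denotes the column space of $A$, and $\operatorname{nullsp}(A)$ denotes the null space of $A$.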
Rank and Nullity of a Linear Mapping
Let $F: V \to U$ be a linear mapping. The rank of $F$ is defined to be the dimension of its image, and the
nullity of $F$ is defined to be the dimension of its kernel; namely,
$$\operatorname{rank}(F) = \dim(\operatorname{Im} F) \quad\text{and}\quad \operatorname{nullity}(F) = \dim(\operatorname{Ker} F)$$
The following important theorem (proved in Problem 5.23) holds.
Theorem 5.6: Let $V$ be of finite dimension, and let $F: V \to U$ be linear. Then
$$\dim V = \dim(\operatorname{Ker} F) + \dim(\operatorname{Im} F) = \operatorname{nullity}(F) + \operatorname{rank}(F)$$
Recall that the rank of a matrix $A$ was also defined to be the dimension of its column space and row
space. If we now view $A$ as a linear mapping, then both definitions correspond, since the image of $A$ is
precisely its column space.
Example 5.9. Let $F: \mathbf{R}^4 \to \mathbf{R}^3$ be the linear mapping defined by
$$F(x, y, z, t) = (x - y + z + t,\ 2x - 2y + 3z + 4t,\ 3x - 3y + 4z + 5t)$$
(a) Find a basis and the dimension of the image of $F$.
First find the image of the usual basis vectors of $\mathbf{R}^4$,
$$F(1, 0, 0, 0) = (1, 2, 3), \quad F(0, 1, 0, 0) = (-1, -2, -3), \quad F(0, 0, 1, 0) = (1, 3, 4), \quad F(0, 0, 0, 1) = (1, 4, 5)$$
By Proposition 5.4, the image vectors span $\operatorname{Im} F$. Hence form the matrix $M$ whose rows are these image vectors
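and row reduce to echelon form:
$$M = \begin{bmatrix} 1 & 2 & 3 \\ -1 & -2 & -3 \\ 1 & 3 & 4 \\ 1 & 4 & 5 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 0 \\ 0 & 1 & 1 \\ 0 & 2 & 2 \end{bmatrix} \sim \begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
Thus $(1, 2, 3)$ and $(0, 1, 1)$ form a basis of $\operatorname{Im} F$. Hence $\dim(\operatorname{Im} F) = 2$ and $\operatorname{rank}(F) = 2$.
(b) Find a basis and the dimension of the kernel of the map $F$.
Set $F(v) = 0$, where $v = (x, y, z, t)$,
$$F(x, y, z, t) = (x - y + z + t,\ 2x - 2y + 3z + 4t,\ 3x - 3y + 4z + 5t) = (0, 0, 0)$$
Set corresponding components equal to each other to form the following homogeneous system whose solution
space is $\operatorname{Ker} F$:
$$\begin{aligned} x - y + z + t &= 0 \\ 2x - 2y + 3z + 4t &= 0 \\ 3x - 3y + 4z + 5t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \\ z + 2t &= 0 \end{aligned} \quad\text{or}\quad \begin{aligned} x - y + z + t &= 0 \\ z + 2t &= 0 \end{aligned}$$
The free variables are $y$ and $t$. Hence $\dim(\operatorname{Ker} F) = 2$ or $\operatorname{nullity}(F) = 2$.
(i) Set $y = 1$, $t = 0$ to obtain the solution $(1, 1, 0, 0)$,
(ii) Set $y = 0$, $t = 1$ to obtain the solution $(1, 0, -2, 1)$.
Thus $(1, 1, 0, 0)$ and $(1, 0, -2, 1)$ form a basis for $\operatorname{Ker} F$.
As expected from Theorem 5.6, $\dim(\operatorname{Im} F) + \dim(\operatorname{Ker} F) = 4 = \dim \mathbf{R}^4$.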
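The rank and nullity in Example 5.9 can be recomputed by exact row reduction. The following is a minimal sketch using Python's `fractions.Fraction` for exact arithmetic; the helper names `rref` and `kernel_basis` are hypothetical, and the matrix `A` is the coefficient matrix of the mapping $F$ read off from its defining formula.

```python
from fractions import Fraction

def rref(rows):
    """Return (reduced row echelon form, list of pivot column indices)."""
    M = [[Fraction(x) for x in row] for row in rows]
    pivots, r = [], 0
    for c in range(len(M[0])):
        pivot_row = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot_row is None:
            continue                      # no pivot in this column: a free variable
        M[r], M[pivot_row] = M[pivot_row], M[r]
        pivot = M[r][c]
        M[r] = [x / pivot for x in M[r]]  # scale the pivot entry to 1
        for i in range(len(M)):
            if i != r and M[i][c] != 0:   # clear the rest of the pivot column
                factor = M[i][c]
                M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        pivots.append(c)
        r += 1
        if r == len(M):
            break
    return M, pivots

def kernel_basis(rows):
    """One basis vector of the null space per free variable."""
    R, pivots = rref(rows)
    n = len(rows[0])
    basis = []
    for free in (c for c in range(n) if c not in pivots):
        v = [Fraction(0)] * n
        v[free] = Fraction(1)
        for i, p in enumerate(pivots):
            v[p] = -R[i][free]            # solve for each pivot variable
        basis.append(v)
    return basis

A = [[1, -1, 1, 1],
     [2, -2, 3, 4],
     [3, -3, 4, 5]]
rank = len(rref(A)[1])
kernel = kernel_basis(A)
nullity = len(kernel)
# Every kernel vector must satisfy all three equations of the system.
maps_to_zero = all(sum(a * x for a, x in zip(row, v)) == 0 for v in kernel for row in A)
```

Setting the free variables $y$ and $t$ to 1 in turn, this sketch yields the kernel vectors $(1, 1, 0, 0)$ and $(1, 0, -2, 1)$, and `rank + nullity` equals 4 in agreement with Theorem 5.6.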
Application to Systems of Linear Equations
Let $AX = B$ denote the matrix form of a system of $m$ linear equations in $n$ unknowns. Now the matrix
$A$ may be viewed as a linear mapping
$$A: K^n \to K^m$$
Thus the solution of the equation $AX = B$ may be viewed as the preimage of the vector $B \in K^m$ under the
linear mapping $A$. Furthermore, the solution of the associated homogeneous system
$$AX = 0$$
may be viewed as the kernel of the linear mapping $A$. Applying Theorem 5.6 to this homogeneous system
yields
$$\dim(\operatorname{Ker} A) = \dim K^n - \dim(\operatorname{Im} A) = n - \operatorname{rank} A$$
But $n$ is exactly the number of unknowns in the homogeneous system $AX = 0$. Thus we have proved the
following theorem of Chapter 4.
Theorem 4.19: The dimension of the solution space $W$ of a homogeneous system $AX = 0$ of linear
equations is $s = n - r$, where $n$ is the number of unknowns and $r$ is the rank of the
coefficient matrix $A$.
Observe that $r$ is also the number of pivot variables in an echelon form of $AX = 0$, so $s = n - r$ is also
the number of free variables. Furthermore, the $s$ solution vectors of $AX = 0$ described in Theorem 3.12 are
linearly independent (Problem 4.52). Accordingly, since $\dim W = s$, they form a basis for the solution
space $W$. Thus we have also proved Theorem 3.12.
5.5 SINGULAR AND NONSINGULAR LINEAR MAPPINGS, ISOMORPHISMS
Let $F: V \to U$ be a linear mapping. Recall that $F(0) = 0$. $F$ is said to be singular if the image of
some nonzero vector $v$ is $0$, that is, if there exists $v \neq 0$ such that $F(v) = 0$. Thus $F: V \to U$ is
nonsingular if the zero vector $0$ is the only vector whose image under $F$ is $0$ or, in other words, if
$\operatorname{Ker} F = \{0\}$.
Example 5.10. Consider the projection map $F: \mathbf{R}^3 \to \mathbf{R}^3$ and the rotation map $G: \mathbf{R}^3 \to \mathbf{R}^3$ appearing in Fig. 5-2. (See
Example 5.7.) Since the kernel of $F$ is the $z$-axis, $F$ is singular. On the other hand, the kernel of $G$ consists only of the zero
vector $0$. Thus $G$ is nonsingular.
Nonsingular linear mappings may also be characterized as those mappings that carry independent sets
into independent sets. Specifically, we prove (Problem 5.28) the following theorem.
Theorem 5.7: Let $F: V \to U$ be a nonsingular linear mapping. Then the image of any linearly
independent set is linearly independent.
Isomorphisms
Suppose a linear mapping $F: V \to U$ is one-to-one. Then only $0 \in V$ can map into $0 \in U$, and so $F$ is
nonsingular. The converse is also true. For suppose $F$ is nonsingular and $F(v) = F(w)$; then
$F(v - w) = F(v) - F(w) = 0$, and hence $v - w = 0$ or $v = w$. Thus $F(v) = F(w)$ implies $v = w$, that is,
$F$ is one-to-one. Thus we have proved the following proposition.
Proposition 5.8: A linear mapping $F: V \to U$ is one-to-one if and only if $F$ is nonsingular.
Recall that a mapping $F: V \to U$ is called an isomorphism if $F$ is linear and if $F$ is bijective, i.e., if $F$
is one-to-one and onto. Also, recall that a vector space $V$ is said to be isomorphic to a vector space $U$,
written $V \cong U$, if there is an isomorphism $F: V \to U$.
The following theorem (proved in Problem 5.29) applies.
Theorem 5.9: Suppose $V$ has finite dimension and $\dim V = \dim U$. Suppose $F: V \to U$ is linear. Then
$F$ is an isomorphism if and only if $F$ is nonsingular.
5.6 OPERATIONS WITH LINEAR MAPPINGS
We are able to combine linear mappings in various ways to obtain new linear mappings. These
operations are very important and will be used throughout the text.
Let $F: V \to U$ and $G: V \to U$ be linear mappings over a field $K$. The sum $F + G$ and the scalar
product $kF$, where $k \in K$, are defined to be the following mappings from $V$ into $U$:
$$(F + G)(v) = F(v) + G(v) \quad\text{and}\quad (kF)(v) = kF(v)$$
We now show that if $F$ and $G$ are linear, then $F + G$ and $kF$ are also linear. Specifically, for any vectors
$v, w \in V$ and any scalars $a, b \in K$,
(F + G)(av + bw) = F(av + bw) + G(av + bw)
= aF(v) + bF(w) + aG(v) + bG(w)
= a[F(v) + G(v)] + b[F(w) + G(w)]
= a(F + G)(v) + b(F + G)(w)
and (kF)(av + bw) = kF(av + bw) = k[aF(v) + bF(w)]
= akF(v) + bkF(w) = a(kF)(v) + b(kF)(w)
Thus F + G and kF are linear.
The following theorem holds.
Theorem 5.10: Let V and U be vector spaces over a field K. Then the collection of all linear mappings
from V into U with the above operations of addition and scalar multiplication forms a
vector space over K.
The vector space of linear mappings in the above Theorem 5.10 is usually denoted by
$$\operatorname{Hom}(V, U)$$
Here Hom comes from the word "homomorphism". We emphasize that the proof of Theorem 5.10 reduces
to showing that $\operatorname{Hom}(V, U)$ does satisfy the eight axioms of a vector space. The zero element of
$\operatorname{Hom}(V, U)$ is the zero mapping from $V$ into $U$, denoted by $\mathbf{0}$ and defined by
$$\mathbf{0}(v) = 0$$
for every vector $v \in V$.
Suppose $V$ and $U$ are of finite dimension. Then we have the following theorem.
Theorem 5.11: Suppose $\dim V = m$ and $\dim U = n$. Then $\dim[\operatorname{Hom}(V, U)] = mn$.
Composition of Linear Mappings
Now suppose $V$, $U$, and $W$ are vector spaces over the same field $K$, and suppose $F: V \to U$ and
$G: U \to W$ are linear mappings. We picture these mappings as follows:
$$V \xrightarrow{\;F\;} U \xrightarrow{\;G\;} W$$
Recall that the composition function $G \circ F$ is the mapping from $V$ into $W$ defined by $(G \circ F)(v) = G(F(v))$.
We show that $G \circ F$ is linear whenever $F$ and $G$ are linear. Specifically, for any vectors $v, w \in V$ and any
scalars $a, b \in K$, we have
$$(G \circ F)(av + bw) = G(F(av + bw)) = G(aF(v) + bF(w))$$
$$= aG(F(v)) + bG(F(w)) = a(G \circ F)(v) + b(G \circ F)(w)$$
Thus $G \circ F$ is linear.
The composition of linear mappings and the operations of addition and scalar multiplication are
related as follows.
Theorem 5.12: Let $V$, $U$, $W$ be vector spaces over $K$. Suppose the following mappings are linear:
$$F: V \to U, \quad F': V \to U \quad\text{and}\quad G: U \to W, \quad G': U \to W$$
Then, for any scalar $k \in K$:
(i) $G \circ (F + F') = G \circ F + G \circ F'$.
(ii) $(G + G') \circ F = G \circ F + G' \circ F$.
(iii) $k(G \circ F) = (kG) \circ F = G \circ (kF)$.
5.7 ALGEBRA A(V) OF LINEAR OPERATORS
Let $V$ be a vector space over a field $K$. This section considers the special case of linear mappings from
the vector space $V$ into itself, that is, linear mappings of the form $F: V \to V$. They are also called linear
operators or linear transformations on $V$. We will write $A(V)$, instead of $\operatorname{Hom}(V, V)$, for the space of all
such mappings.
Now $A(V)$ is a vector space over $K$ (Theorem 5.10), and, if $\dim V = n$, then $\dim A(V) = n^2$ (Theorem 5.11). Moreover,
for any mappings $F, G \in A(V)$, the composition $G \circ F$ exists and also belongs to $A(V)$. Thus we have a
"multiplication" defined in $A(V)$. [We sometimes write $FG$ instead of $G \circ F$ in the space $A(V)$.]
Remark: An algebra $A$ over a field $K$ is a vector space over $K$ in which an operation of
multiplication is defined satisfying, for every $F, G, H \in A$ and every $k \in K$:
(i) $F(G + H) = FG + FH$,
(ii) $(G + H)F = GF + HF$,
(iii) $k(GF) = (kG)F = G(kF)$.
The algebra is said to be associative if, in addition, $(FG)H = F(GH)$.
The above definition of an algebra and previous theorems give us the following result.
Theorem 5.13: Let $V$ be a vector space over $K$. Then $A(V)$ is an associative algebra over $K$ with respect
to composition of mappings. If $\dim V = n$, then $\dim A(V) = n^2$.
This is why $A(V)$ is called the algebra of linear operators on $V$.
Polynomials and Linear Operators
Observe that the identity mapping $I: V \to V$ belongs to $A(V)$. Also, for any linear operator $F$ in $A(V)$,
we have $FI = IF = F$. We can also form "powers" of $F$. Namely, we define
$$F^0 = I, \quad F^2 = F \circ F, \quad F^3 = F^2 \circ F = F \circ F \circ F, \quad F^4 = F^3 \circ F, \quad \ldots$$
Furthermore, for any polynomial $p(t)$ over $K$, say,
$$p(t) = a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s$$
we can form the linear operator $p(F)$ defined by
$$p(F) = a_0 I + a_1 F + a_2 F^2 + \cdots + a_s F^s$$
(For any scalar $k$, the operator $kI$ is sometimes denoted simply by $k$.) In particular, we say $F$ is a zero of the
polynomial $p(t)$ if $p(F) = 0$.
Example 5.11. Let $F: K^3 \to K^3$ be defined by $F(x, y, z) = (0, x, y)$. For any $(a, b, c) \in K^3$,
$$(F + I)(a, b, c) = (0, a, b) + (a, b, c) = (a, a + b, b + c)$$
$$F^3(a, b, c) = F^2(0, a, b) = F(0, 0, a) = (0, 0, 0)$$
Thus $F^3 = 0$, the zero mapping in $A(V)$. This means $F$ is a zero of the polynomial $p(t) = t^3$.
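The computation in Example 5.11 can be replayed on a concrete vector. In this sketch the helper `power` (repeated composition) and the sample vector are assumptions made for illustration.

```python
def F(v):
    """The operator of Example 5.11: F(x, y, z) = (0, x, y)."""
    x, y, z = v
    return (0, x, y)

def power(op, n):
    """Return op composed with itself n times; n = 0 gives the identity mapping."""
    def composed(v):
        for _ in range(n):
            v = op(v)
        return v
    return composed

v = (7, -2, 5)                                        # a sample vector (a, b, c)
f_plus_i = tuple(p + q for p, q in zip(F(v), v))      # (F + I)(v) = (a, a + b, b + c)
f_squared = power(F, 2)(v)                            # F^2(v) = (0, 0, a)
f_cubed = power(F, 3)(v)                              # F^3(v) is the zero vector
```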
Square Matrices as Linear Operators
Let $\mathbf{M} = \mathbf{M}_{n,n}$ be the vector space of all square $n \times n$ matrices over $K$. Then any matrix $A$ in $\mathbf{M}$ defines
a linear mapping $F_A: K^n \to K^n$ by $F_A(u) = Au$ (where the vectors in $K^n$ are written as columns). Since the
mapping is from $K^n$ into itself, the square matrix $A$ is a linear operator, not simply a linear mapping.
Suppose $A$ and $B$ are matrices in $\mathbf{M}$. Then the matrix product $AB$ is defined. Furthermore, for any
(column) vector $u$ in $K^n$,
$$F_{AB}(u) = (AB)u = A(Bu) = A(F_B(u)) = F_A(F_B(u)) = (F_A \circ F_B)(u)$$
In other words, the matrix product $AB$ corresponds to the composition of $A$ and $B$ as linear mappings.
Similarly, the matrix sum $A + B$ corresponds to the sum of $A$ and $B$ as linear mappings, and the scalar
product $kA$ corresponds to the scalar product of $A$ as a linear mapping.
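The correspondence between the matrix product and composition can be checked on a small case. The matrices `A`, `B` and the vector `u` below are arbitrary sample values, and `mat_vec` and `mat_mul` are hypothetical helper names.

```python
def mat_vec(A, u):
    """Return Au for a matrix A (list of rows) and a vector u."""
    return [sum(a * x for a, x in zip(row, u)) for row in A]

def mat_mul(A, B):
    """Return the matrix product AB (entry ij is row i of A times column j of B)."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

A = [[2, 1],
     [3, 2]]
B = [[0, 1],
     [1, 1]]
u = [4, -1]

via_product = mat_vec(mat_mul(A, B), u)       # F_{AB}(u) = (AB)u
via_composition = mat_vec(A, mat_vec(B, u))   # (F_A o F_B)(u) = A(Bu)
```

Both routes give the same vector, as the identity $(AB)u = A(Bu)$ requires.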
Invertible Operators in A(V)
Let $F: V \to V$ be a linear operator. $F$ is said to be invertible if it has an inverse, that is, if there exists
$F^{-1}$ in $A(V)$ such that $FF^{-1} = F^{-1}F = I$. On the other hand, $F$ is invertible as a mapping if $F$ is both
one-to-one and onto. In such a case, $F^{-1}$ is also linear and $F^{-1}$ is the inverse of $F$ as a linear operator (proved in
Problem 5.15).
Suppose $F$ is invertible. Then only $0 \in V$ can map into the zero vector, and so $F$ is nonsingular. The converse is
not true, as seen by the following example.
Example 5.12. Let $V = \mathbf{P}(t)$, the vector space of polynomials over $K$. Let $F$ be the mapping on $V$ that increases by 1 the
exponent of $t$ in each term of a polynomial, that is,
$$F(a_0 + a_1 t + a_2 t^2 + \cdots + a_s t^s) = a_0 t + a_1 t^2 + a_2 t^3 + \cdots + a_s t^{s+1}$$
Then $F$ is a linear mapping and $F$ is nonsingular. However, $F$ is not onto, and so $F$ is not invertible.
The vector space $V = \mathbf{P}(t)$ in the above example has infinite dimension. The situation changes
significantly when $V$ has finite dimension. Namely, the following theorem applies.
Theorem 5.14: Let $F$ be a linear operator on a finite-dimensional vector space $V$. Then the following
four conditions are equivalent.
(i) $F$ is nonsingular: $\operatorname{Ker} F = \{0\}$.
(ii) $F$ is one-to-one.
(iii) $F$ is an onto mapping.
(iv) $F$ is invertible.
The proof of the above theorem mainly follows from Theorem 5.6, which tells us that
dim V = dim(Ker F) + dim(Im F)
By Proposition 5.8, (i) and (ii) are equivalent. Note that (iv) is equivalent to (ii) and (iii). Thus, to prove the
theorem, we need only show that (i) and (iii) are equivalent. This we do below.
(a) Suppose (i) holds. Then $\dim(\operatorname{Ker} F) = 0$, and so the above equation tells us that $\dim V = \dim(\operatorname{Im} F)$.
This means $V = \operatorname{Im} F$ or, in other words, $F$ is an onto mapping. Thus (i) implies (iii).
(b) Suppose (iii) holds. Then $V = \operatorname{Im} F$, and so $\dim V = \dim(\operatorname{Im} F)$. Therefore the above equation tells
us that $\dim(\operatorname{Ker} F) = 0$, and so $F$ is nonsingular. Therefore (iii) implies (i).
Accordingly, all four conditions are equivalent.
Remark: Suppose $A$ is a square $n \times n$ matrix over $K$. Then $A$ may be viewed as a linear operator on
$K^n$. Since $K^n$ has finite dimension, Theorem 5.14 holds for the square matrix $A$. This is why the terms
"nonsingular" and "invertible" are used interchangeably when applied to square matrices.
Example 5.13. Let $F$ be the linear operator on $\mathbf{R}^2$ defined by $F(x, y) = (2x + y, 3x + 2y)$.
(a) To show that $F$ is invertible, we need only show that $F$ is nonsingular. Set $F(x, y) = (0, 0)$ to obtain the
homogeneous system
$$2x + y = 0 \quad\text{and}\quad 3x + 2y = 0$$
Solve for $x$ and $y$ to get $x = 0$, $y = 0$. Hence $F$ is nonsingular and so invertible.
(b) To find a formula for $F^{-1}$, we set $F(x, y) = (s, t)$ and so $F^{-1}(s, t) = (x, y)$. We have
$$(2x + y, 3x + 2y) = (s, t) \quad\text{or}\quad \begin{aligned} 2x + y &= s \\ 3x + 2y &= t \end{aligned}$$
Solve for $x$ and $y$ in terms of $s$ and $t$ to obtain $x = 2s - t$, $y = -3s + 2t$. Thus
$$F^{-1}(s, t) = (2s - t, -3s + 2t) \quad\text{or}\quad F^{-1}(x, y) = (2x - y, -3x + 2y)$$
where we rewrite the formula for $F^{-1}$ using $x$ and $y$ instead of $s$ and $t$.
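The inverse formula of Example 5.13 can be verified by composing the two mappings in both orders. The sketch below checks a finite grid of integer points only (the grid is an arbitrary choice); the general statement follows from the algebra above.

```python
def F(v):
    """F(x, y) = (2x + y, 3x + 2y)."""
    x, y = v
    return (2 * x + y, 3 * x + 2 * y)

def F_inv(v):
    """The inverse formula: (s, t) -> (2s - t, -3s + 2t)."""
    s, t = v
    return (2 * s - t, -3 * s + 2 * t)

points = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
# Both compositions should act as the identity mapping on every sample point.
round_trip_ok = all(F_inv(F(p)) == p and F(F_inv(p)) == p for p in points)
```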
Solved Problems
MAPPINGS
5.1. State whether each diagram in Fig. 5-3 defines a mapping from $A = \{a, b, c\}$ into $B = \{x, y, z\}$.
(a) No. There is nothing assigned to the element $b \in A$.
(b) No. Two elements, $x$ and $z$, are assigned to $c \in A$.
(c) Yes.
5.2. Let $f: A \to B$ and $g: B \to C$ be defined by Fig. 5-4.
(a) Find the composition mapping $(g \circ f): A \to C$.
(b) Find the images of the mappings $f$, $g$, $g \circ f$.
Fig. 5-4
(a) Use the definition of the composition mapping to compute
$$(g \circ f)(a) = g(f(a)) = g(y) = t, \quad (g \circ f)(b) = g(f(b)) = g(x) = s$$
$$(g \circ f)(c) = g(f(c)) = g(y) = t$$
Observe that we arrive at the same answer if we "follow the arrows" in Fig. 5-4:
$$a \to y \to t, \quad b \to x \to s, \quad c \to y \to t$$
(b) By Fig. 5-4, the image values under the mapping $f$ are $x$ and $y$, and the image values under $g$ are $r$, $s$, $t$.
Hence
$$\operatorname{Im} f = \{x, y\} \quad\text{and}\quad \operatorname{Im} g = \{r, s, t\}$$
Also, by part (a), the image values under the composition mapping $g \circ f$ are $t$ and $s$; accordingly,
$\operatorname{Im} g \circ f = \{s, t\}$. Note that the images of $g$ and $g \circ f$ are different.
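Problem 5.2 can be replayed with dictionaries standing in for the arrow diagrams. The values of `f`, `g(x)`, and `g(y)` are read off from the solution; the value `g(z) = r` is an inference (it is the only assignment consistent with the stated image $\{r, s, t\}$ of $g$) and should be treated as an assumption, since Fig. 5-4 itself is not reproduced here.

```python
# Mappings of Problem 5.2 stored by their graphs.
f = {"a": "y", "b": "x", "c": "y"}
g = {"x": "s", "y": "t", "z": "r"}   # g(z) = r is inferred, not read from the figure

# Composition: (g o f)(a) = g(f(a)) for every a in the domain of f.
g_of_f = {a: g[f[a]] for a in f}

im_f = set(f.values())
im_g = set(g.values())
im_g_of_f = set(g_of_f.values())
```

The image of $g \circ f$ is strictly smaller than the image of $g$ because $z$ is never reached by $f$.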
5.3. Consider the mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (yz, x^2)$. Find:
(a) $F(2, 3, 4)$; (b) $F(5, -2, 7)$; (c) $F^{-1}(0, 0)$, that is, all $v \in \mathbf{R}^3$ such that $F(v) = 0$.
(a) Substitute in the formula for $F$ to get $F(2, 3, 4) = (3 \cdot 4, 2^2) = (12, 4)$.
(b) $F(5, -2, 7) = (-2 \cdot 7, 5^2) = (-14, 25)$.
(c) Set $F(v) = 0$, where $v = (x, y, z)$, and then solve for $x$, $y$, $z$:
$$F(x, y, z) = (yz, x^2) = (0, 0) \quad\text{or}\quad yz = 0,\ x^2 = 0$$
Thus $x = 0$ and either $y = 0$ or $z = 0$. In other words, $x = 0, y = 0$ or $x = 0, z = 0$. That is, the $z$-axis and the
$y$-axis.
5.4. Consider the mapping $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3y, 2x)$. Let $S$ be the unit circle in $\mathbf{R}^2$,
that is, the solution set of $x^2 + y^2 = 1$. (a) Describe $F(S)$. (b) Find $F^{-1}(S)$.
(a) Let $(a, b)$ be an element of $F(S)$. Then there exists $(x, y) \in S$ such that $F(x, y) = (a, b)$. Hence
$$(3y, 2x) = (a, b) \quad\text{or}\quad 3y = a,\ 2x = b \quad\text{or}\quad y = \frac{a}{3},\ x = \frac{b}{2}$$
Since $(x, y) \in S$, that is, $x^2 + y^2 = 1$, we have
$$\left(\frac{a}{3}\right)^2 + \left(\frac{b}{2}\right)^2 = 1 \quad\text{or}\quad \frac{a^2}{9} + \frac{b^2}{4} = 1$$
Thus $F(S)$ is an ellipse.
(b) Let $F(x, y) = (a, b)$, where $(a, b) \in S$. Then $(3y, 2x) = (a, b)$ or $3y = a$, $2x = b$. Since $(a, b) \in S$, we
have $a^2 + b^2 = 1$. Thus $(3y)^2 + (2x)^2 = 1$. Accordingly, $F^{-1}(S)$ is the ellipse $4x^2 + 9y^2 = 1$.
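Both parts of Problem 5.4 can be sanity-checked numerically by sampling points. The sketch below is only a floating-point check on twelve sample angles (an arbitrary choice), using `math.isclose` to absorb rounding error.

```python
import math

def F(p):
    """F(x, y) = (3y, 2x)."""
    x, y = p
    return (3 * y, 2 * x)

angles = [2 * math.pi * k / 12 for k in range(12)]

# (a) Images of points on the unit circle should satisfy a^2/9 + b^2/4 = 1.
circle = [(math.cos(t), math.sin(t)) for t in angles]
on_ellipse = all(math.isclose(a * a / 9 + b * b / 4, 1.0)
                 for a, b in map(F, circle))

# (b) Points on the ellipse 4x^2 + 9y^2 = 1 should be mapped onto the unit circle.
preimage_pts = [(math.cos(t) / 2, math.sin(t) / 3) for t in angles]
back_on_circle = all(math.isclose(a * a + b * b, 1.0)
                     for a, b in map(F, preimage_pts))
```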
5.5. Let the mappings f: A → B, g: B → C, h: C → D be defined by Fig. 5-5. Determine whether or
not each function is: (a) one-to-one; (b) onto; (c) invertible, i.e., has an inverse.
(a) The mapping f: A → B is one-to-one, since each element of A has a different image. The mapping
g: B → C is not one-to-one, since x and z both have the same image 4. The mapping h: C → D is
one-to-one.
(b) The mapping f: A → B is not onto, since z ∈ B is not the image of any element of A. The mapping
g: B → C is onto, since each element of C is the image of some element of B. The mapping h: C → D
is also onto.
(c) A mapping has an inverse if and only if it is one-to-one and onto. Hence only h has an inverse.
Fig. 5-5
5.6. Suppose f: A → B and g: B → C. Hence (g∘f): A → C exists. Prove:
(a) If f and g are one-to-one, then g∘f is one-to-one.
(b) If f and g are onto mappings, then g∘f is an onto mapping.
(c) If g∘f is one-to-one, then f is one-to-one.
(d) If g∘f is an onto mapping, then g is an onto mapping.
(a) Suppose (g∘f)(x) = (g∘f)(y). Then g(f(x)) = g(f(y)). Since g is one-to-one, f(x) = f(y). Since f is
one-to-one, x = y. We have proven that (g∘f)(x) = (g∘f)(y) implies x = y; hence g∘f is one-to-one.
(b) Suppose c ∈ C. Since g is onto, there exists b ∈ B for which g(b) = c. Since f is onto, there exists a ∈ A
for which f(a) = b. Thus (g∘f)(a) = g(f(a)) = g(b) = c. Hence g∘f is onto.
(c) Suppose f is not one-to-one. Then there exist distinct elements x, y ∈ A for which f(x) = f(y). Thus
(g∘f)(x) = g(f(x)) = g(f(y)) = (g∘f)(y). Hence g∘f is not one-to-one. Therefore, if g∘f is
one-to-one, then f must be one-to-one.
(d) If a ∈ A, then (g∘f)(a) = g(f(a)) ∈ g(B). Hence (g∘f)(A) ⊆ g(B). Suppose g is not onto. Then g(B) is
properly contained in C, and so (g∘f)(A) is properly contained in C; thus g∘f is not onto. Accordingly, if
g∘f is onto, then g must be onto.
5.7. Prove that f: A → B has an inverse if and only if f is one-to-one and onto.
Suppose f has an inverse, i.e., there exists a function f^{-1}: B → A for which f^{-1}∘f = 1_A and
f∘f^{-1} = 1_B. Since 1_A is one-to-one, f is one-to-one by Problem 5.6(c), and since 1_B is onto, f is onto by
Problem 5.6(d); that is, f is both one-to-one and onto.
Now suppose f is both one-to-one and onto. Then each b ∈ B is the image of a unique element in A, say
b*. Thus if f(a) = b, then a = b*; hence f(b*) = b. Now let g denote the mapping from B to A defined by
b ↦ b*. We have:
(i) (g∘f)(a) = g(f(a)) = g(b) = b* = a for every a ∈ A; hence g∘f = 1_A.
(ii) (f∘g)(b) = f(g(b)) = f(b*) = b for every b ∈ B; hence f∘g = 1_B.
Accordingly, f has an inverse. Its inverse is the mapping g.
5.8. Let f: R → R be defined by f(x) = 2x - 3. Now f is one-to-one and onto; hence f has an inverse
mapping f^{-1}. Find a formula for f^{-1}.
Let y be the image of x under the mapping f; that is, y = f(x) = 2x - 3. Hence x will be the image of y
under the inverse mapping f^{-1}. Thus solve for x in terms of y in the above equation to obtain x = (1/2)(y + 3).
Then the formula defining the inverse function is f^{-1}(y) = (1/2)(y + 3), or, using x instead of y,
f^{-1}(x) = (1/2)(x + 3).
LINEAR MAPPINGS
5.9. Suppose the mapping F: R^2 → R^2 is defined by F(x, y) = (x + y, x). Show that F is linear.
We need to show that F(v + w) = F(v) + F(w) and F(kv) = kF(v), where v and w are any elements of R^2
and k is any scalar. Let v = (a, b) and w = (a', b'). Then
v + w = (a + a', b + b') and kv = (ka, kb)
We have F(v) = (a + b, a) and F(w) = (a' + b', a'). Thus
F(v + w) = F(a + a', b + b') = (a + a' + b + b', a + a')
= (a + b, a) + (a' + b', a') = F(v) + F(w)
and
F(kv) = F(ka, kb) = (ka + kb, ka) = k(a + b, a) = kF(v)
Since v, w, k were arbitrary, F is linear.
5.10. Suppose F: R^3 → R^2 is defined by F(x, y, z) = (x + y + z, 2x - 3y + 4z). Show that F is linear.
We argue via matrices. Writing vectors as columns, the mapping F may be written in the form F(v) = Av,
where v = [x, y, z]^T and
    A = [ 1   1   1 ]
        [ 2  -3   4 ]
Then, using properties of matrices, we have
F(v + w) = A(v + w) = Av + Aw = F(v) + F(w)
and F(kv) = A(kv) = k(Av) = kF(v)
Thus F is linear.
5.11. Show that the following mappings are not linear:
(a) F: R^2 → R^2 defined by F(x, y) = (xy, x)
(b) F: R^2 → R^3 defined by F(x, y) = (x + 3, 2y, x + y)
(c) F: R^3 → R^2 defined by F(x, y, z) = (|x|, y + z)
(a) Let v = (1, 2) and w = (3, 4); then v + w = (4, 6). Also,
F(v) = (1(2), 1) = (2, 1) and F(w) = (3(4), 3) = (12, 3)
Hence
F(v + w) = (4(6), 4) = (24, 4) ≠ (14, 4) = F(v) + F(w)
(b) Since F(0, 0) = (3, 0, 0) ≠ (0, 0, 0), F cannot be linear.
(c) Let v = (1, 2, 3) and k = -3. Then kv = (-3, -6, -9). We have
F(v) = (1, 5) and kF(v) = -3(1, 5) = (-3, -15).
Thus
F(kv) = F(-3, -6, -9) = (3, -15) ≠ kF(v)
Accordingly, F is not linear.
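Each of the three counterexamples can be reproduced mechanically: evaluate both sides of the linearity condition at the chosen vectors and compare. The following Python sketch illustrates this; the names `Fa`, `Fb`, `Fc`, `add`, and `scale` are hypothetical helpers introduced for the example only.

```python
def add(v, w):
    """Componentwise sum of two tuples of equal length."""
    return tuple(a + b for a, b in zip(v, w))

def scale(k, v):
    """Scalar multiple k*v of a tuple."""
    return tuple(k * a for a in v)

def Fa(v):  # (a) F(x, y) = (xy, x)
    x, y = v
    return (x * y, x)

def Fb(v):  # (b) F(x, y) = (x + 3, 2y, x + y)
    x, y = v
    return (x + 3, 2 * y, x + y)

def Fc(v):  # (c) F(x, y, z) = (|x|, y + z)
    x, y, z = v
    return (abs(x), y + z)

v, w = (1, 2), (3, 4)
additivity_fails = Fa(add(v, w)) != add(Fa(v), Fa(w))    # (24, 4) vs (14, 4)
zero_not_fixed = Fb((0, 0)) != (0, 0, 0)                 # F(0, 0) = (3, 0, 0)
u, k = (1, 2, 3), -3
homogeneity_fails = Fc(scale(k, u)) != scale(k, Fc(u))   # (3, -15) vs (-3, -15)
```

A single failing instance is enough to show a map is not linear; by contrast, no finite number of passing instances would prove linearity.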
5.12. Let V be the vector space of n-square real matrices. Let M be an arbitrary but fixed matrix in V. Let
F: V → V be defined by F(A) = AM + MA, where A is any matrix in V. Show that F is linear.
For any matrices A and B in V and any scalar k, we have
F(A + B) = (A + B)M + M(A + B) = AM + BM + MA + MB
= (AM + MA) + (BM + MB) = F(A) + F(B)
and F(kA) = (kA)M + M(kA) = k(AM) + k(MA) = k(AM + MA) = kF(A)
Thus F is linear.
5.13. Prove Theorem 5.2: Let V and U be vector spaces over a field K. Let {v_1, v_2, ..., v_n} be a basis of V
and let u_1, u_2, ..., u_n be any vectors in U. Then there exists a unique linear mapping F: V → U
such that F(v_1) = u_1, F(v_2) = u_2, ..., F(v_n) = u_n.
There are three steps to the proof of the theorem: (1) Define the mapping F: V → U such that
F(v_i) = u_i, i = 1, ..., n. (2) Show that F is linear. (3) Show that F is unique.
Step 1. Let v ∈ V. Since {v_1, ..., v_n} is a basis of V, there exist unique scalars a_1, ..., a_n ∈ K for which
v = a_1v_1 + a_2v_2 + ... + a_nv_n. We define F: V → U by
F(v) = a_1u_1 + a_2u_2 + ... + a_nu_n
(Since the a_i are unique, the mapping F is well-defined.) Now, for i = 1, ..., n,
v_i = 0v_1 + ... + 1v_i + ... + 0v_n
Hence F(v_i) = 0u_1 + ... + 1u_i + ... + 0u_n = u_i
Thus the first step of the proof is complete.
Step 2. Suppose v = a_1v_1 + a_2v_2 + ... + a_nv_n and w = b_1v_1 + b_2v_2 + ... + b_nv_n. Then
v + w = (a_1 + b_1)v_1 + (a_2 + b_2)v_2 + ... + (a_n + b_n)v_n
and, for any k ∈ K, kv = ka_1v_1 + ka_2v_2 + ... + ka_nv_n. By definition of the mapping F,
F(v) = a_1u_1 + a_2u_2 + ... + a_nu_n and F(w) = b_1u_1 + b_2u_2 + ... + b_nu_n
Hence
F(v + w) = (a_1 + b_1)u_1 + (a_2 + b_2)u_2 + ... + (a_n + b_n)u_n
= (a_1u_1 + a_2u_2 + ... + a_nu_n) + (b_1u_1 + b_2u_2 + ... + b_nu_n)
= F(v) + F(w)
and F(kv) = ka_1u_1 + ka_2u_2 + ... + ka_nu_n = k(a_1u_1 + a_2u_2 + ... + a_nu_n) = kF(v)
Thus F is linear.
Step 3. Suppose G: V → U is linear and G(v_i) = u_i, i = 1, ..., n. Let
v = a_1v_1 + a_2v_2 + ... + a_nv_n
Then
G(v) = G(a_1v_1 + a_2v_2 + ... + a_nv_n) = a_1G(v_1) + a_2G(v_2) + ... + a_nG(v_n)
= a_1u_1 + a_2u_2 + ... + a_nu_n = F(v)
Since G(v) = F(v) for every v ∈ V, G = F. Thus F is unique and the theorem is proved.
5.14. Let F: R^2 → R^2 be the linear mapping for which F(1, 2) = (2, 3) and F(0, 1) = (1, 4). [Note that
{(1, 2), (0, 1)} is a basis of R^2, so such a linear map F exists and is unique by Theorem 5.2.] Find a
formula for F; that is, find F(a, b).
Write (a, b) as a linear combination of (1, 2) and (0, 1) using unknowns x and y,
(a, b) = x(1, 2) + y(0, 1) = (x, 2x + y), so a = x, b = 2x + y
Solve for x and y in terms of a and b to get x = a, y = -2a + b. Then
F(a, b) = xF(1, 2) + yF(0, 1) = a(2, 3) + (-2a + b)(1, 4) = (b, -5a + 4b)
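The construction used here (find the coordinates of a vector relative to the basis, then extend by linearity as in Theorem 5.2) translates directly into code. The sketch below is an illustration in Python under the assumption that exact rational arithmetic is wanted; the function names `coords_in_basis`, `F`, and `F_formula` are hypothetical and chosen for this example.

```python
from fractions import Fraction

def coords_in_basis(a, b):
    """Coordinates (x, y) of (a, b) relative to the basis {(1, 2), (0, 1)}.

    Solves a = x, b = 2x + y.
    """
    x = Fraction(a)
    y = Fraction(b) - 2 * x
    return x, y

def F(a, b):
    """Linear map with F(1, 2) = (2, 3) and F(0, 1) = (1, 4), extended by linearity."""
    x, y = coords_in_basis(a, b)
    return (x * 2 + y * 1, x * 3 + y * 4)

def F_formula(a, b):
    """Closed form obtained in the solution: F(a, b) = (b, -5a + 4b)."""
    return (b, -5 * a + 4 * b)
```

Comparing `F` with `F_formula` on a grid of integer points gives a quick consistency check between the definition by basis images and the closed formula.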
5.15. Suppose a linear mapping F: V → U is one-to-one and onto. Show that the inverse mapping
F^{-1}: U → V is also linear.
Suppose u, u' ∈ U. Since F is one-to-one and onto, there exist unique vectors v, v' ∈ V for which
F(v) = u and F(v') = u'. Since F is linear, we also have
F(v + v') = F(v) + F(v') = u + u' and F(kv) = kF(v) = ku
By definition of the inverse mapping,
F^{-1}(u) = v, F^{-1}(u') = v', F^{-1}(u + u') = v + v', F^{-1}(ku) = kv.
Then
F^{-1}(u + u') = v + v' = F^{-1}(u) + F^{-1}(u') and F^{-1}(ku) = kv = kF^{-1}(u)
Thus F^{-1} is linear.
KERNEL AND IMAGE OF LINEAR MAPPINGS
5.16. Let F: R^4 → R^3 be the linear mapping defined by
F(x, y, z, t) = (x - y + z + t, x + 2z - t, x + y + 3z - 3t)
Find a basis and the dimension of: (a) the image of F, (b) the kernel of F.
(a) Find the images of the usual basis of R^4:
F(1, 0, 0, 0) = (1, 1, 1), F(0, 1, 0, 0) = (-1, 0, 1)
F(0, 0, 1, 0) = (1, 2, 3), F(0, 0, 0, 1) = (1, -1, -3)
By Proposition 5.4, the image vectors span Im F. Hence form the matrix whose rows are these image
vectors, and row reduce to echelon form:
    [  1   1   1 ]     [ 1   1   1 ]     [ 1  1  1 ]
    [ -1   0   1 ]  ~  [ 0   1   2 ]  ~  [ 0  1  2 ]
    [  1   2   3 ]     [ 0   1   2 ]     [ 0  0  0 ]
    [  1  -1  -3 ]     [ 0  -2  -4 ]     [ 0  0  0 ]
Thus (1, 1, 1) and (0, 1, 2) form a basis for Im F; hence dim(Im F) = 2.
(b) Set F(v) = 0, where v = (x, y, z, t); that is, set
F(x, y, z, t) = (x - y + z + t, x + 2z - t, x + y + 3z - 3t) = (0, 0, 0)
Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker F:
    x - y +  z +  t = 0        x - y +  z +  t = 0        x - y + z +  t = 0
    x     + 2z -  t = 0   or       y +  z - 2t = 0   or       y + z - 2t = 0
    x + y + 3z - 3t = 0           2y + 2z - 4t = 0
The free variables are z and t. Hence dim(Ker F) = 2.
(i) Set z = -1, t = 0 to obtain the solution (2, 1, -1, 0).
(ii) Set z = 0, t = 1 to obtain the solution (1, 2, 0, 1).
Thus (2, 1, -1, 0) and (1, 2, 0, 1) form a basis of Ker F.
[As expected, dim(Im F) + dim(Ker F) = 2 + 2 = 4 = dim R^4, the domain of F.]
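The rank and nullity found in Problem 5.16 can be confirmed with a short Gaussian elimination over the rationals. The Python sketch below is only an illustration: the `rank` helper is a hypothetical function written for this example (it is not taken from any library), and `A` is the matrix of F relative to the usual bases.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) by Gaussian elimination over Q."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0  # number of pivots found so far
    for c in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue  # no pivot in this column
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def F(x, y, z, t):
    return (x - y + z + t, x + 2 * z - t, x + y + 3 * z - 3 * t)

# Matrix of F: its columns are the images of the usual basis vectors of R^4.
A = [[1, -1, 1, 1],
     [1, 0, 2, -1],
     [1, 1, 3, -3]]

dim_image = rank(A)
dim_kernel = 4 - dim_image          # Theorem 5.6: dim V = nullity + rank
kernel_basis = [(2, 1, -1, 0), (1, 2, 0, 1)]
```

Checking that each proposed kernel vector is sent to zero, and that the two vectors have rank 2 as rows of a matrix, confirms that they form a basis of the two-dimensional kernel.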
5.17. Let G: R^3 → R^3 be the linear mapping defined by
G(x, y, z) = (x + 2y - z, y + z, x + y - 2z)
Find a basis and the dimension of: (a) the image of G, (b) the kernel of G.
(a) Find the images of the usual basis of R^3:
G(1, 0, 0) = (1, 0, 1), G(0, 1, 0) = (2, 1, 1), G(0, 0, 1) = (-1, 1, -2)
By Proposition 5.4, the image vectors span Im G. Hence form the matrix M whose rows are these image
vectors, and row reduce to echelon form:
        [  1  0   1 ]     [ 1  0   1 ]     [ 1  0   1 ]
    M = [  2  1   1 ]  ~  [ 0  1  -1 ]  ~  [ 0  1  -1 ]
        [ -1  1  -2 ]     [ 0  1  -1 ]     [ 0  0   0 ]
Thus (1, 0, 1) and (0, 1, -1) form a basis for Im G; hence dim(Im G) = 2.
(b) Set G(v) = 0, where v = (x, y, z); that is,
G(x, y, z) = (x + 2y - z, y + z, x + y - 2z) = (0, 0, 0)
Set corresponding entries equal to each other to form the following homogeneous system whose solution
space is Ker G:
    x + 2y -  z = 0        x + 2y - z = 0        x + 2y - z = 0
         y +  z = 0   or        y + z = 0   or        y + z = 0
    x +  y - 2z = 0            -y - z = 0
The only free variable is z; hence dim(Ker G) = 1. Set z = 1; then y = -1 and x = 3. Thus (3, -1, 1)
forms a basis of Ker G. [As expected, dim(Im G) + dim(Ker G) = 2 + 1 = 3 = dim R^3, the domain
of G.]
5.18. Consider the matrix mapping A: R^4 → R^3, where
        [ 1  2   3   1 ]
    A = [ 1  3   5  -2 ]
        [ 3  8  13  -3 ]
Find a basis and the dimension of: (a) the image of A, (b) the kernel of A.
(a) The column space of A is equal to Im A. Now reduce A^T to echelon form:
          [ 1   1   3 ]     [ 1   1   3 ]     [ 1  1  3 ]
    A^T = [ 2   3   8 ]  ~  [ 0   1   2 ]  ~  [ 0  1  2 ]
          [ 3   5  13 ]     [ 0   2   4 ]     [ 0  0  0 ]
          [ 1  -2  -3 ]     [ 0  -3  -6 ]     [ 0  0  0 ]
Thus {(1, 1, 3), (0, 1, 2)} is a basis of Im A, and dim(Im A) = 2.
(b) Here Ker A is the solution space of the homogeneous system AX = 0, where X = (x, y, z, t)^T. Thus
reduce the matrix A of coefficients to echelon form:
    [ 1  2   3   1 ]     [ 1  2  3   1 ]     [ 1  2  3   1 ]          x + 2y + 3z +  t = 0
    [ 1  3   5  -2 ]  ~  [ 0  1  2  -3 ]  ~  [ 0  1  2  -3 ]    or        y + 2z - 3t = 0
    [ 3  8  13  -3 ]     [ 0  2  4  -6 ]     [ 0  0  0   0 ]
The free variables are z and t. Thus dim(Ker A) = 2.
(i) Set z = 1, t = 0 to get the solution (1, -2, 1, 0).
(ii) Set z = 0, t = 1 to get the solution (-7, 3, 0, 1).
Thus (1, -2, 1, 0) and (-7, 3, 0, 1) form a basis for Ker A.
5.19. Find a linear map F: R^3 → R^4 whose image is spanned by (1, 2, 0, -4) and (2, 0, -1, -3).
Form a 4 × 3 matrix whose columns consist only of the given vectors, say
        [  1   2   2 ]
    A = [  2   0   0 ]
        [  0  -1  -1 ]
        [ -4  -3  -3 ]
Recall that A determines a linear map A: R^3 → R^4 whose image is spanned by the columns of A. Thus A
satisfies the required condition.
5.20. Suppose f: V → U is linear with kernel W, and that f(v) = u. Show that the "coset"
v + W = {v + w : w ∈ W} is the preimage of u; that is, f^{-1}(u) = v + W.
We must prove that (i) f^{-1}(u) ⊆ v + W and (ii) v + W ⊆ f^{-1}(u).
We first prove (i). Suppose v' ∈ f^{-1}(u). Then f(v') = u, and so
f(v' - v) = f(v') - f(v) = u - u = 0
that is, v' - v ∈ W. Thus v' = v + (v' - v) ∈ v + W, and hence f^{-1}(u) ⊆ v + W.
Now we prove (ii). Suppose v' ∈ v + W. Then v' = v + w, where w ∈ W. Since W is the kernel of f, we
have f(w) = 0. Accordingly,
f(v') = f(v + w) = f(v) + f(w) = f(v) + 0 = f(v) = u
Thus v' ∈ f^{-1}(u), and so v + W ⊆ f^{-1}(u).
Both inclusions imply f^{-1}(u) = v + W.
5.21. Suppose F: V → U and G: U → W are linear. Prove:
(a) rank(G∘F) ≤ rank(G), (b) rank(G∘F) ≤ rank(F).
(a) Since F(V) ⊆ U, we also have G(F(V)) ⊆ G(U), and so dim[G(F(V))] ≤ dim[G(U)]. Then
rank(G∘F) = dim[(G∘F)(V)] = dim[G(F(V))] ≤ dim[G(U)] = rank(G).
(b) We have dim[G(F(V))] ≤ dim[F(V)], since the image of a space under a linear map has dimension
at most that of the space. Hence
rank(G∘F) = dim[(G∘F)(V)] = dim[G(F(V))] ≤ dim[F(V)] = rank(F)
5.22. Prove Theorem 5.3: Let F: V → U be linear. Then:
(a) Im F is a subspace of U, (b) Ker F is a subspace of V.
(a) Since F(0) = 0, we have 0 ∈ Im F. Now suppose u, u' ∈ Im F and a, b ∈ K. Since u and u' belong to
the image of F, there exist vectors v, v' ∈ V such that F(v) = u and F(v') = u'. Then
F(av + bv') = aF(v) + bF(v') = au + bu' ∈ Im F
Thus the image of F is a subspace of U.
(b) Since F(0) = 0, we have 0 ∈ Ker F. Now suppose v, w ∈ Ker F and a, b ∈ K. Since v and w belong to
the kernel of F, F(v) = 0 and F(w) = 0. Thus
F(av + bw) = aF(v) + bF(w) = a0 + b0 = 0 + 0 = 0, and so av + bw ∈ Ker F
Thus the kernel of F is a subspace of V.
5.23. Prove Theorem 5.6: Suppose V has finite dimension and F: V → U is linear. Then
dim V = dim(Ker F) + dim(Im F) = nullity(F) + rank(F)
Suppose dim(Ker F) = r and {w_1, ..., w_r} is a basis of Ker F, and suppose dim(Im F) = s and
{u_1, ..., u_s} is a basis of Im F. (By Proposition 5.4, Im F has finite dimension.) Since u_j ∈ Im F, there
exist vectors v_1, ..., v_s in V such that F(v_1) = u_1, ..., F(v_s) = u_s. We claim that the set
B = {w_1, ..., w_r, v_1, ..., v_s}
is a basis of V, that is, (i) B spans V, and (ii) B is linearly independent. Once we prove (i) and (ii), then
dim V = r + s = dim(Ker F) + dim(Im F).
(i) B spans V. Let v ∈ V. Then F(v) ∈ Im F. Since the u_j span Im F, there exist scalars a_1, ..., a_s such that
F(v) = a_1u_1 + ... + a_su_s. Set v' = a_1v_1 + ... + a_sv_s - v. Then
F(v') = F(a_1v_1 + ... + a_sv_s - v) = a_1F(v_1) + ... + a_sF(v_s) - F(v)
= a_1u_1 + ... + a_su_s - F(v) = 0
Thus v' ∈ Ker F. Since the w_i span Ker F, there exist scalars b_1, ..., b_r such that
v' = b_1w_1 + ... + b_rw_r = a_1v_1 + ... + a_sv_s - v
Accordingly,
v = a_1v_1 + ... + a_sv_s - b_1w_1 - ... - b_rw_r
Thus B spans V.
(ii) B is linearly independent. Suppose
x_1w_1 + ... + x_rw_r + y_1v_1 + ... + y_sv_s = 0    (1)
where x_i, y_j ∈ K. Then
0 = F(0) = F(x_1w_1 + ... + x_rw_r + y_1v_1 + ... + y_sv_s)
= x_1F(w_1) + ... + x_rF(w_r) + y_1F(v_1) + ... + y_sF(v_s)    (2)
But F(w_i) = 0, since w_i ∈ Ker F, and F(v_j) = u_j. Substituting into (2), we obtain
y_1u_1 + ... + y_su_s = 0. Since the u_j are linearly independent, each y_j = 0. Substitution into (1) gives
x_1w_1 + ... + x_rw_r = 0. Since the w_i are linearly independent, each x_i = 0. Thus B is linearly
independent.
SINGULAR AND NONSINGULAR LINEAR MAPS, ISOMORPHISMS
5.24. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero
vector v whose image is 0.
(a) F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y).
(b) G: R^2 → R^2 defined by G(x, y) = (2x - 4y, 3x - 6y).
(a) Find Ker F by setting F(v) = 0, where v = (x, y):
    (x - y, x - 2y) = (0, 0)   or   x -  y = 0   or   x - y = 0
                                    x - 2y = 0          -y = 0
The only solution is x = 0, y = 0. Hence F is nonsingular.
(b) Set G(x, y) = (0, 0) to find Ker G:
    (2x - 4y, 3x - 6y) = (0, 0)   or   2x - 4y = 0   or   x - 2y = 0
                                       3x - 6y = 0
The system has nonzero solutions, since y is a free variable. Hence G is singular. Let y = 1 to obtain
the solution v = (2, 1), which is a nonzero vector such that G(v) = 0.
5.25. The linear map F: R^2 → R^2 defined by F(x, y) = (x - y, x - 2y) is nonsingular by
Problem 5.24. Find a formula for F^{-1}.
Set F(x, y) = (a, b), so that F^{-1}(a, b) = (x, y). We have
    (x - y, x - 2y) = (a, b)   or   x -  y = a   or   x - y = a
                                    x - 2y = b            y = a - b
Solve for x and y in terms of a and b to get x = 2a - b, y = a - b. Thus
F^{-1}(a, b) = (2a - b, a - b) or F^{-1}(x, y) = (2x - y, x - y)
(The second equation is obtained by replacing a and b by x and y, respectively.)
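A formula for an inverse map is easy to sanity-check by composing it with the original map in both orders. The Python sketch below does this on a small grid of integer points; the name `F_inv` and the choice of grid are assumptions made for this illustration.

```python
def F(x, y):
    """F(x, y) = (x - y, x - 2y), the map of Problems 5.24(a) and 5.25."""
    return (x - y, x - 2 * y)

def F_inv(a, b):
    """Candidate inverse from the solution: F^{-1}(a, b) = (2a - b, a - b)."""
    return (2 * a - b, a - b)

grid = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]

# Both compositions should be the identity on every sample point.
left_ok = all(F_inv(*F(x, y)) == (x, y) for x, y in grid)
right_ok = all(F(*F_inv(a, b)) == (a, b) for a, b in grid)
```

Since both maps are linear, agreement on a basis already forces agreement everywhere; the grid is simply a convenient superset of a basis.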
5.26. Let G: R^2 → R^3 be defined by G(x, y) = (x + y, x - 2y, 3x + y).
(a) Show that G is nonsingular. (b) Find a formula for G^{-1}.
(a) Set G(x, y) = (0, 0, 0) to find Ker G. We have
(x + y, x - 2y, 3x + y) = (0, 0, 0) or x + y = 0, x - 2y = 0, 3x + y = 0
The only solution is x = 0, y = 0; hence G is nonsingular.
(b) Although G is nonsingular, it is not invertible, since R^2 and R^3 have different dimensions. (Thus
Theorem 5.9 does not apply.) Accordingly, G^{-1} does not exist.
5.27. Suppose that F: V → U is linear and that V is of finite dimension. Show that V and the image of F
have the same dimension if and only if F is nonsingular. Determine all nonsingular linear mappings
T: R^4 → R^3.
By Theorem 5.6, dim V = dim(Im F) + dim(Ker F). Hence V and Im F have the same dimension if and
only if dim(Ker F) = 0 or Ker F = {0}, i.e., if and only if F is nonsingular.
Since dim R^3 is less than dim R^4, we have that dim(Im T) is less than the dimension of the domain R^4 of
T. Accordingly, no linear mapping T: R^4 → R^3 can be nonsingular.
5.28. Prove Theorem 5.7: Let F: V → U be a nonsingular linear mapping. Then the image of any
linearly independent set is linearly independent.
Suppose v_1, v_2, ..., v_n are linearly independent vectors in V. We claim that F(v_1), F(v_2), ..., F(v_n) are
also linearly independent. Suppose a_1F(v_1) + a_2F(v_2) + ... + a_nF(v_n) = 0, where a_i ∈ K. Since F is linear,
F(a_1v_1 + a_2v_2 + ... + a_nv_n) = 0. Hence
a_1v_1 + a_2v_2 + ... + a_nv_n ∈ Ker F
But F is nonsingular, i.e., Ker F = {0}. Hence a_1v_1 + a_2v_2 + ... + a_nv_n = 0. Since the v_i are linearly
independent, all the a_i are 0. Accordingly, the F(v_i) are linearly independent. Thus the theorem is proved.
5.29. Prove Theorem 5.9: Suppose V has finite dimension and dim V = dim U. Suppose F: V → U is
linear. Then F is an isomorphism if and only if F is nonsingular.
If F is an isomorphism, then only 0 maps to 0; hence F is nonsingular. Conversely, suppose F is
nonsingular. Then F is one-to-one and dim(Ker F) = 0. By Theorem 5.6, dim V = dim(Ker F) + dim(Im F). Thus
dim U = dim V = dim(Im F)
Since U has finite dimension, Im F = U. This means F maps V onto U. Thus F is one-to-one and onto; that
is, F is an isomorphism.
OPERATIONS WITH LINEAR MAPS
5.30. Define F: R^3 → R^2 and G: R^3 → R^2 by F(x, y, z) = (2x, y + z) and G(x, y, z) = (x - z, y).
Find formulas defining the maps: (a) F + G, (b) 3F, (c) 2F - 5G.
(a) (F + G)(x, y, z) = F(x, y, z) + G(x, y, z) = (2x, y + z) + (x - z, y) = (3x - z, 2y + z)
(b) (3F)(x, y, z) = 3F(x, y, z) = 3(2x, y + z) = (6x, 3y + 3z)
(c) (2F - 5G)(x, y, z) = 2F(x, y, z) - 5G(x, y, z) = 2(2x, y + z) - 5(x - z, y)
= (4x, 2y + 2z) + (-5x + 5z, -5y) = (-x + 5z, -3y + 2z)
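Sums and scalar multiples of maps are defined pointwise, which is exactly how one would build them in code. The following Python sketch illustrates the definition; the helper `lin_comb` is a hypothetical name introduced here, and it simply returns the map aP + bQ for two maps with the same domain and codomain.

```python
def F(x, y, z):
    return (2 * x, y + z)

def G(x, y, z):
    return (x - z, y)

def lin_comb(a, P, b, Q):
    """Return the map aP + bQ, defined pointwise by (aP + bQ)(v) = aP(v) + bQ(v)."""
    def H(*v):
        return tuple(a * p + b * q for p, q in zip(P(*v), Q(*v)))
    return H

F_plus_G = lin_comb(1, F, 1, G)
three_F = lin_comb(3, F, 0, G)
two_F_minus_five_G = lin_comb(2, F, -5, G)
```

For instance, evaluating at (1, 2, 3) gives F(1, 2, 3) = (2, 5) and G(1, 2, 3) = (-2, 2), so the three maps return (0, 7), (6, 15), and (14, 0), in agreement with the formulas (3x - z, 2y + z), (6x, 3y + 3z), and (-x + 5z, -3y + 2z).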
5.31. Let F: R^3 → R^2 and G: R^2 → R^2 be defined by F(x, y, z) = (2x, y + z) and G(x, y) = (y, x).
Derive formulas defining the mappings: (a) G∘F, (b) F∘G.
(a) (G∘F)(x, y, z) = G(F(x, y, z)) = G(2x, y + z) = (y + z, 2x)
(b) The mapping F∘G is not defined, since the image of G is not contained in the domain of F.
5.32. Prove: (a) The zero mapping 0, defined by 0(v) = 0 ∈ U for every v ∈ V, is the zero element of
Hom(V, U). (b) The negative of F ∈ Hom(V, U) is the mapping (-1)F, i.e., -F = (-1)F.
Let F ∈ Hom(V, U). Then, for every v ∈ V:
(a) (F + 0)(v) = F(v) + 0(v) = F(v) + 0 = F(v)
Since (F + 0)(v) = F(v) for every v ∈ V, we have F + 0 = F. Similarly, 0 + F = F.
(b) (F + (-1)F)(v) = F(v) + (-1)F(v) = F(v) - F(v) = 0 = 0(v)
Thus F + (-1)F = 0. Similarly, (-1)F + F = 0. Hence -F = (-1)F.
5.33. Suppose F_1, F_2, ..., F_n are linear maps from V into U. Show that, for any scalars a_1, a_2, ..., a_n,
and for any v ∈ V,
(a_1F_1 + a_2F_2 + ... + a_nF_n)(v) = a_1F_1(v) + a_2F_2(v) + ... + a_nF_n(v)
The mapping a_1F_1 is defined by (a_1F_1)(v) = a_1F_1(v). Hence the theorem holds for n = 1. Accordingly, by
induction,
(a_1F_1 + a_2F_2 + ... + a_nF_n)(v) = (a_1F_1)(v) + (a_2F_2 + ... + a_nF_n)(v)
= a_1F_1(v) + a_2F_2(v) + ... + a_nF_n(v)
5.34. Consider linear mappings F: R^3 → R^2, G: R^3 → R^2, H: R^3 → R^2 defined by
F(x, y, z) = (x + y + z, x + y), G(x, y, z) = (2x + z, x + y), H(x, y, z) = (2y, x)
Show that F, G, H are linearly independent [as elements of Hom(R^3, R^2)].
Suppose, for scalars a, b, c ∈ K,
aF + bG + cH = 0    (1)
(Here 0 is the zero mapping.) For e_1 = (1, 0, 0) ∈ R^3, we have 0(e_1) = (0, 0) and
(aF + bG + cH)(e_1) = aF(1, 0, 0) + bG(1, 0, 0) + cH(1, 0, 0)
= a(1, 1) + b(2, 1) + c(0, 1) = (a + 2b, a + b + c)
Thus by (1), (a + 2b, a + b + c) = (0, 0) and so
a + 2b = 0 and a + b + c = 0    (2)
Similarly, for e_2 = (0, 1, 0) ∈ R^3, we have 0(e_2) = (0, 0) and
(aF + bG + cH)(e_2) = aF(0, 1, 0) + bG(0, 1, 0) + cH(0, 1, 0)
= a(1, 1) + b(0, 1) + c(2, 0) = (a + 2c, a + b)
Thus a + 2c = 0 and a + b = 0    (3)
Using (2) and (3), we obtain
a = 0, b = 0, c = 0    (4)
Since (1) implies (4), the mappings F, G, H are linearly independent.
5.35. Let k be a nonzero scalar. Show that a linear map T is singular if and only if kT is singular. Hence T
is singular if and only if -T is singular.
Suppose T is singular. Then T(v) = 0 for some vector v ≠ 0. Hence
(kT)(v) = kT(v) = k0 = 0
and so kT is singular.
Now suppose kT is singular. Then (kT)(w) = 0 for some vector w ≠ 0. Hence
T(kw) = kT(w) = (kT)(w) = 0
But k ≠ 0 and w ≠ 0 implies kw ≠ 0. Thus T is also singular.
5.36. Find the dimension d of:
(a) Hom(R^3, R^4), (b) Hom(R^5, R^3), (c) Hom(P_3(t), R^2), (d) Hom(M_{2,3}, R^4).
Use dim[Hom(V, U)] = mn, where dim V = m and dim U = n.
(a) d = 3(4) = 12. (c) Since dim P_3(t) = 4, d = 4(2) = 8.
(b) d = 5(3) = 15. (d) Since dim M_{2,3} = 6, d = 6(4) = 24.
5.37. Prove Theorem 5.11: Suppose dim V = m and dim U = n. Then dim[Hom(V, U)] = mn.
Suppose {v_1, ..., v_m} is a basis of V and {u_1, ..., u_n} is a basis of U. By Theorem 5.2, a linear mapping
in Hom(V, U) is uniquely determined by arbitrarily assigning elements of U to the basis elements v_i of V. We
define
F_ij ∈ Hom(V, U), i = 1, ..., m, j = 1, ..., n
to be the linear mapping for which F_ij(v_i) = u_j, and F_ij(v_k) = 0 for k ≠ i. That is, F_ij maps v_i into u_j and the
other v's into 0. Observe that {F_ij} contains exactly mn elements; hence the theorem is proved if we show that
it is a basis of Hom(V, U).
Proof that {F_ij} generates Hom(V, U). Consider an arbitrary function F ∈ Hom(V, U). Suppose
F(v_1) = w_1, F(v_2) = w_2, ..., F(v_m) = w_m. Since w_k ∈ U, it is a linear combination of the u's; say,
w_k = a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n, k = 1, ..., m, a_ij ∈ K    (1)
Consider the linear mapping G = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij. Since G is a linear combination of the F_ij, the proof that
{F_ij} generates Hom(V, U) is complete if we show that F = G.
We now compute G(v_k), k = 1, ..., m. Since F_ij(v_k) = 0 for k ≠ i and F_kj(v_k) = u_j,
G(v_k) = Σ_{i=1}^{m} Σ_{j=1}^{n} a_ij F_ij(v_k) = Σ_{j=1}^{n} a_kj F_kj(v_k) = Σ_{j=1}^{n} a_kj u_j
= a_k1 u_1 + a_k2 u_2 + ... + a_kn u_n
Thus, by (1), G(v_k) = w_k for each k. But F(v_k) = w_k for each k. Accordingly, by Theorem 5.2, F = G; hence
{F_ij} generates Hom(V, U).
Proof that {F_ij} is linearly independent. Suppose, for scalars c_ij ∈ K,
Σ_{i=1}^{m} Σ_{j=1}^{n} c_ij F_ij = 0
For v_k, k = 1, ..., m,
0 = 0(v_k) = Σ_{i=1}^{m} Σ_{j=1}^{n} c_ij F_ij(v_k) = Σ_{j=1}^{n} c_kj F_kj(v_k) = Σ_{j=1}^{n} c_kj u_j
= c_k1 u_1 + c_k2 u_2 + ... + c_kn u_n
But the u_j are linearly independent; hence, for k = 1, ..., m, we have c_k1 = 0, c_k2 = 0, ..., c_kn = 0. In other
words, all the c_ij = 0, and so {F_ij} is linearly independent.
5.38. Prove Theorem 5.12: (i) G∘(F + F') = G∘F + G∘F'. (ii) (G + G')∘F = G∘F + G'∘F.
(iii) k(G∘F) = (kG)∘F = G∘(kF).
(i) For every v ∈ V,
(G∘(F + F'))(v) = G((F + F')(v)) = G(F(v) + F'(v))
= G(F(v)) + G(F'(v)) = (G∘F)(v) + (G∘F')(v) = (G∘F + G∘F')(v)
Thus G∘(F + F') = G∘F + G∘F'.
(ii) For every v ∈ V,
((G + G')∘F)(v) = (G + G')(F(v)) = G(F(v)) + G'(F(v))
= (G∘F)(v) + (G'∘F)(v) = (G∘F + G'∘F)(v)
Thus (G + G')∘F = G∘F + G'∘F.
(iii) For every v ∈ V,
(k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = (kG)(F(v)) = ((kG)∘F)(v)
and (k(G∘F))(v) = k(G∘F)(v) = k(G(F(v))) = G(kF(v)) = G((kF)(v)) = (G∘(kF))(v)
Accordingly, k(G∘F) = (kG)∘F = G∘(kF). (We emphasize that two mappings are shown to be equal
by showing that each of them assigns the same image to each point in the domain.)
ALGEBRA OF LINEAR MAPS
5.39. Let F and G be the linear operators on R^2 defined by F(x, y) = (y, x) and G(x, y) = (0, x). Find
formulas defining the following operators: (a) F + G, (b) 2F - 3G, (c) FG, (d) GF, (e) F^2, (f) G^2.
(a) (F + G)(x, y) = F(x, y) + G(x, y) = (y, x) + (0, x) = (y, 2x).
(b) (2F - 3G)(x, y) = 2F(x, y) - 3G(x, y) = 2(y, x) - 3(0, x) = (2y, -x).
(c) (FG)(x, y) = F(G(x, y)) = F(0, x) = (x, 0).
(d) (GF)(x, y) = G(F(x, y)) = G(y, x) = (0, y).
(e) F^2(x, y) = F(F(x, y)) = F(y, x) = (x, y). (Note that F^2 = I, the identity mapping.)
(f) G^2(x, y) = G(G(x, y)) = G(0, x) = (0, 0). (Note that G^2 = 0, the zero mapping.)
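The operator products in Problem 5.39 correspond to matrix products once F and G are written as matrices relative to the usual basis. The Python sketch below is an illustration of that correspondence, not part of the original solution; `matmul` is a small hypothetical helper for 2 × 2 matrices stored as nested lists.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# Matrices relative to the usual basis, acting on column vectors (x, y)^T.
F = [[0, 1],
     [1, 0]]   # F(x, y) = (y, x)
G = [[0, 0],
     [1, 0]]   # G(x, y) = (0, x)

FG = matmul(F, G)   # the operator F∘G, i.e. apply G first, then F
GF = matmul(G, F)   # the operator G∘F
F2 = matmul(F, F)
G2 = matmul(G, G)
```

Here `FG` comes out as the matrix of (x, y) ↦ (x, 0) and `GF` as the matrix of (x, y) ↦ (0, y), so the two operators do not commute, while `F2` is the identity matrix and `G2` is the zero matrix.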
5.40. Consider the linear operator T on R^3 defined by T(x, y, z) = (2x, 4x - y, 2x + 3y - z).
(a) Show that T is invertible. Find formulas for: (b) T^{-1}, (c) T^2, (d) T^{-2}.
(a) Let W = Ker T. We need only show that T is nonsingular, i.e., that W = {0}. Set T(x, y, z) = (0, 0, 0),
which yields
T(x, y, z) = (2x, 4x - y, 2x + 3y - z) = (0, 0, 0)
Thus W is the solution space of the homogeneous system
2x = 0, 4x - y = 0, 2x + 3y - z = 0
which has only the trivial solution (0, 0, 0). Thus W = {0}. Hence T is nonsingular, and so T is
invertible.
(b) Set T(x, y, z) = (r, s, t) [and so T^{-1}(r, s, t) = (x, y, z)]. We have
(2x, 4x - y, 2x + 3y - z) = (r, s, t) or 2x = r, 4x - y = s, 2x + 3y - z = t
Solve for x, y, z in terms of r, s, t to get x = (1/2)r, y = 2r - s, z = 7r - 3s - t. Thus
T^{-1}(r, s, t) = ((1/2)r, 2r - s, 7r - 3s - t) or T^{-1}(x, y, z) = ((1/2)x, 2x - y, 7x - 3y - z)
(c) Apply T twice to get
T^2(x, y, z) = T(2x, 4x - y, 2x + 3y - z)
= [4x, 4(2x) - (4x - y), 2(2x) + 3(4x - y) - (2x + 3y - z)]
= (4x, 4x + y, 14x - 6y + z)
(d) Apply T^{-1} twice to get
T^{-2}(x, y, z) = T^{-1}((1/2)x, 2x - y, 7x - 3y - z)
= [(1/4)x, 2((1/2)x) - (2x - y), 7((1/2)x) - 3(2x - y) - (7x - 3y - z)]
= ((1/4)x, -x + y, -(19/2)x + 6y + z)
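The formulas for T^{-1}, T^2, and T^{-2} can be cross-checked against one another by composing the maps on sample vectors with exact rational arithmetic. The Python sketch below is an illustration only; the function names `T_inv`, `T_sq`, and `T_inv_sq` and the sample set are assumptions introduced for this example.

```python
from fractions import Fraction as Fr

def T(x, y, z):
    return (2 * x, 4 * x - y, 2 * x + 3 * y - z)

def T_inv(x, y, z):
    # Formula from part (b): T^{-1}(x, y, z) = (x/2, 2x - y, 7x - 3y - z)
    return (x / 2, 2 * x - y, 7 * x - 3 * y - z)

def T_sq(x, y, z):
    # Formula from part (c): T^2(x, y, z) = (4x, 4x + y, 14x - 6y + z)
    return (4 * x, 4 * x + y, 14 * x - 6 * y + z)

def T_inv_sq(x, y, z):
    # Formula from part (d): T^{-2}(x, y, z) = (x/4, -x + y, -(19/2)x + 6y + z)
    return (x / 4, -x + y, Fr(-19, 2) * x + 6 * y + z)

# Exact rational sample vectors, so that x/2 and x/4 stay exact.
samples = [(Fr(a), Fr(b), Fr(c)) for a in (-2, 1, 3) for b in (0, 5) for c in (-1, 4)]
```

Checks such as `T_inv(*T(*v)) == v` and `T_inv_sq(*T_sq(*v)) == v` on every sample give reasonable confidence that no sign or coefficient was lost in the hand computation.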
5.41. Let V be of finite dimension and let T be a linear operator on V for which TR = I, for some
operator R on V. (We call R a right inverse of T.)
(a) Show that T is invertible. (b) Show that R = T^{-1}.
(c) Give an example showing that the above need not hold if V is of infinite dimension.
(a) Let dim V = n. By Theorem 5.14, T is invertible if and only if T is onto; hence T is invertible if and only
if rank(T) = n. We have n = rank(I) = rank(TR) ≤ rank(T) ≤ n. Hence rank(T) = n and T is invertible.
(b) TT^{-1} = T^{-1}T = I. Then R = IR = (T^{-1}T)R = T^{-1}(TR) = T^{-1}I = T^{-1}.
(c) Let V be the space of polynomials in t over K; say, p(t) = a_0 + a_1t + a_2t^2 + ... + a_st^s. Let T and R be
the operators on V defined by
T(p(t)) = 0 + a_1 + a_2t + ... + a_st^{s-1} and R(p(t)) = a_0t + a_1t^2 + ... + a_st^{s+1}
We have
(TR)(p(t)) = T(R(p(t))) = T(a_0t + a_1t^2 + ... + a_st^{s+1}) = a_0 + a_1t + ... + a_st^s = p(t)
and so TR = I, the identity mapping. On the other hand, if k ∈ K and k ≠ 0, then
(RT)(k) = R(T(k)) = R(0) = 0 ≠ k
Accordingly, RT ≠ I.
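The counterexample in part (c) is the familiar pair of shift operators on coefficient sequences. The Python sketch below models a polynomial a_0 + a_1t + ... + a_st^s as the list [a_0, a_1, ..., a_s]; this list representation and the sample polynomials are assumptions made for the illustration.

```python
def T(p):
    """Drop the constant term and lower each power by one:
    a0 + a1 t + a2 t^2 + ...  ->  a1 + a2 t + ..."""
    return p[1:] if len(p) > 1 else [0]

def R(p):
    """Multiply by t:  a0 + a1 t + ...  ->  a0 t + a1 t^2 + ..."""
    return [0] + p

p = [5, -1, 0, 2]    # the polynomial 5 - t + 2t^3
constant = [7]       # the nonzero constant polynomial k = 7

TR_p = T(R(p))               # should reproduce p, since TR = I
RT_constant = R(T(constant)) # a list of zeros: RT sends the constant 7 to 0
```

Here `TR_p` equals `p`, while `RT_constant` has only zero coefficients, so RT annihilates every constant polynomial and cannot be the identity.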
5.42. Let F and G be linear operators on R^2 defined by F(x, y) = (0, x) and G(x, y) = (x, 0). Show that:
(a) GF = 0, the zero mapping, but FG ≠ 0. (b) G^2 = G.
(a) (GF)(x, y) = G(F(x, y)) = G(0, x) = (0, 0). Since GF assigns 0 = (0, 0) to every vector (x, y) in R^2, it is
the zero mapping, that is, GF = 0.
On the other hand, (FG)(x, y) = F(G(x, y)) = F(x, 0) = (0, x). For example, (FG)(2, 3) = (0, 2).
Thus FG ≠ 0, since it does not assign 0 = (0, 0) to every vector in R^2.
(b) For any vector (x, y) in R^2, we have G^2(x, y) = G(G(x, y)) = G(x, 0) = (x, 0) = G(x, y). Hence G^2 = G.
5.43. Find the dimension of: (a) A(R^4), (b) A(P_2(t)), (c) A(M_{2,3}).
Use dim[A(V)] = n^2, where dim V = n. Hence: (a) dim[A(R^4)] = 4^2 = 16, (b) dim[A(P_2(t))] = 3^2 = 9,
(c) dim[A(M_{2,3})] = 6^2 = 36.
5.44. Let E be a linear operator on V for which E^2 = E. (Such an operator is called a projection.) Let U
be the image of E, and let W be the kernel. Prove:
(a) If u ∈ U, then E(u) = u, i.e., E is the identity mapping on U.
(b) If E ≠ I, then E is singular, i.e., E(v) = 0 for some v ≠ 0.
(c) V = U ⊕ W.
(a) If u ∈ U, the image of E, then E(v) = u for some v ∈ V. Hence, using E^2 = E, we have
u = E(v) = E^2(v) = E(E(v)) = E(u)
(b) If E ≠ I, then for some v ∈ V, E(v) = u, where v ≠ u. By part (a), E(u) = u. Thus
E(v - u) = E(v) - E(u) = u - u = 0, where v - u ≠ 0
(c) We first show that V = U + W. Let v ∈ V. Set u = E(v) and w = v - E(v). Then
v = E(v) + v - E(v) = u + w
By definition, u = E(v) ∈ U, the image of E. We now show that w ∈ W, the kernel of E:
E(w) = E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0
and thus w ∈ W. Hence V = U + W.
We next show that U ∩ W = {0}. Let v ∈ U ∩ W. Since v ∈ U, E(v) = v by part (a). Since v ∈ W,
E(v) = 0. Thus v = E(v) = 0, and so U ∩ W = {0}.
The above two properties imply that V = U ⊕ W.
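The decomposition v = E(v) + (v - E(v)) used in part (c) can be seen concretely on a small example. The operator E(x, y) = (x + y, 0) on R^2 in the Python sketch below is a hypothetical projection chosen for illustration (it is not taken from the problem); its image is the x-axis and its kernel is the line y = -x.

```python
def E(v):
    """A sample projection on R^2: E(x, y) = (x + y, 0), so that E(E(v)) = E(v)."""
    x, y = v
    return (x + y, 0)

def decompose(v):
    """Split v as u + w with u = E(v) in the image and w = v - E(v) in the kernel."""
    u = E(v)
    w = (v[0] - u[0], v[1] - u[1])
    return u, w

samples = [(x, y) for x in range(-3, 4) for y in range(-3, 4)]
decompositions = {v: decompose(v) for v in samples}
```

For example, `decompose((2, 5))` returns u = (7, 0) and w = (-5, 5); E fixes u, sends w to zero, and u + w recovers (2, 5).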
Supplementary Problems
MAPPINGS
5.45. Determine the number of different mappings from {a, b} into {1,2,3}.
5.46. Let f: R → R and g: R → R be defined by f(x) = x^2 + 3x + 1 and g(x) = 2x - 3. Find formulas defining
the composition mappings: (a) f∘g; (b) g∘f; (c) g∘g; (d) f∘f.
5.47. For each mapping f: R → R, find a formula for its inverse: (a) f(x) = 3x - 1, (b) f(x) = x^3 + 2.
5.48. For any mapping f: A → B, show that 1_B∘f = f = f∘1_A.
LINEAR MAPPINGS
5.49. Show that the following mappings are linear:
(a) F: R^3 → R^2 defined by F(x, y, z) = (x + 2y - 3z, 4x - 5y + 6z).
(b) F: R^2 → R^2 defined by F(x, y) = (ax + by, cx + dy), where a, b, c, d belong to R.
5.50. Show that the following mappings are not linear:
(a) F: R^2 → R^2 defined by F(x, y) = (x^2, y^2).
(b) F: R^3 → R^2 defined by F(x, y, z) = (x + 1, y + z).
(c) F: R^2 → R^2 defined by F(x, y) = (xy, y).
(d) F: R^3 → R^2 defined by F(x, y, z) = (|x|, y + z).
5.51. Find F(a, b), where the linear map F: R^2 → R^2 is defined by F(1, 2) = (3, -1) and F(0, 1) = (2, 1).
5.52. Find a 2 × 2 matrix A that maps:
(a) (1, 3)^T and (1, 4)^T into (-2, 5)^T and (3, -1)^T, respectively.
(b) (2, -4)^T and (-1, 2)^T into (1, 1)^T and (1, 3)^T, respectively.
5.53. Find a 2 × 2 singular matrix B that maps (1, 1)^T into (1, 3)^T.
5.54. Let V be the vector space of real n-square matrices, and let M be a fixed nonzero matrix in V. Show that the
first two of the following mappings T: V → V are linear, but the third is not:
(a) T(A) = MA, (b) T(A) = AM + MA, (c) T(A) = M + A.
5.55. Give an example of a nonlinear map F: R^2 → R^2 such that F^{-1}(0) = {0} but F is not one-to-one.
5.56. Let F: R^2 → R^2 be defined by F(x, y) = (3x + 5y, 2x + 3y), and let S be the unit circle in R^2. (S consists of
all points satisfying x^2 + y^2 = 1.) Find: (a) the image F(S), (b) the preimage F^{-1}(S).
5.57. Consider the linear map G: R^3 → R^3 defined by G(x, y, z) = (x + y + z, y - 2z, y - 3z) and the unit
sphere S_2 in R^3, which consists of the points satisfying x^2 + y^2 + z^2 = 1. Find: (a) G(S_2), (b) G^{-1}(S_2).
5.58. Let H be the plane x + 2y - 3z = 4 in R^3 and let G be the linear map in Problem 5.57. Find:
(a) G(H), (b) G^{-1}(H).
5.59. Let W be a subspace of V. The inclusion map, denoted by i: W ↪ V, is defined by i(w) = w for every w ∈ W.
Show that the inclusion map is linear.
5.60. Suppose F: V → U is linear. Show that F(-v) = -F(v).
KERNEL AND IMAGE OF LINEAR MAPPINGS
5.61. For each linear map $F$, find a basis and the dimension of the kernel and the image of $F$:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x, y, z) = (x + 2y - 3z,\ 2x + 5y - 4z,\ x + 4y + z)$,
(b) $F: \mathbf{R}^4 \to \mathbf{R}^3$ defined by $F(x, y, z, t) = (x + 2y + 3z + 2t,\ 2x + 4y + 7z + 5t,\ x + 2y + 6z + 5t)$.
5.62. For each linear map $G$, find a basis and the dimension of the kernel and the image of $G$:
(a) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x, y, z) = (x + y + z,\ 2x + 2y + 2z)$,
(b) $G: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $G(x, y, z) = (x + y,\ y + z)$,
(c) $G: \mathbf{R}^5 \to \mathbf{R}^3$ defined by
$G(x, y, z, s, t) = (x + 2y + 2z + s + t,\ x + 2y + 3z + 2s - t,\ 3x + 6y + 8z + 5s - t)$.
5.63. Each of the following matrices determines a linear map from $\mathbf{R}^4$ into $\mathbf{R}^3$:
\[
\text{(a)}\ A = \begin{bmatrix} 1 & 2 & 0 & 1 \\ 2 & -1 & 2 & -1 \\ 1 & -3 & 2 & -2 \end{bmatrix}, \qquad
\text{(b)}\ B = \begin{bmatrix} 1 & 0 & 2 & -1 \\ 2 & 3 & -1 & 1 \\ -2 & 0 & -5 & 3 \end{bmatrix}
\]
Find a basis as well as the dimension of the kernel and the image of each linear map.
5.64. Find a linear mapping $F: \mathbf{R}^3 \to \mathbf{R}^3$ whose image is spanned by $(1, 2, 3)$ and $(4, 5, 6)$.
5.65. Find a linear mapping $G: \mathbf{R}^4 \to \mathbf{R}^3$ whose kernel is spanned by $(1, 2, 3, 4)$ and $(0, 1, 1, 1)$.
5.66. Let $V = P_{10}(t)$, the vector space of polynomials of degree $\le 10$. Consider the linear map $D^4: V \to V$, where
$D^4$ denotes the fourth derivative $d^4/dt^4$. Find a basis and the dimension of:
(a) the image of $D^4$; (b) the kernel of $D^4$.
5.67. Suppose $F: V \to U$ is linear. Show that: (a) the image of any subspace of $V$ is a subspace of $U$;
(b) the preimage of any subspace of $U$ is a subspace of $V$.
5.68. Show that if $F: V \to U$ is onto, then $\dim U \le \dim V$. Determine all linear maps $F: \mathbf{R}^3 \to \mathbf{R}^4$ that are onto.
5.69. Consider the zero mapping $\mathbf{0}: V \to U$ defined by $\mathbf{0}(v) = 0 \in U$ for every $v \in V$. Find the kernel and the image of $\mathbf{0}$.
OPERATIONS WITH LINEAR MAPPINGS
5.70. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ and $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (y,\ x + z)$ and $G(x, y, z) = (2z,\ x - y)$. Find
formulas defining the mappings $F + G$ and $3F - 2G$.
5.71. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (y, 2x)$. Using the maps $F$ and $G$ in Problem 5.70, find formulas
defining the mappings: (a) $H \circ F$ and $H \circ G$, (b) $F \circ H$ and $G \circ H$, (c) $H \circ (F + G)$ and $H \circ F + H \circ G$.
5.72. Show that the following mappings $F$, $G$, $H$ are linearly independent:
(a) $F, G, H \in \mathrm{Hom}(\mathbf{R}^2, \mathbf{R}^2)$ defined by $F(x, y) = (x, 2y)$, $G(x, y) = (y,\ x + y)$, $H(x, y) = (0, x)$,
(b) $F, G, H \in \mathrm{Hom}(\mathbf{R}^3, \mathbf{R})$ defined by $F(x, y, z) = x + y + z$, $G(x, y, z) = y + z$, $H(x, y, z) = x - z$.
5.73. For $F, G \in \mathrm{Hom}(V, U)$, show that $\mathrm{rank}(F + G) \le \mathrm{rank}(F) + \mathrm{rank}(G)$. (Here $V$ has finite dimension.)
5.74. Let $F: V \to U$ and $G: U \to V$ be linear. Show that if $F$ and $G$ are nonsingular, then $G \circ F$ is nonsingular.
Give an example where $G \circ F$ is nonsingular but $G$ is not.
5.75. Find the dimension $d$ of: (a) $\mathrm{Hom}(\mathbf{R}^2, \mathbf{R}^8)$, (b) $\mathrm{Hom}(P_4(t), \mathbf{R}^3)$, (c) $\mathrm{Hom}(M_{2,4}, P_2(t))$.
5.76. Determine whether or not each of the following linear maps is nonsingular. If not, find a nonzero vector $v$
whose image is 0; otherwise find a formula for the inverse map:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by $F(x, y, z) = (x + y + z,\ 2x + 3y + 5z,\ x + 3y + 7z)$,
(b) $G: \mathbf{R}^3 \to P_2(t)$ defined by $G(x, y, z) = (x + y)t^2 + (x + 2y + 2z)t + y + z$,
(c) $H: \mathbf{R}^2 \to P_2(t)$ defined by $H(x, y) = (x + 2y)t^2 + (x - y)t + x + y$.
5.77. When can $\dim[\mathrm{Hom}(V, U)] = \dim V$?
ALGEBRA OF LINEAR OPERATORS
5.78. Let $F$ and $G$ be the linear operators on $\mathbf{R}^2$ defined by $F(x, y) = (x + y,\ 0)$ and $G(x, y) = (-y, x)$. Find
formulas defining the linear operators: (a) $F + G$, (b) $5F - 3G$, (c) $FG$, (d) $GF$, (e) $F^2$, (f) $G^2$.
5.79. Show that each linear operator $T$ on $\mathbf{R}^2$ is nonsingular and find a formula for $T^{-1}$, where:
(a) $T(x, y) = (x + 2y,\ 2x + 3y)$, (b) $T(x, y) = (2x - 3y,\ 3x - 4y)$.
5.80. Show that each of the following linear operators $T$ on $\mathbf{R}^3$ is nonsingular and find a formula for $T^{-1}$, where:
(a) $T(x, y, z) = (x - 3y - 2z,\ y - 4z,\ z)$; (b) $T(x, y, z) = (x + z,\ x - y,\ y)$.
5.81. Find the dimension of $A(V)$, where: (a) $V = \mathbf{R}^7$, (b) $V = P_5(t)$, (c) $V = M_{3,4}$.
5.82. Which of the following integers can be the dimension of an algebra $A(V)$ of linear maps:
5, 9, 12, 25, 28, 36, 45, 64, 88, 100?
5.83. Let $T$ be the linear operator on $\mathbf{R}^2$ defined by $T(x, y) = (x + 2y,\ 3x + 4y)$. Find a formula for $f(T)$, where:
(a) $f(t) = t^2 + 2t - 3$, (b) $f(t) = t^2 - 5t - 2$.
MISCELLANEOUS PROBLEMS
5.84. Suppose $F: V \to U$ is linear and $k$ is a nonzero scalar. Prove that the maps $F$ and $kF$ have the same kernel
and the same image.
5.85. Suppose $F$ and $G$ are linear operators on $V$ and that $F$ is nonsingular. Assume that $V$ has finite dimension.
Show that $\mathrm{rank}(FG) = \mathrm{rank}(GF) = \mathrm{rank}(G)$.
5.86. Let $F: V \to U$ be linear and let $W$ be a subspace of $V$. The restriction of $F$ to $W$ is the map $F|W: W \to U$
defined by $F|W(v) = F(v)$ for every $v$ in $W$. Prove the following:
(a) $F|W$ is linear; (b) $\mathrm{Ker}(F|W) = (\mathrm{Ker}\, F) \cap W$; (c) $\mathrm{Im}(F|W) = F(W)$.
5.87. Suppose $V$ has finite dimension. Suppose $T$ is a linear operator on $V$ such that $\mathrm{rank}(T^2) = \mathrm{rank}(T)$. Show that
$\mathrm{Ker}\, T \cap \mathrm{Im}\, T = \{0\}$.
5.88. Suppose $V = U \oplus W$. Let $E_1$ and $E_2$ be the linear operators on $V$ defined by $E_1(v) = u$, $E_2(v) = w$, where
$v = u + w$, $u \in U$, $w \in W$. Show that: (a) $E_1^2 = E_1$ and $E_2^2 = E_2$, i.e., that $E_1$ and $E_2$ are projections;
(b) $E_1 + E_2 = I$, the identity mapping; (c) $E_1E_2 = 0$ and $E_2E_1 = 0$.
5.89. Let $E_1$ and $E_2$ be linear operators on $V$ satisfying parts (a), (b), (c) of Problem 5.88. Prove:
$V = \mathrm{Im}\, E_1 \oplus \mathrm{Im}\, E_2$.
5.90. Let $v$ and $w$ be elements of a real vector space $V$. The line segment $L$ from $v$ to $v + w$ is defined to be the set of
vectors $v + tw$ for $0 \le t \le 1$. (See Fig. 5-6.)
[Fig. 5-6]
(a) Show that the line segment $L$ between vectors $v$ and $u$ consists of the points:
(i) $(1 - t)v + tu$ for $0 \le t \le 1$, (ii) $t_1 v + t_2 u$ for $t_1 + t_2 = 1$, $t_1 \ge 0$, $t_2 \ge 0$.
(b) Let $F: V \to U$ be linear. Show that the image $F(L)$ of a line segment $L$ in $V$ is a line segment in $U$.
5.91. A subset $X$ of a vector space $V$ is said to be convex if the line segment $L$ between any two points (vectors)
$P, Q \in X$ is contained in $X$. (a) Show that the intersection of convex sets is convex; (b) suppose $F: V \to U$ is
linear and $X$ is convex. Show that $F(X)$ is convex.
Answers to Supplementary Problems
5.45. Nine
5.46. (a) $(f \circ g)(x) = 4x^2 - 6x + 1$, (b) $(g \circ f)(x) = 2x^2 + 6x - 1$, (c) $(g \circ g)(x) = 4x - 9$,
(d) $(f \circ f)(x) = x^4 + 6x^3 + 14x^2 + 15x + 5$
5.47. (a) $f^{-1}(x) = \frac{1}{3}(x + 1)$, (b) $f^{-1}(x) = \sqrt[3]{x - 2}$
5.49. $F(x, y, z) = A(x, y, z)^T$, where: (a) $A = \begin{bmatrix} 1 & 2 & -3 \\ 4 & -5 & 6 \end{bmatrix}$, (b) $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$
5.50. (a) $u = (2, 2)$, $k = 3$; then $F(ku) = (36, 36)$ but $kF(u) = (12, 12)$. (b) $F(0) \ne 0$.
(c) $u = (1, 2)$, $v = (3, 4)$; then $F(u + v) = (24, 6)$ but $F(u) + F(v) = (14, 6)$.
(d) $u = (1, 2, 3)$, $k = -2$; then $F(ku) = (2, -10)$ but $kF(u) = (-2, -10)$.
5.51. $F(a, b) = (-a + 2b,\ -3a + b)$
5.52. (a) $A = \begin{bmatrix} -17 & 5 \\ 23 & -6 \end{bmatrix}$. (b) None. $(2, -4)$ and $(-1, 2)$ are linearly dependent but not $(1, 1)$ and $(1, 3)$.
5.53. $B = \begin{bmatrix} 1 & 0 \\ 3 & 0 \end{bmatrix}$ [Hint: Send $(0, 1)^T$ into $(0, 0)^T$.]
5.55. $F(x, y) = (x^2, y^2)$
5.56. (a) $13x^2 - 42xy + 34y^2 = 1$, (b) $13x^2 + 42xy + 34y^2 = 1$
5.57. (a) $x^2 - 8xy + 26y^2 + 6xz - 38yz + 14z^2 = 1$, (b) $x^2 + 2xy + 3y^2 + 2xz - 8yz + 14z^2 = 1$
5.58. (a) $x - y + 2z = 4$, (b) $x + 6z = 4$
5.61. (a) $\dim(\mathrm{Ker}\, F) = 1$, $\{(7, -2, 1)\}$; $\dim(\mathrm{Im}\, F) = 2$, $\{(1, 2, 1), (0, 1, 2)\}$
(b) $\dim(\mathrm{Ker}\, F) = 2$, $\{(-2, 1, 0, 0), (1, 0, -1, 1)\}$; $\dim(\mathrm{Im}\, F) = 2$, $\{(1, 2, 1), (0, 1, 3)\}$
5.62. (a) $\dim(\mathrm{Ker}\, G) = 2$, $\{(1, 0, -1), (1, -1, 0)\}$; $\dim(\mathrm{Im}\, G) = 1$, $\{(1, 2)\}$
(b) $\dim(\mathrm{Ker}\, G) = 1$, $\{(1, -1, 1)\}$; $\mathrm{Im}\, G = \mathbf{R}^2$, $\{(1, 0), (0, 1)\}$
(c) $\dim(\mathrm{Ker}\, G) = 3$, $\{(-2, 1, 0, 0, 0), (1, 0, -1, 1, 0), (-5, 0, 2, 0, 1)\}$; $\dim(\mathrm{Im}\, G) = 2$,
$\{(1, 1, 3), (0, 1, 2)\}$
5.63. (a) $\dim(\mathrm{Ker}\, A) = 2$, $\{(4, -2, -5, 0), (1, -3, 0, 5)\}$; $\dim(\mathrm{Im}\, A) = 2$, $\{(1, 2, 1), (0, 1, 1)\}$
(b) $\dim(\mathrm{Ker}\, B) = 1$, $\{(-1, \frac{2}{3}, 1, 1)\}$; $\mathrm{Im}\, B = \mathbf{R}^3$
5.64. $F(x, y, z) = (x + 4y,\ 2x + 5y,\ 3x + 6y)$
5.65. $F(x, y, z, t) = (x + y - z,\ 2x + y - t,\ 0)$
5.66. (a) $\{1, t, t^2, \ldots, t^6\}$, (b) $\{1, t, t^2, t^3\}$
5.68. None, since $\dim \mathbf{R}^4 > \dim \mathbf{R}^3$.
5.69. $\mathrm{Ker}\, \mathbf{0} = V$, $\mathrm{Im}\, \mathbf{0} = \{0\}$
5.70. $(F + G)(x, y, z) = (y + 2z,\ 2x - y + z)$, $(3F - 2G)(x, y, z) = (3y - 4z,\ x + 2y + 3z)$
5.71. (a) $(H \circ F)(x, y, z) = (x + z,\ 2y)$, $(H \circ G)(x, y, z) = (x - y,\ 4z)$; (b) not defined;
(c) $(H \circ (F + G))(x, y, z) = (H \circ F + H \circ G)(x, y, z) = (2x - y + z,\ 2y + 4z)$
5.74. $F(x, y) = (x, y, y)$, $G(x, y, z) = (x, y)$
5.75. (a) 16, (b) 15, (c) 24
5.76. (a) $v = (2, -3, 1)$; (b) $G^{-1}(at^2 + bt + c) = (b - 2c,\ a - b + 2c,\ -a + b - c)$;
(c) $H$ is nonsingular, but not invertible, since $\dim P_2(t) > \dim \mathbf{R}^2$.
5.77. $\dim U = 1$; that is, $U = K$.
5.78. (a) $(F + G)(x, y) = (x, x)$; (b) $(5F - 3G)(x, y) = (5x + 8y,\ -3x)$; (c) $(FG)(x, y) = (x - y,\ 0)$;
(d) $(GF)(x, y) = (0,\ x + y)$; (e) $F^2(x, y) = (x + y,\ 0)$ (note that $F^2 = F$); (f) $G^2(x, y) = (-x, -y)$.
[Note that $G^2 + I = 0$; hence $G$ is a zero of $f(t) = t^2 + 1$.]
5.79. (a) $T^{-1}(x, y) = (-3x + 2y,\ 2x - y)$, (b) $T^{-1}(x, y) = (-4x + 3y,\ -3x + 2y)$
5.80. (a) $T^{-1}(x, y, z) = (x + 3y + 14z,\ y + 4z,\ z)$, (b) $T^{-1}(x, y, z) = (y + z,\ z,\ x - y - z)$
5.81. (a) 49, (b) 36, (c) 144
5.82. Squares: 9, 25, 36, 64, 100
5.83. (a) $f(T)(x, y) = (6x + 14y,\ 21x + 27y)$; (b) $f(T)(x, y) = (0, 0)$, i.e., $f(T) = 0$
CHAPTER 6
Linear Mappings
and Matrices
6.1 INTRODUCTION
Consider a basis $S = \{u_1, u_2, \ldots, u_n\}$ of a vector space $V$ over a field $K$. For any vector $v \in V$, suppose
\[ v = a_1u_1 + a_2u_2 + \cdots + a_nu_n \]
Then the coordinate vector of $v$ relative to the basis $S$, which we assume to be a column vector (unless
otherwise stated or implied), is denoted and defined by
\[ [v]_S = [a_1, a_2, \ldots, a_n]^T \]
Recall (Section 4.11) that the mapping $v \mapsto [v]_S$, determined by the basis $S$, is an isomorphism between $V$
and $K^n$.
This chapter shows that there is also an isomorphism, determined by the basis $S$, between the algebra
$A(V)$ of linear operators on $V$ and the algebra $M$ of $n$-square matrices over $K$. Thus every linear mapping
$F: V \to V$ will correspond to an $n$-square matrix $[F]_S$ determined by the basis $S$. We will also show how
our matrix representation changes when we choose another basis.
6.2 MATRIX REPRESENTATION OF A LINEAR OPERATOR
Let $T$ be a linear operator (transformation) from a vector space $V$ into itself, and suppose
$S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$. Now $T(u_1), T(u_2), \ldots, T(u_n)$ are vectors in $V$, and so each is a
linear combination of the vectors in the basis $S$; say,
\[
\begin{aligned}
T(u_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
T(u_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
T(u_n) &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n
\end{aligned}
\]
The following definition applies.
Definition: The transpose of the above matrix of coefficients, denoted by $m_S(T)$ or $[T]_S$, is called the
matrix representation of $T$ relative to the basis $S$, or simply the matrix of $T$ in the basis $S$.
(The subscript $S$ may be omitted if the basis $S$ is understood.)
Using the coordinate (column) vector notation, the matrix representation of $T$ may be written in the
form
\[ m_S(T) = [T]_S = \big[\, [T(u_1)]_S,\ [T(u_2)]_S,\ \ldots,\ [T(u_n)]_S \,\big] \]
That is, the columns of $m(T)$ are the coordinate vectors of $T(u_1), T(u_2), \ldots, T(u_n)$, respectively.
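Example 6.1. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be the linear operator defined by $F(x, y) = (2x + 3y,\ 4x - 5y)$.
(a) Find the matrix representation of $F$ relative to the basis $S = \{u_1, u_2\} = \{(1, 2), (2, 5)\}$.
(1) First find $F(u_1)$, and then write it as a linear combination of the basis vectors $u_1$ and $u_2$. (For notational
convenience, we use column vectors.) We have
\[
F(u_1) = F\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} 8 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}
\quad\text{and}\quad
\begin{aligned} x + 2y &= 8 \\ 2x + 5y &= -6 \end{aligned}
\]
Solve the system to obtain $x = 52$, $y = -22$. Hence $F(u_1) = 52u_1 - 22u_2$.
(2) Next find $F(u_2)$, and then write it as a linear combination of $u_1$ and $u_2$:
\[
F(u_2) = F\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} 19 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}
\quad\text{and}\quad
\begin{aligned} x + 2y &= 19 \\ 2x + 5y &= -17 \end{aligned}
\]
Solve the system to get $x = 129$, $y = -55$. Thus $F(u_2) = 129u_1 - 55u_2$.
Now write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the matrix
\[ [F]_S = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix} \]
(b) Find the matrix representation of $F$ relative to the (usual) basis $E = \{e_1, e_2\} = \{(1, 0), (0, 1)\}$.
Find $F(e_1)$ and write it as a linear combination of the usual basis vectors $e_1$ and $e_2$, and then find $F(e_2)$ and
write it as a linear combination of $e_1$ and $e_2$. We have
\[
\begin{aligned}
F(e_1) &= F(1, 0) = (2, 4) = 2e_1 + 4e_2 \\
F(e_2) &= F(0, 1) = (3, -5) = 3e_1 - 5e_2
\end{aligned}
\qquad\text{and so}\qquad
[F]_E = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix}
\]
Note that the coordinates of $F(e_1)$ and $F(e_2)$ form the columns, not the rows, of $[F]_E$. Also, note that the
arithmetic is much simpler using the usual basis of $\mathbf{R}^2$.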
Example 6.2. Let $V$ be the vector space of functions with basis $S = \{\sin t, \cos t, e^{3t}\}$, and let $D: V \to V$ be the differential
operator defined by $D(f(t)) = d(f(t))/dt$. We compute the matrix representing $D$ in the basis $S$:
\[
\begin{aligned}
D(\sin t) &= \cos t = 0(\sin t) + 1(\cos t) + 0(e^{3t}) \\
D(\cos t) &= -\sin t = -1(\sin t) + 0(\cos t) + 0(e^{3t}) \\
D(e^{3t}) &= 3e^{3t} = 0(\sin t) + 0(\cos t) + 3(e^{3t})
\end{aligned}
\qquad\text{and so}\qquad
[D] = \begin{bmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 3 \end{bmatrix}
\]
[Note that the coordinates of $D(\sin t)$, $D(\cos t)$, $D(e^{3t})$ form the columns, not the rows, of $[D]$.]
Matrix Mappings and Their Matrix Representation
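Consider the following matrix $A$, which may be viewed as a linear operator on $\mathbf{R}^2$, and basis $S$ of $\mathbf{R}^2$:
\[
A = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}
\quad\text{and}\quad
S = \{u_1, u_2\} = \left\{ \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \begin{bmatrix} 2 \\ 5 \end{bmatrix} \right\}
\]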
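(We write vectors as columns, since our map is a matrix.) We find the matrix representation of $A$ relative to
the basis $S$.
(1) First we write $A(u_1)$ as a linear combination of $u_1$ and $u_2$. We have
\[
A(u_1) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 2 \end{bmatrix} = \begin{bmatrix} -1 \\ -6 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}
\quad\text{and so}\quad
\begin{aligned} x + 2y &= -1 \\ 2x + 5y &= -6 \end{aligned}
\]
Solving the system yields $x = 7$, $y = -4$. Thus $A(u_1) = 7u_1 - 4u_2$.
(2) Next we write $A(u_2)$ as a linear combination of $u_1$ and $u_2$. We have
\[
A(u_2) = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 2 \\ 5 \end{bmatrix} = \begin{bmatrix} -4 \\ -17 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 2 \\ 5 \end{bmatrix}
\quad\text{and so}\quad
\begin{aligned} x + 2y &= -4 \\ 2x + 5y &= -17 \end{aligned}
\]
Solving the system yields $x = 14$, $y = -9$. Thus $A(u_2) = 14u_1 - 9u_2$. Writing the coordinates of $A(u_1)$
and $A(u_2)$ as columns gives us the following matrix representation of $A$:
\[ [A]_S = \begin{bmatrix} 7 & 14 \\ -4 & -9 \end{bmatrix} \]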
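Remark: Suppose we want to find the matrix representation of $A$ relative to the usual basis
$E = \{e_1, e_2\} = \{[1, 0]^T, [0, 1]^T\}$ of $\mathbf{R}^2$. We have
\[
\begin{aligned}
A(e_1) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 3e_1 + 4e_2 \\
A(e_2) &= \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -2 \\ -5 \end{bmatrix} = -2e_1 - 5e_2
\end{aligned}
\qquad\text{and so}\qquad
[A]_E = \begin{bmatrix} 3 & -2 \\ 4 & -5 \end{bmatrix}
\]
Note that $[A]_E$ is the original matrix $A$. This result is true in general:
The matrix representation of any $n \times n$ square matrix $A$ over a field $K$ relative to the
usual basis $E$ of $K^n$ is the matrix $A$ itself; that is, $[A]_E = A$.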
Algorithm for Finding Matrix Representations
Next follows an algorithm for finding matrix representations. Step 0 is optional; it may be
useful in Step 1(b), which is repeated for each basis vector.
Algorithm 6.1: The input is a linear operator $T$ on a vector space $V$ and a basis $S = \{u_1, u_2, \ldots, u_n\}$ of
$V$. The output is the matrix representation $[T]_S$.
Step 0. Find a formula for the coordinates of an arbitrary vector $v$ relative to the basis $S$.
Step 1. Repeat for each basis vector $u_k$ in $S$:
(a) Find $T(u_k)$.
(b) Write $T(u_k)$ as a linear combination of the basis vectors $u_1, u_2, \ldots, u_n$.
Step 2. Form the matrix $[T]_S$ whose columns are the coordinate vectors in Step 1(b).
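The steps of Algorithm 6.1 can be sketched in code for operators on $K^n$. The following Python sketch is only an illustration under stated assumptions: the helper names `coordinates` and `matrix_representation` are not from the text, vectors are tuples, and Step 1(b) is carried out by Gauss-Jordan elimination with exact rational arithmetic rather than by a Step 0 formula.

```python
from fractions import Fraction

def coordinates(basis, target):
    """Coordinates of `target` relative to `basis` (hypothetical helper).

    Solves x1*u1 + ... + xn*un = target by Gauss-Jordan elimination,
    assuming `basis` really is a basis of K^n.
    """
    n = len(basis)
    # Augmented matrix: column j holds basis vector u_j, last column is the target.
    rows = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(target[i])]
            for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col:
                factor = rows[r][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] for i in range(n)]

def matrix_representation(T, basis):
    """Steps 1 and 2: the coordinate vectors of T(u_k) become the columns."""
    cols = [coordinates(basis, T(u)) for u in basis]
    n = len(basis)
    return [[cols[j][i] for j in range(n)] for i in range(n)]

# Operator and basis of Example 6.1(a).
F = lambda v: (2 * v[0] + 3 * v[1], 4 * v[0] - 5 * v[1])
F_S = matrix_representation(F, [(1, 2), (2, 5)])
print([[int(x) for x in row] for row in F_S])
```

Using `Fraction` keeps the arithmetic exact, so the entries can be compared directly with hand computations.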
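Example 6.3. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (2x + 3y,\ 4x - 5y)$. Find the matrix representation $[F]_S$ of $F$
relative to the basis $S = \{u_1, u_2\} = \{(1, -2), (2, -5)\}$.
(Step 0) First find the coordinates of $(a, b) \in \mathbf{R}^2$ relative to the basis $S$. We have
\[
\begin{bmatrix} a \\ b \end{bmatrix} = x\begin{bmatrix} 1 \\ -2 \end{bmatrix} + y\begin{bmatrix} 2 \\ -5 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y &= a \\ -2x - 5y &= b \end{aligned}
\quad\text{or}\quad
\begin{aligned} x + 2y &= a \\ -y &= 2a + b \end{aligned}
\]
Solving for $x$ and $y$ in terms of $a$ and $b$ yields $x = 5a + 2b$, $y = -2a - b$. Thus
\[ (a, b) = (5a + 2b)u_1 + (-2a - b)u_2 \]
(Step 1) Now we find $F(u_1)$ and write it as a linear combination of $u_1$ and $u_2$ using the above formula for $(a, b)$, and
then we repeat the process for $F(u_2)$. We have
\[
\begin{aligned}
F(u_1) &= F(1, -2) = (-4, 14) = 8u_1 - 6u_2 \\
F(u_2) &= F(2, -5) = (-11, 33) = 11u_1 - 11u_2
\end{aligned}
\]
(Step 2) Finally, we write the coordinates of $F(u_1)$ and $F(u_2)$ as columns to obtain the required matrix
\[ [F]_S = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix} \]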
Properties of Matrix Representations
This subsection gives the main properties of the matrix representations of linear operators $T$ on a
vector space $V$. We emphasize that we are always given a particular basis $S$ of $V$.
Our first theorem, proved in Problem 6.9, tells us that the "action" of a linear operator $T$ on a vector $v$
is preserved by its matrix representation.
Theorem 6.1: Let $T: V \to V$ be a linear operator, and let $S$ be a (finite) basis of $V$. Then, for any vector $v$
in $V$, $[T]_S[v]_S = [T(v)]_S$.
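Example 6.4. Consider the linear operator $F$ on $\mathbf{R}^2$ and the basis $S$ of Example 6.3, that is,
\[ F(x, y) = (2x + 3y,\ 4x - 5y) \quad\text{and}\quad S = \{u_1, u_2\} = \{(1, -2), (2, -5)\} \]
Let
\[ v = (5, -7), \quad\text{and so}\quad F(v) = (-11, 55) \]
Using the formula from Example 6.3, we get
\[ [v] = [11, -3]^T \quad\text{and}\quad [F(v)] = [55, -33]^T \]
We verify Theorem 6.1 for this vector $v$ (where $[F]$ is obtained from Example 6.3):
\[
[F][v] = \begin{bmatrix} 8 & 11 \\ -6 & -11 \end{bmatrix}\begin{bmatrix} 11 \\ -3 \end{bmatrix} = \begin{bmatrix} 55 \\ -33 \end{bmatrix} = [F(v)]
\]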
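The check in Example 6.4 can be repeated numerically. This is a minimal sketch, assuming the matrix $[F]_S$ and the coordinate formula $(a, b) = (5a + 2b)u_1 + (-2a - b)u_2$ from Example 6.3; the names `mat_vec` and `coords` are illustrative helpers, not notation from the text.

```python
def mat_vec(M, v):
    """Product of a matrix (list of rows) with a column vector (list)."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def coords(a, b):
    """Coordinates of (a, b) relative to S = {(1, -2), (2, -5)} (Step 0 of Example 6.3)."""
    return [5 * a + 2 * b, -2 * a - b]

def F(x, y):
    return (2 * x + 3 * y, 4 * x - 5 * y)

F_S = [[8, 11], [-6, -11]]           # [F]_S from Example 6.3

v = (5, -7)
lhs = mat_vec(F_S, coords(*v))       # [F]_S [v]_S
rhs = coords(*F(*v))                 # [F(v)]_S
print(lhs, rhs, lhs == rhs)
```

Changing `v` to any other integer pair should leave `lhs == rhs` true, which is exactly the content of Theorem 6.1 for this operator and basis.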
Given a basis $S$ of a vector space $V$, we have associated a matrix $[T]$ to each linear operator $T$ in the
algebra $A(V)$ of linear operators on $V$. Theorem 6.1 tells us that the "action" of an individual linear
operator $T$ is preserved by this representation. The next two theorems (proved in Problems 6.10 and 6.11)
tell us that the three basic operations in $A(V)$ with these operators, namely (i) addition, (ii) scalar
multiplication, and (iii) composition, are also preserved.
Theorem 6.2: Let $V$ be an $n$-dimensional vector space over $K$, let $S$ be a basis of $V$, and let $M$ be the
algebra of $n \times n$ matrices over $K$. Then the mapping:
\[ m: A(V) \to M \quad\text{defined by}\quad m(T) = [T]_S \]
is a vector space isomorphism. That is, for any $F, G \in A(V)$ and any $k \in K$,
(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$
(ii) $m(kF) = km(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).
Theorem 6.3: For any linear operators $F, G \in A(V)$,
\[ m(G \circ F) = m(G)m(F) \quad\text{or}\quad [G \circ F] = [G][F] \]
(Here $G \circ F$ denotes the composition of the maps $G$ and $F$.)
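Theorem 6.3 can be spot-checked on $\mathbf{R}^2$ with the usual basis. In the sketch below, $F$ is the operator of Example 6.1, while $G(x, y) = (-y, x)$ is an illustrative choice made here; the helper names are assumptions of this sketch, not notation from the text.

```python
def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def matrix_in_usual_basis(T):
    """Matrix of an operator T on R^2: its columns are T(e1) and T(e2)."""
    c1, c2 = T(1, 0), T(0, 1)
    return [[c1[0], c2[0]], [c1[1], c2[1]]]

F = lambda x, y: (2 * x + 3 * y, 4 * x - 5 * y)   # operator of Example 6.1
G = lambda x, y: (-y, x)                          # illustrative second operator
G_after_F = lambda x, y: G(*F(x, y))              # the composition G o F

lhs = matrix_in_usual_basis(G_after_F)                             # [G o F]
rhs = mat_mul(matrix_in_usual_basis(G), matrix_in_usual_basis(F))  # [G][F]
print(lhs == rhs)
```

Note the order of the factors: the matrix of $G \circ F$ is $[G][F]$, with the operator applied first appearing on the right.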
6.3 CHANGE OF BASIS
Let $V$ be an $n$-dimensional vector space over a field $K$. We have shown that, once we have selected a
basis $S$ of $V$, every vector $v \in V$ can be represented by means of an $n$-tuple $[v]_S$ in $K^n$, and every linear
operator $T$ in $A(V)$ can be represented by an $n \times n$ matrix over $K$. We ask the following natural question:
How do our representations change if we select another basis?
In order to answer this question, we first need a definition.
Definition: Let $S = \{u_1, u_2, \ldots, u_n\}$ be a basis of a vector space $V$, and let $S' = \{v_1, v_2, \ldots, v_n\}$ be
another basis. (For reference, we will call $S$ the "old" basis and $S'$ the "new" basis.) Since $S$
is a basis, each vector in the "new" basis $S'$ can be written uniquely as a linear combination
of the vectors in $S$; say,
\[
\begin{aligned}
v_1 &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
v_2 &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
v_n &= a_{n1}u_1 + a_{n2}u_2 + \cdots + a_{nn}u_n
\end{aligned}
\]
Let $P$ be the transpose of the above matrix of coefficients; that is, let $P = [p_{ij}]$, where
$p_{ij} = a_{ji}$. Then $P$ is called the change-of-basis matrix (or transition matrix) from the "old"
basis $S$ to the "new" basis $S'$.
The following remarks are in order.
Remark 1: The above change-of-basis matrix $P$ may also be viewed as the matrix whose columns
are, respectively, the coordinate column vectors of the "new" basis vectors $v_i$ relative to the "old" basis $S$;
namely,
\[ P = \big[\, [v_1]_S,\ [v_2]_S,\ \ldots,\ [v_n]_S \,\big] \]
Remark 2: Analogously, there is a change-of-basis matrix $Q$ from the "new" basis $S'$ to the "old"
basis $S$. Similarly, $Q$ may be viewed as the matrix whose columns are, respectively, the coordinate column
vectors of the "old" basis vectors $u_i$ relative to the "new" basis $S'$; namely,
\[ Q = \big[\, [u_1]_{S'},\ [u_2]_{S'},\ \ldots,\ [u_n]_{S'} \,\big] \]
Remark 3: Since the vectors $v_1, v_2, \ldots, v_n$ in the new basis $S'$ are linearly independent, the matrix $P$
is invertible (Problem 6.18). Similarly, $Q$ is invertible. In fact, we have the following proposition (proved in
Problem 6.18).
Proposition 6.4: Let $P$ and $Q$ be the above change-of-basis matrices. Then $Q = P^{-1}$.
Now suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of a vector space $V$, and suppose $P = [p_{ij}]$ is any
nonsingular matrix. Then the $n$ vectors
\[ v_i = p_{1i}u_1 + p_{2i}u_2 + \cdots + p_{ni}u_n, \qquad i = 1, 2, \ldots, n \]
corresponding to the columns of $P$, are linearly independent [Problem 6.21(a)]. Thus they form another
basis $S'$ of $V$. Moreover, $P$ will be the change-of-basis matrix from $S$ to the new basis $S'$.
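Example 6.5. Consider the following two bases of $\mathbf{R}^2$:
\[ S = \{u_1, u_2\} = \{(1, 2), (3, 5)\} \quad\text{and}\quad S' = \{v_1, v_2\} = \{(1, -1), (1, -2)\} \]
(a) Find the change-of-basis matrix $P$ from $S$ to the "new" basis $S'$.
Write each of the new basis vectors of $S'$ as a linear combination of the original basis vectors $u_1$ and $u_2$ of $S$.
We have
\[
\begin{bmatrix} 1 \\ -1 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -1 \end{aligned}
\quad\text{yielding}\quad x = -8,\ y = 3
\]
\[
\begin{bmatrix} 1 \\ -2 \end{bmatrix} = x\begin{bmatrix} 1 \\ 2 \end{bmatrix} + y\begin{bmatrix} 3 \\ 5 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 3y &= 1 \\ 2x + 5y &= -2 \end{aligned}
\quad\text{yielding}\quad x = -11,\ y = 4
\]
Thus
\[
\begin{aligned} v_1 &= -8u_1 + 3u_2 \\ v_2 &= -11u_1 + 4u_2 \end{aligned}
\qquad\text{and hence}\qquad
P = \begin{bmatrix} -8 & -11 \\ 3 & 4 \end{bmatrix}
\]
Note that the coordinates of $v_1$ and $v_2$ are the columns, not rows, of the change-of-basis matrix $P$.
(b) Find the change-of-basis matrix $Q$ from the "new" basis $S'$ back to the "old" basis $S$.
Here we write each of the "old" basis vectors $u_1$ and $u_2$ of $S$ as a linear combination of the "new" basis
vectors $v_1$ and $v_2$ of $S'$. This yields
\[
\begin{aligned} u_1 &= 4v_1 - 3v_2 \\ u_2 &= 11v_1 - 8v_2 \end{aligned}
\qquad\text{and hence}\qquad
Q = \begin{bmatrix} 4 & 11 \\ -3 & -8 \end{bmatrix}
\]
As expected from Proposition 6.4, $Q = P^{-1}$. (In fact, we could have obtained $Q$ by simply finding $P^{-1}$.)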
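The two change-of-basis matrices of Example 6.5 can be recomputed as a cross-check. This sketch assumes one convenient route that the example itself does not spell out: if $U$ and $W$ are the matrices whose columns are the vectors of $S$ and $S'$, then solving $U[v_i]_S = v_i$ for every new basis vector gives $P = U^{-1}W$, and likewise $Q = W^{-1}U$. The helper names are illustrative only.

```python
from fractions import Fraction

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix via the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)] for row in A]

U = [[1, 3], [2, 5]]      # columns are u1 = (1, 2), u2 = (3, 5)
W = [[1, 1], [-1, -2]]    # columns are v1 = (1, -1), v2 = (1, -2)

P = mat_mul(inverse_2x2(U), W)   # coordinates of v1, v2 relative to S
Q = mat_mul(inverse_2x2(W), U)   # coordinates of u1, u2 relative to S'
print(mat_mul(P, Q) == [[1, 0], [0, 1]])   # Proposition 6.4: Q = P^{-1}
```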
Example 6.6. Consider the following two bases of $\mathbf{R}^3$:
\[ E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\} \]
and
\[ S = \{u_1, u_2, u_3\} = \{(1, 0, 1), (2, 1, 2), (1, 2, 2)\} \]
(a) Find the change-of-basis matrix $P$ from the basis $E$ to the basis $S$.
Since $E$ is the usual basis, we can immediately write each basis element of $S$ as a linear combination of the
basis elements of $E$. Specifically,
\[
\begin{aligned}
u_1 &= (1, 0, 1) = e_1 + e_3 \\
u_2 &= (2, 1, 2) = 2e_1 + e_2 + 2e_3 \\
u_3 &= (1, 2, 2) = e_1 + 2e_2 + 2e_3
\end{aligned}
\qquad\text{and hence}\qquad
P = \begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix}
\]
Again, the coordinates of $u_1, u_2, u_3$ appear as the columns in $P$. Observe that $P$ is simply the matrix whose
columns are the basis vectors of $S$. This is true only because the original basis was the usual basis $E$.
(b) Find the change-of-basis matrix $Q$ from the basis $S$ to the basis $E$.
The definition of the change-of-basis matrix $Q$ tells us to write each of the (usual) basis vectors in $E$ as a
linear combination of the basis elements of $S$. This yields
\[
\begin{aligned}
e_1 &= (1, 0, 0) = -2u_1 + 2u_2 - u_3 \\
e_2 &= (0, 1, 0) = -2u_1 + u_2 \\
e_3 &= (0, 0, 1) = 3u_1 - 2u_2 + u_3
\end{aligned}
\qquad\text{and hence}\qquad
Q = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}
\]
We emphasize that to find $Q$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for
each of $e_1, e_2, e_3$.
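Alternatively, we can find $Q = P^{-1}$ by forming the matrix $M = [P, I]$ and row reducing $M$ to row canonical
form:
\[
M = \left[\begin{array}{ccc|ccc} 1 & 2 & 1 & 1 & 0 & 0 \\ 0 & 1 & 2 & 0 & 1 & 0 \\ 1 & 2 & 2 & 0 & 0 & 1 \end{array}\right]
\sim
\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & -2 & -2 & 3 \\ 0 & 1 & 0 & 2 & 1 & -2 \\ 0 & 0 & 1 & -1 & 0 & 1 \end{array}\right]
= [I, P^{-1}]
\]
thus
\[ Q = P^{-1} = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix} \]
(Here we have used the fact that $Q$ is the inverse of $P$.)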
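The row reduction of $M = [P, I]$ can be sketched as follows. This is a plain Gauss-Jordan elimination with exact fractions, written for illustration only; the function name `inverse_by_row_reduction` is an assumption of this sketch, and the code presumes that $P$ is invertible.

```python
from fractions import Fraction

def inverse_by_row_reduction(P):
    """Row reduce [P, I] to [I, P^{-1}] and return the right-hand block."""
    n = len(P)
    # Build the augmented matrix M = [P, I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Bring a nonzero pivot into position, scale it to 1, clear the column.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        M[col] = [x / M[col][col] for x in M[col]]
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [row[n:] for row in M]

P = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]     # change-of-basis matrix of Example 6.6(a)
Q = inverse_by_row_reduction(P)
print([[int(x) for x in row] for row in Q])
```

One elimination pass on the augmented matrix replaces the three separate $3 \times 3$ systems needed when each $e_i$ is solved for individually.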
The result in Example 6.6(a) is true in general. We state this result formally, since it occurs often.
Proposition 6.5: The change-of-basis matrix from the usual basis $E$ of $K^n$ to any basis $S$ of $K^n$ is the
matrix $P$ whose columns are, respectively, the basis vectors of $S$.
Applications of Change-of-Basis Matrix
First we show how a change of basis affects the coordinates of a vector in a vector space $V$. The
following theorem is proved in Problem 6.22.
Theorem 6.6: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then,
for any vector $v \in V$, we have:
\[ P[v]_{S'} = [v]_S \quad\text{and hence}\quad P^{-1}[v]_S = [v]_{S'} \]
Namely, if we multiply the coordinates of $v$ in the original basis $S$ by $P^{-1}$, we get the coordinates of $v$
in the new basis $S'$.
Remark 1: Although $P$ is called the change-of-basis matrix from the old basis $S$ to the new basis $S'$,
we emphasize that it is $P^{-1}$ that transforms the coordinates of $v$ in the original basis $S$ into the coordinates
of $v$ in the new basis $S'$.
Remark 2: Because of the above theorem, many texts call $Q = P^{-1}$, not $P$, the transition matrix
from the old basis $S$ to the new basis $S'$. Some texts also refer to $Q$ as the change-of-coordinates matrix.
We now give the proof of the above theorem for the special case that $\dim V = 3$. Suppose $P$ is the
change-of-basis matrix from the basis $S = \{u_1, u_2, u_3\}$ to the basis $S' = \{v_1, v_2, v_3\}$; say,
\[
\begin{aligned}
v_1 &= a_1u_1 + a_2u_2 + a_3u_3 \\
v_2 &= b_1u_1 + b_2u_2 + b_3u_3 \\
v_3 &= c_1u_1 + c_2u_2 + c_3u_3
\end{aligned}
\qquad\text{and hence}\qquad
P = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}
\]
Now suppose $v \in V$ and, say, $v = k_1v_1 + k_2v_2 + k_3v_3$. Then, substituting for $v_1, v_2, v_3$ from above, we
obtain
\[
\begin{aligned}
v &= k_1(a_1u_1 + a_2u_2 + a_3u_3) + k_2(b_1u_1 + b_2u_2 + b_3u_3) + k_3(c_1u_1 + c_2u_2 + c_3u_3) \\
&= (a_1k_1 + b_1k_2 + c_1k_3)u_1 + (a_2k_1 + b_2k_2 + c_2k_3)u_2 + (a_3k_1 + b_3k_2 + c_3k_3)u_3
\end{aligned}
\]
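Thus
\[
[v]_{S'} = \begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}
\quad\text{and}\quad
[v]_S = \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix}
\]
Accordingly,
\[
P[v]_{S'} = \begin{bmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{bmatrix}\begin{bmatrix} k_1 \\ k_2 \\ k_3 \end{bmatrix}
= \begin{bmatrix} a_1k_1 + b_1k_2 + c_1k_3 \\ a_2k_1 + b_2k_2 + c_2k_3 \\ a_3k_1 + b_3k_2 + c_3k_3 \end{bmatrix} = [v]_S
\]
Finally, multiplying the equation $[v]_S = P[v]_{S'}$ by $P^{-1}$, we get
\[ P^{-1}[v]_S = P^{-1}P[v]_{S'} = I[v]_{S'} = [v]_{S'} \]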
The next theorem (proved in Problem 6.26) shows how a change of basis affects the matrix
representation of a linear operator.
Theorem 6.7: Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$ in a vector space $V$. Then,
for any linear operator $T$ on $V$,
\[ [T]_{S'} = P^{-1}[T]_S P \]
That is, if $A$ and $B$ are the matrix representations of $T$ relative, respectively, to $S$ and $S'$,
then
\[ B = P^{-1}AP \]
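Example 6.7. Consider the following two bases of $\mathbf{R}^3$:
\[ E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\} \]
and
\[ S = \{u_1, u_2, u_3\} = \{(1, 0, 1), (2, 1, 2), (1, 2, 2)\} \]
The change-of-basis matrix $P$ from $E$ to $S$ and its inverse $P^{-1}$ were obtained in Example 6.6.
(a) Write $v = (1, 3, 5)$ as a linear combination of $u_1, u_2, u_3$, or, equivalently, find $[v]_S$.
One way to do this is to directly solve the vector equation $v = xu_1 + yu_2 + zu_3$, that is,
\[
\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = x\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} + y\begin{bmatrix} 2 \\ 1 \\ 2 \end{bmatrix} + z\begin{bmatrix} 1 \\ 2 \\ 2 \end{bmatrix}
\quad\text{or}\quad
\begin{aligned} x + 2y + z &= 1 \\ y + 2z &= 3 \\ x + 2y + 2z &= 5 \end{aligned}
\]
The solution is $x = 7$, $y = -5$, $z = 4$, so $v = 7u_1 - 5u_2 + 4u_3$.
On the other hand, we know that $[v]_E = [1, 3, 5]^T$, since $E$ is the usual basis, and we already know $P^{-1}$.
Therefore, by Theorem 6.6,
\[
[v]_S = P^{-1}[v]_E = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 \\ 3 \\ 5 \end{bmatrix} = \begin{bmatrix} 7 \\ -5 \\ 4 \end{bmatrix}
\]
Thus, again, $v = 7u_1 - 5u_2 + 4u_3$.
(b) Let $A = \begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix}$, which may be viewed as a linear operator on $\mathbf{R}^3$. Find the matrix $B$ that represents $A$
relative to the basis $S$.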
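The definition of the matrix representation of $A$ relative to the basis $S$ tells us to write each of $A(u_1)$, $A(u_2)$,
$A(u_3)$ as a linear combination of the basis vectors $u_1, u_2, u_3$ of $S$. This yields
\[
\begin{aligned}
A(u_1) &= (-1, 3, 5) = 11u_1 - 9u_2 + 6u_3 \\
A(u_2) &= (1, 2, 9) = 21u_1 - 14u_2 + 8u_3 \\
A(u_3) &= (3, -4, 5) = 17u_1 - 8u_2 + 2u_3
\end{aligned}
\qquad\text{and hence}\qquad
B = \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}
\]
We emphasize that to find $B$, we need to solve three $3 \times 3$ systems of linear equations, one $3 \times 3$ system for
each of $A(u_1)$, $A(u_2)$, $A(u_3)$.
On the other hand, since we know $P$ and $P^{-1}$, we can use Theorem 6.7. That is,
\[
B = P^{-1}AP = \begin{bmatrix} -2 & -2 & 3 \\ 2 & 1 & -2 \\ -1 & 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 3 & -2 \\ 2 & -4 & 1 \\ 3 & -1 & 2 \end{bmatrix}\begin{bmatrix} 1 & 2 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 2 \end{bmatrix}
= \begin{bmatrix} 11 & 21 & 17 \\ -9 & -14 & -8 \\ 6 & 8 & 2 \end{bmatrix}
\]
This, as expected, gives the same result.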
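The product $P^{-1}AP$ of Theorem 6.7 is easy to evaluate mechanically. The sketch below takes $P$ and $P^{-1}$ from Example 6.6 and $A$ from Example 6.7(b) as given; `mat_mul` is an illustrative helper, not notation from the text. It also checks the equivalent relation $PB = AP$, which says that column $k$ of $B$ holds the coordinates of $A(u_k)$ relative to $S$.

```python
def mat_mul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

P     = [[1, 2, 1], [0, 1, 2], [1, 2, 2]]        # columns are u1, u2, u3
P_inv = [[-2, -2, 3], [2, 1, -2], [-1, 0, 1]]    # from Example 6.6
A     = [[1, 3, -2], [2, -4, 1], [3, -1, 2]]

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert mat_mul(P_inv, P) == identity             # sanity check on the inverse

AP = mat_mul(A, P)                               # columns are A(u1), A(u2), A(u3)
B = mat_mul(P_inv, AP)                           # B = P^{-1} A P
for row in B:
    print(row)
print(mat_mul(P, B) == AP)                       # P B = A P
```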
6.4 SIMILARITY
Suppose $A$ and $B$ are square matrices for which there exists an invertible matrix $P$ such that
$B = P^{-1}AP$; then $B$ is said to be similar to $A$, or $B$ is said to be obtained from $A$ by a similarity
transformation. We show (Problem 6.29) that similarity of matrices is an equivalence relation.
By Theorem 6.7 and the above remark, we have the following basic result.
Theorem 6.8: Two matrices represent the same linear operator if and only if the matrices are similar.
That is, all the matrix representations of a linear operator $T$ form an equivalence class of similar
matrices.
A linear operator $T$ is said to be diagonalizable if there exists a basis $S$ of $V$ such that $T$ is represented
by a diagonal matrix; the basis $S$ is then said to diagonalize $T$. The preceding theorem gives us the
following result.
Theorem 6.9: Let $A$ be the matrix representation of a linear operator $T$. Then $T$ is diagonalizable if and
only if there exists an invertible matrix $P$ such that $P^{-1}AP$ is a diagonal matrix.
That is, $T$ is diagonalizable if and only if its matrix representation can be diagonalized by a similarity
transformation.
We emphasize that not every operator is diagonalizable. However, we will show (Chapter 10) that
every linear operator can be represented by certain "standard" matrices called its normal or canonical
forms. Such a discussion will require some theory of fields, polynomials, and determinants.
Functions and Similar Matrices
Suppose $f$ is a function on square matrices that assigns the same value to similar matrices; that is,
$f(A) = f(B)$ whenever $A$ is similar to $B$. Then $f$ induces a function, also denoted by $f$, on linear operators $T$
in the following natural way. We define
\[ f(T) = f([T]_S) \]
where $S$ is any basis. By Theorem 6.8, the function is well defined.
The determinant (Chapter 8) is perhaps the most important example of such a function. The trace
(Section 2.7) is another important example of such a function.
Example 6.8. Consider the following linear operator $F$ and bases $E$ and $S$ of $\mathbf{R}^2$:
\[ F(x, y) = (2x + 3y,\ 4x - 5y), \qquad E = \{(1, 0), (0, 1)\}, \qquad S = \{(1, 2), (2, 5)\} \]
By Example 6.1, the matrix representations of $F$ relative to the bases $E$ and $S$ are, respectively,
\[
A = \begin{bmatrix} 2 & 3 \\ 4 & -5 \end{bmatrix}
\quad\text{and}\quad
B = \begin{bmatrix} 52 & 129 \\ -22 & -55 \end{bmatrix}
\]
Using matrix $A$, we have:
(i) Determinant of $F = \det(A) = -10 - 12 = -22$; (ii) Trace of $F = \mathrm{tr}(A) = 2 - 5 = -3$.
On the other hand, using matrix $B$, we have:
(i) Determinant of $F = \det(B) = -2860 + 2838 = -22$; (ii) Trace of $F = \mathrm{tr}(B) = 52 - 55 = -3$.
As expected, both matrices yield the same result.
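The arithmetic of Example 6.8 can be confirmed in a few lines. This is a small sketch with illustrative helper names (`det2`, `trace`) that are not part of the text; it only handles the $2 \times 2$ case needed here.

```python
def det2(M):
    """Determinant of a 2 x 2 matrix given as a list of rows."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

A = [[2, 3], [4, -5]]           # [F]_E from Example 6.1(b)
B = [[52, 129], [-22, -55]]     # [F]_S from Example 6.1(a)

print(det2(A), det2(B))    # both determinants agree
print(trace(A), trace(B))  # both traces agree
```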
6.5 MATRICES AND GENERAL LINEAR MAPPINGS
Lastly, we consider the general case of linear mappings from one vector space into another. Suppose $V$
and $U$ are vector spaces over the same field $K$ and, say, $\dim V = m$ and $\dim U = n$. Furthermore, suppose
\[ S = \{v_1, v_2, \ldots, v_m\} \quad\text{and}\quad S' = \{u_1, u_2, \ldots, u_n\} \]
are arbitrary but fixed bases, respectively, of $V$ and $U$.
Suppose $F: V \to U$ is a linear mapping. Then the vectors $F(v_1), F(v_2), \ldots, F(v_m)$ belong to $U$,
and so each is a linear combination of the basis vectors in $S'$; say,
\[
\begin{aligned}
F(v_1) &= a_{11}u_1 + a_{12}u_2 + \cdots + a_{1n}u_n \\
F(v_2) &= a_{21}u_1 + a_{22}u_2 + \cdots + a_{2n}u_n \\
&\;\;\vdots \\
F(v_m) &= a_{m1}u_1 + a_{m2}u_2 + \cdots + a_{mn}u_n
\end{aligned}
\]
Definition: The transpose of the above matrix of coefficients, denoted by $m_{S,S'}(F)$ or $[F]_{S,S'}$, is called the
matrix representation of $F$ relative to the bases $S$ and $S'$. [We will use the simple notation
$m(F)$ and $[F]$ when the bases are understood.]
The following theorem is analogous to Theorem 6.1 for linear operators (Problem 6.67).
Theorem 6.10: For any vector $v \in V$, $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.
That is, multiplying the coordinates of $v$ in the basis $S$ of $V$ by $[F]$, we obtain the coordinates of $F(v)$ in
the basis $S'$ of $U$.
Recall that for any vector spaces $V$ and $U$, the collection of all linear mappings from $V$ into $U$ is a
vector space and is denoted by $\mathrm{Hom}(V, U)$. The following theorem is analogous to Theorem 6.2 for linear
operators, where now we let $M = M_{n,m}$ denote the vector space of all $n \times m$ matrices (Problem 6.67).
Theorem 6.11: The mapping $m: \mathrm{Hom}(V, U) \to M$ defined by $m(F) = [F]$ is a vector space
isomorphism. That is, for any $F, G \in \mathrm{Hom}(V, U)$ and any scalar $k$,
(i) $m(F + G) = m(F) + m(G)$ or $[F + G] = [F] + [G]$
(ii) $m(kF) = km(F)$ or $[kF] = k[F]$
(iii) $m$ is bijective (one-to-one and onto).
Our next theorem is analogous to Theorem 6.3 for linear operators (Problem 6.67).
Theorem 6.12: Let S, S', S'' be bases of vector spaces V, U, W, respectively. Let F: V → U and
G: U → W be linear mappings. Then
    [G∘F]_{S,S''} = [G]_{S',S''} [F]_{S,S'}
That is, relative to the appropriate bases, the matrix representation of the composition of two mappings
is the matrix product of the matrix representations of the individual mappings.
Next we show how the matrix representation of a linear mapping F: V → U is affected when new
bases are selected (Problem 6.67).
Theorem 6.13: Let P be the change-of-basis matrix from a basis e to a basis e' in V, and let Q be the
change-of-basis matrix from a basis f to a basis f' in U. Then, for any linear map
F: V → U,
    [F]_{e',f'} = Q^{-1} [F]_{e,f} P
In other words, if A is the matrix representation of a linear mapping F relative to the bases e and f, and
B is the matrix representation of F relative to the bases e' and f', then
    B = Q^{-1}AP
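As an illustration of the formula B = Q^{-1}AP, here is a small sketch under stated assumptions: the map F(x, y, z) = (3x + 2y - 4z, x - 5y + 3z) and the bases are taken from Problem 6.31, e and f are the usual bases, and the helper names matmul and inverse_2x2 are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

A = [[3, 2, -4], [1, -5, 3]]             # [F] relative to the usual bases
P = [[1, 1, 1], [1, 1, 0], [1, 0, 0]]    # columns: (1,1,1), (1,1,0), (1,0,0)
Q = [[1, 2], [3, 5]]                     # columns: (1,3), (2,5)

B = matmul(matmul(inverse_2x2(Q), A), P)
```

Under these assumptions B agrees with the matrix obtained directly in Problem 6.31.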
Our last theorem, proved in Problem 6.36, shows that any linear mapping from one vector space V into
another vector space U can be represented by a very simple matrix. We note that this theorem is analogous
to Theorem 3.18 for m x n matrices.
Theorem 6.14: Let F: V → U be linear and, say, rank(F) = r. Then there exist bases of V and U such
that the matrix representation of F has the form
    A = [ I_r  0 ]
        [  0   0 ]
where I_r is the r-square identity matrix.
The above matrix A is called the normal or canonical form of the linear map F.
Solved Problems
MATRIX REPRESENTATION OF LINEAR OPERATORS
6.1. Consider the linear mapping F: R^2 → R^2 defined by F(x, y) = (3x + 4y, 2x - 5y) and the
following bases of R^2:
    E = {e1, e2} = {(1, 0), (0, 1)}   and   S = {u1, u2} = {(1, 2), (2, 3)}
(a) Find the matrix A representing F relative to the basis E.
(b) Find the matrix B representing F relative to the basis S.
(a) Since E is the usual basis, the rows of A are simply the coefficients in the components of F(x, y); that is,
using (a, b) = a e1 + b e2, we have
    F(e1) = F(1, 0) = (3, 2) = 3e1 + 2e2
    F(e2) = F(0, 1) = (4, -5) = 4e1 - 5e2
and so
    A = [ 3   4 ]
        [ 2  -5 ]
Note that the coefficients of the basis vectors are written as columns in the matrix representation.
(b) First find F(u1) and write it as a linear combination of the basis vectors u1 and u2. We have
    F(u1) = F(1, 2) = (11, -8) = x(1, 2) + y(2, 3), and so
    x + 2y = 11
    2x + 3y = -8
Solve the system to obtain x = -49, y = 30. Therefore
    F(u1) = -49u1 + 30u2
Next find F(u2) and write it as a linear combination of the basis vectors u1 and u2. We have
    F(u2) = F(2, 3) = (18, -11) = x(1, 2) + y(2, 3), and so
    x + 2y = 18
    2x + 3y = -11
Solve for x and y to obtain x = -76, y = 47. Hence
    F(u2) = -76u1 + 47u2
Write the coefficients of u1 and u2 as columns to obtain
    B = [ -49  -76 ]
        [  30   47 ]
(b') Alternatively, one can first find the coordinates of an arbitrary vector (a, b) in R^2 relative to the basis S.
We have
    (a, b) = x(1, 2) + y(2, 3) = (x + 2y, 2x + 3y),
and so
    x + 2y = a
    2x + 3y = b
Solve for x and y in terms of a and b to get x = -3a + 2b, y = 2a - b. Thus
    (a, b) = (-3a + 2b)u1 + (2a - b)u2
Then use the formula for (a, b) to find the coordinates of F(u1) and F(u2) relative to S:
    F(u1) = F(1, 2) = (11, -8) = -49u1 + 30u2
    F(u2) = F(2, 3) = (18, -11) = -76u1 + 47u2
and so
    B = [ -49  -76 ]
        [  30   47 ]
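The procedure of Problem 6.1(b) can be sketched in code. This is only an illustrative sketch: the helper coords_2d is a hypothetical name, and it solves each 2 x 2 system by Cramer's rule instead of by elimination as in the text.

```python
from fractions import Fraction

def coords_2d(u1, u2, v):
    """Coordinates (x, y) with v = x*u1 + y*u2, by Cramer's rule."""
    det = u1[0] * u2[1] - u2[0] * u1[1]
    x = Fraction(v[0] * u2[1] - u2[0] * v[1], det)
    y = Fraction(u1[0] * v[1] - v[0] * u1[1], det)
    return x, y

def F(x, y):
    """The operator of Problem 6.1."""
    return (3 * x + 4 * y, 2 * x - 5 * y)

u1, u2 = (1, 2), (2, 3)
# Coordinates of F(u1) and F(u2) relative to S become the columns of B.
col1 = coords_2d(u1, u2, F(*u1))
col2 = coords_2d(u1, u2, F(*u2))
B = [[col1[0], col2[0]],
     [col1[1], col2[1]]]
```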
6.2. Consider the following linear operator G on R^2 and basis S:
    G(x, y) = (2x - 7y, 4x + 3y)   and   S = {u1, u2} = {(1, 3), (2, 5)}
(a) Find the matrix representation [G]_S of G relative to S.
(b) Verify [G]_S [v]_S = [G(v)]_S for the vector v = (4, -3) in R^2.
First find the coordinates of an arbitrary vector v = (a, b) in R^2 relative to the basis S. We have
    (a, b) = x(1, 3) + y(2, 5) = (x + 2y, 3x + 5y), and so
    x + 2y = a
    3x + 5y = b
Solve for x and y in terms of a and b to get x = -5a + 2b, y = 3a - b. Thus
    (a, b) = (-5a + 2b)u1 + (3a - b)u2, and so [v]_S = [-5a + 2b, 3a - b]^T
(a) Using the formula for (a, b) and G(x, y) = (2x - 7y, 4x + 3y), we have
    G(u1) = G(1, 3) = (-19, 13) = 121u1 - 70u2
    G(u2) = G(2, 5) = (-31, 23) = 201u1 - 116u2
and so
    [G]_S = [ 121   201 ]
            [ -70  -116 ]
(We emphasize that the coefficients of u1 and u2 are written as columns, not rows, in the matrix
representation.)
(b) Use the formula (a, b) = (-5a + 2b)u1 + (3a - b)u2 to get
    v = (4, -3) = -26u1 + 15u2
    G(v) = G(4, -3) = (29, 7) = -131u1 + 80u2
Then
    [v]_S = [-26, 15]^T   and   [G(v)]_S = [-131, 80]^T
Accordingly,
    [G]_S [v]_S = [ 121   201 ] [ -26 ] = [ -131 ] = [G(v)]_S
                  [ -70  -116 ] [  15 ]   [   80 ]
(This is expected from Theorem 6.1.)
6.3. Consider the following 2 x 2 matrix A and basis S of R^2:
    A = [ 2  4 ]   and   S = {u1, u2} = { [1, -2]^T, [3, -7]^T }
        [ 5  6 ]
The matrix A defines a linear operator on R^2. Find the matrix B that represents the mapping A
relative to the basis S.
First find the coordinates of an arbitrary vector (a, b)^T with respect to the basis S. We have
    [a, b]^T = x[1, -2]^T + y[3, -7]^T, or
    x + 3y = a
    -2x - 7y = b
Solve for x and y in terms of a and b to obtain x = 7a + 3b, y = -2a - b. Thus
    (a, b)^T = (7a + 3b)u1 + (-2a - b)u2
Then use the formula for (a, b)^T to find the coordinates of Au1 and Au2 relative to the basis S:
    Au1 = [ 2  4 ] [  1 ] = [ -6 ] = -63u1 + 19u2
          [ 5  6 ] [ -2 ]   [ -7 ]
    Au2 = [ 2  4 ] [  3 ] = [ -22 ] = -235u1 + 71u2
          [ 5  6 ] [ -7 ]   [ -27 ]
Writing the coordinates as columns yields
    B = [ -63  -235 ]
        [  19    71 ]
6.4. Find the matrix representation of each of the following linear operators F on R^3 relative to the usual
basis E = {e1, e2, e3} of R^3; that is, find [F] = [F]_E:
(a) F defined by F(x, y, z) = (x + 2y - 3z, 4x - 5y - 6z, 7x + 8y + 9z).
(b) F defined by the 3 x 3 matrix
    A = [ 1  1  1 ]
        [ 2  3  4 ]
        [ 5  5  5 ]
(c) F defined by F(e1) = (1, 3, 5), F(e2) = (2, 4, 6), F(e3) = (7, 7, 7). (Theorem 5.2 states that a
linear map is completely defined by its action on the vectors in a basis.)
(a) Since E is the usual basis, simply write the coefficients of the components of F(x, y, z) as rows:
    [F] = [ 1   2  -3 ]
          [ 4  -5  -6 ]
          [ 7   8   9 ]
(b) Since E is the usual basis, [F] = A, the matrix A itself.
(c) Here
    F(e1) = (1, 3, 5) = e1 + 3e2 + 5e3
    F(e2) = (2, 4, 6) = 2e1 + 4e2 + 6e3
    F(e3) = (7, 7, 7) = 7e1 + 7e2 + 7e3
and so
    [F] = [ 1  2  7 ]
          [ 3  4  7 ]
          [ 5  6  7 ]
That is, the columns of [F] are the images of the usual basis vectors.
6.5. Let G be the linear operator on R^3 defined by G(x, y, z) = (2y + z, x - 4y, 3x).
(a) Find the matrix representation of G relative to the basis
    S = {w1, w2, w3} = {(1, 1, 1), (1, 1, 0), (1, 0, 0)}
(b) Verify that [G][v] = [G(v)] for any vector v in R^3.
First find the coordinates of an arbitrary vector (a, b, c) ∈ R^3 with respect to the basis S. Write (a, b, c) as
a linear combination of w1, w2, w3 using unknown scalars x, y, and z:
    (a, b, c) = x(1, 1, 1) + y(1, 1, 0) + z(1, 0, 0) = (x + y + z, x + y, x)
Set corresponding components equal to each other to obtain the system of equations
    x + y + z = a,   x + y = b,   x = c
Solve the system for x, y, z in terms of a, b, c to find x = c, y = b - c, z = a - b. Thus
    (a, b, c) = c w1 + (b - c)w2 + (a - b)w3, or, equivalently, [(a, b, c)] = [c, b - c, a - b]^T
(a) Since G(x, y, z) = (2y + z, x - 4y, 3x),
    G(w1) = G(1, 1, 1) = (3, -3, 3) = 3w1 - 6w2 + 6w3
    G(w2) = G(1, 1, 0) = (2, -3, 3) = 3w1 - 6w2 + 5w3
    G(w3) = G(1, 0, 0) = (0, 1, 3) = 3w1 - 2w2 - w3
Write the coordinates of G(w1), G(w2), G(w3) as columns to get
    [G] = [  3   3   3 ]
          [ -6  -6  -2 ]
          [  6   5  -1 ]
(b) Write G(v) as a linear combination of w1, w2, w3, where v = (a, b, c) is an arbitrary vector in R^3:
    G(v) = G(a, b, c) = (2b + c, a - 4b, 3a) = 3a w1 + (-2a - 4b)w2 + (-a + 6b + c)w3
or, equivalently,
    [G(v)] = [3a, -2a - 4b, -a + 6b + c]^T
Accordingly,
    [G][v] = [  3   3   3 ] [   c   ]   [      3a      ]
             [ -6  -6  -2 ] [ b - c ] = [   -2a - 4b   ] = [G(v)]
             [  6   5  -1 ] [ a - b ]   [ -a + 6b + c  ]
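The identity [G][v] = [G(v)] of Problem 6.5 can also be spot-checked numerically. The sketch below (the function names coords_S and check are hypothetical) hard-codes the coordinate formula [c, b - c, a - b]^T and the matrix [G] derived above, and tests the identity on a small grid of integer vectors; this is a sanity check, not a proof.

```python
G_S = [[3, 3, 3],
       [-6, -6, -2],
       [6, 5, -1]]          # [G] relative to S, from part (a)

def coords_S(a, b, c):
    """Coordinates of (a, b, c) relative to S = {(1,1,1), (1,1,0), (1,0,0)}."""
    return [c, b - c, a - b]

def G(x, y, z):
    """The operator of Problem 6.5."""
    return (2 * y + z, x - 4 * y, 3 * x)

def check(v):
    """True when [G][v] equals [G(v)] for the given vector v."""
    lhs = [sum(g * t for g, t in zip(row, coords_S(*v))) for row in G_S]
    return lhs == coords_S(*G(*v))

grid = range(-2, 3)
all_ok = all(check((a, b, c)) for a in grid for b in grid for c in grid)
```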
6.6. Consider the following 3 x 3 matrix A and basis S of R^3:
    A = [ 1  -2   1 ]
        [ 3  -1   0 ]   and   S = {u1, u2, u3} = { [1, 1, 1]^T, [0, 1, 1]^T, [1, 2, 3]^T }
        [ 1   4  -2 ]
The matrix A defines a linear operator on R^3. Find the matrix B that represents the mapping A
relative to the basis S. (Recall that A represents itself relative to the usual basis of R^3.)
First find the coordinates of an arbitrary vector (a, b, c) in R^3 with respect to the basis S. We have
    [a, b, c]^T = x[1, 1, 1]^T + y[0, 1, 1]^T + z[1, 2, 3]^T
or
    x + z = a
    x + y + 2z = b
    x + y + 3z = c
Solve for x, y, z in terms of a, b, c to get
    x = a + b - c,   y = -a + 2b - c,   z = c - b
Thus (a, b, c)^T = (a + b - c)u1 + (-a + 2b - c)u2 + (c - b)u3.
Then use the formula for (a, b, c)^T to find the coordinates of Au1, Au2, Au3 relative to the basis S:
    A(u1) = A(1, 1, 1)^T = (0, 2, 3)^T = -u1 + u2 + u3
    A(u2) = A(0, 1, 1)^T = (-1, -1, 2)^T = -4u1 - 3u2 + 3u3
    A(u3) = A(1, 2, 3)^T = (0, 1, 3)^T = -2u1 - u2 + 2u3
and so
    B = [ -1  -4  -2 ]
        [  1  -3  -1 ]
        [  1   3   2 ]
6.7. For each of the following linear transformations (operators) L on R^2, find the matrix A that
represents L (relative to the usual basis of R^2):
(a) L is defined by L(1, 0) = (2, 4) and L(0, 1) = (5, 8).
(b) L is the rotation in R^2 counterclockwise by 90°.
(c) L is the reflection in R^2 about the line y = -x.
(a) Since {(1, 0), (0, 1)} is the usual basis of R^2, write their images under L as columns to get
    A = [ 2  5 ]
        [ 4  8 ]
(b) Under the rotation L, we have L(1, 0) = (0, 1) and L(0, 1) = (-1, 0). Thus
    A = [ 0  -1 ]
        [ 1   0 ]
(c) Under the reflection L, we have L(1, 0) = (0, -1) and L(0, 1) = (-1, 0). Thus
    A = [  0  -1 ]
        [ -1   0 ]
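The rule used in Problem 6.7, "write the images of the usual basis vectors as columns", is short enough to state in code. In this sketch the helper names matrix_from_images and apply are hypothetical, and the sample point (3, 1) is chosen only for illustration.

```python
def matrix_from_images(images):
    """Place the images L(e1), L(e2), ... as the columns of the matrix."""
    return [list(row) for row in zip(*images)]

def apply(M, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in M]

A_a = matrix_from_images([(2, 4), (5, 8)])      # part (a)
A_b = matrix_from_images([(0, 1), (-1, 0)])     # part (b): rotation by 90 degrees
A_c = matrix_from_images([(0, -1), (-1, 0)])    # part (c): reflection about y = -x

# Reflecting the sample point (3, 1) about y = -x gives (-1, -3).
reflected = apply(A_c, (3, 1))
```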
6.8. The set S = {e^{3t}, t e^{3t}, t^2 e^{3t}} is a basis of a vector space V of functions f: R → R. Let D be the
differential operator on V, that is, D(f) = df/dt. Find the matrix representation of D relative to the
basis S.
Find the image of each basis function:
    D(e^{3t}) = 3e^{3t} = 3(e^{3t}) + 0(t e^{3t}) + 0(t^2 e^{3t})
    D(t e^{3t}) = e^{3t} + 3t e^{3t} = 1(e^{3t}) + 3(t e^{3t}) + 0(t^2 e^{3t})
    D(t^2 e^{3t}) = 2t e^{3t} + 3t^2 e^{3t} = 0(e^{3t}) + 2(t e^{3t}) + 3(t^2 e^{3t})
and thus
    [D] = [ 3  1  0 ]
          [ 0  3  2 ]
          [ 0  0  3 ]
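The pattern in Problem 6.8 follows from the product rule, D(t^k e^{rt}) = k t^{k-1} e^{rt} + r t^k e^{rt}. The following sketch (the function name derivative_matrix and its parameters n and rate are hypothetical) builds the matrix of D on the basis {t^k e^{rt} : k = 0, ..., n-1} directly from that rule.

```python
def derivative_matrix(n, rate):
    """Matrix of D = d/dt on the basis {t^k e^(rate*t) : k = 0, ..., n-1}.

    Column k holds the coordinates of D(t^k e^(rate*t)).
    """
    M = [[0] * n for _ in range(n)]
    for k in range(n):
        M[k][k] = rate          # coefficient of t^k e^(rate*t)
        if k > 0:
            M[k - 1][k] = k     # coefficient of t^(k-1) e^(rate*t)
    return M

D = derivative_matrix(3, 3)     # the basis of Problem 6.8
```

With n = 3 and rate = 3 this gives the matrix found above.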
6.9. Prove Theorem 6.1: Let T: V → V be a linear operator, and let S be a (finite) basis of V. Then, for
any vector v in V, [T]_S [v]_S = [T(v)]_S.
Suppose S = {u1, u2, ..., un}, and suppose, for i = 1, ..., n,
    T(u_i) = a_i1 u1 + a_i2 u2 + ... + a_in un = Σ_{j=1..n} a_ij u_j
Then [T]_S is the n-square matrix whose jth row is
    (a_1j, a_2j, ..., a_nj)                                        (1)
Now suppose
    v = k1 u1 + k2 u2 + ... + kn un = Σ_{i=1..n} k_i u_i
Writing a column vector as the transpose of a row vector, we have
    [v]_S = (k1, k2, ..., kn)^T                                     (2)
Furthermore, using the linearity of T,
    T(v) = T(Σ_i k_i u_i) = Σ_i k_i T(u_i) = Σ_i k_i (Σ_j a_ij u_j)
         = Σ_j (Σ_i a_ij k_i) u_j = Σ_j (a_1j k1 + a_2j k2 + ... + a_nj kn) u_j
Thus [T(v)]_S is the column vector whose jth entry is
    a_1j k1 + a_2j k2 + ... + a_nj kn                               (3)
On the other hand, the jth entry of [T]_S [v]_S is obtained by multiplying the jth row of [T]_S by [v]_S, that is,
(1) by (2). But the product of (1) and (2) is (3). Hence [T]_S [v]_S and [T(v)]_S have the same entries. Thus
[T]_S [v]_S = [T(v)]_S.
6.10. Prove Theorem 6.2: Let S = {u1, u2, ..., un} be a basis for V over K, and let M be the algebra of
n-square matrices over K. Then the mapping m: A(V) → M defined by m(T) = [T]_S is a vector
space isomorphism. That is, for any F, G ∈ A(V) and any k ∈ K, we have:
(i) [F + G] = [F] + [G], (ii) [kF] = k[F], (iii) m is one-to-one and onto.
(i) Suppose, for i = 1, ..., n,
    F(u_i) = Σ_{j=1..n} a_ij u_j   and   G(u_i) = Σ_{j=1..n} b_ij u_j
Consider the matrices A = [a_ij] and B = [b_ij]. Then [F] = A^T and [G] = B^T. We have, for i = 1, ..., n,
    (F + G)(u_i) = F(u_i) + G(u_i) = Σ_{j=1..n} (a_ij + b_ij) u_j
Since A + B is the matrix (a_ij + b_ij), we have
    [F + G] = (A + B)^T = A^T + B^T = [F] + [G]
(ii) Also, for i = 1, ..., n,
    (kF)(u_i) = kF(u_i) = k Σ_{j=1..n} a_ij u_j = Σ_{j=1..n} (k a_ij) u_j
Since kA is the matrix (k a_ij), we have
    [kF] = (kA)^T = kA^T = k[F]
(iii) Finally, m is one-to-one, since a linear mapping is completely determined by its values on a
basis. Also, m is onto, since any matrix A = [a_ij] in M is the image of the linear operator F defined by
    F(u_i) = Σ_{j=1..n} a_ji u_j,   i = 1, ..., n
(the indices are transposed because [F] is the transpose of the matrix of coefficients).
Thus the theorem is proved.
6.11. Prove Theorem 6.3: For any linear operators G, F ∈ A(V), [G∘F] = [G][F].
Using the notation in Problem 6.10, we have
    (G∘F)(u_i) = G(F(u_i)) = G(Σ_j a_ij u_j) = Σ_j a_ij G(u_j)
               = Σ_j a_ij (Σ_k b_jk u_k) = Σ_k (Σ_j a_ij b_jk) u_k
Recall that AB is the matrix AB = [c_ik], where c_ik = Σ_{j=1..n} a_ij b_jk. Accordingly,
    [G∘F] = (AB)^T = B^T A^T = [G][F]
Thus the theorem is proved.
6.12. Let A be the matrix representation of a linear operator T. Prove that, for any polynomial f(t), we
have that f(A) is the matrix representation of f(T). [Thus f(T) = 0 if and only if f(A) = 0.]
Let φ be the mapping that sends an operator T into its matrix representation A. We need to prove that
φ(f(T)) = f(A). Suppose f(t) = a_n t^n + ... + a_1 t + a_0. The proof is by induction on n, the degree of f(t).
Suppose n = 0. Recall that φ(I') = I, where I' is the identity mapping and I is the identity matrix. Thus
    φ(f(T)) = φ(a_0 I') = a_0 φ(I') = a_0 I = f(A)
and so the theorem holds for n = 0.
Now assume the theorem holds for polynomials of degree less than n. Then, since φ is an algebra
isomorphism,
    φ(f(T)) = φ(a_n T^n + a_{n-1} T^{n-1} + ... + a_1 T + a_0 I')
            = a_n φ(T) φ(T^{n-1}) + φ(a_{n-1} T^{n-1} + ... + a_1 T + a_0 I')
            = a_n A A^{n-1} + (a_{n-1} A^{n-1} + ... + a_1 A + a_0 I) = f(A)
and the theorem is proved.
CHANGE OF BASIS
The coordinate vector [v]_S in this section will always denote a column vector, that is,
    [v]_S = [a_1, a_2, ..., a_n]^T
6.13. Consider the following bases of R^2:
    E = {e1, e2} = {(1, 0), (0, 1)}   and   S = {u1, u2} = {(1, 3), (1, 4)}
(a) Find the change-of-basis matrix P from the usual basis E to S.
(b) Find the change-of-basis matrix Q from S back to E.
(c) Find the coordinate vector [v] of v = (5, -3) relative to S.
(a) Since E is the usual basis, simply write the basis vectors in S as columns:
    P = [ 1  1 ]
        [ 3  4 ]
(b) Method 1. Use the definition of the change-of-basis matrix. That is, express each vector in E as a
linear combination of the vectors in S. We do this by first finding the coordinates of an arbitrary vector
v = (a, b) relative to S. We have
    (a, b) = x(1, 3) + y(1, 4) = (x + y, 3x + 4y), or
    x + y = a
    3x + 4y = b
Solve for x and y to obtain x = 4a - b, y = -3a + b. Thus
    v = (4a - b)u1 + (-3a + b)u2   and   [v]_S = [(a, b)]_S = [4a - b, -3a + b]^T
Using the above formula for [v]_S and writing the coordinates of the e_i as columns yields
    e1 = (1, 0) = 4u1 - 3u2
    e2 = (0, 1) = -u1 + u2
and
    Q = [  4  -1 ]
        [ -3   1 ]
Method 2. Since Q = P^{-1}, find P^{-1}, say by using the formula for the inverse of a 2 x 2 matrix.
Thus
    P^{-1} = [  4  -1 ]
             [ -3   1 ]
(c) Method 1. Write v as a linear combination of the vectors in S, say by using the above formula for
v = (a, b). We have v = (5, -3) = 23u1 - 18u2, and so [v]_S = [23, -18]^T.
Method 2. Use, from Theorem 6.6, the fact that [v]_S = P^{-1}[v]_E and the fact that [v]_E = [5, -3]^T:
    [v]_S = P^{-1}[v]_E = [  4  -1 ] [  5 ] = [  23 ]
                          [ -3   1 ] [ -3 ]   [ -18 ]
6.14. The vectors u1 = (1, 2, 0), u2 = (1, 3, 2), u3 = (0, 1, 3) form a basis S of R^3. Find:
(a) The change-of-basis matrix P from the usual basis E = {e1, e2, e3} to S.
(b) The change-of-basis matrix Q from S back to E.
(a) Since E is the usual basis, simply write the basis vectors of S as columns:
    P = [ 1  1  0 ]
        [ 2  3  1 ]
        [ 0  2  3 ]
(b) Method 1. Express each basis vector of E as a linear combination of the basis vectors of S by first
finding the coordinates of an arbitrary vector v = (a, b, c) relative to the basis S. We have
    [a, b, c]^T = x[1, 2, 0]^T + y[1, 3, 2]^T + z[0, 1, 3]^T
or
    x + y = a
    2x + 3y + z = b
    2y + 3z = c
Solve for x, y, z to get x = 7a - 3b + c, y = -6a + 3b - c, z = 4a - 2b + c. Thus
    v = (a, b, c) = (7a - 3b + c)u1 + (-6a + 3b - c)u2 + (4a - 2b + c)u3
or
    [v]_S = [(a, b, c)]_S = [7a - 3b + c, -6a + 3b - c, 4a - 2b + c]^T
Using the above formula for [v]_S and then writing the coordinates of the e_i as columns yields
    e1 = (1, 0, 0) = 7u1 - 6u2 + 4u3
    e2 = (0, 1, 0) = -3u1 + 3u2 - 2u3
    e3 = (0, 0, 1) = u1 - u2 + u3
and
    Q = [  7  -3   1 ]
        [ -6   3  -1 ]
        [  4  -2   1 ]
Method 2. Find P^{-1} by row reducing M = [P, I] to the form [I, P^{-1}]:
    M = [ 1  1  0 | 1  0  0 ]     [ 1  1  0 |  1  0  0 ]
        [ 2  3  1 | 0  1  0 ]  ~  [ 0  1  1 | -2  1  0 ]
        [ 0  2  3 | 0  0  1 ]     [ 0  2  3 |  0  0  1 ]

      ~ [ 1  1  0 |  1   0  0 ]     [ 1  0  0 |  7  -3   1 ]
        [ 0  1  1 | -2   1  0 ]  ~  [ 0  1  0 | -6   3  -1 ]  = [I, P^{-1}]
        [ 0  0  1 |  4  -2  1 ]     [ 0  0  1 |  4  -2   1 ]
Thus
    Q = P^{-1} = [  7  -3   1 ]
                 [ -6   3  -1 ]
                 [  4  -2   1 ]
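Method 2 amounts to Gauss-Jordan elimination on the block matrix [P, I]. A minimal sketch follows, assuming P is invertible; the function name inverse is hypothetical, and exact fractions are used so that the result can be compared with the hand computation.

```python
from fractions import Fraction

def inverse(P):
    """Row reduce [P, I] to [I, P^-1] and return P^-1 (assumes P is invertible)."""
    n = len(P)
    # Build the augmented block matrix M = [P, I].
    M = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
         for i, row in enumerate(P)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place.
        pivot = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[pivot] = M[pivot], M[col]
        # Scale the pivot row so that the pivot equals 1.
        M[col] = [x / M[col][col] for x in M[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col:
                factor = M[r][col]
                M[r] = [x - factor * y for x, y in zip(M[r], M[col])]
    return [row[n:] for row in M]

P = [[1, 1, 0],
     [2, 3, 1],
     [0, 2, 3]]
Q = inverse(P)
```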
6.15. Suppose the x- and y-axes in the plane R^2 are rotated counterclockwise 45° so that the new x'- and
y'-axes are along the line y = x and the line y = -x, respectively.
(a) Find the change-of-basis matrix P.
(b) Find the coordinates of the point A(5, 6) under the given rotation.
(a) The unit vectors in the direction of the new x'- and y'-axes are
    u1 = (√2/2, √2/2)   and   u2 = (-√2/2, √2/2)
(The unit vectors in the direction of the original x- and y-axes are the usual basis of R^2.) Thus write the
coordinates of u1 and u2 as columns to obtain
    P = [ √2/2  -√2/2 ]
        [ √2/2   √2/2 ]
(b) Multiply the coordinates of the point by P^{-1}:
    [  √2/2  √2/2 ] [ 5 ] = [ 11√2/2 ]
    [ -√2/2  √2/2 ] [ 6 ]   [   √2/2 ]
(Since P is orthogonal, P^{-1} is simply the transpose of P.)
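A floating-point sketch of Problem 6.15 is given below. It assumes the standard rotation matrix for an angle theta and uses the transpose of P as its inverse; the variable names are illustrative only.

```python
import math

theta = math.radians(45)
# Columns of P are the new unit vectors u1 and u2.
P = [[math.cos(theta), -math.sin(theta)],
     [math.sin(theta),  math.cos(theta)]]

# P is orthogonal, so its inverse is its transpose.
P_inv = [list(col) for col in zip(*P)]

point = (5, 6)
new_coords = [sum(p * x for p, x in zip(row, point)) for row in P_inv]
```

Up to rounding, new_coords equals (11√2/2, √2/2).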
6.16. The vectors u1 = (1, 1, 0), u2 = (0, 1, 1), u3 = (1, 2, 2) form a basis S of R^3. Find the coordinates
of an arbitrary vector v = (a, b, c) relative to the basis S.
Method 1. Express v as a linear combination of u1, u2, u3 using unknowns x, y, z. We have
    (a, b, c) = x(1, 1, 0) + y(0, 1, 1) + z(1, 2, 2) = (x + z, x + y + 2z, y + 2z)
This yields the system
    x + z = a              x + z = a               x + z = a
    x + y + 2z = b   or    y + z = -a + b    or    y + z = -a + b
    y + 2z = c             y + 2z = c              z = a - b + c
Solving by back-substitution yields x = b - c, y = -2a + 2b - c, z = a - b + c. Thus,
    [v]_S = [b - c, -2a + 2b - c, a - b + c]^T
Method 2. Find P^{-1} by row reducing M = [P, I] to the form [I, P^{-1}], where P is the change-of-basis
matrix from the usual basis E to S or, in other words, the matrix whose columns are the basis vectors of S.
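We have
    M = [ 1  0  1 | 1  0  0 ]     [ 1  0  1 |  1  0  0 ]
        [ 1  1  2 | 0  1  0 ]  ~  [ 0  1  1 | -1  1  0 ]
        [ 0  1  2 | 0  0  1 ]     [ 0  1  2 |  0  0  1 ]

      ~ [ 1  0  1 |  1   0  0 ]     [ 1  0  0 |  0   1  -1 ]
        [ 0  1  1 | -1   1  0 ]  ~  [ 0  1  0 | -2   2  -1 ]  = [I, P^{-1}]
        [ 0  0  1 |  1  -1  1 ]     [ 0  0  1 |  1  -1   1 ]
Thus
    P^{-1} = [  0   1  -1 ]                               [  0   1  -1 ] [ a ]   [     b - c     ]
             [ -2   2  -1 ]   and   [v]_S = P^{-1}[v]_E = [ -2   2  -1 ] [ b ] = [ -2a + 2b - c  ]
             [  1  -1   1 ]                               [  1  -1   1 ] [ c ]   [   a - b + c   ]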
6.17. Consider the following bases of R^2:
    S = {u1, u2} = {(1, -2), (3, -4)}   and   S' = {v1, v2} = {(1, 3), (3, 8)}
(a) Find the coordinates of v = (a, b) relative to the basis S.
(b) Find the change-of-basis matrix P from S to S'.
(c) Find the coordinates of v = (a, b) relative to the basis S'.
(d) Find the change-of-basis matrix Q from S' back to S.
(e) Verify Q = P^{-1}.
(f) Show that, for any vector v = (a, b) in R^2, P^{-1}[v]_S = [v]_{S'}. (See Theorem 6.6.)
(a) Let v = x u1 + y u2 for unknowns x and y; that is,
    [a, b]^T = x[1, -2]^T + y[3, -4]^T   or   x + 3y = a      or   x + 3y = a
                                              -2x - 4y = b         2y = 2a + b
Solve for x and y in terms of a and b to get x = -2a - (3/2)b and y = a + (1/2)b. Thus
    (a, b) = (-2a - (3/2)b)u1 + (a + (1/2)b)u2   or   [(a, b)]_S = [-2a - (3/2)b, a + (1/2)b]^T
(b) Use part (a) to write each of the basis vectors v1 and v2 of S' as a linear combination of the basis vectors
u1 and u2 of S; that is,
    v1 = (1, 3) = (-2 - 9/2)u1 + (1 + 3/2)u2 = -(13/2)u1 + (5/2)u2
    v2 = (3, 8) = (-6 - 12)u1 + (3 + 4)u2 = -18u1 + 7u2
Then P is the matrix whose columns are the coordinates of v1 and v2 relative to the basis S; that is,
    P = [ -13/2  -18 ]
        [   5/2    7 ]
(c) Let v = x v1 + y v2 for unknown scalars x and y:
    [a, b]^T = x[1, 3]^T + y[3, 8]^T   or   x + 3y = a     or   x + 3y = a
                                            3x + 8y = b         -y = b - 3a
Solve for x and y to get x = -8a + 3b and y = 3a - b. Thus
    (a, b) = (-8a + 3b)v1 + (3a - b)v2   or   [(a, b)]_{S'} = [-8a + 3b, 3a - b]^T
(d) Use part (c) to express each of the basis vectors u1 and u2 of S as a linear combination of the basis
vectors v1 and v2 of S':
    u1 = (1, -2) = (-8 - 6)v1 + (3 + 2)v2 = -14v1 + 5v2
    u2 = (3, -4) = (-24 - 12)v1 + (9 + 4)v2 = -36v1 + 13v2
Write the coordinates of u1 and u2 relative to S' as columns to obtain
    Q = [ -14  -36 ]
        [   5   13 ]
(e) QP = [ -14  -36 ] [ -13/2  -18 ] = [ 1  0 ] = I
         [   5   13 ] [   5/2    7 ]   [ 0  1 ]
(f) Use parts (a), (c), and (d) to obtain
    P^{-1}[v]_S = Q[v]_S = [ -14  -36 ] [ -2a - (3/2)b ] = [ -8a + 3b ] = [v]_{S'}
                           [   5   13 ] [  a + (1/2)b  ]   [  3a - b  ]
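Parts (e) and (f) of Problem 6.17 can be verified with exact arithmetic. In the sketch below the coordinate formulas for S and S' are hard-coded from parts (a) and (c) (with the fractions 3/2 and 1/2 as computed from the system in part (a)), and the helper names are hypothetical.

```python
from fractions import Fraction as Fr

P = [[Fr(-13, 2), Fr(-18)],
     [Fr(5, 2),   Fr(7)]]       # change of basis from S to S'
Q = [[-14, -36],
     [5, 13]]                   # change of basis from S' back to S

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def coords_S(a, b):
    """Coordinates of (a, b) relative to S, from part (a)."""
    return [-2 * a - Fr(3, 2) * b, a + Fr(1, 2) * b]

def coords_S_prime(a, b):
    """Coordinates of (a, b) relative to S', from part (c)."""
    return [-8 * a + 3 * b, 3 * a - b]

QP = matmul(Q, P)               # expected to be the identity, so Q = P^-1

a, b = 2, 5                     # arbitrary sample vector
lhs = [sum(q * x for q, x in zip(row, coords_S(a, b))) for row in Q]
rhs = coords_S_prime(a, b)
```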
6.18. Suppose P is the change-of-basis matrix from a basis {u_i} to a basis {w_i}, and suppose Q is the
change-of-basis matrix from the basis {w_i} back to {u_i}. Prove that P is invertible and that Q = P^{-1}.
Suppose, for i = 1, 2, ..., n, that
    w_i = a_i1 u1 + a_i2 u2 + ... + a_in un = Σ_{j=1..n} a_ij u_j            (1)
and, for j = 1, 2, ..., n,
    u_j = b_j1 w1 + b_j2 w2 + ... + b_jn wn = Σ_{k=1..n} b_jk w_k            (2)
Let A = [a_ij] and B = [b_jk]. Then P = A^T and Q = B^T. Substituting (2) into (1) yields
    w_i = Σ_j a_ij (Σ_k b_jk w_k) = Σ_k (Σ_j a_ij b_jk) w_k
Since {w_i} is a basis, Σ_j a_ij b_jk = δ_ik, where δ_ik is the Kronecker delta, that is, δ_ik = 1 if i = k but δ_ik = 0 if
i ≠ k. Suppose AB = [c_ik]. Then c_ik = δ_ik. Accordingly, AB = I, and so
    QP = B^T A^T = (AB)^T = I^T = I
Thus Q = P^{-1}.
6.19. Consider a finite sequence of vectors S = {u1, u2, ..., un}. Let S' be the sequence of vectors
obtained from S by one of the following "elementary operations":
(1) Interchange two vectors.
(2) Multiply a vector by a nonzero scalar.
(3) Add a multiple of one vector to another vector.
Show that S and S' span the same subspace W. Also, show that S' is linearly independent if and
only if S is linearly independent.
Observe that, for each operation, the vectors in S' are linear combinations of vectors in S. Also, since each
operation has an inverse of the same type, each vector in S is a linear combination of vectors in S'. Thus S and
S' span the same subspace W. Moreover, S' is linearly independent if and only if dim W = n, and this is true if
and only if S is linearly independent.
6.20. Let A = [a_ij] and B = [b_ij] be row equivalent m x n matrices over a field K, and let v1, v2, ..., vn be
any vectors in a vector space V over K. For i = 1, 2, ..., m, let u_i and w_i be defined by
    u_i = a_i1 v1 + a_i2 v2 + ... + a_in vn   and   w_i = b_i1 v1 + b_i2 v2 + ... + b_in vn
Show that {u_i} and {w_i} span the same subspace of V.
Applying an "elementary operation" of Problem 6.19 to {u_i} is equivalent to applying an elementary row
operation to the matrix A. Since A and B are row equivalent, B can be obtained from A by a sequence of
elementary row operations. Hence {w_i} can be obtained from {u_i} by the corresponding sequence of
operations. Accordingly, {u_i} and {w_i} span the same space.
6.21. Suppose u1, u2, ..., un belong to a vector space V over a field K, and suppose P = [a_ij] is an
n-square matrix over K. For i = 1, 2, ..., n, let v_i = a_i1 u1 + a_i2 u2 + ... + a_in un.
(a) Suppose P is invertible. Show that {u_i} and {v_i} span the same subspace of V. Hence {u_i} is
linearly independent if and only if {v_i} is linearly independent.
(b) Suppose P is singular (not invertible). Show that {v_i} is linearly dependent.
(c) Suppose {v_i} is linearly independent. Show that P is invertible.
(a) Since P is invertible, it is row equivalent to the identity matrix I. Hence, by Problem 6.19, {v_i} and {u_i}
span the same subspace of V. Thus one is linearly independent if and only if the other is linearly
independent.
(b) Since P is not invertible, it is row equivalent to a matrix with a zero row. This means {v_i} spans a
subspace that has a spanning set with fewer than n elements. Thus {v_i} is linearly dependent.
(c) This is the contrapositive of the statement of part (b), and so it follows from part (b).
6.22. Prove Theorem 6.6: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector
space V. Then, for any vector v ∈ V, we have P[v]_{S'} = [v]_S and hence P^{-1}[v]_S = [v]_{S'}.
Suppose S = {u1, ..., un} and S' = {w1, ..., wn}, and suppose, for i = 1, ..., n,
    w_i = a_i1 u1 + a_i2 u2 + ... + a_in un = Σ_{j=1..n} a_ij u_j
Then P is the n-square matrix whose jth row is
    (a_1j, a_2j, ..., a_nj)                                        (1)
Also suppose v = k1 w1 + k2 w2 + ... + kn wn = Σ_{i=1..n} k_i w_i. Then
    [v]_{S'} = [k1, k2, ..., kn]^T                                  (2)
Substituting for w_i in the equation for v, we obtain
    v = Σ_i k_i w_i = Σ_i k_i (Σ_j a_ij u_j) = Σ_j (Σ_i a_ij k_i) u_j
      = Σ_j (a_1j k1 + a_2j k2 + ... + a_nj kn) u_j
Accordingly, [v]_S is the column vector whose jth entry is
    a_1j k1 + a_2j k2 + ... + a_nj kn                               (3)
On the other hand, the jth entry of P[v]_{S'} is obtained by multiplying the jth row of P by [v]_{S'}, that is, (1) by (2).
However, the product of (1) and (2) is (3). Hence P[v]_{S'} and [v]_S have the same entries. Thus P[v]_{S'} = [v]_S, as
claimed.
Furthermore, multiplying the above by P^{-1} gives P^{-1}[v]_S = P^{-1}P[v]_{S'} = [v]_{S'}.
LINEAR OPERATORS AND CHANGE OF BASIS
6.23. Consider the linear transformation F on R^2 defined by F(x, y) = (5x - y, 2x + y) and the
following bases of R^2:
    E = {e1, e2} = {(1, 0), (0, 1)}   and   S = {u1, u2} = {(1, 4), (2, 7)}
(a) Find the change-of-basis matrix P from E to S and the change-of-basis matrix Q from S back
to E.
(b) Find the matrix A that represents F in the basis E.
(c) Find the matrix B that represents F in the basis S.
(a) Since E is the usual basis, simply write the vectors in S as columns to obtain the change-of-basis matrix
P. Recall, also, that Q = P^{-1}. Thus
    P = [ 1  2 ]   and   Q = P^{-1} = [ -7   2 ]
        [ 4  7 ]                      [  4  -1 ]
(b) Write the coefficients of x and y in F(x, y) = (5x - y, 2x + y) as rows to get
    A = [ 5  -1 ]
        [ 2   1 ]
(c) Method 1. Find the coordinates of F(u1) and F(u2) relative to the basis S. This may be done by first
finding the coordinates of an arbitrary vector (a, b) in R^2 relative to the basis S. We have
    (a, b) = x(1, 4) + y(2, 7) = (x + 2y, 4x + 7y),
and so
    x + 2y = a
    4x + 7y = b
Solve for x and y in terms of a and b to get x = -7a + 2b, y = 4a - b. Then
    (a, b) = (-7a + 2b)u1 + (4a - b)u2
Now use the formula for (a, b) to obtain
    F(u1) = F(1, 4) = (1, 6) = 5u1 - 2u2
    F(u2) = F(2, 7) = (3, 11) = u1 + u2
and so
    B = [  5  1 ]
        [ -2  1 ]
Method 2. By Theorem 6.7, B = P^{-1}AP. Thus
    B = P^{-1}AP = [ -7   2 ] [ 5  -1 ] [ 1  2 ] = [  5  1 ]
                   [  4  -1 ] [ 2   1 ] [ 4  7 ]   [ -2  1 ]
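Method 2 of Problem 6.23 is a pair of matrix multiplications, which the following sketch carries out with integer arithmetic (the helper name matmul is hypothetical; the matrices are those found in parts (a) and (b)).

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

P = [[1, 2], [4, 7]]          # columns are u1 = (1, 4) and u2 = (2, 7)
P_inv = [[-7, 2], [4, -1]]    # Q = P^-1 from part (a)
A = [[5, -1], [2, 1]]         # F relative to the usual basis E

check_inverse = matmul(P_inv, P)          # should be the identity matrix
B = matmul(matmul(P_inv, A), P)           # F relative to S
```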
6.24. Let
    A = [ 2   3 ]
        [ 4  -1 ]
Find the matrix B that represents the linear operator A relative to the basis
S = {u1, u2} = {[1, 3]^T, [2, 5]^T}. [Recall that A defines a linear operator A: R^2 → R^2 relative to the
usual basis E of R^2.]
Method 1. Find the coordinates of A(u1) and A(u2) relative to the basis S by first finding the coordinates
of an arbitrary vector [a, b]^T in R^2 relative to the basis S. By Problem 6.2,
    [a, b]^T = (-5a + 2b)u1 + (3a - b)u2
Using the formula for [a, b]^T, we obtain
    A(u1) = [ 2   3 ] [ 1 ] = [ 11 ] = -53u1 + 32u2
            [ 4  -1 ] [ 3 ]   [  1 ]
and
    A(u2) = [ 2   3 ] [ 2 ] = [ 19 ] = -89u1 + 54u2
            [ 4  -1 ] [ 5 ]   [  3 ]
Thus
    B = [ -53  -89 ]
        [  32   54 ]
Method 2. Use B = P^{-1}AP, where P is the change-of-basis matrix from the usual basis E to S. Thus
simply write the vectors in S (as columns) to obtain the change-of-basis matrix P and then use the formula for
P^{-1}. This gives
    P = [ 1  2 ]   and   P^{-1} = [ -5   2 ]
        [ 3  5 ]                  [  3  -1 ]
Then
    B = P^{-1}AP = [ -5   2 ] [ 2   3 ] [ 1  2 ] = [ -53  -89 ]
                   [  3  -1 ] [ 4  -1 ] [ 3  5 ]   [  32   54 ]
6.25. Let
    A = [ 1   3   1 ]
        [ 2   5  -4 ]
        [ 1  -2   2 ]
Find the matrix B that represents the linear operator A relative to the basis
    S = {u1, u2, u3} = {[1, 1, 0]^T, [0, 1, 1]^T, [1, 2, 2]^T}
[Recall that A defines a linear operator A: R^3 → R^3 relative to the usual basis E of R^3.]
Method 1. Find the coordinates of A(u1), A(u2), A(u3) relative to the basis S by first finding the
coordinates of an arbitrary vector v = (a, b, c) in R^3 relative to the basis S. By Problem 6.16,
    v = (b - c)u1 + (-2a + 2b - c)u2 + (a - b + c)u3
Using this formula for [a, b, c]^T, we obtain
    A(u1) = [4, 7, -1]^T = 8u1 + 7u2 - 4u3
    A(u2) = [4, 1, 0]^T = u1 - 6u2 + 3u3
    A(u3) = [9, 4, 1]^T = 3u1 - 11u2 + 6u3
Writing the coefficients of u1, u2, u3 as columns yields
    B = [  8   1    3 ]
        [  7  -6  -11 ]
        [ -4   3    6 ]
Method 2. Use B = P^{-1}AP, where P is the change-of-basis matrix from the usual basis E to S. The
matrix P (whose columns are simply the vectors in S) and P^{-1} appear in Problem 6.16. Thus
    B = P^{-1}AP = [  0   1  -1 ] [ 1   3   1 ] [ 1  0  1 ]   [  8   1    3 ]
                   [ -2   2  -1 ] [ 2   5  -4 ] [ 1  1  2 ] = [  7  -6  -11 ]
                   [  1  -1   1 ] [ 1  -2   2 ] [ 0  1  2 ]   [ -4   3    6 ]
6.26. Prove Theorem 6.7: Let P be the change-of-basis matrix from a basis S to a basis S' in a vector
space V. Then, for any linear operator T on V, [T]_{S'} = P^{-1}[T]_S P.
Let v be a vector in V. Then, by Theorem 6.6, P[v]_{S'} = [v]_S. Therefore,
    P^{-1}[T]_S P[v]_{S'} = P^{-1}[T]_S [v]_S = P^{-1}[T(v)]_S = [T(v)]_{S'}
But [T]_{S'}[v]_{S'} = [T(v)]_{S'}. Hence
    P^{-1}[T]_S P[v]_{S'} = [T]_{S'}[v]_{S'}
Since the mapping v ↦ [v]_{S'} is onto K^n, we have P^{-1}[T]_S P X = [T]_{S'} X for every X ∈ K^n. Thus
P^{-1}[T]_S P = [T]_{S'}, as claimed.
SIMILARITY OF MATRICES
6.27. Let
    A = [ 4  -2 ]   and   P = [ 1  2 ]
        [ 3   6 ]             [ 3  4 ]
(a) Find B = P^{-1}AP. (b) Verify tr(B) = tr(A). (c) Verify det(B) = det(A).
(a) First find P^{-1} using the formula for the inverse of a 2 x 2 matrix. We have
    P^{-1} = [ -2     1   ]
             [ 3/2  -1/2  ]
Then
    B = P^{-1}AP = [ -2     1   ] [ 4  -2 ] [ 1  2 ] = [   25     30 ]
                   [ 3/2  -1/2  ] [ 3   6 ] [ 3  4 ]   [ -27/2   -15 ]
(b) tr(A) = 4 + 6 = 10 and tr(B) = 25 - 15 = 10. Hence tr(B) = tr(A).
(c) det(A) = 24 + 6 = 30 and det(B) = -375 + 405 = 30. Hence det(B) = det(A).
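The invariance of trace and determinant under similarity in Problem 6.27 can be reproduced with exact fractions. This is a sketch only; the helper names matmul, inverse_2x2, trace and det are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inverse_2x2(M):
    """Inverse of a 2 x 2 matrix by the adjugate formula (assumes det != 0)."""
    (a, b), (c, d) = M
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]

def trace(M):
    return M[0][0] + M[1][1]

def det(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[4, -2], [3, 6]]
P = [[1, 2], [3, 4]]
B = matmul(matmul(inverse_2x2(P), A), P)
```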
6.28. Find the trace of each of the linear transformations F on R^3 in Problem 6.4.
Find the trace (sum of the diagonal elements) of any matrix representation of F such as the matrix
representation [F] = [F]^ of F relative to the usual basis E given in Problem 6.4.
(a) tr(F) = tr([F]) =1-5 + 9 = 5.
(b) tr(F) = tr([F]) =1+3+5 = 9.
(c) tr(F) = tr([F]) = 1+4 + 7 = 12.
6.29. Write A ~ B if A is similar to B, that is, if there exists an invertible matrix P such that A = P^{-1}BP.
Prove that ~ is an equivalence relation (on square matrices); that is,
(a) A ~ A, for every A. (b) If A ~ B, then B ~ A.
(c) If A ~ B and B ~ C, then A ~ C.
(a) The identity matrix I is invertible, and I^{-1} = I. Since A = I^{-1}AI, we have A ~ A.
(b) Since A ~ B, there exists an invertible matrix P such that A = P^{-1}BP. Hence B = PAP^{-1} = (P^{-1})^{-1}AP^{-1}
and P^{-1} is also invertible. Thus B ~ A.
(c) Since A ~ B, there exists an invertible matrix P such that A = P^{-1}BP, and since B ~ C, there exists an
invertible matrix Q such that B = Q^{-1}CQ. Thus
    A = P^{-1}BP = P^{-1}(Q^{-1}CQ)P = (P^{-1}Q^{-1})C(QP) = (QP)^{-1}C(QP)
and QP is also invertible. Thus A ~ C.
6.30. Suppose B is similar to A, say B = P^{-1}AP. Prove:
(a) B^n = P^{-1}A^n P, and so B^n is similar to A^n.
(b) f(B) = P^{-1}f(A)P, for any polynomial f(x), and so f(B) is similar to f(A).
(c) B is a root of a polynomial g(x) if and only if A is a root of g(x).
(a) The proof is by induction on n. The result holds for n = 1 by hypothesis. Suppose n > 1 and the result
holds for n - 1. Then
    B^n = B B^{n-1} = (P^{-1}AP)(P^{-1}A^{n-1}P) = P^{-1}A^n P
(b) Suppose f(x) = a_n x^n + ... + a_1 x + a_0. Using the left and right distributive laws and part (a), we have
    P^{-1}f(A)P = P^{-1}(a_n A^n + ... + a_1 A + a_0 I)P
                = P^{-1}(a_n A^n)P + ... + P^{-1}(a_1 A)P + P^{-1}(a_0 I)P
                = a_n(P^{-1}A^n P) + ... + a_1(P^{-1}AP) + a_0(P^{-1}IP)
                = a_n B^n + ... + a_1 B + a_0 I = f(B)
(c) By part (b), g(B) = 0 if and only if P^{-1}g(A)P = 0 if and only if g(A) = P0P^{-1} = 0.
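Parts (a) and (b) of Problem 6.30 can be spot-checked on a concrete similar pair. The sketch below reuses A, P and P^{-1} from Problem 6.23 (integer entries, so comparisons are exact); the polynomial f(x) = x^2 - 3x + 2 and the helper names power and poly are chosen here only for illustration.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def identity(n):
    return [[int(i == j) for j in range(n)] for i in range(n)]

def power(M, n):
    """M raised to the nonnegative integer power n."""
    R = identity(len(M))
    for _ in range(n):
        R = matmul(R, M)
    return R

def poly(coeffs, M):
    """Evaluate a0*I + a1*M + a2*M^2 + ... for coeffs = [a0, a1, a2, ...]."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    term = identity(n)
    for a in coeffs:
        result = [[result[i][j] + a * term[i][j] for j in range(n)] for i in range(n)]
        term = matmul(term, M)
    return result

A = [[5, -1], [2, 1]]
P = [[1, 2], [4, 7]]
P_inv = [[-7, 2], [4, -1]]
B = matmul(matmul(P_inv, A), P)          # B = P^-1 A P

f = [2, -3, 1]                           # illustrative f(x) = x^2 - 3x + 2
power_ok = matmul(matmul(P_inv, power(A, 3)), P) == power(B, 3)
poly_ok = matmul(matmul(P_inv, poly(f, A)), P) == poly(f, B)
```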
MATRIX REPRESENTATIONS OF GENERAL LINEAR MAPPINGS
6.31. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be the linear map defined by $F(x, y, z) = (3x + 2y - 4z,\ x - 5y + 3z)$.
(a) Find the matrix of $F$ in the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
$$S = \{w_1, w_2, w_3\} = \{(1,1,1),\ (1,1,0),\ (1,0,0)\} \quad\text{and}\quad S' = \{u_1, u_2\} = \{(1,3),\ (2,5)\}$$
(b) Verify Theorem 6.10: The action of $F$ is preserved by its matrix representation; that is, for any $v$ in $\mathbf{R}^3$, we have $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.
(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus
$$\begin{aligned}
F(w_1) &= F(1,1,1) = (1,-1) = -7u_1 + 4u_2 \\
F(w_2) &= F(1,1,0) = (5,-4) = -33u_1 + 19u_2 \\
F(w_3) &= F(1,0,0) = (3,1) = -13u_1 + 8u_2
\end{aligned}$$
Write the coordinates of $F(w_1), F(w_2), F(w_3)$ as columns to get
$$[F]_{S,S'} = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix}$$
(b) If $v = (x, y, z)$, then, by Problem 6.5, $v = zw_1 + (y - z)w_2 + (x - y)w_3$. Also,
$$F(v) = (3x + 2y - 4z,\ x - 5y + 3z) = (-13x - 20y + 26z)u_1 + (8x + 11y - 15z)u_2$$
Hence
$$[v]_S = (z,\ y - z,\ x - y)^T \quad\text{and}\quad [F(v)]_{S'} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix}$$
Thus
$$[F]_{S,S'}[v]_S = \begin{bmatrix} -7 & -33 & -13 \\ 4 & 19 & 8 \end{bmatrix}\begin{bmatrix} z \\ y - z \\ x - y \end{bmatrix} = \begin{bmatrix} -13x - 20y + 26z \\ 8x + 11y - 15z \end{bmatrix} = [F(v)]_{S'}$$
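The computation in Problem 6.31 can be replayed mechanically. The following sketch assumes the coordinate formulas quoted from Problems 6.2 and 6.5; the function names (`coords_in_S_prime`, `coords_in_S`) and the test vector `v` are illustrative choices only.

```python
def F(x, y, z):
    """The linear map of Problem 6.31."""
    return (3 * x + 2 * y - 4 * z, x - 5 * y + 3 * z)

def coords_in_S_prime(a, b):
    """Coordinates of (a, b) relative to S' = {(1, 3), (2, 5)} (formula from Problem 6.2)."""
    return (-5 * a + 2 * b, 3 * a - b)

def coords_in_S(x, y, z):
    """Coordinates of (x, y, z) relative to S = {(1,1,1), (1,1,0), (1,0,0)} (Problem 6.5)."""
    return (z, y - z, x - y)

S = [(1, 1, 1), (1, 1, 0), (1, 0, 0)]

# Column j of the matrix holds the S'-coordinates of F(w_j).
columns = [coords_in_S_prime(*F(*w)) for w in S]
matrix = [[col[i] for col in columns] for i in range(2)]

# Check Theorem 6.10 on one arbitrarily chosen vector.
v = (2, -1, 5)
lhs = tuple(sum(row[j] * c for j, c in enumerate(coords_in_S(*v))) for row in matrix)
rhs = coords_in_S_prime(*F(*v))
```

For this choice of `v`, `lhs` and `rhs` coincide, which is one instance of $[F]_{S,S'}[v]_S = [F(v)]_{S'}$.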
6.32. Let $F: \mathbf{R}^n \to \mathbf{R}^m$ be the linear mapping defined as follows:
$$F(x_1, x_2, \ldots, x_n) = (a_{11}x_1 + \cdots + a_{1n}x_n,\ a_{21}x_1 + \cdots + a_{2n}x_n,\ \ldots,\ a_{m1}x_1 + \cdots + a_{mn}x_n)$$
(a) Show that the rows of the matrix $[F]$ representing $F$ relative to the usual bases of $\mathbf{R}^n$ and $\mathbf{R}^m$ are the coefficients of the $x_i$ in the components of $F(x_1, \ldots, x_n)$.
(b) Find the matrix representation of each of the following linear mappings relative to the usual basis of $\mathbf{R}^n$:
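(i) $F: \mathbf{R}^2 \to \mathbf{R}^3$ defined by $F(x, y) = (3x - y,\ 2x + 4y,\ 5x - 6y)$.
(ii) $F: \mathbf{R}^4 \to \mathbf{R}^2$ defined by $F(x, y, s, t) = (3x - 4y + 2s - 5t,\ 5x + 7y - s - 2t)$.
(iii) $F: \mathbf{R}^3 \to \mathbf{R}^4$ defined by $F(x, y, z) = (2x + 3y - 8z,\ x + y + z,\ 4x - 5z,\ 6y)$.
(a) We have
$$\begin{aligned}
F(1, 0, \ldots, 0) &= (a_{11}, a_{21}, \ldots, a_{m1}) \\
F(0, 1, \ldots, 0) &= (a_{12}, a_{22}, \ldots, a_{m2}) \\
&\ \ \vdots \\
F(0, 0, \ldots, 1) &= (a_{1n}, a_{2n}, \ldots, a_{mn})
\end{aligned}$$
and thus
$$[F] = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
(b) By part (a), we need only look at the coefficients of the unknowns $x, y, \ldots$ in $F(x, y, \ldots)$. Thus
$$\text{(i) } [F] = \begin{bmatrix} 3 & -1 \\ 2 & 4 \\ 5 & -6 \end{bmatrix} \qquad \text{(ii) } [F] = \begin{bmatrix} 3 & -4 & 2 & -5 \\ 5 & 7 & -1 & -2 \end{bmatrix} \qquad \text{(iii) } [F] = \begin{bmatrix} 2 & 3 & -8 \\ 1 & 1 & 1 \\ 4 & 0 & -5 \\ 0 & 6 & 0 \end{bmatrix}$$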
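6.33. Let $A = \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}$. Recall that $A$ determines a mapping $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(v) = Av$, where vectors are written as columns. Find the matrix $[F]$ that represents the mapping relative to the following bases of $\mathbf{R}^3$ and $\mathbf{R}^2$:
(a) The usual bases of $\mathbf{R}^3$ and of $\mathbf{R}^2$.
(b) $S = \{w_1, w_2, w_3\} = \{(1,1,1),\ (1,1,0),\ (1,0,0)\}$ and $S' = \{u_1, u_2\} = \{(1,3),\ (2,5)\}$.
(a) Relative to the usual bases, $[F]$ is the matrix $A$ itself.
(b) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Thus
$$\begin{aligned}
F(w_1) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = -12u_1 + 8u_2 \\
F(w_2) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 7 \\ -3 \end{bmatrix} = -41u_1 + 24u_2 \\
F(w_3) &= \begin{bmatrix} 2 & 5 & -3 \\ 1 & -4 & 7 \end{bmatrix}\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 \\ 1 \end{bmatrix} = -8u_1 + 5u_2
\end{aligned}$$
Writing the coefficients of $F(w_1), F(w_2), F(w_3)$ as columns yields
$$[F] = \begin{bmatrix} -12 & -41 & -8 \\ 8 & 24 & 5 \end{bmatrix}$$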
6.34. Consider the linear transformation $T$ on $\mathbf{R}^2$ defined by $T(x, y) = (2x - 3y,\ x + 4y)$ and the following bases of $\mathbf{R}^2$:
$$E = \{e_1, e_2\} = \{(1,0),\ (0,1)\} \quad\text{and}\quad S = \{u_1, u_2\} = \{(1,3),\ (2,5)\}$$
(a) Find the matrix $A$ representing $T$ relative to the bases $E$ and $S$.
(b) Find the matrix $B$ representing $T$ relative to the bases $S$ and $E$.
(We can view $T$ as a linear mapping from one space into another, each having its own basis.)
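(a) From Problem 6.2, $(a, b) = (-5a + 2b)u_1 + (3a - b)u_2$. Hence
$$\begin{aligned}
T(e_1) &= T(1,0) = (2,1) = -8u_1 + 5u_2 \\
T(e_2) &= T(0,1) = (-3,4) = 23u_1 - 13u_2
\end{aligned} \quad\text{and so}\quad A = \begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix}$$
(b) We have
$$\begin{aligned}
T(u_1) &= T(1,3) = (-7,13) = -7e_1 + 13e_2 \\
T(u_2) &= T(2,5) = (-11,22) = -11e_1 + 22e_2
\end{aligned} \quad\text{and so}\quad B = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix}$$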
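6.35. How are the matrices $A$ and $B$ in Problem 6.34 related?
By Theorem 6.12, the matrices $A$ and $B$ are equivalent to each other; that is, there exist nonsingular matrices $P$ and $Q$ such that $B = Q^{-1}AP$, where $P$ is the change-of-basis matrix from $E$ to $S$ (its columns are $u_1$ and $u_2$), and $Q$ is the change-of-basis matrix from $S$ to $E$. Thus
$$P = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}, \qquad Q = \begin{bmatrix} -5 & 2 \\ 3 & -1 \end{bmatrix}, \qquad Q^{-1} = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$$
and
$$Q^{-1}AP = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}\begin{bmatrix} -8 & 23 \\ 5 & -13 \end{bmatrix}\begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -11 \\ 13 & 22 \end{bmatrix} = B$$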
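The relation $B = Q^{-1}AP$ in Problem 6.35 is a short matrix product and can be confirmed directly. The sketch below uses plain lists of rows; the helper name `matmul` is an illustrative choice, and the matrices are those of Problems 6.34 and 6.35, with $P$ taken to have the vectors of $S$ as its columns.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

A = [[-8, 23], [5, -13]]     # T relative to the bases E (domain) and S (codomain)
B = [[-7, -11], [13, 22]]    # T relative to the bases S (domain) and E (codomain)
P = [[1, 2], [3, 5]]         # columns are u1 = (1, 3) and u2 = (2, 5)
Q = [[-5, 2], [3, -1]]
Q_inv = [[1, 2], [3, 5]]

identity_check = matmul(Q, Q_inv)          # should be the 2 x 2 identity
product = matmul(matmul(Q_inv, A), P)      # should reproduce B
```

The intermediate product `matmul(Q_inv, A)` is the matrix of $T$ in the usual basis, `[[2, -3], [1, 4]]`, which matches the coefficients in the definition of $T$.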
6.36. Prove Theorem 6.14: Let $F: V \to U$ be linear and, say, $\operatorname{rank}(F) = r$. Then there exist bases of $V$ and of $U$ such that the matrix representation of $F$ has the following form, where $I_r$ is the $r$-square identity matrix:
$$A = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$$
Suppose $\dim V = m$ and $\dim U = n$. Let $W$ be the kernel of $F$ and $U'$ the image of $F$. We are given that $\operatorname{rank}(F) = r$. Hence the dimension of the kernel of $F$ is $m - r$. Let $\{w_1, \ldots, w_{m-r}\}$ be a basis of the kernel of $F$ and extend this to a basis of $V$:
$$\{v_1, \ldots, v_r, w_1, \ldots, w_{m-r}\}$$
Set
$$u_1 = F(v_1),\quad u_2 = F(v_2),\quad \ldots,\quad u_r = F(v_r)$$
Then $\{u_1, \ldots, u_r\}$ is a basis of $U'$, the image of $F$. Extend this to a basis of $U$, say
$$\{u_1, \ldots, u_r, u_{r+1}, \ldots, u_n\}$$
Observe that
$$\begin{aligned}
F(v_1) &= u_1 = 1u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
F(v_2) &= u_2 = 0u_1 + 1u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
&\ \ \vdots \\
F(v_r) &= u_r = 0u_1 + 0u_2 + \cdots + 1u_r + 0u_{r+1} + \cdots + 0u_n \\
F(w_1) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n \\
&\ \ \vdots \\
F(w_{m-r}) &= 0 = 0u_1 + 0u_2 + \cdots + 0u_r + 0u_{r+1} + \cdots + 0u_n
\end{aligned}$$
Thus the matrix of $F$ in the above bases has the required form.
Supplementary Problems
MATRICES AND LINEAR OPERATORS
6.37. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (4x + 5y,\ 2x - y)$.
(a) Find the matrix $A$ representing $F$ in the usual basis $E$.
(b) Find the matrix $B$ representing $F$ in the basis $S = \{u_1, u_2\} = \{(1, 4),\ (2, 9)\}$.
(c) Find $P$ such that $B = P^{-1}AP$.
(d) For $v = (a, b)$, find $[v]_S$ and $[F(v)]_S$. Verify that $[F]_S[v]_S = [F(v)]_S$.
6.38. Let $A: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 5 & -1 \\ 2 & 4 \end{bmatrix}$.
(a) Find the matrix $B$ representing $A$ relative to the basis $S = \{u_1, u_2\} = \{(1, 3),\ (2, 8)\}$. (Recall that $A$ represents the mapping $A$ relative to the usual basis $E$.)
(b) For $v = (a, b)$, find $[v]_S$ and $[A(v)]_S$.
6.39. For each linear transformation $L$ on $\mathbf{R}^2$, find the matrix $A$ representing $L$ (relative to the usual basis of $\mathbf{R}^2$):
(a) $L$ is the rotation in $\mathbf{R}^2$ counterclockwise by 45°.
(b) $L$ is the reflection in $\mathbf{R}^2$ about the line $y = x$.
(c) $L$ is defined by $L(1, 0) = (3, 5)$ and $L(0, 1) = (7, -2)$.
(d) $L$ is defined by $L(1, 1) = (3, 7)$ and $L(1, 2) = (5, -4)$.
6.40. Find the matrix representing each linear transformation $T$ on $\mathbf{R}^3$ relative to the usual basis of $\mathbf{R}^3$:
(a) $T(x, y, z) = (x, y, 0)$. (b) $T(x, y, z) = (z,\ y + z,\ x + y + z)$.
(c) $T(x, y, z) = (2x - 7y - 4z,\ 3x + y + 4z,\ 6x - 8y + z)$.
6.41. Repeat Problem 6.40 using the basis $S = \{u_1, u_2, u_3\} = \{(1,1,0),\ (1,2,3),\ (1,3,5)\}$.
6.42. Let $L$ be the linear transformation on $\mathbf{R}^3$ defined by
$$L(1, 0, 0) = (1, 1, 1), \quad L(0, 1, 0) = (1, 3, 5), \quad L(0, 0, 1) = (2, 2, 2)$$
(a) Find the matrix $A$ representing $L$ relative to the usual basis of $\mathbf{R}^3$.
(b) Find the matrix $B$ representing $L$ relative to the basis $S$ in Problem 6.41.
6.43. Let $D$ denote the differential operator; that is, $D(f(t)) = df/dt$. Each of the following sets is a basis of a vector space $V$ of functions. Find the matrix representing $D$ in each basis:
(a) $\{e^t,\ e^{2t},\ te^{2t}\}$. (b) $\{1,\ t,\ \sin 3t,\ \cos 3t\}$. (c) $\{e^{5t},\ te^{5t},\ t^2e^{5t}\}$.
6.44. Let $D$ denote the differential operator on the vector space $V$ of functions with basis $S = \{\sin\theta,\ \cos\theta\}$.
(a) Find the matrix $A = [D]_S$. (b) Use $A$ to show that $D$ is a zero of $f(t) = t^2 + 1$.
6.45. Let $V$ be the vector space of $2 \times 2$ matrices. Consider the following matrix $M$ and usual basis $E$ of $V$:
$$M = \begin{bmatrix} a & b \\ c & d \end{bmatrix} \quad\text{and}\quad E = \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$
Find the matrix representing each of the following linear operators $T$ on $V$ relative to $E$:
(a) $T(A) = MA$. (b) $T(A) = AM$. (c) $T(A) = MA - AM$.
6.46. Let $1_V$ and $0_V$ denote the identity and zero operators, respectively, on a vector space $V$. Show that, for any basis $S$ of $V$: (a) $[1_V]_S = I$, the identity matrix. (b) $[0_V]_S = 0$, the zero matrix.
CHANGE OF BASIS
6.47. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^2$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b)$ relative to $S$, for the following bases $S$:
(a) $S = \{(1,2),\ (3,5)\}$.
(b) $S = \{(1,-3),\ (3,-8)\}$.
(c) $S = \{(2,5),\ (3,7)\}$.
(d) $S = \{(2,3),\ (4,5)\}$.
6.48. Consider the bases $S = \{(1,2),\ (2,3)\}$ and $S' = \{(1,3),\ (1,4)\}$ of $\mathbf{R}^2$. Find the change-of-basis matrix:
(a) $P$ from $S$ to $S'$. (b) $Q$ from $S'$ back to $S$.
6.49. Suppose that the $x$- and $y$-axes in the plane $\mathbf{R}^2$ are rotated counterclockwise 30° to yield new $x'$- and $y'$-axes for the plane. Find:
(a) The unit vectors in the direction of the new $x'$- and $y'$-axes.
(b) The change-of-basis matrix $P$ for the new coordinate system.
(c) The new coordinates of the points $A(1, 3)$, $B(2, -5)$, $C(a, b)$.
6.50. Find the change-of-basis matrix $P$ from the usual basis $E$ of $\mathbf{R}^3$ to a basis $S$, the change-of-basis matrix $Q$ from $S$ back to $E$, and the coordinates of $v = (a, b, c)$ relative to $S$, where $S$ consists of the vectors:
(a) $u_1 = (1,1,0)$, $u_2 = (0,1,2)$, $u_3 = (0,1,1)$.
(b) $u_1 = (1,0,1)$, $u_2 = (1,1,2)$, $u_3 = (1,2,4)$.
(c) $u_1 = (1,2,1)$, $u_2 = (1,3,4)$, $u_3 = (2,5,6)$.
6.51. Suppose $S_1, S_2, S_3$ are bases of $V$. Let $P$ and $Q$ be the change-of-basis matrices, respectively, from $S_1$ to $S_2$ and from $S_2$ to $S_3$. Prove that $PQ$ is the change-of-basis matrix from $S_1$ to $S_3$.
LINEAR OPERATORS AND CHANGE OF BASIS
6.52. Consider the linear operator $F$ on $\mathbf{R}^2$ defined by $F(x, y) = (5x + y,\ 3x - 2y)$ and the following bases of $\mathbf{R}^2$:
$$S = \{(1,2),\ (2,3)\} \quad\text{and}\quad S' = \{(1,3),\ (1,4)\}$$
(a) Find the matrix $A$ representing $F$ relative to the basis $S$.
(b) Find the matrix $B$ representing $F$ relative to the basis $S'$.
(c) Find the change-of-basis matrix $P$ from $S$ to $S'$.
(d) How are $A$ and $B$ related?
6.53. Let $A: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by the matrix $A = \begin{bmatrix} 1 & -1 \\ 3 & 2 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to each of the following bases:
(a) $S = \{(1,3)^T,\ (2,5)^T\}$.
(b) $S = \{(1,3)^T,\ (2,4)^T\}$.
6.54. Let $F: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $F(x, y) = (x - 3y,\ 2x - 4y)$. Find the matrix $A$ that represents $F$ relative to each of the following bases: (a) $S = \{(2,5),\ (3,7)\}$. (b) $S = \{(2,3),\ (4,5)\}$.
6.55. Let $A: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by the matrix $A = \begin{bmatrix} 1 & 3 & 1 \\ 2 & 7 & 4 \\ 1 & 4 & 3 \end{bmatrix}$. Find the matrix $B$ that represents the linear operator $A$ relative to the basis $S = \{(1,1,1)^T,\ (0,1,1)^T,\ (1,2,3)^T\}$.
SIMILARITY OF MATRICES
6.56. Let $A = \begin{bmatrix} 1 & 1 \\ 2 & -3 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & -2 \\ 3 & -5 \end{bmatrix}$.
(a) Find $B = P^{-1}AP$. (b) Verify that $\operatorname{tr}(B) = \operatorname{tr}(A)$. (c) Verify that $\det(B) = \det(A)$.
6.57. Find the trace and determinant of each of the following linear maps on $\mathbf{R}^2$:
(a) $F(x, y) = (2x - 3y,\ 5x + 4y)$. (b) $G(x, y) = (ax + by,\ cx + dy)$.
6.58. Find the trace of each of the following linear maps on $\mathbf{R}^3$:
(a) $F(x, y, z) = (x + 3y,\ 3x - 2z,\ x - 4y - 3z)$.
(b) $G(x, y, z) = (y + 3z,\ 2x - 4z,\ 5x + 7y)$.
6.59. Suppose $S = \{u_1, u_2\}$ is a basis of $V$, and $T: V \to V$ is defined by $T(u_1) = 3u_1 - 2u_2$ and $T(u_2) = u_1 + 4u_2$. Suppose $S' = \{w_1, w_2\}$ is a basis of $V$ for which $w_1 = u_1 + u_2$ and $w_2 = 2u_1 + 3u_2$.
(a) Find the matrices $A$ and $B$ representing $T$ relative to the bases $S$ and $S'$, respectively.
(b) Find the matrix $P$ such that $B = P^{-1}AP$.
6.60. Let $A$ be a $2 \times 2$ matrix such that only $A$ is similar to itself. Show that $A$ is a scalar matrix, that is, that $A = \begin{bmatrix} a & 0 \\ 0 & a \end{bmatrix}$.
6.61. Show that all matrices similar to an invertible matrix are invertible. More generally, show that similar matrices have the same rank.
MATRIX REPRESENTATION OF GENERAL LINEAR MAPPINGS
6.62. Find the matrix representation of each of the following linear maps relative to the usual basis for $\mathbf{R}^n$:
(a) $F: \mathbf{R}^3 \to \mathbf{R}^2$ defined by $F(x, y, z) = (2x - 4y + 9z,\ 5x + 3y - 2z)$.
(b) $F: \mathbf{R}^2 \to \mathbf{R}^4$ defined by $F(x, y) = (3x + 4y,\ 5x - 2y,\ x + 7y,\ 4x)$.
(c) $F: \mathbf{R}^4 \to \mathbf{R}$ defined by $F(x_1, x_2, x_3, x_4) = 2x_1 + x_2 - 7x_3 - x_4$.
6.63. Let $G: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $G(x, y, z) = (2x + 3y - z,\ 4x - y + 2z)$.
(a) Find the matrix $A$ representing $G$ relative to the bases
$$S = \{(1,1,0),\ (1,2,3),\ (1,3,5)\} \quad\text{and}\quad S' = \{(1,2),\ (2,3)\}$$
(b) For any $v = (a, b, c)$ in $\mathbf{R}^3$, find $[v]_S$ and $[G(v)]_{S'}$. (c) Verify that $A[v]_S = [G(v)]_{S'}$.
6.64. Let $H: \mathbf{R}^2 \to \mathbf{R}^2$ be defined by $H(x, y) = (2x + 7y,\ x - 3y)$ and consider the following bases of $\mathbf{R}^2$:
$$S = \{(1,1),\ (1,2)\} \quad\text{and}\quad S' = \{(1,4),\ (1,5)\}$$
(a) Find the matrix $A$ representing $H$ relative to the bases $S$ and $S'$.
(b) Find the matrix $B$ representing $H$ relative to the bases $S'$ and $S$.
6.65. Let $F: \mathbf{R}^3 \to \mathbf{R}^2$ be defined by $F(x, y, z) = (2x + y - z,\ 3x - 2y + 4z)$.
(a) Find the matrix $A$ representing $F$ relative to the bases
$$S = \{(1,1,1),\ (1,1,0),\ (1,0,0)\} \quad\text{and}\quad S' = \{(1,3),\ (1,4)\}$$
(b) Verify that, for any $v = (a, b, c)$ in $\mathbf{R}^3$, $A[v]_S = [F(v)]_{S'}$.
6.66. Let $S$ and $S'$ be bases of $V$, and let $1_V$ be the identity mapping on $V$. Show that the matrix $A$ representing $1_V$ relative to the bases $S$ and $S'$ is the inverse of the change-of-basis matrix $P$ from $S$ to $S'$, that is, $A = P^{-1}$.
6.67. Prove: (a) Theorem 6.10, (b) Theorem 6.11, (c) Theorem 6.12, (d) Theorem 6.13. [Hint: See the proofs of the analogous Theorems 6.1 (Problem 6.9), 6.2 (Problem 6.10), 6.3 (Problem 6.11), and 6.7 (Problem 6.26).]
MISCELLANEOUS PROBLEMS
6.68. Suppose $F: V \to V$ is linear. A subspace $W$ of $V$ is said to be invariant under $F$ if $F(W) \subseteq W$. Suppose $W$ is invariant under $F$ and $\dim W = r$. Show that $F$ has a block triangular matrix representation $M = \begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is an $r \times r$ submatrix.
6.69. Suppose $V = U + W$, and suppose $U$ and $W$ are each invariant under a linear operator $F: V \to V$. Also, suppose $\dim U = r$ and $\dim W = s$. Show that $F$ has a block diagonal matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are $r \times r$ and $s \times s$ submatrices.
6.70. Two linear operators $F$ and $G$ on $V$ are said to be similar if there exists an invertible linear operator $T$ on $V$ such that $G = T^{-1} \circ F \circ T$. Prove:
(a) $F$ and $G$ are similar if and only if, for any basis $S$ of $V$, $[F]_S$ and $[G]_S$ are similar matrices.
(b) If $F$ is diagonalizable (similar to a diagonal matrix), then any similar matrix $G$ is also diagonalizable.
Answers to Supplementary Problems
Notation: $M = [R_1;\ R_2;\ \ldots]$ represents a matrix $M$ with rows $R_1, R_2, \ldots$.
6.37. (a) $A = [4, 5;\ 2, -1]$ (b) $B = [220, 487;\ -98, -217]$ (c) $P = [1, 2;\ 4, 9]$
(d) $[v]_S = [9a - 2b,\ -4a + b]^T$ and $[F(v)]_S = [32a + 47b,\ -14a - 21b]^T$
6.38. (a) $B = [-6, -28;\ 4, 15]$
(b) $[v]_S = [4a - b,\ -\tfrac{3}{2}a + \tfrac{1}{2}b]^T$ and $[A(v)]_S = [18a - 8b,\ \tfrac{1}{2}(-13a + 7b)]^T$
6.39. (a) $[\tfrac{1}{2}\sqrt{2}, -\tfrac{1}{2}\sqrt{2};\ \tfrac{1}{2}\sqrt{2}, \tfrac{1}{2}\sqrt{2}]$ (b) $[0, 1;\ 1, 0]$ (c) $[3, 7;\ 5, -2]$
(d) $[1, 2;\ 18, -11]$
6.40. (a) $[1, 0, 0;\ 0, 1, 0;\ 0, 0, 0]$ (b) $[0, 0, 1;\ 0, 1, 1;\ 1, 1, 1]$
(c) $[2, -7, -4;\ 3, 1, 4;\ 6, -8, 1]$
6.41. (a) $[1, 3, 5;\ 0, -5, -10;\ 0, 3, 6]$ (b) $[0, 1, 2;\ -1, 2, 3;\ 1, 0, 0]$
(c) $[15, 65, 104;\ -49, -219, -351;\ 29, 130, 208]$
6.42. (a) $[1, 1, 2;\ 1, 3, 2;\ 1, 5, 2]$ (b) $[0, 0, 0;\ 2, 14, 22;\ 0, -5, -8]$
6.43. (a) $[1, 0, 0;\ 0, 2, 1;\ 0, 0, 2]$ (b) $[0, 1, 0, 0;\ 0, 0, 0, 0;\ 0, 0, 0, -3;\ 0, 0, 3, 0]$
(c) $[5, 1, 0;\ 0, 5, 2;\ 0, 0, 5]$
6.44. (a) $A = [0, -1;\ 1, 0]$ (b) $A^2 + I = 0$
6.45. (a) $[a, 0, b, 0;\ 0, a, 0, b;\ c, 0, d, 0;\ 0, c, 0, d]$
(b) $[a, c, 0, 0;\ b, d, 0, 0;\ 0, 0, a, c;\ 0, 0, b, d]$
(c) $[0, -c, b, 0;\ -b, a - d, 0, b;\ c, 0, d - a, -c;\ 0, c, -b, 0]$
6.47. (a) $[1, 3;\ 2, 5]$, $[-5, 3;\ 2, -1]$, $[v] = [-5a + 3b,\ 2a - b]^T$
(b) $[1, 3;\ -3, -8]$, $[-8, -3;\ 3, 1]$, $[v] = [-8a - 3b,\ 3a + b]^T$
(c) $[2, 3;\ 5, 7]$, $[-7, 3;\ 5, -2]$, $[v] = [-7a + 3b,\ 5a - 2b]^T$
(d) $[2, 4;\ 3, 5]$, $[-\tfrac{5}{2}, 2;\ \tfrac{3}{2}, -1]$, $[v] = [-\tfrac{5}{2}a + 2b,\ \tfrac{3}{2}a - b]^T$
6.48. (a) $P = [3, 5;\ -1, -2]$ (b) $Q = [2, 5;\ -1, -3]$
6.49. (a) $(\tfrac{1}{2}\sqrt{3},\ \tfrac{1}{2})$, $(-\tfrac{1}{2},\ \tfrac{1}{2}\sqrt{3})$ (b) $P = [\tfrac{1}{2}\sqrt{3}, -\tfrac{1}{2};\ \tfrac{1}{2}, \tfrac{1}{2}\sqrt{3}]$
(c) $[A] = P^T[1, 3]^T$, $[B] = P^T[2, -5]^T$, $[C] = P^T[a, b]^T$
6.50. $P$ is the matrix whose columns are $u_1, u_2, u_3$; $Q = P^{-1}$; $[v] = Q[a, b, c]^T$.
(a) $Q = [1, 0, 0;\ 1, -1, 1;\ -2, 2, -1]$, $[v] = [a,\ a - b + c,\ -2a + 2b - c]^T$
(b) $Q = [0, -2, 1;\ 2, 3, -2;\ -1, -1, 1]$, $[v] = [-2b + c,\ 2a + 3b - 2c,\ -a - b + c]^T$
(c) $Q = [-2, 2, -1;\ -7, 4, -1;\ 5, -3, 1]$, $[v] = [-2a + 2b - c,\ -7a + 4b - c,\ 5a - 3b + c]^T$
6.52. (a) $[-23, -39;\ 15, 26]$ (b) $[35, 41;\ -27, -32]$ (c) $[3, 5;\ -1, -2]$ (d) $B = P^{-1}AP$
6.53. (a) $[28, 47;\ -15, -25]$ (b) $[13, 18;\ -\tfrac{15}{2}, -10]$
6.54. (a) $[43, 60;\ -33, -46]$ (b) $[\tfrac{3}{2}, \tfrac{7}{2};\ -\tfrac{5}{2}, -\tfrac{9}{2}]$
6.55. $[10, 8, 20;\ 13, 11, 28;\ -5, -4, -10]$
6.56. (a) $[-34, 57;\ -19, 32]$ (b) $\operatorname{tr}(B) = \operatorname{tr}(A) = -2$ (c) $\det(B) = \det(A) = -5$
6.57. (a) $\operatorname{tr}(F) = 6$, $\det(F) = 23$ (b) $\operatorname{tr}(G) = a + d$, $\det(G) = ad - bc$
6.58. (a) $\operatorname{tr}(F) = -2$ (b) $\operatorname{tr}(G) = 0$
6.59. (a) $A = [3, 1;\ -2, 4]$, $B = [8, 11;\ -2, -1]$ (b) $P = [1, 2;\ 1, 3]$
6.62. (a) $[2, -4, 9;\ 5, 3, -2]$ (b) $[3, 4;\ 5, -2;\ 1, 7;\ 4, 0]$ (c) $[2, 1, -7, -1]$
6.63. (a) $[-9, 1, 4;\ 7, 2, 1]$ (b) $[v]_S = [-a + 2b - c,\ 5a - 5b + 2c,\ -3a + 3b - c]^T$ and $[G(v)]_{S'} = [2a - 11b + 7c,\ 7b - 4c]^T$
6.64. (a) $A = [47, 85;\ -38, -69]$ (b) $B = [71, 88;\ -41, -51]$
6.65. $A = [3, 11, 5;\ -1, -8, -3]$
CHAPTER 7
Inner Product Spaces,
Orthogonality
7.1 INTRODUCTION
The definition of a vector space V involves an arbitrary field K. Here we first restrict K to be the real
field R, in which case V is called a real vector space; in the last sections of this chapter, we extend our
results to the case where K is the complex field C, in which case V is called a complex vector space. Also,
we adopt the previous notation that
u,v,w are vectors in V
a, b, c, k are scalars in K
Furthermore, the vector spaces V in this chapter have finite dimension unless otherwise stated or implied.
Recall that the concepts of "length" and "orthogonality" did not appear in the investigation of
arbitrary vector spaces V (although they did appear in Chapter 1 on the spaces $\mathbf{R}^n$ and $\mathbf{C}^n$). Here we place
an additional structure on a vector space V to obtain an inner product space, and in this context these
concepts are defined.
7.2 INNER PRODUCT SPACES
We begin with a definition.
Definition: Let $V$ be a real vector space. Suppose to each pair of vectors $u, v \in V$ there is assigned a real number, denoted by $\langle u, v\rangle$. This function is called a (real) inner product on $V$ if it satisfies the following axioms:
$[I_1]$ (Linear Property): $\langle au_1 + bu_2, v\rangle = a\langle u_1, v\rangle + b\langle u_2, v\rangle$.
$[I_2]$ (Symmetric Property): $\langle u, v\rangle = \langle v, u\rangle$.
$[I_3]$ (Positive Definite Property): $\langle u, u\rangle \ge 0$; and $\langle u, u\rangle = 0$ if and only if $u = 0$.
The vector space $V$ with an inner product is called a (real) inner product space.
Axiom $[I_1]$ states that an inner product function is linear in the first position. Using $[I_1]$ and the symmetry axiom $[I_2]$, we obtain
$$\langle u, cv_1 + dv_2\rangle = \langle cv_1 + dv_2, u\rangle = c\langle v_1, u\rangle + d\langle v_2, u\rangle = c\langle u, v_1\rangle + d\langle u, v_2\rangle$$
That is, the inner product function is also linear in its second position. Combining these two properties and using induction yields the following general formula:
$$\Big\langle \sum_i a_iu_i,\ \sum_j b_jv_j \Big\rangle = \sum_i \sum_j a_ib_j\langle u_i, v_j\rangle$$
That is, an inner product of linear combinations of vectors is equal to a linear combination of the inner products of the vectors.
Example 7.1. Let $V$ be a real inner product space. Then, by linearity,
$$\begin{aligned}
\langle 3u_1 - 4u_2,\ 2v_1 - 5v_2 + 6v_3\rangle &= 6\langle u_1, v_1\rangle - 15\langle u_1, v_2\rangle + 18\langle u_1, v_3\rangle \\
&\quad - 8\langle u_2, v_1\rangle + 20\langle u_2, v_2\rangle - 24\langle u_2, v_3\rangle \\
\langle 2u - 5v,\ 4u + 6v\rangle &= 8\langle u, u\rangle + 12\langle u, v\rangle - 20\langle v, u\rangle - 30\langle v, v\rangle \\
&= 8\langle u, u\rangle - 8\langle v, u\rangle - 30\langle v, v\rangle
\end{aligned}$$
Observe that in the last equation we have used the symmetry property that $\langle u, v\rangle = \langle v, u\rangle$.
Remark: Axiom $[I_1]$ by itself implies $\langle 0, 0\rangle = \langle 0v, 0\rangle = 0\langle v, 0\rangle = 0$. Thus $[I_1]$, $[I_2]$, $[I_3]$ are equivalent to $[I_1]$, $[I_2]$, and the following axiom:
$[I_3']$ If $u \ne 0$, then $\langle u, u\rangle$ is positive.
That is, a function satisfying $[I_1]$, $[I_2]$, $[I_3']$ is an inner product.
Norm of a Vector
By the third axiom $[I_3]$ of an inner product, $\langle u, u\rangle$ is nonnegative for any vector $u$. Thus its nonnegative square root exists. We use the notation
$$\|u\| = \sqrt{\langle u, u\rangle}$$
This nonnegative number is called the norm or length of $u$. The relation $\|u\|^2 = \langle u, u\rangle$ will be used frequently.
Remark: If $\|u\| = 1$ or, equivalently, if $\langle u, u\rangle = 1$, then $u$ is called a unit vector and is said to be normalized. Every nonzero vector $v$ in $V$ can be multiplied by the reciprocal of its length to obtain the unit vector
$$\hat{v} = \frac{1}{\|v\|}\,v$$
which is a positive multiple of $v$. This process is called normalizing $v$.
7.3 EXAMPLES OF INNER PRODUCT SPACES
This section lists the main examples of inner product spaces used in this text.
Euclidean n-Space $\mathbf{R}^n$
Consider the vector space $\mathbf{R}^n$. The dot product or scalar product in $\mathbf{R}^n$ is defined by
$$u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$
where $u = (a_i)$ and $v = (b_i)$. This function defines an inner product on $\mathbf{R}^n$. The norm $\|u\|$ of the vector $u = (a_i)$ in this space is as follows:
$$\|u\| = \sqrt{u \cdot u} = \sqrt{a_1^2 + a_2^2 + \cdots + a_n^2}$$
On the other hand, by the Pythagorean theorem, the distance from the origin $O$ in $\mathbf{R}^3$ to a point $P(a, b, c)$ is given by $\sqrt{a^2 + b^2 + c^2}$. This is precisely the same as the above-defined norm of the vector $v = (a, b, c)$ in $\mathbf{R}^3$. Since the Pythagorean theorem is a consequence of the axioms of Euclidean geometry, the vector space $\mathbf{R}^n$ with the above inner product and norm is called Euclidean n-space. Although there are many ways to define an inner product on $\mathbf{R}^n$, we shall assume this inner product unless otherwise stated or implied. It is called the usual (or standard) inner product on $\mathbf{R}^n$.
Remark: Frequently the vectors in $\mathbf{R}^n$ will be represented by column vectors, that is, by $n \times 1$ column matrices. In such a case, the formula
$$\langle u, v\rangle = u^Tv$$
defines the usual inner product on $\mathbf{R}^n$.
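Example 7.2. Let $u = (1, 3, -4, 2)$, $v = (4, -2, 2, 1)$, $w = (5, -1, -2, 6)$ in $\mathbf{R}^4$.
(a) By definition,
$$\langle u, w\rangle = 5 - 3 + 8 + 12 = 22 \quad\text{and}\quad \langle v, w\rangle = 20 + 2 - 4 + 6 = 24$$
Note that $3u - 2v = (-5, 13, -16, 4)$. Thus
$$\langle 3u - 2v, w\rangle = -25 - 13 + 32 + 24 = 18$$
As expected, $3\langle u, w\rangle - 2\langle v, w\rangle = 3(22) - 2(24) = 18 = \langle 3u - 2v, w\rangle$.
(b) By definition,
$$\|u\| = \sqrt{1 + 9 + 16 + 4} = \sqrt{30} \quad\text{and}\quad \|v\| = \sqrt{16 + 4 + 4 + 1} = 5$$
We normalize $u$ and $v$ to obtain the following unit vectors in the directions of $u$ and $v$, respectively:
$$\hat{u} = \frac{1}{\|u\|}u = \left(\frac{1}{\sqrt{30}}, \frac{3}{\sqrt{30}}, \frac{-4}{\sqrt{30}}, \frac{2}{\sqrt{30}}\right) \quad\text{and}\quad \hat{v} = \frac{1}{\|v\|}v = \left(\frac{4}{5}, \frac{-2}{5}, \frac{2}{5}, \frac{1}{5}\right)$$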
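The arithmetic of Example 7.2 is easy to reproduce in a few lines. This is a minimal sketch using only the standard library; the helper names `dot`, `norm`, and `normalize` are illustrative choices, not notation from the text.

```python
from math import sqrt, isclose

def dot(u, v):
    """Usual inner product (dot product) on R^n."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm induced by the dot product."""
    return sqrt(dot(u, u))

def normalize(u):
    """Multiply a nonzero vector by the reciprocal of its length."""
    length = norm(u)
    return tuple(a / length for a in u)

u = (1, 3, -4, 2)
v = (4, -2, 2, 1)
w = (5, -1, -2, 6)

combo = tuple(3 * a - 2 * b for a, b in zip(u, v))   # the vector 3u - 2v

# Linearity in the first position: both sides should agree.
left = dot(combo, w)
right = 3 * dot(u, w) - 2 * dot(v, w)

u_hat = normalize(u)
v_hat = normalize(v)
```

Both `left` and `right` evaluate to 18, and `u_hat`, `v_hat` have norm 1 up to floating-point rounding.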
Function Space C[a, b] and Polynomial Space P(t)
The notation $C[a, b]$ is used to denote the vector space of all continuous functions on the closed interval $[a, b]$, that is, where $a \le t \le b$. The following defines an inner product on $C[a, b]$, where $f(t)$ and $g(t)$ are functions in $C[a, b]$:
$$\langle f, g\rangle = \int_a^b f(t)g(t)\,dt$$
It is called the usual inner product on $C[a, b]$.
The vector space $\mathbf{P}(t)$ of all polynomials is a subspace of $C[a, b]$ for any interval $[a, b]$, and hence the above is also an inner product on $\mathbf{P}(t)$.
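Example 7.3. Consider $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space $\mathbf{P}(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$.
We have $f(t)g(t) = 3t^3 - 5t^2$. Hence
$$\langle f, g\rangle = \int_0^1 (3t^3 - 5t^2)\,dt = \left[\tfrac{3}{4}t^4 - \tfrac{5}{3}t^3\right]_0^1 = \tfrac{3}{4} - \tfrac{5}{3} = -\tfrac{11}{12}$$
(b) Find $\|f\|$ and $\|g\|$.
We have $[f(t)]^2 = f(t)f(t) = 9t^2 - 30t + 25$ and $[g(t)]^2 = t^4$. Then
$$\|f\|^2 = \langle f, f\rangle = \int_0^1 (9t^2 - 30t + 25)\,dt = \left[3t^3 - 15t^2 + 25t\right]_0^1 = 13$$
$$\|g\|^2 = \langle g, g\rangle = \int_0^1 t^4\,dt = \left[\tfrac{1}{5}t^5\right]_0^1 = \tfrac{1}{5}$$
Therefore, $\|f\| = \sqrt{13}$ and $\|g\| = \sqrt{\tfrac{1}{5}} = \tfrac{1}{5}\sqrt{5}$.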
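For polynomials, the integral inner product on $[0, 1]$ reduces to exact rational arithmetic, because $\int_0^1 t^k\,dt = 1/(k+1)$. The sketch below assumes polynomials are stored as coefficient lists `[c_0, c_1, ...]` (constant term first); that representation and the names `poly_mul` and `inner` are illustrative choices.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists, constant term first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def inner(p, q):
    """<p, q> = integral of p(t) q(t) over [0, 1], using integral of t^k = 1/(k + 1)."""
    return sum(Fraction(c, k + 1) for k, c in enumerate(poly_mul(p, q)))

f = [-5, 3]       # f(t) = 3t - 5
g = [0, 0, 1]     # g(t) = t^2

fg = inner(f, g)          # <f, g>
f_norm_sq = inner(f, f)   # ||f||^2
g_norm_sq = inner(g, g)   # ||g||^2
```

The values agree with Example 7.3: `fg` is $-11/12$, `f_norm_sq` is 13, and `g_norm_sq` is $1/5$.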
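Matrix Space $\mathbf{M} = \mathbf{M}_{m,n}$
Let $\mathbf{M} = \mathbf{M}_{m,n}$, the vector space of all real $m \times n$ matrices. An inner product is defined on $\mathbf{M}$ by
$$\langle A, B\rangle = \operatorname{tr}(B^TA)$$
where, as usual, $\operatorname{tr}(\ )$ is the trace, i.e., the sum of the diagonal elements. If $A = [a_{ij}]$ and $B = [b_{ij}]$, then
$$\langle A, B\rangle = \operatorname{tr}(B^TA) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}b_{ij} \quad\text{and}\quad \|A\|^2 = \langle A, A\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2$$
That is, $\langle A, B\rangle$ is the sum of the products of the corresponding entries in $A$ and $B$ and, in particular, $\langle A, A\rangle$ is the sum of the squares of the entries of $A$.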
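The identity $\operatorname{tr}(B^TA) = \sum\sum a_{ij}b_{ij}$ can be spot-checked on a small example. The two $2 \times 3$ matrices below are arbitrary sample data, and the helper names are illustrative.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*M)]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def trace(M):
    """Sum of the diagonal entries of a square matrix."""
    return sum(M[i][i] for i in range(len(M)))

def inner(A, B):
    """Inner product <A, B> = tr(B^T A) on m x n matrices."""
    return trace(matmul(transpose(B), A))

A = [[1, 2, 3], [4, 5, 6]]     # arbitrary sample matrices
B = [[1, 0, -1], [2, 1, 0]]

# Sum of the products of corresponding entries.
entrywise = sum(a * b for row_a, row_b in zip(A, B) for a, b in zip(row_a, row_b))
```

For these matrices both `inner(A, B)` and `entrywise` equal 11, and `inner(A, A)` equals 91, the sum of the squares of the entries of `A`.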
Hilbert Space
Let $V$ be the vector space of all infinite sequences of real numbers $(a_1, a_2, a_3, \ldots)$ satisfying
$$\sum_{i=1}^{\infty} a_i^2 = a_1^2 + a_2^2 + \cdots < \infty$$
that is, the sum converges. Addition and scalar multiplication are defined in $V$ componentwise, that is, if
$$u = (a_1, a_2, \ldots) \quad\text{and}\quad v = (b_1, b_2, \ldots)$$
then
$$u + v = (a_1 + b_1,\ a_2 + b_2,\ \ldots) \quad\text{and}\quad ku = (ka_1,\ ka_2,\ \ldots)$$
An inner product is defined in $V$ by
$$\langle u, v\rangle = a_1b_1 + a_2b_2 + \cdots$$
The above sum converges absolutely for any pair of points in $V$. Hence the inner product is well defined. This inner product space is called $l_2$-space or Hilbert space.
7.4 CAUCHY-SCHWARZ INEQUALITY, APPLICATIONS
The following formula (proved in Problem 7.8) is called the Cauchy-Schwarz inequality or Schwarz
inequality. It is used in many branches of mathematics.
Theorem 7.1: (Cauchy-Schwarz) For any vectors $u$ and $v$ in an inner product space $V$,
$$\langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle \quad\text{or}\quad |\langle u, v\rangle| \le \|u\|\,\|v\|$$
Next we examine this inequality in specific cases.
Example 7.4.
(a) Consider any real numbers $a_1, \ldots, a_n, b_1, \ldots, b_n$. Then, by the Cauchy-Schwarz inequality,
$$(a_1b_1 + a_2b_2 + \cdots + a_nb_n)^2 \le (a_1^2 + \cdots + a_n^2)(b_1^2 + \cdots + b_n^2)$$
That is, $(u \cdot v)^2 \le \|u\|^2\|v\|^2$, where $u = (a_i)$ and $v = (b_i)$.
(b) Let $f$ and $g$ be continuous functions on the unit interval $[0, 1]$. Then, by the Cauchy-Schwarz inequality,
$$\left[\int_0^1 f(t)g(t)\,dt\right]^2 \le \int_0^1 f^2(t)\,dt \int_0^1 g^2(t)\,dt$$
That is, $(\langle f, g\rangle)^2 \le \|f\|^2\|g\|^2$. Here $V$ is the inner product space $C[0, 1]$.
The next theorem (proved in Problem 7.9) gives the basic properties of a norm. The proof of the third
property requires the Cauchy-Schwarz inequality.
Theorem 7.2: Let $V$ be an inner product space. Then the norm in $V$ satisfies the following properties:
$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
$[N_2]$ $\|kv\| = |k|\,\|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.
The property $[N_3]$ is called the triangle inequality, because if we view $u + v$ as the side of the triangle formed with sides $u$ and $v$ (as shown in Fig. 7-1), then $[N_3]$ states that the length of one side of a triangle cannot be greater than the sum of the lengths of the other two sides.
[Fig. 7-1: Triangle inequality, showing the vectors $u$, $v$, and $u + v$]
Angle Between Vectors
For any nonzero vectors $u$ and $v$ in an inner product space $V$, the angle between $u$ and $v$ is defined to be the angle $\theta$ such that $0 \le \theta \le \pi$ and
$$\cos\theta = \frac{\langle u, v\rangle}{\|u\|\,\|v\|}$$
By the Cauchy-Schwarz inequality, $-1 \le \cos\theta \le 1$, and so the angle exists and is unique.
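Example 7.5.
(a) Consider vectors $u = (2, 3, 5)$ and $v = (1, -4, 3)$ in $\mathbf{R}^3$. Then
$$\langle u, v\rangle = 2 - 12 + 15 = 5, \qquad \|u\| = \sqrt{4 + 9 + 25} = \sqrt{38}, \qquad \|v\| = \sqrt{1 + 16 + 9} = \sqrt{26}$$
Then the angle $\theta$ between $u$ and $v$ is given by
$$\cos\theta = \frac{5}{\sqrt{38}\sqrt{26}}$$
Note that $\theta$ is an acute angle, since $\cos\theta$ is positive.
(b) Let $f(t) = 3t - 5$ and $g(t) = t^2$ in the polynomial space $\mathbf{P}(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. By Example 7.3,
$$\langle f, g\rangle = -\tfrac{11}{12}, \qquad \|f\| = \sqrt{13}, \qquad \|g\| = \tfrac{1}{5}\sqrt{5}$$
Then the "angle" $\theta$ between $f$ and $g$ is given by
$$\cos\theta = \frac{-\tfrac{11}{12}}{(\sqrt{13})(\tfrac{1}{5}\sqrt{5})} = -\frac{55}{12\sqrt{13}\sqrt{5}}$$
Note that $\theta$ is an obtuse angle, since $\cos\theta$ is negative.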
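Part (a) of Example 7.5 can be reproduced numerically, together with a check of the Cauchy-Schwarz inequality that guarantees the cosine lies in $[-1, 1]$. This is a minimal sketch; the helper names `dot` and `cos_angle` are illustrative, and the angle in degrees is computed only for orientation.

```python
from math import sqrt, acos, degrees, isclose

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def cos_angle(u, v):
    """cos(theta) = <u, v> / (||u|| ||v||) for nonzero vectors u and v."""
    return dot(u, v) / (sqrt(dot(u, u)) * sqrt(dot(v, v)))

u = (2, 3, 5)
v = (1, -4, 3)

c = cos_angle(u, v)
theta = degrees(acos(c))     # angle in degrees, between 0 and 180

# Cauchy-Schwarz in squared form: <u, v>^2 <= <u, u><v, v>
cauchy_schwarz_holds = dot(u, v) ** 2 <= dot(u, u) * dot(v, v)
```

Since `c` is positive, `theta` falls strictly between 0 and 90 degrees, matching the remark that the angle is acute.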
7.5 ORTHOGONALITY
Let $V$ be an inner product space. The vectors $u, v \in V$ are said to be orthogonal and $u$ is said to be orthogonal to $v$ if
$$\langle u, v\rangle = 0$$
The relation is clearly symmetric, that is, if $u$ is orthogonal to $v$, then $\langle v, u\rangle = 0$, and so $v$ is orthogonal to $u$. We note that $0 \in V$ is orthogonal to every $v \in V$, since
$$\langle 0, v\rangle = \langle 0v, v\rangle = 0\langle v, v\rangle = 0$$
Conversely, if $u$ is orthogonal to every $v \in V$, then $\langle u, u\rangle = 0$ and hence $u = 0$ by $[I_3]$. Observe that $u$ and $v$ are orthogonal if and only if $\cos\theta = 0$, where $\theta$ is the angle between $u$ and $v$. Also, this is true if and only if $u$ and $v$ are "perpendicular", i.e., $\theta = \pi/2$ (or $\theta = 90°$).
Example 7.6.
(a) Consider the vectors $u = (1, 1, 1)$, $v = (1, 2, -3)$, $w = (1, -4, 3)$ in $\mathbf{R}^3$. Then
$$\langle u, v\rangle = 1 + 2 - 3 = 0, \qquad \langle u, w\rangle = 1 - 4 + 3 = 0, \qquad \langle v, w\rangle = 1 - 8 - 9 = -16$$
Thus $u$ is orthogonal to $v$ and $w$, but $v$ and $w$ are not orthogonal.
(b) Consider the functions $\sin t$ and $\cos t$ in the vector space $C[-\pi, \pi]$ of continuous functions on the closed interval $[-\pi, \pi]$. Then
$$\langle \sin t, \cos t\rangle = \int_{-\pi}^{\pi} \sin t\cos t\,dt = \tfrac{1}{2}\sin^2 t\,\Big|_{-\pi}^{\pi} = 0 - 0 = 0$$
Thus $\sin t$ and $\cos t$ are orthogonal functions in the vector space $C[-\pi, \pi]$.
Remark: A vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to $u = (a_1, a_2, \ldots, a_n)$ in $\mathbf{R}^n$ if
$$\langle u, w\rangle = a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0$$
That is, $w$ is orthogonal to $u$ if $w$ satisfies a homogeneous equation whose coefficients are the elements of $u$.
Example 7.7. Find a nonzero vector $w$ that is orthogonal to $u_1 = (1, 2, 1)$ and $u_2 = (2, 5, 4)$ in $\mathbf{R}^3$.
Let $w = (x, y, z)$. Then we want $\langle u_1, w\rangle = 0$ and $\langle u_2, w\rangle = 0$. This yields the homogeneous system
$$\begin{aligned} x + 2y + z &= 0 \\ 2x + 5y + 4z &= 0 \end{aligned} \qquad\text{or}\qquad \begin{aligned} x + 2y + z &= 0 \\ y + 2z &= 0 \end{aligned}$$
Here $z$ is the only free variable in the echelon system. Set $z = 1$ to obtain $y = -2$ and $x = 3$. Thus, $w = (3, -2, 1)$ is a desired nonzero vector orthogonal to $u_1$ and $u_2$.
Any multiple of $w$ will also be orthogonal to $u_1$ and $u_2$. Normalizing $w$, we obtain the following unit vector orthogonal to $u_1$ and $u_2$:
$$\hat{w} = \frac{w}{\|w\|} = \left(\frac{3}{\sqrt{14}},\ -\frac{2}{\sqrt{14}},\ \frac{1}{\sqrt{14}}\right)$$
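The vector found in Example 7.7 can be cross-checked by a different route. In $\mathbf{R}^3$ specifically, the cross product of two vectors is orthogonal to both; this is not the method used in the example (which solves a homogeneous system), only an independent check that happens to be available in three dimensions. The helper names are illustrative.

```python
from math import sqrt, isclose

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    """Cross product in R^3; the result is orthogonal to both u and v."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

u1 = (1, 2, 1)
u2 = (2, 5, 4)

w = cross(u1, u2)                       # candidate vector orthogonal to u1 and u2
length = sqrt(dot(w, w))
w_hat = tuple(c / length for c in w)    # normalized version of w
```

For these inputs the cross product coincides with the vector $w = (3, -2, 1)$ obtained from the echelon system, and its squared length is 14.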
Orthogonal Complements
Let $S$ be a subset of an inner product space $V$. The orthogonal complement of $S$, denoted by $S^\perp$ (read "S perp"), consists of those vectors in $V$ that are orthogonal to every vector $u \in S$; that is,
$$S^\perp = \{v \in V : \langle v, u\rangle = 0 \text{ for every } u \in S\}$$
In particular, for a given vector $u$ in $V$, we have
$$u^\perp = \{v \in V : \langle v, u\rangle = 0\}$$
that is, $u^\perp$ consists of all vectors in $V$ that are orthogonal to the given vector $u$.
We show that $S^\perp$ is a subspace of $V$. Clearly $0 \in S^\perp$, since 0 is orthogonal to every vector in $V$. Now suppose $v, w \in S^\perp$. Then, for any scalars $a$ and $b$ and any vector $u \in S$, we have
$$\langle av + bw, u\rangle = a\langle v, u\rangle + b\langle w, u\rangle = a \cdot 0 + b \cdot 0 = 0$$
Thus $av + bw \in S^\perp$, and therefore $S^\perp$ is a subspace of $V$.
We state this result formally.
Proposition 7.3: Let $S$ be a subset of a vector space $V$. Then $S^\perp$ is a subspace of $V$.
Remark 1: Suppose $u$ is a nonzero vector in $\mathbf{R}^3$. Then there is a geometrical description of $u^\perp$. Specifically, $u^\perp$ is the plane in $\mathbf{R}^3$ through the origin $O$ and perpendicular to the vector $u$. This is shown in Fig. 7-2.
[Fig. 7-2: Orthogonal complement $u^\perp$]
Remark 2: Let $W$ be the solution space of an $m \times n$ homogeneous system $AX = 0$, where $A = [a_{ij}]$ and $X = [x_j]$. Recall that $W$ may be viewed as the kernel of the linear mapping $A: \mathbf{R}^n \to \mathbf{R}^m$. Now we can give another interpretation of $W$ using the notion of orthogonality. Specifically, each solution vector $w = (x_1, x_2, \ldots, x_n)$ is orthogonal to each row of $A$; and hence $W$ is the orthogonal complement of the row space of $A$.
Example 7.8. Find a basis for the subspace $u^\perp$ of $\mathbf{R}^3$, where $u = (1, 3, -4)$.
Note that $u^\perp$ consists of all vectors $w = (x, y, z)$ such that $\langle u, w\rangle = 0$, or $x + 3y - 4z = 0$. The free variables are $y$ and $z$.
(1) Set $y = 1$, $z = 0$ to obtain the solution $w_1 = (-3, 1, 0)$.
(2) Set $y = 0$, $z = 1$ to obtain the solution $w_2 = (4, 0, 1)$.
The vectors $w_1$ and $w_2$ form a basis for the solution space of the equation, and hence a basis for $u^\perp$.
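The free-variable procedure of Example 7.8 translates directly into code: solve $x + 3y - 4z = 0$ for $x$ and substitute values for the free variables. In this sketch the function name `solution` and the sample values `(2, 5)` are illustrative choices.

```python
def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

u = (1, 3, -4)

def solution(y, z):
    """Solve x + 3y - 4z = 0 for x, given values of the free variables y and z."""
    return (-3 * y + 4 * z, y, z)

w1 = solution(1, 0)   # set y = 1, z = 0
w2 = solution(0, 1)   # set y = 0, z = 1

# Any vector of the orthogonal complement is y*w1 + z*w2 for its own y and z.
sample = solution(2, 5)
rebuilt = tuple(2 * a + 5 * b for a, b in zip(w1, w2))
```

Because the second and third coordinates of `w1` and `w2` are `(1, 0)` and `(0, 1)`, the two vectors are linearly independent, and `rebuilt` reproduces `sample`.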
Suppose $W$ is a subspace of $V$. Then both $W$ and $W^\perp$ are subspaces of $V$. The next theorem, whose proof (Problem 7.28) requires results of later sections, is a basic result in linear algebra.
Theorem 7.4: Let $W$ be a subspace of $V$. Then $V$ is the direct sum of $W$ and $W^\perp$, that is, $V = W \oplus W^\perp$.
7.6 ORTHOGONAL SETS AND BASES
Consider a set S = {ux,U2, ..., w^} of nonzero vectors in an inner product space K ^ is called
orthogonal if each pair of vectors in S are orthogonal, and S is called orthonormal if S is orthogonal and
each vector in S has unit length. That is:
(i) Orthogonal: (w^ Uj) = 0 for / ^j
0 for i^j
1 for / =j
(ii) Orthonormal: (w^ Uj) =
Normalizing din orthogonal set S refers to the process of multiplying each vector in S by the reciprocal of
its length in order to transform S into an orthonormal set of vectors.
The following theorems apply.
Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly independent.
Theorem 7.6: (Pythagoras) Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then
\[ \|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2 \]
These theorems are proved in Problems 7.15 and 7.16, respectively. Here we prove the Pythagorean theorem in the special and familiar case of two vectors. Specifically, suppose $\langle u, v\rangle = 0$. Then
\[ \|u + v\|^2 = \langle u + v, u + v\rangle = \langle u, u\rangle + 2\langle u, v\rangle + \langle v, v\rangle = \langle u, u\rangle + \langle v, v\rangle = \|u\|^2 + \|v\|^2 \]
which gives our result.
Example 7.9
(a) Let $E = \{e_1, e_2, e_3\} = \{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$ be the usual basis of Euclidean space $\mathbf{R}^3$. It is clear that
\[ \langle e_1, e_2\rangle = \langle e_1, e_3\rangle = \langle e_2, e_3\rangle = 0 \quad\text{and}\quad \langle e_1, e_1\rangle = \langle e_2, e_2\rangle = \langle e_3, e_3\rangle = 1 \]
Namely, $E$ is an orthonormal basis of $\mathbf{R}^3$. More generally, the usual basis of $\mathbf{R}^n$ is orthonormal for every $n$.
(b) Let $V = C[-\pi, \pi]$ be the vector space of continuous functions on the interval $-\pi \le t \le \pi$ with inner product defined by $\langle f, g\rangle = \int_{-\pi}^{\pi} f(t)g(t)\,dt$. Then the following is a classical example of an orthogonal set in $V$:
\[ \{1, \cos t, \cos 2t, \cos 3t, \ldots, \sin t, \sin 2t, \sin 3t, \ldots\} \]
This orthogonal set plays a fundamental role in the theory of Fourier series.
Orthogonal Basis and Linear Combinations, Fourier Coefficients
Let $S$ consist of the following three vectors in $\mathbf{R}^3$:
\[ u_1 = (1, 2, 1), \quad u_2 = (2, 1, -4), \quad u_3 = (3, -2, 1) \]
The reader can verify that the vectors are orthogonal; hence they are linearly independent. Thus $S$ is an orthogonal basis of $\mathbf{R}^3$.
Suppose we want to write $v = (7, 1, 9)$ as a linear combination of $u_1, u_2, u_3$. First we set $v$ as a linear combination of $u_1, u_2, u_3$ using unknowns $x_1, x_2, x_3$ as follows:
\[ v = x_1u_1 + x_2u_2 + x_3u_3 \quad\text{or}\quad (7, 1, 9) = x_1(1, 2, 1) + x_2(2, 1, -4) + x_3(3, -2, 1) \qquad (*) \]
We can proceed in two ways.
Method 1: Expand (*) (as in Chapter 3) to obtain the system
\[ x_1 + 2x_2 + 3x_3 = 7, \quad 2x_1 + x_2 - 2x_3 = 1, \quad x_1 - 4x_2 + x_3 = 9 \]
Solve the system by Gaussian elimination to obtain $x_1 = 3$, $x_2 = -1$, $x_3 = 2$. Thus $v = 3u_1 - u_2 + 2u_3$.
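Method 2: (This method uses the fact that the basis vectors are orthogonal, and the arithmetic is much simpler.) If we take the inner product of each side of (*) with respect to $u_i$, we get
\[ \langle v, u_i\rangle = \langle x_1u_1 + x_2u_2 + x_3u_3, u_i\rangle \quad\text{or}\quad \langle v, u_i\rangle = x_i\langle u_i, u_i\rangle \quad\text{or}\quad x_i = \frac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle} \]
Here two terms drop out, since $u_1, u_2, u_3$ are orthogonal. Accordingly,
\[ x_1 = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \frac{7 + 2 + 9}{1 + 4 + 1} = \frac{18}{6} = 3, \qquad x_2 = \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \frac{14 + 1 - 36}{4 + 1 + 16} = \frac{-21}{21} = -1 \]
\[ x_3 = \frac{\langle v, u_3\rangle}{\langle u_3, u_3\rangle} = \frac{21 - 2 + 9}{9 + 4 + 1} = \frac{28}{14} = 2 \]
Thus, again, we get $v = 3u_1 - u_2 + 2u_3$.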
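Method 2 is easy to check by machine. The following is a minimal sketch, assuming the usual inner product on R^3; the helper names `dot` and `fourier_coefficients` are illustrative and do not come from the text.

```python
from fractions import Fraction

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def fourier_coefficients(v, basis):
    """x_i = <v, u_i> / <u_i, u_i>; valid only when `basis` is orthogonal."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

u1, u2, u3 = (1, 2, 1), (2, 1, -4), (3, -2, 1)
v = (7, 1, 9)
coeffs = fourier_coefficients(v, [u1, u2, u3])
print(coeffs)
```

Exact fractions are used so that the coefficients can be compared directly with the hand computation.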
The procedure in Method 2 is true in general. Namely, we have the following theorem (proved in
Problem 7.17).
Theorem 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of $V$. Then, for any $v \in V$,
\[ v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n \]
Remark: The scalar $k_i = \dfrac{\langle v, u_i\rangle}{\langle u_i, u_i\rangle}$ is called the Fourier coefficient of $v$ with respect to $u_i$, since it is analogous to a coefficient in the Fourier series of a function. This scalar also has a geometric interpretation, which is discussed below.
Projections
Let $V$ be an inner product space. Suppose $w$ is a given nonzero vector in $V$, and suppose $v$ is another vector. We seek the "projection of $v$ along $w$", which, as indicated in Fig. 7-3(a), will be the multiple $cw$ of $w$ such that $v' = v - cw$ is orthogonal to $w$. This means
\[ \langle v - cw, w\rangle = 0 \quad\text{or}\quad \langle v, w\rangle - c\langle w, w\rangle = 0 \quad\text{or}\quad c = \frac{\langle v, w\rangle}{\langle w, w\rangle} \]
[Fig. 7-3]
Accordingly, the projection of $v$ along $w$ is denoted and defined by
\[ \operatorname{proj}(v, w) = cw = \frac{\langle v, w\rangle}{\langle w, w\rangle}w \]
Such a scalar $c$ is unique, and it is called the Fourier coefficient of $v$ with respect to $w$ or the component of $v$ along $w$.
The above notion is generalized as follows (see Problem 7.25).
Theorem 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in $V$. Let $v$ be any vector in $V$. Define
\[ v' = v - (c_1w_1 + c_2w_2 + \cdots + c_rw_r) \]
where
\[ c_1 = \frac{\langle v, w_1\rangle}{\langle w_1, w_1\rangle}, \quad c_2 = \frac{\langle v, w_2\rangle}{\langle w_2, w_2\rangle}, \quad \ldots, \quad c_r = \frac{\langle v, w_r\rangle}{\langle w_r, w_r\rangle} \]
Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.
Note that each $c_i$ in the above theorem is the component (Fourier coefficient) of $v$ along the given $w_i$.
Remark: The notion of the projection of a vector $v \in V$ along a subspace $W$ of $V$ is defined as follows. By Theorem 7.4, $V = W \oplus W^\perp$. Hence $v$ may be expressed uniquely in the form
\[ v = w + w', \quad\text{where } w \in W \text{ and } w' \in W^\perp \]
We define $w$ to be the projection of $v$ along $W$, and denote it by $\operatorname{proj}(v, W)$, as pictured in Fig. 7-2(b). In particular, if $W = \operatorname{span}(w_1, w_2, \ldots, w_r)$, where the $w_i$ form an orthogonal set, then
\[ \operatorname{proj}(v, W) = c_1w_1 + c_2w_2 + \cdots + c_rw_r \]
Here $c_i$ is the component of $v$ along $w_i$, as above.
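The projection formula and the orthogonality claim of Theorem 7.8 can be illustrated numerically. The sketch below assumes the usual inner product on R^n and an orthogonal spanning set; the function name `proj_onto_span` and the sample vectors are illustrative choices, not part of the text.

```python
from fractions import Fraction

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def proj_onto_span(v, orthogonal_set):
    """proj(v, W) = c_1 w_1 + ... + c_r w_r with c_i = <v, w_i> / <w_i, w_i>.

    Assumes the w_i are nonzero and mutually orthogonal.
    """
    result = [Fraction(0)] * len(v)
    for w in orthogonal_set:
        c = Fraction(dot(v, w), dot(w, w))  # component of v along w
        result = [r + c * wi for r, wi in zip(result, w)]
    return result

w1, w2 = (1, 2, 1), (2, 1, -4)        # an orthogonal pair in R^3
v = (7, 1, 9)
p = proj_onto_span(v, [w1, w2])       # the part of v lying in W
v_prime = [a - b for a, b in zip(v, p)]  # v' = v - proj(v, W)
print(p, v_prime)
```

The residual `v_prime` plays the role of $v'$ in Theorem 7.8: it should be orthogonal to every vector of the spanning set.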
7.7 GRAM-SCHMIDT ORTHOGONALIZATION PROCESS
Suppose $\{v_1, v_2, \ldots, v_n\}$ is a basis of an inner product space $V$. One can use this basis to construct an orthogonal basis $\{w_1, w_2, \ldots, w_n\}$ of $V$ as follows. Set
\[ \begin{aligned}
w_1 &= v_1 \\
w_2 &= v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 \\
w_3 &= v_3 - \frac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 \\
&\;\;\vdots \\
w_n &= v_n - \frac{\langle v_n, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_n, w_2\rangle}{\langle w_2, w_2\rangle}w_2 - \cdots - \frac{\langle v_n, w_{n-1}\rangle}{\langle w_{n-1}, w_{n-1}\rangle}w_{n-1}
\end{aligned} \]
In other words, for $k = 2, 3, \ldots, n$, we define
\[ w_k = v_k - c_{k1}w_1 - c_{k2}w_2 - \cdots - c_{k,k-1}w_{k-1} \]
where $c_{ki} = \langle v_k, w_i\rangle / \langle w_i, w_i\rangle$ is the component of $v_k$ along $w_i$. By Theorem 7.8, each $w_k$ is orthogonal to the preceding $w$'s. Thus $w_1, w_2, \ldots, w_n$ form an orthogonal basis for $V$ as claimed. Normalizing each $w_i$ will then yield an orthonormal basis for $V$.
The above construction is known as the Gram-Schmidt orthogonalization process. The following remarks are in order.
Remark 1: Each vector $w_k$ is a linear combination of $v_k$ and the preceding $w$'s. Hence one can easily show, by induction, that each $w_k$ is a linear combination of $v_1, v_2, \ldots, v_k$.
Remark 2: Since taking multiples of vectors does not affect orthogonality, it may be simpler in hand calculations to clear fractions in any new $w_k$, by multiplying $w_k$ by an appropriate scalar, before obtaining the next $w_{k+1}$.
Remark 3: Suppose $u_1, u_2, \ldots, u_r$ are linearly independent, and so they form a basis for $U = \operatorname{span}(u_i)$. Applying the Gram-Schmidt orthogonalization process to the $u$'s yields an orthogonal basis for $U$.
The following theorems (proved in Problems 7.26 and 7.27) use the above algorithm and remarks.
Theorem 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space $V$. Then there exists an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of $V$ such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is triangular, that is, for $k = 1, \ldots, n$,
\[ u_k = a_{k1}v_1 + a_{k2}v_2 + \cdots + a_{kk}v_k \]
Theorem 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace $W$ of a vector space $V$. Then one may extend $S$ to an orthogonal basis for $V$, that is, one may find vectors $w_{r+1}, \ldots, w_n$ such that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for $V$.
Example 7.10. Apply the Gram-Schmidt orthogonalization process to find an orthogonal basis and then an orthonormal basis for the subspace $U$ of $\mathbf{R}^4$ spanned by
\[ v_1 = (1, 1, 1, 1), \quad v_2 = (1, 2, 4, 5), \quad v_3 = (1, -3, -4, -2) \]
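(1) First set $w_1 = v_1 = (1, 1, 1, 1)$.
(2) Compute
\[ v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = v_2 - \frac{12}{4}w_1 = (-2, -1, 1, 2) \]
Set $w_2 = (-2, -1, 1, 2)$.
(3) Compute
\[ v_3 - \frac{\langle v_3, w_1\rangle}{\langle w_1, w_1\rangle}w_1 - \frac{\langle v_3, w_2\rangle}{\langle w_2, w_2\rangle}w_2 = v_3 - \frac{(-8)}{4}w_1 - \frac{(-7)}{10}w_2 = \left(\tfrac{8}{5}, -\tfrac{17}{10}, -\tfrac{13}{10}, \tfrac{7}{5}\right) \]
Clear fractions to obtain $w_3 = (16, -17, -13, 14)$.
Thus $w_1, w_2, w_3$ form an orthogonal basis for $U$. Normalize these vectors to obtain an orthonormal basis $\{u_1, u_2, u_3\}$ of $U$. We have $\|w_1\|^2 = 4$, $\|w_2\|^2 = 10$, $\|w_3\|^2 = 910$, so
\[ u_1 = \frac{1}{2}(1, 1, 1, 1), \quad u_2 = \frac{1}{\sqrt{10}}(-2, -1, 1, 2), \quad u_3 = \frac{1}{\sqrt{910}}(16, -17, -13, 14) \]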
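The hand computation in Example 7.10 can be cross-checked with exact rational arithmetic. What follows is only a sketch under stated assumptions: the usual inner product on R^n, linearly independent input vectors (otherwise a division by zero occurs), and illustrative helper names `dot` and `gram_schmidt` that do not come from the text. Fractions are not cleared between steps, so the third vector appears before scaling.

```python
from fractions import Fraction

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Return orthogonal vectors w_1, ..., w_n with the same span as the input."""
    ws = []
    for v in vectors:
        w = [Fraction(x) for x in v]
        for prev in ws:
            c = dot(v, prev) / dot(prev, prev)   # component of v along prev
            w = [wi - c * pi for wi, pi in zip(w, prev)]
        ws.append(w)
    return ws

v1, v2, v3 = (1, 1, 1, 1), (1, 2, 4, 5), (1, -3, -4, -2)
w1, w2, w3 = gram_schmidt([v1, v2, v3])
w3_cleared = [10 * x for x in w3]     # clear fractions, as in Remark 2
norms_squared = [dot(w1, w1), dot(w2, w2), dot(w3_cleared, w3_cleared)]
print(w2, w3, w3_cleared, norms_squared)
```

Dividing each vector by the square root of its entry in `norms_squared` would give the orthonormal basis; that step is left out here to keep the arithmetic exact.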
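Example 7.11. Let $V$ be the vector space of polynomials $f(t)$ with inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$. Apply the Gram-Schmidt orthogonalization process to $\{1, t, t^2, t^3\}$ to find an orthogonal basis $\{f_0, f_1, f_2, f_3\}$ with integer coefficients for $P_3(t)$.
Here we use the fact that, for $r + s = n$,
\[ \langle t^r, t^s\rangle = \int_{-1}^{1} t^n\,dt = \left[\frac{t^{n+1}}{n+1}\right]_{-1}^{1} = \begin{cases} 2/(n+1) & \text{when } n \text{ is even} \\ 0 & \text{when } n \text{ is odd} \end{cases} \]
(1) First set $f_0 = 1$.
(2) Compute $t - \dfrac{\langle t, 1\rangle}{\langle 1, 1\rangle}(1) = t - 0 = t$. Set $f_1 = t$.
(3) Compute
\[ t^2 - \frac{\langle t^2, 1\rangle}{\langle 1, 1\rangle}(1) - \frac{\langle t^2, t\rangle}{\langle t, t\rangle}(t) = t^2 - \frac{2/3}{2}(1) - 0(t) = t^2 - \tfrac{1}{3} \]
Multiply by 3 to obtain $f_2 = 3t^2 - 1$.
(4) Compute
\[ t^3 - \frac{\langle t^3, 1\rangle}{\langle 1, 1\rangle}(1) - \frac{\langle t^3, t\rangle}{\langle t, t\rangle}(t) - \frac{\langle t^3, 3t^2 - 1\rangle}{\langle 3t^2 - 1, 3t^2 - 1\rangle}(3t^2 - 1) = t^3 - 0(1) - \frac{2/5}{2/3}(t) - 0(3t^2 - 1) = t^3 - \tfrac{3}{5}t \]
Multiply by 5 to obtain $f_3 = 5t^3 - 3t$.
Thus $\{1, t, 3t^2 - 1, 5t^3 - 3t\}$ is the required orthogonal basis.
Remark: Normalizing the polynomials in Example 7.11 so that $p(1) = 1$ yields the polynomials
\[ 1, \quad t, \quad \tfrac{1}{2}(3t^2 - 1), \quad \tfrac{1}{2}(5t^3 - 3t) \]
These are the first four Legendre polynomials, which appear in the study of differential equations.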
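The same process can be run on polynomials once the inner product is expressed on coefficients. The sketch below is an illustration only: it represents a polynomial by its coefficient list `[c0, c1, c2, ...]` (an assumed convention, not from the text), and uses the integral of $t^n$ over $[-1, 1]$ stated in Example 7.11. The names `inner` and `sub_scaled` are hypothetical helpers.

```python
from fractions import Fraction

def inner(f, g):
    """<f, g> = integral over [-1, 1] of f(t) g(t) dt, for coefficient lists."""
    total = Fraction(0)
    for r, a in enumerate(f):
        for s, b in enumerate(g):
            n = r + s
            if n % 2 == 0:                      # odd powers integrate to 0
                total += Fraction(a) * Fraction(b) * Fraction(2, n + 1)
    return total

def sub_scaled(f, c, g):
    """Return f - c*g, padding the shorter coefficient list with zeros."""
    size = max(len(f), len(g))
    f = list(f) + [0] * (size - len(f))
    g = list(g) + [0] * (size - len(g))
    return [Fraction(a) - c * Fraction(b) for a, b in zip(f, g)]

monomials = [[1], [0, 1], [0, 0, 1], [0, 0, 0, 1]]   # 1, t, t^2, t^3
basis = []
for p in monomials:
    q = [Fraction(x) for x in p]
    for f in basis:
        q = sub_scaled(q, inner(p, f) / inner(f, f), f)
    basis.append(q)

f2 = [3 * c for c in basis[2]]   # multiply by 3 to clear fractions
f3 = [5 * c for c in basis[3]]   # multiply by 5 to clear fractions
print(f2, f3)
```

Read as coefficient lists, `f2` and `f3` correspond to $3t^2 - 1$ and $5t^3 - 3t$.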
7.8 ORTHOGONAL AND POSITIVE DEFINITE MATRICES
This section discusses two types of matrices that are closely related to real inner product spaces V.
Here vectors in $\mathbf{R}^n$ will be represented by column vectors. Thus $\langle u, v\rangle = u^Tv$ denotes the inner product in Euclidean space $\mathbf{R}^n$.
Orthogonal Matrices
A real matrix $P$ is orthogonal if $P$ is nonsingular and $P^{-1} = P^T$, or, in other words, if $PP^T = P^TP = I$. First we recall (Theorem 2.6) an important characterization of such matrices.
Theorem 7.11: Let $P$ be a real matrix. Then the following are equivalent: (a) $P$ is orthogonal; (b) the rows of $P$ form an orthonormal set; (c) the columns of $P$ form an orthonormal set.
(This theorem is true only using the usual inner product on $\mathbf{R}^n$. It is not true if $\mathbf{R}^n$ is given any other inner product.)
Example 7.12.
(a) Let
\[ P = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & 1/\sqrt{3} \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 2/\sqrt{6} & -1/\sqrt{6} & -1/\sqrt{6} \end{bmatrix} \]
The rows of $P$ are orthogonal to each other and are unit vectors. Thus $P$ is an orthogonal matrix.
(b) Let $P$ be a $2 \times 2$ orthogonal matrix. Then, for some real number $\theta$, we have
\[ P = \begin{bmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{bmatrix} \quad\text{or}\quad P = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix} \]
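The defining condition $PP^T = I$ is straightforward to test numerically. The sketch below uses floating-point arithmetic with a tolerance, which is an assumption of this illustration (the text works with exact values); `is_orthogonal`, `matmul`, `transpose`, and `rotation` are illustrative names. The matrix from Example 7.12(a) is taken with second row $(0, 1/\sqrt{2}, -1/\sqrt{2})$, as orthonormality of the rows requires.

```python
import math

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def is_orthogonal(P, tol=1e-12):
    """Check P P^T = I entrywise, up to a floating-point tolerance."""
    n = len(P)
    product = matmul(P, transpose(P))
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

s2, s3, s6 = math.sqrt(2), math.sqrt(3), math.sqrt(6)
P = [[1 / s3, 1 / s3, 1 / s3],
     [0.0, 1 / s2, -1 / s2],
     [2 / s6, -1 / s6, -1 / s6]]

def rotation(theta):
    """First 2 x 2 form in Example 7.12(b)."""
    return [[math.cos(theta), math.sin(theta)],
            [-math.sin(theta), math.cos(theta)]]

print(is_orthogonal(P), is_orthogonal(rotation(0.7)))
```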
The following two theorems (proved in Problems 7.37 and 7.38) show important relationships between
orthogonal matrices and orthonormal bases of a real inner product space V.
Theorem 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of $V$. Let $P$ be the change-of-basis matrix from the basis $E$ to the basis $E'$. Then $P$ is orthogonal.
Theorem 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space $V$. Let $P = [a_{ij}]$ be an orthogonal matrix. Then the following $n$ vectors form an orthonormal basis for $V$:
\[ e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n, \quad i = 1, 2, \ldots, n \]
Positive Definite Matrices
Let $A$ be a real symmetric matrix, that is, $A^T = A$. Then $A$ is said to be positive definite if, for every nonzero vector $u$ in $\mathbf{R}^n$,
\[ \langle u, Au\rangle = u^TAu > 0 \]
Algorithms to decide whether or not a matrix $A$ is positive definite will be given in Chapter 12. However, for $2 \times 2$ matrices, we have simple criteria, which we state formally in the following theorem (proved in Problem 7.43).
Theorem 7.14: A $2 \times 2$ real symmetric matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix} = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite if and only if the diagonal entries $a$ and $d$ are positive and the determinant $|A| = ad - bc = ad - b^2$ is positive.
Example 7.13. Consider the following symmetric matrices:
\[ A = \begin{bmatrix} 1 & 3 \\ 3 & 4 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & -2 \\ -2 & -3 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -2 \\ -2 & 5 \end{bmatrix} \]
Then $A$ is not positive definite, since $|A| = 4 - 9 = -5$ is negative. $B$ is not positive definite, since the diagonal entry $-3$ is negative. However, $C$ is positive definite, since the diagonal entries 1 and 5 are positive, and the determinant $|C| = 5 - 4 = 1$ is also positive.
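The criterion of Theorem 7.14 translates directly into a one-line test. The sketch below applies only to $2 \times 2$ real symmetric matrices, passed as the three entries $a$, $b$, $d$; the function name is an illustrative choice.

```python
def is_positive_definite_2x2(a, b, d):
    """Theorem 7.14 for the symmetric matrix [[a, b], [b, d]]."""
    return a > 0 and d > 0 and a * d - b * b > 0

# The three matrices of Example 7.13, given as (a, b, d).
results = {
    "A": is_positive_definite_2x2(1, 3, 4),
    "B": is_positive_definite_2x2(1, -2, -3),
    "C": is_positive_definite_2x2(1, -2, 5),
}
print(results)
```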
The following theorem (proved in Problem 7.44) holds.
Theorem 7.15: Let $A$ be a real positive definite matrix. Then the function $\langle u, v\rangle = u^TAv$ is an inner product on $\mathbf{R}^n$.
Matrix Representation of an Inner Product
Theorem 7.15 says that every positive definite matrix A determines an inner product on R". This
subsection may be viewed as giving the converse of this result.
Let $V$ be a real inner product space with basis $S = \{u_1, u_2, \ldots, u_n\}$. The matrix
\[ A = [a_{ij}], \quad\text{where } a_{ij} = \langle u_i, u_j\rangle \]
is called the matrix representation of the inner product on $V$ relative to the basis $S$.
Observe that $A$ is symmetric, since the inner product is symmetric, that is, $\langle u_i, u_j\rangle = \langle u_j, u_i\rangle$. Also, $A$ depends on both the inner product on $V$ and the basis $S$ for $V$. Moreover, if $S$ is an orthogonal basis, then $A$ is diagonal, and if $S$ is an orthonormal basis, then $A$ is the identity matrix.
Example 7.14. The vectors $u_1 = (1, 1, 0)$, $u_2 = (1, 2, 3)$, $u_3 = (1, 3, 5)$ form a basis $S$ for Euclidean space $\mathbf{R}^3$. Find the matrix $A$ that represents the inner product in $\mathbf{R}^3$ relative to this basis $S$.
First compute each $\langle u_i, u_j\rangle$ to obtain
\[ \langle u_1, u_1\rangle = 1 + 1 + 0 = 2, \quad \langle u_1, u_2\rangle = 1 + 2 + 0 = 3, \quad \langle u_1, u_3\rangle = 1 + 3 + 0 = 4 \]
\[ \langle u_2, u_2\rangle = 1 + 4 + 9 = 14, \quad \langle u_2, u_3\rangle = 1 + 6 + 15 = 22, \quad \langle u_3, u_3\rangle = 1 + 9 + 25 = 35 \]
Then
\[ A = \begin{bmatrix} 2 & 3 & 4 \\ 3 & 14 & 22 \\ 4 & 22 & 35 \end{bmatrix} \]
As expected, $A$ is symmetric.
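The matrix of Example 7.14 is a table of pairwise inner products, so it can be generated in one comprehension. This is a minimal sketch assuming the usual inner product on R^3; `gram_matrix` is an illustrative name, not one used in the text.

```python
def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def gram_matrix(basis):
    """A = [a_ij] with a_ij = <u_i, u_j>."""
    return [[dot(ui, uj) for uj in basis] for ui in basis]

S = [(1, 1, 0), (1, 2, 3), (1, 3, 5)]
A = gram_matrix(S)
is_symmetric = all(A[i][j] == A[j][i] for i in range(3) for j in range(3))
print(A, is_symmetric)
```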
The following theorems (proved in Problems 7.45 and 7.46, respectively) hold.
Theorem 7.16: Let $A$ be the matrix representation of an inner product relative to basis $S$ for $V$. Then, for any vectors $u, v \in V$, we have
\[ \langle u, v\rangle = [u]^TA[v] \]
where $[u]$ and $[v]$ denote the (column) coordinate vectors relative to the basis $S$.
Theorem 7.17: Let $A$ be the matrix representation of any inner product on $V$. Then $A$ is a positive definite matrix.
7.9 COMPLEX INNER PRODUCT SPACES
This section considers vector spaces over the complex field $\mathbf{C}$. First we recall some properties of the complex numbers (Section 1.7), especially the relations between a complex number $z = a + bi$, where $a, b \in \mathbf{R}$, and its complex conjugate $\bar{z} = a - bi$:
\[ z\bar{z} = a^2 + b^2, \quad |z| = \sqrt{a^2 + b^2}, \quad \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \quad \overline{z_1z_2} = \bar{z}_1\bar{z}_2, \quad \bar{\bar{z}} = z \]
Also, $z$ is real if and only if $\bar{z} = z$.
The following definition applies.
Definition: Let $V$ be a vector space over $\mathbf{C}$. Suppose to each pair of vectors $u, v \in V$ there is assigned a complex number, denoted by $\langle u, v\rangle$. This function is called a (complex) inner product on $V$ if it satisfies the following axioms:
$[I_1^*]$ (Linear Property) $\langle au_1 + bu_2, v\rangle = a\langle u_1, v\rangle + b\langle u_2, v\rangle$
$[I_2^*]$ (Conjugate Symmetric Property) $\langle u, v\rangle = \overline{\langle v, u\rangle}$
$[I_3^*]$ (Positive Definite Property) $\langle u, u\rangle \ge 0$; and $\langle u, u\rangle = 0$ if and only if $u = 0$.
The vector space $V$ over $\mathbf{C}$ with an inner product is called a (complex) inner product space.
Observe that a complex inner product differs from the real case only in the second axiom $[I_2^*]$.
Axiom $[I_1^*]$ (Linear Property) is equivalent to the two conditions:
\[ \text{(a) } \langle u_1 + u_2, v\rangle = \langle u_1, v\rangle + \langle u_2, v\rangle, \qquad \text{(b) } \langle ku, v\rangle = k\langle u, v\rangle \]
On the other hand, applying $[I_1^*]$ and $[I_2^*]$, we obtain
\[ \langle u, kv\rangle = \overline{\langle kv, u\rangle} = \overline{k\langle v, u\rangle} = \bar{k}\,\overline{\langle v, u\rangle} = \bar{k}\langle u, v\rangle \]
That is, we must take the conjugate of a complex number when it is taken out of the second position of a complex inner product. In fact (Problem 7.47), the inner product is conjugate linear in the second position, that is,
\[ \langle u, av_1 + bv_2\rangle = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle \]
Combining linear in the first position and conjugate linear in the second position, we obtain, by induction,
\[ \Big\langle \sum_i a_iu_i, \sum_j b_jv_j \Big\rangle = \sum_{i,j} a_i\bar{b}_j\langle u_i, v_j\rangle \]
The following remarks are in order.
Remark 1: Axiom $[I_1^*]$ by itself implies that $\langle 0, 0\rangle = \langle 0v, 0\rangle = 0\langle v, 0\rangle = 0$. Accordingly, $[I_1^*]$, $[I_2^*]$, and $[I_3^*]$ are equivalent to $[I_1^*]$, $[I_2^*]$, and the following axiom:
$[I_3^{*\prime}]$ If $u \neq 0$, then $\langle u, u\rangle > 0$.
That is, a function satisfying $[I_1^*]$, $[I_2^*]$, and $[I_3^{*\prime}]$ is a (complex) inner product on $V$.
Remark 2: By $[I_2^*]$, $\langle u, u\rangle = \overline{\langle u, u\rangle}$. Thus $\langle u, u\rangle$ must be real. By $[I_3^*]$, $\langle u, u\rangle$ must be nonnegative, and hence its nonnegative real square root exists. As with real inner product spaces, we define $\|u\| = \sqrt{\langle u, u\rangle}$ to be the norm or length of $u$.
Remark 3: Besides the norm, we define the notions of orthogonality, orthogonal complement, and orthogonal and orthonormal sets as before. In fact, the definitions of distance and Fourier coefficient and projections are the same as in the real case.
Example 7.15. (Complex Euclidean Space $\mathbf{C}^n$). Let $V = \mathbf{C}^n$, and let $u = (z_i)$ and $v = (w_i)$ be vectors in $\mathbf{C}^n$. Then
\[ \langle u, v\rangle = \sum_k z_k\bar{w}_k = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n \]
is an inner product on $V$, called the usual or standard inner product on $\mathbf{C}^n$. $V$ with this inner product is called Complex Euclidean Space. We assume this inner product on $\mathbf{C}^n$ unless otherwise stated or implied. Assuming $u$ and $v$ are column vectors, the above inner product may be defined by
\[ \langle u, v\rangle = u^T\bar{v} \]
where, as with matrices, $\bar{v}$ means the conjugate of each element of $v$. If $u$ and $v$ are real, we have $\bar{w}_i = w_i$. In this case, the inner product reduces to the analogous one on $\mathbf{R}^n$.
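Conjugate symmetry and conjugate linearity in the second position can be observed on a small example. The sketch below assumes the standard inner product on C^n and uses Python's built-in complex type; the vectors and the scalar `k` are arbitrary illustrative choices with small integer parts, so the floating-point arithmetic is exact here.

```python
def inner(u, v):
    """Standard inner product on C^n: sum of z_k * conj(w_k)."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

u = [1 + 1j, 2 - 1j]
v = [3 - 2j, 1j]
k = 2j

uv = inner(u, v)
vu = inner(v, u)
uu = inner(u, u)                              # real and nonnegative
u_kv = inner(u, [k * w for w in v])           # scalar pulled from 2nd slot
print(uv, vu, uu, u_kv)
```

Pulling `k` out of the second position produces its conjugate, so `u_kv` should equal `k.conjugate() * uv` rather than `k * uv`.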
Example 7.16.
(a) Let $V$ be the vector space of complex continuous functions on the (real) interval $a \le t \le b$. Then the following is the usual inner product on $V$:
\[ \langle f, g\rangle = \int_a^b f(t)\overline{g(t)}\,dt \]
(b) Let $U$ be the vector space of $m \times n$ matrices over $\mathbf{C}$. Suppose $A = (z_{ij})$ and $B = (w_{ij})$ are elements of $U$. Then the following is the usual inner product on $U$:
\[ \langle A, B\rangle = \operatorname{tr}(B^HA) = \sum_{i=1}^{m}\sum_{j=1}^{n} \bar{w}_{ij}z_{ij} \]
As usual, $B^H = \bar{B}^T$, that is, $B^H$ is the conjugate transpose of $B$.
The following is a list of theorems for complex inner product spaces that are analogous to those for the real case. Here a Hermitian matrix $A$ (i.e., one where $A^H = \bar{A}^T = A$) plays the same role that a symmetric matrix $A$ (i.e., one where $A^T = A$) plays in the real case. (Theorem 7.18 is proved in Problem 7.50.)
Theorem 7.18: (Cauchy-Schwarz) Let $V$ be a complex inner product space. Then
\[ |\langle u, v\rangle| \le \|u\|\,\|v\| \]
Theorem 7.19: Let $W$ be a subspace of a complex inner product space $V$. Then $V = W \oplus W^\perp$.
Theorem 7.20: Suppose $\{u_1, u_2, \ldots, u_n\}$ is an orthogonal basis for a complex inner product space $V$. Then, for any $v \in V$,
\[ v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n \]
Theorem 7.21: Suppose $\{u_1, u_2, \ldots, u_n\}$ is a basis for a complex inner product space $V$. Let $A = [a_{ij}]$ be the complex matrix defined by $a_{ij} = \langle u_i, u_j\rangle$. Then, for any $u, v \in V$,
\[ \langle u, v\rangle = [u]^TA\,\overline{[v]} \]
where $[u]$ and $[v]$ are the coordinate column vectors in the given basis $\{u_i\}$. (Remark: This matrix $A$ is said to represent the inner product on $V$.)
Theorem 7.22: Let $A$ be a Hermitian matrix (i.e., $A^H = \bar{A}^T = A$) such that $X^TA\bar{X}$ is real and positive for every nonzero vector $X \in \mathbf{C}^n$. Then $\langle u, v\rangle = u^TA\bar{v}$ is an inner product on $\mathbf{C}^n$.
Theorem 7.23: Let $A$ be the matrix that represents an inner product on $V$. Then $A$ is Hermitian, and $X^TA\bar{X}$ is real and positive for any nonzero vector $X$ in $\mathbf{C}^n$.
7.10 NORMED VECTOR SPACES (OPTIONAL)
We begin with a definition.
Definition: Let $V$ be a real or complex vector space. Suppose to each $v \in V$ there is assigned a real number, denoted by $\|v\|$. This function $\|\cdot\|$ is called a norm on $V$ if it satisfies the following axioms:
$[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
$[N_2]$ $\|kv\| = |k|\,\|v\|$.
$[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.
A vector space V with a norm is called a normed vector space.
Suppose $V$ is a normed vector space. The distance between two vectors $u$ and $v$ in $V$ is denoted and defined by
\[ d(u, v) = \|u - v\| \]
The following theorem (proved in Problem 7.56) is the main reason why $d(u, v)$ is called the distance between $u$ and $v$.
Theorem 7.24: Let $V$ be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies the following three axioms of a metric space:
$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ if and only if $u = v$.
$[M_2]$ $d(u, v) = d(v, u)$.
$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.
Normed Vector Spaces and Inner Product Spaces
Suppose $V$ is an inner product space. Recall that the norm of a vector $v$ in $V$ is defined by
\[ \|v\| = \sqrt{\langle v, v\rangle} \]
One can prove (Theorem 7.2) that this norm does satisfy $[N_1]$, $[N_2]$, and $[N_3]$. Thus every inner product space $V$ is a normed vector space. On the other hand, there may be norms on a vector space $V$ that do not come from an inner product on $V$, as shown below.
Norms on R" and C"
The following define three important norms on $\mathbf{R}^n$ and $\mathbf{C}^n$:
\[ \|(a_1, \ldots, a_n)\|_\infty = \max(|a_i|) \]
\[ \|(a_1, \ldots, a_n)\|_1 = |a_1| + |a_2| + \cdots + |a_n| \]
\[ \|(a_1, \ldots, a_n)\|_2 = \sqrt{|a_1|^2 + |a_2|^2 + \cdots + |a_n|^2} \]
(Note that subscripts are used to distinguish between the three norms.) The norms $\|\cdot\|_\infty$, $\|\cdot\|_1$, and $\|\cdot\|_2$ are called the infinity-norm, one-norm, and two-norm, respectively. Observe that $\|\cdot\|_2$ is the norm on $\mathbf{R}^n$ (respectively, $\mathbf{C}^n$) induced by the usual inner product on $\mathbf{R}^n$ (respectively, $\mathbf{C}^n$). We will let $d_\infty$, $d_1$, $d_2$ denote the corresponding distance functions.
Example 7.17. Consider vectors $u = (1, -5, 3)$ and $v = (4, 2, -3)$ in $\mathbf{R}^3$.
(a) The infinity-norm chooses the maximum of the absolute values of the components. Hence
\[ \|u\|_\infty = 5 \quad\text{and}\quad \|v\|_\infty = 4 \]
(b) The one-norm adds the absolute values of the components. Thus
\[ \|u\|_1 = 1 + 5 + 3 = 9 \quad\text{and}\quad \|v\|_1 = 4 + 2 + 3 = 9 \]
(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm induced by the usual inner product on $\mathbf{R}^3$). Thus
\[ \|u\|_2 = \sqrt{1 + 25 + 9} = \sqrt{35} \quad\text{and}\quad \|v\|_2 = \sqrt{16 + 4 + 9} = \sqrt{29} \]
(d) Since $u - v = (1 - 4, -5 - 2, 3 + 3) = (-3, -7, 6)$, we have
\[ d_\infty(u, v) = 7, \quad d_1(u, v) = 3 + 7 + 6 = 16, \quad d_2(u, v) = \sqrt{9 + 49 + 36} = \sqrt{94} \]
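The three norms and their distance functions are short enough to write out directly. The following sketch reproduces the numbers of Example 7.17; the function names are illustrative, and the two-norm is computed in floating point, so it is compared with a tolerance rather than exactly.

```python
import math

def norm_inf(v):
    """Infinity-norm: largest absolute value of a component."""
    return max(abs(a) for a in v)

def norm_1(v):
    """One-norm: sum of absolute values of the components."""
    return sum(abs(a) for a in v)

def norm_2(v):
    """Two-norm: square root of the sum of squared absolute values."""
    return math.sqrt(sum(abs(a) ** 2 for a in v))

def distance(norm, u, v):
    """d(u, v) = ||u - v|| for the given norm."""
    return norm([a - b for a, b in zip(u, v)])

u, v = (1, -5, 3), (4, 2, -3)
print(norm_inf(u), norm_1(u), norm_2(u))
print(distance(norm_inf, u, v), distance(norm_1, u, v), distance(norm_2, u, v))
```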
Example 7.18. Consider the Cartesian plane $\mathbf{R}^2$ shown in Fig. 7-4.
(a) Let $D_1$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_2 = 1$. Then $D_1$ consists of the points $(x, y)$ such that $\|u\|_2^2 = x^2 + y^2 = 1$. Thus $D_1$ is the unit circle, as shown in Fig. 7-4.
[Fig. 7-4]
(b) Let $D_2$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_1 = 1$. Then $D_2$ consists of the points $(x, y)$ such that $\|u\|_1 = |x| + |y| = 1$. Thus $D_2$ is the diamond inside the unit circle, as shown in Fig. 7-4.
(c) Let $D_3$ be the set of points $u = (x, y)$ in $\mathbf{R}^2$ such that $\|u\|_\infty = 1$. Then $D_3$ consists of the points $(x, y)$ such that $\|u\|_\infty = \max(|x|, |y|) = 1$. Thus $D_3$ is the square circumscribing the unit circle, as shown in Fig. 7-4.
Norms on C[a, b]
Consider the vector space $V = C[a, b]$ of real continuous functions on the interval $a \le t \le b$. Recall that the following defines an inner product on $V$:
\[ \langle f, g\rangle = \int_a^b f(t)g(t)\,dt \]
Accordingly, the above inner product defines the following norm on $V = C[a, b]$ (which is analogous to the $\|\cdot\|_2$ norm on $\mathbf{R}^n$):
\[ \|f\|_2 = \sqrt{\int_a^b [f(t)]^2\,dt} \]
The following define the other norms on $V = C[a, b]$:
\[ \|f\|_1 = \int_a^b |f(t)|\,dt \quad\text{and}\quad \|f\|_\infty = \max(|f(t)|) \]
There are geometrical descriptions of these two norms and their corresponding distance functions, which are described below.
The first norm is pictured in Fig. 7-5. Here
\[ \|f\|_1 = \text{area between the function } |f| \text{ and the } t\text{-axis} \]
\[ d_1(f, g) = \text{area between the functions } f \text{ and } g \]
[Fig. 7-5: (a) $\|f\|_1$ is shaded; (b) $d_1(f, g)$ is shaded]
This norm is analogous to the norm $\|\cdot\|_1$ on $\mathbf{R}^n$.
The second norm is pictured in Fig. 7-6. Here
\[ \|f\|_\infty = \text{maximum distance between } f \text{ and the } t\text{-axis} \]
\[ d_\infty(f, g) = \text{maximum distance between } f \text{ and } g \]
This norm is analogous to the norm $\|\cdot\|_\infty$ on $\mathbf{R}^n$.
[Fig. 7-6: (a) $\|f\|_\infty$; (b) $d_\infty(f, g)$]
Solved Problems
INNER PRODUCTS
7.1. Expand:
(a) $\langle 5u_1 + 8u_2, 6v_1 - 7v_2\rangle$,
(b) $\langle 3u + 5v, 4u - 6v\rangle$,
(c) $\|2u - 3v\|^2$
Use linearity in both positions and, when possible, symmetry, $\langle u, v\rangle = \langle v, u\rangle$.
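(a) Take the inner product of each term on the left with each term on the right:
\[ \begin{aligned} \langle 5u_1 + 8u_2, 6v_1 - 7v_2\rangle &= \langle 5u_1, 6v_1\rangle + \langle 5u_1, -7v_2\rangle + \langle 8u_2, 6v_1\rangle + \langle 8u_2, -7v_2\rangle \\ &= 30\langle u_1, v_1\rangle - 35\langle u_1, v_2\rangle + 48\langle u_2, v_1\rangle - 56\langle u_2, v_2\rangle \end{aligned} \]
[Remark: Observe the similarity between the above expansion and the expansion $(5a + 8b)(6c - 7d)$ in ordinary algebra.]
(b)
\[ \begin{aligned} \langle 3u + 5v, 4u - 6v\rangle &= 12\langle u, u\rangle - 18\langle u, v\rangle + 20\langle v, u\rangle - 30\langle v, v\rangle \\ &= 12\langle u, u\rangle + 2\langle u, v\rangle - 30\langle v, v\rangle \end{aligned} \]
(c)
\[ \begin{aligned} \|2u - 3v\|^2 = \langle 2u - 3v, 2u - 3v\rangle &= 4\langle u, u\rangle - 6\langle u, v\rangle - 6\langle v, u\rangle + 9\langle v, v\rangle \\ &= 4\|u\|^2 - 12\langle u, v\rangle + 9\|v\|^2 \end{aligned} \]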
7.2. Consider vectors $u = (1, 2, 4)$, $v = (2, -3, 5)$, $w = (4, 2, -3)$ in $\mathbf{R}^3$. Find:
(a) $u \cdot v$, (b) $u \cdot w$, (c) $v \cdot w$, (d) $(u + v) \cdot w$, (e) $\|u\|$, (f) $\|v\|$.
(a) Multiply corresponding components and add to get $u \cdot v = 2 - 6 + 20 = 16$.
(b) $u \cdot w = 4 + 4 - 12 = -4$.
(c) $v \cdot w = 8 - 6 - 15 = -13$.
(d) First find $u + v = (3, -1, 9)$. Then $(u + v) \cdot w = 12 - 2 - 27 = -17$. Alternatively, using $[I_1]$, $(u + v) \cdot w = u \cdot w + v \cdot w = -4 - 13 = -17$.
(e) First find $\|u\|^2$ by squaring the components of $u$ and adding:
\[ \|u\|^2 = 1^2 + 2^2 + 4^2 = 1 + 4 + 16 = 21, \quad\text{and so}\quad \|u\| = \sqrt{21} \]
(f) $\|v\|^2 = 4 + 9 + 25 = 38$, and so $\|v\| = \sqrt{38}$.
7.3. Verify that the following defines an inner product in $\mathbf{R}^2$:
\[ \langle u, v\rangle = x_1y_1 - x_1y_2 - x_2y_1 + 3x_2y_2, \quad\text{where } u = (x_1, x_2),\ v = (y_1, y_2) \]
We argue via matrices. We can write $\langle u, v\rangle$ in matrix notation as follows:
\[ \langle u, v\rangle = u^TAv = \begin{bmatrix} x_1 & x_2 \end{bmatrix} \begin{bmatrix} 1 & -1 \\ -1 & 3 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \]
Since $A$ is real and symmetric, we need only show that $A$ is positive definite. The diagonal elements 1 and 3 are positive, and the determinant $|A| = 3 - 1 = 2$ is positive. Thus, by Theorem 7.14, $A$ is positive definite. Accordingly, by Theorem 7.15, $\langle u, v\rangle$ is an inner product.
7.4. Consider the vectors $u = (1, 5)$ and $v = (3, 4)$ in $\mathbf{R}^2$. Find:
(a) $\langle u, v\rangle$ with respect to the usual inner product in $\mathbf{R}^2$.
(b) $\langle u, v\rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.3.
(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.
(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.3.
(a) $\langle u, v\rangle = 3 + 20 = 23$.
(b) $\langle u, v\rangle = 1 \cdot 3 - 1 \cdot 4 - 5 \cdot 3 + 3 \cdot 5 \cdot 4 = 3 - 4 - 15 + 60 = 44$.
(c) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 + 16 = 25$; hence $\|v\| = 5$.
(d) $\|v\|^2 = \langle v, v\rangle = \langle (3, 4), (3, 4)\rangle = 9 - 12 - 12 + 48 = 33$; hence $\|v\| = \sqrt{33}$.
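Both inner products of Problem 7.4 are instances of $\langle u, v\rangle = u^TAv$, with $A$ the identity for the usual inner product and $A$ the matrix of Problem 7.3 otherwise. The sketch below makes that explicit; the function name `inner_A` is an illustrative choice.

```python
def inner_A(u, v, A):
    """<u, v> = u^T A v, with A given as a list of rows."""
    return sum(u[i] * A[i][j] * v[j]
               for i in range(len(u)) for j in range(len(v)))

IDENTITY = [[1, 0], [0, 1]]      # usual inner product on R^2
A = [[1, -1], [-1, 3]]           # matrix of the inner product in Problem 7.3

u, v = (1, 5), (3, 4)
answers = {
    "a": inner_A(u, v, IDENTITY),
    "b": inner_A(u, v, A),
    "c_squared": inner_A(v, v, IDENTITY),
    "d_squared": inner_A(v, v, A),
}
print(answers)
```

The norms in parts (c) and (d) are the square roots of the last two entries.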
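7.5. Consider the following polynomials in $P(t)$ with the inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$:
\[ f(t) = t + 2, \quad g(t) = 3t - 2, \quad h(t) = t^2 - 2t - 3 \]
(a) Find $\langle f, g\rangle$ and $\langle f, h\rangle$.
(b) Find $\|f\|$ and $\|g\|$.
(c) Normalize $f$ and $g$.
(a) Integrate as follows:
\[ \langle f, g\rangle = \int_0^1 (t + 2)(3t - 2)\,dt = \int_0^1 (3t^2 + 4t - 4)\,dt = \left[t^3 + 2t^2 - 4t\right]_0^1 = -1 \]
\[ \langle f, h\rangle = \int_0^1 (t + 2)(t^2 - 2t - 3)\,dt = \left[\frac{t^4}{4} - \frac{7t^2}{2} - 6t\right]_0^1 = -\frac{37}{4} \]
(b)
\[ \langle f, f\rangle = \int_0^1 (t + 2)(t + 2)\,dt = \frac{19}{3}; \quad\text{hence}\quad \|f\| = \sqrt{\frac{19}{3}} = \frac{1}{3}\sqrt{57} \]
\[ \langle g, g\rangle = \int_0^1 (3t - 2)(3t - 2)\,dt = 1; \quad\text{hence}\quad \|g\| = \sqrt{1} = 1 \]
(c) Since $\|f\| = \frac{1}{3}\sqrt{57}$ and $g$ is already a unit vector, we have
\[ \hat{f} = \frac{1}{\|f\|}f = \frac{3}{\sqrt{57}}(t + 2) \quad\text{and}\quad \hat{g} = g = 3t - 2 \]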
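The integrals in Problem 7.5 reduce to arithmetic on coefficients, because the integral of $t^n$ over $[0, 1]$ equals $1/(n + 1)$. The sketch below relies on that fact and represents each polynomial by its coefficient list `[c0, c1, ...]`, an assumed convention for this illustration; `inner` is a hypothetical helper name.

```python
from fractions import Fraction

def inner(f, g):
    """<f, g> = integral over [0, 1] of f(t) g(t) dt, for coefficient lists."""
    return sum(Fraction(a * b, r + s + 1)
               for r, a in enumerate(f) for s, b in enumerate(g))

f = [2, 1]         # t + 2
g = [-2, 3]        # 3t - 2
h = [-3, -2, 1]    # t^2 - 2t - 3

fg, fh = inner(f, g), inner(f, h)
ff, gg = inner(f, f), inner(g, g)   # squared norms of f and g
print(fg, fh, ff, gg)
```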
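7.6. Find $\cos\theta$ where $\theta$ is the angle between:
(a) $u = (1, 3, -5, 4)$ and $v = (2, -3, 4, 1)$ in $\mathbf{R}^4$,
(b) $A = \begin{bmatrix} 9 & 8 & 7 \\ 6 & 5 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \end{bmatrix}$, where $\langle A, B\rangle = \operatorname{tr}(B^TA)$.
Use $\cos\theta = \dfrac{\langle u, v\rangle}{\|u\|\,\|v\|}$.
(a) Compute:
\[ \langle u, v\rangle = 2 - 9 - 20 + 4 = -23, \quad \|u\|^2 = 1 + 9 + 25 + 16 = 51, \quad \|v\|^2 = 4 + 9 + 16 + 1 = 30 \]
Thus
\[ \cos\theta = \frac{-23}{\sqrt{51}\sqrt{30}} = \frac{-23}{3\sqrt{170}} \]
(b) Use $\langle A, B\rangle = \operatorname{tr}(B^TA) = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}b_{ij}$, the sum of the products of corresponding entries.
\[ \langle A, B\rangle = 9 + 16 + 21 + 24 + 25 + 24 = 119 \]
Use $\|A\|^2 = \langle A, A\rangle = \sum_{i=1}^{m}\sum_{j=1}^{n} a_{ij}^2$, the sum of the squares of all the elements of $A$.
\[ \|A\|^2 = \langle A, A\rangle = 9^2 + 8^2 + 7^2 + 6^2 + 5^2 + 4^2 = 271, \quad\text{and so}\quad \|A\| = \sqrt{271} \]
\[ \|B\|^2 = \langle B, B\rangle = 1^2 + 2^2 + 3^2 + 4^2 + 5^2 + 6^2 = 91, \quad\text{and so}\quad \|B\| = \sqrt{91} \]
Thus
\[ \cos\theta = \frac{119}{\sqrt{271}\sqrt{91}} \]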
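Since $\operatorname{tr}(B^TA)$ is the sum of products of corresponding entries, the matrix case of Problem 7.6 reduces to the vector case after listing the entries row by row. The sketch below uses that observation; `cos_angle` and `flatten` are illustrative names, and results are floating-point approximations.

```python
import math

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def cos_angle(u, v):
    """cos(theta) = <u, v> / (||u|| ||v||) for nonzero u, v."""
    return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

def flatten(M):
    """List the entries of a matrix row by row."""
    return [x for row in M for x in row]

u, v = (1, 3, -5, 4), (2, -3, 4, 1)
A = [[9, 8, 7], [6, 5, 4]]
B = [[1, 2, 3], [4, 5, 6]]

cos_uv = cos_angle(u, v)
cos_AB = cos_angle(flatten(A), flatten(B))
print(cos_uv, cos_AB)
```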
7.7. Verify each of the following:
(a) Parallelogram Law (Fig. 7-7): $\|u + v\|^2 + \|u - v\|^2 = 2\|u\|^2 + 2\|v\|^2$.
(b) Polar form for $\langle u, v\rangle$ (which shows the inner product can be obtained from the norm function):
\[ \langle u, v\rangle = \tfrac{1}{4}\left(\|u + v\|^2 - \|u - v\|^2\right) \]
Expand as follows to obtain:
\[ \|u + v\|^2 = \langle u + v, u + v\rangle = \|u\|^2 + 2\langle u, v\rangle + \|v\|^2 \qquad (1) \]
\[ \|u - v\|^2 = \langle u - v, u - v\rangle = \|u\|^2 - 2\langle u, v\rangle + \|v\|^2 \qquad (2) \]
Add (1) and (2) to get the Parallelogram Law (a). Subtract (2) from (1) to obtain
\[ \|u + v\|^2 - \|u - v\|^2 = 4\langle u, v\rangle \]
Divide by 4 to obtain the (real) polar form (b).
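7.8. Prove Theorem 7.1 (Cauchy-Schwarz): For $u$ and $v$ in a real inner product space $V$,
\[ \langle u, v\rangle^2 \le \langle u, u\rangle\langle v, v\rangle \quad\text{or}\quad |\langle u, v\rangle| \le \|u\|\,\|v\| \]
For any real number $t$,
\[ \langle tu + v, tu + v\rangle = t^2\langle u, u\rangle + 2t\langle u, v\rangle + \langle v, v\rangle = t^2\|u\|^2 + 2t\langle u, v\rangle + \|v\|^2 \]
Let $a = \|u\|^2$, $b = 2\langle u, v\rangle$, $c = \|v\|^2$. Since $\|tu + v\|^2 \ge 0$, we have
\[ at^2 + bt + c \ge 0 \]
for every value of $t$. This means that the quadratic polynomial cannot have two distinct real roots, which implies that $b^2 - 4ac \le 0$ or $b^2 \le 4ac$. Thus
\[ 4\langle u, v\rangle^2 \le 4\|u\|^2\|v\|^2 \]
Dividing by 4 gives our result.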
7.9. Prove Theorem 7.2: The norm in an inner product space $V$ satisfies:
(a) $[N_1]$ $\|v\| \ge 0$; and $\|v\| = 0$ if and only if $v = 0$.
(b) $[N_2]$ $\|kv\| = |k|\,\|v\|$.
(c) $[N_3]$ $\|u + v\| \le \|u\| + \|v\|$.
(a) If $v \neq 0$, then $\langle v, v\rangle > 0$ and hence $\|v\| = \sqrt{\langle v, v\rangle} > 0$. If $v = 0$, then $\langle 0, 0\rangle = 0$. Consequently $\|0\| = \sqrt{0} = 0$. Thus $[N_1]$ is true.
(b) We have $\|kv\|^2 = \langle kv, kv\rangle = k^2\langle v, v\rangle = k^2\|v\|^2$. Taking the square root of both sides gives $[N_2]$.
(c) Using the Cauchy-Schwarz inequality, we obtain
\[ \begin{aligned} \|u + v\|^2 = \langle u + v, u + v\rangle &= \langle u, u\rangle + \langle u, v\rangle + \langle u, v\rangle + \langle v, v\rangle \\ &\le \|u\|^2 + 2\|u\|\,\|v\| + \|v\|^2 = (\|u\| + \|v\|)^2 \end{aligned} \]
Taking the square root of both sides yields $[N_3]$.
ORTHOGONALITY, ORTHONORMAL COMPLEMENTS, ORTHOGONAL SETS
7.10. Find $k$ so that $u = (1, 2, k, 3)$ and $v = (3, k, 7, -5)$ in $\mathbf{R}^4$ are orthogonal.
First find
\[ \langle u, v\rangle = (1, 2, k, 3) \cdot (3, k, 7, -5) = 3 + 2k + 7k - 15 = 9k - 12 \]
Then set $\langle u, v\rangle = 9k - 12 = 0$ to obtain $k = \frac{4}{3}$.
7.11. Let $W$ be the subspace of $\mathbf{R}^5$ spanned by $u = (1, 2, 3, -1, 2)$ and $v = (2, 4, 7, 2, -1)$. Find a basis of the orthogonal complement $W^\perp$ of $W$.
We seek all vectors $w = (x, y, z, s, t)$ such that
\[ \langle w, u\rangle = x + 2y + 3z - s + 2t = 0 \]
\[ \langle w, v\rangle = 2x + 4y + 7z + 2s - t = 0 \]
Eliminating $x$ from the second equation, we find the equivalent system
\[ x + 2y + 3z - s + 2t = 0 \]
\[ z + 4s - 5t = 0 \]
The free variables are $y$, $s$, and $t$. Therefore
(1) Set $y = -1$, $s = 0$, $t = 0$ to obtain the solution $w_1 = (2, -1, 0, 0, 0)$.
(2) Set $y = 0$, $s = 1$, $t = 0$ to find the solution $w_2 = (13, 0, -4, 1, 0)$.
(3) Set $y = 0$, $s = 0$, $t = 1$ to obtain the solution $w_3 = (-17, 0, 5, 0, 1)$.
The set $\{w_1, w_2, w_3\}$ is a basis of $W^\perp$.
7.12. Let $w = (1, 2, 3, 1)$ be a vector in $\mathbf{R}^4$. Find an orthogonal basis for $w^\perp$.
Find a nonzero solution of $x + 2y + 3z + t = 0$, say $v_1 = (0, 0, 1, -3)$. Now find a nonzero solution of the system
\[ x + 2y + 3z + t = 0, \quad z - 3t = 0 \]
say $v_2 = (0, -5, 3, 1)$. Lastly, find a nonzero solution of the system
\[ x + 2y + 3z + t = 0, \quad -5y + 3z + t = 0, \quad z - 3t = 0 \]
say $v_3 = (-14, 2, 3, 1)$. Thus $v_1, v_2, v_3$ form an orthogonal basis for $w^\perp$.
7.13. Let $S$ consist of the following vectors in $\mathbf{R}^4$:
\[ u_1 = (1, 1, 0, -1), \quad u_2 = (1, 2, 1, 3), \quad u_3 = (1, 1, -9, 2), \quad u_4 = (16, -13, 1, 3) \]
(a) Show that $S$ is orthogonal and a basis of $\mathbf{R}^4$.
(b) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis $S$.
(a) Compute
\[ u_1 \cdot u_2 = 1 + 2 + 0 - 3 = 0, \quad u_1 \cdot u_3 = 1 + 1 + 0 - 2 = 0, \quad u_1 \cdot u_4 = 16 - 13 + 0 - 3 = 0 \]
\[ u_2 \cdot u_3 = 1 + 2 - 9 + 6 = 0, \quad u_2 \cdot u_4 = 16 - 26 + 1 + 9 = 0, \quad u_3 \cdot u_4 = 16 - 13 - 9 + 6 = 0 \]
Thus $S$ is orthogonal, and hence $S$ is linearly independent. Accordingly, $S$ is a basis for $\mathbf{R}^4$ since any four linearly independent vectors form a basis of $\mathbf{R}^4$.
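(b) Since $S$ is orthogonal, we need only find the Fourier coefficients of $v$ with respect to the basis vectors, as in Theorem 7.7. Thus
\[ k_1 = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \frac{a + b - d}{3}, \qquad k_3 = \frac{\langle v, u_3\rangle}{\langle u_3, u_3\rangle} = \frac{a + b - 9c + 2d}{87} \]
\[ k_2 = \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \frac{a + 2b + c + 3d}{15}, \qquad k_4 = \frac{\langle v, u_4\rangle}{\langle u_4, u_4\rangle} = \frac{16a - 13b + c + 3d}{435} \]
are the coordinates of $v$ with respect to the basis $S$.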
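The coordinates found in part (b) can be sanity-checked by rebuilding $v$ from them. The sketch below does this for one sample vector, $v = (1, 2, 3, 4)$, which is an arbitrary choice made for illustration and not taken from the problem; `coordinates` and `reconstruct` are hypothetical helper names.

```python
from fractions import Fraction
from itertools import combinations

def dot(u, v):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(u, v))

def coordinates(v, basis):
    """Fourier coefficients of v relative to an orthogonal basis."""
    return [Fraction(dot(v, u), dot(u, u)) for u in basis]

def reconstruct(coords, basis):
    """Form the linear combination k_1 u_1 + ... + k_n u_n."""
    return [sum(k * u[i] for k, u in zip(coords, basis))
            for i in range(len(basis[0]))]

S = [(1, 1, 0, -1), (1, 2, 1, 3), (1, 1, -9, 2), (16, -13, 1, 3)]
pairwise_orthogonal = all(dot(p, q) == 0 for p, q in combinations(S, 2))
denominators = [dot(u, u) for u in S]

v = (1, 2, 3, 4)                 # sample vector (a, b, c, d)
ks = coordinates(v, S)
print(pairwise_orthogonal, denominators, ks, reconstruct(ks, S))
```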
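7.14. Suppose $S$, $S_1$, $S_2$ are subsets of $V$. Prove the following:
(a) $S \subseteq S^{\perp\perp}$.
(b) If $S_1 \subseteq S_2$, then $S_2^\perp \subseteq S_1^\perp$.
(c) $S^\perp = \operatorname{span}(S)^\perp$.
(a) Let $w \in S$. Then $\langle w, v\rangle = 0$ for every $v \in S^\perp$; hence $w \in S^{\perp\perp}$. Accordingly, $S \subseteq S^{\perp\perp}$.
(b) Let $w \in S_2^\perp$. Then $\langle w, v\rangle = 0$ for every $v \in S_2$. Since $S_1 \subseteq S_2$, $\langle w, v\rangle = 0$ for every $v \in S_1$. Thus $w \in S_1^\perp$, and hence $S_2^\perp \subseteq S_1^\perp$.
(c) Since $S \subseteq \operatorname{span}(S)$, part (b) gives us $\operatorname{span}(S)^\perp \subseteq S^\perp$. Suppose $u \in S^\perp$ and $v \in \operatorname{span}(S)$. Then there exist $w_1, w_2, \ldots, w_k$ in $S$ such that $v = a_1w_1 + a_2w_2 + \cdots + a_kw_k$. Then, using $u \in S^\perp$, we have
\[ \begin{aligned} \langle u, v\rangle &= \langle u, a_1w_1 + a_2w_2 + \cdots + a_kw_k\rangle = a_1\langle u, w_1\rangle + a_2\langle u, w_2\rangle + \cdots + a_k\langle u, w_k\rangle \\ &= a_1(0) + a_2(0) + \cdots + a_k(0) = 0 \end{aligned} \]
Thus $u \in \operatorname{span}(S)^\perp$. Accordingly, $S^\perp \subseteq \operatorname{span}(S)^\perp$. Both inclusions give $S^\perp = \operatorname{span}(S)^\perp$.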
7.15. Prove Theorem 7.5: Suppose S is an orthogonal set of nonzero vectors. Then S is linearly
independent.
Suppose $S = \{u_1, u_2, \ldots, u_r\}$ and suppose
$a_1u_1 + a_2u_2 + \cdots + a_ru_r = 0 \qquad (1)$
Taking the inner product of (1) with $u_1$, we get
$0 = \langle 0, u_1\rangle = \langle a_1u_1 + a_2u_2 + \cdots + a_ru_r, u_1\rangle$
$= a_1\langle u_1, u_1\rangle + a_2\langle u_2, u_1\rangle + \cdots + a_r\langle u_r, u_1\rangle$
$= a_1\langle u_1, u_1\rangle + a_2 \cdot 0 + \cdots + a_r \cdot 0 = a_1\langle u_1, u_1\rangle$
Since $u_1 \ne 0$, we have $\langle u_1, u_1\rangle \ne 0$. Thus $a_1 = 0$. Similarly, for $i = 2, \ldots, r$, taking the inner product of (1)
with $u_i$,
$0 = \langle 0, u_i\rangle = \langle a_1u_1 + \cdots + a_ru_r, u_i\rangle$
$= a_1\langle u_1, u_i\rangle + \cdots + a_i\langle u_i, u_i\rangle + \cdots + a_r\langle u_r, u_i\rangle = a_i\langle u_i, u_i\rangle$
But $\langle u_i, u_i\rangle \ne 0$, and hence $a_i = 0$. Thus S is linearly independent.
7.16. Prove Theorem 7.6 (Pythagoras): Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Then
$\|u_1 + u_2 + \cdots + u_r\|^2 = \|u_1\|^2 + \|u_2\|^2 + \cdots + \|u_r\|^2$
Expanding the inner product, we have
$\|u_1 + u_2 + \cdots + u_r\|^2 = \langle u_1 + u_2 + \cdots + u_r, \; u_1 + u_2 + \cdots + u_r\rangle$
$= \langle u_1, u_1\rangle + \langle u_2, u_2\rangle + \cdots + \langle u_r, u_r\rangle + \sum_{i \ne j} \langle u_i, u_j\rangle$
The theorem follows from the fact that $\langle u_i, u_i\rangle = \|u_i\|^2$ and $\langle u_i, u_j\rangle = 0$ for $i \ne j$.
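7.17. Prove Theorem 7.7: Let $\{u_1, u_2, \ldots, u_n\}$ be an orthogonal basis of V. Then for any $v \in V$,
$v = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle}u_1 + \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle}u_2 + \cdots + \frac{\langle v, u_n\rangle}{\langle u_n, u_n\rangle}u_n$
Suppose $v = k_1u_1 + k_2u_2 + \cdots + k_nu_n$. Taking the inner product of both sides with $u_1$ yields
$\langle v, u_1\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n, u_1\rangle$
$= k_1\langle u_1, u_1\rangle + k_2\langle u_2, u_1\rangle + \cdots + k_n\langle u_n, u_1\rangle$
$= k_1\langle u_1, u_1\rangle + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1\langle u_1, u_1\rangle$
Thus $k_1 = \langle v, u_1\rangle / \langle u_1, u_1\rangle$. Similarly, for $i = 2, \ldots, n$,
$\langle v, u_i\rangle = \langle k_1u_1 + k_2u_2 + \cdots + k_nu_n, u_i\rangle$
$= k_1\langle u_1, u_i\rangle + k_2\langle u_2, u_i\rangle + \cdots + k_n\langle u_n, u_i\rangle$
$= k_1 \cdot 0 + \cdots + k_i\langle u_i, u_i\rangle + \cdots + k_n \cdot 0 = k_i\langle u_i, u_i\rangle$
Thus $k_i = \langle v, u_i\rangle / \langle u_i, u_i\rangle$. Substituting for $k_i$ in the equation $v = k_1u_1 + \cdots + k_nu_n$, we obtain the desired result.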
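7.18. Suppose $E = \{e_1, e_2, \ldots, e_n\}$ is an orthonormal basis of V. Prove:
(a) For any $u \in V$, we have $u = \langle u, e_1\rangle e_1 + \langle u, e_2\rangle e_2 + \cdots + \langle u, e_n\rangle e_n$.
(b) $\langle a_1e_1 + \cdots + a_ne_n, \; b_1e_1 + \cdots + b_ne_n\rangle = a_1b_1 + a_2b_2 + \cdots + a_nb_n$.
(c) For any $u, v \in V$, we have $\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$.
(a) Suppose $u = k_1e_1 + k_2e_2 + \cdots + k_ne_n$. Taking the inner product of u with $e_1$,
$\langle u, e_1\rangle = \langle k_1e_1 + k_2e_2 + \cdots + k_ne_n, e_1\rangle$
$= k_1\langle e_1, e_1\rangle + k_2\langle e_2, e_1\rangle + \cdots + k_n\langle e_n, e_1\rangle$
$= k_1(1) + k_2(0) + \cdots + k_n(0) = k_1$
Similarly, for $i = 2, \ldots, n$,
$\langle u, e_i\rangle = \langle k_1e_1 + \cdots + k_ie_i + \cdots + k_ne_n, e_i\rangle$
$= k_1\langle e_1, e_i\rangle + \cdots + k_i\langle e_i, e_i\rangle + \cdots + k_n\langle e_n, e_i\rangle$
$= k_1(0) + \cdots + k_i(1) + \cdots + k_n(0) = k_i$
Substituting $\langle u, e_i\rangle$ for $k_i$ in the equation $u = k_1e_1 + \cdots + k_ne_n$, we obtain the desired result.
(b) We have
$\left\langle \sum_{i=1}^n a_ie_i, \sum_{j=1}^n b_je_j \right\rangle = \sum_{i,j=1}^n a_ib_j\langle e_i, e_j\rangle = \sum_{i=1}^n a_ib_i\langle e_i, e_i\rangle + \sum_{i \ne j} a_ib_j\langle e_i, e_j\rangle$
But $\langle e_i, e_j\rangle = 0$ for $i \ne j$, and $\langle e_i, e_j\rangle = 1$ for $i = j$. Hence, as required,
$\left\langle \sum_{i=1}^n a_ie_i, \sum_{j=1}^n b_je_j \right\rangle = \sum_{i=1}^n a_ib_i = a_1b_1 + a_2b_2 + \cdots + a_nb_n$
(c) By part (a), we have
$u = \langle u, e_1\rangle e_1 + \cdots + \langle u, e_n\rangle e_n$ and $v = \langle v, e_1\rangle e_1 + \cdots + \langle v, e_n\rangle e_n$
Thus, by part (b),
$\langle u, v\rangle = \langle u, e_1\rangle\langle v, e_1\rangle + \langle u, e_2\rangle\langle v, e_2\rangle + \cdots + \langle u, e_n\rangle\langle v, e_n\rangle$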
PROJECTIONS, GRAM-SCHMIDT ALGORITHM, APPLICATIONS
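7.19. Suppose $w \ne 0$. Let v be any vector in V. Show that
$c = \frac{\langle v, w\rangle}{\langle w, w\rangle} = \frac{\langle v, w\rangle}{\|w\|^2}$
is the unique scalar such that $v' = v - cw$ is orthogonal to w.
In order for $v'$ to be orthogonal to w we must have
$\langle v - cw, w\rangle = 0$ or $\langle v, w\rangle - c\langle w, w\rangle = 0$ or $\langle v, w\rangle = c\langle w, w\rangle$
Thus $c = \langle v, w\rangle / \langle w, w\rangle$. Conversely, suppose $c = \langle v, w\rangle / \langle w, w\rangle$. Then
$\langle v - cw, w\rangle = \langle v, w\rangle - c\langle w, w\rangle = \langle v, w\rangle - \frac{\langle v, w\rangle}{\langle w, w\rangle}\langle w, w\rangle = 0$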
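7.20. Find the Fourier coefficient c and the projection of $v = (1, -2, 3, -4)$ along $w = (1, 2, 1, 2)$ in $\mathbf{R}^4$.
Compute $\langle v, w\rangle = 1 - 4 + 3 - 8 = -8$ and $\|w\|^2 = 1 + 4 + 1 + 4 = 10$. Then
$c = -\frac{8}{10} = -\frac{4}{5}$ and $\operatorname{proj}(v, w) = cw = \left(-\frac{4}{5}, -\frac{8}{5}, -\frac{4}{5}, -\frac{8}{5}\right)$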
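As a quick check of Problems 7.19 and 7.20, the sketch below computes the Fourier coefficient and the projection with exact fractions and confirms that the residual $v - cw$ is orthogonal to w. The function names are illustrative assumptions.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def fourier_coefficient(v, w):
    """c = <v, w> / <w, w>, defined for nonzero w."""
    return Fraction(dot(v, w), dot(w, w))

def proj(v, w):
    """Projection of v along w, that is, c * w."""
    c = fourier_coefficient(v, w)
    return tuple(c * wk for wk in w)

v = (1, -2, 3, -4)
w = (1, 2, 1, 2)
c = fourier_coefficient(v, w)
p = proj(v, w)
residual = tuple(vk - pk for vk, pk in zip(v, p))
print(c, p, dot(residual, w))
```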
7.21. Consider the subspace U of $\mathbf{R}^4$ spanned by the vectors:
$v_1 = (1, 1, 1, 1), \quad v_2 = (1, 1, 2, 4), \quad v_3 = (1, 2, -4, -3)$
Find (a) an orthogonal basis of U; (b) an orthonormal basis of U.
(a) Use the Gram-Schmidt algorithm. Begin by setting $w_1 = v_1 = (1, 1, 1, 1)$. Next find
$v_2 - \frac{\langle v_2, w_1\rangle}{\|w_1\|^2}w_1 = (1, 1, 2, 4) - \frac{8}{4}(1, 1, 1, 1) = (-1, -1, 0, 2)$
Set $w_2 = (-1, -1, 0, 2)$. Then find
$v_3 - \frac{\langle v_3, w_1\rangle}{\|w_1\|^2}w_1 - \frac{\langle v_3, w_2\rangle}{\|w_2\|^2}w_2 = (1, 2, -4, -3) - \frac{-4}{4}(1, 1, 1, 1) - \frac{-9}{6}(-1, -1, 0, 2) = \left(\frac{1}{2}, \frac{3}{2}, -3, 1\right)$
Clear fractions to obtain $w_3 = (1, 3, -6, 2)$. Then $w_1, w_2, w_3$ form an orthogonal basis of U.
(b) Normalize the orthogonal basis consisting of $w_1, w_2, w_3$. Since $\|w_1\|^2 = 4$, $\|w_2\|^2 = 6$, and $\|w_3\|^2 = 50$,
the following vectors form an orthonormal basis of U:
$u_1 = \frac{1}{2}(1, 1, 1, 1), \quad u_2 = \frac{1}{\sqrt{6}}(-1, -1, 0, 2), \quad u_3 = \frac{1}{5\sqrt{2}}(1, 3, -6, 2)$
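The Gram-Schmidt steps of Problem 7.21 can be sketched in a few lines. This version works over the rationals and clears fractions after each step, as the solution does; `gram_schmidt` and `clear_fractions` are hypothetical helper names, and the scaling step is a convenience rather than part of the algorithm proper.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def clear_fractions(w):
    """Scale a nonzero rational vector to the shortest integer vector in the same direction."""
    scale = reduce(lambda m, d: m * d // gcd(m, d), (c.denominator for c in w), 1)
    ints = [int(c * scale) for c in w]
    g = reduce(gcd, ints, 0)
    return tuple(c // g for c in ints)

def gram_schmidt(vectors):
    """Return an orthogonal basis with integer entries for span(vectors)."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for b in basis:
            coeff = Fraction(dot(v, b), dot(b, b))  # Fourier coefficient of v along b
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        if any(w):  # skip vectors already in the span
            basis.append(clear_fractions(w))
    return basis

print(gram_schmidt([(1, 1, 1, 1), (1, 1, 2, 4), (1, 2, -4, -3)]))
```

Dividing each resulting vector by its length then gives the orthonormal basis of part (b).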
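7.22. Consider the vector space $P(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Apply the Gram-Schmidt
algorithm to the set $\{1, t, t^2\}$ to obtain an orthogonal set $\{f_0, f_1, f_2\}$ with integer coefficients.
First set $f_0 = 1$. Then find
$t - \frac{\langle t, 1\rangle}{\langle 1, 1\rangle} \cdot 1 = t - \frac{1/2}{1} \cdot 1 = t - \frac{1}{2}$
Clear fractions to obtain $f_1 = 2t - 1$. Then find
$t^2 - \frac{\langle t^2, 1\rangle}{\langle 1, 1\rangle}(1) - \frac{\langle t^2, 2t - 1\rangle}{\langle 2t - 1, 2t - 1\rangle}(2t - 1) = t^2 - \frac{1/3}{1}(1) - \frac{1/6}{1/3}(2t - 1) = t^2 - t + \frac{1}{6}$
Clear fractions to obtain $f_2 = 6t^2 - 6t + 1$. Thus $\{1, 2t - 1, 6t^2 - 6t + 1\}$ is the required orthogonal set.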
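The same procedure works for polynomials once the inner product is computed exactly. In this sketch a polynomial is a list of coefficients in increasing degree (an assumed representation), and $\int_0^1 t^{i+j}\,dt = 1/(i+j+1)$ gives the inner product; `inner`, `to_integers`, and `orthogonalize` are illustrative names.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

def inner(f, g):
    """<f, g> = integral over [0, 1] of f(t) g(t) dt, from monomial integrals."""
    return sum(a * b * Fraction(1, i + j + 1)
               for i, a in enumerate(f) for j, b in enumerate(g))

def to_integers(q):
    """Clear denominators and remove the common integer factor."""
    scale = reduce(lambda m, d: m * d // gcd(m, d), (c.denominator for c in q), 1)
    ints = [int(c * scale) for c in q]
    g = reduce(gcd, ints, 0)
    return [c // g for c in ints]

def orthogonalize(polys):
    """Gram-Schmidt on equal-length coefficient lists of independent polynomials."""
    result = []
    for p in polys:
        q = [Fraction(c) for c in p]
        for r in result:
            coeff = inner(p, r) / inner(r, r)
            q = [qc - coeff * rc for qc, rc in zip(q, r)]
        result.append(to_integers(q))
    return result

# 1, t, t^2 as coefficient lists [c0, c1, c2]
print(orthogonalize([[1, 0, 0], [0, 1, 0], [0, 0, 1]]))
```

The output lists correspond to $1$, $2t - 1$, and $6t^2 - 6t + 1$.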
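7.23. Suppose $v = (1, 3, 5, 7)$. Find the projection of v onto W or, in other words, find $w \in W$ that
minimizes $\|v - w\|$, where W is the subspace of $\mathbf{R}^4$ spanned by:
(a) $u_1 = (1, 1, 1, 1)$ and $u_2 = (1, -3, 4, -2)$,
(b) $v_1 = (1, 1, 1, 1)$ and $v_2 = (1, 2, 3, 2)$
(a) Since $u_1$ and $u_2$ are orthogonal, we need only compute the Fourier coefficients:
$c_1 = \frac{\langle v, u_1\rangle}{\langle u_1, u_1\rangle} = \frac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \frac{16}{4} = 4$
$c_2 = \frac{\langle v, u_2\rangle}{\langle u_2, u_2\rangle} = \frac{1 - 9 + 20 - 14}{1 + 9 + 16 + 4} = \frac{-2}{30} = -\frac{1}{15}$
Then $w = \operatorname{proj}(v, W) = c_1u_1 + c_2u_2 = 4(1, 1, 1, 1) - \frac{1}{15}(1, -3, 4, -2) = \left(\frac{59}{15}, \frac{21}{5}, \frac{56}{15}, \frac{62}{15}\right)$.
(b) Since $v_1$ and $v_2$ are not orthogonal, first apply the Gram-Schmidt algorithm to find an orthogonal basis
for W. Set $w_1 = v_1 = (1, 1, 1, 1)$. Then find
$v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 2, 3, 2) - \frac{8}{4}(1, 1, 1, 1) = (-1, 0, 1, 0)$
Set $w_2 = (-1, 0, 1, 0)$. Now compute
$c_1 = \frac{\langle v, w_1\rangle}{\langle w_1, w_1\rangle} = \frac{1 + 3 + 5 + 7}{1 + 1 + 1 + 1} = \frac{16}{4} = 4$
$c_2 = \frac{\langle v, w_2\rangle}{\langle w_2, w_2\rangle} = \frac{-1 + 0 + 5 + 0}{1 + 0 + 1 + 0} = \frac{4}{2} = 2$
Then $w = \operatorname{proj}(v, W) = c_1w_1 + c_2w_2 = 4(1, 1, 1, 1) + 2(-1, 0, 1, 0) = (2, 4, 6, 4)$. As a check, $v - w = (-1, -1, -1, 3)$ is orthogonal to both $v_1$ and $v_2$.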
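Projection onto a subspace, as in Problem 7.23, can be sketched by first orthogonalizing the spanning set and then summing the projections along each basis vector. The defining property to check is that the residual $v - w$ is orthogonal to every spanning vector. The names `orthogonal_basis` and `project` are illustrative.

```python
from fractions import Fraction

def dot(x, y):
    """Usual inner product on R^n."""
    return sum(a * b for a, b in zip(x, y))

def orthogonal_basis(vectors):
    """Gram-Schmidt without normalization; dependent vectors are dropped."""
    basis = []
    for v in vectors:
        w = [Fraction(c) for c in v]
        for b in basis:
            coeff = dot(v, b) / dot(b, b)
            w = [wi - coeff * bi for wi, bi in zip(w, b)]
        if any(w):
            basis.append(w)
    return basis

def project(v, spanning):
    """Orthogonal projection of v onto span(spanning)."""
    w = [Fraction(0)] * len(v)
    for b in orthogonal_basis(spanning):
        c = dot(v, b) / dot(b, b)  # Fourier coefficient of v along b
        w = [wi + c * bi for wi, bi in zip(w, b)]
    return tuple(w)

v = (1, 3, 5, 7)
wa = project(v, [(1, 1, 1, 1), (1, -3, 4, -2)])
wb = project(v, [(1, 1, 1, 1), (1, 2, 3, 2)])
residual_b = tuple(vk - wk for vk, wk in zip(v, wb))
print(wa, wb, residual_b)
```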
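7.24. Suppose $w_1$ and $w_2$ are nonzero orthogonal vectors. Let v be any vector in V. Find $c_1$ and $c_2$ so that
$v'$ is orthogonal to $w_1$ and $w_2$, where $v' = v - c_1w_1 - c_2w_2$.
If $v'$ is orthogonal to $w_1$, then
$0 = \langle v - c_1w_1 - c_2w_2, w_1\rangle = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2\langle w_2, w_1\rangle$
$= \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle - c_2 \cdot 0 = \langle v, w_1\rangle - c_1\langle w_1, w_1\rangle$
Thus $c_1 = \langle v, w_1\rangle / \langle w_1, w_1\rangle$. (That is, $c_1$ is the component of v along $w_1$.) Similarly, if $v'$ is orthogonal to $w_2$,
then
$0 = \langle v - c_1w_1 - c_2w_2, w_2\rangle = \langle v, w_2\rangle - c_2\langle w_2, w_2\rangle$
Thus $c_2 = \langle v, w_2\rangle / \langle w_2, w_2\rangle$. (That is, $c_2$ is the component of v along $w_2$.)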
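7.25. Prove Theorem 7.8: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let
$v \in V$. Define
$v' = v - (c_1w_1 + c_2w_2 + \cdots + c_rw_r)$, where $c_i = \frac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}$
Then $v'$ is orthogonal to $w_1, w_2, \ldots, w_r$.
For $i = 1, 2, \ldots, r$ and using $\langle w_i, w_j\rangle = 0$ for $i \ne j$, we have
$\langle v - c_1w_1 - c_2w_2 - \cdots - c_rw_r, w_i\rangle = \langle v, w_i\rangle - c_1\langle w_1, w_i\rangle - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r\langle w_r, w_i\rangle$
$= \langle v, w_i\rangle - c_1 \cdot 0 - \cdots - c_i\langle w_i, w_i\rangle - \cdots - c_r \cdot 0$
$= \langle v, w_i\rangle - c_i\langle w_i, w_i\rangle = \langle v, w_i\rangle - \frac{\langle v, w_i\rangle}{\langle w_i, w_i\rangle}\langle w_i, w_i\rangle = 0$
Thus the theorem is proved.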
7.26. Prove Theorem 7.9: Let $\{v_1, v_2, \ldots, v_n\}$ be any basis of an inner product space V. Then there exists
an orthonormal basis $\{u_1, u_2, \ldots, u_n\}$ of V such that the change-of-basis matrix from $\{v_i\}$ to $\{u_i\}$ is
triangular, that is, for $k = 1, 2, \ldots, n$,
$u_k = a_{k1}v_1 + a_{k2}v_2 + \cdots + a_{kk}v_k$
The proof uses the Gram-Schmidt algorithm and Remarks 1 and 3 of Section 7.7. That is, apply the
algorithm to $\{v_i\}$ to obtain an orthogonal basis $\{w_1, \ldots, w_n\}$, and then normalize $\{w_i\}$ to obtain an orthonormal
basis $\{u_i\}$ of V. The specific algorithm guarantees that each $w_k$ is a linear combination of $v_1, \ldots, v_k$, and hence
each $u_k$ is a linear combination of $v_1, \ldots, v_k$.
7.27. Prove Theorem 7.10: Suppose $S = \{w_1, w_2, \ldots, w_r\}$ is an orthogonal basis for a subspace W of V.
Then one may extend S to an orthogonal basis for V, that is, one may find vectors $w_{r+1}, \ldots, w_n$ such
that $\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V.
Extend S to a basis $S' = \{w_1, \ldots, w_r, v_{r+1}, \ldots, v_n\}$ for V. Applying the Gram-Schmidt algorithm to $S'$,
we first obtain $w_1, w_2, \ldots, w_r$ since S is orthogonal, and then we obtain vectors $w_{r+1}, \ldots, w_n$, where
$\{w_1, w_2, \ldots, w_n\}$ is an orthogonal basis for V. Thus the theorem is proved.
7.28. Prove Theorem 7.4: Let W be a subspace of V. Then $V = W \oplus W^\perp$.
By Theorem 7.9, there exists an orthogonal basis $\{u_1, \ldots, u_r\}$ of W, and by Theorem 7.10 we can extend
it to an orthogonal basis $\{u_1, u_2, \ldots, u_n\}$ of V. Hence $u_{r+1}, \ldots, u_n \in W^\perp$. If $v \in V$, then
$v = a_1u_1 + \cdots + a_nu_n$, where $a_1u_1 + \cdots + a_ru_r \in W$ and $a_{r+1}u_{r+1} + \cdots + a_nu_n \in W^\perp$
Accordingly, $V = W + W^\perp$.
On the other hand, if $w \in W \cap W^\perp$, then $\langle w, w\rangle = 0$. This yields $w = 0$. Hence $W \cap W^\perp = \{0\}$.
The two conditions $V = W + W^\perp$ and $W \cap W^\perp = \{0\}$ give the desired result $V = W \oplus W^\perp$.
Remark: Note that we have proved the theorem for the case that V has finite dimension. We remark that
the theorem also holds for spaces of arbitrary dimension.
7.29. Suppose W is a subspace of a finite-dimensional space V. Prove that $W = W^{\perp\perp}$.
By Theorem 7.4, $V = W \oplus W^\perp$, and also $V = W^\perp \oplus W^{\perp\perp}$. Hence
$\dim W = \dim V - \dim W^\perp$ and $\dim W^{\perp\perp} = \dim V - \dim W^\perp$
This yields $\dim W = \dim W^{\perp\perp}$. But $W \subseteq W^{\perp\perp}$ (see Problem 7.14). Hence $W = W^{\perp\perp}$, as required.
7.30. Prove the following: Suppose $w_1, w_2, \ldots, w_r$ form an orthogonal set of nonzero vectors in V. Let v
be any vector in V and let $c_i$ be the component of v along $w_i$. Then, for any scalars $a_1, \ldots, a_r$, we
have
$\left\| v - \sum_{k=1}^r c_kw_k \right\| \le \left\| v - \sum_{k=1}^r a_kw_k \right\|$
That is, $\sum c_kw_k$ is the closest approximation to v as a linear combination of $w_1, \ldots, w_r$.
By Theorem 7.8, $v - \sum c_kw_k$ is orthogonal to every $w_i$ and hence orthogonal to any linear combination
of $w_1, w_2, \ldots, w_r$. Therefore, using the Pythagorean theorem and summing from $k = 1$ to r,
$\left\| v - \sum a_kw_k \right\|^2 = \left\| v - \sum c_kw_k + \sum (c_k - a_k)w_k \right\|^2 = \left\| v - \sum c_kw_k \right\|^2 + \left\| \sum (c_k - a_k)w_k \right\|^2$
$\ge \left\| v - \sum c_kw_k \right\|^2$
The square root of both sides gives our theorem.
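7.31. Suppose $\{e_1, e_2, \ldots, e_r\}$ is an orthonormal set of vectors in V. Let v be any vector in V and let $c_i$ be
the Fourier coefficient of v with respect to $e_i$. Prove Bessel's inequality:
$\sum_{k=1}^r c_k^2 \le \|v\|^2$
Note that $c_i = \langle v, e_i\rangle$, since $\|e_i\| = 1$. Then, using $\langle e_i, e_j\rangle = 0$ for $i \ne j$ and summing from $k = 1$ to r, we
get
$0 \le \left\langle v - \sum c_ke_k, \; v - \sum c_ke_k \right\rangle = \langle v, v\rangle - 2\left\langle v, \sum c_ke_k \right\rangle + \sum c_k^2 = \langle v, v\rangle - \sum 2c_k\langle v, e_k\rangle + \sum c_k^2$
$= \langle v, v\rangle - \sum 2c_k^2 + \sum c_k^2 = \langle v, v\rangle - \sum c_k^2$
This gives us our inequality.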
ORTHOGONAL MATRICES
7.32. Find an orthogonal matrix P whose first row is $u_1 = \left(\frac{1}{3}, \frac{2}{3}, \frac{2}{3}\right)$.
First find a nonzero vector $w_2 = (x, y, z)$ which is orthogonal to $u_1$, i.e., for which
$0 = \langle u_1, w_2\rangle = \frac{x}{3} + \frac{2y}{3} + \frac{2z}{3}$ or $x + 2y + 2z = 0$
One such solution is $w_2 = (0, 1, -1)$. Normalize $w_2$ to obtain the second row of P, i.e.,
$u_2 = (0, 1/\sqrt{2}, -1/\sqrt{2})$.
Next find a nonzero vector $w_3 = (x, y, z)$ that is orthogonal to both $u_1$ and $u_2$, i.e., for which
$0 = \langle u_1, w_3\rangle = \frac{x}{3} + \frac{2y}{3} + \frac{2z}{3}$ or $x + 2y + 2z = 0$
$0 = \langle u_2, w_3\rangle = \frac{y}{\sqrt{2}} - \frac{z}{\sqrt{2}}$ or $y - z = 0$
Set $z = -1$ and find the solution $w_3 = (4, -1, -1)$. Normalize $w_3$ and obtain the third row of P, that is,
$u_3 = (4/\sqrt{18}, -1/\sqrt{18}, -1/\sqrt{18})$.
Thus
$P = \begin{bmatrix} 1/3 & 2/3 & 2/3 \\ 0 & 1/\sqrt{2} & -1/\sqrt{2} \\ 4/(3\sqrt{2}) & -1/(3\sqrt{2}) & -1/(3\sqrt{2}) \end{bmatrix}$
We emphasize that the above matrix P is not unique.
7.33. Let $A = \begin{bmatrix} 1 & 1 & -1 \\ 1 & 3 & 4 \\ 7 & -5 & 2 \end{bmatrix}$. Determine whether or not: (a) the rows of A are orthogonal;
(b) A is an orthogonal matrix; (c) the columns of A are orthogonal.
(a) Yes, since $(1, 1, -1) \cdot (1, 3, 4) = 1 + 3 - 4 = 0$, $(1, 1, -1) \cdot (7, -5, 2) = 7 - 5 - 2 = 0$, and
$(1, 3, 4) \cdot (7, -5, 2) = 7 - 15 + 8 = 0$.
(b) No, since the rows of A are not unit vectors; e.g., $\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3$.
(c) No; e.g., $(1, 1, 7) \cdot (1, 3, -5) = 1 + 3 - 35 = -31 \ne 0$.
7.34. Let B be the matrix obtained by normalizing each row of A in Problem 7.33.
(a) Find B.
(b) Is B an orthogonal matrix?
(c) Are the columns of B orthogonal?
(a) We have
$\|(1, 1, -1)\|^2 = 1 + 1 + 1 = 3, \qquad \|(1, 3, 4)\|^2 = 1 + 9 + 16 = 26$
$\|(7, -5, 2)\|^2 = 49 + 25 + 4 = 78$
Thus
$B = \begin{bmatrix} 1/\sqrt{3} & 1/\sqrt{3} & -1/\sqrt{3} \\ 1/\sqrt{26} & 3/\sqrt{26} & 4/\sqrt{26} \\ 7/\sqrt{78} & -5/\sqrt{78} & 2/\sqrt{78} \end{bmatrix}$
(b) Yes, since the rows of B are still orthogonal and are now unit vectors.
(c) Yes, since the rows of B form an orthonormal set of vectors. Then, by Theorem 7.11, the columns of B
must automatically form an orthonormal set.
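Problems 7.33 and 7.34 can be illustrated numerically. The sketch below normalizes the rows of A and tests $MM^T = I$ in floating point with a small tolerance; the helper names and the tolerance value are assumptions made for the example.

```python
import math

def transpose(M):
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def normalize_rows(M):
    """Divide each row by its Euclidean length."""
    return [[x / math.sqrt(sum(y * y for y in row)) for x in row] for row in M]

def is_orthogonal_matrix(M, tol=1e-12):
    """True when M times its transpose is the identity, up to tol."""
    product = matmul(M, transpose(M))
    n = len(M)
    return all(abs(product[i][j] - (1.0 if i == j else 0.0)) <= tol
               for i in range(n) for j in range(n))

A = [[1, 1, -1], [1, 3, 4], [7, -5, 2]]
B = normalize_rows(A)
print(is_orthogonal_matrix(A), is_orthogonal_matrix(B), is_orthogonal_matrix(transpose(B)))
```

The last check applies the same test to $B^T$, which is the statement that the columns of B are orthonormal as well.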
7.35. Prove each of the following:
(a) P is orthogonal if and only if $P^T$ is orthogonal.
(b) If P is orthogonal, then $P^{-1}$ is orthogonal.
(c) If P and Q are orthogonal, then PQ is orthogonal.
(a) We have $(P^T)^T = P$. Thus P is orthogonal if and only if $PP^T = I$ if and only if $(P^T)^TP^T = I$ if and only if
$P^T$ is orthogonal.
(b) We have $P^T = P^{-1}$, since P is orthogonal. Thus, by part (a), $P^{-1}$ is orthogonal.
(c) We have $P^T = P^{-1}$ and $Q^T = Q^{-1}$. Thus $(PQ)(PQ)^T = PQQ^TP^T = PQQ^{-1}P^{-1} = I$. Therefore
$(PQ)^T = (PQ)^{-1}$, and so PQ is orthogonal.
7.36. Suppose P is an orthogonal matrix. Show that:
(a) $\langle Pu, Pv\rangle = \langle u, v\rangle$ for any $u, v \in V$;
(b) $\|Pu\| = \|u\|$ for every $u \in V$.
Use $P^TP = I$ and $\langle u, v\rangle = u^Tv$.
(a) $\langle Pu, Pv\rangle = (Pu)^T(Pv) = u^TP^TPv = u^Tv = \langle u, v\rangle$.
(b) We have
$\|Pu\|^2 = \langle Pu, Pu\rangle = u^TP^TPu = u^Tu = \langle u, u\rangle = \|u\|^2$
Taking the square root of both sides gives our result.
7.37. Prove Theorem 7.12: Suppose $E = \{e_i\}$ and $E' = \{e_i'\}$ are orthonormal bases of V. Let P be the
change-of-basis matrix from E to $E'$. Then P is orthogonal.
Suppose
$e_i' = b_{i1}e_1 + b_{i2}e_2 + \cdots + b_{in}e_n, \quad i = 1, \ldots, n \qquad (1)$
Using Problem 7.18(b) and the fact that $E'$ is orthonormal, we get
$\delta_{ij} = \langle e_i', e_j'\rangle = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn} \qquad (2)$
Let $B = [b_{ij}]$ be the matrix of the coefficients in (1). (Then $P = B^T$.) Suppose $BB^T = [c_{ij}]$. Then
$c_{ij} = b_{i1}b_{j1} + b_{i2}b_{j2} + \cdots + b_{in}b_{jn} \qquad (3)$
By (2) and (3), we have $c_{ij} = \delta_{ij}$. Thus $BB^T = I$. Accordingly, B is orthogonal, and hence $P = B^T$ is
orthogonal.
7.38. Prove Theorem 7.13: Let $\{e_1, \ldots, e_n\}$ be an orthonormal basis of an inner product space V. Let
$P = [a_{ij}]$ be an orthogonal matrix. Then the following n vectors form an orthonormal basis for V:
$e_i' = a_{1i}e_1 + a_{2i}e_2 + \cdots + a_{ni}e_n, \quad i = 1, 2, \ldots, n$
Since $\{e_i\}$ is orthonormal, we get, by Problem 7.18(b),
$\langle e_i', e_j'\rangle = a_{1i}a_{1j} + a_{2i}a_{2j} + \cdots + a_{ni}a_{nj} = \langle C_i, C_j\rangle$
where $C_i$ denotes the ith column of the orthogonal matrix $P = [a_{ij}]$. Since P is orthogonal, its columns form an
orthonormal set. This implies $\langle e_i', e_j'\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus $\{e_i'\}$ is an orthonormal basis.
INNER PRODUCTS AND POSITIVE DEFINITE MATRICES
7.39. Which of the following symmetric matrices are positive definite?
(a) $A = \begin{bmatrix} 3 & 4 \\ 4 & 5 \end{bmatrix}$, (b) $B = \begin{bmatrix} 8 & -3 \\ -3 & 2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 2 & 1 \\ 1 & -3 \end{bmatrix}$, (d) $D = \begin{bmatrix} 3 & 5 \\ 5 & 9 \end{bmatrix}$
Use Theorem 7.14 that a 2 x 2 real symmetric matrix is positive definite if its diagonal entries are
positive and if its determinant is positive.
(a) No, since $|A| = 15 - 16 = -1$ is negative.
(b) Yes.
(c) No, since the diagonal entry $-3$ is negative.
(d) Yes.
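The criterion of Theorem 7.14 used in Problem 7.39 is easy to express as a test. The function below is a minimal sketch for the 2 x 2 real symmetric case only; its name and the choice to raise on non-symmetric input are illustrative.

```python
def is_positive_definite_2x2(M):
    """Theorem 7.14 test: diagonal entries positive and determinant positive."""
    (a, b), (c, d) = M
    if b != c:
        raise ValueError("matrix must be symmetric")
    return a > 0 and d > 0 and a * d - b * b > 0

matrices = {
    "A": [[3, 4], [4, 5]],
    "B": [[8, -3], [-3, 2]],
    "C": [[2, 1], [1, -3]],
    "D": [[3, 5], [5, 9]],
}
results = {name: is_positive_definite_2x2(M) for name, M in matrices.items()}
print(results)
```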
7.40. Find the values of k that make each of the following matrices positive definite:
(a) $A = [\ldots]$, (b) $B = \begin{bmatrix} 4 & k \\ k & 9 \end{bmatrix}$, (c) $C = [\ldots]$
(a) First, k must be positive. Also, $|A| = 2k - 16$ must be positive, that is, $2k - 16 > 0$. Hence $k > 8$.
(b) We need $|B| = 36 - k^2$ positive, that is, $36 - k^2 > 0$. Hence $k^2 < 36$ or $-6 < k < 6$.
(c) C can never be positive definite, since C has a negative diagonal entry $-2$.
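7.41. Find the matrix A that represents the usual inner product on $\mathbf{R}^2$ relative to each of the following
bases of $\mathbf{R}^2$: (a) $\{v_1 = (1, 3), v_2 = (2, 5)\}$, (b) $\{w_1 = (1, 2), w_2 = (4, -2)\}$
(a) Compute $\langle v_1, v_1\rangle = 1 + 9 = 10$, $\langle v_1, v_2\rangle = 2 + 15 = 17$, $\langle v_2, v_2\rangle = 4 + 25 = 29$. Thus $A = \begin{bmatrix} 10 & 17 \\ 17 & 29 \end{bmatrix}$.
(b) Compute $\langle w_1, w_1\rangle = 1 + 4 = 5$, $\langle w_1, w_2\rangle = 4 - 4 = 0$, $\langle w_2, w_2\rangle = 16 + 4 = 20$. Thus $A = \begin{bmatrix} 5 & 0 \\ 0 & 20 \end{bmatrix}$.
(Since the basis vectors are orthogonal, the matrix A is diagonal.)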
7.42. Consider the vector space $P_2(t)$ with inner product $\langle f, g\rangle = \int_{-1}^{1} f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix A of the inner product with respect to the basis $\{1, t, t^2\}$ of V.
(c) Verify Theorem 7.16 by showing that $\langle f, g\rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.
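(a) $\langle f, g\rangle = \int_{-1}^{1} (t + 2)(t^2 - 3t + 4)\,dt = \int_{-1}^{1} (t^3 - t^2 - 2t + 8)\,dt = \left[\frac{t^4}{4} - \frac{t^3}{3} - t^2 + 8t\right]_{-1}^{1} = \frac{46}{3}$
(b) Here we use the fact that, if $r + s = n$,
$\langle t^r, t^s\rangle = \int_{-1}^{1} t^n\,dt = \left[\frac{t^{n+1}}{n+1}\right]_{-1}^{1} = 2/(n+1)$ if n is even, and 0 if n is odd.
Then $\langle 1, 1\rangle = 2$, $\langle 1, t\rangle = 0$, $\langle 1, t^2\rangle = \frac{2}{3}$, $\langle t, t\rangle = \frac{2}{3}$, $\langle t, t^2\rangle = 0$, $\langle t^2, t^2\rangle = \frac{2}{5}$. Thus
$A = \begin{bmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{bmatrix}$
(c) We have $[f]^T = (2, 1, 0)$ and $[g]^T = (4, -3, 1)$ relative to the given basis. Then
$[f]^TA[g] = (2, 1, 0) \begin{bmatrix} 2 & 0 & \frac{2}{3} \\ 0 & \frac{2}{3} & 0 \\ \frac{2}{3} & 0 & \frac{2}{5} \end{bmatrix} \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \left(4, \frac{2}{3}, \frac{4}{3}\right) \begin{bmatrix} 4 \\ -3 \\ 1 \end{bmatrix} = \frac{46}{3} = \langle f, g\rangle$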
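The verification in Problem 7.42(c) can be reproduced exactly. This sketch builds the matrix A from the monomial rule of part (b) and evaluates $[f]^TA[g]$ from coordinate vectors; `monomial_inner` and `inner_via_matrix` are illustrative names.

```python
from fractions import Fraction

def monomial_inner(r, s):
    """<t^r, t^s> on [-1, 1]: 2/(n+1) when n = r + s is even, otherwise 0."""
    n = r + s
    return Fraction(2, n + 1) if n % 2 == 0 else Fraction(0)

# Matrix of the inner product relative to the basis {1, t, t^2}
A = [[monomial_inner(r, s) for s in range(3)] for r in range(3)]

def inner_via_matrix(f, g, A):
    """Evaluate [f]^T A [g] for coordinate vectors f and g."""
    return sum(f[i] * A[i][j] * g[j]
               for i in range(len(f)) for j in range(len(g)))

f = [2, 1, 0]   # t + 2
g = [4, -3, 1]  # t^2 - 3t + 4
value = inner_via_matrix(f, g, A)
print(A, value)
```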
7.43. Prove Theorem 7.14: $A = \begin{bmatrix} a & b \\ b & d \end{bmatrix}$ is positive definite if and only if a and d are positive and
$|A| = ad - b^2$ is positive.
Let $u = [x, y]^T$. Then
$f(u) = u^TAu = [x, y] \begin{bmatrix} a & b \\ b & d \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = ax^2 + 2bxy + dy^2$
Suppose $f(u) > 0$ for every $u \ne 0$. Then $f(1, 0) = a > 0$ and $f(0, 1) = d > 0$. Also, we have
$f(b, -a) = a(ad - b^2) > 0$. Since $a > 0$, we get $ad - b^2 > 0$.
Conversely, suppose $a > 0$, $d > 0$, $ad - b^2 > 0$. Completing the square gives us
$f(u) = a\left(x^2 + \frac{2b}{a}xy + \frac{b^2}{a^2}y^2\right) + dy^2 - \frac{b^2}{a}y^2 = a\left(x + \frac{b}{a}y\right)^2 + \frac{ad - b^2}{a}y^2$
Accordingly, $f(u) > 0$ for every $u \ne 0$.
7.44. Prove Theorem 7.15: Let A be a real positive definite matrix. Then the function $\langle u, v\rangle = u^TAv$ is an
inner product on $\mathbf{R}^n$.
For any vectors $u_1$, $u_2$, and v,
$\langle u_1 + u_2, v\rangle = (u_1 + u_2)^TAv = (u_1^T + u_2^T)Av = u_1^TAv + u_2^TAv = \langle u_1, v\rangle + \langle u_2, v\rangle$
and, for any scalar k and vectors u, v,
$\langle ku, v\rangle = (ku)^TAv = ku^TAv = k\langle u, v\rangle$
Thus $[I_1]$ is satisfied.
Since $u^TAv$ is a scalar, $(u^TAv)^T = u^TAv$. Also, $A^T = A$ since A is symmetric. Therefore
$\langle u, v\rangle = u^TAv = (u^TAv)^T = v^TA^Tu^{TT} = v^TAu = \langle v, u\rangle$
Thus $[I_2]$ is satisfied.
Lastly, since A is positive definite, $X^TAX > 0$ for any nonzero $X \in \mathbf{R}^n$. Thus for any nonzero vector
v, $\langle v, v\rangle = v^TAv > 0$. Also, $\langle 0, 0\rangle = 0^TA0 = 0$. Thus $[I_3]$ is satisfied. Accordingly, the function $\langle u, v\rangle = u^TAv$ is
an inner product.
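7.45. Prove Theorem 7.16: Let A be the matrix representation of an inner product relative to a basis S of
V. Then, for any vectors $u, v \in V$, we have
$\langle u, v\rangle = [u]^TA[v]$
Suppose $S = \{w_1, w_2, \ldots, w_n\}$ and $A = [k_{ij}]$. Hence $k_{ij} = \langle w_i, w_j\rangle$. Suppose
$u = a_1w_1 + a_2w_2 + \cdots + a_nw_n$ and $v = b_1w_1 + b_2w_2 + \cdots + b_nw_n$
Then
$\langle u, v\rangle = \sum_{i=1}^n \sum_{j=1}^n a_ib_j\langle w_i, w_j\rangle \qquad (1)$
On the other hand,
$[u]^TA[v] = (a_1, a_2, \ldots, a_n) \begin{bmatrix} k_{11} & k_{12} & \cdots & k_{1n} \\ k_{21} & k_{22} & \cdots & k_{2n} \\ \vdots & \vdots & & \vdots \\ k_{n1} & k_{n2} & \cdots & k_{nn} \end{bmatrix} \begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_n \end{bmatrix} = \sum_{j=1}^n \sum_{i=1}^n a_ib_jk_{ij} \qquad (2)$
Equations (1) and (2) give us our result.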
7.46. Prove Theorem 7.17: Let A be the matrix representation of any inner product on V. Then A is a
positive definite matrix.
Since $\langle w_i, w_j\rangle = \langle w_j, w_i\rangle$ for any basis vectors $w_i$ and $w_j$, the matrix A is symmetric. Let X be any
nonzero vector in $\mathbf{R}^n$. Then $[u] = X$ for some nonzero vector $u \in V$. Theorem 7.16 tells us that
$X^TAX = [u]^TA[u] = \langle u, u\rangle > 0$. Thus A is positive definite.
COMPLEX INNER PRODUCT SPACES
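7.47. Let V be a complex inner product space. Verify the relation
$\langle u, av_1 + bv_2\rangle = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle$
Using $[I_2^*]$, $[I_1^*]$, and then $[I_2^*]$, we find
$\langle u, av_1 + bv_2\rangle = \overline{\langle av_1 + bv_2, u\rangle} = \overline{a\langle v_1, u\rangle + b\langle v_2, u\rangle} = \bar{a}\,\overline{\langle v_1, u\rangle} + \bar{b}\,\overline{\langle v_2, u\rangle} = \bar{a}\langle u, v_1\rangle + \bar{b}\langle u, v_2\rangle$
7.48. Suppose $\langle u, v\rangle = 3 + 2i$ in a complex inner product space V. Find:
(a) $\langle (2 - 4i)u, v\rangle$ (b) $\langle u, (4 + 3i)v\rangle$ (c) $\langle (3 - 6i)u, (5 - 2i)v\rangle$
(a) $\langle (2 - 4i)u, v\rangle = (2 - 4i)\langle u, v\rangle = (2 - 4i)(3 + 2i) = 14 - 8i$
(b) $\langle u, (4 + 3i)v\rangle = \overline{(4 + 3i)}\langle u, v\rangle = (4 - 3i)(3 + 2i) = 18 - i$
(c) $\langle (3 - 6i)u, (5 - 2i)v\rangle = (3 - 6i)\overline{(5 - 2i)}\langle u, v\rangle = (3 - 6i)(5 + 2i)(3 + 2i) = (27 - 24i)(3 + 2i) = 129 - 18i$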
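The scalar rules of Problems 7.47 and 7.48 (linear in the first position, conjugate linear in the second) can be checked with Python's built-in complex numbers. The helper `inner_scaled` is a hypothetical name that encodes the single identity $\langle au, bv\rangle = a\bar{b}\langle u, v\rangle$.

```python
uv = 3 + 2j  # the given value of <u, v>

def inner_scaled(a, b, uv):
    """<a u, b v> = a * conj(b) * <u, v> for complex scalars a and b."""
    return a * b.conjugate() * uv

part_a = inner_scaled(2 - 4j, 1 + 0j, uv)   # <(2 - 4i)u, v>
part_b = inner_scaled(1 + 0j, 4 + 3j, uv)   # <u, (4 + 3i)v>
part_c = inner_scaled(3 - 6j, 5 - 2j, uv)   # <(3 - 6i)u, (5 - 2i)v>
print(part_a, part_b, part_c)
```

All intermediate values are small integers, so the floating-point products here are exact.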
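7.49. Find the Fourier coefficient (component) c and the projection cw of $v = (3 + 4i, 2 - 3i)$ along
$w = (5 + i, 2i)$ in $\mathbf{C}^2$.
Recall that $c = \langle v, w\rangle / \langle w, w\rangle$. Compute
$\langle v, w\rangle = (3 + 4i)\overline{(5 + i)} + (2 - 3i)\overline{(2i)} = (3 + 4i)(5 - i) + (2 - 3i)(-2i)$
$= 19 + 17i - 6 - 4i = 13 + 13i$
$\langle w, w\rangle = 25 + 1 + 4 = 30$
Thus $c = (13 + 13i)/30 = \frac{13}{30} + \frac{13}{30}i$. Accordingly, $\operatorname{proj}(v, w) = cw = \left(\frac{26}{15} + \frac{39}{15}i, \; -\frac{13}{15} + \frac{13}{15}i\right)$.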
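A numerical check of Problem 7.49 follows, assuming the usual inner product on $\mathbf{C}^n$ with the conjugate on the second argument. The results are floating-point values, so they are compared against the exact fractions up to rounding; `inner` and `project` are illustrative names.

```python
def inner(u, v):
    """Usual inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def project(v, w):
    """Return the Fourier coefficient c = <v, w>/<w, w> and the projection c*w."""
    c = inner(v, w) / inner(w, w)
    return c, tuple(c * wk for wk in w)

v = (3 + 4j, 2 - 3j)
w = (5 + 1j, 2j)
c, p = project(v, w)
residual = tuple(vk - pk for vk, pk in zip(v, p))
print(c, p, abs(inner(residual, w)))
```

The last printed value is the size of $\langle v - cw, w\rangle$, which should vanish up to rounding error.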
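7.50. Prove Theorem 7.18 (Cauchy-Schwarz): Let V be a complex inner product space. Then
$|\langle u, v\rangle| \le \|u\|\,\|v\|$.
If $v = 0$, the inequality reduces to $0 \le 0$ and hence is valid. Now suppose $v \ne 0$. Using $z\bar{z} = |z|^2$ (for any
complex number z) and $\langle v, u\rangle = \overline{\langle u, v\rangle}$, we expand $\|u - \langle u, v\rangle tv\|^2 \ge 0$, where t is any real value:
$0 \le \|u - \langle u, v\rangle tv\|^2 = \langle u - \langle u, v\rangle tv, \; u - \langle u, v\rangle tv\rangle$
$= \langle u, u\rangle - \overline{\langle u, v\rangle}\,t\langle u, v\rangle - \langle u, v\rangle t\langle v, u\rangle + \langle u, v\rangle\overline{\langle u, v\rangle}\,t^2\langle v, v\rangle$
$= \|u\|^2 - 2t|\langle u, v\rangle|^2 + |\langle u, v\rangle|^2t^2\|v\|^2$
Set $t = 1/\|v\|^2$ to find $0 \le \|u\|^2 - \frac{|\langle u, v\rangle|^2}{\|v\|^2}$, from which $|\langle u, v\rangle|^2 \le \|u\|^2\|v\|^2$. Taking the square
root of both sides, we obtain the required inequality.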
7.51. Find an orthogonal basis for $u^\perp$ in $\mathbf{C}^3$ where $u = (1, i, 1 + i)$.
Here $u^\perp$ consists of all vectors $w = (x, y, z)$ such that
$\langle w, u\rangle = x - iy + (1 - i)z = 0$
Find one solution, say $w_1 = (0, 1 - i, i)$. Then find a solution of the system
$x - iy + (1 - i)z = 0, \quad (1 + i)y - iz = 0$
Here z is a free variable. Set $z = 1$ to obtain $y = i/(1 + i) = (1 + i)/2$ and $x = (3i - 3)/2$. Multiplying by 2
yields the solution $w_2 = (3i - 3, 1 + i, 2)$. The vectors $w_1$ and $w_2$ form an orthogonal basis for $u^\perp$.
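7.52. Find an orthonormal basis of the subspace W of $\mathbf{C}^3$ spanned by
$v_1 = (1, i, 0)$ and $v_2 = (1, 2, 1 - i)$.
Apply the Gram-Schmidt algorithm. Set $w_1 = v_1 = (1, i, 0)$. Compute
$v_2 - \frac{\langle v_2, w_1\rangle}{\langle w_1, w_1\rangle}w_1 = (1, 2, 1 - i) - \frac{1 - 2i}{2}(1, i, 0) = \left(\frac{1}{2} + i, \; 1 - \frac{1}{2}i, \; 1 - i\right)$
Multiply by 2 to clear fractions, obtaining $w_2 = (1 + 2i, 2 - i, 2 - 2i)$. Next find $\|w_1\| = \sqrt{2}$ and then
$\|w_2\| = \sqrt{18}$. Normalizing $\{w_1, w_2\}$, we obtain the following orthonormal basis of W:
$\left\{ u_1 = \left(\frac{1}{\sqrt{2}}, \frac{i}{\sqrt{2}}, 0\right), \quad u_2 = \left(\frac{1 + 2i}{\sqrt{18}}, \frac{2 - i}{\sqrt{18}}, \frac{2 - 2i}{\sqrt{18}}\right) \right\}$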
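7.53. Find the matrix P that represents the usual inner product on $\mathbf{C}^3$ relative to the basis $\{1, i, 1 - i\}$.
Compute the following six inner products:
$\langle 1, 1\rangle = 1, \quad \langle 1, i\rangle = \bar{i} = -i, \quad \langle 1, 1 - i\rangle = \overline{1 - i} = 1 + i$
$\langle i, i\rangle = i\bar{i} = 1, \quad \langle i, 1 - i\rangle = i\,\overline{(1 - i)} = -1 + i, \quad \langle 1 - i, 1 - i\rangle = 2$
Then, using $\langle u, v\rangle = \overline{\langle v, u\rangle}$, we obtain
$P = \begin{bmatrix} 1 & i & 1 - i \\ -i & 1 & -1 - i \\ 1 + i & -1 + i & 2 \end{bmatrix}$
(As expected, P is Hermitian, that is, $P^H = P$.)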
NORMED VECTOR SPACES
7.54. Consider vectors $u = (1, 3, -6, 4)$ and $v = (3, -5, 1, -2)$ in $\mathbf{R}^4$. Find:
(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$,
(d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$.
(a) The infinity norm chooses the maximum of the absolute values of the components. Hence
$\|u\|_\infty = 6$ and $\|v\|_\infty = 5$
(b) The one-norm adds the absolute values of the components. Thus
$\|u\|_1 = 1 + 3 + 6 + 4 = 14$ and $\|v\|_1 = 3 + 5 + 1 + 2 = 11$
(c) The two-norm is equal to the square root of the sum of the squares of the components (i.e., the norm
induced by the usual inner product on $\mathbf{R}^4$). Thus
$\|u\|_2 = \sqrt{1 + 9 + 36 + 16} = \sqrt{62}$ and $\|v\|_2 = \sqrt{9 + 25 + 1 + 4} = \sqrt{39}$
(d) First find $u - v = (-2, 8, -7, 6)$. Then
$d_\infty(u, v) = \|u - v\|_\infty = 8$
$d_1(u, v) = \|u - v\|_1 = 2 + 8 + 7 + 6 = 23$
$d_2(u, v) = \|u - v\|_2 = \sqrt{4 + 64 + 49 + 36} = \sqrt{153}$
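The three norms and their induced distances in Problem 7.54 translate directly into code. This is a minimal sketch with illustrative function names; the two-norm is returned as a float.

```python
import math

def norm_inf(x):
    """Infinity norm: largest absolute value of a component."""
    return max(abs(c) for c in x)

def norm_1(x):
    """One-norm: sum of the absolute values of the components."""
    return sum(abs(c) for c in x)

def norm_2(x):
    """Two-norm: square root of the sum of squares."""
    return math.sqrt(sum(c * c for c in x))

u = (1, 3, -6, 4)
v = (3, -5, 1, -2)
diff = tuple(a - b for a, b in zip(u, v))  # u - v

print(norm_inf(u), norm_1(u), norm_2(u))
print(norm_inf(diff), norm_1(diff), norm_2(diff))
```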
7.55. Consider the function $f(t) = t^2 - 4t$ in $C[0, 3]$.
(a) Find $\|f\|_\infty$, (b) Plot $f(t)$ in the plane $\mathbf{R}^2$, (c) Find $\|f\|_1$, (d) Find $\|f\|_2$.
(a) We seek $\|f\|_\infty = \max(|f(t)|)$. Since $f(t)$ is differentiable on [0, 3], $|f(t)|$ has a maximum at a critical
point of $f(t)$, i.e., when the derivative $f'(t) = 0$, or at an endpoint of [0, 3]. Since $f'(t) = 2t - 4$, we set
$2t - 4 = 0$ and obtain $t = 2$ as a critical point. Compute
$f(2) = 4 - 8 = -4, \quad f(0) = 0 - 0 = 0, \quad f(3) = 9 - 12 = -3$
Thus $\|f\|_\infty = |f(2)| = |-4| = 4$.
(b) Compute $f(t)$ for various values of t in [0, 3], e.g.,
t      0    1    2    3
f(t)   0   -3   -4   -3
Plot the points in $\mathbf{R}^2$ and then draw a continuous curve through the points, as shown in Fig. 7-8.
(c) We seek $\|f\|_1 = \int_0^3 |f(t)|\,dt$. As indicated in Fig. 7-8, $f(t)$ is nonpositive on [0, 3]; hence
$|f(t)| = -(t^2 - 4t) = 4t - t^2$. Thus
$\|f\|_1 = \int_0^3 (4t - t^2)\,dt = \left[2t^2 - \frac{t^3}{3}\right]_0^3 = 18 - 9 = 9$
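[Fig. 7-8: graph of $f(t) = t^2 - 4t$ on [0, 3]]
(d) $\|f\|_2^2 = \int_0^3 [f(t)]^2\,dt = \int_0^3 (t^4 - 8t^3 + 16t^2)\,dt = \left[\frac{t^5}{5} - 2t^4 + \frac{16t^3}{3}\right]_0^3 = \frac{153}{5}$
Thus $\|f\|_2 = \sqrt{153/5}$.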
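The function norms of Problem 7.55 can be checked with exact polynomial integration. The sketch below stores $f(t) = t^2 - 4t$ as a coefficient list in increasing degree (an assumed representation) and relies on two facts from the solution: the only critical point is $t = 2$, and $f \le 0$ on [0, 3], so $|f| = -f$ there. It is not a general-purpose norm routine.

```python
from fractions import Fraction

def evaluate(p, t):
    """Value of the polynomial with coefficients p (lowest degree first) at t."""
    return sum(c * t ** k for k, c in enumerate(p))

def multiply(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p, a, b):
    """Exact integral of the polynomial p over [a, b]."""
    return sum(Fraction(c, k + 1) * (Fraction(b) ** (k + 1) - Fraction(a) ** (k + 1))
               for k, c in enumerate(p))

f = [0, -4, 1]                                              # t^2 - 4t
sup_norm = max(abs(evaluate(f, t)) for t in (0, 2, 3))      # endpoints and critical point
one_norm = integrate([-c for c in f], 0, 3)                 # |f| = -f on [0, 3]
two_norm_squared = integrate(multiply(f, f), 0, 3)
print(sup_norm, one_norm, two_norm_squared)
```

The two-norm itself is the square root of the last value.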
7.56. Prove Theorem 7.24: Let V be a normed vector space. Then the function $d(u, v) = \|u - v\|$ satisfies
the following three axioms of a metric space:
$[M_1]$ $d(u, v) \ge 0$; and $d(u, v) = 0$ iff $u = v$.
$[M_2]$ $d(u, v) = d(v, u)$.
$[M_3]$ $d(u, v) \le d(u, w) + d(w, v)$.
If $u \ne v$, then $u - v \ne 0$, and hence $d(u, v) = \|u - v\| > 0$. Also, $d(u, u) = \|u - u\| = \|0\| = 0$. Thus
$[M_1]$ is satisfied. We also have
$d(u, v) = \|u - v\| = \|-1(v - u)\| = |-1|\,\|v - u\| = \|v - u\| = d(v, u)$
and $d(u, v) = \|u - v\| = \|(u - w) + (w - v)\| \le \|u - w\| + \|w - v\| = d(u, w) + d(w, v)$
Thus $[M_2]$ and $[M_3]$ are satisfied.
Supplementary Problems
INNER PRODUCTS
7.57. Verify that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:
$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$
7.58. Find the values of k so that the following is an inner product on $\mathbf{R}^2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$:
$f(u, v) = x_1y_1 - 3x_1y_2 - 3x_2y_1 + kx_2y_2$
7.59. Consider the vectors $u = (1, -3)$ and $v = (2, 5)$ in $\mathbf{R}^2$. Find:
(a) $\langle u, v\rangle$ with respect to the usual inner product in $\mathbf{R}^2$.
(b) $\langle u, v\rangle$ with respect to the inner product in $\mathbf{R}^2$ in Problem 7.57.
(c) $\|v\|$ using the usual inner product in $\mathbf{R}^2$.
(d) $\|v\|$ using the inner product in $\mathbf{R}^2$ in Problem 7.57.
7.60. Show that each of the following is not an inner product on $\mathbf{R}^3$, where $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$:
(a) $\langle u, v\rangle = x_1y_1 + x_2y_2$ and (b) $\langle u, v\rangle = x_1y_2x_3 + y_1x_2y_3$.
7.61. Let V be the vector space of $m \times n$ matrices over $\mathbf{R}$. Show that $\langle A, B\rangle = \operatorname{tr}(B^TA)$ defines an inner product in V.
7.62. Suppose $|\langle u, v\rangle| = \|u\|\,\|v\|$. (That is, the Cauchy-Schwarz inequality reduces to an equality.) Show that u and
v are linearly dependent.
7.63. Suppose $f(u, v)$ and $g(u, v)$ are inner products on a vector space V over $\mathbf{R}$. Prove:
(a) The sum $f + g$ is an inner product on V, where $(f + g)(u, v) = f(u, v) + g(u, v)$.
(b) The scalar product $kf$, for $k > 0$, is an inner product on V, where $(kf)(u, v) = kf(u, v)$.
ORTHOGONALITY, ORTHOGONAL COMPLEMENTS, ORTHOGONAL SETS
7.64. Let V be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$ with inner product defined by
$\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Find a basis of the subspace W orthogonal to $h(t) = 2t + 1$.
7.65. Find a basis of the subspace W of R"^ orthogonal to u^ ={\, —2, 3, 4) and U2 = {^,—5, 7, 8).
7.66. Find a basis for the subspace W of $\mathbf{R}^5$ orthogonal to the vectors $u_1 = (1, 1, 3, 4, 1)$ and $u_2 = (1, 2, 1, 2, 1)$.
7.67. Let $w = (1, -2, -1, 3)$ be a vector in $\mathbf{R}^4$. Find:
(a) an orthogonal basis for $w^\perp$, (b) an orthonormal basis for $w^\perp$.
7.68. Let W be the subspace of $\mathbf{R}^4$ orthogonal to $u_1 = (1, 1, 2, 2)$ and $u_2 = (0, 1, 2, -1)$. Find:
(a) an orthogonal basis for W, (b) an orthonormal basis for W. (Compare with Problem 7.65.)
7.69. Let S consist of the following vectors in $\mathbf{R}^4$:
$u_1 = (1, 1, 1, 1), \quad u_2 = (1, 1, -1, -1), \quad u_3 = (1, -1, 1, -1), \quad u_4 = (1, -1, -1, 1)$
(a) Show that S is orthogonal and a basis of $\mathbf{R}^4$.
(b) Write $v = (1, 3, -5, 6)$ as a linear combination of $u_1, u_2, u_3, u_4$.
(c) Find the coordinates of an arbitrary vector $v = (a, b, c, d)$ in $\mathbf{R}^4$ relative to the basis S.
(d) Normalize S to obtain an orthonormal basis of $\mathbf{R}^4$.
7.70. Let M = M2 2 with inner product {A,B) = tY(B^A). Show that the following is an orthonormal basis for M:
1 0
0 0
0
0 0
1} [2 '} [I °]\
7.71. Let $M = M_{2,2}$ with inner product $\langle A, B\rangle = \operatorname{tr}(B^TA)$. Find an orthogonal basis for the orthogonal complement
of: (a) diagonal matrices, (b) symmetric matrices.
7.72. Suppose $\{u_1, u_2, \ldots, u_r\}$ is an orthogonal set of vectors. Show that $\{k_1u_1, k_2u_2, \ldots, k_ru_r\}$ is an orthogonal set
for any scalars $k_1, k_2, \ldots, k_r$.
7.73. Let U and W be subspaces of a finite-dimensional inner product space V. Show that:
(a) $(U + W)^\perp = U^\perp \cap W^\perp$, (b) $(U \cap W)^\perp = U^\perp + W^\perp$.
PROJECTIONS, GRAM-SCHMIDT ALGORITHM, APPLICATIONS
7.74. Find the Fourier coefficient c and projection cw of v along w, where:
(a) $v = (2, 3, -5)$ and $w = (1, -5, 2)$ in $\mathbf{R}^3$.
(b) $v = (1, 3, 1, 2)$ and $w = (1, -2, 7, 4)$ in $\mathbf{R}^4$.
(c) $v = t^2$ and $w = t + 3$ in $P(t)$, with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
r
5 5
id) v=\; I
[I
and w -
in M = M2 2, with inner product {A, B) = tY(B^A).
7.75. Let U be the subspace of $\mathbf{R}^4$ spanned by
$v_1 = (1, 1, 1, 1), \quad v_2 = (1, -1, 2, 2), \quad v_3 = (1, 2, -3, -4)$
(a) Apply the Gram-Schmidt algorithm to find an orthogonal and an orthonormal basis for U.
(b) Find the projection of $v = (1, 2, -3, 4)$ onto U.
7.76. Suppose $v = (1, 2, 3, 4, 6)$. Find the projection of v onto W, or, in other words, find $w \in W$ that minimizes
$\|v - w\|$, where W is the subspace of $\mathbf{R}^5$ spanned by:
(a) $u_1 = (1, 2, 1, 2, 1)$ and $u_2 = (1, -1, 2, -1, 1)$, (b) $v_1 = (1, 2, 1, 2, 1)$ and $v_2 = (1, 0, 1, 5, -1)$.
7.77. Consider the subspace W = P2@ of P@ with inner product {/, g) = Jo/@^@ dt. Find the projection of
f(t) = t^ onto W. (Hint: Use the orthogonal polynomials 1, 2^ — 1, 6^^ — 6^ + 1 obtained in Problem 7.22.)
7.78. Consider P(t) with inner product {/, g) = J_i/@^@ ^^ and the subspace W = P^it).
(a) Find an orthogonal basis for W by applying the Gram-Schmidt algorithm to {l,t,t^,t^}.
(b) Find the projection off(t) = t^ onto W.
ORTHOGONAL MATRICES
7.79. Find the number and exhibit all 2 x 2 orthogonal matrices of the form ^
i
3
y z
7.80. Find a 3 x 3 orthogonal matrix P whose first two rows are multiples of $u = (1, 1, 1)$ and $v = (1, -2, 3)$,
respectively.
7.81. Find a symmetric orthogonal matrix P whose first row is $\left(\frac{1}{3}, \frac{2}{3}, \frac{2}{3}\right)$. (Compare with Problem 7.32.)
7.82. Real matrices A and B are said to be orthogonally equivalent if there exists an orthogonal matrix P such that
$B = P^TAP$. Show that this relation is an equivalence relation.
POSITIVE DEFINITE MATRICES AND INNER PRODUCTS
7.83. Find the matrix A that represents the usual inner product on R^ relative to each of the following bases:
(a) {«, =A,4), 02 = B,-3)}, (b) {wi=(l,-3), w^ = F, 2)}.
7.84. Consider the following inner product on $\mathbf{R}^2$:
$f(u, v) = x_1y_1 - 2x_1y_2 - 2x_2y_1 + 5x_2y_2$, where $u = (x_1, x_2)$ and $v = (y_1, y_2)$
Find the matrix B that represents this inner product on $\mathbf{R}^2$ relative to each basis in Problem 7.83.
7.85. Find the matrix C that represents the usual inner product on $\mathbf{R}^3$ relative to the basis S of $\mathbf{R}^3$ consisting of the vectors
$u_1 = (1, 1, 1), \quad u_2 = (1, 2, 1), \quad u_3 = (1, -1, 3)$.
7.86. Let $V = P_2(t)$ with inner product $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$.
(a) Find $\langle f, g\rangle$, where $f(t) = t + 2$ and $g(t) = t^2 - 3t + 4$.
(b) Find the matrix $A$ of the inner product with respect to the basis $\{1, t, t^2\}$ of $V$.
(c) Verify Theorem 7.16 that $\langle f, g\rangle = [f]^TA[g]$ with respect to the basis $\{1, t, t^2\}$.
7.87. Determine which of the following matrices are positive definite:
"' [; ^].<« [: ?
(c) $\begin{bmatrix} 4 & 2 \\ 2 & 1 \end{bmatrix}$, (d) $\begin{bmatrix} 6 & -1 \\ -1 & 9 \end{bmatrix}$
7.88. Suppose $A$ and $B$ are positive definite matrices. Show that: (a) $A + B$ is positive definite and (b) $kA$ is positive definite for $k > 0$.
7.89. Suppose $B$ is a real nonsingular matrix. Show that: (a) $B^TB$ is symmetric and (b) $B^TB$ is positive definite.
COMPLEX INNER PRODUCT SPACES
7.90. Verify that
$\langle a_1u_1 + a_2u_2,\ b_1v_1 + b_2v_2\rangle = a_1\bar b_1\langle u_1, v_1\rangle + a_1\bar b_2\langle u_1, v_2\rangle + a_2\bar b_1\langle u_2, v_1\rangle + a_2\bar b_2\langle u_2, v_2\rangle$
More generally, prove that $\left\langle \sum_{i=1}^{m} a_iu_i,\ \sum_{j=1}^{n} b_jv_j \right\rangle = \sum_{i,j} a_i\bar b_j\langle u_i, v_j\rangle$.
7.91. Consider $u = (1 + i,\ 3,\ 4 - i)$ and $v = (3 - 4i,\ 1 + i,\ 2i)$ in $\mathbf{C}^3$. Find:
(a) $\langle u, v\rangle$, (b) $\langle v, u\rangle$, (c) $\|u\|$, (d) $\|v\|$, (e) $d(u, v)$.
7.92. Find the Fourier coefficient $c$ and the projection $cw$ of
(a) $u = (3 + i,\ 5 - 2i)$ along $w = (5 + i,\ 1 + i)$ in $\mathbf{C}^2$,
(b) $u = (1 - i,\ 3i,\ 1 + i)$ along $w = (1,\ 2 - i,\ 3 + 2i)$ in $\mathbf{C}^3$.
7.93. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. Verify that the following is an inner product on $\mathbf{C}^2$:
$f(u, v) = z_1\bar w_1 + (1 + i)z_1\bar w_2 + (1 - i)z_2\bar w_1 + 3z_2\bar w_2$
7.94. Find an orthogonal basis and an orthonormal basis for the subspace $W$ of $\mathbf{C}^3$ spanned by $u_1 = (1, i, 1)$ and $u_2 = (1 + i, 0, 2)$.
7.95. Let $u = (z_1, z_2)$ and $v = (w_1, w_2)$ belong to $\mathbf{C}^2$. For what values of $a, b, c, d \in \mathbf{C}$ is the following an inner product on $\mathbf{C}^2$?
$f(u, v) = az_1\bar w_1 + bz_1\bar w_2 + cz_2\bar w_1 + dz_2\bar w_2$
7.96. Prove the following form for an inner product in a complex space $V$:
$\langle u, v\rangle = \tfrac14\|u + v\|^2 - \tfrac14\|u - v\|^2 + \tfrac{i}{4}\|u + iv\|^2 - \tfrac{i}{4}\|u - iv\|^2$
[Compare with Problem 1.1b).]
7.97. Let $V$ be a real inner product space. Show that:
(i) $\|u\| = \|v\|$ if and only if $\langle u + v,\ u - v\rangle = 0$;
(ii) $\|u + v\|^2 = \|u\|^2 + \|v\|^2$ if and only if $\langle u, v\rangle = 0$.
Show by counterexamples that the above statements are not true for, say, $\mathbf{C}^2$.
7.98. Find the matrix $P$ that represents the usual inner product on $\mathbf{C}$ relative to the basis $\{1,\ 1 + i,\ 1 - 2i\}$.
7.99. A complex matrix $A$ is unitary if it is invertible and $A^{-1} = A^H$. Alternatively, $A$ is unitary if its rows (columns) form an orthonormal set of vectors (relative to the usual inner product of $\mathbf{C}^n$). Find a unitary matrix whose first row is: (a) a multiple of $(1,\ 1 - i)$; (b) a multiple of $(\tfrac12,\ \tfrac12 i,\ \tfrac12 - \tfrac12 i)$.
NORMED VECTOR SPACES
7.100. Consider vectors $u = (1, -3, 4, 1, -2)$ and $v = (3, 1, -2, -3, 1)$ in $\mathbf{R}^5$. Find:
(a) $\|u\|_\infty$ and $\|v\|_\infty$, (b) $\|u\|_1$ and $\|v\|_1$, (c) $\|u\|_2$ and $\|v\|_2$, (d) $d_\infty(u, v)$, $d_1(u, v)$, $d_2(u, v)$
7.101. Repeat Problem 7.100 for $u = (1 + i,\ 2 - 4i)$ and $v = (1 - i,\ 2 + 3i)$ in $\mathbf{C}^2$.
7.102. Consider the functions $f(t) = 5t - t^2$ and $g(t) = 3t - t^2$ in $C[0, 4]$. Find:
(a) $d_\infty(f, g)$, (b) $d_1(f, g)$, (c) $d_2(f, g)$
7.103. Prove: (a) $\|\cdot\|_1$ is a norm on $\mathbf{R}^n$. (b) $\|\cdot\|_\infty$ is a norm on $\mathbf{R}^n$.
7.104. Prove: (a) $\|\cdot\|_1$ is a norm on $C[a, b]$. (b) $\|\cdot\|_\infty$ is a norm on $C[a, b]$.
Answers to Supplementary Problems
Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$
7.58. $k > 9$
7.59. (a) $-13$, (b) $-71$, (c) $\sqrt{29}$, (d) $\sqrt{89}$
7.60. Let $u = (0, 0, 1)$; then $\langle u, u\rangle = 0$ in both cases
7.64. $\{7t^2 - 5t,\ 12t^2 - 5\}$
7.65. $\{(1, 2, 1, 0),\ (4, 4, 0, 1)\}$
7.66. $(-1, 0, 0, 0, 1)$, $(-6, 2, 0, 1, 0)$, $(-5, 2, 1, 0, 0)$
7.67. (a) $(0, 0, 3, 1)$, $(0, 3, -3, 1)$, $(2, 10, -9, 3)$,
(b) $(0, 0, 3, 1)/\sqrt{10}$, $(0, 3, -3, 1)/\sqrt{19}$, $(2, 10, -9, 3)/\sqrt{194}$
7.68. (a) $(0, 2, -1, 0)$, $(-15, 1, 2, 5)$, (b) $(0, 2, -1, 0)/\sqrt5$, $(-15, 1, 2, 5)/\sqrt{255}$
7.69. (b) $v = \tfrac14(5u_1 + 3u_2 - 13u_3 + 9u_4)$,
(c) $[v] = \tfrac14[a + b + c + d,\ a + b - c - d,\ a - b + c - d,\ a - b - c + d]$
7.71. (a) $[0, 1;\ 0, 0]$, $[0, 0;\ 1, 0]$, (b) $[0, -1;\ 1, 0]$
7.74. (a) c = -g, (b) c = i, (c) c = ^, (d) c = 'i
7.75. (a) $w_1 = (1, 1, 1, 1)$, $w_2 = (0, -2, 1, 1)$, $w_3 = (12, -4, -1, -7)$,
(b) $\mathrm{proj}(v, U) = \tfrac{1}{70}(-14, 158, 47, 89)$
7.76. (a) $\mathrm{proj}(v, W) = \tfrac18(21, 27, 26, 27, 21)$, (b) First find an orthogonal basis for $W$; say, $w_1 = (1, 2, 1, 2, 1)$ and $w_2 = (0, 2, 0, -3, 2)$. Then $\mathrm{proj}(v, W) = \tfrac{1}{17}(34, 76, 34, 56, 42)$
7.77. $\mathrm{proj}(f, W) = \tfrac32 t^2 - \tfrac35 t + \tfrac{1}{20}$
7.78. (a) $\{1,\ t,\ 3t^2 - 1,\ 5t^3 - 3t\}$, (b) $\mathrm{proj}(f, W) = \tfrac{10}{9}t^3 - \tfrac{5}{21}t$
7.79. Four: $[a, b;\ b, -a]$, $[a, b;\ -b, a]$, $[a, -b;\ b, a]$, $[a, -b;\ -b, -a]$, where $a = \tfrac13$ and $b = \tfrac13\sqrt8$
7.80. $P = [1/a, 1/a, 1/a;\ 1/b, -2/b, 3/b;\ 5/c, -2/c, -3/c]$, where $a = \sqrt3$, $b = \sqrt{14}$, $c = \sqrt{38}$
7.81. $\tfrac13[1, 2, 2;\ 2, -2, 1;\ 2, 1, -2]$
7.83. (a) $[17, -10;\ -10, 13]$, (b) $[10, 0;\ 0, 40]$
7.84. (a) $[65, -68;\ -68, 25]$, (b) $[58, 16;\ 16, 8]$
7.85. $[3, 4, 3;\ 4, 6, 2;\ 3, 2, 11]$
7.86. (a) $\tfrac{83}{12}$, (b) $[1, a, b;\ a, b, c;\ b, c, d]$, where $a = \tfrac12$, $b = \tfrac13$, $c = \tfrac14$, $d = \tfrac15$
7.87. (a) No, (b) Yes, (c) No, (d) Yes
7.91. (a) $-4i$, (b) $4i$, (c) $\sqrt{28}$, (d) $\sqrt{31}$, (e) $\sqrt{59}$
7.92. (a) $c = \tfrac{1}{28}(19 - 5i)$, (b) $c = \tfrac{1}{19}(3 + 6i)$
7.94. $\{v_1 = (1, i, 1)/\sqrt3,\ v_2 = (2i,\ 1 - 3i,\ 3 - i)/\sqrt{24}\}$
7.95. $a$ and $d$ real and positive, $c = \bar b$, and $ad - bc$ positive.
7.97. $u = (1, 2)$, $v = (i, 2i)$
7.98. $P = [1,\ 1 - i,\ 1 + 2i;\ \ 1 + i,\ 2,\ -2 + 3i;\ \ 1 - 2i,\ -2 - 3i,\ 5]$
7.99. (a) $(1/\sqrt3)[1,\ 1 - i;\ \ 1 + i,\ -1]$,
(b) $[a,\ ai,\ a - ai;\ \ bi,\ -b,\ 0;\ \ a,\ ai,\ -a + ai]$, where $a = \tfrac12$ and $b = 1/\sqrt2$.
7.100. (a) 4 and 3, (b) 11 and 10, (c) $\sqrt{31}$ and $\sqrt{24}$, (d) 6, 19, 9
7.101. (a) $\sqrt{20}$ and $\sqrt{13}$, (b) $\sqrt2 + \sqrt{20}$ and $\sqrt2 + \sqrt{13}$, (c) $\sqrt{22}$ and $\sqrt{15}$, (d) 7, 9, $\sqrt{53}$
7.102. (a) 8, (b) 16, (c) $16/\sqrt3$
CHAPTER 8
Determinants
8.1 INTRODUCTION
Each $n$-square matrix $A = [a_{ij}]$ is assigned a special scalar called the determinant of $A$, denoted by $\det(A)$ or $|A|$ or
\[
\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n} \\
a_{21} & a_{22} & \cdots & a_{2n} \\
\vdots & \vdots & & \vdots \\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
\]
We emphasize that an $n \times n$ array of scalars enclosed by straight lines, called a determinant of order $n$, is not a matrix but denotes the determinant of the enclosed array of scalars, i.e., the enclosed matrix.
The determinant function was first discovered during the investigation of systems of linear equations.
We shall see that the determinant is an indispensable tool in investigating and obtaining properties of
square matrices.
The definition of the determinant and most of its properties also apply in the case where the entries of a
matrix come from a commutative ring.
We begin with a special case of determinants of orders 1, 2, and 3. Then we define a determinant of
arbitrary order. This general definition is preceded by a discussion of permutations, which is necessary for
our general definition of the determinant.
8.2 DETERMINANTS OF ORDER 1 AND 2
Determinants of orders 1 and 2 are defined as follows:
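\[
|a_{11}| = a_{11} \qquad\text{and}\qquad \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}
\]
Thus the determinant of a $1 \times 1$ matrix $A = [a_{11}]$ is the scalar $a_{11}$ itself; that is, $\det(A) = |a_{11}| = a_{11}$. The determinant of order two may easily be remembered by using a diagram in which a plus-labeled arrow runs along the main diagonal, from $a_{11}$ to $a_{22}$, and a minus-labeled arrow runs along the other diagonal, from $a_{12}$ to $a_{21}$.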
That is, the determinant is equal to the product of the elements along the plus-labeled arrow minus the
product of the elements along the minus-labeled arrow. (There is an analogous diagram for determinants of
order 3, but not for higher-order determinants.)
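Example 8.1.
(a) Since the determinant of order one is the scalar itself, we have:
$\det(27) = 27$, $\det(-7) = -7$, $\det(t - 3) = t - 3$
(b) $\begin{vmatrix} 5 & 3 \\ 4 & 6 \end{vmatrix} = 5(6) - 3(4) = 30 - 12 = 18$, $\qquad \begin{vmatrix} 3 & 2 \\ -5 & 7 \end{vmatrix} = 21 + 10 = 31$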
Application to Linear Equations
Consider two linear equations in two unknowns, say
\[
\begin{aligned}
a_1x + b_1y &= c_1 \\
a_2x + b_2y &= c_2
\end{aligned}
\]
Let $D = a_1b_2 - a_2b_1$, the determinant of the matrix of coefficients. Then the system has a unique solution if and only if $D \ne 0$. In such a case, the unique solution may be expressed completely in terms of determinants as follows:
\[
x = \frac{N_x}{D} = \frac{b_2c_1 - b_1c_2}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} c_1 & b_1 \\ c_2 & b_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}, \qquad
y = \frac{N_y}{D} = \frac{a_1c_2 - a_2c_1}{a_1b_2 - a_2b_1} = \frac{\begin{vmatrix} a_1 & c_1 \\ a_2 & c_2 \end{vmatrix}}{\begin{vmatrix} a_1 & b_1 \\ a_2 & b_2 \end{vmatrix}}
\]
Here $D$ appears in the denominator of both quotients. The numerators $N_x$ and $N_y$ of the quotients for $x$ and $y$, respectively, can be obtained by substituting the column of constant terms in place of the column of coefficients of the given unknown in the matrix of coefficients. On the other hand, if $D = 0$, then the system may have no solution or more than one solution.
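Example 8.2. Solve by determinants the system
\[
\begin{aligned}
4x - 3y &= 15 \\
2x + 5y &= 1
\end{aligned}
\]
First find the determinant $D$ of the matrix of coefficients:
\[
D = \begin{vmatrix} 4 & -3 \\ 2 & 5 \end{vmatrix} = 4(5) - (-3)(2) = 20 + 6 = 26
\]
Since $D \ne 0$, the system has a unique solution. To obtain the numerators $N_x$ and $N_y$, simply replace, in the matrix of coefficients, the coefficients of $x$ and $y$, respectively, by the constant terms, and then take their determinants:
\[
N_x = \begin{vmatrix} 15 & -3 \\ 1 & 5 \end{vmatrix} = 75 + 3 = 78, \qquad
N_y = \begin{vmatrix} 4 & 15 \\ 2 & 1 \end{vmatrix} = 4 - 30 = -26
\]
Then the unique solution of the system is
\[
x = \frac{N_x}{D} = \frac{78}{26} = 3, \qquad y = \frac{N_y}{D} = \frac{-26}{26} = -1
\]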
8.3 DETERMINANTS OF ORDER 3
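Consider an arbitrary $3 \times 3$ matrix $A = [a_{ij}]$. The determinant of $A$ is defined as follows:
\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\]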
Observe that there are six products, each product consisting of three elements of the original matrix. Three
of the products are plus-labeled (keep their sign) and three of the products are minus-labeled (change their
sign).
The diagrams in Fig. 8-1 may help one remember the above six products in $\det(A)$. That is, the determinant is equal to the sum of the products of the elements along the three plus-labeled arrows in Fig. 8-1 plus the sum of the negatives of the products of the elements along the three minus-labeled arrows. We emphasize that there are no such diagrammatic devices to remember determinants of higher order.
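Fig. 8-1
Example 8.3. Let $A = \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 2 & 1 \\ -4 & 5 & -1 \\ 2 & -3 & 4 \end{bmatrix}$. Find $\det(A)$ and $\det(B)$.
Use the diagrams in Fig. 8-1:
\[
\begin{aligned}
\det(A) &= 2(5)(4) + 1(-2)(1) + 1(-3)(0) - 1(5)(1) - (-3)(-2)(2) - 4(1)(0) \\
&= 40 - 2 + 0 - 5 - 12 - 0 = 21 \\
\det(B) &= 60 - 4 + 12 - 10 - 9 + 32 = 81
\end{aligned}
\]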
Alternative Form for a Determinant of Order 3
The determinant of the $3 \times 3$ matrix $A = [a_{ij}]$ may be rewritten as follows:
\[
\begin{aligned}
\det(A) &= a_{11}(a_{22}a_{33} - a_{23}a_{32}) - a_{12}(a_{21}a_{33} - a_{23}a_{31}) + a_{13}(a_{21}a_{32} - a_{22}a_{31}) \\
&= a_{11}\begin{vmatrix} a_{22} & a_{23} \\ a_{32} & a_{33} \end{vmatrix}
 - a_{12}\begin{vmatrix} a_{21} & a_{23} \\ a_{31} & a_{33} \end{vmatrix}
 + a_{13}\begin{vmatrix} a_{21} & a_{22} \\ a_{31} & a_{32} \end{vmatrix}
\end{aligned}
\]
which is a linear combination of three determinants of order 2 whose coefficients (with alternating signs) form the first row of the given matrix. This linear combination may be indicated by writing the full $3 \times 3$ array after each of the coefficients $a_{11}$, $-a_{12}$, $+a_{13}$ and striking out the row and column that contain that coefficient.
Note that each $2 \times 2$ matrix can be obtained by deleting, in the original matrix, the row and column containing its coefficient.
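Example 8.4.
\[
\begin{aligned}
\begin{vmatrix} 1 & 2 & 3 \\ 4 & -2 & 3 \\ 0 & 5 & -1 \end{vmatrix}
&= 1\begin{vmatrix} -2 & 3 \\ 5 & -1 \end{vmatrix} - 2\begin{vmatrix} 4 & 3 \\ 0 & -1 \end{vmatrix} + 3\begin{vmatrix} 4 & -2 \\ 0 & 5 \end{vmatrix} \\
&= 1(2 - 15) - 2(-4 + 0) + 3(20 + 0) = -13 + 8 + 60 = 55
\end{aligned}
\]
8.4 PERMUTATIONS
A permutation $\sigma$ of the set $\{1, 2, \ldots, n\}$ is a one-to-one mapping of the set onto itself or, equivalently, a rearrangement of the numbers $1, 2, \ldots, n$. Such a permutation $\sigma$ is denoted by
\[
\sigma = \begin{pmatrix} 1 & 2 & \cdots & n \\ j_1 & j_2 & \cdots & j_n \end{pmatrix} \qquad\text{or}\qquad \sigma = j_1j_2\cdots j_n, \qquad\text{where } j_i = \sigma(i)
\]
The set of all such permutations is denoted by $S_n$, and the number of such permutations is $n!$. If $\sigma \in S_n$, then the inverse mapping $\sigma^{-1} \in S_n$; and if $\sigma, \tau \in S_n$, then the composition mapping $\sigma \circ \tau \in S_n$. Also, the identity mapping $\varepsilon = \sigma \circ \sigma^{-1} \in S_n$. (In fact, $\varepsilon = 123\ldots n$.)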
Example 8.5.
(a) There are 2! = 2 • 1 =2 permutations in S2; they are 12 and 21.
(b) There are 3! = 3 • 2 • 1 = 6 permutations in S3; they are 123, 132, 213, 231, 312, 321.
Sign (Parity) of a Permutation
Consider an arbitrary permutation $\sigma$ in $S_n$, say $\sigma = j_1j_2\cdots j_n$. We say $\sigma$ is an even or odd permutation according to whether there is an even or odd number of inversions in $\sigma$. By an inversion in $\sigma$ we mean a pair of integers $(i, k)$ such that $i > k$, but $i$ precedes $k$ in $\sigma$. We then define the sign or parity of $\sigma$, written $\operatorname{sgn}\sigma$, by
\[
\operatorname{sgn}\sigma = \begin{cases} 1 & \text{if } \sigma \text{ is even} \\ -1 & \text{if } \sigma \text{ is odd} \end{cases}
\]
Example 8.6.
(a) Find the sign of $\sigma = 35142$ in $S_5$.
For each element $k$, we count the number of elements $i$ such that $i > k$ and $i$ precedes $k$ in $\sigma$. There are:
2 numbers (3 and 5) greater than and preceding 1,
3 numbers (3, 5, and 4) greater than and preceding 2,
1 number (5) greater than and preceding 4.
(There are no numbers greater than and preceding either 3 or 5.) Since there are, in all, six inversions, $\sigma$ is even and $\operatorname{sgn}\sigma = 1$.
(b) The identity permutation $\varepsilon = 123\ldots n$ is even because there are no inversions in $\varepsilon$.
(c) In $S_2$, the permutation 12 is even and 21 is odd. In $S_3$, the permutations 123, 231, 312 are even and the permutations 132, 213, 321 are odd.
(d) Let $\tau$ be the permutation that interchanges two numbers $i$ and $j$ and leaves the other numbers fixed. That is,
$\tau(i) = j$, $\tau(j) = i$, $\tau(k) = k$ where $k \ne i, j$
We call $\tau$ a transposition. If $j > i$, then there are $2(j - i - 1) + 1$ inversions in $\tau$, and hence the transposition $\tau$ is odd.
Remark: One can show that, for any $n \ge 2$, half of the permutations in $S_n$ are even and half of them are odd. For example, 3 of the 6 permutations in $S_3$ are even, and 3 are odd.
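The inversion count in Example 8.6 is mechanical enough to sketch in a few lines of Python. This is only an illustration: the function names `inversions` and `sgn` are ours, and a permutation is assumed to be given as a tuple of its values $j_1, \ldots, j_n$.

```python
from itertools import permutations

def inversions(perm):
    """Count pairs (i, k) with i > k such that i precedes k in perm."""
    n = len(perm)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if perm[a] > perm[b]
    )

def sgn(perm):
    """Return 1 for an even permutation and -1 for an odd one."""
    return 1 if inversions(perm) % 2 == 0 else -1

# The permutation 35142 of Example 8.6(a): six inversions, hence even.
print(inversions((3, 5, 1, 4, 2)), sgn((3, 5, 1, 4, 2)))

# The even permutations in S_3, as in Example 8.6(c).
print([p for p in permutations((1, 2, 3)) if sgn(p) == 1])
```

Summing `sgn` over all of $S_n$ for a small $n \ge 2$ gives 0, which is one way to check the Remark numerically.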
8.5. DETERMINANTS OF ARBITRARY ORDER
Let $A = [a_{ij}]$ be a square matrix of order $n$ over a field $K$.
Consider a product of $n$ elements of $A$ such that one and only one element comes from each row and one and only one element comes from each column. Such a product can be written in the form
\[
a_{1j_1}a_{2j_2}\cdots a_{nj_n}
\]
that is, where the factors come from successive rows, and so the first subscripts are in the natural order $1, 2, \ldots, n$. Now since the factors come from different columns, the sequence of second subscripts forms a permutation $\sigma = j_1j_2\cdots j_n$ in $S_n$. Conversely, each permutation in $S_n$ determines a product of the above form. Thus the matrix $A$ contains $n!$ such products.
Definition: The determinant of $A = [a_{ij}]$, denoted by $\det(A)$ or $|A|$, is the sum of all the above $n!$ products, where each such product is multiplied by $\operatorname{sgn}\sigma$. That is,
\[
|A| = \sum_{\sigma} (\operatorname{sgn}\sigma)\, a_{1j_1}a_{2j_2}\cdots a_{nj_n}
\qquad\text{or}\qquad
|A| = \sum_{\sigma \in S_n} (\operatorname{sgn}\sigma)\, a_{1\sigma(1)}a_{2\sigma(2)}\cdots a_{n\sigma(n)}
\]
The determinant of the $n$-square matrix $A$ is said to be of order $n$.
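The definition can be turned directly into a short Python sketch. It is meant only to make the sum over $S_n$ concrete, not as a practical method, since it visits all $n!$ permutations. The name `det_by_definition` is ours, a matrix is assumed to be a list of rows, and indices run from 0 rather than 1.

```python
from itertools import permutations
from math import prod

def sgn(perm):
    """Sign of a permutation: +1 if the number of inversions is even, else -1."""
    n = len(perm)
    inv = sum(1 for a in range(n) for b in range(a + 1, n) if perm[a] > perm[b])
    return -1 if inv % 2 else 1

def det_by_definition(A):
    """Sum of sgn(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)] over all sigma."""
    n = len(A)
    return sum(
        sgn(p) * prod(A[i][p[i]] for i in range(n))
        for p in permutations(range(n))
    )

# The two matrices of Example 8.3.
A = [[2, 1, 1], [0, 5, -2], [1, -3, 4]]
B = [[3, 2, 1], [-4, 5, -1], [2, -3, 4]]
print(det_by_definition(A), det_by_definition(B))
```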
The next example shows that the above definition agrees with the previous definition of determinants
of order 1, 2, and 3.
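Example 8.7.
(a) Let $A = [a_{11}]$ be a $1 \times 1$ matrix. Since $S_1$ has only one permutation, which is even, $\det(A) = a_{11}$, the number itself.
(b) Let $A = [a_{ij}]$ be a $2 \times 2$ matrix. In $S_2$, the permutation 12 is even and the permutation 21 is odd. Hence
\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{vmatrix} = a_{11}a_{22} - a_{12}a_{21}
\]
(c) Let $A = [a_{ij}]$ be a $3 \times 3$ matrix. In $S_3$, the permutations 123, 231, 312 are even, and the permutations 321, 213, 132 are odd. Hence
\[
\det(A) = \begin{vmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{vmatrix}
= a_{11}a_{22}a_{33} + a_{12}a_{23}a_{31} + a_{13}a_{21}a_{32} - a_{13}a_{22}a_{31} - a_{12}a_{21}a_{33} - a_{11}a_{23}a_{32}
\]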
Remark: As n increases, the number of terms in the determinant becomes astronomical.
Accordingly, we use indirect methods to evaluate determinants rather than the definition of the determinant. In
fact, we prove a number of properties about determinants that will permit us to shorten the computation
considerably. In particular, we show that a determinant of order $n$ is equal to a linear combination of determinants of order $n - 1$, as in the case $n = 3$ above.
8.6 PROPERTIES OF DETERMINANTS
We now list basic properties of the determinant.
Theorem 8.1: The determinant of a matrix $A$ and its transpose $A^T$ are equal; that is, $|A| = |A^T|$.
By this theorem (proved in Problem 8.22), any theorem about the determinant of a matrix $A$ that concerns the rows of $A$ will have an analogous theorem concerning the columns of $A$.
The next theorem (proved in Problem 8.24) gives certain cases for which the determinant can be obtained immediately.
Theorem 8.2: Let $A$ be a square matrix.
(i) If $A$ has a row (column) of zeros, then $|A| = 0$.
(ii) If $A$ has two identical rows (columns), then $|A| = 0$.
(iii) If $A$ is triangular, i.e., $A$ has zeros above or below the diagonal, then $|A|$ = product of diagonal elements. Thus in particular, $|I| = 1$, where $I$ is the identity matrix.
The next theorem (proved in Problems 8.23 and 8.25) shows how the determinant of a matrix is affected by the elementary row and column operations.
Theorem 8.3: Suppose $B$ is obtained from $A$ by an elementary row (column) operation.
(i) If two rows (columns) of $A$ were interchanged, then $|B| = -|A|$.
(ii) If a row (column) of $A$ were multiplied by a scalar $k$, then $|B| = k|A|$.
(iii) If a multiple of a row (column) of $A$ were added to another row (column) of $A$, then $|B| = |A|$.
Major Properties of Determinants
We now state two of the most important and useful theorems on determinants.
Theorem 8.4: The determinant of a product of two matrices $A$ and $B$ is the product of their determinants; that is,
\[
\det(AB) = \det(A)\det(B)
\]
The above theorem says that the determinant is a multiplicative function.
Theorem 8.5: Let $A$ be a square matrix. Then the following are equivalent:
(i) $A$ is invertible; that is, $A$ has an inverse $A^{-1}$.
(ii) $AX = 0$ has only the zero solution.
(iii) The determinant of $A$ is not zero; that is, $\det(A) \ne 0$.
Remark: Depending on the author and the text, a nonsingular matrix $A$ is defined to be an invertible matrix $A$, or a matrix $A$ for which $|A| \ne 0$, or a matrix $A$ for which $AX = 0$ has only the zero solution. The above theorem shows that all such definitions are equivalent.
We shall prove Theorems 8.4 and 8.5 (in Problems 8.29 and 8.28, respectively) using the theory of elementary matrices and the following lemma (proved in Problem 8.26), which is a special case of Theorem 8.4.
Lemma 8.6: Let $E$ be an elementary matrix. Then, for any matrix $A$, $|EA| = |E||A|$.
Recall that matrices $A$ and $B$ are similar if there exists a nonsingular matrix $P$ such that $B = P^{-1}AP$. Using the multiplicative property of the determinant (Theorem 8.4), one can easily prove (Problem 8.31) the following theorem.
Theorem 8.7: Suppose $A$ and $B$ are similar matrices. Then $|A| = |B|$.
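A numerical spot check of Theorems 8.4 and 8.7 on $2 \times 2$ matrices can be sketched as follows. The matrices `A`, `B`, `P` are arbitrary choices of ours (with `P_inv` the inverse of `P`, written out by hand), and the helper names are hypothetical; a check on one example illustrates the theorems but of course proves nothing.

```python
def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]

def det2(M):
    """Determinant of order 2: a11*a22 - a12*a21."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

A = [[2, 1], [5, 3]]
B = [[4, -3], [2, 5]]
P = [[1, 1], [0, 1]]
P_inv = [[1, -1], [0, 1]]          # inverse of P, since P * P_inv = I

# Theorem 8.4: det(AB) = det(A) det(B)
print(det2(matmul(A, B)), det2(A) * det2(B))

# Theorem 8.7: the similar matrix P^{-1} A P has the same determinant as A
similar = matmul(matmul(P_inv, A), P)
print(det2(similar), det2(A))
```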
8.7 MINORS AND COFACTORS
Consider an $n$-square matrix $A = [a_{ij}]$. Let $M_{ij}$ denote the $(n - 1)$-square submatrix of $A$ obtained by deleting its $i$th row and $j$th column. The determinant $|M_{ij}|$ is called the minor of the element $a_{ij}$ of $A$, and we define the cofactor of $a_{ij}$, denoted by $A_{ij}$, to be the "signed" minor:
\[
A_{ij} = (-1)^{i+j}|M_{ij}|
\]
Note that the "signs" $(-1)^{i+j}$ accompanying the minors form a chessboard pattern with $+$'s on the main diagonal:
\[
\begin{bmatrix}
+ & - & + & - & \cdots \\
- & + & - & + & \cdots \\
+ & - & + & - & \cdots \\
\vdots & \vdots & \vdots & \vdots &
\end{bmatrix}
\]
We emphasize that $M_{ij}$ denotes a matrix whereas $A_{ij}$ denotes a scalar.
Remark: The sign $(-1)^{i+j}$ of the cofactor $A_{ij}$ is frequently obtained using the checkerboard pattern. Specifically, beginning with $+$ and alternating signs, i.e., $+, -, +, -, \ldots$, count from the main diagonal to the appropriate square.
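Example 8.8. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Find the following minors and cofactors: (a) $|M_{23}|$ and $A_{23}$, (b) $|M_{31}|$ and $A_{31}$.
(a) $|M_{23}| = \begin{vmatrix} 1 & 2 \\ 7 & 8 \end{vmatrix} = 8 - 14 = -6$, and so $A_{23} = (-1)^{2+3}|M_{23}| = -(-6) = 6$
(b) $|M_{31}| = \begin{vmatrix} 2 & 3 \\ 5 & 6 \end{vmatrix} = 12 - 15 = -3$, and so $A_{31} = (-1)^{3+1}|M_{31}| = +(-3) = -3$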
Laplace Expansion
The following theorem (proved in Problem 8.32) holds.
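Theorem 8.8: (Laplace) The determinant of a square matrix $A = [a_{ij}]$ is equal to the sum of the products obtained by multiplying the elements of any row (column) by their respective cofactors:
\[
|A| = a_{i1}A_{i1} + a_{i2}A_{i2} + \cdots + a_{in}A_{in} = \sum_{j=1}^{n} a_{ij}A_{ij}
\]
\[
|A| = a_{1j}A_{1j} + a_{2j}A_{2j} + \cdots + a_{nj}A_{nj} = \sum_{i=1}^{n} a_{ij}A_{ij}
\]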
The above formulas for $|A|$ are called the Laplace expansions of the determinant of $A$ by the $i$th row and the $j$th column. Together with the elementary row (column) operations, they offer a method of simplifying the computation of $|A|$, as described below.
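The row expansion of Theorem 8.8 translates naturally into a recursive sketch. The names `minor` and `det_laplace` are ours, matrices are assumed to be lists of rows with integer or rational entries, and rows and columns are indexed from 0, which leaves the parity of $i + j$ unchanged.

```python
def minor(A, i, j):
    """Submatrix M_ij: delete row i and column j (0-based)."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i]

def det_laplace(A, row=0):
    """Expand |A| along the given row: sum of a_ij * (-1)^(i+j) * |M_ij|."""
    n = len(A)
    if n == 1:
        return A[0][0]
    return sum(
        (-1) ** (row + j) * A[row][j] * det_laplace(minor(A, row, j))
        for j in range(n)
    )

# The determinant of Example 8.4, expanded along each of its three rows.
M = [[1, 2, 3], [4, -2, 3], [0, 5, -1]]
print([det_laplace(M, r) for r in range(3)])
```

Every row gives the same value, as the theorem asserts. The recursion still costs on the order of $n!$ multiplications, so it is a definition made executable rather than an efficient algorithm.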
8.8 EVALUATION OF DETERMINANTS
The following algorithm reduces the evaluation of a determinant of order $n$ to the evaluation of a determinant of order $n - 1$.
Algorithm 8.1: (Reduction of the order of a determinant) The input is a nonzero $n$-square matrix $A = [a_{ij}]$ with $n > 1$.
Step 1. Choose an element $a_{ij} = 1$ or, if lacking, $a_{ij} \ne 0$.
Step 2. Using $a_{ij}$ as a pivot, apply elementary row (column) operations to put 0's in all the other positions in the column (row) containing $a_{ij}$.
Step 3. Expand the determinant by the column (row) containing $a_{ij}$.
The following remarks are in order.
Remark 1: Algorithm 8.1 is usually used for determinants of order 4 or more. With determinants of
order less than 4, one uses the specific formulas for the determinant.
Remark 2: Gaussian elimination or, equivalently, repeated use of Algorithm 8.1 together with row
interchanges can be used to transform a matrix A into an upper triangular matrix whose determinant is the
product of its diagonal entries. However, one must keep track of the number of row interchanges, since
each row interchange changes the sign of the determinant.
Example 8.9. Use Algorithm 8.1 to find the determinant of $A = \begin{bmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{bmatrix}$.
Use $a_{23} = 1$ as a pivot to put 0's in the other positions of the third column, that is, apply the row operations "Replace $R_1$ by $-2R_2 + R_1$", "Replace $R_3$ by $3R_2 + R_3$", and "Replace $R_4$ by $R_2 + R_4$". By Theorem 8.3(iii), the value of the determinant does not change under these operations. Thus
\[
|A| = \begin{vmatrix} 5 & 4 & 2 & 1 \\ 2 & 3 & 1 & -2 \\ -5 & -7 & -3 & 9 \\ 1 & -2 & -1 & 4 \end{vmatrix}
= \begin{vmatrix} 1 & -2 & 0 & 5 \\ 2 & 3 & 1 & -2 \\ 1 & 2 & 0 & 3 \\ 3 & 1 & 0 & 2 \end{vmatrix}
\]
Now expand by the third column. Specifically, neglect all terms that contain 0 and use the fact that the sign of the minor $M_{23}$ is $(-1)^{2+3} = -1$. Thus
\[
|A| = -\begin{vmatrix} 1 & -2 & 5 \\ 1 & 2 & 3 \\ 3 & 1 & 2 \end{vmatrix} = -(4 - 18 + 5 - 30 - 3 + 4) = -(-38) = 38
\]
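Remark 2 above describes the Gaussian-elimination route to the determinant; a minimal Python sketch of it follows. This is not Algorithm 8.1 verbatim (it always pivots down the columns in order rather than hunting for an entry equal to 1), the name `det_by_elimination` is ours, and exact arithmetic with `fractions.Fraction` is an assumption made to avoid rounding.

```python
from fractions import Fraction

def det_by_elimination(A):
    """Reduce A to upper triangular form, tracking row interchanges."""
    M = [[Fraction(x) for x in row] for row in A]
    n = len(M)
    sign = 1
    for c in range(n):
        # Find a row at or below the diagonal with a nonzero entry in column c.
        pivot = next((r for r in range(c, n) if M[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)          # a zero column forces |A| = 0
        if pivot != c:
            M[c], M[pivot] = M[pivot], M[c]
            sign = -sign                # Theorem 8.3(i): interchange flips the sign
        for r in range(c + 1, n):
            factor = M[r][c] / M[c][c]
            # Theorem 8.3(iii): adding a multiple of a row leaves |A| unchanged
            M[r] = [x - factor * y for x, y in zip(M[r], M[c])]
    result = Fraction(sign)
    for i in range(n):
        result *= M[i][i]               # Theorem 8.2(iii): product of the diagonal
    return result

# The matrix of Example 8.9.
A = [[5, 4, 2, 1], [2, 3, 1, -2], [-5, -7, -3, 9], [1, -2, -1, 4]]
print(det_by_elimination(A))
```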
8.9 CLASSICAL ADJOINT
Let $A = [a_{ij}]$ be an $n \times n$ matrix over a field $K$ and let $A_{ij}$ denote the cofactor of $a_{ij}$. The classical adjoint of $A$, denoted by $\operatorname{adj} A$, is the transpose of the matrix of cofactors of $A$. Namely,
\[
\operatorname{adj} A = [A_{ij}]^T
\]
We say "classical adjoint" instead of simply "adjoint" because the term "adjoint" is currently used for an entirely different concept.
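Example 8.10. Let $A = \begin{bmatrix} 2 & 3 & -4 \\ 0 & -4 & 2 \\ 1 & -1 & 5 \end{bmatrix}$. The cofactors of the nine elements of $A$ follow:
\[
\begin{aligned}
A_{11} &= +\begin{vmatrix} -4 & 2 \\ -1 & 5 \end{vmatrix} = -18, &
A_{12} &= -\begin{vmatrix} 0 & 2 \\ 1 & 5 \end{vmatrix} = 2, &
A_{13} &= +\begin{vmatrix} 0 & -4 \\ 1 & -1 \end{vmatrix} = 4, \\
A_{21} &= -\begin{vmatrix} 3 & -4 \\ -1 & 5 \end{vmatrix} = -11, &
A_{22} &= +\begin{vmatrix} 2 & -4 \\ 1 & 5 \end{vmatrix} = 14, &
A_{23} &= -\begin{vmatrix} 2 & 3 \\ 1 & -1 \end{vmatrix} = 5, \\
A_{31} &= +\begin{vmatrix} 3 & -4 \\ -4 & 2 \end{vmatrix} = -10, &
A_{32} &= -\begin{vmatrix} 2 & -4 \\ 0 & 2 \end{vmatrix} = -4, &
A_{33} &= +\begin{vmatrix} 2 & 3 \\ 0 & -4 \end{vmatrix} = -8
\end{aligned}
\]
The transpose of the above matrix of cofactors yields the classical adjoint of $A$, that is,
\[
\operatorname{adj} A = \begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}
\]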
The following theorem (proved in Problem 8.34) holds.
Theorem 8.9: Let $A$ be any square matrix. Then
\[
A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|I
\]
where $I$ is the identity matrix. Thus, if $|A| \ne 0$,
\[
A^{-1} = \frac{1}{|A|}(\operatorname{adj} A)
\]
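Example 8.11. Let $A$ be the matrix in Example 8.10. We have
\[
\det(A) = -40 + 6 + 0 - 16 + 4 + 0 = -46
\]
Thus $A$ does have an inverse, and, by Theorem 8.9,
\[
A^{-1} = \frac{1}{|A|}(\operatorname{adj} A) = -\frac{1}{46}\begin{bmatrix} -18 & -11 & -10 \\ 2 & 14 & -4 \\ 4 & 5 & -8 \end{bmatrix}
= \begin{bmatrix} \tfrac{9}{23} & \tfrac{11}{46} & \tfrac{5}{23} \\ -\tfrac{1}{23} & -\tfrac{7}{23} & \tfrac{2}{23} \\ -\tfrac{2}{23} & -\tfrac{5}{46} & \tfrac{4}{23} \end{bmatrix}
\]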
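The computation in Examples 8.10 and 8.11 can be reproduced with a short sketch. The function names `det`, `adjugate`, and `inverse` are ours; integer entries are assumed so that `fractions.Fraction` yields the exact inverse.

```python
from fractions import Fraction

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if not A:
        return 1                        # determinant of the empty matrix
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def adjugate(A):
    """Classical adjoint: transpose of the matrix of cofactors."""
    n = len(A)
    cof = [
        [
            (-1) ** (i + j)
            * det([row[:j] + row[j + 1:] for k, row in enumerate(A) if k != i])
            for j in range(n)
        ]
        for i in range(n)
    ]
    return [[cof[j][i] for j in range(n)] for i in range(n)]

def inverse(A):
    """A^{-1} = (1/|A|) adj A, valid only when |A| is nonzero."""
    d = det(A)
    if d == 0:
        raise ValueError("matrix is singular")
    return [[Fraction(x, d) for x in row] for row in adjugate(A)]

A = [[2, 3, -4], [0, -4, 2], [1, -1, 5]]
print(adjugate(A))
print(inverse(A)[0])
```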
8.10 APPLICATIONS TO LINEAR EQUATIONS, CRAMER'S RULE
Consider a system $AX = B$ of $n$ linear equations in $n$ unknowns. Here $A = [a_{ij}]$ is the (square) matrix of coefficients and $B = [b_i]$ is the column vector of constants. Let $A_i$ be the matrix obtained from $A$ by replacing the $i$th column of $A$ by the column vector $B$. Furthermore, let
\[
D = \det(A), \quad N_1 = \det(A_1), \quad N_2 = \det(A_2), \quad \ldots, \quad N_n = \det(A_n)
\]
The fundamental relationship between determinants and the solution of the system $AX = B$ follows.
Theorem 8.10: The (square) system $AX = B$ has a unique solution if and only if $D \ne 0$. In this case, the unique solution is given by
\[
x_1 = \frac{N_1}{D}, \quad x_2 = \frac{N_2}{D}, \quad \ldots, \quad x_n = \frac{N_n}{D}
\]
The above theorem (proved in Problem 8.10) is known as Cramer's rule for solving systems of linear equations. We emphasize that the theorem only refers to a system with the same number of equations as unknowns, and that it only gives the solution when $D \ne 0$. In fact, if $D = 0$, the theorem does not tell us whether or not the system has a solution. However, in the case of a homogeneous system, we have the following useful result (to be proved in Problem 8.54).
Theorem 8.11: A square homogeneous system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.
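Example 8.12. Solve, using determinants, the system
\[
\begin{aligned}
x + y + z &= 5 \\
x - 2y - 3z &= -1 \\
2x + y - z &= 3
\end{aligned}
\]
First compute the determinant $D$ of the matrix of coefficients:
\[
D = \begin{vmatrix} 1 & 1 & 1 \\ 1 & -2 & -3 \\ 2 & 1 & -1 \end{vmatrix} = 2 - 6 + 1 + 4 + 3 + 1 = 5
\]
Since $D \ne 0$, the system has a unique solution. To compute $N_x$, $N_y$, $N_z$, we replace, respectively, the coefficients of $x$, $y$, $z$ in the matrix of coefficients by the constant terms. This yields
\[
N_x = \begin{vmatrix} 5 & 1 & 1 \\ -1 & -2 & -3 \\ 3 & 1 & -1 \end{vmatrix} = 20, \qquad
N_y = \begin{vmatrix} 1 & 5 & 1 \\ 1 & -1 & -3 \\ 2 & 3 & -1 \end{vmatrix} = -10, \qquad
N_z = \begin{vmatrix} 1 & 1 & 5 \\ 1 & -2 & -1 \\ 2 & 1 & 3 \end{vmatrix} = 15
\]
Thus the unique solution of the system is $x = N_x/D = 4$, $y = N_y/D = -2$, $z = N_z/D = 3$, that is, the vector $u = (4, -2, 3)$.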
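Cramer's rule as stated in Theorem 8.10 can be sketched as follows. The names `det` and `cramer` are ours; the coefficients are assumed to be integers so that `fractions.Fraction` gives exact quotients, and the function raises an error when $D = 0$, the case on which the theorem is silent.

```python
from fractions import Fraction

def det(A):
    """Determinant by Laplace expansion along the first row."""
    if not A:
        return 1
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def cramer(A, B):
    """Solve the square system AX = B by x_i = N_i / D, requiring D != 0."""
    D = det(A)
    if D == 0:
        raise ValueError("D = 0: Cramer's rule does not apply")
    solution = []
    for i in range(len(A)):
        # A_i: replace the i-th column of A by the column of constants B.
        A_i = [row[:i] + [B[r]] + row[i + 1:] for r, row in enumerate(A)]
        solution.append(Fraction(det(A_i), D))
    return solution

# The systems of Example 8.12 and Example 8.2.
print(cramer([[1, 1, 1], [1, -2, -3], [2, 1, -1]], [5, -1, 3]))
print(cramer([[4, -3], [2, 5]], [15, 1]))
```

For large systems Gaussian elimination is far cheaper; the rule is mainly of theoretical interest, since it expresses each unknown explicitly in terms of determinants.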
8.11 SUBMATRICES, MINORS, PRINCIPAL MINORS
Let $A = [a_{ij}]$ be a square matrix of order $n$. Consider any $r$ rows and $r$ columns of $A$. That is, consider any set $I = (i_1, i_2, \ldots, i_r)$ of $r$ row indices and any set $J = (j_1, j_2, \ldots, j_r)$ of $r$ column indices. Then $I$ and $J$ define an $r \times r$ submatrix of $A$, denoted by $A(I; J)$, obtained by deleting the rows and columns of $A$ whose subscripts do not belong to $I$ or $J$, respectively. That is,
\[
A(I; J) = [a_{st} : s \in I,\ t \in J]
\]
The determinant $|A(I; J)|$ is called a minor of $A$ of order $r$ and
\[
(-1)^{i_1 + i_2 + \cdots + i_r + j_1 + j_2 + \cdots + j_r}\,|A(I; J)|
\]
is the corresponding signed minor. (Note that a minor of order $n - 1$ is a minor in the sense of Section 8.7, and the corresponding signed minor is a cofactor.) Furthermore, if $I'$ and $J'$ denote, respectively, the remaining row and column indices, then
\[
|A(I'; J')|
\]
denotes the complementary minor, and its sign (Problem 8.74) is the same sign as the minor itself.
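Example 8.13. Let $A = [a_{ij}]$ be a 5-square matrix, and let $I = \{1, 2, 4\}$ and $J = \{2, 3, 5\}$. Then $I' = \{3, 5\}$ and $J' = \{1, 4\}$, and the corresponding minor $|M|$ and complementary minor $|M'|$ are as follows:
\[
|M| = |A(I; J)| = \begin{vmatrix} a_{12} & a_{13} & a_{15} \\ a_{22} & a_{23} & a_{25} \\ a_{42} & a_{43} & a_{45} \end{vmatrix}
\qquad\text{and}\qquad
|M'| = |A(I'; J')| = \begin{vmatrix} a_{31} & a_{34} \\ a_{51} & a_{54} \end{vmatrix}
\]
Since $1 + 2 + 4 + 2 + 3 + 5 = 17$ is odd, $-|M|$ is the signed minor, and $-|M'|$ is the signed complementary minor.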
PRINCIPAL MINORS
A minor is principal if the row and column indices are the same, or equivalently, if the diagonal
elements of the minor come from the diagonal of the matrix. We note that the sign of a principal minor is
always +1, since the sum of the row and identical column subscripts must always be even.
Example 8.14. Let $A = \begin{bmatrix} 1 & 2 & -1 \\ 3 & 5 & 4 \\ -3 & 1 & -2 \end{bmatrix}$. Find the sums $C_1$, $C_2$, and $C_3$ of the principal minors of $A$ of orders one, two, and three, respectively.
(a) There are three principal minors of order one. These are
$|1| = 1$, $|5| = 5$, $|-2| = -2$, and so $C_1 = 1 + 5 - 2 = 4$
Note that $C_1$ is simply the trace of $A$. Namely, $C_1 = \operatorname{tr}(A)$.
(b) There are three ways to choose two of the three diagonal elements, and each choice gives a minor of order two. These are
\[
\begin{vmatrix} 1 & 2 \\ 3 & 5 \end{vmatrix} = -1, \qquad
\begin{vmatrix} 1 & -1 \\ -3 & -2 \end{vmatrix} = -2 - 3 = -5, \qquad
\begin{vmatrix} 5 & 4 \\ 1 & -2 \end{vmatrix} = -14
\]
(Note that these minors of order two are the cofactors $A_{33}$, $A_{22}$, and $A_{11}$ of $A$, respectively.) Thus
$C_2 = -1 - 5 - 14 = -20$
(c) There is only one way to choose three of the three diagonal elements. Thus the only minor of order three is the determinant of $A$ itself. Thus
$C_3 = |A| = -10 - 24 - 3 - 15 - 4 + 12 = -44$
8.12 BLOCK MATRICES AND DETERMINANTS
The following theorem (proved in Problem 8.36) is the main result of this section.
Theorem 8.12: Suppose $M$ is an upper (lower) triangular block matrix with the diagonal blocks $A_1, A_2, \ldots, A_n$. Then
\[
\det(M) = \det(A_1)\det(A_2)\cdots\det(A_n)
\]
Example 8.15. Find $|M|$ where $M = \left[\begin{array}{cc|ccc} 2 & 3 & 4 & 7 & 8 \\ -1 & 5 & 3 & 2 & 1 \\ \hline 0 & 0 & 2 & 1 & 5 \\ 0 & 0 & 3 & -1 & 4 \\ 0 & 0 & 5 & 2 & 6 \end{array}\right]$
Note that $M$ is an upper triangular block matrix. Evaluate the determinant of each diagonal block:
\[
\begin{vmatrix} 2 & 3 \\ -1 & 5 \end{vmatrix} = 10 + 3 = 13, \qquad
\begin{vmatrix} 2 & 1 & 5 \\ 3 & -1 & 4 \\ 5 & 2 & 6 \end{vmatrix} = -12 + 20 + 30 + 25 - 16 - 18 = 29
\]
Then $|M| = 13(29) = 377$.
Remark: Suppose $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$, where $A, B, C, D$ are square matrices. Then it is not generally true that $|M| = |A||D| - |B||C|$. (See Problem 8.68.)
8.13 DETERMINANTS AND VOLUME
Determinants are related to the notions of area and volume as follows. Let $u_1, u_2, \ldots, u_n$ be vectors in $\mathbf{R}^n$. Let $S$ be the (solid) parallelepiped determined by the vectors, that is,
\[
S = \{a_1u_1 + a_2u_2 + \cdots + a_nu_n : 0 \le a_i \le 1 \text{ for } i = 1, \ldots, n\}
\]
(When $n = 2$, $S$ is a parallelogram.) Let $V(S)$ denote the volume of $S$ (or area of $S$ when $n = 2$). Then
\[
V(S) = \text{absolute value of } \det(A)
\]
where $A$ is the matrix with rows $u_1, u_2, \ldots, u_n$. In general, $V(S) = 0$ if and only if the vectors $u_1, \ldots, u_n$ do not form a coordinate system for $\mathbf{R}^n$, i.e., if and only if the vectors are linearly dependent.
Example 8.16. Let $u_1 = (1, 1, 0)$, $u_2 = (1, 1, 1)$, $u_3 = (0, 2, 3)$. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ (Fig. 8-2) determined by the three vectors.
Fig. 8-2
Evaluate the determinant of the matrix whose rows are $u_1, u_2, u_3$:
\[
\begin{vmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 3 \end{vmatrix} = 3 + 0 + 0 - 0 - 2 - 3 = -2
\]
Hence $V(S) = |-2| = 2$.
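As a small illustration of the volume formula, the sketch below computes $V(S)$ for the vectors of Example 8.16 and for a linearly dependent triple. The names `det` and `volume` are ours, and the second set of vectors is an arbitrary choice made only to show the degenerate case.

```python
def det(A):
    """Determinant by Laplace expansion along the first row."""
    if not A:
        return 1
    return sum(
        (-1) ** j * A[0][j] * det([row[:j] + row[j + 1:] for row in A[1:]])
        for j in range(len(A))
    )

def volume(vectors):
    """Volume of the parallelepiped spanned by n vectors in R^n."""
    return abs(det([list(v) for v in vectors]))

# Example 8.16: the three vectors span a parallelepiped of volume 2.
print(volume([(1, 1, 0), (1, 1, 1), (0, 2, 3)]))

# Linearly dependent vectors (third = first + second) give volume 0.
print(volume([(1, 1, 0), (1, 1, 1), (2, 2, 1)]))
```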
8.14 DETERMINANT OF A LINEAR OPERATOR
Let $F$ be a linear operator on a vector space $V$ with finite dimension. Let $A$ be the matrix representation of $F$ relative to some basis $S$ of $V$. Then we define the determinant of $F$, written $\det(F)$, by
\[
\det(F) = |A|
\]
If $B$ were another matrix representation of $F$ relative to another basis $S'$ of $V$, then $A$ and $B$ are similar matrices (Theorem 6.7) and hence $|B| = |A|$ by Theorem 8.7. In other words, the above definition of $\det(F)$ is independent of the particular basis $S$ of $V$. (We say that the definition is well-defined.)
The next theorem (to be proved in Problem 8.62) follows from analogous theorems on matrices.
Theorem 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then
(i) $\det(F \circ G) = \det(F)\det(G)$.
(ii) $F$ is invertible if and only if $\det(F) \ne 0$.
Example 8.17. Let $F$ be the following linear operator on $\mathbf{R}^3$ and let $A$ be the matrix that represents $F$ relative to the usual basis of $\mathbf{R}^3$:
\[
F(x, y, z) = (2x - 4y + z,\ x - 2y + 3z,\ 5x + y - z) \qquad\text{and}\qquad A = \begin{bmatrix} 2 & -4 & 1 \\ 1 & -2 & 3 \\ 5 & 1 & -1 \end{bmatrix}
\]
Then
\[
\det(F) = |A| = 4 - 60 + 1 + 10 - 6 - 4 = -55
\]
8.15 MULTILINEARITY AND DETERMINANTS
Let $V$ be a vector space over a field $K$. Let $\mathcal{A} = V^n$, that is, $\mathcal{A}$ consists of all the $n$-tuples
\[
A = (A_1, A_2, \ldots, A_n)
\]
where the $A_i$ are vectors in $V$. The following definitions apply.
Definition: A function $D: \mathcal{A} \to K$ is said to be multilinear if it is linear in each component, that is:
(i) If $A_i = B + C$, then
$D(A) = D(\ldots, B + C, \ldots) = D(\ldots, B, \ldots) + D(\ldots, C, \ldots)$
(ii) If $A_i = kB$, where $k \in K$, then
$D(A) = D(\ldots, kB, \ldots) = kD(\ldots, B, \ldots)$
We also say $n$-linear for multilinear if there are $n$ components.
Definition: A function $D: \mathcal{A} \to K$ is said to be alternating if $D(A) = 0$ whenever $A$ has two identical elements, that is,
$D(A_1, A_2, \ldots, A_n) = 0$ whenever $A_i = A_j$, $i \ne j$
Now let $\mathbf{M}$ denote the set of all $n$-square matrices $A$ over a field $K$. We may view $A$ as an $n$-tuple consisting of its row vectors $A_1, A_2, \ldots, A_n$; that is, we may view $A$ in the form $A = (A_1, A_2, \ldots, A_n)$.
The following theorem (proved in Problem 8.37) characterizes the determinant function.
Theorem 8.14: There exists a unique function $D: \mathbf{M} \to K$ such that:
(i) $D$ is multilinear, (ii) $D$ is alternating, (iii) $D(I) = 1$.
This function $D$ is the determinant function; that is, $D(A) = |A|$, for any matrix $A \in \mathbf{M}$.
Solved Problems
COMPUTATION OF DETERMINANTS
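8.1. Evaluate the determinant of each of the following matrices:
\[ \text{(a) } A = \begin{bmatrix} 6 & 5 \\ 2 & 3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 2 & -3 \\ 4 & 7 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 4 & -5 \\ -1 & -2 \end{bmatrix}, \quad \text{(d) } D = \begin{bmatrix} t-5 & 6 \\ 3 & t+2 \end{bmatrix} \]
Use the formula $\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc$:
(a) $|A| = 6(3) - 5(2) = 18 - 10 = 8$
(b) $|B| = 14 + 12 = 26$
(c) $|C| = -8 - 5 = -13$
(d) $|D| = (t - 5)(t + 2) - 18 = t^2 - 3t - 10 - 18 = t^2 - 3t - 28$
8.2. Evaluate the determinant of each of the following matrices:
\[ \text{(a) } A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 4 & 3 \\ 1 & 2 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & -2 & 3 \\ 2 & 4 & -1 \\ 1 & 5 & -2 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & 3 & -5 \\ 3 & -1 & 2 \\ 1 & -2 & 1 \end{bmatrix} \]
Use the diagram in Fig. 8-1 to obtain the six products:
(a) $|A| = 2(4)(1) + 3(3)(1) + 4(2)(5) - 1(4)(4) - 2(3)(2) - 1(3)(5) = 8 + 9 + 40 - 16 - 12 - 15 = 14$
(b) $|B| = -8 + 2 + 30 - 12 + 5 - 8 = 9$
(c) $|C| = -1 + 6 + 30 - 5 + 4 - 9 = 25$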
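The six-product rule of Fig. 8-1 can be written as a short Python function. This is only a sketch: the function name `sarrus` is an assumption, and the rule applies to 3 x 3 matrices only.

```python
def sarrus(m):
    """Determinant of a 3x3 matrix: three 'downward' products minus three 'upward' ones."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * e * i + b * f * g + c * d * h - c * e * g - b * d * i - a * f * h


# The three matrices of Problem 8.2.
A = [[2, 3, 4], [5, 4, 3], [1, 2, 1]]
B = [[1, -2, 3], [2, 4, -1], [1, 5, -2]]
C = [[1, 3, -5], [3, -1, 2], [1, -2, 1]]

print(sarrus(A), sarrus(B), sarrus(C))  # 14 9 25
```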
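8.3. Compute the determinant of each of the following matrices:
\[ \text{(a) } A = \begin{bmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & -6 & 8 & 9 \\ 0 & -2 & 7 & -3 \\ 0 & 0 & 5 & 6 \\ 0 & 0 & 0 & 3 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} \tfrac12 & -1 & -\tfrac13 \\ \tfrac34 & \tfrac12 & -1 \\ 1 & -4 & 1 \end{bmatrix} \]
(a) One can simplify the entries by first subtracting twice the first row from the second row, that is, by
applying the row operation "Replace $R_2$ by $-2R_1 + R_2$". Then
\[ |A| = \begin{vmatrix} 2 & 3 & 4 \\ 5 & 6 & 7 \\ 8 & 9 & 1 \end{vmatrix} = \begin{vmatrix} 2 & 3 & 4 \\ 1 & 0 & -1 \\ 8 & 9 & 1 \end{vmatrix} = 0 - 24 + 36 - 0 + 18 - 3 = 27 \]
(b) $B$ is triangular, so $|B|$ = product of the diagonal entries = $-120$.
(c) The arithmetic is simpler if fractions are first eliminated. Hence multiply the first row $R_1$ by 6 and the
second row $R_2$ by 4. Then
\[ |24C| = \begin{vmatrix} 3 & -6 & -2 \\ 3 & 2 & -4 \\ 1 & -4 & 1 \end{vmatrix} = 6 + 24 + 24 + 4 - 48 + 18 = 28, \quad\text{so}\quad |C| = \frac{28}{24} = \frac{7}{6} \]
(Here $|24C|$ denotes 24 times $|C|$, since one row was multiplied by 6 and another by 4.)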
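The row-operation technique used in Problem 8.3 is, in essence, Gaussian elimination. A minimal Python sketch follows; the function name `det_by_elimination` is an assumption, and exact rational arithmetic from the standard `fractions` module is used so that no rounding occurs.

```python
from fractions import Fraction


def det_by_elimination(matrix):
    """Determinant via row reduction to triangular form, using exact fractions."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        # Look for a nonzero pivot at or below the diagonal.
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)  # a zero column below the diagonal forces |A| = 0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result  # interchanging two rows changes the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            # Adding a multiple of one row to another leaves the determinant unchanged.
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result


A = [[2, 3, 4], [5, 6, 7], [8, 9, 1]]
B = [[4, -6, 8, 9], [0, -2, 7, -3], [0, 0, 5, 6], [0, 0, 0, 3]]
C = [[Fraction(1, 2), -1, Fraction(-1, 3)],
     [Fraction(3, 4), Fraction(1, 2), -1],
     [1, -4, 1]]

print(det_by_elimination(A), det_by_elimination(B), det_by_elimination(C))  # 27 -120 7/6
```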
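8.4. Compute the determinant of each of the following matrices:
\[ \text{(a) } A = \begin{bmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 6 & 2 & 1 & 0 & 5 \\ 2 & 1 & 1 & -2 & 1 \\ 1 & 1 & 2 & -2 & 3 \\ 3 & 0 & 2 & 3 & -1 \\ -1 & -1 & -3 & 4 & 2 \end{bmatrix} \]
(a) Use $a_{31} = 1$ as a pivot to put 0's in the first column, by applying the row operations "Replace $R_1$ by
$-2R_3 + R_1$", "Replace $R_2$ by $2R_3 + R_2$", and "Replace $R_4$ by $R_3 + R_4$". Then
\[ |A| = \begin{vmatrix} 2 & 5 & -3 & -2 \\ -2 & -3 & 2 & -5 \\ 1 & 3 & -2 & 2 \\ -1 & -6 & 4 & 3 \end{vmatrix} = \begin{vmatrix} 0 & -1 & 1 & -6 \\ 0 & 3 & -2 & -1 \\ 1 & 3 & -2 & 2 \\ 0 & -3 & 2 & 5 \end{vmatrix} = \begin{vmatrix} -1 & 1 & -6 \\ 3 & -2 & -1 \\ -3 & 2 & 5 \end{vmatrix} = 10 + 3 - 36 + 36 - 2 - 15 = -4 \]
(b) First reduce $|B|$ to a determinant of order 4, and then to a determinant of order 3, for which we can use
Fig. 8-1. First use $b_{22} = 1$ as a pivot to put 0's in the second column, by applying the row operations
"Replace $R_1$ by $-2R_2 + R_1$", "Replace $R_3$ by $-R_2 + R_3$", and "Replace $R_5$ by $R_2 + R_5$". Then
\[ |B| = \begin{vmatrix} 2 & 0 & -1 & 4 & 3 \\ 2 & 1 & 1 & -2 & 1 \\ -1 & 0 & 1 & 0 & 2 \\ 3 & 0 & 2 & 3 & -1 \\ 1 & 0 & -2 & 2 & 3 \end{vmatrix} = \begin{vmatrix} 2 & -1 & 4 & 3 \\ -1 & 1 & 0 & 2 \\ 3 & 2 & 3 & -1 \\ 1 & -2 & 2 & 3 \end{vmatrix} \]
Next use the entry 1 in the second row and second column of this determinant as a pivot to put 0's in the second row,
by applying the column operations "Replace $C_1$ by $C_2 + C_1$" and "Replace $C_4$ by $-2C_2 + C_4$". Then
\[ |B| = \begin{vmatrix} 1 & -1 & 4 & 5 \\ 0 & 1 & 0 & 0 \\ 5 & 2 & 3 & -5 \\ -1 & -2 & 2 & 7 \end{vmatrix} = \begin{vmatrix} 1 & 4 & 5 \\ 5 & 3 & -5 \\ -1 & 2 & 7 \end{vmatrix} = 21 + 20 + 50 + 15 + 10 - 140 = -24 \]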
COFACTORS, CLASSICAL ADJOINTS, MINORS, PRINCIPAL MINORS
8.5. Let $A = \begin{bmatrix} 2 & 1 & -3 & 4 \\ 5 & -4 & 7 & -2 \\ 4 & 0 & 6 & -3 \\ 3 & -2 & 5 & 2 \end{bmatrix}$.
(a) Find $A_{23}$, the cofactor (signed minor) of 7 in $A$.
(b) Find the minor and the signed minor of the submatrix $M = A(2, 4;\ 2, 3)$.
(c) Find the principal minor determined by the first and third diagonal entries, that is, by
$M = A(1, 3;\ 1, 3)$.
(a) Take the determinant of the submatrix of $A$ obtained by deleting row 2 and column 3 (those which
contain the 7), and multiply the determinant by $(-1)^{2+3}$:
\[ A_{23} = -\begin{vmatrix} 2 & 1 & 4 \\ 4 & 0 & -3 \\ 3 & -2 & 2 \end{vmatrix} = -(-61) = 61 \]
The exponent 2 + 3 comes from the subscripts of $A_{23}$, that is, from the fact that 7 appears in row 2 and
column 3.
(b) The row subscripts are 2 and 4 and the column subscripts are 2 and 3. Hence the minor is the determinant
\[ |M| = \begin{vmatrix} a_{22} & a_{23} \\ a_{42} & a_{43} \end{vmatrix} = \begin{vmatrix} -4 & 7 \\ -2 & 5 \end{vmatrix} = -20 + 14 = -6 \]
and the signed minor is $(-1)^{2+4+2+3}|M| = -|M| = -(-6) = 6$.
(c) The principal minor is the determinant
\[ |M| = \begin{vmatrix} a_{11} & a_{13} \\ a_{31} & a_{33} \end{vmatrix} = \begin{vmatrix} 2 & -3 \\ 4 & 6 \end{vmatrix} = 12 + 12 = 24 \]
Note that now the diagonal entries of the submatrix are diagonal entries of the original matrix. Also, the
sign of the principal minor is positive.
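8.6. Let $B = \begin{bmatrix} 1 & 1 & 1 \\ 2 & 3 & 4 \\ 5 & 8 & 9 \end{bmatrix}$. Find: (a) $|B|$, (b) adj $B$, (c) $B^{-1}$ using adj $B$.
(a) $|B| = 27 + 20 + 16 - 15 - 32 - 18 = -2$
(b) Take the transpose of the matrix of cofactors:
\[ \operatorname{adj} B = \begin{bmatrix} \begin{vmatrix} 3 & 4 \\ 8 & 9 \end{vmatrix} & -\begin{vmatrix} 2 & 4 \\ 5 & 9 \end{vmatrix} & \begin{vmatrix} 2 & 3 \\ 5 & 8 \end{vmatrix} \\[2ex] -\begin{vmatrix} 1 & 1 \\ 8 & 9 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 5 & 9 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 5 & 8 \end{vmatrix} \\[2ex] \begin{vmatrix} 1 & 1 \\ 3 & 4 \end{vmatrix} & -\begin{vmatrix} 1 & 1 \\ 2 & 4 \end{vmatrix} & \begin{vmatrix} 1 & 1 \\ 2 & 3 \end{vmatrix} \end{bmatrix}^T = \begin{bmatrix} -5 & 2 & 1 \\ -1 & 4 & -3 \\ 1 & -2 & 1 \end{bmatrix}^T = \begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix} \]
(c) Since $|B| \neq 0$,
\[ B^{-1} = \frac{1}{|B|}(\operatorname{adj} B) = \frac{1}{-2}\begin{bmatrix} -5 & -1 & 1 \\ 2 & 4 & -2 \\ 1 & -3 & 1 \end{bmatrix} = \begin{bmatrix} \tfrac52 & \tfrac12 & -\tfrac12 \\ -1 & -2 & 1 \\ -\tfrac12 & \tfrac32 & -\tfrac12 \end{bmatrix} \]
8.7. Let $A = \begin{bmatrix} 1 & 2 & 3 \\ 4 & 5 & 6 \\ 0 & 7 & 8 \end{bmatrix}$, and let $S_k$ denote the sum of its principal minors of order $k$. Find $S_k$ for:
(a) $k = 1$, (b) $k = 2$, (c) $k = 3$.
(a) The principal minors of order 1 are the diagonal elements. Thus $S_1$ is the trace of $A$; that is,
\[ S_1 = \operatorname{tr}(A) = 1 + 5 + 8 = 14 \]
(b) The principal minors of order 2 are the cofactors of the diagonal elements. Thus
\[ S_2 = A_{11} + A_{22} + A_{33} = \begin{vmatrix} 5 & 6 \\ 7 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 3 \\ 0 & 8 \end{vmatrix} + \begin{vmatrix} 1 & 2 \\ 4 & 5 \end{vmatrix} = -2 + 8 - 3 = 3 \]
(c) There is only one principal minor of order 3, the determinant of $A$. Then
\[ S_3 = |A| = 40 + 0 + 84 - 0 - 42 - 64 = 18 \]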
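The classical adjoint computation of Problem 8.6 can be checked with a few lines of Python. This is a sketch under stated assumptions: the helper names `det` and `adjugate` are illustrative, and the cofactor expansion used here is practical only for small matrices.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def adjugate(m):
    """Classical adjoint: the transpose of the matrix of cofactors."""
    n = len(m)

    def cofactor(i, j):
        minor = [row[:j] + row[j + 1:] for k, row in enumerate(m) if k != i]
        return (-1) ** (i + j) * det(minor)

    # Entry (i, j) of adj M is the cofactor of the entry in position (j, i).
    return [[cofactor(j, i) for j in range(n)] for i in range(n)]


B = [[1, 1, 1], [2, 3, 4], [5, 8, 9]]
d = det(B)
adj_B = adjugate(B)
B_inv = [[Fraction(x, d) for x in row] for row in adj_B]  # valid because d != 0

print(d)      # -2
print(adj_B)  # [[-5, -1, 1], [2, 4, -2], [1, -3, 1]]
```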
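8.8. Let $A = \begin{bmatrix} 1 & 3 & 0 & -1 \\ -4 & 2 & 5 & 1 \\ 1 & 0 & 3 & -2 \\ 3 & -2 & 1 & 4 \end{bmatrix}$. Find the number $N_k$ and sum $S_k$ of principal minors of order:
(a) $k = 1$, (b) $k = 2$, (c) $k = 3$, (d) $k = 4$.
Each (nonempty) subset of the diagonal (or equivalently, each nonempty subset of {1, 2, 3, 4}) determines
a principal minor of $A$, and
\[ N_k = \binom{n}{k} = \frac{n!}{k!(n - k)!} \]
of them are of order $k$. Thus
\[ N_1 = \binom{4}{1} = 4, \quad N_2 = \binom{4}{2} = 6, \quad N_3 = \binom{4}{3} = 4, \quad N_4 = \binom{4}{4} = 1 \]
(a) $S_1 = |1| + |2| + |3| + |4| = 1 + 2 + 3 + 4 = 10$
(b) \[ S_2 = \begin{vmatrix} 1 & 3 \\ -4 & 2 \end{vmatrix} + \begin{vmatrix} 1 & 0 \\ 1 & 3 \end{vmatrix} + \begin{vmatrix} 1 & -1 \\ 3 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 \\ 0 & 3 \end{vmatrix} + \begin{vmatrix} 2 & 1 \\ -2 & 4 \end{vmatrix} + \begin{vmatrix} 3 & -2 \\ 1 & 4 \end{vmatrix} = 14 + 3 + 7 + 6 + 10 + 14 = 54 \]
(c) \[ S_3 = \begin{vmatrix} 1 & 3 & 0 \\ -4 & 2 & 5 \\ 1 & 0 & 3 \end{vmatrix} + \begin{vmatrix} 1 & 3 & -1 \\ -4 & 2 & 1 \\ 3 & -2 & 4 \end{vmatrix} + \begin{vmatrix} 1 & 0 & -1 \\ 1 & 3 & -2 \\ 3 & 1 & 4 \end{vmatrix} + \begin{vmatrix} 2 & 5 & 1 \\ 0 & 3 & -2 \\ -2 & 1 & 4 \end{vmatrix} = 57 + 65 + 22 + 54 = 198 \]
(d) $S_4 = \det(A) = 378$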
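The counts $N_k$ and sums $S_k$ of Problem 8.8 can be reproduced by enumerating the subsets of diagonal positions. The sketch below uses `itertools.combinations` and `math.comb` from the Python standard library; the function names `det` and `principal_minor_sum` are assumptions made for the illustration.

```python
from itertools import combinations
from math import comb


def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def principal_minor_sum(a, k):
    """Sum of the determinants of all k x k submatrices that keep the same rows and columns."""
    n = len(a)
    return sum(
        det([[a[i][j] for j in idx] for i in idx])
        for idx in combinations(range(n), k)
    )


A = [[1, 3, 0, -1], [-4, 2, 5, 1], [1, 0, 3, -2], [3, -2, 1, 4]]

print([comb(4, k) for k in range(1, 5)])                # [4, 6, 4, 1]
print([principal_minor_sum(A, k) for k in range(1, 5)])  # [10, 54, 198, 378]
```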
DETERMINANTS AND SYSTEMS OF LINEAR EQUATIONS
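8.9. Solve the following system using determinants:
\[ \begin{cases} 3y + 2x = z + 1 \\ 3x + 2z = 8 - 5y \\ 3z - 1 = x - 2y \end{cases} \]
First arrange the equations in standard form, then compute the determinant $D$ of the matrix of coefficients:
\[ \begin{cases} 2x + 3y - z = 1 \\ 3x + 5y + 2z = 8 \\ x - 2y - 3z = -1 \end{cases} \quad\text{and}\quad D = \begin{vmatrix} 2 & 3 & -1 \\ 3 & 5 & 2 \\ 1 & -2 & -3 \end{vmatrix} = -30 + 6 + 6 + 5 + 8 + 27 = 22 \]
Since $D \neq 0$, the system has a unique solution. To compute $N_x, N_y, N_z$, we replace, respectively, the
coefficients of $x, y, z$ in the matrix of coefficients by the constant terms. Then
\[ N_x = \begin{vmatrix} 1 & 3 & -1 \\ 8 & 5 & 2 \\ -1 & -2 & -3 \end{vmatrix} = 66, \quad N_y = \begin{vmatrix} 2 & 1 & -1 \\ 3 & 8 & 2 \\ 1 & -1 & -3 \end{vmatrix} = -22, \quad N_z = \begin{vmatrix} 2 & 3 & 1 \\ 3 & 5 & 8 \\ 1 & -2 & -1 \end{vmatrix} = 44 \]
Thus
\[ x = \frac{N_x}{D} = \frac{66}{22} = 3, \quad y = \frac{N_y}{D} = \frac{-22}{22} = -1, \quad z = \frac{N_z}{D} = \frac{44}{22} = 2 \]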
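The computation in this problem follows Cramer's rule step by step, which suggests the following Python sketch. The function names `det` and `cramer` are assumptions; the code raises an error when $D = 0$, since the rule does not apply in that case.

```python
from fractions import Fraction


def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def cramer(a, b):
    """Solve the square system a x = b by Cramer's rule, returning exact fractions."""
    d = det(a)
    if d == 0:
        raise ValueError("D = 0: the system cannot be solved by determinants")
    solution = []
    for i in range(len(a)):
        # N_i: replace column i of the coefficient matrix by the constant terms.
        n_i = [row[:i] + [b_k] + row[i + 1:] for row, b_k in zip(a, b)]
        solution.append(Fraction(det(n_i), d))
    return solution


# Standard form: 2x + 3y - z = 1, 3x + 5y + 2z = 8, x - 2y - 3z = -1.
A = [[2, 3, -1], [3, 5, 2], [1, -2, -3]]
b = [1, 8, -1]
x, y, z = cramer(A, b)
print(x, y, z)  # 3 -1 2
```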
8.10. Consider the system
\[ \begin{cases} kx + y + z = 1 \\ x + ky + z = 1 \\ x + y + kz = 1 \end{cases} \]
Use determinants to find those values of $k$ for which the system has:
(a) a unique solution, (b) more than one solution, (c) no solution.
(a) The system has a unique solution when $D \neq 0$, where $D$ is the determinant of the matrix of coefficients.
Compute
\[ D = \begin{vmatrix} k & 1 & 1 \\ 1 & k & 1 \\ 1 & 1 & k \end{vmatrix} = k^3 + 1 + 1 - k - k - k = k^3 - 3k + 2 = (k - 1)^2(k + 2) \]
Thus the system has a unique solution when
\[ (k - 1)^2(k + 2) \neq 0, \quad\text{that is, when } k \neq 1 \text{ and } k \neq -2 \]
(b and c) Gaussian elimination shows that the system has more than one solution when $k = 1$, and the
system has no solution when $k = -2$.
MISCELLANEOUS PROBLEMS
8.11. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the vectors:
(a) $u_1 = (1, 1, 1)$, $u_2 = (1, 3, -4)$, $u_3 = (1, 2, -5)$.
(b) $u_1 = (1, 2, 4)$, $u_2 = (2, 1, -3)$, $u_3 = (5, 7, 9)$.
$V(S)$ is the absolute value of the determinant of the matrix $M$ whose rows are the given vectors. Thus
(a) $|M| = \begin{vmatrix} 1 & 1 & 1 \\ 1 & 3 & -4 \\ 1 & 2 & -5 \end{vmatrix} = -15 - 4 + 2 - 3 + 8 + 5 = -7$. Hence $V(S) = |-7| = 7$.
(b) $|M| = \begin{vmatrix} 1 & 2 & 4 \\ 2 & 1 & -3 \\ 5 & 7 & 9 \end{vmatrix} = 9 - 30 + 56 - 20 + 21 - 36 = 0$. Thus $V(S) = 0$, or, in other words, $u_1, u_2, u_3$
lie in a plane and are linearly dependent.
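The volume formula used here translates directly into code. The following Python sketch (the function names `det` and `volume` are assumptions) takes the absolute value of the determinant of the matrix whose rows are the given vectors.

```python
def det(m):
    """Determinant by cofactor (Laplace) expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


def volume(vectors):
    """Volume of the parallelepiped determined by n vectors in R^n."""
    return abs(det([list(v) for v in vectors]))


print(volume([(1, 1, 1), (1, 3, -4), (1, 2, -5)]))  # 7
print(volume([(1, 2, 4), (2, 1, -3), (5, 7, 9)]))   # 0, so the vectors are linearly dependent
```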
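8.12. Find $\det(M)$ where
\[ M = \begin{bmatrix} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ 0 & 9 & 2 & 0 & 0 \\ 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{bmatrix} = \left[\begin{array}{cc|c|cc} 3 & 4 & 0 & 0 & 0 \\ 2 & 5 & 0 & 0 & 0 \\ \hline 0 & 9 & 2 & 0 & 0 \\ \hline 0 & 5 & 0 & 6 & 7 \\ 0 & 0 & 4 & 3 & 4 \end{array}\right] \]
$M$ is a (lower) triangular block matrix; hence evaluate the determinant of each diagonal block:
\[ \begin{vmatrix} 3 & 4 \\ 2 & 5 \end{vmatrix} = 15 - 8 = 7, \quad |2| = 2, \quad \begin{vmatrix} 6 & 7 \\ 3 & 4 \end{vmatrix} = 24 - 21 = 3 \]
Thus $|M| = 7(2)(3) = 42$.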
8.13. Find the determinant of $F\colon \mathbf{R}^3 \to \mathbf{R}^3$ defined by
\[ F(x, y, z) = (x + 3y - 4z,\; 2y + 7z,\; x + 5y - 3z) \]
The determinant of a linear operator $F$ is equal to the determinant of any matrix that represents $F$. Thus
first find the matrix $A$ representing $F$ in the usual basis (whose rows, respectively, consist of the coefficients
of $x, y, z$). Then
\[ A = \begin{bmatrix} 1 & 3 & -4 \\ 0 & 2 & 7 \\ 1 & 5 & -3 \end{bmatrix} \quad\text{and so}\quad \det(F) = |A| = -6 + 21 + 0 + 8 - 35 - 0 = -12 \]
8.14. Write out $g = g(x_1, x_2, x_3, x_4)$ explicitly where
\[ g(x_1, x_2, \ldots, x_n) = \prod_{i<j} (x_i - x_j). \]
The symbol $\prod$ is used for a product of terms in the same way that the symbol $\sum$ is used for a sum of
terms. That is, $\prod_{i<j} (x_i - x_j)$ means the product of all terms $(x_i - x_j)$ for which $i < j$. Hence
\[ g = g(x_1, \ldots, x_4) = (x_1 - x_2)(x_1 - x_3)(x_1 - x_4)(x_2 - x_3)(x_2 - x_4)(x_3 - x_4) \]
8.15. Let $D$ be a 2-linear, alternating function. Show that $D(A, B) = -D(B, A)$.
Since $D$ is alternating, $D(A, A) = 0$, $D(B, B) = 0$. Hence
\[ D(A + B, A + B) = D(A, A) + D(A, B) + D(B, A) + D(B, B) = D(A, B) + D(B, A) \]
However, $D(A + B, A + B) = 0$. Hence $D(A, B) = -D(B, A)$, as required.
PERMUTATIONS
8.16. Determine the parity (sign) of the permutation $\sigma = 364152$.
Count the number of inversions. That is, for each element $k$, count the number of elements $i$ in $\sigma$ such
that $i > k$ and $i$ precedes $k$ in $\sigma$. Namely,
k = 1: 3 numbers (3, 6, 4)
k = 2: 4 numbers (3, 6, 4, 5)
k = 3: 0 numbers
k = 4: 1 number (6)
k = 5: 1 number (6)
k = 6: 0 numbers
Since 3 + 4 + 0 + 1 + 1 + 0 = 9 is odd, $\sigma$ is an odd permutation, and $\operatorname{sgn} \sigma = -1$.
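The inversion count can be automated. In the Python sketch below (function names `inversions` and `sign` are assumptions), a permutation is given in one-line notation as a list, and every pair of positions in which a larger number precedes a smaller one is counted once.

```python
def inversions(perm):
    """Number of pairs (i, k) with i > k where i precedes k in the one-line notation."""
    n = len(perm)
    return sum(1 for p in range(n) for q in range(p + 1, n) if perm[p] > perm[q])


def sign(perm):
    """+1 for an even permutation, -1 for an odd permutation."""
    return -1 if inversions(perm) % 2 else 1


sigma = [3, 6, 4, 1, 5, 2]
print(inversions(sigma), sign(sigma))  # 9 -1
```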
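8.17. Let $\sigma = 24513$ and $\tau = 41352$ be permutations in $S_5$. Find: (a) $\tau \circ \sigma$, (b) $\sigma^{-1}$.
Recall that $\sigma = 24513$ and $\tau = 41352$ are short ways of writing
\[ \sigma = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 2 & 4 & 5 & 1 & 3 \end{pmatrix} \quad\text{or}\quad \sigma(1) = 2,\ \sigma(2) = 4,\ \sigma(3) = 5,\ \sigma(4) = 1,\ \sigma(5) = 3 \]
\[ \tau = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 3 & 5 & 2 \end{pmatrix} \quad\text{or}\quad \tau(1) = 4,\ \tau(2) = 1,\ \tau(3) = 3,\ \tau(4) = 5,\ \tau(5) = 2 \]
(a) The effects of $\sigma$ and then $\tau$ on 1, 2, 3, 4, 5 are as follows:
\[ 1 \to 2 \to 1, \quad 2 \to 4 \to 5, \quad 3 \to 5 \to 2, \quad 4 \to 1 \to 4, \quad 5 \to 3 \to 3 \]
[That is, for example, $(\tau \circ \sigma)(1) = \tau(\sigma(1)) = \tau(2) = 1$.] Thus $\tau \circ \sigma = 15243$.
(b) By definition, $\sigma^{-1}(j) = k$ if and only if $\sigma(k) = j$. Hence
\[ \sigma^{-1} = \begin{pmatrix} 2 & 4 & 5 & 1 & 3 \\ 1 & 2 & 3 & 4 & 5 \end{pmatrix} = \begin{pmatrix} 1 & 2 & 3 & 4 & 5 \\ 4 & 1 & 5 & 2 & 3 \end{pmatrix} \quad\text{or}\quad \sigma^{-1} = 41523 \]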
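Composition and inversion of permutations in one-line notation are easy to express in code. The following Python sketch (the function names `compose` and `inverse` are assumptions) stores a permutation of 1, ..., n as a list whose k-th entry is the image of k.

```python
def compose(tau, sigma):
    """Return tau o sigma in one-line notation: apply sigma first, then tau."""
    return [tau[s - 1] for s in sigma]


def inverse(sigma):
    """Return the inverse permutation in one-line notation."""
    inv = [0] * len(sigma)
    for k, j in enumerate(sigma, start=1):
        inv[j - 1] = k  # sigma(k) = j, so the inverse sends j to k
    return inv


sigma = [2, 4, 5, 1, 3]
tau = [4, 1, 3, 5, 2]
print(compose(tau, sigma))  # [1, 5, 2, 4, 3]
print(inverse(sigma))       # [4, 1, 5, 2, 3]
```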
8.18. Let $\sigma = j_1 j_2 \cdots j_n$ be any permutation in $S_n$. Show that, for each inversion $(i, k)$, where $i > k$ but $i$
precedes $k$ in $\sigma$, there is a pair $(i^*, k^*)$ such that
\[ i^* < k^* \quad\text{and}\quad \sigma(i^*) > \sigma(k^*) \tag{1} \]
and vice versa. Thus $\sigma$ is even or odd according to whether there is an even or an odd number of
pairs satisfying (1).
Choose $i^*$ and $k^*$ so that $\sigma(i^*) = i$ and $\sigma(k^*) = k$. Then $i > k$ if and only if $\sigma(i^*) > \sigma(k^*)$, and $i$ precedes
$k$ in $\sigma$ if and only if $i^* < k^*$.
8.19. Consider the polynomials $g = g(x_1, \ldots, x_n)$ and $\sigma(g)$, defined by
\[ g = g(x_1, \ldots, x_n) = \prod_{i<j} (x_i - x_j) \quad\text{and}\quad \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) \]
(See Problem 8.14.) Show that $\sigma(g) = g$ when $\sigma$ is an even permutation, and $\sigma(g) = -g$ when $\sigma$ is
an odd permutation. That is, $\sigma(g) = (\operatorname{sgn} \sigma)g$.
Since $\sigma$ is one-to-one and onto,
\[ \sigma(g) = \prod_{i<j} (x_{\sigma(i)} - x_{\sigma(j)}) = \prod_{i<j \text{ or } i>j} (x_i - x_j) \]
Thus $\sigma(g) = g$ or $\sigma(g) = -g$ according to whether there is an even or an odd number of terms of the form $x_i - x_j$,
where $i > j$. Note that for each pair $(i, j)$ for which
\[ i < j \quad\text{and}\quad \sigma(i) > \sigma(j) \tag{1} \]
there is a term $(x_{\sigma(i)} - x_{\sigma(j)})$ in $\sigma(g)$ for which $\sigma(i) > \sigma(j)$. Since $\sigma$ is even if and only if there is an even
number of pairs satisfying (1), we have $\sigma(g) = g$ if and only if $\sigma$ is even. Hence $\sigma(g) = -g$ if and only if $\sigma$
is odd.
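8.20. Let $\sigma, \tau \in S_n$. Show that $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma)$. Thus the product of two even or two odd
permutations is even, and the product of an odd and an even permutation is odd.
Using Problem 8.19, we have
\[ \operatorname{sgn}(\tau \circ \sigma)\,g = (\tau \circ \sigma)(g) = \tau(\sigma(g)) = \tau((\operatorname{sgn} \sigma)g) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma)g \]
Accordingly, $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma)$.
8.21. Consider the permutation $\sigma = j_1 j_2 \cdots j_n$. Show that $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$ and, for scalars $a_{ij}$, show that
\[ a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1k_1} a_{2k_2} \cdots a_{nk_n} \]
where $\sigma^{-1} = k_1 k_2 \cdots k_n$.
We have $\sigma^{-1} \circ \sigma = \varepsilon$, the identity permutation. Since $\varepsilon$ is even, $\sigma^{-1}$ and $\sigma$ are both even or both odd.
Hence $\operatorname{sgn} \sigma^{-1} = \operatorname{sgn} \sigma$.
Since $\sigma = j_1 j_2 \cdots j_n$ is a permutation, $a_{j_1 1} a_{j_2 2} \cdots a_{j_n n} = a_{1k_1} a_{2k_2} \cdots a_{nk_n}$. Then $k_1, k_2, \ldots, k_n$ have the
property that
\[ \sigma(k_1) = 1, \quad \sigma(k_2) = 2, \quad \ldots, \quad \sigma(k_n) = n \]
Let $\tau = k_1 k_2 \cdots k_n$. Then, for $i = 1, \ldots, n$,
\[ (\sigma \circ \tau)(i) = \sigma(\tau(i)) = \sigma(k_i) = i \]
Thus $\sigma \circ \tau = \varepsilon$, the identity permutation. Hence $\tau = \sigma^{-1}$.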
PROOFS OF THEOREMS
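8.22. Prove Theorem 8.1: $|A^T| = |A|$.
If $A = [a_{ij}]$, then $A^T = [b_{ij}]$, with $b_{ij} = a_{ji}$. Hence
\[ |A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} \]
Let $\tau = \sigma^{-1}$. By Problem 8.21, $\operatorname{sgn} \tau = \operatorname{sgn} \sigma$, and $a_{\sigma(1),1} a_{\sigma(2),2} \cdots a_{\sigma(n),n} = a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)}$. Hence
\[ |A^T| = \sum_{\sigma \in S_n} (\operatorname{sgn} \tau)\, a_{1\tau(1)} a_{2\tau(2)} \cdots a_{n\tau(n)} \]
However, as $\sigma$ runs through all the elements of $S_n$, $\tau = \sigma^{-1}$ also runs through all the elements of $S_n$. Thus
$|A^T| = |A|$.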
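The permutation-sum definition of the determinant used in this proof can be evaluated directly for small matrices. The Python sketch below does so with `itertools.permutations` (0-based indices are used internally) and compares a matrix with its transpose; the function names `sign` and `leibniz_det` and the choice of example matrix are assumptions for the illustration. The sum has n! terms, so this is a check of the definition rather than a practical method.

```python
from itertools import permutations


def sign(perm):
    """+1 or -1 according to the parity of the number of inversions."""
    n = len(perm)
    inv = sum(1 for p in range(n) for q in range(p + 1, n) if perm[p] > perm[q])
    return -1 if inv % 2 else 1


def leibniz_det(a):
    """Sum over all permutations sigma of sgn(sigma) * a[0][sigma(0)] * ... * a[n-1][sigma(n-1)]."""
    n = len(a)
    total = 0
    for sigma in permutations(range(n)):
        term = sign(sigma)
        for i in range(n):
            term *= a[i][sigma[i]]
        total += term
    return total


A = [[2, 3, 4], [5, 4, 3], [1, 2, 1]]
A_T = [list(col) for col in zip(*A)]
print(leibniz_det(A), leibniz_det(A_T))  # 14 14
```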
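8.23. Prove Theorem 8.3(i): If $B$ is obtained from $A$ by interchanging two rows (columns) of $A$, then $|B| = -|A|$.
We prove the theorem for the case that two columns are interchanged. Let $\tau$ be the transposition that
interchanges the two numbers corresponding to the two columns of $A$ that are interchanged. If $A = [a_{ij}]$ and
$B = [b_{ij}]$, then $b_{ij} = a_{i\tau(j)}$. Hence, for any permutation $\sigma$,
\[ b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = a_{1,\tau\circ\sigma(1)}\, a_{2,\tau\circ\sigma(2)} \cdots a_{n,\tau\circ\sigma(n)} \]
Thus
\[ |B| = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, b_{1\sigma(1)} b_{2\sigma(2)} \cdots b_{n\sigma(n)} = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, a_{1,\tau\circ\sigma(1)}\, a_{2,\tau\circ\sigma(2)} \cdots a_{n,\tau\circ\sigma(n)} \]
Since the transposition $\tau$ is an odd permutation, $\operatorname{sgn}(\tau \circ \sigma) = (\operatorname{sgn} \tau)(\operatorname{sgn} \sigma) = -\operatorname{sgn} \sigma$. Thus
$\operatorname{sgn} \sigma = -\operatorname{sgn}(\tau \circ \sigma)$, and so
\[ |B| = -\sum_{\sigma \in S_n} [\operatorname{sgn}(\tau \circ \sigma)]\, a_{1,\tau\circ\sigma(1)}\, a_{2,\tau\circ\sigma(2)} \cdots a_{n,\tau\circ\sigma(n)} \]
But as $\sigma$ runs through all the elements of $S_n$, $\tau \circ \sigma$ also runs through all the elements of $S_n$. Hence $|B| = -|A|$.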
8.24. Prove Theorem 8.2: (i) If $A$ has a row (column) of zeros, then $|A| = 0$.
(ii) If $A$ has two identical rows (columns), then $|A| = 0$.
(iii) If $A$ is triangular, then $|A|$ = product of diagonal elements. Thus $|I| = 1$.
(i) Each term in $|A|$ contains a factor from every row, and so from the row of zeros. Thus each term of $|A|$ is
zero, and so $|A| = 0$.
(ii) Suppose $1 + 1 \neq 0$ in $K$. If we interchange the two identical rows of $A$, we still obtain the matrix $A$.
Hence, by Problem 8.23, $|A| = -|A|$, and so $|A| = 0$.
Now suppose $1 + 1 = 0$ in $K$. Then $\operatorname{sgn} \sigma = 1$ for every $\sigma \in S_n$. Since $A$ has two identical rows, we
can arrange the terms of $|A|$ into pairs of equal terms. Since each pair is 0, the determinant of $A$ is zero.
(iii) Suppose $A = [a_{ij}]$ is lower triangular, that is, the entries above the diagonal are all zero: $a_{ij} = 0$
whenever $i < j$. Consider a term $t$ of the determinant of $A$:
\[ t = (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \]
Suppose $i_1 \neq 1$. Then $1 < i_1$ and so $a_{1i_1} = 0$; hence $t = 0$. That is, each term for which $i_1 \neq 1$ is zero.
Now suppose $i_1 = 1$ but $i_2 \neq 2$. Then $2 < i_2$, and so $a_{2i_2} = 0$; hence $t = 0$. Thus each term for
which $i_1 \neq 1$ or $i_2 \neq 2$ is zero.
Similarly, we obtain that each term for which $i_1 \neq 1$ or $i_2 \neq 2$ or ... or $i_n \neq n$ is zero. Accordingly,
$|A| = a_{11} a_{22} \cdots a_{nn}$ = product of diagonal elements.
8.25. Prove Theorem 8.3: Suppose $B$ is obtained from $A$ by an elementary row (column) operation.
(i) If two rows (columns) of $A$ were interchanged, then $|B| = -|A|$.
(ii) If a row (column) of $A$ were multiplied by a scalar $k$, then $|B| = k|A|$.
(iii) If a multiple of a row (column) of $A$ were added to another row (column) of $A$, then $|B| = |A|$.
(i) This result was proved in Problem 8.23.
(ii) If the $j$th row of $A$ is multiplied by $k$, then every term in $|A|$ is multiplied by $k$, and so $|B| = k|A|$. That is,
\[ |B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots (k a_{ji_j}) \cdots a_{ni_n} = k \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n} = k|A| \]
(iii) Suppose $c$ times the $k$th row is added to the $j$th row of $A$. Using the symbol $\widehat{\ }$ to denote the $j$th position in
a determinant term, we have
\[ |B| = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots \widehat{(c\,a_{ki_j} + a_{ji_j})} \cdots a_{ni_n} \]
\[ = c \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots \widehat{a_{ki_j}} \cdots a_{ni_n} + \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ji_j} \cdots a_{ni_n} \]
The first sum is the determinant of a matrix whose $k$th and $j$th rows are identical. Accordingly, by
Theorem 8.2(ii), the sum is zero. The second sum is the determinant of $A$. Thus $|B| = c \cdot 0 + |A| = |A|$.
8.26. Prove Lemma 8.6: Let $E$ be an elementary matrix. Then $|EA| = |E||A|$.
Consider the elementary row operations: (i) Multiply a row by a constant $k \neq 0$,
(ii) Interchange two rows, (iii) Add a multiple of one row to another.
Let $E_1, E_2, E_3$ be the corresponding elementary matrices. That is, $E_1, E_2, E_3$ are obtained by applying the
above operations to the identity matrix $I$. By Problem 8.25,
\[ |E_1| = k|I| = k, \quad |E_2| = -|I| = -1, \quad |E_3| = |I| = 1 \]
Recall (Theorem 3.11) that $E_iA$ is identical to the matrix obtained by applying the corresponding operation to
$A$. Thus, by Theorem 8.3, we obtain the following, which proves our lemma:
\[ |E_1A| = k|A| = |E_1||A|, \quad |E_2A| = -|A| = |E_2||A|, \quad |E_3A| = |A| = 1|A| = |E_3||A| \]
8.27. Suppose $B$ is row equivalent to a square matrix $A$. Prove that $|B| = 0$ if and only if $|A| = 0$.
By Theorem 8.3, the effect of an elementary row operation is to change the sign of the determinant or to
multiply the determinant by a nonzero scalar. Hence $|B| = 0$ if and only if $|A| = 0$.
8.28. Prove Theorem 8.5: Let $A$ be an $n$-square matrix. Then the following are equivalent:
(i) $A$ is invertible, (ii) $AX = 0$ has only the zero solution, (iii) $\det(A) \neq 0$.
The proof is by the Gaussian algorithm. If $A$ is invertible, it is row equivalent to $I$. But $|I| \neq 0$. Hence, by
Problem 8.27, $|A| \neq 0$. If $A$ is not invertible, it is row equivalent to a matrix with a zero row. Hence
$\det(A) = 0$. Thus (i) and (iii) are equivalent.
If $AX = 0$ has only the solution $X = 0$, then $A$ is row equivalent to $I$ and $A$ is invertible. Conversely, if $A$
is invertible with inverse $A^{-1}$, then
\[ X = IX = (A^{-1}A)X = A^{-1}(AX) = A^{-1}0 = 0 \]
is the only solution of $AX = 0$. Thus (i) and (ii) are equivalent.
8.29. Prove Theorem 8.4: $|AB| = |A||B|$.
If $A$ is singular, then $AB$ is also singular, and so $|AB| = 0 = |A||B|$. On the other hand, if $A$ is
nonsingular, then $A = E_n \cdots E_2E_1$, a product of elementary matrices. Then Lemma 8.6 and induction yield
\[ |AB| = |E_n \cdots E_2E_1B| = |E_n| \cdots |E_2||E_1||B| = |A||B| \]
8.30. Suppose $P$ is invertible. Prove that $|P^{-1}| = |P|^{-1}$.
$P^{-1}P = I$. Hence $1 = |I| = |P^{-1}P| = |P^{-1}||P|$, and so $|P^{-1}| = |P|^{-1}$.
8.31. Prove Theorem 8.7: Suppose $A$ and $B$ are similar matrices. Then $|A| = |B|$.
Since $A$ and $B$ are similar, there exists an invertible matrix $P$ such that $B = P^{-1}AP$. Therefore, using
Problem 8.30, we get $|B| = |P^{-1}AP| = |P^{-1}||A||P| = |A||P^{-1}||P| = |A|$.
We remark that although the matrices $P^{-1}$ and $A$ may not commute, their determinants $|P^{-1}|$ and $|A|$ do
commute, since they are scalars in the field $K$.
8.32. Prove Theorem 8.8 (Laplace): Let $A = [a_{ij}]$, and let $A_{ij}$ denote the cofactor of $a_{ij}$. Then, for any $i$ or $j$,
\[ |A| = a_{i1}A_{i1} + \cdots + a_{in}A_{in} \quad\text{and}\quad |A| = a_{1j}A_{1j} + \cdots + a_{nj}A_{nj} \]
Since $|A| = |A^T|$, we need only prove one of the expansions, say, the first one in terms of rows of $A$. Each
term in $|A|$ contains one and only one entry of the $i$th row $(a_{i1}, a_{i2}, \ldots, a_{in})$ of $A$. Hence we can write $|A|$ in the
form
\[ |A| = a_{i1}A^*_{i1} + a_{i2}A^*_{i2} + \cdots + a_{in}A^*_{in} \]
(Note that $A^*_{ij}$ is a sum of terms involving no entry of the $i$th row of $A$.) Thus the theorem is proved if we can
show that
\[ A^*_{ij} = A_{ij} = (-1)^{i+j}|M_{ij}| \]
where $M_{ij}$ is the matrix obtained by deleting the row and column containing the entry $a_{ij}$. (Historically, the
expression $A^*_{ij}$ was defined as the cofactor of $a_{ij}$, and so the theorem reduces to showing that the two
definitions of the cofactor are equivalent.)
First we consider the case that $i = n$, $j = n$. Then the sum of terms in $|A|$ containing $a_{nn}$ is
\[ a_{nn}A^*_{nn} = a_{nn} \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} a_{2\sigma(2)} \cdots a_{n-1,\sigma(n-1)} \]
where we sum over all permutations $\sigma \in S_n$ for which $\sigma(n) = n$. However, this is equivalent (Prove!) to
summing over all permutations of $\{1, \ldots, n - 1\}$. Thus $A^*_{nn} = |M_{nn}| = (-1)^{n+n}|M_{nn}|$.
Now we consider any $i$ and $j$. We interchange the $i$th row with each succeeding row until it is last, and we
interchange the $j$th column with each succeeding column until it is last. Note that the determinant $|M_{ij}|$ is not
affected, since the relative positions of the other rows and columns are not affected by these interchanges.
However, the "sign" of $|A|$ and of $A^*_{ij}$ is changed $n - i$ and then $n - j$ times. Accordingly,
\[ A^*_{ij} = (-1)^{n-i+n-j}|M_{ij}| = (-1)^{i+j}|M_{ij}| \]
8.33. Let $A = [a_{ij}]$ and let $B$ be the matrix obtained from $A$ by replacing the $i$th row of $A$ by the row vector
$(b_{i1}, \ldots, b_{in})$. Show that
\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]
Furthermore, show that, for $j \neq i$,
\[ a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \quad\text{and}\quad a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0 \]
Let $B = [b_{ij}]$. By Theorem 8.8,
\[ |B| = b_{i1}B_{i1} + b_{i2}B_{i2} + \cdots + b_{in}B_{in} \]
Since $B_{ij}$ does not depend on the $i$th row of $B$, we get $B_{ij} = A_{ij}$ for $j = 1, \ldots, n$. Hence
\[ |B| = b_{i1}A_{i1} + b_{i2}A_{i2} + \cdots + b_{in}A_{in} \]
Now let $A'$ be obtained from $A$ by replacing the $i$th row of $A$ by the $j$th row of $A$. Since $A'$ has two identical
rows, $|A'| = 0$. Thus, by the above result,
\[ |A'| = a_{j1}A_{i1} + a_{j2}A_{i2} + \cdots + a_{jn}A_{in} = 0 \]
Using $|A^T| = |A|$, we also obtain that $a_{1j}A_{1i} + a_{2j}A_{2i} + \cdots + a_{nj}A_{ni} = 0$.
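8.34. Prove Theorem 8.9: $A(\operatorname{adj} A) = (\operatorname{adj} A)A = |A|I$.
Let $A = [a_{ij}]$ and let $A(\operatorname{adj} A) = [b_{ij}]$. The $i$th row of $A$ is
\[ (a_{i1}, a_{i2}, \ldots, a_{in}) \tag{1} \]
Since adj $A$ is the transpose of the matrix of cofactors, the $j$th column of adj $A$ is the transpose of the cofactors
of the $j$th row of $A$, that is,
\[ (A_{j1}, A_{j2}, \ldots, A_{jn})^T \tag{2} \]
Now $b_{ij}$, the $ij$ entry in $A(\operatorname{adj} A)$, is obtained by multiplying expressions (1) and (2):
\[ b_{ij} = a_{i1}A_{j1} + a_{i2}A_{j2} + \cdots + a_{in}A_{jn} \]
By Theorem 8.8 and Problem 8.33,
\[ b_{ij} = \begin{cases} |A| & \text{if } i = j \\ 0 & \text{if } i \neq j \end{cases} \]
Accordingly, $A(\operatorname{adj} A)$ is the diagonal matrix with each diagonal element $|A|$. In other words, $A(\operatorname{adj} A) = |A|I$.
Similarly, $(\operatorname{adj} A)A = |A|I$.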
8.35. Prove Theorem 8.10 (Cramer's rule): The (square) system $AX = B$ has a unique solution if and only
if $D \neq 0$. In this case, $x_i = N_i/D$ for each $i$.
By previous results, $AX = B$ has a unique solution if and only if $A$ is invertible, and $A$ is invertible if and
only if $D = |A| \neq 0$.
Now suppose $D \neq 0$. By Theorem 8.9, $A^{-1} = (1/D)(\operatorname{adj} A)$. Multiplying $AX = B$ by $A^{-1}$, we obtain
\[ X = A^{-1}AX = (1/D)(\operatorname{adj} A)B \tag{1} \]
Note that the $i$th row of $(1/D)(\operatorname{adj} A)$ is $(1/D)(A_{1i}, A_{2i}, \ldots, A_{ni})$. If $B = (b_1, b_2, \ldots, b_n)^T$, then, by (1),
\[ x_i = (1/D)(b_1A_{1i} + b_2A_{2i} + \cdots + b_nA_{ni}) \]
However, as in Problem 8.33, $b_1A_{1i} + b_2A_{2i} + \cdots + b_nA_{ni} = N_i$, the determinant of the matrix obtained by
replacing the $i$th column of $A$ by the column vector $B$. Thus $x_i = (1/D)N_i$, as required.
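8.36. Prove Theorem 8.12: Suppose $M$ is an upper (lower) triangular block matrix with diagonal blocks
$A_1, A_2, \ldots, A_n$. Then
\[ \det(M) = \det(A_1)\det(A_2) \cdots \det(A_n) \]
We need only prove the theorem for $n = 2$, that is, when $M$ is a square matrix of the form $M = \begin{bmatrix} A & C \\ 0 & B \end{bmatrix}$.
The proof of the general theorem follows easily by induction.
Suppose $A = [a_{ij}]$ is $r$-square, $B = [b_{ij}]$ is $s$-square, and $M = [m_{ij}]$ is $n$-square, where $n = r + s$. By
definition,
\[ \det(M) = \sum_{\sigma \in S_n} (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} \]
If $i > r$ and $j \le r$, then $m_{ij} = 0$. Thus we need only consider those permutations $\sigma$ such that
\[ \sigma\{r + 1, r + 2, \ldots, r + s\} = \{r + 1, r + 2, \ldots, r + s\} \quad\text{and}\quad \sigma\{1, 2, \ldots, r\} = \{1, 2, \ldots, r\} \]
Let $\sigma_1(k) = \sigma(k)$ for $k \le r$, and let $\sigma_2(k) = \sigma(r + k) - r$ for $k \le s$. Then
\[ (\operatorname{sgn} \sigma)\, m_{1\sigma(1)} m_{2\sigma(2)} \cdots m_{n\sigma(n)} = (\operatorname{sgn} \sigma_1)\, a_{1\sigma_1(1)} a_{2\sigma_1(2)} \cdots a_{r\sigma_1(r)}\, (\operatorname{sgn} \sigma_2)\, b_{1\sigma_2(1)} b_{2\sigma_2(2)} \cdots b_{s\sigma_2(s)} \]
which implies $\det(M) = \det(A)\det(B)$.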
8.37. Prove Theorem 8.14: There exists a unique function $D\colon M \to K$ such that:
(i) $D$ is multilinear, (ii) $D$ is alternating, (iii) $D(I) = 1$.
This function $D$ is the determinant function; that is, $D(A) = |A|$.
Let $D$ be the determinant function, $D(A) = |A|$. We must show that $D$ satisfies (i), (ii), and (iii), and that $D$
is the only function satisfying (i), (ii), and (iii).
By Theorem 8.2, $D$ satisfies (ii) and (iii). Hence we show that it is multilinear. Suppose the $i$th row of
$A = [a_{ij}]$ has the form $(b_{i1} + c_{i1}, b_{i2} + c_{i2}, \ldots, b_{in} + c_{in})$. Then
\[ D(A) = D(A_1, \ldots, B_i + C_i, \ldots, A_n) \]
\[ = \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots a_{i-1,\sigma(i-1)} (b_{i\sigma(i)} + c_{i\sigma(i)}) \cdots a_{n\sigma(n)} \]
\[ = \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots b_{i\sigma(i)} \cdots a_{n\sigma(n)} + \sum_{S_n} (\operatorname{sgn} \sigma)\, a_{1\sigma(1)} \cdots c_{i\sigma(i)} \cdots a_{n\sigma(n)} \]
\[ = D(A_1, \ldots, B_i, \ldots, A_n) + D(A_1, \ldots, C_i, \ldots, A_n) \]
Also, by Theorem 8.3(ii),
\[ D(A_1, \ldots, kA_i, \ldots, A_n) = kD(A_1, \ldots, A_i, \ldots, A_n) \]
Thus $D$ is multilinear, i.e., $D$ satisfies (i).
We next must prove the uniqueness of $D$. Suppose $D$ satisfies (i), (ii), and (iii). If $\{e_1, \ldots, e_n\}$ is the usual
basis of $K^n$, then, by (iii), $D(e_1, e_2, \ldots, e_n) = D(I) = 1$. Using (ii), we also have that
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \operatorname{sgn} \sigma, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \tag{1} \]
Now suppose $A = [a_{ij}]$. Observe that the $k$th row $A_k$ of $A$ is
\[ A_k = (a_{k1}, a_{k2}, \ldots, a_{kn}) = a_{k1}e_1 + a_{k2}e_2 + \cdots + a_{kn}e_n \]
Thus
\[ D(A) = D(a_{11}e_1 + \cdots + a_{1n}e_n,\ a_{21}e_1 + \cdots + a_{2n}e_n,\ \ldots,\ a_{n1}e_1 + \cdots + a_{nn}e_n) \]
Using the multilinearity of $D$, we can write $D(A)$ as a sum of terms of the form
\[ D(A) = \sum D(a_{1i_1}e_{i_1}, a_{2i_2}e_{i_2}, \ldots, a_{ni_n}e_{i_n}) = \sum (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) \tag{2} \]
where the sum is summed over all sequences $i_1 i_2 \cdots i_n$, where $i_k \in \{1, \ldots, n\}$. If two of the indices are equal,
say $i_j = i_k$ but $j \neq k$, then, by (ii),
\[ D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = 0 \]
Accordingly, the sum in (2) need only be summed over all permutations $\sigma = i_1 i_2 \cdots i_n$. Using (1), we finally
have that
\[ D(A) = \sum_{\sigma} (a_{1i_1} a_{2i_2} \cdots a_{ni_n})\, D(e_{i_1}, e_{i_2}, \ldots, e_{i_n}) = \sum_{\sigma} (\operatorname{sgn} \sigma)\, a_{1i_1} a_{2i_2} \cdots a_{ni_n}, \quad\text{where } \sigma = i_1 i_2 \cdots i_n \]
Hence $D$ is the determinant function, and so the theorem is proved.
Supplementary Problems
COMPUTATION OF DETERMINANTS
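8.38. Evaluate:
\[ \text{(a) } \begin{vmatrix} 2 & 6 \\ 4 & 1 \end{vmatrix}, \quad \text{(b) } \begin{vmatrix} 5 & 1 \\ 3 & -2 \end{vmatrix}, \quad \text{(c) } \begin{vmatrix} -2 & 8 \\ -5 & -3 \end{vmatrix}, \quad \text{(d) } \begin{vmatrix} 4 & 9 \\ 1 & -3 \end{vmatrix}, \quad \text{(e) } \begin{vmatrix} a+b & a \\ b & a+b \end{vmatrix} \]
8.39. Find all $t$ such that: (a) $\begin{vmatrix} t-4 & 3 \\ 2 & t-9 \end{vmatrix} = 0$, (b) $\begin{vmatrix} t-1 & 4 \\ 3 & t-2 \end{vmatrix} = 0$
8.40. Compute the determinant of each of the following matrices:
\[ \text{(a) } \begin{bmatrix} 2 & 1 & 1 \\ 0 & 5 & -2 \\ 1 & -3 & 4 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 3 & -2 & -4 \\ 2 & 5 & -1 \\ 0 & 6 & 1 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} -2 & -1 & 4 \\ 6 & -3 & -2 \\ 4 & 1 & 2 \end{bmatrix}, \quad \text{(d) } \begin{bmatrix} 7 & 6 & 5 \\ 1 & 2 & 1 \\ 3 & -2 & 1 \end{bmatrix} \]
8.41. Find the determinant of each of the following matrices:
\[ \text{(a) } \begin{bmatrix} 1 & 2 & 2 & 3 \\ 1 & 0 & -2 & 0 \\ 3 & -1 & 1 & -2 \\ 4 & -3 & 0 & 2 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 2 & 1 & 3 & 2 \\ 3 & 0 & 1 & -2 \\ 1 & -1 & 4 & 3 \\ 2 & 2 & -1 & 1 \end{bmatrix} \]
8.42. Evaluate:
\[ \text{(a) } \begin{vmatrix} 2 & -1 & 3 & -4 \\ 2 & 1 & -2 & 1 \\ 3 & 3 & -5 & 4 \\ 5 & 2 & -1 & 4 \end{vmatrix}, \quad \text{(b) } \begin{vmatrix} 2 & -1 & 4 & -3 \\ -1 & 1 & 0 & 2 \\ 3 & 2 & 3 & -1 \\ 1 & -2 & 2 & -3 \end{vmatrix}, \quad \text{(c) } \begin{vmatrix} 1 & -2 & 3 & -1 \\ 1 & 1 & -2 & 0 \\ 2 & 0 & 4 & -5 \\ 1 & 4 & 4 & -6 \end{vmatrix} \]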
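8.43. Evaluate each of the following determinants:
\[ \text{(a) } \begin{vmatrix} 1 & 2 & -1 & 3 & 1 \\ 2 & -1 & 1 & -2 & 3 \\ 3 & 1 & 0 & 2 & -1 \\ 5 & 1 & 2 & -3 & 4 \\ -2 & 3 & -1 & 1 & -2 \end{vmatrix}, \quad \text{(b) } \begin{vmatrix} 1 & 3 & 5 & 7 & 9 \\ 2 & 4 & 2 & 4 & 2 \\ 0 & 0 & 1 & 2 & 3 \\ 0 & 0 & 5 & 6 & 2 \\ 0 & 0 & 2 & 3 & 1 \end{vmatrix}, \quad \text{(c) } \begin{vmatrix} 1 & 2 & 3 & 4 & 5 \\ 5 & 4 & 3 & 2 & 1 \\ 0 & 0 & 6 & 5 & 1 \\ 0 & 0 & 0 & 7 & 4 \\ 0 & 0 & 0 & 2 & 3 \end{vmatrix} \]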
COFACTORS, CLASSICAL ADJOINTS, INVERSES
8.44. Find $\det(A)$, adj $A$, and $A^{-1}$, where:
\[ \text{(a) } A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 1 & 1 \\ 0 & 2 & 1 \end{bmatrix}, \quad \text{(b) } A = \begin{bmatrix} 1 & 2 & 2 \\ 3 & 1 & 0 \\ 1 & 1 & 1 \end{bmatrix} \]
8.45. Find the classical adjoint of each matrix in Problem 8.41.
8.46. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. (a) Find adj $A$, (b) Show that adj(adj $A$) = $A$, (c) When does $A$ = adj $A$?
8.47. Show that if $A$ is diagonal (triangular) then adj $A$ is diagonal (triangular).
8.48. Suppose $A = [a_{ij}]$ is triangular. Show that:
(a) $A$ is invertible if and only if each diagonal element $a_{ii} \neq 0$.
(b) The diagonal elements of $A^{-1}$ (if it exists) are $a_{ii}^{-1}$, the reciprocals of the diagonal elements of $A$.
MINORS, PRINCIPAL MINORS
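8.49. Let $A = \begin{bmatrix} 1 & 2 & 3 & 2 \\ 1 & 0 & -2 & 3 \\ 3 & -1 & 2 & 5 \\ 4 & -3 & 0 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 3 & -1 & 5 \\ 2 & -3 & 1 & 4 \\ 0 & -5 & 2 & 1 \\ 3 & 0 & 5 & -2 \end{bmatrix}$. Find the minor and the signed minor
corresponding to the following submatrices:
(a) $A(1, 4;\ 3, 4)$, (b) $B(1, 4;\ 3, 4)$, (c) $A(2, 3;\ 2, 4)$, (d) $B(2, 3;\ 2, 4)$.
8.50. For $k = 1, 2, 3$, find the sum $S_k$ of all principal minors of order $k$ for:
\[ \text{(a) } A = \begin{bmatrix} 1 & 3 & 2 \\ 2 & -4 & 3 \\ 5 & -2 & 1 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 5 & -4 \\ 2 & 6 & 1 \\ 3 & -2 & 0 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & -4 & 3 \\ 2 & 1 & 5 \\ 4 & -7 & 11 \end{bmatrix} \]
8.51. For $k = 1, 2, 3, 4$, find the sum $S_k$ of all principal minors of order $k$ for:
\[ \text{(a) } A = \begin{bmatrix} 1 & 2 & 3 & -1 \\ 1 & -2 & 0 & 5 \\ 0 & 1 & -2 & 2 \\ 4 & 0 & -1 & -3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 1 & 2 & 3 \\ 1 & 3 & 0 & 4 \\ 2 & 7 & 4 & 5 \end{bmatrix} \]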
DETERMINANTS AND LINEAR EQUATIONS
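8.52. Solve the following systems by determinants:
\[ \text{(a) } \begin{cases} 3x + 5y = 8 \\ 4x - 2y = 1 \end{cases}, \quad \text{(b) } \begin{cases} 2x - 3y = -1 \\ 4x + 7y = -1 \end{cases}, \quad \text{(c) } \begin{cases} ax - 2by = c \\ 3ax - 5by = 2c \end{cases} \ (ab \neq 0) \]
8.53. Solve the following systems by determinants:
\[ \text{(a) } \begin{cases} 2x - 5y + 2z = 7 \\ x + 2y - 4z = 3 \\ 3x - 4y - 6z = 5 \end{cases}, \quad \text{(b) } \begin{cases} 2z + 3 = y + 3x \\ x - 3z = 2y + 1 \\ 3y + z = 2 - 2x \end{cases} \]
8.54. Prove Theorem 8.11: The system $AX = 0$ has a nonzero solution if and only if $D = |A| = 0$.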
PERMUTATIONS
8.55. Find the parity of the permutations $\sigma = 32154$, $\tau = 13524$, $\pi = 42531$ in $S_5$.
8.56. For the permutations in Problem 8.55, find:
(a) $\tau \circ \sigma$, (b) $\pi \circ \sigma$, (c) $\sigma^{-1}$, (d) $\tau^{-1}$.
8.57. Let $\tau \in S_n$. Show that $\tau \circ \sigma$ runs through $S_n$ as $\sigma$ runs through $S_n$, that is, $S_n = \{\tau \circ \sigma : \sigma \in S_n\}$.
8.58. Let $\sigma \in S_n$ have the property that $\sigma(n) = n$. Let $\sigma^* \in S_{n-1}$ be defined by $\sigma^*(x) = \sigma(x)$.
(a) Show that $\operatorname{sgn} \sigma^* = \operatorname{sgn} \sigma$,
(b) Show that as $\sigma$ runs through $S_n$, where $\sigma(n) = n$, $\sigma^*$ runs through $S_{n-1}$, that is,
$S_{n-1} = \{\sigma^* : \sigma \in S_n,\ \sigma(n) = n\}$.
8.59. Consider a permutation $\sigma = j_1 j_2 \cdots j_n$. Let $\{e_i\}$ be the usual basis of $K^n$, and let $A$ be the matrix whose $i$th row
is $e_{j_i}$, i.e., $A = (e_{j_1}, e_{j_2}, \ldots, e_{j_n})$. Show that $|A| = \operatorname{sgn} \sigma$.
DETERMINANT OF LINEAR OPERATORS
8.60. Find the determinant of each of the following linear transformations:
(a) $T\colon \mathbf{R}^2 \to \mathbf{R}^2$ defined by $T(x, y) = (2x - 9y,\ 3x - 5y)$,
(b) $T\colon \mathbf{R}^3 \to \mathbf{R}^3$ defined by $T(x, y, z) = (3x - 2z,\ 5y + 7z,\ x + y + z)$,
(c) $T\colon \mathbf{R}^3 \to \mathbf{R}^2$ defined by $T(x, y, z) = (2x + 7y - 4z,\ 4x - 6y + 2z)$.
8.61. Let $D\colon V \to V$ be the differential operator, that is, $D(f(t)) = df/dt$. Find $\det(D)$ if $V$ is the vector space of
functions with the following bases: (a) $\{1, t, \ldots, t^5\}$, (b) $\{e^t, e^{2t}, e^{3t}\}$, (c) $\{\sin t, \cos t\}$.
8.62. Prove Theorem 8.13: Let $F$ and $G$ be linear operators on a vector space $V$. Then:
(i) $\det(F \circ G) = \det(F)\det(G)$, (ii) $F$ is invertible if and only if $\det(F) \neq 0$.
8.63. Prove: (a) $\det(\mathbf{1}_V) = 1$, where $\mathbf{1}_V$ is the identity operator, (b) $\det(T^{-1}) = \det(T)^{-1}$ when $T$ is invertible.
MISCELLANEOUS PROBLEMS
8.64. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^3$ determined by the following vectors:
(a) $u_1 = (1, 2, -3)$, $u_2 = (3, 4, -1)$, $u_3 = (2, -1, 5)$,
(b) $u_1 = (1, 1, 3)$, $u_2 = (1, -2, -4)$, $u_3 = (4, 1, 2)$.
8.65. Find the volume $V(S)$ of the parallelepiped $S$ in $\mathbf{R}^4$ determined by the following vectors:
\[ u_1 = (1, -2, 5, -1), \quad u_2 = (2, 1, -2, 1), \quad u_3 = (3, 0, 1, -2), \quad u_4 = (1, -1, 4, -1) \]
8.66. Let $V$ be the space of $2 \times 2$ matrices $M = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ over $\mathbf{R}$. Determine whether $D\colon V \to \mathbf{R}$ is 2-linear (with
respect to the rows) where:
(a) $D(M) = a + d$, (c) $D(M) = ac - bd$, (e) $D(M) = 0$
(b) $D(M) = ad$, (d) $D(M) = ab - cd$, (f) $D(M) = 1$
8.67. Let $A$ be an $n$-square matrix. Prove $|kA| = k^n|A|$.
8.68. Let $A, B, C, D$ be commuting $n$-square matrices. Consider the $2n$-square block matrix $M = \begin{bmatrix} A & B \\ C & D \end{bmatrix}$. Prove
that $|M| = |A||D| - |B||C|$. Show that the result may not be true if the matrices do not commute.
8.69. Suppose $A$ is orthogonal, that is, $A^TA = I$. Show that $\det(A) = \pm 1$.
8.70. Let $V$ be the space of $m$-square matrices viewed as $m$-tuples of row vectors. Suppose $D\colon V \to K$ is $m$-linear
and alternating. Show that:
(a) $D(\ldots, A, \ldots, B, \ldots) = -D(\ldots, B, \ldots, A, \ldots)$; sign changed when two rows are interchanged.
(b) If $A_1, A_2, \ldots, A_m$ are linearly dependent, then $D(A_1, A_2, \ldots, A_m) = 0$.
8.71. Let $V$ be the space of $m$-square matrices (as above), and suppose $D\colon V \to K$. Show that the following weaker
statement is equivalent to $D$ being alternating:
\[ D(A_1, A_2, \ldots, A_m) = 0 \quad\text{whenever } A_i = A_{i+1} \text{ for some } i \]
8.72. Let $V$ be the space of $n$-square matrices over $K$. Suppose $B \in V$ is invertible and so $\det(B) \neq 0$. Define
$D\colon V \to K$ by $D(A) = \det(AB)/\det(B)$, where $A \in V$. Hence
\[ D(A_1, A_2, \ldots, A_n) = \det(A_1B, A_2B, \ldots, A_nB)/\det(B) \]
where $A_i$ is the $i$th row of $A$, and so $A_iB$ is the $i$th row of $AB$. Show that $D$ is multilinear and alternating, and
that $D(I) = 1$. (This method is used by some texts to prove that $|AB| = |A||B|$.)
8.73. Show that $g = g(x_1, \ldots, x_n) = (-1)^n V_{n-1}(x)$ where $g = g(x_i)$ is the difference product in Problem 8.19,
$x = x_n$, and $V_{n-1}$ is the Vandermonde determinant defined by
\[ V_{n-1}(x) = \begin{vmatrix} 1 & 1 & \cdots & 1 & 1 \\ x_1 & x_2 & \cdots & x_{n-1} & x \\ x_1^2 & x_2^2 & \cdots & x_{n-1}^2 & x^2 \\ \vdots & \vdots & & \vdots & \vdots \\ x_1^{n-1} & x_2^{n-1} & \cdots & x_{n-1}^{n-1} & x^{n-1} \end{vmatrix} \]
8.74. Let $A$ be any matrix. Show that the signs of a minor $A[I, J]$ and its complementary minor $A[I', J']$ are equal.
8.75. Let $A$ be an $n$-square matrix. The determinantal rank of $A$ is the order of the largest square submatrix of $A$
(obtained by deleting rows and columns of $A$) whose determinant is not zero. Show that the determinantal rank
of $A$ is equal to its rank, i.e., the maximum number of linearly independent rows (or columns).
Answers to Supplementary Problems
Notation: M = [R1; R2; ...] denotes a matrix with rows R1, R2, ...
8.38. (a) -22, (b) -13, (c) 46, (d) -21, (e) $a^2 + ab + b^2$
8.39. (a) 3, 10, (b) 5, -2
8.40. (a) 21, (b) -11, (c) 100, (d) 0
8.41. (a) -131, (b) -55
8.42. (a) 33, (b) 0, (c) 45
8.43. (a) -12, (b) -42, (c) -468
8.44. (a) |A| = -2, adj A = [-1, -1, 1; -1, 1, -1; 2, -2, 0],
(b) |A| = -1, adj A = [1, 0, -2; -3, -1, 6; 2, 1, -5]. Also, $A^{-1}$ = (adj A)/|A|
8.45. (a) [-16, -29, -26, -2; -30, -38, -16, 29; -8, 51, -13, -1; -13, 1, 28, -U
(b) [21, -14, -17, -19; -44, 11, 33, 11; -29, 1, 13, 21; 17, 7, -19, -18]
8.46. (a) adj A = [d, -b; -c, a], (c) A = kI
8.49. (a) -3, -3, (b) -23, -23, (c) 3, -3, (d) \1,-\1
8.50. (a) -2, -17, 73, (b) 7, 10, 105, (c) 13, 54, 0
8.51. (a) 6, 13, 62, -219; (b) 7, -39, 29, 20
8.52. (a) ^ = fe^y = fe^ (b) x=-^,,y = ^,, (c) x=-g,j;=-f
8.53. (a) x = 5, y = 1, z = 1, (b) since D = 0, the system cannot be solved by determinants
8.55. (a) $\operatorname{sgn} \sigma = 1$, $\operatorname{sgn} \tau = -1$, $\operatorname{sgn} \pi = -1$
8.56. (a) $\tau \circ \sigma = 53142$, (b) $\pi \circ \sigma = 52413$, (c) $\sigma^{-1} = 32154$, (d) $\tau^{-1} = 14253$
8.60. (a) det(T) = 17, (b) det(T) = 4, (c) not defined
8.61. (a) 0, (b) 6, (c) 1
8.64. (a) 30, (b) 0
8.65. 17
8.66. (a) no, (b) yes, (c) yes, (d) no, (e) yes, (f) no
CHAPTER 9
Diagonalization:
Eigenvalues and
Eigenvectors
9.1 INTRODUCTION
The ideas in this chapter can be discussed from two points of view.
Matrix Point of View
Suppose an n-square matrix A is given. The matrix A is said to be diagonalizable if there exists a
nonsingular matrix P such that
\[ B = P^{-1}AP \]
is diagonal. This chapter discusses the diagonalization of a matrix A. In particular, an algorithm is given to
find the matrix P when it exists.
Linear Operator Point of View
Suppose a linear operator $T: V \to V$ is given. The linear operator T is said to be diagonalizable if
there exists a basis S of V such that the matrix representation of T relative to the basis S is a diagonal
matrix D. This chapter discusses conditions under which the linear operator T is diagonalizable.
Equivalence of the Two Points of View
The above two concepts are essentially the same. Specifically, a square matrix A may be viewed as a
linear operator F defined by
\[ F(X) = AX \]
where X is a column vector, and $B = P^{-1}AP$ represents F relative to a new coordinate system (basis) S
whose elements are the columns of P. On the other hand, any linear operator T can be represented by a
matrix A relative to one basis and, when a second basis is chosen, T is represented by the matrix
\[ B = P^{-1}AP \]
where P is the change-of-basis matrix.
Most theorems will be stated in two ways: once in terms of matrices A and again in terms of linear
mappings T.
Role of Underlying Field K
The underlying number field K did not play any special role in our previous discussions on vector
spaces and linear mappings. However, the diagonalization of a matrix A or a linear operator T will depend
on the roots of a polynomial $\Delta(t)$ over K, and these roots do depend on K. For example, suppose
$\Delta(t) = t^2 + 1$. Then $\Delta(t)$ has no roots if $K = \mathbf{R}$, the real field; but $\Delta(t)$ has roots $\pm i$ if $K = \mathbf{C}$, the complex
field. Furthermore, finding the roots of a polynomial with degree greater than two is a subject unto itself
(frequently discussed in courses in Numerical Analysis). Accordingly, our examples will usually lead to
those polynomials $\Delta(t)$ whose roots can be easily determined.
9.2 POLYNOMIALS OF MATRICES
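Consider a polynomial $f(t) = a_nt^n + \cdots + a_1t + a_0$ over a field K. Recall (Section 2.8) that if A is any
square matrix, then we define
\[ f(A) = a_nA^n + \cdots + a_1A + a_0I \]
where I is the identity matrix. In particular, we say that A is a root of f(t) if f(A) = 0, the zero matrix.
Example 9.1. Let $A = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$. Then $A^2 = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix}$. If
\[ f(t) = 2t^2 - 3t + 5 \quad \text{and} \quad g(t) = t^2 - 5t - 2 \]
then
\[ f(A) = 2A^2 - 3A + 5I = \begin{bmatrix} 14 & 20 \\ 30 & 44 \end{bmatrix} + \begin{bmatrix} -3 & -6 \\ -9 & -12 \end{bmatrix} + \begin{bmatrix} 5 & 0 \\ 0 & 5 \end{bmatrix} = \begin{bmatrix} 16 & 14 \\ 21 & 37 \end{bmatrix} \]
and
\[ g(A) = A^2 - 5A - 2I = \begin{bmatrix} 7 & 10 \\ 15 & 22 \end{bmatrix} + \begin{bmatrix} -5 & -10 \\ -15 & -20 \end{bmatrix} + \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]
Thus A is a zero of g(t).
The following theorem (proved in Problem 9.7) applies.
Theorem 9.1: Let f and g be polynomials. For any square matrix A and scalar k,
(i) $(f + g)(A) = f(A) + g(A)$ (iii) $(kf)(A) = kf(A)$
(ii) $(fg)(A) = f(A)g(A)$ (iv) $f(A)g(A) = g(A)f(A)$.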
Observe that (iv) tells us that any two polynomials in A commute.
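The evaluation of a polynomial at a matrix can be sketched in a few lines of Python. This is only an illustration, not part of the text: the helper names `mat_mul` and `poly_at_matrix` are made up here, matrices are stored as lists of rows, and Horner's rule is used in place of computing each power of A separately. The data are the matrix and the polynomials f and g of Example 9.1.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at_matrix(coeffs, A):
    """Evaluate f(A) by Horner's rule; coeffs = [a_n, ..., a_1, a_0]."""
    n = len(A)
    result = [[0] * n for _ in range(n)]
    for c in coeffs:
        result = mat_mul(result, A)      # multiply the running value by A
        for i in range(n):
            result[i][i] += c            # then add c * I
    return result

A = [[1, 2], [3, 4]]
f_A = poly_at_matrix([2, -3, 5], A)      # f(t) = 2t^2 - 3t + 5
g_A = poly_at_matrix([1, -5, -2], A)     # g(t) = t^2 - 5t - 2
print(f_A)
print(g_A)
```

Here `g_A` comes out as the zero matrix, which is the statement that A is a zero of g(t).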
Matrices and Linear Operators
Now suppose that $T: V \to V$ is a linear operator on a vector space V. Powers of T are defined by the
composition operation, that is,
\[ T^2 = T \circ T, \quad T^3 = T^2 \circ T, \quad \ldots \]
Also, for any polynomial $f(t) = a_nt^n + \cdots + a_1t + a_0$, we define f(T) in the same way as we did for
matrices; that is,
\[ f(T) = a_nT^n + \cdots + a_1T + a_0I \]
where I is now the identity mapping. We also say that T is a zero or root of f(t) if f(T) = 0, the zero
mapping. We note that the relations in Theorem 9.1 hold for linear operators as they do for matrices.
Remark: Suppose A is a matrix representation of a linear operator T. Then f(A) is the matrix
representation of f(T), and, in particular, f(T) = 0 if and only if f(A) = 0.
9.3 CHARACTERISTIC POLYNOMIAL, CAYLEY-HAMILTON THEOREM
Let $A = [a_{ij}]$ be an n-square matrix. The matrix $M = A - tI_n$, where $I_n$ is the n-square identity matrix
and t is an indeterminate, may be obtained by subtracting t down the diagonal of A. The negative of M is
the matrix $tI_n - A$, and its determinant
\[ \Delta(t) = \det(tI_n - A) = (-1)^n \det(A - tI_n) \]
which is a polynomial in t of degree n, is called the characteristic polynomial of A.
We state an important theorem in linear algebra (proved in Problem 9.8).
Theorem 9.2: (Cayley-Hamilton) Every matrix A is a root of its characteristic polynomial.
Remark: Suppose $A = [a_{ij}]$ is a triangular matrix. Then $tI - A$ is a triangular matrix with diagonal
entries $t - a_{ii}$; and hence
\[ \Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn}) \]
Observe that the roots of $\Delta(t)$ are the diagonal elements of A.
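Example 9.2. Let $A = \begin{bmatrix} 1 & 3 \\ 4 & 5 \end{bmatrix}$. Its characteristic polynomial is
\[ \Delta(t) = |tI - A| = \begin{vmatrix} t - 1 & -3 \\ -4 & t - 5 \end{vmatrix} = (t - 1)(t - 5) - 12 = t^2 - 6t - 7 \]
As expected from the Cayley-Hamilton Theorem, A is a root of $\Delta(t)$; that is,
\[ \Delta(A) = A^2 - 6A - 7I = \begin{bmatrix} 13 & 18 \\ 24 & 37 \end{bmatrix} + \begin{bmatrix} -6 & -18 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} -7 & 0 \\ 0 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} \]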
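The Cayley-Hamilton check of Example 9.2 can be reproduced numerically. The sketch below is an illustration only: the function names are made up, and it uses the degree-2 formula $t^2 - \operatorname{tr}(A)t + \det(A)$ stated later in this section rather than expanding the determinant $|tI - A|$.

```python
def char_poly_2x2(A):
    """Coefficients [1, -tr(A), det(A)] of t^2 - tr(A) t + det(A)."""
    (a, b), (c, d) = A
    return [1, -(a + d), a * d - b * c]

def cayley_hamilton_residual(A):
    """Return A^2 - tr(A) A + det(A) I, which should be the zero matrix."""
    (a, b), (c, d) = A
    _, p, q = char_poly_2x2(A)
    A2 = [[a * a + b * c, a * b + b * d],
          [c * a + d * c, c * b + d * d]]        # A squared, entry by entry
    return [[A2[0][0] + p * a + q, A2[0][1] + p * b],
            [A2[1][0] + p * c, A2[1][1] + p * d + q]]

A = [[1, 3], [4, 5]]
coeffs = char_poly_2x2(A)                # coefficients of t^2 - 6t - 7
residual = cayley_hamilton_residual(A)
print(coeffs, residual)
```

The residual is zero for every 2-square matrix, not only for this one, which is what Theorem 9.2 asserts in the case n = 2.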
Now suppose A and B are similar matrices, say $B = P^{-1}AP$, where P is invertible. We show that A and
B have the same characteristic polynomial. Using $tI = P^{-1}tIP$, we have
\[ \Delta_B(t) = \det(tI - B) = \det(tI - P^{-1}AP) = \det(P^{-1}tIP - P^{-1}AP) \]
\[ = \det[P^{-1}(tI - A)P] = \det(P^{-1}) \det(tI - A) \det(P) \]
Using the fact that determinants are scalars and commute and that $\det(P^{-1})\det(P) = 1$, we finally obtain
\[ \Delta_B(t) = \det(tI - A) = \Delta_A(t) \]
Thus we have proved the following theorem.
Theorem 9.3: Similar matrices have the same characteristic polynomial.
Characteristic Polynomials of Degree 2 and 3
There are simple formulas for the characteristic polynomials of matrices of orders 2 and 3.
(a) Suppose $A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$. Then
\[ \Delta(t) = t^2 - (a_{11} + a_{22})t + \det(A) = t^2 - \operatorname{tr}(A)\,t + \det(A) \]
Here tr(A) denotes the trace of A, that is, the sum of the diagonal elements of A.
(b) Suppose $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \\ a_{31} & a_{32} & a_{33} \end{bmatrix}$. Then
\[ \Delta(t) = t^3 - \operatorname{tr}(A)\,t^2 + (A_{11} + A_{22} + A_{33})t - \det(A) \]
(Here $A_{11}$, $A_{22}$, $A_{33}$ denote, respectively, the cofactors of $a_{11}$, $a_{22}$, $a_{33}$.)
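Example 9.3. Find the characteristic polynomial of each of the following matrices:
(a) $A = \begin{bmatrix} 5 & 3 \\ 2 & 10 \end{bmatrix}$, (b) $B = \begin{bmatrix} 7 & -1 \\ 6 & 2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 5 & -2 \\ 4 & -4 \end{bmatrix}$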
(a) We have tr(A) = 5 + 10 = 15 and |A| = 50 - 6 = 44; hence $\Delta(t) = t^2 - 15t + 44$.
(b) We have tr(B) = 7 + 2 = 9 and |B| = 14 + 6 = 20; hence $\Delta(t) = t^2 - 9t + 20$.
(c) We have tr(C) = 5 - 4 = 1 and |C| = -20 + 8 = -12; hence $\Delta(t) = t^2 - t - 12$.
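Example 9.4. Find the characteristic polynomial of $A = \begin{bmatrix} 1 & 1 & 2 \\ 0 & 3 & 2 \\ 1 & 3 & 9 \end{bmatrix}$.
We have tr(A) = 1 + 3 + 9 = 13. The cofactors of the diagonal elements are as follows:
\[ A_{11} = \begin{vmatrix} 3 & 2 \\ 3 & 9 \end{vmatrix} = 21, \quad A_{22} = \begin{vmatrix} 1 & 2 \\ 1 & 9 \end{vmatrix} = 7, \quad A_{33} = \begin{vmatrix} 1 & 1 \\ 0 & 3 \end{vmatrix} = 3 \]
Thus $A_{11} + A_{22} + A_{33} = 31$. Also, |A| = 27 + 2 + 0 - 6 - 6 - 0 = 17. Accordingly,
\[ \Delta(t) = t^3 - 13t^2 + 31t - 17 \]
Remark: The coefficients of the characteristic polynomial $\Delta(t)$ of the 3-square matrix A are, with
alternating signs, as follows:
\[ S_1 = \operatorname{tr}(A), \quad S_2 = A_{11} + A_{22} + A_{33}, \quad S_3 = \det(A) \]
We note that each $S_k$ is the sum of all principal minors of A of order k.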
The next theorem, whose proof lies beyond the scope of this text, tells us that this result is true in
general.
Theorem 9.4: Let A be an n-square matrix. Then its characteristic polynomial is
\[ \Delta(t) = t^n - S_1t^{n-1} + S_2t^{n-2} + \cdots + (-1)^nS_n \]
where $S_k$ is the sum of the principal minors of order k.
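Theorem 9.4 translates directly into a small program. The following Python sketch is only an illustration under stated assumptions: the names `det` and `char_poly` are made up here, the determinant uses plain cofactor expansion (adequate for small n only), and the test matrix is the one from Example 9.4.

```python
from itertools import combinations

def det(M):
    """Determinant by cofactor expansion along the first row (small n only)."""
    if not M:
        return 1                                   # empty determinant
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def char_poly(A):
    """Coefficients [1, -S_1, S_2, ..., (-1)^n S_n] of det(tI - A)."""
    n = len(A)
    coeffs = [1]
    for k in range(1, n + 1):
        # S_k is the sum of the principal minors of order k: keep the
        # same k row and column indices and take the determinant.
        S_k = sum(det([[A[i][j] for j in idx] for i in idx])
                  for idx in combinations(range(n), k))
        coeffs.append((-1) ** k * S_k)
    return coeffs

A = [[1, 1, 2], [0, 3, 2], [1, 3, 9]]
coeffs = char_poly(A)
print(coeffs)
```

For this A the coefficients agree with $t^3 - 13t^2 + 31t - 17$ found in Example 9.4. The number of principal minors grows quickly with n, so the approach is a check for hand computations rather than a practical method.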
Characteristic Polynomial of a Linear Operator
Now suppose $T: V \to V$ is a linear operator on a vector space V of finite dimension. We define the
characteristic polynomial $\Delta(t)$ of T to be the characteristic polynomial of any matrix representation of T.
Recall that if A and B are matrix representations of T, then $B = P^{-1}AP$ where P is a change-of-basis
matrix. Thus A and B are similar, and, by Theorem 9.3, A and B have the same characteristic polynomial.
Accordingly, the characteristic polynomial of T is independent of the particular basis in which the matrix
representation of T is computed.
Since f(T) = 0 if and only if f(A) = 0, where f(t) is any polynomial and A is any matrix
representation of T, we have the following analogous theorem for linear operators.
Theorem 9.2': (Cayley-Hamilton) A linear operator T is a zero of its characteristic polynomial.
9.4 DIAGONALIZATION, EIGENVALUES AND EIGENVECTORS
Let A be any n-square matrix. Then A can be represented by (or is similar to) a diagonal matrix
$D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$ if and only if there exists a basis S consisting of (column) vectors $u_1, u_2, \ldots, u_n$
such that
\[ Au_1 = k_1u_1, \quad Au_2 = k_2u_2, \quad \ldots, \quad Au_n = k_nu_n \]
In such a case, A is said to be diagonalizable. Furthermore, $D = P^{-1}AP$, where P is the nonsingular matrix
whose columns are, respectively, the basis vectors $u_1, u_2, \ldots, u_n$.
The above observation leads us to the following definition.
Definition. Let A be any square matrix. A scalar $\lambda$ is called an eigenvalue of A if there exists a nonzero
(column) vector v such that
\[ Av = \lambda v \]
Any vector satisfying this relation is called an eigenvector of A belonging to the eigenvalue $\lambda$.
We note that each scalar multiple kv of an eigenvector v belonging to $\lambda$ is also such an eigenvector,
since
\[ A(kv) = k(Av) = k(\lambda v) = \lambda(kv) \]
The set $E_\lambda$ of all such eigenvectors is a subspace of V (Problem 9.19), called the eigenspace of $\lambda$. (If
$\dim E_\lambda = 1$, then $E_\lambda$ is called an eigenline and $\lambda$ is called a scaling factor.)
The terms characteristic value and characteristic vector (or proper value and proper vector) are
sometimes used instead of eigenvalue and eigenvector.
The above observation and definitions give us the following theorem.
Theorem 9.5: An n-square matrix A is similar to a diagonal matrix D if and only if A has n linearly
independent eigenvectors. In this case, the diagonal elements of D are the corresponding
eigenvalues and $D = P^{-1}AP$, where P is the matrix whose columns are the eigenvectors.
Suppose a matrix A can be diagonalized as above, say $P^{-1}AP = D$, where D is diagonal. Then A has
the extremely useful diagonal factorization
\[ A = PDP^{-1} \]
Using this factorization, the algebra of A reduces to the algebra of the diagonal matrix D, which can be
easily calculated. Specifically, suppose $D = \operatorname{diag}(k_1, k_2, \ldots, k_n)$. Then
\[ A^m = (PDP^{-1})^m = PD^mP^{-1} = P \operatorname{diag}(k_1^m, \ldots, k_n^m) P^{-1} \]
More generally, for any polynomial f(t),
\[ f(A) = f(PDP^{-1}) = Pf(D)P^{-1} = P \operatorname{diag}(f(k_1), f(k_2), \ldots, f(k_n)) P^{-1} \]
Furthermore, if the diagonal entries of D are nonnegative, let
\[ B = P \operatorname{diag}(\sqrt{k_1}, \sqrt{k_2}, \ldots, \sqrt{k_n}) P^{-1} \]
Then B is a nonnegative square root of A; that is, $B^2 = A$ and the eigenvalues of B are nonnegative.
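Example 9.5. Let $A = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix}$ and let $v_1 = \begin{bmatrix} 1 \\ -2 \end{bmatrix}$ and $v_2 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$. Then
\[ Av_1 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ -2 \end{bmatrix} = \begin{bmatrix} 1 \\ -2 \end{bmatrix} = v_1 \quad \text{and} \quad Av_2 = \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 4v_2 \]
Thus $v_1$ and $v_2$ are eigenvectors of A belonging, respectively, to the eigenvalues $\lambda_1 = 1$ and $\lambda_2 = 4$. Observe that $v_1$
and $v_2$ are linearly independent and hence form a basis of $\mathbf{R}^2$. Accordingly, A is diagonalizable. Furthermore, let P be
the matrix whose columns are the eigenvectors $v_1$ and $v_2$. That is, let
\[ P = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} \]
Then A is similar to the diagonal matrix
\[ D = P^{-1}AP = \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} \begin{bmatrix} 3 & 1 \\ 2 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \]
As expected, the diagonal elements 1 and 4 in D are the eigenvalues corresponding, respectively, to the eigenvectors $v_1$
and $v_2$, which are the columns of P. In particular, A has the factorization
\[ A = PDP^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix} \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} \]
Accordingly,
\[ A^4 = PD^4P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 256 \end{bmatrix} \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} = \begin{bmatrix} 171 & 85 \\ 170 & 86 \end{bmatrix} \]
Moreover, suppose $f(t) = t^3 - 5t^2 + 3t + 6$; hence f(1) = 5 and f(4) = 2. Then
\[ f(A) = Pf(D)P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} = \begin{bmatrix} 3 & -1 \\ -2 & 4 \end{bmatrix} \]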
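Lastly, we obtain a "positive square root" of A. Specifically, using $\sqrt{1} = 1$ and $\sqrt{4} = 2$, we obtain the matrix
\[ B = P\sqrt{D}P^{-1} = \begin{bmatrix} 1 & 1 \\ -2 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac13 & -\tfrac13 \\ \tfrac23 & \tfrac13 \end{bmatrix} = \begin{bmatrix} \tfrac53 & \tfrac13 \\ \tfrac23 & \tfrac43 \end{bmatrix} \]
where $B^2 = A$ and where B has positive eigenvalues 1 and 2.
Remark: Throughout this chapter, we use the following fact:
\[ \text{If } P = \begin{bmatrix} a & b \\ c & d \end{bmatrix}, \text{ then } P^{-1} = \begin{bmatrix} d/|P| & -b/|P| \\ -c/|P| & a/|P| \end{bmatrix} \]
That is, $P^{-1}$ is obtained by interchanging the diagonal elements a and d of P, taking the negatives of the
nondiagonal elements b and c, and dividing each element by the determinant |P|.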
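The inversion rule in the Remark, together with the factorization $A = PDP^{-1}$, is easy to check by machine. The sketch below is an illustration only: the helper names are made up, exact rational arithmetic from Python's `fractions` module is an implementation choice, and P and the eigenvalues 1 and 4 are taken from Example 9.5.

```python
from fractions import Fraction

def inverse_2x2(P):
    """Swap a and d, negate b and c, divide every entry by |P|."""
    (a, b), (c, d) = P
    det = Fraction(a * d - b * c)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply_via_diagonal(P, diag_entries, func):
    """Return P diag(func(k_1), func(k_2)) P^{-1}."""
    D = [[func(diag_entries[0]), 0], [0, func(diag_entries[1])]]
    return mat_mul(mat_mul(P, D), inverse_2x2(P))

P = [[1, 1], [-2, 1]]            # columns are the eigenvectors v1, v2
eigenvalues = [1, 4]

A = apply_via_diagonal(P, eigenvalues, lambda k: k)
A4 = apply_via_diagonal(P, eigenvalues, lambda k: k ** 4)
f_of_A = apply_via_diagonal(P, eigenvalues, lambda k: k**3 - 5 * k**2 + 3 * k + 6)
print(A, A4, f_of_A)
```

Passing the identity function recovers A itself, and any other function applied to the diagonal entries gives the corresponding function of A, which is the point of the factorization.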
Properties of Eigenvalues and Eigenvectors
The above Example 9.5 indicates the advantages of a diagonal representation (factorization) of a
square matrix. In the following theorem (proved in Problem 9.20), we list properties that help us to find
such a representation.
Theorem 9.6: Let A be a square matrix. Then the following are equivalent.
(i) A scalar $\lambda$ is an eigenvalue of A.
(ii) The matrix $M = A - \lambda I$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of A.
The eigenspace $E_\lambda$ of an eigenvalue $\lambda$ is the solution space of the homogeneous system MX = 0,
where $M = A - \lambda I$, that is, M is obtained by subtracting $\lambda$ down the diagonal of A.
Some matrices have no eigenvalues and hence no eigenvectors. However, using Theorem 9.6 and the
Fundamental Theorem of Algebra (every polynomial over the complex field C has a root), we obtain the
following result.
Theorem 9.7: Let A be a square matrix over the complex field C. Then A has at least one eigenvalue.
The following theorems will be used subsequently. (The theorem equivalent to Theorem 9.8 for linear
operators is proved in Problem 9.21, while Theorem 9.9 is proved in Problem 9.22.)
Theorem 9.8: Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a matrix A belonging to distinct
eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.
Theorem 9.9: Suppose the characteristic polynomial $\Delta(t)$ of an n-square matrix A is a product of n
distinct factors, say, $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then A is similar to the diagonal
matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.
If $\lambda$ is an eigenvalue of a matrix A, then the algebraic multiplicity of $\lambda$ is defined to be the multiplicity
of $\lambda$ as a root of the characteristic polynomial of A, while the geometric multiplicity of $\lambda$ is defined to be the
dimension of its eigenspace, $\dim E_\lambda$. The following theorem (whose equivalent for linear operators is
proved in Problem 9.23) holds.
Theorem 9.10: The geometric multiplicity of an eigenvalue $\lambda$ of a matrix A does not exceed its algebraic
multiplicity.
Diagonalization of Linear Operators
Consider a linear operator $T: V \to V$. Then T is said to be diagonalizable if it can be represented by a
diagonal matrix D. Thus T is diagonalizable if and only if there exists a basis $S = \{u_1, u_2, \ldots, u_n\}$ of V for
which
\[ T(u_1) = k_1u_1, \quad T(u_2) = k_2u_2, \quad \ldots, \quad T(u_n) = k_nu_n \]
In such a case, T is represented by the diagonal matrix
\[ D = \operatorname{diag}(k_1, k_2, \ldots, k_n) \]
relative to the basis S.
The above observation leads us to the following definitions and theorems, which are analogous to the
definitions and theorems for matrices discussed above.
Definition: Let T be a linear operator. A scalar $\lambda$ is called an eigenvalue of T if there exists a nonzero
vector v such that
\[ T(v) = \lambda v \]
Every vector satisfying this relation is called an eigenvector of T belonging to the
eigenvalue $\lambda$.
The set $E_\lambda$ of all eigenvectors belonging to an eigenvalue $\lambda$ is a subspace of V, called the eigenspace of
$\lambda$. (Alternatively, $\lambda$ is an eigenvalue of T if $\lambda I - T$ is singular, and, in this case, $E_\lambda$ is the kernel of $\lambda I - T$.)
The algebraic and geometric multiplicities of an eigenvalue $\lambda$ of a linear operator T are defined in the same
way as those of an eigenvalue of a matrix A.
The following theorems apply to a linear operator T on a vector space V of finite dimension.
Theorem 9.5': T can be represented by a diagonal matrix D if and only if there exists a basis S of V
consisting of eigenvectors of T. In this case, the diagonal elements of D are the
corresponding eigenvalues.
Theorem 9.6': Let T be a linear operator. Then the following are equivalent.
(i) A scalar $\lambda$ is an eigenvalue of T.
(ii) The linear operator $\lambda I - T$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of T.
Theorem 9.7': Suppose V is a complex vector space. Then T has at least one eigenvalue.
Theorem 9.8': Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of a linear operator T belonging to
distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.
Theorem 9.9': Suppose the characteristic polynomial $\Delta(t)$ of T is a product of n distinct factors, say,
$\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$. Then T can be represented by the diagonal matrix
$D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.
Theorem 9.10': The geometric multiplicity of an eigenvalue $\lambda$ of T does not exceed its algebraic
multiplicity.
Remark. The following theorem reduces the investigation of the diagonalization of a linear operator
T to the diagonalization of a matrix A.
Theorem 9.11: Suppose A is a matrix representation of T. Then T is diagonalizable if and only if A is
diagonalizable.
9.5 COMPUTING EIGENVALUES AND EIGENVECTORS, DIAGONALIZING MATRICES
This section gives an algorithm for computing eigenvalues and eigenvectors for a given square matrix
A and for determining whether or not a nonsingular matrix P exists such that $P^{-1}AP$ is diagonal.
Algorithm 9.1: (Diagonalization Algorithm) The input is an n-square matrix A.
Step 1. Find the characteristic polynomial $\Delta(t)$ of A.
Step 2. Find the roots of $\Delta(t)$ to obtain the eigenvalues of A.
Step 3. Repeat (a) and (b) for each eigenvalue $\lambda$ of A.
(a) Form the matrix $M = A - \lambda I$ by subtracting $\lambda$ down the diagonal of A.
(b) Find a basis for the solution space of the homogeneous system MX = 0. (These basis
vectors are linearly independent eigenvectors of A belonging to $\lambda$.)
Step 4. Consider the collection $S = \{v_1, v_2, \ldots, v_m\}$ of all eigenvectors obtained in Step 3.
(a) If $m \neq n$, then A is not diagonalizable.
(b) If m = n, then A is diagonalizable. Specifically, let P be the matrix whose columns are the
eigenvectors $v_1, v_2, \ldots, v_n$. Then
\[ D = P^{-1}AP = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n) \]
where $\lambda_i$ is the eigenvalue corresponding to the eigenvector $v_i$.
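The steps of the algorithm can be sketched in Python for the smallest interesting case. The code below is an illustration only and rests on assumptions that are not part of the algorithm as stated: it handles 2-square matrices with integer entries, works over the rational numbers, and finds the roots in Step 2 by the quadratic formula. The name `diagonalize_2x2` is made up.

```python
from fractions import Fraction
from math import isqrt

def diagonalize_2x2(A):
    """Follow the four steps for a 2 x 2 integer matrix with rational eigenvalues.

    Returns (eigenvalues, P) when A is diagonalizable over Q, else None.
    """
    (a, b), (c, d) = A
    tr, det = a + d, a * d - b * c           # Step 1: Delta(t) = t^2 - tr t + det
    disc = tr * tr - 4 * det                 # Step 2: roots by the quadratic formula
    if disc < 0 or isqrt(disc) ** 2 != disc:
        return None                          # no rational roots
    roots = sorted({Fraction(tr + s * isqrt(disc), 2) for s in (1, -1)}, reverse=True)
    vectors = []
    for lam in roots:                        # Step 3: solve (A - lam I) X = 0
        m11, m12, m21, m22 = a - lam, b, c, d - lam
        if m11 == m12 == m21 == m22 == 0:
            basis = [(1, 0), (0, 1)]         # eigenspace is the whole plane
        elif m11 != 0 or m12 != 0:
            basis = [(m12, -m11)]            # annihilated by the first row of M
        else:
            basis = [(m22, -m21)]            # annihilated by the second row of M
        vectors += [(lam, v) for v in basis]
    if len(vectors) != 2:                    # Step 4(a): fewer than n eigenvectors
        return None
    eigenvalues = [lam for lam, _ in vectors]
    P = [[vectors[0][1][0], vectors[1][1][0]],   # Step 4(b): eigenvectors as columns
         [vectors[0][1][1], vectors[1][1][1]]]
    return eigenvalues, P

result = diagonalize_2x2([[4, 2], [3, -1]])
print(result)
```

For the matrix shown, the function finds the eigenvalues 5 and -2; the eigenvectors it returns may differ from a hand computation by a nonzero scalar factor, which does not affect the diagonalization. A return value of None covers both a repeated eigenvalue with a one-dimensional eigenspace and a characteristic polynomial with no rational roots.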
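Example 9.6. The diagonalization algorithm is applied to $A = \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix}$.
(1) The characteristic polynomial $\Delta(t)$ of A is computed. We have
\[ \operatorname{tr}(A) = 4 - 1 = 3, \quad |A| = -4 - 6 = -10; \]
hence
\[ \Delta(t) = t^2 - 3t - 10 = (t - 5)(t + 2) \]
(2) Set $\Delta(t) = (t - 5)(t + 2) = 0$. The roots $\lambda_1 = 5$ and $\lambda_2 = -2$ are the eigenvalues of A.
(3) (i) We find an eigenvector $v_1$ of A belonging to the eigenvalue $\lambda_1 = 5$. Subtract $\lambda_1 = 5$ down the diagonal of A
to obtain the matrix $M = \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix}$. The eigenvectors belonging to $\lambda_1 = 5$ form the solution of the
homogeneous system MX = 0, that is,
\[ \begin{bmatrix} -1 & 2 \\ 3 & -6 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{or} \quad \begin{aligned} -x + 2y &= 0 \\ 3x - 6y &= 0 \end{aligned} \quad \text{or} \quad -x + 2y = 0 \]
The system has only one free variable. Thus a nonzero solution, for example, $v_1 = (2, 1)$, is an eigenvector
that spans the eigenspace of $\lambda_1 = 5$.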
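(ii) We find an eigenvector $v_2$ of A belonging to the eigenvalue $\lambda_2 = -2$. Subtract -2 (or add 2) down the
diagonal of A to obtain the matrix
\[ M = \begin{bmatrix} 6 & 2 \\ 3 & 1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} 6x + 2y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0 \]
The system has only one independent solution. Thus a nonzero solution, say $v_2 = (-1, 3)$, is an
eigenvector that spans the eigenspace of $\lambda_2 = -2$.
(4) Let P be the matrix whose columns are the eigenvectors $v_1$ and $v_2$. Then
\[ P = \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} \quad \text{and so} \quad P^{-1} = \begin{bmatrix} \tfrac37 & \tfrac17 \\ -\tfrac17 & \tfrac27 \end{bmatrix} \]
Accordingly, $D = P^{-1}AP$ is the diagonal matrix whose diagonal entries are the corresponding eigenvalues;
that is,
\[ D = P^{-1}AP = \begin{bmatrix} \tfrac37 & \tfrac17 \\ -\tfrac17 & \tfrac27 \end{bmatrix} \begin{bmatrix} 4 & 2 \\ 3 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 1 & 3 \end{bmatrix} = \begin{bmatrix} 5 & 0 \\ 0 & -2 \end{bmatrix} \]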
Example 9.7. Consider the matrix $B = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$. We have
\[ \operatorname{tr}(B) = 5 + 3 = 8, \quad |B| = 15 + 1 = 16; \quad \text{so} \quad \Delta(t) = t^2 - 8t + 16 = (t - 4)^2 \]
Accordingly, $\lambda = 4$ is the only eigenvalue of B.
Subtract $\lambda = 4$ down the diagonal of B to obtain the matrix
\[ M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} x - y &= 0 \\ x - y &= 0 \end{aligned} \quad \text{or} \quad x - y = 0 \]
The system has only one independent solution; for example, x = 1, y = 1. Thus v = (1, 1) and its multiples are the
only eigenvectors of B. Accordingly, B is not diagonalizable, since there does not exist a basis consisting of
eigenvectors of B.
Example 9.8. Consider the matrix $A = \begin{bmatrix} 3 & -5 \\ 2 & -3 \end{bmatrix}$.
Here tr(A) = 3 - 3 = 0 and |A| = -9 + 10 = 1. Thus $\Delta(t) = t^2 + 1$
is the characteristic polynomial of A. We consider two cases:
(a) A is a matrix over the real field R. Then $\Delta(t)$ has no (real) roots. Thus A has no eigenvalues and no eigenvectors,
and so A is not diagonalizable.
(b) A is a matrix over the complex field C. Then $\Delta(t) = (t - i)(t + i)$ has two roots, $i$ and $-i$. Thus A has two distinct
eigenvalues $i$ and $-i$, and hence A has two independent eigenvectors. Accordingly there exists a nonsingular
matrix P over the complex field C for which
\[ P^{-1}AP = \begin{bmatrix} i & 0 \\ 0 & -i \end{bmatrix} \]
Therefore A is diagonalizable (over C).
9.6 DIAGONALIZING REAL SYMMETRIC MATRICES
There are many real matrices A that are not diagonalizable. In fact, some real matrices may not have
any (real) eigenvalues. However, if A is a real symmetric matrix, then these problems do not exist. Namely,
we have the following theorems.
Theorem 9.12: Let A be a real symmetric matrix. Then each root $\lambda$ of its characteristic polynomial is
real.
Theorem 9.13: Let A be a real symmetric matrix. Suppose u and v are eigenvectors of A belonging to
distinct eigenvalues $\lambda_1$ and $\lambda_2$. Then u and v are orthogonal, that is, $\langle u, v \rangle = 0$.
The above two theorems give us the following fundamental result.
Theorem 9.14: Let A be a real symmetric matrix. Then there exists an orthogonal matrix P such that
$D = P^{-1}AP$ is diagonal.
The orthogonal matrix P is obtained by normalizing a basis of orthogonal eigenvectors of A as
illustrated below. In such a case, we say that A is "orthogonally diagonalizable".
Example 9.9. Let $A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix}$,
a real symmetric matrix. Find an orthogonal matrix P such that $P^{-1}AP$ is diagonal.
First we find the characteristic polynomial $\Delta(t)$ of A. We have
\[ \operatorname{tr}(A) = 2 + 5 = 7, \quad |A| = 10 - 4 = 6; \quad \text{so} \quad \Delta(t) = t^2 - 7t + 6 = (t - 6)(t - 1) \]
Accordingly, $\lambda_1 = 6$ and $\lambda_2 = 1$ are the eigenvalues of A.
(a) Subtracting $\lambda_1 = 6$ down the diagonal of A yields the matrix
\[ M = \begin{bmatrix} -4 & -2 \\ -2 & -1 \end{bmatrix} \quad \text{and the homogeneous system} \quad \begin{aligned} -4x - 2y &= 0 \\ -2x - y &= 0 \end{aligned} \quad \text{or} \quad 2x + y = 0 \]
A nonzero solution is $u_1 = (1, -2)$.
(b) Subtracting $\lambda_2 = 1$ down the diagonal of A yields the matrix
\[ M = \begin{bmatrix} 1 & -2 \\ -2 & 4 \end{bmatrix} \quad \text{and the homogeneous system} \quad x - 2y = 0 \]
(The second equation drops out, since it is a multiple of the first equation.) A nonzero solution is
$u_2 = (2, 1)$.
As expected from Theorem 9.13, $u_1$ and $u_2$ are orthogonal. Normalizing $u_1$ and $u_2$ yields the orthonormal vectors
\[ \hat{u}_1 = (1/\sqrt{5}, -2/\sqrt{5}) \quad \text{and} \quad \hat{u}_2 = (2/\sqrt{5}, 1/\sqrt{5}) \]
Finally, let P be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$, respectively. Then
\[ P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix} \quad \text{and} \quad P^{-1}AP = \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} \]
As expected, the diagonal entries of $P^{-1}AP$ are the eigenvalues corresponding to the columns of P.
The procedure in the above Example 9.9 is formalized in the following algorithm, which finds an
orthogonal matrix P such that $P^{-1}AP$ is diagonal.
Algorithm 9.2: (Orthogonal Diagonalization Algorithm) The input is a real symmetric matrix A.
Step 1. Find the characteristic polynomial $\Delta(t)$ of A.
Step 2. Find the eigenvalues of A, which are the roots of $\Delta(t)$.
Step 3. For each eigenvalue $\lambda$ of A in Step 2, find an orthogonal basis of its eigenspace.
Step 4. Normalize all eigenvectors in Step 3, which then form an orthonormal basis of $\mathbf{R}^n$.
Step 5. Let P be the matrix whose columns are the normalized eigenvectors in Step 4.
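For a 2-square real symmetric matrix the five steps can be carried out in closed form. The Python sketch below is an illustration under that restriction: the function name is made up, floating-point arithmetic is used, and the eigenvector formula $(b, \lambda - a)$ is a convenient choice rather than anything prescribed by the algorithm. The test matrix is the one from Example 9.9.

```python
import math

def orthogonal_diagonalize_2x2(A):
    """Return (eigenvalues, P) with P orthogonal and P^T A P diagonal.

    A must be a real symmetric 2 x 2 matrix given as a list of rows.
    """
    (a, b), (_, d) = A
    if b == 0:                                 # already diagonal
        return [a, d], [[1.0, 0.0], [0.0, 1.0]]
    tr, det = a + d, a * d - b * b             # Step 1
    root = math.sqrt(tr * tr - 4 * det)        # real, since A is symmetric
    lams = [(tr + root) / 2, (tr - root) / 2]  # Step 2
    columns = []
    for lam in lams:
        x, y = b, lam - a                      # Step 3: solves (a - lam) x + b y = 0
        norm = math.hypot(x, y)
        columns.append((x / norm, y / norm))   # Step 4: normalize
    P = [[columns[0][0], columns[1][0]],       # Step 5: eigenvectors as columns
         [columns[0][1], columns[1][1]]]
    return lams, P

A = [[2, -2], [-2, 5]]
lams, P = orthogonal_diagonalize_2x2(A)
print(lams, P)
```

The columns of the computed P may differ in sign from those chosen by hand; a normalized eigenvector is determined only up to sign, and either choice gives an orthogonal P.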
Application to Quadratic Forms
Let q be a real polynomial in variables $x_1, x_2, \ldots, x_n$ such that every term in q has degree two; that is,
\[ q(x_1, x_2, \ldots, x_n) = \sum_i c_ix_i^2 + \sum_{i<j} d_{ij}x_ix_j, \quad \text{where } c_i, d_{ij} \in \mathbf{R} \]
Then q is called a quadratic form. If there are no cross-product terms $x_ix_j$, that is, all $d_{ij} = 0$, then q is said
to be diagonal.
The above quadratic form q determines a real symmetric matrix $A = [a_{ij}]$, where $a_{ii} = c_i$ and
$a_{ij} = a_{ji} = \tfrac12 d_{ij}$. Namely, q can be written in the matrix form
\[ q(X) = X^TAX \]
where $X = [x_1, x_2, \ldots, x_n]^T$ is the column vector of the variables. Furthermore, suppose X = PY is a linear
substitution of the variables. Then substitution in the quadratic form yields
\[ q(Y) = (PY)^TA(PY) = Y^T(P^TAP)Y \]
Thus $P^TAP$ is the matrix representation of q in the new variables.
We seek an orthogonal matrix P such that the orthogonal substitution X = PY yields a diagonal
quadratic form, that is, for which $P^TAP$ is diagonal. Since P is orthogonal, $P^T = P^{-1}$, and hence
$P^TAP = P^{-1}AP$. The above theory yields such an orthogonal matrix P.
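Example 9.10. Consider the quadratic form
\[ q(x, y) = 2x^2 - 4xy + 5y^2 = X^TAX, \quad \text{where } A = \begin{bmatrix} 2 & -2 \\ -2 & 5 \end{bmatrix} \text{ and } X = \begin{bmatrix} x \\ y \end{bmatrix} \]
By Example 9.9,
\[ \begin{bmatrix} 6 & 0 \\ 0 & 1 \end{bmatrix} = P^{-1}AP = P^TAP, \quad \text{where } P = \begin{bmatrix} 1/\sqrt{5} & 2/\sqrt{5} \\ -2/\sqrt{5} & 1/\sqrt{5} \end{bmatrix} \]
The matrix P corresponds to the following linear orthogonal substitution of the variables x and y in terms of the
variables s and t:
\[ x = \frac{1}{\sqrt{5}}s + \frac{2}{\sqrt{5}}t, \quad y = -\frac{2}{\sqrt{5}}s + \frac{1}{\sqrt{5}}t \]
This substitution in q(x, y) yields the diagonal quadratic form $q(s, t) = 6s^2 + t^2$.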
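The effect of the substitution in Example 9.10 can be spot-checked numerically. The Python sketch below is only an illustration: the function names and the sample points are arbitrary choices, and agreement at a few points is a sanity check, not a proof.

```python
import math

def q_xy(x, y):
    """The quadratic form q(x, y) = 2x^2 - 4xy + 5y^2."""
    return 2 * x * x - 4 * x * y + 5 * y * y

def substitute(s, t):
    """Orthogonal substitution X = PY: returns (x, y) for given (s, t)."""
    r = math.sqrt(5)
    return (s + 2 * t) / r, (-2 * s + t) / r

def q_st(s, t):
    """The diagonal form obtained after the substitution."""
    return 6 * s * s + t * t

samples = [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0), (-1.5, 0.25)]
max_gap = max(abs(q_xy(*substitute(s, t)) - q_st(s, t)) for s, t in samples)
print(max_gap)
```

The printed gap is zero up to floating-point rounding, consistent with the two forms being equal for all s and t.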
9.7 MINIMAL POLYNOMIAL
Let A be any square matrix. Let J(A) denote the collection of all polynomials f(t) for which A is a root,
i.e., for which f(A) = 0. The set J(A) is not empty, since the Cayley-Hamilton Theorem 9.2 tells us that
the characteristic polynomial $\Delta_A(t)$ of A belongs to J(A). Let m(t) denote the monic polynomial of lowest
degree in J(A). (Such a polynomial m(t) exists and is unique.) We call m(t) the minimal polynomial of the
matrix A.
Remark: A polynomial $f(t) \neq 0$ is monic if its leading coefficient equals one.
The following theorem (proved in Problem 9.33) holds.
Theorem 9.15: The minimal polynomial m(t) of a matrix (linear operator) A divides every polynomial
that has A as a zero. In particular, m(t) divides the characteristic polynomial $\Delta(t)$ of A.
There is an even stronger relationship between m(t) and $\Delta(t)$.
Theorem 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial m(t) of a matrix A have
the same irreducible factors.
This theorem (proved in Problem 9.35) does not say that $m(t) = \Delta(t)$, only that any irreducible factor
of one must divide the other. In particular, since a linear factor is irreducible, m(t) and $\Delta(t)$ have the same
linear factors. Hence they have the same roots. Thus we have the following theorem.
Theorem 9.17: A scalar $\lambda$ is an eigenvalue of the matrix A if and only if $\lambda$ is a root of the minimal
polynomial of A.
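Example 9.11. Find the minimal polynomial m(t) of $A = \begin{bmatrix} 2 & 2 & -5 \\ 3 & 7 & -15 \\ 1 & 2 & -4 \end{bmatrix}$.
First find the characteristic polynomial $\Delta(t)$ of A. We have
\[ \operatorname{tr}(A) = 5, \quad A_{11} + A_{22} + A_{33} = 2 - 3 + 8 = 7 \quad \text{and} \quad |A| = 3 \]
Hence
\[ \Delta(t) = t^3 - 5t^2 + 7t - 3 = (t - 1)^2(t - 3) \]
The minimal polynomial m(t) must divide $\Delta(t)$. Also, each irreducible factor of $\Delta(t)$, that is, t - 1 and t - 3, must
also be a factor of m(t). Thus m(t) is exactly one of the following:
\[ f(t) = (t - 3)(t - 1) \quad \text{or} \quad g(t) = (t - 3)(t - 1)^2 \]
We know, by the Cayley-Hamilton Theorem, that $g(A) = \Delta(A) = 0$. Hence we need only test f(t). We have
\[ f(A) = (A - I)(A - 3I) = \begin{bmatrix} 1 & 2 & -5 \\ 3 & 6 & -15 \\ 1 & 2 & -5 \end{bmatrix} \begin{bmatrix} -1 & 2 & -5 \\ 3 & 4 & -15 \\ 1 & 2 & -7 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} \]
Thus $f(t) = m(t) = (t - 1)(t - 3) = t^2 - 4t + 3$ is the minimal polynomial of A.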
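The test carried out in Example 9.11 can be automated once the candidate polynomials are known. The Python sketch below is an illustration only: the helper names are made up, and the candidates are supplied by hand as lists of roots, ordered by increasing degree, rather than derived from the characteristic polynomial.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_of_shifts(A, roots):
    """Return (A - r_1 I)(A - r_2 I)... for the listed roots."""
    n = len(A)
    result = [[int(i == j) for j in range(n)] for i in range(n)]
    for r in roots:
        shifted = [[A[i][j] - (r if i == j else 0) for j in range(n)]
                   for i in range(n)]
        result = mat_mul(result, shifted)
    return result

def is_zero(M):
    return all(entry == 0 for row in M for entry in row)

A = [[2, 2, -5], [3, 7, -15], [1, 2, -4]]
candidates = [[1, 3], [1, 1, 3]]     # f(t) = (t-1)(t-3), g(t) = (t-1)^2 (t-3)

# The first candidate (lowest degree) that annihilates A gives m(t).
minimal_roots = next(r for r in candidates if is_zero(product_of_shifts(A, r)))
print(minimal_roots)
```

Since the candidates are tried in order of degree, the first one that gives the zero matrix corresponds to the minimal polynomial; here it is the list [1, 3], that is, m(t) = (t - 1)(t - 3).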
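Example 9.12.
(a) Consider the following two r-square matrices, where $a \neq 0$:
\[ J(\lambda, r) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix} \quad \text{and} \quad A = \begin{bmatrix} \lambda & a & 0 & \cdots & 0 & 0 \\ 0 & \lambda & a & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & a \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix} \]
The first matrix, called a Jordan Block, has $\lambda$'s on the diagonal, 1's on the superdiagonal (consisting of the entries
above the diagonal entries), and 0's elsewhere. The second matrix A has $\lambda$'s on the diagonal, a's on the
superdiagonal, and 0's elsewhere. [Thus A is a generalization of $J(\lambda, r)$.] One can show that
\[ f(t) = (t - \lambda)^r \]
is both the characteristic and minimal polynomial of both $J(\lambda, r)$ and A.
(b) Consider an arbitrary monic polynomial
\[ f(t) = t^n + a_{n-1}t^{n-1} + \cdots + a_1t + a_0 \]
Let C(f) be the n-square matrix with 1's on the subdiagonal (consisting of the entries below the diagonal
entries), the negatives of the coefficients in the last column, and 0's elsewhere as follows:
\[ C(f) = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix} \]
Then C(f) is called the companion matrix of the polynomial f(t). Moreover, the minimal polynomial m(t) and
the characteristic polynomial $\Delta(t)$ of the companion matrix C(f) are both equal to the original polynomial f(t).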
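The companion matrix is simple to build, and the claim that f annihilates C(f) can be checked for a particular polynomial. The Python sketch below is an illustration only: the function names are made up, and the sample polynomial $t^3 - 5t^2 + 7t - 3$ is borrowed from Example 9.11 as a convenient test case. The check confirms f(C) = 0 but does not by itself show that f is the minimal polynomial.

```python
def companion(coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0.

    coeffs lists the non-leading coefficients as [a_0, a_1, ..., a_{n-1}].
    """
    n = len(coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1                 # 1's on the subdiagonal
    for i in range(n):
        C[i][n - 1] = -coeffs[i]        # negated coefficients in the last column
    return C

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def monic_poly_at_matrix(coeffs, M):
    """Evaluate t^n + a_{n-1} t^{n-1} + ... + a_0 at M by Horner's rule."""
    n = len(M)
    result = [[0] * n for _ in range(n)]
    for c in [1] + coeffs[::-1]:        # leading 1, then a_{n-1}, ..., a_0
        result = mat_mul(result, M)
        for i in range(n):
            result[i][i] += c
    return result

coeffs = [-3, 7, -5]                    # f(t) = t^3 - 5t^2 + 7t - 3
C = companion(coeffs)
f_C = monic_poly_at_matrix(coeffs, C)
print(C, f_C)
```

Storing the coefficients from $a_0$ upward matches the order in which they appear down the last column of C(f).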
Minimal Polynomial of a Linear Operator
The minimal polynomial $m(t)$ of a linear operator $T$ is defined to be the monic polynomial of lowest degree for which $T$ is a root. However, for any polynomial $f(t)$, we have
\[ f(T) = 0 \quad \text{if and only if} \quad f(A) = 0 \]
where $A$ is any matrix representation of $T$. Accordingly, $T$ and $A$ have the same minimal polynomials. Thus the above theorems on the minimal polynomial of a matrix also hold for the minimal polynomial of a linear operator. That is, we have the following.
Theorem 9.15': The minimal polynomial $m(t)$ of a linear operator $T$ divides every polynomial that has $T$ as a root. In particular, $m(t)$ divides the characteristic polynomial $\Delta(t)$ of $T$.
Theorem 9.16': The characteristic and minimal polynomials of a linear operator $T$ have the same irreducible factors.
Theorem 9.17': A scalar $\lambda$ is an eigenvalue of a linear operator $T$ if and only if $\lambda$ is a root of the minimal polynomial $m(t)$ of $T$.
9.8 CHARACTERISTIC AND MINIMAL POLYNOMIALS OF BLOCK MATRICES
This section discusses the relationship of the characteristic polynomial and the minimal polynomial to
certain (square) block matrices.
Characteristic Polynomial and Block Triangular Matrices
Suppose $M$ is a block triangular matrix, say $M = \begin{bmatrix} A_1 & B \\ 0 & A_2 \end{bmatrix}$, where $A_1$ and $A_2$ are square matrices.
Then $tI - M$ is also a block triangular matrix, with diagonal blocks $tI - A_1$ and $tI - A_2$. Thus
\[ |tI - M| = \begin{vmatrix} tI - A_1 & -B \\ 0 & tI - A_2 \end{vmatrix} = |tI - A_1| \, |tI - A_2| \]
That is, the characteristic polynomial of $M$ is the product of the characteristic polynomials of the diagonal blocks $A_1$ and $A_2$.
By induction, we obtain the following useful result.
Theorem 9.18: Suppose $M$ is a block triangular matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the characteristic polynomial of $M$ is the product of the characteristic polynomials of the diagonal blocks $A_i$, that is,
\[ \Delta_M(t) = \Delta_{A_1}(t) \, \Delta_{A_2}(t) \cdots \Delta_{A_r}(t) \]
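Example 9.13. Consider the matrix
\[ M = \left[ \begin{array}{rr|rr} 9 & -1 & 5 & 7 \\ 8 & 3 & 2 & -4 \\ \hline 0 & 0 & 3 & 6 \\ 0 & 0 & -1 & 8 \end{array} \right] \]
Then $M$ is a block triangular matrix with diagonal blocks $A = \begin{bmatrix} 9 & -1 \\ 8 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & 6 \\ -1 & 8 \end{bmatrix}$. Here
\[ \operatorname{tr}(A) = 9 + 3 = 12, \quad \det(A) = 27 + 8 = 35, \quad \text{and so} \quad \Delta_A(t) = t^2 - 12t + 35 = (t - 5)(t - 7) \]
\[ \operatorname{tr}(B) = 3 + 8 = 11, \quad \det(B) = 24 + 6 = 30, \quad \text{and so} \quad \Delta_B(t) = t^2 - 11t + 30 = (t - 5)(t - 6) \]
Accordingly, the characteristic polynomial of $M$ is the product
\[ \Delta_M(t) = \Delta_A(t) \, \Delta_B(t) = (t - 5)^2 (t - 6)(t - 7) \]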
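Theorem 9.18 can be checked on the matrix of Example 9.13 by computing the characteristic polynomial of the full matrix and comparing it with the product of the block polynomials. The sketch below uses the Faddeev-LeVerrier recursion, a technique chosen here for convenience and not taken from the text; `char_poly` and `poly_mul` are illustrative names.

```python
from fractions import Fraction

def char_poly(A):
    """Coefficients of det(tI - A), highest degree first (Faddeev-LeVerrier)."""
    n = len(A)
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        AM = [[sum(A[i][l] * M[l][j] for l in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(AM[i][i] for i in range(n)) / k       # next coefficient
        coeffs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)]
             for i in range(n)]
    return coeffs

def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

M = [[9, -1, 5, 7], [8, 3, 2, -4], [0, 0, 3, 6], [0, 0, -1, 8]]
A = [[9, -1], [8, 3]]
B = [[3, 6], [-1, 8]]

whole = char_poly(M)
product = poly_mul(char_poly(A), char_poly(B))
print(whole == product)
```

Expanding $(t-5)^2(t-6)(t-7)$ by hand gives $t^4 - 23t^3 + 197t^2 - 745t + 1050$, which is the coefficient list both computations should produce.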
Minimal Polynomial and Block Diagonal Matrices
The following theorem (proved in Problem 9.36) holds.
Theorem 9.19. Suppose $M$ is a block diagonal matrix with diagonal blocks $A_1, A_2, \ldots, A_r$. Then the minimal polynomial of $M$ is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks $A_i$.
Remark: We emphasize that this theorem applies to block diagonal matrices, whereas the analogous Theorem 9.18 on characteristic polynomials applies to block triangular matrices.
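Example 9.14. Find the characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of the block diagonal matrix
\[ M = \begin{bmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix} = \operatorname{diag}(A_1, A_2, A_3), \quad \text{where } A_1 = \begin{bmatrix} 2 & 5 \\ 0 & 2 \end{bmatrix}, \; A_2 = \begin{bmatrix} 4 & 2 \\ 3 & 5 \end{bmatrix}, \; A_3 = [7] \]
Then $\Delta(t)$ is the product of the characteristic polynomials $\Delta_1(t)$, $\Delta_2(t)$, $\Delta_3(t)$ of $A_1$, $A_2$, $A_3$, respectively. One can show that
\[ \Delta_1(t) = (t - 2)^2, \qquad \Delta_2(t) = (t - 2)(t - 7), \qquad \Delta_3(t) = t - 7 \]
Thus $\Delta(t) = (t - 2)^3 (t - 7)^2$. [As expected, $\deg \Delta(t) = 5$.]
The minimal polynomials $m_1(t)$, $m_2(t)$, $m_3(t)$ of the diagonal blocks $A_1$, $A_2$, $A_3$, respectively, are equal to the characteristic polynomials, that is,
\[ m_1(t) = (t - 2)^2, \qquad m_2(t) = (t - 2)(t - 7), \qquad m_3(t) = t - 7 \]
But $m(t)$ is equal to the least common multiple of $m_1(t)$, $m_2(t)$, $m_3(t)$. Thus $m(t) = (t - 2)^2 (t - 7)$.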
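The conclusion of Example 9.14 can be spot-checked numerically: the LCM $(t-2)^2(t-7)$ should annihilate $M$, while the proper divisor $(t-2)(t-7)$ should not. This is a rough sketch with illustrative helper names, not a general minimal-polynomial algorithm.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(X, c):
    """Return X - c*I."""
    n = len(X)
    return [[X[i][j] - (c if i == j else 0) for j in range(n)]
            for i in range(n)]

M = [[2, 5, 0, 0, 0],
     [0, 2, 0, 0, 0],
     [0, 0, 4, 2, 0],
     [0, 0, 3, 5, 0],
     [0, 0, 0, 0, 7]]

N2, N7 = shift(M, 2), shift(M, 7)
lcm_value = matmul(matmul(N2, N2), N7)   # (M - 2I)^2 (M - 7I)
divisor_value = matmul(N2, N7)           # (M - 2I)(M - 7I)

is_zero = lambda X: all(entry == 0 for row in X for entry in row)
print(is_zero(lcm_value), is_zero(divisor_value))
```

The nonzero entry of `divisor_value` comes from the block $A_1$, whose minimal polynomial $(t-2)^2$ forces the exponent 2 in $m(t)$.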
Solved Problems
POLYNOMIALS OF MATRICES, CHARACTERISTIC POLYNOMIAL
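9.1. Let $A = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix}$. Find $f(A)$, where:
(a) $f(t) = t^2 - 3t + 7$,
(b) $f(t) = t^2 - 6t + 13$.
First find
\[ A^2 = \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} \begin{bmatrix} 1 & -2 \\ 4 & 5 \end{bmatrix} = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} \]
Then:
(a) $f(A) = A^2 - 3A + 7I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -3 & 6 \\ -12 & -15 \end{bmatrix} + \begin{bmatrix} 7 & 0 \\ 0 & 7 \end{bmatrix} = \begin{bmatrix} -3 & -6 \\ 12 & 9 \end{bmatrix}$
(b) $f(A) = A^2 - 6A + 13I = \begin{bmatrix} -7 & -12 \\ 24 & 17 \end{bmatrix} + \begin{bmatrix} -6 & 12 \\ -24 & -30 \end{bmatrix} + \begin{bmatrix} 13 & 0 \\ 0 & 13 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$
[Thus $A$ is a root of $f(t)$.]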
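9.2. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 5 \\ 4 & 1 \end{bmatrix}$, (b) $B = \begin{bmatrix} 7 & -3 \\ 5 & -2 \end{bmatrix}$, (c) $C = \begin{bmatrix} 3 & -2 \\ 9 & -3 \end{bmatrix}$
Use the formula $\Delta(t) = t^2 - \operatorname{tr}(M)\, t + |M|$ for a $2 \times 2$ matrix $M$:
(a) $\operatorname{tr}(A) = 2 + 1 = 3$, $|A| = 2 - 20 = -18$, so $\Delta(t) = t^2 - 3t - 18$
(b) $\operatorname{tr}(B) = 7 - 2 = 5$, $|B| = -14 + 15 = 1$, so $\Delta(t) = t^2 - 5t + 1$
(c) $\operatorname{tr}(C) = 3 - 3 = 0$, $|C| = -9 + 18 = 9$, so $\Delta(t) = t^2 + 9$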
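The $2 \times 2$ formula used in Problem 9.2 translates directly into a few lines of code. The function name `char_poly_2x2` is an illustrative choice.

```python
def char_poly_2x2(M):
    """Coefficients (1, -tr(M), det(M)) of t^2 - tr(M) t + det(M)."""
    (a, b), (c, d) = M
    return (1, -(a + d), a * d - b * c)

A = [[2, 5], [4, 1]]
B = [[7, -3], [5, -2]]
C = [[3, -2], [9, -3]]

for name, matrix in (("A", A), ("B", B), ("C", C)):
    print(name, char_poly_2x2(matrix))
```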
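9.3. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
(a) $A = \begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 4 \\ 6 & 4 & 5 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 6 & -2 \\ -3 & 2 & 0 \\ 0 & 3 & -4 \end{bmatrix}$
Use the formula $\Delta(t) = t^3 - \operatorname{tr}(A)\, t^2 + (A_{11} + A_{22} + A_{33})\, t - |A|$, where $A_{ii}$ is the cofactor of $a_{ii}$ in the $3 \times 3$ matrix $A = [a_{ij}]$.
(a) $\operatorname{tr}(A) = 1 + 0 + 5 = 6$,
\[ A_{11} = \begin{vmatrix} 0 & 4 \\ 4 & 5 \end{vmatrix} = -16, \qquad A_{22} = \begin{vmatrix} 1 & 3 \\ 6 & 5 \end{vmatrix} = -13, \qquad A_{33} = \begin{vmatrix} 1 & 2 \\ 3 & 0 \end{vmatrix} = -6 \]
$A_{11} + A_{22} + A_{33} = -35$, and $|A| = 48 + 36 - 16 - 30 = 38$
Thus $\Delta(t) = t^3 - 6t^2 - 35t - 38$
(b) $\operatorname{tr}(B) = 1 + 2 - 4 = -1$,
\[ B_{11} = \begin{vmatrix} 2 & 0 \\ 3 & -4 \end{vmatrix} = -8, \qquad B_{22} = \begin{vmatrix} 1 & -2 \\ 0 & -4 \end{vmatrix} = -4, \qquad B_{33} = \begin{vmatrix} 1 & 6 \\ -3 & 2 \end{vmatrix} = 20 \]
$B_{11} + B_{22} + B_{33} = 8$, and $|B| = -8 + 18 - 72 = -62$
Thus $\Delta(t) = t^3 + t^2 + 8t + 62$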
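9.4. Find the characteristic polynomial $\Delta(t)$ of each of the following matrices:
(a) $A = \begin{bmatrix} 2 & 5 & 1 & 1 \\ 1 & 4 & 2 & 2 \\ 0 & 0 & 6 & -5 \\ 0 & 0 & 2 & 3 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 0 & 3 & 3 & 4 \\ 0 & 0 & 5 & 5 \\ 0 & 0 & 0 & 6 \end{bmatrix}$
(a) $A$ is block triangular with diagonal blocks
\[ A_1 = \begin{bmatrix} 2 & 5 \\ 1 & 4 \end{bmatrix} \qquad \text{and} \qquad A_2 = \begin{bmatrix} 6 & -5 \\ 2 & 3 \end{bmatrix} \]
Thus $\Delta(t) = \Delta_{A_1}(t)\, \Delta_{A_2}(t) = (t^2 - 6t + 3)(t^2 - 9t + 28)$
(b) Since $B$ is triangular, $\Delta(t) = (t - 1)(t - 3)(t - 5)(t - 6)$.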
9.5. Find the characteristic polynomial $\Delta(t)$ of each of the following linear operators:
(a) $F: \mathbf{R}^2 \to \mathbf{R}^2$ defined by $F(x, y) = (3x + 5y, \; 2x - 7y)$.
(b) $D: V \to V$ defined by $D(f) = df/dt$, where $V$ is the space of functions with basis $S = \{\sin t, \cos t\}$.
The characteristic polynomial $\Delta(t)$ of a linear operator is equal to the characteristic polynomial of any matrix $A$ that represents the linear operator.
(a) Find the matrix $A$ that represents $F$ relative to the usual basis of $\mathbf{R}^2$. We have
\[ A = \begin{bmatrix} 3 & 5 \\ 2 & -7 \end{bmatrix}, \qquad \text{so} \qquad \Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 + 4t - 31 \]
(b) Find the matrix $A$ representing the differential operator $D$ relative to the basis $S$. We have
\[ D(\sin t) = \cos t = 0(\sin t) + 1(\cos t), \qquad D(\cos t) = -\sin t = -1(\sin t) + 0(\cos t), \qquad \text{and so} \qquad A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \]
Therefore $\Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 + 1$.
9.6. Show that a matrix $A$ and its transpose $A^T$ have the same characteristic polynomial.
By the transpose operation, $(tI - A)^T = tI^T - A^T = tI - A^T$. Since a matrix and its transpose have the same determinant,
\[ \Delta_A(t) = |tI - A| = |(tI - A)^T| = |tI - A^T| = \Delta_{A^T}(t) \]
9.7. Prove Theorem 9.1: Let $f$ and $g$ be polynomials. For any square matrix $A$ and scalar $k$,
(i) $(f + g)(A) = f(A) + g(A)$, (iii) $(kf)(A) = k f(A)$,
(ii) $(fg)(A) = f(A) g(A)$, (iv) $f(A) g(A) = g(A) f(A)$.
Suppose $f = a_n t^n + \dots + a_1 t + a_0$ and $g = b_m t^m + \dots + b_1 t + b_0$. Then, by definition,
\[ f(A) = a_n A^n + \dots + a_1 A + a_0 I \qquad \text{and} \qquad g(A) = b_m A^m + \dots + b_1 A + b_0 I \]
(i) Suppose $m \le n$ and let $b_i = 0$ if $i > m$. Then
\[ f + g = (a_n + b_n) t^n + \dots + (a_1 + b_1) t + (a_0 + b_0) \]
Hence
\[ (f + g)(A) = (a_n + b_n) A^n + \dots + (a_1 + b_1) A + (a_0 + b_0) I = a_n A^n + b_n A^n + \dots + a_1 A + b_1 A + a_0 I + b_0 I = f(A) + g(A) \]
(ii) By definition, $fg = c_{n+m} t^{n+m} + \dots + c_1 t + c_0 = \sum_{k=0}^{n+m} c_k t^k$, where
\[ c_k = a_0 b_k + a_1 b_{k-1} + \dots + a_k b_0 = \sum_{i=0}^{k} a_i b_{k-i} \]
Hence $(fg)(A) = \sum_{k=0}^{n+m} c_k A^k$ and
\[ f(A) g(A) = \Big( \sum_{i=0}^{n} a_i A^i \Big) \Big( \sum_{j=0}^{m} b_j A^j \Big) = \sum_{i=0}^{n} \sum_{j=0}^{m} a_i b_j A^{i+j} = \sum_{k=0}^{n+m} c_k A^k = (fg)(A) \]
(iii) By definition, $kf = k a_n t^n + \dots + k a_1 t + k a_0$, and so
\[ (kf)(A) = k a_n A^n + \dots + k a_1 A + k a_0 I = k(a_n A^n + \dots + a_1 A + a_0 I) = k f(A) \]
(iv) By (ii), $g(A) f(A) = (gf)(A) = (fg)(A) = f(A) g(A)$.
9.8. Prove the Cayley-Hamilton Theorem 9.2: Every matrix $A$ is a root of its characteristic polynomial $\Delta(t)$.
Let $A$ be an arbitrary $n$-square matrix and let $\Delta(t)$ be its characteristic polynomial, say,
\[ \Delta(t) = |tI - A| = t^n + a_{n-1} t^{n-1} + \dots + a_1 t + a_0 \]
Now let $B(t)$ denote the classical adjoint of the matrix $tI - A$. The elements of $B(t)$ are cofactors of the matrix $tI - A$, and hence are polynomials in $t$ of degree not exceeding $n - 1$. Thus
\[ B(t) = B_{n-1} t^{n-1} + \dots + B_1 t + B_0 \]
where the $B_i$ are $n$-square matrices over $K$ which are independent of $t$. By the fundamental property of the classical adjoint (Theorem 8.9), $(tI - A) B(t) = |tI - A| I$, or
\[ (tI - A)(B_{n-1} t^{n-1} + \dots + B_1 t + B_0) = (t^n + a_{n-1} t^{n-1} + \dots + a_1 t + a_0) I \]
Removing the parentheses and equating corresponding powers of $t$ yields
\[ B_{n-1} = I, \quad B_{n-2} - A B_{n-1} = a_{n-1} I, \quad \ldots, \quad B_0 - A B_1 = a_1 I, \quad -A B_0 = a_0 I \]
Multiplying the above equations by $A^n, A^{n-1}, \ldots, A, I$, respectively, yields
\[ A^n B_{n-1} = A^n, \quad A^{n-1} B_{n-2} - A^n B_{n-1} = a_{n-1} A^{n-1}, \quad \ldots, \quad A B_0 - A^2 B_1 = a_1 A, \quad -A B_0 = a_0 I \]
Adding the above matrix equations yields 0 on the left-hand side and $\Delta(A)$ on the right-hand side, that is,
\[ 0 = A^n + a_{n-1} A^{n-1} + \dots + a_1 A + a_0 I \]
Therefore, $\Delta(A) = 0$, which is the Cayley-Hamilton Theorem.
EIGENVALUES AND EIGENVECTORS OF 2x2 MATRICES
9.9. Let $A = \begin{bmatrix} 3 & -4 \\ 2 & -6 \end{bmatrix}$.
(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find matrices $P$ and $D$ such that $P$ is nonsingular and $D = P^{-1} A P$ is diagonal.
(a) First find the characteristic polynomial $\Delta(t)$ of $A$:
\[ \Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 + 3t - 10 = (t - 2)(t + 5) \]
The roots $\lambda = 2$ and $\lambda = -5$ of $\Delta(t)$ are the eigenvalues of $A$. We find corresponding eigenvectors.
(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix $M = A - 2I$, where the corresponding homogeneous system $MX = 0$ yields the eigenvectors corresponding to $\lambda = 2$. We have
\[ M = \begin{bmatrix} 1 & -4 \\ 2 & -8 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x - 4y &= 0 \\ 2x - 8y &= 0 \end{aligned} \quad \text{or} \quad x - 4y = 0 \]
The system has only one free variable, and $v_1 = (4, 1)$ is a nonzero solution. Thus $v_1 = (4, 1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 2$.
(ii) Subtract $\lambda = -5$ (or, equivalently, add 5) down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} 8 & -4 \\ 2 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 8x - 4y &= 0 \\ 2x - y &= 0 \end{aligned} \quad \text{or} \quad 2x - y = 0 \]
The system has only one free variable, and $v_2 = (1, 2)$ is a nonzero solution. Thus $v_2 = (1, 2)$ is an eigenvector belonging to $\lambda = -5$.
(b) Let $P$ be the matrix whose columns are $v_1$ and $v_2$. Then
\[ P = \begin{bmatrix} 4 & 1 \\ 1 & 2 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1} A P = \begin{bmatrix} 2 & 0 \\ 0 & -5 \end{bmatrix} \]
Note that $D$ is the diagonal matrix whose diagonal entries are the eigenvalues of $A$ corresponding to the eigenvectors appearing in $P$.
Remark: Here $P$ is the change-of-basis matrix from the usual basis of $\mathbf{R}^2$ to the basis $S = \{v_1, v_2\}$, and $D$ is the matrix that represents (the matrix function) $A$ relative to the new basis $S$.
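9.10. Let $A = \begin{bmatrix} 2 & 2 \\ 1 & 3 \end{bmatrix}$.
(a) Find all eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix $P$ such that $D = P^{-1} A P$ is diagonal, and $P^{-1}$.
(c) Find $A^6$ and $f(A)$, where $f(t) = t^4 - 3t^3 - 6t^2 + 7t + 3$.
(d) Find a "positive square root" of $A$, that is, a matrix $B$ such that $B^2 = A$ and $B$ has positive eigenvalues.
(a) First find the characteristic polynomial $\Delta(t)$ of $A$:
\[ \Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 - 5t + 4 = (t - 1)(t - 4) \]
The roots $\lambda = 1$ and $\lambda = 4$ of $\Delta(t)$ are the eigenvalues of $A$. We find corresponding eigenvectors.
(i) Subtract $\lambda = 1$ down the diagonal of $A$ to obtain the matrix $M = A - \lambda I$, where the corresponding homogeneous system $MX = 0$ yields the eigenvectors belonging to $\lambda = 1$. We have
\[ M = \begin{bmatrix} 1 & 2 \\ 1 & 2 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x + 2y &= 0 \\ x + 2y &= 0 \end{aligned} \quad \text{or} \quad x + 2y = 0 \]
The system has only one independent solution; for example, $x = 2$, $y = -1$. Thus $v_1 = (2, -1)$ is an eigenvector belonging to (and spanning the eigenspace of) $\lambda = 1$.
(ii) Subtract $\lambda = 4$ down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} -2 & 2 \\ 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -2x + 2y &= 0 \\ x - y &= 0 \end{aligned} \quad \text{or} \quad x - y = 0 \]
The system has only one independent solution; for example, $x = 1$, $y = 1$. Thus $v_2 = (1, 1)$ is an eigenvector belonging to $\lambda = 4$.
(b) Let $P$ be the matrix whose columns are $v_1$ and $v_2$. Then
\[ P = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \quad \text{and} \quad D = P^{-1} A P = \begin{bmatrix} 1 & 0 \\ 0 & 4 \end{bmatrix}, \quad \text{where} \quad P^{-1} = \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} \]
(c) Using the diagonal factorization $A = P D P^{-1}$, and $1^6 = 1$ and $4^6 = 4096$, we get
\[ A^6 = P D^6 P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 4096 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} = \begin{bmatrix} 1366 & 2730 \\ 1365 & 2731 \end{bmatrix} \]
Also, $f(1) = 2$ and $f(4) = -1$. Hence
\[ f(A) = P f(D) P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & -1 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} = \begin{bmatrix} 1 & -2 \\ -1 & 0 \end{bmatrix} \]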
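(d) Here $\begin{bmatrix} \pm 1 & 0 \\ 0 & \pm 2 \end{bmatrix}$ are square roots of $D$. Hence the positive square root of $A$ is
\[ B = P \sqrt{D}\, P^{-1} = \begin{bmatrix} 2 & 1 \\ -1 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix} \begin{bmatrix} \tfrac{1}{3} & -\tfrac{1}{3} \\ \tfrac{1}{3} & \tfrac{2}{3} \end{bmatrix} = \begin{bmatrix} \tfrac{4}{3} & \tfrac{2}{3} \\ \tfrac{1}{3} & \tfrac{5}{3} \end{bmatrix} \]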
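The computations in Problem 9.10(c) and (d) can be cross-checked with exact rational arithmetic. The sketch below, using only the standard `fractions` module and an illustrative `matmul` helper, compares $P D^6 P^{-1}$ with six direct multiplications and squares the proposed root $B$.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[2, 2], [1, 3]]
P = [[2, 1], [-1, 1]]
P_inv = [[Fraction(1, 3), Fraction(-1, 3)],
         [Fraction(1, 3), Fraction(2, 3)]]

# A^6 through the diagonal factorization A = P D P^{-1}
D6 = [[1, 0], [0, 4 ** 6]]
A6_diag = matmul(matmul(P, D6), P_inv)

# A^6 by repeated multiplication, as an independent check
A6_direct = [[1, 0], [0, 1]]
for _ in range(6):
    A6_direct = matmul(A6_direct, A)

# Positive square root B = P sqrt(D) P^{-1}
B = matmul(matmul(P, [[1, 0], [0, 2]]), P_inv)
B_squared = matmul(B, B)

print(A6_diag == A6_direct, B_squared == A)
```

Working with `Fraction` avoids the rounding that the thirds in $P^{-1}$ would introduce in floating point.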
9.11. Each of the following real matrices defines a linear transformation on $\mathbf{R}^2$:
(a) $A = \begin{bmatrix} 5 & 6 \\ 3 & -2 \end{bmatrix}$, (b) $B = \begin{bmatrix} 1 & -1 \\ 2 & -1 \end{bmatrix}$, (c) $C = \begin{bmatrix} 5 & -1 \\ 1 & 3 \end{bmatrix}$
Find, for each matrix, all eigenvalues and a maximum set $S$ of linearly independent eigenvectors. Which of these linear operators are diagonalizable, that is, which can be represented by a diagonal matrix?
(a) First find $\Delta(t) = t^2 - 3t - 28 = (t - 7)(t + 4)$. The roots $\lambda = 7$ and $\lambda = -4$ are the eigenvalues of $A$. We find corresponding eigenvectors.
(i) Subtract $\lambda = 7$ down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} -2 & 6 \\ 3 & -9 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -2x + 6y &= 0 \\ 3x - 9y &= 0 \end{aligned} \quad \text{or} \quad x - 3y = 0 \]
Here $v_1 = (3, 1)$ is a nonzero solution.
(ii) Subtract $\lambda = -4$ (or add 4) down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} 9 & 6 \\ 3 & 2 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 9x + 6y &= 0 \\ 3x + 2y &= 0 \end{aligned} \quad \text{or} \quad 3x + 2y = 0 \]
Here $v_2 = (2, -3)$ is a nonzero solution.
Then $S = \{v_1, v_2\} = \{(3, 1), (2, -3)\}$ is a maximal set of linearly independent eigenvectors. Since $S$ is a basis of $\mathbf{R}^2$, $A$ is diagonalizable. Using the basis $S$, $A$ is represented by the diagonal matrix $D = \operatorname{diag}(7, -4)$.
(b) First find the characteristic polynomial $\Delta(t) = t^2 + 1$. There are no real roots. Thus $B$, a real matrix representing a linear transformation on $\mathbf{R}^2$, has no eigenvalues and no eigenvectors. Hence, in particular, $B$ is not diagonalizable.
(c) First find $\Delta(t) = t^2 - 8t + 16 = (t - 4)^2$. Thus $\lambda = 4$ is the only eigenvalue of $C$. Subtract $\lambda = 4$ down the diagonal of $C$ to obtain
\[ M = \begin{bmatrix} 1 & -1 \\ 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad x - y = 0 \]
The homogeneous system has only one independent solution; for example, $x = 1$, $y = 1$. Thus $v = (1, 1)$ is an eigenvector of $C$. Furthermore, since there are no other eigenvalues, the singleton set $S = \{v\} = \{(1, 1)\}$ is a maximal set of linearly independent eigenvectors of $C$. Since $S$ is not a basis of $\mathbf{R}^2$, $C$ is not diagonalizable.
9.12. Suppose the matrix $B$ in Problem 9.11 represents a linear operator on complex space $\mathbf{C}^2$. Show that, in this case, $B$ is diagonalizable by finding a basis $S$ of $\mathbf{C}^2$ consisting of eigenvectors of $B$.
The characteristic polynomial of $B$ is still $\Delta(t) = t^2 + 1$. As a polynomial over $\mathbf{C}$, $\Delta(t)$ does factor; specifically, $\Delta(t) = (t - i)(t + i)$. Thus $\lambda = i$ and $\lambda = -i$ are the eigenvalues of $B$.
(i) Subtract $\lambda = i$ down the diagonal of $B$ to obtain the homogeneous system
\[ \begin{aligned} (1 - i)x - y &= 0 \\ 2x + (-1 - i)y &= 0 \end{aligned} \quad \text{or} \quad (1 - i)x - y = 0 \]
The system has only one independent solution; for example, $x = 1$, $y = 1 - i$. Thus $v_1 = (1, 1 - i)$ is an eigenvector that spans the eigenspace of $\lambda = i$.
(ii) Subtract $\lambda = -i$ (or add $i$) down the diagonal of $B$ to obtain the homogeneous system
\[ \begin{aligned} (1 + i)x - y &= 0 \\ 2x + (-1 + i)y &= 0 \end{aligned} \quad \text{or} \quad (1 + i)x - y = 0 \]
The system has only one independent solution; for example, $x = 1$, $y = 1 + i$. Thus $v_2 = (1, 1 + i)$ is an eigenvector that spans the eigenspace of $\lambda = -i$.
As a complex matrix, $B$ is diagonalizable. Specifically, $S = \{v_1, v_2\} = \{(1, 1 - i), (1, 1 + i)\}$ is a basis of $\mathbf{C}^2$ consisting of eigenvectors of $B$. Using this basis $S$, $B$ is represented by the diagonal matrix $D = \operatorname{diag}(i, -i)$.
9.13. Let $L$ be the linear transformation on $\mathbf{R}^2$ that reflects each point $P$ across the line $y = kx$, where $k > 0$. (See Fig. 9-1.)
(a) Show that $v_1 = (1, k)$ and $v_2 = (k, -1)$ are eigenvectors of $L$.
(b) Show that $L$ is diagonalizable, and find a diagonal representation $D$.
[Fig. 9-1: reflection across the line $y = kx$]
(a) The vector $v_1 = (1, k)$ lies on the line $y = kx$, and hence is left fixed by $L$, that is, $L(v_1) = v_1$. Thus $v_1$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_1 = 1$.
The vector $v_2 = (k, -1)$ is perpendicular to the line $y = kx$ (since $v_1 \cdot v_2 = k - k = 0$), and hence $L$ reflects $v_2$ into its negative; that is, $L(v_2) = -v_2$. Thus $v_2$ is an eigenvector of $L$ belonging to the eigenvalue $\lambda_2 = -1$.
(b) Here $S = \{v_1, v_2\}$ is a basis of $\mathbf{R}^2$ consisting of eigenvectors of $L$. Thus $L$ is diagonalizable, with the diagonal representation $D = \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$ (relative to the basis $S$).
EIGENVALUES AND EIGENVECTORS
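9.14. Let $A = \begin{bmatrix} 4 & 1 & -1 \\ 2 & 5 & -2 \\ 1 & 1 & 2 \end{bmatrix}$. (a) Find all eigenvalues of $A$.
(b) Find a maximum set $S$ of linearly independent eigenvectors of $A$.
(c) Is $A$ diagonalizable? If yes, find $P$ such that $D = P^{-1} A P$ is diagonal.
(a) First find the characteristic polynomial $\Delta(t)$ of $A$. We have
\[ \operatorname{tr}(A) = 4 + 5 + 2 = 11 \qquad \text{and} \qquad |A| = 40 - 2 - 2 + 5 + 8 - 4 = 45 \]
Also, find each cofactor $A_{ii}$ of $a_{ii}$ in $A$:
\[ A_{11} = \begin{vmatrix} 5 & -2 \\ 1 & 2 \end{vmatrix} = 12, \qquad A_{22} = \begin{vmatrix} 4 & -1 \\ 1 & 2 \end{vmatrix} = 9, \qquad A_{33} = \begin{vmatrix} 4 & 1 \\ 2 & 5 \end{vmatrix} = 18 \]
Hence
\[ \Delta(t) = t^3 - \operatorname{tr}(A)\, t^2 + (A_{11} + A_{22} + A_{33})\, t - |A| = t^3 - 11t^2 + 39t - 45 \]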
Assuming $\Delta(t)$ has a rational root, it must be among $\pm 1, \pm 3, \pm 5, \pm 9, \pm 15, \pm 45$. Testing, by synthetic division, we get
    3 | 1  -11   39  -45
      |       3  -24   45
      |  1   -8   15    0
Thus $t = 3$ is a root of $\Delta(t)$, and $t - 3$ is a factor; hence
\[ \Delta(t) = (t - 3)(t^2 - 8t + 15) = (t - 3)(t - 5)(t - 3) = (t - 3)^2 (t - 5) \]
Accordingly, $\lambda = 3$ and $\lambda = 5$ are eigenvalues of $A$.
(b) Find linearly independent eigenvectors for each eigenvalue of $A$.
(i) Subtract $\lambda = 3$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 1 & 1 & -1 \\ 2 & 2 & -2 \\ 1 & 1 & -1 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - z = 0 \]
Here $u = (1, -1, 0)$ and $v = (1, 0, 1)$ are linearly independent solutions.
(ii) Subtract $\lambda = 5$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} -1 & 1 & -1 \\ 2 & 0 & -2 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -x + y - z &= 0 \\ 2x - 2z &= 0 \\ x + y - 3z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - z &= 0 \\ y - 2z &= 0 \end{aligned} \]
Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.
Thus $S = \{u, v, w\} = \{(1, -1, 0), (1, 0, 1), (1, 2, 1)\}$ is a maximal set of linearly independent eigenvectors of $A$.
Remark: The vectors $u$ and $v$ were chosen so that they were independent solutions of the system $x + y - z = 0$. On the other hand, $w$ is automatically independent of $u$ and $v$ since $w$ belongs to a different eigenvalue of $A$. Thus the three vectors are linearly independent.
(c) $A$ is diagonalizable, since it has three linearly independent eigenvectors. Let $P$ be the matrix with columns $u, v, w$. Then
\[ P = \begin{bmatrix} 1 & 1 & 1 \\ -1 & 0 & 2 \\ 0 & 1 & 1 \end{bmatrix} \qquad \text{and} \qquad D = P^{-1} A P = \begin{bmatrix} 3 & & \\ & 3 & \\ & & 5 \end{bmatrix} \]
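9.15. Repeat Problem 9.14 for the matrix $B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}$.
(a) First find the characteristic polynomial $\Delta(t)$ of $B$. We have
\[ \operatorname{tr}(B) = 0, \qquad |B| = -16, \qquad B_{11} = -4, \qquad B_{22} = 0, \qquad B_{33} = -8, \qquad \text{so} \qquad \sum_i B_{ii} = -12 \]
Therefore $\Delta(t) = t^3 - 12t + 16 = (t - 2)^2 (t + 4)$. Thus $\lambda_1 = 2$ and $\lambda_2 = -4$ are the eigenvalues of $B$.
(b) Find a basis for the eigenspace of each eigenvalue of $B$.
(i) Subtract $\lambda_1 = 2$ down the diagonal of $B$ to obtain
\[ M = \begin{bmatrix} 1 & -1 & 1 \\ 7 & -7 & 1 \\ 6 & -6 & 0 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} x - y + z &= 0 \\ 7x - 7y + z &= 0 \\ 6x - 6y &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z &= 0 \\ z &= 0 \end{aligned} \]
The system has only one independent solution; for example, $x = 1$, $y = 1$, $z = 0$. Thus $u = (1, 1, 0)$ forms a basis for the eigenspace of $\lambda_1 = 2$.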
(ii) Subtract $\lambda_2 = -4$ (or add 4) down the diagonal of $B$ to obtain
\[ M = \begin{bmatrix} 7 & -1 & 1 \\ 7 & -1 & 1 \\ 6 & -6 & 6 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 7x - y + z &= 0 \\ 7x - y + z &= 0 \\ 6x - 6y + 6z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x - y + z &= 0 \\ 6y - 6z &= 0 \end{aligned} \]
The system has only one independent solution; for example, $x = 0$, $y = 1$, $z = 1$. Thus $v = (0, 1, 1)$ forms a basis for the eigenspace of $\lambda_2 = -4$.
Thus $S = \{u, v\}$ is a maximal set of linearly independent eigenvectors of $B$.
(c) Since $B$ has at most two linearly independent eigenvectors, $B$ is not similar to a diagonal matrix; that is, $B$ is not diagonalizable.
9.16. Find the algebraic and geometric multiplicities of the eigenvalue $\lambda_1 = 2$ of the matrix $B$ in Problem 9.15.
The algebraic multiplicity of $\lambda_1 = 2$ is 2, since $t - 2$ appears with exponent 2 in $\Delta(t)$. However, the geometric multiplicity of $\lambda_1 = 2$ is 1, since $\dim E_{\lambda_1} = 1$ (where $E_{\lambda_1}$ is the eigenspace of $\lambda_1$).
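The geometric multiplicity in Problem 9.16 is the dimension of the solution space of $(B - \lambda I)X = 0$, that is, $n - \operatorname{rank}(B - \lambda I)$. A minimal sketch of that computation follows, using exact Gaussian elimination; the function names are illustrative, and the matrix of Problem 9.14 is included only for contrast.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix (list of rows) by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0                                   # number of pivots found so far
    for col in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue                        # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][col] / rows[r][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def geometric_multiplicity(A, lam):
    """Dimension of the eigenspace of lam: n - rank(A - lam*I)."""
    n = len(A)
    shifted = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
               for i in range(n)]
    return n - rank(shifted)

B = [[3, -1, 1], [7, -5, 1], [6, -6, 2]]      # Problem 9.15
A = [[4, 1, -1], [2, 5, -2], [1, 1, 2]]       # Problem 9.14

print(geometric_multiplicity(B, 2), geometric_multiplicity(A, 3))
```

For $B$ the eigenvalue 2 has geometric multiplicity 1 against algebraic multiplicity 2, so $B$ is not diagonalizable; for $A$ the eigenvalue 3 has both multiplicities equal to 2.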
9.17. Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $T(x, y, z) = (2x + y - 2z, \; 2x + 3y - 4z, \; x + y - z)$. Find all eigenvalues of $T$, and find a basis of each eigenspace. Is $T$ diagonalizable? If so, find the basis $S$ of $\mathbf{R}^3$ that diagonalizes $T$, and find its diagonal representation $D$.
First find the matrix $A$ that represents $T$ relative to the usual basis of $\mathbf{R}^3$ by writing down the coefficients of $x, y, z$ as rows, and then find the characteristic polynomial of $A$ (and $T$). We have
\[ A = [T] = \begin{bmatrix} 2 & 1 & -2 \\ 2 & 3 & -4 \\ 1 & 1 & -1 \end{bmatrix} \qquad \text{and} \qquad \operatorname{tr}(A) = 4, \quad |A| = 2, \quad A_{11} = 1, \quad A_{22} = 0, \quad A_{33} = 4, \quad \sum_i A_{ii} = 5 \]
Therefore $\Delta(t) = t^3 - 4t^2 + 5t - 2 = (t - 1)^2 (t - 2)$, and so $\lambda = 1$ and $\lambda = 2$ are the eigenvalues of $A$ (and $T$). We next find linearly independent eigenvectors for each eigenvalue of $A$.
(i) Subtract $\lambda = 1$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 1 & 1 & -2 \\ 2 & 2 & -4 \\ 1 & 1 & -2 \end{bmatrix}, \quad \text{corresponding to} \quad x + y - 2z = 0 \]
Here $y$ and $z$ are free variables, and so there are two linearly independent eigenvectors belonging to $\lambda = 1$. For example, $u = (1, -1, 0)$ and $v = (2, 0, 1)$ are two such eigenvectors.
(ii) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain
\[ M = \begin{bmatrix} 0 & 1 & -2 \\ 2 & 1 & -4 \\ 1 & 1 & -3 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} y - 2z &= 0 \\ 2x + y - 4z &= 0 \\ x + y - 3z &= 0 \end{aligned} \quad \text{or} \quad \begin{aligned} x + y - 3z &= 0 \\ y - 2z &= 0 \end{aligned} \]
Only $z$ is a free variable. Here $w = (1, 2, 1)$ is a solution.
Thus $T$ is diagonalizable, since it has three independent eigenvectors. Specifically, choosing
\[ S = \{u, v, w\} = \{(1, -1, 0), (2, 0, 1), (1, 2, 1)\} \]
as a basis, $T$ is represented by the diagonal matrix $D = \operatorname{diag}(1, 1, 2)$.
9.18. Prove the following for a linear operator (matrix) $T$:
(a) The scalar 0 is an eigenvalue of $T$ if and only if $T$ is singular.
(b) If $\lambda$ is an eigenvalue of $T$, where $T$ is invertible, then $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.
(a) We have that 0 is an eigenvalue of $T$ if and only if there is a vector $v \neq 0$ such that $T(v) = 0v$, that is, if and only if $T$ is singular.
(b) Since $T$ is invertible, it is nonsingular; hence, by (a), $\lambda \neq 0$. By definition of an eigenvalue, there exists $v \neq 0$ such that $T(v) = \lambda v$. Applying $T^{-1}$ to both sides, we obtain
\[ v = T^{-1}(\lambda v) = \lambda T^{-1}(v), \qquad \text{and so} \qquad T^{-1}(v) = \lambda^{-1} v \]
Therefore $\lambda^{-1}$ is an eigenvalue of $T^{-1}$.
9.19. Let $\lambda$ be an eigenvalue of a linear operator $T: V \to V$, and let $E_\lambda$ consist of all the eigenvectors belonging to $\lambda$ (called the eigenspace of $\lambda$). Prove that $E_\lambda$ is a subspace of $V$. That is, prove:
(a) If $u \in E_\lambda$, then $ku \in E_\lambda$ for any scalar $k$. (b) If $u, v \in E_\lambda$, then $u + v \in E_\lambda$.
(a) Since $u \in E_\lambda$, we have $T(u) = \lambda u$. Then $T(ku) = kT(u) = k(\lambda u) = \lambda(ku)$, and so $ku \in E_\lambda$.
(We view the zero vector $0 \in V$ as an "eigenvector" of $\lambda$ in order for $E_\lambda$ to be a subspace of $V$.)
(b) Since $u, v \in E_\lambda$, we have $T(u) = \lambda u$ and $T(v) = \lambda v$. Then
\[ T(u + v) = T(u) + T(v) = \lambda u + \lambda v = \lambda(u + v), \qquad \text{and so} \qquad u + v \in E_\lambda \]
9.20. Prove Theorem 9.6: The following are equivalent: (i) The scalar $\lambda$ is an eigenvalue of $A$.
(ii) The matrix $\lambda I - A$ is singular.
(iii) The scalar $\lambda$ is a root of the characteristic polynomial $\Delta(t)$ of $A$.
The scalar $\lambda$ is an eigenvalue of $A$ if and only if there exists a nonzero vector $v$ such that
\[ Av = \lambda v \qquad \text{or} \qquad (\lambda I)v - Av = 0 \qquad \text{or} \qquad (\lambda I - A)v = 0 \]
or $\lambda I - A$ is singular. In such a case, $\lambda$ is a root of $\Delta(t) = |tI - A|$. Also, $v$ is in the eigenspace $E_\lambda$ of $\lambda$ if and only if the above relations hold. Hence $v$ is a solution of $(\lambda I - A)X = 0$.
9.21. Prove Theorem 9.8': Suppose $v_1, v_2, \ldots, v_n$ are nonzero eigenvectors of $T$ belonging to distinct eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$. Then $v_1, v_2, \ldots, v_n$ are linearly independent.
Suppose the theorem is not true. Let $v_1, v_2, \ldots, v_s$ be a minimal set of vectors for which the theorem is not true. We have $s > 1$, since $v_1 \neq 0$. Also, by the minimality condition, $v_2, \ldots, v_s$ are linearly independent. Thus $v_1$ is a linear combination of $v_2, \ldots, v_s$, say,
\[ v_1 = a_2 v_2 + a_3 v_3 + \dots + a_s v_s \tag{1} \]
(where some $a_k \neq 0$). Applying $T$ to (1) and using the linearity of $T$ yields
\[ T(v_1) = T(a_2 v_2 + a_3 v_3 + \dots + a_s v_s) = a_2 T(v_2) + a_3 T(v_3) + \dots + a_s T(v_s) \tag{2} \]
Since $v_j$ is an eigenvector of $T$ belonging to $\lambda_j$, we have $T(v_j) = \lambda_j v_j$. Substituting in (2) yields
\[ \lambda_1 v_1 = a_2 \lambda_2 v_2 + a_3 \lambda_3 v_3 + \dots + a_s \lambda_s v_s \tag{3} \]
Multiplying (1) by $\lambda_1$ yields
\[ \lambda_1 v_1 = a_2 \lambda_1 v_2 + a_3 \lambda_1 v_3 + \dots + a_s \lambda_1 v_s \tag{4} \]
Setting the right-hand sides of (3) and (4) equal to each other, or subtracting (3) from (4), yields
\[ a_2(\lambda_1 - \lambda_2) v_2 + a_3(\lambda_1 - \lambda_3) v_3 + \dots + a_s(\lambda_1 - \lambda_s) v_s = 0 \tag{5} \]
Since $v_2, v_3, \ldots, v_s$ are linearly independent, the coefficients in (5) must all be zero. That is,
\[ a_2(\lambda_1 - \lambda_2) = 0, \quad a_3(\lambda_1 - \lambda_3) = 0, \quad \ldots, \quad a_s(\lambda_1 - \lambda_s) = 0 \]
However, the $\lambda_i$ are distinct. Hence $\lambda_1 - \lambda_j \neq 0$ for $j > 1$. Hence $a_2 = 0, a_3 = 0, \ldots, a_s = 0$. This contradicts the fact that some $a_k \neq 0$. Thus the theorem is proved.
9.22. Prove Theorem 9.9. Suppose $\Delta(t) = (t - a_1)(t - a_2) \cdots (t - a_n)$ is the characteristic polynomial of an $n$-square matrix $A$, and suppose the $n$ roots $a_i$ are distinct. Then $A$ is similar to the diagonal matrix $D = \operatorname{diag}(a_1, a_2, \ldots, a_n)$.
Let $v_1, v_2, \ldots, v_n$ be (nonzero) eigenvectors corresponding to the eigenvalues $a_i$. Then the $n$ eigenvectors $v_i$ are linearly independent (Theorem 9.8), and hence form a basis of $K^n$. Accordingly, $A$ is diagonalizable, that is, $A$ is similar to a diagonal matrix $D$, and the diagonal elements of $D$ are the eigenvalues $a_i$.
9.23. Prove Theorem 9.10': The geometric multiplicity of an eigenvalue $\lambda$ of $T$ does not exceed its algebraic multiplicity.
Suppose the geometric multiplicity of $\lambda$ is $r$. Then its eigenspace $E_\lambda$ contains $r$ linearly independent eigenvectors $v_1, \ldots, v_r$. Extend the set $\{v_i\}$ to a basis of $V$, say, $\{v_1, \ldots, v_r, w_1, \ldots, w_s\}$. We have
\[ \begin{aligned} T(v_1) &= \lambda v_1, \quad T(v_2) = \lambda v_2, \quad \ldots, \quad T(v_r) = \lambda v_r, \\ T(w_1) &= a_{11} v_1 + \dots + a_{1r} v_r + b_{11} w_1 + \dots + b_{1s} w_s \\ T(w_2) &= a_{21} v_1 + \dots + a_{2r} v_r + b_{21} w_1 + \dots + b_{2s} w_s \\ &\;\;\vdots \\ T(w_s) &= a_{s1} v_1 + \dots + a_{sr} v_r + b_{s1} w_1 + \dots + b_{ss} w_s \end{aligned} \]
Then $M = \begin{bmatrix} \lambda I_r & A \\ 0 & B \end{bmatrix}$ is the matrix of $T$ in the above basis, where $A = [a_{ij}]^T$ and $B = [b_{ij}]^T$.
Since $M$ is block triangular, the characteristic polynomial $(t - \lambda)^r$ of the block $\lambda I_r$ must divide the characteristic polynomial of $M$ and hence of $T$. Thus the algebraic multiplicity of $\lambda$ for $T$ is at least $r$, as required.
DIAGONALIZING REAL SYMMETRIC MATRICES
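9.24. Let $A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$. Find an orthogonal matrix $P$ such that $D = P^{-1} A P$ is diagonal.
First find the characteristic polynomial $\Delta(t)$ of $A$. We have
\[ \Delta(t) = t^2 - \operatorname{tr}(A)\, t + |A| = t^2 - 6t - 16 = (t - 8)(t + 2) \]
Thus the eigenvalues of $A$ are $\lambda = 8$ and $\lambda = -2$. We next find corresponding eigenvectors.
Subtract $\lambda = 8$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \quad \text{or} \quad x - 3y = 0 \]
A nonzero solution is $u_1 = (3, 1)$.
Subtract $\lambda = -2$ (or add 2) down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0 \]
A nonzero solution is $u_2 = (1, -3)$.
As expected, since $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal. Normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors
\[ \hat{u}_1 = (3/\sqrt{10}, \; 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (1/\sqrt{10}, \; -3/\sqrt{10}) \]
Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively. Then
\[ P = \begin{bmatrix} 3/\sqrt{10} & 1/\sqrt{10} \\ 1/\sqrt{10} & -3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1} A P = \begin{bmatrix} 8 & 0 \\ 0 & -2 \end{bmatrix} \]
As expected, the diagonal entries in $D$ are the eigenvalues of $A$.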
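The orthogonal diagonalization of Problem 9.24 can be verified in floating point. In the sketch below the matrix $A$ is taken as the symmetric matrix with rows $(7, 3)$ and $(3, -1)$, which is the matrix consistent with $A - 8I$ and $A + 2I$ as printed in the solution; the helper names are illustrative.

```python
import math

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(X):
    return [list(col) for col in zip(*X)]

A = [[7, 3], [3, -1]]
s = math.sqrt(10)
P = [[3 / s, 1 / s],
     [1 / s, -3 / s]]          # columns are the unit eigenvectors

PtP = matmul(transpose(P), P)                 # should be the identity
D = matmul(matmul(transpose(P), A), P)        # P^{-1} = P^T for orthogonal P

def close(X, Y, tol=1e-12):
    return all(math.isclose(x, y, abs_tol=tol)
               for rx, ry in zip(X, Y) for x, y in zip(rx, ry))

print(close(PtP, [[1, 0], [0, 1]]), close(D, [[8, 0], [0, -2]]))
```

Because $P$ is orthogonal, the inverse never has to be computed: the transpose serves as $P^{-1}$.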
9.25. Let $B = \begin{bmatrix} 11 & -8 & 4 \\ -8 & -1 & -2 \\ 4 & -2 & -4 \end{bmatrix}$. (a) Find all eigenvalues of $B$.
(b) Find a maximal set $S$ of nonzero orthogonal eigenvectors of $B$.
(c) Find an orthogonal matrix $P$ such that $D = P^{-1} B P$ is diagonal.
(a) First find the characteristic polynomial of $B$. We have
\[ \operatorname{tr}(B) = 6, \qquad |B| = 400, \qquad B_{11} = 0, \qquad B_{22} = -60, \qquad B_{33} = -75, \qquad \text{so} \qquad \sum_i B_{ii} = -135 \]
Hence $\Delta(t) = t^3 - 6t^2 - 135t - 400$. If $\Delta(t)$ has an integer root it must divide 400. Testing $t = -5$, by synthetic division, yields
    -5 | 1   -6  -135  -400
       |     -5    55   400
       | 1  -11   -80     0
Thus $t + 5$ is a factor of $\Delta(t)$, and
\[ \Delta(t) = (t + 5)(t^2 - 11t - 80) = (t + 5)^2 (t - 16) \]
The eigenvalues of $B$ are $\lambda = -5$ (multiplicity 2), and $\lambda = 16$ (multiplicity 1).
(b) Find an orthogonal basis for each eigenspace. Subtract $\lambda = -5$ (or, add 5) down the diagonal of $B$ to obtain the homogeneous system
\[ 16x - 8y + 4z = 0, \qquad -8x + 4y - 2z = 0, \qquad 4x - 2y + z = 0 \]
That is, $4x - 2y + z = 0$. The system has two independent solutions. One solution is $v_1 = (0, 1, 2)$. We seek a second solution $v_2 = (a, b, c)$, which is orthogonal to $v_1$, that is, such that
\[ 4a - 2b + c = 0, \qquad \text{and also} \qquad b + 2c = 0 \]
One such solution is $v_2 = (-5, -8, 4)$.
Subtract $\lambda = 16$ down the diagonal of $B$ to obtain the homogeneous system
\[ -5x - 8y + 4z = 0, \qquad -8x - 17y - 2z = 0, \qquad 4x - 2y - 20z = 0 \]
This system yields a nonzero solution $v_3 = (4, -2, 1)$. (As expected from Theorem 9.13, the eigenvector $v_3$ is orthogonal to $v_1$ and $v_2$.)
Then $v_1, v_2, v_3$ form a maximal set of nonzero orthogonal eigenvectors of $B$.
(c) Normalize $v_1, v_2, v_3$ to obtain the orthonormal basis
\[ \hat{v}_1 = v_1/\sqrt{5}, \qquad \hat{v}_2 = v_2/\sqrt{105}, \qquad \hat{v}_3 = v_3/\sqrt{21} \]
Then $P$ is the matrix whose columns are $\hat{v}_1, \hat{v}_2, \hat{v}_3$. Thus
\[ P = \begin{bmatrix} 0 & -5/\sqrt{105} & 4/\sqrt{21} \\ 1/\sqrt{5} & -8/\sqrt{105} & -2/\sqrt{21} \\ 2/\sqrt{5} & 4/\sqrt{105} & 1/\sqrt{21} \end{bmatrix} \qquad \text{and} \qquad D = P^{-1} B P = \begin{bmatrix} -5 & & \\ & -5 & \\ & & 16 \end{bmatrix} \]
9.26. Let $q(x, y) = x^2 + 6xy - 7y^2$. Find an orthogonal substitution that diagonalizes $q$.
Find the symmetric matrix $A$ that represents $q$ and its characteristic polynomial $\Delta(t)$. We have
\[ A = \begin{bmatrix} 1 & 3 \\ 3 & -7 \end{bmatrix} \qquad \text{and} \qquad \Delta(t) = t^2 + 6t - 16 = (t - 2)(t + 8) \]
The eigenvalues of $A$ are $\lambda = 2$ and $\lambda = -8$. Thus, using $s$ and $t$ as new variables, a diagonal form of $q$ is
\[ q(s, t) = 2s^2 - 8t^2 \]
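The corresponding orthogonal substitution is obtained by finding an orthogonal set of eigenvectors of $A$.
(i) Subtract $\lambda = 2$ down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} -1 & 3 \\ 3 & -9 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} -x + 3y &= 0 \\ 3x - 9y &= 0 \end{aligned} \quad \text{or} \quad -x + 3y = 0 \]
A nonzero solution is $u_1 = (3, 1)$.
(ii) Subtract $\lambda = -8$ (or add 8) down the diagonal of $A$ to obtain the matrix
\[ M = \begin{bmatrix} 9 & 3 \\ 3 & 1 \end{bmatrix}, \quad \text{corresponding to} \quad \begin{aligned} 9x + 3y &= 0 \\ 3x + y &= 0 \end{aligned} \quad \text{or} \quad 3x + y = 0 \]
A nonzero solution is $u_2 = (-1, 3)$.
As expected, since $A$ is symmetric, the eigenvectors $u_1$ and $u_2$ are orthogonal.
Now normalize $u_1$ and $u_2$ to obtain, respectively, the unit vectors
\[ \hat{u}_1 = (3/\sqrt{10}, \; 1/\sqrt{10}) \qquad \text{and} \qquad \hat{u}_2 = (-1/\sqrt{10}, \; 3/\sqrt{10}) \]
Finally, let $P$ be the matrix whose columns are the unit vectors $\hat{u}_1$ and $\hat{u}_2$, respectively, and then $[x, y]^T = P[s, t]^T$ is the required orthogonal change of coordinates. That is,
\[ P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \qquad \text{and} \qquad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}} \]
One can also express $s$ and $t$ in terms of $x$ and $y$ by using $P^{-1} = P^T$. That is,
\[ s = \frac{3x + y}{\sqrt{10}}, \qquad t = \frac{-x + 3y}{\sqrt{10}} \]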
MINIMAL POLYNOMIAL
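9.27. Let $A = \begin{bmatrix} 4 & -2 & 2 \\ 6 & -3 & 4 \\ 3 & -2 & 3 \end{bmatrix}$ and $B = \begin{bmatrix} 3 & -2 & 2 \\ 4 & -4 & 6 \\ 2 & -3 & 5 \end{bmatrix}$. The characteristic polynomial of both matrices is $\Delta(t) = (t - 2)(t - 1)^2$. Find the minimal polynomial $m(t)$ of each matrix.
The minimal polynomial $m(t)$ must divide $\Delta(t)$. Also, each factor of $\Delta(t)$, that is, $t - 2$ and $t - 1$, must also be a factor of $m(t)$. Thus $m(t)$ must be exactly one of the following:
$$f(t) = (t - 2)(t - 1) \quad \text{or} \quad g(t) = (t - 2)(t - 1)^2$$
(a) By the Cayley-Hamilton Theorem, $g(A) = \Delta(A) = 0$, so we need only test $f(A)$. We have
$$f(A) = (A - 2I)(A - I) = \begin{bmatrix} 2 & -2 & 2 \\ 6 & -5 & 4 \\ 3 & -2 & 1 \end{bmatrix} \begin{bmatrix} 3 & -2 & 2 \\ 6 & -4 & 4 \\ 3 & -2 & 2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$$
Thus $m(t) = f(t) = (t - 2)(t - 1) = t^2 - 3t + 2$ is the minimal polynomial of $A$.
(b) Again $g(B) = \Delta(B) = 0$, so we need only test $f(B)$. We get
$$f(B) = (B - 2I)(B - I) = \begin{bmatrix} 1 & -2 & 2 \\ 4 & -6 & 6 \\ 2 & -3 & 3 \end{bmatrix} \begin{bmatrix} 2 & -2 & 2 \\ 4 & -5 & 6 \\ 2 & -3 & 4 \end{bmatrix} = \begin{bmatrix} -2 & 2 & -2 \\ -4 & 4 & -4 \\ -2 & 2 & -2 \end{bmatrix} \ne 0$$
Thus $m(t) \ne f(t)$. Accordingly, $m(t) = g(t) = (t - 2)(t - 1)^2$ is the minimal polynomial of $B$. [We emphasize that we do not need to compute $g(B)$; we know $g(B) = 0$ from the Cayley-Hamilton theorem.]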
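The two products in Problem 9.27 are easy to recheck mechanically. The following is a minimal sketch, not part of the original solution; the matrices `A` and `B` are taken to be the ones recovered from the displayed factors $A - 2I$, $A - I$, $B - 2I$, $B - I$, and the helper names are illustrative.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def shift(M, c):
    """Return M - c*I."""
    return [[M[i][j] - (c if i == j else 0) for j in range(len(M))]
            for i in range(len(M))]

def f_of(M):
    """f(M) for f(t) = (t - 2)(t - 1)."""
    return matmul(shift(M, 2), shift(M, 1))

def g_of(M):
    """g(M) for g(t) = (t - 2)(t - 1)^2."""
    return matmul(f_of(M), shift(M, 1))

# Assumed matrices, reconstructed from the factors shown in the solution.
A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
B = [[3, -2, 2], [4, -4, 6], [2, -3, 5]]
ZERO = [[0, 0, 0] for _ in range(3)]

print(f_of(A) == ZERO)   # f annihilates A, so m(t) = f(t) for A
print(f_of(B) == ZERO)   # f does not annihilate B
print(g_of(B) == ZERO)   # g does, as Cayley-Hamilton guarantees
```

Computing `g_of(B)` is redundant for the argument, as the solution notes; it is included only as a sanity check on the reconstructed entries.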
9.28. Find the minimal polynomial m(t) of each of the following matrices:
(a) A:
(b) $B = \begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}$
(c) C:
(a) The characteristic polynomial of $A$ is $\Delta(t) = t^2 - 12t + 32 = (t - 4)(t - 8)$. Since $\Delta(t)$ has distinct factors, the minimal polynomial $m(t) = \Delta(t) = t^2 - 12t + 32$.
(b) Since $B$ is triangular, its eigenvalues are the diagonal elements 1, 2, 3; and so its characteristic polynomial is $\Delta(t) = (t - 1)(t - 2)(t - 3)$. Since $\Delta(t)$ has distinct factors, $m(t) = \Delta(t)$.
(c) The characteristic polynomial of $C$ is $\Delta(t) = t^2 - 6t + 9 = (t - 3)^2$. Hence the minimal polynomial of $C$ is $f(t) = t - 3$ or $g(t) = (t - 3)^2$. However, $f(C) \ne 0$, that is, $C - 3I \ne 0$. Hence
$$m(t) = g(t) = \Delta(t) = (t - 3)^2$$
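9.29. Suppose $S = \{u_1, u_2, \ldots, u_n\}$ is a basis of $V$, and suppose $F$ and $G$ are linear operators on $V$ such that $[F]$ has 0's on and below the diagonal, and $[G]$ has $a \ne 0$ on the superdiagonal and 0's elsewhere. That is,
$$[F] = \begin{bmatrix} 0 & a_{21} & a_{31} & \cdots & a_{n1} \\ 0 & 0 & a_{32} & \cdots & a_{n2} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a_{n,n-1} \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}, \qquad [G] = \begin{bmatrix} 0 & a & 0 & \cdots & 0 \\ 0 & 0 & a & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & a \\ 0 & 0 & 0 & \cdots & 0 \end{bmatrix}$$
Show that: (a) $F^n = 0$, (b) $G^{n-1} \ne 0$, but $G^n = 0$. (These conditions also hold for $[F]$ and $[G]$.)
(a) We have $F(u_1) = 0$ and, for $r > 1$, $F(u_r)$ is a linear combination of vectors preceding $u_r$ in $S$. That is,
$$F(u_r) = a_{r1}u_1 + a_{r2}u_2 + \cdots + a_{r,r-1}u_{r-1}$$
Hence $F^2(u_r) = F(F(u_r))$ is a linear combination of vectors preceding $u_{r-1}$. And so on. Hence $F^r(u_r) = 0$ for each $r$. Thus, for each $r$, $F^n(u_r) = F^{n-r}(F^r(u_r)) = F^{n-r}(0) = 0$, and so $F^n = 0$, as claimed.
(b) We have $G(u_1) = 0$ and, for each $k > 1$, $G(u_k) = a u_{k-1}$. Hence $G^r(u_k) = a^r u_{k-r}$ for $r < k$. Since $a \ne 0$, we also have $a^{n-1} \ne 0$. Therefore, $G^{n-1}(u_n) = a^{n-1}u_1 \ne 0$, and so $G^{n-1} \ne 0$. On the other hand, by (a), $G^n = 0$.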
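Part (b) of Problem 9.29 can be illustrated on a small instance. The sketch below picks $n = 4$ and $a = 3$ (both arbitrary choices for illustration) and checks that the third power of the superdiagonal matrix is nonzero while the fourth power vanishes.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def matpow(M, k):
    """Return M**k for a nonnegative integer k by repeated multiplication."""
    n = len(M)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, M)
    return result

def superdiagonal(n, a):
    """n-square matrix with a on the superdiagonal and 0 elsewhere."""
    return [[a if j == i + 1 else 0 for j in range(n)] for i in range(n)]

n, a = 4, 3                      # illustrative values only
G = superdiagonal(n, a)
G_n_minus_1 = matpow(G, n - 1)   # single nonzero entry a**(n-1) in the top-right corner
G_n = matpow(G, n)               # the zero matrix
print(G_n_minus_1[0][n - 1], G_n)
```

The surviving corner entry is $a^{n-1}$, matching $G^{n-1}(u_n) = a^{n-1}u_1$ in the proof.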
9.30. Let $B$ be the matrix in Example 9.12(a) that has $\lambda$'s on the diagonal, $a$'s on the superdiagonal, where $a \ne 0$, and 0's elsewhere. Show that $f(t) = (t - \lambda)^n$ is both the characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $B$.
Since $B$ is triangular with $\lambda$'s on the diagonal, $\Delta(t) = f(t) = (t - \lambda)^n$ is its characteristic polynomial. Thus $m(t)$ is a power of $t - \lambda$. By Problem 9.29, $(B - \lambda I)^{n-1} \ne 0$. Hence $m(t) = \Delta(t) = (t - \lambda)^n$.
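9.31. Find the characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ of each matrix:
$$\text{(a) } M = \begin{bmatrix} 4 & 1 & 0 & 0 & 0 \\ 0 & 4 & 1 & 0 & 0 \\ 0 & 0 & 4 & 0 & 0 \\ 0 & 0 & 0 & 4 & 1 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}, \qquad \text{(b) } M' = \begin{bmatrix} 2 & 7 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & -2 & 4 \end{bmatrix}$$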
(a) $M$ is block diagonal with diagonal blocks
$$A = \begin{bmatrix} 4 & 1 & 0 \\ 0 & 4 & 1 \\ 0 & 0 & 4 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 4 & 1 \\ 0 & 4 \end{bmatrix}$$
The characteristic and minimal polynomial of $A$ is $f(t) = (t - 4)^3$ and the characteristic and minimal polynomial of $B$ is $g(t) = (t - 4)^2$. Then
$$\Delta(t) = f(t)g(t) = (t - 4)^5 \quad \text{but} \quad m(t) = \mathrm{LCM}[f(t), g(t)] = (t - 4)^3$$
(where LCM means least common multiple). We emphasize that the exponent in $m(t)$ is the size of the largest block.
(b) Here $M'$ is block diagonal with diagonal blocks $A' = \begin{bmatrix} 2 & 7 \\ 0 & 2 \end{bmatrix}$ and $B' = \begin{bmatrix} 1 & 1 \\ -2 & 4 \end{bmatrix}$. The characteristic and minimal polynomial of $A'$ is $f(t) = (t - 2)^2$. The characteristic polynomial of $B'$ is $g(t) = t^2 - 5t + 6 = (t - 2)(t - 3)$, which has distinct factors. Hence $g(t)$ is also the minimal polynomial of $B'$. Accordingly,
$$\Delta(t) = f(t)g(t) = (t - 2)^3(t - 3) \quad \text{but} \quad m(t) = \mathrm{LCM}[f(t), g(t)] = (t - 2)^2(t - 3)$$
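The claim in part (a), that the exponent in $m(t)$ is the size of the largest block, can be checked numerically. This sketch assumes $M$ is the block diagonal matrix with a 3-square and a 2-square block, each with 4's on the diagonal and 1's on the superdiagonal, as described in the solution. It confirms that $(M - 4I)^2 \ne 0$ while $(M - 4I)^3 = 0$.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# Assumed matrix: diag(J_3(4), J_2(4)), read off from the blocks in the solution.
M = [
    [4, 1, 0, 0, 0],
    [0, 4, 1, 0, 0],
    [0, 0, 4, 0, 0],
    [0, 0, 0, 4, 1],
    [0, 0, 0, 0, 4],
]

# N = M - 4I is nilpotent; its index equals the exponent of (t - 4) in m(t).
N = [[M[i][j] - (4 if i == j else 0) for j in range(5)] for i in range(5)]
N2 = matmul(N, N)
N3 = matmul(N2, N)

is_zero = lambda X: all(x == 0 for row in X for x in row)
print(is_zero(N2), is_zero(N3))
```

The 2-square block already vanishes after squaring, so only the 3-square block decides the index.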
9.32. Find a matrix $A$ whose minimal polynomial is $f(t) = t^3 - 8t^2 + 5t + 7$.
Simply let $A = \begin{bmatrix} 0 & 0 & -7 \\ 1 & 0 & -5 \\ 0 & 1 & 8 \end{bmatrix}$, the companion matrix of $f(t)$ [defined in Example 9.12(b)].
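As a partial check on Problem 9.32, one can build the companion matrix from the coefficients and confirm that it is a root of the polynomial. The sketch below assumes $f(t) = t^3 - 8t^2 + 5t + 7$, which is the polynomial matching the last column $(-7, -5, 8)$; the function names are illustrative. It verifies $f(A) = 0$ only; that $f$ is actually minimal is the property of companion matrices cited from Example 9.12.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def companion(low_coeffs):
    """Companion matrix of t^n + a_{n-1} t^{n-1} + ... + a_0,
    given low_coeffs = [a_0, a_1, ..., a_{n-1}]: 1's on the subdiagonal
    and the negated coefficients in the last column."""
    n = len(low_coeffs)
    C = [[0] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = 1
    for i in range(n):
        C[i][n - 1] = -low_coeffs[i]
    return C

def eval_monic(low_coeffs, A):
    """Evaluate the monic polynomial at the matrix A by Horner's rule."""
    n = len(A)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for c in reversed(low_coeffs):
        result = matmul(result, A)
        for i in range(n):
            result[i][i] += c
    return result

coeffs = [7, 5, -8]          # f(t) = t^3 - 8t^2 + 5t + 7, lowest degree first
A = companion(coeffs)
fA = eval_monic(coeffs, A)
print(A, fA)
```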
9.33. Prove Theorem 9.15: The minimal polynomial $m(t)$ of a matrix (linear operator) $A$ divides every polynomial that has $A$ as a zero. In particular (by the Cayley-Hamilton Theorem), $m(t)$ divides the characteristic polynomial $\Delta(t)$ of $A$.
Suppose $f(t)$ is a polynomial for which $f(A) = 0$. By the division algorithm, there exist polynomials $q(t)$ and $r(t)$ for which $f(t) = m(t)q(t) + r(t)$ and $r(t) = 0$ or $\deg r(t) < \deg m(t)$. Substituting $t = A$ in this equation, and using that $f(A) = 0$ and $m(A) = 0$, we obtain $r(A) = 0$. If $r(t) \ne 0$, then $r(t)$ is a polynomial of degree less than $m(t)$ that has $A$ as a zero. This contradicts the definition of the minimal polynomial. Thus $r(t) = 0$, and so $f(t) = m(t)q(t)$, that is, $m(t)$ divides $f(t)$.
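9.34. Let $m(t)$ be the minimal polynomial of an $n$-square matrix $A$. Prove that the characteristic polynomial $\Delta(t)$ of $A$ divides $[m(t)]^n$.
Suppose $m(t) = t^r + c_1 t^{r-1} + \cdots + c_{r-1}t + c_r$. Define matrices $B_j$ as follows:
$$\begin{aligned} B_0 &= I & &\text{so } I = B_0 \\ B_1 &= A + c_1 I & &\text{so } c_1 I = B_1 - A = B_1 - AB_0 \\ B_2 &= A^2 + c_1 A + c_2 I & &\text{so } c_2 I = B_2 - A(A + c_1 I) = B_2 - AB_1 \\ &\;\;\vdots \\ B_{r-1} &= A^{r-1} + c_1 A^{r-2} + \cdots + c_{r-1}I & &\text{so } c_{r-1} I = B_{r-1} - AB_{r-2} \end{aligned}$$
Then
$$-AB_{r-1} = c_r I - (A^r + c_1 A^{r-1} + \cdots + c_{r-1}A + c_r I) = c_r I - m(A) = c_r I$$
Set
$$B(t) = t^{r-1}B_0 + t^{r-2}B_1 + \cdots + tB_{r-2} + B_{r-1}$$
Then
$$\begin{aligned} (tI - A)B(t) &= (t^r B_0 + t^{r-1}B_1 + \cdots + tB_{r-1}) - (t^{r-1}AB_0 + t^{r-2}AB_1 + \cdots + AB_{r-1}) \\ &= t^r B_0 + t^{r-1}(B_1 - AB_0) + t^{r-2}(B_2 - AB_1) + \cdots + t(B_{r-1} - AB_{r-2}) - AB_{r-1} \\ &= t^r I + c_1 t^{r-1} I + c_2 t^{r-2} I + \cdots + c_{r-1} t I + c_r I = m(t)I \end{aligned}$$
Taking the determinant of both sides gives $|tI - A|\,|B(t)| = |m(t)I| = [m(t)]^n$. Since $|B(t)|$ is a polynomial, $|tI - A|$ divides $[m(t)]^n$; that is, the characteristic polynomial of $A$ divides $[m(t)]^n$.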
9.35. Prove Theorem 9.16: The characteristic polynomial $\Delta(t)$ and the minimal polynomial $m(t)$ of $A$ have the same irreducible factors.
Suppose $f(t)$ is an irreducible polynomial. If $f(t)$ divides $m(t)$, then $f(t)$ also divides $\Delta(t)$ [since $m(t)$ divides $\Delta(t)$]. On the other hand, if $f(t)$ divides $\Delta(t)$ then, by Problem 9.34, $f(t)$ also divides $[m(t)]^n$. But $f(t)$ is irreducible; hence $f(t)$ also divides $m(t)$. Thus $m(t)$ and $\Delta(t)$ have the same irreducible factors.
9.36. Prove Theorem 9.19: The minimal polynomial $m(t)$ of a block diagonal matrix $M$ with diagonal blocks $A_i$ is equal to the least common multiple (LCM) of the minimal polynomials of the diagonal blocks $A_i$.
We prove the theorem for the case $r = 2$. The general theorem follows easily by induction. Suppose $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are square matrices. We need to show that the minimal polynomial $m(t)$ of $M$ is the least common multiple of the minimal polynomials $g(t)$ and $h(t)$ of $A$ and $B$, respectively.
Since $m(t)$ is the minimal polynomial of $M$, $m(M) = \begin{bmatrix} m(A) & 0 \\ 0 & m(B) \end{bmatrix} = 0$, and hence $m(A) = 0$ and $m(B) = 0$. Since $g(t)$ is the minimal polynomial of $A$, $g(t)$ divides $m(t)$. Similarly, $h(t)$ divides $m(t)$. Thus $m(t)$ is a multiple of $g(t)$ and $h(t)$.
Now let $f(t)$ be another multiple of $g(t)$ and $h(t)$. Then $f(M) = \begin{bmatrix} f(A) & 0 \\ 0 & f(B) \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix} = 0$. But $m(t)$ is the minimal polynomial of $M$; hence $m(t)$ divides $f(t)$. Thus $m(t)$ is the least common multiple of $g(t)$ and $h(t)$.
9.37. Suppose $m(t) = t^r + a_{r-1}t^{r-1} + \cdots + a_1 t + a_0$ is the minimal polynomial of an $n$-square matrix $A$. Prove the following:
(a) $A$ is nonsingular if and only if the constant term $a_0 \ne 0$.
(b) If $A$ is nonsingular, then $A^{-1}$ is a polynomial in $A$ of degree $r - 1 < n$.
(a) The following are equivalent: (i) $A$ is nonsingular, (ii) 0 is not a root of $m(t)$, (iii) $a_0 \ne 0$. Thus the statement is true.
(b) Since $A$ is nonsingular, $a_0 \ne 0$ by (a). We have
$$m(A) = A^r + a_{r-1}A^{r-1} + \cdots + a_1 A + a_0 I = 0$$
Thus
$$-\frac{1}{a_0}\left(A^{r-1} + a_{r-1}A^{r-2} + \cdots + a_1 I\right)A = I$$
Accordingly,
$$A^{-1} = -\frac{1}{a_0}\left(A^{r-1} + a_{r-1}A^{r-2} + \cdots + a_1 I\right)$$
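The formula in part (b) of Problem 9.37 translates directly into a short routine. The sketch below is an illustration under stated assumptions: it uses the 3-square matrix $A = [4, -2, 2;\ 6, -3, 4;\ 3, -2, 3]$ with minimal polynomial $t^2 - 3t + 2$ (the matrix of Problem 9.27 as reconstructed there), and the function name `inverse_from_minpoly` is hypothetical. Exact rational arithmetic is used so the check $AA^{-1} = I$ is not blurred by rounding.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse_from_minpoly(A, low_coeffs):
    """Given the monic minimal polynomial t^r + a_{r-1} t^{r-1} + ... + a_0 as
    low_coeffs = [a_0, ..., a_{r-1}] with a_0 != 0, return
    A^{-1} = -(1/a_0) (A^{r-1} + a_{r-1} A^{r-2} + ... + a_1 I)."""
    n = len(A)
    a0 = Fraction(low_coeffs[0])
    # Horner evaluation of t^{r-1} + a_{r-1} t^{r-2} + ... + a_1 at A.
    S = [[Fraction(1 if i == j else 0) for j in range(n)] for i in range(n)]
    for c in reversed(low_coeffs[1:]):
        S = matmul(S, A)
        for i in range(n):
            S[i][i] += c
    return [[-x / a0 for x in row] for row in S]

# Assumed example: m(t) = t^2 - 3t + 2, so A^{-1} = -(1/2)(A - 3I).
A = [[4, -2, 2], [6, -3, 4], [3, -2, 3]]
A_inv = inverse_from_minpoly(A, [2, -3])
product = matmul(A, A_inv)
print(A_inv, product)
```

Because $r = 2$ here, the inverse of a 3-square matrix comes out as a polynomial of degree 1 in $A$, which is the point of the bound $r - 1 < n$.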
Supplementary Problems
POLYNOMIALS OF MATRICES
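9.38. Let $A = \begin{bmatrix} 2 & -3 \\ 5 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$. Find $f(A)$, $g(A)$, $f(B)$, $g(B)$, where $f(t) = 2t^2 - 5t + 6$ and $g(t) = t^3 - 2t^2 + t + 3$.
9.39. Let $A = \begin{bmatrix} 1 & 2 \\ 0 & 1 \end{bmatrix}$. Find $A^2$, $A^3$, $A^n$, where $n > 3$, and $A^{-1}$.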
9.40. Let $B = \begin{bmatrix} 8 & 12 & 0 \\ 0 & 8 & 12 \\ 0 & 0 & 8 \end{bmatrix}$. Find a real matrix $A$ such that $B = A^3$.
9.41. For each matrix, find a polynomial having the following matrix as a root:
1 1 2"
(a) A.
['
2 5
3
,(b) B--
[5
, (c) C-
1 2 3
2 1 4
9.42. Let $A$ be any square matrix and let $f(t)$ be any polynomial. Prove: (a) $(P^{-1}AP)^n = P^{-1}A^nP$. (b) $f(P^{-1}AP) = P^{-1}f(A)P$. (c) $f(A^T) = [f(A)]^T$. (d) If $A$ is symmetric, then $f(A)$ is symmetric.
9.43. Let $M = \mathrm{diag}[A_1, \ldots, A_r]$ be a block diagonal matrix, and let $f(t)$ be any polynomial. Show that $f(M)$ is block diagonal and $f(M) = \mathrm{diag}[f(A_1), \ldots, f(A_r)]$.
9.44. Let $M$ be a block triangular matrix with diagonal blocks $A_1, \ldots, A_r$, and let $f(t)$ be any polynomial. Show that $f(M)$ is also a block triangular matrix, with diagonal blocks $f(A_1), \ldots, f(A_r)$.
EIGENVALUES AND EIGENVECTORS
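9.45. For each of the following matrices, find all eigenvalues and corresponding linearly independent eigenvectors:
$$\text{(a) } A = \begin{bmatrix} 2 & -3 \\ 2 & -5 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 2 & 4 \\ -1 & 6 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & -4 \\ 3 & -7 \end{bmatrix}$$
When possible, find the nonsingular matrix $P$ that diagonalizes the matrix.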
9.46. Let $A = \begin{bmatrix} 2 & -1 \\ -2 & 3 \end{bmatrix}$.
(a) Find eigenvalues and corresponding eigenvectors.
(b) Find a nonsingular matrix $P$ such that $D = P^{-1}AP$ is diagonal.
(c) Find $A^8$ and $f(A)$ where $f(t) = t^4 - 5t^3 + 7t^2 - 2t + 5$.
(d) Find a matrix $B$ such that $B^2 = A$.
9.47. Repeat Problem 9.46 for $A = \begin{bmatrix} 5 & 6 \\ -2 & -2 \end{bmatrix}$.
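9.48. For each of the following matrices, find all eigenvalues and a maximum set $S$ of linearly independent eigenvectors:
$$\text{(a) } A = \begin{bmatrix} 1 & -3 & 3 \\ 3 & -5 & 3 \\ 6 & -6 & 4 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 3 & -1 & 1 \\ 7 & -5 & 1 \\ 6 & -6 & 2 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 1 & 2 & 2 \\ 1 & 2 & -1 \\ -1 & 1 & 4 \end{bmatrix}$$
Which matrices can be diagonalized, and why?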
9.49. For each of the following linear operators $T: \mathbf{R}^2 \to \mathbf{R}^2$, find all eigenvalues and a basis for each eigenspace:
(a) $T(x, y) = (3x + 3y,\ x + 5y)$, (b) $T(x, y) = (3x - 13y,\ x - 3y)$.
9.50. Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ be a real matrix. Find necessary and sufficient conditions on $a, b, c, d$ so that $A$ is diagonalizable, that is, so that $A$ has two (real) linearly independent eigenvectors.
9.51. Show that matrices $A$ and $A^T$ have the same eigenvalues. Give an example of a $2 \times 2$ matrix $A$ where $A$ and $A^T$ have different eigenvectors.
9.52. Suppose $v$ is an eigenvector of linear operators $F$ and $G$. Show that $v$ is also an eigenvector of the linear operator $kF + tG$, where $k$ and $t$ are scalars.
9.53. Suppose $v$ is an eigenvector of a linear operator $T$ belonging to the eigenvalue $\lambda$. Prove:
(a) For $n > 0$, $v$ is an eigenvector of $T^n$ belonging to $\lambda^n$.
(b) $f(\lambda)$ is an eigenvalue of $f(T)$ for any polynomial $f(t)$.
9.54. Suppose $\lambda \ne 0$ is an eigenvalue of the composition $F \circ G$ of linear operators $F$ and $G$. Show that $\lambda$ is also an eigenvalue of the composition $G \circ F$. [Hint: Show that $G(v)$ is an eigenvector of $G \circ F$.]
9.55. Let $E: V \to V$ be a projection mapping, that is, $E^2 = E$. Show that $E$ is diagonalizable and, in fact, can be represented by the diagonal matrix $M = \begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$.
DIAGONALIZING REAL SYMMETRIC MATRICES
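9.56. For each of the following symmetric matrices $A$, find an orthogonal matrix $P$ such that $D = P^{-1}AP$ is diagonal:
$$\text{(a) } A = \begin{bmatrix} 5 & 4 \\ 4 & -1 \end{bmatrix}, \quad \text{(b) } A = \begin{bmatrix} 4 & -1 \\ -1 & 4 \end{bmatrix}, \quad \text{(c) } A = \begin{bmatrix} 7 & 3 \\ 3 & -1 \end{bmatrix}$$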
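9.57. For each of the following symmetric matrices $B$, find its eigenvalues, a maximal orthogonal set $S$ of eigenvectors, and an orthogonal matrix $P$ such that $D = P^{-1}BP$ is diagonal:
$$\text{(a) } B = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 2 & 2 & 4 \\ 2 & 5 & 8 \\ 4 & 8 & 17 \end{bmatrix}$$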
9.58. Find an orthogonal substitution that diagonalizes each of the following quadratic forms:
(a) $q(x, y) = 4x^2 + 8xy - 11y^2$, (b) $q(x, y) = 2x^2 - 6xy + 10y^2$
9.59. For each of the following quadratic forms $q(x, y, z)$, find an orthogonal substitution expressing $x, y, z$ in terms of variables $r, s, t$, and find $q(r, s, t)$:
(a) $q(x, y, z) = 5x^2 + 3y^2 + 12xz$, (b) $q(x, y, z) = 3x^2 - 4xy + 6y^2 + 2xz - 4yz + 3z^2$
9.60. Find a real $2 \times 2$ symmetric matrix $A$ with eigenvalues:
(a) $\lambda = 1$ and $\lambda = 4$ and eigenvector $u = (1, 1)$ belonging to $\lambda = 1$;
(b) $\lambda = 2$ and $\lambda = 3$ and eigenvector $u = (1, 2)$ belonging to $\lambda = 2$.
In each case, find a matrix $B$ for which $B^2 = A$.
CHARACTERISTIC AND MINIMAL POLYNOMIALS
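9.61. Find the characteristic and minimal polynomials of each of the following matrices:
$$\text{(a) } A = \begin{bmatrix} 3 & 1 & -1 \\ 2 & 4 & -2 \\ -1 & -1 & 3 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 3 & 2 & -1 \\ 3 & 8 & -3 \\ 3 & 6 & -1 \end{bmatrix}$$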
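9.62. Find the characteristic and minimal polynomials of each of the following matrices:
$$\text{(a) } A = \begin{bmatrix} 2 & 5 & 0 & 0 & 0 \\ 0 & 2 & 0 & 0 & 0 \\ 0 & 0 & 4 & 2 & 0 \\ 0 & 0 & 3 & 5 & 0 \\ 0 & 0 & 0 & 0 & 7 \end{bmatrix}, \quad \text{(b) } B = \begin{bmatrix} 4 & -1 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 0 & 3 & 1 \\ 0 & 0 & 0 & 0 & 3 \end{bmatrix}, \quad \text{(c) } C = \begin{bmatrix} 3 & 2 & 0 & 0 & 0 \\ 1 & 4 & 0 & 0 & 0 \\ 0 & 0 & 3 & 1 & 0 \\ 0 & 0 & 1 & 3 & 0 \\ 0 & 0 & 0 & 0 & 4 \end{bmatrix}$$
9.63. Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 2 & 0 & 0 \\ 0 & 2 & 2 \\ 0 & 0 & 1 \end{bmatrix}$. Show that $A$ and $B$ have different characteristic polynomials (and so are not similar), but have the same minimal polynomial. Thus nonsimilar matrices may have the same minimal polynomial.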
9.64. Let $A$ be an $n$-square matrix for which $A^k = 0$ for some $k > n$. Show that $A^n = 0$.
9.65. Show that a matrix $A$ and its transpose $A^T$ have the same minimal polynomial.
9.66. Suppose $f(t)$ is an irreducible monic polynomial for which $f(A) = 0$ for a matrix $A$. Show that $f(t)$ is the minimal polynomial of $A$.
9.67. Show that $A$ is a scalar matrix $kI$ if and only if the minimal polynomial of $A$ is $m(t) = t - k$.
9.68. Find a matrix $A$ whose minimal polynomial is: (a) $t^3 - 5t^2 + 6t + 8$, (b) $t^4 - 5t^3 - 2t^2 + 7t + 4$.
9.69. Let $f(t)$ and $g(t)$ be monic polynomials (leading coefficient one) of minimal degree for which $A$ is a root. Show $f(t) = g(t)$. [Thus the minimal polynomial of $A$ is unique.]
Answers to Supplementary Problems
Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix with rows $R_1, R_2, \ldots$
9.38. $f(A) = [-26, -3;\ 5, -27]$, $g(A) = [-40, 39;\ -65, -27]$, $f(B) = [3, 6;\ 0, 9]$, $g(B) = [3, 12;\ 0, 15]$
9.39. $A^2 = [1, 4;\ 0, 1]$, $A^3 = [1, 6;\ 0, 1]$, $A^n = [1, 2n;\ 0, 1]$, $A^{-1} = [1, -2;\ 0, 1]$
9.40. Let $A = [2, a, b;\ 0, 2, c;\ 0, 0, 2]$. Set $B = A^3$ and then $a = 1$, $b = -\frac{1}{2}$, $c = 1$
9.41. Find $\Delta(t)$: (a) $t^2 + t - 11$, (b) $t^2 + 2t + 13$, (c) $t^3 - 7t^2 + 6t - 1$
9.45. (a) $\lambda = 1$, $u = (3, 1)$; $\lambda = -4$, $v = (1, 2)$, (b) $\lambda = 4$, $u = (2, 1)$, (c) $\lambda = -1$, $u = (2, 1)$; $\lambda = -5$, $v = (2, 3)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v]$
9.46. (a) $\lambda = 1$, $u = (1, 1)$; $\lambda = 4$, $v = (1, -2)$,
(b) $P = [u, v]$,
(c) $f(A) = [19, -13;\ -26, 32]$, $A^8 = [21\,846, -21\,845;\ -43\,690, 43\,691]$,
(d) $B = [\frac{4}{3}, -\frac{1}{3};\ -\frac{2}{3}, \frac{5}{3}]$
9.47. (a) $\lambda = 1$, $u = (3, -2)$; $\lambda = 2$, $v = (2, -1)$, (b) $P = [u, v]$,
(c) $f(A) = [2, -6;\ 2, 9]$, $A^8 = [1021, 1530;\ -510, -764]$,
(d) $B = [-3 + 4\sqrt{2},\ -6 + 6\sqrt{2};\ 2 - 2\sqrt{2},\ 4 - 3\sqrt{2}]$
9.48. (a) $\lambda = -2$, $u = (1, 1, 0)$, $v = (1, 0, -1)$; $\lambda = 4$, $w = (1, 1, 2)$,
(b) $\lambda = 2$, $u = (1, 1, 0)$; $\lambda = -4$, $v = (0, 1, 1)$,
(c) $\lambda = 3$, $u = (1, 1, 0)$, $v = (1, 0, 1)$; $\lambda = 1$, $w = (2, -1, 1)$. Only $A$ and $C$ can be diagonalized; use $P = [u, v, w]$
9.49. (a) $\lambda = 2$, $u = (3, -1)$; $\lambda = 6$, $v = (1, 1)$, (b) no real eigenvalues
9.50. We need $[-\mathrm{tr}(A)]^2 - 4[\det(A)] > 0$ or $(a - d)^2 + 4bc > 0$
9.51. $A = [1, 1;\ 0, 1]$
9.56. (a) $P = [2, -1;\ 1, 2]/\sqrt{5}$, (b) $P = [1, 1;\ 1, -1]/\sqrt{2}$, (c) $P = [3, -1;\ 1, 3]/\sqrt{10}$
9.57. (a) $\lambda = -1$, $u = (1, -1, 0)$, $v = (1, 1, -2)$; $\lambda = 2$, $w = (1, 1, 1)$,
(b) $\lambda = 1$, $u = (2, 1, -1)$, $v = (2, -3, 1)$; $\lambda = 22$, $w = (1, 2, 4)$;
Normalize $u, v, w$, obtaining $\hat{u}, \hat{v}, \hat{w}$, and set $P = [\hat{u}, \hat{v}, \hat{w}]$. (Remark: $u$ and $v$ are not unique.)
9.58. (a) $x = (4s - t)/\sqrt{17}$, $y = (s + 4t)/\sqrt{17}$, $q(s, t) = 5s^2 - 12t^2$,
(b) $x = (3s - t)/\sqrt{10}$, $y = (s + 3t)/\sqrt{10}$, $q(s, t) = s^2 + 11t^2$
9.59. (a) $x = (3s + 2t)/\sqrt{13}$, $y = r$, $z = (2s - 3t)/\sqrt{13}$, $q(r, s, t) = 3r^2 + 9s^2 - 4t^2$,
(b) $x = 5Ks + Lt$, $y = Jr + 2Ks - 2Lt$, $z = 2Jr - Ks + Lt$, where $J = 1/\sqrt{5}$, $K = 1/\sqrt{30}$, $L = 1/\sqrt{6}$; $q(r, s, t) = 2r^2 + 2s^2 + 8t^2$
9.60. (a) $A = \frac{1}{2}[5, -3;\ -3, 5]$, $B = \frac{1}{2}[3, -1;\ -1, 3]$,
(b) $A = \frac{1}{5}[14, -2;\ -2, 11]$, $B = \frac{1}{5}[\sqrt{2} + 4\sqrt{3},\ 2\sqrt{2} - 2\sqrt{3};\ 2\sqrt{2} - 2\sqrt{3},\ 4\sqrt{2} + \sqrt{3}]$
9.61. (a) $\Delta(t) = m(t) = (t - 2)^2(t - 6)$, (b) $\Delta(t) = (t - 2)^2(t - 6)$, $m(t) = (t - 2)(t - 6)$
9.62. (a) $\Delta(t) = (t - 2)^3(t - 7)^2$, $m(t) = (t - 2)^2(t - 7)$,
(b) $\Delta(t) = (t - 3)^5$, $m(t) = (t - 3)^3$,
(c) $\Delta(t) = (t - 2)^2(t - 4)^2(t - 5)$, $m(t) = (t - 2)(t - 4)(t - 5)$
9.68. Let $A$ be the companion matrix [Example 9.12(b)] with last column: (a) $[-8, -6, 5]^T$, (b) $[-4, -7, 2, 5]^T$
9.69. Hint: $A$ is a root of $h(t) = f(t) - g(t)$ where $h(t) = 0$ or the degree of $h(t)$ is less than the degree of $f(t)$.
CHAPTER 10
Canonical Forms
10.1 INTRODUCTION
Let $T$ be a linear operator on a vector space of finite dimension. As seen in Chapter 6, $T$ may not have a diagonal matrix representation. However, it is still possible to "simplify" the matrix representation of $T$ in a number of ways. This is the main topic of this chapter. In particular, we obtain the primary decomposition theorem, and the triangular, Jordan, and rational canonical forms.
We comment that the triangular and Jordan canonical forms exist for $T$ if and only if the characteristic polynomial $\Delta(t)$ of $T$ has all its roots in the base field $K$. This is always true if $K$ is the complex field $\mathbf{C}$ but may not be true if $K$ is the real field $\mathbf{R}$.
We also introduce the idea of a quotient space. This is a very powerful tool, and will be used in the proof of the existence of the triangular and rational canonical forms.
10.2 TRIANGULAR FORM
Let $T$ be a linear operator on an $n$-dimensional vector space $V$. Suppose $T$ can be represented by the triangular matrix
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ & a_{22} & \cdots & a_{2n} \\ & & \ddots & \vdots \\ & & & a_{nn} \end{bmatrix}$$
Then the characteristic polynomial $\Delta(t)$ of $T$ is a product of linear factors; that is,
$$\Delta(t) = \det(tI - A) = (t - a_{11})(t - a_{22}) \cdots (t - a_{nn})$$
The converse is also true and is an important theorem (proved in Problem 10.28).
Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors into linear polynomials. Then there exists a basis of $V$ in which $T$ is represented by a triangular matrix.
Theorem 10.1: (Alternative Form) Let $A$ be a square matrix whose characteristic polynomial factors into linear polynomials. Then $A$ is similar to a triangular matrix, i.e., there exists an invertible matrix $P$ such that $P^{-1}AP$ is triangular.
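We say that an operator $T$ can be brought into triangular form if it can be represented by a triangular matrix. Note that in this case, the eigenvalues of $T$ are precisely those entries appearing on the main diagonal. We give an application of this remark.
Example 10.1. Let $A$ be a square matrix over the complex field $\mathbf{C}$. Suppose $\lambda$ is an eigenvalue of $A^2$. Show that $\sqrt{\lambda}$ or $-\sqrt{\lambda}$ is an eigenvalue of $A$.
By Theorem 10.1, $A$ and $A^2$ are similar, respectively, to triangular matrices of the form
$$B = \begin{bmatrix} \mu_1 & * & \cdots & * \\ & \mu_2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n \end{bmatrix} \quad \text{and} \quad B^2 = \begin{bmatrix} \mu_1^2 & * & \cdots & * \\ & \mu_2^2 & \cdots & * \\ & & \ddots & \vdots \\ & & & \mu_n^2 \end{bmatrix}$$
Since similar matrices have the same eigenvalues, $\lambda = \mu_i^2$ for some $i$. Hence $\mu_i = \sqrt{\lambda}$ or $\mu_i = -\sqrt{\lambda}$ is an eigenvalue of $A$.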
10.3 INVARIANCE
Let $T: V \to V$ be linear. A subspace $W$ of $V$ is said to be invariant under $T$ or $T$-invariant if $T$ maps $W$ into itself, i.e., if $v \in W$ implies $T(v) \in W$. In this case, $T$ restricted to $W$ defines a linear operator on $W$; that is, $T$ induces a linear operator $\hat{T}: W \to W$ defined by $\hat{T}(w) = T(w)$ for every $w \in W$.
Example 10.2.
(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the following linear operator, which rotates each vector $v$ about the $z$-axis by an angle $\theta$ (shown in Fig. 10-1):
$$T(x, y, z) = (x\cos\theta - y\sin\theta,\ x\sin\theta + y\cos\theta,\ z)$$
[Fig. 10-1]
Observe that each vector $w = (a, b, 0)$ in the $xy$-plane $W$ remains in $W$ under the mapping $T$; hence $W$ is $T$-invariant. Observe also that the $z$-axis $U$ is invariant under $T$. Furthermore, the restriction of $T$ to $W$ rotates each vector about the origin $O$, and the restriction of $T$ to $U$ is the identity mapping of $U$.
(b) Nonzero eigenvectors of a linear operator $T: V \to V$ may be characterized as generators of $T$-invariant 1-dimensional subspaces. For suppose $T(v) = \lambda v$, $v \ne 0$. Then $W = \{kv,\ k \in K\}$, the 1-dimensional subspace generated by $v$, is invariant under $T$ because
$$T(kv) = kT(v) = k(\lambda v) = k\lambda v \in W$$
Conversely, suppose $\dim U = 1$ and $u \ne 0$ spans $U$, and $U$ is invariant under $T$. Then $T(u) \in U$ and so $T(u)$ is a multiple of $u$, i.e., $T(u) = \mu u$. Hence $u$ is an eigenvector of $T$.
The next theorem (proved in Problem 10.3) gives us an important class of invariant subspaces.
Theorem 10.2: Let $T: V \to V$ be any linear operator, and let $f(t)$ be any polynomial. Then the kernel of $f(T)$ is invariant under $T$.
The notion of invariance is related to matrix representations (Problem 11.5) as follows.
Theorem 10.3: Suppose $W$ is an invariant subspace of $T: V \to V$. Then $T$ has a block matrix representation $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is a matrix representation of the restriction $\hat{T}$ of $T$ to $W$.
10.4 INVARIANT DIRECT-SUM DECOMPOSITIONS
A vector space $V$ is termed the direct sum of subspaces $W_1, \ldots, W_r$, written
$$V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$$
if every vector $v \in V$ can be written uniquely in the form
$$v = w_1 + w_2 + \cdots + w_r, \quad \text{with } w_i \in W_i$$
The following theorem (proved in Problem 10.7) holds.
Theorem 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$, and suppose
$$B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}$$
are bases of $W_1, W_2, \ldots, W_r$, respectively. Then $V$ is the direct sum of the $W_i$ if and only if the union $B = B_1 \cup \cdots \cup B_r$ is a basis of $V$.
Now suppose $T: V \to V$ is linear and $V$ is the direct sum of (nonzero) $T$-invariant subspaces $W_1, W_2, \ldots, W_r$, that is,
$$V = W_1 \oplus \cdots \oplus W_r \quad \text{and} \quad T(W_i) \subseteq W_i, \quad i = 1, \ldots, r$$
Let $T_i$ denote the restriction of $T$ to $W_i$. Then $T$ is said to be decomposable into the operators $T_i$ or $T$ is said to be the direct sum of the $T_i$, written $T = T_1 \oplus \cdots \oplus T_r$. Also, the subspaces $W_1, \ldots, W_r$ are said to reduce $T$ or to form a $T$-invariant direct-sum decomposition of $V$.
Consider the special case where two subspaces $U$ and $W$ reduce an operator $T: V \to V$; say $\dim U = 2$ and $\dim W = 3$ and suppose $\{u_1, u_2\}$ and $\{w_1, w_2, w_3\}$ are bases of $U$ and $W$, respectively. If $T_1$ and $T_2$ denote the restrictions of $T$ to $U$ and $W$, respectively, then
$$\begin{aligned} T_1(u_1) &= a_{11}u_1 + a_{12}u_2 \\ T_1(u_2) &= a_{21}u_1 + a_{22}u_2 \end{aligned} \qquad \begin{aligned} T_2(w_1) &= b_{11}w_1 + b_{12}w_2 + b_{13}w_3 \\ T_2(w_2) &= b_{21}w_1 + b_{22}w_2 + b_{23}w_3 \\ T_2(w_3) &= b_{31}w_1 + b_{32}w_2 + b_{33}w_3 \end{aligned}$$
Accordingly, the following matrices $A, B, M$ are the matrix representations of $T_1$, $T_2$, $T$, respectively:
$$A = \begin{bmatrix} a_{11} & a_{21} \\ a_{12} & a_{22} \end{bmatrix}, \quad B = \begin{bmatrix} b_{11} & b_{21} & b_{31} \\ b_{12} & b_{22} & b_{32} \\ b_{13} & b_{23} & b_{33} \end{bmatrix}, \quad M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$$
The block diagonal matrix $M$ results from the fact that $\{u_1, u_2, w_1, w_2, w_3\}$ is a basis of $V$ (Theorem 10.4), and that $T(u_i) = T_1(u_i)$ and $T(w_j) = T_2(w_j)$.
A generalization of the above argument gives us the following theorem.
Theorem 10.5: Suppose $T: V \to V$ is linear and suppose $V$ is the direct sum of $T$-invariant subspaces, say, $W_1, \ldots, W_r$. If $A_i$ is a matrix representation of the restriction of $T$ to $W_i$, then $T$ can be represented by the block diagonal matrix
$$M = \mathrm{diag}(A_1, A_2, \ldots, A_r)$$
10.5 PRIMARY DECOMPOSITION
The following theorem shows that any operator $T: V \to V$ is decomposable into operators whose minimum polynomials are powers of irreducible polynomials. This is the first step in obtaining a canonical form for $T$.
Theorem 10.6: (Primary Decomposition Theorem) Let $T: V \to V$ be a linear operator with minimal polynomial
$$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$$
where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of $T$-invariant subspaces $W_1, \ldots, W_r$ where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial of the restriction of $T$ to $W_i$.
The above polynomials $f_i(t)^{n_i}$ are relatively prime. Therefore, the above fundamental theorem follows (Problem 10.11) from the next two theorems (proved in Problems 10.9 and 10.10, respectively).
Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the $T$-invariant subspaces $U$ and $W$, where $U = \mathrm{Ker}\ g(T)$ and $W = \mathrm{Ker}\ h(T)$.
Theorem 10.8: In Theorem 10.7, if $f(t)$ is the minimal polynomial of $T$ [and $g(t)$ and $h(t)$ are monic], then $g(t)$ and $h(t)$ are the minimal polynomials of the restrictions of $T$ to $U$ and $W$, respectively.
We will also use the primary decomposition theorem to prove the following useful characterization of diagonalizable operators (see Problem 10.12 for the proof).
Theorem 10.9: A linear operator $T: V \to V$ is diagonalizable if and only if its minimal polynomial $m(t)$ is a product of distinct linear polynomials.
Theorem 10.9: (Alternative Form) A matrix $A$ is similar to a diagonal matrix if and only if its minimal polynomial is a product of distinct linear polynomials.
Example 10.3. Suppose $A \ne I$ is a square matrix for which $A^3 = I$. Determine whether or not $A$ is similar to a diagonal matrix if $A$ is a matrix over: (i) the real field $\mathbf{R}$, (ii) the complex field $\mathbf{C}$.
Since $A^3 = I$, $A$ is a zero of the polynomial $f(t) = t^3 - 1 = (t - 1)(t^2 + t + 1)$. The minimal polynomial $m(t)$ of $A$ cannot be $t - 1$, since $A \ne I$. Hence
$$m(t) = t^2 + t + 1 \quad \text{or} \quad m(t) = t^3 - 1$$
Since neither polynomial is a product of linear polynomials over $\mathbf{R}$, $A$ is not diagonalizable over $\mathbf{R}$. On the other hand, each of the polynomials is a product of distinct linear polynomials over $\mathbf{C}$. Hence $A$ is diagonalizable over $\mathbf{C}$.
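A concrete instance of Example 10.3 may help. The matrix below is an illustrative choice, not taken from the text: it is the companion matrix of $t^2 + t + 1$, so it satisfies $A^3 = I$ with $A \ne I$. The sketch checks both identities and confirms that $t^2 + t + 1$ has negative discriminant, hence no real roots.

```python
def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

I2 = [[1, 0], [0, 1]]
A = [[0, -1], [1, -1]]       # hypothetical example: companion matrix of t^2 + t + 1

A2 = matmul(A, A)
A3 = matmul(A2, A)
# m(A) for m(t) = t^2 + t + 1
mA = [[A2[i][j] + A[i][j] + I2[i][j] for j in range(2)] for i in range(2)]

discriminant = 1 ** 2 - 4 * 1 * 1   # of t^2 + t + 1; negative means no real roots
print(A3, mA, discriminant)
```

Here $m(t) = t^2 + t + 1$, which does not split over $\mathbf{R}$ but has two distinct complex roots, in line with the conclusion of the example.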
10.6 NILPOTENT OPERATORS
A linear operator $T: V \to V$ is termed nilpotent if $T^n = 0$ for some positive integer $n$; we call $k$ the index of nilpotency of $T$ if $T^k = 0$ but $T^{k-1} \ne 0$. Analogously, a square matrix $A$ is termed nilpotent if $A^n = 0$ for some positive integer $n$, and of index $k$ if $A^k = 0$ but $A^{k-1} \ne 0$. Clearly the minimum polynomial of a nilpotent operator (matrix) of index $k$ is $m(t) = t^k$; hence 0 is its only eigenvalue.
Example 10.4. The following two $r$-square matrices will be used throughout the chapter:
$$N = N(r) = \begin{bmatrix} 0 & 1 & 0 & \cdots & 0 & 0 \\ 0 & 0 & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & 1 \\ 0 & 0 & 0 & \cdots & 0 & 0 \end{bmatrix} \quad \text{and} \quad J(\lambda) = \begin{bmatrix} \lambda & 1 & 0 & \cdots & 0 & 0 \\ 0 & \lambda & 1 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & \lambda & 1 \\ 0 & 0 & 0 & \cdots & 0 & \lambda \end{bmatrix}$$
The first matrix $N$, called a Jordan nilpotent block, consists of 1's above the diagonal (called the superdiagonal), and 0's elsewhere. It is a nilpotent matrix of index $r$. (The matrix $N$ of order 1 is just the $1 \times 1$ zero matrix [0].)
The second matrix $J(\lambda)$, called a Jordan block belonging to the eigenvalue $\lambda$, consists of $\lambda$'s on the diagonal, 1's on the superdiagonal, and 0's elsewhere. Observe that
$$J(\lambda) = \lambda I + N$$
In fact, we will prove that any linear operator $T$ can be decomposed into operators, each of which is the sum of a scalar operator and a nilpotent operator.
The following (proved in Problem 10.16) is a fundamental result on nilpotent operators.
Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a block diagonal matrix representation in which each diagonal entry is a Jordan nilpotent block $N$. There is at least one $N$ of order $k$, and all other $N$ are of orders $\le k$. The number of $N$ of each possible order is uniquely determined by $T$. The total number of $N$ of all orders is equal to the nullity of $T$.
The proof of Theorem 10.10 shows that the number of $N$ of order $i$ is equal to $2m_i - m_{i+1} - m_{i-1}$, where $m_i$ is the nullity of $T^i$.
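The block-counting formula $2m_i - m_{i+1} - m_{i-1}$ can be tried on a matrix whose block structure is known in advance. The sketch below is illustrative only: it builds a 6-square nilpotent matrix from Jordan nilpotent blocks of orders 3, 2, 1 (an arbitrary choice), computes the nullities $m_i$ of its powers by exact Gaussian elimination, and recovers the number of blocks of each order. The helper names are hypothetical.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def rank(M):
    """Rank by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

def nilpotent_block_diag(orders):
    """Block diagonal matrix of Jordan nilpotent blocks with the given orders."""
    n = sum(orders)
    M = [[0] * n for _ in range(n)]
    start = 0
    for size in orders:
        for i in range(start, start + size - 1):
            M[i][i + 1] = 1
        start += size
    return M

def block_counts(N):
    """Map each order i to 2*m_i - m_{i+1} - m_{i-1}, where m_i = nullity of N^i."""
    n = len(N)
    m = [0]                                   # m_0 = nullity of the identity
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(n + 1):
        P = matmul(P, N)
        m.append(n - rank(P))
    counts = {i: 2 * m[i] - m[i + 1] - m[i - 1] for i in range(1, n + 1)}
    return {i: c for i, c in counts.items() if c > 0}, m

N = nilpotent_block_diag([3, 2, 1])
counts, nullities = block_counts(N)
print(counts, nullities)
```

The first nullity $m_1$ equals the total number of blocks, as the last sentence of Theorem 10.10 states.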
10.7 JORDAN CANONICAL FORM
An operator $T$ can be put into Jordan canonical form if its characteristic and minimal polynomials factor into linear polynomials. This is always true if $K$ is the complex field $\mathbf{C}$. In any case, we can always extend the base field $K$ to a field in which the characteristic and minimal polynomials do factor into linear factors; thus, in a broad sense, every operator has a Jordan canonical form. Analogously, every matrix is similar to a matrix in Jordan canonical form.
The following theorem (proved in Problem 10.18) describes the Jordan canonical form $J$ of a linear operator $T$.
Theorem 10.11: Let $T: V \to V$ be a linear operator whose characteristic and minimal polynomials are, respectively,
$$\Delta(t) = (t - \lambda_1)^{n_1} \cdots (t - \lambda_r)^{n_r} \quad \text{and} \quad m(t) = (t - \lambda_1)^{m_1} \cdots (t - \lambda_r)^{m_r}$$
where the $\lambda_i$ are distinct scalars. Then $T$ has a block diagonal matrix representation $J$ in which each diagonal entry is a Jordan block $J_{ij} = J(\lambda_i)$. For each $\lambda_i$, the corresponding $J_{ij}$ have the following properties:
(i) There is at least one $J_{ij}$ of order $m_i$; all other $J_{ij}$ are of order $\le m_i$.
(ii) The sum of the orders of the $J_{ij}$ is $n_i$.
(iii) The number of $J_{ij}$ equals the geometric multiplicity of $\lambda_i$.
(iv) The number of $J_{ij}$ of each possible order is uniquely determined by $T$.
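Example 10.5. Suppose the characteristic and minimal polynomials of an operator $T$ are, respectively,
$$\Delta(t) = (t - 2)^4(t - 5)^3 \quad \text{and} \quad m(t) = (t - 2)^2(t - 5)^3$$
Then the Jordan canonical form of $T$ is one of the following block diagonal matrices:
$$\mathrm{diag}\left(\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix}\right) \quad \text{or} \quad \mathrm{diag}\left(\begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix}, [2], [2], \begin{bmatrix} 5 & 1 & 0 \\ 0 & 5 & 1 \\ 0 & 0 & 5 \end{bmatrix}\right)$$
The first matrix occurs if $T$ has two independent eigenvectors belonging to the eigenvalue 2; and the second matrix occurs if $T$ has three independent eigenvectors belonging to 2.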
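Properties (i) and (ii) of Theorem 10.11 reduce the enumeration in Example 10.5 to a counting problem: for each eigenvalue, list the ways to write its algebraic multiplicity as a sum of block orders whose largest term is the exponent in $m(t)$. The sketch below does this for the exponents $(4, 2)$ for eigenvalue 2 and $(3, 3)$ for eigenvalue 5, which are the values consistent with the two forms listed; the function names are illustrative.

```python
from itertools import product

def partitions_with_max(n, largest):
    """Yield the non-increasing lists of positive integers that sum to n
    and whose largest part is exactly `largest`."""
    def rest(remaining, cap):
        if remaining == 0:
            yield []
            return
        for part in range(min(cap, remaining), 0, -1):
            for tail in rest(remaining - part, part):
                yield [part] + tail
    if largest > n:
        return
    for tail in rest(n - largest, largest):
        yield [largest] + tail

def jordan_forms(spec):
    """spec maps eigenvalue -> (algebraic multiplicity, exponent in m(t)).
    Returns every assignment eigenvalue -> list of Jordan block orders."""
    eigenvalues = sorted(spec)
    choices = [list(partitions_with_max(*spec[lam])) for lam in eigenvalues]
    return [dict(zip(eigenvalues, combo)) for combo in product(*choices)]

# Assumed exponents for Example 10.5.
forms = jordan_forms({2: (4, 2), 5: (3, 3)})
for form in forms:
    print(form)
```

By property (iii), the length of each list is the geometric multiplicity of that eigenvalue, which is how the example tells the two candidates apart.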
10.8 CYCLIC SUBSPACES
Let $T$ be a linear operator on a vector space $V$ of finite dimension over $K$. Suppose $v \in V$ and $v \ne 0$. The set of all vectors of the form $f(T)(v)$, where $f(t)$ ranges over all polynomials over $K$, is a $T$-invariant subspace of $V$ called the $T$-cyclic subspace of $V$ generated by $v$; we denote it by $Z(v, T)$ and denote the restriction of $T$ to $Z(v, T)$ by $T_v$. By Problem 10.56, we could equivalently define $Z(v, T)$ as the intersection of all $T$-invariant subspaces of $V$ containing $v$.
Now consider the sequence
$$v,\ T(v),\ T^2(v),\ T^3(v),\ \ldots$$
of powers of $T$ acting on $v$. Let $k$ be the least integer such that $T^k(v)$ is a linear combination of those vectors that precede it in the sequence; say,
$$T^k(v) = -a_{k-1}T^{k-1}(v) - \cdots - a_1 T(v) - a_0 v$$
Then
$$m_v(t) = t^k + a_{k-1}t^{k-1} + \cdots + a_1 t + a_0$$
is the unique monic polynomial of lowest degree for which $m_v(T)(v) = 0$. We call $m_v(t)$ the $T$-annihilator of $v$ and $Z(v, T)$.
The following theorem (proved in Problem 10.29) holds.
Theorem 10.12: Let $Z(v, T)$, $T_v$, $m_v(t)$ be defined as above. Then:
(i) The set $\{v, T(v), \ldots, T^{k-1}(v)\}$ is a basis of $Z(v, T)$; hence $\dim Z(v, T) = k$.
(ii) The minimal polynomial of $T_v$ is $m_v(t)$.
(iii) The matrix representation of $T_v$ in the above basis is just the companion matrix $C(m_v)$ of $m_v(t)$; that is,
$$C(m_v) = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 0 & -a_{k-2} \\ 0 & 0 & 0 & \cdots & 1 & -a_{k-1} \end{bmatrix}$$
10.9 RATIONAL CANONICAL FORM
In this section, we present the rational canonical form for a linear operator $T: V \to V$. We emphasize that this form exists even when the minimal polynomial cannot be factored into linear polynomials. (Recall that this is not the case for the Jordan canonical form.)
Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum
$$V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$$
of $T$-cyclic subspaces $Z(v_i, T)$ with corresponding $T$-annihilators
$$f(t)^{n_1},\ f(t)^{n_2},\ \ldots,\ f(t)^{n_r}, \qquad n = n_1 \ge n_2 \ge \cdots \ge n_r$$
Any other decomposition of $V$ into $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.
We emphasize that the above lemma (proved in Problem 10.31) does not say that the vectors $v_i$ or other $T$-cyclic subspaces $Z(v_i, T)$ are uniquely determined by $T$; but it does say that the set of $T$-annihilators is uniquely determined by $T$. Thus $T$ has a unique block diagonal matrix representation
$$M = \mathrm{diag}(C_1, C_2, \ldots, C_r)$$
where the $C_i$ are companion matrices. In fact, the $C_i$ are the companion matrices of the polynomials $f(t)^{n_i}$.
Using the Primary Decomposition Theorem and Lemma 10.13, we obtain the following result.
Theorem 10.14: Let $T: V \to V$ be a linear operator with minimal polynomial
$$m(t) = f_1(t)^{m_1} f_2(t)^{m_2} \cdots f_s(t)^{m_s}$$
where the $f_i(t)$ are distinct monic irreducible polynomials. Then $T$ has a unique block diagonal matrix representation
$$M = \mathrm{diag}(C_{11}, C_{12}, \ldots, C_{1r_1}, \ldots, C_{s1}, C_{s2}, \ldots, C_{sr_s})$$
where the $C_{ij}$ are companion matrices. In particular, the $C_{ij}$ are the companion matrices of the polynomials $f_i(t)^{n_{ij}}$ where
$$m_1 = n_{11} \ge n_{12} \ge \cdots \ge n_{1r_1}, \quad \ldots, \quad m_s = n_{s1} \ge n_{s2} \ge \cdots \ge n_{sr_s}$$
The above matrix representation of $T$ is called its rational canonical form. The polynomials $f_i(t)^{n_{ij}}$ are called the elementary divisors of $T$.
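Example 10.6. Let $V$ be a vector space of dimension 8 over the rational field $\mathbf{Q}$, and let $T$ be a linear operator on $V$ whose
minimal polynomial is
$$m(t) = f_1(t) f_2(t)^2 = (t^4 - 4t^3 + 6t^2 - 4t - 7)(t - 3)^2$$
Then the rational canonical form $M$ of $T$ must have one block the companion matrix of $f_1(t)$ and one block the
companion matrix of $f_2(t)^2$. There are two possibilities:
(a) $\operatorname{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t-3)^2),\ C((t-3)^2)]$
(b) $\operatorname{diag}[C(t^4 - 4t^3 + 6t^2 - 4t - 7),\ C((t-3)^2),\ C(t-3),\ C(t-3)]$
That is,
$$\text{(a)}\ \operatorname{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right), \qquad \text{(b)}\ \operatorname{diag}\left( \begin{bmatrix} 0 & 0 & 0 & 7 \\ 1 & 0 & 0 & 4 \\ 0 & 1 & 0 & -6 \\ 0 & 0 & 1 & 4 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix},\ [3],\ [3] \right)$$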
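The two candidate forms of Example 10.6 can be assembled mechanically. The sketch below is only an illustration: the helper names `companion` and `block_diag` and the coefficient-list encoding of the polynomials are assumptions introduced here.

```python
def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0, coeffs = [a_0, ..., a_{k-1}]."""
    k = len(coeffs)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1
    for i in range(k):
        C[i][k - 1] = -coeffs[i]
    return C

def block_diag(blocks):
    """Place square blocks along the diagonal of a larger zero matrix."""
    n = sum(len(b) for b in blocks)
    M = [[0] * n for _ in range(n)]
    offset = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, x in enumerate(row):
                M[offset + i][offset + j] = x
        offset += len(b)
    return M

f1 = [-7, -4, 6, -4]   # t^4 - 4t^3 + 6t^2 - 4t - 7
f2_sq = [9, -6]        # (t - 3)^2 = t^2 - 6t + 9
f2 = [-3]              # t - 3

form_a = block_diag([companion(f1), companion(f2_sq), companion(f2_sq)])
form_b = block_diag([companion(f1), companion(f2_sq), companion(f2), companion(f2)])
```

Both matrices are $8 \times 8$, matching $\dim V = 8$; the block sizes are $4 + 2 + 2$ in case (a) and $4 + 2 + 1 + 1$ in case (b).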
10.10 QUOTIENT SPACES
Let $V$ be a vector space over a field $K$ and let $W$ be a subspace of $V$. If $v$ is any vector in $V$, we write
$v + W$ for the set of sums $v + w$ with $w \in W$; that is,
$$v + W = \{v + w : w \in W\}$$
These sets are called the cosets of $W$ in $V$. We show (Problem 10.22) that these cosets partition $V$ into
mutually disjoint subsets.
Example 10.7. Let $W$ be the subspace of $\mathbf{R}^2$ defined by
$$W = \{(a, b) : a = b\},$$
that is, $W$ is the line given by the equation $x - y = 0$. We can view
$v + W$ as a translation of the line obtained by adding the vector $v$ to
each point in $W$. As shown in Fig. 10-2, the coset $v + W$ is also a
line, and it is parallel to $W$. Thus the cosets of $W$ in $\mathbf{R}^2$ are
precisely all the lines parallel to $W$.
In the following theorem, we use the cosets of a
subspace W of a vector space V to define a new vector
space; it is called the quotient space of V by W, and is
denoted by V/W.
Fig. 10-2
Theorem 10.15: Let $W$ be a subspace of a vector space $V$ over a field $K$. Then the cosets of $W$ in $V$ form a
vector space over $K$ with the following operations of addition and scalar multiplication:
(i) $(u + W) + (v + W) = (u + v) + W$, (ii) $k(u + W) = ku + W$, where $k \in K$
We note that, in the proof of Theorem 10.15 (Problem 10.24), it is first necessary to show that the
operations are well defined; that is, whenever $u + W = u' + W$ and $v + W = v' + W$, then
(i) $(u + v) + W = (u' + v') + W$ and (ii) $ku + W = ku' + W$ for any $k \in K$
In the case of an invariant subspace, we have the following useful result (proved in Problem 10.27).
Theorem 10.16: Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$. Then $T$ induces a
linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if $T$ is a zero
of any polynomial, then so is $\bar{T}$. Thus the minimal polynomial of $\bar{T}$ divides the minimal
polynomial of $T$.
Solved Problems
INVARIANT SUBSPACES
10.1. Suppose $T: V \to V$ is linear. Show that each of the following is invariant under $T$:
(a) $\{0\}$, (b) $V$, (c) kernel of $T$, (d) image of $T$.
(a) We have $T(0) = 0 \in \{0\}$; hence $\{0\}$ is invariant under $T$.
(b) For every $v \in V$, $T(v) \in V$; hence $V$ is invariant under $T$.
(c) Let $u \in \operatorname{Ker} T$. Then $T(u) = 0 \in \operatorname{Ker} T$ since the kernel of $T$ is a subspace of $V$. Thus $\operatorname{Ker} T$ is invariant
under $T$.
(d) Since $T(v) \in \operatorname{Im} T$ for every $v \in V$, it is certainly true when $v \in \operatorname{Im} T$. Hence the image of $T$ is invariant
under $T$.
10.2. Suppose $\{W_i\}$ is a collection of T-invariant subspaces of a vector space $V$. Show that the intersection
$W = \bigcap_i W_i$ is also T-invariant.
Suppose $v \in W$; then $v \in W_i$ for every $i$. Since $W_i$ is T-invariant, $T(v) \in W_i$ for every $i$. Thus $T(v) \in W$
and so $W$ is T-invariant.
10.3. Prove Theorem 10.2: Let $T: V \to V$ be linear. For any polynomial $f(t)$, the kernel of $f(T)$ is
invariant under $T$.
Suppose $v \in \operatorname{Ker} f(T)$, i.e., $f(T)(v) = 0$. We need to show that $T(v)$ also belongs to the kernel of $f(T)$,
i.e., $f(T)(T(v)) = (f(T) \circ T)(v) = 0$. Since $f(t)\,t = t\,f(t)$, we have $f(T) \circ T = T \circ f(T)$. Thus, as required,
$$(f(T) \circ T)(v) = (T \circ f(T))(v) = T(f(T)(v)) = T(0) = 0$$
10.4. Find all invariant subspaces of $A$
viewed as an operator on $\mathbf{R}^2$.
By Problem 10.1, $\mathbf{R}^2$ and $\{0\}$ are invariant under $A$. Now if $A$ has any other invariant subspace, it must be
1-dimensional. However, the characteristic polynomial of $A$ is
$$\Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 + 1$$
Hence $A$ has no eigenvalues (in $\mathbf{R}$) and so $A$ has no eigenvectors. But the 1-dimensional invariant subspaces
correspond to the eigenvectors; thus $\mathbf{R}^2$ and $\{0\}$ are the only subspaces invariant under $A$.
10.5. Prove Theorem 10.3: Suppose $W$ is T-invariant. Then $T$ has a triangular block representation
$\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is the matrix representation of the restriction $\hat{T}$ of $T$ to $W$.
We choose a basis $\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis $\{w_1, \ldots, w_r, v_1, \ldots, v_s\}$ of $V$. We have
$$\begin{aligned} \hat{T}(w_1) = T(w_1) &= a_{11}w_1 + \cdots + a_{1r}w_r \\ \hat{T}(w_2) = T(w_2) &= a_{21}w_1 + \cdots + a_{2r}w_r \\ &\ \ \vdots \\ \hat{T}(w_r) = T(w_r) &= a_{r1}w_1 + \cdots + a_{rr}w_r \\ T(v_1) &= b_{11}w_1 + \cdots + b_{1r}w_r + c_{11}v_1 + \cdots + c_{1s}v_s \\ T(v_2) &= b_{21}w_1 + \cdots + b_{2r}w_r + c_{21}v_1 + \cdots + c_{2s}v_s \\ &\ \ \vdots \\ T(v_s) &= b_{s1}w_1 + \cdots + b_{sr}w_r + c_{s1}v_1 + \cdots + c_{ss}v_s \end{aligned}$$
But the matrix of $T$ in this basis is the transpose of the matrix of coefficients in the above system of equations
(Section 6.2). Therefore it has the form $\begin{bmatrix} A & B \\ 0 & C \end{bmatrix}$, where $A$ is the transpose of the matrix of coefficients for the
obvious subsystem. By the same argument, $A$ is the matrix of $\hat{T}$ relative to the basis $\{w_i\}$ of $W$.
10.6. Let $\hat{T}$ denote the restriction of an operator $T$ to an invariant subspace $W$. Prove:
(a) For any polynomial $f(t)$, $f(\hat{T})(w) = f(T)(w)$ for every $w \in W$.
(b) The minimal polynomial of $\hat{T}$ divides the minimal polynomial of $T$.
(a) If $f(t) = 0$, or if $f(t)$ is a constant or a polynomial of degree 1, then the result clearly holds.
Assume $\deg f = n > 1$ and that the result holds for polynomials of degree less than $n$. Suppose that
$$f(t) = a_n t^n + a_{n-1}t^{n-1} + \cdots + a_1 t + a_0$$
Then
$$\begin{aligned} f(\hat{T})(w) &= (a_n\hat{T}^n + a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\ &= (a_n\hat{T}^{n-1})(\hat{T}(w)) + (a_{n-1}\hat{T}^{n-1} + \cdots + a_0 I)(w) \\ &= (a_n T^{n-1})(T(w)) + (a_{n-1}T^{n-1} + \cdots + a_0 I)(w) = f(T)(w) \end{aligned}$$
(b) Let $m(t)$ denote the minimal polynomial of $T$. Then by (a), $m(\hat{T})(w) = m(T)(w) = 0(w) = 0$ for every
$w \in W$; that is, $\hat{T}$ is a zero of the polynomial $m(t)$. Hence the minimal polynomial of $\hat{T}$ divides $m(t)$.
INVARIANT DIRECT-SUM DECOMPOSITIONS
10.7. Prove Theorem 10.4: Suppose $W_1, W_2, \ldots, W_r$ are subspaces of $V$ with respective bases
$$B_1 = \{w_{11}, w_{12}, \ldots, w_{1n_1}\}, \quad \ldots, \quad B_r = \{w_{r1}, w_{r2}, \ldots, w_{rn_r}\}$$
Then $V$ is the direct sum of the $W_i$ if and only if the union $B = \bigcup_i B_i$ is a basis of $V$.
Suppose $B$ is a basis of $V$. Then, for any $v \in V$,
$$v = a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = w_1 + w_2 + \cdots + w_r$$
where $w_i = a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We next show that such a sum is unique. Suppose
$$v = w_1' + w_2' + \cdots + w_r', \quad \text{where } w_i' \in W_i$$
Since $\{w_{i1}, \ldots, w_{in_i}\}$ is a basis of $W_i$, $w_i' = b_{i1}w_{i1} + \cdots + b_{in_i}w_{in_i}$, and so
$$v = b_{11}w_{11} + \cdots + b_{1n_1}w_{1n_1} + \cdots + b_{r1}w_{r1} + \cdots + b_{rn_r}w_{rn_r}$$
Since $B$ is a basis of $V$, $a_{ij} = b_{ij}$ for each $i$ and each $j$. Hence $w_i = w_i'$, and so the sum for $v$ is unique.
Accordingly, $V$ is the direct sum of the $W_i$.
Conversely, suppose $V$ is the direct sum of the $W_i$. Then for any $v \in V$, $v = w_1 + \cdots + w_r$, where
$w_i \in W_i$. Since $\{w_{ij}\}$ is a basis of $W_i$, each $w_i$ is a linear combination of the $w_{ij}$, and so $v$ is a linear
combination of the elements of $B$. Thus $B$ spans $V$. We now show that $B$ is linearly independent. Suppose
$$a_{11}w_{11} + \cdots + a_{1n_1}w_{1n_1} + \cdots + a_{r1}w_{r1} + \cdots + a_{rn_r}w_{rn_r} = 0$$
Note that $a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} \in W_i$. We also have that $0 = 0 + 0 + \cdots + 0$, where $0 \in W_i$. Since such a sum for 0 is
unique,
$$a_{i1}w_{i1} + \cdots + a_{in_i}w_{in_i} = 0 \quad \text{for } i = 1, \ldots, r$$
The independence of the bases $\{w_{ij}\}$ implies that all the $a$'s are 0. Thus $B$ is linearly independent, and hence is a
basis of $V$.
10.8. Suppose $T: V \to V$ is linear and suppose $T = T_1 \oplus T_2$ with respect to a T-invariant direct-sum
decomposition $V = U \oplus W$. Show that:
(a) $m(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$, where $m(t)$, $m_1(t)$, $m_2(t)$ are the
minimal polynomials of $T$, $T_1$, $T_2$, respectively.
(b) $\Delta(t) = \Delta_1(t)\Delta_2(t)$, where $\Delta(t)$, $\Delta_1(t)$, $\Delta_2(t)$ are the characteristic polynomials of $T$, $T_1$, $T_2$,
respectively.
(a) By Problem 10.6, each of $m_1(t)$ and $m_2(t)$ divides $m(t)$. Now suppose $f(t)$ is a multiple of both $m_1(t)$ and
$m_2(t)$; then $f(T_1)(U) = 0$ and $f(T_2)(W) = 0$. Let $v \in V$; then $v = u + w$ with $u \in U$ and $w \in W$. Now
$$f(T)v = f(T)u + f(T)w = f(T_1)u + f(T_2)w = 0 + 0 = 0$$
That is, $T$ is a zero of $f(t)$. Hence $m(t)$ divides $f(t)$, and so $m(t)$ is the least common multiple of $m_1(t)$ and
$m_2(t)$.
(b) By Theorem 10.5, $T$ has a matrix representation $M = \begin{bmatrix} A & 0 \\ 0 & B \end{bmatrix}$, where $A$ and $B$ are matrix representations
of $T_1$ and $T_2$, respectively. Then, as required,
$$\Delta(t) = |tI - M| = \begin{vmatrix} tI - A & 0 \\ 0 & tI - B \end{vmatrix} = |tI - A|\,|tI - B| = \Delta_1(t)\Delta_2(t)$$
10.9. Prove Theorem 10.7: Suppose $T: V \to V$ is linear, and suppose $f(t) = g(t)h(t)$ are polynomials such
that $f(T) = 0$ and $g(t)$ and $h(t)$ are relatively prime. Then $V$ is the direct sum of the T-invariant
subspaces $U$ and $W$, where $U = \operatorname{Ker} g(T)$ and $W = \operatorname{Ker} h(T)$.
Note first that $U$ and $W$ are T-invariant by Theorem 10.2. Now, since $g(t)$ and $h(t)$ are relatively prime,
there exist polynomials $r(t)$ and $s(t)$ such that
$$r(t)g(t) + s(t)h(t) = 1$$
Hence, for the operator $T$,
$$r(T)g(T) + s(T)h(T) = I \qquad (*)$$
Let $v \in V$; then, by (*), $v = r(T)g(T)v + s(T)h(T)v$.
But the first term in this sum belongs to $W = \operatorname{Ker} h(T)$, since
$$h(T)r(T)g(T)v = r(T)g(T)h(T)v = r(T)f(T)v = r(T)0v = 0$$
Similarly, the second term belongs to $U$. Hence $V$ is the sum of $U$ and $W$.
To prove that $V = U \oplus W$, we must show that a sum $v = u + w$ with $u \in U$, $w \in W$, is uniquely
determined by $v$. Applying the operator $r(T)g(T)$ to $v = u + w$ and using $g(T)u = 0$, we obtain
$$r(T)g(T)v = r(T)g(T)u + r(T)g(T)w = r(T)g(T)w$$
Also, applying (*) to $w$ alone and using $h(T)w = 0$, we obtain
$$w = r(T)g(T)w + s(T)h(T)w = r(T)g(T)w$$
Both of the above formulas give us $w = r(T)g(T)v$, and so $w$ is uniquely determined by $v$. Similarly $u$ is
uniquely determined by $v$. Hence $V = U \oplus W$, as required.
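The proof is constructive: $r(T)g(T)$ and $s(T)h(T)$ act as projections onto $W$ and $U$. The sketch below illustrates this on a hypothetical $2 \times 2$ example chosen for this note, not taken from the text: $T$ has $f(t) = (t-1)(t-2)$ with $g(t) = t - 1$, $h(t) = t - 2$, and the identity $r g + s h = 1$ holds with $r = 1$, $s = -1$.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def matvec(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def shift(A, c):
    """Return A - c*I."""
    return [[A[i][j] - (c if i == j else 0) for j in range(len(A))]
            for i in range(len(A))]

T = [[1, 1],
     [0, 2]]
g_T = shift(T, 1)                          # g(T) = T - I
h_T = shift(T, 2)                          # h(T) = T - 2I
E_W = g_T                                  # r(T)g(T) with r(t) = 1
E_U = [[-x for x in row] for row in h_T]   # s(T)h(T) with s(t) = -1

v = [3, 5]
w = matvec(E_W, v)   # component of v in W = Ker h(T)
u = matvec(E_U, v)   # component of v in U = Ker g(T)
print(u, w)          # [-2, 0] [5, 5]
```

Here $g(T)h(T) = 0$, so $f(T) = 0$ as the theorem requires, and $u + w$ recovers $v$.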
10.10. Prove Theorem 10.8: In Theorem 10.7 (Problem 10.9), if $f(t)$ is the minimal polynomial of $T$ (and
$g(t)$ and $h(t)$ are monic), then $g(t)$ is the minimal polynomial of the restriction $T_1$ of $T$ to $U$ and $h(t)$
is the minimal polynomial of the restriction $T_2$ of $T$ to $W$.
Let $m_1(t)$ and $m_2(t)$ be the minimal polynomials of $T_1$ and $T_2$, respectively. Note that $g(T_1) = 0$ and
$h(T_2) = 0$ because $U = \operatorname{Ker} g(T)$ and $W = \operatorname{Ker} h(T)$. Thus
$$m_1(t) \text{ divides } g(t) \quad \text{and} \quad m_2(t) \text{ divides } h(t) \qquad (1)$$
By Problem 10.8, $f(t)$ is the least common multiple of $m_1(t)$ and $m_2(t)$. But $m_1(t)$ and $m_2(t)$ are relatively
prime since $g(t)$ and $h(t)$ are relatively prime. Accordingly, $f(t) = m_1(t)m_2(t)$. We also have that
$f(t) = g(t)h(t)$. These two equations together with (1) and the fact that all the polynomials are monic
imply that $g(t) = m_1(t)$ and $h(t) = m_2(t)$, as required.
10.11. Prove the Primary Decomposition Theorem 10.6: Let $T: V \to V$ be a linear operator with minimal
polynomial
$$m(t) = f_1(t)^{n_1} f_2(t)^{n_2} \cdots f_r(t)^{n_r}$$
where the $f_i(t)$ are distinct monic irreducible polynomials. Then $V$ is the direct sum of T-invariant
subspaces $W_1, \ldots, W_r$, where $W_i$ is the kernel of $f_i(T)^{n_i}$. Moreover, $f_i(t)^{n_i}$ is the minimal polynomial
of the restriction of $T$ to $W_i$.
The proof is by induction on $r$. The case $r = 1$ is trivial. Suppose that the theorem has been proved for
$r - 1$. By Theorem 10.7, we can write $V$ as the direct sum of T-invariant subspaces $W_1$ and $V_1$, where $W_1$ is
the kernel of $f_1(T)^{n_1}$ and where $V_1$ is the kernel of $f_2(T)^{n_2} \cdots f_r(T)^{n_r}$. By Theorem 10.8, the minimal
polynomials of the restrictions of $T$ to $W_1$ and $V_1$ are $f_1(t)^{n_1}$ and $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$, respectively.
Denote the restriction of $T$ to $V_1$ by $\hat{T}_1$. By the inductive hypothesis, $V_1$ is the direct sum of subspaces
$W_2, \ldots, W_r$ such that $W_i$ is the kernel of $f_i(\hat{T}_1)^{n_i}$ and such that $f_i(t)^{n_i}$ is the minimal polynomial for the
restriction of $\hat{T}_1$ to $W_i$. But the kernel of $f_i(T)^{n_i}$, for $i = 2, \ldots, r$, is necessarily contained in $V_1$, since $f_i(t)^{n_i}$
divides $f_2(t)^{n_2} \cdots f_r(t)^{n_r}$. Thus the kernel of $f_i(T)^{n_i}$ is the same as the kernel of $f_i(\hat{T}_1)^{n_i}$, which is $W_i$. Also, the
restriction of $T$ to $W_i$ is the same as the restriction of $\hat{T}_1$ to $W_i$ (for $i = 2, \ldots, r$); hence $f_i(t)^{n_i}$ is also the
minimal polynomial for the restriction of $T$ to $W_i$. Thus $V = W_1 \oplus W_2 \oplus \cdots \oplus W_r$ is the desired
decomposition of $V$.
10.12. Prove Theorem 10.9: A linear operator $T: V \to V$ has a diagonal matrix representation if and only if
its minimal polynomial $m(t)$ is a product of distinct linear polynomials.
Suppose $m(t)$ is a product of distinct linear polynomials; say,
$$m(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_r)$$
where the $\lambda_i$ are distinct scalars. By the Primary Decomposition Theorem, $V$ is the direct sum of subspaces
$W_1, \ldots, W_r$, where $W_i = \operatorname{Ker}(T - \lambda_i I)$. Thus, if $v \in W_i$, then $(T - \lambda_i I)(v) = 0$ or $T(v) = \lambda_i v$. In other words,
every vector in $W_i$ is an eigenvector belonging to the eigenvalue $\lambda_i$. By Theorem 10.4, the union of bases for
$W_1, \ldots, W_r$ is a basis of $V$. This basis consists of eigenvectors, and so $T$ is diagonalizable.
Conversely, suppose $T$ is diagonalizable, i.e., $V$ has a basis consisting of eigenvectors of $T$. Let $\lambda_1, \ldots, \lambda_s$
be the distinct eigenvalues of $T$. Then the operator
$$f(T) = (T - \lambda_1 I)(T - \lambda_2 I) \cdots (T - \lambda_s I)$$
maps each basis vector into 0. Thus $f(T) = 0$, and hence the minimal polynomial $m(t)$ of $T$ divides the
polynomial
$$f(t) = (t - \lambda_1)(t - \lambda_2) \cdots (t - \lambda_s)$$
Accordingly, $m(t)$ is a product of distinct linear polynomials.
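The criterion of Theorem 10.9 can be tested numerically on small integer matrices. The following sketch assumes the caller supplies all eigenvalues of the matrix, all lying in the base field; the function name `is_diagonalizable` and the sample matrices are hypothetical choices for illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def is_diagonalizable(A, eigenvalues):
    """Check whether the product of (A - lam*I) over the distinct eigenvalues is zero.

    Assumes `eigenvalues` lists every eigenvalue of A (repeats are allowed).
    """
    n = len(A)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for lam in set(eigenvalues):
        A_shift = [[A[i][j] - (lam if i == j else 0) for j in range(n)]
                   for i in range(n)]
        P = matmul(P, A_shift)
    return all(x == 0 for row in P for x in row)

print(is_diagonalizable([[1, 1], [0, 2]], [1, 2]))   # True
print(is_diagonalizable([[2, 1], [0, 2]], [2, 2]))   # False
```

The second matrix has minimal polynomial $(t-2)^2$, which has a repeated factor, so it fails the test.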
NILPOTENT OPERATORS, JORDAN CANONICAL FORM
10.13. Let $T: V \to V$ be linear. Suppose, for $v \in V$, $T^k(v) = 0$ but $T^{k-1}(v) \ne 0$. Prove:
(a) The set $S = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent.
(b) The subspace $W$ generated by $S$ is T-invariant.
(c) The restriction $\hat{T}$ of $T$ to $W$ is nilpotent of index $k$.
(d) Relative to the basis $\{T^{k-1}(v), \ldots, T(v), v\}$ of $W$, the matrix of $\hat{T}$ is the $k$-square Jordan
nilpotent block $N_k$ of index $k$ (see Example 10.5).
(a) Suppose
$$av + a_1T(v) + a_2T^2(v) + \cdots + a_{k-1}T^{k-1}(v) = 0 \qquad (*)$$
Applying $T^{k-1}$ to (*) and using $T^k(v) = 0$, we obtain $aT^{k-1}(v) = 0$; since $T^{k-1}(v) \ne 0$, $a = 0$. Now
applying $T^{k-2}$ to (*) and using $T^k(v) = 0$ and $a = 0$, we find $a_1T^{k-1}(v) = 0$; hence $a_1 = 0$. Next
applying $T^{k-3}$ to (*) and using $T^k(v) = 0$ and $a = a_1 = 0$, we obtain $a_2T^{k-1}(v) = 0$; hence $a_2 = 0$.
Continuing this process, we find that all the $a$'s are 0; hence $S$ is independent.
(b) Let $w \in W$. Then
$$w = bv + b_1T(v) + b_2T^2(v) + \cdots + b_{k-1}T^{k-1}(v)$$
Using $T^k(v) = 0$, we have that
$$T(w) = bT(v) + b_1T^2(v) + \cdots + b_{k-2}T^{k-1}(v) \in W$$
Thus $W$ is T-invariant.
(c) By hypothesis $T^k(v) = 0$. Hence, for $i = 0, \ldots, k - 1$,
$$\hat{T}^k(T^i(v)) = T^{k+i}(v) = 0$$
That is, applying $\hat{T}^k$ to each generator of $W$, we obtain 0; hence $\hat{T}^k = 0$ and so $\hat{T}$ is nilpotent of index at
most $k$. On the other hand, $\hat{T}^{k-1}(v) = T^{k-1}(v) \ne 0$; hence $\hat{T}$ is nilpotent of index exactly $k$.
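(d) For the basis $\{T^{k-1}(v), T^{k-2}(v), \ldots, T(v), v\}$ of $W$,
$$\begin{aligned} \hat{T}(T^{k-1}(v)) &= T^k(v) = 0 \\ \hat{T}(T^{k-2}(v)) &= T^{k-1}(v) \\ \hat{T}(T^{k-3}(v)) &= T^{k-2}(v) \\ &\ \ \vdots \\ \hat{T}(T(v)) &= T^2(v) \\ \hat{T}(v) &= T(v) \end{aligned}$$
Hence, as required, the matrix of $\hat{T}$ in this basis is the $k$-square Jordan nilpotent block $N_k$.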
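To make part (d) concrete, here is a small sketch that builds the $k$-square Jordan nilpotent block (1's directly above the diagonal) and finds its index of nilpotency by repeated multiplication. The helper names are assumptions made for this illustration.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def nilpotent_block(k):
    """k-square Jordan nilpotent block: 1's on the superdiagonal, 0's elsewhere."""
    return [[1 if j == i + 1 else 0 for j in range(k)] for i in range(k)]

def nilpotency_index(N):
    """Smallest p with N^p = 0, or None if N is not nilpotent."""
    n = len(N)
    P = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for p in range(1, n + 1):
        P = matmul(P, N)
        if all(x == 0 for row in P for x in row):
            return p
    return None

print(nilpotent_block(3))                     # [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
print(nilpotency_index(nilpotent_block(4)))   # 4
```

Checking powers only up to $n$ suffices, because an $n$-square nilpotent matrix has index at most $n$.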
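10.14. Let $T: V \to V$ be linear. Let $U = \operatorname{Ker} T^i$ and $W = \operatorname{Ker} T^{i+1}$. Show that:
(a) $U \subseteq W$, (b) $T(W) \subseteq U$.
(a) Suppose $u \in U = \operatorname{Ker} T^i$. Then $T^i(u) = 0$ and so $T^{i+1}(u) = T(T^i(u)) = T(0) = 0$. Thus
$u \in \operatorname{Ker} T^{i+1} = W$. But this is true for every $u \in U$; hence $U \subseteq W$.
(b) Similarly, if $w \in W = \operatorname{Ker} T^{i+1}$, then $T^{i+1}(w) = 0$. Thus $T^i(T(w)) = T^{i+1}(w) = 0$, so $T(w) \in \operatorname{Ker} T^i = U$, and so
$T(W) \subseteq U$.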
10.15. Let $T: V \to V$ be linear. Let $X = \operatorname{Ker} T^{i-2}$, $Y = \operatorname{Ker} T^{i-1}$, $Z = \operatorname{Ker} T^i$. Therefore (Problem 10.14),
$X \subseteq Y \subseteq Z$. Suppose
$$\{u_1, \ldots, u_r\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s\}, \quad \{u_1, \ldots, u_r, v_1, \ldots, v_s, w_1, \ldots, w_t\}$$
are bases of $X$, $Y$, $Z$, respectively. Show that
$$S = \{u_1, \ldots, u_r, T(w_1), \ldots, T(w_t)\}$$
is contained in $Y$ and is linearly independent.
By Problem 10.14, $T(Z) \subseteq Y$, and hence $S \subseteq Y$. Now suppose $S$ is linearly dependent. Then there exists
a relation
$$a_1u_1 + \cdots + a_ru_r + b_1T(w_1) + \cdots + b_tT(w_t) = 0$$
where at least one coefficient is not zero. Furthermore, since $\{u_i\}$ is independent, at least one of the $b_k$ must be
nonzero. Transposing, we find
$$b_1T(w_1) + \cdots + b_tT(w_t) = -a_1u_1 - \cdots - a_ru_r \in X = \operatorname{Ker} T^{i-2}$$
Hence $T^{i-2}(b_1T(w_1) + \cdots + b_tT(w_t)) = 0$.
Thus $T^{i-1}(b_1w_1 + \cdots + b_tw_t) = 0$, and so $b_1w_1 + \cdots + b_tw_t \in Y = \operatorname{Ker} T^{i-1}$.
Since $\{u_i, v_j\}$ generates $Y$, we obtain a relation among the $u_i$, $v_j$, $w_k$ where one of the coefficients, i.e., one of
the $b_k$, is not zero. This contradicts the fact that $\{u_i, v_j, w_k\}$ is independent. Hence $S$ must also be independent.
10.16. Prove Theorem 10.10: Let $T: V \to V$ be a nilpotent operator of index $k$. Then $T$ has a unique block
diagonal matrix representation consisting of Jordan nilpotent blocks $N$. There is at least one $N$ of
order $k$, and all other $N$ are of orders $\le k$. The total number of $N$ of all orders is equal to the nullity
of $T$.
Suppose $\dim V = n$. Let $W_1 = \operatorname{Ker} T$, $W_2 = \operatorname{Ker} T^2$, ..., $W_k = \operatorname{Ker} T^k$. Let us set $m_i = \dim W_i$, for
$i = 1, \ldots, k$. Since $T$ is of index $k$, $W_k = V$ and $W_{k-1} \ne V$ and so $m_{k-1} < m_k = n$. By Problem 10.14,
$W_1 \subseteq W_2 \subseteq \cdots \subseteq W_k = V$.
Thus, by induction, we can choose a basis $\{u_1, \ldots, u_n\}$ of $V$ such that $\{u_1, \ldots, u_{m_i}\}$ is a basis of $W_i$.
We now choose a new basis for $V$ with respect to which $T$ has the desired form. It will be convenient to
label the members of this new basis by pairs of indices. We begin by setting
$$v(1, k) = u_{m_{k-1}+1},\quad v(2, k) = u_{m_{k-1}+2},\quad \ldots,\quad v(m_k - m_{k-1}, k) = u_{m_k}$$
and setting
$$v(1, k-1) = Tv(1, k),\quad v(2, k-1) = Tv(2, k),\quad \ldots,\quad v(m_k - m_{k-1}, k-1) = Tv(m_k - m_{k-1}, k)$$
By the preceding problem,
$$S_1 = \{u_1, \ldots, u_{m_{k-2}},\ v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1)\}$$
is a linearly independent subset of $W_{k-1}$. We extend $S_1$ to a basis of $W_{k-1}$ by adjoining new elements (if
necessary), which we denote by
$$v(m_k - m_{k-1} + 1, k-1),\quad v(m_k - m_{k-1} + 2, k-1),\quad \ldots,\quad v(m_{k-1} - m_{k-2}, k-1)$$
Next we set
$$v(1, k-2) = Tv(1, k-1),\quad v(2, k-2) = Tv(2, k-1),\quad \ldots,\quad v(m_{k-1} - m_{k-2}, k-2) = Tv(m_{k-1} - m_{k-2}, k-1)$$
Again by the preceding problem,
$$S_2 = \{u_1, \ldots, u_{m_{k-3}},\ v(1, k-2), \ldots, v(m_{k-1} - m_{k-2}, k-2)\}$$
is a linearly independent subset of $W_{k-2}$, which we can extend to a basis of $W_{k-2}$ by adjoining elements
$$v(m_{k-1} - m_{k-2} + 1, k-2),\quad v(m_{k-1} - m_{k-2} + 2, k-2),\quad \ldots,\quad v(m_{k-2} - m_{k-3}, k-2)$$
Continuing in this manner, we get a new basis for $V$, which for convenient reference we arrange as follows:
$$\begin{array}{l} v(1, k), \ldots, v(m_k - m_{k-1}, k) \\ v(1, k-1), \ldots, v(m_k - m_{k-1}, k-1), \ldots, v(m_{k-1} - m_{k-2}, k-1) \\ \quad\vdots \\ v(1, 2), \ldots, v(m_k - m_{k-1}, 2), \ldots, v(m_{k-1} - m_{k-2}, 2), \ldots, v(m_2 - m_1, 2) \\ v(1, 1), \ldots, v(m_k - m_{k-1}, 1), \ldots, v(m_{k-1} - m_{k-2}, 1), \ldots, v(m_2 - m_1, 1), \ldots, v(m_1, 1) \end{array}$$
The bottom row forms a basis of $W_1$, the bottom two rows form a basis of $W_2$, etc. But what is important for us
is that $T$ maps each vector into the vector immediately below it in the table or into 0 if the vector is in the
bottom row. That is,
$$Tv(i, j) = \begin{cases} v(i, j-1) & \text{for } j > 1 \\ 0 & \text{for } j = 1 \end{cases}$$
Now it is clear [see Problem 10.13(d)] that $T$ will have the desired form if the $v(i, j)$ are ordered
lexicographically: beginning with $v(1, 1)$ and moving up the first column to $v(1, k)$, then jumping to $v(2, 1)$
and moving up the second column as far as possible, etc.
Moreover, there will be exactly $m_k - m_{k-1}$ diagonal entries of order $k$. Also, there will be:
$$\begin{array}{ll} (m_{k-1} - m_{k-2}) - (m_k - m_{k-1}) = 2m_{k-1} - m_k - m_{k-2} & \text{diagonal entries of order } k - 1 \\ \quad\vdots & \\ 2m_2 - m_1 - m_3 & \text{diagonal entries of order } 2 \\ 2m_1 - m_2 & \text{diagonal entries of order } 1 \end{array}$$
as can be read off directly from the table. In particular, since the numbers $m_1, \ldots, m_k$ are uniquely determined
by $T$, the number of diagonal entries of each order is uniquely determined by $T$. Finally, the identity
$$m_1 = (m_k - m_{k-1}) + (2m_{k-1} - m_k - m_{k-2}) + \cdots + (2m_2 - m_1 - m_3) + (2m_1 - m_2)$$
shows that the nullity $m_1$ of $T$ is the total number of diagonal entries of $T$.
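The counting formulas at the end of the proof translate directly into a short computation. In the sketch below, the function name `block_counts` and the sample kernel dimensions are hypothetical; the conventions $m_0 = 0$ and $m_{k+1} = m_k$ are introduced here so that one formula, $2m_j - m_{j-1} - m_{j+1}$, covers every order $j$.

```python
def block_counts(m):
    """Number of Jordan nilpotent blocks of each order.

    m = [m_1, ..., m_k], where m_i = dim Ker T^i and T is nilpotent of index k.
    Returns {order: number of blocks of that order}.
    """
    k = len(m)
    ext = [0] + list(m) + [m[-1]]   # m_0 = 0 and m_{k+1} = m_k
    return {j: 2 * ext[j] - ext[j - 1] - ext[j + 1] for j in range(1, k + 1)}

# Kernel dimensions 3, 4, 5 on a 5-dimensional space
counts = block_counts([3, 4, 5])
print(counts)                 # {1: 2, 2: 0, 3: 1}
print(sum(counts.values()))   # 3, the nullity m_1
```

For $j = k$ the formula reduces to $m_k - m_{k-1}$, and for $j = 1$ to $2m_1 - m_2$, in agreement with the proof.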
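10.17. Let
$$A = \begin{bmatrix} 0 & 1 & 1 & 0 & 1 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$
The reader can verify that $A$ and $B$ are
both nilpotent of index 3; that is, $A^3 = 0$ but $A^2 \ne 0$, and $B^3 = 0$ but $B^2 \ne 0$. Find the nilpotent
matrices $M_A$ and $M_B$ in canonical form that are similar to $A$ and $B$, respectively.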
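Since $A$ and $B$ are nilpotent of index 3, $M_A$ and $M_B$ must each contain a Jordan nilpotent block of order 3,
and none greater than 3. Note that $\operatorname{rank}(A) = 2$ and $\operatorname{rank}(B) = 3$, so $\operatorname{nullity}(A) = 5 - 2 = 3$ and
$\operatorname{nullity}(B) = 5 - 3 = 2$. Thus $M_A$ must contain 3 diagonal blocks, which must be one of order 3 and two
of order 1; and $M_B$ must contain 2 diagonal blocks, which must be one of order 3 and one of order 2. Namely,
$$M_A = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} \quad \text{and} \quad M_B = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix}$$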
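The rank and nullity computations used above can be checked by exact Gaussian elimination. The sketch below does this for the matrix $A$ only, with its entries as read from the problem statement (an assumption of this example); the helper names `matmul` and `rank` are likewise introduced here.

```python
from fractions import Fraction

def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def rank(M):
    """Rank by Gauss-Jordan elimination over the rationals."""
    rows = [[Fraction(x) for x in r] for r in M]
    rk = 0
    for c in range(len(rows[0])):
        piv = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if piv is None:
            continue                      # no pivot in this column
        rows[rk], rows[piv] = rows[piv], rows[rk]
        for i in range(len(rows)):
            if i != rk and rows[i][c] != 0:
                f = rows[i][c] / rows[rk][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk

A = [[0, 1, 1, 0, 1],
     [0, 0, 1, 1, 1],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0],
     [0, 0, 0, 0, 0]]
A2 = matmul(A, A)
A3 = matmul(A2, A)
nullities = [5 - rank(P) for P in (A, A2, A3)]
print(nullities)   # [3, 4, 5]
```

The nullities 3, 4, 5 of $A$, $A^2$, $A^3$ are the numbers $m_1, m_2, m_3$ of Problem 10.16; they give one block of order 3 and two of order 1 for $M_A$.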
10.18. Prove Theorem 10.11 on the Jordan canonical form for an operator $T$.
By the primary decomposition theorem, $T$ is decomposable into operators $T_1, \ldots, T_r$; that is,
$T = T_1 \oplus \cdots \oplus T_r$, where $(t - \lambda_i)^{m_i}$ is the minimal polynomial of $T_i$. Thus, in particular,
$$(T_1 - \lambda_1 I)^{m_1} = 0, \quad \ldots, \quad (T_r - \lambda_r I)^{m_r} = 0$$
Set $N_i = T_i - \lambda_i I$. Then, for $i = 1, \ldots, r$,
$$T_i = N_i + \lambda_i I, \quad \text{where } N_i^{m_i} = 0$$
That is, $T_i$ is the sum of the scalar operator $\lambda_i I$ and a nilpotent operator $N_i$, which is of index $m_i$ since $(t - \lambda_i)^{m_i}$
is the minimal polynomial of $T_i$.
Now, by Theorem 10.10 on nilpotent operators, we can choose a basis so that $N_i$ is in canonical form. In
this basis, $T_i = N_i + \lambda_i I$ is represented by a block diagonal matrix $M_i$ whose diagonal entries are the matrices
$J_{ij}$. The direct sum $J$ of the matrices $M_i$ is in Jordan canonical form and, by Theorem 10.5, is a matrix
representation of $T$.
Lastly, we must show that the blocks $J_{ij}$ satisfy the required properties. Property (i) follows from the fact
that $N_i$ is of index $m_i$. Property (ii) is true since $T$ and $J$ have the same characteristic polynomial. Property (iii)
is true since the nullity of $N_i = T_i - \lambda_i I$ is equal to the geometric multiplicity of the eigenvalue $\lambda_i$. Property
(iv) follows from the fact that the $T_i$, and hence the $N_i$, are uniquely determined by $T$.
10.19. Determine all possible Jordan canonical forms $J$ for a linear operator $T: V \to V$ whose characteristic
polynomial is $\Delta(t) = (t - 2)^5$ and whose minimal polynomial is $m(t) = (t - 2)^2$.
$J$ must be a $5 \times 5$ matrix, since $\Delta(t)$ has degree 5, and all diagonal elements must be 2, since 2 is the only
eigenvalue. Moreover, since the exponent of $t - 2$ in $m(t)$ is 2, $J$ must have one Jordan block of order 2, and
the others must be of order 2 or 1. Thus there are only two possibilities:
$$J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2] \right) \quad \text{or} \quad J = \operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [2],\ [2] \right)$$
10.20. Determine all possible Jordan canonical forms for a linear operator $T: V \to V$ whose characteristic
polynomial is $\Delta(t) = (t - 2)^3(t - 5)^2$. In each case, find the minimal polynomial $m(t)$.
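Since $t - 2$ has exponent 3 in $\Delta(t)$, 2 must appear three times on the diagonal. Similarly, 5 must appear
twice. Thus there are six possibilities:
(a) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix},\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right)$, (b) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 & 0 \\ 0 & 2 & 1 \\ 0 & 0 & 2 \end{bmatrix},\ [5],\ [5] \right)$,
(c) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right)$, (d) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ 0 & 2 \end{bmatrix},\ [2],\ [5],\ [5] \right)$,
(e) $\operatorname{diag}\left( [2],\ [2],\ [2],\ \begin{bmatrix} 5 & 1 \\ 0 & 5 \end{bmatrix} \right)$, (f) $\operatorname{diag}([2], [2], [2], [5], [5])$
The exponent of each factor in the minimal polynomial $m(t)$ is equal to the size of the largest block for the corresponding eigenvalue. Thus:
(a) $m(t) = (t-2)^3(t-5)^2$, (b) $m(t) = (t-2)^3(t-5)$, (c) $m(t) = (t-2)^2(t-5)^2$,
(d) $m(t) = (t-2)^2(t-5)$, (e) $m(t) = (t-2)(t-5)^2$, (f) $m(t) = (t-2)(t-5)$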
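The enumeration in Problems 10.19 and 10.20 amounts to listing partitions of each algebraic multiplicity. The following sketch shows one way to generate them; the function names and the dictionary encoding of a Jordan form (eigenvalue mapped to a tuple of block sizes) are assumptions made for illustration.

```python
from itertools import product

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing tuples of block sizes."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest

def jordan_forms(multiplicities):
    """List (block sizes, minimal-polynomial exponents) for every Jordan form.

    multiplicities maps each eigenvalue to its algebraic multiplicity.
    """
    eigs = sorted(multiplicities)
    choices = [list(partitions(multiplicities[e])) for e in eigs]
    forms = []
    for combo in product(*choices):
        blocks = dict(zip(eigs, combo))
        exponents = {e: sizes[0] for e, sizes in blocks.items()}  # largest block
        forms.append((blocks, exponents))
    return forms

forms = jordan_forms({2: 3, 5: 2})
print(len(forms))   # 6
print(forms[0])     # ({2: (3,), 5: (2,)}, {2: 3, 5: 2})
```

For Problem 10.19, filtering `partitions(5)` to those whose largest part is 2 leaves exactly `(2, 2, 1)` and `(2, 1, 1, 1)`.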
QUOTIENT SPACE AND TRIANGULAR FORM
10.21. Let $W$ be a subspace of a vector space $V$. Show that the following are equivalent:
(i) $u \in v + W$, (ii) $u - v \in W$, (iii) $v \in u + W$.
Suppose $u \in v + W$. Then there exists $w_0 \in W$ such that $u = v + w_0$. Hence $u - v = w_0 \in W$. Conversely,
suppose $u - v \in W$. Then $u - v = w_0$ where $w_0 \in W$. Hence $u = v + w_0 \in v + W$. Thus (i) and (ii) are
equivalent.
We also have $u - v \in W$ iff $-(u - v) = v - u \in W$ iff $v \in u + W$. Thus (ii) and (iii) are also equivalent.
10.22. Prove the following: The cosets of $W$ in $V$ partition $V$ into mutually disjoint sets. That is:
(a) Any two cosets $u + W$ and $v + W$ are either identical or disjoint.
(b) Each $v \in V$ belongs to a coset; in fact, $v \in v + W$.
Furthermore, $u + W = v + W$ if and only if $u - v \in W$, and so $(v + w) + W = v + W$ for any
$w \in W$.
Let $v \in V$. Since $0 \in W$, we have $v = v + 0 \in v + W$, which proves (b).
Now suppose the cosets $u + W$ and $v + W$ are not disjoint; say, the vector $x$ belongs to both $u + W$ and
$v + W$. Then $u - x \in W$ and $x - v \in W$. The proof of (a) is complete if we show that $u + W = v + W$. Let
$u + w_0$ be any element in the coset $u + W$. Since $u - x$, $x - v$, $w_0$ belong to $W$,
$$(u + w_0) - v = (u - x) + (x - v) + w_0 \in W$$
Thus $u + w_0 \in v + W$, and hence the coset $u + W$ is contained in the coset $v + W$. Similarly, $v + W$ is
contained in $u + W$, and so $u + W = v + W$.
The last statement follows from the fact that $u + W = v + W$ if and only if $u \in v + W$, and, by Problem
10.21, this is equivalent to $u - v \in W$.
10.23. Let $W$ be the solution space of the homogeneous equation $2x + 3y + 4z = 0$. Describe the cosets of
$W$ in $\mathbf{R}^3$.
$W$ is a plane through the origin $O = (0, 0, 0)$, and the cosets of $W$ are the planes parallel to $W$.
Equivalently, the cosets of $W$ are the solution sets of the family of equations
$$2x + 3y + 4z = k, \quad k \in \mathbf{R}$$
In fact, the coset $v + W$, where $v = (a, b, c)$, is the solution set of the linear equation
$$2x + 3y + 4z = 2a + 3b + 4c \quad \text{or} \quad 2(x - a) + 3(y - b) + 4(z - c) = 0$$
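The criterion of Problems 10.21 and 10.22 ($u + W = v + W$ if and only if $u - v \in W$) is easy to test for the plane of Problem 10.23. In this sketch the function names are hypothetical, and `coset_label` returns the constant $k$ of the parallel plane $2x + 3y + 4z = k$ that contains the given vector.

```python
def in_W(p):
    """Membership test for W, the solution space of 2x + 3y + 4z = 0."""
    x, y, z = p
    return 2 * x + 3 * y + 4 * z == 0

def same_coset(u, v):
    """u + W = v + W exactly when u - v lies in W."""
    return in_W(tuple(a - b for a, b in zip(u, v)))

def coset_label(v):
    """The constant k such that v + W is the plane 2x + 3y + 4z = k."""
    return 2 * v[0] + 3 * v[1] + 4 * v[2]

print(same_coset((1, 0, 0), (0, 2, -1)))   # True: both lie on 2x + 3y + 4z = 2
print(same_coset((1, 0, 0), (0, 0, 0)))    # False
```

Two vectors lie in the same coset precisely when their labels agree, which is another way of saying that the cosets are the planes parallel to $W$.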
10.24. Suppose $W$ is a subspace of a vector space $V$. Show that the operations in Theorem 10.15 are well
defined; namely, show that if $u + W = u' + W$ and $v + W = v' + W$, then:
(a) $(u + v) + W = (u' + v') + W$ and (b) $ku + W = ku' + W$ for any $k \in K$
(a) Since $u + W = u' + W$ and $v + W = v' + W$, both $u - u'$ and $v - v'$ belong to $W$. But then
$(u + v) - (u' + v') = (u - u') + (v - v') \in W$. Hence $(u + v) + W = (u' + v') + W$.
(b) Also, since $u - u' \in W$ implies $k(u - u') \in W$, then $ku - ku' = k(u - u') \in W$; accordingly,
$ku + W = ku' + W$.
10.25. Let $V$ be a vector space and $W$ a subspace of $V$. Show that the natural map $\eta: V \to V/W$, defined by
$\eta(v) = v + W$, is linear.
For any $u, v \in V$ and any $k \in K$, we have
$$\eta(u + v) = u + v + W = u + W + v + W = \eta(u) + \eta(v)$$
and
$$\eta(kv) = kv + W = k(v + W) = k\eta(v)$$
Accordingly, $\eta$ is linear.
10.26. Let $W$ be a subspace of a vector space $V$. Suppose $\{w_1, \ldots, w_r\}$ is a basis of $W$ and the set of cosets
$\{\bar{v}_1, \ldots, \bar{v}_s\}$, where $\bar{v}_j = v_j + W$, is a basis of the quotient space. Show that the set of vectors
$B = \{v_1, \ldots, v_s, w_1, \ldots, w_r\}$ is a basis of $V$. Thus $\dim V = \dim W + \dim(V/W)$.
Suppose $u \in V$. Since $\{\bar{v}_j\}$ is a basis of $V/W$,
$$\bar{u} = u + W = a_1\bar{v}_1 + a_2\bar{v}_2 + \cdots + a_s\bar{v}_s$$
Hence $u = a_1v_1 + \cdots + a_sv_s + w$, where $w \in W$. Since $\{w_i\}$ is a basis of $W$,
$$u = a_1v_1 + \cdots + a_sv_s + b_1w_1 + \cdots + b_rw_r$$
Accordingly, $B$ spans $V$.
We now show that $B$ is linearly independent. Suppose
$$c_1v_1 + \cdots + c_sv_s + d_1w_1 + \cdots + d_rw_r = 0 \qquad (1)$$
Then $c_1\bar{v}_1 + \cdots + c_s\bar{v}_s = \bar{0} = W$.
Since $\{\bar{v}_j\}$ is independent, the $c$'s are all 0. Substituting into (1), we find $d_1w_1 + \cdots + d_rw_r = 0$. Since $\{w_i\}$ is
independent, the $d$'s are all 0. Thus $B$ is linearly independent and therefore a basis of $V$.
10.27. Prove Theorem 10.16: Suppose $W$ is a subspace invariant under a linear operator $T: V \to V$. Then $T$
induces a linear operator $\bar{T}$ on $V/W$ defined by $\bar{T}(v + W) = T(v) + W$. Moreover, if $T$ is a zero of
any polynomial, then so is $\bar{T}$. Thus the minimal polynomial of $\bar{T}$ divides the minimal polynomial
of $T$.
We first show that $\bar{T}$ is well defined; i.e., if $u + W = v + W$, then $\bar{T}(u + W) = \bar{T}(v + W)$. If
$u + W = v + W$, then $u - v \in W$, and, since $W$ is T-invariant, $T(u - v) = T(u) - T(v) \in W$. Accordingly,
$$\bar{T}(u + W) = T(u) + W = T(v) + W = \bar{T}(v + W)$$
as required.
We next show that $\bar{T}$ is linear. We have
$$\begin{aligned} \bar{T}((u + W) + (v + W)) &= \bar{T}(u + v + W) = T(u + v) + W = T(u) + T(v) + W \\ &= T(u) + W + T(v) + W = \bar{T}(u + W) + \bar{T}(v + W) \end{aligned}$$
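Furthermore,
$$\bar{T}(k(u + W)) = \bar{T}(ku + W) = T(ku) + W = kT(u) + W = k(T(u) + W) = k\bar{T}(u + W)$$
Thus $\bar{T}$ is linear.
Now, for any coset $u + W$ in $V/W$,
$$\overline{T^2}(u + W) = T^2(u) + W = T(T(u)) + W = \bar{T}(T(u) + W) = \bar{T}(\bar{T}(u + W)) = \bar{T}^2(u + W)$$
Hence $\overline{T^2} = \bar{T}^2$. Similarly, $\overline{T^n} = \bar{T}^n$ for any $n$. Thus, for any polynomial
$$f(t) = a_nt^n + \cdots + a_0 = \sum a_it^i$$
$$\begin{aligned} \overline{f(T)}(u + W) &= f(T)(u) + W = \sum a_iT^i(u) + W = \sum a_i(T^i(u) + W) \\ &= \sum a_i\overline{T^i}(u + W) = \sum a_i\bar{T}^i(u + W) = \Big(\sum a_i\bar{T}^i\Big)(u + W) = f(\bar{T})(u + W) \end{aligned}$$
and so $\overline{f(T)} = f(\bar{T})$. Accordingly, if $T$ is a root of $f(t)$, then $\overline{f(T)} = \bar{0} = W = f(\bar{T})$; i.e., $\bar{T}$ is also a root of $f(t)$.
Thus the theorem is proved.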
10.28. Prove Theorem 10.1: Let $T: V \to V$ be a linear operator whose characteristic polynomial factors
into linear polynomials. Then $V$ has a basis in which $T$ is represented by a triangular matrix.
The proof is by induction on the dimension of $V$. If $\dim V = 1$, then every matrix representation of $T$ is a
$1 \times 1$ matrix, which is triangular.
Now suppose $\dim V = n > 1$ and that the theorem holds for spaces of dimension less than $n$. Since the
characteristic polynomial of $T$ factors into linear polynomials, $T$ has at least one eigenvalue and so at least one
nonzero eigenvector $v$, say $T(v) = a_{11}v$. Let $W$ be the 1-dimensional subspace spanned by $v$. Set $\bar{V} = V/W$.
Then (Problem 10.26) $\dim \bar{V} = \dim V - \dim W = n - 1$. Note also that $W$ is invariant under $T$. By Theorem
10.16, $T$ induces a linear operator $\bar{T}$ on $\bar{V}$ whose minimal polynomial divides the minimal polynomial of $T$.
Since the characteristic polynomial of $T$ is a product of linear polynomials, so is its minimal polynomial;
hence so are the minimal and characteristic polynomials of $\bar{T}$. Thus $\bar{V}$ and $\bar{T}$ satisfy the hypothesis of the
theorem. Hence, by induction, there exists a basis $\{\bar{v}_2, \ldots, \bar{v}_n\}$ of $\bar{V}$ such that
$$\begin{aligned} \bar{T}(\bar{v}_2) &= a_{22}\bar{v}_2 \\ &\ \ \vdots \\ \bar{T}(\bar{v}_n) &= a_{n2}\bar{v}_2 + a_{n3}\bar{v}_3 + \cdots + a_{nn}\bar{v}_n \end{aligned}$$
Now let $v_2, \ldots, v_n$ be elements of $V$ that belong to the cosets $\bar{v}_2, \ldots, \bar{v}_n$, respectively. Then $\{v, v_2, \ldots, v_n\}$ is a
basis of $V$ (Problem 10.26). Since $\bar{T}(\bar{v}_2) = a_{22}\bar{v}_2$, we have
$$\bar{T}(\bar{v}_2) - a_{22}\bar{v}_2 = \bar{0}, \quad \text{and so} \quad T(v_2) - a_{22}v_2 \in W$$
But $W$ is spanned by $v$; hence $T(v_2) - a_{22}v_2$ is a multiple of $v$, say,
$$T(v_2) - a_{22}v_2 = a_{21}v, \quad \text{and so} \quad T(v_2) = a_{21}v + a_{22}v_2$$
Similarly, for $i = 3, \ldots, n$,
$$T(v_i) - a_{i2}v_2 - a_{i3}v_3 - \cdots - a_{ii}v_i \in W, \quad \text{and so} \quad T(v_i) = a_{i1}v + a_{i2}v_2 + \cdots + a_{ii}v_i$$
Thus
$$\begin{aligned} T(v) &= a_{11}v \\ T(v_2) &= a_{21}v + a_{22}v_2 \\ &\ \ \vdots \\ T(v_n) &= a_{n1}v + a_{n2}v_2 + \cdots + a_{nn}v_n \end{aligned}$$
and hence the matrix of $T$ in this basis is triangular.
CYCLIC SUBSPACES, RATIONAL CANONICAL FORM
10.29. Prove Theorem 10.12: Let Z(v, T) be a T-cyclic subspace, T^ the restriction of T to Z{v, T), and
m^{i) = t^ -\- aj^_xt^~^ + ... + ^0 the T-annihilator of v. Then:
(i) The set {i;, T{v), ..., T^~^{v)} is a basis of Z(v, T); hence dim Z(v, T) = k.
(ii) The minimal polynomial of T^ is m^{i).
(iii) The matrix of T^ in the above basis is the companion matrix C = C{m^) of m^(t) [which has
I's below the diagonal, the negative of the coefficients Gq, a^,..., aj^_i of m^(t) in the last
column, and O's elsewhere].
(i) By definition of $m_v(t)$, $T^k(v)$ is the first vector in the sequence $v, T(v), T^2(v), \ldots$ that is a linear combination of those vectors which precede it in the sequence; hence the set $B = \{v, T(v), \ldots, T^{k-1}(v)\}$ is linearly independent. We now only have to show that $Z(v, T) = L(B)$, the linear span of $B$. By the above, $T^k(v) \in L(B)$. We prove by induction that $T^n(v) \in L(B)$ for every $n$. Suppose $n > k$ and $T^{n-1}(v) \in L(B)$, i.e., $T^{n-1}(v)$ is a linear combination of $v, \ldots, T^{k-1}(v)$. Then $T^n(v) = T(T^{n-1}(v))$ is a linear combination of $T(v), \ldots, T^k(v)$. But $T^k(v) \in L(B)$; hence $T^n(v) \in L(B)$ for every $n$. Consequently, $f(T)(v) \in L(B)$ for any polynomial $f(t)$. Thus $Z(v, T) = L(B)$, and so $B$ is a basis, as claimed.
(ii) Suppose $m(t) = t^s + b_{s-1}t^{s-1} + \cdots + b_0$ is the minimal polynomial of $T_v$. Then, since $v \in Z(v, T)$,
\[ 0 = m(T_v)(v) = m(T)(v) = T^s(v) + b_{s-1}T^{s-1}(v) + \cdots + b_0 v \]
Thus $T^s(v)$ is a linear combination of $v, T(v), \ldots, T^{s-1}(v)$, and therefore $k \le s$. However, $m_v(T)(v) = 0$; since $m_v(T)$ commutes with $T$, it vanishes on every $T^j(v)$ and hence on all of $Z(v, T)$, and so $m_v(T_v) = 0$. Then $m(t)$ divides $m_v(t)$, and so $s \le k$. Accordingly, $k = s$ and hence $m_v(t) = m(t)$.
(iii)
\[
\begin{aligned}
T_v(v) &= T(v) \\
T_v(T(v)) &= T^2(v) \\
&\;\;\vdots \\
T_v(T^{k-1}(v)) &= T^k(v) = -a_0 v - a_1 T(v) - a_2 T^2(v) - \cdots - a_{k-1}T^{k-1}(v)
\end{aligned}
\]
By definition, the matrix of $T_v$ in this basis is the transpose of the matrix of coefficients of the above system of equations; hence it is $C$, as required.
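The shape of the companion matrix described in (iii) can be checked numerically. The following is a minimal sketch, not taken from the text: the helper names `companion`, `matmul`, and `evaluate` are hypothetical, and the sample polynomial $(t-3)^3 = t^3 - 9t^2 + 27t - 27$ is chosen only for illustration. It builds $C(m)$ and confirms that $m(C) = 0$.

```python
def companion(coeffs):
    """Companion matrix of t^k + a_{k-1} t^{k-1} + ... + a_0.

    `coeffs` lists a_0, ..., a_{k-1}; the monic leading 1 is implied.
    """
    k = len(coeffs)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1            # 1's just below the diagonal
    for i in range(k):
        C[i][k - 1] = -coeffs[i]   # negated coefficients in the last column
    return C


def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def evaluate(coeffs, C):
    """Evaluate the monic polynomial at the square matrix C by Horner's rule."""
    k = len(C)
    identity = [[int(i == j) for j in range(k)] for i in range(k)]
    result = identity              # accounts for the leading coefficient 1
    for a in reversed(coeffs):
        result = matmul(result, C)
        result = [[result[i][j] + a * identity[i][j] for j in range(k)]
                  for i in range(k)]
    return result


# m(t) = (t - 3)^3 = t^3 - 9t^2 + 27t - 27, so a_0 = -27, a_1 = 27, a_2 = -9.
coeffs = [-27, 27, -9]
C = companion(coeffs)
print(C)                    # [[0, 0, 27], [1, 0, -27], [0, 1, 9]]
print(evaluate(coeffs, C))  # the 3 x 3 zero matrix
```

In the basis $\{v, T(v), T^2(v)\}$ the first two columns of $C$ record $T_v(v) = T(v)$ and $T_v(T(v)) = T^2(v)$, and the last column records the expansion of $T^3(v)$.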
10.30. Let $T: V \to V$ be linear. Let $W$ be a $T$-invariant subspace of $V$ and $\bar T$ the induced operator on $V/W$. Prove:
(a) The $T$-annihilator of $v \in V$ divides the minimal polynomial of $T$.
(b) The $\bar T$-annihilator of $\bar v \in V/W$ divides the minimal polynomial of $T$.
(a) The $T$-annihilator of $v \in V$ is the minimal polynomial of the restriction of $T$ to $Z(v, T)$, and therefore, by Problem 10.6, it divides the minimal polynomial of $T$.
(b) The $\bar T$-annihilator of $\bar v \in V/W$ divides the minimal polynomial of $\bar T$, which divides the minimal polynomial of $T$ by Theorem 10.16.
Remark: In the case where the minimal polynomial of $T$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial, the $T$-annihilator of $v \in V$ and the $\bar T$-annihilator of $\bar v \in V/W$ are of the form $f(t)^m$, where $m \le n$.
10.31. Prove Lemma 10.13: Let $T: V \to V$ be a linear operator whose minimal polynomial is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial. Then $V$ is the direct sum of $T$-cyclic subspaces $Z_i = Z(v_i, T)$, $i = 1, \ldots, r$, with corresponding $T$-annihilators
\[ f(t)^{n_1},\; f(t)^{n_2},\; \ldots,\; f(t)^{n_r}, \qquad n = n_1 \ge n_2 \ge \cdots \ge n_r \]
Any other decomposition of $V$ into the direct sum of $T$-cyclic subspaces has the same number of components and the same set of $T$-annihilators.
The proof is by induction on the dimension of $V$. If $\dim V = 1$, then $V$ is itself $T$-cyclic and the lemma holds. Now suppose $\dim V > 1$ and that the lemma holds for those vector spaces of dimension less than that of $V$.
Since the minimal polynomial of $T$ is $f(t)^n$, there exists $v_1 \in V$ such that $f(T)^{n-1}(v_1) \ne 0$; hence the $T$-annihilator of $v_1$ is $f(t)^n$. Let $Z_1 = Z(v_1, T)$ and recall that $Z_1$ is $T$-invariant. Let $\bar V = V/Z_1$ and let $\bar T$ be the linear operator on $\bar V$ induced by $T$. By Theorem 10.16, the minimal polynomial of $\bar T$ divides $f(t)^n$; hence the hypothesis holds for $\bar V$ and $\bar T$. Consequently, by induction, $\bar V$ is the direct sum of $\bar T$-cyclic subspaces; say,
\[ \bar V = Z(\bar v_2, \bar T) \oplus \cdots \oplus Z(\bar v_r, \bar T) \]
where the corresponding $\bar T$-annihilators are $f(t)^{n_2}, \ldots, f(t)^{n_r}$, $n \ge n_2 \ge \cdots \ge n_r$.
We claim that there is a vector $v_2$ in the coset $\bar v_2$ whose $T$-annihilator is $f(t)^{n_2}$, the $\bar T$-annihilator of $\bar v_2$. Let $w$ be any vector in $\bar v_2$. Then $f(T)^{n_2}(w) \in Z_1$. Hence there exists a polynomial $g(t)$ for which
\[ f(T)^{n_2}(w) = g(T)(v_1) \tag{1} \]
Since $f(t)^n$ is the minimal polynomial of $T$, we have, by (1),
\[ 0 = f(T)^n(w) = f(T)^{n-n_2}g(T)(v_1) \]
But $f(t)^n$ is the $T$-annihilator of $v_1$; hence $f(t)^n$ divides $f(t)^{n-n_2}g(t)$, and so $g(t) = f(t)^{n_2}h(t)$ for some polynomial $h(t)$. We set
\[ v_2 = w - h(T)(v_1) \]
Since $w - v_2 = h(T)(v_1) \in Z_1$, $v_2$ also belongs to the coset $\bar v_2$. Thus the $T$-annihilator of $v_2$ is a multiple of the $\bar T$-annihilator of $\bar v_2$. On the other hand, by (1),
\[ f(T)^{n_2}(v_2) = f(T)^{n_2}(w - h(T)(v_1)) = f(T)^{n_2}(w) - g(T)(v_1) = 0 \]
Consequently the $T$-annihilator of $v_2$ is $f(t)^{n_2}$, as claimed.
Similarly, there exist vectors $v_3, \ldots, v_r \in V$ such that $v_i \in \bar v_i$ and that the $T$-annihilator of $v_i$ is $f(t)^{n_i}$, the $\bar T$-annihilator of $\bar v_i$. We set
\[ Z_2 = Z(v_2, T), \quad \ldots, \quad Z_r = Z(v_r, T) \]
Let $d$ denote the degree of $f(t)$, so that $f(t)^{n_i}$ has degree $dn_i$. Then, since $f(t)^{n_i}$ is both the $T$-annihilator of $v_i$ and the $\bar T$-annihilator of $\bar v_i$, we know that
\[ \{v_i, T(v_i), \ldots, T^{dn_i-1}(v_i)\} \quad\text{and}\quad \{\bar v_i, \bar T(\bar v_i), \ldots, \bar T^{dn_i-1}(\bar v_i)\} \]
are bases for $Z(v_i, T)$ and $Z(\bar v_i, \bar T)$, respectively, for $i = 2, \ldots, r$. But $\bar V = Z(\bar v_2, \bar T) \oplus \cdots \oplus Z(\bar v_r, \bar T)$; hence
\[ \{\bar v_2, \ldots, \bar T^{dn_2-1}(\bar v_2), \ldots, \bar v_r, \ldots, \bar T^{dn_r-1}(\bar v_r)\} \]
is a basis for $\bar V$. Therefore, by Problem 10.26 and the relation $\bar T^i(\bar v) = \overline{T^i(v)}$ (see Problem 10.27),
\[ \{v_1, \ldots, T^{dn_1-1}(v_1), v_2, \ldots, T^{dn_2-1}(v_2), \ldots, v_r, \ldots, T^{dn_r-1}(v_r)\} \]
is a basis for $V$. Thus, by Theorem 10.4, $V = Z(v_1, T) \oplus \cdots \oplus Z(v_r, T)$, as required.
It remains to show that the exponents $n_1, \ldots, n_r$ are uniquely determined by $T$. Since $d$ is the degree of $f(t)$,
\[ \dim V = d(n_1 + \cdots + n_r) \quad\text{and}\quad \dim Z_i = dn_i, \quad i = 1, \ldots, r \]
Also, if $s$ is any positive integer, then (Problem 10.59) $f(T)^s(Z_i)$ is a cyclic subspace generated by $f(T)^s(v_i)$, and it has dimension $d(n_i - s)$ if $n_i > s$ and dimension 0 if $n_i \le s$.
Now any vector $v \in V$ can be written uniquely in the form $v = w_1 + \cdots + w_r$, where $w_i \in Z_i$. Hence any vector in $f(T)^s(V)$ can be written uniquely in the form
\[ f(T)^s(v) = f(T)^s(w_1) + \cdots + f(T)^s(w_r) \]
where $f(T)^s(w_i) \in f(T)^s(Z_i)$. Let $t$ be the integer, dependent on $s$, for which
\[ n_1 > s, \quad \ldots, \quad n_t > s, \quad n_{t+1} \le s \]
Then
\[ f(T)^s(V) = f(T)^s(Z_1) \oplus \cdots \oplus f(T)^s(Z_t) \]
and so
\[ \dim[f(T)^s(V)] = d[(n_1 - s) + \cdots + (n_t - s)] \tag{2} \]
The numbers on the left of (2) are uniquely determined by $T$. Set $s = n - 1$, and (2) determines the number of $n_i$ equal to $n$. Next set $s = n - 2$, and (2) determines the number of $n_i$ (if any) equal to $n - 1$. We repeat the process until we set $s = 0$ and determine the number of $n_i$ equal to 1. Thus the $n_i$ are uniquely determined by $T$ and $V$, and the lemma is proved.
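10.32. Let $V$ be a 7-dimensional vector space over $\mathbf{R}$, and let $T: V \to V$ be a linear operator with minimal polynomial $m(t) = (t^2 - 2t + 5)(t - 3)^3$. Find all possible rational canonical forms $M$ of $T$.
The orders of the companion matrices must add up to 7. Also, one companion matrix must be $C(t^2 - 2t + 5)$ and one must be $C((t - 3)^3) = C(t^3 - 9t^2 + 27t - 27)$. Thus $M$ must be one of the following block diagonal matrices:
\[ \text{(a)}\quad \operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix} \right) \]
\[ \text{(b)}\quad \operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix},\ \begin{bmatrix} 0 & -9 \\ 1 & 6 \end{bmatrix} \right) \]
\[ \text{(c)}\quad \operatorname{diag}\left( \begin{bmatrix} 0 & -5 \\ 1 & 2 \end{bmatrix},\ \begin{bmatrix} 0 & 0 & 27 \\ 1 & 0 & -27 \\ 0 & 1 & 9 \end{bmatrix},\ [3],\ [3] \right) \]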
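The counting argument in Problem 10.32 can be sketched as a small brute-force enumeration. This is an illustration under stated assumptions, not the book's method: the function names are hypothetical, and each irreducible factor of $m(t)$ is described by a pair (degree, largest exponent), here $(2, 1)$ for $t^2 - 2t + 5$ and $(1, 3)$ for $t - 3$. For each factor the code lists the non-increasing exponent sequences whose largest entry is the exponent appearing in $m(t)$, and keeps the combinations whose block orders add up to the dimension.

```python
def partitions_with_max(total, largest):
    """Yield non-increasing lists of positive integers that sum to `total`
    and whose first (largest) part is exactly `largest`."""
    def rest(remaining, cap):
        if remaining == 0:
            yield []
            return
        for part in range(min(cap, remaining), 0, -1):
            for tail in rest(remaining - part, part):
                yield [part] + tail

    if total >= largest:
        for tail in rest(total - largest, largest):
            yield [largest] + tail


def rational_form_shapes(dim, factors):
    """`factors` is a list of (degree, largest_exponent) pairs, one per
    irreducible factor of the minimal polynomial. Yield, for each possible
    rational canonical form, one exponent list per factor."""
    def go(i, remaining):
        if i == len(factors):
            if remaining == 0:
                yield []
            return
        degree, top = factors[i]
        # `weight` is the sum of the exponents used for this factor.
        for weight in range(top, remaining // degree + 1):
            for parts in partitions_with_max(weight, top):
                for tail in go(i + 1, remaining - degree * weight):
                    yield [parts] + tail

    yield from go(0, dim)


shapes = list(rational_form_shapes(7, [(2, 1), (1, 3)]))
for shape in shapes:
    print(shape)
# [[1], [3, 2]]      one block C(t^2-2t+5), blocks C((t-3)^3), C((t-3)^2)
# [[1], [3, 1, 1]]   one block C(t^2-2t+5), blocks C((t-3)^3), [3], [3]
# [[1, 1], [3]]      two blocks C(t^2-2t+5), one block C((t-3)^3)
```

The three shapes correspond to forms (b), (c), and (a) above, in that order.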
PROJECTIONS
10.33. Suppose $V = W_1 \oplus \cdots \oplus W_r$. The projection of $V$ into its subspace $W_k$ is the mapping $E: V \to V$ defined by $E(v) = w_k$, where $v = w_1 + \cdots + w_r$, $w_i \in W_i$. Show that: (a) $E$ is linear, (b) $E^2 = E$.
(a) Since the sum $v = w_1 + \cdots + w_r$, $w_i \in W_i$, is uniquely determined by $v$, the mapping $E$ is well defined. Suppose, for $u \in V$, $u = w_1' + \cdots + w_r'$, $w_i' \in W_i$. Then
\[ v + u = (w_1 + w_1') + \cdots + (w_r + w_r') \quad\text{and}\quad kv = kw_1 + \cdots + kw_r, \qquad kw_i,\ w_i + w_i' \in W_i \]
are the unique sums corresponding to $v + u$ and $kv$. Hence
\[ E(v + u) = w_k + w_k' = E(v) + E(u) \quad\text{and}\quad E(kv) = kw_k = kE(v) \]
and therefore $E$ is linear.
(b) We have that
\[ w_k = 0 + \cdots + 0 + w_k + 0 + \cdots + 0 \]
is the unique sum corresponding to $w_k \in W_k$; hence $E(w_k) = w_k$. Then, for any $v \in V$,
\[ E^2(v) = E(E(v)) = E(w_k) = w_k = E(v) \]
Thus $E^2 = E$, as required.
10.34. Suppose $E: V \to V$ is linear and $E^2 = E$. Show that: (a) $E(u) = u$ for any $u \in \operatorname{Im} E$, i.e., the restriction of $E$ to its image is the identity mapping; (b) $V$ is the direct sum of the image and kernel of $E$: $V = \operatorname{Im} E \oplus \operatorname{Ker} E$; (c) $E$ is the projection of $V$ into $\operatorname{Im} E$, its image. Thus, by the preceding problem, a linear mapping $T: V \to V$ is a projection if and only if $T^2 = T$; this characterization of a projection is frequently used as its definition.
(a) If $u \in \operatorname{Im} E$, then there exists $v \in V$ for which $E(v) = u$; hence, as required,
\[ E(u) = E(E(v)) = E^2(v) = E(v) = u \]
(b) Let $v \in V$. We can write $v$ in the form $v = E(v) + (v - E(v))$. Now $E(v) \in \operatorname{Im} E$ and, since
\[ E(v - E(v)) = E(v) - E^2(v) = E(v) - E(v) = 0 \]
$v - E(v) \in \operatorname{Ker} E$. Accordingly, $V = \operatorname{Im} E + \operatorname{Ker} E$.
Now suppose $w \in \operatorname{Im} E \cap \operatorname{Ker} E$. By (a), $E(w) = w$ because $w \in \operatorname{Im} E$. On the other hand, $E(w) = 0$ because $w \in \operatorname{Ker} E$. Thus $w = 0$, and so $\operatorname{Im} E \cap \operatorname{Ker} E = \{0\}$. These two conditions imply that $V$ is the direct sum of the image and kernel of $E$.
(c) Let $v \in V$ and suppose $v = u + w$, where $u \in \operatorname{Im} E$ and $w \in \operatorname{Ker} E$. Note that $E(u) = u$ by (a), and $E(w) = 0$ because $w \in \operatorname{Ker} E$. Hence
\[ E(v) = E(u + w) = E(u) + E(w) = u + 0 = u \]
That is, $E$ is the projection of $V$ into its image.
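A small numerical illustration of Problem 10.34 may help. The matrix `E` below is a hypothetical idempotent chosen for the example (it is not from the text), and the helper names are assumptions. The sketch checks $E^2 = E$ and splits a vector $v$ as $E(v) + (v - E(v))$ with the first part in the image and the second in the kernel.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]


def apply(A, v):
    """Apply the matrix A to the vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


# Hypothetical example: E projects R^2 onto span{(1, 0)} along span{(1, -1)}.
E = [[1, 1],
     [0, 0]]

v = [5, -2]
u = apply(E, v)                            # component of v in Im E
w = [v[i] - u[i] for i in range(len(v))]   # component of v in Ker E

print(matmul(E, E) == E)   # True: E is idempotent
print(u, w)                # [3, 0] [2, -2]
print(apply(E, u) == u)    # True: E restricts to the identity on Im E
print(apply(E, w))         # [0, 0]: w lies in Ker E
```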
10.35. Suppose $V = U \oplus W$ and suppose $T: V \to V$ is linear. Show that $U$ and $W$ are both $T$-invariant if and only if $TE = ET$, where $E$ is the projection of $V$ into $U$.
Observe that $E(v) \in U$ for every $v \in V$, and that (i) $E(v) = v$ iff $v \in U$, (ii) $E(v) = 0$ iff $v \in W$.
Suppose $ET = TE$. Let $u \in U$. Since $E(u) = u$,
\[ T(u) = T(E(u)) = (TE)(u) = (ET)(u) = E(T(u)) \in U \]
Hence $U$ is $T$-invariant. Now let $w \in W$. Since $E(w) = 0$,
\[ E(T(w)) = (ET)(w) = (TE)(w) = T(E(w)) = T(0) = 0, \quad\text{and so}\quad T(w) \in W \]
Hence $W$ is also $T$-invariant.
Conversely, suppose $U$ and $W$ are both $T$-invariant. Let $v \in V$ and suppose $v = u + w$, where $u \in U$ and $w \in W$. Then $T(u) \in U$ and $T(w) \in W$; hence $E(T(u)) = T(u)$ and $E(T(w)) = 0$. Thus
\[ (ET)(v) = (ET)(u + w) = (ET)(u) + (ET)(w) = E(T(u)) + E(T(w)) = T(u) \]
and
\[ (TE)(v) = (TE)(u + w) = T(E(u + w)) = T(u) \]
That is, $(ET)(v) = (TE)(v)$ for every $v \in V$; therefore $ET = TE$, as required.
Supplementary Problems
INVARIANT SUBSPACES
10.36. Suppose $W$ is invariant under $T: V \to V$. Show that $W$ is invariant under $f(T)$ for any polynomial $f(t)$.
10.37. Show that every subspace of $V$ is invariant under $I$ and $0$, the identity and zero operators.
10.38. Let $W$ be invariant under $T_1: V \to V$ and $T_2: V \to V$. Prove $W$ is also invariant under $T_1 + T_2$ and $T_1T_2$.
10.39. Let $T: V \to V$ be linear. Prove that any eigenspace $E_\lambda$ is $T$-invariant.
10.40. Let $V$ be a vector space of odd dimension (greater than 1) over the real field $\mathbf{R}$. Show that any linear operator on $V$ has an invariant subspace other than $V$ or $\{0\}$.
10.41. Determine the invariant subspaces of $A = \begin{bmatrix} 2 & -4 \\ 5 & -2 \end{bmatrix}$ viewed as a linear operator on (i) $\mathbf{R}^2$, (ii) $\mathbf{C}^2$.
10.42. Suppose $\dim V = n$. Show that $T: V \to V$ has a triangular matrix representation if and only if there exist $T$-invariant subspaces $W_1 \subset W_2 \subset \cdots \subset W_n = V$ for which $\dim W_k = k$, $k = 1, \ldots, n$.
INVARIANT DIRECT SUMS
10.43. The subspaces $W_1, \ldots, W_r$ are said to be independent if $w_1 + \cdots + w_r = 0$, $w_i \in W_i$, implies that each $w_i = 0$. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if the $W_i$ are independent. [Here $\operatorname{span}(W_i)$ denotes the linear span of the $W_i$.]
10.44. Show that $V = W_1 \oplus \cdots \oplus W_r$ if and only if: (i) $V = \operatorname{span}(W_i)$ and (ii) for $k = 1, 2, \ldots, r$, $W_k \cap \operatorname{span}(W_1, \ldots, W_{k-1}, W_{k+1}, \ldots, W_r) = \{0\}$.
10.45. Show that $\operatorname{span}(W_i) = W_1 \oplus \cdots \oplus W_r$ if and only if $\dim[\operatorname{span}(W_i)] = \dim W_1 + \cdots + \dim W_r$.
10.46. Suppose the characteristic polynomial of $T: V \to V$ is $\Delta(t) = f_1(t)^{n_1}f_2(t)^{n_2}\cdots f_r(t)^{n_r}$, where the $f_i(t)$ are distinct monic irreducible polynomials. Let $V = W_1 \oplus \cdots \oplus W_r$ be the primary decomposition of $V$ into $T$-invariant subspaces. Show that $f_i(t)^{n_i}$ is the characteristic polynomial of the restriction of $T$ to $W_i$.
NILPOTENT OPERATORS
10.47. Suppose $T_1$ and $T_2$ are nilpotent operators that commute, i.e., $T_1T_2 = T_2T_1$. Show that $T_1 + T_2$ and $T_1T_2$ are also nilpotent.
10.48. Suppose $A$ is a supertriangular matrix, i.e., all entries on and below the main diagonal are 0. Show that $A$ is nilpotent.
10.49. Let $V$ be the vector space of polynomials of degree $\le n$. Show that the derivative operator on $V$ is nilpotent of index $n + 1$.
10.50. Show that any Jordan nilpotent block matrix $N$ is similar to its transpose $N^T$ (the matrix with 1's below the diagonal and 0's elsewhere).
10.51. Show that two nilpotent matrices of order 3 are similar if and only if they have the same index of nilpotency. Show by example that the statement is not true for nilpotent matrices of order 4.
JORDAN CANONICAL FORM
10.52. Find all possible Jordan canonical forms for those matrices whose characteristic polynomial $\Delta(t)$ and minimal polynomial $m(t)$ are as follows:
(a) $\Delta(t) = (t - 2)^4(t - 3)^2$, $m(t) = (t - 2)^2(t - 3)^2$
(b) $\Delta(t) = (t - 7)^5$, $m(t) = (t - 7)^2$
(c) $\Delta(t) = (t - 2)^7$, $m(t) = (t - 2)^3$
10.53. Show that every complex matrix is similar to its transpose. (Hint: Use its Jordan canonical form.)
10.54. Show that all $n \times n$ complex matrices $A$ for which $A^n = I$ but $A^k \ne I$ for $k < n$ are similar.
10.55. Suppose $A$ is a complex matrix with only real eigenvalues. Show that $A$ is similar to a matrix with only real entries.
CYCLIC SUBSPACES
10.56. Suppose $T: V \to V$ is linear. Prove that $Z(v, T)$ is the intersection of all $T$-invariant subspaces containing $v$.
10.57. Let $f(t)$ and $g(t)$ be the $T$-annihilators of $u$ and $v$, respectively. Show that if $f(t)$ and $g(t)$ are relatively prime, then $f(t)g(t)$ is the $T$-annihilator of $u + v$.
10.58. Prove that $Z(u, T) = Z(v, T)$ if and only if $g(T)(u) = v$ where $g(t)$ is relatively prime to the $T$-annihilator of $u$.
10.59. Let $W = Z(v, T)$, and suppose the $T$-annihilator of $v$ is $f(t)^n$, where $f(t)$ is a monic irreducible polynomial of degree $d$. Show that $f(T)^s(W)$ is a cyclic subspace generated by $f(T)^s(v)$ and that it has dimension $d(n - s)$ if $n > s$ and dimension 0 if $n \le s$.
RATIONAL CANONICAL FORM
10.60. Find all possible rational forms for a $6 \times 6$ matrix over $\mathbf{R}$ with minimal polynomial:
(a) $m(t) = (t^2 - 2t + 3)(t + 1)^2$, (b) $m(t) = (t - 2)^3$.
10.61. Let $A$ be a $4 \times 4$ matrix with minimal polynomial $m(t) = (t^2 + 1)(t^2 - 3)$. Find the rational canonical form for $A$ if $A$ is a matrix over (a) the rational field $\mathbf{Q}$, (b) the real field $\mathbf{R}$, (c) the complex field $\mathbf{C}$.
10.62. Find the rational canonical form for the 4-square Jordan block with $\lambda$'s on the diagonal.
10.63. Prove that the characteristic polynomial of an operator $T: V \to V$ is a product of its elementary divisors.
10.64. Prove that two $3 \times 3$ matrices with the same minimal and characteristic polynomials are similar.
10.65. Let $C(f(t))$ denote the companion matrix to an arbitrary polynomial $f(t)$. Show that $f(t)$ is the characteristic polynomial of $C(f(t))$.
PROJECTIONS
10.66. Suppose $V = W_1 \oplus \cdots \oplus W_r$. Let $E_i$ denote the projection of $V$ into $W_i$. Prove: (i) $E_iE_j = 0$, $i \ne j$; (ii) $I = E_1 + \cdots + E_r$.
10.67. Let $E_1, \ldots, E_r$ be linear operators on $V$ such that:
(i) $E_i^2 = E_i$, i.e., the $E_i$ are projections; (ii) $E_iE_j = 0$, $i \ne j$; (iii) $I = E_1 + \cdots + E_r$.
Prove that $V = \operatorname{Im} E_1 \oplus \cdots \oplus \operatorname{Im} E_r$.
10.68. Suppose $E: V \to V$ is a projection, i.e., $E^2 = E$. Prove that $E$ has a matrix representation of the form $\begin{bmatrix} I_r & 0 \\ 0 & 0 \end{bmatrix}$, where $r$ is the rank of $E$ and $I_r$ is the $r$-square identity matrix.
10.69. Prove that any two projections of the same rank are similar. (Hint: Use the result of Problem 10.68.)
10.70. Suppose $E: V \to V$ is a projection. Prove:
(i) $I - E$ is a projection and $V = \operatorname{Im} E \oplus \operatorname{Im}(I - E)$, (ii) $I + E$ is invertible (if $1 + 1 \ne 0$).
QUOTIENT SPACES
10.71. Let $W$ be a subspace of $V$. Suppose the set of cosets $\{v_1 + W, v_2 + W, \ldots, v_n + W\}$ in $V/W$ is linearly independent. Show that the set of vectors $\{v_1, v_2, \ldots, v_n\}$ in $V$ is also linearly independent.
10.72. Let $W$ be a subspace of $V$. Suppose the set of vectors $\{u_1, u_2, \ldots, u_n\}$ in $V$ is linearly independent, and that $L(u_i) \cap W = \{0\}$. Show that the set of cosets $\{u_1 + W, \ldots, u_n + W\}$ in $V/W$ is also linearly independent.
10.73. Suppose $V = U \oplus W$ and that $\{u_1, \ldots, u_n\}$ is a basis of $U$. Show that $\{u_1 + W, \ldots, u_n + W\}$ is a basis of the quotient space $V/W$. (Observe that no condition is placed on the dimensionality of $V$ or $W$.)
10.74. Let $W$ be the solution space of the linear equation
\[ a_1x_1 + a_2x_2 + \cdots + a_nx_n = 0, \qquad a_i \in K \]
and let $v = (b_1, b_2, \ldots, b_n) \in K^n$. Prove that the coset $v + W$ of $W$ in $K^n$ is the solution set of the linear equation
\[ a_1x_1 + a_2x_2 + \cdots + a_nx_n = b, \quad\text{where } b = a_1b_1 + \cdots + a_nb_n \]
10.75. Let $V$ be the vector space of polynomials over $\mathbf{R}$ and let $W$ be the subspace of polynomials divisible by $t^4$, i.e., of the form $a_0t^4 + a_1t^5 + \cdots + a_{n-4}t^n$. Show that the quotient space $V/W$ has dimension 4.
10.76. Let $U$ and $W$ be subspaces of $V$ such that $W \subset U \subset V$. Note that any coset $u + W$ of $W$ in $U$ may also be viewed as a coset of $W$ in $V$, since $u \in U$ implies $u \in V$; hence $U/W$ is a subset of $V/W$. Prove that (i) $U/W$ is a subspace of $V/W$, (ii) $\dim(V/W) - \dim(U/W) = \dim(V/U)$.
10.77. Let $U$ and $W$ be subspaces of $V$. Show that the cosets of $U \cap W$ in $V$ can be obtained by intersecting each of the cosets of $U$ in $V$ by each of the cosets of $W$ in $V$:
\[ V/(U \cap W) = \{(v + U) \cap (v' + W) : v, v' \in V\} \]
10.78. Let $T: V \to V$ be linear with kernel $W$ and image $U$. Show that the quotient space $V/W$ is isomorphic to $U$ under the mapping $\theta: V/W \to U$ defined by $\theta(v + W) = T(v)$. Furthermore, show that $T = i \circ \theta \circ \eta$, where $\eta: V \to V/W$ is the natural mapping of $V$ into $V/W$, i.e., $\eta(v) = v + W$, and $i: U \to V$ is the inclusion mapping, i.e., $i(u) = u$. (See diagram.)
Answers to Supplementary Problems
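10.41. (a) $\mathbf{R}^2$ and $\{0\}$, (b) $\mathbf{C}^2$, $\{0\}$, $W_1 = \operatorname{span}(2, 1 - 2i)$, $W_2 = \operatorname{span}(2, 1 + 2i)$
10.52. (a) $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, \begin{bmatrix} 3 & 1 \\ & 3 \end{bmatrix} \right)$, $\operatorname{diag}\left( \begin{bmatrix} 2 & 1 \\ & 2 \end{bmatrix}, [2], [2], \begin{bmatrix} 3 & 1 \\ & 3 \end{bmatrix} \right)$
(b) $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, [7] \right)$, $\operatorname{diag}\left( \begin{bmatrix} 7 & 1 \\ & 7 \end{bmatrix}, [7], [7], [7] \right)$
(c) Let $M_k$ denote a Jordan block with $\lambda = 2$ and order $k$. Then: $\operatorname{diag}(M_3, M_3, M_1)$, $\operatorname{diag}(M_3, M_2, M_2)$, $\operatorname{diag}(M_3, M_2, M_1, M_1)$, $\operatorname{diag}(M_3, M_1, M_1, M_1, M_1)$
10.60. Let $A = \begin{bmatrix} 0 & -3 \\ 1 & 2 \end{bmatrix}$, $B = \begin{bmatrix} 0 & -1 \\ 1 & -2 \end{bmatrix}$, $C = \begin{bmatrix} 0 & 0 & 8 \\ 1 & 0 & -12 \\ 0 & 1 & 6 \end{bmatrix}$, $D = \begin{bmatrix} 0 & -4 \\ 1 & 4 \end{bmatrix}$.
(a) $\operatorname{diag}(A, A, B)$, $\operatorname{diag}(A, B, B)$, $\operatorname{diag}(A, B, -1, -1)$, (b) $\operatorname{diag}(C, C)$, $\operatorname{diag}(C, D, [2])$, $\operatorname{diag}(C, 2, 2, 2)$
10.61. Let $A = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $B = \begin{bmatrix} 0 & 3 \\ 1 & 0 \end{bmatrix}$.
(a) $\operatorname{diag}(A, B)$, (b) $\operatorname{diag}(A, \sqrt{3}, -\sqrt{3})$, (c) $\operatorname{diag}(i, -i, \sqrt{3}, -\sqrt{3})$
10.62. Companion matrix with the last column $[-\lambda^4, 4\lambda^3, -6\lambda^2, 4\lambda]^T$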
CHAPTER 11
Linear Functionals
and the Dual Space
11.1 INTRODUCTION
In this chapter, we study linear mappings from a vector space V into its field K of scalars. (Unless
otherwise stated or implied, we view K as a vector space over itself.) Naturally all the theorems and results
for arbitrary mappings on V hold for this special case. However, we treat these mappings separately
because of their fundamental importance and because the special relationship of V to K gives rise to new
notions and results that do not apply in the general case.
11.2 LINEAR FUNCTIONALS AND THE DUAL SPACE
Let $V$ be a vector space over a field $K$. A mapping $\phi: V \to K$ is termed a linear functional (or linear form) if, for every $u, v \in V$ and every $a, b \in K$,
\[ \phi(au + bv) = a\phi(u) + b\phi(v) \]
In other words, a linear functional on $V$ is a linear mapping from $V$ into $K$.
Example 11.1.
(a) Let $\pi_i: K^n \to K$ be the $i$th projection mapping, i.e., $\pi_i(a_1, a_2, \ldots, a_n) = a_i$. Then $\pi_i$ is linear and so it is a linear functional on $K^n$.
(b) Let $V$ be the vector space of polynomials in $t$ over $\mathbf{R}$. Let $J: V \to \mathbf{R}$ be the integral operator defined by $J(p(t)) = \int_0^1 p(t)\,dt$. Recall that $J$ is linear; hence it is a linear functional on $V$.
(c) Let $V$ be the vector space of $n$-square matrices over $K$. Let $T: V \to K$ be the trace mapping
\[ T(A) = a_{11} + a_{22} + \cdots + a_{nn}, \quad\text{where } A = [a_{ij}] \]
That is, $T$ assigns to a matrix $A$ the sum of its diagonal elements. This map is linear (Problem 11.24), and so it is a linear functional on $V$.
By Theorem 5.10, the set of linear functionals on a vector space $V$ over a field $K$ is also a vector space over $K$, with addition and scalar multiplication defined by
\[ (\phi + \sigma)(v) = \phi(v) + \sigma(v) \quad\text{and}\quad (k\phi)(v) = k\phi(v) \]
where $\phi$ and $\sigma$ are linear functionals on $V$ and $k \in K$. This space is called the dual space of $V$ and is denoted by $V^*$.
Example 11.2. Let $V = K^n$, the vector space of $n$-tuples, which we write as column vectors. Then the dual space $V^*$ can be identified with the space of row vectors. In particular, any linear functional $\phi = (a_1, \ldots, a_n)$ in $V^*$ has the representation
\[ \phi(x_1, x_2, \ldots, x_n) = [a_1, a_2, \ldots, a_n][x_1, x_2, \ldots, x_n]^T = a_1x_1 + a_2x_2 + \cdots + a_nx_n \]
Historically, the formal expression on the right was termed a linear form.
11.3 DUAL BASIS
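Suppose $V$ is a vector space of dimension $n$ over $K$. By Theorem 5.11 the dimension of the dual space $V^*$ is also $n$ (since $K$ is of dimension 1 over itself). In fact, each basis of $V$ determines a basis of $V^*$ as follows (see Problem 11.3 for the proof).
Theorem 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be the linear functionals defined by
\[ \phi_i(v_j) = \delta_{ij} = \begin{cases} 1 & \text{if } i = j \\ 0 & \text{if } i \ne j \end{cases} \]
Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.
The above basis $\{\phi_i\}$ is termed the basis dual to $\{v_i\}$ or the dual basis. The above formula, which uses the Kronecker delta $\delta_{ij}$, is a short way of writing
\[
\begin{aligned}
&\phi_1(v_1) = 1, \quad \phi_1(v_2) = 0, \quad \phi_1(v_3) = 0, \quad \ldots, \quad \phi_1(v_n) = 0 \\
&\phi_2(v_1) = 0, \quad \phi_2(v_2) = 1, \quad \phi_2(v_3) = 0, \quad \ldots, \quad \phi_2(v_n) = 0 \\
&\qquad\vdots \\
&\phi_n(v_1) = 0, \quad \phi_n(v_2) = 0, \quad \ldots, \quad \phi_n(v_{n-1}) = 0, \quad \phi_n(v_n) = 1
\end{aligned}
\]
By Theorem 5.2, these linear mappings $\phi_i$ are unique and well defined.
Example 11.3. Consider the basis $\{v_1 = (2, 1),\ v_2 = (3, 1)\}$ of $\mathbf{R}^2$. Find the dual basis $\{\phi_1, \phi_2\}$.
We seek linear functionals $\phi_1(x, y) = ax + by$ and $\phi_2(x, y) = cx + dy$ such that
\[ \phi_1(v_1) = 1, \quad \phi_1(v_2) = 0, \quad \phi_2(v_1) = 0, \quad \phi_2(v_2) = 1 \]
These four conditions lead to the following two systems of linear equations:
\[
\begin{aligned}
\phi_1(v_1) &= \phi_1(2, 1) = 2a + b = 1 \\
\phi_1(v_2) &= \phi_1(3, 1) = 3a + b = 0
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
\phi_2(v_1) &= \phi_2(2, 1) = 2c + d = 0 \\
\phi_2(v_2) &= \phi_2(3, 1) = 3c + d = 1
\end{aligned}
\]
The solutions yield $a = -1$, $b = 3$ and $c = 1$, $d = -2$. Hence $\phi_1(x, y) = -x + 3y$ and $\phi_2(x, y) = x - 2y$ form the dual basis.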
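The two linear systems of Example 11.3 can be solved at once by inverting the matrix whose columns are the basis vectors: the rows of the inverse are then the coefficient vectors of the dual functionals. The following is a minimal sketch under that observation; the function names `inverse` and `dual_basis` are hypothetical, and exact rational arithmetic is used so that no rounding occurs. The code assumes the input vectors really form a basis (otherwise the pivot search fails).

```python
from fractions import Fraction


def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination over the rationals.

    Assumes M is invertible; a singular matrix makes the pivot search fail.
    """
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def dual_basis(basis):
    """Return the dual basis of K^n as rows of coefficients.

    Row i holds (c_1, ..., c_n) with phi_i(x) = c_1 x_1 + ... + c_n x_n.
    """
    n = len(basis)
    # Matrix whose j-th column is the j-th basis vector.
    columns = [[basis[j][i] for j in range(n)] for i in range(n)]
    return inverse(columns)


phi = dual_basis([(2, 1), (3, 1)])
print(phi == [[-1, 3], [1, -2]])   # True: phi_1 = -x + 3y, phi_2 = x - 2y
```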
The next two theorems (proved in Problems 11.4 and 11.5, respectively) give relationships between
bases and their duals.
Theorem 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. Then:
(i) For any vector $u \in V$, $u = \phi_1(u)v_1 + \phi_2(u)v_2 + \cdots + \phi_n(u)v_n$.
(ii) For any linear functional $\sigma \in V^*$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$.
Theorem 11.3: Let $\{v_1, \ldots, v_n\}$ and $\{w_1, \ldots, w_n\}$ be bases of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ and $\{\sigma_1, \ldots, \sigma_n\}$ be the bases of $V^*$ dual to $\{v_i\}$ and $\{w_i\}$, respectively. Suppose $P$ is the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.
11.4 SECOND DUAL SPACE
We repeat: Every vector space $V$ has a dual space $V^*$, which consists of all the linear functionals on $V$. Thus $V^*$ itself has a dual space $V^{**}$, called the second dual of $V$, which consists of all the linear functionals on $V^*$.
We now show that each $v \in V$ determines a specific element $\hat v \in V^{**}$. First of all, for any $\phi \in V^*$, we define
\[ \hat v(\phi) = \phi(v) \]
It remains to be shown that this map $\hat v: V^* \to K$ is linear. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in V^*$, we have
\[ \hat v(a\phi + b\sigma) = (a\phi + b\sigma)(v) = a\phi(v) + b\sigma(v) = a\hat v(\phi) + b\hat v(\sigma) \]
That is, $\hat v$ is linear and so $\hat v \in V^{**}$. The following theorem (proved in Problem 11.7) holds.
Theorem 11.4: If $V$ has finite dimension, then the mapping $v \mapsto \hat v$ is an isomorphism of $V$ onto $V^{**}$.
The above mapping $v \mapsto \hat v$ is called the natural mapping of $V$ into $V^{**}$. We emphasize that this mapping is never onto $V^{**}$ if $V$ is not finite-dimensional. However, it is always linear and, moreover, it is always one-to-one.
Now suppose $V$ does have finite dimension. By Theorem 11.4, the natural mapping determines an isomorphism between $V$ and $V^{**}$. Unless otherwise stated, we shall identify $V$ with $V^{**}$ by this mapping. Accordingly, we shall view $V$ as the space of linear functionals on $V^*$ and shall write $V = V^{**}$. We remark that if $\{\phi_i\}$ is the basis of $V^*$ dual to a basis $\{v_i\}$ of $V$, then $\{v_i\}$ is the basis of $V^{**} = V$ that is dual to $\{\phi_i\}$.
11.5 ANNIHILATORS
Let $W$ be a subset (not necessarily a subspace) of a vector space $V$. A linear functional $\phi \in V^*$ is called an annihilator of $W$ if $\phi(w) = 0$ for every $w \in W$, i.e., if $\phi(W) = \{0\}$. We show that the set of all such mappings, denoted by $W^0$ and called the annihilator of $W$, is a subspace of $V^*$. Clearly, $0 \in W^0$. Now suppose $\phi, \sigma \in W^0$. Then, for any scalars $a, b \in K$ and for any $w \in W$,
\[ (a\phi + b\sigma)(w) = a\phi(w) + b\sigma(w) = a0 + b0 = 0 \]
Thus $a\phi + b\sigma \in W^0$, and so $W^0$ is a subspace of $V^*$.
In the case that $W$ is a subspace of $V$, we have the following relationship between $W$ and its annihilator $W^0$ (see Problem 11.11 for the proof).
Theorem 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then:
(i) $\dim W + \dim W^0 = \dim V$ and (ii) $W^{00} = W$
Here $W^{00} = \{v \in V : \phi(v) = 0 \text{ for every } \phi \in W^0\}$ or, equivalently, $W^{00} = (W^0)^0$, where $W^{00}$ is viewed as a subspace of $V$ under the identification of $V$ and $V^{**}$.
11.6 TRANSPOSE OF A LINEAR MAPPING
Let $T: V \to U$ be an arbitrary linear mapping from a vector space $V$ into a vector space $U$. Now for any linear functional $\phi \in U^*$, the composition $\phi \circ T$ is a linear mapping from $V$ into $K$. That is, $\phi \circ T \in V^*$. Thus the correspondence
\[ \phi \mapsto \phi \circ T \]
is a mapping from $U^*$ into $V^*$; we denote it by $T^t$ and call it the transpose of $T$. In other words, $T^t: U^* \to V^*$ is defined by
\[ T^t(\phi) = \phi \circ T \]
Thus $(T^t(\phi))(v) = \phi(T(v))$ for every $v \in V$.
Theorem 11.6: The transpose mapping $T^t$ defined above is linear.
Proof. For any scalars $a, b \in K$ and any linear functionals $\phi, \sigma \in U^*$,
\[ T^t(a\phi + b\sigma) = (a\phi + b\sigma) \circ T = a(\phi \circ T) + b(\sigma \circ T) = aT^t(\phi) + bT^t(\sigma) \]
That is, $T^t$ is linear, as claimed.
We emphasize that if $T$ is a linear mapping from $V$ into $U$, then $T^t$ is a linear mapping from $U^*$ into $V^*$. The name "transpose" for the mapping $T^t$ no doubt derives from the following theorem (proved in Problem 11.16).
Theorem 11.7: Let $T: V \to U$ be linear, and let $A$ be the matrix representation of $T$ relative to bases $\{v_i\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of $T^t: U^* \to V^*$ relative to the bases dual to $\{u_i\}$ and $\{v_i\}$.
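Theorem 11.7 can be illustrated with coordinates. The sketch below is an assumption-laden example, not part of the text: it takes $V = \mathbf{R}^3$ and $U = \mathbf{R}^2$ with their standard bases, writes vectors as columns so that $T(v) = Av$, and uses a hypothetical matrix `A` and functional `phi`. The coefficient vector of $T^t(\phi) = \phi \circ T$ is then obtained by applying $A^T$ to the coefficient vector of $\phi$.

```python
def apply(A, v):
    """Apply the matrix A (list of rows) to the vector v."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]


def transpose(A):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]


def evaluate(phi, v):
    """Value of the functional with coefficient vector phi at v."""
    return sum(a * x for a, x in zip(phi, v))


# Hypothetical data: T maps R^3 to R^2 via T(v) = A v; phi is a functional on R^2.
A = [[1, 2, 0],
     [0, 1, 3]]
phi = [4, -1]

# Coefficients of T^t(phi) = phi o T in the dual of the standard basis of R^3.
pullback = apply(transpose(A), phi)
print(pullback)   # [4, 7, -3]

# Check the defining property (T^t(phi))(v) = phi(T(v)) on a sample vector.
v = [1, 1, 1]
print(evaluate(pullback, v), evaluate(phi, apply(A, v)))   # 8 8
```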
Solved Problems
DUAL SPACES AND DUAL BASES
11.1. Find the basis $\{\phi_1, \phi_2, \phi_3\}$ that is dual to the following basis of $\mathbf{R}^3$:
\[ \{v_1 = (1, -1, 3), \quad v_2 = (0, 1, -1), \quad v_3 = (0, 3, -2)\} \]
The linear functionals may be expressed in the form
\[ \phi_1(x, y, z) = a_1x + a_2y + a_3z, \quad \phi_2(x, y, z) = b_1x + b_2y + b_3z, \quad \phi_3(x, y, z) = c_1x + c_2y + c_3z \]
By definition of the dual basis, $\phi_i(v_j) = 0$ for $i \ne j$, but $\phi_i(v_j) = 1$ for $i = j$.
We find $\phi_1$ by setting $\phi_1(v_1) = 1$, $\phi_1(v_2) = 0$, $\phi_1(v_3) = 0$. This yields
\[ \phi_1(1, -1, 3) = a_1 - a_2 + 3a_3 = 1, \quad \phi_1(0, 1, -1) = a_2 - a_3 = 0, \quad \phi_1(0, 3, -2) = 3a_2 - 2a_3 = 0 \]
Solving the system of equations yields $a_1 = 1$, $a_2 = 0$, $a_3 = 0$. Thus $\phi_1(x, y, z) = x$.
We find $\phi_2$ by setting $\phi_2(v_1) = 0$, $\phi_2(v_2) = 1$, $\phi_2(v_3) = 0$. This yields
\[ \phi_2(1, -1, 3) = b_1 - b_2 + 3b_3 = 0, \quad \phi_2(0, 1, -1) = b_2 - b_3 = 1, \quad \phi_2(0, 3, -2) = 3b_2 - 2b_3 = 0 \]
Solving the system of equations yields $b_1 = 7$, $b_2 = -2$, $b_3 = -3$. Thus $\phi_2(x, y, z) = 7x - 2y - 3z$.
We find $\phi_3$ by setting $\phi_3(v_1) = 0$, $\phi_3(v_2) = 0$, $\phi_3(v_3) = 1$. This yields
\[ \phi_3(1, -1, 3) = c_1 - c_2 + 3c_3 = 0, \quad \phi_3(0, 1, -1) = c_2 - c_3 = 0, \quad \phi_3(0, 3, -2) = 3c_2 - 2c_3 = 1 \]
Solving the system of equations yields $c_1 = -2$, $c_2 = 1$, $c_3 = 1$. Thus $\phi_3(x, y, z) = -2x + y + z$.
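11.2. Let $V = \{a + bt : a, b \in \mathbf{R}\}$, the vector space of real polynomials of degree $\le 1$. Find the basis $\{v_1, v_2\}$ of $V$ that is dual to the basis $\{\phi_1, \phi_2\}$ of $V^*$ defined by
\[ \phi_1(f(t)) = \int_0^1 f(t)\,dt \quad\text{and}\quad \phi_2(f(t)) = \int_0^2 f(t)\,dt \]
Let $v_1 = a + bt$ and $v_2 = c + dt$. By definition of the dual basis,
\[ \phi_1(v_1) = 1, \quad \phi_1(v_2) = 0 \quad\text{and}\quad \phi_2(v_1) = 0, \quad \phi_2(v_2) = 1 \]
Thus
\[
\begin{aligned}
\phi_1(v_1) &= \int_0^1 (a + bt)\,dt = a + \tfrac{1}{2}b = 1 \\
\phi_2(v_1) &= \int_0^2 (a + bt)\,dt = 2a + 2b = 0
\end{aligned}
\qquad\text{and}\qquad
\begin{aligned}
\phi_1(v_2) &= \int_0^1 (c + dt)\,dt = c + \tfrac{1}{2}d = 0 \\
\phi_2(v_2) &= \int_0^2 (c + dt)\,dt = 2c + 2d = 1
\end{aligned}
\]
Solving each system yields $a = 2$, $b = -2$ and $c = -\tfrac{1}{2}$, $d = 1$. Thus $\{v_1 = 2 - 2t,\ v_2 = -\tfrac{1}{2} + t\}$ is the basis of $V$ that is dual to $\{\phi_1, \phi_2\}$.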
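The answer to Problem 11.2 can be verified by evaluating both integral functionals on both polynomials and checking that the resulting table is the identity matrix. This is a minimal sketch with hypothetical names; a polynomial $c_0 + c_1t + \cdots$ is represented by its coefficient list, and the integral from 0 to an upper limit is computed term by term in exact arithmetic.

```python
from fractions import Fraction


def integral(poly, upper):
    """Integrate c0 + c1*t + c2*t^2 + ... from 0 to `upper`.

    `poly` lists the coefficients c0, c1, c2, ...
    """
    return sum(Fraction(c) * Fraction(upper) ** (k + 1) / (k + 1)
               for k, c in enumerate(poly))


# The two functionals of Problem 11.2.
functionals = [lambda f: integral(f, 1), lambda f: integral(f, 2)]

# Candidate dual basis: v1 = 2 - 2t and v2 = -1/2 + t.
vectors = [[2, -2], [Fraction(-1, 2), 1]]

# table[i][j] = phi_{i+1}(v_{j+1}); the dual-basis condition is table = identity.
table = [[phi(v) for v in vectors] for phi in functionals]
print(table == [[1, 0], [0, 1]])   # True
```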
11.3. Prove Theorem 11.1: Suppose $\{v_1, \ldots, v_n\}$ is a basis of $V$ over $K$. Let $\phi_1, \ldots, \phi_n \in V^*$ be defined by $\phi_i(v_j) = 0$ for $i \ne j$, but $\phi_i(v_j) = 1$ for $i = j$. Then $\{\phi_1, \ldots, \phi_n\}$ is a basis of $V^*$.
We first show that $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$. Let $\phi$ be an arbitrary element of $V^*$, and suppose
\[ \phi(v_1) = k_1, \quad \phi(v_2) = k_2, \quad \ldots, \quad \phi(v_n) = k_n \]
Set $\sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Then
\[
\begin{aligned}
\sigma(v_1) &= (k_1\phi_1 + \cdots + k_n\phi_n)(v_1) = k_1\phi_1(v_1) + k_2\phi_2(v_1) + \cdots + k_n\phi_n(v_1) \\
&= k_1 \cdot 1 + k_2 \cdot 0 + \cdots + k_n \cdot 0 = k_1
\end{aligned}
\]
Similarly, for $i = 2, \ldots, n$,
\[ \sigma(v_i) = (k_1\phi_1 + \cdots + k_n\phi_n)(v_i) = k_1\phi_1(v_i) + \cdots + k_i\phi_i(v_i) + \cdots + k_n\phi_n(v_i) = k_i \]
Thus $\phi(v_i) = \sigma(v_i)$ for $i = 1, \ldots, n$. Since $\phi$ and $\sigma$ agree on the basis vectors, $\phi = \sigma = k_1\phi_1 + \cdots + k_n\phi_n$. Accordingly, $\{\phi_1, \ldots, \phi_n\}$ spans $V^*$.
It remains to be shown that $\{\phi_1, \ldots, \phi_n\}$ is linearly independent. Suppose
\[ a_1\phi_1 + a_2\phi_2 + \cdots + a_n\phi_n = 0 \]
Applying both sides to $v_1$, we obtain
\[
\begin{aligned}
0 = 0(v_1) &= (a_1\phi_1 + \cdots + a_n\phi_n)(v_1) = a_1\phi_1(v_1) + a_2\phi_2(v_1) + \cdots + a_n\phi_n(v_1) \\
&= a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1
\end{aligned}
\]
Similarly, for $i = 2, \ldots, n$,
\[ 0 = 0(v_i) = (a_1\phi_1 + \cdots + a_n\phi_n)(v_i) = a_1\phi_1(v_i) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_n(v_i) = a_i \]
That is, $a_1 = 0, \ldots, a_n = 0$. Hence $\{\phi_1, \ldots, \phi_n\}$ is linearly independent, and so it is a basis of $V^*$.
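11.4. Prove Theorem 11.2: Let $\{v_1, \ldots, v_n\}$ be a basis of $V$ and let $\{\phi_1, \ldots, \phi_n\}$ be the dual basis in $V^*$. For any $u \in V$ and any $\sigma \in V^*$, (i) $u = \sum_i \phi_i(u)v_i$, (ii) $\sigma = \sum_i \sigma(v_i)\phi_i$.
Suppose
\[ u = a_1v_1 + a_2v_2 + \cdots + a_nv_n \tag{1} \]
Then
\[ \phi_1(u) = a_1\phi_1(v_1) + a_2\phi_1(v_2) + \cdots + a_n\phi_1(v_n) = a_1 \cdot 1 + a_2 \cdot 0 + \cdots + a_n \cdot 0 = a_1 \]
Similarly, for $i = 2, \ldots, n$,
\[ \phi_i(u) = a_1\phi_i(v_1) + \cdots + a_i\phi_i(v_i) + \cdots + a_n\phi_i(v_n) = a_i \]
That is, $\phi_1(u) = a_1$, $\phi_2(u) = a_2$, $\ldots$, $\phi_n(u) = a_n$. Substituting these results into (1), we obtain (i).
Next we prove (ii). Applying the linear functional $\sigma$ to both sides of (i),
\[
\begin{aligned}
\sigma(u) &= \phi_1(u)\sigma(v_1) + \phi_2(u)\sigma(v_2) + \cdots + \phi_n(u)\sigma(v_n) \\
&= \sigma(v_1)\phi_1(u) + \sigma(v_2)\phi_2(u) + \cdots + \sigma(v_n)\phi_n(u) \\
&= (\sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n)(u)
\end{aligned}
\]
Since the above holds for every $u \in V$, $\sigma = \sigma(v_1)\phi_1 + \sigma(v_2)\phi_2 + \cdots + \sigma(v_n)\phi_n$, as claimed.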
11.5. Prove Theorem 11.3: Let $\{v_i\}$ and $\{w_i\}$ be bases of $V$ and let $\{\phi_i\}$ and $\{\sigma_i\}$ be the respective dual bases in $V^*$. Let $P$ be the change-of-basis matrix from $\{v_i\}$ to $\{w_i\}$. Then $(P^{-1})^T$ is the change-of-basis matrix from $\{\phi_i\}$ to $\{\sigma_i\}$.
Suppose, for $i = 1, \ldots, n$,
\[ w_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n \quad\text{and}\quad \sigma_i = b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n \]
Then $P = [a_{ij}]$ and $Q = [b_{ij}]$. We seek to prove that $Q = (P^{-1})^T$.
Let $R_i$ denote the $i$th row of $Q$ and let $C_j$ denote the $j$th column of $P^T$. Then
\[ R_i = (b_{i1}, b_{i2}, \ldots, b_{in}) \quad\text{and}\quad C_j = (a_{j1}, a_{j2}, \ldots, a_{jn})^T \]
By definition of the dual basis,
\[
\begin{aligned}
\sigma_i(w_j) &= (b_{i1}\phi_1 + b_{i2}\phi_2 + \cdots + b_{in}\phi_n)(a_{j1}v_1 + a_{j2}v_2 + \cdots + a_{jn}v_n) \\
&= b_{i1}a_{j1} + b_{i2}a_{j2} + \cdots + b_{in}a_{jn} = R_iC_j = \delta_{ij}
\end{aligned}
\]
where $\delta_{ij}$ is the Kronecker delta. Thus
\[ QP^T = [R_iC_j] = [\delta_{ij}] = I \]
Therefore, $Q = (P^T)^{-1} = (P^{-1})^T$, as claimed.
11.6. Suppose $v \in V$, $v \ne 0$, and $\dim V = n$. Show that there exists $\phi \in V^*$ such that $\phi(v) \ne 0$.
We extend $\{v\}$ to a basis $\{v, v_2, \ldots, v_n\}$ of $V$. By Theorem 5.2, there exists a unique linear mapping $\phi: V \to K$ such that $\phi(v) = 1$ and $\phi(v_i) = 0$, $i = 2, \ldots, n$. Hence $\phi$ has the desired property.
11.7. Prove Theorem 11.4: Suppose $\dim V = n$. Then the natural mapping $v \mapsto \hat v$ is an isomorphism of $V$ onto $V^{**}$.
We first prove that the map $v \mapsto \hat v$ is linear, i.e., for any vectors $v, w \in V$ and any scalars $a, b \in K$, $\widehat{av + bw} = a\hat v + b\hat w$. For any linear functional $\phi \in V^*$,
\[ \widehat{av + bw}(\phi) = \phi(av + bw) = a\phi(v) + b\phi(w) = a\hat v(\phi) + b\hat w(\phi) = (a\hat v + b\hat w)(\phi) \]
Since $\widehat{av + bw}(\phi) = (a\hat v + b\hat w)(\phi)$ for every $\phi \in V^*$, we have $\widehat{av + bw} = a\hat v + b\hat w$. Thus the map $v \mapsto \hat v$ is linear.
Now suppose $v \in V$, $v \ne 0$. Then, by Problem 11.6, there exists $\phi \in V^*$ for which $\phi(v) \ne 0$. Hence $\hat{v}(\phi) = \phi(v) \ne 0$, and thus $\hat{v} \ne 0$. Since $v \ne 0$ implies $\hat{v} \ne 0$, the map $v \mapsto \hat{v}$ is nonsingular and hence an isomorphism (Theorem 5.64).
Now $\dim V = \dim V^* = \dim V^{**}$, because $V$ has finite dimension. Accordingly, the mapping $v \mapsto \hat{v}$ is an isomorphism of $V$ onto $V^{**}$.
ANNIHILATORS
11.8. Show that if $\phi \in V^*$ annihilates a subset $S$ of $V$, then $\phi$ annihilates the linear span $L(S)$ of $S$. Hence $S^0 = [\operatorname{span}(S)]^0$.
Suppose $v \in \operatorname{span}(S)$. Then there exist $w_1, \ldots, w_r \in S$ for which $v = a_1w_1 + a_2w_2 + \cdots + a_rw_r$. Then
$$\phi(v) = a_1\phi(w_1) + a_2\phi(w_2) + \cdots + a_r\phi(w_r) = a_1 0 + a_2 0 + \cdots + a_r 0 = 0$$
Since $v$ was an arbitrary element of $\operatorname{span}(S)$, $\phi$ annihilates $\operatorname{span}(S)$, as claimed.
11.9. Find a basis of the annihilator $W^0$ of the subspace $W$ of $\mathbf{R}^4$ spanned by
$$v_1 = (1, 2, -3, 4) \quad\text{and}\quad v_2 = (0, 1, 4, -1)$$
By Problem 11.8, it suffices to find a basis of the set of linear functionals $\phi$ such that $\phi(v_1) = 0$ and $\phi(v_2) = 0$, where $\phi(x_1, x_2, x_3, x_4) = ax_1 + bx_2 + cx_3 + dx_4$. Thus
$$\phi(1, 2, -3, 4) = a + 2b - 3c + 4d = 0 \quad\text{and}\quad \phi(0, 1, 4, -1) = b + 4c - d = 0$$
The system of two equations in the unknowns $a, b, c, d$ is in echelon form with free variables $c$ and $d$.
(1) Set $c = 1$, $d = 0$ to obtain the solution $a = 11$, $b = -4$, $c = 1$, $d = 0$.
(2) Set $c = 0$, $d = 1$ to obtain the solution $a = -6$, $b = 1$, $c = 0$, $d = 1$.
The linear functionals $\phi_1(x_i) = 11x_1 - 4x_2 + x_3$ and $\phi_2(x_i) = -6x_1 + x_2 + x_4$ form a basis of $W^0$.
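The back-substitution in Problem 11.9 can be checked mechanically. The following is a minimal sketch, not part of the original solution: the helper names `solve_functional` and `apply` are illustrative assumptions, and the code simply solves the echelon system for a given choice of the free variables and confirms that the resulting functionals vanish on both spanning vectors.

```python
from fractions import Fraction

# Spanning vectors of W in R^4, taken from Problem 11.9.
v1 = (1, 2, -3, 4)
v2 = (0, 1, 4, -1)

def solve_functional(c, d):
    """Back-substitute in the echelon system
           a + 2b - 3c + 4d = 0
               b + 4c -  d = 0
    for a chosen pair of free variables (c, d); returns (a, b, c, d)."""
    c, d = Fraction(c), Fraction(d)
    b = d - 4 * c               # from the second equation
    a = -2 * b + 3 * c - 4 * d  # from the first equation
    return (a, b, c, d)

def apply(phi, v):
    """Evaluate phi(x) = a*x1 + b*x2 + c*x3 + d*x4 at the vector v."""
    return sum(p * x for p, x in zip(phi, v))

phi1 = solve_functional(1, 0)   # free variables c = 1, d = 0
phi2 = solve_functional(0, 1)   # free variables c = 0, d = 1

# Both functionals must annihilate both spanning vectors of W.
checks = [apply(phi, v) for phi in (phi1, phi2) for v in (v1, v2)]
```

Every entry of `checks` being zero confirms that `phi1` and `phi2` lie in the annihilator; they are independent because they differ in the free coordinates.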
11.10. Show that: (a) For any subset $S$ of $V$, $S \subseteq S^{00}$. (b) If $S_1 \subseteq S_2$, then $S_2^0 \subseteq S_1^0$.
(a) Let $v \in S$. Then for every linear functional $\phi \in S^0$, $\hat{v}(\phi) = \phi(v) = 0$. Hence $\hat{v} \in (S^0)^0$. Therefore, under the identification of $V$ and $V^{**}$, $v \in S^{00}$. Accordingly, $S \subseteq S^{00}$.
(b) Let $\phi \in S_2^0$. Then $\phi(v) = 0$ for every $v \in S_2$. But $S_1 \subseteq S_2$; hence $\phi$ annihilates every element of $S_1$, i.e., $\phi \in S_1^0$. Therefore $S_2^0 \subseteq S_1^0$.
11.11. Prove Theorem 11.5: Suppose $V$ has finite dimension and $W$ is a subspace of $V$. Then:
(i) $\dim W + \dim W^0 = \dim V$, (ii) $W^{00} = W$.
(i) Suppose $\dim V = n$ and $\dim W = r \le n$. We want to show that $\dim W^0 = n - r$. We choose a basis $\{w_1, \ldots, w_r\}$ of $W$ and extend it to a basis of $V$, say $\{w_1, \ldots, w_r, v_1, \ldots, v_{n-r}\}$. Consider the dual basis
$$\{\phi_1, \ldots, \phi_r, \sigma_1, \ldots, \sigma_{n-r}\}$$
By definition of the dual basis, each of the above $\sigma$'s annihilates each $w_i$; hence $\sigma_1, \ldots, \sigma_{n-r} \in W^0$. We claim that $\{\sigma_j\}$ is a basis of $W^0$. Now $\{\sigma_j\}$ is part of a basis of $V^*$, and so it is linearly independent.
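We next show that $\{\sigma_j\}$ spans $W^0$. Let $\sigma \in W^0$. By Theorem 11.2,
$$\sigma = \sigma(w_1)\phi_1 + \cdots + \sigma(w_r)\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}$$
$$= 0\phi_1 + \cdots + 0\phi_r + \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r} = \sigma(v_1)\sigma_1 + \cdots + \sigma(v_{n-r})\sigma_{n-r}$$
Consequently $\{\sigma_1, \ldots, \sigma_{n-r}\}$ spans $W^0$ and so it is a basis of $W^0$. Accordingly, as required,
$$\dim W^0 = n - r = \dim V - \dim W$$
(ii) Suppose $\dim V = n$ and $\dim W = r$. Then $\dim V^* = n$ and, by (i), $\dim W^0 = n - r$. Thus, by (i), $\dim W^{00} = n - (n - r) = r$; therefore $\dim W = \dim W^{00}$. By Problem 11.10, $W \subseteq W^{00}$. Accordingly, $W = W^{00}$.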
11.12. Let $U$ and $W$ be subspaces of $V$. Prove that $(U + W)^0 = U^0 \cap W^0$.
Let $\phi \in (U + W)^0$. Then $\phi$ annihilates $U + W$, and so, in particular, $\phi$ annihilates $U$ and $W$. That is, $\phi \in U^0$ and $\phi \in W^0$; hence $\phi \in U^0 \cap W^0$. Thus $(U + W)^0 \subseteq U^0 \cap W^0$.
On the other hand, suppose $\sigma \in U^0 \cap W^0$. Then $\sigma$ annihilates $U$ and also $W$. If $v \in U + W$, then $v = u + w$, where $u \in U$ and $w \in W$. Hence $\sigma(v) = \sigma(u) + \sigma(w) = 0 + 0 = 0$. Thus $\sigma$ annihilates $U + W$, i.e., $\sigma \in (U + W)^0$. Accordingly, $U^0 \cap W^0 \subseteq (U + W)^0$.
The two inclusion relations together give us the desired equality.
Remark: Observe that no dimension argument is employed in the proof; hence the result holds for spaces of finite or infinite dimension.
TRANSPOSE OF A LINEAR MAPPING
11.13. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = x - 2y$. For each of the following linear operators $T$ on $\mathbf{R}^2$, find $(T^t(\phi))(x, y)$:
(a) $T(x, y) = (x, 0)$, (b) $T(x, y) = (y, x + y)$, (c) $T(x, y) = (2x - 3y, 5x + 2y)$
By definition, $T^t(\phi) = \phi \circ T$, that is, $(T^t(\phi))(v) = \phi(T(v))$ for every $v$. Hence:
(a) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(x, 0) = x$,
(b) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(y, x + y) = y - 2(x + y) = -2x - y$,
(c) $(T^t(\phi))(x, y) = \phi(T(x, y)) = \phi(2x - 3y, 5x + 2y) = (2x - 3y) - 2(5x + 2y) = -8x - 7y$
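The three computations in Problem 11.13 follow one pattern: if $\phi$ is written as a row of coefficients and $T$ as its matrix in the usual basis, then the coefficients of $\phi \circ T$ are the row-times-matrix product. The sketch below illustrates this; the function name `transpose_apply` and the matrix encoding of each $T$ are assumptions made for the illustration.

```python
def transpose_apply(phi, T):
    """Coefficients of the functional phi∘T.

    phi is a list of coefficients (a row vector); T is the matrix of the
    operator in the usual basis, given as a list of rows."""
    cols = len(T[0])
    return [sum(phi[i] * T[i][j] for i in range(len(phi))) for j in range(cols)]

phi = [1, -2]                 # phi(x, y) = x - 2y

Ta = [[1, 0], [0, 0]]         # T(x, y) = (x, 0)
Tb = [[0, 1], [1, 1]]         # T(x, y) = (y, x + y)
Tc = [[2, -3], [5, 2]]        # T(x, y) = (2x - 3y, 5x + 2y)

result_a = transpose_apply(phi, Ta)   # coefficients of x and y in part (a)
result_b = transpose_apply(phi, Tb)
result_c = transpose_apply(phi, Tc)
```

Reading each result as the coefficients of $x$ and $y$ reproduces the answers to parts (a), (b), and (c).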
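11.14. Let $T: V \to U$ be linear and let $T^t: U^* \to V^*$ be its transpose. Show that the kernel of $T^t$ is the annihilator of the image of $T$, i.e., $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$.
Suppose $\phi \in \operatorname{Ker} T^t$; that is, $T^t(\phi) = \phi \circ T = 0$. If $u \in \operatorname{Im} T$, then $u = T(v)$ for some $v \in V$; hence
$$\phi(u) = \phi(T(v)) = (\phi \circ T)(v) = 0(v) = 0$$
We have that $\phi(u) = 0$ for every $u \in \operatorname{Im} T$; hence $\phi \in (\operatorname{Im} T)^0$. Thus $\operatorname{Ker} T^t \subseteq (\operatorname{Im} T)^0$.
On the other hand, suppose $\sigma \in (\operatorname{Im} T)^0$; that is, $\sigma(\operatorname{Im} T) = \{0\}$. Then, for every $v \in V$,
$$(T^t(\sigma))(v) = (\sigma \circ T)(v) = \sigma(T(v)) = 0 = 0(v)$$
We have $(T^t(\sigma))(v) = 0(v)$ for every $v \in V$; hence $T^t(\sigma) = 0$. Thus $\sigma \in \operatorname{Ker} T^t$, and so $(\operatorname{Im} T)^0 \subseteq \operatorname{Ker} T^t$.
The two inclusion relations together give us the required equality.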
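11.15. Suppose $V$ and $U$ have finite dimension and $T: V \to U$ is linear. Prove $\operatorname{rank}(T) = \operatorname{rank}(T^t)$.
Suppose $\dim V = n$ and $\dim U = m$, and suppose $\operatorname{rank}(T) = r$. By Theorem 11.5,
$$\dim(\operatorname{Im} T)^0 = \dim U - \dim(\operatorname{Im} T) = m - \operatorname{rank}(T) = m - r$$
By Problem 11.14, $\operatorname{Ker} T^t = (\operatorname{Im} T)^0$. Hence $\operatorname{nullity}(T^t) = m - r$. It then follows that, as claimed,
$$\operatorname{rank}(T^t) = \dim U^* - \operatorname{nullity}(T^t) = m - (m - r) = r = \operatorname{rank}(T)$$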
11.16. Prove Theorem 11.7: Let $T: V \to U$ be linear and let $A$ be the matrix representation of $T$ in the bases $\{v_j\}$ of $V$ and $\{u_i\}$ of $U$. Then the transpose matrix $A^T$ is the matrix representation of $T^t: U^* \to V^*$ in the bases dual to $\{u_i\}$ and $\{v_j\}$.
Suppose, for $j = 1, \ldots, m$,
$$T(v_j) = a_{j1}u_1 + a_{j2}u_2 + \cdots + a_{jn}u_n \qquad (1)$$
We want to prove that, for $i = 1, \ldots, n$,
$$T^t(\sigma_i) = a_{1i}\phi_1 + a_{2i}\phi_2 + \cdots + a_{mi}\phi_m \qquad (2)$$
where $\{\sigma_i\}$ and $\{\phi_j\}$ are the bases dual to $\{u_i\}$ and $\{v_j\}$, respectively.
Let $v \in V$ and suppose $v = k_1v_1 + k_2v_2 + \cdots + k_mv_m$. Then, by (1),
$$T(v) = k_1T(v_1) + k_2T(v_2) + \cdots + k_mT(v_m)$$
$$= k_1(a_{11}u_1 + \cdots + a_{1n}u_n) + k_2(a_{21}u_1 + \cdots + a_{2n}u_n) + \cdots + k_m(a_{m1}u_1 + \cdots + a_{mn}u_n)$$
$$= (k_1a_{11} + k_2a_{21} + \cdots + k_ma_{m1})u_1 + \cdots + (k_1a_{1n} + k_2a_{2n} + \cdots + k_ma_{mn})u_n$$
$$= \sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i$$
Hence, for $j = 1, \ldots, n$,
$$(T^t(\sigma_j))(v) = \sigma_j(T(v)) = \sigma_j\Big(\sum_{i=1}^{n} (k_1a_{1i} + k_2a_{2i} + \cdots + k_ma_{mi})u_i\Big) = k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (3)$$
On the other hand, for $j = 1, \ldots, n$,
$$(a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(v) = (a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m)(k_1v_1 + k_2v_2 + \cdots + k_mv_m)$$
$$= k_1a_{1j} + k_2a_{2j} + \cdots + k_ma_{mj} \qquad (4)$$
Since $v \in V$ was arbitrary, (3) and (4) imply that
$$T^t(\sigma_j) = a_{1j}\phi_1 + a_{2j}\phi_2 + \cdots + a_{mj}\phi_m, \qquad j = 1, \ldots, n$$
which is (2). Thus the theorem is proved.
Supplementary Problems
DUAL SPACES AND DUAL BASES
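11.17. Find: (a) $\phi + \sigma$, (b) $3\phi$, (c) $2\phi - 5\sigma$, where $\phi: \mathbf{R}^3 \to \mathbf{R}$ and $\sigma: \mathbf{R}^3 \to \mathbf{R}$ are defined by
$$\phi(x, y, z) = 2x - 3y + z \quad\text{and}\quad \sigma(x, y, z) = 4x - 2y + 3z$$
11.18. Find the dual basis of each of the following bases of $\mathbf{R}^3$: (a) $\{(1, 0, 0), (0, 1, 0), (0, 0, 1)\}$, (b) $\{(1, -2, 3), (1, -1, 1), (2, -4, 7)\}$.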
11.19. Let $V$ be the vector space of polynomials over $\mathbf{R}$ of degree $\le 2$. Let $\phi_1, \phi_2, \phi_3$ be the linear functionals on $V$ defined by
$$\phi_1(f(t)) = \int_0^1 f(t)\,dt, \qquad \phi_2(f(t)) = f'(1), \qquad \phi_3(f(t)) = f(0)$$
Here $f(t) = a + bt + ct^2 \in V$ and $f'(t)$ denotes the derivative of $f(t)$. Find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of $V$ that is dual to $\{\phi_1, \phi_2, \phi_3\}$.
11.20. Suppose $u, v \in V$ and that $\phi(u) = 0$ implies $\phi(v) = 0$ for all $\phi \in V^*$. Show that $v = ku$ for some scalar $k$.
11.21. Suppose $\phi, \sigma \in V^*$ and that $\phi(v) = 0$ implies $\sigma(v) = 0$ for all $v \in V$. Show that $\sigma = k\phi$ for some scalar $k$.
11.22. Let $V$ be the vector space of polynomials over $K$. For $a \in K$, define $\phi_a: V \to K$ by $\phi_a(f(t)) = f(a)$. Show that: (a) $\phi_a$ is linear; (b) if $a \ne b$, then $\phi_a \ne \phi_b$.
11.23. ... the linear functionals defined by $\phi_a(f(t)) = f(a)$, $\phi_b(f(t)) = f(b)$, $\phi_c(f(t)) = f(c)$. Show that $\{\phi_a, \phi_b, \phi_c\}$ is linearly independent, and find the basis $\{f_1(t), f_2(t), f_3(t)\}$ of $V$ that is its dual.
11.24. Let $V$ be the vector space of square matrices of order $n$. Let $T: V \to K$ be the trace mapping; that is, $T(A) = a_{11} + a_{22} + \cdots + a_{nn}$, where $A = (a_{ij})$. Show that $T$ is linear.
11.25. Let $W$ be a subspace of $V$. For any linear functional $\phi$ on $W$, show that there is a linear functional $\sigma$ on $V$ such that $\sigma(w) = \phi(w)$ for any $w \in W$; that is, $\phi$ is the restriction of $\sigma$ to $W$.
11.26. Let $\{e_1, \ldots, e_n\}$ be the usual basis of $K^n$. Show that the dual basis is $\{\pi_1, \ldots, \pi_n\}$ where $\pi_i$ is the $i$th projection mapping; that is, $\pi_i(a_1, \ldots, a_n) = a_i$.
11.27. Let $V$ be a vector space over $\mathbf{R}$. Let $\phi_1, \phi_2 \in V^*$ and suppose $\sigma: V \to \mathbf{R}$, defined by $\sigma(v) = \phi_1(v)\phi_2(v)$,
ANNIHILATORS
11.28. Let $W$ be the subspace of $\mathbf{R}^4$ spanned by $(1, 2, -3, 4)$, $(1, 3, -2, 6)$, $(1, 4, -1, 8)$. Find a basis of the annihilator of $W$.
11.29. Let $W$ be the subspace of $\mathbf{R}^3$ spanned by $(1, 1, 0)$ and $(0, 1, 1)$. Find a basis of the annihilator of $W$.
11.30. Show that, for any subset $S$ of $V$, $\operatorname{span}(S) = S^{00}$, where $\operatorname{span}(S)$ is the linear span of $S$.
11.31. Let $U$ and $W$ be subspaces of a vector space $V$ of finite dimension. Prove that $(U \cap W)^0 = U^0 + W^0$.
11.32. Suppose $V = U \oplus W$. Prove that $V^* = U^0 \oplus W^0$.
TRANSPOSE OF A LINEAR MAPPING
11.33. Let $\phi$ be the linear functional on $\mathbf{R}^2$ defined by $\phi(x, y) = 3x - 2y$. For each of the following linear mappings $T: \mathbf{R}^3 \to \mathbf{R}^2$, find $(T^t(\phi))(x, y, z)$:
(a) $T(x, y, z) = (x + y, y + z)$, (b) $T(x, y, z) = (x + y + z, 2x - y)$
11.34. Suppose $T_1: U \to V$ and $T_2: V \to W$ are linear. Prove that $(T_2 \circ T_1)^t = T_1^t \circ T_2^t$.
11.35. Suppose $T: V \to U$ is linear and $V$ has finite dimension. Prove that $\operatorname{Im} T^t = (\operatorname{Ker} T)^0$.
11.36. Suppose $T: V \to U$ is linear and $u \in U$. Prove that $u \in \operatorname{Im} T$ or there exists $\phi \in U^*$ such that $T^t(\phi) = 0$ and $\phi(u) = 1$.
11.37. Let $V$ be of finite dimension. Show that the mapping $T \mapsto T^t$ is an isomorphism from $\operatorname{Hom}(V, V)$ onto $\operatorname{Hom}(V^*, V^*)$. (Here $T$ is any linear operator on $V$.)
MISCELLANEOUS PROBLEMS
11.38. Let $V$ be a vector space over $\mathbf{R}$. The line segment $\overline{uv}$ joining points $u, v \in V$ is defined by $\overline{uv} = \{tu + (1 - t)v : 0 \le t \le 1\}$. A subset $S$ of $V$ is convex if $u, v \in S$ implies $\overline{uv} \subseteq S$. Let $\phi \in V^*$. Define
$$W^+ = \{v \in V : \phi(v) > 0\}, \qquad W = \{v \in V : \phi(v) = 0\}, \qquad W^- = \{v \in V : \phi(v) < 0\}$$
Prove that $W^+$, $W$, and $W^-$ are convex.
11.39. Let $V$ be a vector space of finite dimension. A hyperplane $H$ of $V$ may be defined as the kernel of a nonzero linear functional $\phi$ on $V$. Show that every subspace of $V$ is the intersection of a finite number of hyperplanes.
Answers to Supplementary Problems
11.17. (a) $6x - 5y + 4z$, (b) $6x - 9y + 3z$, (c) $-16x + 4y - 13z$
11.18. (a) $\phi_1 = x$, $\phi_2 = y$, $\phi_3 = z$, (b) $\phi_1 = -3x - 5y - 2z$, $\phi_2 = 2x + y$, $\phi_3 = x + 2y + z$
11.22. (b) Let $f(t) = t$. Then $\phi_a(f(t)) = a \ne b = \phi_b(f(t))$; and therefore $\phi_a \ne \phi_b$
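11.23. $f_1(t) = \dfrac{t^2 - (b + c)t + bc}{(a - b)(a - c)}$, $f_2(t) = \dfrac{t^2 - (a + c)t + ac}{(b - a)(b - c)}$, $f_3(t) = \dfrac{t^2 - (a + b)t + ab}{(c - a)(c - b)}$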
11.28. $\{\phi_1(x, y, z, t) = 5x - y + z,\ \phi_2(x, y, z, t) = 2y - t\}$
11.29. $\{\phi(x, y, z) = x - y + z\}$
11.33. (a) $(T^t(\phi))(x, y, z) = 3x + y - 2z$, (b) $(T^t(\phi))(x, y, z) = -x + 5y + 3z$
CHAPTER 12
Bilinear, Quadratic,
and Hermitian Forms
12.1 INTRODUCTION
This chapter generalizes the notions of linear mappings and linear functionals. Specifically, we
introduce the notion of a bilinear form. These bilinear maps also give rise to quadratic and Hermitian
forms. Although quadratic forms were discussed previously, this chapter is treated independently of the
previous results.
Although the field $K$ is arbitrary, we will later specialize to the cases $K = \mathbf{R}$ and $K = \mathbf{C}$. Furthermore, we may sometimes need to divide by 2. In such cases, we must assume that $1 + 1 \ne 0$, which is true when $K = \mathbf{R}$ or $K = \mathbf{C}$.
12.2 BILINEAR FORMS
Let $V$ be a vector space of finite dimension over a field $K$. A bilinear form on $V$ is a mapping $f: V \times V \to K$ such that, for all $a, b \in K$ and all $u_i, v_i \in V$:
(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, av_1 + bv_2) = af(u, v_1) + bf(u, v_2)$
We express condition (i) by saying $f$ is linear in the first variable, and condition (ii) by saying $f$ is linear in the second variable.
Example 12.1.
(a) Let $f$ be the dot product on $\mathbf{R}^n$; that is, for $u = (a_i)$ and $v = (b_i)$,
$$f(u, v) = u \cdot v = a_1b_1 + a_2b_2 + \cdots + a_nb_n$$
Then $f$ is a bilinear form on $\mathbf{R}^n$. (In fact, any inner product on a real vector space $V$ is a bilinear form on $V$.)
(b) Let $\phi$ and $\sigma$ be arbitrary linear functionals on $V$. Let $f: V \times V \to K$ be defined by $f(u, v) = \phi(u)\sigma(v)$. Then $f$ is a bilinear form, since $\phi$ and $\sigma$ are each linear.
(c) Let $A = [a_{ij}]$ be any $n \times n$ matrix over a field $K$. Then $A$ may be identified with the following bilinear form $f$ on $K^n$, where $X = [x_i]$ and $Y = [y_i]$ are column vectors of variables:
$$f(X, Y) = X^TAY = \sum_{i,j} a_{ij}x_iy_j = a_{11}x_1y_1 + a_{12}x_1y_2 + \cdots + a_{nn}x_ny_n$$
The above formal expression in the variables $x_i, y_i$ is termed the bilinear polynomial corresponding to the matrix $A$. Equation (12.1) below shows that, in a certain sense, every bilinear form is of this type.
Space of Bilinear Forms
Let $B(V)$ denote the set of all bilinear forms on $V$. A vector space structure is placed on $B(V)$, where for any $f, g \in B(V)$ and any $k \in K$, we define $f + g$ and $kf$ as follows:
$$(f + g)(u, v) = f(u, v) + g(u, v) \quad\text{and}\quad (kf)(u, v) = kf(u, v)$$
The following theorem (proved in Problem 12.4) applies.
Theorem 12.1: Let $V$ be a vector space of dimension $n$ over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus, in particular, $\dim B(V) = n^2$.
12.3 BILINEAR FORMS AND MATRICES
Let $f$ be a bilinear form on $V$ and let $S = \{u_1, \ldots, u_n\}$ be a basis of $V$. Suppose $u, v \in V$ and
$$u = a_1u_1 + \cdots + a_nu_n \quad\text{and}\quad v = b_1u_1 + \cdots + b_nu_n$$
Then
$$f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n) = \sum_{i,j} a_ib_jf(u_i, u_j)$$
Thus $f$ is completely determined by the $n^2$ values $f(u_i, u_j)$.
The matrix $A = [a_{ij}]$ where $a_{ij} = f(u_i, u_j)$ is called the matrix representation of $f$ relative to the basis $S$ or, simply, the "matrix of $f$ in $S$". It "represents" $f$ in the sense that, for all $u, v \in V$,
$$f(u, v) = \sum_{i,j} a_ib_jf(u_i, u_j) = [u]_S^T A [v]_S \qquad (12.1)$$
[As usual, $[u]_S$ denotes the coordinate (column) vector of $u$ in the basis $S$.]
Change of Basis, Congruent Matrices
We now ask, how does a matrix representing a bilinear form transform when a new basis is selected?
The answer is given in the following theorem (proved in Problem 12.5).
Theorem 12.2: Let $P$ be a change-of-basis matrix from one basis $S$ to another basis $S'$. If $A$ is the matrix representing a bilinear form $f$ in the original basis $S$, then $B = P^TAP$ is the matrix representing $f$ in the new basis $S'$.
The above theorem motivates the following definition.
Definition: A matrix $B$ is congruent to a matrix $A$, written $B \simeq A$, if there exists a nonsingular matrix $P$ such that $B = P^TAP$.
Thus, by Theorem 12.2, matrices representing the same bilinear form are congruent. We remark that congruent matrices have the same rank, because $P$ and $P^T$ are nonsingular; hence the rank in the following definition is well defined.
Definition: The rank of a bilinear form $f$ on $V$, written $\operatorname{rank}(f)$, is the rank of any matrix representation of $f$. We say $f$ is degenerate or nondegenerate according as $\operatorname{rank}(f) < \dim V$ or $\operatorname{rank}(f) = \dim V$.
12.4 ALTERNATING BILINEAR FORMS
Let $f$ be a bilinear form on $V$. Then $f$ is called:
(i) alternating if $f(v, v) = 0$ for every $v \in V$;
(ii) skew-symmetric if $f(u, v) = -f(v, u)$ for every $u, v \in V$.
Now suppose (i) is true. Then (ii) is true, since, for any $u, v \in V$,
$$0 = f(u + v, u + v) = f(u, u) + f(u, v) + f(v, u) + f(v, v) = f(u, v) + f(v, u)$$
On the other hand, suppose (ii) is true and also $1 + 1 \ne 0$. Then (i) is true, since, for every $v \in V$, we have $f(v, v) = -f(v, v)$. In other words, alternating and skew-symmetric are equivalent when $1 + 1 \ne 0$.
The main structure theorem of alternating bilinear forms (proved in Problem 12.23) is as follows.
Theorem 12.3: Let $f$ be an alternating bilinear form on $V$. Then there exists a basis of $V$ in which $f$ is represented by a block diagonal matrix $M$ of the form
$$M = \operatorname{diag}\left(\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, \ldots, \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}, [0], [0], \ldots, [0]\right)$$
Moreover, the number of nonzero blocks is uniquely determined by $f$ [since it is equal to $\tfrac{1}{2}\operatorname{rank}(f)$].
In particular, the above theorem shows that any alternating bilinear form must have even rank.
12.5 SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS
This section investigates the important notions of symmetric bilinear forms and quadratic forms and
their representation by means of symmetric matrices. The only restriction on the field $K$ is that $1 + 1 \ne 0$.
In Section 12.6, we will restrict K to be the real field R, which yields important special results.
Symmetric Bilinear Forms
Let $f$ be a bilinear form on $V$. Then $f$ is said to be symmetric if, for every $u, v \in V$,
$$f(u, v) = f(v, u)$$
One can easily show that $f$ is symmetric if and only if any matrix representation $A$ of $f$ is a symmetric matrix.
The main result for symmetric bilinear forms (proved in Problem 12.10) is as follows. (We emphasize that we are assuming that $1 + 1 \ne 0$.)
Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$. Then $V$ has a basis $\{v_1, \ldots, v_n\}$ in which $f$ is represented by a diagonal matrix, that is, where $f(v_i, v_j) = 0$ for $i \ne j$.
Theorem 12.4: (Alternative Form) Let $A$ be a symmetric matrix over $K$. Then $A$ is congruent to a diagonal matrix; that is, there exists a nonsingular matrix $P$ such that $P^TAP$ is diagonal.
Diagonalization Algorithm
Recall that a nonsingular matrix $P$ is a product of elementary matrices. Accordingly, one way of obtaining the diagonal form $D = P^TAP$ is by a sequence of elementary row operations and the same sequence of elementary column operations. This same sequence of elementary row operations on the identity matrix $I$ will yield $P^T$. This algorithm is formalized below.
Algorithm 12.1: (Congruence Diagonalization of a Symmetric Matrix) The input is a symmetric matrix $A = [a_{ij}]$ of order $n$.
Step 1. Form the $n \times 2n$ (block) matrix $M = [A_1, I]$, where $A_1 = A$ is the left half of $M$ and the identity matrix $I$ is the right half of $M$.
Step 2. Examine the entry $a_{11}$. There are three cases.
Case I: $a_{11} \ne 0$. (Use $a_{11}$ as a pivot to put 0's below $a_{11}$ in $M$ and to the right of $a_{11}$ in $A_1$.)
For $i = 2, \ldots, n$:
(a) Apply the row operation "Replace $R_i$ by $-a_{i1}R_1 + a_{11}R_i$".
(b) Apply the corresponding column operation "Replace $C_i$ by $-a_{i1}C_1 + a_{11}C_i$".
These operations reduce the matrix $M$ to the form
$$M \sim \begin{bmatrix} a_{11} & 0 & * & * \\ 0 & A_2 & * & * \end{bmatrix} \qquad (*)$$
Case II: $a_{11} = 0$ but $a_{kk} \ne 0$, for some $k > 1$.
(a) Apply the row operation "Interchange $R_1$ and $R_k$".
(b) Apply the corresponding column operation "Interchange $C_1$ and $C_k$".
(These operations bring $a_{kk}$ into the first diagonal position, which reduces the matrix to Case I.)
Case III: All diagonal entries $a_{ii} = 0$ but some $a_{ij} \ne 0$.
(a) Apply the row operation "Replace $R_i$ by $R_j + R_i$".
(b) Apply the corresponding column operation "Replace $C_i$ by $C_j + C_i$".
(These operations bring $2a_{ij}$ into the $i$th diagonal position, which reduces the matrix to Case II.)
Thus $M$ is finally reduced to the form (*), where $A_2$ is a symmetric matrix of order less than that of $A$.
Step 3. Repeat Step 2 with each new matrix $A_k$ (by neglecting the first row and column of the preceding matrix) until $A$ is diagonalized. Then $M$ is transformed into the form $M' = [D, Q]$, where $D$ is diagonal.
Step 4. Set $P = Q^T$. Then $D = P^TAP$.
Remark 1: We emphasize that in Step 2, the row operations will change both sides of $M$, but the column operations will only change the left half of $M$.
Remark 2: The condition $1 + 1 \ne 0$ is used in Case III, where we assume that $2a_{ij} \ne 0$ when $a_{ij} \ne 0$.
The justification for the above algorithm appears in Problem 12.9.
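The steps of Algorithm 12.1 can be sketched in code. The following Python function is an illustrative sketch under stated assumptions, not the text's own implementation: the name `congruence_diagonalize` and its helpers are hypothetical, exact rational arithmetic (`fractions.Fraction`) stands in for a field with $1 + 1 \ne 0$, and the row and column operations are interleaved pair by pair, which gives the same product of elementary matrices as doing all row operations first.

```python
from fractions import Fraction

def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, for a symmetric matrix A
    given as a list of rows. Arithmetic is exact (rationals)."""
    n = len(A)
    # Step 1: M = [A | I]; columns 0..n-1 form the left half.
    M = [[Fraction(A[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]

    def row_op(i, j, a, b):
        # R_i <- a*R_j + b*R_i, applied to both halves of M.
        M[i] = [a * x + b * y for x, y in zip(M[j], M[i])]

    def col_op(i, j, a, b):
        # C_i <- a*C_j + b*C_i, applied to the left half only.
        for r in range(n):
            M[r][i] = a * M[r][j] + b * M[r][i]

    def swap(i, j):
        # Interchange R_i and R_j (whole rows), then C_i and C_j (left half).
        M[i], M[j] = M[j], M[i]
        for r in range(n):
            M[r][i], M[r][j] = M[r][j], M[r][i]

    for k in range(n):
        if M[k][k] == 0:
            # Case II: look for a nonzero diagonal entry further down.
            p = next((i for i in range(k + 1, n) if M[i][i] != 0), None)
            if p is None:
                # Case III: use a nonzero off-diagonal entry a_ij, if any.
                pair = next(((i, j) for i in range(k, n)
                             for j in range(i + 1, n) if M[i][j] != 0), None)
                if pair is None:
                    break  # remaining block is zero, so M is already diagonal
                i, j = pair
                row_op(i, j, 1, 1)
                col_op(i, j, 1, 1)
                p = i
            if p != k:
                swap(k, p)
        # Case I: clear row k and column k using the pivot a_kk.
        pivot = M[k][k]
        for i in range(k + 1, n):
            a = M[i][k]
            row_op(i, k, -a, pivot)
            col_op(i, k, -a, pivot)

    D = [row[:n] for row in M]
    Q = [row[n:] for row in M]
    P = [[Q[j][i] for j in range(n)] for i in range(n)]  # Step 4: P = Q^T
    return D, P

# Usage: a symmetric 3 x 3 matrix with a nonzero first pivot.
D, P = congruence_diagonalize([[1, 2, -3], [2, 5, -4], [-3, -4, 8]])
```

Since congruence diagonalization is not unique, a different but equally valid choice of pivots would give a different $D$ and $P$; only the relation $D = P^TAP$ and, over $\mathbf{R}$, the signs of the diagonal entries are forced.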
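Example 12.2. Let $A = \begin{bmatrix} 1 & 2 & -3 \\ 2 & 5 & -4 \\ -3 & -4 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such that $D = P^TAP$ is diagonal.
First form the block matrix $M = [A, I]$; that is, let
$$M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 2 & 5 & -4 & 0 & 1 & 0 \\ -3 & -4 & 8 & 0 & 0 & 1 \end{array}\right]$$
Apply the row operations "Replace $R_2$ by $-2R_1 + R_2$" and "Replace $R_3$ by $3R_1 + R_3$" to $M$, and then apply the corresponding column operations "Replace $C_2$ by $-2C_1 + C_2$" and "Replace $C_3$ by $3C_1 + C_3$" to obtain
$$\left[\begin{array}{rrr|rrr} 1 & 2 & -3 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 2 & -1 & 3 & 0 & 1 \end{array}\right]$$
Next apply the row operation "Replace $R_3$ by $-2R_2 + R_3$" and then the corresponding column operation "Replace $C_3$ by $-2C_2 + C_3$" to obtain
$$\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 2 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right] \quad\text{and then}\quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & -2 & 1 & 0 \\ 0 & 0 & -5 & 7 & -2 & 1 \end{array}\right]$$
Now $A$ has been diagonalized. Set
$$P = \begin{bmatrix} 1 & -2 & 7 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix} \quad\text{and then}\quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -5 \end{bmatrix}$$
We emphasize that $P$ is the transpose of the right half of the final matrix.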
Quadratic Forms
We begin with a definition.
Definition A: A mapping $q: V \to K$ is a quadratic form if $q(v) = f(v, v)$ for some symmetric bilinear form $f$ on $V$.
If $1 + 1 \ne 0$ in $K$, then the bilinear form $f$ can be obtained from the quadratic form $q$ by the following polar form of $f$:
$$f(u, v) = \tfrac{1}{2}[q(u + v) - q(u) - q(v)]$$
Now suppose $f$ is represented by a symmetric matrix $A = [a_{ij}]$, and $1 + 1 \ne 0$. Letting $X = [x_i]$ denote a column vector of variables, $q$ can be represented in the form
$$q(X) = f(X, X) = X^TAX = \sum_{i,j} a_{ij}x_ix_j = \sum_i a_{ii}x_i^2 + 2\sum_{i<j} a_{ij}x_ix_j$$
The above formal expression in the variables $x_i$ is also called a quadratic form. Namely, we have the following second definition.
Definition B: A quadratic form $q$ in variables $x_1, x_2, \ldots, x_n$ is a polynomial such that every term has degree two; that is,
$$q(x_1, x_2, \ldots, x_n) = \sum_i c_ix_i^2 + \sum_{i<j} d_{ij}x_ix_j$$
Using $1 + 1 \ne 0$, the quadratic form $q$ in Definition B determines a symmetric matrix $A = [a_{ij}]$ where $a_{ii} = c_i$ and $a_{ij} = a_{ji} = \tfrac{1}{2}d_{ij}$. Thus Definitions A and B are essentially the same.
If the matrix representation $A$ of $q$ is diagonal, then $q$ has the diagonal representation
$$q(X) = X^TAX = a_{11}x_1^2 + a_{22}x_2^2 + \cdots + a_{nn}x_n^2$$
That is, the quadratic polynomial representing $q$ will contain no "cross product" terms. Moreover, by Theorem 12.4, every quadratic form has such a representation (when $1 + 1 \ne 0$).
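The passage from Definition B to a symmetric matrix can be sketched as follows. This is a hypothetical illustration: the function names, the dictionary encoding of the coefficients $c_i$ and $d_{ij}$ (with 0-based indices), and the sample form $3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$ are all assumptions chosen for the example.

```python
from fractions import Fraction

def symmetric_matrix(n, squares, cross):
    """Build A with a_ii = c_i and a_ij = a_ji = d_ij / 2.

    squares maps i -> c_i; cross maps (i, j) with i < j -> d_ij.
    Indices are 0-based; halving requires 1 + 1 != 0, as in the text."""
    A = [[Fraction(0)] * n for _ in range(n)]
    for i, c in squares.items():
        A[i][i] = Fraction(c)
    for (i, j), d in cross.items():
        A[i][j] = A[j][i] = Fraction(d) / 2
    return A

def q_value(A, x):
    """Evaluate q(X) = X^T A X as the double sum of a_ij * x_i * x_j."""
    n = len(A)
    return sum(A[i][j] * x[i] * x[j] for i in range(n) for j in range(n))

# Illustrative form in x, y, z: 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2
A = symmetric_matrix(3,
                     squares={0: 3, 1: -1, 2: 1},
                     cross={(0, 1): 4, (0, 2): 8, (1, 2): -6})
value = q_value(A, (1, 1, 1))   # the sum of all six coefficients
```

Each cross-product coefficient is split evenly between the two symmetric positions, so evaluating $X^TAX$ recovers the original polynomial.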
12.6 REAL SYMMETRIC BILINEAR FORMS, LAW OF INERTIA
This section treats symmetric bilinear forms and quadratic forms on vector spaces V over the real field
R. The special nature of R permits an independent theory. The main result (proved in Problem 12.14) is as
follows.
Theorem 12.5: Let $f$ be a symmetric form on $V$ over $\mathbf{R}$. Then there exists a basis of $V$ in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has the same number $p$ of positive entries and the same number $n$ of negative entries.
The above result is sometimes called the Law of Inertia or Sylvester's Theorem. The rank and signature of the symmetric bilinear form $f$ are denoted and defined by
$$\operatorname{rank}(f) = p + n \quad\text{and}\quad \operatorname{sig}(f) = p - n$$
These are uniquely defined by Theorem 12.5.
A real symmetric bilinear form $f$ is said to be:
(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \ne 0$,
(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.
Example 12.3. Let $f$ be the dot product on $\mathbf{R}^n$. Recall that $f$ is a symmetric bilinear form on $\mathbf{R}^n$. We note that $f$ is also positive definite. That is, for any $u = (a_i) \ne 0$ in $\mathbf{R}^n$,
$$f(u, u) = a_1^2 + a_2^2 + \cdots + a_n^2 > 0$$
Section 12.5 and Chapter 13 tell us how to diagonalize a real quadratic form $q$ or, equivalently, a real symmetric matrix $A$ by means of an orthogonal transition matrix $P$. If $P$ is merely nonsingular, then $q$ can be represented in diagonal form with only 1's and $-1$'s as nonzero coefficients. Namely, we have the following corollary.
Corollary 12.6: Any real quadratic form $q$ has a unique representation in the form
$$q(x_1, x_2, \ldots, x_n) = x_1^2 + \cdots + x_p^2 - x_{p+1}^2 - \cdots - x_r^2$$
where $r = p + n$ is the rank of the form.
Corollary 12.6: (Alternative Form) Any real symmetric matrix $A$ is congruent to the unique diagonal matrix
$$D = \operatorname{diag}(I_p, -I_n, 0)$$
where $r = p + n$ is the rank of $A$.
12.7 HERMITIAN FORMS
Let $V$ be a vector space of finite dimension over the complex field $\mathbf{C}$. A Hermitian form on $V$ is a mapping $f: V \times V \to \mathbf{C}$ such that, for all $a, b \in \mathbf{C}$ and all $u_i, v \in V$,
(i) $f(au_1 + bu_2, v) = af(u_1, v) + bf(u_2, v)$,
(ii) $f(u, v) = \overline{f(v, u)}$.
(As usual, $\bar{k}$ denotes the complex conjugate of $k \in \mathbf{C}$.)
Using (i) and (ii), we get
$$f(u, av_1 + bv_2) = \overline{f(av_1 + bv_2, u)} = \overline{af(v_1, u) + bf(v_2, u)}$$
$$= \bar{a}\,\overline{f(v_1, u)} + \bar{b}\,\overline{f(v_2, u)} = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)$$
That is,
(iii) $f(u, av_1 + bv_2) = \bar{a}f(u, v_1) + \bar{b}f(u, v_2)$.
As before, we express condition (i) by saying $f$ is linear in the first variable. On the other hand, we express condition (iii) by saying $f$ is "conjugate linear" in the second variable. Moreover, condition (ii) tells us that $f(v, v) = \overline{f(v, v)}$, and hence $f(v, v)$ is real for every $v \in V$.
The results of Sections 12.5 and 12.6 for symmetric forms have their analogues for Hermitian forms. Thus, the mapping $q: V \to \mathbf{R}$, defined by $q(v) = f(v, v)$, is called the Hermitian quadratic form or complex quadratic form associated with the Hermitian form $f$. We can obtain $f$ from $q$ by the polar form
$$f(u, v) = \tfrac{1}{4}[q(u + v) - q(u - v)] + \tfrac{i}{4}[q(u + iv) - q(u - iv)]$$
Now suppose $S = \{u_1, \ldots, u_n\}$ is a basis of $V$. The matrix $H = [h_{ij}]$ where $h_{ij} = f(u_i, u_j)$ is called the matrix representation of $f$ in the basis $S$. By (ii), $f(u_i, u_j) = \overline{f(u_j, u_i)}$; hence $H$ is Hermitian and, in particular, the diagonal entries of $H$ are real. Thus any diagonal representation of $f$ contains only real entries.
The next theorem (to be proved in Problem 12.47) is the complex analog of Theorem 12.5 on real symmetric bilinear forms.
Theorem 12.7: Let $f$ be a Hermitian form on $V$ over $\mathbf{C}$. Then there exists a basis of $V$ in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has the same number $p$ of positive entries and the same number $n$ of negative entries.
Again the rank and signature of the Hermitian form $f$ are denoted and defined by
$$\operatorname{rank}(f) = p + n \quad\text{and}\quad \operatorname{sig}(f) = p - n$$
These are uniquely defined, by Theorem 12.7.
Analogously, a Hermitian form $f$ is said to be:
(i) positive definite if $q(v) = f(v, v) > 0$ for every $v \ne 0$,
(ii) nonnegative semidefinite if $q(v) = f(v, v) \ge 0$ for every $v$.
Example 12.4. Let $f$ be the dot product on $\mathbf{C}^n$; that is, for any $u = (z_i)$ and $v = (w_i)$ in $\mathbf{C}^n$,
$$f(u, v) = u \cdot v = z_1\bar{w}_1 + z_2\bar{w}_2 + \cdots + z_n\bar{w}_n$$
Then $f$ is a Hermitian form on $\mathbf{C}^n$. Moreover, $f$ is also positive definite, since, for any $u = (z_i) \ne 0$ in $\mathbf{C}^n$,
$$f(u, u) = z_1\bar{z}_1 + z_2\bar{z}_2 + \cdots + z_n\bar{z}_n = |z_1|^2 + |z_2|^2 + \cdots + |z_n|^2 > 0$$
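The Hermitian polar form of this section can be checked numerically on the dot product of Example 12.4. The sketch below is illustrative only: the function names and the two sample vectors in $\mathbf{C}^2$ are assumptions, and the check uses the convention of this section (linear in the first variable, conjugate linear in the second).

```python
def f(u, v):
    """Dot product on C^n: linear in u, conjugate linear in v."""
    return sum(z * w.conjugate() for z, w in zip(u, v))

def q(v):
    """Hermitian quadratic form q(v) = f(v, v), which is always real."""
    return f(v, v).real

def combo(u, v, c):
    """Return the vector u + c*v."""
    return [a + c * b for a, b in zip(u, v)]

def polar(u, v):
    """Recover f(u, v) from q by the polar form:
    (1/4)[q(u+v) - q(u-v)] + (i/4)[q(u+iv) - q(u-iv)]."""
    real_part = (q(combo(u, v, 1)) - q(combo(u, v, -1))) / 4
    imag_part = (q(combo(u, v, 1j)) - q(combo(u, v, -1j))) / 4
    return real_part + 1j * imag_part

# Two sample vectors in C^2 (chosen only for illustration).
u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]

direct = f(u, v)
recovered = polar(u, v)
```

The values `direct` and `recovered` agree, which reflects that the first bracket yields four times the real part of $f(u, v)$ and the second yields four times its imaginary part.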
Solved Problems
BILINEAR FORMS
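12.1. Let $u = (x_1, x_2, x_3)$ and $v = (y_1, y_2, y_3)$. Express $f$ in matrix notation, where
$$f(u, v) = 3x_1y_1 - 2x_1y_3 + 5x_2y_1 + 7x_2y_2 - 8x_2y_3 + 4x_3y_2 - 6x_3y_3$$
Let $A = [a_{ij}]$, where $a_{ij}$ is the coefficient of $x_iy_j$. Then
$$f(u, v) = X^TAY = [x_1, x_2, x_3] \begin{bmatrix} 3 & 0 & -2 \\ 5 & 7 & -8 \\ 0 & 4 & -6 \end{bmatrix} \begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix}$$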
12.2. Let $A$ be an $n \times n$ matrix over $K$. Show that the mapping $f$ defined by $f(X, Y) = X^TAY$ is a bilinear form on $K^n$.
For any $a, b \in K$ and any $X_i, Y_i \in K^n$,
$$f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^TAY = (aX_1^T + bX_2^T)AY$$
$$= aX_1^TAY + bX_2^TAY = af(X_1, Y) + bf(X_2, Y)$$
Hence $f$ is linear in the first variable. Also,
$$f(X, aY_1 + bY_2) = X^TA(aY_1 + bY_2) = aX^TAY_1 + bX^TAY_2 = af(X, Y_1) + bf(X, Y_2)$$
Hence $f$ is linear in the second variable, and so $f$ is a bilinear form on $K^n$.
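12.3. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
$$f[(x_1, x_2), (y_1, y_2)] = 2x_1y_1 - 3x_1y_2 + 4x_2y_2$$
(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 0),\ u_2 = (1, 1)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (2, 1),\ v_2 = (1, -1)\}$.
(c) Find the change-of-basis matrix $P$ from the basis $\{u_i\}$ to the basis $\{v_i\}$, and verify that $B = P^TAP$.
(a) Set $A = [a_{ij}]$, where $a_{ij} = f(u_i, u_j)$. This yields
$$a_{11} = f[(1, 0), (1, 0)] = 2 - 0 - 0 = 2, \qquad a_{21} = f[(1, 1), (1, 0)] = 2 - 0 + 0 = 2$$
$$a_{12} = f[(1, 0), (1, 1)] = 2 - 3 - 0 = -1, \qquad a_{22} = f[(1, 1), (1, 1)] = 2 - 3 + 4 = 3$$
Thus $A = \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix}$ is the matrix of $f$ in the basis $\{u_1, u_2\}$.
(b) Set $B = [b_{ij}]$, where $b_{ij} = f(v_i, v_j)$. This yields
$$b_{11} = f[(2, 1), (2, 1)] = 8 - 6 + 4 = 6, \qquad b_{21} = f[(1, -1), (2, 1)] = 4 - 3 - 4 = -3$$
$$b_{12} = f[(2, 1), (1, -1)] = 4 + 6 - 4 = 6, \qquad b_{22} = f[(1, -1), (1, -1)] = 2 + 3 + 4 = 9$$
Thus $B = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix}$ is the matrix of $f$ in the basis $\{v_1, v_2\}$.
(c) Writing $v_1$ and $v_2$ in terms of the $u_i$ yields $v_1 = u_1 + u_2$ and $v_2 = 2u_1 - u_2$. Then
$$P = \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix}, \qquad P^T = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix}$$
and
$$P^TAP = \begin{bmatrix} 1 & 1 \\ 2 & -1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ 2 & 3 \end{bmatrix} \begin{bmatrix} 1 & 2 \\ 1 & -1 \end{bmatrix} = \begin{bmatrix} 6 & 6 \\ -3 & 9 \end{bmatrix} = B$$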
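The computations of Problem 12.3 can be reproduced with a few lines of code. The sketch below is an illustration only; the helper names `gram`, `matmul`, and `transpose` are assumptions. It builds the matrix of $f$ in each basis from the values $f(u_i, u_j)$ and then checks the congruence $B = P^TAP$.

```python
def f(x, y):
    """The bilinear form of Problem 12.3 on R^2."""
    return 2 * x[0] * y[0] - 3 * x[0] * y[1] + 4 * x[1] * y[1]

def gram(form, basis):
    """Matrix [form(b_i, b_j)] of a bilinear form in the given basis."""
    return [[form(u, v) for v in basis] for u in basis]

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def transpose(X):
    return [list(row) for row in zip(*X)]

A = gram(f, [(1, 0), (1, 1)])     # matrix of f in the basis {u1, u2}
B = gram(f, [(2, 1), (1, -1)])    # matrix of f in the basis {v1, v2}

# Columns of P are the coordinates of v1 = u1 + u2 and v2 = 2u1 - u2.
P = [[1, 2], [1, -1]]
congruent = matmul(matmul(transpose(P), A), P)
```

Here `congruent` equals `B`, as Theorem 12.2 predicts.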
12.4. Prove Theorem 12.1: Let $V$ be an $n$-dimensional vector space over $K$. Let $\{\phi_1, \ldots, \phi_n\}$ be any basis of the dual space $V^*$. Then $\{f_{ij} : i, j = 1, \ldots, n\}$ is a basis of $B(V)$, where $f_{ij}$ is defined by $f_{ij}(u, v) = \phi_i(u)\phi_j(v)$. Thus $\dim B(V) = n^2$.
Let $\{u_1, \ldots, u_n\}$ be the basis of $V$ dual to $\{\phi_i\}$. We first show that $\{f_{ij}\}$ spans $B(V)$. Let $f \in B(V)$ and suppose $f(u_i, u_j) = a_{ij}$. We claim that $f = \sum_{i,j} a_{ij}f_{ij}$. It suffices to show that
$$f(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) \quad\text{for } s, t = 1, \ldots, n$$
We have
$$\Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = \sum a_{ij}f_{ij}(u_s, u_t) = \sum a_{ij}\phi_i(u_s)\phi_j(u_t) = \sum a_{ij}\delta_{is}\delta_{jt} = a_{st} = f(u_s, u_t)$$
as required. Hence $\{f_{ij}\}$ spans $B(V)$. Next, suppose $\sum a_{ij}f_{ij} = 0$. Then for $s, t = 1, \ldots, n$,
$$0 = 0(u_s, u_t) = \Big(\sum a_{ij}f_{ij}\Big)(u_s, u_t) = a_{st}$$
The last step follows as above. Thus $\{f_{ij}\}$ is independent, and hence is a basis of $B(V)$.
12.5. Prove Theorem 12.2. Let $P$ be the change-of-basis matrix from a basis $S$ to a basis $S'$. Let $A$ be the matrix representing a bilinear form $f$ in the basis $S$. Then $B = P^TAP$ is the matrix representing $f$ in the basis $S'$.
Let $u, v \in V$. Since $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and also $P[v]_{S'} = [v]_S$; hence $[u]_S^T = [u]_{S'}^TP^T$. Thus
$$f(u, v) = [u]_S^T A [v]_S = [u]_{S'}^T P^TAP [v]_{S'}$$
Since $u$ and $v$ are arbitrary elements of $V$, $P^TAP$ is the matrix of $f$ in the basis $S'$.
SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS
12.6. Find the symmetric matrix that corresponds to each of the following quadratic forms:
(a) $q(x, y, z) = 3x^2 + 4xy - y^2 + 8xz - 6yz + z^2$,
(b) $q'(x, y, z) = 3x^2 + xz - 2yz$, (c) $q''(x, y, z) = 2x^2 - 5y^2 - 7z^2$
The symmetric matrix $A = [a_{ij}]$ that represents $q(x_1, \ldots, x_n)$ has the diagonal entry $a_{ii}$ equal to the coefficient of the square term $x_i^2$ and the nondiagonal entries $a_{ij}$ and $a_{ji}$ each equal to half of the coefficient of the cross product term $x_ix_j$. Thus
$$\text{(a) } A = \begin{bmatrix} 3 & 2 & 4 \\ 2 & -1 & -3 \\ 4 & -3 & 1 \end{bmatrix}, \qquad \text{(b) } A' = \begin{bmatrix} 3 & 0 & \tfrac{1}{2} \\ 0 & 0 & -1 \\ \tfrac{1}{2} & -1 & 0 \end{bmatrix}, \qquad \text{(c) } A'' = \begin{bmatrix} 2 & 0 & 0 \\ 0 & -5 & 0 \\ 0 & 0 & -7 \end{bmatrix}$$
The third matrix $A''$ is diagonal, since the quadratic form $q''$ is diagonal, that is, $q''$ has no cross product terms.
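12.7. Find the quadratic form $q(X)$ that corresponds to each of the following symmetric matrices:
$$\text{(a) } A = \begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix}, \qquad \text{(b) } B = \begin{bmatrix} 4 & -5 & 7 \\ -5 & -6 & 8 \\ 7 & 8 & -9 \end{bmatrix}, \qquad \text{(c) } C = \begin{bmatrix} 2 & 4 & -1 & 5 \\ 4 & -7 & -6 & 8 \\ -1 & -6 & 3 & 9 \\ 5 & 8 & 9 & 1 \end{bmatrix}$$
The quadratic form $q(X)$ that corresponds to a symmetric matrix $M$ is defined by $q(X) = X^TMX$, where $X = [x_i]$ is the column vector of unknowns.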
(a) Compute as follows:
$$q(x, y) = X^TAX = [x, y]\begin{bmatrix} 5 & -3 \\ -3 & 8 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix} = [5x - 3y,\ -3x + 8y]\begin{bmatrix} x \\ y \end{bmatrix}$$
$$= 5x^2 - 3xy - 3xy + 8y^2 = 5x^2 - 6xy + 8y^2$$
As expected, the coefficient 5 of the square term $x^2$ and the coefficient 8 of the square term $y^2$ are the
diagonal elements of $A$, and the coefficient $-6$ of the cross product term $xy$ is the sum of the nondiagonal
elements $-3$ and $-3$ of $A$ (or: twice the nondiagonal element $-3$, since $A$ is symmetric).
(b) Since $B$ is a 3-square matrix, there are three unknowns, say $x, y, z$ or $x_1, x_2, x_3$. Then
$$q(x, y, z) = 4x^2 - 10xy - 6y^2 + 14xz + 16yz - 9z^2$$
or
$$q(x_1, x_2, x_3) = 4x_1^2 - 10x_1x_2 - 6x_2^2 + 14x_1x_3 + 16x_2x_3 - 9x_3^2$$
Here we use the fact that the coefficients of the square terms $x_1^2, x_2^2, x_3^2$ (or $x^2, y^2, z^2$) are the respective
diagonal elements $4, -6, -9$ of $B$, and the coefficient of the cross product term $x_ix_j$ is the sum of the
nondiagonal elements $b_{ij}$ and $b_{ji}$ (or twice $b_{ij}$, since $b_{ij} = b_{ji}$).
(c) Since $C$ is a 4-square matrix, there are four unknowns. Hence
$$q(x_1, x_2, x_3, x_4) = 2x_1^2 - 7x_2^2 + 3x_3^2 + x_4^2 + 8x_1x_2 - 2x_1x_3 + 10x_1x_4 - 12x_2x_3 + 16x_2x_4 + 18x_3x_4$$
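12.8. Let $A = \begin{bmatrix} 1 & -3 & 2 \\ -3 & 7 & -5 \\ 2 & -5 & 8 \end{bmatrix}$. Apply Algorithm 12.1 to find a nonsingular matrix $P$ such that
$D = P^TAP$ is diagonal, and find $\mathrm{sig}(A)$, the signature of $A$.
First form the block matrix $M = [A, I]$:
$$M = [A, I] = \left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ -3 & 7 & -5 & 0 & 1 & 0 \\ 2 & -5 & 8 & 0 & 0 & 1 \end{array}\right]$$
Using $a_{11} = 1$ as a pivot, apply the row operations "Replace $R_2$ by $3R_1 + R_2$" and "Replace $R_3$ by
$-2R_1 + R_3$" to $M$ and then apply the corresponding column operations "Replace $C_2$ by $3C_1 + C_2$" and
"Replace $C_3$ by $-2C_1 + C_3$" to $A$ to obtain
$$\left[\begin{array}{rrr|rrr} 1 & -3 & 2 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right] \quad \text{and then} \quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 1 & 4 & -2 & 0 & 1 \end{array}\right]$$
Next apply the row operation "Replace $R_3$ by $R_2 + 2R_3$" and then the corresponding column operation
"Replace $C_3$ by $C_2 + 2C_3$" to obtain
$$\left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 1 & 3 & 1 & 0 \\ 0 & 0 & 9 & -1 & 1 & 2 \end{array}\right] \quad \text{and then} \quad \left[\begin{array}{rrr|rrr} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & -2 & 0 & 3 & 1 & 0 \\ 0 & 0 & 18 & -1 & 1 & 2 \end{array}\right]$$
Now $A$ has been diagonalized and the transpose of $P$ is in the right half of $M$. Thus set
$$P = \begin{bmatrix} 1 & 3 & -1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and then} \quad D = P^TAP = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -2 & 0 \\ 0 & 0 & 18 \end{bmatrix}$$
Note $D$ has $p = 2$ positive and $n = 1$ negative diagonal elements. Thus the signature of $A$ is
$\mathrm{sig}(A) = p - n = 2 - 1 = 1$.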
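The procedure of Problem 12.8 can be mechanized. The following Python sketch is an illustration, not Algorithm 12.1 verbatim: it assumes every diagonal pivot it meets is nonzero (the zero-pivot cases of the algorithm are not handled), and it clears each column with the multiplier $-a_{ik}/a_{kk}$ instead of the integer combinations used above, so it may return a different $P$ and $D$ than the hand computation, though with the same signature. The names `congruence_diagonalize` and `signature` are hypothetical.

```python
from fractions import Fraction


def congruence_diagonalize(A):
    """Return (D, P) with D = P^T A P diagonal, for a symmetric matrix A.

    Sketch only: raises ValueError if a zero pivot is met with nonzero
    entries below it (those cases of the algorithm are omitted here).
    """
    n = len(A)
    # Block matrix M = [A | I] in exact rational arithmetic.
    M = [[Fraction(A[i][j]) for j in range(n)]
         + [Fraction(int(i == j)) for j in range(n)]
         for i in range(n)]
    for k in range(n):
        if M[k][k] == 0:
            if any(M[i][k] != 0 for i in range(k + 1, n)):
                raise ValueError("zero pivot: case not handled in this sketch")
            continue
        mult = [(i, -M[i][k] / M[k][k]) for i in range(k + 1, n)]
        # Row operations R_i <- c*R_k + R_i, applied to all of M.
        for i, c in mult:
            for j in range(2 * n):
                M[i][j] += c * M[k][j]
        # Corresponding column operations C_i <- c*C_k + C_i, left block only.
        for i, c in mult:
            for r in range(n):
                M[r][i] += c * M[r][k]
    D = [row[:n] for row in M]
    # The right block holds P^T, so transpose it to obtain P.
    P = [[M[j][n + i] for j in range(n)] for i in range(n)]
    return D, P


def signature(D):
    """Number of positive minus number of negative diagonal entries of D."""
    diag = [D[i][i] for i in range(len(D))]
    return sum(d > 0 for d in diag) - sum(d < 0 for d in diag)


A = [[1, -3, 2], [-3, 7, -5], [2, -5, 8]]
D, P = congruence_diagonalize(A)
print([D[i][i] for i in range(3)], signature(D))
```

Exact `Fraction` arithmetic is used so that the test for a zero pivot is reliable; with floating-point numbers that comparison would need a tolerance.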
12.9. Justify Algorithm 12.1, which diagonalizes (under congruence) a symmetric matrix $A$.
Consider the block matrix $M = [A, I]$. The algorithm applies a sequence of elementary row operations
and the corresponding column operations to the left side of $M$, which is the matrix $A$. This is equivalent to
premultiplying $A$ by a sequence of elementary matrices, say, $E_1, E_2, \ldots, E_r$, and postmultiplying $A$ by the
transposes of the $E_i$. Thus when the algorithm ends, the diagonal matrix $D$ on the left side of $M$ is equal to
$$D = E_r \cdots E_2E_1\,A\,E_1^TE_2^T \cdots E_r^T = QAQ^T, \quad \text{where } Q = E_r \cdots E_2E_1$$
On the other hand, the algorithm only applies the elementary row operations to the identity matrix $I$ on the
right side of $M$. Thus when the algorithm ends, the matrix on the right side of $M$ is equal to
$$E_r \cdots E_2E_1I = E_r \cdots E_2E_1 = Q$$
Setting $P = Q^T$, we get $D = P^TAP$, which is a diagonalization of $A$ under congruence.
12.10. Prove Theorem 12.4: Let $f$ be a symmetric bilinear form on $V$ over $K$ (where $1 + 1 \ne 0$). Then $V$
has a basis in which $f$ is represented by a diagonal matrix.
Algorithm 12.1 shows that every symmetric matrix over $K$ is congruent to a diagonal matrix. This is
equivalent to the statement that $f$ has a diagonal representation.
12.11. Let $q$ be the quadratic form associated with the symmetric bilinear form $f$. Verify the polar identity
$f(u, v) = \tfrac12[q(u + v) - q(u) - q(v)]$. (Assume that $1 + 1 \ne 0$.)
We have
$$q(u + v) - q(u) - q(v) = f(u + v, u + v) - f(u, u) - f(v, v)$$
$$= f(u, u) + f(u, v) + f(v, u) + f(v, v) - f(u, u) - f(v, v) = 2f(u, v)$$
If $1 + 1 \ne 0$, we can divide by 2 to obtain the required identity.
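12.12. Consider the quadratic form $q(x, y) = 3x^2 + 2xy - y^2$ and the linear substitution
$$x = s - 3t, \quad y = 2s + t$$
(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing $q(x, y)$.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to
the substitution.
(c) Find $q(s, t)$ using direct substitution.
(d) Find $q(s, t)$ using matrix notation.
(a) Here $q(x, y) = [x, y]\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} x \\ y \end{bmatrix}$. Thus $A = \begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}$; and $q(X) = X^TAX$, where $X = [x, y]^T$.
(b) Here $\begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix}$. Thus $P = \begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}$; and $X = [x, y]^T$, $Y = [s, t]^T$,
and $X = PY$.
(c) Substitute for $x$ and $y$ in $q$ to obtain
$$q(s, t) = 3(s - 3t)^2 + 2(s - 3t)(2s + t) - (2s + t)^2$$
$$= 3(s^2 - 6st + 9t^2) + 2(2s^2 - 5st - 3t^2) - (4s^2 + 4st + t^2) = 3s^2 - 32st + 20t^2$$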
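(d) Here $q(X) = X^TAX$ and $X = PY$. Thus $X^T = Y^TP^T$. Therefore
$$q(s, t) = q(Y) = Y^TP^TAPY = [s, t]\begin{bmatrix} 1 & 2 \\ -3 & 1 \end{bmatrix}\begin{bmatrix} 3 & 1 \\ 1 & -1 \end{bmatrix}\begin{bmatrix} 1 & -3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix}$$
$$= [s, t]\begin{bmatrix} 3 & -16 \\ -16 & 20 \end{bmatrix}\begin{bmatrix} s \\ t \end{bmatrix} = 3s^2 - 32st + 20t^2$$
[As expected, the results in parts (c) and (d) are equal.]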
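The agreement between parts (c) and (d) can also be checked numerically. The short Python sketch below computes $P^TAP$ for the matrices of this problem and compares the resulting form with direct substitution on a grid of integer points; the helper names (`transpose`, `matmul`, `q_matrix`, `q_direct`) are illustrative choices, not notation from the text.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


def q_matrix(B, s, t):
    """Evaluate [s, t] B [s, t]^T for a 2x2 matrix B."""
    return B[0][0] * s * s + (B[0][1] + B[1][0]) * s * t + B[1][1] * t * t


def q_direct(s, t):
    """Substitute x = s - 3t, y = 2s + t into q(x, y) = 3x^2 + 2xy - y^2."""
    x, y = s - 3 * t, 2 * s + t
    return 3 * x * x + 2 * x * y - y * y


A = [[3, 1], [1, -1]]     # matrix of q(x, y)
P = [[1, -3], [2, 1]]     # matrix of the substitution X = PY
B = matmul(matmul(transpose(P), A), P)   # matrix of q in the variables s, t

agree = all(q_direct(s, t) == q_matrix(B, s, t)
            for s in range(-3, 4) for t in range(-3, 4))
print(B, agree)
```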
12.13. Consider any diagonal matrix $A = \mathrm{diag}(a_1, \ldots, a_n)$ over $K$. Show that, for any nonzero scalars
$k_1, \ldots, k_n \in K$, $A$ is congruent to a diagonal matrix $D$ with diagonal entries $a_1k_1^2, \ldots, a_nk_n^2$.
Furthermore, show that:
(a) If $K = \mathbf{C}$, then we can choose $D$ so that its diagonal entries are only 1's and 0's.
(b) If $K = \mathbf{R}$, then we can choose $D$ so that its diagonal entries are only 1's, $-1$'s, and 0's.
Let $P = \mathrm{diag}(k_1, \ldots, k_n)$. Then, as required,
$$D = P^TAP = \mathrm{diag}(k_i)\,\mathrm{diag}(a_i)\,\mathrm{diag}(k_i) = \mathrm{diag}(a_1k_1^2, \ldots, a_nk_n^2)$$
(a) Let $P = \mathrm{diag}(b_i)$, where $b_i = 1/\sqrt{a_i}$ if $a_i \ne 0$ and $b_i = 1$ if $a_i = 0$.
Then $P^TAP$ has the required form.
(b) Let $P = \mathrm{diag}(b_i)$, where $b_i = 1/\sqrt{|a_i|}$ if $a_i \ne 0$ and $b_i = 1$ if $a_i = 0$.
Then $P^TAP$ has the required form.
Remark: We emphasize that (b) is no longer true if "congruence" is replaced by "Hermitian
congruence".
12.14. Prove Theorem 12.5: Let $f$ be a symmetric bilinear form on $V$ over $\mathbf{R}$. Then there exists a basis of $V$
in which $f$ is represented by a diagonal matrix. Every other diagonal matrix representation of $f$ has
the same number $p$ of positive entries and the same number $n$ of negative entries.
By Theorem 12.4, there is a basis $\{u_1, \ldots, u_n\}$ of $V$ in which $f$ is represented by a diagonal matrix with,
say, $p$ positive and $n$ negative entries. Now suppose $\{w_1, \ldots, w_n\}$ is another basis of $V$, in which $f$ is
represented by a diagonal matrix with $p'$ positive and $n'$ negative entries. We can assume without loss of
generality that the positive entries in each matrix appear first. Since $\mathrm{rank}(f) = p + n = p' + n'$, it suffices to
prove that $p = p'$.
Let $U$ be the linear span of $u_1, \ldots, u_p$ and let $W$ be the linear span of $w_{p'+1}, \ldots, w_n$. Then $f(v, v) > 0$ for
every nonzero $v \in U$, and $f(v, v) \le 0$ for every nonzero $v \in W$. Hence $U \cap W = \{0\}$. Note that $\dim U = p$
and $\dim W = n - p'$. Thus
$$\dim(U + W) = \dim U + \dim W - \dim(U \cap W) = p + (n - p') - 0 = p - p' + n$$
But $\dim(U + W) \le \dim V = n$; hence $p - p' + n \le n$ or $p \le p'$. Similarly, $p' \le p$ and therefore $p = p'$, as
required.
Remark: The above theorem and proof depend only on the concept of positivity. Thus the
theorem is true for any subfield $K$ of the real field $\mathbf{R}$ such as the rational field $\mathbf{Q}$.
POSITIVE DEFINITE REAL QUADRATIC FORMS
12.15. Prove that the following definitions of a positive definite quadratic form $q$ are equivalent:
(a) The diagonal entries are all positive in any diagonal representation of $q$.
(b) $q(Y) > 0$, for any nonzero vector $Y$ in $\mathbf{R}^n$.
Suppose $q(Y) = a_1y_1^2 + a_2y_2^2 + \cdots + a_ny_n^2$. If all the coefficients are positive, then clearly $q(Y) > 0$
whenever $Y \ne 0$. Thus (a) implies (b). Conversely, suppose (a) is not true; that is, suppose some diagonal
entry $a_k \le 0$. Let $e_k = (0, \ldots, 1, \ldots, 0)$ be the vector whose entries are all 0 except 1 in the $k$th position. Then
$q(e_k) = a_k$ is not positive, and so (b) is not true. That is, (b) implies (a). Accordingly (a) and (b) are
equivalent.
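12.16. Determine whether each of the following quadratic forms $q$ is positive definite:
(a) $q(x, y, z) = x^2 + 2y^2 - 4xz - 4yz + 7z^2$
(b) $q(x, y, z) = x^2 + y^2 + 2xz + 4yz + 3z^2$
Diagonalize (under congruence) the symmetric matrix $A$ corresponding to $q$.
(a) Apply the operations "Replace $R_3$ by $2R_1 + R_3$" and "Replace $C_3$ by $2C_1 + C_3$", and then "Replace $R_3$
by $R_2 + R_3$" and "Replace $C_3$ by $C_2 + C_3$". These yield
$$A = \begin{bmatrix} 1 & 0 & -2 \\ 0 & 2 & -2 \\ -2 & -2 & 7 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & -2 \\ 0 & -2 & 3 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
The diagonal representation of $q$ only contains positive entries, 1, 2, 1, on the diagonal. Thus $q$ is positive
definite.
(b) We have
$$A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 2 \\ 1 & 2 & 3 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 2 \\ 0 & 2 & 2 \end{bmatrix} \simeq \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{bmatrix}$$
There is a negative entry $-2$ on the diagonal representation of $q$. Thus $q$ is not positive definite.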
12.17. Show that $q(x, y) = ax^2 + bxy + cy^2$ is positive definite if and only if $a > 0$ and the discriminant
$D = b^2 - 4ac < 0$.
Suppose $v = (x, y) \ne 0$. Then either $x \ne 0$ or $y \ne 0$; say, $y \ne 0$. Let $t = x/y$. Then
$$q(v) = y^2[a(x/y)^2 + b(x/y) + c] = y^2(at^2 + bt + c)$$
However, the following are equivalent:
(i) $s = at^2 + bt + c$ is positive for every value of $t$.
(ii) $s = at^2 + bt + c$ lies above the $t$-axis.
(iii) $a > 0$ and $D = b^2 - 4ac < 0$.
Thus $q$ is positive definite if and only if $a > 0$ and $D < 0$. [Remark: $D < 0$ is the same as $\det(A) > 0$, where $A$
is the symmetric matrix corresponding to $q$.]
12.18. Determine whether or not each of the following quadratic forms $q$ is positive definite:
(a) $q(x, y) = x^2 - 4xy + 7y^2$, (b) $q(x, y) = x^2 + 8xy + 5y^2$, (c) $q(x, y) = 3x^2 + 2xy + y^2$
Compute the discriminant $D = b^2 - 4ac$, and then use Problem 12.17.
(a) $D = 16 - 28 = -12$. Since $a = 1 > 0$ and $D < 0$, $q$ is positive definite.
(b) $D = 64 - 20 = 44$. Since $D > 0$, $q$ is not positive definite.
(c) $D = 4 - 12 = -8$. Since $a = 3 > 0$ and $D < 0$, $q$ is positive definite.
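The criterion of Problem 12.17 translates directly into a one-line test. The sketch below applies it to the three forms of Problem 12.18; the function name `is_positive_definite_2d` is a hypothetical label chosen for this illustration.

```python
def is_positive_definite_2d(a, b, c):
    """Test q(x, y) = a*x^2 + b*x*y + c*y^2 using Problem 12.17:
    q is positive definite iff a > 0 and D = b^2 - 4ac < 0."""
    return a > 0 and b * b - 4 * a * c < 0


# Coefficients (a, b, c) of the forms in Problem 12.18.
forms = {"(a)": (1, -4, 7), "(b)": (1, 8, 5), "(c)": (3, 2, 1)}
for label, (a, b, c) in forms.items():
    print(label, "D =", b * b - 4 * a * c, is_positive_definite_2d(a, b, c))
```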
HERMITIAN FORMS
12.19. Determine whether the following matrices are Hermitian:
$$\text{(a) } \begin{bmatrix} 2 & 2+3i & 4-5i \\ 2-3i & 5 & 6+2i \\ 4+5i & 6-2i & -7 \end{bmatrix}, \quad \text{(b) } \begin{bmatrix} 3 & 2-i & 4+i \\ 2-i & 6 & i \\ 4+i & i & 7 \end{bmatrix}, \quad \text{(c) } \begin{bmatrix} 4 & -3 & 5 \\ -3 & 2 & 1 \\ 5 & 1 & -6 \end{bmatrix}$$
A complex matrix $A = [a_{ij}]$ is Hermitian if $A^* = A$, that is, if $a_{ij} = \bar{a}_{ji}$.
(a) Yes, since it is equal to its conjugate transpose.
(b) No, even though it is symmetric.
(c) Yes. In fact, a real matrix is Hermitian if and only if it is symmetric.
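The defining condition $a_{ij} = \bar{a}_{ji}$ is easy to test entry by entry. The following Python sketch, with the hypothetical helper name `is_hermitian`, checks the three matrices of this problem using Python's built-in complex numbers (where `1j` denotes $i$).

```python
def is_hermitian(A):
    """Return True if the square matrix A equals its conjugate transpose."""
    n = len(A)
    return all(A[i][j] == complex(A[j][i]).conjugate()
               for i in range(n) for j in range(n))


A1 = [[2, 2 + 3j, 4 - 5j], [2 - 3j, 5, 6 + 2j], [4 + 5j, 6 - 2j, -7]]
A2 = [[3, 2 - 1j, 4 + 1j], [2 - 1j, 6, 1j], [4 + 1j, 1j, 7]]
A3 = [[4, -3, 5], [-3, 2, 1], [5, 1, -6]]

for name, M in (("(a)", A1), ("(b)", A2), ("(c)", A3)):
    print(name, is_hermitian(M))
```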
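12.20. Let $A$ be a Hermitian matrix. Show that $f$ is a Hermitian form on $\mathbf{C}^n$ where $f$ is defined by
$f(X, Y) = X^TA\bar{Y}$.
For all $a, b \in \mathbf{C}$ and all $X_1, X_2, Y \in \mathbf{C}^n$,
$$f(aX_1 + bX_2, Y) = (aX_1 + bX_2)^TA\bar{Y} = (aX_1^T + bX_2^T)A\bar{Y}$$
$$= aX_1^TA\bar{Y} + bX_2^TA\bar{Y} = af(X_1, Y) + bf(X_2, Y)$$
Hence $f$ is linear in the first variable. Also
$$\overline{f(X, Y)} = \overline{X^TA\bar{Y}} = \overline{(X^TA\bar{Y})^T} = \overline{\bar{Y}^TA^TX} = Y^TA^*\bar{X} = Y^TA\bar{X} = f(Y, X)$$
Hence $f$ is a Hermitian form on $\mathbf{C}^n$. (Remark: We use the fact that $X^TA\bar{Y}$ is a scalar and so it is equal to its
transpose.)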
12.21. Let $f$ be a Hermitian form on $V$. Let $H$ be the matrix of $f$ in a basis $S = \{u_i\}$ of $V$. Prove the
following:
(a) $f(u, v) = [u]_S^TH\overline{[v]_S}$ for all $u, v \in V$.
(b) If $P$ is the change-of-basis matrix from $S$ to a new basis $S'$ of $V$, then $B = P^TH\bar{P}$ (or
$B = Q^*HQ$, where $Q = \bar{P}$) is the matrix of $f$ in the new basis $S'$.
Note that (b) is the complex analog of Theorem 12.2.
(a) Let $u, v \in V$ and suppose $u = a_1u_1 + \cdots + a_nu_n$ and $v = b_1u_1 + \cdots + b_nu_n$. Then, as required,
$$f(u, v) = f(a_1u_1 + \cdots + a_nu_n,\ b_1u_1 + \cdots + b_nu_n)$$
$$= \sum_{i,j} a_i\bar{b}_jf(u_i, u_j) = [a_1, \ldots, a_n]H[\bar{b}_1, \ldots, \bar{b}_n]^T = [u]_S^TH\overline{[v]_S}$$
(b) Since $P$ is the change-of-basis matrix from $S$ to $S'$, we have $P[u]_{S'} = [u]_S$ and $P[v]_{S'} = [v]_S$; hence
$[u]_S^T = [u]_{S'}^TP^T$ and $\overline{[v]_S} = \bar{P}\,\overline{[v]_{S'}}$. Thus by (a),
$$f(u, v) = [u]_S^TH\overline{[v]_S} = [u]_{S'}^TP^TH\bar{P}\,\overline{[v]_{S'}}$$
But $u$ and $v$ are arbitrary elements of $V$; hence $P^TH\bar{P}$ is the matrix of $f$ in the basis $S'$.
12.22. Let $H = \begin{bmatrix} 1 & 1+i & 2i \\ 1-i & 4 & 2-3i \\ -2i & 2+3i & 7 \end{bmatrix}$, a Hermitian matrix.
Find a nonsingular matrix $P$ such that $D = P^TH\bar{P}$ is diagonal. Also, find the signature of $H$.
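Use the modified Algorithm 12.1 that applies the same row operations but the corresponding conjugate
column operations. Thus first form the block matrix $M = [H, I]$:
$$M = \left[\begin{array}{ccc|ccc} 1 & 1+i & 2i & 1 & 0 & 0 \\ 1-i & 4 & 2-3i & 0 & 1 & 0 \\ -2i & 2+3i & 7 & 0 & 0 & 1 \end{array}\right]$$
Apply the row operations "Replace $R_2$ by $(-1+i)R_1 + R_2$" and "Replace $R_3$ by $2iR_1 + R_3$" and then the
corresponding conjugate column operations "Replace $C_2$ by $(-1-i)C_1 + C_2$" and "Replace $C_3$ by
$-2iC_1 + C_3$" to obtain
$$\left[\begin{array}{ccc|ccc} 1 & 1+i & 2i & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right] \quad \text{and then} \quad \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 5i & 3 & 2i & 0 & 1 \end{array}\right]$$
Next apply the row operation "Replace $R_3$ by $-5iR_2 + 2R_3$" and the corresponding conjugate column
operation "Replace $C_3$ by $5iC_2 + 2C_3$" to obtain
$$\left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & -5i & -1+i & 1 & 0 \\ 0 & 0 & -19 & 5+9i & -5i & 2 \end{array}\right] \quad \text{and then} \quad \left[\begin{array}{ccc|ccc} 1 & 0 & 0 & 1 & 0 & 0 \\ 0 & 2 & 0 & -1+i & 1 & 0 \\ 0 & 0 & -38 & 5+9i & -5i & 2 \end{array}\right]$$
Now $H$ has been diagonalized, and the transpose of the right half of $M$ is $P$. Thus set
$$P = \begin{bmatrix} 1 & -1+i & 5+9i \\ 0 & 1 & -5i \\ 0 & 0 & 2 \end{bmatrix} \quad \text{and then} \quad D = P^TH\bar{P} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & -38 \end{bmatrix}$$
Note $D$ has $p = 2$ positive elements and $n = 1$ negative elements. Thus the signature of $H$ is
$\mathrm{sig}(H) = 2 - 1 = 1$.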
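The result of Problem 12.22 can be confirmed by multiplying out $P^TH\bar{P}$. The Python sketch below does so with built-in complex numbers (`1j` denotes $i$); the helper names are illustrative, and the product follows this chapter's convention $D = P^TH\bar{P}$ rather than the $P^*HP$ form found in some other texts.

```python
def transpose(M):
    """Transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*M)]


def conj_matrix(M):
    """Entrywise complex conjugate of M."""
    return [[complex(z).conjugate() for z in row] for row in M]


def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]


H = [[1, 1 + 1j, 2j], [1 - 1j, 4, 2 - 3j], [-2j, 2 + 3j, 7]]
P = [[1, -1 + 1j, 5 + 9j], [0, 1, -5j], [0, 0, 2]]

# D = P^T H conj(P); the entries are small Gaussian integers, so the
# floating-point complex arithmetic here is exact.
D = matmul(matmul(transpose(P), H), conj_matrix(P))
for row in D:
    print(row)
```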
MISCELLANEOUS PROBLEMS
12.23. Prove Theorem 12.3: Let $f$ be an alternating form on $V$. Then there exists a basis of $V$ in which $f$ is
represented by a block diagonal matrix $M$ with blocks of the form $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ or 0. The number of
nonzero blocks is uniquely determined by $f$ [since it is equal to $\tfrac12\,\mathrm{rank}(f)$].
If $f = 0$, then the theorem is obviously true. Also, if $\dim V = 1$, then $f(k_1u, k_2u) = k_1k_2f(u, u) = 0$ and
so $f = 0$. Accordingly, we can assume that $\dim V > 1$ and $f \ne 0$.
Since $f \ne 0$, there exist (nonzero) $u_1, u_2 \in V$ such that $f(u_1, u_2) \ne 0$. In fact, multiplying $u_1$ by an
appropriate factor, we can assume that $f(u_1, u_2) = 1$ and so $f(u_2, u_1) = -1$. Now $u_1$ and $u_2$ are linearly
independent; because if, say, $u_2 = ku_1$, then $f(u_1, u_2) = f(u_1, ku_1) = kf(u_1, u_1) = 0$. Let $U = \mathrm{span}(u_1, u_2)$;
then:
(i) The matrix representation of the restriction of $f$ to $U$ in the basis $\{u_1, u_2\}$ is $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$.
(ii) If $u \in U$, say $u = au_1 + bu_2$, then
$$f(u, u_1) = f(au_1 + bu_2, u_1) = -b \quad \text{and} \quad f(u, u_2) = f(au_1 + bu_2, u_2) = a$$
Let $W$ consist of those vectors $w \in V$ such that $f(w, u_1) = 0$ and $f(w, u_2) = 0$. Equivalently,
$$W = \{w \in V : f(w, u) = 0 \text{ for every } u \in U\}$$
We claim that $V = U \oplus W$. It is clear that $U \cap W = \{0\}$, and so it remains to show that $V = U + W$. Let
$v \in V$. Set
$$u = f(v, u_2)u_1 - f(v, u_1)u_2 \quad \text{and} \quad w = v - u \qquad (1)$$
Since $u$ is a linear combination of $u_1$ and $u_2$, $u \in U$.
We show next that $w \in W$. By (1) and (ii), $f(u, u_1) = f(v, u_1)$; hence
$$f(w, u_1) = f(v - u, u_1) = f(v, u_1) - f(u, u_1) = 0$$
Similarly, $f(u, u_2) = f(v, u_2)$ and so
$$f(w, u_2) = f(v - u, u_2) = f(v, u_2) - f(u, u_2) = 0$$
Then $w \in W$ and so, by (1), $v = u + w$, where $u \in U$ and $w \in W$. This shows that $V = U + W$; and therefore
$V = U \oplus W$.
Now the restriction of $f$ to $W$ is an alternating bilinear form on $W$. By induction, there exists a basis
$u_3, \ldots, u_n$ of $W$ in which the matrix representing $f$ restricted to $W$ has the desired form. Accordingly,
$u_1, u_2, u_3, \ldots, u_n$ is a basis of $V$ in which the matrix representing $f$ has the desired form.
Supplementary Problems
BILINEAR FORMS
12.24. Let $u = (x_1, x_2)$ and $v = (y_1, y_2)$. Determine which of the following are bilinear forms on $\mathbf{R}^2$:
(a) $f(u, v) = 2x_1y_2 - 3x_2y_1$, (c) $f(u, v) = 3x_2y_2$, (e) $f(u, v) = 1$,
(b) $f(u, v) = x_1 + y_2$, (d) $f(u, v) = x_1x_2 + y_1y_2$, (f) $f(u, v) = 0$
12.25. Let $f$ be the bilinear form on $\mathbf{R}^2$ defined by
$$f[(x_1, x_2), (y_1, y_2)] = 3x_1y_1 - 2x_1y_2 + 4x_2y_1 - x_2y_2$$
(a) Find the matrix $A$ of $f$ in the basis $\{u_1 = (1, 1),\ u_2 = (1, 2)\}$.
(b) Find the matrix $B$ of $f$ in the basis $\{v_1 = (1, -1),\ v_2 = (3, 1)\}$.
(c) Find the change-of-basis matrix $P$ from $\{u_i\}$ to $\{v_i\}$, and verify that $B = P^TAP$.
12.26. Let $V$ be the vector space of 2-square matrices over $\mathbf{R}$. Let $M = \begin{bmatrix} 1 & 2 \\ 3 & 5 \end{bmatrix}$, and let $f(A, B) = \mathrm{tr}(A^TMB)$, where
$A, B \in V$ and "tr" denotes trace. (a) Show that $f$ is a bilinear form on $V$. (b) Find the matrix of $f$ in the basis
$$\left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 1 & 0 \end{bmatrix},\ \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} \right\}$$
12.27. Let $B(V)$ be the set of bilinear forms on $V$ over $K$. Prove the following:
(a) If $f, g \in B(V)$, then $f + g,\ kg \in B(V)$ for any $k \in K$.
(b) If $\phi$ and $\sigma$ are linear functions on $V$, then $f(u, v) = \phi(u)\sigma(v)$ belongs to $B(V)$.
12.28. Let $[f]$ denote the matrix representation of a bilinear form $f$ on $V$ relative to a basis $\{u_i\}$. Show that the
mapping $f \mapsto [f]$ is an isomorphism of $B(V)$ onto the vector space of $n$-square matrices.
12.29. Let $f$ be a bilinear form on $V$. For any subset $S$ of $V$, let
$$S^\perp = \{v \in V : f(u, v) = 0 \text{ for every } u \in S\} \quad \text{and} \quad S^\top = \{v \in V : f(v, u) = 0 \text{ for every } u \in S\}$$
Show that: (a) $S^\top$ and $S^\perp$ are subspaces of $V$; (b) $S_1 \subseteq S_2$ implies $S_2^\perp \subseteq S_1^\perp$ and $S_2^\top \subseteq S_1^\top$;
(c) $\{0\}^\perp = \{0\}^\top = V$.
12.30. Suppose $f$ is a bilinear form on $V$. Prove that $\mathrm{rank}(f) = \dim V - \dim V^\perp = \dim V - \dim V^\top$ and hence
$\dim V^\perp = \dim V^\top$.
12.31. Let $f$ be a bilinear form on $V$. For each $u \in V$, let $\hat{u}: V \to K$ and $\tilde{u}: V \to K$ be defined by $\hat{u}(x) = f(x, u)$ and
$\tilde{u}(x) = f(u, x)$. Prove the following:
(a) $\hat{u}$ and $\tilde{u}$ are each linear, i.e. $\hat{u}, \tilde{u} \in V^*$,
(b) $u \mapsto \hat{u}$ and $u \mapsto \tilde{u}$ are each linear mappings from $V$ into $V^*$,
(c) $\mathrm{rank}(f) = \mathrm{rank}(u \mapsto \hat{u}) = \mathrm{rank}(u \mapsto \tilde{u})$.
12.32. Show that congruence of matrices (denoted by $\simeq$) is an equivalence relation; that is:
(i) $A \simeq A$. (ii) If $A \simeq B$, then $B \simeq A$. (iii) If $A \simeq B$ and $B \simeq C$, then $A \simeq C$.
SYMMETRIC BILINEAR FORMS, QUADRATIC FORMS
12.33. Find the symmetric matrix $A$ belonging to each of the following quadratic forms:
(a) $q(x, y, z) = 2x^2 - 8xy + y^2 - 16xz + 14yz + 5z^2$, (c) $q(x, y, z) = xy + y^2 + 4xz + z^2$
(b) $q(x, y, z) = x^2 - xz + y^2$, (d) $q(x, y, z) = xy + yz$
12.34. For each of the following symmetric matrices $A$, find a nonsingular matrix $P$ such that $D = P^TAP$ is diagonal:
$$\text{(a) } A = \begin{bmatrix} 1 & 0 & 2 \\ 0 & 3 & 6 \\ 2 & 6 & 7 \end{bmatrix}, \quad \text{(b) } A = \begin{bmatrix} 1 & -2 & 1 \\ -2 & 5 & 3 \\ 1 & 3 & -2 \end{bmatrix}, \quad \text{(c) } A = \begin{bmatrix} 1 & -1 & 0 & 2 \\ -1 & 2 & 1 & 0 \\ 0 & 1 & 1 & 2 \\ 2 & 0 & 2 & -1 \end{bmatrix}$$
12.35. Let $q(x, y) = 2x^2 - 6xy - 3y^2$ and $x = s + 2t$, $y = 3s - t$.
(a) Rewrite $q(x, y)$ in matrix notation, and find the matrix $A$ representing the quadratic form.
(b) Rewrite the linear substitution using matrix notation, and find the matrix $P$ corresponding to the
substitution.
(c) Find $q(s, t)$ using: (i) direct substitution, (ii) matrix notation.
12.36. For each of the following quadratic forms $q(x, y, z)$, find a nonsingular linear substitution expressing the
variables $x, y, z$ in terms of variables $r, s, t$ such that $q(r, s, t)$ is diagonal:
(a) $q(x, y, z) = x^2 + 6xy + 8y^2 - 4xz + 2yz - 9z^2$,
(b) $q(x, y, z) = 2x^2 - 3y^2 + 8xz + 12yz + 25z^2$,
(c) $q(x, y, z) = x^2 + 2xy + 3y^2 + 4xz + 8yz + 6z^2$.
In each case, find the rank and signature.
12.37. Give an example of a quadratic form $q(x, y)$ such that $q(u) = 0$ and $q(v) = 0$ but $q(u + v) \ne 0$.
12.38. Let $S(V)$ denote all symmetric bilinear forms on $V$. Show that:
(a) $S(V)$ is a subspace of $B(V)$. (b) If $\dim V = n$, then $\dim S(V) = \tfrac12 n(n + 1)$.
12.39. Consider a real quadratic polynomial $q(x_1, \ldots, x_n) = \sum_{i,j=1}^{n} a_{ij}x_ix_j$, where $a_{ij} = a_{ji}$.
(a) If $a_{11} \ne 0$, show that the substitution
$$x_1 = y_1 - \frac{1}{a_{11}}(a_{12}y_2 + \cdots + a_{1n}y_n), \quad x_2 = y_2, \quad \ldots, \quad x_n = y_n$$
yields the equation $q(x_1, \ldots, x_n) = a_{11}y_1^2 + q'(y_2, \ldots, y_n)$, where $q'$ is also a quadratic polynomial.
(b) If $a_{11} = 0$ but, say, $a_{12} \ne 0$, show that the substitution
$$x_1 = y_1 + y_2, \quad x_2 = y_1 - y_2, \quad x_3 = y_3, \quad \ldots, \quad x_n = y_n$$
yields the equation $q(x_1, \ldots, x_n) = \sum b_{ij}y_iy_j$, where $b_{11} \ne 0$, which reduces this case to case (a).
Remark. This method of diagonalizing $q$ is known as completing the square.
POSITIVE DEFINITE QUADRATIC FORMS
12.40. Determine whether or not each of the following quadratic forms is positive definite:
(a) $q(x, y) = 4x^2 + 5xy + 7y^2$, (c) $q(x, y, z) = x^2 + 4xy + 5y^2 + 6xz + 2yz + 4z^2$
(b) $q(x, y) = 2x^2 - 3xy - y^2$, (d) $q(x, y, z) = x^2 + 2xy + 2y^2 + 4xz + 6yz + 7z^2$
12.41. Find those values of $k$ such that the given quadratic form is positive definite:
(a) $q(x, y) = 2x^2 - 5xy + ky^2$, (b) $q(x, y) = 3x^2 - kxy + 12y^2$
(c) $q(x, y, z) = x^2 + 2xy + 2y^2 + 2xz + 6yz + kz^2$
12.42. Suppose $A$ is a real symmetric positive definite matrix. Show that $A = P^TP$ for some nonsingular matrix $P$.
HERMITIAN FORMS
12.43. Modify Algorithm 12.1 so that, for a given Hermitian matrix $H$, it finds a nonsingular matrix $P$ for which
$D = P^TH\bar{P}$ is diagonal.
12.44. For each Hermitian matrix $H$, find a nonsingular matrix $P$ such that $D = P^TH\bar{P}$ is diagonal:
$$\text{(a) } H = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix}, \quad \text{(b) } H = \begin{bmatrix} 1 & 2+3i \\ 2-3i & -1 \end{bmatrix}, \quad \text{(c) } H = \begin{bmatrix} 1 & i & 2+i \\ -i & 2 & 1-i \\ 2-i & 1+i & 2 \end{bmatrix}$$
Find the rank and signature in each case.
12.45. Let $A$ be a complex nonsingular matrix. Show that $H = A^*A$ is Hermitian and positive definite.
12.46. We say that $B$ is Hermitian congruent to $A$ if there exists a nonsingular matrix $P$ such that $B = P^TA\bar{P}$ or,
equivalently, if there exists a nonsingular matrix $Q$ such that $B = Q^*AQ$. Show that Hermitian congruence is
an equivalence relation. (Note: If $P = \bar{Q}$, then $P^TA\bar{P} = Q^*AQ$.)
12.47. Prove Theorem 12.7: Let $f$ be a Hermitian form on $V$. Then there is a basis $S$ of $V$ in which $f$ is represented by
a diagonal matrix, and every such diagonal representation has the same number $p$ of positive entries and the
same number $n$ of negative entries.
MISCELLANEOUS PROBLEMS
12.48. Let $e$ denote an elementary row operation, and let $f^*$ denote the corresponding conjugate column operation
(where each scalar $k$ in $e$ is replaced by $\bar{k}$ in $f^*$). Show that the elementary matrix corresponding to $f^*$ is the
conjugate transpose of the elementary matrix corresponding to $e$.
12.49. Let $V$ and $W$ be vector spaces over $K$. A mapping $f: V \times W \to K$ is called a bilinear form on $V$ and $W$ if:
(i) $f(av_1 + bv_2, w) = af(v_1, w) + bf(v_2, w)$,
(ii) $f(v, aw_1 + bw_2) = af(v, w_1) + bf(v, w_2)$
for every $a, b \in K$, $v_i \in V$, $w_j \in W$. Prove the following:
(a) The set $B(V, W)$ of bilinear forms on $V$ and $W$ is a subspace of the vector space of functions from
$V \times W$ into $K$.
(b) If $\{\phi_1, \ldots, \phi_m\}$ is a basis of $V^*$ and $\{\sigma_1, \ldots, \sigma_n\}$ is a basis of $W^*$, then $\{f_{ij} : i = 1, \ldots, m,\ j = 1, \ldots, n\}$
is a basis of $B(V, W)$, where $f_{ij}$ is defined by $f_{ij}(v, w) = \phi_i(v)\sigma_j(w)$. Thus $\dim B(V, W) = \dim V \dim W$.
[Note that if $V = W$, then we obtain the space $B(V)$ investigated in this chapter.]
12.50. Let $V$ be a vector space over $K$. A mapping $f: V \times V \times \cdots \times V \to K$ is called a multilinear (or $m$-linear) form
on $V$ if $f$ is linear in each variable, i.e., for $i = 1, \ldots, m$,
$$f(\ldots, \widehat{au + bv}, \ldots) = af(\ldots, \hat{u}, \ldots) + bf(\ldots, \hat{v}, \ldots)$$
where $\widehat{\ \ }$ denotes the $i$th element, and other elements are held fixed. An $m$-linear form $f$ is said to be
alternating if $f(v_1, \ldots, v_m) = 0$ whenever $v_i = v_j$ for $i \ne j$. Prove the following:
(a) The set $B_m(V)$ of $m$-linear forms on $V$ is a subspace of the vector space of functions from
$V \times V \times \cdots \times V$ into $K$.
(b) The set $A_m(V)$ of alternating $m$-linear forms on $V$ is a subspace of $B_m(V)$.
Remark 1: If $m = 2$, then we obtain the space $B(V)$ investigated in this chapter.
Remark 2: If $V = K^m$, then the determinant function is an alternating $m$-linear form on $V$.
Answers to Supplementary Problems
Notation: $M = [R_1;\ R_2;\ \ldots]$ denotes a matrix $M$ with rows $R_1, R_2, \ldots$
12.24. (a) yes, (b) no, (c) yes, (d) no, (e) no, (f) yes
12.25. (a) $A = [4, 1;\ 7, 3]$, (b) $B = [0, -4;\ 20, 32]$, (c) $P = [3, 5;\ -2, -2]$
12.26. (b) $[1, 0, 2, 0;\ 0, 1, 0, 2;\ 3, 0, 5, 0;\ 0, 3, 0, 5]$
12.33. (a) $[2, -4, -8;\ -4, 1, 7;\ -8, 7, 5]$, (b) $[1, 0, -\tfrac12;\ 0, 1, 0;\ -\tfrac12, 0, 0]$,
(c) $[0, \tfrac12, 2;\ \tfrac12, 1, 0;\ 2, 0, 1]$, (d) $[0, \tfrac12, 0;\ \tfrac12, 0, \tfrac12;\ 0, \tfrac12, 0]$
12.34. (a) $P = [1, 0, -2;\ 0, 1, -2;\ 0, 0, 1]$, $D = \mathrm{diag}(1, 3, -9)$,
(b) $P = [1, 2, -11;\ 0, 1, -5;\ 0, 0, 1]$, $D = \mathrm{diag}(1, 1, -28)$,
(c) $P = [1, 1, -1, -4;\ 0, 1, -1, -2;\ 0, 0, 1, 0;\ 0, 0, 0, 1]$, $D = \mathrm{diag}(1, 1, 0, -9)$
12.35. $A = [2, -3;\ -3, -3]$, $P = [1, 2;\ 3, -1]$, $q(s, t) = -43s^2 - 4st + 17t^2$
12.36. (a) $x = r - 3s - 19t$, $y = s + 7t$, $z = t$; $q(r, s, t) = r^2 - s^2 + 36t^2$,
(b) $x = r - 2t$, $y = s + 2t$, $z = t$; $q(r, s, t) = 2r^2 - 3s^2 + 29t^2$
12.37. $q(x, y) = x^2 - y^2$, $u = (1, 1)$, $v = (1, -1)$
12.40. (a) yes, (b) no, (c) no, (d) yes
12.41. (a) $k > \tfrac{25}{8}$, (b) $-12 < k < 12$, (c) $k > 5$
12.44. (a) $P = [1, i;\ 0, 1]$, $D = I$, $s = 2$, (b) $P = [1, -2 + 3i;\ 0, 1]$, $D = \mathrm{diag}(1, -14)$, $s = 0$,
(c) $P = [1, i, -3 + i;\ 0, 1, i;\ 0, 0, 1]$, $D = \mathrm{diag}(1, 1, -4)$, $s = 1$
CHAPTER 13
Linear Operators on
Inner Product Spaces
13.1 INTRODUCTION
This chapter investigates the space $A(V)$ of linear operators $T$ on an inner product space $V$. (See
Chapter 7.) Thus the base field $K$ is either the real numbers $\mathbf{R}$ or the complex numbers $\mathbf{C}$. In fact, different
terminologies will be used for the real case and the complex case. We also use the fact that the inner
products on real Euclidean space $\mathbf{R}^n$ and complex Euclidean space $\mathbf{C}^n$ may be defined, respectively, by
$$\langle u, v \rangle = u^Tv \quad \text{and} \quad \langle u, v \rangle = u^T\bar{v}$$
where $u$ and $v$ are column vectors.
The reader should review the material in Chapter 7 and be very familiar with the notions of norm
(length), orthogonality, and orthonormal bases. We also note that Chapter 7 mainly dealt with real inner
product spaces, whereas here we assume that $V$ is a complex inner product space unless otherwise stated or
implied.
Lastly, we note that in Chapter 2, we used $A^H$ to denote the conjugate transpose of a complex matrix $A$,
that is, $A^H = \bar{A}^T$. This notation is not standard. Many texts, especially advanced texts, use $A^*$ to denote
such a matrix, and we will use this notation in this chapter. That is, now $A^* = \bar{A}^T$.
13.2 ADJOINT OPERATORS
We begin with the following basic definition.
Definition: A linear operator $T$ on an inner product space $V$ is said to have an adjoint operator $T^*$ on $V$
if $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$ for every $u, v \in V$.
The following example shows that the adjoint operator has a simple description within the context of
matrix mappings.
Example 13.1.
(a) Let $A$ be a real $n$-square matrix viewed as a linear operator on $\mathbf{R}^n$. Then, for every $u, v \in \mathbf{R}^n$,
$$\langle Au, v \rangle = (Au)^Tv = u^TA^Tv = \langle u, A^Tv \rangle$$
Thus the transpose $A^T$ of $A$ is the adjoint of $A$.
(b) Let $B$ be a complex $n$-square matrix viewed as a linear operator on $\mathbf{C}^n$. Then, for every $u, v \in \mathbf{C}^n$,
$$\langle Bu, v \rangle = (Bu)^T\bar{v} = u^TB^T\bar{v} = u^T\overline{B^*v} = \langle u, B^*v \rangle$$
Thus the conjugate transpose $B^*$ of $B$ is the adjoint of $B$.
Remark: $B^*$ may mean either the adjoint of $B$ as a linear operator or the conjugate transpose of $B$ as
a matrix. By Example 13.1(b), the ambiguity makes no difference, since they denote the same object.
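The identity of Example 13.1(b) can be checked on a concrete case. In the Python sketch below the matrix `B` and the vectors `u`, `v` are arbitrary sample values chosen for illustration (they do not come from the text), and `inner` implements the inner product $\langle u, v \rangle = u^T\bar{v}$ used in this chapter.

```python
def inner(u, v):
    """Inner product on C^n as used in the text: <u, v> = u^T conj(v)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))


def apply(M, v):
    """Matrix-vector product M v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def conj_transpose(M):
    """Conjugate transpose M* of a matrix given as a list of rows."""
    return [[complex(z).conjugate() for z in col] for col in zip(*M)]


# Sample data (an assumption for illustration only).
B = [[1 + 2j, 3], [-1j, 2 - 1j]]
u = [1 + 1j, 2]
v = [3, 1 - 2j]

lhs = inner(apply(B, u), v)                   # <Bu, v>
rhs = inner(u, apply(conj_transpose(B), v))   # <u, B*v>
print(lhs, rhs)
```

The entries are small Gaussian integers, so the two sides agree exactly here; for general floating-point data one would compare them up to a small tolerance.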
The following theorem (proved in Problem 13.4) is the main result in this section.
Theorem 13.1: Let $T$ be a linear operator on a finite-dimensional inner product space $V$ over $K$. Then:
(i) There exists a unique linear operator $T^*$ on $V$ such that $\langle T(u), v \rangle = \langle u, T^*(v) \rangle$ for
every $u, v \in V$. (That is, $T$ has an adjoint $T^*$.)
(ii) If $A$ is the matrix representation of $T$ with respect to any orthonormal basis $S = \{u_i\}$
of $V$, then the matrix representation of $T^*$ in the basis $S$ is the conjugate transpose
$A^*$ of $A$ (or the transpose $A^T$ of $A$ when $K$ is real).
We emphasize that no such simple relationship exists between the matrices representing $T$ and $T^*$ if
the basis is not orthonormal. Thus we see one useful property of orthonormal bases. We also emphasize
that this theorem is not valid if $V$ has infinite dimension (Problem 13.31).
The following theorem (proved in Problem 13.5) summarizes some of the properties of the adjoint.
Theorem 13.2: Let $T, T_1, T_2$ be linear operators on $V$ and let $k \in K$. Then:
(i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (iii) $(T_1T_2)^* = T_2^*T_1^*$,
(ii) $(kT)^* = \bar{k}T^*$, (iv) $(T^*)^* = T$.
Observe the similarity between the above theorem and Theorem 2.3 on properties of the transpose
operation on matrices.
Linear Functionals and Inner Product Spaces
Recall (Chapter 11) that a linear functional $\phi$ on a vector space $V$ is a linear mapping $\phi: V \to K$. This
subsection contains an important result (Theorem 13.3), which is used in the proof of the above basic
Theorem 13.1.
Let $V$ be an inner product space. Each $u \in V$ determines a mapping $\hat{u}: V \to K$ defined by
$$\hat{u}(v) = \langle v, u \rangle$$
Now, for any $a, b \in K$ and any $v_1, v_2 \in V$,
$$\hat{u}(av_1 + bv_2) = \langle av_1 + bv_2, u \rangle = a\langle v_1, u \rangle + b\langle v_2, u \rangle = a\hat{u}(v_1) + b\hat{u}(v_2)$$
That is, $\hat{u}$ is a linear functional on $V$. The converse is also true for spaces of finite dimension, and is the
following important theorem (proved in Problem 13.3).
Theorem 13.3: Let $\phi$ be a linear functional on a finite-dimensional inner product space $V$. Then there
exists a unique vector $u \in V$ such that $\phi(v) = \langle v, u \rangle$ for every $v \in V$.
We remark that the above theorem is not valid for spaces of infinite dimension (Problem 13.24).
13.3 ANALOGY BETWEEN A(V) AND C, SPECIAL LINEAR OPERATORS
Let $A(V)$ denote the algebra of all linear operators on a finite-dimensional inner product space $V$. The
adjoint mapping $T \mapsto T^*$ on $A(V)$ is quite analogous to the conjugation mapping $z \mapsto \bar{z}$ on the complex
field $\mathbf{C}$. To illustrate this analogy we identify in Table 13-1 certain classes of operators $T \in A(V)$ whose
behavior under the adjoint map imitates the behavior under conjugation of familiar classes of complex
numbers.
Table 13-1
| Class of complex numbers | Behavior under conjugation | Class of operators in $A(V)$ | Behavior under the adjoint map |
|---|---|---|---|
| Unit circle ($|z| = 1$) | $\bar{z} = 1/z$ | Orthogonal operators (real case); unitary operators (complex case) | $T^* = T^{-1}$ |
| Real axis | $\bar{z} = z$ | Self-adjoint operators. Also called: symmetric (real case), Hermitian (complex case) | $T^* = T$ |
| Imaginary axis | $\bar{z} = -z$ | Skew-adjoint operators. Also called: skew-symmetric (real case), skew-Hermitian (complex case) | $T^* = -T$ |
| Positive real axis $(0, \infty)$ | $z = \bar{w}w$, $w \ne 0$ | Positive definite operators | $T = S^*S$ with $S$ nonsingular |
The analogy between these operators $T$ and complex numbers $z$ is reflected in the next theorem.
Theorem 13.4: Let $\lambda$ be an eigenvalue of a linear operator $T$ on $V$.
(i) If $T^* = T^{-1}$ (i.e., $T$ is orthogonal or unitary), then $|\lambda| = 1$.
(ii) If $T^* = T$ (i.e., $T$ is self-adjoint), then $\lambda$ is real.
(iii) If $T^* = -T$ (i.e., $T$ is skew-adjoint), then $\lambda$ is pure imaginary.
(iv) If $T = S^*S$ with $S$ nonsingular (i.e., $T$ is positive definite), then $\lambda$ is real and
positive.
Proof. In each case let $v$ be a nonzero eigenvector of $T$ belonging to $\lambda$, that is, $T(v) = \lambda v$ with $v \ne 0$;
hence $\langle v, v \rangle$ is positive.
Proof of (i). We show that $\lambda\bar{\lambda}\langle v, v \rangle = \langle v, v \rangle$:
$$\lambda\bar{\lambda}\langle v, v \rangle = \langle \lambda v, \lambda v \rangle = \langle T(v), T(v) \rangle = \langle v, T^*T(v) \rangle = \langle v, I(v) \rangle = \langle v, v \rangle$$
But $\langle v, v \rangle \ne 0$; hence $\lambda\bar{\lambda} = 1$ and so $|\lambda| = 1$.
Proof of (ii). We show that $\lambda\langle v, v \rangle = \bar{\lambda}\langle v, v \rangle$:
$$\lambda\langle v, v \rangle = \langle \lambda v, v \rangle = \langle T(v), v \rangle = \langle v, T^*(v) \rangle = \langle v, T(v) \rangle = \langle v, \lambda v \rangle = \bar{\lambda}\langle v, v \rangle$$
But $\langle v, v \rangle \ne 0$; hence $\lambda = \bar{\lambda}$ and so $\lambda$ is real.
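Proof of (iii). We show that $\lambda\langle v, v\rangle = -\bar{\lambda}\langle v, v\rangle$:
$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle T(v), v\rangle = \langle v, T^*(v)\rangle = \langle v, -T(v)\rangle = \langle v, -\lambda v\rangle = -\bar{\lambda}\langle v, v\rangle$$
But $\langle v, v\rangle \neq 0$; hence $\lambda = -\bar{\lambda}$ or $\bar{\lambda} = -\lambda$, and so $\lambda$ is pure imaginary.
Proof of (iv). Note first that $S(v) \neq 0$ because S is nonsingular; hence $\langle S(v), S(v)\rangle$ is positive. We
show that $\lambda\langle v, v\rangle = \langle S(v), S(v)\rangle$:
$$\lambda\langle v, v\rangle = \langle \lambda v, v\rangle = \langle T(v), v\rangle = \langle S^*S(v), v\rangle = \langle S(v), S(v)\rangle$$
But $\langle v, v\rangle$ and $\langle S(v), S(v)\rangle$ are positive; hence $\lambda$ is positive.
Remark: Each of the above operators T commutes with its adjoint; that is, $TT^* = T^*T$. Such
operators are called normal operators.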
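Theorem 13.4 can be illustrated numerically. The following Python sketch computes the eigenvalues of one 2 x 2 matrix from each class in Table 13-1. The helper name `eigenvalues_2x2` and the four sample matrices are assumptions chosen for illustration; they do not come from the text.

```python
import cmath
import math

def eigenvalues_2x2(m):
    """Eigenvalues of m = [[a, b], [c, d]], found by applying the
    quadratic formula to t^2 - tr(m) t + det(m)."""
    (a, b), (c, d) = m
    tr = a + d
    det = a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return [(tr + root) / 2, (tr - root) / 2]

theta = 0.7  # arbitrary rotation angle (illustrative)
orthogonal = [[math.cos(theta), -math.sin(theta)],
              [math.sin(theta), math.cos(theta)]]
symmetric = [[2.0, 1.0], [1.0, 2.0]]
skew_symmetric = [[0.0, -2.0], [2.0, 0.0]]
# S^T S for the nonsingular matrix S = [[1, 2], [0, 1]] (hypothetical example)
positive_definite = [[1.0, 2.0], [2.0, 5.0]]

for name, m in [("orthogonal", orthogonal),
                ("symmetric", symmetric),
                ("skew-symmetric", skew_symmetric),
                ("positive definite", positive_definite)]:
    print(name, eigenvalues_2x2(m))
```

In each case the printed eigenvalues should fall in the class of complex numbers that Table 13-1 pairs with the operator: the unit circle, the real axis, the imaginary axis, and the positive real axis, respectively.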
13.4 SELF-ADJOINT OPERATORS
Let T be a self-adjoint operator on an inner product space V; that is, suppose
$$T^* = T$$
(If T is defined by a matrix A, then A is symmetric or Hermitian according as A is real or complex.) By
Theorem 13.4, the eigenvalues of T are real. The following is another important property of T.
Theorem 13.5: Let T be a self-adjoint operator on V. Suppose u and v are eigenvectors of T belonging to
distinct eigenvalues. Then u and v are orthogonal, i.e., $\langle u, v\rangle = 0$.
Proof. Suppose $T(u) = \lambda_1 u$ and $T(v) = \lambda_2 v$, where $\lambda_1 \neq \lambda_2$. We show that $\lambda_1\langle u, v\rangle = \lambda_2\langle u, v\rangle$:
$$\lambda_1\langle u, v\rangle = \langle \lambda_1 u, v\rangle = \langle T(u), v\rangle = \langle u, T^*(v)\rangle = \langle u, T(v)\rangle = \langle u, \lambda_2 v\rangle = \bar{\lambda}_2\langle u, v\rangle = \lambda_2\langle u, v\rangle$$
(The fourth equality uses the fact that $T^* = T$, and the last equality uses the fact that the eigenvalue $\lambda_2$ is
real.) Since $\lambda_1 \neq \lambda_2$, we get $\langle u, v\rangle = 0$. Thus the theorem is proved.
13.5 ORTHOGONAL AND UNITARY OPERATORS
Let U be a linear operator on a finite-dimensional inner product space V. Suppose
$$U^* = U^{-1} \quad \text{or equivalently} \quad UU^* = U^*U = I$$
Recall that U is said to be orthogonal or unitary according as the underlying field is real or complex. The
next theorem (proved in Problem 13.10) gives alternative characterizations of these operators.
Theorem 13.6: The following conditions on an operator U are equivalent:
(i) $U^* = U^{-1}$; i.e., $UU^* = U^*U = I$. [U is unitary (orthogonal).]
(ii) U preserves inner products; i.e., for every $v, w \in V$, $\langle U(v), U(w)\rangle = \langle v, w\rangle$.
(iii) U preserves lengths; i.e., for every $v \in V$, $\|U(v)\| = \|v\|$.
Example 13.2.
(a) Let $T: \mathbf{R}^3 \to \mathbf{R}^3$ be the linear operator that rotates each vector v about the z-axis by a fixed angle $\theta$ as shown in
Fig. 10-1 (Section 10.3). That is, T is defined by
$$T(x, y, z) = (x\cos\theta - y\sin\theta,\; x\sin\theta + y\cos\theta,\; z)$$
We note that lengths (distances from the origin) are preserved under T. Thus T is an orthogonal operator.
(b) Let V be $l_2$-space (Hilbert space), defined in Section 7.3. Let $T: V \to V$ be the linear operator defined by
$$T(a_1, a_2, a_3, \ldots) = (0, a_1, a_2, a_3, \ldots)$$
Clearly, T preserves inner products and lengths. However, T is not surjective, since, for example, $(1, 0, 0, \ldots)$
does not belong to the image of T; hence T is not invertible. Thus we see that Theorem 13.6 is not valid for
spaces of infinite dimension.
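Both parts of Example 13.2 can be checked with a short Python sketch. It assumes the standard Euclidean norm, and it models a finitely supported sequence in $l_2$ as a Python list, which is a simplification made only so that the sequence fits in memory. The function names `rotate_z`, `shift`, and `norm` are hypothetical.

```python
import math

def rotate_z(v, theta):
    """Rotate v = (x, y, z) about the z-axis by the angle theta."""
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

def shift(seq):
    """Right shift T(a1, a2, ...) = (0, a1, a2, ...) applied to a
    finitely supported sequence stored as a list."""
    return [0.0] + list(seq)

def norm(v):
    """Euclidean length of a real vector or finite sequence."""
    return math.sqrt(sum(x * x for x in v))

v = (1.0, 2.0, 3.0)
print(norm(v), norm(rotate_z(v, 0.9)))  # the two lengths agree up to rounding

a = [3.0, 4.0]
print(norm(a), norm(shift(a)))          # the shift preserves length
print(shift(a)[0])                      # every image starts with 0
```

Because every image under `shift` has first entry 0, no sequence is mapped to $(1, 0, 0, \ldots)$, which is the failure of surjectivity described in part (b).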
An isomorphism from one inner product space into another is a bijective mapping that preserves the
three basic operations of an inner product space: vector addition, scalar multiplication, and inner products.
Thus the above mappings (orthogonal and unitary) may also be characterized as the isomorphisms of V
into itself. Note that such a mapping U also preserves distances, since
$$\|U(v) - U(w)\| = \|U(v - w)\| = \|v - w\|$$
Hence U is called an isometry.
13.6 ORTHOGONAL AND UNITARY MATRICES
Let U be a linear operator on an inner product space V. By Theorem 13.1, we obtain the following
results.
Theorem 13.7A: A complex matrix A represents a unitary operator U (relative to an orthonormal basis)
if and only if $A^* = A^{-1}$.
Theorem 13.7B: A real matrix A represents an orthogonal operator U (relative to an orthonormal basis)
if and only if $A^T = A^{-1}$.
The above theorems motivate the following definitions (which appeared in Sections 2.10 and 2.11).
Definition: A complex matrix A for which $A^* = A^{-1}$ is called a unitary matrix.
Definition: A real matrix A for which $A^T = A^{-1}$ is called an orthogonal matrix.
We repeat Theorem 2.6, which characterizes the above matrices.
Theorem 13.8: The following conditions on a matrix A are equivalent:
(i) A is unitary (orthogonal).
(ii) The rows of A form an orthonormal set.
(iii) The columns of A form an orthonormal set.
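Conditions (ii) and (iii) of Theorem 13.8 are easy to test numerically. The sketch below assumes the usual inner product on $\mathbf{C}^n$, $\langle u, v\rangle = \sum_k u_k \bar{v}_k$. The helper names and the sample matrix are hypothetical choices, not taken from the text.

```python
import math

def inner(u, v):
    """Standard inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def is_orthonormal(vectors, tol=1e-12):
    """True if <v_i, v_j> is 1 when i == j and 0 otherwise."""
    return all(abs(inner(u, v) - (1 if i == j else 0)) < tol
               for i, u in enumerate(vectors)
               for j, v in enumerate(vectors))

s = 1 / math.sqrt(2)
A = [[s, s * 1j],
     [s * 1j, s]]                     # sample unitary matrix (illustrative)
rows = A
cols = [list(c) for c in zip(*A)]     # columns of A

print(is_orthonormal(rows), is_orthonormal(cols))
```

For a unitary matrix both checks succeed together, as the theorem predicts; for a matrix such as [[1, 1], [0, 1]] the row check fails.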
13.7 CHANGE OF ORTHONORMAL BASIS
Orthonormal bases play a special role in the theory of inner product spaces V. Thus we are naturally
interested in the properties of the change-of-basis matrix from one such basis to another. The following
theorem (proved in Problem 13.12) holds.
Theorem 13.9: Let $\{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space V. Then the
change-of-basis matrix from $\{u_i\}$ into another orthonormal basis is unitary (orthogonal).
Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal) matrix, then the following is an
orthonormal basis:
$$\{u'_i = a_{1i}u_1 + a_{2i}u_2 + \cdots + a_{ni}u_n : i = 1, \ldots, n\}$$
Recall that matrices A and B representing the same linear operator T are similar; i.e., $B = P^{-1}AP$,
where P is the (nonsingular) change-of-basis matrix. On the other hand, if V is an inner product space, we
are usually interested in the case when P is unitary (or orthogonal) as suggested by Theorem 13.9. (Recall
that P is unitary if the conjugate transpose $P^* = P^{-1}$, and P is orthogonal if the transpose $P^T = P^{-1}$.) This
leads to the following definition.
Definition: Complex matrices A and B are unitarily equivalent if there exists a unitary matrix P for which
$B = P^*AP$. Analogously, real matrices A and B are orthogonally equivalent if there exists an
orthogonal matrix P for which $B = P^TAP$.
Note that orthogonally equivalent matrices are necessarily congruent.
13.8 POSITIVE DEFINITE AND POSITIVE OPERATORS
Let P be a linear operator on an inner product space V. Then:
(i) P is said to be positive definite if $P = S^*S$ for some nonsingular operator S.
(ii) P is said to be positive (or nonnegative or semi-definite) if $P = S^*S$ for some operator S.
The following theorems give alternative characterizations of these operators.
Theorem 13.10A: The following conditions on an operator P are equivalent:
(i) $P = T^2$ for some nonsingular self-adjoint operator T.
(ii) P is positive definite.
(iii) P is self-adjoint and $\langle P(u), u\rangle > 0$ for every $u \neq 0$ in V.
The corresponding theorem for positive operators (proved in Problem 13.21) follows.
Theorem 13.10B: The following conditions on an operator P are equivalent:
(i) $P = T^2$ for some self-adjoint operator T.
(ii) P is positive, i.e., $P = S^*S$.
(iii) P is self-adjoint and $\langle P(u), u\rangle \geq 0$ for every $u \in V$.
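The difference between a positive and a positive definite operator shows up as soon as S is singular. The following Python sketch builds $P = S^TS$ from a singular real matrix S and evaluates $\langle P(u), u\rangle$ at a few vectors. The matrix S and the helper names are hypothetical choices made for illustration.

```python
def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def quad_form(p, u):
    """<P(u), u> for a real matrix p and a real vector u."""
    pu = [sum(x * y for x, y in zip(row, u)) for row in p]
    return sum(x * y for x, y in zip(pu, u))

S = [[1, 2], [2, 4]]             # singular matrix (illustrative)
P = matmul(transpose(S), S)      # P = S^T S is positive but not positive definite

print(P)
print(quad_form(P, [1, 1]))      # strictly positive at this vector
print(quad_form(P, [2, -1]))     # zero at a nonzero vector in the kernel of S
```

The form never goes negative, which matches condition (iii) of Theorem 13.10B, but it vanishes at the nonzero vector (2, -1), so condition (iii) of Theorem 13.10A fails.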
13.9 DIAGONALIZATION AND CANONICAL FORMS IN INNER PRODUCT SPACES
Let T be a linear operator on a finite-dimensional inner product space V over K. Representing T by a
diagonal matrix depends upon the eigenvectors and eigenvalues of T, and hence upon the roots of the
characteristic polynomial $\Delta(t)$ of T. Now $\Delta(t)$ always factors into linear polynomials over the complex field
C, but may not have any linear factors over the real field R. Thus the situation for real inner product
spaces (sometimes called Euclidean spaces) is inherently different from the situation for complex inner
product spaces (sometimes called unitary spaces). We therefore treat them separately.
Real Inner Product Spaces, Symmetric and Orthogonal Operators
The following theorem (proved in Problem 13.14) holds.
Theorem 13.11: Let T be a symmetric (self-adjoint) operator on a real finite-dimensional inner product space
V. Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is,
T can be represented by a diagonal matrix relative to an orthonormal basis.
We give the corresponding statement for matrices.
Theorem 13.11: (Alternative Form) Let A be a real symmetric matrix. Then there exists an
orthogonal matrix P such that $B = P^{-1}AP = P^TAP$ is diagonal.
We can choose the columns of the above matrix P to be normalized orthogonal eigenvectors of A; then
the diagonal entries of B are the corresponding eigenvalues.
On the other hand, an orthogonal operator T need not be symmetric, and so it may not be represented
by a diagonal matrix relative to an orthonormal basis. However, such an operator T does have a simple
canonical representation, as described in the following theorem (proved in Problem 13.16).
Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then there exists an
orthonormal basis of V in which T is represented by a block diagonal matrix M of the
form
$$M = \operatorname{diag}\left(I_s,\; -I_t,\; \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\; \ldots,\; \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right)$$
The reader may recognize that each of the 2x2 diagonal blocks represents a rotation in the
corresponding two-dimensional subspace, and each diagonal entry —1 represents a reflection in the
corresponding one-dimensional subspace.
Complex Inner Product Spaces, Normal and Triangular Operators
A linear operator T is said to be normal if it commutes with its adjoint, that is, if $TT^* = T^*T$. We note
that normal operators include both self-adjoint and unitary operators.
Analogously, a complex matrix A is said to be normal if it commutes with its conjugate transpose, that
is, if $AA^* = A^*A$.
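Example 13.3. Let $A = \begin{bmatrix} 1 & 1 \\ i & 3+2i \end{bmatrix}$. Then $A^* = \begin{bmatrix} 1 & -i \\ 1 & 3-2i \end{bmatrix}$.
Also $AA^* = \begin{bmatrix} 2 & 3-3i \\ 3+3i & 14 \end{bmatrix} = A^*A$. Thus A is normal.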
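The computation in Example 13.3 can be reproduced with Python's built-in complex numbers. This is a minimal sketch: the helper names `conj_transpose`, `matmul`, and `is_normal` are hypothetical, and the exact equality test is adequate only because all entries here are small Gaussian integers.

```python
def conj_transpose(m):
    """Conjugate transpose A* of a complex matrix."""
    return [[entry.conjugate() for entry in col] for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def is_normal(m):
    """True if m commutes with its conjugate transpose."""
    m_star = conj_transpose(m)
    return matmul(m, m_star) == matmul(m_star, m)

A = [[1, 1], [1j, 3 + 2j]]
print(matmul(A, conj_transpose(A)))   # the product A A*
print(is_normal(A))
```

For matrices with floating-point entries one would compare the two products up to a tolerance instead of testing exact equality.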
The following theorem (proved in Problem 13.19) holds.
Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product space V.
Then there exists an orthonormal basis of V consisting of eigenvectors of T; that is, T
can be represented by a diagonal matrix relative to an orthonormal basis.
We give the corresponding statement for matrices.
Theorem 13.13: (Alternative Form) Let A be a normal matrix. Then there exists a unitary matrix P
such that $B = P^{-1}AP = P^*AP$ is diagonal.
The following theorem (proved in Problem 13.20) shows that even nonnormal operators on unitary
spaces have a relatively simple form.
Theorem 13.14: Let T be an arbitrary operator on a complex finite-dimensional inner product space V.
Then T can be represented by a triangular matrix relative to an orthonormal basis of V.
Theorem 13.14: (Alternative Form) Let A be an arbitrary complex matrix. Then there exists a unitary
matrix P such that $B = P^{-1}AP = P^*AP$ is triangular.
13.10 SPECTRAL THEOREM
The Spectral Theorem is a reformulation of the diagonalization Theorems 13.11 and 13.13.
Theorem 13.15: (Spectral Theorem) Let T be a normal (symmetric) operator on a complex (real)
finite-dimensional inner product space V. Then there exist linear operators $E_1, \ldots, E_r$
on V and scalars $\lambda_1, \ldots, \lambda_r$ such that:
(i) $T = \lambda_1E_1 + \lambda_2E_2 + \cdots + \lambda_rE_r$,
(ii) $E_1 + E_2 + \cdots + E_r = I$,
(iii) $E_1^2 = E_1,\; E_2^2 = E_2,\; \ldots,\; E_r^2 = E_r$,
(iv) $E_iE_j = 0$ for $i \neq j$.
The above linear operators $E_1, \ldots, E_r$ are projections in the sense that $E_i^2 = E_i$. Moreover, they are
said to be orthogonal projections since they have the additional property that $E_iE_j = 0$ for $i \neq j$.
The following example shows the relationship between a diagonal matrix representation and the
corresponding orthogonal projections.
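Example 13.4. Consider the following diagonal matrices $A, E_1, E_2, E_3$:
$$A = \begin{bmatrix} 2 & & & \\ & 3 & & \\ & & 3 & \\ & & & 5 \end{bmatrix}, \quad E_1 = \begin{bmatrix} 1 & & & \\ & 0 & & \\ & & 0 & \\ & & & 0 \end{bmatrix}, \quad E_2 = \begin{bmatrix} 0 & & & \\ & 1 & & \\ & & 1 & \\ & & & 0 \end{bmatrix}, \quad E_3 = \begin{bmatrix} 0 & & & \\ & 0 & & \\ & & 0 & \\ & & & 1 \end{bmatrix}$$
The reader can verify that:
(i) $A = 2E_1 + 3E_2 + 5E_3$, (ii) $E_1 + E_2 + E_3 = I$, (iii) $E_i^2 = E_i$, (iv) $E_iE_j = 0$ for $i \neq j$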
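Relations (i) through (iv) can be checked mechanically. The sketch below assumes the diagonal matrices A = diag(2, 3, 3, 5), E1 = diag(1, 0, 0, 0), E2 = diag(0, 1, 1, 0), and E3 = diag(0, 0, 0, 1); this is one choice consistent with the four relations, and the helper names are hypothetical.

```python
def diag(entries):
    """Square matrix with the given diagonal and zeros elsewhere."""
    n = len(entries)
    return [[entries[i] if i == j else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def lincomb(coeffs, mats):
    """Entrywise sum of coeffs[k] * mats[k]."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats)) for j in range(n)]
            for i in range(n)]

# Assumed instance of the example (see the lead-in above).
A = diag([2, 3, 3, 5])
E = [diag([1, 0, 0, 0]), diag([0, 1, 1, 0]), diag([0, 0, 0, 1])]
I4 = diag([1, 1, 1, 1])
Z4 = diag([0, 0, 0, 0])

print(lincomb([2, 3, 5], E) == A)                      # relation (i)
print(lincomb([1, 1, 1], E) == I4)                     # relation (ii)
print(all(matmul(e, e) == e for e in E))               # relation (iii)
print(all(matmul(E[i], E[j]) == Z4
          for i in range(3) for j in range(3) if i != j))  # relation (iv)
```

Each $E_i$ picks out the diagonal positions that carry one eigenvalue of A, which is why the projections are mutually orthogonal and sum to the identity.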
Solved Problems
ADJOINTS
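13.1. Find the adjoint of $F: \mathbf{R}^3 \to \mathbf{R}^3$ defined by
$$F(x, y, z) = (3x + 4y - 5z,\; 2x - 6y + 7z,\; 5x - 9y + z)$$
First find the matrix A that represents F in the usual basis of $\mathbf{R}^3$, that is, the matrix A whose rows are the
coefficients of x, y, z, and then form the transpose $A^T$ of A. This yields
$$A = \begin{bmatrix} 3 & 4 & -5 \\ 2 & -6 & 7 \\ 5 & -9 & 1 \end{bmatrix} \quad \text{and then} \quad A^T = \begin{bmatrix} 3 & 2 & 5 \\ 4 & -6 & -9 \\ -5 & 7 & 1 \end{bmatrix}$$
The adjoint $F^*$ is represented by the transpose of A; hence
$$F^*(x, y, z) = (3x + 2y + 5z,\; 4x - 6y - 9z,\; -5x + 7y + z)$$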
13.2. Find the adjoint of $G: \mathbf{C}^3 \to \mathbf{C}^3$ defined by
$$G(x, y, z) = [2x + (1 - i)y,\; (3 + 2i)x - 4iz,\; 2ix + (4 - 3i)y - 3z]$$
First find the matrix B that represents G in the usual basis of $\mathbf{C}^3$, and then form the conjugate transpose
$B^*$ of B. This yields
$$B = \begin{bmatrix} 2 & 1-i & 0 \\ 3+2i & 0 & -4i \\ 2i & 4-3i & -3 \end{bmatrix} \quad \text{and then} \quad B^* = \begin{bmatrix} 2 & 3-2i & -2i \\ 1+i & 0 & 4+3i \\ 0 & 4i & -3 \end{bmatrix}$$
Then $G^*(x, y, z) = [2x + (3 - 2i)y - 2iz,\; (1 + i)x + (4 + 3i)z,\; 4iy - 3z]$.
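A quick sanity check of Problem 13.2 is to confirm the defining identity $\langle G(u), v\rangle = \langle u, G^*(v)\rangle$ at sample vectors. The sketch below assumes the usual inner product on $\mathbf{C}^3$; the test vectors u and v are arbitrary choices, and the function names are hypothetical.

```python
def inner(u, v):
    """Standard inner product on C^n: sum of u_k * conj(v_k)."""
    return sum(a * b.conjugate() for a, b in zip(u, v))

def G(v):
    """The operator G of Problem 13.2."""
    x, y, z = v
    return [2 * x + (1 - 1j) * y,
            (3 + 2j) * x - 4j * z,
            2j * x + (4 - 3j) * y - 3 * z]

def G_star(v):
    """The adjoint of G, read off from the conjugate transpose B*."""
    x, y, z = v
    return [2 * x + (3 - 2j) * y - 2j * z,
            (1 + 1j) * x + (4 + 3j) * z,
            4j * y - 3 * z]

u = [1 + 2j, -1j, 3]      # arbitrary test vectors (illustrative)
v = [2, 1 - 1j, 4j]
print(inner(G(u), v))
print(inner(u, G_star(v)))
```

The two printed values should coincide. Agreement at a few vectors does not prove the identity, but a mismatch would reveal an arithmetic slip in forming $B^*$.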
13.3. Prove Theorem 13.3: Let $\phi$ be a linear functional on an n-dimensional inner product space V. Then
there exists a unique vector $u \in V$ such that $\phi(v) = \langle v, u\rangle$ for every $v \in V$.
Let $\{w_1, \ldots, w_n\}$ be an orthonormal basis of V. Set
$$u = \overline{\phi(w_1)}\,w_1 + \overline{\phi(w_2)}\,w_2 + \cdots + \overline{\phi(w_n)}\,w_n$$
Let $\hat{u}$ be the linear functional on V defined by $\hat{u}(v) = \langle v, u\rangle$ for every $v \in V$. Then, for $i = 1, \ldots, n$,
$$\hat{u}(w_i) = \langle w_i, u\rangle = \langle w_i,\; \overline{\phi(w_1)}\,w_1 + \cdots + \overline{\phi(w_n)}\,w_n\rangle = \phi(w_i)$$
Since $\hat{u}$ and $\phi$ agree on each basis vector, $\hat{u} = \phi$.
Now suppose $u'$ is another vector in V for which $\phi(v) = \langle v, u'\rangle$ for every $v \in V$. Then $\langle v, u\rangle = \langle v, u'\rangle$ or
$\langle v, u - u'\rangle = 0$. In particular, this is true for $v = u - u'$, and so $\langle u - u', u - u'\rangle = 0$. This yields $u - u' = 0$
and $u = u'$. Thus such a vector u is unique, as claimed.
13.4. Prove Theorem 13.1: Let T be a linear operator on an n-dimensional inner product space V. Then:
(a) There exists a unique linear operator $T^*$ on V such that
$\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for all $u, v \in V$.
(b) Let A be the matrix that represents T relative to an orthonormal basis $S = \{u_i\}$. Then the
conjugate transpose $A^*$ of A represents $T^*$ in the basis S.
(a) We first define the mapping $T^*$. Let v be an arbitrary but fixed element of V. The map $u \mapsto \langle T(u), v\rangle$ is a
linear functional on V. Hence, by Theorem 13.3, there exists a unique element $v' \in V$ such that
$\langle T(u), v\rangle = \langle u, v'\rangle$ for every $u \in V$. We define $T^*: V \to V$ by $T^*(v) = v'$. Then $\langle T(u), v\rangle = \langle u, T^*(v)\rangle$ for
every $u, v \in V$.
We next show that $T^*$ is linear. For any $u, v_i \in V$, and any $a, b \in K$,
$$\langle u, T^*(av_1 + bv_2)\rangle = \langle T(u), av_1 + bv_2\rangle = \bar{a}\langle T(u), v_1\rangle + \bar{b}\langle T(u), v_2\rangle$$
$$= \bar{a}\langle u, T^*(v_1)\rangle + \bar{b}\langle u, T^*(v_2)\rangle = \langle u, aT^*(v_1) + bT^*(v_2)\rangle$$
But this is true for every $u \in V$; hence $T^*(av_1 + bv_2) = aT^*(v_1) + bT^*(v_2)$. Thus $T^*$ is linear.
(b) The matrices $A = [a_{ij}]$ and $B = [b_{ij}]$ that represent T and $T^*$, respectively, relative to the orthonormal
basis S are given by $a_{ij} = \langle T(u_j), u_i\rangle$ and $b_{ij} = \langle T^*(u_j), u_i\rangle$ (Problem 13.67). Hence
$$b_{ij} = \langle T^*(u_j), u_i\rangle = \overline{\langle u_i, T^*(u_j)\rangle} = \overline{\langle T(u_i), u_j\rangle} = \overline{a_{ji}}$$
Thus $B = A^*$, as claimed.
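13.5. Prove Theorem 13.2:
(i) $(T_1 + T_2)^* = T_1^* + T_2^*$, (iii) $(T_1T_2)^* = T_2^*T_1^*$,
(ii) $(kT)^* = \bar{k}T^*$, (iv) $(T^*)^* = T$.
(i) For any $u, v \in V$,
$$\langle (T_1 + T_2)(u), v\rangle = \langle T_1(u) + T_2(u), v\rangle = \langle T_1(u), v\rangle + \langle T_2(u), v\rangle$$
$$= \langle u, T_1^*(v)\rangle + \langle u, T_2^*(v)\rangle = \langle u, T_1^*(v) + T_2^*(v)\rangle$$
$$= \langle u, (T_1^* + T_2^*)(v)\rangle$$
The uniqueness of the adjoint implies $(T_1 + T_2)^* = T_1^* + T_2^*$.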
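(ii) For any $u, v \in V$,
$$\langle (kT)(u), v\rangle = \langle kT(u), v\rangle = k\langle T(u), v\rangle = k\langle u, T^*(v)\rangle = \langle u, \bar{k}T^*(v)\rangle = \langle u, (\bar{k}T^*)(v)\rangle$$
The uniqueness of the adjoint implies $(kT)^* = \bar{k}T^*$.
(iii) For any $u, v \in V$,
$$\langle (T_1T_2)(u), v\rangle = \langle T_1(T_2(u)), v\rangle = \langle T_2(u), T_1^*(v)\rangle$$
$$= \langle u, T_2^*(T_1^*(v))\rangle = \langle u, (T_2^*T_1^*)(v)\rangle$$
The uniqueness of the adjoint implies $(T_1T_2)^* = T_2^*T_1^*$.
(iv) For any $u, v \in V$,
$$\langle T^*(u), v\rangle = \overline{\langle v, T^*(u)\rangle} = \overline{\langle T(v), u\rangle} = \langle u, T(v)\rangle$$
The uniqueness of the adjoint implies $(T^*)^* = T$.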
13.6. Show that: (a) $I^* = I$, and (b) $0^* = 0$.
(a) For every $u, v \in V$, $\langle I(u), v\rangle = \langle u, v\rangle = \langle u, I(v)\rangle$; hence $I^* = I$.
(b) For every $u, v \in V$, $\langle 0(u), v\rangle = \langle 0, v\rangle = 0 = \langle u, 0\rangle = \langle u, 0(v)\rangle$; hence $0^* = 0$.
13.7. Suppose T is invertible. Show that $(T^{-1})^* = (T^*)^{-1}$.
$I = I^* = (TT^{-1})^* = (T^{-1})^*T^*$; hence $(T^{-1})^* = (T^*)^{-1}$.
13.8. Let T be a linear operator on V, and let W be a T-invariant subspace of V. Show that $W^\perp$ is invariant
under $T^*$.
Let $u \in W^\perp$. If $w \in W$, then $T(w) \in W$ and so $\langle w, T^*(u)\rangle = \langle T(w), u\rangle = 0$. Thus $T^*(u) \in W^\perp$ since it is
orthogonal to every $w \in W$. Hence $W^\perp$ is invariant under $T^*$.
13.9. Let T be a linear operator on V. Show that each of the following conditions implies T = 0:
(i) $\langle T(u), v\rangle = 0$ for every $u, v \in V$.
(ii) V is a complex space, and $\langle T(u), u\rangle = 0$ for every $u \in V$.
(iii) T is self-adjoint and $\langle T(u), u\rangle = 0$ for every $u \in V$.
Give an example of an operator T on a real space V for which $\langle T(u), u\rangle = 0$ for every $u \in V$ but
$T \neq 0$. [Thus (ii) need not hold for a real space V.]
(i) Set $v = T(u)$. Then $\langle T(u), T(u)\rangle = 0$ and hence $T(u) = 0$, for every $u \in V$. Accordingly, T = 0.
(ii) By hypothesis, $\langle T(v + w), v + w\rangle = 0$ for any $v, w \in V$. Expanding and setting $\langle T(v), v\rangle = 0$ and
$\langle T(w), w\rangle = 0$, we find
$$\langle T(v), w\rangle + \langle T(w), v\rangle = 0 \qquad (1)$$
Note w is arbitrary in (1). Substituting $iw$ for w, and using $\langle T(v), iw\rangle = \bar{i}\langle T(v), w\rangle = -i\langle T(v), w\rangle$ and
$\langle T(iw), v\rangle = \langle iT(w), v\rangle = i\langle T(w), v\rangle$, we find
$$-i\langle T(v), w\rangle + i\langle T(w), v\rangle = 0$$
Dividing through by i and adding to (1), we obtain $\langle T(w), v\rangle = 0$ for any $v, w \in V$. By (i), T = 0.
(iii) By (ii), the result holds for the complex case; hence we need only consider the real case. Expanding
$\langle T(v + w), v + w\rangle = 0$, we again obtain (1). Since T is self-adjoint and since it is a real space, we have
$\langle T(w), v\rangle = \langle w, T(v)\rangle = \langle T(v), w\rangle$. Substituting this into (1), we obtain $\langle T(v), w\rangle = 0$ for any $v, w \in V$.
By (i), T = 0.
For an example, consider the linear operator T on $\mathbf{R}^2$ defined by $T(x, y) = (y, -x)$. Then
$\langle T(u), u\rangle = 0$ for every $u \in V$, but $T \neq 0$.
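The counterexample at the end of Problem 13.9 takes only a few lines of Python to check. The sketch assumes the usual dot product on $\mathbf{R}^2$, and the sample vectors are arbitrary.

```python
def T(u):
    """The operator T(x, y) = (y, -x), a quarter-turn rotation of the plane."""
    x, y = u
    return (y, -x)

def dot(u, v):
    """Usual dot product on R^n."""
    return sum(a * b for a, b in zip(u, v))

samples = [(1, 0), (0, 1), (3, -2), (-5, 7)]
print([dot(T(u), u) for u in samples])  # T(u) is always orthogonal to u
print(T((1, 0)))                        # a nonzero image, so T is not zero
```

Algebraically, $\langle T(u), u\rangle = yx - xy = 0$ for every $u = (x, y)$, which is why the check succeeds at all sample vectors even though T is not the zero operator.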
ORTHOGONAL AND UNITARY OPERATORS AND MATRICES
13.10. Prove Theorem 13.6: The following conditions on an operator U are equivalent:
(i) $U^* = U^{-1}$, that is, U is unitary. (ii) $\langle U(v), U(w)\rangle = \langle v, w\rangle$. (iii) $\|U(v)\| = \|v\|$.
Suppose (i) holds. Then, for every $v, w \in V$,
$$\langle U(v), U(w)\rangle = \langle v, U^*U(w)\rangle = \langle v, I(w)\rangle = \langle v, w\rangle$$
Thus (i) implies (ii). Now if (ii) holds, then
$$\|U(v)\| = \sqrt{\langle U(v), U(v)\rangle} = \sqrt{\langle v, v\rangle} = \|v\|$$
Hence (ii) implies (iii). It remains to show that (iii) implies (i).
Suppose (iii) holds. Then for every $v \in V$,
$$\langle U^*U(v), v\rangle = \langle U(v), U(v)\rangle = \langle v, v\rangle = \langle I(v), v\rangle$$
Hence $\langle (U^*U - I)(v), v\rangle = 0$ for every $v \in V$. But $U^*U - I$ is self-adjoint (Prove!); then, by Problem 13.9,
we have $U^*U - I = 0$ and so $U^*U = I$. Thus $U^* = U^{-1}$, as claimed.
13.11. Let U be a unitary (orthogonal) operator on V, and let W be a subspace invariant under U. Show
that $W^\perp$ is also invariant under U.
Since U is nonsingular, $U(W) = W$; that is, for any $w \in W$, there exists $w' \in W$ such that $U(w') = w$.
Now let $v \in W^\perp$. Then, for any $w \in W$,
$$\langle U(v), w\rangle = \langle U(v), U(w')\rangle = \langle v, w'\rangle = 0$$
Thus $U(v)$ belongs to $W^\perp$. Therefore $W^\perp$ is invariant under U.
13.12. Prove Theorem 13.9: The change-of-basis matrix from an orthonormal basis $\{u_1, \ldots, u_n\}$ into
another orthonormal basis is unitary (orthogonal). Conversely, if $P = [a_{ij}]$ is a unitary (orthogonal)
matrix, then the vectors $u'_i = \sum_j a_{ji}u_j$ form an orthonormal basis.
Suppose $\{v_i\}$ is another orthonormal basis and suppose
$$v_i = b_{i1}u_1 + b_{i2}u_2 + \cdots + b_{in}u_n, \quad i = 1, \ldots, n \qquad (1)$$
Since $\{v_i\}$ is orthonormal,
$$\delta_{ij} = \langle v_i, v_j\rangle = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}} \qquad (2)$$
Let $B = [b_{ij}]$ be the matrix of coefficients in (1). (Then $B^T$ is the change-of-basis matrix from $\{u_i\}$ to
$\{v_i\}$.) Then $BB^* = [c_{ij}]$, where $c_{ij} = b_{i1}\overline{b_{j1}} + b_{i2}\overline{b_{j2}} + \cdots + b_{in}\overline{b_{jn}}$. By (2), $c_{ij} = \delta_{ij}$, and therefore $BB^* = I$.
Accordingly, B, and hence $B^T$, is unitary.
It remains to prove that $\{u'_i\}$ is orthonormal. By Problem 13.67,
$$\langle u'_i, u'_j\rangle = a_{1i}\overline{a_{1j}} + a_{2i}\overline{a_{2j}} + \cdots + a_{ni}\overline{a_{nj}} = \langle C_i, C_j\rangle$$
where $C_i$ denotes the ith column of the unitary (orthogonal) matrix $P = [a_{ij}]$. Since P is unitary (orthogonal),
its columns are orthonormal; hence $\langle u'_i, u'_j\rangle = \langle C_i, C_j\rangle = \delta_{ij}$. Thus $\{u'_i\}$ is an orthonormal basis.
SYMMETRIC OPERATORS AND CANONICAL FORMS IN EUCLIDEAN SPACES
13.13. Let T be a symmetric operator. Show that: (a) The characteristic polynomial $\Delta(t)$ of T is a product
of linear polynomials (over R). (b) T has a nonzero eigenvector.
(a) Let A be a matrix representing T relative to an orthonormal basis of V; then $A = A^T$. Let $\Delta(t)$ be the
characteristic polynomial of A. Viewing A as a complex self-adjoint operator, A has only real eigenvalues
by Theorem 13.4. Thus
$$\Delta(t) = (t - \lambda_1)(t - \lambda_2)\cdots(t - \lambda_n)$$
where the $\lambda_i$ are all real. In other words, $\Delta(t)$ is a product of linear polynomials over R.
(b) By (a), T has at least one (real) eigenvalue. Hence T has a nonzero eigenvector.
13.14. Prove Theorem 13.11: Let T be a symmetric operator on a real n-dimensional inner product space
V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Hence T can be
represented by a diagonal matrix relative to an orthonormal basis.)
The proof is by induction on the dimension of V. If dim V = 1, the theorem trivially holds. Now suppose
dim V = n > 1. By Problem 13.13, there exists a nonzero eigenvector $v_1$ of T. Let W be the space spanned by
$v_1$, and let $u_1$ be a unit vector in W, e.g., let $u_1 = v_1/\|v_1\|$.
Since $v_1$ is an eigenvector of T, the subspace W of V is invariant under T. By Problem 13.8, $W^\perp$ is
invariant under $T^* = T$. Thus the restriction $\hat{T}$ of T to $W^\perp$ is a symmetric operator. By Theorem 7.4,
$V = W \oplus W^\perp$. Hence dim $W^\perp$ = n - 1, since dim W = 1. By induction, there exists an orthonormal basis
$\{u_2, \ldots, u_n\}$ of $W^\perp$ consisting of eigenvectors of $\hat{T}$ and hence of T. But $\langle u_1, u_i\rangle = 0$ for $i = 2, \ldots, n$ because
$u_i \in W^\perp$. Accordingly $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal set and consists of eigenvectors of T. Thus the
theorem is proved.
13.15. Let $q(x, y) = 3x^2 - 6xy + 11y^2$. Find an orthonormal change of coordinates (linear substitution)
that diagonalizes the quadratic form q.
Find the symmetric matrix A representing q and its characteristic polynomial $\Delta(t)$. We have
$$A = \begin{bmatrix} 3 & -3 \\ -3 & 11 \end{bmatrix} \quad \text{and} \quad \Delta(t) = t^2 - \operatorname{tr}(A)\,t + |A| = t^2 - 14t + 24 = (t - 2)(t - 12)$$
The eigenvalues are $\lambda = 2$ and $\lambda = 12$. Hence a diagonal form of q is
$$q(s, t) = 2s^2 + 12t^2$$
(where we use s and t as new variables). The corresponding orthogonal change of coordinates is obtained by
finding an orthogonal set of eigenvectors of A.
Subtract $\lambda = 2$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} 1 & -3 \\ -3 & 9 \end{bmatrix} \quad \text{corresponding to} \quad \begin{array}{r} x - 3y = 0 \\ -3x + 9y = 0 \end{array} \quad \text{or} \quad x - 3y = 0$$
A nonzero solution is $u_1 = (3, 1)$. Next subtract $\lambda = 12$ down the diagonal of A to obtain the matrix
$$M = \begin{bmatrix} -9 & -3 \\ -3 & -1 \end{bmatrix} \quad \text{corresponding to} \quad \begin{array}{r} -9x - 3y = 0 \\ -3x - y = 0 \end{array} \quad \text{or} \quad 3x + y = 0$$
A nonzero solution is $u_2 = (-1, 3)$. Normalize $u_1$ and $u_2$ to obtain the orthonormal basis
$$\hat{u}_1 = (3/\sqrt{10},\; 1/\sqrt{10}), \quad \hat{u}_2 = (-1/\sqrt{10},\; 3/\sqrt{10})$$
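Now let P be the matrix whose columns are $\hat{u}_1$ and $\hat{u}_2$. Then
$$P = \begin{bmatrix} 3/\sqrt{10} & -1/\sqrt{10} \\ 1/\sqrt{10} & 3/\sqrt{10} \end{bmatrix} \quad \text{and} \quad D = P^{-1}AP = P^TAP = \begin{bmatrix} 2 & 0 \\ 0 & 12 \end{bmatrix}$$
Thus the required orthogonal change of coordinates is
$$\begin{bmatrix} x \\ y \end{bmatrix} = P\begin{bmatrix} s \\ t \end{bmatrix} \quad \text{or} \quad x = \frac{3s - t}{\sqrt{10}}, \quad y = \frac{s + 3t}{\sqrt{10}}$$
One can also express s and t in terms of x and y by using $P^{-1} = P^T$; that is,
$$s = \frac{3x + y}{\sqrt{10}}, \quad t = \frac{-x + 3y}{\sqrt{10}}$$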
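The result of Problem 13.15 can be confirmed numerically by forming $P^TAP$ in floating point. This is a sketch using only the standard library; the helper names are hypothetical, and the entries of the product match the exact values 2, 12, and 0 only up to rounding error.

```python
import math

def transpose(m):
    """Transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    """Matrix product of a and b."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

A = [[3, -3], [-3, 11]]          # matrix of q(x, y) = 3x^2 - 6xy + 11y^2
r = math.sqrt(10)
P = [[3 / r, -1 / r],
     [1 / r, 3 / r]]             # columns are the normalized eigenvectors

D = matmul(matmul(transpose(P), A), P)
PtP = matmul(transpose(P), P)

print([[round(x, 10) for x in row] for row in D])    # approximately diag(2, 12)
print([[round(x, 10) for x in row] for row in PtP])  # approximately the identity
```

The second product checks that P is orthogonal, so that $P^{-1} = P^T$ as used in the final step of the solution.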
13.16. Prove Theorem 13.12: Let T be an orthogonal operator on a real inner product space V. Then
there exists an orthonormal basis of V in which T is represented by a block diagonal matrix M of
the form
$$M = \operatorname{diag}\left(1, \ldots, 1,\; -1, \ldots, -1,\; \begin{bmatrix} \cos\theta_1 & -\sin\theta_1 \\ \sin\theta_1 & \cos\theta_1 \end{bmatrix},\; \ldots,\; \begin{bmatrix} \cos\theta_r & -\sin\theta_r \\ \sin\theta_r & \cos\theta_r \end{bmatrix}\right)$$
Let $S = T + T^{-1} = T + T^*$. Then $S^* = (T + T^*)^* = T^* + T = S$. Thus S is a symmetric operator on
V. By Theorem 13.11, there exists an orthonormal basis of V consisting of eigenvectors of S. If $\lambda_1, \ldots, \lambda_m$
denote the distinct eigenvalues of S, then V can be decomposed into the direct sum $V = V_1 \oplus V_2 \oplus \cdots \oplus V_m$
where $V_i$ consists of the eigenvectors of S belonging to $\lambda_i$. We claim that each $V_i$ is invariant under T. For
suppose $v \in V_i$; then $S(v) = \lambda_i v$ and
$$S(T(v)) = (T + T^{-1})T(v) = T(T + T^{-1})(v) = TS(v) = T(\lambda_i v) = \lambda_i T(v)$$
That is, $T(v) \in V_i$. Hence $V_i$ is invariant under T. Since the $V_i$ are orthogonal to each other, we can restrict our
investigation to the way that T acts on each individual $V_i$.
On a given $V_i$, we have $(T + T^{-1})v = S(v) = \lambda_i v$. Multiplying by T, we get
$$(T^2 - \lambda_i T + I)(v) = 0$$
We consider the cases $\lambda_i = \pm 2$ and $\lambda_i \neq \pm 2$ separately. If $\lambda_i = \pm 2$, then $(T \mp I)^2(v) = 0$, which leads to
$(T \mp I)(v) = 0$ or $T(v) = \pm v$. Thus T restricted to this $V_i$ is either I or -I.
If $\lambda_i \neq \pm 2$, then T has no eigenvectors in $V_i$, since, by Theorem 13.4, the only eigenvalues of T are 1 or
-1. Accordingly, for $v \neq 0$, the vectors v and T(v) are linearly independent. Let W be the subspace spanned
by v and T(v). Then W is invariant under T, since
$$T(T(v)) = T^2(v) = \lambda_i T(v) - v$$
By Theorem 7.4, $V_i = W \oplus W^\perp$. Furthermore, by Problem 13.8, $W^\perp$ is also invariant under T. Thus we can
decompose $V_i$ into the direct sum of two-dimensional subspaces $W_j$ where the $W_j$ are orthogonal to each other
and each $W_j$ is invariant under T. Thus we can restrict our investigation to the way in which T acts on
each individual $W_j$.
Since $T^2 - \lambda_i T + I = 0$, the characteristic polynomial $\Delta(t)$ of T acting on $W_j$ is $\Delta(t) = t^2 - \lambda_i t + 1$.
Thus the determinant of T is 1, the constant term in $\Delta(t)$. By Theorem 2.7, the matrix A representing T acting
on $W_j$ relative to any orthonormal basis of $W_j$ must be of the form
$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
The union of the bases of the $W_j$ gives an orthonormal basis of $V_i$, and the union of the bases of the $V_i$ gives an
orthonormal basis of V in which the matrix representing T is of the desired form.
NORMAL OPERATORS AND CANONICAL FORMS IN UNITARY SPACES
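13.17. Determine which of the following matrices is normal:
$$\text{(a)}\; A = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}, \qquad \text{(b)}\; B = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix}$$
(a) $AA^* = \begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix} = \begin{bmatrix} 2 & i \\ -i & 1 \end{bmatrix}$, $\quad A^*A = \begin{bmatrix} 1 & 0 \\ -i & 1 \end{bmatrix}\begin{bmatrix} 1 & i \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & i \\ -i & 2 \end{bmatrix}$
Since $AA^* \neq A^*A$, the matrix A is not normal.
(b) $BB^* = \begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix}\begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix} = \begin{bmatrix} 2 & 2+2i \\ 2-2i & 6 \end{bmatrix}$, $\quad B^*B = \begin{bmatrix} 1 & 1 \\ -i & 2-i \end{bmatrix}\begin{bmatrix} 1 & i \\ 1 & 2+i \end{bmatrix} = \begin{bmatrix} 2 & 2+2i \\ 2-2i & 6 \end{bmatrix}$
Since $BB^* = B^*B$, the matrix B is normal.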
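13.18. Let T be a normal operator. Prove the following:
(a) $T(v) = 0$ if and only if $T^*(v) = 0$. (b) $T - \lambda I$ is normal.
(c) If $T(v) = \lambda v$, then $T^*(v) = \bar{\lambda}v$; hence any eigenvector of T is also an eigenvector of $T^*$.
(d) If $T(v) = \lambda_1 v$ and $T(w) = \lambda_2 w$ where $\lambda_1 \neq \lambda_2$, then $\langle v, w\rangle = 0$; that is, eigenvectors of T
belonging to distinct eigenvalues are orthogonal.
(a) We show that $\langle T(v), T(v)\rangle = \langle T^*(v), T^*(v)\rangle$:
$$\langle T(v), T(v)\rangle = \langle v, T^*T(v)\rangle = \langle v, TT^*(v)\rangle = \langle T^*(v), T^*(v)\rangle$$
Hence, by $[I_3]$ in the definition of the inner product in Section 7.2, $T(v) = 0$ if and only if $T^*(v) = 0$.
(b) We show that $T - \lambda I$ commutes with its adjoint:
$$(T - \lambda I)(T - \lambda I)^* = (T - \lambda I)(T^* - \bar{\lambda}I) = TT^* - \lambda T^* - \bar{\lambda}T + \lambda\bar{\lambda}I$$
$$= T^*T - \bar{\lambda}T - \lambda T^* + \bar{\lambda}\lambda I = (T^* - \bar{\lambda}I)(T - \lambda I)$$
$$= (T - \lambda I)^*(T - \lambda I)$$
Thus $T - \lambda I$ is normal.
(c) If $T(v) = \lambda v$, then $(T - \lambda I)(v) = 0$. Now $T - \lambda I$ is normal by (b); therefore, by (a), $(T - \lambda I)^*(v) = 0$.
That is, $(T^* - \bar{\lambda}I)(v) = 0$; hence $T^*(v) = \bar{\lambda}v$.
(d) We show that $\lambda_1\langle v, w\rangle = \lambda_2\langle v, w\rangle$:
$$\lambda_1\langle v, w\rangle = \langle \lambda_1 v, w\rangle = \langle T(v), w\rangle = \langle v, T^*(w)\rangle = \langle v, \bar{\lambda}_2 w\rangle = \lambda_2\langle v, w\rangle$$
But $\lambda_1 \neq \lambda_2$; hence $\langle v, w\rangle = 0$.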
13.19. Prove Theorem 13.13: Let T be a normal operator on a complex finite-dimensional inner product
space V. Then there exists an orthonormal basis of V consisting of eigenvectors of T. (Thus T can
be represented by a diagonal matrix relative to an orthonormal basis.)
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now
suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue and hence a
nonzero eigenvector v. Let W be the subspace of V spanned by v, and let $u_1$ be a unit vector in W.
Since v is an eigenvector of T, the subspace W is invariant under T. However, v is also an eigenvector of
$T^*$ by Problem 13.18; hence W is also invariant under $T^*$. By Problem 13.8, $W^\perp$ is invariant under $T^{**} = T$.
The remainder of the proof is identical with the latter part of the proof of Theorem 13.11 (Problem 13.14).
13.20. Prove Theorem 13.14: Let T be any operator on a complex finite-dimensional inner product space
V. Then T can be represented by a triangular matrix relative to an orthonormal basis of V.
The proof is by induction on the dimension of V. If dim V = 1, then the theorem trivially holds. Now
suppose dim V = n > 1. Since V is a complex vector space, T has at least one eigenvalue and hence at least
one nonzero eigenvector v. Let W be the subspace of V spanned by v, and let $u_1$ be a unit vector in W. Then $u_1$
is an eigenvector of T and, say, $T(u_1) = a_{11}u_1$.
By Theorem 7.4, $V = W \oplus W^\perp$. Let E denote the orthogonal projection of V onto $W^\perp$. Clearly $W^\perp$ is
invariant under the operator ET. By induction, there exists an orthonormal basis $\{u_2, \ldots, u_n\}$ of $W^\perp$ such that,
for $i = 2, \ldots, n$,
$$ET(u_i) = a_{i2}u_2 + a_{i3}u_3 + \cdots + a_{ii}u_i$$
(Note that $\{u_1, u_2, \ldots, u_n\}$ is an orthonormal basis of V.) But E is the orthogonal projection of V onto $W^\perp$;
hence we must have
$$T(u_i) = a_{i1}u_1 + a_{i2}u_2 + \cdots + a_{ii}u_i$$
for $i = 2, \ldots, n$. This with $T(u_1) = a_{11}u_1$ gives us the desired result.
MISCELLANEOUS PROBLEMS
13.21. Prove Theorem 13.10B: The following are equivalent:
(i) $P = T^2$ for some self-adjoint operator T.
(ii) $P = S^*S$ for some operator S, that is, P is positive.
(iii) P is self-adjoint and $\langle P(u), u\rangle \geq 0$ for every $u \in V$.
Suppose (i) holds; that is, $P = T^2$ where $T = T^*$. Then $P = TT = T^*T$, and so (i) implies (ii). Now
suppose (ii) holds. Then $P^* = (S^*S)^* = S^*S^{**} = S^*S = P$, and so P is self-adjoint. Furthermore,
$$\langle P(u), u\rangle = \langle S^*S(u), u\rangle = \langle S(u), S(u)\rangle \geq 0$$
Thus (ii) implies (iii), and so it remains to prove that (iii) implies (i).
Now suppose (iii) holds. Since P is self-adjoint, there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V
consisting of eigenvectors of P; say, $P(u_i) = \lambda_i u_i$. By Theorem 13.4, the $\lambda_i$ are real. Using (iii), we show that
the $\lambda_i$ are nonnegative. We have, for each i,
$$0 \leq \langle P(u_i), u_i\rangle = \langle \lambda_i u_i, u_i\rangle = \lambda_i\langle u_i, u_i\rangle$$
Thus $\langle u_i, u_i\rangle > 0$ forces $\lambda_i \geq 0$, as claimed. Accordingly, $\sqrt{\lambda_i}$ is a real number. Let T be the linear operator
defined by
$$T(u_i) = \sqrt{\lambda_i}\,u_i \quad \text{for } i = 1, \ldots, n$$
Since T is represented by a real diagonal matrix relative to the orthonormal basis $\{u_i\}$, T is self-adjoint.
Moreover, for each i,
$$T^2(u_i) = T(\sqrt{\lambda_i}\,u_i) = \sqrt{\lambda_i}\,T(u_i) = \sqrt{\lambda_i}\sqrt{\lambda_i}\,u_i = \lambda_i u_i = P(u_i)$$
Since $T^2$ and P agree on a basis of V, $P = T^2$. Thus the theorem is proved.
Remark: The above operator T is the unique positive operator such that $P = T^2$; it is called the
positive square root of P.
13.22. Show that any operator T is the sum of a self-adjoint operator and a skew-adjoint operator.
Set $S = \frac{1}{2}(T + T^*)$ and $U = \frac{1}{2}(T - T^*)$. Then $T = S + U$, where
$$S^* = [\tfrac{1}{2}(T + T^*)]^* = \tfrac{1}{2}(T^* + T^{**}) = \tfrac{1}{2}(T^* + T) = S$$
$$U^* = [\tfrac{1}{2}(T - T^*)]^* = \tfrac{1}{2}(T^* - T) = -\tfrac{1}{2}(T - T^*) = -U$$
i.e., S is self-adjoint and U is skew-adjoint.
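The decomposition in Problem 13.22 can be tried on a concrete matrix. The Python sketch below assumes the matrix analogue of the statement, with the conjugate transpose playing the role of the adjoint. The sample matrix T and the helper names are hypothetical.

```python
def conj_transpose(m):
    """Conjugate transpose of a complex matrix."""
    return [[entry.conjugate() for entry in col] for col in zip(*m)]

def half_combine(a, b, sign):
    """Entrywise (a + sign * b) / 2."""
    return [[(x + sign * y) / 2 for x, y in zip(ra, rb)]
            for ra, rb in zip(a, b)]

T = [[1 + 2j, 3], [4j, 5 - 1j]]      # arbitrary complex matrix (illustrative)
T_star = conj_transpose(T)

S = half_combine(T, T_star, +1)      # self-adjoint part (T + T*) / 2
U = half_combine(T, T_star, -1)      # skew-adjoint part (T - T*) / 2

print(S)
print(U)
print([[s + u for s, u in zip(rs, ru)] for rs, ru in zip(S, U)])  # recovers T
```

This mirrors the way a complex number z splits into its real part and its imaginary part, in keeping with the analogy of Table 13-1.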
13.23. Prove: Let T be an arbitrary linear operator on a finite-dimensional inner product space V. Then T is
a product of a unitary (orthogonal) operator U and a unique positive operator P; that is, $T = UP$.
Furthermore, if T is invertible, then U is also uniquely determined.
By Theorem 13.10, $T^*T$ is a positive operator, and hence there exists a (unique) positive operator P such
that $P^2 = T^*T$ (Problem 13.43). Observe that
\[ \|P(v)\|^2 = \langle P(v), P(v)\rangle = \langle P^2(v), v\rangle = \langle T^*T(v), v\rangle = \langle T(v), T(v)\rangle = \|T(v)\|^2 \qquad (1) \]
We now consider separately the cases when T is invertible and non-invertible.
If T is invertible, then we set $\hat{U} = PT^{-1}$. We show that $\hat{U}$ is unitary:
\[ \hat{U}^* = (PT^{-1})^* = (T^{-1})^*P^* = (T^*)^{-1}P \quad\text{and}\quad \hat{U}^*\hat{U} = (T^*)^{-1}PPT^{-1} = (T^*)^{-1}T^*TT^{-1} = I \]
Thus $\hat{U}$ is unitary. We next set $U = \hat{U}^{-1}$. Then U is also unitary, and $T = UP$ as required.
To prove uniqueness, we assume $T = U_0P_0$, where $U_0$ is unitary and $P_0$ is positive. Then
\[ T^*T = P_0^*U_0^*U_0P_0 = P_0IP_0 = P_0^2 \]
But the positive square root of $T^*T$ is unique (Problem 13.43); hence $P_0 = P$. (Note that the invertibility of T
is not used to prove the uniqueness of P.) Now if T is invertible, then P is also invertible by (1). Multiplying
$U_0P = UP$ on the right by $P^{-1}$ yields $U_0 = U$. Thus U is also unique when T is invertible.
Now suppose T is not invertible. Let W be the image of P, i.e., $W = \operatorname{Im} P$. We define $U_1 : W \to V$ by
\[ U_1(w) = T(v), \quad\text{where } P(v) = w \qquad (2) \]
We must show that $U_1$ is well defined; that is, that $P(v) = P(v')$ implies $T(v) = T(v')$. This follows from the
fact that $P(v - v') = 0$ is equivalent to $\|P(v - v')\| = 0$, which forces $\|T(v - v')\| = 0$ by (1). Thus $U_1$ is well
defined. We next define $U_2 : W^\perp \to V$. Note that, by (1), P and T have the same kernels. Hence the images of P
and T have the same dimension, i.e., $\dim(\operatorname{Im} P) = \dim W = \dim(\operatorname{Im} T)$. Consequently, $W^\perp$ and $(\operatorname{Im} T)^\perp$ also
have the same dimension. We let $U_2$ be any inner-product-preserving isomorphism between $W^\perp$ and $(\operatorname{Im} T)^\perp$
(one exists because the two spaces have the same dimension; see Problem 13.64).
We next set $U = U_1 \oplus U_2$. [Here U is defined as follows: If $v \in V$ and $v = w + w'$, where $w \in W$,
$w' \in W^\perp$, then $U(v) = U_1(w) + U_2(w')$.] Now U is linear (Problem 13.69), and, if $v \in V$ and $P(v) = w$, then,
by (2),
\[ T(v) = U_1(w) = U(w) = UP(v) \]
Thus $T = UP$, as required.
It remains to show that U is unitary. Now every vector $x \in V$ can be written in the form $x = P(v) + w'$,
where $w' \in W^\perp$. Then $U(x) = UP(v) + U_2(w') = T(v) + U_2(w')$, where $\langle T(v), U_2(w')\rangle = 0$ by definition of
$U_2$. Also, $\langle T(v), T(v)\rangle = \langle P(v), P(v)\rangle$ by (1). Thus
\[ \langle U(x), U(x)\rangle = \langle T(v) + U_2(w'), T(v) + U_2(w')\rangle = \langle T(v), T(v)\rangle + \langle U_2(w'), U_2(w')\rangle \]
\[ = \langle P(v), P(v)\rangle + \langle w', w'\rangle = \langle P(v) + w', P(v) + w'\rangle = \langle x, x\rangle \]
[We also used the fact that $\langle P(v), w'\rangle = 0$.] Thus U is unitary, and the theorem is proved.
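For the invertible case, the factorization $T = UP$ can be computed directly from the proof: take P to be the positive square root of $T^*T$ and set $U = TP^{-1}$ (the inverse of $PT^{-1}$). The following is a minimal sketch for real invertible 2 x 2 matrices only; all function names are illustrative assumptions, and the non-invertible case is not covered.

```python
import math

def matmul(A, B):
    """Product of two 2 x 2 matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def positive_sqrt(M):
    """Positive square root of a real symmetric positive 2 x 2 matrix."""
    a, b, d = M[0][0], M[0][1], M[1][1]
    mean, radius = (a + d) / 2.0, math.hypot((a - d) / 2.0, b)
    r1 = math.sqrt(mean + radius)
    r2 = math.sqrt(max(mean - radius, 0.0))
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    off = (r1 - r2) * c * s
    return [[r1 * c * c + r2 * s * s, off], [off, r1 * s * s + r2 * c * c]]

def inverse(A):
    """Inverse of an invertible 2 x 2 matrix."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [[A[1][1] / det, -A[0][1] / det],
            [-A[1][0] / det, A[0][0] / det]]

def polar_decomposition(T):
    """Return (U, P) with T = U P, P positive and U orthogonal (T invertible)."""
    P = positive_sqrt(matmul(transpose(T), T))   # P^2 = T* T
    U = matmul(T, inverse(P))                    # U = T P^{-1}
    return U, P

T = [[1.0, 2.0], [3.0, 4.0]]
U, P = polar_decomposition(T)
print(matmul(U, P))              # reproduces T up to round-off
print(matmul(transpose(U), U))   # close to the identity matrix
```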
13.24. Let V be the vector space of polynomials over R with inner product defined by
\[ \langle f, g\rangle = \int_0^1 f(t)g(t)\,dt \]
Give an example of a linear functional $\phi$ on V for which Theorem 13.3 does not hold, i.e., for
which there is no polynomial $h(t)$ such that $\phi(f) = \langle f, h\rangle$ for every $f \in V$.
Let $\phi : V \to \mathbf{R}$ be defined by $\phi(f) = f(0)$; that is, $\phi$ evaluates $f(t)$ at 0, and hence maps $f(t)$ into its
constant term. Suppose a polynomial $h(t)$ exists for which
\[ \phi(f) = f(0) = \int_0^1 f(t)h(t)\,dt \qquad (1) \]
for every polynomial $f(t)$. Observe that $\phi$ maps the polynomial $tf(t)$ into 0; hence, by (1),
\[ \int_0^1 t f(t)h(t)\,dt = 0 \qquad (2) \]
for every polynomial $f(t)$. In particular (2) must hold for $f(t) = th(t)$; that is,
\[ \int_0^1 t^2h^2(t)\,dt = 0 \]
This integral forces $h(t)$ to be the zero polynomial; hence $\phi(f) = \langle f, h\rangle = \langle f, 0\rangle = 0$ for every polynomial
$f(t)$. This contradicts the fact that $\phi$ is not the zero functional; hence the polynomial $h(t)$ does not exist.
Supplementary Problems
ADJOINT
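13.25. Find the adjoint of:
(a) $A = \begin{bmatrix} 5-2i & 3+7i \\ 4-6i & 8+3i \end{bmatrix}$, (b) $B = \begin{bmatrix} 3 & 5i \\ i & -2i \end{bmatrix}$, (c) $C = \begin{bmatrix} 1 & 1 \\ 2 & 3 \end{bmatrix}$
13.26. Let $T : \mathbf{R}^3 \to \mathbf{R}^3$ be defined by $T(x, y, z) = (x + 2y,\ 3x - 4z,\ y)$. Find $T^*(x, y, z)$.
13.27. Let $T : \mathbf{C}^3 \to \mathbf{C}^3$ be defined by $T(x, y, z) = [ix + (2 + 3i)y,\ 3x + (3 - i)z,\ (2 - 5i)y + iz]$.
Find $T^*(x, y, z)$.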
13.28. For each linear function $\phi$ on V, find $u \in V$ such that $\phi(v) = \langle v, u\rangle$ for every $v \in V$:
(a) $\phi : \mathbf{R}^3 \to \mathbf{R}$ defined by $\phi(x, y, z) = x + 2y - 3z$.
(b) $\phi : \mathbf{C}^3 \to \mathbf{C}$ defined by $\phi(x, y, z) = ix + (2 + 3i)y + (1 - 2i)z$.
(c) $\phi : V \to \mathbf{R}$ defined by $\phi(f) = f(1)$, where V is the vector space of Problem 13.24.
13.29. Suppose V has finite dimension. Prove that the image of $T^*$ is the orthogonal complement of the kernel of T,
i.e., $\operatorname{Im} T^* = (\operatorname{Ker} T)^\perp$. Hence $\operatorname{rank}(T) = \operatorname{rank}(T^*)$.
13.30. Show that $T^*T = 0$ implies $T = 0$.
13.31. Let V be the vector space of polynomials over R with inner product defined by $\langle f, g\rangle = \int_0^1 f(t)g(t)\,dt$. Let D
be the derivative operator on V, i.e., $D(f) = df/dt$. Show that there is no operator $D^*$ on V such that
$\langle D(f), g\rangle = \langle f, D^*(g)\rangle$ for every $f, g \in V$. That is, D has no adjoint.
UNITARY AND ORTHOGONAL OPERATORS AND MATRICES
13.32. Find a unitary (orthogonal) matrix whose first row is:
(a) $(2/\sqrt{13},\ 3/\sqrt{13})$, (b) a multiple of $(1,\ 1 - i)$, (c) $(\tfrac{1}{2},\ \tfrac{1}{2}i,\ \tfrac{1}{2} - \tfrac{1}{2}i)$
13.33. Prove that the products and inverses of orthogonal matrices are orthogonal. (Thus the orthogonal matrices
form a group under multiplication, called the orthogonal group.)
13.34. Prove that the products and inverses of unitary matrices are unitary. (Thus the unitary matrices form a group
under multiplication, called the unitary group.)
13.35. Show that if an orthogonal (unitary) matrix is triangular, then it is diagonal.
13.36. Recall that the complex matrices A and B are unitarily equivalent if there exists a unitary matrix P such that
$B = P^*AP$. Show that this relation is an equivalence relation.
13.37. Recall that the real matrices A and B are orthogonally equivalent if there exists an orthogonal matrix P such
that $B = P^TAP$. Show that this relation is an equivalence relation.
13.38. Let W be a subspace of V. For any $v \in V$, let $v = w + w'$, where $w \in W$, $w' \in W^\perp$. (Such a sum is unique
because $V = W \oplus W^\perp$.) Let $T : V \to V$ be defined by $T(v) = w - w'$. Show that T is a self-adjoint unitary
operator on V.
13.39. Let V be an inner product space, and suppose $U : V \to V$ (not assumed linear) is surjective (onto) and
preserves inner products, i.e., $\langle U(v), U(w)\rangle = \langle v, w\rangle$ for every $v, w \in V$. Prove that U is linear and hence
unitary.
POSITIVE AND POSITIVE DEFINITE OPERATORS
13.40. Show that the sum of two positive (positive definite) operators is positive (positive definite).
13.41. Let T be a linear operator on V and let $f : V \times V \to K$ be defined by $f(u, v) = \langle T(u), v\rangle$. Show that f is itself
an inner product on V if and only if T is positive definite.
13.42. Suppose E is an orthogonal projection onto some subspace W of V. Prove that $kI + E$ is positive (positive
definite) if $k \ge 0$ ($k > 0$).
13.43. Consider the operator T defined by $T(u_i) = \sqrt{\lambda_i}\,u_i$, $i = 1, \ldots, n$, in the proof of Theorem 13.10A. Show that T
is positive and that it is the only positive operator for which $T^2 = P$.
13.44. Suppose P is both positive and unitary. Prove that P = I.
13.45. Determine which of the following matrices are positive (positive definite):
ri 11 r 0 n r o H fi H [2 n [i 21
(i) L (ii) L (iii) L (iv) , (v) , (vi)
13.46. Prove that a 2 × 2 complex matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ is positive if and only if (i) $A = A^*$, and (ii) $a$, $d$, $ad - bc$ are
nonnegative real numbers.
13.47. Prove that a diagonal matrix A is positive (positive definite) if and only if every diagonal entry is a
nonnegative (positive) real number.
SELF-ADJOINT AND SYMMETRIC MATRICES
13.48. For any operator T, show that $T + T^*$ is self-adjoint and $T - T^*$ is skew-adjoint.
13.49. Suppose T is self-adjoint. Show that $T^2(v) = 0$ implies $T(v) = 0$. Use this to prove that $T^n(v) = 0$ also implies
that $T(v) = 0$ for $n > 0$.
13.50. Let V be a complex inner product space. Suppose $\langle T(v), v\rangle$ is real for every $v \in V$. Show that T is self-adjoint.
13.51. Suppose $T_1$ and $T_2$ are self-adjoint. Show that $T_1T_2$ is self-adjoint if and only if $T_1$ and $T_2$ commute, that is,
$T_1T_2 = T_2T_1$.
13.52. For each of the following symmetric matrices A, find an orthogonal matrix P for which $P^TAP$ is diagonal:
ia) A = \[ J],(Z» A=[l _\'\ac) A = ^^ J]
13.53. Find an orthogonal change of coordinates that diagonalizes each of the following quadratic forms:
(a) $q(x, y) = 2x^2 - 6xy + 10y^2$, (b) $q(x, y) = x^2 + 8xy - 5y^2$
(c) $q(x, y, z) = 2x^2 - 4xy + 5y^2 + 2xz - 4yz + 2z^2$
NORMAL OPERATORS AND MATRICES
13.54. Let $A = \begin{bmatrix} 2 & i \\ i & 2 \end{bmatrix}$. Verify that A is normal. Find a unitary matrix P such that $P^*AP$ is diagonal. Find $P^*AP$.
13.55. Show that a triangular matrix is normal if and only if it is diagonal.
13.56. Prove that if T is normal on V, then $\|T(v)\| = \|T^*(v)\|$ for every $v \in V$. Prove that the converse holds in
complex inner product spaces.
13.57. Show that self-adjoint, skew-adjoint, and unitary (orthogonal) operators are normal.
13.58. Suppose T is normal. Prove that:
(a) T is self-adjoint if and only if its eigenvalues are real.
(b) T is unitary if and only if its eigenvalues have absolute value 1.
(c) T is positive if and only if its eigenvalues are nonnegative real numbers.
13.59. Show that if T is normal, then T and T* have the same kernel and the same image.
13.60. Suppose $T_1$ and $T_2$ are normal and commute. Show that $T_1 + T_2$ and $T_1T_2$ are also normal.
13.61. Suppose $T_1$ is normal and commutes with $T_2$. Show that $T_1$ also commutes with $T_2^*$.
13.62. Prove the following: Let $T_1$ and $T_2$ be normal operators on a complex finite-dimensional vector space V. Then
there exists an orthonormal basis of V consisting of eigenvectors of both $T_1$ and $T_2$. (That is, $T_1$ and $T_2$ can be
simultaneously diagonalized.)
ISOMORPHISM PROBLEMS FOR INNER PRODUCT SPACES
13.63. Let $S = \{u_1, \ldots, u_n\}$ be an orthonormal basis of an inner product space V over K. Show that the mapping
$v \mapsto [v]_S$ is an (inner product space) isomorphism between V and $K^n$. (Here $[v]_S$ denotes the coordinate vector
of v in the basis S.)
13.64. Show that inner product spaces V and W over K are isomorphic if and only if V and W have the same
dimension.
13.65. Suppose $\{u_1, \ldots, u_n\}$ and $\{u'_1, \ldots, u'_n\}$ are orthonormal bases of V and W, respectively. Let $T : V \to W$ be the
linear map defined by $T(u_i) = u'_i$ for each i. Show that T is an isomorphism.
13.66. Let V be an inner product space. Recall that each $u \in V$ determines a linear functional $\hat{u}$ in the dual space $V^*$
by the definition $\hat{u}(v) = \langle v, u\rangle$ for every $v \in V$ (see the text immediately preceding Theorem 13.3). Show that
the map $u \mapsto \hat{u}$ is linear and nonsingular, and hence an isomorphism from V onto $V^*$.
MISCELLANEOUS PROBLEMS
13.67. Suppose $\{u_1, \ldots, u_n\}$ is an orthonormal basis of V. Prove:
(a) $\langle a_1u_1 + a_2u_2 + \cdots + a_nu_n,\ b_1u_1 + b_2u_2 + \cdots + b_nu_n\rangle = a_1\bar{b}_1 + a_2\bar{b}_2 + \cdots + a_n\bar{b}_n$
(b) Let $A = [a_{ij}]$ be the matrix representing $T : V \to V$ in the basis $\{u_i\}$. Then $a_{ij} = \langle T(u_i), u_j\rangle$.
13.68. Show that there exists an orthonormal basis $\{u_1, \ldots, u_n\}$ of V consisting of eigenvectors of T if and only if
there exist orthogonal projections $E_1, \ldots, E_r$ and scalars $\lambda_1, \ldots, \lambda_r$ such that:
(i) $T = \lambda_1E_1 + \cdots + \lambda_rE_r$, (ii) $E_1 + \cdots + E_r = I$, (iii) $E_iE_j = 0$ for $i \ne j$
13.69. Suppose $V = U \oplus W$ and suppose $T_1 : U \to V$ and $T_2 : W \to V$ are linear. Show that $T = T_1 \oplus T_2$ is also
linear. Here T is defined as follows: If $v \in V$ and $v = u + w$ where $u \in U$, $w \in W$, then
\[ T(v) = T_1(u) + T_2(w) \]
Answers to Supplementary Problems
Notation: $[R_1;\ R_2;\ \ldots;\ R_n]$ denotes a matrix with rows $R_1, R_2, \ldots, R_n$.
13.25. (a) $[5 + 2i,\ 4 + 6i;\ 3 - 7i,\ 8 - 3i]$, (b) $[3,\ -i;\ -5i,\ 2i]$, (c) $[1, 2;\ 1, 3]$
13.26. $T^*(x, y, z) = (x + 3y,\ 2x + z,\ -4y)$
13.27. $T^*(x, y, z) = [-ix + 3y,\ (2 - 3i)x + (2 + 5i)z,\ (3 + i)y - iz]$
13.28. (a) $u = (1, 2, -3)$, (b) $u = (-i,\ 2 - 3i,\ 1 + 2i)$, (c) $u = (18t^2 - 8t + 13)/15$
13.32. (a) $(1/\sqrt{13})[2, 3;\ 3, -2]$, (b) $(1/\sqrt{3})[1,\ 1 - i;\ 1 + i,\ -1]$,
(c) $\tfrac{1}{2}[1,\ i,\ 1 - i;\ \sqrt{2}i,\ -\sqrt{2},\ 0;\ 1,\ -i,\ -1 + i]$
13.45. Only (i) and (v) are positive. Only (v) is positive definite.
13.52. (a and b) $P = (1/\sqrt{5})[2, -1;\ -1, 2]$, (c) $P = (1/\sqrt{10})[3, -1;\ -1, 3]$
13.53. (a) $x = (3x' - y')/\sqrt{10}$, $y = (x' + 3y')/\sqrt{10}$, (b) $x = (2x' - y')/\sqrt{5}$, $y = (x' + 2y')/\sqrt{5}$,
(c) $x = x'/\sqrt{3} + y'/\sqrt{2} + z'/\sqrt{6}$, $y = x'/\sqrt{3} - 2z'/\sqrt{6}$, $z = x'/\sqrt{3} - y'/\sqrt{2} + z'/\sqrt{6}$
13.54. $P = (1/\sqrt{2})[1, -1;\ 1, 1]$, $P^*AP = \operatorname{diag}(2 + i,\ 2 - i)$
APPENDIX
Multilinear
Products
A.1 INTRODUCTION
The material in this appendix is much more abstract than that which has previously appeared. We
motivate the material with the following observation.
Let S be a basis of a vector space V. Theorem 5.2 may be restated as follows.
Theorem 5.2: Let $g : S \to V$ be the inclusion map of S into V. Then for any vector space U and any
mapping $f : S \to U$ there exists a unique linear mapping $f^* : V \to U$ such that
$f = f^* \circ g$.
Another way to state that $f = f^* \circ g$ is that the diagram in Fig. A-1(a) commutes.
[Fig. A-1: three commutative diagrams, panels (a), (b), and (c), each showing a map g, a map f into U, and the induced linear map $f^*$ into U.]
A.2. BILINEAR MAPPINGS AND TENSOR PRODUCTS
Let U, V, W be vector spaces over K. A map
\[ f : V \times W \to U \]
is bilinear if, for each $v \in V$, the map $f_v : W \to U$ defined by $f_v(w) = f(v, w)$ is linear and, for each
$w \in W$, the map $f_w : V \to U$ defined by $f_w(v) = f(v, w)$ is linear.
That is, f is linear in each of its two variables. Note that f is very similar to a bilinear form except that
now the values of the map are in a vector space U rather than the field K.
Definition: Let V and W be vector spaces over a field K. The tensor product T of V and W is a vector
space over K together with a bilinear mapping $g : V \times W \to T$, denoted by $g(v, w) = v \otimes w$,
with the following property:
(*) For any vector space U over K and any bilinear map $f : V \times W \to U$ there exists a
unique linear map $f^* : T \to U$ such that $f^* \circ g = f$.
The tensor product T of V and W is denoted by $V \otimes W$ and the element $v \otimes w$ is called the tensor of v
and w. The uniqueness in (*) implies that the image of g spans T, that is, that $\operatorname{span}(\{v \otimes w\}) = T$.
Theorem A.1: The tensor product $T = V \otimes W$ of V and W exists and is unique (up to isomorphism). If
$\{v_1, \ldots, v_n\}$ is a basis of V and $\{w_1, \ldots, w_m\}$ is a basis of W, then the vectors
\[ v_i \otimes w_j \quad (i = 1, \ldots, n \text{ and } j = 1, \ldots, m) \]
form a basis of T. Thus $\dim T = (\dim V)(\dim W)$.
Another way to state condition (*) is that the diagram in Fig. A-1(b) commutes. The fact that such a linear
map $f^*$ exists for any mapping f is called a "universal mapping principle". [Observe that the basis S of a
vector space V has this universal mapping property.] Condition (*) also says that any bilinear mapping
$f : V \times W \to U$ "factors through" the tensor product $T = V \otimes W$.
Next we give a concrete example of a tensor product.
Example A.1
Let V be the vector space of polynomials $P_{r-1}(x)$, and let W be the vector space of polynomials $P_{s-1}(y)$. Thus the
following form bases of V and W, respectively,
\[ 1, x, x^2, \ldots, x^{r-1} \quad\text{and}\quad 1, y, y^2, \ldots, y^{s-1} \]
In particular, $\dim V = r$ and $\dim W = s$. Let T be the vector space of polynomials in the variables x and y with basis
\[ \{x^iy^j\} \quad\text{where } i = 0, \ldots, r - 1,\ j = 0, \ldots, s - 1 \]
Then T is the tensor product $V \otimes W$ under the mapping $(x^i, y^j) \mapsto x^iy^j$. Note, $\dim T = rs = (\dim V)(\dim W)$.
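Example A.1 can be made concrete by storing a polynomial as its list of coefficients. The following is a minimal sketch under that assumption (the names `tensor` and `add_tables` are illustrative): the tensor of $p(x)$ and $q(y)$ is stored as the table of coefficients of $p(x)q(y)$ on the basis $\{x^iy^j\}$, and bilinearity in the first argument is checked on one instance.

```python
def tensor(p, q):
    """Coefficient table of p(x) * q(y) on the basis x**i * y**j.

    p[i] is the coefficient of x**i and q[j] is the coefficient of y**j;
    entry [i][j] of the result is the coefficient of x**i * y**j.
    """
    return [[pi * qj for qj in q] for pi in p]

def add_tables(t1, t2):
    """Entrywise sum of two coefficient tables of the same shape."""
    return [[a + b for a, b in zip(row1, row2)] for row1, row2 in zip(t1, t2)]

r, s = 3, 2                      # dim V = 3, dim W = 2
p1, p2 = [1, 0, 2], [0, 5, -1]   # 1 + 2x^2 and 5x - x^2
q = [3, 4]                       # 3 + 4y

t = tensor(p1, q)
print(len(t) * len(t[0]) == r * s)   # one coefficient per basis element x^i y^j

# Bilinearity in the first argument: (p1 + p2) (x) q = p1 (x) q + p2 (x) q.
p_sum = [a + b for a, b in zip(p1, p2)]
print(tensor(p_sum, q) == add_tables(tensor(p1, q), tensor(p2, q)))
```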
The next theorem tells us that the tensor product is associative in a canonical way.
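Theorem A.2: Let U, V, W be vector spaces over a field K. Then there is a unique isomorphism
\[ (U \otimes V) \otimes W \to U \otimes (V \otimes W) \]
such that, for every $u \in U$, $v \in V$, $w \in W$,
\[ (u \otimes v) \otimes w \mapsto u \otimes (v \otimes w) \]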
A.3 ALTERNATING MULTILINEAR MAPS AND EXTERIOR PRODUCTS
Let $f : V^r \to U$ where V and U are vector spaces over K. [Recall $V^r = V \times V \times \cdots \times V$, r factors.]
(1) The mapping f is said to be multilinear or r-linear if $f(v_1, \ldots, v_r)$ is linear as a function of each
$v_j$ when the other $v_i$'s are held fixed. That is,
\[ f(\ldots, v_j + v'_j, \ldots) = f(\ldots, v_j, \ldots) + f(\ldots, v'_j, \ldots) \]
\[ f(\ldots, kv_j, \ldots) = kf(\ldots, v_j, \ldots) \]
where only the jth position changes.
(2) The mapping f is said to be alternating if
\[ f(v_1, \ldots, v_r) = 0 \quad\text{whenever } v_i = v_j \text{ with } i \ne j \]
One can easily show (Prove!) that if f is an alternating multilinear mapping on $V^r$, then
\[ f(\ldots, v_i, \ldots, v_j, \ldots) = -f(\ldots, v_j, \ldots, v_i, \ldots) \]
That is, if two of the vectors are interchanged, then the associated value changes sign.
Example A.2: Determinant Function
The determinant function $D : \mathbf{M} \to K$ on the space M of n × n matrices may be viewed as an n-variable function
\[ D(A) = D(R_1, R_2, \ldots, R_n) \]
defined on the rows $R_1, R_2, \ldots, R_n$ of A. Recall (Chapter 8) that, in this context, D is both n-linear and alternating.
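The two properties can be checked on a small integer example. The following is a minimal sketch, assuming the determinant is computed by the permutation (Leibniz) formula; the function names `sign` and `det` are illustrative, and the rows are arbitrary test data.

```python
from itertools import permutations

def sign(perm):
    """Sign of a permutation of 0..n-1, computed by counting inversions."""
    inversions = sum(1 for i in range(len(perm)) for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(rows):
    """Determinant as a function D(R_1, ..., R_n) of the rows."""
    n = len(rows)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= rows[i][perm[i]]
        total += term
    return total

R1, R2, R3 = [2, 0, 1], [1, 3, -1], [0, 4, 5]
R1b = [1, 1, 1]

# Linear in the first row, with the other rows held fixed.
R1_sum = [a + 3 * b for a, b in zip(R1, R1b)]
print(det([R1_sum, R2, R3]) == det([R1, R2, R3]) + 3 * det([R1b, R2, R3]))

# Alternating: equal rows give 0, and interchanging two rows changes the sign.
print(det([R1, R1, R3]) == 0)
print(det([R2, R1, R3]) == -det([R1, R2, R3]))
```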
We now need some additional notation. Let $K = [k_1, k_2, \ldots, k_r]$ denote an r-list (r-tuple) of elements
from $I_n = \{1, 2, \ldots, n\}$. We will then use the following notation where the $v_k$'s denote vectors and the $a_{ik}$'s
denote scalars:
\[ v_K = (v_{k_1}, v_{k_2}, \ldots, v_{k_r}) \quad\text{and}\quad a_K = a_{1k_1}a_{2k_2} \cdots a_{rk_r} \]
Note $v_K$ is a list of r vectors, and $a_K$ is a product of r scalars.
Now suppose the elements in $K = [k_1, k_2, \ldots, k_r]$ are distinct. Then K is a permutation $\sigma_K$ of an r-list
$J = [i_1, i_2, \ldots, i_r]$ in standard form, that is, where $i_1 < i_2 < \cdots < i_r$. The number of such standard-form
r-lists J from $I_n$ is the binomial coefficient:
\[ \binom{n}{r} = \frac{n!}{r!\,(n - r)!} \]
[Recall $\operatorname{sign}(\sigma_K) = (-1)^{m_K}$ where $m_K$ is the number of interchanges that transform K into J.]
Now suppose $A = [a_{ij}]$ is an r × n matrix. For a given ordered r-list J, we define
\[ D_J(A) = \begin{vmatrix} a_{1i_1} & a_{1i_2} & \cdots & a_{1i_r} \\ a_{2i_1} & a_{2i_2} & \cdots & a_{2i_r} \\ \vdots & \vdots & & \vdots \\ a_{ri_1} & a_{ri_2} & \cdots & a_{ri_r} \end{vmatrix} \]
That is, $D_J(A)$ is the determinant of the r × r submatrix of A whose column subscripts belong to J.
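The standard-form r-lists and the minors $D_J(A)$ are easy to enumerate. The following is a minimal sketch, assuming A is stored as a list of rows with integer entries; the names `det` and `minors` are illustrative. It uses `itertools.combinations`, which produces exactly the increasing r-lists.

```python
from itertools import combinations
from math import comb

def det(M):
    """Determinant by cofactor expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def minors(A):
    """Map each standard-form r-list J (1-based, increasing) to D_J(A)."""
    r, n = len(A), len(A[0])
    return {J: det([[row[i - 1] for i in J] for row in A])
            for J in combinations(range(1, n + 1), r)}

A = [[1, 2, 3],
     [4, 5, 6]]                  # r = 2, n = 3
D = minors(A)
print(len(D) == comb(3, 2))      # one minor per standard-form 2-list
print(D)                         # keys are (1, 2), (1, 3), (2, 3)
```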
Our main theorem below uses the following "shuffling" lemma.
Lemma A.3: Let V and U be vector spaces over K, and let $f : V^r \to U$ be an alternating r-linear
mapping. Let $v_1, v_2, \ldots, v_n$ be vectors in V and let $A = [a_{ij}]$ be an r × n matrix over K
where $r \le n$. For $i = 1, 2, \ldots, r$,
let $u_i = a_{i1}v_1 + a_{i2}v_2 + \cdots + a_{in}v_n$
Then
\[ f(u_1, \ldots, u_r) = \sum_J D_J(A)\, f(v_{i_1}, v_{i_2}, \ldots, v_{i_r}) \]
where the sum is over all standard-form r-lists $J = [i_1, i_2, \ldots, i_r]$.
The proof is technical but straightforward. The linearity of f gives us the sum
\[ f(u_1, \ldots, u_r) = \sum_K a_K f(v_K) \]
where the sum is over all r-lists K from $\{1, \ldots, n\}$. The alternating property of f tells us that $f(v_K) = 0$
when K does not contain distinct integers. The proof now mainly uses the fact that as we interchange the
$v_j$'s to transform
\[ f(v_K) = f(v_{k_1}, v_{k_2}, \ldots, v_{k_r}) \quad\text{to}\quad f(v_J) = f(v_{i_1}, v_{i_2}, \ldots, v_{i_r}) \]
so that $i_1 < \cdots < i_r$, the associated sign of $a_K$ will change in the same way as the sign of the
corresponding permutation $\sigma_K$ changes when it is transformed to the identity permutation using
transpositions.
We illustrate the lemma below for r = 2 and n = 3.
Example A.3
Suppose $f : V^2 \to U$ is an alternating multilinear function. Let $v_1, v_2, v_3 \in V$ and let $u, w \in V$. Suppose
\[ u = a_1v_1 + a_2v_2 + a_3v_3 \quad\text{and}\quad w = b_1v_1 + b_2v_2 + b_3v_3 \]
Consider
\[ f(u, w) = f(a_1v_1 + a_2v_2 + a_3v_3,\ b_1v_1 + b_2v_2 + b_3v_3) \]
Using multilinearity, we get nine terms:
\[ f(u, w) = a_1b_1f(v_1, v_1) + a_1b_2f(v_1, v_2) + a_1b_3f(v_1, v_3) \]
\[ \qquad + a_2b_1f(v_2, v_1) + a_2b_2f(v_2, v_2) + a_2b_3f(v_2, v_3) \]
\[ \qquad + a_3b_1f(v_3, v_1) + a_3b_2f(v_3, v_2) + a_3b_3f(v_3, v_3) \]
(Note that $J = [1, 2]$, $J' = [1, 3]$ and $J'' = [2, 3]$ are the three standard-form 2-lists of $I = \{1, 2, 3\}$.) The alternating
property of f tells us that each $f(v_i, v_i) = 0$; hence three of the above nine terms are equal to 0. The alternating
property also tells us that $f(v_i, v_j) = -f(v_j, v_i)$. Thus three of the terms can be transformed so their subscripts form a
standard-form 2-list by a single interchange. Finally we obtain
\[ f(u, w) = (a_1b_2 - a_2b_1)f(v_1, v_2) + (a_1b_3 - a_3b_1)f(v_1, v_3) + (a_2b_3 - a_3b_2)f(v_2, v_3) \]
\[ = \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix} f(v_1, v_2) + \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix} f(v_1, v_3) + \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix} f(v_2, v_3) \]
which is the content of Lemma A.3.
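The identity of Example A.3 can be verified numerically for one particular alternating bilinear map. The following is a minimal sketch that takes f to be the cross product on $\mathbf{R}^3$ (an assumption made only for illustration; any alternating bilinear map would do), with arbitrary integer test vectors and the illustrative helper names `cross` and `combo`.

```python
from itertools import combinations

def cross(u, w):
    """An alternating bilinear map f : R^3 x R^3 -> R^3 (the cross product)."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

def combo(coeffs, vectors):
    """Linear combination sum_k coeffs[k] * vectors[k] of vectors in R^3."""
    return tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) for i in range(3))

v = [(1, 0, 2), (0, 1, 1), (3, -1, 0)]   # v1, v2, v3
a, b = (2, -1, 4), (1, 5, -3)            # rows of A: coefficients of u and w
u, w = combo(a, v), combo(b, v)

# Right-hand side of Lemma A.3: sum over J = [i1, i2] of D_J(A) f(v_i1, v_i2).
rhs = (0, 0, 0)
for i, j in combinations(range(3), 2):
    d = a[i] * b[j] - a[j] * b[i]        # 2 x 2 minor on columns i, j
    term = cross(v[i], v[j])
    rhs = tuple(x + d * t for x, t in zip(rhs, term))

print(cross(u, w) == rhs)
```

Because all entries are integers, the comparison is exact rather than approximate.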
Definition: Let V be an n-dimensional vector space over a field K, and let r be an integer such that
$1 \le r \le n$. The exterior product (or wedge product) E is a vector space over K together with
an alternating r-linear mapping $g : V^r \to E$, denoted by $g(v_1, \ldots, v_r) = v_1 \wedge \cdots \wedge v_r$, with
the following property:
(*) For any vector space U over K and any alternating r-linear map $f : V^r \to U$ there
exists a unique linear map $f^* : E \to U$ such that $f^* \circ g = f$.
Such an exterior product is denoted by $E = \bigwedge^r V$. Again, the uniqueness in (*) implies that the image
of g spans E, that is, $\operatorname{span}(\{v_1 \wedge \cdots \wedge v_r\}) = E$.
Theorem A.4: Let V be an n-dimensional vector space over K. Then the exterior product $E = \bigwedge^r V$ of
V exists and is unique (up to isomorphism). If $r > n$ then $E = \{0\}$. If $r \le n$, then
$\dim E = \binom{n}{r}$. Moreover, if $\{v_1, \ldots, v_n\}$ is a basis of V, then the vectors
\[ v_{i_1} \wedge v_{i_2} \wedge \cdots \wedge v_{i_r} \]
where $1 \le i_1 < i_2 < \cdots < i_r \le n$, form a basis of E.
Again, condition (*) says that the diagram in Fig. A-1(c) commutes. Condition (*) also says that any
alternating r-linear mapping $f : V^r \to U$ "factors through" the exterior product $E = \bigwedge^r V$.
We give a concrete example of an exterior product.
Example A.4. Cross Product
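Consider $V = \mathbf{R}^3$ with the usual basis $\{\mathbf{i}, \mathbf{j}, \mathbf{k}\}$. Let $E = \bigwedge^2 V$. Note $\dim V = 3$. Thus $\dim E = 3$ with basis
$\mathbf{i} \wedge \mathbf{j}$, $\mathbf{i} \wedge \mathbf{k}$, $\mathbf{j} \wedge \mathbf{k}$. We identify E with $\mathbf{R}^3$ under the correspondence
\[ \mathbf{i} = \mathbf{j} \wedge \mathbf{k}, \quad \mathbf{j} = \mathbf{k} \wedge \mathbf{i} = -\mathbf{i} \wedge \mathbf{k}, \quad \mathbf{k} = \mathbf{i} \wedge \mathbf{j} \]
Let u and w be arbitrary vectors in $V = \mathbf{R}^3$, say
\[ u = (a_1, a_2, a_3) = a_1\mathbf{i} + a_2\mathbf{j} + a_3\mathbf{k} \quad\text{and}\quad w = (b_1, b_2, b_3) = b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k} \]
Then, as in Example A.3,
\[ u \wedge w = (a_1b_2 - a_2b_1)\,\mathbf{i} \wedge \mathbf{j} + (a_1b_3 - a_3b_1)\,\mathbf{i} \wedge \mathbf{k} + (a_2b_3 - a_3b_2)\,\mathbf{j} \wedge \mathbf{k} \]
Using the above identification, we get
\[ u \wedge w = (a_2b_3 - a_3b_2)\mathbf{i} - (a_1b_3 - a_3b_1)\mathbf{j} + (a_1b_2 - a_2b_1)\mathbf{k} \]
\[ = \begin{vmatrix} a_2 & a_3 \\ b_2 & b_3 \end{vmatrix}\mathbf{i} - \begin{vmatrix} a_1 & a_3 \\ b_1 & b_3 \end{vmatrix}\mathbf{j} + \begin{vmatrix} a_1 & a_2 \\ b_1 & b_2 \end{vmatrix}\mathbf{k} \]
The reader may recognize that the above exterior product is precisely the well-known cross product in $\mathbf{R}^3$.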
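The identification in Example A.4 can be followed step by step in code. The following is a minimal sketch with illustrative function names: it computes the coefficients of $u \wedge w$ on the basis $\mathbf{i} \wedge \mathbf{j}$, $\mathbf{i} \wedge \mathbf{k}$, $\mathbf{j} \wedge \mathbf{k}$, applies the correspondence stated above, and compares the result with the usual cross product formula.

```python
def wedge_coefficients(u, w):
    """Coefficients of u ^ w on the basis (i^j, i^k, j^k)."""
    a1, a2, a3 = u
    b1, b2, b3 = w
    return (a1 * b2 - a2 * b1, a1 * b3 - a3 * b1, a2 * b3 - a3 * b2)

def identify_with_r3(coeffs):
    """Apply j^k -> i, i^k -> -j, i^j -> k to get a vector (x, y, z) in R^3."""
    c_ij, c_ik, c_jk = coeffs
    return (c_jk, -c_ik, c_ij)

def cross(u, w):
    """The usual cross product in R^3, for comparison."""
    return (u[1] * w[2] - u[2] * w[1],
            u[2] * w[0] - u[0] * w[2],
            u[0] * w[1] - u[1] * w[0])

u, w = (1, 2, 3), (4, 5, 6)
print(identify_with_r3(wedge_coefficients(u, w)))   # (-3, 6, -3)
print(cross(u, w))                                  # (-3, 6, -3)
```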
Our last theorem tells us that we are actually able to "multiply" exterior products which allows us to
form an "exterior algebra" which is illustrated below.
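Theorem A.5: Let V be a vector space over K. Let r and s be positive integers. Then there is a unique
bilinear mapping
\[ \textstyle\bigwedge^r V \times \bigwedge^s V \to \bigwedge^{r+s} V \]
such that, for any vectors $u_i$, $w_j$ in V,
\[ (u_1 \wedge \cdots \wedge u_r) \times (w_1 \wedge \cdots \wedge w_s) \mapsto u_1 \wedge \cdots \wedge u_r \wedge w_1 \wedge \cdots \wedge w_s \]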
Example A.5
We form an exterior algebra A over a field K using noncommuting variables x, y, z. Since it is an exterior algebra, our
variables satisfy:
\[ x \wedge x = 0,\quad y \wedge y = 0,\quad z \wedge z = 0, \quad\text{and}\quad y \wedge x = -x \wedge y,\quad z \wedge x = -x \wedge z,\quad z \wedge y = -y \wedge z \]
Every element of A is a linear combination of the eight elements
\[ 1,\ x,\ y,\ z,\ x \wedge y,\ x \wedge z,\ y \wedge z,\ x \wedge y \wedge z \]
We multiply two "polynomials" in A using the usual distributive law, but now we also use the above conditions. For
example,
\[ [3 + 4y - 5x \wedge y + 6x \wedge z] \wedge [5x - 2y] = 15x - 6y - 20x \wedge y + 12x \wedge y \wedge z \]
Observe we use the fact that:
\[ [4y] \wedge [5x] = 20y \wedge x = -20x \wedge y \quad\text{and}\quad [6x \wedge z] \wedge [-2y] = -12x \wedge z \wedge y = 12x \wedge y \wedge z \]
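The multiplication rule of Example A.5 can be implemented in a few lines. The following is a minimal sketch, assuming an element of A is stored as a dictionary from basis monomials (increasing tuples of generator names, with the empty tuple standing for 1) to coefficients; this representation and the names `wedge_basis` and `wedge` are illustrative choices, not taken from the text.

```python
def wedge_basis(m1, m2):
    """Wedge of two basis monomials (increasing tuples of generator names).

    Returns (sign, monomial); the sign is 0 when a generator repeats.
    """
    if set(m1) & set(m2):
        return 0, ()
    seq = list(m1 + m2)
    sign = 1
    # Bubble sort: every interchange of adjacent generators flips the sign.
    for i in range(len(seq)):
        for j in range(len(seq) - 1 - i):
            if seq[j] > seq[j + 1]:
                seq[j], seq[j + 1] = seq[j + 1], seq[j]
                sign = -sign
    return sign, tuple(seq)

def wedge(p, q):
    """Product of two elements given as {monomial: coefficient} dictionaries."""
    result = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            sign, m = wedge_basis(m1, m2)
            if sign:
                result[m] = result.get(m, 0) + sign * c1 * c2
    return {m: c for m, c in result.items() if c != 0}

# [3 + 4y - 5 x^y + 6 x^z] ^ [5x - 2y]
p = {(): 3, ("y",): 4, ("x", "y"): -5, ("x", "z"): 6}
q = {("x",): 5, ("y",): -2}
print(wedge(p, q))
```

The printed dictionary has coefficient 15 on x, -6 on y, -20 on x ^ y and 12 on x ^ y ^ z, in agreement with the product computed in the example.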
LIST OF SYMBOLS
$A = [a_{ij}]$, matrix, 28
$\bar{A} = [\bar{a}_{ij}]$, conjugate matrix, 39
$|A|$, determinant, 277, 281
$A^*$, adjoint, 395
$A^H$, conjugate transpose, 40
$A^T$, transpose, 34
minor, 283
$A(I, J)$, minor, 286
$A(V)$, linear operators, 181
adj A, adjoint (classical), 285
$A \sim B$, row equivalence, 75
$A \simeq B$, congruence, 377
C, complex numbers, 12
$\mathbf{C}^n$, complex n-space, 14
$C[a, b]$, continuous functions, 238
$C(f)$, companion matrix, 318
colsp(A), column space, 125
$d(u, v)$, distance, 6, 252
$\operatorname{diag}(a_{11}, \ldots, a_{nn})$, diagonal matrix, 37
$\operatorname{diag}(A_{11}, \ldots, A_{nn})$, block diagonal, 42
det(A), determinant, 281
dim V, dimension, 129
$\{e_1, \ldots, e_n\}$, usual basis, 129
$E_k$, projections, 402
$f : A \to B$, mapping, 171
$F(X)$, function space, 118
$G \circ F$, composition, 181
Hom(V, U), homomorphisms, 181
i, j, k, 10
$I_n$, identity matrix, 34
Im F, image, 176
$J(\lambda)$, Jordan block, 344
K, field of scalars, 117
Ker F, kernel, 176
$m(t)$, minimal polynomial, 318
$M_{m,n}$, m × n matrices, 118
n-space, 5, 14, 238, 250
$P(t)$, polynomials, 118
$P_n(t)$, polynomials, 118
proj(u, v), projection, 6, 245
proj(u, V), projection, 245
Q, rational numbers, 12
R, real numbers, 1
$\mathbf{R}^n$, real n-space, 5
rowsp(A), row space, 124
$S^\perp$, orthogonal complement, 242
sgn $\sigma$, sign, parity, 280
span(S), linear span, 123
tr(A), trace, 34
$[T]_S$, matrix representation, 203
$T^*$, adjoint, 396
T-invariant, 342
$T^t$, transpose, 368
$\|u\|$, norm, 5, 14, 237, 251, 252
$[u]_S$, coordinate vector, 135
$u \cdot v$, dot product, 4, 14
$\langle u, v\rangle$, inner product, 236, 249
$u \times v$, cross product, 11
$u \otimes v$, tensor product, 416
$u \wedge v$, exterior product, 418
$U \oplus W$, direct sum, 134, 342
$V \cong U$, isomorphism, 137, 176
$V \otimes W$, tensor product, 416
$V^*$, dual space, 366
$V^{**}$, second dual space, 367
$\bigwedge^r V$, exterior product, 418
$W^0$, annihilator, 367
$\bar{z}$, complex conjugate, 13
$Z(v, T)$, T-cyclic subspace, 345
$\delta_{ij}$, Kronecker delta, 39
$\Delta(t)$, characteristic polynomial, 308
$\lambda$, eigenvalue, 310
$\sum$, summation symbol, 30
INDEX
Absolute value (complex), 13
Adjoint, classical, 285
Adjoint operator, 395
Algebraic multiplicity, 312, 313
Alternating mappings, 290, 416
Angle between vectors, 6, 240
Annihilator, 345
Associated homogeneous system, 85
Augmented matrix, 61
Back-substitution, 66, 70, 77
Basis, 85, 129
change of, 207
dual, 366
orthogonal, 243
orthonormal, 243
second dual, 367
usual, 129
Basis-finding algorithm, 133
Bessel inequality, 264
Bijective mapping, 173
Bilinear form, 376, 394
alternating, 289
matrix representation of, 377
polar form of, 380
real symmetric, 381
symmetric, 378
Bilinear mapping, 376, 415
Block matrix, 41
Jordan, 344
Cayley-Hamilton theorem, 308, 310
Canonical forms, 340
Jordan, 344
rational, 345
row, 74
triangular, 340
Casting-out algorithm, 133
Cauchy-Schwarz inequality, 6, 239, 256
Change of basis, 207
Change-of-basis (transition) matrix, 207
Characteristic polynomial, 308
Classical adjoint, 285
Coefficient:
Fourier, 83, 244
matrix, 61
Cofactor, 283
Column, 28
operations, 89
space, 125
vector, 3
Companion matrix, 318
Complement, orthogonal, 242
Complementary minor, 286
Completing the square, 393
Complex:
conjugate, 13
inner product, 249
«-space, 14
numbers, 12
plane, 13
Composition of mappings, 173, 181
Congruent matrices, 377
diagonalization, 379
Conjugate:
complex, 13
linearity, 250
Consistent systems, 61
Convex set, 201
Coordinate vector, 135
Cramer's rule, 285
Cross product, 11, 419
Curves, 9
Cyclic subspaces, 345
Decomposition:
direct-sum, 342
primary, 343
Degenerate:
bilinear form, 378
linear equations, 62
Dependence, linear, 126
Determinant, 278
computation of, 284
linear operator, 289
Diagonal (of a matrix), 35
Diagonal matrix, 37
Diagonalization, 306, 310, 315
algorithm, 313, 379
Dimension of solution spaces, 86
Dimension of vector spaces, 129
subspaces, 131
Direct sum, 134
decomposition, 342
Directed line segment, 7
Distance, 6, 252
Domain, 171
Dot product, 4, 14
Dual:
basis, 366
space, 366
Echelon:
form, 68
matrices, 73
Eigenspace, 310, 313
Eigenvalue, 310, 313
Eigenvector, 310, 313
Elementary divisors, 346
Elementary matrix, 87, 89
Elementary operations, 63
column, 88
row, 75
Elimination, Gaussian, 64, 69, 71, 76
Equal:
functions, 171
matrices, 29
Equations (See Linear equations)
Equivalence:
matrix, 90
relation, 76
row, 75
Equivalent systems, 63
Euclidean space, 5, 14, 238, 250
Existence theorem, 79
Exterior product, 417
Field of scalars, 116
Finite dimension, 129
Fourier coefficient, 83, 244
Free variable, 68
Function, 171
Functional, linear, 365
Gauss-Jordan algorithm, 77
Gaussian elimination, 64, 69, 71, 76
General solution, 60
Geometric multiplicity, 312
Gram-Schmidt orthogonalization, 247
Hermitian:
form, 382
matrix, 40
quadratic form, 394
Homogeneous system, 60, 84
Hilbert space, 239
Hyperplane, 81, 375
ijk notation, 10
Identity:
mapping, 173
matrix, 34
Image, 171, 176
Inclusion mapping, 198
Inconsistent systems, 61
Independence, linear, 126
Infinity-norm, 254
Injective mapping, 173
Inner product, 4, 236
complex, 14, 249
usual, 238
Inner product spaces, 235, 249
linear operators on, 395
Invariance, 341
Invariant subspaces, 341
direct-sum, 342
Inverse matrix, 36
computing, 88
Inverse mapping, 174
Invertible:
matrices, 35
linear operators, 183
Isometry, 399
Isomorphic vector spaces, 137, 176
Isomorphism, 180
Jordan:
canonical form, 344
block, 344
Kernel, 176, 178
Kronecker delta $\delta_{ij}$, 35
Lagrange's identity, 21
Laplace expansion, 283
Law of inertia, 381
Leading nonzero entry, 73
Leading unknown, 62
Length, 5, 237
Line, 8, 201
Linear:
combination, 4, 30, 62, 82
dependence, 126
functional, 365
independence, 126
span, 123
Linear equation, 59
Linear equations (system), 60
consistent, 61
echelon form, 68
triangular form, 67
Linear mapping (function), 174
image, 176
kernel, 176
matrix representation, 212
nullity, 178
rank, 178
transpose, 368
Linear operator, 181
adjoint, 395
characteristic polynomial, 310
determinant, 289
inner product spaces, 395
invertible, 183
matrix representation, 203
Linear transformation (See linear mappings)
Located vectors, 7
LU decomposition, 90
LDU decomposition, 109
Mappings (maps), 171
composition of, 173
linear, 174
Matrices:
congruent, 377
equivalent, 90
similar, 211
Matrix, 28
augmented, 61
change-of-basis, 207
coefficient, 61
companion, 318
echelon, 73
equivalence, 90
Hermitian, 40, 382
invertible, 35
nonsingular, 35
normal, 39
orthogonal, 38, 248
positive definite, 248
rank, 75
Matrix mapping, 172
Matrix multiplication, 31
Matrix representation:
adjoint operator, 395
bilinear form, 377
change of basis, 207
linear mapping, 212
linear operator, 203
Minkowski's inequality, 18
Minimal polynomial, 317, 319
Minor, 283, 286
principal, 287
Multilinearity, 289, 416
Multiplicity, 312
Multiplier, 70, 76, 90
n-space:
complex, 14
real, 2
Natural mapping, 367
Nilpotent, 343
Nonnegative semidefinite, 381
Nonsingular:
linear maps, 180
matrices, 35
Norm, 5, 237, 257
Normal:
matrix, 39
operator, 397, 401
Normal vector, 8
Normed vector space, 251
Normalizing, 5
Null space, 178
Nullity, 178
One-norm, 252
One-to-one:
correspondence, 173
mapping, 173
Onto mapping, 173
Operators (See Linear operators)
Orthogonal, 4, 241
basis, 243
complement, 242
group, 412
matrix, 38, 248
operator, 398
projection, 402
sets, 243
substitution, 317
Orthogonalization, Gram-Schmidt, 247
Orthogonally equivalent, 400
Orthonormal basis, 39
Parameters, 69
Permutations, 279
Perpendicular, 4
Pivot:
entries, 70
variables, 68
Pivoting (row reduction), 98
Polar form, 380
Polynomial:
characteristic, 309
minimum, 318, 320
Positive definite: 381
matrices, 248
operators, 400
Positive operators, 400
Primary decomposition, 243
Principal minor, 288
Product:
exterior, 417
inner, 4, 236
tensor, 415
Projections, 6, 175, 197, 245, 360
orthogonal, 402
Pythagorean theorem, 243
Quadratic form, 317, 380
Quotient spaces, 346
Rank, 75, 90, 131, 178
Rational canonical form, 345
Real symmetric bilinear form, 381
Restriction mapping, 200
Rotation, 177
Row, 28
canonical form, 74
equivalence, 75
operations, 75
rank, 132
space, 124
Scalar, 1, 116
matrix, 34
multiplication, 2, 3
product, 4
Schwarz inequality, 6
(See Cauchy-Schwarz inequality)
Second dual space, 367
Self-adjoint operator, 398
Sign of permutation, 280
Signature, 381, 382
Similarity, 211, 234
Singular, 180
Skew-adjoint operator, 397
Skew-Hermitian, 40
Skew-symmetric, 38
Solution, general, 60
Spatial vectors, 10
Span, 120
Spanning sets, 120
Spectral theorem, 402
Square matrix, 33
Square root of a matrix, 311
Standard:
basis, 129, 130
inner product, 238
Subspace, 121
Sum of vector spaces, 134
Summation symbol, 30
Surjective map, 173
Sylvester's theorem, 381
Symmetric:
bilinear form, 378, 381
matrices, 38
Systems of linear equations (See Linear equations)
Tangent vector, 9
Target set, 171
Tensor product, 414
Time complexity, 91
Trace, 34
Transpose:
matrix, 33
linear mapping, 368
Transition matrix, 207
Triangle inequality, 240
Triangular form:
linear equations, 67
linear operators, 340
Triangular matrix, 37
Triple product, 11
Two-norm, 252
Uniqueness theorem, 79
Unit vector, 5, 237
Unitary:
equivalence, 400
group, 412
matrix, 40
operator, 398
Usual:
basis, 129, 130
inner product, 238
Variable, free, 68
Vector, 1, 14, 116
column, 3
coordinate, 135
located, 7
spatial, 10
unit, 5
Vector space, 116
basis, 129
dimension, 129
isomorphism, 176
normed, 251
sums, 135
Volume, 11, 288
Wedge product, 418
Zero:
mapping, 181
matrix, 29
solution, 84
vector, 3