Author: Ward J.P.
Tags: mathematics algebra linear algebra quaternions algebraic systems
ISBN: 978-94-010-6434-7
Year: 1997
Quaternions and Cayley Numbers
Mathematics and Its Applications
Managing Editor:
M.HAZEWINKEL
Centre for Mathematics and Computer Science, Amsterdam, The Netherlands
Volume 403
Quaternions and
Cayley Numbers
Algebra and Applications
by
J. P. Ward
Department of Mathematical Sciences,
Loughborough University,
Loughborough, England
SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 978-94-010-6434-7 ISBN 978-94-011-5768-1 (eBook)
DOI 10.1007/978-94-011-5768-1
Printed on acid-free paper
All Rights Reserved
© 1997 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1997
Softcover reprint of the hardcover 1st edition 1997
No part of the material protected by this copyright notice may be reproduced or
utilized in any form or by any means, electronic or mechanical,
including photocopying, recording or by any information storage and
retrieval system, without written permission from the copyright owner
CONTENTS
Preface
1 Fundamentals of Linear Algebra
1.1 Integers, Rationals and Real Numbers
1.2 Real Numbers and Displacements
1.3 Groups
1.4 Rings and Fields
1.5 Linear Spaces
1.6 Inner Product Spaces
1.7 Algebras
1.8 Complex Numbers
2 Quaternions
2.1 Inventing Quaternions
2.2 Quaternion Algebra
2.3 The Exponential Form and Root Extraction
2.4 Frobenius' Theorem
2.5 Inner Product for Quaternions
2.6 Quaternions and Rotations in 3- and 4-Dimensions
2.7 Relation to the Rotation Matrix
2.8 Matrix Formulation of Quaternions
2.9 Applications to Spherical Trigonometry
2.10 Rotating Axes in Mechanics
3 Complexified Quaternions
3.1 Scalars, Pseudoscalars, Vectors and Pseudovectors
3.2 Complexified Quaternions: Euclidean Metric
3.3 Complexified Quaternions: Minkowski Metric
3.4 Application of Complexified Quaternions to Space-Time
3.5 Quaternions and Electromagnetism
3.6 Quaternionic Representation of Bivectors
3.7 Null Tetrad for Space-time
3.8 Classification of Complex Bivectors and of the Weyl Tensor
4 Cayley Numbers
4.1 A Common Notation for Numbers
4.2 Cayley Numbers
4.3 Angles and Cayley Numbers
4.4 Cayley Number Identities
4.5 Normed Algebras and the Hurwitz Theorem
4.6 Rotations in 7- and 8-Dimensional Euclidean Space
4.7 Basis Elements for Cayley Numbers
4.8 Geometry of 8-Dimensional Rotations
Appendix 1 Clifford Algebras
Appendix 2 Computer Algebra and Cayley Numbers
References
Index
Preface
In essence, this text is written as a challenge to others, to discover significant uses for
Cayley number algebra in physics. I freely admit that though the reading of some
sections would benefit from previous experience of certain topics in physics (particularly
relativity and electromagnetism), generally the mathematics is not sophisticated. In fact,
the mathematically sophisticated reader may well find the rather deliberate progress, in
many places, too slow for their liking.
This text had its origin in a 90-minute lecture on complex numbers given by the author
to prospective university students in 1994. In my attempt to develop a novel approach to
the subject matter I looked at complex numbers from an entirely geometric perspective
and, no doubt in line with innumerable other mathematicians, re-traced steps first taken
by Hamilton and others in the early years of the nineteenth century. I even enquired into
the possibility of using an alternative multiplication rule for complex numbers (in which
$\arg z_1 z_2 = \arg z_1 - \arg z_2$) other than the one which is normally accepted
($\arg z_1 z_2 = \arg z_1 + \arg z_2$). Of course, my alternative was rejected because it didn't lead to a 'product' which
had properties that we now accept as fundamental (i.e. my product was not commutative;
any real number has an infinite number of distinct square roots and square roots of any
other number do not exist!) My considerations on complex numbers led me quite naturally
to consider quaternions (denoted by H) which, though a professional mathematician for
nearly twenty years, I had not properly considered before. Paradoxically, some of the
properties I rejected for complex numbers had now to be accepted: a non-commutative
product and infinite numbers of square roots for particular numbers.
Quaternion algebra is fascinating and (though out of favour) has considerable applications
to many areas of mathematics and physics (though in the latter case it often needs to
be 'complexified'). Just as a complex number can be written as an ordered pair of real
numbers, a quaternion can also be written as an ordered pair of complex numbers.
Following this approach one is then naturally led to consider ordered pairs of quaternions
— these new objects being called Cayley numbers (8-dimensional objects denoted by K).
One might imagine that this pair-wise construction continues to produce new algebraic
objects doubling in dimensionality each time. However, as one increases
the dimensionality of these algebraic objects one finds that, at each stage of generalisation,
algebraic structure is lost. In going from R to C the concept of order is lost. In going
from C to H commutativity in the product is lost and in going from H to K the associative
rule goes. So, in a very natural sense, the Cayley numbers are the end-point of a very
interesting sequence of algebras.
It can be shown that each of these algebras is a normed algebra (in which the norm,
suitably defined, of a product is the product of norms) and indeed, over the real number
field, these are the only normed algebras. One is then led to suspect that, because three of
the members of this important class (the reals, the complex numbers and the quaternions)
have found considerable application in physics, the final member of the class,
the Cayley numbers, should also have significant application to physics. However, thus far,
this appears not to be the case. In my researches I have not found a single area of physics
to which Cayley numbers have been applied. Indeed I have only found one significant
reference, hidden away in the appendix of Volume 2 of Penrose & Rindler's 'Spinors and
Space-Time' [1] which mentions Cayley numbers. One of the reasons for writing this text
is to introduce Cayley numbers and their 'alternative' algebra to a wider audience. My
approach to writing this text is not to give an exhaustive account of quaternions and of
Cayley numbers and (where they exist) their applications, but rather to produce a text
which is relatively short, to the point and one which is accessible to specialists and non-
specialists alike.
For reasons of conciseness and (on my part) ignorance, I have not included applications of
quaternions to mechanisms or to quantum physics. In the theory of mechanisms a 'dual
quaternion' $p + \epsilon q$ is introduced ($p, q$ are quaternions and $\epsilon^2 = 0$). This I regard as a
straightforward extension of the use of quaternions in mechanics and is fully described in
A T Yang & F Freudenstein [2] and in references therein. The application of a quaternion
formalism to quantum theory is better known to most physicists (at least at a superficial
level through the 'Pauli matrices'). More determined attempts to incorporate quaternions
into quantum theory are described by J Edmonds [3] and by Hestenes [4,5,6].
The layout of the book is straightforward. The first chapter covers background material and
fundamental concepts of linear algebra. Chapter 2 begins with the geometrical introduction
to quaternions and fully develops the rules of quaternion algebra. A discussion of Frobenius'
Theorem is included which shows that any associative division algebra over the real
numbers is isomorphic to one of R, C or H. The application of quaternions to rotations
in 3- and 4-dimensional Euclidean space is then considered and its relation to the classical
matrix approach is explored. This chapter also illustrates the application of quaternions
to spherical trigonometry and to problems of rotating coordinate systems in mechanics.
In Chapter 3 we consider the algebra of complexified quaternions and examine two possible
inner products, one of which gives rise to a Euclidean metric and the other to a Minkowski
metric. It could easily be argued that the complex numbers and the quaternions were
introduced to describe rotations in (respectively) 2- and 3-dimensional space and indeed
the idea of a 'rotation' both in geometric and algebraic terms is a common thread which
runs throughout this work. Rotations in a complex 3-dimensional space are considered in
this chapter and applied to the treatment of the Lorentz transformation in special relativity
and to the description of electromagnetism.
In Chapter 4 we develop the algebra of Cayley numbers. We examine the various Cayley
number identities which make its non-associative algebra easier to handle. Also included is
the important Hurwitz Theorem which proves that the only normed algebras over the real
field are isomorphic to R, C, H or K. I have attempted, as far as I was able, to extract
the geometrical interpretation of Cayley numbers and I have mirrored closely the approach
taken in the analysis of quaternions.
There are two appendices. In the first we describe Clifford algebras (all of which are
associative) which include as special cases the real numbers, the complex numbers, the
quaternions and the complexified quaternions but, significantly, not the Cayley numbers.
The second appendix describes the use of the symbolic computation language in Cayley
number algebra.
J P Ward
Loughborough, December 1996
Dedication
This book is dedicated to Joanna Mary Lewis who has been a source of constant and
unfailing encouragement through the two years it has taken to successfully complete this
project. I particularly commend her on her excellent comedy routines (via Victoria Wood),
the odd cup of tea and an endless supply (via Shirley) of digestive biscuits.
Chapter 1
Fundamentals of Linear Algebra
1.1 Integers, Rationals and Real Numbers
The main purpose of this text is to describe the algebra of quaternions and Cayley numbers
and to examine some of their applications. However, before we do that it will be useful to
consider the integers, rationals and reals and to remind ourselves what these mathematical
constructs are and discuss some of their properties.
The main purpose of the positive integers is in counting which is essentially the process
of putting items from one set into one-to-one correspondence with the items from another
set. Perhaps it is because we are all so familiar with this process that it took
mathematicians so long to abstract the idea of an integer, that is, to separate the integers
from the objects with which they were associated. Thus, with their invention, the symbols
1, 2, 3, ... could be used to count fingers or sheep or apples. As well as being able to count
we each have a natural concept of order. This exists in the sense that we say the number
$n_B$ of elements of the set B is greater than the number $n_A$ in the set A and we write
$n_B > n_A$ if the items of A can be put into one-to-one correspondence with some of the items of B
and if there are some items of B left over.
Once the integers were considered in abstract terms they could be used not just for counting
but also for addition. It is perfectly reasonable to count two distinct sets, say of apples;
set A containing 'a' apples and set B containing 'b' apples and then to consider combining
the sets so that all the apples are in one set. The combined number c of apples is defined
as the sum of a and b, written using the usual notation: c = a + b = b + a.
A second operation, multiplication, is easily defined, being the operation of repeated
addition. Now we can consider equations and unknowns. In the simplest case we can
ask, what is that integer x such that a + x = b or ax = b where a, b are integers. Now, of
course, the first equation can only be solved for integer x if b > a and the second equation
only has a solution for integer x if b ≥ a and if b is an integer multiple of a. We would
much prefer a situation in which these equations could be solved for all possible choices
of a, b. This leads directly to expanding the system of positive integers to include the
negative integers and then further extending them to include the rationals. If we include
the negative integers and zero ..., -3, -2, -1, 0, 1, 2, 3, ... then the 'solution' to a + x = b
is x = b + (-a), where the symbol (-a) has the usual meaning.
It is interesting to note that using only the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$ all the
negative integers and zero can be defined using only the normal operations for adding
and multiplying. We define the set $\mathcal{Z}(a,b)$ to be the collection of all number pairs $(a, b)$
where $a, b \in \mathbb{N}$ with the rules for adding and multiplying being:
$$(a, b) \oplus (c, d) = (a + c,\ b + d) \qquad (a, b) \otimes (c, d) = (ac + bd,\ bc + ad)$$
J. P. Ward, Quaternions and Cayley Numbers
© Kluwer Academic Publishers 1997
We shall also assume that equality between two such number pairs satisfies:
$(a, b) = (c, d)$ if and only if $a + d = b + c$.
From this set of 'objects' the positive integers are those number pairs $(a, b)$ for which $a > b$;
the negative integers are those for which $a < b$ and the zero element is the number pair
$(a, a)$. Effectively we are denoting $a - b$ by $(a, b)$. All the usual properties of negative
numbers can be derived from these basic axioms (the negative sign need never be used).
Of course it can be shown that $\mathcal{Z}(a,b)$ and the set $\mathbb{Z}$ of integers $\{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$
are essentially identical. We note that the product of negative integers $(a, b)$, $b > a$, and
$(c, d)$, $d > c$, is a positive integer. To see this we must show that $ac + bd > bc + ad$. Since
$b > a$ and $d > c$ then there exist integers $k, m \in \mathbb{N}$ such that $b = a + m$ and $d = c + k$.
Then
$$\begin{aligned}
ac + bd &= ac + (a + m)(c + k) \\
&= ac + ac + mc + ak + mk \\
&= (a + m)c + a(c + k) + mk \\
&= bc + ad + mk > bc + ad.
\end{aligned}$$
Therefore $(ac + bd,\ bc + ad)$ is a positive integer. The use of an ordered list of objects (here
a pair of integers) to define a different kind of number is a recurring theme in this area of
mathematics.
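The pair construction above can be tried out directly. The following is a minimal sketch in Python, assuming that a pair (a, b) of natural numbers is stored as a tuple and behaves like the difference a - b (so that a > b marks a positive integer); the function names are illustrative choices and do not appear in the text.

```python
# Sketch: integers modelled as pairs (a, b) of natural numbers.
# The helper names below are hypothetical, chosen only for illustration.

def pair_add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def pair_mul(p, q):
    """(a, b) x (c, d) = (ac + bd, bc + ad)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, b * c + a * d)

def pair_eq(p, q):
    """(a, b) equals (c, d) exactly when a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c

def pair_sign(p):
    """+1 for a positive pair (a > b), -1 for a negative pair (a < b), 0 for (a, a)."""
    a, b = p
    return (a > b) - (a < b)

minus_two = (1, 3)     # a < b, so a negative integer
minus_three = (2, 5)   # a < b, so a negative integer
product = pair_mul(minus_two, minus_three)   # (1*2 + 3*5, 3*2 + 1*5)
print(product, pair_sign(product))           # (17, 11) 1
print(pair_eq(product, (7, 1)))              # True: both pairs stand for the same integer
```

No minus sign is used anywhere in the arithmetic, which is the point of the construction: the product of two negative pairs comes out as a pair with first entry larger than the second.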
Using integers to count is an elementary idea. But how do we distinguish between a 'small'
apple and a 'large' apple or how do we 'account' for an apple sliced into exactly two pieces?
Distinguishing between large and small is a more complicated problem than dealing with
'fractions' of a whole. Again we use the artifice of a number pair.
When we divide an apple into exactly two pieces the selection of one piece might be denoted
by (1, 2); the first integer standing for the number of pieces in our selection and the second
integer standing for the number of pieces in the whole group. Thus the pairing (3,27)
could describe the selection of 3 sheep from a flock of 27. Even more suggestively (with
present day notation) we could write the pair of integers in column form $\binom{1}{2}$ or $\binom{3}{27}$.
A pairing such as $\binom{a}{0}$ is meaningless; choosing $a$ items from 0 is simply not allowable.
The rational numbers form the set $\mathbb{Q}$ of all ordered pairs of integers $\binom{a}{b}$, $b \neq 0$, $a, b \in \mathbb{Z}$,
satisfying the algebraic rules of combination:
$$\binom{a}{b} \oplus \binom{c}{d} = \binom{ad + bc}{bd}$$
$$\binom{a}{b} \otimes \binom{c}{d} = \binom{ac}{bd}$$
$$\binom{a}{b} = \binom{sa}{sb} \quad \text{where } s \text{ is a non-zero integer}$$
Now, the algebraic equation $ax = b$ or, equivalently, $\binom{a}{1} \otimes x = \binom{b}{1}$ has the solution
$x = \binom{b}{a}$. Of course, the integers are contained within the rationals, being rationals of the
form $\binom{a}{1}$. In any practical calculation almost any measurement can be accomplished
using the rational number system. Practicalities, however, are not of direct interest to the
mathematician. Of course, he is very happy if his 'constructs' find application in the real
world but this is not his overriding interest.
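As a hedged illustration of rationals as number pairs, the sketch below stores a rational as a tuple (a, b) with b non-zero and assumes the usual fraction rules: the sum of (a, b) and (c, d) is (ad + bc, bd), the product is (ac, bd), and two pairs are equal when ad = bc. The function names are hypothetical.

```python
# Sketch: rationals as ordered pairs (a, b), b != 0, read as "a pieces out of b".
# The combination rules used here are the standard ones and are an assumption
# of this example.

def rat_add(p, q):
    (a, b), (c, d) = p, q
    return (a * d + b * c, b * d)

def rat_mul(p, q):
    (a, b), (c, d) = p, q
    return (a * c, b * d)

def rat_eq(p, q):
    """(a, b) and (c, d) denote the same rational when ad = bc."""
    (a, b), (c, d) = p, q
    return a * d == b * c

def solve_linear(a, b):
    """Solve (a, 1) x = (b, 1) for a non-zero integer a; the solution is (b, a)."""
    if a == 0:
        raise ValueError("a must be non-zero")
    return (b, a)

x = solve_linear(3, 7)
print(x)                                   # (7, 3)
print(rat_eq(rat_mul((3, 1), x), (7, 1)))  # True: 3x = 7 holds
print(rat_add((1, 2), (1, 3)))             # (5, 6)
```

The equation 3x = 7 has no integer solution, but in the pair system the solution is simply read off as (7, 3).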
The need for a number which could not be represented by a rational was first considered
by Pythagoras who was born about 580 BC. The example is well known. We take a unit
square and slice it into two along a diagonal and if $x$ is the length of the hypotenuse of
one of the right-angled triangles so formed then $x$ satisfies the simple equation: $x^2 = 2$.
The series of rationals,
$$\frac{14}{10},\ \frac{141}{100},\ \frac{1414}{1000},\ \ldots$$
might be 'close' candidates for $x$ as they produce for $x^2$, in turn
$$\frac{196}{100},\ \frac{19881}{10000},\ \frac{1999396}{1000000},\ \ldots$$
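The squares quoted above can be checked with exact rational arithmetic. This is a small sketch using Python's standard `fractions` module; the variable names are illustrative.

```python
from fractions import Fraction

# The three rational candidates for x and their exact squares.
approximations = [Fraction(14, 10), Fraction(141, 100), Fraction(1414, 1000)]

for x in approximations:
    square = x * x
    # Each square falls short of 2; the shortfall shrinks as digits are added.
    print(x, square, float(2 - square))
```

Each square is strictly less than 2, and the gap narrows with every extra digit, yet no candidate of this kind ever gives exactly 2.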
Now we can perhaps appreciate Pythagoras' problem. What (if any) is the rational number
$x$ which satisfies this equation? It is not too difficult to see that in fact no such rational
number exists. For let us assume that there is such a rational number: $x = \binom{m}{n}$ which
is assumed to be written in such a way that $m$ and $n$ have no common factor other than
$\pm 1$. Therefore $x^2 = 2$ implies:
$$\binom{m}{n} \otimes \binom{m}{n} = \binom{m^2}{n^2} = \binom{2}{1}$$
Thus we deduce that $m, n$ exist such that $m^2 = 2n^2$. But the right hand side is an even
number irrespective of the evenness or otherwise of $n$; thus $m^2$ is an even number. Since the
square of any odd integer is odd we deduce that $m$ must be an even number and so $m^2$ must
have a factor of 4, implying $n^2$ is even and hence $n$ is even. Thus both $m$ and $n$ are even which contradicts our
original assumption that $m$ and $n$ have no common factors. Thus the solution to $x^2 = 2$
cannot be a rational number. Such a number is termed irrational.
Without going into great detail the real numbers, denoted by $\mathbb{R}$, can be defined in
rigorous terms. Any real number $b$ can be written in decimal fraction form:
$$b = k.d_1 d_2 d_3 \ldots = k + \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \ldots$$
where $k \in \mathbb{Z}$ and $d_1, d_2, \ldots$ are non-negative integers less than 10, the digit $d_n$
contributing the term $d_n/10^n$. The term $.d_1 d_2 d_3 \ldots$ is called the
decimal part of $b$. Here we are reverting to the ordinary conventional way of writing a
rational number. So, for example, (in ordinary notation)
$$1.41421\ldots$$
is the decimal fraction representation of $\sqrt{2}$. It is clear that any decimal fraction which
satisfies $d_p = 0$ for $p$ greater than some integer $q$ represents a rational number. (These are
called finite decimal fractions). Thus
$$2.35000\ldots$$
is the rational
$$\frac{235}{100}$$
However a rational may also be represented by a periodic decimal in which a sequence of
digits in the decimal part repeats indefinitely. For example, consider the decimal fraction
$$b = 0.09090909\ldots$$
Clearly
$$b = \frac{9}{10^2} + \frac{9}{10^4} + \frac{9}{10^6} + \ldots = \frac{9}{10^2}\left[1 + \frac{1}{10^2} + \frac{1}{10^4} + \ldots\right] = \frac{9}{10^2} \cdot \frac{1}{1 - 10^{-2}} = \frac{1}{11}$$
where we have used the usual 'sum to infinity' rule of a geometric series. It is easily shown
that all periodic decimals are rational numbers. For consider the periodic decimal:
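$$b = k.d_1 d_2 d_3 \ldots d_k (e_1 e_2 \ldots e_n)^\circ$$
where $(e_1 e_2 \ldots e_n)^\circ$ denotes the sequence of recurring digits in the decimal part of the
number. Now, clearly,
$$k.d_1 d_2 d_3 \ldots d_k \ \text{is rational}$$
and if we let, for the first set of recurring digits,
$$K \equiv [e_1 e_2 \ldots e_n] = \frac{e_1}{10^{k+1}} + \frac{e_2}{10^{k+2}} + \ldots + \frac{e_n}{10^{k+n}} = \frac{1}{10^k}[0.e_1 e_2 \ldots e_n]$$
$$\therefore\ (e_1 e_2 \ldots e_n)^\circ = K\left[1 + \frac{1}{10^n} + \frac{1}{10^{2n}} + \ldots\right] = K\,\frac{1}{1 - 10^{-n}}$$
which is rational implying $k.d_1 d_2 d_3 \ldots d_k (e_1 e_2 \ldots e_n)^\circ$ is rational. As a typical example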
consider $b = 0.660333\ldots = 0.660(3)^\circ$:
$$0.660(3)^\circ = \frac{660}{1000} + \frac{3}{10000} \cdot \frac{1}{1 - 10^{-1}} = \frac{660}{1000} + \frac{1}{3000} = \frac{1981}{3000}$$
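The geometric-series argument translates into a short conversion routine. The sketch below is one possible rendering, assuming a non-negative integer part and digit strings for the fixed and recurring parts; the function name `periodic_to_fraction` and its parameters are hypothetical.

```python
from fractions import Fraction

def periodic_to_fraction(integer_part, fixed_digits, recurring_digits):
    """Convert a periodic decimal to an exact Fraction.

    integer_part     : the non-negative integer in front of the decimal point
    fixed_digits     : the k non-recurring digits as a string (may be empty)
    recurring_digits : the n recurring digits as a non-empty string
    """
    if not recurring_digits.isdigit():
        raise ValueError("recurring_digits must be a non-empty string of digits")
    k = len(fixed_digits)
    n = len(recurring_digits)
    fixed = Fraction(int(fixed_digits), 10 ** k) if k else Fraction(0)
    # K is the first block of recurring digits, shifted k places to the right.
    K = Fraction(int(recurring_digits), 10 ** (k + n))
    # Sum to infinity of 1 + 10^-n + 10^-2n + ... is 1 / (1 - 10^-n).
    recurring = K / (1 - Fraction(1, 10 ** n))
    return integer_part + fixed + recurring

print(periodic_to_fraction(0, "660", "3"))  # 1981/3000
print(periodic_to_fraction(0, "", "09"))    # 1/11
print(periodic_to_fraction(2, "3", "0"))    # 23/10
```

The last call treats a finite decimal as a periodic one with a recurring zero, so the same routine covers both cases.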
In this formalism the definition of an irrational number is quite simple: it is one for which
the decimal fraction is non-periodic (here a number such as 2.30000... is periodic as $n = 1$
and $e_1 = 0$).
As we have seen above, the most elementary concept of number, the positive integer, was
(for algebraic convenience) first generalised to the negative integers, then to the rationals
and finally to the irrational numbers. We continue our discussion of real numbers by
considering their geometrical interpretation. In this way we shall be led naturally to
consider further generalisations of number.
1.2 Real Numbers and Displacements
Of all the mathematical inventions perhaps the positive real numbers ($\mathbb{R}_p$) have the most
direct application to Physics and Engineering and their properties are well known. We
imagine these numbers to be in one-to-one correspondence with points on a half-line
called the positive real line. Two binary operations, addition (denoted by $+$) and
multiplication (denoted by $\times$) are defined. Addition is defined in an obvious way. Given
two positive real numbers $a$ and $b$ we construct $a + b$ by sliding $b$ so that it points out of the
end of $a$. Then $a + b$ is that positive real number measured from the start point of $a$ to the
end point of $b$. Multiplication also has an obvious geometric meaning in terms of multiple
additions. The construction of $na$ where both $n$ and $a$ are irrational is conceptually easier
to consider if we move away from this one-dimensional picture. Consider two lines OH
and OT intersecting at O inclined at an acute angle as shown in Figure 1.1. Along OH
we measure the unit 1 and the number a whilst along line OT we mark the number n. We
draw two further lines one from P to Q and then a line parallel to this one from R to meet
OT in a point S. The length of OS is the geometrical construct of the product number
na. This diagram assumes that a > 1. The second diagram considers the construct when
a < 1.
Figure 1.1
There is also an order property that can be associated with positive real numbers. If a
and b are two positive real numbers such that b is further to the right on the positive real
line than a then we write b > a.
The need to extend the positive real numbers to include the negative real numbers arises
(as it did with the positive integers) if we demand that all linear equations should have a
solution. If we wish to 'solve' a + x = b irrespective of the values of a and b then we must
consider negative real numbers. As with integers this can be done by considering number
pairs which obviates the need to introduce a minus sign. However, here we consider the
construction of negative real numbers from a geometrical perspective.
Points on a Line: Displacements
We draw a line, called the scalar axis, and choose a point 0 called the origin (Figure 1.2).
Figure 1.2 (the scalar axis $s$ with origin 0)
We can perform two basic operations on a line; moving to the right or moving to the left.
A positive scalar is obtained if we move to the right, a negative scalar is obtained
if we move to the left. The set of all scalars is denoted by $\mathbb{R}$. An ordinary scalar has
both a magnitude and a direction indicated by an arrow. (A displacement may be
a better terminology). We accept that the value of a scalar is independent of its position
on this line. That is, scalars are slidable. Thus two scalars $a, b$ are said to be equal if
and only if both have the same length and both are pointing in the same direction. For
every scalar $c$ (except zero) we can define two useful geometric terms, the norm $N_c$ and
the direction (sign) $\mathrm{sign}_c$:
$$N_c = \text{square of the length of } c, \qquad N_c \in \mathbb{R}_p$$
$$\mathrm{sign}_c = +1 \text{ if } c \text{ points right and } \mathrm{sign}_c = -1 \text{ if } c \text{ points left}$$
Operations with Scalars
These operations are familiar to all of us. Addition has an obvious geometric interpretation.
As with positive real numbers, to form $a + b$ we move '$b$' to point out of the end of '$a$'.
Then '$a + b$' is the line joining the start point of $a$ with the end point of $b$. The same
construction applies if both scalars are negative. Subtraction implies a change of direction
from right to left or from left to right. To construct $a - b$ we move '$b$' to the end of '$a$' and
then change the direction (the arrow) of '$b$' to produce $(-b)$. See Figure 1.3.
Figure 1.3 (construction of $a - b$ on the scalar axis $s$)
The same construction applies if the scalars are negative.
There are two types of product that we can contemplate: multiplication of a scalar with a
positive real number and products of scalars with scalars.
Multiplication by Positive Real Numbers
If $r$ is a positive real and $a$ is a scalar then the product of $r$ with $a$, written $ra$, is a scalar
in the same direction as $a$ (i.e. with the same sign as $a$) with norm $r^2 N_a$. See Figure 1.4.
Figure 1.4 (the scalar $ra$, of length $r\sqrt{N_a}$)
Scalar Multiplication
If $a, b$ are scalars then the product of $a$ with $b$, also written $ab$, is the scalar with norm $N_{ab}$:
$$N_{ab} = N_a N_b$$
and direction (sign):
$$\mathrm{sign}_{ab} = +1 \text{ if } \mathrm{sign}_a = \mathrm{sign}_b$$
$$\mathrm{sign}_{ab} = -1 \text{ if } \mathrm{sign}_a \neq \mathrm{sign}_b$$
This definition reduces (as we would demand it should) to multiplication by a real number
if $a$ is a positive real number and $b$ is a scalar. The only other possible choice for the definition
of the sign of a product (consistent with multiplication by positive real numbers) is:
$$\mathrm{sign}_{ab} = +1 \text{ if } \mathrm{sign}_a = \mathrm{sign}_b = +1$$
$$\mathrm{sign}_{ab} = -1 \text{ otherwise}$$
Multiplication defined with this choice for the sign is easily shown not to satisfy the
distributive law $a(b + c) = ab + ac$, and so is rejected.
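A single numerical counterexample is enough to see why the alternative sign rule fails. The sketch below models non-zero scalars as signed Python numbers (their absolute value being the length, i.e. the square root of the norm) and applies each sign rule in turn; the function names are illustrative assumptions.

```python
def sign(x):
    """Direction of a non-zero scalar: +1 (points right) or -1 (points left)."""
    return 1 if x > 0 else -1

def standard_rule(sa, sb):
    return 1 if sa == sb else -1

def alternative_rule(sa, sb):
    return 1 if (sa == 1 and sb == 1) else -1

def scalar_product(a, b, rule):
    """Length of the product is the product of lengths; the sign comes from `rule`."""
    return rule(sign(a), sign(b)) * abs(a) * abs(b)

a, b, c = -1, -1, 2
for rule in (standard_rule, alternative_rule):
    lhs = scalar_product(a, b + c, rule)                         # a(b + c)
    rhs = scalar_product(a, b, rule) + scalar_product(a, c, rule)  # ab + ac
    print(rule.__name__, lhs, rhs, lhs == rhs)
# standard_rule -1 -1 True
# alternative_rule -1 -3 False
```

With the alternative rule the product of two left-pointing scalars points left, so ab contributes -1 instead of +1 and the two sides of the distributive law disagree.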
It is clear that any scalar $a$ with norm $N_a$ may be written in the form: $a = \sqrt{N_a}\,\hat{a}$ where
$\hat{a}$ is a scalar with the same sign as $a$ ($\mathrm{sign}_{\hat{a}} = \mathrm{sign}_a$) and with unit norm $N_{\hat{a}} = 1$. Now
with any unit scalar (there are only two of them; either pointing to the right or pointing
to the left) we can associate what we shall call an 'arc', written $\mathrm{arc}_{\hat{a}}$. Figure
1.5 displays $\mathrm{arc}_e$ (if $e$ has +ve sign) and $\mathrm{arc}_f$ (if $f$ has -ve sign).
Figure 1.5 ($\mathrm{arc}_e$ and $\mathrm{arc}_f$ drawn on the scalar axis $s$)
We note that a real number $a = \sqrt{N_a}\,\hat{a}$ may be written in 'polar coordinate' form: $(\sqrt{N_a},\ \mathrm{arc}_{\hat{a}})$.
Arcs may slide and combine to produce two distinct constructs. The combination of any two arcs in
the same direction (with the same sign) is called a 'circle', and that of any two arcs in opposite
directions is called a 'semi-circle'. (The reason for this slightly odd terminology, in this
context, will become clear as we generalise to higher dimensions). See Figure 1.6.
Figure 1.6 (arcs combining to form a semi-circle and a circle)
With this convention and denoting by $\sim$ the correspondence between a unit scalar and its
associated arc, viz. $\hat{a} \sim \mathrm{arc}_{\hat{a}}$, we have
$$+\text{ve } 1 \sim \text{circle} \qquad -\text{ve } 1 \sim \text{semi-circle}$$
and if we use the symbol $\oplus$ to indicate 'summation' according to the constructs in Figure
1.6, we have for any two unit scalars $p, q$
$$\mathrm{arc}_p \oplus \mathrm{arc}_q = \mathrm{arc}_{pq}$$
(There are only two possibilities: $\mathrm{arc}_{pq}$ = circle if $p, q$ have the same sign and $\mathrm{arc}_{pq}$ =
semi-circle if $p, q$ have opposite signs). In particular
$$\mathrm{arc}_p \oplus \mathrm{arc}_p = \mathrm{arc}_{p^2} = \text{circle} \sim +\text{ve } 1$$
which states that the square of any unit scalar (and hence any scalar) always points to the
right (i.e. is always +ve). Also
$$\mathrm{arc}_p \oplus \text{circle} = \mathrm{arc}_p \qquad \mathrm{arc}_p \oplus \text{semi-circle} = -\mathrm{arc}_p$$
for any unit scalar $p$. We can now re-write the product of scalars $a = \sqrt{N_a}\,\hat{a}$ and $b = \sqrt{N_b}\,\hat{b}$
in the form:
$$ab = (\sqrt{N_a}\,\hat{a})(\sqrt{N_b}\,\hat{b}) = \sqrt{N_a N_b}\,(\mathrm{arc}_{\hat{a}\hat{b}})$$
The term $\mathrm{arc}_{\hat{a}\hat{b}}$ in this expression provides the 'sign' of the product. So scalar multiplication
has been reduced to multiplication of positive real numbers and addition of arcs.
1.3 Groups
In the previous sections we have examined many different sets including integers, rationals
and real numbers. On each of these sets we have introduced two operations; addition and
multiplication each being an example of a binary operation. On each one an algebraic
structure of some kind was imposed. Perhaps the simplest algebraic structure shared by
most of the sets we have met is the structure of a group.
Binary Operations
Let $G$ be a non-empty set. A binary operation $*$ on $G$ is a rule which assigns to each ordered pair of elements
$a, b$ of $G$ a unique element $a * b$. A set $G$ is said to be closed under the operation $*$ if
for every pair of elements $a, b \in G$ we have $a * b \in G$. A binary operation can be regarded as
a map from the Cartesian product $P \times P$ into $P$
$$f : P \times P \mapsto P \qquad f\{(p_1, p_2)\} = p_1 * p_2$$
Of course a group is one of the central ideas of mathematics. Precisely, a group is a set of
elements $G$ closed with respect to an operation $*$ and which satisfies the following axioms.
G1° $p * (q * r) = (p * q) * r \quad \forall\ p, q, r \in G$ ($*$ is associative)
G2° $\exists$ an identity element $i \in G$ such that for every $p \in G$
$$p * i = i * p = p$$
G3° For each element $p \in G$, $\exists$ an element $q \in G$ called the inverse such that
$$p * q = q * p = i$$
A group is said to be an abelian group if for every pair of elements $p, q \in G$ the operation
is commutative: $p * q = q * p$.
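For a finite set the axioms can be checked exhaustively. The following is a minimal brute-force sketch, assuming the set is small enough to enumerate and the operation is given as a Python function of two arguments; `is_group` and `is_abelian` are hypothetical helper names.

```python
from itertools import product

def is_group(elements, op):
    """Brute-force check of closure and axioms G1 to G3 on a finite set."""
    elements = list(elements)
    pairs = list(product(elements, repeat=2))
    # Closure: p * q must stay inside the set.
    if any(op(p, q) not in elements for p, q in pairs):
        return False
    # G1: associativity.
    if any(op(p, op(q, r)) != op(op(p, q), r)
           for p, q, r in product(elements, repeat=3)):
        return False
    # G2: at least one two-sided identity.
    identities = [i for i in elements
                  if all(op(p, i) == p and op(i, p) == p for p in elements)]
    if not identities:
        return False
    i = identities[0]
    # G3: every element has a two-sided inverse.
    return all(any(op(p, q) == i and op(q, p) == i for q in elements)
               for p in elements)

def is_abelian(elements, op):
    elements = list(elements)
    return all(op(p, q) == op(q, p) for p, q in product(elements, repeat=2))

def add_mod5(p, q):
    return (p + q) % 5

def mul_mod5(p, q):
    return (p * q) % 5

print(is_group(range(5), add_mod5))     # True
print(is_group(range(5), mul_mod5))     # False: 0 has no inverse
print(is_group(range(1, 5), mul_mod5))  # True
print(is_abelian(range(5), add_mod5))   # True
```

The same set can be a group under one operation and fail under another, which is why the operation is always part of the definition.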
Uniqueness of Identity and Inverse
In the group axioms there is required to be at least one identity element and for each
element at least one inverse. However, it is easily shown that, in a group, the identity
element is unique and also each element has a unique inverse. To prove that the group
identity is unique we assume that there are two identity elements $i_1, i_2$ and argue that $i_1$
is identical to $i_2$. Thus for every element $a \in G$ we assume:
$$a * i_1 = i_1 * a = a \quad \text{and} \quad a * i_2 = i_2 * a = a$$
But in particular, for element $i_2 \in G$ we have, using $i_1$ as identity:
$$i_2 * i_1 = i_1 * i_2 = i_2$$
Also, for element $i_1 \in G$ we have, using $i_2$ as identity:
$$i_1 = i_1 * i_2 = i_2 \quad \text{from above}$$
proving that the identity is unique.
To show that every element $a \in G$ has a unique inverse we assume that $p$ and $q$ are two
possible inverses. Then
$$\begin{aligned}
p = p * i &= p * (a * q) && \text{since } a \text{ and } q \text{ are inverse} \\
&= (p * a) * q && \text{using the associative property} \\
&= i * q = q && \text{since } p \text{ and } a \text{ are inverse}
\end{aligned}$$
A simple deduction from this result is that if $a, b \in G$ then
$$(a * b)^{-1} = b^{-1} * a^{-1}$$
This follows easily since
$$\begin{aligned}
(a * b) * (b^{-1} * a^{-1}) &= a * (b * b^{-1}) * a^{-1} \\
&= a * a^{-1} \\
&= i
\end{aligned}$$
so, since the inverse is unique, the inverse element to $a * b$ is $b^{-1} * a^{-1}$.
Subgroups
Let $G$ be a group with operation $*$. The set $\mathcal{G}$ is said to be a subgroup of $G$ if $\mathcal{G}$ is a
subset of $G$ and is also a group with the same binary operation $*$. Perhaps not surprisingly
one can readily show that the identity in $\mathcal{G}$ is the same as the identity in $G$ and if $a \in \mathcal{G}$
(implying $a \in G$) then the inverse of $a$ in $\mathcal{G}$ is the same as the inverse of $a$ in $G$.
The following theorem allows one to easily check whether or not a subset of a group is a
subgroup.
Theorem 1.1 If $G$ is a group with operation $*$ then $\mathcal{G}$, a subset of $G$, is also a subgroup
of $G$ if and only if the following statements are true:
o $\mathcal{G}$ has at least one element.
o if $\alpha, \beta \in \mathcal{G}$ then $\alpha * \beta \in \mathcal{G}$, i.e. $\mathcal{G}$ is closed under $*$.
o if $\beta$ is the inverse of $\alpha \in \mathcal{G}$ then $\beta \in \mathcal{G}$.
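The three conditions of Theorem 1.1 lend themselves to a direct check. The sketch below assumes the parent group is finite, that the candidate is already known to be a subset of it, and that the group operation and inverse are supplied as functions; the names are illustrative.

```python
def is_subgroup(subset, op, inverse):
    """Apply the three conditions of the subgroup test to a subset of a group."""
    subset = set(subset)
    if not subset:                       # condition 1: at least one element
        return False
    closed = all(op(a, b) in subset for a in subset for b in subset)  # condition 2
    has_inverses = all(inverse(a) in subset for a in subset)          # condition 3
    return closed and has_inverses

# Parent group: the integers 0..5 under addition modulo 6.
def add_mod6(a, b):
    return (a + b) % 6

def neg_mod6(a):
    return (-a) % 6

print(is_subgroup({0, 2, 4}, add_mod6, neg_mod6))  # True
print(is_subgroup({0, 1, 2}, add_mod6, neg_mod6))  # False: 1 + 2 = 3 is missing
print(is_subgroup(set(), add_mod6, neg_mod6))      # False: empty
```

Associativity never needs re-checking, since it is inherited from the parent group; that is what makes the test so much shorter than verifying the group axioms from scratch.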
Isomorphisms of Groups
Let $S$ be a group with operation $*$ and let $T$ be a group with operation $\bullet$; then a map
$\phi : S \mapsto T$ is called a homomorphism if
$$\phi(a * b) = \phi(a) \bullet \phi(b) \quad \forall\ a, b \in S$$
A homomorphism preserves the algebraic structure of the group upon which it acts. An
immediate deduction is that if $\phi : S \mapsto T$ is a homomorphism and if $i_S$, $i_T$ are the identities
in $S$ and $T$ respectively then:
$$\phi(i_S) = i_T \quad \text{and if } \phi(a_S) = a_T \text{ then } \phi(a_S^{-1}) = a_T^{-1}$$
Let $S$ be a group with operation $*$ and let $T$ be a group with operation $\bullet$. An isomorphism
of $S$ onto $T$ is a homomorphism $\phi : S \mapsto T$ that is one-to-one and onto. In this case $S$ and
$T$ are said to be isomorphic which we denote by $S \simeq T$.
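Example Let $M$ be the group of all non-singular $2 \times 2$ real matrices with matrix multiplication $*$ as
group operation. Show that the map
$$\phi : M \mapsto \mathbb{R} \qquad \phi(A) = \det A \quad \forall\ A \in M$$
is a homomorphism (the image lying in the non-zero reals, which form a group under ordinary multiplication).
Solution This follows easily since (using well known properties of matrices and
determinants)
$$\phi(A * B) = \det(A * B) = \det A \cdot \det B = \phi(A) \cdot \phi(B)$$
Note also that, as expected from the result above, with $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$,
$$\phi(i_M) = \det \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$$
$$\phi(A^{-1}) = \det\left( \frac{1}{\det A} \begin{pmatrix} d & -b \\ -c & a \end{pmatrix} \right) = (\det A)^{-1}$$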
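The determinant identities in this example are easy to confirm numerically. The sketch below uses exact rational arithmetic and stores a 2 x 2 matrix as a tuple of two rows; the helper names and the sample matrices are assumptions made for illustration.

```python
from fractions import Fraction

def matmul(A, B):
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

def inverse(A):
    """Inverse of a non-singular 2 x 2 matrix: (1 / det A) [[d, -b], [-c, a]]."""
    (a, b), (c, d) = A
    D = Fraction(det(A))
    if D == 0:
        raise ValueError("matrix is singular")
    return ((d / D, -b / D), (-c / D, a / D))

A = ((2, 1), (5, 3))   # det A = 1
B = ((1, 2), (3, 4))   # det B = -2
identity = ((1, 0), (0, 1))

print(det(matmul(A, B)), det(A) * det(B))        # -2 -2
print(det(identity))                             # 1
print(det(inverse(B)), 1 / Fraction(det(B)))     # -1/2 -1/2
```

The three printed lines correspond, in order, to the homomorphism property, the image of the identity, and the image of an inverse.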
Anti-isomorphisms
A map $\phi : S \mapsto T$ that is one-to-one and onto such that
$$\phi(a * b) = \phi(b) \bullet \phi(a) \quad \forall\ a, b \in S$$
is called an anti-isomorphism. It follows immediately that if $\phi$ and $\psi$ are two anti-isomorphisms
of $S \mapsto S$ then the composition $\phi \circ \psi$ is an isomorphism.
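Example If $S$ is a group then show that the map $\phi : S \mapsto S$ such that
$$\phi(s) = s^{-1} \quad \forall\ s \in S$$
where $s^{-1}$ is the inverse of $s$, is an anti-isomorphism.
Solution The map is clearly one-to-one and onto since there exists an inverse map $\phi^{-1}$:
$$\phi^{-1} : S \mapsto S \qquad \phi^{-1}(s) = s^{-1} \quad \forall\ s \in S$$
which satisfies
$$\phi^{-1} \circ \phi(s) = \phi^{-1}(\phi(s)) = \phi^{-1}(s^{-1}) = (s^{-1})^{-1} = s$$
$$\phi \circ \phi^{-1}(s) = \phi(\phi^{-1}(s)) = \phi(s^{-1}) = (s^{-1})^{-1} = s$$
so that $\phi^{-1} \circ \phi = \phi \circ \phi^{-1} = i_S$, the identity map on $S$.
Also
$$\phi(a * b) = (a * b)^{-1} = b^{-1} * a^{-1} = \phi(b) * \phi(a)$$
so $\phi$ is an anti-isomorphism.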
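The inversion map can be tested exhaustively on a small non-abelian group. The sketch below assumes the symmetric group on three symbols, with a permutation stored as a tuple p in which p[i] is the image of i; the helper names are hypothetical.

```python
from itertools import permutations

# The six permutations of {0, 1, 2}.
S3 = list(permutations(range(3)))

def compose(p, q):
    """Group operation: (p * q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def invert(p):
    """Inverse permutation: send each image back to its source."""
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

is_bijection = sorted(invert(p) for p in S3) == sorted(S3)
reverses_products = all(invert(compose(a, b)) == compose(invert(b), invert(a))
                        for a in S3 for b in S3)
preserves_products = all(invert(compose(a, b)) == compose(invert(a), invert(b))
                         for a in S3 for b in S3)

print(is_bijection, reverses_products, preserves_products)  # True True False
```

Inversion is one-to-one and onto and reverses every product, but because this group is not abelian it does not preserve products, so it is an anti-isomorphism and not an isomorphism.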
Theorem 1.2 If $\phi$ and $\psi$ are two anti-isomorphisms of $S \mapsto S$ then the composition
$\phi \circ \psi$ is an isomorphism.
Proof $\phi$ and $\psi$ are both onto and one-to-one and so, by an earlier result, $\phi \circ \psi$ is onto and
one-to-one. Also if $a, b \in S$ then
$$\begin{aligned}
\phi \circ \psi(a * b) &= \phi(\psi(a * b)) \\
&= \phi(\psi(b) * \psi(a)) \\
&= \phi(\psi(a)) * \phi(\psi(b)) \\
&= \phi \circ \psi(a) * \phi \circ \psi(b)
\end{aligned}$$
So the binary operation is preserved, confirming that the composite map is an isomorphism.
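Theorem 1.2 can be illustrated with two different anti-isomorphisms of the same group. In the sketch below, which again assumes permutations of three symbols stored as tuples, $\phi$ is inversion and $\psi$ sends $s$ to $g * s^{-1} * g^{-1}$ for a fixed element $g$; this particular choice of $\psi$ and all names are assumptions made for the example.

```python
from itertools import permutations

S3 = list(permutations(range(3)))

def compose(p, q):
    """Group operation: (p * q)(i) = p(q(i))."""
    return tuple(p[q[i]] for i in range(len(q)))

def invert(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

g = (1, 0, 2)  # a fixed transposition, chosen arbitrarily

def phi(s):
    return invert(s)

def psi(s):
    return compose(compose(g, invert(s)), invert(g))

def composite(s):
    return phi(psi(s))

def bijective(f):
    return sorted(f(s) for s in S3) == sorted(S3)

def reverses(f):
    return all(f(compose(a, b)) == compose(f(b), f(a)) for a in S3 for b in S3)

def preserves(f):
    return all(f(compose(a, b)) == compose(f(a), f(b)) for a in S3 for b in S3)

print(bijective(phi), reverses(phi))              # True True
print(bijective(psi), reverses(psi))              # True True
print(bijective(composite), preserves(composite)) # True True
```

Each map on its own reverses products, while the composite preserves them: two reversals of order cancel, exactly as in the proof.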
1.4 Rings and Fields
A group, as we have seen, is a set upon which a single operation $*$ is defined satisfying
certain basic axioms. A ring is a mathematical construct: a set $S$ upon which two binary
operations $\oplus$, $\otimes$ are defined such that $S$ is closed under both $\oplus$ and $\otimes$ and such that the
elements of $S$ together with these operations satisfy a number of basic axioms.
Although we are primarily interested in the properties of real numbers (and their
generalisations) we shall nevertheless carry through our discussion of rings in general,
abstract terms though the notation that we shall use will be very suggestive of the notation
commonly employed with real numbers.
Precisely: a ring is a set $S$ such that
R1o $a \oplus (b \oplus c) = (a \oplus b) \oplus c \quad \forall a, b, c \in S$. Operation $\oplus$ is associative.
R2o For all $a, b \in S$ the equation $a \oplus d = b$ has a solution $d \in S$.
R3o $a \oplus b = b \oplus a \quad \forall a, b \in S$. Operation $\oplus$ is commutative.
R4o $a \otimes (b \otimes c) = (a \otimes b) \otimes c \quad \forall a, b, c \in S$. Operation $\otimes$ is associative.
R5o $a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$ and $(b \oplus c) \otimes a = (b \otimes a) \oplus (c \otimes a)$.
These are called the distributive laws.
If the operation $\otimes$ is commutative ($a \otimes b = b \otimes a$) then the ring is said to be a commutative
ring. The first three axioms define, for the operation $\oplus$ and the elements of $S$, an abelian
group. These three axioms alone can be used to show that there exists a unique $\oplus$-identity
$i_0$ and for each $a \in S$ a unique $\oplus$-inverse ${}_{-}a$. Normally an $\oplus$-identity is called a zero and an
$\oplus$-inverse is called an additive inverse (or negative). Note that these two properties, the
existence of a unique $\oplus$-inverse and a unique $\oplus$-identity, together imply R2o. For if $a, b \in S$
and ${}_{-}a$ is the $\oplus$-inverse of $a$ then the equation $a \oplus d = b$ always has a solution:
$${}_{-}a \oplus b = {}_{-}a \oplus (a \oplus d) = ({}_{-}a \oplus a) \oplus d = i_0 \oplus d = d$$
Therefore $d \in S$. Thus the second ring axiom could have been replaced by an equivalent
axiom specifying the existence of a unique $\oplus$-inverse and a unique $\oplus$-identity.
Isomorphisms of Rings
Let $S$ with operations $(\oplus, \otimes)$ and $T$ with operations $(+, \cdot)$ be rings. A map $f: S \to T$ is
called a homomorphism if it preserves the operations in $S$ and $T$. That is if
$$f(a \oplus b) = f(a) + f(b) \qquad f(a \otimes b) = f(a) \cdot f(b)$$
An obvious example of such a map is that between the ring of polynomials $P$ and $L$ (the
ring from which its coefficients are taken), specified by evaluation at a fixed $x \in L$:
$$f: P \to L \qquad f(p) = p(x)$$
which satisfies
$$f(p_1 \oplus p_2) = f(p_1) + f(p_2) \qquad f(p_1 \otimes p_2) = f(p_1) \cdot f(p_2)$$
If the homomorphic map has the further structure that it is both one-to-one and onto
then $f$ is said to be an isomorphism of $S$ onto $T$ and (as with groups) we then write
$S \cong T$. An isomorphic map induces many attributes from $S$ onto $T$. For example if $i_S$ is
an $\otimes$-identity of $S$ then under an isomorphic map $f(i_S)$ is a $\cdot$-identity of $T$. This follows
since if $a \in S$ then
$$a = a \otimes i_S \quad\text{thus}\quad f(a) = f(a \otimes i_S) = f(a) \cdot f(i_S)$$
But since the identity of $T$ is unique, $f(i_S)$ is the $\cdot$-identity of $T$. We conclude that if
$S$ and $T$ are isomorphic then if $S$ contains an $\otimes$-identity then $T$ must contain a $\cdot$-identity.
This result shows immediately that although the ring of integers $I$ and the ring of even
integers $I_e$ are isomorphic as groups (with operations $\oplus$, $+$ respectively), they cannot be
isomorphic as rings since $I_e$ does not have a $\cdot$-identity whereas $I$ does have an $\otimes$-identity.
Because the ring operations $\oplus$, $\otimes$ are preserved it follows that an isomorphism preserves
all the ring axioms so that (apart from notational differences) two isomorphic rings are
essentially identical. One can view an isomorphic map $f: S \to T$ as producing a ring $T$
from a ring $S$ and, as such, certain algebraic properties of $S$ are induced into $T$. As seen
above if $S$ contains an $\otimes$-identity then so must $T$. Similarly it can be shown that if $S$ is
abelian then so is $T$; if $S$ is commutative then so is $T$.
Example Show that the map
$$f: \mathbb{Z}_{(a,b)} \to \mathbb{Z} \qquad f\{(a, b)\} = a - b$$
is an isomorphism.
Solution
$$\begin{aligned}
f\{(a, b) \oplus (c, d)\} &= f\{(a + c, b + d)\} = (a + c) - (b + d) \\
&= (a - b) + (c - d) \\
&= f\{(a, b)\} + f\{(c, d)\}
\end{aligned}$$
$$\begin{aligned}
f\{(a, b) \otimes (c, d)\} &= f\{(ac + bd, ad + bc)\} = (ac + bd) - (ad + bc) \\
&= (a - b)(c - d) \\
&= f\{(a, b)\}\, f\{(c, d)\}
\end{aligned}$$
This proves the map is homomorphic. To show that it is isomorphic we need to demonstrate
that it is also one-to-one and onto or, equivalently, that it is invertible. We show that it is
invertible by exhibiting its inverse explicitly:
$$f^{-}: \mathbb{Z} \to \mathbb{Z}_{(a,b)} \qquad f^{-}[m] = (b + m, b)$$
Here
$$(f^{-} \circ f)\{(c, d)\} = f^{-}[f\{(c, d)\}] = f^{-}[c - d] = (b + c - d, b) = (c, d)$$
and
$$(f \circ f^{-})[m] = f(f^{-}[m]) = f\{(b + m, b)\} = b + m - b = m$$
proving that $f$ is invertible. Thus $\mathbb{Z}_{(a,b)} \cong \mathbb{Z}$. This shows, of course, as we intimated
earlier, in Section 1.2, that the negative integers may be constructed without ever needing
to introduce the negative sign.
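The homomorphism property of the map $(a, b) \mapsto a - b$ can be spot-checked numerically. The sketch below assumes pairs of non-negative integers as representatives, and the function names (`pair_add`, `pair_mul`, `f`, `f_inv`) are hypothetical labels chosen for illustration. The inverse picks the representative $(b + m, b)$ with the smallest admissible $b$.

```python
from itertools import product

def pair_add(p, q):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)

def pair_mul(p, q):
    """(a, b) * (c, d) = (ac + bd, ad + bc)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, a * d + b * c)

def f(p):
    """The map (a, b) -> a - b."""
    a, b = p
    return a - b

def f_inv(m):
    """One representative (b + m, b) of the inverse image of m, with b >= 0 minimal."""
    b = max(0, -m)
    return (b + m, b)

def is_homomorphism(limit=5):
    """Check both ring operations on all pairs with entries below `limit`."""
    pairs = list(product(range(limit), repeat=2))
    return all(
        f(pair_add(p, q)) == f(p) + f(q) and f(pair_mul(p, q)) == f(p) * f(q)
        for p in pairs for q in pairs
    )
```

Note that only non-negative entries are ever stored in a pair; the negative sign appears solely in the image under `f`.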
Integral Domains and Fields
If $a \in S$, a commutative ring, and $a \neq i_0$ and if we can find an element $b \in S$, $b \neq i_0$ such
that
$$a \otimes b = i_0$$
then $a$ is said to be a divisor of zero. For example the product space $\mathbb{Z} \times \mathbb{Z}$ (which is a
commutative ring) contains elements $(1, 0)$ and $(0, 1)$ such that
$$(1, 0) \otimes (0, 1) = (0, 0)$$
that is, each is a divisor of zero. The ring $M_2$ of $2 \times 2$ matrices has divisors of zero since,
for example,
$$\begin{pmatrix} 2 & 1 \\ 4 & 2 \end{pmatrix}\begin{pmatrix} -1 & 0.5 \\ 2 & -1 \end{pmatrix} = \begin{pmatrix} 0 & 0 \\ 0 & 0 \end{pmatrix}$$
For the number systems that we shall be mainly interested in we will wish to consider only
those structures without divisors of zero and to this end we introduce the integral
domain, which is a commutative ring without any divisors of zero. A field has even
greater structure, being a commutative ring $S$ in which every element $a \in S$, $a \neq i_0$, has
an $\otimes$-inverse. It is easily deduced that a field has no divisors of zero; since if $a \in S$, $a \neq i_0$, and
$$a \otimes b = i_0 \quad\text{when}\quad b \neq i_0$$
then because $S$ is a field there exists an element $a^{-}$ (the multiplicative inverse of $a$) such
that $a^{-} \otimes a = i_1$ (the multiplicative unit or $\otimes$-identity) and we have:
$$a^{-} \otimes a \otimes b = a^{-} \otimes i_0 = i_0$$
implying $b = i_0$ (using associativity on the left hand side and then the property of the
identity) which contradicts our initial assumption. Thus every field is an integral domain.
A set for which all the axioms of a field hold except the commutative law of multiplication
is called a division algebra or a skew field.
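The two examples of divisors of zero given above are easy to verify directly. This is a minimal sketch; the helper names (`matmul2`, `pair_mul`) are hypothetical, and exact fractions are used so that the entry 0.5 introduces no rounding.

```python
from fractions import Fraction

def matmul2(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def pair_mul(p, q):
    """Componentwise product in the product ring Z x Z."""
    return (p[0] * q[0], p[1] * q[1])

# The two matrices quoted in the text; 0.5 is stored exactly as 1/2.
A = [[2, 1], [4, 2]]
B = [[-1, Fraction(1, 2)], [2, -1]]
ZERO = [[0, 0], [0, 0]]
```

Neither `A` nor `B` is the zero matrix, yet `matmul2(A, B)` equals `ZERO`; similarly `pair_mul((1, 0), (0, 1))` is `(0, 0)`.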
Order
The concept of order that we are familiar with, from the integers, can be applied to other
rings. A ring $S$ is said to be ordered if there exists a subset $S_p \subset S$ which is called the
set of positive elements of $S$ such that
O1o For each $a, b \in S_p$
$$a \oplus b \in S_p \quad\text{and}\quad a \otimes b \in S_p$$
O2o For each $a \in S$ exactly one of the following alternatives is true:
$$\text{either}\quad a \in S_p \quad\text{or}\quad a = i_0 \quad\text{or}\quad {}_{-}a \in S_p$$
This is called the Trichotomy Law.
Those elements $a \in S$ such that ${}_{-}a \in S_p$ are called negative elements of $S$. The concept
of order also applies to integral domains and to fields, the definition being identical to that
for rings.
We can deduce two fundamental results applicable to all ordered rings. The first states
that if $a \in S_p$ then ${}_{-}a \notin S_p$, since if ${}_{-}a \in S_p$ then $a \oplus {}_{-}a = i_0 \in S_p$ by O1o, which contradicts O2o.
The second fundamental property of all ordered rings is that if $i_0 \neq a \in S$ then $a \otimes a \in S_p$.
The proof is immediate: if $a \in S_p$ then $a \otimes a \in S_p$ by O1o. However if $a \notin S_p$ then
${}_{-}a \in S_p$ and again ${}_{-}a \otimes {}_{-}a = a \otimes a \in S_p$ by O1o. What this states of course is that the square of
any non-zero element of an ordered ring is positive.
The Modulus
Let $S$ be an ordered ring. The modulus of $a \in S$ is written $|a|$ and is defined as:
$$|a| = \begin{cases} a & \text{if } a \in S_p \\ i_0 & \text{if } a = i_0 \\ {}_{-}a & \text{if } {}_{-}a \in S_p \end{cases}$$
Clearly if $a \neq i_0$ then for all $a \in S$, $|a| \in S_p$. The expression $|b \oplus {}_{-}a|$ is called the absolute
difference of $a$ and $b$. The modulus has a number of important properties.
Property (i) $|a| \geq a$
proof This is immediate if the three cases $a > i_0$, $a = i_0$ and $a < i_0$ are considered.
Property (ii) $|a|^2 = a^2$
proof This is clearly true from the definition of $|a|$ (and using ${}_{-}a \otimes {}_{-}a = a \otimes a = a^2$).
Property (iii) $|a \otimes b| = |a| \otimes |b|$
proof We consider the various cases. For example if $a > i_0$ and $b > i_0$ then $a \otimes b > i_0$
and so $|a \otimes b| = a \otimes b$. Also $|a| = a$ and $|b| = b$ implying $|a| \otimes |b| = a \otimes b$. The other
possibilities are treated similarly.
Property (iv) If $|a| > |b|$ then $a^2 > b^2$.
proof Since $|a| > i_0$ and $|b| \geq i_0$ then
$$|a| \otimes |a| > |b| \otimes |a| = |b \otimes a| \qquad |a| \otimes |b| \geq |b| \otimes |b|$$
that is
$$a^2 > |b \otimes a| \quad\text{and}\quad b^2 \leq |a \otimes b|$$
Therefore, since $|b \otimes a| = |a \otimes b|$, we have $a^2 > b^2$.
Property (v) $|a \oplus b| \leq |a| \oplus |b|$
proof
$$(|a| \oplus |b|)^2 = a^2 \oplus 2|a \otimes b| \oplus b^2$$
But $|a \otimes b| \geq a \otimes b$ and so
$$(|a| \oplus |b|)^2 \geq a^2 \oplus (2a \otimes b) \oplus b^2 = (a \oplus b)^2$$
Hence $|a| \oplus |b| \geq |a \oplus b|$.
Property (vi) $|a \oplus {}_{-}b| \geq \big| |a| \oplus {}_{-}|b| \big|$
proof Similar to (v).
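Properties (i) to (vi) can be spot-checked in a concrete ordered field. The sketch below assumes the rational numbers (via Python's `fractions.Fraction`) as the ordered ring and tests the properties on a small sample; the names `modulus` and `modulus_properties_hold` are hypothetical. A finite check of this kind illustrates the statements but is of course no substitute for the proofs.

```python
from fractions import Fraction
from itertools import product

ZERO = Fraction(0)

def modulus(a):
    """Three-case definition of |a| following the trichotomy law."""
    if a > ZERO:
        return a
    if a == ZERO:
        return ZERO
    return -a

def modulus_properties_hold(values):
    """Return True if properties (i)-(vi) hold for every pair in `values`."""
    for a, b in product(values, repeat=2):
        checks = [
            modulus(a) >= a,                                       # (i)
            modulus(a) ** 2 == a ** 2,                             # (ii)
            modulus(a * b) == modulus(a) * modulus(b),             # (iii)
            not modulus(a) > modulus(b) or a ** 2 > b ** 2,        # (iv)
            modulus(a + b) <= modulus(a) + modulus(b),             # (v)
            modulus(a - b) >= modulus(modulus(a) - modulus(b)),    # (vi)
        ]
        if not all(checks):
            return False
    return True

# A small symmetric sample of rationals, including zero.
SAMPLES = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]
```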
Note that with the introduction of the modulus we can introduce the norm and sign of
an element of an ordered ring to tie in with these descriptors introduced earlier in our brief
discussion of scalars. If $a \in S$ then we write $N_a = |a|^2 = a^2$ and the sign of $a$ is:
$$\operatorname{sign} a = \begin{cases} i_1 & \text{if } a > i_0 \\ {}_{-}i_1 & \text{if } a < i_0 \end{cases}$$
Two other results for an ordered field are worthy of note. If $a \in S_p$ then $a^{-} \in S_p$. The
proof is immediate: if $a^{-} \notin S_p$ then $a \otimes a^{-} = i_1 \notin S_p$ which contradicts an earlier result.
Also if $a > i_1$ then $a^{-} < i_1$. To see this we note that since $a > i_1$ then $a \in S_p$ and so
$a^{-} \in S_p$, giving
$$a^{-} \otimes a > a^{-} \otimes i_1 = a^{-}$$
Thus $i_1 > a^{-}$. However, since $a^{-}$ is positive we conclude $i_0 < a^{-} < i_1$.
1.5 Linear Spaces
Let $S$ be an abelian group under the operation $+$ and let $T$ be an ordered commutative
field. We consider a map, called scalar multiplication
$$T \times S \to S \qquad (\alpha, s) \mapsto \alpha s$$
such that for all $s, s' \in S$ and all $\alpha, \alpha' \in T$ it has the properties
Sm1o $\alpha(s + s') = \alpha s + \alpha s'$, $(\alpha + \alpha')s = \alpha s + \alpha' s$
Sm2o $\alpha'(\alpha s) = (\alpha'\alpha)s$
Sm3o $1s = s$ where $1$ is the $\times$ identity in $T$.
An abelian group for which there is a scalar multiplication map as defined here is called a
linear space over $T$, denoted by $S_T$. The elements of $T$ are called scalars.
Linear Dependence
If $S_T$ is a linear space over $T$ and $s_1, s_2, \ldots, s_k \in S$, $\alpha_1, \alpha_2, \ldots, \alpha_k \in T$ then the element
$s \in S$ given by
$$s = \alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_k s_k$$
is called a linear combination of $s_1, s_2, \ldots, s_k$. The space comprising all those elements
formed in this way is itself a linear space over the field $T$. These elements satisfy the axioms of
an abelian group under $+$ and also satisfy the extra constraints required of a linear space.
The elements $s_1, s_2, \ldots, s_k \in S_T$ are said to be linearly independent if and only if
$$\alpha_1 s_1 + \alpha_2 s_2 + \ldots + \alpha_k s_k = 0_S \quad\text{implies}\quad \alpha_1 = \alpha_2 = \ldots = \alpha_k = 0$$
otherwise they are said to be linearly dependent. The space of all linear combinations
of $s_1, s_2, \ldots, s_k$ is said to be spanned by $s_1, s_2, \ldots, s_k$ and we denote this space by
span$\{s_1, s_2, \ldots, s_k\}$. If there exists a finite set of elements $s_1, s_2, \ldots, s_k$ such that
$$S_T = \text{span}\{s_1, s_2, \ldots, s_k\}$$
then $S_T$ is said to be finite dimensional and if $s_1, s_2, \ldots, s_k$ are also linearly independent
then this set of elements is called a basis of $S_T$.
As a simple example of a linear space consider a set of elements of the form $(\alpha_1, \alpha_2, \ldots, \alpha_n)$,
$\alpha_j \in T$ a field, with addition and scalar multiplication defined by
$$(\alpha_1, \alpha_2, \ldots, \alpha_n) + (\alpha'_1, \alpha'_2, \ldots, \alpha'_n) = (\alpha_1 + \alpha'_1, \alpha_2 + \alpha'_2, \ldots, \alpha_n + \alpha'_n)$$
$$\lambda(\alpha_1, \ldots, \alpha_n) = (\lambda\alpha_1, \ldots, \lambda\alpha_n) \qquad \lambda \in T.$$
Equality between elements is defined by:
$$(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\alpha'_1, \alpha'_2, \ldots, \alpha'_n) \;\Rightarrow\; \alpha_1 = \alpha'_1, \ldots, \alpha_n = \alpha'_n$$
It is easy to verify that this is a linear space by checking through the axioms. We denote
this space, which is called the space of $n$-tuples, by $T^n$. A little thought allows the
construction of a basis using the particular elements
$$e_1 = (1, 0, \ldots, 0), \quad e_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 1)$$
or
$$e_i = (\delta_{i1}, \delta_{i2}, \ldots, \delta_{in}) \quad i = 1, 2, \ldots, n \quad\text{where}\quad \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \neq j \end{cases}$$
These $n$ elements are linearly independent. These elements therefore form a basis and $T^n$
is finite dimensional. This basis is called the standard basis for $T^n$.
With a little more work it can be shown that any two bases of $S_T$ contain the same
number of elements, this common number being called the dimension of $S_T$, denoted
by $\dim(S_T)$. For the example of the linear space of $n$-tuples above, $\dim T^n = n$.
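Linear independence of a finite set of $n$-tuples can be tested mechanically: the elements are independent exactly when the rank of the array they form equals the number of elements. The following is a minimal sketch over the rationals using Gaussian elimination; the function names `rank` and `linearly_independent` are hypothetical, and exact `Fraction` arithmetic is an assumption made to avoid rounding issues.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of equal-length tuples, by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0  # number of pivot rows found so far
    for col in range(ncols):
        # Look for a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # Clear this column in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def linearly_independent(vectors):
    """True if only the trivial linear combination gives the zero element."""
    return rank(vectors) == len(vectors)

def standard_basis(n):
    """The elements e_1, ..., e_n of the space of n-tuples."""
    return [tuple(1 if i == k else 0 for i in range(n)) for k in range(n)]
```

For instance, `linearly_independent(standard_basis(4))` holds, whereas the pair `(1, 2)`, `(2, 4)` is dependent.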
Linear Maps
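Let $S_T$ and $V_T$ be linear spaces defined over the same field $T$. Let $\phi$ be the map from $S_T$
to $V_T$, $\phi: S_T \to V_T$, such that
$$\phi(s_1 + s_2) = \phi(s_1) + \phi(s_2) \quad\text{and}\quad \phi(\alpha s) = \alpha\phi(s)$$
for all $s_1, s_2, s \in S_T$ and all $\alpha \in T$; then $\phi$ is called a linear transformation from $S_T$
to $V_T$. (Sometimes a linear transformation is called a homomorphism.) We now consider
combining linear maps. Let $S_T$ be a linear space over a field $T$ and let $\phi$, $\theta$ be two linear
transformations from $S_T$ to itself. We now define the map $\psi_\oplus: S_T \to S_T$ such that for
any $s \in S_T$, $\psi_\oplus(s) = \phi(s) + \theta(s)$. Then $\psi_\oplus$ is a linear map:
$$\begin{aligned}
\psi_\oplus(\alpha s) &= \phi(\alpha s) + \theta(\alpha s) \\
&= \alpha[\phi(s) + \theta(s)] \\
&= \alpha\psi_\oplus(s)
\end{aligned}$$
$$\begin{aligned}
\psi_\oplus(s_1 + s_2) &= \phi(s_1 + s_2) + \theta(s_1 + s_2) \\
&= \phi(s_1) + \phi(s_2) + \theta(s_1) + \theta(s_2) \\
&= \psi_\oplus(s_1) + \psi_\oplus(s_2)
\end{aligned}$$
The map $\psi_\oplus$ is called the sum of the linear transformations $\phi$ and $\theta$ and denoted by $\phi + \theta$.
We can also define the composition map $\psi_\otimes: S_T \to S_T$ by applying $\theta$, $\phi$ in succession:
$\psi_\otimes(s) = \phi(\theta(s))$, $s \in S_T$. Again $\psi_\otimes$ is a linear map.
$$\psi_\otimes(\alpha s) = \phi(\theta(\alpha s)) = \phi(\alpha\theta(s)) = \alpha\phi(\theta(s)) = \alpha\psi_\otimes(s)$$
$$\begin{aligned}
\psi_\otimes(s_1 + s_2) &= \phi(\theta(s_1 + s_2)) \\
&= \phi[\theta(s_1) + \theta(s_2)] \\
&= \psi_\otimes(s_1) + \psi_\otimes(s_2)
\end{aligned}$$
The map $\psi_\otimes$ is called the product of the linear transformations $\phi$ and $\theta$ and denoted by
$\phi \circ \theta$. Generally the map $\phi \circ \theta$ is not the same as $\theta \circ \phi$.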
Matrix Representation of a Linear Map
Let $S_T$ be a linear space such that $S_T = \text{span}\{s_1, s_2, \ldots, s_q\}$ with dimension $q$ and let
$V_T$ be a subspace of $S_T$ spanned by $n$ elements, $V_T = \text{span}\{v_1, v_2, \ldots, v_n\}$, $n \leq q$. We
shall now consider linear maps from $V_T$ onto itself: $\phi: V_T \to V_T$. The image of each basis
element of $V_T$ is an element of $V_T$ and so
$$\phi(v_k) = a_{1k}v_1 + a_{2k}v_2 + \ldots + a_{nk}v_n = \sum_{i=1}^{n} a_{ik}v_i \qquad a_{ik} \in T$$
A general element of $V_T$ can be expressed as a linear
combination of basis elements: $v = x_1 v_1 + x_2 v_2 + \ldots + x_n v_n$, $x_i \in T$. As $v_1, \ldots, v_n$ is a
basis for $V_T$ we call the uniquely determined coefficients $x_i$, $i = 1, \ldots, n$, the components
of $v$ with respect to the basis $v_i$, $i = 1, \ldots, n$. The image of $v$ under the map $\phi$ is $\phi(v)$ and
is an element of $V_T$ and so can be expressed as a linear combination of the basis vectors:
$$\phi(v) = y_1 v_1 + y_2 v_2 + \ldots + y_n v_n = \sum_{j=1}^{n} y_j v_j \qquad y_j \in T$$
But
$$\begin{aligned}
\phi(v) &= x_1\phi(v_1) + x_2\phi(v_2) + \ldots + x_n\phi(v_n) \\
&= \sum_{i=1}^{n} x_i \phi(v_i) \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ji} v_j x_i
\end{aligned}$$
That is,
$$\sum_{j=1}^{n} y_j v_j = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ji} v_j x_i \quad\text{or}\quad \sum_{j=1}^{n}\left( y_j - \sum_{i=1}^{n} a_{ji} x_i \right) v_j = 0$$
But since the $v_j$ are linearly independent it follows immediately that
$$y_j = \sum_{i=1}^{n} a_{ji} x_i \qquad j = 1, 2, \ldots, n$$
So if we know the effect of the map $\phi$ on the basis vectors $v_i$ (i.e. we know the scalars
$a_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$) then we can calculate the components $y_i$, $i = 1, \ldots, n$, of $\phi(v)$
from the components $x_i$, $i = 1, \ldots, n$, of $v$. The set of scalars $a_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$,
characterise the linear map $\phi$. We can conveniently write this set of scalars as an $n \times n$
array, and for convenience refer to it by a single capital letter:
$$A = \begin{pmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{pmatrix}$$
where the $k$th column of $A$ holds the components of $\phi(v_k)$ with respect to the basis
$\{v_1, v_2, \ldots, v_n\}$. The array $A$ can be shown to be a matrix. In fact, with respect to a
fixed basis the correspondence between linear maps and matrices is one-to-one, preserving
the operations of addition and multiplication (suitably interpreted). If we denote this
correspondence by $\sim$ then if $\phi \sim A$ and $\theta \sim B$ then it is easily verified that
$$\alpha\phi \sim \alpha A, \qquad \phi + \theta \sim A + B, \qquad \phi \circ \theta \sim AB.$$
Clearly this correspondence is an isomorphism between the collection of all linear
transformations $\phi: S_T \to S_T$ and their associated matrices.
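The construction of the matrix from the images of the basis elements, and the correspondence between composition and matrix product, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the standard basis of the space of 2-tuples is used, the two sample maps `phi` and `theta` are invented, and the helper names (`matrix_of`, `apply_matrix`, `matmul`) are hypothetical.

```python
def matrix_of(phi, n):
    """Matrix whose kth column holds the components of phi(e_k)."""
    images = [phi(tuple(1 if i == k else 0 for i in range(n))) for k in range(n)]
    return [[images[k][j] for k in range(n)] for j in range(n)]

def apply_matrix(A, x):
    """Components y_j = sum_i a_ji x_i of the image of x."""
    return tuple(sum(A[j][i] * x[i] for i in range(len(x))) for j in range(len(A)))

def matmul(A, B):
    """Ordinary matrix product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def phi(v):
    """Sample linear map (x, y) -> (x + 2y, 3y); an invented example."""
    x, y = v
    return (x + 2 * y, 3 * y)

def theta(v):
    """Sample linear map (x, y) -> (y, x); an invented example."""
    x, y = v
    return (y, x)

def phi_after_theta(v):
    """The product map: apply theta first, then phi."""
    return phi(theta(v))
```

With these sample maps, `matrix_of(phi_after_theta, 2)` coincides with `matmul(matrix_of(phi, 2), matrix_of(theta, 2))`, while the product taken in the other order gives a different matrix.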
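Example Let $\phi_\theta$ be a linear transformation from $T^2$ into $T^2$ defined by
$$\phi_\theta(e_1) = (\cos\theta, \sin\theta) \qquad \phi_\theta(e_2) = (-\sin\theta, \cos\theta)$$
Show that the map $\phi_\theta \circ \phi_\lambda = \phi_{\theta+\lambda}$.
Solution The matrix representations of $\phi_\theta$ and $\phi_\lambda$ are, respectively:
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \qquad B = \begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix}$$
Thus the matrix representation of $\phi_\theta \circ \phi_\lambda$ is
$$\begin{aligned}
AB &= \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix}\begin{pmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{pmatrix} \\
&= \begin{pmatrix} \cos\theta\cos\lambda - \sin\theta\sin\lambda & -\cos\theta\sin\lambda - \sin\theta\cos\lambda \\ \sin\theta\cos\lambda + \cos\theta\sin\lambda & -\sin\theta\sin\lambda + \cos\theta\cos\lambda \end{pmatrix} \\
&= \begin{pmatrix} \cos(\theta + \lambda) & -\sin(\theta + \lambda) \\ \sin(\theta + \lambda) & \cos(\theta + \lambda) \end{pmatrix}
\end{aligned}$$
and, using the correspondence between maps and matrices we see that
$$AB \sim \phi_{\theta+\lambda}$$
and so $\phi_\theta \circ \phi_\lambda = \phi_{\theta+\lambda}$. Note that in this case the order of the maps is not significant.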
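The identity for the composition of two plane rotations can also be confirmed numerically. This is a rough sketch in floating point, so the comparison uses a tolerance; the names `rotation`, `matmul2` and `close` are hypothetical.

```python
import math

def rotation(theta):
    """Matrix of the map sending e_1 to (cos t, sin t) and e_2 to (-sin t, cos t)."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul2(A, B):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    """Entrywise comparison up to a small absolute tolerance."""
    return all(abs(A[i][j] - B[i][j]) <= tol for i in range(2) for j in range(2))
```

For sample angles such as 0.3 and 1.1 radians, `matmul2(rotation(0.3), rotation(1.1))` agrees with `rotation(1.4)` to within the tolerance, and the product taken in the opposite order gives the same matrix.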
Change of basis
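Let $\{r_1, \ldots, r_n\}$, $\{v_1, \ldots, v_n\}$ be bases in $\mathbb{R}^n$. If $x \in \mathbb{R}^n$ then
$$x = \alpha_1 r_1 + \ldots + \alpha_n r_n \quad\text{and}\quad x = \beta_1 v_1 + \ldots + \beta_n v_n$$
How are the coordinates $\beta_i$ related to the coordinates $\alpha_i$? Now, clearly, since $\{v_1, \ldots, v_n\}$
is a basis, each $r_i$ can be written $r_i = \sum_{j=1}^{n} \gamma_{ij} v_j$, so that
$$x = \sum_{i=1}^{n} \alpha_i r_i = \sum_{i=1}^{n} \alpha_i \left( \sum_{j=1}^{n} \gamma_{ij} v_j \right)$$
and
$$\sum_{j=1}^{n} \beta_j v_j = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i \gamma_{ij} v_j \quad\text{implying}\quad \sum_{j=1}^{n}\left( \beta_j - \sum_{i=1}^{n} \alpha_i \gamma_{ij} \right) v_j = 0$$
from which (since the $v_j$ are linearly independent) we deduce: $\beta_j = \sum_{i=1}^{n} \alpha_i \gamma_{ij}$. That is, in
matrix notation,
$$\begin{pmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{pmatrix} = \begin{pmatrix} \gamma_{11} & \gamma_{21} & \cdots & \gamma_{n1} \\ \gamma_{12} & \gamma_{22} & \cdots & \gamma_{n2} \\ \vdots & \vdots & & \vdots \\ \gamma_{1n} & \gamma_{2n} & \cdots & \gamma_{nn} \end{pmatrix}\begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{pmatrix}$$
in which the $i$th column of the $n \times n$ matrix holds the coordinates of $r_i$ with respect to the
basis $\{v_1, v_2, \ldots, v_n\}$. Formally we write
$$[X]_v = [T]_{r \to v}[X]_r$$
$[T]_{r \to v}$ is called the change of basis matrix from $r$ to $v$.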
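Example In $\mathbb{R}^2$ we have two bases Sp$\{r_1, r_2\}$, Sp$\{v_1, v_2\}$ where, with respect to the
standard basis:
$$r_1 = (7, -6), \quad r_2 = (6, 7), \quad v_1 = (1, 4), \quad v_2 = (3, -5)$$
Find the change of basis matrix $[T]_{r \to v}$.
Solution
$$r_1 = \alpha_1 v_1 + \alpha_2 v_2 \qquad r_2 = \beta_1 v_1 + \beta_2 v_2$$
i.e.
$$7 = \alpha_1 + 3\alpha_2 \qquad 6 = \beta_1 + 3\beta_2$$
$$-6 = 4\alpha_1 - 5\alpha_2 \qquad 7 = 4\beta_1 - 5\beta_2$$
These equations have solution
$$\alpha_1 = 1, \quad \alpha_2 = 2, \quad \beta_1 = 3, \quad \beta_2 = 1$$
Therefore
$$[T]_{r \to v} = \begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}$$
Thus, for example, an element $r = 2r_1 - 3r_2$ ($= (-4, -33)$ in standard basis) which
has coordinates $(2, -3)$ with respect to the $r$-basis would have coordinates
$$\begin{pmatrix} 1 & 3 \\ 2 & 1 \end{pmatrix}\begin{pmatrix} 2 \\ -3 \end{pmatrix} = \begin{pmatrix} -7 \\ 1 \end{pmatrix}$$
with respect to the $v$-basis. That is, $r = -7v_1 + v_2$ ($= (-4, -33)$ in standard basis).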
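The numbers in this example can be reproduced with a few lines of exact arithmetic. The sketch below is restricted to two dimensions and solves each pair of equations by Cramer's rule; the function names (`coords_in_basis`, `change_of_basis`, `apply_matrix`) are hypothetical.

```python
from fractions import Fraction

def coords_in_basis(x, v1, v2):
    """Solve a*v1 + b*v2 = x for (a, b) by Cramer's rule (2-D only)."""
    det = v1[0] * v2[1] - v2[0] * v1[1]
    if det == 0:
        raise ValueError("v1, v2 do not form a basis")
    a = Fraction(x[0] * v2[1] - v2[0] * x[1], det)
    b = Fraction(v1[0] * x[1] - x[0] * v1[1], det)
    return (a, b)

def change_of_basis(r_basis, v_basis):
    """Matrix whose ith column holds the coordinates of r_i in the v-basis."""
    cols = [coords_in_basis(r, *v_basis) for r in r_basis]
    return [[cols[0][0], cols[1][0]],
            [cols[0][1], cols[1][1]]]

def apply_matrix(T, x):
    """Multiply the 2 x 2 matrix T by the column of coordinates x."""
    return (T[0][0] * x[0] + T[0][1] * x[1],
            T[1][0] * x[0] + T[1][1] * x[1])

# Bases from the worked example, in standard coordinates.
R_BASIS = [(7, -6), (6, 7)]
V_BASIS = [(1, 4), (3, -5)]
```

Here `change_of_basis(R_BASIS, V_BASIS)` gives the matrix with rows (1, 3) and (2, 1), and applying it to the coordinates (2, -3) yields (-7, 1), matching the values quoted above.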
Orientation
In this section we consider the commutative field $T$ to be the field of real numbers. Let
$\{r_1, r_2, \ldots, r_n\}$ and $\{v_1, v_2, \ldots, v_n\}$ be two bases for $\mathbb{R}^n$. There is a linear map $\phi$ which
will transform one basis into the other. In terms of matrices, if $x \in \mathbb{R}^n$
$$[X]_v = [T]_{r \to v}[X]_r$$
where $[T]_{r \to v}$ is the change of basis matrix. If $\det[T]_{r \to v} > 0$ then we write
$$\text{span}\{r_1, r_2, \ldots, r_n\} \sim \text{span}\{v_1, v_2, \ldots, v_n\}.$$
Since any non-zero determinant is either positive or negative the relation $\sim$ divides all the
bases of $\mathbb{R}^n$ into just two sets: those related to the standard basis span$\{e_1, \ldots, e_n\}$ via a
change of basis matrix for which $\det[T]_{e \to r} > 0$ and those for which $\det[T]_{e \to r} < 0$. If
$$\mathbb{R}^n = \text{span}\{r_1, r_2, \ldots, r_n\} \qquad \det[T]_{e \to r} > 0$$
then $\mathbb{R}^n$ is said to be positively oriented. If
$$\mathbb{R}^n = \text{span}\{r_1, r_2, \ldots, r_n\} \qquad \det[T]_{e \to r} < 0$$
then $\mathbb{R}^n$ is said to be negatively oriented.
The real line R is oriented. The standard basis is the single element e = 1. If we consider
a segment of the real line then this is positively oriented if the direction we associate
with it points to the right and negatively oriented if the associated direction points to the
left. Changing from one orientation into another is effected by multiplying by a negative
number.
In $\mathbb{R}^2$ orientation corresponds to the direction of rotation; conventionally, 'anti-clockwise'
indicates positive orientation and 'clockwise' indicates negative orientation. Here,
a linear transformation which reverses orientation, from clockwise to anti-clockwise or vice
versa, includes a reflection whilst one that preserves orientation is a rotation.
In $\mathbb{R}^3$ orientation is conventionally referred to as being either right-handed or left-handed.
See Figure 1.7.
Figure 1.7 (panels (a) and (b))
The standard basis vectors $e_1, e_2, e_3$ in (a) form a right-handed set. As $e_1$ rotates towards $e_2$,
through 90°, a screw with a right-handed thread (most screws are of this type) aligned
normal to the plane of $e_1 e_2$ would move in the direction of $e_3$. On the other hand in (b) the
basis $b_1, b_2, b_3$ is a left-handed set. A screw with a right-handed thread would,
as we turn it from $b_1$ to $b_2$, move in the direction of $-b_3$. An alternative, more natural
description of orientation in $\mathbb{R}^3$ is the following: align the thumb of your right hand along
$e_1$ with your first finger pointed along $e_2$; then the remaining fingers naturally point in
the direction of $e_3$. The digits on your left hand will align naturally according to Figure
1.7(b). These informal considerations match our algebraic definition since, in this case,
the transformation from one basis set to the other is given by $b_1 = e_1$, $b_2 = e_2$, $b_3 = -e_3$
leading to a change of basis matrix:
$$[T]_{e \to b} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix}$$
and $\det[T]_{e \to b} = -1$ as expected.
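The algebraic test for orientation reduces to the sign of a determinant, which is straightforward to compute. The sketch below makes one simplifying assumption that should be stated plainly: instead of the change of basis matrix itself it uses the matrix whose columns are the basis vectors in standard coordinates. That matrix is the inverse of the change of basis matrix from the standard basis, and a matrix and its inverse have determinants of the same sign. The function names are hypothetical.

```python
def det(M):
    """Determinant by Laplace expansion along the first row."""
    if len(M) == 1:
        return M[0][0]
    total = 0
    for j, entry in enumerate(M[0]):
        # Minor obtained by deleting row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def orientation(basis):
    """'positive' or 'negative', from the matrix with the basis vectors as columns."""
    n = len(basis)
    matrix = [[basis[k][i] for k in range(n)] for i in range(n)]
    d = det(matrix)
    if d == 0:
        raise ValueError("the given elements are linearly dependent")
    return "positive" if d > 0 else "negative"

# Standard basis and the left-handed set b1 = e1, b2 = e2, b3 = -e3.
E_BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
B_BASIS = [(1, 0, 0), (0, 1, 0), (0, 0, -1)]
```

Under this convention `orientation(E_BASIS)` is `'positive'` and `orientation(B_BASIS)` is `'negative'`.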
1.6 Inner Product Spaces
Let $t = (t_1, t_2, \ldots, t_n)$ and $u = (u_1, u_2, \ldots, u_n)$ be two elements of $T^n$ over the ordered
commutative field $T$ with respect to the standard basis $T^n = \text{span}\{e_1, e_2, \ldots, e_n\}$. The
norm $N_t$ of the element $t$ is defined by:
$$N_t = t_1^2 + t_2^2 + \ldots + t_n^2 \qquad N_t \in T$$
The element $t$ is called a unit element if $N_t = 1$. The scalar product (sometimes called
the dot product) of $t$ with $u$ is written $t \cdot u$ and defined as
$$t \cdot u = t_1 u_1 + t_2 u_2 + \ldots + t_n u_n$$
We note that $N_t = t \cdot t$. The geometrical interpretation of the norm and the scalar product
is facilitated by considering $\mathbb{R}^3$ which is the familiar linear space of our three dimensional
world. It easily follows from the definition of the scalar product that if $t, u, v, w \in T^n$,
$\alpha, \beta, \gamma, \delta \in T$ then
$$(\alpha t + \beta u) \cdot (\gamma v + \delta w) = \alpha\gamma\, t \cdot v + \alpha\delta\, t \cdot w + \beta\gamma\, u \cdot v + \beta\delta\, u \cdot w$$
Now since $T$ is an ordered field then for every $t \in T^n$, $N_t \geq 0$, only vanishing if $t = 0$. We
can derive a useful inequality from this basic result.
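Theorem 1.3 The Schwarz inequality. For every $t, u \in T^n$
$$N_t N_u \geq (u \cdot t)^2$$
Proof To show this we note the obvious statement that if $u, t \in T^n$, $\alpha, \beta \in T$
$$N_{\alpha u + \beta t} \geq 0 \quad\text{that is}\quad (\alpha u + \beta t) \cdot (\alpha u + \beta t) \geq 0$$
or, expanding:
$$\alpha^2\, u \cdot u + 2\alpha\beta\, u \cdot t + \beta^2\, t \cdot t \geq 0$$
If, in particular, we choose $\alpha = t \cdot t$ and $\beta = -u \cdot t$ then
$$N_t^2 N_u - 2N_t(u \cdot t)^2 + N_t(u \cdot t)^2 \geq 0 \quad\text{or}\quad N_t\{N_t N_u - (u \cdot t)^2\} \geq 0$$
Now since $N_t > 0$ ($N_t = 0$ if and only if $t = 0$, in which case $N_u N_t = (u \cdot t)^2$) we deduce
$$N_t N_u \geq (u \cdot t)^2$$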
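The Schwarz inequality lends itself to a quick numerical sanity check. The sketch below works with integer tuples so that all arithmetic is exact; the names `dot`, `norm` and `schwarz_gap` are hypothetical, and `norm` follows the convention of the text in returning the sum of squares rather than its square root.

```python
from itertools import product

def dot(t, u):
    """Scalar product t . u of two tuples of equal length."""
    return sum(a * b for a, b in zip(t, u))

def norm(t):
    """Norm N_t = t . t (a sum of squares, with no square root taken)."""
    return dot(t, t)

def schwarz_gap(t, u):
    """N_t N_u - (u . t)^2, which the Schwarz inequality says is never negative."""
    return norm(t) * norm(u) - dot(u, t) ** 2

def schwarz_holds_on_grid(limit=2, n=3):
    """Check the inequality for all integer n-tuples with entries in [-limit, limit]."""
    grid = list(product(range(-limit, limit + 1), repeat=n))
    return all(schwarz_gap(t, u) >= 0 for t in grid for u in grid)
```

The gap vanishes when one element is a scalar multiple of the other, for example for `(1, 2, 3)` and `(2, 4, 6)`.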
Continuing our discussion of the scalar product we note that in $\mathbb{R}^3$ with the usual
geometrical interpretation, the three standard basis elements $e_1: (1, 0, 0)$, $e_2: (0, 1, 0)$, $e_3: (0, 0, 1)$
point along three mutually perpendicular axes. See Figure 1.8. Then two elements
$t, u \in \mathbb{R}^3$ are lines of length $\sqrt{N_t}$, $\sqrt{N_u}$ respectively, pointing out of the origin. The line
joining the end-points of $t$, $u$ is represented by $u - t$.
Figure 1.8
Now, for any triangle, the cosine rule states
$$N_u + N_t = N_{u-t} + 2\sqrt{N_u}\sqrt{N_t}\cos\theta$$
But $N_u = u_1^2 + u_2^2 + u_3^2$, $N_t = t_1^2 + t_2^2 + t_3^2$ and so
$$N_{u-t} = (u_1 - t_1)^2 + (u_2 - t_2)^2 + (u_3 - t_3)^2$$
implying
$$\cos\theta = \frac{u \cdot t}{\sqrt{N_u}\sqrt{N_t}}$$
We generalise the concept of angle between elements $u, t \in \mathbb{R}^3$ to define the angle between
elements $u, t \in T^n$, a linear space over an ordered commutative field, to be such that
$$\cos\theta = \frac{u \cdot t}{\sqrt{N_u}\sqrt{N_t}}$$
Of course this interpretation is only sensible as long as $-1 \leq \cos\theta \leq 1$ which immediately
follows from the Schwarz inequality derived above. Continuing with this approach it is
natural to say that two elements $u, t \in T^n$ are orthogonal if and only if $u \cdot t = 0$.
Orthonormal Bases
We say a set of non-zero elements $\{t_1, t_2, \ldots, t_p\} \in T^n$ is orthogonal if the elements $t_i$, $i = 1, \ldots, p$, are
mutually orthogonal. That is, if
$$t_i \cdot t_j = 0 \qquad i \neq j$$
It is easy to show that any set of mutually orthogonal elements is linearly independent.
Thus if $\{t_1, t_2, \ldots, t_p\}$ is such a set then consider
$$\alpha_1 t_1 + \alpha_2 t_2 + \ldots + \alpha_p t_p = 0$$
If we take the scalar product of both sides with $t_k$, $1 \leq k \leq p$, then using the orthogonality
property we have
$$\alpha_k\, t_k \cdot t_k = 0 \cdot t_k = 0$$
and so, since $t_k \cdot t_k > 0$, $\alpha_k = 0$, $k = 1, 2, \ldots, p$, implying that the set $\{t_1, t_2, \ldots, t_p\}$ is linearly independent.
It follows immediately that if $\{t_1, t_2, \ldots, t_n\}$ is a mutually orthogonal set of elements of
$T^n$ then
$$T^n = \text{Sp}\{t_1, t_2, \ldots, t_n\}$$
Such a set is called an orthonormal set if
$$t_i \cdot t_i = 1 \qquad i = 1, 2, \ldots, n.$$
The simplest example of an orthonormal set in $T^n$ is the standard basis.
Inner Products
The scalar product, introduced above, is a rule which associates with each pair of elements
$t, u \in T^n$ an element $t \cdot u$ of $T$ with the properties
(i) $t \cdot u = u \cdot t$
(ii) $t \cdot (u + v) = t \cdot u + t \cdot v$
(iii) $\alpha(t \cdot u) = (\alpha t) \cdot u = t \cdot (\alpha u)$, $\alpha \in T$
(iv) $t \cdot t \geq 0$, $t \cdot t = 0$ iff $t = 0$.
As we have seen, for $T^n$ the scalar product takes the form $t \cdot u = t_1 u_1 + t_2 u_2 + \ldots + t_n u_n$.
However, there are many linear spaces on which a 'scalar product' can be defined satisfying
the properties (i) - (iv) above. We are led, naturally, to define so-called 'inner-product'
spaces which are linear spaces $S_T$ in which $T$ is an ordered commutative field and on which
is defined an inner product that associates an element of $T$ with each ordered pair of
elements $t, u \in S_T$. The inner product is denoted by $\langle t, u \rangle$ and satisfies:
IP1o $\langle t, u \rangle = \langle u, t \rangle$
IP2o $\langle t, u + v \rangle = \langle t, u \rangle + \langle t, v \rangle$
IP3o $\alpha\langle t, u \rangle = \langle \alpha t, u \rangle = \langle t, \alpha u \rangle$
IP4o $\langle t, t \rangle \geq 0$ and $\langle t, t \rangle = 0$ iff $t = 0$
In any inner product space, norm ('length') and angle (and hence perpendicularity) may
be defined in exactly the same way as for $T^n$. Also the Schwarz inequality
$$\langle t, t \rangle \langle u, u \rangle \geq \langle t, u \rangle^2$$
holds in any inner product space.
Orthogonal Matrices
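Transformations from $\mathbb{R}^2 \to \mathbb{R}^2$ or from $\mathbb{R}^3 \to \mathbb{R}^3$ which preserve the scalar product (and
so these are transformations which preserve length and angle) are called orthogonal. We
extend this concept to inner product spaces. Let $S_T$, $U_T$ be inner product spaces defined
over an ordered commutative field $T$. A linear transformation $\phi$:
$$\phi: S_T \to U_T \quad\text{such that}\quad \langle \phi(u), \phi(v) \rangle = \langle u, v \rangle$$
is called orthogonal. Clearly such a map preserves the norm and the angle:
$$N_{\phi(u)} = \langle \phi(u), \phi(u) \rangle = \langle u, u \rangle = N_u$$
$$\cos\theta' = \frac{\langle \phi(u), \phi(v) \rangle}{\sqrt{N_{\phi(u)}}\sqrt{N_{\phi(v)}}} = \frac{\langle u, v \rangle}{\sqrt{N_u}\sqrt{N_v}} = \cos\theta.$$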
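It immediately follows that if $(u_1, u_2, \ldots, u_k)$ is a set of orthonormal elements in $S_T$ then
the set $(\phi(u_1), \phi(u_2), \ldots, \phi(u_k))$ is an orthonormal set in $U_T$.
Theorem 1.4 A linear map $\phi: S_T \to U_T$ is orthogonal if and only if $\phi$ preserves the
norm: $N_{\phi(u)} = N_u$.
Proof If $\phi$ is orthogonal then $N_{\phi(u)} = \langle \phi(u), \phi(u) \rangle = \langle u, u \rangle = N_u$. Also the
inner product can be written in terms of the norm:
$$\begin{aligned}
N_{u+v} - N_u - N_v &= \langle u + v, u + v \rangle - \langle u, u \rangle - \langle v, v \rangle \\
&= \langle u + v, u \rangle + \langle u + v, v \rangle - \langle u, u \rangle - \langle v, v \rangle \\
&= \langle u, u \rangle + \langle v, u \rangle + \langle u, v \rangle + \langle v, v \rangle - \langle u, u \rangle - \langle v, v \rangle \\
&= \langle v, u \rangle + \langle u, v \rangle = 2\langle u, v \rangle
\end{aligned}$$
Thus, if $\phi$ preserves the norm,
$$\begin{aligned}
2\langle \phi(u), \phi(v) \rangle &= N_{\phi(u)+\phi(v)} - N_{\phi(u)} - N_{\phi(v)} \\
&= N_{\phi(u+v)} - N_{\phi(u)} - N_{\phi(v)} \\
&= N_{u+v} - N_u - N_v \\
&= 2\langle u, v \rangle
\end{aligned}$$
and so $\phi$ is orthogonal.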
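The identity used in this proof, which recovers the inner product from the norm alone, can be checked with exact arithmetic. This is a minimal sketch assuming the standard scalar product on tuples of rationals; the function names are hypothetical.

```python
from fractions import Fraction

def inner(u, v):
    """Standard inner product on tuples of equal length."""
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    """Norm N_u = <u, u> (a sum of squares, with no square root taken)."""
    return inner(u, u)

def add(u, v):
    """Componentwise sum u + v."""
    return tuple(a + b for a, b in zip(u, v))

def inner_from_norm(u, v):
    """Recover <u, v> from norms only: (N_{u+v} - N_u - N_v) / 2."""
    return Fraction(norm(add(u, v)) - norm(u) - norm(v), 2)
```

For any pair of integer tuples, `inner_from_norm(u, v)` should coincide with `inner(u, v)`, which is why a norm-preserving linear map must also preserve the inner product.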
Theorem 1.5 Let $S_T$, $U_T$ be inner product spaces and let $s_1, s_2, \ldots, s_n$ be an
orthonormal basis for $S_T$; then the linear map $\phi: S_T \to U_T$ is orthogonal if and only if
the set $\phi(s_1), \phi(s_2), \ldots, \phi(s_n)$ is orthonormal in $U_T$.
Proof If $\phi$ is orthogonal then if $i \neq j$
$$\langle \phi(s_i), \phi(s_j) \rangle = \langle s_i, s_j \rangle = 0$$
whilst
$$N_{\phi(s_i)} = \langle \phi(s_i), \phi(s_i) \rangle = \langle s_i, s_i \rangle = 1$$
Thus $\phi(s_1), \phi(s_2), \ldots, \phi(s_n)$ is an orthonormal set. Conversely, suppose this set is orthonormal and let $u$ be any element of $S_T$; then
$$u = \alpha_1 s_1 + \alpha_2 s_2 + \ldots + \alpha_n s_n = \sum_{i=1}^{n} \alpha_i s_i$$
and
$$\phi(u) = \alpha_1\phi(s_1) + \alpha_2\phi(s_2) + \ldots + \alpha_n\phi(s_n)$$
Now
$$N_{\phi(u)} = \langle \phi(u), \phi(u) \rangle = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\alpha_j \langle \phi(s_i), \phi(s_j) \rangle = \sum_{i=1}^{n} \alpha_i^2 = N_u$$
since $\phi(s_i)$, $i = 1, 2, \ldots, n$, is an orthonormal set. Since the norm is preserved $\phi$ is an
orthogonal map. An obvious consequence of this result is that two inner product spaces
$S_T$, $U_T$ are isomorphic if and only if $\dim S_T = \dim U_T$. An $n \times n$ matrix is said to
be orthogonal if its columns, when considered with respect to an orthonormal basis, are
mutually orthogonal each with unit norm (with respect to the standard inner product).
Let
$$S_T = \text{span}\{e_1, e_2, \ldots, e_n\} \quad\text{in which}\quad \langle e_i, e_j \rangle = \delta_{ij}.$$
Now if we consider an orthogonal map $\phi: S_T \to S_T$ then $\phi(e_j) = \sum_{i=1}^{n} a_{ij} e_i$ where $A$,
with $j$th column $a_{ij}$, is the matrix representing the orthogonal map.
Now $\phi(e_j)$, $j = 1, 2, \ldots, n$, are mutually orthogonal:
$$\langle \phi(e_k), \phi(e_j) \rangle = \left\langle \sum_{i=1}^{n} a_{ik} e_i, \sum_{p=1}^{n} a_{pj} e_p \right\rangle = \sum_{i=1}^{n} a_{ik} a_{ij} = \delta_{kj}$$
This is matrix multiplication between two matrices: $A$ with $j$th column $a_{ij}$ and $B$ with $i$th
column $a_{ik}$. That is, the rows of the $B$ matrix are precisely the columns of the $A$ matrix.
Hence $B = A^T$. Hence the basic characteristic of an orthogonal map is that its matrix
representation $A$ satisfies
$$AA^T = I$$
or equivalently $A^{-1} = A^T$.
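Example Show that the matrix representation of any orthogonal map $\phi$ from $\mathbb{R}^2 \to \mathbb{R}^2$
takes one of only two forms,
$$\begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$$
in which the standard inner product is used. Describe the geometrical effect of such
transformations.
Solution Using the standard basis in $\mathbb{R}^2$
$$\mathbb{R}^2 = \text{Sp}\{(1, 0), (0, 1)\}$$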
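Let
$$\phi\{(1, 0)\} = \alpha_1(1, 0) + \alpha_2(0, 1) \qquad \phi\{(0, 1)\} = \beta_1(1, 0) + \beta_2(0, 1)$$
However, since $\phi$ is orthogonal it preserves norm:
$$N_{(1,0)} = \langle (1, 0), (1, 0) \rangle = 1$$
$$N_{\phi\{(1,0)\}} = \langle \alpha_1(1, 0) + \alpha_2(0, 1),\ \alpha_1(1, 0) + \alpha_2(0, 1) \rangle = \alpha_1^2 + \alpha_2^2$$
therefore
$$\alpha_1^2 + \alpha_2^2 = 1 \qquad\text{Similarly}\qquad \beta_1^2 + \beta_2^2 = 1$$
Also
$$\langle \phi\{(1, 0)\}, \phi\{(0, 1)\} \rangle = \langle (1, 0), (0, 1) \rangle = 0$$
leading to
$$\langle \alpha_1(1, 0) + \alpha_2(0, 1),\ \beta_1(1, 0) + \beta_2(0, 1) \rangle = 0$$
$$\alpha_1\beta_1 + \alpha_2\beta_2 = 0$$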
If we choose $\alpha_1 = \cos\theta$ then $\alpha_2 = \pm\sin\theta$. Also choosing $\beta_1 = \cos\lambda$ then $\beta_2 = \pm\sin\lambda$ and
so this last result implies
$$\cos\theta\cos\lambda + \sin\theta\sin\lambda = 0 \quad\text{that is}\quad \theta - \lambda = (2k + 1)\pi/2 \quad k \in \mathbb{Z}$$
Therefore
$$\beta_1 = \cos(\theta + (2k + 1)\pi/2) = \pm\sin\theta$$
$$\beta_2 = \sin(\theta + (2k + 1)\pi/2) = \mp\cos\theta$$
Hence the orthogonal map $\phi$ is characterised by the matrix
$$A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \quad\text{or}\quad B = \begin{pmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{pmatrix}$$
where $0 \leq \theta < 2\pi$. (The other choice $\alpha_1 = \cos\theta$, $\alpha_2 = -\sin\theta$ is obtained by replacing $\theta$ by
$-\theta$ in the above forms.) These are examples of so-called orthogonal matrices; when the
columns are considered with respect to the standard basis they are mutually orthogonal
each with unit norm. They are essentially distinguished by their differing determinants,
$\det A = +1$, $\det B = -1$.
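The geometric interpretation of these two distinct types of orthogonal matrices in $\mathbb{R}^2$ is of
interest. The first describes a rotation. To see this we consider an element of unit norm
in $\mathbb{R}^2$, say $(a, b)$; $a^2 + b^2 = 1$. With respect to the basis element $(1, 0)$ it subtends an angle $\gamma$:
$$\cos\gamma = \frac{\langle (1, 0), (a, b) \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{(a,b)}}} = \langle (1, 0), (a, b) \rangle = a$$
Similarly with respect to the basis element $(0, 1)$ it subtends an angle $(90° - \gamma)$,
$$\cos(90° - \gamma) = \langle (0, 1), (a, b) \rangle = b \quad\text{that is}\quad b = \sin\gamma$$
After applying the orthogonal map $\phi_A$ the angle $\gamma'$ with respect to the basis element $(1, 0)$
is given by
$$\begin{aligned}
\cos\gamma' &= \frac{\langle (1, 0), \phi_A\{(a, b)\} \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{\phi_A\{(a,b)\}}}} \\
&= \langle (1, 0), \phi_A\{(a, b)\} \rangle \\
&= \langle (1, 0),\ a[\alpha_1(1, 0) + \alpha_2(0, 1)] + b[\beta_1(1, 0) + \beta_2(0, 1)] \rangle \\
&= a\alpha_1 + b\beta_1 = a\cos\theta - b\sin\theta \\
&= \cos\gamma\cos\theta - \sin\gamma\sin\theta \\
&= \cos(\gamma + \theta)
\end{aligned}$$
Similarly, taking the inner product of $(0, 1)$ with $\phi_A\{(a, b)\}$ gives
$$\begin{aligned}
\sin\gamma' &= \langle (0, 1), \phi_A\{(a, b)\} \rangle \\
&= \langle (0, 1),\ a[\alpha_1(1, 0) + \alpha_2(0, 1)] + b[\beta_1(1, 0) + \beta_2(0, 1)] \rangle \\
&= a\alpha_2 + b\beta_2 \\
&= \cos\gamma\sin\theta + \sin\gamma\cos\theta \\
&= \sin(\theta + \gamma)
\end{aligned}$$
Hence
$$\gamma' = \theta + \gamma$$
i.e. the element $(a, b)$ has been rotated anti-clockwise through angle $\theta$. See Figure 1.9(a).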
Figure 1.9 (panel (a): the image $\phi_A\{(a, b)\}$; panel (b): the image $\phi_B\{(a, b)\}$, each drawn relative to the basis elements $(1, 0)$ and $(0, 1)$)
Under the second type of orthogonal map with matrix representation $B$:
$$\begin{aligned}
\cos\gamma'' &= \frac{\langle (1, 0), \phi_B\{(a, b)\} \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{\phi_B\{(a,b)\}}}} \\
&= \langle (1, 0), \phi_B\{(a, b)\} \rangle \\
&= a\alpha_1 + b\beta_1 = a\cos\theta + b\sin\theta \\
&= \cos\gamma\cos\theta + \sin\gamma\sin\theta \\
&= \cos(\gamma - \theta).
\end{aligned}$$
$$\begin{aligned}
\sin\gamma'' &= \langle (0, 1), \phi_B\{(a, b)\} \rangle \\
&= a\alpha_2 + b\beta_2 \\
&= \cos\gamma\sin\theta - \sin\gamma\cos\theta \\
&= \sin(\theta - \gamma)
\end{aligned}$$
That is, using the cosine and sine results
$$\gamma'' = \theta - \gamma$$
which is a reflection in the $(1, 0)$ axis followed by a rotation through angle $\theta$. See Figure
1.9(b). Of course the important characteristic of a transformation which involves a
reflection is that it does not preserve orientation. In terms of matrices, the
characteristic of an orthogonal matrix that corresponds to a rotation is that $\det A = +1$
whereas for an orthogonal matrix that includes a reflection $\det B = -1$.
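The two families of orthogonal matrices and their geometric effect can be explored numerically. The sketch below is an illustration in floating point under stated assumptions: the sample angles are arbitrary, the helper names (`rotation_matrix`, `reflection_matrix`, `is_orthogonal`, `image_angle`) are hypothetical, and angles are recovered with `math.atan2`, so results lie in the interval from $-\pi$ to $\pi$.

```python
import math

def rotation_matrix(theta):
    """First form: determinant +1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection_matrix(theta):
    """Second form: determinant -1."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def det2(M):
    """Determinant of a 2 x 2 matrix."""
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_orthogonal(M, tol=1e-12):
    """Check M M^T = I entry by entry, up to a small tolerance."""
    for i in range(2):
        for j in range(2):
            entry = sum(M[i][k] * M[j][k] for k in range(2))
            target = 1.0 if i == j else 0.0
            if abs(entry - target) > tol:
                return False
    return True

def image_angle(M, gamma):
    """Angle subtended at (1, 0) by the image of the unit element (cos g, sin g)."""
    a, b = math.cos(gamma), math.sin(gamma)
    x = M[0][0] * a + M[0][1] * b
    y = M[1][0] * a + M[1][1] * b
    return math.atan2(y, x)
```

With, say, $\theta = 0.7$ and $\gamma = 0.4$, the first form sends the angle to approximately $\theta + \gamma = 1.1$ and the second to approximately $\theta - \gamma = 0.3$.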
The Orthogonal Group
The set of all orthogonal transformations $\phi: \mathbb{R}^n \to \mathbb{R}^n$ in which the group operation is
map composition forms a group. It is perhaps easiest to see this if we make use of the
correspondence between orthogonal transformations and orthogonal matrices and then the
group operation is the matrix product. So let $\phi$, $\delta$, $\lambda$ be three orthogonal maps and $A_\phi$, $A_\delta$
and $A_\lambda$ be their respective matrix representations. We must check that the three group
axioms are satisfied.
G1o $\phi \circ (\delta \circ \lambda) = (\phi \circ \delta) \circ \lambda$ since matrix multiplication is associative:
$$A_\phi(A_\delta A_\lambda) = (A_\phi A_\delta)A_\lambda$$
G2o The unit matrix $I_n$ corresponds to the identity transformation $i$ so
$$i \circ \phi = \phi \circ i = \phi$$
since $I_n A_\phi = A_\phi I_n = A_\phi$ for all $A_\phi$.
G3o For every element $\phi$ there exists an inverse transformation $\phi^{-1}$ with the property
$$\phi \circ \phi^{-1} = \phi^{-1} \circ \phi = i$$
This is true since for every orthogonal matrix $A_\phi$ there always exists an inverse matrix
$A_\phi^{-1} = A_\phi^T$ such that
$$A_\phi A_\phi^T = A_\phi^T A_\phi = I_n$$
Thus the set of all orthogonal transformations or equivalently the set of all $n \times n$ orthogonal
matrices forms a group, called the orthogonal group and denoted by $O(n)$. However,
as we have seen, an orthogonal transformation in $\mathbb{R}^n$ need not preserve orientation.
An orthogonal transformation that does preserve orientation is said to be a special
orthogonal transformation or rotation and the group of such transformations is
called the special orthogonal group and denoted by $SO(n)$. An orthogonal
transformation that does not preserve orientation is called an anti-rotation. These cannot
form a group as they do not contain the identity.
37
The transformations which preserve orientation and those which do not are not
continuously connected, since for one set detA = +1 and for the other set detA = — 1.
In M3 it is easy to show, by direct construction, that all orthogonal transformations
which preserve orientation, changing one standard basis into another can be carried out
continuously as a sequence of rotations. To see this, refer to Figure 1.10 in which two
standard bases are illustrated; ei,e2,e3 directed along axes Ox,Oy,Oz and e'^e^e^ as
shown.
Figure 1.10
We see that the plane containing $e'_1, e'_2$ has a normal $e'_3$ inclined at an angle $\beta$ to the $Oz$ axis. This plane intersects the plane $Oxy$ in a line $OA$ inclined to the $Ox$ axis at an angle $\alpha$. Our first operation is to rotate about $Oz$ so that $e_1$ points along $OA$. That is, multiply by the orthogonal matrix
$$A_\alpha = \begin{pmatrix} \cos\alpha & -\sin\alpha & 0 \\ \sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
Then rotate about $OA$ through an angle $\beta$ using the orthogonal matrix:
$$A_\beta = \begin{pmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta \end{pmatrix}$$
Finally rotate about $OD$ through an angle $\gamma$ so as to align $e_1$ with $e'_1$. The direction $e_2$ will then be aligned with $e'_2$. This last transformation is effected by the orthogonal matrix:
$$A_\gamma = \begin{pmatrix} \cos\gamma & -\sin\gamma & 0 \\ \sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{pmatrix}$$
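The combined transformation is obtained via the product $A_\alpha A_\beta A_\gamma$ which is the orthogonal matrix:
$$\begin{pmatrix} \cos\alpha\cos\gamma - \sin\alpha\cos\beta\sin\gamma & -\cos\alpha\sin\gamma - \sin\alpha\cos\beta\cos\gamma & \sin\alpha\sin\beta \\ \sin\alpha\cos\gamma + \cos\alpha\cos\beta\sin\gamma & -\sin\alpha\sin\gamma + \cos\alpha\cos\beta\cos\gamma & -\cos\alpha\sin\beta \\ \sin\beta\sin\gamma & \sin\beta\cos\gamma & \cos\beta \end{pmatrix}$$
The angles $\alpha, \beta, \gamma$ are called Euler's angles.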
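As a check on the closed-form matrix, the sketch below multiplies three elementary rotation matrices numerically. It assumes the elementary matrices have the standard forms (a rotation about the third axis, then about the first, then about the third again) and that the closed form corresponds to the product taken in the order $\alpha$, $\beta$, $\gamma$ from left to right; the helper names are hypothetical.

```python
import math

def rot_z(t):
    """Rotation through angle t about the third axis."""
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    """Rotation through angle t about the first axis."""
    c, s = math.cos(t), math.sin(t)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def euler_matrix(al, be, ga):
    """Closed form of the combined transformation quoted in the text."""
    ca, sa = math.cos(al), math.sin(al)
    cb, sb = math.cos(be), math.sin(be)
    cg, sg = math.cos(ga), math.sin(ga)
    return [[ca * cg - sa * cb * sg, -ca * sg - sa * cb * cg, sa * sb],
            [sa * cg + ca * cb * sg, -sa * sg + ca * cb * cg, -ca * sb],
            [sb * sg, sb * cg, cb]]

al, be, ga = 0.3, 1.1, -0.7           # arbitrary sample Euler angles
product = matmul(matmul(rot_z(al), rot_x(be)), rot_z(ga))
closed = euler_matrix(al, be, ga)
max_diff = max(abs(product[i][j] - closed[i][j])
               for i in range(3) for j in range(3))
print(max_diff)                       # should be at rounding level
```

Reversing the order of the three factors gives a different matrix in general, which is a small reminder that rotations in three dimensions do not commute.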
1.7 Algebras
Definition A finite-dimensional linear space $S_T$ over a field $T$ is called an algebra if there is defined on $S_T$ a product, denoted by adjacency, which satisfies, for all $s, t, u \in S_T$ and all $\alpha, \beta, \gamma \in T$
(i) $s(\alpha t) = (\alpha s)t = \alpha(st)$
(ii) $s(t + u) = st + su$ and $(t + u)s = ts + us$
$S_T$ is called an associative algebra if $\forall\ s, t, u \in S_T$ we have
$$s(tu) = (st)u$$
It is called a commutative algebra if $\forall\ s, t \in S_T$
$$st = ts$$
The maps corresponding to multiplication on the left and on the right,
$$L_\ell : s \to \ell s \qquad R_r : s \to sr \qquad s, r, \ell \in S_T$$
are linear transformations since, for example,
$$L_\ell(\alpha s + \beta t) = \ell(\alpha s + \beta t) = \ell(\alpha s) + \ell(\beta t) = \alpha(\ell s) + \beta(\ell t) = \alpha L_\ell(s) + \beta L_\ell(t)$$
and similarly for $R_r$.
If $\{e_1, e_2, \ldots, e_n\}$ is a basis for $S_T$ (then we say it is also a basis for the algebra) then every $s \in S_T$ can be expressed uniquely in the form:
$$s = \alpha_1 e_1 + \alpha_2 e_2 + \cdots + \alpha_n e_n \qquad \alpha_i \in T$$
$S_T$ is said to be a division algebra if the equations
$$st = u \quad \text{and} \quad ts = u$$
always possess solutions if $t \neq 0$.
An algebra may have an element, which we shall usually denote by $i_1$, called the identity element such that
$$i_1 s = s i_1 = s \qquad \forall\ s \in S_T$$
It is easy to show that if such an element $i_1$ exists then it must be unique, for if there is a second such element $i'_1$ then, since $i_1$ is the identity,
$$i'_1 i_1 = i'_1$$
But since $i'_1$ is the identity then $i'_1 i_1 = i_1$ and so $i_1 = i'_1$, proving uniqueness.
When an algebra $S_T$ over a field $T$ has an identity $i_1$ then the set of elements
$$\{\alpha i_1,\ \alpha \in T\}$$
is an algebra of order 1, since there is just one basis element $i_1$. However, since
$$\alpha i_1 + \alpha' i_1 = (\alpha + \alpha') i_1$$
and
$$(\alpha i_1)(\alpha' i_1) = \alpha\alpha'(i_1 i_1) = \alpha\alpha' i_1$$
then this algebra is isomorphic to the field $T$.
An associative division algebra with an identity $i_1$ contains no divisors of zero, since if there were $s, t \in S_T$, $s \neq 0$, $t \neq 0$ such that
$$st = 0$$
then there would exist an element $u \in S_T$ such that
$$tu = i_1$$
that is $0 = (st)u = s(tu) = s i_1 = s$, which is a contradiction.
Normed Algebras
Let $s \in S_T$. The norm $N_s$ of $s$ is defined with respect to the basis $\{e_1, e_2, \ldots, e_n\}$ as:
$$N_s = \alpha_1^2 + \alpha_2^2 + \ldots + \alpha_n^2 \in T$$
The algebra is called a normed algebra if, for some basis, the norm satisfies
$$N_{st} = N_s N_t$$
It is easy to show that a normed algebra has no divisors of zero, since if $s, t \in S_T$ and if $st = 0$ then
$$N_{st} = N_0 = 0$$
$$\therefore\ N_s N_t = 0 \ \Rightarrow\ N_s = 0 \text{ or } N_t = 0 \ \Rightarrow\ s = 0 \text{ or } t = 0$$
We conclude that if $st = 0$ then either $s = 0$ or $t = 0$.
In a normed algebra over the field of real numbers $\mathbb{R}$ it is always acceptable to assume the existence of a unit element $i_1$. To see this we note that for any element $H \in S_{\mathbb{R}}$
$$H = a_1 e_1 + a_2 e_2 + \ldots + a_n e_n \qquad a_i \in \mathbb{R}$$
then
$$N_H = a_1^2 + a_2^2 + \ldots + a_n^2$$
Hence, for $H \neq 0$, we can introduce an element $h = \frac{1}{\sqrt{N_H}} H$ such that $N_h = 1$. Now since $S_{\mathbb{R}}$ is a normed algebra then $\forall\ s \in S_{\mathbb{R}}$
$$N_{hs} = N_h N_s = N_s$$
but this shows that the map $L_h : s \to hs$ is an orthogonal linear transformation from $S_{\mathbb{R}}$ onto itself and hence $L_h$ is invertible: $L_h^{-1}(hs) = s$. Similarly $R_h$ is invertible: $R_h^{-1}(sh) = s$. We can introduce an element $i_1$ such that
$$i_1 = h^2$$
and a new product into $S_{\mathbb{R}}$ by:
$$s \otimes t = R_h^{-1}(s)\, L_h^{-1}(t)$$
then
$$i_1 \otimes t = R_h^{-1}(i_1)\, L_h^{-1}(t)$$
$$= h\, L_h^{-1}(t)$$
$$= L_h(L_h^{-1}(t)) = t$$
Similarly
$$s \otimes i_1 = R_h^{-1}(s)\, L_h^{-1}(i_1)$$
$$= R_h^{-1}(s)\, L_h^{-1}(h^2)$$
$$= R_h^{-1}(s)\, h$$
$$= R_h(R_h^{-1}(s)) = s$$
Thus, with respect to the new product we have constructed an element $i_1$ such that
$$i_1 \otimes t = t \otimes i_1 = t$$
Isomorphic Algebras
Two algebras $S_T$ and $S'_T$ over the same field $T$ are said to be isomorphic if there is a one-to-one map $\phi : S_T \to S'_T$, $\phi(s) = s'$, such that $\forall\ s, t \in S_T$ with images $s', t' \in S'_T$
(i) $\phi(s + t) = s' + t'$
(ii) $\phi(\alpha s) = \alpha s'$, $\alpha \in T$
(iii) $\phi(st) = s't'$
If (iii) is replaced by $\phi(st) = t's'$ then $S_T$ and $S'_T$ are said to be reciprocal.
Involution
A map $\phi$ is said to be an involution if
(i) $\phi^2$ = the identity map
(ii) $\phi(ab) = \phi(a)\phi(b)$
If (ii) is replaced by (ii)' $\phi(ab) = \phi(b)\phi(a)$ then $\phi$ is said to be an anti-involution.
For example, in the field of complex numbers the conjugate operation (using the usual notation)
$$\phi : z \to z^*$$
is an involution since
$$\phi^2(z) = \phi(\phi(z)) = \phi(z^*) = z \quad \text{and} \quad \phi(zw) = (zw)^* = z^* w^*$$
However in the algebra of quaternions, as we shall see, the quaternion conjugate is an anti-involution.
1.8 Complex Numbers
The properties of multiplication of scalars which stem from the basic concept of addition show that positive and negative scalars are distinguished, since positive scalars have positive and negative square roots whilst negative scalars seem to have no square roots at all. This biased treatment of positive and negative scalars is unsatisfactory and was only resolved with the invention of complex numbers. The crux of the difficulty with scalars is that when a scalar (or displacement) is multiplied by a positive scalar, no change of direction is involved, whereas when a scalar is multiplied by a negative scalar, a 180° change of direction is involved. What we shall seek is a treatment of number which encompasses a continuous
change of direction, from moving to the right on a line to moving to the left on the line. It
is obvious how to do this geometrically by using the notion of rotation in the plane and
it is the geometrical construct of a complex number that we shall consider first. We are
led naturally to consider extending the concept of a scalar from the displacement of points
on a line, to the displacement of points on a plane. In this process, some of the algebraic
structure of scalars will be lost; in particular, as is obvious geometrically and as we shall
later prove algebraically, we lose the concept of order.
We can extend the geometrical concept of an ordinary number to two dimensions. In a natural way, number pairs are introduced. Each point in the plane can be represented by a number pair $z = (a, b)$, where $a$ is the distance along the $s$-axis (the scalar or real axis) and $b$ is the distance along a perpendicular axis through $0$, the $v$-axis (the vector or imaginary axis). As with scalars the square of the length (of the line segment connecting $(0,0)$ to $(a, b)$) is called the norm of $z$, denoted by $N_z$. We call $z$ a 'complex number' because it is only properly defined by two 'ordinary' numbers $a, b$ taken in a certain order. We need to invent an algebra for these complex numbers which is consistent with ordinary scalar algebra.
Addition is defined in an obvious way. If $z = (a, b)$ and $w = (c, d)$ are any two complex numbers then their sum is written $z + w$ and defined by: $z + w = (a + c,\ b + d)$. That is, corresponding 'components' are added together. Obviously addition is associative and commutative, following directly from the associative and commutative properties of scalars. The additive identity, or zero, is the complex number $0 = (0,0)$ and is such that $z + 0 = 0 + z = z$. Also, for given complex numbers $z, w$, we can always find a complex number $u$ satisfying the equation $z + u = w$. The solution is $u = (c - a,\ d - b)$. Thus the first three ring axioms are satisfied.
Scalar Multiplication
If $s$ is a scalar and $z$ a complex number then the product of $s$ with $z$ is written $sz$ and is the complex number in the direction of $(\operatorname{sign} s)z$ with norm $N_{sz} = N_s N_z$. Equivalently if $z = (a, b)$ then $sz = (sa, sb)$. Using scalar multiplication it is clear that any complex number $z$ can be written in the form:
$$z = \sqrt{N_z}\,\hat{z}$$
where $N_{\hat{z}} = 1$ and $\hat{z}$ is in the same direction as $z$.
Complex Products
The more difficult concept is multiplication of complex numbers. How should this binary
operation be defined? Any definition we introduce must be in agreement with the results
that we are familiar with for the multiplication of scalars. A point in the plane can be
specified by a pair of Cartesian coordinates $(s, v)$ or by so-called polar coordinates $[r, \theta]$ where $r$ denotes the positive distance from the origin and $\theta$ denotes the angle made with the positive $s$-axis. See Figure 1.11. In a very obvious sense $[r, \theta]$ is a generalisation of
the 'polar form' of a scalar (y/N^, arcc).
Figure 1.11
The complex number $z$ can be labelled in two alternative ways: $(s, v)$, the Cartesian description, or $[r, \theta]$, the polar description.
The important characteristic of the polar angle $\theta$ is that it is only unique up to an integer multiple of $2\pi$. In the polar description any positive scalar has the form $[r, 2k\pi]$ and any negative scalar has the form $[r, (2k + 1)\pi]$ where $k$ is an integer or zero.
To consider possible definitions of multiplication we consider the well-known results, in
polar form, of multiplying scalars. These are grouped together in the following diagram,
which emphasises the product by using the times symbol x.
Known results:
(positive) times (positive): $a = [r_a, 0]$, $b = [r_b, 0]$; $a \times b = [r_a r_b, 0]$, a positive ordinary number
(positive) times (negative): $a = [r_a, 0]$, $b = [r_b, \pi]$; $a \times b = [r_a r_b, \pi]$, a negative ordinary number
(negative) times (negative): $a = [r_a, \pi]$, $b = [r_b, \pi]$; $a \times b = [r_a r_b, 2\pi]$, a positive ordinary number
Definition 1
If $z = [r_z, \theta_z]$ and $w = [r_w, \theta_w]$ are two general complex numbers then a 'possible' definition for multiplication consistent with ordinary scalar multiplication could be:
$$z \times w = [r_z r_w,\ \theta_w - \theta_z]$$
That is, the 'r-values' multiply and the 'angles' subtract. This definition leads to the usual results for scalars:
$z = [r_z, 0]$, $w = [r_w, 0]$; $z \times w = [r_z r_w, 0]$, a positive scalar
$z = [r_z, 0]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, \pi]$, a negative scalar
$z = [r_z, \pi]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, 0]$, a positive scalar.
This definition is not used as it has a number of obvious drawbacks:
(i) $z \times w \neq w \times z$. That is, this product is not commutative for general complex numbers (though it is for scalars).
(ii) Although each positive scalar has the usual square roots it also has an infinite number of distinct complex square roots. That is, $[r_z, 0]$ has square roots $[r_z^{1/2}, \theta]$ for any $\theta$.
(iii) There do not exist square roots of any other scalar! That is, if we have a complex number $[r_z, \theta]$, $\theta \neq 0$, then we cannot find $[r_w, \phi]$ such that
$$[r_w, \phi] \times [r_w, \phi] = [r_z, \theta]$$
We are thus led to reject this definition of multiplication. (Some of these properties seem strange indeed. However, as we shall see, similar properties will have to be accepted for quaternion numbers.)
Definition 2
If $z = [r_z, \theta_z]$ and $w = [r_w, \theta_w]$ are two general complex numbers then a second 'possible' definition for multiplication could be:
$$z \times w = [r_z r_w,\ \theta_z + \theta_w]$$
i.e. 'r-values' multiply and 'angles' add. As in definition 1 this produces results consistent with the usual results for multiplication of scalars:
$z = [r_z, 0]$, $w = [r_w, 0]$; $z \times w = [r_z r_w, 0]$, a positive scalar
$z = [r_z, 0]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, \pi]$, a negative scalar
$z = [r_z, \pi]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, 2\pi]$, a positive scalar.
This definition does not suffer from the disadvantages of definition 1.
1. $z \times w = w \times z$. That is, the product is commutative for all complex numbers.
2. Every non-zero complex number $[r_z, \theta_z]$ has, as we hoped, exactly two square roots. This is easy to prove. If $[r_z, \phi]$ is a square root of the complex number $[r_w, \theta + 2k\pi]$ for some integer $k$, that is, if
$$[r_z, \phi] \times [r_z, \phi] = [r_w,\ \theta + 2k\pi]$$
then
$$r_z^2 = r_w \qquad 2\phi = \theta + 2k\pi \qquad k \text{ integer or zero}$$
$$r_z = r_w^{1/2} \qquad \phi = \frac{\theta}{2} + k\pi \qquad k \text{ integer or zero}$$
But there are only two distinct values of $[r_z, \phi]$ depending as $k$ is even or odd. The geometrical construction of square roots is described in Figure 1.12.
$z_1 = [r^{1/2}, \theta/2]$, $z_2 = [r^{1/2}, \theta/2 + \pi]$
Figure 1.12
We agree to accept this as the definition of multiplication of complex numbers.
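The two candidate definitions can be compared directly. The sketch below is illustrative only: it represents a complex number as a Python tuple `(r, theta)` (an assumed encoding of the polar form), reduces angles modulo $2\pi$, and compares results as points of the plane so that angles differing by $2\pi$ are treated as equal.

```python
import math

TWO_PI = 2.0 * math.pi

def mul_def1(z, w):
    """Definition 1: r-values multiply, angles subtract (theta_w - theta_z)."""
    (rz, tz), (rw, tw) = z, w
    return (rz * rw, (tw - tz) % TWO_PI)

def mul_def2(z, w):
    """Definition 2: r-values multiply, angles add."""
    (rz, tz), (rw, tw) = z, w
    return (rz * rw, (tz + tw) % TWO_PI)

def same_point(z, w, tol=1e-9):
    """Compare two polar pairs as points of the plane."""
    (rz, tz), (rw, tw) = z, w
    return (abs(rz * math.cos(tz) - rw * math.cos(tw)) < tol and
            abs(rz * math.sin(tz) - rw * math.sin(tw)) < tol)

def square_roots_def2(w):
    """The two square roots under definition 2: [sqrt(r), theta/2 + k*pi]."""
    r, t = w
    return [(math.sqrt(r), (t / 2.0 + k * math.pi) % TWO_PI) for k in (0, 1)]

z, w = (2.0, 0.3), (3.0, 1.1)
print(same_point(mul_def1(z, w), mul_def1(w, z)))   # definition 1: order matters
print(same_point(mul_def2(z, w), mul_def2(w, z)))   # definition 2: commutative

minus_four = (4.0, math.pi)                          # the negative scalar -4
roots = square_roots_def2(minus_four)
print(roots)
```

Under definition 2 the negative scalar $[4, \pi]$ acquires the two square roots $[2, \pi/2]$ and $[2, 3\pi/2]$, whereas under definition 1 every square has angle $0$.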
The close connection between multiplication and rotation is strongest when we consider complex numbers of unit norm $r = 1$. If $w$ and $z$ are two such numbers:
$$w = [1, \theta] \qquad z = [1, \phi] \qquad \text{then} \qquad zw = [1, \theta + \phi]$$
and the effect of multiplying $w$ by $z$ is to rotate (in the anti-clockwise, positive sense) the complex number $w$ through angle $\phi$.
To recap: the product of two complex numbers $z = (a, b)$, $w = (c, d)$ is defined according to the rule:
$$zw = (ac - bd,\ ad + bc) \quad \text{in Cartesians}$$
$$\text{or} \quad zw = [r_z r_w,\ \theta_z + \theta_w] \quad \text{in Polars}$$
This product is commutative,
$$wz = (ca - db,\ cb + da) = zw$$
since ordinary scalar multiplication is commutative. It is also associative. If $p = (e, f)$ is a third complex number then
$$z(wp) = (a, b)\{(c, d)(e, f)\}$$
$$= (a, b)(ce - df,\ cf + de)$$
$$= (a(ce - df) - b(cf + de),\ a(cf + de) + b(ce - df))$$
$$= ((ac - bd)e - f(ad + bc),\ (ad + bc)e + f(ac - bd))$$
$$= (ac - bd,\ ad + bc)(e, f)$$
$$= (zw)p$$
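The Cartesian rule is easy to exercise on number pairs. This is a small sketch using integer pairs so that the arithmetic is exact; the function name `cmul` and the sample values are illustrative, and the comparison with Python's built-in `complex` type is only a cross-check.

```python
def cmul(z, w):
    """Product of number pairs: (a, b)(c, d) = (ac - bd, ad + bc)."""
    a, b = z
    c, d = w
    return (a * c - b * d, a * d + b * c)

z, w, p = (2, 3), (-1, 4), (5, -2)    # arbitrary integer sample values

commutes = cmul(z, w) == cmul(w, z)
associates = cmul(z, cmul(w, p)) == cmul(cmul(z, w), p)
unit_ok = cmul((1, 0), z) == z        # (1, 0) acts as multiplicative identity
j_squared = cmul((0, 1), (0, 1))      # the pair (0, 1) plays the role of j

builtin = complex(*z) * complex(*w)   # cross-check against Python's complex
print(cmul(z, w), builtin, commutes, associates, unit_ok, j_squared)
```

With these values the pair product and the built-in product agree, and $(0,1)(0,1) = (-1, 0)$ recovers $j^2 = -1$.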
The set of complex numbers, denoted by $\mathbb{C}$, with the binary operations of addition and multiplication satisfies all the axioms of a commutative ring. The complex numbers admit a multiplicative identity (or unit), namely $(1,0)$. Whether we use the paired number notation $(\alpha, \beta)$ or the more common algebraic form $\alpha + j\beta$ for which $j^2 = -1$ (or indeed any of the many other possible forms used to represent a complex number) will depend entirely upon the immediate application.
We should note that although $\mathbb{C}$ is a commutative ring, it cannot be ordered. To see this we first assume that we can construct a subset $\mathbb{C}_p \subset \mathbb{C}$ which satisfies the order relations. Of course $1 \in \mathbb{C}_p$ since $1 \times 1 = 1$ and the square of every non-zero element of an ordered set is positive. Thus $-1 \notin \mathbb{C}_p$. But $j^2 = -1$, which contradicts the property that the square of every non-zero element is positive. Thus $\mathbb{C}$ cannot be ordered.
The conjugate map
Let $z = s + jv$; then the conjugate of $z$, denoted by $\bar{z}$, is defined as $\bar{z} = s - jv$. It is easily confirmed that for all $z, w \in \mathbb{C}$, $\overline{(z + w)} = \bar{z} + \bar{w}$ and $\overline{(zw)} = \bar{z}\bar{w}$. The map $f : \mathbb{C} \to \mathbb{C}$, $f(z) = \bar{z}$ is an automorphism since
$$f(zw) = \overline{(zw)} = \bar{z}\bar{w} = f(z)f(w)$$
$$\text{and} \quad f(z + w) = \overline{(z + w)} = \bar{z} + \bar{w} = f(z) + f(w)$$
Also $f$ is clearly one-to-one and onto. This of course is not surprising from a geometrical point of view since $\bar{z}$ is the reflection in the scalar axis of $z$. Using the conjugate we can re-express the norm of $z$, $N_z = z\bar{z}$, and we find $N_{zw} = N_z N_w$. Also since $z\bar{z} = \bar{z}z \in \mathbb{R}_p$ it is clear that to each non-zero complex number $z$ we can construct a multiplicative inverse
$$z^{-1} = \frac{\bar{z}}{N_z}$$
Since $(\mathbb{C}, +, \cdot)$ is a commutative ring with, for every non-zero $z \in \mathbb{C}$, an inverse $z^{-1}$ then $(\mathbb{C}, +, \cdot)$ is a field. Obviously, using the conjugate we see that the scalar part $S(z)$ and vector part $V(z)$ of $z$ are
$$S(z) = \tfrac{1}{2}(z + \bar{z}) \qquad V(z) = \tfrac{1}{2}(z - \bar{z})$$
We can also deduce two interesting inequalities with reference to the norm. First, for any complex number $z = s + jv$
$$S(z) = \tfrac{1}{2}(z + \bar{z}) = s \quad \text{and} \quad N_z = z\bar{z} = s^2 + v^2$$
Therefore $\sqrt{s^2 + v^2} \geq s$, that is,
$$\sqrt{N_z} \geq S(z) \quad \text{for all } z$$
Now
$$N_{z+w} = (z + w)\overline{(z + w)} = z\bar{z} + (z\bar{w} + w\bar{z}) + w\bar{w}$$
$$= N_z + 2S(z\bar{w}) + N_w \leq N_z + 2\sqrt{N_{z\bar{w}}} + N_w$$
$$\therefore\ N_{z+w} \leq N_z + 2\sqrt{N_z N_w} + N_w$$
or, in terms of the more commonly used moduli: $|z + w| \leq |z| + |w|$. In a similar vein
$$N_{z-w} = (z - w)\overline{(z - w)}$$
$$= N_z - 2S(z\bar{w}) + N_w \geq N_z - 2\sqrt{N_z N_w} + N_w$$
That is, $|z - w| \geq \big||z| - |w|\big|$. These are the triangle inequalities.
The Angle Between two Complex Numbers
Let $z$ and $w$ be any two complex numbers. Then, in polar form:
$$z = r_z(\cos\alpha + j\sin\alpha) \qquad w = r_w(\cos\beta + j\sin\beta)$$
and, as is easily found:
$$\frac{z}{w} = \frac{z\bar{w}}{N_w} = \frac{r_z}{r_w}\,[\cos(\alpha - \beta) + j\sin(\alpha - \beta)]$$
Thus we can conclude
$$S(z\bar{w}) = r_z r_w \cos(\alpha - \beta) \quad \text{and} \quad V(z\bar{w}) = r_z r_w\, j\sin(\alpha - \beta)$$
This leads to the definition of the angle $\theta = \alpha - \beta$ between the two directions represented by the complex numbers $z$ and $w$ to be such that:
$$\cos\theta = \frac{S(z\bar{w})}{\sqrt{N_z}\sqrt{N_w}} \quad \text{and} \quad j\sin\theta = \frac{V(z\bar{w})}{\sqrt{N_z}\sqrt{N_w}}$$
These results on angle suggest that the space of complex numbers can be considered as an inner product space. This is confirmed if we choose as inner product in $\mathbb{C}$:
$$\langle z, w \rangle = S(z\bar{w}) \in \mathbb{R}$$
The inner product axioms are easily checked.
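The inner product $S(z\bar{w})$ and the triangle inequalities can be tried out with Python's built-in complex numbers. In this sketch the helper names (`norm`, `inner`, `angle_between`) are illustrative, and Python's `1j` stands in for the unit written $j$ in the text.

```python
import cmath
import math

def norm(z):
    """N_z = z * conj(z), the squared length of z."""
    return (z * z.conjugate()).real

def inner(z, w):
    """Inner product <z, w> = S(z * conj(w)), the scalar part."""
    return (z * w.conjugate()).real

def angle_between(z, w):
    """Angle with cos(theta) = S(z conj(w)) / (sqrt(N_z) sqrt(N_w))."""
    return math.acos(inner(z, w) / math.sqrt(norm(z) * norm(w)))

z = cmath.rect(2.0, 1.0)              # r_z = 2, alpha = 1.0
w = cmath.rect(3.0, 0.4)              # r_w = 3, beta = 0.4

theta = angle_between(z, w)
print(theta)                          # should be close to alpha - beta = 0.6
print(inner(z, w), 6.0 * math.cos(0.6))
print(abs(z + w) <= abs(z) + abs(w), abs(z - w) >= abs(abs(z) - abs(w)))
```

Because `acos` returns a value between $0$ and $\pi$, this sketch recovers the magnitude of $\alpha - \beta$; the sign is carried by the vector part $V(z\bar{w})$.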
Rotations and Reflections
We now show that multiplication of a complex number $w$ by a unit complex number $z = x + jy$ represents a rotation of $w$ in $\mathbb{R}^2$. To see this consider the matrix representation of the linear map $\phi : \mathbb{R}^2 \to \mathbb{R}^2$, $\phi(w) = zw$. Now:
$$\phi(1) = z = x + jy \qquad \phi(j) = zj = -y + jx$$
Therefore its matrix representation, with the images of $1$ and $j$ as columns, is:
$$A_\phi = \begin{pmatrix} x & -y \\ y & x \end{pmatrix}$$
An easy calculation confirms that $A_\phi$ is orthogonal and $\det A_\phi = x^2 + y^2 = 1$. Hence $A_\phi \in SO(2)$ and the transformation is a rotation. We now consider a related transformation: a reflection. Let $w$ be a given complex number and $z$ a unit complex number ($N_z = 1$). See Figure 1.13.
Figure 1.13
We might ask what is the reflection $w_R$ of $w$ in the line represented by $z$ and what is the reflection $w_Q$ in the line whose normal is represented by $z$?
Clearly $w_R$ is obtained by rotating $w$ in the clockwise direction through angle $2\alpha$, where $\alpha$ is the angle from $z$ to $w$:
$$w_R = w e^{-2j\alpha}$$
Now, since $e^{-j\alpha} = z\bar{w}/\sqrt{N_w}$,
$$w_R = w\,\frac{(z\bar{w})^2}{N_w} = \frac{z^2 \bar{w}\,(w\bar{w})}{N_w} = z^2\bar{w}$$
In a similar manner we find:
$$w_Q = w e^{j(180^\circ - 2\alpha)} = -w e^{-j2\alpha} = -z^2\bar{w} = -w_R$$
It is self evident that the effect of any planar rotation on a given $w$ is equivalent to a reflection in the line bisecting the angle of rotation. We can consider reflection from a linear space perspective. The map $\theta : \mathbb{R}^2 \to \mathbb{R}^2$, $\theta(w) = z^2\bar{w}$ is linear. Also
$$\theta(1) = z^2 = x^2 - y^2 + 2jxy = 1(x^2 - y^2) + (2xy)j$$
$$\theta(j) = z^2\bar{j} = 2xy - j(x^2 - y^2) = 1(2xy) + (-(x^2 - y^2))j$$
Therefore the matrix representing the map is:
$$A_\theta = \begin{pmatrix} x^2 - y^2 & 2xy \\ 2xy & -(x^2 - y^2) \end{pmatrix}$$
As with the rotation map $\phi(w) = zw$ it is easily verified that this is an orthogonal matrix but here $\det A_\theta = -1$. That is, the $\theta$-map does not preserve orientation and so cannot be a rotation. We also see (using our knowledge of matrices) that two successive reflections, represented by $A_\theta$, $B_\theta$ respectively, would be equivalent to a rotation since $A_\theta B_\theta$ is orthogonal if $A_\theta$ and $B_\theta$ are, and $\det(A_\theta B_\theta) = \det A_\theta \det B_\theta = +1$.
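The reflection formulae can be tested with built-in complex arithmetic. The following sketch assumes the reflection of $w$ in the line through the unit number $z$ is $z^2\bar{w}$, as derived above; the function names and sample values are illustrative.

```python
import cmath

def reflect_in_line(z, w):
    """Reflection of w in the line through 0 and the unit number z."""
    return z * z * w.conjugate()

def reflect_in_normal(z, w):
    """Reflection of w in the line whose normal is the unit number z."""
    return -z * z * w.conjugate()

def reflection_matrix(z):
    """Matrix of w -> z^2 conj(w) for the unit number z = x + jy."""
    x, y = z.real, z.imag
    return [[x * x - y * y, 2 * x * y], [2 * x * y, -(x * x - y * y)]]

z = cmath.rect(1.0, 0.5)              # unit number at angle 0.5
w = cmath.rect(2.0, 1.3)              # so alpha = 1.3 - 0.5 = 0.8

w_r = reflect_in_line(z, w)
expected = cmath.rect(2.0, 0.5 - 0.8) # mirror image: angle 0.5 - alpha
m = reflection_matrix(z)
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]

print(abs(w_r - expected))            # should be at rounding level
print(det)                            # should be close to -1
print(abs(reflect_in_line(z, w_r) - w))  # reflecting twice restores w
```

Two successive reflections in lines $z_1$ and $z_2$ give $z_2^2\,\overline{z_1^2\bar{w}} = (z_2\bar{z}_1)^2 w$, which is multiplication by a unit number and hence a rotation.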
Circular Arcs
We have seen that if $w$ is any complex number then $(\cos\theta + j\sin\theta)w$ rotates $w$ through an angle $\theta$ in an anti-clockwise direction. We can form an association between the arc of the circle which subtends angle $\theta$ and the unit complex number $\cos\theta + j\sin\theta$. We write (using $\sim$ to specify the geometrical correspondence)
$$z = \cos\theta + j\sin\theta \sim \mathrm{arc}_{AB}$$
We note that the arc can be positioned anywhere on the circle: it is slidable. The arc is able to move to any position on this circle so long as its length and direction remain unchanged. See Figure 1.14(a).
From this correspondence we deduce that any point on the circle, or indeed the circle itself, ($\theta = 0$) is represented by $z = 1$ and any semi-circle ($\theta = \pi$) is represented by $z = -1$. Also if $\mathrm{arc}_{AB}$ is represented by $z$ then $z^{-1}$ represents $\mathrm{arc}_{BA}$ whilst $-z$ represents $\mathrm{arc}_{DB}$ ($DOA$ is a diameter, see Figure 1.14(b)). Also the complex number $z = j$ (i.e. $\theta = \pi/2$) represents a quarter-circle. Circle arcs may be added vectorially by sliding one along the circle until its start point is at the end point of the other and clearly
$$\mathrm{arc}_z \oplus \mathrm{arc}_w = \mathrm{arc}_{zw}$$
where $\oplus$ denotes 'vector summation' of arcs in the sense indicated in Figure 1.14. Any two arcs $z, w$ will form a closed circle only if
$$zw = 1 \quad \text{since then} \quad \mathrm{arc}_z \oplus \mathrm{arc}_w = 0$$
This is easily generalised: a collection of $n$ arcs, represented by $z_1, z_2, \ldots, z_n$, will form a closed circle only if $z_1 z_2 \cdots z_n = 1$.
(a) (b)
Figure 1.14
We can group together the main results of the correspondence between unit complex numbers and circular arcs:
$z \sim \mathrm{arc}_{AB}$, $z^{-1} \sim \mathrm{arc}_{BA}$, $-z \sim \mathrm{arc}_{DB}$
$1 \sim$ point or circle, $-1 \sim$ semi-circle, $j \sim$ quarter-circle
$\mathrm{arc}_z \oplus \text{circle} = \mathrm{arc}_z$, $\mathrm{arc}_z \oplus \text{semi-circle} = -\mathrm{arc}_z$
If $z, w$ are general complex numbers, $z = \sqrt{N_z}\,\hat{z}$ and $w = \sqrt{N_w}\,\hat{w}$, where $\hat{z}, \hat{w}$ are complex numbers with unit norms, then
$$zw = \sqrt{N_z}\,\hat{z}\,\sqrt{N_w}\,\hat{w} \sim \sqrt{N_z}\sqrt{N_w}\ \mathrm{arc}_{\hat{z}\hat{w}}$$
Interpreted appropriately we see that multiplication of complex numbers can be decomposed into the product of positive reals and the vector addition of arcs. As we shall see later a very similar (and much more useful) correspondence can be made between great circle arcs on a sphere and quaternions, which generalise complex numbers.
Some authors define the polar coordinates of $z = \sqrt{N_z}\,\hat{z}$ to be (essentially) the pair $\{\sqrt{N_z}, \mathrm{arc}_{\hat{z}}\}$. This is in accord with the 'polar coordinates' defined for real numbers. See Figure 1.15.
alternative polar coordinates for z
Figure 1.15
Matrices and Complex Numbers
Another way of considering complex numbers utilises matrices. Matrices arise naturally
when rotations in the plane are considered. A rotation of a point with coordinates (x,y)
to a new point with coordinates $(x', y')$ can be effected by the transformation
$$x' = ax - by = \sqrt{a^2 + b^2}\,[x\cos\theta - y\sin\theta]$$
$$y' = bx + ay = \sqrt{a^2 + b^2}\,[x\sin\theta + y\cos\theta]$$
where
$$\cos\theta = \frac{a}{\sqrt{a^2 + b^2}} \qquad \sin\theta = \frac{b}{\sqrt{a^2 + b^2}}$$
The terms in square brackets describe a pure rotation, through angle $\theta$, and the factor $\sqrt{a^2 + b^2}$ describes an expansion: the point $(x', y')$ is a factor $\sqrt{a^2 + b^2}$ further away from the origin than $(x, y)$. This transformation may be written in matrix form:
$$(x', y') = (x, y)\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
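Theorem 1.6 Let the set of all $2 \times 2$ matrices of the form $\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$ be denoted by $M(2|\mathbb{R})$. The map
$$f : (\mathbb{C}, +, \cdot) \to (M(2|\mathbb{R}), \oplus, \otimes) \qquad f\{(a + jb)\} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
is an isomorphism.
Proof
$$f\{(a + jb) \cdot (c + jd)\} = f\{(ac - bd) + j(bc + ad)\}$$
$$= \begin{pmatrix} ac - bd & bc + ad \\ -bc - ad & ac - bd \end{pmatrix} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \otimes \begin{pmatrix} c & d \\ -d & c \end{pmatrix}$$
$$= f(a + jb) \otimes f(c + jd)$$
$$f\{(a + jb) + (c + jd)\} = f\{(a + c) + j(b + d)\}$$
$$= \begin{pmatrix} a + c & b + d \\ -b - d & a + c \end{pmatrix} = \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \oplus \begin{pmatrix} c & d \\ -d & c \end{pmatrix}$$
$$= f(a + jb) \oplus f(c + jd)$$
Thus $f$ is a homomorphism. It is clearly one-to-one and onto. Thus the map $f$ is an isomorphism. We note that the complex conjugate corresponds, in matrix terms, to the transpose operation.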
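The homomorphism property of Theorem 1.6 can be spot-checked numerically. This is a sketch with hypothetical helper names, using small integer-valued complex numbers so that floating-point comparison is exact; it illustrates the theorem on sample values and is not a proof.

```python
def f(z):
    """Map a + jb to the matrix ((a, b), (-b, a))."""
    a, b = z.real, z.imag
    return ((a, b), (-b, a))

def mat_add(m, n):
    """Entrywise sum of two 2x2 matrices."""
    return tuple(tuple(m[i][j] + n[i][j] for j in range(2)) for i in range(2))

def mat_mul(m, n):
    """Ordinary product of two 2x2 matrices."""
    return tuple(tuple(sum(m[i][k] * n[k][j] for k in range(2))
                       for j in range(2)) for i in range(2))

def transpose(m):
    """Transpose of a 2x2 matrix."""
    return tuple(tuple(m[j][i] for j in range(2)) for i in range(2))

z, w = complex(2, 3), complex(-1, 4)  # arbitrary sample values

product_ok = f(z * w) == mat_mul(f(z), f(w))
sum_ok = f(z + w) == mat_add(f(z), f(w))
conjugate_ok = f(z.conjugate()) == transpose(f(z))
print(product_ok, sum_ok, conjugate_ok)
```

The last check illustrates the closing remark of the proof: conjugation of $z$ corresponds to transposition of $f(z)$.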
Chapter 2
Quaternions
2.1 Inventing Quaternions
The quaternion is a generalisation of a complex number (see van der Waerden's article [6] for the historical development). We can consider generalising the complex number from a geometrical or an algebraic perspective. We begin with the geometrical approach. The complex number describes 2-dimensional space (the $xy$ plane) and (as far as multiplication of complex numbers is concerned) rotations within it. Now rotations in the $xy$ plane are commutative. That is, if $P_1$ and $P_2$ denote rotations in the $xy$ plane and if successive rotations, $P_1$ followed by $P_2$, are denoted by $P_2P_1$ then $P_1P_2 = P_2P_1$. This is reflected in the algebra of complex numbers. Multiplication of complex numbers (which as we know describe planar rotations) is commutative: $z_1z_2 = z_2z_1$. However, if we wish to describe rotations in 3-dimensional space then we are immediately faced with a difficulty since, as is easily demonstrated, space rotations are non-commutative. For example, in 3-dimensional space with axes $Ox$, $Oy$, $Oz$ we consider two right-handed rotations:
$R_1$: about the $x$-axis through 90° $\qquad$ $R_2$: about the $y$-axis through 90°
Applying $R_1$ first takes a point $P : (0,0,1)$ to the point $P' : (0,-1,0)$. Application of $R_2$ leaves $P'$ unchanged. Conversely, applying $R_2$ first takes $P$ into $P'' : (1,0,0)$. Application of $R_1$ leaves $P''$ unchanged, i.e. $R_2R_1 \neq R_1R_2$ and the order of the rotations is important (unlike planar rotations). See Figure 2.1.
Figure 2.1 (panels (a), (b), (c))
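The example with $R_1$ and $R_2$ can be reproduced with integer matrices. In this sketch the two matrices are the standard right-handed quarter-turn matrices about the $x$- and $y$-axes acting on column vectors, which is an assumed convention chosen to match the images of $P$ quoted in the text.

```python
R1 = [[1, 0, 0],
      [0, 0, -1],
      [0, 1, 0]]                      # 90 degrees about the x-axis

R2 = [[0, 0, 1],
      [0, 1, 0],
      [-1, 0, 0]]                     # 90 degrees about the y-axis

def apply(m, p):
    """Apply a 3x3 matrix to the point p = (x, y, z)."""
    return tuple(sum(m[i][k] * p[k] for k in range(3)) for i in range(3))

P = (0, 0, 1)

r1_then_r2 = apply(R2, apply(R1, P))  # R1 first, then R2
r2_then_r1 = apply(R1, apply(R2, P))  # R2 first, then R1

print(r1_then_r2, r2_then_r1)
```

The two orders of application send $P$ to different points, $(0,-1,0)$ and $(1,0,0)$, which is the non-commutativity described above.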
J. P. Ward, Quaternions and Cayley Numbers
© Kluwer Academic Publishers 1997
What we have found, in the natural progression from ordinary numbers to complex
numbers, is that both have exactly the same algebra: in particular the associative and
commutative relations for addition and multiplication hold true for complex numbers.
However, the order property of ordinary numbers is meaningless when applied to complex
numbers and must be abandoned. The example above, on finite rotations, illustrates that
if complex numbers are to be generalised to apply to describe 3—dimensional rotations
then the algebra of the generalised 'objects' is likely to have a 'product' which is non-
commutative.
Complex Numbers and Quaternions
We shall find it convenient to write a complex number in the form
$$z = 1a + jb \qquad j^2 = -1$$
where $a, b$ are ordinary numbers (scalars). The complex number is made up of two parts: a scalar part $1a$ and a vector part $jb$. The usual constructions can now be introduced: the conjugate $\bar{z}$, the norm $N_z$ and the angle $\theta$
$$\bar{z} = 1a - jb \qquad N_z = a^2 + b^2 \qquad \cos\theta = \frac{a}{\sqrt{N_z}} \qquad \sin\theta = \frac{b}{\sqrt{N_z}}$$
Then every complex number can be written in polar form: $z = \sqrt{N_z}(\cos\theta + j\sin\theta)$. Other common properties can now be deduced:
$$z^{-1} = \frac{\bar{z}}{N_z} \qquad N_{zw} = N_z N_w \qquad N_{z/w} = \frac{N_z}{N_w}$$
The quaternion was introduced by Hamilton. His initial attempt to generalise the complex numbers, by introducing a 3-dimensional object (of the form $q = 1a + ib + jc$), failed in the sense that the algebra he constructed for these 3-dimensional objects did not have the desired properties. In particular it failed to satisfy the 'norm' property $N_{pq} = N_p N_q$. On 16th October 1843 Hamilton discovered that the appropriate generalisation is one in which the scalar (real) axis is left unchanged whereas the vector (imaginary) axis is supplemented by adding two further vector axes. The reader might find it helpful to think of the scalar axis as representing 'time' and the three vector axes as representing 'space'. The basic algebraic form for a quaternion $q$ is:
$$q = 1a + ib + jc + kd$$
where $a, b, c, d$ are ordinary numbers. The vector space is regarded as the usual 3-dimensional vector space with 'unit vectors' $i$, $j$ and $k$.
What properties should we expect/demand for these new objects? We might certainly wish that the rules for addition and multiplication by a scalar should mirror those for complex numbers. Thus if $q = a + bi + cj + dk$ and $q' = a' + b'i + c'j + d'k$ are any two quaternions then equality, addition and multiplication by a scalar are defined trivially:
equality: $q = q'$ only if $a = a'$, $b = b'$, $c = c'$, $d = d'$
addition: $q + q' = a + a' + (b + b')i + (c + c')j + (d + d')k = q' + q$
multiplication by a scalar $s \in \mathbb{R}$: $sq = sa + sbi + scj + sdk$
Clearly addition is associative: $q + (p + h) = (q + p) + h$ and the usual properties of multiplication by a scalar are satisfied:
$$(s + t)q = sq + tq, \qquad s(q + p) = sq + sp \qquad s, t \in \mathbb{R}$$
If the scalar and vector parts of $q$ are $S_q$ and $V_q$ respectively, defined as:
$$S_q = a \qquad V_q = bi + cj + dk$$
then, for any quaternion $q$:
$$q = S_q + V_q$$
By analogy with complex numbers the conjugate of $q$, denoted by $\bar{q}$, is:
$$\bar{q} = S_q - V_q$$
whilst the norm of $q$, $N_q$, is:
$$N_q = a^2 + b^2 + c^2 + d^2.$$
We can also associate an angle $\theta$ with quaternion $q$:
$$\cos\theta = \frac{a}{\sqrt{N_q}} \qquad \sin\theta = \frac{\sqrt{b^2 + c^2 + d^2}}{\sqrt{N_q}}$$
This is a sensible definition since, obviously,
$$-1 \leq \cos\theta \leq +1 \qquad -1 \leq \sin\theta \leq +1 \qquad \cos^2\theta + \sin^2\theta = 1$$
If $N_q = 1$ the quaternion is called a unit quaternion. Every quaternion can be written in 'polar' form:
$$q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta)$$
where $\hat{q}$ is a unit vector.
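The polar form can be computed from the four components. This is a minimal sketch assuming a quaternion is stored as four real numbers `(a, b, c, d)`; the function names are hypothetical, and the case of a zero vector part, for which the unit vector is undefined, is rejected explicitly.

```python
import math

def polar_form(a, b, c, d):
    """Return (N_q, theta, unit vector) for q = a + bi + cj + dk."""
    norm = a * a + b * b + c * c + d * d
    vec_len = math.sqrt(b * b + c * c + d * d)
    if vec_len == 0.0:
        raise ValueError("the unit vector is undefined when the vector part is zero")
    # atan2 gives cos(theta) = a / sqrt(N_q), sin(theta) = vec_len / sqrt(N_q)
    theta = math.atan2(vec_len, a)
    unit = (b / vec_len, c / vec_len, d / vec_len)
    return norm, theta, unit

def from_polar(norm, theta, unit):
    """Rebuild (a, b, c, d) from sqrt(N_q) (cos(theta) + unit * sin(theta))."""
    root = math.sqrt(norm)
    return (root * math.cos(theta),) + tuple(root * math.sin(theta) * u
                                             for u in unit)

q = (1.0, 2.0, 2.0, 4.0)              # arbitrary sample quaternion, N_q = 25
norm, theta, unit = polar_form(*q)
rebuilt = from_polar(norm, theta, unit)
print(norm, math.cos(theta), unit, rebuilt)
```

With this construction $\sin\theta$ is never negative, so $\theta$ lies between $0$ and $\pi$; the direction information is carried entirely by the unit vector.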
Quaternionic Multiplication
The crucial question concerns the product of one quaternion with another. We will be guided in our choice for the definition of multiplication by our wish to retain $j^2 = -1$ (and thus, by 'symmetry', $i^2 = -1$, $k^2 = -1$ also). These three basic products must hold since, if our 3-dimensional vector space reduces to a 1-dimensional vector space (that is, if $b = c = 0$ or if $c = d = 0$ or if $b = d = 0$), the quaternion should reduce to a complex number with all its attendant algebraic properties. We shall also wish to associate rotations with multiplication (in some form) just as we have done for complex numbers. There are two types of rotation that we can consider: in the four-dimensional space of the 'general' quaternion $q = a + bi + cj + dk$ or in the more familiar three-dimensional space of so-called 'pure' quaternions $bi + cj + dk$. We first consider the three-dimensional rotations in the space spanned by $i, j, k$. If $w$, $q'$ are given quaternions and if $q'$ is expressed in polar form:
$$q' = \sqrt{N_{q'}}\,(\cos\phi + \hat{q}'\sin\phi)$$
then we might suspect/hope that the product of $q'$ with $w$ (written $q'w$) would imply a rotation of the vector part of $w$ about $\hat{q}'$ through an angle $\phi$. (This turns out not to be true.) Instead let us denote by
$$q' * w$$
the quaternion obtained by rotating the vector part of $w$ about $\hat{q}'$ through an angle depending on $\phi$ (as yet unknown). (In complex numbers $z_1 * z_2 = z_1z_2$. That is, in that case the operator $*$ is the direct product. We shall see below that $q' * w = q'w(q')^{-1}$.) Just as multiplying a complex number by a scalar leaves the direction of the complex number unchanged so we demand that multiplying a quaternion by a scalar will imply no rotation; so $s * w = w$ for $s \in \mathbb{R}$. That is, rotations will only occur if $q'$ has a non-zero vector part, $V_{q'} \neq 0$. If we now consider a third quaternion $q$, also expressed in polar form
$$q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta)$$
then, from the constraints already imposed,
$$q * (q' * w)$$
would imply rotation of the vector part of $q' * w$ about axis $\hat{q}$ through some angle depending on $\theta$. In the definition of the quaternion product we shall demand that
$$q * (q' * w) \quad \text{is equivalent to} \quad (qq') * w$$
That is, two consecutive rotations should be equivalent to a single rotation, this being a well known property of 3-dimensional space. There are two important questions we should now ask. What should the angle of rotation be in $q' * w$ and how do we determine $qq'$? We shall make our choice of angle and consequently determine $qq'$ by considering some special cases (in effect by working out $i*i$, $i*j$, etc).
Special case 1 Let $q = i$, $q' = i$ then, as we have demanded, $qq' = (i)^2 = -1$.
The effect of $i * (i * w)$ is to rotate the vector part of a quaternion $w$ through an angle $\lambda$ (say) about the $i$-axis and then through angle $\lambda$ again, also about the $i$-axis. This is to be equivalent to multiplying the quaternion by $-1$ (a quaternion with no vector part), that is, not to rotate it at all. We are thus led to conclude that
$$2\lambda = 0 \text{ or } 2\pi \text{ or } 4\pi, \ldots$$
We choose $\lambda = \pi$ for non-trivial effects. That is, the operation $i*$ rotates the vector part of a quaternion through angle $\pi$ about the $i$-axis. This is to be contrasted with the operation $i*$ in the complex plane which only implies a planar rotation through $\pi/2$. By symmetry we expect similar effects when operating by $j*$ or by $k*$.
Special case 2 Now consider the choice $q = i$, $q' = j$. What is $ij$?
If we consider a point $P$ in 3-dimensional space with coordinates $(-\sigma, \rho, \epsilon)$ then the vector $OP$, which could be the vector part of a quaternion, transforms into $P' : (\sigma, \rho, -\epsilon)$ when operated on by $j*$ (rotation of $\pi$ about the $j$-axis) and then into $P'' : (\sigma, -\rho, \epsilon)$ when then operated on by $i*$ (rotation of $\pi$ about the $i$-axis). But by inspection (see Figure 2.2), this is equivalent to either a single operation by $k*$ (rotation of $\pi$ about the $k$-axis), or an operation by $-k*$ (rotation of $-\pi$ about the $k$-axis).
Figure 2.2
Thus we are led to conclude that $(ij)* = \pm(k)*$
If we reverse the product and consider $ji$ we find $P \to Q' : (-\sigma, -\rho, -\epsilon)$ after operating by $i*$ then $Q' \to Q'' = P'' : (\sigma, -\rho, \epsilon)$ after operating by $j*$. Again we can only deduce the value of the product $ji$ ($= \pm k$) up to a sign. We do not make the choice that $ij = ji$ as we are specifically looking for a product which is not commutative. Now both $i^2$ and $i^4$ imply rotations through whole multiples of 360°: but what is the significance of the difference in signs: $i^2 = -1$, $i^4 = +1$? We need to distinguish (in terms of rotations) between the minus sign obtained from the operator $i^2$ (or from $j^2$ or from $k^2$) and the plus sign obtained from the operator $i^4$. If we can manage to do this we shall be able to decide whether $ij = +k$ or $ij = -k$. A mechanism for explaining the difference between operators $i^2$ and $i^4$ exists and is known as the quaternion demonstrator [8].
Quaternion Demonstrator (The Belt Trick)
A demonstration of the properties of quaternionic multiplication is obtained by using a
disk, with distinctive sides, allowed to swing from a ribbon, also having distinctive sides.
See Figure 2.3. It is the existence of the ribbon, which undergoes twisting when the disk
rotates that reflects the distinction in the operators i2 and i4. The reader is strongly urged
to construct the disk as described here so that s/he can confirm directly the claims that
are made in this section. A similar, but less versatile, demonstrator can be made with a
plate balanced on the hand. This is less satisfying as a demonstrator as the rotation of
the plate is not quite so obvious as it is with the disk.
Figure 2.3 (front view and back view of the demonstrator)
In Figure 2.4 the disk is rotated about the i-axis through 360°, thereby returning the disk
to its original configuration. This demonstrates the operator $i^2 = -1$. (In the plate/arm
combination there is a twist in the arm, which can be removed by a similar further complete
rotation.)
Figure 2.4 ($i^2 = -1$: although the disk is back to the original configuration there is a twist in the ribbon)
As a result of the rotation there is of course a twist in the ribbon which cannot be removed
unless the operation is reversed or unless the operation is repeated (demonstrating the
effect of the operator $i^4 = +1$; see Figure 2.5).
Figure 2.5 ($i^4 = +1$: the disk is again back to the original configuration; the apparent twists in the ribbon can be untangled without any further rotation of the disk)
However, in this case the twists in the ribbon can be removed, without changing the
intrinsic rotation of the disk, by allowing the tension in the ribbon to relax and by moving
the disk along the circular path shown in Figure 2.6.
Figure 2.6
Thus we realise, using this demonstrator, that though $i^2$ and $i^4$ both leave the disk
unaffected, $i^2$ is different from $i^4$ in that the former leaves a single twist in the ribbon
(denoted by $-1$) whilst the second operator leaves the ribbon untwisted (denoted by $+1$)
after the planar motion described in Figure 2.6 is carried through. In Figure 2.7 the
demonstrator exhibits the operators $k^2 = -1$ and $k^4 = +1$.
Figure 2.7 ($k^2 = -1$: the disk is back to the original configuration with a twist in the ribbon opposite to that obtained with i; $k^4 = +1$: the disk is again back to the original configuration and the apparent twists in the ribbon can be untangled without further rotation of the disk)
Figure 2.8 (multiplication by i and by j taken in the two possible orders: one order is identical to multiplication by −k, the other to multiplication by k)
A similar operation to that described above will remove the apparent twists in the ribbon
without changing the intrinsic rotation of the disk. The operators $j^2 = -1$ and $j^4 = +1$
are (pictorially) identical to those described in Figures 2.4, 2.5. Finally, we use the
demonstrator to show that ji = −k. Rotating the disk through $\pi$ about the i axis and then
through $\pi$ about the j axis produces the second picture of Figure 2.8. This is identical to
the picture obtained by rotation through $-\pi$ about the k axis. As seen in Figure 2.8 the
difference between ij and ji lies in the twist on the ribbon; the twists have different orientations.
2.2 Quaternion Algebra
Our considerations with rotations and particularly with the demonstrator lead us to
conclude that the natural generalisation of the 2-dimensional complex form: z = a + jb, is
not to a 3-dimensional object but is the 4-dimensional quaternion q:
$$q = a + bi + cj + dk$$
such that $i^2 = j^2 = k^2 = -1$ and $ij = k$, $ji = -k$ (the last two relations being cyclic).
The set of quaternion numbers is denoted by $\mathbb{H}$. Using these basic products we can now
expand the product of two quaternions to give (assuming for the moment that the product
is distributive with respect to addition):
$$\begin{aligned}
qq' &= (a + bi + cj + dk)(a' + b'i + c'j + d'k) \\
&= (Sq + Vq)(Sq' + Vq') \\
&= Sq\,Sq' + Sq'\,Vq + Sq\,Vq' + Vq\,Vq' \\
&= Sq\,Sq' - Vq \cdot Vq' + Sq'\,Vq + Sq\,Vq' + Vq \wedge Vq'
\end{aligned}$$
where we have used the usual dot and cross products in vector analysis. We see that the
quaternionic product contains all the products of vector analysis: products of two scalars;
products of scalars with vectors; the dot product and the cross product. Since, in general,
$Vq \wedge Vq' \neq Vq' \wedge Vq$, the quaternion product is not commutative unless Vq is parallel to
Vq' or one of q, q' has zero vector part. Although the product is not commutative it is
associative. If $q, p, h \in \mathbb{H}$ then the product q(ph) is:
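$$\begin{aligned}
q(ph) &= (Sq + Vq)(S(ph) + V(ph)) \\
&= Sq\,Sp\,Sh - \{Sq\,Vp \cdot Vh + Sp\,Vh \cdot Vq + Sh\,Vq \cdot Vp\} \\
&\quad + \{Sq\,Sh\,Vp + Sh\,Sp\,Vq + Sp\,Sq\,Vh\} \\
&\quad + \{Sq\,Vp \wedge Vh - Sp\,Vh \wedge Vq + Sh\,Vq \wedge Vp\} \\
&\quad - (Vp \cdot Vh)Vq + Vq \wedge (Vp \wedge Vh) - Vq \cdot (Vp \wedge Vh)
\end{aligned}$$
and the product (qp)h is:
$$\begin{aligned}
(qp)h &= Sq\,Sp\,Sh - [Sh\,Vq \cdot Vp + Sq\,Vp \cdot Vh + Sp\,Vq \cdot Vh] \\
&\quad + [Sq\,Sp\,Vh + Sh\,Sq\,Vp + Sh\,Sp\,Vq] \\
&\quad + [Sq\,Vp \wedge Vh + Sp\,Vq \wedge Vh + Sh\,Vq \wedge Vp] \\
&\quad - (Vq \cdot Vp)Vh + (Vq \wedge Vp) \wedge Vh - (Vq \wedge Vp) \cdot Vh
\end{aligned}$$
The scalar triple products agree, since $Vq \cdot (Vp \wedge Vh) = (Vq \wedge Vp) \cdot Vh$. Thus, to establish q(ph) = (qp)h, we need to show
$$-(Vp \cdot Vh)Vq + Vq \wedge (Vp \wedge Vh) = -(Vq \cdot Vp)Vh + (Vq \wedge Vp) \wedge Vh$$
But this is easily shown to be true if we employ the standard vector identities
$$a \wedge (b \wedge c) = (a \cdot c)b - (a \cdot b)c \qquad (a \wedge b) \wedge c = (a \cdot c)b - (b \cdot c)a$$
so that
$$q(ph) = (qp)h$$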
This property could have been checked directly, without appeal to vector algebra, by
evaluating all possible combinations of the products (pq)r, p(qr) when p, q, r are basis
elements 1, i, j or k.
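The direct check on basis elements mentioned above is small enough to carry out by machine. The following is a minimal sketch, assuming a quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d); the names `qmul` and `associative_on_basis` are illustrative and do not come from the text.

```python
from itertools import product


def qmul(p, q):
    """Hamilton product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


# Basis elements 1, i, j, k as 4-tuples (an assumed representation).
ONE, I, J, K = (1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)


def associative_on_basis():
    """Return True if (pq)r == p(qr) for all 64 triples of basis elements."""
    basis = (ONE, I, J, K)
    return all(
        qmul(qmul(p, q), r) == qmul(p, qmul(q, r))
        for p, q, r in product(basis, repeat=3)
    )


print(associative_on_basis())
print(qmul(I, J), qmul(J, I))
```

Because the product is bilinear, associativity on the basis elements extends to all quaternions; integer components keep every comparison exact.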
We can also confirm that the quaternion product is distributive with respect to addition:
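$$\begin{aligned}
q(p + h) &= (Sq + Vq)[Sp + Sh + Vp + Vh] \\
&= Sq\,Sp + Sq\,Sh - Vq \cdot (Vp + Vh) \\
&\quad + Sh\,Vq + Sp\,Vq + Sq(Vp + Vh) + Vq \wedge (Vp + Vh) \\
&= \{Sq\,Sp - Vq \cdot Vp + Sp\,Vq + Sq\,Vp + Vq \wedge Vp\} \\
&\quad + \{Sq\,Sh - Vq \cdot Vh + Sq\,Vh + Sh\,Vq + Vq \wedge Vh\} \\
&= qp + qh.
\end{aligned}$$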
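It is now a simple matter to determine the following properties of quaternions (an overbar denotes the quaternion conjugate, $\bar{q} = Sq - Vq$):
$$S(qp) = Sq\,Sp - Vq \cdot Vp = S(pq)$$
$$N_q = q\bar{q} = (Sq + Vq)(Sq - Vq) = (Sq)^2 + Vq \cdot Vq = (Sq)^2 + N_{Vq}$$
$$\overline{qp} = \{Sq\,Sp - Vq \cdot Vp - Sp\,Vq - Sq\,Vp - Vq \wedge Vp\} = \bar{p}\,\bar{q}$$
Also the important property that the norm of a product is equal to the product of the norms:
$$N_{pq} = pq\,\overline{pq} = pq\,\bar{q}\,\bar{p} = p\bar{p}\,q\bar{q} = N_p N_q$$
If a quaternion q has a non-zero norm $N_q$ then the inverse $q^{-1}$ is defined by $q^{-1} = \bar{q}/N_q$. It
quickly follows that
$$qq^{-1} = q^{-1}q = 1 \qquad N_{q^{-1}} = \frac{1}{N_q} \qquad (pq)^{-1} = q^{-1}p^{-1}.$$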
A quaternion q is said to be pure if S(q) = 0. We also note that
$$qp - pq = Vq \wedge Vp - Vp \wedge Vq = 2\,Vq \wedge Vp$$
and so if p is a quaternion which commutes with every other quaternion then Vp = 0 and
so p is a real number.
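The norm and inverse properties above can be spot-checked with exact rational arithmetic. This is a minimal sketch, assuming the tuple representation (a, b, c, d) for a + bi + cj + dk; the helper names `conj`, `norm` and `inverse` and the sample values are illustrative choices, not taken from the text.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    """Quaternion conjugate: keep the scalar part, negate the vector part."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def norm(q):
    """N_q = q * conj(q), which is the sum of the squared components."""
    return sum(x * x for x in q)


def inverse(q):
    """q^{-1} = conj(q) / N_q, defined only for non-zero norm."""
    n = norm(q)
    if n == 0:
        raise ZeroDivisionError("the zero quaternion has no inverse")
    return tuple(Fraction(x, n) for x in conj(q))


# Arbitrary sample quaternions with integer components.
p = (1, 2, -1, 3)
q = (2, 0, 1, -2)
pq = qmul(p, q)

print(norm(pq), norm(p) * norm(q))
print(inverse(pq) == qmul(inverse(q), inverse(p)))
```

Using `Fraction` avoids floating-point rounding, so the identities can be tested with exact equality.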
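It should be noted that a quaternion $q = \alpha + \beta$ (scalar part $\alpha$, vector part $\beta$) can always be written as the (quaternion)
product of two vectors a, b:
$$q = ab = -a \cdot b + a \wedge b$$
In fact there are an infinite number of choices that can be made for a, b: explicitly, choosing
b such that $b \cdot \beta = 0$, then
$$a = \frac{b \wedge \beta - \alpha b}{b \cdot b} \quad \text{gives} \quad a \cdot b = -\alpha \ \text{ and } \ a \wedge b = \beta$$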
This result is used explicitly in showing (in Section 2.6) that any 3-dimensional rotation
can be formed from two successive reflections.
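A numerical check of this factorisation may be helpful. The sketch below assumes the choice a = (b∧β − αb)/(b·b) with b·β = 0, and uses the fact that the quaternion product of two vectors a, b has scalar part −a·b and vector part a∧b. The function names and sample values are illustrative only.

```python
from fractions import Fraction


def dot(u, v):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Ordinary cross product u ^ v of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def vector_product(a, b):
    """Quaternion product of two vectors: (scalar part, vector part)."""
    return (-dot(a, b), cross(a, b))


def factor(alpha, beta, b):
    """Given q = alpha + beta and a vector b with b.beta = 0, return a with ab = q."""
    if dot(b, beta) != 0:
        raise ValueError("b must be perpendicular to beta")
    bb = dot(b, b)
    b_cross_beta = cross(b, beta)
    return tuple(Fraction(b_cross_beta[i] - alpha * b[i], bb) for i in range(3))


# Sample quaternion q = 2 + (3j + 4k); b = i is perpendicular to its vector part.
alpha, beta = 2, (0, 3, 4)
b = (1, 0, 0)
a = factor(alpha, beta, b)

print(a, vector_product(a, b))
```

Any other b perpendicular to β works equally well, which is the infinite freedom of choice referred to in the text.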
Example Application to Analytical Geometry. Find the quaternion equation of a
straight line and hence deduce its shortest distance from the origin.
Solution
If b is a vector parallel to a line which passes through the end-point of the position vector
a then its equation is
$$r - a = \alpha b \qquad \alpha \in \mathbb{R}$$
where r is the position vector of a point on the line (see Figure 2.9) with respect to origin
O.
Figure 2.9
If we define the moment M of the line about O:
$$M = a \wedge b$$
then
$$(r - a) \wedge b = 0 \quad \text{or} \quad r \wedge b = M$$
is the equation of the line. Now we can interpret b as a pure quaternion and so
$$(r \wedge b)b^{-1} = Mb^{-1} \quad \text{where} \quad b^{-1} = -\frac{b}{N_b}$$
Therefore
$$-(r \wedge b) \cdot b^{-1} + (r \wedge b) \wedge b^{-1} = Mb^{-1}$$
$$-b^{-1} \wedge (r \wedge b) = Mb^{-1}$$
that is
$$r - \frac{(b \cdot r)\,b}{N_b} = Mb^{-1}$$
therefore
$$r = (M - b \cdot r)\,b^{-1} = (M - \gamma)\,b^{-1} \quad \text{where} \quad \gamma = b \cdot r$$
Now $Mb^{-1}$ and b are orthogonal since
$$S((Mb^{-1})b^c) = -S((Mb^{-1})b) = -S(M) = 0$$
Also, we can determine $N_r$, it being the square of the length of the vector r. Since the cross terms vanish by this orthogonality,
$$N_r = rr^c = (Mb^{-1} - \gamma b^{-1})(Mb^{-1} - \gamma b^{-1})^c = N_{Mb^{-1}} + \frac{\gamma^2}{N_b}$$
Since, for a given line, $Mb^{-1}$ is a fixed quaternion, we deduce that the minimum value of
$N_r$ occurs when $\gamma = 0$ (as is obvious geometrically) and has value $N_{Mb^{-1}}$, where $Mb^{-1}$ is easily
determined as:
$$Mb^{-1} = a + (b \cdot a)b^{-1}$$
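As a numerical illustration of this result, the sketch below evaluates the closest point a + (b·a)b⁻¹, taking b⁻¹ = −b/(b·b) for a pure quaternion b, and compares its length with the classical value |a∧b|/|b|. The function names and the sample line are assumptions made for the example.

```python
import math


def dot(u, v):
    """Ordinary dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    """Ordinary cross product u ^ v of two 3-vectors."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def closest_point(a, b):
    """Point of the line r = a + t b nearest the origin: a + (b.a) b^{-1}."""
    t = dot(a, b) / dot(b, b)
    # b^{-1} = -b / (b.b), so (b.a) b^{-1} = -t b
    return tuple(ai - t * bi for ai, bi in zip(a, b))


def shortest_distance(a, b):
    """Length of the closest point, i.e. the square root of its norm."""
    foot = closest_point(a, b)
    return math.sqrt(dot(foot, foot))


# Sample line through a = (1, 2, 2) parallel to the z-axis.
a = (1, 2, 2)
b = (0, 0, 1)
foot = closest_point(a, b)
distance = shortest_distance(a, b)
moment = cross(a, b)

print(foot, distance)
```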
Example Use the quaternion formalism to deduce standard vector identities.
Solution
Let a, b and c be pure quaternions, then
$$ab = -a \cdot b + a \wedge b$$
Thus
$$S(ab) = -a \cdot b = -|a||b| \cos\theta$$
$$V(ab) = a \wedge b = |a||b| \sin\theta\,\hat{n}$$
$$V(ba) = \tfrac{1}{2}(ba - (ba)^c) = \tfrac{1}{2}(ba - ab) = -V(ab)$$
Therefore
$$a \wedge b = -b \wedge a$$
Now we know, since the quaternion product is distributive, a(b + c) = ab + ac; so
$$S(a(b + c)) = S(ab) + S(ac)$$
$$\text{i.e.} \quad a \cdot (b + c) = a \cdot b + a \cdot c$$
Also
$$V(a(b + c)) = V(ab) + V(ac)$$
$$\text{i.e.} \quad a \wedge (b + c) = a \wedge b + a \wedge c$$
Now considering quaternion products of three vectors (using S(pq) = S(qp))
$$S(abc) = S(bca) = S(cab)$$
Now, clearly (as for any quaternion)
$$abc = S(abc) + V(abc)$$
and since abc = a(S(bc) + V(bc))
$$\therefore \ S(abc) = S(aV(bc))$$
This implies
$$a \cdot (b \wedge c) = b \cdot (c \wedge a) = c \cdot (a \wedge b)$$
Also the relation $S(abc) = S[(abc)^c] = -S(cba)$ implies
$$a \cdot (b \wedge c) = -c \cdot (b \wedge a)$$
We also see that
$$V(abc) = aS(bc) + V(aV(bc))$$
But, directly from the definition:
$$\begin{aligned}
V(abc) &= \tfrac{1}{2}[abc - (abc)^c] \\
&= \tfrac{1}{2}[abc - (-c)(-b)(-a)] \\
&= \tfrac{1}{2}[abc + cba] \\
&= \tfrac{1}{2}[abc + bac - bac - bca + bca + cba] \\
&= cS(ab) - bS(ac) + aS(bc)
\end{aligned}$$
Therefore we deduce
$$V(aV(bc)) = cS(ab) - bS(ac)$$
which states in classical vector terminology
$$a \wedge (b \wedge c) = -c(a \cdot b) + b(a \cdot c)$$
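The identities derived above can be confirmed on sample vectors. The following is a minimal sketch, assuming pure quaternions are stored as 4-tuples with zero scalar part; the helpers `S`, `V`, `scale` and `sub` and the sample vectors are illustrative names and values.

```python
def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def S(q):
    """Scalar part of a quaternion."""
    return q[0]


def V(q):
    """Vector part of a quaternion, returned as a pure quaternion."""
    return (0,) + tuple(q[1:])


def scale(s, q):
    """Multiply a quaternion by a real scalar."""
    return tuple(s * x for x in q)


def sub(p, q):
    """Componentwise difference p - q."""
    return tuple(x - y for x, y in zip(p, q))


# Three arbitrary pure quaternions (vectors) with integer components.
a = (0, 1, 2, 3)
b = (0, -1, 0, 2)
c = (0, 4, 1, -2)

# V(a V(bc)) = c S(ab) - b S(ac), the quaternion form of a ^ (b ^ c).
lhs = V(qmul(a, V(qmul(b, c))))
rhs = sub(scale(S(qmul(a, b)), c), scale(S(qmul(a, c)), b))

print(lhs, rhs)
```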
Complexified Quaternions
In this section we describe the use of a quaternion formalism in conjunction with complex
numbers. This is applied to the homogeneous coordinate formulation of the equations of
a point, line and plane. Following Brand [9] the equations of a point, plane and line are
given respectively by (bold symbols denote vectors)
$$\mathbf{r}a = \mathbf{a}_0 \qquad (1)$$
$$\mathbf{r} \cdot \mathbf{a} = a_0 \qquad (2)$$
$$\mathbf{r} \wedge \mathbf{a} = \mathbf{a}_0 \quad \text{in which} \quad \mathbf{a} \cdot \mathbf{a}_0 = 0 \qquad (3)$$
in which $\mathbf{r}$ is the position vector to the entity (point, plane or line) in question. Thus the
point, plane or line can be notated by
$$\text{Point}: (a, \mathbf{a}_0) \qquad \text{Plane}: (\mathbf{a}, a_0) \qquad \text{Line}: (\mathbf{a}, \mathbf{a}_0)$$
these are called the homogeneous coordinates of point, plane and line respectively. (If
the coordinates are multiplied by the same scalar then the entity they determine, through
(1), (2), (3), is unaltered.)
In each case the first homogeneous coordinate cannot vanish. Also when the second
homogeneous coordinate vanishes the origin O is a part of the entity. It is interesting
to note that the shortest distances from point, plane and line to the origin are given by:
$$\frac{|\mathbf{a}_0|}{|a|} \qquad \frac{|a_0|}{|\mathbf{a}|} \qquad \frac{|\mathbf{a}_0|}{|\mathbf{a}|}$$
respectively.
Let us now introduce a complexified quaternion (the next chapter will examine this object
in far greater detail):
$$p = Q + iQ_0 \qquad Q, Q_0 \in \mathbb{H}$$
in which $Q = a + \mathbf{a}$, $Q_0 = a_0 + \mathbf{a}_0$ (scalar parts $a$, $a_0$; vector parts in bold). The four basic quantities: scalar, point, plane and line
are obtained as follows:
(a) $V(Q) = V(Q_0) = 0$: $p \in \mathbb{C}$, a scalar.
(b) $V(Q) = 0$, $S(Q_0) = 0$: $p = a + i\mathbf{a}_0 \leftrightarrow (a, \mathbf{a}_0)$ represents a point.
(c) $S(Q) = 0$, $V(Q_0) = 0$: $p = \mathbf{a} + ia_0 \leftrightarrow (\mathbf{a}, a_0)$ represents a plane.
(d) $S(Q) = 0$, $S(Q_0) = 0$: $p = \mathbf{a} + i\mathbf{a}_0 \leftrightarrow (\mathbf{a}, \mathbf{a}_0)$ represents a line.
Note that point and plane are 'dual' in the sense that one is obtained from the other by
multiplying through by i; but scalar and line are 'self-dual' objects.
Now any two entities determine a third; two points determine a line, two planes determine
a line, two lines determine a point and a point and a line determine a plane.
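Let us consider first two distinct points with quaternionic representation (bold symbols denote vectors):
$$p = \alpha + i\mathbf{a}_0 \qquad q = \beta + i\mathbf{b}_0$$
then it is easily checked that these two points determine the line with homogeneous
coordinates
$$(\alpha\mathbf{b}_0 - \beta\mathbf{a}_0,\ \mathbf{a}_0 \wedge \mathbf{b}_0)$$
But, as is quickly confirmed, the expression $iV(pq^*)$ (in which an asterisk * represents
complex conjugate) gives the real and imaginary parts representing a line:
$$iV(pq^*) = \alpha\mathbf{b}_0 - \beta\mathbf{a}_0 + i(\mathbf{a}_0 \wedge \mathbf{b}_0) \ \leftrightarrow\ (\alpha\mathbf{b}_0 - \beta\mathbf{a}_0,\ \mathbf{a}_0 \wedge \mathbf{b}_0)$$
Thus, in quaternionic terms, the line connecting p, q is given by $iV(pq^*)$.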
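Secondly, considering two non-parallel planes (bold symbols denote vectors)
$$P = i\alpha_0 + \mathbf{a} \qquad Q = i\beta_0 + \mathbf{b}$$
These determine a line with homogeneous coordinates
$$(\mathbf{a} \wedge \mathbf{b},\ \beta_0\mathbf{a} - \alpha_0\mathbf{b})$$
Again, in quaternionic terms this is neatly expressed as $V(P^*Q)$:
$$V(P^*Q) = \mathbf{a} \wedge \mathbf{b} + i(\beta_0\mathbf{a} - \alpha_0\mathbf{b})$$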
Much more complicated are the quaternionic expressions for the plane determined by a
point and a line. Here $p = \alpha + i\mathbf{a}_0$ represents a point and $h = \mathbf{b} + i\mathbf{b}_0$ represents a line (bold symbols denote vectors). The
homogeneous coordinates of the plane determined by these entities are $(\alpha\mathbf{b}_0 - \mathbf{a}_0 \wedge \mathbf{b},\ \mathbf{a}_0 \cdot \mathbf{b}_0)$.
But, in quaternionic terms, these terms arise as the combination
$$\tfrac{1}{2}[h^*p - ph]$$
Finally, we can contemplate the point determined by the line $h = \mathbf{a} + i\mathbf{a}_0$ and the plane
$P = i\beta_0 + \mathbf{b}$. The homogeneous coordinates of the point of intersection are $(\mathbf{a} \cdot \mathbf{b},\ \beta_0\mathbf{a} - \mathbf{a}_0 \wedge \mathbf{b})$
which, in quaternionic terms, are obtained as the real and imaginary parts of
$$\tfrac{1}{2}[hP + Ph^*]$$
2.3 The Exponential Form and Root Extraction
In the algebra of quaternions we have had to accept that multiplication is not commutative.
In complex number algebra positive real numbers and negative real numbers are essentially
treated in the same way; both have just two square roots. (In fact, in the invention of
complex number multiplication the definition arg(zw) = arg z + arg w is chosen precisely to
ensure symmetrical treatment of positive and negative real numbers. Had the definition
been chosen as arg(zw) = arg z − arg w, which is possible if all one wants is a definition
consistent with ordinary real number multiplication, then positive reals and negative
reals are treated differently. With this second definition complex multiplication is not
commutative, positive real numbers have an infinite number of square roots and any other
number has no square roots at all!) We shall see that in quaternion algebra positive
real numbers are again distinguished from negative real numbers. However, in quaternion
algebra, whereas positive real numbers have the expected two square roots, negative real
numbers now, instead of having no square roots, have an infinite number of them! To
obtain these results it is convenient to consider the 'polar' form of a quaternion (where $\hat{q}$
is a unit vector):
$$q = \sqrt{N_q}\,(\cos\phi + \hat{q}\sin\phi)$$
If this is a unit quaternion with zero scalar part then $N_q = 1$ and $\phi = \pi/2$; that is $q = \hat{q}$.
Then using the rules of quaternionic multiplication:
$$\hat{q}^2 = -\hat{q} \cdot \hat{q} + \hat{q} \wedge \hat{q} = -1$$
Thus since the direction of $\hat{q}$ is arbitrary there are an infinite number of quaternion square
roots of −1. This corresponds to the fact that performing two 180° rotations of the disk
in the demonstrator about an arbitrary axis $\hat{q}$ will bring the disk back into its original
configuration but will leave a twist in the ribbon. It is easy to show that any quaternion
square root of −1 must take this form. For example let
$$q = a + b\hat{q} \qquad q^2 = -1$$
Then, from the second of these equations:
$$(N_q)^2 = N_{q^2} = N_{(-1)} = 1 \quad \therefore \quad N_q = 1$$
Now since $V(q^2) = V(-1) = 0$ and as
$$q^2 = a^2 - b^2 + 2ab\hat{q}$$
this implies that ab = 0. Since b = 0 would require $a^2 = -1$, we conclude that a = 0 and b = ±1 and so (absorbing the sign into the choice of $\hat{q}$) $q = \hat{q}$.
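The claim that every unit pure quaternion squares to −1 can be illustrated with a few rational unit vectors. This is a minimal sketch, assuming the tuple representation (a, b, c, d); the sample axes are arbitrary choices with exactly unit length.

```python
from fractions import Fraction as F


def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


# Unit vectors with rational components, so the arithmetic is exact.
axes = [
    (F(3, 5), F(4, 5), F(0)),
    (F(2, 7), F(3, 7), F(6, 7)),
    (F(1, 3), F(2, 3), F(2, 3)),
]

# Square each unit pure quaternion 0 + u.
squares = [qmul((0, *u), (0, *u)) for u in axes]

print(squares)
```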
Since, as we have seen, a pure vector unit quaternion $\hat{q}$ satisfies $\hat{q}^2 = -1$ it follows that
the plane (depending upon two real parameters s, v),
$$q = s + v\hat{q}$$
(for fixed $\hat{q}$) is isomorphic to (a copy of) the complex plane:
$$w = s + vj$$
Within the plane, quaternion multiplication reduces to complex multiplication. This view
allows a quaternion to be written in exponential form, as if it were a complex number.
Exponential form of a Quaternion
Let q be a quaternion:
$$q = \sqrt{N_q}\,(\cos\theta + e\sin\theta)$$
This can be written in the form (cf [9]):
$$q = \sqrt{N_q}\,e^{e\theta}$$
since, in common with complex numbers, this only requires use of the property $e^2 = -1$.
We cannot deduce $e^{e\theta}\,e^{e'\phi} = e^{e\theta + e'\phi}$ (with $e'$ a second unit pure quaternion) since this can only be deduced by using commutativity.
However we can obtain the usual Euler formula:
$$e^{e\theta} = \cos\theta + e\sin\theta$$
from which the De Moivre theorem is deduced,
$$(\cos\theta + e\sin\theta)^n = \cos n\theta + e\sin n\theta$$
This is now used to deduce the nth root of a quaternion. Let $w = \sqrt{N_w}\,(\cos\phi + e\sin\phi)$ be
a given quaternion and let q be its nth root:
$$q^n = (\sqrt{N_q})^n(\cos n\theta + e\sin n\theta) = \sqrt{N_w}\,(\cos\phi + e\sin\phi)$$
Here we are assuming that the nth root is in the same 'complex plane'. If $\sin\phi \neq 0$ we
have a non-degenerate quaternion and q can be found in the usual way as for complex
numbers:
$$\sqrt{N_q} = (\sqrt{N_w})^{1/n} \qquad \theta = (\phi + 2k\pi)/n \qquad k = 0, 1, \ldots, n-1$$
However, if $\sin\phi = 0$ then we can choose e in q arbitrarily. There are two cases to consider.
(i) $\phi = 0$, that is, w is a real number, w > 0. Here
$$\theta = \frac{2k\pi}{n} \qquad k = 0, 1, \ldots, n-1$$
If n = 2 then there are only two distinct values $\theta = 0, \pi$ and so there are just two square
roots of a real number, $\pm\sqrt{w}$. If n > 2 then, since e is arbitrary, there are an infinite
number of nth roots.
(ii) $\phi = \pi$. Here w is again a real number, w < 0. This case has been described above.
There are an infinite number of nth roots of a negative number.
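The non-degenerate case can be sketched numerically. The code below assumes w has a non-zero vector part, takes e to be the unit vector along V(w), and returns the n roots lying in the same 'complex plane' according to the modulus and angle formulae above. The function names `nth_roots` and `qpow` and the sample quaternion are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def qpow(q, n):
    """q raised to a non-negative integer power by repeated multiplication."""
    result = (1.0, 0.0, 0.0, 0.0)
    for _ in range(n):
        result = qmul(result, q)
    return result


def nth_roots(w, n):
    """The n roots of w that lie in the plane spanned by 1 and V(w)."""
    s, v = w[0], w[1:]
    vnorm = math.sqrt(sum(x * x for x in v))
    if vnorm == 0:
        raise ValueError("w is real: the axis e is arbitrary (degenerate case)")
    e = [x / vnorm for x in v]                     # unit vector along V(w)
    modulus = math.hypot(s, vnorm) ** (1.0 / n)    # (sqrt(N_w))^(1/n)
    phi = math.atan2(vnorm, s)                     # angle of w, in (0, pi)
    roots = []
    for k in range(n):
        theta = (phi + 2 * math.pi * k) / n
        roots.append(
            (modulus * math.cos(theta), *(modulus * math.sin(theta) * x for x in e))
        )
    return roots


w = (1.0, 2.0, -2.0, 1.0)
cube_roots = nth_roots(w, 3)

for root in cube_roots:
    print(root, qpow(root, 3))
```

For real w the function deliberately refuses to pick an axis, mirroring the remark that e may then be chosen arbitrarily.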
2.4 Frobenius' Theorem
As we saw in Section 1.7, an associative algebra A is said to be a division algebra if for
any $0 \neq a \in A$ the equations
$$ac = u \qquad ba = u$$
always possess solutions (it is easy to show that, then, b = c). It immediately follows that
a division algebra A contains no divisors of zero: if ab = 0 then either a = 0 or b = 0.
The following theorem highlights the important central role played by quaternions in the
area of associative division algebras over the field of real numbers.
Frobenius' Theorem
If A is an associative division algebra over the field of reals $\mathbb{R}$ then A is isomorphic to
one of $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$.
Proof
Let us assume that A is a division algebra of order n (dimension) over $\mathbb{R}$. It follows that any
(n + 1) elements of A are linearly dependent. In particular if $x \in A$ then
$$i_1, x, x^2, \ldots, x^n$$
are linearly dependent. Here $i_1$ is the unit element: $i_1 s = s i_1 = s \ \ \forall s \in A$. Thus there
exist scalars $\alpha_j \in \mathbb{R}$, not all zero, such that
$$\alpha_0 i_1 + \alpha_1 x + \ldots + \alpha_n x^n = 0$$
By the fundamental theorem of algebra this can be factorised:
$$F_1(x)F_2(x)\ldots = 0$$
where $F_1, F_2, \ldots$ are linear or quadratic factors with real coefficients. Since a division
algebra contains no divisors of zero we can conclude that at least one of $F_1, F_2, \ldots$ is zero.
If the zero factor happens to be linear then its square is quadratic. We deduce that every
$x \in A$ is a root of a quadratic equation with real coefficients.
In particular if $e_1 = i_1, e_2, e_3, \ldots, e_n$ are the basis elements of A then we can conclude
$$e_j^2 + 2\beta_j e_j + \gamma_j i_1 = 0 \qquad \beta_j, \gamma_j \in \mathbb{R}$$
but, completing the square:
$$(e_j + \beta_j)^2 = (\beta_j^2 - \gamma_j)i_1 \qquad \beta_j^2 - \gamma_j \in \mathbb{R}$$
Now define a new set of basis elements
$$(e_1, e'_2, e'_3, \ldots, e'_n) = (e_1, e_2 + \beta_2, e_3 + \beta_3, \ldots, e_n + \beta_n)$$
then
$$(e'_j)^2 = (\beta_j^2 - \gamma_j)i_1 \qquad \beta_j^2 - \gamma_j \in \mathbb{R}$$
Now either $\beta_j^2 - \gamma_j \geq 0$ or $\beta_j^2 - \gamma_j < 0$. If $\beta_j^2 - \gamma_j \geq 0$ then this would imply that $e'_j$ and
$e_1 = i_1$ were dependent (writing $\beta_j^2 - \gamma_j = c^2$, the product $(e'_j - c\,i_1)(e'_j + c\,i_1)$ vanishes, and the absence of divisors of zero forces $e'_j = \pm c\,i_1$), so we must conclude that
$$(e'_j)^2 = -\alpha_j^2 i_1 \quad \text{for some } \alpha_j \in \mathbb{R}$$
We now introduce yet another basis:
$$(e_1, E_2, E_3, \ldots, E_n) = \left(e_1, \frac{e'_2}{\alpha_2}, \frac{e'_3}{\alpha_3}, \ldots, \frac{e'_n}{\alpha_n}\right)$$
so as to ensure
$$E_j^2 = -i_1 \qquad \text{each } j = 2, 3, \ldots, n$$
The remaining part of the proof is to show that contradictions arise in all cases other than
n = 1, 2, 4.
These contradictions arise primarily because of the constraint that every $x \in A$ is the root
of a real quadratic equation.
case 1 n = 1. This is the algebra generated by the single basis element $e_1$ and is clearly
$\mathbb{R}$ itself.
case 2 n = 2. Here the algebra is that generated by two elements $(e_1, E_2)$ in which
$E_2^2 = -i_1$ and is clearly the field $\mathbb{C}$ of complex numbers.
We therefore assume that n > 2 and consider the algebra generated by $(e_1, E_2, E_3, \ldots)$.
Since all elements of the algebra are the roots of a real quadratic equation then, in
particular, so are $E_2 \pm E_3$. Now
$$(E_2 + E_3)^2 = E_2^2 + E_2E_3 + E_3E_2 + E_3^2 = -2i_1 + E_2E_3 + E_3E_2$$
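But
$$(E_2 + E_3)^2 - \alpha(E_2 + E_3) - \beta i_1 = 0 \quad \text{for some } \alpha, \beta \in \mathbb{R}$$
that is
$$-2i_1 + E_2E_3 + E_3E_2 = \alpha(E_2 + E_3) + \beta i_1 \qquad \alpha, \beta \in \mathbb{R}$$
Similarly, working with $E_2 - E_3$
$$-2i_1 - E_2E_3 - E_3E_2 = \gamma(E_2 - E_3) + \delta i_1 \qquad \gamma, \delta \in \mathbb{R}$$
Adding, we obtain:
$$(\alpha + \gamma)E_2 + (\alpha - \gamma)E_3 + (\beta + \delta + 4)i_1 = 0$$
and since $i_1, E_2, E_3$ are linearly independent then
$$\alpha = \gamma = 0 \quad \text{and} \quad \beta + \delta + 4 = 0$$
Thus $E_2E_3 + E_3E_2 = 2\epsilon i_1$ (say) in which $\epsilon \in \mathbb{R}$. Then
$$(E_2 + E_3)^2 = (2\epsilon - 2)i_1 \qquad (E_2 - E_3)^2 = (-2\epsilon - 2)i_1$$
Now, following the same argument as above both $(E_2 + E_3)^2$ and $(E_2 - E_3)^2$ must be a
positive multiple of $-i_1$. That is, $\epsilon - 1 < 0$ and $-\epsilon - 1 < 0$, so that
$$(\epsilon - 1)(-\epsilon - 1) > 0 \quad \text{that is} \quad 1 - \epsilon^2 > 0$$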
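We now introduce new basis vectors I, J, suitably normalised combinations of $E_2$ and $E_3$; for example, one choice with the required properties is $I = (E_2 + E_3)/\sqrt{2 - 2\epsilon}$, $J = (E_2 - E_3)/\sqrt{2 + 2\epsilon}$.
Then $I^2 = -i_1$, $J^2 = -i_1$ and $IJ + JI = 0$. However it is easy to show that IJ is linearly
independent of $i_1, I, J$; since if
$$IJ = \alpha i_1 + \beta I + \gamma J \qquad \alpha, \beta, \gamma \in \mathbb{R}$$
then, on multiplying through by I
$$\begin{aligned}
I(IJ) = -J &= \alpha I - \beta + \gamma IJ \\
&= \alpha I - \beta + \gamma(\alpha i_1 + \beta I + \gamma J)
\end{aligned}$$
from which it follows that $-1 = \gamma^2$ which implies $\gamma \notin \mathbb{R}$: a contradiction.
This result shows that a division algebra consisting of three basis elements $i_1, I, J$ is
impossible, since it always gives rise to a fourth independent element IJ. If we write
K = IJ then this algebra is generated by $i_1, I, J, K$ which satisfy
$$IK = I(IJ) = -J \qquad KI = (IJ)I = -(JI)I = J$$
$$KJ = (IJ)J = -I \qquad JK = J(IJ) = J(-JI) = I$$
$$K^2 = (IJ)(IJ) = -(IJ)(JI) = -I^2J^2 = -i_1$$
This is the familiar algebra of real quaternions.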
If we now consider n > 4 then there is a fifth basis element $E_5$, such that $E_5^2 = -i_1$.
Proceeding as above we easily deduce
$$IE_5 + E_5I = \alpha i_1, \quad JE_5 + E_5J = \beta i_1, \quad KE_5 + E_5K = \gamma i_1 \qquad \alpha, \beta, \gamma \in \mathbb{R}$$
Then
$$\begin{aligned}
E_5K = E_5(IJ) &= (E_5I)J \\
&= (\alpha i_1 - IE_5)J \\
&= \alpha J - I(\beta i_1 - JE_5) \\
&= \alpha J - \beta I + KE_5
\end{aligned}$$
Hence, adding $E_5K$ to both sides:
$$\begin{aligned}
2E_5K &= \alpha J - \beta I + KE_5 + E_5K \\
&= \alpha J - \beta I + \gamma i_1
\end{aligned}$$
$$\therefore \ 2(E_5K)K = \alpha(JK) - \beta(IK) + \gamma K$$
$$\text{that is} \quad -2E_5 = \alpha I + \beta J + \gamma K$$
which contradicts the linear independence of I, J, K, $E_5$. Hence an associative division
algebra only exists if n = 1, 2, 4. If the constraint that the algebra be associative is relaxed
then, as we shall see in Chapter 4, we obtain the 8-dimensional Cayley numbers.
2.5 Inner Product for Quaternions
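Under the operation of addition, the set of quaternions $\mathbb{H}$ forms an abelian group. This,
together with the rule for multiplying such an element by a scalar $s \in \mathbb{R}$, shows that the
quaternion numbers constitute a linear space. We can give an inner product structure to
this space if we define:
$$\langle p, q \rangle = S(p\bar{q})$$
The four basic inner product axioms are easily checked.
IP1 $\langle p, q \rangle = S(p\bar{q}) = S(\overline{p\bar{q}}) = S(q\bar{p}) = \langle q, p \rangle$
IP2 $\langle p, q + r \rangle = S(p\,\overline{(q + r)}) = S(p\bar{q} + p\bar{r}) = \langle p, q \rangle + \langle p, r \rangle$
IP3 $\alpha\langle p, q \rangle = \langle \alpha p, q \rangle = \langle p, \alpha q \rangle \qquad \alpha \in \mathbb{R}$
IP4 $\langle p, p \rangle = S(p\bar{p}) = N_p \geq 0$, only vanishing if p = 0.
Thus $\langle p, q \rangle = S(p\bar{q})$ is a suitable inner product.
We have already associated an angle with a given quaternion. Using the inner product we
can now naturally define an angle $\lambda$ between two quaternions p, q to be such that:
$$\cos\lambda = \frac{\langle p, q \rangle}{\sqrt{N_p}\sqrt{N_q}}$$
This is a meaningful definition of an angle as $-1 \leq \cos\lambda \leq 1$ for all p, q. This is verified
by using the Schwarz inequality:
$$\langle p, p \rangle \langle q, q \rangle \geq \langle p, q \rangle^2 \qquad \forall\, p, q \in \mathbb{H}$$
Now $\sqrt{N_p} \geq |S(p)|$ and so $\sqrt{N_p}\sqrt{N_q} = \sqrt{N_{p\bar{q}}} \geq |S(p\bar{q})|$ and the required result follows.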
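Using this we can easily deduce that the angle $\theta$ of a quaternion number p is the angle $\omega$
subtended by S(p) and p:
$$\cos\omega = \frac{S(S(p)\bar{p})}{\sqrt{(S(p))^2}\sqrt{N_p}} = \frac{S(p)}{\sqrt{N_p}} = \cos\theta$$
Note also that if
$$p = \sqrt{N_p}\,(\cos\theta + \hat{p}\sin\theta) \quad \text{and} \quad q = \sqrt{N_q}\,(\cos\phi + \hat{q}\sin\phi)$$
where $\hat{p}$, $\hat{q}$ are unit norm pure quaternion numbers, then the angle $\lambda$ between p, q is such
that
$$\begin{aligned}
\cos\lambda &= \frac{S(p\bar{q})}{\sqrt{N_p}\sqrt{N_q}} \\
&= S[(\cos\theta + \hat{p}\sin\theta)(\cos\phi - \hat{q}\sin\phi)] \\
&= \cos\theta\cos\phi - \sin\theta\sin\phi\,S(\hat{p}\hat{q})
\end{aligned}$$
But if $\delta$ is the angle between $\hat{p}$ and $\hat{q}$ then $\cos\delta = S(\hat{p}\,\bar{\hat{q}}) = -S(\hat{p}\hat{q})$. Therefore
$$\cos\lambda = \cos\theta\cos\phi + \sin\theta\sin\phi\cos\delta$$
This is the Cosine Law for spherical triangles discussed more fully in Section 2.9.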
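The Cosine Law can be checked numerically from the inner product. The sketch below assumes unit quaternions cos θ + p̂ sin θ built from sample angles and axes; the names `inner` and `unit` and the numerical values are illustrative.

```python
import math


def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def conj(q):
    """Quaternion conjugate."""
    a, b, c, d = q
    return (a, -b, -c, -d)


def inner(p, q):
    """Inner product <p, q> = S(p * conj(q))."""
    return qmul(p, conj(q))[0]


def unit(angle, axis):
    """Unit quaternion cos(angle) + axis * sin(angle) for a unit 3-vector axis."""
    return (math.cos(angle), *(math.sin(angle) * x for x in axis))


theta, phi = 0.7, 1.1                       # sample angles of p and q
p_hat, q_hat = (1.0, 0.0, 0.0), (0.6, 0.8, 0.0)
p, q = unit(theta, p_hat), unit(phi, q_hat)

cos_lambda = inner(p, q)                    # both norms are 1
cos_delta = sum(u * v for u, v in zip(p_hat, q_hat))
cosine_law = (math.cos(theta) * math.cos(phi)
              + math.sin(theta) * math.sin(phi) * cos_delta)

print(cos_lambda, cosine_law)
```

Note that S(p q̄) is simply the sum of componentwise products, so the inner product coincides with the Euclidean dot product on the four components.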
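We say p, q are perpendicular if $S(p\bar{q}) = 0$ and are parallel if $V(p\bar{q}) = 0$. The scalar
and vector parts of a quaternion number are perpendicular since:
$$\begin{aligned}
S(S(p)\,\overline{Vp}) &= -\tfrac{1}{4}S((p + \bar{p})(p - \bar{p})) \\
&= -\tfrac{1}{4}S(pp - \bar{p}\bar{p}) \quad \text{since } S(p\bar{p}) = S(\bar{p}p) \\
&= -\tfrac{1}{8}\left(pp - \bar{p}\bar{p} + \overline{[pp - \bar{p}\bar{p}]}\right) \\
&= -\tfrac{1}{8}(pp - \bar{p}\bar{p} + \bar{p}\bar{p} - pp) = 0
\end{aligned}$$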
2.6 Quaternions and Rotations in 3- and 4-Dimensions
Four-dimensional Rotations
We now consider how quaternions can be used to describe rotations in 3- and 4-dimensional
space. Perhaps surprisingly, it is easier to consider rotations in 4-dimensional space first
and then to utilise these rotations to effect a rotation in 3-dimensions.
Because of the lack of commutativity there are two types of product in quaternion algebra;
left and right multiplication. Thus, for a given quaternion x (and q such that $N_q = 1$), we
can consider the two maps: $\phi_L : x \mapsto qx$ or $\phi_R : x \mapsto xq$ depending as we multiply
on the left or the right. In either case, considering $\mathbb{H}$ to be $\mathbb{R}^4$ spanned by the usual
basis elements, $\mathbb{R}^4 = \text{span}\{1, i, j, k\}$, then these two maps are linear transformations from
$\mathbb{R}^4$ to $\mathbb{R}^4$. We suspect that both these maps correspond to rotations since, as is easy to
show, they are norm and angle preserving. For example, considering the map $\phi_L$ we have
already seen that if $x, y, q \in \mathbb{H}$ and $N_q = 1$ then
$$N_{qx} = N_q N_x = N_x$$
Also if x, y subtend an angle $\lambda$ then $\cos\lambda = \dfrac{S(x\bar{y})}{\sqrt{N_x}\sqrt{N_y}}$ and after multiplication by q we
have (using the properties of the scalar part of a quaternion noted earlier)
$$\begin{aligned}
\cos\lambda' &= \frac{S(qx\,\overline{qy})}{\sqrt{N_{qx}}\sqrt{N_{qy}}} = \frac{S(qx\bar{y}\bar{q})}{N_q\sqrt{N_x}\sqrt{N_y}} = \frac{S(\bar{y}\bar{q}qx)}{N_q\sqrt{N_x}\sqrt{N_y}} \\
&= \frac{N_q S(\bar{y}x)}{N_q\sqrt{N_x}\sqrt{N_y}} = \frac{S(x\bar{y})}{\sqrt{N_x}\sqrt{N_y}} = \cos\lambda
\end{aligned}$$
A similar result is obtained if we consider $\phi_R$. Quaternionic multiplication preserves the
norm and the included angle. All that is left to show that this operation is a rotation is to
show that it is orientation preserving. To see this we shall consider the matrix representation
of the maps
$$\phi_L : \mathbb{R}^4 \to \mathbb{R}^4, \ \ \phi_L(x) = qx \qquad \phi_R : \mathbb{R}^4 \to \mathbb{R}^4, \ \ \phi_R(x) = xq$$
$\mathbb{R}^4$ is spanned by the standard basis elements
$$1 = (1,0,0,0), \quad i = (0,1,0,0), \quad j = (0,0,1,0), \quad k = (0,0,0,1)$$
with the usual quaternionic rules of combination. Now if q = a + bi + cj + dk with $N_q = 1$
then
$$\begin{aligned}
\phi_L(1) &= a + bi + cj + dk & \phi_R(1) &= a + bi + cj + dk \\
\phi_L(i) &= -b + ai + dj - ck & \phi_R(i) &= -b + ai - dj + ck \\
\phi_L(j) &= -c - di + aj + bk & \phi_R(j) &= -c + di + aj - bk \\
\phi_L(k) &= -d + ci - bj + ak & \phi_R(k) &= -d - ci + bj + ak
\end{aligned}$$
Therefore the matrix representations of the linear transformations $\phi_L$, $\phi_R$ are, respectively
$$A_{\phi_L} = \begin{pmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{pmatrix} \qquad A_{\phi_R} = \begin{pmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{pmatrix}$$
It is easily checked that both these matrices are orthogonal:
$$A_{\phi_L}A_{\phi_L}^T = I \qquad A_{\phi_R}A_{\phi_R}^T = I$$
and $\det(A_{\phi_L}) = 1$, $\det(A_{\phi_R}) = 1$. Hence $A_{\phi_L} \in SO(4)$, $A_{\phi_R} \in SO(4)$ so that the operations
qx, xq represent rotations of x in $\mathbb{R}^4$. The angle of rotation (using $\phi_L$ say) is easily
determined. This is the angle $\omega$ between x and qx (with $q = \cos\theta + \hat{q}\sin\theta$):
$$\cos\omega = \frac{S(x\,\overline{qx})}{\sqrt{N_x}\sqrt{N_{qx}}} = \frac{S(x\bar{x}\bar{q})}{N_x} = S(\bar{q}) = S(q) = \cos\theta$$
so that the angle of rotation $\omega$ is the angle of q.
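The orthogonality and determinant claims for the two matrices can be verified exactly for a sample unit quaternion. This is a minimal sketch in plain Python, assuming the tuple representation (a, b, c, d); `left_matrix`, `right_matrix`, `gram`, `det` and `matvec` are illustrative helper names.

```python
from fractions import Fraction


def qmul(p, q):
    """Hamilton product of two quaternions stored as 4-tuples."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )


def left_matrix(q):
    """Matrix of x -> qx in the basis (1, i, j, k)."""
    a, b, c, d = q
    return [[a, -b, -c, -d], [b, a, -d, c], [c, d, a, -b], [d, -c, b, a]]


def right_matrix(q):
    """Matrix of x -> xq in the basis (1, i, j, k)."""
    a, b, c, d = q
    return [[a, -b, -c, -d], [b, a, d, -c], [c, -d, a, b], [d, c, -b, a]]


def matvec(m, x):
    """Apply a 4x4 matrix to a 4-tuple."""
    return tuple(sum(row[j] * x[j] for j in range(4)) for row in m)


def gram(m):
    """Compute m times its transpose."""
    return [[sum(r1[k] * r2[k] for k in range(4)) for r2 in m] for r1 in m]


def det(m):
    """Determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum(
        (-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
        for j in range(len(m))
    )


IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

q = (Fraction(1, 2),) * 4      # sample unit quaternion (1 + i + j + k)/2
x = (1, 2, 3, 4)               # arbitrary quaternion to act on

print(gram(left_matrix(q)) == IDENTITY, det(left_matrix(q)), det(right_matrix(q)))
```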
Geometry of 4-dimensional rotations
It is possible to break up a 4-dimensional rotation into simpler, simultaneous rotations in
two orthogonal planes [10]. To see this consider, as above, $q = \cos\theta + \hat{q}\sin\theta$, $N_q = 1$,
and let x be any quaternion. Now
$$qx = x\cos\theta + (\hat{q}x)\sin\theta \quad \text{and} \quad q(\hat{q}x) = \hat{q}x\cos\theta - x\sin\theta$$
since $\hat{q}^2 = -1$. If we define $x' = \hat{q}x$ then:
$$S(x\,\overline{\hat{q}x}) = S(-x\bar{x}\hat{q}) = -x\bar{x}\,S(\hat{q}) = 0$$
confirming that x, x' are orthogonal and from the results just obtained:
$$qx = x\cos\theta + x'\sin\theta$$
$$qx' = -x\sin\theta + x'\cos\theta$$
which shows that the effect of quaternion multiplication on the left is to produce a
positive rotation of elements of the plane containing x, x' through an angle of $\theta$. The plane
$ax + bx'$, $a, b \in \mathbb{R}$, remains invariant under left multiplication. Rotations occur within it.
As a special case, if we choose x = 1, then $x' = \hat{q}$ and this result shows that the plane
containing the elements 1 and $\hat{q}$ is also invariant:
$$q1 = \cos\theta + \hat{q}\sin\theta$$
$$q\hat{q} = -\sin\theta + \hat{q}\cos\theta$$
so that elements within this plane are rotated through an angle $\theta$. We can also show that
the plane (in the space of pure quaternions) perpendicular to $\hat{q}$ also remains invariant. To
confirm this let $v, w, \hat{q}$ (regarded as vectors) form a right-handed, mutually orthogonal,
system.
$$v \cdot w = 0 \qquad v \cdot \hat{q} = 0 \qquad w \cdot \hat{q} = 0$$
$$v \wedge w = \hat{q} \qquad w \wedge \hat{q} = v \qquad \hat{q} \wedge v = w$$
Now, if we choose x = v then $x' = \hat{q}v = \hat{q} \wedge v = w$ and so
$$\begin{aligned}
qv &= v\cos\theta + \hat{q}v\sin\theta \\
&= v\cos\theta + w\sin\theta \\
qw &= w\cos\theta + \hat{q}w\sin\theta \\
&= -v\sin\theta + w\cos\theta
\end{aligned}$$
indicating that elements in the plane containing v, w have been positively rotated through
the same angle $\theta$. Similar deductions can be made about quaternion multiplication on the
right. Here
$$xq = x\cos\theta + x''\sin\theta$$
$$x''q = -x\sin\theta + x''\cos\theta$$
where $x'' = x\hat{q}$ so that elements in the plane containing x, x'' are rotated through an angle
$\theta$. Now choosing x = 1 then $x'' = \hat{q}$ shows that the plane containing 1 and $\hat{q}$ is invariant
under right multiplication:
$$1q = \cos\theta + \hat{q}\sin\theta$$
$$\hat{q}q = -\sin\theta + \hat{q}\cos\theta$$
which indicates a positive rotation through angle $\theta$; that is, in the same direction as that
generated by a left multiplication. Also, choosing x = v then $x'' = v\hat{q} = v \wedge \hat{q} = -w$ and
so
$$\begin{aligned}
vq &= v\cos\theta + v\hat{q}\sin\theta \\
&= v\cos\theta - w\sin\theta \\
wq &= w\cos\theta + w\hat{q}\sin\theta \\
&= v\sin\theta + w\cos\theta
\end{aligned}$$
which indicates (in the plane containing v, w) a negative rotation through angle $\theta$. This
analysis shows that a 4-dimensional rotation is comprised of two simultaneous rotations:
of elements in the plane containing the scalar axis and $\hat{q}$ and of elements in the space of
pure quaternions perpendicular to $\hat{q}$.
It should now be clear that by choosing an appropriate combination of right and left
multiplications of quaternions we can produce a rotation in the space of pure
quaternions alone; in effect a 3-dimensional spatial rotation. In fact if we consider
multiplication on the right by $q^{-1}$ instead of q then a rotation through angle $\theta$ (in
the opposite direction to that obtained from a left multiplication) occurs in the plane
containing 1 and $\hat{q}$, whilst a rotation through angle $\theta$ occurs in the plane containing v, w
in the same direction as that resulting from a left multiplication. Thus the combined
operation $qxq^{-1}$ leaves elements within the plane containing 1 and $\hat{q}$ fixed whilst those in
the plane of w, v undergo a rotation through an angle $2\theta$ about the $\hat{q}$ axis in the space of
pure quaternions.
Three-Dimensional Rotations
Readers may find it interesting to see that this interpretation can be confirmed directly,
without recourse to utilising rotations in 4-dimensions. Let us consider the transformation
$$\phi_q(x) = x' = qxq^{-1}$$
in which:
$$x = \sqrt{N_x}\,(\cos\phi + \hat{x}\sin\phi) \qquad q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta) \qquad \hat{x}^2 = -1, \ \hat{q}^2 = -1$$
Because the angle of a quaternion can be viewed as that subtended between the quaternion
and its scalar part we can construct the following diagram.
Figure 2.10 (the angle of a quaternion measured from the scalar axis)
We shall show directly that this transformation geometrically describes a rotation of the
vector part of x about the vector part of q through an angle $2\theta$.
First, it is easy to show that V(q) is left fixed by the mapping:
$$\begin{aligned}
\phi_q(V(q)) = qV(q)q^{-1} &= (S(q) + V(q))\left\{V(q)\tfrac{1}{N_q}(S(q) - V(q))\right\} \\
&= \tfrac{1}{N_q}(S(q) + V(q))[S(q)V(q) - V(q)V(q)] \\
&= \tfrac{1}{N_q}(S(q) + V(q))[(S(q) - V(q))V(q)] \\
&= q[q^{-1}V(q)] = V(q) \quad \text{using associativity}
\end{aligned}$$
We are therefore justified in referring to V(q) (or $\hat{q}$) as the axis of rotation. It is also easy
to show that the norm and scalar part of x are conserved.
$$N_{x'} = N_q N_{xq^{-1}} = N_q N_x N_{q^{-1}} = N_x$$
since $N_{q^{-1}} = \dfrac{1}{N_q}$. Also, using the general property S(pq) = S(qp):
$$S(x') = S(q(xq^{-1})) = S((xq^{-1})q) = S(x)$$
Now
$$qxq^{-1} = q(S(x) + V(x))q^{-1} = qS(x)q^{-1} + qV(x)q^{-1} = S(x) + qV(x)q^{-1}$$
and since $S(qxq^{-1}) = S(x)$ then $S(qV(x)q^{-1}) = 0$. Therefore $qV(x)q^{-1}$ is a pure quaternion
and so
$$V(qxq^{-1}) = qV(x)q^{-1}$$
Since V(x) is parallel to $\hat{x}$ then V(x') is parallel to $q\hat{x}q^{-1} = \hat{x}'$. Now choose $\hat{p}$ to be a unit
pure quaternion number in the plane with normal $\hat{q}$ (i.e. $S(\hat{q}\hat{p}) = 0$). See Figure 2.11
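Figure 2.11
If $\lambda$ is the angle between $\hat{q}$ and $\hat{x}$, then
$$\hat{x} = \hat{q}\cos\lambda + \hat{p}\sin\lambda$$
The quaternion $\hat{x}$ has a unit norm since (using $\bar{\hat{q}} = \hat{q}^{-1} = -\hat{q}$, $\bar{\hat{p}} = \hat{p}^{-1} = -\hat{p}$)
$$\begin{aligned}
\hat{x}\bar{\hat{x}} &= (\hat{q}\cos\lambda + \hat{p}\sin\lambda)(-\hat{q}\cos\lambda - \hat{p}\sin\lambda) \\
&= -\hat{q}^2\cos^2\lambda - \hat{p}^2\sin^2\lambda - \sin\lambda\cos\lambda\,(\hat{p}\hat{q} + \hat{q}\hat{p}) \\
&= \cos^2\lambda + \sin^2\lambda = 1
\end{aligned}$$
and $\hat{p}\hat{q} = -\hat{q}\hat{p}$ as $\hat{p}$, $\hat{q}$ are perpendicular.
Now
$$\begin{aligned}
\hat{x}' = q\hat{x}q^{-1} &= q(\hat{q}\cos\lambda + \hat{p}\sin\lambda)q^{-1} \\
&= q\hat{q}q^{-1}\cos\lambda + q\hat{p}q^{-1}\sin\lambda
\end{aligned}$$
We shall show that $q\hat{q}q^{-1} = \hat{q}$ and $q\hat{p}q^{-1}$ is a pure quaternion which revolves through an
angle $2\theta$ about $\hat{q}$. The first part is relatively easy. Since V(q) and $\hat{q}$ are parallel then
$$\begin{aligned}
V(V(q)\hat{q}) &= \tfrac{1}{2}\{V(q)\hat{q} - \overline{V(q)\hat{q}}\} \\
&= \tfrac{1}{2}[V(q)\hat{q} - \hat{q}V(q)] = 0
\end{aligned}$$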
therefore
qqq-1 = (S(q) + V(q))[S(q)q - qV(q)}±
Nq
= ^S2(q) + S(q)[V(q)q - qV{q)} - V(q)qV(q)}
= ±-{qS\q)-V(q)qV(q)}
But V-\q) = ZM
NV(q)
••• V(q)qV(q) = -Nv{q)V(q)qV-1(q)
= -NV(q)q
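The second part is developed along similar lines.
\[
\begin{aligned}
q\hat{p}q^{-1} &= (S(q) + V(q))\,[\hat{p}(S(q) - V(q))]\,\frac{1}{N_q} \\
&= \frac{1}{N_q}\{\hat{p}S^2(q) + S(q)[V(q)\hat{p} - \hat{p}V(q)] - V(q)\hat{p}V(q)\}
\end{aligned}
\]
However, since $V(q)$ and $\hat{p}$ are perpendicular, using $V(q)\hat{p} = -\hat{p}V(q)$:
\[ V(q)\hat{p}V(q) = N_{V(q)}\,\hat{p} \]
\[ \therefore\; q\hat{p}q^{-1} = \frac{1}{N_q}\{(S^2(q) - N_{V(q)})\,\hat{p} + S(q)[V(q)\hat{p} - \hat{p}V(q)]\} \]
Let $\hat{p}' = q\hat{p}q^{-1}$; then we show that $\hat{p}'$ is perpendicular to $\hat{q}$ (showing that $\hat{p}$ has been rotated
about $\hat{q}$). To do this we need to show $S[(q\hat{p}q^{-1})\hat{q}] = 0$. Now $S(\hat{p}\hat{q}) = -\hat{p}\cdot\hat{q} = 0$ by our
original assumption. So all we need to show is that
\[ S[(V(q)\hat{p} - \hat{p}V(q))\hat{q}] = 0 \]
or (equivalently)
\[ S[(\hat{q}\hat{p} - \hat{p}\hat{q})\hat{q}] = 0 \quad \text{or} \quad S(\hat{q}\hat{p}\hat{q}) = 0 \]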
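But $S(\hat{q}\hat{p}\hat{q}) = S(\hat{q}^2\hat{p}) = 0$ so that $\hat{p}'$ is perpendicular to $\hat{q}$. We can also determine the angle
of rotation $\phi$:
\[
\begin{aligned}
\cos\phi = \frac{S(\hat{p}'\hat{p}^c)}{\sqrt{N_{\hat{p}'}}\sqrt{N_{\hat{p}}}} &= S(\hat{p}'\hat{p}^c) = -S[(q\hat{p}q^{-1})\hat{p}] \\
&= -\frac{1}{N_q}(S^2(q) - N_{V(q)})\,S(\hat{p}^2) - \frac{S(q)}{N_q}\,S\{[V(q)\hat{p} - \hat{p}V(q)]\hat{p}\} \\
&= \frac{1}{N_q}(S^2(q) - N_{V(q)}) - \frac{S(q)}{N_q}\,S(-V(q) - V(q)) \\
&= \frac{1}{N_q}(S^2(q) - N_{V(q)}) \\
&= \frac{1}{N_q}(N_q\cos^2\theta - N_q\sin^2\theta) \\
&= \cos 2\theta
\end{aligned}
\]
\[ \therefore\; \phi = 2\theta \]
Finally
\[ \hat{x}' = \hat{q}\cos\lambda + \hat{p}'\sin\lambda \]
where $\hat{p}' = \hat{p}\cos 2\theta + \hat{q}\hat{p}\sin 2\theta$. We note that $\hat{q}\hat{p}$ is perpendicular to both $\hat{q}$ and $\hat{p}$. This
completes the confirmation that $qxq^{-1}$ describes a 3-dimensional rotation of the vector
part of $x$ through an angle $2\theta$ about the vector part of $q$.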
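The result above can be checked numerically. The following is a minimal sketch, assuming quaternions are stored as 4-tuples (a, b, c, d) meaning a + bi + cj + dk; the helper names `qmul`, `qconj` and `rotate` are illustrative and do not come from the text.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qconj(q):
    """Quaternion conjugate; this is the inverse when q has unit norm."""
    return (q[0], -q[1], -q[2], -q[3])

def rotate(q, v):
    """Vector part of q v q^{-1} for a unit quaternion q and a 3-vector v."""
    x = qmul(qmul(q, (0.0,) + tuple(v)), qconj(q))
    return x[1:]

theta = math.radians(30.0)                         # the angle of q
q = (math.cos(theta), 0.0, 0.0, math.sin(theta))   # axis q-hat = k
p_hat = (1.0, 0.0, 0.0)                            # unit vector perpendicular to k
p_rot = rotate(q, p_hat)

# Angle between p_hat and its image, obtained from their scalar product.
phi = math.degrees(math.acos(sum(u * v for u, v in zip(p_hat, p_rot))))
print(p_rot, phi)
```

With the angle of q set to 30 degrees the image of p-hat is turned through 60 degrees in the plane perpendicular to k, in line with the factor of two derived above.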
Example Determine a single rotation equivalent to the two successive rotations:
$R_1$: about the x-axis through an angle 90°
$R_2$: about the z-axis through an angle 45°
Solution
In quaternion terms:
$R_1$: $q_1 = \cos 45° + i\sin 45°$
$R_2$: $q_2 = \cos 22.5° + k\sin 22.5°$
Then
\[
\begin{aligned}
R_2R_1:\; q_2q_1 &= (\cos 22.5° + k\sin 22.5°)(\cos 45° + i\sin 45°) \\
&= \cos 45°\,(\cos 22.5° + i\cos 22.5° + j\sin 22.5° + k\sin 22.5°)
\end{aligned}
\]
Also $N_{q_2q_1} = 1$ since $N_{q_1} = 1$ and $N_{q_2} = 1$. Therefore
\[ q_2q_1 = \cos\phi + \hat{n}\sin\phi \]
where
\[ \cos\phi = \cos 22.5°\cos 45° = \tfrac{1}{2}\left(1 + \tfrac{1}{\sqrt{2}}\right)^{1/2} \]
\[ \sin\phi = \tfrac{1}{\sqrt{2}}\,(1 + \sin^2 22.5°)^{1/2} = \tfrac{1}{2}\left(3 - \tfrac{1}{\sqrt{2}}\right)^{1/2} \]
\[ \tan\phi = \left[\frac{3\sqrt{2} - 1}{\sqrt{2} + 1}\right]^{1/2} = [7 - 4\sqrt{2}]^{1/2} \]
\[ \phi = 49.210° \]
Thus the combined rotation is equivalent to a rotation through 98.42° about an axis
$i + j(\sqrt{2} - 1) + k(\sqrt{2} - 1)$. See Figure 2.12.

Figure 2.12 (label: equivalent rotation axis)
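The arithmetic of this example can be reproduced in a few lines. This is a sketch under the assumption that quaternions are 4-tuples (a, b, c, d) standing for a + bi + cj + dk; `qmul` is an illustrative helper, not notation from the text.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

h1 = math.radians(45.0)    # half of the 90 degree rotation about the x-axis
h2 = math.radians(22.5)    # half of the 45 degree rotation about the z-axis
q1 = (math.cos(h1), math.sin(h1), 0.0, 0.0)
q2 = (math.cos(h2), 0.0, 0.0, math.sin(h2))

q21 = qmul(q2, q1)                           # R1 first, then R2
angle_deg = 2.0 * math.degrees(math.acos(q21[0]))
axis = tuple(c / q21[1] for c in q21[1:])    # scaled so the i-component is 1
print(angle_deg, axis)
```

The printed angle is close to 98.42 degrees and the scaled axis has second and third components equal to sqrt(2) - 1, matching the values quoted in the solution.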
Reflections
We note that if $\hat{q}$ is a unit vector then $\hat{q}^{-1} = -\hat{q}$ and the angle of this unit quaternion is
equal to $\pi/2$. Thus $\hat{q} * w\;(= -\hat{q}w\hat{q})$ describes a rotation through 180° about $\hat{q}$. See Figure
2.13.
Figure 2.13
Similarly $\hat{q} * (-w)\;(= \hat{q}w\hat{q})$ describes a reflection in the plane with normal $\hat{q}$. (Note that
$-\hat{q} * (w)$ is distinct from $\hat{q} * (-w)$.)
Just as every planar rotation can be described in terms of reflections, so it is with three
dimensional rotations. In fact, as we shall show, every three dimensional rotation can
be decomposed into two reflections in planes which intersect in a line forming the axis of
rotation. The angle of rotation is twice the angle between the planes. This result is easily
deduced. Let $\hat{q}$ be the axis of rotation and $w$ a vector. If $w$ is rotated through angle $\theta$
about axis $\hat{q}$ then, after rotation,
\[ w^r = qwq^{-1} \]
where $q = \cos\theta/2 + \hat{q}\sin\theta/2$. Now consider two planes $d$, $d'$ with unit normals $\hat{n}$, $\hat{n}'$
respectively which intersect in $\hat{q}$. That is
\[ \hat{n}\wedge\hat{n}' = \hat{q}\sin\beta \qquad \hat{n}\cdot\hat{n}' = \cos\beta \]
where $\beta$ is the angle between the planes. If $w$ is reflected in $d$ then it becomes $w'$:
\[ w' = \hat{n}w\hat{n} \]
then, reflecting in $d'$:
\[ w'' = \hat{n}'(\hat{n}w\hat{n})\hat{n}' = (\hat{n}'\hat{n})\,w\,(\hat{n}\hat{n}') \]
Now, using the quaternion product for vectors:
\[
\begin{aligned}
\hat{n}\hat{n}' &= -\hat{n}\cdot\hat{n}' + \hat{n}\wedge\hat{n}' = -\cos\beta + \hat{q}\sin\beta \\
\hat{n}'\hat{n} &= -\hat{n}'\cdot\hat{n} + \hat{n}'\wedge\hat{n} = -\cos\beta - \hat{q}\sin\beta
\end{aligned}
\]
But
\[ (\hat{n}'\hat{n})^{-1} = (-\cos\beta - \hat{q}\sin\beta)^{-1} = (-\cos\beta + \hat{q}\sin\beta) = \hat{n}\hat{n}' \]
Therefore
\[ w'' = (\hat{n}'\hat{n})\,w\,(\hat{n}'\hat{n})^{-1} \]
So if we choose $\beta = \theta/2$ then $w'' = w^r$, which proves the statement.
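A numerical illustration of the two-reflection decomposition follows. It is a sketch only: the tuple layout (a, b, c, d) for a + bi + cj + dk and the helper names `qmul`, `reflect` and `rotate` are assumptions made for the example, and the axis is taken along k for convenience.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def reflect(n, w):
    """Vector part of n w n: reflection of w in the plane with unit normal n."""
    nq = (0.0,) + tuple(n)
    return qmul(qmul(nq, (0.0,) + tuple(w)), nq)[1:]

def rotate(q, w):
    """Vector part of q w q^{-1} for a unit quaternion q."""
    qc = (q[0], -q[1], -q[2], -q[3])
    return qmul(qmul(q, (0.0,) + tuple(w)), qc)[1:]

beta = math.radians(25.0)                      # angle between the two planes
n1 = (1.0, 0.0, 0.0)                           # normal of the first plane
n2 = (math.cos(beta), math.sin(beta), 0.0)     # normal of the second plane
w = (1.0, 2.0, 3.0)

twice_reflected = reflect(n2, reflect(n1, w))
# Rotation through theta = 2*beta about k: q = cos(theta/2) + k sin(theta/2).
rotated = rotate((math.cos(beta), 0.0, 0.0, math.sin(beta)), w)
print(twice_reflected, rotated)
```

The two printed vectors agree to rounding error, so reflecting in planes separated by 25 degrees reproduces a 50 degree rotation about their line of intersection.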
We have here examined rotations and reflections directly in geometrical terms. We can
confirm our interpretations by appealing to the properties of the linear space $\mathbb{R}^3$ underlying
this work. The map $\phi$ acting on a pure quaternion $w$:
\[ \phi : \mathbb{R}^3 \to \mathbb{R}^3 \qquad \phi(w) = qwq^{-1} \]
is linear. Without loss of generality we choose $N_q = 1$. Since $\mathbb{R}^3 = \mathrm{span}\{i, j, k\}$, if
$q = a + bi + cj + dk$ then
\[
\begin{aligned}
\phi(i) &= i(a^2 + b^2 - c^2 - d^2) + j(2ad + 2bc) + k(2bd - 2ac) \\
\phi(j) &= i(-2ad + 2bc) + j(a^2 + c^2 - b^2 - d^2) + k(2ab + 2cd) \\
\phi(k) &= i(2ac + 2bd) + j(2cd - 2ab) + k(a^2 + d^2 - b^2 - c^2)
\end{aligned}
\]
so that the matrix representation of the map $\phi$ is
\[
M = \begin{pmatrix}
a^2 + b^2 - c^2 - d^2 & -2ad + 2bc & 2ac + 2bd \\
2ad + 2bc & a^2 + c^2 - b^2 - d^2 & 2cd - 2ab \\
2bd - 2ac & 2ab + 2cd & a^2 + d^2 - b^2 - c^2
\end{pmatrix}
\]
It is easily checked that $M$ is orthogonal: $MM^T = I$ and $\det M = 1$, so that the linear map
$\phi(w) = qwq^{-1}$ represents a rotation in $\mathbb{R}^3$. On the other hand, if we consider the linear
map
\[ \theta : \mathbb{R}^3 \to \mathbb{R}^3 \qquad \theta(w) = qwq \]
then we find the matrix representation $N$ of this map:
\[
N = \begin{pmatrix}
-b^2 + c^2 + d^2 & -2bc & -2bd \\
-2bc & -c^2 + b^2 + d^2 & -2cd \\
-2bd & -2cd & -d^2 + b^2 + c^2
\end{pmatrix}
\]
We see that $N = -M(a = 0)$ is orthogonal with $\det N = -1$ since the dimension of $M$ is
odd. Hence the map $\theta$ is not orientation preserving and represents a reflection (as we saw
above it is a reflection of $w$ in the plane with normal $q$).
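The claims about $M$ and $N$ are easy to test for particular values. The sketch below builds both matrices from the entries given above and checks orthogonality and the determinants; the function names and the sample quaternions are illustrative choices, not taken from the text.

```python
def rotation_matrix(a, b, c, d):
    """Matrix M of w -> q w q^{-1} for a unit quaternion q = a + bi + cj + dk."""
    return [[a*a + b*b - c*c - d*d, 2*b*c - 2*a*d,         2*a*c + 2*b*d],
            [2*a*d + 2*b*c,         a*a + c*c - b*b - d*d, 2*c*d - 2*a*b],
            [2*b*d - 2*a*c,         2*a*b + 2*c*d,         a*a + d*d - b*b - c*c]]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1]*m[2][2] - m[1][2]*m[2][1])
            - m[0][1] * (m[1][0]*m[2][2] - m[1][2]*m[2][0])
            + m[0][2] * (m[1][0]*m[2][1] - m[1][1]*m[2][0]))

def times_transpose(m):
    """Return the product m m^T."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

M = rotation_matrix(0.5, 0.5, 0.5, 0.5)      # a unit quaternion
MMT = times_transpose(M)

# N = -M(a = 0) for a unit pure quaternion, here q = 0.6 i + 0.8 j.
N = [[-x for x in row] for row in rotation_matrix(0.0, 0.6, 0.8, 0.0)]
print(det3(M), det3(N))
```

For these inputs the determinant of M is +1 and that of N is -1, as the argument from the odd dimension predicts.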
2.7 Relation to the Rotation Matrix
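Let $(\underline{x})_i$, $i = 1, 2, 3$ be a given orthonormal system of vectors (which, as the need arises,
will be interpreted as 'pure' quaternions). Since it is an orthonormal system then
\[ (\underline{x})_i \cdot (\underline{x})_j = \delta_{ij} \]
Under a rotation, defined by a given quaternion $q$, this system is transformed into a 'primed'
orthonormal system $(\underline{x})'_i$, $i = 1, 2, 3$ such that:
\[ (\underline{x})'_i = q(\underline{x})_iq^c \qquad qq^c = 1 \]
Now we can write $(\underline{x})'_i$ as a linear combination of the $(\underline{x})_i$ through the use of the rotation
tensor $R_{ij}$:
\[ (\underline{x})'_i = R_{ij}(\underline{x})_j \quad \text{using the summation convention} \]
The tensor $R_{ij}$ defines the rotation just as the quaternion $q$ defines the rotation. We now
show that given $R_{ij}$ we can construct $q$ and vice versa. Clearly
\[ q(\underline{x})_iq^c = R_{ij}(\underline{x})_j \]
Therefore
\[
\begin{aligned}
[q(\underline{x})_iq^c](\underline{x})_k &= R_{ij}(\underline{x})_j(\underline{x})_k \\
&= R_{ij}[-(\underline{x})_j \cdot (\underline{x})_k + (\underline{x})_j \wedge (\underline{x})_k]
\end{aligned}
\]
Thus, taking the scalar part of this equation leads directly to
\[ S[q(\underline{x})_iq^c(\underline{x})_k] = -R_{ik} \]
showing that if we are given $q$ and given $(\underline{x})_i$ we can construct $R_{ij}$.
Now let $q = \alpha + \underline{\beta}$; then $q^c = \alpha - \underline{\beta}$ and, using the quaternion product:
\[ (\underline{x})_i(\underline{x})_i = -(\underline{x})_i \cdot (\underline{x})_i = -3 \]
Also
\[
\begin{aligned}
(\underline{x})_iq^c(\underline{x})_i &= (\underline{x})_i[\alpha - \underline{\beta}](\underline{x})_i \\
&= -3\alpha - (\underline{x})_i\underline{\beta}(\underline{x})_i \\
&= -3\alpha - (\underline{x})_i[-\underline{\beta} \cdot (\underline{x})_i + \underline{\beta} \wedge (\underline{x})_i] \\
&= -3\alpha + (\underline{x})_i[\underline{\beta} \cdot (\underline{x})_i] - (\underline{x})_i \wedge (\underline{\beta} \wedge (\underline{x})_i) \\
&= -3\alpha + 2(\underline{x})_i[\underline{\beta} \cdot (\underline{x})_i] - [(\underline{x})_i \cdot (\underline{x})_i]\underline{\beta}
\end{aligned}
\]
in which the standard formula for the triple vector product has been employed. However,
we can regard $\underline{\beta}$ as a linear combination of the $(\underline{x})_i$, viz. $\underline{\beta} = a_j(\underline{x})_j$, and so
\[ \underline{\beta} \cdot (\underline{x})_i = a_j(\underline{x})_j \cdot (\underline{x})_i = a_i \quad \text{and} \quad 2(\underline{x})_i[\underline{\beta} \cdot (\underline{x})_i] = 2a_i(\underline{x})_i = 2\underline{\beta} \]
Therefore
\[ (\underline{x})_iq^c(\underline{x})_i = -3\alpha + 2\underline{\beta} - 3\underline{\beta} = -4\alpha + q^c \]
Thus
\[ q(\underline{x})_iq^c(\underline{x})_i = q[-4\alpha + q^c] = -4\alpha q + 1 \]
From $(\underline{x})'_i = R_{ij}(\underline{x})_j$ we obtain
\[ R_{ij}(\underline{x})_j(\underline{x})_i = -4\alpha q + 1 \]
Taking the quaternion conjugate of both sides and adding, we have:
\[
\begin{aligned}
R_{ij}(\underline{x})_j(\underline{x})_i + R_{ij}(\underline{x})_i(\underline{x})_j &= -4\alpha q + 1 - 4\alpha q^c + 1 \\
&= -8\alpha^2 + 2
\end{aligned}
\]
But
\[
\begin{aligned}
(\underline{x})_j(\underline{x})_i &= -(\underline{x})_j \cdot (\underline{x})_i + (\underline{x})_j \wedge (\underline{x})_i \\
&= -\delta_{ij} + (\underline{x})_j \wedge (\underline{x})_i
\end{aligned}
\]
Therefore, directly:
\[
\begin{aligned}
R_{ij}(\underline{x})_j(\underline{x})_i + R_{ij}(\underline{x})_i(\underline{x})_j &= R_{ij}[-\delta_{ji} + (\underline{x})_j \wedge (\underline{x})_i - \delta_{ij} + (\underline{x})_i \wedge (\underline{x})_j] \\
&= -2R_{ii}
\end{aligned}
\]
Hence we deduce
\[ 4\alpha^2 = 1 + R_{ii} \]
and from an earlier result
\[ q = \frac{1 - R_{ij}(\underline{x})_j(\underline{x})_i}{4\alpha} = \pm\frac{1}{2}\,\frac{1 - R_{ij}(\underline{x})_j(\underline{x})_i}{\sqrt{1 + R_{ii}}} \]
showing that, given $R_{ij}$ and given $(\underline{x})_j$, we can find $q$ (of course, in the rotation defined by
$q(\underline{x})_iq^c$ this sign is superfluous as $q$ occurs quadratically).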
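The two directions of this construction can be tried out numerically. The sketch below takes the orthonormal system to be i, j, k, builds $R_{ij}$ from a unit quaternion, and then recovers the quaternion from $R_{ij}$ with the positive sign; the function names are illustrative, and the recovery step assumes $1 + R_{ii} \neq 0$ (it fails for a rotation through 180°).

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

BASIS = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]

def rotation_tensor(q):
    """R[i][j] = component of q x_i q^c along x_j, so that x'_i = R_ij x_j."""
    qc = (q[0], -q[1], -q[2], -q[3])
    return [list(qmul(qmul(q, e), qc)[1:]) for e in BASIS]

def quaternion_from_tensor(R):
    """q = (1 - R_ij x_j x_i) / (4 alpha), with alpha = sqrt(1 + R_ii) / 2."""
    total = [1.0, 0.0, 0.0, 0.0]
    for i in range(3):
        for j in range(3):
            prod = qmul(BASIS[j], BASIS[i])
            for k in range(4):
                total[k] -= R[i][j] * prod[k]
    alpha = 0.5 * math.sqrt(1.0 + R[0][0] + R[1][1] + R[2][2])
    return tuple(t / (4.0 * alpha) for t in total)

half = math.radians(20.0)
axis = (1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0)       # a unit vector
q = (math.cos(half),) + tuple(math.sin(half) * c for c in axis)

R = rotation_tensor(q)
q_back = quaternion_from_tensor(R)
print(q, q_back)
```

Because the sample quaternion has a positive scalar part, the positive root returns the same quaternion; starting from -q would give the same tensor and therefore the same recovered value.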
2.8 Matrix Formulation of Quaternions
Because quaternion algebra is associative, quaternions can be represented by matrices.
That is, the map $\phi$ between the space of quaternions and the space of $4 \times 4$ matrices over
the real numbers defined by:
\[
\phi : (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes) \qquad
\phi\{a + ib + jc + kd\} =
\begin{pmatrix}
a & -b & -c & -d \\
b & a & -d & c \\
c & d & a & -b \\
d & -c & b & a
\end{pmatrix}
\]
is an isomorphism (here $\oplus$ is matrix addition and $\otimes$ is matrix multiplication). This map is
suggested by the form taken by the matrix representation of $\phi_L$, which we first considered
when four dimensional rotations were examined. We first demonstrate its homomorphic
properties.
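If $p = a + bi + cj + dk$, $q = \alpha + \beta i + \gamma j + \delta k$ are any two quaternions then:
\[
\begin{aligned}
\phi\{p + q\} &= \phi\{a + \alpha + (b + \beta)i + (c + \gamma)j + (d + \delta)k\} \\
&= \begin{pmatrix}
a + \alpha & -(b + \beta) & -(c + \gamma) & -(d + \delta) \\
b + \beta & a + \alpha & -(d + \delta) & c + \gamma \\
c + \gamma & d + \delta & a + \alpha & -(b + \beta) \\
d + \delta & -(c + \gamma) & b + \beta & a + \alpha
\end{pmatrix} \\
&= \begin{pmatrix}
a & -b & -c & -d \\
b & a & -d & c \\
c & d & a & -b \\
d & -c & b & a
\end{pmatrix}
\oplus
\begin{pmatrix}
\alpha & -\beta & -\gamma & -\delta \\
\beta & \alpha & -\delta & \gamma \\
\gamma & \delta & \alpha & -\beta \\
\delta & -\gamma & \beta & \alpha
\end{pmatrix} \\
&= \phi\{p\} \oplus \phi\{q\}
\end{aligned}
\]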
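\[
\begin{aligned}
\phi\{pq\} &= \phi\{a\alpha - b\beta - c\gamma - d\delta + i[a\beta + c\delta - \gamma d + \alpha b] \\
&\qquad + j[a\gamma + \beta d - b\delta + \alpha c] + k[a\delta + b\gamma - c\beta + \alpha d]\} \\
&= \phi\{A + Bi + Cj + Dk\} \\
&= \begin{pmatrix}
A & -B & -C & -D \\
B & A & -D & C \\
C & D & A & -B \\
D & -C & B & A
\end{pmatrix} \\
&= \begin{pmatrix}
a & -b & -c & -d \\
b & a & -d & c \\
c & d & a & -b \\
d & -c & b & a
\end{pmatrix}
\otimes
\begin{pmatrix}
\alpha & -\beta & -\gamma & -\delta \\
\beta & \alpha & -\delta & \gamma \\
\gamma & \delta & \alpha & -\beta \\
\delta & -\gamma & \beta & \alpha
\end{pmatrix} \\
&= \phi\{p\} \otimes \phi\{q\}
\end{aligned}
\]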
Thus the map $\phi$ is a homomorphism. It is also one-to-one and onto (the set of matrices of this form) and so $\phi$ is an
isomorphism. It is irrelevant whether we work with the quaternions as introduced earlier
or whether we use the matrix form: identical results will be obtained. Quaternion algebra
is the algebra of $4 \times 4$ matrices of this form. We note that the matrix transpose corresponds
to the quaternion conjugate.
The closely related map $\lambda$:
\[
\lambda : (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes) \qquad
\lambda\{a + bi + cj + dk\} =
\begin{pmatrix}
a & -b & -c & -d \\
b & a & d & -c \\
c & -d & a & b \\
d & c & -b & a
\end{pmatrix}
\]
is suggested by the matrix representation of $\phi_R$. Now it is easily verified that $\lambda$ is such
that
\[ \lambda\{p + q\} = \lambda\{p\} \oplus \lambda\{q\} \qquad \lambda\{pq\} = \lambda\{q\} \otimes \lambda\{p\} \]
and so $\lambda$ is not a homomorphism. As the map $\lambda$ is onto and one-to-one it is an example
of an anti-isomorphism.
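The homomorphism and anti-homomorphism properties can be confirmed with exact integer arithmetic. In this sketch the Python functions `phi` and `lam` implement the two matrices displayed above; `qmul`, `matmul` and the sample quaternions are illustrative choices.

```python
def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def phi(p):
    """The 4x4 matrix assigned to p by the map phi."""
    a, b, c, d = p
    return [[a, -b, -c, -d], [b, a, -d, c], [c, d, a, -b], [d, -c, b, a]]

def lam(p):
    """The 4x4 matrix assigned to p by the map lambda."""
    a, b, c, d = p
    return [[a, -b, -c, -d], [b, a, d, -c], [c, -d, a, b], [d, c, -b, a]]

def matmul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

p = (1, 2, 3, 4)
q = (5, -6, 7, 8)
pq = qmul(p, q)

phi_ok = phi(pq) == matmul(phi(p), phi(q))          # order preserved
lam_reversed = lam(pq) == matmul(lam(q), lam(p))    # order reversed
lam_same_order = lam(pq) == matmul(lam(p), lam(q))  # fails for this p, q
conj_is_transpose = phi((1, -2, -3, -4)) == transpose(phi(p))
print(phi_ok, lam_reversed, lam_same_order, conj_is_transpose)
```

The output shows that phi preserves the order of products, that lambda reverses it, and that the transpose of phi{p} is the matrix of the conjugate quaternion.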
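There is a further alternative map $\theta$:
\[
\theta : (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes) \qquad
\theta\{a + bi + cj + dk\} =
\begin{pmatrix}
a & -d & -c & -b \\
d & a & b & -c \\
c & -b & a & d \\
b & c & -d & a
\end{pmatrix}
\]
If $p = a + bi + cj + dk$, $q = \alpha + \beta i + \gamma j + \delta k$ then $\theta\{p + q\} = \theta\{p\} \oplus \theta\{q\}$ is transparently
true. However,
\[
\begin{aligned}
\theta\{pq\} &= \theta\{a\alpha - b\beta - c\gamma - d\delta + i[a\beta + c\delta - \gamma d + \alpha b] \\
&\qquad + j[a\gamma + \beta d - b\delta + \alpha c] + k[a\delta + b\gamma - c\beta + \alpha d]\} \\
&= \theta\{A + Bi + Cj + Dk\} \\
&= \begin{pmatrix}
A & -D & -C & -B \\
D & A & B & -C \\
C & -B & A & D \\
B & C & -D & A
\end{pmatrix} \\
&= \begin{pmatrix}
a & -d & -c & -b \\
d & a & b & -c \\
c & -b & a & d \\
b & c & -d & a
\end{pmatrix}
\otimes
\begin{pmatrix}
\alpha & -\delta & -\gamma & -\beta \\
\delta & \alpha & \beta & -\gamma \\
\gamma & -\beta & \alpha & \delta \\
\beta & \gamma & -\delta & \alpha
\end{pmatrix} \\
&= \theta\{p\} \otimes \theta\{q\}
\end{aligned}
\]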
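(The reader should check this computation.) Therefore the map $\theta$ is a homomorphism.
It is also clearly one-to-one and onto and so $\theta$ is an isomorphism. There is a further
isomorphism between these $4 \times 4$ matrices and $2 \times 2$ matrices over the field of complex
numbers. We note that the $4 \times 4$ matrices of the form considered here can be partitioned
into $2 \times 2$ submatrices each of the form:
\[ \begin{pmatrix} s & t \\ -t & s \end{pmatrix} \]
but these, as we have seen, provide an algebra isomorphic to the complex numbers $\mathbb{C}$.
This implies that the algebra of quaternions can be fully described by the algebra of $2 \times 2$
matrices over the field of complex numbers:
\[ q \leftrightarrow \begin{pmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{pmatrix} \qquad \text{where} \quad \alpha, \beta \in \mathbb{C} \]
\[
\alpha = \begin{pmatrix} a & -d \\ d & a \end{pmatrix} \leftrightarrow a + \sqrt{-1}\,d \qquad
\beta = \begin{pmatrix} c & -b \\ b & c \end{pmatrix} \leftrightarrow c + \sqrt{-1}\,b
\]
which states that once the first column of this $2 \times 2$ matrix is specified then the quaternion $q$
is determined. We note that in this representation the quaternion conjugate is equivalent to
taking an Hermitian conjugate. The isomorphism between quaternions and $2 \times 2$ complex
matrices implies that we can notate the quaternion as
\[ q = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \qquad \alpha, \beta \in \mathbb{C} \]
in which $q = a + ib + jc + kd$. It is quickly checked that, for $p = \begin{pmatrix} \gamma \\ \delta \end{pmatrix}$,
\[ q + p = \begin{pmatrix} \alpha + \gamma \\ \beta + \delta \end{pmatrix} \]
and
\[ qp = \begin{pmatrix} \alpha \\ \beta \end{pmatrix} \times \begin{pmatrix} \gamma \\ \delta \end{pmatrix} = \begin{pmatrix} \alpha\gamma - \beta^*\delta \\ \gamma\beta + \alpha^*\delta \end{pmatrix} \]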
which is (formally) identical to the rule for multiplying complex numbers. We shall return
to this method of referring to a quaternion (and to other 'numbers') later. It is perhaps
worthy of note here that the 2x2 complex matrix representation of quaternions has found
application in quantum mechanics. The quantum mechanical state vector of a fermion
(spin $\tfrac{1}{2}$) has the odd property that when it rotates through 360° it rotates not into itself
but into minus itself. But this behaviour is not odd in quaternion algebra. The quaternion
demonstrator is an example of such behaviour. The fermion 'spin' might be modelled by
the disk with ribbon attached. Rotating once, through 360°, produces the disk with a twist
in the ribbon — noted as (—fermion spin). Rotating twice, through 720°, produces the disk
with no twist in the ribbon — noted as (+fermion spin). The 2x2 matrix representations
of i,j, k are proportional to the Pauli Matrices and are fundamental to the development
of quantum mechanics. Quaternions, of the type introduced here, are associated with
quadratic forms of positive definite signature (i.e. the norm Nq = a2 + b2 + c2 + d2).
They are therefore not suitable for the description of space-time which is associated with
quadratic forms of Lorentzian signature (+,—,—,—). If we allow the use of complex scalars
then this difficulty can be overcome. These 'complex quaternions' of the form:
\[ p = \alpha + \mathrm{i}\underline{\beta} \qquad \alpha \in \mathbb{C}, \quad \underline{\beta} \in \mathbb{C}^3 \]
were introduced earlier when we discussed the application of a quaternion formalism to the
formulation of homogeneous equations of points, lines and planes. They will be considered
again in Chapter 3 and their application to space-time considered. However, we note here
that although complex quaternions might have considerable application to the description
of space-time they, unfortunately, do not constitute a division algebra.
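The $2 \times 2$ complex representation described in this section can be exercised directly. The sketch below assumes the identification $\alpha = a + \sqrt{-1}\,d$, $\beta = c + \sqrt{-1}\,b$ with the matrix having first column $(\alpha, \beta)$ and second column $(-\beta^*, \alpha^*)$; the function names are illustrative. It checks that products are preserved and that a rotation through 360° corresponds to minus the identity, the behaviour mentioned above for fermion states.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def to_matrix(q):
    """2x2 complex matrix with first column (alpha, beta)."""
    a, b, c, d = q
    alpha = complex(a, d)
    beta = complex(c, b)
    return [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]

def matmul2(A, B):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

p = (1, 2, 3, 4)
q = (5, -6, 7, 8)
product_preserved = to_matrix(qmul(p, q)) == matmul2(to_matrix(p), to_matrix(q))

# Rotation through 360 degrees about k: q = cos(180 deg) + k sin(180 deg).
full_turn = to_matrix((math.cos(math.pi), 0.0, 0.0, math.sin(math.pi)))
print(product_preserved, full_turn)
```

The matrix printed for the full turn is minus the identity up to rounding, while a rotation through 720° would return the identity itself.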
2.9 Applications to Spherical Trigonometry
We begin our discussion on spherical trigonometry with some basic definitions.
Definition 1 The intersection of a plane with the surface of a sphere is called a great
circle if the plane passes through the centre of the sphere. Otherwise the curve of
intersection is called a small circle.
Figure 2.14 (labels: axis of small circle, small circle, great circle)
Definition 2 The axis of any circle (small or great) of a sphere is the unique diameter
of the sphere which is perpendicular to the plane of the circle. The extremities of such a
diameter are called the poles of the circle. See Figure 2.15.
Definition 3 The arc-length of the great circle from a point on a small circle to its
nearest pole is known as the spherical radius. See Figure 2.15.
Figure 2.15 (labels: furthest pole, nearest pole, spherical radius)
Definition 4 When two circles intersect (small or great) the angle between the tangents
at either of their points of intersection is simply referred to as the angle between the
circles. Clearly the angle of intersection of two great circles is equal to the inclination of
their planes. See Figure 2.16.
Figure 2.16 (label: $\theta$ is the angle between the circles)
Spherical Triangles
We first obtain a standard result. The shortest path connecting two points on the surface
of a sphere is a great circle arc.
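proof Consider two points $P$, $Q$ on the surface of a sphere of radius $r$, and let $C$ be a
curve on the sphere connecting them with parametric equations
\[ x = x(t), \quad y = y(t), \quad z = z(t) \qquad t_0 \le t \le t_1 \]
Figure 2.17
The length of this curve is
\[ L = \int_{t_0}^{t_1} \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\; dt \]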
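with the constraint that $x^2 + y^2 + z^2 = r^2$, which assumes that the centre of the sphere is
at the origin of coordinates. The constraint is automatically satisfied if we use spherical
polar coordinates $(r, \theta, \phi)$:
\[ x = r\sin\phi(t)\cos\theta(t), \quad y = r\sin\phi(t)\sin\theta(t), \quad z = r\cos\phi(t) \]
We choose, without loss of generality, that the z-axis passes through the point $P$. Then
\[
\begin{aligned}
\frac{dx}{dt} &= r\cos\phi\cos\theta\,\frac{d\phi}{dt} - r\sin\phi\sin\theta\,\frac{d\theta}{dt} \\
\frac{dy}{dt} &= r\cos\phi\sin\theta\,\frac{d\phi}{dt} + r\sin\phi\cos\theta\,\frac{d\theta}{dt} \\
\frac{dz}{dt} &= -r\sin\phi\,\frac{d\phi}{dt}
\end{aligned}
\]
and therefore
\[ L = r\int_{t_0}^{t_1} \sqrt{\left(\frac{d\phi}{dt}\right)^2 + \sin^2\phi\left(\frac{d\theta}{dt}\right)^2}\; dt \]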
Although we could find those specific functions $\theta(t)$, $\phi(t)$ which minimise $L$ using the
methods of the variational calculus, in this case we proceed to optimise $L$ by inspection.
Clearly
\[ L \ge r\int_{t_0}^{t_1} \frac{d\phi}{dt}\, dt = r[\phi(t_1) - \phi(t_0)] \]
But this is just the length of the great circle arc connecting points $P$, $Q$. (Equality is
obtained if $d\theta/dt = 0$, defining the great circle arc $\theta = \text{const}$, or when $\sin^2\phi = 0$, which again
defines a particular great circle arc.)
We are now in a position to define a spherical triangle which is a triangle on the surface
of a sphere whose sides are great circle arcs.
Figure 2.18 (labels: spherical triangle, not a spherical triangle)
Great Circle Arcs
In our discussion of complex numbers we introduced the idea of representing an arc of
the unit circle (subtending an angle $\theta$ at the origin) by a complex number of unit norm:
$\cos\theta + j\sin\theta$. A similar correspondence is possible with quaternions and great circle arcs
on the unit sphere. Consider a unit quaternion $q = \cos\theta + \hat{q}\sin\theta$. We may associate this
quaternion with the great circle arc which is obtained when the diametral plane with normal
$\hat{q}$ intersects the unit sphere. Clearly the position of the arc along the circle is arbitrary
and so the arc $AB$ is free to slide on this great circle as long as its length and direction
are maintained. See Figure 2.19(a). Because the correspondence between unit complex
numbers and unit quaternions is so close (for fixed $\hat{q}$ the algebra of quaternions is identical to the
algebra of complex numbers) we can draw the same conclusions as in Section
1.8. We write (using $\sim$ to specify the geometrical correspondence)
\[ q \sim \cos\theta + \hat{q}\sin\theta \sim \mathrm{arc}\,AB \]
A great circle arc can be positioned anywhere on its circle: it is slidable. The arc
is able to move to any position on the circle so long as its length and direction remain
unchanged. We deduce:
\[ q \sim \mathrm{arc}\,AB \qquad q^c \sim \mathrm{arc}\,BA \qquad -q \sim \mathrm{arc}\,AB + \text{semicircle} \]
\[ 1 \sim \text{point} \qquad -1 \sim \text{semicircle} \qquad \hat{q} \sim \text{quarter circle} \]
\[ \mathrm{arc}_q + \text{circle} = \mathrm{arc}_q \qquad \mathrm{arc}_q + \text{semicircle} = -\mathrm{arc}_q \]
Figure 2.19 (panels (a) and (b))
Great circle arcs (on the same circle) can be added vectorially in an identical manner to
plane circular arcs in the complex plane. This follows since if
\[ p = \cos\phi + \hat{q}\sin\phi \qquad q = \cos\theta + \hat{q}\sin\theta \]
then (using $\hat{q}^2 = -1$)
\[ pq = \cos(\theta + \phi) + \hat{q}\sin(\theta + \phi) \]
and so
\[ \mathrm{arc}_p + \mathrm{arc}_q = \mathrm{arc}_{pq} \]
However, there is a more general result. We know that if $p$, $q$ are general unit quaternions
(not necessarily having vector parts parallel):
\[ p = \cos\phi + \hat{p}\sin\phi \qquad q = \cos\theta + \hat{q}\sin\theta \]
then operation by $q*$ (represented by $\mathrm{arc}\,AB$) followed by operation by $p*$ (represented by
$\mathrm{arc}\,BC$) is equivalent to a single operation $(pq)*$ (represented by $\mathrm{arc}\,AC$). See Figure 2.19(b).
Thus for general great circle arcs:
\[ \mathrm{arc}\,AB + \mathrm{arc}\,BC = \mathrm{arc}\,AC \quad \text{or} \quad \mathrm{arc}_q + \mathrm{arc}_p = \mathrm{arc}_{pq} \]
This is generalised to apply to the vector sum of any number of great circle arcs:
\[ \mathrm{arc}_q + \mathrm{arc}_p + \ldots + \mathrm{arc}_h = \mathrm{arc}_{h \ldots pq} \]
As a generalisation of the result in complex numbers relating to arcs which combine to
a closed circle, we easily show that the arcs associated with the unit quaternions $q, p, \ldots, h$
taken in this order will form a closed spherical polygon only if
\[ h \ldots pq = 1. \]
Finally, we note that if $P$, $Q$ are general quaternions, $P = \sqrt{N_P}\,p$ and $Q = \sqrt{N_Q}\,q$ where
$p$, $q$ are unit quaternions, then
\[ PQ = \sqrt{N_P}\,p\,\sqrt{N_Q}\,q \sim \sqrt{N_P}\sqrt{N_Q}\;\mathrm{arc}_{pq} \]
Here we see that multiplication of quaternions can be interpreted as multiplication of
positive real numbers together with addition of great circle arcs. Also, as with real numbers
a quaternion $Q = \sqrt{N_Q}\,q$ has 'polar coordinates' $(\sqrt{N_Q}, \mathrm{arc}_q)$.
The Sine and Cosine Laws of Spherical Trigonometry
Having defined a spherical triangle there are naturally defined six angles: $a°$, $b°$, $c°$ called arc
angles and $A°$, $B°$, $C°$ called vertex angles; see Figure 2.20.
Figure 2.20
For simplicity (and as is conventionally accepted) the lengths of all arcs comprising a spherical
triangle will be taken as less than a semi-circle. This implies that $0 < a°, b°, c° < 180°$
which further implies that $\sin a°$, $\sin b°$, $\sin c°$ are all positive.
Now as we have seen above we can represent arcs quaternionically. If
\[ q = \cos a° + \hat{q}\sin a° \qquad p = \cos c° + \hat{p}\sin c° \]
then
\[ qp = \cos c°\cos a° - \hat{q}\cdot\hat{p}\,\sin c°\sin a° + \hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° + \hat{q}\wedge\hat{p}\,\sin c°\sin a° \]
However, $\mathrm{arc}\,AB \sim p*$, $\mathrm{arc}\,BC \sim q*$, $\mathrm{arc}\,AC \sim (qp)*$ and writing $\mathrm{arc}\,AC \sim \cos b° + \hat{m}\sin b°$
we obtain, by equating scalar and vector parts:
\[ \cos c°\cos a° - \hat{q}\cdot\hat{p}\,\sin c°\sin a° = \cos b° \qquad \text{(i)} \]
\[ \hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° + \hat{q}\wedge\hat{p}\,\sin c°\sin a° = \hat{m}\sin b° \qquad \text{(ii)} \]
But, introducing unit vectors $A$, $B$, $C$ via:
\[ \hat{p} = \frac{A \wedge B}{\sin c°} \qquad \hat{q} = \frac{B \wedge C}{\sin a°} \qquad \hat{m} = \frac{A \wedge C}{\sin b°} \]
and so $\hat{p}$, $\hat{q}$, $\hat{m}$ are unit vectors in the directions of $A \wedge B$, $B \wedge C$ and $A \wedge C$ respectively.
Thus looking down the axis of $B$ (see Figure 2.21) we deduce from (i)
\[ \cos(\pi - B°) = \hat{q}\cdot\hat{p} \]
\[ \cos c°\cos a° + \cos B°\sin c°\sin a° = \cos b° \]
which is (together with two other formulae obtained by cyclic interchange) the Law of
Cosines in spherical trigonometry.
Relation (ii) is used to make another important deduction. Noting that $\hat{q}\wedge\hat{p} = -B\sin B°$,
$B\cdot\hat{q} = 0$ and $B\cdot\hat{p} = 0$ we obtain, from (ii)
\[ B\sin B°\sin c°\sin a° = \hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° - \hat{m}\sin b° \]
therefore
\[ \sin B°\sin c°\sin a° = -B\cdot\hat{m}\,\sin b° \]
leading to
\[ \frac{\sin B°}{\sin b°} = -\frac{B\cdot\hat{m}}{\sin c°\sin a°} = -\frac{B\cdot(A \wedge C)}{\sin a°\sin b°\sin c°} = \frac{A\cdot(B \wedge C)}{\sin a°\sin b°\sin c°} \]
But the right hand side is unchanged on cyclic interchange and so we deduce
\[ \frac{\sin A°}{\sin a°} = \frac{\sin B°}{\sin b°} = \frac{\sin C°}{\sin c°} \]
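which is the Sine Law of spherical trigonometry. We note that the Sine Law is obtained
from (ii) by taking the scalar product of both sides with $B$. The other possibility, taking
the vector product of this equation with $B$, leads to
\[ 0 = B \wedge \hat{q}\,\sin a°\cos c° + B \wedge \hat{p}\,\cos a°\sin c° - B \wedge \hat{m}\,\sin b° \qquad \text{(iii)} \]
But
\[ B \wedge \hat{q} = \frac{B \wedge (B \wedge C)}{\sin a°} = \frac{[(B\cdot C)B - C]}{\sin a°} \]
\[ B \wedge \hat{p} = \frac{B \wedge (A \wedge B)}{\sin c°} = \frac{[A - (B\cdot A)B]}{\sin c°} \]
and
\[ B \wedge \hat{m} = \frac{B \wedge (A \wedge C)}{\sin b°} = \frac{[(B\cdot C)A - (B\cdot A)C]}{\sin b°} \]
Then using $B\cdot A = \cos c°$, $B\cdot C = \cos a°$ and $A\cdot C = \cos b°$ leads to an identity in (iii).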
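The Law of Cosines and the Sine Law can be checked on a concrete triangle. The following sketch picks three unit vectors A, B, C (an arbitrary choice with a positive triple product), forms the arc quaternions with vector parts $B \wedge C$ and $A \wedge B$, and compares the product with the arc AC; all names in the code are illustrative.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

s = 1.0 / math.sqrt(3.0)
A, B, C = (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (s, s, s)   # vertices on the unit sphere

# Arc quaternions: q ~ arc BC, p ~ arc AB; the vector parts are q-hat sin a and p-hat sin c.
q = (dot(B, C),) + cross(B, C)
p = (dot(A, B),) + cross(A, B)
qp = qmul(q, p)                                         # should represent arc AC

a, b, c = math.acos(dot(B, C)), math.acos(dot(A, C)), math.acos(dot(A, B))
# Vertex angle at B from the Law of Cosines.
angle_B = math.acos((math.cos(b) - math.cos(c) * math.cos(a))
                    / (math.sin(c) * math.sin(a)))
sine_ratio = math.sin(angle_B) / math.sin(b)
triple_ratio = dot(A, cross(B, C)) / (math.sin(a) * math.sin(b) * math.sin(c))
print(qp, math.degrees(angle_B), sine_ratio, triple_ratio)
```

For this triangle the vertex angle at B is 45 degrees and both ratios equal sqrt(3)/2, while the product qp has scalar part cos b and vector part $A \wedge C$.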
2.10 Rotating Axes in Mechanics
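As a final application of the quaternion formalism we consider the kinematics of a rigid
body, in particular its velocity and acceleration with respect to coordinate systems related
by a rotation. Quaternions are likely to be useful in this situation since rotations are
involved. Let us consider a unit quaternion $q$ so that $qq^c = 1$. Then, assuming the
components are functions of a parameter $t$ (the time), we have:
\[ \frac{dq}{dt}q^c + q\frac{dq^c}{dt} = 0 \]
that is
\[ \frac{dq}{dt}q^c = -q\frac{dq^c}{dt} = -\left(\frac{dq}{dt}q^c\right)^c \]
Therefore we conclude that $\dfrac{dq}{dt}q^c$ has no scalar part. But the identity
\[ \frac{dq}{dt} = \left(\frac{dq}{dt}q^c\right)q \quad \text{using } q^cq = 1 \]
implies that we can always write
\[ \frac{dq}{dt} = \frac{1}{2}\,\underline{\omega}\,q \]
in which $\underline{\omega} = 2\dfrac{dq}{dt}q^c$ is a pure quaternion. It immediately follows that
\[ \frac{dq^c}{dt} = -\frac{1}{2}\,q^c\underline{\omega} \]
If $p$ is the position vector (relative to space coordinates) of a point fixed in a body which
is rotating according to the value of quaternion $q$ then its coordinates relative to axes fixed
in the body are
\[ p' = q^cpq \]
and $p'$ is such that $\dfrac{dp'}{dt} = 0$. That is
\[
\begin{aligned}
0 &= \frac{dq^c}{dt}pq + q^c\frac{dp}{dt}q + q^cp\frac{dq}{dt} \\
&= -\frac{1}{2}q^c\underline{\omega}pq + q^c\frac{dp}{dt}q + \frac{1}{2}q^cp\,\underline{\omega}q \\
&= q^c\left[-\frac{1}{2}(\underline{\omega}p - p\,\underline{\omega}) + \frac{dp}{dt}\right]q
\end{aligned}
\]
which requires
\[ \frac{dp}{dt} = \frac{1}{2}(\underline{\omega}p - p\,\underline{\omega}) = \underline{\omega} \wedge p \]
Thus we can interpret $\underline{\omega}$ as the angular velocity of the body. This analysis can be extended
to particles which are not fixed with respect to body or space coordinates. We now consider
a transformation of coordinates (i.e. rotating axes). If the vector parts of quaternions
$x$, $x'$ represent the position vectors of a particle with respect to body and space coordinates
then $x'$, $x$ are related through
\[ x' = q^cxq \]
and writing $x = x_0 + \underline{r}$ then
\[
\begin{aligned}
\frac{dx'}{dt} &= \frac{dq^c}{dt}xq + q^c\frac{dx}{dt}q + q^cx\frac{dq}{dt} \\
&= -\frac{1}{2}q^c\underline{\omega}xq + q^c\frac{dx}{dt}q + \frac{1}{2}q^cx\,\underline{\omega}q \\
&= q^c\left[-\frac{1}{2}(\underline{\omega}x - x\underline{\omega}) + \frac{dx}{dt}\right]q
\end{aligned}
\]
But
\[
\begin{aligned}
\underline{\omega}x - x\underline{\omega} &= \underline{\omega}[x_0 + \underline{r}] - [x_0 + \underline{r}]\underline{\omega} \\
&= x_0\underline{\omega} - \underline{\omega}\cdot\underline{r} + \underline{\omega} \wedge \underline{r} - x_0\underline{\omega} + \underline{r}\cdot\underline{\omega} - \underline{r} \wedge \underline{\omega} \\
&= 2\,\underline{\omega} \wedge \underline{r}
\end{aligned}
\]
therefore
\[ \frac{dx'}{dt} = q^c\left[\frac{dx}{dt} - \underline{\omega} \wedge \underline{r}\right]q \]
Differentiating again:
\[
\begin{aligned}
\frac{d^2x'}{dt^2} &= q^c\left\{-\frac{\underline{\omega}}{2}\left[\frac{dx}{dt} - \underline{\omega} \wedge \underline{r}\right] + \left[\frac{d^2x}{dt^2} - \frac{d\underline{\omega}}{dt} \wedge \underline{r} - \underline{\omega} \wedge \frac{d\underline{r}}{dt}\right] + \left[\frac{dx}{dt} - \underline{\omega} \wedge \underline{r}\right]\frac{\underline{\omega}}{2}\right\}q \\
&= q^c\left[\frac{d^2x}{dt^2} - 2\,\underline{\omega} \wedge \frac{d\underline{r}}{dt} - \frac{d\underline{\omega}}{dt} \wedge \underline{r} + \underline{\omega} \wedge (\underline{\omega} \wedge \underline{r})\right]q
\end{aligned}
\]
If axes are chosen so that the two coordinate systems coincide instantaneously then $q = 1$
and the usual (classical) relations are obtained between rotating coordinate systems.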
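The interpretation of $\underline{\omega} = 2\,\dfrac{dq}{dt}\,q^c$ as an angular velocity can be illustrated numerically. This sketch assumes a steady rotation at rate `OMEGA` about a fixed unit axis, so that $q(t) = \cos(\Omega t/2) + \hat{n}\sin(\Omega t/2)$, and approximates the derivative by a central difference; the names and numerical values are illustrative.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q, read as a + bi + cj + dk."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

OMEGA = 1.5                                   # rotation rate in radians per unit time
AXIS = (2.0 / 3.0, -1.0 / 3.0, 2.0 / 3.0)     # fixed unit axis n-hat

def q_of_t(t):
    """Unit quaternion for a steady rotation through OMEGA * t about AXIS."""
    half = 0.5 * OMEGA * t
    return (math.cos(half),) + tuple(math.sin(half) * c for c in AXIS)

def angular_velocity(t, h=1e-6):
    """omega = 2 (dq/dt) q^c, with dq/dt from a central difference."""
    q_plus, q_minus = q_of_t(t + h), q_of_t(t - h)
    dq = tuple((u - v) / (2.0 * h) for u, v in zip(q_plus, q_minus))
    q = q_of_t(t)
    qc = (q[0], -q[1], -q[2], -q[3])
    return tuple(2.0 * x for x in qmul(dq, qc))

omega = angular_velocity(0.7)
print(omega)
```

The scalar part of the result is zero to within the finite-difference error and the vector part equals `OMEGA` times the axis, as expected for a steady rotation.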
Chapter 3
Complexified Quaternions
3.1 Scalars, Pseudoscalars, Vectors and Pseudovectors
In the ordinary quaternion theory we have constructed an object
\[ q = \alpha + \underline{\beta} \]
which essentially only distinguishes two types of objects: the scalars and the vectors.
However, we know that the vectors split into two disjoint sets: the polar vectors and the
axial vectors (arising from vector products of polar vectors; often called pseudovectors)
and, in three dimensions, we also have scalars and pseudoscalars.
These distinctions between vectors are well illustrated if, for example, we consider three
vectors $a$, $b$, $c$ and consider their reflection in a mirror. If the world outside the mirror has
a right-handed system of axes then the mirror-world has a left-handed system. See Figure
3.1.
Figure 3.1 (labels: real world, mirror world)
We see that the vector $c$ is reversed in direction but the vector $a \wedge b$ is unchanged in
direction. This is seen algebraically by considering the coordinate transformation
\[ x'_i = a_{ij}x_j \quad \text{using the summation convention} \]
Now if
\[ a_{ij} = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{pmatrix} \]
J. P. Ward, Quaternions and Cayley Numbers
© Kluwer Academic Publishers 1997
then $\det(a_{ij}) = -1$, and if
\[ a = (0, 0, 1) \qquad b = (1, 0, 0) \qquad c = (0, 1, 0) \]
then
\[ a' = (0, 0, 1) \qquad b' = (1, 0, 0) \qquad c' = (0, -1, 0) \]
and so, using $(a \wedge b)_i = \epsilon_{ijk}a_jb_k$ and $(a' \wedge b')_i = \epsilon_{ijk}a'_jb'_k$, we easily see that $(a \wedge b)_i = (0, 1, 0)$ whilst
$(a' \wedge b')_i = (0, 1, 0)$. That is, $a \wedge b$, which is in the same direction as $c$, is unchanged under
the transformation whereas $c$ is reversed in direction. Because of this a vector equation
should never contain both a polar vector and an axial vector: they are essentially independent
entities. The same is true of scalars and pseudoscalars.
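The mirror example can be replayed in code. This is a small sketch with illustrative helper names; `mirror` applies the transformation with matrix diag(1, -1, 1) used above.

```python
def cross(u, v):
    """Vector product of two 3-vectors."""
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def mirror(v):
    """Apply x'_i = a_ij x_j with a = diag(1, -1, 1)."""
    return (v[0], -v[1], v[2])

a, b, c = (0, 0, 1), (1, 0, 0), (0, 1, 0)

axial = cross(a, b)                              # coincides with c before reflection
axial_after = cross(mirror(a), mirror(b))        # formed from the reflected vectors
polar_after = mirror(c)                          # c transforms as a polar vector
print(axial, axial_after, polar_after)
```

The vector product keeps its direction while c is reversed, which is the distinction between axial and polar vectors drawn in the text.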
The present formulation of quaternions does not encompass these important distinctions.
We shall see in the study of complexified quaternions, below, applied to three-dimensional
space how this defect may be remedied. The distinctions examined here between vector
types apply more generally.
In three dimensions the highest order completely skew-symmetric tensor is proportional to
$\epsilon_{ijk}$. Let us call it $T_{ijk}$:
\[ T_{ijk} = T_{123}\,\epsilon_{ijk} \qquad T_{ijk} = \pm T_{123} \]
with the positive sign taken if $ijk$ is an even permutation of 1,2,3 and the negative sign
being taken if $ijk$ is an odd permutation of 1,2,3. Such a tensor is similar to a scalar as it
has only one distinct component value. However, whereas a scalar is unchanged on rotating
coordinate systems this is not so for $T_{ijk}$. For consider the component $T'_{123}$:
\[
\begin{aligned}
T'_{123} &= a_{1\alpha}a_{2\beta}a_{3\gamma}T_{\alpha\beta\gamma} \\
&= a_{1\alpha}a_{2\beta}a_{3\gamma}T_{123}\,\epsilon_{\alpha\beta\gamma} \\
&= (\det a_{ij})\,\epsilon_{123}T_{123} = (\det a_{ij})\,T_{123}
\end{aligned}
\]
More generally, a single component $T$ transforming like
\[ T' = (\det a_{ij})^n\,T \]
is called a scalar density of weight $n$. Scalar densities of weight 1 are called
pseudoscalars.
A tensor multiplied by a scalar density of weight $n$ becomes a tensor density of weight $n$.
For example, the 1st order tensor $w_i$ (vector) transforming as
\[ w'_i = a_{ij}w_j \]
is a tensor density of weight zero whereas the vector product $\epsilon_{ijk}w_jv_k = p_i$ is a tensor density
of weight 1. This is called a pseudovector or a bivector. Not surprisingly there is a direct
relation between bivectors and skew-symmetric tensors of 2nd order $T_{ij}$
\[ T_{ij} = -T_{ji} \]
which, if $p_k$ is a bivector, is
\[ T_{ij} = \epsilon_{ijk}p_k \]
3.2 Complexified Quaternions: Euclidean Metric
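A complexified quaternion has the form:
\[ q = \alpha 1 + \mathrm{i}\underline{\beta} \qquad \alpha \in \mathbb{C}, \quad \underline{\beta} \in \mathbb{C}^3 \]
wherein $\underline{\beta} = \beta i + \gamma j + \delta k$. Here $(1, i, j, k)$ is the usual quaternion basis and $\mathrm{i}$ (satisfying
$\mathrm{i}^2 = -1$) is the usual complex unit. Clearly the factor $\mathrm{i}$ is superfluous as it could be
incorporated into $\underline{\beta}$, but this formulation of a complexified quaternion can lead to some
simplification. We note that the algebra generated by $1, \mathrm{i}i, \mathrm{i}j, \mathrm{i}k$ is identical to that
generated by the Pauli Matrices. In fact, we have the correspondence:
\[
1 \leftrightarrow \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\mathrm{i}i \leftrightarrow \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\mathrm{i}j \leftrightarrow \begin{pmatrix} 0 & \mathrm{i} \\ -\mathrm{i} & 0 \end{pmatrix}, \quad
\mathrm{i}k \leftrightarrow \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}
\]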
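The stated correspondence can be checked by comparing multiplication tables. In the sketch below a complexified quaternion is stored as four Python complex numbers (the coefficients of 1, i, j, k), so that the element written ii in the text, the complex unit times the quaternion unit i, becomes (0, 1j, 0, 0); the helper names are illustrative.

```python
def qmul(p, q):
    """Hamilton product; the four coefficients may be complex numbers."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def matmul2(A, B):
    """Product of two 2x2 matrices stored as lists of rows."""
    return [[A[i][0] * B[0][j] + A[i][1] * B[1][j] for j in range(2)]
            for i in range(2)]

def scale(z, M):
    return [[z * x for x in row] for row in M]

# Basis elements 1, ii, ij, ik as complexified quaternions.
e0, e1, e2, e3 = (1, 0, 0, 0), (0, 1j, 0, 0), (0, 0, 1j, 0), (0, 0, 0, 1j)

# The matrices placed in correspondence with them in the text.
M0 = [[1, 0], [0, 1]]
M1 = [[1, 0], [0, -1]]
M2 = [[0, 1j], [-1j, 0]]
M3 = [[0, 1], [1, 0]]

quaternion_rule = qmul(e1, e2) == tuple(1j * x for x in e3)   # (ii)(ij) = i (ik)
matrix_rule = matmul2(M1, M2) == scale(1j, M3)                # same rule for matrices
squares_to_one = all(matmul2(M, M) == M0 for M in (M1, M2, M3))
print(quaternion_rule, matrix_rule, squares_to_one)
```

Both the complexified quaternions and the matrices satisfy the same product rule and each of the three non-trivial basis elements squares to the identity, which is the defining behaviour of the Pauli algebra.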
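The algebra of complexified quaternions is easily developed. The set of complexified
quaternions is denoted by $\mathbb{H}_{\mathbb{C}}$. The conjugate operation in $\mathbb{H}_{\mathbb{C}}$ is
\[ q^c = \alpha - \mathrm{i}\underline{\beta} \]
whereas the complex conjugate, denoted by $q^*$, is
\[ q^* = \alpha^* - \mathrm{i}\underline{\beta}^* \]
An easy calculation shows that $\forall\, p, q \in \mathbb{H}_{\mathbb{C}}$
\[ (q^*)^c = (q^c)^*, \qquad (qp)^* = q^*p^*, \qquad (qp)^c = p^cq^c \]
We also note that if $q^* = q$ then
\[ \alpha^* - \mathrm{i}\underline{\beta}^* = \alpha + \mathrm{i}\underline{\beta} \]
implying $\alpha = \alpha^*$, $\underline{\beta} = -\underline{\beta}^*$. Thus $\alpha$ is real and $\underline{\beta}$ is pure imaginary. In this case we can
write $\underline{\beta} = \mathrm{i}\bar{\underline{\beta}}$ in which $\bar{\underline{\beta}} \in \mathbb{R}^3$. That is, a complexified quaternion $q$ such that it equals its
complex conjugate reduces to a 'real' quaternion:
\[ q = \alpha - \bar{\underline{\beta}} \in \mathbb{H} \]
On the other hand, if $q^c = q$ then
\[ \alpha - \mathrm{i}\underline{\beta} = \alpha + \mathrm{i}\underline{\beta} \;\Rightarrow\; \underline{\beta} = 0 \]
and so a complexified quaternion equals its quaternion conjugate only if it is a complex
number: $q \in \mathbb{C}$. Finally $q^c = q^*$ only if $q = \alpha + \mathrm{i}\underline{\beta}$, $\alpha \in \mathbb{R}$, $\underline{\beta} \in \mathbb{R}^3$.
If $q \in \mathbb{H}_{\mathbb{C}}$ then the scalar and vector parts of $q$ are defined as for ordinary (real) quaternions:
\[ Sq = \tfrac{1}{2}(q + q^c) = \alpha \qquad Vq = \tfrac{1}{2}(q - q^c) = \mathrm{i}\underline{\beta} \]
We note that
\[ Sq^c = \tfrac{1}{2}(q^c + q) = Sq \quad \text{and} \quad S(p + q) = Sp + Sq \]
and since, for $p = \alpha + \mathrm{i}\underline{\beta}$, $q = \gamma + \mathrm{i}\underline{\delta}$
\[
\begin{aligned}
pq &= (\alpha + \mathrm{i}\underline{\beta})(\gamma + \mathrm{i}\underline{\delta}) \\
&= \alpha\gamma + \underline{\beta}\cdot\underline{\delta} - \underline{\beta} \wedge \underline{\delta} + \mathrm{i}(\alpha\underline{\delta} + \gamma\underline{\beta})
\end{aligned}
\]
then
\[ S(pq) = \alpha\gamma + \underline{\beta}\cdot\underline{\delta} = S(qp) \]
We now find that complexified quaternion which commutes with all others. That is, let
$z = \alpha_z + \mathrm{i}\underline{\beta}_z \in \mathbb{H}_{\mathbb{C}}$ be such that
\[ pz = zp \quad \forall\, p \in \mathbb{H}_{\mathbb{C}} \]
Now
\[
\begin{aligned}
pz &= (\alpha + \mathrm{i}\underline{\beta})(\alpha_z + \mathrm{i}\underline{\beta}_z) \\
&= \alpha\alpha_z + \mathrm{i}\alpha\underline{\beta}_z + \mathrm{i}\alpha_z\underline{\beta} + \underline{\beta}\cdot\underline{\beta}_z - \underline{\beta} \wedge \underline{\beta}_z
\end{aligned}
\]
then
\[ zp = \alpha_z\alpha + \mathrm{i}\alpha_z\underline{\beta} + \mathrm{i}\alpha\underline{\beta}_z + \underline{\beta}_z\cdot\underline{\beta} - \underline{\beta}_z \wedge \underline{\beta} \]
and so:
\[ pz - zp = -2\,\underline{\beta} \wedge \underline{\beta}_z \]
thus $Vz = \mathrm{i}\underline{\beta}_z = 0$ is required for $z$ to commute with all $p \in \mathbb{H}_{\mathbb{C}}$. We conclude that $z \in \mathbb{C}$.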
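Inner Product for Complexified Quaternions
Here we choose to define a different inner product than that used for real quaternions.
Specifically:
\[ \langle p, q \rangle = S(pq^{*c}) \]
We should note that alternative definitions for the inner product can be given. In a later
section we use the definition $\langle p, q \rangle = S(pq^c)$ which is applicable in Relativity Theory.
The present definition satisfies all the usual requirements for an Hermitian inner product:
\[
\begin{aligned}
\text{(i)} \quad \langle p, q \rangle &= S(pq^{*c}) = S(q^*p^c) \\
&= \langle q^*, p^* \rangle \\
&= \langle q, p \rangle^*
\end{aligned}
\]
\[
\begin{aligned}
\text{(ii)} \quad \langle p, q + r \rangle &= S(p(q + r)^{*c}) = S(pq^{*c} + pr^{*c}) \\
&= \langle p, q \rangle + \langle p, r \rangle
\end{aligned}
\]
\[
\begin{aligned}
\text{(iii)} \quad a\langle p, q \rangle &= aS(pq^{*c}) = S((ap)q^{*c}) \\
&= \langle ap, q \rangle \\
&= S(p(a^*q)^{*c}) \\
&= \langle p, a^*q \rangle \qquad a \in \mathbb{C}
\end{aligned}
\]
\[
\begin{aligned}
\text{(iv)} \quad \langle p, p \rangle &= S(pp^{*c}) \\
&= \tfrac{1}{2}[(\alpha + \mathrm{i}\underline{\beta})(\alpha^* + \mathrm{i}\underline{\beta}^*) + (\alpha^* - \mathrm{i}\underline{\beta}^*)(\alpha - \mathrm{i}\underline{\beta})] \\
&= \tfrac{1}{2}[\alpha\alpha^* + \mathrm{i}(\alpha\underline{\beta}^* + \alpha^*\underline{\beta}) + \underline{\beta}\cdot\underline{\beta}^* - \underline{\beta} \wedge \underline{\beta}^* \\
&\qquad + \alpha^*\alpha - \mathrm{i}(\alpha\underline{\beta}^* + \alpha^*\underline{\beta}) + \underline{\beta}^*\cdot\underline{\beta} - \underline{\beta}^* \wedge \underline{\beta}] \\
&= \alpha\alpha^* + \underline{\beta}\cdot\underline{\beta}^* \ge 0
\end{aligned}
\]
and
\[ \langle p, p \rangle = 0 \quad \text{only if } p = 0 \]
We now define the norm: for any $p \in \mathbb{H}_{\mathbb{C}}$
\[ N_p = \langle p, p \rangle = \tfrac{1}{2}[pp^{*c} + p^*p^c] \ge 0 \]
We note that
\[
\begin{aligned}
N_{p^c} = \langle p^c, p^c \rangle &= \tfrac{1}{2}[p^cp^* + p^{*c}p] \\
&= \tfrac{1}{2}[(\alpha - \mathrm{i}\underline{\beta})(\alpha^* - \mathrm{i}\underline{\beta}^*) + (\alpha^* + \mathrm{i}\underline{\beta}^*)(\alpha + \mathrm{i}\underline{\beta})] \\
&= \tfrac{1}{2}[\alpha\alpha^* - \mathrm{i}(\alpha\underline{\beta}^* + \alpha^*\underline{\beta}) + \underline{\beta}\cdot\underline{\beta}^* - \underline{\beta} \wedge \underline{\beta}^* \\
&\qquad + \alpha^*\alpha + \mathrm{i}(\alpha^*\underline{\beta} + \alpha\underline{\beta}^*) + \underline{\beta}^*\cdot\underline{\beta} - \underline{\beta}^* \wedge \underline{\beta}] \\
&= \alpha\alpha^* + \underline{\beta}\cdot\underline{\beta}^* = N_p
\end{aligned}
\]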
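Also,
\[
\begin{aligned}
N_{p^*} &= \langle p^*, p^* \rangle \\
&= \tfrac{1}{2}[p^*p^c + pp^{*c}] = N_p
\end{aligned}
\]
and
\[
\begin{aligned}
N_{p^{*c}} &= \langle p^{*c}, p^{*c} \rangle \\
&= \tfrac{1}{2}[p^{*c}p + p^cp^*] = N_{p^c} = N_p
\end{aligned}
\]
\[
\begin{aligned}
N_{pq} = \langle pq, pq \rangle &= \tfrac{1}{2}[pq(pq)^{*c} + (pq)^*(pq)^c] \\
&= \tfrac{1}{2}[pqq^{*c}p^{*c} + p^*q^*q^cp^c] \\
&= \tfrac{1}{2}[p(2N_q - q^*q^c)p^{*c} + p^*(2N_q - qq^{*c})p^c] \\
&= N_q(pp^{*c} + p^*p^c) - \tfrac{1}{2}p(q^*q^c)p^{*c} - \tfrac{1}{2}p^*qq^{*c}p^c \\
&= 2N_pN_q - \tfrac{1}{2}p(q^*q^c)p^{*c} - \tfrac{1}{2}p^*qq^{*c}p^c
\end{aligned}
\]
Hence
\[ N_{pq^*} = 2N_qN_p - \tfrac{1}{2}p(qq^{*c})p^{*c} - \tfrac{1}{2}p^*q^*q^cp^c \]
\[
\begin{aligned}
N_{pq} + N_{pq^*} &= 4N_pN_q - \tfrac{1}{2}p(qq^{*c} + q^*q^c)p^{*c} - \tfrac{1}{2}p^*(q^*q^c + qq^{*c})p^c \\
&= 2N_pN_q
\end{aligned}
\]
We deduce that
\[ N_{pq} = 2N_qN_p - N_{pq^*} \]
(Although complexified quaternions are 8-dimensional, they are not Cayley numbers as, in
general, $N_{pq} \neq N_pN_q$.) It is interesting to note that if $q = \pm q^*$ then $N_{pq^*} = N_{\pm pq} = N_{pq}$
and so, in this case, $N_{pq} = N_pN_q$. We conclude that if one of the factors is a vector or a
pseudo-scalar (see later) then the norm of a product is the product of norms.
For those particular complexified quaternions $p$ for which $N_p = pp^{*c}$ (i.e. $p^* = \pm p$ for
example) we define an inverse:
\[ p^{-1} = \frac{p^{*c}}{N_p} \]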
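The failure of the norm to be multiplicative, and the identity $N_{pq} + N_{pq^*} = 2N_pN_q$, can be seen on small examples. The sketch stores a complexified quaternion as four complex coefficients of 1, i, j, k (so the vector coefficients already include the factor written as the complex unit in the text); the function names and sample values are illustrative.

```python
def qmul(p, q):
    """Hamilton product; the four coefficients may be complex numbers."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qc(p):
    """Quaternion conjugate: reverse the sign of the vector part."""
    return (p[0], -p[1], -p[2], -p[3])

def star(p):
    """Complex conjugate: conjugate each complex coefficient."""
    return tuple(complex(x).conjugate() for x in p)

def norm(p):
    """N_p = S(p p^{*c}), the scalar part of p times its double conjugate."""
    return qmul(p, qc(star(p)))[0].real

# A simple case in which the norm is not multiplicative.
p = (1, 1j, 0, 0)
n_pq = norm(qmul(p, p))
n_pq_star = norm(qmul(p, star(p)))

# A generic case for the identity N_pq + N_pq* = 2 N_p N_q.
u = (1 + 2j, 0.5j, -1, 2 - 1j)
v = (0.5 - 1j, 3, 1j, -2 + 0.5j)
lhs = norm(qmul(u, v)) + norm(qmul(u, star(v)))
rhs = 2 * norm(u) * norm(v)
print(n_pq, n_pq_star, norm(p) ** 2, lhs, rhs)
```

For p = 1 + ii the product pp has norm 8 although the product of the norms is 4, and the deficit is made up by pp*, which vanishes; the generic pair satisfies the identity to rounding error.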
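The Metric
An inner product defines a metric for the space to which it applies. This can be obtained
as follows. Let $e_\alpha = (1, \mathrm{i}i, \mathrm{i}j, \mathrm{i}k)$ denote the basis for complexified quaternions. General
complexified quaternions $p$, $q$ could then be represented as
\[ p = p^\alpha e_\alpha \qquad q = q^\alpha e_\alpha \]
in which the summation convention is used (greek indices will generally range from 0 to
3). Now suppose we choose $p$, $q$ as being basis elements, $p = e_\alpha$, $q = e_\beta$ say, for particular values
of $\alpha$, $\beta$. Thus, calculating the inner product:
\[ \langle p, q \rangle = \langle e_\alpha, e_\beta \rangle = \tfrac{1}{2}[e_\alpha e^{*c}_\beta + e^*_\beta e^c_\alpha] \]
The metric $g_{\mu\nu}$ is defined through the relation
\[ \langle p, q \rangle = S(pq^{*c}) = g_{\mu\nu}\,p^\mu q^{\nu *} \]
However, since $p = e_\alpha$, $q = e_\beta$ then
\[ p = \delta_\alpha^\mu e_\mu \qquad q = \delta_\beta^\nu e_\nu \]
Thus the components of $p$, $q$ are:
\[ p^\mu = \delta_\alpha^\mu \quad \text{and} \quad q^\nu = \delta_\beta^\nu \qquad (\delta : \text{Kronecker delta}) \]
This implies
\[ g_{\alpha\beta} = \tfrac{1}{2}[e_\alpha e^{*c}_\beta + e^*_\beta e^c_\alpha] \]
Specifically:
\[
\begin{aligned}
g_{00} &= \tfrac{1}{2}[2] = 1 \\
g_{11} &= \tfrac{1}{2}[(\mathrm{i}i)(\mathrm{i}i) + (-\mathrm{i}i)(-\mathrm{i}i)] = 1 \\
&\;\;\vdots \\
g_{33} &= \tfrac{1}{2}[(\mathrm{i}k)(\mathrm{i}k) + (-\mathrm{i}k)(-\mathrm{i}k)] = 1 \\
g_{\alpha\beta} &= 0 \qquad \alpha \neq \beta
\end{aligned}
\]
Thus
\[
g_{\alpha\beta} = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}
\]
which is the usual flat-space Euclidean metric. We conclude that with the form of
inner product chosen here the complexified quaternion formalism can be applied to the
description of classical mechanics. This is the point of view taken in Hestenes [5,11] who
has pioneered this approach using (the closely related) multivector calculus.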
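The metric components can be generated mechanically from the inner product. In this sketch the basis elements 1, ii, ij, ik are stored as tuples of four complex coefficients, and the inner product is taken to be the scalar part of $pq^{*c}$ as defined above; the helper names are illustrative.

```python
def qmul(p, q):
    """Hamilton product; the four coefficients may be complex numbers."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def star_c(p):
    """Complex conjugate followed by quaternion conjugate, p -> p^{*c}."""
    s = tuple(complex(x).conjugate() for x in p)
    return (s[0], -s[1], -s[2], -s[3])

def inner(p, q):
    """<p, q> = S(p q^{*c})."""
    return qmul(p, star_c(q))[0]

basis = [(1, 0, 0, 0), (0, 1j, 0, 0), (0, 0, 1j, 0), (0, 0, 0, 1j)]
g = [[inner(ea, eb) for eb in basis] for ea in basis]
for row in g:
    print(row)
```

Every diagonal entry comes out as 1 and every off-diagonal entry as 0, reproducing the Euclidean metric found in the text.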
The Dual Operation
In this section we shall find it useful to write the scalar and vector parts of a complexified
quaternion in real and imaginary parts: $\alpha = \alpha^R + \mathrm{i}\alpha^I$ and $\underline{\beta} = \underline{\beta}^R + \mathrm{i}\underline{\beta}^I$. Thus a complexified
quaternion has expression
\[ q = \alpha^R - \underline{\beta}^I + \mathrm{i}\alpha^I + \mathrm{i}\underline{\beta}^R \]
We can choose, as basis elements, either $(1, \mathrm{i}i, \mathrm{i}j, \mathrm{i}k)$ over the field of complex numbers
or $(1, \mathrm{i}, e_k, \mathrm{i}e_k)$, $k = 1, 2, 3$ over the field of real numbers (here $e_1 = i$, $e_2 = j$, $e_3 = k$).
Following the conventions used by Hestenes [ ] the terms in $q$ separate naturally into four
groups
$\alpha$, $\alpha \in \mathbb{R}$ are called scalars
$\mathrm{i}\alpha$, $\alpha \in \mathbb{R}$ are called pseudoscalars
$a_ke_k$, $a_k \in \mathbb{R}$ are called bivectors
$\mathrm{i}a_ke_k$, $a_k \in \mathbb{R}$ are called vectors
A bivector is obtained when two orthogonal vectors are multiplied together. That is,
\[ (\mathrm{i}e_k)(\mathrm{i}e_m) = -e_k \wedge e_m \qquad k \neq m \]
and clearly the elements $e_1 \wedge e_2$, $e_1 \wedge e_3$, $e_2 \wedge e_3$ form a basis for the bivectors. Any bivector
$B$ can be written in terms of these three
\[ B = \tfrac{1}{2}B_{ij}\,e_i \wedge e_j \qquad B_{ij} = -B_{ji} \]
Also, given any bivector $B$ we can always find a vector $b$ such that $B = -\mathrm{i}b$. To see this
let $(B)_m = \tfrac{1}{2}(B_{ij}\,e_i \wedge e_j)_m$. That is:
\[
\begin{aligned}
(B)_m &= (B_{12}k + B_{23}i - B_{13}j)_m \\
&= (B_{23}, -B_{13}, B_{12})
\end{aligned}
\]
If $b = \mathrm{i}(b_1i + b_2j + b_3k)$ then $-\mathrm{i}b = (b_1i + b_2j + b_3k)$, implying the identification if we choose
$b_1 = B_{23}$, $b_2 = -B_{13}$, $b_3 = B_{12}$. We say $B$ and $b$ are dual objects. We note that this
identification, $B = -\mathrm{i}b$, exactly coincides with the usual definition of dual tensors:
\[ B_{ij} = \epsilon_{ijk}b_k \]
where $\epsilon_{ijk}$ is the completely anti-symmetric object.
More generally we define the dual of a complexified quaternion q to be —iq. The dual
operation turns a scalar into a pseudoscalar (and vice versa) and a vector into a bivector
(and.vice versa). Note that to invert the relation B^ = eijkbk requires:
bp
in normal tensor notation. However in the present formalism obtaining the dual is achieved
by multiplying by — i.
The use of complexified quaternions highlights the four basic quantities: points
(represented by scalars), lines (represented by vectors), area elements (represented by
bi vectors) and volume elements (represented by pseudoscalars). The pseudoscalar is the
result of the quaternion product of three orthogonal vectors ii, ij, ik since
(ii)(ij)(ik) = (ii)(-j A k) = i(i • (j Ak))=i
which is the volume element of a unit cube with sides represented by i, j, k. A volume is
not a scalar — it is a pseudoscalar.
Rotations
For 'real' quaternions we have already seen that the operation
w' = qwq-1 Nq = 1
is such that the norm and scalar parts of w are preserved. Also the vector parts of w and
w' are related via a conical rotation. Specifically V(w') is obtained from V(w) by rotating
it about the axis Vq through twice the angle of q. If we consider complexified quaternions,
with the present choice of inner product then we amend this transformation to
w' = qwqc Nq = 1
in which q* = q. That is, q = a + /3, a G R, ft G M3. But with the present interpretation
a is a scalar and /3 is a bivector. We should note that any quaternion q of this form can
be written as a product of two vectors a, b
q = ab a = iab = ib
— ^ijk^ijp^k
- (fijjfikp ~ 8jp6kj)bk
= 3bp — bp = 2bp
114
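then $ab = \underline{a} \cdot \underline{b} - \underline{a} \wedge \underline{b}$ in which $\underline{a} \cdot \underline{b}$ is the scalar $\alpha$ and $-\underline{a} \wedge \underline{b}$ is the bivector $\underline{\beta}$. (There are an infinite number of possible choices for $\underline{a}$, $\underline{b}$ satisfying $\underline{a} \cdot \underline{b} = \alpha$ and $-\underline{a} \wedge \underline{b} = \underline{\beta}$: choose $\underline{b}$ such that $\underline{b} \cdot \underline{\beta} = 0$; then $\underline{a} = [\underline{\beta} \wedge \underline{b} + \alpha \underline{b}]/|\underline{b}|^2$ gives $\underline{a} \cdot \underline{b} = \alpha$ and $-\underline{a} \wedge \underline{b} = \underline{\beta}$.) It is also easily checked that this transformation preserves the inner product:
$$\begin{aligned}
\langle w', r' \rangle = S(w' r'^{*c}) &= \tfrac{1}{2}\left(w' r'^{*c} + r'^{*} w'^{c}\right) \\
&= \tfrac{1}{2}\left[q w q^c (q r q^c)^{*c} + (q r q^c)^{*}(q w q^c)^{c}\right] \\
&= \tfrac{1}{2}\left[q w q^c (q r^{*c} q^c) + q r^{*} q^c q w^c q^c\right] \\
&= \tfrac{1}{2}\left[q w r^{*c} q^c + q r^{*} w^c q^c\right] \\
&= q\left[\tfrac{1}{2}(w r^{*c} + r^{*} w^c)\right] q^c \\
&= q \langle w, r \rangle q^c = \langle w, r \rangle q q^c = \langle w, r \rangle
\end{aligned}$$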
Those readers wishing to pursue this approach to mechanics and to geometry should consult
the various texts and research articles of Hestenes listed in the bibliography.
3.3 Complexified Quaternions: Minkowski Metric
We shall discover in this section that, by introducing an alternative prescription for the inner product (and hence the metric), the formalism of complexified quaternions can be used to describe elegantly the fundamental relations in Special Relativity and in Electromagnetism. We shall see that a space-time event is represented by a single complexified quaternion $x$ and that the Lorentz transformation is characterised by the quaternion $q$ through the relation:
$$x' = q^{*c} x q, \qquad Nq = 1$$
The material of this Chapter is heavily dependent upon the work of many other authors;
in particular, J D Edmonds [3], W Israel [12], A J Macfarlane [13], M Cahen, R
Debever, L Defrise [14]. Of course other formulations, in terms of differential forms, or
in terms of spinors, or the classical approach using tensors can be employed to describe
the fundamental relations of Special Relativity and of Electromagnetism. What I have
attempted is to enquire what difficulties underlie the consistent use of a quaternionic
formulation in this area of physics. I have not the wit (nor at my age the time) to
carry through this analysis completely but I think I can demonstrate that a quaternionic
formulation is at least feasible and has (certainly in terms of elegance) some advantages
over other approaches.
We begin by considering an alternative inner product for two complexified quaternions $p$, $q$
$$\langle p, q \rangle = S(p q^c)$$
which is formally the same as that used for real quaternions. As is easily verified, the following properties are satisfied
(i) $\langle p, q \rangle = S(p q^c) = S((p q^c)^c) = S(q p^c) = \langle q, p \rangle$
(ii) $\langle p, q + r \rangle = S(p (q + r)^c) = S(p q^c + p r^c) = \langle p, q \rangle + \langle p, r \rangle$
(iii) $a \langle p, q \rangle = a S(p q^c) = S((ap) q^c) = \langle ap, q \rangle = S(p (aq)^c) = \langle p, aq \rangle$, $a \in \mathbb{C}$
(iv) $\langle p, p \rangle = S(p p^c) = \alpha^2 - \underline{\beta} \cdot \underline{\beta} \in \mathbb{C}$
Thus the inner product defined here is a symmetric bilinear form but is not positive definite ($\langle p, p \rangle$ is not necessarily real, let alone positive). The inner product defines the norm of $q \in \mathbb{H}_{\mathbb{C}}$:
$$Nq = \langle q, q \rangle = q q^c$$
Clearly $Nq^c = Nq$ and, for the product:
$$Npq = (pq)(pq)^c = p q q^c p^c = q q^c p p^c = Np\, Nq$$
(Here we have used the result that $q q^c \in \mathbb{C}$ and thus commutes with all complexified quaternions.) We can also define an angle $z$ (which may be complex) between two $p, q \in \mathbb{H}_{\mathbb{C}}$ via
$$\cos z = \frac{\langle p, q \rangle}{\sqrt{Np}\,\sqrt{Nq}}$$
So $p$, $q$ will be said to be orthogonal if $\langle p, q \rangle = 0$ (i.e. if $S(p q^c) = 0$) and parallel if $V(p q^c) = 0$.
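The properties of this inner product are easy to test numerically. The sketch below is illustrative only: a complexified quaternion is stored as a 4-tuple of Python complex numbers (scalar part first), and the helper names `qmul`, `qc`, `inner`, `norm` and `rand_q` are assumptions made for this example. It checks the symmetry $\langle p, q\rangle = \langle q, p\rangle$, that $qq^c$ has no vector part, and the product rule $Npq = Np\,Nq$ for two arbitrary complexified quaternions.

```python
import random

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qc(q):
    """Quaternion conjugate q^c (vector part changes sign)."""
    return (q[0], -q[1], -q[2], -q[3])

def inner(p, q):
    """Inner product <p, q> = S(p q^c)."""
    return qmul(p, qc(q))[0]

def norm(q):
    """Norm Nq = <q, q> = q q^c, in general a complex number."""
    return inner(q, q)

random.seed(0)

def rand_q():
    return tuple(complex(random.uniform(-1, 1), random.uniform(-1, 1))
                 for _ in range(4))

p, q = rand_q(), rand_q()
norm_of_product = norm(qmul(p, q))
product_of_norms = norm(p) * norm(q)
symmetry_gap = abs(inner(p, q) - inner(q, p))
vector_part_of_qqc = qmul(q, qc(q))[1:]
```

Because `norm` returns a complex number in general, the example also makes concrete the remark that the form is not positive definite.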
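A major algebraic result, to be used extensively later in the discussion on the Lorentz transformation, concerns the decomposition of any unit-norm complexified quaternion into simpler quaternions. Explicitly, we show that
$$q = q_R q_B \qquad \forall\, q \in \mathbb{H}_{\mathbb{C}},\ Nq = 1$$
in which $q_B^{*c} = q_B$ and $q_R^{*} = q_R$. (The subscript $B$ refers to a Lorentz boost and subscript $R$ refers to a spatial rotation.) To verify this result we use the constraints on $q_B$, $q_R$ to write them in the following form:
$$q_R = \alpha_R - \underline{\beta}_R, \qquad q_B = \alpha_B + i\underline{\beta}_B, \qquad \alpha_R, \alpha_B \in \mathbb{R} \ \text{and}\ \underline{\beta}_R, \underline{\beta}_B \in \mathbb{R}^3$$
Now
$$q_R q_B = (\alpha_R - \underline{\beta}_R)(\alpha_B + i\underline{\beta}_B)$$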
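If we consider a general complexified quaternion:
$$q = \alpha + i\underline{\beta} = (\alpha^R + i\alpha^I) + i(\underline{\beta}^R + i\underline{\beta}^I) = \alpha^R - \underline{\beta}^I + i(\alpha^I + \underline{\beta}^R)$$
where $\alpha^R$, $\alpha^I$, $\underline{\beta}^R$, $\underline{\beta}^I$ are real. The constraint $Nq = 1$ implies
$$(\alpha^R)^2 - (\alpha^I)^2 - \underline{\beta}^R \cdot \underline{\beta}^R + \underline{\beta}^I \cdot \underline{\beta}^I = 1$$
$$\alpha^R \alpha^I - \underline{\beta}^R \cdot \underline{\beta}^I = 0$$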
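then if $q = q_R q_B$ we must have
$$\alpha^R - \underline{\beta}^I = \alpha_R \alpha_B - \alpha_B \underline{\beta}_R$$
$$\alpha^I + \underline{\beta}^R = \alpha_R \underline{\beta}_B + \underline{\beta}_R \cdot \underline{\beta}_B - \underline{\beta}_R \wedge \underline{\beta}_B$$
That is
$$\alpha^R = \alpha_R \alpha_B \tag{1}$$
$$\underline{\beta}^I = \alpha_B \underline{\beta}_R \tag{2}$$
$$\alpha^I = \underline{\beta}_R \cdot \underline{\beta}_B \tag{3}$$
$$\underline{\beta}^R = \alpha_R \underline{\beta}_B - \underline{\beta}_R \wedge \underline{\beta}_B \tag{4}$$
To begin with, assume $\alpha_B \neq 0$ and $\alpha^R \neq 0$; then
$$(1) \Rightarrow \alpha_R = \frac{\alpha^R}{\alpha_B} \tag{1'}$$
$$(2) \Rightarrow \underline{\beta}_R = \frac{\underline{\beta}^I}{\alpha_B} \tag{2'}$$
$$(3) \Rightarrow \alpha^I = \frac{1}{\alpha_B}\, \underline{\beta}^I \cdot \underline{\beta}_B \tag{3'}$$
$$(4) \Rightarrow \underline{\beta}^R = \alpha_R \underline{\beta}_B - \frac{1}{\alpha_B}\, \underline{\beta}^I \wedge \underline{\beta}_B \tag{4'}$$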
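Also
$$(3'), (4') \Rightarrow \underline{\beta}^R \cdot \underline{\beta}^I = \alpha_R(\alpha_B \alpha^I) = \alpha^R \alpha^I$$
but this relation is always satisfied as a consequence of imposing the condition $Nq = 1$. Now
$$(4') \Rightarrow \underline{\beta}^I \wedge \underline{\beta}^R = \alpha_R\, \underline{\beta}^I \wedge \underline{\beta}_B - \frac{1}{\alpha_B}\, \underline{\beta}^I \wedge (\underline{\beta}^I \wedge \underline{\beta}_B)$$
(We note that if $\underline{\beta}^I = 0$ then, from the condition $Nq = 1$, we have $\alpha^R \alpha^I = 0$ implying $\alpha^I = 0$, which is a pure boost in $q$.) We find
$$\begin{aligned}
\underline{\beta}^I \wedge \underline{\beta}^R &= \alpha_R\, \underline{\beta}^I \wedge \underline{\beta}_B - \frac{1}{\alpha_B}\left[(\underline{\beta}^I \cdot \underline{\beta}_B)\underline{\beta}^I - (\underline{\beta}^I \cdot \underline{\beta}^I)\underline{\beta}_B\right] \\
&= \alpha_R\, \underline{\beta}^I \wedge \underline{\beta}_B - \frac{1}{\alpha_B}\left[\alpha_B \alpha^I \underline{\beta}^I - (\underline{\beta}^I \cdot \underline{\beta}^I)\underline{\beta}_B\right] \\
&= \alpha_R \alpha_B(\alpha_R \underline{\beta}_B - \underline{\beta}^R) - \alpha^I \underline{\beta}^I + \frac{1}{\alpha_B}(\underline{\beta}^I \cdot \underline{\beta}^I)\underline{\beta}_B
\end{aligned}$$
That is
$$\underline{\beta}^I \wedge \underline{\beta}^R = \frac{1}{\alpha_B}\left[\alpha_R^2 \alpha_B^2 + \underline{\beta}^I \cdot \underline{\beta}^I\right]\underline{\beta}_B - \alpha_R \alpha_B \underline{\beta}^R - \alpha^I \underline{\beta}^I$$
from which $\underline{\beta}_B$ can be found unless
$$\alpha_R^2 \alpha_B^2 + \underline{\beta}^I \cdot \underline{\beta}^I = 0$$
But this is never zero unless $\alpha^R = 0$, $\underline{\beta}^I = 0$ which (unless $q = 0$) is never true from the constraint that $Nq = 1$.
Thus, in the case $\alpha_B \neq 0$ we can solve for $\alpha_R$, $\underline{\beta}_R$, $\alpha_B$ and $\underline{\beta}_B$.
If we now consider the case $\alpha_B = 0$ then $\alpha^R = 0$ and $\underline{\beta}^I = 0$ from (1) and (2). But this is impossible from the constraint $Nq = 1$.
In the case $\alpha^R = 0$ and $\alpha_R = 0$, $\alpha_B \neq 0$ then $\underline{\beta}^R \cdot \underline{\beta}^I = 0$ from the constraint $Nq = 1$. Then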
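$$(2) \Rightarrow \underline{\beta}^I = \alpha_B \underline{\beta}_R$$
$$(3) \Rightarrow \alpha^I = \underline{\beta}_R \cdot \underline{\beta}_B = \frac{1}{\alpha_B}\, \underline{\beta}^I \cdot \underline{\beta}_B$$
$$(4) \Rightarrow \underline{\beta}^R = -\underline{\beta}_R \wedge \underline{\beta}_B = -\frac{1}{\alpha_B}\, \underline{\beta}^I \wedge \underline{\beta}_B$$
so that
$$\begin{aligned}
\underline{\beta}^I \wedge \underline{\beta}^R &= -\frac{1}{\alpha_B}\, \underline{\beta}^I \wedge (\underline{\beta}^I \wedge \underline{\beta}_B) \\
&= -\frac{1}{\alpha_B}\left[(\underline{\beta}^I \cdot \underline{\beta}_B)\underline{\beta}^I - (\underline{\beta}^I \cdot \underline{\beta}^I)\underline{\beta}_B\right] \\
&= -\frac{1}{\alpha_B}\left[\alpha_B \alpha^I \underline{\beta}^I - (\underline{\beta}^I \cdot \underline{\beta}^I)\underline{\beta}_B\right]
\end{aligned}$$
giving
$$\underline{\beta}_B = \frac{\alpha_B\left[\underline{\beta}^I \wedge \underline{\beta}^R + \alpha^I \underline{\beta}^I\right]}{\underline{\beta}^I \cdot \underline{\beta}^I}$$
($\underline{\beta}^I \neq 0$ from the constraint $Nq = 1$). We conclude that any $q \in \mathbb{H}_{\mathbb{C}}$ with $Nq = 1$ can be written in the form $q = q_R q_B$ in which $q_R^{*} = q_R$ and $q_B^{*c} = q_B$.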
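The decomposition can be carried out numerically. The sketch below is an illustration under stated assumptions, not the author's own procedure: it treats only the generic case $\alpha_B \neq 0$, $\alpha^R \neq 0$, solves the relations (1) to (4) for $\underline{\beta}_B/\alpha_B = [\underline{\beta}^I \wedge \underline{\beta}^R + \alpha^R\underline{\beta}^R + \alpha^I\underline{\beta}^I]/[(\alpha^R)^2 + \underline{\beta}^I\cdot\underline{\beta}^I]$, and fixes $\alpha_B > 0$ from $Nq_B = \alpha_B^2 - \underline{\beta}_B\cdot\underline{\beta}_B = 1$. The function name `decompose` and the sample angle, rapidity and axes are hypothetical.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def decompose(q):
    """Split a unit-norm q = alpha + i*beta into (q_R, q_B); assumes alpha^R != 0."""
    aR, aI = q[0].real, q[0].imag
    beta = [c / 1j for c in q[1:]]            # vector part of q is i*beta
    bR = [c.real for c in beta]
    bI = [c.imag for c in beta]
    num = [x + aR*y + aI*z for x, y, z in zip(cross(bI, bR), bR, bI)]
    den = aR**2 + dot(bI, bI)
    w = [n / den for n in num]                # w = beta_B / alpha_B
    alpha_B = 1 / math.sqrt(1 - dot(w, w))    # from alpha_B^2 - beta_B.beta_B = 1
    beta_B = [alpha_B * c for c in w]
    alpha_R = aR / alpha_B                    # relation (1)
    beta_R = [c / alpha_B for c in bI]        # relation (2)
    q_R = (complex(alpha_R),) + tuple(complex(-c) for c in beta_R)
    q_B = (complex(alpha_B),) + tuple(1j * c for c in beta_B)
    return q_R, q_B

# Build a test quaternion from a known rotation and boost (illustrative values).
a, b = 0.4, 0.7
n = (1/3, 2/3, 2/3)       # unit rotation axis
m = (2/3, -1/3, 2/3)      # unit boost direction
q_R_true = (complex(math.cos(a)),) + tuple(complex(-math.sin(a) * c) for c in n)
q_B_true = (complex(math.cosh(b)),) + tuple(1j * math.sinh(b) * c for c in m)
q = qmul(q_R_true, q_B_true)

q_R, q_B = decompose(q)
```

With the sign choice $\alpha_B > 0$ the factors are recovered uniquely, so `q_R` and `q_B` agree with the rotation and boost used to construct `q`.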
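Any complexified quaternion $q \in \mathbb{H}_{\mathbb{C}}$ can be written in the form
$$q = \sqrt{Nq}\,(\cos z + \hat{q} \sin z), \qquad N\hat{q} = 1, \quad \hat{q}^2 = -1$$
where, if $q = \alpha + i\underline{\beta}$ then
$$\cos z = \frac{\alpha}{\sqrt{Nq}}, \qquad \sin z = \frac{\sqrt{-\underline{\beta} \cdot \underline{\beta}}}{\sqrt{Nq}}, \qquad \hat{q} = \frac{\underline{\beta}}{\sqrt{\underline{\beta} \cdot \underline{\beta}}}, \qquad z \in \mathbb{C}$$
If $q^c = q^*$ then $q = \alpha + i\underline{\beta}$ with $\alpha$, $\underline{\beta}$ real, and $z$ is wholly imaginary, $z = i\theta/2$:
$$\cos z = \cosh\frac{\theta}{2} = \frac{\alpha}{\sqrt{Nq}}, \qquad \sin z = i \sinh\frac{\theta}{2} = \frac{i\sqrt{\underline{\beta} \cdot \underline{\beta}}}{\sqrt{Nq}}$$
On the other hand if $q = q^*$ then $z$ is real.
Now consider a unit norm $q \in \mathbb{H}_{\mathbb{C}}$, $Nq = 1$
$$q = \cos z + \hat{q} \sin z$$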
If $x$ is any other complexified quaternion then, multiplying by $q$ on the left:
$$qx = x \cos z + \hat{q}x \sin z \tag{1}$$
$$q(\hat{q}x) = \hat{q}x \cos z + \hat{q}(\hat{q}x) \sin z = -x \sin z + \hat{q}x \cos z \tag{2}$$
Now if $x' = \hat{q}x$ then $x'$ and $x$ are orthogonal since
$$S(x x'^c) = \tfrac{1}{2}(x x'^c + x' x^c) = \tfrac{1}{2}(-x x^c \hat{q} + \hat{q} x x^c) = 0$$
The relations in (1), (2) indicate that elements in the plane $(x, \hat{q}x)$ are rotated through a complex angle $z$. In particular if $x = 1$ then $x' = \hat{q}$ and so elements in the plane $(1, \hat{q})$ are rotated through a complex angle $z$.
Now if $v$, $w$, $\hat{q}$ form a right-handed system in $\mathbb{R}^3$ then if we choose
$$x = v, \qquad x' = \hat{q}v = \hat{q} \wedge v = w$$
Therefore
$$qv = v \cos z + (\hat{q}v) \sin z = v \cos z + w \sin z$$
$$q(\hat{q}v) = qw = w \cos z + \hat{q}w \sin z = -v \sin z + w \cos z$$
Here, multiplication on the left by a complexified quaternion $q$ rotates elements in the plane containing $(v, w)$ and in the plane containing $(1, \hat{q})$ through complex angle $z$. However, multiplication on the right by $q$ rotates elements in the $(v, w)$ plane through complex angle $-z$ whilst those elements in the plane $(1, \hat{q})$ are rotated through angle $z$. Therefore, combining these results we see that the transformation:
$$x' = q^c x q, \qquad Nq = 1$$
can be interpreted as a rotation of $V(x)$ through complex angle $2z$ about $V(q)$.
3.4 Application of Complexified Quaternions to Space-Time
In this section we shall be concerned with the application of the quaternion formalism to
Special Relativity.
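We denote a space-time event by four coordinates $(x^0, x^1, x^2, x^3)$. This space-time event will be represented by a complexified quaternion $x$
$$x = x^\alpha e_\alpha = x^0 e_0 + x^1 e_1 + x^2 e_2 + x^3 e_3, \qquad x^\alpha \in \mathbb{R}$$
where $e_\alpha = (1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ will be used to denote the basis for complexified quaternions. The inner product introduced earlier can also be used to define a metric. Let $p$, $q$ be space-time events (i.e. $p^* = p^c$, $q^* = q^c$)
$$p = p^\alpha e_\alpha, \qquad q = q^\alpha e_\alpha$$
then the relation:
$$\langle p, q \rangle = S(p q^c) = \eta_{\alpha\beta}\, p^\alpha q^\beta$$
defines the metric $\eta_{\alpha\beta}$. To obtain its components explicitly choose $p = e_\alpha$, $q = e_\beta$ (for particular $\alpha$, $\beta$). Then
$$p = \delta_\alpha^{\ \mu} e_\mu, \qquad q = \delta_\beta^{\ \nu} e_\nu \qquad (\delta \text{ is the Kronecker delta})$$
That is
$$p^\mu = \delta_\alpha^{\ \mu}, \qquad q^\nu = \delta_\beta^{\ \nu}$$
therefore
$$\langle p, q \rangle = \eta_{\mu\nu}\, \delta_\alpha^{\ \mu} \delta_\beta^{\ \nu} = \eta_{\alpha\beta}$$
Also
$$\langle p, q \rangle = \tfrac{1}{2}\{e_\alpha e_\beta^c + e_\beta e_\alpha^c\}$$
and so
$$\eta_{\alpha\beta} = \tfrac{1}{2}\{e_\alpha e_\beta^c + e_\beta e_\alpha^c\}$$
giving, after a short calculation:
$$\eta_{\alpha\beta} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$
which is the usual flat-space metric of Minkowski space.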
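The 'short calculation' can be reproduced mechanically. The following sketch is illustrative only: quaternions are stored as 4-tuples of complex numbers, and the helper names are assumptions. It evaluates $\tfrac{1}{2}\{e_\alpha e_\beta^c + e_\beta e_\alpha^c\}$ for the basis $e_\alpha = (1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ and collects the scalar parts into a matrix.

```python
def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qc(q):
    """Quaternion conjugate q^c."""
    return (q[0], -q[1], -q[2], -q[3])

# Basis e_alpha = (1, i*i, i*j, i*k): the complex unit multiplies each quaternion unit.
e = [(1, 0, 0, 0), (0, 1j, 0, 0), (0, 0, 1j, 0), (0, 0, 0, 1j)]

def symmetrised(a, b):
    """(1/2)(e_a e_b^c + e_b e_a^c) as a full quaternion."""
    s = qmul(e[a], qc(e[b]))
    t = qmul(e[b], qc(e[a]))
    return tuple(0.5 * (x + y) for x, y in zip(s, t))

products = [[symmetrised(a, b) for b in range(4)] for a in range(4)]
eta = [[complex(products[a][b][0]).real for b in range(4)] for a in range(4)]
```

The vector parts of all sixteen symmetrised products vanish, and `eta` is the diagonal matrix with entries $(1, -1, -1, -1)$.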
Aspects of Special Relativity
Fundamental to special relativity is the Lorentz transformation. It will prove to be of value
to spend some time on its derivation and on examining some of its properties.
The Principle of Special Relativity
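By the 1880's the electromagnetic nature of light had been established by Maxwell. The speed of light $c$ in vacuo is predicted by the Maxwell Equations:
$$\nabla \cdot \underline{E} = 0, \qquad \nabla \cdot \underline{H} = 0, \qquad \nabla \wedge \underline{E} = -\frac{\partial \underline{H}}{\partial t}, \qquad \nabla \wedge \underline{H} = \frac{1}{c^2}\frac{\partial \underline{E}}{\partial t}$$
From these equations we can deduce that both $\underline{E}$, $\underline{H}$ satisfy the wave equation:
$$\nabla^2 \phi = \frac{1}{c^2}\frac{\partial^2 \phi}{\partial t^2}$$
which admits plane wave solutions
$$\phi \sim e^{i(\underline{k} \cdot \underline{r} - c|\underline{k}|t)}$$
which propagate with speed $c$. Here $\underline{k}$ denotes the direction of propagation.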
The experiments of Michelson-Morley in 1887 showed that this speed c is independent of
the motion of the observer. If we only consider observers in uniform relative motion this
result of Michelson-Morley essentially implies that Maxwell's equations (in particular the
wave-equation above) should be invariant with respect to coordinate systems moving with
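constant relative velocity. The Galilean transformation
$$\underline{r}' = \underline{r} - \underline{v}t$$
which assumes the existence of an absolute time $t$ is not sufficient to guarantee the invariance of Maxwell's equations. For example, under this transformation the wave equation becomes:
$$\nabla'^2 \phi = \frac{1}{c^2}\left[\frac{\partial^2 \phi}{\partial t^2} - 2\,(\underline{v} \cdot \nabla')\frac{\partial \phi}{\partial t} + (\underline{v} \cdot \nabla')^2 \phi\right]$$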
Larmor [1] in 1900 and Lorentz [2] in 1903 described a coordinate transformation, now called the Lorentz transformation, which kept invariant the form of Maxwell's equations and so accounted for the results of the Michelson-Morley experiment. However, it was not until 1905, when Einstein proposed two simple principles from which the Lorentz transformation could be derived directly, that the somewhat 'constructed' approach to explaining the (unexpected) negative results of Michelson-Morley was removed.
Einstein proposed two 'relativity' principles:
(i) "The laws of nature are identical in form for any two observers O, O' who are in
uniform relative motion."
His second principle refers directly to the speed of light c
(ii) "The velocity of light c, in vacuo, is a constant, the same for all inertial observers.
That is, it is independent of the velocity of its source."
These two principles are usually supplemented by a third:
(iii) In all inertial systems, particles not acted on by forces (these are called 'free'
particles) will move along a straight line with uniform velocity.
The Lorentz Transformation
Every inertial observer will set up a coordinate system such that every space-time event can be properly labelled. Observer $O$ will have a clock to measure time $t$ and a standard rule to measure 3 spatial coordinates. Each event can thus be labelled
$$(ct, x, y, z) = (x^0, x^1, x^2, x^3)$$
($ct$ converts $t$ into a distance, so that all four coordinates have the same units). The parameter $t$ is called the proper time for observer $O$. A similar system of space-time coordinates can be set up by observer $O'$:
$$(ct', x', y', z') = (x'^0, x'^1, x'^2, x'^3)$$
We are already assuming, in this notation, that the speed of light $c$ is the same for both observers. We shall assume that $O'$ has velocity $\underline{v}$ relative to $O$ and, for convenience, that there is an instant (measured by $t = 0$ and $t' = 0$) at which both spatial origins coincide. The path of a free particle can be parametrized by its proper time $s$ either from $O$'s or from $O'$'s perspective. That is
$$x^\alpha = g^\alpha(s) \qquad \text{or} \qquad x'^\alpha = h^\alpha(s)$$
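($s$ is the time measured on a clock moving with the particle). Now, since the path of the particle is uniform:
$$\frac{d^2 g^\alpha}{ds^2} = 0, \qquad \frac{d^2 h^\alpha}{ds^2} = 0$$
and assuming $x^\alpha$, $x'^\alpha$ are related via a coordinate transformation
$$\frac{dg^\alpha}{ds} = \frac{\partial g^\alpha}{\partial x'^\beta}\frac{dh^\beta}{ds}$$
therefore
$$\begin{aligned}
\frac{d^2 g^\alpha}{ds^2} &= \frac{d}{ds}\left(\frac{\partial g^\alpha}{\partial x'^\beta}\frac{dh^\beta}{ds}\right) \\
&= \frac{\partial^2 g^\alpha}{\partial x'^\gamma \partial x'^\beta}\frac{dh^\beta}{ds}\frac{dh^\gamma}{ds} + \frac{\partial g^\alpha}{\partial x'^\beta}\frac{d^2 h^\beta}{ds^2}
\end{aligned}$$
Thus we deduce:
$$\frac{\partial^2 g^\alpha}{\partial x'^\gamma \partial x'^\beta}\frac{dh^\beta}{ds}\frac{dh^\gamma}{ds} = 0$$
which, since this must be true for all free particles, gives
$$\frac{\partial^2 g^\alpha}{\partial x'^\gamma \partial x'^\beta} = 0$$
Thus, assuming the free particle is moving rigidly with observer $O$,
$$g^0(s) = ct = x^0, \qquad g^1(s) = x^1, \qquad g^2(s) = x^2, \qquad g^3(s) = x^3$$
and so the coordinate transformation between $x^\alpha$ and $x'^\alpha$ is linear
$$x^\alpha = A^\alpha{}_\beta\, x'^\beta$$
(The possible additive constants are zero as the space-time origins of $O$, $O'$ coincide).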
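If the free particle is rigidly attached to $O'$ then
$$\frac{dh^m}{ds} = 0, \quad m = 1,2,3 \qquad \text{and} \qquad \frac{dh^0}{ds} = c$$
Also $\dfrac{dx^m}{dt} = v^m$ which is the velocity of the particle from $O$'s perspective. However, from the transformation between the coordinate systems
$$\frac{dx^m}{ds} = A^m{}_\beta \frac{dh^\beta}{ds}, \qquad \frac{dx^0}{ds} = A^0{}_\beta \frac{dh^\beta}{ds}$$
Therefore
$$v^m \frac{dt}{ds} = A^m{}_0\, c \qquad \text{and} \qquad \frac{dx^0}{ds} = A^0{}_0\, c$$
But $ds = dt'$ and $dh^0 = c\,ds$ and $dx^0 = c\,dt$
$$\therefore\quad A^0{}_0 = \frac{dt}{ds} \qquad \text{and} \qquad A^m{}_0 = v^m\left(\frac{A^0{}_0}{c}\right)$$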
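Now suppose that at this instant when the space-time origins coincide a flash of light is emitted. According to $O$, $O'$ the wave front equations (spheres with radii increasing at the speed of light) are
$$\eta_{\alpha\beta}\, x^\alpha x^\beta = 0, \qquad \eta_{\alpha\beta}\, x'^\alpha x'^\beta = 0$$
However, under our coordinate transformation the first equation has the form:
$$\eta_{\alpha\beta}\, A^\alpha{}_\gamma A^\beta{}_\delta\, x'^\gamma x'^\delta = 0$$
But the left-hand side of this equation can only be a multiple of $\eta_{\alpha\beta}\, x'^\alpha x'^\beta$ and so we deduce
$$\eta_{\alpha\beta}\, A^\alpha{}_\gamma A^\beta{}_\delta = k\, \eta_{\gamma\delta}, \qquad k \in \mathbb{R}$$
If we take $\gamma = \delta = 0$ then
$$(A^0{}_0)^2 - (A^1{}_0)^2 - (A^2{}_0)^2 - (A^3{}_0)^2 = k$$
But
$$A^m{}_0 = \frac{v^m}{c} A^0{}_0$$
and so
$$(A^0{}_0)^2\left(1 - \frac{v^2}{c^2}\right) = k$$
In order to obtain a value for $k$ we consider the inverse transformation. Let $M^\alpha{}_\beta$ be the inverse matrix to $A^\alpha{}_\beta$ ($M^\alpha{}_\beta A^\beta{}_\gamma = \delta^\alpha{}_\gamma$) then
$$x'^\alpha = M^\alpha{}_\beta\, x^\beta$$
In the special case when $v = 0$ we have $A^m{}_0 = 0$ and so $M^0{}_0 A^0{}_0 = 1$. Thus in this case
$$M^0{}_0 = \frac{1}{A^0{}_0} = \pm\frac{1}{\sqrt{k}}$$
However, from $O$'s perspective when $v = 0$ (and considering the particle at $O'$'s spatial origin) then
$$x^0 = A^0{}_0\, x'^0 = \pm\sqrt{k}\; x'^0 \tag{1}$$
But, now reversing everything with the particle at $O$'s spatial origin then, using exactly the same construction as above, we must have
$$x'^0 = M^0{}_0\, x^0 = \pm\frac{1}{\sqrt{k}}\, x^0 \tag{2}$$
But for (1), (2) to take the same form we require $k = 1$.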
To recap: A Lorentz transformation is a coordinate transformation:
$$x^\alpha = A^\alpha{}_\beta\, x'^\beta \qquad (\text{or } X = AX')$$
which satisfies:
$$\eta_{\alpha\beta}\, A^\alpha{}_\gamma A^\beta{}_\delta = \eta_{\gamma\delta}$$
or, in matrix form, $A^T \eta A = \eta$, from which it immediately follows that $\det A = \pm 1$. A Lorentz transformation is one which preserves the value of the inner product $\langle x, y \rangle$.
Now since $(A^0{}_0)^2\left(1 - \dfrac{v^2}{c^2}\right) = 1$ there are just two cases to distinguish: either $A^0{}_0 \geq 1$ or $A^0{}_0 \leq -1$. If $A^0{}_0 \geq 1$ the matrix $A$ transforms future-pointing vectors $p^\alpha$ into future-pointing vectors, and those pointing into the past into past-pointing vectors.
(N.B. A vector $p^\alpha$ is called time-like, null or space-like according as $\eta_{\alpha\beta}\, p^\alpha p^\beta$ is positive, zero or negative. Time-like or null vectors can be further characterised as future- or past-pointing. At the outset $u^\alpha = (1,0,0,0)$ is designated future pointing. Any other vector $p^\alpha$ is future pointing if $\eta_{\alpha\beta}\, p^\alpha u^\beta > 0$, implying $p^\alpha$ is future pointing if $p^0 > 0$.)
The Lorentz group can be considered to be the union of 4 disjoint components
$$L_+^\uparrow \cup L_+^\downarrow \cup L_-^\uparrow \cup L_-^\downarrow$$
in which the arrow refers to the value of $A^0{}_0$ ($\uparrow$ if $A^0{}_0 \geq 1$ and $\downarrow$ if $A^0{}_0 \leq -1$) and the sign refers to the value of $\det A$. In special relativity, and in this text, we only consider the component of the Lorentz group $L_+^\uparrow$
$$L_+^\uparrow = L_+ \cap L^\uparrow$$
This component of the Lorentz group is referred to as the proper orthochronous component. The component $L_+^\uparrow$ contains the identity transformation.
The dimensionality of the Lorentz group is most easily obtained by considering infinitesimal transformations which, in matrix terms, take the form
$$A = I + \epsilon K$$
Here $K$ is a general $4 \times 4$ matrix. The requirement that $A^T \eta A = \eta$ implies
$$(I + \epsilon K^T)\,\eta\,(I + \epsilon K) = \eta$$
from which we deduce, to first order in $\epsilon$:
$$K^T \eta + \eta K = 0$$
The left-hand side is a symmetric matrix and so this equation provides 10 conditions on the sixteen components of $K$, and so the group of transformations $A$ is 6-dimensional.
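The counting argument can be confirmed by treating $K^T\eta + \eta K = 0$ as a linear system in the sixteen unknown entries of $K$. The sketch below is illustrative and uses only exact rational arithmetic from the standard library; the small `rank` helper (Gaussian elimination) is an assumption introduced for the example. Since $\eta$ is diagonal, the $(a,b)$ entry of the condition reads $\eta_b K_{ba} + \eta_a K_{ab} = 0$, and only the entries with $a \leq b$ are independent.

```python
from fractions import Fraction

eta = [1, -1, -1, -1]          # diagonal of the Minkowski metric

# Unknowns: the 16 entries K[a][b], flattened to index 4*a + b.
# One equation per entry (a <= b) of the symmetric matrix K^T eta + eta K.
rows = []
for a in range(4):
    for b in range(a, 4):
        row = [Fraction(0)] * 16
        row[4*b + a] += eta[b]      # (K^T eta)_{ab} = K_{ba} eta_b
        row[4*a + b] += eta[a]      # (eta K)_{ab}   = eta_a K_{ab}
        rows.append(row)

def rank(mat):
    """Rank of a matrix of Fractions by Gaussian elimination."""
    mat = [r[:] for r in mat]
    rk = 0
    for col in range(len(mat[0])):
        pivot = next((r for r in range(rk, len(mat)) if mat[r][col] != 0), None)
        if pivot is None:
            continue
        mat[rk], mat[pivot] = mat[pivot], mat[rk]
        for r in range(len(mat)):
            if r != rk and mat[r][col] != 0:
                f = mat[r][col] / mat[rk][col]
                mat[r] = [x - f*y for x, y in zip(mat[r], mat[rk])]
        rk += 1
    return rk

n_conditions = rank(rows)          # independent linear conditions on K
dimension = 16 - n_conditions      # free parameters left in K
```

The ten conditions are independent, leaving six free parameters: three for boosts and three for spatial rotations.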
The Quaternionic Form of a Lorentz transformation
In this section we show how the Lorentz transformation can be conveniently expressed in terms of complexified quaternions. If $x \in \mathbb{H}_{\mathbb{C}}$ represents a space-time event then a proper Lorentz transformation can be written in terms of a complexified quaternion $q \in \mathbb{H}_{\mathbb{C}}$ as:
$$x \to x' = q^{*c} x q \qquad \text{with } Nq = 1$$
The constraint $Nq = 1$ (implying two real conditions) ensures that this transformation is 6-dimensional. It is easy to show that this transformation preserves the value of the inner product and so must be a Lorentz transformation:
$$\begin{aligned}
\langle x', y' \rangle = S(x' y'^c) &= \tfrac{1}{2}(x' y'^c + y' x'^c) \\
&= \tfrac{1}{2}\left[(q^{*c} x q)(q^c y^c q^*) + (q^{*c} y q)(q^c x^c q^*)\right] \\
&= q^{*c}\left[\tfrac{1}{2}(x y^c + y x^c)\right] q^* \\
&= q^{*c} q^*\, S(x y^c) = S(x y^c) = \langle x, y \rangle
\end{aligned}$$
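The invariance just derived can be checked on concrete numbers. In the sketch below (an illustration; the helper names and the sample angle, rapidity, axes and events are all assumptions) a unit-norm $q$ is built as a product of a real unit quaternion and a boost-type quaternion $\cosh b + i\,\underline{m}\sinh b$, so that $Nq = 1$ by construction. Two events are then transformed by $x' = q^{*c}xq$ and the inner product is compared before and after.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qc(q):
    """Quaternion conjugate q^c."""
    return (q[0], -q[1], -q[2], -q[3])

def qstar(q):
    """Complex conjugate q^*."""
    return tuple(c.conjugate() for c in q)

def inner(p, q):
    """<p, q> = S(p q^c)."""
    return qmul(p, qc(q))[0]

def event(t, x, y, z):
    """Space-time event x^0 + i(x^1 i + x^2 j + x^3 k)."""
    return (complex(t), 1j*x, 1j*y, 1j*z)

def lorentz(q, x):
    """x' = q^{*c} x q."""
    return qmul(qmul(qc(qstar(q)), x), q)

# Unit-norm q built as (rotation) * (boost); all values are illustrative.
a, b = 0.8, 0.6
n = (1/3, 2/3, 2/3)
m = (2/3, -1/3, 2/3)
q_R = (complex(math.cos(a)),) + tuple(complex(-math.sin(a) * c) for c in n)
q_B = (complex(math.cosh(b)),) + tuple(1j * math.sinh(b) * c for c in m)
q = qmul(q_R, q_B)

x = event(1.0, 0.2, -0.5, 0.3)
y = event(2.0, 1.5, 0.4, -0.7)
xp, yp = lorentz(q, x), lorentz(q, y)
before, after = inner(x, y), inner(xp, yp)
```

The transformed quaternion `xp` again has a real scalar part and a purely imaginary vector part, so it still represents a space-time event.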
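We saw earlier that any complexified unit-norm quaternion $q \in \mathbb{H}_{\mathbb{C}}$, $Nq = 1$, could be decomposed into the form
$$q = q_R q_B, \qquad q_B^{*c} = q_B, \quad q_R^{*} = q_R$$
Thus
$$q^{*c} = q_B^{*c}\, q_R^{*c} = q_B\, q_R^{c}$$
and so
$$q^{*c} x q = q_B (q_R^{c}\, x\, q_R)\, q_B$$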
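We can show that Lorentz transformations $q \in \mathbb{H}_{\mathbb{C}}$ for which $q^* = q^c$ represent 'pure' Lorentz transformations and those for which $q^* = q$ correspond to the spatial rotations. The verification that when $q^* = q$ the Lorentz transformation $q^{*c} x q = q^c x q$ describes spatial rotations is an immediate consequence as now $q \in \mathbb{H}$. It will be instructive to verify our second claim that when $q^* = q^c$ the pure Lorentz transformations are obtained. To this end consider a space-time event
$$x = t + i\underline{x} \qquad \text{or} \qquad x = \sqrt{Nx}\,(\cos\phi + \hat{x} \sin\phi)$$
in which
$$\cos\phi = \frac{t}{\sqrt{Nx}}, \qquad \sin\phi = \frac{i\sqrt{\underline{x} \cdot \underline{x}}}{\sqrt{Nx}}, \qquad \hat{x} = \frac{\underline{x}}{\sqrt{\underline{x} \cdot \underline{x}}}$$
If $q^* = q^c$ then we can write (with $Nq = 1$)
$$q = \cosh\frac{\theta}{2} + i\hat{q} \sinh\frac{\theta}{2}$$
then
$$\begin{aligned}
xq &= \sqrt{Nx}\,(\cos\phi + \hat{x}\sin\phi)\left(\cosh\frac{\theta}{2} + i\hat{q}\sinh\frac{\theta}{2}\right) \\
&= \sqrt{Nx}\,\Big[\cos\phi\cosh\frac{\theta}{2} + i(-\hat{x}\cdot\hat{q})\sin\phi\sinh\frac{\theta}{2} \\
&\qquad\quad + \hat{x}\sin\phi\cosh\frac{\theta}{2} + i\hat{q}\cos\phi\sinh\frac{\theta}{2} + i\,\hat{x}\wedge\hat{q}\,\sin\phi\sinh\frac{\theta}{2}\Big]
\end{aligned}$$
and then
$$\begin{aligned}
q^{*c} x q &= \sqrt{Nx}\,\Big[\cos\phi\cosh\theta - i(\hat{x}\cdot\hat{q})\sin\phi\sinh\theta \\
&\qquad\quad + \hat{x}\sin\phi + i\hat{q}\cos\phi\sinh\theta + 2\hat{q}(\hat{x}\cdot\hat{q})\sin\phi\sinh^2\frac{\theta}{2}\Big] \\
&= t\cosh\theta + (\hat{x}\cdot\hat{q})\sqrt{\underline{x}\cdot\underline{x}}\,\sinh\theta + i\hat{q}\,t\sinh\theta \\
&\qquad\quad + i\hat{x}\sqrt{\underline{x}\cdot\underline{x}} + 2i\hat{q}(\hat{x}\cdot\hat{q})\sqrt{\underline{x}\cdot\underline{x}}\,\sinh^2\frac{\theta}{2}
\end{aligned}$$
Therefore
$$t' = t\cosh\theta + (\hat{x}\cdot\hat{q})\sqrt{\underline{x}\cdot\underline{x}}\,\sinh\theta$$
$$\underline{x}' = \hat{x}\sqrt{\underline{x}\cdot\underline{x}} + 2\hat{q}(\hat{x}\cdot\hat{q})\sqrt{\underline{x}\cdot\underline{x}}\,\sinh^2\frac{\theta}{2} + \hat{q}\,t\sinh\theta$$
Noting that $\hat{x} = \dfrac{\underline{x}}{\sqrt{\underline{x}\cdot\underline{x}}}$ and writing $\hat{q} = \dfrac{\underline{v}}{\sqrt{\underline{v}\cdot\underline{v}}} = \dfrac{\underline{v}}{v}$ with $\cosh\theta = \gamma = \dfrac{1}{\sqrt{1 - v^2}}$, $\sinh\theta = -\gamma v$ (to accord with standard formulations) then
$$\underline{x}' = \underline{x} + (\gamma - 1)\frac{\underline{v}(\underline{x}\cdot\underline{v})}{v^2} - \gamma\,\underline{v}\,t$$
and
$$t' = \gamma(t - \underline{x}\cdot\underline{v})$$
which is the standard form of a pure Lorentz transformation (a 'boost') along $\underline{v}$.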
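The agreement between the quaternion form and the standard boost can be tested directly. The following sketch is an illustration only (units with $c = 1$; the sample velocity and event are hypothetical values). It builds $q = \cosh\frac{\theta}{2} + i\hat{q}\sinh\frac{\theta}{2}$ with $\cosh\theta = \gamma$, $\sinh\theta = -\gamma v$, forms $q^{*c}xq = qxq$, and compares the result with $t' = \gamma(t - \underline{x}\cdot\underline{v})$ and $\underline{x}' = \underline{x} + (\gamma - 1)\underline{v}(\underline{x}\cdot\underline{v})/v^2 - \gamma\underline{v}t$.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

# Illustrative boost velocity (c = 1) and event; these numbers are assumptions.
v = (0.3, -0.2, 0.5)
speed = math.sqrt(sum(c*c for c in v))
gamma = 1 / math.sqrt(1 - speed**2)
theta = -math.atanh(speed)              # cosh(theta) = gamma, sinh(theta) = -gamma*speed
qhat = tuple(c / speed for c in v)

q = (complex(math.cosh(theta/2)),) + tuple(1j * math.sinh(theta/2) * c for c in qhat)

t, xvec = 1.7, (0.4, 1.1, -0.6)
x = (complex(t),) + tuple(1j * c for c in xvec)

# For a boost q^* = q^c, hence q^{*c} = q and x' = q x q.
xp = qmul(qmul(q, x), q)
t_quat = xp[0].real
x_quat = tuple((c / 1j).real for c in xp[1:])

# Standard boost formulas for comparison.
xv = sum(a*b for a, b in zip(xvec, v))
t_std = gamma * (t - xv)
x_std = tuple(xi + (gamma - 1) * xv * vi / speed**2 - gamma * vi * t
              for xi, vi in zip(xvec, v))
```

The pair `(t_quat, x_quat)` coincides with `(t_std, x_std)` to rounding error, and the interval $t^2 - \underline{x}\cdot\underline{x}$ is unchanged.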
We now briefly examine some of the more straightforward applications of the quaternion
formalism to problems in Special Relativity, in particular, to problems in particle
mechanics.
Particle Mechanics
Following the approach outlined by Synge [15] we consider a particle moving along a path with equations $x^r = x^r(s)$ where $s$ is the proper time measured by a clock moving with the particle. The time $t$, as measured by an observer $O$, and the proper time $s$ are related through the metric:
$$ds^2 = \eta_{\alpha\beta}\, dx^\alpha dx^\beta = c^2 dt^2 - dx^a dx^a$$
Therefore
$$ds = c\,dt\left(1 - \frac{u^a u^a}{c^2}\right)^{1/2} = \frac{c}{\beta}\, dt$$
where $\beta = \left(1 - \dfrac{u^2}{c^2}\right)^{-1/2}$. Here the 3-velocity $u^a$ and the 4-velocity $v^\alpha$ are defined by
$$u^a = \frac{dx^a}{dt}, \qquad v^\alpha = \frac{dx^\alpha}{ds}$$
We can now define the so-called 4-momentum $M^\alpha$
$$M^\alpha = m\frac{dx^\alpha}{ds} = \left(m\beta,\ \frac{m\beta}{c}u^a\right)$$
where $m$ is called the proper mass of the particle. We can view this in quaternionic terms by introducing the quaternion $M$
$$M = M^\alpha e_\alpha = m\beta\, e_0 + \frac{m\beta}{c}\, u^a e_a$$
We also introduce the quaternion 4-force $F$:
$$F = \frac{dM}{ds}$$
These are the (Lorentz invariant) equations of motion. This can be written in terms of the 4-velocity $v$ (a quaternion) as $M = mv$
$$F = \frac{d}{ds}(mv) = \frac{dm}{ds}\, v + m\frac{dv}{ds} \tag{1}$$
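We easily verify that $Nv = 1$. Writing $v = v^0 + i\underline{v}$, where $v^0 = \beta$ and $\underline{v} = (\beta/c)\underline{u}$ is the spatial part of the 4-velocity:
$$Nv = v v^c = (v^0 + i\underline{v})(v^0 - i\underline{v}) = (v^0)^2 - \underline{v}\cdot\underline{v} = \beta^2\left(1 - \frac{u^2}{c^2}\right) = 1$$
Thus
$$\frac{dv}{ds}\, v^c + v\, \frac{dv^c}{ds} = 0$$
Now, from (1)
$$F v^c = \frac{dm}{ds}\, v v^c + m \frac{dv}{ds}\, v^c = \frac{dm}{ds} + m \frac{dv}{ds}\, v^c \tag{2}$$
From this relation we find (taking conjugates and adding)
$$F v^c + v F^c = 2\frac{dm}{ds} \qquad \therefore\quad \frac{dm}{ds} = \langle F, v \rangle$$
We conclude that only if $F$, $v$ are orthogonal (i.e. the 4-force is orthogonal to the path of the particle) is the proper mass a constant.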
It is interesting to note that the norm of the 4-momentum quaternion $M$ is the square of the proper mass of the particle:
$$\begin{aligned}
\langle M, M \rangle = M M^c &= \left[m\beta + i\frac{m\beta}{c}\underline{u}\right]\left[m\beta - i\frac{m\beta}{c}\underline{u}\right] \\
&= m^2\beta^2 - \frac{m^2\beta^2}{c^2}\, \underline{u}\cdot\underline{u} \\
&= m^2
\end{aligned}$$
A momentum 4-vector $M_p$ can also be defined for a photon. A photon is associated (in quantum theory) with a set of plane waves moving with velocity $c$. Regarding a photon as a particle with zero proper mass (since its speed can never be reduced to zero), it is characterised by a 3-velocity vector $\underline{u} = c\,\underline{a}$ where $\underline{a}\cdot\underline{a} = 1$. Thus its momentum 4-vector has the quaternionic form
$$M_p = b(1 + i\underline{a}), \qquad b \in \mathbb{R}$$
which satisfies
$$\langle M_p, M_p \rangle = b^2(1 + i\underline{a})(1 - i\underline{a}) = b^2(1 - \underline{a}\cdot\underline{a}) = 0$$
The value of $b$ must reflect the frequency of the photon (of the plane wave). The factor $b$ is taken as $b = \dfrac{h}{c^2}\nu$ where $h$ is Planck's constant and $\nu$ is the frequency of the plane wave associated with the photon.
Conservation of 4-Momentum
Here we consider a number of free particles colliding together. After impact they may
break up into smaller particles or coalesce together to produce fewer numbers of particles
or even produce photons in the collision process. Whatever happens, we assume the law
of conservation of 4-momentum is valid; a law which is easily expressed:
$$\sum_{\text{over } a} M_{(a)} = \sum_{\text{over } b} M'_{(b)}$$
Here the prime indicates the system of particles and photons after collision; the unprimed quantities refer to the system prior to collision. Though not all collision problems can be solved using only this conservation law there are some simple cases for which a solution is possible. For example, the situation in which two particles with 4-momentum $M_{(1)}$, $M_{(2)}$ collide and coalesce into a single particle $M_{(3)}$ can be fully analysed. Here
$$M_{(i)} = m_{(i)} v_{(i)}, \qquad i = 1,2,3$$
wherein $m_{(i)}$, $v_{(i)}$, $i = 1,2,3$, are the proper masses and 4-velocities of the particles respectively. Now we know that
$$v_{(a)} v^c_{(a)} = 1, \qquad M_{(a)} M^c_{(a)} = m^2_{(a)}, \qquad a = 1,2,3$$
Thus, by the conservation of 4-momentum, $M_{(3)} = M_{(1)} + M_{(2)}$ and so
$$\begin{aligned}
m^2_{(3)} &= (M_{(1)} + M_{(2)})(M^c_{(1)} + M^c_{(2)}) \\
&= (m_{(1)} v_{(1)} + m_{(2)} v_{(2)})(m_{(1)} v^c_{(1)} + m_{(2)} v^c_{(2)}) \\
&= m^2_{(1)} + m^2_{(2)} + m_{(1)} m_{(2)}\left(v_{(1)} v^c_{(2)} + v_{(2)} v^c_{(1)}\right)
\end{aligned}$$
from which $m_{(3)}$ can be found. Also $M_{(3)} = M_{(1)} + M_{(2)}$ implies
$$m_{(3)} v_{(3)} = m_{(1)} v_{(1)} + m_{(2)} v_{(2)}$$
from which $v_{(3)}$ can be found.
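A small numerical instance of the coalescing collision may help. The sketch below is illustrative (units with $c = 1$; the masses and 3-velocities are hypothetical values). It forms the 4-velocity quaternions $v = \beta(1 + i\underline{u})$, adds the 4-momenta, and compares the proper mass obtained from $M_{(3)}M^c_{(3)}$ with the closed formula $m^2_{(1)} + m^2_{(2)} + 2m_{(1)}m_{(2)}\langle v_{(1)}, v_{(2)}\rangle$, which follows from the expression above because $v_{(1)}v^c_{(2)} + v_{(2)}v^c_{(1)} = 2\langle v_{(1)}, v_{(2)}\rangle$.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qc(q):
    """Quaternion conjugate q^c."""
    return (q[0], -q[1], -q[2], -q[3])

def inner(p, q):
    """<p, q> = S(p q^c)."""
    return qmul(p, qc(q))[0]

def four_velocity(u):
    """v = beta (1 + i u) for a 3-velocity u, in units with c = 1."""
    beta = 1 / math.sqrt(1 - sum(c*c for c in u))
    return (complex(beta),) + tuple(1j * beta * c for c in u)

def scale(k, q):
    return tuple(k * c for c in q)

# Illustrative proper masses and 3-velocities of the incoming particles.
m1, u1 = 2.0, (0.6, 0.0, 0.0)
m2, u2 = 3.0, (-0.2, 0.3, 0.0)
v1, v2 = four_velocity(u1), four_velocity(u2)

# Conservation of 4-momentum: M3 = M1 + M2.
M3 = tuple(a + b for a, b in zip(scale(m1, v1), scale(m2, v2)))
m3 = math.sqrt(inner(M3, M3).real)
m3_formula = math.sqrt(m1**2 + m2**2 + 2 * m1 * m2 * inner(v1, v2).real)
v3 = scale(1 / m3, M3)
```

The final proper mass exceeds $m_{(1)} + m_{(2)}$, the excess being the kinetic energy absorbed in the collision, and `v3` again has unit norm.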
Angular Momentum
Consider a free particle with 4-momentum $M$ at event $x$ (both complexified quaternions). The angular momentum of $M$ at $x$ with respect to the origin is defined to be the complexified quaternion $H$:
$$H = V(x M^c)$$
If $x = x^0 + i\underline{x}$ and $M = M^0 + i\underline{M}$ then
$$H = i(M^0 \underline{x} - x^0 \underline{M}) + \underline{x} \wedge \underline{M}$$
This complexified quaternion corresponds directly to the bivector representation of angular momentum
$$H_{\alpha\beta} = x_\alpha M_\beta - x_\beta M_\alpha$$
The angular momentum is clearly dependent upon the position of the origin. An observer at event $y$ would observe an angular momentum:
$$H' = V((x - y) M^c) = H - V(y M^c)$$
Using this formulation it is easy to show that the angular momentum of a free particle is conserved between collisions. Let $x$, $y$ be two events on the path of a particle; then the difference in the angular momentum observed at $x$, $y$ is
$$H_x - H_y = V(x M^c) - V(y M^c) = V((x - y) M^c)$$
(where we use the fact that the 4-momentum is unchanged along the path of a free particle). Now, by definition the 4-momentum is the proper mass times the rate of change of the particle's event position with respect to its proper time and so
$$M = k(x - y), \qquad k \in \mathbb{R}$$
Therefore
$$H_x - H_y = k\, V((x - y)(x - y)^c) = 0$$
and so
$$H_x = H_y$$
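Intrinsic Angular Momentum
The angular momentum defined above is normally referred to as the orbital angular momentum. We shall show, in a later section, that if the angular momentum is defined as $H = V(x M^c)$ then we can deduce that $\tfrac{1}{2}(M^c H - H^* M^c) = 0$ (the tensor equivalent of this statement is that $H_{\alpha\beta} M^\beta = 0$).
As well as the orbital angular momentum a particle may be endowed with intrinsic angular momentum, often called the spin angular momentum. A simple analogy is the Earth/Sun planetary system. With respect to the Sun the Earth, in its orbit, has angular momentum. The angular momentum it has due to its spinning about its axis is its intrinsic angular momentum.
In this context the intrinsic angular momentum of a particle, denoted by $P$, has the quaternionic form:
$$P = \underline{B} - i\underline{A}, \qquad \underline{B}, \underline{A} \in \mathbb{R}^3$$
in which, if $M$ is its 4-momentum, we demand
$$\tfrac{1}{2}(M^c P - P^* M^c) = 0$$
This constraint implies
$$(M^0 - i\underline{M})(\underline{B} - i\underline{A}) - (\underline{B} + i\underline{A})(M^0 - i\underline{M}) = 0$$
that is
$$\begin{aligned}
&M^0\underline{B} - iM^0\underline{A} + i\underline{M}\cdot\underline{B} - i\underline{M}\wedge\underline{B} + \underline{M}\cdot\underline{A} - \underline{M}\wedge\underline{A} \\
&\quad - \{M^0\underline{B} + iM^0\underline{A} + i\underline{B}\cdot\underline{M} - i\underline{B}\wedge\underline{M} - \underline{A}\cdot\underline{M} + \underline{A}\wedge\underline{M}\} = 0
\end{aligned}$$
that is
$$-i(M^0\underline{A} + \underline{M}\wedge\underline{B}) + \underline{M}\cdot\underline{A} = 0$$
therefore
$$\underline{M}\cdot\underline{A} = 0 \qquad \text{and} \qquad M^0\underline{A} = \underline{B}\wedge\underline{M}$$
These equations show that, as 3-vectors, $\underline{A}$, $\underline{B}$, $\underline{M}$ form a right-handed system. Also there is a single (complex) invariant associated with $P$, namely its norm:
$$\begin{aligned}
NP &= (\underline{B} - i\underline{A})(-\underline{B} + i\underline{A}) \\
&= \underline{B}\cdot\underline{B} - \underline{A}\cdot\underline{A} - 2i\underline{A}\cdot\underline{B} \\
&= \underline{B}\cdot\underline{B} - \underline{A}\cdot\underline{A} \qquad \text{since } \underline{A}\cdot\underline{B} = 0
\end{aligned}$$
Thus there is only one real invariant associated with $P$, called its spin invariant.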
3.5 Quaternions and Electromagnetism
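Before discussing electromagnetism we consider introducing a complexified quaternion differential operator $D$:
$$D = \frac{\partial}{\partial x_0} + i\left(\mathbf{i}\frac{\partial}{\partial x_1} + \mathbf{j}\frac{\partial}{\partial x_2} + \mathbf{k}\frac{\partial}{\partial x_3}\right) = D^\alpha e_\alpha$$
so that
$$D^\alpha = \left(\frac{\partial}{\partial x_0}, \frac{\partial}{\partial x_1}, \frac{\partial}{\partial x_2}, \frac{\partial}{\partial x_3}\right)$$
We can raise or lower indices using the metric
$$x_\alpha = \eta_{\alpha\beta}\, x^\beta$$
therefore
$$x_0 = x^0, \qquad x_1 = -x^1, \qquad x_2 = -x^2, \qquad x_3 = -x^3$$
and
$$\frac{\partial}{\partial x_0} = \frac{\partial}{\partial x^0}, \qquad \frac{\partial}{\partial x_1} = -\frac{\partial}{\partial x^1}, \qquad \frac{\partial}{\partial x_2} = -\frac{\partial}{\partial x^2}, \qquad \frac{\partial}{\partial x_3} = -\frac{\partial}{\partial x^3}$$
therefore
$$D_\alpha = \eta_{\alpha\beta} D^\beta, \qquad D_\alpha = \left(\frac{\partial}{\partial x^0}, \frac{\partial}{\partial x^1}, \frac{\partial}{\partial x^2}, \frac{\partial}{\partial x^3}\right)$$
so
$$D_0 = D^0, \qquad D_m = -D^m, \quad m = 1,2,3$$
Now, under a Lorentz boost:
$$x^0 = \beta(x'^0 + \underline{v}\cdot\underline{x}'), \qquad \underline{x} = \underline{x}' + (\beta - 1)\frac{\underline{v}(\underline{v}\cdot\underline{x}')}{v^2} + \beta\,\underline{v}\,x'^0$$
we obtain
$$\frac{\partial}{\partial x'^\beta} = \frac{\partial x^\alpha}{\partial x'^\beta}\frac{\partial}{\partial x^\alpha} = \frac{\partial x^0}{\partial x'^\beta} D_0 + \frac{\partial x^m}{\partial x'^\beta} D_m$$
$$D'_0 = \frac{\partial}{\partial x'^0} = \beta D_0 + \beta v^m D_m = \beta(D^0 - v^m D^m)$$
$$D'_k = \frac{\partial}{\partial x'^k} = D_k + (\beta - 1)\frac{v^m D_m}{v^2}\, v^k + \beta v^k D_0$$
which shows that $D_\alpha$ transforms like a 4-vector ($D' = q^{*c} D q$ in quaternion terms). Noting that
$$D = \frac{\partial}{\partial x^0} - i\left(\mathbf{i}\frac{\partial}{\partial x^1} + \mathbf{j}\frac{\partial}{\partial x^2} + \mathbf{k}\frac{\partial}{\partial x^3}\right) = \frac{\partial}{\partial x^0} - i\nabla$$
where $\nabla = \mathbf{i}\dfrac{\partial}{\partial x^1} + \mathbf{j}\dfrac{\partial}{\partial x^2} + \mathbf{k}\dfrac{\partial}{\partial x^3}$ is the usual three-dimensional gradient operator, then
$$D^c = \frac{\partial}{\partial x^0} + i\nabla$$
We easily show that
$$D D^c = D^c D = D^\alpha D_\alpha = \frac{\partial^2}{\partial (x^0)^2} - \nabla^2$$
which is the wave-operator.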
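A very useful application of the $D$-operator occurs in the description of electromagnetic fields. In this the electric and magnetic fields $\underline{E}$, $\underline{H}$ are combined together as a single complexified quaternion $Q$
$$Q = \underline{H} - i\underline{E}, \qquad \underline{H}, \underline{E} \in \mathbb{R}^3$$
in which, in this form, $\underline{H}$, $\underline{E}$ are regarded as pure quaternions. We similarly combine the charge density $\rho$ and the current $\underline{J}$ as a complexified quaternion
$$J = -\rho + i\underline{J}$$
The usefulness of the complexified quaternionic formulation is elegantly displayed in that all of Maxwell's equations are incorporated in a single quaternion equation $D^c Q = J$. This is easily demonstrated:
$$D^c Q = \left(\frac{\partial}{\partial x^0} + i\nabla\right)[\underline{H} - i\underline{E}] = -\rho + i\underline{J}$$
Then equating together appropriate terms (scalars, vectors, reals and imaginary quantities):
$$\nabla\cdot\underline{E} = \rho, \qquad \nabla\wedge\underline{E} = -\frac{\partial\underline{H}}{\partial x^0}, \qquad \nabla\cdot\underline{H} = 0, \qquad \nabla\wedge\underline{H} = \underline{J} + \frac{\partial\underline{E}}{\partial x^0}$$
which are the usual set of Maxwell's equations. We can also describe electromagnetism using a scalar potential $A^0$ and a vector potential $\underline{A}$:
$$\underline{E} = -\frac{\partial\underline{A}}{\partial x^0} + \nabla A^0, \qquad \underline{H} = \nabla\wedge\underline{A}$$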
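Returning for a moment to the single equation $D^c Q = J$, the step to the four Maxwell equations can be written out explicitly. The following worked expansion is a sketch added for clarity; it uses only the product rule for pure quaternions, $\nabla\underline{X} = -\nabla\cdot\underline{X} + \nabla\wedge\underline{X}$, together with $D^c = \partial/\partial x^0 + i\nabla$.

```latex
\begin{aligned}
D^c Q &= \Big(\frac{\partial}{\partial x^0} + i\nabla\Big)(\underline{H} - i\underline{E}) \\
      &= \frac{\partial \underline{H}}{\partial x^0}
         - i\,\frac{\partial \underline{E}}{\partial x^0}
         + i\,\nabla\underline{H} + \nabla\underline{E} \\
      &= -\nabla\cdot\underline{E} \;-\; i\,\nabla\cdot\underline{H}
         \;+\; \Big(\frac{\partial \underline{H}}{\partial x^0} + \nabla\wedge\underline{E}\Big)
         \;+\; i\,\Big(\nabla\wedge\underline{H} - \frac{\partial \underline{E}}{\partial x^0}\Big)
\end{aligned}
```

Comparing with $J = -\rho + i\underline{J}$: the real scalar part gives $\nabla\cdot\underline{E} = \rho$, the imaginary scalar part gives $\nabla\cdot\underline{H} = 0$, the real vector part gives $\nabla\wedge\underline{E} = -\partial\underline{H}/\partial x^0$, and the imaginary vector part gives $\nabla\wedge\underline{H} = \underline{J} + \partial\underline{E}/\partial x^0$.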
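If we introduce a complexified quaternion $A = A^0 + i\underline{A}$ then the dependency of $\underline{E}$, $\underline{H}$ on the potential $A$ can be written as a single equation
$$V(DA) = Q = \underline{H} - i\underline{E}$$
This follows since
$$\begin{aligned}
DA &= \left(\frac{\partial}{\partial x^0} - i\nabla\right)[A^0 + i\underline{A}] \\
&= \frac{\partial A^0}{\partial x^0} + i\frac{\partial\underline{A}}{\partial x^0} - i\nabla A^0 - \nabla\cdot\underline{A} + \nabla\wedge\underline{A}
\end{aligned}$$
therefore
$$(DA)^c = \frac{\partial A^0}{\partial x^0} - i\frac{\partial\underline{A}}{\partial x^0} + i\nabla A^0 - \nabla\cdot\underline{A} - \nabla\wedge\underline{A}$$
and so
$$V(DA) = i\frac{\partial\underline{A}}{\partial x^0} - i\nabla A^0 + \nabla\wedge\underline{A}$$
Thus $V(DA) = \underline{H} - i\underline{E}$ implies
$$\underline{E} = -\frac{\partial\underline{A}}{\partial x^0} + \nabla A^0 \qquad \text{and} \qquad \underline{H} = \nabla\wedge\underline{A}$$
which are the usual field/potential relations.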
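It is interesting to obtain Maxwell's equations in potential form:
$$\tfrac{1}{2} D^c(DA - (DA)^c) = J = -\rho + i\underline{J}$$
To see this, the left hand side is, when expanded
$$\begin{aligned}
&i\frac{\partial^2\underline{A}}{\partial (x^0)^2} - i\frac{\partial}{\partial x^0}(\nabla A^0) + \frac{\partial}{\partial x^0}(\nabla\wedge\underline{A}) + \nabla\cdot\frac{\partial\underline{A}}{\partial x^0} - \nabla\wedge\frac{\partial\underline{A}}{\partial x^0} \\
&\quad - \nabla^2 A^0 + \nabla\wedge(\nabla A^0) + i\nabla\wedge(\nabla\wedge\underline{A}) - i\nabla\cdot(\nabla\wedge\underline{A})
\end{aligned}$$
Note that we have not used the identities:
$$\nabla\wedge(\nabla A^0) \equiv 0, \qquad \frac{\partial}{\partial x^0}(\nabla\wedge\underline{A}) \equiv \nabla\wedge\frac{\partial\underline{A}}{\partial x^0}, \qquad \nabla\cdot(\nabla\wedge\underline{A}) \equiv 0$$
as only by keeping these terms can the full set of Maxwell's equations be recovered from the usual potential relations. We have already noted that $V(DA) = \underline{H} - i\underline{E}$. We can easily verify that
$$S(DA) = \tfrac{1}{2}(DA + (DA)^c) = \frac{\partial A^0}{\partial x^0} - \nabla\cdot\underline{A}$$
which vanishes if the usual Lorenz (not H A Lorentz) gauge condition $\nabla\cdot\underline{A} = \dfrac{\partial A^0}{\partial x^0}$ is imposed. If this condition is imposed we can regard $D$ and $A$ as orthogonal 4-vectors. Now
$$\langle D, A^c \rangle = S(DA) = \tfrac{1}{2}(DA + (DA)^c) \qquad \text{and} \qquad \langle D, D \rangle = S(D D^c) = D D^c$$
and so, without the imposition of the gauge condition, Maxwell's equations can be written as
$$\begin{aligned}
\tfrac{1}{2} D^c(DA - (DA)^c) &= D^c D A - \tfrac{1}{2} D^c(DA + (DA)^c) \\
&= \langle D, D \rangle A - D^c \langle D, A^c \rangle = J
\end{aligned}$$
whereas if the gauge condition is imposed then Maxwell's equations (in terms of the potential $A$) take the elegant form:
$$\langle D, D \rangle A = J \qquad \text{or} \qquad DA = \underline{H} - i\underline{E}$$
The first of these equations gives, on expansion
$$\left[\frac{\partial^2 A^0}{\partial (x^0)^2} - \nabla^2 A^0 + \rho\right] + i\left[\frac{\partial^2\underline{A}}{\partial (x^0)^2} - \nabla^2\underline{A} - \underline{J}\right] = 0$$
implying the usual wave equations for $A^0$, $\underline{A}$.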
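Lorentz Transformation of $\underline{E}$, $\underline{H}$
To obtain the Lorentz transformation of $\underline{E}$, $\underline{H}$ one would normally express these quantities in terms of the potentials:
$$\underline{E} = -\frac{\partial\underline{A}}{\partial x^0} + \nabla A^0, \qquad \underline{H} = \nabla\wedge\underline{A}$$
then
$$\underline{E}' = -\frac{\partial\underline{A}'}{\partial x'^0} + \nabla' A'^0, \qquad \underline{H}' = \nabla'\wedge\underline{A}'$$
where $\dfrac{\partial}{\partial x'^0}$ and $\nabla'$ refer to coordinates $(x'^0, x'^1, x'^2, x'^3)$. We can then make use of the fact that $(A^0, \underline{A})$ and $\left(\dfrac{\partial}{\partial x^0}, \nabla\right)$ transform as 4-vectors under a Lorentz transformation. However, there is a more elegant approach to finding the forms of $\underline{E}'$, $\underline{H}'$ using complex rotations in the space of complex 3-vectors and utilising complexified quaternion algebra. If $x$, $y$ are 4-vectors then, under a Lorentz transformation defined by $Q$:
$$x' = Q^{*c} x Q \qquad \text{and} \qquad y' = Q^{*c} y Q, \qquad NQ = 1$$
thus
$$x' y'^c = (Q^{*c} x Q)(Q^c y^c Q^*) = Q^{*c}\, x y^c\, Q^*$$
From what we have already discovered about quaternion transformations this particular transformation can be considered as a complex rotation in the space of complex 3-vectors if $Q$ is such that $Q^* = Q^c$ (thereby obtaining $Q\, x y^c\, Q^c$ on the right hand side), which is a Lorentz boost, or if $Q$ is such that $Q^* = Q$ (obtaining $Q^c\, x y^c\, Q$ on the right hand side), which is a Lorentz spatial rotation.
For such complex rotations we know that
$$S(x' y'^c) = S(x y^c) \qquad \text{and} \qquad N(x' y'^c) = N(x y^c)$$
Now reverting to the electromagnetic arena let us take $D$, $A^c$ as $x$, $y$; then
$$\begin{aligned}
x y^c = DA &= \frac{\partial A^0}{\partial x^0} - \nabla\cdot\underline{A} + \nabla\wedge\underline{A} + i\left(\frac{\partial\underline{A}}{\partial x^0} - \nabla A^0\right) \\
&= \frac{\partial A^0}{\partial x^0} - \nabla\cdot\underline{A} + \underline{H} - i\underline{E}
\end{aligned}$$
Thus, under a Lorentz transformation, either a boost or a spatial rotation,
$$S(DA) = \frac{\partial A^0}{\partial x^0} - \nabla\cdot\underline{A}$$
is conserved. That is
$$\frac{\partial A'^0}{\partial x'^0} - \nabla'\cdot\underline{A}' = \frac{\partial A^0}{\partial x^0} - \nabla\cdot\underline{A}$$
Now if we choose the usual gauge then $\dfrac{\partial A^0}{\partial x^0} = \nabla\cdot\underline{A}$ (which is therefore true in all inertial coordinate systems) and
$$x y^c = \underline{H} - i\underline{E}$$
Also, choosing $Q = \alpha_B - i\underline{\beta}_B$, $\alpha_B \in \mathbb{R}$, $\underline{\beta}_B \in \mathbb{R}^3$ (and thus $Q^* = Q^c$), which is a boost, then $\underline{H}' - i\underline{E}'$ is obtained by using the complex rotation:
$$\begin{aligned}
\underline{H}' - i\underline{E}' &= (\alpha_B - i\underline{\beta}_B)(\underline{H} - i\underline{E})(\alpha_B + i\underline{\beta}_B) \\
&= (\alpha_B - i\underline{\beta}_B)\left[\alpha_B\underline{H} - i\underline{H}\cdot\underline{\beta}_B + i\underline{H}\wedge\underline{\beta}_B - i\alpha_B\underline{E} - \underline{E}\cdot\underline{\beta}_B + \underline{E}\wedge\underline{\beta}_B\right] \\
&= \alpha_B^2\underline{H} + 2\alpha_B\underline{E}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{H}\cdot\underline{\beta}_B) + (\underline{\beta}_B\cdot\underline{\beta}_B)\underline{H} \\
&\quad + i\left[-\alpha_B^2\underline{E} + 2\alpha_B\underline{H}\wedge\underline{\beta}_B + 2\underline{\beta}_B(\underline{E}\cdot\underline{\beta}_B) - (\underline{\beta}_B\cdot\underline{\beta}_B)\underline{E}\right]
\end{aligned}$$
Therefore
$$\underline{H}' = (\alpha_B^2 + \underline{\beta}_B\cdot\underline{\beta}_B)\underline{H} + 2\alpha_B\underline{E}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{H}\cdot\underline{\beta}_B)$$
$$\underline{E}' = (\alpha_B^2 + \underline{\beta}_B\cdot\underline{\beta}_B)\underline{E} - 2\alpha_B\underline{H}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{E}\cdot\underline{\beta}_B)$$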
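Now
$$Q = \alpha_B - i\underline{\beta}_B = \cos z + \hat{q}\sin z = \cosh\frac{\theta}{2} + i\hat{q}\sinh\frac{\theta}{2} \qquad \text{where } z = i\frac{\theta}{2}$$
and so
$$\alpha_B = \cos z = \cosh\frac{\theta}{2}, \qquad -\underline{\beta}_B = -i\hat{q}\sin z = \hat{q}\sinh\frac{\theta}{2}$$
then
$$\underline{E}' = \cosh\theta\,\underline{E} + \sinh\theta\,\underline{H}\wedge\hat{q} - 2\sinh^2\frac{\theta}{2}\,\hat{q}(\underline{E}\cdot\hat{q})$$
If we let
$$\cosh\theta = \beta = \frac{1}{\sqrt{1 - v^2}}, \qquad \sinh\theta = -\beta v, \qquad \underline{v} = v\hat{q}$$
then
$$\underline{E}' = \beta\left[\underline{E} + \underline{v}\wedge\underline{H} - \frac{\underline{v}(\underline{E}\cdot\underline{v})}{v^2}\right] + \frac{\underline{v}(\underline{E}\cdot\underline{v})}{v^2}$$
Similarly:
$$\underline{H}' = \beta\left[\underline{H} - \underline{v}\wedge\underline{E} - \frac{\underline{v}(\underline{H}\cdot\underline{v})}{v^2}\right] + \frac{\underline{v}(\underline{H}\cdot\underline{v})}{v^2}$$
These two formulae indicate how $\underline{H}$, $\underline{E}$ transform under a Lorentz boost. The same approach can be used to obtain the forms of $\underline{H}$, $\underline{E}$ under a spatial rotation. In this case we choose
$$Q = \alpha_R - \underline{\beta}_R, \qquad \alpha_R, \underline{\beta}_R \text{ both real } (Q^* = Q)$$
then
$$\begin{aligned}
\underline{H}' - i\underline{E}' &= (\alpha_R + \underline{\beta}_R)(\underline{H} - i\underline{E})(\alpha_R - \underline{\beta}_R) \\
&= (\alpha_R + \underline{\beta}_R)\left[\alpha_R\underline{H} + \underline{H}\cdot\underline{\beta}_R - \underline{H}\wedge\underline{\beta}_R - i\alpha_R\underline{E} - i\underline{E}\cdot\underline{\beta}_R + i\underline{E}\wedge\underline{\beta}_R\right] \\
&= \alpha_R^2\underline{H} - 2\alpha_R\underline{H}\wedge\underline{\beta}_R - i\alpha_R^2\underline{E} + 2i\alpha_R\underline{E}\wedge\underline{\beta}_R \\
&\quad + 2\underline{\beta}_R(\underline{H}\cdot\underline{\beta}_R) - (\underline{\beta}_R\cdot\underline{\beta}_R)\underline{H} + i(\underline{\beta}_R\cdot\underline{\beta}_R)\underline{E} - 2i\underline{\beta}_R(\underline{E}\cdot\underline{\beta}_R)
\end{aligned}$$
therefore
$$\underline{H}' = \alpha_R^2\underline{H} - 2\alpha_R\underline{H}\wedge\underline{\beta}_R + 2\underline{\beta}_R(\underline{H}\cdot\underline{\beta}_R) - (\underline{\beta}_R\cdot\underline{\beta}_R)\underline{H}$$
$$\underline{E}' = \alpha_R^2\underline{E} - 2\alpha_R\underline{E}\wedge\underline{\beta}_R + 2\underline{\beta}_R(\underline{E}\cdot\underline{\beta}_R) - (\underline{\beta}_R\cdot\underline{\beta}_R)\underline{E}$$
Now
$$Q = \alpha_R - \underline{\beta}_R = \cos z + \hat{h}\sin z = \cos\frac{\theta}{2} - \hat{h}\sin\frac{\theta}{2} \qquad \text{since } z = -\frac{\theta}{2}$$
then $\alpha_R = \cos(\theta/2)$, $\underline{\beta}_R = \hat{h}\sin(\theta/2)$ and
$$\alpha_R^2 - \underline{\beta}_R\cdot\underline{\beta}_R = \cos^2(\theta/2) - \sin^2(\theta/2) = \cos\theta$$
then
$$\underline{H}' = \cos\theta\,\underline{H} - \sin\theta\,\underline{H}\wedge\hat{h} + 2\sin^2\frac{\theta}{2}\,\hat{h}(\underline{H}\cdot\hat{h})$$
Now if we let
$$\cos\theta = \frac{1}{\sqrt{1 + v^2}}, \qquad \sin\theta = \frac{v}{\sqrt{1 + v^2}}, \qquad \underline{v} = v\hat{h}$$
then
$$\underline{H}' = \gamma\left[\underline{H} - \underline{H}\wedge\underline{v} - \frac{\underline{v}(\underline{H}\cdot\underline{v})}{v^2}\right] + \frac{\underline{v}(\underline{H}\cdot\underline{v})}{v^2} \qquad \text{where } \gamma = \frac{1}{\sqrt{1 + v^2}}$$
and
$$\underline{E}' = \gamma\left[\underline{E} - \underline{E}\wedge\underline{v} - \frac{\underline{v}(\underline{E}\cdot\underline{v})}{v^2}\right] + \frac{\underline{v}(\underline{E}\cdot\underline{v})}{v^2}$$
Not surprisingly, these are precisely the results we expect when any 3-vectors transform under a spatial rotation:
$$\underline{x}' = \underline{x}\cos\theta + (\underline{x}\cdot\underline{n})\underline{n}(1 - \cos\theta) + \underline{n}\wedge\underline{x}\sin\theta$$
Now, as we have seen above, $N(x y^c)$ is conserved in a Lorentz transformation. In this case, with $x y^c = \underline{H} - i\underline{E}$,
$$\begin{aligned}
N(x y^c) &= (\underline{H} - i\underline{E})(-\underline{H} + i\underline{E}) \\
&= \underline{H}\cdot\underline{H} - \underline{E}\cdot\underline{E} - 2i\underline{E}\cdot\underline{H}
\end{aligned}$$
Thus both $\underline{H}\cdot\underline{H} - \underline{E}\cdot\underline{E}$ and $\underline{E}\cdot\underline{H}$ are conserved under Lorentz transformations.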
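The boost formulae and the two invariants can be checked numerically. The sketch below is illustrative (units with $c = 1$; the field values and velocity are hypothetical, and the helper names are assumptions). It forms $\underline{H}' - i\underline{E}' = Q(\underline{H} - i\underline{E})Q^c$ with $Q = \cosh\frac{\theta}{2} + i\hat{q}\sinh\frac{\theta}{2}$, $\cosh\theta = \beta$, $\sinh\theta = -\beta v$, and compares with the vector formulae for $\underline{E}'$ and $\underline{H}'$ written in the equivalent form $\beta(\underline{E} + \underline{v}\wedge\underline{H}) - (\beta - 1)\underline{v}(\underline{E}\cdot\underline{v})/v^2$.

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z), complex entries allowed."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qc(q):
    """Quaternion conjugate q^c."""
    return (q[0], -q[1], -q[2], -q[3])

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

# Illustrative boost velocity and field values (assumptions, c = 1).
v = (0.3, -0.2, 0.5)
speed = math.sqrt(dot(v, v))
gamma = 1 / math.sqrt(1 - speed**2)      # the factor written beta in the text
theta = -math.atanh(speed)               # cosh(theta) = gamma, sinh(theta) = -gamma*speed
qhat = tuple(c / speed for c in v)
Q = (complex(math.cosh(theta/2)),) + tuple(1j * math.sinh(theta/2) * c for c in qhat)

E = (0.5, -1.0, 2.0)
H = (1.5, 0.3, -0.4)
W = (0j,) + tuple(h - 1j*e for h, e in zip(H, E))     # H - iE as a pure quaternion

Wp = qmul(qmul(Q, W), qc(Q))                            # complex rotation Q (H - iE) Q^c
H_quat = tuple(c.real for c in Wp[1:])
E_quat = tuple(-c.imag for c in Wp[1:])

# Vector formulae for the boosted fields.
vxH, vxE = cross(v, H), cross(v, E)
E_std = tuple(gamma*(e + a) - (gamma - 1)*dot(E, v)*vi/speed**2
              for e, a, vi in zip(E, vxH, v))
H_std = tuple(gamma*(h - a) - (gamma - 1)*dot(H, v)*vi/speed**2
              for h, a, vi in zip(H, vxE, v))

inv_before = (dot(H, H) - dot(E, E), dot(E, H))
inv_after = (dot(H_quat, H_quat) - dot(E_quat, E_quat), dot(E_quat, H_quat))
```

The transformed quaternion stays pure (its scalar part vanishes), the two routes to $\underline{E}'$, $\underline{H}'$ agree, and both $\underline{H}\cdot\underline{H} - \underline{E}\cdot\underline{E}$ and $\underline{E}\cdot\underline{H}$ are unchanged.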
3.6 Quaternionic Representation of Bivectors
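A bivector is a skew-symmetric second order tensor $F_{\alpha\beta}$
\[
F_{\alpha\beta} = -F_{\beta\alpha}
\]
In 4-dimensions $F_{\alpha\beta}$ has six independent components. These can be conveniently arranged
in terms of two vectors $\underline{A}, \underline{B}$:
\[
(\underline{A})_m = F_{0m} \qquad (\underline{B})_m = -\tfrac{1}{2}\epsilon_{mjk}F_{jk}
\]
(In this section I will try to be consistent about the use of lower case latin indices which
range through 1,2,3 whilst greek indices range through 0,1,2,3). The second relation here
is easily inverted:
\[
\epsilon_{mab}(\underline{B})_m = -\tfrac{1}{2}\epsilon_{mab}\epsilon_{mjk}F_{jk} = -\tfrac{1}{2}(\delta_{aj}\delta_{bk} - \delta_{ak}\delta_{bj})F_{jk} = -F_{ab}
\]
Therefore
\[
F_{ab} = -\epsilon_{mab}(\underline{B})_m
\]
The two vectors $\underline{A}, \underline{B}$ can now be subsumed into a single complexified quaternion $H \in \mathbb{H}_{\mathbb{C}}$:
\[
H = \underline{B} - i\underline{A}
\]
Explicitly (with $\mathbf{i}, \mathbf{j}, \mathbf{k}$ the quaternion units and $i$ the complex unit):
\[
H = \mathbf{i}(-F_{23}) + \mathbf{j}(F_{13}) + \mathbf{k}(-F_{12}) - i\left[\mathbf{i}(F_{01}) + \mathbf{j}(F_{02}) + \mathbf{k}(F_{03})\right]
\]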
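Also of importance in this area is the dual bivector $F^{\otimes}_{\alpha\beta}$ defined by:
\[
F^{\otimes}_{\alpha\beta} = \tfrac{1}{2}\epsilon_{\alpha\beta\gamma\delta}F^{\gamma\delta}
\]
where $\epsilon_{\alpha\beta\gamma\delta}$ is the completely anti-symmetric object with $\epsilon_{0123} = +1$ and
\[
\epsilon_{\alpha\beta\gamma\delta} = \begin{cases} 1 & \text{if } \alpha\beta\gamma\delta \text{ is an even permutation of } 0123 \\ -1 & \text{if } \alpha\beta\gamma\delta \text{ is an odd permutation of } 0123 \\ 0 & \text{otherwise} \end{cases}
\]
The contravariant form is defined similarly and is such that
\[
\epsilon^{0123} = \eta^{0\alpha}\eta^{1\beta}\eta^{2\gamma}\eta^{3\delta}\epsilon_{\alpha\beta\gamma\delta} = -\epsilon_{0123} = -1
\]
We can list the dual components in terms of the ordinary components:
\[
\begin{aligned}
F^{\otimes}_{01} &= F_{23} & F^{\otimes}_{02} &= F_{31} & F^{\otimes}_{03} &= F_{12} \\
F^{\otimes}_{12} &= -F_{03} & F^{\otimes}_{13} &= F_{02} & F^{\otimes}_{23} &= -F_{01}
\end{aligned}
\]
These dual tensor components define a quaternion $H^{\otimes}$:
\[
\begin{aligned}
H^{\otimes} &= -\mathbf{i}F^{\otimes}_{23} + \mathbf{j}F^{\otimes}_{13} - \mathbf{k}F^{\otimes}_{12} - i\left[\mathbf{i}F^{\otimes}_{01} + \mathbf{j}F^{\otimes}_{02} + \mathbf{k}F^{\otimes}_{03}\right] \\
&= \mathbf{i}F_{01} + \mathbf{j}F_{02} + \mathbf{k}F_{03} - i\left[\mathbf{i}F_{23} + \mathbf{j}F_{31} + \mathbf{k}F_{12}\right] \\
&= \underline{A} + i\underline{B} = iH
\end{aligned}
\]
\[
\therefore\ H^{\otimes} = iH
\]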
Thus the dual operation in complexified quaternions is a simple operation :- simply multiply
by +i (a similar result was obtained for three-dimensional bivectors). Repeating the dual
operation it follows immediately that
\[
H^{\otimes\otimes} = -H
\]
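The dual relations can be checked on a random skew-symmetric tensor. The sketch below assumes the metric $\eta = \mathrm{diag}(1,-1,-1,-1)$ and the definition $F^{\otimes}_{\alpha\beta} = \tfrac{1}{2}\epsilon_{\alpha\beta\gamma\delta}F^{\gamma\delta}$ with $\epsilon_{0123} = +1$; the helper names are illustrative.

```python
import itertools
import numpy as np

def levi_civita(n):
    """Completely antisymmetric symbol with eps[0, 1, ..., n-1] = +1."""
    eps = np.zeros((n,) * n)
    for perm in itertools.permutations(range(n)):
        inversions = sum(perm[i] > perm[j]
                         for i in range(n) for j in range(i + 1, n))
        eps[perm] = -1.0 if inversions % 2 else 1.0
    return eps

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # assumed metric signature
eps4, eps3 = levi_civita(4), levi_civita(3)

def dual(F_low):
    """F^dual_{ab} = (1/2) eps_{abcd} F^{cd}, indices raised with eta."""
    F_up = eta @ F_low @ eta
    return 0.5 * np.einsum('abcd,cd->ab', eps4, F_up)

def quaternion_vector(F_low):
    """Vector part B - iA of the complexified quaternion H."""
    A = F_low[0, 1:]
    B = -0.5 * np.einsum('mjk,jk->m', eps3, F_low[1:, 1:])
    return B - 1j * A

rng = np.random.default_rng(0)
M = rng.normal(size=(4, 4))
F = M - M.T                    # random skew-symmetric F_{ab}
F_dual = dual(F)
H, H_dual = quaternion_vector(F), quaternion_vector(F_dual)
```

With these conventions `H_dual` coincides with `1j * H`, and applying `dual` twice returns `-F`.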
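Quaternionic Form of the Contracted Product $F_{\alpha\beta}z^{\beta}$
The contracted product of two 4-vectors $p_\alpha, q_\alpha$, i.e. $p_\alpha q^\alpha$, has an obvious quaternionic
counterpart $S(pq^c)$. The contracted product of a bivector $F_{\alpha\beta}$ with a 4-vector $z^\alpha$, i.e. $F_{\alpha\beta}z^\beta$,
is less easy to describe in quaternionic terms. To obtain the quaternionic counterpart we
shall find it instructive to decompose $F_{\alpha\beta}$ into 'time' and 'spatial' components.
\[
\begin{aligned}
F_{\alpha\beta} &= F_{0\beta}\delta_{0\alpha} + F_{m\beta}\delta_{m\alpha} \\
&= F_{0\beta}\delta_{0\alpha} + F_{m0}\delta_{m\alpha}\delta_{\beta 0} + F_{mk}\delta_{m\alpha}\delta_{\beta k} \\
&= F_{0k}(\delta_{0\alpha}\delta_{k\beta} - \delta_{k\alpha}\delta_{\beta 0}) + F_{mk}\delta_{m\alpha}\delta_{\beta k} \\
&= (\underline{A})_k(\delta_{0\alpha}\delta_{k\beta} - \delta_{k\alpha}\delta_{\beta 0}) - \epsilon_{jmk}(\underline{B})_j\delta_{m\alpha}\delta_{\beta k} \\
&= -(\mathrm{Im}\,H)_k(\delta_{0\alpha}\delta_{k\beta} - \delta_{k\alpha}\delta_{\beta 0}) - \epsilon_{jmk}(\mathrm{Re}\,H)_j\delta_{m\alpha}\delta_{\beta k}
\end{aligned}
\]
Thus if $z^\beta$ is any 4-vector then
\[
\begin{aligned}
F_{\alpha\beta}z^\beta &= -(\mathrm{Im}\,H)_k(\delta_{0\alpha}z^k - \delta_{k\alpha}z^0) - \epsilon_{jmk}(\mathrm{Re}\,H)_j\delta_{m\alpha}z^k \\
&= -(\mathrm{Im}\,H)_kz^k\delta_{0\alpha} + \left[(\mathrm{Im}\,H)_kz^0 - \epsilon_{jkm}(\mathrm{Re}\,H)_jz^m\right]\delta_{k\alpha}
\end{aligned}
\]
But if we consider the quaternion product of $H = \underline{B} - i\underline{A}$ with $z = b + i\underline{a}$ (a 4-vector) we
easily obtain:
\[
Hz = -\underline{A}\cdot\underline{a} + i(-b\underline{A} + \underline{B}\wedge\underline{a}) + i(-\underline{B}\cdot\underline{a}) + b\underline{B} + \underline{A}\wedge\underline{a}
\]
and therefore
\[
-(Hz)^* = \underline{A}\cdot\underline{a} + i(-b\underline{A} + \underline{B}\wedge\underline{a}) + i(-\underline{B}\cdot\underline{a}) - b\underline{B} - \underline{A}\wedge\underline{a}
\]
Thus, adding and subtracting this to its starred conjugate (the operator $*c$) we get
\[
\tfrac{1}{2}\left[z^cH - H^*z^c\right] = \underline{A}\cdot\underline{a} + i(-b\underline{A} + \underline{B}\wedge\underline{a})
\]
and
\[
\tfrac{1}{2}\left[-z^cH - H^*z^c\right] = i(-\underline{B}\cdot\underline{a}) - (b\underline{B} + \underline{A}\wedge\underline{a})
\]
Now realising that $z^\alpha \leftrightarrow (b, \underline{a}) \leftrightarrow b + i\underline{a}$ in quaternion form and that
\[
\epsilon_{jkm}(\mathrm{Re}\,H)_jz^m = -\epsilon_{kjm}(\mathrm{Re}\,H)_jz^m
\]
shows that $F_{\alpha\beta}z^\beta$ can be expressed in the form:
\[
\begin{aligned}
F_{\alpha\beta}z^\beta &\leftrightarrow (\underline{A}\cdot\underline{a},\ -b\underline{A} + \underline{B}\wedge\underline{a}) \\
&\leftrightarrow \underline{A}\cdot\underline{a} + i(-b\underline{A} + \underline{B}\wedge\underline{a}) = \tfrac{1}{2}\left[z^cH - H^*z^c\right]
\end{aligned}
\]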
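Also, from above
\[
F^{\otimes}_{\alpha\beta}z^\beta = -(\mathrm{Im}\,H^{\otimes})_kz^k\delta_{0\alpha} + \left[(\mathrm{Im}\,H^{\otimes})_kz^0 + \epsilon_{kjm}(\mathrm{Re}\,H^{\otimes})_jz^m\right]\delta_{k\alpha}
\]
and so
\[
\begin{aligned}
F^{\otimes}_{\alpha\beta}z^\beta &\leftrightarrow (-\underline{B}\cdot\underline{a},\ b\underline{B} + \underline{A}\wedge\underline{a}) \\
&\leftrightarrow -\underline{B}\cdot\underline{a} + i(b\underline{B} + \underline{A}\wedge\underline{a}) \\
&= i\left[i\underline{B}\cdot\underline{a} + (b\underline{B} + \underline{A}\wedge\underline{a})\right] \\
&= \tfrac{i}{2}\left[z^cH + H^*z^c\right]
\end{aligned}
\]
Thus
\[
(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})z^\beta \leftrightarrow \tfrac{1}{2}\left[z^cH - H^*z^c + z^cH + H^*z^c\right] = z^cH
\]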
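The combination $F^{+}_{\alpha\beta} = F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta}$ is a useful combination and has the important property
that
\[
(F^{+}_{\alpha\beta})^{\otimes} = iF^{+}_{\alpha\beta}
\]
For this reason the complex bivector $F^{+}_{\alpha\beta}$ is called self-dual. Its close relation
$F^{-}_{\alpha\beta} = F_{\alpha\beta} + iF^{\otimes}_{\alpha\beta}$ satisfies
\[
(F^{-}_{\alpha\beta})^{\otimes} = -iF^{-}_{\alpha\beta}
\]
and is called anti-self-dual.
There is a single complex invariant associated with $F^{+}_{\alpha\beta}$:
\[
\begin{aligned}
\tfrac{1}{2}F^{+}_{\alpha\beta}F^{+\alpha\beta} &= \tfrac{1}{2}(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})(F^{\alpha\beta} - iF^{\otimes\alpha\beta}) \\
&= \tfrac{1}{2}\left(F_{\alpha\beta}F^{\alpha\beta} - F^{\otimes}_{\alpha\beta}F^{\otimes\alpha\beta} - i\left[F^{\otimes}_{\alpha\beta}F^{\alpha\beta} + F_{\alpha\beta}F^{\otimes\alpha\beta}\right]\right)
\end{aligned}
\]
But
\[
\begin{aligned}
F^{\otimes}_{\alpha\beta}F^{\otimes\alpha\beta} &= \tfrac{1}{2}\epsilon_{\alpha\beta\gamma\delta}F^{\gamma\delta}F^{\otimes\alpha\beta} \\
&= F^{\gamma\delta}\left(\tfrac{1}{2}\epsilon_{\gamma\delta\alpha\beta}F^{\otimes\alpha\beta}\right) \\
&= -F^{\gamma\delta}F_{\gamma\delta}
\end{aligned}
\]
Therefore
\[
\tfrac{1}{2}F^{+}_{\alpha\beta}F^{+\alpha\beta} = F_{\alpha\beta}F^{\alpha\beta} - iF^{\otimes}_{\alpha\beta}F^{\alpha\beta}
\]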
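This invariant is easily expressed in terms of the vectors $\underline{A}, \underline{B}$.
\[
\begin{aligned}
F_{\alpha\beta}F^{\alpha\beta} &= F_{0\beta}F^{0\beta} + F_{m\beta}F^{m\beta} \\
&= F_{0a}F^{0a} + F_{m0}F^{m0} + F_{mk}F^{mk} \\
&= 2F_{0a}F^{0a} + F_{mk}F^{mk} \\
&= -2F_{0a}F_{0a} + F_{mk}F_{mk} \\
&= -2\underline{A}\cdot\underline{A} + \epsilon_{imk}\epsilon_{jmk}(\underline{B})_i(\underline{B})_j \\
&= -2\underline{A}\cdot\underline{A} + (\delta_{ij}\delta_{mm} - \delta_{im}\delta_{jm})(\underline{B})_i(\underline{B})_j \\
&= -2\underline{A}\cdot\underline{A} + (3\underline{B}\cdot\underline{B} - \underline{B}\cdot\underline{B}) \\
&= 2(\underline{B}\cdot\underline{B} - \underline{A}\cdot\underline{A})
\end{aligned}
\]
Also
\[
\begin{aligned}
F^{\otimes}_{\alpha\beta}F^{\alpha\beta} &= -2(F_{23}F_{01} + F_{31}F_{02} + F_{12}F_{03}) \\
&\quad + 2(-F_{03}F_{12} + F_{02}F_{13} - F_{01}F_{23}) \\
&= -4(-B_1A_1 - B_2A_2 - B_3A_3) = 4\underline{A}\cdot\underline{B}
\end{aligned}
\]
However, as $H = \underline{B} - i\underline{A}$ then
\[
N_H = \underline{B}\cdot\underline{B} - \underline{A}\cdot\underline{A} - 2i\,\underline{A}\cdot\underline{B}
\]
thus, finally
\[
\tfrac{1}{2}F^{+}_{\alpha\beta}F^{+\alpha\beta} = 2N_H
\]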
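A quick numerical check of these invariants is sketched below. It assumes the metric $\mathrm{diag}(1,-1,-1,-1)$, builds $F_{\alpha\beta}$ from $\underline{A}, \underline{B}$ through $F_{0m} = A_m$, $F_{ab} = -\epsilon_{mab}B_m$, and obtains the dual by the substitution $\underline{A} \to -\underline{B}$, $\underline{B} \to \underline{A}$ (equivalent to $H^{\otimes} = iH$); `field_tensor` and `contract` are illustrative names.

```python
import numpy as np

eta = np.diag([1.0, -1.0, -1.0, -1.0])   # assumed metric signature

def field_tensor(A, B):
    """Covariant F_{ab} with F_{0m} = A_m and F_{ab} = -eps_{mab} B_m."""
    A1, A2, A3 = A
    B1, B2, B3 = B
    return np.array([[0.0,  A1,  A2,  A3],
                     [-A1, 0.0, -B3,  B2],
                     [-A2,  B3, 0.0, -B1],
                     [-A3, -B2,  B1, 0.0]])

def contract(P, Q):
    """P_{ab} Q^{ab}, raising both indices of Q with eta (no conjugation)."""
    return np.sum(P * (eta @ Q @ eta))

A = np.array([1.0, 2.0, 3.0])
B = np.array([4.0, 5.0, 6.0])
F = field_tensor(A, B)
F_dual = field_tensor(-B, A)             # dual: A -> -B, B -> A
F_plus = F - 1j * F_dual                 # self-dual combination
N_H = B @ B - A @ A - 2j * (A @ B)       # norm of H = B - iA

FF = contract(F, F)                      # expected 2(B.B - A.A)
FdF = contract(F_dual, F)                # expected 4 A.B
half_PP = 0.5 * contract(F_plus, F_plus) # expected 2 N_H
```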
Simple Bivectors
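When $F_{\alpha\beta}$ can be written in terms of two vectors $p_\alpha, q_\alpha$ as:
\[
F_{\alpha\beta} = p_\alpha q_\beta - p_\beta q_\alpha
\]
then it is called a simple bivector. If $p, q$ ($p^* = p^c$, $q^* = q^c$) are the quaternionic
representation of $p_\alpha, q_\alpha$ respectively then the quaternionic form of a simple bivector is
\[
\underline{S} = V(pq^c) = \tfrac{1}{2}(pq^c - qp^c)
\]
If we write $p = \alpha + i\underline{\beta}$, $q = \gamma + i\underline{\delta}$ then, by direct expansion:
\[
\begin{aligned}
\underline{S} &= \tfrac{1}{2}\left[(\alpha + i\underline{\beta})(\gamma - i\underline{\delta}) - (\gamma + i\underline{\delta})(\alpha - i\underline{\beta})\right] \\
&= i\left[\gamma\underline{\beta} - \alpha\underline{\delta}\right] + \underline{\beta}\wedge\underline{\delta}
\end{aligned}
\]
which is clearly a bivector. We note the relation between the norms of $p$, $q$ and $\underline{S}$.
\[
N_p = S(pp^c) = \alpha^2 - \underline{\beta}\cdot\underline{\beta} \qquad N_q = S(qq^c) = \gamma^2 - \underline{\delta}\cdot\underline{\delta}
\]
and
\[
\begin{aligned}
N_{\underline{S}} &= -\left[(\gamma\underline{\beta} - \alpha\underline{\delta})\cdot(\gamma\underline{\beta} - \alpha\underline{\delta}) - (\underline{\beta}\wedge\underline{\delta})\cdot(\underline{\beta}\wedge\underline{\delta})\right] \\
&= -\left[\gamma^2\underline{\beta}\cdot\underline{\beta} - 2\alpha\gamma\,\underline{\delta}\cdot\underline{\beta} + \alpha^2\underline{\delta}\cdot\underline{\delta} + (\underline{\beta}\cdot\underline{\delta})^2 - (\underline{\beta}\cdot\underline{\beta})(\underline{\delta}\cdot\underline{\delta})\right]
\end{aligned}
\]
We choose, without loss of generality $p, q$ to be orthogonal so that $S(pq^c) = 0$. That is,
\[
pq^c + qp^c = 0 \quad\therefore\quad \alpha\gamma = \underline{\beta}\cdot\underline{\delta}
\]
With this condition imposed we find
\[
\begin{aligned}
N_{\underline{S}} &= -\left[\gamma^2\underline{\beta}\cdot\underline{\beta} - \alpha^2\gamma^2 + \alpha^2\underline{\delta}\cdot\underline{\delta} - (\underline{\beta}\cdot\underline{\beta})(\underline{\delta}\cdot\underline{\delta})\right] \\
&= -(\underline{\beta}\cdot\underline{\beta} - \alpha^2)(\gamma^2 - \underline{\delta}\cdot\underline{\delta}) \\
&= N_pN_q
\end{aligned}
\]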
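The norm relation can be confirmed numerically. The sketch below stores a complexified quaternion as four complex numbers `[scalar, x, y, z]`, so that the 4-vector $p = \alpha + i\underline{\beta}$ has real scalar part and purely imaginary vector part; all helper names are illustrative.

```python
import numpy as np

CONJ = np.array([1, -1, -1, -1])         # quaternion conjugation signs

def qmul(p, q):
    """Hamilton product; entries may be complex."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

def norm(x):
    """N_x = scalar part of x x^c (no complex conjugation involved)."""
    return qmul(x, x * CONJ)[0]

def four_vector(t, r):
    """Quaternion t + i r representing a real 4-vector (t, r)."""
    return np.concatenate(([t + 0j], 1j * np.asarray(r, dtype=float)))

rng = np.random.default_rng(2)
alpha, beta = 0.7, rng.normal(size=3)
delta = rng.normal(size=3)
gamma = (beta @ delta) / alpha           # enforces S(p q^c) = 0
p, q = four_vector(alpha, beta), four_vector(gamma, delta)

pqc = qmul(p, q * CONJ)
S_biv = 0.5 * (pqc - qmul(q, p * CONJ))  # simple bivector V(p q^c)
```

Here `norm(S_biv)` should equal `norm(p) * norm(q)`, and the scalar part of `pqc` vanishes by the orthogonality condition.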
With reference to this result we note that the quaternion $Q$ defined by
\[
Q = S(pq^c) + V(pq^c) = pq^c
\]
is such that
\[
N_Q = N_pN_{q^c} = N_pN_q
\]
but when $S(pq^c) = 0$ then $Q = \underline{S}$ and so the above result could have been deduced
immediately. We also note that $\underline{S}$ is unchanged by the addition to $q$ of a multiple of $p$
and to $p$ by the addition of a multiple of $q$ (whether or not $p, q$ are orthogonal). Thus a
simple bivector represents a plane (called a 2—flat) containing the vectors p, q. Now every
2-flat contains at least one space-like vector. To see this consider any 2-flat spanned by
two orthogonal vectors p, q. Then t = ap + bq is any vector from the 2—flat. Now
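\[
\begin{aligned}
N_t &= (ap + bq)(ap^c + bq^c) \\
&= a^2N_p + b^2N_q \qquad \text{since } pq^c + qp^c = 0
\end{aligned}
\]
Not both orthogonal vectors $p, q$ can be time-like. To see this let $p = \alpha + i\underline{\beta}$ and $q = \gamma + i\underline{\delta}$
then
\[
\tfrac{1}{2}(pq^c + qp^c) = \alpha\gamma - \underline{\beta}\cdot\underline{\delta}
\]
If $p, q$ are both time-like then
\[
N_p = \alpha^2 - \underline{\beta}\cdot\underline{\beta} > 0 \qquad N_q = \gamma^2 - \underline{\delta}\cdot\underline{\delta} > 0
\]
Therefore
\[
\alpha^2 > |\underline{\beta}|^2 \quad\text{and}\quad \gamma^2 > |\underline{\delta}|^2 \qquad (*)
\]
But $\alpha\gamma - \underline{\beta}\cdot\underline{\delta} = 0$ implies $\alpha\gamma = |\underline{\beta}|\,|\underline{\delta}|\cos\theta$
\[
\therefore\ \frac{\alpha\gamma}{|\underline{\beta}|\,|\underline{\delta}|} = \cos\theta \quad\text{or}\quad \frac{\alpha^2\gamma^2}{|\underline{\beta}|^2|\underline{\delta}|^2} = \cos^2\theta \le 1
\]
which contradicts (*). Hence not both $p, q$ can be time-like; one must be space-like. Since
every 2-flat contains a space-like vector we can choose (without loss of generality) $p$ to be
space-like. Thus $N_p < 0$. There are now three cases (Figure 3.2) using $N_{\underline{S}} = N_pN_q$
(i) $N_{\underline{S}} < 0$ then $N_q > 0$, $q$ is time-like
(ii) $N_{\underline{S}} > 0$ then $N_q < 0$, $q$ is space-like
(iii) $N_{\underline{S}} = 0$ then $N_q = 0$, $q$ is null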
Figure 3.2
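We note that if $F_{\alpha\beta}$ is a simple bivector, that is,
\[
H = \tfrac{1}{2}(pq^c - qp^c) \qquad \text{where } p^* = p^c,\ q^* = q^c
\]
then
\[
H^{\otimes} = \tfrac{i}{2}(pq^c - qp^c)
\]
Thus, using results derived earlier:
\[
\begin{aligned}
F^{\otimes}_{\alpha\beta}p^\beta &\leftrightarrow \tfrac{i}{2}\left[p^cH + H^*p^c\right] \\
&= \tfrac{i}{4}\left[p^c(pq^c - qp^c) + (p^cq - q^cp)p^c\right] = 0
\end{aligned}
\]
Similarly
\[
F^{\otimes}_{\alpha\beta}q^\beta \leftrightarrow \tfrac{i}{4}\left[q^c(pq^c - qp^c) + (p^cq - q^cp)q^c\right] = 0
\]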
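We interpret these results as showing that every $p, q$ (defining the 2-flat characterising $\underline{S}$)
is perpendicular to every vector in the 2-flat characterising $\underline{S}^{\otimes}$. If $N_{\underline{S}} \neq 0$ then the only
vector $z$ common to $\underline{S}, \underline{S}^{\otimes}$ satisfies (using $(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})z^\beta = 0$)
\[
z^c\underline{S} = 0 \quad\therefore\quad N_zN_{\underline{S}} = 0 \quad\therefore\quad N_z = 0
\]
implying $z$ is the zero vector as $\underline{S}^{-1}$ exists. However, if $N_{\underline{S}} = 0$ then, since $N_{\underline{S}} = N_pN_q$
and (by choice) $N_p < 0$, we must have $N_q = 0$. That is, $q$ is a null vector.
Thus if $H$ is a null simple bivector (corresponding to $F_{\alpha\beta}$ satisfying $F_{\alpha\beta}F^{\alpha\beta} = 0$ and
$F^{\otimes}_{\alpha\beta}F^{\alpha\beta} = 0$) we can always write $H$ in the form
\[
H = pq^c \qquad N_p < 0,\ N_q = 0
\]
in which $p, q$ are orthogonal ($S(pq^c) = 0$). Similarly $H^{\otimes}$ is also a simple null bivector and
so
\[
H^{\otimes} = rq^c \qquad N_r < 0,\ N_q = 0
\]
and, by choice, $p, r$ are orthogonal and of the same magnitude. In 4-vector terms we can
write
\[
F_{\alpha\beta} = p_\alpha q_\beta - p_\beta q_\alpha \qquad F^{\otimes}_{\alpha\beta} = r_\alpha q_\beta - r_\beta q_\alpha
\]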
3.7 Null Tetrad for Space-time
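It turns out to be of great advantage [16] to replace the tetrad $e_0 = 1$, $e_1 = i\mathbf{i}$, $e_2 = i\mathbf{j}$, $e_3 = i\mathbf{k}$ by a null tetrad:
\[
h^{(0)} = \tfrac{1}{\sqrt{2}}(1 + i\mathbf{i}) \qquad h^{(1)} = \tfrac{1}{\sqrt{2}}(1 - i\mathbf{i}) \qquad h^{(2)} = \tfrac{1}{\sqrt{2}}(i\mathbf{j} - \mathbf{k}) \qquad h^{(3)} = \tfrac{1}{\sqrt{2}}(i\mathbf{j} + \mathbf{k})
\]
From henceforth all tetrad components will be in brackets. The metric appropriate to this
choice of basis is (using the same inner product rule):
\[
h^{(ab)} = h^{(a)}\cdot h^{(b)}
\]
or, equivalently
\[
h^{(ab)} = \langle h^{(a)}, h^{(b)}\rangle = S\!\left(h^{(a)}(h^{(b)})^c\right)
\]
We easily find
\[
h^{(ab)} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{bmatrix} \qquad \det(h^{(ab)}) = +1
\]
The inverse matrix $h_{(ab)}$ is
\[
h_{(ab)} = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & -1 \\ 0 & 0 & -1 & 0 \end{bmatrix}
\]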
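Basic Formulae
The following algebraic results are easily obtained (row index $a$, column index $b$):
\[
h^{(a)}h^{(b)} = \begin{bmatrix} \sqrt{2}h^{(0)} & 0 & \sqrt{2}h^{(2)} & 0 \\ 0 & \sqrt{2}h^{(1)} & 0 & \sqrt{2}h^{(3)} \\ 0 & \sqrt{2}h^{(2)} & 0 & \sqrt{2}h^{(0)} \\ \sqrt{2}h^{(3)} & 0 & \sqrt{2}h^{(1)} & 0 \end{bmatrix}
\]
\[
h^{(a)}h^{(b)c} = \begin{bmatrix} 0 & \sqrt{2}h^{(0)} & -\sqrt{2}h^{(2)} & 0 \\ \sqrt{2}h^{(1)} & 0 & 0 & -\sqrt{2}h^{(3)} \\ \sqrt{2}h^{(2)} & 0 & 0 & -\sqrt{2}h^{(0)} \\ 0 & \sqrt{2}h^{(3)} & -\sqrt{2}h^{(1)} & 0 \end{bmatrix}
\]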
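The metric and the product tables can be reproduced with a few lines of code. This is a minimal sketch which assumes the explicit null tetrad $h^{(0)} = (1 + i\mathbf{i})/\sqrt{2}$, $h^{(1)} = (1 - i\mathbf{i})/\sqrt{2}$, $h^{(2)} = (i\mathbf{j} - \mathbf{k})/\sqrt{2}$, $h^{(3)} = (i\mathbf{j} + \mathbf{k})/\sqrt{2}$; that choice is an assumption, adopted because it reproduces the metric and the products quoted in this section.

```python
import numpy as np

CONJ = np.array([1, -1, -1, -1])         # quaternion conjugation signs

def qmul(p, q):
    """Hamilton product of [scalar, x, y, z] arrays with complex entries."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

s = 1 / np.sqrt(2)
h = [s * np.array([1, 1j, 0, 0]),        # assumed h^(0)
     s * np.array([1, -1j, 0, 0]),       # assumed h^(1)
     s * np.array([0, 0, 1j, -1]),       # assumed h^(2)
     s * np.array([0, 0, 1j, 1])]        # assumed h^(3)

# Tetrad metric h^(ab) = S(h^(a) (h^(b))^c)
metric = np.array([[qmul(h[a], h[b] * CONJ)[0] for b in range(4)]
                   for a in range(4)])
expected = np.array([[0, 1, 0, 0],
                     [1, 0, 0, 0],
                     [0, 0, 0, -1],
                     [0, 0, -1, 0]])

# Two entries of the table h^(a) h^(b)c
entry_02 = qmul(h[0], h[2] * CONJ)       # expected -sqrt(2) h^(2)
entry_31 = qmul(h[3], h[1] * CONJ)       # expected  sqrt(2) h^(3)
```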
Clearly, since $N_{h^{(a)}} = 0$, $a = 0,1,2,3$, this is a null tetrad. The tetrad components of any
tensor $H_{\alpha\beta\ldots\gamma}$ are $H_{(ab\ldots c)}$ and defined by
\[
H_{(ab\ldots c)} = H_{\alpha\beta\ldots\gamma}\,h^{\alpha}_{(a)}h^{\beta}_{(b)}\cdots h^{\gamma}_{(c)}
\]
The dual basis, $h_{(a)}$, is defined by
\[
h_{(a)} = h_{(ab)}h^{(b)}
\]
so
\[
h_{(0)} = h^{(1)}, \quad h_{(1)} = h^{(0)}, \quad h_{(2)} = -h^{(3)}, \quad h_{(3)} = -h^{(2)}
\]
and therefore we find:
% = rTfih{ah)hf
1 1
i _ i
^'"72'
i_
h?o) = (-*=,-*=, 0,0)
fc«} = (-)=,--^,0,0)
hf2) = (0,0,^,-^)
fcw = (0'°'^^)
The basis vectors $h^{(a)}$, $a = 0,1,2,3$ are, of course, not unique. The possible sets of basis
vectors are arranged into two classes - those for which the direction of $h^{(0)}$ is fixed and
those for which the direction of $h^{(0)}$ is not fixed. We will demand, in either case, that the
metric with respect to the tetrad is unchanged and that, for every tetrad, $(h^{(2)})^{*c} = h^{(3)}$.
These transformations will then correspond to Lorentz transformations.
Let us first consider a new tetrad, denoted by a prime, in which the direction of $h^{(0)}$ is
held fixed. With this constraint in mind
\[
h'^{(0)} = e^{k}h^{(0)} \qquad k \in \mathbb{R}
\]
This preserves that $h'^{(0)}$ is future-pointing if $h^{(0)}$ is. Now let
\[
\begin{aligned}
h'^{(1)} &= a_0h^{(0)} + a_1h^{(1)} + a_2h^{(2)} + a_3h^{(3)} \\
h'^{(2)} &= b_0h^{(0)} + b_1h^{(1)} + b_2h^{(2)} + b_3h^{(3)} \\
h'^{(3)} &= b_0^*h^{(0)} + b_1^*h^{(1)} + b_3^*h^{(2)} + b_2^*h^{(3)}
\end{aligned}
\]
since $(h^{(2)})^{*c} = h^{(3)}$ and $(h^{(3)})^{*c} = h^{(2)}$. The following results are easily obtained
\[
\begin{aligned}
h'^{(01)} &= 1 &&\text{implying } a_1e^{k} = 1 \ \therefore\ a_1 = e^{-k} \\
h'^{(02)} &= 0 &&\text{implying } b_1e^{k} = 0 \ \therefore\ b_1 = 0 \\
h'^{(22)} &= 0 &&\text{implying } b_2b_3 = 0
\end{aligned}
\]
Here we choose $b_3 = 0$ (in order that we can recover the identity transformation). Also
\[
\begin{aligned}
h'^{(11)} &= 0 &&\text{implying } a_0a_1 - a_2a_3 = 0 \\
h'^{(12)} &= 0 &&\text{implying } a_1b_0 - a_3b_2 = 0 \\
h'^{(13)} &= 0 &&\text{implying } a_1b_0^* - a_2b_2^* = 0 \\
h'^{(32)} &= -1 &&\text{implying } b_2b_2^* = 1 \ \rightarrow\ b_2 = e^{ic},\ c \in \mathbb{R}
\end{aligned}
\]
Therefore we deduce
\[
\begin{aligned}
&a_1 = e^{-k}, \quad a_0 = e^{k}a_2a_3 \\
&b_0 = e^{k}a_3e^{ic}, \quad b_1 = 0, \quad b_2 = e^{ic}, \quad b_3 = 0 \\
&b_0^* = a_2b_2^*e^{k} = a_2e^{-ic}e^{k} = e^{k}a_3^*e^{-ic}
\end{aligned}
\]
therefore
\[
a_2 = a_3^*
\]
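We can finally write (with $a = a_2$)
\[
\begin{aligned}
h'^{(0)} &= e^{k}h^{(0)} \\
h'^{(1)} &= e^{k}aa^*h^{(0)} + e^{-k}h^{(1)} + ah^{(2)} + a^*h^{(3)} \\
h'^{(2)} &= a^*e^{k}e^{ic}h^{(0)} + e^{ic}h^{(2)} \\
h'^{(3)} &= (h'^{(2)})^{*c} = ae^{k}e^{-ic}h^{(0)} + e^{-ic}h^{(3)}
\end{aligned}
\]
We see that if we insist on keeping the direction of $h^{(0)}$ fixed then the freedom to
choose a tetrad involves four real parameters. We can obtain the Lorentz transformation
corresponding to each of these parameters. Consider the quaternion
\[
q = \alpha h^{(0)} + \beta h^{(1)} \qquad \alpha, \beta \in \mathbb{R}
\]
From the definition of $h^{(0)}, h^{(1)}$ then, clearly $q^* = q^c$ and $N_q = 2\alpha\beta$. Thus requiring
$N_q = 1$ implies $2\alpha\beta = 1$. If $x$ is a space-time event then
\[
x' = qxq
\]
represents a pure (boost) Lorentz transformation. We easily find that with this choice of $q$
\[
\begin{aligned}
h'^{(0)} = qh^{(0)}q &= [\alpha h^{(0)} + \beta h^{(1)}][h^{(0)}(\alpha h^{(0)} + \beta h^{(1)})] \\
&= 2\alpha^2h^{(0)}
\end{aligned}
\]
Therefore choosing $2\alpha^2 = e^{k}$ to conform with the above analysis we find
\[
\alpha = \frac{e^{k/2}}{\sqrt{2}} \qquad \beta = \frac{e^{-k/2}}{\sqrt{2}}
\]
Then, by a similar procedure
\[
\begin{aligned}
h'^{(1)} &= e^{-k}h^{(1)} \\
h'^{(2)} &= qh^{(2)}q = 2\alpha\beta h^{(2)} = h^{(2)} \\
h'^{(3)} &= h^{(3)}
\end{aligned}
\]
So the choice
\[
q = \tfrac{1}{\sqrt{2}}\left(e^{k/2}h^{(0)} + e^{-k/2}h^{(1)}\right)
\]
is the Lorentz transformation which changes $h^{(a)}$ into $h'^{(a)}$ with $a = c = 0$.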
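The Lorentz transformation with $a = k = 0$ is modelled by the choice
\[
q = \alpha h^{(0)} + \beta h^{(1)} \qquad \alpha^* = \beta \qquad 2\alpha\beta = 1
\]
(for then $q^* = q$ and so this corresponds to a spatial rotation). The supplementary
conditions are obtained from the requirement that $N_q = 1$:
\[
N_q = qq^c = 2\alpha\beta \quad\text{and}\quad q^* = q \ \rightarrow\ \beta^* = \alpha
\]
Choose
\[
\alpha = \tfrac{1}{\sqrt{2}}e^{-ic/2} \qquad \beta = \tfrac{1}{\sqrt{2}}e^{ic/2} \qquad c \in \mathbb{R}
\]
Specifically:
\[
\begin{aligned}
h'^{(0)} &= q^ch^{(0)}q = 2\alpha\beta h^{(0)} = h^{(0)} \\
h'^{(1)} &= q^ch^{(1)}q = 2\alpha\beta h^{(1)} = h^{(1)} \\
h'^{(2)} &= q^ch^{(2)}q = 2\beta^2h^{(2)} = e^{ic}h^{(2)} \\
h'^{(3)} &= q^ch^{(3)}q = 2\alpha^2h^{(3)} = e^{-ic}h^{(3)}
\end{aligned}
\]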
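Both one-parameter transformations can be checked by direct multiplication. The sketch below assumes the explicit null tetrad $h^{(0)} = (1 + i\mathbf{i})/\sqrt{2}$, $h^{(1)} = (1 - i\mathbf{i})/\sqrt{2}$, $h^{(2)} = (i\mathbf{j} - \mathbf{k})/\sqrt{2}$, $h^{(3)} = (i\mathbf{j} + \mathbf{k})/\sqrt{2}$ (an assumption consistent with the tetrad metric of this section), and applies $x' = qxq$ for the boost and $x' = q^cxq$ for the rotation; `boost` and `rotate` are illustrative names.

```python
import numpy as np

CONJ = np.array([1, -1, -1, -1])         # quaternion conjugation signs

def qmul(p, q):
    """Hamilton product of [scalar, x, y, z] arrays with complex entries."""
    a0, a1, a2, a3 = p
    b0, b1, b2, b3 = q
    return np.array([
        a0*b0 - a1*b1 - a2*b2 - a3*b3,
        a0*b1 + a1*b0 + a2*b3 - a3*b2,
        a0*b2 - a1*b3 + a2*b0 + a3*b1,
        a0*b3 + a1*b2 - a2*b1 + a3*b0,
    ])

s = 1 / np.sqrt(2)
h = [s * np.array([1, 1j, 0, 0]), s * np.array([1, -1j, 0, 0]),
     s * np.array([0, 0, 1j, -1]), s * np.array([0, 0, 1j, 1])]

def boost(x, k):
    """x' = q x q with q = (e^{k/2} h0 + e^{-k/2} h1) / sqrt(2)."""
    q = s * (np.exp(k / 2) * h[0] + np.exp(-k / 2) * h[1])
    return qmul(qmul(q, x), q)

def rotate(x, c):
    """x' = q^c x q with q = (e^{-ic/2} h0 + e^{ic/2} h1) / sqrt(2)."""
    q = s * (np.exp(-1j * c / 2) * h[0] + np.exp(1j * c / 2) * h[1])
    return qmul(qmul(q * CONJ, x), q)

k, c = 0.3, 0.5
boosted = [boost(x, k) for x in h]
rotated = [rotate(x, c) for x in h]
```

The boost rescales $h^{(0)}, h^{(1)}$ by $e^{\pm k}$ and leaves $h^{(2)}, h^{(3)}$ alone, while the rotation multiplies $h^{(2)}, h^{(3)}$ by the phases $e^{\pm ic}$.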
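By choosing a quaternion $q$ of the form $q = \alpha h^{(0)} + \beta h^{(1)} + \gamma h^{(2)} + \delta h^{(3)}$ and then finding
$\alpha, \beta, \gamma, \delta$ so that
\[
\begin{aligned}
h'^{(0)} &= h^{(0)} \\
h'^{(2)} &= a^*h^{(0)} + h^{(2)} \\
h'^{(3)} &= ah^{(0)} + h^{(3)}
\end{aligned}
\]
one can easily show that the transformation accounting for the $h'^{(a)}$ tetrad above, with
$k = c = 0$ is
\[
\begin{aligned}
q &= \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} + a^*h^{(3)}\right) \qquad N_q = 1 \\
q^* &= \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} - ah^{(2)}\right) \\
q^c &= \tfrac{1}{\sqrt{2}}\left(h^{(1)} + h^{(0)} - a^*h^{(3)}\right) \\
q^{*c} &= \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} + ah^{(2)}\right)
\end{aligned}
\]
Then, using $x' = q^{*c}xq$ we find the $h'^{(a)}$ tetrad above with $k = c = 0$.
It is interesting to note that the closely related quaternion $q$:
\[
q = \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} + bh^{(2)}\right)
\]
leads to the tetrad
\[
\begin{aligned}
h'^{(0)} &= q^{*c}h^{(0)}q = h^{(0)} + bb^*h^{(1)} + bh^{(2)} + b^*h^{(3)} \\
h'^{(2)} &= q^{*c}h^{(2)}q = h^{(2)} + b^*h^{(1)} \\
h'^{(3)} &= q^{*c}h^{(3)}q = h^{(3)} + bh^{(1)}
\end{aligned}
\]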
This choice satisfies
This (2 real-parameter) Lorentz transformation does not keep the direction of $h^{(0)}$ fixed.
This, together with the 4 real-parameter Lorentz transformation obtained above, defined
by parameters $k, c, a$ represents the full 6 real-parameter Lorentz transformation.
Bivector Representation in the Null Tetrad
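We first obtain the form taken by the alternating pseudo-tensor $\eta_{\alpha\beta\gamma\delta}$ with respect to the
null tetrad. By definition
\[
\eta_{\alpha\beta\gamma\delta} = \left(-\det(h_{\alpha\beta})\right)^{1/2}\epsilon_{\alpha\beta\gamma\delta}
\]
where $\epsilon_{\alpha\beta\gamma\delta}$ is the completely anti-symmetric object with $\epsilon_{0123} = +1$. The tetrad
components are obtained in the usual way:
\[
\begin{aligned}
\eta_{(abcd)} &= \left(-\det(h_{\alpha\beta})\right)^{1/2}\epsilon_{\alpha\beta\gamma\delta}\,h^{\alpha}_{(a)}h^{\beta}_{(b)}h^{\gamma}_{(c)}h^{\delta}_{(d)} \\
&= \left(-\det(h_{\alpha\beta})\right)^{1/2}\epsilon_{(abcd)}\det(h^{\alpha}_{(a)})
\end{aligned}
\]
from determinant theory. But
\[
h_{(ab)} = h_{\alpha\beta}h^{\alpha}_{(a)}h^{\beta}_{(b)}
\]
and
\[
+1 = \det(h_{(ab)}) = \det(h_{\alpha\beta})\left(\det(h^{\alpha}_{(a)})\right)^2
\]
\[
\therefore\ \left(-\det(h_{\alpha\beta})\right)^{1/2}\det(h^{\alpha}_{(a)}) = \pm i
\]
We choose the negative sign implying
\[
\eta_{(abcd)} = -i\epsilon_{(abcd)}
\]
Also
\[
\eta^{(0123)} = \eta_{(1032)} = \eta_{(0123)} = -i\epsilon_{(0123)}
\]
We easily obtain the relations between the covariant and contravariant tetrad components
of $P_{(ab)}$ as:
\[
\begin{aligned}
P^{(20)} &= -P_{(31)} & P^{(10)} &= P_{(01)} & P^{(30)} &= -P_{(21)} \\
P^{(12)} &= -P_{(03)} & P^{(13)} &= -P_{(02)} & P^{(23)} &= P_{(32)}
\end{aligned}
\]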
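whilst the dual components of $P_{(ab)}$ are defined by
\[
P^{\otimes}_{(ab)} = \tfrac{1}{2}\eta_{(abcd)}P^{(cd)} = -\tfrac{i}{2}\epsilon_{(abcd)}P^{(cd)}
\]
which implies
\[
\begin{aligned}
P^{\otimes}_{(01)} &= -iP_{(32)} & P^{\otimes}_{(02)} &= -iP_{(02)} & P^{\otimes}_{(03)} &= iP_{(03)} \\
P^{\otimes}_{(12)} &= -iP_{(21)} & P^{\otimes}_{(13)} &= -iP_{(13)} & P^{\otimes}_{(23)} &= iP_{(01)}
\end{aligned}
\]
We note, in particular, if $P_{(ab)}$ is self-dual (satisfying $P^{\otimes}_{(ab)} = iP_{(ab)}$) the above relations
necessarily imply
\[
P_{(01)} = P_{(23)} \qquad P_{(02)} = 0 \qquad P_{(13)} = 0
\]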
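Any complex bivector $P_{(ab)}$ has two invariants $I_1, I_2$ associated with it:
\[
\begin{aligned}
I_1 &= \tfrac{1}{4}h^{(ac)}h^{(bd)}P_{(ab)}P_{(cd)} \\
&= -\tfrac{1}{2}\left[P_{(01)}^2 + P_{(23)}^2\right] - P_{(02)}P_{(13)} - P_{(03)}P_{(12)}
\end{aligned}
\]
and
\[
\begin{aligned}
I_2 &= \tfrac{1}{8}\eta^{(abcd)}P_{(ab)}P_{(cd)} = -\tfrac{i}{8}\epsilon^{(abcd)}P_{(ab)}P_{(cd)} \\
&= -i\left[P_{(01)}P_{(23)} + P_{(20)}P_{(13)} + P_{(03)}P_{(12)}\right]
\end{aligned}
\]
Therefore
\[
\begin{aligned}
I_1 + iI_2 &= 2P_{(20)}P_{(13)} - \tfrac{1}{2}\left[P_{(01)} - P_{(23)}\right]^2 \\
I_1 - iI_2 &= 2P_{(30)}P_{(12)} - \tfrac{1}{2}\left[P_{(01)} + P_{(23)}\right]^2
\end{aligned}
\]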
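Thus if the bivector is self-dual $I_1 + iI_2 = 0$ and so there is only one complex invariant for
a self-dual bivector: $I_1 - iI_2$. Now if we write
\[
P_{(ab)} = F_{(ab)} - iF^{\otimes}_{(ab)}
\]
which is self-dual then the usual real invariants of $F_{\alpha\beta}$ can be recovered from $I_1 - iI_2$. To
see this we remember that
\[
F_{ab} = -\epsilon_{mab}(\underline{B})_m \qquad F_{0m} = (\underline{A})_m
\]
or, in matrix form:
\[
F_{\alpha\beta} = \begin{bmatrix} 0 & A_1 & A_2 & A_3 \\ -A_1 & 0 & -B_3 & B_2 \\ -A_2 & B_3 & 0 & -B_1 \\ -A_3 & -B_2 & B_1 & 0 \end{bmatrix}
\qquad
F^{\otimes}_{\alpha\beta} = \begin{bmatrix} 0 & -B_1 & -B_2 & -B_3 \\ B_1 & 0 & -A_3 & A_2 \\ B_2 & A_3 & 0 & -A_1 \\ B_3 & -A_2 & A_1 & 0 \end{bmatrix}
\]
Thus, using the relations:
\[
F_{(ab)} = F_{\alpha\beta}h^{\alpha}_{(a)}h^{\beta}_{(b)} \qquad F^{\otimes}_{(ab)} = F^{\otimes}_{\alpha\beta}h^{\alpha}_{(a)}h^{\beta}_{(b)} \qquad (*)
\]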
leading to:
^(20) - ~2A<2 + 2Bs + 2^2 + 2^2
F®
r(20)
^2 + ^3-^3 + ^2
Therefore P(20) = 0 as expected and, similarly, P(13) = 0. Also
^(oi) - ~M
(23)
-Z^i
^(oi) = -^i - *#i
^(23) -
-Z^i
and so P(23) — ^(oi) as expected. Continuing, we obtain
F(3o) = ~9^2 + 2^3 ~ 2^3 ~ 2^2
(30)
F®
^(30)
^2 + ^3 + ^3-^42
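Therefore $P_{(30)} = -A_2 - iB_2 - iA_3 + B_3$. Similarly $P_{(12)} = A_2 + iB_2 - iA_3 + B_3$. We are
now in a position to calculate $I_1 - iI_2$:
\[
\begin{aligned}
I_1 - iI_2 &= 2P_{(30)}P_{(12)} - \tfrac{1}{2}\left(P_{(01)} + P_{(23)}\right)^2 \\
&= 2[-A_2 - iB_2 - iA_3 + B_3][A_2 + iB_2 - iA_3 + B_3] - 2[A_1 + iB_1]^2 \\
&= 2[\underline{B}\cdot\underline{B} - \underline{A}\cdot\underline{A} - 2i\,\underline{A}\cdot\underline{B}] \qquad \text{as predicted}
\end{aligned}
\]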
We have seen that a complex self-dual bivector $P_{(ab)}$ has only three independent complex
components $P_{(30)}$, $P_{(12)}$ and $(P_{(01)} + P_{(23)})$ and, as such, they can be described by a single
complexified quaternion (as can any real bivector).
The set of all bivectors $\underline{B} - i\underline{A}$ form a three-dimensional complex vector space. Let $Y^{(a)}$, $a = 1,2,3$ be a basis for this space. An immediate question is what basis bivectors should we
choose? Well there are some natural candidates which arise from the basic null tetrad
$h^{(a)}$, $a = 0,1,2,3$ that we have already chosen. As we have noted earlier:
\[
h^{(a)}(h^{(b)})^c = \begin{bmatrix} 0 & \sqrt{2}h^{(0)} & -\sqrt{2}h^{(2)} & 0 \\ \sqrt{2}h^{(1)} & 0 & 0 & -\sqrt{2}h^{(3)} \\ \sqrt{2}h^{(2)} & 0 & 0 & -\sqrt{2}h^{(0)} \\ 0 & \sqrt{2}h^{(3)} & -\sqrt{2}h^{(1)} & 0 \end{bmatrix}
\]
Thus two bivectors are immediately suggested
\[
h^{(0)}h^{(2)c} = -\sqrt{2}h^{(2)} \quad\text{and}\quad h^{(3)}h^{(1)c} = \sqrt{2}h^{(3)}
\]
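which we label as $Y^{(1)}, Y^{(2)}$ respectively. A third is constructed by combining elements
from the above table: We choose
\[
Y^{(3)} = \tfrac{1}{2}\left(h^{(2)}h^{(3)c} + h^{(1)}h^{(0)c}\right)
\]
This, and indeed the other two are suggested by the terms in the complex invariant
$I_1 + iI_2 = 2P_{(02)}P_{(31)} - \tfrac{1}{2}(P_{(23)} + P_{(10)})^2$. We note that $Y^{(3)}$ can be expressed in different
ways: from the table above
\[
Y^{(3)} = \tfrac{1}{2}\left(h^{(2)}h^{(3)c} - h^{(3)}h^{(2)c}\right) \quad\text{or}\quad Y^{(3)} = \tfrac{1}{2}\left(h^{(1)}h^{(0)c} - h^{(0)}h^{(1)c}\right)
\]
from either of which it is obvious that $Y^{(3)}$ is a simple self-dual bivector, as are $Y^{(1)}, Y^{(2)}$.
Thus to recap:
\[
\begin{aligned}
Y^{(1)} &= h^{(0)}h^{(2)c} = -\sqrt{2}h^{(2)} \\
Y^{(2)} &= h^{(3)}h^{(1)c} = \sqrt{2}h^{(3)} \\
Y^{(3)} &= \tfrac{1}{2}\left(h^{(2)}h^{(3)c} + h^{(1)}h^{(0)c}\right) = \tfrac{1}{\sqrt{2}}\left(-h^{(0)} + h^{(1)}\right)
\end{aligned}
\]
are the three basis vectors chosen to span the complex 3-D vector space of quaternions
$H = \underline{B} - i\underline{A}$, $\underline{A}, \underline{B} \in \mathbb{R}^3$. The metric of this space is found in the usual way:
\[
k^{ab} = \langle Y^{(a)}, Y^{(b)}\rangle = S\!\left(Y^{(a)}Y^{(b)c}\right) = \begin{bmatrix} 0 & 2 & 0 \\ 2 & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\]
then
\[
k_{ab} = \begin{bmatrix} 0 & \tfrac{1}{2} & 0 \\ \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & -1 \end{bmatrix}
\]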
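Thus at any space-time event any complex self-dual bivector $P_{(ab)}$ (and hence any real
bivector) can be represented in one-to-one fashion by a vector of $\mathbb{C}^3$
\[
\underline{B} - i\underline{A} = P_aY^{(a)} = \sqrt{2}\left[-P_1h^{(2)} + P_2h^{(3)} + \tfrac{1}{2}P_3\left(-h^{(0)} + h^{(1)}\right)\right]
\]
therefore
\[
\underline{B} - i\underline{A} = -P_1(i\mathbf{j} - \mathbf{k}) + P_2(i\mathbf{j} + \mathbf{k}) - P_3\,i\mathbf{i}
\]
so
\[
B_1 - iA_1 = -iP_3 \qquad P_3 = A_1 + iB_1 = \tfrac{1}{2}\left(P_{(32)} + P_{(10)}\right)
\]
also
\[
B_2 - iA_2 = -iP_1 + iP_2 \qquad B_3 - iA_3 = P_1 + P_2
\]
which together imply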
P2 = --[B2-iA2+iB3 + A3]
= -[-A2-iB2 + B3-iA3]
= P(30)
Pi = \\B2-iA2-iB3-A3]
= -\A2 + iB2 + B3 - iA3]
= P
(12)
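Also the invariants of $F_{\alpha\beta}$ (i.e. $F_{\alpha\beta}F^{\alpha\beta}$ and $F^{\otimes}_{\alpha\beta}F^{\alpha\beta}$) are the real and imaginary parts of
$P_{\alpha\beta}P^{\alpha\beta}$ (where $P_{\alpha\beta} = F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta}$) and are conveniently expressed by
\[
k^{ab}P_aP_b = 4P_1P_2 - P_3^2
\]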
Of course, Y^ a = 1,2,3 are also complex self-dual bivectors. By writing P(a&) = PaY^
with any two of Pa a = 1,2,3 taken as zero and utilising the relations
Pl=P(12) P2=P(30) P3 = \(P(32)+P(10))
we easily see that the bivectors representing Y^ a = 1,2,3 are (where the square brackets
represent anti-symmetrization /^j = -{lab — ha)-)
y(a6) - Zd[adb]
y(a6) - 2d[A]
y(a6) - d[adb] + d[eA]
We can therefore write, for any complex self-dual bivector P^
P(ab) = P(12)Y{ab) + ^(30)V(a26) + (P(23) + ^(01))^)
The general tensor components of Y/J m = 1,2,3 follow the usual prescription:
Ja/3 - X{ab)aoc ap
If Pa/? > Qa/? are any two complex self-dual bivectors then we can write
p n _ p v(m) n n v(m)
■"a/j — rmIap WaP — Vm1^
Thus, for example;
But, as is easy to verify:
yMy(n)^ = r • p^Qaf) = kmnPmQn
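In particular if $P_{\alpha\beta} = Q_{\alpha\beta} = F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta}$ where $F_{\alpha\beta}$ is real then
\[
P_{\alpha\beta}P^{\alpha\beta} = 2\left[F_{\alpha\beta}F^{\alpha\beta} - iF^{\otimes}_{\alpha\beta}F^{\alpha\beta}\right] = k^{mn}P_mP_n
\]
Thus $F_{\alpha\beta}$ is a null bivector (when its two invariants vanish) if $k^{mn}P_mP_n = 0$; that is if
$P_m$ is a null vector in $\mathbb{C}^3$.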
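Other Formulae Involving $Y^{(a)}$
Straightforward calculations confirm that
\[
\begin{aligned}
Y^{(m)}Y^{(n)c} &= k^{mn} + 2\epsilon^{mnp}Y_{(p)} \\
Y^{(m)}Y^{(n)} &= -k^{mn} - 2\epsilon^{mnp}Y_{(p)} \\
\eta^{\beta\delta}Y^{(m)}_{\alpha\beta}Y^{(n)}_{\gamma\delta} &= k^{mn}\eta_{\alpha\gamma} + 2\epsilon^{mnp}Y_{(p)\alpha\gamma}
\end{aligned}
\]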
Effect of Lorentz transformations on the Bivector Basis in C3
As we saw earlier the Lorentz transformations can be split into those which alter the
direction of h^ and those which do not. Considering the first type with
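\[
q = \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} + bh^{(2)}\right)
\]
defining the 2-parameter Lorentz transformation implies
\[
\begin{aligned}
h'^{(0)} &= h^{(0)} + bb^*h^{(1)} + bh^{(2)} + b^*h^{(3)} \\
h'^{(1)} &= h^{(1)} \\
h'^{(2)} &= h^{(2)} + b^*h^{(1)} \\
h'^{(3)} &= h^{(3)} + bh^{(1)}
\end{aligned}
\]
then we easily discover
\[
\begin{aligned}
Y^{(1)} \rightarrow Y'^{(1)} &= h'^{(0)}h'^{(2)c} \\
&= \left[h^{(0)} + bb^*h^{(1)} + bh^{(2)} + b^*h^{(3)}\right]\left[h^{(2)c} + b^*h^{(1)c}\right] \\
&= h^{(0)}h^{(2)c} + b^*h^{(0)}h^{(1)c} + b^*h^{(3)}h^{(2)c} + (b^*)^2h^{(3)}h^{(1)c} \\
&= Y^{(1)} - 2b^*Y^{(3)} + (b^*)^2Y^{(2)} \\
Y^{(2)} \rightarrow Y'^{(2)} &= h'^{(3)}h'^{(1)c} = Y^{(2)} \\
Y^{(3)} \rightarrow Y'^{(3)} &= \tfrac{1}{2}\left(h'^{(2)}h'^{(3)c} + h'^{(1)}h'^{(0)c}\right) \\
&= Y^{(3)} - b^*Y^{(2)}
\end{aligned}
\]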
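Now considering transformations which do preserve the direction of $h^{(0)}$ namely:
\[
\begin{aligned}
h''^{(0)} &= e^{k}h^{(0)} \\
h''^{(1)} &= e^{k}aa^*h^{(0)} + e^{-k}h^{(1)} + ah^{(2)} + a^*h^{(3)} \\
h''^{(2)} &= a^*e^{k}e^{ic}h^{(0)} + e^{ic}h^{(2)} \\
h''^{(3)} &= ae^{k}e^{-ic}h^{(0)} + e^{-ic}h^{(3)}
\end{aligned}
\]
we find
\[
\begin{aligned}
Y^{(1)} \rightarrow Y''^{(1)} &= e^{k}e^{ic}Y^{(1)} \\
Y^{(2)} \rightarrow Y''^{(2)} &= -2ae^{-ic}Y^{(3)} + a^2e^{k}e^{-ic}Y^{(1)} + e^{-ic}e^{-k}Y^{(2)} \\
Y^{(3)} \rightarrow Y''^{(3)} &= -ae^{k}Y^{(1)} + Y^{(3)}
\end{aligned}
\]
An easy calculation (utilising the table of products found for $Y^{(m)}Y^{(n)}$) confirms that
\[
\begin{aligned}
\langle Y'^{(m)}, Y'^{(n)}\rangle &= S\!\left(Y'^{(m)}Y'^{(n)c}\right) = k^{mn} \\
\langle Y''^{(m)}, Y''^{(n)}\rangle &= S\!\left(Y''^{(m)}Y''^{(n)c}\right) = k^{mn}
\end{aligned}
\]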
We conclude that rotating the null tetrad via a Lorentz transformation is equivalent to
rotating the complex basis Y^ in C3 since the scalar product (i.e. the metric) is left
unaltered.
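Presumably, we can rotate axes in $\mathbb{C}^3$ (knowing that this is equivalent to a Lorentz
transformation) in order to simplify the form taken by a vector in $\mathbb{C}^3$.
Now any complex self-dual bivector $P$ can be expressed in the form
\[
P = P_iY^{(i)} = P'_iY'^{(i)} = P''_iY''^{(i)}
\]
that is
\[
\begin{aligned}
P_1Y^{(1)} + P_2Y^{(2)} + P_3Y^{(3)} &= P'_1\left[Y^{(1)} - 2b^*Y^{(3)} + (b^*)^2Y^{(2)}\right] \\
&\quad + P'_2Y^{(2)} + P'_3\left[Y^{(3)} - b^*Y^{(2)}\right]
\end{aligned}
\]
so
\[
P_1 = P'_1 \qquad P_2 = (b^*)^2P'_1 - b^*P'_3 + P'_2 \qquad P_3 = -2b^*P'_1 + P'_3
\]
or, inverting
\[
P'_1 = P_1 \qquad P'_2 = P_2 + (b^*)^2P_1 + b^*P_3 \qquad P'_3 = P_3 + 2b^*P_1
\]
Also, from the second relation
\[
\begin{aligned}
P_1Y^{(1)} + P_2Y^{(2)} + P_3Y^{(3)} &= P''_1\left[e^{k}e^{ic}Y^{(1)}\right] \\
&\quad + P''_2\left(-2ae^{-ic}Y^{(3)} + a^2e^{k}e^{-ic}Y^{(1)} + e^{-ic}e^{-k}Y^{(2)}\right) \\
&\quad + P''_3\left(-ae^{k}Y^{(1)} + Y^{(3)}\right)
\end{aligned}
\]
Therefore
\[
\begin{aligned}
P_1 &= e^{k}e^{ic}P''_1 + a^2e^{k}e^{-ic}P''_2 - ae^{k}P''_3 \\
P_2 &= e^{-ic}e^{-k}P''_2 \\
P_3 &= -2ae^{-ic}P''_2 + P''_3
\end{aligned}
\]
or, again inverting:
\[
\begin{aligned}
P''_1 &= e^{-k}e^{-ic}P_1 + a^2e^{k}e^{-ic}P_2 + ae^{-ic}P_3 \\
P''_2 &= e^{ic}e^{k}P_2 \\
P''_3 &= P_3 + 2ae^{k}P_2
\end{aligned}
\]
Clearly, by suitable choice of $b^*$ we can always arrange matters so that $P'_2$ or $P'_3$ vanish.
However, if we choose $b^*$ to make $P'_3$ vanish then another Lorentz transformation (involving
$c, k, a$) can be chosen to make it non-zero. Hence this is not invariant. If instead we choose
$b^*$ so that $P'_2$ was to vanish then no other Lorentz transformation (involving $c, k, a$) can be
chosen to make it non-zero. Hence this is an invariant choice.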
3.8 Classification of Complex Bivectors and of the Weyl Tensor
We have seen that any complex self-dual bivector $P_{\alpha\beta}$ can be represented in quaternionic
form by the components $P_a$ of a vector $P$ in a complex 3-D space ($P = P_aY^{(a)}$). Now
such a vector can take one of two basic forms: it is either null or non-null
\[
N_P = 0 \quad\text{or}\quad N_P \neq 0
\]
$N_P = 0$: Null Complex Self-dual Bivectors
Here
\[
k^{ab}P_aP_b = 2\left[2P_1P_2 - \tfrac{1}{2}P_3^2\right] = 0
\]
We can always choose coordinates (the value of $b^*$) to force $P'_2 = 0$. In this case there is
only one such value. Thus we can also deduce, in this case, that $P_3 = 0$ and so we can,
by a complex rotation, find a basis in $\mathbb{C}^3$ in which $P$ has only one component along $Y^{(1)}$.
Therefore, in this case:
\[
P = P_1Y^{(1)} = -P_1\sqrt{2}h^{(2)}
\]
from which we find
\[
h^{(0)c}P = -P_1\sqrt{2}\,h^{(0)c}h^{(2)} = 0
\]
Since there is only one value of $b^*$ then there is only one direction $h^{(0)}$ for which this is
true. This has the tensor equivalent
\[
(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})h^{(0)\beta} = 0 \ \rightarrow\ F_{\alpha\beta}h^{(0)\beta} = 0 \ \text{ and } \ F^{\otimes}_{\alpha\beta}h^{(0)\beta} = 0
\]
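$N_P \neq 0$: Non-null Complex Self-dual Bivectors
Here $2P_1P_2 - \tfrac{1}{2}P_3^2 \neq 0$ so there exist two values of $b^*$ which can be chosen to make $P'_2$
vanish. This implies that a complex rotation can be chosen so that, at the outset, $P_2 = 0$.
In this case
\[
P = P_1Y^{(1)} + P_3Y^{(3)}
\]
\[
\therefore\ h^{(0)c}P = P_3h^{(1)}
\]
implying $(h^{(0)c}P)h^{(0)} = 0$. But this in turn implies
\[
S\!\left[(h^{(0)c}P)h^{(0)}\right] = 0 \quad\text{and}\quad V\!\left[(h^{(0)c}P)h^{(0)}\right] = 0
\]
In terms of tensors the first of these relations implies
\[
(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})h^{(0)\beta}h^{(0)\alpha} = 0 \qquad \left[\text{or } S[(h^{(0)c}P)h^{(0)}] = S(P) = 0\right]
\]
which is trivially satisfied. The second condition implies
\[
(F_{\alpha\beta} - iF^{\otimes}_{\alpha\beta})h^{(0)\beta}h^{(0)}_{\gamma} - (F_{\gamma\beta} - iF^{\otimes}_{\gamma\beta})h^{(0)\beta}h^{(0)}_{\alpha} = 0
\]
that is
\[
F_{\beta[\alpha}h^{(0)}_{\gamma]}h^{(0)\beta} = 0 \quad\text{and}\quad F^{\otimes}_{\beta[\alpha}h^{(0)}_{\gamma]}h^{(0)\beta} = 0
\]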
Classification of the Weyl Tensor
Here we consider extending the approach outlined above, used to classify the
electromagnetic field (the bivector Fa^), to the task of classifying the Weyl tensor (which
is identical to the Riemann curvature tensor in vacuum). It is well known that the Weyl
tensor is related to the Riemann tensor via:
Cap-y6 = Rap-y6 ~ ~ fo/W^cry ~ VpfRa6 + Va-yRp6 ~ VadRp-y]
-R[r}air}(36 ~ VadVp^}
where Rap is the Ricci tensor
Rap - Rai^ R - Raa
It is easily checked that
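(i) $C_{\alpha\beta\gamma\delta} = C_{[\alpha\beta]\gamma\delta} = C_{\alpha\beta[\gamma\delta]}$
(ii) $C_{[\alpha\beta\gamma]\delta} = 0$
(iii) $C_{\alpha\beta\gamma\delta} = C_{\gamma\delta\alpha\beta}$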
We can contemplate taking the dual on either the first or second pair of indices:
c ® = —n n auj
Also
Now
where
a/?7<5 — 2 'a^mn 7<5
_„ „ s^mncru;
— AflaPmn'l'y6(TUJ^'
= \va0mnVabCdria-rr,bSVcaV^Cmn''U
- CPm + Cm + C%% + C%%
- c^i+cm+cm+cm
= -6%sr - c%i - cc
+ CC-CC + CC
811 = 6181-8161
Therefore
_ £mPsi[qn]auj _. cmps^[pn]aw _ onps^[mq]auj _. cnqs^[mp]cruj
— __n n firnn^i <ru>
— <y 'Imcr'lnu;uab ^xy
= —dnao'nbu - VbcrVa^Cxy™
= ~~Z\y/xyab ~ ^xyba\ = ~^abxy
(In the above reduction we have used the property of the Weyl tensor that all of its
contractions vanish). Now, following the direction taken in the analysis of bivectors we
define
\[
C^{+}_{abxy} = C_{abxy} - iC^{\otimes}_{abxy}
\]
Since
that is
°a6xy ~ ^abxy ™ieil ^a&Xy — °a6xy
abxy abxy
and therefore
^abxy ~ ^abxy iK^abxy ~ ^abxy ^abxy
° a6xy ~~ °a6xy L^ abxy ~ ^abxy I" L^abxy
- lUabxy
r<+ ® _ /nr <g> _ -/nr® ® _ /^ ® \ »n ,
° a6xy — °a6xy ^a6xy ~ °a6xy ^ *W6xy
= ^a6xy
Therefore $C^{+}_{abcd}$ is self-dual in both index pairs. By treating each pair of indices in $C^{+}_{abcd}$
as a separate entity it is clear that this can be represented in $\mathbb{C}^3$ by a complex $3 \times 3$ matrix $C_{mn}$:
\[
C^{+}_{abcd} = C_{mn}Y^{(m)}_{ab}Y^{(n)}_{cd}
\]
and since $C^{+}_{abcd} = C^{+}_{cdab}$ then
\[
C_{mn}Y^{(m)}_{ab}Y^{(n)}_{cd} = C_{mn}Y^{(m)}_{cd}Y^{(n)}_{ab} = C_{nm}Y^{(m)}_{ab}Y^{(n)}_{cd}
\]
so
\[
C_{mn} = C_{nm}
\]
Also since $\eta^{bd}C^{+}_{abcd} = 0$ (all contractions of the Weyl tensor vanish) we obtain
\[
\begin{aligned}
0 &= C_{mn}\eta^{bd}Y^{(m)}_{ab}Y^{(n)}_{cd} \\
&= C_{mn}\left(k^{mn}\eta_{ac} + 2\epsilon^{mnp}Y_{(p)ac}\right) \\
&= C_{mn}k^{mn}\eta_{ac}
\end{aligned}
\]
\[
\therefore\ k^{mn}C_{mn} = 0 \quad\text{or}\quad 4C_{12} = C_{33}
\]
This constraint, together with the symmetry condition shows that $C_{mn}$ has 5 independent
complex components which matches the 10 real components of the Weyl tensor. Now to
characterise $C^{+}_{abcd}$ we consider the eigenvalue problem for $C_{mn}$
\[
C_{mn}P^m = \lambda P_n
\]
where $P = P_mY^{(m)}$ is a vector of $\mathbb{C}^3$. By analysing this problem we seek to characterise
the Weyl tensor in a similar manner to that effected for the electromagnetic field.
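If we look for null eigenvectors then
\[
C_{mn}P^mP^n = 0 \quad\text{and}\quad k_{mn}P^mP^n = 0
\]
The second relation: $4P_1P_2 - P_3^2 = 0$ or, equivalently, $P^1P^2 - (P^3)^2 = 0$ is satisfied by
choosing $P^1 = \alpha^2$, $P^2 = \beta^2$, $P^3 = \alpha\beta$, $\alpha, \beta \in \mathbb{C}$. The first equation can then be expressed
as (utilising the constraint $C_{33} = 4C_{12}$)
\[
C_{11}\alpha^4 + 2C_{13}\alpha^3\beta + 6C_{12}\alpha^2\beta^2 + 2C_{23}\alpha\beta^3 + C_{22}\beta^4 = 0
\]
Now under the Lorentz transformation, characterised by the quaternion $q$:
\[
q = \tfrac{1}{\sqrt{2}}\left(h^{(0)} + h^{(1)} + bh^{(2)}\right)
\]
we found
\[
\begin{aligned}
Y^{(1)} \rightarrow Y'^{(1)} &= Y^{(1)} - 2b^*Y^{(3)} + (b^*)^2Y^{(2)} \\
Y^{(2)} \rightarrow Y'^{(2)} &= Y^{(2)} \\
Y^{(3)} \rightarrow Y'^{(3)} &= Y^{(3)} - b^*Y^{(2)}
\end{aligned}
\]
Therefore
\[
C_{mn}Y^{(m)}Y^{(n)} \rightarrow C'_{mn}Y'^{(m)}Y'^{(n)}
\]
We easily find
\[
\begin{aligned}
C_{11} &\rightarrow C'_{11} \\
C_{12} &\rightarrow (b^*)^2C'_{11} - b^*C'_{13} + C'_{12} \\
C_{13} &\rightarrow -2b^*C'_{11} + C'_{13} \\
C_{22} &\rightarrow (b^*)^4C'_{11} - 2(b^*)^3C'_{13} + 6(b^*)^2C'_{12} - 2b^*C'_{23} + C'_{22} \\
C_{23} &\rightarrow -2(b^*)^3C'_{11} + 3(b^*)^2C'_{13} - 6b^*C'_{12} + C'_{23} \\
C_{33} &\rightarrow 4(b^*)^2C'_{11} - 4b^*C'_{13} + 4C'_{12}
\end{aligned}
\]
where we have used $C'_{33} = 4C'_{12}$.
Clearly, as in the electromagnetic case, we can choose a Lorentz transformation, that is a value of
$b^*$ which satisfies the 4th degree polynomial (to make $C_{22}$ vanish). The multiplicity of the
roots for $b^*$, dictated by the values of $C'_{mn}$, can be used to classify $C_{mn}$. Now each value
of $b^*$ gives rise to a null vector $h^{(0)}$. We note (since, when differentiated with respect to
$b^*$, the transformed expression for $C_{22}$ becomes, up to a factor $-2$, the transformed expression for $C_{23}$) that
if there is a double root for $b^*$ then
\[
C_{22} = 0 \quad\text{and}\quad C_{23} = 0
\]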
A similar argument shows that if there is a triple root for $b^*$ then
\[
C_{22} = 0 \qquad C_{23} = 0 \quad\text{and}\quad C_{12} = 0
\]
whilst a 4-times repeated root implies
\[
C_{22} = 0 \qquad C_{23} = 0 \qquad C_{12} = 0 \quad\text{and}\quad C_{13} = 0
\]
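The link between repeated roots and vanishing coefficients can be illustrated with a small numerical sketch. It assumes the quartic and cubic in $b^*$ are the transformed expressions for $C_{22}$ and $C_{23}$ with the constraint $C_{33} = 4C_{12}$ already applied; the sample roots and function names are illustrative only.

```python
import numpy as np

def c22_polynomial(C11, C12, C13, C22, C23):
    """Quartic in b* giving the transformed C22 (highest power first)."""
    return np.poly1d([C11, -2 * C13, 6 * C12, -2 * C23, C22])

def c23_polynomial(C11, C12, C13, C23):
    """Cubic in b* giving the transformed C23."""
    return np.poly1d([-2 * C11, 3 * C13, -6 * C12, C23])

# Build coefficients whose quartic has the root pattern {2,1,1}:
# a double root at b* = 1 and simple roots at -2 and 3.
coeffs = np.poly([1.0, 1.0, -2.0, 3.0])      # b^4 - 3b^3 - 3b^2 + 11b - 6
C11 = coeffs[0]
C13 = -coeffs[1] / 2
C12 = coeffs[2] / 6
C23 = -coeffs[3] / 2
C22 = coeffs[4]

quartic = c22_polynomial(C11, C12, C13, C22, C23)
cubic = c23_polynomial(C11, C12, C13, C23)
derivative = quartic.deriv()                 # d(C22)/d(b*)
```

At the double root both `quartic(1.0)` and `cubic(1.0)` vanish, which is the statement $C_{22} = 0$, $C_{23} = 0$ for that choice of $b^*$.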
The following table describes the very well known classification

Type     Vanishing Coefficients           Null Vectors
I        C22                              {1,1,1,1}
II, D    C22, C23                         {2,1,1} {2,2}
III      C22, C23, C12                    {3,1}
N        C22, C23, C12, C13               {4}
O        all                              no preferred null vectors
Chapter 4
Cayley Numbers
4.1 A Common Notation for Numbers
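In this section we describe, in a single notation, the three types of number that we have met;
the scalars $\mathbb{R}$, the complex numbers $\mathbb{C}$ and the quaternions $\mathbb{H}$. We consider a quaternion
number $p$ to be a two-component object (suggested by the formalism of complex numbers
and by the matrix formulation of quaternions [17]):
\[
p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \alpha, \beta \in \mathbb{C}
\]
Equality between $p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$ and $q = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}$ is defined by:
\[
p = q \ \text{ if and only if } \ \alpha = \gamma \ \text{ and } \ \beta = \delta
\]
We define the binary operation $\oplus$ through the statement:
\[
p \oplus q = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \oplus \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} \alpha + \gamma \\ \beta + \delta \end{bmatrix}
\]
We immediately see that there is a $\oplus$ identity, designated by $O$:
\[
O = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \ \text{ is such that } \ p \oplus O = O \oplus p = p
\]
Multiplication by a scalar $s \in \mathbb{R}$ is written $sp$, defined by
\[
sp = \begin{bmatrix} s\alpha \\ s\beta \end{bmatrix}
\]
We also define the operation of conjugate:
\[
p^c = \begin{bmatrix} \bar{\alpha} \\ -\beta \end{bmatrix}
\]
This leads us to consider the scalar and vector parts of $p$:
\[
S(p) = \tfrac{1}{2}(p + p^c) \qquad V(p) = \tfrac{1}{2}(p - p^c)
\]
It immediately follows from these definitions that
\[
\text{if } p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \text{ then } S(p) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \qquad V(p) = \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix}
\]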
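Finally we define the multiplication rule, between elements $p, q$ written $p \circ q$:
\[
p \circ q = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} \alpha\gamma - \delta\bar{\beta} \\ \gamma\beta + \bar{\alpha}\delta \end{bmatrix}
\]
The $\circ$ identity element $E$ is easily recognised:
\[
E = \begin{bmatrix} 1 \\ 0 \end{bmatrix} \ \text{ is such that } \ p \circ E = E \circ p = p \ \text{ for all } p
\]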
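From this we can verify the associative law of multiplication $p \circ (q \circ r) = (p \circ q) \circ r$. Let
\[
p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad q = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \qquad r = \begin{bmatrix} a \\ b \end{bmatrix}
\]
then
\[
p \circ (q \circ r) = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \gamma a - b\bar{\delta} \\ a\delta + \bar{\gamma}b \end{bmatrix} = \begin{bmatrix} \alpha(\gamma a - b\bar{\delta}) - (a\delta + \bar{\gamma}b)\bar{\beta} \\ (\gamma a - b\bar{\delta})\beta + \bar{\alpha}(a\delta + \bar{\gamma}b) \end{bmatrix}
\]
\[
(p \circ q) \circ r = \begin{bmatrix} \alpha\gamma - \delta\bar{\beta} \\ \gamma\beta + \bar{\alpha}\delta \end{bmatrix} \circ \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} (\alpha\gamma - \delta\bar{\beta})a - b(\bar{\beta}\bar{\gamma} + \bar{\delta}\alpha) \\ a(\gamma\beta + \bar{\alpha}\delta) + (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})b \end{bmatrix}
\]
I have kept the separate elements in correct order as we shall meet both expressions again
later in situations in which associativity cannot be assumed. Since $\alpha, \beta, \gamma, \delta, a, b \in \mathbb{C}$ then
it is transparently clear that $p \circ (q \circ r) = (p \circ q) \circ r$. However, generally $p \circ q \neq q \circ p$ and
so the commutative rule fails in general. The conditions under which commutativity holds
are easily derived. Using $p, q$ as defined above: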
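\[
p \circ q = \begin{bmatrix} \alpha\gamma - \delta\bar{\beta} \\ \gamma\beta + \bar{\alpha}\delta \end{bmatrix} \qquad q \circ p = \begin{bmatrix} \gamma\alpha - \beta\bar{\delta} \\ \alpha\delta + \bar{\gamma}\beta \end{bmatrix}
\]
For these to be equal we require
\[
\delta\bar{\beta} = \beta\bar{\delta} \qquad \gamma\beta + \bar{\alpha}\delta = \alpha\delta + \bar{\gamma}\beta
\]
The first equation requires $V(\delta\bar{\beta}) = 0$ and the second requires $V(\gamma)\beta = V(\alpha)\delta$. From the
second: if $V(\gamma) \neq 0$ then $\beta = k\delta$, $k \in \mathbb{R}$ and $V(\alpha) = kV(\gamma)$. The first equation is then
satisfied identically. If however, $V(\gamma) = 0$ then either $V(\alpha) = 0$ (implying $\beta = s\delta$, $s \in \mathbb{R}$)
or $\delta = 0$. Grouping these results together we see that multiplication is commutative only
if either
(i) $V(p) = kV(q)$, $k \in \mathbb{R}$, that is, the vector parts are parallel.
(ii) either $p$ or $q$ has the form $\begin{bmatrix} k \\ 0 \end{bmatrix}$, $k \in \mathbb{R}$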
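The norm $N_p$ of $p$ is defined in the usual way:
\[
N_p = p \circ p^c
\]
Using our definition of multiplication it is easily observed that
\[
N_p = p \circ p^c = \begin{bmatrix} \alpha\bar{\alpha} + \beta\bar{\beta} \\ 0 \end{bmatrix} \in \mathbb{R}^{+}
\]
The conjugate (denoted by 'bar' for individual elements) needs to be interpreted
appropriate to the type of number to which it refers. We can now define an inverse
element $p^{-1}$ to every element $p$ with non-zero norm viz:
\[
\text{if } p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \text{ then } p^{-1} = \frac{1}{N_p}p^c = \frac{1}{N_p}\begin{bmatrix} \bar{\alpha} \\ -\beta \end{bmatrix}
\]
This satisfies
\[
p^{-1} \circ p = p \circ p^{-1} = E
\]
The reader can easily verify, that the scalar numbers have the form:
\[
\begin{bmatrix} a \\ 0 \end{bmatrix} \qquad \text{where } a \in \mathbb{R}
\]
Without any loss whatsoever such a number can simply be replaced by $a$ in any algebraic
expression.
The complex numbers are obtained when:
\[
\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{where } \alpha, \beta \in \mathbb{R}
\]
Also, as is easily verified, objects of the form
\[
\begin{bmatrix} \alpha \\ 0 \end{bmatrix} \qquad \alpha \in \mathbb{C}
\]
are isomorphic to the complex numbers. As we have stated above the quaternions are
obtained if:
\[
p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{where } \alpha, \beta \in \mathbb{C}
\]
To verify this we must exhibit an isomorphic map from these objects onto the quaternions.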
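Such a map is
\[
\phi : \mathbb{C} \times \mathbb{C} \mapsto \mathbb{H} \qquad \phi\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix}\right\} \leftrightarrow a_0 + a_1i + a_2j + a_3k
\]
in which $\alpha = a_0 + a_3\sqrt{-1}$, $\beta = a_2 + a_1\sqrt{-1}$. This map is a homomorphism since if
$\gamma = b_0 + b_3\sqrt{-1}$, $\delta = b_2 + b_1\sqrt{-1}$ then
\[
\begin{aligned}
\phi\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \oplus \begin{bmatrix} \gamma \\ \delta \end{bmatrix}\right\} &= \phi\left\{\begin{bmatrix} \alpha + \gamma \\ \beta + \delta \end{bmatrix}\right\} = \phi\left\{\begin{bmatrix} a_0 + b_0 + (a_3 + b_3)\sqrt{-1} \\ a_2 + b_2 + (a_1 + b_1)\sqrt{-1} \end{bmatrix}\right\} \\
&= a_0 + b_0 + (a_1 + b_1)i + (a_2 + b_2)j + (a_3 + b_3)k \\
&= \phi\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix}\right\} + \phi\left\{\begin{bmatrix} \gamma \\ \delta \end{bmatrix}\right\}
\end{aligned}
\]
and
\[
\begin{aligned}
\phi\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \gamma \\ \delta \end{bmatrix}\right\} &= \phi\left\{\begin{bmatrix} a_0b_0 - a_3b_3 - a_2b_2 - a_1b_1 + \sqrt{-1}(a_0b_3 + a_3b_0 - a_2b_1 + b_2a_1) \\ a_2b_0 - a_1b_3 + a_0b_2 + a_3b_1 + \sqrt{-1}(a_1b_0 + a_2b_3 + a_0b_1 - a_3b_2) \end{bmatrix}\right\} \\
&= a_0b_0 - a_3b_3 - a_2b_2 - a_1b_1 + (a_1b_0 + a_2b_3 + a_0b_1 - a_3b_2)i \\
&\quad + (a_2b_0 - a_1b_3 + a_0b_2 + a_3b_1)j + (a_0b_3 + a_3b_0 - a_2b_1 + b_2a_1)k \\
&= \phi\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix}\right\}\phi\left\{\begin{bmatrix} \gamma \\ \delta \end{bmatrix}\right\}
\end{aligned}
\]
The map is clearly one-to-one and onto and so this is an isomorphism.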
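The homomorphism property can be spot-checked numerically. This is a minimal sketch using Python's built-in complex numbers for the pair entries, with the multiplication rule $[\alpha;\beta] \circ [\gamma;\delta] = [\alpha\gamma - \delta\bar{\beta};\ \gamma\beta + \bar{\alpha}\delta]$ and the map $\alpha = a_0 + a_3\sqrt{-1}$, $\beta = a_2 + a_1\sqrt{-1}$; the function names are illustrative.

```python
def pair_mul(p, q):
    """[a; b] o [c; d] = [a c - d conj(b); c b + conj(a) d]."""
    (a, b), (c, d) = p, q
    return (a * c - d * b.conjugate(), c * b + a.conjugate() * d)

def phi(p):
    """Map the pair [alpha; beta] to quaternion components (a0, a1, a2, a3)."""
    alpha, beta = p
    return (alpha.real, beta.imag, beta.real, alpha.imag)

def hamilton(x, y):
    """Quaternion product of (a0, a1, a2, a3) tuples."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

p = (1 + 2j, 3 + 4j)        # a0 = 1, a3 = 2, a2 = 3, a1 = 4
q = (5 - 1j, 0.5 + 2j)      # b0 = 5, b3 = -1, b2 = 0.5, b1 = 2
lhs = phi(pair_mul(p, q))   # multiply as pairs, then map
rhs = hamilton(phi(p), phi(q))   # map, then multiply as quaternions
```

The two results `lhs` and `rhs` agree, and the order of the factors matters exactly as it does for quaternions.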
4.2 Cayley Numbers
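The obvious question that we can now ask concerns the possible existence of other numbers
(other than scalars, complex and quaternion numbers) which can be expressed in the form
described here. The next class of number that we might consider is
\[
X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{where } \alpha, \beta \in \mathbb{H}
\]
These new numbers are called Cayley numbers, denoted by $\mathbb{K}$ and we now explore some
of their properties. The definitions of conjugate, scalar and vector parts, and norm are as
defined above:
\[
X^c = \begin{bmatrix} \bar{\alpha} \\ -\beta \end{bmatrix} \qquad S(X) = \tfrac{1}{2}(X + X^c) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \qquad V(X) = \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix} \qquad N_X = X \circ X^c
\]
Note that $X = S(X) + V(X)$ and so $S(X) \in \mathbb{R}$ and $N_X \in \mathbb{R}^{+}$.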
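One can now deduce that numbers defined with these basic operations satisfy
\[
S(X \circ Y) = S(Y \circ X) \qquad (X \circ Y)^c = Y^c \circ X^c \quad\text{and}\quad N_{X \circ Y} = N_X \circ N_Y
\]
The first is relatively easy to verify and depends directly on the corresponding property for
quaternions $S(pq) = S(qp)$. Verification of the second identity is also straightforward; (here
we must be careful with the positional order of elements as we know that quaternion
multiplication is not commutative).
\[
Y^c \circ X^c = \begin{bmatrix} \bar{\gamma} \\ -\delta \end{bmatrix} \circ \begin{bmatrix} \bar{\alpha} \\ -\beta \end{bmatrix} = \begin{bmatrix} \bar{\gamma}\bar{\alpha} - \beta\bar{\delta} \\ -\bar{\alpha}\delta - \gamma\beta \end{bmatrix} = (X \circ Y)^c
\]
The third property, involving the norm, $N_{X \circ Y} = N_X \circ N_Y$, is quite remarkable and we
also verify it here. Using $X, Y$ as above then
\[
N_X \circ N_Y = \begin{bmatrix} \alpha\bar{\alpha} + \beta\bar{\beta} \\ 0 \end{bmatrix} \circ \begin{bmatrix} \gamma\bar{\gamma} + \delta\bar{\delta} \\ 0 \end{bmatrix} = \begin{bmatrix} (\alpha\bar{\alpha} + \beta\bar{\beta})(\gamma\bar{\gamma} + \delta\bar{\delta}) \\ 0 \end{bmatrix}
\]
Now
\[
X \circ Y = \begin{bmatrix} \alpha\gamma - \delta\bar{\beta} \\ \gamma\beta + \bar{\alpha}\delta \end{bmatrix}
\]
Thus (using the property of the quaternion conjugate, $\overline{pq} = \bar{q}\bar{p}$)
\[
\begin{aligned}
N_{X \circ Y} &= \begin{bmatrix} \alpha\gamma - \delta\bar{\beta} \\ \gamma\beta + \bar{\alpha}\delta \end{bmatrix} \circ \begin{bmatrix} \bar{\gamma}\bar{\alpha} - \beta\bar{\delta} \\ -\gamma\beta - \bar{\alpha}\delta \end{bmatrix} \\
&= \begin{bmatrix} (\alpha\gamma - \delta\bar{\beta})(\bar{\gamma}\bar{\alpha} - \beta\bar{\delta}) + (\gamma\beta + \bar{\alpha}\delta)(\bar{\beta}\bar{\gamma} + \bar{\delta}\alpha) \\ (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})(\gamma\beta + \bar{\alpha}\delta) - (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})(\gamma\beta + \bar{\alpha}\delta) \end{bmatrix} \\
&= \begin{bmatrix} \alpha\bar{\alpha}\gamma\bar{\gamma} + \delta\bar{\delta}\beta\bar{\beta} + \gamma\bar{\gamma}\beta\bar{\beta} + \alpha\bar{\alpha}\delta\bar{\delta} - \delta\bar{\beta}\bar{\gamma}\bar{\alpha} - \alpha\gamma\beta\bar{\delta} + \gamma\beta\bar{\delta}\alpha + \bar{\alpha}\delta\bar{\beta}\bar{\gamma} \\ 0 \end{bmatrix} \\
&= \begin{bmatrix} \alpha\bar{\alpha}\gamma\bar{\gamma} + \delta\bar{\delta}\beta\bar{\beta} + \gamma\bar{\gamma}\beta\bar{\beta} + \alpha\bar{\alpha}\delta\bar{\delta} \\ 0 \end{bmatrix} \\
&= N_X \circ N_Y
\end{aligned}
\]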
This result depends upon showing that for any four quaternions a, /?, 7,6:
—6 (3*/a — cry (3 6 + j(36a + a<$/?7
vanishes. (Of course for complex numbers it is trivially satisfied). To show that this is
zero we note that it can be expressed in the form:
—pa - ap + pa + ap = —2S(pa) + 2S(ap) = 0
where we have taken p = 6J3*y and we have employed properties of the conjugate applied
to products of quaternions and the identity S(pq) = S(qp) for any two quaternions.
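The norm identity can be spot-checked numerically. The following is a minimal sketch, assuming quaternions are stored as integer 4-tuples (w, x, y, z) and a Cayley number as a pair of such tuples; the helper names (`qmul`, `cmul`, `cconj`, `norm`) are illustrative and do not come from the text. The product rule coded in `cmul` is the one used throughout this section.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qsub(p, q):
    return tuple(x - y for x, y in zip(p, q))

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d], with ~ the quaternion conjugate.
    (a, b), (c, d) = X, Y
    return (qsub(qmul(a, c), qmul(d, qconj(b))),
            qadd(qmul(c, b), qmul(qconj(a), d)))

def cconj(X):
    # Cayley conjugate [a, b]^c = [a~, -b].
    return (qconj(X[0]), tuple(-v for v in X[1]))

def norm(X):
    # Sum of squares of the eight real components.
    return sum(v * v for part in X for v in part)

# Arbitrary integer test data (an assumption made for illustration).
X = ((1, 2, -1, 3), (0, 1, 4, -2))
Y = ((2, -3, 1, 1), (5, 0, -1, 2))
XY = cmul(X, Y)
print(norm(XY), norm(X) * norm(Y))
```

Integer components keep the arithmetic exact, so the two printed values can be compared without any tolerance.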
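Cayley numbers do not commute in general. To see when commutativity occurs consider:
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \]
then
\[ X \circ Y = \begin{bmatrix} \alpha\gamma - \delta\bar\beta \\ \gamma\beta + \bar\alpha\delta \end{bmatrix} \qquad Y \circ X = \begin{bmatrix} \gamma\alpha - \beta\bar\delta \\ \alpha\delta + \bar\gamma\beta \end{bmatrix} \]
and so, for commuting products, we require:
\[ \alpha\gamma - \delta\bar\beta = \gamma\alpha - \beta\bar\delta \qquad \gamma\beta + \bar\alpha\delta = \alpha\delta + \bar\gamma\beta \]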
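The second of these equations demands $V(\gamma)\beta = V(\alpha)\delta$. We first consider the case that
$V(\gamma) \neq 0$; then
\[ \beta = -\frac{1}{N_{V(\gamma)}} V(\gamma)V(\alpha)\delta \qquad \text{and} \qquad \bar\beta = -\frac{1}{N_{V(\gamma)}} \bar\delta\, V(\alpha)V(\gamma) \]
Therefore, since the quaternion product is associative,
\[ \delta\bar\beta = -\frac{N_\delta}{N_{V(\gamma)}} V(\alpha)V(\gamma) \]
and so
\[ \alpha\gamma - \gamma\alpha = \delta\bar\beta - \beta\bar\delta = -\frac{N_\delta}{N_{V(\gamma)}} \left[ V(\alpha)V(\gamma) - V(\gamma)V(\alpha) \right] \]
That is, since $\alpha\gamma - \gamma\alpha = V(\alpha)V(\gamma) - V(\gamma)V(\alpha)$,
\[ \left[ V(\alpha)V(\gamma) - V(\gamma)V(\alpha) \right] \left[ 1 + \frac{N_\delta}{N_{V(\gamma)}} \right] = 0 \]
from which we deduce $[V(\alpha)V(\gamma) - V(\gamma)V(\alpha)] = 0$, which is only true if $V(\alpha)$ and $V(\gamma)$
are parallel. That is:
\[ V(\gamma) = kV(\alpha) \qquad k \in \mathbb{R} \]
then
\[ \beta = -k\frac{1}{N_{V(\gamma)}} V(\alpha)V(\alpha)\delta = k\frac{N_{V(\alpha)}}{N_{V(\gamma)}}\delta \]
But $N_{V(\gamma)} = k^2 N_{V(\alpha)}$ and therefore $\beta = \delta/k$. So $X$, $Y$ commute only if $V(\gamma) = kV(\alpha)$ and
$\delta = k\beta$, which is equivalent to the single statement $V(Y) = kV(X)$.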
The other case from that described above occurs when $V(\gamma) = 0$. Here we immediately
deduce that $\gamma$ is real and $V(\alpha)\delta = 0$. From the second condition either (i) $V(\alpha) = 0$,
implying $\alpha$ is a real number, or (ii) $\delta = 0$. Taking (i) first: if both $\gamma$ and $\alpha$ are real
numbers we deduce that $\delta\bar\beta$ is also a real number. This implies that $\delta = k\beta$, $k \in \mathbb{R}$, which
again leads to the constraint that $V(Y)$ is proportional to $V(X)$. If the statement in (ii)
holds true then $Y$ is a real number which of course commutes with all Cayley numbers.
From these considerations we conclude that Cayley numbers commute if and only if their
vector parts are proportional (that is, their vector parts are 'parallel'; see later for the
justification for using the word parallel in conjunction with Cayley numbers).
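The commutation criterion can be illustrated with a pair whose vector parts are proportional and a pair whose vector parts are not. This is a minimal sketch under the same assumed tuple representation (quaternion as (w, x, y, z), Cayley number as a pair of quaternions); the helper names and the sample values are illustrative only.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

X = ((1, 2, -1, 3), (0, 1, 4, -2))
# Y has scalar part 5 and vector part 3 * V(X), so V(Y) = 3 V(X).
Y = ((5, 6, -3, 9), (0, 3, 12, -6))
# Z is the pure unit [j, 0]; its vector part is not proportional to V(X).
Z = ((0, 0, 1, 0), (0, 0, 0, 0))

commute_XY = cmul(X, Y) == cmul(Y, X)
commute_XZ = cmul(X, Z) == cmul(Z, X)
print(commute_XY, commute_XZ)
```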
Continuing with this theme we could also ask what Cayley number, if any, commutes
with every other Cayley number? If Y is such a number then for every X we demand
$X \circ Y = Y \circ X$. From the work above we require for every $\alpha, \beta$
\[ \alpha\gamma - \gamma\alpha = 2V(\delta\bar\beta) \qquad V(\gamma)\beta = V(\alpha)\delta \]
For the second equation of this set to be satisfied for all $\alpha, \beta$ we deduce that $V(\gamma) = 0$ and
$\delta = 0$. The first equation is then satisfied identically since $\gamma$ is a real number. Therefore
we conclude that the only Cayley number which commutes with every Cayley number is
of the form:
\[ Y = \begin{bmatrix} \gamma \\ 0 \end{bmatrix} \qquad \gamma \in \mathbb{R} \]
That is, $Y$ is a real number.
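In the transition from complex to quaternion numbers commutativity of multiplication was
lost. For Cayley numbers the associative law of multiplication is lost. For associativity,
as we outlined above for quaternion numbers (considered as an ordered pair of complex
numbers), we require, for a third Cayley number with quaternion components $a$, $b$, that
$X \circ (Y \circ [a, b]^T) = (X \circ Y) \circ [a, b]^T$, that is:
\[ \begin{bmatrix} \alpha(\gamma a - b\bar\delta) - (a\delta + \bar\gamma b)\bar\beta \\ (\gamma a - b\bar\delta)\beta + \bar\alpha(a\delta + \bar\gamma b) \end{bmatrix} = \begin{bmatrix} (\alpha\gamma - \delta\bar\beta)a - b(\bar\beta\bar\gamma + \bar\delta\alpha) \\ a(\gamma\beta + \bar\alpha\delta) + (\bar\gamma\bar\alpha - \beta\bar\delta)b \end{bmatrix} \]
which necessarily implies (amongst other constraints):
\[ \alpha(b\bar\delta) = (b\bar\delta)\alpha \]
which is generally false if the elements are quaternions. The associative rule for Cayley
numbers fails because the quaternion product is not commutative. However, we consider
below some special cases for which the associative rule is valid.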
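A concrete triple for which the two bracketings differ makes the failure of associativity visible. The sketch below assumes the tuple representation used for illustration (quaternion as (w, x, y, z), Cayley number as a pair); the chosen triple $[i, 0]$, $[j, 0]$, $[0, 1]$ is one convenient example and is not taken from the text.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

ZERO = (0, 0, 0, 0)
X = ((0, 1, 0, 0), ZERO)          # [i, 0]
Y = ((0, 0, 1, 0), ZERO)          # [j, 0]
Z = (ZERO, (1, 0, 0, 0))          # [0, 1]

left = cmul(cmul(X, Y), Z)        # (X o Y) o Z
right = cmul(X, cmul(Y, Z))       # X o (Y o Z)
print(left, right)
```

For this triple the two results are negatives of one another: the first is $[0, -k]$ and the second is $[0, k]$.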
We can still construct an inverse. If X is a Cayley number, with non-zero norm, then we
define:
\[ X^{-1} = \frac{1}{N_X} X^c \]
This satisfies:
\[ X^{-1} \circ X = \frac{1}{N_X} X^c \circ X = E = X \circ \left( \frac{1}{N_X} X^c \right) = X \circ X^{-1} \]
Now although Cayley algebra is not associative the following theorem shows that in certain
special cases associativity holds.
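Theorem 4.1 If $X$, $Y$ are Cayley numbers then $(Y \circ X^{-1}) \circ X = Y$ and $X^{-1} \circ (X \circ Y) = Y$.
Proof Let
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \qquad \text{then} \qquad X^{-1} = \frac{1}{N_X} \begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix} \]
\[ Y \circ X^{-1} = \frac{1}{N_X} \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \circ \begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix} = \frac{1}{N_X} \begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix} \]
and
\begin{align*}
(Y \circ X^{-1}) \circ X &= \frac{1}{N_X} \begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix} \circ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \\
&= \frac{1}{N_X} \begin{bmatrix} (\gamma\bar\alpha + \beta\bar\delta)\alpha - \beta(\bar\delta\alpha - \bar\beta\gamma) \\ \alpha(\bar\alpha\delta - \bar\gamma\beta) + (\alpha\bar\gamma + \delta\bar\beta)\beta \end{bmatrix} \\
&= \frac{1}{N_X} \begin{bmatrix} \gamma(\alpha\bar\alpha + \beta\bar\beta) \\ \delta(\alpha\bar\alpha + \beta\bar\beta) \end{bmatrix} = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = Y
\end{align*}
The proof that $X^{-1} \circ (X \circ Y) = Y$ is almost identical and is therefore omitted. Generally
$X \circ (Y \circ X^{-1}) \neq Y$.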
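Theorem 4.1 and the closing remark can be checked with exact rational arithmetic. This is a minimal sketch, assuming the illustrative tuple representation (quaternion as (w, x, y, z), Cayley number as a pair) and hypothetical helpers `scale` and `cinv`; the sample values are arbitrary.

```python
from fractions import Fraction

def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def cconj(X):
    return (qconj(X[0]), tuple(-v for v in X[1]))

def norm(X):
    return sum(v * v for part in X for v in part)

def scale(s, X):
    return tuple(tuple(s * v for v in part) for part in X)

def cinv(X):
    # Inverse X^{-1} = X^c / N_X, computed exactly with Fraction.
    return scale(1 / Fraction(norm(X)), cconj(X))

X = ((1, 2, -1, 3), (0, 1, 4, -2))
Y = ((2, -3, 1, 1), (5, 0, -1, 2))

right_cancel = cmul(cmul(Y, cinv(X)), X)   # (Y o X^-1) o X
left_cancel = cmul(cinv(X), cmul(X, Y))    # X^-1 o (X o Y)
sandwich = cmul(X, cmul(Y, cinv(X)))       # X o (Y o X^-1)
print(right_cancel == Y, left_cancel == Y, sandwich == Y)
```

The third quantity differs from $Y$ here because the chosen $X$ and $Y$ do not commute.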
Inner Product for Cayley Numbers
Under the operation of addition $\oplus$ the elements of $\mathbb{K}$ form an abelian group and then, with
the rule for multiplying such an element by a scalar $s \in \mathbb{R}$, the Cayley numbers constitute
a linear space. We can give an inner product structure to this space if we define:
\[ \langle X, Y \rangle = S(X \circ Y^c) \]
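Then, considering the four basic axioms for a valid inner product:
IP1  $\langle X, Y \rangle = S(X \circ Y^c) = S[(X \circ Y^c)^c] = S(Y \circ X^c) = \langle Y, X \rangle$
IP2  $\langle X, Y + Z \rangle = S(X \circ (Y + Z)^c) = S(X \circ Y^c + X \circ Z^c) = \langle X, Y \rangle + \langle X, Z \rangle$
IP3  $a\langle X, Y \rangle = \langle aX, Y \rangle = \langle X, aY \rangle$, $a \in \mathbb{R}$
IP4  $\langle X, X \rangle = S(X \circ X^c) = N_X \ge 0$, only vanishing if $X = 0$.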
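Thus $\langle X, Y \rangle = S(X \circ Y^c)$ is a suitable inner product. In fact if
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \alpha = a_0 + a_1 i + a_2 j + a_3 k = a_0 + \mathbf{a}_0 \qquad \beta = a_4 + a_5 i + a_6 j + a_7 k = a_4 + \mathbf{a}_4 \]
with a similar prescription for $Y$:
\[ Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \qquad \gamma = b_0 + b_1 i + b_2 j + b_3 k = b_0 + \mathbf{b}_0 \qquad \delta = b_4 + b_5 i + b_6 j + b_7 k = b_4 + \mathbf{b}_4 \]
then
\[ S(X \circ Y^c) = S\left( \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \bar\gamma \\ -\delta \end{bmatrix} \right) = \frac{1}{2} \begin{bmatrix} \alpha\bar\gamma + \delta\bar\beta + \gamma\bar\alpha + \beta\bar\delta \\ 0 \end{bmatrix} = a_0 b_0 + \mathbf{a}_0 \cdot \mathbf{b}_0 + a_4 b_4 + \mathbf{a}_4 \cdot \mathbf{b}_4 \]
which is the standard inner product on $\mathbb{R}^8$. The Schwarz inequality for any inner product
states:
\[ \langle X, X \rangle \langle Y, Y \rangle \ge \langle X, Y \rangle^2 \]
that is, in this case:
\[ N_X N_Y \ge S^2(X \circ Y^c) \]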
Using this we can deduce a 'triangle inequality' for Cayley numbers. To deduce this we
construct the norm of a sum of Cayley numbers.
\begin{align*}
N_{X+Y} &= (X + Y) \circ (X^c + Y^c) \\
&= X \circ X^c + X \circ Y^c + Y \circ X^c + Y \circ Y^c \\
&= N_X + 2S(X \circ Y^c) + N_Y \\
&\le N_X + 2\sqrt{N_X N_Y} + N_Y \\
&= N_X + 2\sqrt{N_X}\sqrt{N_Y} + N_Y
\end{align*}
That is,
\[ \sqrt{N_{X+Y}} \le \sqrt{N_X} + \sqrt{N_Y} \]
which is the triangle inequality for Cayley numbers.
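The identification of $S(X \circ Y^c)$ with the standard inner product on $\mathbb{R}^8$, together with the two inequalities, can be checked on sample data. This is a minimal sketch assuming the illustrative tuple representation; `inner` and `dot8` are hypothetical helper names.

```python
import math

def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def cconj(X):
    return (qconj(X[0]), tuple(-v for v in X[1]))

def norm(X):
    return sum(v * v for part in X for v in part)

def inner(X, Y):
    # <X, Y> = S(X o Y^c): the scalar part is the first real component.
    return cmul(X, cconj(Y))[0][0]

def dot8(X, Y):
    # Standard inner product on R^8 applied to the eight components.
    return sum(u * v for px, py in zip(X, Y) for u, v in zip(px, py))

X = ((1, 2, -1, 3), (0, 1, 4, -2))
Y = ((2, -3, 1, 1), (5, 0, -1, 2))
X_plus_Y = tuple(tuple(u + v for u, v in zip(px, py)) for px, py in zip(X, Y))

print(inner(X, Y), dot8(X, Y))
print(math.sqrt(norm(X_plus_Y)), math.sqrt(norm(X)) + math.sqrt(norm(Y)))
```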
4.3 Angles and Cayley Numbers
We can associate an angle with a Cayley number. If X is a Cayley number then clearly
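\[ X = S(X) + V(X) \]
$X$ is said to be a pure Cayley number if $S(X) = 0$. Let $x$ be a pure Cayley number with
unit norm and $N_{V(X)}$ be the norm of $V(X)$.
\[ N_{V(X)} = V(X) \circ V^c(X) = -V(X) \circ V(X) \qquad x = \frac{1}{\sqrt{N_{V(X)}}} V(X) \]
then
\[ X = \sqrt{N_X} \left( \frac{1}{\sqrt{N_X}} S(X) + \frac{1}{\sqrt{N_X}} V(X) \right) = \sqrt{N_X} \left( \frac{1}{\sqrt{N_X}} S(X) + \frac{\sqrt{N_{V(X)}}}{\sqrt{N_X}}\, x \right) = \sqrt{N_X} (\cos\theta + x \sin\theta) \]
where $\cos\theta = \dfrac{S(X)}{\sqrt{N_X}}$ and $\sin\theta = \dfrac{\sqrt{N_{V(X)}}}{\sqrt{N_X}}$. This representation is valid as long as
\[ S^2(X) + N_{V(X)} = N_X \]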
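To check this let
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{then} \qquad X^c = \begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix} \qquad \text{and} \qquad V(X) = \tfrac{1}{2}(X - X^c) = \tfrac{1}{2} \begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix} \]
\begin{align*}
V(X) \circ V(X) &= \tfrac{1}{4} \begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix} \circ \begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix} = \tfrac{1}{4} \begin{bmatrix} (\alpha - \bar\alpha)^2 - 4\beta\bar\beta \\ 0 \end{bmatrix} \\
&= \tfrac{1}{4} \begin{bmatrix} \alpha^2 - 2\alpha\bar\alpha + \bar\alpha^2 - 4\beta\bar\beta \\ 0 \end{bmatrix} = \tfrac{1}{4} \begin{bmatrix} (\alpha + \bar\alpha)^2 - 4(\alpha\bar\alpha + \beta\bar\beta) \\ 0 \end{bmatrix} = \begin{bmatrix} S^2(\alpha) - N_X \\ 0 \end{bmatrix}
\end{align*}
\[ N_{V(X)} = -V(X) \circ V(X) = \begin{bmatrix} N_X - S^2(\alpha) \\ 0 \end{bmatrix} \]
But
\[ S(X) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \qquad \therefore \quad S^2(X) = \begin{bmatrix} S^2(\alpha) \\ 0 \end{bmatrix} \]
\[ \therefore \quad S^2(X) + N_{V(X)} = \begin{bmatrix} S^2(\alpha) + N_X - S^2(\alpha) \\ 0 \end{bmatrix} = \begin{bmatrix} N_X \\ 0 \end{bmatrix} = N_X \]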
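which validates the formulation $X = \sqrt{N_X} (\cos\theta + x \sin\theta)$. Note the important result for
any unit-norm pure Cayley number
\[ x \circ x = \frac{1}{\sqrt{N_{V(X)}}} V(X) \circ \frac{1}{\sqrt{N_{V(X)}}} V(X) = \frac{1}{N_{V(X)}} V(X) \circ V(X) = \frac{1}{N_{V(X)}} \begin{bmatrix} S^2(\alpha) - N_X \\ 0 \end{bmatrix} = \frac{1}{N_{V(X)}} \begin{bmatrix} -N_{V(X)} \\ 0 \end{bmatrix} = -\begin{bmatrix} 1 \\ 0 \end{bmatrix} \]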
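The polar form rests on two facts: $V(X) \circ V(X) = -N_{V(X)}$ and $S^2(X) + N_{V(X)} = N_X$. The sketch below checks both exactly and then recovers the angle $\theta$ in floating point. The tuple representation, helper names and sample value of $X$ are assumptions made for illustration.

```python
import math

def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def norm(X):
    return sum(v * v for part in X for v in part)

X = ((2, 1, -2, 0), (1, 0, 3, -1))
S = X[0][0]                          # scalar part S(X)
V = ((0,) + X[0][1:], X[1])          # vector part V(X)
VV = cmul(V, V)                      # expected to equal -N_V as a real Cayley number

# Angle of the polar form X = sqrt(N_X) (cos(theta) + x sin(theta)).
theta = math.atan2(math.sqrt(norm(V)), S)
print(VV, S * S + norm(V), norm(X), theta)
```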
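This result can be extended to show that $X$ is a pure Cayley number if and only if $X^2$ is
a non-positive real number. To see this let $X$ be pure. That is $S(X) = 0$. If
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \text{then} \qquad S(X) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \qquad \therefore \quad S(\alpha) = 0 \text{ if } X \text{ is pure} \]
Therefore
\[ X \circ X = V(X) \circ V(X) = \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix} \circ \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix} = \begin{bmatrix} -V(\alpha) \cdot V(\alpha) - \beta\bar\beta \\ 0 \end{bmatrix} \]
Conversely, suppose
\[ X^2 = \begin{bmatrix} -k \\ 0 \end{bmatrix} \qquad k \in \mathbb{R}, \; k \ge 0 \]
In general
\[ X^2 = \begin{bmatrix} \alpha^2 - \beta\bar\beta \\ \alpha\beta + \bar\alpha\beta \end{bmatrix} \]
If $X^2 \in \mathbb{R}$ then $\alpha\beta + \bar\alpha\beta = 0$, i.e. $S(\alpha)\beta = 0$. Thus either $S(\alpha) = 0$ or $\beta = 0$. If $S(\alpha) = 0$
then $\alpha^2 - \beta\bar\beta = V(\alpha)V(\alpha) - \beta\bar\beta = -N_{V(\alpha)} - N_\beta$ and so $X$ is pure and $X^2$ is a negative
real number as required. If $\beta = 0$ then
\[ X^2 = \begin{bmatrix} \alpha^2 \\ 0 \end{bmatrix} \]
If this is to be real then (using results deduced earlier for quaternions), $V(\alpha) = 0$ or
$S(\alpha) = 0$. In the first case $X$ is not pure and in the second $X^2$ is a negative real number.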
Angle subtended by two Cayley numbers
We can also define an angle $\lambda$ between two Cayley numbers $X$, $Y$ to be such that:
\[ \cos\lambda = \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} \]
For this to be a meaningful definition of an angle we must show that
\[ -1 \le \cos\lambda \le 1 \]
This follows since, from the Schwarz inequality, $\sqrt{N_X} \ge |S(X)|$. Therefore $\sqrt{N_X}\sqrt{N_Y} =
\sqrt{N_{X \circ Y^c}} \ge |S(X \circ Y^c)|$ and the required result follows. Using this we can easily deduce
that the angle of a Cayley number is the angle subtended by $S(X)$ and $X$:
\[ \cos\lambda = \frac{S(S(X) \circ X^c)}{\sqrt{N_{S(X)}}\sqrt{N_X}} = \frac{(S(X))^2}{S(X)\sqrt{N_X}} = \frac{S(X)}{\sqrt{N_X}} = \cos\theta \]
Note also that if
\[ X = \sqrt{N_X} (\cos\theta + x \sin\theta) \qquad Y = \sqrt{N_Y} (\cos\phi + y \sin\phi) \]
where $x$, $y$ are unit norm pure Cayley numbers, then the angle $\lambda$ between $X$, $Y$ is such
that
\begin{align*}
\cos\lambda &= \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} \\
&= S\left[ (\cos\theta + x \sin\theta) \circ (\cos\phi - y \sin\phi) \right] \\
&= \cos\theta \cos\phi - \sin\theta \sin\phi \, S(x \circ y)
\end{align*}
But if $\delta$ is the angle between $x$ and $y$ then $\cos\delta = S(x \circ y^c) = -S(x \circ y)$. Therefore
\[ \cos\lambda = \cos\theta \cos\phi + \sin\theta \sin\phi \cos\delta \]
With the introduction of angle we can introduce some standard geometrical terminology.
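We say $X$, $Y$ are perpendicular if $S(X \circ Y^c) = 0$ and are parallel if $V(X \circ Y^c) = 0$. The
scalar and vector parts of a Cayley number are perpendicular since:
\begin{align*}
S\left( S(X) \circ (V(X))^c \right) &= \tfrac{1}{4} S\left( (X^c + X) \circ (X^c - X) \right) \\
&= \tfrac{1}{4} S(X^c \circ X^c - X \circ X) \qquad \text{since } S(X \circ X^c) = S(X^c \circ X) \\
&= \tfrac{1}{8} \left( X^c \circ X^c - X \circ X + [X^c \circ X^c - X \circ X]^c \right) \\
&= \tfrac{1}{8} \left( X^c \circ X^c - X \circ X + X \circ X - X^c \circ X^c \right) = 0
\end{align*}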
Theorem 4.1 showed that $X^{-1} \circ (X \circ Y) = (Y \circ X^{-1}) \circ X = Y$. The following
theorem specifically relates to pure Cayley numbers and utilises the idea of parallel and
perpendicular Cayley numbers.
Theorem 4.2 If $h$ and $p$ are pure Cayley and parallel then
\[ p \circ (h \circ p^{-1}) = h \]
whilst if $h$ and $p$ are pure Cayley and perpendicular then
\[ p \circ (h \circ p^{-1}) = -h \]
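Proof Let
\[ p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad h = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \qquad p^{-1} = \frac{1}{N_p} \begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix} \]
then
\[ h \circ p^{-1} = \frac{1}{N_p} \begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix} \]
\[ \therefore \quad p \circ (h \circ p^{-1}) = \frac{1}{N_p} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix} = \frac{1}{N_p} \begin{bmatrix} \alpha(\gamma\bar\alpha + \beta\bar\delta) - (\bar\alpha\delta - \bar\gamma\beta)\bar\beta \\ (\gamma\bar\alpha + \beta\bar\delta)\beta + \bar\alpha(\bar\alpha\delta - \bar\gamma\beta) \end{bmatrix} \]
But $p$, $h$ are parallel (implying $V(p \circ h) = 0$) and so $p \circ h = h \circ p$ (using $p^c = -p$, $h^c = -h$
since both are pure Cayley). Thus, as $p$, $h$ are pure Cayley numbers we deduce $\bar\alpha = -\alpha$
and $\bar\gamma = -\gamma$ and, from the parallel constraint
\[ \alpha\gamma - \delta\bar\beta = \gamma\alpha - \beta\bar\delta = -\gamma\bar\alpha - \beta\bar\delta \]
\[ \gamma\beta + \bar\alpha\delta = \alpha\delta + \bar\gamma\beta = -\bar\alpha\delta + \bar\gamma\beta \]
Therefore
\[ p \circ (h \circ p^{-1}) = \frac{1}{N_p} \begin{bmatrix} \alpha(\delta\bar\beta - \alpha\gamma) + (\gamma\beta + \bar\alpha\delta)\bar\beta \\ (\delta\bar\beta - \alpha\gamma)\beta + \bar\alpha(-\gamma\beta - \bar\alpha\delta) \end{bmatrix} = \frac{1}{N_p} \begin{bmatrix} (\alpha\bar\alpha + \beta\bar\beta)\gamma \\ \delta(\beta\bar\beta + \alpha\bar\alpha) \end{bmatrix} = h \]
The proof of the second relation is similar. Choosing $p$ and $h$ as above then the
perpendicularity of $p$ and $h$ implies
\[ h \circ p = -p \circ h \]
which in turn implies
\[ \alpha\gamma - \delta\bar\beta = -(\gamma\alpha - \beta\bar\delta) \qquad \gamma\beta + \bar\alpha\delta = -(\alpha\delta + \bar\gamma\beta) \]
(or their conjugates). Also since $p^c = -p$ then $\bar\alpha = -\alpha$ and since $h^c = -h$ then $\bar\gamma = -\gamma$.
\begin{align*}
\therefore \quad p \circ (h \circ p^{-1}) &= \frac{1}{N_p} \begin{bmatrix} \alpha(\gamma\bar\alpha + \beta\bar\delta) - (\bar\alpha\delta - \bar\gamma\beta)\bar\beta \\ (\gamma\bar\alpha + \beta\bar\delta)\beta + \bar\alpha(\bar\alpha\delta - \bar\gamma\beta) \end{bmatrix} \\
&= \frac{1}{N_p} \begin{bmatrix} \alpha(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar\beta \\ (\alpha\gamma - \delta\bar\beta)\beta + \bar\alpha(\gamma\beta + \bar\alpha\delta) \end{bmatrix} \\
&= \frac{1}{N_p} \begin{bmatrix} -(\alpha\bar\alpha + \beta\bar\beta)\gamma \\ -(\beta\bar\beta + \alpha\bar\alpha)\delta \end{bmatrix} = -h
\end{align*}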
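Both cases of Theorem 4.2 can be checked with exact rational arithmetic. This is a minimal sketch assuming the illustrative tuple representation; the helper names and the particular pure Cayley numbers are hypothetical, chosen so that one pair is parallel and the other perpendicular.

```python
from fractions import Fraction

def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def cconj(X):
    return (qconj(X[0]), tuple(-v for v in X[1]))

def norm(X):
    return sum(v * v for part in X for v in part)

def scale(s, X):
    return tuple(tuple(s * v for v in part) for part in X)

def cinv(X):
    return scale(1 / Fraction(norm(X)), cconj(X))

def conjugate_by(p, h):
    # The map h -> p o (h o p^{-1}).
    return cmul(p, cmul(h, cinv(p)))

p = ((0, 1, 2, 0), (1, 0, 0, 2))          # pure: scalar part is zero
h_par = scale(3, p)                       # parallel to p
h_perp = ((0, 2, -1, 0), (0, 3, 1, 0))    # pure, with S(p o h_perp^c) = 0

print(cmul(p, cconj(h_perp))[0][0])       # scalar part, zero for perpendicular numbers
print(conjugate_by(p, h_par) == h_par, conjugate_by(p, h_perp) == scale(-1, h_perp))
```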
4.4 Cayley Number Identities
We have already derived a number of identities satisfied by all Cayley numbers. In
particular we have noted elementary identities with respect to the scalar part of Cayley
numbers $S(X) = S(X^c)$, $S(X \circ Y) = S(Y \circ X)$. The following theorem will be of
considerable use in later sections:
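Theorem 4.3 For any Cayley numbers $Q, X, Z \in \mathbb{K}$ then
\[ S((Q \circ X) \circ (Z \circ Q^c)) = N_Q S(X \circ Z) \]
Proof
Let
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad Z = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \qquad Q = \begin{bmatrix} a \\ b \end{bmatrix} \]
where $\alpha, \beta, \gamma, \delta, a, b \in \mathbb{H}$ then
\[ Q \circ X = \begin{bmatrix} a \\ b \end{bmatrix} \circ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} a\alpha - \beta\bar b \\ \alpha b + \bar a\beta \end{bmatrix} \qquad Z \circ Q^c = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \circ \begin{bmatrix} \bar a \\ -b \end{bmatrix} = \begin{bmatrix} \gamma\bar a + b\bar\delta \\ \bar a\delta - \bar\gamma b \end{bmatrix} \]
\[ (Q \circ X) \circ (Z \circ Q^c) = \begin{bmatrix} (a\alpha - \beta\bar b)(\gamma\bar a + b\bar\delta) - (\bar a\delta - \bar\gamma b)(\bar b\bar\alpha + \bar\beta a) \\ (\gamma\bar a + b\bar\delta)(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)(\bar a\delta - \bar\gamma b) \end{bmatrix} \]
so that
\[ S((Q \circ X) \circ (Z \circ Q^c)) = \frac{1}{2} \begin{bmatrix} \ell + m + n + p \\ 0 \end{bmatrix} \]
where
\begin{align*}
\ell &= a(\alpha\gamma + \bar\gamma\bar\alpha)\bar a - \bar a(\delta\bar\beta + \beta\bar\delta)a \\
m &= b\bar b(\bar\gamma\bar\alpha + \alpha\gamma - \beta\bar\delta - \delta\bar\beta) \\
n &= (\delta\bar b\bar\alpha\bar a - \bar a\delta\bar b\bar\alpha) + (a\alpha b\bar\delta - \alpha b\bar\delta a) \\
p &= (\bar\gamma b\bar\beta a - a\bar\gamma b\bar\beta) + (\bar a\beta\bar b\gamma - \beta\bar b\gamma\bar a)
\end{align*}
Now
\[ S(\alpha\gamma) = \tfrac{1}{2}(\alpha\gamma + \bar\gamma\bar\alpha) \in \mathbb{R} \qquad S(\delta\bar\beta) = \tfrac{1}{2}(\delta\bar\beta + \beta\bar\delta) \in \mathbb{R} \]
Also
\[ S(a(\alpha b\bar\delta)) = \tfrac{1}{2}(a\alpha b\bar\delta + \delta\bar b\bar\alpha\bar a) = S((\alpha b\bar\delta)a) = \tfrac{1}{2}(\alpha b\bar\delta a + \bar a\delta\bar b\bar\alpha) \]
and
\[ S(\bar a(\beta\bar b\gamma)) = \tfrac{1}{2}(\bar a\beta\bar b\gamma + \bar\gamma b\bar\beta a) = S((\beta\bar b\gamma)\bar a) = \tfrac{1}{2}(\beta\bar b\gamma\bar a + a\bar\gamma b\bar\beta) \]
using the property $S(pq) = S(qp)$ for any $p, q \in \mathbb{H}$. These results imply that both $n$ and
$p$ vanish.
Hence
\[ S((Q \circ X) \circ (Z \circ Q^c)) = \begin{bmatrix} (a\bar a + b\bar b)S(\alpha\gamma) - (\bar a a + b\bar b)S(\delta\bar\beta) \\ 0 \end{bmatrix} = N_Q \begin{bmatrix} S(\alpha\gamma - \delta\bar\beta) \\ 0 \end{bmatrix} = N_Q S(X \circ Z) \]
which completes the proof.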
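Theorem 4.3 lends itself to a direct numerical spot check. The sketch below assumes the illustrative tuple representation used for these examples; the helper names and the integer test values for $Q$, $X$, $Z$ are arbitrary choices and not taken from the text.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def cconj(X):
    return (qconj(X[0]), tuple(-v for v in X[1]))

def norm(X):
    return sum(v * v for part in X for v in part)

def scalar(X):
    # Scalar part S(X): the first real component.
    return X[0][0]

Q = ((1, -1, 2, 0), (3, 1, 0, -2))
X = ((1, 2, -1, 3), (0, 1, 4, -2))
Z = ((2, -3, 1, 1), (5, 0, -1, 2))

lhs = scalar(cmul(cmul(Q, X), cmul(Z, cconj(Q))))
rhs = norm(Q) * scalar(cmul(X, Z))
print(lhs, rhs)
```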
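In the remaining part of this section we derive other identities which will make the
non-associative algebra of Cayley numbers easier to deal with. We begin by defining the
associator (cf article by Curtis in [18])
\[ A(X, Y, Z) = (X \circ Y) \circ Z - X \circ (Y \circ Z) \]
which of course is generally non-zero. However, by direct computation using the properties
of quaternions we can easily show:
\[ \text{(a)} \qquad A(X, X, Y) = (X \circ X) \circ Y - X \circ (X \circ Y) = 0 \]
To obtain this result let
\[ X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \]
then
\[ (X \circ X) \circ Y = \begin{bmatrix} \alpha^2 - \beta\bar\beta \\ \alpha\beta + \bar\alpha\beta \end{bmatrix} \circ \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} (\alpha^2 - \beta\bar\beta)\gamma - \delta(\bar\beta\bar\alpha + \bar\beta\alpha) \\ \gamma(\alpha\beta + \bar\alpha\beta) + (\bar\alpha^2 - \beta\bar\beta)\delta \end{bmatrix} \]
whilst
\[ X \circ (X \circ Y) = \begin{bmatrix} \alpha^2\gamma - \alpha\delta\bar\beta - (\gamma\beta\bar\beta + \bar\alpha\delta\bar\beta) \\ \alpha\gamma\beta - \delta\bar\beta\beta + \bar\alpha\gamma\beta + \bar\alpha^2\delta \end{bmatrix} \]
from which it follows easily (noting that $\alpha + \bar\alpha$ is a real number thus commuting with any
quaternion) that $X \circ (X \circ Y) = (X \circ X) \circ Y$.
\[ \text{(b)} \qquad A(X, Y, Y) = (X \circ Y) \circ Y - X \circ (Y \circ Y) = 0 \]
This is deduced in an analogous manner to (a). The existence of identities (a) and (b)
shows that Cayley number algebra is a member of the class of alternative algebras. We
shall show in a later section that Cayley numbers are the most important example of this
class of algebras.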
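The associator and the two alternative laws can be evaluated directly. This is a minimal sketch assuming the illustrative tuple representation; `assoc` is a hypothetical helper implementing $A(X, Y, Z)$ as defined above, and the basis triple used to exhibit a non-zero associator is one convenient choice.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def csub(X, Y):
    return tuple(tuple(u - v for u, v in zip(px, py)) for px, py in zip(X, Y))

def assoc(X, Y, Z):
    # Associator A(X, Y, Z) = (X o Y) o Z - X o (Y o Z).
    return csub(cmul(cmul(X, Y), Z), cmul(X, cmul(Y, Z)))

ZERO_Q = (0, 0, 0, 0)
ZERO = (ZERO_Q, ZERO_Q)

X = ((1, 2, -1, 3), (0, 1, 4, -2))
Y = ((2, -3, 1, 1), (5, 0, -1, 2))

# A basis triple [i, 0], [j, 0], [0, 1] whose associator does not vanish.
E1 = ((0, 1, 0, 0), ZERO_Q)
E2 = ((0, 0, 1, 0), ZERO_Q)
E4 = (ZERO_Q, (1, 0, 0, 0))

print(assoc(X, X, Y), assoc(X, Y, Y), assoc(E1, E2, E4))
```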
The following Theorem is closely related to the identity in (b). A Cayley number which is
not a quaternion is called non-degenerate.
Theorem Three non-degenerate Cayley numbers $X$, $Y$, $W$ satisfy the associative law
\[ W \circ (X \circ Y) = (W \circ X) \circ Y \]
if and only if any two of them have parallel vector parts.
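Proof First it is easy to see that there is no loss of generality in assuming that all
three Cayley numbers $X$, $Y$, $W$ are pure. Thus let $V(W)$, $V(X)$ and $V(Y)$ be pure Cayley
numbers and if
\[ [V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)] \]
for all pure Cayley numbers $V(W)$ then $V(Y) = kV(X)$, $k \in \mathbb{R}$. To show this let
\[ V(W) = \begin{bmatrix} a \\ b \end{bmatrix} \qquad V(X) = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad V(Y) = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \]
then
\[ [V(W) \circ V(X)] \circ V(Y) = \begin{bmatrix} (a\alpha - \beta\bar b)\gamma - \delta(\bar b\bar\alpha + \bar\beta a) \\ \gamma(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)\delta \end{bmatrix} \]
\[ V(W) \circ [V(X) \circ V(Y)] = \begin{bmatrix} a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar b \\ (\alpha\gamma - \delta\bar\beta)b + \bar a(\gamma\beta + \bar\alpha\delta) \end{bmatrix} \]
Now equating components, and since $V(W)$, $V(X)$ and $V(Y)$ are pure Cayley numbers so that
$a$, $\alpha$ and $\gamma$ are pure quaternions satisfying $\bar a = -a$, $\bar\alpha = -\alpha$ and $\bar\gamma = -\gamma$, we have
\[ (a\alpha - \beta\bar b)\gamma - \delta(\bar b\bar\alpha + \bar\beta a) = a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar b \]
\[ \gamma(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)\delta = (\alpha\gamma - \delta\bar\beta)b + \bar a(\gamma\beta + \bar\alpha\delta) \]
Since $a$ and $b$ are independent quaternions then
\[ \delta\bar\beta a = a\delta\bar\beta \qquad \text{and} \qquad -\beta\bar b\gamma + \delta\bar b\alpha = -\gamma\beta\bar b + \alpha\delta\bar b \]
If the first of these relations is to be satisfied for all quaternions $a$ then $\delta\bar\beta$ must be a real
number since this is the only number which commutes with all quaternions. Thus
\[ \delta = k\beta \qquad \text{where } k \text{ is real} \]
The second relation now becomes
\[ \beta\bar b(k\alpha - \gamma) = (k\alpha - \gamma)\beta\bar b \]
If this is to be true for all $b$ then $\gamma - k\alpha$ must be real. Since $\gamma$ and $\alpha$ are pure quaternions
this implies
\[ \gamma - k\alpha = 0 \qquad \therefore \quad \gamma = k\alpha \]
The other relations are then identically satisfied so that if
\[ [V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)] \]
then $V(Y) = kV(X)$ where $k$ is real. This result will be used in Section 4.6.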
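In a similar manner we can show that if
\[ [V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)] \]
for all $V(X)$ then $V(Y) = kV(W)$, $k \in \mathbb{R}$. To see this we need to show that for all
quaternions $\alpha$, $\beta$:
\[ (a\alpha - \beta\bar b)\gamma - \delta(-\bar b\alpha + \bar\beta a) = a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta - \alpha\delta)\bar b \]
\[ \gamma(\alpha b - a\beta) + (\alpha a - b\bar\beta)\delta = (\alpha\gamma - \delta\bar\beta)b - a(\gamma\beta - \alpha\delta) \]
which, on rearrangement gives
\[ a\delta\bar\beta - \delta\bar\beta a + \gamma\beta\bar b - \beta\bar b\gamma + \delta\bar b\alpha - \alpha\delta\bar b = 0 \]
and
\[ \gamma\alpha b - \alpha\gamma b + a\gamma\beta - \gamma a\beta + \alpha a\delta - a\alpha\delta + \delta\bar\beta b - b\bar\beta\delta = 0 \]
The first relation, if true for all $\alpha$, implies
\[ \delta\bar b \in \mathbb{R} \qquad \text{that is} \qquad \delta = kb \quad k \in \mathbb{R} \]
The second relation (if true for all $\beta$) then implies
\[ a\gamma = \gamma a \]
which, since $a$, $\gamma$ are pure quaternions, gives
\[ a = p\gamma \qquad p \in \mathbb{R} \]
Using this in the second relation above implies
\[ [kp - 1][\gamma\alpha b - \alpha\gamma b] = 0 \]
Thus, either $b = 0$, $\gamma\alpha = \alpha\gamma$ or $kp = 1$. Now, if $b = 0$ then, from an earlier result, $\delta = 0$.
Both main relations are now satisfied and
\[ V(Y) = \frac{1}{p} V(W) \]
The second possibility, $\alpha\gamma = \gamma\alpha$, implies (if true for all $\alpha$) that $\gamma \in \mathbb{R}$ which is a
contradiction since $\gamma$ is a pure quaternion. The final possibility is that $kp = 1$. The first
relation, above, becomes
\[ \gamma[b\bar\beta + \beta\bar b] = [b\bar\beta + \beta\bar b]\gamma \]
which is satisfied identically since $b\bar\beta + \beta\bar b$ is a scalar. Thus we deduce,
\[ V(Y) = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} ka \\ kb \end{bmatrix} = kV(W) \]
We can conclude that three (non-degenerate) Cayley numbers $X$, $Y$, $W$ satisfy the associative
rule $W \circ (X \circ Y) = (W \circ X) \circ Y$ if and only if any two of them have parallel vector parts.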
Continuing with Cayley number identities, we have:
\[ \text{(c)} \qquad A(X^c, X, Y) = (X^c \circ X) \circ Y - X^c \circ (X \circ Y) = 0 \]
By taking the conjugate of this relation we (essentially) obtain:
\[ \text{(d)} \qquad A(X, Y, Y^c) = X \circ (Y \circ Y^c) - (X \circ Y) \circ Y^c = 0 \]
The last two identities (in the slightly amended form $(Y \circ X^{-1}) \circ X = Y$, $X^{-1} \circ (X \circ Y) = Y$)
have already been used. We show later, as part of the proof of the Hurwitz theorem, that
identities (a), (b) are equivalent to either of (c) or (d). The remaining identities (e) to (i)
are valid not just for Cayley numbers but for all alternative algebras. In the first identity
(a) replace $X$ by $C + D$ to give, after some cancellation:
\[ \text{(e)} \qquad (D \circ C) \circ Y - D \circ (C \circ Y) + (C \circ D) \circ Y - C \circ (D \circ Y) = 0 \]
Putting $D = Y$ in (e) and using (b) gives
\[ \text{(f)} \qquad (Y \circ C) \circ Y = Y \circ (C \circ Y) \]
Replacing $Y$ by $C + D$ in identity (b) implies, after re-labelling
\[ \text{(g)} \qquad (D \circ C) \circ Y - D \circ (C \circ Y) + (D \circ Y) \circ C - D \circ (Y \circ C) = 0 \]
Subtracting (e) from (g) gives
\[ \text{(h)} \qquad (D \circ Y) \circ C + C \circ (D \circ Y) = (C \circ D) \circ Y + D \circ (Y \circ C) \]
Now in this expression we replace $D$ by $C \circ D$ and then $Y$ by $Y \circ C$, and add the resulting
expressions to obtain
\begin{align*}
& (C \circ D) \circ (Y \circ C) + (C \circ (C \circ D)) \circ Y + D \circ ((Y \circ C) \circ C) + (C \circ D) \circ (Y \circ C) \\
& \qquad = C \circ [(C \circ D) \circ Y + D \circ (Y \circ C)] + [(C \circ D) \circ Y + D \circ (Y \circ C)] \circ C
\end{align*}
Now using (h) and (f) in this last expression gives
\begin{align*}
& (C^2 \circ D) \circ Y + 2(C \circ D) \circ (Y \circ C) + D \circ (Y \circ C^2) \\
& \qquad = C^2 \circ (D \circ Y) + (D \circ Y) \circ C^2 + 2C \circ ((D \circ Y) \circ C)
\end{align*}
Now, finally, replace $C$ by $C^2$ in (h) and combine it with the expression just obtained to
imply:
\[ \text{(i)} \qquad (C \circ D) \circ (Y \circ C) = C \circ (D \circ Y) \circ C \]
Note that because of the identity (f) the right hand side of (i) is unambiguous. Identity
(i) is known as the Moufang identity.
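The flexible law (f) and the Moufang identity (i) can be spot-checked for Cayley numbers. This is a minimal sketch assuming the illustrative tuple representation; the helper names and the integer test values are arbitrary.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

C = ((1, -1, 2, 0), (3, 1, 0, -2))
D = ((1, 2, -1, 3), (0, 1, 4, -2))
Y = ((2, -3, 1, 1), (5, 0, -1, 2))

# Identity (f): (Y o C) o Y = Y o (C o Y).
flex_left = cmul(cmul(Y, C), Y)
flex_right = cmul(Y, cmul(C, Y))

# Identity (i): (C o D) o (Y o C) = C o (D o Y) o C, bracketed either way on the right.
moufang_left = cmul(cmul(C, D), cmul(Y, C))
moufang_right_a = cmul(C, cmul(cmul(D, Y), C))
moufang_right_b = cmul(cmul(C, cmul(D, Y)), C)

print(flex_left == flex_right, moufang_left == moufang_right_a == moufang_right_b)
```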
A further identity can be obtained by replacing (in identity (h)) $D$ by $(C^c \circ D)$ and then
$Y$ by $Y \circ C^c$ and adding the resulting expressions to obtain
\[ (C \circ D) \circ (Y \circ C^c) + (C^c \circ D) \circ (Y \circ C) = C \circ [(D \circ Y) \circ C^c] + [C^c \circ (D \circ Y)] \circ C \]
which leads to, on taking the scalar parts of both sides
\begin{align*}
S[(C \circ D) \circ (Y \circ C^c)] + S[(C^c \circ D) \circ (Y \circ C)] &= S(C \circ [(D \circ Y) \circ C^c]) + S([C^c \circ (D \circ Y)] \circ C) \\
&= S[\{(D \circ Y) \circ C^c\} \circ C] + S[C \circ \{C^c \circ (D \circ Y)\}] \\
&= 2N_C S(D \circ Y)
\end{align*}
However, as we have shown above $S[(C \circ D) \circ (Y \circ C^c)] = N_C S(D \circ Y)$ and so we immediately
deduce that
\[ S[(C \circ D) \circ (Y \circ C^c)] = S[(C^c \circ D) \circ (Y \circ C)] = N_C S(D \circ Y) \]
This identity will be of considerable use in Section 4.6.
4.5 Normed Algebras and the Hurwitz Theorem
We are now in a position to prove Hurwitz's theorem which highlights the important role
enjoyed by Cayley numbers in the domain of normed algebras. In the latter part of the
proof considerable use will be made of the alternative laws and of the Moufang relation.
Theorem 4.4 The only normed algebras over the real field are isomorphic to $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, $\mathbb{K}$.
Proof
Let $A$ be an algebra with basis $e_1, e_2, \ldots, e_n$ in which $e_1 = i_1$ is the identity element. If
$a, b \in A$ then
\[ a = \sum_{j=1}^{n} a_j e_j \qquad b = \sum_{j=1}^{n} b_j e_j \]
The norm $N_a$ of $a$ is
\[ N_a = a_1^2 + a_2^2 + \ldots + a_n^2 \qquad N_a \in \mathbb{R} \]
Clearly $N_a = 0$ if and only if $a = 0$. The algebra $A$ is normed if for any two elements
$a, b \in A$ there exists a basis for which
\[ N_{ab} = N_a N_b \]
We first show that any normed algebra over R is a division algebra. The proof is immediate.
If $a, b \in A$ and if $ab = 0$ then
\[ N_{ab} = N_0 = 0 \]
\[ N_a N_b = 0 \;\Rightarrow\; N_a = 0 \text{ or } N_b = 0 \;\Rightarrow\; a = 0 \text{ or } b = 0 \]
We conclude that if $ab = 0$ then either $a = 0$ or $b = 0$. Hence the algebra $A$ is a division
algebra. Since it is a division algebra it must contain a unit which we denote by $e_1 = i_1$.
We present the proof of Hurwitz's theorem following the approach given in Jacobsen.
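The standard inner product is introduced in $A$:
\[ \langle a, b \rangle = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n \qquad \forall\, a, b \in A \]
The following properties of the inner product are easily obtained.
(i) $N_a = \langle a, a \rangle$
(ii) $\langle a, b \rangle = \langle b, a \rangle$
(iii) $\langle \alpha a, b \rangle = \langle a, \alpha b \rangle = \alpha \langle a, b \rangle$, $\alpha \in \mathbb{R}$
(iv) $\langle a, b + c \rangle = \langle a, b \rangle + \langle a, c \rangle$ and $\langle a + c, b \rangle = \langle a, b \rangle + \langle c, b \rangle$ $\forall\, a, b, c \in A$
(v) If $\langle a, b \rangle = 0$ $\forall\, b \in A$ then $a = 0$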
Properties (i) to (iv) are easily checked. To obtain property (v) choose $b = b_j e_j$ (no summation)
for a given $j$, with $b_j \neq 0$. Then $\langle a, b \rangle = a_j b_j$ and if $\langle a, b \rangle = 0$ we deduce $a_j = 0$. We now repeat
for each $j = 1, 2, \ldots, n$ to deduce $a_j = 0$, $j = 1, 2, \ldots, n$, and so $a = 0$. This shows that
the inner product is non-degenerate.
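(vi) $\langle a, b \rangle = \tfrac{1}{2} [N_{a+b} - N_a - N_b]$
proof
\begin{align*}
N_{a+b} &= \langle a + b, a + b \rangle \\
&= \langle a, a + b \rangle + \langle b, a + b \rangle \\
&= \langle a, a \rangle + 2\langle a, b \rangle + \langle b, b \rangle
\end{align*}
and the result follows.
(vii) $\langle ca, cb \rangle = N_c \langle a, b \rangle$ $\forall\, a, b, c \in A$
proof
\begin{align*}
\langle ca, cb \rangle &= \tfrac{1}{2} [N_{c(a+b)} - N_{ca} - N_{cb}] \\
&= \tfrac{1}{2} N_c [N_{a+b} - N_a - N_b] = N_c \langle a, b \rangle \qquad \text{from (vi)}
\end{align*}
Similarly
\[ \langle ac, bc \rangle = N_c \langle a, b \rangle \qquad \forall\, a, b, c \in A \]
(viii) $\langle ac, bd \rangle + \langle ad, bc \rangle = 2 \langle c, d \rangle \langle a, b \rangle$
proof In the second relation of (vii), replace $c$ by $c + d$ to give
\[ \langle a(c + d), b(c + d) \rangle = N_{c+d} \langle a, b \rangle \]
then, expanding the left hand side and rearranging gives
\begin{align*}
\langle ac, bd \rangle + \langle ad, bc \rangle &= N_{c+d} \langle a, b \rangle - \langle ac, bc \rangle - \langle ad, bd \rangle \\
&= [N_{c+d} - N_c - N_d] \langle a, b \rangle \\
&= 2 \langle c, d \rangle \langle a, b \rangle
\end{align*}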
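Properties (vi) to (viii) hold in any normed algebra, so they can be illustrated in a concrete one. The sketch below uses the Cayley numbers of Section 4.2 as the example algebra, which is an assumption made purely for illustration, with the standard inner product on the eight real components; the helper names and sample values are hypothetical.

```python
def qmul(p, q):
    # Hamilton product of quaternions stored as (w, x, y, z).
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cmul(X, Y):
    # Cayley product [a, b] o [c, d] = [ac - d b~, cb + a~ d].
    (a, b), (c, d) = X, Y
    first = tuple(u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b))))
    second = tuple(u + v for u, v in zip(qmul(c, b), qmul(qconj(a), d)))
    return (first, second)

def inner(X, Y):
    # Standard inner product on the eight real components.
    return sum(u * v for px, py in zip(X, Y) for u, v in zip(px, py))

def norm(X):
    return inner(X, X)

def cadd(X, Y):
    return tuple(tuple(u + v for u, v in zip(px, py)) for px, py in zip(X, Y))

a = ((1, 2, -1, 3), (0, 1, 4, -2))
b = ((2, -3, 1, 1), (5, 0, -1, 2))
c = ((1, -1, 2, 0), (3, 1, 0, -2))
d = ((0, 2, 1, -1), (1, 1, -3, 0))

# (vi): 2 <a, b> = N_{a+b} - N_a - N_b
vi_ok = 2 * inner(a, b) == norm(cadd(a, b)) - norm(a) - norm(b)
# (vii): <ca, cb> = N_c <a, b> and <ac, bc> = N_c <a, b>
vii_ok = (inner(cmul(c, a), cmul(c, b)) == norm(c) * inner(a, b)
          and inner(cmul(a, c), cmul(b, c)) == norm(c) * inner(a, b))
# (viii): <ac, bd> + <ad, bc> = 2 <c, d> <a, b>
viii_ok = (inner(cmul(a, c), cmul(b, d)) + inner(cmul(a, d), cmul(b, c))
           == 2 * inner(c, d) * inner(a, b))
print(vi_ok, vii_ok, viii_ok)
```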
The inner product can be used to split $A$ into a subspace $S$ and its orthogonal complement
$S^\perp$. Precisely, $S^\perp$ is the set of all elements $a^\perp \in A$ such that
\[ \langle a^\perp, a \rangle = 0 \qquad \forall\, a \in S \]
It is now a standard construction to show that $S^\perp$ is a linear subspace of $A$ and in fact
\[ A = S + S^\perp \]
implying that every $a \in A$ can be expressed uniquely as a sum of an element of $S$ and an
element of $S^\perp$. In particular, we choose $S = \{e_1\}$. Also, in what follows, any element of
$\{e_1\}^\perp$ will be underlined.
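Thus in (viii) choose $b = e_1$ and $a = \underline{a} \in \{e_1\}^\perp$ then
\[ \langle \underline{a}c, d \rangle + \langle \underline{a}d, c \rangle = 2 \langle c, d \rangle \langle \underline{a}, e_1 \rangle = 0 \]
whilst if $d = e_1$ and $c = \underline{c} \in \{e_1\}^\perp$ then (viii) again implies
\[ \langle a\underline{c}, b \rangle + \langle a, b\underline{c} \rangle = 2 \langle \underline{c}, e_1 \rangle \langle a, b \rangle = 0 \]
This gives the result (with some relabelling):
\[ \text{(ix)} \qquad \langle \underline{a}c, d \rangle + \langle \underline{a}d, c \rangle = 0 \qquad \text{and} \qquad \langle c\underline{a}, d \rangle + \langle c, d\underline{a} \rangle = 0 \]
Now any $b \in A$ can obviously be written in the form
\[ b = \alpha e_1 + \underline{a} \qquad \alpha \in \mathbb{R}, \quad \underline{a} \in \{e_1\}^\perp \]
With this partition of $A$ we introduce the conjugate $b^*$ of $b$:
\[ b^* = \alpha e_1 - \underline{a} \]
Now, by (iii)
\[ \langle \alpha c, d \rangle - \langle c, \alpha d \rangle = 0 \]
and adding this to the first relation of (ix) we have:
\[ \langle \alpha c, d \rangle + \langle \underline{a}c, d \rangle + \langle \underline{a}d, c \rangle - \langle c, \alpha d \rangle = 0 \]
that is
\begin{align*}
\langle bc, d \rangle &= \langle c, \alpha d \rangle - \langle \underline{a}d, c \rangle \\
&= \langle c, \alpha d \rangle - \langle c, \underline{a}d \rangle
\end{align*}
Therefore we deduce
\[ \text{(x)} \qquad \langle bc, d \rangle = \langle c, b^* d \rangle \qquad \forall\, b, c, d \in A \]
Similarly with the second relation in (ix)
\[ \langle cb, d \rangle = \langle c, db^* \rangle \]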
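relabelling the right hand side of this second relation $d \to b^*$, $b^* \to d$ implies
\[ \langle c, b^* d \rangle = \langle cd^*, b^* \rangle \]
so we conclude
\[ \langle bc, d \rangle = \langle c, b^* d \rangle = \langle cd^*, b^* \rangle \]
Now re-label again: $d \to d^*$, $b \to c^*$ and $c \to b^*$ to give
\[ \langle c^* b^*, d^* \rangle = \langle b^*, cd^* \rangle = \langle b^* d, c \rangle \]
and so we finally write, from (x) and its extensions
\[ \text{(xi)} \qquad \langle bc, d \rangle = \langle c, b^* d \rangle = \langle cd^*, b^* \rangle = \langle c^* b^*, d^* \rangle \qquad \forall\, b, c, d \in A \]
From the relations in (xi) a number of special cases can be deduced. First, with $c = e_1$:
\[ \langle b, d \rangle = \langle b^*, d^* \rangle = \langle d^*, b^* \rangle \]
so, replacing $b$ by $bc$
\[ \langle bc, d \rangle = \langle d^*, (bc)^* \rangle \]
But, in particular, from (xi)
\[ \langle bc, d \rangle = \langle c^* b^*, d^* \rangle = \langle d^*, c^* b^* \rangle \]
from which we deduce
\[ \langle d^*, (bc)^* - c^* b^* \rangle = 0 \]
and so, since the inner product is non-degenerate we obtain
\[ (bc)^* = c^* b^* \qquad \forall\, b, c \in A \]
Thus the map $\phi : b \to b^*$ is an anti-involution of the algebra $A$. (An involution is a map $\phi$
such that $\phi^2$ = identity map and one which respects multiplication, $\phi(ab) = \phi(a)\phi(b)$. An
anti-involution is one that reverses multiplication, as does conjugation.)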
We now use the concept of conjugation, in conjunction with the inner product to show
that any normed algebra must be an alternative algebra.
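It is obvious that the space of elements left fixed by the conjugate operation is the space
spanned by $e_1$. Also since, from the anti-involution property just obtained, $(bb^*)^* = bb^*$ then we must have
\[ bb^* = \alpha e_1 \qquad \text{for some } \alpha \in \mathbb{R} \]
But (from (xi))
\begin{align*}
N_b = \langle b, b \rangle &= \langle bb^*, e_1 \rangle \\
&= \langle \alpha e_1, e_1 \rangle = \alpha
\end{align*}
\[ \therefore \quad bb^* = N_b e_1 \qquad \forall\, b \in A \]
Also
\[ (b^* b)^* = b^* b \qquad \therefore \quad b^* b = \beta e_1 \quad \beta \in \mathbb{R} \]
and again from (xi)
\begin{align*}
N_b = \langle b, b \rangle &= \langle e_1, b^* b \rangle = \langle b^* b, e_1 \rangle \\
&= \langle \beta e_1, e_1 \rangle = \beta
\end{align*}
\[ \therefore \quad b^* b = N_b e_1 \qquad \text{and so} \qquad bb^* = b^* b = N_b e_1 \]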
We can now deduce the alternative laws.
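From (xi), replacing $d$ by $bd$
\[ \langle bc, bd \rangle = \langle c, b^*(bd) \rangle \]
but, by (vii)
\begin{align*}
\langle bc, bd \rangle = N_b \langle c, d \rangle &= \langle c, N_b d \rangle \qquad \text{since } N_b \in \mathbb{R} \\
&= \langle c, (bb^*)d \rangle
\end{align*}
\[ \therefore \quad \langle c, b^*(bd) \rangle = \langle c, (bb^*)d \rangle \qquad \forall\, c, b, d \in A \]
Thus using non-degeneracy of the inner product we obtain:
\[ \text{(xii)} \qquad b^*(bd) = (bb^*)d \qquad \forall\, b, d \in A \]
By taking the conjugate of both sides and using results derived earlier
\[ (bd)^* b = (bb^*)d^* \qquad \therefore \quad (d^* b^*)b = (bb^*)d^* \]
or, relabelling, we have
\[ \text{(xiii)} \qquad (db)b^* = (bb^*)d \qquad \forall\, b, d \in A \]
Now writing
\[ b + b^* = 2\alpha e_1 \qquad \text{where } b = \alpha e_1 + \underline{a} \text{ and } b^* = \alpha e_1 - \underline{a} \]
we have from (xii)
\begin{align*}
(2\alpha e_1 - b)(bd) &= N_b d \\
\therefore \quad 2\alpha(bd) - b(bd) &= N_b d \\
\text{that is} \quad (2\alpha b - N_b)d &= b(bd) \\
\therefore \quad [2\alpha b - b(2\alpha e_1 - b)]d &= b(bd) \\
\text{and so} \quad b^2 d &= b(bd)
\end{align*}
whilst from (xiii)
\begin{align*}
(db)(2\alpha e_1 - b) &= N_b d \\
\therefore \quad db(2\alpha) - N_b d &= (db)b \\
\text{hence} \quad d[b(2\alpha) - b(2\alpha e_1 - b)] &= (db)b \\
\text{leading to} \quad db^2 &= (db)b
\end{align*}
These results show that a normed algebra over $\mathbb{R}$ must be an alternative algebra. The
converse of this result is easily obtained.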
If $A$ is an alternative algebra over $\mathbb{R}$ with identity $e_1$ and with anti-involution $b \to b^*$ such
that (with $b = \alpha e_1 + \underline{a}$, $b^* = \alpha e_1 - \underline{a}$)
\[ bb^* = N_b e_1 \qquad \text{and} \qquad b + b^* = 2\alpha e_1 \quad \alpha \in \mathbb{R} \]
then $A$ is a normed algebra. The proof of this result depends crucially on the Moufang
identity which was derived earlier in the discussion on Cayley numbers. However, that
identity is true for all alternative algebras. Now $\forall\, c, b, d \in A$ the Moufang identity states
\[ (cb)(dc) = c(bd)c \]
Also
\begin{align*}
N_{bd} = (bd)(bd)^* &= (bd)(d^* b^*) \\
&= (bd)[d^*(2\alpha e_1 - b)] \\
&= 2\alpha(bd)d^* - (bd)(d^* b) \\
&= 2\alpha b(dd^*) - b(dd^*)b \qquad \text{using the Moufang identity} \\
&= N_d[2\alpha b - b^2] \\
&= N_d[2\alpha e_1 - b]b \\
&= N_d(b^* b) = N_d N_b \qquad \forall\, b, d \in A
\end{align*}
That is, $A$ is a normed algebra.
We are now in a position to prove Hurwitz's theorem.
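Let $A$ be a normed algebra over the real field. Let $S$ be a subalgebra of $A$ containing the
identity $e_1$. (A set $S$ of elements of an algebra $A$ is called a subalgebra of $A$ if $S \neq \emptyset$ and
if $S$ is closed under multiplication.) Then, as we have seen,
\[ A = S + S^\perp \]
If $S \neq A$ then there exists an element $I \in S^\perp$ such that $N_I = 1$ (this is always possible
since, for a non-zero element $b \in A$ we can construct another element $B = \dfrac{1}{\sqrt{N_b}}\, b$ such that
$N_B = 1$). Since $I \in S^\perp$ then
\[ I^* = -I \qquad \text{and} \qquad I^2 = -I^* I = -N_I e_1 = -e_1 \]
Our aim now is to show that the set $S + IS$ (i.e. all elements of the form $a + Ib$, $a, b \in S$)
is a subalgebra of $A$. We need to develop the product of elements within $S + IS$.
Now $xx^* = N_x e_1$ and so replacing $x$ by $x + y$ gives
\begin{align*}
(x + y)(x^* + y^*) &= N_{x+y} e_1 \\
\text{i.e.} \quad N_x e_1 + yx^* + xy^* + N_y e_1 &= N_{x+y} e_1 \\
\therefore \quad yx^* + xy^* &= 2 \langle x, y \rangle e_1 \qquad \text{from (vi)}
\end{align*}
Choosing $x \in S$ and $y = I \in S^\perp$ gives
\[ Ix^* + xI^* = 2 \langle x, I \rangle e_1 = 0 \]
\[ \therefore \quad xI = Ix^* \qquad \forall\, x \in S \]
Now, by an earlier result (xi)
\[ \langle bc, d \rangle = \langle c, b^* d \rangle \]
therefore, with $d = I$
\[ \langle bc, I \rangle = \langle c, b^* I \rangle \]
But if $c, b \in S$ then $bc \in S$ (since $S$, being a subalgebra of $A$, is closed under multiplication).
Thus
\[ 0 = \langle c, b^* I \rangle = \langle c, Ib \rangle \qquad \forall\, c \in S \]
\[ \therefore \quad Ib \in S^\perp \]
We write $IS \subset S^\perp$ (to mean all elements of the form $Ib$, $b \in S$, are members of $S^\perp$). Also,
again by (xi)
\[ \langle bc, d \rangle = \langle c, b^* d \rangle \]
gives, by choosing $b = I$ and $d = If$
\begin{align*}
\langle Ic, If \rangle &= \langle c, I^*(If) \rangle \\
&= \langle c, (I^* I)f \rangle = \langle c, f \rangle
\end{align*}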
where we have employed the alternative law $(ab)b^* = a(bb^*)$. Thus the map $b \to Ib$ is a
unitary transformation as it preserves the inner product. Such a transformation takes
an orthonormal basis in $S$ into an orthonormal basis in $IS$. Hence $S$ and $IS$ have the same
dimension.
The alternative laws and the Moufang identity can now be used to deduce the product
rules of elements of $S + IS$.
Since
\[ a(a^* x) = (aa^*)x = N_a x \]
then, replacing $a$ by $a + b$
\[ (a + b)[(a + b)^* x] = N_{a+b}\, x \]
leading to:
\[ a(b^* x) + b(a^* x) = 2 \langle a, b \rangle x \]
Therefore choosing $a, x \in S$ and $b = I$ then
\[ a(I^* x) + I(a^* x) = 2 \langle a, I \rangle x = 0 \]
\[ \text{that is} \qquad a(Ix) = I(a^* x) \qquad \forall\, a, x \in S \]
Then, taking conjugates
\[ (x^* I^*)a^* = (x^* a)I^* \]
\[ \text{i.e.} \qquad (Ix)a^* = I(a^* x) \]
or, relabelling
\[ (Ix)a = I(ax) \qquad \forall\, a, x \in S \]
From the Moufang identity $(cb)(dc) = c(bd)c$ then
\begin{align*}
(Ib)(Ic) &= (Ib)(c^* I) \\
&= [I(bc^*)]I
\end{align*}
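However, if we revert to the notation used first to describe Cayley numbers, the algebra
of ordered pairs
\[ \begin{bmatrix} a \\ b \end{bmatrix} \qquad a, b \in S \]
in which the product rule is
\[ \begin{bmatrix} a \\ b \end{bmatrix} \circ \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} ac - db^* \\ cb + a^* d \end{bmatrix} \]
and in which $I$ is identified with the element $\begin{bmatrix} 0 \\ e_1 \end{bmatrix}$ and the identity is the element $\begin{bmatrix} e_1 \\ 0 \end{bmatrix}$, and
with conjugate
\[ \begin{bmatrix} a \\ b \end{bmatrix}^c = \begin{bmatrix} a^* \\ -b \end{bmatrix} \]
is precisely the algebra described above.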
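We now show that $S + IS$ is associative if and only if algebra $S$ is commutative. Let
\[ X = \begin{bmatrix} t \\ u \end{bmatrix} \qquad Y = \begin{bmatrix} v \\ w \end{bmatrix} \qquad Z = \begin{bmatrix} a \\ b \end{bmatrix} \qquad t, u, v, w, a, b \in S \]
in which we assume the algebra $S$ is associative. Then according to the product rule:
\[ X \circ (Y \circ Z) = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} va - bw^* \\ aw + v^* b \end{bmatrix} = \begin{bmatrix} t(va - bw^*) - (aw + v^* b)u^* \\ (va - bw^*)u + t^*(aw + v^* b) \end{bmatrix} \]
\[ (X \circ Y) \circ Z = \begin{bmatrix} tv - wu^* \\ vu + t^* w \end{bmatrix} \circ \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} (tv - wu^*)a - b(u^* v^* + w^* t) \\ a(vu + t^* w) + (v^* t^* - uw^*)b \end{bmatrix} \]
If $S$ is commutative then we easily see that
\[ X \circ (Y \circ Z) = (X \circ Y) \circ Z \]
and so the algebra of $S + IS$ is associative. However, if $S$ is not commutative then $S + IS$
is not associative since, in general, $t(bw^*) \neq (bw^*)t$, obtained from the first components of
$X \circ (Y \circ Z)$ and $(X \circ Y) \circ Z$.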
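We now show that $S + IS$ is an alternative algebra if and only if $S$ is associative. We
assume that $S$ is alternative.
Now the alternative laws are that, for any two elements $X, Y \in S + IS$
\[ \text{(i)} \quad X^2 \circ Y - X \circ (X \circ Y) = 0 \qquad \text{(ii)} \quad (Y \circ X) \circ X - Y \circ (X^2) = 0 \]
We easily show that these two conditions are equivalent to a single condition. Now any
element $X \in S + IS$ can be partitioned
\[ X = -X^c + \alpha e_1 \qquad \alpha \in \mathbb{R} \]
and so, employing this in (i):
\begin{align*}
[X \circ (-X^c + \alpha e_1)] \circ Y - X \circ [(-X^c + \alpha e_1) \circ Y] &= 0 \\
-(X \circ X^c) \circ Y + \alpha (X \circ Y) + X \circ (X^c \circ Y) - \alpha (X \circ Y) &= 0 \\
\therefore \quad (X \circ X^c) \circ Y - X \circ (X^c \circ Y) &= 0 \qquad \text{(i)}'
\end{align*}
Similarly, in (ii):
\begin{align*}
(Y \circ X) \circ (-X^c + \alpha e_1) - Y \circ [X \circ (-X^c + \alpha e_1)] &= 0 \\
-(Y \circ X) \circ X^c + \alpha (Y \circ X) + Y \circ (X \circ X^c) - \alpha (Y \circ X) &= 0 \\
\therefore \quad (Y \circ X) \circ X^c - Y \circ (X \circ X^c) &= 0 \qquad \text{(ii)}'
\end{align*}
But, taking the conjugate of (i)$'$ we have
\begin{align*}
[(X \circ X^c) \circ Y - X \circ (X^c \circ Y)]^c &= [(X \circ X^c) \circ Y]^c - [X \circ (X^c \circ Y)]^c \\
&= Y^c \circ (X \circ X^c)^c - (X^c \circ Y)^c \circ X^c \\
&= Y^c \circ (X \circ X^c) - (Y^c \circ X) \circ X^c
\end{align*}
which except for the replacement $Y \to -Y^c$ is exactly the expression obtained in (ii)$'$.
Hence we can express the alternative laws as a single condition:
\[ (X \circ X^c) \circ Y - X \circ (X^c \circ Y) = 0 \]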
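Now, as above, define $X$, $Y$:
\[ X = \begin{bmatrix} t \\ u \end{bmatrix} \qquad Y = \begin{bmatrix} v \\ w \end{bmatrix} \qquad \text{and thus} \qquad X^c = \begin{bmatrix} t^* \\ -u \end{bmatrix} \]
and hence
\[ X \circ X^c = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} t^* \\ -u \end{bmatrix} = \begin{bmatrix} tt^* + uu^* \\ 0 \end{bmatrix} \qquad X^c \circ Y = \begin{bmatrix} t^* \\ -u \end{bmatrix} \circ \begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} t^* v + wu^* \\ -vu + tw \end{bmatrix} \]
therefore
\[ X \circ (X^c \circ Y) = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} t^* v + wu^* \\ -vu + tw \end{bmatrix} = \begin{bmatrix} t(t^* v + wu^*) - (-vu + tw)u^* \\ (t^* v + wu^*)u + t^*(-vu + tw) \end{bmatrix} \]
and
\[ (X \circ X^c) \circ Y = \begin{bmatrix} (tt^* + uu^*)v \\ (tt^* + uu^*)w \end{bmatrix} \]
and so
\[ X \circ (X^c \circ Y) - (X \circ X^c) \circ Y = \begin{bmatrix} t(t^* v) + (vu)u^* - (tt^* + uu^*)v + t(wu^*) - (tw)u^* \\ (t^* v)u - t^*(vu) + (wu^*)u + t^*(tw) - (tt^* + uu^*)w \end{bmatrix} \]
But $S$ is alternative and so
\[ t(t^* v) = (tt^*)v \qquad (vu)u^* = (uu^*)v \qquad (wu^*)u = (uu^*)w \qquad t^*(tw) = (tt^*)w \]
leading to
\[ X \circ (X^c \circ Y) - (X \circ X^c) \circ Y = \begin{bmatrix} t(wu^*) - (tw)u^* \\ (t^* v)u - t^*(vu) \end{bmatrix} = [t(wu^*) - (tw)u^*] + I[(t^* v)u - t^*(vu)] \]
implying immediately that $S + IS$ is alternative if and only if $S$ is associative.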
Now the construction, described above, of S + IS from A can now be repeated for S. We
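begin with
\[ A_0 = \mathbb{R} = \{\alpha \mid \alpha \in \mathbb{R}\} \qquad A_1 = \left\{ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \;\middle|\; \alpha, \beta \in A_0 = \mathbb{R} \right\} \]
Algebra $A_1$ is the algebra of complex numbers since
\[ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \alpha \begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \alpha e_1 + I\beta \]
in which $I^2 = -e_1$. Of course $A_1$ is commutative, since $A_0$ is associative.
\[ A_2 = \mathbb{H} = \left\{ p = \begin{bmatrix} z \\ w \end{bmatrix} \;\middle|\; z, w \in A_1 = \mathbb{C} \right\} \]
this is the algebra of quaternions.
$A_2$ is associative since $A_1$ is commutative. However, as we know, $A_2$ is not commutative.
\[ A_3 = \left\{ Q = \begin{bmatrix} p \\ q \end{bmatrix} \;\middle|\; p, q \in A_2 = \mathbb{H} \right\} \]
this is the algebra of Cayley numbers $\mathbb{K}$. $A_3$ is not associative since $A_2$ is not commutative.
However $A_3$ is alternative (and hence normed) since $A_2$ is associative. The 'last' algebra
of the sequence is
\[ A_4 = \left\{ \begin{bmatrix} Q \\ P \end{bmatrix} \;\middle|\; Q, P \in A_3 = \mathbb{K} \right\} \]
However, this algebra is not alternative (and hence not normed) since $A_3$ is not associative.
This completes the proof of Hurwitz's theorem: the only normed algebras over the real
field are $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, and $\mathbb{K}$.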
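The chain $A_0, A_1, \ldots, A_4$ can be generated by one recursive routine that applies the product rule $[a, b] \circ [c, d] = [ac - db^*, cb + a^* d]$ at every level. The following is a minimal sketch, assuming elements are stored as flat tuples of $2^k$ integers; all function names are illustrative. Commutativity and associativity are multilinear, so testing them on basis elements is sufficient; the alternative laws are quadratic in $X$, so sums of two basis elements are tested as well.

```python
import itertools

def conj(x):
    # Conjugate [a, b]^c = [a*, -b], applied recursively; the reals are self-conjugate.
    if len(x) == 1:
        return x
    h = len(x) // 2
    return conj(x[:h]) + tuple(-v for v in x[h:])

def add(x, y):
    return tuple(u + v for u, v in zip(x, y))

def sub(x, y):
    return tuple(u - v for u, v in zip(x, y))

def mul(x, y):
    # Doubling rule [a, b] o [c, d] = [ac - d b*, cb + a* d].
    if len(x) == 1:
        return (x[0] * y[0],)
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(d, conj(b))) + add(mul(c, b), mul(conj(a), d))

def basis(n):
    return [tuple(int(i == j) for j in range(n)) for i in range(n)]

def commutative(n):
    e = basis(n)
    return all(mul(x, y) == mul(y, x) for x, y in itertools.product(e, repeat=2))

def associative(n):
    e = basis(n)
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x, y, z in itertools.product(e, repeat=3))

def alternative(n):
    e = basis(n)
    elems = e + [add(x, y) for x, y in itertools.combinations(e, 2)]
    for x in elems:
        xx = mul(x, x)
        for y in e:
            if mul(xx, y) != mul(x, mul(x, y)) or mul(mul(y, x), x) != mul(y, xx):
                return False
    return True

for n in (1, 2, 4, 8):
    print(n, commutative(n), associative(n), alternative(n))
print(16, alternative(16))
```

The dimension 16 case is only tested for the alternative laws, since that is the property the argument above shows must fail.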
4.6 Rotations in 7- and 8-Dimensional Euclidean Space
We begin by considering rotations in R7. The work carried through with quaternions
suggests that we consider the operation:
\[ W' = X \circ (W \circ X^{-1}) \]
where $W$ and $X$ are given Cayley numbers. (Note that the right hand side can be shown
to be equal to $(X \circ W) \circ X^{-1}$: to see this put $D \to X$, $C \to W$, $Y \to X^{-1}$ in Cayley
identity (e)). Let
\[ W = \sqrt{N_W} (\cos\phi + w \sin\phi) \qquad X = \sqrt{N_X} (\cos\theta + x \sin\theta) \qquad x \circ x = -1, \quad w \circ w = -1 \]
Because the angle can be viewed as that subtended between the Cayley number and its
scalar part we can construct the following diagram.
Figure 4.1
Theorem 4.5 The map
\[ \phi_X : \mathbb{R}^8 \to \mathbb{R}^8 \qquad \phi_X(W) = X \circ (W \circ X^{-1}) \]
can be interpreted geometrically as describing a rotation of the vector part of $W$ about
the vector part of $X$ through an angle $2\theta$.
Proof It is easy to show that $V(X)$ is left fixed by the mapping:
\begin{align*}
\phi_X(V(X)) &= X \circ (V(X) \circ X^{-1}) \\
&= (S(X) + V(X)) \circ \left[ V(X) \circ \frac{1}{N_X} (S(X) - V(X)) \right] \\
&= \frac{1}{N_X} (S(X) + V(X)) \circ [S(X)V(X) - V(X) \circ V(X)] \\
&= \frac{1}{N_X} (S(X) + V(X)) \circ [(S(X) - V(X)) \circ V(X)] \\
&= X \circ [X^{-1} \circ V(X)] = V(X)
\end{align*}
We are therefore justified in referring to $V(X)$ as the axis of rotation. It is also easy to
show that the norm and scalar part of $W$ are conserved.
\[ N_{W'} = N_X N_{W \circ X^{-1}} = N_X N_W N_{X^{-1}} = N_W \]
since $N_{X^{-1}} = \dfrac{1}{N_X}$. Concerning the scalar part we have noted earlier in Section 4.4 that
for any Cayley numbers $X$, $Y$ then $S(X \circ Y) = S(Y \circ X)$. Thus:
\begin{align*}
S(W') &= S(X \circ (W \circ X^{-1})) \\
&= S((W \circ X^{-1}) \circ X) = S(W)
\end{align*}
since $(W \circ X^{-1}) \circ X = W$. Now
\begin{align*}
X \circ (W \circ X^{-1}) &= X \circ [(S(W) + V(W)) \circ X^{-1}] \\
&= X \circ (S(W) \circ X^{-1}) + X \circ (V(W) \circ X^{-1}) \\
&= S(W) + X \circ (V(W) \circ X^{-1})
\end{align*}
and since $S(X \circ (W \circ X^{-1})) = S(W)$ then
\[ S(X \circ (V(W) \circ X^{-1})) = 0 \]
Therefore $X \circ (V(W) \circ X^{-1})$ is pure Cayley and so
\[ V(X \circ (W \circ X^{-1})) = X \circ (V(W) \circ X^{-1}) \]
Let $W$ be such that its vector part is parallel to $w$; then $V(W')$ is parallel to
$X \circ (w \circ X^{-1}) = w'$. Now choose $p$ to be a unit pure Cayley number in the 'plane' with 'normal'
$x$ (i.e. $S(x \circ p) = 0$). (See Figure 4.2)
Figure 4.2
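Let $\lambda$ be the 'angle' between $x$ and $w$.
\[ w = x \cos\lambda + p \sin\lambda \]
$w$ is a unit Cayley number since (using $x^c = x^{-1} = -x$, $p^c = p^{-1} = -p$)
\begin{align*}
w \circ w^c &= (x \cos\lambda + p \sin\lambda) \circ (-x \cos\lambda - p \sin\lambda) \\
\therefore \quad w \circ w^c &= -x \circ x \cos^2\lambda - p \circ p \sin^2\lambda - \sin\lambda \cos\lambda\, (p \circ x + x \circ p) \\
&= \cos^2\lambda + \sin^2\lambda = 1
\end{align*}
since $p \circ x = -x \circ p$ as $p$, $x$ are perpendicular.
Now
\begin{align*}
w' &= X \circ (w \circ X^{-1}) \\
&= X \circ [(x \cos\lambda + p \sin\lambda) \circ X^{-1}] \\
&= X \circ (x \circ X^{-1}) \cos\lambda + X \circ (p \circ X^{-1}) \sin\lambda
\end{align*}
We shall show that $X \circ (x \circ X^{-1}) = x$ and that $X \circ (p \circ X^{-1})$ is a pure Cayley number which revolves
through an angle $2\theta$ about $x$. The first part is relatively easy. Since $V(X)$ and $x$ are
parallel then
\begin{align*}
V(V(X) \circ x^c) &= \tfrac{1}{2} [V(X) \circ x^c + x \circ V(X)] \\
&= -\tfrac{1}{2} [V(X) \circ x - x \circ V(X)] = 0
\end{align*}
therefore
\begin{align*}
X \circ (x \circ X^{-1}) &= (S(X) + V(X)) \circ [x \circ (S(X) - V(X))] \frac{1}{N_X} \\
&= (S(X) + V(X)) \circ [S(X)x - x \circ V(X)] \frac{1}{N_X} \\
&= \frac{1}{N_X} \left\{ x S^2(X) + S(X)[V(X) \circ x - x \circ V(X)] - V(X) \circ (x \circ V(X)) \right\}
\end{align*}
But $V^{-1}(X) = -\dfrac{V(X)}{N_{V(X)}}$
\begin{align*}
\therefore \quad V(X) \circ (x \circ V(X)) &= -N_{V(X)} V(X) \circ (x \circ V^{-1}(X)) \\
&= -N_{V(X)}\, x \qquad \text{from a result obtained earlier}
\end{align*}
\[ \therefore \quad X \circ (x \circ X^{-1}) = \frac{1}{N_X} [S^2(X) + N_{V(X)}]\, x = x \]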
The second part is developed along similar lines.
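\[
\begin{aligned}
X \circ (\hat{p} \circ X^{-1}) &= (S(X) + V(X)) \circ [\hat{p} \circ (S(X) - V(X))]\tfrac{1}{N_X} \\
&= \tfrac{1}{N_X}\{\hat{p}S^2(X) + S(X)[V(X) \circ \hat{p} - \hat{p} \circ V(X)] - V(X) \circ (\hat{p} \circ V(X))\}
\end{aligned}
\]
However since $V(X)$ and $\hat{p}$ are perpendicular then adapting an earlier result:
\[ V(X) \circ (\hat{p} \circ V(X)) = N_{V(X)}\,\hat{p} \]
\[ \therefore\; X \circ (\hat{p} \circ X^{-1}) = \tfrac{1}{N_X}\big[(S^2(X) - N_{V(X)})\,\hat{p} + S(X)\{V(X) \circ \hat{p} - \hat{p} \circ V(X)\}\big] \]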
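Let $\hat{p}' = X \circ (\hat{p} \circ X^{-1})$; then we show $\hat{p}'$ is perpendicular to $\hat{x}$ (showing that $\hat{p}$ has been rotated
about $\hat{x}$). To do this we need to show $S[(X \circ (\hat{p} \circ X^{-1})) \circ \hat{x}] = 0$. Now $S(\hat{p} \circ \hat{x}) = S(\hat{x} \circ \hat{p}) = 0$
by our original assumption. So all we need to show is that
\[ S[(V(X) \circ \hat{p} - \hat{p} \circ V(X)) \circ \hat{x}] = 0 \]
or (equivalently)
\[ S[(\hat{x} \circ \hat{p} - \hat{p} \circ \hat{x}) \circ \hat{x}] = 0 \]
Now, since $\hat{p}, \hat{x}$ are perpendicular
\[ (\hat{x} \circ \hat{p}) \circ \hat{x} = -(\hat{p} \circ \hat{x}) \circ \hat{x} = (\hat{p} \circ \hat{x}^{-1}) \circ \hat{x} = \hat{p} \]
and
\[ -(\hat{p} \circ \hat{x}) \circ \hat{x} = (\hat{p} \circ \hat{x}^{-1}) \circ \hat{x} = \hat{p} \]
\[ \therefore\; S[(\hat{x} \circ \hat{p} - \hat{p} \circ \hat{x}) \circ \hat{x}] = S(2\hat{p}) = 0 \]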
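We can also determine the angle of rotation:
\[
\begin{aligned}
\cos\phi &= \frac{S(\hat{p}' \circ \hat{p}^c)}{\sqrt{N_{\hat{p}'}}\sqrt{N_{\hat{p}}}} = S(\hat{p}' \circ \hat{p}^c) \\
&= -S(\hat{p}' \circ \hat{p}) \\
&= -S[\{X \circ (\hat{p} \circ X^{-1})\} \circ \hat{p}] \\
&= -\tfrac{1}{N_X}(S^2(X) - N_{V(X)})\,S(\hat{p} \circ \hat{p}) - \tfrac{S(X)}{N_X}\,S\{[V(X) \circ \hat{p} - \hat{p} \circ V(X)] \circ \hat{p}\} \\
&= \tfrac{1}{N_X}(S^2(X) - N_{V(X)}) - \tfrac{S(X)}{N_X}\,S(-V(X) - V(X)) \\
&= \tfrac{1}{N_X}(S^2(X) - N_{V(X)}) = \tfrac{1}{N_X}(N_X\cos^2\theta - N_X\sin^2\theta) = \cos 2\theta
\end{aligned}
\]
\[ \therefore\; \phi = 2\theta \]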
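Finally
\[ \hat{w}' = \hat{x}\cos\lambda + \hat{p}'\sin\lambda \]
where $\hat{p}' = \hat{p}\cos 2\theta + \hat{x} \circ \hat{p}\,\sin 2\theta$. We note that $\hat{x} \circ \hat{p}$ is perpendicular to both $\hat{x}$ and to $\hat{p}$.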
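The statement of Theorem 4.5 can be checked numerically. The following Python sketch is an illustration only and is not part of the original development. It assumes that a Cayley number is stored as eight real coefficients (a pair of quaternions) and that the product of pairs is $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$, a rule inferred from the matrix representations given later in this chapter; the helper names (`qmul`, `cmul`, `rotate`, and so on) are hypothetical.

```python
import math
import random

def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

def norm(X):
    return sum(t * t for t in X)

def inv(X):
    """Inverse X^c / N_X (conjugation negates every non-scalar coefficient)."""
    n = norm(X)
    return (X[0] / n,) + tuple(-t / n for t in X[1:])

def vec(X):
    """Vector part V(X), kept as an 8-tuple with zero scalar part."""
    return (0.0,) + tuple(X[1:])

def dot(A, B):
    return sum(s * t for s, t in zip(A, B))

def rotate(X, W):
    """The map W -> X o (W o X^{-1})."""
    return cmul(X, cmul(W, inv(X)))

random.seed(1)
X = tuple(random.uniform(-1, 1) for _ in range(8))
W = tuple(random.uniform(-1, 1) for _ in range(8))
W1 = rotate(X, W)

length = math.sqrt(norm(vec(X)))
axis = tuple(t / length for t in vec(X))   # unit pure Cayley number along V(X)
theta = math.atan2(length, X[0])           # the 'angle' of X

def perp(V):
    """Component of a pure Cayley number perpendicular to the axis."""
    k = dot(V, axis)
    return tuple(v - k * a for v, a in zip(V, axis))

p, p1 = perp(vec(W)), perp(vec(W1))
cos_phi = dot(p, p1) / math.sqrt(norm(p) * norm(p1))
print(cos_phi, math.cos(2 * theta))        # the two values should agree
```

In this sketch the scalar part and norm of $W$ are unchanged, $V(X)$ is fixed, and the component of $V(W)$ perpendicular to the axis turns through $2\theta$.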
Successive Rotations
A natural question that can now be asked is what if a second rotation is performed?
Are two rotations equivalent to a single one, as they are for complex numbers and for
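quaternions? Mathematically we ask: if
\[ W' = X \circ (W \circ X^{-1}) \quad\text{and}\quad W'' = Y \circ (W' \circ Y^{-1}) \]
then is
\[ W'' = Y \circ (W' \circ Y^{-1}) = (Y \circ X) \circ [W \circ (Y \circ X)^{-1}]\;? \]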
Of course, for quaternions this relation is valid and is obtained immediately since
quaternion multiplication is associative. Unfortunately this relation is not true for Cayley
multiplication, except in certain special cases. For example if Y = X~l (this corresponds
to choosing our second rotation to be the reverse rotation and should return us to where
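we started) then one can show that
\[ W'' = X^{-1} \circ \{[X \circ (W \circ X^{-1})] \circ X\} = W \]
Proof. In Cayley identity (e) let $D \to W$, $C \to X$ and $Y \to X^{-1}$; then
\[
\begin{aligned}
X \circ (W \circ X^{-1}) &= (W \circ X) \circ X^{-1} - W \circ (X \circ X^{-1}) + (X \circ W) \circ X^{-1} \\
&= W - W + (X \circ W) \circ X^{-1} \\
&= K \circ X^{-1} \quad\text{where } K = X \circ W
\end{aligned}
\]
therefore
\[ [X \circ (W \circ X^{-1})] \circ X = (K \circ X^{-1}) \circ X = K \]
\[ \therefore\; X^{-1} \circ \{[X \circ (W \circ X^{-1})] \circ X\} = X^{-1} \circ (X \circ W) = W \]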
which proves the relation. We now use the Cayley number identities to develop a more
general result. Since, as we have argued above, the transformation
\[ W' = X_1 \circ (W \circ X_1^{-1}) \]
with $X_1 = \sqrt{N_{X_1}}\,(\cos\theta + \hat{x}\sin\theta)$ rotates the vector part of $W$ through an angle $2\theta$ about
an axis $\hat{x}$ then we might expect that the transformation:
\[ W'' = X_2 \circ [(X_1 \circ (W \circ X_1^{-1})) \circ X_2^{-1}] \]
where $X_2 = \sqrt{N_{X_2}}\,(\cos\phi + \hat{x}\sin\phi)$ would imply a further rotation through angle $2\phi$ about
the same axis $\hat{x}$ and therefore be equivalent to a single transformation
\[ (X_2 \circ X_1) \circ [W \circ (X_2 \circ X_1)^{-1}] \]
since $X_2 \circ X_1 = \sqrt{N_{X_2}}\sqrt{N_{X_1}}\,(\cos(\theta + \phi) + \hat{x}\sin(\theta + \phi))$. As the following theorem shows,
this does turn out to be true, but only because both rotations share the axis $\hat{x}$.
Theorem 4.6 Let $X, Y, W$ be general Cayley numbers. The relation:
\[ (Y \circ X) \circ [W \circ (Y \circ X)^{-1}] = Y \circ [(X \circ (W \circ X^{-1})) \circ Y^{-1}] \]
is valid for all $W$ only when $V(Y) = kV(X)$ where $k$ is a real number.
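Proof. Without loss of generality we assume $X, Y$ have unit norms. Let
\[ Y = S(Y) + V(Y) \qquad X = S(X) + V(X) \]
then
\[
\begin{aligned}
X \circ (W \circ X^{-1}) &= [S(X) + V(X)] \circ [S(X)W - W \circ V(X)] \\
&= S^2(X)W - S(X)[W \circ V(X) - V(X) \circ W] - V(X) \circ (W \circ V(X))
\end{aligned}
\]
thus
\[
\begin{aligned}
[X \circ (W \circ X^{-1})] \circ Y^{-1}
&= S^2(X)S(Y)W - S(X)S(Y)[W \circ V(X) - V(X) \circ W] - S(Y)V(X) \circ (W \circ V(X)) \\
&\quad - S^2(X)W \circ V(Y) + S(X)[(W \circ V(X)) \circ V(Y) - (V(X) \circ W) \circ V(Y)] \\
&\quad + [V(X) \circ (W \circ V(X))] \circ V(Y)
\end{aligned}
\]
Therefore
\[
\begin{aligned}
Y \circ \{[X \circ (W \circ X^{-1})] \circ Y^{-1}\}
&= S^2(X)S^2(Y)W - S(X)S^2(Y)[W \circ V(X) - V(X) \circ W] - S^2(Y)V(X) \circ (W \circ V(X)) \\
&\quad - S^2(X)S(Y)W \circ V(Y) + S(X)S(Y)[(W \circ V(X)) \circ V(Y) - (V(X) \circ W) \circ V(Y)] \\
&\quad + S(Y)[V(X) \circ (W \circ V(X))] \circ V(Y) \\
&\quad + S^2(X)S(Y)V(Y) \circ W - S(X)S(Y)[V(Y) \circ (W \circ V(X)) - V(Y) \circ (V(X) \circ W)] \\
&\quad - S(Y)V(Y) \circ [V(X) \circ (W \circ V(X))] - S^2(X)V(Y) \circ (W \circ V(Y)) \\
&\quad + S(X)V(Y) \circ [(W \circ V(X)) \circ V(Y)] - S(X)V(Y) \circ [(V(X) \circ W) \circ V(Y)] \\
&\quad + V(Y) \circ \{[V(X) \circ (W \circ V(X))] \circ V(Y)\}
\end{aligned}
\]
Now
\[ Y \circ X = S(Y)S(X) + S(X)V(Y) + S(Y)V(X) + V(Y) \circ V(X) \]
and so
\[ (Y \circ X)^{-1} = S(Y)S(X) - S(X)V(Y) - S(Y)V(X) + V(X) \circ V(Y) \]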
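Therefore
\[
\begin{aligned}
(Y \circ X) \circ [W \circ (Y \circ X)^{-1}]
&= S^2(Y)S^2(X)W - S(Y)S^2(X)W \circ V(Y) \\
&\quad - S^2(Y)S(X)W \circ V(X) + S(Y)S(X)W \circ (V(X) \circ V(Y)) \\
&\quad + S(Y)S^2(X)V(Y) \circ W - S^2(X)V(Y) \circ (W \circ V(Y)) \\
&\quad - S(Y)S(X)V(Y) \circ (W \circ V(X)) + S(X)V(Y) \circ [W \circ (V(X) \circ V(Y))] \\
&\quad + S^2(Y)S(X)V(X) \circ W - S(X)S(Y)V(X) \circ (W \circ V(Y)) \\
&\quad - S^2(Y)V(X) \circ (W \circ V(X)) + S(Y)V(X) \circ [W \circ (V(X) \circ V(Y))] \\
&\quad + S(Y)S(X)(V(Y) \circ V(X)) \circ W - S(X)(V(Y) \circ V(X)) \circ (W \circ V(Y)) \\
&\quad - S(Y)(V(Y) \circ V(X)) \circ (W \circ V(X)) + [V(Y) \circ V(X)] \circ [W \circ (V(X) \circ V(Y))]
\end{aligned}
\]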
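Now form
\[
\begin{aligned}
&Y \circ [\{X \circ (W \circ X^{-1})\} \circ Y^{-1}] - (Y \circ X) \circ [W \circ (Y \circ X)^{-1}] \\
&= S(Y)\Big\{ [V(X) \circ (W \circ V(X))] \circ V(Y) - V(Y) \circ [V(X) \circ (W \circ V(X))] \\
&\qquad\quad - V(X) \circ [W \circ (V(X) \circ V(Y))] + (V(Y) \circ V(X)) \circ (W \circ V(X)) \Big\} \\
&\quad + S(X)\Big\{ V(Y) \circ [(W \circ V(X)) \circ V(Y)] - V(Y) \circ [(V(X) \circ W) \circ V(Y)] \\
&\qquad\quad - V(Y) \circ [W \circ (V(X) \circ V(Y))] + (V(Y) \circ V(X)) \circ (W \circ V(Y)) \Big\} \\
&\quad + S(X)S(Y)\Big\{ (W \circ V(X)) \circ V(Y) - (V(X) \circ W) \circ V(Y) \\
&\qquad\quad - V(Y) \circ (W \circ V(X)) + V(Y) \circ (V(X) \circ W) \\
&\qquad\quad - W \circ (V(X) \circ V(Y)) + V(X) \circ (W \circ V(Y)) \\
&\qquad\quad - (V(Y) \circ V(X)) \circ W + V(Y) \circ (W \circ V(X)) \Big\} \\
&\quad + V(Y) \circ \{[V(X) \circ (W \circ V(X))] \circ V(Y)\} \\
&\quad - (V(Y) \circ V(X)) \circ [W \circ (V(X) \circ V(Y))]
\end{aligned}
\]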
This is the general expression which appears not to vanish. To see this consider the last
two terms which must vanish separately from the other terms. Clearly by Cayley identity
(i) we have
\[
\begin{aligned}
&V(Y) \circ \{[V(X) \circ (W \circ V(X))] \circ V(Y)\} - (V(Y) \circ V(X)) \circ [W \circ (V(X) \circ V(Y))] \\
&\quad = (V(Y) \circ V(X)) \circ \big[(W \circ V(X)) \circ V(Y) - W \circ (V(X) \circ V(Y))\big]
\end{aligned}
\]
which does not vanish unless (since Cayley numbers constitute a division algebra)
$V(Y) = 0$, $V(X) = 0$ or
\[ (W \circ V(X)) \circ V(Y) - W \circ (V(X) \circ V(Y)) = 0 \]
We first show that there is no loss of generality if we assume $W$ is also a pure Cayley
number as
\[ [(S(W) + V(W)) \circ V(X)] \circ V(Y) = [S(W) + V(W)] \circ (V(X) \circ V(Y)) \]
which after expansion and cancellation reduces to
\[ [V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)] \]
However, as proved in Section 4.4 this holds for arbitrary Cayley numbers $W$ only if
$V(Y) = kV(X)$ where $k$ is a real number.
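Returning to the main problem we make the identification $V(Y) = kV(X)$; then apart from
obvious cancellations and writing $V$ for $V(X)$ we obtain:
\[
\begin{aligned}
&Y \circ [\{X \circ (W \circ X^{-1})\} \circ Y^{-1}] - (Y \circ X) \circ [W \circ (Y \circ X)^{-1}] \\
&= kS(Y)\big\{ (V \circ (W \circ V)) \circ V - V \circ (W \circ (V \circ V)) \big\} \\
&\quad + k^2 S(X)\big\{ -V \circ ((V \circ W) \circ V) + (V \circ V) \circ (W \circ V) \big\} \\
&\quad + kS(X)S(Y)\big\{ -(V \circ W) \circ V + V \circ (W \circ V) \big\} \\
&\quad + k^2\big\{ V \circ \{[V \circ (W \circ V)] \circ V\} - (V \circ V) \circ [W \circ (V \circ V)] \big\}
\end{aligned}
\]
which is easily seen to vanish by applying one or other of the Cayley identities (particularly
(h)).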
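Theorem 4.6 lends itself to a quick numerical experiment. The Python sketch below is illustrative only: it assumes the pair-of-quaternions product $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$ (inferred from the matrix representations later in this chapter), and the names `cmul`, `rotate` and `gap` are hypothetical. It compares the two-step transformation with the single transformation generated by $Y \circ X$, once for a general $Y$ and once for a $Y$ whose vector part is a real multiple of $V(X)$.

```python
import random

def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

def inv(X):
    n = sum(t * t for t in X)
    return (X[0] / n,) + tuple(-t / n for t in X[1:])

def rotate(X, W):
    """The map W -> X o (W o X^{-1})."""
    return cmul(X, cmul(W, inv(X)))

def gap(X, Y, W):
    """Largest coefficient difference between the two sides of Theorem 4.6."""
    two_step = rotate(Y, rotate(X, W))
    one_step = rotate(cmul(Y, X), W)
    return max(abs(s - t) for s, t in zip(two_step, one_step))

random.seed(2)
X = tuple(random.uniform(-1, 1) for _ in range(8))
Y = tuple(random.uniform(-1, 1) for _ in range(8))
W = tuple(random.uniform(-1, 1) for _ in range(8))

# A second Cayley number with V(Y) = k V(X), here k = -1.7 and S(Y) = 0.3.
Y_parallel = (0.3,) + tuple(-1.7 * t for t in X[1:])

gap_general = gap(X, Y, W)
gap_parallel = gap(X, Y_parallel, W)
print(gap_general, gap_parallel)
```

For generic $X$, $Y$, $W$ the first gap is visibly non-zero, whereas the second is zero up to rounding, in line with the theorem.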
Reflections in 7-dimensional Space
If $\hat{p}$ is a unit pure Cayley number then $\hat{p}^{-1} = -\hat{p}$ and the angle of this unit Cayley number is $\pi/2$.
Thus $\hat{p} \circ (W \circ \hat{p}^{-1}) = -\hat{p} \circ (W \circ \hat{p})$ describes a rotation of $V(W)$ through 180° about $\hat{p}$.
Similarly $\hat{p} \circ (W \circ \hat{p})$ describes a reflection in the plane with normal $\hat{p}$. Let $X$ be a Cayley
number whose vector part lies in the plane with normal $\hat{p}$. That is $S(\hat{p} \circ V(X)) = 0$. Then
under the reflection we find (from previous work)
\[ V(\hat{p} \circ (X \circ \hat{p})) = \hat{p} \circ (V(X) \circ \hat{p}) = V(X) \]
That is, elements in the plane normal to $\hat{p}$ are unchanged by the transformation $\hat{p} \circ (X \circ \hat{p})$.
Also in agreement with our interpretation the direction of any vector parallel to $\hat{p}$ is
reversed:
\[ V(\hat{p} \circ ((r\hat{p}) \circ \hat{p})) = -r\hat{p} \]
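Rotations in $\mathbb{R}^8$
We now consider the transformation maps $\phi_L$, $\phi_R$:
\[ \phi_L : \mathbb{R}^8 \to \mathbb{R}^8 \quad \phi_L(X) = Q \circ X \qquad\qquad \phi_R : \mathbb{R}^8 \to \mathbb{R}^8 \quad \phi_R(X) = X \circ Q \]
where $Q, X \in \mathbb{K}$ and $N_Q = 1$. These maps are norm and angle preserving:
\[ N_{Q \circ X} = N_Q N_X = N_X \]
and if $X, Y \in \mathbb{K}$ subtend an angle $\lambda$ before the transformation (say $\phi_L$):
\[ \cos\lambda = \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} \]
then after the transformation the angle subtended is
\[ \cos\lambda' = \frac{S((Q \circ X) \circ (Q \circ Y)^c)}{\sqrt{N_{Q \circ X}}\sqrt{N_{Q \circ Y}}} = \frac{S((Q \circ X) \circ (Y^c \circ Q^c))}{\sqrt{N_X}\sqrt{N_Y}} \]
However, as we have noted earlier
\[ S((Q \circ X) \circ (Z \circ Q^c)) = N_Q\,S(X \circ Z) \]
As a special case
\[ S((Q \circ X) \circ (Y^c \circ Q^c)) = S(X \circ Y^c) \quad\text{if } N_Q = 1 \]
Hence, using this theorem in the change of angle formula we have:
\[ \cos\lambda' = \cos\lambda \]
so the map $\phi_L$ is angle preserving. Similarly, for the map $\phi_R(X) = X \circ Q$
\[
\begin{aligned}
\cos\lambda' &= \frac{S((X \circ Q) \circ (Y \circ Q)^c)}{\sqrt{N_{X \circ Q}}\sqrt{N_{Y \circ Q}}}
= \frac{S((X \circ Q) \circ (Q^c \circ Y^c))}{\sqrt{N_X}\sqrt{N_Y}} \\
&= \frac{S((Q^c \circ Y^c) \circ (X \circ Q))}{\sqrt{N_X}\sqrt{N_Y}}
= \frac{S(Y^c \circ X)}{\sqrt{N_X}\sqrt{N_Y}} \quad\text{by the above theorem} \\
&= \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} = \cos\lambda.
\end{aligned}
\]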
All that is left to be done to show that $\phi_L$, $\phi_R$ describe rotations in $\mathbb{R}^8$ is that these maps
are orientation preserving orthogonal maps. To show that they are orthogonal we can either
determine the matrix representations of the maps and show that they are orthogonal or we
can check that the maps preserve the inner product. We shall consider both approaches.
These maps preserve the inner product since
\[
\begin{aligned}
\langle \phi_L(X), \phi_L(Y) \rangle &= S[(Q \circ X) \circ (Q \circ Y)^c] \\
&= S[(Q \circ X) \circ (Y^c \circ Q^c)] \\
&= S(X \circ Y^c) = \langle X, Y \rangle
\end{aligned}
\]
in which we have used the result obtained above $S((Q \circ X) \circ (Y^c \circ Q^c)) = S(X \circ Y^c)$.
Hence the map $\phi_L$ is orthogonal. The same approach shows that $\phi_R$ is also orthogonal.
To determine matrix representations we need to obtain appropriate basis elements.
4.7 Basis elements for Cayley numbers
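Let
\[ Q = \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} p_0 + p_1 i + p_2 j + p_3 k \\ q_0 + q_1 i + q_2 j + q_3 k \end{bmatrix} \]
Although we could make many choices for basis elements, to conform with accepted
formulations we make the following selection:
\[
e^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\quad
e^{(1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad
e^{(2)} = \begin{bmatrix} i \\ 0 \end{bmatrix},\quad
e^{(3)} = \begin{bmatrix} j \\ 0 \end{bmatrix},
\]
\[
e^{(4)} = \begin{bmatrix} 0 \\ i \end{bmatrix},\quad
e^{(5)} = \begin{bmatrix} k \\ 0 \end{bmatrix},\quad
e^{(6)} = \begin{bmatrix} 0 \\ k \end{bmatrix},\quad
e^{(7)} = \begin{bmatrix} 0 \\ j \end{bmatrix}
\]
As the reader should verify (see exercise) these basis vectors satisfy the following relations
($m, n \neq 0$)
\[
\begin{aligned}
e^{(m)} \circ e^{(m)} &= -e^{(0)} && m \neq 0 \\
e^{(m)} \circ e^{(n)} &= -e^{(n)} \circ e^{(m)} && n \neq m \\
e^{(m)} \circ e^{(m+1)} &= e^{(m+3)} && \text{indices modulo 7}
\end{aligned}
\]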
The triples $(e^{(m)}, e^{(m+1)}, e^{(m+3)})$ (indices modulo 7) form what are called Hamilton
triangles (cf. Porteous [19]). They mimic the orthonormal triplet $(i, j, k)$ of 3-dimensional
space. There are seven Hamilton triangles. In Figure 4.3 only one is highlighted,
$(e^{(1)}, e^{(2)}, e^{(4)})$. The other Hamilton triangles are obtained by moving this triangle so
that the vertex (shown at $e^{(1)}$) moves to the vertex at $e^{(2)}, e^{(3)}, \ldots, e^{(7)}$ in turn.
Figure 4.3
We shall meet Hamilton triangles again in our further analysis of rotations in 8-dimensional
space.
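The relations satisfied by the basis elements can be checked mechanically. The Python sketch below is an illustration, not part of the original text. It assumes that a Cayley number is stored as the eight coefficients of its quaternion pair, and that the product of pairs is $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$, a rule inferred from the matrix representations that follow; the names `cmul`, `POS` and `wrap` are hypothetical.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

def neg(X):
    return tuple(-t for t in X)

# Slot of e(m) in the flat layout (p0, p1, p2, p3, q0, q1, q2, q3):
# e(0)=[1;0], e(1)=[0;1], e(2)=[i;0], e(3)=[j;0],
# e(4)=[0;i], e(5)=[k;0], e(6)=[0;k], e(7)=[0;j].
POS = [0, 4, 1, 2, 5, 3, 7, 6]
e = [tuple(1.0 if s == POS[m] else 0.0 for s in range(8)) for m in range(8)]

def wrap(m):
    """Reduce an index to the range 1..7 (indices modulo 7)."""
    return (m - 1) % 7 + 1

squares_ok = all(cmul(e[m], e[m]) == neg(e[0]) for m in range(1, 8))
anticommute_ok = all(cmul(e[m], e[n]) == neg(cmul(e[n], e[m]))
                     for m in range(1, 8) for n in range(1, 8) if m != n)
triangles_ok = all(cmul(e[m], e[wrap(m + 1)]) == e[wrap(m + 3)]
                   for m in range(1, 8))
print(squares_ok, anticommute_ok, triangles_ok)
```

Under the stated assumption all three families of relations hold, including the seven Hamilton triangles obtained by stepping the index $m$ from 1 to 7.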
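Other triplets can be chosen from $e^{(m)}$, $m = 1, 2, \ldots, 7$. A Cayley triangle is a collection
of three mutually orthogonal unit norm pure Cayley numbers $(\hat{h}, \hat{\ell}, \hat{m})$ with the property
that
\[ S(\hat{h} \circ (\hat{\ell} \circ \hat{m})) = S(\hat{\ell} \circ (\hat{m} \circ \hat{h})) = S(\hat{m} \circ (\hat{h} \circ \hat{\ell})) = 0 \]
(i.e. any one element is orthogonal to the product of the remaining two). For example
$(e^{(1)}, e^{(3)}, e^{(5)})$ is a Cayley triangle:
\[
e^{(1)} \circ e^{(3)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} j \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ j \end{bmatrix},
\qquad
e^{(5)} \circ (e^{(1)} \circ e^{(3)}) = \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} 0 \\ j \end{bmatrix} = \begin{bmatrix} 0 \\ i \end{bmatrix}
\]
so $S(e^{(5)} \circ (e^{(1)} \circ e^{(3)})) = 0$. Similarly $S(e^{(1)} \circ (e^{(3)} \circ e^{(5)})) = 0$ and $S(e^{(3)} \circ (e^{(5)} \circ e^{(1)})) = 0$.
From a single Cayley triangle the complete orthonormal basis for the set of pure Cayley
numbers can be constructed. From the Cayley triplet $(e^{(1)}, e^{(3)}, e^{(5)})$ define $p = e^{(1)} \circ e^{(3)}$;
then the set
\[ (e^{(1)},\; e^{(3)},\; p,\; e^{(5)},\; e^{(1)} \circ e^{(5)},\; e^{(3)} \circ e^{(5)},\; p \circ e^{(5)}) \]
is an orthonormal set and so is a basis for the space of pure Cayley numbers.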
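We can use the present basis vectors to find:
\[
\begin{aligned}
\phi_L(e^{(0)}) &= \begin{bmatrix} p \\ q \end{bmatrix} \circ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} p \\ q \end{bmatrix}
= p_0e^{(0)} + q_0e^{(1)} + p_1e^{(2)} + p_2e^{(3)} + q_1e^{(4)} + p_3e^{(5)} + q_3e^{(6)} + q_2e^{(7)} \\
\phi_L(e^{(1)}) &= \begin{bmatrix} p \\ q \end{bmatrix} \circ \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -q^c \\ p^c \end{bmatrix}
= -q_0e^{(0)} + p_0e^{(1)} + q_1e^{(2)} + q_2e^{(3)} - p_1e^{(4)} + q_3e^{(5)} - p_3e^{(6)} - p_2e^{(7)} \\
\phi_L(e^{(2)}) &= -p_1e^{(0)} - q_1e^{(1)} + p_0e^{(2)} + p_3e^{(3)} + q_0e^{(4)} - p_2e^{(5)} + q_2e^{(6)} - q_3e^{(7)} \\
\phi_L(e^{(3)}) &= -p_2e^{(0)} - q_2e^{(1)} - p_3e^{(2)} + p_0e^{(3)} + q_3e^{(4)} + p_1e^{(5)} - q_1e^{(6)} + q_0e^{(7)} \\
\phi_L(e^{(4)}) &= -q_1e^{(0)} + p_1e^{(1)} - q_0e^{(2)} - q_3e^{(3)} + p_0e^{(4)} + q_2e^{(5)} + p_2e^{(6)} - p_3e^{(7)} \\
\phi_L(e^{(5)}) &= -p_3e^{(0)} - q_3e^{(1)} + p_2e^{(2)} - p_1e^{(3)} - q_2e^{(4)} + p_0e^{(5)} + q_0e^{(6)} + q_1e^{(7)} \\
\phi_L(e^{(6)}) &= -q_3e^{(0)} + p_3e^{(1)} - q_2e^{(2)} + q_1e^{(3)} - p_2e^{(4)} - q_0e^{(5)} + p_0e^{(6)} + p_1e^{(7)} \\
\phi_L(e^{(7)}) &= -q_2e^{(0)} + p_2e^{(1)} + q_3e^{(2)} - q_0e^{(3)} + p_3e^{(4)} - q_1e^{(5)} - p_1e^{(6)} + p_0e^{(7)}
\end{aligned}
\]
Thus the matrix representing the transformation $\phi_L$ is
\[
M_{\phi_L} = \begin{pmatrix}
p_0 & -q_0 & -p_1 & -p_2 & -q_1 & -p_3 & -q_3 & -q_2 \\
q_0 & p_0 & -q_1 & -q_2 & p_1 & -q_3 & p_3 & p_2 \\
p_1 & q_1 & p_0 & -p_3 & -q_0 & p_2 & -q_2 & q_3 \\
p_2 & q_2 & p_3 & p_0 & -q_3 & -p_1 & q_1 & -q_0 \\
q_1 & -p_1 & q_0 & q_3 & p_0 & -q_2 & -p_2 & p_3 \\
p_3 & q_3 & -p_2 & p_1 & q_2 & p_0 & -q_0 & -q_1 \\
q_3 & -p_3 & q_2 & -q_1 & p_2 & q_0 & p_0 & -p_1 \\
q_2 & -p_2 & -q_3 & q_0 & -p_3 & q_1 & p_1 & p_0
\end{pmatrix}
\]
which can be written in partitioned form:
\[ M_{\phi_L} = \begin{pmatrix} A & B \\ -B^T & D \end{pmatrix} \]
where
\[
A = \begin{pmatrix}
p_0 & -q_0 & -p_1 & -p_2 \\
q_0 & p_0 & -q_1 & -q_2 \\
p_1 & q_1 & p_0 & -p_3 \\
p_2 & q_2 & p_3 & p_0
\end{pmatrix},\quad
B = \begin{pmatrix}
-q_1 & -p_3 & -q_3 & -q_2 \\
p_1 & -q_3 & p_3 & p_2 \\
-q_0 & p_2 & -q_2 & q_3 \\
-q_3 & -p_1 & q_1 & -q_0
\end{pmatrix},\quad
D = \begin{pmatrix}
p_0 & -q_2 & -p_2 & p_3 \\
q_2 & p_0 & -q_0 & -q_1 \\
p_2 & q_0 & p_0 & -p_1 \\
-p_3 & q_1 & p_1 & p_0
\end{pmatrix}
\]
A prolonged calculation confirms what we already know, that $M_{\phi_L}$ is orthogonal
($M_{\phi_L}M_{\phi_L}^T = I$). This is checked by showing
\[ AA^T + BB^T = (N_p + N_q)I \quad\text{and}\quad N_p + N_q = 1 \]
and
\[ -AB + BD^T = 0 \]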
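The "prolonged calculation" can be delegated to a few lines of code. The Python sketch below is illustrative only: it assumes the product of quaternion pairs $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$ and the basis ordering $e^{(0)}, \ldots, e^{(7)}$ of this section; the names `left_matrix`, `coords` and `det` are hypothetical. It builds the matrix of $X \mapsto Q \circ X$ column by column for a random (not necessarily unit) $Q$ and tests orthogonality up to the factor $N_Q$.

```python
import random

def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

# Slot of e(m) in the flat layout (p0, p1, p2, p3, q0, q1, q2, q3).
POS = [0, 4, 1, 2, 5, 3, 7, 6]
e = [tuple(1.0 if s == POS[m] else 0.0 for s in range(8)) for m in range(8)]

def coords(X):
    """Coordinates of X with respect to e(0), ..., e(7)."""
    return [X[POS[m]] for m in range(8)]

def left_matrix(Q):
    """Matrix of X -> Q o X; column m holds the coordinates of Q o e(m)."""
    cols = [coords(cmul(Q, e[m])) for m in range(8)]
    return [[cols[c][r] for c in range(8)] for r in range(8)]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-14:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

random.seed(3)
Q = tuple(random.uniform(-1, 1) for _ in range(8))
NQ = sum(t * t for t in Q)
M = left_matrix(Q)
MMt = [[sum(M[r][k] * M[s][k] for k in range(8)) for s in range(8)]
       for r in range(8)]
print(det(M), NQ ** 4)
```

With $N_Q = 1$ this reduces to $M_{\phi_L}M_{\phi_L}^T = I$; for general $Q$ the product is $N_Q I$ and the determinant comes out as $N_Q^4$.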
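Also, as we have for any orthogonal matrix:
\[ \det(M_{\phi_L}M_{\phi_L}^T) = \det I = 1 \quad\therefore\quad \det M_{\phi_L} = \pm 1 \]
However, by inspection:
\[ \det M_{\phi_L} = p_0^8 + f(p_1, p_2, p_3, q_0, q_1, q_2, q_3) \]
in which $f$ is some function of its arguments. Now since the coefficients $p_i, q_i$; $i = 0, 1, 2, 3$
are independent it must follow* that
\[ \det M_{\phi_L} = +1 \]
and so the map $\phi_L$ preserves orientation and so is a member of $SO(8)$ and therefore
represents a rotation in $\mathbb{R}^8$.

* (In fact the reader should be able to argue that necessarily: $\det M_{\phi_L} = (p_0^2 + p_1^2 + p_2^2 + p_3^2 + q_0^2 + q_1^2 + q_2^2 + q_3^2)^4$)

A similar deduction can be made for $\phi_R$. Here, in terms of $e^{(i)}$, $i = 0, \ldots, 7$:
\[
\begin{aligned}
\phi_R(e^{(0)}) &= [p_0, q_0, p_1, p_2, q_1, p_3, q_3, q_2] \\
\phi_R(e^{(1)}) &= \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} -q \\ p \end{bmatrix}
= \begin{bmatrix} -q_0 - q_1 i - q_2 j - q_3 k \\ p_0 + p_1 i + p_2 j + p_3 k \end{bmatrix}
= [-q_0, p_0, -q_1, -q_2, p_1, -q_3, p_3, p_2] \\
\phi_R(e^{(2)}) &= \begin{bmatrix} i \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} ip \\ -iq \end{bmatrix}
= \begin{bmatrix} -p_1 + p_0 i - p_3 j + p_2 k \\ q_1 - q_0 i + q_3 j - q_2 k \end{bmatrix}
= [-p_1, q_1, p_0, -p_3, -q_0, p_2, -q_2, q_3] \\
\phi_R(e^{(3)}) &= \begin{bmatrix} j \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} jp \\ -jq \end{bmatrix}
= \begin{bmatrix} -p_2 + p_3 i + p_0 j - p_1 k \\ q_2 - q_3 i - q_0 j + q_1 k \end{bmatrix}
= [-p_2, q_2, p_3, p_0, -q_3, -p_1, q_1, -q_0] \\
\phi_R(e^{(4)}) &= \begin{bmatrix} 0 \\ i \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qi \\ pi \end{bmatrix}
= \begin{bmatrix} -q_1 + q_0 i + q_3 j - q_2 k \\ -p_1 + p_0 i + p_3 j - p_2 k \end{bmatrix}
= [-q_1, -p_1, q_0, q_3, p_0, -q_2, -p_2, p_3] \\
\phi_R(e^{(5)}) &= \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} kp \\ -kq \end{bmatrix}
= \begin{bmatrix} -p_3 - p_2 i + p_1 j + p_0 k \\ q_3 + q_2 i - q_1 j - q_0 k \end{bmatrix}
= [-p_3, q_3, -p_2, p_1, q_2, p_0, -q_0, -q_1] \\
\phi_R(e^{(6)}) &= \begin{bmatrix} 0 \\ k \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qk \\ pk \end{bmatrix}
= \begin{bmatrix} -q_3 + q_2 i - q_1 j + q_0 k \\ -p_3 + p_2 i - p_1 j + p_0 k \end{bmatrix}
= [-q_3, -p_3, q_2, -q_1, p_2, q_0, p_0, -p_1] \\
\phi_R(e^{(7)}) &= \begin{bmatrix} 0 \\ j \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qj \\ pj \end{bmatrix}
= \begin{bmatrix} -q_2 - q_3 i + q_0 j + q_1 k \\ -p_2 - p_3 i + p_0 j + p_1 k \end{bmatrix}
= [-q_2, -p_2, -q_3, q_0, -p_3, q_1, p_1, p_0]
\end{aligned}
\]
\[
M_{\phi_R} = \begin{pmatrix}
p_0 & -q_0 & -p_1 & -p_2 & -q_1 & -p_3 & -q_3 & -q_2 \\
q_0 & p_0 & q_1 & q_2 & -p_1 & q_3 & -p_3 & -p_2 \\
p_1 & -q_1 & p_0 & p_3 & q_0 & -p_2 & q_2 & -q_3 \\
p_2 & -q_2 & -p_3 & p_0 & q_3 & p_1 & -q_1 & q_0 \\
q_1 & p_1 & -q_0 & -q_3 & p_0 & q_2 & p_2 & -p_3 \\
p_3 & -q_3 & p_2 & -p_1 & -q_2 & p_0 & q_0 & q_1 \\
q_3 & p_3 & -q_2 & q_1 & -p_2 & -q_0 & p_0 & p_1 \\
q_2 & p_2 & q_3 & -q_0 & p_3 & -q_1 & -p_1 & p_0
\end{pmatrix}
\]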
which again can be written in partitioned form:
F
C
in which D is as defined in M^L and F, C are defined from the matrix.
We saw, at the equivalent stage in our discussion of quaternions that some matrix
representations were easier to interpret than others. The same is true here. The matrix
representations developed here for $M_{\phi_L}$ and $M_{\phi_R}$ seem to have no direct correspondence
with simpler objects such as quaternions. We remember that the quaternions could be
expressed in terms of 2 x 2 matrices over the field of complex numbers. We might think
that we could represent a Cayley number $Q \in \mathbb{K}$ by an $8 \times 8$ matrix $Q_m \in M_{8 \times 8}$.
Unfortunately an isomorphic map of this kind cannot exist for Cayley numbers since
Cayley algebra is non-associative whereas matrix algebra is associative. However, in the
case of maps representing rotations some simplification is possible if we make the alternative
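choice for basis vectors:
\[
E^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\quad
E^{(1)} = \begin{bmatrix} i \\ 0 \end{bmatrix},\quad
E^{(2)} = \begin{bmatrix} j \\ 0 \end{bmatrix},\quad
E^{(3)} = \begin{bmatrix} k \\ 0 \end{bmatrix},
\]
\[
E^{(4)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\quad
E^{(5)} = \begin{bmatrix} 0 \\ i \end{bmatrix},\quad
E^{(6)} = \begin{bmatrix} 0 \\ j \end{bmatrix},\quad
E^{(7)} = \begin{bmatrix} 0 \\ k \end{bmatrix}
\]
then, in terms of $E^{(i)}$, $i = 0, \ldots, 7$
\[
\begin{aligned}
\phi_R(E^{(0)}) &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [p_0, p_1, p_2, p_3, q_0, q_1, q_2, q_3] \\
\phi_R(E^{(1)}) &= \begin{bmatrix} i \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_1, p_0, -p_3, p_2, q_1, -q_0, q_3, -q_2] \\
\phi_R(E^{(2)}) &= \begin{bmatrix} j \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_2, p_3, p_0, -p_1, q_2, -q_3, -q_0, q_1] \\
\phi_R(E^{(3)}) &= \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_3, -p_2, p_1, p_0, q_3, q_2, -q_1, -q_0] \\
\phi_R(E^{(4)}) &= \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_0, -q_1, -q_2, -q_3, p_0, p_1, p_2, p_3] \\
\phi_R(E^{(5)}) &= \begin{bmatrix} 0 \\ i \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_1, q_0, q_3, -q_2, -p_1, p_0, p_3, -p_2] \\
\phi_R(E^{(6)}) &= \begin{bmatrix} 0 \\ j \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_2, -q_3, q_0, q_1, -p_2, -p_3, p_0, p_1] \\
\phi_R(E^{(7)}) &= \begin{bmatrix} 0 \\ k \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_3, q_2, -q_1, q_0, -p_3, p_2, -p_1, p_0]
\end{aligned}
\]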
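then we find the matrix representation of the map is
\[
M_{\phi_R} = \begin{pmatrix}
p_0 & -p_1 & -p_2 & -p_3 & -q_0 & -q_1 & -q_2 & -q_3 \\
p_1 & p_0 & p_3 & -p_2 & -q_1 & q_0 & -q_3 & q_2 \\
p_2 & -p_3 & p_0 & p_1 & -q_2 & q_3 & q_0 & -q_1 \\
p_3 & p_2 & -p_1 & p_0 & -q_3 & -q_2 & q_1 & q_0 \\
q_0 & q_1 & q_2 & q_3 & p_0 & -p_1 & -p_2 & -p_3 \\
q_1 & -q_0 & -q_3 & q_2 & p_1 & p_0 & -p_3 & p_2 \\
q_2 & q_3 & -q_0 & -q_1 & p_2 & p_3 & p_0 & -p_1 \\
q_3 & -q_2 & q_1 & -q_0 & p_3 & -p_2 & p_1 & p_0
\end{pmatrix}
= \begin{pmatrix} P_R^{Rot} & -(Q_R^{Ref})^T \\ Q_R^{Ref} & P_L^{Rot} \end{pmatrix}
\]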
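It is easily checked that
\[ P_R^{Rot}(P_R^{Rot})^T = N_pI \qquad (Q_R^{Ref})^T(Q_R^{Ref}) = N_qI \qquad (P_L^{Rot})(P_L^{Rot})^T = N_pI \]
and
\[ (P_R^{Rot})(Q_R^{Ref})^T - (Q_R^{Ref})^T(P_L^{Rot})^T = 0 \]
also
\[ \det P_L^{Rot} = \det P_R^{Rot} = N_p^2, \qquad \det Q_R^{Ref} = -N_q^2 \]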
In a very obvious way we can (by referring back to the work on quaternions) interpret
$P_R^{Rot}$, $P_L^{Rot}$ as the matrix representations of maps in $\mathbb{R}^4$ which correspond to right and
left rotations together with an expansion (due to the factor $\sqrt{N_p}$). Also $Q_R^{Ref}$ can be
interpreted as being the matrix representation of a right reflection in $\mathbb{R}^4$ together with an
expansion (due to the factor $\sqrt{N_q}$). Repeating this calculation for the map $\phi_L$ we find the
matrix representation is
\[
M_{\phi_L} = \begin{pmatrix}
p_0 & -p_1 & -p_2 & -p_3 & -q_0 & -q_1 & -q_2 & -q_3 \\
p_1 & p_0 & -p_3 & p_2 & q_1 & -q_0 & q_3 & -q_2 \\
p_2 & p_3 & p_0 & -p_1 & q_2 & -q_3 & -q_0 & q_1 \\
p_3 & -p_2 & p_1 & p_0 & q_3 & q_2 & -q_1 & -q_0 \\
q_0 & -q_1 & -q_2 & -q_3 & p_0 & p_1 & p_2 & p_3 \\
q_1 & q_0 & q_3 & -q_2 & -p_1 & p_0 & p_3 & -p_2 \\
q_2 & -q_3 & q_0 & q_1 & -p_2 & -p_3 & p_0 & p_1 \\
q_3 & q_2 & -q_1 & q_0 & -p_3 & p_2 & -p_1 & p_0
\end{pmatrix}
= \begin{pmatrix} P_L^{Rot} & -(Q_R^{Rot})^T \\ Q_R^{Rot} & (P_L^{Rot})^T \end{pmatrix}
\]
in which $Q_R^{Rot}$ is a right rotation with expansion (due to the factor $\sqrt{N_q}$) and $P_L^{Rot}$ is a left
rotation with expansion (due to the factor $\sqrt{N_p}$).
4.8 Geometry of 8-dimensional Rotations
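Let $Q$ be a unit norm Cayley number with vector part parallel to $\hat{q}$:
\[ Q = \cos\theta + \hat{q}\sin\theta \qquad N_Q = 1, \quad \hat{q} \circ \hat{q} = -1 \]
and let $X$ be any Cayley number. Now, defining $X' = \hat{q} \circ X$, we have:
\[
\begin{aligned}
Q \circ X &= X\cos\theta + X'\sin\theta \\
Q \circ X' &= -X\sin\theta + X'\cos\theta
\end{aligned}
\]
So a Cayley multiplication on the left rotates elements in the plane containing $X, X'$ by
angle $\theta$. Choosing $X = 1$ then $X' = \hat{q}$ and so the plane containing the elements 1 and $\hat{q}$ is
left invariant:
\[
\begin{aligned}
Q \circ 1 &= 1\cos\theta + \hat{q}\sin\theta \\
Q \circ \hat{q} &= -1\sin\theta + \hat{q}\cos\theta
\end{aligned}
\]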
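The two displayed relations, and the fact that $X$ and $X'$ really do span a plane in which $Q\circ$ acts as a rotation, can be confirmed numerically. The Python sketch below is illustrative only; it assumes the pair-of-quaternions product $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$ (inferred from the matrices of Section 4.7), and the names `cmul`, `comb` and `close` are hypothetical.

```python
import math
import random

def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

def comb(s, A, t, B):
    """Linear combination s*A + t*B of two Cayley numbers."""
    return tuple(s * u + t * v for u, v in zip(A, B))

def close(A, B, tol=1e-9):
    return max(abs(u - v) for u, v in zip(A, B)) < tol

random.seed(4)
theta = 0.7
raw = [0.0] + [random.uniform(-1, 1) for _ in range(7)]
length = math.sqrt(sum(t * t for t in raw))
q_hat = tuple(t / length for t in raw)            # unit pure Cayley number
one = (1.0,) + (0.0,) * 7
Q = comb(math.cos(theta), one, math.sin(theta), q_hat)

X = tuple(random.uniform(-1, 1) for _ in range(8))
X1 = cmul(q_hat, X)                               # X' = q o X

first_ok = close(cmul(Q, X), comb(math.cos(theta), X, math.sin(theta), X1))
second_ok = close(cmul(Q, X1), comb(-math.sin(theta), X, math.cos(theta), X1))
inner = sum(u * v for u, v in zip(X, X1))         # <X, X'>
print(first_ok, second_ok, inner)
```

Since $X'$ has the same norm as $X$ and is orthogonal to it, the pair $(X, X')$ is an orthogonal frame for the plane, and the two relations are exactly those of a plane rotation through $\theta$.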
Multiplication on the right by $Q$ also rotates elements in the plane containing elements $X$
and $X''$ $(= X \circ \hat{q})$ through angle $\theta$ in the same direction as left multiplication. Thus,
following multiplication on the right by $Q$, elements in the plane containing 1 and $\hat{q}$ rotate
through angle $\theta$ in the same direction as left multiplication. Multiplication by $Q^{-1}$
rotates these elements through angle $\theta$ but in the opposite direction. In the corresponding
discussion on quaternions we saw that quaternion multiplication on the left (or right)
rotated elements in the plane (in the space of pure quaternions) perpendicular to $\hat{q}$. For
Cayley multiplication the situation is somewhat more complicated. We first show that the
space of pure Cayley numbers has an orthonormal basis of seven elements based on the
element $\hat{q}$ (c.f. $e^{(m)}$, $m = 1, 2, \ldots, 7$ seen above). Having chosen $\hat{q}$ (via $Q$) choose two
other elements $\hat{\ell}, \hat{m}$ so that $(\hat{q}, \hat{\ell}, \hat{m})$ is an orthonormal set and so that $\hat{m}$ is orthogonal to
$\hat{q} \circ \hat{\ell}$:
\[ \hat{m} \circ (\hat{q} \circ \hat{\ell}) = -(\hat{q} \circ \hat{\ell}) \circ \hat{m} \]
We can quickly deduce that these assumptions alone imply
\[ \hat{q} \circ (\hat{\ell} \circ \hat{m}) = -(\hat{\ell} \circ \hat{m}) \circ \hat{q} \quad\text{and}\quad \hat{\ell} \circ (\hat{m} \circ \hat{q}) = -(\hat{m} \circ \hat{q}) \circ \hat{\ell} \]
so that $(\hat{q}, \hat{\ell}, \hat{m})$ is a Cayley triangle.
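proof
\[
\begin{aligned}
\hat{q} \circ (\hat{\ell} \circ \hat{m}) + (\hat{q} \circ \hat{\ell}) \circ \hat{m} &= \hat{q} \circ (\hat{\ell} \circ \hat{m}) - \hat{m} \circ (\hat{q} \circ \hat{\ell}) \\
&= -\hat{q} \circ (\hat{m} \circ \hat{\ell}) - \hat{m} \circ (\hat{q} \circ \hat{\ell}) \\
&= [(\hat{q})^2 + (\hat{m})^2] \circ \hat{\ell} - (\hat{q} + \hat{m}) \circ (\hat{q} \circ \hat{\ell} + \hat{m} \circ \hat{\ell}) \\
&= [(\hat{q})^2 + (\hat{m})^2] \circ \hat{\ell} - (\hat{q} + \hat{m}) \circ ((\hat{q} + \hat{m}) \circ \hat{\ell}) \\
&= [(\hat{q})^2 + (\hat{m})^2] \circ \hat{\ell} - (\hat{q} + \hat{m})^2 \circ \hat{\ell} \\
&= 0
\end{aligned}
\]
since $\hat{q} \circ \hat{m} = -\hat{m} \circ \hat{q}$. Also, taking conjugates:
\[ (\hat{m} \circ \hat{\ell}) \circ \hat{q} + \hat{m} \circ (\hat{\ell} \circ \hat{q}) = 0 \]
Thus, using the above two relations:
\[
\begin{aligned}
2S(\hat{q} \circ (\hat{\ell} \circ \hat{m})) &= -(\hat{m} \circ \hat{\ell}) \circ \hat{q} + \hat{q} \circ (\hat{\ell} \circ \hat{m}) \\
&= \hat{m} \circ (\hat{\ell} \circ \hat{q}) - (\hat{q} \circ \hat{\ell}) \circ \hat{m} \\
&= \hat{m} \circ (\hat{\ell} \circ \hat{q}) + (\hat{\ell} \circ \hat{q}) \circ \hat{m} = 0
\end{aligned}
\]
since $\hat{m}$ is orthogonal to $(\hat{\ell} \circ \hat{q})$. By a very similar calculation it is confirmed that
\[ S(\hat{\ell} \circ (\hat{q} \circ \hat{m})) = 0 \]
so that $(\hat{q}, \hat{\ell}, \hat{m})$ is a Cayley triangle.
Now, as seen earlier, from this single Cayley triangle a basis for the set of pure Cayley
numbers can be constructed. Let $\hat{n} = \hat{q} \circ \hat{\ell}$; then the collection
\[ (\hat{q},\; \hat{\ell},\; \hat{n},\; \hat{m},\; \hat{q} \circ \hat{m},\; \hat{\ell} \circ \hat{m},\; \hat{n} \circ \hat{m}) \]
is the required orthonormal basis. See Figure 4.4.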
The seven Hamilton triangles are obtained as usual. Of these seven only three involve the
element $\hat{q}$ directly:
\[ (\hat{q}, \hat{\ell}, \hat{n}) \qquad (\hat{q}, \hat{m}, \hat{q} \circ \hat{m}) \qquad (\hat{q}, \hat{n} \circ \hat{m}, \hat{\ell} \circ \hat{m}) \]
which are shown highlighted.
Figure 4.4
The discussion now follows that for quaternions. When multiplying on the left, Cayley
multiplication by $Q = \cos\theta + \hat{q}\sin\theta$ rotates elements in the three planes containing $(\hat{\ell}, \hat{n})$;
$(\hat{m}, \hat{q} \circ \hat{m})$ and $(\hat{n} \circ \hat{m}, \hat{\ell} \circ \hat{m})$ about the $\hat{q}$-axis through an angle $\theta$ in a positive direction
(according to the right-hand rule). Also multiplying on the right by the Cayley number $Q$
rotates elements in these planes through an angle $\theta$ in a negative direction according to
the right-hand rule.
Since an arbitrary Cayley number can be written uniquely in terms of the basis elements
1, $\hat{q}$ and the three pairs of elements taken from the Hamilton triangles containing $\hat{q}$ then
under the operation $Q \circ (X \circ Q^{-1})$ we see that
(i) Elements in the plane containing 1 and $\hat{q}$ are unaffected since $Q\,\circ$ rotates elements
through angle $\theta$ whilst $\circ\,Q^{-1}$ rotates them back, through $\theta$, to their original position.
(ii) Elements in the three Hamilton triangles with common element $\hat{q}$ rotate through
a positive angle $\theta$ about axis $\hat{q}$ when multiplied on the left by $Q$ and then by a further
positive angle $\theta$ when multiplied on the right by $Q^{-1}$: a total rotation of $2\theta$ about the
axis $\hat{q}$. Thus the effect of $Q \circ (X \circ Q^{-1})$ is to effect a rotation in the space of pure Cayley
numbers of $2\theta$ about axis $\hat{q}$. This is precisely what we found when the detailed analysis of
$Q \circ (X \circ Q^{-1})$ was undertaken.
Reflections in 8-dimensional space
As with complex numbers and quaternions, reflections are generated by the conjugate map.
The simplest reflection is:
\[ \phi : \mathbb{R}^8 \to \mathbb{R}^8 \qquad \phi(W) = W^c \]
with matrix representation
\[ A_\phi = \mathrm{diag}[1, -1, -1, -1, -1, -1, -1, -1] \]
and so $A_\phi^{-1} = A_\phi^T$ with $\det A_\phi = -1$. The geometrical interpretation is a reflection in the
scalar axis (the scalar axis is left fixed by this map).
Next we consider the right and left reflections:
\[ \phi_L(W) = Q \circ W^c \qquad \phi_R(W) = W^c \circ Q \qquad N_Q = 1 \]
Here $\phi_L$ is interpreted as a reflection in the scalar axis followed by a left rotation whilst
$\phi_R$ is interpreted as a reflection in the scalar axis followed by a right rotation. These two
lead to the axial reflection:
\[ \phi_a(W) = Q \circ (W^c \circ Q) \qquad N_Q = 1 \]
This map is orthogonal since it is norm preserving. Also under this transformation the
line $rQ$, $r \in \mathbb{R}$, remains invariant:
\[ Q \circ ((rQ)^c \circ Q) = rQ \]
I leave it as an exercise for the reader to show that the matrix representation for this
transformation $A_{\phi_a}$ is such that $\det A_{\phi_a} = -1$. The other type of reflection is called a
simple reflection:
\[ \lambda : \mathbb{R}^8 \to \mathbb{R}^8 \qquad \lambda(W) = -Q \circ (W^c \circ Q) \]
and corresponds to a reflection in the plane with normal $Q$. To see this let $Y$ be a Cayley
number lying in the plane normal to $Q$. That is:
\[ S(Y \circ Q^c) = 0 \quad\text{or}\quad Y \circ Q^c + Q \circ Y^c = 0 \]
then
\[ \lambda(Y) = -Q \circ (Y^c \circ Q) = -(Q \circ Y^c) \circ Q = (Y \circ Q^c) \circ Q = Y \]
That is, it is unaffected by this reflection.
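Both reflections are easy to test numerically. The Python sketch below is illustrative only: it assumes the pair-of-quaternions product $(a,b)\circ(c,d) = (ac - d\,b^c,\; a^c d + c\,b)$ (inferred from the matrices of Section 4.7), and the function names `axial_reflection` and `simple_reflection` are hypothetical labels for the maps $\phi_a$ and $\lambda$.

```python
import math
import random

def qmul(a, b):
    """Hamilton product of quaternions stored as (1, i, j, k) coefficients."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Assumed Cayley product: (a,b)(c,d) = (ac - d b^c, a^c d + c b)."""
    a, b, c, d = X[:4], X[4:], Y[:4], Y[4:]
    first = [u - v for u, v in zip(qmul(a, c), qmul(d, qconj(b)))]
    second = [u + v for u, v in zip(qmul(qconj(a), d), qmul(c, b))]
    return tuple(first + second)

def conj(X):
    """Cayley conjugate: negate every non-scalar coefficient."""
    return (X[0],) + tuple(-t for t in X[1:])

def axial_reflection(Q, W):
    """phi_a(W) = Q o (W^c o Q)."""
    return cmul(Q, cmul(conj(W), Q))

def simple_reflection(Q, W):
    """lambda(W) = -Q o (W^c o Q)."""
    return tuple(-t for t in axial_reflection(Q, W))

def close(A, B, tol=1e-9):
    return max(abs(u - v) for u, v in zip(A, B)) < tol

random.seed(5)
raw = [random.uniform(-1, 1) for _ in range(8)]
length = math.sqrt(sum(t * t for t in raw))
Q = tuple(t / length for t in raw)                 # unit norm Cayley number

W = tuple(random.uniform(-1, 1) for _ in range(8))
k = sum(u * v for u, v in zip(W, Q))               # <W, Q> = S(W o Q^c)
Y = tuple(w - k * t for w, t in zip(W, Q))         # component of W normal to Q

rQ = tuple(2.5 * t for t in Q)
print(close(axial_reflection(Q, rQ), rQ),
      close(simple_reflection(Q, Y), Y),
      close(simple_reflection(Q, Q), tuple(-t for t in Q)))
```

The axial reflection fixes the line through $Q$, while the simple reflection fixes the hyperplane normal to $Q$ and reverses $Q$ itself.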
Appendix 1
Clifford Algebras
This appendix relies heavily on the article by Riesz [20]. I have tried to argue in this text
that, based almost entirely on their natural relation to the other normed algebras $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$,
Cayley numbers are deserving of more attention. However, as this appendix briefly
describes, and as is well known, the real numbers, the complex numbers, the ordinary
and complexified quaternion numbers are also particular examples from the important
class of Clifford Algebras which, for completeness, we now introduce. But we should keep
in mind that although the Clifford Algebras can be thought of as being more fundamental
than $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, there is an important algebra missing from this general class; all the Clifford
algebras being associative cannot include the Cayley numbers.
We have already noted the Hurwitz theorem: "If the sum of $n$ squares times the sum of $n$
squares is again a sum of $n$ squares in which the last sum is comprised of terms computed
bilinearly from the terms of the first two sums, then $n$ takes one of only four possible
values, 1, 2, 4, 8." That is if
\[ \Big(\sum_{i=1}^n a_i^2\Big)\Big(\sum_{i=1}^n b_i^2\Big) = \sum_{i=1}^n c_i^2 \]
in which
\[ c_i = \sum_{j=1}^n \sum_{k=1}^n \alpha_{ijk}\,a_j b_k \]
where $\alpha_{ijk}$ are constants then this can only be true if $n = 1, 2, 4, 8$. This result of course is
intimately related to the well known result, due to Hurwitz, that there exist only four normed
algebras over the reals; $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, $\mathbb{K}$.
A related problem is to ask if (and when) it is possible to write the square of a linear
expression of $n$ terms as the sum of $n$ squares. That is
\[ \Big(\sum_{i=1}^n a_i v_i\Big)^2 = \sum_{i=1}^n a_i^2 \qquad (*) \]
in which $a_i$ are scalars and $v_i$ are entities whose algebra is yet to be discovered. In a very
imprecise sense this is a kind of "square root" of the problem above. In this problem we
can make some immediate deductions
\[ \Big(\sum_{i=1}^n a_i v_i\Big)^2 = \sum_{i=1}^n a_i^2 v_i^2 + \sum_{i<j} a_i a_j (v_i v_j + v_j v_i) \]
in which the product $v_i v_j$ is not assumed to be commutative. It follows that (*) will
be satisfied if
\[ v_i v_j + v_j v_i = 2\delta_{ij} \]
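Now consider a vector space $V_n$ over the reals or the complex numbers (scalars). We define
a scalar product on $V_n$ as follows: to each pair $x, y \in V_n$ we associate a scalar $\langle x, y \rangle$
with the properties
(i) $\langle x, y \rangle = \langle y, x \rangle$
(ii) $\langle \alpha x + \beta y, z \rangle = \alpha\langle x, z \rangle + \beta\langle y, z \rangle$, where $\alpha, \beta$ are scalars
Theorem There exists an orthogonal basis for $V_n$, that is $\{v_1, v_2, \ldots, v_n\}$ such that
\[ \langle v_i, v_j \rangle = 0 \qquad i \neq j \]
proof
Assume the existence of two elements $x, y \in V_n$ such that $\langle x, y \rangle \neq 0$ (if no such pair
existed then any basis for $V_n$ is orthogonal).
From the identity (using (ii))
\[ \langle x + y, x + y \rangle = \langle x, x \rangle + 2\langle x, y \rangle + \langle y, y \rangle \]
we easily deduce that there exists a vector whose inner product with itself is non-vanishing. For if
$\langle x, x \rangle \neq 0$ or $\langle y, y \rangle \neq 0$ then we have nothing to prove. However if $\langle x, x \rangle = 0$ and
$\langle y, y \rangle = 0$ then the above gives
\[ \langle x + y, x + y \rangle = 2\langle x, y \rangle \neq 0 \]
Thus let $v_1 \in V_n$ be such that $\langle v_1, v_1 \rangle \neq 0$. For any other $x \in V_n$ we can always write
\[ x = x^T + \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\,v_1 \]
Now, using (ii) we easily find
\[ \langle x, v_1 \rangle = \Big\langle x^T + \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\,v_1,\; v_1 \Big\rangle = \langle x^T, v_1 \rangle + \langle x, v_1 \rangle \]
and so $\langle x^T, v_1 \rangle = 0$, that is, $x^T$ is orthogonal to $v_1$.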
The space $W_1$ of all elements $x^T \in V_n$ such that $\langle x^T, v_1 \rangle = 0$ forms an $(n - 1)$-dimensional
subspace of $V_n$. We can now repeat this construction for the space $W_1$. As
above we assume there exist two vectors $u, v \in W_1$ such that $\langle u, v \rangle \neq 0$. (If no such
pair exists then $v_1$ and any basis in $W_1$ is an orthogonal basis for $V_n$.) In this case we can
always find an element $v_2 \in W_1$ such that $\langle v_2, v_2 \rangle \neq 0$ which allows $W_1$ to be written
as the direct sum of $v_2$ and the space $W_2 \subset W_1$ of vectors orthogonal to $v_2$.
In this way we can construct a basis $\{v_1, v_2, \ldots, v_n\}$ and find a number $r$ such that (when
properly arranged)
\[
\begin{aligned}
\langle v_i, v_j \rangle &= 0 && i \neq j \\
\langle v_i, v_i \rangle &= N_{v_i} \neq 0 && i \leq r \;(\leq n) \\
\text{and}\quad \langle v_i, v_i \rangle &= 0 && (r + 1) \leq i \leq n
\end{aligned}
\]
A typical element $x \in V_n$ can be written in terms of this orthogonal basis
\[ x = \sum_{i=1}^n x_i v_i \]
\[ \langle x, x \rangle = \sum_{i,j=1}^n g_{ij}\,x_i x_j \qquad g_{ij} = \langle v_i, v_j \rangle \]
\[ = \sum_{i=1}^n g_{ii}\,(x_i)^2 \]
The matrix $g_{ij}$ is called the metric and the number $r$ is clearly the rank of the matrix
$g_{ij}$. There are only two possibilities (i) $r = n$, (ii) $r < n$. If $r = n$ then $g_{ij}$ is said to be
non-singular (this occurs if and only if $\det(g_{ij}) \neq 0$). If $r < n$ then $g_{ij}$ is singular. We
shall assume that $r = n$ and so the metric is non-singular.
Then for any two elements $a, b \in V_n$
\[ a = \sum_{i=1}^n a_i v_i \quad\text{and}\quad b = \sum_{i=1}^n b_i v_i \]
with inner product
\[ \langle a, b \rangle = \sum_{i,j=1}^n g_{ij}\,a_i b_j \]
Given a definition for the scalar product the explicit components for the metric can be
found. Conversely, given the metric $g_{ij}$ ($g_{ij} = g_{ji}$) we can find the scalar product of two
elements. There are two cases of interest. If $g_{ij} = \delta_{ij}$ we have the Euclidean metric whilst
if $g_{11} = 1$, $g_{ii} = -1$, $i \neq 1$ we have the Lorentz metric. This allows us to give a meaning
to the product of elements of $V_n$. We define
\[ a^2 = \langle a, a \rangle \]
(In some discussions on this subject the choice $a^2 = -\langle a, a \rangle$ is made.) From this
definition we easily obtain
\[ (a + b)^2 = \langle a + b, a + b \rangle \]
\[ \text{i.e.}\quad a^2 + ab + ba + b^2 = \langle a, a \rangle + 2\langle a, b \rangle + \langle b, b \rangle \]
using the usual properties of the inner product and again not assuming that the product $ab$
is commutative. We immediately deduce
\[ \langle a, b \rangle = \tfrac{1}{2}(ab + ba) \]
Obviously, for an orthogonal basis $\{v_1, v_2, \ldots, v_n\}$
\[ \langle v_i, v_j \rangle = 0 \quad i \neq j \qquad\text{and so}\qquad v_i v_j + v_j v_i = 0 \quad i \neq j \]
and (from $\langle a, b \rangle = \sum_{i,j=1}^n g_{ij}\,a_i b_j$) we have
\[ \langle v_k, v_k \rangle = \sum_{i,j=1}^n g_{ij}\,\delta_{ik}\delta_{jk} = g_{kk} \]
(since $v_k = \sum_{p=1}^n \delta_{kp} v_p$)
\[ \therefore\; v_i v_j + v_j v_i = 2g_{ij} \]
The basis elements $\{v_1, v_2, \ldots, v_n\}$ generate an algebra $C$. That is, all product
combinations
\[ 1,\; v_1, v_2, \ldots, v_n,\; v_1v_2, v_1v_3, \ldots,\; v_1v_2v_3, \ldots,\; \ldots,\; v_1v_2v_3 \cdots v_n \]
(including the empty product, regarded as 1) generate the algebra. We shall prove that
there are in total $2^n$ independent elements here.
Clearly, any product involving the $v_i$ terms may be reduced, up to a sign, to a product of distinct basis
elements of $C$. For example (for a Euclidean metric)
\[ v_1v_3v_3v_4v_2v_3 = v_1v_4v_2v_3 = -v_1v_2v_4v_3 = v_1v_2v_3v_4 \]
(we are assuming associativity). For the space $V_n$ with $n$ basis elements $v_1, v_2, \ldots, v_n$
then every product can be reduced to one containing not more than $n$ factors. The term
$v_1v_2 \cdots v_n$ is called the pseudoscalar.
The product of two elements of C is defined in an obvious way: for example consider V\V2V±
and ^1^3 then
(v\V2V±)(v2V\Vz) — V\ViV±V2V\Vz
— v\v\vzv± = V3V4 using associativity
It is easily shown that the algebra C is associative. Following Riesz we can denote any
element of C as
rjPiP2-.pn —v\v2 '"vn
where the indices pi are taken modulo 2. So, for example (for n = 4) £1001 = ^1^4 whilst
£1011 — ^1^3v4 and so on. Any element of C can (by using v&j + VjVi = 0 i ^ j and
depending on the metric being Euclidean or Lorentzian, v\ — ±1) always be written in
this form up to a sign. The product of two elements of the algebra is
EPlPa...PftEtriai...<rn = v^v? ... v^v^v? ... <"
where k = Ym=2 Piai (on eacn interchange, only if pi = 1 and o\ — 1 a factor of (—1) is
introduced). Proceeding in this way it is now obvious that
n
Thus
^tlt2... tn{Epip2...pnE<Tl<T2...<Tn) — ( —l)P^tit2-.. tn^pi+<ri,p2+a2,...,Pn+^n
— (_1)P qEti+Pi+(Ti,t2+P2+(T2,...,tn+Pn+(Tn
where
n
which clearly indicates that the product is associative.
Now let Ep^ denote, for a given p the zth element of the set £PlP2...Pn where p = £n=1 Pj
of which there are nCp elements. Now
VjEp^ = sEp{i)Vj
where
+1 p = 0
(-I)p if EpW does not contain vd
(-l)p_1 if EpW does contain Vj
+1 if p = n (odd)
I — 1 if p = n (even)
Thus for every Ep^ we can find an element Vj which anti-commutes with it unless the
dimension n of Vn is odd and p = n. That is, for the pseudoscalar £qi...i.
Linear Independence
(i) Assume n is even. Consider the expression
n k
££>pi£*>« = 0 k=nCp (*)
p=0 2=1
where each aPi is a scalar. We need to show that every scalar api is zero.
Suppose there exists a particular scalar aqm ^ 0. Now every element Vi i = 1, 2,... , n
has an inverse v^1. This follows if ^ is non-singular; i.e. if g^1 ^ 0 z = 1,2, ...,n.
Inverses for the elements of algebra C now immediately follow. For example
[l>ll>2...Up]~1 = Vp1...^1^1
since
(wi... ^(v"1... vf1) = K1 • • • Ofai • • • vv) = l
in which associativity has been used.
Thus multiplying through (*) by a~^(i^m))_1 we obtain an expression of the 'form'
n k
p=0 2=1
where the p = q, i = m term is missing from the double sum. If (*) had contained a
single term then we have the contradiction 1 = 0. If (*) contains more than one term then
assume azt ^ 0 (for particular z,t). Since we are assuming n is even then there exists an
Vj which anti-commutes with Ez^\ Thus multiplying (**) on the left by Vj and on the
right by v~l we obtain
l + ^a^^K)"1^
p=0 2=1
(again the p = q, i — m term is missing. Adding to (**) and dividing by 2:
^lEE^K1,+vjE«i>{vj)-1] = 0 (***)
p=0 i=l
Now VjEv^l\vj)~l = ±EP^ depending as Vj commutes or anti-commutes with Ep^ and
vjEz^{vj)-1 = -Ez{t)
and so (***) has exactly the same form as (**) but with one term less in the double sum.
We now repeat this process, reducing the terms in the double summation by one each time.
Eventually we obtain the contradiction 1=0.
(ii) Now assume $n$ is odd. As argued earlier there is no element $v_j$ which anti-commutes
with the pseudoscalar $E_{11\ldots1}$. Thus using exactly the above procedure we are led to
\[ 1 + \beta E_{11\ldots1} = 0 \]
To show that this also leads to a contradiction we note the automorphism $v_i \to -v_i$ leaves
all relations in $C$ unchanged. However $E_{11\ldots1} \to -E_{11\ldots1}$ under this change and so
\[ 1 - \beta E_{11\ldots1} = 0 \]
must also be true. We immediately conclude that $2 = 0$, a contradiction. Thus the elements
from the set $E_p^{(i)}$, $p = 0, 1, \ldots, n$; $i = 1, 2, \ldots, {}^nC_p$ are linearly independent and so form
a basis for $C$ of dimension $\sum_{p=0}^n {}^nC_p = 2^n$.
We now show that the algebra splits naturally into two disjoint sets.
$C$ is formed from $2^n$ independent elements
\[ 1,\; v_1, v_2, \ldots, v_n,\; v_1v_2, v_1v_3, \ldots,\; v_1v_2v_3, \ldots,\; \ldots,\; v_1v_2v_3 \cdots v_n \]
For fixed $p$ the ${}^nC_p$-dimensional subspace of $C$ spanned by the basis elements
\[ v_{\alpha_1}v_{\alpha_2} \cdots v_{\alpha_p} \quad\text{with}\quad 1 \leq \alpha_1 < \alpha_2 < \ldots < \alpha_p \leq n \]
with exactly $p$ factors is denoted by $C_p$. We now define $C_+$, $C_-$ to be
\[ C_+ = \bigcup_{p\ \mathrm{even}} C_p \qquad C_- = \bigcup_{p\ \mathrm{odd}} C_p \]
C+ is a subalgebra (since it is closed under multiplication) of C but C_ is not a
subalgebra. The elements of C+ are called even - those of C_ are called odd. We now
examine some special cases.
n = 0 C = {1} : This is the space of scalars.
n = 1 C : This is the vector space Vn
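n = 2 $C = \{1, v_1, v_2, v_1v_2\}$ : This is the space of quaternions (if $v_1^2 = v_2^2 = -1$ and $v_1v_2 = -v_2v_1$).
$C_+ = \{1, v_1v_2\}$. Here $(v_1v_2)(v_1v_2) = -v_1^2 v_2^2 = -1$. Clearly $C_+$ is isomorphic to
the complex field.
n = 3 $C = \{1, v_1, v_2, v_3, v_3v_2, v_1v_3, v_2v_1, v_1v_2v_3\}$. Here, with each $v_i^2 = 1$, if we write
$$i \leftrightarrow v_1v_2v_3 \quad \text{then} \quad i^2 = (v_1v_2v_3)(v_1v_2v_3) = (v_1v_2)^2\, v_3^2 = -1$$
Also denoting $\mathbf{i} \leftrightarrow v_3v_2$, $\mathbf{j} \leftrightarrow v_1v_3$, $\mathbf{k} \leftrightarrow v_2v_1$ then
$$i\mathbf{i} \leftrightarrow (v_1v_2v_3)(v_3v_2) = v_1$$
$$i\mathbf{j} \leftrightarrow (v_1v_2v_3)(v_1v_3) = v_2$$
$$i\mathbf{k} \leftrightarrow (v_1v_2v_3)(v_2v_1) = v_3$$
This is the algebra generated by the elements
$$\{1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k}, \mathbf{i}, \mathbf{j}, \mathbf{k}, i\}$$
in which $\mathbf{i}, \mathbf{j}, \mathbf{k}$ are the usual quaternion units and $i^2 = -1$. Clearly $C$ is isomorphic to the
complexified quaternions (the biquaternions).
n = 3 $C_+ = \{1, v_1v_2, v_1v_3, v_2v_3\}$. Here
$$(v_2v_1)^2 = -1 \qquad (v_1v_3)^2 = -1 \qquad (v_3v_2)^2 = -1$$
$$(v_2v_1)(v_1v_3) = v_2v_3 = -(v_1v_3)(v_2v_1)$$
$$(v_2v_1)(v_3v_2) = v_1v_3 = -(v_3v_2)(v_2v_1)$$
$$(v_1v_3)(v_3v_2) = v_1v_2 = -(v_3v_2)(v_1v_3)$$
So if we make the identifications
$$\mathbf{i} \leftrightarrow v_3v_2 \qquad \mathbf{j} \leftrightarrow v_1v_3 \qquad \mathbf{k} \leftrightarrow v_2v_1$$
then $C_+$ is isomorphic to the quaternions.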
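The relations quoted in these special cases can be checked mechanically. The following Python sketch multiplies basis elements using only the rules $v_iv_j = -v_jv_i$ for $i \neq j$ and $v_i^2 = s$, where the common value $s = \pm 1$ is supplied as the parameter `sq`. The representation (a dictionary from increasing index tuples to coefficients) and the names `blade_product`, `mul`, `iota`, `qi`, `qj` and `qk` are assumptions made for illustration only.

```python
def blade_product(a, b, sq=1):
    """Multiply two basis elements given as increasing index tuples.

    Returns (sign, blade). Distinct generators anti-commute and each
    generator squares to sq (+1 or -1).
    """
    idx = list(a) + list(b)
    sign = 1
    # Bubble sort: every swap of two distinct generators flips the sign.
    swapped = True
    while swapped:
        swapped = False
        for k in range(len(idx) - 1):
            if idx[k] > idx[k + 1]:
                idx[k], idx[k + 1] = idx[k + 1], idx[k]
                sign = -sign
                swapped = True
    # After sorting, repeated generators are adjacent: v_g v_g = sq.
    out = []
    for g in idx:
        if out and out[-1] == g:
            out.pop()
            sign *= sq
        else:
            out.append(g)
    return sign, tuple(out)

def mul(x, y, sq=1):
    """Product of two algebra elements stored as {blade: coefficient}."""
    result = {}
    for a, ca in x.items():
        for b, cb in y.items():
            s, blade = blade_product(a, b, sq)
            result[blade] = result.get(blade, 0) + s * ca * cb
    return {blade: c for blade, c in result.items() if c != 0}

iota = {(1, 2, 3): 1}   # v1 v2 v3
qi = {(2, 3): -1}       # v3 v2 = -v2 v3
qj = {(1, 3): 1}        # v1 v3
qk = {(1, 2): -1}       # v2 v1 = -v1 v2

print(mul(iota, iota), mul(qi, qj) == qk, mul(iota, qi))
```

With `sq=1` the sketch reproduces the $n = 3$ relations listed above. With `sq=-1` the product $(v_1v_2)(v_1v_2)$ is still $-1$, but the pseudoscalar $v_1v_2v_3$ then squares to $+1$, so the $n = 3$ identifications rely on $v_i^2 = +1$.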
Appendix 2
Computer Algebra and Cayley Numbers
The following program segment is used to perform symbolic computations on Cayley
numbers using the symbolic programming language MAPLE. As in the text, a Cayley
number is considered as an ordered pair of quaternions, and the quaternions are taken as
2x2 complex matrices. The segment is written as a collection of procedures.
The first procedure quat defines a quaternion ($I$ is MAPLE's notation for the imaginary
quantity $i$). From this procedure is returned the quaternion
$$\begin{pmatrix} a + id & -c + ib \\ c + ib & a - id \end{pmatrix}$$
with(linalg):   # provides matrix and htranspose
quat:=proc(a,b,c,d)
  local alpha,beta;
  alpha:=a+I*d:
  beta:=c+I*b:
  evalc(matrix(2,2,[alpha,-conjugate(beta),beta,conjugate(alpha)])):
end:
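For readers without MAPLE, the same construction can be sketched numerically in Python. The block below mirrors quat using nested tuples of complex numbers. The helpers `matmul` and `det`, the unit names `qi`, `qj`, `qk`, and the reading of the arguments $(a, b, c, d)$ as the components of $a + bi + cj + dk$ are assumptions made for this illustration.

```python
def quat(a, b, c, d):
    """2x2 complex matrix for the quaternion with real components a, b, c, d."""
    alpha = complex(a, d)   # a + i d
    beta = complex(c, b)    # c + i b
    return ((alpha, -beta.conjugate()),
            (beta, alpha.conjugate()))

def matmul(m, n):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def det(m):
    """Determinant of a 2x2 matrix; for quat(a, b, c, d) this is the norm."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

one = quat(1, 0, 0, 0)
qi, qj, qk = quat(0, 1, 0, 0), quat(0, 0, 1, 0), quat(0, 0, 0, 1)

print(matmul(qi, qj) == qk, det(quat(1, 2, 3, 4)))
```

Under this reading the three unit matrices square to minus the identity and satisfy $ij = k$, and the determinant returns $a^2 + b^2 + c^2 + d^2$.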
Next, six general quaternions $p_i$, $q_i$, $i = 1, 2, 3$, are defined.
p1:=quat(a1,b1,c1,d1):
q1:=quat(A1,B1,C1,D1):
p2:=quat(a2,b2,c2,d2):
q2:=quat(A2,B2,C2,D2):
p3:=quat(a3,b3,c3,d3):
q3:=quat(A3,B3,C3,D3):
Then we define three general Cayley numbers Cay1, Cay2 and Cay3. They are written
as an ordered pair (in MAPLE this is a list) of quaternions and, through the procedures
defined below, adhere to the rules of Cayley algebra.
Cay1:=[p1,q1]:
Cay2:=[p2,q2]:
Cay3:=[p3,q3]:
Cay4:=[htranspose(p1),-q1]:
Cay1, Cay2, Cay3 are general Cayley numbers and Cay4 is the conjugate of Cay1.
(We note that, in matrix form, the quaternion conjugate is the Hermitian transpose.)
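The next procedure defines the Cayley conjugate: if
$$X = \begin{pmatrix} p \\ q \end{pmatrix} \quad \text{then} \quad X^c = \begin{pmatrix} \bar{p} \\ -q \end{pmatrix}$$
CayConj:=proc(Cay)
  [evalc(evalm(htranspose(Cay[1]))),evalc(evalm(-Cay[2]))]:
end: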
CaySum defines the sum of two Cayley numbers:
CaySum:=proc(Cay1,Cay2)
  [evalc(evalm(Cay1[1]+Cay2[1])),evalc(evalm(Cay1[2]+Cay2[2]))];
end:
CayScprod defines the product of a scalar $\lambda \in \mathbb{C}$ with a Cayley number:
CayScprod:=proc(lambda,Cay1)
  [evalc(evalm(Cay1[1]*lambda)),evalc(evalm(Cay1[2]*lambda))]:
end:
The following procedure extracts the vector part of a Cayley number:
Cayvec:=proc(Cay)
  local K1,K11,K2,K3,K4;
  K1:=CayConj(Cay):
  K11:=CayScprod(-1,K1):
  K2:=CaySum(Cay,K11):
  K3:=1/2:
  K4:=CayScprod(K3,K2):
  [evalc(evalm(K4[1])),evalc(evalm(K4[2]))];
end:
The next procedure extracts the scalar part of a Cayley number:
Caysca:=proc(Cay)
  local m1,m11,m12,mm,m2,m3,m4,m41,m42;
  m1:=CayConj(Cay);
  m11:=evalc(m1[1]);
  m12:=evalc(m1[2]);
  mm:=[m11,m12];
  m2:=CaySum(Cay,mm);
  m3:=1/2;
  m4:=CayScprod(m3,m2);
  m41:=evalm(m4[1]);
  m42:=evalm(m4[2]);
  [evalc(evalm(m41)),evalc(evalm(m42))];
end:
The inverse of a Cayley number is defined in the procedure CayInv:
CayInv:=proc(Cay)
  [evalc(evalm(CayScprod(1/CayNorm(Cay),CayConj(Cay))[1])),
   evalc(evalm(CayScprod(1/CayNorm(Cay),CayConj(Cay))[2]))];
end:
As a final definition we consider the product of two Cayley numbers: if
$$X = \begin{pmatrix} p \\ q \end{pmatrix}, \quad Y = \begin{pmatrix} a \\ b \end{pmatrix} \quad \text{then} \quad X \circ Y = \begin{pmatrix} pa - b\bar{q} \\ aq + \bar{p}b \end{pmatrix}$$
(note that in MAPLE the matrix product is denoted by &*).
Cayprod:=proc(Cay1,Cay2)
  local w1,w2;
  w1:=evalm(Cay1[1]&*Cay2[1]-Cay2[2]&*(htranspose(Cay1[2])));
  w2:=evalm(Cay2[1]&*Cay1[2]+(htranspose(Cay1[1])&*Cay2[2]));
  w1:=evalc(w1);
  w2:=evalc(w2);
  [evalc(evalm(w1)),evalc(evalm(w2))];
end:
We are now in a position to find the norm:
CayNorm:=proc(Cay1)
  # the (1,1) entry of the first quaternion of X o X^c is the norm
  simplify(Cayprod(Cay1,CayConj(Cay1))[1][1,1]);
end:
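The product, conjugate, norm and inverse can also be sketched in Python for numerical experiments. This is a minimal sketch under stated assumptions: quaternions are stored as 4-tuples $(a, b, c, d)$ with the Hamilton product rather than as $2 \times 2$ complex matrices, and the names `qmul`, `cay_prod`, `cay_conj`, `cay_norm` and `cay_inv` are hypothetical. The product follows the rule implemented in Cayprod, $(p, q) \circ (a, b) = (pa - b\bar{q},\; aq + \bar{p}b)$.

```python
from fractions import Fraction

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def qadd(p, q):
    return tuple(x + y for x, y in zip(p, q))

def qscale(s, p):
    return tuple(s * x for x in p)

def cay_conj(X):
    """(p, q) -> (conj(p), -q), as in CayConj."""
    p, q = X
    return (qconj(p), qscale(-1, q))

def cay_prod(X, Y):
    """(p, q) o (a, b) = (p a - b conj(q), a q + conj(p) b), as in Cayprod."""
    (p, q), (a, b) = X, Y
    first = qadd(qmul(p, a), qscale(-1, qmul(b, qconj(q))))
    second = qadd(qmul(a, q), qmul(qconj(p), b))
    return (first, second)

def cay_norm(X):
    """Scalar part of the first quaternion of X o X^c, as in CayNorm."""
    return cay_prod(X, cay_conj(X))[0][0]

def cay_inv(X):
    """Conjugate divided by the norm, kept exact with Fraction."""
    scale = Fraction(1) / cay_norm(X)
    p, q = cay_conj(X)
    return (qscale(scale, p), qscale(scale, q))

X = ((1, 2, 3, 4), (5, 6, 7, 8))
Y = ((2, -1, 0, 3), (1, 1, -2, 4))
print(cay_norm(X), cay_norm(Y), cay_norm(cay_prod(X, Y)))
```

Integer and `Fraction` inputs keep every result exact, so equality can be tested with `==` in the same spirit as equat, though only for particular numerical values and not symbolically.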
One of the most important questions that one can ask in Cayley number algebra is that
concerning the equality of two Cayley numbers. This is defined in the procedure equat.
equat:=proc(Cay1,Cay2)
  local z1,z2,z3,z4,z5,z6,z7,z8;
  z1:=simplify(Cay1[1][1,1]-Cay2[1][1,1]);
  z2:=simplify(Cay1[1][1,2]-Cay2[1][1,2]);
  z3:=simplify(Cay1[1][2,1]-Cay2[1][2,1]);
  z4:=simplify(Cay1[1][2,2]-Cay2[1][2,2]);
  z5:=simplify(Cay1[2][1,1]-Cay2[2][1,1]);
  z6:=simplify(Cay1[2][1,2]-Cay2[2][1,2]);
  z7:=simplify(Cay1[2][2,1]-Cay2[2][2,1]);
  z8:=simplify(Cay1[2][2,2]-Cay2[2][2,2]);
  # equal only if all eight matrix entries agree
  if z1=0 and z2=0 and z3=0 and z4=0 and z5=0 and z6=0 and z7=0 and z8=0 then
    RETURN(true)
  else
    RETURN(false)
  fi;
end:
The next program segment defines a Cayley number Cay5 with vector part proportional
to the vector part of Cayl.
K:=k+I*0;
VecCay5:=CayScprod(K,Cayvec(Cay1));
VecCay5:=[evalc(evalm(VecCay5[1])),evalc(evalm(VecCay5[2]))];
ScaCay5:=Caysca(Cay3);
ScaCay5:=[evalc(evalm(ScaCay5[1])),evalc(evalm(ScaCay5[2]))];
Cay5:=CaySum(ScaCay5,VecCay5);
The next program segments test (using the procedure equat) the fundamental relations
of Cayley number algebra and the procedures defined above.
Since Cay1 $=$ (Cay4)$^c$, the MAPLE request:
equat(Cay1,CayConj(Cay4));
returns TRUE.
Since Cay1 $\circ$ (Cay2 $\circ$ Cay3) $\neq$ (Cay1 $\circ$ Cay2) $\circ$ Cay3, the MAPLE request:
equat(Cayprod(Cay1,Cayprod(Cay2,Cay3)), Cayprod(Cayprod(Cay1,Cay2),Cay3));
returns FALSE.
However, as Cay1 $\circ$ (Cay2 $\circ$ Cay4) $=$ (Cay1 $\circ$ Cay2) $\circ$ Cay4, the MAPLE request:
equat(Cayprod(Cay1,Cayprod(Cay2,Cay4)), Cayprod(Cayprod(Cay1,Cay2),Cay4));
returns TRUE.
Also, as Cay1 $\circ$ (Cay2 $\circ$ Cay1) $=$ (Cay1 $\circ$ Cay2) $\circ$ Cay1, the MAPLE request:
equat(Cayprod(Cay1,Cayprod(Cay2,Cay1)), Cayprod(Cayprod(Cay1,Cay2),Cay1));
returns TRUE.
In this way all of the Cayley identities described in Section 4.4 can be checked. The reader
should be aware that symbolic computation, though easy to formulate (and often only
requiring one word answers TRUE or FALSE) may require large amounts of memory and
take a considerable time to execute in comparison to 'ordinary' numerical computation.
The last two program segments test the result obtained in Theorem 4.6:
$$(Y \circ X) \circ [W \circ (Y \circ X)^{-1}] = Y \circ [(X \circ (W \circ X^{-1})) \circ Y^{-1}]$$
is valid for all $W$ only when $V(Y) = kV(X)$ where $k \in \mathbb{R}$. To save on computation I have,
without loss of generality, used conjugates instead of inverses. The first part tests equality
when $V(Y) = kV(X)$ in which the identifications: $W =$ Cay3, $X =$ Cay1 and $Y =$ Cay5
have been used.
t1:=Cayprod(Cay1,Cay5):
t2:=CayConj(t1):
t3:=Cayprod(Cay3,t2):
t4:=Cayprod(t1,t3):
l1:=Cayprod(Cay3,CayConj(Cay5)):
l2:=Cayprod(Cay5,l1):
l3:=Cayprod(l2,CayConj(Cay1)):
l4:=Cayprod(Cay1,l3):
The MAPLE request:
equat(t4,l4);
returns TRUE.
The second part tests non-equality when $V(Y) \neq kV(X)$ (here $W =$ Cay3, $X =$ Cay1 and
$Y =$ Cay2).
t1:=Cayprod(Cay1,Cay2):
t2:=CayConj(t1):
t3:=Cayprod(Cay3,t2):
t4:=Cayprod(t1,t3):
l1:=Cayprod(Cay3,CayConj(Cay2)):
l2:=Cayprod(Cay2,l1):
l3:=Cayprod(l2,CayConj(Cay1)):
l4:=Cayprod(Cay1,l3):
The MAPLE request
equat(t4,l4);
returns FALSE. Of course, it does not test the general statement of Theorem 4.6. For that,
one would require a much more sophisticated symbolic programming language.
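A numerical spot check of the same statement can be sketched in Python. The block below is only an illustration for particular values, not a proof. It assumes quaternions stored as 4-tuples $(a, b, c, d)$, the product rule $(p, q) \circ (a, b) = (pa - b\bar{q},\; aq + \bar{p}b)$ used by Cayprod, and exact inverses computed with `Fraction`. All function names, including `lhs`, `rhs` and `with_parallel_vector`, are hypothetical.

```python
from fractions import Fraction

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    return (p[0], -p[1], -p[2], -p[3])

def cay_prod(X, Y):
    """(p, q) o (a, b) = (p a - b conj(q), a q + conj(p) b)."""
    (p, q), (a, b) = X, Y
    first = tuple(x - y for x, y in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(x + y for x, y in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

def cay_inv(X):
    """Conjugate (conj(p), -q) divided by the norm."""
    p, q = X
    norm = sum(x * x for x in p + q)
    return (tuple(Fraction(x) / norm for x in qconj(p)),
            tuple(Fraction(-x) / norm for x in q))

def lhs(X, Y, W):
    """(Y o X) o [W o (Y o X)^(-1)]"""
    YX = cay_prod(Y, X)
    return cay_prod(YX, cay_prod(W, cay_inv(YX)))

def rhs(X, Y, W):
    """Y o [(X o (W o X^(-1))) o Y^(-1)]"""
    inner = cay_prod(X, cay_prod(W, cay_inv(X)))
    return cay_prod(Y, cay_prod(inner, cay_inv(Y)))

def with_parallel_vector(X, s, k):
    """Cayley number with scalar part s and vector part k V(X)."""
    p, q = X
    return ((s, k * p[1], k * p[2], k * p[3]), tuple(k * x for x in q))

X = ((1, 2, 3, 4), (5, 6, 7, 8))
W = ((2, -1, 0, 3), (1, 1, -2, 4))
Y_parallel = with_parallel_vector(X, 3, 2)
print(lhs(X, Y_parallel, W) == rhs(X, Y_parallel, W))

# Unit elements whose vector parts are not proportional.
E1 = ((0, 1, 0, 0), (0, 0, 0, 0))
E2 = ((0, 0, 1, 0), (0, 0, 0, 0))
E4 = ((0, 0, 0, 0), (1, 0, 0, 0))
print(lhs(E1, E2, E4) == rhs(E1, E2, E4))
```

The first comparison uses a $Y$ whose vector part is twice that of $X$; the second takes $X$, $Y$ and $W$ to be three distinct unit elements, for which the two sides differ by a sign.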
REFERENCES
{1] Penrose R. and Rindler W.: Spinors and Space-Time
Vol I and 2 Cambridge University Press, (1984)
[2] Yang A. T. and Preudenstein F: Application of Dual Number Quaternion Algebra to
the Analysis of Spatial Mechanisms
Trans ASME June (1964) pp 300-307
[3] Edmonds J: Nature's Natural Numbers: Relativistic Quantum Theory over the Ring
of Complex Quaternions
Int. Journal of Theoretical Physics, Vol 6, No 3 (1972), pp205-224
[4] Hestenes, D.: Space-Time Algebra
Gordon and Breach, New York, 1966
[5] Hestenes, D.: Vectors, spinors and complex numbers in classical and quantum physics
Am. J. Phys. 39, 1013 (1971).
[6] Hestenes, D. and Gurtler, R.: Local Observables in Quantum Theory
Am. J. Phys. 39, 1028 (1971).
[7] van der Warden, B. L.: Hamilton's Discovery of Quaternions
Math. Magazine 49 (1976) pp227-236.
[8] Kauffman, L. H.: Knots and Physics
World Scientific Publishing Co. (1991).
[9] Brand, L.: Vector and Tensor Analysis
John Wiley and Sons (1947).
[10] du Val, P.: Homographies, Quaternions and Rotations
Claredon Press, Oxford (1964).
[11] Hestenes, D and Sobczyk, G.: Clifford Algebra to Geometric Calculus
D Reidel Publishing Company (1984).
[12] Israel, W.: Differential Forms in General Relativity
Comm. of the Dublin Institute for Advanced Studies Series A, No 19 (1970).
[13] Macfarlane, A. J.: On the Restricted Lorentz Group ...
Journal of Mathematical Physics, Vol 3 No. 6 pplll6-1129 (1962).
[14] Cahen, M.,Debever, R., & Defrise, L,: A Complex Vectorial Formalism in General
Relativity
Journal of Mathematics and Mechanics, Voll6, No 7 pp761-785 (1967).
[15] Synge, J. L.: Relativity: The Special Theory
North-Holland Publishing Company (1963).
[16] Trautman, A. & Kopczynski, W.: Spacetime and Gravitation
John Wiley and Sons (1992).
232
[17] Artmann, B.: The Concept of Number: From Quaternions to Monads and
Topological Fields
John Wiley and Sons (1988).
[18] Curtis, C: The Four and Eight Square Problem and Division Algebras
Studies in Modern Algebra: A A Albert, editor: Prentice-Hall, Inc pplOO-125 (1963).
[19] Porteous, I. R.: Topological Geometry
Van Nostrand Reinhold Company (1969).
[20] Riesz, M.: Clifford Numbers and Spinors
Lecture Series No 38, The Institute for Fluid Dynamics and Applied Mathematics,
University of Maryland (1958).
J.P. Ward: Quaternions and Cayley Numbers. Algebra and Applications. 1997,
250 pp. ISBN 0-7923-4513-4