

Quaternions and Cayley Numbers
Mathematics and Its Applications Managing Editor: M.HAZEWINKEL Centre for Mathematics and Computer Science, Amsterdam, The Netherlands Volume 403
Quaternions and Cayley Numbers Algebra and Applications by J. P. Ward Department of Mathematical Sciences, Loughborough University, Loughborough, England SPRINGER-SCIENCE+BUSINESS MEDIA, B.V.
A C.I.P. Catalogue record for this book is available from the Library of Congress.
ISBN 978-94-010-6434-7
ISBN 978-94-011-5768-1 (eBook)
DOI 10.1007/978-94-011-5768-1
Printed on acid-free paper
All Rights Reserved © 1997 Springer Science+Business Media Dordrecht
Originally published by Kluwer Academic Publishers in 1997
Softcover reprint of the hardcover 1st edition 1997
No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.
CONTENTS

Preface

1 Fundamentals of Linear Algebra
1.1 Integers, Rationals and Real Numbers
1.2 Real Numbers and Displacements
1.3 Groups
1.4 Rings and Fields
1.5 Linear Spaces
1.6 Inner Product Spaces
1.7 Algebras
1.8 Complex Numbers

2 Quaternions
2.1 Inventing Quaternions
2.2 Quaternion Algebra
2.3 The Exponential Form and Root Extraction
2.4 Frobenius' Theorem
2.5 Inner Product for Quaternions
2.6 Quaternions and Rotations in 3- and 4-Dimensions
2.7 Relation to the Rotation Matrix
2.8 Matrix Formulation of Quaternions
2.9 Applications to Spherical Trigonometry
2.10 Rotating Axes in Mechanics

3 Complexified Quaternions
3.1 Scalars, Pseudoscalars, Vectors and Pseudovectors
3.2 Complexified Quaternions: Euclidean Metric
3.3 Complexified Quaternions: Minkowski Metric
3.4 Application of Complexified Quaternions to Space-Time
3.5 Quaternions and Electromagnetism
3.6 Quaternionic Representation of Bivectors
3.7 Null Tetrad for Space-time
3.8 Classification of Complex Bivectors and of the Weyl Tensor

4 Cayley Numbers
4.1 A Common Notation for Numbers
4.2 Cayley Numbers
4.3 Angles and Cayley Numbers
4.4 Cayley Number Identities
4.5 Normed Algebras and the Hurwitz Theorem
4.6 Rotations in 7- and 8-Dimensional Euclidean Space
4.7 Basis Elements for Cayley Numbers
4.8 Geometry of 8-Dimensional Rotations

Appendix 1 Clifford Algebras
Appendix 2 Computer Algebra and Cayley Numbers
References
Index
Preface

In essence, this text is written as a challenge to others, to discover significant uses for Cayley number algebra in physics. I freely admit that, though the reading of some sections would benefit from previous experience of certain topics in physics (particularly relativity and electromagnetism), generally the mathematics is not sophisticated. In fact, the mathematically sophisticated reader may well find the rather deliberate progress, in many places, too slow for their liking.

This text had its origin in a 90-minute lecture on complex numbers given by the author to prospective university students in 1994. In my attempt to develop a novel approach to the subject matter I looked at complex numbers from an entirely geometric perspective and, no doubt in line with innumerable other mathematicians, re-traced steps first taken by Hamilton and others in the early years of the nineteenth century. I even enquired into the possibility of using an alternative multiplication rule for complex numbers (in which $\arg z_1 z_2 = \arg z_1 - \arg z_2$) other than the one which is normally accepted ($\arg z_1 z_2 = \arg z_1 + \arg z_2$). Of course, my alternative was rejected because it didn't lead to a 'product' which had properties that we now accept as fundamental (i.e. my product was not commutative; any real number has an infinite number of distinct square roots and square roots of any other number do not exist!).

My considerations on complex numbers led me quite naturally to consider quaternions (denoted by $\mathbb{H}$) which, though a professional mathematician for nearly twenty years, I had not properly considered before. Paradoxically, some of the properties I rejected for complex numbers had now to be accepted: a non-commutative product and an infinite number of square roots for particular numbers. Quaternion algebra is fascinating and (though out of favour) has considerable applications to many areas of mathematics and physics (though in the latter case it often needs to be 'complexified').
Just as a complex number can be written as an ordered pair of real numbers, a quaternion can also be written as an ordered pair of complex numbers. Following this approach one is then naturally led to consider ordered pairs of quaternions, these new objects being called Cayley numbers (8-dimensional objects denoted by $\mathbb{K}$). One might imagine that this pair-wise construction continues to produce new algebraic objects, doubling in dimension each time. However, as one increases the dimensionality of these algebraic objects one finds that, at each stage of generalisation, algebraic structure is lost. In going from $\mathbb{R}$ to $\mathbb{C}$ the concept of order is lost. In going from $\mathbb{C}$ to $\mathbb{H}$ commutativity in the product is lost, and in going from $\mathbb{H}$ to $\mathbb{K}$ the associative rule goes. So, in a very natural sense, the Cayley numbers are the end-point of a very interesting sequence of algebras. It can be shown that each of these algebras is a normed algebra (in which the norm, suitably defined, of a product is the product of norms) and indeed, over the real number field, these are the only normed algebras.

One is then led to suspect that, because three of the members of this important class (the reals, the complex numbers and the quaternions) have found considerable application in physics, the final member of the class, the Cayley numbers, should also have significant application to physics. However, thus far, this appears not to be the case. In my researches I have not found a single area of physics to which Cayley numbers have been applied. Indeed I have only found one significant reference, hidden away in the appendix of Volume 2 of Penrose & Rindler's 'Spinors and Space-Time' [1], which mentions Cayley numbers. One of the reasons for writing this text is to introduce Cayley numbers and their 'alternative' algebra to a wider audience.

My approach to writing this text is not to give an exhaustive account of quaternions and of Cayley numbers and (where they exist) their applications, but rather to produce a text which is relatively short, to the point and accessible to specialists and non-specialists alike. For reasons of conciseness and (on my part) ignorance, I have not included applications of quaternions to mechanisms or to quantum physics. In the theory of mechanisms a 'dual quaternion' $p + \varepsilon q$ is introduced ($p, q$ are quaternions and $\varepsilon^2 = 0$). This I regard as a straightforward extension of the use of quaternions in mechanics and is fully described in A T Yang & F Freudenstein [2] and in references therein. The application of a quaternion formalism to quantum theory is better known to most physicists (at least at a superficial level through the 'Pauli matrices'). More determined attempts to incorporate quaternions into quantum theory are described by J Edmonds [3] and by Hestenes [4,5,6].

The layout of the book is straightforward. The first chapter covers background material and fundamental concepts of linear algebra.
Chapter 2 begins with the geometrical introduction to quaternions and fully develops the rules of quaternion algebra. A discussion of Frobenius' Theorem is included, which shows that any associative division algebra over the real numbers is isomorphic to one of $\mathbb{R}$, $\mathbb{C}$ or $\mathbb{H}$. The application of quaternions to rotations in 3- and 4-dimensional Euclidean space is then considered and its relation to the classical matrix approach is explored. This chapter also illustrates the application of quaternions to spherical trigonometry and to problems of rotating coordinate systems in mechanics.

In Chapter 3 we consider the algebra of complexified quaternions and examine two possible inner products, one of which gives rise to a Euclidean metric and the other to a Minkowski metric. It could easily be argued that the complex numbers and the quaternions were introduced to describe rotations in (respectively) 2- and 3-dimensional space, and indeed the idea of a 'rotation', both in geometric and algebraic terms, is a common thread which runs throughout this work. Rotations in a complex 3-dimensional space are considered in this chapter and applied to the treatment of the Lorentz transformation in special relativity and to the description of electromagnetism.

In Chapter 4 we develop the algebra of Cayley numbers. We examine the various Cayley number identities which make its non-associative algebra easier to handle. Also included is the important Hurwitz Theorem, which proves that the only normed algebras over the real field are isomorphic to $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ or $\mathbb{K}$. I have attempted, as far as I was able, to extract the geometrical interpretation of Cayley numbers and I have mirrored closely the approach taken in the analysis of quaternions.

There are two appendices. In the first we describe Clifford algebras (all of which are associative), which include as special cases the real numbers, the complex numbers, the quaternions and the complexified quaternions but, significantly, not the Cayley numbers. The second appendix describes the use of the symbolic computation language in Cayley number algebra.

J P Ward
Loughborough, December 1996
Dedication

This book is dedicated to Joanna Mary Lewis who has been a source of constant and unfailing encouragement through the two years it has taken to successfully complete this project. I particularly commend her on her excellent comedy routines (via Victoria Wood), the odd cup of tea and an endless supply (via Shirley) of digestive biscuits.
Chapter 1 Fundamentals of Linear Algebra

1.1 Integers, Rationals and Real Numbers

The main purpose of this text is to describe the algebra of quaternions and Cayley numbers and to examine some of their applications. However, before we do that it will be useful to consider the integers, rationals and reals, to remind ourselves what these mathematical constructs are and to discuss some of their properties.

The main purpose of the positive integers is in counting, which is essentially the process of putting items from one set into one-to-one correspondence with the items from another set. Perhaps it is because we are all so familiar with this process that it took mathematicians so long to abstract the idea of an integer, that is, to separate the integers from the objects with which they were associated. Thus, with their invention, the symbols 1, 2, 3, ... could be used to count fingers or sheep or apples.

As well as being able to count we each have a natural concept of order. This exists in the sense that we say the number $n_B$ of elements of the set $B$ is greater than the number $n_A$ in the set $A$, and we write $n_B > n_A$, if the items of $A$ can be put into one-to-one correspondence with items of $B$ and if there are some items of $B$ left over.

Once the integers were considered in abstract terms they could be used not just for counting but also for addition. It is perfectly reasonable to count two distinct sets, say of apples (set $A$ containing $a$ apples and set $B$ containing $b$ apples) and then to consider combining the sets so that all the apples are in one set. The combined number $c$ of apples is defined as the sum of $a$ and $b$, written using the usual notation: $c = a + b = b + a$. A second operation, multiplication, is easily defined as the operation of repeated addition.

Now we can consider equations and unknowns. In the simplest case we can ask: what is that integer $x$ such that $a + x = b$ or $ax = b$, where $a, b$ are integers?
Now, of course, the first equation can only be solved for integer $x$ if $b > a$, and the second equation only has a solution for integer $x$ if $b$ is an integer multiple of $a$ (so that, in particular, $b \geq a$). We would much prefer a situation in which these equations could be solved for all possible choices of $a, b$. This leads directly to expanding the system of positive integers to include the negative integers and then further extending them to include the rationals. If we include the negative integers and zero, $\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots$, then the 'solution' to $a + x = b$ is $x = b + (-a)$, where the symbol $(-a)$ has the usual meaning.

It is interesting to note that, using only the natural numbers $\mathbb{N} = \{1, 2, 3, \ldots\}$, all the negative integers and zero can be defined using only the normal operations for adding and multiplying. We define the set $\mathbb{Z}_{(a,b)}$ to be the collection of all number pairs $(a, b)$, where $a, b \in \mathbb{N}$, with the rules for adding and multiplying being:

$$(a, b) \oplus (c, d) = (a + c,\; b + d)$$
$$(a, b) \otimes (c, d) = (ac + bd,\; bc + ad)$$
We shall also assume that equality between two such number pairs satisfies: $(a, b) = (c, d)$ only if $a + d = b + c$. From this set of 'objects' the positive integers are those number pairs $(a, b)$ for which $a > b$; the negative integers are those for which $a < b$, and the zero element is the number pair $(a, a)$. Effectively we are denoting $a - b$ by $(a, b)$. All the usual properties of negative numbers can be derived from these basic axioms (the negative sign need never be used). Of course it can be shown that $\mathbb{Z}_{(a,b)}$ and the set $\mathbb{Z}$ of integers $\{\ldots, -3, -2, -1, 0, 1, 2, 3, \ldots\}$ are essentially identical.

We note that the product of negative integers $(a, b)$, $b > a$, and $(c, d)$, $d > c$, is a positive integer. To see this we must show that $ac + bd > bc + ad$. Since $b > a$ and $d > c$ there exist integers $k, m \in \mathbb{N}$ such that $b = a + m$ and $d = c + k$. Then

$$ac + bd = ac + (a + m)(c + k) = ac + ac + mc + ak + mk = (a + m)c + a(c + k) + mk = bc + ad + mk > bc + ad.$$

Therefore $(ac + bd,\; bc + ad)$ is a positive integer. The use of an ordered list of objects (here a pair of integers) to define a different kind of number is a recurring theme in this area of mathematics.

Using integers to count is an elementary idea. But how do we distinguish between a 'small' apple and a 'large' apple, or how do we 'account' for an apple sliced into exactly two pieces? Distinguishing between large and small is a more complicated problem than dealing with 'fractions' of a whole. Again we use the artifice of a number pair. When we divide an apple into exactly two pieces the selection of one piece might be denoted by $(1, 2)$; the first integer standing for the number of pieces in our selection and the second integer standing for the number of pieces in the whole group. Thus the pairing $(3, 27)$ could describe the selection of 3 sheep from a flock of 27.
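The number-pair construction of the integers described above can be checked with a short computation. The following is only a minimal sketch, not part of the original text: pairs are modelled as Python tuples of natural numbers, and the function names `add`, `mul`, `same` and `sign` are hypothetical names chosen for this illustration.

```python
def add(p, q):
    """(a, b) (+) (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = p, q
    return (a + c, b + d)


def mul(p, q):
    """(a, b) (x) (c, d) = (ac + bd, bc + ad)."""
    (a, b), (c, d) = p, q
    return (a * c + b * d, b * c + a * d)


def same(p, q):
    """Equality rule for pairs: (a, b) = (c, d) exactly when a + d = b + c."""
    (a, b), (c, d) = p, q
    return a + d == b + c


def sign(p):
    """+1 if a > b (positive), -1 if a < b (negative), 0 for the zero pair (a, a)."""
    a, b = p
    return (a > b) - (a < b)


minus_two = (1, 3)     # the pair (1, 3) stands for 1 - 3
minus_three = (2, 5)   # the pair (2, 5) stands for 2 - 5

product = mul(minus_two, minus_three)   # (1*2 + 3*5, 3*2 + 1*5) = (17, 11)
total = add(minus_two, minus_three)     # (3, 8)

print(product, sign(product))   # a positive pair: the product of two negatives
print(same(product, (7, 1)))    # the same integer as the pair (7, 1)
```

Only natural numbers and the ordinary operations on them appear in the functions; no minus sign is needed, in line with the remark in the text.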
Even more suggestively (with present day notation) we could write the pair of integers in column form $\binom{1}{2}$ or $\binom{3}{27}$. A pairing such as $\binom{a}{0}$ is meaningless; choosing $a$ items from 0 is simply not allowable. The rational numbers is the set $\mathbb{Q}$ of all ordered pairs of integers $\binom{a}{b}$, $b \neq 0$, $a, b \in \mathbb{Z}$, satisfying the algebraic rules of combination:

$$\binom{a}{b} \oplus \binom{c}{d} = \binom{c}{d} \oplus \binom{a}{b} = \binom{ad + bc}{bd}$$
$$\binom{a}{b} \otimes \binom{c}{d} = \binom{c}{d} \otimes \binom{a}{b} = \binom{ac}{bd}$$
$$\binom{a}{b} = \binom{sa}{sb} \quad \text{where } s \text{ is a non-zero integer.}$$

Now, the algebraic equation $ax = b$ or, equivalently, $\binom{a}{1} \otimes x = \binom{b}{1}$ has the solution $x = \binom{b}{a}$. Of course, the integers are contained within the rationals, being rationals of the form $\binom{a}{1}$.

In any practical calculation almost any measurement can be accomplished using the rational number system. Practicalities, however, are not of direct interest to the mathematician. Of course, he is very happy if his 'constructs' find application in the real world but this is not his overriding interest. The need for a number which could not be represented by a rational was first considered by Pythagoras, who was born about 580 BC. The example is well known. We take a unit square and slice it into two along a diagonal; if $x$ is the length of the hypotenuse of one of the right-angled triangles so formed then $x$ satisfies the simple equation $x^2 = 2$. The sequence of rationals

$$\frac{14}{10},\; \frac{141}{100},\; \frac{1414}{1000},\; \ldots$$

might be 'close' candidates for $x$ as they produce for $x^2$, in turn,

$$\frac{196}{100},\; \frac{19881}{10000},\; \frac{1999396}{1000000},\; \ldots$$

Now we can perhaps appreciate Pythagoras' problem. What (if any) is the rational number $x$ which satisfies this equation? It is not too difficult to see that in fact no such rational number exists. For let us assume that there is such a rational number $x = \binom{m}{n}$, which is assumed to be written in such a way that $m$ and $n$ have no common factor other than $\pm 1$. Therefore $x^2 = 2$ implies:

$$\binom{m^2}{n^2} = \binom{2}{1}.$$

Thus we deduce that $m, n$ exist such that $m^2 = 2n^2$. But the right hand side is an even number irrespective of the evenness or otherwise of $n$; thus $m^2$ is an even number. Since the square of any odd integer is odd we deduce that $m$ must be an even number, and so $m^2$ must have a factor of 4, implying that $n^2$ is even and hence that $n$ is even. Thus both $m$ and $n$ are even, which contradicts our original assumption that $m$ and $n$ have no common factors. Thus the solution to $x^2 = 2$ cannot be a rational number. Such a number is termed irrational.
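The rational approximations to the solution of $x^2 = 2$ quoted above can be reproduced exactly with rational arithmetic. The sketch below uses Python's standard `fractions.Fraction` type; the helper `has_rational_sqrt2` is a hypothetical name introduced here. Its search over a finite range of denominators illustrates the result but is of course no substitute for the proof by contradiction given in the text.

```python
from fractions import Fraction
from math import isqrt

# The three approximations 14/10, 141/100, 1414/1000 and their exact squares.
approximations = [Fraction(14, 10), Fraction(141, 100), Fraction(1414, 1000)]
squares = [x * x for x in approximations]

for x, x2 in zip(approximations, squares):
    # Fraction stores values in lowest terms, e.g. 14/10 is held as 7/5.
    print(x, x2, float(x2))


def has_rational_sqrt2(max_denominator):
    """Search for integers m, n with m*m == 2*n*n and 1 <= n <= max_denominator."""
    for n in range(1, max_denominator + 1):
        m = isqrt(2 * n * n)        # largest integer whose square is <= 2 n^2
        if m * m == 2 * n * n:
            return True
    return False


print(has_rational_sqrt2(10_000))
```

Each square falls short of 2, and the search finds no pair $m, n$ with $m^2 = 2n^2$, as the argument above guarantees.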
Without going into great detail, the real numbers, denoted by $\mathbb{R}$, can be defined in rigorous terms. Any real number $b$ can be written in decimal fraction form:

$$b = k.d_1 d_2 d_3 \ldots = k + \frac{d_1}{10} + \frac{d_2}{10^2} + \frac{d_3}{10^3} + \cdots \qquad \left(\text{general term } \frac{d_n}{10^n}\right)$$

where $k \in \mathbb{Z}$ and $d_1, d_2, \ldots$ are non-negative integers (the decimal digits, each at most 9). The term $.d_1 d_2 d_3 \ldots$ is called the decimal part of $b$. Here we are reverting to the ordinary conventional way of writing a rational number. So, for example, (in ordinary notation) $1.41421356\ldots$ is the decimal fraction representation of $\sqrt{2}$.

It is clear that any decimal fraction which satisfies $d_p = 0$ for $p$ greater than some integer $q$ represents a rational number. (These are called finite decimal fractions.) Thus $2.35000\ldots$ is the rational $\frac{235}{100}$. However a rational may also be represented by a periodic decimal, in which a sequence of digits in the decimal part repeats indefinitely. For example, consider the decimal fraction $b = 0.09090909\ldots$ Clearly

$$b = \frac{9}{10^2} + \frac{9}{10^4} + \frac{9}{10^6} + \cdots = \frac{9}{10^2}\left(1 + \frac{1}{10^2} + \frac{1}{10^4} + \cdots\right) = \frac{9}{10^2} \cdot \frac{1}{1 - 10^{-2}} = \frac{1}{11}$$

where we have used the usual 'sum to infinity' rule of a geometric series.

It is easily shown that all periodic decimals are rational numbers. For consider the periodic decimal

$$b = k.d_1 d_2 d_3 \ldots d_m (e_1 e_2 \ldots e_n)^{\circ}$$

where $(e_1 e_2 \ldots e_n)^{\circ}$ denotes the sequence of recurring digits in the decimal part of the number, and $m$ is the number of digits preceding them. Now, clearly, $k.d_1 d_2 d_3 \ldots d_m$ is rational, and if we let, for the first set of recurring digits,

$$K = \frac{e_1}{10^{m+1}} + \frac{e_2}{10^{m+2}} + \cdots + \frac{e_n}{10^{m+n}} = \frac{[0.e_1 e_2 \ldots e_n]}{10^m}$$

then the recurring part contributes

$$K\left[1 + \frac{1}{10^n} + \frac{1}{10^{2n}} + \cdots\right] = K \cdot \frac{1}{1 - 10^{-n}}$$

which is rational, implying that $k.d_1 d_2 d_3 \ldots d_m (e_1 e_2 \ldots e_n)^{\circ}$ is rational. As a typical example consider

$$b = 0.660333\ldots = 0.660(3)^{\circ} = \frac{660}{1000} + \frac{3}{10000} \cdot \frac{1}{1 - 10^{-1}} = \frac{660}{1000} + \frac{1}{3000} = \frac{1981}{3000}.$$

In this formalism the definition of an irrational number is quite simple: it is one for which the decimal fraction is non-periodic (here a number such as $2.30000\ldots$ is periodic, as $n = 1$ and $e_1 = 0$).

As we have seen above, the most elementary concept of number, the positive integer, was (for algebraic convenience) first generalised to the negative integers, then to the rationals and finally to the irrational numbers. We continue our discussion of real numbers by considering their geometrical interpretation. In this way we shall be led naturally to consider further generalisations of number.

1.2 Real Numbers and Displacements

Of all the mathematical inventions perhaps the positive real numbers ($\mathbb{R}_p$) have the most direct application to Physics and Engineering, and their properties are well known. We imagine these numbers to be in one-to-one correspondence with points on a half-line called the positive real line. Two binary operations, addition (denoted by $+$) and multiplication (denoted by $\times$), are defined. Addition is defined in an obvious way. Given two positive real numbers $a$ and $b$ we construct $a + b$ by sliding $b$ so that it points out of the end of $a$. Then $a + b$ is that positive real number measured from the start point of $a$ to the end point of $b$.

Multiplication also has an obvious geometric meaning in terms of multiple additions. The construction of $na$ where both $n$ and $a$ are irrational is conceptually easier to consider if we move away from this one-dimensional picture. Consider two lines $OH$ and $OT$ intersecting at $O$, inclined at an acute angle as shown in Figure 1.1. Along $OH$ we measure the unit 1 and the number $a$, whilst along line $OT$ we mark the number $n$. We draw two further lines, one from $P$ to $Q$ and then a line parallel to this one from $R$ to meet $OT$ in a point $S$.
The length of OS is the geometrical construct of the product number na. This diagram assumes that a > 1. The second diagram considers the construct when a < 1.
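The reason why $OS$ has length $na$ can be made explicit with similar triangles. The short derivation below is a sketch under an assumed labelling, since the figure itself is not reproduced here: $P$ is taken to be the unit point on $OH$, $R$ the point on $OH$ at distance $a$ from $O$, and $Q$ the point on $OT$ at distance $n$ from $O$.

```latex
% Assumed labelling (not stated explicitly in the text):
%   P on OH with OP = 1,   R on OH with OR = a,   Q on OT with OQ = n.
% RS is drawn parallel to PQ, so triangles OPQ and ORS are similar, giving
\[
  \frac{OS}{OQ} = \frac{OR}{OP}
  \quad\Longrightarrow\quad
  OS = OQ \cdot \frac{OR}{OP} = n \cdot \frac{a}{1} = na .
\]
```

The same ratio argument applies whether $a > 1$ or $a < 1$; only the relative positions of $P$ and $R$ on $OH$ change.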
Figure 1.1

There is also an order property that can be associated with positive real numbers. If $a$ and $b$ are two positive real numbers such that $b$ is further to the right on the positive real line than $a$ then we write $b > a$.

The need to extend the positive real numbers to include the negative real numbers arises (as it did with the positive integers) if we demand that all linear equations should have a solution. If we wish to 'solve' $a + x = b$ irrespective of the values of $a$ and $b$ then we must consider negative real numbers. As with integers this can be done by considering number pairs, which obviates the need to introduce a minus sign. However, here we consider the construction of negative real numbers from a geometrical perspective.

Points on a Line: Displacements

We draw a line, called the scalar axis, and choose a point 0 called the origin (Figure 1.2).

[Figure 1.2: the scalar axis $s$, with origin 0]

We can perform two basic operations on a line: moving to the right or moving to the left. A positive scalar is obtained if we move to the right; a negative scalar is obtained if we move to the left. The set of all scalars is denoted by $\mathbb{R}$. An ordinary scalar has both a magnitude and a direction indicated by an arrow. (A displacement may be a better terminology.) We accept that the value of a scalar is independent of its position on this line. That is, scalars are slidable. Thus two scalars $a, b$ are said to be equal if and only if both have the same length and both are pointing in the same direction.

For every scalar $c$ (except zero) we can define two useful geometric terms, the norm $N_c$ and the direction (sign) $\mathrm{sign}_c$:

$$N_c = \text{square of the length of } c, \qquad N_c \in \mathbb{R}_p$$
$$\mathrm{sign}_c = +1 \text{ if } c \text{ points right}, \qquad \mathrm{sign}_c = -1 \text{ if } c \text{ points left}$$

Operations with Scalars

These operations are familiar to all of us. Addition has an obvious geometric interpretation. As with positive real numbers, to form $a + b$ we move '$b$' to point out of the end of '$a$'. Then '$a + b$' is the line joining the start point of $a$ with the end point of $b$. The same construction applies if both scalars are negative.

Subtraction implies a change of direction from right to left or from left to right. To construct $a - b$ we move '$b$' to the end of '$a$' and then change the direction (the arrow) of '$b$' to produce $(-b)$. See Figure 1.3.

[Figure 1.3: construction of $a - b$ on the scalar axis $s$]

The same construction applies if the scalars are negative.

There are two types of product that we can contemplate: multiplication of a scalar with a positive real number, and products of scalars with scalars.

Multiplication by Positive Real Numbers

If $r$ is a positive real and $a$ is a scalar then the product of $r$ with $a$, written $ra$, is a scalar in the same direction as $a$ (i.e. with the same sign as $a$) with norm $r^2 N_a$. See Figure 1.4.

[Figure 1.4: the scalar $ra$, of length $r\sqrt{N_a}$]

Scalar Multiplication

If $a, b$ are scalars then the product of $a$ with $b$, also written $ab$, is the scalar with norm $N_{ab}$ and direction (sign) $\mathrm{sign}_{ab}$ given by:

$$N_{ab} = N_a N_b$$
$$\mathrm{sign}_{ab} = +1 \text{ if } \mathrm{sign}_a = \mathrm{sign}_b, \qquad \mathrm{sign}_{ab} = -1 \text{ if } \mathrm{sign}_a \neq \mathrm{sign}_b$$
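The norm-and-sign description of scalar multiplication can be compared with ordinary multiplication of real numbers. The following is a minimal sketch, assuming a scalar is stored as a pair `(norm, sign)`; the function names `to_scalar`, `product` and `to_real` are hypothetical and are introduced only for this illustration.

```python
import math


def to_scalar(x):
    """Describe a non-zero real x by (norm, sign): norm = length squared, sign = +1 or -1."""
    if x == 0:
        raise ValueError("norm and sign are defined here for non-zero scalars only")
    return (x * x, 1 if x > 0 else -1)


def product(s, t):
    """Scalar product rule: norms multiply; sign is +1 if the signs agree, -1 otherwise."""
    (norm_a, sign_a), (norm_b, sign_b) = s, t
    return (norm_a * norm_b, 1 if sign_a == sign_b else -1)


def to_real(s):
    """Recover the ordinary real number: sign times the square root of the norm."""
    norm, sign = s
    return sign * math.sqrt(norm)


a, b = to_scalar(-3.0), to_scalar(2.0)
print(product(a, b), to_real(product(a, b)))   # opposite signs: the product points left
print(to_real(product(a, a)))                  # equal signs: the square points right
```

For every pair of non-zero reals tried in this way, `to_real(product(...))` agrees with the usual product, which is what the definition is designed to achieve.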
This definition reduces (as we would demand it should) to multiplication by a real number if $a$ is a positive real number and $b$ is a scalar. The only other possible choice for the definition of the sign of a product (consistent with multiplication by positive real numbers) is:

$$\mathrm{sign}_{ab} = +1 \text{ if } \mathrm{sign}_a = \mathrm{sign}_b = +1, \qquad \mathrm{sign}_{ab} = -1 \text{ otherwise.}$$

Multiplication defined with this choice for the sign is easily shown not to satisfy the distributive law $a(b + c) = ab + ac$, and so is rejected. For example, with $a = -1$, $b = 2$, $c = -1$ this rule gives $a(b + c) = -1$ but $ab + ac = (-2) + (-1) = -3$.

It is clear that any scalar $a$ with norm $N_a$ may be written in the form

$$a = \sqrt{N_a}\,\hat{a}$$

where $\hat{a}$ is a scalar with the same sign as $a$, $\mathrm{sign}_{\hat{a}} = \mathrm{sign}_a$, and with unit norm $N_{\hat{a}} = 1$. Now with any unit scalar (there are only two of them, either pointing to the right or pointing to the left) we can associate what we shall call an 'arc'. Essentially $\mathrm{arc}_{\hat{a}} = \ldots$ Figure 1.5 displays $\mathrm{arc}_e$ (if $e$ has +ve sign) and $\mathrm{arc}_f$ (if $f$ has $-$ve sign).

[Figure 1.5: the arcs $\mathrm{arc}_e$ and $\mathrm{arc}_f$ on the scalar axis $s$]

We note that a real number $a = \sqrt{N_a}\,\hat{a}$ may be written in 'polar coordinate' form: $(\sqrt{N_a},\; \mathrm{arc}_{\hat{a}})$.

Arcs may slide and combine to produce two distinct constructs. The combination of any two arcs in the same direction (with the same sign) is called a 'circle', and the combination of any two arcs in opposite directions is called a 'semi-circle'. (The reason for this slightly odd terminology, in this context, will become clear as we generalise to higher dimensions.) See Figure 1.6.

[Figure 1.6: two arcs combining to form a circle or a semi-circle]
With this convention, and denoting by $\sim$ the correspondence between a unit scalar and its associated arc, viz. $\hat{a} \sim \mathrm{arc}_{\hat{a}}$, we have

$$+\text{ve } 1 \sim \text{circle}, \qquad -\text{ve } 1 \sim \text{semi-circle}$$

and if we use the symbol $\oplus$ to indicate 'summation' according to the constructs in Figure 1.6 we have, for any two unit scalars $p, q$:

$$\mathrm{arc}_p \oplus \mathrm{arc}_q = \mathrm{arc}_{pq}$$

(There are only two possibilities: $\mathrm{arc}_{pq}$ = circle if $p, q$ have the same sign and $\mathrm{arc}_{pq}$ = semi-circle if $p, q$ have opposite signs.) In particular

$$\mathrm{arc}_p \oplus \mathrm{arc}_p = \mathrm{arc}_{p^2} = \text{circle} \sim +\text{ve } 1$$

which states that the square of any unit scalar (and hence any scalar) always points to the right (i.e. is always +ve). Also

$$\mathrm{arc}_p \oplus \text{circle} = \mathrm{arc}_p, \qquad \mathrm{arc}_p \oplus \text{semi-circle} = -\mathrm{arc}_p$$

for any unit scalar $p$. We can now re-write the product of scalars $a = \sqrt{N_a}\,\hat{a}$ and $b = \sqrt{N_b}\,\hat{b}$ in the form:

$$ab = (\sqrt{N_a}\,\hat{a})(\sqrt{N_b}\,\hat{b}) = \sqrt{N_a N_b}\,(\mathrm{arc}_{\hat{a}\hat{b}})$$

The term $\mathrm{arc}_{\hat{a}\hat{b}}$ in this expression provides the 'sign' of the product. So scalar multiplication has been reduced to multiplication of positive real numbers and addition of arcs.

1.3 Groups

In the previous sections we have examined many different sets including integers, rationals and real numbers. On each of these sets we have introduced two operations, addition and multiplication, each being an example of a binary operation. On each one an algebraic structure of some kind was imposed. Perhaps the simplest algebraic structure shared by most of the sets we have met is the structure of a group.

Binary Operations

Let $G$ be a non-empty set. A binary operation $*$ on $G$ is a rule which assigns to each ordered pair of elements $a, b$ of $G$ a unique element $a * b$. A set $G$ is said to be closed under the operation $*$ if for every pair of elements $a, b \in G$ we have $a * b \in G$. A binary operation can be regarded as a map from the Cartesian product $P \times P$ into $P$:

$$f : P \times P \to P, \qquad f\{(p_1, p_2)\} = p_1 * p_2$$

Of course a group is one of the central ideas of mathematics. Precisely, a group is a set of elements $G$ closed with respect to an operation $*$ and which satisfies the following axioms.

G1 ∘ $p * (q * r) = (p * q) * r \quad \forall\, p, q, r \in G$ ($*$ is associative)
G2 ∘ $\exists$ an identity element $i \in G$ such that for every $p \in G$: $p * i = i * p = p$
G3 ∘ For each element $p \in G$, $\exists$ an element $q \in G$ called the inverse such that $p * q = q * p = i$

A group is said to be an abelian group if for every pair of elements $p, q \in G$ the operator is commutative: $p * q = q * p$.

Uniqueness of Identity and Inverse

In the group axioms there is required to be at least one identity element and for each element at least one inverse. However, it is easily shown that, in a group, the identity element is unique and also each element has a unique inverse.

To prove that the group identity is unique we assume that there are two identity elements $i_1, i_2$ and argue that $i_1$ is identical to $i_2$. Thus for every element $a \in G$ we assume:

$$a * i_1 = i_1 * a = a \quad \text{and} \quad a * i_2 = i_2 * a = a$$

But in particular, for element $i_2 \in G$ we have, using $i_1$ as identity:

$$i_2 * i_1 = i_1 * i_2 = i_2$$
Also, for element $i_1 \in G$ we have, using $i_2$ as identity:

$$i_1 = i_1 * i_2 = i_2 \quad \text{(from above)}$$

proving that the identity is unique.

To show that every element $a \in G$ has a unique inverse we assume that $p$ and $q$ are two possible inverses. Then

$$p = p * i = p * (a * q) \quad \text{(since } a \text{ and } q \text{ are inverse)}$$
$$= (p * a) * q \quad \text{(using the associative property)}$$
$$= i * q = q \quad \text{(since } p \text{ and } a \text{ are inverse)}$$

A simple deduction from this result is that if $a, b \in G$ then

$$(a * b)^{-} = b^{-} * a^{-}$$

This follows easily since

$$(a * b) * (b^{-} * a^{-}) = a * (b * b^{-}) * a^{-} = a * a^{-} = i$$

so, since the inverse is unique, the inverse element to $a * b$ is $b^{-} * a^{-}$.

Subgroups

Let $G$ be a group with operation $*$. The set $\mathfrak{G}$ is said to be a subgroup of $G$ if $\mathfrak{G}$ is a subset of $G$ and is also a group with the same binary operation $*$. Perhaps not surprisingly one can readily show that the identity in $\mathfrak{G}$ is the same as the identity in $G$, and if $a \in \mathfrak{G}$ (implying $a \in G$) then the inverse of $a$ in $\mathfrak{G}$ is the same as the inverse of $a$ in $G$.

The following theorem allows one to easily check whether or not a subset of a group is a subgroup.

Theorem 1.1 If $G$ is a group with operation $*$ then $\mathfrak{G}$, a subset of $G$, is also a subgroup of $G$ if and only if the following statements are true:

∘ $\mathfrak{G}$ has at least one element.
∘ if $\alpha, \beta \in \mathfrak{G}$ then $\alpha * \beta \in \mathfrak{G}$, i.e. $\mathfrak{G}$ is closed under $*$.
∘ if $\beta$ is the inverse of $\alpha \in \mathfrak{G}$ then $\beta \in \mathfrak{G}$.

Isomorphisms of Groups

Let $S$ be a group with operation $*$ and let $T$ be a group with operation $\cdot$; then a map $\phi : S \mapsto T$ is called a homomorphism if

$$\phi(a * b) = \phi(a) \cdot \phi(b) \quad \forall\, a, b \in S$$

A homomorphism preserves the algebraic structure of the group upon which it acts. An immediate deduction is that if $\phi : S \mapsto T$ is a homomorphism and if $i_S$, $i_T$ are the identities in $S$ and $T$ respectively then:

$$\phi(i_S) = i_T \quad \text{and if } \phi(a_S) = a_T \text{ then } \phi(a_S^{-1}) = a_T^{-1}$$

Let $S$ be a group with operation $*$ and let $T$ be a group with operation $\cdot$. An isomorphism of $S$ onto $T$ is a homomorphism $\phi : S \mapsto T$ that is one-to-one and onto. In this case $S$ and $T$ are said to be isomorphic, which we denote by $S \simeq T$.

Example Let $M$ be the group of all invertible $2 \times 2$ matrices with matrix multiplication $*$ as group operation (the matrices must be invertible for $M$ to be a group, so that $\det A \neq 0$). Show that the map

$$\phi : M \mapsto \mathbb{R}, \qquad \phi(A) = \det A \quad \forall\, A \in M$$

is a homomorphism.

Solution This follows easily since (using well known properties of matrices and determinants)

$$\phi(A * B) = \det(A * B) = \det A \cdot \det B = \phi(A) \cdot \phi(B)$$

Note also that, as expected from the deduction above,

$$\phi(I) = \det\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} = 1$$

and, for $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$,

$$\phi(A^{-1}) = \det\left(\frac{1}{\det A}\begin{pmatrix} d & -b \\ -c & a \end{pmatrix}\right) = \frac{ad - bc}{(\det A)^2} = (\det A)^{-1}$$
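The determinant example can be verified numerically for particular matrices. The sketch below is illustrative only: matrices are represented as nested tuples, the helper names `matmul`, `det` and `inverse` are hypothetical, and exact rational arithmetic (`fractions.Fraction`) is used so that the comparisons are free of rounding error. Checking two sample matrices does not, of course, prove the general statement.

```python
from fractions import Fraction


def matmul(A, B):
    """Product of two 2 x 2 matrices given as ((a, b), (c, d))."""
    (a, b), (c, d) = A
    (e, f), (g, h) = B
    return ((a * e + b * g, a * f + b * h),
            (c * e + d * g, c * f + d * h))


def det(A):
    """Determinant ad - bc of a 2 x 2 matrix."""
    (a, b), (c, d) = A
    return a * d - b * c


def inverse(A):
    """Inverse (1/det A) * ((d, -b), (-c, a)); fails if A is singular."""
    (a, b), (c, d) = A
    k = Fraction(1, det(A))   # raises ZeroDivisionError when det A == 0
    return ((k * d, -k * b), (-k * c, k * a))


A = ((2, 1), (5, 3))      # det A = 1
B = ((1, 2), (3, 4))      # det B = -2
identity = ((1, 0), (0, 1))

print(det(matmul(A, B)), det(A) * det(B))      # phi(A * B) and phi(A) . phi(B)
print(det(identity))                           # phi maps the identity to 1
print(det(inverse(B)), Fraction(1, det(B)))    # phi(B^-1) and phi(B)^-1
```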
Anti-isomorphisms

A map $\phi : S \mapsto T$ that is one-to-one and onto such that

$$\phi(a * b) = \phi(b) \cdot \phi(a) \quad \forall\, a, b \in S$$

is called an anti-isomorphism. It follows immediately that if $\phi$ and $\psi$ are two anti-isomorphisms of $S \mapsto S$ then the composition $\phi \circ \psi$ is an isomorphism.

Example If $S$ is a group then show that the map $\phi : S \mapsto S$ such that

$$\phi(s) = s^{-} \quad \forall\, s \in S$$

where $s^{-}$ is the inverse of $s$, is an anti-isomorphism.

Solution The map is clearly one-to-one and onto since there exists an inverse map $\phi^{-}$:

$$\phi^{-} : S \mapsto S, \qquad \phi^{-}(s) = s^{-} \quad \forall\, s \in S$$

which satisfies

$$\phi^{-} \circ \phi(s) = \phi^{-}(\phi(s)) = \phi^{-}(s^{-}) = (s^{-})^{-} = s$$
$$\phi \circ \phi^{-}(s) = \phi(\phi^{-}(s)) = \phi(s^{-}) = (s^{-})^{-} = s$$

so that $\phi^{-} \circ \phi = \phi \circ \phi^{-} = i_S$, the identity map on $S$. Also

$$\phi(a * b) = (a * b)^{-} = b^{-} * a^{-} = \phi(b) * \phi(a)$$

so $\phi$ is an anti-isomorphism.

Theorem 1.2 If $\phi$ and $\psi$ are two anti-isomorphisms of $S \mapsto S$ then the composition $\phi \circ \psi$ is an isomorphism.

Proof $\phi$ and $\psi$ are both onto and one-to-one and so, by an earlier result, $\phi \circ \psi$ is onto and one-to-one. Also if $a, b \in S$ then

$$\phi \circ \psi(a * b) = \phi(\psi(a * b)) = \phi(\psi(b) * \psi(a)) = \phi(\psi(a)) * \phi(\psi(b)) = \phi \circ \psi(a) * \phi \circ \psi(b)$$
So the binary operation is preserved, confirming that the composite map is an isomorphism.

1.4 Rings and Fields

A group, as we have seen, is a set upon which a single operation $*$ is defined satisfying certain basic axioms. A ring is a mathematical construct: a set $S$ upon which two binary operations $\oplus$, $\otimes$ are defined such that $S$ is closed under both $\oplus$ and $\otimes$ and such that the elements of $S$ together with these operations satisfy a number of basic axioms. Although we are primarily interested in the properties of real numbers (and their generalisations) we shall nevertheless carry through our discussion of rings in general, abstract terms, though the notation that we shall use will be very suggestive of the notation commonly employed with real numbers. Precisely: a ring is a set $S$ such that

R1 ∘ $a \oplus (b \oplus c) = (a \oplus b) \oplus c \quad \forall\, a, b, c \in S$. Operation $\oplus$ is associative.
R2 ∘ For all $a, b \in S$ the equation $a \oplus d = b$ has a solution $d \in S$.
R3 ∘ $a \oplus b = b \oplus a \quad \forall\, a, b \in S$. Operation $\oplus$ is commutative.
R4 ∘ $a \otimes (b \otimes c) = (a \otimes b) \otimes c \quad \forall\, a, b, c \in S$. Operation $\otimes$ is associative.
R5 ∘ $a \otimes (b \oplus c) = (a \otimes b) \oplus (a \otimes c)$ and $(b \oplus c) \otimes a = (b \otimes a) \oplus (c \otimes a)$. These are called the distributive laws.

If the operation $\otimes$ is commutative ($a \otimes b = b \otimes a$) then the ring is said to be a commutative ring.

The first three axioms define, for the operation $\oplus$ and the elements of $S$, an abelian group. These three axioms alone can be used to show that there exists a unique $\oplus$identity $i_0$ and for each $a \in S$ a unique $\oplus$inverse ${}_{-}a$. Normally an $\oplus$identity is called a zero and an $\oplus$inverse is called an additive inverse (or negative). Note that these two properties, the existence of a unique $\oplus$inverse and a unique $\oplus$identity, together imply R2. For if $a, b \in S$ and ${}_{-}a$ is the $\oplus$inverse of $a$ then the equation $a \oplus d = b$ always has a solution:

$${}_{-}a \oplus b = {}_{-}a \oplus (a \oplus d) = ({}_{-}a \oplus a) \oplus d = i_0 \oplus d = d$$

Therefore $d = {}_{-}a \oplus b \in S$.
Thus the second ring axiom could have been replaced by an equivalent axiom specifying the existence of a unique ©inverse and a unique ©identity.
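The axioms R1 to R5 can be checked mechanically on a small finite example. The following is only an illustrative sketch: the integers modulo 6 are an assumed example that does not appear in the text, the helper name `is_ring` is hypothetical, and closure of the set under both operations is assumed rather than tested.

```python
from itertools import product

def is_ring(elements, add, mul):
    """Brute-force check of axioms R1-R5 on a finite set (closure is assumed)."""
    S = list(elements)
    pairs = list(product(S, repeat=2))
    triples = list(product(S, repeat=3))
    r1 = all(add(a, add(b, c)) == add(add(a, b), c) for a, b, c in triples)
    # R2: for every a, b the equation a (+) d = b has a solution d in S.
    r2 = all(any(add(a, d) == b for d in S) for a, b in pairs)
    r3 = all(add(a, b) == add(b, a) for a, b in pairs)
    r4 = all(mul(a, mul(b, c)) == mul(mul(a, b), c) for a, b, c in triples)
    r5 = all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
             and mul(add(b, c), a) == add(mul(b, a), mul(c, a))
             for a, b, c in triples)
    return r1 and r2 and r3 and r4 and r5

n = 6
Z6 = range(n)
add6 = lambda a, b: (a + b) % n
mul6 = lambda a, b: (a * b) % n

ring_ok = is_ring(Z6, add6, mul6)   # addition and multiplication modulo 6
max_ok = is_ring(Z6, max, mul6)     # "max" as addition violates R2
print(ring_ok, max_ok)
```

Replacing the addition by `max` shows why R2 matters: the equation $a \oplus d = b$ then has no solution whenever $b$ is smaller than $a$, so the check fails.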
Isomorphisms of Rings

Let $S$ with operations $(\oplus, \otimes)$ and $T$ with operations $(+, \cdot)$ be rings. A map $f : S \to T$ is called a homomorphism if it preserves the operations in $S$ and $T$. That is, if
$$f(a \oplus b) = f(a) + f(b), \qquad f(a \otimes b) = f(a) \cdot f(b)$$
An obvious example of such a map is that between the ring of polynomials $P$ and $L$ (the ring from which its coefficients are taken), specified by an evaluation
$$f : P \to L, \qquad f(p) = p(x)$$
which satisfies
$$f(p_1 \oplus p_2) = f(p_1) + f(p_2), \qquad f(p_1 \otimes p_2) = f(p_1) \cdot f(p_2)$$
If the homomorphic map has the further structure that it is both one-to-one and onto then $f$ is said to be an isomorphism of $S$ onto $T$ and (as with groups) we then write $S \cong T$.

An isomorphic map induces many attributes from $S$ onto $T$. For example, if $i_S$ is an $\otimes$-identity of $S$ then under an isomorphic map $f(i_S)$ is a $\cdot$-identity of $T$. This follows since if $a \in S$ then $a = a \otimes i_S$, thus
$$f(a) = f(a \otimes i_S) = f(a) \cdot f(i_S)$$
But since the identity of $T$ is unique, $f(i_S)$ is the $\cdot$-identity of $T$. We conclude that if $S$ and $T$ are isomorphic and $S$ contains an $\otimes$-identity then $T$ must contain a $\cdot$-identity. This result shows immediately that although the ring of integers $I$ and the ring of even integers $I_e$ are isomorphic as groups (with operations $\oplus$, $+$ respectively), they cannot be isomorphic as rings, since $I_e$ does not have a $\cdot$-identity whereas $I$ does have an $\otimes$-identity.

Because the ring operations $\oplus$, $\otimes$ are preserved it follows that an isomorphism preserves all the ring axioms so that (apart from notational differences) two isomorphic rings are essentially identical. One can view an isomorphic map $f : S \to T$ as producing a ring $T$ from a ring $S$ and, as such, certain algebraic properties of $S$ are induced into $T$. As seen above, if $S$ contains an $\otimes$-identity then so must $T$. Similarly it can be shown that if $S$ is abelian then so is $T$; if $S$ is commutative then so is $T$.

Example. Show that the map
$$f : Z_{(a,b)} \to Z, \qquad f\{(a, b)\} = a - b$$
is an isomorphism.

Solution.
$$f\{(a, b) \oplus (c, d)\} = f\{(a + c, b + d)\} = (a + c) - (b + d) = (a - b) + (c - d) = f\{(a, b)\} + f\{(c, d)\}$$
$$f\{(a, b) \otimes (c, d)\} = f\{(ac + bd, ad + bc)\} = (ac + bd) - (ad + bc) = (a - b)(c - d) = f\{(a, b)\}\, f\{(c, d)\}$$
This proves the map is homomorphic. To show that it is isomorphic we need to demonstrate that it is also one-to-one and onto or, equivalently, that it is invertible. We show that it is invertible by exhibiting its inverse explicitly:
$$f^{-1} : Z \to Z_{(a,b)}, \qquad f^{-1}[m] = (b + m, b)$$
Here
$$(f^{-1} \circ f)\{(c, d)\} = f^{-1}[f\{(c, d)\}] = f^{-1}[c - d] = (b + c - d, b) = (c, d)$$
and
$$(f \circ f^{-1})[m] = f(f^{-1}[m]) = f\{(b + m, b)\} = b + m - b = m$$
proving that $f$ is invertible. Thus $Z_{(a,b)} \cong Z$. This shows, of course, as we intimated earlier in Section 1.2, that the negative integers may be constructed without ever needing to introduce the negative sign.

Integral Domains and Fields

If $a \in S$, a commutative ring, with $a \ne i_0$, and if we can find an element $b \in S$, $b \ne i_0$, such that $a \otimes b = i_0$ then $a$ is said to be a divisor of zero. For example the product space $Z \times Z$ (which is a commutative ring) contains elements $(1, 0)$ and $(0, 1)$ such that
$$(1, 0) \otimes (0, 1) = (0, 0)$$
that is, each is a divisor of zero. The ring $M_2$ of $2 \times 2$ matrices has divisors of zero since, for example,
$$\begin{bmatrix} 2 & 1 \\ 4 & 2 \end{bmatrix} \begin{bmatrix} -1 & 0.5 \\ 2 & -1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$$
For the number systems that we shall be mainly interested in we will wish to consider only those structures without divisors of zero, and to this end we introduce the integral domain, which is a commutative ring without any divisors of zero.

A field has even greater structure: it is a commutative ring $S$ in which every element $a \in S$, $a \ne i_0$, has an $\otimes$-inverse. It is easily deduced that a field has no divisors of zero. For suppose $a \in S$, $a \ne i_0$, and $a \otimes b = i_0$ with $b \ne i_0$. Because $S$ is a field there exists an element $a^{-1}$ (the multiplicative inverse of $a$) such that $a^{-1} \otimes a = i_1$ (the multiplicative unit or $\otimes$-identity), and we have
$$a^{-1} \otimes a \otimes b = a^{-1} \otimes i_0 = i_0 \quad \text{implying} \quad b = i_0$$
(using associativity on the left-hand side and then the property of the identity), which contradicts our initial assumption. Thus every field is an integral domain.

A set for which all the axioms of a field hold except the commutative law of multiplication is called a division algebra or a skew field.

Order

The concept of order that we are familiar with from the integers can be applied to other rings. A ring $S$ is said to be ordered if there exists a subset $S_p \subset S$, called the set of positive elements of $S$, such that

O1 If $a, b \in S_p$ then $a \oplus b \in S_p$ and $a \otimes b \in S_p$.
O2 For each $a \in S$ exactly one of the following alternatives is true: either $a \in S_p$ or $a = i_0$ or ${}_{-}a \in S_p$. This is called the Trichotomy Law.

Those elements $a \in S$ such that ${}_{-}a \in S_p$ are called negative elements of $S$. The concept of order also applies to integral domains and to fields, the definition being identical to that for rings.

We can deduce two fundamental results applicable to all ordered rings. The first states that if $a \in S_p$ then ${}_{-}a \notin S_p$, since if ${}_{-}a \in S_p$ then $a \oplus {}_{-}a = i_0 \in S_p$, which contradicts O2.
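The divisor-of-zero examples from this section are easy to confirm numerically. The sketch below multiplies the two $2 \times 2$ matrices quoted above and, as an additional illustration that is not taken from the text, lists the divisors of zero in the integers modulo $n$; the function names are hypothetical.

```python
def matmul2(A, B):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[2, 1], [4, 2]]
B = [[-1, 0.5], [2, -1]]
product_AB = matmul2(A, B)   # every entry is zero although A and B are not

def zero_divisors(n):
    """Non-zero a in Z_n with a*b = 0 (mod n) for some non-zero b."""
    return [a for a in range(1, n)
            if any((a * b) % n == 0 for b in range(1, n))]

print(product_AB)
print(zero_divisors(6))   # Z_6 is not an integral domain
print(zero_divisors(7))   # Z_7 is a field, so the list is empty
```

The empty list for modulus 7 is consistent with the result that every field is an integral domain.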
The second fundamental property of all ordered rings is that if $i_0 \ne a \in S$ then $a \otimes a \in S_p$. The proof is immediate: if $a \in S_p$ then $a \otimes a \in S_p$ by O1. However, if $a \notin S_p$ then ${}_{-}a \in S_p$ and again ${}_{-}a \otimes {}_{-}a = a \otimes a \in S_p$ by O1. What this states, of course, is that the square of any non-zero element of an ordered ring is positive.

The Modulus

Let $S$ be an ordered ring. The modulus of $a \in S$ is written $|a|$ and is defined as:
$$|a| = \begin{cases} a & \text{if } a \in S_p \\ i_0 & \text{if } a = i_0 \\ {}_{-}a & \text{if } {}_{-}a \in S_p \end{cases}$$
Clearly if $a \ne i_0$ then for all $a \in S$, $|a| \in S_p$.

The expression $|b \oplus {}_{-}a|$ is called the absolute difference of $a$ and $b$. The modulus has a number of important properties.

Property (i) $|a| \ge a$
Proof. This is immediate if the three cases $a > i_0$, $a = i_0$ and $a < i_0$ are considered.

Property (ii) $|a|^2 = a^2$
Proof. This is clearly true from the definition of $|a|$ (and using ${}_{-}a \otimes {}_{-}a = a \otimes a = a^2$).

Property (iii) $|a \otimes b| = |a| \otimes |b|$
Proof. We consider the various cases. For example, if $a > i_0$ and $b > i_0$ then $a \otimes b > i_0$ and so $|a \otimes b| = a \otimes b$. Also $|a| = a$ and $|b| = b$, implying $|a| \otimes |b| = a \otimes b$. The other possibilities are treated similarly.

Property (iv) If $|a| > |b|$ then $a^2 > b^2$.
Proof. Since $|a| > i_0$ and $|b| \ge i_0$ then
$$|a| \otimes |a| > |b| \otimes |a| = |b \otimes a|, \qquad |a| \otimes |b| \ge |b| \otimes |b|$$
that is
$$a^2 > |b \otimes a| \quad \text{and} \quad b^2 \le |a \otimes b|$$
Therefore, combining the two inequalities, $a^2 > b^2$.

Property (v) $|a \oplus b| \le |a| \oplus |b|$
Proof.
$$(|a| \oplus |b|)^2 = a^2 \oplus 2|a \otimes b| \oplus b^2$$
But $|a \otimes b| \ge a \otimes b$ and so
$$(|a| \oplus |b|)^2 \ge a^2 \oplus (2a \otimes b) \oplus b^2 = (a \oplus b)^2$$
Hence $|a| \oplus |b| \ge |a \oplus b|$.

Property (vi) $|a \oplus {}_{-}b| \ge \big|\,|a| \oplus {}_{-}|b|\,\big|$
Proof. Similar to (v).

Note that with the introduction of the modulus we can introduce the norm and sign of an element of an ordered ring, to tie in with these descriptors introduced earlier in our brief discussion of scalars. If $a \in S$ then we write
$$N_a = |a|^2 = a^2$$
and the sign of $a$ is:
$$\operatorname{sign} a = \begin{cases} i_1 & \text{if } a > i_0 \\ {}_{-}i_1 & \text{if } a < i_0 \end{cases}$$
Two other results for an ordered field are worthy of note. If $a \in S_p$ then $a^{-1} \in S_p$. The proof is immediate: if $a^{-1} \notin S_p$ then ${}_{-}a^{-1} \in S_p$, so that $a \otimes {}_{-}a^{-1} = {}_{-}i_1 \in S_p$ by O1 and hence $i_1 \notin S_p$, which contradicts the earlier result that the square $i_1 = i_1 \otimes i_1$ is positive. Also if $a > i_1$ then $a^{-1} < i_1$. To see this we note that since $a > i_1$ then $a \in S_p$ and so $a^{-1} \in S_p$, giving
$$a^{-1} \otimes a > a^{-1} \otimes i_1 = a^{-1}$$
Thus $i_1 > a^{-1}$. However, since $a^{-1}$ is positive we conclude $i_0 < a^{-1} < i_1$.

1.5 Linear Spaces

Let $S$ be an abelian group under the operation $+$ and let $T$ be an ordered commutative field. We consider a map, called scalar multiplication, such that for all $s, s' \in S$ and all $\alpha, \alpha' \in T$ it has the properties
Sm1 $\alpha(s + s') = \alpha s + \alpha s'$, $\quad (\alpha + \alpha')s = \alpha s + \alpha' s$
Sm2 $\alpha'(\alpha s) = (\alpha'\alpha)s$
Sm3 $1s = s$, where $1$ is the multiplicative identity in $T$.

An abelian group for which there is a scalar multiplication map as defined here is called a linear space over $T$, denoted by $S_T$. The elements of $T$ are called scalars.

Linear Dependence

If $S_T$ is a linear space over $T$ and $s_1, s_2, \ldots, s_k \in S$, $\alpha_1, \alpha_2, \ldots, \alpha_k \in T$ then the element $s \in S$ given by
$$s = \alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_k s_k$$
is called a linear combination of $s_1, s_2, \ldots, s_k$. The space comprising all those elements formed in this way is itself a linear space over the field $T$. These elements satisfy the axioms of an abelian group under $+$ and also satisfy the extra constraints required of a linear space.

The elements $s_1, s_2, \ldots, s_k \in S_T$ are said to be linearly independent if and only if
$$\alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_k s_k = 0_S \quad \text{implies} \quad \alpha_1 = \alpha_2 = \cdots = \alpha_k = 0$$
otherwise they are said to be linearly dependent.

The space of all linear combinations of $s_1, s_2, \ldots, s_k$ is said to be spanned by $s_1, s_2, \ldots, s_k$ and we denote this space by $\operatorname{span}\{s_1, s_2, \ldots, s_k\}$. If there exists a finite set of elements $s_1, s_2, \ldots, s_k$ such that
$$S_T = \operatorname{span}\{s_1, s_2, \ldots, s_k\}$$
then $S_T$ is said to be finite dimensional, and if $s_1, s_2, \ldots, s_k$ are also linearly independent then this set of elements is called a basis of $S_T$.

As a simple example of a linear space consider a set of elements of the form
$$(\alpha_1, \alpha_2, \ldots, \alpha_n), \qquad \alpha_j \in T \text{ a field}$$
with addition and scalar multiplication defined by
$$(\alpha_1, \alpha_2, \ldots, \alpha_n) + (\alpha'_1, \alpha'_2, \ldots, \alpha'_n) = (\alpha_1 + \alpha'_1, \alpha_2 + \alpha'_2, \ldots, \alpha_n + \alpha'_n)$$
$$\lambda(\alpha_1, \ldots, \alpha_n) = (\lambda\alpha_1, \ldots, \lambda\alpha_n), \qquad \lambda \in T.$$
Equality between elements is defined by:
$$(\alpha_1, \alpha_2, \ldots, \alpha_n) = (\alpha'_1, \alpha'_2, \ldots, \alpha'_n) \;\Rightarrow\; \alpha_1 = \alpha'_1, \ldots, \alpha_n = \alpha'_n$$
It is easy to verify that this is a linear space by checking through the axioms. We denote this space, which is called the space of $n$-tuples, by $T^n$. A little thought allows the construction of a basis using the particular elements
$$e_1 = (1, 0, \ldots, 0), \quad e_2 = (0, 1, 0, \ldots, 0), \quad \ldots, \quad e_n = (0, 0, \ldots, 1)$$
or
$$e_i = (\delta_{i1}, \delta_{i2}, \ldots, \delta_{in}), \quad i = 1, 2, \ldots, n, \qquad \text{where } \delta_{ij} = \begin{cases} 1 & i = j \\ 0 & i \ne j \end{cases}$$
These $n$ elements are linearly independent. These elements therefore form a basis and $T^n$ is finite dimensional. This basis is called the standard basis for $T^n$.

With a little more work it can be shown that any two bases of $S_T$ contain the same number of elements, this common number being called the dimension of $S_T$, denoted by $\dim(S_T)$. For the example of the linear space of $n$-tuples above, $\dim T^n = n$.

Linear Maps

Let $S_T$ and $V_T$ be linear spaces defined over the same field $T$. Let $\phi$ be a map from $S_T$ to $V_T$,
$$\phi : S_T \to V_T$$
such that
$$\phi(s_1 + s_2) = \phi(s_1) + \phi(s_2) \quad \text{and} \quad \phi(\alpha s) = \alpha\phi(s)$$
for all $s_1, s_2, s \in S_T$ and all $\alpha \in T$; then $\phi$ is called a linear transformation from $S_T$ to $V_T$. (Sometimes a linear transformation is called a homomorphism.)

We now consider combining linear maps. Let $S_T$ be a linear space over a field $T$ and let $\phi$, $\theta$ be two linear transformations from $S_T$ to itself. We define the map $\psi_\oplus : S_T \to S_T$ such that for any $s \in S_T$
$$\psi_\oplus(s) = \phi(s) + \theta(s).$$
Then $\psi_\oplus$ is a linear map:
$$\psi_\oplus(\alpha s) = \phi(\alpha s) + \theta(\alpha s) = \alpha[\phi(s) + \theta(s)] = \alpha\psi_\oplus(s)$$
$$\psi_\oplus(s_1 + s_2) = \phi(s_1 + s_2) + \theta(s_1 + s_2) = \phi(s_1) + \phi(s_2) + \theta(s_1) + \theta(s_2) = \psi_\oplus(s_1) + \psi_\oplus(s_2)$$
The map $\psi_\oplus$ is called the sum of the linear transformations $\phi$ and $\theta$ and is denoted by $\phi + \theta$.

We can also define the composition map $\psi_\otimes : S_T \to S_T$ by applying $\theta$, $\phi$ in succession:
$$\psi_\otimes(s) = \phi(\theta(s)), \qquad s \in S_T.$$
Again $\psi_\otimes$ is a linear map.
$$\psi_\otimes(\alpha s) = \phi(\theta(\alpha s)) = \phi(\alpha\theta(s)) = \alpha\phi(\theta(s)) = \alpha\psi_\otimes(s)$$
$$\psi_\otimes(s_1 + s_2) = \phi(\theta(s_1 + s_2)) = \phi[\theta(s_1) + \theta(s_2)] = \psi_\otimes(s_1) + \psi_\otimes(s_2)$$
The map $\psi_\otimes$ is called the product of the linear transformations $\phi$ and $\theta$ and is denoted by $\phi \circ \theta$. Generally the map $\phi \circ \theta$ is not the same as $\theta \circ \phi$.

Matrix Representation of a Linear Map

Let $S_T$ be a linear space such that $S_T = \operatorname{span}\{s_1, s_2, \ldots, s_q\}$ with dimension $q$ and let $V_T$ be a subspace of $S_T$ spanned by $n$ elements,
$$V_T = \operatorname{span}\{v_1, v_2, \ldots, v_n\}, \qquad n \le q.$$
We shall now consider linear maps from $V_T$ onto itself: $\phi : V_T \to V_T$. The image of each basis element of $V_T$ is an element of $V_T$ and so
$$\phi(v_k) = a_{1k}v_1 + a_{2k}v_2 + \cdots + a_{nk}v_n = \sum_{i=1}^{n} a_{ik}v_i, \qquad a_{ik} \in T$$
A general element of $V_T$ can be expressed as a linear combination of basis elements:
$$v = x_1 v_1 + x_2 v_2 + \cdots + x_n v_n, \qquad x_i \in T.$$
As $v_1, \ldots, v_n$ is a basis for $V_T$ we call the uniquely determined coefficients $x_i$, $i = 1, \ldots, n$, the components of $v$ with respect to the basis $v_i$, $i = 1, \ldots, n$. The image of $v$ under the map $\phi$ is $\phi(v)$; it is an element of $V_T$ and so can be expressed as a linear combination of the basis vectors:
$$\phi(v) = y_1 v_1 + y_2 v_2 + \cdots + y_n v_n = \sum_{j=1}^{n} y_j v_j, \qquad y_j \in T$$
But
$$\phi(v) = x_1\phi(v_1) + x_2\phi(v_2) + \cdots + x_n\phi(v_n) = \sum_{i=1}^{n} x_i\phi(v_i) = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ji}v_j x_i$$
That is,
$$\sum_{j=1}^{n} y_j v_j = \sum_{i=1}^{n}\sum_{j=1}^{n} a_{ji}v_j x_i \quad \text{or} \quad \sum_{j=1}^{n}\Big(y_j - \sum_{i=1}^{n} a_{ji}x_i\Big)v_j = 0$$
But since $v_1, v_2, \ldots, v_n$ form a basis of $V_T$, and are therefore linearly independent, it follows immediately that
$$y_j = \sum_{i=1}^{n} a_{ji}x_i, \qquad j = 1, 2, \ldots, n$$
So if we know the effect of the map $\phi$ on the basis vectors $v_i$ (i.e. we know the scalars $a_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$) then we can calculate the components $y_i$, $i = 1, \ldots, n$, of $\phi(v)$ from the components $x_i$, $i = 1, \ldots, n$, of $v$. The set of scalars $a_{ij}$, $i = 1, \ldots, n$, $j = 1, \ldots, n$, characterises the linear map $\phi$. We can conveniently write this set of scalars as an $n \times n$ array, and for convenience refer to it by a single capital letter:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \cdots & a_{nn} \end{bmatrix}$$
where the $k$th column of $A$ contains the components of $\phi(v_k)$ with respect to the basis $\{v_1, v_2, \ldots, v_n\}$. The array $A$ can be shown to be a matrix. In fact, with respect to a fixed basis the correspondence between linear maps and matrices is one-to-one, preserving the operations of addition and multiplication (suitably interpreted). If we denote this correspondence by $\sim$ then if $\phi \sim A$ and $\theta \sim B$ it is easily verified that
$$\alpha\phi \sim \alpha A, \qquad \phi + \theta \sim A + B, \qquad \phi \circ \theta \sim AB.$$
Clearly this correspondence is an isomorphism between the collection of all linear transformations $\phi : S_T \to S_T$ and their associated matrices.

Example. Let $\phi_\theta$ be a linear transformation from $T^2$ into $T^2$ defined by
$$\phi_\theta(e_1) = (\cos\theta, \sin\theta), \qquad \phi_\theta(e_2) = (-\sin\theta, \cos\theta)$$
Show that $\phi_\theta \circ \phi_\lambda = \phi_{\theta+\lambda}$.

Solution. The matrix representations of $\phi_\theta$ and $\phi_\lambda$ are, respectively:
$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}, \qquad B = \begin{bmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{bmatrix}$$
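Thus the matrix representation of $\phi_\theta \circ \phi_\lambda$ is
$$AB = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}\begin{bmatrix} \cos\lambda & -\sin\lambda \\ \sin\lambda & \cos\lambda \end{bmatrix} = \begin{bmatrix} \cos\theta\cos\lambda - \sin\theta\sin\lambda & -\cos\theta\sin\lambda - \sin\theta\cos\lambda \\ \sin\theta\cos\lambda + \cos\theta\sin\lambda & -\sin\theta\sin\lambda + \cos\theta\cos\lambda \end{bmatrix} = \begin{bmatrix} \cos(\theta+\lambda) & -\sin(\theta+\lambda) \\ \sin(\theta+\lambda) & \cos(\theta+\lambda) \end{bmatrix}$$
and, using the correspondence between maps and matrices, we see that $AB \sim \phi_{\theta+\lambda}$ and so $\phi_\theta \circ \phi_\lambda = \phi_{\theta+\lambda}$. Note that in this case the order of the maps is not significant.

Change of basis

Let $\{r_1, \ldots, r_n\}$, $\{v_1, \ldots, v_n\}$ be bases in $R^n$. If $x \in R^n$ then
$$x = \alpha_1 r_1 + \cdots + \alpha_n r_n \quad \text{and} \quad x = \beta_1 v_1 + \cdots + \beta_n v_n$$
How are the coordinates $\beta_j$ related to the coordinates $\alpha_i$? Now, clearly, since $\{v_1, \ldots, v_n\}$ is a basis, each $r_i$ can be written $r_i = \sum_{j=1}^{n} \gamma_{ij}v_j$, so that
$$x = \sum_{i=1}^{n} \alpha_i r_i = \sum_{i=1}^{n} \alpha_i\Big(\sum_{j=1}^{n} \gamma_{ij}v_j\Big)$$
and
$$\sum_{j=1}^{n} \beta_j v_j = \sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_i\gamma_{ij}v_j \quad \text{implying} \quad \sum_{j=1}^{n}\Big(\beta_j - \sum_{i=1}^{n} \alpha_i\gamma_{ij}\Big)v_j = 0$$
from which (since the $v_j$ are linearly independent) we deduce $\beta_j = \sum_{i=1}^{n} \alpha_i\gamma_{ij}$. That is, in matrix notation,
$$\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix} = \begin{bmatrix} \gamma_{11} & \gamma_{21} & \cdots & \gamma_{n1} \\ \gamma_{12} & \gamma_{22} & \cdots & \gamma_{n2} \\ \vdots & \vdots & & \vdots \\ \gamma_{1n} & \gamma_{2n} & \cdots & \gamma_{nn} \end{bmatrix}\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}$$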
in which the $i$th column of the $n \times n$ matrix contains the coordinates of $r_i$ with respect to the basis $\{v_1, v_2, \ldots, v_n\}$. Formally we write
$$[X]_v = [T]_{r \to v}[X]_r$$
$[T]_{r \to v}$ is called the change of basis matrix from $r$ to $v$.

Example. In $R^2$ we have two bases $\operatorname{Sp}\{r_1, r_2\}$, $\operatorname{Sp}\{v_1, v_2\}$ where, with respect to the standard basis:
$$r_1 = (7, -6), \quad r_2 = (6, 7), \quad v_1 = (1, 4), \quad v_2 = (3, -5)$$
Find the change of basis matrix $[T]_{r \to v}$.

Solution.
$$r_1 = \alpha_1 v_1 + \alpha_2 v_2, \qquad r_2 = \beta_1 v_1 + \beta_2 v_2$$
i.e.
$$7 = \alpha_1 + 3\alpha_2, \quad -6 = 4\alpha_1 - 5\alpha_2, \qquad 6 = \beta_1 + 3\beta_2, \quad 7 = 4\beta_1 - 5\beta_2$$
These equations have solution
$$\alpha_1 = 1, \quad \alpha_2 = 2, \quad \beta_1 = 3, \quad \beta_2 = 1$$
Therefore
$$[T]_{r \to v} = \begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}$$
Thus, for example, an element $r = 2r_1 - 3r_2$ ($= (-4, -33)$ in the standard basis), which has coordinates $(2, -3)$ with respect to the $r$-basis, would have coordinates
$$\begin{bmatrix} 1 & 3 \\ 2 & 1 \end{bmatrix}\begin{bmatrix} 2 \\ -3 \end{bmatrix} = \begin{bmatrix} -7 \\ 1 \end{bmatrix}$$
with respect to the $v$-basis. That is, $v = -7v_1 + v_2$ ($= (-4, -33)$ in the standard basis).

Orientation

In this section we consider the commutative field $T$ to be the field of real numbers. Let $\{r_1, r_2, \ldots, r_n\}$ and $\{v_1, v_2, \ldots, v_n\}$ be two bases for $R^n$. There is a linear map $\phi$ which will transform one basis into the other. In terms of matrices, if $x \in R^n$,
$$[X]_v = [T]_{r \to v}[X]_r$$
where $[T]_{r \to v}$ is the change of basis matrix. If $\det[T]_{r \to v} > 0$ then we write $\operatorname{span}\{r_1, r_2, \ldots, r_n\} \sim \operatorname{span}\{v_1, v_2, \ldots, v_n\}$. Since any non-zero determinant is either positive or negative, the relation $\sim$ divides all the bases of $R^n$ into just two sets: those related to the standard basis $\operatorname{span}\{e_1, \ldots, e_n\}$ via a change of basis matrix for which $\det[T]_{e \to r} > 0$ and those for which $\det[T]_{e \to r} < 0$.

If $R^n = \operatorname{span}\{r_1, r_2, \ldots, r_n\}$ with $\det[T]_{e \to r} > 0$ then $R^n$ is said to be positively oriented.
If $R^n = \operatorname{span}\{r_1, r_2, \ldots, r_n\}$ with $\det[T]_{e \to r} < 0$ then $R^n$ is said to be negatively oriented.

The real line $R$ is oriented. The standard basis is the single element $e = 1$. If we consider a segment of the real line then this is positively oriented if the direction we associate with it points to the right and negatively oriented if the associated direction points to the left. Changing from one orientation into another is effected by multiplying by a negative number.

In $R^2$ orientation corresponds to the direction of rotation; conventionally, 'anti-clockwise' (the sense in which $e_1$ turns towards $e_2$) indicates positive orientation and 'clockwise' indicates negative orientation. Here, a linear transformation which reverses orientation, from clockwise to anti-clockwise or vice versa, includes a reflection, whilst one that preserves orientation is a rotation.

In $R^3$ orientation is conventionally referred to as being either right-handed or left-handed. See Figure 1.7.

[Figure 1.7: (a) a right-handed basis; (b) a left-handed basis]

The standard basis $e_1, e_2, e_3$ in (a) is a right-handed set. As $e_1$ rotates towards $e_2$ through $90°$, a screw with a right-handed thread (most screws are of this type) aligned
normal to the plane of $e_1 e_2$ would move in the direction of $e_3$. On the other hand, in (b) the basis $b_1, b_2, b_3$ is a left-handed set. A screw with a right-handed thread would, as we turn it from $b_1$ to $b_2$, move in the direction of $-b_3$.

An alternative, more natural description of orientation in $R^3$ is the following: align the thumb of your right hand along $e_1$ with your first finger pointed along $e_2$; then the remaining fingers naturally point in the direction of $e_3$. The digits on your left hand will align naturally according to Figure 1.7(b). These informal considerations match our algebraic definition since, in this case, the transformation from one basis set to the other is given by
$$b_1 = e_1, \qquad b_2 = e_2, \qquad b_3 = -e_3$$
leading to a change of basis matrix
$$[T]_{e \to b} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{bmatrix}$$
and $\det[T]_{e \to b} = -1$ as expected.

1.6 Inner Product Spaces

Let $t = (t_1, t_2, \ldots, t_n)$ and $u = (u_1, u_2, \ldots, u_n)$ be two elements of $T^n$ over the ordered commutative field $T$ with respect to the standard basis $T^n = \operatorname{span}\{e_1, e_2, \ldots, e_n\}$. The norm $N_t$ of the element $t$ is defined by:
$$N_t = t_1^2 + t_2^2 + \cdots + t_n^2, \qquad N_t \in T$$
The element $t$ is called a unit element if $N_t = 1$. The scalar product (sometimes called the dot product) of $t$ with $u$ is written $t \cdot u$ and defined as
$$t \cdot u = t_1 u_1 + t_2 u_2 + \cdots + t_n u_n$$
We note that $N_t = t \cdot t$. The geometrical interpretation of the norm and the scalar product is facilitated by considering $R^3$, which is the familiar linear space of our three-dimensional world.

It easily follows from the definition of the scalar product that if $t, u, v, w \in T^n$ and $\alpha, \beta, \gamma, \delta \in T$ then
$$(\alpha t + \beta u) \cdot (\gamma v + \delta w) = \alpha\gamma\, t \cdot v + \alpha\delta\, t \cdot w + \beta\gamma\, u \cdot v + \beta\delta\, u \cdot w$$
Now since $T$ is an ordered field, for every $t \in T^n$ we have $N_t \ge 0$, only vanishing if $t = 0$. We can derive a useful inequality from this basic result.

Theorem 1.3 The Schwarz inequality. For every $t, u \in T^n$
$$N_t N_u \ge (u \cdot t)^2$$
Proof. To show this we note the obvious statement that if $u, t \in T^n$ and $\alpha, \beta \in T$ then
$$N_{\alpha u + \beta t} \ge 0$$
that is
$$(\alpha u + \beta t) \cdot (\alpha u + \beta t) \ge 0$$
or, expanding:
$$\alpha^2\, u \cdot u + 2\alpha\beta\, u \cdot t + \beta^2\, t \cdot t \ge 0$$
If, in particular, we choose $\alpha = t \cdot t$ and $\beta = -u \cdot t$ then
$$N_t^2 N_u - 2N_t(u \cdot t)^2 + N_t(u \cdot t)^2 \ge 0$$
or
$$N_t\{N_t N_u - (u \cdot t)^2\} \ge 0$$
Now $N_t \ge 0$, with $N_t = 0$ if and only if $t = 0$, in which case $N_u N_t = (u \cdot t)^2$. Otherwise $N_t > 0$ and we deduce
$$N_t N_u \ge (u \cdot t)^2$$
Continuing our discussion of the scalar product we note that in $R^3$, with the usual geometrical interpretation, the three standard basis elements $e_1 : (1, 0, 0)$, $e_2 : (0, 1, 0)$, $e_3 : (0, 0, 1)$ point along three mutually perpendicular axes. See Figure 1.8. Then two elements $t, u \in R^3$ are lines of length $\sqrt{N_t}$, $\sqrt{N_u}$ respectively, pointing out of the origin. The line joining the end-points of $t$, $u$ is represented by $u - t$.

[Figure 1.8]

Now, for any triangle, the cosine rule states
$$N_u + N_t = N_{u-t} + 2\sqrt{N_u}\sqrt{N_t}\cos\theta$$
But
$$N_u = u_1^2 + u_2^2 + u_3^2, \qquad N_t = t_1^2 + t_2^2 + t_3^2$$
and
$$N_{u-t} = (u_1 - t_1)^2 + (u_2 - t_2)^2 + (u_3 - t_3)^2$$
implying
$$\cos\theta = \frac{u \cdot t}{\sqrt{N_u}\sqrt{N_t}}.$$
We generalise the concept of angle between elements $u, t \in R^3$ to define the angle between elements $u, t \in T^n$, a linear space over an ordered commutative field, to be such that
$$\cos\theta = \frac{u \cdot t}{\sqrt{N_u}\sqrt{N_t}}$$
Of course this interpretation is only sensible as long as $-1 \le \cos\theta \le 1$, which immediately follows from the Schwarz inequality derived above. Continuing with this approach it is natural to say that two elements $u, t \in T^n$ are orthogonal if and only if $u \cdot t = 0$.

Orthonormal Bases

We say a set of non-zero elements $\{t_1, t_2, \ldots, t_p\} \subset T^n$ is orthogonal if the elements $t_i$, $i = 1, \ldots, p$, are mutually orthogonal. That is, if
$$t_i \cdot t_j = 0, \qquad i \ne j$$
It is easy to show that any set of mutually orthogonal non-zero elements is linearly independent. If $\{t_1, t_2, \ldots, t_p\}$ is such a set then consider
$$\alpha_1 t_1 + \alpha_2 t_2 + \cdots + \alpha_p t_p = 0$$
If we take the scalar product of both sides with $t_k$, $1 \le k \le p$, then using the orthogonality property we have
$$\alpha_k\, t_k \cdot t_k = 0 \cdot t_k = 0$$
and so, since $t_k \cdot t_k \ne 0$, $\alpha_k = 0$ for $k = 1, 2, \ldots, p$, implying that the set $\{t_1, t_2, \ldots, t_p\}$ is linearly independent. It follows immediately that if $\{t_1, t_2, \ldots, t_n\}$ is a mutually orthogonal set of elements of $T^n$ then
$$T^n = \operatorname{Sp}\{t_1, t_2, \ldots, t_n\}$$
Such a set is called an orthonormal set if
$$t_i \cdot t_i = 1, \qquad i = 1, 2, \ldots, n.$$
The simplest example of an orthonormal set in $T^n$ is the standard basis.
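As a numerical illustration of the norm, the scalar product, the Schwarz inequality and orthonormal sets, here is a minimal sketch in $R^3$. The particular vectors and the function names are assumptions chosen for the example; they do not come from the text.

```python
import math

def dot(t, u):
    """Scalar product t . u of two n-tuples."""
    return sum(a * b for a, b in zip(t, u))

def norm(t):
    """Norm N_t = t . t (the squared length, as defined in the text)."""
    return dot(t, t)

def cos_angle(t, u):
    """cos(theta) = u . t / (sqrt(N_u) sqrt(N_t)), for non-zero t and u."""
    return dot(u, t) / (math.sqrt(norm(u)) * math.sqrt(norm(t)))

t = (1, 2, 2)
u = (3, 0, 4)
# Schwarz inequality: N_t N_u - (u . t)^2 must be non-negative.
schwarz_gap = norm(t) * norm(u) - dot(u, t) ** 2
print(schwarz_gap, cos_angle(t, u))

# A mutually orthogonal set, rescaled so that each element has unit norm.
ortho = [(1, 1, 0), (1, -1, 0), (0, 0, 2)]
orthonormal = [tuple(x / math.sqrt(norm(v)) for x in v) for v in ortho]
```

Here $N_t = 9$, $N_u = 25$ and $u \cdot t = 11$, so the inequality reads $225 \ge 121$ and $\cos\theta = 11/15$ lies between $-1$ and $1$ as required.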
Inner Products

The scalar product introduced above is a rule which associates with each pair of elements $t, u \in T^n$ an element $t \cdot u$ of $T$ with the properties

(i) $t \cdot u = u \cdot t$
(ii) $t \cdot (u + v) = t \cdot u + t \cdot v$
(iii) $\alpha(t \cdot u) = (\alpha t) \cdot u = t \cdot (\alpha u)$, $\alpha \in T$
(iv) $t \cdot t \ge 0$, and $t \cdot t = 0$ iff $t = 0$.

As we have seen, for $T^n$ the scalar product takes the form $t \cdot u = t_1 u_1 + t_2 u_2 + \cdots + t_n u_n$. However, there are many linear spaces on which a 'scalar product' can be defined satisfying the properties (i) - (iv) above. We are led, naturally, to define so-called 'inner-product' spaces, which are linear spaces $S_T$, with $T$ an ordered commutative field, on which is defined an inner product that associates an element of $T$ with each ordered pair of elements $t, u \in S_T$. The inner product is denoted by $\langle t, u \rangle$ and satisfies:

IP1 $\langle t, u \rangle = \langle u, t \rangle$
IP2 $\langle t, u + v \rangle = \langle t, u \rangle + \langle t, v \rangle$
IP3 $\alpha\langle t, u \rangle = \langle \alpha t, u \rangle = \langle t, \alpha u \rangle$
IP4 $\langle t, t \rangle \ge 0$ and $\langle t, t \rangle = 0$ iff $t = 0$

In any inner product space, norm ('length') and angle (and hence perpendicularity) may be defined in exactly the same way as for $T^n$. Also the Schwarz inequality
$$\langle t, t \rangle \langle u, u \rangle \ge \langle t, u \rangle^2$$
holds in any inner product space.

Orthogonal Matrices

Transformations from $R^2 \to R^2$ or from $R^3 \to R^3$ which preserve the scalar product (and so preserve length and angle) are called orthogonal. We extend this concept to inner product spaces. Let $S_T$, $U_T$ be inner product spaces defined over an ordered commutative field $T$. A linear transformation
$$\phi : S_T \to U_T \quad \text{such that} \quad \langle \phi(u), \phi(v) \rangle = \langle u, v \rangle$$
is called orthogonal. Clearly such a map preserves the norm and the angle:
$$N_{\phi(u)} = \langle \phi(u), \phi(u) \rangle = \langle u, u \rangle = N_u$$
$$\cos\theta' = \frac{\langle \phi(u), \phi(v) \rangle}{\sqrt{N_{\phi(u)}}\sqrt{N_{\phi(v)}}} = \frac{\langle u, v \rangle}{\sqrt{N_u}\sqrt{N_v}} = \cos\theta.$$
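It immediately follows that if $(u_1, u_2, \ldots, u_k)$ is a set of orthonormal elements in $S_T$ then the set $(\phi(u_1), \phi(u_2), \ldots, \phi(u_k))$ is an orthonormal set in $U_T$.

Theorem 1.4 A linear map $\phi : S_T \to U_T$ is orthogonal if and only if $\phi$ preserves the norm: $N_{\phi(u)} = N_u$.

Proof. If $\phi$ is orthogonal then
$$N_{\phi(u)} = \langle \phi(u), \phi(u) \rangle = \langle u, u \rangle = N_u.$$
Conversely, the inner product can be written in terms of the norm:
$$N_{u+v} - N_u - N_v = \langle u + v, u + v \rangle - \langle u, u \rangle - \langle v, v \rangle$$
$$= \langle u + v, u \rangle + \langle u + v, v \rangle - \langle u, u \rangle - \langle v, v \rangle$$
$$= \langle u, u \rangle + \langle v, u \rangle + \langle u, v \rangle + \langle v, v \rangle - \langle u, u \rangle - \langle v, v \rangle$$
$$= \langle v, u \rangle + \langle u, v \rangle = 2\langle u, v \rangle$$
Thus, if $\phi$ preserves the norm,
$$2\langle \phi(u), \phi(v) \rangle = N_{\phi(u)+\phi(v)} - N_{\phi(u)} - N_{\phi(v)} = N_{\phi(u+v)} - N_{\phi(u)} - N_{\phi(v)} = N_{u+v} - N_u - N_v = 2\langle u, v \rangle$$
and so $\phi$ is orthogonal.

Theorem 1.5 Let $S_T$, $U_T$ be inner product spaces and let $s_1, s_2, \ldots, s_n$ be an orthonormal basis for $S_T$; then the linear map $\phi : S_T \to U_T$ is orthogonal if and only if the set $\phi(s_1), \phi(s_2), \ldots, \phi(s_n)$ is orthonormal in $U_T$.

Proof. If $\phi$ is orthogonal then for $i \ne j$
$$\langle \phi(s_i), \phi(s_j) \rangle = \langle s_i, s_j \rangle = 0$$
whilst
$$N_{\phi(s_i)} = \langle \phi(s_i), \phi(s_i) \rangle = \langle s_i, s_i \rangle = 1$$
Thus $\phi(s_1), \phi(s_2), \ldots, \phi(s_n)$ is an orthonormal set. Conversely, suppose this set is orthonormal and let $u$ be any element of $S_T$; then
$$u = \alpha_1 s_1 + \alpha_2 s_2 + \cdots + \alpha_n s_n = \sum_{i=1}^{n} \alpha_i s_i$$
and
$$\phi(u) = \alpha_1\phi(s_1) + \alpha_2\phi(s_2) + \cdots + \alpha_n\phi(s_n)$$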
Now
$$N_{\phi(u)} = \Big\langle \sum_{i=1}^{n} \alpha_i\phi(s_i), \sum_{j=1}^{n} \alpha_j\phi(s_j) \Big\rangle = \sum_{i=1}^{n} \alpha_i^2 = N_u$$
since $\phi(s_i)$, $i = 1, 2, \ldots, n$, is an orthonormal set. Since the norm is preserved, $\phi$ is an orthogonal map.

An obvious consequence of this result is that two inner product spaces $S_T$, $U_T$ are isomorphic if and only if $\dim S_T = \dim U_T$.

An $n \times n$ matrix is said to be orthogonal if its columns, when considered with respect to an orthonormal basis, are mutually orthogonal, each with unit norm (with respect to the standard inner product). Let $S_T = \operatorname{span}\{e_1, e_2, \ldots, e_n\}$ in which $\langle e_i, e_j \rangle = \delta_{ij}$. Now if we consider an orthogonal map $\phi : S_T \to S_T$ then $\phi(e_j) = \sum_{i=1}^{n} a_{ij}e_i$, where $A$, with $j$th column $a_{ij}$, is the matrix representing the orthogonal map. Now $\phi(e_j)$, $j = 1, 2, \ldots, n$, are mutually orthogonal with unit norm:
$$\langle \phi(e_k), \phi(e_j) \rangle = \Big\langle \sum_{i=1}^{n} a_{ik}e_i, \sum_{p=1}^{n} a_{pj}e_p \Big\rangle = \sum_{i=1}^{n} a_{ik}a_{ij} = \delta_{kj}$$
This is matrix multiplication between two matrices: $A$ with $j$th column $a_{ij}$ and $B$ with $k$th row $a_{ik}$. That is, the rows of the $B$ matrix are precisely the columns of the $A$ matrix. Hence $B = A^T$. Hence the basic characteristic of an orthogonal map is that its matrix representation $A$ satisfies
$$A^T A = I \quad \text{or equivalently} \quad A^{-1} = A^T$$
Example. Show that the matrix representation of any orthogonal map $\phi$ from $R^2 \to R^2$ takes one of only two forms,
$$\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$$
in which the standard inner product is used. Describe the geometrical effect of such transformations.

Solution. Using the standard basis in $R^2$,
$$R^2 = \operatorname{Sp}\{(1, 0), (0, 1)\}$$
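Let
$$\phi\{(1, 0)\} = \alpha_1(1, 0) + \alpha_2(0, 1), \qquad \phi\{(0, 1)\} = \beta_1(1, 0) + \beta_2(0, 1)$$
However, since $\phi$ is orthogonal it preserves the norm:
$$N_{(1,0)} = \langle (1, 0), (1, 0) \rangle = 1$$
$$N_{\phi\{(1,0)\}} = \langle \alpha_1(1, 0) + \alpha_2(0, 1), \alpha_1(1, 0) + \alpha_2(0, 1) \rangle = \alpha_1^2 + \alpha_2^2$$
therefore $\alpha_1^2 + \alpha_2^2 = 1$. Similarly $\beta_1^2 + \beta_2^2 = 1$. Also
$$\langle \phi\{(1, 0)\}, \phi\{(0, 1)\} \rangle = \langle (1, 0), (0, 1) \rangle = 0$$
leading to
$$\langle \alpha_1(1, 0) + \alpha_2(0, 1), \beta_1(1, 0) + \beta_2(0, 1) \rangle = 0, \qquad \alpha_1\beta_1 + \alpha_2\beta_2 = 0$$
If we choose $\alpha_1 = \cos\theta$ then $\alpha_2 = \pm\sin\theta$. Also choosing $\beta_1 = \cos\lambda$ then $\beta_2 = \pm\sin\lambda$, and so this last result implies (taking $\alpha_2 = \sin\theta$, $\beta_2 = \sin\lambda$)
$$\cos\theta\cos\lambda + \sin\theta\sin\lambda = 0$$
that is
$$\theta - \lambda = (2k + 1)\pi/2, \qquad k \in Z$$
Therefore
$$\beta_1 = \cos(\theta + (2k + 1)\pi/2) = \pm\sin\theta, \qquad \beta_2 = \sin(\theta + (2k + 1)\pi/2) = \mp\cos\theta$$
Hence the orthogonal map $\phi$ is characterised by the matrix
$$A = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix} \quad \text{or} \quad B = \begin{bmatrix} \cos\theta & \sin\theta \\ \sin\theta & -\cos\theta \end{bmatrix}$$
where $0 \le \theta < 2\pi$. (The other choice $\alpha_1 = \cos\theta$, $\alpha_2 = -\sin\theta$ is obtained by replacing $\theta$ by $-\theta$ in the above forms.) These are examples of so-called orthogonal matrices; when the columns are considered with respect to the standard basis they are mutually orthogonal, each with unit norm. They are essentially distinguished by their differing determinants, $\det A = +1$, $\det B = -1$.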
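The two forms $A$ and $B$ can be checked numerically against the characterisation $A^T A = I$ and the determinant values just quoted. This is a small sketch with hypothetical function names and an arbitrarily chosen angle, using a floating-point tolerance rather than exact arithmetic.

```python
import math

def rotation(theta):
    """Matrix A = [[cos t, -sin t], [sin t, cos t]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    """Matrix B = [[cos t, sin t], [sin t, -cos t]]."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def gram(M):
    """Return M^T M for a 2 x 2 matrix M."""
    return [[sum(M[k][i] * M[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def is_orthogonal(M, tol=1e-12):
    """True when M^T M equals the identity to within tol."""
    G = gram(M)
    return all(abs(G[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(2) for j in range(2))

theta = 0.7  # arbitrary angle in radians
A, B = rotation(theta), reflection(theta)
print(is_orthogonal(A), det2(A))
print(is_orthogonal(B), det2(B))
```

A shear such as `[[1, 1], [0, 1]]` has determinant $+1$ but fails the orthogonality test, which shows that the determinant alone does not characterise these matrices.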
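The geometric interpretation of these two distinct types of orthogonal matrices in $R^2$ is of interest. The first describes a rotation. To see this we consider an element of unit norm in $R^2$, say $(a, b)$ with $a^2 + b^2 = 1$. With respect to the basis element $(1, 0)$ it subtends an angle $\gamma$:
$$\cos\gamma = \frac{\langle (1, 0), (a, b) \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{(a,b)}}} = \langle (1, 0), (a, b) \rangle = a$$
Similarly, with respect to the basis element $(0, 1)$ it subtends an angle $(90° - \gamma)$,
$$\cos(90° - \gamma) = \langle (0, 1), (a, b) \rangle = b$$
that is, $b = \sin\gamma$. After applying the orthogonal map $\phi_A$ the angle $\gamma'$ with respect to the basis element $(1, 0)$ is given by
$$\cos\gamma' = \frac{\langle (1, 0), \phi_A\{(a, b)\} \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{\phi_A\{(a,b)\}}}} = \langle (1, 0), \phi_A\{(a, b)\} \rangle$$
$$= \langle (1, 0), a[\alpha_1(1, 0) + \alpha_2(0, 1)] + b[\beta_1(1, 0) + \beta_2(0, 1)] \rangle = a\alpha_1 + b\beta_1 = a\cos\theta - b\sin\theta$$
so that
$$\cos\gamma' = \cos\gamma\cos\theta - \sin\gamma\sin\theta = \cos(\gamma + \theta)$$
Similarly, for the angle between $(0, 1)$ and $\phi_A\{(a, b)\}$,
$$\sin\gamma' = \langle (0, 1), \phi_A\{(a, b)\} \rangle = \langle (0, 1), a[\alpha_1(1, 0) + \alpha_2(0, 1)] + b[\beta_1(1, 0) + \beta_2(0, 1)] \rangle$$
$$= a\alpha_2 + b\beta_2 = \cos\gamma\sin\theta + \sin\gamma\cos\theta = \sin(\theta + \gamma)$$
Hence $\gamma' = \theta + \gamma$, i.e. the element $(a, b)$ has been rotated anti-clockwise through angle $\theta$. See Figure 1.9(a).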
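The conclusion $\gamma' = \theta + \gamma$, and the earlier result $\phi_\theta \circ \phi_\lambda = \phi_{\theta+\lambda}$, can both be confirmed numerically. The sketch below uses assumed sample angles and hypothetical helper names; `math.atan2` is used to recover the angle subtended with $(1, 0)$.

```python
import math

def rotate(theta, v):
    """Apply A = [[cos t, -sin t], [sin t, cos t]] to v = (a, b)."""
    a, b = v
    return (a * math.cos(theta) - b * math.sin(theta),
            a * math.sin(theta) + b * math.cos(theta))

def angle(v):
    """Angle subtended with the basis element (1, 0), in radians."""
    return math.atan2(v[1], v[0])

gamma, theta, lam = 0.3, 0.5, 1.1   # sample angles in radians
v = (math.cos(gamma), math.sin(gamma))   # unit element at angle gamma

rotated = rotate(theta, v)                  # should sit at angle gamma + theta
twice = rotate(theta, rotate(lam, v))       # phi_theta after phi_lambda
once = rotate(theta + lam, v)               # phi_(theta + lambda)
print(angle(rotated), gamma + theta)
```

The sample angles are kept small so that $\gamma + \theta$ stays inside the range $(-\pi, \pi]$ returned by `atan2`.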
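[Figure 1.9: (a) the action of $\phi_A$ on $(a, b)$; (b) the action of $\phi_B$ on $(a, b)$]

Under the second type of orthogonal map, with matrix representation $B$:
$$\cos\gamma'' = \frac{\langle (1, 0), \phi_B\{(a, b)\} \rangle}{\sqrt{N_{(1,0)}}\sqrt{N_{\phi_B\{(a,b)\}}}} = \langle (1, 0), \phi_B\{(a, b)\} \rangle = a\alpha_1 + b\beta_1 = a\cos\theta + b\sin\theta$$
$$= \cos\gamma\cos\theta + \sin\gamma\sin\theta = \cos(\gamma - \theta).$$
$$\sin\gamma'' = \langle (0, 1), \phi_B\{(a, b)\} \rangle = a\alpha_2 + b\beta_2 = \cos\gamma\sin\theta - \sin\gamma\cos\theta = \sin(\theta - \gamma)$$
That is, using the cosine and sine results,
$$\gamma'' = \theta - \gamma$$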
which is a reflection in the $(1, 0)$ axis followed by a rotation through angle $\theta$. See Figure 1.9(b). Of course the important characteristic of a transformation which involves a reflection is that it does not preserve orientation. In terms of matrices, the characteristic of an orthogonal matrix that corresponds to a rotation is that $\det A = +1$, whereas for an orthogonal matrix that includes a reflection $\det B = -1$.

The Orthogonal Group

The set of all orthogonal transformations $\phi : R^n \to R^n$, with map composition as the group operation, forms a group. It is perhaps easiest to see this if we make use of the correspondence between orthogonal transformations and orthogonal matrices, so that the group operation is the matrix product. The set is closed under this operation, since if $A$ and $B$ are orthogonal then $(AB)^T(AB) = B^T A^T A B = I$. So let $\phi$, $\delta$, $\lambda$ be three orthogonal maps and $A_\phi$, $A_\delta$ and $A_\lambda$ be their respective matrix representations. We must check that the three group axioms are satisfied.

G1 $\phi \circ (\delta \circ \lambda) = (\phi \circ \delta) \circ \lambda$, since matrix multiplication is associative:
$$A_\phi(A_\delta A_\lambda) = (A_\phi A_\delta)A_\lambda$$
G2 The unit matrix $I$ corresponds to the identity transformation $i$, so $i \circ \phi = \phi \circ i = \phi$ since
$$I A_\phi = A_\phi I = A_\phi$$
for all $A_\phi$.
G3 For every element $\phi$ there exists an inverse transformation $\phi^{-1}$ with the property
$$\phi \circ \phi^{-1} = \phi^{-1} \circ \phi = i$$
This is true since for every orthogonal matrix $A_\phi$ there always exists an inverse matrix $A_\phi^{-1} = A_\phi^T$ such that
$$A_\phi A_\phi^T = A_\phi^T A_\phi = I$$
Thus the set of all orthogonal transformations, or equivalently the set of all $n \times n$ orthogonal matrices, forms a group, called the orthogonal group and denoted by $O(n)$.

However, as we have seen, an orthogonal transformation in $R^n$ need not preserve orientation. An orthogonal transformation that does preserve orientation is said to be a special orthogonal transformation or rotation, and the group of such transformations is called the special orthogonal group and denoted by $SO(n)$. An orthogonal transformation that does not preserve orientation is called an anti-rotation. The anti-rotations cannot form a group as they do not contain the identity.
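The group properties can be illustrated in $R^2$ with the two matrix forms found earlier. The sketch below, with hypothetical helper names and arbitrary sample angles, checks that the product of two rotations is again a rotation, that the inverse of a rotation is its transpose, and that the product of two anti-rotations has determinant $+1$, so the anti-rotations are not closed under composition.

```python
import math

def rotation(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def reflection(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [s, -c]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(M):
    return [list(row) for row in zip(*M)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def close(A, B, tol=1e-12):
    """Entry-wise comparison of two 2 x 2 matrices."""
    return all(abs(A[i][j] - B[i][j]) < tol
               for i in range(2) for j in range(2))

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
R1, R2 = rotation(0.4), rotation(1.3)
M1, M2 = reflection(0.4), reflection(1.3)

rot_product = matmul(R1, R2)        # equals rotation(0.4 + 1.3)
refl_product = matmul(M1, M2)       # determinant +1: a rotation
inverse_check = matmul(R1, transpose(R1))
print(det2(rot_product), det2(refl_product))
```

Since the determinant is multiplicative, $(-1)(-1) = +1$ explains the second result without any computation.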
The transformations which preserve orientation and those which do not are not continuously connected, since for one set $\det A = +1$ and for the other set $\det A = -1$. In $R^3$ it is easy to show, by direct construction, that all orthogonal transformations which preserve orientation, changing one standard basis into another, can be carried out continuously as a sequence of rotations. To see this, refer to Figure 1.10, in which two standard bases are illustrated: $e_1, e_2, e_3$ directed along axes $Ox, Oy, Oz$ and $e'_1, e'_2, e'_3$ as shown.

[Figure 1.10]

We see that the plane containing $e'_1$, $e'_2$ has a normal $e'_3$ inclined at an angle $\beta$ to the $Oz$ axis. This plane intersects the plane $Oxy$ in a line $OA$ inclined to the $Ox$ axis at an angle $\alpha$. Our first operation is to rotate about $Oz$ so that $e_1$ points along $OA$. That is, multiply by the orthogonal matrix
$$A_\alpha = \begin{bmatrix} \cos\alpha & \sin\alpha & 0 \\ -\sin\alpha & \cos\alpha & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
Then rotate about $OA$ through an angle $\beta$ through the orthogonal matrix:
$$A_\beta = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos\beta & -\sin\beta \\ 0 & \sin\beta & \cos\beta \end{bmatrix}$$
Finally rotate about $OD$ through an angle $\gamma$ so as to align $e_1$ with $e'_1$. The direction $e_2$ will then be aligned with $e'_2$. This last transformation is effected by the orthogonal matrix:
$$A_\gamma = \begin{bmatrix} \cos\gamma & \sin\gamma & 0 \\ -\sin\gamma & \cos\gamma & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
The combined transformation is obtained via the product $A_\gamma A_\beta A_\alpha$, which is the orthogonal matrix:
$$\begin{bmatrix} \cos\alpha\cos\gamma - \sin\alpha\cos\beta\sin\gamma & -\cos\alpha\sin\gamma - \sin\alpha\cos\beta\cos\gamma & \sin\alpha\sin\beta \\ \sin\alpha\cos\gamma + \cos\alpha\cos\beta\sin\gamma & -\sin\alpha\sin\gamma + \cos\alpha\cos\beta\cos\gamma & -\cos\alpha\sin\beta \\ \sin\beta\sin\gamma & \sin\beta\cos\gamma & \cos\beta \end{bmatrix}$$
The angles $\alpha$, $\beta$, $\gamma$ are called Euler's angles.

1.7 Algebras

Definition. A finite-dimensional linear space $S_T$ over a field $T$ is called an algebra if there is defined on $S_T$ a product, denoted by adjacency, which satisfies, for all $s, t, u \in S_T$ and all $\alpha, \beta, \gamma \in T$,

(i) $s(\alpha t) = (\alpha s)t = \alpha(st)$
(ii) $s(t + u) = st + su$, $\quad (t + u)s = ts + us$

$S_T$ is called an associative algebra if for all $s, t, u \in S_T$ we have
$$s(tu) = (st)u$$
It is called a commutative algebra if for all $s, t \in S_T$
$$st = ts$$
The maps corresponding to multiplication on the left and on the right,
$$L_\ell : s \to \ell s, \qquad R_r : s \to sr, \qquad s, r, \ell \in S_T$$
are linear transformations since, for example,
$$L_\ell(\alpha s + \beta t) = \ell(\alpha s + \beta t) = \ell(\alpha s) + \ell(\beta t) = \alpha(\ell s) + \beta(\ell t) = \alpha L_\ell(s) + \beta L_\ell(t)$$
and similarly for $R_r$. If $\{e_1, e_2, \ldots, e_n\}$ is a basis for $S_{\mathcal{F}}$ (then we say it is also a basis for the algebra) then every $s \in S_{\mathcal{F}}$ can be expressed uniquely in the form:
$$s = \alpha_1 e_1 + \alpha_2 e_2 + \cdots + \alpha_n e_n \qquad \alpha_i \in \mathcal{F}$$
$S_{\mathcal{F}}$ is said to be a division algebra if the equations $st = u$ and $ts = u$ always possess solutions $s$ whenever $t \neq 0$.

An algebra $A$ may have an element, which we shall usually denote by $i_1$, called the identity element, such that
$$i_1 s = s i_1 = s \qquad \forall\, s \in A$$
It is easy to show that if such an element $i_1$ exists then it must be unique, for if there is a second such element $i_1'$ then, since $i_1$ is the identity,
$$i_1' i_1 = i_1'$$
But since $i_1'$ is also the identity then $i_1' i_1 = i_1$, and so $i_1 = i_1'$, proving uniqueness.

When an algebra $A$ over a field $\mathcal{F}$ has an identity $i_1$ then the set of elements $\{\alpha i_1,\ \alpha \in \mathcal{F}\}$ is an algebra of order 1, since there is just one basis element $i_1$. Moreover, since
$$\alpha i_1 + \alpha' i_1 = (\alpha + \alpha') i_1 \qquad \text{and} \qquad (\alpha i_1)(\alpha' i_1) = \alpha\alpha'(i_1 i_1) = \alpha\alpha' i_1$$
this algebra is isomorphic to the field $\mathcal{F}$.

A division algebra with an identity $i_1$ contains no divisors of zero, since if $s, t \in S_{\mathcal{F}}$, $s \neq 0$, $t \neq 0$ were such that $st = 0$ then there would exist an element $u \in S_{\mathcal{F}}$ such that $tu = i_1$, that is
$$0 = (st)u = s(tu) = s i_1 = s$$
which is a contradiction.
Normed Algebras

Let $s \in S_{\mathcal{F}}$. The norm $N_s$ of $s$ is defined with respect to the basis $\{e_1, e_2, \ldots, e_n\}$ as:
$$N_s = \alpha_1^2 + \alpha_2^2 + \cdots + \alpha_n^2 \in \mathcal{F}$$
The algebra is called a normed algebra if, for some basis, the norm satisfies
$$N_{st} = N_s N_t$$
It is easy to show that a normed algebra has no divisors of zero, since if $s, t \in S_{\mathcal{F}}$ and if $st = 0$ then
$$N_{st} = N_0 = 0 \quad \therefore\ N_s N_t = 0 \ \Rightarrow\ N_s = 0 \text{ or } N_t = 0 \ \Rightarrow\ s = 0 \text{ or } t = 0$$
We conclude that if $st = 0$ then either $s = 0$ or $t = 0$.

In a normed algebra over the field of real numbers $\mathbb{R}$ it is always acceptable to assume the existence of a unit element $i_1$. To see this we note that for any non-zero element $H \in S_{\mathbb{R}}$,
$$H = \alpha_1 e_1 + \alpha_2 e_2 + \cdots + \alpha_n e_n \qquad \alpha_i \in \mathbb{R}$$
we have
$$N_H = \alpha_1^2 + \alpha_2^2 + \cdots + \alpha_n^2$$
Hence we can introduce an element $h = \frac{1}{\sqrt{N_H}} H$ such that $N_h = 1$. Now since $S_{\mathbb{R}}$ is a normed algebra then, for all $s \in S_{\mathbb{R}}$,
$$N_{hs} = N_h N_s = N_s$$
but this shows that the map $L_h : s \to hs$ is an orthogonal linear transformation from $S_{\mathbb{R}}$ onto itself and hence $L_h$ is invertible: $L_h^{-1}(hs) = s$. Similarly $R_h$ is invertible: $R_h^{-1}(sh) = s$. We can introduce an element $i_1$ such that $i_1 = h^2$ and a new product into $S_{\mathbb{R}}$ by:
$$s \otimes t = R_h^{-1}(s)\, L_h^{-1}(t)$$
then
$$i_1 \otimes t = R_h^{-1}(i_1)\, L_h^{-1}(t) = h\, L_h^{-1}(t) = L_h\big(L_h^{-1}(t)\big) = t$$
Similarly
$$s \otimes i_1 = R_h^{-1}(s)\, L_h^{-1}(i_1) = R_h^{-1}(s)\, L_h^{-1}(h^2) = R_h^{-1}(s)\, h = R_h\big(R_h^{-1}(s)\big) = s$$
Thus, with respect to the new product, we have constructed an element $i_1$ such that
$$i_1 \otimes t = t \otimes i_1 = t$$

Isomorphic Algebras

Two algebras $S_{\mathcal{F}}$ and $S'_{\mathcal{F}}$ over the same field $\mathcal{F}$ are said to be isomorphic if there is a one-to-one correspondence $\phi : s \to s'$ such that, for all $s, t \in S_{\mathcal{F}}$ with images $s', t' \in S'_{\mathcal{F}}$:

(i) $\phi(s + t) = s' + t'$

(ii) $\phi(\alpha s) = \alpha s'$, $\alpha \in \mathcal{F}$

(iii) $\phi(st) = s't'$

If (iii) is replaced by $\phi(st) = t's'$ then $S_{\mathcal{F}}$ and $S'_{\mathcal{F}}$ are said to be reciprocal.

Involution

A map $\phi$ is said to be an involution if

(i) $\phi^2$ = the identity map

(ii) $\phi(ab) = \phi(a)\phi(b)$

If (ii) is replaced by

(ii)' $\phi(ab) = \phi(b)\phi(a)$

then $\phi$ is said to be an anti-involution. For example, in the field of complex numbers the conjugate operation (using the usual notation) $\phi : z \to z^*$ is an involution since
$$\phi^2(z) = \phi(\phi(z)) = \phi(z^*) = z \qquad \text{and} \qquad \phi(zw) = (zw)^* = z^* w^*$$
However, in the algebra of quaternions, as we shall see, the quaternion conjugate is an anti-involution.

1.8 Complex Numbers

The properties of multiplication of scalars, which stem from the basic concept of addition, show that positive and negative scalars are distinguished: positive scalars have positive and negative square roots whilst negative scalars seem to have no square roots at all. This biased treatment of positive and negative scalars is unsatisfactory and was only resolved with the invention of complex numbers. The crux of the difficulty with scalars is that when a scalar (or displacement) is multiplied by a positive scalar no change of direction is involved, whereas when a scalar is multiplied by a negative scalar a 180° change of direction is involved. What we shall seek is a treatment of number which encompasses a continuous change of direction, from moving to the right on a line to moving to the left on the line. It is obvious how to do this geometrically by using the notion of rotation in the plane, and it is the geometrical construct of a complex number that we shall consider first.

We are led naturally to consider extending the concept of a scalar from the displacement of points on a line to the displacement of points on a plane. In this process some of the algebraic structure of scalars will be lost; in particular, as is obvious geometrically and as we shall later prove algebraically, we lose the concept of order.

We can extend the geometrical concept of an ordinary number to two dimensions. In a natural way, number pairs are introduced. Each point in the plane can be represented by a number pair $z = (a, b)$, where $a$ is the distance along the $s$-axis (the scalar or real axis) and $b$ is the distance along a perpendicular axis through $O$, the $v$-axis (the vector or imaginary axis). As with scalars, the square of the length of the line segment connecting $(0,0)$ to $(a, b)$ is called the norm of $z$, denoted by $N_z$.
We call $z$ a 'complex number' because it is only properly defined by two 'ordinary' numbers $a, b$ taken in a certain order. We need to invent an algebra for these complex numbers which is consistent with ordinary scalar algebra. Addition is defined in an obvious way. If $z = (a, b)$ and $w = (c, d)$ are any two complex numbers then their sum is written $z + w$ and defined by: $z + w = (a + c,\ b + d)$. That is, corresponding 'components' are added together. Addition is associative and commutative; this follows directly from the associative and commutative properties of scalars. The additive identity, or zero, is the complex number $0 = (0,0)$ and is such that $z + 0 = 0 + z = z$. Also, for given complex numbers $z, w$, we can always find a complex number $u$ satisfying the equation $z + u = w$. The solution is $u = (c - a,\ d - b)$. Thus the first three ring axioms are satisfied.
Scalar Multiplication

If $s$ is a scalar and $z$ a complex number then the product of $s$ with $z$ is written $sz$ and is the complex number in the direction of $(\operatorname{sign} s)\,z$ with norm $N_{sz} = N_s N_z$. Equivalently, if $z = (a, b)$ then $sz = (sa, sb)$. Using scalar multiplication it is clear that any complex number $z$ can be written in the form:
$$z = \sqrt{N_z}\,\hat{z}$$
where $N_{\hat{z}} = 1$ and $\hat{z}$ is in the same direction as $z$.

Complex Products

The more difficult concept is multiplication of complex numbers. How should this binary operation be defined? Any definition we introduce must be in agreement with the results that we are familiar with for the multiplication of scalars.

A point in the plane can be specified by a pair of Cartesian coordinates $(s, v)$ or by so-called polar coordinates $[r, \theta]$, where $r$ denotes the positive distance from the origin and $\theta$ denotes the angle made with the positive $s$-axis. See Figure 1.11. In a very obvious sense $[r, \theta]$ is a generalisation of the 'polar form' of a scalar.

Figure 1.11: The complex number $z$ can be labelled in two alternative ways: $(s, v)$, the Cartesian description, and $[r, \theta]$, the polar description.

The important characteristic of the polar angle $\theta$ is that it is only unique up to an integer multiple of $2\pi$, since $[r, \theta]$ and $[r, \theta + 2k\pi]$ label the same point. In the polar description any positive scalar has the form $[r, 2k\pi]$ and any negative scalar has the form $[r, (2k + 1)\pi]$, where $k$ is an integer.

To consider possible definitions of multiplication we consider the well-known results, in polar form, of multiplying scalars. These are grouped together in the following diagram, which emphasises the product by using the times symbol $\times$.
Known results:

(positive) times (positive): $a = [r_a, 0]$, $b = [r_b, 0]$, $a \times b = [r_a r_b, 0]$, a positive ordinary number

(positive) times (negative): $a = [r_a, 0]$, $b = [r_b, \pi]$, $a \times b = [r_a r_b, \pi]$, a negative ordinary number

(negative) times (negative): $a = [r_a, \pi]$, $b = [r_b, \pi]$, $a \times b = [r_a r_b, 2\pi]$, a positive ordinary number

Definition 1

If $z = [r_z, \theta_z]$ and $w = [r_w, \theta_w]$ are two general complex numbers then a 'possible' definition for multiplication, consistent with ordinary scalar multiplication, could be:
$$z \times w = [r_z r_w,\ \theta_w - \theta_z]$$
That is, the '$r$-values' multiply and the 'angles' subtract. This definition leads to the usual results for scalars:

$z = [r_z, 0]$, $w = [r_w, 0]$; $z \times w = [r_z r_w, 0]$, a positive scalar

$z = [r_z, 0]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, \pi]$, a negative scalar

$z = [r_z, \pi]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, 0]$, a positive scalar.
This definition is not used as it has a number of obvious drawbacks:

(i) $z \times w \neq w \times z$. That is, this product is not commutative for general complex numbers (though it is for scalars).

(ii) Although each positive scalar has the usual square roots, it also has an infinite number of distinct complex square roots. That is, $[r_z, 0]$ has square roots $[r_z^{1/2}, \theta]$ for any $\theta$.

(iii) There do not exist square roots of any other scalar! That is, if we have a complex number $[r_z, \theta]$, $\theta \neq 0$, then we cannot find $[r_w, \phi]$ such that
$$[r_w, \phi] \times [r_w, \phi] = [r_z, \theta]$$
We are thus led to reject this definition of multiplication. (Some of these properties seem strange indeed. However, as we shall see, similar properties will have to be accepted for quaternion numbers.)

Definition 2

If $z = [r_z, \theta_z]$ and $w = [r_w, \theta_w]$ are two general complex numbers then a second 'possible' definition for multiplication could be:
$$z \times w = [r_z r_w,\ \theta_z + \theta_w]$$
i.e. '$r$-values' multiply and 'angles' add. As in definition 1 this produces results consistent with the usual results for multiplication of scalars:

$z = [r_z, 0]$, $w = [r_w, 0]$; $z \times w = [r_z r_w, 0]$, a positive scalar

$z = [r_z, 0]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, \pi]$, a negative scalar

$z = [r_z, \pi]$, $w = [r_w, \pi]$; $z \times w = [r_z r_w, 2\pi]$, a positive scalar.

This definition does not suffer from the disadvantages of definition 1.

1. $z \times w = w \times z$. That is, the product is commutative for all complex numbers.

2. Every non-zero complex number $[r_z, \theta_z]$ has, as we hoped, exactly two square roots. This is easy to prove. If $[r_z, \phi]$ is a square root of the complex number $[r_w, \theta + 2k\pi]$ for some integer $k$ we have: if
$$[r_z, \phi] \times [r_z, \phi] = [r_w,\ \theta + 2k\pi]$$
then
$$r_z^2 = r_w \qquad 2\phi = \theta + 2k\pi \qquad k \text{ an integer}$$
$$r_z = r_w^{1/2} \qquad \phi = \frac{\theta}{2} + k\pi \qquad k \text{ an integer}$$
But there are only two distinct values of $[r_z, \phi]$, according as $k$ is even or odd. The geometrical construction of square roots is described in Figure 1.12.

Figure 1.12: the two square roots $z_1 = [r^{1/2},\ \theta/2]$ and $z_2 = [r^{1/2},\ \theta/2 + \pi]$.

We agree to accept this as the definition of multiplication of complex numbers. The close connection between multiplication and rotation is strongest when we consider complex numbers of unit norm, $r = 1$. If $w$ and $z$ are two such numbers,
$$w = [1, \theta] \qquad z = [1, \phi]$$
then
$$zw = [1,\ \theta + \phi]$$
and the effect of multiplying $w$ by $z$ is to rotate (in the anti-clockwise, positive sense) the complex number $w$ through angle $\phi$.

To recap: the product of two complex numbers $z = (a, b)$, $w = (c, d)$ is defined according to the rule:
$$zw = (ac - bd,\ ad + bc) \quad \text{in Cartesians}$$
or
$$zw = [r_z r_w,\ \theta_z + \theta_w] \quad \text{in polars}$$
This product is commutative,
$$wz = (ca - db,\ cb + da) = zw$$
since ordinary scalar multiplication is commutative. It is also associative. If $p = (e, f)$ is a third complex number then
$$\begin{aligned} z(wp) &= (a,b)\{(c,d)(e,f)\} \\ &= (a,b)(ce - df,\ cf + de) \\ &= \big(a(ce - df) - b(cf + de),\ a(cf + de) + b(ce - df)\big) \\ &= \big((ac - bd)e - f(ad + bc),\ (ad + bc)e + f(ac - bd)\big) \\ &= (ac - bd,\ ad + bc)(e, f) \\ &= (zw)p \end{aligned}$$
The set of complex numbers, denoted by $\mathbb{C}$, with the binary operations of addition and multiplication, satisfies all the axioms of a commutative ring. The complex numbers admit a multiplicative identity (or unit), namely $(1, 0)$. Whether we use the paired number notation $(\alpha, \beta)$ or the more common algebraic form $\alpha + j\beta$, for which $j^2 = -1$ (or indeed one of many other possible forms used to represent a complex number), will depend entirely upon the immediate application.

We should note that although $\mathbb{C}$ is a commutative ring, it cannot be ordered. To see this we first assume that we can construct a subset $\mathbb{C}_p \subset \mathbb{C}$ which satisfies the order relations. Of course $1 \in \mathbb{C}_p$, since $1 \times 1 = 1$ and the square of every non-zero element of an ordered set is positive. Thus $-1 \notin \mathbb{C}_p$. But $j^2 = -1$, which contradicts the property that the square of every non-zero element is positive. Thus $\mathbb{C}$ cannot be ordered.

The conjugate map

Let $z = s + jv$; then the conjugate of $z$, denoted by $\bar{z}$, is defined as $\bar{z} = s - jv$. It is easily confirmed that for all $z, w \in \mathbb{C}$
$$\overline{(z + w)} = \bar{z} + \bar{w} \qquad \overline{(zw)} = \bar{z}\,\bar{w}$$
The map
$$f : \mathbb{C} \to \mathbb{C} \qquad f(z) = \bar{z}$$
is an automorphism since
$$f(zw) = \overline{(zw)} = \bar{z}\,\bar{w} = f(z)f(w) \qquad \text{and} \qquad f(z + w) = \overline{(z + w)} = \bar{z} + \bar{w} = f(z) + f(w)$$
Also $f$ is clearly one-to-one and onto. This of course is not surprising from a geometrical point of view, since $\bar{z}$ is the reflection of $z$ in the scalar axis.

Using the conjugate we can re-express the norm of $z$ as $N_z = z\bar{z}$, and we find $N_{zw} = N_z N_w$. Also, since $z\bar{z} = \bar{z}z \in \mathbb{R}_p$, it is clear that to each non-zero complex number $z$ we can construct a multiplicative inverse $z^{-1} = \bar{z}/N_z$.
Since $(\mathbb{C}, +, \cdot)$ is a commutative ring with, for every non-zero $z \in \mathbb{C}$, an inverse $z^{-1}$, then $(\mathbb{C}, +, \cdot)$ is a field. Using the conjugate we see that the scalar part $S(z)$ and vector part $V(z)$ of $z$ are
$$S(z) = \tfrac{1}{2}(z + \bar{z}) \qquad V(z) = \tfrac{1}{2}(z - \bar{z})$$
We can also deduce two interesting inequalities with reference to the norm. First, for any complex number $z = s + jv$,
$$S(z) = \tfrac{1}{2}(z + \bar{z}) = s \qquad \text{and} \qquad N_z = z\bar{z} = s^2 + v^2$$
Therefore, since $\sqrt{s^2 + v^2} \geq s$,
$$\sqrt{N_z} \geq S(z) \quad \text{for all } z$$
Now
$$N_{z+w} = (z + w)\overline{(z + w)} = z\bar{z} + (z\bar{w} + w\bar{z}) + w\bar{w} = N_z + 2S(z\bar{w}) + N_w \leq N_z + 2\sqrt{N_{z\bar{w}}} + N_w$$
$$\therefore\ N_{z+w} \leq N_z + 2\sqrt{N_z N_w} + N_w$$
or, in terms of the more commonly used moduli: $|z + w| \leq |z| + |w|$. In a similar vein
$$N_{z-w} = (z - w)\overline{(z - w)} = N_z - 2S(z\bar{w}) + N_w \geq N_z - 2\sqrt{N_z N_w} + N_w$$
That is, $|z - w| \geq \big||z| - |w|\big|$. These are the triangle inequalities.

The Angle Between two Complex Numbers

Let $z$ and $w$ be any two complex numbers. Then, in polar form:
$$z = r_z(\cos\alpha + j\sin\alpha) \qquad w = r_w(\cos\beta + j\sin\beta)$$
and, as is easily found:
$$z\bar{w} = r_z r_w\big(\cos(\alpha - \beta) + j\sin(\alpha - \beta)\big)$$
Thus we can conclude
$$S(z\bar{w}) = r_z r_w \cos(\alpha - \beta) \qquad \text{and} \qquad V(z\bar{w}) = r_z r_w\, j\sin(\alpha - \beta)$$
This leads to the definition of the angle $\theta = \alpha - \beta$ between the two directions represented by the complex numbers $z$ and $w$ to be such that:
$$\cos\theta = \frac{S(z\bar{w})}{\sqrt{N_z}\sqrt{N_w}} \qquad \text{and} \qquad j\sin\theta = \frac{V(z\bar{w})}{\sqrt{N_z}\sqrt{N_w}}$$
These results on angle suggest that the space of complex numbers can be considered as an inner product space. This is confirmed if we choose as inner product in $\mathbb{C}$:
$$\langle z, w \rangle = S(z\bar{w}) \in \mathbb{R}$$
The inner product axioms are easily checked.

Rotations and Reflections

We now show that multiplication of a complex number $w$ by a unit complex number $z = x + jy$ represents a rotation of $w$ in $\mathbb{R}^2$. To see this consider the matrix representation of the linear map $\phi$:
$$\phi : \mathbb{R}^2 \to \mathbb{R}^2 \qquad \phi(w) = zw$$
Now:
$$\phi(1) = z = x + jy \qquad \phi(j) = zj = -y + jx$$
Therefore its matrix representation (the rows holding the components of $\phi(1)$ and $\phi(j)$) is:
$$A_\phi = \begin{pmatrix} x & y \\ -y & x \end{pmatrix}$$
An easy calculation confirms that $A_\phi$ is orthogonal and $\det A_\phi = x^2 + y^2 = 1$. Hence $A_\phi \in SO(2)$ and the transformation is a rotation.

We now consider a related transformation: a reflection. Let $w$ be a given complex number and $z$ a unit complex number ($N_z = 1$). See Figure 1.13.

Figure 1.13

We might ask what is the reflection $w_R$ of $w$ in the line represented by $z$, and what is the reflection $w_Q$ in the line whose normal is represented by $z$? Clearly $w_R$ is obtained by rotating $w$ in the clockwise direction through angle $2\alpha$, where $\alpha$ is the angle from $z$ to $w$:
$$w_R = w e^{-2j\alpha}$$
Now, since
$$e^{j\alpha} = \frac{\bar{z}w}{\sqrt{N_w}}$$
we have
$$w_R = w e^{-2j\alpha} = w\,\frac{z^2 \bar{w}^2}{N_w} = z^2 \bar{w}\,\frac{w\bar{w}}{N_w} = z^2 \bar{w}$$
In a similar manner we find:
$$w_Q = w e^{j(180° - 2\alpha)} = -w e^{-2j\alpha} = -z^2 \bar{w} = -w_R$$
It is self-evident that, for a given point $w$, a planar rotation is equivalent to a reflection in the line bisecting the angle of rotation.

We can consider reflection from a linear space perspective. The map
$$\theta : \mathbb{R}^2 \to \mathbb{R}^2 \qquad \theta(w) = z^2 \bar{w}$$
is linear. Also
$$\theta(1) = z^2 = x^2 - y^2 + 2jxy = 1\,(x^2 - y^2) + (2xy)\,j$$
$$\theta(j) = z^2\,\bar{j} = 2xy - j(x^2 - y^2) = 1\,(2xy) + \big(-(x^2 - y^2)\big)\,j$$
Therefore the matrix representing the map is:
$$A_\theta = \begin{pmatrix} x^2 - y^2 & 2xy \\ 2xy & -(x^2 - y^2) \end{pmatrix}$$
As with the rotation map $\phi(w) = zw$ it is easily verified that this is an orthogonal matrix, but here $\det A_\theta = -1$. That is, the $\theta$-map does not preserve orientation and so cannot be a rotation. We also see (using our knowledge of matrices) that two successive reflections, represented by $A_\theta$ and $B_\theta$ respectively, are equivalent to a rotation, since $A_\theta B_\theta$ is orthogonal if $A_\theta$ and $B_\theta$ are, and $\det(A_\theta B_\theta) = \det A_\theta \det B_\theta = +1$.

Circular Arcs

We have seen that if $w$ is any complex number then $(\cos\theta + j\sin\theta)w$ rotates $w$ through an angle $\theta$ in an anti-clockwise direction. We can form an association between the arc of the circle which subtends angle $\theta$ and the unit complex number $\cos\theta + j\sin\theta$. We write (using $\sim$ to specify the geometrical correspondence)
$$z = \cos\theta + j\sin\theta \ \sim\ \operatorname{arc} AB$$
We note that the arc can be positioned anywhere on the circle: it is slidable. The arc is able to move to any position on this circle so long as its length and direction remain unchanged. See Figure 1.14(a).
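Returning briefly to the rotation map $w \to zw$ and the reflection map $w \to z^2\bar{w}$ discussed above, the claims about their matrices can be checked numerically. The following is only a minimal sketch: the function names are illustrative choices, not notation from the text, and it assumes the row convention in which a point $(u, v)$ is written as a row vector multiplying the matrix from the left.

```python
import cmath
import math


def rotation_matrix(z):
    """Matrix of w -> z*w for a unit complex z = x + jy (rows: images of 1 and j)."""
    x, y = z.real, z.imag
    return [[x, y], [-y, x]]


def reflection_matrix(z):
    """Matrix of w -> z**2 * conj(w) for a unit complex z = x + jy."""
    x, y = z.real, z.imag
    return [[x * x - y * y, 2 * x * y], [2 * x * y, -(x * x - y * y)]]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def apply_row(w, m):
    """Treat w = u + jv as the row vector (u, v) and compute (u, v) m."""
    u, v = w.real, w.imag
    return complex(u * m[0][0] + v * m[1][0], u * m[0][1] + v * m[1][1])


def reflect(z, w):
    """Reflection of w in the line through the origin along the unit complex z."""
    return z * z * w.conjugate()


z = cmath.exp(0.3j)   # a unit complex number (hypothetical sample value)
w = 2 - 1.5j          # an arbitrary point to transform

print(det2(rotation_matrix(z)))    # close to +1: orientation preserved
print(det2(reflection_matrix(z)))  # close to -1: orientation reversed
print(reflect(z, reflect(z, w)))   # reflecting twice recovers w
```

Points on the mirror line are left fixed, since `reflect(z, z)` equals $z^2\bar{z} = z$ when $N_z = 1$.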
From this correspondence we deduce that any point on the circle, or indeed the circle itself ($\theta = 0$), is represented by $z = 1$ and any semi-circle ($\theta = \pi$) is represented by $z = -1$. Also if $\operatorname{arc} AB$ is represented by $z$ then $z^{-1}$ represents $\operatorname{arc} BA$ whilst $-z$ represents $\operatorname{arc} DB$ ($DOA$ is a diameter, see Figure 1.14(b)). Also the complex number $z = j$ (i.e. $\theta = \pi/2$) represents a quarter-circle.

Circle arcs may be added vectorially by sliding one along the circle until its start point is at the end point of the other, and clearly
$$\operatorname{arc}_{\hat{z}} \oplus \operatorname{arc}_{\hat{w}} = \operatorname{arc}_{\hat{z}\hat{w}}$$
where $\oplus$ denotes 'vector summation' of arcs in the sense indicated in Figure 1.14. Any two arcs $\hat{z}, \hat{w}$ will form a closed circle only if $\hat{z}\hat{w} = 1$, since then
$$\operatorname{arc}_{\hat{z}} \oplus \operatorname{arc}_{\hat{w}} = 0$$
This is easily generalised: a collection of $n$ arcs, represented by $\hat{z}, \hat{w}, \ldots, \hat{h}$, will form a closed circle only if $\hat{z}\hat{w}\cdots\hat{h} = 1$.

Figure 1.14 (a), (b)

We can group together the main results of the correspondence between unit complex numbers and circular arcs:

$z \sim \operatorname{arc} AB$

$z^{-1} \sim \operatorname{arc} BA$

$-z \sim \operatorname{arc} DB$

$1 \sim$ point or circle

$-1 \sim$ semi-circle

$j \sim$ quarter-circle

$\operatorname{arc}_{\hat{z}} \oplus \text{circle} = \operatorname{arc}_{\hat{z}}$

$\operatorname{arc}_{\hat{z}} \oplus \text{semi-circle} = -\operatorname{arc}_{\hat{z}}$
If $z, w$ are general complex numbers, $z = \sqrt{N_z}\,\hat{z}$ and $w = \sqrt{N_w}\,\hat{w}$, where $\hat{z}, \hat{w}$ are complex numbers with unit norms, then
$$zw = \sqrt{N_z}\,\hat{z}\,\sqrt{N_w}\,\hat{w} \ \sim\ \sqrt{N_z}\sqrt{N_w}\ \operatorname{arc}_{\hat{z}\hat{w}}$$
Interpreted appropriately, we see that multiplication of complex numbers can be decomposed into the product of positive reals and the vector addition of arcs. As we shall see later, a very similar (and much more useful) correspondence can be made between great circle arcs on a sphere and quaternions, which generalise complex numbers.

Some authors define the polar coordinates of $z = \sqrt{N_z}\,\hat{z}$ to be (essentially) the pair $(\sqrt{N_z},\ \operatorname{arc}_{\hat{z}})$. This is in accord with the 'polar coordinates' defined for real numbers. See Figure 1.15.

Figure 1.15: alternative polar coordinates for $z$

Matrices and Complex Numbers

Another way of considering complex numbers utilises matrices. Matrices arise naturally when rotations in the plane are considered. A rotation (combined with an expansion) of a point with coordinates $(x, y)$ to a new point with coordinates $(x', y')$ can be effected by the transformation
$$x' = ax - by = \sqrt{a^2 + b^2}\,[x\cos\theta - y\sin\theta]$$
$$y' = bx + ay = \sqrt{a^2 + b^2}\,[x\sin\theta + y\cos\theta]$$
where
$$\cos\theta = \frac{a}{\sqrt{a^2 + b^2}} \qquad \sin\theta = \frac{b}{\sqrt{a^2 + b^2}}$$
The terms in square brackets describe a pure rotation, through angle $\theta$, and the factor $\sqrt{a^2 + b^2}$ describes an expansion: the point $(x', y')$ is a factor $\sqrt{a^2 + b^2}$ further away from the origin than $(x, y)$. This transformation may be written in matrix form:
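$$(x', y') = (x, y)\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$

Theorem 1.6

Let the set of all $2 \times 2$ matrices of the form
$$\begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
be denoted by $M(2, \mathbb{R})$. The map
$$f : (\mathbb{C}, +, \cdot) \to (M(2, \mathbb{R}), \oplus, \otimes) \qquad f(a + jb) = \begin{pmatrix} a & b \\ -b & a \end{pmatrix}$$
is an isomorphism.

Proof
$$\begin{aligned} f\{(a + jb)\cdot(c + jd)\} &= f\{(ac - bd) + j(bc + ad)\} \\ &= \begin{pmatrix} ac - bd & bc + ad \\ -bc - ad & ac - bd \end{pmatrix} \\ &= \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \otimes \begin{pmatrix} c & d \\ -d & c \end{pmatrix} \\ &= f(a + jb) \otimes f(c + jd) \end{aligned}$$
$$\begin{aligned} f\{(a + jb) + (c + jd)\} &= f\{(a + c) + j(b + d)\} \\ &= \begin{pmatrix} a + c & b + d \\ -b - d & a + c \end{pmatrix} \\ &= \begin{pmatrix} a & b \\ -b & a \end{pmatrix} \oplus \begin{pmatrix} c & d \\ -d & c \end{pmatrix} \\ &= f(a + jb) \oplus f(c + jd) \end{aligned}$$
Thus $f$ is a homomorphism. It is clearly one-to-one and onto. Thus the map $f$ is an isomorphism.

We note that the complex conjugate corresponds, in matrix terms, to the transpose operation.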
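The isomorphism of Theorem 1.6 can be illustrated with a short numerical check. This is a sketch only, under the assumption that $a + jb$ is sent to the matrix with rows $(a, b)$ and $(-b, a)$; the helper names below are hypothetical and Python's built-in `complex` type stands in for $\mathbb{C}$.

```python
def to_matrix(z):
    """Send a + jb to the 2 x 2 real matrix ((a, b), (-b, a))."""
    a, b = z.real, z.imag
    return ((a, b), (-b, a))


def mat_add(m, n):
    """Entrywise sum of two 2 x 2 matrices."""
    return tuple(tuple(m[r][c] + n[r][c] for c in range(2)) for r in range(2))


def mat_mul(m, n):
    """Ordinary matrix product of two 2 x 2 matrices."""
    return tuple(
        tuple(sum(m[r][k] * n[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )


def transpose(m):
    """Transpose of a 2 x 2 matrix."""
    return tuple(tuple(m[c][r] for c in range(2)) for r in range(2))


z = complex(2, 3)    # sample values with integer parts, so comparisons are exact
w = complex(-1, 4)

print(to_matrix(z * w) == mat_mul(to_matrix(z), to_matrix(w)))    # products agree
print(to_matrix(z + w) == mat_add(to_matrix(z), to_matrix(w)))    # sums agree
print(to_matrix(z.conjugate()) == transpose(to_matrix(z)))        # conjugate is transpose
```

Integer-valued sample numbers are used so that the floating-point comparisons are exact; for general values a tolerance-based comparison would be the safer choice.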
Chapter 2

Quaternions

2.1 Inventing Quaternions

The quaternion is a generalisation of a complex number (see van der Waerden's article [6] for the historical development). We can consider generalising the complex number from a geometrical or an algebraic perspective. We begin with the geometrical approach.

The complex number describes 2-dimensional space (the $xy$ plane) and (as far as multiplication of complex numbers is concerned) rotations within it. Now rotations in the $xy$ plane are commutative. That is, if $P_1$ and $P_2$ denote rotations in the $xy$ plane, and if the succession of rotations $P_1$ followed by $P_2$ is denoted by $P_2P_1$, then $P_1P_2 = P_2P_1$. This is reflected in the algebra of complex numbers: multiplication of complex numbers (which, as we know, describes planar rotations) is commutative, $z_1z_2 = z_2z_1$.

However, if we wish to describe rotations in 3-dimensional space then we are immediately faced with a difficulty since, as is easily demonstrated, space rotations are non-commutative. For example, in 3-dimensional space with axes $Ox, Oy, Oz$ we consider two right-handed rotations:

$R_1$: about the $x$-axis through 90°

$R_2$: about the $y$-axis through 90°

Applying $R_1$ first takes a point $P : (0,0,1)$ to the point $P' : (0,-1,0)$. Application of $R_2$ leaves $P'$ unchanged. Conversely, applying $R_2$ first takes $P$ into $P'' : (1,0,0)$. Application of $R_1$ leaves $P''$ unchanged, i.e. $R_2R_1 \neq R_1R_2$, and the order of the rotations is important (unlike planar rotations). See Figure 2.1.

Figure 2.1 (a), (b), (c)

J. P. Ward, Quaternions and Cayley Numbers © Kluwer Academic Publishers 1997
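The non-commutativity of the two rotations $R_1$ and $R_2$ described in Section 2.1 can be reproduced with a few lines of code. The sketch below assumes the standard right-handed rotation matrices acting on column vectors; the names `R1`, `R2` and `apply` are illustrative only.

```python
def apply(m, v):
    """Multiply the 3 x 3 matrix m by the column vector v."""
    return tuple(sum(m[r][c] * v[c] for c in range(3)) for r in range(3))


# Right-handed rotation through 90 degrees about the x-axis: (x, y, z) -> (x, -z, y)
R1 = ((1, 0, 0),
      (0, 0, -1),
      (0, 1, 0))

# Right-handed rotation through 90 degrees about the y-axis: (x, y, z) -> (z, y, -x)
R2 = ((0, 0, 1),
      (0, 1, 0),
      (-1, 0, 0))

P = (0, 0, 1)

r1_then_r2 = apply(R2, apply(R1, P))   # R1 first, then R2
r2_then_r1 = apply(R1, apply(R2, P))   # R2 first, then R1

print(r1_then_r2)   # (0, -1, 0), the point P' of the text
print(r2_then_r1)   # (1, 0, 0), the point P'' of the text
```

The two results differ, which is exactly the statement $R_2R_1 \neq R_1R_2$.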
What we have found, in the natural progression from ordinary numbers to complex numbers, is that both have exactly the same algebra: in particular the associative and commutative relations for addition and multiplication hold true for complex numbers. However, the order property of ordinary numbers is meaningless when applied to complex numbers and must be abandoned. The example above, on finite rotations, illustrates that if complex numbers are to be generalised to describe 3-dimensional rotations then the algebra of the generalised 'objects' is likely to have a 'product' which is non-commutative.

Complex Numbers and Quaternions

We shall find it convenient to write a complex number in the form
$$z = 1a + jb \qquad j^2 = -1$$
where $a, b$ are ordinary numbers (scalars). The complex number is made up of two parts: a scalar part $1a$ and a vector part $jb$. The usual constructions can now be introduced: the conjugate $\bar{z}$, the norm $N_z$ and the angle $\theta$:
$$\bar{z} = 1a - jb \qquad N_z = a^2 + b^2 \qquad \cos\theta = \frac{a}{\sqrt{N_z}} \qquad \sin\theta = \frac{b}{\sqrt{N_z}}$$
Then every complex number can be written in polar form: $z = \sqrt{N_z}(\cos\theta + j\sin\theta)$. Other common properties can now be deduced:
$$z^{-1} = \frac{\bar{z}}{N_z} \qquad N_{zw} = N_z N_w \qquad N_{z/w} = \frac{N_z}{N_w}$$
The quaternion was introduced by Hamilton. His initial attempt to generalise the complex numbers, by introducing a 3-dimensional object (of the form $q = 1a + ib + jc$), failed in the sense that the algebra he constructed for these 3-dimensional objects did not have the desired properties. In particular it failed to satisfy the 'norm' property: in general $N_{pq} \neq N_p N_q$. On 16th October 1843 Hamilton discovered that the appropriate generalisation is one in which the scalar (real) axis is left unchanged whereas the vector (imaginary) axis is supplemented by adding two further vector axes. The reader might find it helpful to think of the scalar axis as representing 'time' and the three vector axes as representing 'space'. The basic algebraic form for a quaternion $q$ is:
$$q = 1a + ib + jc + kd$$
where $a, b, c, d$ are ordinary numbers. The vector space is regarded as the usual 3-dimensional vector space with 'unit vectors' $i$, $j$ and $k$.
What properties should we expect/demand for these new objects? We might certainly wish that the rules for addition and multiplication by a scalar should mirror those for complex numbers. Thus if $q = a + bi + cj + dk$ and $q' = a' + b'i + c'j + d'k$ are any two quaternions then equality, addition and multiplication by a scalar are defined trivially:

equality: $q = q'$ only if $a = a'$, $b = b'$, $c = c'$, $d = d'$

addition: $q + q' = a + a' + (b + b')i + (c + c')j + (d + d')k = q' + q$

multiplication by a scalar $s \in \mathbb{R}$: $sq = sa + sbi + scj + sdk$

Clearly addition is associative, $q + (p + h) = (q + p) + h$, and the usual properties of multiplication by a scalar are satisfied:
$$(s + t)q = sq + tq, \qquad s(q + p) = sq + sp \qquad s, t \in \mathbb{R}$$
If the scalar and vector parts of $q$ are $Sq$ and $Vq$ respectively, defined as
$$Sq = a \qquad Vq = bi + cj + dk$$
then, for any quaternion $q$:
$$q = Sq + Vq$$
By analogy with complex numbers the conjugate of $q$, denoted by $\bar{q}$, is
$$\bar{q} = Sq - Vq$$
whilst the norm of $q$, $N_q$, is
$$N_q = a^2 + b^2 + c^2 + d^2$$
We can also associate an angle $\theta$ with the quaternion $q$:
$$\cos\theta = \frac{a}{\sqrt{N_q}} \qquad \sin\theta = \frac{\sqrt{b^2 + c^2 + d^2}}{\sqrt{N_q}}$$
This is a sensible definition since, obviously,
$$-1 \leq \cos\theta \leq +1 \qquad -1 \leq \sin\theta \leq +1 \qquad \cos^2\theta + \sin^2\theta = 1$$
If $N_q = 1$ the quaternion is called a unit quaternion. Every quaternion can be written in 'polar' form:
$$q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta)$$
where $\hat{q}$ is a unit vector.
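The passage from the components $(a, b, c, d)$ to the 'polar' form can be made concrete as follows. This is a minimal sketch under stated assumptions: a quaternion is stored as a plain 4-tuple, the function names are hypothetical, and the vector part is assumed to be non-zero so that the unit vector $\hat{q}$ is well defined.

```python
import math


def polar_form(q):
    """Return (sqrt(N_q), theta, unit_vector) for q = (a, b, c, d)."""
    a, b, c, d = q
    vec_len = math.sqrt(b * b + c * c + d * d)
    if vec_len == 0:
        raise ValueError("the unit vector is undefined when the vector part is zero")
    root_norm = math.sqrt(a * a + b * b + c * c + d * d)
    # cos(theta) = a / sqrt(N_q) and sin(theta) = |Vq| / sqrt(N_q)
    theta = math.atan2(vec_len, a)
    unit = (b / vec_len, c / vec_len, d / vec_len)
    return root_norm, theta, unit


def from_polar(root_norm, theta, unit):
    """Rebuild (a, b, c, d) from sqrt(N_q) * (cos(theta) + unit * sin(theta))."""
    scalar = root_norm * math.cos(theta)
    vector = tuple(root_norm * math.sin(theta) * u for u in unit)
    return (scalar,) + vector


q = (1.0, 2.0, 2.0, 4.0)            # sample quaternion with N_q = 25
root_norm, theta, unit = polar_form(q)
rebuilt = from_polar(root_norm, theta, unit)

print(root_norm)   # square root of the norm
print(rebuilt)     # agrees with q up to rounding
```

Using `atan2` with a non-negative first argument places $\theta$ in $[0, \pi]$, which matches the fact that $\sin\theta$ as defined above is never negative.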
Quaternionic Multiplication

The crucial question concerns the product of one quaternion with another. We will be guided in our choice for the definition of multiplication by our wish to retain $j^2 = -1$ (and thus, by 'symmetry', $i^2 = -1$ and $k^2 = -1$ also). These three basic products must hold since, if our 3-dimensional vector space reduces to a 1-dimensional vector space (that is, if $b = c = 0$ or if $c = d = 0$ or if $b = d = 0$), the quaternion should reduce to a complex number with all its attendant algebraic properties.

We shall also wish to associate rotations with multiplication (in some form), just as we have done for complex numbers. There are two types of rotation that we can consider: in the four-dimensional space of the 'general' quaternion $q = a + bi + cj + dk$, or in the more familiar three-dimensional space of so-called 'pure' quaternions $bi + cj + dk$. We first consider the three-dimensional rotations in the space spanned by $i, j, k$.

If $w, q'$ are given quaternions and if $q'$ is expressed in polar form,
$$q' = \sqrt{N_{q'}}\,(\cos\phi + \hat{q}'\sin\phi)$$
then we might suspect/hope that the product of $q'$ with $w$ (written $q'w$) would imply a rotation of the vector part of $w$ about $\hat{q}'$ through an angle $\phi$. (This turns out not to be true.) Instead let us denote by
$$q' * w$$
a quaternion representing the operation of rotating the vector part of $w$ about $\hat{q}'$ through an angle depending on $\phi$ (as yet unknown). (In complex numbers $z_1 * z_2 = z_1 z_2$; that is, in that case the operator $*$ is the direct product. We shall see below that $q' * w = q'w(q')^{-1}$.)

Just as multiplying a complex number by a scalar leaves the direction of the complex number unchanged, so we demand that multiplying a quaternion by a scalar will imply no rotation; so $s * w = w$ for $s \in \mathbb{R}$. That is, rotations will only occur if $q'$ has a non-zero vector part, $Vq' \neq 0$.
If we now consider a third quaternion $q$, also expressed in polar form,
$$q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta)$$
then, from the constraints already imposed,
$$q * (q' * w)$$
would imply rotation of the vector part of $q' * w$ about the axis $\hat{q}$ through some angle depending on $\theta$. In the definition of the quaternion product we shall demand that
$$q * (q' * w) \ \text{is equivalent to} \ (qq') * w$$
That is, two consecutive rotations should be equivalent to a single rotation, this being a well-known property of 3-dimensional space. There are two important questions we should
now ask. What should the angle of rotation be in $q' * w$, and how do we determine $qq'$? We shall make our choice of angle, and consequently determine $qq'$, by considering some special cases (in effect by working out $i*i$, $i*j$, etc.).

Special case 1

Let $q = i$, $q' = i$; then, as we have demanded, $qq' = (i)^2 = -1$. The effect of $i * (i * w)$ is to rotate the vector part of a quaternion $w$ through an angle $A$ (say) about the $i$-axis and then through angle $A$ again, also about the $i$-axis. This is to be equivalent to multiplying the quaternion by $-1$ (a quaternion with no vector part), that is, not to rotate it at all. We are thus led to conclude that
$$2A = 0 \ \text{or} \ 2\pi \ \text{or} \ 4\pi, \ldots$$
We choose $A = \pi$ for non-trivial effects. That is, the operation $i*$ rotates the vector part of a quaternion through angle $\pi$ about the $i$-axis. This is to be contrasted with multiplication by the imaginary unit in the complex plane, which only implies a planar rotation through $\pi/2$. By symmetry we expect similar effects when operating by $j*$ or by $k*$.

Special case 2

Now consider the choice $q = i$, $q' = j$. What is $ij$? If we consider a point $P$ in 3-dimensional space with coordinates $(-\sigma, \rho, \epsilon)$ then the vector $OP$, which could be the vector part of a quaternion, transforms into $P' : (\sigma, \rho, -\epsilon)$ when operated on by $j*$ (rotation of $\pi$ about the $j$-axis) and then into $P'' : (\sigma, -\rho, \epsilon)$ when then operated on by $i*$ (rotation of $\pi$ about the $i$-axis). But by inspection (see Figure 2.2), this is equivalent to either a single operation by $k*$ (rotation of $\pi$ about the $k$-axis), or an operation by $-k*$ (rotation of $-\pi$ about the $k$-axis).

Figure 2.2

Thus we are led to conclude that
$$(ij)* = \pm(k)*$$
If we reverse the product and consider $ji$ we find $P \to Q' : (-\sigma, -\rho, -\epsilon)$ after operating by $i*$, then $Q' \to Q'' = P'' : (\sigma, -\rho, \epsilon)$ after operating by $j*$. Again we can only deduce the value of the product $ji$ $(= \pm k)$ up to a sign. We do not make the choice that $ij = ji$
as we are specifically looking for a product which is not commutative. Now both $i^2$ and $i^4$ imply rotations through a whole number of turns (360° and 720° respectively): but what is the significance of the difference in signs, $i^2 = -1$, $i^4 = +1$? We need to distinguish (in terms of rotations) between the minus sign obtained from the operator $i^2$ (or from $j^2$ or from $k^2$) and the plus sign obtained from the operator $i^4$. If we can manage to do this we shall be able to decide whether $ij = +k$ or $ij = -k$. A mechanism for explaining the difference between the operators $i^2$ and $i^4$ exists and is known as the quaternion demonstrator [8].

Quaternion Demonstrator (The Belt Trick)

A demonstration of the properties of quaternionic multiplication is obtained by using a disk, with distinctive sides, allowed to swing from a ribbon, also having distinctive sides. See Figure 2.3. It is the existence of the ribbon, which undergoes twisting when the disk rotates, that reflects the distinction between the operators $i^2$ and $i^4$. The reader is strongly urged to construct the disk as described here so that s/he can confirm directly the claims that are made in this section. A similar, but less versatile, demonstrator can be made with a plate balanced on the hand. This is less satisfying as a demonstrator, as the rotation of the plate is not quite so obvious as it is with the disk.

Figure 2.3: front view and back view of the demonstrator

In Figure 2.4 the disk is rotated about the $i$-axis through 360°, thereby returning the disk to its original configuration. This demonstrates the operator $i^2 = -1$. (In the plate/arm
combination there is a twist in the arm, which can be removed by a similar further complete rotation.)

Figure 2.4: $i^2 = -1$. Although the disk is back to the original configuration there is a twist in the ribbon.

As a result of the rotation there is of course a twist in the ribbon which cannot be removed unless the operation is reversed or unless the operation is repeated (demonstrating the effect of the operator $i^4 = +1$; see Figure 2.5).

Figure 2.5: The disk is again back to the original configuration. There are twists in the ribbon but, as is easily demonstrated, these can be untangled without any further rotation of the disk.
However, in this case the twists in the ribbon can be removed, without changing the intrinsic rotation of the disk, by allowing the tension in the ribbon to relax and by moving the disk along the circular path shown in Figure 2.6.

Figure 2.6

Thus we realise, using this demonstrator, that though $i^2$ and $i^4$ both leave the disk unaffected, $i^2$ is different from $i^4$ in that the former leaves a single twist in the ribbon (denoted by $-1$) whilst the second operator leaves the ribbon untwisted (denoted by $+1$) after the planar motion described in Figure 2.6 is carried through. In Figure 2.7 the demonstrator exhibits the operators $k^2 = -1$ and $k^4 = +1$.

Figure 2.7: (first panel) The disk is back to the original configuration. There is a twist in the ribbon opposite to that obtained when operating with $i*$. (second panel) The disk is again back to the original configuration. The apparent twists in the ribbon can be untangled without further rotation of the disk, as described earlier (this time an anticlockwise planar rotation will achieve the desired results).
Figure 2.8 (multiplication by $i$ and by $j$ in the two possible orders: one order is identical to multiplication by $-k$, the other to multiplication by $k$)

A similar operation to that described above will remove the apparent twists in the ribbon without changing the intrinsic rotation of the disk. The operators $j^2 = -1$ and $j^4 = +1$ are (pictorially) identical to those described in Figures 2.4, 2.5.

Finally, we use the demonstrator to show that $ji = -k$. Rotating the disk through $\pi$ about the $i$ axis and then through $\pi$ about the $j$ axis produces the second picture of Figure 2.8. This is identical to the picture obtained by rotation through $-\pi$ about the $k$ axis. As seen in Figure 2.8 the difference between $ij$ and $ji$ lies in the twist on the ribbon; the twists have different orientations.

2.2 Quaternion Algebra

Our considerations with rotations and particularly with the demonstrator lead us to conclude that the natural generalisation of the 2-dimensional complex form $z = a + jb$ is not a 3-dimensional object but the 4-dimensional quaternion $q$:

$$q = a + bi + cj + dk$$

such that $i^2 = j^2 = k^2 = -1$ and $ij = k$, $ji = -k$ (the last two relations holding also under cyclic permutation of $i, j, k$). The set of quaternion numbers is denoted by $\mathbb{H}$. Using these basic products we can now
expand the product of two quaternions to give (assuming for the moment that the product is distributive with respect to addition):

$$qq' = (a + bi + cj + dk)(a' + b'i + c'j + d'k) = (Sq + Vq)(Sq' + Vq')$$
$$= Sq\,Sq' + Sq'\,Vq + Sq\,Vq' + Vq\,Vq'$$
$$= Sq\,Sq' - Vq \cdot Vq' + Sq'\,Vq + Sq\,Vq' + Vq \wedge Vq'$$

where we have used the usual dot and cross products of vector analysis. We see that the quaternionic product contains all the products of vector analysis: products of two scalars, products of scalars with vectors, the dot product and the cross product. Since, in general, $Vq \wedge Vq' \neq Vq' \wedge Vq$, the quaternion product is not commutative unless $Vq$ is parallel to $Vq'$ or one of $q$, $q'$ has zero vector part.

Although the product is not commutative it is associative. If $q, p, h \in \mathbb{H}$ then the product $q(ph)$ is:

$$q(ph) = (Sq + Vq)(S(ph) + V(ph))$$
$$= Sq\,Sp\,Sh - \{Sq\,Vp \cdot Vh + Sp\,Vh \cdot Vq + Sh\,Vq \cdot Vp\}$$
$$+ \{Sq\,Sh\,Vp + Sh\,Sp\,Vq + Sp\,Sq\,Vh\}$$
$$+ \{Sq\,Vp \wedge Vh - Sp\,Vh \wedge Vq + Sh\,Vq \wedge Vp\}$$
$$- (Vp \cdot Vh)Vq + Vq \wedge (Vp \wedge Vh) - Vq \cdot (Vp \wedge Vh)$$

and the product $(qp)h$ is:

$$(qp)h = Sq\,Sp\,Sh - [Sh\,Vq \cdot Vp + Sq\,Vp \cdot Vh + Sp\,Vq \cdot Vh]$$
$$+ [Sq\,Sp\,Vh + Sh\,Sq\,Vp + Sh\,Sp\,Vq]$$
$$+ [Sq\,Vp \wedge Vh + Sp\,Vq \wedge Vh + Sh\,Vq \wedge Vp]$$
$$- (Vq \cdot Vp)Vh + (Vq \wedge Vp) \wedge Vh - (Vq \wedge Vp) \cdot Vh$$

The scalar triple products agree, so if $q(ph) = (qp)h$ is to be valid we need to show

$$-(Vp \cdot Vh)Vq + Vq \wedge (Vp \wedge Vh) = -(Vq \cdot Vp)Vh + (Vq \wedge Vp) \wedge Vh$$

But this is easily shown to be true if we employ the standard vector identities

$$\mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\mathbf{c}$$
$$(\mathbf{a} \wedge \mathbf{b}) \wedge \mathbf{c} = (\mathbf{a} \cdot \mathbf{c})\mathbf{b} - (\mathbf{b} \cdot \mathbf{c})\mathbf{a}$$
$$q(ph) = (qp)h$$

This property could have been checked directly, without appeal to vector algebra, by evaluating all possible combinations of the products $(pq)r$, $p(qr)$ when $p, q, r$ are basis elements $1, i, j$ or $k$.

We can also confirm that the quaternion product is distributive with respect to addition:

$$q(p + h) = (Sq + Vq)[Sp + Sh + Vp + Vh]$$
$$= Sq\,Sp + Sq\,Sh - Vq \cdot (Vp + Vh) + Sh\,Vq + Sp\,Vq + Sq(Vp + Vh) + Vq \wedge (Vp + Vh)$$
$$= \{Sq\,Sp - Vq \cdot Vp + Sp\,Vq + Sq\,Vp + Vq \wedge Vp\} + \{Sq\,Sh - Vq \cdot Vh + Sq\,Vh + Sh\,Vq + Vq \wedge Vh\}$$
$$= qp + qh.$$

It is now a simple matter to determine the following properties of quaternions:

$$S(qp) = Sq\,Sp - Vq \cdot Vp = S(pq)$$
$$N_q = q\bar{q} = (Sq + Vq)(Sq - Vq) = (Sq)^2 + Vq \cdot Vq = (Sq)^2 + N_{Vq}$$
$$\overline{(qp)} = \{Sq\,Sp - Vq \cdot Vp - Sp\,Vq - Sq\,Vp - Vq \wedge Vp\} = \bar{p}\,\bar{q}$$

Also the important property that the norm of a product is equal to the product of the norms:

$$N_{pq} = pq\,\overline{(pq)} = pq\,\bar{q}\,\bar{p} = p\bar{p}\,q\bar{q} = N_p N_q$$

If a quaternion $q$ has a non-zero norm $N_q$ then the inverse $q^{-1}$ is defined by $q^{-1} = \bar{q}/N_q$. It quickly follows that

$$qq^{-1} = q^{-1}q = 1, \qquad N_{q^{-1}} = \frac{1}{N_q}, \qquad (pq)^{-1} = q^{-1}p^{-1}.$$

A quaternion $q$ is said to be pure if $S(q) = 0$. We also note that

$$qp - pq = Vq \wedge Vp - Vp \wedge Vq = 2\,Vq \wedge Vp,$$

and so if $p$ is a quaternion which commutes with every other quaternion then $Vq \wedge Vp = 0$ for every $q$, hence $Vp = 0$ and $p$ is a real number.
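The properties just listed can be checked numerically. The following is a minimal sketch rather than anything taken from the text: the tuple layout `(a, b, c, d)` for $a + bi + cj + dk$ and the helper names `qmul`, `conj`, `norm` and `inverse` are assumptions made for illustration only.

```python
# Illustrative sketch; the names and the tuple layout are assumptions.
# A quaternion a + bi + cj + dk is stored as the tuple (a, b, c, d).

def qmul(p, q):
    """Quaternion product pq built from i^2 = j^2 = k^2 = -1, ij = k, ji = -k."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # scalar part: Sp Sq - Vp . Vq
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

def conj(q):
    """Conjugate: the vector part changes sign."""
    a, b, c, d = q
    return (a, -b, -c, -d)

def norm(q):
    """Norm N_q = q * conj(q), a sum of squares (not its square root)."""
    return sum(x * x for x in q)

def inverse(q):
    """Inverse conj(q) / N_q, defined when N_q is non-zero."""
    n = norm(q)
    return tuple(x / n for x in conj(q))

if __name__ == "__main__":
    p, q, h = (1, 2, 3, 4), (5, -6, 7, 8), (-1, 0, 2, -3)
    print("associative:", qmul(qmul(q, p), h) == qmul(q, qmul(p, h)))
    print("norm multiplicative:", norm(qmul(p, q)) == norm(p) * norm(q))
    print("conj(qp) == conj(p) conj(q):", conj(qmul(q, p)) == qmul(conj(p), conj(q)))
    print("q q^-1:", qmul(q, inverse(q)))
```

Integer components are used in the demonstration so that the associativity and norm checks are exact; the inverse necessarily involves division and is only accurate to rounding error.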
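It should be noted that a quaternion $q = \alpha + \boldsymbol{\beta}$ can always be written as the (quaternion) product of two vectors $\mathbf{a}$, $\mathbf{b}$:

$$q = \mathbf{a}\mathbf{b} = -\mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}$$

In fact there are an infinite number of choices that can be made for $\mathbf{a}$, $\mathbf{b}$: explicitly, choosing $\mathbf{b} \cdot \boldsymbol{\beta} = 0$, then

$$\mathbf{a} = \frac{\mathbf{b} \wedge \boldsymbol{\beta} - \alpha\mathbf{b}}{\mathbf{b} \cdot \mathbf{b}} \quad \text{gives} \quad \mathbf{a} \cdot \mathbf{b} = -\alpha \quad \text{and} \quad \mathbf{a} \wedge \mathbf{b} = \boldsymbol{\beta}.$$

This result is used explicitly in showing (in Section 2.6) that any 3-dimensional rotation can be formed from two successive reflections.

Example Application to Analytical Geometry. Find the quaternion equation of a straight line and hence deduce its shortest distance from the origin.

Solution If $\mathbf{b}$ is a vector parallel to a line which passes through the end-point of the position vector $\mathbf{a}$ then its equation is

$$(\mathbf{r} - \mathbf{a}) = \alpha\mathbf{b}, \qquad \alpha \in \mathbb{R}$$

where $\mathbf{r}$ is the position vector of a point on the line (see Figure 2.9) with respect to origin $O$.

Figure 2.9

If we define the moment $\mathbf{M}$ of the line about $O$ by $\mathbf{M} = \mathbf{a} \wedge \mathbf{b}$, then

$$(\mathbf{r} - \mathbf{a}) \wedge \mathbf{b} = 0 \quad \text{or} \quad \mathbf{r} \wedge \mathbf{b} = \mathbf{M}$$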
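is the equation of the line. Now we can interpret $\mathbf{b}$ as a pure quaternion and so

$$(\mathbf{r} \wedge \mathbf{b})\mathbf{b}^{-1} = \mathbf{M}\mathbf{b}^{-1} \quad \text{where} \quad \mathbf{b}^{-1} = -\frac{\mathbf{b}}{N_b}$$

Therefore

$$[-(\mathbf{r} \wedge \mathbf{b}) \cdot \mathbf{b}^{-1} + (\mathbf{r} \wedge \mathbf{b}) \wedge \mathbf{b}^{-1}] = \mathbf{M}\mathbf{b}^{-1}$$
$$-\mathbf{b}^{-1} \wedge (\mathbf{r} \wedge \mathbf{b}) = \mathbf{M}\mathbf{b}^{-1}$$

that is

$$\mathbf{r} - \frac{(\mathbf{b} \cdot \mathbf{r})\mathbf{b}}{N_b} = \mathbf{M}\mathbf{b}^{-1}$$

therefore

$$\mathbf{r} = (-\mathbf{b} \cdot \mathbf{r} + \mathbf{M})\mathbf{b}^{-1} = (\gamma + \mathbf{M})\mathbf{b}^{-1} \quad \text{where} \quad \gamma = -\mathbf{b} \cdot \mathbf{r}$$

Now $(\mathbf{M}\mathbf{b}^{-1})$ and $\mathbf{b}$ are orthogonal since

$$S((\mathbf{M}\mathbf{b}^{-1})\mathbf{b}^c) = -S((\mathbf{M}\mathbf{b}^{-1})\mathbf{b}) = -S(\mathbf{M}) = 0$$

Also, we can determine $N_r$, it being the square of the length of the vector $\mathbf{r}$:

$$N_r = \mathbf{r}\mathbf{r}^c = (\gamma\mathbf{b}^{-1} + \mathbf{M}\mathbf{b}^{-1})(\gamma\mathbf{b}^{-1} + \mathbf{M}\mathbf{b}^{-1})^c = (\gamma\mathbf{b}^{-1} + \mathbf{M}\mathbf{b}^{-1})\left(\gamma\frac{\mathbf{b}}{N_b} - \frac{\mathbf{b}}{N_b}\mathbf{M}\right) = \frac{\gamma^2}{N_b} + N_{Mb^{-1}}$$

Since, for a given line, $\mathbf{M}\mathbf{b}^{-1}$ is a fixed quaternion, we deduce that the minimum value of $N_r$ occurs when $\gamma = 0$ (as is obvious geometrically) and has value $N_{Mb^{-1}}$, which is easily determined since

$$\mathbf{M}\mathbf{b}^{-1} = \mathbf{a} + (\mathbf{b} \cdot \mathbf{a})\mathbf{b}^{-1}$$

Example Use the quaternion formalism to deduce standard vector identities.

Solution Let $\mathbf{a}$, $\mathbf{b}$ and $\mathbf{c}$ be pure quaternions, then

$$\mathbf{a}\mathbf{b} = -\mathbf{a} \cdot \mathbf{b} + \mathbf{a} \wedge \mathbf{b}$$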
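Thus

$$S(\mathbf{a}\mathbf{b}) = -\mathbf{a} \cdot \mathbf{b} = -|\mathbf{a}||\mathbf{b}| \cos\theta$$
$$V(\mathbf{a}\mathbf{b}) = \mathbf{a} \wedge \mathbf{b} = |\mathbf{a}||\mathbf{b}| \sin\theta\,\hat{\mathbf{n}}$$
$$V(\mathbf{b}\mathbf{a}) = \tfrac{1}{2}(\mathbf{b}\mathbf{a} - (\mathbf{b}\mathbf{a})^c) = \tfrac{1}{2}(\mathbf{b}\mathbf{a} - \mathbf{a}\mathbf{b}) = -V(\mathbf{a}\mathbf{b})$$

Therefore $\mathbf{a} \wedge \mathbf{b} = -\mathbf{b} \wedge \mathbf{a}$.

Now we know, since the quaternion product is distributive, that $\mathbf{a}(\mathbf{b} + \mathbf{c}) = \mathbf{a}\mathbf{b} + \mathbf{a}\mathbf{c}$; so

$$S(\mathbf{a}(\mathbf{b} + \mathbf{c})) = S(\mathbf{a}\mathbf{b}) + S(\mathbf{a}\mathbf{c}) \quad \text{i.e.} \quad \mathbf{a} \cdot (\mathbf{b} + \mathbf{c}) = \mathbf{a} \cdot \mathbf{b} + \mathbf{a} \cdot \mathbf{c}$$

Also

$$V(\mathbf{a}(\mathbf{b} + \mathbf{c})) = V(\mathbf{a}\mathbf{b}) + V(\mathbf{a}\mathbf{c}) \quad \text{i.e.} \quad \mathbf{a} \wedge (\mathbf{b} + \mathbf{c}) = \mathbf{a} \wedge \mathbf{b} + \mathbf{a} \wedge \mathbf{c}$$

Now considering quaternion products of three vectors (using $S(\mathbf{a}\mathbf{b}) = S(\mathbf{b}\mathbf{a})$):

$$S(\mathbf{a}\mathbf{b}\mathbf{c}) = S(\mathbf{b}\mathbf{c}\mathbf{a}) = S(\mathbf{c}\mathbf{a}\mathbf{b})$$

Now, clearly (as for any quaternion) $\mathbf{a}\mathbf{b}\mathbf{c} = S(\mathbf{a}\mathbf{b}\mathbf{c}) + V(\mathbf{a}\mathbf{b}\mathbf{c})$, and since $\mathbf{a}\mathbf{b}\mathbf{c} = \mathbf{a}(S(\mathbf{b}\mathbf{c}) + V(\mathbf{b}\mathbf{c}))$,

$$\therefore\ S(\mathbf{a}\mathbf{b}\mathbf{c}) = S(\mathbf{a}V(\mathbf{b}\mathbf{c}))$$

This implies

$$\mathbf{a} \cdot (\mathbf{b} \wedge \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \wedge \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \wedge \mathbf{b})$$

Also the relation $S(\mathbf{a}\mathbf{b}\mathbf{c}) = S[(\mathbf{a}\mathbf{b}\mathbf{c})^c] = -S(\mathbf{c}\mathbf{b}\mathbf{a})$ implies

$$\mathbf{a} \cdot (\mathbf{b} \wedge \mathbf{c}) = -\mathbf{c} \cdot (\mathbf{b} \wedge \mathbf{a})$$

We also see that

$$V(\mathbf{a}\mathbf{b}\mathbf{c}) = \mathbf{a}S(\mathbf{b}\mathbf{c}) + V(\mathbf{a}V(\mathbf{b}\mathbf{c}))$$

But, directly from the definition:

$$V(\mathbf{a}\mathbf{b}\mathbf{c}) = \tfrac{1}{2}[\mathbf{a}\mathbf{b}\mathbf{c} - (\mathbf{a}\mathbf{b}\mathbf{c})^c] = \tfrac{1}{2}[\mathbf{a}\mathbf{b}\mathbf{c} - (-\mathbf{c})(-\mathbf{b})(-\mathbf{a})] = \tfrac{1}{2}[\mathbf{a}\mathbf{b}\mathbf{c} + \mathbf{c}\mathbf{b}\mathbf{a}]$$
$$= \tfrac{1}{2}[\mathbf{a}\mathbf{b}\mathbf{c} + \mathbf{b}\mathbf{a}\mathbf{c} - \mathbf{b}\mathbf{a}\mathbf{c} - \mathbf{b}\mathbf{c}\mathbf{a} + \mathbf{b}\mathbf{c}\mathbf{a} + \mathbf{c}\mathbf{b}\mathbf{a}]$$
$$= \mathbf{c}S(\mathbf{a}\mathbf{b}) - \mathbf{b}S(\mathbf{a}\mathbf{c}) + \mathbf{a}S(\mathbf{b}\mathbf{c})$$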
Therefore we deduce

$$V(\mathbf{a}V(\mathbf{b}\mathbf{c})) = \mathbf{c}S(\mathbf{a}\mathbf{b}) - \mathbf{b}S(\mathbf{a}\mathbf{c})$$

which states in classical vector terminology

$$\mathbf{a} \wedge (\mathbf{b} \wedge \mathbf{c}) = -\mathbf{c}(\mathbf{a} \cdot \mathbf{b}) + \mathbf{b}(\mathbf{a} \cdot \mathbf{c})$$

Complexified Quaternions

In this section we describe the use of a quaternion formalism in conjunction with complex numbers. This is applied to the homogeneous coordinate formulation of the equations of a point, line and plane. Following Brand [9] the equations of a point, plane and line are given respectively by

$$\mathbf{r}a = \mathbf{a}_0 \qquad (1)$$
$$\mathbf{r} \cdot \mathbf{a} = a_0 \qquad (2)$$
$$\mathbf{r} \wedge \mathbf{a} = \mathbf{a}_0 \quad \text{in which} \quad \mathbf{a} \cdot \mathbf{a}_0 = 0 \qquad (3)$$

in which $\mathbf{r}$ is the position vector to the entity (point, plane or line) in question. Thus the point, plane or line can be notated by

Point: $(a, \mathbf{a}_0)$
Plane: $(\mathbf{a}, a_0)$
Line: $(\mathbf{a}, \mathbf{a}_0)$

These are called the homogeneous coordinates of point, plane and line respectively. (If the coordinates are multiplied by the same scalar then the entity they determine through (1), (2), (3) is unaltered.) In each case the first homogeneous coordinate cannot vanish. Also, when the second homogeneous coordinate vanishes the origin $O$ is a part of the entity. It is interesting to note that the shortest distances from point, plane and line to the origin are given by

$$\frac{|\mathbf{a}_0|}{|a|}, \qquad \frac{|a_0|}{|\mathbf{a}|}, \qquad \frac{|\mathbf{a}_0|}{|\mathbf{a}|}$$

respectively. Let us now introduce a complexified quaternion (the next chapter will examine this object in far greater detail):

$$p = Q + iQ_0, \qquad Q, Q_0 \in \mathbb{H}$$
in which $Q = a + \mathbf{a}$, $Q_0 = a_0 + \mathbf{a}_0$. The four basic quantities, scalar, point, plane and line, are obtained as follows:

(a) $V(Q) = V(Q_0) = 0$: $p \in \mathbb{C}$ is a scalar.
(b) $V(Q) = 0$, $S(Q_0) = 0$: $p = a + i\mathbf{a}_0 \leftrightarrow (a, \mathbf{a}_0)$ represents a point.
(c) $S(Q) = 0$, $V(Q_0) = 0$: $p = \mathbf{a} + ia_0 \leftrightarrow (\mathbf{a}, a_0)$ represents a plane.
(d) $S(Q) = 0$, $S(Q_0) = 0$: $p = \mathbf{a} + i\mathbf{a}_0 \leftrightarrow (\mathbf{a}, \mathbf{a}_0)$ represents a line.

Note that point and plane are 'dual' in the sense that one is obtained from the other by multiplying through by $i$; but scalar and line are 'self-dual' objects.

Now any two entities determine a third: two points determine a line, two planes determine a line, two lines determine a point, and a point and a line determine a plane. Let us consider first two distinct points with quaternionic representation

$$p = a + i\mathbf{a}_0, \qquad q = \beta + i\mathbf{b}_0$$

then it is easily checked that these two points determine the line with homogeneous coordinates

$$(a\mathbf{b}_0 - \beta\mathbf{a}_0,\ \mathbf{a}_0 \wedge \mathbf{b}_0)$$

But, as is quickly confirmed, the expression $iV(pq^*)$ (in which an asterisk $*$ represents the complex conjugate) gives the real and imaginary parts representing a line:

$$iV(pq^*) = a\mathbf{b}_0 - \beta\mathbf{a}_0 + i(\mathbf{a}_0 \wedge \mathbf{b}_0) \leftrightarrow (a\mathbf{b}_0 - \beta\mathbf{a}_0,\ \mathbf{a}_0 \wedge \mathbf{b}_0)$$

Thus, in quaternionic terms, the line connecting $p$, $q$ is given by $iV(pq^*)$. Secondly, considering two non-parallel planes

$$P = ia_0 + \mathbf{a}, \qquad Q = i\beta_0 + \mathbf{b}$$

These determine a line with homogeneous coordinates

$$(\mathbf{a} \wedge \mathbf{b},\ \beta_0\mathbf{a} - a_0\mathbf{b})$$

Again, in quaternionic terms this is neatly expressed through the vector part of $P^*Q$: expanding the product with $P^* = -ia_0 + \mathbf{a}$ gives

$$V(P^*Q) = \mathbf{a} \wedge \mathbf{b} + i(\beta_0\mathbf{a} - a_0\mathbf{b})$$
Much more complicated are the quaternionic expressions for the plane determined by a point and a line. Here $p = a + i\mathbf{a}_0$ represents a point and $h = \mathbf{b} + i\mathbf{b}_0$ represents a line. The homogeneous coordinates of the plane determined by these entities are $(a\mathbf{b}_0 - \mathbf{a}_0 \wedge \mathbf{b},\ \mathbf{a}_0 \cdot \mathbf{b}_0)$. But, in quaternionic terms, these terms arise as the combination

$$\tfrac{1}{2}[h^*p - ph]$$

Finally, we can contemplate the point determined by the line $h = \mathbf{a} + i\mathbf{a}_0$ and the plane $P = i\beta_0 + \mathbf{b}$. The homogeneous coordinates of the point of intersection are $(\mathbf{a} \cdot \mathbf{b},\ \beta_0\mathbf{a} - \mathbf{a}_0 \wedge \mathbf{b})$ which, in quaternionic terms, are obtained as the real and imaginary parts of

$$\tfrac{1}{2}[hP + Ph^*]$$

2.3 The Exponential Form and Root Extraction

In the algebra of quaternions we have had to accept that multiplication is not commutative. In complex number algebra positive real numbers and negative real numbers are essentially treated in the same way; both have just two square roots. (In fact, in the invention of complex number multiplication the definition $\arg(zw) = \arg z + \arg w$ is chosen precisely to ensure symmetrical treatment of positive and negative real numbers. Had the definition been chosen as $\arg(zw) = \arg z - \arg w$, which is possible if all one wants is a definition consistent with ordinary real number multiplication, then positive reals and negative reals would be treated differently. With this second definition complex multiplication is not commutative, positive real numbers have an infinite number of square roots and any other number has no square roots at all!)

We shall see that in quaternion algebra positive real numbers are again distinguished from negative real numbers. However, in quaternion algebra, whereas positive real numbers have the expected two square roots, negative real numbers now, instead of having no square roots, have an infinite number of them!
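The claim about square roots of negative reals is easy to probe numerically. The sketch below is illustrative only: the tuple layout `(a, b, c, d)` and the helper names `qmul`, `random_unit_pure` and `is_square_root_of` are assumptions, not notation from the text. It draws several pure quaternions of unit norm in unrelated directions and checks that each one squares to $-1$.

```python
import math
import random

# Illustrative sketch; names and tuple layout (a, b, c, d) are assumptions.

def qmul(p, q):
    """Quaternion product of p = (a1, b1, c1, d1) and q = (a2, b2, c2, d2)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def random_unit_pure(rng):
    """A pure quaternion of unit norm pointing in a random direction."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        length = math.sqrt(sum(c * c for c in v))
        if length > 1e-12:  # reject the (vanishingly unlikely) zero vector
            return (0.0, v[0] / length, v[1] / length, v[2] / length)

def is_square_root_of(q, real_value, tol=1e-12):
    """True if q*q equals the real number real_value to within tol."""
    s = qmul(q, q)
    return abs(s[0] - real_value) < tol and all(abs(x) < tol for x in s[1:])

if __name__ == "__main__":
    rng = random.Random(0)
    for _ in range(5):
        root = random_unit_pure(rng)
        print(root, is_square_root_of(root, -1.0))
```

Scaling any of these unit roots by $\sqrt{c}$ gives a square root of $-c$ for $c > 0$, so every negative real number inherits the same infinite family.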
To obtain these results it is convenient to consider the 'polar' form of a quaternion (where $\hat{q}$ is a unit vector):

$$q = \sqrt{N_q}\,(\cos\phi + \hat{q}\sin\phi)$$

If this is a unit quaternion with zero scalar part then $N_q = 1$ and $\phi = \pi/2$; that is, $q = \hat{q}$. Then using the rules of quaternionic multiplication:

$$\hat{q}^2 = -\hat{q} \cdot \hat{q} + \hat{q} \wedge \hat{q} = -1$$
Thus, since the direction of $\hat{q}$ is arbitrary, there are an infinite number of quaternion square roots of $-1$. This corresponds to the fact that performing two 180° rotations of the disk in the demonstrator about an arbitrary axis $\hat{q}$ will bring the disk back into its original configuration but will leave a twist in the ribbon.

It is easy to show that any quaternion square root of $-1$ must take this form. For example let

$$q = a + b\hat{q}, \qquad q^2 = -1$$

Then, from the second of these equations:

$$(N_q)^2 = N_{(-1)} = 1 \quad \therefore\ N_q = 1$$

Now since $V(q^2) = V(-1) = 0$ and as

$$q^2 = a^2 - b^2 + 2ab\hat{q}$$

this implies that $ab = 0$. The choice $b = 0$ would give $q^2 = a^2 \geq 0$, so we conclude that $a = 0$ and $b = 1$, and so $q = \hat{q}$.

Since, as we have seen, a pure unit quaternion $\hat{q}$ satisfies $\hat{q}^2 = -1$, it follows that the plane (depending upon two real parameters $s$, $v$)

$$q = s + v\hat{q} \qquad (\text{for fixed } \hat{q})$$

is isomorphic to (a copy of) the complex plane:

$$w = s + vj$$

Within the plane, quaternion multiplication reduces to complex multiplication. This view allows a quaternion to be written in exponential form, as if it were a complex number.

Exponential form of a Quaternion

Let $q$ be a quaternion:

$$q = \sqrt{N_q}\,(\cos\theta + \hat{e}\sin\theta)$$

This can be written in the form (cf. [9]):

$$q = \sqrt{N_q}\,e^{\hat{e}\theta}$$

since, in common with complex numbers, this only requires use of the property $\hat{e}^2 = -1$. For two different unit vectors $\hat{e}$, $\hat{e}'$ we cannot deduce $e^{\hat{e}\theta}e^{\hat{e}'\phi} = e^{\hat{e}\theta + \hat{e}'\phi}$, since this can only be deduced by using commutativity. However we can obtain the usual Euler formula:

$$e^{\hat{e}\theta} = \cos\theta + \hat{e}\sin\theta$$
from which the De Moivre theorem is deduced:

$$(\cos\theta + \hat{e}\sin\theta)^n = \cos n\theta + \hat{e}\sin n\theta$$

This is now used to deduce the $n$th root of a quaternion. Let

$$w = \sqrt{N_w}\,(\cos\phi + \hat{e}\sin\phi)$$

be a given quaternion and let $q$ be its $n$th root:

$$q^n = (\sqrt{N_q})^n(\cos n\theta + \hat{e}\sin n\theta) = \sqrt{N_w}\,(\cos\phi + \hat{e}\sin\phi)$$

Here we are assuming that the $n$th root is in the same 'complex plane'. If $\sin\phi \neq 0$ we have a non-degenerate quaternion and $q$ can be found in the usual way as for complex numbers:

$$\sqrt{N_q} = (\sqrt{N_w})^{1/n}, \qquad \theta = (\phi + 2k\pi)/n, \quad k = 0, 1, \ldots, n - 1$$

However, if $\sin\phi = 0$ then we can choose $\hat{e}$ in $q$ arbitrarily. There are two cases to consider.

(i) $\phi = 0$, that is, $w$ is a real number, $w > 0$. Here

$$\theta = \frac{2k\pi}{n}, \qquad k = 0, 1, \ldots, n - 1$$

If $n = 2$ then there are only two distinct values $\theta = 0, \pi$ and so there are just two square roots of a positive real number, $\pm\sqrt{w}$. If $n \neq 2$ then, since $\hat{e}$ is arbitrary, there are an infinite number of $n$th roots.

(ii) $\phi = \pi$. Here $w$ is again a real number, $w < 0$. This case has been described above. There are an infinite number of $n$th roots of a negative number.

2.4 Frobenius' Theorem

As we saw in Section 1.7, an associative algebra $A$ is said to be a division algebra if for any $0 \neq a \in A$ the equations

$$ac = u, \qquad ba = u$$

always possess solutions (it is easy to show that, then, $b = c$). It immediately follows that a division algebra $A$ contains no divisors of zero: if $ab = 0$ then either $a = 0$ or $b = 0$. The following theorem highlights the central role played by quaternions in the area of associative division algebras over the field of real numbers.
Frobenius' Theorem If $A$ is an associative division algebra over the field of reals $\mathbb{R}$ then $A$ is isomorphic to one of $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$.

Proof Let us assume that $A$ is a division algebra of order $n$ (dimension) over $\mathbb{R}$. It follows that any $(n + 1)$ elements of $A$ are linearly dependent. In particular if $x \in A$ then

$$i_1, x, x^2, \ldots, x^n$$

are linearly dependent. Here $i_1$ is the unit element: $i_1 s = s i_1 = s\ \ \forall s \in A$. Thus there exist scalars $\alpha_j \in \mathbb{R}$, not all zero, such that

$$\alpha_0 i_1 + \alpha_1 x + \ldots + \alpha_n x^n = 0$$

By the fundamental theorem of algebra this can be factorised:

$$F_1(x)F_2(x)\ldots = 0$$

where $F_1, F_2, \ldots$ are linear or quadratic factors with real coefficients. Since a division algebra contains no divisors of zero we can conclude that at least one of $F_1, F_2, \ldots$ is zero. If the zero factor happens to be linear then its square is quadratic. We deduce that every $x \in A$ is a root of a quadratic equation with real coefficients. In particular if $e_1 = i_1, e_2, e_3, \ldots, e_n$ are the basis elements of $A$ then we can conclude

$$e_j^2 + 2\beta_j e_j + \gamma_j i_1 = 0, \qquad \beta_j, \gamma_j \in \mathbb{R}$$

but, completing the square:

$$(e_j + \beta_j)^2 = (\beta_j^2 - \gamma_j)i_1, \qquad \beta_j^2 - \gamma_j \in \mathbb{R}$$

Now define a new set of basis elements

$$(e_1, e_2', e_3', \ldots, e_n') = (e_1, e_2 + \beta_2, e_3 + \beta_3, \ldots, e_n + \beta_n)$$

then

$$(e_j')^2 = (\beta_j^2 - \gamma_j)i_1, \qquad \beta_j^2 - \gamma_j \in \mathbb{R}$$

Now either $\beta_j^2 - \gamma_j \geq 0$ or $\beta_j^2 - \gamma_j < 0$. If $\beta_j^2 - \gamma_j \geq 0$ then this would imply that $e_j'$ and $e_1 = i_1$ were dependent, so we must conclude that

$$(e_j')^2 = -\alpha_j^2 i_1 \quad \text{for some } \alpha_j \in \mathbb{R}$$
We now introduce yet another basis:

$$(e_1, E_2, E_3, \ldots, E_n) = \left(e_1, \frac{e_2'}{\alpha_2}, \frac{e_3'}{\alpha_3}, \ldots, \frac{e_n'}{\alpha_n}\right)$$

so as to ensure

$$E_j^2 = -i_1 \quad \text{for each } j = 2, 3, \ldots, n$$

The remaining part of the proof is to show that contradictions arise in all cases other than $n = 1, 2, 4$. These contradictions arise primarily because of the constraint that every $x \in A$ is the root of a real quadratic equation.

case 1 $n = 1$. This is the algebra generated by the single basis element $e_1$ and is clearly $\mathbb{R}$ itself.

case 2 $n = 2$. Here the algebra is that generated by two elements $(e_1, E_2)$ in which $E_2^2 = -i_1$ and is clearly the field $\mathbb{C}$ of complex numbers.

We therefore assume that $n > 2$ and consider the algebra generated by $(e_1, E_2, E_3, \ldots)$. Since all elements of the algebra are the roots of a real quadratic equation then, in particular, so are $E_2 \pm E_3$. Now

$$(E_2 + E_3)^2 = E_2^2 + E_2E_3 + E_3E_2 + E_3^2 = -2i_1 + E_2E_3 + E_3E_2$$

But

$$(E_2 + E_3)^2 - \alpha(E_2 + E_3) - \beta i_1 = 0 \quad \text{for some } \alpha, \beta \in \mathbb{R}$$

that is

$$-2i_1 + E_2E_3 + E_3E_2 = \alpha(E_2 + E_3) + \beta i_1, \qquad \alpha, \beta \in \mathbb{R}$$

Similarly, working with $E_2 - E_3$:

$$-2i_1 - E_2E_3 - E_3E_2 = \gamma(E_2 - E_3) + \delta i_1, \qquad \gamma, \delta \in \mathbb{R}$$

Adding, we obtain

$$(\alpha + \gamma)E_2 + (\alpha - \gamma)E_3 + (\beta + \delta + 4)i_1 = 0$$

and since $i_1, E_2, E_3$ are linearly independent then

$$\alpha = \gamma = 0 \quad \text{and} \quad \beta + \delta + 4 = 0$$
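Thus

$$E_2E_3 + E_3E_2 = 2\epsilon i_1 \quad \text{(say)}$$

in which $\epsilon \in \mathbb{R}$. Then

$$(E_2 + E_3)^2 = (2\epsilon - 2)i_1, \qquad (E_2 - E_3)^2 = (-2\epsilon - 2)i_1$$

Now, following the same argument as above, both $(E_2 + E_3)^2$ and $(E_2 - E_3)^2$ must be positive multiples of $-i_1$. That is, $\epsilon - 1 < 0$ and $-\epsilon - 1 < 0$,

$$\therefore\ (\epsilon - 1)(-\epsilon - 1) > 0 \quad \text{that is} \quad 1 - \epsilon^2 > 0$$

We now introduce new basis vectors $I$, $J$; one choice consistent with the relations above is, for example,

$$I = \frac{E_2 + E_3}{\sqrt{2 - 2\epsilon}}, \qquad J = \frac{E_2 - E_3}{\sqrt{2 + 2\epsilon}}$$

Then $I^2 = -i_1$, $J^2 = -i_1$ and $IJ + JI = 0$. However it is easy to show that $IJ$ is linearly independent of $i_1, I, J$; since if

$$IJ = \alpha i_1 + \beta I + \gamma J, \qquad \alpha, \beta, \gamma \in \mathbb{R}$$

then, on multiplying through by $I$,

$$I(IJ) = -J = \alpha I - \beta + \gamma IJ = \alpha I - \beta + \gamma(\alpha i_1 + \beta I + \gamma J)$$

from which it follows that $-1 = \gamma^2$, which implies $\gamma \notin \mathbb{R}$: a contradiction. This result shows that a division algebra consisting of three basis elements $i_1, I, J$ is impossible, since it always gives rise to a fourth independent element $IJ$. If we write $K = IJ$ then this algebra is generated by $i_1, I, J, K$ which satisfy

$$IK = I(IJ) = -J$$
$$KI = (IJ)I = -(JI)I = J$$
$$KJ = (IJ)J = -I$$
$$JK = J(IJ) = J(-JI) = I$$
$$K^2 = (IJ)(IJ) = -(IJ)(JI) = -I^2J^2 = -i_1$$

This is the familiar algebra of real quaternions. If we now consider $n > 4$ then there is a fifth basis element $E_5$ such that $E_5^2 = -i_1$. Proceeding as above we easily deduce

$$IE_5 + E_5I = \alpha i_1, \qquad JE_5 + E_5J = \beta i_1, \qquad KE_5 + E_5K = \gamma i_1, \qquad \alpha, \beta, \gamma \in \mathbb{R}$$

Then

$$E_5K = E_5(IJ) = (E_5I)J = (\alpha i_1 - IE_5)J = \alpha J - I(\beta i_1 - JE_5) = \alpha J - I\beta + KE_5$$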
Hence, adding $E_5K$ to both sides:

$$2E_5K = \alpha J - I\beta + KE_5 + E_5K = \alpha J - \beta I + \gamma i_1$$
$$\therefore\ 2(E_5K)K = \alpha(JK) - \beta(IK) + \gamma K$$

that is

$$-2E_5 = \alpha I + \beta J + \gamma K$$

which contradicts the linear independence of $I, J, K, E_5$. Hence an associative division algebra only exists if $n = 1, 2, 4$. If the constraint that the algebra be associative is relaxed then, as we shall see in Chapter 4, we obtain the 8-dimensional Cayley numbers.

2.5 Inner Product for Quaternions

Under the operation of addition, the set of quaternions $\mathbb{H}$ forms an abelian group. This, together with the rule for multiplying such an element by a scalar $s \in \mathbb{R}$, shows that the quaternion numbers constitute a linear space. We can give an inner product structure to this space if we define:

$$\langle p, q \rangle = S(p\bar{q})$$

The four basic inner product axioms are easily checked.

IP1: $\langle p, q \rangle = S(p\bar{q}) = S(\overline{p\bar{q}}) = S(q\bar{p}) = \langle q, p \rangle$
IP2: $\langle p, q + r \rangle = S(p(\overline{q + r})) = S(p\bar{q} + p\bar{r}) = \langle p, q \rangle + \langle p, r \rangle$
IP3: $a\langle p, q \rangle = \langle ap, q \rangle = \langle p, aq \rangle, \quad a \in \mathbb{R}$
IP4: $\langle p, p \rangle = S(p\bar{p}) = N_p \geq 0$, only vanishing if $p = 0$.

Thus $\langle p, q \rangle = S(p\bar{q})$ is a suitable inner product. We have already associated an angle with a given quaternion. Using the inner product we can now naturally define an angle $\lambda$ between two quaternions $p$, $q$ to be such that:

$$\cos\lambda = \frac{\langle p, q \rangle}{\sqrt{N_p}\sqrt{N_q}}$$

This is a meaningful definition of an angle as $-1 \leq \cos\lambda \leq 1$ for all $p, q$. This is verified by using the Schwarz inequality:

$$\langle p, p \rangle\langle q, q \rangle \geq \langle p, q \rangle^2 \qquad \forall p, q \in \mathbb{H}$$
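Now $\sqrt{N_p} \geq |S(p)|$ and so $\sqrt{N_p}\sqrt{N_q} = \sqrt{N_{p\bar{q}}} \geq |S(p\bar{q})|$ and the required result follows. Using this we can easily deduce that the angle $\theta$ of a quaternion number $p$ is the angle $\omega$ subtended by $S(p)$ and $p$:

$$\cos\omega = \frac{S(S(p)\bar{p})}{\sqrt{(S(p))^2}\sqrt{N_p}} = \frac{S(p)}{\sqrt{N_p}} = \cos\theta$$

Note also that if

$$p = \sqrt{N_p}\,(\cos\theta + \hat{p}\sin\theta) \quad \text{and} \quad q = \sqrt{N_q}\,(\cos\phi + \hat{q}\sin\phi)$$

where $\hat{p}$, $\hat{q}$ are unit norm pure quaternion numbers, then the angle $\lambda$ between $p$, $q$ is such that

$$\cos\lambda = \frac{S(p\bar{q})}{\sqrt{N_p}\sqrt{N_q}} = S[(\cos\theta + \hat{p}\sin\theta)(\cos\phi - \hat{q}\sin\phi)] = \cos\theta\cos\phi - \sin\theta\sin\phi\,S(\hat{p}\hat{q})$$

But if $\delta$ is the angle between $\hat{p}$ and $\hat{q}$ then $\cos\delta = S(\hat{p}\bar{\hat{q}}) = -S(\hat{p}\hat{q})$. Therefore

$$\cos\lambda = \cos\theta\cos\phi + \sin\theta\sin\phi\cos\delta$$

This is the Cosine Law for spherical triangles, discussed more fully in Section 2.9.

We say $p$, $q$ are perpendicular if $S(p\bar{q}) = 0$ and are parallel if $V(p\bar{q}) = 0$. The scalar and vector parts of a quaternion number are perpendicular since:

$$S(S(p)\overline{V(p)}) = \tfrac{1}{4}S((p + \bar{p})(\bar{p} - p)) = \tfrac{1}{4}S(\bar{p}\bar{p} - pp) \quad \text{since } S(p\bar{p}) = S(\bar{p}p)$$
$$= \tfrac{1}{8}(\bar{p}\bar{p} - pp + \overline{\{\bar{p}\bar{p} - pp\}}) = \tfrac{1}{8}(\bar{p}\bar{p} - pp + pp - \bar{p}\bar{p}) = 0$$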
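The inner product and the cosine law of this section can be tested on concrete numbers. The following is a small sketch under stated assumptions: quaternions are stored as tuples `(a, b, c, d)`, and the helper names `inner`, `angle_between` and `polar` are hypothetical, chosen here for illustration.

```python
import math

# Illustrative sketch; names and tuple layout (a, b, c, d) are assumptions.

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,
    )

def conj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def inner(p, q):
    """Inner product <p, q> = S(p * conj(q))."""
    return qmul(p, conj(q))[0]

def norm(q):
    """N_q = <q, q>."""
    return inner(q, q)

def angle_between(p, q):
    """Angle lambda with cos(lambda) = <p, q> / (sqrt(N_p) sqrt(N_q))."""
    c = inner(p, q) / math.sqrt(norm(p) * norm(q))
    return math.acos(max(-1.0, min(1.0, c)))  # clamp against rounding error

def polar(theta, axis):
    """Unit quaternion cos(theta) + axis*sin(theta); axis must be a unit 3-vector."""
    s = math.sin(theta)
    return (math.cos(theta), axis[0] * s, axis[1] * s, axis[2] * s)

if __name__ == "__main__":
    theta, phi, delta = 0.7, 1.1, 0.5
    p = polar(theta, (1.0, 0.0, 0.0))
    q = polar(phi, (math.cos(delta), math.sin(delta), 0.0))  # axes delta apart
    lhs = math.cos(angle_between(p, q))
    rhs = (math.cos(theta) * math.cos(phi)
           + math.sin(theta) * math.sin(phi) * math.cos(delta))
    print(lhs, rhs)  # the two sides of the cosine law
```

With this storage convention $S(p\bar{q})$ is simply the Euclidean dot product of the two 4-tuples, which is why the Schwarz inequality applies directly.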
2.6 Quaternions and Rotations in 3- and 4-Dimensions

Four-dimensional Rotations

We now consider how quaternions can be used to describe rotations in 3- and 4-dimensional space. Perhaps surprisingly, it is easier to consider rotations in 4-dimensional space first and then to utilise these rotations to effect a rotation in 3 dimensions. Because of the lack of commutativity there are two types of product in quaternion algebra: left and right multiplication. Thus, for a given quaternion $x$ (and $q$ such that $N_q = 1$), we can consider the two maps

$$\phi_L : x \mapsto qx \quad \text{or} \quad \phi_R : x \mapsto xq$$

depending as we multiply on the left or the right. In either case, considering $\mathbb{H}$ to be $\mathbb{R}^4$ spanned by the usual basis elements, $\mathbb{R}^4 = \mathrm{span}\{1, i, j, k\}$, these two maps are linear transformations from $\mathbb{R}^4$ to $\mathbb{R}^4$. We suspect that both these maps correspond to rotations since, as is easy to show, they are norm and angle preserving. For example, considering the map $\phi_L$, we have already seen that if $x, y, q \in \mathbb{H}$ and $N_q = 1$ then

$$N_{qx} = N_q N_x = N_x$$

Also if $x$, $y$ subtend an angle $\lambda$ then $\cos\lambda = \dfrac{S(x\bar{y})}{\sqrt{N_x}\sqrt{N_y}}$ and after multiplication by $q$ we have (using the properties of the scalar part of a quaternion noted earlier)

$$\cos\lambda' = \frac{S(qx\,\overline{(qy)})}{\sqrt{N_{qx}}\sqrt{N_{qy}}} = \frac{S(qx\bar{y}\bar{q})}{N_q\sqrt{N_x}\sqrt{N_y}} = \frac{S(\bar{y}\bar{q}qx)}{\sqrt{N_x}\sqrt{N_y}} = \frac{N_q S(\bar{y}x)}{\sqrt{N_x}\sqrt{N_y}} = \frac{S(x\bar{y})}{\sqrt{N_x}\sqrt{N_y}} = \cos\lambda$$

A similar result is obtained if we consider $\phi_R$. Quaternionic multiplication preserves the norm and the included angle. All that is left to show that this operation is a rotation is to show that it is orientation preserving. To see this we shall consider the matrix representation of the maps

$$\phi_L : \mathbb{R}^4 \to \mathbb{R}^4, \quad \phi_L(x) = qx \qquad\qquad \phi_R : \mathbb{R}^4 \to \mathbb{R}^4, \quad \phi_R(x) = xq$$

$\mathbb{R}^4$ is spanned by the standard basis elements

$$1 = (1, 0, 0, 0), \quad i = (0, 1, 0, 0), \quad j = (0, 0, 1, 0), \quad k = (0, 0, 0, 1)$$
with the usual quaternionic rules of combination. Now if $q = a + bi + cj + dk$ with $N_q = 1$ then

$$\phi_L(1) = a + bi + cj + dk \qquad\qquad \phi_R(1) = a + bi + cj + dk$$
$$\phi_L(i) = -b + ai + dj - ck \qquad\qquad \phi_R(i) = -b + ai - dj + ck$$
$$\phi_L(j) = -c - di + aj + bk \qquad\qquad \phi_R(j) = -c + di + aj - bk$$
$$\phi_L(k) = -d + ci - bj + ak \qquad\qquad \phi_R(k) = -d - ci + bj + ak$$

Therefore the matrix representations of the linear transformations $\phi_L$, $\phi_R$ are, respectively,

$$A_{\phi_L} = \begin{pmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{pmatrix} \qquad A_{\phi_R} = \begin{pmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{pmatrix}$$

It is easily checked that both these matrices are orthogonal:

$$A_{\phi_L}A_{\phi_L}^T = I, \qquad A_{\phi_R}A_{\phi_R}^T = I$$

and $\det(A_{\phi_L}) = 1$, $\det(A_{\phi_R}) = 1$. Hence $A_{\phi_L} \in SO(4)$, $A_{\phi_R} \in SO(4)$, so that the operations $qx$, $xq$ represent rotations of $x$ in $\mathbb{R}^4$.

The angle of rotation (using $\phi_L$ say) is easily determined. This is the angle $\omega$ between $x$ and $qx$ (with $q = \cos\theta + \hat{q}\sin\theta$):

$$\cos\omega = \frac{S(x\,\overline{(qx)})}{\sqrt{N_x}\sqrt{N_{qx}}} = \frac{S(x\bar{x}\bar{q})}{N_x} = S(\bar{q}) = S(q) = \cos\theta$$

so that the angle of rotation $\omega$ is the angle of $q$.

Geometry of 4-dimensional rotations

It is possible to break up a 4-dimensional rotation into simpler, simultaneous rotations in two orthogonal planes [10]. To see this consider, as above,

$$q = \cos\theta + \hat{q}\sin\theta, \qquad N_q = 1$$

and let $x$ be any quaternion. Now

$$qx = x\cos\theta + (\hat{q}x)\sin\theta \quad \text{and} \quad q(\hat{q}x) = \hat{q}x\cos\theta - x\sin\theta$$

since $(\hat{q})^2 = -1$. If we define $x' = \hat{q}x$ then:

$$S(x\,\overline{\hat{q}x}) = S(-x\bar{x}\hat{q}) = -x\bar{x}\,S(\hat{q}) = 0$$
confirming that $x$, $x'$ are orthogonal, and from the results just obtained:

$$qx = x\cos\theta + x'\sin\theta$$
$$qx' = -x\sin\theta + x'\cos\theta$$

which shows that the effect of quaternion multiplication on the left is to produce a positive rotation of elements of the plane containing $x$, $x'$ through an angle $\theta$. The plane

$$ax + bx', \qquad a, b \in \mathbb{R}$$

remains invariant under left multiplication. Rotations occur within it. As a special case, if we choose $x = 1$, then $x' = \hat{q}$ and this result shows that the plane containing the elements $1$ and $\hat{q}$ is also invariant:

$$q1 = \cos\theta + \hat{q}\sin\theta$$
$$q\hat{q} = -\sin\theta + \hat{q}\cos\theta$$

so that elements within this plane are rotated through an angle $\theta$. We can also show that the plane (in the space of pure quaternions) perpendicular to $\hat{q}$ also remains invariant. To confirm this let $\mathbf{v}, \mathbf{w}, \hat{q}$ (regarded as vectors) form a right-handed, mutually orthogonal system:

$$\mathbf{v} \cdot \mathbf{w} = 0, \qquad \mathbf{v} \cdot \hat{q} = 0, \qquad \mathbf{w} \cdot \hat{q} = 0$$
$$\mathbf{v} \wedge \mathbf{w} = \hat{q}, \qquad \mathbf{w} \wedge \hat{q} = \mathbf{v}, \qquad \hat{q} \wedge \mathbf{v} = \mathbf{w}$$

Now, if we choose $x = \mathbf{v}$ then $x' = \hat{q}\mathbf{v} = \hat{q} \wedge \mathbf{v} = \mathbf{w}$ and so

$$q\mathbf{v} = \mathbf{v}\cos\theta + \hat{q}\mathbf{v}\sin\theta = \mathbf{v}\cos\theta + \mathbf{w}\sin\theta$$
$$q\mathbf{w} = \mathbf{w}\cos\theta + \hat{q}\mathbf{w}\sin\theta = -\mathbf{v}\sin\theta + \mathbf{w}\cos\theta$$

indicating that elements in the plane containing $\mathbf{v}$, $\mathbf{w}$ have been positively rotated through the same angle $\theta$.

Similar deductions can be made about quaternion multiplication on the right. Here

$$xq = x\cos\theta + x''\sin\theta$$
$$x''q = -x\sin\theta + x''\cos\theta$$

where $x'' = x\hat{q}$, so that elements in the plane containing $x$, $x''$ are rotated through an angle $\theta$. Now choosing $x = 1$, so that $x'' = \hat{q}$, shows that the plane containing $1$ and $\hat{q}$ is invariant under right multiplication:

$$1q = \cos\theta + \hat{q}\sin\theta$$
$$\hat{q}q = -\sin\theta + \hat{q}\cos\theta$$
which indicates a positive rotation through angle $\theta$; that is, in the same direction as that generated by a left multiplication. Also, choosing $x = \mathbf{v}$, then $x'' = \mathbf{v}\hat{q} = \mathbf{v} \wedge \hat{q} = -\mathbf{w}$ and so

$$\mathbf{v}q = \mathbf{v}\cos\theta + \mathbf{v}\hat{q}\sin\theta = \mathbf{v}\cos\theta - \mathbf{w}\sin\theta$$
$$\mathbf{w}q = \mathbf{w}\cos\theta + \mathbf{w}\hat{q}\sin\theta = \mathbf{v}\sin\theta + \mathbf{w}\cos\theta$$

which indicates (in the plane containing $\mathbf{v}$, $\mathbf{w}$) a negative rotation through angle $\theta$.

This analysis shows that a 4-dimensional rotation is comprised of two simultaneous rotations: of elements in the plane containing the scalar axis and $\hat{q}$, and of elements in the space of pure quaternions perpendicular to $\hat{q}$.

It should now be clear that by choosing an appropriate combination of right and left multiplications of quaternions we can produce a rotation in the space of pure quaternions alone; in effect a 3-dimensional spatial rotation. In fact if we consider multiplication on the right by $q^{-1}$ instead of $q$ then a rotation through angle $\theta$ (in the opposite direction to that obtained from a left multiplication) occurs in the plane containing $1$ and $\hat{q}$, whilst a rotation through angle $\theta$ occurs in the plane containing $\mathbf{v}$, $\mathbf{w}$ in the same direction as that resulting from a left multiplication. Thus the combined operation $qxq^{-1}$ leaves elements within the plane containing $1$ and $\hat{q}$ fixed whilst those in the plane of $\mathbf{w}$, $\mathbf{v}$ undergo a rotation through an angle $2\theta$ about the $\hat{q}$ axis in the space of pure quaternions.

Three-Dimensional Rotations

Readers may find it interesting to see that this interpretation can be confirmed directly, without recourse to rotations in 4 dimensions. Let us consider the transformation

$$\phi_q(x) = x' = qxq^{-1}$$

in which:

$$x = \sqrt{N_x}\,(\cos\phi + \hat{x}\sin\phi), \qquad q = \sqrt{N_q}\,(\cos\theta + \hat{q}\sin\theta), \qquad \hat{x}^2 = -1, \quad \hat{q}^2 = -1$$

Because the angle of a quaternion can be viewed as that subtended between the quaternion and its scalar part we can construct the following diagram.
Figure 2.10 (the quaternions drawn relative to the scalar axis)

We shall show directly that this transformation geometrically describes a rotation of the vector part of $x$ about the vector part of $q$ through an angle $2\theta$. First, it is easy to show that $V(q)$ is left fixed by the mapping:

$$\phi_q(V(q)) = qV(q)q^{-1} = (S(q) + V(q))\left\{V(q)\frac{1}{N_q}(S(q) - V(q))\right\}$$
$$= \frac{1}{N_q}(S(q) + V(q))[S(q)V(q) - V(q)V(q)]$$
$$= \frac{1}{N_q}(S(q) + V(q))[(S(q) - V(q))V(q)]$$
$$= q[q^{-1}V(q)] = V(q) \quad \text{using associativity}$$

We are therefore justified in referring to $V(q)$ (or $\hat{q}$) as the axis of rotation. It is also easy to show that the norm and scalar part of $x$ are conserved:

$$N_{x'} = N_q N_{xq^{-1}} = N_q N_x N_{q^{-1}} = N_x$$

since $N_{q^{-1}} = \dfrac{1}{N_q}$. Also, using the general property $S(pq) = S(qp)$:

$$S(x') = S(q(xq^{-1})) = S((xq^{-1})q) = S(x)$$

Now

$$qxq^{-1} = q(S(x) + V(x))q^{-1} = qS(x)q^{-1} + qV(x)q^{-1} = S(x) + qV(x)q^{-1}$$
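and since $S(qxq^{-1}) = S(x)$ then $S(qV(x)q^{-1}) = 0$. Therefore $qV(x)q^{-1}$ is a pure quaternion and so

$$V(qxq^{-1}) = qV(x)q^{-1}$$

Since $V(x)$ is parallel to $\hat{x}$ then $V(x')$ is parallel to $q\hat{x}q^{-1} = \hat{x}'$. Now choose $\hat{p}$ to be a unit pure quaternion number in the plane with normal $\hat{q}$ (i.e. $S(\hat{q}\hat{p}) = 0$). See Figure 2.11.

Figure 2.11 (the vectors drawn relative to the scalar axis)

If $\lambda$ is the angle between $\hat{q}$ and $\hat{x}$, then

$$\hat{x} = \hat{q}\cos\lambda + \hat{p}\sin\lambda$$

The quaternion $\hat{x}$ has unit norm since (using $\bar{\hat{q}} = \hat{q}^{-1} = -\hat{q}$, $\bar{\hat{p}} = \hat{p}^{-1} = -\hat{p}$)

$$\hat{x}\bar{\hat{x}} = (\hat{q}\cos\lambda + \hat{p}\sin\lambda)(-\hat{q}\cos\lambda - \hat{p}\sin\lambda)$$
$$\therefore\ \hat{x}\bar{\hat{x}} = -\hat{q}^2\cos^2\lambda - \hat{p}^2\sin^2\lambda - \sin\lambda\cos\lambda\,(\hat{p}\hat{q} + \hat{q}\hat{p}) = \cos^2\lambda + \sin^2\lambda = 1$$

and $\hat{p}\hat{q} = -\hat{q}\hat{p}$ as $\hat{p}$, $\hat{q}$ are perpendicular. Now

$$\hat{x}' = q\hat{x}q^{-1} = q(\hat{q}\cos\lambda + \hat{p}\sin\lambda)q^{-1} = q\hat{q}q^{-1}\cos\lambda + q\hat{p}q^{-1}\sin\lambda$$

We shall show that $q\hat{q}q^{-1} = \hat{q}$ and that $q\hat{p}q^{-1}$ is a pure quaternion which revolves through an angle $2\theta$ about $\hat{q}$. The first part is relatively easy. Since $V(q)$ and $\hat{q}$ are parallel then

$$V(V(q)\hat{q}) = \tfrac{1}{2}\{V(q)\hat{q} - \overline{V(q)\hat{q}}\} = \tfrac{1}{2}[V(q)\hat{q} - \hat{q}V(q)] = 0$$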
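therefore

$$q\hat{q}q^{-1} = (S(q) + V(q))[S(q)\hat{q} - \hat{q}V(q)]\frac{1}{N_q}$$
$$= \frac{1}{N_q}\{\hat{q}S^2(q) + S(q)[V(q)\hat{q} - \hat{q}V(q)] - V(q)\hat{q}V(q)\}$$
$$= \frac{1}{N_q}\{\hat{q}S^2(q) - V(q)\hat{q}V(q)\}$$

But

$$V^{-1}(q) = -\frac{V(q)}{N_{V(q)}} \quad \therefore\ V(q)\hat{q}V(q) = -N_{V(q)}V(q)\hat{q}V^{-1}(q) = -N_{V(q)}\hat{q}$$

so that $q\hat{q}q^{-1} = \dfrac{1}{N_q}\{S^2(q) + N_{V(q)}\}\hat{q} = \hat{q}$.

The second part is developed along similar lines.

$$q\hat{p}q^{-1} = (S(q) + V(q))[\hat{p}(S(q) - V(q))]\frac{1}{N_q}$$
$$= \frac{1}{N_q}\{\hat{p}S^2(q) + S(q)[V(q)\hat{p} - \hat{p}V(q)] - V(q)\hat{p}V(q)\}$$

However since $V(q)$ and $\hat{p}$ are perpendicular then, using $V(q)\hat{p} = -\hat{p}V(q)$:

$$V(q)\hat{p}V(q) = N_{V(q)}\hat{p}$$
$$\therefore\ q\hat{p}q^{-1} = \frac{1}{N_q}\{(S^2(q) - N_{V(q)})\hat{p} + S(q)[V(q)\hat{p} - \hat{p}V(q)]\}$$

Let $\hat{p}' = q\hat{p}q^{-1}$; then we show that $\hat{p}'$ is perpendicular to $\hat{q}$ (showing that $\hat{p}$ has been rotated about $\hat{q}$). To do this we need to show $S[(q\hat{p}q^{-1})\bar{\hat{q}}] = 0$. Now $S(\hat{p}\bar{\hat{q}}) = -S(\hat{p}\hat{q}) = 0$ by our original assumption. So all we need to show is that

$$S[(V(q)\hat{p} - \hat{p}V(q))\bar{\hat{q}}] = 0$$

or (equivalently)

$$S[(\hat{q}\hat{p} - \hat{p}\hat{q})\hat{q}] = 0 \quad \text{or} \quad S(\hat{q}\hat{p}\hat{q}) = 0$$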
But $S(\hat{q}\hat{p}\hat{q}) = S(\hat{q}^2\hat{p}) = 0$ so that $\hat{p}'$ is perpendicular to $\hat{q}$. We can also determine the angle of rotation $\psi$:

$$\cos\psi = \frac{S(\hat{p}'\bar{\hat{p}})}{\sqrt{N_{\hat{p}'}}\sqrt{N_{\hat{p}}}} = S(\hat{p}'\bar{\hat{p}}) = -S[\{q\hat{p}q^{-1}\}\hat{p}]$$
$$= -\frac{1}{N_q}(S^2(q) - N_{V(q)})S(\hat{p}^2) - \frac{1}{N_q}S(q)\,S\{[V(q)\hat{p} - \hat{p}V(q)]\hat{p}\}$$
$$= \frac{1}{N_q}(S^2(q) - N_{V(q)}) - \frac{S(q)}{N_q}S(-V(q) - V(q))$$
$$= \frac{1}{N_q}(N_q\cos^2\theta - N_q\sin^2\theta) = \cos 2\theta$$
$$\therefore\ \psi = 2\theta$$

Finally

$$\hat{x}' = \hat{q}\cos\lambda + \hat{p}'\sin\lambda$$

where $\hat{p}' = \hat{p}\cos 2\theta + \hat{q}\hat{p}\sin 2\theta$. We note that $\hat{q}\hat{p}$ is perpendicular both to $\hat{q}$ and to $\hat{p}$. This completes the confirmation that $qxq^{-1}$ describes a 3-dimensional rotation of the vector part of $x$ through an angle $2\theta$ about the vector part of $q$.

Example Determine a single rotation equivalent to the two successive rotations:

$R_1$: about the $x$-axis through an angle 90°
$R_2$: about the $z$-axis through an angle 45°

Solution In quaternion terms:

$$R_1:\ q_1 = \cos 45° + i\sin 45° \qquad R_2:\ q_2 = \cos 22.5° + k\sin 22.5°$$

Then

$$R_2R_1:\ q_2q_1 = (\cos 22.5° + k\sin 22.5°)(\cos 45° + i\sin 45°) = \cos 45°\,(\cos 22.5° + i\cos 22.5° + j\sin 22.5° + k\sin 22.5°)$$
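Also $N_{q_2q_1} = 1$ since $N_{q_1} = 1$ and $N_{q_2} = 1$. Therefore

$$q_2q_1 = \cos\phi + \hat{n}\sin\phi$$

where

$$\cos\phi = \cos 22.5°\cos 45° = \frac{1}{2}\left(\frac{\sqrt{2} + 1}{\sqrt{2}}\right)^{1/2}$$
$$\sin\phi = \frac{1}{\sqrt{2}}(1 + \sin^2 22.5°)^{1/2} = \frac{1}{2}\left(3 - \frac{1}{\sqrt{2}}\right)^{1/2}$$
$$\tan\phi = \left(\frac{3\sqrt{2} - 1}{\sqrt{2} + 1}\right)^{1/2} = [7 - 4\sqrt{2}]^{1/2}, \qquad \phi = 49.210°$$

Thus the combined rotation is equivalent to a rotation through 98.42° about an axis $i + j(\sqrt{2} - 1) + k(\sqrt{2} - 1)$. See Figure 2.12.

Figure 2.12 (the equivalent rotation axis)

Reflections

We note that if $\hat{q}$ is a unit vector then $\hat{q}^{-1} = -\hat{q}$ and the angle of this unit quaternion is equal to $\pi/2$. Thus $\hat{q} * \mathbf{w}\ (= -\hat{q}\mathbf{w}\hat{q})$ describes a rotation through 180° about $\hat{q}$. See Figure 2.13.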
Figure 2.13

Similarly $\hat{q} * (-\mathbf{w})\ (= \hat{q}\mathbf{w}\hat{q})$ describes a reflection in the plane with normal $\hat{q}$. (Note that $-\hat{q} * (\mathbf{w})$ is distinct from $\hat{q} * (-\mathbf{w})$.) Just as every planar rotation can be described in terms of reflections, so it is with three-dimensional rotations. In fact, as we shall show, every three-dimensional rotation can be decomposed into two reflections in planes which intersect in a line forming the axis of rotation. The angle of rotation is twice the angle between the planes.

This result is easily deduced. Let $\hat{q}$ be the axis of rotation and $\mathbf{w}$ a vector. If $\mathbf{w}$ is rotated through angle $\theta$ about axis $\hat{q}$ then, after rotation,

$$\mathbf{w}_r = q\mathbf{w}q^{-1}$$

where $q = \cos\theta/2 + \hat{q}\sin\theta/2$. Now consider two planes $d$, $d'$ with unit normals $\hat{n}$, $\hat{n}'$ respectively which intersect in $\hat{q}$. That is

$$\hat{n} \wedge \hat{n}' = \hat{q}\sin\beta, \qquad \hat{n} \cdot \hat{n}' = \cos\beta$$

where $\beta$ is the angle between the planes. If $\mathbf{w}$ is reflected in $d$ then it becomes $\mathbf{w}'$:

$$\mathbf{w}' = \hat{n}\mathbf{w}\hat{n}$$

then, reflecting in $d'$:

$$\mathbf{w}'' = \hat{n}'(\hat{n}\mathbf{w}\hat{n})\hat{n}' = (\hat{n}'\hat{n})\mathbf{w}(\hat{n}\hat{n}')$$

Now, using the quaternion product for vectors:

$$\hat{n}\hat{n}' = -\hat{n} \cdot \hat{n}' + \hat{n} \wedge \hat{n}' = -\cos\beta + \hat{q}\sin\beta$$
$$\hat{n}'\hat{n} = -\hat{n}' \cdot \hat{n} + \hat{n}' \wedge \hat{n} = -\cos\beta - \hat{q}\sin\beta$$
But

$$(\hat{n}'\hat{n})^{-1} = (-\cos\beta - \hat{q}\sin\beta)^{-1} = (-\cos\beta + \hat{q}\sin\beta) = \hat{n}\hat{n}'$$

Therefore

$$w'' = (\hat{n}'\hat{n})w(\hat{n}'\hat{n})^{-1}$$

Since $\hat{n}'\hat{n} = -(\cos\beta + \hat{q}\sin\beta)$ and the overall sign cancels in this product, choosing $\beta = \theta/2$ gives $w'' = w_r$, which proves the statement.

We have here examined rotations and reflections directly in geometrical terms. We can confirm our interpretations by appealing to the properties of the linear space $\mathbb{R}^3$ underlying this work. The map $\phi$ acting on a pure quaternion $w$:

$$\phi: \mathbb{R}^3 \to \mathbb{R}^3, \qquad \phi(w) = qwq^{-1}$$

is linear. Without loss of generality we choose $N_q = 1$, and since $\mathbb{R}^3 = \mathrm{span}\{i, j, k\}$, if $q = a + bi + cj + dk$ then

$$\phi(i) = i(a^2 + b^2 - c^2 - d^2) + j(2ad + 2bc) + k(2bd - 2ac)$$
$$\phi(j) = i(-2ad + 2bc) + j(a^2 + c^2 - b^2 - d^2) + k(2ab + 2cd)$$
$$\phi(k) = i(2ac + 2bd) + j(2cd - 2ab) + k(a^2 + d^2 - b^2 - c^2)$$

so that the matrix representation of the map $\phi$ is

$$M = \begin{bmatrix} a^2 + b^2 - c^2 - d^2 & -2ad + 2bc & 2ac + 2bd \\ 2ad + 2bc & a^2 + c^2 - b^2 - d^2 & 2cd - 2ab \\ 2bd - 2ac & 2ab + 2cd & a^2 + d^2 - b^2 - c^2 \end{bmatrix}$$

It is easily checked that $M$ is orthogonal, $MM^T = I$, and $\det M = 1$, so that the linear map $\phi(w) = qwq^{-1}$ represents a rotation in $\mathbb{R}^3$. On the other hand, if we consider the linear map, for a unit vector $q$ (so that $a = 0$),

$$\theta: \mathbb{R}^3 \to \mathbb{R}^3, \qquad \theta(w) = qwq$$

then we find the matrix representation of this map, $N$:

$$N = \begin{bmatrix} -b^2 + c^2 + d^2 & -2bc & -2bd \\ -2bc & -c^2 + b^2 + d^2 & -2cd \\ -2bd & -2cd & -d^2 + b^2 + c^2 \end{bmatrix}$$

We see that $N = -M(a = 0)$ is orthogonal with $\det N = -1$, since the dimension of $M$ is odd. Hence the map $\theta$ is not orientation preserving and represents a reflection (as we saw above, it is a reflection of $w$ in the plane with normal $q$).
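The claims about $M$ and $N$ can be spot-checked numerically. The sketch below assumes a particular unit quaternion and a particular unit normal chosen only for illustration; the function names are hypothetical.

```python
def rotation_matrix(a, b, c, d):
    """Matrix M of w -> q w q^{-1} for the unit quaternion q = a + bi + cj + dk."""
    return [
        [a*a + b*b - c*c - d*d, 2*b*c - 2*a*d,         2*a*c + 2*b*d],
        [2*a*d + 2*b*c,         a*a + c*c - b*b - d*d, 2*c*d - 2*a*b],
        [2*b*d - 2*a*c,         2*a*b + 2*c*d,         a*a + d*d - b*b - c*c],
    ]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def times_transpose(m):
    """Return the product m m^T."""
    return [[sum(m[i][k] * m[j][k] for k in range(3)) for j in range(3)]
            for i in range(3)]

# An illustrative unit quaternion: 0.64 + 0.04 + 0.16 + 0.16 = 1.
M = rotation_matrix(0.8, 0.2, 0.4, 0.4)

# An illustrative unit normal; N = -M(a = 0) is the matrix of w -> q w q.
n = (2/3, 1/3, 2/3)
N = [[-x for x in row] for row in rotation_matrix(0.0, *n)]

# A reflection in the plane with normal n sends n to -n.
N_times_n = [sum(N[i][k] * n[k] for k in range(3)) for i in range(3)]
```

Here `times_transpose(M)` should be the identity and `det3(M)` should be 1 up to rounding, while `det3(N)` should be -1 and `N_times_n` should equal `-n`.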
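2.7 Relation to the Rotation Matrix

Let $(\mathbf{x})_i$, $i = 1, 2, 3$, be a given orthonormal system of vectors (which, as the need arises, will be interpreted as 'pure' quaternions). Since it is an orthonormal system,

$$(\mathbf{x})_i\cdot(\mathbf{x})_j = \delta_{ij}$$

Under a rotation, defined by a given quaternion $q$, this system is transformed into a 'primed' orthonormal system $(\mathbf{x})'_i$, $i = 1, 2, 3$, such that:

$$(\mathbf{x})'_i = q(\mathbf{x})_iq^c, \qquad qq^c = 1$$

Now we can write $(\mathbf{x})'_i$ as a linear combination of the $(\mathbf{x})_i$ through the use of the rotation tensor $R_{ij}$:

$$(\mathbf{x})'_i = R_{ij}(\mathbf{x})_j \quad\text{using the summation convention}$$

The tensor $R_{ij}$ defines the rotation just as the quaternion $q$ defines the rotation. We now show that given $R_{ij}$ we can construct $q$ and vice versa. Clearly

$$q(\mathbf{x})_iq^c = R_{ij}(\mathbf{x})_j$$

Therefore

$$[q(\mathbf{x})_iq^c](\mathbf{x})_k = R_{ij}(\mathbf{x})_j(\mathbf{x})_k = R_{ij}\left[-(\mathbf{x})_j\cdot(\mathbf{x})_k + (\mathbf{x})_j\wedge(\mathbf{x})_k\right]$$

Thus, taking the scalar part of this equation leads directly to

$$S[q(\mathbf{x})_iq^c(\mathbf{x})_k] = -R_{ik}$$

showing that if we are given $q$ and given $(\mathbf{x})_i$ we can construct $R_{ij}$.

Now let $q = \alpha + \boldsymbol{\beta}$; then $q^c = \alpha - \boldsymbol{\beta}$ and, using the quaternion product:

$$(\mathbf{x})_i(\mathbf{x})_i = -(\mathbf{x})_i\cdot(\mathbf{x})_i = -3$$

Also

$$(\mathbf{x})_iq^c(\mathbf{x})_i = (\mathbf{x})_i[\alpha - \boldsymbol{\beta}](\mathbf{x})_i = -3\alpha - (\mathbf{x})_i\boldsymbol{\beta}(\mathbf{x})_i$$
$$= -3\alpha - (\mathbf{x})_i\left[-\boldsymbol{\beta}\cdot(\mathbf{x})_i + \boldsymbol{\beta}\wedge(\mathbf{x})_i\right]$$
$$= -3\alpha + (\mathbf{x})_i[\boldsymbol{\beta}\cdot(\mathbf{x})_i] - (\mathbf{x})_i\wedge(\boldsymbol{\beta}\wedge(\mathbf{x})_i)$$
$$= -3\alpha + 2(\mathbf{x})_i[\boldsymbol{\beta}\cdot(\mathbf{x})_i] - [(\mathbf{x})_i\cdot(\mathbf{x})_i]\boldsymbol{\beta}$$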
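in which the standard formula for the triple vector product has been employed. However, since we can regard $\boldsymbol{\beta}$ as a linear combination of the $(\mathbf{x})_i$, viz

$$\boldsymbol{\beta} = a_j(\mathbf{x})_j$$

and so

$$\boldsymbol{\beta}\cdot(\mathbf{x})_i = a_j(\mathbf{x})_j\cdot(\mathbf{x})_i = a_i \quad\text{and}\quad 2(\mathbf{x})_i[\boldsymbol{\beta}\cdot(\mathbf{x})_i] = 2a_i(\mathbf{x})_i = 2\boldsymbol{\beta}$$

Therefore

$$(\mathbf{x})_iq^c(\mathbf{x})_i = -3\alpha + 2\boldsymbol{\beta} - 3\boldsymbol{\beta} = -4\alpha + q^c$$

Thus

$$q(\mathbf{x})_iq^c(\mathbf{x})_i = q[-4\alpha + q^c] = -4\alpha q + 1$$

From $(\mathbf{x})'_i = R_{ij}(\mathbf{x})_j$ we obtain

$$R_{ij}(\mathbf{x})_j(\mathbf{x})_i = -4\alpha q + 1$$

Taking the quaternion conjugate of both sides and adding, we have:

$$R_{ij}(\mathbf{x})_j(\mathbf{x})_i + R_{ij}(\mathbf{x})_i(\mathbf{x})_j = -4\alpha q + 1 - 4\alpha q^c + 1 = -8\alpha^2 + 2$$

But

$$(\mathbf{x})_j(\mathbf{x})_i = -(\mathbf{x})_j\cdot(\mathbf{x})_i + (\mathbf{x})_j\wedge(\mathbf{x})_i = -\delta_{ij} + (\mathbf{x})_j\wedge(\mathbf{x})_i$$

Therefore, directly:

$$R_{ij}(\mathbf{x})_j(\mathbf{x})_i + R_{ij}(\mathbf{x})_i(\mathbf{x})_j = R_{ij}\left[-\delta_{ji} + (\mathbf{x})_j\wedge(\mathbf{x})_i - \delta_{ij} + (\mathbf{x})_i\wedge(\mathbf{x})_j\right] = -2R_{ii}$$

Hence we deduce

$$4\alpha^2 = 1 + R_{ii}$$

and from an earlier result

$$q = \frac{1 - R_{ij}(\mathbf{x})_j(\mathbf{x})_i}{4\alpha} = \pm\frac{1 - R_{ij}(\mathbf{x})_j(\mathbf{x})_i}{2\sqrt{1 + R_{ii}}}$$

showing that, given $R_{ij}$ and given $(\mathbf{x})_j$, we can find $q$ (of course, in the rotation defined by $q(\mathbf{x})_iq^c$ this sign is superfluous as $q$ occurs quadratically).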
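The construction of $q$ from $R_{ij}$ can be sketched in code. This is a minimal illustration under two assumptions that are not in the text: the orthonormal system is taken to be $(i, j, k)$, so that $R_{ij}(\mathbf{x})_j(\mathbf{x})_i = -R_{ii} - \epsilon_{ijk}R_{ij}(\mathbf{x})_k$, and the positive root $\alpha = \frac{1}{2}\sqrt{1 + R_{ii}}$ is chosen (which requires $\alpha \ne 0$, i.e. the rotation is not through 180°). The function names are hypothetical.

```python
import math

def rotation_tensor(a, b, c, d):
    """R_ij with (x)'_i = q (x)_i q^c = R_ij (x)_j for unit q = a + bi + cj + dk.

    Row i holds the components of the rotated basis vector (x)'_i.
    """
    return [
        [a*a + b*b - c*c - d*d, 2*a*d + 2*b*c,         2*b*d - 2*a*c],
        [2*b*c - 2*a*d,         a*a + c*c - b*b - d*d, 2*a*b + 2*c*d],
        [2*a*c + 2*b*d,         2*c*d - 2*a*b,         a*a + d*d - b*b - c*c],
    ]

def quaternion_from_rotation_tensor(R):
    """Recover q from R using 4 alpha^2 = 1 + R_ii and q = (1 - R_ij x_j x_i)/(4 alpha)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    alpha = 0.5 * math.sqrt(1.0 + trace)      # positive root; assumes alpha != 0
    # Vector part: epsilon_ijk R_ij / (4 alpha) for k = 1, 2, 3.
    b = (R[1][2] - R[2][1]) / (4 * alpha)
    c = (R[2][0] - R[0][2]) / (4 * alpha)
    d = (R[0][1] - R[1][0]) / (4 * alpha)
    return (alpha, b, c, d)

q_in = (0.8, 0.2, 0.4, 0.4)                   # illustrative unit quaternion
R = rotation_tensor(*q_in)
q_out = quaternion_from_rotation_tensor(R)
```

With the positive root, `q_out` should reproduce `q_in`; the other sign gives $-q$, which defines the same rotation.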
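2.8 Matrix Formulation of Quaternions

Because quaternion algebra is associative, quaternions can be represented by matrices. That is, the map $\phi$ between the space of quaternions and the space of 4 × 4 matrices over the real numbers defined by:

$$\phi: (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes)$$

$$\phi\{a + ib + jc + kd\} = \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{bmatrix}$$

is an isomorphism (here $\oplus$ is matrix addition and $\otimes$ is matrix multiplication). This map is suggested by the form taken by the matrix representation of $\phi_L$, which we first considered when four dimensional rotations were examined. We first demonstrate its homomorphic properties. If $p = a + bi + cj + dk$, $q = \alpha + \beta i + \gamma j + \delta k$ are any two quaternions then:

$$\phi\{p + q\} = \phi\{a + \alpha + (b + \beta)i + (c + \gamma)j + (d + \delta)k\}$$

$$= \begin{bmatrix} a + \alpha & -(b + \beta) & -(c + \gamma) & -(d + \delta) \\ b + \beta & a + \alpha & -(d + \delta) & c + \gamma \\ c + \gamma & d + \delta & a + \alpha & -(b + \beta) \\ d + \delta & -(c + \gamma) & b + \beta & a + \alpha \end{bmatrix}$$

$$= \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{bmatrix} \oplus \begin{bmatrix} \alpha & -\beta & -\gamma & -\delta \\ \beta & \alpha & -\delta & \gamma \\ \gamma & \delta & \alpha & -\beta \\ \delta & -\gamma & \beta & \alpha \end{bmatrix} = \phi\{p\} \oplus \phi\{q\}$$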
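$$\phi\{pq\} = \phi\{a\alpha - b\beta - c\gamma - d\delta + i[a\beta + c\delta - \gamma d + \alpha b] + j[a\gamma + \beta d - b\delta + \alpha c] + k[a\delta + b\gamma - c\beta + \alpha d]\}$$

$$= \phi\{A + Bi + Cj + Dk\} = \begin{bmatrix} A & -B & -C & -D \\ B & A & -D & C \\ C & D & A & -B \\ D & -C & B & A \end{bmatrix}$$

$$= \begin{bmatrix} a & -b & -c & -d \\ b & a & -d & c \\ c & d & a & -b \\ d & -c & b & a \end{bmatrix} \otimes \begin{bmatrix} \alpha & -\beta & -\gamma & -\delta \\ \beta & \alpha & -\delta & \gamma \\ \gamma & \delta & \alpha & -\beta \\ \delta & -\gamma & \beta & \alpha \end{bmatrix} = \phi\{p\} \otimes \phi\{q\}$$

Thus the map $\phi$ is a homomorphism. It is also one-to-one, and onto the set of matrices of this form, and so $\phi$ is an isomorphism. It is irrelevant whether we work with the quaternions as introduced earlier or whether we use the matrix form: identical results will be obtained. Quaternion algebra is the algebra of 4 × 4 matrices of this form. We note that the matrix transpose corresponds to the quaternion conjugate.

The closely related map $\lambda$:

$$\lambda: (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes)$$

$$\lambda\{a + bi + cj + dk\} = \begin{bmatrix} a & -b & -c & -d \\ b & a & d & -c \\ c & -d & a & b \\ d & c & -b & a \end{bmatrix}$$

is suggested by the matrix representation of $\phi_R$. Now it is easily verified that $\lambda$ is such that

$$\lambda\{p + q\} = \lambda\{p\} \oplus \lambda\{q\}, \qquad \lambda\{pq\} = \lambda\{q\} \otimes \lambda\{p\}$$

and so $\lambda$ is not a homomorphism. As the map $\lambda$ is onto and one-to-one it is an example of an anti-isomorphism. There is a further alternative map $\theta$:

$$\theta: (\mathbb{H}, +, \cdot) \to (M(4, \mathbb{R}), \oplus, \otimes)$$

$$\theta\{a + bi + cj + dk\} = \begin{bmatrix} a & -d & -c & -b \\ d & a & b & -c \\ c & -b & a & d \\ b & c & -d & a \end{bmatrix}$$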
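If $p = a + bi + cj + dk$, $q = \alpha + \beta i + \gamma j + \delta k$ then $\theta\{p + q\} = \theta\{p\} \oplus \theta\{q\}$ is transparently true. However,

$$\theta\{pq\} = \theta\{a\alpha - b\beta - c\gamma - d\delta + i[a\beta + c\delta - \gamma d + \alpha b] + j[a\gamma + \beta d - b\delta + \alpha c] + k[a\delta + b\gamma - c\beta + \alpha d]\}$$

$$= \theta\{A + Bi + Cj + Dk\} = \begin{bmatrix} A & -D & -C & -B \\ D & A & B & -C \\ C & -B & A & D \\ B & C & -D & A \end{bmatrix}$$

$$= \begin{bmatrix} a & -d & -c & -b \\ d & a & b & -c \\ c & -b & a & d \\ b & c & -d & a \end{bmatrix} \otimes \begin{bmatrix} \alpha & -\delta & -\gamma & -\beta \\ \delta & \alpha & \beta & -\gamma \\ \gamma & -\beta & \alpha & \delta \\ \beta & \gamma & -\delta & \alpha \end{bmatrix} = \theta\{p\} \otimes \theta\{q\}$$

(The reader should check this computation.) Therefore the map $\theta$ is a homomorphism. This is also clearly one-to-one and onto and so $\theta$ is an isomorphism.

There is a further isomorphism between these 4 × 4 matrices and 2 × 2 matrices over the field of complex numbers. We note that the 4 × 4 matrices of the form considered here can be partitioned into 2 × 2 submatrices each of the form:

$$\begin{bmatrix} s & t \\ -t & s \end{bmatrix}$$

but these, as we have seen, provide an algebra isomorphic to the complex numbers $\mathbb{C}$. This implies that the algebra of quaternions can be fully described by the algebra of 2 × 2 matrices over the field of complex numbers:

$$q \leftrightarrow \begin{bmatrix} \alpha & -\beta^* \\ \beta & \alpha^* \end{bmatrix} \quad\text{where } \alpha, \beta \in \mathbb{C}$$

Here $\alpha$ and $\beta$ stand for the upper-left and lower-left blocks of $\theta\{q\}$:

$$\alpha = \begin{bmatrix} a & -d \\ d & a \end{bmatrix} \leftrightarrow a + \sqrt{-1}\,d, \qquad \beta = \begin{bmatrix} c & -b \\ b & c \end{bmatrix} \leftrightarrow c + \sqrt{-1}\,b$$

which states that once the first column of this 2 × 2 matrix is specified then the quaternion $q$ is determined. We note that in this representation the quaternion conjugate is equivalent to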
taking an Hermitian conjugate. The isomorphism between quaternions and 2 × 2 complex matrices implies that we can notate the quaternion as

$$q = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \qquad \alpha, \beta \in \mathbb{C}$$

in which $q = a + ib + jc + kd$. It is quickly checked that

$$qp = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \times \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} \alpha\gamma - \delta\beta^* \\ \gamma\beta + \alpha^*\delta \end{bmatrix}$$

which is (formally) identical to the rule for multiplying complex numbers. We shall return to this method of referring to a quaternion (and to other 'numbers') later.

It is perhaps worthy of note here that the 2 × 2 complex matrix representation of quaternions has found application in quantum mechanics. The quantum mechanical state vector of a fermion (spin $\frac{1}{2}$) has the odd property that when it rotates through 360° it rotates not into itself but into minus itself. But this behaviour is not odd in quaternion algebra. The quaternion demonstrator is an example of such behaviour. The fermion 'spin' might be modelled by the disk with ribbon attached. Rotating once, through 360°, produces the disk with a twist in the ribbon, noted as (−fermion spin). Rotating twice, through 720°, produces the disk with no twist in the ribbon, noted as (+fermion spin). The 2 × 2 matrix representations of $i, j, k$ are proportional to the Pauli Matrices and are fundamental to the development of quantum mechanics.

Quaternions, of the type introduced here, are associated with quadratic forms of positive definite signature (i.e. the norm $N_q = a^2 + b^2 + c^2 + d^2$). They are therefore not suitable for the description of space-time, which is associated with quadratic forms of Lorentzian signature $(+, -, -, -)$. If we allow the use of complex scalars then this difficulty can be overcome. These 'complex quaternions' of the form:

$$p = \alpha + i\boldsymbol{\beta} \qquad \alpha \in \mathbb{C},\ \boldsymbol{\beta} \in \mathbb{C}^3$$

were introduced earlier when we discussed the application of a quaternion formalism to the formulation of homogeneous equations of points, lines and planes. They will be considered again in Chapter 3 and their application to space-time considered.
However, we note here that although complex quaternions might have considerable application to the description of space-time they, unfortunately, do not constitute a division algebra.
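The homomorphism properties claimed in this section for $\phi$, $\lambda$ and $\theta$ lend themselves to a direct numerical check. The sketch below uses integer components so that all comparisons are exact. The function `complex2` encodes one reading of the 2 × 2 complex form, with $\alpha = a + \sqrt{-1}\,d$ and $\beta = c + \sqrt{-1}\,b$; that reading, and all function names, are assumptions made for illustration.

```python
def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q = (e, f, g, h)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def phi(q):
    a, b, c, d = q
    return [[a, -b, -c, -d], [b, a, -d, c], [c, d, a, -b], [d, -c, b, a]]

def lam(q):
    a, b, c, d = q
    return [[a, -b, -c, -d], [b, a, d, -c], [c, -d, a, b], [d, c, -b, a]]

def theta(q):
    a, b, c, d = q
    return [[a, -d, -c, -b], [d, a, b, -c], [c, -b, a, d], [b, c, -d, a]]

def complex2(q):
    """2 x 2 complex form [[alpha, -conj(beta)], [beta, conj(alpha)]] (assumed reading)."""
    a, b, c, d = q
    alpha, beta = complex(a, d), complex(c, b)
    return [[alpha, -beta.conjugate()], [beta, alpha.conjugate()]]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

p = (1, 2, 3, 4)
q = (5, -6, 7, -8)
pq = qmul(p, q)

phi_ok = phi(pq) == matmul(phi(p), phi(q))              # homomorphism
lam_anti = lam(pq) == matmul(lam(q), lam(p))            # order reversed
lam_homo = lam(pq) == matmul(lam(p), lam(q))            # expected to fail
theta_ok = theta(pq) == matmul(theta(p), theta(q))      # homomorphism
complex_ok = complex2(pq) == matmul(complex2(p), complex2(q))
conj_ok = transpose(phi(p)) == phi((p[0], -p[1], -p[2], -p[3]))
```

For these sample values every flag except `lam_homo` should be `True`, in line with $\lambda$ being an anti-isomorphism rather than a homomorphism.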
2.9 Applications to Spherical Trigonometry

We begin our discussion on spherical trigonometry with some basic definitions.

Definition 1 The intersection of a plane with the surface of a sphere is called a great circle if the plane passes through the centre of the sphere. Otherwise the curve of intersection is called a small circle.

[Figure 2.14: a great circle, a small circle and the axis of the small circle]

Definition 2 The axis of any circle (small or great) of a sphere is the unique diameter of the sphere which is perpendicular to the plane of the circle. The extremities of such a diameter are called the poles of the circle. See Figure 2.15.

Definition 3 The arc-length of the great circle from a point on a small circle to its nearest pole is known as the spherical radius. See Figure 2.15.

[Figure 2.15: nearest pole, furthest pole and spherical radius]

Definition 4 When two circles intersect (small or great) the angle between the tangents at either of their points of intersection is simply referred to as the angle between the
circles. Clearly the angle of intersection of two great circles is equal to the inclination of their planes. See Figure 2.16.

[Figure 2.16: $\theta$ is the angle between the circles]

Spherical Triangles

We first obtain a standard result: the shortest path connecting two points on the surface of a sphere is a great circle arc.

Proof Consider two points $P$, $Q$ on the surface of a sphere of radius $r$ and let $C$ be a curve on the sphere connecting them, with parametric equations

$$x = x(t), \quad y = y(t), \quad z = z(t), \qquad t_0 \le t \le t_1$$

[Figure 2.17]

The length of this curve is

$$L = \int_{t_0}^{t_1}\sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + \left(\frac{dz}{dt}\right)^2}\;dt$$
with the constraint that

$$x^2 + y^2 + z^2 = r^2$$

which assumes that the centre of the sphere is at the origin of coordinates. The constraint is automatically satisfied if we use spherical polar coordinates $(r, \theta, \phi)$:

$$x = r\sin\phi(t)\cos\theta(t), \quad y = r\sin\phi(t)\sin\theta(t), \quad z = r\cos\phi(t)$$

We choose, without loss of generality, that the $z$-axis passes through the point $P$. Then

$$\frac{dx}{dt} = r\cos\phi\cos\theta\,\frac{d\phi}{dt} - r\sin\phi\sin\theta\,\frac{d\theta}{dt}$$

$$\frac{dy}{dt} = r\cos\phi\sin\theta\,\frac{d\phi}{dt} + r\sin\phi\cos\theta\,\frac{d\theta}{dt}$$

$$\frac{dz}{dt} = -r\sin\phi\,\frac{d\phi}{dt}$$

and therefore

$$L = \int_{t_0}^{t_1} r\sqrt{\left(\frac{d\phi}{dt}\right)^2 + \sin^2\phi\left(\frac{d\theta}{dt}\right)^2}\;dt$$

Although we could find those specific functions $\theta(t)$, $\phi(t)$ which minimise $L$ using the methods of the variational calculus, in this case we proceed to optimise $L$ by inspection. Clearly

$$L \ge \int_{t_0}^{t_1} r\,\frac{d\phi}{dt}\,dt = r[\phi(t_1) - \phi(t_0)]$$

But this is just the length of the great circle arc connecting points $P$, $Q$. (Equality is obtained if $\frac{d\theta}{dt} = 0$, defining the great circle arc $\theta = \text{const}$, or when $\sin^2\phi = 0$, which again defines a particular great circle arc.)

We are now in a position to define a spherical triangle, which is a triangle on the surface of a sphere whose sides are great circle arcs.

[Figure 2.18: a spherical triangle, and a figure that is not a spherical triangle]
Great Circle Arcs

In our discussion of complex numbers we introduced the idea of representing an arc of the unit circle (subtending an angle $\theta$ at the origin) by a complex number of unit norm: $\cos\theta + j\sin\theta$. A similar correspondence is possible with quaternions and great circle arcs on the unit sphere. Consider a unit quaternion $q = \cos\theta + \hat{q}\sin\theta$. We may associate this quaternion with the great circle arc which is obtained when the diametral plane with normal $\hat{q}$ intersects the unit sphere. Clearly the position of the arc along the circle is arbitrary and so the arc $AB$ is free to slide on this great circle as long as its length and direction are maintained. See Figure 2.19(a).

Because the correspondence between unit complex numbers and unit quaternions is so close (for fixed $\hat{q}$ the algebra of quaternions is identical to the algebra of complex numbers) we can draw the same conclusions as in Section 1.8. We write (using $\sim$ to specify the geometrical correspondence)

$$q \sim \cos\theta + \hat{q}\sin\theta \sim \mathrm{arc}_{AB}$$

A great circle arc can be positioned anywhere on its circle: it is slidable. The arc is able to move to any position on the circle so long as its length and direction remain unchanged. We deduce:

$q \sim \mathrm{arc}_{AB}$
$q^c \sim \mathrm{arc}_{BA}$
$-q \sim \mathrm{arc}_{AB} + \text{semicircle}$
$1 \sim$ point
$-1 \sim$ semicircle
$\hat{q} \sim$ quarter circle
$\mathrm{arc}_q + \text{circle} = \mathrm{arc}_q$
$\mathrm{arc}_q + \text{semicircle} = \mathrm{arc}_{-q}$

[Figure 2.19: panels (a) and (b)]
Great circle arcs (on the same circle) can be added vectorially in an identical manner to plane circular arcs in the complex plane. This follows since if

$$p = \cos\phi + \hat{q}\sin\phi, \qquad q = \cos\theta + \hat{q}\sin\theta$$

then (using $\hat{q}^2 = -1$)

$$pq = \cos(\theta + \phi) + \hat{q}\sin(\theta + \phi)$$

and so

$$\mathrm{arc}_p + \mathrm{arc}_q = \mathrm{arc}_{pq}$$

However, there is a more general result. We know that if $p$, $q$ are general unit quaternions (not necessarily having vector parts parallel):

$$p = \cos\phi + \hat{p}\sin\phi, \qquad q = \cos\theta + \hat{q}\sin\theta$$

then operation by $q*$ (represented by $\mathrm{arc}_{AB}$) followed by operation by $p*$ (represented by $\mathrm{arc}_{BC}$) is equivalent to a single operation $(pq)*$ (represented by $\mathrm{arc}_{AC}$). See Figure 2.19(b). Thus for general great circle arcs:

$$\mathrm{arc}_{AB} + \mathrm{arc}_{BC} = \mathrm{arc}_{AC} \quad\text{or}\quad \mathrm{arc}_q + \mathrm{arc}_p = \mathrm{arc}_{pq}$$

This is generalised to apply to the vector sum of any number of great circle arcs:

$$\mathrm{arc}_q + \mathrm{arc}_p + \ldots + \mathrm{arc}_h = \mathrm{arc}_{h\ldots pq}$$

As a generalisation of the result in complex numbers relating to arcs which combine to a closed circle, we easily show that the arcs associated with the unit quaternions $q, p, \ldots, h$, taken in this order, will form a closed spherical polygon only if $h\ldots pq = 1$.

Finally, we note that if $P$, $Q$ are general quaternions, $P = \sqrt{N_P}\,p$ and $Q = \sqrt{N_Q}\,q$, where $p$, $q$ are unit quaternions, then

$$PQ = \sqrt{N_P}\,p\,\sqrt{N_Q}\,q \sim \sqrt{N_P}\sqrt{N_Q}\;\mathrm{arc}_{pq}$$

Here we see that multiplication of quaternions can be interpreted as multiplication of positive real numbers together with addition of great circle arcs. Also, as with complex numbers, a quaternion $Q = \sqrt{N_Q}\,q$ has 'polar coordinates' $(\sqrt{N_Q}, \mathrm{arc}_q)$.
The Sine and Cosine Laws of Spherical Trigonometry

Having defined a spherical triangle, there are naturally defined six angles: $a°, b°, c°$ called arc angles and $A°, B°, C°$ called vertex angles; see Figure 2.20.

[Figure 2.20]

For simplicity (and by accepted convention) the lengths of all arcs comprising a spherical triangle will be taken as less than a semi-circle. This implies that $0 < a°, b°, c° < 180°$, which further implies that $\sin a°$, $\sin b°$, $\sin c°$ are all positive.

Now, as we have seen above, we can represent arcs quaternionically. If

$$q = \cos a° + \hat{q}\sin a°, \qquad p = \cos c° + \hat{p}\sin c°$$

then

$$qp = \cos c°\cos a° - \hat{q}\cdot\hat{p}\,\sin c°\sin a° + \hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° + \hat{q}\wedge\hat{p}\,\sin c°\sin a°$$

However, $\mathrm{arc}_{AB} \sim p*$, $\mathrm{arc}_{BC} \sim q*$, $\mathrm{arc}_{AC} \sim qp*$, and writing $\mathrm{arc}_{AC} \sim \cos b° + \hat{m}\sin b°$ we obtain, by equating scalar and vector parts:

$$\cos c°\cos a° - \hat{q}\cdot\hat{p}\,\sin c°\sin a° = \cos b° \qquad \text{(i)}$$

$$\hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° + \hat{q}\wedge\hat{p}\,\sin c°\sin a° = \hat{m}\sin b° \qquad \text{(ii)}$$

But, introducing unit vectors $A, B, C$ via:

$$\hat{p} = \frac{A\wedge B}{\sin c°}, \qquad \hat{q} = \frac{B\wedge C}{\sin a°}, \qquad \hat{m} = \frac{A\wedge C}{\sin b°}$$
and so $\hat{p}, \hat{q}, \hat{m}$ are unit vectors in the directions of $A\wedge B$, $B\wedge C$ and $A\wedge C$ respectively. Thus, looking down the axis of $B$ (see Figure 2.21), we deduce

$$\cos(\pi - B°) = \hat{q}\cdot\hat{p}$$

and so, from (i),

$$\cos c°\cos a° + \cos B°\sin c°\sin a° = \cos b°$$

which is (together with two other formulae obtained by cyclic interchange) the Law of Cosines in spherical trigonometry.

Relation (ii) is used to make another important deduction. Noting that $\hat{q}\wedge\hat{p} = -B\sin B°$, $B\cdot\hat{q} = 0$ and $B\cdot\hat{p} = 0$ we obtain, from (ii),

$$B\sin B°\sin c°\sin a° = \hat{q}\sin a°\cos c° + \hat{p}\cos a°\sin c° - \hat{m}\sin b°$$

therefore

$$\sin B°\sin c°\sin a° = -B\cdot\hat{m}\,\sin b°$$

leading to

$$\frac{\sin B°}{\sin b°} = \frac{-B\cdot\hat{m}}{\sin c°\sin a°} = \frac{-B\cdot(A\wedge C)}{\sin a°\sin b°\sin c°} = \frac{A\cdot(B\wedge C)}{\sin a°\sin b°\sin c°}$$

But the right hand side is unchanged on cyclic interchange and so we deduce

$$\frac{\sin A°}{\sin a°} = \frac{\sin B°}{\sin b°} = \frac{\sin C°}{\sin c°}$$
which is the Sine Law of spherical trigonometry. We note that the Sine Law is obtained from (ii) by taking the scalar product of both sides with $B$. The other possibility, taking the vector product of this equation with $B$, leads to

$$0 = B\wedge\hat{q}\,\sin a°\cos c° + B\wedge\hat{p}\,\cos a°\sin c° - B\wedge\hat{m}\,\sin b° \qquad \text{(iii)}$$

But

$$B\wedge\hat{q} = \frac{B\wedge(B\wedge C)}{\sin a°} = \frac{[(B\cdot C)B - C]}{\sin a°}$$

$$B\wedge\hat{p} = \frac{B\wedge(A\wedge B)}{\sin c°} = \frac{[A - (B\cdot A)B]}{\sin c°}$$

and

$$B\wedge\hat{m} = \frac{B\wedge(A\wedge C)}{\sin b°} = \frac{[(B\cdot C)A - (B\cdot A)C]}{\sin b°}$$

Then using $B\cdot A = \cos c°$, $B\cdot C = \cos a°$ and $A\cdot C = \cos b°$ leads to an identity in (iii).

2.10 Rotating Axes in Mechanics

As a final application of the quaternion formalism we consider the kinematics of a rigid body, in particular its velocity and acceleration with respect to coordinate systems related by a rotation. Quaternions are likely to be useful in this situation since rotations are involved.

Let us consider a unit quaternion $q$, so that

$$qq^c = 1$$

Then, assuming the components are differentiable functions of a parameter $t$ (the time), we have:

$$\frac{dq}{dt}q^c + q\frac{dq^c}{dt} = 0$$

that is

$$\frac{dq}{dt}q^c = -q\frac{dq^c}{dt} = -\left(\frac{dq}{dt}q^c\right)^c$$

Therefore we conclude that $\frac{dq}{dt}q^c$ has no scalar part. But the identity

$$\frac{dq}{dt} = \left(\frac{dq}{dt}q^c\right)q \quad\text{using } q^cq = 1$$

implies that we can always write

$$\frac{dq}{dt} = \frac{1}{2}\omega q$$

in which $\omega = 2\frac{dq}{dt}q^c$ is a pure quaternion. It immediately follows that

$$\frac{dq^c}{dt} = -\frac{1}{2}q^c\omega$$
If $p$ is the position vector (relative to space coordinates) of a point fixed in a body which is rotating according to the value of quaternion $q$, then its coordinates relative to axes fixed in the body are

$$p' = q^cpq$$

and $p'$ is such that $\frac{dp'}{dt} = 0$. That is

$$0 = \frac{dq^c}{dt}pq + q^c\frac{dp}{dt}q + q^cp\frac{dq}{dt} = -\frac{1}{2}q^c\omega pq + q^c\frac{dp}{dt}q + \frac{1}{2}q^cp\omega q = q^c\left[\frac{dp}{dt} - \frac{1}{2}(\omega p - p\omega)\right]q$$

which requires

$$\frac{dp}{dt} = \omega\wedge p$$

Thus we can interpret $\omega$ as the angular velocity of the body.

This analysis can be extended to particles which are not fixed with respect to body or space coordinates. Now consider a transformation of coordinates (i.e. rotating axes). If the vector parts of quaternions $x$, $x'$ represent the position vectors of a particle with respect to body and space coordinates then $x$, $x'$ are related through

$$x' = q^cxq$$

and, writing $x = x_0 + r$, then

$$\frac{dx'}{dt} = \frac{dq^c}{dt}xq + q^c\frac{dx}{dt}q + q^cx\frac{dq}{dt} = -\frac{1}{2}q^c\omega xq + q^c\frac{dx}{dt}q + \frac{1}{2}q^cx\omega q = q^c\left[\frac{dx}{dt} - \frac{1}{2}(\omega x - x\omega)\right]q$$

But

$$\omega x - x\omega = \omega[x_0 + r] - [x_0 + r]\omega = x_0\omega - \omega\cdot r + \omega\wedge r - x_0\omega + r\cdot\omega - r\wedge\omega = 2\omega\wedge r$$

therefore

$$\frac{dx'}{dt} = q^c\left[\frac{dx}{dt} - \omega\wedge r\right]q$$
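Differentiating again, and treating the bracketed quaternion in the same way as $x$ was treated above:

$$\frac{d^2x'}{dt^2} = q^c\left[\frac{d}{dt}\left(\frac{dx}{dt} - \omega\wedge r\right) - \omega\wedge\left(\frac{dr}{dt} - \omega\wedge r\right)\right]q$$

$$= q^c\left[\frac{d^2x}{dt^2} - \frac{d\omega}{dt}\wedge r - 2\omega\wedge\frac{dr}{dt} + \omega\wedge(\omega\wedge r)\right]q$$

If axes are chosen so that the two coordinate systems coincide instantaneously then $q = 1$ and the usual (classical) relations are obtained between rotating coordinate systems.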
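The relations $\omega = 2\frac{dq}{dt}q^c$ and $\frac{dp}{dt} = \omega\wedge p$ of this section can be illustrated with a small numerical experiment. The sketch assumes a body spinning at a constant rate about a fixed axis (the values `AXIS`, `RATE` and `P_BODY` are arbitrary choices for illustration) and approximates derivatives by central differences.

```python
import math

def qmul(p, q):
    """Hamilton product of p = (a, b, c, d) and q = (e, f, g, h)."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

def cross(u, v):
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

AXIS = (1/3, 2/3, 2/3)          # unit rotation axis (illustrative)
RATE = 0.7                      # constant angular speed (illustrative)
P_BODY = (0.0, 1.0, -2.0, 0.5)  # pure quaternion p', fixed in the body

def q_of_t(t):
    """Unit quaternion for a rotation through RATE * t about AXIS."""
    half = RATE * t / 2
    s = math.sin(half)
    return (math.cos(half), AXIS[0] * s, AXIS[1] * s, AXIS[2] * s)

def p_space(t):
    """Space coordinates p = q p' q^c, the inverse of p' = q^c p q."""
    q = q_of_t(t)
    return qmul(qmul(q, P_BODY), conj(q))

def derivative(f, t, h=1e-5):
    """Central-difference derivative of a quaternion-valued function."""
    return tuple((x1 - x0) / (2 * h) for x0, x1 in zip(f(t - h), f(t + h)))

t0 = 1.3
omega = tuple(2 * x for x in qmul(derivative(q_of_t, t0), conj(q_of_t(t0))))
dp_dt = derivative(p_space, t0)
omega_cross_p = cross(omega[1:], p_space(t0)[1:])
```

Up to discretisation error, `omega` should be a pure quaternion equal to `RATE` times `AXIS`, and the vector part of `dp_dt` should match `omega_cross_p`.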
Chapter 3
Complexified Quaternions

3.1 Scalars, Pseudoscalars, Vectors and Pseudovectors

In the ordinary quaternion theory we have constructed an object $q = \alpha + \boldsymbol{\beta}$ which essentially only distinguishes two types of objects, the scalars and the vectors. However, we know that the vectors split into two disjoint sets: the polar vectors and the axial vectors (arising from vector products of polar vectors; often called pseudovectors). In three dimensions we also have scalars and pseudoscalars. These distinctions between vectors are well illustrated if, for example, we consider three vectors $a, b, c$ and consider their reflection in a mirror. If the world outside the mirror has a right-handed system of axes then the mirror-world has a left-handed system. See Figure 3.1.

[Figure 3.1: the world and the mirror world]

We see that the vector $c$ is reversed in direction but the vector $a\wedge b$ is unchanged in direction. This is seen algebraically by considering the coordinate transformation

$$x'_i = a_{ij}x_j \quad\text{using the summation convention}$$

Now if

$$a_{ij} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$$
then $\det(a_{ij}) = -1$, and if

$$a = (0, 0, 1), \quad b = (1, 0, 0), \quad c = (0, 1, 0)$$

then

$$a' = (0, 0, 1), \quad b' = (1, 0, 0), \quad c' = (0, -1, 0)$$

and so, using

$$(a\wedge b)'_i = (a'\wedge b')_i = \epsilon_{ijk}a'_jb'_k$$

we easily see that $(a\wedge b)_i = (0, 1, 0)$ whilst $(a\wedge b)'_i = (0, 1, 0)$. That is, $(a\wedge b)_i$, which is in the same direction as $c$, is unchanged under the transformation whereas $c$ is reversed in direction. Because of this a vector equation should never contain both a polar vector and an axial vector: they are essentially independent entities. The same is true of scalars and pseudoscalars. The present formulation of quaternions does not encompass these important distinctions. We shall see in the study of complexified quaternions, below, applied to three-dimensional space, how this defect may be remedied.

The distinctions examined here between vector types apply more generally. In three dimensions the highest order completely skew-symmetric tensor is proportional to $\epsilon_{ijk}$. Let us call it $T_{ijk}$:

$$T_{ijk} = T_{123}\,\epsilon_{ijk}, \qquad T_{ijk} = \pm T_{123}$$

with the positive sign taken if $ijk$ is an even permutation of 1, 2, 3 and the negative sign taken if $ijk$ is an odd permutation of 1, 2, 3. Such a tensor is similar to a scalar as it has only one distinct component value. However, whereas a scalar is unchanged on transforming coordinate systems, this is not so for $T_{ijk}$. For consider the component $T'_{123}$:

$$T'_{123} = a_{1\alpha}a_{2\beta}a_{3\gamma}T_{\alpha\beta\gamma} = a_{1\alpha}a_{2\beta}a_{3\gamma}T_{123}\,\epsilon_{\alpha\beta\gamma} = (\det a_{ij})\,\epsilon_{123}T_{123} = (\det a_{ij})\,T_{123}$$

More generally, a single component $T$ transforming like

$$T' = (\det a_{ij})^n\,T$$

is called a scalar density of weight $n$. Scalar densities of weight 1 are called pseudoscalars. A tensor multiplied by a scalar density of weight $n$ becomes a tensor density of weight $n$. For example, the 1st order tensor $w_i$ (vector) transforming as $w'_i = a_{ij}w_j$ is a tensor density of weight zero, whereas the vector product $\epsilon_{ijk}w_jv_k = p_i$ is a tensor density of weight 1. This is called a pseudovector or a bivector. Not surprisingly there is a direct relation between bivectors and skew-symmetric tensors of 2nd order

$$T_{ij} = -T_{ji}$$
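which, if $p_i$ is a bivector, is

$$T_{ij} = \epsilon_{ijk}p_k$$

3.2 Complexified Quaternions: Euclidean Metric

A complexified quaternion has the form:

$$q = \alpha 1 + i\boldsymbol{\beta} \qquad \alpha \in \mathbb{C},\ \boldsymbol{\beta} \in \mathbb{C}^3$$

wherein $\boldsymbol{\beta} = \beta\mathbf{i} + \gamma\mathbf{j} + \delta\mathbf{k}$. Here $(1, \mathbf{i}, \mathbf{j}, \mathbf{k})$ is the usual quaternion basis and $i$ (satisfying $i^2 = -1$) is the usual complex unit. Clearly the factor $i$ is superfluous as it could be incorporated into $\boldsymbol{\beta}$, but this formulation of a complexified quaternion can lead to some simplification. We note that the algebra generated by $1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k}$ is identical to that generated by the Pauli Matrices. In fact, we have the correspondence:

$$1 \leftrightarrow \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad i\mathbf{i} \leftrightarrow \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}, \quad i\mathbf{j} \leftrightarrow \begin{bmatrix} 0 & i \\ -i & 0 \end{bmatrix}, \quad i\mathbf{k} \leftrightarrow \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$

The algebra of complexified quaternions is easily developed. The set of complexified quaternions is denoted by $\mathbb{H}_{\mathbb{C}}$. The conjugate operation in $\mathbb{H}_{\mathbb{C}}$ is

$$q^c = \alpha - i\boldsymbol{\beta}$$

whereas the complex conjugate, denoted by $q^*$, is

$$q^* = \alpha^* - i\boldsymbol{\beta}^*$$

An easy calculation shows that, for all $p, q \in \mathbb{H}_{\mathbb{C}}$,

$$(q^*)^c = (q^c)^*, \qquad (qp)^* = q^*p^*, \qquad (qp)^c = p^cq^c$$

We also note that if $q^* = q$ then

$$\alpha^* - i\boldsymbol{\beta}^* = \alpha + i\boldsymbol{\beta}$$

implying $\alpha = \alpha^*$, $\boldsymbol{\beta} = -\boldsymbol{\beta}^*$. Thus $\alpha$ is real and $\boldsymbol{\beta}$ is pure imaginary. In this case we can write $\boldsymbol{\beta} = i\hat{\boldsymbol{\beta}}$ in which $\hat{\boldsymbol{\beta}} \in \mathbb{R}^3$. That is, a complexified quaternion which equals its complex conjugate reduces to a 'real' quaternion:

$$q = \alpha - \hat{\boldsymbol{\beta}} \in \mathbb{H}$$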
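On the other hand, if $q^c = q$ then

$$\alpha - i\boldsymbol{\beta} = \alpha + i\boldsymbol{\beta} \;\Rightarrow\; \boldsymbol{\beta} = 0$$

and so a complexified quaternion equals its quaternion conjugate only if it is a complex number: $q \in \mathbb{C}$. Finally, $q^c = q^*$ only if

$$q = \alpha + i\boldsymbol{\beta} \qquad \alpha \in \mathbb{R},\ \boldsymbol{\beta} \in \mathbb{R}^3.$$

If $q \in \mathbb{H}_{\mathbb{C}}$ then the scalar and vector parts of $q$ are defined as for ordinary (real) quaternions:

$$Sq = \frac{1}{2}(q + q^c) = \alpha, \qquad Vq = \frac{1}{2}(q - q^c) = i\boldsymbol{\beta}$$

We note that

$$Sq^c = \frac{1}{2}(q^c + q) = Sq \quad\text{and}\quad S(p + q) = Sp + Sq$$

and since, for $p = \alpha + i\boldsymbol{\beta}$, $q = \gamma + i\boldsymbol{\delta}$,

$$pq = (\alpha + i\boldsymbol{\beta})(\gamma + i\boldsymbol{\delta}) = \alpha\gamma + \boldsymbol{\beta}\cdot\boldsymbol{\delta} - \boldsymbol{\beta}\wedge\boldsymbol{\delta} + i(\alpha\boldsymbol{\delta} + \gamma\boldsymbol{\beta})$$

then

$$S(pq) = \alpha\gamma + \boldsymbol{\beta}\cdot\boldsymbol{\delta} = S(qp)$$

We now find that complexified quaternion which commutes with all others. That is, let $z = \alpha_z + i\boldsymbol{\beta}_z \in \mathbb{H}_{\mathbb{C}}$ be such that

$$pz = zp \qquad \forall\, p \in \mathbb{H}_{\mathbb{C}}$$

Now

$$pz = (\alpha + i\boldsymbol{\beta})(\alpha_z + i\boldsymbol{\beta}_z) = \alpha\alpha_z + i\alpha\boldsymbol{\beta}_z + i\alpha_z\boldsymbol{\beta} + \boldsymbol{\beta}\cdot\boldsymbol{\beta}_z - \boldsymbol{\beta}\wedge\boldsymbol{\beta}_z$$

then

$$zp = \alpha\alpha_z + i\alpha\boldsymbol{\beta}_z + i\alpha_z\boldsymbol{\beta} + \boldsymbol{\beta}\cdot\boldsymbol{\beta}_z - \boldsymbol{\beta}_z\wedge\boldsymbol{\beta}$$

and so:

$$pz - zp = -2\boldsymbol{\beta}\wedge\boldsymbol{\beta}_z$$

thus $\boldsymbol{\beta}_z = 0$ is required for $z$ to commute with all $p \in \mathbb{H}_{\mathbb{C}}$. We conclude that $z \in \mathbb{C}$.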
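The algebraic rules collected in this section can be exercised with a small sketch. It assumes a complexified quaternion is stored as a tuple of four complex coefficients over the basis $(1, \mathbf{i}, \mathbf{j}, \mathbf{k})$, so that $q = \alpha + i\boldsymbol{\beta}$ has coefficients $(\alpha, i\beta_1, i\beta_2, i\beta_3)$. The names `make`, `qconj` and `cconj` are illustrative. Integer-valued components keep the comparisons exact.

```python
def qmul(p, q):
    """Hamilton product; the coefficients may be complex numbers."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def make(alpha, beta):
    """Build alpha + i*beta with alpha complex and beta a complex 3-vector."""
    return (complex(alpha), 1j * beta[0], 1j * beta[1], 1j * beta[2])

def qconj(q):
    """Quaternion conjugate q^c: reverse the sign of the vector part."""
    return (q[0], -q[1], -q[2], -q[3])

def cconj(q):
    """Complex conjugate q^*: conjugate every complex coefficient."""
    return tuple(complex(x).conjugate() for x in q)

alpha, beta = 1 + 2j, (3, -1j, 2 + 1j)
gamma, delta = -2 + 1j, (1j, 4, 1 - 3j)
p = make(alpha, beta)
q = make(gamma, delta)
z = make(2 - 5j, (0, 0, 0))      # a complex number viewed as a quaternion

rule_c = qconj(qmul(q, p)) == qmul(qconj(p), qconj(q))     # (qp)^c = p^c q^c
rule_star = cconj(qmul(q, p)) == qmul(cconj(q), cconj(p))  # (qp)^* = q^* p^*
rule_mixed = qconj(cconj(q)) == cconj(qconj(q))            # (q^*)^c = (q^c)^*
central = qmul(p, z) == qmul(z, p)                         # z in C commutes
scalar_part = qmul(p, q)[0]
expected_scalar = alpha * gamma + sum(b * d for b, d in zip(beta, delta))
```

All four flags should be `True`, and `scalar_part` should equal $\alpha\gamma + \boldsymbol{\beta}\cdot\boldsymbol{\delta}$ as computed in `expected_scalar`.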
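Inner Product for Complexified Quaternions

Here we choose to define a different inner product from that used for real quaternions. Specifically:

$$\langle p, q\rangle = S(pq^{*c})$$

We should note that alternative definitions for the inner product can be given. In a later section we use the definition $\langle p, q\rangle = S(pq^c)$, which is applicable in Relativity Theory. The present definition satisfies all the usual requirements for an Hermitian inner product:

(i) $\langle p, q\rangle = S(pq^{*c}) = S(q^*p^c) = \langle q^*, p^*\rangle = \langle q, p\rangle^*$

(ii) $\langle p, q + r\rangle = S(p(q + r)^{*c}) = S(pq^{*c} + pr^{*c}) = \langle p, q\rangle + \langle p, r\rangle$

(iii) $\alpha\langle p, q\rangle = \alpha S(pq^{*c}) = S((\alpha p)q^{*c}) = \langle\alpha p, q\rangle = S(p(\alpha^*q)^{*c}) = \langle p, \alpha^*q\rangle, \quad \alpha \in \mathbb{C}$

(iv) With $p = \alpha + i\boldsymbol{\beta}$,

$$\langle p, p\rangle = S(pp^{*c}) = \frac{1}{2}\left[(\alpha + i\boldsymbol{\beta})(\alpha^* + i\boldsymbol{\beta}^*) + (\alpha^* - i\boldsymbol{\beta}^*)(\alpha - i\boldsymbol{\beta})\right]$$

$$= \frac{1}{2}\left[\alpha\alpha^* + i(\alpha\boldsymbol{\beta}^* + \alpha^*\boldsymbol{\beta}) + \boldsymbol{\beta}\cdot\boldsymbol{\beta}^* - \boldsymbol{\beta}\wedge\boldsymbol{\beta}^* + \alpha^*\alpha - i(\alpha\boldsymbol{\beta}^* + \alpha^*\boldsymbol{\beta}) + \boldsymbol{\beta}^*\cdot\boldsymbol{\beta} - \boldsymbol{\beta}^*\wedge\boldsymbol{\beta}\right]$$

$$= \alpha\alpha^* + \boldsymbol{\beta}\cdot\boldsymbol{\beta}^* \ge 0$$

and $\langle p, p\rangle = 0$ only if $p = 0$.

We now define the norm: for any $p \in \mathbb{H}_{\mathbb{C}}$

$$N_p = \langle p, p\rangle = \frac{1}{2}\left[pp^{*c} + p^*p^c\right] \ge 0$$

We note that

$$N_{p^c} = \langle p^c, p^c\rangle = \frac{1}{2}\left[p^cp^* + p^{*c}p\right] = \frac{1}{2}\left[(\alpha - i\boldsymbol{\beta})(\alpha^* - i\boldsymbol{\beta}^*) + (\alpha^* + i\boldsymbol{\beta}^*)(\alpha + i\boldsymbol{\beta})\right]$$

$$= \frac{1}{2}\left[\alpha\alpha^* - i(\alpha\boldsymbol{\beta}^* + \alpha^*\boldsymbol{\beta}) + \boldsymbol{\beta}\cdot\boldsymbol{\beta}^* - \boldsymbol{\beta}\wedge\boldsymbol{\beta}^* + \alpha^*\alpha + i(\alpha^*\boldsymbol{\beta} + \alpha\boldsymbol{\beta}^*) + \boldsymbol{\beta}^*\cdot\boldsymbol{\beta} - \boldsymbol{\beta}^*\wedge\boldsymbol{\beta}\right]$$

$$= \alpha\alpha^* + \boldsymbol{\beta}\cdot\boldsymbol{\beta}^* = N_p$$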
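Also,

$$N_{p^*} = \langle p^*, p^*\rangle = \frac{1}{2}\left[p^*p^c + pp^{*c}\right] = N_p$$

and

$$N_{p^{*c}} = \langle p^{*c}, p^{*c}\rangle = \frac{1}{2}\left[p^{*c}p + p^cp^*\right] = N_{p^c} = N_p$$

For a product,

$$N_{pq} = \langle pq, pq\rangle = \frac{1}{2}\left[pq(pq)^{*c} + (pq)^*(pq)^c\right] = \frac{1}{2}\left[pqq^{*c}p^{*c} + p^*q^*q^cp^c\right]$$

$$= \frac{1}{2}\left[p(2N_q - q^*q^c)p^{*c} + p^*(2N_q - qq^{*c})p^c\right]$$

$$= N_q(pp^{*c} + p^*p^c) - \frac{1}{2}p(q^*q^c)p^{*c} - \frac{1}{2}p^*qq^{*c}p^c$$

$$= 2N_pN_q - \frac{1}{2}p(q^*q^c)p^{*c} - \frac{1}{2}p^*qq^{*c}p^c$$

Hence

$$N_{pq^*} = 2N_qN_p - \frac{1}{2}p(qq^{*c})p^{*c} - \frac{1}{2}p^*q^*q^cp^c$$

$$N_{pq} + N_{pq^*} = 4N_pN_q - \frac{1}{2}p(qq^{*c} + q^*q^c)p^{*c} - \frac{1}{2}p^*(q^*q^c + qq^{*c})p^c$$

We deduce that

$$N_{pq} + N_{pq^*} = 2N_pN_q, \qquad N_{pq} = 2N_qN_p - N_{pq^*}$$

(Although complexified quaternions are 8-dimensional, they are not Cayley numbers as, in general, $N_{pq} \ne N_pN_q$.) It is interesting to note that if $q = \pm q^*$ then $N_{pq^*} = N_{\pm pq} = N_{pq}$ and so, in this case, $N_{pq} = N_pN_q$. We conclude that if one of the factors is a vector or a pseudoscalar (see later) then the norm of a product is the product of norms.

For those particular complexified quaternions $p$ for which $N_p = pp^{*c}$ (i.e. $p^* = \pm p$, for example) we define an inverse:

$$p^{-1} = \frac{p^{*c}}{N_p}$$

The Metric

An inner product defines a metric for the space to which it applies. This can be obtained as follows. Let $e_\alpha = (1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ denote the basis for complexified quaternions. General complexified quaternions $p$, $q$ could then be represented as

$$p = p^\mu e_\mu, \qquad q = q^\mu e_\mu$$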
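in which the summation convention is used (Greek indices will generally range from 0 to 3). Now we choose $p$, $q$ as being basis elements, $p = e_\alpha$, $q = e_\beta$ say, for particular values of $\alpha$, $\beta$, and calculate the inner product. The metric $g_{\mu\nu}$ is defined through the relation

$$\langle p, q\rangle = S(pq^{*c}) = g_{\mu\nu}p^\mu q^\nu$$

However, since $p = e_\alpha$, $q = e_\beta$ then

$$p = \delta^\mu_\alpha e_\mu, \qquad q = \delta^\nu_\beta e_\nu$$

Thus the components of $p$, $q$ are:

$$p^\mu = \delta^\mu_\alpha \quad\text{and}\quad q^\nu = \delta^\nu_\beta \qquad (\delta: \text{Kronecker delta})$$

This implies

$$g_{\alpha\beta} = \frac{1}{2}\left[e_\alpha e^{*c}_\beta + e^*_\beta e^c_\alpha\right]$$

Specifically:

$$g_{00} = \frac{1}{2}[2] = 1$$
$$g_{11} = \frac{1}{2}\left[(i\mathbf{i})(i\mathbf{i}) + (-i\mathbf{i})(-i\mathbf{i})\right] = 1$$
$$\vdots$$
$$g_{33} = \frac{1}{2}\left[(i\mathbf{k})(i\mathbf{k}) + (-i\mathbf{k})(-i\mathbf{k})\right] = 1$$
$$g_{\alpha\beta} = 0 \qquad \alpha \ne \beta$$

Thus

$$g_{\alpha\beta} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

which is the usual flat-space Euclidean metric. We conclude that with the form of inner product chosen here the complexified quaternion formalism can be applied to the description of classical mechanics. This is the point of view taken in Hestenes [5,11] who has pioneered this approach using (the closely related) multivector calculus.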
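The metric computation, and the norm relation $N_{pq} + N_{pq^*} = 2N_pN_q$ obtained earlier, can both be checked with a short sketch. It assumes the same storage convention as before, four complex coefficients over $(1, \mathbf{i}, \mathbf{j}, \mathbf{k})$, so that the basis $e_\alpha = (1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ has a single entry `1` or `1j`. The sample values of `p` and `q` are arbitrary, and the helper names are illustrative.

```python
def qmul(p, q):
    """Hamilton product; the coefficients may be complex numbers."""
    a, b, c, d = p
    e, f, g, h = q
    return (a*e - b*f - c*g - d*h,
            a*f + b*e + c*h - d*g,
            a*g - b*h + c*e + d*f,
            a*h + b*g - c*f + d*e)

def qconj(q):
    """Quaternion conjugate q^c."""
    return (q[0], -q[1], -q[2], -q[3])

def cconj(q):
    """Complex conjugate q^*."""
    return tuple(complex(x).conjugate() for x in q)

def inner(p, q):
    """Hermitian inner product S(p q^{*c}) of this section."""
    return qmul(p, qconj(cconj(q)))[0]

def norm(p):
    return inner(p, p)

# Basis e_alpha = (1, i i, i j, i k); the complex unit is written 1j.
E = [(1, 0, 0, 0), (0, 1j, 0, 0), (0, 0, 1j, 0), (0, 0, 0, 1j)]
g = [[inner(ea, eb) for eb in E] for ea in E]

# Arbitrary integer-valued sample quaternions for the norm relation.
p = (1 + 2j, 3 - 1j, 2j, -1 + 1j)
q = (-2 + 1j, 1j, 4, 1 - 3j)
lhs = norm(qmul(p, q)) + norm(qmul(p, cconj(q)))
rhs = 2 * norm(p) * norm(q)
```

The matrix `g` should come out as the 4 × 4 identity, and `lhs` should equal `rhs` even though `norm(qmul(p, q))` need not equal `norm(p) * norm(q)` for general `p` and `q`.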
The Dual Operation

In this section we shall find it useful to write the scalar and vector parts of a complexified quaternion in real and imaginary parts: $\alpha = \overset{R}{\alpha} + i\overset{I}{\alpha}$ and $\boldsymbol{\beta} = \overset{R}{\boldsymbol{\beta}} + i\overset{I}{\boldsymbol{\beta}}$. Thus a complexified quaternion has expression

$$q = \overset{R}{\alpha} - \overset{I}{\boldsymbol{\beta}} + i\overset{I}{\alpha} + i\overset{R}{\boldsymbol{\beta}}$$

We can choose, as basis elements, either $(1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ over the field of complex numbers or $(1, i, e_k, ie_k)$, $k = 1, 2, 3$, over the field of real numbers (here $e_1 = \mathbf{i}$, $e_2 = \mathbf{j}$, $e_3 = \mathbf{k}$). Following the conventions used by Hestenes [ ] the terms in $q$ separate naturally into four groups:

$a$, $a \in \mathbb{R}$ are called scalars
$ia$, $a \in \mathbb{R}$ are called pseudoscalars
$a_ke_k$, $a_k \in \mathbb{R}$ are called bivectors
$ia_ke_k$, $a_k \in \mathbb{R}$ are called vectors

A bivector is obtained when two orthogonal vectors are multiplied together. That is,

$$(ie_k)(ie_m) = -e_k\wedge e_m \qquad k \ne m$$

and clearly the elements $e_1\wedge e_2$, $e_1\wedge e_3$, $e_2\wedge e_3$ form a basis for the bivectors. Any bivector $B$ can be written in terms of these three:

$$B = \frac{1}{2}B_{ij}\,e_i\wedge e_j, \qquad B_{ij} = -B_{ji}$$

Also, given any bivector $B$ we can always find a vector $b$ such that $B = -ib$. To see this let $(B)_m = \frac{1}{2}(B_{ij}\,e_i\wedge e_j)_m$. That is:

$$(B)_m = (B_{12}\mathbf{k} + B_{23}\mathbf{i} - B_{13}\mathbf{j})_m = (B_{23}, -B_{13}, B_{12})$$

If $b = i(b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$ then $-ib = (b_1\mathbf{i} + b_2\mathbf{j} + b_3\mathbf{k})$, implying the identification if we choose $b_1 = B_{23}$, $b_2 = -B_{13}$, $b_3 = B_{12}$. We say $B$ and $b$ are dual objects. We note that this identification, $B = -ib$, exactly coincides with the usual definition of dual tensors:

$$B_{ij} = \epsilon_{ijk}b_k$$

where $\epsilon_{ijk}$ is the completely anti-symmetric object.
More generally we define the dual of a complexified quaternion $q$ to be $-iq$. The dual operation turns a scalar into a pseudoscalar (and vice versa) and a vector into a bivector (and vice versa). Note that to invert the relation $B_{ij} = \epsilon_{ijk}b_k$ requires:

$$b_p = \frac{1}{2}\epsilon_{ijp}B_{ij}, \quad\text{since}\quad \epsilon_{ijp}B_{ij} = \epsilon_{ijk}\epsilon_{ijp}b_k = (\delta_{jj}\delta_{kp} - \delta_{jp}\delta_{kj})b_k = 3b_p - b_p = 2b_p$$

in normal tensor notation. However, in the present formalism obtaining the dual is achieved by multiplying by $-i$.

The use of complexified quaternions highlights the four basic quantities: points (represented by scalars), lines (represented by vectors), area elements (represented by bivectors) and volume elements (represented by pseudoscalars). The pseudoscalar is the result of the quaternion product of three orthogonal vectors $i\mathbf{i}$, $i\mathbf{j}$, $i\mathbf{k}$ since

$$(i\mathbf{i})(i\mathbf{j})(i\mathbf{k}) = (i\mathbf{i})(-\mathbf{j}\wedge\mathbf{k}) = i(\mathbf{i}\cdot(\mathbf{j}\wedge\mathbf{k})) = i$$

which is the volume element of a unit cube with sides represented by $\mathbf{i}$, $\mathbf{j}$, $\mathbf{k}$. A volume is not a scalar: it is a pseudoscalar.

Rotations

For 'real' quaternions we have already seen that the operation

$$w' = qwq^{-1} \qquad N_q = 1$$

is such that the norm and scalar parts of $w$ are preserved. Also the vector parts of $w$ and $w'$ are related via a conical rotation. Specifically, $V(w')$ is obtained from $V(w)$ by rotating it about the axis $Vq$ through twice the angle of $q$. If we consider complexified quaternions, with the present choice of inner product, then we amend this transformation to

$$w' = qwq^c \qquad N_q = 1$$

in which $q^* = q$. That is, $q = \alpha + \boldsymbol{\beta}$, $\alpha \in \mathbb{R}$, $\boldsymbol{\beta} \in \mathbb{R}^3$. But with the present interpretation $\alpha$ is a scalar and $\boldsymbol{\beta}$ is a bivector. We should note that any quaternion $q$ of this form can be written as a product of two vectors $a$, $b$:

$$q = ab, \qquad a = i\mathbf{a}, \quad b = i\mathbf{b}$$
then

$$ab = \mathbf{a}\cdot\mathbf{b} - \mathbf{a}\wedge\mathbf{b}$$

in which $\mathbf{a}\cdot\mathbf{b}$ is the scalar $\alpha$ and $-\mathbf{a}\wedge\mathbf{b}$ is the bivector $\boldsymbol{\beta}$. (There are an infinite number of possible choices for $\mathbf{a}$, $\mathbf{b}$ satisfying $\mathbf{a}\cdot\mathbf{b} = \alpha$ and $-\mathbf{a}\wedge\mathbf{b} = \boldsymbol{\beta}$: choose $\mathbf{b}$ such that $\mathbf{b}\cdot\boldsymbol{\beta} = 0$; then $\mathbf{a} = [\boldsymbol{\beta}\wedge\mathbf{b} + \alpha\mathbf{b}]/|\mathbf{b}|^2$ gives $\mathbf{a}\cdot\mathbf{b} = \alpha$ and $-\mathbf{a}\wedge\mathbf{b} = \boldsymbol{\beta}$.)

It is also easily checked that this transformation preserves the inner product:

$$\langle w', r'\rangle = S(w'r'^{*c}) = \frac{1}{2}\left(w'r'^{*c} + r'^*w'^c\right)$$

$$= \frac{1}{2}\left[qwq^c(qrq^c)^{*c} + (qrq^c)^*(qwq^c)^c\right]$$

$$= \frac{1}{2}\left[qwq^c(qr^{*c}q^c) + qr^*q^cqw^cq^c\right]$$

$$= \frac{1}{2}\left[qwr^{*c}q^c + qr^*w^cq^c\right]$$

$$= q\left[\frac{1}{2}(wr^{*c} + r^*w^c)\right]q^c = q\langle w, r\rangle q^c = \langle w, r\rangle qq^c = \langle w, r\rangle$$

Those readers wishing to pursue this approach to mechanics and to geometry should consult the various texts and research articles of Hestenes listed in the bibliography.

3.3 Complexified Quaternions: Minkowski Metric

We shall discover in this section that, by introducing an alternative prescription for the inner product (and hence the metric), the formalism of complexified quaternions can be used to elegantly describe the fundamental relations in Special Relativity and in Electromagnetism. We shall see that a space-time event is represented by a single complexified quaternion $x$ and the Lorentz Transformation characterised by the quaternion $q$ through the relation:

$$x' = q^{*c}xq \qquad N_q = 1$$

The material of this Chapter is heavily dependent upon the work of many other authors; in particular, J D Edmonds [3], W Israel [12], A J Macfarlane [13], M Cahen, R Debever, L Defrise [14]. Of course other formulations, in terms of differential forms, or in terms of spinors, or the classical approach using tensors, can be employed to describe the fundamental relations of Special Relativity and of Electromagnetism. What I have attempted is to enquire what difficulties underlie the consistent use of a quaternionic formulation in this area of physics.
I have not the wit (nor at my age the time) to carry through this analysis completely but I think I can demonstrate that a quaternionic formulation is at least feasible and has (certainly in terms of elegance) some advantages over other approaches.
We begin by considering an alternative inner product for two complexified quaternions $p$, $q$

$$\langle p, q\rangle = S(pq^{c})$$

which is formally the same as that used for real quaternions. As is easily verified, the following properties are satisfied:

(i) $\langle p, q\rangle = S(pq^{c}) = S((pq^{c})^{c}) = S(qp^{c}) = \langle q, p\rangle$

(ii) $\langle p, q + r\rangle = S(p(q + r)^{c}) = S(pq^{c} + pr^{c}) = \langle p, q\rangle + \langle p, r\rangle$

(iii) $a\langle p, q\rangle = aS(pq^{c}) = S((ap)q^{c}) = \langle ap, q\rangle = S(p(aq)^{c}) = \langle p, aq\rangle \qquad a \in \mathbb{C}$

(iv) $\langle p, p\rangle = S(pp^{c}) = \alpha^2 - \underline{\beta}\cdot\underline{\beta} \in \mathbb{C}$ (writing $p = \alpha + i\underline{\beta}$)

Thus the inner product defined here is a symmetric bilinear form but is not positive definite ($\langle p, p\rangle$ is not necessarily real, let alone positive). The inner product defines the norm of $q \in \mathbb{H}_{\mathbb{C}}$:

$$Nq = \langle q, q\rangle = qq^{c}$$

Clearly $N_{q^{c}} = N_q$ and, for the product:

$$N_{pq} = (pq)(pq)^{c} = p\,qq^{c}\,p^{c} = qq^{c}\,pp^{c} = N_p N_q$$

(Here we have used the result that $qq^{c} \in \mathbb{C}$ and thus commutes with all complexified quaternions.) We can also define an angle (which may be complex) between two $p, q \in \mathbb{H}_{\mathbb{C}}$ via

$$\cos z = \frac{\langle p, q\rangle}{\sqrt{N_p}\sqrt{N_q}} = \frac{S(pq^{c})}{\sqrt{N_p N_q}}$$

So $p$, $q$ will be said to be orthogonal if $\langle p, q\rangle = 0$ (i.e. if $S(pq^{c}) = 0$) and parallel if $V(pq^{c}) = 0$.

A major algebraic result, to be used extensively later in the discussion on the Lorentz transformation, concerns the decomposition of any unit-norm complexified quaternion into simpler quaternions. Explicitly, we show that

$$q = q_R\, q_B \qquad \forall\, q \in \mathbb{H}_{\mathbb{C}},\ Nq = 1$$
116 in which q*Bc = qB and q*R = qR. (The subscript B refers to a Lorentz boost and subscript R refers to a spatial rotation). To verify this result we use the constraints on qB, qR to write them in the following form: QR=aR-PR qB=aB+iPB_ ocR,aBeR and /^/^GM3 Now QrQb = K-^h)K+^) If we consider a general complexified quaternion: R I R I q = a + i§_=(a + ia) + i((3_ + i(3) R I I R = a-^ + z(a+ /3) R I R I where a, a, p,/? are real. The constraint Nq = \ implies fl / r r i i a2 -a2 - P- P+ §_-§_= I R I R I aa- P_-P = 0 then if q = qRqB we must have R I I R a+ p = aRpB+p~R.pB- PR ApB That is To begin with, assume aB a)- a = aRaB i= <*,A a=^R-^B (L = aR§B_ - -4a# / 0 and a / 0 then R qh = — (1) (2) (3) (4) (1)'
117 (2) (3) (4) 1 1 ' a = —0-/3B a — R §_ = aR(3B §_M3B (2)' (3)' (4)' Also R I I R I (3)', (4)'-* §_-§_ = aR(aBa) = a a but this relation is always satisfied as a consequence of imposing the condition Nq = 1. Now (4)'- / R I I (3A (3_ = aR0A(3B §_A(§_A(3B) (We note that if /? = 0 then, from the condition Nq = 1 we have a a = 0 implying a = 0 which is a pure boost in q). We find i r i ! §_A§_ = aR§_A^- — = aR(3A(3B - (f3-f3B)(3-(p-§)(3B ii ii aBap-(p.§)f3B = aRaB(aRPB - /?) ■ 1 / r x (3Af3=—f3B — — a0 — from which fiB can be found unless / / «+§_'§_ R II OLROLB§_-OL§_ I I aRaB+P-£=0 But this is never zero unless aR = 0, /? = 0 which (unless q = 0) is never true from the constraint that Nq = 1. Thus, in the case aB /Owe can solve for aR,(3R1a and /?B. If we now consider the case aB — 0 then a = 0 and (3 = 0 from (2). But this is impossible from the constraint Nq = 1. In the case a = 0 and aR = 0, aB ^ 0 then fj-/3 = 0 from the constraint Nq = 1. Then
118 (1)- (2)- (3)- P = I a = I- = Q;BPr --£r-!L = -in^iB. ^-±J 1 1 7 a = — P-0B 0LB~ — I R ill 0A/? = PA{PAI3B) - - aB- - — I aB ■ / / II- {P-p^p-iP-PjP^ ii i / / - aB — afl[//A/f+a#l fi_ = L~ ~—=J- (/? ^ 0 from the constraint iV9 = 1). We conclude that any q e M.£ Nq = 1 be written in the form q = qRqB in which q*R = qR and q*B = qB. Any complexified quaternion q G M.£ can be written in the form q = y/Nq (cos 2 + q sin 2) Nq = I q2 = — 1 where, if g = a + i/3 then cos 2 : !Na sin2= V g-~ $ = /? 2GC vg VJV9 If gc = g* then q = a + i(3 a, /3 E M then 2 is wholly imaginary 2 = i6/2 cos 2 = cosh - = 2 On the other hand if q = q* then 2 is real. Now consider a unit norm q € E.£ Nq = 1 <j = cos 2 + <? sin 2 .. 0 ^p-p sin 2 = 1 sinh - = ,—
If $x$ is any other complexified quaternion then, multiplying by $q$ on the left:

$$qx = x\cos z + \hat{q}x\sin z \qquad (1)$$

$$\hat{q}(qx) = \hat{q}x\cos z + \hat{q}(\hat{q}x)\sin z = -x\sin z + \hat{q}x\cos z \qquad (2)$$

Now if $x' = \hat{q}x$ then $x'$ and $x$ are orthogonal since

$$S(xx'^{c}) = \tfrac{1}{2}\left(xx'^{c} + x'x^{c}\right) = \tfrac{1}{2}\left(-xx^{c}\hat{q} + \hat{q}xx^{c}\right) = 0$$

The relations in (1), (2) indicate that elements in the plane $(x, \hat{q}x)$ are rotated through a complex angle $z$. In particular if $x = 1$ then $x' = \hat{q}$ and so elements in the plane $(1, \hat{q})$ are rotated through a complex angle $z$. Now if $v$, $w$, $\hat{q}$ form a right-handed system in $\mathbb{R}^3$ then if we choose

$$x = v \qquad x' = \hat{q}v = \hat{q}\wedge v = w$$

Therefore

$$qv = v\cos z + (\hat{q}v)\sin z = v\cos z + w\sin z$$

$$\hat{q}(qv) = qw = w\cos z + \hat{q}w\sin z = -v\sin z + w\cos z$$

Here, multiplication on the left by a complexified quaternion $q$ rotates elements in the plane containing $(v, w)$ and in the plane containing $(1, \hat{q})$ through complex angle $z$. However, multiplication on the right by $q$ rotates elements in the $(v, w)$ plane through complex angle $-z$ whilst those elements in the plane $(1, \hat{q})$ are rotated through angle $z$. Therefore, combining these results, we see that the transformation

$$x' = q^{c} x q \qquad Nq = 1$$

can be interpreted as a rotation of $V(x)$ through complex angle $2z$ about $V(q)$.
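The orthogonality and invariance statements above can be checked numerically. The following is only a minimal sketch: the tuple representation of a complexified quaternion, the helper names `qmul`, `qconj`, `norm`, and the sample values of the angle and of $x$ are all assumptions made for illustration, not part of the text.

```python
import cmath

def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c: the vector part changes sign."""
    return (p[0], -p[1], -p[2], -p[3])

def norm(p):
    """Np = p p^c, in general a complex number."""
    return qmul(p, qconj(p))[0]

z = 0.3 + 0.5j                       # hypothetical complex angle
q_hat = (0, 2/3, -1/3, 2/3)          # unit pure quaternion, so q_hat^2 = -1
q = tuple(cmath.cos(z)*e + cmath.sin(z)*h
          for e, h in zip((1, 0, 0, 0), q_hat))   # q = cos z + q_hat sin z
x = (1.5 - 0.2j, 0.4j, -1.0, 2.0 + 1j)            # arbitrary complexified quaternion

qx = qmul(q_hat, x)                  # x' = q_hat x
s_orth = qmul(x, qconj(qx))[0]       # S(x x'^c), expected to vanish
x_rot = qmul(qmul(qconj(q), x), q)   # x' = q^c x q
```

With these values `norm(q)` is 1 to rounding error, `s_orth` vanishes, and `x_rot` has the same scalar part and norm as `x`, as expected of a (complex) rotation acting on the vector part only.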
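3.4 Application of Complexified Quaternions to Space-Time

In this section we shall be concerned with the application of the quaternion formalism to Special Relativity. We denote a space-time event by four coordinates $(x^0, x^1, x^2, x^3)$. This space-time event will be represented by a complexified quaternion $x$

$$x = x^{\alpha}e_{\alpha} = x^0 e_0 + x^1 e_1 + x^2 e_2 + x^3 e_3 \qquad x^{\alpha} \in \mathbb{R}$$

where $e_{\alpha} = (1, i\mathbf{i}, i\mathbf{j}, i\mathbf{k})$ will be used to denote the basis for complexified quaternions. The inner product introduced earlier can also be used to define a metric. Let $p$, $q$ be space-time events (i.e. $p^{*} = p^{c}$, $q^{*} = q^{c}$)

$$p = p^{\alpha}e_{\alpha} \qquad q = q^{\alpha}e_{\alpha}$$

then the relation

$$\langle p, q\rangle = S(pq^{c}) = \eta_{\alpha\beta}\, p^{\alpha} q^{\beta}$$

defines the metric $\eta_{\alpha\beta}$. To obtain its components explicitly choose $p = e_{\alpha}$, $q = e_{\beta}$ (for particular $\alpha$, $\beta$). Then

$$p = \delta_{\alpha}^{\mu} e_{\mu} \qquad q = \delta_{\beta}^{\mu} e_{\mu} \qquad (\delta_{\alpha}^{\mu}\ \text{is the Kronecker delta})$$

That is

$$\langle p, q\rangle = \eta_{\mu\nu}\,\delta_{\alpha}^{\mu}\delta_{\beta}^{\nu} = \eta_{\alpha\beta}$$

Also

$$\langle p, q\rangle = \tfrac{1}{2}\left\{e_{\alpha}e_{\beta}^{c} + e_{\beta}e_{\alpha}^{c}\right\}$$

and so

$$\eta_{\alpha\beta} = \tfrac{1}{2}\left\{e_{\alpha}e_{\beta}^{c} + e_{\beta}e_{\alpha}^{c}\right\}$$

giving, after a short calculation:

$$\eta_{\alpha\beta} = \begin{pmatrix} 1 & 0 & 0 & 0\\ 0 & -1 & 0 & 0\\ 0 & 0 & -1 & 0\\ 0 & 0 & 0 & -1 \end{pmatrix}$$

which is the usual flat-space metric of Minkowski space.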
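The "short calculation" of the metric components can be carried out mechanically. The sketch below assumes a complexified quaternion is stored as a 4-tuple of (possibly complex) coefficients of $1, \mathbf{i}, \mathbf{j}, \mathbf{k}$; the helper names are illustrative and do not come from the text.

```python
def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c."""
    return (p[0], -p[1], -p[2], -p[3])

I = 1j  # the complex unit, which commutes with the quaternion units
basis = [(1, 0, 0, 0), (0, I, 0, 0), (0, 0, I, 0), (0, 0, 0, I)]  # e_0 .. e_3

def eta_entry(ea, eb):
    """eta_ab = (e_a e_b^c + e_b e_a^c) / 2; the vector parts are checked to cancel."""
    s1 = qmul(ea, qconj(eb))
    s2 = qmul(eb, qconj(ea))
    total = tuple(0.5 * (u + v) for u, v in zip(s1, s2))
    assert all(abs(c) < 1e-12 for c in total[1:])
    return total[0].real

eta = [[eta_entry(ea, eb) for eb in basis] for ea in basis]
for row in eta:
    print(row)
```

The printed rows reproduce the diagonal form with entries $1, -1, -1, -1$; the internal assertion confirms that the symmetrised product has no vector part, so it really is a scalar.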
121 Aspects of Special Relativity Fundamental to special relativity is the Lorentz transformation. It will prove to be of value to spend some time on its derivation and on examining some of its properties. The Principle of Special Relativity By the 1880's the electromagnetic nature of light had been established by Maxwell. The speed of light c in vacuo, is predicted by the Maxwell Equations: c2 at ot From these equations we can deduce that both E_, H_ satisfy the wave equation: c2 dt2 9 1 d26 which admits plane wave solutions 0 ^ ei(k'L-c\k\t) which propagate with speed c. Here k denotes the direction of propagation. The experiments of Michelson-Morley in 1887 showed that this speed c is independent of the motion of the observer. If we only consider observers in uniform relative motion this result of Michelson-Morley essentially implies that Maxwell's equations (in particular the wave-equation above) should be invariant with respect to coordinate systems moving with constant relative velocity. The Galilean transformation r' = r — vt which assumes the existence of an absolute time t is not sufficient to guarantee the invariance of Maxwell's equations. For example, under this transformation the wave equation becomes: d2(f) d2<t> d2(f) Lamor [1] in 1900 and Lorentz [2] in 1903, described a coordinate transformation, now called the Lorentz transformation which kept invariant the form of Maxwell's equations and so accounted for the results of the Michelson-Morley experiment. However, it was not until
1905, when Einstein proposed two simple principles from which the Lorentz transformation could be derived directly, that the somewhat 'constructed' approach to explaining the (unexpected) negative results of Michelson-Morley was removed. Einstein proposed two 'relativity' principles:

(i) "The laws of nature are identical in form for any two observers $O$, $O'$ who are in uniform relative motion."

His second principle refers directly to the speed of light $c$:

(ii) "The velocity of light $c$, in vacuo, is a constant, the same for all inertial observers. That is, it is independent of the velocity of its source."

These two principles are usually supplemented by a third:

(iii) In all inertial systems, particles not acted on by forces (these are called 'free' particles) will move along a straight line with uniform velocity.

The Lorentz Transformation

Every inertial observer will set up a coordinate system such that every space-time event can be properly labelled. Observer $O$ will have a clock to measure time $t$ and a standard rule to measure 3 spatial coordinates. Each event can thus be labelled

$$(ct, x, y, z) = (x^0, x^1, x^2, x^3)$$

($ct$ converts $t$ into a distance, so that all four coordinates have the same units). The parameter $t$ is called the proper time for observer $O$. A similar system of space-time coordinates can be set up by observer $O'$:

$$(ct', x', y', z') = (x'^0, x'^1, x'^2, x'^3)$$

We are already assuming, in this notation, that the speed of light $c$ is the same for both observers. We shall assume that $O'$ has velocity $\underline{v}$ relative to $O$ and, for convenience, that there is an instant (measured by $t = 0$ and $t' = 0$) at which both spatial origins coincide. The path of a free particle can be parametrized by its proper time $s$ either from $O$'s or from $O'$'s perspective. That is

$$x^{\alpha} = g^{\alpha}(s) \qquad\text{or}\qquad x'^{\alpha} = h^{\alpha}(s)$$
123 (s is the time measured on a clock moving with the particle). Now, since the path of the particle is uniform: ^ = 0 ^ = 0 ds2 ds2 and assuming xa, x'a are related via a coordinate transformation dga _ dga dhP ds dx'P ds therefore d2ga _ d fdga dh^ ds2 " ds \dxfP ds d fdga dhP\ <W ~ dx'i \dxfP ds J ds d2ga dhP dh7 dga d2h? Thus we deduce: dx'Wx'P ds ds dx'V ds2 d2ga dhP dtC dx'idx'P ds ds which, since this must be true for all free particles * = 0 = 0 dx'idx'P Thus assuming the free particle is moving rigidly with observer O g°(s)=ct = x° g1(s)=x1 g2(s) = x2 g3(s)=x3 and so the coordinate transformation between xa and x,a is linear (The possible additive constants are zero as the space-time origins of 0,0' coincide). If the free particle is rigidly attached to O' then dhm „ nn , dh° ——=0 m = l,2,3 and ——= c as ds dx ~d< the transformation between the coordinate systems Also —r— = vm which is the velocity of the particle from O's perspective. However, from ds ~A f> ds ~A °C
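Therefore

$$v^{m}\frac{dt}{ds} = A^{m}{}_{0}\,c \qquad\text{and}\qquad \frac{dx^{0}}{ds} = A^{0}{}_{0}\,c$$

But $ds = dt'$ and $dh^{0} = c\,ds$ and $dx^{0} = c\,dt$

$$\therefore\quad A^{0}{}_{0} = \frac{dt}{ds} \qquad\text{and}\qquad A^{m}{}_{0} = \frac{v^{m}}{c}\left(\frac{dt}{ds}\right)$$

Now suppose that at this instant when the space-time origins coincide a flash of light is emitted. According to $O$, $O'$ the wave front equations (spheres with radii increasing at the speed of light) are

$$\eta_{\alpha\beta}\,x^{\alpha}x^{\beta} = 0 \qquad \eta_{\alpha\beta}\,x'^{\alpha}x'^{\beta} = 0$$

However, under our coordinate transformation the first equation has the form:

$$\eta_{\alpha\beta}\,A^{\alpha}{}_{\gamma}A^{\beta}{}_{\delta}\,x'^{\gamma}x'^{\delta} = 0$$

But the left-hand side of this equation can only be a multiple of $\eta_{\alpha\beta}\,x'^{\alpha}x'^{\beta}$ and so we deduce

$$\eta_{\alpha\beta}\,A^{\alpha}{}_{\gamma}A^{\beta}{}_{\delta} = k\,\eta_{\gamma\delta} \qquad k \in \mathbb{R}$$

If we take $\gamma = \delta = 0$ then

$$(A^{0}{}_{0})^2 - (A^{1}{}_{0})^2 - (A^{2}{}_{0})^2 - (A^{3}{}_{0})^2 = k$$

But $A^{m}{}_{0} = \dfrac{v^{m}}{c}A^{0}{}_{0}$ and so

$$(A^{0}{}_{0})^2\left(1 - \frac{v^2}{c^2}\right) = k$$

In order to obtain a value for $k$ we consider the inverse transformation. Let $M^{\alpha}{}_{\beta}$ be the inverse matrix to $A^{\alpha}{}_{\beta}$ ($M^{\alpha}{}_{\beta}A^{\beta}{}_{\gamma} = \delta^{\alpha}_{\gamma}$); then

$$x'^{\alpha} = M^{\alpha}{}_{\beta}\,x^{\beta}$$

In the special case when $\underline{v} = 0$ we have $A^{m}{}_{0} = 0$ and so $M^{0}{}_{0}A^{0}{}_{0} = 1$. Thus in this case

$$M^{0}{}_{0} = \frac{1}{A^{0}{}_{0}} = \pm\frac{1}{\sqrt{k}}$$

However, from $O$'s perspective when $\underline{v} = 0$ (and considering the particle at $O'$'s spatial origin) then

$$x^{0} = A^{0}{}_{0}\,x'^{0} = \pm\sqrt{k}\;x'^{0} \qquad (1)$$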
But, now reversing everything with the particle at $O$'s spatial origin then, using exactly the same construction as above, we must have

$$x'^{0} = M^{0}{}_{0}\,x^{0} = \pm\frac{1}{\sqrt{k}}\,x^{0} \qquad (2)$$

But for (1), (2) to take the same form we require $k = 1$.

To recap: A Lorentz transformation is a coordinate transformation

$$x^{\alpha} = A^{\alpha}{}_{\beta}\,x'^{\beta} \qquad (\text{or } X = AX')$$

which satisfies

$$\eta_{\alpha\beta}\,A^{\alpha}{}_{\gamma}A^{\beta}{}_{\delta} = \eta_{\gamma\delta} \qquad\text{or, in matrix form,}\qquad A^{T}\eta A = \eta$$

from which it immediately follows that $\det A = \pm 1$. A Lorentz transformation is one which preserves the value of the inner product $\langle x, y\rangle$.

Now since $(A^{0}{}_{0})^2\left(1 - \dfrac{v^2}{c^2}\right) = 1$ there are just two cases to distinguish: either $A^{0}{}_{0} \geq 1$ or $A^{0}{}_{0} \leq -1$. If $A^{0}{}_{0} \geq 1$ the matrix $A$ transforms future-pointing vectors $p^{\alpha}$ into future-pointing vectors, and those pointing into the past into past-pointing vectors. (N.B. A vector $p^{\alpha}$ is called time-like, null or space-like according as $\eta_{\alpha\beta}\,p^{\alpha}p^{\beta}$ is positive, zero or negative. Time-like or null vectors can be further characterised as future- or past-pointing. At the outset $u^{\alpha} = (1, 0, 0, 0)$ is designated future pointing. Any other vector $p^{\alpha}$ is future pointing if $\eta_{\alpha\beta}\,p^{\alpha}u^{\beta} > 0$, implying $p^{\alpha}$ is future pointing if $p^{0} > 0$.)

The Lorentz group can be considered to be the union of 4 disjoint components

$$L_{+}^{\uparrow} \cup L_{-}^{\uparrow} \cup L_{+}^{\downarrow} \cup L_{-}^{\downarrow}$$

in which the arrow refers to the value of $A^{0}{}_{0}$ ($\uparrow$ if $A^{0}{}_{0} \geq 1$ and $\downarrow$ if $A^{0}{}_{0} \leq -1$) and the sign refers to the value of $\det A$. In special relativity, and in this text, we only consider the component of the Lorentz group

$$L_{+}^{\uparrow} = L_{+} \cap L^{\uparrow}$$

This component of the Lorentz group is referred to as the proper orthochronous component. The component $L_{+}^{\uparrow}$ contains the identity transformation.

The dimensionality of the Lorentz group is most easily obtained by considering infinitesimal transformations which, in matrix terms, take the form
$$A = I + \epsilon K$$

Here $K$ is a general $4 \times 4$ matrix. The requirement that $A^{T}\eta A = \eta$ implies

$$(I + \epsilon K^{T})\,\eta\,(I + \epsilon K) = \eta$$

from which we deduce, to first order in $\epsilon$:

$$K^{T}\eta + \eta K = 0$$

The left-hand side is a symmetric matrix and so this equation provides 10 conditions on the sixteen components of $K$, and so the set of transformations $A$ is 6-dimensional.

The Quaternionic Form of a Lorentz Transformation

In this section we show how the Lorentz transformation can be conveniently expressed in terms of complexified quaternions. If $x \in \mathbb{H}_{\mathbb{C}}$ represents a space-time event then a proper Lorentz transformation can be written in terms of a complexified quaternion $q \in \mathbb{H}_{\mathbb{C}}$ as:

$$x \to x' = q^{*c} x q \qquad\text{with } Nq = 1$$

The constraint $Nq = 1$ (implying two real conditions) ensures that this transformation is 6-dimensional. It is easy to show that this transformation preserves the value of the inner product and so must be a Lorentz transformation:

$$\begin{aligned}
\langle x', y'\rangle &= S(x'y'^{c}) = \tfrac{1}{2}\left(x'y'^{c} + y'x'^{c}\right)\\
&= \tfrac{1}{2}\left[(q^{*c}xq)(q^{c}y^{c}q^{*}) + (q^{*c}yq)(q^{c}x^{c}q^{*})\right]\\
&= q^{*c}\,\tfrac{1}{2}\left(xy^{c} + yx^{c}\right)q^{*}\\
&= q^{*c}q^{*}\,S(xy^{c}) = S(xy^{c}) = \langle x, y\rangle
\end{aligned}$$

We saw earlier that any complexified unit-norm quaternion $q \in \mathbb{H}_{\mathbb{C}}$, $Nq = 1$, could be decomposed into the form

$$q = q_R\,q_B \qquad q_B^{*c} = q_B \qquad q_R^{*} = q_R$$

Thus

$$q^{*c} = q_B^{*c}\,q_R^{*c} = q_B\,q_R^{c}$$

and so

$$q^{*c}xq = q_B\left(q_R^{c}\,x\,q_R\right)q_B$$
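As an illustration of the transformation $x' = q^{*c}xq$, the sketch below applies a pure boost along the first axis to a sample event. It assumes the tuple representation and helper names shown (none of which appear in the text), a hypothetical rapidity `theta`, and the boost quaternion $q = \cosh(\theta/2) + i\,\mathbf{i}\sinh(\theta/2)$; the sign of the resulting velocity depends on this choice and is not claimed to match any particular convention.

```python
import math

def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c."""
    return (p[0], -p[1], -p[2], -p[3])

def cconj(p):
    """Complex conjugate p^* of every coefficient."""
    return tuple(complex(c).conjugate() for c in p)

def norm(p):
    """Np = p p^c."""
    return qmul(p, qconj(p))[0]

def event(t, x1, x2, x3):
    """Space-time event x = t + i(x1 i + x2 j + x3 k), in units with c = 1."""
    return (t, 1j*x1, 1j*x2, 1j*x3)

theta = 0.8                                        # hypothetical rapidity
q = (math.cosh(theta/2), 1j*math.sinh(theta/2), 0, 0)
x = event(2.0, 1.0, -0.5, 3.0)

x_new = qmul(qmul(qconj(cconj(q)), x), q)          # x' = q^{*c} x q
t_new = x_new[0].real
r_new = [(c / 1j).real for c in x_new[1:]]         # spatial coordinates of x'
```

For this $q$ one finds $t' = t\cosh\theta + x^1\sinh\theta$ and $x'^1 = x^1\cosh\theta + t\sinh\theta$, the transverse coordinates are unchanged, and `norm(x_new)` equals `norm(x)`, i.e. the interval $t^2 - \underline{x}\cdot\underline{x}$ is preserved.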
127 We can show that Lorentz transformation q G M.£ for which q* = qc represents a 'pure' Lorentz transformation and those for which q* = q correspond to the spatial rotations. The verification that when q* = q the Lorentz transformation q*cxq = qcxq describes spatial rotations is an immediate consequence as now, q eM. It will be instructive to verify our second claim that when q* = qc the pure Lorentz transformations are obtained. To this end consider a space-time event x = t + zx or x = \/7Vx (cos (j) + x sin </>) in which cose smc fNZ 1 V r£ L X_ ' X_ If q* = qc then we can write (with Nq = 1) then q = cosh - + id sinh - 2^2 %Q = \A/Vx (cos (j) + x sin </>) (cosh -+iq sinh -) z z /TV^ | cos 0 cosh - + z(-x • q) sin 0 sinh - 0 0 0 +x sin (j) cosh - + z g cos 0 sinh - + zx A q sin 0 sinh - and then <7*cX(j = \A/Vx [cos 4> cosh 0 - z(x • g) sin 0 sinh 6 +x sin <j) + iq cos 0 sinh 0 + 2q(x • g) sin 0 sinh2 ; = £cosh# + (x • g)^/x • x sinh0 + zg£ sinh0 Q + zx^/x • x + 2zg(x • ^^/x^x sinh2 - Therefore £' = £cosh# + (x • ^^/x^xsinh^ x' — Xy/x • x + 2q(x • g)^/x • xsinh2 - + gtsinhfl Noting that x = and writing q = Jx_ - x yjy_ - y_ v sinh 0 = —^v (to accord with standard formulations) then = - with cosh 0 = 7 = vo^y and t' = 7(t - x • v)
128 which is the standard form of a pure Lorentz transformation (a 'boost') along y_. We now briefly examine some of the more straightforward applications of the quaternion formalism to problems in Special Relativity, in particular, to problems in particle mechanics. Particle Mechanics Following the approach outlined by Synge [15] we consider a particle moving along a path with equations xr = xr(s) where s is the proper time measured by a clock moving with the particle. The relation between the measures of time t, as measured by an observer O and of s are related through the metric: ds2 = rja(3dxadx(3 = c2dt2 - dxadxa Therefore ds = cdt uaua" ~ 1 =- = -dt where (3 = | defined by We can now H) \ u / 2 . Here the 3-velocity ua v<*- — ds define the so-called 4-momentum M and the 4 ua -- rex. dxa ~ ~dt _ .a dxa mP a. Ma = m—— = (m3, -J-ua) ds c where m is called the proper mass of the particle. We can view this in quaternionic terms by introducing the quaternion M M = Maea = m/3e0 + —uaea c We also introduce the quaternion 4-force F: ds These are the (Lorentz invariant) equations of motion. This can be written in terms of the 4-velocity v (a quaternion) as M = mv _ d . N dm dv , F = —(mv) = -z— v + m— (1) ds ds ds
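We easily verify that $Nv = 1$:

$$Nv = vv^{c} = (v^{0} + i\underline{u})(v^{0} - i\underline{u}) = (v^{0})^2 - \underline{u}\cdot\underline{u}$$

which is unity since $ds^2 = \eta_{\alpha\beta}\,dx^{\alpha}dx^{\beta}$. Thus

$$\frac{dv}{ds}v^{c} + v\frac{dv^{c}}{ds} = 0$$

Now, from (1)

$$Fv^{c} = \frac{dm}{ds}vv^{c} + m\frac{dv}{ds}v^{c} = \frac{dm}{ds} + m\frac{dv}{ds}v^{c} \qquad (2)$$

From this relation we find (taking conjugates and adding)

$$Fv^{c} + vF^{c} = 2\frac{dm}{ds} \qquad\therefore\qquad \frac{dm}{ds} = \langle F, v\rangle$$

We conclude that only if $F$, $v$ are orthogonal (i.e. the 4-force is orthogonal to the path of the particle) is the proper mass a constant.

It is interesting to note that the norm of the 4-momentum quaternion $M$ is the square of the proper mass of the particle:

$$\begin{aligned}
\langle M, M\rangle = MM^{c} &= \left[m\beta + i\frac{m\beta}{c}\underline{u}\right]\left[m\beta - i\frac{m\beta}{c}\underline{u}\right]\\
&= m^2\beta^2 - \frac{m^2\beta^2}{c^2}\,\underline{u}\cdot\underline{u}\\
&= m^2
\end{aligned}$$

A momentum 4-vector $M_p$ can also be defined for a photon. A photon is associated (in quantum theory) with a set of plane waves moving with velocity $c$. Regarding a photon as a particle with zero proper mass (since its speed can never be reduced to zero), it is characterised by a 3-velocity vector $\underline{u} = c\underline{a}$ where $\underline{a}\cdot\underline{a} = 1$. Thus its momentum 4-vector has the quaternionic form

$$M_p = b(1 + i\underline{a}) \qquad b \in \mathbb{R}$$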
130 which satisfies <Mp,Mp> = b2(l + ia)(l-ia) = b2(l-a-a) = 0 The value of b must reflect the frequency of the photon (of the plane wave). The factor b is taken as b = — v where h is Plank's constant and v is the frequency of the plane wave associated with the photon. Conservation of 4-Momentum Here we consider a number of free particles colliding together. After impact they may break up into smaller particles or coalesce together to produce fewer numbers of particles or even produce photons in the collision process. Whatever happens, we assume the law of conservation of 4-momentum is valid; a law which is easily expressed: E MU = E Mw over a over 6 Here the prime indicates the system of particles and photons after collision; the unprimed quantities in the sytem prior to collision. Though not all collision problems can be solved using only this conservation law there are some simple cases for which a solution is possible. For example, the situation in which two particles with 4-momentum M(i), M(2) collide and coalesce into a single particle M(3) can be fully analysed. Here M(i) =m{i)v{i) i = 1,2,3 wherein m(j), v^ i = 1,2,3 are the proper masses and 4-velocities of the particles respectively. Now we know that v{a)vc{a) = 1 M{a)Mc{a) = m2{a) a = 1,2,3 Thus, by the conservation of 4-momentum M(3) = M^ + M(2) and so m^ = (M(i) + M{2))(MC{1) + M(c2)) = (m(l)V(l) +™(2)^(2))(™(1)^(1) +™(2)V(2)) = m2{1) + mf2) + m{1)m{2){v{1)vc{2) + v{2)vc{l)) Prom which m^) can be found. Also M(3) = M^ + M(2) implies TO(3)V(3) = m(i)V(i) + 771(2)^(2) from which v^) can be found.
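The coalescence example can be worked through numerically. The sketch below is illustrative only: it assumes units with $c = 1$, represents each 4-velocity as $v = \gamma(1 + i\underline{u})$ in a 4-tuple, and uses hypothetical masses and 3-velocities; the helper names are not from the text.

```python
import math

def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c."""
    return (p[0], -p[1], -p[2], -p[3])

def norm(p):
    """Np = p p^c."""
    return qmul(p, qconj(p))[0]

def four_velocity(u):
    """v = gamma (1 + i u) for a 3-velocity u with |u| < 1 (units with c = 1, an assumption)."""
    g = 1.0 / math.sqrt(1.0 - sum(c*c for c in u))
    return (g, 1j*g*u[0], 1j*g*u[1], 1j*g*u[2])

m1, m2 = 1.0, 2.0                                  # hypothetical proper masses
v1 = four_velocity((0.6, 0.0, 0.0))
v2 = four_velocity((-0.3, 0.4, 0.0))

# Conservation of 4-momentum: M3 = M1 + M2
M3 = tuple(m1*a + m2*b for a, b in zip(v1, v2))
m3 = math.sqrt(norm(M3).real)                      # m3^2 = M3 M3^c
v3 = tuple(c / m3 for c in M3)                     # 4-velocity of the merged particle

# The same mass from m3^2 = m1^2 + m2^2 + m1 m2 (v1 v2^c + v2 v1^c)
cross_term = (qmul(v1, qconj(v2))[0] + qmul(v2, qconj(v1))[0]).real
m3_formula = math.sqrt(m1**2 + m2**2 + m1*m2*cross_term)
```

Both routes give the same proper mass, `v3` has unit norm, and the merged mass exceeds $m_{(1)} + m_{(2)}$ because the kinetic energy of the relative motion contributes to it.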
Angular Momentum

Consider a free particle with 4-momentum $M$ at event $x$ (both complexified quaternions). The angular momentum of $M$ at $x$ with respect to the origin is defined to be the complexified quaternion $H$:

$$H = V(xM^{c})$$

If $x = x^{0} + i\underline{x}$ and $M = M^{0} + i\underline{M}$ then

$$H = i(M^{0}\underline{x} - x^{0}\underline{M}) + \underline{x}\wedge\underline{M}$$

This complexified quaternion corresponds directly to the bivector representation of angular momentum

$$H_{\alpha\beta} = x_{\alpha}M_{\beta} - x_{\beta}M_{\alpha}$$

The angular momentum is clearly dependent upon the position of the origin. An observer at event $y$ would observe an angular momentum:

$$H' = V((x - y)M^{c}) = H - V(yM^{c})$$

Using this formulation it is easy to show that the angular momentum of a free particle is conserved between collisions. Let $x$, $y$ be two events on the path of a particle; then the difference in the angular momentum observed at $x$, $y$ is

$$H_x - H_y = V(xM^{c}) - V(yM^{c}) = V((x - y)M^{c})$$

(where we use the fact that the 4-momentum is unchanged along the path of a free particle). Now, by definition the 4-momentum is the proper mass times the rate of change of the particle's event position with respect to its proper time and so

$$M = k(x - y) \qquad k \in \mathbb{R}$$

Therefore

$$H_x - H_y = kV((x - y)(x - y)^{c}) = 0$$

and so

$$H_x = H_y$$
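The conservation argument can be illustrated numerically. In this sketch (an illustration under assumed values, with $c = 1$) the path of the free particle is taken to be $x_0 + sM$ for a hypothetical starting event $x_0$; the helper names are not from the text.

```python
import math

def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c."""
    return (p[0], -p[1], -p[2], -p[3])

def ang_mom(x, M):
    """H = V(x M^c): vector part of the product, returned as three components."""
    return qmul(x, qconj(M))[1:]

def cross(a, b):
    """Ordinary 3-vector cross product a ^ b."""
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

m, u = 1.5, (0.2, -0.5, 0.1)                       # hypothetical proper mass and 3-velocity
g = 1.0 / math.sqrt(1.0 - sum(c*c for c in u))
M0, Mv = m*g, tuple(m*g*c for c in u)
M = (M0, 1j*Mv[0], 1j*Mv[1], 1j*Mv[2])             # M = M^0 + i M

x_start = (0.0, 1j*1.0, 1j*2.0, 1j*(-1.0))         # hypothetical event on the path

def on_path(s):
    """Event reached from x_start by moving a parameter distance s along M."""
    return tuple(a + s*b for a, b in zip(x_start, M))

x, y = on_path(0.7), on_path(-1.3)
H_x, H_y = ang_mom(x, M), ang_mom(y, M)

# Component form H = i (M^0 x - x^0 M) + x ^ M evaluated at the event x
x0, xv = x[0], tuple((c / 1j).real for c in x[1:])
xM = cross(xv, Mv)
H_explicit = tuple(1j*(M0*xv[n] - x0*Mv[n]) + xM[n] for n in range(3))
```

Here `H_x` and `H_y` agree to rounding error, and `H_x` coincides with the explicit component form, so the quaternion definition and the 3-vector expression are consistent.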
132 Intrinsic Angular Momentum The angular momentum defined above is normally referred to as the orbital angular momentum. We shall show, in a later section, that if the angular momentum is defined as H = V(xMc) then we can deduce that -(MCH - H*MC) = 0 (the tensor equivalent of this statement is that Ha^M^ = 0). As well as the orbital angular momentum a particle may be endowed with intrinsic angular momentum often called the spin angular momentum. A simple analogy is the Earth/Sun planetary system. With respect to the Sun the Earth, in its orbit, has angular momentum. The angular momentum it has due to it spinning about its axis is its intrinsic angular momentum. In this context the intrinsic angular momentum of a particle, denoted by P, has the quaternionic form: P = B-iA B,AeR3 in which, if M is its 4—momentum, we demand ^(McH-H*Mc) = 0 This constraint implies (M° - iK){R - iA) -{B + iA){M° - tM) = 0 that is M°B-iM°A + iK-B-iMhB + M-A-M/\A -{MoB + iMoA + iB'K-iBAK-A'K+AAM} = 0 that is -i (M° A + MAi?) + M-A = 0 therefore K-A = 0 and M°A = BAM These equations show that, as 3-vectors, A,B_,M_ form a right-handed system. Also there is a single (complex) invariant associated with P, namely its norm: Np = {B-iA){-B + iA) = BB-AA-2iAB = &R-AA since AB_ = 0 Thus there is only one real invariant associated with P called its spin invariant.
133 3.5 Quaternions and Electromagnetism Before discussing electromagnetism we consider introducing a complexified quaternion differential operator D: ^ d .,*. d - d r d x _„ D = 75— + tfiTj- + j«— + fc^—) = Daea OXq OX\ OX2 OX3 so that Da = d d d d Kdxo dx\ 8x2 8x3 We can raise or lower indices using the metric therefore and therefore so Xa — 'tfapK x0 = x°, x\ — -x1, x2 = -x2, X3 =-x3 _d___d_ _9____9_ A-_JL d - d dx0 dx° dx\ dx1 dx2 dx2 8x3 dx3 Da = riapD? Da=( d d d d dx°' dx1' dx2' dx3 D0 = D° Dm = -Dm m = 1,2,3 Now, under a Lorentz boost: x° = (3(x'° + v • x') x = x' + yZ we obtain ax'/3 ~ dx^ ax'/3 ~ dx<p 7~ ax7/9 ° + dx^m W = d^=f}Do + pVmDm = /J(D°"Vm£>m) dx' Dk + = Dk + yZ v Dm Vk Vk
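which shows that $D^{\alpha}$ transforms like a 4-vector ($D' = q^{*c}Dq$ in quaternion terms). Noting that

$$D = \frac{\partial}{\partial x_0} + i\left(\mathbf{i}\frac{\partial}{\partial x_1} + \mathbf{j}\frac{\partial}{\partial x_2} + \mathbf{k}\frac{\partial}{\partial x_3}\right) = \frac{\partial}{\partial x^0} - i\left(\mathbf{i}\frac{\partial}{\partial x^1} + \mathbf{j}\frac{\partial}{\partial x^2} + \mathbf{k}\frac{\partial}{\partial x^3}\right) = \frac{\partial}{\partial x^0} - i\underline{\nabla}$$

where $\underline{\nabla} = \mathbf{i}\dfrac{\partial}{\partial x^1} + \mathbf{j}\dfrac{\partial}{\partial x^2} + \mathbf{k}\dfrac{\partial}{\partial x^3}$ is the usual three-dimensional gradient operator, then

$$D^{c} = \frac{\partial}{\partial x^0} + i\underline{\nabla}$$

We easily show that

$$DD^{c} = D^{c}D = \frac{\partial^2}{\partial (x^0)^2} - \nabla^2$$

which is the wave-operator.

A very useful application of the $D$-operator occurs in the description of electromagnetic fields. In this the electric and magnetic fields $\underline{E}$, $\underline{H}$ are combined together as a single complexified quaternion $Q$

$$Q = \underline{H} - i\underline{E} \qquad \underline{H}, \underline{E} \in \mathbb{R}^3$$

in which, in this form, $\underline{H}$, $\underline{E}$ are regarded as pure quaternions. We similarly combine the charge density $\rho$ and the current $\underline{J}$ as a complexified quaternion

$$J = -\rho + i\underline{J}$$

The usefulness of the complexified quaternionic formulation is elegantly displayed in that all of Maxwell's equations are incorporated in a single quaternion equation $D^{c}Q = J$. This is easily demonstrated:

$$D^{c}Q = \left[\frac{\partial}{\partial x^0} + i\underline{\nabla}\right]\left[\underline{H} - i\underline{E}\right] = \frac{\partial \underline{H}}{\partial x^0} - i\frac{\partial \underline{E}}{\partial x^0} + i\left(-\underline{\nabla}\cdot\underline{H} + \underline{\nabla}\wedge\underline{H}\right) + \left(-\underline{\nabla}\cdot\underline{E} + \underline{\nabla}\wedge\underline{E}\right) = -\rho + i\underline{J}$$

Then equating together appropriate terms (scalars, vectors, reals and imaginary quantities):

$$\underline{\nabla}\cdot\underline{E} = \rho \qquad \underline{\nabla}\wedge\underline{E} = -\frac{\partial \underline{H}}{\partial x^0} \qquad \underline{\nabla}\cdot\underline{H} = 0 \qquad \underline{\nabla}\wedge\underline{H} = \underline{J} + \frac{\partial \underline{E}}{\partial x^0}$$

which are the usual set of Maxwell's equations. We can also describe electromagnetism using a scalar potential $A^0$ and a vector potential $\underline{A}$:

$$\underline{E} = -\frac{\partial \underline{A}}{\partial x^0} + \underline{\nabla}A^0 \qquad \underline{H} = \underline{\nabla}\wedge\underline{A}$$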
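If we introduce a complexified quaternion $A = A^0 + i\underline{A}$ then the dependency of $\underline{E}$, $\underline{H}$ on the potential $A$ can be written as a single equation

$$V(DA) = Q = \underline{H} - i\underline{E}$$

This follows since

$$DA = \left[\frac{\partial}{\partial x^0} - i\underline{\nabla}\right]\left[A^0 + i\underline{A}\right] = \frac{\partial A^0}{\partial x^0} + i\frac{\partial \underline{A}}{\partial x^0} - i\underline{\nabla}A^0 - \underline{\nabla}\cdot\underline{A} + \underline{\nabla}\wedge\underline{A}$$

therefore

$$(DA)^{c} = \frac{\partial A^0}{\partial x^0} - i\frac{\partial \underline{A}}{\partial x^0} + i\underline{\nabla}A^0 - \underline{\nabla}\cdot\underline{A} - \underline{\nabla}\wedge\underline{A}$$

and so

$$V(DA) = i\frac{\partial \underline{A}}{\partial x^0} - i\underline{\nabla}A^0 + \underline{\nabla}\wedge\underline{A}$$

Thus $V(DA) = \underline{H} - i\underline{E}$ implies

$$\underline{E} = -\frac{\partial \underline{A}}{\partial x^0} + \underline{\nabla}A^0 \qquad\text{and}\qquad \underline{H} = \underline{\nabla}\wedge\underline{A}$$

which are the usual field/potential relations. It is interesting to obtain Maxwell's equations in potential form:

$$\tfrac{1}{2}D^{c}\left(DA - (DA)^{c}\right) = J = -\rho + i\underline{J}$$

To see this, the left hand side is, when expanded

$$i\frac{\partial^2 \underline{A}}{\partial (x^0)^2} - i\frac{\partial}{\partial x^0}\underline{\nabla}A^0 + \underline{\nabla}\cdot\frac{\partial \underline{A}}{\partial x^0} - \nabla^2 A^0 + \underline{\nabla}\wedge(\underline{\nabla}A^0) + i\underline{\nabla}\wedge(\underline{\nabla}\wedge\underline{A}) - i\underline{\nabla}\cdot(\underline{\nabla}\wedge\underline{A})$$

Note that we have not used the identities

$$\underline{\nabla}\wedge(\underline{\nabla}A^0) \equiv 0 \qquad \underline{\nabla}\cdot(\underline{\nabla}\wedge\underline{A}) \equiv 0$$

as only by keeping these terms can the full set of Maxwell's equations be recovered from the usual potential relations.

We have already noted that $V(DA) = \underline{H} - i\underline{E}$. We can easily verify that

$$S(DA) = \tfrac{1}{2}\left(DA + (DA)^{c}\right) = \frac{\partial A^0}{\partial x^0} - \underline{\nabla}\cdot\underline{A}$$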
136 dA° which vanishes if the usual Lorenz (not H A Lorentz) gauge condition V • A = —- is ox0 imposed. If this condition is imposed we can regard D and A as orthogonal 4-vectors. Now < D, Ac > = S(DA) = ~(DA+ (DA)C) and <D,D> = S{DDC) = DDC and so, without the imposition of the gauge condition Maxwell's equations can be written as \dc(DA - {DA)C) = DCDA - \dc(DA + {DA)C) = <D,D>A-DC<D,AC>=J whereas if the gauge condition is imposed then Maxwell's equations (in terms of the potential A) take the elegant form: <D,D> A = J or DA = R-iE The first of these equations gives, on expansion d2A - iV2A - J r ^2 >io + _d{x0)2 implying the usual wave equations for A0, A. d2A{ d{x°) ^-v^u + p = 0 Lorentz Transformation of £, H_ To obtain the Lorentz transformation of E_, H_ one would normally express these quantities in terms of the potentials: BA E = --= + VA° K = VAA then s = .gr+ffa» H' = V' A A1 d where -^—r and V' refer to coordinates (x,0,x,1,x,2^xf3). We can then make use of the ax/u fact that (A0, A) and (^-q, Y) transform as 4-vectors under a Lorentz transformation. However, there is a more elegant approach to finding the forms of &,H/ using complex rotations in the space of complex 3-vectors and utilising complexified quaternion algebra. If x,y are 4-vectors then, under a Lorentz tranformation, defined by Q: xf = Q*cxQ and y'= Q*cyQ NQ = l
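thus

$$x'y'^{c} = (Q^{*c}xQ)(Q^{c}y^{c}Q^{*}) = Q^{*c}\,xy^{c}\,Q^{*}$$

From what we have already discovered about quaternion transformations this particular transformation can be considered as a complex rotation in the space of complex 3-vectors if $Q$ is such that $Q^{*} = Q^{c}$ (thereby obtaining $Q\,xy^{c}\,Q^{c}$ on the right hand side), which is a Lorentz boost, or if $Q$ is such that $Q^{*} = Q$ (obtaining $Q^{c}\,xy^{c}\,Q$ on the right hand side), which is a spatial rotation. For such complex rotations we know that

$$S(x'y'^{c}) = S(xy^{c}) \qquad\text{and}\qquad N_{x'y'^{c}} = N_{xy^{c}}$$

Now reverting to the electromagnetic arena let us take $D$, $A^{c}$ as $x$, $y$; then

$$xy^{c} = DA = \frac{\partial A^0}{\partial x^0} - \underline{\nabla}\cdot\underline{A} + i\left(\frac{\partial \underline{A}}{\partial x^0} - \underline{\nabla}A^0\right) + \underline{\nabla}\wedge\underline{A} = \frac{\partial A^0}{\partial x^0} - \underline{\nabla}\cdot\underline{A} + \underline{H} - i\underline{E}$$

Thus, under a Lorentz transformation, either a boost or a spatial rotation,

$$S(DA) = \frac{\partial A^0}{\partial x^0} - \underline{\nabla}\cdot\underline{A}$$

is conserved. That is

$$\frac{\partial A'^0}{\partial x'^0} - \underline{\nabla}'\cdot\underline{A}' = \frac{\partial A^0}{\partial x^0} - \underline{\nabla}\cdot\underline{A}$$

Now if we choose the usual gauge then $\dfrac{\partial A^0}{\partial x^0} = \underline{\nabla}\cdot\underline{A}$ (which is therefore true in all inertial coordinate systems) and

$$xy^{c} = \underline{H} - i\underline{E}$$

Also, choosing $Q = \alpha_B - i\underline{\beta}_B$, $\alpha_B \in \mathbb{R}$, $\underline{\beta}_B \in \mathbb{R}^3$ (and thus $Q^{*} = Q^{c}$), which is a boost, then $\underline{H}' - i\underline{E}'$ is obtained by using the complex rotation:

$$\begin{aligned}
\underline{H}' - i\underline{E}' &= (\alpha_B - i\underline{\beta}_B)(\underline{H} - i\underline{E})(\alpha_B + i\underline{\beta}_B)\\
&= (\alpha_B - i\underline{\beta}_B)\left[\alpha_B\underline{H} - i\underline{H}\cdot\underline{\beta}_B + i\underline{H}\wedge\underline{\beta}_B - i\alpha_B\underline{E} - \underline{E}\cdot\underline{\beta}_B + \underline{E}\wedge\underline{\beta}_B\right]\\
&= \alpha_B^2\underline{H} + 2\alpha_B\underline{E}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{H}\cdot\underline{\beta}_B) + (\underline{\beta}_B\cdot\underline{\beta}_B)\underline{H}\\
&\quad + i\left[-\alpha_B^2\underline{E} + 2\alpha_B\underline{H}\wedge\underline{\beta}_B + 2\underline{\beta}_B(\underline{E}\cdot\underline{\beta}_B) - (\underline{\beta}_B\cdot\underline{\beta}_B)\underline{E}\right]
\end{aligned}$$

Therefore

$$\underline{H}' = (\alpha_B^2 + \underline{\beta}_B\cdot\underline{\beta}_B)\underline{H} + 2\alpha_B\underline{E}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{H}\cdot\underline{\beta}_B)$$

$$\underline{E}' = (\alpha_B^2 + \underline{\beta}_B\cdot\underline{\beta}_B)\underline{E} - 2\alpha_B\underline{H}\wedge\underline{\beta}_B - 2\underline{\beta}_B(\underline{E}\cdot\underline{\beta}_B)$$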
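The boost formulae for the fields can be compared directly with the quaternion product. The following sketch assumes a 4-tuple representation, a hypothetical rapidity and boost direction with $\alpha_B = \cosh(\theta/2)$, $\underline{\beta}_B = \underline{n}\sinh(\theta/2)$ (so that $NQ = 1$), and arbitrary sample fields; the names are illustrative only.

```python
import math

def qmul(p, q):
    """Hamilton product of 4-tuples (a, b, c, d) = a + b i + c j + d k; entries may be complex."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(p):
    """Quaternion conjugate p^c."""
    return (p[0], -p[1], -p[2], -p[3])

def dot(a, b):
    return sum(x*y for x, y in zip(a, b))

def cross(a, b):
    return (a[1]*b[2] - a[2]*b[1], a[2]*b[0] - a[0]*b[2], a[0]*b[1] - a[1]*b[0])

theta, n = 0.6, (2/3, -1/3, 2/3)                   # hypothetical rapidity and unit direction
alpha = math.cosh(theta/2)
beta = tuple(math.sinh(theta/2)*c for c in n)
Q = (alpha, -1j*beta[0], -1j*beta[1], -1j*beta[2]) # Q = alpha_B - i beta_B, NQ = 1

H = (0.3, -1.2, 0.5)                               # sample magnetic field
E = (1.0, 0.4, -0.7)                               # sample electric field
F = (0,) + tuple(h - 1j*e for h, e in zip(H, E))   # H - iE as a pure quaternion

Fp = qmul(qmul(Q, F), qconj(Q))                    # H' - iE' = Q (H - iE) Q^c
H_new = [c.real for c in Fp[1:]]
E_new = [-c.imag for c in Fp[1:]]

# Component formulae for H' and E'
bb, ExB, HxB = dot(beta, beta), cross(E, beta), cross(H, beta)
H_formula = [(alpha**2 + bb)*H[k] + 2*alpha*ExB[k] - 2*beta[k]*dot(H, beta) for k in range(3)]
E_formula = [(alpha**2 + bb)*E[k] - 2*alpha*HxB[k] - 2*beta[k]*dot(E, beta) for k in range(3)]
```

The two routes agree, the transformed quaternion has no scalar part, and the quantities $\underline{H}\cdot\underline{H} - \underline{E}\cdot\underline{E}$ and $\underline{E}\cdot\underline{H}$ computed from `H_new`, `E_new` equal their original values, which reflects the conservation of the norm of $\underline{H} - i\underline{E}$.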
138 Now If we let then and so Q = aB — i(3B = cos z + q sin; cosh0 = /3 = = cosh - + iq sinh - where z = i- z z z sinh 0 = —/?v y_ = vq vo^) aB = cos 2 = cosh - — /3B = —iq sin 2 = q sinh - £' = caBhOE + sinh^ _ 2sinh2 * ?(£' *) = /J E + vAtf v(E-v) yZ + 2 u2 Similarly: £'=/? H-vAE- viK - y) + 2i(2£ • v) These two formulae indicate how H_,E_ transform under a Lorentz boost. The same approach can be used to obtain the forms of H_, E_ under a spatial rotation. In this case we choose Q = aR - f3R aR, (5R both real (Q* = Q) then H' - iE' = (aR + fa(H- iE)(aR - fa = (aR+fa[aRK + H-(^-HAl[R-iaRE-iE-pR+iEA£R] = a2RH_-2aRH_/\£ -iaRE + 2iaRE/\fiR_ + 2faH-pR) ~ (Pn •&)£+ KPR -faE-2ifaE.fa therefore Now Now if we let B! = a2RH - 2aRH A & + 2faH ■ fa - (& • faH g = a2RE - 2aRE A & + 20R{E ■ fa - (^ • faE Q = aR— f3R = cosz + hsinz = cos h sin - since z ■ 2 2 COS0 = VTTT^) sm# = y/^T^) e_ "2 v — vh
139 then aR = cos(0/2) /^ = ftsin(0/2) and <**-£*•& = cos2 {9/2) - sin2 (0/2) = cos 9 then H' = cos 0J£ - sin 0=—= + 2 sin2 - "v J = 1 H-HAv- v(R-v) + 2 u2 v{R-v) where 7 : VO + t?) and £'= 7 E-EAv- v(E-v) + 2>(i£' 2>) Not surprisingly, these are precisely the results we expect when any 3-vectors transform under a spatial rotation: (x' = xcos0+ (x - n)n(l - cos0) + n Axsin0) n ab with xyc = H_ — iE_ then Now, as we have seen above Nxyc is conserved in a Lorentz tranformation. In this case Na xyc {K-iE){-R + iE) = HH-EE-2iEK Thus both H • H — E • E and E • H are conserved under Lorentz transformations. 3.6 Quaternionic Representation of Bivectors A bivector is a skew-symmetric second order tensor Fap Fap = —Fpa In 4-dimensions Fap has six independent components. These can be conveniently arranged in terms of two vectors A, R: {A)m — Fom (R)m = ~^emjkFjk (In this section I will try to be consistent about the use of lower case latin indices which range through 1,2,3 whilst greek indices range through 0,1,2,3). The second relation here is easily inverted: £mab\J2.)m = ~7y^mab^mjk-^jk 1 2 = -Fab {fiajfibk - SakSbj)Fjk
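Therefore

$$F_{ab} = -\epsilon_{mab}(\underline{B})_m$$

The two vectors $\underline{A}$, $\underline{B}$ can now be subsumed into a single complexified quaternion $H \in \mathbb{H}_{\mathbb{C}}$:

$$H = \underline{B} - i\underline{A}$$

Explicitly:

$$H = \mathbf{i}(-F_{23}) + \mathbf{j}(F_{13}) + \mathbf{k}(-F_{12}) - i\left[\mathbf{i}(F_{01}) + \mathbf{j}(F_{02}) + \mathbf{k}(F_{03})\right]$$

Also of importance in this area is the dual bivector $F^{\circledast}_{\alpha\beta}$ defined by:

$$F^{\circledast}_{\alpha\beta} = \tfrac{1}{2}\epsilon_{\alpha\beta\gamma\delta}F^{\gamma\delta}$$

where $\epsilon_{\alpha\beta\gamma\delta}$ is the completely anti-symmetric object with $\epsilon_{0123} = +1$ and

$$\epsilon_{\alpha\beta\gamma\delta} = \begin{cases} 1 & \text{if } \alpha\beta\gamma\delta \text{ is an even permutation of } 0123\\ -1 & \text{if } \alpha\beta\gamma\delta \text{ is an odd permutation of } 0123\\ 0 & \text{otherwise} \end{cases}$$

The contravariant form is defined similarly and is such that

$$\epsilon^{0123} = \eta^{0\alpha}\eta^{1\beta}\eta^{2\gamma}\eta^{3\delta}\epsilon_{\alpha\beta\gamma\delta} = -\epsilon_{0123} = -1$$

We can list the dual components in terms of the ordinary components:

$$F^{\circledast}_{01} = F_{23} \qquad F^{\circledast}_{02} = F_{31} \qquad F^{\circledast}_{03} = F_{12}$$

$$F^{\circledast}_{12} = -F_{03} \qquad F^{\circledast}_{13} = F_{02} \qquad F^{\circledast}_{23} = -F_{01}$$

These dual tensor components define a quaternion $H^{\circledast}$:

$$\begin{aligned}
H^{\circledast} &= -\mathbf{i}F^{\circledast}_{23} + \mathbf{j}F^{\circledast}_{13} - \mathbf{k}F^{\circledast}_{12} - i\left[\mathbf{i}F^{\circledast}_{01} + \mathbf{j}F^{\circledast}_{02} + \mathbf{k}F^{\circledast}_{03}\right]\\
&= \mathbf{i}F_{01} + \mathbf{j}F_{02} + \mathbf{k}F_{03} - i\left[\mathbf{i}F_{23} + \mathbf{j}F_{31} + \mathbf{k}F_{12}\right]\\
&= \underline{A} + i\underline{B} = iH
\end{aligned}$$

$$\therefore\quad H^{\circledast} = iH$$

Thus the dual operation in complexified quaternions is a simple operation: simply multiply by $+i$ (a similar result was obtained for three-dimensional bivectors). Repeating the dual operation it follows immediately that

$$H^{\circledast\circledast} = -H$$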
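The relation between the tensor dual and multiplication by $i$ can be checked on a sample bivector. The sketch below assumes the dual is formed as one half of the contraction of the Levi-Civita symbol (with $\epsilon_{0123} = +1$) with the raised components $F^{\gamma\delta}$, and uses hypothetical values for the six independent components; the function names are illustrative.

```python
from itertools import permutations

def parity(perm):
    """+1 for an even permutation of (0, 1, 2, 3), -1 for an odd one."""
    perm, sign = list(perm), 1
    for i in range(len(perm)):
        while perm[i] != i:              # sort by transpositions, flipping the sign each time
            j = perm[i]
            perm[i], perm[j] = perm[j], perm[i]
            sign = -sign
    return sign

eta = [1, -1, -1, -1]                    # diagonal Minkowski metric

def dual(T):
    """Dual of a covariant bivector: half of epsilon_{abcd} T^{cd}."""
    T_up = [[eta[a]*eta[b]*T[a][b] for b in range(4)] for a in range(4)]
    return [[0.5 * sum(parity(p) * T_up[p[2]][p[3]]
                       for p in permutations(range(4)) if p[0] == a and p[1] == b)
             for b in range(4)] for a in range(4)]

def to_quaternion(T):
    """H = i(-T23) + j(T13) + k(-T12) - i[i T01 + j T02 + k T03] as a 4-tuple."""
    return (0,
            -T[2][3] - 1j*T[0][1],
            T[1][3] - 1j*T[0][2],
            -T[1][2] - 1j*T[0][3])

# Hypothetical independent components F_{ab}, a < b
vals = {(0, 1): 1.0, (0, 2): -2.0, (0, 3): 0.5, (1, 2): 3.0, (1, 3): -1.5, (2, 3): 4.0}
F = [[0.0]*4 for _ in range(4)]
for (a, b), v in vals.items():
    F[a][b], F[b][a] = v, -v

F_dual = dual(F)
H = to_quaternion(F)
H_dual = to_quaternion(F_dual)
F_double_dual = dual(F_dual)
```

For these values `H_dual` equals `1j` times `H` component by component, and applying `dual` twice returns `-F`, matching the statement that repeating the dual operation reverses the sign.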
141 Quaternionic Form of the Contracted Product Fapz^ The contracted product of two 4-vectors pa,(la i-e. paQa has an obvious quaternionic counterpart S(pqc). The contracted product of a bivector Fap with a 4-vector za i.e. Fapz@ is less easy to describe in quaternionic terms. To obtain the quaternionic counterpart we shall find it instructive to decompose Fap into 'time' and 'spatial' components. Fap = F0p6oa + FmpSma = Fop60a + Fm06ma6po + FmkSma6pk = Fok(S0aSkp - SkaSpo) + FmkSmaSpk = (A)k(6oafikp - SkaSpo) - ^3mk{S)j8ma8pk = -(ImH)k(SoaSkp - 8ka$po) - £jmk(R>eH)j6ma6pk Thus if z@ is any 4—vector then Fapzp = -(lmH)k{S0azk - Skaz°) - ejmk{ReH)j8mazk = -{ImH)kzk60* + [{lmH)kz° - ejfcm(Retf )^m] Ska But if we consider the quaternion product of H = B_ — %A with z = b + ia (a 4—vector) we easily obtain: Hz = -A-a + i{-bA + B_Aa) + i(-B_ . a) + bB_ + A A a and therefore -(if*)* = A • a + i(—6A + 5Aa)+ i(-5 • a) - bB_ - A A a Thus, adding and subtracting this to its starred conjugate (the operator *c) we get ^[zcH-H*zc}=A.a + i(-bA + BAa) and l[-zcH-H*zc]=i{-B>a)-{bB + A/\a) Now realising that za <-> (6, a) <-> 6 + za in quaternion form and that e3km{ReH)jZm = -ekjm(BeH)jZm shows that Fapz@ can be expressed in the form: FapzP <-> (A-a, -bA + B_Aa) ^A'a + i{-bA + BAa) = ]-[zcH-H*zc}
142 Also, from above F%z* = -(ImH®)kzk60a + [(JmH*)kz° + ekjm(ReH%z™] Ska and so Fapz(* "{-R'a,Bb + AAa) <-> -£-a + z(&£ + ,4Aa) = i [%B_ • a + (bB_ + A A a)] = l[zcH + H*zc] Thus (Fa/9 - zF^)^ <-> l- [zcH - H*zc + *c# + ff *2C] The combination F^ = Fap—iF®p is a useful combination and has the important property that W = «$ For this reason the complex bivector Fi is called self-dual. Its close relation Kp = F<*0 + iFaf3 satisfies (F-0r = -iF-0 and is called anti-self-dual. There is a single complex invariant associated with Fi: \K0F+a0 = \(Fa0 - tF®, )(F*» - iF**") = \iFa0F0!3 - FVpF** - i[F%FaP + FQ0F^}) But F^a0 = ^^jpspeaf, = F^{\r,l6a0F^) = -F^sFl6 Therefore -F+^F+a/5 - F *Fa/5 - iF®„Faf3 2 <*P ~ r<*Pr trapr
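This invariant is easily expressed in terms of the vectors $\mathbf{A}$, $\mathbf{B}$:

\[
\begin{aligned}
F_{\alpha\beta}F^{\alpha\beta} &= F_{0\beta}F^{0\beta} + F_{m\beta}F^{m\beta} \\
&= F_{0a}F^{0a} + F_{m0}F^{m0} + F_{mk}F^{mk} \\
&= 2F_{0a}F^{0a} + F_{mk}F^{mk} \\
&= -2F_{0a}F_{0a} + F_{mk}F_{mk} \\
&= -2\mathbf{A}\cdot\mathbf{A} + \epsilon_{imk}\epsilon_{jmk}(\mathbf{B})_i(\mathbf{B})_j \\
&= -2\mathbf{A}\cdot\mathbf{A} + (\delta_{ij}\delta_{mm} - \delta_{im}\delta_{jm})(\mathbf{B})_i(\mathbf{B})_j \\
&= -2\mathbf{A}\cdot\mathbf{A} + (3\mathbf{B}\cdot\mathbf{B} - \mathbf{B}\cdot\mathbf{B}) = 2(\mathbf{B}\cdot\mathbf{B} - \mathbf{A}\cdot\mathbf{A})
\end{aligned}
\]

Also

\[
\begin{aligned}
F^{\otimes}_{\alpha\beta}F^{\alpha\beta} &= -2(F_{23}F_{01} + F_{31}F_{02} + F_{12}F_{03}) + 2(-F_{03}F_{12} + F_{02}F_{13} - F_{01}F_{23}) \\
&= -4(-B_1A_1 - B_2A_2 - B_3A_3) = 4\,\mathbf{A}\cdot\mathbf{B}
\end{aligned}
\]

However, as $H = \mathbf{B} - i\mathbf{A}$ then

\[ N_H = \mathbf{B}\cdot\mathbf{B} - \mathbf{A}\cdot\mathbf{A} - 2i\,\mathbf{A}\cdot\mathbf{B} \]

thus, finally

\[ \tfrac{1}{2}F^{+}_{\alpha\beta}F^{+\alpha\beta} = 2N_H \]

Simple Bivectors

When $F_{\alpha\beta}$ can be written in terms of two vectors $p_\alpha$, $q_\alpha$ as

\[ F_{\alpha\beta} = p_\alpha q_\beta - p_\beta q_\alpha \]

then it is called a simple bivector. If $p$, $q$ ($p^* = p^c$, $q^* = q^c$) are the quaternionic representations of $p_\alpha$, $q_\alpha$ respectively then the quaternionic form of a simple bivector is

\[ \mathcal{S} = V(pq^c) = \tfrac{1}{2}(pq^c - qp^c) \]

If we write $p = \alpha + i\boldsymbol{\beta}$, $q = \gamma + i\boldsymbol{\delta}$ then, by direct expansion:

\[ \mathcal{S} = \tfrac{1}{2}\left[(\alpha + i\boldsymbol{\beta})(\gamma - i\boldsymbol{\delta}) - (\gamma + i\boldsymbol{\delta})(\alpha - i\boldsymbol{\beta})\right] = i[\gamma\boldsymbol{\beta} - \alpha\boldsymbol{\delta}] + \boldsymbol{\beta}\wedge\boldsymbol{\delta} \]

which is clearly a bivector. We note the relation between the norms of $p$, $q$ and $\mathcal{S}$:

\[ N_p = S(pp^c) = \alpha^2 - \boldsymbol{\beta}\cdot\boldsymbol{\beta} \qquad N_q = S(qq^c) = \gamma^2 - \boldsymbol{\delta}\cdot\boldsymbol{\delta} \]

and

\[
\begin{aligned}
N_{\mathcal{S}} &= -\left[(\gamma\boldsymbol{\beta} - \alpha\boldsymbol{\delta})\cdot(\gamma\boldsymbol{\beta} - \alpha\boldsymbol{\delta}) - (\boldsymbol{\beta}\wedge\boldsymbol{\delta})\cdot(\boldsymbol{\beta}\wedge\boldsymbol{\delta})\right] \\
&= -\left[\gamma^2\boldsymbol{\beta}\cdot\boldsymbol{\beta} - 2\alpha\gamma\,\boldsymbol{\delta}\cdot\boldsymbol{\beta} + \alpha^2\boldsymbol{\delta}\cdot\boldsymbol{\delta} + (\boldsymbol{\beta}\cdot\boldsymbol{\delta})^2 - (\boldsymbol{\beta}\cdot\boldsymbol{\beta})(\boldsymbol{\delta}\cdot\boldsymbol{\delta})\right]
\end{aligned}
\]

We choose, without loss of generality, $p$, $q$ to be orthogonal so that $S(pq^c) = 0$. That is,

\[ pq^c + qp^c = 0 \quad\therefore\quad \alpha\gamma = \boldsymbol{\beta}\cdot\boldsymbol{\delta} \]

With this condition imposed we find

\[
\begin{aligned}
N_{\mathcal{S}} &= -\left[\gamma^2\boldsymbol{\beta}\cdot\boldsymbol{\beta} - \alpha^2\gamma^2 + \alpha^2\boldsymbol{\delta}\cdot\boldsymbol{\delta} - (\boldsymbol{\beta}\cdot\boldsymbol{\beta})(\boldsymbol{\delta}\cdot\boldsymbol{\delta})\right] \\
&= -(\boldsymbol{\beta}\cdot\boldsymbol{\beta} - \alpha^2)(\gamma^2 - \boldsymbol{\delta}\cdot\boldsymbol{\delta}) = N_pN_q
\end{aligned}
\]

With reference to this result we note that the quaternion $Q$ defined by

\[ Q = S(pq^c) + V(pq^c) = pq^c \]

is such that $N_Q = N_pN_{q^c} = N_pN_q$, but when $S(pq^c) = 0$ then $Q = \mathcal{S}$ and so the above result could have been deduced immediately. We also note that $\mathcal{S}$ is unchanged by the addition to $q$ of a multiple of $p$, and by the addition to $p$ of a multiple of $q$ (whether or not $p$, $q$ are orthogonal). Thus a simple bivector represents a plane (called a 2-flat) containing the vectors $p$, $q$.

Now every 2-flat contains at least one space-like vector. To see this consider any 2-flat spanned by two orthogonal vectors $p$, $q$. Then $t = ap + bq$ is any vector from the 2-flat. Now

\[ N_t = (ap + bq)(ap^c + bq^c) = a^2N_p + b^2N_q \quad\text{since } pq^c + qp^c = 0 \]

Not both orthogonal vectors $p$, $q$ can be time-like. To see this let $p = \alpha + i\boldsymbol{\beta}$ and $q = \gamma + i\boldsymbol{\delta}$; then

\[ S(pq^c) = \tfrac{1}{2}(pq^c + qp^c) = \alpha\gamma - \boldsymbol{\beta}\cdot\boldsymbol{\delta} \]

If $p$, $q$ are both time-like then

\[ N_p = \alpha^2 - \boldsymbol{\beta}\cdot\boldsymbol{\beta} > 0 \qquad N_q = \gamma^2 - \boldsymbol{\delta}\cdot\boldsymbol{\delta} > 0 \]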
144 which is clearly a bivector. We note the relation between the norms of p, q and S. NP = S(PPC)= a2 -§_■§_ Nq = S(qqc) = 12-6-S and N§ = - [W-clS) • (l§_-aS) - (PAS) • (pA6)] = -[^2p'p-2a^S'p^a2S'S^{P'S)2-{P'P)(S'S)] We choose, without loss of generality p, q to be orthogonal so that S(pqc) = 0. That is, pqc + qpc = 0 .'. a6 = d-6 With this condition imposed we find Ns = -[j2^p-a2^2 + a2S-S-{p-§){S-S)] = -(p.p-a2)tf-6-6) = NpNq With reference to this result we note that the quaternion Q defined by Q = S{pqc) + V(pqc)=pqc is such that NQ = NpNqC = NpNq but when S(pqc) = 0 then Q = S and so the above result could have been deduced immediately. We also note that S is unchanged by the addition to q of a multiple of p and to p by the addition of a multiple of </, (whether or not p, q are orthogonal). Thus a simple bivector represents a plane (called a 2—flat) containing the vectors p, q. Now every 2-flat contains at least one space-like vector. To see this consider any 2-flat spanned by two orthogonal vectors p, q. Then t = ap + bq is any vector from the 2—flat. Now Nt = (ap + bq)(apc + bqc) = a2Np + b2Nq since pqc + qpc = 0 Not both orthogonal vectors p, q can be time-like. To see this let p = a + i&_ and q = 7 + iS then pqc + qpc = aj — /3 • £ If p, q are both time-like then Np = a2-P-f3_>0 Nq=j2-S-S>0
145 Therefore a2 > \p\2 and 7* > |£|2 But aj - (3 • 6 = 0 implies 0:7 = \/3\ \6] cos 0 (*) 2 2 •'• OT = C0S" or WW = C0*2e-1 which contradicts (*). Hence not both p, q can be time-like; one must be space-like. Since every 2—flat contains a space-like vector we can choose (without loss of generality) p to be space-like. Thus Np < 0. There are now three cases (Figure 3.2) using N§ = NpNq (i) 7V§ < 0 then Nq > 0 q is time - like (ii) Ng > 0 then Nq < 0 q is space - like (iii) Ng = 0 then Nq = 0 <? is null Figure 3.2 We note that if Fap is a simple bivector, that is, 1 H — 7^(PQC - QPC) where p* = pc, q* = qc then H0 = ^(Pqc-qpc) Thus, using results derived earlier: Similarly F%i?~l-\pcH + H*pc) = ±\pc(pqc - qpc) + (pcq - qcp)pc] = 0 = \ faW - qpc) + (pcq - qcP)qc] = o
146 We interpret these results as showing that every p, q (defining the 2-flat characterising S) is perpendiular to every vector in the 2-flat characterising §®. If TVg / 0 then the only vector z common to §,§® satisfies (using (Fap — iF®q)z^ = 0) zcS = 0 .-. NzN§ = 0 .'. Nz = o implying z is the zero vector as S_1 exists. However, if N§ = 0 then, since TVg = NpNq and (by choice Np < 0) we must have Nq = 0. That is, q is a null vector. Thus if if is a null simple bivector (corresponding to Fap satisfying FapFaP = 0 and F®pFaP = 0) we can always write H in the form H = pqc Np < 0, Nq = 0 in which p, q are orthogonal {S(pqc) = 0). Similarly H® is also a simple null bivector and so H® = rqc NR<0, Nq = 0 and, by choice, p, r are orthogonal and of the same magnitude. In 4-vector terms we can write Fap = PaQp ~ Ppqa F®p = raqp - rpqa 3.7 Null Tetrad for Space-time It turns out to be of great advantage [16] to replace the tetrad e0 = 1, e\ = n, e2 = ij, e3 = ik by a null tetrad: Prom henceforth all tetrad components will be in brackets. The metric appropriate to this choice of basis is (using the same inner product rule): h(-»)=ft(«).fcW=^feWftW
147 or, equivalently We easily find h(ab) h(ab) = < h(a))ft(6) > = 5(/j(a)(hW)C) det(/i(a6)) = +1 0 10 0 10 0 0 0 0 0-1 0 0-1 0 The inverse matrix h^) is h(ab) - 0 10 0 10 0 0 0 0 0-1 0 0-1 0 Basic Formulae The following algebraic results are easily obtained: Vy/2hW 0 y/2h<V 0 h^h^ = h{a)h{b)c 0 y/2hW 0 y/2h& 0 y/2h<V 0 yfthW ly/2h& 0 yfthW 0 . " 0 y/2h<V -yfthW 0 " y/2hW 0 0 -V2h<3> y/2h<V 0 0 -V2h<°> . 0 y/2h& -y/2hW 0 . Clearly, since Nh(a) = 0 a = 0,1,2,3 this is a null tetrad. The tetrad components of any tensor Hapmmml are #(a&...c) and defined by Ht (ab...c) #a0...7/i?a)/if6) • ../l/( (a)"(6)"-""(c) The dual basis, /i,0x is defined by h(a) = ^(a6)/L(6) SO and therefore We find: 1(0) Z>(1), h(1) = />(0), h,2) = -h{3\ h(3) = -fc<2> (2) 1(3) % = rTfih{ah)hf 1 1 i _ i ^'"72' i_ h?o) = (-*=,-*=, 0,0) fc«} = (-)=,--^,0,0) hf2) = (0,0,^,-^) fcw = (0'°'^^)
148 The basis vectors h^ a = 0,1,2,3 are, of course, not unique. The possible sets of basis vectors are arranged into two classes - those for which the direction of h^ is fixed and those for which the direction of h^ is not fixed. We will demand, in either case, that the metric with respect to the tetrad is unchanged and that, for every tetrad (h^)*c = h^3\ These transformations will then correspond to Lorentz transformations. Let us first consider a new tetrad, denoted by a prime, in which the direction of h^ is held fixed. With this constraint in mind h'(°) = c*h(°) keR This preserves that hf^ is future-pointing if h^ is. Now let h,(1>=a0/i(0) + a1/i(1)+a2/i(2> + a3/i(3) ^(2) = 6o/l(o) + 6l/i(D + 62/l(2) + 63^(3) h/(3)=Cdh(0)+C1h(1)+C2/l(2)+C3/l(3) since (h^)*c = h^ and (h^)*c = h^K The following results are easily obtained /i/(01) = 1 implying axek = 1 .'. a\ = e~k ti{02) = 0 implying hek = 0 .'. &i = 0 /i/(22) = 0 implying b2b3 = 0 Here we choose 63 = 0 (in order that we can recover the identity transformation). Also j^/(ii) _ q implying a0ai - a2a3 = 0 ^'(12) _ q implying aib0 — a3b2 = 0 /i/(13) = 0 implying a^ - a2b*2 = 0 ft,(32) = -l implying b2b*2 = 1 -+ b2 = eic c € R Therefore we deduce a\ = e~k, a0 = eka2a3 b0 = eka3eic, bx = 0, b2 = eic, 63 = 0 b* = a2b\ek = a2e~icek = eka*3e~ic therefore a2 = a*3
149 We can finally write fc'<°> = c*h(o) h'M = ekaa*hW + e~kh^ + afc<2> + a*h& h'M=a*ekeichW+eichW h/(3) = (/l/(2))*c = aefce-lC/l(0) + c-icft(3) We see that if we insist on keeping the direction of h^ fixed then the freedom to choose a tetrad involves four real parameters. We can obtain the Lorentz transformation corresponding to each of these parameters. Consider the quaternion q = ah(0)+PhW a,(leR Prom the definition of h^°\h^ then, clearly q* = qc and Nq = 2a/3. Thus requiring Nq = 1 implies 2a(3 = 1. If x is a space-tme event then x1 — qxq represents a pure (boost) Lorentz transformation. We easily find that with this choice of q h'<°> = qh^q = [ahW+phW][hW(ahW+l3hM)] = 2a2h^ Therefore choosing 2a2 = ek to conform with the above analysis we find V2 P V2 Then, by a similar procedure h«D = c-fcft(D ft'P) = qhWq = 2aphV> = h™ h'<3> = ftO) So the choice q=±=(ekl2h^+e-kl2hV) is the Lorentz transformation which changes h^ into h'^ with a = c = 0. The Lorentz transformation with a = k = 0 is modelled by the choice q = ah,W + ph.™ a* =(3 2aj3 = l
150 (for then q* = q and so this corresponds to a spatial rotation). The supplementary conditions are obtained from the requirement that Nq = 1 Choose Specifically: Nq = qqc = 2a(3 and q* = q - {? = a a = 4=e"ic/2 /?=-Lic/2 eel V2 V2 fc'<°> = qch(o)q = 2aph(0) = h(0) tiM=<fhWq = 2a/3hW=hW h'W = qch™q = 2(32hW = cfch<2) h'W = qchWq = 2a2fc<3> = e~ich^ By choosing a quaternion q of the form q = ah^ + j5h^ + 7/1^) + <5/i^3^ and then finding a,/?,7,5 so that h'<°> = fcw A'<2>=a'/i<0>+h<2> h'<3> = afc<°> + /i(3) one can easily show that the transformation accounting for the h'^ tetrad above, with k = c = 0 is q=±(h^+h^+a*h^) Nq = l V2 q* = -^(hW+h^-ahW) ^ = -L(ft<i>+/»W-a*ft<3>) g*c = 4=(/i(0)+/i(1)+^(2)) v2 Then, using x' = q*cxq we find the h'^ tetrad above with k = c = 0. It is interesting to note that the closely related quaternion q: v2 leads to the tetrad h'<°> = q'ch^q = ft<°> + WftW + bhW + 6'fcW h/(2)=g.cft(2)g = /l(2)+6./l(l) /,'(3)=q«h(3)g = /l(3)+6fc(l) This choice satisfies
151 This (2 real-parameter) Lorentz transformation does not keep the direction of h^ fixed. This, together with the 4 real-parameter Lorentz tranformation obtained above, defined by parameters kyc,a represents the full 6 real-parameter Lorentz transformation. Bivector Representation in the Null Tetrad We first obtain the form taken by the alternating pseudo-tensor rja^s with respect to the null tetrad. By definition riapy6 = {-det(hap))* eQfa6 where ea^s is the completely anti-symmetric object with €0123 = +1- The tetrad components are obtained in the usual way: V(abcd) = (-det(fcaj9))!fCai97*/lfa)/lf6)ft(rc)'l(d) = (-det(/ia/3))*C(a6cd)det(/igl)) from determinant theory. But h(ab) = h*Ph<(a)h((b) and +1 = det(h(a6)) = det(/iQ/3)(det(/ifa)))2 /. (-det(hQ/9))idet(/ifa)) = ±i We choose the negative sign implying V(abcd) = -it(abcd) Also ^(0123) V — 77(1032) = 77(0123) = -^(0123) We easily obtain the relations between the covariant and contravariant tetrad components of P(a6) as: p(20)=_p(3i) pdO)=P(01) p(30) = _p(21) P^=-Pm P(13) = -P(02) P{23)=P{32) whilst the dual components of P(a6); defined by P® - ^ m P(cd) - * c p(cd)
152 whic i implies p® _ ^(oi) - p<8> _ ^(12) - -^(32) -^(21) P(02) - ~*P(02) p®s)=ipm P(03) - iP(03) pm = ^(oi) We note, in particular, if P^a^ is self-dual (satisfying P,®6* = 2P(a&)) the above relations necessarily imply P(oi) = P( (23) ^(02) = 0 P(13) = 0 Any complex bivector P(ab) nas two invariants /i, J2 associated with it: and 8 1 (ac)^(6d) - ~2 [^(01) + ^(23)1 ~ ^(02)^(13) - ^(03)^(12) h = lv{abCd)P(ab)P(cd) Therefore le(abcd)P(ab)P(cd) = -^[^(01)^(23) + ^(20)^(13) + ^(03)^(12)] h + %h = 2P(20)^(13) " g IP(01) - ^(23)]2 /l - t/2 = 2P(3o)P(12) " ^[P(01) + P(23)]2 Thus if the bivector is self-dual I\ +il2 = 0 and so there is only one complex invariant for a self-dual bivector: I\ — il2. Now if we write P(ab) ~ F(ab) ~ iF®ab) which is self-dual then the usual real invariants of Fap can be recovered from I\ - il2. To see this we remember that Fab — ^mab\Mjm ^0m — \±± )n or, in matrix form: Fap = 0 Al A2 A3 -Al 0 -B3 B2 -A2 B3 0 -Bl l-A3 -B2 £i 0. Thus, using the relations: F(ab) = Faph^h (*) tap ~ (06) ■ 0 -Bl Bi 0 B2 A3 -B3 -A2 ?aPhUh(b) -B2 -A3 0 Al -B31 A2 -Al 0.
153 leading to: ^(20) - ~2A<2 + 2Bs + 2^2 + 2^2 F® r(20) ^2 + ^3-^3 + ^2 Therefore P(20) = 0 as expected and, similarly, P(13) = 0. Also ^(oi) - ~M (23) -Z^i ^(oi) = -^i - *#i ^(23) - -Z^i and so P(23) — ^(oi) as expected. Continuing, we obtain F(3o) = ~9^2 + 2^3 ~ 2^3 ~ 2^2 (30) F® ^(30) ^2 + ^3 + ^3-^42 Therefore P(30) = -A2 - iB2 - iA3 + B3. Similarly P(i2) = A2 + 2 £2 - M3 + £3. We are now in a position to calculate I\ — il2: ■ 2P(30)^(12) - 2^(01) + ^(23))2 h-ih = 2[-A2 - iB2 - iA3 + B3][A2 + i£2 - M3 + £3] -2[A1+z51]2 = 2[B_- B_- A- A-2iA- B] as predicted We have seen that a complex self-dual bivector P(a6) has only three independent complex components P(3o)> P(u) and (P(oi) +^(23)) and, as such, they can be described by a single complexified quaternion (as can any real bivector). The set of all bivectors B_—iA form a three-dimensional complex vector space. Let Y^ a = 1,2,3 be a basis for this space. An immediate question is what basis bivectors should we choose? Well there are some natural candidates which arise from the basic null tetrad /i(a) a = 0,1,2,3 that we have already chosen. As we have noted earlier: 0 V2h<°> -y/2h& 0- y/2hW 0 V2h& o 0 y/2h& Thus two bivectors are immediately suggested h(0)h(2)c _ _^(2) h(a)(h(b)y = 0 -y/2h& 0 -y/2h<V -V2hM o and h^h^c = y/2h& which we label as Y^\ Y^ respectively. A third is constructed by combining elements from the above table: We choose y(3)= 1(^(3)0^(1)^(0)0) This, and indeed the other two are suggested by the terms in the complex invariant
154 h + ih = 2P(02)P(3i) - \{Pf2Z) + P(W))- We note that Y^ can be expressed in different ways: from the table above y(3) = l-(h^hWc _ ^(3)^(2)0) or y(3) = * (fcWfcWc _ h(0)h(l)C) from either of which it is obvious that Y^3) is a simple self-dual bivector, as are Y^, Y^2\ Thus to recap: y(D = ftWhW" = -yfthV) y(2) = ft(3)h(«c = ^P) y(3) = 1(^)^)0 + ^DftWc) = ^(_ft(0) + />(!)) z z are the three basis vectors chosen to span the complex 3-D vector space of quaternions H = B_ — iA, A,B_eR3. The metric of this space is found in the usual way: kab = < y(a),y(6) > = £(y(a)y(6)c) "0 2 0" 2 0 0 0 0-1 then ^ab — o \ o -2 0 0 0 0 -1 Thus at any space-time event any complex self-dual bivector P(a(,) (and hence any real bivector) can be represented in one-to-one fashion by a vector of C3 = V2[-PM2) + P2h{3) + i(-/>(0) + h^)} therefore also Bx - iAi -iPs -Pi(y-fc) + P2(tj + fc)-P3t SO P3 = Ai + tBi = - (P(32) + P(10)) B2 - iA2 = -iPi + iP2 B3 - iA3 = Pl + P2 which together imply P2 = --[B2-iA2+iB3 + A3] = -[-A2-iB2 + B3-iA3] = P(30) Pi = \\B2-iA2-iB3-A3] = -\A2 + iB2 + B3 - iA3] = P (12)
155 Also the invariants of Fap (i.e. FapFaP and F%Fal5 are the real and imaginary parts of PapPa@ (where Pap = Fap — iF®„) and are conveniently expressed by kabPaPB = 4PXP2 - Pi Of course, Y^ a = 1,2,3 are also complex self-dual bivectors. By writing P(a&) = PaY^ with any two of Pa a = 1,2,3 taken as zero and utilising the relations Pl=P(12) P2=P(30) P3 = \(P(32)+P(10)) we easily see that the bivectors representing Y^ a = 1,2,3 are (where the square brackets represent anti-symmetrization /^j = -{lab — ha)-) y(a6) - Zd[adb] y(a6) - 2d[A] y(a6) - d[adb] + d[eA] We can therefore write, for any complex self-dual bivector P^ P(ab) = P(12)Y{ab) + ^(30)V(a26) + (P(23) + ^(01))^) The general tensor components of Y/J m = 1,2,3 follow the usual prescription: Ja/3 - X{ab)aoc ap If Pa/? > Qa/? are any two complex self-dual bivectors then we can write p n _ p v(m) n n v(m) ■"a/j — rmIap WaP — Vm1^ Thus, for example; But, as is easy to verify: yMy(n)^ = r • p^Qaf) = kmnPmQn In particular if Pap = Qap = Fap - iF®p where Fap is real then PapP«P = 2[Fa^ - iF®pF°P] = kmnPmPn Thus Fap is a null bivector (when its two invariants vanish) if kmnPmPn = 0; that is if Pm is a null vector in C3.
156 Other Formulae Involving Y^ Straightforward calculations confirm that y(m)y(r0c _ umn _i_ 2fmnPY Y{m)y(n) _ _fcmn _ 2emnpY( rfYffY™ = kmnrtai + 2em^Y(p)ai Effect of Lorentz transformations on the Bivector Basis in C3 As we saw earlier the Lorentz transformations can be split into those which alter the direction of h^ and those which do not. Considering the first type with g= 1 (h(0)+h(l)+6/l(2)) V2 defining the 2-parameter Lorentz transformation implies h'<® = h<® +WhW + bh<n + b*hW h'W = hw h'W = h<n+FhW h'(3)=/,(3)+6/l(l) then we easily discover y(i) _> y'(i) = /j'(o)/j'(2)c = [ft<°> + 66* h™ + bh^ + b'h®][hWc + b*hWc] = h^h^c + b*hWhWc + b*h^h^c + (b*)2h^h^c = Y«-26*y(3) + (6*)2Y<2> y(2) _> y'(2) = h'{3)ti(l)c = F(2) y(3) ^ y/(3) = I(h'(2)A/(3)c + ^(1)^/(0)6 = y(3) _ 5*y(2) Now considering transformations which do preserve the direction of h^ namely: A"<°) = ekh^ h"W = ekaa*hW + e~kh^ + ah™ + a*/i<3> h"& = a*ekeich^ + eich^ h"W = aeke-ich^ + e~ich^
157 we find y(i) _+ y"(i) = ekeicYil) y(2) _ y//(2) = _2ac"icy(3) + aVe"lcy(1) + e-*ce"fcy(2) y(3) _^ y//(3) = _ae*y(i) + y(3) An easy calculation (utilising the table of products found for y(m)y(n)) confirms that < y'(m) y'(n) > __ £(y'(m)y'(n)c\ _ j^mn ^ y//(m) y//(n) ^ __ o/y//(m)y//(n)c\ _ iLmn We conclude that rotating the null tetrad via a Lorentz transformation is equivalent to rotating the complex basis Y^ in C3 since the scalar product (i.e. the metric) is left unaltered. Presumably, we can rotate axes in C (knowing that this is equivalent to Lorentz transformation) in order to simplify the form taken by a vector in C3. Now any complex self-dual bivector P can be expressed in the form p = p.y(0 = p'Y'^ = p'.'Y"^ that is PtfM + P2Y{2) + P3Y^ = p[[y(D _ 2b*Y^ + (&*)2y(2)] + p/y(2)+p^y(3)_6*y(2)] SO P1 = P[ P2 = (b*)2P[ - b*P^ + P'2 Ps = ~2b*P[ + Ps or, inverting P[ = P1 P'2 = P2 + (b*)2Pi + b*P3 Ps = P3 + 2b* Pl Also, from the second relation Ptf™ + P2y(2) + P3y(3) = P[\ekeicY^) P^{-2ae-icY^3) +a2eVicy<1> + e"ice-fcy(2)) + p^,(-aefcy(1) + y(3)) Therefore Pi = ekeicP[f + aVe"*/^' - aefcP^ P2 = e-ice~kP^ °2 "•" ^3
158 or, again inverting: p» = e-ke~icPl + a2eke-icP2 + ae~icP3 P>i = eicekP2 P% = P3 + 2aefcP2 Clearly, by suitable choice of b* we can always arrange matters so that P2 or P% vanish. However, if we choose b* to make P3 vanish then another Lorentz transformation (involving c,k,a) can be chosen to make it non-zero. Hence this is not invariant. If instead we choose b* so that P2' was t° vanish then no other Lorentz transformation (involving c,k,a) can be chosen to make it non-zero. Hence this is an invariant choice. 3.8 Classification of Complex Bivectors and of the Weyl Tensor We have seen that any complex self-dual bivector Pap can be represented in quaternionic form by the components Pa of a vector P in a complex 3-D space. (P = P^Y^). Now such a vector can take one of two basic forms :- it is either null or non-null NP = 0 or NP ^ 0 NP = 0 Null Complex Self-dual Bivectors Here kabPaPB = 2[2P1P2 - ip32] = 0 We can always choose coordinates (the value of 6*) to force P^ = 0. In this case there is only one such value. Thus we can also deduce, in this case, that P3 = 0 and so we can, by a complex rotation, find a basis in C3 in which P has only one component along Y^. Therefore, in this case: P = P1y<1> = -P1 >/2fc(2) from which we find hWcP = -P1y/2hWhW=0 Since there is only one value of b* then there is only one direction h^ for which this is true. This has the tensor equivalent (Fa0-iF®0)h^ = O - Fa(,hl°» = 0 and F%h^ = 0
159 Np ^ 0 Non-null Complex self-dual Bivectors Here 2P1P2 P3 / 0 so there exist two values of b* which can be chosen to make P'2 vanish. This implies that a complex rotation can be chosen so that, at the outset, P2 = 0. In this case P = P^1) + P3y(3) .'. h^cp = p3hM implying (h^cP)h^ = 0. But this in turn implies S[(h(0)cP)/i(0)] = 0 and V[{h^0)cP)h^} = 0 In terms of tensors the first of these relations imply (FQ0 - iF^)h^h^a = 0 [or S[{hWcP)h<®] = S{P) = 0] which is trivially satisfied. The second condition implies (Faff ~ iF%)h^hW - (F,0 - iF*)hWhW = 0 that is F0lahiy°W=O and F^h^ = 0 Classification of the Weyl Tensor Here we consider extending the approach outlined above, used to classify the electromagnetic field (the bivector Fa^), to the task of classifying the Weyl tensor (which is identical to the Riemann curvature tensor in vacuum). It is well known that the Weyl tensor is related to the Riemann tensor via: Cap-y6 = Rap-y6 ~ ~ fo/W^cry ~ VpfRa6 + Va-yRp6 ~ VadRp-y] -R[r}air}(36 ~ VadVp^} where Rap is the Ricci tensor Rap - Rai^ R - Raa It is easily checked that
160 (i) Ca^s = C[apfrs — Ca/?[7<5] (ii) C[a0-y]6 = 0 (iii) Ca/57<5 = C1sap We can contemplate taking the dual on either the first or second pair of indices: c ® = —n n auj Also Now where a/?7<5 — 2 'a^mn 7<5 _„ „ s^mncru; — AflaPmn'l'y6(TUJ^' = \va0mnVabCdria-rr,bSVcaV^Cmn''U - CPm + Cm + C%% + C%% - c^i+cm+cm+cm = -6%sr - c%i - cc + CC-CC + CC 811 = 6181-8161 Therefore _ £mPsi[qn]auj _. cmps^[pn]aw _ onps^[mq]auj _. cnqs^[mp]cruj — __n n firnn^i <ru> — <y 'Imcr'lnu;uab ^xy = —dnao'nbu - VbcrVa^Cxy™ = ~~Z\y/xyab ~ ^xyba\ = ~^abxy (In the above reduction we have used the property of the Weyl tensor that all of its contractions vanish).Now, following the direction taken in the analysis of bivectors we define ^abxy = ^abxy ~~ l^abxy
161 Since that is °a6xy ~ ^abxy ™ieil ^a&Xy — °a6xy abxy abxy and therefore ^abxy ~ ^abxy iK^abxy ~ ^abxy ^abxy ° a6xy ~~ °a6xy L^ abxy ~ ^abxy I" L^abxy - lUabxy r<+ ® _ /nr <g> _ -/nr® ® _ /^ ® \ »n , ° a6xy — °a6xy ^a6xy ~ °a6xy ^ *W6xy = ^a6xy Therefore C^ is self-dual in both index pairs. By treating each pair of indices in C*bcd as a separate entity it is clear that this can be represented in C3 by a complex 3x3 matrix and also ^abcd ~ ^rriniab I cd and since C+cd = C+a6 then L/mn-ra6 1cd ~ ^mn1 cd 1 af, — ^mn1^ Icd SO ^mn ^nm Also since ^wCjj,cd = 0 (all contractions of the Weyl tensor vanish) we obtain u - omn77 ra6 rcd = (smn[k Vac "I" 26 J(p)acj — C ]crnnr\ — ^mri™ 'lac .'. ^mnCmn = 0 or 4Ci2 = C33 This constraint, together with the symmetry condition shows that Cmn has 5 independent complex components which matches the 10 real components of the Weyl tensor. Now to characterise C*bcd we consider the eigenvalue problem for Cmn n pm _ \ p where P = PmY^ is a vector of C3. By analysing this problem we seek to characterise the Weyl tensor in a similar manner to that effected for the electromagnetic field.
162 If we look for null eigenvectors then CmnPmPn = 0 and kmnPmPn = 0 The second relation: AP\P2 - P2 = 0 or, equivalently, P1P2 - P32 = 0 is satisfied by choosing P1 = a, P2 = /32, P3 = a/3 a, /? G C. The first equation can then be expressed as (utilising the constraint C33 = 4Ci2) Cna4 + 2Ci3a3/3 + 6Ci2a2/?2 + 2C23a/?3 + C22/34 = 0 Now under the Lorentz transformation, characterised by the quaternion q: q = J-(hW+hW+bhW) V2 we found y(i) _+ y'(i) = y(i) _ 26*7(3) + (6*)2y<2) y(2) _^ y/(2) = y(2) y(3) __> y'<3) = y(3) _ fc*y(2) Therefore cmny(m)y(n) -+ C'1"1^ We easily find C\\ —* Cn Ci2 -> (fc*)2Cii-6*Cl3 + Cl2 C13 —► — 2b*Cn + C13 C22 - (b*)4^! - 2(6*)3C;3 + 6(6*)2C;2 - 2b*C23 + C^2 c23 - - 2(&*)3c;1 + 3(6*)2c;3 - 6&*c;2 + c^ c33 - 4(6*)2c;1-46*c;3 + 4c;2 where we have used C33 = 4C£2. Clearly, as in the elctromagnetic case, we can choose a Lorentz transformation — a value of b* which satisfies the 4th degree polynomial (to make C22 vanish). The multiplicity of the roots for &*, dictated by the values of C'mn can be used to classify Cmn. Now each value of 6* gives rise to a null vector h^. We note (since, when differentiated with respect to &*, the transformed expression for C22 becomes the transformed expression for C23) that if there is a double root for b* then C22 = 0 and C23 = 0
A similar argument shows that if there is a triple root for $b^*$ then

\[ C_{22} = 0 \qquad C_{23} = 0 \quad\text{and}\quad C_{12} = 0 \]

whilst a 4-times repeated root implies

\[ C_{22} = 0 \qquad C_{23} = 0 \qquad C_{12} = 0 \quad\text{and}\quad C_{13} = 0 \]

The following table describes the very well known classification:

Type     Vanishing Coefficients        Null Vectors
I        C22                           {1,1,1,1}
II, D    C22, C23                      {2,1,1}, {2,2}
III      C22, C23, C12                 {3,1}
N        C22, C23, C12, C13            {4}
O        all                           no preferred null vectors
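Chapter 4 Cayley Numbers

4.1 A Common Notation for Numbers

In this section we describe, in a single notation, the three types of number that we have met: the scalars $\mathbb{R}$, the complex numbers $\mathbb{C}$ and the quaternions $\mathbb{H}$. We consider a quaternion number $p$ to be a two-component object (suggested by the formalism of complex numbers and by the matrix formulation of quaternions [17]):

\[ p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \qquad \alpha, \beta \in \mathbb{C} \]

Equality between $p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix}$ and $q = \begin{bmatrix}\gamma\\ \delta\end{bmatrix}$ is defined by: $p = q$ if and only if $\alpha = \gamma$ and $\beta = \delta$.

We define the binary operation $\oplus$ through the statement:

\[ p \oplus q = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \oplus \begin{bmatrix}\gamma\\ \delta\end{bmatrix} = \begin{bmatrix}\alpha + \gamma\\ \beta + \delta\end{bmatrix} \]

We immediately see that there is a $\oplus$-identity, designated by $O$:

\[ O = \begin{bmatrix}0\\ 0\end{bmatrix} \quad\text{is such that}\quad p \oplus O = O \oplus p = p \]

Multiplication by a scalar $s \in \mathbb{R}$ is written $sp$, defined by

\[ sp = \begin{bmatrix}s\alpha\\ s\beta\end{bmatrix} \]

We also define the operation of conjugate:

\[ p^c = \begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} \]

This leads us to consider the scalar and vector parts of $p$:

\[ S(p) = \tfrac{1}{2}(p + p^c) \qquad V(p) = \tfrac{1}{2}(p - p^c) \]

It immediately follows from these definitions that if $p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix}$ then

\[ S(p) = \begin{bmatrix}S(\alpha)\\ 0\end{bmatrix} \qquad V(p) = \begin{bmatrix}V(\alpha)\\ \beta\end{bmatrix} \]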
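Finally we define the multiplication rule between elements $p$, $q$, written $p \circ q$:

\[ p \circ q = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \circ \begin{bmatrix}\gamma\\ \delta\end{bmatrix} = \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \]

The $\circ$-identity element $E$ is easily recognised:

\[ E = \begin{bmatrix}1\\ 0\end{bmatrix} \quad\text{is such that}\quad p \circ E = E \circ p = p \text{ for all } p \]

From this we can verify the associative law of multiplication $p \circ (q \circ r) = (p \circ q) \circ r$. Let

\[ p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \qquad q = \begin{bmatrix}\gamma\\ \delta\end{bmatrix} \qquad r = \begin{bmatrix}a\\ b\end{bmatrix} \]

then

\[ p \circ (q \circ r) = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \circ \begin{bmatrix}\gamma a - b\bar{\delta}\\ a\delta + \bar{\gamma}b\end{bmatrix} = \begin{bmatrix}\alpha(\gamma a - b\bar{\delta}) - (a\delta + \bar{\gamma}b)\bar{\beta}\\ (\gamma a - b\bar{\delta})\beta + \bar{\alpha}(a\delta + \bar{\gamma}b)\end{bmatrix} \]

\[ (p \circ q) \circ r = \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \circ \begin{bmatrix}a\\ b\end{bmatrix} = \begin{bmatrix}(\alpha\gamma - \delta\bar{\beta})a - b(\bar{\beta}\bar{\gamma} + \bar{\delta}\alpha)\\ a(\gamma\beta + \bar{\alpha}\delta) + (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})b\end{bmatrix} \]

I have kept the separate elements in correct order as we shall meet both expressions again later in situations in which associativity cannot be assumed. Since $\alpha, \beta, \gamma, \delta, a, b \in \mathbb{C}$ it is transparently clear that $p \circ (q \circ r) = (p \circ q) \circ r$.

However, generally $p \circ q \neq q \circ p$ and so the commutative rule fails in general. The conditions under which commutativity holds are easily derived. Using $p$, $q$ as defined above:

\[ p \circ q = \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \qquad q \circ p = \begin{bmatrix}\gamma\alpha - \beta\bar{\delta}\\ \alpha\delta + \bar{\gamma}\beta\end{bmatrix} \]

For these to be equal we require

\[ \delta\bar{\beta} = \beta\bar{\delta} \qquad \gamma\beta + \bar{\alpha}\delta = \alpha\delta + \bar{\gamma}\beta \]

The first equation requires $V(\delta\bar{\beta}) = 0$ and the second requires $V(\gamma)\beta = V(\alpha)\delta$. From the second: if $V(\gamma) \neq 0$ then $\beta = k\delta$, $k \in \mathbb{R}$, and $V(\alpha) = kV(\gamma)$. The first equation is then satisfied identically. If, however, $V(\gamma) = 0$ then either $V(\alpha) = 0$ (implying $\beta = s\delta$, $s \in \mathbb{R}$) or $\delta = 0$. Grouping these results together we see that multiplication is commutative only if either

(i) $V(p) = kV(q)$, $k \in \mathbb{R}$; that is, the vector parts are parallel.
(ii) either $p$ or $q$ has the form $kE$.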
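The multiplication rule and the commutativity conditions above can be tried out directly with Python's built-in complex numbers. This is a minimal sketch, assuming a quaternion $p$ is stored as a pair `(alpha, beta)` of complex numbers; the function names `mul` and `conj` and the sample values are illustrative only.

```python
def conj(p):
    """Conjugate of a pair: [alpha; beta]^c = [conj(alpha); -beta]."""
    a, b = p
    return (a.conjugate(), -b)


def mul(p, q):
    """p o q = [alpha*gamma - delta*conj(beta); gamma*beta + conj(alpha)*delta]."""
    (a, b), (c, d) = p, q
    return (a * c - d * b.conjugate(), c * b + a.conjugate() * d)


E = (1 + 0j, 0j)                     # the o-identity
p = (1 + 2j, 3 - 1j)
q = (2 - 1j, 1 + 4j)
r = (-1 + 1j, 2 + 2j)
s = (2 * p[0] + 5, 2 * p[1])         # s = 2p + 5E, so V(s) is parallel to V(p)

assoc_ok = mul(p, mul(q, r)) == mul(mul(p, q), r)   # associativity holds
commutes = mul(p, q) == mul(q, p)                   # fails for generic p, q
parallel_commutes = mul(p, s) == mul(s, p)          # condition (i) above
norm_p = mul(p, conj(p))                            # [alpha*conj(alpha) + beta*conj(beta); 0]

print(assoc_ok, commutes, parallel_commutes, norm_p)
```

The components are small Gaussian integers, so the floating-point comparisons here are exact; for general inputs a tolerance would be needed.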
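The norm $N_p$ of $p$ is defined in the usual way:

\[ N_p = p \circ p^c \]

Using our definition of multiplication it is easily observed that

\[ N_p = p \circ p^c = \begin{bmatrix}\alpha\bar{\alpha} + \beta\bar{\beta}\\ 0\end{bmatrix} \]

The conjugate (denoted by 'bar' for individual elements) needs to be interpreted appropriate to the type of number to which it refers. We can now define an inverse element $p^{-1}$ to every element $p$ with non-zero norm, viz:

\[ \text{if } p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \text{ then } p^{-1} = \frac{1}{N_p}\,p^c = \frac{1}{N_p}\begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} \]

This satisfies

\[ p^{-1} \circ p = p \circ p^{-1} = E \]

The reader can easily verify that the scalar numbers have the form:

\[ \begin{bmatrix}a\\ 0\end{bmatrix} \quad\text{where } a \in \mathbb{R} \]

Without any loss whatsoever such a number can simply be replaced by $a$ in any algebraic expression. The complex numbers are obtained when:

\[ \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \quad\text{where } \alpha, \beta \in \mathbb{R} \]

Also, as is easily verified, objects of the form

\[ \begin{bmatrix}\alpha\\ 0\end{bmatrix} \qquad \alpha \in \mathbb{C} \]

are isomorphic to the complex numbers. As we have stated above the quaternions are obtained if:

\[ p = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \quad\text{where } \alpha, \beta \in \mathbb{C} \]

To verify this we must exhibit an isomorphic map from these objects onto the quaternions. Such a map is

\[ \phi : \mathbb{C} \times \mathbb{C} \to \mathbb{H} \qquad \phi\left\{\begin{bmatrix}\alpha\\ \beta\end{bmatrix}\right\} \mapsto a_0 + a_1 i + a_2 j + a_3 k \]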
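in which $\alpha = a_0 + a_3\sqrt{-1}$, $\beta = a_2 + a_1\sqrt{-1}$. This map is a homomorphism since if $\gamma = b_0 + b_3\sqrt{-1}$, $\delta = b_2 + b_1\sqrt{-1}$ then

\[
\begin{aligned}
\phi\left\{\begin{bmatrix}\alpha + \gamma\\ \beta + \delta\end{bmatrix}\right\} &= \phi\left\{\begin{bmatrix}a_0 + b_0 + (a_3 + b_3)\sqrt{-1}\\ a_2 + b_2 + (a_1 + b_1)\sqrt{-1}\end{bmatrix}\right\} \\
&= a_0 + b_0 + (a_1 + b_1)i + (a_2 + b_2)j + (a_3 + b_3)k \\
&= \phi\left\{\begin{bmatrix}\alpha\\ \beta\end{bmatrix}\right\} + \phi\left\{\begin{bmatrix}\gamma\\ \delta\end{bmatrix}\right\}
\end{aligned}
\]

and

\[
\begin{aligned}
\phi\left\{\begin{bmatrix}\alpha\\ \beta\end{bmatrix} \circ \begin{bmatrix}\gamma\\ \delta\end{bmatrix}\right\} &= \phi\left\{\begin{bmatrix}a_0b_0 - a_3b_3 - a_2b_2 - a_1b_1 + \sqrt{-1}\,(a_0b_3 + a_3b_0 - a_2b_1 + b_2a_1)\\ a_2b_0 - a_1b_3 + a_0b_2 + a_3b_1 + \sqrt{-1}\,(a_1b_0 + a_2b_3 + a_0b_1 - a_3b_2)\end{bmatrix}\right\} \\
&= a_0b_0 - a_3b_3 - a_2b_2 - a_1b_1 + (a_1b_0 + a_2b_3 + a_0b_1 - a_3b_2)i \\
&\quad + (a_2b_0 - a_1b_3 + a_0b_2 + a_3b_1)j + (a_0b_3 + a_3b_0 - a_2b_1 + b_2a_1)k \\
&= \phi\left\{\begin{bmatrix}\alpha\\ \beta\end{bmatrix}\right\}\phi\left\{\begin{bmatrix}\gamma\\ \delta\end{bmatrix}\right\}
\end{aligned}
\]

The map is clearly one-to-one and onto and so this is an isomorphism.

4.2 Cayley Numbers

The obvious question that we can now ask concerns the possible existence of other numbers (other than scalars, complex and quaternion numbers) which can be expressed in the form described here. The next class of number that we might consider is

\[ X = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \quad\text{where } \alpha, \beta \in \mathbb{H} \]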
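The homomorphism property of the map $\phi$ from Section 4.1 can be checked numerically against the ordinary Hamilton product. The sketch below assumes a pair of Python complex numbers for $[\alpha; \beta]$ and a 4-tuple `(a0, a1, a2, a3)` for $a_0 + a_1i + a_2j + a_3k$; the names `pair_mul`, `phi` and `hamilton` are illustrative.

```python
def pair_mul(p, q):
    """Pair product: [alpha*gamma - delta*conj(beta); gamma*beta + conj(alpha)*delta]."""
    (a, b), (c, d) = p, q
    return (a * c - d * b.conjugate(), c * b + a.conjugate() * d)


def phi(p):
    """Map [alpha; beta] to (a0, a1, a2, a3), with alpha = a0 + a3*I and beta = a2 + a1*I."""
    alpha, beta = p
    return (alpha.real, beta.imag, beta.real, alpha.imag)


def hamilton(x, y):
    """Standard Hamilton product of quaternions given as 4-tuples."""
    a0, a1, a2, a3 = x
    b0, b1, b2, b3 = y
    return (a0 * b0 - a1 * b1 - a2 * b2 - a3 * b3,
            a0 * b1 + a1 * b0 + a2 * b3 - a3 * b2,
            a0 * b2 - a1 * b3 + a2 * b0 + a3 * b1,
            a0 * b3 + a1 * b2 - a2 * b1 + a3 * b0)


p = (1 + 2j, 3 - 1j)    # phi(p) = 1 - i + 3j + 2k
q = (2 - 1j, 1 + 4j)    # phi(q) = 2 + 4i + j - k

lhs = phi(pair_mul(p, q))
rhs = hamilton(phi(p), phi(q))
print(lhs, rhs)
```

Both sides agree for this sample pair, which is consistent with (though of course not a proof of) the isomorphism claimed in the text.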
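These new numbers are called Cayley numbers, denoted by $\mathbb{K}$, and we now explore some of their properties. The definitions of conjugate, scalar and vector parts, and norm are as defined above:

\[ X^c = \begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} \qquad S(X) = \tfrac{1}{2}(X + X^c) = \begin{bmatrix}S(\alpha)\\ 0\end{bmatrix} \qquad V(X) = \begin{bmatrix}V(\alpha)\\ \beta\end{bmatrix} \qquad N_X = X \circ X^c \]

Note that $X = S(X) + V(X)$ and so $S(X) \in \mathbb{R}$ and $N_X \in \mathbb{R}^+$. One can now deduce that numbers defined with these basic operations satisfy

\[ S(X \circ Y) = S(Y \circ X) \qquad (X \circ Y)^c = Y^c \circ X^c \qquad\text{and}\qquad N_{X \circ Y} = N_X \circ N_Y \]

The first is relatively easy to verify and depends directly on the corresponding property for quaternions, $S(pq) = S(qp)$. Verification of the second identity is also straightforward (here we must be careful with the positional order of elements as we know that quaternion multiplication is not commutative):

\[ Y^c \circ X^c = \begin{bmatrix}\bar{\gamma}\\ -\delta\end{bmatrix} \circ \begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} = \begin{bmatrix}\bar{\gamma}\bar{\alpha} - \beta\bar{\delta}\\ -\bar{\alpha}\delta - \gamma\beta\end{bmatrix} = (X \circ Y)^c \]

The third property, involving the norm, $N_{X \circ Y} = N_X \circ N_Y$, is quite remarkable and we also verify it here. Using $X$, $Y$ as above then

\[ N_X \circ N_Y = \begin{bmatrix}\alpha\bar{\alpha} + \beta\bar{\beta}\\ 0\end{bmatrix} \circ \begin{bmatrix}\gamma\bar{\gamma} + \delta\bar{\delta}\\ 0\end{bmatrix} = \begin{bmatrix}(\alpha\bar{\alpha} + \beta\bar{\beta})(\gamma\bar{\gamma} + \delta\bar{\delta})\\ 0\end{bmatrix} \]

Now

\[ X \circ Y = \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \]

Thus (using the property of the quaternion conjugate, $\overline{pq} = \bar{q}\,\bar{p}$)

\[
\begin{aligned}
N_{X \circ Y} &= \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \circ \begin{bmatrix}\bar{\gamma}\bar{\alpha} - \beta\bar{\delta}\\ -\gamma\beta - \bar{\alpha}\delta\end{bmatrix} \\
&= \begin{bmatrix}(\alpha\gamma - \delta\bar{\beta})(\bar{\gamma}\bar{\alpha} - \beta\bar{\delta}) + (\gamma\beta + \bar{\alpha}\delta)(\bar{\beta}\bar{\gamma} + \bar{\delta}\alpha)\\ (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})(\gamma\beta + \bar{\alpha}\delta) - (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})(\gamma\beta + \bar{\alpha}\delta)\end{bmatrix} \\
&= \begin{bmatrix}\alpha\bar{\alpha}\gamma\bar{\gamma} + \delta\bar{\delta}\beta\bar{\beta} + \gamma\bar{\gamma}\beta\bar{\beta} + \alpha\bar{\alpha}\delta\bar{\delta} - \delta\bar{\beta}\bar{\gamma}\bar{\alpha} - \alpha\gamma\beta\bar{\delta} + \gamma\beta\bar{\delta}\alpha + \bar{\alpha}\delta\bar{\beta}\bar{\gamma}\\ 0\end{bmatrix} \\
&= \begin{bmatrix}\alpha\bar{\alpha}\gamma\bar{\gamma} + \delta\bar{\delta}\beta\bar{\beta} + \gamma\bar{\gamma}\beta\bar{\beta} + \alpha\bar{\alpha}\delta\bar{\delta}\\ 0\end{bmatrix} = N_X \circ N_Y
\end{aligned}
\]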
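The norm identity $N_{X \circ Y} = N_X \circ N_Y$ can also be spot-checked by machine. The sketch below applies the same pair multiplication rule recursively, so that a complex number is a pair of reals, a quaternion a pair of complex numbers and a Cayley number a pair of quaternions. The nested-tuple representation and all function names are assumptions made for illustration; integer components keep the arithmetic exact.

```python
def neg(x):
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x


def add(x, y):
    if isinstance(x, tuple):
        return (add(x[0], y[0]), add(x[1], y[1]))
    return x + y


def conj(x):
    """[a; b]^c = [conj(a); -b]; a real number is its own conjugate."""
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x


def mul(x, y):
    """[a; b] o [c; d] = [a c - d conj(b); c b + conj(a) d], applied recursively."""
    if not isinstance(x, tuple):
        return x * y
    (a, b), (c, d) = x, y
    first = add(mul(a, c), neg(mul(d, conj(b))))
    second = add(mul(c, b), mul(conj(a), d))
    return (first, second)


def flatten(x):
    return flatten(x[0]) + flatten(x[1]) if isinstance(x, tuple) else [x]


def norm(x):
    """Leading real component of x o x^c."""
    return flatten(mul(x, conj(x)))[0]


def quat(a0, a1, a2, a3):
    """a0 + a1 i + a2 j + a3 k stored as [a0 + a3 I; a2 + a1 I], as in Section 4.1."""
    return ((a0, a3), (a2, a1))


X = (quat(1, 2, 0, -1), quat(3, 0, 1, 2))
Y = (quat(2, -1, 1, 0), quat(0, 1, -2, 1))

print(norm(X), norm(Y), norm(mul(X, Y)))
```

Because the rule is applied at every level, the same `mul` also multiplies complex numbers and quaternions, which mirrors the "common notation" idea of Section 4.1.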
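This result depends upon showing that, for any four quaternions $\alpha, \beta, \gamma, \delta$,

\[ -\delta\bar{\beta}\bar{\gamma}\bar{\alpha} - \alpha\gamma\beta\bar{\delta} + \gamma\beta\bar{\delta}\alpha + \bar{\alpha}\delta\bar{\beta}\bar{\gamma} \]

vanishes. (Of course for complex numbers it is trivially satisfied.) To show that this is zero we note that it can be expressed in the form:

\[ -p\bar{\alpha} - \alpha\bar{p} + \bar{p}\alpha + \bar{\alpha}p = -2S(p\bar{\alpha}) + 2S(\bar{\alpha}p) = 0 \]

where we have taken $p = \delta\bar{\beta}\bar{\gamma}$ and we have employed properties of the conjugate applied to products of quaternions and the identity $S(pq) = S(qp)$ for any two quaternions.

Cayley numbers do not commute in general. To see when commutativity occurs consider:

\[ X = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \qquad Y = \begin{bmatrix}\gamma\\ \delta\end{bmatrix} \]

then

\[ X \circ Y = \begin{bmatrix}\alpha\gamma - \delta\bar{\beta}\\ \gamma\beta + \bar{\alpha}\delta\end{bmatrix} \qquad Y \circ X = \begin{bmatrix}\gamma\alpha - \beta\bar{\delta}\\ \alpha\delta + \bar{\gamma}\beta\end{bmatrix} \]

and so, for commuting products, we require:

\[ \alpha\gamma - \delta\bar{\beta} = \gamma\alpha - \beta\bar{\delta} \qquad \gamma\beta + \bar{\alpha}\delta = \alpha\delta + \bar{\gamma}\beta \]

The second of these equations demands $V(\gamma)\beta = V(\alpha)\delta$. We first consider the case that $V(\gamma) \neq 0$; then

\[ \beta = -\frac{1}{N_{V(\gamma)}}V(\gamma)V(\alpha)\delta \quad\text{and}\quad \bar{\beta} = -\frac{1}{N_{V(\gamma)}}\bar{\delta}\,V(\alpha)V(\gamma) \]

Therefore, since the quaternion product is associative,

\[ \delta\bar{\beta} = -\frac{N_\delta}{N_{V(\gamma)}}V(\alpha)V(\gamma) \qquad \beta\bar{\delta} = -\frac{N_\delta}{N_{V(\gamma)}}V(\gamma)V(\alpha) \]

and so

\[ \alpha\gamma - \gamma\alpha = \delta\bar{\beta} - \beta\bar{\delta} = -\frac{N_\delta}{N_{V(\gamma)}}\left[V(\alpha)V(\gamma) - V(\gamma)V(\alpha)\right] \]

That is

\[ \left[V(\alpha)V(\gamma) - V(\gamma)V(\alpha)\right]\left[1 + \frac{N_\delta}{N_{V(\gamma)}}\right] = 0 \]

from which we deduce

\[ \left[V(\alpha)V(\gamma) - V(\gamma)V(\alpha)\right] = 0 \]

which is only true if $V(\alpha)$ and $V(\gamma)$ are parallel, that is, $V(\gamma) = kV(\alpha)$, $k \in \mathbb{R}$.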
Then

\[ \beta = -\frac{k}{N_{V(\gamma)}}V(\alpha)V(\alpha)\delta = k\,\frac{N_{V(\alpha)}}{N_{V(\gamma)}}\,\delta \]

But $N_{V(\gamma)} = k^2N_{V(\alpha)}$ and therefore $\beta = \delta/k$. So $X$, $Y$ commute only if $V(\gamma) = kV(\alpha)$ and $\delta = k\beta$, which is equivalent to the single statement $V(Y) = kV(X)$.

The other case from that described above occurs when $V(\gamma) = 0$. Here we immediately deduce that $\gamma$ is real and $V(\alpha)\delta = 0$. From the second condition either (i) $V(\alpha) = 0$, implying $\alpha$ is a real number, or (ii) $\delta = 0$. Taking (i) first: if both $\gamma$ and $\alpha$ are real numbers we deduce that $\delta\bar{\beta}$ is also a real number. This implies that $\delta = k\beta$, $k \in \mathbb{R}$, which again leads to the constraint that $V(Y)$ is proportional to $V(X)$. If the statement in (ii) holds true then $Y$ is a real number, which of course commutes with all Cayley numbers.

From these considerations we conclude that Cayley numbers commute if and only if their vector parts are proportional (that is, their vector parts are 'parallel'; see later for the justification for using the word parallel in conjunction with Cayley numbers).

Continuing with this theme we could also ask what Cayley number, if any, commutes with every other Cayley number? If $Y$ is such a number then for every $X$ we demand $X \circ Y = Y \circ X$. From the work above we require, for every $\alpha$, $\beta$,

\[ \alpha\gamma - \gamma\alpha = 2V(\delta\bar{\beta}) \qquad V(\gamma)\beta = V(\alpha)\delta \]

For the second equation of this set to be satisfied for all $\alpha$, $\beta$ we deduce that $V(\gamma) = 0$ and $\delta = 0$. The first equation is then satisfied identically since $\gamma$ is a real number. Therefore we conclude that the only Cayley number which commutes with every Cayley number is of the form:

\[ Y = \begin{bmatrix}\gamma\\ 0\end{bmatrix} \qquad \gamma \in \mathbb{R} \]

That is, $Y$ is a real number.

In the transition from complex to quaternion numbers commutativity of multiplication was lost. For Cayley numbers the associative law of multiplication is lost. For associativity, as we outlined above for quaternion numbers (considered as an ordered pair of complex numbers), we require:

\[ \begin{bmatrix}\alpha(\gamma a - b\bar{\delta}) - (a\delta + \bar{\gamma}b)\bar{\beta}\\ (\gamma a - b\bar{\delta})\beta + \bar{\alpha}(a\delta + \bar{\gamma}b)\end{bmatrix} = \begin{bmatrix}(\alpha\gamma - \delta\bar{\beta})a - b(\bar{\beta}\bar{\gamma} + \bar{\delta}\alpha)\\ a(\gamma\beta + \bar{\alpha}\delta) + (\bar{\gamma}\bar{\alpha} - \beta\bar{\delta})b\end{bmatrix} \]

which necessarily implies (amongst other constraints):

\[ \alpha(b\bar{\delta}) = (b\bar{\delta})\alpha \]
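which is generally false if the elements are quaternions. The associative rule for Cayley numbers fails because the quaternion product is not commutative. However, we consider below some special cases for which the associative rule is valid.

We can still construct an inverse. If $X$ is a Cayley number, with non-zero norm, then we define:

\[ X^{-1} = \frac{1}{N_X}X^c \]

This satisfies:

\[ X^{-1} \circ X = \frac{1}{N_X}X^c \circ X = E = X \circ \left(\frac{1}{N_X}X^c\right) = X \circ X^{-1} \]

Now although Cayley algebra is not associative the following theorem shows that in certain special cases associativity holds.

Theorem 4.1 If $X$, $Y$ are Cayley numbers then $(Y \circ X^{-1}) \circ X = Y$ and $X^{-1} \circ (X \circ Y) = Y$.

Proof. Let

\[ X = \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \qquad Y = \begin{bmatrix}\gamma\\ \delta\end{bmatrix} \qquad\text{then}\qquad X^{-1} = \frac{1}{N_X}\begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} \]

and

\[ Y \circ X^{-1} = \frac{1}{N_X}\begin{bmatrix}\gamma\\ \delta\end{bmatrix} \circ \begin{bmatrix}\bar{\alpha}\\ -\beta\end{bmatrix} = \frac{1}{N_X}\begin{bmatrix}\gamma\bar{\alpha} + \beta\bar{\delta}\\ \bar{\alpha}\delta - \bar{\gamma}\beta\end{bmatrix} \]

\[
\begin{aligned}
(Y \circ X^{-1}) \circ X &= \frac{1}{N_X}\begin{bmatrix}\gamma\bar{\alpha} + \beta\bar{\delta}\\ \bar{\alpha}\delta - \bar{\gamma}\beta\end{bmatrix} \circ \begin{bmatrix}\alpha\\ \beta\end{bmatrix} \\
&= \frac{1}{N_X}\begin{bmatrix}(\gamma\bar{\alpha} + \beta\bar{\delta})\alpha - \beta(\bar{\delta}\alpha - \bar{\beta}\gamma)\\ \alpha(\bar{\alpha}\delta - \bar{\gamma}\beta) + (\alpha\bar{\gamma} + \delta\bar{\beta})\beta\end{bmatrix} \\
&= \frac{1}{N_X}\begin{bmatrix}\gamma(\alpha\bar{\alpha} + \beta\bar{\beta})\\ \delta(\alpha\bar{\alpha} + \beta\bar{\beta})\end{bmatrix} = \begin{bmatrix}\gamma\\ \delta\end{bmatrix} = Y
\end{aligned}
\]

The proof that $X^{-1} \circ (X \circ Y) = Y$ is almost identical and is therefore omitted. Generally $X \circ (Y \circ X^{-1}) \neq Y$.

Inner Product for Cayley Numbers

Under the operation of addition $\oplus$ the elements of $\mathbb{K}$ form an abelian group and then, with the rule for multiplying such an element by a scalar $s \in \mathbb{R}$, the Cayley numbers constitute a linear space. We can give an inner product structure to this space if we define:

\[ \langle X, Y\rangle = S(X \circ Y^c) \]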
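Theorem 4.1 and the failure of associativity noted above can both be illustrated with a small computation. The sketch below stores a Cayley number as a nested pair (pair of quaternions, each a pair of complex numbers, each a pair of integers) and applies the multiplication rule recursively; this representation and all function names are assumptions for illustration. To stay in exact integer arithmetic it checks the equivalent statements $(Y \circ X^c) \circ X = N_X Y$ and $X^c \circ (X \circ Y) = N_X Y$ rather than dividing by $N_X$.

```python
def neg(x):
    return (neg(x[0]), neg(x[1])) if isinstance(x, tuple) else -x


def add(x, y):
    if isinstance(x, tuple):
        return (add(x[0], y[0]), add(x[1], y[1]))
    return x + y


def scale(s, x):
    return (scale(s, x[0]), scale(s, x[1])) if isinstance(x, tuple) else s * x


def conj(x):
    return (conj(x[0]), neg(x[1])) if isinstance(x, tuple) else x


def mul(x, y):
    """[a; b] o [c; d] = [a c - d conj(b); c b + conj(a) d], applied recursively."""
    if not isinstance(x, tuple):
        return x * y
    (a, b), (c, d) = x, y
    return (add(mul(a, c), neg(mul(d, conj(b)))),
            add(mul(c, b), mul(conj(a), d)))


def lead(x):
    """Leading real component (the scalar part)."""
    return lead(x[0]) if isinstance(x, tuple) else x


def quat(a0, a1, a2, a3):
    return ((a0, a3), (a2, a1))


ZERO, ONE = quat(0, 0, 0, 0), quat(1, 0, 0, 0)
I, J = quat(0, 1, 0, 0), quat(0, 0, 1, 0)

# Theorem 4.1, cleared of the factor 1/N_X.
X = (quat(1, 2, 0, -1), quat(3, 0, 1, 2))
Y = (quat(2, -1, 1, 0), quat(0, 1, -2, 1))
N_X = lead(mul(X, conj(X)))
right_cancel = mul(mul(Y, conj(X)), X) == scale(N_X, Y)
left_cancel = mul(conj(X), mul(X, Y)) == scale(N_X, Y)

# Associativity fails for the units [i; 0], [j; 0], [0; 1].
U, V, W = (I, ZERO), (J, ZERO), (ZERO, ONE)
left_grouping = mul(mul(U, V), W)
right_grouping = mul(U, mul(V, W))

print(right_cancel, left_cancel, left_grouping == right_grouping)
```

For these three units the two groupings differ by a sign, $(U \circ V) \circ W = -\,U \circ (V \circ W)$, which is a concrete instance of the non-associativity discussed in the text.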
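Then, considering the four basic axioms for a valid inner product:

IP1: $\langle X, Y\rangle = S(X \circ Y^c) = S[(X \circ Y^c)^c] = S(Y \circ X^c) = \langle Y, X\rangle$

IP2: $\langle X, Y + Z\rangle = S(X \circ (Y + Z)^c) = S(X \circ Y^c + X \circ Z^c) = \langle X, Y\rangle + \langle X, Z\rangle$

IP3: $a\langle X, Y\rangle = \langle aX, Y\rangle = \langle X, aY\rangle$, $a \in \mathbb{R}$

IP4: $\langle X, X\rangle = S(X \circ X^c) = N_X \ge 0$, only vanishing if $X = 0$.

Thus $\langle X, Y\rangle = S(X \circ Y^c)$ is a suitable inner product. In fact if
$$X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad \alpha = a_0 + a_1 i + a_2 j + a_3 k = a_0 + \mathbf{a}_0, \qquad \beta = a_4 + a_5 i + a_6 j + a_7 k = a_4 + \mathbf{a}_4$$
with a similar prescription for $Y$:
$$Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}, \qquad \gamma = b_0 + b_1 i + b_2 j + b_3 k = b_0 + \mathbf{b}_0, \qquad \delta = b_4 + b_5 i + b_6 j + b_7 k = b_4 + \mathbf{b}_4$$
then
$$S(X \circ Y^c) = S\left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \bar\gamma \\ -\delta \end{bmatrix}\right\} = \frac{1}{2}\begin{bmatrix} \alpha\bar\gamma + \delta\bar\beta + \gamma\bar\alpha + \beta\bar\delta \\ 0 \end{bmatrix} = a_0 b_0 + \mathbf{a}_0 \cdot \mathbf{b}_0 + a_4 b_4 + \mathbf{a}_4 \cdot \mathbf{b}_4$$
which is the standard inner product on $\mathbb{R}^8$.

The Schwarz inequality for any inner product states:
$$\langle X, X\rangle\langle Y, Y\rangle \ge \langle X, Y\rangle^2$$
that is, in this case:
$$N_X N_Y \ge S^2(X \circ Y^c)$$
Using this we can deduce a 'triangle inequality' for Cayley numbers. To deduce this we construct the norm of a sum of Cayley numbers.
$$N_{X+Y} = (X + Y)\circ(X^c + Y^c) = X \circ X^c + X \circ Y^c + Y \circ X^c + Y \circ Y^c = N_X + 2S(X \circ Y^c) + N_Y \le N_X + 2\sqrt{N_X N_Y} + N_Y = N_X + 2\sqrt{N_X}\sqrt{N_Y} + N_Y$$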
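That is,
$$\sqrt{N_{X+Y}} \le \sqrt{N_X} + \sqrt{N_Y}$$
which is the triangle inequality for Cayley numbers.

4.3 Angles and Cayley Numbers

We can associate an angle with a Cayley number. If $X$ is a Cayley number then clearly
$$X = S(X) + V(X)$$
$X$ is said to be a pure Cayley number if $S(X) = 0$. Let $x$ be a pure Cayley number with unit norm and $N_{V(X)}$ be the norm of $V(X)$.
$$N_{V(X)} = V(X) \circ V^c(X) = -V(X) \circ V(X), \qquad x = \frac{1}{\sqrt{N_{V(X)}}}V(X)$$
then
$$X = \sqrt{N_X}\left(\frac{S(X)}{\sqrt{N_X}} + \frac{V(X)}{\sqrt{N_X}}\right) = \sqrt{N_X}\left(\frac{S(X)}{\sqrt{N_X}} + \frac{\sqrt{N_{V(X)}}}{\sqrt{N_X}}\,x\right) = \sqrt{N_X}\,(\cos\theta + x\sin\theta)$$
where $\cos\theta = \dfrac{S(X)}{\sqrt{N_X}}$ and $\sin\theta = \dfrac{\sqrt{N_{V(X)}}}{\sqrt{N_X}}$. This representation is valid as long as
$$S^2(X) + N_{V(X)} = N_X$$
To check this let
$$X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \quad\text{then}\quad X^c = \begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix} \quad\text{and}\quad V(X) = \frac{1}{2}(X - X^c) = \frac{1}{2}\begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix}$$
$$V(X) \circ V(X) = \frac{1}{4}\begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix} \circ \begin{bmatrix} \alpha - \bar\alpha \\ 2\beta \end{bmatrix} = \frac{1}{4}\begin{bmatrix} (\alpha - \bar\alpha)^2 - 4\beta\bar\beta \\ 0 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} \alpha^2 - 2\alpha\bar\alpha + \bar\alpha^2 - 4\beta\bar\beta \\ 0 \end{bmatrix} = \frac{1}{4}\begin{bmatrix} (\alpha + \bar\alpha)^2 - 4(\alpha\bar\alpha + \beta\bar\beta) \\ 0 \end{bmatrix} = \begin{bmatrix} S^2(\alpha) - N_X \\ 0 \end{bmatrix}$$
$$N_{V(X)} = -V(X) \circ V(X) = \begin{bmatrix} N_X - S^2(\alpha) \\ 0 \end{bmatrix}$$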
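But
$$S(X) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \quad\therefore\quad S^2(X) = \begin{bmatrix} S^2(\alpha) \\ 0 \end{bmatrix}$$
$$\therefore\quad S^2(X) + N_{V(X)} = \begin{bmatrix} S^2(\alpha) + N_X - S^2(\alpha) \\ 0 \end{bmatrix} = \begin{bmatrix} N_X \\ 0 \end{bmatrix} = N_X$$
which validates the formulation $X = \sqrt{N_X}\,(\cos\theta + x\sin\theta)$.

Note the important result for any unit-norm pure Cayley number:
$$x \circ x = \frac{1}{\sqrt{N_{V(X)}}}V(X) \circ \frac{1}{\sqrt{N_{V(X)}}}V(X) = \frac{1}{N_{V(X)}}V(X) \circ V(X) = \frac{1}{N_{V(X)}}\begin{bmatrix} S^2(\alpha) - N_X \\ 0 \end{bmatrix} = -1$$
This result can be extended to show that $X$ is a pure Cayley number if and only if $X^2$ is a non-positive real number. To see this let $X$ be pure, that is, $S(X) = 0$. If
$$X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \quad\text{then}\quad S(X) = \begin{bmatrix} S(\alpha) \\ 0 \end{bmatrix} \quad\therefore\quad S(\alpha) = 0 \text{ if } X \text{ is pure}$$
Therefore
$$X \circ X = V(X) \circ V(X) = \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix} \circ \begin{bmatrix} V(\alpha) \\ \beta \end{bmatrix} = \begin{bmatrix} -V(\alpha)\cdot V(\alpha) - \beta\bar\beta \\ 0 \end{bmatrix}$$
Conversely, suppose
$$X^2 = \begin{bmatrix} -k \\ 0 \end{bmatrix}, \quad k \in \mathbb{R}. \qquad\text{In general}\qquad X^2 = \begin{bmatrix} \alpha^2 - \beta\bar\beta \\ \alpha\beta + \bar\alpha\beta \end{bmatrix}$$
If $X^2 \in \mathbb{R}$ then $\alpha\beta + \bar\alpha\beta = 0$, i.e. $S(\alpha)\beta = 0$. Thus either $S(\alpha) = 0$ or $\beta = 0$. If $S(\alpha) = 0$ then
$$\alpha^2 - \beta\bar\beta = V(\alpha)V(\alpha) - \beta\bar\beta = -N_{V(\alpha)} - N_\beta$$
and so $X$ is pure and $X^2$ is a negative real number as required. If $\beta = 0$ then
$$X^2 = \begin{bmatrix} \alpha^2 \\ 0 \end{bmatrix}$$
If this is to be real then (using results deduced earlier for quaternions) $V(\alpha) = 0$ or $S(\alpha) = 0$. In the first case $X$ is not pure and in the second $X^2$ is a negative real number.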
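The criterion just derived can be checked numerically. The following is a minimal sketch, assuming a Cayley number is stored as a flat list of 8 integer coefficients and multiplied with the ordered-pair rule $(a,b)\circ(c,d) = (ac - d\bar b,\ cb + \bar a d)$ of this chapter, applied recursively down to the reals. The helper names `conj`, `mul`, `norm` and `is_nonpositive_real` are illustrative and do not appear in the text.

```python
def conj(x):
    """Conjugate of a flat coefficient list of length 2**n."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Ordered-pair product (a,b)o(c,d) = (ac - d b*, cb + a* d), recursively."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = [p - q for p, q in zip(mul(a, c), mul(d, conj(b)))]
    second = [p + q for p, q in zip(mul(c, b), mul(conj(a), d))]
    return first + second


def norm(x):
    """N_X as the sum of squared coefficients."""
    return sum(v * v for v in x)


def is_nonpositive_real(x):
    """True when x is a real number that is zero or negative."""
    return x[0] <= 0 and all(v == 0 for v in x[1:])


pure = [0, 1, -2, 3, 0, 4, -1, 2]    # S(X) = 0
mixed = [2, 1, 0, 0, 0, 0, 0, 0]     # S(X) = 2, so not pure

pure_sq = mul(pure, pure)            # expected to equal -N_X times the identity
mixed_sq = mul(mixed, mixed)         # keeps a non-zero vector part

print(pure_sq, is_nonpositive_real(pure_sq))
print(mixed_sq, is_nonpositive_real(mixed_sq))
```

Integer coefficients keep every comparison exact, so no floating-point tolerance is needed.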
Angle subtended by two Cayley numbers

We can also define an angle $\lambda$ between two Cayley numbers $X$, $Y$ to be such that:
$$\cos\lambda = \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}}$$
For this to be a meaningful definition of an angle we must show that
$$-1 \le \cos\lambda \le 1$$
This follows since, from the Schwarz inequality, $\sqrt{N_X} \ge |S(X)|$. Therefore
$$\sqrt{N_X}\sqrt{N_Y} = \sqrt{N_{X \circ Y^c}} \ge |S(X \circ Y^c)|$$
and the required result follows. Using this we can easily deduce that the angle of a Cayley number is the angle subtended by $S(X)$ and $X$:
$$\cos\lambda = \frac{S(S(X) \circ X^c)}{\sqrt{N_{S(X)}}\sqrt{N_X}} = \frac{(S(X))^2}{S(X)\sqrt{N_X}} = \frac{S(X)}{\sqrt{N_X}} = \cos\theta$$
Note also that if
$$X = \sqrt{N_X}\,(\cos\theta + x\sin\theta), \qquad Y = \sqrt{N_Y}\,(\cos\phi + y\sin\phi)$$
where $x$, $y$ are unit norm pure Cayley numbers, then the angle $\lambda$ between $X$, $Y$ is such that
$$\cos\lambda = \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} = S[(\cos\theta + x\sin\theta) \circ (\cos\phi - y\sin\phi)] = \cos\theta\cos\phi - \sin\theta\sin\phi\, S(x \circ y)$$
But if $\delta$ is the angle between $x$ and $y$ then $\cos\delta = S(x \circ y^c) = -S(x \circ y)$. Therefore
$$\cos\lambda = \cos\theta\cos\phi + \sin\theta\sin\phi\cos\delta$$
With the introduction of angle we can introduce some standard geometrical terminology. We say $X$, $Y$ are perpendicular if $S(X \circ Y^c) = 0$ and are parallel if $V(X \circ Y^c) = 0$. The scalar and vector parts of a Cayley number are perpendicular since:
$$S(S(X) \circ (V(X))^c) = \frac{1}{4}S((X^c + X) \circ (X^c - X)) = \frac{1}{4}S(X^c \circ X^c - X \circ X) \quad\text{since } S(X \circ X^c) = S(X^c \circ X)$$
$$= \frac{1}{8}\left(X^c \circ X^c - X \circ X + [X^c \circ X^c - X \circ X]^c\right) = \frac{1}{8}\left(X^c \circ X^c - X \circ X + X \circ X - X^c \circ X^c\right) = 0$$
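The angle definition lends itself to a quick numerical check. This is a sketch only, assuming Cayley numbers are flat lists of 8 coefficients multiplied with the ordered-pair rule $(a,b)\circ(c,d) = (ac - d\bar b,\ cb + \bar a d)$; the function names `inner` and `angle` are hypothetical helpers, not notation from the text.

```python
import math


def conj(x):
    """Conjugate of a flat coefficient list of length 2**n."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Ordered-pair product (a,b)o(c,d) = (ac - d b*, cb + a* d), recursively."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = [p - q for p, q in zip(mul(a, c), mul(d, conj(b)))]
    second = [p + q for p, q in zip(mul(c, b), mul(conj(a), d))]
    return first + second


def inner(x, y):
    """<X, Y> = S(X o Y^c): the scalar (first) coefficient of the product."""
    return mul(x, conj(y))[0]


def angle(x, y):
    """Angle between X and Y, from cos = S(X o Y^c) / sqrt(N_X N_Y)."""
    return math.acos(inner(x, y) / math.sqrt(inner(x, x) * inner(y, y)))


X = [1, 2, 3, 4, 5, 6, 7, 8]
Y = [8, -7, 6, -5, 4, -3, 2, -1]

# S(X o Y^c) agrees with the ordinary dot product on R^8.
dot = sum(p * q for p, q in zip(X, Y))
print(inner(X, Y), dot)

# Scalar and vector parts of X are perpendicular.
scalar_part = [X[0]] + [0] * 7
vector_part = [0] + X[1:]
print(inner(scalar_part, vector_part))

# The angle of Z is the angle between Z and its scalar part.
Z = [1, 1, 0, 0, 0, 0, 0, 0]
print(angle(Z, [1, 0, 0, 0, 0, 0, 0, 0]))
```

With these particular values the dot product vanishes, so `X` and `Y` are perpendicular in the sense defined above.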
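Theorem 4.1 showed that $X^{-1} \circ (X \circ Y) = (Y \circ X^{-1}) \circ X = Y$. The following theorem specifically relates to pure Cayley numbers and utilises the idea of parallel and perpendicular Cayley numbers.

Theorem 4.2 If $h$ and $p$ are pure Cayley and parallel then
$$p \circ (h \circ p^{-1}) = h$$
whilst if $h$ and $p$ are pure Cayley and perpendicular then
$$p \circ (h \circ p^{-1}) = -h$$
Proof Let
$$p = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad h = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}, \qquad p^{-1} = \frac{1}{N_p}\begin{bmatrix} \bar\alpha \\ -\beta \end{bmatrix}$$
$$h \circ p^{-1} = \frac{1}{N_p}\begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix}$$
$$\therefore\quad p \circ (h \circ p^{-1}) = \frac{1}{N_p}\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \gamma\bar\alpha + \beta\bar\delta \\ \bar\alpha\delta - \bar\gamma\beta \end{bmatrix} = \frac{1}{N_p}\begin{bmatrix} \alpha(\gamma\bar\alpha + \beta\bar\delta) - (\bar\alpha\delta - \bar\gamma\beta)\bar\beta \\ (\gamma\bar\alpha + \beta\bar\delta)\beta + \bar\alpha(\bar\alpha\delta - \bar\gamma\beta) \end{bmatrix}$$
But $p$, $h$ are parallel (implying $V(p \circ h) = 0$) and so $p \circ h = h \circ p$ (using $p^c = -p$, $h^c = -h$ since both are pure Cayley). Thus, as $p$, $h$ are pure Cayley numbers we deduce $\bar\alpha = -\alpha$ and $\bar\gamma = -\gamma$ and, from the parallel constraint,
$$\alpha\gamma - \delta\bar\beta = \gamma\alpha - \beta\bar\delta = -\gamma\bar\alpha - \beta\bar\delta$$
$$\gamma\beta + \bar\alpha\delta = \alpha\delta + \bar\gamma\beta = -\bar\alpha\delta + \bar\gamma\beta$$
Therefore
$$p \circ (h \circ p^{-1}) = \frac{1}{N_p}\begin{bmatrix} \alpha(\delta\bar\beta - \alpha\gamma) + (\gamma\beta + \bar\alpha\delta)\bar\beta \\ (\delta\bar\beta - \alpha\gamma)\beta + \bar\alpha(-\gamma\beta - \bar\alpha\delta) \end{bmatrix} = \frac{1}{N_p}\begin{bmatrix} (\alpha\bar\alpha + \beta\bar\beta)\gamma \\ (\beta\bar\beta + \alpha\bar\alpha)\delta \end{bmatrix} = h$$
The proof of the second relation is similar. Choosing $p$ and $h$ as above then the perpendicularity of $p$ and $h$ implies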
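$$h \circ p = -p \circ h$$
which in turn implies
$$\alpha\gamma - \delta\bar\beta = -(\gamma\alpha - \beta\bar\delta), \qquad \gamma\beta + \bar\alpha\delta = -(\alpha\delta + \bar\gamma\beta)$$
(or their conjugates). Also since $p^c = -p$ then $\bar\alpha = -\alpha$ and since $h^c = -h$ then $\bar\gamma = -\gamma$
$$\therefore\quad p \circ (h \circ p^{-1}) = \frac{1}{N_p}\begin{bmatrix} \alpha(\gamma\bar\alpha + \beta\bar\delta) - (\bar\alpha\delta - \bar\gamma\beta)\bar\beta \\ (\gamma\bar\alpha + \beta\bar\delta)\beta + \bar\alpha(\bar\alpha\delta - \bar\gamma\beta) \end{bmatrix} = \frac{1}{N_p}\begin{bmatrix} \alpha(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar\beta \\ (\alpha\gamma - \delta\bar\beta)\beta + \bar\alpha(\gamma\beta + \bar\alpha\delta) \end{bmatrix} = \frac{1}{N_p}\begin{bmatrix} -(\alpha\bar\alpha + \beta\bar\beta)\gamma \\ -(\beta\bar\beta + \alpha\bar\alpha)\delta \end{bmatrix} = -h$$

4.4 Cayley Number Identities

We have already derived a number of identities satisfied by all Cayley numbers. In particular we have noted elementary identities with respect to the scalar part of Cayley numbers: $S(X) = S(X^c)$, $S(X \circ Y) = S(Y \circ X)$. The following theorem will be of considerable use in later sections:

Theorem 4.3 For any Cayley numbers $Q, X, Z \in \mathbb{K}$
$$S((Q \circ X) \circ (Z \circ Q^c)) = N_Q S(X \circ Z)$$
Proof Let
$$X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad Z = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}, \qquad Q = \begin{bmatrix} a \\ b \end{bmatrix}$$
where $\alpha, \beta, \gamma, \delta, a, b \in \mathbb{H}$. Then
$$Q \circ X = \begin{bmatrix} a \\ b \end{bmatrix} \circ \begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \begin{bmatrix} a\alpha - \beta\bar b \\ \alpha b + \bar a\beta \end{bmatrix}, \qquad Z \circ Q^c = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} \circ \begin{bmatrix} \bar a \\ -b \end{bmatrix} = \begin{bmatrix} \gamma\bar a + b\bar\delta \\ \bar a\delta - \bar\gamma b \end{bmatrix}$$
$$(Q \circ X) \circ (Z \circ Q^c) = \begin{bmatrix} (a\alpha - \beta\bar b)(\gamma\bar a + b\bar\delta) - (\bar a\delta - \bar\gamma b)(\bar b\bar\alpha + \bar\beta a) \\ (\gamma\bar a + b\bar\delta)(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)(\bar a\delta - \bar\gamma b) \end{bmatrix}$$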
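where
$$S((Q \circ X) \circ (Z \circ Q^c)) = \frac{1}{2}\begin{bmatrix} \ell + m + n + p \\ 0 \end{bmatrix}$$
Now
$$\ell = a(\alpha\gamma + \bar\gamma\bar\alpha)\bar a - \bar a(\delta\bar\beta + \beta\bar\delta)a$$
$$m = b\bar b(\bar\gamma\bar\alpha + \alpha\gamma - \beta\bar\delta - \delta\bar\beta)$$
$$n = (\delta\bar b\bar\alpha\bar a - \bar a\delta\bar b\bar\alpha) + (a\alpha b\bar\delta - \alpha b\bar\delta a)$$
$$p = (\bar\gamma b\bar\beta a - a\bar\gamma b\bar\beta) + (\bar a\beta\bar b\gamma - \beta\bar b\gamma\bar a)$$
Also
$$S(\alpha\gamma) = \frac{1}{2}(\alpha\gamma + \bar\gamma\bar\alpha), \qquad S(\delta\bar\beta) = \frac{1}{2}(\delta\bar\beta + \beta\bar\delta)$$
and
$$S(a(\alpha b\bar\delta)) = \frac{1}{2}(a\alpha b\bar\delta + \delta\bar b\bar\alpha\bar a) = S((\alpha b\bar\delta)a) = \frac{1}{2}(\alpha b\bar\delta a + \bar a\delta\bar b\bar\alpha)$$
$$S(\bar a(\beta\bar b\gamma)) = \frac{1}{2}(\bar a\beta\bar b\gamma + \bar\gamma b\bar\beta a) = S((\beta\bar b\gamma)\bar a) = \frac{1}{2}(\beta\bar b\gamma\bar a + a\bar\gamma b\bar\beta)$$
using the property $S(pq) = S(qp)$ for any $p, q \in \mathbb{H}$. These results imply that both $n$ and $p$ vanish. Hence
$$S((Q \circ X) \circ (Z \circ Q^c)) = \begin{bmatrix} (a\bar a + b\bar b)S(\alpha\gamma) - (a\bar a + b\bar b)S(\delta\bar\beta) \\ 0 \end{bmatrix} = N_Q\begin{bmatrix} S(\alpha\gamma - \delta\bar\beta) \\ 0 \end{bmatrix} = N_Q S(X \circ Z)$$
which completes the proof.

In the remaining part of this section we derive other identities which will make the non-associative algebra of Cayley numbers easier to deal with. We begin by defining the associator (cf. the article by Curtis in [18])
$$A(X, Y, Z) = (X \circ Y) \circ Z - X \circ (Y \circ Z)$$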
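which of course is generally non-zero. However, by direct computation using the properties of quaternions we can easily show:

(a) $A(X, X, Y) = (X \circ X) \circ Y - X \circ (X \circ Y) = 0$

To obtain this result let
$$X = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad Y = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}$$
then
$$(X \circ X) \circ Y = \begin{bmatrix} \alpha^2 - \beta\bar\beta \\ \alpha\beta + \bar\alpha\beta \end{bmatrix} \circ \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} (\alpha^2 - \beta\bar\beta)\gamma - \delta(\bar\beta\bar\alpha + \bar\beta\alpha) \\ \gamma(\alpha\beta + \bar\alpha\beta) + (\bar\alpha^2 - \beta\bar\beta)\delta \end{bmatrix}$$
whilst
$$X \circ (X \circ Y) = \begin{bmatrix} \alpha \\ \beta \end{bmatrix} \circ \begin{bmatrix} \alpha\gamma - \delta\bar\beta \\ \gamma\beta + \bar\alpha\delta \end{bmatrix} = \begin{bmatrix} \alpha^2\gamma - \alpha\delta\bar\beta - (\gamma\beta\bar\beta + \bar\alpha\delta\bar\beta) \\ \alpha\gamma\beta - \delta\bar\beta\beta + \bar\alpha\gamma\beta + \bar\alpha^2\delta \end{bmatrix}$$
from which it follows easily (noting that $\alpha + \bar\alpha$ is a real number, thus commuting with any quaternion) that $X \circ (X \circ Y) = (X \circ X) \circ Y$.

(b) $A(X, Y, Y) = (X \circ Y) \circ Y - X \circ (Y \circ Y) = 0$

This is deduced in an analogous manner to (a). The existence of identities (a) and (b) shows that Cayley number algebra is a member of the class of alternative algebras. We shall show in a later section that Cayley numbers are the most important example of this class of algebras.

The following Theorem is closely related to the identity in (b). A Cayley number which is not a quaternion is called non-degenerate.

Theorem Three non-degenerate Cayley numbers $X$, $Y$, $W$ satisfy the associative law
$$W \circ (X \circ Y) = (W \circ X) \circ Y$$
if and only if any two of them have parallel vector parts.

Proof First it is easy to see that there is no loss of generality in assuming that all three Cayley numbers $X$, $Y$, $W$ are pure. Thus let $V(W)$, $V(X)$ and $V(Y)$ be pure Cayley numbers and if
$$[V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)]$$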
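for all pure Cayley numbers $V(W)$ then $V(Y) = kV(X)$, $k \in \mathbb{R}$. To show this let
$$V(W) = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad V(X) = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}, \qquad V(Y) = \begin{bmatrix} \gamma \\ \delta \end{bmatrix}$$
then $[V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)]$ reads
$$\begin{bmatrix} (a\alpha - \beta\bar b)\gamma - \delta(\bar b\bar\alpha + \bar\beta a) \\ \gamma(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)\delta \end{bmatrix} = \begin{bmatrix} a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar b \\ (\alpha\gamma - \delta\bar\beta)b + \bar a(\gamma\beta + \bar\alpha\delta) \end{bmatrix}$$
Now equating components, and since $V(W)$, $V(X)$ and $V(Y)$ are pure Cayley numbers so that $a$, $\alpha$ and $\gamma$ are pure quaternions satisfying $\bar a = -a$, $\bar\alpha = -\alpha$ and $\bar\gamma = -\gamma$, we have
$$(a\alpha - \beta\bar b)\gamma - \delta(\bar b\bar\alpha + \bar\beta a) = a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta + \bar\alpha\delta)\bar b$$
$$\gamma(\alpha b + \bar a\beta) + (\bar\alpha\bar a - b\bar\beta)\delta = (\alpha\gamma - \delta\bar\beta)b + \bar a(\gamma\beta + \bar\alpha\delta)$$
Since $a$ and $b$ are independent quaternions then
$$\delta\bar\beta a = a\delta\bar\beta \qquad\text{and}\qquad -\beta\bar b\gamma + \delta\bar b\alpha = -\gamma\beta\bar b + \alpha\delta\bar b$$
If the first of these relations is to be satisfied for all quaternions $a$ then $\delta\bar\beta$ must be a real number, since this is the only number which commutes with all quaternions. Thus
$$\delta = k\beta \quad\text{where } k \text{ is real}$$
The second relation now becomes
$$\beta\bar b(k\alpha - \gamma) = (k\alpha - \gamma)\beta\bar b$$
If this is to be true for all $b$ then $\gamma - k\alpha$ must be real. Since $\gamma$ and $\alpha$ are pure quaternions this implies
$$\gamma - k\alpha = 0 \quad\therefore\quad \gamma = k\alpha$$
The other relations are then identically satisfied, so that if
$$[V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)]$$
then $V(Y) = kV(X)$ where $k$ is real. This result will be used in Section 4.6.

In a similar manner we can show that if
$$[V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)]$$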
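for all $V(X)$ then $V(Y) = kV(W)$, $k \in \mathbb{R}$. To see this we need to show that for all quaternions $\alpha$, $\beta$:
$$(a\alpha - \beta\bar b)\gamma - \delta(-\bar b\alpha + \bar\beta a) = a(\alpha\gamma - \delta\bar\beta) - (\gamma\beta - \alpha\delta)\bar b$$
$$\gamma(\alpha b - a\beta) + (\alpha a - b\bar\beta)\delta = (\alpha\gamma - \delta\bar\beta)b - a(\gamma\beta - \alpha\delta)$$
which, on rearrangement, gives
$$a\delta\bar\beta - \delta\bar\beta a + \gamma\beta\bar b - \beta\bar b\gamma + \delta\bar b\alpha - \alpha\delta\bar b = 0$$
and
$$\gamma\alpha b - \alpha\gamma b + a\gamma\beta - \gamma a\beta + \alpha a\delta - a\alpha\delta + \delta\bar\beta b - b\bar\beta\delta = 0$$
The first relation, if true for all $\alpha$, implies
$$\delta\bar b \in \mathbb{R} \quad\text{that is}\quad \delta = kb, \quad k \in \mathbb{R}$$
The second relation (if true for all $\beta$) then implies $a\gamma = \gamma a$ which, since $a$, $\gamma$ are pure quaternions, gives
$$a = p\gamma, \quad p \in \mathbb{R}$$
Using this in the second relation above implies
$$[kp - 1][\gamma\alpha b - \alpha\gamma b] = 0$$
Thus either $b = 0$, $\gamma\alpha = \alpha\gamma$ or $kp = 1$. Now, if $b = 0$ then, from an earlier result, $\delta = 0$. Both main relations are now satisfied and
$$V(Y) = \frac{1}{p}V(W)$$
The second possibility, $\alpha\gamma = \gamma\alpha$, implies (if true for all $\alpha$) that $\gamma \in \mathbb{R}$, which is a contradiction since $\gamma$ is a pure quaternion. The final possibility is that $kp = 1$. The first relation above becomes
$$\gamma[b\bar\beta + \beta\bar b] = [b\bar\beta + \beta\bar b]\gamma$$
which is satisfied identically since $b\bar\beta + \beta\bar b$ is a scalar. Thus we deduce
$$V(Y) = \begin{bmatrix} \gamma \\ \delta \end{bmatrix} = \begin{bmatrix} ka \\ kb \end{bmatrix} = kV(W)$$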
We can conclude that three (non-degenerate) Cayley numbers $X$, $Y$, $W$ satisfy the associative rule $W \circ (X \circ Y) = (W \circ X) \circ Y$ if and only if any two of them have parallel vector parts.

Continuing with Cayley number identities, we have:

(c) $A(X^c, X, Y) = (X^c \circ X) \circ Y - X^c \circ (X \circ Y) = 0$

By taking the conjugate of this relation we (essentially) obtain:

(d) $A(X, Y, Y^c) = (X \circ Y) \circ Y^c - X \circ (Y \circ Y^c) = 0$

The last two identities (in the slightly amended form $(Y \circ X^{-1}) \circ X = Y$, $X^{-1} \circ (X \circ Y) = Y$) have already been used. We show later, as part of the proof of the Hurwitz theorem, that identities (a), (b) are equivalent to either of (c) or (d).

The remaining identities (e) to (i) are valid not just for Cayley numbers but for all alternative algebras. In the first identity (a) replace $X$ by $C + D$ to give, after some cancellation:

(e) $(D \circ C) \circ Y - D \circ (C \circ Y) + (C \circ D) \circ Y - C \circ (D \circ Y) = 0$

Putting $D = Y$ in (e) and using (b) gives

(f) $(Y \circ C) \circ Y = Y \circ (C \circ Y)$

Replacing $Y$ by $C + D$ in identity (b) implies, after re-labelling,

(g) $(D \circ C) \circ Y - D \circ (C \circ Y) + (D \circ Y) \circ C - D \circ (Y \circ C) = 0$

Subtracting (e) from (g) gives

(h) $(D \circ Y) \circ C + C \circ (D \circ Y) = (C \circ D) \circ Y + D \circ (Y \circ C)$

Now in this expression we replace $D$ by $C \circ D$ and then $Y$ by $Y \circ C$, and add the resulting expressions to obtain
$$(C \circ D) \circ (Y \circ C) + (C \circ (C \circ D)) \circ Y + D \circ ((Y \circ C) \circ C) + (C \circ D) \circ (Y \circ C) = C \circ [(C \circ D) \circ Y + D \circ (Y \circ C)] + [(C \circ D) \circ Y + D \circ (Y \circ C)] \circ C$$
Now using (h) and (f) in this last expression gives
$$(C^2 \circ D) \circ Y + 2(C \circ D) \circ (Y \circ C) + D \circ (Y \circ C^2) = C^2 \circ (D \circ Y) + (D \circ Y) \circ C^2 + 2C \circ ((D \circ Y) \circ C)$$
Now, finally, replace $C$ by $C^2$ in (h) and combine it with the expression just obtained to imply:

(i) $(C \circ D) \circ (Y \circ C) = C \circ (D \circ Y) \circ C$

Note that because of the identity (f) the right hand side of (i) is unambiguous. Identity (i) is known as the Moufang identity.

A further identity can be obtained by replacing (in identity (h)) $D$ by $(C^c \circ D)$ and then $Y$ by $Y \circ C^c$ and adding the resulting expressions to obtain
$$(C \circ D) \circ (Y \circ C^c) + (C^c \circ D) \circ (Y \circ C) = C \circ [(D \circ Y) \circ C^c] + [C^c \circ (D \circ Y)] \circ C$$
which leads to, on taking the scalar parts of both sides,
$$S[(C \circ D) \circ (Y \circ C^c)] + S[(C^c \circ D) \circ (Y \circ C)] = S(C \circ [(D \circ Y) \circ C^c]) + S([C^c \circ (D \circ Y)] \circ C)$$
$$= S[\{(D \circ Y) \circ C^c\} \circ C] + S[C \circ \{C^c \circ (D \circ Y)\}] = 2N_C S(D \circ Y)$$
However, as we have shown above,
$$S(C \circ [(D \circ Y) \circ C^c]) = N_C S(D \circ Y)$$
and so we immediately deduce that
$$S[(C \circ D) \circ (Y \circ C^c)] = S[(C^c \circ D) \circ (Y \circ C)] = N_C S(D \circ Y)$$
This identity will be of considerable use in Section 4.6.

4.5 Normed Algebras and the Hurwitz Theorem

We are now in a position to prove Hurwitz's theorem, which highlights the important role enjoyed by Cayley numbers in the domain of normed algebras. In the latter part of the proof considerable use will be made of the alternative laws and of the Moufang relation.

Theorem 4.4 The only normed algebras over the real field are isomorphic to $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, $\mathbb{K}$.
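Proof Let $A$ be an algebra with basis $e_1, e_2, \ldots, e_n$ in which $e_1 = \mathbb{1}$ is the identity element. If $a, b \in A$ then
$$a = \sum_{j=1}^{n} a_j e_j, \qquad b = \sum_{j=1}^{n} b_j e_j$$
The norm $N_a$ of $a$ is
$$N_a = a_1^2 + a_2^2 + \ldots + a_n^2, \qquad N_a \in \mathbb{R}$$
Clearly $N_a = 0$ if and only if $a = 0$. The algebra $A$ is normed if for any two elements $a, b \in A$ there exists a basis for which
$$N_{ab} = N_a N_b$$
We first show that any normed algebra over $\mathbb{R}$ is a division algebra. The proof is immediate. If $a, b \in A$ and if $ab = 0$ then
$$N_{ab} = N_0 = 0 \quad\therefore\quad N_a N_b = 0 \;\rightarrow\; N_a = 0 \text{ or } N_b = 0 \;\rightarrow\; a = 0 \text{ or } b = 0$$
We conclude that if $ab = 0$ then either $a = 0$ or $b = 0$. Hence the algebra $A$ is a division algebra. Since it is a division algebra it must contain a unit, which we denote by $e_1 = \mathbb{1}$.

We present the proof of Hurwitz's theorem following the approach given in Jacobsen. The standard inner product is introduced in $A$:
$$\langle a, b\rangle = a_1 b_1 + a_2 b_2 + \ldots + a_n b_n \qquad \forall\, a, b \in A$$
The following properties of the inner product are easily obtained.

(i) $N_a = \langle a, a\rangle$

(ii) $\langle a, b\rangle = \langle b, a\rangle$

(iii) $\langle \alpha a, b\rangle = \langle a, \alpha b\rangle = \alpha\langle a, b\rangle$, $\alpha \in \mathbb{R}$

(iv) $\langle a, b + c\rangle = \langle a, b\rangle + \langle a, c\rangle$ and $\langle a + c, b\rangle = \langle a, b\rangle + \langle c, b\rangle$ $\forall\, a, b, c \in A$

(v) If $\langle a, b\rangle = 0$ $\forall\, b \in A$ then $a = 0$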
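Properties (i) to (iv) are easily checked. To obtain property (v) choose $b = b_j e_j$ for given $j$. Then
$$\langle a, b\rangle = a_j b_j$$
and if $\langle a, b\rangle = 0$ then we deduce $a_j = 0$. We now repeat for each $j = 1, 2, \ldots, n$ to deduce
$$a_j = 0, \qquad j = 1, 2, \ldots, n$$
and so $a = 0$. This shows that the inner product is non-degenerate.

(vi) $\langle a, b\rangle = \frac{1}{2}[N_{a+b} - N_a - N_b]$

proof
$$N_{a+b} = \langle a + b, a + b\rangle = \langle a, a + b\rangle + \langle b, a + b\rangle = \langle a, a\rangle + 2\langle a, b\rangle + \langle b, b\rangle$$
and the result follows.

(vii) $\langle ca, cb\rangle = N_c\langle a, b\rangle$ $\forall\, a, b, c \in A$

proof
$$\langle ca, cb\rangle = \frac{1}{2}[N_{c(a+b)} - N_{ca} - N_{cb}] = \frac{1}{2}N_c[N_{a+b} - N_a - N_b] = N_c\langle a, b\rangle \quad\text{from (vi)}$$
Similarly
$$\langle ac, bc\rangle = N_c\langle a, b\rangle \qquad \forall\, a, b, c \in A$$

(viii) $\langle ac, bd\rangle + \langle ad, bc\rangle = 2\langle c, d\rangle\langle a, b\rangle$

proof In the second relation of (vii), replace $c$ by $c + d$ to give
$$\langle a(c + d), b(c + d)\rangle = N_{c+d}\langle a, b\rangle$$
then, expanding the left hand side and rearranging gives
$$\langle ac, bd\rangle + \langle ad, bc\rangle = N_{c+d}\langle a, b\rangle - \langle ac, bc\rangle - \langle ad, bd\rangle = 2\langle c, d\rangle\langle a, b\rangle$$
The inner product can be used to split $A$ into a subspace $S$ and its orthogonal complement $S^\perp$. Precisely, $S^\perp$ is the set of all elements $a^\perp \in A$ such that
$$\langle a^\perp, a\rangle = 0 \qquad \forall\, a \in S$$
It is now a standard construction to show that $S^\perp$ is a linear subspace of $A$ and in fact
$$A = S + S^\perp$$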
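implying that every $a \in A$ can be expressed uniquely as a sum of an element of $S$ and an element of $S^\perp$. In particular, we choose $S = \{e_1\}$. Also, in what follows, any element of $\{e_1\}^\perp$ will be underlined. Thus in (viii) choose $b = e_1$ and $a = \underline{a} \in \{e_1\}^\perp$; then
$$\langle \underline{a}c, d\rangle + \langle \underline{a}d, c\rangle = 2\langle c, d\rangle\langle \underline{a}, e_1\rangle = 0$$
whilst if $d = e_1$ and $c = \underline{c} \in \{e_1\}^\perp$ then (viii) again implies
$$\langle a\underline{c}, b\rangle + \langle a, b\underline{c}\rangle = 2\langle \underline{c}, e_1\rangle\langle a, b\rangle = 0$$
This gives the result (with some relabelling):

(ix) $\langle \underline{a}c, d\rangle + \langle \underline{a}d, c\rangle = 0$ and $\langle c\underline{a}, d\rangle + \langle c, d\underline{a}\rangle = 0$

Now any $b \in A$ can obviously be written in the form
$$b = \alpha e_1 + \underline{a}, \qquad \alpha \in \mathbb{R}, \quad \underline{a} \in \{e_1\}^\perp$$
With this partition of $A$ we introduce the conjugate $b^*$ of $b$:
$$b^* = \alpha e_1 - \underline{a}$$
Now, by (iii),
$$\langle \alpha c, d\rangle - \langle c, \alpha d\rangle = 0$$
and adding this to the first relation of (ix) we have:
$$\langle \alpha c, d\rangle + \langle \underline{a}c, d\rangle + \langle \underline{a}d, c\rangle - \langle c, \alpha d\rangle = 0$$
that is
$$\langle bc, d\rangle = \langle c, \alpha d\rangle - \langle \underline{a}d, c\rangle = \langle c, \alpha d\rangle - \langle c, \underline{a}d\rangle$$
Therefore we deduce

(x) $\langle bc, d\rangle = \langle c, b^*d\rangle$ $\forall\, b, c, d \in A$

Similarly with the second relation in (ix)
$$\langle cb, d\rangle = \langle c, db^*\rangle$$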
relabelling the right hand side of this second relation $d \to b^*$, $b^* \to d$ implies
$$\langle c, b^*d\rangle = \langle cd^*, b^*\rangle$$
so we conclude
$$\langle bc, d\rangle = \langle c, b^*d\rangle = \langle cd^*, b^*\rangle$$
Now re-label again: $d \to d^*$, $b \to c^*$ and $c \to b^*$ to give
$$\langle c^*b^*, d^*\rangle = \langle b^*, cd^*\rangle = \langle b^*d, c\rangle$$
and so we finally write, from (x) and its extensions,

(xi) $\langle bc, d\rangle = \langle c, b^*d\rangle = \langle cd^*, b^*\rangle = \langle c^*b^*, d^*\rangle$ $\forall\, b, c, d \in A$

From the relations in (xi) a number of special cases can be deduced. First, with $c = e_1$:
$$\langle b, d\rangle = \langle b^*, d^*\rangle = \langle d^*, b^*\rangle$$
so, replacing $b$ by $bc$,
$$\langle bc, d\rangle = \langle d^*, (bc)^*\rangle$$
But, in particular, from (xi)
$$\langle bc, d\rangle = \langle c^*b^*, d^*\rangle = \langle d^*, c^*b^*\rangle$$
from which we deduce
$$\langle d^*, (bc)^* - c^*b^*\rangle = 0$$
and so, since the inner product is non-degenerate, we obtain
$$(bc)^* = c^*b^* \qquad \forall\, b, c \in A$$
Thus the map $\phi: b \to b^*$ is an anti-involution of the algebra $A$. (An involution is a map $\phi$ such that $\phi^2$ is the identity map and which respects multiplication, $\phi(ab) = \phi(a)\phi(b)$. An anti-involution is one that reverses multiplication, as does conjugation.)

We now use the concept of conjugation, in conjunction with the inner product, to show that any normed algebra must be an alternative algebra. It is obvious that the space of elements left fixed by the conjugate operation is the space spanned by $e_1$. Also since, from the anti-involution property just obtained, $(bb^*)^* = bb^*$, we must have
$$bb^* = \alpha e_1 \quad\text{for some } \alpha \in \mathbb{R}$$
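But (from (xi))
$$N_b = \langle b, b\rangle = \langle bb^*, e_1\rangle = \langle \alpha e_1, e_1\rangle = \alpha \quad\therefore\quad bb^* = N_b e_1 \qquad \forall\, b \in A$$
Also
$$(b^*b)^* = b^*b \quad\therefore\quad b^*b = \beta e_1, \qquad \beta \in \mathbb{R}$$
and again from (xi)
$$N_b = \langle b, b\rangle = \langle e_1, b^*b\rangle = \langle b^*b, e_1\rangle = \langle \beta e_1, e_1\rangle = \beta \quad\therefore\quad b^*b = N_b e_1$$
and so
$$bb^* = b^*b = N_b e_1$$
We can now deduce the alternative laws. From (xi), replacing $d$ by $bd$,
$$\langle bc, bd\rangle = \langle c, b^*(bd)\rangle$$
but, by (vii),
$$\langle bc, bd\rangle = N_b\langle c, d\rangle = \langle c, N_b d\rangle \;(\text{since } N_b \in \mathbb{R}) = \langle c, (bb^*)d\rangle$$
$$\therefore\quad \langle c, b^*(bd)\rangle = \langle c, (bb^*)d\rangle \qquad \forall\, c, b, d \in A$$
Thus using non-degeneracy of the inner product we obtain:

(xii) $b^*(bd) = (bb^*)d$ $\forall\, b, d \in A$

By taking the conjugate of both sides and using results derived earlier
$$(bd)^*b = (bb^*)d^* \quad\therefore\quad (d^*b^*)b = (bb^*)d^*$$
or, relabelling, we have

(xiii) $(db)b^* = (bb^*)d$ $\forall\, b, d \in A$

Now writing
$$b + b^* = 2\alpha e_1 \quad\text{where}\quad b = \alpha e_1 + \underline{a} \quad\text{and}\quad b^* = \alpha e_1 - \underline{a}$$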
we have from (xii)
$$(2\alpha e_1 - b)(bd) = N_b d \quad\therefore\quad 2\alpha(bd) - b(bd) = N_b d$$
that is
$$(2\alpha b - N_b)d = b(bd) \quad\therefore\quad [2\alpha b - b(2\alpha e_1 - b)]d = b(bd)$$
and so
$$b^2 d = b(bd)$$
whilst from (xiii)
$$(db)(2\alpha e_1 - b) = N_b d \quad\therefore\quad db(2\alpha) - N_b d = (db)b$$
hence
$$d[b(2\alpha) - b(2\alpha e_1 - b)] = (db)b$$
leading to
$$db^2 = (db)b$$
These results show that a normed algebra over $\mathbb{R}$ must be an alternative algebra.

The converse of this result is easily obtained. If $A$ is an alternative algebra over $\mathbb{R}$ with identity $e_1$ and with anti-involution $b \to b^*$ such that (with $b = \alpha e_1 + \underline{a}$, $b^* = \alpha e_1 - \underline{a}$)
$$bb^* = N_b e_1 \quad\text{and}\quad b + b^* = 2\alpha e_1, \qquad \alpha \in \mathbb{R}$$
then $A$ is a normed algebra. The proof of this result depends crucially on the Moufang identity, which was derived earlier in the discussion on Cayley numbers. However, that identity is true for all alternative algebras. Now $\forall\, c, b, d \in A$ the Moufang identity states
$$(cb)(dc) = c(bd)c$$
Also
$$N_{bd} = (bd)(bd)^* = (bd)(d^*b^*) = (bd)[d^*(2\alpha e_1 - b)] = 2\alpha(bd)d^* - (bd)(d^*b)$$
$$= 2\alpha b(dd^*) - b(dd^*)b \quad\text{(using the Moufang identity)}$$
$$= N_d[2\alpha b - b^2] = N_d[2\alpha e_1 - b]b = N_d(b^*b) = N_d N_b \qquad \forall\, b, d \in A$$
That is, $A$ is a normed algebra. We are now in a position to prove Hurwitz's theorem.
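The alternative laws, the Moufang identity and the norm rule $N_{bd} = N_b N_d$ can all be spot-checked for Cayley numbers. The sketch below assumes the ordered-pair product $(a,b)\circ(c,d) = (ac - d\bar b,\ cb + \bar a d)$ on flat lists of 8 integers; the names `conj`, `mul`, `norm` and `associator`, and the sample values, are illustrative choices rather than anything defined in the text.

```python
def conj(x):
    """Conjugate of a flat coefficient list of length 2**n."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Ordered-pair product (a,b)o(c,d) = (ac - d b*, cb + a* d), recursively."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = [p - q for p, q in zip(mul(a, c), mul(d, conj(b)))]
    second = [p + q for p, q in zip(mul(c, b), mul(conj(a), d))]
    return first + second


def norm(x):
    """N_X as the sum of squared coefficients."""
    return sum(v * v for v in x)


def associator(x, y, z):
    """A(X, Y, Z) = (X o Y) o Z - X o (Y o Z)."""
    return [p - q for p, q in zip(mul(mul(x, y), z), mul(x, mul(y, z)))]


ZERO = [0] * 8
C = [1, 2, 0, -1, 3, 0, 1, 2]
D = [0, 1, 1, 2, -2, 1, 0, 3]
Y = [2, -1, 3, 0, 1, 1, -2, 0]

# Moufang identity: (C o D) o (Y o C) = C o ((D o Y) o C)
moufang_lhs = mul(mul(C, D), mul(Y, C))
moufang_rhs = mul(C, mul(mul(D, Y), C))
print(moufang_lhs == moufang_rhs)

# Norm rule and the two alternative laws
print(norm(mul(C, D)) == norm(C) * norm(D))
print(associator(C, C, D) == ZERO, associator(D, C, C) == ZERO)

# A triple of basis elements that does not associate
e1 = [0, 1, 0, 0, 0, 0, 0, 0]
e2 = [0, 0, 1, 0, 0, 0, 0, 0]
e4 = [0, 0, 0, 0, 1, 0, 0, 0]
print(associator(e1, e2, e4))
```

A check on a handful of integer samples is of course no substitute for the proof; it only guards against sign slips when the product rule is transcribed.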
Let $A$ be a normed algebra over the real field. Let $S$ be a subalgebra of $A$ containing the identity $e_1$. (A set $S$ of elements of an algebra $A$ is called a subalgebra of $A$ if $S \ne \emptyset$ and if $S$ is closed under multiplication.) Then, as we have seen,
$$A = S + S^\perp$$
If $S \ne A$ then there exists an element $I \in S^\perp$ such that $N_I = 1$ (this is always possible since, for a non-zero element $b \in A$, we can construct another element $B = \frac{1}{\sqrt{N_b}}b$ such that $N_B = 1$). Since $I \in S^\perp$ then $I^* = -I$ and
$$I^2 = -I^*I = -N_I e_1 = -e_1$$
Our aim now is to show that the set $S + IS$ (i.e. all elements of the form $a + Ib$, $a, b \in S$) is a subalgebra of $A$. We need to develop the product of elements within $S + IS$. Now
$$xx^* = N_x e_1$$
and so replacing $x$ by $x + y$ gives
$$(x + y)(x^* + y^*) = N_{x+y}e_1 \quad\text{i.e.}\quad N_x e_1 + yx^* + xy^* + N_y e_1 = N_{x+y}e_1$$
$$\therefore\quad yx^* + xy^* = 2\langle x, y\rangle e_1 \quad\text{from (vi)}$$
Choosing $x \in S$ and $y = I \in S^\perp$ gives
$$Ix^* + xI^* = 2\langle x, I\rangle e_1 = 0 \quad\therefore\quad xI = Ix^* \qquad \forall\, x \in S$$
Now, by an earlier result (xi),
$$\langle bc, d\rangle = \langle c, b^*d\rangle$$
therefore, with $d = I$,
$$\langle bc, I\rangle = \langle c, b^*I\rangle$$
But if $c, b \in S$ then $bc \in S$ (since $S$, being a subalgebra of $A$, is closed under multiplication). Thus
$$0 = \langle c, b^*I\rangle = \langle c, Ib\rangle \qquad \forall\, c \in S \quad\therefore\quad Ib \in S^\perp$$
We write $IS \subset S^\perp$ (to mean all elements of the form $Ib$, $b \in S$, are members of $S^\perp$). Also, again by (xi), $\langle bc, d\rangle = \langle c, b^*d\rangle$ gives, by choosing $b = I$ and $d = If$,
$$\langle Ic, If\rangle = \langle c, I^*(If)\rangle = \langle c, (I^*I)f\rangle = \langle c, f\rangle$$
where we have employed the alternative law $(ab)b^* = a(bb^*)$. Thus the map $b \to Ib$ is a unitary transformation as it preserves the inner product. Such a transformation takes an orthonormal basis in $S$ into an orthonormal basis in $IS$. Hence $S$ and $IS$ have the same dimension.

The alternative laws and the Moufang identity can now be used to deduce the product rules of elements of $S + IS$. Since
$$a(a^*x) = (aa^*)x = N_a x$$
then, replacing $a$ by $a + b$,
$$(a + b)[(a + b)^*x] = N_{a+b}x$$
leading to:
$$a(b^*x) + b(a^*x) = 2\langle a, b\rangle x$$
Therefore choosing $a, x \in S$ and $b = I$ then
$$a(I^*x) + I(a^*x) = 2\langle a, I\rangle x = 0$$
that is
$$a(Ix) = I(a^*x) \qquad \forall\, a, x \in S$$
Then, taking conjugates,
$$(x^*I^*)a^* = (x^*a)I^* \quad\text{i.e.}\quad (Ix)a^* = I(a^*x)$$
or, relabelling,
$$(Ix)a = I(ax) \qquad \forall\, a, x \in S$$
From the Moufang identity
$$(cb)(dc) = c(bd)c$$
then
$$(Ib)(Ic) = (Ib)(c^*I) = [I(bc^*)]I$$
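However, if we revert to the notation used first to describe Cayley numbers, the algebra of ordered pairs $a, b \in S$ in which the product rule is
$$\begin{bmatrix} a \\ b \end{bmatrix} \circ \begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} ac - db^* \\ cb + a^*d \end{bmatrix}$$
and in which $I$ is identified with the element $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, the identity is the element $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$, and the conjugate of $\begin{bmatrix} a \\ b \end{bmatrix}$ is $\begin{bmatrix} a^* \\ -b \end{bmatrix}$, is precisely the algebra described above.

We now show that $S + IS$ is associative if and only if the algebra $S$ is commutative. Let
$$X = \begin{bmatrix} t \\ u \end{bmatrix}, \qquad Y = \begin{bmatrix} v \\ w \end{bmatrix}, \qquad Z = \begin{bmatrix} a \\ b \end{bmatrix}, \qquad t, u, v, w, a, b \in S$$
in which we assume the algebra $S$ is associative. Then according to the product rule:
$$X \circ (Y \circ Z) = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} va - bw^* \\ aw + v^*b \end{bmatrix} = \begin{bmatrix} t(va - bw^*) - (aw + v^*b)u^* \\ (va - bw^*)u + t^*(aw + v^*b) \end{bmatrix}$$
$$(X \circ Y) \circ Z = \begin{bmatrix} tv - wu^* \\ vu + t^*w \end{bmatrix} \circ \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} (tv - wu^*)a - b(u^*v^* + w^*t) \\ a(vu + t^*w) + (v^*t^* - uw^*)b \end{bmatrix}$$
If $S$ is commutative then we easily see that $X \circ (Y \circ Z) = (X \circ Y) \circ Z$ and so the algebra $S + IS$ is associative. However, if $S$ is not commutative then $S + IS$ is not associative since, in general,
$$t(bw^*) \ne (bw^*)t$$
obtained from the first components of $X \circ (Y \circ Z)$ and $(X \circ Y) \circ Z$.

We now show that $S + IS$ is an alternative algebra if and only if $S$ is associative. We assume that $S$ is alternative. Now the alternative laws are that, for any two elements $X, Y \in S + IS$,

(i) $X^2 \circ Y - X \circ (X \circ Y) = 0$

(ii) $(Y \circ X) \circ X - Y \circ (X^2) = 0$

We easily show that these two conditions are equivalent to a single condition. Now any element $X \in S + IS$ can be partitioned
$$X = -X^c + \alpha e_1, \qquad \alpha \in \mathbb{R}$$
and so, employing this in (i):
$$[X \circ (-X^c + \alpha e_1)] \circ Y - X \circ [(-X^c + \alpha e_1) \circ Y] = 0$$
$$-(X \circ X^c) \circ Y + \alpha(X \circ Y) + X \circ (X^c \circ Y) - \alpha(X \circ Y) = 0$$
$$\therefore\quad (X \circ X^c) \circ Y - X \circ (X^c \circ Y) = 0 \qquad (i)'$$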
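Similarly, in (ii):
$$(Y \circ X) \circ (-X^c + \alpha e_1) - Y \circ [X \circ (-X^c + \alpha e_1)] = 0$$
$$-(Y \circ X) \circ X^c + \alpha(Y \circ X) + Y \circ (X \circ X^c) - \alpha(Y \circ X) = 0$$
$$\therefore\quad (Y \circ X) \circ X^c - Y \circ (X \circ X^c) = 0 \qquad (ii)'$$
But, taking the conjugate of (i)' we have
$$[(X \circ X^c) \circ Y - X \circ (X^c \circ Y)]^c = [(X \circ X^c) \circ Y]^c - [X \circ (X^c \circ Y)]^c = Y^c \circ (X \circ X^c)^c - (X^c \circ Y)^c \circ X^c = Y^c \circ (X \circ X^c) - (Y^c \circ X) \circ X^c$$
which, except for the replacement $Y \to -Y^c$, is exactly the expression obtained in (ii)'. Hence we can express the alternative laws as a single condition:
$$(X \circ X^c) \circ Y - X \circ (X^c \circ Y) = 0$$
Now, as above, define $X$, $Y$:
$$X = \begin{bmatrix} t \\ u \end{bmatrix}, \qquad Y = \begin{bmatrix} v \\ w \end{bmatrix} \qquad\text{and thus}\qquad X^c = \begin{bmatrix} t^* \\ -u \end{bmatrix}$$
and hence
$$X \circ X^c = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} t^* \\ -u \end{bmatrix} = \begin{bmatrix} tt^* + uu^* \\ t^*u - t^*u \end{bmatrix} = \begin{bmatrix} tt^* + uu^* \\ 0 \end{bmatrix}$$
$$X^c \circ Y = \begin{bmatrix} t^* \\ -u \end{bmatrix} \circ \begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} t^*v + wu^* \\ -vu + tw \end{bmatrix}$$
therefore
$$X \circ (X^c \circ Y) = \begin{bmatrix} t \\ u \end{bmatrix} \circ \begin{bmatrix} t^*v + wu^* \\ -vu + tw \end{bmatrix} = \begin{bmatrix} t(t^*v + wu^*) - (-vu + tw)u^* \\ (t^*v + wu^*)u + t^*(-vu + tw) \end{bmatrix}$$
and
$$(X \circ X^c) \circ Y = \begin{bmatrix} tt^* + uu^* \\ 0 \end{bmatrix} \circ \begin{bmatrix} v \\ w \end{bmatrix} = \begin{bmatrix} (tt^* + uu^*)v \\ (tt^* + uu^*)w \end{bmatrix}$$
and so
$$X \circ (X^c \circ Y) - (X \circ X^c) \circ Y = \begin{bmatrix} t(t^*v) + (vu)u^* - (tt^* + uu^*)v + t(wu^*) - (tw)u^* \\ (t^*v)u - t^*(vu) + (wu^*)u + t^*(tw) - (tt^* + uu^*)w \end{bmatrix}$$
But $S$ is alternative and so
$$t(t^*v) = (tt^*)v, \qquad (vu)u^* = (uu^*)v, \qquad (wu^*)u = (uu^*)w, \qquad t^*(tw) = (tt^*)w$$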
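leading to
$$X \circ (X^c \circ Y) - (X \circ X^c) \circ Y = \begin{bmatrix} t(wu^*) - (tw)u^* \\ (t^*v)u - t^*(vu) \end{bmatrix}$$
implying immediately that $S + IS$ is alternative if and only if $S$ is associative.

Now the construction, described above, of $S + IS$ from $S$ can be repeated. We begin with
$$A_0 = \mathbb{R} = \{\alpha \mid \alpha \in \mathbb{R}\}, \qquad A_1 = \left\{\begin{bmatrix} \alpha \\ \beta \end{bmatrix} \;\middle|\; \alpha, \beta \in A_0 = \mathbb{R}\right\}$$
Algebra $A_1$ is the algebra of complex numbers since
$$\begin{bmatrix} \alpha \\ \beta \end{bmatrix} = \alpha\begin{bmatrix} 1 \\ 0 \end{bmatrix} + \beta\begin{bmatrix} 0 \\ 1 \end{bmatrix} = \alpha e_1 + I\beta$$
in which $I^2 = -e_1$. Of course $A_1$ is commutative, since $A_0$ is associative.
$$A_2 = \left\{p = \begin{bmatrix} z \\ w \end{bmatrix} \;\middle|\; z, w \in A_1 = \mathbb{C}\right\}$$
This is the algebra of quaternions. $A_2$ is associative since $A_1$ is commutative. However, as we know, $A_2$ is not commutative.
$$A_3 = \left\{Q = \begin{bmatrix} p \\ q \end{bmatrix} \;\middle|\; p, q \in A_2 = \mathbb{H}\right\}$$
This is the algebra of Cayley numbers $\mathbb{K}$. $A_3$ is not associative since $A_2$ is not commutative. However $A_3$ is alternative (and hence normed) since $A_2$ is associative. The 'last' algebra of the sequence is
$$A_4 = \left\{\begin{bmatrix} Q \\ P \end{bmatrix} \;\middle|\; Q, P \in A_3 = \mathbb{K}\right\}$$
However, this algebra is not alternative (and hence not normed) since $A_3$ is not associative. This completes the proof of Hurwitz's theorem: the only normed algebras over the real field are $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, and $\mathbb{K}$.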
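The doubling sequence $A_0, A_1, \ldots, A_4$ can be reproduced mechanically. The following sketch assumes each algebra is represented by flat coefficient lists of length 1, 2, 4, 8 and 16, with the product rule $(a,b)\circ(c,d) = (ac - db^*,\ cb + a^*d)$ applied recursively to the two halves. Commutativity and associativity are multilinear, so testing basis elements suffices; the alternative laws are quadratic in $X$, so sums of two basis elements are tested as well. All function names are hypothetical.

```python
def conj(x):
    """Conjugate of a flat coefficient list of length 2**n."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Ordered-pair product (a,b)o(c,d) = (ac - d b*, cb + a* d), recursively."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = [p - q for p, q in zip(mul(a, c), mul(d, conj(b)))]
    second = [p + q for p, q in zip(mul(c, b), mul(conj(a), d))]
    return first + second


def basis(n):
    """Standard basis e_0, ..., e_{n-1} of the n-dimensional algebra."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def is_commutative(n):
    e = basis(n)
    return all(mul(x, y) == mul(y, x) for x in e for y in e)


def is_associative(n):
    e = basis(n)
    return all(mul(mul(x, y), z) == mul(x, mul(y, z))
               for x in e for y in e for z in e)


def is_alternative(n):
    e = basis(n)
    # x ranges over e_i + e_j (i <= j); the case i == j gives 2 e_i.
    xs = [[p + q for p, q in zip(e[i], e[j])]
          for i in range(n) for j in range(i, n)]
    return all(mul(mul(x, x), y) == mul(x, mul(x, y)) and
               mul(mul(y, x), x) == mul(y, mul(x, x))
               for x in xs for y in e)


# (commutative, associative, alternative) for A_0 .. A_4
table = {n: (is_commutative(n), is_associative(n), is_alternative(n))
         for n in (1, 2, 4, 8, 16)}
for n, row in table.items():
    print(n, row)
```

Each property is lost one step after the property that guaranteed it, which mirrors the chain of implications used in the proof.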
4.6 Rotations in 7- and 8-Dimensional Euclidean Space

We begin by considering rotations in $\mathbb{R}^7$. The work carried through with quaternions suggests that we consider the operation:
$$W' = X \circ (W \circ X^{-1})$$
where $W$ and $X$ are given Cayley numbers. (Note that the right hand side can be shown to be equal to $(X \circ W) \circ X^{-1}$: to see this put $D \to X$, $C \to W$, $Y \to X^{-1}$ in Cayley identity (e).) Let
$$W = \sqrt{N_W}\,(\cos\phi + w\sin\phi), \qquad X = \sqrt{N_X}\,(\cos\theta + x\sin\theta), \qquad x \circ x = -1, \quad w \circ w = -1$$
Because the angle can be viewed as that subtended between the Cayley number and its scalar part we can construct the following diagram.

[Figure 4.1]

Theorem 4.5 The map $\phi_X : \mathbb{R}^8 \to \mathbb{R}^8$
$$\phi_X(W) = X \circ (W \circ X^{-1})$$
can be interpreted geometrically as describing a rotation of the vector part of $W$ about the vector part of $X$ through an angle $2\theta$.
Proof It is easy to show that $V(X)$ is left fixed by the mapping:
$$\phi_X(V(X)) = X \circ (V(X) \circ X^{-1}) = (S(X) + V(X)) \circ \left[V(X) \circ \frac{1}{N_X}(S(X) - V(X))\right]$$
$$= \frac{1}{N_X}(S(X) + V(X)) \circ [S(X)V(X) - V(X) \circ V(X)] = \frac{1}{N_X}(S(X) + V(X)) \circ [(S(X) - V(X)) \circ V(X)]$$
$$= X \circ [X^{-1} \circ V(X)] = V(X)$$
We are therefore justified in referring to $V(X)$ as the axis of rotation. It is also easy to show that the norm and scalar part of $W$ are conserved.
$$N_{W'} = N_X N_{W \circ X^{-1}} = N_X N_W N_{X^{-1}} = N_W$$
since $N_{X^{-1}} = \dfrac{1}{N_X}$. Concerning the scalar part, we have noted earlier in Section 4.4 that for any Cayley numbers $X$, $Y$ then $S(X \circ Y) = S(Y \circ X)$. Thus:
$$S(W') = S(X \circ (W \circ X^{-1})) = S((W \circ X^{-1}) \circ X) = S(W)$$
since $(W \circ X^{-1}) \circ X = W$. Now
$$X \circ (W \circ X^{-1}) = X \circ [(S(W) + V(W)) \circ X^{-1}] = X \circ (S(W) \circ X^{-1}) + X \circ (V(W) \circ X^{-1}) = S(W) + X \circ (V(W) \circ X^{-1})$$
and since
$$S(X \circ (W \circ X^{-1})) = S(W)$$
then
$$S(X \circ (V(W) \circ X^{-1})) = 0$$
Therefore $X \circ (V(W) \circ X^{-1})$ is pure Cayley and so
$$V(X \circ (W \circ X^{-1})) = X \circ (V(W) \circ X^{-1})$$
Let $W$ be such that its vector part is parallel to $w$; then $V(W')$ is parallel to
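$X \circ (w \circ X^{-1}) = w'$. Now choose $p$ to be a unit pure Cayley number in the 'plane' with 'normal' $x$ (i.e. $S(x \circ p) = 0$). (See Figure 4.2.)

[Figure 4.2: diagram with the scalar axis marked]

Let $\lambda$ be the 'angle' between $x$ and $w$.
$$w = x\cos\lambda + p\sin\lambda$$
$w$ is a unit Cayley number since (using $x^c = x^{-1} = -x$, $p^c = p^{-1} = -p$)
$$w \circ w^c = (x\cos\lambda + p\sin\lambda) \circ (-x\cos\lambda - p\sin\lambda)$$
$$\therefore\quad w \circ w^c = -x \circ x\cos^2\lambda - p \circ p\sin^2\lambda - \sin\lambda\cos\lambda\,(p \circ x + x \circ p) = \cos^2\lambda + \sin^2\lambda = 1$$
since $p \circ x = -x \circ p$ as $p$, $x$ are perpendicular. Now
$$w' = X \circ (w \circ X^{-1}) = X \circ [(x\cos\lambda + p\sin\lambda) \circ X^{-1}] = X \circ (x \circ X^{-1})\cos\lambda + X \circ (p \circ X^{-1})\sin\lambda$$
We shall show that $X \circ (x \circ X^{-1}) = x$ and that $X \circ (p \circ X^{-1})$ is a pure Cayley number which revolves through an angle $2\theta$ about $x$. The first part is relatively easy. Since $V(X)$ and $x$ are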
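parallel then
$$V(V(X) \circ x) = \frac{1}{2}[V(X) \circ x + x^c \circ V(X)] = \frac{1}{2}[V(X) \circ x - x \circ V(X)] = 0$$
therefore
$$X \circ (x \circ X^{-1}) = (S(X) + V(X)) \circ [x \circ (S(X) - V(X))]\frac{1}{N_X} = (S(X) + V(X)) \circ [S(X)x - x \circ V(X)]\frac{1}{N_X}$$
$$= \frac{1}{N_X}\{xS^2(X) + S(X)[V(X) \circ x - x \circ V(X)] - V(X) \circ (x \circ V(X))\}$$
But
$$V^{-1}(X) = -\frac{V(X)}{N_{V(X)}} \quad\therefore\quad V(X) \circ (x \circ V(X)) = -N_{V(X)}V(X) \circ (x \circ V^{-1}(X)) = -N_{V(X)}\,x$$
from a result obtained earlier
$$\therefore\quad X \circ (x \circ X^{-1}) = \frac{1}{N_X}[S^2(X) + N_{V(X)}]\,x = x$$
The second part is developed along similar lines.
$$X \circ (p \circ X^{-1}) = (S(X) + V(X)) \circ [p \circ (S(X) - V(X))]\frac{1}{N_X} = \frac{1}{N_X}\{pS^2(X) + S(X)[V(X) \circ p - p \circ V(X)] - V(X) \circ (p \circ V(X))\}$$
However, since $V(X)$ and $p$ are perpendicular, adapting an earlier result gives:
$$V(X) \circ (p \circ V(X)) = N_{V(X)}\,p$$
$$\therefore\quad X \circ (p \circ X^{-1}) = \frac{1}{N_X}\left[(S^2(X) - N_{V(X)})\,p + S(X)\{V(X) \circ p - p \circ V(X)\}\right]$$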
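Let $p' = X \circ (p \circ X^{-1})$; then we show $p'$ is perpendicular to $x$ (showing that $p$ has been rotated about $x$). To do this we need to show $S[[X \circ (p \circ X^{-1})] \circ x] = 0$. Now $S(p \circ x) = -S(p \circ x^c) = 0$ by our original assumption. So all we need to show is that
$$S[(V(X) \circ p - p \circ V(X)) \circ x] = 0$$
or (equivalently)
$$S[(x \circ p - p \circ x) \circ x] = 0$$
Now, since $p$, $x$ are perpendicular,
$$(x \circ p) \circ x = -(p \circ x) \circ x = (p \circ x^{-1}) \circ x = p \quad\text{and}\quad -(p \circ x) \circ x = (p \circ x^{-1}) \circ x = p$$
$$\therefore\quad S[(x \circ p - p \circ x) \circ x] = S(2p) = 0$$
We can also determine the angle of rotation:
$$\cos\phi = \frac{S(p' \circ p^c)}{\sqrt{N_{p'}}\sqrt{N_p}} = S(p' \circ p^c) = -S(p' \circ p) = -S[\{X \circ (p \circ X^{-1})\} \circ p]$$
$$= -\frac{1}{N_X}(S^2(X) - N_{V(X)})S(p \circ p) - \frac{S(X)}{N_X}S\{[V(X) \circ p - p \circ V(X)] \circ p\}$$
$$= \frac{1}{N_X}(S^2(X) - N_{V(X)}) - \frac{S(X)}{N_X}S(-V(X) - V(X)) = \frac{1}{N_X}(S^2(X) - N_{V(X)})$$
$$= \frac{1}{N_X}(N_X\cos^2\theta - N_X\sin^2\theta) = \cos 2\theta \quad\therefore\quad \phi = 2\theta$$
Finally
$$w' = x\cos\lambda + p'\sin\lambda \quad\text{where}\quad p' = p\cos 2\theta + x \circ p\,\sin 2\theta$$
We note that $x \circ p$ is perpendicular to both $x$ and to $p$.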
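The properties established in this proof can be verified on a concrete example. This is a sketch under stated assumptions: Cayley numbers are flat lists of 8 coefficients, the product is the ordered-pair rule $(a,b)\circ(c,d) = (ac - d\bar b,\ cb + \bar a d)$, and exact rational arithmetic from Python's `fractions` module is used so that the inverse introduces no rounding. The names `inverse` and `rotate` are illustrative only.

```python
from fractions import Fraction


def conj(x):
    """Conjugate of a flat coefficient list of length 2**n."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-v for v in x[h:]]


def mul(x, y):
    """Ordered-pair product (a,b)o(c,d) = (ac - d b*, cb + a* d), recursively."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    first = [p - q for p, q in zip(mul(a, c), mul(d, conj(b)))]
    second = [p + q for p, q in zip(mul(c, b), mul(conj(a), d))]
    return first + second


def norm(x):
    """N_X as the sum of squared coefficients."""
    return sum(v * v for v in x)


def inverse(x):
    """X^{-1} = X^c / N_X, kept exact with rational numbers."""
    n = norm(x)
    return [Fraction(v, n) for v in conj(x)]


def rotate(x, w):
    """W' = X o (W o X^{-1})."""
    return mul(x, mul(w, inverse(x)))


X = [2, 1, -1, 0, 3, 0, 1, -2]
W = [3, -1, 4, 1, -5, 9, 2, -6]
W_rot = rotate(X, W)

print(norm(W_rot) == norm(W))                  # norm conserved
print(W_rot[0] == W[0])                        # scalar part conserved
VX = [0] + X[1:]
print(rotate(X, VX) == VX)                     # the axis V(X) is fixed
print(mul(mul(X, W), inverse(X)) == W_rot)     # (X o W) o X^{-1} agrees

# theta = 45 degrees: X = 1 + e1 should turn p = e2 into e1 o e2 (angle 90 degrees)
e1 = [0, 1, 0, 0, 0, 0, 0, 0]
e2 = [0, 0, 1, 0, 0, 0, 0, 0]
quarter = [1, 1, 0, 0, 0, 0, 0, 0]
print(rotate(quarter, e2) == mul(e1, e2))
```

The last check uses an unnormalised $X$; this is harmless because the factor $\sqrt{N_X}$ cancels between $X$ and $X^{-1}$.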
Successive Rotations

A natural question that can now be asked is what happens if a second rotation is performed. Are two rotations equivalent to a single one, as they are for complex numbers and for quaternions? Mathematically we ask: if
$$W' = X \circ (W \circ X^{-1}) \quad\text{and}\quad W'' = Y \circ (W' \circ Y^{-1})$$
then is
$$W'' = Y \circ (W' \circ Y^{-1}) = (Y \circ X) \circ [W \circ (Y \circ X)^{-1}]\;?$$
Of course, for quaternions this relation is valid and is obtained immediately since quaternion multiplication is associative. Unfortunately this relation is not true for Cayley multiplication, except in certain special cases. For example if $Y = X^{-1}$ (this corresponds to choosing our second rotation to be the reverse rotation and should return us to where we started) then one can show that
$$W'' = X^{-1} \circ \{[X \circ (W \circ X^{-1})] \circ X\} = W$$
Proof. In Cayley identity (e) let $D \to W$, $C \to X$ and $Y \to X^{-1}$; then
$$X \circ (W \circ X^{-1}) = (W \circ X) \circ X^{-1} - W \circ (X \circ X^{-1}) + (X \circ W) \circ X^{-1} = W - W + (X \circ W) \circ X^{-1} = K \circ X^{-1}$$
where $K = X \circ W$; therefore
$$[X \circ (W \circ X^{-1})] \circ X = (K \circ X^{-1}) \circ X = K$$
$$\therefore\quad X^{-1} \circ \{[X \circ (W \circ X^{-1})] \circ X\} = X^{-1} \circ (X \circ W) = W$$
which proves the relation.

We now use the Cayley number identities to develop a more general result. Since, as we have argued above, the transformation
$$W' = X_1 \circ (W \circ X_1^{-1}) \quad\text{with}\quad X_1 = \sqrt{N_{X_1}}\,(\cos\theta + x\sin\theta)$$
rotates the vector part of $W$ through an angle $2\theta$ about an axis $x$, we might expect that the transformation:
$$W'' = X_2 \circ [(X_1 \circ (W \circ X_1^{-1})) \circ X_2^{-1}]$$
201 where X2 = \/Nx2 (cos <f> + x sin <f) would imply a further rotation through angle 2cf> about the same axis £ and therefore be equivalent to a single transformation {X2oX1)o[Wo(X2oX1)-1)] since X2 ° X\ = y/NXi ^/NXl (cos(0 + </>) + £ sin(0 + <£)). As the following theorem shows this turns out not to be true. Theorem 4.6 Let X, Y, W be general Cayley numbers. The relation: {Yox)o\Wo(Yo x)-1)] = y o [(x o (w o X"1)) o y-1] is valid for all W only when V(Y) = kV(X) where k is a real number. Proof. Without loss of generality we assume X, Y have unit norms. Let Y = S{Y) + V(Y) X = S{X) + V(X) then Xo(Wo X'1) = \S(X) + V(X)} o [S(X)W -Wo V(X)} = S2(X)W - S{X)[W o V{X) - V(X) oW}- V(X) o(Wo V(X)) thus [Xo(i¥oX-1)]0y-1 = S2{X)S{Y)W - S(X)S(Y)[W o V(X) - V{X) oW}- S(Y)V(X) o{Wo V(X)) - S2(X)W o V(Y) + S{X)[(W o V{X)) o V(Y) - (V(X) oW)o V(Y)] + [V{X) o(Wo V(X))] o V(Y) Therefore YoUXoiWoX-^oY-1} = S2{X)S2{Y)W - S(X)S2(Y)\W o V{X) - V{X) oW\- S2{Y)V(X) o{Wo V{X)) -S2(X)S(Y)W o V(Y) + S(X)S(Y)[{W o V(X)) o V(Y) - {V{X) o W) o V(Y)]
+S{Y)\V(X) o (W o V(X))] o V(Y) +S2(X)S(Y)V(Y) oW- S(X)S(Y)[V(Y) o(Wo V(X) - V{Y) o (V{X) o W)] -S(Y)V(Y) o [V(X) o(Wo V(X))] - S2(X)V(Y) o(Wo V(Y)) +S(X)V(Y) o \(W o V(X)) o y(y)] - S{X)V{Y) o [(V(X) o W) o V{Y)] +v(Y) o {\v{x) o(wo v(x))] o v(y)} Now YoX = S(Y)S(X) + 5(Jf)V(y) + S{Y)V{X) + V(y) o V{X) and so (y o X)-1 = S{Y)S{X) - S(X)V(Y) - S{Y)V(X) + V{X) o V(Y) Therefore (y o x) o [w o (y o x)-1} = S2(Y)S2{X)W - S(Y)S2(X)W o V(Y) -S2{Y)S(X)W o V(X) + S(Y)S(X)W o (V(X) o V(y)) +5(y)52(x)y(y) o w - s2(x)v(Y) o(Wo v(y)) -S(Y)S(X)V{Y) o(Wo V(X)) + S{X)V(Y) o[Wo (V(X) o V(y))] +52(y)5(X) V(X) oW- S(X)S{Y)V(X) o(Wo V(Y)) -S2(Y)V(X) o(Wo V(X)) + S{Y)V(X) o[Wo (V(X) o V(Y))] +S{Y)S(X){V{Y) o V(X)) oW- S{X)(V(Y) o V(JQ) o (W o V(y)) -S(Y)(V(Y) o V(X)) o (W o V{X)) + [V(Y) o V(X)] o [W o (V(X) o V(y))] Now form ropofifo x-1)} o y-1] - (y o x) o [w o (y o x)-1]
$$\begin{aligned}
={}& S(Y)\Big\{[V(X) \circ (W \circ V(X))] \circ V(Y) - V(Y) \circ [V(X) \circ (W \circ V(X))] \\
& \qquad - V(X) \circ [W \circ (V(X) \circ V(Y))] + (V(Y) \circ V(X)) \circ (W \circ V(X))\Big\} \\
& + S(X)\Big\{V(Y) \circ [(W \circ V(X)) \circ V(Y)] - V(Y) \circ [(V(X) \circ W) \circ V(Y)] \\
& \qquad - V(Y) \circ [W \circ (V(X) \circ V(Y))] + (V(Y) \circ V(X)) \circ (W \circ V(Y))\Big\} \\
& + S(X)S(Y)\Big\{(W \circ V(X)) \circ V(Y) - (V(X) \circ W) \circ V(Y) \\
& \qquad - V(Y) \circ (W \circ V(X)) + V(Y) \circ (V(X) \circ W) \\
& \qquad - W \circ (V(X) \circ V(Y)) + V(X) \circ (W \circ V(Y)) \\
& \qquad - (V(Y) \circ V(X)) \circ W + V(Y) \circ (W \circ V(X))\Big\} \\
& + V(Y) \circ \{[V(X) \circ (W \circ V(X))] \circ V(Y)\} - (V(Y) \circ V(X)) \circ [W \circ (V(X) \circ V(Y))]
\end{aligned}$$

This is the general expression, which appears not to vanish. To see this consider the last two terms, which must vanish separately from the other terms. Clearly by Cayley identity (i) we have

$$V(Y) \circ \{[V(X) \circ (W \circ V(X))] \circ V(Y)\} - (V(Y) \circ V(X)) \circ [W \circ (V(X) \circ V(Y))] = (V(Y) \circ V(X)) \circ \{(W \circ V(X)) \circ V(Y) - W \circ (V(X) \circ V(Y))\}$$

which does not vanish unless (since Cayley numbers constitute a division algebra) $V(Y) = 0$, $V(X) = 0$ or

$$(W \circ V(X)) \circ V(Y) - W \circ (V(X) \circ V(Y)) = 0$$

We first show that there is no loss of generality if we assume $W$ is also a pure Cayley number, as

$$[(S(W) + V(W)) \circ V(X)] \circ V(Y) = [S(W) + V(W)] \circ (V(X) \circ V(Y))$$
which after expansion and cancellation reduces to

$$[V(W) \circ V(X)] \circ V(Y) = V(W) \circ [V(X) \circ V(Y)]$$

However, as proved in Section 4.4, this holds for arbitrary Cayley numbers $W$ only if $V(Y) = kV(X)$ where $k$ is a real number.

Returning to the main problem we make the identification $V(Y) = kV(X)$; then, apart from obvious cancellations and writing $V$ for $V(X)$, we obtain:

$$\begin{aligned}
Y \circ [\{X \circ (W \circ X^{-1})\} \circ Y^{-1}] - (Y \circ X) \circ [W \circ (Y \circ X)^{-1}] ={}& kS(Y)\{(V \circ (W \circ V)) \circ V - V \circ (W \circ (V \circ V))\} \\
& + k^2 S(X)\{-V \circ ((V \circ W) \circ V) + (V \circ V) \circ (W \circ V)\} \\
& + kS(X)S(Y)\{-(V \circ W) \circ V + V \circ (W \circ V)\} \\
& + k^2\{V \circ \{[V \circ (W \circ V)] \circ V\} - (V \circ V) \circ [W \circ (V \circ V)]\}
\end{aligned}$$

which is easily seen to vanish by applying one or other of the Cayley identities (particularly (h)).

Reflections in 7-dimensional Space

If $p$ is a unit pure Cayley number then $p^{-1} = -p$ and the angle of this unit Cayley number is $\pi/2$. Thus

$$p \circ (W \circ p^{-1}) = -p \circ (W \circ p)$$

describes a rotation of $V(W)$ through 180° about $p$. Similarly $p \circ (W \circ p)$ describes a reflection in the plane with normal $p$. Let $X$ be a Cayley number whose vector part lies in the plane with normal $p$. That is, $S(p \circ V(X)) = 0$. Then under the reflection we find (from previous work)

$$V(p \circ (X \circ p)) = p \circ (V(X) \circ p) = V(X)$$

That is, elements in the plane normal to $p$ are unchanged by the transformation $p \circ (X \circ p)$. Also, in agreement with our interpretation, the direction of any vector parallel to $p$ is reversed:

$$V(p \circ ((rp) \circ p)) = -rp$$
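Theorem 4.6 can also be probed numerically. The following is only a sketch under stated assumptions: Cayley numbers are stored as pairs of quaternions with the product $[p, q] \circ [a, b] = [pa - b\bar{q},\; aq + \bar{p}b]$ (the rule used in the MAPLE procedures of Appendix 2), and the helper names `qmul`, `cmul`, `cinv`, `rotate` and `gap` are hypothetical, introduced here for illustration. A floating-point check is evidence, not a proof.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Cayley product [p,q] o [a,b] = [p a - b conj(q), a q + conj(p) b]."""
    (p, q), (a, b) = X, Y
    first = tuple(s - t for s, t in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(s + t for s, t in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

def cinv(X):
    """Inverse X^c / N_X, with X^c = [conj(p), -q]."""
    p, q = X
    n = sum(t*t for t in p + q)
    return (tuple(t/n for t in qconj(p)), tuple(-t/n for t in q))

def rotate(X, W):
    """The map W -> X o (W o X^-1)."""
    return cmul(X, cmul(W, cinv(X)))

def gap(X, Y, W):
    """Largest component of Y o [(X o (W o X^-1)) o Y^-1] - (YoX) o [W o (YoX)^-1]."""
    two_step = rotate(Y, rotate(X, W))
    one_step = rotate(cmul(Y, X), W)
    return max(abs(s - t) for u, v in zip(two_step, one_step) for s, t in zip(u, v))

X = ((1.0, 2.0, -1.0, 0.5), (0.0, 3.0, -2.0, 1.0))
W = ((0.5, -1.0, 2.0, 1.0), (4.0, -3.0, 0.25, 1.5))

# Case V(Y) = k V(X): arbitrary scalar part 0.3, vector part k times that of X.
k = -1.7
Y_parallel = ((0.3, k*2.0, k*-1.0, k*0.5), tuple(k*t for t in X[1]))

# Case V(Y) not proportional to V(X): X = 1 + [0,1], Y = 1 + [i,0], W = [j,0].
X2 = ((1.0, 0.0, 0.0, 0.0), (1.0, 0.0, 0.0, 0.0))
Y2 = ((1.0, 1.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
W2 = ((0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 0.0))

print(gap(X, Y_parallel, W))   # expected to be at rounding level
print(gap(X2, Y2, W2))         # expected to be of order one
```

The second example uses small integer coefficients so that the two sides can also be worked out by hand from the basis relations of Section 4.7.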
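Rotations in $\mathbb{R}^8$

We now consider the transformation maps $\phi_L$, $\phi_R$:

$$\phi_L : \mathbb{R}^8 \to \mathbb{R}^8 \quad \phi_L(X) = Q \circ X \qquad\qquad \phi_R : \mathbb{R}^8 \to \mathbb{R}^8 \quad \phi_R(X) = X \circ Q$$

where $Q, X \in \mathbb{K}$ and $N_Q = 1$. These maps are norm and angle preserving:

$$N_{Q \circ X} = N_Q N_X = N_X$$

and if $X, Y \in \mathbb{K}$ subtend an angle $A$ before the transformation (say $\phi_L$):

$$\cos A = \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}}$$

then after the transformation the angle subtended is

$$\cos A' = \frac{S((Q \circ X) \circ (Q \circ Y)^c)}{\sqrt{N_{Q \circ X}}\sqrt{N_{Q \circ Y}}} = \frac{S((Q \circ X) \circ (Y^c \circ Q^c))}{\sqrt{N_X}\sqrt{N_Y}}$$

However, as we have noted earlier,

$$S((Q \circ X) \circ (Z \circ Q^c)) = N_Q\, S(X \circ Z)$$

As a special case

$$S((Q \circ X) \circ (Y^c \circ Q^c)) = S(X \circ Y^c) \quad\text{if } N_Q = 1$$

Hence, using this theorem in the change of angle formula we have:

$$\cos A' = \cos A$$

so the map $\phi_L$ is angle preserving. Similarly, for the map $\phi_R(X) = X \circ Q$:

$$\begin{aligned}
\cos A' &= \frac{S((X \circ Q) \circ (Y \circ Q)^c)}{\sqrt{N_{X \circ Q}}\sqrt{N_{Y \circ Q}}} = \frac{S((X \circ Q) \circ (Q^c \circ Y^c))}{\sqrt{N_X}\sqrt{N_Y}} = \frac{S((Q^c \circ Y^c) \circ (X \circ Q))}{\sqrt{N_X}\sqrt{N_Y}} \\
&= \frac{S(Y^c \circ X)}{\sqrt{N_X}\sqrt{N_Y}} \quad\text{by the above theorem} \\
&= \frac{S(X \circ Y^c)}{\sqrt{N_X}\sqrt{N_Y}} = \cos A.
\end{aligned}$$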
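The norm and angle statements above rest on $S((Q \circ X) \circ (Q \circ Y)^c) = N_Q\, S(X \circ Y^c)$ and its right-handed counterpart. A minimal numerical sketch follows, assuming the pair-of-quaternions product $[p, q] \circ [a, b] = [pa - b\bar{q},\; aq + \bar{p}b]$ of Appendix 2; the function names are hypothetical. $Q$ is deliberately not normalised so that the factor $N_Q$ is visible.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Cayley product [p,q] o [a,b] = [p a - b conj(q), a q + conj(p) b]."""
    (p, q), (a, b) = X, Y
    first = tuple(s - t for s, t in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(s + t for s, t in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

def cconj(X):
    """Cayley conjugate [p,q]^c = [conj(p), -q]."""
    p, q = X
    return (qconj(p), tuple(-t for t in q))

def inner(X, Y):
    """Inner product S(X o Y^c): the scalar part is the first real component."""
    return cmul(X, cconj(Y))[0][0]

def norm(X):
    return inner(X, X)

Q = ((1.0, 2.0, -1.0, 0.5), (0.0, 3.0, -2.0, 1.0))
X = ((0.5, -1.0, 2.0, 1.0), (4.0, -3.0, 0.25, 1.5))
Y = ((-2.0, 0.0, 1.0, 3.0), (1.0, 1.0, -1.0, 2.0))

left = inner(cmul(Q, X), cmul(Q, Y))    # phi_L applied to both arguments
right = inner(cmul(X, Q), cmul(Y, Q))   # phi_R applied to both arguments
print(left, right, norm(Q) * inner(X, Y))
```

With $N_Q = 1$ the three printed numbers coincide with $S(X \circ Y^c)$ itself, which is the orthogonality property used in the text.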
All that is left to be done to show that $\phi_L$, $\phi_R$ describe rotations in $\mathbb{R}^8$ is to show that these maps are orientation preserving orthogonal maps. To show that they are orthogonal we can either determine the matrix representations of the maps and show that they are orthogonal, or we can check that the maps preserve the inner product. We shall consider both approaches. These maps preserve the inner product since

$$\langle \phi_L(X), \phi_L(Y) \rangle = S((Q \circ X) \circ (Q \circ Y)^c) = S[(Q \circ X) \circ (Y^c \circ Q^c)] = S(X \circ Y^c) = \langle X, Y \rangle$$

in which we have used the result obtained above, $S((Q \circ X) \circ (Y^c \circ Q^c)) = S(X \circ Y^c)$. Hence the map $\phi_L$ is orthogonal. The same approach shows that $\phi_R$ is also orthogonal. To determine matrix representations we need to obtain appropriate basis elements.

4.7 Basis elements for Cayley numbers

Let

$$Q = \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} p_0 + p_1 i + p_2 j + p_3 k \\ q_0 + q_1 i + q_2 j + q_3 k \end{bmatrix}$$

Although we could make many choices for basis elements, to conform with accepted formulations we make the following selection:

$$e^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\; e^{(1)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\; e^{(2)} = \begin{bmatrix} i \\ 0 \end{bmatrix},\; e^{(3)} = \begin{bmatrix} j \\ 0 \end{bmatrix},\; e^{(4)} = \begin{bmatrix} 0 \\ i \end{bmatrix},\; e^{(5)} = \begin{bmatrix} k \\ 0 \end{bmatrix},\; e^{(6)} = \begin{bmatrix} 0 \\ k \end{bmatrix},\; e^{(7)} = \begin{bmatrix} 0 \\ j \end{bmatrix}$$

As the reader should verify (see exercise) these basis vectors satisfy the following relations ($m, n \neq 0$):

$$e^{(m)} \circ e^{(m)} = -e^{(0)} \quad (m \neq 0)$$

$$e^{(m)} \circ e^{(n)} = -e^{(n)} \circ e^{(m)} \quad (n \neq m)$$

$$e^{(m)} \circ e^{(m+1)} = e^{(m+3)} \quad (\text{indices modulo } 7)$$

The triples $(e^{(m)}, e^{(m+1)}, e^{(m+3)}$; indices modulo 7$)$ form what are called Hamilton triangles (cf. Porteous [19]). They mimic the orthonormal triplet $(i, j, k)$ of 3-dimensional space. There are seven Hamilton triangles. In Figure 4.3 only one is highlighted
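$(e^{(1)}, e^{(2)}, e^{(4)})$. The other Hamilton triangles are obtained by moving this triangle so that the vertex (shown at $e^{(1)}$) moves to the vertex at $e^{(2)}, e^{(3)}, \ldots, e^{(7)}$ in turn.

Figure 4.3

We shall meet Hamilton triangles again in our further analysis of rotations in 8-dimensional space. Other triplets can be chosen from $e^{(m)}$, $m = 1, 2, \ldots, 7$. A Cayley triangle is a collection of three mutually orthogonal unit norm pure Cayley numbers $(\hat{h}, \hat{\ell}, \hat{m})$ with the property that

$$S(\hat{h} \circ (\hat{\ell} \circ \hat{m})) = S(\hat{\ell} \circ (\hat{m} \circ \hat{h})) = S(\hat{m} \circ (\hat{h} \circ \hat{\ell})) = 0$$

(i.e. any one element is orthogonal to the product of the remaining two). For example $(e^{(1)}, e^{(3)}, e^{(5)})$ is a Cayley triangle:

$$e^{(5)} \circ (e^{(1)} \circ e^{(3)}) = \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \left\{ \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} j \\ 0 \end{bmatrix} \right\} = \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} 0 \\ j \end{bmatrix} = \begin{bmatrix} 0 \\ i \end{bmatrix}$$

so $S(e^{(5)} \circ (e^{(1)} \circ e^{(3)})) = 0$. Similarly $S(e^{(1)} \circ (e^{(3)} \circ e^{(5)})) = 0$ and $S(e^{(3)} \circ (e^{(5)} \circ e^{(1)})) = 0$.

From a single Cayley triangle the complete orthonormal basis for the set of pure Cayley numbers can be constructed. From the Cayley triplet $(e^{(1)}, e^{(3)}, e^{(5)})$ define $p = e^{(1)} \circ e^{(3)}$; then the set

$$\{e^{(1)},\; e^{(3)},\; p,\; e^{(5)},\; e^{(1)} \circ e^{(5)},\; e^{(3)} \circ e^{(5)},\; p \circ e^{(5)}\}$$

is an orthonormal set and so is a basis for the space of pure Cayley numbers.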
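The relations for the basis elements and the Cayley triangle example can be checked mechanically. The sketch below assumes the product $[p, q] \circ [a, b] = [pa - b\bar{q},\; aq + \bar{p}b]$ of Appendix 2 and the ordering of basis elements chosen in Section 4.7; `SLOTS`, `e`, `step` and the other names are hypothetical helpers. Integer arithmetic is used, so every comparison is exact.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Cayley product [p,q] o [a,b] = [p a - b conj(q), a q + conj(p) b]."""
    (p, q), (a, b) = X, Y
    first = tuple(s - t for s, t in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(s + t for s, t in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

# e^(m) sits in (which quaternion of the pair, which component of it):
# e0=[1,0], e1=[0,1], e2=[i,0], e3=[j,0], e4=[0,i], e5=[k,0], e6=[0,k], e7=[0,j]
SLOTS = [(0, 0), (1, 0), (0, 1), (0, 2), (1, 1), (0, 3), (1, 3), (1, 2)]

def e(m):
    parts = [[0, 0, 0, 0], [0, 0, 0, 0]]
    half, comp = SLOTS[m]
    parts[half][comp] = 1
    return (tuple(parts[0]), tuple(parts[1]))

def neg(X):
    return tuple(tuple(-t for t in part) for part in X)

def dot(X, Y):
    return sum(s*t for u, v in zip(X, Y) for s, t in zip(u, v))

def step(m, k):
    """Index m + k taken modulo 7 within the range 1..7."""
    return (m - 1 + k) % 7 + 1

units = range(1, 8)
squares_ok = all(cmul(e(m), e(m)) == neg(e(0)) for m in units)
anticommute_ok = all(cmul(e(m), e(n)) == neg(cmul(e(n), e(m)))
                     for m in units for n in units if m != n)
hamilton_ok = all(cmul(e(m), e(step(m, 1))) == e(step(m, 3)) for m in units)

# Cayley triangle (e1, e3, e5): each element is orthogonal to the product of the other two.
h, l, m_ = e(1), e(3), e(5)
triangle_ok = (cmul(h, cmul(l, m_))[0][0] == 0 and
               cmul(l, cmul(m_, h))[0][0] == 0 and
               cmul(m_, cmul(h, l))[0][0] == 0)

# Orthonormal basis of the pure Cayley numbers built from the triangle.
p = cmul(h, l)
basis = [h, l, p, m_, cmul(h, m_), cmul(l, m_), cmul(p, m_)]
orthonormal_ok = all(dot(basis[r], basis[c]) == (1 if r == c else 0)
                     for r in range(7) for c in range(7))

print(squares_ok, anticommute_ok, hamilton_ok, triangle_ok, orthonormal_ok)
```

Each of the seven constructed elements turns out to be a signed basis element, which is why the set is orthonormal.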
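We can use the present basis vectors to find:

$$\phi_L(e^{(0)}) = \begin{bmatrix} p \\ q \end{bmatrix} \circ \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} p \\ q \end{bmatrix} \qquad \phi_L(e^{(1)}) = \begin{bmatrix} p \\ q \end{bmatrix} \circ \begin{bmatrix} 0 \\ 1 \end{bmatrix} = \begin{bmatrix} -\bar{q} \\ \bar{p} \end{bmatrix}$$

and so on. In terms of the basis elements:

$$\begin{aligned}
\phi_L(e^{(0)}) &= p_0e^{(0)} + q_0e^{(1)} + p_1e^{(2)} + p_2e^{(3)} + q_1e^{(4)} + p_3e^{(5)} + q_3e^{(6)} + q_2e^{(7)} \\
\phi_L(e^{(1)}) &= -q_0e^{(0)} + p_0e^{(1)} + q_1e^{(2)} + q_2e^{(3)} - p_1e^{(4)} + q_3e^{(5)} - p_3e^{(6)} - p_2e^{(7)} \\
\phi_L(e^{(2)}) &= -p_1e^{(0)} - q_1e^{(1)} + p_0e^{(2)} + p_3e^{(3)} + q_0e^{(4)} - p_2e^{(5)} + q_2e^{(6)} - q_3e^{(7)} \\
\phi_L(e^{(3)}) &= -p_2e^{(0)} - q_2e^{(1)} - p_3e^{(2)} + p_0e^{(3)} + q_3e^{(4)} + p_1e^{(5)} - q_1e^{(6)} + q_0e^{(7)} \\
\phi_L(e^{(4)}) &= -q_1e^{(0)} + p_1e^{(1)} - q_0e^{(2)} - q_3e^{(3)} + p_0e^{(4)} + q_2e^{(5)} + p_2e^{(6)} - p_3e^{(7)} \\
\phi_L(e^{(5)}) &= -p_3e^{(0)} - q_3e^{(1)} + p_2e^{(2)} - p_1e^{(3)} - q_2e^{(4)} + p_0e^{(5)} + q_0e^{(6)} + q_1e^{(7)} \\
\phi_L(e^{(6)}) &= -q_3e^{(0)} + p_3e^{(1)} - q_2e^{(2)} + q_1e^{(3)} - p_2e^{(4)} - q_0e^{(5)} + p_0e^{(6)} + p_1e^{(7)} \\
\phi_L(e^{(7)}) &= -q_2e^{(0)} + p_2e^{(1)} + q_3e^{(2)} - q_0e^{(3)} + p_3e^{(4)} - q_1e^{(5)} - p_1e^{(6)} + p_0e^{(7)}
\end{aligned}$$

Thus the matrix representing the transformation $\phi_L$ is

$$M_{\phi_L} = \begin{bmatrix}
p_0 & -q_0 & -p_1 & -p_2 & -q_1 & -p_3 & -q_3 & -q_2 \\
q_0 & p_0 & -q_1 & -q_2 & p_1 & -q_3 & p_3 & p_2 \\
p_1 & q_1 & p_0 & -p_3 & -q_0 & p_2 & -q_2 & q_3 \\
p_2 & q_2 & p_3 & p_0 & -q_3 & -p_1 & q_1 & -q_0 \\
q_1 & -p_1 & q_0 & q_3 & p_0 & -q_2 & -p_2 & p_3 \\
p_3 & q_3 & -p_2 & p_1 & q_2 & p_0 & -q_0 & -q_1 \\
q_3 & -p_3 & q_2 & -q_1 & p_2 & q_0 & p_0 & -p_1 \\
q_2 & -p_2 & -q_3 & q_0 & -p_3 & q_1 & p_1 & p_0
\end{bmatrix}$$

which can be written in partitioned form:

$$M_{\phi_L} = \begin{bmatrix} A & B \\ -B^T & D \end{bmatrix}$$

where

$$A = \begin{bmatrix} p_0 & -q_0 & -p_1 & -p_2 \\ q_0 & p_0 & -q_1 & -q_2 \\ p_1 & q_1 & p_0 & -p_3 \\ p_2 & q_2 & p_3 & p_0 \end{bmatrix}$$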
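$$B = \begin{bmatrix} -q_1 & -p_3 & -q_3 & -q_2 \\ p_1 & -q_3 & p_3 & p_2 \\ -q_0 & p_2 & -q_2 & q_3 \\ -q_3 & -p_1 & q_1 & -q_0 \end{bmatrix} \qquad D = \begin{bmatrix} p_0 & -q_2 & -p_2 & p_3 \\ q_2 & p_0 & -q_0 & -q_1 \\ p_2 & q_0 & p_0 & -p_1 \\ -p_3 & q_1 & p_1 & p_0 \end{bmatrix}$$

A prolonged calculation confirms what we already know, that $M_{\phi_L}$ is orthogonal ($M_{\phi_L} M_{\phi_L}^T = I$). This is checked by showing

$$AA^T + BB^T = (N_p + N_q)I \quad\text{with}\quad N_p + N_q = 1$$

and

$$-AB + BD^T = 0$$

Also, as we have for any orthogonal matrix:

$$\det(M_{\phi_L} M_{\phi_L}^T) = \det I = 1 \quad\therefore\quad \det M_{\phi_L} = \pm 1$$

However, by inspection,

$$\det M_{\phi_L} = p_0^8 + f(p_1, p_2, p_3, q_0, q_1, q_2, q_3)$$

in which $f$ is some function of its arguments. Now since the coefficients $p_i, q_i$; $i = 0, 1, 2, 3$ are independent it must follow* that $\det M_{\phi_L} = +1$, and so the map $\phi_L$ preserves orientation, is a member of $SO(8)$ and therefore represents a rotation in $\mathbb{R}^8$. A similar deduction can be made for $\phi_R$. Here:

$$\phi_R(e^{(0)}) = [p_0, q_0, p_1, p_2, q_1, p_3, q_3, q_2]$$

$$\phi_R(e^{(1)}) = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} -q \\ p \end{bmatrix} = \begin{bmatrix} -q_0 - q_1 i - q_2 j - q_3 k \\ p_0 + p_1 i + p_2 j + p_3 k \end{bmatrix} = [-q_0, p_0, -q_1, -q_2, p_1, -q_3, p_3, p_2]$$

in terms of $e^{(i)}$, $i = 0, \ldots, 7$, and

$$\phi_R(e^{(2)}) = \begin{bmatrix} i \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} ip \\ -iq \end{bmatrix} = \begin{bmatrix} -p_1 + p_0 i - p_3 j + p_2 k \\ q_1 - q_0 i + q_3 j - q_2 k \end{bmatrix} = [-p_1, q_1, p_0, -p_3, -q_0, p_2, -q_2, q_3]$$

* (In fact the reader should be able to argue that necessarily: $\det M_{\phi_L} = (p_0^2 + p_1^2 + p_2^2 + p_3^2 + q_0^2 + q_1^2 + q_2^2 + q_3^2)^4$)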
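$$\phi_R(e^{(3)}) = \begin{bmatrix} j \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} jp \\ -jq \end{bmatrix} = \begin{bmatrix} -p_2 + p_3 i + p_0 j - p_1 k \\ q_2 - q_3 i - q_0 j + q_1 k \end{bmatrix} = [-p_2, q_2, p_3, p_0, -q_3, -p_1, q_1, -q_0]$$

$$\phi_R(e^{(4)}) = \begin{bmatrix} 0 \\ i \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qi \\ pi \end{bmatrix} = \begin{bmatrix} -q_1 + q_0 i + q_3 j - q_2 k \\ -p_1 + p_0 i + p_3 j - p_2 k \end{bmatrix} = [-q_1, -p_1, q_0, q_3, p_0, -q_2, -p_2, p_3]$$

$$\phi_R(e^{(5)}) = \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} kp \\ -kq \end{bmatrix} = \begin{bmatrix} -p_3 - p_2 i + p_1 j + p_0 k \\ q_3 + q_2 i - q_1 j - q_0 k \end{bmatrix} = [-p_3, q_3, -p_2, p_1, q_2, p_0, -q_0, -q_1]$$

$$\phi_R(e^{(6)}) = \begin{bmatrix} 0 \\ k \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qk \\ pk \end{bmatrix} = \begin{bmatrix} -q_3 + q_2 i - q_1 j + q_0 k \\ -p_3 + p_2 i - p_1 j + p_0 k \end{bmatrix} = [-q_3, -p_3, q_2, -q_1, p_2, q_0, p_0, -p_1]$$

$$\phi_R(e^{(7)}) = \begin{bmatrix} 0 \\ j \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = \begin{bmatrix} qj \\ pj \end{bmatrix} = \begin{bmatrix} -q_2 - q_3 i + q_0 j + q_1 k \\ -p_2 - p_3 i + p_0 j + p_1 k \end{bmatrix} = [-q_2, -p_2, -q_3, q_0, -p_3, q_1, p_1, p_0]$$

$$M_{\phi_R} = \begin{bmatrix}
p_0 & q_0 & p_1 & p_2 & q_1 & p_3 & q_3 & q_2 \\
-q_0 & p_0 & -q_1 & -q_2 & p_1 & -q_3 & p_3 & p_2 \\
-p_1 & q_1 & p_0 & -p_3 & -q_0 & p_2 & -q_2 & q_3 \\
-p_2 & q_2 & p_3 & p_0 & -q_3 & -p_1 & q_1 & -q_0 \\
-q_1 & -p_1 & q_0 & q_3 & p_0 & -q_2 & -p_2 & p_3 \\
-p_3 & q_3 & -p_2 & p_1 & q_2 & p_0 & -q_0 & -q_1 \\
-q_3 & -p_3 & q_2 & -q_1 & p_2 & q_0 & p_0 & -p_1 \\
-q_2 & -p_2 & -q_3 & q_0 & -p_3 & q_1 & p_1 & p_0
\end{bmatrix}$$

which again can be written in partitioned form:

$$M_{\phi_R} = \begin{bmatrix} F & C \\ -C^T & D \end{bmatrix}$$

in which $D$ is as defined in $M_{\phi_L}$ and $F$, $C$ are defined from the matrix.

We saw, at the equivalent stage in our discussion of quaternions, that some matrix representations were easier to interpret than others. The same is true here. The matrix representations developed here for $M_{\phi_L}$ and $M_{\phi_R}$ seem to have no direct correspondence with simpler objects such as quaternions. We remember that the quaternions could be expressed in terms of $2 \times 2$ matrices over the field of complex numbers. We might think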
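that we could represent a Cayley number $Q \in \mathbb{K}$ by an $8 \times 8$ matrix $Q_m$. Unfortunately an isomorphic map of this kind cannot exist for Cayley numbers since Cayley algebra is non-associative whereas matrix algebra is associative. However, in the case of maps representing rotations some simplification is possible if we make the alternative choice for basis vectors:

$$E^{(0)} = \begin{bmatrix} 1 \\ 0 \end{bmatrix},\; E^{(1)} = \begin{bmatrix} i \\ 0 \end{bmatrix},\; E^{(2)} = \begin{bmatrix} j \\ 0 \end{bmatrix},\; E^{(3)} = \begin{bmatrix} k \\ 0 \end{bmatrix},\; E^{(4)} = \begin{bmatrix} 0 \\ 1 \end{bmatrix},\; E^{(5)} = \begin{bmatrix} 0 \\ i \end{bmatrix},\; E^{(6)} = \begin{bmatrix} 0 \\ j \end{bmatrix},\; E^{(7)} = \begin{bmatrix} 0 \\ k \end{bmatrix}$$

then, in terms of $E^{(i)}$, $i = 0, \ldots, 7$:

$$\begin{aligned}
\phi_R(E^{(0)}) &= \begin{bmatrix} 1 \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [p_0, p_1, p_2, p_3, q_0, q_1, q_2, q_3] \\
\phi_R(E^{(1)}) &= \begin{bmatrix} i \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_1, p_0, -p_3, p_2, q_1, -q_0, q_3, -q_2] \\
\phi_R(E^{(2)}) &= \begin{bmatrix} j \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_2, p_3, p_0, -p_1, q_2, -q_3, -q_0, q_1] \\
\phi_R(E^{(3)}) &= \begin{bmatrix} k \\ 0 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-p_3, -p_2, p_1, p_0, q_3, q_2, -q_1, -q_0] \\
\phi_R(E^{(4)}) &= \begin{bmatrix} 0 \\ 1 \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_0, -q_1, -q_2, -q_3, p_0, p_1, p_2, p_3] \\
\phi_R(E^{(5)}) &= \begin{bmatrix} 0 \\ i \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_1, q_0, q_3, -q_2, -p_1, p_0, p_3, -p_2] \\
\phi_R(E^{(6)}) &= \begin{bmatrix} 0 \\ j \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_2, -q_3, q_0, q_1, -p_2, -p_3, p_0, p_1] \\
\phi_R(E^{(7)}) &= \begin{bmatrix} 0 \\ k \end{bmatrix} \circ \begin{bmatrix} p \\ q \end{bmatrix} = [-q_3, q_2, -q_1, q_0, -p_3, p_2, -p_1, p_0]
\end{aligned}$$

then we find the matrix representation of the map is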
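$$\begin{bmatrix}
p_0 & p_1 & p_2 & p_3 & q_0 & q_1 & q_2 & q_3 \\
-p_1 & p_0 & -p_3 & p_2 & q_1 & -q_0 & q_3 & -q_2 \\
-p_2 & p_3 & p_0 & -p_1 & q_2 & -q_3 & -q_0 & q_1 \\
-p_3 & -p_2 & p_1 & p_0 & q_3 & q_2 & -q_1 & -q_0 \\
-q_0 & -q_1 & -q_2 & -q_3 & p_0 & p_1 & p_2 & p_3 \\
-q_1 & q_0 & q_3 & -q_2 & -p_1 & p_0 & p_3 & -p_2 \\
-q_2 & -q_3 & q_0 & q_1 & -p_2 & -p_3 & p_0 & p_1 \\
-q_3 & q_2 & -q_1 & q_0 & -p_3 & p_2 & -p_1 & p_0
\end{bmatrix} = \begin{bmatrix} P_R^{rot} & Q_R^{ref} \\ -(Q_R^{ref})^T & P_L^{rot} \end{bmatrix}$$

It is easily checked that

$$P_R^{rot}(P_R^{rot})^T = N_p I \qquad (Q_R^{ref})^T(Q_R^{ref}) = N_q I \qquad (P_L^{rot})(P_L^{rot})^T = N_p I$$

and

$$(P_R^{rot})(Q_R^{ref}) - (Q_R^{ref})(P_L^{rot})^T = 0$$

also

$$\det P_L^{rot} = \det P_R^{rot} = N_p^2 \qquad \det Q_R^{ref} = -N_q^2$$

In a very obvious way we can (by referring back to the work on quaternions) interpret $P_R^{rot}$, $P_L^{rot}$ as the matrix representations of maps in $\mathbb{R}^4$ which correspond to right and left rotations together with an expansion (due to the factor $\sqrt{N_p}$). Also $Q_R^{ref}$ can be interpreted as being the matrix representation of a right reflection in $\mathbb{R}^4$ together with an expansion (due to the factor $\sqrt{N_q}$).

Repeating this calculation for the map $\phi_L$ we find the matrix representation is

$$\begin{bmatrix}
p_0 & p_1 & p_2 & p_3 & q_0 & q_1 & q_2 & q_3 \\
-p_1 & p_0 & p_3 & -p_2 & -q_1 & q_0 & -q_3 & q_2 \\
-p_2 & -p_3 & p_0 & p_1 & -q_2 & q_3 & q_0 & -q_1 \\
-p_3 & p_2 & -p_1 & p_0 & -q_3 & -q_2 & q_1 & q_0 \\
-q_0 & q_1 & q_2 & q_3 & p_0 & -p_1 & -p_2 & -p_3 \\
-q_1 & -q_0 & -q_3 & q_2 & p_1 & p_0 & -p_3 & p_2 \\
-q_2 & q_3 & -q_0 & -q_1 & p_2 & p_3 & p_0 & -p_1 \\
-q_3 & -q_2 & q_1 & -q_0 & p_3 & -p_2 & p_1 & p_0
\end{bmatrix} = \begin{bmatrix} P_L^{rot} & Q_R^{rot} \\ -(Q_R^{rot})^T & (P_L^{rot})^T \end{bmatrix}$$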
in which $Q_R^{rot}$ is a right rotation with expansion (due to the factor $\sqrt{N_q}$) and $P_L^{rot}$ is a left rotation with expansion (due to the factor $\sqrt{N_p}$).

4.8 Geometry of 8-dimensional Rotations

Let $Q$ be a unit norm Cayley number with vector part parallel to $q$:

$$Q = \cos\theta + q\sin\theta \qquad N_Q = 1, \quad q \circ q = -1$$

and let $X$ be any Cayley number. Now, defining $X' = q \circ X$, we have:

$$Q \circ X = X\cos\theta + X'\sin\theta \qquad Q \circ X' = -X\sin\theta + X'\cos\theta$$

So a Cayley multiplication on the left rotates elements in the plane containing $X$, $X'$ by angle $\theta$. Choosing $X = 1$ then $X' = q$ and so the plane containing the elements $1$ and $q$ is left invariant:

$$Q \circ 1 = 1\cos\theta + q\sin\theta \qquad Q \circ q = -1\sin\theta + q\cos\theta$$

Multiplication on the right by $Q$ also rotates elements in the plane containing elements $X$ and $X'' (= X \circ q)$ through angle $\theta$ in the same direction as left multiplication. Thus, following multiplication on the right by $Q$, elements in the plane containing $1$ and $q$ rotate through angle $\theta$ in the same direction as left multiplication. Multiplication by $Q^{-1}$ rotates these elements through angle $\theta$ but in the opposite direction.

In the corresponding discussion on quaternions we saw that quaternion multiplication on the left (or right) rotated elements in the plane (in the space of pure quaternions) perpendicular to $q$. For Cayley multiplication the situation is somewhat more complicated. We first show that the space of pure Cayley numbers has an orthonormal basis of seven elements based on the element $q$ (cf. $e^{(m)}$, $m = 1, 2, \ldots, 7$ seen above). Having chosen $q$ (via $Q$), choose two other elements $\hat{\ell}, \hat{m}$ so that $(q, \hat{\ell}, \hat{m})$ is an orthonormal set and so that $\hat{m}$ is orthogonal to $q \circ \hat{\ell}$:

$$\hat{m} \circ (q \circ \hat{\ell}) = -(q \circ \hat{\ell}) \circ \hat{m}$$

We can quickly deduce that these assumptions alone imply

$$q \circ (\hat{\ell} \circ \hat{m}) = -(\hat{\ell} \circ \hat{m}) \circ q \quad\text{and}\quad \hat{\ell} \circ (\hat{m} \circ q) = -(\hat{m} \circ q) \circ \hat{\ell}$$

so that $(q, \hat{\ell}, \hat{m})$ is a Cayley triangle.
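Proof.

$$\begin{aligned}
q \circ (\hat{\ell} \circ \hat{m}) + (q \circ \hat{\ell}) \circ \hat{m} &= q \circ (\hat{\ell} \circ \hat{m}) - \hat{m} \circ (q \circ \hat{\ell}) \\
&= -q \circ (\hat{m} \circ \hat{\ell}) - \hat{m} \circ (q \circ \hat{\ell}) \\
&= [(q)^2 + (\hat{m})^2] \circ \hat{\ell} - (q + \hat{m}) \circ (q \circ \hat{\ell} + \hat{m} \circ \hat{\ell}) \\
&= [(q)^2 + (\hat{m})^2] \circ \hat{\ell} - (q + \hat{m}) \circ ((q + \hat{m}) \circ \hat{\ell}) \\
&= [(q)^2 + (\hat{m})^2] \circ \hat{\ell} - (q + \hat{m})^2 \circ \hat{\ell} \\
&= 0
\end{aligned}$$

since $q \circ \hat{m} = -\hat{m} \circ q$. Also, taking conjugates:

$$(\hat{m} \circ \hat{\ell}) \circ q + \hat{m} \circ (\hat{\ell} \circ q) = 0$$

Thus, using the above two relations:

$$\begin{aligned}
2S(q \circ (\hat{\ell} \circ \hat{m})) &= -(\hat{m} \circ \hat{\ell}) \circ q + q \circ (\hat{\ell} \circ \hat{m}) \\
&= \hat{m} \circ (\hat{\ell} \circ q) - (q \circ \hat{\ell}) \circ \hat{m} \\
&= \hat{m} \circ (\hat{\ell} \circ q) + (\hat{\ell} \circ q) \circ \hat{m} \\
&= 0
\end{aligned}$$

since $\hat{m}$ is orthogonal to $(\hat{\ell} \circ q)$. By a very similar calculation it is confirmed that

$$S(\hat{\ell} \circ (q \circ \hat{m})) = 0$$

so that $(q, \hat{\ell}, \hat{m})$ is a Cayley triangle.

Now, as seen earlier, from this single Cayley triangle a basis for the set of pure Cayley numbers can be constructed. Let $\hat{n} = q \circ \hat{\ell}$; then the collection

$$(q,\; \hat{\ell},\; \hat{n},\; \hat{m},\; q \circ \hat{m},\; \hat{\ell} \circ \hat{m},\; \hat{n} \circ \hat{m})$$

is the required orthonormal basis. See Figure 4.4.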
The seven Hamilton triangles are obtained as usual. Of these seven only three involve the element $q$ directly:

$$(q, \hat{\ell}, \hat{n}) \qquad (q, \hat{m}, q \circ \hat{m}) \qquad (q, \hat{n} \circ \hat{m}, \hat{\ell} \circ \hat{m})$$

which are shown highlighted.

Figure 4.4

The discussion now follows that for quaternions. When multiplying on the left, Cayley multiplication by $Q = \cos\theta + q\sin\theta$ rotates elements in the three planes containing $(\hat{\ell}, \hat{n})$; $(\hat{m}, q \circ \hat{m})$ and $(\hat{n} \circ \hat{m}, \hat{\ell} \circ \hat{m})$ about the $q$-axis through an angle $\theta$ in a positive direction (according to the right-hand rule). Also, multiplying on the right by the Cayley number $Q$ rotates elements in these planes through an angle $\theta$ in a negative direction according to the right-hand rule.

Since an arbitrary Cayley number can be written uniquely in terms of the basis elements $1$, $q$ and the three pairs of elements taken from the Hamilton triangles containing $q$, then under the operation $Q \circ (X \circ Q^{-1})$ we see that

(i) Elements in the plane containing $1$ and $q$ are unaffected, since $Q\,\circ$ rotates elements through angle $\theta$ whilst $\circ\, Q^{-1}$ rotates them back, through $\theta$, to their original position.

(ii) Elements in the three Hamilton triangles with common element $q$ rotate through a positive angle $\theta$ about axis $q$ when multiplied on the left by $Q$, and then by a further positive angle $\theta$ when multiplied on the right by $Q^{-1}$: a total rotation of $2\theta$ about the axis $q$.

Thus the effect of $Q \circ (X \circ Q^{-1})$ is to effect a rotation in the space of pure Cayley numbers of $2\theta$ about axis $q$. This is precisely what we found when the detailed analysis of $Q \circ (X \circ Q^{-1})$ was undertaken.
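Statements (i) and (ii) can be illustrated numerically. The sketch below assumes the product $[p, q] \circ [a, b] = [pa - b\bar{q},\; aq + \bar{p}b]$ of Appendix 2; the axis `q_axis`, the angle and the helper names are arbitrary illustrative choices, not taken from the text. It checks that left multiplication by $Q$ and right multiplication by $Q^{-1}$ each turn $\hat{\ell}$ through $\theta$ towards $\hat{n} = q \circ \hat{\ell}$, and that together they give $2\theta$.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Cayley product [p,q] o [a,b] = [p a - b conj(q), a q + conj(p) b]."""
    (p, q), (a, b) = X, Y
    first = tuple(s - t for s, t in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(s + t for s, t in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

def combine(s, X, t, Y):
    """The real linear combination s*X + t*Y."""
    return tuple(tuple(s*x + t*y for x, y in zip(u, v)) for u, v in zip(X, Y))

def close(X, Y, tol=1e-12):
    return all(abs(x - y) < tol for u, v in zip(X, Y) for x, y in zip(u, v))

ONE = ((1.0, 0.0, 0.0, 0.0), (0.0, 0.0, 0.0, 0.0))
q_axis = ((0.0, 2/3, 0.0, -2/3), (1/3, 0.0, 0.0, 0.0))   # unit pure Cayley number
ell = ((0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 0.0))       # unit, pure, orthogonal to q_axis
n = cmul(q_axis, ell)                                    # n = q o l

theta = 0.4
c, s = math.cos(theta), math.sin(theta)
Q = combine(c, ONE, s, q_axis)        # Q = cos(theta) + q sin(theta)
Q_inv = combine(c, ONE, -s, q_axis)   # inverse of a unit Cayley number is its conjugate

left_only = cmul(Q, ell)               # expected: l cos(theta) + n sin(theta)
right_only = cmul(ell, Q_inv)          # expected: l cos(theta) + n sin(theta)
both = cmul(Q, cmul(ell, Q_inv))       # expected: l cos(2 theta) + n sin(2 theta)

print(close(left_only, combine(c, ell, s, n)),
      close(right_only, combine(c, ell, s, n)),
      close(both, combine(math.cos(2*theta), ell, math.sin(2*theta), n)))
```

The same three comparisons can be repeated with $\hat{\ell}$ replaced by any unit pure element orthogonal to $q$.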
Reflections in 8-dimensional space

As with complex numbers and quaternions, reflections are generated by the conjugate map. The simplest reflection is:

$$\phi : \mathbb{R}^8 \to \mathbb{R}^8 \qquad \phi(W) = W^c$$

with matrix representation

$$A_\phi = \mathrm{diag}[1, -1, -1, -1, -1, -1, -1, -1]$$

and so $A_\phi^{-1} = A_\phi^T$ with $\det A_\phi = -1$. The geometrical interpretation is a reflection in the scalar axis (the scalar axis is left fixed by this map).

Next we consider the right and left reflections:

$$\phi_L(W) = Q \circ W^c \qquad \phi_R(W) = W^c \circ Q \qquad N_Q = 1$$

Here $\phi_L$ is interpreted as a reflection in the scalar axis followed by a left rotation, whilst $\phi_R$ is interpreted as a reflection in the scalar axis followed by a right rotation. These two lead to the axial reflection:

$$\phi_a(W) = Q \circ (W^c \circ Q) \qquad N_Q = 1$$

This map is orthogonal since it is norm preserving. Also, under this transformation the line $rQ$, $r \in \mathbb{R}$, remains invariant:

$$Q \circ (rQ^c \circ Q) = rQ$$

I leave it as an exercise for the reader to show that the matrix representation $A_{\phi_a}$ for this transformation is such that $\det A_{\phi_a} = -1$.

The other type of reflection is called a simple reflection:

$$\lambda : \mathbb{R}^8 \to \mathbb{R}^8 \qquad \lambda(W) = -Q \circ (W^c \circ Q)$$

and corresponds to a reflection in the plane with normal $Q$. To see this let $Y$ be a Cayley number lying in the plane normal to $Q$. That is:

$$S(Y \circ Q^c) = 0 \quad\text{or}\quad Y \circ Q^c + Q \circ Y^c = 0$$

then

$$\lambda(Y) = -Q \circ (Y^c \circ Q) = -(Q \circ Y^c) \circ Q = (Y \circ Q^c) \circ Q = Y$$

That is, it is unaffected by this reflection.
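The determinant statements for these maps can be examined numerically. What follows is a sketch only, assuming the product $[p, q] \circ [a, b] = [pa - b\bar{q},\; aq + \bar{p}b]$ of Appendix 2; `matrix_of` and `det` are hypothetical helpers (the determinant is plain Gaussian elimination) and the unit Cayley number `Q` is an arbitrary choice. It is not a substitute for the exercise set above.

```python
def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def qconj(a):
    return (a[0], -a[1], -a[2], -a[3])

def cmul(X, Y):
    """Cayley product [p,q] o [a,b] = [p a - b conj(q), a q + conj(p) b]."""
    (p, q), (a, b) = X, Y
    first = tuple(s - t for s, t in zip(qmul(p, a), qmul(b, qconj(q))))
    second = tuple(s + t for s, t in zip(qmul(a, q), qmul(qconj(p), b)))
    return (first, second)

def cconj(X):
    p, q = X
    return (qconj(p), tuple(-t for t in q))

def neg(X):
    return tuple(tuple(-t for t in part) for part in X)

def matrix_of(f):
    """8 x 8 matrix of a linear map f; column c holds the image of the c-th basis element."""
    cols = []
    for c in range(8):
        v = [0.0] * 8
        v[c] = 1.0
        img = f((tuple(v[:4]), tuple(v[4:])))
        cols.append(list(img[0] + img[1]))
    return [[cols[c][r] for c in range(8)] for r in range(8)]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    n, d = len(A), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(A[r][c]))
        if abs(A[piv][c]) < 1e-14:
            return 0.0
        if piv != c:
            A[c], A[piv] = A[piv], A[c]
            d = -d
        d *= A[c][c]
        for r in range(c + 1, n):
            f = A[r][c] / A[c][c]
            for k in range(c, n):
                A[r][k] -= f * A[c][k]
    return d

# An arbitrary unit Cayley number: the raw coefficients have norm 20.25 = 4.5**2.
Q = tuple(tuple(t / 4.5 for t in part)
          for part in ((1.0, 2.0, -1.0, 0.5), (0.0, 3.0, -2.0, 1.0)))

det_conj = det(matrix_of(cconj))                                         # W -> W^c
det_left = det(matrix_of(lambda W: cmul(Q, W)))                          # W -> Q o W
det_axial = det(matrix_of(lambda W: cmul(Q, cmul(cconj(W), Q))))         # W -> Q o (W^c o Q)
det_simple = det(matrix_of(lambda W: neg(cmul(Q, cmul(cconj(W), Q)))))   # W -> -Q o (W^c o Q)

print(det_conj, det_left, det_axial, det_simple)
```

Because the dimension is even, the overall sign change that turns the axial reflection into the simple reflection does not alter the determinant.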
Appendix 1 Clifford Algebras

This appendix relies heavily on the article by Riesz [20]. I have tried to argue in this text that, based almost entirely on their natural relation to the other normed algebras $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, Cayley numbers are deserving of more attention. However, as this appendix briefly describes, and as is well known, the real numbers, the complex numbers, and the ordinary and complexified quaternion numbers are also particular examples from the important class of Clifford algebras which, for completeness, we now introduce. But we should keep in mind that although the Clifford algebras can be thought of as being more fundamental than $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, there is an important algebra missing from this general class: all the Clifford algebras, being associative, cannot include the Cayley numbers.

We have already noted the Hurwitz theorem: "If the sum of $n$ squares times the sum of $n$ squares is again a sum of $n$ squares in which the last sum is comprised of terms computed bilinearly from the terms of the first two sums, then $n$ takes one of only four possible values, 1, 2, 4, 8." That is, if

$$\left(\sum_{i=1}^{n} a_i^2\right)\left(\sum_{i=1}^{n} b_i^2\right) = \sum_{i=1}^{n} c_i^2$$

in which

$$c_i = \sum_{j=1}^{n}\sum_{k=1}^{n} \alpha_{ijk}\, a_j b_k$$

where $\alpha_{ijk}$ are constants, then this can only be true if $n = 1, 2, 4, 8$. This result of course is intimately related to the well known result that there exist only four normed algebras over the reals: $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$, $\mathbb{K}$.

A related problem is to ask if (and when) it is possible to write the square of a linear expression of $n$ terms as the sum of $n$ squares. That is

$$\left(\sum_{i=1}^{n} a_i v_i\right)^2 = \sum_{i=1}^{n} a_i^2 \qquad (*)$$

in which $a_i$ are scalars and $v_i$ are entities whose algebra is yet to be discovered. In a very imprecise sense this is a kind of "square root" of the problem above. In this problem we can make some immediate deductions:

$$\left(\sum_{i=1}^{n} a_i v_i\right)^2 = \sum_{i=1}^{n} a_i^2 v_i^2 + \sum_{i<j} a_i a_j (v_i v_j + v_j v_i)$$
in which the product $v_i v_j$ is not assumed to be commutative. It follows that (*) will be satisfied if

$$v_i v_j + v_j v_i = 2\delta_{ij}$$

Now consider a vector space $V_n$ over the reals or the complex numbers (scalars). We define a scalar product on $V_n$ as follows: to each pair $x, y \in V_n$ we associate a scalar $\langle x, y \rangle$ with the properties

(i) $\langle x, y \rangle = \langle y, x \rangle$

(ii) $\langle \alpha x + \beta y, z \rangle = \alpha \langle x, z \rangle + \beta \langle y, z \rangle$, where $\alpha, \beta$ are scalars.

Theorem There exists an orthogonal basis for $V_n$, that is, $\{v_1, v_2, \ldots, v_n\}$ such that

$$\langle v_i, v_j \rangle = 0 \quad i \neq j$$

Proof. Assume the existence of two elements $x, y \in V_n$ such that $\langle x, y \rangle \neq 0$. (If no such pair existed then any basis for $V_n$ is orthogonal.) From the identity (using (ii))

$$\langle x + y, x + y \rangle = \langle x, x \rangle + 2\langle x, y \rangle + \langle y, y \rangle$$

we easily deduce that there exists a vector whose inner product with itself is non-vanishing. For if $\langle x, x \rangle \neq 0$ or $\langle y, y \rangle \neq 0$ then we have nothing to prove. However if $\langle x, x \rangle = 0$ and $\langle y, y \rangle = 0$ then the above gives

$$\langle x + y, x + y \rangle = 2\langle x, y \rangle \neq 0$$

Thus let $v_1 \in V_n$ be such that $\langle v_1, v_1 \rangle \neq 0$. For any other $x \in V_n$ we can always write

$$x = x^T + \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1$$

Now, using (ii), we easily find

$$\langle x, v_1 \rangle = \left\langle x^T + \frac{\langle x, v_1 \rangle}{\langle v_1, v_1 \rangle}\, v_1,\; v_1 \right\rangle = \langle x^T, v_1 \rangle + \langle x, v_1 \rangle$$

and so $\langle x^T, v_1 \rangle = 0$, that is, $x^T$ is orthogonal to $v_1$.
The space $W_1$ of all elements $x^T \in V_n$ such that $\langle x^T, v_1 \rangle = 0$ forms an $(n-1)$-dimensional subspace of $V_n$. We can now repeat this construction for the space $W_1$. As above we assume there exist two vectors $u, v \in W_1$ such that $\langle u, v \rangle \neq 0$. (If no such pair exists then $v_1$ and any basis in $W_1$ is an orthogonal basis for $V_n$.) In this case we can always find an element $v_2 \in W_1$ such that $\langle v_2, v_2 \rangle \neq 0$, which allows $W_1$ to be written as the direct sum of the span of $v_2$ and the space $W_2 \subset W_1$ of vectors orthogonal to $v_2$. In this way we can construct a basis $\{v_1, v_2, \ldots, v_n\}$ and find a number $r$ such that (when properly arranged)

$$\langle v_i, v_j \rangle = 0 \quad i \neq j$$

$$\langle v_i, v_i \rangle = N_{v_i} \neq 0 \quad i \leq r \;(\leq n)$$

and

$$\langle v_i, v_i \rangle = 0 \quad (r + 1) \leq i \leq n$$

A typical element $x \in V_n$ can be written in terms of this orthogonal basis:

$$x = \sum_{i=1}^{n} x_i v_i$$

$$\langle x, x \rangle = \sum_{i,j} g_{ij} x_i x_j \quad (g_{ij} = \langle v_i, v_j \rangle) \quad = \sum_{i=1}^{n} g_{ii} (x_i)^2$$

The matrix $g_{ij}$ is called the metric and the number $r$ is clearly the rank of the matrix $g_{ij}$. There are only two possibilities: (i) $r = n$, (ii) $r < n$. If $r = n$ then $g_{ij}$ is said to be non-singular (this occurs if and only if $\det(g_{ij}) \neq 0$). If $r < n$ then $g_{ij}$ is singular. We shall assume that $r = n$ and so the metric is non-singular. Then for any two elements $a, b \in V_n$,

$$a = \sum_{i=1}^{n} a_i v_i \quad\text{and}\quad b = \sum_{i=1}^{n} b_i v_i$$

with inner product

$$\langle a, b \rangle = \sum_{i,j} g_{ij} a_i b_j$$

Given a definition for the scalar product, the explicit components for the metric can be found. Conversely, given the metric $g_{ij}$ ($g_{ij} = g_{ji}$) we can find the scalar product of two elements. There are two cases of interest. If $g_{ij} = \delta_{ij}$ we have the Euclidean metric whilst
if $g_{11} = 1$, $g_{ii} = -1$ ($i \neq 1$) we have the Lorentz metric.

This allows us to give a meaning to the product of elements of $V_n$. We define

$$a^2 = \langle a, a \rangle$$

(In some discussions on this subject the choice $a^2 = -\langle a, a \rangle$ is made.) From this definition we easily obtain

$$(a + b)^2 = \langle a + b, a + b \rangle$$

i.e.

$$a^2 + ab + ba + b^2 = \langle a, a \rangle + 2\langle a, b \rangle + \langle b, b \rangle$$

using the usual properties of the inner product and again not assuming that the product $ab$ is commutative. We immediately deduce

$$\langle a, b \rangle = \tfrac{1}{2}(ab + ba)$$

Obviously, for an orthogonal basis $\{v_1, v_2, \ldots, v_n\}$:

$$\langle v_i, v_j \rangle = 0 \quad i \neq j$$

and so

$$v_i v_j + v_j v_i = 0 \quad i \neq j$$

and (from $\langle a, b \rangle = \sum_{i,j=1}^{n} g_{ij} a_i b_j$) we have

$$\langle v_k, v_k \rangle = \sum_{i,j} g_{ij}\delta_{ki}\delta_{kj} = g_{kk} \quad \left(\text{since } v_k = \sum_{p=1}^{n} \delta_{kp} v_p\right)$$

$$v_i v_j + v_j v_i = 2g_{ij}$$

The basis elements $\{v_1, v_2, \ldots, v_n\}$ generate an algebra $C$. That is, all product combinations

$$1,\; v_1, v_2, \ldots, v_n,\; v_1v_2, v_1v_3, \ldots,\; v_1v_2v_3, \ldots,\; \ldots,\; v_1v_2v_3 \cdots v_n$$

(including the empty product, regarded as 1) generate the algebra. We shall prove that there are in total $2^n$ independent elements here. Clearly, any product involving the $v_i$ terms may be reduced, up to a sign, to a product of distinct basis elements of $C$. For example (for a Euclidean metric)

$$v_1v_3v_3v_4v_2v_3 = v_1v_4v_2v_3 = -v_1v_2v_4v_3 = v_1v_2v_3v_4$$

(we are assuming associativity). For the space $V_n$ with $n$ basis elements $v_1, v_2, \ldots, v_n$, every product can be reduced to one containing not more than $n$ factors. The term $v_1v_2 \cdots v_n$ is called the pseudoscalar.
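The reduction of an arbitrary product of generators to a signed product of distinct generators in increasing order is easily mechanised. The sketch below uses only the rules $v_iv_j = -v_jv_i$ ($i \neq j$) and $v_iv_i = g_{ii}$ for a diagonal metric; the function name `reduce_word` and the dictionary representation of the metric are hypothetical choices made for illustration.

```python
from itertools import combinations

def reduce_word(word, g):
    """Reduce v_{i1} v_{i2} ... to (coefficient, increasing tuple of distinct indices).

    word : sequence of generator indices, e.g. [1, 3, 3, 4, 2, 3]
    g    : dict mapping index i to g_ii, the value of v_i squared (diagonal metric)
    """
    coeff = 1
    out = []                       # kept sorted, all entries distinct
    for idx in word:
        pos = len(out)
        # Move the new factor to the left past every larger index;
        # each interchange of distinct generators introduces a factor -1.
        while pos > 0 and out[pos - 1] > idx:
            pos -= 1
            coeff = -coeff
        if pos > 0 and out[pos - 1] == idx:
            coeff *= g[idx]        # v_i v_i = g_ii
            del out[pos - 1]
        else:
            out.insert(pos, idx)
    return coeff, tuple(out)

euclid = {1: 1, 2: 1, 3: 1, 4: 1}
lorentz = {1: 1, 2: -1, 3: -1, 4: -1}

print(reduce_word([1, 3, 3, 4, 2, 3], euclid))   # the worked example of the text
print(reduce_word([2, 1, 2], lorentz))           # v2 v1 v2 with v2 squared equal to -1

# Every basis element is a product of distinct generators in increasing order,
# so for n = 4 there are sum_p C(4, p) of them.
blades = [c for p in range(5) for c in combinations(range(1, 5), p)]
print(len(blades))
```

Multiplying two basis elements amounts to concatenating their index lists and reducing again, so the product of two basis elements is always plus or minus another basis element.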
221 The product of two elements of C is defined in an obvious way: for example consider V\V2V± and ^1^3 then (v\V2V±)(v2V\Vz) — V\ViV±V2V\Vz — v\v\vzv± = V3V4 using associativity It is easily shown that the algebra C is associative. Following Riesz we can denote any element of C as rjPiP2-.pn —v\v2 '"vn where the indices pi are taken modulo 2. So, for example (for n = 4) £1001 = ^1^4 whilst £1011 — ^1^3v4 and so on. Any element of C can (by using v&j + VjVi = 0 i ^ j and depending on the metric being Euclidean or Lorentzian, v\ — ±1) always be written in this form up to a sign. The product of two elements of the algebra is EPlPa...PftEtriai...<rn = v^v? ... v^v^v? ... <" where k = Ym=2 Piai (on eacn interchange, only if pi = 1 and o\ — 1 a factor of (—1) is introduced). Proceeding in this way it is now obvious that n Thus ^tlt2... tn{Epip2...pnE<Tl<T2...<Tn) — ( —l)P^tit2-.. tn^pi+<ri,p2+a2,...,Pn+^n — (_1)P qEti+Pi+(Ti,t2+P2+(T2,...,tn+Pn+(Tn where n which clearly indicates that the product is associative. Now let Ep^ denote, for a given p the zth element of the set £PlP2...Pn where p = £n=1 Pj of which there are nCp elements. Now VjEp^ = sEp{i)Vj
222 where +1 p = 0 (-I)p if EpW does not contain vd (-l)p_1 if EpW does contain Vj +1 if p = n (odd) I — 1 if p = n (even) Thus for every Ep^ we can find an element Vj which anti-commutes with it unless the dimension n of Vn is odd and p = n. That is, for the pseudoscalar £qi...i. Linear Independence (i) Assume n is even. Consider the expression n k ££>pi£*>« = 0 k=nCp (*) p=0 2=1 where each aPi is a scalar. We need to show that every scalar api is zero. Suppose there exists a particular scalar aqm ^ 0. Now every element Vi i = 1, 2,... , n has an inverse v^1. This follows if ^ is non-singular; i.e. if g^1 ^ 0 z = 1,2, ...,n. Inverses for the elements of algebra C now immediately follow. For example [l>ll>2...Up]~1 = Vp1...^1^1 since (wi... ^(v"1... vf1) = K1 • • • Ofai • • • vv) = l in which associativity has been used. Thus multiplying through (*) by a~^(i^m))_1 we obtain an expression of the 'form' n k p=0 2=1 where the p = q, i = m term is missing from the double sum. If (*) had contained a single term then we have the contradiction 1 = 0. If (*) contains more than one term then assume azt ^ 0 (for particular z,t). Since we are assuming n is even then there exists an Vj which anti-commutes with Ez^\ Thus multiplying (**) on the left by Vj and on the right by v~l we obtain l + ^a^^K)"1^ p=0 2=1
223 (again the p = q, i — m term is missing. Adding to (**) and dividing by 2: ^lEE^K1,+vjE«i>{vj)-1] = 0 (***) p=0 i=l Now VjEv^l\vj)~l = ±EP^ depending as Vj commutes or anti-commutes with Ep^ and vjEz^{vj)-1 = -Ez{t) and so (***) has exactly the same form as (**) but with one term less in the double sum. We now repeat this process, reducing the terms in the double summation by one each time. Eventually we obtain the contradiction 1=0. (ii) Now assume n is odd. As argued earlier there is no element Vj which anti-commutes with the pseudoscalar £ai...i. Thus using exactly the above procedure we are led to l+/?£ii...i = 0 To show that this also leads to a contradiction we note the automorphism Vi —> —Vi leaves all relations in C unchanged. However -E11...1 -» —-E11...1 under this change and so 1 - j3£n...i = 0 must also be true. We immediately conclude that 2 = 0 a contradiction. Thus the elements from the set Ep^ p = 0,1,..., n; i = 1,2,... ,n Cp are linearly independent and so form a basis for C of dimension Yl^=o nCp = 2n. We now show that the algebra splits naturally into two disjoint sets. C is formed from 2n independent elements 1, Vi, V2, . . . , Vn, ViV2i ViV3i • • • , ViV2V3) ...,..., V1V2V3 ...Vn For fixed p the nCp-dimensional subspace of C spanned by the basis elements vaiva2 ...vQp with 1 < ol\ < ct2 < ... < ap < n with exactly p factors is denoted by Cp. We now define C+, C_ to be c+ = (J cp c- = (J cp V even p odd
$C_+$ is a subalgebra (since it is closed under multiplication) of $C$ but $C_-$ is not a subalgebra. The elements of $C_+$ are called even; those of $C_-$ are called odd. We now examine some special cases.

$n = 0$: $C = \{1\}$. This is the space of scalars.

$n = 1$: $C = \{1, v_1\}$. This is the vector space $V_n$.

$n = 2$: $C = \{1, v_1, v_2, v_1v_2\}$. This is the space of quaternions (if $g_{ii} = -1$, other $g_{ij} = 0$). $C_+ = \{1, v_1v_2\}$. Here $(v_1v_2)(v_1v_2) = -v_1^2v_2^2 = -1$. Clearly $C_+$ is isomorphic to the complex field.

$n = 3$: $C = \{1, v_1, v_2, v_3, v_3v_2, v_1v_3, v_2v_1, v_1v_2v_3\}$. Here we write $\iota$ for the complex unit, to distinguish it from the quaternion unit $i$. If we write $\iota \leftrightarrow v_1v_2v_3$ then

$$\iota^2 = (v_1v_2v_3)(v_1v_2v_3) = (v_2v_3)(v_2v_3) = -1$$

Also, denoting $i \leftrightarrow v_3v_2$, $j \leftrightarrow v_1v_3$, $k \leftrightarrow v_2v_1$, then

$$\iota i \leftrightarrow (v_1v_2v_3)(v_3v_2) = v_1 \qquad \iota j \leftrightarrow (v_1v_2v_3)(v_1v_3) = v_2 \qquad \iota k \leftrightarrow (v_1v_2v_3)(v_2v_1) = v_3$$

This is the algebra generated by the elements $\{1, \iota i, \iota j, \iota k, i, j, k, \iota\}$ in which $i, j, k$ are the usual quaternion units and $\iota^2 = -1$. Clearly $C$ is isomorphic to the complexified quaternions (the biquaternions).

$n = 3$: $C_+ = \{1, v_1v_2, v_1v_3, v_2v_3\}$. Here

$$(v_2v_1)^2 = -1 \qquad (v_1v_3)^2 = -1 \qquad (v_3v_2)^2 = -1$$

$$(v_2v_1)(v_1v_3) = v_2v_3 = -(v_1v_3)(v_2v_1)$$

$$(v_2v_1)(v_3v_2) = v_1v_3 = -(v_3v_2)(v_2v_1)$$

$$(v_1v_3)(v_3v_2) = v_1v_2 = -(v_3v_2)(v_1v_3)$$

So if we make the identifications

$$i \leftrightarrow v_3v_2 \qquad j \leftrightarrow v_1v_3 \qquad k \leftrightarrow v_2v_1$$

then $C_+$ is isomorphic to the quaternions.
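The $n = 3$ identifications can be confirmed by direct computation with the generator rules. This is a minimal sketch assuming the Euclidean metric $v_i^2 = +1$ and the anticommutation rule $v_iv_j = -v_jv_i$; elements are restricted to signed products of generators, and `reduce_word`, `blade` and `mul` are hypothetical helper names.

```python
def reduce_word(word, g):
    """Reduce a product of generators to (coefficient, increasing tuple of distinct indices)."""
    coeff = 1
    out = []
    for idx in word:
        pos = len(out)
        while pos > 0 and out[pos - 1] > idx:
            pos -= 1
            coeff = -coeff         # v_j v_i = -v_i v_j for i != j
        if pos > 0 and out[pos - 1] == idx:
            coeff *= g[idx]        # v_i v_i = g_ii
            del out[pos - 1]
        else:
            out.insert(pos, idx)
    return coeff, tuple(out)

g = {1: 1, 2: 1, 3: 1}             # Euclidean metric, n = 3

def blade(*indices):
    """The signed basis element equal to the product v_{i1} v_{i2} ..."""
    return reduce_word(list(indices), g)

def mul(x, y):
    """Product of two signed basis elements."""
    coeff, word = reduce_word(x[1] + y[1], g)
    return (x[0] * y[0] * coeff, word)

one, minus_one = (1, ()), (-1, ())
iota = blade(1, 2, 3)              # pseudoscalar, plays the role of the complex unit
i, j, k = blade(3, 2), blade(1, 3), blade(2, 1)

# Quaternion relations in the even subalgebra C+
print(mul(i, i) == minus_one, mul(j, j) == minus_one, mul(k, k) == minus_one)
print(mul(i, j) == k, mul(j, k) == i, mul(k, i) == j)

# The pseudoscalar squares to -1 and recovers the vectors v1, v2, v3
print(mul(iota, iota) == minus_one)
print(mul(iota, i) == blade(1), mul(iota, j) == blade(2), mul(iota, k) == blade(3))
```

The pseudoscalar also commutes with $i$, $j$ and $k$ here, which is what allows it to act as a complex scalar multiplying the quaternion units.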
Appendix 2 Computer Algebra and Cayley Numbers

The following program segment is used to perform symbolic computations on Cayley numbers using the symbolic programming language MAPLE. As in the text, a Cayley number is considered as an ordered pair of quaternions, and the quaternions are taken as $2 \times 2$ complex matrices. The segment is written as a collection of procedures. The first procedure quat defines a quaternion. (I is MAPLE's notation for the imaginary quantity $i$.) From this procedure is returned the quaternion

$$\begin{bmatrix} a + id & -c + ib \\ c + ib & a - id \end{bmatrix}$$

    # quaternion a + b i + c j + d k as a 2x2 complex matrix
    quat:=proc(a,b,c,d)
    local alpha,beta;
    alpha:=a+I*d:
    beta:=c+I*b:
    evalc(matrix(2,2,[alpha,-conjugate(beta),beta,conjugate(alpha)])):
    end:

Next, six general quaternions $p_i$, $q_i$, $i = 1, 2, 3$ are defined.

    p1:=quat(a1,b1,c1,d1):
    q1:=quat(A1,B1,C1,D1):
    p2:=quat(a2,b2,c2,d2):
    q2:=quat(A2,B2,C2,D2):
    p3:=quat(a3,b3,c3,d3):
    q3:=quat(A3,B3,C3,D3):

Then we define three general Cayley numbers Cay1, Cay2 and Cay3. They are written as an ordered pair (in MAPLE this is a list) of quaternions and, through the procedures defined below, adhere to the rules of Cayley algebra.

    Cay1:=[p1,q1]:
    Cay2:=[p2,q2]:
    Cay3:=[p3,q3]:
    Cay4:=[htranspose(p1),-q1]:
Cay1, Cay2, Cay3 are general Cayley numbers and Cay4 is the conjugate of Cay1. (We note that, in matrix form, the quaternion conjugate is the Hermitian transpose.) The next procedure defines the Cayley conjugate:

$$X = \begin{bmatrix} p \\ q \end{bmatrix} \quad\text{then}\quad X^c = \begin{bmatrix} \bar{p} \\ -q \end{bmatrix}$$

    CayConj:=proc(Cay)
    [evalc(evalm(htranspose(Cay[1]))),evalc(evalm(-Cay[2]))]:
    end:

CaySum defines the sum of two Cayley numbers:

    CaySum:=proc(Cay1,Cay2)
    [evalc(evalm(Cay1[1]+Cay2[1])),evalc(evalm(Cay1[2]+Cay2[2]))];
    end:

CayScprod defines the product of a scalar $\lambda \in \mathbb{C}$ with a Cayley number:

    CayScprod:=proc(lambda,Cay1)
    [evalc(evalm(Cay1[1]*lambda)),evalc(evalm(Cay1[2]*lambda))]:
    end:

The following procedure extracts the vector part of a Cayley number:

    # vector part: (Cay - CayConj(Cay))/2
    Cayvec:=proc(Cay)
    local K1,K11,K2,K3,K4;
    K1:=CayConj(Cay):
    K11:=CayScprod(-1,K1):
    K2:=CaySum(Cay,K11):
    K3:=1/2:
    K4:=CayScprod(K3,K2):
    [evalc(evalm(K4[1])),evalc(evalm(K4[2]))];
    end:

The next procedure extracts the scalar part of a Cayley number:

    # scalar part: (Cay + CayConj(Cay))/2
    Caysca:=proc(Cay)
    local m1,m11,m12,mm,m2,m3,m4,m41,m42;
    m1:=CayConj(Cay);
    m11:=evalc(m1[1]);
    m12:=evalc(m1[2]);
    mm:=[m11,m12];
    m2:=CaySum(Cay,mm);
    m3:=1/2;
    m4:=CayScprod(m3,m2);
    m41:=evalm(m4[1]);
    m42:=evalm(m4[2]);
    [evalc(evalm(m41)),evalc(evalm(m42))];
    end:

The inverse of a Cayley number is defined in the procedure CayInv:

    # inverse: CayConj(Cay)/CayNorm(Cay)
    CayInv:=proc(Cay)
    [evalc(evalm(CayScprod(1/CayNorm(Cay),CayConj(Cay))[1])),
    evalc(evalm(CayScprod(1/CayNorm(Cay),CayConj(Cay))[2]))];
    end:

As a final definition we consider the product of two Cayley numbers:

$$X = \begin{bmatrix} p \\ q \end{bmatrix} \qquad Y = \begin{bmatrix} a \\ b \end{bmatrix} \qquad X \circ Y = \begin{bmatrix} pa - b\bar{q} \\ aq + \bar{p}b \end{bmatrix}$$

(note that in MAPLE the matrix product is denoted by &*).

    Cayprod:=proc(Cay1,Cay2)
    local w1,w2;
    w1:=evalm(Cay1[1]&*Cay2[1]-Cay2[2]&*(htranspose(Cay1[2])));
    w2:=evalm(Cay2[1]&*Cay1[2]+(htranspose(Cay1[1])&*Cay2[2]));
    w1:=evalc(w1);
    w2:=evalc(w2);
    [evalc(evalm(w1)),evalc(evalm(w2))];
    end:

We are now in a position to find the norm:

    # norm: the (1,1) entry of the first component of Cay1 o CayConj(Cay1)
    CayNorm:=proc(Cay1)
    Cayprod(Cay1,CayConj(Cay1))[1][1,1]:
    simplify(");
    end;
One of the most important questions that one can ask in Cayley number algebra is that concerning the equality of two Cayley numbers. This is defined in the procedure equat.

    # true only if all eight matrix entries of the two Cayley numbers agree
    equat:=proc(Cay1,Cay2)
    local z1,z2,z3,z4,z5,z6,z7,z8;
    z1:=simplify(Cay1[1][1,1]-Cay2[1][1,1]);
    z2:=simplify(Cay1[1][1,2]-Cay2[1][1,2]);
    z3:=simplify(Cay1[1][2,1]-Cay2[1][2,1]);
    z4:=simplify(Cay1[1][2,2]-Cay2[1][2,2]);
    z5:=simplify(Cay1[2][1,1]-Cay2[2][1,1]);
    z6:=simplify(Cay1[2][1,2]-Cay2[2][1,2]);
    z7:=simplify(Cay1[2][2,1]-Cay2[2][2,1]);
    z8:=simplify(Cay1[2][2,2]-Cay2[2][2,2]);
    if z1=0 then
    if z2=0 then
    if z3=0 then
    if z4=0 then
    if z5=0 then
    if z6=0 then
    if z7=0 then
    if z8=0 then RETURN(true)
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi
    else RETURN(false) fi;
    end:

The next program segment defines a Cayley number Cay5 with vector part proportional to the vector part of Cay1.

    K:=k+I*0;
    VecCay5:=CayScprod(K,Cayvec(Cay1));
    VecCay5:=[evalc(evalm(VecCay5[1])),evalc(evalm(VecCay5[2]))];
    ScaCay5:=Caysca(Cay3);
229 ScaCay5:=[evalc(evalm(ScaCay5[l])),evalc(evalm(ScaCay5[2]))]; Cay5:=CaySum(ScaCay5,VecCay5); The next program segments test (using the procedure equat) the fundamental relations of Cayley number algebra and the procedures defined above. Since Cayl=(Cay4)c then MAPLE request: equat(Cayl,CayConj(Cay4)); returns TRUE. Since Caylo(Cay2oCay3)^ (CayloCay2)oCay3 then MAPLE request: equat(Cayprod(Cayl,Cayprod(Cay2,Cay3)), Cayprod(Cayprod(Cayl,Cay2),Cay3)); returns FALSE. However, as Caylo(Cay2oCay4)= (CayloCay2)oCay4 then MAPLE request: equat(Cayprod(Cayl,Cayprod(Cay2,Cay4)), Cayprod(Cayprod(Cayl,Cay2),Cay4)); returns TRUE. Also, as Caylo(Cay2oCayl)= (CayloCay2)oCayl then MAPLE request: equat(Cayprod(Cayl,Cayprod(Cay2,Cayl)), Cayprod(Cayprod(Cayl,Cay2),Cayl)); returns TRUE. In this way all of the Cayley identities described in Section 4.4 can be checked. The reader should be aware that symbolic computation, though easy to formulate (and often only requiring one word answers TRUE or FALSE) may require large amounts of memory and take a considerable time to execute in comparison to 'ordinary' numerical computation. The last two program segments test the result obtained in Theorem 4.6: (yoI)o[^o(ro X)-1} = Yo[(Xo(Wo X-1)) o Y~l)
is valid for all W only when V(Y) = kV(X) where k e R. To save on computation I have, without loss of generality, used conjugates instead of inverses. The first part tests equality when V(Y) = kV{X) in which the identifications: W =Cay3, X =Cayl and Y =Cay5 have been used. tl:=Cayprod(Cayl,Cay5): t2:=CayConj(tl): t3:=Cayprod(Cay3,t2): t4:=Cayprod(tl,t3): H:=Cayprod(Cay3,CayConj(Cay5)): 12:=Cayprod(Cay5,ll): 13:=Cayprod(12,CayConj(Cayl)): 14:=Cayprod(Cayl,13): the MAPLE request: equat(t4,14); returns TRUE. The second part tests non-equality when V(Y) ^ kV(X) (here W =Cay3, X =Cayl and Y =Cay2). tl:=Cayprod(Cayl,Cay2): t2:=CayConj(tl): t3:=Cayprod(Cay3,t2): t4:=Cayprod(tl,t3): H:=Cayprod(Cay3,CayConj(Cay2)): 12:=Cayprod(Cay2,ll): 13:=Cayprod(12,CayConj(Cayl)): 14:=Cayprod(Cayl,13): The MAPLE request equat(t4,14); returns FALSE. Of course, it does not test the general statement of Theorem 4.6. For that, one would require a much more sophisticated symbolic programming language.
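As an independent cross-check of the same tests, the following is a minimal numerical sketch in Python rather than MAPLE. It assumes Cayley numbers stored as pairs of integer quaternions (4-tuples) instead of pairs of 2x2 complex matrices, and it uses the product rule read from Cayprod. All function and variable names are hypothetical and do not come from the text.

```python
# Hypothetical helper names throughout; none of them come from the book.
# A quaternion is a 4-tuple (w, x, y, z) of integers, so every check is exact.
# A Cayley number is a pair (p, q) of quaternions.

def qmul(p, q):
    """Hamilton product of two quaternions."""
    pw, px, py, pz = p
    qw, qx, qy, qz = q
    return (pw*qw - px*qx - py*qy - pz*qz,
            pw*qx + px*qw + py*qz - pz*qy,
            pw*qy - px*qz + py*qw + pz*qx,
            pw*qz + px*qy - py*qx + pz*qw)

def qconj(q):
    return (q[0], -q[1], -q[2], -q[3])

def qadd(p, q):
    return tuple(a + b for a, b in zip(p, q))

def qneg(q):
    return tuple(-a for a in q)

def cay_conj(X):
    """(p, q) -> (conj(p), -q), mirroring CayConj."""
    p, q = X
    return (qconj(p), qneg(q))

def cay_prod(X, Y):
    """(p, q) o (a, b) = (p a - b conj(q), a q + conj(p) b), mirroring Cayprod."""
    p, q = X
    a, b = Y
    first = qadd(qmul(p, a), qneg(qmul(b, qconj(q))))
    second = qadd(qmul(a, q), qmul(qconj(p), b))
    return (first, second)

def with_proportional_vector(X, k, s):
    """Cayley number with scalar part s and vector part k * V(X)."""
    p, q = X
    return ((s, k*p[1], k*p[2], k*p[3]), tuple(k*c for c in q))

def theorem_sides(W, X, Y):
    """Both sides of the Theorem 4.6 test, with conjugates replacing inverses."""
    t1 = cay_prod(X, Y)
    lhs = cay_prod(t1, cay_prod(W, cay_conj(t1)))
    l1 = cay_prod(W, cay_conj(Y))
    l2 = cay_prod(Y, l1)
    l3 = cay_prod(l2, cay_conj(X))
    rhs = cay_prod(X, l3)
    return lhs, rhs

ZERO = (0, 0, 0, 0)

# Arbitrary sample values (assumptions made for this sketch).
cay1 = ((1, 2, -1, 3), (2, 0, 1, -2))
cay2 = ((0, 1, 4, -1), (3, -2, 1, 1))
cay3 = ((2, -3, 1, 1), (-1, 1, 0, 2))
cay4 = cay_conj(cay1)
cay5 = with_proportional_vector(cay1, 3, cay3[0][0])

# Three basis elements that do not lie in a common quaternion subalgebra.
e1 = ((0, 1, 0, 0), ZERO)
e2 = ((0, 0, 1, 0), ZERO)
e4 = (ZERO, (1, 0, 0, 0))

checks = {
    "Cay1 = (Cay4)^c": cay_conj(cay4) == cay1,
    "Cay1 o (Cay2 o Cay4) = (Cay1 o Cay2) o Cay4":
        cay_prod(cay1, cay_prod(cay2, cay4)) == cay_prod(cay_prod(cay1, cay2), cay4),
    "Cay1 o (Cay2 o Cay1) = (Cay1 o Cay2) o Cay1":
        cay_prod(cay1, cay_prod(cay2, cay1)) == cay_prod(cay_prod(cay1, cay2), cay1),
    "e1 o (e2 o e4) = (e1 o e2) o e4":
        cay_prod(e1, cay_prod(e2, e4)) == cay_prod(cay_prod(e1, e2), e4),
    "Theorem 4.6 relation, V(Y) = k V(X)":
        theorem_sides(cay3, cay1, cay5)[0] == theorem_sides(cay3, cay1, cay5)[1],
    "Theorem 4.6 relation, X = e1, Y = e2, W = e4":
        theorem_sides(e4, e1, e2)[0] == theorem_sides(e4, e1, e2)[1],
}
for name, ok in checks.items():
    print(name, ok)
```

Integer components are used so that equality can be tested exactly, which plays the role that simplify plays in equat. Like the MAPLE segments, the sketch only tests particular instances and does not establish the general statement of Theorem 4.6.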