Author: Dörrie Heinrich  

Tags: elementary mathematics  

ISBN: 486-61348-8

Year: 1965

Text
100 Great Problems of Elementary Mathematics
THEIR HISTORY AND SOLUTION
Heinrich Dörrie
Translated by David Antin


100 Great Problems of Elementary Mathematics: Their History and Solution, by Heinrich Dörrie, translated by David Antin. New York: Dover Publications, Inc.
Copyright © 1965 by Dover Publications, Inc.; originally published in German under the title of Triumph der Mathematik, © 1958 by Physica-Verlag, Würzburg. All rights reserved under Pan American and International Copyright Conventions.

Published in Canada by General Publishing Company, Ltd., 30 Lesmill Road, Don Mills, Toronto, Ontario. Published in the United Kingdom by Constable and Company, Ltd., 10 Orange Street, London WC 2.

This Dover edition, first published in 1965, is a new translation of the unabridged text of the fifth edition of the work published by the Physica-Verlag, Würzburg, Germany, in 1958 under the title Triumph der Mathematik: Hundert berühmte Probleme aus zwei Jahrtausenden mathematischer Kultur. This authorized translation is published by special arrangement with the German-language publishers, Physica-Verlag, Würzburg.

Standard Book Number: 486-61348-8
Library of Congress Catalog Card Number: 65-14030

Manufactured in the United States of America
Dover Publications, Inc.
180 Varick Street
New York, N.Y. 10014
Preface

A book collecting the celebrated problems of elementary mathematics that would commemorate their origin and, above all, present their solutions briefly, clearly, and comprehensibly has long seemed a necessary and attractive task to the author.

The restriction to problems of elementary mathematics was considered advisable in view of those readers who have neither the time nor the opportunity to acquaint themselves in any detail with higher mathematics. Nevertheless, in spite of this limitation a colorful and compelling picture has emerged, one that gives an idea of the amazing variety of mathematical methods and one that will, I hope, enchant many who are interested in mathematics and who take pleasure in characteristic mathematical thought processes. In the present work there are to be found many pearls of mathematical art, problems the solutions of which represent, in the achievements of a Gauss, an Euler, Steiner, and others, incredible triumphs of the mathematical mind.

Because the difficult economic situation at the present time barred the publication of a larger work, a limit had to be set to the scope and number of the problems treated. Thus, I decided on a round number of one hundred problems. Moreover, since many of the problems and solutions require considerable space despite the greatest concision, this had to be compensated for by the inclusion of a number of mathematical miniatures. Possibly, however, it may be just these little problems, which are, in their way, true jewels of mathematical miniature work, that will find the readiest readers and win new admirers for the queen of the sciences.

As we have indicated already, a knowledge of higher analysis is not assumed. Consequently, the Taylor expansion could not be used for the treatment of the important infinite series.
I hope nonetheless that the derivations we have given, particularly the striking derivation of the sine and cosine series, will please and will not be found unattractive even by mathematically sophisticated readers.
On the other hand, in some of the problems, e.g., the Euler tetrahedron problem and the problem of skew lines, the author believed it necessary not to dispense with the simplest concepts of vector analysis. The characteristic advantages of brevity and elegance of the vector method are so obvious, and the time and effort required for mastering it so slight, that the vectorial methods presented here will undoubtedly spur many readers on to look into this attractive area.

For the rest, only the theorems of elementary mathematics are assumed to be known, so that the reading of the book will not entail significant difficulties. In this connection the inclusion of the little problems may in fact increase the acceptability of the book, in that it will perhaps lead the mathematically weaker readers, after completion of the simpler problems, to risk the more difficult ones as well.

So then, let the book go out and do its part to awaken and spread the interest and pleasure in mathematical thought.

Wiesbaden, Fall, 1932
Heinrich Dörrie

Preface to the Second Edition

The second edition of the book contains few changes. An insufficiency in the proof of the Fermat-Gauss Impossibility Theorem has been eliminated, Problem 94 has been placed in historical perspective, and the Problem of the Length of the Polar Night, which in relation to the other problems was of less significance, has been replaced by a problem of a higher level: "André's Derivation of the Secant and Tangent Series."

Wiesbaden, Spring, 1940
Heinrich Dörrie
Contents

Arithmetical Problems
1. Archimedes' Problema Bovinum
2. The Weight Problem of Bachet de Méziriac
3. Newton's Problem of the Fields and Cows
4. Berwick's Problem of the Seven Sevens
5. Kirkman's Schoolgirl Problem
6. The Bernoulli-Euler Problem of the Misaddressed Letters
7. Euler's Problem of Polygon Division
8. Lucas' Problem of the Married Couples
9. Omar Khayyam's Binomial Expansion
10. Cauchy's Mean Theorem
11. Bernoulli's Power Sum Problem
12. The Euler Number
13. Newton's Exponential Series
14. Nicolaus Mercator's Logarithmic Series
15. Newton's Sine and Cosine Series
16. André's Derivation of the Secant and Tangent Series
17. Gregory's Arc Tangent Series
18. Buffon's Needle Problem
19. The Fermat-Euler Prime Number Theorem
20. The Fermat Equation
21. The Fermat-Gauss Impossibility Theorem
22. The Quadratic Reciprocity Law
23. Gauss' Fundamental Theorem of Algebra
24. Sturm's Problem of the Number of Roots
25. Abel's Impossibility Theorem
26. The Hermite-Lindemann Transcendence Theorem

Planimetric Problems
27. Euler's Straight Line
28. The Feuerbach Circle
29. Castillon's Problem
30. Malfatti's Problem
31. Monge's Problem
32. The Tangency Problem of Apollonius
33. Mascheroni's Compass Problem
34. Steiner's Straight-edge Problem
35. The Delian Cube-doubling Problem
36. Trisection of an Angle
37. The Regular Heptadecagon
38. Archimedes' Determination of the Number π
39. Fuss' Problem of the Chord-Tangent Quadrilateral
40. Annex to a Survey
41. Alhazen's Billiard Problem

Problems Concerning Conic Sections and Cycloids
42. An Ellipse from Conjugate Radii
43. An Ellipse in a Parallelogram
44. A Parabola from Four Tangents
45. A Parabola from Four Points
46. A Hyperbola from Four Points
47. Van Schooten's Locus Problem
48. Cardan's Spur Wheel Problem
49. Newton's Ellipse Problem
50. The Poncelet-Brianchon Hyperbola Problem
51. A Parabola as Envelope
52. The Astroid
53. Steiner's Three-pointed Hypocycloid
54. The Most Nearly Circular Ellipse Circumscribing a Quadrilateral
55. The Curvature of Conic Sections
56. Archimedes' Squaring of a Parabola
57. Squaring a Hyperbola
58. Rectification of a Parabola
59. Desargues' Homology Theorem (Theorem of Homologous Triangles)
60. Steiner's Double Element Construction
61. Pascal's Hexagon Theorem
62. Brianchon's Hexagram Theorem
63. Desargues' Involution Theorem
64. A Conic Section from Five Elements
65. A Conic Section and a Straight Line
66. A Conic Section and a Point

Stereometric Problems
67. Steiner's Division of Space by Planes
68. Euler's Tetrahedron Problem
69. The Shortest Distance Between Skew Lines
70. The Sphere Circumscribing a Tetrahedron
71. The Five Regular Solids
72. The Square as an Image of a Quadrilateral
73. The Pohlke-Schwarz Theorem
74. Gauss' Fundamental Theorem of Axonometry
75. Hipparchus' Stereographic Projection
76. The Mercator Projection

Nautical and Astronomical Problems
77. The Problem of the Loxodrome
78. Determining the Position of a Ship at Sea
79. Gauss' Two-Altitude Problem
80. Gauss' Three-Altitude Problem
81. The Kepler Equation
82. Star Setting
83. The Problem of the Sundial
84. The Shadow Curve
85. Solar and Lunar Eclipses
86. Sidereal and Synodic Revolution Periods
87. Progressive and Retrograde Motion of the Planets
88. Lambert's Comet Problem

Extremes
89. Steiner's Problem Concerning the Euler Number
90. Fagnano's Altitude Base Point Problem
91. Fermat's Problem for Torricelli
92. Tacking Under a Headwind
93. The Honeybee Cell (Problem by Réaumur)
94. Regiomontanus' Maximum Problem
95. The Maximum Brightness of Venus
96. A Comet Inside the Earth's Orbit
97. The Problem of the Shortest Twilight
98. Steiner's Ellipse Problem
99. Steiner's Circle Problem
100. Steiner's Sphere Problem

Index of Names
Arithmetical Problems
1. Archimedes' Problema Bovinum

The sun god had a herd of cattle consisting of bulls and cows, one part of which was white, a second black, a third spotted, and a fourth brown.

Among the bulls, the number of white ones was one half plus one third the number of the black greater than the brown; the number of the black, one quarter plus one fifth the number of the spotted greater than the brown; the number of the spotted, one sixth and one seventh the number of the white greater than the brown.

Among the cows, the number of white ones was one third plus one quarter of the total black cattle; the number of the black, one quarter plus one fifth the total of the spotted cattle; the number of the spotted, one fifth plus one sixth the total of the brown cattle; the number of the brown, one sixth plus one seventh the total of the white cattle.

What was the composition of the herd?

Solution. If we use the letters X, Y, Z, T to designate the respective number of the white, black, spotted, and brown bulls and x, y, z, t to designate the white, black, spotted, and brown cows, we obtain the following seven equations for these eight unknowns:

(1) $X - T = \frac{5}{6}Y$,      (4) $x = \frac{7}{12}(Y + y)$,
(2) $Y - T = \frac{9}{20}Z$,     (5) $y = \frac{9}{20}(Z + z)$,
(3) $Z - T = \frac{13}{42}X$,    (6) $z = \frac{11}{30}(T + t)$,
                                 (7) $t = \frac{13}{42}(X + x)$.

From equations (1), (2), (3) we obtain $6X - 5Y = 6T$, $20Y - 9Z = 20T$, $42Z - 13X = 42T$, and taking these three equations as equations for the three unknowns X, Y, and Z, we find

$X = \frac{742}{297}T$, $Y = \frac{178}{99}T$, $Z = \frac{1580}{891}T$.

Since 891 and 1580 possess no common factors, T must be some whole multiple, let us say G, of 891. Consequently,

(I) $X = 2226G$, $Y = 1602G$, $Z = 1580G$, $T = 891G$.
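The elimination that leads to (I) can be checked with exact rational arithmetic. The following Python sketch is an illustrative addition, not part of the original text; the function name `bull_ratios` and the variable names are assumptions made for the example. It solves the three bull equations with T set to 1 and then finds the least T that makes every count a whole number.

```python
from fractions import Fraction
from math import lcm  # multi-argument lcm needs Python 3.9+

def bull_ratios():
    """Solve X - 1 = pY, Y - 1 = qZ, Z - 1 = rX (equations (1)-(3) with T = 1)."""
    p, q, r = Fraction(5, 6), Fraction(9, 20), Fraction(13, 42)
    # Substituting upward: X = 1 + p(1 + q(1 + rX)), hence X(1 - pqr) = 1 + p + pq.
    X = (1 + p + p * q) / (1 - p * q * r)
    Z = 1 + r * X
    Y = 1 + q * Z
    return X, Y, Z

X, Y, Z = bull_ratios()
# Smallest T for which X*T, Y*T and Z*T are all integers.
T_min = lcm(X.denominator, Y.denominator, Z.denominator)
print(X, Y, Z, T_min)
```

Multiplying the three ratios by `T_min` gives the coefficients of G that appear in (I).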
If these values are substituted into equations (4), (5), (6), (7), the following equations are obtained:

$12x - 7y = 11214G$, $20y - 9z = 14220G$, $30z - 11t = 9801G$, $42t - 13x = 28938G$.

These equations are solved for the four unknowns x, y, z, t and we obtain

(II) $cx = 7206360G$, $cy = 4893246G$, $cz = 3515820G$, $ct = 5439213G$,

in which c is the prime number 4657. Since none of the coefficients of G on the right can be divided by c, G must be an integral multiple of c: $G = cg$. If this value of G is introduced into (I) and (II), we finally obtain the following relationships:

(III) $X = 10366482g$, $Y = 7460514g$, $Z = 7358060g$, $T = 4149387g$,
      $x = 7206360g$, $y = 4893246g$, $z = 3515820g$, $t = 5439213g$,

where g may be any positive integer. The problem therefore has an infinite number of solutions. If g is assigned the value 1, we obtain the following:

Solution in the Smallest Numbers

white bulls      10,366,482      white cows      7,206,360
black bulls       7,460,514      black cows      4,893,246
spotted bulls     7,358,060      spotted cows    3,515,820
brown bulls       4,149,387      brown cows      5,439,213

Historical. As the above solution shows, the problem of the cattle cannot properly be considered a very difficult problem, at least in terms of present concepts. Since, however, in ancient times a difficult problem was frequently referred to specifically as a problema bovinum or else as a problema Archimedis, one may assume that the form of the problem dealt with above does not represent the complete and original form of Archimedes' problem, especially when one considers
the rest of Archimedes' brilliant achievements, as well as the fact that Archimedes dedicated the cattle problem to the Alexandrian astronomer Eratosthenes.

A "more complete" formulation of the problem is contained in a manuscript (in Greek) discovered by Gotthold Ephraim Lessing in the Wolfenbüttel library in 1773. Here the problem is posed in the following poetic form, made up of twenty-two distichs, or pairs of verses:

Number the sun god's cattle, my friend, with perfect precision.
Reckon them up with great care, if any wisdom you'd claim:
How many cattle were there that once did graze in the meadows
On the Sicilian isle, sorted by herds into four,
Each of these four herds differently colored: the first herd was milk-white,
Whereas the second gleamed in a deep ebony black.
Brown was the third group, the fourth was spotted; in every division
Bulls of respective hues greatly outnumbered the cows.
Now, these were the proportions among the cattle: the white ones
Equaled the number of brown, adding to that the third part
Plus one half of the ebony cattle all taken together.
Further, the group of the black equaled one fourth of the flecked
Plus one fifth of them, taken along with the total of brown ones.
Finally, you must assume, friend, that the total with spots
Equaled a sixth plus a seventh part of the herd of white cattle,
Adding to that the entire herd of the brown-colored kine.
Yet quite different proportions held for the female contingent:
Cows with white-colored hair equaled in number one third
Plus one fourth of the black-hued cattle, the males and the females.
Further, the cows colored black totaled in number one fourth
Plus one fifth of the whole spotted herd, in this computation
Counting in each spotted cow, each spotted bull in the group.
Likewise, the spotted cows comprised the fifth and the sixth part
Out of the total of brown cattle that went out to graze.
Lastly, the cows colored brown made up a sixth and a seventh
Out of the white-coated herd, female and male ones alike.
If, my friend, you can tell me exactly what was the number
Gathered together there then, also the accurate count
Color by color of every well-nourished male and each female,
Then with right you'll be called skillful in keeping accounts.
But you will not be reckoned a wise man yet; if you would be,
Come and answer me this, using new data I give:
When the entire aggregation of white bulls and that of the black bulls
Joined together, they all made a formation that was
Equally broad and deep; the far-flung Sicilian meadows
Now were thoroughly filled, covered by great crowds of bulls.
But when the brown and the spotted bulls were assembled together,
Then was a triangle formed; one bull stood at the tip;
None of the brown-colored bulls was missing, none of the spotted,
Nor was there one to be found different in color from these.
If this, too, you discover and grasp it well in your thinking,
If, my friend, you supply every herd's make-up and count,
Then with justice proclaim yourself victor and march about proudly,
For your fame will glow bright all through the world of the wise.

Lessing, however, disputed the authorship of Archimedes. So also did Nesselmann (Algebra der Griechen, 1842), the French writer Vincent (Nouvelles Annales de Mathématiques, vol. XV, 1856), the Englishman Rouse Ball (A Short Account of the History of Mathematics), and others.

The distinguished Danish authority on Archimedes J. L. Heiberg (Quaestiones Archimedeae), the French mathematician P. Tannery (Sciences exactes dans l'antiquité), as well as Krumbiegel and Amthor (Schlömilchs Zeitschrift für Mathematik und Physik, vol. XXV, 1880), on the other hand, are of the opinion that this complete form of the problem is to be attributed to Archimedes.

The two conditions set forth in the last seven distichs require that X + Y be a square number $U^2$ and Z + T a triangular number* $\frac{1}{2}V(V + 1)$, as a result of which we obtain the following relations:

(8) $X + Y = U^2$ and (9) $2Z + 2T = V^2 + V$.

If we substitute in (8) and (9) the values X, Y, Z, T in accordance with (I), these equations are transformed into

$3828G = U^2$ and $4942G = V^2 + V$.
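The solution in the smallest numbers, and the coefficients 3828 and 4942 just obtained, can be confirmed by direct substitution. The Python sketch below is an illustrative addition rather than part of the original text; the helper name `seven_conditions` is an assumption. It checks equations (1) to (7) exactly and compares X + Y and 2Z + 2T with 3828G and 4942G for G = 4657.

```python
from fractions import Fraction as Fr

# Smallest solution (g = 1) as given in the text.
X, Y, Z, T = 10366482, 7460514, 7358060, 4149387   # white, black, spotted, brown bulls
x, y, z, t = 7206360, 4893246, 3515820, 5439213    # white, black, spotted, brown cows

def seven_conditions(X, Y, Z, T, x, y, z, t):
    """Return the truth values of equations (1) to (7) for the given herd."""
    return [
        X - T == Fr(5, 6) * Y,
        Y - T == Fr(9, 20) * Z,
        Z - T == Fr(13, 42) * X,
        x == Fr(7, 12) * (Y + y),
        y == Fr(9, 20) * (Z + z),
        z == Fr(11, 30) * (T + t),
        t == Fr(13, 42) * (X + x),
    ]

checks = seven_conditions(X, Y, Z, T, x, y, z, t)
G = T // 891                                  # from T = 891G; equals c when g = 1
white_black_ok = X + Y == 3828 * G            # left side of (8)
spotted_brown_ok = 2 * (Z + T) == 4942 * G    # left side of (9)
print(all(checks), G, white_black_ok, spotted_brown_ok)
```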
If we replace 3828, 4942, and G, respectively, with 4a (a being equal to $3 \cdot 11 \cdot 29 = 957$), b, and cg, we obtain

(8') $U^2 = 4acg$, (9') $V^2 + V = bcg$.

* A triangular number is a number n such that it is possible to construct with n points a lattice of congruent equilateral triangles whose vertexes are the points. The first triangular numbers are $1 = \frac{1}{2} \cdot 1 \cdot 2$, $3 = 1 + 2 = \frac{1}{2} \cdot 2 \cdot 3$, $6 = 1 + 2 + 3 = \frac{1}{2} \cdot 3 \cdot 4$, $10 = 1 + 2 + 3 + 4 = \frac{1}{2} \cdot 4 \cdot 5$, etc.
U is consequently an integral multiple of 2, a, and c: $U = 2acu$, so that $U^2 = 4a^2c^2u^2 = 4acg$ and

(8'') $g = acu^2$.

If this value for g is introduced into (9') we obtain $V^2 + V = abc^2u^2$ or

$(2V + 1)^2 = 4abc^2u^2 + 1$.

If the unknown $2V + 1$ is designated as v and the product $4abc^2 = 4 \cdot 3 \cdot 11 \cdot 29 \cdot 2 \cdot 7 \cdot 353 \cdot 4657^2$ is abbreviated as d, the last equation is transformed into

$v^2 - du^2 = 1$.

This is a so-called Fermat equation, which can be solved in the manner described in Problem 20. The solution is, however, extremely difficult because d has the inconveniently large value

$d = 410286423278424$

and even the smallest solution for u and v of this Fermat equation leads to astronomical figures.

Even if u is assigned the smallest conceivable value 1, g takes the value $ac = 4456749$, and the combined number of white and black bulls is over 79 billion (a billion here denoting $10^{12}$). However, since the island of Sicily has an area of only 25500 km² = 0.0255 billion m², it would be quite impossible to place that many bulls on the island, which contradicts the assertion of the seventeenth and eighteenth distichs.

2. The Weight Problem of Bachet de Méziriac

A merchant had a forty-pound measuring weight that broke into four pieces as the result of a fall. When the pieces were subsequently weighed, it was found that the weight of each piece was a whole number of pounds and that the four pieces could be used to weigh every integral weight between 1 and 40 pounds. What were the weights of the pieces?
This problem stems from the French mathematician Claude Gaspard Bachet de Méziriac (1581-1638), who solved it in his famous book Problèmes plaisants et délectables qui se font par les nombres, published in 1624.

We can distinguish the two scales of the balance as the weight scale and the load scale. On the former we will place only pieces of the measuring weight; whereas on the load scale we will place the load and any additional measuring weights. If we are to make do with as few measuring weights as possible it will be necessary to place measuring weights on the load scale as well. For example, in order to weigh one pound with a two-pound and a three-pound piece, we place the two-pound piece on the load scale and the three-pound piece on the weight scale.

If we single out several from among any number of weights lying on the scales, e.g., two pieces weighing 5 and 10 lbs each on one scale, and three pieces weighing 1, 3, and 4 lbs each on the other, we say that these pieces give the first scale a preponderance of 7 lbs. We will consider only integral loads and measuring weights, i.e., loads and weights weighing a whole number of pounds.

If we have a series of measuring weights A, B, C, ..., which when properly distributed upon the scales enable us to weigh all the integral loads from 1 through n lbs, and if P is a new measuring weight of such nature that its weight p exceeds the total weight n of the old measuring weights by 1 more than that total weight:

$p - n = n + 1$ or $p = 2n + 1$,

it is then possible to weigh all integral loads from 1 through $p + n = 3n + 1$ by addition of the weight P to the measuring weights A, B, C, ....

In fact, the old pieces are sufficient to weigh all loads from 1 to n lbs. In order to weigh a load of $p + x$ or $p - x$ lbs, where x is one of the numbers from 1 to n, we place the measuring weight P on the weight scale and place weights A, B, C, ...
on the scales in such a manner that these pieces give either the weight scale or the load scale a preponderance of x lbs. This being established, the solution of the problem is easy. In order to carry out the maximum possible number of weighings with two measuring weights, A and B, A must weigh 1 lb and B 3 lbs. These two pieces enable us to weigh loads of 1, 2, 3, 4 lbs.
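The extension step $p = 2n + 1$ established above lends itself to a brute-force check. The Python sketch below is an illustrative addition, not part of the original text; the function names `weighable_loads` and `extend` are assumptions. Each piece is placed on the weight scale (+1), on the load scale (-1), or left off (0), and the loads that can be balanced are collected.

```python
from itertools import product

def weighable_loads(weights):
    """Positive integer loads that can be balanced with the given pieces."""
    loads = set()
    for signs in product((-1, 0, 1), repeat=len(weights)):
        # +1: piece on the weight scale, -1: piece beside the load, 0: unused
        total = sum(s * w for s, w in zip(signs, weights))
        if total > 0:
            loads.add(total)
    return loads

def extend(weights):
    """Append the piece p = 2n + 1, where n is the total of the existing pieces."""
    n = sum(weights)
    return weights + [2 * n + 1]

pieces = [1]
for _ in range(3):
    pieces = extend(pieces)
    n = sum(pieces)
    # After each extension every load from 1 through n must be weighable.
    assert weighable_loads(pieces) == set(range(1, n + 1))
print(pieces, sum(pieces))
```

Starting from a single one-pound piece, each pass through the loop applies the rule once more, so the reach grows from n to 3n + 1 at every step.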
If we then choose a third piece C such that its weight $c = 2 \cdot 4 + 1 = 9$ lbs, it then becomes possible to use the three pieces A, B, C to weigh all integral loads from 1 to $c + 4 = 9 + 4 = 13$. Finally, if we choose a fourth piece D such that its weight $d = 2 \cdot 13 + 1 = 27$ lbs, the four weights A, B, C, D then enable us to weigh all loads from 1 to $27 + 13 = 40$ lbs.

Conclusion. The four pieces weigh 1, 3, 9, 27 lbs.

Note. Bachet's weight problem was generalized by the English mathematician MacMahon. In Volume 21 of the Quarterly Journal of Mathematics (1886) MacMahon determined all the conceivable sets of integral weights with which all loads of 1 to n lbs can be weighed.

3. Newton's Problem of the Fields and Cows

In Newton's Arithmetica universalis (1707) the following interesting problem is posed:

a cows graze b fields bare in c days,
a' cows graze b' fields bare in c' days,
a'' cows graze b'' fields bare in c'' days;

what relation exists between the nine magnitudes a to c''? It is assumed that all the fields provide the same amount of grass, that the daily growth of the fields remains constant, and that all the cows eat the same amount each day.

Solution. Let the initial amount of grass contained by each field be M, the daily growth of each field m, and the daily grass consumption of each cow Q.

On the evening of the first day the amount of grass remaining in the b fields is

$bM + bm - aQ$,

on the evening of the second day

$bM + 2bm - 2aQ$,

on the evening of the third day

$bM + 3bm - 3aQ$,
etc., so that on the evening of the cth day it is

$bM + cbm - caQ$.

And this value must be equal to zero, since the fields are grazed bare in c days. This gives rise to the equation

(1) $bM + cbm = caQ$.

In like manner the following equations are obtained:

(2) $b'M + c'b'm = c'a'Q$ and (3) $b''M + c''b''m = c''a''Q$.

If (1) and (2) are taken as linear equations for the unknowns M and m, we obtain

$M = \frac{cc'(ab' - ba')}{bb'(c' - c)}\,Q$, $m = \frac{bc'a' - b'ca}{bb'(c' - c)}\,Q$.

If these values are introduced into equation (3) and the resulting equation is multiplied by $[bb'(c' - c)]/Q$, we obtain the desired relation:

$b''cc'(ab' - ba') + c''b''(bc'a' - b'ca) = c''a''bb'(c' - c)$.

The solution is more easily seen when expressed in the form of determinants. If q represents the negative of Q, equations (1), (2), (3) assume the form

$bM + cbm + caq = 0$,
$b'M + c'b'm + c'a'q = 0$,
$b''M + c''b''m + c''a''q = 0$.

According to one of the basic theorems of determinant theory, the determinant of a system of n (3 in this case) linear homogeneous equations possessing n unknowns that do not all vanish (M, m, q in this case) must be equal to zero. Consequently, the desired relation has the form

$\begin{vmatrix} b & bc & ca \\ b' & b'c' & c'a' \\ b'' & b''c'' & c''a'' \end{vmatrix} = 0$.
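Both forms of the relation can be tried out on numbers. The Python sketch below is an illustrative addition, not part of the original text; the grass data M, m, Q and the three (fields, days) pairs are hypothetical values chosen for the example, and `a1`, `a2` stand for a' and a''. The herd sizes are computed from equation (1), so they need not be whole numbers.

```python
from fractions import Fraction

def cows_needed(b, c, M, m, Q):
    """Herd size a that grazes b fields bare in c days: bM + cbm = caQ."""
    return Fraction(b * M + c * b * m, c * Q)

def newton_relation(a, b, c, a1, b1, c1, a2, b2, c2):
    """Left side minus right side of the relation derived above."""
    return (b2 * c * c1 * (a * b1 - b * a1)
            + c2 * b2 * (b * c1 * a1 - b1 * c * a)
            - c2 * a2 * b * b1 * (c1 - c))

def det3(rows):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (p, q, r), (s, t, u), (v, w, x) = rows
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)

M, m, Q = 12, 1, 2                        # hypothetical grass data
cases = [(3, 4), (5, 6), (7, 10)]         # hypothetical (fields, days) pairs
(a, b, c), (a1, b1, c1), (a2, b2, c2) = [
    (cows_needed(f, d, M, m, Q), f, d) for f, d in cases
]

residual = newton_relation(a, b, c, a1, b1, c1, a2, b2, c2)
determinant = det3([(b, b * c, c * a),
                    (b1, b1 * c1, c1 * a1),
                    (b2, b2 * c2, c2 * a2)])
print(a, a1, a2, residual, determinant)
```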
4. Berwick's Problem of the Seven Sevens

In the following division example, in which the divisor goes into the dividend without a remainder:

    * * 7 * * * * * * * : * * * * 7 * = * * 7 * *
    * * * * * *
    * * * * * 7 *
    * * * * * * *
        * 7 * * * *
        * 7 * * * *
        * * * * * * *
        * * * * 7 * *
            * * * * * *
            * * * * * *

the numbers that occupied the places marked with the asterisks (*) were accidentally erased. What are the missing numbers?

This remarkable problem comes from the English mathematician E. H. Berwick, who published it in 1906 in the periodical The School World.

Solution. We will assign a separate letter to each of the missing numerals. The example then has the following appearance:

    A B 7 C D E L Q W z : α β γ δ 7 ε = κ λ 7 μ ν      First line
    a b Λ c d e                                         Second line
    F G H I K 7 L                                       Third line
    f g h i k σ l                                       Fourth line
        M 7 N O P Q                                     Fifth line
        m 7 n o p q                                     Sixth line = 7·b
        R S T U Σ V W                                   Seventh line
        r s t u 7 v w                                   Eighth line
            X Y Z x y z                                 Ninth line
            X Y Z x y z                                 Tenth line

I. The first numeral ($\alpha$) of the divisor b must be 1, since 7b, as the sixth line of the example shows, possesses six numerals, whereas if $\alpha$ equaled 2, 7b would possess seven numerals. Since the remainders in the third and seventh lines possess six numerals, F must equal 1 and R must equal 1, as a result of which f and r must also equal 1 (according to the outline).
Since b cannot exceed 199979, the maximum value of $\mu$ is 9, so that the product in the eighth line cannot exceed 1799811, and $s < 8$. And since S can only be 9 or 0, and since there is no remainder in the ninth line under s, only the second case is possible. Consequently, S = 0 and (since R = 1) s is also equal to 0. It also follows from R = 1 and S = 0 that M = m + 1, thus $m \leq 8$, and the product 7b of the sixth line cannot be higher than 87nopq.

II. Consequently, the only possible values for the second divisor numeral $\beta$ are 0, 1, and 2. ($7 \cdot 130000$ is already higher than 900000.)

$\beta = 0$ is eliminated because even when multiplied by nine 109979 does not give a seven-figure number, which, for example, is required by the eighth line.

Let us then consider the case of $\beta = 1$. This requires $\gamma$ to be equal to only 0 or 1. (If $\gamma \geq 2$, on determination of the second figure of line 6 one would have to add to $7\beta = 7 \cdot 1 = 7$ the amount $\geq 1$ coming from the product $7 \cdot \gamma$, whereas the second figure must be 7.) $\gamma = 0$, however, is impossible as a result of the seven figures of line 8, since not even $9 \cdot 110979$ yields a seven-figure product.

In the event that $\gamma = 1$ the following conditions must be observed, as a glance at line 8 will show: $\delta$, $\varepsilon$, and $\mu$ must be so chosen that $\mu \cdot 111\delta7\varepsilon$ results in a seven-place number, the third last figure of which is 7. The only hope of this is offered by the multiplier $\mu = 9$ (since even $8 \cdot 111979$ has only six places). Now the third last figure of $9 \cdot 111\delta7\varepsilon$, as is easily seen by experiment, can be a seven only if $\delta = 0$ or $\delta = 9$. In the first case line 8 will not possess seven places even when 111079 is multiplied by 9, and in the second case line 6 is $7 \cdot 11197\varepsilon$ = 783***, which is impossible. Thus, the case of $\gamma = 1$ is also excluded.

The possibility of $\beta$ equaling 1 must, therefore, be discarded. The only appropriate value for the second figure of the divisor is therefore $\beta = 2$. From this it follows that m = 8 and M = 9.

III. The third figure $\gamma$ of the divisor can only be 4 or 5, since $7 \cdot 126000$ is greater and $7 \cdot 124000$ is smaller than the sixth line. Moreover, since $9 \cdot 124000$ is greater and $7 \cdot 126000$ is smaller than the eighth line (10tu7vw), $\mu$ must be equal to 8.

Since $8 \cdot 124979 = 999832 < 1000000$ the assumption that $\gamma = 4$ fails to satisfy the requirements of line 8, and $\gamma$ therefore has to be equal to 5.

IV. Since the third last figure of $8 \cdot 125\delta7\varepsilon$ must be 7, we find by
testing that $\delta$ is equal to either 4 or 9. $\delta = 9$ is eliminated because even $7 \cdot 125970 = 881790$ comes out greater than the sixth line, so that only $\delta = 4$ is suitable. Thus, $\varepsilon$ can be considered one of the numbers 0 to 4. However, whichever one of these is chosen, we find for the third figure of the sixth line n = 8 from $7 \cdot 12547\varepsilon$ = 878***. Similarly, for the eighth line we obtain $8 \cdot 12547\varepsilon$ = 10037**, and consequently t = 0 and u = 3.

Since $\lambda b = \lambda \cdot 12547\varepsilon$ results in a seven-place fourth line and only 8b and 9b have seven places, $\lambda$ is either 8 or 9.

V. From t = 0 and $X \geq 1$ (together with R = r = 1, S = s = 0) it follows that $T \geq 1$, and from n = 8, $N \leq 9$, it follows that $T \leq 1$, so that T = 1. N is therefore equal to 9 and X = 1.

Since X = 1 and $2 \cdot b > 200000$ (line 9), it follows that $\nu = 1$ and also that Y = 2, Z = 5, x = 4, y = 7, and $z = \varepsilon$.

With the results obtained at this point the problem has the following appearance:

    A B 7 C D E L Q W ε : 1 2 5 4 7 ε = κ λ 7 8 1
    a b Λ c d e
    1 G H I K 7 L
    1 g h i k σ l
        9 7 9 O P Q
        8 7 8 o p q
        1 0 1 U Σ V W
        1 0 0 3 7 v w
            1 2 5 4 7 ε
            1 2 5 4 7 ε

VI. In this case $\varepsilon$ is one of the five numbers 0, 1, 2, 3, 4. These five cases correspond to the number series

vw = 60, 68, 76, 84, 92,
opq = 290, 297, 304, 311, 318

and, depending upon whether $\lambda$ is equal to 8 or 9,

$\sigma l$ = 60, 68, 76, 84, 92 or $\sigma l$ = 30, 39, 48, 57, 66.
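The number series above follow directly from the last figures of 7b, 8b, and 9b for the five candidate divisors. The Python sketch below is an illustrative addition, not part of the original text; the helper name `tail` and the row layout are assumptions made for the example.

```python
def tail(n, k):
    """Last k decimal figures of n, as an integer."""
    return n % 10**k

rows = []
for eps in range(5):
    b = 125470 + eps                  # candidate divisor 12547e with e = eps
    rows.append((
        eps,
        tail(8 * b, 2),               # vw: last two figures of the eighth line, 8b
        tail(7 * b, 3),               # opq: last three figures of the sixth line, 7b
        tail(8 * b, 2),               # last two figures of the fourth line if lambda = 8
        tail(9 * b, 2),               # last two figures of the fourth line if lambda = 9
    ))

for row in rows:
    print(row)
```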
This presents ten different possibilities. If we test each of them by going upward in three successive additions beginning from lines 9 and 8 to line 7, then from lines 7 and 6 to line 5, and finally from lines 5 and 4 to line 3, we find that only when $\varepsilon = 3$ and $\lambda = 8$ do we obtain the requisite 7 for the next to last figure of line 3. In this case vw = 84, $U\Sigma VW = 6331$, opq = 311, OPQ = 944, $ghik\sigma l = 003784$, and GHIK7L = 101778. This gives the problem the following appearance:

    A B 7 C D E 8 4 1 3 : 1 2 5 4 7 3 = κ 8 7 8 1
    a b Λ c d e
    1 1 0 1 7 7 8
    1 0 0 3 7 8 4
        9 7 9 9 4 4
        8 7 8 3 1 1
        1 0 1 6 3 3 1
        1 0 0 3 7 8 4
            1 2 5 4 7 3
            1 2 5 4 7 3

VII. Finally, since of all the multiples of b only 5b = 627365 added to the division remainder 110177 of the third line gives a number containing a 7 in the third place, we get $\kappa = 5$ and at the same time $ab\Lambda cde = 627365$ and AB7CDE = 737542, which gives us all of the figures missing from the problem.

5. Kirkman's Schoolgirl Problem

In a boarding school there are fifteen schoolgirls who always take their daily walks in rows of threes. How can it be arranged so that each schoolgirl walks in the same row with every other schoolgirl exactly once a week?

This extraordinary problem was posed in the Lady's and Gentleman's Diary for 1850, by the English mathematician T. P. Kirkman.

Of the great number of solutions that have been found we will reproduce two. One is by the English minister Andrew Frost ("General Solution and Extension of the Problem of the 15 Schoolgirls," Quarterly Journal of Pure and Applied Mathematics, vol. XI, 1871); the other is that of B. Peirce ("Cyclic Solutions of the School-girl Puzzle," The Astronomical Journal, vol. VI, 1859-1861).
Frost's solution. Mathematically expressed the problem consists of arranging the fifteen elements $x, a_1, a_2, b_1, b_2, c_1, c_2, d_1, d_2, e_1, e_2, f_1, f_2, g_1, g_2$ in seven columns of five triplets each in such a way that any two selected elements always occur in one and only one of the 35 triplets. As the initial triplets of the seven columns we shall select:

    x a₁ a₂ | x b₁ b₂ | x c₁ c₂ | x d₁ d₂ | x e₁ e₂ | x f₁ f₂ | x g₁ g₂.

Then we have only to distribute the 14 elements $a_1, a_2, b_1, b_2, \ldots, g_1, g_2$ correctly over the other four lines of our system.

Using the seven letters a, b, c, d, e, f, g, we form a group of triplets in which each pair of elements occurs exactly once, specifically the group:

abc, ade, afg, bdf, beg, cdg, cef.

(The triplets are in alphabetical order.) From this group it is possible to take for each column exactly four triplets that contain all the letters except those contained in the first line of the column. If we then place the appropriate triplets in alphabetical order in each column, we obtain the following preliminary arrangement:

    Sun.       Mon.       Tues.      Wed.       Thurs.     Fri.       Sat.
    x a₁ a₂    x b₁ b₂    x c₁ c₂    x d₁ d₂    x e₁ e₂    x f₁ f₂    x g₁ g₂
    bdf        ade        ade        abc        abc        abc        abc
    beg        afg        afg        afg        afg        ade        ade
    cdg        cdg        bdf        beg        bdf        beg        bdf
    cef        cef        beg        cef        cdg        cdg        cef

Now we have to index the triplets bdf, beg, cdg, cef, ade, afg, abc, i.e., to provide them with the index numbers 1 and 2. We index them in the order just mentioned, i.e., first all the triplets bdf, then all the triplets beg, etc., observing the following rules:

I. When a letter in one column has received its index number, the next time that letter occurs in the same column it receives the other index number.

II. If two letters of a triplet have already been assigned index numbers, these two index numbers must not be used in the same sequence for the same letters in other triplets.

III. If the index number of a letter is not determined by rules I. and II., the letter is assigned the index number 1.
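The preliminary arrangement rests on two properties of the seven-letter group: every pair of letters occurs in exactly one triplet, and for each letter exactly four triplets avoid it and together contain each of the other six letters twice. The Python sketch below is an illustrative addition, not part of the original text; the names `pair_counts` and `column_for` are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations

letters = "abcdefg"
group = ["abc", "ade", "afg", "bdf", "beg", "cdg", "cef"]

def pair_counts(triplets):
    """Count how often each unordered pair of letters shares a triplet."""
    counts = Counter()
    for triplet in triplets:
        for pair in combinations(sorted(triplet), 2):
            counts[pair] += 1
    return counts

def column_for(letter):
    """Triplets of the group that do not contain the letter heading the column."""
    return [t for t in group if letter not in t]

counts = pair_counts(group)
each_pair_once = (len(counts) == 21) and all(v == 1 for v in counts.values())

for letter in letters:
    column = column_for(letter)
    usage = Counter("".join(column))
    # Four triplets per column; every other letter appears exactly twice.
    assert len(column) == 4
    assert all(usage[other] == 2 for other in letters if other != letter)

print(each_pair_once, column_for("a"), column_for("b"))
```

The two appearances of each letter in a column are what rule I. later separates into the index numbers 1 and 2.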
The letters are indexed in three steps.

First step. The triplets bdf, beg, cdg, and all the letters aside from a that can be indexed in accordance with this numbering system and rules I., II., and III. are successively indexed.

Second step. The missing index numbers (printed in boldface in the original diagram) of the triplets ade and afg, as well as the index numbers obtained in accordance with rule I. for the last two a's in line 2, are assigned.

Third step. The still missing index numbers of the a's in columns 4 and 5 (in the empty spaces of the printed diagram) are inserted; these are 2 in line 2 and 1 in line 3.

This method results in the following completed diagram, which represents the solution of the problem.

    Sun.     Mon.     Tues.    Wed.     Thurs.   Fri.     Sat.
    xa1a2    xb1b2    xc1c2    xd1d2    xe1e2    xf1f2    xg1g2
    b1d1f1   a1d2e2   a1d1e1   ab2c2    ab1c1    a1b2c1   a1b1c2
    b2e1g1   a2f2g2   a2f1g1   af2g1    af1g2    a2d2e1   a2d1e2
    c1d2g2   c1d1g1   b1d2f2   b1e1g2   b2d1f2   b1e2g1   b2d2f1
    c2e2f2   c2e1f1   b2e2g2   c1e2f1   c2d2g1   c2d1g2   c1e1f2

Pierce's solution (judged the best by Sylvester). Let one girl, whom we will indicate as *, walk in the middle of the same row on all days; we will divide the other girls into two groups of 7 and designate the first group by the Arabic numbers 1 to 7 or else by lower-case letters and the second group by the Roman numbers I to VII or else by capital letters. We will let an equation such as R = s indicate that the Roman number indicated by the letter R possesses the same numerical value as the Arabic numeral corresponding to the letter s. Also, we will designate the days of the week Sunday, Monday, ..., Saturday by the numerals 0, 1, 2, ..., 6. Let the Sunday arrangement have the following order:

    a   α   A
    b   β   B
    c   γ   C
    d   *   D
    E   F   G
From this, by adding r = R to each numeral, we obtain the arrangement

    a + r   α + r   A + R
    b + r   β + r   B + R
    c + r   γ + r   C + R
    d + r   *       D + R
    E + R   F + R   G + R

for the rth weekday. Here every figure thus obtained that exceeds 7, such as perhaps c + r or D + R, represents the girl whose number is 7 below that figure (c + r − 7 or D + R − 7), and is subsequently converted into that number.

The arrangements thus obtained yield the solution of the problem if the following three conditions are satisfied:

I. The three differences α − a, β − b, γ − c are 1, 2, and 3.

II. The seven differences A − a, A − α, B − b, B − β, C − c, C − γ, D − d form a complete residue system of incongruent numbers to the modulus 7 (cf. No. 19).

III. The three differences F − E, G − F, G − E are 1, 2, 3.

Proof. We take as a premise that the following congruences (cf. No. 19) are all related to the modulus 7.

1. Each girl x of the first group will come together exactly once with every other girl y of this group. The difference x − y is then (according to I.) congruent to only one of the 6 differences a − α, b − β, c − γ, α − a, β − b, γ − c. Let us assume x − y ≡ β − b or x − β ≡ y − b. Thus, if r represents the number of the day of the week that is congruent to x − β (or y − b), then x ≡ β + r and y ≡ b + r, so that the girls x and y walk in the same row on weekday r.

2. Each girl x of the first group comes together exactly once with each girl X of the second group. The difference X − x (according to II.) can be congruent to only one of the seven differences A − a, A − α, B − b, B − β, C − c, C − γ, D − d. Let us assume X − x ≡ C − γ or X − C ≡ x − γ. If s = S is the weekday number that is congruent to X − C (or x − γ), then we have X ≡ C + S and x ≡ γ + s, so that the girls X and x walk in the same row on weekday s.
3. Each girl X of the second group comes together exactly once with every other girl Y of this group. The difference X − Y is (according to III.) congruent to only one of the differences F − E, G − F, G − E, E − F, F − G, E − G. Let us assume that X − Y ≡ G − F or X − G ≡ Y − F. Then if R represents the weekday number that is congruent to X − G (or Y − F), we obtain X ≡ G + R and Y ≡ F + R, so that the girls X and Y walk in the same row on weekday R.

Thus, we need only satisfy conditions I., II., and III. to obtain the Sunday arrangement. We choose a = 1, α = 2, b = 3, consequently β = 5, and then c = 4, so that γ = 7 and d = 6. We then select A = I, and thus B = VI, C = II, and D = III, so that the differences mentioned in condition II. are the numbers 0, −1, 3, 1, −2, −5, −3, which are incongruent to the modulus 7. The numbers IV, V, and VII then remain for the letters E, F, G. The Sunday arrangement is therefore

    1    2    I
    3    5    VI
    4    7    II
    6    *    III
    IV   V    VII

The weekday rows, in order from Monday through Saturday, are arranged in the following manner:

    Mon.             Tues.            Wed.
    2    3    II     3    4    III    4    5    IV
    4    6    VII    5    7    I      6    1    II
    5    1    III    6    2    IV     7    3    V
    7    *    IV     1    *    V      2    *    VI
    V    VI   I      VI   VII  II     VII  I    III

    Thurs.           Fri.             Sat.
    5    6    V      6    7    VI     7    1    VII
    7    2    III    1    3    IV     2    4    V
    1    4    VI     2    5    VII    3    6    I
    3    *    VII    4    *    I      5    *    II
    I    II   IV     II   III  V      III  IV   VI
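The cyclic construction lends itself to a direct check. The sketch below is an illustration only; the encoding of the girls as tagged pairs and the function names are assumptions made here, not notation from the text. It starts from the Sunday arrangement 1 2 I, 3 5 VI, 4 7 II, 6 * III, IV V VII, adds r modulo 7 for weekday r, and confirms that all 105 pairs of girls meet exactly once.

```python
from itertools import combinations

# Sunday rows as chosen in the text. ("a", k) is Arabic girl k,
# ("R", k) is Roman girl k, and "*" is the girl who never moves.
SUNDAY = [
    [("a", 1), ("a", 2), ("R", 1)],
    [("a", 3), ("a", 5), ("R", 6)],
    [("a", 4), ("a", 7), ("R", 2)],
    [("a", 6), "*", ("R", 3)],
    [("R", 4), ("R", 5), ("R", 7)],
]


def shift(girl, r):
    """Add r to a girl's number, reducing results above 7 by 7."""
    if girl == "*":
        return girl
    group, k = girl
    return (group, (k + r - 1) % 7 + 1)


def week(sunday):
    """Arrangements for weekdays 0 (Sunday) through 6 (Saturday)."""
    return [[[shift(g, r) for g in row] for row in sunday] for r in range(7)]


def every_pair_once(days):
    """True if each of the 105 pairs of girls shares a row exactly once."""
    seen = {}
    for day in days:
        for row in day:
            for pair in combinations(sorted(row, key=str), 2):
                seen[pair] = seen.get(pair, 0) + 1
    return len(seen) == 105 and all(count == 1 for count in seen.values())


print(every_pair_once(week(SUNDAY)))
```

Replacing SUNDAY by an arrangement that violates one of the conditions I. to III. should make the check fail, which gives a quick way to experiment with other starting rows.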
The Bernoulli-Euler Problem of the Misaddressed Letters

To determine the number of permutations of n elements in which no element occupies its natural place.

This problem was first considered by Nicolaus Bernoulli (1687-1759), the nephew of the two great mathematicians Jacob and Johann Bernoulli. Later Euler became interested in the problem, which he called a quaestio curiosa ex doctrina combinationis (a curious problem of combination theory), and he solved it independently of Bernoulli.

The problem can be stated in a somewhat more concrete form as the problem of the misaddressed letters:

Someone writes n letters and writes the corresponding addresses on n envelopes. How many different ways are there of placing all the letters in the wrong envelopes?

This problem is particularly interesting because of its ingenious solution.

Let the letters be known as a, b, c, ..., the corresponding envelopes as A, B, C, .... Let the number of misplacements, which we are seeking, be designated as $\bar{n}$.

Let us first consider all the cases in which a finds its way into B and b into A as one group, and all the cases in which a gets into B and b does not get into A as a second group.

The first group obviously includes $\overline{n-2}$ cases, since the remaining n − 2 letters must all be misplaced among their own envelopes.

The number of cases falling into the second group can be determined if instead of b, c, d, e, ... and A, C, D, E, ... we write, say, b', c', d', e', ... and B', C', D', E', .... Accordingly, the number is $\overline{n-1}$.

The number of all the cases in which a ends up in B is then $\overline{n-1} + \overline{n-2}$. Since each operation of placing "a in C," "a in D," ... yields an equal number of cases, the total number $\bar{n}$ of all the possible cases is

$$\bar{n} = (n-1)\,[\,\overline{n-1} + \overline{n-2}\,].$$

We write this recurrence formula

$$\bar{n} - n\cdot\overline{n-1} = \iota\,[\,\overline{n-1} - (n-1)\cdot\overline{n-2}\,],$$
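in which $\iota$ represents −1, and apply it to the letter numbers 3, 4, 5, ... up to n. Thus, we obtain

$$\bar{3} - 3\cdot\bar{2} = \iota\,[\,\bar{2} - 2\cdot\bar{1}\,],$$
$$\bar{4} - 4\cdot\bar{3} = \iota\,[\,\bar{3} - 3\cdot\bar{2}\,],$$
$$\cdots$$
$$\bar{n} - n\cdot\overline{n-1} = \iota\,[\,\overline{n-1} - (n-1)\cdot\overline{n-2}\,].$$

By multiplying these (n − 2) equations we obtain

$$\bar{n} - n\cdot\overline{n-1} = \iota^{\,n-2}\,[\,\bar{2} - 2\cdot\bar{1}\,],$$

or, since $\bar{1} = 0$, $\bar{2} = 1$, and $\iota^{\,n-2} = \iota^{\,n}$,

$$\bar{n} - n\cdot\overline{n-1} = \iota^{\,n}.$$

We then divide this equation by n!, which gives

$$\frac{\bar{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{\iota^{\,n}}{n!}.$$

If we replace n in this formula by the series 2, 3, 4, ..., n, we obtain

$$\frac{\bar{2}}{2!} - \frac{\bar{1}}{1!} = \frac{\iota^{2}}{2!},$$
$$\frac{\bar{3}}{3!} - \frac{\bar{2}}{2!} = \frac{\iota^{3}}{3!},$$
$$\cdots$$
$$\frac{\bar{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{\iota^{\,n}}{n!}.$$

Addition of these (n − 1) equations results (since $\bar{1} = 0$) in

$$\frac{\bar{n}}{n!} = \frac{\iota^{2}}{2!} + \frac{\iota^{3}}{3!} + \cdots + \frac{\iota^{\,n}}{n!}.$$

From this we are finally able to obtain the desired number $\bar{n}$:

$$\bar{n} = n!\left(\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - + \cdots + \frac{\iota^{\,n}}{n!}\right).$$

If $\mathfrak{F}$ represents a symbol such that the application of the binomial theorem (cf. No. 9) to $(\mathfrak{F} - 1)^n$ allows $\nu!$ to be written for each power $\mathfrak{F}^{\nu}$ of the binomial expansion, the number can be expressed in the simpler form

$$\bar{n} = (\mathfrak{F} - 1)^n.$$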
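The recurrence and the alternating series can be compared numerically. The following sketch is illustrative only: the function names are made up here, and the value written with a bar in the text is simply returned as an integer. It computes the number of complete misplacements in three ways (by the recurrence, by the series, and by listing all permutations) so that the results can be set side by side.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def by_recurrence(n):
    """Use n-bar = (n - 1) * ((n-1)-bar + (n-2)-bar) with 1-bar = 0, 2-bar = 1."""
    d = {1: 0, 2: 1}
    for m in range(3, n + 1):
        d[m] = (m - 1) * (d[m - 1] + d[m - 2])
    return d[n]


def by_series(n):
    """Use n-bar = n! * (1/2! - 1/3! + ... + (-1)^n / n!), in exact arithmetic."""
    total = sum(Fraction((-1) ** k, factorial(k)) for k in range(2, n + 1))
    return int(factorial(n) * total)


def by_enumeration(n):
    """Count permutations of 0..n-1 that leave no element in its own place."""
    return sum(
        all(p[i] != i for i in range(n))
        for p in permutations(range(n))
    )


for n in range(1, 8):
    print(n, by_recurrence(n), by_series(n), by_enumeration(n))
```

Exact fractions are used in the series so that no rounding can disturb the comparison; the enumeration grows like n! and is only practical for small n.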
For a value such as n = 4, for example, we obtain

$$\bar{4} = (\mathfrak{F} - 1)^4 = \mathfrak{F}^4 - 4\mathfrak{F}^3 + 6\mathfrak{F}^2 - 4\mathfrak{F} + 1 = 4! - 4\cdot 3! + 6\cdot 2! - 4\cdot 1! + 1 = 9,$$

which is easily checked by testing.

Similarly, the number of permutations that can be formed from n elements in which no element is in its natural place is $(\mathfrak{F} - 1)^n$. For the four elements 1, 2, 3, 4, for example, there are the nine permutations 2143, 2341, 2413, 3142, 3412, 3421, 4123, 4312, 4321.

Note. The result obtained also contains the solution of the determinant problem:

In how many constituents of an n-degree determinant do no principal diagonal elements occur?

This is immediately seen if the rth element of the sth column is called $c_r^s$. The elements of the principal diagonal are then

$$c_1^1,\ c_2^2,\ c_3^3,\ \ldots,\ c_n^n.$$

The determinant therefore contains $(\mathfrak{F} - 1)^n$ constituents that are free of principal diagonal elements.

Euler's Problem of Polygon Division

In how many ways can a (plane convex) polygon of n sides be divided into triangles by diagonals?

Leonhard Euler posed this problem in 1751 to the mathematician Christian Goldbach. For the number to be found, $E_n$, the number of possible divisions, Euler developed the formula:

$$(1)\qquad E_n = \frac{2\cdot 6\cdot 10\cdots(4n-10)}{(n-1)!}.$$

This problem is of the greatest interest because it involves many difficulties in spite of its innocuous appearance, as many a surprised reader will discover if he attempts to derive the Euler formula without outside assistance. Euler himself said, "The process of induction I employed was quite laborious."

In the simplest cases n = 3, 4, 5, 6 the various divisions $E_3 = 1$, $E_4 = 2$, $E_5 = 5$, $E_6 = 14$ are easily obtained from the graphic representations. But this method soon becomes impossible as the number of angles is increased.
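Formula (1) is easy to evaluate by machine, which gives a quick check of the small cases quoted above. This is only a sketch; the function name is an invention for illustration.

```python
from math import factorial


def euler_divisions(n):
    """Evaluate E_n = 2 * 6 * 10 * ... * (4n - 10) / (n - 1)! for n >= 3."""
    numerator = 1
    for k in range(3, n + 1):
        numerator *= 4 * k - 10  # factors 2, 6, 10, ..., 4n - 10
    return numerator // factorial(n - 1)


print([euler_divisions(n) for n in range(3, 10)])
```

Integer division is safe here because the quotient is always a whole number of divisions; for n = 3 the product consists of the single factor 2.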
In 1758 Segner, to whom Euler had communicated the first seven division numbers 1, 2, 5, 14, 42, 132, 429, established a recurrence formula for $E_n$ (Novi Commentarii Academiae Petropolitanae pro annis 1758 et 1759, vol. VII), which we will begin by deriving.

Let the angles of any convex polygon of n angles be 1, 2, 3, ..., n. In every one of the $E_n$ possible divisions of the polygon we may take the side n1 as the base line of a triangle the apex of which is situated at one of the angles 2, 3, 4, ..., n − 1, in accordance with the division selected. If the apex is, for example, situated at angle r, on one side of the triangle n1r there is a polygon of r angles and on the other a polygon of s angles, r + s being equal to n + 1 (since the apex r belongs to both the polygon of r angles and the polygon of s angles). Since the polygon of r angles (or r-gon) permits $E_r$ divisions and the s-gon permits $E_s$ divisions, and since each division of the r-gon can be combined with every division of the s-gon to form a division of the given n-gon, the mere choice of the apex r results in $E_r \cdot E_s$ different divisions of the given n-gon. Since, then, r can possess successively every value of the series 2, 3, ..., n − 1 and s can accordingly possess successively every value of the series n − 1, n − 2, ..., 3, 2, it follows that

$$(2)\qquad E_n = E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2,$$

where the factor $E_2$, which is merely added for better appearance, has the value 1.

Formula (2) is Segner's recurrence formula. It confirms the previously given values for $E_3$ to $E_6$ as well as giving

$$E_7 = E_2E_6 + E_3E_5 + E_4E_4 + E_5E_3 + E_6E_2 = 42,$$
$$E_8 = E_2E_7 + E_3E_6 + E_4E_5 + E_5E_4 + E_6E_3 + E_7E_2 = 132,$$

etc.

As the index number is increased Segner's formula, in contrast with Euler's, grows more and more unwieldy, as Goldbach has already indicated.
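Segner's recurrence (2) translates directly into a short loop. The sketch below is an illustration under the convention $E_2 = 1$ used in the text; the function name and the dictionary representation are choices made here.

```python
def segner(n_max):
    """Return {n: E_n} for 2 <= n <= n_max using E_n = sum of E_r * E_s, r + s = n + 1."""
    E = {2: 1}  # the auxiliary value E_2 = 1
    for n in range(3, n_max + 1):
        # the apex r runs through 2, 3, ..., n - 1, and s = n + 1 - r
        E[n] = sum(E[r] * E[n + 1 - r] for r in range(2, n))
    return E


E = segner(9)
print([E[n] for n in range(3, 10)])
```

Each new value needs n − 2 products, so the work grows with the index, which is the unwieldiness the text refers to.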
We can obtain the Euler formula (1) most simply if we consider Euler's division problem or Segner's recurrence formula in the light of an idea of Rodrigues (Journal de Mathématiques, 3 [1838]) and connect it with a problem treated by the French mathematician Catalan in the year 1838 in the Journal de Mathématiques.
Catalan's problem has the form:

How many different ways can a product of n different factors be calculated by pairs?

We say that a product is calculated by pairs when it is always only two factors that are multiplied together and when the product arising from such a "paired" multiplication is used as one factor in the continuation of the calculation. Calculation by pairs of the product 3 · 4 · 5 · 7, for example, can be carried out in the following manner: 3 · 5 = 15, 4 · 15 = 60, 7 · 60 = 420. For the four-membered product abcd an alphabetical arrangement of the factors gives the following five paired multiplications:

    [(a·b)·c]·d,   [a·(b·c)]·d,   (a·b)·(c·d),   a·[(b·c)·d],   a·[b·(c·d)].

A product in which the paired multiplications that are to be carried out are marked by brackets or the like will be referred to in abbreviated form as "paired."

    {[(a·b)·c]·[(d·e)·(f·g)]}·{(h·i)·k}

is therefore a paired product of the ten factors a to k.

It is immediately seen that a paired product of n factors contains (n − 1) multiplication signs and correspondingly involves (n − 1) paired multiplications (for every two factors).

Catalan's problem requires the answers to two questions:

1. How many paired products of n different prescribed factors are there?

2. How many paired products can be formed from n factors if the sequence of the factors (e.g., an alphabetical sequence) is prescribed?

The first number we will designate as $R_n$ and the second as $C_n$.

The simplest method of obtaining $R_n$ (according to Rodrigues) is by means of a recurrence formula. We will imagine the $R_n$ n-membered paired products to be formed of the n given factors $f_1, f_2, \ldots, f_n$; we will add to this an (n + 1)th factor $f_{n+1} = f$ and form from the available $R_n$ n-membered products all the $R_{n+1}$ (n + 1)-membered products of the factors $f_1, f_2, \ldots, f_{n+1}$. Now each of the $R_n$ n-membered products P includes (n − 1) paired multiplications of the form A·B.
If we use f once as the multiplier in front of A, once as the multiplicand after A, once as the multiplier before B and once as the multiplicand after B, we thereby obtain from A·B four new paired products (f·A)·B, (A·f)·B, A·(f·B), and A·(B·f). Since these four arrangements of the factor f can be effected for each of the n − 1 paired subproducts of P, we obtain from P
4(n − 1) (n + 1)-membered paired products. Moreover, we also obtain from P the two (n + 1)-membered paired products f·P and P·f. The described arrangement of the factor f thus yields from only one (P) of the $R_n$ n-membered products (4n − 2) (n + 1)-membered products. From all $R_n$ n-membered paired products we therefore obtain $R_n\cdot(4n-2)$ (n + 1)-membered paired products. The sought-for recurrence formula accordingly reads

$$(3)\qquad R_{n+1} = (4n-2)\,R_n.$$

To obtain an independent representation of $R_n$ we begin with $R_2 = 2$ (two factors a and b yield only two products: a·b and b·a) and we infer from (3)

$$R_3 = 6R_2 = 2\cdot 6,\qquad R_4 = 10R_3 = 2\cdot 6\cdot 10,\qquad R_5 = 14R_4 = 2\cdot 6\cdot 10\cdot 14,$$

etc., and finally

$$(4)\qquad R_n = 2\cdot 6\cdot 10\cdot 14\cdots(4n-6).$$

The second question can also be answered by returning to a recurrence formula. Let the n factors $f_\nu$ in the prescribed order be $\varphi_1, \varphi_2, \ldots, \varphi_n$. We will take from the $C_n$ paired n-membered products belonging to this series those having the form

$$(\ \ )\cdot(\ \ ),$$

where the parenthesis on the left includes the r members $\varphi_1, \varphi_2, \ldots, \varphi_r$ and the one on the right the s = n − r members $\varphi_{r+1}, \varphi_{r+2}, \ldots, \varphi_{r+s} = \varphi_n$. Since the left parenthesis, in accordance with its r members, can possess $C_r$ different forms and the right correspondingly can possess $C_s$ different forms, while each form belonging to the left parenthesis can combine with each form included in the right parenthesis, the above main form yields $C_r\cdot C_s$ different n-membered paired products. Since, moreover, r can have every value from 1 to n − 1, it follows that

$$(5)\qquad C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1.$$

By using this recurrence formula and beginning from $C_1 = 1$ and $C_2 = 1$, we obtain the following sequence

$$C_3 = C_1C_2 + C_2C_1 = 2,$$
$$C_4 = C_1C_3 + C_2C_2 + C_3C_1 = 5,$$
$$C_5 = C_1C_4 + C_2C_3 + C_3C_2 + C_4C_1 = 14,$$

etc.
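Both counts can be confirmed for small n by actually writing out the paired products. The sketch below is illustrative only; the function names and the string representation of a paired product are assumptions made here. It evaluates the product formula (4) and the recurrence (5), and compares them with explicit enumeration.

```python
from itertools import permutations


def paired_in_order(factors):
    """All paired products of the factors in the given order, as strings."""
    if len(factors) == 1:
        return [factors[0]]
    results = []
    for r in range(1, len(factors)):  # r members on the left, the rest on the right
        for left in paired_in_order(factors[:r]):
            for right in paired_in_order(factors[r:]):
                results.append(f"({left}*{right})")
    return results


def C_by_recurrence(n):
    """C_n from formula (5), starting from C_1 = 1."""
    C = {1: 1}
    for m in range(2, n + 1):
        C[m] = sum(C[r] * C[m - r] for r in range(1, m))
    return C[n]


def R_by_formula(n):
    """R_n = 2 * 6 * 10 * ... * (4n - 6) from formula (4), for n >= 2."""
    product = 1
    for k in range(2, n + 1):
        product *= 4 * k - 6
    return product


def R_by_enumeration(n):
    """Count the distinct paired products of n different factors in any order."""
    products = set()
    for order in permutations("abcdefgh"[:n]):
        products.update(paired_in_order(list(order)))
    return len(products)


print(paired_in_order(list("abcd")))
print([C_by_recurrence(n) for n in range(1, 6)])
print([(R_by_formula(n), R_by_enumeration(n)) for n in range(2, 6)])
```

The enumeration is exponential and is meant only as a check on the first few values, not as a practical way of computing $R_n$ or $C_n$.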
To obtain an independent representation of $C_n$ we can imagine that there are n! different sequences (permutations) of the factors $f_1, f_2, \ldots, f_n$, that each of these sequences possesses $C_n$ paired n-membered products and that all the sequences together possess $R_n$ such products. Then

$$R_n = C_n\cdot n!$$

or

$$(6)\qquad C_n = \frac{R_n}{n!} = \frac{2\cdot 6\cdot 10\cdots(4n-6)}{n!}.$$

Formulas (4) and (6) solve Catalan's problem.

Now for Euler's formula!

From the indicated values

$$E_2 = 1,\quad E_3 = 1,\quad E_4 = 2,\quad E_5 = 5,$$
$$C_1 = 1,\quad C_2 = 1,\quad C_3 = 2,\quad C_4 = 5$$

and formulas (2) and (5) it immediately follows that in general

$$(7)\qquad E_n = C_{n-1}.$$

[The proof is by induction. We assume that (7) is true for all indices through n, so that $E_2 = C_1$, $E_3 = C_2$, ..., $E_n = C_{n-1}$. According to (2) and (5)

$$E_{n+1} = E_2E_n + E_3E_{n-1} + \cdots + E_nE_2,$$
$$C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1.$$

Since the right sides of the two last equations correspond member for member, it also follows that $E_{n+1} = C_n$; i.e., formula (7) is valid for every index.]

(6) and (7) give us Euler's formula immediately:

$$(8)\qquad E_n = \frac{2\cdot 6\cdot 10\cdots(4n-10)}{(n-1)!}.$$

In conclusion we would like to give a slight simplification of Euler's formula. It is

$$E_n = \frac{2^{n-2}\cdot 1\cdot 3\cdot 5\cdots(2n-5)}{(n-1)!} = \frac{2^{n-2}\,(2n-3)!}{(n-1)!\;2^{n-2}\,(n-2)!\,(2n-3)}$$
and consequently

$$E_n = \frac{1}{k}\binom{k}{j},$$

where j = n − 2 is the number of triangles into which the n-gon can always be divided and k = 2n − 3 is the number of sides bounding these triangles.

Recently (Zeitschrift für math. und naturw. Unterricht, 1941, vol. 4) H. Urban derived Euler's formula in the following manner. He first calculated $E_5$, $E_6$, $E_7$ by means of the Segner recurrence formula and "inferred" the following:

$$E_2 = 1,\quad E_3 = 1,\quad E_4 = 2,\quad E_5 = 5,\quad E_6 = 14,\quad E_7 = 42,$$
$$\frac{E_3}{E_2} = \frac{2}{2},\quad \frac{E_4}{E_3} = \frac{6}{3},\quad \frac{E_5}{E_4} = \frac{10}{4},\quad \frac{E_6}{E_5} = \frac{14}{5},\quad \frac{E_7}{E_6} = \frac{18}{6},$$

on the strength of which he surmised that $E_n$ would have to be

$$(\mathrm{I})\qquad E_n = \frac{4n-10}{n-1}\,E_{n-1}.$$

(Unfortunately, he does not say whether it was Euler's recurrence formula or some other idea that led him to his "inference.") This recurrence formula is certainly correct for the first values of the index n. To prove its general validity we argue by induction: it is assumed that the recurrence formula (I) is true for all index numbers from 1 to n − 1 and it is demonstrated that it is therefore also true for n.

The proof is carried out by means of the expression

$$(\mathrm{II})\qquad S = 1\cdot E_2E_{n-1} + 2\cdot E_3E_{n-2} + 3\cdot E_4E_{n-3} + \cdots + (n-2)\cdot E_{n-1}E_2$$

or, written in the reverse order,

$$(\mathrm{III})\qquad S = (n-2)\cdot E_{n-1}E_2 + (n-3)\cdot E_{n-2}E_3 + (n-4)\cdot E_{n-3}E_4 + \cdots + 1\cdot E_2E_{n-1}.$$

Columnar addition of these two equations gives

$$2S = (n-1)\,[\,E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2\,]$$

or, since in accordance with Segner's recurrence formula the value of the expression within the brackets is equal to $E_n$,

$$(\mathrm{IV})\qquad 2S = (n-1)\,E_n.$$
Now the left-hand factor $E_r$ in each product $E_r\cdot E_s$ of (II) and (III) (except the case in which r = 2) is replaced in accordance with the recurrence formula (I) by $\lambda_{r-1}E_{r-1}/(r-1)$ with $\lambda_\nu = 4\nu - 6$. This gives us

$$(\mathrm{II'})\qquad S = E_2E_{n-1} + \lambda_2E_2E_{n-2} + \lambda_3E_3E_{n-3} + \cdots + \lambda_{n-2}E_{n-2}E_2,$$
$$(\mathrm{III'})\qquad S = \lambda_{n-2}E_{n-2}E_2 + \lambda_{n-3}E_{n-3}E_3 + \cdots + \lambda_2E_2E_{n-2} + E_2E_{n-1},$$

and by columnar addition of these two lines, since $\lambda_\nu + \lambda_{n-\nu} = 4n - 12$, we obtain

$$2S = E_{n-1} + (4n-12)\,[\,E_2E_{n-2} + E_3E_{n-3} + \cdots + E_{n-2}E_2\,] + E_{n-1}$$

or, since the expression within brackets is equal to $E_{n-1}$,

$$(\mathrm{V})\qquad 2S = (4n-10)\,E_{n-1}.$$

Equations (IV) and (V) give us

$$E_n = \frac{4n-10}{n-1}\,E_{n-1},$$

so that Euler's recurrence formula (I) is thereby shown to be valid for the index number n, also, and thus generally valid.

Lucas' Problem of the Married Couples

How many ways can n married couples be seated about a round table in such a manner that there is always one man between two women and none of the men is ever next to his own wife?

This problem appeared (probably for the first time) in 1891 in the Théorie des Nombres of the French mathematician Edouard Lucas (1842-1891), author of the famous work Récréations mathématiques. The English mathematician Rouse Ball has said of this problem, "The solution is far from easy."

The problem has been solved by the Frenchmen M. Laisant and M. C. Moreau and by the Englishman H. M. Taylor. A solution based upon modern viewpoints is to be found in MacMahon's Combinatory Analysis.

The approach adopted here is essentially that of Taylor (The Messenger of Mathematics, 32, 1903).
We will number the series of circularly arranged chairs from 1 through 2n. The wives will then all have to be seated on the even- or odd-numbered chairs. In each of these two cases there are n! different possible seating arrangements, so that there are 2·n! different possible seating arrangements for the women alone.

We will assume that the women have been seated in one of these arrangements and we will maintain this seating arrangement throughout the following. The nucleus of the problem then consists of determining the number of possible ways of seating the men between the women.

Let us designate the women in the assumed seating sequence as $F_1, F_2, \ldots, F_n$, their respective husbands as $M_1, M_2, \ldots, M_n$, the couples $(F_1, M_1), (F_2, M_2), \ldots$ as 1, 2, ..., and arrangements in which there are n married couples as n-pair arrangements. Let us designate the husbands about whom we have no further information as $X_1, X_2, \ldots$.

Let

$$F_1X_1F_2X_2\ldots F_nX_nF_{n+1}X_{n+1}$$

be an (n + 1)-pair arrangement in which none of the husbands sits beside his own wife. (It must be remembered that the arrangement is circular, so that $X_{n+1}$ is seated between $F_{n+1}$ and $F_1$.) If we take $F_{n+1}$ and $M_{n+1} = X_\nu$ out of the arrangement and replace $X_\nu$ with $X_{n+1} = M_\mu$, we obtain the n-pair arrangement

$$F_1X_1F_2X_2\ldots F_\nu M_\mu F_{\nu+1}\ldots F_nX_n.$$

This arrangement can occur in three ways:

1. No man sits next to his wife (thus $M_\mu \neq M_\nu$, $M_\mu \neq M_{\nu+1}$, $X_n \neq M_1$).

2. One man sits next to his own wife (namely when $M_\mu = M_\nu$ or $M_\mu = M_{\nu+1}$ or else $X_n = M_1$).

3. Two men sit next to their own wives (when $M_\mu = M_\nu$ or $M_\mu = M_{\nu+1}$ and at the same time $X_n = M_1$, that is, when in our arrangement the order $M_1F_1$ occurs).

Thus, we must consider other seating arrangements in addition to the one prescribed in the problem.

In the following we will distinguish between three types of arrangements: arrangements A, B, and C. An A-arrangement will be
one in which no man sits next to his wife. A B-arrangement will be one in which a certain man sits on a certain side of his wife. Finally, a C-arrangement will be one in which a certain man sits on a certain side of his wife and another man (which one is not prescribed) sits alongside his wife (on which side is likewise not prescribed).

We will designate the number of n-pair A-, B-, C-arrangements as $A_n$, $B_n$, $C_n$, respectively.

First we will try to determine the relationships among the six magnitudes $A_n$, $B_n$, $C_n$, $A_{n+1}$, $B_{n+1}$, $C_{n+1}$; we will begin with the simplest of these relationships.

Let us consider the $B_{n+1}$ B-arrangements

$$F_1X_1F_2X_2\ldots F_nX_nF_{n+1}M_{n+1}$$

of the pairs 1, 2, ..., (n + 1), in which $M_{n+1}$ sits next to $F_{n+1}$ on her right. We will divide the arrangements into two groups in accordance with whether $X_n = M_1$ or $X_n \neq M_1$. We then remove the pair $F_{n+1}M_{n+1}$ from all of them. The first group then gives us all $B_n$ n-pair B-arrangements, and the second all $A_n$ n-pair A-arrangements, so that

$$(1)\qquad B_{n+1} = B_n + A_n.$$

We can obtain a second relationship by considering the $C_{n+1}$ (n + 1)-pair C-arrangements

$$M_1F_1X_1F_2X_2\ldots F_nX_nF_{n+1},$$

in which one of the men $X_1, X_2, \ldots, X_n$ sits next to his own wife. We also divide these arrangements into two groups in accordance with whether or not $X_1$ is equal to $M_{n+1}$. The second group then contains (2n − 1) subgroups. In the first, $M_2$ is seated on the left of $F_2$, in the second on her right; in the third, $M_3$ sits on the left of $F_3$, in the fourth on her right, etc.; in the (2n − 1)th, $M_{n+1}$ is seated on the left of $F_{n+1}$. If we leave the pair $M_1F_1$ out of all of the $C_{n+1}$ C-arrangements, we obtain from the first group all $C_n$ C-arrangements of the pairs 2, 3, 4, ..., (n + 1) in which $M_{n+1}$ is seated on the right of $F_{n+1}$, and from each subgroup of the second group we obtain $B_n$ B-arrangements of the pairs 2, 3, ..., (n + 1), so that

$$(2)\qquad C_{n+1} = C_n + (2n-1)\,B_n.$$
As we found above, if we remove the pair $F_{n+1}$, $M_{n+1}$ from an (n + 1)-pair A-arrangement $F_1X_1F_2X_2\ldots F_{n+1}X_{n+1}$ and replace the $M_{n+1}$ that has been removed with $X_{n+1}$, the arrangement is transformed into an n-pair A-, B-, or C-arrangement. Conversely, we obtain an A-arrangement of the (n + 1) pairs 1, 2, ..., (n + 1) when we insert $F_{n+1}M_{n+1}$ before $F_1$ of an A-, B-, or C-arrangement of the n pairs 1, 2, ..., n and then exchange the places of $M_{n+1}$ and some other man (in such a manner that none of the men is seated next to his own wife after the exchange of places). It is also clear that this method gives us all the A-arrangements of the (n + 1) pairs 1, 2, ..., (n + 1).

In order to find $A_{n+1}$ it is therefore only necessary to determine the number of ways in which this insertion and the subsequent exchange can be accomplished for all possible A-, B-, and C-arrangements of the n pairs 1 through n.

We accomplish the described formation of the (n + 1)-pair A-arrangements in three steps.

I. Formation from A-arrangements. After the insertion:

$$F_1X_1F_2X_2\ldots F_nX_nF_{n+1}M_{n+1}$$

we can exchange the places of $M_{n+1}$ and any other man except $X_n$ and $M_1$, so that from each of the $A_n$ n-pair A-arrangements we obtain (n − 2) (n + 1)-pair A-arrangements. Consequently, we obtain a total of $(n-2)A_n$ (n + 1)-pair A-arrangements.

II. Formation from B-arrangements. The n-pair B-arrangements exhibit the following 2n forms:

    1.        ... F_1 M_1 F_2 ...,
    2.        ... F_1 M_2 F_2 ...,
    3.        ... F_2 M_2 F_3 ...,
    ...
    (2n − 2). ... F_{n−1} M_n F_n ...,
    (2n − 1). ... F_n M_n F_1 ...,
    2n.       ... F_n M_1 F_1 ....

And there are $B_n$ of each of these forms.

Our process of formation is not applicable to the first and the (2n − 1)th of these forms (since the inserted $M_{n+1}$ would have to be
exchanged with $M_1$ or $M_n$, as a result of which, however, $M_1$ would end up on the left side of $F_1$, or $M_{n+1}$ would be on the left side of $F_{n+1}$).

In the second, third, ..., (2n − 2)th form, the exchange of the inserted $M_{n+1}$ with $M_2, M_2, M_3, M_3, \ldots, M_{n-1}, M_{n-1}, M_n$ transforms the n-pair B-arrangements into (n + 1)-pair A-arrangements, as a result of which a total of $(2n-3)B_n$ (n + 1)-pair A-arrangements are obtained.

In the (2n)th form, the inserted $M_{n+1}$ can be exchanged with any of the men $M_2, M_3, \ldots, M_n$, as a result of which a total of $(n-1)B_n$ (n + 1)-pair A-arrangements are obtained.

III. Formation from C-arrangements. Our method transforms any one of the $C_n$ n-pair C-arrangements:

$$M_1F_1X_2F_2X_3F_3\ldots X_nF_n$$

into an (n + 1)-pair A-arrangement if we switch the places of $M_{n+1}$ and the man $M_\nu$ seated next to his wife ($\nu$ being one of the values 2, 3, 4, ..., n). In this manner we obtain from every n-pair C-arrangement an (n + 1)-pair A-arrangement, which corresponds to a total of $C_n$ (n + 1)-pair A-arrangements.

Thus, the methods of formation described under I., II., and III. give us all of the (n + 1)-pair A-arrangements, or a total of

$$[\,(n-2)A_n + (3n-4)B_n + C_n\,]$$

arrangements, so that

$$(3)\qquad A_{n+1} = (n-2)A_n + (3n-4)B_n + C_n.$$

In order to obtain formulas in which only the same capital letters occur, we infer from (1)

$$A_n = B_{n+1} - B_n \qquad\text{and}\qquad A_{n+1} = B_{n+2} - B_{n+1}$$

and introduce these values into (3). This gives

$$B_{n+2} = (n-1)B_{n+1} + (2n-2)B_n + C_n.$$
If we then replace n by n + 1, it follows that

$$B_{n+3} = nB_{n+2} + 2nB_{n+1} + C_{n+1}.$$

If we subtract the next to the last equation from the last one and take (2) into consideration, we get

$$B_{n+3} = (n+1)\,[\,B_{n+2} + B_{n+1}\,] + B_n$$

or, if we replace n + 1 here by n,

$$(4)\qquad B_{n+2} = n\,(B_{n+1} + B_n) + B_{n-1}.$$

This simple recurrence formula for the B's enables us to calculate from three successive B's the B that follows immediately.

It is also possible to derive a recurrence formula in which only three successive B's are connected, i.e., a formula having the form

$$(5)\qquad e_nB_{n+1} + f_nB_n + g_nB_{n-1} = c,$$

in which the coefficients $e_n$, $f_n$, $g_n$ represent known functions of n and c is a constant.

In order to find it we replace n in (5) with (n + 1) and obtain

$$e_{n+1}B_{n+2} + f_{n+1}B_{n+1} + g_{n+1}B_n = c.$$

Subtraction of this equation from (5) gives

$$-e_{n+1}B_{n+2} + (e_n - f_{n+1})B_{n+1} + (f_n - g_{n+1})B_n + g_nB_{n-1} = 0.$$

In order to find the equations of condition for the coefficients e, f, g which are still unknown, we compare the formula obtained with equation (4) after equation (4) has been multiplied by $g_n$:

$$-g_nB_{n+2} + ng_nB_{n+1} + ng_nB_n + g_nB_{n-1} = 0.$$

Thus, we are able to obtain e, f, g and satisfy the three conditions

$$(\mathrm{I})\quad e_{n+1} = g_n,\qquad (\mathrm{II})\quad e_n - f_{n+1} = ng_n,\qquad (\mathrm{III})\quad f_n - g_{n+1} = ng_n,$$

giving us the sought-for recurrence formula.

From (III) it follows that

$$f_n = g_{n+1} + ng_n \qquad\text{or}\qquad f_{n+1} = g_{n+2} + (n+1)\,g_{n+1},$$

and from (II) and (I)

$$f_{n+1} = e_n - ng_n = g_{n-1} - ng_n.$$

By equating the two values obtained for $f_{n+1}$ we get

$$(n+1)\,g_{n+1} + ng_n = g_{n-1} - g_{n+2}.$$
It is easily seen that $g_n = n\iota^n$ ($\iota = -1$) is a solution of this equation. This, according to (I), yields

$$e_n = g_{n-1} = -(n-1)\,\iota^n$$

and, according to (III),

$$f_n = g_{n+1} + ng_n = \iota^n\,(n^2 - n - 1).$$

Equation (5) is thereby transformed into

$$(n-1)B_{n+1} - (n^2 - n - 1)B_n - nB_{n-1} = -c\,\iota^n.$$

In order to determine the constant c, we set n equal to 4, we observe that $B_3 = 0$, $B_4 = 1$, and $B_5 = 3$, and we obtain c = 2. The sought-for recurrence formula consequently reads

$$(6)\qquad (n-1)B_{n+1} = (n^2 - n - 1)B_n + nB_{n-1} - 2\iota^n.$$

In order to obtain a recurrence formula for the A's as well, we express $A_{n-1}$, $A_n$, and $A_{n+1}$, in accordance with (1) and (6), by $B_n$ and $B_{n+1}$. Thus we obtain

$$A_{n-1} = \frac{n^2-1}{n}\,B_n - \frac{n-1}{n}\,B_{n+1} - \frac{2\iota^n}{n},$$
$$A_n = B_{n+1} - B_n,$$
$$A_{n+1} = \frac{n^2-1}{n}\,B_{n+1} + \frac{n+1}{n}\,B_n + \frac{2\iota^n}{n},$$

and from this by elimination of $B_n$ and $B_{n+1}$ we obtain

$$(7)\qquad (n-1)A_{n+1} = (n^2-1)A_n + (n+1)A_{n-1} + 4\iota^n.$$

This is Laisant's recurrence formula. It makes possible the calculation of each A from the two immediately preceding A's. Thus, from $A_3 = 1$, $A_4 = 2$, and (7), it follows that $A_5 = 13$, which is still easy to check directly. Moreover, the whole series $A_6 = 80$, $A_7 = 579$, $A_8 = 4738$, $A_9 = 43387$, $A_{10} = 439792$, $A_{11} = 4890741$, $A_{12} = 59216642$, etc. can then be derived from (7).

The difficult point in the calculation of A can therefore be considered as eliminated. The problem is solved.

The number of possible seating arrangements of n married couples is $2A_n\cdot n!$, in which $A_n$ can be calculated from Laisant's recurrence formula.
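Laisant's recurrence (7) can be run mechanically and, for small n, compared with a direct count of the admissible seatings of the men. The sketch below is an illustration only: the function names and the numbering of the gaps between the women are assumptions made here, not Taylor's notation.

```python
from itertools import permutations
from math import factorial


def menage_by_laisant(n_max):
    """Return {n: A_n} from (n-1) A_{n+1} = (n^2-1) A_n + (n+1) A_{n-1} + 4 (-1)^n."""
    A = {3: 1, 4: 2}
    for n in range(4, n_max):
        numerator = (n * n - 1) * A[n] + (n + 1) * A[n - 1] + 4 * (-1) ** n
        assert numerator % (n - 1) == 0  # the division must come out exactly
        A[n + 1] = numerator // (n - 1)
    return A


def menage_by_enumeration(n):
    """Count seatings of the men with the women fixed in a circle.

    Gap i lies between wife i and wife (i + 1) % n, so husband i and
    husband (i + 1) % n are both excluded from it.
    """
    return sum(
        all(h != i and h != (i + 1) % n for i, h in enumerate(seating))
        for seating in permutations(range(n))
    )


def total_seatings(n, A):
    """Total number 2 * n! * A_n of seatings of n couples."""
    return 2 * factorial(n) * A[n]


A = menage_by_laisant(12)
for n in range(3, 8):
    print(n, A[n], menage_by_enumeration(n))
```

The enumeration visits all n! seatings of the men and is intended purely as a check for small n; the recurrence is the practical route for larger values.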
Omar Khayyam's Binomial Expansion

To obtain the nth power of the binomial a + b in powers of a and b when n is any positive whole number.

Solution. In order to determine the binomial expansion we write

$$(a+b)^n = (a+b)(a+b)\cdots(a+b),$$

where the right side consists of a product of n identical parentheses (a + b). As is known, the multiplication of parentheses consists of choosing one term from each parenthesis and obtaining the product of the terms chosen, and continuing this process until all the possible choices are exhausted. Finally, the resulting products are added together.

A product of this sort has the following appearance:

$$P = a^{\alpha_1}b^{\beta_1}a^{\alpha_2}b^{\beta_2}a^{\alpha_3}b^{\beta_3}\cdots,$$

in which the factor a is taken from the first $\alpha_1$ parentheses, the factor b from the next $\beta_1$ parentheses, the factor a from the next $\alpha_2$ parentheses, etc. In this case $\alpha_1 + \beta_1 + \alpha_2 + \beta_2 + \cdots$ equals the number of parentheses present, i.e., n. If we set $\alpha_1 + \alpha_2 + \alpha_3 + \cdots$ equal to $\alpha$ and $\beta_1 + \beta_2 + \cdots$ equal to $\beta$, the expression can be written in the simpler form

$$P = a^{\alpha}b^{\beta} \qquad\text{with}\qquad \alpha + \beta = n.$$

Now the product P can generally be obtained in many other ways than the one described, for example, by taking a from the first $\alpha$ parentheses and b from the last $\beta$ parentheses, or by taking b from the first $\beta$ parentheses and a from the last $\alpha$ parentheses, etc. If we assume that the product P occurs exactly C times in the method described, C being understood to represent an initially unknown whole number, then

$$G = C\,a^{\alpha}b^{\beta}$$

represents one term of the binomial expansion. The other terms have the same form, except that the exponents $\alpha$ and $\beta$ and the coefficients C are different. However, $\alpha + \beta$ always equals n.

The core of the problem is to determine the so-called binomial coefficient C, i.e., to answer the question:

How many times does the product $P = a^{\alpha}b^{\beta}$ appear in the binomial expansion?
To answer this question we first write the factors $a$ and $b$ of the product one after another in the order in which we initially selected them from the parentheses:
$$\underbrace{aa\cdots a}_{\alpha_1}\cdot\underbrace{bb\cdots b}_{\beta_1}\cdot\underbrace{aa\cdots a}_{\alpha_2}\cdots$$
(the groups totaling $\alpha_1$, $\beta_1$, $\alpha_2$, ... factors, respectively). This is a permutation of $n$ elements in which $\alpha$ identical elements $a$ and $\beta$ identical elements $b$ occur. There are as many possible permutations of these elements as there are terms $P$ resulting from the multiplication of the $n$ parentheses $(a + b)$. But the number of permutations of $n$ elements among which there appear $\alpha$ identical elements of one kind and $\beta$ identical elements of the other is $n!/(\alpha!\,\beta!)$. This is how often the product $a^{\alpha} b^{\beta}$ appears in the binomial expansion. Consequently,
$$C = \frac{n!}{\alpha!\,\beta!}.$$
An apparent exception to this formula is presented by the terms $a^n$ and $b^n$ of the expansion, each of which occurs only once. To eliminate this exception let us agree to let the symbol $0!$ represent unity; we are then able to write the coefficients of $a^n$ and $b^n$ as $n!/(n!\,0!)$ and $n!/(0!\,n!)$, respectively, in agreement with the formula.
The individual possibilities of forming the product $P$ can also be represented geometrically. We can, for example, represent the first possibility considered above in the following way: We mark off a horizontal distance of $\alpha_1$ successive segments $a$, and from the end of this distance extend a vertical distance of $\beta_1$ successive segments $b$, from the end of this vertical line a third horizontal distance of $\alpha_2$ successive segments $a$, etc. In a similar manner we represent the other possibilities of forming the product $P$; however, we begin all $C$ zigzag traces, which represent the $C$ possibilities, from the same point.
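The counting argument can be tested by brute force for small $n$. The sketch below (illustrative names, not from the original) picks one letter from each of the $n$ parentheses in every possible way, tallies how often each product $a^{\alpha} b^{\beta}$ arises, and compares the tally with $n!/(\alpha!\,\beta!)$.

```python
from collections import Counter
from itertools import product
from math import factorial

def coefficients_by_enumeration(n):
    """For each alpha, count the selections that pick 'a' exactly alpha times."""
    counts = Counter()
    for choice in product("ab", repeat=n):  # one letter from each parenthesis
        counts[choice.count("a")] += 1
    return counts

def coefficient_by_formula(n, alpha):
    """C = n! / (alpha! * beta!) with beta = n - alpha."""
    beta = n - alpha
    return factorial(n) // (factorial(alpha) * factorial(beta))
```

The enumeration visits all $2^n$ selections, so it is only practical for small $n$; it serves as a check on the formula, not as a way of computing coefficients.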
Thus, for example, if we are concerned with finding the number $v$ of all the products of the form $a^{11} b^{7}$ in the binomial expansion of $(a + b)^{18}$, we draw a rectangular network of $11 \cdot 7$ rectangular compartments possessing a horizontal side $a$ and a vertical side $b$ and lying in seven 11-compartment rows one below the other. The possibility $a^4 b^3 a^7 b^4$ ($a$ from the first four parentheses, $b$ from the following three, $a$ from the next seven parentheses and $b$ from the last
four) is then represented by the unbroken heavy line, and the possibility $b^2 a^6 b^3 a^2 b^2 a^3$ by the line of dashes. The sought-for number $v$ is therefore equal to the number of all the possible direct paths leading from the corner $E$ of the network to the opposite corner $F$.

Fig. 1.

The formula previously found for $C$ thus also provides us with the solution to the interesting problem:
A city has $m$ streets that run from east to west and $n$ that run from north to south; how many ways (without detours) are there of getting from the northwest corner of the city to the southeast corner?
Since there are $(n - 1)$ west-east partial paths $a$ and $(m - 1)$ north-south partial paths $b$, the number of all the possible paths is
$$\frac{(m + n - 2)!}{(m - 1)!\,(n - 1)!}.$$
Now back to the binomial theorem! Determination of the binomial coefficient $C$ gives us immediately the sought-for binomial expansion:
$$(a + b)^n = \sum C a^{\alpha} b^{\beta} \quad\text{with}\quad C = \frac{n!}{\alpha!\,\beta!}.$$
Here $\alpha$ and $\beta$ pass through all the possible integral non-negative values that satisfy the condition $\alpha + \beta = n$.
The expansion of $(a + b)^5$, for example, gives
$$(a + b)^5 = a^5 + \frac{5!}{4!\,1!}a^4 b + \frac{5!}{3!\,2!}a^3 b^2 + \frac{5!}{2!\,3!}a^2 b^3 + \frac{5!}{1!\,4!}a b^4 + b^5$$
or
$$(a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5.$$
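A minimal Python sketch of the two results just stated: the street-path count and the coefficients of $(a + b)^5$. The recursive counter is an independent check added for illustration; it counts zigzag paths by their first step and is not part of the original argument.

```python
from math import factorial

def binomial_coefficient(n, alpha):
    """C = n! / (alpha! * (n - alpha)!)."""
    return factorial(n) // (factorial(alpha) * factorial(n - alpha))

def street_paths(m, n):
    """Direct routes across a city with m east-west and n north-south streets."""
    return factorial(m + n - 2) // (factorial(m - 1) * factorial(n - 1))

def count_paths_recursively(right, down):
    """Independent check: a path starts with a step right or a step down."""
    if right == 0 or down == 0:
        return 1
    return (count_paths_recursively(right - 1, down)
            + count_paths_recursively(right, down - 1))

# Coefficients of a^5, a^4 b, ..., b^5 in the expansion of (a + b)^5.
expansion_5 = [binomial_coefficient(5, alpha) for alpha in range(5, -1, -1)]
```

With $m$ east-west and $n$ north-south streets there are $n - 1$ steps to the right and $m - 1$ steps down, so `street_paths(m, n)` should agree with `count_paths_recursively(n - 1, m - 1)`.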
Instead of $n!/(\alpha!\,\beta!)$ one usually writes
$$\frac{n(n-1)(n-2)\cdots(n-\alpha+1)}{1\cdot 2\cdot 3\cdots\alpha}$$
and also abbreviates this coefficient $n_{\alpha}$ (read as n sub $\alpha$). The expansion then takes on a somewhat simpler appearance:
$$(a + b)^n = a^n + n_1 a^{n-1} b + n_2 a^{n-2} b^2 + \cdots + b^n.$$
The coefficient $n_{\nu}$ is known as the binomial coefficient to the base $n$ with index $\nu$.
The binomial theorem was probably discovered by the Persian astronomer Omar Khayyam, who lived during the eleventh century. At least he prided himself on having discovered the expansion "for all (integral positive) exponents n, which no one had been able to accomplish before him."
Note. The derivation given above is easily extended to give the nth power expansion of a polynomial $a + b + c + \cdots$. The polynomial theorem for a polynomial consisting of three terms, for example, is
$$(a + b + c)^n = \sum \frac{n!}{\alpha!\,\beta!\,\gamma!}\, a^{\alpha} b^{\beta} c^{\gamma},$$
where the sum includes all possible terms for which the integral non-negative exponents $\alpha$, $\beta$, $\gamma$ satisfy the condition $\alpha + \beta + \gamma = n$.

10. Cauchy's Mean Theorem

The geometric mean of several positive numbers is smaller than the arithmetic mean of these numbers.

Augustin Louis Cauchy (1789-1857) was one of the greatest French mathematicians. The theorem concerning the arithmetic and geometric means occurs in his Cours d'Analyse (pp. 458-9), which appeared in 1821.
The proof of the theorem that will be presented here is based upon the solution of the fundamental problem:
When does the product of n positive numbers of constant sum attain its maximum value?
We will call the $n$ numbers $a, b, c, \ldots$, their constant sum $K$, and their product $P$. Experimentation with various numbers suggests that the product $P$ reaches its maximal value when the numbers $a, b, c, \ldots$ all possess the same value $M = K/n$.
To determine the accuracy of this hypothesis, we use the
Auxiliary theorem: Of two pairs of numbers of equal sum the pair possessing the greater product is the one whose numbers exhibit the smaller difference.
[If $X$ and $Y$ represent one pair and $x$ and $y$ the other, and $X + Y = x + y$, the auxiliary theorem follows from the equations
$$4XY = (X + Y)^2 - (X - Y)^2, \qquad 4xy = (x + y)^2 - (x - y)^2,$$
in which the minuends of the right sides are equal and the greater right side is the one in which the subtrahend is smaller.]
If the $n$ numbers $a, b, c, \ldots$ are not all equal, then at least one, $a$, for example, must be greater than $M$, and at least one, let us say $b$, must be smaller than $M$. Let us form a new system of $n$ numbers $a', b', c', \ldots$ in such a manner that (1) $a' = M$, (2) the pairs $a, b$ and $a', b'$ have the same sum, (3) the other numbers $c', d', e', \ldots$ correspond to $c, d, e, \ldots$. The new numbers then have the same sum $K$ as the old ones, but a greater product $P'\ (= a'b'c'\cdots)$, since in accordance with the auxiliary theorem $a'b' > ab$.
If the numbers $a', b', c', \ldots$ are not all equal to $M$, then at least one, let us say $b'$, is greater (smaller) and at least one, say $c'$, is smaller (greater) than $M$. Let us form a new system of $n$ numbers $a'', b'', c'', d'', \ldots$ in such a manner that (1) $a'' = a' = M$, (2) $b'' = M$, (3) the pairs $b', c'$ and $b'', c''$ possess the same sum, (4) $d'', e'', \ldots$ correspond to $d', e', \ldots$. The numbers $a'', b'', c'', \ldots$ then have the same sum $K$ as the numbers $a', b', c', \ldots$, but possess a greater product $P'' = a''b''c''\cdots$, since in accordance with the auxiliary theorem $b''c'' > b'c'$.
We continue in this fashion and obtain a series of increasing products $P, P', P'', \ldots$, each successive member of which contains at least one more factor $M$ than the immediately preceding one. The last product obtained in this manner is the greatest of all and consists of $n$ equal factors $M$.
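The replacement procedure of the proof can be traced numerically. The sketch below is one possible reading of it: at each step it takes a number above $M$ and a number below $M$, sets the first to $M$, and gives the second the remainder of the pair's sum. Exact fractions are used so that the comparison of products is not affected by rounding; the function name is illustrative.

```python
from fractions import Fraction
from functools import reduce
from operator import mul

def equalize(numbers):
    """Repeat the replacement step; return the product after each step and M."""
    nums = [Fraction(x) for x in numbers]
    mean = sum(nums) / len(nums)                      # M = K / n
    products = [reduce(mul, nums)]
    while any(x != mean for x in nums):
        i = next(k for k, x in enumerate(nums) if x > mean)  # a number above M
        j = next(k for k, x in enumerate(nums) if x < mean)  # a number below M
        pair_sum = nums[i] + nums[j]
        nums[i] = mean                                # a' = M
        nums[j] = pair_sum - mean                     # the pair keeps its sum
        products.append(reduce(mul, nums))
    return products, mean
```

For the numbers 1, 2, 3, 6 (sum 12, $M = 3$) the successive systems are 4, 2, 3, 3 and then 3, 3, 3, 3, so the products rise from 36 to 72 to 81 $= M^4$.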
Consequently, $P < M^n$ whenever the numbers are not all equal, which gives us the theorem:
The product of n positive numbers whose sum is constant attains its maximal value when the numbers are equal.
If we extract the nth root of the last inequality and express $P$ and $M$ in terms of the magnitudes $a, b, c, \ldots$, we obtain Cauchy's formula:
$$\sqrt[n]{abc\cdots} < \frac{a + b + c + \cdots}{n}.$$
This is expressed verbally as follows:
The theorem of the arithmetic and geometric mean: The geometric mean of several numbers is always smaller than the arithmetic mean of the numbers, except when the numbers are equal, in which case the two means are also equal.
Note 1. Cauchy's theorem leads directly to the converse of the above extreme theorem:
The sum of n positive numbers whose product is constant attains its minimal value when the numbers are equal.
Proof. Let us call the $n$ numbers $x, y, z, \ldots$, their given product $k$, their variable sum $s$, and let us designate by $m$ the nth root of $k$. According to Cauchy,
$$\frac{x + y + z + \cdots}{n} \geq \sqrt[n]{xyz\cdots};$$
consequently $s \geq nm$, where the equality sign applies only in the event that $x = y = z = \cdots$. Q.E.D.
The two preceding extreme theorems form the basis for a simple solution of many problems concerning maximum and minimum (cf. Nos. 54, 92, 96, 98).
Note 2. Cauchy's theorem also furnishes us directly with the important exponential inequality for the function $x^{\varepsilon}$.
If $a$ is any positive number not equal to 1, $n$ a whole number $> 0$, $m$ a whole number $> n$, then the geometric mean of the $m$ numbers of which $n$ possess the value $a$ and the $(m - n)$ others possess the value 1 is smaller than the arithmetic mean $(na + m - n)/m$ of these $m$ numbers, or
$$\sqrt[m]{a^n} < 1 + \frac{n}{m}(a - 1),$$
or, if we write $\varepsilon$ in place of $n/m$,
$$(1)\qquad a^{\varepsilon} < 1 + \varepsilon(a - 1).$$
In this inequality $\varepsilon$ is any rational, positive proper fraction. We will now show that this inequality is also true for any irrational proper fraction $i$.
First, it is clear that $a^J > 1 + J(a - 1)$ cannot be true for any irrational proper fraction $J$. If that were the case it would be possible to find a rational proper fraction $R < J$ so close to $J$ that $a^R$ would differ from $a^J$, and $1 + R(a - 1)$ from $1 + J(a - 1)$, by less than,
let us say, one half of the difference $a^J - [1 + J(a - 1)]$. In that event $a^R$ would still be $> 1 + R(a - 1)$, which is, however, impossible according to (1).
Now let $z$ be so small that $i + z$ and $i - z$ are both positive proper fractions. Then we have
$$a^i < a^i \cdot \frac{a^z + a^{-z}}{2}$$
(since the arithmetic mean of the numbers $a^z$ and $a^{-z}$ is greater than 1 according to Cauchy) or
$$a^i < \frac{a^{i+z} + a^{i-z}}{2}.$$
According to the above relation, however,
$$a^{i+z} \leq 1 + (i + z)(a - 1), \qquad a^{i-z} \leq 1 + (i - z)(a - 1),$$
therefore
$$\frac{a^{i+z} + a^{i-z}}{2} \leq 1 + i(a - 1);$$
thus, it is certain that
$$a^i < 1 + i(a - 1).$$
Inequality (1) is therefore true for any proper fraction $\varepsilon$.
If we replace $\varepsilon$ in (1) by $1/\mu$, and $1 + \varepsilon(a - 1)$ by $b$, i.e., $a$ by $1 + \mu(b - 1)$, (1) is transformed into
$$(2)\qquad b^{\mu} > 1 + \mu(b - 1),$$
where $\mu$ is any positive improper fraction, $b$ any positive number.
Conclusion. The exponential inequality. If $x$ is any positive magnitude and $\varepsilon$ any positive exponent, the exponential inequality is:
$$x^{\varepsilon} \lessgtr 1 + \varepsilon(x - 1),$$
in which proper fractional exponents require the use of the upper sign and improper fractional exponents require the use of the lower sign.

Bernoulli's Power Sum Problem

Determine the sum
$$S = 1^p + 2^p + 3^p + \cdots + n^p$$
of the pth powers of the first $n$ natural numbers for integral positive exponents $p$.
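The problem, posed in this general form, was first solved in the Ars Conjectandi (Probability Computation), which appeared in 1713. It was the work of the Swiss mathematician Jacob Bernoulli (1654-1705).
The following elegant solution is based upon the binomial theorem. By resorting to the device of considering the magnitudes $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots, \mathfrak{B}^{\nu}$ resulting from the binomial expansion of $(x + \mathfrak{B})^{\nu}$ as unknowns subject to $\nu$ certain conditions rather than as powers of $\mathfrak{B}$, we obtain an amazingly short derivation of $S$.
According to the binomial theorem, if $P$ is understood to represent the number $p + 1$,
$$(\nu + \mathfrak{B})^P = \nu^P + P_1 \nu^p \mathfrak{B}^1 + P_2 \nu^{p-1} \mathfrak{B}^2 + \cdots$$
and
$$(\nu + \mathfrak{B} - 1)^P = \nu^P + P_1 \nu^p (\mathfrak{B} - 1)^1 + P_2 \nu^{p-1} (\mathfrak{B} - 1)^2 + \cdots.$$
Subtraction of these two equations gives us
$$(\mathrm{I})\qquad (\nu + \mathfrak{B})^P - (\nu - 1 + \mathfrak{B})^P = P\nu^p + P_2 \nu^{p-1}[\mathfrak{B}^2 - (\mathfrak{B} - 1)^2] + P_3 \nu^{p-2}[\mathfrak{B}^3 - (\mathfrak{B} - 1)^3] + \cdots.$$
We now define the unknowns $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots$ by the equations
$$(1)\ (\mathfrak{B} - 1)^2 = \mathfrak{B}^2, \qquad (2)\ (\mathfrak{B} - 1)^3 = \mathfrak{B}^3, \qquad (3)\ (\mathfrak{B} - 1)^4 = \mathfrak{B}^4, \quad\text{etc.}$$
This results in the simplification of (I) to
$$(\mathrm{Ia})\qquad P\nu^p = (\nu + \mathfrak{B})^P - (\nu - 1 + \mathfrak{B})^P.$$
This equation is formed for $\nu = 1, 2, 3, \ldots, n$, and we thereby obtain
$$P \cdot 1^p = (1 + \mathfrak{B})^P - \mathfrak{B}^P,$$
$$P \cdot 2^p = (2 + \mathfrak{B})^P - (1 + \mathfrak{B})^P,$$
$$\cdots$$
$$P \cdot n^p = (n + \mathfrak{B})^P - (n - 1 + \mathfrak{B})^P.$$
Addition of these $n$ equations gives us
$$PS = (n + \mathfrak{B})^P - \mathfrak{B}^P$$
or
$$(\mathrm{II})\qquad 1^p + 2^p + \cdots + n^p = \frac{(n + \mathfrak{B})^P - \mathfrak{B}^P}{P} \quad\text{with}\quad P = p + 1.$$
This formula, in which the magnitudes $\mathfrak{B}^1, \mathfrak{B}^2, \mathfrak{B}^3, \ldots$ on the right side of the equation, obtained from expansion of the binomial $(n + \mathfrak{B})^P$, are defined by equations (1), (2), (3), ..., gives us the sought-for power sum.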
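The symbolic device can be mirrored in a few lines of Python. The sketch assumes the reading given above: the symbols are fixed by $(\mathfrak{B} - 1)^m = \mathfrak{B}^m$ for $m = 2, 3, \ldots$ with the zeroth symbol taken as 1, and the power sum follows from (II) after expanding $(n + \mathfrak{B})^P$ by the binomial theorem. Function names are illustrative.

```python
from fractions import Fraction
from math import comb

def bernoulli_symbols(k_max):
    """Return [B^0, B^1, ..., B^k_max] defined by (B - 1)^m = B^m, m = 2, 3, ..."""
    B = [Fraction(1)]  # B^0 = 1
    for m in range(2, k_max + 2):
        # Expanding (B - 1)^m = B^m cancels B^m and leaves a linear equation:
        #   m * B^(m-1) = sum_{k=0}^{m-2} C(m, k) * (-1)^(m-k) * B^k
        total = sum(comb(m, k) * (-1) ** (m - k) * B[k] for k in range(m - 1))
        B.append(total / m)
    return B

def power_sum(p, n):
    """1^p + ... + n^p from (II): ((n + B)^P - B^P) / P with P = p + 1."""
    P = p + 1
    B = bernoulli_symbols(p)
    # Binomial expansion of (n + B)^P; the term B^P cancels against -B^P.
    total = sum(comb(P, k) * Fraction(n) ** (P - k) * B[k] for k in range(P))
    return total / P
```

Exact fractions are used because the symbols are rational; `power_sum(p, n)` can be compared directly with the sum computed term by term.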
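In order to apply it to the cases $p = 1, 2, 3, 4$, we first determine the unknowns $\mathfrak{B}^1$, $\mathfrak{B}^2$, $\mathfrak{B}^3$, and $\mathfrak{B}^4$ in accordance with equations (1), (2), ....
From (1) it follows that $-2\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^1 = \tfrac{1}{2}$. Then, from (2), $-3\mathfrak{B}^2 + 3\mathfrak{B}^1 - 1 = 0$, i.e., $\mathfrak{B}^2 = \tfrac{1}{6}$. And from (3), $-4\mathfrak{B}^3 + 6\mathfrak{B}^2 - 4\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^3 = 0$. Finally, from $(\mathfrak{B} - 1)^5 = \mathfrak{B}^5$ we obtain $\mathfrak{B}^4 = -\tfrac{1}{30}$.
The numbers $\mathfrak{B}^1 = \tfrac{1}{2}$, $\mathfrak{B}^2 = \tfrac{1}{6}$, $\mathfrak{B}^3 = 0$, $\mathfrak{B}^4 = -\tfrac{1}{30}$, etc., are known as Bernoulli numbers.
Then from (II) we obtain
$$1 + 2 + 3 + \cdots + n = \frac{(n + \mathfrak{B})^2 - \mathfrak{B}^2}{2} = \frac{n^2 + 2n\mathfrak{B}^1}{2} = n\,\frac{n + 1}{2},$$
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{(n + \mathfrak{B})^3 - \mathfrak{B}^3}{3} = \frac{n^3 + 3n^2\mathfrak{B}^1 + 3n\mathfrak{B}^2}{3} = \tfrac{1}{6}n(n + 1)(2n + 1),$$
$$1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{(n + \mathfrak{B})^4 - \mathfrak{B}^4}{4} = \frac{n^4 + 4n^3\mathfrak{B}^1 + 6n^2\mathfrak{B}^2 + 4n\mathfrak{B}^3}{4} = \left(n\,\frac{n + 1}{2}\right)^2,$$
$$1^4 + 2^4 + 3^4 + \cdots + n^4 = \frac{(n + \mathfrak{B})^5 - \mathfrak{B}^5}{5} = \frac{n^5 + 5n^4\mathfrak{B}^1 + 10n^3\mathfrak{B}^2 + 10n^2\mathfrak{B}^3 + 5n\mathfrak{B}^4}{5} = \frac{pst}{30},$$
with $p = n(n + 1)$, $s = 2n + 1$, $t = 3p - 1$.
If $n$ in (II) increases without limit, $S$ also increases without limit, but the quotient $S/n^P$ possesses a finite value. In fact, in accordance with the binomial theorem, (II) is written
$$PS = n^P + P_1 \mathfrak{B}^1 n^{P-1} + P_2 \mathfrak{B}^2 n^{P-2} + \cdots,$$
so that
$$\frac{S}{n^P} = \frac{1}{P} + \frac{P_1 \mathfrak{B}^1}{P}\cdot\frac{1}{n} + \frac{P_2 \mathfrak{B}^2}{P}\cdot\frac{1}{n^2} + \cdots.$$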
Now, if $n$ increases infinitely, all the fractions on the right-hand side with the exception of the first become infinitely small, and we obtain the limit equation of the power sum:
$$(\mathrm{III})\qquad \lim_{n\to\infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1}.$$
This important limit equation can also be derived from the exponential inequality (No. 10)
$$x^P > 1 + P(x - 1).$$
This derivation has the advantage over the one just given that it is true for any positive exponent $p$, not only for integral positive exponents!
If we first replace $x$ in the exponential inequality with the improper fraction $V/v$, then by the proper fraction $v/V$, after elimination of the denominators we obtain
$$V^P > v^P + Pv^p(V - v) \quad\text{and}\quad v^P > V^P - PV^p(V - v)$$
or
$$Pv^p < \frac{V^P - v^P}{V - v} < PV^p.$$
Into this new inequality we introduce the series $1|0,\ 2|1,\ 3|2,\ \ldots,\ n|n-1$ for the pair of values $V|v$ and we obtain
$$P \cdot 0^p < 1^P - 0^P < P \cdot 1^p,$$
$$P \cdot 1^p < 2^P - 1^P < P \cdot 2^p,$$
$$\cdots$$
$$P \cdot (n - 1)^p < n^P - (n - 1)^P < P \cdot n^p.$$
Addition of these $n$ inequalities results in
$$P(S - n^p) < n^P < PS$$
or
$$\frac{1}{P} < \frac{S}{n^P} < \frac{1}{P} + \frac{1}{n}.$$
Since both boundaries between which the quotient $S/n^P$ is situated assume the value $1/P$ when $n = \infty$,
$$(\mathrm{III})\qquad \lim_{n\to\infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1},$$
where $p$ represents any positive magnitude.
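The two bounds on the quotient $S/n^P$ can be observed numerically, including for non-integral $p$. The following sketch uses ordinary floating-point arithmetic, so it illustrates the inequality rather than proving it; the function names are illustrative.

```python
def power_sum_quotient(p, n):
    """Return S / n^(p+1) for S = 1^p + ... + n^p, with any positive real p."""
    S = sum(v ** p for v in range(1, n + 1))
    return S / n ** (p + 1)

def within_bounds(p, n):
    """Check 1/P < S / n^P < 1/P + 1/n with P = p + 1."""
    P = p + 1
    q = power_sum_quotient(p, n)
    return 1 / P < q < 1 / P + 1 / n
```

For $p = 1$ the quotient is exactly $\tfrac{1}{2} + \tfrac{1}{2n}$, which sits midway between the two bounds; for other exponents the quotient likewise approaches $1/(p + 1)$ as $n$ grows.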
If the mean value of the function $x^p$ is introduced, the limit equation of the power sum can be obtained in still another form.
The mean value of a function over an interval is commonly understood to mean the limiting value toward which the mean value of $n$ values of the function uniformly distributed over the interval tends if $n$ increases without limit.
The mean value $M$ of the function $f(x)$ over the interval 0 to $x$, if $\delta$ represents the nth part of $x$, is thus the limiting value of the quotient
$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$
for $n = \infty$. We write this mean value as $\mathfrak{M}_0^x f(x)$.
Thus, the mean value of the function $x^p$ over the interval 0 through $x$ is the limiting value of
$$\mu = \frac{\delta^p + (2\delta)^p + \cdots + (n\delta)^p}{n} = \delta^p\,\frac{1^p + 2^p + \cdots + n^p}{n},$$
i.e., since $\delta = x/n$, the limiting value of
$$\mu = x^p\,\frac{1^p + 2^p + \cdots + n^p}{n^{p+1}}.$$
Since the fractional factor of the right side according to (III) has the limiting value $1/(p + 1)$, it follows that the sought-for mean value of the function $x^p$ is
$$(\mathrm{IIIa})\qquad \mathfrak{M}_0^x\, x^p = \frac{x^p}{p + 1};$$
this formula, however, is basically no different from (III).
Formula (III) or (IIIa) has found many applications in geometry and physics.

The Euler Number

Find the limiting values of the functions
$$\varphi(x) = \left(1 + \frac{1}{x}\right)^x \quad\text{and}\quad \Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$$
for an infinitely increasing $x$.
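The simplest solution of this very interesting problem is based upon the exponential inequality
$$x^{\varepsilon} < 1 + \varepsilon(x - 1)$$
(cf. No. 10), in which $x$ is any positive magnitude and $\varepsilon$ is any proper fraction between 0 and 1.
Let us introduce two arbitrary positive numbers $a$ and $b$, the first of which is larger than the second, and introduce into the exponential inequality first
$$x = 1 + \frac{1}{b}, \qquad \varepsilon = \frac{b}{a},$$
and then
$$x = 1 - \frac{1}{b + 1}, \qquad \varepsilon = \frac{b + 1}{a + 1}.$$
In the first case we obtain
$$\left(1 + \frac{1}{b}\right)^{b/a} < 1 + \frac{1}{a}$$
or
$$(1)\qquad \left(1 + \frac{1}{b}\right)^b < \left(1 + \frac{1}{a}\right)^a.$$
In the second case we obtain
$$\left(1 - \frac{1}{b + 1}\right)^{\frac{b+1}{a+1}} < 1 - \frac{1}{a + 1}$$
or
$$\left(\frac{b}{b + 1}\right)^{b+1} < \left(\frac{a}{a + 1}\right)^{a+1}$$
or, finally,
$$(2)\qquad \left(1 + \frac{1}{b}\right)^{b+1} > \left(1 + \frac{1}{a}\right)^{a+1}.$$
The two inequalities obtained, (1) and (2), contain the remarkable theorem:
With an increasing positive argument $x$ the function $\varphi(x) = \left(1 + \frac{1}{x}\right)^x$ increases while the function $\Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$ decreases.
Thus, for $X > x$,
$$\varphi(X) > \varphi(x), \quad\text{whereas}\quad \Phi(X) < \Phi(x).$$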
Since, on the other hand, for the same values of the argument the function $\Phi$ exceeds the function $\varphi$ $\left[\Phi(x) = \left(1 + \frac{1}{x}\right)\varphi(x)\right]$, we obtain the inequalities
$$\varphi(x) < \varphi(X) < \Phi(X) \quad\text{and}\quad \varphi(X) < \Phi(X) < \Phi(x),$$
i.e., every value of the function $\Phi$ is greater than every value of the function $\varphi$. (Only positive values of the argument will be considered.)
Let us imagine two movable points $p$ and $P$ on the positive number axis which are situated at distances $\varphi(t)$ and $\Phi(t)$ from the zero point at time $t$ and begin their movements in the instant $t = 1$. Point $p$, beginning from $\varphi(1) = 2$, then moves continuously toward the right, while point $P$, which begins at $\Phi(1) = 4$, moves continuously toward the left. However, since $\Phi(t)$ is always greater than $\varphi(t)$, i.e., $P$ is always to the right of $p$, the points can never meet. Nevertheless, the distance between them, $d = \Phi(t) - \varphi(t) = \varphi(t)/t$, diminishes without limit with increasing time (since $\varphi(t) < 4$, and thus $d < 4/t$), so that they finally are separated by an infinitely small distance.
The only way to explain this situation is to assume that on the number axis (between the numbers 2 and 4) there exists a fixed point that the moving points $p$ and $P$ approach infinitely closely from the left and from the right, respectively, without ever touching. The distance of this fixed point from the zero point is the so-called Euler number $e$.
The proposal to designate this number, which also forms the base of the natural logarithmic system (No. 14), by the letter $e$ stems from Euler (Commentarii Academiae Petropolitanae ad annum 1739, vol. IX).
The important inequality
$$(\mathrm{I})\qquad \left(1 + \frac{1}{x}\right)^x < e < \left(1 + \frac{1}{x}\right)^{x+1}$$
is true for Euler's number ($x$ represents any positive number).
If we choose $x = 1{,}000{,}000$, this inequality gives us the number $e$ exactly to five decimal places. However, the use of the series for $e$ (No. 13) is a better method of computation.
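Inequality (I) is easy to try out numerically. The sketch below evaluates both functions through `math.log1p` and `math.exp` to keep rounding error small; this relies on the library logarithm and is therefore only an illustration of the bounds, not an independent computation of $e$. The names `phi` and `Phi` are illustrative.

```python
import math

def phi(x):
    """(1 + 1/x)^x, evaluated through logarithms to limit rounding error."""
    return math.exp(x * math.log1p(1.0 / x))

def Phi(x):
    """(1 + 1/x)^(x + 1)."""
    return math.exp((x + 1.0) * math.log1p(1.0 / x))

x = 1_000_000
lower, upper = phi(x), Phi(x)  # the two sides of inequality (I)
```

The gap `upper - lower` equals $\varphi(x)/x$ and is therefore below $4/x$, which is why a large argument pins down the leading decimals of $e$.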
Then we obtain
$$e = 2.718281828459045\ldots.$$
The sought-for limiting values, however, are
$$\lim_{x\to\infty}\left(1 + \frac{1}{x}\right)^x = e \quad\text{and}\quad \lim_{x\to\infty}\left(1 + \frac{1}{x}\right)^{x+1} = e,$$
the first of which is an upper limit, while the second is a lower limit.
Note. From the inequality (I) for the number $e$ the inequality for the exponential function $e^x$ follows directly.
1. In the inequality
$$\left(1 + \frac{1}{x}\right)^x < e$$
we replace $x$ by $1/P$, where $P$ is any positive number; we raise both sides to the power $P$ and obtain
$$(1)\qquad e^P > 1 + P.$$
2. In the inequality
$$e < \left(1 + \frac{1}{x}\right)^{x+1}$$
we replace $x + 1$ by $-1/n$, thus $1 + \frac{1}{x}$ by $\frac{1}{1 + n}$, $n$ being a negative proper fraction $\neq 0$; we raise both sides to the power $n$ and obtain
$$(2)\qquad e^n > 1 + n.$$
3. We consider that for every negative improper fraction $N$, $(1 + N)$ is not positive, and consequently we have
$$(3)\qquad e^N > 1 + N.$$
Combining the inequalities (1), (2), (3), we obtain the inequality of the exponential function:
$$e^x \geq 1 + x,$$
which is true for every finite real value of $x$ and only becomes an equation when $x = 0$.
The inequality obtained leads directly to the so-called limit equation of the exponential function.
Let $x$ be any finite real magnitude and $n$ a positive number of such magnitude that $1 + \frac{x}{n}$ is positive. In accordance with the inequality of the exponential function,
$$e^{x/n} > 1 + \frac{x}{n} \quad\text{and}\quad e^{-x/n} > 1 - \frac{x}{n}.$$
We raise these inequalities to the power $n$, in the case of the second, however, only after we have multiplied it by $1 + \frac{x}{n}$. This results in
$$e^x > \left(1 + \frac{x}{n}\right)^n \quad\text{and}\quad \left(1 + \frac{x}{n}\right)^n e^{-x} > \left(1 - \frac{x^2}{n^2}\right)^n.$$
Since the right-hand side of the second inequality, in accordance with the exponential inequality (No. 10), is greater than $1 - \frac{x^2}{n}$, then actually
$$\left(1 + \frac{x}{n}\right)^n e^{-x} > 1 - \frac{x^2}{n} \quad\text{or}\quad \left(1 + \frac{x}{n}\right)^n > \left(1 - \frac{x^2}{n}\right)e^x.$$
Combining the inequalities obtained, we get
$$\left(1 - \frac{x^2}{n}\right)e^x < \left(1 + \frac{x}{n}\right)^n < e^x.$$
If $n$ is then allowed to increase infinitely, the left side of this inequality is transformed into $e^x$ and we obtain the limit equation of the exponential function:
$$\lim_{n\to\infty}\left(1 + \frac{x}{n}\right)^n = e^x,$$
in which $x$ represents any finite real number and $n$ is an infinitely increasing magnitude.

Newton's Exponential Series

Transform the exponential function $e^x$ into a progression in terms of powers of $x$.

This power progression, the so-called exponential series, which may in fact be the most important series in mathematics, was discovered by the great English mathematician and physicist Isaac Newton
(1642-1727). The famous treatise that contains the sine series, the cosine series, the arc sine series, the logarithmic series, and the binomial series as well as the exponential series was written in 1665 and bears the title De analysi per aequationes numero terminorum infinitas.
Newton's derivation of the exponential series is, however, not rigorous and rather complicated.
The following derivation is based upon the mean values of the functions $x^p$ (No. 11) and $e^x$.
We find the mean value of the function $e^x$ with the help of the inequality of the exponential function
$$(1)\qquad e^u > 1 + u \qquad\text{(No. 12)}.$$
We will consider two arbitrary values $v$ and $V = v + \varphi > v$ of the argument of the exponential function and first set $u = \varphi$ and then $u = -\varphi$ in (1). This gives us
$$e^{\varphi} > 1 + \varphi \quad\text{and}\quad e^{-\varphi} > 1 - \varphi,$$
respectively. Multiplication with $e^v$ and $e^V$, respectively, results in
$$e^V > e^v + \varphi e^v \quad\text{and}\quad e^v > e^V - \varphi e^V,$$
respectively; combining, we obtain:
$$(2)\qquad e^v < \frac{e^V - e^v}{V - v} < e^V.$$
The mean value $M$ of $e^x$ over the interval 0 to $x$ is the limiting value of the quotient
$$\mu = \frac{e^{\delta} + e^{2\delta} + \cdots + e^{n\delta}}{n} \qquad \left(\delta = \frac{x}{n}\right)$$
for an unlimitedly increasing $n$.
In order to find $\mu$, for a positive $x$ we set down in (2) for the pair of values $v|V$ in succession
$$0|\delta,\ \delta|2\delta,\ 2\delta|3\delta,\ \ldots,\ (n - 1)\delta|n\delta$$
and add the resulting $n$ inequalities. This gives
$$n\mu + 1 - e^x < \frac{e^x - 1}{\delta} < n\mu$$
or, solved for $\mu$,
$$\frac{e^x - 1}{x} < \mu < \frac{e^x - 1}{x} + \frac{e^x - 1}{n} \qquad (x > 0).$$
For a negative $x$ we put down successively for $v|V$ in (2)
$$\delta|0,\ 2\delta|\delta,\ 3\delta|2\delta,\ \ldots,\ n\delta|(n - 1)\delta.$$
Summation of the resulting $n$ inequalities then leads to the same final inequality; only in this case the extremes are reversed, so that this time it reads
$$\frac{e^x - 1}{x} + \frac{e^x - 1}{n} < \mu < \frac{e^x - 1}{x} \qquad (x < 0).$$
If we then allow $n$ to become infinite in the two inequalities obtained, we get for $\lim \mu$ the value
$$(3)\qquad \mathfrak{M}_0^x\, e^x = \frac{e^x - 1}{x},$$
whether $x$ is positive or negative.
Now for the series expansion of $e^x$! We begin with the inequality
$$e^x > 1 + x.$$
We assume initially that $x$ is positive and obtain the mean values of both sides. This gives us
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!}.$$
Repeated mean formation gives rise to
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$
We continue in this manner and obtain
$$(4)\qquad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}.$$
In order to obtain an upper limit for $e^x$ also we begin with the inequality $e^{-x} > 1 - x$, multiply by $e^x$ and obtain $1 > e^x - xe^x$ or
$$e^x < 1 + xe^x.$$
In the subsequent mean formations we employ the self-evident theorem:
"The mean of the product of two (positive) functions $u$ and $v$ is smaller than the product of the mean value of $u$ and the maximum value of $v$ over the interval considered."
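The mean value (3), on which the mean formations above rest, can be checked against its defining quotient. This Python sketch computes $\mu$ for a finite $n$ and the limiting value $(e^x - 1)/x$; it uses the library exponential, so it illustrates the two enclosing inequalities rather than deriving them. Function names are illustrative.

```python
import math

def mu(x, n):
    """Mean of e^delta, e^(2 delta), ..., e^(n delta) with delta = x / n."""
    delta = x / n
    return sum(math.exp(k * delta) for k in range(1, n + 1)) / n

def mean_value(x):
    """Limiting value (3): (e^x - 1) / x for x != 0."""
    return math.expm1(x) / x
```

For positive $x$ the quotient $\mu$ lies above the limit by less than $(e^x - 1)/n$; for negative $x$ it lies below the limit by less than $|e^x - 1|/n$.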
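In the first step ($u = x$, $v = e^x$) we obtain
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2}e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!}e^x,$$
in the second $\left(u = \frac{x^2}{2!},\ v = e^x\right)$
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!}e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}e^x,$$
etc., and finally
$$(5)\qquad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}e^x.$$
If we then consider the case in which $x$ is negative, the situation is somewhat simpler.
From $e^x > 1 + x$ it follows as above that
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2};$$
however, since $x$ is now negative,
$$e^x < 1 + x + \frac{x^2}{2!}.$$
The next mean formation yields
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!},$$
the next
$$e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!},$$
etc., and finally
$$(6)\qquad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{2\nu-1}}{(2\nu - 1)!}$$
and
$$(7)\qquad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{2\nu}}{(2\nu)!}.$$
From inequalities (4), (5), (6), and (7) it follows that: When $x$ is positive $e^x$ lies between
$$1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + \frac{x}{1!} + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}e^x,$$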
and when $x$ is negative between
$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^{n+1}}{(n + 1)!}.$$
Then if we write
$$(8)\qquad e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!},$$
the error encountered for a positive value of $x$ is less than
$$\frac{x^n}{n!}(e^x - 1)$$
and for a negative value of $x$ less than
$$\left|\frac{x^{n+1}}{(n + 1)!}\right|.$$
But for a finite value of $x$ and for an infinitely increasing $n$ the fraction $x^n/n!$ approaches zero. [In accordance with No. 10 each of the products $2(n - 1),\ 3(n - 2),\ \ldots,\ (n - 1)\cdot 2$ is greater than $1 \cdot n$. The product of these products is therefore greater than $n^{n-2}$, i.e., $(n - 1)!^2 > n^{n-2}$ or $n!^2 > n^n$ or $n! > \sqrt{n}^{\,n}$. Thus, it follows that
$$\left|\frac{x^n}{n!}\right| < \left|\frac{x}{\sqrt{n}}\right|^n.$$
If $n$ is assigned a value such that $\sqrt{n}$ is greater than $|2x|$, then
$$\left|\frac{x^n}{n!}\right| < \left(\frac{1}{2}\right)^n \quad\text{and}\quad \lim_{n\to\infty}\frac{x^n}{n!} = 0.]$$
The error encountered with formula (8) thus disappears as $n$ increases infinitely. Consequently:
The progression
$$(9)\qquad e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
is true for every finite $x$.
Note. The series obtained is particularly well suited for computation of the Euler number $e$. If, for example, we set $x$ equal to 1,
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{10!} = 2.7182818012$$
and the encountered error is
$$F = \frac{1}{11!} + \frac{1}{12!} + \frac{1}{13!} + \cdots = \frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12\cdot 13} + \cdots\right),$$
which is smaller than
$$\frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12^2} + \frac{1}{12^3} + \cdots\right)$$
or smaller than $\frac{12}{11\cdot 11!} < 0.00000008$. The exact value is
$$e = 2.71828182845904523536\ldots.$$
Formula (9), which applies to every finite real value of $x$, suggests the further extension of the concept of the exponential function to include the complex argument values $z$.
The exponential function $e^z$ for the complex argument $z$ is defined by the formula
$$(10)\qquad e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \text{ to infinity.}$$
It is easily seen that the infinite power series on the right-hand side of (10) has a definite finite value for every finite $z$, or, in other words, that the series converges for every finite $z$:
We set
$$1 + z + \frac{z^2}{2!} + \cdots + \frac{z^n}{n!} = E_n(z),$$
$$\frac{z^{n+1}}{(n + 1)!} + \frac{z^{n+2}}{(n + 2)!} + \cdots + \frac{z^{n+\nu}}{(n + \nu)!} = R_{\nu}(z),$$
so that
$$E_{n+\nu}(z) - E_n(z) = R_{\nu}(z).$$
If $\zeta$ represents the absolute magnitude of $z$, then the absolute magnitude of $R_{\nu}(z)$ must certainly be smaller than
$$\frac{\zeta^{n+1}}{(n + 1)!} + \frac{\zeta^{n+2}}{(n + 2)!} + \cdots + \frac{\zeta^{n+\nu}}{(n + \nu)!}$$
and consequently considerably smaller than
$$\frac{\zeta^{n+1}}{(n + 1)!} + \frac{\zeta^{n+2}}{(n + 2)!} + \cdots \text{ to infinity} = e^{\zeta} - E_n(\zeta).$$
Since, in accordance with (8) or (9), $e^{\zeta} - E_n(\zeta)$ can be made as small as desired with the selection of a sufficiently high value for $n$, $R_{\nu}(z)$ can certainly be made as small as desired for such an $n$, no matter how great the value of $\nu$. However, this means that the series
$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$
converges. (It is in fact absolutely convergent, i.e., it still converges when $z$ is converted into its absolute magnitude $\zeta$.)
Moreover, let $a$ and $b$ be two arbitrary real or complex values, $\alpha$ and $\beta$ their absolute magnitudes, and $\alpha + \beta = \gamma$. By multiplication of
$$E_n(a) = 1 + \frac{a}{1!} + \frac{a^2}{2!} + \cdots + \frac{a^n}{n!}$$
and
$$E_n(b) = 1 + \frac{b}{1!} + \frac{b^2}{2!} + \cdots + \frac{b^n}{n!}$$
we obtain
$$E_n(a)E_n(b) = 1 + C_1 + C_2 + \cdots + C_{2n},$$
$C_{\nu}$ representing the sum of all the members of the form $\frac{a^r b^s}{r!\,s!}$ in which the exponents $r$ and $s$ have the sum $\nu$. As long as $\nu$ does not exceed the value of $n$, all $\nu + 1$ index pairs $(r, s)$ with the sum $\nu$ occur in $C_{\nu}$, whereas when $\nu > n$ only some of them do. Consequently, according to the binomial theorem (No. 9),
$$\text{for } \nu \leq n:\quad C_{\nu} = \frac{(a + b)^{\nu}}{\nu!}, \qquad \text{for } \nu > n:\quad |C_{\nu}| < \frac{\gamma^{\nu}}{\nu!}.$$
The sum of the first $(n + 1)$ terms of $E_n(a)E_n(b)$ is therefore equal to $E_n(a + b)$, and the sum of the absolute magnitudes of the following $n$ terms is smaller than $R_n(\gamma)$, i.e., is certainly smaller than
$$\frac{\gamma^{n+1}}{(n + 1)!} + \frac{\gamma^{n+2}}{(n + 2)!} + \cdots \text{ to infinity} = e^{\gamma} - E_n(\gamma) = \delta,$$
so that we can set the sum of these following terms equal to $\varepsilon\delta$, where $|\varepsilon| < 1$. Accordingly, we obtain the equation
$$E_n(a)\cdot E_n(b) = E_n(a + b) + \varepsilon\delta.$$
If we then allow $n$ to become infinite in this equation, $\delta$ becomes equal to zero, and the equation is converted into
$$(11)\qquad e^a \cdot e^b = e^{a+b}.$$
This fundamental formula justifies our previous suggestion of designating the series
$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$
as $e^z$.
Now let $z = x + iy$, where $x$ and $y$ are real. According to (11), $e^z = e^x \cdot e^{iy}$, or, since
$$e^{iy} = 1 + iy - \frac{y^2}{2!} - i\frac{y^3}{3!} + \frac{y^4}{4!} + i\frac{y^5}{5!} - \cdots,$$
$$e^z = e^x\left[\left(1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \frac{y^6}{6!} + \cdots\right) + i\left(y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots\right)\right].$$
The brackets appearing here are, in accordance with No. 15, $\cos y$ and $\sin y$, and we obtain the Euler formula:
$$(12)\qquad e^{x+iy} = e^x(\cos y + i\sin y),$$
which when $x = 0$ takes the form
$$(12a)\qquad e^{iy} = \cos y + i\sin y.$$
If in (12a) $y = \pi$, we obtain the remarkable Euler relation
$$e^{i\pi} = -1$$
between the two significant numbers $e$ and $\pi$.
If we then replace $y$ by $-y$ in (12a), we obtain
$$(12b)\qquad e^{-iy} = \cos y - i\sin y,$$
and subsequent addition and subtraction of (12a) and (12b) yields the equally remarkable pair of formulas
$$\cos y = \frac{e^{iy} + e^{-iy}}{2}, \qquad \sin y = \frac{e^{iy} - e^{-iy}}{2i}.$$
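The partial sums $E_n(z)$ can be evaluated directly, for real and complex arguments alike. The Python sketch below computes the approximation of $e$ through $1/10!$, compares the remainder with the geometric-series bound $12/(11 \cdot 11!)$ (the closed form of the bounding series, stated here as an assumption about the intended expression), and tests the Euler relation and formula (11) numerically. The function name `E` mirrors $E_n$ and is otherwise illustrative.

```python
import math

def E(n, z):
    """Partial sum E_n(z) = 1 + z + z^2/2! + ... + z^n/n!."""
    term, total = 1, 1
    for k in range(1, n + 1):
        term = term * z / k  # z^k / k! obtained from the previous term
        total += term
    return total

e_approx = E(10, 1)                           # partial sum through 1/10!
error_bound = 12 / (11 * math.factorial(11))  # sum of the bounding geometric series
euler_check = E(40, 1j * math.pi)             # expected to be close to -1
```

Building each term from the previous one avoids computing large powers and factorials separately, which keeps the floating-point error small even for complex $z$.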
Nicolaus Mercator's Logarithmic Series

To calculate the logarithm of a given number without the use of the logarithmic table.

This fundamental problem, which forms the basis for the construction of the logarithmic tables, is solved simply and conveniently by logarithmic series.
The simplest logarithmic series:
$$x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 + \cdots,$$
which represents the natural log of $1 + x$, is found for the first time in the Logarithmotechnia (London, 1668) of the Holstein mathematician Nicolaus Mercator (1620-1687) (whose real name was Kaufmann).
For the derivation of the logarithmic series we will make use of the mean value of the function $f(x) = \frac{1}{1 + x}$, which we will therefore determine first.
We will begin with the inequality (2) of the preceding number (No. 13),
$$e^v < \frac{e^V - e^v}{V - v} < e^V;$$
we begin by converting this inequality into an inequality for the logarithmic function nat log $x$ (nat log $x$, abbreviated as $lx$, is the logarithm of $x$ when Euler's number $e$ is taken as the base of the logarithmic system, i.e., the logarithm is the power of $e$ required to obtain $x$). Consequently, we replace $v$ and $V$ with $lu$ and $lU$, where $U > u > 0$, and, correspondingly, $e^v$ and $e^V$ with $u$ and $U$. This gives us
$$u < \frac{U - u}{lU - lu} < U$$
or
$$(1)\qquad \frac{1}{U} < \frac{lU - lu}{U - u} < \frac{1}{u} \qquad (U > u > 0).$$
The mean value of the function $f(x) = 1/(1 + x)$ is the limiting value of the fraction
$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$
for an infinitely increasing $n$ and $\delta = x/n$.
To determine $\lim \mu$ for positive and negative values of $x$, respectively, we write $1 + \nu\delta \,|\, 1 + (\nu - 1)\delta$ in (1) for the pairs $U|u$ and $u|U$,
respectively, and then form (1) for $\nu = 1, 2, 3, \ldots, n$. Addition of the resulting $n$ inequalities gives in both cases:
$$\frac{l(1 + x)}{\delta} \text{ lies between } n\mu \text{ and } n\mu + \frac{x}{1 + x};$$
in other words,
$$\mu \text{ lies between } \frac{l(1 + x)}{x} \text{ and } \frac{l(1 + x)}{x} - \frac{x}{n(1 + x)}.$$
Thus, if $n$ becomes infinite, it follows that
$$(2)\qquad \mathfrak{M}_0^x\,\frac{1}{1 + x} = \frac{l(1 + x)}{x},$$
where $(1 + x)$ is naturally to be considered positive.
Now for the derivation of the series for $l(1 + x)$! From $f = 1/(1 + x)$ it follows that
$$f = 1 - xf.$$
If we replace $f$ on the right-hand side of this equation with $1 - xf$, we obtain
$$f = 1 - x + x^2 f.$$
If we again replace $f$ on the right-hand side by $1 - xf$, we obtain
$$f = 1 - x + x^2 - x^3 f.$$
Similarly, from this we obtain
$$f = 1 - x + x^2 - x^3 + x^4 f,$$
etc., and in general:
$$f = 1 - x + x^2 - x^3 + x^4 - \cdots - \varepsilon x^{n-1} + \varepsilon x^n f,$$
where $\varepsilon$ is equal to $+1$ for even values of $n$ and $-1$ for uneven values of $n$.
Obtaining the mean value from this formula, we have
$$(3)\qquad \frac{l(1 + x)}{x} = 1 - \frac{x}{2} + \frac{x^2}{3} - \frac{x^3}{4} + \cdots - \varepsilon\frac{x^{n-1}}{n} + \varepsilon\,\mathfrak{M}_0^x\, x^n f.$$
If $F$ represents the maximum value assumed by $f$ over the interval 0 to $x$ (thus $F = 1$ for positive values of $x$, $F = 1/(1 + x)$ for negative values of $x$), then in terms of the absolute value the mean value of $x^n f$ must be smaller than $F$ times the mean value $[x^n/(n + 1)]$ of $x^n$. Accordingly, we are able to write
$$\mathfrak{M}_0^x\, x^n f = \theta F\cdot\frac{x^n}{n + 1},$$
where $\theta$ is a definite positive proper fraction.
This converts (3) into
$$l(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots - \varepsilon\frac{x^n}{n} + R \quad\text{with}\quad R = \varepsilon\theta F\cdot\frac{x^{n+1}}{n + 1}.$$
As $n$ approaches infinity, if $x$ is a proper fraction (also when $x = +1$) the "residue" $R$ tends toward zero.
Consequently, the following progression is valid when $x$ is a proper fraction and when $x = 1$:
$$(4)\qquad l(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots.$$
The series on the right-hand side of the equation is Mercator's series. Since it is only valid for proper fractional values of $x$, it is not suited for computing the logarithms of any number whatever. In order to obtain the series required for this, we substitute in (4) $-x$ for $x$ and obtain
$$(5)\qquad l(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots.$$
Subtracting (5) from (4) gives us
$$l\,\frac{1 + x}{1 - x} = 2\left[x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right].$$
For every positive or negative proper fractional value of $x$, $X = \frac{1 + x}{1 - x}$ is positive, while at the same time $x = \frac{X - 1}{X + 1}$, and the formula obtained is written
$$(6)\qquad lX = 2\left[x + \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 + \cdots\right] \quad\text{with}\quad x = \frac{X - 1}{X + 1}.$$
This new series converges for every positive $X$.
In this series we substitute for $X$ the quotient $Z/z$ of two arbitrary positive numbers. This gives us
$$(7)\qquad lZ - lz = 2\left[Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \tfrac{1}{7}Q^7 + \cdots\right], \qquad Q = \frac{Z - z}{Z + z}.$$
This series, in which $Z$ and $z$ may be any two positive numbers, is the logarithmic series from which the logarithmic tables can be computed.
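Series (4) and (7) translate directly into short functions. The sketch below is illustrative: the number of terms and the sample pairs $Z|z$ are assumptions chosen for the demonstration, and the library logarithm is used only to compare results.

```python
import math

def log_difference(Z, z, terms=20):
    """lZ - lz from series (7): 2 (Q + Q^3/3 + Q^5/5 + ...), Q = (Z - z)/(Z + z)."""
    Q = (Z - z) / (Z + z)
    return 2 * sum(Q ** (2 * k + 1) / (2 * k + 1) for k in range(terms))

def mercator(x, terms=200):
    """Partial sum of Mercator's series (4) for l(1 + x), for -1 < x <= 1."""
    return sum((-1) ** (k + 1) * x ** k / k for k in range(1, terms + 1))
```

Series (7) converges quickly when $Z$ and $z$ are close together, because $Q$ is then small; with $Z = 128$ and $z = 125$, for instance, three terms already agree with $7\,l2 - 3\,l5$ to about twelve decimal places, whereas Mercator's series needs many terms unless $x$ is small.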
In order, for example, to compute $l\,2$ we set $z$ equal to 1 and $Z$ to 2, which gives us

$$l\,2 = 2\left(\frac{1}{3} + \frac{1}{3\cdot 3^3} + \frac{1}{5\cdot 3^5} + \frac{1}{7\cdot 3^7} + \cdots\right).$$

In order to compute $l\,5$ we set $z = 125 = 5^3$ and $Z = 128 = 2^7$, and this gives us

$$7\,l\,2 - 3\,l\,5 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{3}{253}.$$

To compute $l\,3$ we assume that $z = 80 = 5\cdot 2^4$, $Z = 81 = 3^4$, so that $lz = l\,5 + 4\,l\,2$, $lZ = 4\,l\,3$. This gives us

$$4\,l\,3 - l\,5 - 4\,l\,2 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{161}.$$

To compute $l\,7$ we set $z$ equal to $2400 = 2^5\cdot 5^2\cdot 3$, $Z = 2401 = 7^4$, and obtain

$$4\,l\,7 - 5\,l\,2 - 2\,l\,5 - l\,3 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{4801}.$$

The series in the parentheses converge very rapidly, i.e., we require relatively few terms to obtain their sum fairly exactly.

Note. The common logarithms to the base 10 are computed from the natural logarithms. From

$$10^{\log x} = e^{lx} \quad (= x)$$

it follows in terms of the natural logarithms that

$$\log x \cdot l\,10 = lx \quad\text{or}\quad \log x = M\,lx,$$

where $M = \frac{1}{l\,10} = 0.4342944819$ is the so-called modulus by which the natural logarithm must be multiplied to give the common logarithm.

Newton's Sine and Cosine Series

Compute the circular functions sine and cosine of a given angle without the use of tables.

The simplest way of carrying out the required computation is with the use of the sine and cosine series.
The series for $\sin x$ and $\cos x$ first appeared in Newton's treatise De analysi per aequationes numero terminorum infinitas (1665-1666). (No. 13.) The sine series appears there as the converse of the arc sine series, which today is a very uncommon approach.

The derivation of the sine and cosine series presented here is based upon the mean values of the functions $\sin x$ and $\cos x$ over the interval 0 through $x$. (All of the angles mentioned in what follows are considered in circular measure.)

The mean value $M$ of the function $\sin x$ over the interval 0 through $x$ is the limiting value of the quotient

$$\mu = \frac{\sin\delta + \sin 2\delta + \cdots + \sin n\delta}{n}$$

for an infinitely increasing integral positive $n$, where $\delta$ represents the $n$th part of $x$. But the numerator of the quotient possesses the value (see Note 1 at the end of this number)

$$\sin m \cdot \frac{\sin n\frac{\delta}{2}}{\sin\frac{\delta}{2}},$$

where $m$ is the arithmetic mean of the $n$ argument values $\delta, 2\delta, \ldots, n\delta$, i.e.,

$$m = \frac{n+1}{2}\,\delta = \frac{x}{2} + \frac{\delta}{2}.$$

Consequently,

$$\mu = \frac{\sin m \,\sin\frac{x}{2}}{n \sin\frac{\delta}{2}}.$$

Since the denominator of the fraction on the right-hand side tends toward the limit $\frac{1}{2}x$ as $n$ becomes infinitely great (the reader who is unfamiliar with this fact will find the proof in Note 2 at the end of this number), and $\lim m$ is also equal to $\frac{1}{2}x$, we obtain

$$M = \lim_{n\to\infty} \mu = \frac{\sin\frac{x}{2}\,\sin\frac{x}{2}}{\frac{x}{2}}$$
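or

$$(1)\qquad \mathfrak{M}_0^x \sin x = \frac{1 - \cos x}{x}.$$

By the same route, with the use of the formula

$$\cos\delta + \cos 2\delta + \cdots + \cos n\delta = \cos m \cdot \frac{\sin n\frac{\delta}{2}}{\sin\frac{\delta}{2}},$$

we obtain

$$(2)\qquad \mathfrak{M}_0^x \cos x = \frac{\sin x}{x}.$$

The series for $\sin x$ and $\cos x$ are now very easily found.

Starting with the inequality $\cos x < 1$, we obtain the mean value for both sides and we have

$$\frac{\sin x}{x} < 1 \quad\text{or}\quad \sin x < x.$$

If we once again obtain the mean values (Formula [1] and No. 11) we obtain

$$\frac{1 - \cos x}{x} < \frac{1}{2}x \quad\text{or}\quad \cos x > 1 - \frac{x^2}{2!}.$$

By again obtaining the mean value we get

$$\frac{\sin x}{x} > 1 - \frac{x^2}{3!} \quad\text{or}\quad \sin x > x - \frac{x^3}{3!},$$

etc. This results in:

$$\cos x < 1, \qquad \sin x < x,$$

$$\cos x > 1 - \frac{x^2}{2!}, \qquad \sin x > x - \frac{x^3}{3!},$$

$$\cos x < 1 - \frac{x^2}{2!} + \frac{x^4}{4!}, \qquad \sin x < x - \frac{x^3}{3!} + \frac{x^5}{5!},$$

$$\cos x > 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!}, \qquad \sin x > x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!},$$

etc.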
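The alternating inequalities listed above can be checked numerically for a positive $x$. The sketch below is only an illustration: the function names `sin_approximation` and `cos_approximation` and the index `v` (number of terms kept) are assumptions of this example, and the reference values come from Python's `math.sin` and `math.cos`.

```python
import math


def sin_approximation(x, v):
    """First v terms of x - x^3/3! + x^5/5! - ... (v is an assumed parameter name)."""
    return sum((-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
               for k in range(v))


def cos_approximation(x, v):
    """First v terms of 1 - x^2/2! + x^4/4! - ..."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k)
               for k in range(v))


if __name__ == "__main__":
    x = 1.0
    for v in range(1, 6):
        # Successive rows lie alternately above and below the true values.
        print(v, sin_approximation(x, v) - math.sin(x),
              cos_approximation(x, v) - math.cos(x))
```

For positive $x$ the printed differences alternate in sign, in agreement with the chain of inequalities.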
The integral rational functions on the right-hand side of these inequalities are the 1st, 2nd, 3rd, ..., $\nu$th approximations of the functions $\sin x$ and $\cos x$. They are called approximations because the degree of their deviation from the correct circular function grows progressively smaller as the index $\nu$ becomes higher and can be made as small as desired if $\nu$ is sufficiently great. Specifically, each of the two circular functions lies between two successive approximations of the true value. Thus, if we set them equal to one of these two approximations, the error incurred is smaller than the difference between the approximations, which has the form $x^\nu/\nu!$. The fraction $x^\nu/\nu!$, however, tends toward zero as $\nu$ becomes infinitely great (No. 13).

Accordingly, the following progressions

$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots,$$

$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots$$

are valid for finite values of $x$.

If one of these series is interrupted at any point the error thereby incurred is smaller than the first disregarded term.

With these series it is possible to compute the sine and cosine of any given angle. They were used to draw up the sine and cosine tables found in logarithmic handbooks.

In order to illustrate the degree of approximation let us compute, for example, $\sin 1° = \sin x$ (where $x = \pi/180$). We set

$$\sin 1° = \sin x = x - \frac{x^3}{6}.$$

The error thereby incurred is smaller than $x^5/120$, and this fraction is smaller than 0.000 000 000 02, so that, calculated exactly to 10 places,

$$\sin 1° = 0.0174524064.$$

Note 1. Summation of the series

$$S = \sin\alpha + \sin(\alpha + \delta) + \sin(\alpha + 2\delta) + \cdots + \sin(\alpha + (n-1)\delta).$$

We multiply both sides by $2\sin\frac{\delta}{2}$ and transform each of the products on the right in accordance with the formula

$$2\sin\frac{\delta}{2}\,\sin(\alpha + \nu\delta) = \cos\left(\alpha + \frac{2\nu - 1}{2}\,\delta\right) - \cos\left(\alpha + \frac{2\nu + 1}{2}\,\delta\right).$$
We are then left with

$$2S\sin\frac{\delta}{2} = \cos\left(\alpha - \frac{\delta}{2}\right) - \cos\left(\alpha + \frac{2n-1}{2}\,\delta\right).$$

Since the right side of this equation is

$$2\sin\left(\alpha + \frac{n-1}{2}\,\delta\right)\sin n\frac{\delta}{2},$$

we obtain

$$S = \sin m \cdot \frac{\sin n\frac{\delta}{2}}{\sin\frac{\delta}{2}},$$

where $m = \alpha + \frac{n-1}{2}\,\delta$ represents the mean value of all $n$ angles $\alpha, \alpha + \delta, \ldots, \alpha + (n-1)\delta$.

In order to obtain the sum of the series

$$\Sigma = \cos\alpha + \cos(\alpha + \delta) + \cdots + \cos(\alpha + (n-1)\delta)$$

we again multiply both sides by $2\sin\frac{\delta}{2}$, but on the right-hand side we write

$$2\sin\frac{\delta}{2}\,\cos(\alpha + \nu\delta) = \sin\left(\alpha + \frac{2\nu + 1}{2}\,\delta\right) - \sin\left(\alpha + \frac{2\nu - 1}{2}\,\delta\right).$$

We are then left with

$$2\Sigma\sin\frac{\delta}{2} = \sin\left(\alpha + \frac{2n-1}{2}\,\delta\right) - \sin\left(\alpha - \frac{\delta}{2}\right) = 2\cos\left(\alpha + \frac{n-1}{2}\,\delta\right)\sin n\frac{\delta}{2},$$

and we obtain

$$\Sigma = \cos m \cdot \frac{\sin n\frac{\delta}{2}}{\sin\frac{\delta}{2}}.$$

Note 2. Proof that $\lim_{n\to\infty} n\sin\frac{w}{n} = w$.

$$\sin w = 2\sin\frac{w}{2}\cos\frac{w}{2} = 2\tan\frac{w}{2}\cos^2\frac{w}{2} = 2\tan\frac{w}{2}\left(1 - \sin^2\frac{w}{2}\right).$$
However, since $\sin w < w$ and $\tan w > w$, it follows that

$$\sin w > 2\cdot\frac{w}{2}\left(1 - \frac{w^2}{4}\right) \quad\text{or}\quad \sin w > w - \frac{w^3}{4}.$$

Then $\sin\frac{w}{n}$ lies between $\frac{w}{n}$ and $\frac{w}{n} - \frac{w^3}{4n^3}$, i.e., $n\sin\frac{w}{n}$ lies between $w$ and $w - \frac{w^3}{4n^2}$. Thus,

$$\lim_{n\to\infty} n\sin\frac{w}{n} = w.$$

Andre's Derivation of the Secant and Tangent Series

Perhaps the most convenient and certainly the most attractive way of deriving the power series of the functions $\sec x$ and $\tan x$ is the method of zigzag permutations devised by the French mathematician André (Comptes Rendus, 1879, and Journal de Mathématiques, 1881).

A zigzag permutation (called by André an "alternating permutation") of the $n$ numbers $1, 2, 3, \ldots, n$ is an arrangement $c_1, c_2, \ldots, c_n$ of these numbers in which no element $c_\nu$ possesses a magnitude such that it lies between its two neighbors $c_{\nu-1}$ and $c_{\nu+1}$.

If the points $P_1, P_2, \ldots, P_n$ are marked off on a system of coordinates such that their respective abscissas are $1, 2, \ldots, n$ and their respective ordinates $c_1, c_2, \ldots, c_n$, and each two successive points $P_\nu$ and $P_{\nu+1}$ are connected by a line segment, the zigzag line from which the permutation gets its name is obtained.

Fig. 2.
A zigzag line or zigzag permutation can begin either by rising or falling. We assert:

There are as many zigzag permutations (among $n$ elements) that begin by rising as by falling.

Proof. Let $P_1P_2\cdots P_n$ be the zigzag line corresponding to one zigzag permutation. Let us draw, through its highest and lowest points, parallels to the abscissa axis and a parallel midway between them. If we construct a mirror image of the zigzag line upon the middle parallel, the mirror image gives us a new zigzag line $Q_1Q_2\cdots Q_n$ or zigzag permutation, which begins either by falling or rising, depending upon whether the first zigzag line begins by rising or falling. Thus, for every zigzag permutation which begins by rising (or falling) we can obtain a corresponding zigzag permutation which begins by falling (or rising). Consequently, there is an equal number of each type.

Naturally there are just as many zigzag permutations that end by rising as by falling.

Let us, therefore, designate the number of zigzag permutations of $n$ elements as $2A_n$, so that $A_n$ represents the number of zigzag permutations of $n$ elements that begin (or end) by rising (or falling).

The number $A_n$ can be determined by a recurrence formula.

Let us consider all the $2A_n$ zigzag permutations of the $n$ elements $1, 2, \ldots, n$ as written down and let us single out one of them, in which the highest element $n$ occupies the $(r+1)$th place (counting from the left). To the left of $n$ there are then the $r$ elements $\alpha_1, \alpha_2, \ldots, \alpha_r$, while to the right of $n$ there are the $s$ numbers $\beta_1, \beta_2, \ldots, \beta_s$, with $r + s = m = n - 1$. The permutation $\alpha_1\alpha_2\ldots\alpha_r$ ends by falling, since $\alpha_r$ is followed by $n$, which is higher; the permutation $\beta_1\beta_2\ldots\beta_s$ begins by rising, since $\beta_1$ follows $n$, which is higher.
Now let there be formed from the $r$ elements $\alpha_1, \alpha_2, \ldots, \alpha_r$ a total of $A_r$ zigzag permutations with falling ends and, similarly, from the $s$ elements $\beta_1, \beta_2, \ldots, \beta_s$ a total of $A_s$ zigzag permutations with rising beginnings. Consequently, there are $A_r \cdot A_s$ zigzag permutations of $n$ elements in which $n$ occupies the $(r+1)$th position and in which to the left of $n$ there are the $r$ elements $\alpha_1, \alpha_2, \ldots, \alpha_r$. However, there are many other combinations of $m$ elements taken $r$ at a time aside from the considered combination $\alpha_1, \alpha_2, \ldots, \alpha_r$; as is commonly known, there are a total of $C_m^r = m_r = m!/(r!\,s!)$. There are consequently a total of

$$p_r = m_r A_r A_s \qquad (r + s = m)$$
zigzag permutations of $n$ elements in which the highest element ($n$) occupies the $(r+1)$th place.

It is also easily seen that this formula is also valid for the indices $r = 0, 1, 2$ if one sets $A_0 = A_1 = A_2 = 1$.

In order to obtain all the possible zigzag permutations we must obtain the expression $p_r$ for all the values from $r = 0$ through $r = m = n - 1$ and add the resulting products. This gives us

$$2A_n = \sum_{r=0}^{m} p_r = \sum_{r=0}^{m} m_r A_r A_s.$$

In order to simplify this formula somewhat further, we write $m!/(r!\,s!)$ instead of $m_r$ and set

$$(1)\qquad \frac{A_\nu}{\nu!} = a_\nu.$$

It is then transformed into

$$2n a_n = a_0 a_{n-1} + a_1 a_{n-2} + \cdots + a_{n-2} a_1 + a_{n-1} a_0,$$

or, utilizing the symbol for the sum, into

$$(2)\qquad 2n a_n = \sum a_r a_s,$$

where $r$ and $s$ pass through all the possible integral numbers $\geq 0$ for which $r + s = n - 1$.

Using the recurrence formula (2) it is possible to compute, beginning with $a_2$, each number of the series $a_0, a_1, a_2, a_3, a_4, \ldots$ from the numbers preceding it. From $a_n$, when it is multiplied by $n!$, it is possible to obtain half the number of zigzag permutations of $n$ elements.

We can draw up a table for the simplest cases:

n   =  0   1   2    3    4     5     6       7       8
a_n =  1   1   1/2  1/3  5/24  2/15  61/720  17/315  277/8064
A_n =  1   1   1    2    5     16    61      272     1385

We are able to confirm, for example, that the four elements 1, 2, 3, 4 yield $2 \cdot A_4 = 10$ zigzag permutations

1324, 2143, 3142, 4132, 1423, 2314, 3241, 4231, 2413, 3412.
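The table can be reproduced mechanically from formula (2), and small cases can be confirmed by listing all permutations. The following Python sketch is an illustration only; the function names `zigzag_coefficients`, `is_zigzag` and `count_zigzag` are chosen for this example, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def zigzag_coefficients(n_max):
    """Return [a_0, ..., a_n_max], starting from a_0 = a_1 = 1 and using
    formula (2), 2*n*a_n = sum of a_r*a_s over r + s = n - 1, for n >= 2."""
    a = [Fraction(1), Fraction(1)]
    for n in range(2, n_max + 1):
        total = sum(a[r] * a[n - 1 - r] for r in range(n))
        a.append(total / (2 * n))
    return a[: n_max + 1]


def is_zigzag(c):
    """True if no inner element lies between its two neighbours."""
    return all((c[i] - c[i - 1]) * (c[i + 1] - c[i]) < 0
               for i in range(1, len(c) - 1))


def count_zigzag(n):
    """Brute-force count of all zigzag permutations of 1..n (this is 2*A_n for n >= 2)."""
    return sum(1 for c in permutations(range(1, n + 1)) if is_zigzag(c))


if __name__ == "__main__":
    a = zigzag_coefficients(8)
    for n, a_n in enumerate(a):
        print(n, a_n, a_n * factorial(n))  # n, a_n, A_n
    print(count_zigzag(4))
```

The brute-force count is practical only for small $n$, since it inspects all $n!$ arrangements; the recurrence needs only on the order of $n^2$ multiplications.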
It is but a short step from the zigzag permutations to the series for $\sec x$ and $\tan x$.

First we establish that starting with the index 3 all $a_\nu$ are proper fractions $< \frac{1}{2}$. Since the number of zigzag permutations of $n$ elements for $n > 2$ is smaller than the number of all the permutations of $n$ elements, $2A_n$ must be $< n!$, and consequently $a_n < \frac{1}{2}$. Therefore, the infinite series

$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$

converges absolutely and uniformly over every interval $-h$ through $+h$ where $h < 1$. It therefore represents over this interval a continuous function that may be differentiated term by term. The derivative of $y$ is

$$y' = a_1 + 2a_2 x + 3a_3 x^2 + \cdots.$$

Since, moreover, the series for $y$ converges absolutely, we can square it and thereby obtain

$$y^2 = \sum_{n} b_n x^{n-1},$$

where $b_1 = 1$ and for all $n \geq 2$

$$b_n = a_0 a_{n-1} + a_1 a_{n-2} + a_2 a_{n-3} + \cdots + a_{n-1} a_0.$$

In accordance with (2), therefore, whenever $n \geq 2$, $b_n = 2n a_n$, and then

$$y^2 = 1 + 2\cdot 2a_2 x + 2\cdot 3a_3 x^2 + 2\cdot 4a_4 x^3 + \cdots.$$

If we then add one to both sides we obtain

$$1 + y^2 = 2\left[a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots\right]$$

or

$$1 + y^2 = 2y'.$$

We write this equation

$$\frac{y'}{1 + y^2} - \frac{1}{2} = 0$$

and reflect that the left side is the derivative of the function

$$Y = \text{arc tan}\, y - \tfrac{1}{2}x,$$
but that the derivative of a function ($Y$) can be zero only if this function is a constant. Thus we have

$$Y = \text{arc tan}\, y - \tfrac{1}{2}x = \text{const.}$$

In order to determine the constant, we set $x$ equal to zero and obtain for this value of the argument $x$

$$y = 1, \quad \text{arc tan}\, y = \frac{\pi}{4}, \quad\text{and}\quad Y = \frac{\pi}{4}.$$

The constant therefore has the value $\pi/4$, and our equation is transformed into

$$\text{arc tan}\, y = \frac{\pi}{4} + \frac{x}{2}.$$

From this it follows that

$$y = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right),$$

and we have the progression

$$(3)\qquad \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots,$$

which is true in any case for every proper fractional positive or negative value of $x$.

We replace $x$ in (3) by $-x$ and obtain

$$(4)\qquad \tan\left(\frac{\pi}{4} - \frac{x}{2}\right) = a_0 - a_1 x + a_2 x^2 - a_3 x^3 + - \cdots.$$

As is easily seen, however, the two trigonometric formulas

$$2\sec x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) + \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$

and

$$2\tan x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) - \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$

are true. If we introduce on the right-hand side here the series indicated in (3) and (4) we obtain the progressions for $\sec x$ and $\tan x$ which we were seeking:

$$\sec x = a_0 + a_2 x^2 + a_4 x^4 + a_6 x^6 + \cdots,$$

$$\tan x = a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \cdots$$
or, if we return to half the number of zigzag permutations, $A_n$,

$$\sec x = A_0 + A_2\frac{x^2}{2!} + A_4\frac{x^4}{4!} + A_6\frac{x^6}{6!} + \cdots,$$

$$\tan x = A_1 x + A_3\frac{x^3}{3!} + A_5\frac{x^5}{5!} + A_7\frac{x^7}{7!} + \cdots.$$

These two progressions are true in all cases for every proper fractional value of $x$. However, since $\sec x$ and $\tan x$ as functions of the complex argument $x$ are analytic functions of $x$ and the singular point closest to zero is $x = \pi/2$, the convergence circle has the radius $\pi/2$. The two power series for $\sec x$ and $\tan x$ consequently converge for every $x$ the absolute value of which lies below $\pi/2$.

Gregory's Arc Tangent Series

Determine the angles of a triangle from the sides without the use of tables.

If $a, b, c$ are the given sides of the triangle, $\alpha, \beta, \gamma$ the angles (given in arc measure), the following relations, as is well known, are obtained:

$$\tan\frac{\alpha}{2} = \frac{\rho}{u}, \qquad \tan\frac{\beta}{2} = \frac{\rho}{v}, \qquad \tan\frac{\gamma}{2} = \frac{\rho}{w},$$

where $\rho^2 = uvw/s$, $u = s - a$, $v = s - b$, $w = s - c$, $2s = a + b + c$. Thus, $\alpha/2$, $\beta/2$, $\gamma/2$ are the arcs whose tangents are $\rho/u$, $\rho/v$, $\rho/w$. We write

$$\frac{\alpha}{2} = \text{arc tan}\,\frac{\rho}{u}, \qquad \frac{\beta}{2} = \text{arc tan}\,\frac{\rho}{v}, \qquad \frac{\gamma}{2} = \text{arc tan}\,\frac{\rho}{w}.$$

Arc tan $x$ is understood to represent the arc whose tangent is $x$. The function arc tan $x$ is called a cyclometric function.

We can consider our problem solved if we can succeed in calculating the cyclometric function arc tan $x$ for any given $x$.

This can be calculated by means of the power series for the arc tangent function obtained in 1671 by the Scottish mathematician James Gregory (1638-1675).

To derive the arc tangent series we make use of the mean value of the function $F(x) = \frac{1}{1 + x^2}$, which we must consequently compute beforehand.
On a tangent of a unit circle $\mathfrak{K}$ we mark off from the point of tangency $A$ the two segments $Ap = v$ and $AP = V$ in such a manner that $Pp = \varphi = V - v$; we connect $p$ and $P$ with the center of the circle $O$ and designate the distances $Op$ and $OP$ as $r$ and $R$, their intersections with $\mathfrak{K}$ as $q$ and $Q$, and the arcs $Aq$, $AQ$, $qQ$ in that order as $w$, $W$, $\omega$. This gives us the equations

$$w = \text{arc tan}\, v, \qquad W = \text{arc tan}\, V, \qquad \omega = \text{arc tan}\, V - \text{arc tan}\, v.$$

We would like to enclose the area ($\varphi/2$) of the triangle $OPp$ between two bounds, and for this purpose we draw the two arcs $ph$ and $PH$ concentric to $qQ$ so that they meet $OP$ and the extension of $Op$ at $h$ and $H$. The area of the triangle is then greater than the area ($\frac{1}{2}r^2\omega$) of the sector $Oph$ but smaller than the area ($\frac{1}{2}R^2\omega$) of the sector $OPH$, so that

$$r^2\omega < \varphi < R^2\omega.$$

It follows from this that

$$\frac{1}{R^2} < \frac{\omega}{\varphi} < \frac{1}{r^2},$$

or, if instead of $\varphi$, $\omega$, $r^2$, and $R^2$ we write in the same order $V - v$, arc tan $V$ $-$ arc tan $v$, $1 + v^2$, $1 + V^2$ (Pythagoras),

$$(1)\qquad \frac{1}{1 + V^2} < \frac{\text{arc tan}\, V - \text{arc tan}\, v}{V - v} < \frac{1}{1 + v^2}.$$

In order to determine the mean value of the function $F(x) = \frac{1}{1 + x^2}$ over the interval 0 through $x$, i.e., the limiting value of

$$\mu = \frac{F(\delta) + F(2\delta) + \cdots + F(n\delta)}{n}$$

(where $\delta = x/n$), in (1) we substitute successively $0|\delta$, $\delta|2\delta$, $2\delta|3\delta$, ..., $(n-1)\delta|n\delta$ for the value pair $v|V$, add the resulting inequalities, and obtain

$$n\mu < \frac{\text{arc tan}\, x}{\delta} < n\mu + 1 - \frac{1}{1 + x^2}$$

or

$$\frac{\text{arc tan}\, x}{x} - \frac{x^2}{n(1 + x^2)} < \mu < \frac{\text{arc tan}\, x}{x}.$$

As the limit $n = \infty$ is approached this inequality is transformed into

$$(2)\qquad \mathfrak{M}_0^x\,\frac{1}{1 + x^2} = \frac{\text{arc tan}\, x}{x}.$$
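The two-sided estimate that leads to (2) can be checked numerically for a positive $x$. In the sketch below the function name `mean_value_quotient` is an assumption of this example, and the arc tangent itself is taken from Python's `math.atan` purely for comparison.

```python
import math


def mean_value_quotient(x, n):
    """The quotient mu = (F(d) + F(2d) + ... + F(nd)) / n
    with F(t) = 1 / (1 + t^2) and d = x / n."""
    delta = x / n
    return sum(1.0 / (1.0 + (k * delta) ** 2) for k in range(1, n + 1)) / n


if __name__ == "__main__":
    x = 0.8
    for n in (10, 100, 1000):
        upper = math.atan(x) / x
        lower = upper - x * x / (n * (1.0 + x * x))
        # mu should lie strictly between lower and upper.
        print(n, lower, mean_value_quotient(x, n), upper)
```

As $n$ grows, the lower bound closes in on the upper one, which is the content of the passage to the limit.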
Now for the derivation of the arc tangent series! It is

$$\frac{1}{1 + x^2} = 1 - \frac{x^2}{1 + x^2} \quad\text{or}\quad F = 1 - x^2 F,$$

if for the sake of brevity we write $F$ for $F(x)$. If we replace the $F$ on the right-hand side of this equation with $1 - x^2 F$, we obtain

$$F = 1 - x^2 + x^4 F.$$

If here we once again write $1 - x^2 F$ for $F$ on the right-hand side, we obtain

$$F = 1 - x^2 + x^4 - x^6 F.$$

In a similar manner, from this we obtain

$$F = 1 - x^2 + x^4 - x^6 + x^8 F,$$

$$F = 1 - x^2 + x^4 - x^6 + x^8 - x^{10} F,$$

etc. Consequently, we obtain the inequality

$$1 - x^2 + x^4 - x^6 + - \cdots - x^{4n-2} < F < 1 - x^2 + x^4 - x^6 + - \cdots + x^{4n}.$$

Obtaining the mean value here gives us

$$1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots - \frac{x^{4n-2}}{4n-1} < \frac{\text{arc tan}\, x}{x} < 1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots + \frac{x^{4n}}{4n+1}$$

or

$$(3)\qquad x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1} < \text{arc tan}\, x < x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots + \frac{x^{4n+1}}{4n+1}.$$

If we then set

$$\text{arc tan}\, x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1}$$

or

$$\text{arc tan}\, x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n-1} + \frac{x^{4n+1}}{4n+1},$$
the error thereby incurred is smaller than the difference $x^{4n+1}/(4n+1)$ of the boundaries of (3). Since, however, this difference tends toward zero when $n$ becomes infinitely great and $x$ is a proper fraction (also when $x = 1$), we obtain the progression

$$(4)\qquad \text{arc tan}\, x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots \qquad (\text{for } x \leq 1).$$

This is Gregory's formula. If the progression is interrupted at any point the error incurred is smaller than the first disregarded term.

The series cannot be used when $x$ is an improper fraction, because it no longer converges. In order to calculate arc tan $x$ in this case we introduce $y = 1/x$, the reciprocal value of $x$, and make use of the formula

$$(5)\qquad \text{arc tan}\, x + \text{arc tan}\, y = \frac{\pi}{2}.$$

[If arc tan $x = \alpha$, i.e., $x = \tan\alpha$, then from

$$\tan\left(\frac{\pi}{2} - \alpha\right) = \frac{1}{\tan\alpha} = \frac{1}{x} = y$$

we obtain by inversion

$$\frac{\pi}{2} - \alpha = \text{arc tan}\, y \quad\text{or}\quad \frac{\pi}{2} = \text{arc tan}\, x + \text{arc tan}\, y.]$$

We then obtain arc tan $y$ in accordance with Gregory's formula and arc tan $x$ in accordance with (5).

But even if $x$ is a proper fraction the arc tangent series is not advisable when $x$ is very close to 1. In this case we introduce $z = \frac{1 - x}{1 + x}$, the half reciprocal value of $x$, and make use of the formula

$$(6)\qquad \text{arc tan}\, x + \text{arc tan}\, z = \frac{\pi}{4}.$$

[If arc tan $x = \alpha$, i.e., $x = \tan\alpha$, then from

$$\tan\left(\frac{\pi}{4} - \alpha\right) = \frac{1 - \tan\alpha}{1 + \tan\alpha}$$

we obtain by inversion

$$\frac{\pi}{4} - \alpha = \text{arc tan}\,\frac{1 - x}{1 + x} \quad\text{or}\quad \frac{\pi}{4} = \text{arc tan}\, x + \text{arc tan}\, z.]$$
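The interplay of Gregory's series (4) with the two reduction formulas (5) and (6) can be sketched in Python as follows. Several choices here are assumptions of this example and not part of the text: the function names, the tolerance `tol`, the threshold 0.5 above which (6) is applied, and the use of the library constant `math.pi` for $\pi$.

```python
import math


def gregory(x, tol=1e-15):
    """Sum x - x^3/3 + x^5/5 - ... until the next term drops below tol.
    Intended for small |x|, where the series converges quickly."""
    total, power, k = 0.0, x, 0
    while abs(power) / (2 * k + 1) > tol:
        total += (-1) ** k * power / (2 * k + 1)
        power *= x * x
        k += 1
    return total


def arctan(x):
    """Arc tangent of x via series (4), reduced by formulas (5) and (6)."""
    if x < 0:
        return -arctan(-x)                        # arc tan is an odd function
    if x > 1:
        return math.pi / 2 - arctan(1.0 / x)      # formula (5), y = 1/x
    if x > 0.5:                                   # assumed threshold
        z = (1.0 - x) / (1.0 + x)                 # formula (6)
        return math.pi / 4 - gregory(z)
    return gregory(x)


if __name__ == "__main__":
    for x in (0.1, 0.9, 1.0, 3.0):
        print(x, arctan(x), math.atan(x))
```

With these reductions the series is only ever summed for arguments between 0 and 0.5, so a few dozen terms suffice for double precision.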
Thus we obtain arc tan $z$ with Gregory's formula and then arc tan $x$ with (6).

Note. If in (4) we set $x = 1$, we obtain the so-called Leibniz series:

$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots,$$

which was discovered by Leibniz independently of Gregory in 1674.

It is not advisable, however, to use this series to calculate $\pi$. The series discovered by the English mathematician John Machin (d. 1751), which was published by him in 1706, is much better suited for this purpose.

Machin made use of the auxiliary angle $A$ whose tangent is $\frac{1}{5}$. From $\tan A = \frac{1}{5}$ it follows that

$$\tan 2A = \frac{2\tan A}{1 - \tan^2 A} = \frac{5}{12},$$

and from this, similarly, that

$$\tan 4A = \frac{2\tan 2A}{1 - \tan^2 2A} = \frac{120}{119}.$$

Inversion gives us $4A = \text{arc tan}\,\frac{120}{119}$ or

$$\text{arc tan}\,\frac{120}{119} = 4\,\text{arc tan}\,\frac{1}{5}.$$

The left side of this equation, according to (5), has the value $\frac{\pi}{2} - \text{arc tan}\,\frac{119}{120}$; arc tan $\frac{119}{120}$, however, according to (6), has the value $\frac{\pi}{4} - \text{arc tan}\,\frac{1}{239}$, so that the left side is $\frac{\pi}{4} + \text{arc tan}\,\frac{1}{239}$. Consequently,

$$\frac{\pi}{4} = 4\,\text{arc tan}\,\frac{1}{5} - \text{arc tan}\,\frac{1}{239},$$

or written out completely:

$$\frac{\pi}{4} = 4\left(\frac{1}{5} - \frac{1}{3\cdot 5^3} + \frac{1}{5\cdot 5^5} - + \cdots\right) - \left(\frac{1}{239} - \frac{1}{3\cdot 239^3} + \frac{1}{5\cdot 239^5} - + \cdots\right).$$

Using this series, Machin calculated $\pi$ to 100 decimal places.

Buffon's Needle Problem

On a table, parallel lines are drawn at intervals $d$. A needle of length $l$ smaller than $d$ is thrown at random on the table. What is the probability that the needle will touch one of the parallels?
This remarkable problem stems from Georges Louis Leclerc, Comte de Buffon (1707-1788), who was the first man to clothe probability problems in geometric form.

The probability of an event is commonly understood to mean the ratio of the number of cases favoring an event to the total number of possible cases.

Let the probability we are seeking be $W$. Let the needle have the terminal points $A$ and $B$. Let us imagine the parallels extended horizontally. Let us single out two such adjacent parallels I and II (below I) and from any point $P$ on line I let us drop a perpendicular $PQ$ ($= d$) to line II.

Let us begin by considering the special positions $\mathfrak{S}$ of the needle which are characterized by the following three conditions: (1) the terminal point $A$ lies on the segment $PQ$; (2) the needle lies to the right of $QP$; (3) the needle forms with $AP$ an acute angle: the inclination of the needle toward $QP$.

Let the probability that the needle touches parallel I in any of the special positions be $w$.

First we will show that $W = w$.

If we consider all of the positions $\mathfrak{S}'$ in which the needle touches with its terminal point $A$ either end of the segment $PQ$ but is otherwise arbitrarily situated (i.e., touching either I or II or neither), this quadruples (as compared to the number of positions $\mathfrak{S}$) both the number of all the possible cases and the number of all the favorable cases. The probability of touching one of the two parallels I and II in all of the positions $\mathfrak{S}'$ is, therefore, likewise $w$.

If to the cases $\mathfrak{S}'$ we add those positions in which the terminal point $B$ instead of terminal point $A$ comes to rest on the segment $PQ$, we obtain a total of $\mathfrak{S}''$ positions, which doubles the number of possible cases as well as the number of favorable cases. Consequently, the probability of touching one of the parallels I and II in the positions $\mathfrak{S}''$ is also $w$.
Now if instead of taking one perpendicular $PQ$ we take a very great number $\nu$ of very closely situated equidistant successive perpendiculars between I and II and consider all the positions of the needle in which one end of the needle comes to rest upon one of these $\nu$ perpendiculars, we thereby multiply by $\nu$ (with respect to $\mathfrak{S}''$) the number of all the possible as well as that of all the favorable cases.
Consequently, the probability of touching one of the parallels I and II by a needle position in which one needle end lies between I and II is again $w$.

The addition of still a third parallel III representing a mirror image of I on II (or of II on I), as well as the addition of the needle positions in which one end of the needle lies between III and II (or between III and I), again give us a probability of $w$.

In short, we have shown that $W = w$.

Consequently, our problem has been limited to the task of determining the probability $w$ of the needle touching line I in a special position.

Fig. 3.

To obtain a better view of the infinitely great number of special positions, let us divide the above segment $PQ$ into a very great number $N$ (say $N > 1000^{1000}$) of equal parts and let us consider all of the cases in which the needle end $A$ falls on one of the dividing points. For each dividing point there are an infinitely great number of possibilities corresponding to the infinitely great number of possible needle angles. For convenience in considering these possibilities also, let us consider only the $M$ angles

$$\theta_0 = 0, \quad \theta_1 = \varepsilon, \quad \theta_2 = 2\varepsilon, \quad \theta_3 = 3\varepsilon, \quad \ldots, \quad \theta_{M-1} = (M-1)\varepsilon,$$

where $M$ likewise represents a very great number (e.g., M > 22'3) and $\varepsilon$ is the $M$th part of $\pi/2$.
In this manner our consideration involves $N$ points and $M$ angles, thus, a total of $NM$ needle positions. However, only a certain fraction of these positions, namely $w$, are favorable. In order to determine this fraction we begin by obtaining the total number of only those favorable positions in which the angle of inclination of the needle has the selected value $\theta_s$, as illustrated in Figure 3. These positions form a parallelogram $EFGP$ with the sides $EF = l$ and $EP = l\cos\theta_s$. Since there are

$$N \cdot \frac{EP}{PQ} = N\,\frac{l}{d}\cos\theta_s$$

dividing points on the segment $EP$, our overall total comprises $N\,\frac{l}{d}\cos\theta_s$ favorable positions (with the common needle angle $\theta_s$). The number $n$ of all the favorable positions altogether is consequently

$$n = N\,\frac{l}{d}\left(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}\right).$$

The probability that we are seeking is, therefore,

$$w = \frac{n}{NM} = \frac{l}{d}\cdot\frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$

There remains then only the task of determining the value of the fraction

$$m = \frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$

The fraction $m$ is no different from the mean value of the cosine function over the interval 0 through $\pi/2$. Those who are familiar with the elements of integral calculus will immediately be able to write this mean value; it is

$$m = \int_0^{\pi/2} \cos x\,dx \Big/ \frac{\pi}{2} = \frac{2}{\pi}.$$

Those readers who are not familiar with this type of calculation can obtain $m$ just as easily in the following adroit manner.

Draw a quadrant of a circle with a radius of 1, designating the horizontal arm as $OH$ and the vertical as $OK$. If this is rotated about the radius $OK$ it forms a hemisphere the area of whose surface is commonly known to be $2\pi$.
The area of this surface can be expressed in a different form. For this purpose let us move the above angles of inclination $\theta_0, \theta_1, \theta_2, \ldots, \theta_{M-1}$ so that the angles are formed at $O$ with $OH$. The resulting free arms divide the quadrant into $M$ very small arcs with the common length $\varepsilon$. Let us select from among them the one lying between the free arms of the angles $\theta_s$ and $\theta_{s+1}$. On being rotated it forms a very small spherical zone, which when flattened out to a strip possesses the length $2\pi\cos\theta_s$ and the height $\varepsilon$, so that the area is then $2\pi\varepsilon\cos\theta_s$. Since the sum of all the spherical zones obtained in this manner gives the hemisphere, we obtain the equation

$$2\pi\varepsilon\left(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}\right) = 2\pi$$

or, since $M\varepsilon = \pi/2$,

$$\frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M} = \frac{2}{\pi}.$$

Thus, we have obtained the mean value that we were seeking.

The mean value of the cosine function (naturally that of the sine function also) over the interval 0 through $\pi/2$ is $2/\pi$.

[This also follows from formulas (1) and (2) of No. 15.]

At the same time we obtain

$$w = \frac{l}{d}\,m = \frac{l}{d}\cdot\frac{2}{\pi} \quad\text{or}\quad W = \frac{2}{\pi}\cdot\frac{l}{d}.$$

This formula gives us the probability we were seeking.

Note. Wolf in Zurich (1850) arrived at the original idea of using the obtained formula to calculate the number $\pi$. Experimentally, by a great number (5000) of throws with a needle 36 mm long and a distance of 45 mm between the parallels, he found the probability $W$ to be (approximately) 0.5064, and obtained

$$\pi = \frac{2l}{dW} = 3.1596.$$

The Englishmen Smith (1855) and Fox (1864) repeated the experiment and found with 3200 and 1100 throws, respectively, values of 3.1553 and 3.1419 for $\pi$.
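Experiments of this kind can be imitated with pseudo-random numbers. The Python sketch below is an illustration under stated assumptions, not a reconstruction of any historical procedure: the function name `buffon_estimate`, the fixed seed, and the parametrization by the needle's midpoint and its angle to the parallels are choices made for this example. Note also that the simulation itself draws the angle using `math.pi`, so it illustrates the formula rather than providing an independent computation of $\pi$.

```python
import math
import random


def buffon_estimate(throws, l=36.0, d=45.0, seed=0):
    """Simulate needle throws (length l < spacing d) and return (W, pi_estimate),
    where W is the observed touching frequency and pi_estimate = 2*l / (d*W)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        y = rng.uniform(0.0, d)            # midpoint height between two parallels
        theta = rng.uniform(0.0, math.pi)  # angle between needle and the parallels
        half = 0.5 * l * math.sin(theta)   # half of the needle's vertical extent
        if y <= half or y >= d - half:     # needle reaches the lower or upper line
            hits += 1
    W = hits / throws
    pi_estimate = 2.0 * l / (d * W) if hits else float("nan")
    return W, pi_estimate


if __name__ == "__main__":
    # Same needle length and spacing (in mm) as in the note above.
    print(buffon_estimate(5000))
    print(buffon_estimate(200000))
```

The statistical error of such an estimate shrinks only like the reciprocal square root of the number of throws, which is why a few thousand throws give no more than two or three reliable digits.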
The Fermat-Euler Prime Number Theorem

Every prime number of the form $4n + 1$ can be represented in only one manner as the sum of two squares.

This famous theorem was discovered about 1660 by Pierre de Fermat (1601-1665), the greatest French mathematician of the seventeenth century. It was not published, however, until 1670, when it appeared, unfortunately without proof, in the notes to the works of Diophantus, edited by Fermat's son. It is not certain whether or not Fermat had obtained the proof.

The first proof of the theorem was presented almost 100 years later by Leonhard Euler in his treatise "Demonstratio theorematis Fermatiani, omnem numerum primum formae 4n + 1 esse summam duorum quadratorum" (Novi Commentarii Academiae Petropolitanae ad annos 1754-1755, vol. V), after years of fruitless attempts at its solution.

Today there are several proofs of the Fermat-Euler theorem. The following proof is distinguished by its great simplicity.

For the reader who is unfamiliar with problems of number theory we will provide several explanations that will be necessary for understanding this proof and will also be found useful for the problem dealt with in No. 22. At the same time, it is to be understood that the letters used here and in No. 22 represent whole numbers.

Two numbers $a$ and $b$ (according to Gauss) are called congruent to the modulus $m$, written

$$a \equiv b \bmod m,$$

read $a$ congruent to $b$ modulo $m$, when their difference is divisible by $m$. Every number, for example, in regard to the modulus (to the modulus, modulo) $m$, is congruent to the residue it leaves over when divided by $m$, for example $65 \equiv 2 \bmod 7$. And this is also true when the word residue is taken in its most general sense, in which it means the residue left after division when the quotient is arbitrarily chosen. If, for instance, we write $65/7 = 12$, we remain with a residue of $-19$.
Among the many possible residues two are of special importance: the conventional or common residue, which is positive and smaller than the divisor, and the minimal residue, the magnitude of which never exceeds half the divisor. A minimal residue of the division 89/13 is, for example, $-2$, because $89/13 = 7 - \frac{2}{13}$, which can also be written $89 \equiv -2 \bmod 13$.
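The three kinds of residue just described can be made concrete with a few lines of Python. The function names below are chosen for this illustration; the sketch relies on the fact that Python's `%` operator returns a value in `0 .. m-1` for a positive modulus, and it resolves the tie that occurs for even `m` in favor of the positive value, which is an arbitrary choice of this example.

```python
def residue_with_quotient(a, m, q):
    """Residue left when the quotient of a by m is arbitrarily chosen as q."""
    return a - m * q


def common_residue(a, m):
    """Common residue of a modulo m: non-negative and smaller than m (m > 0)."""
    return a % m


def minimal_residue(a, m):
    """Minimal residue of a modulo m: magnitude never exceeds m / 2 (m > 0)."""
    r = a % m
    return r - m if 2 * r > m else r


if __name__ == "__main__":
    print(residue_with_quotient(65, 7, 12))  # residue for the chosen quotient 12
    print(common_residue(65, 7))
    print(minimal_residue(89, 13))
```

All three values differ from the original number by a multiple of the modulus, which is exactly what congruence requires.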
The following self-evident rules apply to congruences to the same modulus:

1. If two numbers are congruent to a third, they are also congruent to each other.

2. Two congruences can be added, subtracted, and multiplied. From $A \equiv B \bmod m$, $a \equiv b \bmod m$ it follows that $A \pm a \equiv B \pm b \bmod m$ and $Aa \equiv Bb \bmod m$. [From $A = B + Gm$ and $a = b + gm$ it follows, for example, that $Aa = Bb + \Gamma m$ ($\Gamma$ integral), i.e., $Aa \equiv Bb \bmod m$.]

3. The congruence $a \equiv b \bmod m$ may be multiplied by any whole number $g$: $ag \equiv bg \bmod m$. It can be divided by $g$ only when $g$ is a common divisor of $a$ and $b$ that has no common divisor with the modulus. If, for example, we divide $49 \equiv 14 \bmod 5$ by 7, we obtain a correct congruence $7 \equiv 2 \bmod 5$.

A system of $m$ integral numbers no two of which are congruent modulo $m$ is called a complete residue system to the modulus $m$. The simplest complete residue system is the system of the $m$ common residues $0, 1, 2, \ldots, m - 1$, and the next simplest is the system of $m$ minimal residues. Every number $z$ is congruent modulo $m$ to one and only one number of a complete residue system mod $m$.

Of particular importance is the following theorem:

Theorem: If the numbers of a complete residue system are multiplied by a number possessing no common divisor with the modulus, there is obtained once again a complete residue system with respect to the modulus.

Proof. Let $m$ be the modulus, $a$ the multiplier possessing no common divisor with $m$. If then for two different numbers $x$ and $x'$ of the given residue system

$$ax \equiv ax' \bmod m$$

were true, it would follow from congruence rule 3 that $x \equiv x' \bmod m$, which, however, is not the case.
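The theorem can be tried out on small moduli. In the Python sketch below the function names `is_complete_residue_system` and `scaled_system` are assumptions of this illustration; `math.gcd` from the standard library is used to tell whether the multiplier shares a divisor with the modulus.

```python
from math import gcd


def is_complete_residue_system(numbers, m):
    """True if `numbers` consists of m integers, no two congruent modulo m."""
    return len(numbers) == m and {x % m for x in numbers} == set(range(m))


def scaled_system(numbers, a):
    """Multiply every number of the system by a."""
    return [a * x for x in numbers]


if __name__ == "__main__":
    m = 12
    system = list(range(m))  # the m common residues 0, 1, ..., m - 1
    for a in range(1, m):
        # The scaled system is complete exactly when gcd(a, m) == 1.
        print(a, gcd(a, m), is_complete_residue_system(scaled_system(system, a), m))
```

For a multiplier that does share a divisor with the modulus, such as 4 with modulus 12, several products fall into the same residue class and the system is no longer complete.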
From this theorem it follows directly that:

The congruence $ax \equiv b \bmod m$, in which $a$ and $m$ possess no common divisor, possesses in each complete residue system mod $m$ one and only one "root" $x$.

Quadratic Residues

Of two numbers possessing no common divisor one is called the quadratic residue of the other when it is congruent to a square number with respect to the other as modulus; if there is no such square number it is called a quadratic nonresidue. For example, 12 is a quadratic residue of 13, since $12 \equiv 8^2 \bmod 13$; $-1$ is a quadratic nonresidue of 3, since there exists no square number $x^2$ such that $x^2 \equiv -1 \bmod 3$.

The following theorems concerning quadratic residues and nonresidues apply to an odd prime number modulus $p$:

I. There are a total of $\mathfrak{p} = (p - 1)/2$ mutually incongruent quadratic residues and just as many mutually incongruent nonresidues of $p$. The former are $1^2, 2^2, 3^2, \ldots, \mathfrak{p}^2$, or whichever numbers are congruent to them mod $p$.

II. The product of two residues is a residue, the product of a residue and a nonresidue is a nonresidue, and finally, the product of two nonresidues is a residue.

Proof of I. 1. If two of the designated squares were congruent to each other, for example $x^2 \equiv y^2 \bmod p$, the product $(x + y)(x - y)$ [which is equal to $x^2 - y^2$] would be divisible by $p$, which is impossible, because both of its factors are smaller than $p$.

2. If we continue the series of squares beyond $\mathfrak{p}^2$, no new residues are obtained. The square $(\mathfrak{p} + h)^2$, for example, is congruent to $k^2 \bmod p$ if $k \leq \mathfrak{p}$ is so determined that $\mathfrak{p} + h + k$ is divisible by $p$, since then $\mathfrak{p} + h \equiv -k$ and moreover $(\mathfrak{p} + h)^2 \equiv k^2 \bmod p$.

Since there are (aside from the number divisible by $p$, disregarded here) $2\mathfrak{p}$ numbers mutually incongruent mod $p$, there must be a total of $\mathfrak{p}$ mutually incongruent quadratic nonresidues of $p$.

Proof of II. Let $R$ and $r$ be quadratic residues, $N$ and $n$ quadratic nonresidues of $p$.

1. From $A^2 \equiv R$, $a^2 \equiv r \bmod p$ we obtain by multiplication $(Aa)^2 \equiv Rr \bmod p$.
Consequently, $Rr$ is a residue.

2. The $2\mathfrak{p}$ numbers $1^2, 2^2, \ldots, \mathfrak{p}^2, N \cdot 1^2, N \cdot 2^2, \ldots, N \cdot \mathfrak{p}^2$ are mutually incongruent mod $p$. Since the first $\mathfrak{p}$ of these numbers are quadratic residues of $p$, and since only $\mathfrak{p}$ residues exist, the $\mathfrak{p}$ numbers $N \cdot 1^2, N \cdot 2^2, \ldots, N \cdot \mathfrak{p}^2$ must be nonresidues, i.e., $NR$ is a nonresidue.

3. The $2\mathfrak{p}$ numbers $n \cdot 1^2, n \cdot 2^2, n \cdot 3^2, \ldots, n \cdot \mathfrak{p}^2, n \cdot N \cdot 1^2, n \cdot N \cdot 2^2, \ldots, n \cdot N \cdot \mathfrak{p}^2$ are mutually incongruent mod $p$. The first $\mathfrak{p}$ of these numbers are nonresidues in accordance with 2.; consequently, since by I. only $\mathfrak{p}$ nonresidues exist, the others must be residues; however, among them is the product of the two nonresidues $N$ and $n$. Q.E.D.

Let us now consider the bilinear congruence

(0) $xy \equiv D \bmod p$,

in which the modulus $p$ is once again an odd prime number, $D$ a given number possessing no common divisor with $p$, and the "mutually conjugate" or "linked" magnitudes $x$ and $y$ are chosen in such a manner from the system $S$ of the numbers $1, 2, 3, \ldots, p - 1$ that (0) is satisfied. For each $x$ from $S$ there is then only one conjugate $y$. [From $xy \equiv D \bmod p$ and $xy' \equiv D \bmod p$ it follows that $xy \equiv xy' \bmod p$ and from this $y \equiv y' \bmod p$ or $y - y' \equiv 0 \bmod p$. However, since both $y$ and $y'$ are at most $p - 1$, their difference is divisible by $p$ only when $y' = y$.]

We select $x_1$ arbitrarily from $S$ and determine $y_1$ such that $x_1 y_1 \equiv D \bmod p$. Then we select from $S$ a number $x_2$ that differs from $x_1$ and $y_1$ and determine $y_2$ such that $x_2 y_2 \equiv D \bmod p$. $y_2$ then is different from $x_1$ as well as from $y_1$. We continue in this manner until all the numbers of $S$ have been arranged in the resulting congruences.

Here there are two cases to be distinguished:

1. $y_\nu$ never equals $x_\nu$. In other words: the congruence $x^2 \equiv D \bmod p$ is impossible; $D$ is a quadratic nonresidue of $p$. We then obtain exactly $\mathfrak{p} = (p - 1)/2$ pairs $x_\nu, y_\nu$ of conjugate numbers, and multiplication of the $\mathfrak{p}$ congruences formed gives

(1) $(p - 1)! \equiv D^{\mathfrak{p}} \bmod p$.

2. For a certain index $\nu$, $y_\nu = x_\nu$, thus $x_\nu^2 \equiv D \bmod p$; $D$ is a quadratic residue of $p$. If aside from $\nu$ there is also an index $\mu$ for which the same occurs, then $x_\mu^2 \equiv D \bmod p$, and so $x_\mu^2 \equiv x_\nu^2 \bmod p$, i.e., $x_\mu^2 - x_\nu^2$ or $(x_\mu + x_\nu)(x_\mu - x_\nu)$ is divisible by $p$. Since $x_\mu - x_\nu$ is not
divisible by $p$, $x_\mu + x_\nu$ must be divisible by $p$, and consequently $x_\mu = p - x_\nu$. Actually, then $x_\mu^2 = p^2 - 2p x_\nu + x_\nu^2 \equiv x_\nu^2 \equiv D \bmod p$. Equal linked magnitudes thus occur exactly twice if they occur at all.

In our case ($y_\nu = x_\nu$, $y_\mu = x_\mu$) we now have only $\mathfrak{p} - 1$ congruences $x_s y_s \equiv D \bmod p$, where $y_s$ differs from $x_s$. To these $\mathfrak{p} - 1$ congruences we add the congruence $x_\nu x_\mu \equiv -D \bmod p$, multiply all $\mathfrak{p}$ congruences and obtain

(2) $(p - 1)! \equiv -D^{\mathfrak{p}} \bmod p$.

This is the case when, for example, $D = 1$, since then $1^2 \equiv D \bmod p$. Then we have the congruence

(2a) $(p - 1)! \equiv -1 \bmod p$,

which represents the so-called Wilson theorem.

Using Wilson's formula we write instead of (1) and (2)

(1a) $D^{\mathfrak{p}} \equiv -1 \pmod p$

(2a) $D^{\mathfrak{p}} \equiv 1 \pmod p$

and obtain Euler's theorem: The number $D$ that possesses no common divisor with the prime number $p$ is a quadratic residue or a nonresidue of $p$, depending on whether $D^{\mathfrak{p}}$ is congruent mod $p$ to the positive or the negative unit.

The introduction of the Legendre symbol makes it possible to express this criterion of the residue character of a number by a formula. The Legendre symbol $\left(\frac{D}{p}\right)$ represents the positive or negative unit, depending on whether $D$ is a quadratic residue or a nonresidue of $p$. Thus, for example, $\left(\frac{2}{7}\right) = 1$, since $3^2 - 2$ is divisible by 7, whereas $\left(\frac{2}{3}\right) = -1$, since there is no square number whose difference from 2 is divisible by 3.

When this symbol is used Euler's criterion assumes the simple form

(3) $\left(\frac{D}{p}\right) \equiv D^{\mathfrak{p}} \bmod p$, with $\mathfrak{p} = \frac{p - 1}{2}$.

In the simple case $D = -1$, congruence (3) is transformed into the equation

(4) $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$,
since in this case both sides of (3) are units, and the difference between two units is divisible by the odd prime number $p$ only when these units are equal.

Now $\frac{p - 1}{2}$ is even or odd, depending on whether the prime number $p$ is of the form $4n + 1$ or $4n + 3$. In the first case, then, $\left(\frac{-1}{p}\right) = +1$, i.e., $-1$ is a quadratic residue of $p$, and in the second case $\left(\frac{-1}{p}\right) = -1$, i.e., $-1$ is a quadratic nonresidue of $p$. Consequently, the following is true:

Theorem of Euler: The negative unit is a quadratic residue of the prime number $p$ when $p$ has the form $4n + 1$ and a quadratic nonresidue when $p$ has the form $4n + 3$.

In other words: The pure quadratic congruence $x^2 + 1 \equiv 0 \bmod p$ has integral solutions $x$ when $p$ has the form $4n + 1$ and has none when $p$ has the form $4n + 3$.

Now for the proof of the Fermat-Euler theorem! The following proof is based upon the above theorems and the

Norm theorem: If a prime number goes into a norm but not into the bases of the norm, it is itself a norm.

A norm is understood to mean the sum of the squares of two whole numbers, which are the "bases" of the norm.

Proof of the norm theorem. Let the prime number $p$ go into the norm $a^2 + b^2$, but not into its bases $a$ and $b$, so that

(5) $a^2 + b^2 = pf$,

it being assumed that the factor $f$ is greater than 1 but smaller than $p/2$. This assumption does not represent a limitation of the theorem, since from $A^2 + B^2 = pF$, with $F > p/2$, we can immediately form the equation $a^2 + b^2 = pf$, with $f < p/2$, if the minimal residues $A - hp$ and $B - kp$ of the divisions $A/p$ and $B/p$, respectively, are taken for $a$ and $b$, respectively. On the one hand, $a^2 + b^2 = [A^2 + B^2] - 2(Ah + Bk)p + (h^2 + k^2)p^2$ is divisible by $p$, and thus $a^2 + b^2 = pf$,
while on the other hand, since $|a| < \tfrac{1}{2}p$ and $|b| < \tfrac{1}{2}p$, $a^2 + b^2$ is smaller than $\tfrac{1}{2}p^2$, or $pf < \tfrac{1}{2}p^2$, or $f < \tfrac{1}{2}p$. Moreover, $p$ does not go into either $a$ or $b$, because then (contrary to our assumption) it would go into $A = a + hp$ or into $B = b + kp$.

We determine the minimal residues $\alpha = a - mf$ and $\beta = b - nf$ of the divisions $a/f$ and $b/f$ and obtain similarly

(6) $\alpha^2 + \beta^2 = ff'$, with $f' \le \tfrac{1}{2}f$.

Multiplication of (5) and (6) gives us $(a^2 + b^2)(\alpha^2 + \beta^2) = pf^2f'$ or

$(a\alpha + b\beta)^2 + (a\beta - b\alpha)^2 = pf^2f'$.

Since

$a\alpha + b\beta = [a^2 + b^2] - (am + bn)f = a'f$,
$a\beta - b\alpha = (bm - an)f = b'f$,

the equation obtained is written

(7) $a'^2 + b'^2 = pf'$, where $f' \le \tfrac{1}{2}f$.

Here $f'$ cannot disappear. If $f' = 0$, then in accordance with (6) $\alpha = 0$ and $\beta = 0$, and from this it follows that $a = mf$ and $b = nf$; then according to (5) $p = (m^2 + n^2)f$. In this event $p$ would have to be divisible by $f$, and then $f$ would have to equal 1, which contradicts our premise.

If, then, $f' = 1$, (7) already gives us the norm expression of $p$. If $f' > 1$, we obtain from (7)

(8) $a''^2 + b''^2 = pf''$ with $0 < f'' \le \tfrac{1}{2}f'$,

just as (7) was obtained from (5). This method of constructing new equations with continuously diminishing factors $f, f', f'', \ldots$ is continued until the factor 1 appears. The corresponding equation gives the prime number $p$ represented as a norm.

Now we will prove

I. A prime number $q$ of the form $4n + 3$ cannot be represented as a norm.

II. Every prime number $p$ of the form $4n + 1$ can be represented as a norm in only one way.

Proof of I. If it were true that $a^2 + b^2 = q$,
then it would follow that $b^2 \equiv -a^2 \bmod q$, and the product $(-1)(a^2)$ of a quadratic nonresidue $(-1)$ and a residue $(a^2)$ of $q$ would be a quadratic residue $(b^2)$ of $q$, which according to the above is impossible.

Proof of II. According to Euler's theorem there is a whole number $x$ such that the norm $x^2 + 1$ is divisible by $p$. According to the norm theorem, $p$ is then itself a norm: $p = a^2 + b^2$.

Here also there is only one possible norm representation. If we assume a second such representation, $p = A^2 + B^2$ (where $a, b, A, B$ represent four different positive numbers), it follows that

$p^2 = (a^2 + b^2)(A^2 + B^2) = (Aa \pm Bb)^2 + (Ab \mp Ba)^2$,

where either the two upper signs or the two lower signs are possible. Then, since the product of the two factors $Aa + Bb$ and $Aa - Bb$,

$A^2a^2 - B^2b^2 = A^2(a^2 + b^2) - b^2(A^2 + B^2)$,

is divisible by $p$, one of the factors must be divisible by $p$. Consequently, we select the upper or lower signs depending upon whether the first or second factor is divisible by $p$. Then either

$Aa + Bb = p$ and at the same time $Ab - Ba = 0$

or

$Ab + Ba = p$ and at the same time $Aa - Bb = 0$,

thus, either $A^2b^2 = B^2a^2$ or $A^2a^2 = B^2b^2$. From the first of these equations it follows that

$\dfrac{A^2}{a^2} = \dfrac{B^2}{b^2} = \dfrac{A^2 + B^2}{a^2 + b^2} = 1$,

and from the second

$\dfrac{A^2}{b^2} = \dfrac{B^2}{a^2} = \dfrac{A^2 + B^2}{b^2 + a^2} = 1$;

thus, from the first $A = a$, while from the second $A = b$, both of which contradict the initial assumption, which requires that $A \ne a$ and $A \ne b$.

There is therefore only one way of representing $p$ as a norm, and the Fermat-Euler theorem is proved.
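Euler's criterion (3) and the descent used in the proof of the norm theorem can both be traced numerically. The following Python sketch is an illustration only; the function names are hypothetical, and the root of $x^2 + 1 \equiv 0 \bmod p$ is found by plain search, an assumption made for brevity rather than a method taken from the text.

```python
def legendre(D, p):
    """Legendre symbol (D/p) for an odd prime p not dividing D, by criterion (3)."""
    value = pow(D, (p - 1) // 2, p)  # D^((p-1)/2) mod p, always 1 or p - 1 here
    return -1 if value == p - 1 else value


def minimal_residue(a, f):
    """Residue of a mod f whose absolute value does not exceed f/2."""
    r = a % f
    return r - f if 2 * r > f else r


def two_squares(p):
    """Write a prime p of the form 4n + 1 as a^2 + b^2 by the descent (5) to (8)."""
    # Euler's theorem guarantees a root x of x^2 + 1 = 0 (mod p).
    x = next(x for x in range(1, p) if (x * x + 1) % p == 0)
    a, b = minimal_residue(x, p), 1
    f = (a * a + b * b) // p  # equation (5): a^2 + b^2 = p * f
    while f > 1:
        alpha, beta = minimal_residue(a, f), minimal_residue(b, f)
        # equation (7): both new bases are exact multiples of f before division
        a, b = (a * alpha + b * beta) // f, (a * beta - b * alpha) // f
        f = (a * a + b * b) // p
    return tuple(sorted((abs(a), abs(b))))


print(legendre(2, 7), legendre(2, 3))  # the two examples given for the symbol
print(two_squares(13), two_squares(29))
```

For $p = 13$ the search gives $x = 5$, so that $5^2 + 1^2 = 13 \cdot 2$; one descent step with $f = 2$ then yields $3^2 + 2^2 = 13$.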
The Fermat Equation

Find the integral solutions of the equation

$x^2 - dy^2 = 1$,

in which $d$ is a nonquadratic positive whole number.

This extremely important problem of number theory was posed by Pierre Fermat in 1657, first to his friend Frénicle and then to all contemporary mathematicians. The first solution, a very complicated one, was obtained by the Englishmen Lord Brouncker and John Wallis.

The simplest and best solutions to this problem were discovered by Euler, Lagrange, and Gauss. [Euler: "De usu novi algorithmi...," Novi Commentarii Academiae Petropolitanae ad annum 1765. Lagrange: "Solution d'un problème d'arithmétique," Miscellanea Taurinensia, vol. IV, 1768. Gauss: Disquisitiones arithmeticae, 1801.] They are all based upon the properties of periodic continued fractions. We will examine a somewhat modified form of this method with the more general equation

$X^2 - DY^2 = 4$,

which includes the original Fermat equation (with $X = 2x$, $Y = y$, $D = 4d$) as a special case, but includes as well the case in which $D$ leaves a residue of 1 on being divided by 4.

For the sake of convenience we shall write the continued fraction

$a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \cdots}}}$

in the abbreviated form $(a, b, c, d, \ldots)$.

A purely periodic continued fraction with an $n$-term period has the form

$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots)$,

so that we may write

$u = (g_1, g_2, \ldots, g_N, u)$,

where $N$ is an integral multiple of $n$, which we will assume to be even for reasons presently to be described. The terms (partial denominators) $g_1, g_2, \ldots$ are assumed to be positive whole numbers. If we designate the numerator and denominator of the $N$th approximation
$(g_1, g_2, \ldots, g_N)$ and of the $(N - 1)$th approximation $(g_1, g_2, \ldots, g_{N-1})$ as $P$ and $Q$ and $p$ and $q$, respectively, then according to continued fraction theory we obtain the two equations

(1) $Pq - Qp = 1$

and

(2) $u = \dfrac{Pu + p}{Qu + q}$,

the second of which may also be expressed in the form

(2a) $Qu^2 - Hu - p = 0$ with $H = P - q$.

The discriminant $D = H^2 + 4Qp$ of the quadratic equation (2a) has, according to (1), the value $H^2 + 4Pq - 4 = (P + q)^2 - 4$; it is consequently smaller by 4 than a square number and therefore cannot itself be a square number. Its (positive) root $r = \sqrt{D}$ is therefore irrational. Moreover, since $r > H$ (because $r^2 = H^2 + 4Qp$), the second root $\bar u = (H - r)/2Q$ of the quadratic equation is negative, so that the first root $(H + r)/2Q$ represents our continued fraction $u$, which is an improper fraction (greater than 1).

To obtain information about the magnitude of $\bar u$ we form the product of the roots $u\bar u = -p/Q$ and obtain

$-\bar u = \dfrac{p/Q}{u}$.

Since $P > p$ and $Q > q$, then

$-\bar u < \dfrac{P/Q}{u}$ and $-\bar u < \dfrac{p/q}{u}$.

One of the right-hand fractions, however, is a proper fraction, since the value $u$ of the continued fraction lies between the two successive approximations $p/q$ and $P/Q$; therefore, $-\bar u$ must be a proper fraction.

A quadratic equation with integral coefficients and a nonquadratic discriminant whose first root is a positive improper fraction while the second root is a negative proper fraction is called a reduced equation, and its first root is called a reduced number.

Our conclusion therefore reads: Every purely periodic continued fraction whose value is an improper fraction is a reduced number.

We will now show conversely that the continued fraction of a reduced number is purely periodic.

First, we will solve the problem: Obtain the first root $u = (r - b)/2a$ of the quadratic equation

(3) $au^2 + bu + c = 0$
with integral coefficients having no common divisor and the positive nonquadratic discriminant $D = r^2 = b^2 - 4ac$ in the form of a continued fraction.

We write

$u = g + \dfrac{1}{u'}$,

where $g$ is the largest whole number below $u$ (in the following to be designated as $[u]$) and $u'$ a positive improper fraction. We introduce three new magnitudes $a', b', c'$ that are of the opposite sign and equal to the magnitudes $ag^2 + bg + c$, $2ag + b$, and $a$, and we obtain

$u' = \dfrac{1}{u - g} = \dfrac{2a}{r + b'} = \dfrac{2a(r - b')}{r^2 - b'^2} = \dfrac{r - b'}{2a'}$

with

$b'^2 - 4a'c' = b^2 - 4ac = D$.

Consequently, $u'$ is the first root of the quadratic equation

(3') $a'u'^2 + b'u' + c' = 0$,

which likewise belongs to the discriminant $D$ and possesses coefficients having no common divisor. (If $a', b', c'$ possessed a common divisor, the latter, because of the equations $-c' = a$, $-b' = 2ag + b$, $-a' = ag^2 + bg + c$, would go into $a, b, c$, which contradicts our assumption.)

We call the new equation (3') the derivative of the initial equation (3) and its first root $u'$ the derivative of $u$.

The new coefficients $a', b', c'$ are calculated in practice in accordance with the following system:

    a      b          c
           ga
           ga + b     g(ga + b)
    a'     b'         c'

We add the two terms of the third column and change the sign of the sum, thus obtaining $a'$. We add the two lower terms of the second column, change the sign of the sum and get $b'$. We change the sign of $a$ and get $c'$.

The derived quadratic equation (3') is treated in exactly this manner and the process continued as far as desired.

The following example is presented to make the process completely clear.
Expand the positive root of the quadratic equation

$3u^2 - 10u - 1 = 0$

into a continued fraction.

The discriminant is 112, thus $r = 10.58\ldots$. In the scheme we will write in only the coefficients of the successive quadratic equations, each of which is the derivative of the preceding one, together with the auxiliary numbers $ga$, $ga + b$, $g(ga + b)$. In the last column we will write the first root of the appropriate equation and the highest integer contained in it, which is at the same time the correct partial denominator of the continued fraction.

    a     b     c   |  ga   ga+b   g(ga+b)  |  first root
    3   -10    -1   |   9    -1      -3     |  (10.58… + 10)/6 = 3 + …
    4    -8    -3   |   8     0       0     |  (10.58… + 8)/8  = 2 + …
    3    -8    -4   |   9     1       3     |  (10.58… + 8)/6  = 3 + …
    1   -10    -3   |  10     0       0     |  (10.58… + 10)/2 = 10 + …
    3   -10    -1   |

Since we come back to the initial equation, the expansion is purely periodic, and we obtain

$\dfrac{\sqrt{112} + 10}{6} = (3, 2, 3, 10, 3, 2, 3, 10, \ldots)$.

Now for the proof of the theorem that the expansion of a reduced number yields a purely periodic continued fraction!

Since the first root $u$ of the reduced equation

$au^2 + bu + c = 0$

is a positive improper fraction, and the second one, $\bar u$, is a negative proper fraction, then according to the relations

$u\bar u = \dfrac{c}{a}$, $\quad u + \bar u = -\dfrac{b}{a}$
between roots and coefficients, both the free term $c$ and the coefficient $b$ of the linear term of a reduced equation are always negative (the coefficient $a$ is assumed to be always positive).

In accordance with the expansion examined above we write

(4) $u = g + \dfrac{1}{u'}$ with $g = [u]$ and $u' > 1$.

From $u' = 1/(u - g)$ it follows initially that the first root $u'$ of the derived equation is a positive improper fraction. If we then transform $r$ into $-r$ in the equation $u' = 1/(u - g)$, the equation assumes the form $\bar u' = 1/(\bar u - g)$ and shows that the second root $\bar u'$ is a negative proper fraction. The derivative of a reduced equation or number is consequently also reduced, so that only reduced numbers occur in the continued fraction expansion of a reduced number.

If we write the conjugate of (4) in the form

$-\dfrac{1}{\bar u'} = g - \bar u$,

we see that $g$ can also be taken as the greatest integer that is contained in the reciprocal value of opposite sign of the second root of the derived equation.

Now, the number of all the reduced numbers corresponding to a given discriminant $D$ is finite. (From $D = b^2 - 4ac$ and $-ac > 0$ it follows first that the $b$'s must be sought only among the numbers of the series $-1, -2, \ldots, -[r]$. Of these the only ones that need be considered are those for which $D - b^2$ is divisible by 4. We select these, and for each such $b$ we determine the pairs of numbers $a, c$ [with $a > 0$, $c < 0$] for which $-ac = (D - b^2)/4$, which in turn gives us a finite quantity of numbers $a$ and $c$. Each number triplet $a, b, c$ obtained in this way, however, leads to a reduced equation $au^2 + bu + c = 0$ and thus to a reduced number $u$ only when $2a$ lies between $r + b$ and $r - b$.)

Consequently, in the continued fraction expansion of a reduced number $U$ there must reappear after a finite number of steps a reduced number previously obtained, e.g., in such manner:

$U = (K, L, u)$, $\quad u = (h, k, l, u)$.
But since, in accordance with the above, both $l$ and $L$ represent the greatest integer that is contained in the reciprocal value of $\bar u$ of opposite sign, $L = l$. Similarly, we find that $K = k$.
Consequently,

$U = (k, l, h, k, l, h, \ldots)$,

i.e.: The expansion of a reduced number yields a purely periodic continued fraction.

After these preliminaries the solution of the Fermat equation becomes quite simple. We will show:

I. that the continued fraction expansion of any reduced number belonging to the discriminant $D$ yields an infinite number of solutions of the Fermat equation;

II. that every solution of the equation is obtained by this expansion.

I. Let

$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots)$

be the positive root of the reduced equation

(5) $au^2 + bu + c = 0$

with the discriminant $D$ and coefficients possessing no common divisor. Also, let

$\dfrac{P}{Q} = (g_1, g_2, \ldots, g_N)$

be an approximation of $u$ and the index number $N$ an even number that is a multiple of $n$, and let

$\dfrac{p}{q} = (g_1, g_2, \ldots, g_{N-1})$

be the preceding approximate fraction; then, according to (2a),

(5') $Qu^2 - Hu - p = 0$ $\quad (H = P - q)$.

Since the roots of (5) and (5') agree and the coefficients of (5) possess no common divisor, it must be possible to obtain (5') from (5) by multiplication with a certain whole number $y$, such that

(6) $\dfrac{Q}{a} = \dfrac{-H}{b} = \dfrac{-p}{c} = y$.

If we then introduce the whole number

(7) $x = P + q$,

we obtain from (6) and (7)

$x^2 - b^2y^2 = (P + q)^2 - (P - q)^2$ and $4acy^2 = -4Qp$,
from which by addition we obtain

$x^2 - Dy^2 = 4(Pq - Qp)$,

and, using (1),

$x^2 - Dy^2 = 4$.

II. Conversely, now let $x \mid y$ represent a solution of the Fermat equation

(8) $x^2 - Dy^2 = 4$

in nonzero positive integers $x$ and $y$, and let $u$ represent the first root of a reduced equation $au^2 + bu + c = 0$ with the discriminant $D$. Making use of (6) and (7), we obtain the four nonzero positive integers

$P = \dfrac{x - by}{2}$, $\quad Q = ay$, $\quad p = -cy$, $\quad q = \dfrac{x + by}{2}$.

(It is immediately obvious that $Q$ and $p$ are such numbers, whereas for $P$ and $q$ it follows from equation (8), if we make use also of the equation $D = b^2 - 4ac$ to write:

$(x + by)(x - by) = x^2 - b^2y^2 = 4(1 - acy^2) = 4(1 + Qp)$.

We are then able to conclude from the appearance of the nonzero integer on the right, which is divisible by 4, that the two integral factors $2q$ and $2P$ of the product on the left-hand side have to be even and not equal to zero.)

According to (8) they satisfy the equation

(9) $Pq - Qp = 1$.

If we then replace the coefficients $a, b, c$ in the reduced equation with $Q/y$, $-(P - q)/y$, $-p/y$, we get

(10) $u = \dfrac{Pu + p}{Qu + q}$.

Before we get from here to the continued fraction expansion, we still have to prove that $Q \ge q$.

It is true that $2(Q - q) = [2a - b]y - x$. Since the second root $\bar u$ of the reduced equation is a negative proper fraction, it follows that $r + b < 2a$ or $2a - b > r$. Consequently,

$2(Q - q) > ry - x = (r^2y^2 - x^2)/(ry + x) = -4/(ry + x)$
or $Q - q > -2/(ry + x)$. However, since $D = r^2 = b^2 - 4ac$ is at least equal to 5, $y$ is at least 1, and $x$ at least 3, it follows that $ry + x > 5$ and from this $Q - q > -0.4$, i.e., $Q \ge q$. Q.E.D.

We now expand $P/Q$ into a continued fraction $(\gamma_1, \gamma_2, \ldots, \gamma_\nu)$ with the even number of terms $\nu$ in such a manner that between it and the last approximate fraction $p'/q'$ there exists the relation

(9') $Pq' - Qp' = 1$.

From (9) and (9') it then follows by subtraction that

$P(q' - q) = Q(p' - p)$.

However, since $q \le Q$, $q' < Q$, and $(q' - q)$ is divisible by $Q$, $q'$ must equal $q$ and therefore $p'$ must also equal $p$.

We then obtain

$(\gamma_1, \gamma_2, \ldots, \gamma_\nu, u) = \dfrac{Pu + p}{Qu + q}$,

i.e., because of (10),

$u = (\gamma_1, \gamma_2, \ldots, \gamma_\nu, u)$.

Every solution $x \mid y$ of the Fermat equation can therefore be obtained by the expansion of any reduced number $u$ as a continued fraction.

Final result: The Fermat equation

$x^2 - Dy^2 = 4$

has an infinite number of solutions; these can all be obtained in accordance with rules (6) and (7) from those approximation values, comprising whole periods and an even number of terms, that are obtained from the expansion as a continued fraction of any arbitrarily selected reduced number belonging to the discriminant $D$.

Example. Find the smallest solution $x \mid y$ of the Fermat equation

$x^2 - 112y^2 = 4$.

A reduced equation applying to the discriminant 112 is the equation treated above,

$3u^2 - 10u - 1 = 0$;

the expansion of the reduced number $u$ reads

$u = (3, 2, 3, 10, 3, 2, 3, 10, \ldots)$

and has a four-termed period. The first four approximate fractions are

$\dfrac{3}{1}$, $\quad \dfrac{7}{2}$, $\quad \dfrac{p}{q} = \dfrac{24}{7}$, $\quad \dfrac{P}{Q} = \dfrac{247}{72}$.
Since here $a = 3$, $b = -10$, $c = -1$, we find, in accordance with (6) and (7), that $x = 254$, $y = 24$.

It now remains to be shown that there is at least one reduced number corresponding to each discriminant $D$.

1. If $D = 4n$ and $g$ is the maximum integer that is contained in $\sqrt{n}$, then

$a = 1$, $\quad b = -2g$, $\quad c = g^2 - n$

are the coefficients of a reduced equation.

Proof. The discriminant of the equation is $b^2 - 4ac = 4n = D$. Moreover, $r + b < 2a < r - b$, since $2\sqrt{n} - 2g < 2 < 2\sqrt{n} + 2g$.

2. If $D = 4n + 1$ and $g$ is the largest integer for which $g^2 + g$ will be smaller than $n$ (so that $(g + 1)^2 + (g + 1) > n$ or $g^2 + 3g + 2 > n$), then

$a = 1$, $\quad b = -(2g + 1)$, $\quad c = g^2 + g - n$

are the coefficients of a reduced equation.

Proof. The discriminant of the equation is $b^2 - 4ac = 4n + 1 = D$. Also, $r + b < 2a < r - b$, since

$\sqrt{D} - (2g + 1) < 2 < \sqrt{D} + 2g + 1$.

(That $\sqrt{D} - 2g - 1 < 2$ follows from the above condition $g^2 + 3g + 2 > n$. On multiplication by 4 this becomes $4g^2 + 12g + 9 > 4n + 1$, i.e., it becomes $(2g + 3)^2 > D$. From this it follows that $2g + 3 > \sqrt{D}$ or $\sqrt{D} - 2g - 1 < 2$.)

Note. If we have found the minimal solution of the Fermat equation (e.g., by the method just presented), we can find the other solutions (we will consider only positive solutions) in a simpler manner after Lagrange.

We assign to each solution $x \mid y$ the "Lagrange number"

$z = \tfrac{1}{2}(x + yr)$

and call $x$ and $y$ the components of the Lagrange number.
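The continued-fraction procedure, from the choice of a reduced equation through rules (6) and (7), can be followed step by step in code. The Python sketch below is a hedged illustration: the function names are hypothetical, and it assumes, as the text does, a nonquadratic discriminant $D$ that is divisible by 4 or leaves the residue 1 on division by 4.

```python
from math import isqrt


def reduced_equation(D):
    """Coefficients (a, b, c) of a reduced equation for D, by cases 1 and 2."""
    if isqrt(D) ** 2 == D or D % 4 not in (0, 1):
        raise ValueError("D must be nonquadratic and of the form 4n or 4n + 1")
    n = D // 4
    if D % 4 == 0:
        g = isqrt(n)                      # largest integer contained in sqrt(n)
        return 1, -2 * g, g * g - n
    g = 0
    while (g + 1) ** 2 + (g + 1) < n:     # largest g with g^2 + g < n
        g += 1
    return 1, -(2 * g + 1), g * g + g - n


def first_period(a, b, c):
    """One period of the expansion of the first root of a reduced equation."""
    r = isqrt(b * b - 4 * a * c)          # integer part of r = sqrt(D)
    start, terms = (a, b, c), []
    while True:
        g = (r - b) // (2 * a)            # g = [u] for u = (r - b) / 2a
        terms.append(g)
        # derived equation: a', b', c' as defined for equation (3')
        a, b, c = -(a * g * g + b * g + c), -(2 * a * g + b), -a
        if (a, b, c) == start:
            return terms


def solution_from_period(a, b, c):
    """Solution x, y of x^2 - D*y^2 = 4 from rules (6) and (7)."""
    terms = first_period(a, b, c)
    if len(terms) % 2:                    # N must be even: take two periods
        terms = terms * 2
    P, Q, p, q = terms[0], 1, 1, 0        # first approximation and its predecessor
    for g in terms[1:]:
        P, Q, p, q = g * P + p, g * Q + q, P, Q
    return P + q, Q // a                  # x = P + q, y = Q / a


print(first_period(3, -10, -1))
print(solution_from_period(3, -10, -1))
print(solution_from_period(*reduced_equation(112)))
```

Run on the example equation $3u^2 - 10u - 1 = 0$, the sketch reproduces the period 3, 2, 3, 10 and the solution $x = 254$, $y = 24$. Starting instead from the reduced equation of case 1 for $D = 112$ gives a different expansion, which is consistent with the statement that any reduced number belonging to $D$ may be used.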
We will first prove the auxiliary theorem. The product and the quotient (taken so as to be greater than 1) $\zeta = \tfrac{1}{2}(\xi + \eta r)$ of two Lagrange numbers $Z = \tfrac{1}{2}(X + Yr)$ and $z = \tfrac{1}{2}(x + yr)$ is also a Lagrange number.

Proof. We immediately find that $\zeta\bar\zeta = 1$ or $\xi^2 - D\eta^2 = 4$ with

$\xi = \dfrac{Xx \pm DYy}{2}$, $\quad \eta = \dfrac{Yx \pm Xy}{2}$,

where the upper sign is used when we are concerned with the product and the lower when we are concerned with the quotient. From $X > rY$ and $x > ry$ it follows that $Xx > DYy$, so that $\xi$ is positive in every case. From

$\dfrac{X^2}{Y^2} = D + \dfrac{4}{Y^2}$, $\quad \dfrac{x^2}{y^2} = D + \dfrac{4}{y^2}$

it follows in the case of $\zeta = Z/z$, since then $Y > y$, that $X/Y < x/y$ or $Yx > Xy$, so that $\eta$ is also positive in every case. Consequently, $\zeta$ is positive and greater than 1, because $\zeta\bar\zeta = 1$.

Now it merely remains to show that $\xi$ and $\eta$ are integers.

Either $D$ is divisible by 4 or $D$ leaves a residue of 1 on division by 4. In the first case $X$ and $x$ are even. In the second case every solution of the Fermat equation consists either of two even or two odd numbers. In all cases $\xi$ and $\eta$ are consequently integers.

The method mentioned above is based upon the theorem:

Every Lagrange number is a power of the smallest Lagrange number with an integral exponent.

Proof. Let $x \mid y$ be the minimal solution of the Fermat equation and thus $z = \tfrac{1}{2}(x + yr)$ the smallest Lagrange number. First it follows from the auxiliary theorem that every power of $z$ is a Lagrange number. Now let $Z = \tfrac{1}{2}(X + Yr)$ be a Lagrange number that is not a power of $z$. Then there must certainly exist two successive powers $\mathfrak{z} = z^n$ and $\mathfrak{z}' = z^{n+1}$ between which $Z$ is situated. From

$z^n < Z < z^{n+1}$

it follows on division by $z^n$ that

$1 < Z/\mathfrak{z} < z$.
Thus, the Lagrange number $\zeta = Z/\mathfrak{z}$ would be smaller than the smallest Lagrange number $z$, which is naturally absurd.

Consequently, the only Lagrange numbers are the powers

$z, z^2, z^3, z^4, \ldots$

And the simplest way of finding the 2nd, 3rd, ... solution of the Fermat equation is to find them as components of the Lagrange numbers $z^2, z^3, \ldots$.

The Fermat-Gauss Impossibility Theorem

Prove that the sum of two cubic numbers cannot be a cubic number.

Thus, what must be proved is that the equation

$x^3 + y^3 = z^3$

cannot be satisfied by nonzero integers $x, y, z$.

The theorem that we have to prove is a special case of the famous Fermat impossibility theorem, which was expressed by Fermat in the following way in the Arithmetica of Diophantus, edited by Fermat's son and published in 1670: "It is impossible to divide a cube into two cubes, a fourth power into two fourth powers, and in general any power except the square into two powers with the same exponents."

Fermat added: "I have discovered a truly wonderful proof of this, but the margin (of the notebook) is too narrow to hold it." Unfortunately, Fermat neglected to disclose this "wonderful proof."

Fermat's impossibility theorem became very famous as a result of the fact that many of the greatest mathematicians since Fermat, including Euler, Legendre, Gauss, Dirichlet, Kummer, and others, tried unsuccessfully to obtain the general proof of this theorem. To the present day a proof of the impossibility of the equation

$x^n + y^n = z^n$

is known only for special values of the exponent $n$, e.g., for the values from 3 to 100, and even this proof involves extraordinary complications and difficulties.

In the following we will limit ourselves to the simplest case, the case $n = 3$.

The impossibility of the equation

$x^3 + y^3 = z^3$
was demonstrated by Euler in his algebra, which appeared in 1770, and later by Gauss (Complete Works, vol. II).

This problem shows, as it often happens in mathematics, that the proof of a more general theorem is easier to obtain than that of a special case. To prove the impossibility of

(1) $a^3 + b^3 = c^3$

for the common integers $a, b, c$, Euler had to resort to a relatively complicated method; Gauss, on the other hand, proved simply and clearly the impossibility of the more general equation

(2) $\alpha^3 + \beta^3 = \gamma^3$

for any numbers $\alpha, \beta, \gamma$ of the form $xJ + yO$, where $x$ and $y$ are any integers and

$J = \dfrac{1 + i\sqrt{3}}{2}$ and $O = \dfrac{1 - i\sqrt{3}}{2}$

are cube roots of the (negative) unit.

For convenience in notation we will call numbers of the form $xJ + yO$ (in which $x$ and $y$ are integers) G-numbers.

That the case treated by Euler is simply a special case of (2) is apparent from the fact that every integer $g$ is also a G-number: $g = gJ + gO$.

The G-numbers (which are the integers of the so-called field of the cubic unit roots) have many properties in common with common integers. Readers unfamiliar with these properties will find all the information necessary for an understanding of the Gauss proof in the supplement that follows the proof.

Gauss' Proof of the Impossibility of the Equation

(2) $\alpha^3 + \beta^3 = \gamma^3$.

First, let Greek letters designate G-numbers and small Roman letters common integers.

We then replace $\alpha, \beta, \gamma$ with $\xi, \eta, -\zeta$, transforming (2) first into the symmetrical equation

(3) $\xi^3 + \eta^3 + \zeta^3 = 0$,

of which we assume that no two of the three "bases" $\xi, \eta, \zeta$ have a common divisor; we will then refer to this equation as a Gauss equation. [The assumption we have just made in no way
limits the proof. If, for example, $\xi$ and $\eta$ possessed a common prime factor $\delta$, then, in accordance with (3), $\delta$ would also go into $\zeta^3$ and consequently into $\zeta$, so that division by $\delta^3$ would eliminate the divisor $\delta$ from (3).]

The impossibility of (3) is obtained from the two following theorems, which we will derive from the assumption of the existence of (3).

I. In every Gauss equation one and only one of the three bases (we will call it the special base) has the prime divisor $\pi = J - O$.

II. For every Gauss equation there is a second Gauss equation in which the special base contains the divisor $\pi$ fewer times than the special base of the first equation.

These two theorems, however, contradict each other. By continued application of II. it is possible to obtain a Gauss equation that no longer contains a special base, which contradicts theorem I.

Proof of I. If none of the three bases $\xi, \eta, \zeta$ were divisible by $\pi$, then

$\xi^3 \equiv e$, $\quad \eta^3 \equiv f$, $\quad \zeta^3 \equiv g \bmod 9$ with $e^2 = f^2 = g^2 = 1$

and consequently, because of (3),

$e + f + g \equiv 0 \bmod 9$,

which is, however, impossible. Therefore a situation such as the following must exist:

$\zeta \equiv 0 \bmod \pi$, $\quad \xi \not\equiv 0 \bmod \pi$, $\quad \eta \not\equiv 0 \bmod \pi$.

Proof of II. It follows from $\zeta^3 \equiv 0 \bmod \pi^3$, according to (3), that $\xi^3 + \eta^3 \equiv 0 \bmod \pi^3$, and since $\xi^3 \equiv e \bmod 9$, $\eta^3 \equiv f \bmod 9$, so that $e + f \equiv 0 \bmod \pi^3$, then $e + f \equiv 0 \bmod 3$ must be true; from this it follows that $f = -e$. Now $\xi^3 + \eta^3 \equiv e + f = 0 \bmod 9$, and consequently

$\zeta^3 \equiv 0 \bmod \pi^4$ and $\zeta \equiv 0 \bmod \pi^2$.

From $\xi^3 + \eta^3 \equiv 0 \bmod \pi^3$ and the identity

$\xi^3 + \eta^3 = \varphi\psi\chi$,

where

$\varphi = \xi J + \eta O$, $\quad \psi = \xi O + \eta J$, $\quad \chi = \xi + \eta$,

it follows that at least one of the factors $\varphi, \psi, \chi$ is divisible by $\pi$. From this and from

$\varphi - \psi = (\xi - \eta)\pi$, $\quad \varphi + \psi = \chi$

it follows that each one of the factors $\varphi, \psi, \chi$ is divisible by $\pi$, so that

$\varphi = \pi\varphi'$, $\quad \psi = \pi\psi'$, $\quad \chi = \pi\chi'$

with G-numbers $\varphi', \psi', \chi'$. No pair of the numbers $\varphi', \psi', \chi'$ possesses a common divisor.
[If, for example, $\varphi'$ and $\psi'$ possessed a common divisor $\delta$, then $\delta$ would also go into $\varphi' - \psi' = \xi - \eta$ and into $\pi(\varphi' + \psi') = \xi + \eta$, and then $2\xi$ and $2\eta$ would also be divisible by $\delta$, so that $\delta$ would be equal to 2. Then we would either have $\xi = 2\lambda + \varepsilon$, $\eta = 2\mu + \varepsilon$, or $\xi = 2\lambda + \varepsilon$, $\eta = 2\mu - \varepsilon$, with $\varepsilon^3 = \pm 1$, and then $\varphi = 2\nu + \varepsilon$ or $\varphi = 2\nu + \varepsilon\pi$, which, however, is not divisible by $\delta = 2$.]

If we now set $\zeta/\pi = \omega$, then

$\omega^3 = -\varphi'\psi'\chi'$ with $\varphi' + \psi' = \chi'$.

Since then no pair of $\varphi', \psi', -\chi'$ possesses a common divisor, these three magnitudes, apart from possible unit factors $\alpha, \beta, \gamma$, must be cubes of numbers $\rho, \sigma, \tau$, no pair of which possesses a common divisor:

$\varphi' = \alpha\rho^3$, $\quad \psi' = \beta\sigma^3$, $\quad -\chi' = \gamma\tau^3$ with $\alpha^6 = \beta^6 = \gamma^6 = 1$,

so that

(4) $\omega^3 = \alpha\beta\gamma\,\rho^3\sigma^3\tau^3$, $\quad \alpha\rho^3 + \beta\sigma^3 + \gamma\tau^3 = 0$.

The cube of $\kappa = \omega/\rho\sigma\tau$ is accordingly the G-unit $\alpha\beta\gamma$; since $\kappa^3 \equiv E \bmod 9$, then $\alpha\beta\gamma \equiv E \bmod 9$ also, and consequently

$\alpha\beta\gamma = E$ with $E^2 = 1$.

From $\omega \equiv 0 \bmod \pi$ it follows, for example, that $\tau \equiv 0 \bmod \pi$ and $\rho \not\equiv 0$, $\sigma \not\equiv 0 \bmod \pi$. Then, however, $\rho^3 \equiv e$ and $\sigma^3 \equiv f \bmod 9$ ($e^2 = f^2 = 1$), and consequently, according to (4), $e\alpha + f\beta \equiv 0 \bmod 3$, and from this $e\alpha + f\beta = 0$. Thus, we obtain

$\beta = F\alpha$, $\quad F\alpha^2\gamma = E$ (with $F^2 = 1$)

and from (4)

$F\alpha^3\rho^3 + \alpha^3\sigma^3 + E\tau^3 = 0$.

If we write here $\xi', \eta', \zeta'$ in place of $F\alpha\rho$, $\alpha\sigma$, $E\tau$, respectively, we finally obtain the Gauss equation

(3') $\xi'^3 + \eta'^3 + \zeta'^3 = 0$,

into the special base $\zeta'$ of which the factor $\pi$ goes fewer times than into the special base $\zeta$ of (3).
Supplement. Properties of G-numbers

I. The magnitudes $J$ and $O$ satisfy the following equations:

$J + O = 1$, $\quad JO = 1$, $\quad J^2 + O = 0$, $\quad O^2 + J = 0$, $\quad J^3 = -1$, $\quad O^3 = -1$.

II. The sum, difference, and products of G-numbers are also G-numbers. The product of the two numbers $aJ + bO$ and $a'J + b'O$ is, for example (according to I.), $pJ + qO$ with $p = ab' + ba' - bb'$ and $q = ab' + ba' - aa'$.

III. Norm. The norm of a complex number $z = \xi + i\eta$ is commonly understood to be the product

$z_0 = N(z) = z\bar z = (\xi + i\eta)(\xi - i\eta) = \xi^2 + \eta^2$

of the two mutually conjugate numbers $z$ and $\bar z = \xi - i\eta$.

The norm of the G-number $aJ + bO$ accordingly has the value $a^2 + b^2 - ab$.

It is a positive integer which disappears only when $a$ and $b$ are both zero. The smallest conceivable norms of G-numbers are 1, 2, 3.

From $a^2 + b^2 - ab = 1$ we obtain one of the six following cases:

    a =   1    0   -1    0    1   -1
    b =   0    1    0   -1    1   -1

There are thus six G-numbers: $J, -J, O, -O, 1, -1$ with the norm 1.

The equation $a^2 + b^2 - ab = 2$ has no solution that is an integer. There is consequently no G-number whose norm is 2.

The equation $a^2 + b^2 - ab = 3$ finally has six integral solutions

$a = 1, b = -1$; $\quad a = -1, b = 1$; $\quad a = 1, b = 2$; $\quad a = -1, b = -2$; $\quad a = 2, b = 1$; $\quad a = -2, b = -1$.
Accordingly, there are six G-numbers with the norm 3: the numbers $\pi = J - O = i\sqrt{3}$, $\pi J$, $\pi O$, and their conjugates $\bar{\pi} = -\pi$, $-\pi O$, $-\pi J$.

The norm of the product of two numbers is equal to the product of the norms of these numbers.

Proof. $N(\alpha\beta) = \alpha\beta \cdot \overline{\alpha\beta} = \alpha\beta \cdot \bar{\alpha}\bar{\beta} = \alpha\bar{\alpha} \cdot \beta\bar{\beta} = N(\alpha) \cdot N(\beta)$.

IV. Units. A G-number $\varepsilon$ is called a unit, or more accurately a G-unit, when its reciprocal value $\eta$ is also a G-number. From $\varepsilon\eta = 1$ it follows by norm formation that $\varepsilon_0\eta_0 = 1$, i.e., $\varepsilon_0 = 1$. According to III., there are consequently six G-units: $J$, $-J$, $O$, $-O$, $1$, $-1$. These six units are the integral powers of $J$ or $O$, e.g., $J$, $J^2$, $J^3$, $J^4$, $J^5$, and $J^6$.

V. Associated numbers. The six numbers that are obtained when a G-number $\zeta$ is multiplied by the six G-units are called the associated numbers of $\zeta$. The six associated numbers of $\pi = J - O$ are, for example,

$$\pi J = -1 - O, \quad \pi J^2 = -1 - J, \quad \pi J^3 = -\pi, \quad \pi J^4 = 1 + O, \quad \pi J^5 = 1 + J, \quad \pi J^6 = \pi.$$

VI. Division. The quotient $q = \alpha/\beta$ of two G-numbers $\alpha$ and $\beta$ is not necessarily a G-number. If it is a G-number, however, $\beta$ is called a divisor (G-divisor) of $\alpha$, or one says that $\beta$ goes into $\alpha$.

In order to divide any G-number $\alpha$ by any other $\beta$, we write

$$\frac{\alpha}{\beta} = \frac{\alpha\bar{\beta}}{\beta\bar{\beta}} = \frac{\alpha\bar{\beta}}{\beta_0} = \frac{hJ + kO}{\beta_0} = \frac{h}{\beta_0}J + \frac{k}{\beta_0}O.$$

Here we divide each rational fraction $h/\beta_0$ and $k/\beta_0$ into the integral components $m$ and $n$, respectively, and the rational components $\mathfrak{r}$ and $\mathfrak{s}$, respectively, the absolute value of which never exceeds $\tfrac{1}{2}$ [Example: $3.8 = 4 - 0.2$]; we set $mJ + nO = \kappa$, $\mathfrak{r}J + \mathfrak{s}O = \mathfrak{R}$, and obtain

$$\frac{\alpha}{\beta} = \kappa + \mathfrak{R} \quad \text{or} \quad \alpha = \kappa\beta + \mathfrak{R}\beta.$$

From $\mathfrak{R}\beta = \alpha - \kappa\beta$ it follows that $\mathfrak{R}\beta$ is a G-number $\gamma$, and we have $\alpha = \kappa\beta + \gamma$. Here $\gamma_0 = \mathfrak{R}_0\beta_0 = (\mathfrak{r}^2 + \mathfrak{s}^2 - \mathfrak{r}\mathfrak{s})\beta_0$. Since, however, $|\mathfrak{r}| \le \tfrac{1}{2}$ and $|\mathfrak{s}| \le \tfrac{1}{2}$, then $\mathfrak{R}_0$ must certainly be $\le \tfrac{3}{4}$, i.e., $\gamma_0 \le \tfrac{3}{4}\beta_0$.
Conclusion. The division of a G-number $\alpha$ by another G-number $\beta$ results in a "quotient" $\kappa$ and a "residue" $\gamma$ such that $\alpha = \kappa\beta + \gamma$, with the residue norm being at most equal to $\tfrac{3}{4}$ of the divisor norm.

VII. The algorithm of the greatest common divisor. We start with the division $\alpha/\beta$ and the related equation

(1) $\alpha = \kappa\beta + \gamma$ with $\gamma_0 \le \tfrac{3}{4}\beta_0$,

and determine, as in VI., the quotient $\lambda$ and the residue $\delta$ of the division $\beta/\gamma$; in this way we obtain the corresponding equation

(2) $\beta = \lambda\gamma + \delta$ with $\delta_0 \le \tfrac{3}{4}\gamma_0$.

Then in a similar manner we obtain

(3) $\gamma = \mu\delta + \varepsilon$ with $\varepsilon_0 \le \tfrac{3}{4}\delta_0$,

etc. Since the residue norms become progressively smaller, we must finally obtain a residue of zero. To avoid unnecessary writing we will assume that the division $\delta/\varepsilon$ following (3) leaves no residue, so that

(4) $\delta = \nu\varepsilon$.

Now it follows from (4) that every divisor $\tau$ of $\varepsilon$ also goes into $\delta$ without residue, and, therefore, it follows from (3) that $\tau$ also goes into $\gamma$ without residue; consequently, it follows from (2) that $\tau$ goes into $\beta$ without residue, and, finally, from (1) it follows that $\tau$ goes into $\alpha$ without residue.

In reverse order: it follows from (1) that every common divisor $\tau$ of $\alpha$ and $\beta$ is also a divisor of $\gamma$, then, from (2), that $\tau$ also goes into $\delta$ without residue, and, finally, from (3), that $\tau$ is also a divisor of $\varepsilon$.

Every common divisor of $\alpha$ and $\beta$ consequently goes into $\varepsilon$ without residue, and every divisor of $\varepsilon$ goes into $\alpha$ and $\beta$ without residue. $\varepsilon$ is accordingly (in terms of its absolute value) the highest common divisor of $\alpha$ and $\beta$.

If, in particular, $\varepsilon$ is a G-unit, the numbers $\alpha$ and $\beta$ are said to have no common divisor or to be prime with respect to each other.

The chain of equations (1), (2), (3), ... is nothing other than the extension to G-numbers of the well-known algorithm for determination of the highest common divisor of common integers.
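The division with residue of VI. and the chain of divisions of VII. can be sketched in Python. This is a minimal illustration under stated assumptions, not the author's own formulation: a G-number $aJ + bO$ is stored as the integer pair `(a, b)`, so the ordinary integer 1 is `(1, 1)` because $J + O = 1$, and the helper names (`mul`, `conj`, `divide`, `gcd_g`) are hypothetical. The product rule is the one given in II., and the conjugate of $aJ + bO$ is $bJ + aO$.

```python
from typing import Tuple

GNumber = Tuple[int, int]  # (a, b) stands for a*J + b*O


def mul(x: GNumber, y: GNumber) -> GNumber:
    """Product rule from II.: p = ab' + ba' - bb', q = ab' + ba' - aa'."""
    a, b = x
    a2, b2 = y
    s = a * b2 + b * a2
    return (s - b * b2, s - a * a2)


def sub(x: GNumber, y: GNumber) -> GNumber:
    return (x[0] - y[0], x[1] - y[1])


def conj(x: GNumber) -> GNumber:
    """Complex conjugation exchanges J and O."""
    return (x[1], x[0])


def norm(x: GNumber) -> int:
    a, b = x
    return a * a + b * b - a * b


def nearest(h: int, n: int) -> int:
    """An integer m with |h/n - m| <= 1/2, for n > 0."""
    return (2 * h + n) // (2 * n)


def divide(alpha: GNumber, beta: GNumber) -> Tuple[GNumber, GNumber]:
    """Quotient kappa and residue gamma with alpha = kappa*beta + gamma."""
    n = norm(beta)
    h, k = mul(alpha, conj(beta))  # alpha * conj(beta) = h*J + k*O
    kappa = (nearest(h, n), nearest(k, n))
    gamma = sub(alpha, mul(kappa, beta))
    return kappa, gamma


def gcd_g(alpha: GNumber, beta: GNumber) -> GNumber:
    """Last non-zero residue of the chain (1), (2), (3), ..."""
    while beta != (0, 0):
        _, gamma = divide(alpha, beta)
        alpha, beta = beta, gamma
    return alpha
```

For instance, `gcd_g((3, 3), (1, -1))` returns a G-number of norm 3, reflecting that $\pi = J - O$ goes into 3, while `gcd_g((2, 2), (1, -1))` returns a unit. The result is determined only up to the six unit factors.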
VIII. Unequivocal division of G-numbers into prime factors. Just as with integers, the common theorems governing divisibility, indivisibility, and unequivocal division into prime factors are derived from the divisional algorithm:

1. If $\alpha$ and $\beta$ possess no common divisor and $\alpha\mu$ is divisible by $\beta$, then $\mu$ is divisible by $\beta$.

2. If two G-numbers possess no common divisor with one and the same third G-number, their product also possesses no common divisor with this third G-number.

3. Every G-number can be divided into a product of prime factors (i.e., G-primes) in only one way. [Divisions such as $\alpha\beta\gamma$ and $\alpha J \cdot \beta \cdot \gamma O$, in which one contains the associated numbers of the other rather than certain factors of it, are not considered different from each other.]

A G-prime is a G-number that possesses no divisor aside from its six associated numbers and the six units.

The numbers $\pi = J - O$ and 2 are, for example, primes. If, for example, we assume that $\pi$ is divisible, $\pi = \lambda\mu$, then $\pi_0 = \lambda_0\mu_0$ or $3 = \lambda_0\mu_0$. From this it follows that $\lambda_0 = 3$, $\mu_0 = 1$. $\mu$ is therefore a unit and the equation $\pi = \lambda\mu$ does not represent a division. From $2 = \lambda\mu$ it follows that $2_0 = \lambda_0\mu_0$ or $4 = \lambda_0\mu_0$. The case $\lambda_0 = 2$, $\mu_0 = 2$ is eliminated because, according to III., there is no G-number having a norm equal to 2. Thus, we are left with $\lambda_0 = 4$, $\mu_0 = 1$. Once again $\mu$ is a unit and the equation $2 = \lambda\mu$ does not represent a division.

IX. Congruence. As in the theory of natural numbers, we say here also that two G-numbers $\alpha$ and $\beta$ are congruent modulo $\mu$ (written $\alpha \equiv \beta \bmod \mu$) when their difference $\alpha - \beta$ is divisible by the G-number $\mu$.

X. G-numbers modulo $\pi$. We will consider one more G-number $\kappa = aJ + bO$ in relation to the modulus $\pi = J - O$.

If $\kappa$ is divisible by $\pi$:

$$aJ + bO = (mJ + nO)(J - O) = (2n - m)J + (n - 2m)O,$$

then $a = 2n - m$, $b = n - 2m$, thus $a + b = 3g$ with $g = n - m$.
Conversely, if $a + b = 3g$, then $m$ and $n$ are determined from $n - m = g$ and $2n - m = a$, giving $\kappa = (mJ + nO)(J - O)$.
The G-number $\kappa = aJ + bO$ is thus divisible by $\pi$ only when $a + b$ is divisible by 3.

If $\kappa$ is not divisible by $\pi$, then one of the three following formula pairs is valid:

$$a = 3h, \; b = 3k + e; \qquad a = 3h + e, \; b = 3k; \qquad a = 3h + e, \; b = 3k + e,$$

with $e^2 = 1$, and thus, if $hJ + kO$ is set equal to $\lambda$,

$$\kappa = 3\lambda + eO \quad \text{or} \quad \kappa = 3\lambda + eJ \quad \text{or} \quad \kappa = 3\lambda + e,$$

so that in every case $\kappa$ has the form $\kappa = 3\lambda + \varepsilon$, where $\varepsilon$ is a G-unit.

Let us now consider the cube of $\kappa$. It becomes

$$\kappa^3 = 9(3\lambda^3 + 3\lambda^2\varepsilon + \lambda\varepsilon^2) + \varepsilon^3,$$

and, because $\varepsilon^3 = \pm 1$, it has the form $\kappa^3 = 9\mu \pm 1$.

If $\kappa$ is not divisible by $\pi$ we then have the congruences

$$\kappa \equiv \varepsilon \bmod 3, \qquad \kappa^3 \equiv \pm 1 \bmod 9.$$

22 The Quadratic Reciprocity Law

(The Euler-Legendre-Gauss theorem.) The reciprocal Legendre symbols of the odd prime numbers $p$ and $q$ are governed by the formula

$$\left(\frac{p}{q}\right) \cdot \left(\frac{q}{p}\right) = (-1)^{[(p-1)/2] \cdot [(q-1)/2]}.$$

This law, the so-called quadratic reciprocity law, was formulated but not proved by Euler (Opuscula analytica, Petersburg, 1783). In 1785 Legendre discovered the same law (Histoire de l'Académie des Sciences) independently of Euler and proved it partially. The first complete proof was presented by Karl Friedrich Gauss (1777-1855) in his famous Disquisitiones arithmeticae (published in 1801), a book that laid the foundations of contemporary number theory; this work, its five hundred quarto pages swarming with profound
ideas, was written when Gauss was 20 years old. "It is really astonishing," says Kronecker, "to think that a single man of such young years was able to bring to light such a wealth of results, and above all to present such a profound and well organized treatment of an entirely new discipline."

Later Gauss discovered seven other proofs of the reciprocity theorem. (The Gauss proofs may be found in vol. 14 of Ostwald's Klassiker der exakten Wissenschaften.)

The quadratic reciprocity law is one of the most important theorems of number theory. Gauss called it the "Theorema fundamentale." The American mathematician Dickson says in his Theory of Numbers: "The quadratic reciprocity law is doubtless the most important tool in the theory of numbers and occupies the central position in its history."

The importance of this law led other mathematicians like Jacobi, Cauchy, Liouville, Kronecker, Schering, and Frobenius to investigate it after Gauss and offer proofs of it. In his Niedere Zahlentheorie, P. Bachmann cites no fewer than 52 proofs and reports on the most important.

Probably the simplest of all the proofs is the following arithmetic-geometric proof, which arises from the combination of the so-called lemma of Gauss (Gauss' Werke, vol. II, p. 51) and a geometric idea of Cayley (Arthur Cayley [1821-1895], Collected Mathematical Papers, vol. II).

Before taking up the proof itself we will give the derivation of Gauss' lemma.

Let $p$ be an odd prime number and $D$ an integer that is not divisible by $p$. If $x$ represents one of the numbers $1, 2, 3, \ldots, \mathfrak{p} = (p - 1)/2$, $R_x$ the common residue of the division $Dx/p$, and $g_x$ the corresponding integral quotient, then

(1) $Dx = R_x + g_x p$.

Accordingly as $R_x$ is smaller or greater than $\tfrac{1}{2}p$, we set

$$R_x = \rho_x \quad \text{or} \quad R_x = \rho_x + p,$$

where in the second case $\rho_x$ represents the negative minimum residue of the division $Dx/p$, and we obtain

(1a) $Dx = \rho_x + g_x p$ or (1b) $Dx = \rho_x + p + g_x p$.
If $n$ is then the number of negative minimum residues occurring in the $\mathfrak{p}$ divisions $Dx/p$ (for $x = 1, 2, 3, \ldots, \mathfrak{p}$), we have $n$ equations of the form (1b) and $m = \mathfrak{p} - n$ equations of the form (1a).
We convert these equations into congruences mod $p$ and obtain the $\mathfrak{p}$ congruences

(2) $Dx \equiv \rho_x \bmod p$.

Now the $\mathfrak{p}$ residues $\rho_x$ agree, except with respect to sign and sequence, with the $\mathfrak{p}$ numbers 1 to $\mathfrak{p}$. [If, for example, $\rho_r$ were equal to $\rho_s$, or $\rho_r = -\rho_s$, for two different values $r$ and $s$ of $x$, then $Dr \equiv \rho_r$ and $Ds \equiv \rho_s$ would yield by subtraction or addition, respectively, $D(r \mp s) \equiv 0 \bmod p$. This congruence is, however, impossible, because neither $D$ nor $r \mp s$ is divisible by $p$.]

Multiplication of the $\mathfrak{p}$ congruences (2) results in

$$D^{\mathfrak{p}} \, \mathfrak{p}! \equiv (-1)^n \, \mathfrak{p}! \bmod p,$$

and from this we obtain

$$D^{\mathfrak{p}} \equiv (-1)^n \bmod p.$$

However, since, according to Euler's theorem (No. 19),

$$D^{\mathfrak{p}} \equiv \left(\frac{D}{p}\right) \bmod p,$$

we obtain

$$\left(\frac{D}{p}\right) \equiv (-1)^n \bmod p,$$

whence, since both sides of this congruence have the absolute value 1,

(3) $\left(\dfrac{D}{p}\right) = (-1)^n$.

This formula, in which $n$ represents the number of negative minimum residues resulting from the $\mathfrak{p}$ divisions $Dx/p$ ($x = 1, 2, 3, \ldots, \mathfrak{p}$), is Gauss' lemma.

Now let $D$ be some odd prime number $q$ that differs from $p$. We convert the $\mathfrak{p}$ equations (1a) and (1b) into congruences to the modulus 2, leave out all the excess multiples of 2, e.g., $(q - 1)x$, and obtain

$$x \equiv \rho_x + g_x \bmod 2 \quad \text{and} \quad x \equiv 1 + \rho_x + g_x \bmod 2.$$

Addition of these $\mathfrak{p}$ congruences yields

$$\sum x \equiv n + \sum \rho_x + \sum g_x \bmod 2.$$
However, since the absolute values of $\rho_x$ are in agreement with the numbers 1 through $\mathfrak{p}$ and each summand can be replaced by its opposite value in a congruence mod 2, we will write $\sum x$ in the obtained congruence instead of $\sum \rho_x$ and $-n$ instead of $n$, thereby obtaining

$$\sum x + n \equiv \sum x + \sum g_x \bmod 2$$

or

(4) $n \equiv \sum g_x \bmod 2$.

In accordance with (4) we can now write (3) as

$$\left(\frac{q}{p}\right) = (-1)^{\sum g_x}.$$

Now $g_x$ is the greatest integer contained in the quotient $qx/p$. If we designate this as $[qx/p]$, we obtain at last

(I) $\left(\dfrac{q}{p}\right) = (-1)^{\sum [qx/p]}$,

where $x$ passes through all the integers from 1 to $\mathfrak{p} = (p - 1)/2$. Accordingly,

(II) $\left(\dfrac{p}{q}\right) = (-1)^{\sum [py/q]}$,

where $y$ passes through all the integers from 1 to $\mathfrak{q} = (q - 1)/2$.

Multiplication of (I) and (II) gives us

(III) $\left(\dfrac{q}{p}\right) \cdot \left(\dfrac{p}{q}\right) = (-1)^{\sum [qx/p] + \sum [py/q]}$.

[Fig. 4]

The exponent of the right-hand side can be found as follows.
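Gauss' lemma (3) and the formulas (I) to (III) lend themselves to a numerical spot check. The Python sketch below is an added illustration, not part of the original text; the function names are assumptions, and the Legendre symbol is computed from Euler's criterion $D^{(p-1)/2} \bmod p$ rather than from any method given in the book. The pair $p = 19$, $q = 11$ is the one used in the book's figure.

```python
def legendre(d: int, p: int) -> int:
    """Legendre symbol (d/p) for an odd prime p, via Euler's criterion."""
    r = pow(d, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def negative_residue_count(d: int, p: int) -> int:
    """The number n of Gauss' lemma: residues of d*x mod p exceeding p/2."""
    half = (p - 1) // 2
    return sum(1 for x in range(1, half + 1) if (d * x) % p > half)


def floor_sum(q: int, p: int) -> int:
    """Sum of [q*x/p] for x = 1, ..., (p - 1)/2, the exponent in (I)."""
    half = (p - 1) // 2
    return sum((q * x) // p for x in range(1, half + 1))


if __name__ == "__main__":
    p, q = 19, 11
    print(legendre(q, p), (-1) ** negative_residue_count(q, p),
          (-1) ** floor_sum(q, p))
    print(floor_sum(q, p) + floor_sum(p, q))
```

For these two primes all three expressions for $(q/p)$ agree, and the exponent of (III) comes out as 45, which equals $\frac{p-1}{2} \cdot \frac{q-1}{2} = 9 \cdot 5$.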
On a system of rectangular coordinates $xy$ we draw the rectangle with the four corners

$$0|0, \quad \tfrac{p}{2}|0, \quad \tfrac{p}{2}|\tfrac{q}{2}, \quad 0|\tfrac{q}{2}$$

and bisect it with a diagonal $d$ from the origin, possessing the equation $y = qx/p$; we then mark off all the lattice points* within the rectangle. (Cf. the figure, in which $p = 19$, $q = 11$.)

To begin with, it is clear that no marked lattice point $x|y$ lies on $d$, since here $x$ would necessarily be $< \tfrac{1}{2}p$ and $y < \tfrac{1}{2}q$, which contradicts the condition $y/x = q/p$.

For an integral abscissa $x$ the corresponding ordinate of $d$ is $y = qx/p$, and the number of marked lattice points lying on this ordinate is $[qx/p]$. Consequently, the number of the marked lattice points lying in the lower half of the rectangle is $\sum [qx/p]$, where $x$ passes through all the integers from 1 to $\mathfrak{p}$.

Similarly, the number of all the marked lattice points lying in the upper half of our rectangle is $\sum [py/q]$, where $y$ passes through all the integers from 1 to $\mathfrak{q}$.

The exponent appearing in (III) is then the number of all the marked lattice points in our rectangle. This is a total of $\mathfrak{p} \cdot \mathfrak{q}$ elements. Consequently,

$$\left(\frac{q}{p}\right) \cdot \left(\frac{p}{q}\right) = (-1)^{\mathfrak{p}\mathfrak{q}}$$

or

$$\left(\frac{p}{q}\right) \cdot \left(\frac{q}{p}\right) = (-1)^{[(p-1)/2] \cdot [(q-1)/2]}. \qquad \text{Q.E.D.}$$

* A lattice point is a point whose coordinates are integers.

23 Gauss' Fundamental Theorem of Algebra

Every equation of the $n$th degree

$$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n = 0$$

has $n$ roots.

Expressed more precisely, this theorem reads: The polynomial

$$f(z) = z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n$$

can always be divided into $n$ linear factors of the form $z - \alpha_\nu$.
This famous theorem, the fundamental theorem of algebra, was first stated by d'Alembert in 1746, but only partially proved. The first rigorous proof was given in 1799 by Gauss, then twenty-one years old, in his doctoral dissertation Demonstratio nova theorematis omnem functionem algebraicam rationalem integram unius variabilis in factores reales primi vel secundi gradus resolvi posse (Helmstaedt, 1799). Subsequently, Gauss gave three other proofs of this theorem. All four are to be found in the third volume of his Works, as well as in vol. 14 of Ostwald's Klassiker der exakten Wissenschaften.

Other authors after Gauss, including Argand, Cauchy, Ullherr, Weierstrass, and Kronecker, also gave proofs of the fundamental theorem.

The proof followed here (as modified by Cauchy) is Argand's (Annales de Gergonne, 1815), which is distinguished by its brevity and simplicity.

This proof (like most of the other proofs) falls into two steps. The first and more difficult step merely demonstrates that an equation of the $n$th degree always has at least one root; the second step shows that it has $n$ roots and no more.

First Step

We set

$$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n = f(z) = w$$

and consider the different values that are assumed by the absolute magnitude $|w|$ when $z$ is moved in the Gauss plane (the plane of complex numbers). Let the smallest of these values be $\mu$ and let it be attained, for example, at the site $z_0$, so that $|f(z_0)| = |w_0| = \mu$.

There are two possible cases:

1. The minimum $\mu$ is greater than zero.
2. The minimum $\mu$ is equal to zero.

We will begin by considering the first case.

In the immediate vicinity of the point $z_0$, say, in the area defined by a small circle $K$ of radius $R$ with its center at $z_0$, $|w|$ is everywhere $\ge \mu$, since $\mu$ represents the smallest value of $|w|$; at $z_0$ itself $|w| = |w_0| = \mu$.
For any $z$ in $K$, $z = z_0 + \zeta$, where $\zeta = \rho(\cos\vartheta + i\sin\vartheta)$ and $\rho$ is the absolute magnitude of $\zeta$, i.e., the line segment $z_0z$, and $\vartheta$ the inclination of this segment toward the axis of the positive real numbers. We calculate

$$w = f(z) = f(z_0 + \zeta) = (z_0 + \zeta)^n + C_1(z_0 + \zeta)^{n-1} + \cdots + C_n$$
eliminating the parentheses and arranging according to increasing powers of $\zeta$. In this way we obtain

$$w = f(z) = z_0^n + C_1 z_0^{n-1} + C_2 z_0^{n-2} + \cdots + C_n + c_1\zeta + c_2\zeta^2 + \cdots + c_n\zeta^n,$$

i.e.,

$$w = w_0 + c_1\zeta + c_2\zeta^2 + \cdots + c_n\zeta^n.$$

Since several coefficients $c_\nu$ may be equal to zero, we call the first of the nonvanishing coefficients $c$, the second $c'$, and so forth, so that

$$w = w_0 + c\zeta^{\nu} + c'\zeta^{\nu'} + c''\zeta^{\nu''} + \cdots, \quad \text{with } \nu < \nu' < \nu'' < \cdots.$$

Division by $w_0$ and isolation of $\zeta^\nu$ yields

$$\frac{w}{w_0} = 1 + q\zeta^\nu \cdot (1 + \zeta Z),$$

where $q = c/w_0$ and $Z$ represents a sum of different powers of $\zeta$ with non-negative exponents and known coefficients.

We consider the product $q\zeta^\nu \cdot (1 + \zeta Z)$. We write the first factor trigonometrically, abbreviating $\cos\varphi + i\sin\varphi$ to $1_\varphi$, and, from $q = h(\cos\lambda + i\sin\lambda) = h \cdot 1_\lambda$ and $\zeta = \rho \cdot 1_\vartheta$, we obtain

$$q\zeta^\nu = h \cdot 1_\lambda \cdot \rho^\nu \cdot 1_{\nu\vartheta} = h\rho^\nu \cdot 1_{\lambda + \nu\vartheta}.$$

From now on we confine ourselves to $z$-values of $K$ for which $\lambda + \nu\vartheta = \pi$, which consequently lie on the radius $z_0H$ that forms the angle $\vartheta = (\pi - \lambda)/\nu$ with the real axis. For all these $z$'s the number $1_{\lambda + \nu\vartheta} = 1_\pi$ has the value $-1$, and our product assumes the form

$$-h\rho^\nu \cdot (1 + \zeta Z).$$

If we choose a sufficiently small radius $R$, the second factor $1 + \zeta Z$ can be brought as close to unity as we desire, since $\rho = |\zeta| < R$. But this means that the product lies as close as desired to the value $-h\rho^\nu$, i.e., the fraction

$$\frac{w}{w_0} = 1 - h\rho^\nu(1 + \zeta Z)$$

lies as close as we desire to the point $1 - h\rho^\nu$ of the Gauss plane, which shows that for all $z$'s between $z_0$ and $H$ the absolute magnitude $|w/w_0| < 1$.

In other words, for these $z$, $|w| < \mu$, while for all $z$'s in the vicinity of $z_0$, $|w|$ should be $\ge \mu$.

This is a contradiction, and consequently the first of the two possible cases given above ($\mu > 0$) is eliminated.

This leaves only the second case: $w_0$ is equal to zero, or $f(z_0) = 0$. Therefore:

Every equation, regardless of its degree, has at least one root.
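The key move of the first step, namely that from any point $z_0$ with $f(z_0) \ne 0$ one can step along the direction $\vartheta = (\pi - \lambda)/\nu$ and reduce $|f|$, can be tried numerically. The following Python sketch is an added illustration under stated assumptions, not Argand's or the author's formulation: the function names, the tolerance `tol` used to decide which coefficient counts as nonvanishing, and the step length `rho` are all hypothetical choices. Coefficients are listed from the highest power down.

```python
import cmath
import math


def shifted_coefficients(coeffs, z0):
    """Coefficients c_0, c_1, ..., c_n of f(z0 + zeta) in powers of zeta.

    Uses repeated synthetic division by (z - z0); coeffs is highest power first.
    """
    c = [complex(a) for a in coeffs]
    out = []
    while c:
        for i in range(1, len(c)):
            c[i] += c[i - 1] * z0
        out.append(c.pop())  # the remainder is the next coefficient
    return out


def horner(coeffs, z):
    """Value of the polynomial at z."""
    w = 0j
    for a in coeffs:
        w = w * z + a
    return w


def argand_step(coeffs, z0, rho, tol=1e-12):
    """One step of length rho from z0 in the direction (pi - lambda) / nu.

    Assumes f(z0) != 0; tol decides which coefficient is treated as zero.
    """
    c = shifted_coefficients(coeffs, z0)
    w0 = c[0]
    nu = next(k for k in range(1, len(c)) if abs(c[k]) > tol)
    q = c[nu] / w0
    theta = (math.pi - cmath.phase(q)) / nu
    return z0 + cmath.rect(rho, theta)


if __name__ == "__main__":
    f = [1, 0, 1]  # z^2 + 1, whose first nonvanishing coefficient at 0 is c_2
    z1 = argand_step(f, 0, 0.1)
    print(abs(horner(f, 0)), abs(horner(f, z1)))
```

For $f(z) = z^2 + 1$ and $z_0 = 0$ the step points along the imaginary axis, and $|f|$ drops from 1 to about 0.99. How small `rho` must be in general depends on the remaining coefficients, exactly as the factor $1 + \zeta Z$ in the text indicates.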
Second Step

We begin with the demonstration of the auxiliary theorem: If an algebraic equation $f(z) = 0$ has the root $\alpha$, then the left side of the equation can be divided by $z - \alpha$ without a remainder.

If we divide the polynomial $f(z)$ by $z - \alpha$ until the remainder $R$ no longer contains $z$, we obtain

$$\frac{f(z)}{z - \alpha} = f_1(z) + \frac{R}{z - \alpha},$$

where $R$ is a constant and $f_1(z)$ has the form

$$z^{n-1} + \mathfrak{C}_1 z^{n-2} + \mathfrak{C}_2 z^{n-3} + \cdots + \mathfrak{C}_{n-1}.$$

Multiplication by $z - \alpha$ gives

$$f(z) = (z - \alpha)f_1(z) + R.$$

If in this equation, which is valid for every $z$, we set $z = \alpha$, we obtain

$$R = f(\alpha) = 0$$

and thus for every $z$

$$f(z) = (z - \alpha) \cdot f_1(z). \qquad \text{Q.E.D.}$$

If we combine this auxiliary theorem with the theorem proved in the first step, which demonstrated the existence of one root, we obtain the new theorem:

Every polynomial of $z$ can be represented as the product of a linear factor $z - \alpha$ with a polynomial one degree lower.

We now write $\alpha_1$ rather than $\alpha$ and obtain

$$f(z) = (z - \alpha_1)f_1(z).$$

We then apply the obtained theorem to the polynomial $f_1(z)$ and get

$$f_1(z) = (z - \alpha_2)f_2(z),$$

where $f_2(z)$ is of the $(n - 2)$th degree and $\alpha_2$ is a root of the equation $f_1(z) = 0$. Also in similar fashion:

$$f_2(z) = (z - \alpha_3)f_3(z), \qquad f_3(z) = (z - \alpha_4)f_4(z),$$

etc.

In this chain of equations, beginning with the next to last, if we replace every $f$ on the right-hand side with its following value in the
equation below, we finally obtain the theorem for the transformation of a polynomial of the $n$th degree into a product of $n$ linear factors:

$$f(z) = (z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n).$$

Expressed verbally: Every integral rational function of the $n$th degree can be represented as the product of $n$ linear factors.

Thus, the previous equation $f(z) = 0$ allows us to write

$$(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n) = 0.$$

However, the product on the left becomes zero only when one factor is equal to zero. And since $z - \alpha_\nu = 0$ implies $z = \alpha_\nu$, we finally obtain:

The equation $f(z) = 0$ possesses the $n$ roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ and no others.

Thus we have proved the fundamental theorem.

Note. It is possible for several of the $n$ roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ to be equal, for example, for $\alpha_2$ and $\alpha_3$ both to be equal to $\alpha_1$, while $\alpha_4, \alpha_5, \ldots, \alpha_n$ may be different from $\alpha_1$. In this case $\alpha_1$ is called a multiple root, and specifically, in the case we have assumed of three equal roots, a triple root.

24 Sturm's Problem of the Number of Roots

Find the number of real roots of an algebraic equation with real coefficients over a given interval.

This very important algebraic problem was solved in a surprisingly simple way in 1829 by the French mathematician Charles Sturm (1803-1855). The paper containing the famous Sturm theorem appeared in the eleventh volume of the Bulletin des sciences de Férussac and bears the title, "Mémoire sur la résolution des équations numériques."

"With this major discovery," says Liouville, "Sturm at once simplified and perfected the elements of algebra, enriching them with new results."

Solution. We distinguish two cases:

I. The real roots of the equation in question are all simple over the given interval.

II. The equation also possesses multiple real roots over the interval.

We will first show that the second case leads us back to the first.
Let the prescribed equation $F(x) = 0$ have the distinct roots $\alpha, \beta, \gamma, \ldots$, and let the root $\alpha$ be $a$-fold, $\beta$ $b$-fold, $\gamma$ $c$-fold, ..., so that

$$F(x) = (x - \alpha)^a (x - \beta)^b (x - \gamma)^c \cdots.$$

For the derivative $F'(x)$ of $F(x)$ we obtain

$$\frac{F'(x)}{F(x)} = \frac{a}{x - \alpha} + \frac{b}{x - \beta} + \frac{c}{x - \gamma} + \cdots = \frac{a(x - \beta)(x - \gamma)(x - \delta)\cdots + b(x - \alpha)(x - \gamma)(x - \delta)\cdots + \cdots}{(x - \alpha)(x - \beta)(x - \gamma)\cdots}.$$

If we then call the numerator of this fraction $p(x)$ and the denominator $q(x)$ and set the whole rational function $F(x)/q(x)$ equal to $G(x)$, then

$$F(x) = G(x) \cdot q(x) \quad \text{and} \quad F'(x) = G(x) \cdot p(x).$$

Now the functions $p(x)$ and $q(x)$ have no common divisor. (The factor $x - \beta$ of $q(x)$ may, for example, go into all the terms of $p(x)$ except the second with no remainder.) It follows from this that $G(x)$ is the greatest common divisor of $F(x)$ and $F'(x)$. This can be determined easily from the divisional algorithm and can therefore be considered known, as a result of which $q(x)$ is known also.

The equation $F(x) = 0$ then falls into the two equations

$$q(x) = 0 \quad \text{and} \quad G(x) = 0,$$

the first of which possesses only simple roots, while the second can be further reduced in the same way that $F(x) = 0$ was.

An equation with multiple roots can therefore always be transformed into equations (with known coefficients) possessing only simple roots.

Consequently, it is sufficient to solve the problem for the first case.

Let $f(x) = 0$ be an algebraic equation all of whose roots are simple. The derivative $f'(x)$ of $f(x)$ then vanishes for none of these roots, and the highest common divisor of the functions $f(x)$ and $f'(x)$ is a constant $K$ that differs from zero. We use the divisional algorithm to determine the highest common divisor of $f(x)$ and $f'(x)$, writing, for the sake of convenience in representation, $f_0(x)$ and $f_1(x)$ instead of $f(x)$ and $f'(x)$, and calling the quotients resulting from the successive divisions $q_0(x), q_1(x), q_2(x), \ldots$ and the remainders $-f_2(x), -f_3(x), \ldots$.
If we also drop the argument sign for the sake of brevity, we obtain the following scheme:

(0) $f_0 = q_0 f_1 - f_2$,
(1) $f_1 = q_1 f_2 - f_3$,
(2) $f_2 = q_2 f_3 - f_4$,
etc.

In this scheme there must at last appear (at the very latest with the remainder $K$) a remainder $-f_s(x)$ that does not vanish at any point of the interval and consequently possesses the same sign over the whole interval. Here we break off the algorithm. The functions involved,

$$f_0, f_1, f_2, \ldots, f_s,$$

form a "Sturm chain" and in this connection are called Sturm functions.

The Sturm functions possess the following three properties:

1. Two neighboring functions do not vanish simultaneously at any point of the interval.

2. At a null point of a Sturm function its two neighboring functions are of different sign.

3. Within a sufficiently small area surrounding a zero point of $f_0(x)$, $f_1(x)$ is everywhere greater than zero or everywhere smaller than zero.

Proof of 1. If, for example, $f_2$ and $f_3$ vanish at any point of the interval, $f_4$ [according to (2)] also vanishes at this point, and consequently $f_5$ also [according to (3)], and so forth, so that finally [according to the last line of the algorithm] $f_s$ also vanishes, which, however, contradicts our assumption.

Proof of 2. If the function $f_3$ vanishes at the point $\sigma$, for example, of the interval, then it follows from (2) that

$$f_2(\sigma) = -f_4(\sigma).$$

Proof of 3. This proof follows from the known theorem: A function [$f_0(x)$] rises or falls at a point depending on whether its derivative [$f_1(x)$] at that point is greater or smaller than zero.

We now select any point $x$ of the interval, note the sign of the values $f_0(x), f_1(x), \ldots, f_s(x)$, and obtain a Sturm sign chain (to obtain an unequivocal sign, however, it must be assumed that none of the designated $s + 1$ function values is zero).
The sign chain will contain sign sequences ($+\,+$ and $-\,-$) and sign changes ($+\,-$ and $-\,+$).

We will consider the number $Z(x)$ of sign changes in the sign chain and the changes undergone by $Z(x)$ when $x$ passes through the interval. A change can occur only if one or more of the Sturm
functions changes sign, i.e., passes over from negative (positive) values through zero to positive (negative) values.

We will accordingly study the effect produced on $Z(x)$ by the passage of a function $f_\nu(x)$ through zero.

Let $k$ be a point at which $f_\nu$ vanishes, $h$ a point situated to the left, and $l$ a point to the right of $k$ and so close to $k$ that over the interval $h$ to $l$ the following holds true:

(1) $f_\nu(x)$ does not vanish except when $x = k$;
(2) every neighbor ($f_{\nu+1}$, $f_{\nu-1}$) of $f_\nu$ does not change sign.

We must distinguish between the cases $\nu > 0$ and $\nu = 0$; in the first case we are concerned with the triplet $f_{\nu-1}, f_\nu, f_{\nu+1}$, in the second, with the pair $f_0, f_1$.

In the triplet, $f_{\nu-1}$ and $f_{\nu+1}$ possess either the $+$ and $-$ sign or the $-$ and $+$ sign at all three points $h$, $k$, $l$. Thus, whatever the sign of $f_\nu$ may be at these points, the triplet possesses one change of sign for each of the three arguments $h$, $k$, $l$. The passage through zero of the function $f_\nu$ does not change the number of sign changes in the chain!

In the pair, $f_1$ has either the $+$ or the $-$ sign at all three points $h$, $k$, $l$. In the first case, $f_0$ is increasing and is thus negative at $h$ and positive at $l$. In the second case, $f_0$ is decreasing and is positive at $h$ and negative at $l$. In both cases a sign change is lost.

From our investigation we learn that:

The Sturm sign chain undergoes a change in the number of sign changes $Z(x)$ only when $x$ passes through a null point of $f(x)$; and specifically, the chain then loses (with an increasing $x$) exactly one sign change.

Thus, if $x$ passes through the interval (the ends of which do not represent roots of $f(x) = 0$) from left to right, the sign chain loses exactly as many sign changes as there are null points of $f(x)$ within the interval.
Result: Sturm's theorem: The number of real roots of an algebraic equation with real coefficients whose real roots are simple over an interval the end points of which are not roots is equal to the difference between the numbers of sign changes of the Sturm sign chains formed for the interval ends.

Note. The same considerations can also be applied unchanged to the series formed when we multiply $f_0, f_1, f_2, \ldots, f_s$ by any positive constants; this series is then likewise designated as a Sturm chain. In the formation of the Sturm function chain all fractional coefficients are accordingly avoided.

Example 1. Determine the number and situation of the real roots of the equation $x^5 - 3x - 1 = 0$.
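For the equation of Example 1 the count described by Sturm's theorem can be carried out mechanically. The Python sketch below is an added illustration, not part of the original text; the function names are assumptions, exact rational arithmetic from the standard `fractions` module replaces the book's device of scaling by positive constants, and function values equal to zero are simply skipped in the sign chain, whereas the text assumes that none occurs. Polynomials are lists of coefficients from the highest power down.

```python
from fractions import Fraction


def poly_eval(p, x):
    """Value of polynomial p (highest power first) at x, by Horner's rule."""
    v = Fraction(0)
    for c in p:
        v = v * x + c
    return v


def derivative(p):
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])]


def poly_rem(a, b):
    """Remainder of a divided by b; b must have a non-zero leading coefficient."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading term has been cancelled
    while a and a[0] == 0:
        a.pop(0)
    return a


def sturm_chain(f):
    """f_0 = f, f_1 = f', and f_{k+1} = -(remainder of f_{k-1} by f_k)."""
    chain = [[Fraction(c) for c in f], [Fraction(c) for c in derivative(f)]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])


def sign_changes(chain, x):
    values = [poly_eval(p, Fraction(x)) for p in chain]
    signs = [v > 0 for v in values if v != 0]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


def count_roots(f, lo, hi):
    """Number of real roots of f between lo and hi (neither being a root)."""
    chain = sturm_chain(f)
    return sign_changes(chain, lo) - sign_changes(chain, hi)


if __name__ == "__main__":
    f = [1, 0, 0, 0, -3, -1]  # x^5 - 3x - 1
    for lo in range(-2, 2):
        print((lo, lo + 1), count_roots(f, lo, lo + 1))
```

The chain computed this way differs from a chain with integer coefficients only by positive constant factors, which, as the Note above points out, does not affect the sign changes.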
The Sturm chain is

$$f_0 = x^5 - 3x - 1, \quad f_1 = 5x^4 - 3, \quad f_2 = 12x + 5, \quad f_3 = 1.$$

The signs of $f$ for $x = -2, -1, 0, +1, +2$ are

  x      -2   -1    0   +1   +2
  f_0     -    +    -    -    +
  f_1     +    +    -    +    +
  f_2     -    -    +    +    +
  f_3     +    +    +    +    +

The equation thus has three real roots: one between $-2$ and $-1$, one between $-1$ and $0$, one between $+1$ and $+2$. The other two roots are complex.

Example 2. Determine the number of real roots of the equation $x^5 - ax - b = 0$ when $a$ and $b$ are positive magnitudes and $4^4a^5 > 5^5b^4$.

The Sturm chain reads

$$x^5 - ax - b, \quad 5x^4 - a, \quad 4ax + 5b, \quad 4^4a^5 - 5^5b^4.$$

For the values $x = -\infty$ and $+\infty$ it has the signs

$$- \; + \; - \; + \quad \text{and} \quad + \; + \; + \; +,$$

respectively. The equation has three real and two complex roots.

25 Abel's Impossibility Theorem

Equations of higher than the fourth degree are in general incapable of algebraic solution.

This famous theorem was first stated by the Italian physician Paolo Ruffini (1765-1822) in his book Teoria generale delle equazioni, published in Bologna in 1798. Ruffini's proof, however, is incomplete. The
first rigorous proof was given in 1826 in the first volume of Crelle's Journal für Mathematik by the young Norwegian mathematician Niels Henrik Abel (1802-1829). His celebrated paper bore the title "Démonstration de l'impossibilité de la résolution algébrique des équations générales qui dépassent le quatrième degré."

The following proof of Abel's impossibility theorem is based on a theorem of Kronecker, published in 1856 in the Monatsberichte der Berliner Akademie.

We will begin by presenting in a short introduction the auxiliary algebraic theorems necessary for an understanding of the Kronecker proof.

A system $S$ of numbers is called a number group or rational domain when the addition, subtraction, multiplication, and division of two numbers of the system will also yield a number of the system. For brevity we will call the numbers of the system $S$-numbers. Two groups are called equal when every number of the one belongs also to the other.

The simplest group is that composed of all rational numbers, the group $\mathfrak{R}$ of rational numbers or the natural rationality domain.

A group $S' = S(\alpha, \beta, \gamma, \ldots)$ created by the "substitution of the magnitudes $\alpha, \beta, \gamma, \ldots$ in a group $S$" is understood to mean the totality of all the numbers obtained from the $S$-numbers and the substituted magnitudes $\alpha, \beta, \gamma, \ldots$ by one or more applications of the four species, in other words, the totality of all the rational functions of $\alpha, \beta, \gamma, \ldots$ whose coefficients are $S$-numbers.

A function $f(x)$ or an equation $f(x) = 0$ in a group is a function or equation whose coefficients are numbers of the group.

A polynomial in $S$ is understood to mean an integral rational function of the variable $x$ whose coefficients are $S$-numbers.

A polynomial $F(x) = Ax^n + Bx^{n-1} + \cdots$ or an equation $F(x) = 0$ in a group $S$ is said to be reducible or irreducible in this group accordingly as $F(x)$ is divisible into a product of polynomials of lower degree in $S$ or not.
The function $x^2 - 10x + 7$, for example, is irreducible in the group $\mathfrak{R}$, whereas it is reducible in the group $\mathfrak{R}(\sqrt{2})$:

$$x^2 - 10x + 7 = (x - 5 - 3\sqrt{2})(x - 5 + 3\sqrt{2}).$$
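As a worked check of this example (added here for illustration, with no notation beyond that of the text), the product can be multiplied out as a difference of two squares:

```latex
\begin{align*}
(x - 5 - 3\sqrt{2})(x - 5 + 3\sqrt{2})
  &= (x - 5)^2 - (3\sqrt{2})^2 \\
  &= x^2 - 10x + 25 - 18 \\
  &= x^2 - 10x + 7 .
\end{align*}
```

The two roots $5 \pm 3\sqrt{2}$ are irrational, so neither linear factor has rational coefficients, which is why the polynomial cannot be split within the natural rationality domain.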
Abel's lemma:* The pure equation $x^p = C$ of the prime number degree $p$ is irreducible in a group $S$ when $C$ is a number of the group but not the $p$th power of a group number.

Indirect proof. Let $x^p - C = 0$ be reducible, so that

$$x^p - C = \psi(x)\varphi(x),$$

where $\psi$ and $\varphi$ are polynomials in $S$, whose free terms $A$ and $B$ are $S$-numbers. Since the roots of the equation $x^p = C$ are $r, r\varepsilon, r\varepsilon^2, \ldots, r\varepsilon^{p-1}$, where $r$ is one of the roots and $\varepsilon$ a complex $p$th root of unity, and the free term of the equation $\psi(x) = 0$ or $\varphi(x) = 0$, independent of sign, represents the product of the equation's roots, then, for example, $A = r^\mu\varepsilon^M$, $B = r^\nu\varepsilon^N$. Since $\mu$ and $\nu$ possess no common divisor (because $\mu + \nu = p$), there are integers $h$, $k$ such that $\mu h + \nu k = 1$. Thus, we obtain for the product $K$ of the powers $A^h$ and $B^k$ the value $r\varepsilon^{hM + kN}$ and, consequently, the value $K^p = r^p = C$ for the $p$th power of the $S$-number $K$. It was assumed, however, that $C$ must not be the $p$th power of an $S$-number. Consequently, $x^p = C$ cannot be reducible.

* Abel, Œuvres complètes, vol. II, p. 196.

Schoenemann's theorem (Crelle's Journal, vol. XXXII, 1846): If the integral coefficients $C_0, C_1, C_2, \ldots, C_{N-1}$ of the polynomial

$$f(x) = C_0 + C_1x + C_2x^2 + \cdots + C_{N-1}x^{N-1} + x^N$$

are divisible by a prime number $p$, while the free term $C_0$ is not divisible by $p^2$, then $f(x)$ is irreducible in the natural rationality domain.

Indirect proof. Let $f$ be reducible so that $f = \psi \cdot \varphi$, with

$$\psi = a_0 + a_1x + a_2x^2 + \cdots + a_{m-1}x^{m-1} + x^m,$$
$$\varphi = b_0 + b_1x + b_2x^2 + \cdots + b_{n-1}x^{n-1} + x^n.$$
According to a theorem of Gauss* the coefficients $a$ and $b$ are here integers. We multiply the expressions for $\psi$ and $\varphi$, obtaining, by comparison with $f$,

$$C_0 = a_0b_0,$$
$$C_1 = a_0b_1 + a_1b_0,$$
$$C_2 = a_0b_2 + a_1b_1 + a_2b_0,$$

etc. Since $C_0$ is not divisible by $p^2$, let us say that $a_0$ is divisible by $p$, in which case $b_0$ is not. Since $C_1$ and $a_0$ are divisible by $p$, while $b_0$ is not, it follows from the second line of our scheme that $a_1$ is divisible by $p$. Then it follows according to the third line of our scheme, in which $C_2$, $a_0$, $a_1$ are divisible by $p$, that $a_2$ is also divisible by $p$, and so forth. Finally, we would be able to conclude that $a_m = 1$ is also divisible by $p$, which is naturally absurd. Consequently, $f$ cannot be reducible.

Reducible and irreducible polynomials play the same role among polynomials that composite and prime numbers play among the integers. Thus, for example, every reducible polynomial can be divided in only one way into a product of irreducible polynomials.

All of the theorems concerned here are based on the fundamental theorem of irreducible functions.

* Gauss' theorem: If a polynomial $f = x^N + C_1x^{N-1} + C_2x^{N-2} + \cdots + C_N$ with integral coefficients is divisible into a product of two polynomials $\psi = x^m + \alpha_1x^{m-1} + \cdots + \alpha_m$ and $\varphi = x^n + \beta_1x^{n-1} + \cdots + \beta_n$ with rational coefficients ($f = \psi\varphi$), then the coefficients of these polynomials are integers.

Proof. We bring the $\alpha_\nu$ and the $\beta_\nu$ to their lowest common denominators $a_0$ and $b_0$, respectively, so that $\alpha_\nu = a_\nu/a_0$ and $\beta_\nu = b_\nu/b_0$, and the numbers $a_0, a_1, a_2, \ldots, a_m$, as well as the numbers $b_0, b_1, \ldots, b_n$, possess no common divisor, and we obtain $F = \Psi\Phi$ with

$$F = a_0b_0f, \quad \Psi = a_0x^m + a_1x^{m-1} + \cdots + a_m, \quad \Phi = b_0x^n + b_1x^{n-1} + \cdots + b_n.$$

Let $p$ be a prime divisor of $a_0b_0$. Then all the coefficients of $F$ are divisible by $p$, but not all those of $\Psi$ and $\Phi$.
We combine those terms of $\Psi$ and $\Phi$, respectively, whose coefficients are divisible by $p$, to form the respective polynomials $U$ and $V$, and similarly combine those terms whose coefficients are not divisible by $p$ to form the polynomials $u$ and $v$, so that $F = (U + u)(V + v)$, and consequently $uv = F - UV - Uv - Vu$. The right-hand side of this equation is a polynomial in which, according to our assumptions for $F$, $U$, and $V$, every coefficient is divisible by $p$; the left side, however, is not, since the coefficient of the highest power of the left side, being the product of two factors $a_\mu$ and $b_\nu$ that are not divisible by $p$, is also not divisible by $p$. This contradiction disappears only when $a_0 b_0$ has no prime divisor, i.e., when $a_0 = 1$ and $b_0 = 1$, in which case the $\alpha_\mu$ and $\beta_\nu$ are integers.
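Schoenemann's criterion is a purely arithmetical test on the coefficients, so it can be sketched in a few lines of Python. The function name `schoenemann_applies` and the coefficient layout are assumptions made for this illustration only. The test is sufficient but not necessary: a result of `False` means only that the criterion is silent for that prime.

```python
def schoenemann_applies(coeffs, p):
    """Check the hypotheses of Schoenemann's theorem for a monic polynomial.

    coeffs = [C_0, C_1, ..., C_{N-1}] are the integral coefficients of
    f(x) = C_0 + C_1 x + ... + C_{N-1} x^(N-1) + x^N (leading 1 omitted).
    Returns True when every C_i is divisible by p and C_0 is not
    divisible by p^2, in which case f is irreducible over the rationals.
    """
    all_divisible = all(c % p == 0 for c in coeffs)
    free_term_ok = coeffs[0] % (p * p) != 0
    return all_divisible and free_term_ok


# x^5 - 4x - 2 with p = 2: C_0 = -2, C_1 = -4, the rest are 0.
print(schoenemann_applies([-2, -4, 0, 0, 0], 2))  # True

# x^2 + 4 with p = 2: C_0 = 4 is divisible by p^2, so the test is silent
# (the polynomial is nevertheless irreducible over the rationals).
print(schoenemann_applies([4, 0], 2))  # False
```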
Abel's irreducibility theorem:* If one root of the equation $f(x) = 0$, which is irreducible in $\mathfrak{K}$, is also a root of the equation $F(x) = 0$ in $\mathfrak{K}$, then all the roots of the irreducible equation are roots of $F(x) = 0$. At the same time $F(x)$ can be divided by $f(x)$ without a remainder: $F(x) = f(x)\cdot F_1(x)$, where $F_1(x)$ is also a polynomial in $\mathfrak{K}$.

* N. H. Abel, "Mémoire sur une classe particulière d'équations résolubles algébriquement," Crelle's Journal, vol. IV, 1829.

The simple proof of this theorem is based on the familiar algorithm for finding the highest common divisor $g(x)$ of two arbitrary polynomials $F(x)$ and $f(x)$ in $\mathfrak{K}$. This algorithm leads through a chain of divisions, in which all the coefficients are $\mathfrak{K}$-numbers, to the pair of equations

$$F(x) = F_1(x)\cdot g(x), \qquad f(x) = f_1(x)\cdot g(x)$$

and to the equation

$$V(x)F(x) + v(x)f(x) = g(x),$$

where all the indicated functions are polynomials in $\mathfrak{K}$. If the prescribed functions $F$ and $f$ have no common divisor, then $g(x)$ is a constant which is for convenience set equal to 1.

If $f$ is irreducible and a root $\alpha$ of $f = 0$ is also a root of $F = 0$, then there exists a common divisor of at least the first degree ($x - \alpha$). Since $f$ is irreducible, $f_1(x)$ must equal 1 and $f(x) = g(x)$, and then $F(x)$ is thus divisible by $f(x)$ and vanishes for every zero point of $f(x)$. Q.E.D.

The fundamental theorem directly implies two important corollaries:

I. If a root of an equation $f(x) = 0$, which is irreducible in $\mathfrak{K}$, is also a root of an equation $F(x) = 0$ in $\mathfrak{K}$ of lower degree than $f$, then all the coefficients of $F$ are equal to zero.

II. If $f(x) = 0$ is an irreducible equation in a group $\mathfrak{K}$, then there is no other irreducible equation in $\mathfrak{K}$ that has a common root with $f(x) = 0$.

The commonest case of substitution in a group $\mathfrak{K}$ consists of the substitution of a root $\alpha$ of an irreducible equation of the $n$th degree $f(x) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$
into $\mathfrak{K}$. A number $\zeta$ of the group $\mathfrak{K}' = \mathfrak{K}(\alpha)$ defined by this substitution is a rational function of $\alpha$ with coefficients from $\mathfrak{K}$ and can be written $\zeta = \Psi(\alpha)/\Phi(\alpha)$, where $\Psi$ and $\Phi$ are polynomials in $\mathfrak{K}$. Since

$$\alpha^n = -a_1\alpha^{n-1} - a_2\alpha^{n-2} - \cdots - a_n,$$

every power of $\alpha$ with the exponent $n$ or with a higher exponent can be expressed by the powers $\alpha^{n-1}, \alpha^{n-2}, \ldots, \alpha$, so that we may write $\zeta = \psi(\alpha)/\varphi(\alpha)$, where $\psi$ and $\varphi$ are polynomials in $\mathfrak{K}$ of no higher than the $(n-1)$th degree. Since $f(x)$ and $\varphi(x)$ possess no common divisor, two polynomials $u(x)$ and $v(x)$ can be found (see above) in $\mathfrak{K}$, such that

$$u(x)\varphi(x) + v(x)f(x) = 1.$$

If in this equation we set $x = \alpha$, then [since $f(\alpha) = 0$] $u(\alpha)\cdot\varphi(\alpha) = 1$, i.e., $\zeta = \psi(\alpha)\cdot u(\alpha)$. We multiply this out and once again eliminate every power of $\alpha$ whose exponent is $\geq n$. This finally gives us

$$\zeta = c_0 + c_1\alpha + c_2\alpha^2 + \cdots + c_{n-1}\alpha^{n-1},$$

where the $c_\nu$ are $\mathfrak{K}$-numbers; i.e.,

III. Every number of the group $\mathfrak{K}(\alpha)$, where $\alpha$ is a root of an irreducible equation of the $n$th degree in $\mathfrak{K}$, can be represented as a polynomial of the $(n-1)$th degree of $\alpha$ with coefficients that are $\mathfrak{K}$-numbers. There is only one such possible way of representing it.

[From $C_0 + C_1\alpha + \cdots + C_{n-1}\alpha^{n-1} = c_0 + c_1\alpha + \cdots + c_{n-1}\alpha^{n-1}$ it follows that $d_0 + d_1\alpha + \cdots + d_{n-1}\alpha^{n-1} = 0$, with $d_\nu = C_\nu - c_\nu$. Then the function of the $(n-1)$th degree $d_0 + d_1 x + d_2 x^2 + \cdots + d_{n-1}x^{n-1}$ vanishes for a root of $f(x) = 0$ and, according to corollary I., must have nothing but evanescent coefficients. From $d_\nu = 0$, however, it follows that $C_\nu = c_\nu$.]

We have just seen a simple example of an irreducible function that became reducible by substitution of a root.
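The derivation of theorem III is constructive: the polynomial $u(x)$ with $u\varphi + vf = 1$ comes out of the division chain, and reduction by $f$ then gives the representation of degree at most $n-1$. The following Python sketch carries this out over the rational numbers for the assumed example $f(x) = x^3 - 2$ and $\zeta = 1/(1 + \alpha)$. All function names are hypothetical helpers written for this illustration; polynomials are lists of `Fraction` coefficients in ascending order.

```python
from fractions import Fraction


def trim(p):
    """Drop trailing zero coefficients (highest powers)."""
    p = list(p)
    while p and p[-1] == 0:
        p.pop()
    return p


def poly_add(a, b):
    n = max(len(a), len(b))
    a = list(a) + [Fraction(0)] * (n - len(a))
    b = list(b) + [Fraction(0)] * (n - len(b))
    return trim([x + y for x, y in zip(a, b)])


def poly_mul(a, b):
    if not a or not b:
        return []
    out = [Fraction(0)] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return trim(out)


def poly_divmod(a, b):
    """Quotient and remainder of a by the nonzero polynomial b."""
    a, b = trim(a), trim(b)
    q = [Fraction(0)] * max(len(a) - len(b) + 1, 1)
    r = list(a)
    while len(r) >= len(b):
        shift = len(r) - len(b)
        factor = r[-1] / b[-1]
        q[shift] = factor
        for i, c in enumerate(b):
            r[i + shift] -= factor * c
        r = trim(r)
    return trim(q), r


def inverse_mod(phi, f):
    """Return u with u*phi = 1 (mod f), assuming phi and f are coprime.

    Invariant of the division chain: r_i = u_i * phi (mod f).
    """
    r0, r1 = trim(f), trim(phi)
    u0, u1 = [], [Fraction(1)]
    while r1:
        q, r = poly_divmod(r0, r1)
        r0, r1 = r1, r
        u0, u1 = u1, poly_add(u0, poly_mul([-c for c in q], u1))
    if len(r0) != 1:
        raise ValueError("phi and f have a common divisor")
    return poly_divmod([c / r0[0] for c in u0], f)[1]


# Assumed example: alpha is a root of f(x) = x^3 - 2, and zeta = 1/(1 + alpha).
f = [Fraction(c) for c in (-2, 0, 0, 1)]
phi = [Fraction(c) for c in (1, 1)]
zeta = inverse_mod(phi, f)
print(zeta)  # coefficients c_0, c_1, c_2 of zeta = c_0 + c_1*alpha + c_2*alpha^2

# Check: zeta * (1 + alpha) reduces to 1 modulo f.
print(poly_divmod(poly_mul(zeta, phi), f)[1])
```

Here the result is $\zeta = \tfrac13 - \tfrac13\alpha + \tfrac13\alpha^2$, which agrees with $(1+\alpha)(1 - \alpha + \alpha^2) = 1 + \alpha^3 = 3$.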
Let us consider the more general case in which an irreducible function $f(x)$ in $\mathfrak{K}$ of prime number degree $p$ becomes reducible by substitution of a root $\alpha$ of an irreducible equation of the $q$th degree $g(x) = 0$ in $\mathfrak{K}$, in which, therefore, $f(x)$ can be divided into the product of the two polynomials $\psi(x, \alpha)$ and $\varphi(x, \alpha)$, which may be of the $m$th and $n$th degree of $x$, respectively.
Now the function in $\mathfrak{K}$

$$u(x) = f(r) - \psi(r, x)\,\varphi(r, x),$$

where $r$ is some rational number, vanishes for $x = \alpha$. According to the fundamental theorem of irreducible functions, $u(x)$ is then evanescent for all roots $\alpha, \alpha', \alpha'', \ldots$ of the irreducible equation $g(x) = 0$. Since, for example, the equation

$$f(x) - \psi(x, \alpha')\,\varphi(x, \alpha') = 0$$

is therefore valid for every rational $x$, it is valid for all the values of $x$, so that identically

$$f(x) = \psi(x, \alpha')\,\varphi(x, \alpha'),$$

and similarly for all other roots of $g(x) = 0$.

From the $q$ equations $f(x) = \psi(x, \alpha)\varphi(x, \alpha)$, $f(x) = \psi(x, \alpha')\varphi(x, \alpha')$, etc., thus obtained, it follows by multiplication that

$$f(x)^q = \Psi(x)\cdot\Phi(x),$$

where $\Psi(x)$ and $\Phi(x)$ are the products of the $q$ polynomials $\psi(x, \alpha), \psi(x, \alpha'), \ldots$ and $\varphi(x, \alpha), \varphi(x, \alpha'), \ldots$, respectively. Since each of these products is a symmetrical function of the roots of $g(x) = 0$, each product can be expressed rationally according to the Waring theorem by the coefficients of $g(x) = 0$ [and naturally by $x$], so that $\Psi(x)$ and $\Phi(x)$ are polynomials in $\mathfrak{K}$.

Now $\Psi(x)$ certainly vanishes for at least one root of the irreducible equation $f(x) = 0$, as does $\Phi(x)$. Consequently both $\Psi(x)$ and $\Phi(x)$ can be divided without a remainder by $f(x)$, and since $f$ is irreducible no other divisor than $f$ is possible, as a result of which

$$\Psi(x) = f(x)^{\mu}, \qquad \Phi(x) = f(x)^{\nu},$$

with $\mu + \nu = q$. Comparing the degree of the left and right sides, we obtain

$$mq = \mu p, \qquad nq = \nu p,$$

and from these, since $m$ and $n$ are smaller than $p$, it follows that $p$ is a divisor of $q$. We therefore obtain the theorem:
IV. An irreducible equation of the prime number degree $p$ in a group can become reducible through substitution of a root of another irreducible equation in this group only when $p$ is a divisor of the degree of the latter equation.

After this introduction we can turn to the proof of Abel's theorem. First, however, we will consider what is meant by an algebraically soluble equation.

An equation of the $n$th degree $f(x) = 0$ in a group $\mathfrak{R}$ is called algebraically soluble when it is soluble by a series of radicals, i.e., when a root $\omega$ can be determined in the following manner:

1. Determination of the $a$th root $\alpha = \sqrt[a]{R}$ of an $\mathfrak{R}$-number $R$, which is not, however, an $a$th power of an $\mathfrak{R}$-number, and substitution of $\alpha$ into $\mathfrak{R}$, so that the group $\mathfrak{A} = \mathfrak{R}(\alpha)$ is formed;

2. Determination of the $b$th root $\beta = \sqrt[b]{A}$ of an $\mathfrak{A}$-number $A$, which, however, is not a $b$th power of an $\mathfrak{A}$-number, and substitution of $\beta$ into $\mathfrak{A}$, so that the group $\mathfrak{B} = \mathfrak{A}(\beta) = \mathfrak{R}(\alpha, \beta)$ is formed;

3. Determination of the $c$th root $\gamma = \sqrt[c]{B}$ of a $\mathfrak{B}$-number $B$, which, however, is not a $c$th power of a $\mathfrak{B}$-number, and the substitution of $\gamma$ into $\mathfrak{B}$, so that the group $\mathfrak{C} = \mathfrak{B}(\gamma) = \mathfrak{R}(\alpha, \beta, \gamma)$ is formed, etc.,

until these successive substitutions of radicals $\alpha, \beta, \gamma, \ldots$ at length result in a group to which $\omega$, the sought-for root, belongs and in which $f(x)$ [since it possesses the divisor $x - \omega$] becomes reducible.

It is here assumed that all the radical exponents $a, b, c, \ldots$ are prime numbers. This does not represent a restriction since any extraction of roots with composite exponents can be reduced to successive extractions of roots with prime exponents (e.g., $\sqrt[6]{u} = \sqrt{v}$ with $v = \sqrt[3]{u}$).

In order to shorten our task somewhat, we will limit ourselves to equations $f(x) = 0$ which possess rational coefficients, so that $\mathfrak{R}$ is the natural rationality domain, which are, moreover, irreducible in $\mathfrak{R}$, and which are of the degree $n$, which is an odd prime number.
Let the first substitution be that of the $n$th root of unity

$$\alpha = \eta = \sqrt[n]{1} = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$$

According to IV., this substitution still does not make $f$ reducible, since $\eta$ is a root of the equation $x^{n-1} + x^{n-2} + \cdots + x + 1 = 0$, the degree of which is $< n$. Also, with each substituted radical of our series, which still does not allow division of $f(x)$, we will also substitute at the same time the
complex conjugate radical. Though this may be superfluous, it can certainly do no harm.

Let $\lambda = \sqrt[l]{K}$ be the radical the addition of which to the preceding radicals makes $f(x)$ reducible, so that $f(x)$ is still indivisible in the group $\mathfrak{K}$ (to which the number $K$ belongs), but becomes divisible in $\mathfrak{L} = \mathfrak{K}(\lambda)$:

$$f(x) = \psi(x, \lambda)\cdot\varphi(x, \lambda)\cdot\chi(x, \lambda)\cdots.$$

Here the factors $\psi, \varphi, \chi, \ldots$ are irreducible polynomials in $\mathfrak{L}$ (but naturally not polynomials in $\mathfrak{K}$) whose coefficients are polynomials of $\lambda$ in $\mathfrak{K}$. Since, according to IV., the prime number $n$ must be a divisor of the prime number $l$, $l$ must be equal to $n$. The $l$ roots of the equation $x^l = K$, which is irreducible in $\mathfrak{K}$ according to Abel's lemma, are

$$\lambda_0 = \lambda,\quad \lambda_1 = \lambda\eta,\quad \lambda_2 = \lambda\eta^2,\ \ldots,\quad \lambda_\nu = \lambda\eta^\nu,\ \ldots,\quad \lambda_{n-1} = \lambda\eta^{n-1}.$$

Since $\psi(x, \lambda)$ is a divisor of $f(x)$, then $\psi(x, \lambda_\nu)$ also goes into $f(x)$ without a remainder (cf. the proof of IV.).

Every one of the $n$ functions $\psi(x, \lambda_\nu)$ is irreducible in $\mathfrak{L}$. [As in the proof of IV., it follows from $\psi(x, \lambda_\nu) = u(x, \lambda_\nu)\cdot v(x, \lambda_\nu)$ that $\psi(x, \lambda) = u(x, \lambda)\cdot v(x, \lambda)$, but this equation is impossible because $\psi(x, \lambda)$ is irreducible in $\mathfrak{L}$.]

No two of the $n$ functions $\psi(x, \lambda_\nu)$ are equal. [In $\psi(x, \lambda\eta^\mu) = \psi(x, \lambda\eta^\nu)$, $\lambda$ could, as before, be replaced by the root $\lambda\eta^{n-\mu}$, from which it would follow that

$$\psi(x, \lambda) = \psi(x, \lambda H),$$

where $H$ represents the root of unity $\eta^{\nu-\mu}$. Here $\lambda$ could in turn be replaced by $\lambda H$, which would give $\psi(x, \lambda H) = \psi(x, \lambda H^2)$. Similarly, it would follow that $\psi(x, \lambda H^2) = \psi(x, \lambda H^3)$, etc. Thus, we would then have

$$\psi(x, \lambda) = \psi(x, \lambda H) = \psi(x, \lambda H^2) = \cdots,$$

i.e., also

$$\psi(x, \lambda) = \frac{\psi(x, \lambda) + \psi(x, \lambda H) + \cdots + \psi(x, \lambda H^{n-1})}{n}.$$
The right side of this equation, however, as a symmetrical function of the $n$ roots $\lambda, \lambda H, \lambda H^2, \ldots$ of $x^n = K$, is a polynomial of $x$ in $\mathfrak{K}$, so that $\psi(x, \lambda)$ would also be a polynomial of $x$ in $\mathfrak{K}$. This, however, contradicts what was stipulated above concerning $f(x)$.]

For these two reasons it follows that $f(x)$ is divisible by the product $\Psi(x)$ of the $n$ different factors $\psi(x, \lambda), \psi(x, \lambda\eta), \ldots, \psi(x, \lambda\eta^{n-1})$ that are irreducible in $\mathfrak{L}$:

$$f(x) = \Psi(x)\cdot U(x),$$

where $\Psi$ (as a symmetrical function of the roots of $x^n = K$), and consequently $U$ as well, are polynomials of $x$ in $\mathfrak{K}$. Now, since $f(x)$ is not reducible in $\mathfrak{K}$, $U(x)$ must equal 1 and necessarily

$$f(x) = \Psi(x) = \psi(x, \lambda)\,\psi(x, \lambda_1)\cdots\psi(x, \lambda_{n-1}).$$

The postulated divisibility of $f(x)$ for the group $\mathfrak{L}$ consequently reveals itself as a divisibility into linear factors. Thus, if $\omega, \omega_1, \omega_2, \ldots, \omega_{n-1}$ are the roots and $x - \omega, x - \omega_1, \ldots, x - \omega_{n-1}$ are the linear factors of $f(x)$, then

$$x - \omega = \psi(x, \lambda),\quad x - \omega_1 = \psi(x, \lambda\eta),\ \ldots,\quad x - \omega_{n-1} = \psi(x, \lambda\eta^{n-1}),$$

and consequently

$$\omega = K_0 + K_1\lambda + K_2\lambda^2 + \cdots + K_{n-1}\lambda^{n-1},$$
$$\omega_1 = K_0 + K_1\lambda_1 + K_2\lambda_1^2 + \cdots + K_{n-1}\lambda_1^{n-1},$$
$$\cdots$$
$$\omega_{n-1} = K_0 + K_1\lambda_{n-1} + K_2\lambda_{n-1}^2 + \cdots + K_{n-1}\lambda_{n-1}^{n-1},$$

where all the $K_\nu$ are $\mathfrak{K}$-numbers.

Now the equation $f(x) = 0$ has at least one real root, since it is of an odd degree. Let this real root be

$$\omega = K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1}.$$

We distinguish two cases: I. The base $K$ of the reducible radical $\lambda$ is real; II. the base $K$ is complex.

Case I. Here we can assume that $\lambda$ is real, since the $n$th roots of unity belong to the group $\mathfrak{K}$. In that event the complex conjugate of $\omega$ is
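$$\bar\omega = \bar K_0 + \bar K_1\lambda + \cdots + \bar K_{n-1}\lambda^{n-1},$$

where the complex conjugates $\bar K_\nu$ of $K_\nu$ are also $\mathfrak{K}$-numbers. From $\bar\omega = \omega$ it follows then that

$$(K_0 - \bar K_0) + (K_1 - \bar K_1)\lambda + \cdots + (K_{n-1} - \bar K_{n-1})\lambda^{n-1} = 0,$$

and from this, taking theorem I into consideration, it follows that $\bar K_\nu = K_\nu$ for every $\nu$. The magnitudes $K_0, K_1, \ldots, K_{n-1}$ are therefore also real. Furthermore,

$$\omega_\nu = K_0 + K_1\lambda_\nu + \cdots + K_{n-1}\lambda_\nu^{n-1}$$

and

$$\omega_{n-\nu} = K_0 + K_1\lambda_{n-\nu} + \cdots + K_{n-1}\lambda_{n-\nu}^{n-1}.$$

However, since $\lambda_\nu = \lambda\eta^\nu$ and $\lambda_{n-\nu} = \lambda\eta^{n-\nu} = \lambda\eta^{-\nu}$ are complex conjugates, it follows that $\omega_\nu$ and $\omega_{n-\nu}$ are also complex conjugates, i.e.:

The equation $f(x) = 0$ possesses one real root and $n - 1$ paired conjugate complex roots ($\omega_1$ and $\omega_{n-1}$, $\omega_2$ and $\omega_{n-2}$, etc.).

Case II. In this case we substitute, in addition to the reducible radical $\lambda = \sqrt[n]{K}$, the complex conjugate $\bar\lambda = \sqrt[n]{\bar K}$, with the result that the real magnitude $\Lambda = \lambda\bar\lambda$ is also substituted. If the substitution of $\Lambda = \sqrt[n]{K\bar K}$ alone (i.e., without $\lambda$) were sufficient to make $f(x)$ reducible, this would give us the situation of Case I. We may therefore assume that $f(x)$ is still irreducible in $\mathfrak{K}(\Lambda)$ and does not become reducible until the additional substitution of $\lambda$. From

$$\omega = K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1}$$

it follows that

$$\bar\omega = \bar K_0 + \bar K_1\bar\lambda + \cdots + \bar K_{n-1}\bar\lambda^{n-1}$$

and from this, since $\bar\omega = \omega$ and $\bar\lambda = \Lambda/\lambda$, that

$$K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1} = \bar K_0 + \bar K_1\frac{\Lambda}{\lambda} + \cdots + \bar K_{n-1}\left(\frac{\Lambda}{\lambda}\right)^{n-1}.$$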
In this equation all of the magnitudes with the exception of $\lambda$ belong to the group $\mathfrak{K}(\Lambda)$, and since the equation $x^n = K$ (according to Abel's lemma) is irreducible in this group, we are able to replace $\lambda$ in the above equation by any root $\lambda_\nu$ of $x^n = K$. If we do this and keep in mind that

$$\frac{\Lambda}{\lambda_\nu} = \frac{\Lambda}{\lambda\eta^\nu} = \bar\lambda\,\bar\eta^{\,\nu} = \bar\lambda_\nu,$$

we obtain

$$K_0 + K_1\lambda_\nu + \cdots + K_{n-1}\lambda_\nu^{n-1} = \bar K_0 + \bar K_1\bar\lambda_\nu + \cdots + \bar K_{n-1}\bar\lambda_\nu^{n-1}$$

or $\omega_\nu = \bar\omega_\nu$. Thus, all the roots of $f(x) = 0$ are real.

The combination of the results of I. and II. yields the Kronecker* theorem: An algebraically soluble equation of an odd degree that is a prime and which is irreducible in the natural rationality domain possesses either only one real root or only real roots.

Kronecker's theorem proves at the same time that an equation of higher than the fourth degree cannot be solved generally by algebraic means. The simple fifth-degree equation

$$x^5 - ax - b = 0,$$

for example, cannot be solved algebraically when $a$ and $b$ are positive integers that are divisible by a prime number $p$, $b$ is indivisible by $p^2$, and when $4^4 a^5 > 5^5 b^4$.

According to Schoenemann's theorem the equation is irreducible. Sturm's theorem (No. 24) proves that it possesses three real roots and two complex roots. Consequently, the equation is algebraically insoluble according to Kronecker's theorem.

In exactly the same way it can be shown that $x^7 - ax - b = 0$ is algebraically insoluble when $6^6 a^7 > 7^7 b^6$, etc.

* Leopold Kronecker (1823-1891), a German mathematician.
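The root count quoted from Sturm's theorem can also be seen from the two turning points of $x^n - ax - b$, which lie at $x = \pm c$ with $c = (a/n)^{1/(n-1)}$. The Python sketch below takes this elementary route (it is not Sturm's method) and compares the result with the inequality $(n-1)^{n-1}a^n > n^n b^{n-1}$, which specializes to $4^4 a^5 > 5^5 b^4$ for $n = 5$. The function names and the sample values $a = 4$, $b = 2$ are assumptions chosen for illustration.

```python
def three_real_roots_criterion(n, a, b):
    """Integer form of the condition quoted in the text, for odd n.

    For n = 5 this is 4^4 a^5 > 5^5 b^4; for n = 7 it is 6^6 a^7 > 7^7 b^6.
    """
    return (n - 1) ** (n - 1) * a ** n > n ** n * b ** (n - 1)


def real_root_count(n, a, b):
    """Count real roots of x^n - a*x - b for odd n and positive a, b.

    f'(x) = n x^(n-1) - a vanishes only at x = -c and x = +c, so f is
    monotonic on each of the three intervals between -inf, -c, +c, +inf.
    Counting sign changes across these four points counts the real roots
    (the degenerate case f(-c) = 0 or f(c) = 0 is not handled).
    """
    def f(x):
        return x ** n - a * x - b

    c = (a / n) ** (1.0 / (n - 1))
    signs = [-1, 1 if f(-c) > 0 else -1, 1 if f(c) > 0 else -1, 1]
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)


# x^5 - 4x - 2: irreducible by Schoenemann's theorem with p = 2.
print(three_real_roots_criterion(5, 4, 2), real_root_count(5, 4, 2))

# x^5 - 2x - 2: also irreducible, but the inequality fails.
print(three_real_roots_criterion(5, 2, 2), real_root_count(5, 2, 2))
```

For $x^5 - 4x - 2$ the inequality holds and the count is three, so by Kronecker's theorem the equation is not algebraically soluble. For $x^5 - 2x - 2$ there is only one real root and Kronecker's theorem decides nothing.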
26 The Hermite-Lindemann Transcendence Theorem

The expression

$$A_1 e^{\alpha_1} + A_2 e^{\alpha_2} + A_3 e^{\alpha_3} + \cdots,$$

in which the coefficients $A$ are algebraic numbers differing from zero and in which the exponents $\alpha$ are algebraic numbers differing from each other, cannot equal zero.

This extremely important theorem (see below) was proved in 1882 by the German mathematician Lindemann (in the Berliner Sitzungsberichte) after the French mathematician Hermite (1822-1901), in vol. 77 of the Comptes rendus in 1873, had proved the special case in which the coefficients and exponents were rational integers. Lindemann's proof, which required a great many higher mathematical tools, was simplified to such an extent, first (1885, Berliner Sitzungsberichte) by K. Weierstrass (1815-1897), then (1893, Mathematische Annalen, vol. 43) by P. Gordan (1837-1912), that the proof is now generally accessible. The proof is presented here essentially in the form given to it in his textbook of algebra by H. Weber (1842-1913).

The proof is indirect. We assume that there are $l$ algebraic numbers $A_1, A_2, \ldots, A_l$ and $l$ algebraic numbers $\alpha_1, \alpha_2, \ldots, \alpha_l$ differing from one another that satisfy the equation

$$(1)\qquad A_1 e^{\alpha_1} + A_2 e^{\alpha_2} + \cdots + A_l e^{\alpha_l} = 0,$$

and we show that this assumption leads to a contradiction.

The demonstration is divided into four steps.

I. We consider the coefficients $A$ as roots of an algebraic equation $\mathfrak{A}(x) = 0$ with rational coefficients the degree of which, $L$, will generally be greater than $l$. Let the roots of this equation be $A_1, A_2, \ldots, A_l, \ldots, A_L$. We form all the possible $l$-termed expressions

$$A_r e^{\alpha_1} + A_s e^{\alpha_2} + \cdots$$

[totaling $L(L-1)(L-2)\cdots(L-l+1)$ elements], where $A_r, A_s, \ldots$ are any $l$ components of the series $A_1, A_2, \ldots, A_L$, and we multiply these expressions together, always combining the members with the same exponential factor $e^{x}$. The resulting product has the form

$$\Pi' = A'_1 e^{\beta_1} + A'_2 e^{\beta_2} + \cdots + A'_m e^{\beta_m},$$

where the $A'$ are nonevanescent magnitudes.
[That the coefficients A' obtained by multiplying out and combining cannot all vanish is proved in the following manner. We call the first of the two complex numbers x + iy and X + iY the "smaller"
when either $x < X$ or $x = X$ if $y$ is at the same time $< Y$. Now the product $\Pi'$ consists only of factors of the form

$$F_\nu = P_\nu e^{p_\nu} + Q_\nu e^{q_\nu} + R_\nu e^{r_\nu} + \cdots,$$

where none of the coefficients $P$, $Q$, $R$ vanishes, and we can consider the terms as being arranged in such a manner that $p_\nu < q_\nu < r_\nu < \cdots$. On multiplying the factors $F_\nu$ the exponent $p_1 + p_2 + p_3 + \cdots$ of the first term obtained is then the smallest of all the exponents obtained and occurs only once. Consequently, at the very least the first term of the multiplied-out product differs from zero, which was what we set out to prove.]

The coefficients $A'$ are not changed by transpositions of the magnitudes $A_1, A_2, \ldots, A_L$; in other words, they are symmetrical functions of the roots of $\mathfrak{A}(x) = 0$, and, therefore, according to the principal theorem concerning symmetrical functions, are rational numbers.

Since the left side of (1) is also among the factors of $\Pi'$, $\Pi' = 0$. We multiply this equation by the common denominator of the $A'$ and obtain the new equation

$$(2)\qquad B_1 e^{\beta_1} + B_2 e^{\beta_2} + \cdots + B_m e^{\beta_m} = 0,$$

where the $\beta$ are different algebraic numbers and the coefficients $B$ are nonevanescent rational integers.

II. Let us consider the exponents $\beta$ as roots of an algebraic equation $\mathfrak{B}(x) = 0$ with rational coefficients of degree $M$, with $M$ generally greater than $m$, and let us in the usual way think of the equation as being free of identical roots. We form the $M(M-1)(M-2)\cdots(M-m+1)$ $m$-termed sums

$$B_1 e^{v\beta_r} + B_2 e^{v\beta_s} + \cdots,$$

where $v$ is a variable and $\beta_r, \beta_s, \ldots$ are any $m$ roots of $\mathfrak{B}(x) = 0$, and multiply these sums by each other, once again combining terms with the same exponential factor. The resulting product has the form

$$\Pi = C_1 e^{\gamma_1 v} + C_2 e^{\gamma_2 v} + \cdots + C_n e^{\gamma_n v},$$

where the coefficients $C$ are nonevanescent rational integers and the $\gamma$ represent different algebraic numbers.

The product $\Pi$ is a symmetrical function of the roots of $\mathfrak{B}(x) = 0$. Consequently, the coefficients of the expansion of $\Pi$ according to the
powers of $v$ are also symmetrical functions of those roots; thus, for example, the coefficient $k_\nu$ of $v^\nu$:

$$k_\nu = \frac{C_1\gamma_1^\nu + C_2\gamma_2^\nu + \cdots + C_n\gamma_n^\nu}{\nu!}.$$

Every coefficient $k_\nu$ is therefore a rational number. Accordingly, if $g(x)$ is an integral rational function of $x$ with coefficients that are rational integers, the sum $\sum_{s=1}^{n} C_s\,g(\gamma_s)$ is rationally composed of the coefficients $k_\nu$ and is consequently a rational number.

Now since the product $\Pi$ for $v = 1$ contains the factor $B_1 e^{\beta_1} + B_2 e^{\beta_2} + \cdots + B_m e^{\beta_m}$, which is equal to zero according to (2), the product for $v = 1$ is also equal to zero, and we obtain the equation

$$(3)\qquad C_1 e^{\gamma_1} + C_2 e^{\gamma_2} + \cdots + C_n e^{\gamma_n} = 0,$$

in addition to which for every integral rational function $g(x)$ with integral rational coefficients

$$(3a)\qquad C_1 g(\gamma_1) + C_2 g(\gamma_2) + \cdots + C_n g(\gamma_n)$$

is a rational number.

III. We consider the exponents $\gamma_1, \gamma_2, \ldots, \gamma_n$ as roots of an algebraic equation

$$x^N + r_1 x^{N-1} + r_2 x^{N-2} + \cdots + r_N = 0$$

with rational coefficients of degree $N \geq n$, possessing no identical roots. We multiply this equation by the $N$th power of the common denominator $H$ of the coefficients $r_1, r_2, \ldots$ and obtain

$$(Hx)^N + Hr_1(Hx)^{N-1} + H^2 r_2(Hx)^{N-2} + \cdots = 0$$

or, if we write $X$ instead of $Hx$ and call the integers $Hr_1, H^2 r_2, H^3 r_3, \ldots$ respectively $g_1, g_2, g_3, \ldots$,

$$f(X) = X^N + g_1 X^{N-1} + g_2 X^{N-2} + \cdots + g_N = 0.$$

If $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ are the roots of this equation, then

$$f(X) = (X - \Gamma_1)(X - \Gamma_2)\cdots(X - \Gamma_N).$$

Among the roots $\Gamma$ are the $n$ values $\Gamma_1 = H\gamma_1$, $\Gamma_2 = H\gamma_2$, ..., $\Gamma_n = H\gamma_n$.
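Since the $\Gamma$ represent integral algebraic numbers, then, as a result of (3a), for every integral rational function $g$ with integral rational coefficients

$$(3b)\qquad C_1 g(\Gamma_1) + C_2 g(\Gamma_2) + \cdots + C_n g(\Gamma_n)$$

is a rational integer.

Besides $f(X)$ we will consider the function

$$\varphi(X) = \frac{f(X)}{X - \Gamma_1} + \frac{f(X)}{X - \Gamma_2} + \cdots + \frac{f(X)}{X - \Gamma_N}$$
$$= (X - \Gamma_2)(X - \Gamma_3)\cdots(X - \Gamma_N) + (X - \Gamma_1)(X - \Gamma_3)(X - \Gamma_4)\cdots(X - \Gamma_N) + \cdots$$
$$= NX^{N-1} + N_1 X^{N-2} + \cdots,$$

which is not evanescent for any of the values $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$, and the coefficients of which $N, N_1, \ldots$ (as symmetrical functions of the roots $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ of $f(X) = 0$) are rational integers.

If the sum

$$C_1\varphi(\Gamma_1) + C_2\varphi(\Gamma_2) + \cdots + C_n\varphi(\Gamma_n)$$

should by chance equal zero, we select the positive integral exponent $h$ ($< n$) in such a manner that the (integral) sum

$$G = C_1\Gamma_1^h\varphi(\Gamma_1) + C_2\Gamma_2^h\varphi(\Gamma_2) + \cdots + C_n\Gamma_n^h\varphi(\Gamma_n) \neq 0.$$

[Such an exponent must exist, because otherwise the $n$ linear homogeneous equations

$$1\cdot x_1 + 1\cdot x_2 + \cdots + 1\cdot x_n = 0,$$
$$\Gamma_1 x_1 + \Gamma_2 x_2 + \cdots + \Gamma_n x_n = 0,$$
$$\Gamma_1^2 x_1 + \Gamma_2^2 x_2 + \cdots + \Gamma_n^2 x_n = 0,$$
$$\cdots$$
$$\Gamma_1^{n-1} x_1 + \Gamma_2^{n-1} x_2 + \cdots + \Gamma_n^{n-1} x_n = 0$$

would exist for the $n$ nonevanescent "unknowns" $x_1 = C_1\varphi(\Gamma_1), \ldots, x_n = C_n\varphi(\Gamma_n)$. This, however, is impossible, since then the determinant

$$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ \Gamma_1 & \Gamma_2 & \cdots & \Gamma_n \\ \Gamma_1^2 & \Gamma_2^2 & \cdots & \Gamma_n^2 \\ \vdots & \vdots & & \vdots \\ \Gamma_1^{n-1} & \Gamma_2^{n-1} & \cdots & \Gamma_n^{n-1} \end{vmatrix}$$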
of the equation system would have to disappear; however, this determinant represents the product of all the differences $\Gamma_r - \Gamma_s$, in which $r > s$, and, in accordance with the above, none of which disappear.]

IV. Now we put the fundamental property of the exponential function (the series expansion for $e^x$) into the form most suited for our proof. This is

$$e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^\nu}{\nu!} + \cdots.$$

We multiply this equation by $H^\nu\nu!$ and obtain ($Hx = X$)

$$e^x\,\nu!\,H^\nu = H^\nu\nu! + \nu H^{\nu-1}(\nu-1)!\,X + \binom{\nu}{2}H^{\nu-2}(\nu-2)!\,X^2 + \cdots + X^\nu + X^\nu\left[\frac{x}{\nu+1} + \frac{x^2}{(\nu+1)(\nu+2)} + \cdots\right].$$

In order to write this formula more conveniently, we introduce the symbol $\mathfrak{C}$, which will be defined by the following direction: A function $F(\mathfrak{C})$ shall be considered the expression obtained when $F(\mathfrak{C})$, on the assumption that $\mathfrak{C}$ is a number, is transformed in the usual way into a power series of $\mathfrak{C}$ and $\mathfrak{C}^\nu$ is replaced by $\nu!H^\nu$ at the end of expansion. Our formula can then be written in the simple form:

$$e^x\mathfrak{C}^\nu = (\mathfrak{C} + X)^\nu + X^\nu\cdot[\ \ ].$$

If we then designate the absolute magnitude of $x$ as $\xi$, the absolute magnitude of $[\ \ ]$ is not greater than

$$\frac{\xi}{\nu+1} + \frac{\xi^2}{(\nu+1)(\nu+2)} + \cdots$$

and therefore certainly smaller than

$$1 + \xi + \frac{\xi^2}{2!} + \cdots = e^{\xi}.$$

If $\epsilon$ is understood to be a magnitude the absolute value of which is a proper fraction, we therefore obtain

$$(4)\qquad e^x\mathfrak{C}^\nu = (X + \mathfrak{C})^\nu + \epsilon\,e^{\xi}X^\nu.$$

We will immediately extend this somewhat further. Let

$$V(X) = X^k + K_1 X^{k-1} + K_2 X^{k-2} + \cdots + K_k$$
represent an integral rational function of $X$ with integral rational coefficients. We form (4) for $\nu = k, k-1, k-2, \ldots$, multiply the resulting equations by $1, K_1, K_2, \ldots$, and add. This gives us

$$(5)\qquad e^x V(\mathfrak{C}) = V(X + \mathfrak{C}) + e^{\xi}\,\bar V(X),$$

with

$$(5a)\qquad \bar V(X) = \epsilon_0 X^k + \epsilon_1 K_1 X^{k-1} + \epsilon_2 K_2 X^{k-2} + \cdots,$$

where the absolute values of the magnitudes $\epsilon_\kappa$ are proper fractions. If $\Lambda_1, \Lambda_2, \ldots$ represent the roots of $V(X) = 0$ and $d$ represents the greatest of the $k$ values $|X| + |\Lambda_\kappa|$, it follows from

$$V(X) = (X - \Lambda_1)(X - \Lambda_2)\cdots$$

that the absolute magnitude of $\bar V(X)$ [like that of $V(X)$] is smaller than $d^k$:

$$(5b)\qquad |\bar V(X)| < d^k.$$

We apply the results (5), (5a), (5b) to the function

$$V(X) = F(X)^q\,\Phi(X),$$

in which

$$F(X) = X^h f(X), \qquad \Phi(X) = X^h\varphi(X),$$

$q = p - 1$, and $p$ is a preliminarily selected, still undetermined prime number. Since the degree of $F(X)$ is $h + N$, and the degree of $\Phi(X)$ is $h + N - 1$, $V(X)$ is of the degree $k = (h + N)q + h + N - 1$. Equation (5) is now transformed into

$$e^x V(\mathfrak{C}) = V(X + \mathfrak{C}) + \epsilon\,e^{\xi}d^k,$$

where $d$ is the greatest of the $k$ values $|X| + |\Lambda_\kappa|$ and $\epsilon$ is a number whose absolute magnitude is a proper fraction.

We now choose for $x$ and $X$ the values $\gamma_\nu$ and $\Gamma_\nu$, respectively ($\nu$ is any one of the numbers 1 to $n$). Then $\xi$ is the absolute magnitude $\xi_\nu$ of $\gamma_\nu$ and $d = d_\nu$ is the greatest of the $k$ sums $|\Gamma_\nu| + |\Lambda_\kappa|$. If $D$ then represents the greatest of the $2n$ numbers $d_\nu^{N+h}$ and $e^{\xi_\nu}d_\nu^{N+h-1}$, then the improper fraction $D/d_\nu^{N+h}$ is $\geq e^{\xi_\nu}d_\nu^{-1}$, and consequently

$$\left(D/d_\nu^{N+h}\right)^p \geq e^{\xi_\nu}d_\nu^{-1} \qquad\text{or}\qquad D^p \geq e^{\xi_\nu}d_\nu^{k}$$

must be true [since $k = (N+h)p - 1$], and we obtain the somewhat simpler formula

$$(6)\qquad e^{\gamma_\nu}V(\mathfrak{C}) = V(\Gamma_\nu + \mathfrak{C}) + \eta_\nu D^p,$$

where $|\eta_\nu| < 1$.
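The expansion of $V(\Gamma_\nu + \mathfrak{C})$ according to the powers of $\mathfrak{C}$ gives us

$$V(\Gamma_\nu + \mathfrak{C}) = \psi_0\mathfrak{C}^q + \psi_1\mathfrak{C}^{q+1} + \psi_2\mathfrak{C}^{q+2} + \cdots,$$

where the coefficients $\psi$ are integral rational functions of $\Gamma_\nu$ with integral rational coefficients. In particular,

$$\psi_0 = \psi_0(\Gamma_\nu) = \Phi(\Gamma_\nu)^p.$$

[For $\nu = 1$, for example,

$$F(\Gamma_1 + \mathfrak{C}) = (\Gamma_1 + \mathfrak{C})^h\left[\mathfrak{C}\cdot(\mathfrak{C} + \Gamma_1 - \Gamma_2)\cdot(\mathfrak{C} + \Gamma_1 - \Gamma_3)\cdots\right]$$
$$= \Gamma_1^h(\Gamma_1 - \Gamma_2)(\Gamma_1 - \Gamma_3)\cdots(\Gamma_1 - \Gamma_N)\cdot\mathfrak{C} + \cdots = \Gamma_1^h\varphi(\Gamma_1)\cdot\mathfrak{C} + \cdots$$

and

$$\Phi(\Gamma_1 + \mathfrak{C}) = (\Gamma_1 + \mathfrak{C})^h\varphi(\Gamma_1 + \mathfrak{C}) = \Gamma_1^h\varphi(\Gamma_1) + \cdots,$$

consequently

$$V(\Gamma_1 + \mathfrak{C}) = \Gamma_1^{hp}\varphi(\Gamma_1)^p\cdot\mathfrak{C}^q + \cdots = \Phi(\Gamma_1)^p\cdot\mathfrak{C}^q + \cdots.]$$

If we introduce this expansion into (6), we finally obtain

$$e^{\gamma_\nu}V(\mathfrak{C}) = \psi_0(\Gamma_\nu)\mathfrak{C}^q + \psi_1(\Gamma_\nu)\mathfrak{C}^p + \psi_2(\Gamma_\nu)\mathfrak{C}^{p+1} + \cdots + \eta_\nu D^p.$$

This formula, multiplied by $C_\nu$, we then form for all $\nu$ from 1 through $n$, and we add the resulting $n$ equations. According to (3), we then obtain

$$(7)\qquad 0 = G_0\mathfrak{C}^q + G_1\mathfrak{C}^p + G_2\mathfrak{C}^{p+1} + \cdots + G_{k-q}\mathfrak{C}^k + \Delta D^p,$$

where

$$G_r = C_1\psi_r(\Gamma_1) + C_2\psi_r(\Gamma_2) + \cdots + C_n\psi_r(\Gamma_n)$$

is, according to (3b), a rational integer and $\Delta$ is a number the absolute magnitude of which does not exceed the $n$-fold value of the maximum $|C|$-value. We now replace $\mathfrak{C}^r$ with $H^r r!$, divide (7) by the then universally common factor $H^q$, abbreviate $D/H$ as $E$, and combine all the terms containing the factor $p!$, and we obtain

$$(8)\qquad G_0\,q! + G'\,p! = \Lambda E^p,$$

where $G'$ is an integer and $\Lambda = -H\Delta$ (since $D^p/H^q = HE^p$), a number whose absolute magnitude is bounded independently of $p$.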
Now we compare

$$G_0 = C_1\Phi(\Gamma_1)^p + C_2\Phi(\Gamma_2)^p + \cdots + C_n\Phi(\Gamma_n)^p$$

with

$$G = C_1\Phi(\Gamma_1) + C_2\Phi(\Gamma_2) + \cdots + C_n\Phi(\Gamma_n),$$

the latter of which, according to our assumption concerning $h$, differs from zero. If we expand $G^p$ according to the polynomial theorem, every term of the expansion, with the exception of the $n$ terms $C_\nu^p\Phi(\Gamma_\nu)^p$, is the $p$-multiple of an integral algebraic number, and, therefore,

$$(9)\qquad G^p = \left[C_1^p\Phi(\Gamma_1)^p + \cdots + C_n^p\Phi(\Gamma_n)^p\right] + \mu p,$$

where $\mu$ is an integral algebraic number (which is, in fact, integral and rational). Now according to Fermat's theorem* every difference $C_\nu^p - C_\nu$, as well as $G^p - G$, is an integral multiple $c_\nu p$ and $gp$, respectively, of $p$. Accordingly, (9) is transformed into

$$G + gp = (C_1 + c_1 p)\Phi(\Gamma_1)^p + \cdots + (C_n + c_n p)\Phi(\Gamma_n)^p + \mu p$$
$$= C_1\Phi(\Gamma_1)^p + \cdots + C_n\Phi(\Gamma_n)^p + \mu' p = G_0 + \mu' p,$$

where $\mu'$ is also integral and algebraic. This equation simplifies into

$$G_0 = G + g'p,$$

where $g' = g - \mu'$ is an integral algebraic number, and is also an integral rational number, as a result of $g' = (G_0 - G)/p$. If we introduce this value into (8), we obtain

$$G\,q! + g'\,p! + G'\,p! = \Lambda E^p$$

or, if the integer $G' + g'$ is designated as $\mathfrak{G}$,

$$(10)\qquad G + \mathfrak{G}p = \Lambda\frac{E^p}{q!}.$$

We now choose a prime number $p$ so large that (1) $p > |G|$ and (2) the absolute magnitude of the right side of (10) is smaller than 1.

* Fermat's theorem: For every integer $g$ and every prime number $p$ the difference $g^p - g$ is divisible by $p$. Proof. The theorem is self-evident if $g$ is divisible by $p$. For every $g$ that is indivisible by $p$ the theorem follows directly from the congruences (1a) and (2a) of No. 19, if $g$ is substituted for $D$ there and the congruences are squared. In both cases $g^{p-1} \equiv 1 \bmod p$ is obtained, and from this $g^p \equiv g \bmod p$.
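The final choice of $p$ rests on two elementary facts: Fermat's congruence $g^p \equiv g \pmod p$, and the fact that $E^p/q!$ with $q = p - 1$ tends to zero for fixed $E$ as $p$ grows. Both can be checked numerically with the Python sketch below. The value $E = 10$ and the function names are arbitrary choices for illustration, and logarithms are used only to avoid overflow.

```python
import math


def fermat_holds(g, p):
    """Check that g^p - g is divisible by the prime p."""
    return (pow(g, p, p) - g) % p == 0


def log_bound(E, p):
    """Natural logarithm of E^p / (p - 1)!, i.e. of E^p / q! with q = p - 1."""
    return p * math.log(E) - math.lgamma(p)  # lgamma(p) = log((p - 1)!)


# Fermat's theorem for a few integers (including a negative one) and primes.
print(all(fermat_holds(g, p) for g in (-3, 2, 10, 12345) for p in (2, 3, 7, 101)))

# For fixed E the quantity E^p / q! eventually drops below 1 and keeps shrinking.
for p in (7, 31, 101):
    print(p, log_bound(10.0, p) < 0)
```

With $E = 10$ the bound is still far above 1 at $p = 7$ but is below 1 at $p = 31$, which illustrates why a sufficiently large prime makes the right side of (10) smaller than 1.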
Equation (10) then contains a contradiction. On the left side of the equation there is an integer that is indivisible by $p$ (because $G \neq 0$ and $|G| < p$) and is thus not equal to zero, while on the right there is a number whose absolute magnitude is less than 1. This is impossible. Consequently, the initial equation (1) is also impossible and Lindemann's theorem is proved.

The inferences that can be drawn from Lindemann's theorem are amazing. Here we present only a few:

1. The transcendence of $e$: The Euler number $e$ is transcendent, i.e., it is not an algebraic number. (In other words, it cannot be a root of an algebraic equation with rational coefficients.)

2. The transcendence of $\pi$: The Archimedes (Ludolph) number $\pi$ is transcendent.

According to Euler (No. 13), there exists the equation $e^{i\pi} + 1 = 0$. According to Lindemann's theorem the exponent $i\pi$ cannot, therefore, be an algebraic number. Consequently, it is also impossible for $\pi$ to be an algebraic number. (If $\pi$ were algebraic, then the product of the two algebraic numbers $i$ and $\pi$ would have to be algebraic.)

Thus, the ancient question of squaring the circle is answered, though the answer is negative: It is impossible to draw with a compass and straight-edge a square that is equal in area to a given circle.

If, for example, we choose the radius of the given circle in such a manner that it is equal to the unit length, the area of the circle is $\pi$ and the desired side of the square $\sqrt{\pi}$. If, however, $\sqrt{\pi}$ could be drawn with compass and straight-edge, then the square $\pi$ of this segment could also be constructed, and, according to No. 36, $\pi$ would have to be the root of an algebraic equation with rational coefficients (whose degree would be a power of 2). However, $\pi$ is transcendent.

3. The exponential curve $y = e^x$ passes through no algebraic point of the plane except the point $0|1$. (An algebraic point is a point whose coordinates $x$ and $y$ are both algebraic numbers.)
Since algebraic points are omnipresent in densely concentrated quantities within the plane, the exponential curve accomplishes the remarkably difficult feat of winding between all these points without touching any of them. The same is, naturally, also true of the logarithmic curve $y = \ln x$.
4. The sine curve $y = \sin x$ also passes through no algebraic points of the plane except the lattice point $0|0$.

If, for example, $\alpha|\beta$ were an algebraic point situated on the sine curve, $\beta$ would be equal to $\sin\alpha$ or, since $2i\sin\alpha = e^{i\alpha} - e^{-i\alpha}$,

$$e^{i\alpha} - e^{-i\alpha} - 2i\beta\,e^{0} = 0.$$

However, according to Lindemann's theorem, this equation cannot exist for algebraic numbers $\alpha$, $\beta$.
Planimetric Problems
27 Euler's Straight Line

In all triangles the center of the circumscribed circle, the point of intersection of the medians, and the point of intersection of the altitudes are situated in this order in a straight line (the Euler line) and are spaced in such a manner that the altitude intersection is twice as far from the median intersection as the center of the circumscribed circle is.

Leonhard Euler (1707-1783) was one of the greatest and most fertile mathematicians of all time. His writings comprise 45 volumes and over 700 papers, most of them long ones, published in periodicals. The above theorem is among the results of the paper "Solutio facilis problematum quorundam geometricorum difficillimorum," which appeared in the journal Novi commentarii Academiae Petropolitanae (ad annum 1765).

The following proof of the Euler theorem is distinguished by its great simplicity.

In the triangle $ABC$ let $M$ be the midpoint of side $AB$, $S$ the median intersection, which lies on $CM$, so that

$$(1)\qquad SC = 2\cdot SM,$$

and $U$ the center of the circle of circumscription, lying on the perpendicular bisector of $AB$.

We extend $US$ by $SO$ so that

$$(2)\qquad SO = 2\cdot SU,$$

and join $O$ to $C$. According to (1) and (2) the triangles $MUS$ and $COS$ are similar. Consequently, $CO \parallel MU$, i.e., $CO \perp AB$, or expressed verbally, the line connecting the point $O$ with a vertex of the triangle is perpendicular to the side of the triangle opposite the vertex; consequently, the connecting line is an altitude of the triangle. The three altitudes consequently pass through point $O$. This is, therefore, the altitude intersection, and Euler's theorem is proved.

Note. Our proof contains at the same time the solution to the interesting
Problem of Sylvester: To find the resultant of the three vectors $UA$, $UB$, $UC$ acting on the center of the circle of circumscription $U$ of the triangle $ABC$.

[Fig. 5]

Since $UM$ is half the resultant of the two vectors $UA$ and $UB$, $CO$ represents in magnitude and direction the whole resultant of these vectors. Now, since $UO$ is the resultant of $UC$ and $CO$, $UO$ is the resultant we are seeking.

The resultant of the vectors represented by the three radii from the center of the circle of circumscription to the vertexes of the triangle is the segment extending from the center of the circle of circumscription to the altitude intersection.

James Joseph Sylvester (1814-1897) was an English jurist and mathematician.

28 The Feuerbach Circle

In every triangle the three midpoints of the sides, the three base points of the altitudes, and the midpoints of the three altitude sections touching the vertexes lie on a circle.

This circle was already known to Euler (1765), but is most commonly called the Feuerbach circle after Karl Feuerbach (1800-1834) [the uncle of the painter Anselm Feuerbach], who rediscovered it in 1822. It is also known as the nine-point circle, although it passes through many other significant points as well as those indicated above.
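The three statements just made (the Euler line relation $SO = 2\cdot SU$, Sylvester's resultant $UA + UB + UC = UO$, and the nine points lying on one circle) can be checked with exact rational arithmetic. The Python sketch below does this for one arbitrarily chosen triangle. The helper names and the sample coordinates are assumptions for illustration, and the orthocenter is computed independently as the intersection of two altitudes rather than from the Euler relation.

```python
from fractions import Fraction as Fr


def circumcenter(A, B, C):
    (ax, ay), (bx, by), (cx, cy) = A, B, C
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy)


def orthocenter(A, B, C):
    # Solve (O - A).(C - B) = 0 and (O - B).(C - A) = 0 by Cramer's rule.
    a1, b1 = C[0] - B[0], C[1] - B[1]
    a2, b2 = C[0] - A[0], C[1] - A[1]
    r1 = A[0] * a1 + A[1] * b1
    r2 = B[0] * a2 + B[1] * b2
    det = a1 * b2 - a2 * b1
    return ((r1 * b2 - r2 * b1) / det, (a1 * r2 - a2 * r1) / det)


def foot(P, Q, R):
    """Foot of the perpendicular from P onto the line QR."""
    dx, dy = R[0] - Q[0], R[1] - Q[1]
    t = ((P[0] - Q[0]) * dx + (P[1] - Q[1]) * dy) / (dx * dx + dy * dy)
    return (Q[0] + t * dx, Q[1] + t * dy)


def mid(P, Q):
    return ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)


def dist2(P, Q):
    return (P[0] - Q[0]) ** 2 + (P[1] - Q[1]) ** 2


def check_triangle(A, B, C):
    U = circumcenter(A, B, C)
    S = ((A[0] + B[0] + C[0]) / 3, (A[1] + B[1] + C[1]) / 3)
    O = orthocenter(A, B, C)
    # Euler line: the vector SO equals twice the vector US.
    euler = (O[0] - S[0], O[1] - S[1]) == (2 * (S[0] - U[0]), 2 * (S[1] - U[1]))
    # Sylvester: UA + UB + UC = UO.
    res = (A[0] + B[0] + C[0] - 3 * U[0], A[1] + B[1] + C[1] - 3 * U[1])
    sylvester = res == (O[0] - U[0], O[1] - U[1])
    # Nine-point circle: the circle through the side midpoints.
    mids = [mid(B, C), mid(C, A), mid(A, B)]
    F = circumcenter(*mids)
    r2 = dist2(F, mids[0])
    nine = mids + [foot(A, B, C), foot(B, C, A), foot(C, A, B)]
    nine += [mid(O, A), mid(O, B), mid(O, C)]
    feuerbach = all(dist2(F, P) == r2 for P in nine)
    return euler, sylvester, feuerbach


A, B, C = (Fr(0), Fr(0)), (Fr(4), Fr(0)), (Fr(1), Fr(3))
print(check_triangle(A, B, C))
```

Using `Fraction` coordinates keeps every comparison exact, so the equalities are tested without floating-point tolerances. The sketch is a numerical check for one triangle, not a substitute for the proofs in the text.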
The proof consists of two steps. In the first we demonstrate that the circle circumscribing the triangle of the three midpoints of the sides passes through the base points of the altitudes; and in the second we show that the circle circumscribing the triangle of the altitude base points passes through the midpoints of altitude sections.

I. Let ABC represent the prescribed triangle, A', B', C' the midpoints, respectively, of sides BC, CA, AB. Let H be the base point of the altitude AH. Then the trapezoid HA'B'C' is isosceles. (A'B', as a midline of the triangle ABC, is equal to ½·AB; HC', as the radius of the Thales circle having the diameter AB, is also equal to ½·AB.) The trapezoid is therefore a quadrilateral inscribed in a circle. All of the altitude base points consequently lie on the circle $\mathfrak{F}$ circumscribing the triangle A'B'C'.

Fig. 7. Fig. 8.

II. Let the altitudes of the triangle ABC be AH, BK, CL, and O their point of intersection. We will now show that the center of each altitude section touching a vertex, let us say section OC, also lies on $\mathfrak{F}$. For this purpose we consider the triangle OBC, which also has the altitude bases H, K, L. According to I., the circle $\mathfrak{F}$ circumscribing the altitude base triangle (HKL) of this triangle passes
through the midpoints of the sides of this triangle, e.g., through the midpoints of OB and OC, which completes the proof.

Corollary. The midpoint F of the Feuerbach circle lies at the center of the Euler line OU, and the radius f of the Feuerbach circle is equal to one half the radius of the circle of circumscription of the triangle ABC.

The first of these propositions follows from the fact that the perpendicular bisectors of the Feuerbach circle chords HA' and KB', as midlines of the trapezoids UOHA' and UOKB', pass through the center of OU, and the second, from the fact that the sides of the triangle A'B'C' inscribed in the Feuerbach circle are one half the size of the sides of the triangle ABC.

29. Castillon's Problem

To inscribe in a given circle a triangle the sides of which pass through three given points.

This problem, posed by the Swiss mathematician Cramer, takes its name from the Italian mathematician Castillon, who solved it in 1776. (Gabriel Cramer, 1704-1752, in 1750 published his major work Introduction à l'analyse des lignes courbes algébriques, in which for the first time a system of linear equations was solved by means of determinants. I. F. Salvemini, 1709-1791, took the name Castillon after his place of birth Castiglione in Tuscany.)

The following simple, though not easily seen, solution of the Castillon problem stems from the Italian Giordano. We call the given circle S, the given points A, B, C, the desired triangle XYZ, and let YZ, ZX, XY pass, respectively, through A, B, C. Ottaiano in his solution makes use of four auxiliary points. These are:
I. the end point of the chord parallel to AB and beginning from X;
II. the point of intersection of the lines Y I and AB;
III. the end point of the chord beginning at X that is parallel to II C;
IV. the point of intersection of the lines C II and I III.

The construction consists of the following five steps.

1. Construction of auxiliary point II.
The angles A II I and X I Y, as alternate interior angles between parallels, are equal, and the
angles XZY and X I Y are equal because they are inscribed in the same arc XY. Consequently, ∠XZY = ∠A II I and therefore BZY II is a quadrilateral inscribed in a circle. It also follows from this that A II · AB = AY · AZ. Since, however, the right side of this equation is known to be the power P of the circle S at A (see the footnote on the power of a circle in No. 31), it follows that A II = P/AB can be constructed, as a result of which II is known.

2. Construction of auxiliary point IV. The angles Y C IV and Y X III are corresponding angles between parallels and are consequently equal, while angles Y I III and Y X III are supplementary since they are opposite angles in the quadrilateral inscribed in the circle. Thus, Y I III and Y C IV are also supplementary, and Y C IV I is a quadrilateral inscribed in a circle. It follows from this that II C · II IV = II Y · II I.
However, since the right side of this equation represents the power Π of circle S at II, which, according to 1., is to be regarded as known, we find II IV = Π / II C and thus the auxiliary point IV.

3. Determination of the angle I X III = ω. Since angle A II IV = κ is known and since ω and κ, having pairwise parallel sides, are identical, it follows that ω = κ.

4. Construction of the chord I III. We draw through IV a chord subtending the angle ω = κ. The points of intersection of this chord with S are the remaining points I and III.

5. Construction of the triangle XYZ. We determine X as the point of intersection of S with the line through III parallel to II IV; Y as the point of intersection of the line I II with S; and Z as the point of intersection of the line AY with S.

In comparison to this fairly intricate solution the following projective solution of the Castillon problem is very simple. This solution is based upon Steiner's double element construction (No. 60) and the involution theorem: If a ray is rotated about a fixed point, its two points of intersection with a circle describe on this circle (involutional) projective ranges of points (No. 63).

We take any arbitrary point $X_1$ on the given circle S, determine the (second) point of intersection $Z_1$ of the circle with the secant $BX_1$, then the (second) point of intersection $Y_1$ of the circle with the secant $AZ_1$ and, finally, the (second) point of intersection $X_1'$ of the circle with the secant $CY_1$. Only when $X_1'$ happens to coincide with $X_1$ is $X_1Y_1Z_1$ the sought-for triangle. This favorable situation will, however, occur only rarely. We will consider the described construction as repeated with other starting points $X_2, X_3, \ldots$, giving us the points $Y_2, Y_3, \ldots$; $Z_2, Z_3, \ldots$; $X_2', X_3', \ldots$. According to the auxiliary theorem each of the fields of points $X_1, X_2, \ldots$; $Y_1, Y_2, \ldots$; $Z_1, Z_2, \ldots$; and $X_1', X_2', \ldots$ is projective with respect to the following one; consequently, $(X_1, X_2, \ldots) \barwedge (X_1', X_2', \ldots)$.
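The chain X → Z → Y → X' described above is easy to follow numerically. The sketch below is an illustration under stated assumptions, not the construction of the text: the circle S is taken to be the unit circle about the origin, the three given points are arbitrary sample points outside it, and the closing positions (X' = X) are located by a plain scan with bisection in place of Steiner's double-element construction. All function names are hypothetical.

```python
import math

def second_intersection(X, P):
    """Second point in which the line through X (on the unit circle) and P meets the circle."""
    dx, dy = P[0] - X[0], P[1] - X[1]
    t = -2.0 * (X[0] * dx + X[1] * dy) / (dx * dx + dy * dy)
    return (X[0] + t * dx, X[1] + t * dy)

def castillon_step(theta, A, B, C):
    """Start at X(theta); ZX passes through B, YZ through A, and YX' through C."""
    X = (math.cos(theta), math.sin(theta))
    Z = second_intersection(X, B)
    Y = second_intersection(Z, A)
    X1 = second_intersection(Y, C)
    return X, Y, Z, X1

def mismatch(theta, A, B, C):
    """Sine of the angle from X to X'; it vanishes when X' = X (or X' = -X)."""
    X, _, _, X1 = castillon_step(theta, A, B, C)
    return X[0] * X1[1] - X[1] * X1[0]

def closing_angles(A, B, C, samples=720):
    """Angles theta for which the chain closes, found by scanning and bisection."""
    found = []
    step = 2.0 * math.pi / samples
    for k in range(samples):
        lo, hi = k * step, (k + 1) * step
        flo, fhi = mismatch(lo, A, B, C), mismatch(hi, A, B, C)
        if flo * fhi > 0:
            continue                      # no sign change in this interval
        for _ in range(60):
            m = 0.5 * (lo + hi)
            fm = mismatch(m, A, B, C)
            if flo * fm <= 0:
                hi = m
            else:
                lo, flo = m, fm
        theta = 0.5 * (lo + hi)
        X, _, _, X1 = castillon_step(theta, A, B, C)
        if X[0] * X1[0] + X[1] * X1[1] > 0:   # discard the antipodal case X' = -X
            found.append(theta)
    return found

# Three sample points outside the unit circle (assumed values).
A, B, C = (2.0, 0.5), (-1.5, 2.0), (0.5, -2.5)
solutions = closing_angles(A, B, C)
for theta in solutions:
    X, Y, Z, X1 = castillon_step(theta, A, B, C)
    print("X =", X, " Y =", Y, " Z =", Z)
```

Each printed triple is an inscribed triangle whose sides YZ, ZX, XY pass through A, B, C. The scan resolution of 720 samples is an assumption that suffices for these sample points; points lying very close to the circle would call for a finer scan.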
The desired triangle is obtained from the described construction when the starting point $X_\nu$ coincides with the end point $X_\nu'$ and is accordingly
determined by a double element of this projection. This gives us the following simple

Construction: We choose any three points $X_1, X_2, X_3$ on S, draw in the manner described the three corresponding points $X_1', X_2', X_3'$, and determine according to Steiner the double elements $X_r$ and $X_s$ of the projection on S in which the points $X_1', X_2', X_3'$ correspond to $X_1, X_2, X_3$. Thus, each of the two triangles $X_rY_rZ_r$ and $X_sY_sZ_s$ satisfies the conditions of the Castillon problem.

Note. In a quite similar manner we are able to solve the converse of the Castillon problem: To draw about a circle a triangle the vertexes of which lie on three given lines.

The construction is based upon the auxiliary theorem: If a point describes a straight line, the two tangents from the point to a circle determine upon this circle two (involutional) projective fields of tangents (No. 63).

We call the given circle S, the given lines a, b, c, the sides of the desired triangle x, y, z. We draw any three tangents $x_1, x_2, x_3$ to S; through their points of intersection with b we draw three more tangents $z_1, z_2, z_3$; through the points of intersection of the latter with a we draw three new tangents $y_1, y_2, y_3$, and through their intersections with c three more tangents $x_1', x_2', x_3'$. We draw the double elements $x_r$ and $x_s$ of the projection defined on S by the homologous triplets $(x_1, x_2, x_3)$ and $(x_1', x_2', x_3')$. The triangles $x_ry_rz_r$ and $x_sy_sz_s$ obtained from these double elements are the ones we are seeking.

30. Malfatti's Problem

To draw within a given triangle three circles each of which is tangent to the other two and to two sides of the triangle.

This famous problem was posed by the Italian mathematician Malfatti (1731-1807) in 1803 and solved in the tenth volume of the Memorie di Matematica e di Fisica della Società italiana delle Scienze. This algebraic-geometric solution can be found, for example, in vol. 123 of Ostwald's Klassiker der exakten Wissenschaften (Supplement).
The purely geometric solution of Malfatti's problem submitted by Jakob Steiner in 1826 without proof is also described and proved there. Here we will restrict ourselves to the exposition of the thoroughly simple solution published by Schellbach in volume 45 of Crelle's Journal.
Let ABC be the given triangle with sides a, b, c, the perimeter 2s and the angles α, β, γ. Let the Malfatti circles we are seeking (which are tangent to the arms of the angles α, β, γ) be $\mathfrak{P}$, $\mathfrak{Q}$, $\mathfrak{R}$, their midpoints P, Q, R, and their radii p, q, r. Let the tangents from the vertexes A, B, C to $\mathfrak{P}$, $\mathfrak{Q}$, $\mathfrak{R}$ be u, v, w.

Fig. 10.

We introduce $\mathfrak{J}$, the circle inscribed in the triangle. Let its center be J and its radius ρ, and let the tangents to it from the vertexes A, B, C be $a_1$, $b_1$, $c_1$, respectively. From the three equations $b_1 + c_1 = a$, $c_1 + a_1 = b$, $a_1 + b_1 = c$ we obtain the values $a_1 = s - a$, $b_1 = s - b$, $c_1 = s - c$.

Since the points P and J lie on the bisector of the angle α, it follows from the ray theorem that $p : \rho = u : a_1$ or $p = \frac{\rho}{a_1}u$. Similarly we find $q = \frac{\rho}{b_1}v$.

We call the points of tangency of $\mathfrak{P}$ and $\mathfrak{Q}$ with AB, U and V and calculate UV = t. Since PF, the perpendicular dropped from P to QV, is equal to t, it follows from the right triangle PQF that $PQ^2 = PF^2 + FQ^2$ or $(p + q)^2 = t^2 + (p - q)^2$
and from this $UV = t = 2\sqrt{pq}$. If we then introduce here the values found above for p and q, we obtain $t = 2\rho\sqrt{\frac{uv}{a_1b_1}}$. But it is known that $\rho^2 = a_1b_1c_1/s$. This simplifies the value for t to $UV = t = 2\sqrt{\frac{c_1}{s}}\sqrt{uv}$. Since the side AB of the triangle is composed of the three segments AU, BV, and UV, we obtain the equation $u + v + 2\sqrt{\frac{c_1}{s}}\sqrt{uv} = c$. In the same way we obtain for the two other sides of the triangle BC and CA $v + w + 2\sqrt{\frac{a_1}{s}}\sqrt{vw} = a$ and $w + u + 2\sqrt{\frac{b_1}{s}}\sqrt{wu} = b$.
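The three equations can be checked numerically. The Python sketch below is an illustration only; the function names and the sample triangles are assumptions. It takes the tangent lengths u, v, w from the angle formulas that the remainder of this section derives (sin²λ = a/s and so on, σ the half sum of λ, μ, ν, and u/s = sin²(σ − λ)), substitutes them into the three equations, and reports the residuals. It also recovers the radii from p = (ρ/a₁)u and its analogues.

```python
import math

def malfatti_tangents(a, b, c):
    """Tangent lengths u, v, w from the vertexes A, B, C to the Malfatti circles."""
    s = (a + b + c) / 2.0
    lam = math.asin(math.sqrt(a / s))   # sin^2(lambda) = a / s
    mu = math.asin(math.sqrt(b / s))    # sin^2(mu)     = b / s
    nu = math.asin(math.sqrt(c / s))    # sin^2(nu)     = c / s
    sigma = (lam + mu + nu) / 2.0
    u = s * math.sin(sigma - lam) ** 2
    v = s * math.sin(sigma - mu) ** 2
    w = s * math.sin(sigma - nu) ** 2
    return u, v, w

def residuals(a, b, c):
    """Left side minus right side of the three side equations."""
    s = (a + b + c) / 2.0
    a1, b1, c1 = s - a, s - b, s - c
    u, v, w = malfatti_tangents(a, b, c)
    return (v + w + 2.0 * math.sqrt(a1 / s) * math.sqrt(v * w) - a,
            w + u + 2.0 * math.sqrt(b1 / s) * math.sqrt(w * u) - b,
            u + v + 2.0 * math.sqrt(c1 / s) * math.sqrt(u * v) - c)

def malfatti_radii(a, b, c):
    """Radii p, q, r from p = (rho / a1) u and the two analogous relations."""
    s = (a + b + c) / 2.0
    a1, b1, c1 = s - a, s - b, s - c
    rho = math.sqrt(a1 * b1 * c1 / s)   # radius of the inscribed circle
    u, v, w = malfatti_tangents(a, b, c)
    return rho * u / a1, rho * v / b1, rho * w / c1

for sides in [(3.0, 4.0, 5.0), (7.0, 8.0, 9.0)]:
    print(sides, "residuals:", residuals(*sides), "radii:", malfatti_radii(*sides))
```

The residuals should vanish up to rounding error. The sketch assumes a nondegenerate triangle, so that a/s, b/s, c/s are proper fractions.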
Taking half the perimeter as the unit length, we obtain somewhat more simply:

(1) $v + w + 2\sqrt{a_1}\sqrt{vw} = a$, $\quad w + u + 2\sqrt{b_1}\sqrt{wu} = b$, $\quad u + v + 2\sqrt{c_1}\sqrt{uv} = c$.

Now we take the proper fractions a, b, c, u, v, w as squares of the sines of six acute angles λ, μ, ν, ψ, φ, χ:

$\sin^2\lambda = a$, $\sin^2\mu = b$, $\sin^2\nu = c$, $\sin^2\psi = u$, $\sin^2\varphi = v$, $\sin^2\chi = w$.

Then also (since $a + a_1 = s = 1$, $b + b_1 = 1$, $c + c_1 = 1$) $\cos^2\lambda = a_1$, $\cos^2\mu = b_1$, $\cos^2\nu = c_1$, and the obtained equation triplet (1) assumes the form:

(2) $\sin^2\varphi + \sin^2\chi + 2\sin\varphi\sin\chi\cos\lambda = \sin^2\lambda$, $\quad \sin^2\chi + \sin^2\psi + 2\sin\chi\sin\psi\cos\mu = \sin^2\mu$, $\quad \sin^2\psi + \sin^2\varphi + 2\sin\psi\sin\varphi\cos\nu = \sin^2\nu$.

Now, for example, let us consider the first of these equations. It is nothing other than a trigonometric expression of the known relation (φ + χ = λ) between the angles φ and χ at two vertexes of a triangle and the exterior angle λ at the third vertex. If, for example, we take such a triangle with a circle of circumscription of the diameter 1, then the three sides are sin φ, sin χ, sin λ, and the cosine theorem gives the equation $\sin^2\lambda = \sin^2\varphi + \sin^2\chi + 2\sin\varphi\sin\chi\cos\lambda$.

It then follows from (2) that φ + χ = λ, χ + ψ = μ, ψ + φ = ν and from this ψ = σ − λ, φ = σ − μ, χ = σ − ν, with $\sigma = \frac{\lambda + \mu + \nu}{2}$.

Thus, we obtain the following simple

Construction:

1. We draw three angles λ, μ, ν whose sine squares are equal to the sides of the given triangle (where half the perimeter of the triangle is the unit length).
2. We draw the half sum $\sigma = \frac{\lambda + \mu + \nu}{2}$ of the three angles λ, μ, ν and the three new angles ψ = σ − λ, φ = σ − μ, χ = σ − ν.

3. We draw the sine squares of the three angles ψ, φ, χ. These are the tangents from the triangle vertexes to the three Malfatti circles.

Note. If we are to draw the sine square m = sin²ω for a given angle ω, or to draw the angle ω (whose sine square equals m) for a given segment m, we proceed in the following manner: We draw a semicircle $\mathfrak{H}$ with the diameter HK = 1. We draw the given angle ω at K on KH and from the intersection L of its free side with $\mathfrak{H}$ we drop the perpendicular LM to HK. Then HM = m = sin²ω. Conversely, if m is given and we have to find ω, we draw HM = m on HK, erect at M a perpendicular on HK extending to the intersection L with $\mathfrak{H}$, and draw LK. Then ∠HKL = ω.

Proof. From the right triangle HML it follows that m = HM = HL sin HLM = HL sin ω, and from the right triangle HKL that HL = HK sin ω = sin ω. Consequently, m = sin²ω.

31. Monge's Problem

To draw a circle that cuts three given circles perpendicularly.

The French mathematician Monge (1746-1818) was the founder of descriptive geometry.

In order to solve the problem, we seek the locus of the centers of all the circles that are perpendicular to two given circles. [Two circles are said to intersect perpendicularly when the radii r and r' drawn to a single point of intersection are perpendicular to each other; in other words, when they form the base and altitude of a right triangle the hypotenuse z of which joins the centers of the circles, so that r² + r'² = z² or z² − r² = r'². Two circles are
therefore perpendicular to each other when the power* of the one at the midpoint of the other is equal to the square of the radius of the other.]

Fig. 12.

Let the given circles be $\mathfrak{K}$ and $\mathfrak{K}'$, their centers K and K', their radii k and k' (> k), the line joining their centers KK' = l. Let the circle $\mathfrak{X}$ with the midpoint X and the radius x be perpendicular to them. Let the center lines KX and K'X be equal to z and z', respectively. Then z² − k² and z'² − k'² are each equal to x², so that

(1) z² − k² = z'² − k'².

Consequently, both circles $\mathfrak{K}$ and $\mathfrak{K}'$ have the same power at X. We therefore first attempt to find the locus of the point X at which the two given circles possess the same power. If X is a point of this locus and the perpendicular from X meets the center line KK' at the point F, and, moreover, if KF = f and K'F = f', then, according to the Pythagorean theorem, the square of the perpendicular is equal to z² − f² as well as to z'² − f'², so that

(2) z² − f² = z'² − f'².

* By the power of a circle at a point is meant the amount by which the square of the distance from the center of the circle to the point exceeds the square of the radius of the circle. In accordance with the secant or chord theorem it can also be represented as the product of the two segments originating from the point that are generated by the circle on any secant through the point.
If we subtract (2) from (1) we obtain

(3) f² − k² = f'² − k'²,

i.e., $\mathfrak{K}$ and $\mathfrak{K}'$ possess equal powers at F also. If we figure the distances f and f' as positive in the directions KK' and K'K, respectively, then it is always true that

(4) f + f' = l.

Equations (3) and (4) give us fixed values for the unknowns f and f'. Consequently every locus point X lies on the perpendicular erected on the center line KK' at the fixed point F, and we obtain the

Theorem of the chordal: The locus of the point at which two given circles possess the same powers is a straight line perpendicular to the line joining the midpoints of the circles and is known as the chordal or power line of the two circles.

In the construction of the chordal we distinguish two different cases:

1. The circles intersect. Since both circles have equal powers at each of their points of intersection, namely the power 0, the points of intersection lie on the chordal. The chordal of two circles that intersect is the secant of intersection.

2. The circles do not intersect. Here the construction of the chordal is based upon the

Theorem of Monge: The three chordals of three circles pass through a point known as the power center of the three circles.

[Proof. Let the circles be I, II, III. We determine the point of intersection O of the chordals of the two pairs (II, III) and (III, I). At this point (1) II and III, (2) III and I possess equal powers; consequently II and I also have the same power at O, i.e., O lies on the chordal of I and II.]

Thus, to construct the chordal of two nonintersecting circles I and II, we draw an auxiliary circle III that intersects I and II, and draw the chordals of the pairs (II, III) and (III, I). The perpendicular from the intersection of these chordals to the line joining the centers of I and II is the chordal we are looking for.
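The chordal and the power center can also be computed directly from coordinates. The following Python sketch is an illustration, not the ruler-and-compass construction of the text; the function names and the three sample circles are assumptions. Equating the powers of two circles at a point (x, y) cancels the quadratic terms and leaves a linear equation, the chordal; two chordals intersect in the power center. The square root of the common power there is then used as the radius of a circle that, by the bracketed criterion above (power equal to the square of the radius), cuts all three circles perpendicularly.

```python
import math

def power(circle, X):
    """Power of the circle (cx, cy, r) at the point X."""
    cx, cy, r = circle
    return (X[0] - cx) ** 2 + (X[1] - cy) ** 2 - r * r

def chordal(c1, c2):
    """Coefficients (a, b, d) of the chordal a*x + b*y = d of two circles."""
    x1, y1, r1 = c1
    x2, y2, r2 = c2
    a, b = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    d = (x2 * x2 + y2 * y2 - r2 * r2) - (x1 * x1 + y1 * y1 - r1 * r1)
    return a, b, d

def power_center(c1, c2, c3):
    """Intersection of two chordals (the centers must not be collinear)."""
    a1, b1, d1 = chordal(c1, c2)
    a2, b2, d2 = chordal(c1, c3)
    det = a1 * b2 - a2 * b1
    return ((d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det)

def orthogonal_circle(c1, c2, c3):
    """Circle (x, y, radius) cutting all three circles perpendicularly, or None."""
    O = power_center(c1, c2, c3)
    p = power(c1, O)
    if p <= 0:
        return None   # power not positive: no real radius exists
    return (O[0], O[1], math.sqrt(p))

# Three sample circles (cx, cy, r); the values are assumptions for the example.
circles = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.5), (1.0, 3.0, 1.0)]
O = power_center(*circles)
result = orthogonal_circle(*circles)
print("power center:", O)
print("powers there:", [power(c, O) for c in circles])
print("orthogonal circle:", result)
```

The three printed powers should agree, and for each given circle the squared distance between the centers should equal the sum of the squared radii.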
From the theorem of the chordal it then follows: The locus of the centers of all circles that are perpendicular to two given circles is the chordal of the given circles or, in the event that these circles intersect, the section of the chordal that lies outside the given circles. (The powers of the given circles at a single point must be positive!)

The solution of Monge's problem now becomes very simple. We draw the power center O of the given circles. If it lies outside the
three circles, the circle with the midpoint O and the radius formed by the tangent from O to one of the given circles intersects the given circles perpendicularly. If O is located inside even one of the given circles, the problem is insoluble.

32. The Tangency Problem of Apollonius

To draw a circle that is tangent to three given circles.

The circles may also comprise degenerate circles: points or straight lines.

This celebrated problem was put forth by the greatest mathematician of the ancient world after Euclid and Archimedes, Apollonius of Perga (ca. 260-170 B.C.), whose major work Κωνικά extended with an astonishing comprehensiveness the period's naturally slight knowledge of conic sections. His treatise De Tactionibus, which contained the solution of the tangency problem given above, has unfortunately been lost. François Viète, called Vieta, the greatest French mathematician of the sixteenth century (1540-1603), attempted about 1600 to restore the lost treatise of Apollonius and solved the tangency problem by treating each of its ten special cases individually, deriving each successive one from the preceding one. In contrast to this the solutions of Gauss (Complete Works, vol. IV, p. 399), Gergonne (Annales de Mathématiques, vol. IV), and Petersen (Methoden und Theorien) solve the general problem.

Here we will restrict ourselves to the exposition of the elegant solution of Gergonne. Since this proof presupposes, in addition to the chordal theorems proved in No. 31, a knowledge of the properties of similarity points and polars, we will begin with a brief discussion of these.

Similarity Points

When we refer to the external or positive and internal or negative similarity points, respectively, of two circles $\mathfrak{K}$ and $\mathfrak{K}'$ with the centers M and M' and the radii r and r', we mean the points A and J, respectively, on the line MM' joining the centers for which $\frac{MA}{M'A} = +\frac{r}{r'}$ and $\frac{MJ}{M'J} = -\frac{r}{r'}$, respectively.*

* The segment ratio AX:BX is considered positive if X is situated outside AB and negative if X is inside AB.
It follows directly from the ray theorem that: The line connecting the end points of two parallel radii of two circles that are directed in the same (opposite) sense passes through the external (internal) similarity point. In particular, the external (internal) common tangents of the two circles pass through the external (internal) similarity point.

We will further designate the external similarity point of the circles $\mathfrak{K}$ and $\mathfrak{K}'$ as $+\mathfrak{K}\mathfrak{K}'$, the internal one as $-\mathfrak{K}\mathfrak{K}'$, and, if the sign is not determined, we will indicate the similarity point as $\varepsilon\mathfrak{K}\mathfrak{K}'$. The symbol $\varepsilon\varepsilon'\varepsilon''\ldots$ is to be understood as meaning plus when the number of minus signs occurring among the symbols $\varepsilon, \varepsilon', \varepsilon'', \ldots$ is even and minus when it is odd.

The similarity points of three circles are described by the

Theorem of d'Alembert [D'Alembert (1717-1783), a French mathematician]: If three circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ are taken in pairs $(\mathfrak{B}, \mathfrak{C})$, $(\mathfrak{C}, \mathfrak{A})$, and $(\mathfrak{A}, \mathfrak{B})$, the external similarity points of the three pairs lie on a straight line; and, similarly, the external similarity point of one pair and the two internal similarity points of the other two pairs lie upon a straight line, a so-called similarity axis of the three circles.

More briefly: If αβγ is plus, the three similarity points $\alpha\mathfrak{B}\mathfrak{C}$, $\beta\mathfrak{C}\mathfrak{A}$, and $\gamma\mathfrak{A}\mathfrak{B}$ lie on a straight line.

Monge's proof. Let the centers of the circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ be A, B, C, and let the external similarity points of the pairs $(\mathfrak{B}, \mathfrak{C})$, $(\mathfrak{C}, \mathfrak{A})$, $(\mathfrak{A}, \mathfrak{B})$ be P, Q, R. If the circle pair $(\mathfrak{B}, \mathfrak{C})$ with its external tangents that pass through P is rotated about the axis PBC, we obtain the spheres $\mathfrak{B}_0$ and $\mathfrak{C}_0$ and their tangent cone with apex P. The case is similar for the other two circle pairs. The planes $E_1$ and $E_2$ are tangent to the spheres $\mathfrak{A}_0$, $\mathfrak{B}_0$, $\mathfrak{C}_0$ in such a manner that the spheres always lie on one side of the plane, and both planes contain the point P, since this point lies on the external
tangent of $(\mathfrak{B}_0, \mathfrak{C}_0)$ lying in $E_1$ (and likewise on the one lying in $E_2$). They likewise contain the points Q and R. The three points P, Q, R thus lie on the line of intersection of the planes $E_1$ and $E_2$.

If we are concerned with the internal similarity points of the pairs $(\mathfrak{B}, \mathfrak{C})$ and $(\mathfrak{A}, \mathfrak{C})$ and the external similarity point of $(\mathfrak{A}, \mathfrak{B})$, we must take the tangential planes so that $\mathfrak{A}_0$ and $\mathfrak{B}_0$ lie on one side of such a plane while $\mathfrak{C}_0$ lies on the other.

Let an arbitrary circle $\mathfrak{X}$ with the center X be homogeneously (nonhomogeneously) tangent to two fixed circles $\mathfrak{K}$ and $\mathfrak{K}'$, with centers K and K' and radii k and k', at P and Q'. Let the points of intersection of the straight line PQ' with the circles $\mathfrak{K}$ and $\mathfrak{K}'$ and the line KK' joining their centers be P, Q; P', Q'; and S. Since the base angles of the isosceles triangles KPQ, K'P'Q', and XPQ' are also the opposite and coincident angles at P and Q', all six base angles are equal. Since the two base angles at P and P' are equal, the radii KP and K'P' are parallel. Consequently, S is the external (internal) similarity point of $\mathfrak{K}$ and $\mathfrak{K}'$. From this it follows that

$\frac{SP}{SP'} = \frac{k}{\pm k'}, \qquad \frac{SQ}{SQ'} = \frac{k}{\pm k'},$

so that the two products SP·SQ' and SQ·SP' are equal. If we call their common value w, then $w^2 = SP \cdot SQ' \cdot SQ \cdot SP' = SP \cdot SQ \cdot SP' \cdot SQ'$,
i.e., $w^2$ is equal to the product of the powers Π and Π' of the two circles $\mathfrak{K}$ and $\mathfrak{K}'$ at S. Consequently, $SP \cdot SQ' = w = \sqrt{\Pi\Pi'}$. I.e.: The power (SP·SQ') of the circle $\mathfrak{X}$ at S is a constant ($\sqrt{\Pi\Pi'}$).

The result of our considerations is the following

Tangency theorem: The external (internal) similarity point of two fixed circles is the point at which all the circles homogeneously (nonhomogeneously) tangent to the fixed circles have the same power and at which all the tangency secants (which are determined by the points of tangency to the fixed circles) intersect.

Pole and Polar

Two points P and P' that lie on a ray originating at the center O of a circle $\mathfrak{K}$ with radius r in such manner that OP·OP' = r² are called conjugate with respect to each other in relation to the circle. Of two conjugate points one lies inside the circle and the other outside. The conjugate of an external point A is the point of intersection J of the circle bisector from A (the line joining A to the center of the circle) with the tangency chord determined by the tangents $AT_1$ and $AT_2$ from A to the circle. The conjugate of an internal point J is the point of intersection A of the tangents that pass through the end points $T_1$ and $T_2$ of the chord passing through J and perpendicular to the circle bisector from J.
Fig. 16.

(From the right triangle $OAT_1$ it follows directly that r² = OA·OJ.)

By the polar of the point P we mean the line p that is perpendicular to the circle bisector from P and passes through the conjugate of P. Conversely, by the pole of the line p we mean the point P that is conjugate to the base point of the perpendicular dropped from the center of the circle to the line.

The relation between the pole and the polar is therefore reciprocal: If p is the polar of P, then P is the pole of p, and conversely.

Fig. 17.

Now let Q be any point on the polar p of P (that passes through the conjugate P' of P) and let Q' be the conjugate of Q. Then OP·OP' = OQ·OQ' (= r²), and consequently PP'QQ' is a quadrilateral inscribed in a circle. Since here the angle at P' is 90° the angle at Q' must also be 90°, i.e.,
PQ' must be perpendicular to OQ. PQ' is therefore the polar q of Q, and we have the

Theorem of the pole and polar: If Q lies on the polar of P, P also lies on the polar of Q. Or also: If p passes through the pole of q, q also passes through the pole of p.

Now for Gergonne's solution of the tangency problem.

In general, there are a number of circles that are tangent to three given circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$. Gergonne's solution is based upon the device of seeking the unknown circles in pairs rather than individually; in particular, one always seeks that pair $(\mathfrak{X}, \mathfrak{x})$ that is homogeneously or nonhomogeneously tangent to each of the given circles. For the sake of convenience, we will call homogeneous tangencies positive (+) and nonhomogeneous tangencies negative (−), and combinations such as $\varepsilon\varepsilon'$ of the tangency signs ε and ε' will be treated in accordance with the rule that "like signs give plus and unlike minus."

Let the circles $\mathfrak{X}$ and $\mathfrak{x}$, respectively, be tangent to the circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ at the points P, Q, R and p, q, r, respectively, and let the tangencies possess the signs A, B, C and a, b, c, respectively. Then Aa = Bb = Cc = ε, and BC = bc = α, CA = ca = β, AB = ab = γ and αβγ = +.

Let us first consider $(\mathfrak{X}, \mathfrak{x})$ as the pair tangent to the circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$. According to the tangency theorem, the similarity point $\varepsilon\mathfrak{X}\mathfrak{x}$ of $\mathfrak{X}$ and $\mathfrak{x}$ is the power center O of the three circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ and the point of intersection of the three tangency chords Pp, Qq, Rr.

We then take in succession $(\mathfrak{B}, \mathfrak{C})$, $(\mathfrak{C}, \mathfrak{A})$, $(\mathfrak{A}, \mathfrak{B})$ as the pair tangent to the circles $\mathfrak{X}$ and $\mathfrak{x}$. In accordance with the tangency theorem, the circles $\mathfrak{X}$ and $\mathfrak{x}$ then have the same powers at the similarity point $\alpha\mathfrak{B}\mathfrak{C}$ = I, as well as at the similarity point $\beta\mathfrak{C}\mathfrak{A}$ = II, and the similarity point $\gamma\mathfrak{A}\mathfrak{B}$ = III. And since αβγ is +, the three points I, II, III, in accordance with d'Alembert's theorem, lie upon a similarity axis of $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$.
The similarity axis I II III is thus the chordal χ of the circles $\mathfrak{X}$ and $\mathfrak{x}$.

Further, if S represents the point of intersection of the tangents to $\mathfrak{A}$ at P and p, then SP = Sp. Since these tangents also touch $\mathfrak{X}$ and $\mathfrak{x}$, S lies on the chordal χ of $\mathfrak{X}$ and $\mathfrak{x}$. Now S is also the pole of the
tangency chord Pp with respect to circle $\mathfrak{A}$. Since χ therefore passes through the pole of Pp, it follows from the theorem of the pole and polar that Pp passes through the pole of χ. Since the same conclusions can be drawn with respect to the tangency chords Qq and Rr, we obtain the theorem: The tangency chords Pp, Qq, and Rr pass respectively through the poles of the line χ = I II III with respect to the circles $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$.

Fig. 18.

From the three theorems stated in the last three paragraphs we obtain directly

Gergonne's construction: Draw the power center O of the given circles and the similarity axis I II III = χ. Determine the poles 1, 2, 3 of χ in relation to the given circles and connect them with O. The connecting lines meet the given circles at the points at which they are tangent to the sought-for circles.

33. Mascheroni's Compass Problem

To prove that any construction that can be carried out with a compass and straight-edge can be carried out with the compass alone.
The Italian L. Mascheroni (1750-1800) posed himself the problem of executing the geometric constructions with a compass alone (without the use of the straight-edge) and solved it in a masterly fashion in his book La geometria del compasso, which was published in Pavia in 1797.

If we examine the separate steps by which the compass and straight-edge constructions are carried out, we see that every step consists of one of the following three basic constructions:
I. finding the point of intersection of two straight lines;
II. finding the point of intersection of a straight line and a circle;
III. finding the point of intersection of two circles.

Consequently, we need only show that the two basic constructions I. and II. can be accomplished with a compass alone. (In Mascheroni's geometry of the compass a straight line is, naturally, regarded as given or determined if two of its points are known.)

First we must solve two preliminary problems.

Preliminary problem 1. To draw the sum or difference of two given segments a and b. In other words: to lengthen or shorten a given segment PQ = a by a segment QX = b.

Solution. 1. We draw the arc Q|b,* take upon this arc any point H, draw the mirror image H' of H with respect to the straight line g determined by the points P and Q (the mirror image O' of a point O with respect to a straight line AB is the point of intersection of the arcs A|AO and B|BO), and designate the segment HH' as h.

2. We draw the isosceles trapezoid KHH'K' whose legs KH and K'H' are equal to b and whose base KK' = 2h. (K is the point of intersection of the arcs Q|h and H|b, K' is the mirror image of K on g.) Let the diagonal KH' = HK' of the trapezoid be called d. Since the trapezoid is a quadrilateral that can be inscribed in a circle, according to Ptolemy the following equation is applicable: d² = b² + 2h². On the other hand, it follows from the right triangle QK'X, where K'X will be designated as x, that x² = b² + h².
* Let arc Q|b mean the circle arc whose midpoint is Q and radius b.
From these two equations it follows that d² = x² + h², so that x is one of the legs of a right triangle with the hypotenuse d and the other leg h. If we then find the point of intersection S of the arcs K|d and K'|d on the straight line g, QS = x.

3. We draw the point of intersection of the arcs K|x and K'|x; this is the point X that we have been trying to find.

Preliminary problem 2. To find the fourth segment x that is in proportion to the three given segments m, n, s. In other words, draw the segment $x = \frac{n}{m}s$.

The following solution that Mascheroni found for this fundamental problem is remarkable for its shortness and simplicity.

We draw two concentric circles $\mathfrak{M}$ = Z|m and $\mathfrak{N}$ = Z|n, draw the chord AB = s in $\mathfrak{M}$, lay off with the compass any length w from A
and from B on $\mathfrak{N}$, obtaining from the distance between the resulting points of intersection H and K the sought-for segment x. The proof follows directly from the similar triangles ZAB and ZHK.

In this construction it is assumed that s falls within circle $\mathfrak{M}$. If this is not the case, we first transform the fraction n/m into N/M, where N and M, respectively, are sufficiently great integral multiples of n and m which can be drawn according to the first preliminary problem. (A comparatively simple method is the doubling that results, for example, when PQ = m, and the radius m of the circle P|PQ is laid off three times in succession from Q. The end point after this laying off is separated from Q by the distance 2m.)

After the solution of the preliminary problems, we go on to the solution of the two major problems.

I'. To find the point of intersection S of two straight lines AB and CD (each of which is given by two points) with the compass alone.

II'. To determine the point of intersection S of a given circle $\mathfrak{K}$ and a given straight line AB with the compass alone.

Fig. 21.
Solution of I'. We draw the mirror images C' and D' of C and D with respect to AB. The sought-for point of intersection S then also lies on C'D'. According to the ray theorem, it follows that CS/SD = CC'/DD', i.e., if we designate the segments CS, CD, CC', DD' as x, e, c, d, respectively, x : (e − x) = c : d or $x = \frac{c}{c + d}e$.

Now we begin by drawing CH = c + d (H as the point of intersection of the arcs C'|d and D|e); then we draw the segment x in accordance with preliminary problem 2; and finally we draw the sought-for point of intersection S as the intersection of the arcs C|x and C'|x.

Fig. 22.

Solution of II'. Let the center of the given circle be known as M, the radius as r. We draw the mirror image M' of M with respect to the straight line AB and, with the compass open to the radius r, describe about M' an arc that cuts the circle $\mathfrak{K}$. The resulting points of intersection are the sought-for points of intersection of the given straight line AB with the given circle $\mathfrak{K}$.

The construction cannot be carried out if the straight line AB happens to pass through M. In this exceptional case we extend and shorten the segment AM by r in accordance with preliminary problem 1. The end points of the extended and shortened segment are the sought-for points of intersection of $\mathfrak{K}$ and AB.

This completes the solution of Mascheroni's problem.
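The solution of II' uses nothing but intersections of circles, which makes it easy to imitate in code. The Python sketch below is an illustration only; the function names and the sample data are assumptions. Every new point is obtained by a single primitive, the intersection of two circles: the mirror image M' comes from the arcs A|AM and B|BM, and the points common to the line AB and the circle come from the circles M|r and M'|r.

```python
import math

def dist(P, Q):
    return math.hypot(P[0] - Q[0], P[1] - Q[1])

def circle_circle(c1, r1, c2, r2):
    """Both intersection points of two circles (assumed to intersect, centers distinct)."""
    dx, dy = c2[0] - c1[0], c2[1] - c1[1]
    d = math.hypot(dx, dy)
    a = (r1 * r1 - r2 * r2 + d * d) / (2.0 * d)   # distance from c1 to the common chord
    h = math.sqrt(max(r1 * r1 - a * a, 0.0))      # half the length of the common chord
    bx, by = c1[0] + a * dx / d, c1[1] + a * dy / d
    return ((bx - h * dy / d, by + h * dx / d),
            (bx + h * dy / d, by - h * dx / d))

def mirror(O, A, B):
    """Mirror image of O with respect to AB: second intersection of the arcs A|AO and B|BO."""
    p, q = circle_circle(A, dist(A, O), B, dist(B, O))
    return p if dist(p, O) > dist(q, O) else q    # the other intersection is O itself

def line_circle(A, B, M, r):
    """Points common to the line AB and the circle M|r (AB must not pass through M)."""
    M1 = mirror(M, A, B)
    return circle_circle(M, r, M1, r)

# Sample data (assumed): the line y = 1 and the circle of radius 2 about the origin.
A, B = (-3.0, 1.0), (3.0, 1.0)
M, r = (0.0, 0.0), 2.0
S1, S2 = line_circle(A, B, M, r)
print(S1, S2)
```

As in the text, the sketch fails when AB passes through M, because M' then coincides with M and the two circles are identical; that case needs preliminary problem 1 instead.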
Steiner's Straight-edge Problem

To prove that every construction that can be executed with compass and straight-edge can be executed with a straight-edge alone in the event that within the picture plane there is also given a fixed circle.

As far back as 1759 Lambert had solved a whole series of geometric constructions with straight-edge alone in his book Freie Perspektive, which was published in Zurich that year. He is also the source of the term "straight-edge geometry."

After Lambert the French mathematicians, primarily Poncelet and Brianchon, took up straight-edge geometry, particularly after the publication of Mascheroni's Geometria del compasso provided a new stimulus to these studies, and they attempted to execute as many constructions as possible with the straight-edge alone.

Now, with the use of a straight-edge alone it is possible to represent only those algebraic expressions whose algebraic form is rational (thus, for example, it is impossible to represent expressions such as $\sqrt{ab}$). This circumstance suggested to Poncelet that an additional fixed circle (as well as the center!) must be given inside the picture plane for it to be possible to draw with straight-edge alone all the algebraic expressions that can be constructed with a compass and straight-edge.

This suggestion was confirmed as a certainty by Jakob Steiner (1796-1863), the greatest geometer since the days of Apollonius, in his celebrated book Die geometrischen Konstruktionen ausgeführt mittels der geraden Linie und Eines festen Kreises (Geometrical Constructions Executed with a Straight Line and One Fixed Circle), published in Berlin, 1833.

The solution presented here is based upon that in Steiner's book, except that we have here eliminated everything that is not strictly essential for the purpose at hand, and we have also made it somewhat more elementary by dispensing with the theorems of homothety and chordals employed by Steiner.
Since in straight-edge geometry the intersection of two straight lines is known directly, we need only demonstrate that the two fundamental problems II. and III. of the previous section can be solved by means of a straight-edge and a fixed circle alone. As in the solution of Mascheroni's problem, we must first solve several preliminary problems; in this case there are five rather than two.
Fig. 23.

Preliminary problem 1: To draw through a given point the parallel to a given line.

Steiner distinguishes two cases:

1a. construction of the parallel to a directed straight line;
1b. construction of the parallel to an arbitrary straight line.

1a. A directed straight line is understood to mean a straight line in which two points A and B and the midpoint M of the segment joining them are known. In order to draw the parallel to such a line through a given point P, we draw AP, choose a point S on the extension of AP, connect this point with B and M, draw BP, and draw the straight line AO through the point of intersection O of BP and MS in such a manner that AO cuts BS at Q. PQ is then the desired parallel. The proof is simple.

Fig. 24.
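Construction 1a uses nothing but joining points and intersecting lines, so it is easy to replay in coordinates. The sketch below is an illustration under stated assumptions, not part of Steiner's text: points are coordinate pairs, the function names are hypothetical, and the parameter `stretch` (our own device) merely picks the auxiliary point S on the extension of AP beyond P.

```python
def line_intersection(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (assumed not parallel)."""
    x1, y1 = p1
    x2, y2 = p2
    x3, y3 = p3
    x4, y4 = p4
    den = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / den,
            (a * (y3 - y4) - (y1 - y2) * b) / den)


def parallel_through(a, b, p, stretch=2.0):
    """Return Q such that PQ is parallel to AB, following steps of 1a.

    Assumptions: P is not on line AB, and stretch > 1 so that S lies
    on the extension of AP beyond P.
    """
    m = ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2)        # known midpoint M
    s = (a[0] + stretch * (p[0] - a[0]),
         a[1] + stretch * (p[1] - a[1]))              # S on extension of AP
    o = line_intersection(b, p, m, s)                 # O = BP meets MS
    q = line_intersection(a, o, b, s)                 # Q = AO meets BS
    return q


if __name__ == "__main__":
    A, B, P = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
    print("Q =", parallel_through(A, B, P))
```

With AB on the x-axis, the parallel through P must be horizontal, so Q should have the same ordinate as P whatever value of `stretch` is chosen.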
1b. We connect a given point M of the given straight line g with the center F of the given fixed circle and designate the points of intersection of the connecting line and the fixed circle as U and V. The points U, F, V make the line FM a directed line. In accordance with 1a., we draw a parallel to FM in such a manner that it cuts the fixed circle at X and Y and g at A. If we then draw the diameters XFX' and YFY' and connect the end points X' and Y', the connecting line intersects the given line at a point B in such a manner that MA = MB, and g, defined by the three points A, M, B, is then a directed line. This makes it possible to determine the parallel to g in accordance with 1a.

Preliminary problem 1 gives us the solution to the problem: shift a given segment AB parallel to itself in such a manner that one of its end points lies on a given point P. If P falls outside the straight line AB we find the point of intersection Q of the parallel through B to AP and the parallel through P to AB; PQ is then parallel and equal to AB, since ABQP is a parallelogram.

Preliminary problem 2: Draw a perpendicular through a given point P to a given straight line g.

We draw g' parallel to g in such a manner that it cuts the fixed circle at U and V. We then draw the diameter UFU' and the chord VU', which, according to Thales' theorem, is perpendicular to g' and consequently also perpendicular to g. Finally, we draw the parallel to VU' through P in accordance with 1.; this parallel is the desired perpendicular.

Fig. 25.

Preliminary problem 3: To lay off a given distance PQ from a given point O in a given direction.

Let us consider the prescribed direction as given by the segment OH from O. First, in accordance with 1., we displace PQ parallel to
itself to OK. Then from F we draw two radii FU and FV in the directions OH and OK. Finally, if we draw through K the parallel to UV, the point of intersection S of the parallel with the line OH gives the end point of the desired segment.

Preliminary problem 4: If three distances m, n, s are given, draw the fourth proportional.

From any point O we draw two rays I and II, mark off the two distances OM = m and ON = n on I and the distance OS = s on II; we draw the parallel to MS through N and designate its point of intersection with II as X. Then $OX = \frac{n}{m}\, s$ is the desired fourth proportional.

Preliminary problem 5: If two segments a and b are given, draw the mean proportional.

We designate the sought-for mean proportional ($\sqrt{ab}$) as x, the diameter of the fixed circle as d, the sum a + b that can be constructed according to 3. as c, and we write

$x = \frac{c}{d}\, s$, with $s = \sqrt{hk}$, $h = \frac{d}{c}\, a$, $k = \frac{d}{c}\, b$

(so that h + k = d). First, in accordance with 4., we draw the segments h and k, and in accordance with 3., we make HO = h on a diameter HK of the fixed circle, so that KO will necessarily equal k. Then, according to 2., we construct through O the perpendicular to HK and call the intersection of the perpendicular with the fixed circle S. Then $OS = \sqrt{hk} = s$. Finally, we draw the desired segment $x\ (= (c/d)s)$ according to 4.

Now that we have solved these five preliminary problems, the solution of the two basic problems II and III is simple.

Basic problem II: To draw the points of intersection of a given line and a given circle.

In straight-edge geometry a circle is considered determined if its center and radius are known. Let us designate the center of the given circle as C, its radius as r, the given straight line as g, the points of intersection of g with the circle as X and Y, the chord of intersection as 2s, the midpoint of the chord as M, its distance from the center C as l. From the right triangle CMX we obtain the equation $s^2 = r^2 - l^2 = (r + l)(r - l)$.
Then, in accordance with 2., we drop the perpendicular CM = l to g; we draw the segments a = r + l and b = r - l in accordance with 3.; then, according to 5., we draw the segment $s = \sqrt{ab}$; and finally, according to 3., we lay off s from M on g in both directions. The end points of the laid-off segments are the desired points of intersection X and Y.

Fig. 26.

Basic problem III: Find the points of intersection of two given circles.

Let us designate the circles as $\mathfrak{A}$ and $\mathfrak{B}$, their centers as A and B, their radii as a and b, the line AB joining their centers as c, the sought-for points of intersection as X and Y, the point of intersection of the chord XY with the center line AB as O, and, finally, the unknown segments AO and OX as q and x.

Finding q. From the triangle ABX it may be inferred, in accordance with the expanded Pythagorean theorem,

$b^2 = c^2 + a^2 - 2cq$;

thus, if we set $c^2 + a^2$ equal to $d^2$,

$q = \frac{(d + b)(d - b)}{2c}$.

Consequently, we draw, in accordance with 2. and 3., a right triangle with the legs a and c and obtain as the hypotenuse d.
Then, according to 3., we draw the segments n = d + b, m = 2c, s = d - b and finally, according to 4., $q = \frac{n}{m}\, s$.

Finding x. From the triangle OAX it follows, according to the Pythagorean theorem, that $x^2 = a^2 - q^2$; thus

$x = \sqrt{(a + q)(a - q)}$.

According to 3., we draw h = a + q, k = a - q and, according to 5., $x = \sqrt{hk}$.

Construction of X and Y. According to 3., we lay off q from A on AB. At O, the end of the segment laid off, we erect the perpendicular to AB in accordance with 2. and (according to 3.) we lay off x on it in both directions. The end points of the laid-off segments are the points of intersection that we are looking for.

The Delian Cube-doubling Problem

To construct the edge of a cube that is double the size of a given cube.

The name "Delian problem," according to an account given by the mathematician and historian Eutocius (sixth century A.D.), goes back to an old legend according to which the Delphic oracle in one of its utterances demanded that the Delian altar block be doubled.

If k is the edge of the given cube and x the edge of the cube we are seeking, the respective volumes of the two cubes are $k^3$ and $x^3$. Consequently we are confronted with the problem of finding, when the segment k is given, a second segment x such that

$x^3 = 2k^3$.

This problem is not capable of solution with compass and straight-edge. (See the Supplement to No. 36.) The numerous solutions to this problem, some of which were found in antiquity, consequently make use of more advanced means.
Thus, the solution of the Greek mathematician Menaechmus (ca. 375-325 B.C.) is based upon finding the point of intersection of the two parabolas

(1) $x^2 = ky$ and (2) $y^2 = 2kx$

with the parameters k and 2k. The abscissa x of the point of intersection satisfies the condition $x^3 = 2k^3$ as a result of the fact that $x^4 = k^2y^2 = 2k^3x$, and the sought-for edge x is thereby obtained.

Descartes (1596-1650) showed that one of the two parabolas (1) and (2) was sufficient. For their point of intersection x|y the following equation is also true:

$x^2 + y^2 = ky + 2kx$;

and this is the equation of a circle with the center coordinates k and k/2 which passes through the common apex of the two parabolas. Thus, it is only necessary to find the intersection of this circle with one of the two parabolas to find the sought-for point of intersection.

The simplest and most accurate method of constructing $x = k\sqrt[3]{2}$ is by paper strip construction.

1. We draw an equilateral triangle ABC with the side k, extend CA by AD = k, and draw the line DB.
2. We mark off on the sharp edge of a paper strip the distance k.
3. We place the paper strip in such a way that the edge passes through C and the end points of the marked-off distance fall upon two points P and Q of the extensions of AB and DB.

Then $CQ = x = k\sqrt[3]{2}$.
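Descartes' remark, that the circle $x^2 + y^2 = ky + 2kx$ and the single parabola $x^2 = ky$ already determine the edge, can be tried out numerically. The following is a rough sketch only: it locates the intersection by bisection rather than by any geometric means, assumes k > 0, and the function name `descartes_intersection` and the tolerance parameter are our own choices.

```python
def descartes_intersection(k, tol=1e-12):
    """Abscissa (other than 0) where x^2 = k*y meets x^2 + y^2 = k*y + 2*k*x.

    Bisection on g(x) = x^2 + y^2 - k*y - 2*k*x with y = x^2 / k.
    On the bracket [k, 2k] we have g(k) = -k^2 < 0 and g(2k) = 12*k^2 > 0.
    """
    def g(x):
        y = x * x / k                      # point on parabola (1)
        return x * x + y * y - k * y - 2 * k * x

    lo, hi = k, 2 * k
    while hi - lo > tol * k:
        mid = (lo + hi) / 2
        if g(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    k = 1.0
    x = descartes_intersection(k)
    print("x =", x, " x^3 / k^3 =", x ** 3 / k ** 3)
```

Substituting $y = x^2/k$ into the circle equation leaves $x^4/k^2 - 2kx = 0$, so apart from the common apex the only real solution is the one with $x^3 = 2k^3$, which is what the bisection converges to.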
Proof. Let CQ = x, BP = y. According to the leg transversal theorem used in figure CABP, $(x + k)^2 - k^2 = y(k + y)$ or

(I) $x^2 + 2kx = y^2 + ky$.

According to Menelaus' theorem applied to the triangle ACP with the transversal DBQ, $AD \cdot CQ \cdot BP = PQ \cdot AB \cdot CD$ or

(II) $xy = 2k^2$.

A glance at equations (I) and (II) shows that they are satisfied by the roots x and y of equations (1) and (2). The unknowns x and y, which are determined by (I) and (II), are therefore at the same time the coordinates of the point of intersection of Menaechmus' parabolas. In particular, $x = k\sqrt[3]{2}$. Naturally, this result can also be obtained without reference to these parabolas.

Note. The doubled cube can also be constructed by means of the so-called conchoid of Nicomedes, a Greek mathematician who lived at the beginning of the second century B.C.; we cannot, however, present this construction here.

Trisection of an Angle

To divide an angle into three equal angles.

This famous problem cannot be solved with compass and straight-edge (see the supplement). The simplest solution is by means of the following paper strip construction of Archimedes.

Fig. 28.

Taking as the center the apex S of the angle $\Phi$ to be trisected, we draw a circle of radius r that intersects the legs of the angle at A and B. We mark off a segment of length r on the edge of a paper strip. We place the edge on the figure in such a way that it passes through B and that one end point of the marked-off segment coincides with a
point P on the circle, while the other end point coincides with a point Q (outside the circle) of the extension of AS. Then $\angle PQS = \varphi$ is one third of the given angle $\Phi$.

Proof. Since PS = PQ (= r), the triangle PQS is isosceles and $\angle PSQ$ is therefore also equal to $\varphi$, while the external angle $\angle SPB$ is equal to $2\varphi$. Since the triangle SPB is also isosceles, $\angle SBP = \angle SPB = 2\varphi$. Finally, since the external angle $\Phi$ at S of the triangle SBQ is equal to the sum of the two nonadjacent internal angles SQB and SBQ, we find that $\Phi = \varphi + 2\varphi$ or $\varphi = \frac{1}{3}\Phi$. Q.E.D.

The problem of the trisection of an angle can also be solved by means of a fixed hyperbola, as the Greek mathematician Pappus (ca. 300 A.D.) demonstrated in his ingenious masterwork Συναγωγαὶ μαθηματικαί (Collectiones mathematicae).

In order to understand the construction we must first solve the problem:

Find the locus of the vertex P of a triangle ABP with fixed base AB when the base angles $\alpha$ and $\beta$ are to each other in the proportion of 2 to 1.

Let AB = 3k, AP = u. We lay off the angle $\beta$ at P on PB and designate the point of intersection of the free leg with segment AB as Q. The triangles BPQ and APQ are then isosceles ($\angle AQP$ as the external angle of BPQ is equal to $2\beta = \alpha$); consequently, AP = QP = BQ = u. We then extend AB by BC = k and set CP equal to v. From figure AQCP it then follows, according to the apex transversal theorem, that

$v^2 - u^2 = CA \cdot CQ = 4k(k + u)$ or $v^2 = (u + 2k)^2$,

more simply

$v = u + 2k$ or also $v - u = 2k$.

This is the equation for the locus in bipolar coordinates u, v. The locus of the point P is thus a hyperbola with the foci A and C and the major axis BD = 2k. (D lies between A and B in such a way that, according to the locus equation v - u = 2k, CD = 3k, and AD is equal to k.)

Let us now consider this hyperbola as having been drawn once and for all for any k. (The half of the branch belonging to the focus A, lying above the major axis, is sufficient.)
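The locus equation v - u = 2k lends itself to a quick numerical check. The sketch below is illustrative only and rests on our own assumptions: A is placed at the origin, B at (3k, 0) and C at (4k, 0), the vertex P is found from the law of sines rather than by construction, and $\beta$ must lie strictly between 0 and 60 degrees so that the triangle exists. The function names are hypothetical.

```python
import math


def vertex_p(k, beta):
    """Vertex P of triangle ABP with A=(0,0), B=(3k,0).

    The base angle at B is beta and the base angle at A is 2*beta
    (beta in radians, 0 < beta < pi/3).
    """
    alpha = 2 * beta
    # Law of sines: AP / sin(beta) = AB / sin(pi - alpha - beta).
    ap = 3 * k * math.sin(beta) / math.sin(alpha + beta)
    return (ap * math.cos(alpha), ap * math.sin(alpha))


def bipolar(k, beta):
    """Bipolar coordinates (u, v) = (AP, CP) of P, with C = (4k, 0)."""
    px, py = vertex_p(k, beta)
    u = math.hypot(px, py)
    v = math.hypot(px - 4 * k, py)
    return u, v


if __name__ == "__main__":
    k = 1.0
    for deg in (10, 25, 40, 55):
        u, v = bipolar(k, math.radians(deg))
        print(deg, "degrees: v - u =", v - u)
```

For $\beta$ = 45 degrees, for instance, the triangle has a right angle at A, P lies at (0, 3k), and CP = 5k, so that v - u = 2k as the locus equation requires.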
In order to trisect the prescribed angle $\omega$ we draw about AB as chord the arc subtending the angle $180^\circ - \omega$ and call its intersection with the hyperbola P. Then $\angle ABP = \beta = \frac{1}{3}\omega$.

Proof. From $\angle APB = 180^\circ - \omega$ it follows that $\alpha + \beta = \omega$, i.e. (because $\alpha = 2\beta$), $3\beta = \omega$.

Note. It is also possible to trisect an angle by means of Nicomedes' conchoid; this method, however, now possesses only historical interest.

Supplement to Nos. 35, 36, and 37

On the degree of irreducible equations that can be solved by quadratic roots:

Let a rational function of one or more magnitudes be known as an $\mathfrak{R}$-function and an algebraic equation with rational coefficients as an $\mathfrak{R}$-equation; in particular, let us designate an integral rational function of several magnitudes with rational coefficients as an $\mathfrak{R}$-polynomial.

We will also call a quadratic root of a rational number or an $\mathfrak{R}$-function of such quadratic roots an expression of the first order, and a quadratic root of an expression of the first order or an $\mathfrak{R}$-function of such quadratic roots an expression of the second order, etc. In every expression of the mth order we assume that none of its roots of the mth order can be expressed rationally by the remaining ones or even by expressions of lower than the mth order; we assume as well that the expression (by elimination of irrational denominators and powers higher than the first of the relevant quadratic roots) has been put into its simplest form, the normal form. An expression of the mth order that contains the root of the mth order $\sqrt{\alpha}$ will thus appear in the form $\mathfrak{a} + a\sqrt{\alpha}$, where $\mathfrak{a}$ and $a$ are expressions of the mth order (or lower) in which $\sqrt{\alpha}$ does not recur.

Now let $x_1$ be an expression of the mth order which contains the mth-order roots $\sqrt{\alpha}, \sqrt{\beta}, \sqrt{\gamma}, \ldots$ and in which a total of n different roots [of mth and lower order] occur.
If we change the signs of these n roots in every possible way, we obtain a total of $2^n = N$ similarly constructed root expressions $x_1, x_2, x_3, \ldots, x_N$. We form the function

$F(x) = (x - x_1)(x - x_2) \cdots (x - x_N)$.
If everywhere in this expression we change the sign of any of the above n roots contained in it, the value of the expression is not changed. Thus, if we multiply out the parentheses, the resulting polynomial of x (as we know from computations with root expressions) will merely contain the squares of the roots and is consequently an $\mathfrak{R}$-function of x. The equation

(1) $F(x) = 0$

is thus an $\mathfrak{R}$-equation with the roots $x_1, x_2, \ldots, x_N$, which moreover need not all be different.

We now assert: If an $\mathfrak{R}$-polynomial f(x) vanishes for a null value, such as $x_1$, of F(x), then f(x) will vanish for all the roots of F(x) = 0.

Proof. We write $x_1 = \mathfrak{a} + a\sqrt{\alpha}$ (see above) and introduce this value into f(x), and on computation we obtain

$0 = f(x_1) = \mathfrak{A} + A\sqrt{\alpha}$,

where $\mathfrak{A}$ and $A$ contain expressions of the mth order and lower with the exception of $\sqrt{\alpha}$. Now, since it is assumed that $\sqrt{\alpha}$ is independent of these expressions, $A$ cannot differ from zero (for otherwise it would follow that $\sqrt{\alpha} = -\mathfrak{A}/A$ and thus $\sqrt{\alpha}$ would be a function of $\sqrt{\beta}, \sqrt{\gamma}, \ldots$) and, therefore, necessarily $A = 0$ and $\mathfrak{A} = 0$.

We will write the expressions $A$ and $\mathfrak{A}$ as $\mathfrak{b} + b\sqrt{\beta}$ and $\mathfrak{B} + B\sqrt{\beta}$, where $\mathfrak{b}$, $b$, $\mathfrak{B}$, $B$ are no longer dependent upon $\sqrt{\alpha}$ and $\sqrt{\beta}$. From $\mathfrak{b} + b\sqrt{\beta} = 0$ and $\mathfrak{B} + B\sqrt{\beta} = 0$ it follows as above that $\mathfrak{b} = 0$, $b = 0$, $\mathfrak{B} = 0$, $B = 0$, etc.

From these values we finally obtain equations that possess no roots but only rational numbers and which are, in other words, independent of the signs of the n roots occurring in $x_1$ and consequently are unchanged when the signs are changed in any way. Now, since this change of sign transforms $x_1$ into one of the values $x_2, x_3, \ldots, x_N$, f(x) must therefore also vanish for $x_2, x_3, \ldots, x_N$, which is what we set out to prove.
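A small numerical illustration may make the function F(x) more concrete. The sketch below covers only a special case that we have chosen ourselves, namely first-order expressions that are plain sums of square roots of rational numbers, such as $x_1 = \sqrt{2} + \sqrt{3}$. It forms all sign changes, multiplies out the product in floating point, and shows that the coefficients come out rational (here integral). The function names are hypothetical.

```python
import itertools
import math


def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists (highest power first)."""
    r = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            r[i + j] += a * b
    return r


def sign_change_polynomial(radicands):
    """Coefficients of F(x) = prod (x - x_i), highest power first.

    The x_i are all sums (+-)sqrt(r_1) (+-)sqrt(r_2) ... obtained from
    x_1 = sqrt(r_1) + sqrt(r_2) + ... by changing signs in every possible way.
    """
    poly = [1.0]
    for signs in itertools.product((1, -1), repeat=len(radicands)):
        root = sum(s * math.sqrt(r) for s, r in zip(signs, radicands))
        poly = poly_mul(poly, [1.0, -root])     # multiply by (x - root)
    return poly


if __name__ == "__main__":
    coeffs = sign_change_polynomial([2, 3])
    print([round(c, 9) for c in coeffs])
```

For the radicands 2 and 3 there are n = 2 roots and N = 4 sign changes, and the product works out to $x^4 - 10x^2 + 1$, in which only the squares of the two roots have survived.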
Among all the $\mathfrak{R}$-polynomials f(x) that vanish for $x = x_1$ there is one possessing the lowest possible degree $\nu$; let this be called $\varphi(x)$. The polynomial $\varphi(x)$ is irreducible in the natural rationality domain (cf. No. 24). [If $\varphi$ were divisible: $\varphi(x) = u(x) \cdot v(x)$, then when $\varphi(x_1) = 0$ it would necessarily follow that one of the factors such as $v(x_1)$ must equal zero: this would contradict our assumption in that there would be a polynomial v of lower degree than $\varphi$ with the null value $x_1$.]

Since the $\mathfrak{R}$-polynomial F(x) vanishes for a null value $x_1$ of the irreducible polynomial $\varphi(x)$, F(x), according to Abel's irreducibility theorem (No. 25), is divisible by $\varphi(x)$:

$F(x) = F_1(x)\varphi(x)$.

Since, moreover, the $\mathfrak{R}$-polynomial $F_1(x)$ vanishes for a null value of F, thus also for $\varphi$, $F_1$ is also divisible by $\varphi$ and

$F_1(x) = F_2(x)\varphi(x)$;

consequently

$F(x) = F_2(x)\varphi(x)^2$, etc.

Finally we obtain

$F(x) = \varphi(x)^{\mu}$

(assuming that the first coefficient of F and $\varphi$ has the value 1). If we compare the degree of the polynomial on the right-hand side of this equation with that of the polynomial on the left, we find that

$N = \mu\nu$.

Since, however, $N = 2^n$, $\nu$ must also be a power of 2.

Conclusion: The degree of an irreducible equation with rational coefficients that is satisfied by an expression formed from quadratic roots must be a power of 2.

From this the two following theorems are easily obtained:

I. It is impossible to double a cube with compass and straight-edge.

II. It is in general impossible to trisect an angle with compass and straight-edge.

In both problems the specific magnitude x to be constructed is a root of an irreducible equation of the third degree, and according to our conclusion it is impossible for a root of such an equation to be constructed from quadratic roots, and therefore with compass and straight-edge. [As is well known, all expressions that can be represented by compass and straight-edge constructions are either rational or built up from quadratic roots.]
Thus it merely remains to show that the equations for doubling a cube and trisecting an angle are cubic and irreducible.

The edge x of the cube that is twice the size of a cube with an edge equal to 1 satisfies the equation

$x^3 - 2 = 0$.

If this equation were reducible, then it would necessarily follow that

$x^3 - 2 = (x^2 + hx + k)(x - l)$,

where h, k, l are rational numbers. Accordingly, the equation $x^3 = 2$ would have to possess the rational root l = p/q, where we may assume that p and q have no common divisor, and consequently $(p/q)^3$ would have to be equal to 2 or $p^3$ equal to $2q^3$. Consequently, $p^3$ would have to be divisible by $q^3$ and therefore p would also have to be divisible by q, which is not the case (unless q = 1, and then $p^3 = 2$, which no integer p satisfies).

In the trisection of an angle we can consider the given angle $\alpha$ and the angle we are looking for $\varphi$ as peripheral angles of a unit circle, so that the subtended chords are $a = 2\sin\alpha$ and $x = 2\sin\varphi$, respectively. From $\alpha = 3\varphi$ and

$\sin 3\varphi = 3\sin\varphi - 4\sin^3\varphi$

it follows that

$\sin\alpha = 3\sin\varphi - 4\sin^3\varphi$ or $x^3 - 3x + a = 0$.

If we assume a chord a of length 3m/n, where m and n possess no common divisors and are integers that cannot be divided by 3, and if we multiply the equation by $n^3$ and set nx = X, the equation assumes the form

$X^3 - 3n^2X + 3mn^2 = 0$.

But according to Schoenemann's theorem (No. 25) this equation is irreducible, since the coefficient of X is divisible by the prime number 3 and the free term is divisible by 3, but not by $3^2$.

The Regular Heptadecagon

To construct a regular heptadecagon.

In other words: To divide the perimeter of a circle into 17 equal parts.

This celebrated problem was solved by Gauss in his major work Disquisitiones arithmeticae, published in 1801. In the section of this
work dealing with the solution of the binomial equations $x^n = 1$ Gauss proved the important theorem:

A regular polygon can be constructed with compass and straight-edge when and only when the number of its sides has the form $2^m p_1 p_2 \cdots p_\nu$, where $p_1, p_2, \ldots, p_\nu$ are all different prime numbers of the form $2^n + 1$.

For m = 0, $\nu$ = 1, and $p_1 = 3$ or $p_1 = 5$, we obtain the cases of the regular triangle and pentagon, respectively, which had already been solved in antiquity.

In the conclusion to his investigations Gauss said, "The division of a circle into three and into five equal parts was already known in Euclid's time; it is amazing that nothing new was added to these discoveries in the next two thousand years, that the geometers considered it as confirmed that, except for these cases and those that could be derived from them, regular polygons could not be constructed with compass and straight-edge."

The great advances made in the division of the circle by Gauss were possible only because Gauss transformed the originally purely geometrical problem into an algebraic one. He arrived at this transformation in the course of his representation of complex numbers in the Gauss plane, which was named after him. An arbitrary complex number c = a + bi is conventionally represented in this plane by a point with the coordinates a|b; this point itself is designated as "the complex number c." Another common method is the trigonometric representation

$c = r(\cos\vartheta + i\sin\vartheta)$

of the complex number c, where r represents the so-called magnitude (modulus) of the number, the distance of the number c from the null point O of the number plane, and $\vartheta$ the so-called angle of the number, which is the angle formed by the distance r and the axis of the positive real numbers.

The points of the unit circle drawn about the center O represent the so-called Gauss numbers, i.e., numbers of the form

$\gamma = \cos\varphi + i\sin\varphi$,

where $\varphi$ is the angle of the number $\gamma$.
We will write for short

$\cos\varphi + i\sin\varphi = 1_\varphi$.
The fundamental property of the Gauss numbers is described by the relation

$1_\alpha \cdot 1_\beta = 1_{\alpha + \beta}$,

i.e., the product of two Gauss numbers is also a Gauss number; the angle of the product is the sum of the angles of the factors.

It is easily confirmed that the theorem also holds for products of more than two Gauss numbers. For example,

$(1_\varphi)^n = 1_\varphi \cdot 1_\varphi \cdots 1_\varphi = 1_{n\varphi}$

or, written out fully,

$(\cos\varphi + i\sin\varphi)^n = \cos n\varphi + i\sin n\varphi$.

This is Demoivre's formula (Abraham Demoivre, 1667-1754).

To obtain a regular n-gon we mark off the angle $\varphi = 2\pi/n$ n times in succession from point 1 on the unit circle. The resulting points representing the divisions are

$\varepsilon_1 = \varepsilon = \cos\varphi + i\sin\varphi$, $\varepsilon_2 = \cos 2\varphi + i\sin 2\varphi$, ..., $\varepsilon_n = \cos n\varphi + i\sin n\varphi = 1$.

Then $\varepsilon_\nu = \varepsilon_1^\nu = \varepsilon^\nu$ and $\varepsilon_\nu^n = \varepsilon^{\nu n} = (\varepsilon^n)^\nu = 1$. The n corners $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n$ of a regular n-gon are therefore the roots of the equation

$z^n = 1$.

Thus the geometric problem of "constructing a regular n-gon," following Gauss, turns out to be the problem "of finding the roots of the equation $z^n = 1$."

Since one of the n roots of this equation has the value 1, we need only find the other (n - 1) roots. These satisfy the equation

$\frac{z^n - 1}{z - 1} = z^{n-1} + z^{n-2} + \cdots + z^2 + z + 1 = 0$,

the so-called circle partition equation.

In the case of n = 3, for example, the equation reads

$z^2 + z + 1 = 0$

and has the roots

$\varepsilon_1 = \frac{-1 + i\sqrt{3}}{2}$, $\varepsilon_2 = \frac{-1 - i\sqrt{3}}{2}$.
Since the complex numbers $\varepsilon_1$ and $\varepsilon_2$ both possess the real component $-\frac{1}{2}$, the corners $\varepsilon_1$ and $\varepsilon_2$ of the regular triangle are the points of intersection of the unit circle with the parallel to the imaginary number axis that passes through the point $-\frac{1}{2}$.

A proof of the general theorem of Gauss would take us too far, so that we will restrict ourselves here to a brief exposition of the basic idea and the elements that are necessary for an understanding of the construction of the regular heptadecagon.

Let us first take note of the fact that the construction of the regular $2^m N$-gon, where N is the product of the odd prime numbers p, q, r, ..., is equivalent to drawing the regular p-gon, q-gon, r-gon, etc. If we have these polygons, we determine the integral numbers x, y, z, ... in such manner that

$\frac{N}{p}x + \frac{N}{q}y + \frac{N}{r}z + \cdots = 1$.

This can be done because the numbers

$\frac{N}{p}, \frac{N}{q}, \frac{N}{r}, \ldots$

have no common divisor. Then

$\frac{1}{N} = \frac{x}{p} + \frac{y}{q} + \frac{z}{r} + \cdots$,

so that the Nth part of the circle is obtained by joining x pths, y qths, z rths, ... of the circle perimeter.

Consequently, we need only be concerned with the solution of the circle partition equation

(1) $z^{p-1} + z^{p-2} + \cdots + z^2 + z + 1 = 0$,

in which p is a prime number of the form $2^n + 1$.

The brilliant idea underlying Gauss' method of solution consists in grouping the roots $\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_{p-1}$ of (1) (where $\varepsilon_\nu = \varepsilon_1^\nu = \varepsilon^\nu$, $\varepsilon = \cos\varphi + i\sin\varphi$, $\varphi = 2\pi/p$) into so-called periods.

The Gauss periods are root sums in which each successive term is the gth power of the preceding term, and the gth power of the last sum term results once again in the first term (hence the name period). The exponent g is here a so-called primitive root of the prime number p, i.e., an integer such that $g^{p-1}$ is the smallest of its integral powers that leaves a
residue of 1 on division by p. In other words, g is an integer such that the roots of (1) can be expressed in the form

$z_0 = \varepsilon$, $z_1 = \varepsilon^{g}$, $z_2 = \varepsilon^{g^2}$, ..., $z_{p-2} = \varepsilon^{g^{p-2}}$.

The first period is

$z_0 + z_1 + z_2 + \cdots + z_{p-2}$.

In fact, $z_{\nu+1} = z_\nu^g$ and $z_{p-2}^g = \varepsilon^{g^{p-1}} = \varepsilon^{sp+1}$ (where s is an integer) $= \varepsilon$.

The following period contains only a = (p - 1)/2 terms and reads

$z_0 + z_2 + z_4 + \cdots + z_r$ (r = 2a - 2).

In this period each term is the Gth power of the preceding term and $z_r^G = z_0$, where $G = g^2$ (and $G^a$ is the smallest integral power of G that leaves a residue of 1 on division by p).

Let $b = \frac{1}{2}a$, $c = \frac{1}{2}b$, $d = \frac{1}{2}c$, etc. Gauss' method for solving the circle partition equation consists of reducing (1) to a chain of groups of quadratic equations. The first group contains one, the second group two, the third group four, the fourth group eight, etc., and the last group a quadratic equations. The roots of the first group form periods of a terms, those of the second group periods of b terms, those of the third periods of c terms, those of the last periods of a single term, i.e., the roots of (1) itself. The coefficients of the equations of one group can be determined from the coefficients of the preceding group, so that the equations of the last group give us the roots of (1) directly.

In the successive determination of coefficients the formula

(2) $\varepsilon^E = \varepsilon^r$,

in which r represents the residue remaining when the integral exponent E is divided by p, plays a predominant role.

We will now use the Gauss method to solve the equation for the heptadecagon (p = 17):

$z^{16} + z^{15} + \cdots + z^2 + z + 1 = 0$.

Let $\varphi = 2\pi/17$, $\varepsilon = \varepsilon_1 = \cos\varphi + i\sin\varphi$, $\varepsilon_\nu = \varepsilon^\nu$, and accordingly, let $\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_{17}$ be the corners of the heptadecagon, for which $z_\nu = \varepsilon^{g^\nu}$, where g represents the (smallest) primitive root 3 of 17. The powers

$3^1, 3^2, 3^3, \ldots, 3^{16}$

on division by 17 leave the residues

3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6, 1.
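The list of residues, and the fact that 3 is a primitive root of 17, can be reproduced with a few lines of Python. This is merely a convenience sketch; the function names are our own, and the only library facility used is the built-in three-argument `pow` for modular powers.

```python
def power_residues(g, p):
    """Residues of g^1, g^2, ..., g^(p-1) on division by p."""
    return [pow(g, k, p) for k in range(1, p)]


def is_primitive_root(g, p):
    """True if the powers of g run through all residues 1, ..., p-1 (p prime)."""
    return sorted(power_residues(g, p)) == list(range(1, p))


if __name__ == "__main__":
    print(power_residues(3, 17))
    print([g for g in range(2, 17) if is_primitive_root(g, 17)])
```

The first line printed is the residue sequence for g = 3; the second lists every primitive root of 17, of which 3 is the smallest (2, for instance, fails because $2^8$ already leaves the residue 1).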
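Consequently, according to (2),

$z_0 = \varepsilon^{1}$, $z_2 = \varepsilon^{9}$, $z_4 = \varepsilon^{13}$, $z_6 = \varepsilon^{15}$, $z_8 = \varepsilon^{16}$, $z_{10} = \varepsilon^{8}$, $z_{12} = \varepsilon^{4}$, $z_{14} = \varepsilon^{2}$,

$z_1 = \varepsilon^{3}$, $z_3 = \varepsilon^{10}$, $z_5 = \varepsilon^{5}$, $z_7 = \varepsilon^{11}$, $z_9 = \varepsilon^{14}$, $z_{11} = \varepsilon^{7}$, $z_{13} = \varepsilon^{12}$, $z_{15} = \varepsilon^{6}$.

Each root in the series $z_0, z_1, z_2, \ldots$ is the cube of the preceding one.

The first group in the chain contains a quadratic equation the roots of which are the periods

$X = z_0 + z_2 + z_4 + z_6 + z_8 + z_{10} + z_{12} + z_{14} = \varepsilon + \varepsilon^{9} + \varepsilon^{13} + \varepsilon^{15} + \varepsilon^{16} + \varepsilon^{8} + \varepsilon^{4} + \varepsilon^{2}$

and

$x = z_1 + z_3 + z_5 + z_7 + z_9 + z_{11} + z_{13} + z_{15} = \varepsilon^{3} + \varepsilon^{10} + \varepsilon^{5} + \varepsilon^{11} + \varepsilon^{14} + \varepsilon^{7} + \varepsilon^{12} + \varepsilon^{6}$.

Since the sum of the roots of (1) possesses the value -1, we obtain the relation

$X + x = -1$.

Making use of (2), we find on computation that Xx is equal to four times the sum of all the roots of (1), and consequently

$Xx = -4$.

The quadratic equation for the periods X and x consequently reads

(I) $t^2 + t - 4 = 0$.

Its roots are

$X = \frac{-1 + \sqrt{17}}{2}$ and $x = \frac{-1 - \sqrt{17}}{2}$.

That X > x is shown in the following manner. If we designate the real component of the complex number c as $\Re c$, then (cf. Fig. 29)

(3) $\Re\varepsilon_\mu = \Re\varepsilon_\nu$ if $\mu + \nu = 17$,

since the corners $\varepsilon_\mu$ and $\varepsilon_\nu$ of the heptadecagon are symmetrical to the real axis. Applying this rule, we obtain

$X = 2[\Re\varepsilon_1 + \Re\varepsilon_2 + \Re\varepsilon_4 + \Re\varepsilon_8]$,
$x = 2(\Re\varepsilon_3 + \Re\varepsilon_5 + \Re\varepsilon_6 + \Re\varepsilon_7)$.

A glance at the figure shows that the bracket is positive and the parenthesis negative.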
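The relations X + x = -1 and Xx = -4 can be confirmed numerically before going on. The following sketch uses floating-point complex arithmetic from the standard `cmath` module; the names `z`, `period`, and `eight_term_periods` are our own illustrative choices, and the results are only approximate to rounding error.

```python
import cmath
import math

P = 17   # the prime p
G = 3    # the primitive root g used in the text
EPS = cmath.exp(2j * math.pi / P)   # epsilon = cos(phi) + i*sin(phi), phi = 2*pi/17


def z(v):
    """z_v = epsilon^(g^v), with the exponent reduced modulo p by rule (2)."""
    return EPS ** pow(G, v, P)


def period(start, step):
    """Sum z_start + z_(start+step) + ... over the indices below p - 1."""
    return sum(z(v) for v in range(start, P - 1, step))


def eight_term_periods():
    """The two eight-term periods X (even indices) and x (odd indices)."""
    return period(0, 2), period(1, 2)


if __name__ == "__main__":
    X, x = eight_term_periods()
    print("X =", X.real, " x =", x.real)
    print("X + x =", (X + x).real, " X * x =", (X * x).real)
```

Up to rounding, the imaginary parts vanish and the real parts agree with $(-1 \pm \sqrt{17})/2$, the larger value belonging to X.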
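The four four-term periods are

$U = z_0 + z_4 + z_8 + z_{12} = \varepsilon^{1} + \varepsilon^{13} + \varepsilon^{16} + \varepsilon^{4}$,
$u = z_2 + z_6 + z_{10} + z_{14} = \varepsilon^{9} + \varepsilon^{15} + \varepsilon^{8} + \varepsilon^{2}$,
$V = z_1 + z_5 + z_9 + z_{13} = \varepsilon^{3} + \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{12}$,
$v = z_3 + z_7 + z_{11} + z_{15} = \varepsilon^{10} + \varepsilon^{11} + \varepsilon^{7} + \varepsilon^{6}$.

Fig. 29.

Here we obtain

$U + u = X$ | $V + v = x$

and, applying rule (2),

$Uu = \varepsilon^{1} + \varepsilon^{2} + \cdots + \varepsilon^{16} = -1$ | $Vv = \varepsilon^{1} + \varepsilon^{2} + \cdots + \varepsilon^{16} = -1$.

The respective quadratic equations are

(II) $t^2 - Xt - 1 = 0$ | $t^2 - xt - 1 = 0$.

Their roots are

$U = \frac{X + \sqrt{X^2 + 4}}{2}$, $u = \frac{X - \sqrt{X^2 + 4}}{2}$, $V = \frac{x + \sqrt{x^2 + 4}}{2}$, $v = \frac{x - \sqrt{x^2 + 4}}{2}$.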
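The step from (I) to (II) is a chain of square roots, and it can be compared with the periods summed directly. The sketch below is an illustration under our own assumptions: it takes the four-term periods to be the sums of $z_\nu = \varepsilon^{3^\nu}$ over the indices $\nu \equiv 0, 2, 1, 3 \pmod 4$ for U, u, V, v respectively, assigns the larger root of each quadratic to the capital letter, and uses hypothetical function names throughout.

```python
import cmath
import math

P, G = 17, 3
EPS = cmath.exp(2j * math.pi / P)


def direct_period(start, step):
    """Real part of the period z_start + z_(start+step) + ..., z_v = eps^(3^v)."""
    total = sum(EPS ** pow(G, v, P) for v in range(start, P - 1, step))
    return total.real


def by_square_roots():
    """U, u, V, v obtained from equations (I) and (II) by square roots alone."""
    X = (-1 + math.sqrt(17)) / 2          # larger root of t^2 + t - 4 = 0
    x = (-1 - math.sqrt(17)) / 2
    U = (X + math.sqrt(X * X + 4)) / 2    # roots of t^2 - X*t - 1 = 0
    u = (X - math.sqrt(X * X + 4)) / 2
    V = (x + math.sqrt(x * x + 4)) / 2    # roots of t^2 - x*t - 1 = 0
    v = (x - math.sqrt(x * x + 4)) / 2
    return U, u, V, v


if __name__ == "__main__":
    direct = [direct_period(s, 4) for s in (0, 2, 1, 3)]
    for name, a, b in zip("UuVv", by_square_roots(), direct):
        print(name, a, b)
```

Each pair of printed values should agree to rounding error, which shows in a small way how the quadratic equations of one group deliver the periods of the next.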
Here U > u and V > v. For it follows from rule (3) that

$U = 2[\Re\varepsilon_1 + \Re\varepsilon_4]$, $V = 2[\Re\varepsilon_3 + \Re\varepsilon_5]$,
$u = 2(\Re\varepsilon_2 + \Re\varepsilon_8)$, $v = 2(\Re\varepsilon_6 + \Re\varepsilon_7)$.

A look at the heptadecagon shows that the brackets are larger than the parentheses immediately below them.

Of the two-membered periods obtained we need only the two

$W = z_0 + z_8 = \varepsilon + \varepsilon^{16}$ and $w = z_4 + z_{12} = \varepsilon^{13} + \varepsilon^{4}$.

Here we find

$W + w = U$

and, according to (2),

$Ww = \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{3} + \varepsilon^{12} = V$.

Here also W > w, since $W = 2\Re\varepsilon_1$ and $w = 2\Re\varepsilon_4$, but $\Re\varepsilon_1 > \Re\varepsilon_4$. The quadratic equation with the roots W and w reads

(III) $t^2 - Ut + V = 0$.

The construction of the heptadecagon accordingly consists of the following four steps:

I. construction of X and x;
II. construction of U and V;
III. construction of W and w according to (III);
IV. finding the points W and w on the real number axis. The perpendicular bisectors of the lines joining them to the null point cut the unit circle at the corners $\varepsilon_1$, $\varepsilon_{16}$ and $\varepsilon_4$, $\varepsilon_{13}$ of the regular heptadecagon (thus all the other corners are also determined).

Archimedes' Determination of the Number $\pi$

Archimedes of Syracuse (287?-212 B.C.) was the greatest mathematician of the ancient world. The most famous of his achievements is the measurement of the circle. The crux of this problem is the calculation of the number $\pi$, i.e., the number by which the diameter and the square of the radius must be multiplied to determine the circumference and area, respectively, of a circle.*

* The proposal that this number be designated as $\pi$ came from Leonhard Euler (Commentarii Academiae Petropolitanae ad annum 1739, vol. IX).
The idea upon which Archimedes' method was based is the following. The circumference of a circle lies between the perimeters of a circumscribed and an inscribed n-gon, and the greater n is, the smaller is the deviation of the circumference of the circle from the perimeters of the two n-gons. The object is then to calculate the perimeters of a circumscribed and an inscribed regular polygon with so great a number of sides that their difference is equal to a negligible magnitude $\varepsilon$. If the circumference of the circle is then set equal to the perimeter of one of these polygons, the resulting deviation from the true circumference of the circle is smaller than $\varepsilon$, with the result that when $\varepsilon$ is sufficiently small the circumference of the circle is determined with sufficient accuracy.

Fig. 30.

The particular achievement of Archimedes was to indicate a method by which the perimeters of such many-sided polygons could be calculated. This method, the so-called Archimedes algorithm, is based upon the two Archimedes recurrence formulas which we will now derive.

In Figure 30, let Z be the center of the circle, let $AB = 2t$ be the side of the circumscribed and $CD = 2s$ the side of the inscribed regular n-gon. Let M be the midpoint of AB and N the midpoint of CD, and let O be the point of intersection with MA of the tangent to the circle passing through C. Accordingly, $OM = OC = t'$ is half the side of the circumscribed 2n-gon and $MC = MD = 2s'$ is the side of the inscribed regular 2n-gon.

Since ACO and AMZ are similar right triangles,

$$t'/(t - t') = OC/OA = MZ/AZ,$$
and from the ray theorem,

$$s/t = NC/MA = CZ/AZ.$$

Since the right sides of these proportions are equal (MZ and CZ are both radii), we obtain

$$t'/(t - t') = s/t \quad\text{or}\quad t' = \frac{st}{s + t}.$$

Since the isosceles triangles CMD and COM are similar,

$$2s'/2s = t'/2s', \quad\text{i.e.,}\quad 2s'^2 = st'.$$

If a is the perimeter of the circumscribed n-gon and b the perimeter of the inscribed n-gon, and a' and b' are the perimeters, respectively, of the circumscribed and inscribed 2n-gons, we then have

$$a = 2nt, \quad b = 2ns, \quad a' = 4nt', \quad b' = 4ns'.$$

If we then introduce the values obtained for t, s, t', s' from these equations into the two formulas we have found, they are transformed into the Archimedes recurrence formulas:

(I) $a' = \dfrac{2ab}{a + b}$, (II) $b' = \sqrt{ba'}$.

Thus, a' is the harmonic mean of a and b, b' the geometric mean of b and a'.

Now let us consider in succession the regular n-gon, 2n-gon, 4n-gon, 8n-gon, etc., and let us designate the perimeters of the circumscribed and inscribed $2^{\nu}n$-gons as $a_\nu$ and $b_\nu$, respectively. We then obtain the Archimedes series

$$a_0,\ b_0,\ a_1,\ b_1,\ a_2,\ b_2,\ \ldots$$

of the successive perimeters. Here the recurrence formulas (I) and (II) read

(1) $a_{\nu+1} = \dfrac{2a_\nu b_\nu}{a_\nu + b_\nu}$, (2) $b_{\nu+1} = \sqrt{b_\nu a_{\nu+1}}$.

That is: Each term of the Archimedes series is alternately the harmonic and geometric mean of the two preceding terms.

Using this rule, we are able to calculate all the terms of the series if the first two terms are known. The Archimedes algorithm consists of this calculation of the successive perimeters of the polygons.

Archimedes chose as his initial polygon the regular hexagon, the perimeters of which are $a_0 = 4\sqrt{3}\,r$ and $b_0 = 6r$, respectively, and
worked out the series $a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4$ up to the perimeters $a_4$ and $b_4$ of the circumscribed and inscribed regular 96-cornered polygon. He found that

$$a_4 \approx 3\tfrac{1}{7}\,d, \qquad b_4 \approx 3\tfrac{10}{71}\,d,$$

where d is the diameter of the circle. The Archimedes approximation for the value of π is consequently

$$\pi = 3\tfrac{1}{7} = 3.14.$$

Note. The calculations involved in the Archimedes method are very laborious. For this reason Christian Huygens, in his treatise published in Leyden in 1654, De circuli magnitudine inventa, replaced the limits $a_\nu$ and $b_\nu$ of the circumference u of the Archimedes method by the limits $\alpha_\nu$ and $\beta_\nu$, which gave a closer approximation of u, since they made it possible to obtain π correctly to two decimal places for $\nu = 1$.

Huygens' method, however, involves rather complicated considerations. The following method supplied by the author is faster and more convenient; it is based on the known theorem: The harmonic mean of two numbers is smaller than the geometric mean of the numbers. This can be expressed as

$$\frac{2xy}{x + y} < \sqrt{xy}.$$

[Since $(\sqrt{x} - \sqrt{y})^2 > 0$, it follows that $2\sqrt{xy} < x + y$, and from this, multiplication with $\sqrt{xy}/(x + y)$ gives the designated inequality.]

According to this theorem, we obtain from (1)

$$a_{\nu+1} < \sqrt{a_\nu b_\nu}.$$

If we multiply the square of this inequality by the square of (2), we obtain

$$a_{\nu+1} b_{\nu+1}^2 < a_\nu b_\nu^2$$

or, if we set $\sqrt[3]{a_\nu b_\nu^2} = A_\nu$, then

(3) $A_{\nu+1} < A_\nu$.

According to the same theorem, it follows from (2) that

$$b_{\nu+1} > \frac{2b_\nu a_{\nu+1}}{b_\nu + a_{\nu+1}} \quad\text{or}\quad \frac{2}{b_{\nu+1}} < \frac{1}{b_\nu} + \frac{1}{a_{\nu+1}}.$$

If we then add to this inequality the equation

$$\frac{2}{a_{\nu+1}} = \frac{1}{a_\nu} + \frac{1}{b_\nu},$$
which is only a different manner of writing (1), we obtain

$$\frac{1}{a_{\nu+1}} + \frac{2}{b_{\nu+1}} < \frac{1}{a_\nu} + \frac{2}{b_\nu}$$

or

$$\frac{3a_{\nu+1} b_{\nu+1}}{2a_{\nu+1} + b_{\nu+1}} > \frac{3a_\nu b_\nu}{2a_\nu + b_\nu},$$

or, in abbreviated form, if we set

$$\frac{3a_\nu b_\nu}{2a_\nu + b_\nu} = B_\nu,$$

then

(4) $B_{\nu+1} > B_\nu$.

The inequalities (3) and (4) imply that as $\nu$ increases, $A_\nu$ grows continuously smaller, $B_\nu$ continuously larger. Since for infinitely great $\nu$ both $A_\nu$ and $B_\nu$ become the circumference u of the circle, for every finite $\nu$ it must be true that

$$B_\nu < u < A_\nu.$$

The limits $A_\nu$ and $B_\nu$ of this inequality are much narrower than the Archimedes limits $a_\nu$ and $b_\nu$. If we take the hexagon, for example, as our initial polygon and $d = 1$, then $a_0 = 2\sqrt{3}$, $b_0 = 3$, $u = \pi$, and we obtain $A_1 = 3.1419$ and $B_0 = 3.1402$; thus we are able to obtain the correct value of π to two decimal places by using only the inscribed hexagon and the circumscribed dodecagon, whereas the same precision is achieved by the Archimedes method only with the use of the polygon of 96 sides.

Fuss' Problem of the Chord-Tangent Quadrilateral

To find the relation between the radii and the line joining the centers of the circles of circumscription and inscription of a bicentric quadrilateral.

A bicentric or chord-tangent quadrilateral is defined as a quadrilateral that is simultaneously inscribed in one circle and circumscribed about another.

Let PQRS be such a quadrilateral, $\mathfrak{C}$ the circumscribed circle, T the inscribed circle. Let the points of tangency of the opposite sides PQ and RS with circle T be X and X', let the points of tangency of the opposite sides QR and SP be Y and Y', and let the
point of intersection of the tangency chords XX' and YY' be O. If we then apply the theorem of the sum of the angles of a quadrilateral to the two quadrilaterals OXPY and OX'RY', designating the quadrilateral angles by means of a line over the letter representing the corner, we obtain the two equations

$$\bar{O} + \bar{X} + \bar{P} + \bar{Y} = 360^\circ, \qquad \bar{O} + \bar{X}' + \bar{R} + \bar{Y}' = 360^\circ.$$

Since the angles $\bar{X}$ and $\bar{X}'$ ($\bar{Y}$ and $\bar{Y}'$) situated at opposite sides of the chord XX' (YY') add up to 180°, addition of the two equations gives the following relation

(1) $2\bar{O} + \bar{P} + \bar{R} = 360^\circ$.

Now the sum of the two opposite angles $\bar{P}$ and $\bar{R}$ of the chord quadrilateral PQRS is 180°; consequently, $\bar{O} = 90^\circ$.

The tangency chords of the two pairs of opposite sides of a bicentric quadrilateral are therefore perpendicular to each other.

This condition is also sufficient: A bicentric quadrilateral PQRS is obtained if the tangents PQ, RS, SP, QR are drawn through the end points X, X', Y, Y' of two perpendicular chords XX' and YY' of an arbitrary circle T. In fact, it now follows from (1), since $\bar{O} = 90^\circ$, that the sum of the opposite angles $\bar{P}$ and $\bar{R}$ is 180°, i.e., that PQRS is also a chord quadrilateral.

The simplest way of obtaining the desired relation between the radii and the axis of the centers of the circumscribed and inscribed circles is by means of the following locus problem.

A right angle is rotated about its fixed vertex, which is located inside a circle; find the locus of
the point of intersection of the two circle tangents that pass through the points of intersection of the legs of the angle with the circle.

Solution of the locus problem. Let the given circle be known as T, its midpoint as M, its radius as $\rho$, the fixed vertex of the right angle as O, and the distance of the vertex from M as e. Let the legs of the right angle intersect the circle at the (moving) points X and Y; and let the point of intersection of the two circle tangents passing through X and Y be known as P and its distance from the center of the circle as p.

Fig. 32.

We will first determine the relation between p and its angle $\varphi$ ($= \angle OMP$) with the fixed line MO.

Since OXY is a right triangle,

$$OF^2 = FX \cdot FY,$$

where F represents the base point of the altitude to the hypotenuse. If we introduce the projections $\rho' = MN$ and $e' = e\cos\varphi$ and $\rho'' = NX$ and $e'' = e\sin\varphi$ ($= NF$) on the lines MP and XY, respectively, the equation can be written

$$(\rho' - e')^2 = (\rho'' - e'')(\rho'' + e'')$$

or

$$2\rho'^2 - 2\rho' e' + e'^2 + e''^2 = \rho'^2 + \rho''^2$$

or

(2) $2\rho'^2 - 2\rho' e\cos\varphi + e^2 = \rho^2$.

Since MXP is a right triangle, $MX^2 = MP \cdot MN$ or

(3) $\rho^2 = p\rho'$.
If we introduce the value of $\rho'$ from (3) into (2), we obtain the relation we are looking for:

(4) $p^2 + 2\,\dfrac{\rho^2 e}{\rho^2 - e^2}\,p\cos\varphi = \dfrac{2\rho^4}{\rho^2 - e^2}$.

The distance $r = ZP$ from P of a point Z on the extension of OM at a distance $MZ = z$ from M is obtained by the cosine theorem:

(5) $r^2 = z^2 + p^2 + 2zp\cos\varphi$.

If for z, which up to this point has been arbitrary, we now choose the value

(I) $MZ = z = \dfrac{\rho^2 e}{\rho^2 - e^2}$,

we obtain, in accordance with (4),

(II) $r^2 = z^2 + \dfrac{2\rho^4}{\rho^2 - e^2}$,

and consequently r has a constant value!

The desired locus of the point of intersection P is thus a circle $\mathfrak{C}$ whose center Z, which is situated on the extension of OM, is determined by (I) and whose radius r is determined by (II).

Naturally, also belonging to this locus are the points of intersection Q, R, S of the tangents, which are obtained when we draw the tangents through the points of intersection of the circle T with the extensions of XO and YO. The quadrilateral PQRS is simultaneously a tangent and chord quadrilateral, in that it circumscribes circle T and is inscribed in circle $\mathfrak{C}$. If the right angle XOY is rotated about O so that the points X, Y describe the circle T, the quadrilateral PQRS continuously assumes different positions but always circumscribes circle T and is always inscribed in circle $\mathfrak{C}$. Similarly, we see that in this way all the bicentric quadrilaterals belonging to the two circles T and $\mathfrak{C}$ are obtained.

The obtained formulas (I) and (II) contain the solution to the problem posed. We substitute the value obtained from (II) for $\rho^2 - e^2$ in (I) and obtain $e = 2z\rho^2/(r^2 - z^2)$. From this there follows $\rho^2 - e^2 = \rho^2[(r^2 - z^2)^2 - 4\rho^2 z^2]/(r^2 - z^2)^2$. When this value is introduced into (II) we finally obtain the sought-for relation between the radii r and $\rho$ and the axis z connecting the centers of the circumscribed and inscribed circles of the bicentric quadrilateral:

$$2\rho^2(r^2 + z^2) = (r^2 - z^2)^2.$$
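The locus statement and the final relation lend themselves to a numerical check. The sketch below is not part of the original text: it assumes coordinates with M at the origin, O at (e, 0) and Z at (-z, 0), takes ρ = 1 and e = 0.4 as arbitrary sample values, and uses a hypothetical helper `tangent_intersection` that finds P as the point whose scalar products with X and with Y both equal ρ².

```python
import math

def tangent_intersection(rho, e, theta):
    """Intersection P of the tangents at X and Y, where X and Y are cut out of the
    circle (center M = origin, radius rho) by the legs of a right angle at O = (e, 0)
    whose first leg points in direction theta."""
    pts = []
    for ang in (theta, theta + math.pi / 2):
        dx, dy = math.cos(ang), math.sin(ang)
        od = e * dx                                   # scalar product O . d
        t = -od + math.sqrt(od * od + rho * rho - e * e)
        pts.append((e + t * dx, t * dy))              # point where the leg meets the circle
    (x1, y1), (x2, y2) = pts
    det = x1 * y2 - y1 * x2
    # P satisfies P.X = rho^2 and P.Y = rho^2 (equations of the two tangents).
    return (rho**2 * (y2 - y1) / det, rho**2 * (x1 - x2) / det)

rho, e = 1.0, 0.4                                     # sample values, chosen arbitrarily
z = rho**2 * e / (rho**2 - e**2)                      # formula (I)
r = math.sqrt(z**2 + 2 * rho**4 / (rho**2 - e**2))    # formula (II)

# Distance ZP for twenty positions of the rotating right angle; Z = (-z, 0).
distances = [
    math.hypot(px + z, py)
    for px, py in (tangent_intersection(rho, e, 0.3 * k) for k in range(20))
]
fuss_gap = 2 * rho**2 * (r**2 + z**2) - (r**2 - z**2) ** 2

print(min(distances), max(distances), r)
print(fuss_gap)
```

All twenty distances agree with r to rounding error, and `fuss_gap` vanishes, which is what (I), (II) and the final relation predict for these sample values.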
The developed formula comes from Nicolaus Fuss (1755-1826), a student and friend of Leonhard Euler. Fuss also found the corresponding formulas for the bicentric pentagon, hexagon, heptagon, and octagon (Nova Acta Petropol., XIII, 1798).

The corresponding formula for the triangle had already been given by Euler. It is

$$r^2 - z^2 = 2r\rho$$

and is easily obtained in the following manner.

Let ABC be any triangle, let Z and M be the respective centers, r and $\rho$ the radii of the circles of circumscription and inscription, respectively; thus, $ZM = z$ is the axis connecting the centers; further, let D be the point at which the extension of CM meets the circumscribed circle, so that $DM = DA = DB$. The power of the circumscribed circle at M is

$$MC \cdot MD = r^2 - z^2.$$

However, since we can replace $\sin(\gamma/2)$ by the ratio $\rho/MC$ as well as by $AD/2r$ or $MD/2r$,

$$\rho/MC = MD/2r, \quad\text{i.e.,}\quad MC \cdot MD = 2r\rho.$$

When the two values found for the product $MC \cdot MD$ are set equal to each other we obtain Euler's formula.

Note. Much more remarkable than the Fuss formula is a theorem concerning bicentric quadrilaterals that follows directly from the preceding locus consideration. For convenience in expression we will make a prefatory observation. Let a circle T lie completely inside another circle $\mathfrak{C}$. If from any point on $\mathfrak{C}$ we draw a tangent to T, extend the tangent line so that it intersects $\mathfrak{C}$, and draw from the point of intersection a new tangent to T, extend this tangent similarly to intersect $\mathfrak{C}$, and continue in this manner, we obtain a so-called Poncelet traverse which, when it consists of n chords of the larger circle, is called n-sided.

The theorem concerning bicentric quadrilaterals now reads:

If on the circle of circumscription there is one point of origin for which a four-sided Poncelet traverse is closed, then the four-sided traverse will also close for any other point of origin on the circle.
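The closure property can be tried out numerically. The following sketch is an illustration under stated assumptions, not a proof: the inner circle T is centered at the origin with radius ρ = 1, the outer circle is built from formulas (I) and (II) with the arbitrary sample value e = 0.4, points are represented as complex numbers, and `next_vertex` is a hypothetical helper that performs one step of the traverse.

```python
import cmath
import math

def next_vertex(a, rho, center, radius):
    """One step of a Poncelet traverse: from the point a on the outer circle, follow a
    tangent to the inner circle (center 0, radius rho) until it meets the outer circle
    again. The radius argument is unused in the formula and kept only for readability."""
    # Tangent point obtained by turning counterclockwise from the direction of a.
    t = rho * cmath.exp(1j * (cmath.phase(a) + math.acos(rho / abs(a))))
    d = t - a                                          # direction of the tangent line
    # Second intersection of a + s*d with the outer circle (s = 0 is a itself).
    s = -2 * ((a - center) * d.conjugate()).real / abs(d) ** 2
    return a + s * d

rho, e = 1.0, 0.4                                      # sample values, chosen arbitrarily
z = rho**2 * e / (rho**2 - e**2)                       # formula (I)
r = math.sqrt(z**2 + 2 * rho**4 / (rho**2 - e**2))     # formula (II)
center = complex(-z, 0.0)

closing_errors = []
for start_angle in (0.0, 0.7, 2.0, 4.5):
    start = center + r * cmath.exp(1j * start_angle)
    point = start
    for _ in range(4):                                 # four chords of the outer circle
        point = next_vertex(point, rho, center, r)
    closing_errors.append(abs(point - start))

print(closing_errors)
```

For every starting angle tried, the fourth chord ends at the point of origin up to rounding error, in agreement with the theorem.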
The French mathematician Poncelet (1788-1867) demonstrated that this theorem is not limited to four-sided traverses only, but is generally true for n-sided traverses, and not only for circles, but for any type of conic section. The general theorem reads:
Poncelet's closure theorem: If an n-sided Poncelet traverse constructed for two given conic sections is closed for one position of the point of origin, it is closed for any position of the point of origin.

Annex to a Survey

To determine the position of unknown but accessible points of the earth's surface by taking the bearings of known points.

(A point on the earth's surface is considered as known when its geographic coordinates [longitude and latitude] are known.)

This problem is of great importance in the incorporation of new points of the earth's surface into a survey and consequently in the preparation of accurate maps.

Land surveyors and sailors are specifically confronted with the following two cases:

I. The Snellius-Pothenot problem; the problem of three inaccessible points: Determine the position of an unknown accessible point P by its bearings from three inaccessible known points A, B, C.

This most famous of all land surveying problems was posed and solved by the Dutchman Willebrord Snellius (1581-1626) in his 1617 work, Eratosthenes Batavus, but attracted no attention among his contemporaries. It was not commonly known until it was solved once again by the Frenchman Pothenot (died 1732) in a paper submitted in 1692 to the French Academy. Since then it has been known as the Pothenot problem.

II. Hansen's problem; the problem of the inaccessible distance: From the position of two known but inaccessible points A and B, determine the position of two unknown accessible points P and P' by bearings from A, B, P' to P and from A, B, P to P'.

This problem was solved by the German astronomer Hansen (1795-1874), but was solved as well by other authors before him.

Trigonometric Solution

This type of solution is required when accuracy is important, as in land surveying.

For both problems this type of solution is based upon the sine tangent theorem: If $\sin\alpha/\sin\beta = m/n$,
then also

$$\tan\frac{\alpha - \beta}{2} \Big/ \tan\frac{\alpha + \beta}{2} = (m - n)/(m + n).$$

[From $\sin\alpha/\sin\beta = m/n$ it first follows that $(\sin\alpha - \sin\beta)/(\sin\alpha + \sin\beta) = (m - n)/(m + n)$. If the numerator and denominator of the fraction on the left of the equation are converted into products, we obtain

$$2\cos\frac{\alpha + \beta}{2}\sin\frac{\alpha - \beta}{2} \Big/ 2\sin\frac{\alpha + \beta}{2}\cos\frac{\alpha - \beta}{2} = (m - n)/(m + n)$$

or

$$\tan\frac{\alpha - \beta}{2} \Big/ \tan\frac{\alpha + \beta}{2} = (m - n)/(m + n).]$$

Solution of the Pothenot Problem

Known are the five elements $AC = a$, $BC = b$, $\angle ACB = \gamma$, $\angle APC = \alpha$, $\angle BPC = \beta$; to be found are the five elements $AP = x$, $BP = y$, $CP = z$, $\angle CAP = \psi$, $\angle CBP = \varphi$.

If the sine theorem is applied to the triangles ACP and BCP,

$$\frac{\sin\psi}{\sin\alpha} = \frac{z}{a} \quad\text{and}\quad \frac{\sin\varphi}{\sin\beta} = \frac{z}{b}.$$

On division it follows from this that

$$\sin\psi/\sin\varphi = b\sin\alpha/a\sin\beta.$$

We determine the auxiliary angle $\mu$ whose tangent is $b\sin\alpha/a\sin\beta$, and obtain

$$\sin\psi/\sin\varphi = \tan\mu.$$
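From this it follows according to the sine tangent theorem that

$$\tan\frac{\psi - \varphi}{2} \Big/ \tan\frac{\psi + \varphi}{2} = \frac{\tan\mu - 1}{\tan\mu + 1},$$

i.e.,

$$\tan\frac{\psi - \varphi}{2} = \tan\frac{\psi + \varphi}{2}\,\tan(\mu - 45^\circ).$$

Since $\psi + \varphi$ ($= 360^\circ - \alpha - \beta - \gamma$) is known, this equation gives us $\dfrac{\psi - \varphi}{2}$. From $\dfrac{\psi + \varphi}{2}$ and $\dfrac{\psi - \varphi}{2}$, addition and subtraction give us $\psi$ and $\varphi$. The unknowns x, y, z are obtained from the following formulas derived from the sine theorem:

$$\frac{x}{a} = \frac{\sin(\alpha + \psi)}{\sin\alpha}, \qquad \frac{y}{b} = \frac{\sin(\beta + \varphi)}{\sin\beta}, \qquad \frac{z}{a} = \frac{\sin\psi}{\sin\alpha}.$$

The position of the point P is determined from the magnitudes $\psi$, $\varphi$, x, y, z.

Solution of Hansen's Problem

Known are the five elements $AB = c$, $\angle APB = \gamma$, $\angle AP'B = \gamma'$, $\angle BPP' = \delta$, $\angle AP'P = \delta'$, and consequently also the angles $PAP' = \alpha$ and $PBP' = \beta$; we do not know the seven elements $AP = x$, $AP' = x'$, $BP = y$, $BP' = y'$, $\angle BAP' = \psi$, $\angle ABP = \varphi$, and $PP' = s$.

We now represent the four ratios of the adjacent sides of the quadrilateral as sine ratios in accordance with the sine theorem:

$$\frac{c}{x} = \frac{\sin\gamma}{\sin\varphi}, \qquad \frac{x}{s} = \frac{\sin\delta'}{\sin\alpha}, \qquad \frac{s}{y'} = \frac{\sin\beta}{\sin\delta}, \qquad \frac{y'}{c} = \frac{\sin\psi}{\sin\gamma'}.$$

Multiplication of these equations gives us

$$\frac{\sin\psi}{\sin\varphi}\cdot\frac{\sin\beta}{\sin\alpha}\cdot\frac{\sin\gamma}{\sin\gamma'}\cdot\frac{\sin\delta'}{\sin\delta} = 1 \quad\text{or}\quad \frac{\sin\psi}{\sin\varphi} = \frac{\sin\alpha\,\sin\gamma'\,\sin\delta}{\sin\beta\,\sin\gamma\,\sin\delta'}.$$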
We then determine an auxiliary angle $\mu$ whose tangent is equal to the right side of this equation, and we obtain

$$\frac{\sin\psi}{\sin\varphi} = \tan\mu,$$

i.e., according to the sine tangent theorem as above,

$$\tan\frac{\psi - \varphi}{2} = \tan\frac{\psi + \varphi}{2}\,\tan(\mu - 45^\circ).$$

As above, we find from this $\dfrac{\psi - \varphi}{2}$ (since $\psi + \varphi = \delta + \delta'$ is known) and then $\psi$ and $\varphi$.

Now the remaining unknowns are easily obtained by the sine theorem.

The positions of P and P' are determined by the values found for the six unknowns.

The Drawing Solution

This is adequate when great accuracy is not requisite, for example, in sailing along a coast where A, B, C are known landmarks, P and P' unknown positions of a ship with a bearing on these landmarks.

The solution of Pothenot's problem is extremely simple. The ship's position P is the point of intersection of the two circles to be drawn on the ship's chart with the chords AC and BC and the corresponding peripheral angles $\alpha$ and $\beta$.

Hansen's problem is solved in the following way. We draw a quadrilateral abp'p having the same form as ABP'P (beginning with an arbitrary distance pp') and lay this off on the chart so that b falls on B
and a on AB. The ship's position P is the point of intersection of Bp with the parallel to ap passing through A, the ship's position P' is the point of intersection of Bp' with the parallel to pp' passing through P.

Alhazen's Billiard Problem

To describe in a given circle an isosceles triangle whose legs pass through two given points inside the circle.

This problem stems from the Arabic mathematician Abu Ali al Hassan ibn al Hassan ibn Alhaitham (ca. 965-ca. 1039), whose name was transformed into Alhazen by the translators of his Optics. In his Optics the above problem has the following form: "Find the point on a spherical concave mirror at which a ray of light coming from a given point must strike in order to be reflected to another given point."

This problem can be posed in various other forms, e.g.: "On a circular billiard table there are two balls; in what manner must one be struck in order for it to strike the other after rebounding from the cushion?" or "On the circumference of a circle find a point the sum of whose distances from two given points within the circle is a minimum (or maximum)."

A whole series of famous mathematicians took up this problem after Alhazen, among them Huygens, Barrow, de L'Hôpital, Riccati, and Quetelet.

Solution. Let us call the given circle $\mathfrak{K}$, its center M, its radius r, the given points P and p, and let us make M the origin of a mutually perpendicular coordinate system xy in which P and p have the coordinates $A \mid B$ and $a \mid b$.

If OS and Os, which pass through P and p, are the legs of the isosceles triangle OSs that we are looking for, the angles $\Phi$ and $\varphi$, which these legs form with the radius OM, must be equal.

If we designate the angles that the lines PO, MO, pO form with the x-axis as $\Lambda$, $\mu$, $\lambda$, then, on the one hand, $\Phi = \Lambda - \mu$ and $\varphi = \mu - \lambda$ or

$$\tan\Phi = \frac{\tan\Lambda - \tan\mu}{1 + \tan\mu\,\tan\Lambda} \quad\text{and}\quad \tan\varphi = \frac{\tan\mu - \tan\lambda}{1 + \tan\mu\,\tan\lambda},$$

while, on the other hand, if $x \mid y$ are the coordinates of O,

$$\tan\Lambda = \frac{y - B}{x - A}, \qquad \tan\mu = \frac{y}{x}, \qquad \tan\lambda = \frac{y - b}{x - a},$$
and consequently, since $\tan\Phi = \tan\varphi$,

$$\frac{\dfrac{y - B}{x - A} - \dfrac{y}{x}}{1 + \dfrac{y}{x}\cdot\dfrac{y - B}{x - A}} = \frac{\dfrac{y}{x} - \dfrac{y - b}{x - a}}{1 + \dfrac{y}{x}\cdot\dfrac{y - b}{x - a}}$$

or

$$\frac{Ay - Bx}{x^2 + y^2 - Ax - By} = \frac{bx - ay}{x^2 + y^2 - ax - by}$$

or finally, if we set

$$Ab + Ba = H, \quad Aa - Bb = K, \quad A + a = h, \quad B + b = k,$$

then

$$H(x^2 - y^2) - 2Kxy + (x^2 + y^2)[hy - kx] = 0.$$

Since the point $O(x \mid y)$ has to lie upon the circle $\mathfrak{K}$, the circle equation

(1) $x^2 + y^2 = r^2$

consequently applies here, and our condition assumes the form

(2) $H(x^2 - y^2) - 2Kxy + r^2[hy - kx] = 0$.

Since equation (2) represents a hyperbola, our conclusion reads as follows:

The point O that we are looking for is the point of intersection of the circle (1) with hyperbola (2).

Since there are in general four points of intersection for a circle and a hyperbola, there are in general four solutions to our problem.

Possessing particular interest is the special case in which the distances C and c of the given points P and p from the center M are equally great. In this case we naturally take the perpendicular bisector of Pp as the x-axis, and then we have

$$A = a, \quad B = -b, \quad H = 0, \quad K = c^2, \quad h = 2a, \quad k = 0$$

and, according to (2),

$$-2c^2xy + 2ar^2y = 0.$$

This equation is satisfied by each of the conditions

(3) $y = 0$ and (4) $x = \dfrac{ar^2}{c^2}$.

From (3) follows the corresponding $x = \pm r$. Consequently, the points of intersection of $\mathfrak{K}$ with the x-axis satisfy the condition for the point O we are looking for.
From (4) it follows that

$$\frac{r^2}{x} = \frac{c^2}{a}.$$

If we then draw through M a circle $\mathfrak{k}$ whose diameter $MN = d = c^2/a$ lies on the x-axis, and if $Q(X \mid Y)$ is a point of intersection of this circle with $\mathfrak{K}$, it follows, since MNQ is a right triangle, that $MQ^2 = MN \cdot X$ or $r^2 = dX$. However, since $r^2/x = d$, we obtain $X = x$. Consequently, the points of intersection of the circles $\mathfrak{K}$ and $\mathfrak{k}$ also satisfy the condition for the point O we are looking for.

For these points of intersection to exist, d must be $> r$ or $c^2 > ar$. We will assume that this condition is satisfied.

Now the quadrilateral MPpQ in circle $\mathfrak{k}$ is a chord quadrilateral, and therefore, according to Ptolemy's theorem, the sum of the products of the opposite sides must be equal to the product of the diagonals:

$$PQ \cdot Mp + pQ \cdot MP = MQ \cdot Pp$$

or

(5) $(PQ + pQ)c = 2br$.

For any other point Q' of $\mathfrak{K}$, MPpQ' is not a chord quadrilateral, and therefore the sum of the products of the opposite sides must be greater than the product of the diagonals:

(6) $(PQ' + pQ')c > 2br$.

From (5) and (6) we obtain

$$PQ + pQ < PQ' + pQ'.$$
The problem: "On a given circle find a point the sum of whose distances from two given points located in the circle at an equal distance from the midpoint of the circle is a minimum" has the following striking solution:

The point we are looking for is the point of intersection of the given circle with the circle that passes through the given points and the center of the given circle.

Note. In connection with the above problem Alhazen also solved the problem: "How to strike a ball lying on a circular billiard table in such a way that after twice striking the cushion the ball will return to its original position."

Solution. Let the billiard table possess the radius r and the center M. Let the initial position of the ball be P, so that $MP = c$ is known. Let the ball first strike the circle at U, cross the extension of PM at a right angle at F, then strike the circle at V and return from here to P. UM and VM are then angle bisectors of the triangle PUV.

Fig. 36.

We set $MF = x$, $FU = y$, $UP = z$. Applying the angle bisector theorem to the triangle FUP,

$$y/z = x/c,$$

and according to the Pythagorean theorem

$$r^2 = x^2 + y^2 \quad\text{and}\quad z^2 = y^2 + (x + c)^2.$$

If we eliminate y and z from these three equations, we obtain the quadratic equation

$$2cx^2 + r^2x = cr^2$$

for the unknown x. From this, x is easily constructed.
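The two-cushion solution can be checked by solving the quadratic and testing the law of reflection at U. This is a minimal sketch under assumed coordinates (M at the origin, P on the positive x-axis, F on the negative x-axis); the function names and the sample values r = 1, c = 0.5 are illustrative only.

```python
import math

def two_cushion_point(r, c):
    """Positive root x = MF of 2*c*x**2 + r**2*x = c*r**2, together with y = FU."""
    x = (-r**2 + math.sqrt(r**4 + 8 * c**2 * r**2)) / (4 * c)
    y = math.sqrt(r**2 - x**2)
    return x, y

def angle(u, v):
    """Unsigned angle between two plane vectors."""
    cross = u[0] * v[1] - u[1] * v[0]
    dot = u[0] * v[0] + u[1] * v[1]
    return abs(math.atan2(cross, dot))

r, c = 1.0, 0.5                       # sample radius and distance MP
x, y = two_cushion_point(r, c)

# Assumed layout: M = (0, 0), P = (c, 0), F = (-x, 0), U above and V below F.
U, V, P = (-x, y), (-x, -y), (c, 0.0)
to_M = (-U[0], -U[1])                 # direction of the normal at U (toward the center)
incoming = angle((P[0] - U[0], P[1] - U[1]), to_M)
outgoing = angle((V[0] - U[0], V[1] - U[1]), to_M)
z = math.hypot(x + c, y)              # length UP

print(x, y)
print(incoming, outgoing)
print(y / z, x / c)
```

The equal angles at U confirm that the path P, U, V, P obeys the law of reflection, and the last line reproduces the angle bisector proportion y/z = x/c.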
Problems Concerning Conic Sections and Cycloids
An Ellipse from Conjugate Radii

To draw an ellipse for which the magnitude and position of two conjugate radii are given.

Solution. Let the ellipse have the center equation

(1) $\left(\dfrac{x}{a}\right)^2 + \left(\dfrac{y}{b}\right)^2 = 1$.

Let the prescribed conjugate radii be OP and OQ such that the coordinates $x \mid y$ and $x' \mid y'$ of their end points satisfy the conditions

(2) $\dfrac{x'}{a} = -\dfrac{y}{b}$, $\dfrac{y'}{b} = \dfrac{x}{a}$.

(The conditions (2) give us directly for the product of the slopes $y/x$ and $y'/x'$ of the two radii the known value $-b^2/a^2$ for the product of the slopes of the conjugate radii.)

Let the base point of the ordinate from Q be V. We rotate the right triangle OQV clockwise about O by 90° to the position Oqv and extend the straight line Pq to intersect with the axes of the ellipse at H and K. According to (2), the distances of the points q and P from the x-axis and the distances of the points P and q from the y-axis are in the ratio of $a/b$. Consequently (according to the ray theorem),

$$\frac{Hq}{HP} = \frac{a}{b} \quad\text{and}\quad \frac{KP}{Kq} = \frac{a}{b}.$$
It then follows from this that

$$\frac{HP + Pq}{HP} = \frac{Kq + qP}{Kq},$$

i.e., $HP = Kq$, so that the center M of Pq is also the center of HK.

If we substitute HP for Kq, one of our proportions becomes

(3) $KP/HP = a/b$.

In order to obtain a second equation for the unknowns KP and HP, we obtain the cosine and sine of the angle $\nu$ from HK to the x-axis:

$$\cos\nu = x/KP, \qquad \sin\nu = y/HP;$$

squaring and adding, we obtain

(4) $\dfrac{x^2}{KP^2} + \dfrac{y^2}{HP^2} = 1$.

From (1), (3), and (4) it immediately follows that

$$KP = a, \qquad HP = b.$$

This gives us the following simple

Construction. 1. We rotate OQ about O 90° through the interior of the obtuse angle POQ to the position Oq. 2. We determine the center M of Pq and the points of intersection H and K of the line Pq with the circle of center M and radius MO. KP and HP are then equal to half the length of the axes of the ellipse, while OH and OK represent the positions of the axes of the ellipse. The rest is simple.

An Ellipse in a Parallelogram

To inscribe in a prescribed parallelogram an ellipse that is tangent to the parallelogram at a boundary point.

The solution of this problem is based upon the theorem: Every ellipse can be considered as a normal projection of a circle.

Let ABCD be the given parallelogram, N the given boundary point lying on AB. Let the other points at which the ellipse touches the boundary of the parallelogram be K on BC, M on CD, and H on DA.

In the normal projection, in which the ellipse has the image of a circle, the parallelogram ABCD and the tangency points N, K, M, H
appear as projections of a parallelogram circumscribing a circle, and specifically of a rhombus abcd with the tangency points n, k, m, h. Since $nk \parallel hm \parallel ac$ and $nh \parallel km \parallel bd$ and since parallelism is preserved in a normal projection,

$$NK \parallel HM \parallel AC \quad\text{and}\quad NH \parallel KM \parallel BD.$$

Thus, we find the tangency points H and K, respectively, by causing the parallels through N to BD and AC to intersect with DA and BC, respectively. The fourth tangency point M is the point of intersection of CD with the parallel through H to AC.

Let the centers of the circle and ellipse be o and O, respectively.

We will now assume an arbitrary point z on the arc nh of the circle, connect this point with m and n, and designate the points of intersection of these connecting lines with hk and da as x and y. The two triangles omx and any are then similar, since the angles at o and a, as well as the angles at m and n, are equal because they are enclosed between pairs of orthogonal legs. From this similarity we obtain the proportion

$$ox/om = ay/an.$$

If we substitute oh for om and ah for an in this proportion, we obtain

$$ox/oh = ay/ah.$$

Let the normal projections of the points x, y, z be X, Y, Z. Since the ratio of parallel segments is not altered in normal projection, we have

$$OX/OH = AY/AH.$$

The points X and Y accordingly divide the radius of the ellipse OH and the ellipse tangent AH in the same proportions.

Quite similar proportions are naturally found to obtain for the other ellipse arcs MH, MK, NK.

We assign the tangents AH, BK, DH, CK to the arcs NH, NK, MH, MK, respectively. In summary we can then say:

If we connect a point of one of the four arcs with M and N, the points of intersection of these connecting lines with the radius (OH or OK) and the corresponding tangents divide the radius and tangents in the same proportions.

This gives rise to the following elegant construction.
We divide the radii OH and OK and the tangents AH, BK, DH, CK each into $\nu$ equal segments (eight segments are shown in Figure 38) and number the segments from 1 to $\nu$, beginning from the center of the ellipse with the radii and at the corners of the parallelogram with the tangents. We then connect M (N) with an arbitrary segment point of a radius and N (M) with the segment point with the same number of the tangent corresponding to the arc bounded by N (M) and the end point of the radius. The point of intersection of the two connecting lines is in each case a point on the ellipse.

Fig. 38.

A Parabola from Four Tangents

To draw a parabola four tangents to which are given.

The simplest solution of this beautiful problem is based upon Lambert's theorem: The circumcircle of a parabola tangent triangle passes through the focus.

(J. H. Lambert (1728-1777) was a German mathematician.)

In order to prove Lambert's theorem we need the

Theorem of similar triangles: Two tangents SA and SB to a parabola, together with the lines from the focus to the contact points A and B and the point of intersection S of the tangents, form two similar triangles FSA and FSB such that the angle of the one triangle, situated at the point of tangency, is always equal to the angle of the other triangle that is situated at the point of intersection.

Proof. In accordance with the classical construction of the parabola, the mirror images H and K of the focus F on the tangents SA and SB, respectively, fall on the base points of the altitudes dropped from A and B, respectively, on the directrix L.
Fig. 39.

Since the angles FAS and HAS are symmetrical, and the angles HAS and FHK, as angles between pairs of orthogonal legs, are equal, it follows that

$$\angle FAS = \angle FHK$$

and likewise that

$$\angle FBS = \angle FKH.$$

The angles FHK and FKH, as the inscribed angles opposite the chords FK and FH, respectively, on the circumcircle of the triangle FHK (whose center is the intersection S of the perpendicular bisectors SA and SB of the triangle) are half as great as the corresponding central angles and consequently equal to the angles FSB and FSA, respectively. Consequently,

$$\angle FAS = \angle FSB \quad\text{and}\quad \angle FBS = \angle FSA.$$

Q.E.D.

Lambert's theorem follows directly from the theorem we have just proved. In fact: If P and Q are the points of intersection of a third tangent, which touches the parabola at O, with the tangents SA and SB, then, according to the theorem of similar triangles,

$$\angle FAS = \angle FSB \quad\text{and}\quad \angle FAP = \angle FPO$$

and consequently

$$\angle FSQ = \angle FPQ.$$

According to this equation, however, the quadrilateral FPSQ is a chord quadrilateral.

Lambert's theorem gives us directly the requisite construction:

From the four tangent triangles that can be formed from the four given
tangents, we choose two and draw the circumcircle for each. The point of intersection of the two circumcircles is the focus. We then find the mirror image of the focus on two tangents and in this way obtain two points of the directrix, which gives us the directrix. The rest is extremely simple.

Note. The theorem of the circumcircle of the tangent triangle leads directly to the solution of the interesting problem:

Determine the locus of the foci of all parabolas that are tangent to three straight lines.

The sought-for locus is the circumcircle of the triangle formed from the lines.

A Parabola from Four Points

To draw a parabola that passes through four given points.

This lovely problem was first solved by Newton in his celebrated Philosophiae naturalis principia mathematica, 1687, and then once again in 1707 in his Arithmetica universalis.

It is commonly based upon the auxiliary problem:

To draw a parabola for which three points and direction of the axis are known.

The following solution of the auxiliary problem is based on the two theorems:

I. The centers of parallel chords of a parabola lie on a parallel to the axis.

II. The perpendicular bisector of a parabola chord and the perpendicular to the axis through the center of the chord mark off the half parameter on the axis.
Proof. The vertex equation of a parabola is commonly expressed in the form $y^2 = 2px$. If $x \mid y$ and $X \mid Y$ are the end points of a parabola chord, the slope of the chord with respect to the x-axis is $S = (Y - y)/(X - x)$. From $y^2 = 2px$ and $Y^2 = 2pX$ it follows, however, by subtraction that $Y^2 - y^2 = 2p(X - x)$, i.e.,

$$S = \frac{Y - y}{X - x} = \frac{2p}{Y + y}.$$

If we call the ordinate of the midpoint of the chord $\eta$, the last equation can be written (because $2\eta = Y + y$) in the form

$$S = \frac{p}{\eta}.$$

According to this equation, the midpoints of all chords with the same slope $S$ have the same ordinate, with the result that these midpoints lie on a line parallel to the axis of the parabola, and thus I. is proved.

To prove II., we take note of the fact that the segment marked off on the axis by the perpendicular bisector of our chord and the perpendicular to the axis through the chord midpoint is equal to $\eta s$, where $s$ is the slope of the perpendicular bisector of the chord with respect to the perpendicular to the axis. However, since $s = S$, the length of the segment is $\eta S = p$, which was to be proved.

From II. it also follows that: If the midpoints of two parabola chords lie on a perpendicular to the axis, the perpendicular bisectors of the chords intersect on the axis.

Let A, B, C be the given parabola points, $\mathfrak{R}$ the direction of the axis. Let us draw through the midpoint M of AB a parallel to the axis, through the midpoint N of CA the perpendicular to the axis, and call their point of intersection $M_0$. Then according to I., $M_0$ is the midpoint of the parabola chord $A_0B_0$ that passes through $M_0$ and is parallel to AB. We draw the perpendicular bisectors of CA and $A_0B_0$ (the latter as a perpendicular dropped from $M_0$ to AB). According to II., their point of intersection is a point on the axis; its distance from the foot of the perpendicular dropped from $M_0$ or N to the axis is the half parameter p. The rest is simple.
For example, making use of the subnormal (p) from A, we draw the normal AU and the tangent AV (both being drawn to the axis). The midpoint of UV is then the focus and the mirror image of the focus on the tangent is a point on the directrix.
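Theorems I and II can be checked numerically. The following is a minimal sketch, not part of the original solution: the function name `chord_check` and the sample values are illustrative assumptions, and the parabola is taken in the form y^2 = 2px used in the proof.

```python
def chord_check(p, y1, y2):
    """Chord of the parabola y^2 = 2*p*x joining the points with ordinates y1, y2.

    Returns (S, eta, segment): the chord slope S, the ordinate eta of the chord
    midpoint, and the segment cut off on the axis between the perpendicular
    bisector of the chord and the perpendicular dropped from the chord midpoint.
    Assumes y1 != y2 and y1 + y2 != 0 (chord neither degenerate nor vertical).
    """
    x1, x2 = y1 * y1 / (2 * p), y2 * y2 / (2 * p)
    S = (y2 - y1) / (x2 - x1)                  # slope of the chord
    xi, eta = (x1 + x2) / 2, (y1 + y2) / 2     # midpoint of the chord
    # The perpendicular bisector y - eta = -(x - xi) / S meets the axis y = 0 here:
    x_on_axis = xi + eta * S
    return S, eta, x_on_axis - xi


if __name__ == "__main__":
    p = 1.5
    # Three chords chosen (hypothetically) so that y1 + y2 = 3, i.e. equal slopes.
    for y1, y2 in [(-1.0, 4.0), (0.5, 2.5), (-3.0, 6.0)]:
        S, eta, segment = chord_check(p, y1, y2)
        print(S, eta, segment)
```

The three sample chords are parallel, so by I. they should share the midpoint ordinate, and by II. the returned segment should equal p in each case.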
[Fig. 41]

The solution of Newton's parabola problem is based upon the following auxiliary theorem:

In all parabola quadrilaterals the products of the diagonal segments are proportional to the squares of the segments on the diagonals that are bounded by their point of intersection and the axis of the parabola.

[Fig. 42]

Proof. Let AB be an arbitrary parabola chord, let M be its midpoint, and U the point in which the parallel to the parabola axis through M meets the parabola. If we select UM as the x-axis and the parabola tangent through U as the y-axis, we obtain the usual parabola equation in the form $y^2 = 4kx$,
where $k$ is the focal radius of the coordinate origin U. The coefficient $4k$ possesses the value $2p/\sin^2\kappa$, where $2p$ is the parameter and $\kappa$ the angle enclosed between the coordinate axes, i.e., the angle formed by the chord AB with the axis of the parabola.

We select an arbitrary point O on AB and designate the point of intersection of the parallel to the x-axis through O with the parabola as Q, the coordinates of Q as $x$ and $y$, and the coordinates of A as $X$ and $Y$, so that

$$QO = q = X - x, \quad OA = Y - y, \quad OB = Y + y.$$

From $Y^2 = 4kX$ and $y^2 = 4kx$ it follows by subtraction that $Y^2 - y^2 = 4k(X - x)$ or $(Y + y)(Y - y) = 4k(X - x)$, so that

(1) $OA \cdot OB = 4kq.$

If A'B' is a second parabola chord through O, then accordingly

(2) $OA' \cdot OB' = 4k'q,$

with $4k' = 2p/\sin^2\kappa'$, where $\kappa'$ is the angle of the chord A'B' with the parabola axis. Division of (1) by (2) gives

$$\frac{OA \cdot OB}{OA' \cdot OB'} = \frac{k}{k'} = \frac{\sin^2\kappa'}{\sin^2\kappa}.$$

If H and H' are the points of intersection of the chords AB and A'B' with the parabola axis, it follows from the sine theorem that

$$\frac{OH}{OH'} = \frac{\sin\kappa'}{\sin\kappa}.$$

From the last two equations we finally obtain

$$\frac{OA \cdot OB}{OA' \cdot OB'} = \frac{OH^2}{OH'^2}. \quad \text{Q.E.D.}$$

With this theorem we can now obtain the following solution to Newton's problem:

Let A, B, C, D be the given points. We draw the diagonals AC and BD of the quadrilateral ABCD and call their point of intersection O. On the diagonals we mark off from O the mean proportionals $OP = \sqrt{OA \cdot OC}$ and $OQ = \sqrt{OB \cdot OD}$. The connecting line QP, according to the theorem we have just proved (which gives $OP : OQ = OH : OH'$), is then parallel to the parabola axis, and the problem now reduces to the auxiliary problem treated above.
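The mean-proportional construction can be tried out on four points taken from a known parabola. The sketch below is an illustration under stated assumptions, not Newton's or the author's procedure in code: the names `meet` and `candidate_axis_directions` are hypothetical, and because the text does not say on which side of O the segments are to be marked off, both choices for Q are returned.

```python
import math


def meet(a, c, b, d):
    """Intersection of the line AC with the line BD (points are (x, y) tuples)."""
    ux, uy = c[0] - a[0], c[1] - a[1]
    vx, vy = d[0] - b[0], d[1] - b[1]
    det = vx * uy - ux * vy                      # zero only for parallel lines
    s = (vx * (b[1] - a[1]) - vy * (b[0] - a[0])) / det
    return (a[0] + s * ux, a[1] + s * uy)


def candidate_axis_directions(A, B, C, D):
    """Directions of PQ for both ways of marking off OQ on the diagonal BD."""
    O = meet(A, C, B, D)
    OP = math.sqrt(math.dist(O, A) * math.dist(O, C))   # mean proportional on AC
    OQ = math.sqrt(math.dist(O, B) * math.dist(O, D))   # mean proportional on BD
    len_ac, len_bd = math.dist(A, C), math.dist(B, D)
    P = (O[0] + OP * (C[0] - A[0]) / len_ac, O[1] + OP * (C[1] - A[1]) / len_ac)
    directions = []
    for sign in (1.0, -1.0):
        Q = (O[0] + sign * OQ * (D[0] - B[0]) / len_bd,
             O[1] + sign * OQ * (D[1] - B[1]) / len_bd)
        directions.append((Q[0] - P[0], Q[1] - P[1]))
    return directions


def on_parabola(t):
    """Point of the test parabola y^2 = 2x with ordinate t (axis = x-axis)."""
    return (t * t / 2, t)


if __name__ == "__main__":
    A, B, C, D = (on_parabola(t) for t in (-2.0, -0.5, 1.0, 3.0))
    for dx, dy in candidate_axis_directions(A, B, C, D):
        print(dx, dy)
```

For these sample points the first returned direction should have a vanishing y-component, i.e. it is parallel to the axis of the test parabola; the second direction belongs to the other way of marking off OQ.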
The following projective solution of Newton's problem also consists of the reduction of the problem to the preceding auxiliary problem.

This transformation of the problem is accomplished by means of Desargues' involution theorem (No. 63). According to this theorem, every tangent to a parabola cuts the opposite sides of an inscribed quadrilateral in point pairs of an involution in which the point of tangency of the tangent is a double point.

As tangent T let us choose a very distant one. Let it be tangent to the parabola at O and let it be cut at P, Q, P', and Q' by the lines AB, BC, CD, DA connecting the four given parabola points. O is then the double point of the involution determined by the pairs (P, P') and (Q, Q'). Similarly, the rays drawn from an arbitrary point Z of the picture plane to P, Q, P', Q', O form an involution with the ray pairs (ZP, ZP') and (ZQ, ZQ') and the double ray ZO. Because of the very great distances of the points P, Q, P', Q', O, the rays ZP, ZQ, ZP', ZQ' on the drawing paper run parallel to the quadrilateral sides AB, BC, CD, DA, and the ray ZO here runs parallel to the axis of the parabola. (The slope $(y - b)/(x - a) = (\sqrt{2px} - b)/(x - a)$ of the line connecting the points $Z(a \mid b)$ and $O(x \mid y)$, because of the great value of $x$, is essentially equal to zero, so that the ray ZO appears parallel to the axis on the drawing paper.)

Accordingly we obtain the following construction. We draw through an arbitrary point Z of the paper the parallels p, q, p', q' to the lines AB, BC, CD, and DA and construct a double ray of the involution determined by the ray pairs (p, p') and (q, q'); this ray has the direction of the parabola axis. Thus, the problem is reduced to the auxiliary problem solved above.

Since in a ray involution there are in general two double rays, there are in general two parabolas that can be drawn through four given points.
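The double rays can also be computed instead of constructed. The sketch below swaps the geometric construction for an algebraic one and rests on an assumption not spelled out in the text: that the rays are described by their slopes m, in which case an involution takes the symmetric form a·m·m' + b·(m + m') + c = 0. The function names are hypothetical, and vertical rays (infinite slope) are not handled.

```python
import math


def slope(p, q):
    """Slope of the line through two points (assumed not vertical)."""
    return (q[1] - p[1]) / (q[0] - p[0])


def double_ray_slopes(m1, m1p, m2, m2p):
    """Slopes of the double rays of the involution pairing (m1, m1p) and (m2, m2p).

    The involution is written as a*m*m' + b*(m + m') + c = 0; the coefficient
    vector (a, b, c) is orthogonal to both rows below, so it is their cross product.
    Returns an empty list if the double rays are not real or one of them is vertical.
    """
    r1 = (m1 * m1p, m1 + m1p, 1.0)
    r2 = (m2 * m2p, m2 + m2p, 1.0)
    a = r1[1] * r2[2] - r1[2] * r2[1]
    b = r1[2] * r2[0] - r1[0] * r2[2]
    c = r1[0] * r2[1] - r1[1] * r2[0]
    disc = b * b - a * c            # double rays satisfy a*m^2 + 2*b*m + c = 0
    if a == 0 or disc < 0:
        return []
    root = math.sqrt(disc)
    return [(-b + root) / a, (-b - root) / a]


if __name__ == "__main__":
    # Four sample points on the parabola y^2 = 2x, whose axis has slope 0.
    A, B, C, D = (2.0, -2.0), (0.125, -0.5), (0.5, 1.0), (4.5, 3.0)
    print(double_ray_slopes(slope(A, B), slope(C, D), slope(B, C), slope(D, A)))
```

For the sample points one of the two slopes should be (numerically) zero, the direction of the axis of the parabola they were taken from; the other slope is the axis direction of the second parabola through the same four points.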
46 A Hyperbola from Four Points

To draw a rectangular (equilateral) hyperbola for which four points are given.

The construction is based upon the auxiliary theorem: The Feuerbach circle of a triangle inscribed in an equilateral hyperbola passes through the center of the hyperbola.
Proof. Let ABC be a triangle inscribed in an equilateral hyperbola with the center at Z and the asymptotes I and II; let A', B', C' be the midpoints of the sides BC, CA, AB, and let $A_1$ and $A_2$ be the points of intersection of BC with I and II, and $B_1$ and $B_2$ the points of intersection of CA with I and II. Since the asymptotes mark off equal segments on the extensions of a hyperbola chord, $BA_2 = CA_1$ and $CB_2 = AB_1$, so that A' is the midpoint of $A_1A_2$ and B' the midpoint of $B_1B_2$. These midpoints are also the centers of the circumcircles of the right triangles $A_1ZA_2$ and $B_1ZB_2$, so that

$$\angle A'ZA_1 = \angle A'A_1Z \quad \text{and} \quad \angle B'ZB_1 = \angle B'B_1Z.$$

Since the difference of the left sides of these equations represents the angle A'ZB' and the difference of the right sides the angle $A_1CB_1$ (according to the theorem of external angles), these two angles are equal, so that the angles A'ZB' and A'CB' are equal or supplementary. However, since the angles of the parallelogram CA'C'B' at C and C' are equal, the angles A'ZB' and A'C'B' are also equal or supplementary. The quadrilateral ZA'C'B' is therefore a cyclic quadrilateral. In other words: the circumcircle of the triangle A'B'C', i.e., the Feuerbach circle of the triangle ABC (see No. 28), passes through the center of the hyperbola. Q.E.D.

Construction. Let the four given points be A, B, C, D. We draw the Feuerbach circles of the triangles ABC and ABD; their point of intersection Z (other than the midpoint of AB, which lies on both circles) is the center of the hyperbola. We connect Z to the midpoint A' of BC and draw the circle with center A' and radius A'Z; its points of intersection $A_1$ and $A_2$ with the line BC are two points of the asymptotes I and II, which gives us the asymptotes. The rest is easy. (To
draw the hyperbola from points, for example, we pass an arbitrary line through one of the given points, for example A, and mark off on this line, from its intersection with II toward A, the segment lying between A and I; the point at the end of the marked-off segment is a new point of the hyperbola. Repetition of the construction with new lines through A gives us as many points of the hyperbola as desired.)

Note. The proved auxiliary theorem immediately gives, as well, the solution to the interesting

Locus problem: Find the locus of the centers of all equilateral hyperbolas that can be circumscribed about a given triangle.

The locus is the Feuerbach circle of the given triangle.

47 Van Schooten's Locus Problem

Two vertexes of a rigid triangle in a plane slide along the arms of an angle of the plane; what locus does the third vertex describe?

Franciscus van Schooten (the younger) (1615-1660), a Dutch mathematician, treated this beautiful problem in his Exercitationes mathematicae, which appeared in 1657.

Solution. We will first consider a special case of van Schooten's problem, the solution to which had already been taught by the Byzantine Proclus (410-485).

On a rigid line three points are marked; two of these slide along the arms of a right angle; what locus does the third describe?

We select the arms I and II of the right angle as the x- and y-axes of a coordinate system. Let the three marked points of the rigid line be A, B, C, and their mutual distances $BC = a$, $CA = b$, and $AB = c$. Then $c = a \pm b$, accordingly as C does or does not lie between A and B. Let the point A slide on I and B on II. Let the marked point C possess the coordinates $x$ and $y$. Let the angle of the line with respect to the x-axis be $v$; thus $x$, as the projection of $a$ on I, is equal to $a \cos v$; $y$, as the projection of $b$ on II, is equal to $b \sin v$; and consequently, $x^2 = a^2 \cos^2 v$, $y^2 = b^2 \sin^2 v$, and

$$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1.$$

The locus of the marked point C is thus an ellipse with the half axes a and b.
This locus property is the basis of the so-called paper strip construction of the ellipse and of the trammel.
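The special case is easy to reproduce numerically. The following minimal sketch (the function name `strip_point` is an assumption made for illustration) places A on the x-axis and B on the y-axis, takes C between A and B with BC = a and CA = b, and checks that C satisfies the equation of the ellipse with half axes a and b.

```python
import math


def strip_point(a, b, v):
    """Position of the marked point C for a rod inclined at angle v to the x-axis.

    A slides on the x-axis, B on the y-axis, C lies between them with
    BC = a and CA = b, so that AB = a + b.
    """
    c = a + b
    A = (c * math.cos(v), 0.0)
    B = (0.0, c * math.sin(v))
    t = a / c                                   # fraction of BA covered from B to C
    return (B[0] + t * (A[0] - B[0]), B[1] + t * (A[1] - B[1]))


if __name__ == "__main__":
    a, b = 5.0, 2.0
    for k in range(8):
        x, y = strip_point(a, b, 2 * math.pi * k / 8)
        print(round(x, 4), round(y, 4), (x / a) ** 2 + (y / b) ** 2)
```

The last printed column should be 1 (up to rounding) for every inclination of the rod.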
Paper Strip Construction of the Ellipse

On the sharp edge of a paper strip we mark off the three points in the sequence B, A, C in such manner that $BC = a$ and $AC = b$ ($< a$) are equal to the given half axes of an ellipse. We move the strip in such manner that A always remains on the x-axis and B on the y-axis and we constantly mark the place at which C is situated. The locus described by the point C is an ellipse with the prescribed half axes a and b.

The Trammel

A trammel consists of a cross with two grooves at right angles to each other in which two sliding pins A and B move. The pins are fixed to a beam to which at some point a movable pencil M can be attached. When the pins slide in the grooves the pencil describes an ellipse with the half axes AM and BM.

Now for the general van Schooten problem! Let S be the apex of the fixed angle $\alpha$ along the arms of which the vertexes A and B of the rigid triangle ABC slide. We draw the circle $\mathfrak{K}$ with AB as chord and $\alpha$ as inscribed angle, join its center M with C and determine the points of intersection P and Q of this connecting line with $\mathfrak{K}$. Let us consider this circle along with the points P and Q as being firmly connected to the rigid triangle, so that it also participates in the motion of the triangle. Consequently, since $\alpha$ is the inscribed angle opposite AB, the circle passes continuously through S. The arcs AP and AQ continuously change their position but not their
magnitude! This entails the invariance of the inscribed angles ASP and ASQ, which implies the invariance of the directions I and II that are determined by SP and SQ. Since PQ is a diameter of $\mathfrak{K}$, I and II are perpendicular to each other. We can therefore consider the motion of the vertex C as the motion of the marked point C of a rigid line PQC, the other marked points of which, P and Q, slide along the arms I and II of a right angle. According to the above special case, C describes an ellipse.

Result: van Schooten's theorem: The locus of one corner of a three-cornered plate the other two corners of which slide along the arms of a fixed angle is an ellipse.

The above derivation also gives the magnitudes and position of the ellipse. The axes of the ellipse have the positions I and II and the magnitudes $2 \cdot CP$ and $2 \cdot CQ$.

48 Cardan's Spur Wheel Problem

What is the locus described by a marked point on a circular disc that rolls along the inner edge of a disc of double its radius?

Jerome Cardan, an Italian mathematician (1501-1576), is known for the Cardan formula for the solution of cubic equations.

Solution. Let the boundary of the large disc be $\mathfrak{K}$ and that of the smaller disc $\mathfrak{k}$, and let their radii be equal to $R = 2r$ and $r$, respectively.

First we will observe the motion of the diameter AB of the small disc that carries the marked point M. At the beginning of the motion let A lie at the center O and B at the boundary point H of $\mathfrak{K}$. When the circle $\mathfrak{k}$ has rolled forward within $\mathfrak{K}$ by the arc HT, let it cut the radius OH at X, and let Y be the point at which it cuts the radius OK of $\mathfrak{K}$ that is perpendicular to OH. Since the angle XOY is 90°, XY is a diameter of $\mathfrak{k}$, and the intersection S of XY with OT is the center of $\mathfrak{k}$. If $w$ is the inscribed angle XOT of $\mathfrak{k}$ in radian measure, then the corresponding central angle XST is $2w$ and the arc XT is $2rw$. However, since $w$ also represents the central angle HOT of $\mathfrak{K}$, the arc $HT = Rw = 2rw$.
The arc XT of the smaller circle is exactly as long as the arc HT of the larger circle upon which the small circle has rolled forward. X must therefore be the end B of the marked diameter AB, and consequently Y is the other end A of this diameter. The rolling of a disc along the inner margin of a disc of double its width consequently means that the end points of a marked diameter of the smaller circle slide along two
fixed orthogonal diameters of the larger circle. The locus of our marked point M is therefore also the locus of the mark M of the diameter AB whose end points A and B slide along the arms OK and OH of the right angle HOK. In view of the paper strip construction of the ellipse (No. 47), the locus we are seeking is thus an ellipse. The half axes of this ellipse are MA and MB.

[Fig. 45]

Note. Since a marked point on the boundary of the smaller disc describes a diameter of the larger disc, a gear consisting of two spur wheels whose diameters are in the ratio 2:1 effects the conversion of a circular motion into a reciprocating rectilinear motion.

49 Newton's Ellipse Problem

To determine the locus of the centers of all ellipses that can be inscribed in a given (convex) quadrilateral.

Newton's very elegant solution to this problem is based upon the theorem, also stemming from Newton:

The line connecting the midpoints of the diagonals of a quadrilateral circumscribed about a circle passes through the center of the circle.

The proof of this property of a tangent quadrilateral is based upon the following auxiliary theorem:

The locus of the common vertex of two triangles with prescribed base lines and a prescribed area sum is a straight line.

[Proof: Let $f$ and $g$ be the two prescribed base lines, $x$ and $y$ the distances of the common vertex S of the two triangles from the prescribed base lines and, at the same time, the "coordinates" of the
point S. The prescribed sum of the areas of the two triangles we will call $K$. Since the triangles have the areas $\tfrac{1}{2}fx$ and $\tfrac{1}{2}gy$, we obtain the equation $fx + gy = 2K$, and this is the equation of a straight line.]

Let there be circumscribed about a circle of center O and radius $r$ the tangent quadrilateral ABCD with the sides $AB = a$, $BC = b$, $CD = c$, $DA = d$, so that $a + c = b + d$. Let M be the midpoint of the diagonal AC and N the midpoint of BD, and $2J$ the area of the quadrilateral.

[Fig. 46]

Since $\triangle MAB$ and $\triangle MCD$ have areas equal to one half of $\triangle CAB$ and $\triangle ACD$, respectively, the sum of the areas of the two triangles MAB and MCD is equal to $J$, or half the area of the quadrilateral. The same holds for the triangles NAB and NCD. Consequently, the line MN is the locus of the common vertex S of all the pairs of triangles (SAB, SCD) having the area sum $J$. However, since the two triangles OAB and OCD also have the area sum $J$ (specifically, $\mathrm{I} = OAB + OCD = r\,\frac{a + c}{2}$ and $\mathrm{II} = OBC + ODA = r\,\frac{b + d}{2}$, and $\mathrm{I} = \mathrm{II}$; from $\mathrm{I} + \mathrm{II} = 2J$ it then follows that $\mathrm{I} = \mathrm{II} = J$), O belongs to the locus. Q.E.D.

Now for the solution to Newton's problem! Let us consider any ellipse inscribed in the given quadrilateral as the normal projection of a circle. In this projection the quadrilateral appears as the image (the normal projection) of an object quadrilateral circumscribed about the circle. Now, since: 1. in the object the center of the circle lies upon the line connecting the midpoints of the diagonals; 2. halving is preserved in the normal projection; 3. the center of
the ellipse is the image of the center of the circle, then in the image also the ellipse center lies on the line joining the midpoints of the diagonals of the prescribed quadrilateral.

Conclusion: The locus of the centers of all the ellipses that can be inscribed in a given quadrilateral is a straight line, specifically, the line connecting the midpoints of the diagonals of the quadrilateral.

50 The Poncelet-Brianchon Hyperbola Problem

To determine the locus of the intersection of the altitudes of all the triangles that can be inscribed in a rectangular (equilateral) hyperbola.

Brianchon (1785-1864) and Poncelet (1788-1867) were French mathematicians. The solution is in vol. XI of the Annales de Gergonne (1820-1821).

We relate the hyperbola to its asymptotes, which will serve as coordinate axes (the x-axis and $\xi$-axis), and take the abscissa (ordinate) of the apex of the hyperbola as the unit length. The equation for the hyperbola then reads

$$x\xi = 1.$$

Let PQR be an arbitrary triangle inscribed in the hyperbola, i.e., a triangle whose vertexes P, Q, R lie on the hyperbola. Let the abscissas of the points P, Q, R be $a$, $b$, $c$, the ordinates thus being $\alpha = 1/a$, $\beta = 1/b$, $\gamma = 1/c$. The slope of the side QR is $(\beta - \gamma)/(b - c)$ or, if we substitute $1/b$ and $1/c$ for $\beta$ and $\gamma$, $-1/bc$. The slope of the altitude to QR is thus $bc$. The equation of this altitude, which passes through P, is thus $\xi - \alpha = bc(x - a)$ or

(1) $\xi + abc = bc(x + \alpha\beta\gamma).$

For the altitude passing through Q we obtain similarly

(2) $\xi + abc = ca(x + \alpha\beta\gamma).$

Now, if the coordinates of the altitude intersection are understood to be $x \mid \xi$, (1) and (2) both apply, and by equating the right sides we find the abscissa $x$ of the point of intersection of the altitudes:

(I) $x = -\alpha\beta\gamma.$

If we introduce this value into (1) or (2), we obtain as the ordinate of the altitude intersection

(II) $\xi = -abc.$
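Formulas (I) and (II) can be confirmed by intersecting two altitudes directly. This is a small sketch with illustrative names and sample abscissas of my own choosing; the hyperbola is taken with the asymptotes as axes and unit product of coordinates, as in the text.

```python
def orthocenter(P, Q, R):
    """Intersection of the altitudes through P and Q of the triangle PQR."""
    # Altitude through P is perpendicular to QR:  (X - P) . (R - Q) = 0
    a1, b1 = R[0] - Q[0], R[1] - Q[1]
    c1 = a1 * P[0] + b1 * P[1]
    # Altitude through Q is perpendicular to RP:  (X - Q) . (P - R) = 0
    a2, b2 = P[0] - R[0], P[1] - R[1]
    c2 = a2 * Q[0] + b2 * Q[1]
    det = a1 * b2 - a2 * b1                      # nonzero for a proper triangle
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)


def hyperbola_point(t):
    """Point with abscissa t on the hyperbola (abscissa * ordinate = 1)."""
    return (t, 1.0 / t)


if __name__ == "__main__":
    a, b, c = 0.5, 2.0, -3.0                     # sample abscissas (assumed values)
    H = orthocenter(hyperbola_point(a), hyperbola_point(b), hyperbola_point(c))
    print(H, (-1.0 / (a * b * c), -a * b * c), H[0] * H[1])
```

The computed intersection should agree with the pair given by (I) and (II), and the product of its coordinates should be 1.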
Multiplying (I) and (II) finally gives us

$$x\xi = 1.$$

The altitude intersection thus lies on the hyperbola.

Consequently: The locus of the point of intersection of the altitudes of all the triangles that can be inscribed in an equilateral hyperbola is the hyperbola itself.

51 A Parabola as Envelope

On one arm of an angle the arbitrary segment e and, on the other, the segment f are marked off n times in succession from the vertex of the angle, and the segment end points are numbered, beginning from the vertex, $0, 1, 2, \ldots, n$ and $n, n - 1, \ldots, 2, 1, 0$, respectively. Prove that the lines joining the points with the same number envelop a parabola.

The proof is based upon the

Theorem of Apollonius: Two tangents to a parabola are divided into segments of like proportion by a third, and this third is divided in the same proportion by its point of tangency.

More precisely: If the two parabola tangents SA and SB, with the points of tangency A and B, are intersected by a third parabola tangent at P and Q, and if O is the point of tangency of this third tangent (Figure 40), we obtain the equation

$$\frac{SP}{PA} = \frac{OQ}{OP} = \frac{BQ}{SQ}.$$

The proof of the Apollonian theorem is based upon the known parabola property: The point of intersection of two parabola tangents lies on a parallel to the parabola axis passing through the midpoint of the chord connecting the points of tangency. (It follows directly from the fact that the three perpendicular bisectors of the triangle FA'B', whose vertexes are the focus F and the projections A' and B' of the points of tangency A and B on the directrix, pass through a single point. Two of the perpendicular bisectors are the tangents and the third is the parallel to the axis.)

Because of this property

(1) $p' = a'$, (2) $q' = b'$, (3) $b' + \beta' = a' + \alpha'$,
if we call the projections of the segments $AP = a$, $PS = \alpha$, $BQ = b$, $QS = \beta$, $OP = p$, $OQ = q$ on the directrix $a'$, $\alpha'$, $b'$, $\beta'$, $p'$, $q'$. Moreover, as a result of the equality of the projections of the segment PQ and the traverse PSQ,

(4) $p' + q' = \alpha' + \beta'.$

If, in accordance with (1) and (2), we substitute $a'$ and $b'$ for $p'$ and $q'$ in (4), we obtain

$$\alpha' + \beta' = a' + b',$$

and this equation when combined with (3) shows that

$$\alpha' = b' \quad \text{and} \quad \beta' = a'.$$
This now gives us

$$\frac{\alpha}{a} = \frac{\alpha'}{a'} = \frac{b'}{a'}, \qquad \frac{q}{p} = \frac{q'}{p'} = \frac{b'}{a'}, \qquad \frac{b}{\beta} = \frac{b'}{\beta'} = \frac{b'}{a'},$$

which proves the theorem of Apollonius.

The proof of the envelope property described above is now very simple. Let us call the vertex of the angle S; we then select on the arms of the angle the points A and B in such manner that $SA = ne$ and $SB = nf$ (A and B are the same points that received the numbers $n$ and $0$ in the numbering process previously described), and consider the parabola that is tangent to the arms of the angle at A and B. According to Apollonius' theorem, the line connecting the point P on SA to which the number $\nu$ has been assigned with the point Q on SB bearing the same number is tangent to the parabola. [The ratios $PS : PA$ and $QB : QS$ are both equal to $\nu : (n - \nu)$.] Consequently, the parabola is enveloped by the lines joining the points with the same numbers. At the same time, Apollonius' theorem makes it possible to find the point of tangency on each connecting line.

52 The Astroid

To find the envelope of a straight line, two marked points on which slide along two fixed, mutually perpendicular axes.

Gottfried Wilhelm Leibniz (1646-1716), the inventor of infinitesimal calculus, founded the theory of envelopes in 1692 in his paper De linea ex lineis numero infinitis ordinatim ductis inter se concurrentibus easque omnes tangente.

Solution. We seek the equation of the envelope in the coordinate system in which the two given axes are the x-axis and y-axis and their intersection O is the origin. Let the constant distance between the designated points be represented by $l$.

Let AB and A'B' represent two positions of the marked-off distance $l$, M and N the midpoints of AA' and BB', $OM = a$, $ON = b$, $AA' = 2\alpha$, $BB' = 2\beta$, thus $OA = a + \alpha$, $OA' = a - \alpha$, $OB = b - \beta$, $OB' = b + \beta$. The conditions $AB = l$ and $A'B' = l$ can then be written

(1) $(a + \alpha)^2 + (b - \beta)^2 = l^2$ and $(a - \alpha)^2 + (b + \beta)^2 = l^2$,
from which we obtain by subtraction

(2) $a\alpha = b\beta.$

The point of intersection $S(x, y)$ of the two straight lines AB and A'B' is determined by the two equations

$$\frac{x}{a + \alpha} + \frac{y}{b - \beta} = 1 \quad \text{and} \quad \frac{x}{a - \alpha} + \frac{y}{b + \beta} = 1,$$

and therefore by the following two equations:

(3) $\dfrac{ax}{a^2 - \alpha^2} + \dfrac{by}{b^2 - \beta^2} = 1$

and

(4) $\dfrac{\alpha x}{a^2 - \alpha^2} = \dfrac{\beta y}{b^2 - \beta^2},$

which are obtained from the first two by addition and subtraction. If we then divide (4) by (2), we obtain

$$\frac{x}{a(a^2 - \alpha^2)} = \frac{y}{b(b^2 - \beta^2)}$$

and, with the use of (3),

(5) $x = a\,\dfrac{a^2 - \alpha^2}{a^2 + b^2}, \qquad y = b\,\dfrac{b^2 - \beta^2}{a^2 + b^2}.$

If we then allow A and A' and B and B' to approach each other (naturally maintaining the conditions $AB = l$ and $A'B' = l$), then $\alpha$ and $\beta$ become continuously smaller and the point of intersection S of the lines AB and A'B' comes closer and closer to the envelope, finally reaching it when $\alpha$ and $\beta$ are equal to zero. The point $x \mid y$ at which the envelope is reached is then represented, according to (5), by the equations

(5') $x = \dfrac{a^3}{a^2 + b^2}, \qquad y = \dfrac{b^3}{a^2 + b^2},$

in which, in view of (1),

(1') $a^2 + b^2 = l^2$

is true.
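The limiting process can be imitated numerically by intersecting two neighbouring positions of the line for a small value of α. The sketch below is only an illustration: the function name and the sample values a = 3, b = 4 are assumptions, β is taken from relation (2), and the intersection is obtained by solving the two intercept-form line equations directly rather than by using (5).

```python
def neighbour_intersection(a, b, alpha):
    """Intersection S of the lines AB and A'B' in the notation of the text.

    OA = a + alpha, OB = b - beta, OA' = a - alpha, OB' = b + beta,
    with beta chosen from relation (2), a*alpha = b*beta, so that both
    segments AB and A'B' have the same length.
    """
    beta = a * alpha / b
    p1, q1 = a + alpha, b - beta        # intercepts of AB
    p2, q2 = a - alpha, b + beta        # intercepts of A'B'
    # Solve q1*x + p1*y = p1*q1 and q2*x + p2*y = p2*q2 by Cramer's rule.
    det = q1 * p2 - q2 * p1
    x = p1 * p2 * (q1 - q2) / det
    y = q1 * q2 * (p2 - p1) / det
    return x, y


if __name__ == "__main__":
    a, b = 3.0, 4.0                     # so that a^2 + b^2 = 25 in the limit
    for alpha in (1e-1, 1e-2, 1e-4):
        print(alpha, neighbour_intersection(a, b, alpha))
    print("limit (5'):", a ** 3 / (a * a + b * b), b ** 3 / (a * a + b * b))
```

As α shrinks, the printed intersection points should approach the limiting values given by (5').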
From (5') it then follows that

$$a^3 = l^2 x, \quad b^3 = l^2 y \qquad \text{or} \qquad a^2 = l^{4/3} x^{2/3}, \quad b^2 = l^{4/3} y^{2/3},$$

from which

$$l^2 = l^{4/3} x^{2/3} + l^{4/3} y^{2/3}$$

is obtained by addition. The equation of the envelope thus reads

$$x^{2/3} + y^{2/3} = l^{2/3}$$

or, in rational form,

$$(l^2 - x^2 - y^2)^3 = 27\, l^2 x^2 y^2.$$

(The second form is obtained from the first by cubing twice. The first cubing results in $x^2 + y^2 + 3x^{2/3}y^{2/3}(x^{2/3} + y^{2/3}) = l^2$ or $3x^{2/3}y^{2/3}l^{2/3} = l^2 - x^2 - y^2$, and on the second cubing we obtain the indicated form.)

Because of its shape the curve $x^{2/3} + y^{2/3} = l^{2/3}$ is called an astrois or astroid, in accordance with a proposal made by J. J. Littrow in 1838, or a star line, after M. Simon's proposal.

The astroid is a hypocycloid* in which the radius of the fixed circle is four times that of the rolling circle.

* If a circular disc rolls along the circumference of a fixed circle (without sliding), a marked point on the circumference of the rolling disc (the "rolling circle") describes an epicycloid when the disc rolls along the outside of the fixed circle and a hypocycloid when the disc rolls along the inside.

Proof. In Figure 49, let C be the center, $l$ the radius, and the arc JT a section of the fixed circle $\mathfrak{F}$, and let $\mathfrak{R}$ be the rolling circle at the moment in which it touches $\mathfrak{F}$ at the point T, so that the center Z of the rolling circle cuts the radius CT into the two segments $ZT = r = \tfrac{1}{4}l$ and $CZ = 3r$. Also, let M be the point on the circumference of $\mathfrak{R}$ whose path we are to follow, $x$ its abscissa and $y$ its ordinate. We then select C as the origin of the coordinates and draw the (horizontal) x-axis through the point J, at which the marked point was at the beginning of its motion.

The arcs JT of $\mathfrak{F}$ and TM of $\mathfrak{R}$ are then of equal length; the sector angle $W = \angle TZM$ is therefore four times the sector angle $w = \angle JCT$. The inclination of the radius ZM to the horizontal is $4w - w = 3w$, and the horizontal and vertical projections of ZM are $r \cos 3w$ and $r \sin 3w$, respectively. The
[Fig. 48. Fig. 49.]
corresponding projections of CZ are $3r \cos w$ and $3r \sin w$. Thus we obtain the equations (which can be read off the figure)

$$x = 3r \cos w + r \cos 3w, \qquad y = 3r \sin w - r \sin 3w,$$

which, as a result of the relationships

$$\cos 3w = 4\cos^3 w - 3\cos w, \qquad \sin 3w = 3\sin w - 4\sin^3 w,$$

can be transformed into

$$x = l \cos^3 w, \qquad y = l \sin^3 w.$$

In the pair of equations obtained, the coordinates of the hypocycloid point $x \mid y$ are represented as functions of the so-called rolling angle $w$. To obtain the curve equation in Cartesian coordinates, we solve for $\cos w$ and $\sin w$, square, and add. Thus, we obtain

$$x^{2/3} + y^{2/3} = l^{2/3},$$

i.e., the equation of an astroid, which was to be demonstrated.

53 Steiner's Three-pointed Hypocycloid

To determine the envelope of the Wallace line of a triangle.

Solution. Let ABC be the given triangle, M the center, and $r$ the radius of the circle $\mathfrak{U}$ circumscribed about it.

A Wallace line of a triangle is the line connecting the three feet of the perpendiculars dropped from any point P on the circumcircle to the sides of the triangle.

We will make M the origin of an X-Y coordinate system and preliminarily select the X-axis arbitrarily. If we designate the angles formed by the radii MA, MB, MC, MP with the positive side of the X-axis as $2\alpha$, $2\beta$, $2\gamma$, $2\varphi$, the coordinates of the three corners A, B, C are

$$(r\cos 2\alpha \mid r\sin 2\alpha), \quad (r\cos 2\beta \mid r\sin 2\beta), \quad (r\cos 2\gamma \mid r\sin 2\gamma),$$

and the coordinates of the point P are $(r\cos 2\varphi \mid r\sin 2\varphi)$.

In order to find the coordinates $X_1 \mid Y_1$ of the foot $F_1$ of the perpendicular dropped from P to BC, we form the equations of the
line BC (in the two-point form) and the line $PF_1$ (in the slope form) and find from these equations that

$$X_1 = f\,[\cos 2\beta + \cos 2\gamma + \cos 2\varphi - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$Y_1 = f\,[\sin 2\beta + \sin 2\gamma + \sin 2\varphi - \sin(2\beta + 2\gamma - 2\varphi)],$$

where $f$ represents half of $r$. Accordingly, the coordinates $X_2 \mid Y_2$ of the foot $F_2$ of the perpendicular dropped from P to CA will naturally be

$$X_2 = f\,[\cos 2\gamma + \cos 2\alpha + \cos 2\varphi - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$Y_2 = f\,[\sin 2\gamma + \sin 2\alpha + \sin 2\varphi - \sin(2\gamma + 2\alpha - 2\varphi)].$$

An appropriate parallel displacement of the coordinate system allows us to put the coordinates into a simpler form. This displacement of the coordinate system is based upon Sylvester's theorem (No. 27). In accordance with this, the altitude intersection H of the triangle ABC has the coordinates

$$r(\cos 2\alpha + \cos 2\beta + \cos 2\gamma) \quad \text{and} \quad r(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$

Since the center F of the Feuerbach circle lies halfway between M and H (No. 28), the coordinates of F are

$$X_0 = f(\cos 2\alpha + \cos 2\beta + \cos 2\gamma), \qquad Y_0 = f(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$

It is therefore convenient to select the center of the Feuerbach circle as the origin of the new coordinate system $x, y$. Between the coordinates $X \mid Y$ of a point in the old system and $x \mid y$ in the new system there exist the relations

$$X = X_0 + x, \qquad Y = Y_0 + y.$$

From these relations we obtain for the coordinates $(x_1 \mid y_1)$ and $(x_2 \mid y_2)$ of the points $F_1$ and $F_2$ in the new system the simpler values

$$x_1 = f\,[\cos 2\varphi - \cos 2\alpha - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$y_1 = f\,[\sin 2\varphi - \sin 2\alpha - \sin(2\beta + 2\gamma - 2\varphi)]$$

and

$$x_2 = f\,[\cos 2\varphi - \cos 2\beta - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$y_2 = f\,[\sin 2\varphi - \sin 2\beta - \sin(2\gamma + 2\alpha - 2\varphi)].$$
Now the equation for the Wallace line $F_1F_2$ reads

$$\frac{y - y_1}{x - x_1} = \frac{y_2 - y_1}{x_2 - x_1}.$$

For the differences $x_2 - x_1$ and $y_2 - y_1$ appearing here, we obtain, in accordance with the coordinate values just given, the expressions

$$x_2 - x_1 = f(\cos 2\alpha - \cos 2\beta) + f\,[\cos(2\beta + 2\gamma - 2\varphi) - \cos(2\gamma + 2\alpha - 2\varphi)]$$
$$= -2f \sin(\alpha + \beta)\sin(\alpha - \beta) + 2f \sin(\alpha + \beta + 2\gamma - 2\varphi)\sin(\alpha - \beta)$$
$$= 4f \sin(\alpha - \beta)\sin(\gamma - \varphi)\cos(\alpha + \beta + \gamma - \varphi)$$

and similarly

$$y_2 - y_1 = 4f \sin(\alpha - \beta)\sin(\gamma - \varphi)\sin(\alpha + \beta + \gamma - \varphi).$$

The quotient $(y_2 - y_1)/(x_2 - x_1)$ thus has the value $\sin\Phi/\cos\Phi$ with $\Phi = \alpha + \beta + \gamma - \varphi$, and the equation of the Wallace line assumes the form

$$x \sin\Phi - y \cos\Phi = x_1 \sin\Phi - y_1 \cos\Phi.$$

Using the above values for the coordinates $x_1$ and $y_1$, we are able to write the right side of this equation as

$$f(\sin\Phi\cos 2\varphi - \cos\Phi\sin 2\varphi) - f(\sin\Phi\cos 2\alpha - \cos\Phi\sin 2\alpha) - f\,[\sin\Phi\cos(2\beta + 2\gamma - 2\varphi) - \cos\Phi\sin(2\beta + 2\gamma - 2\varphi)],$$

which expression becomes, according to the addition theorem of circular functions,

$$f\sin(\alpha + \beta + \gamma - 3\varphi) - f\sin(\beta + \gamma - \alpha - \varphi) - f\sin(\alpha - \beta - \gamma + \varphi) = f\sin(\alpha + \beta + \gamma - 3\varphi).$$

Now the equation of the Wallace line reads

$$x\sin(\alpha + \beta + \gamma - \varphi) - y\cos(\alpha + \beta + \gamma - \varphi) = f\sin(\alpha + \beta + \gamma - 3\varphi).$$

For the sake of a final simplification we now choose the position of the hitherto arbitrary x-axis in such manner that the sum of the three angles $\alpha$, $\beta$, $\gamma$ is equal to an integral multiple of $2\pi$. It is easily seen that with F as the point of origin there are only three rays, separated from each other by angles of $2\pi/3$, that satisfy this condition. We
choose one of these three rays as the x-axis. In the coordinate system thus determined, the Wallace line has the simple equation

(1) $x \sin\varphi + y \cos\varphi = f \sin 3\varphi.$

To interpret this equation geometrically we draw a triangle FQR with the side $FQ = f$, with the angles $2\varphi$ at F and $\varphi$ at R, and thus with the external angle $3\varphi$ at Q, whose side FR lies on the positive x-axis. The side QR of this triangle is then the Wallace line $\mathfrak{W}$ represented by (1). In fact: If $x = FU$ is the abscissa and $y = UV$ the ordinate of any point V of the line $\mathfrak{W}$, then the perpendicular FW dropped from F to $\mathfrak{W}$ is $f \sin 3\varphi$ as the projection of FQ; on the other hand, as the projection of the traverse $FU + UV$, it is $x \sin\varphi + y \cos\varphi$, so that equation (1) applies to the coordinates of V.

[Fig. 50]

In particular, if V is the foot of the perpendicular TV dropped to $\mathfrak{W}$ from the end point T of the extension $QT = 2f$ of FQ, V lies on the circle $\mathfrak{k}$ whose center Z is the midpoint of the hypotenuse QT of the right triangle QTV, which has the radius $f$ and which is tangent to the Feuerbach circle at Q and to the circle $\mathfrak{K}$ of center F and radius $3f$ at T.

Since $\angle VZT$, as an external angle of the isosceles triangle VZQ, is equal to $6\varphi$, the arc VT of the circle $\mathfrak{k}$ is equal to $f \cdot 6\varphi$. And since the arc JT stretching from the point of intersection J of the circle $\mathfrak{K}$ with the x-axis to T is equal to $3f \cdot 2\varphi$, and is therefore also equal to $6f\varphi$, it follows that

arc VT of $\mathfrak{k}$ = arc JT of $\mathfrak{K}$.
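Equation (1) can be tested against feet of perpendiculars computed directly. The following sketch is only a numerical check under the assumptions of the text (vertex angles 2α, 2β, 2γ with α + β + γ a multiple of 2π, origin moved to the center of the Feuerbach circle); the function name and the sample angles are illustrative.

```python
import math


def wallace_residual(r, alpha, beta, gamma, phi):
    """Largest deviation from equation (1) over the three feet of perpendiculars.

    The vertices A, B, C and the point P lie on the circle of radius r about
    the origin at the angles 2*alpha, 2*beta, 2*gamma, 2*phi; alpha + beta +
    gamma is assumed to be an integral multiple of 2*pi.
    """
    f = r / 2
    A, B, C, P = [(r * math.cos(2 * t), r * math.sin(2 * t))
                  for t in (alpha, beta, gamma, phi)]
    # Center of the Feuerbach circle: midpoint of circumcenter (origin)
    # and orthocenter A + B + C.
    F = ((A[0] + B[0] + C[0]) / 2, (A[1] + B[1] + C[1]) / 2)
    worst = 0.0
    for U, V in ((B, C), (C, A), (A, B)):
        dx, dy = V[0] - U[0], V[1] - U[1]
        t = ((P[0] - U[0]) * dx + (P[1] - U[1]) * dy) / (dx * dx + dy * dy)
        foot = (U[0] + t * dx, U[1] + t * dy)       # foot of the perpendicular from P
        x, y = foot[0] - F[0], foot[1] - F[1]       # coordinates relative to F
        lhs = x * math.sin(phi) + y * math.cos(phi)
        worst = max(worst, abs(lhs - f * math.sin(3 * phi)))
    return worst


if __name__ == "__main__":
    alpha, beta = 0.3, 1.1
    gamma = 2 * math.pi - alpha - beta              # makes the angle sum 2*pi
    for phi in (0.2, 0.9, 1.7, 2.6):
        print(phi, wallace_residual(2.0, alpha, beta, gamma, phi))
```

For each sample position of P the printed residual should be zero up to rounding, which means all three feet lie on the line described by (1).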
If we then think of circle $\mathfrak{k}$ (center Z, radius $f$) as rolling along the inside of circle $\mathfrak{K}$ (center F, radius $3f$) so that a point $\mathfrak{M}$ marked off on $\mathfrak{k}$ initially lies at J, the marked point arrives precisely at point V at the moment when the rolling circle $\mathfrak{k}$ assumes the drawn position. The locus of point V is consequently, as the path of the marked point $\mathfrak{M}$, a hypocycloid (cf. No. 52), in which the radius of the fixed circle is three times as large as the radius of the rolling circle.

And since at the moment depicted in the drawing the rolling circle is rotating precisely about the instantaneous point of rotation T, at this moment the marked point $\mathfrak{M}$ at V is moving in a direction QV that is precisely perpendicular to TV, i.e., the Wallace line $\mathfrak{w}$ is the tangent drawn to the hypocycloid at V! Thus the totality of Wallace lines represents the totality of all the hypocycloid tangents.

Conclusion: Steiner's theorem: The envelope of the Wallace lines of a triangle is a hypocycloid whose fixed circle possesses a radius that is three times as great as the radius of the rolling circle. The center of the fixed circle
is the center of the Feuerbach circle of the triangle, and the radius of the rolling circle is equal to the radius of the Feuerbach circle.

The three points of the hypocycloid (the three places at which the marked point on the rolling circle touches the fixed circle) are the end points of the three radii of the fixed circle, separated from each other by 120°, of which one lies on the positive x-axis.

The three apexes of the hypocycloid (the three places at which the marked point on the rolling circle touches the Feuerbach circle) divide the arcs of the Feuerbach circle lying outside the triangle, from the midpoints of the sides, into segments whose ratio to each other is as 1:2. [This ratio follows easily from the position of the x-axis and from the fact that the peripheral angle opposite the arc of a Feuerbach circle cut off by a triangle side is equal to the difference between the two triangle angles at the end points of the side.]

The Most Nearly Circular Ellipse Circumscribing a Quadrilateral

Of all the ellipses circumscribing a given quadrilateral, which deviates least from a circle?

This problem, which was posed in the seventeenth volume of Gergonne's Annales de Mathématiques, was solved by J. Steiner (Crelle's Journal, vol. II; also: Steiner, Gesammelte Werke, vol. I).

Solution (according to Steiner). To begin with, it is clear that the quadrilateral must be convex inasmuch as no ellipse can be circumscribed about a concave quadrilateral.

Let OPRQ be the given quadrilateral, let QR cut the extension of OP at H and PR cut the extension of OQ at K, and let $OP = p$,
$OQ = q$, $OH = h$, $OK = k$. We will take OP as the x-axis, OQ as the y-axis of an oblique-angle coordinate system. The equations for the sides OP and OQ of the quadrilateral are then $y = 0$ and $x = 0$, while the equations for the sides PR and QR are

$$\frac{x}{p} + \frac{y}{k} = 1 \quad\text{and}\quad \frac{x}{h} + \frac{y}{q} = 1$$

or, if we designate the expressions $kx + py - kp$ and $qx + hy - hq$ as $u$ and $v$,

$$u = 0 \quad\text{and}\quad v = 0.$$

The equation for every ellipse that can be circumscribed about the quadrilateral has the form

$$(1)\qquad \lambda xu + \mu yv = 0,$$

where $\lambda$ and $\mu$ are two arbitrary constants or so-called parameters.

[Since at O $x = 0$ and $y = 0$, at P $y = 0$ and $u = 0$, at Q $x = 0$ and $v = 0$, and, finally, at R $u = 0$ and $v = 0$, the second degree curve $\mathfrak{C}$ represented by (1) passes through all four corners. Thus, if $\mathfrak{E}$ is an ellipse of circumscription, which, moreover, also passes through the fifth point $x_0|y_0$, and if we choose $\lambda$ and $\mu$ in such manner that

$$\lambda x_0(kx_0 + py_0 - kp) + \mu y_0(qx_0 + hy_0 - hq) = 0,$$

then $x_0|y_0$ also lies on $\mathfrak{C}$. Since, however, only one second degree curve can pass through five points, $\mathfrak{C}$ is the ellipse $\mathfrak{E}$. Thus, every ellipse of circumscription can be represented by (1).]

We introduce the values of $u$ and $v$ into (1) and obtain the equation of an arbitrary ellipse of circumscription:

$$(1')\qquad Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey = 0,$$

where

$$A = k\lambda,\quad 2B = p\lambda + q\mu,\quad C = h\mu,\quad D = -\tfrac{1}{2}kp\lambda,\quad E = -\tfrac{1}{2}hq\mu.$$

We begin by looking for the locus of the centers of all the parallel chords of the ellipse (1')

$$(2)\qquad y = Mx + n,$$

in which $M$ is the common directional constant of the chords, $n$ the segment cut off on the y-axis by one of these chords, chosen arbitrarily.
If we introduce $y$ from (2) into (1'), we obtain the quadratic equation

$$(A + 2BM + CM^2)x^2 + 2[(Cn + E)M + Bn + D]x + Cn^2 + 2En = 0$$

for the abscissas $x_1$ and $x_2$ of the points of intersection of the chord (2) with the ellipse (1'). According to a well-known theorem from quadratic equation theory, the sum of the two roots $x_1$ and $x_2$ of this equation is

$$x_1 + x_2 = -2\,\frac{(Cn + E)M + Bn + D}{A + 2BM + CM^2},$$

i.e., the abscissa of the chord midpoint is

$$X = -\frac{(CM + B)n + EM + D}{CM^2 + 2BM + A}.$$

Since the chord midpoint $X|Y$ satisfies the equation (2) of the chord, $Y = MX + n$, so that we can substitute $Y - MX$ for $n$ in the equation found for $X$. If we do this, we obtain for the coordinates $X$ and $Y$ of the chord midpoint the equation

$$(3)\qquad Y = M'X + n',$$

with

$$(3\mathrm{a})\qquad M' = -\frac{A + BM}{B + CM},\qquad n' = -\frac{D + EM}{B + CM}.$$

Since (3) is the equation of a straight line, the following theorem applies:

The midpoints of all the parallel chords of an ellipse possessing the directional constant $M$ lie on a straight line (a diameter of the ellipse) with the directional constant $M'$.

The two directional constants $M$ and $M'$, as well as their corresponding directions and the diameters of the ellipse possessing this direction, are said to be conjugate to each other.

We will now prove two auxiliary theorems.

Auxiliary theorem I: There is only one pair of conjugate directions (diameters) that belong to all the ellipses circumscribing a quadrilateral.

Proof. We replace A, B, C in (3a) with their values and obtain

$$-M' = \frac{(2k + pM)\lambda + qM\mu}{p\lambda + (2hM + q)\mu}.$$

If $M'$ (for a prescribed $M$) is to maintain the same value no matter which ellipse of circumscription we are concerned with and consequently, no matter how great $\lambda$ and $\mu$ are, then this value must be
obtained when $\lambda = 1$ and $\mu = 0$ as well as when $\lambda = 0$ and $\mu = 1$. Consequently, it must be true that

$$\frac{2k + pM}{p} = \frac{qM}{2hM + q}.$$

And if we are able to find a suitable $M$ for this equation, then $qM = (2k + pM)(2hM + q)/p$, so that for every $\lambda$ and every $\mu$

$$-M' = \frac{(2k + pM)\lambda + \frac{2k + pM}{p}(2hM + q)\mu}{p\lambda + (2hM + q)\mu} = \frac{2k + pM}{p}$$

or

$$(4)\qquad M + M' = -2\,\frac{k}{p},$$

i.e., $M'$ is independent of $\lambda$ and $\mu$.

The equation giving the condition for $M$ is written

$$hpM^2 + 2hkM + kq = 0$$

and gives the two $M$-values

$$(5)\qquad M_1 = -\frac{k}{p} + \frac{r}{hp},\qquad M_2 = -\frac{k}{p} - \frac{r}{hp}$$

with

$$r^2 = h^2k^2 - hp \cdot kq = hk(hk - pq).$$

Since, according to the drawing, $hk > pq$, $r^2$ is positive, $r$ is real, and both $M$-values are real. Moreover,

$$(5\mathrm{a})\qquad M_1 + M_2 = -2\,\frac{k}{p}.$$

Now, according to (4), the directional constant $M_1'$ that is conjugate to $M_1$ has the value $-M_1 - 2(k/p)$, i.e., the value $M_2$. In like manner, $M_2' = M_1$.

Thus, there is only one pair of specific directions, determined by the directional constants $M_1$ and $M_2$, that will form a pair of conjugate directions for each ellipse of circumscription.

Auxiliary theorem II: The acute angle formed by two conjugate diameters of an ellipse attains a minimum when the two conjugate diameters are equal, and the tangent of the half angle-minimum is equal to the ratio $b:a$ of the two half axes.
Proof. If $\theta$ and $\varphi$ are the two acute angles that the two conjugate diameters of an ellipse with the half axes $a$ and $b$ form with the large axis, then obviously

$$(6)\qquad \tan\theta \cdot \tan\varphi = \frac{b^2}{a^2}.$$

For the angle $\Omega = \theta + \varphi$ of the two conjugate diameters we therefore obtain

$$\tan\Omega = \tan(\theta + \varphi) = \frac{\tan\theta + \tan\varphi}{1 - \tan\theta\tan\varphi} = \frac{\tan\theta + \tan\varphi}{1 - \dfrac{b^2}{a^2}}.$$

But the left side of this equation, and therefore the angle $\Omega$, attains a minimum when the numerator of the right side assumes its smallest value. This numerator is the sum of two numbers ($\tan\theta$ and $\tan\varphi$) of constant product and, according to No. 10, attains a minimum when the numbers are equal. From $\tan\theta = \tan\varphi$ it follows that $\theta = \varphi$ and from this that the two diameters are equal. At the same time from (6) we obtain the value $b/a$ for the tangent of the half angle-minimum.

These preliminaries concluded, the solution of the problem is simple.

The circumscribed ellipse becomes more and more circular, the closer the ratio $b:a$ of the small to the large half axis comes to unity. Now, according to auxiliary theorem II, this ratio has the value $\tan(\omega/2)$, where $\omega$ is the smallest angle formed by conjugate diameters. The most nearly circular circumscribed ellipse is therefore the ellipse in which $\omega$ attains its maximum possible value. And this is the ellipse in which the directional constants of its equal conjugate diameters are determined by (5). Thus, if $\omega_0$ is the angle between the equal conjugate diameters of this ellipse, then for every other ellipse of circumscription, $\omega_0$, as the angle between two unequal conjugate diameters (with the directional constants $M_1$ and $M_2$), is greater than the angle $\omega$ of this ellipse enclosed between equal conjugate diameters, so that $\omega_{\max} = \omega_0$.
Consequently: Of all the ellipses circumscribed about a quadrilateral the ellipse that deviates least from a circle is the one whose equal conjugate diameters possess the conjugate directions common to all the ellipses of circumscription.
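The common conjugate directions can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the sample values of $p$, $q$, $h$, $k$ and of the parameters $\lambda$, $\mu$ are arbitrary choices (with $h > p$, $k > q$, so that $hk > pq$), and the function names are hypothetical. It evaluates the two roots $M_1$, $M_2$ of $hpM^2 + 2hkM + kq = 0$ and the conjugate directional constant from (3a).

```python
import math

def common_conjugate_constants(p, q, h, k):
    """Roots M1, M2 of h*p*M^2 + 2*h*k*M + k*q = 0, as in formula (5)."""
    r = math.sqrt(h * k * (h * k - p * q))
    return -k / p + r / (h * p), -k / p - r / (h * p)

def conjugate_constant(M, lam, mu, p, q, h, k):
    """Directional constant M' conjugate to M for the ellipse lam*x*u + mu*y*v = 0,
    using M' = -(A + B*M)/(B + C*M) with A = k*lam, 2B = p*lam + q*mu, C = h*mu."""
    A = k * lam
    B = (p * lam + q * mu) / 2
    C = h * mu
    return -(A + B * M) / (B + C * M)

p, q, h, k = 2.0, 3.0, 4.0, 5.0
M1, M2 = common_conjugate_constants(p, q, h, k)
# M1 and M2 should be conjugate for every choice of the parameters lam, mu.
pairs = [(lam, mu, conjugate_constant(M1, lam, mu, p, q, h, k))
         for lam, mu in [(1.0, 1.0), (1.0, 3.0), (2.0, 0.5)]]
```

For each sample pair of parameters the conjugate of $M_1$ comes out as $M_2$, independent of $\lambda$ and $\mu$, as auxiliary theorem I asserts.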
The directional constants of these specific directions are determined by the quadratic equation $hpM^2 + 2hkM + kq = 0$.

The Curvature of Conic Sections

To determine the curvature of a conic section.

By the curvature of a curve at a point is meant the reciprocal value of the radius of the circle of curvature, i.e., the radius of the circle that fits the curve most closely at the relevant point.

Solution. Let the conic section be called $\mathfrak{K}$, its parameter $2p$, its form number $\varepsilon$, its shortest focal radius $k$, so that $p = k(1 + \varepsilon)$, and finally, let its vertex equation be

$$qx^2 + y^2 - 2px = 0,\quad\text{with } q = 1 - \varepsilon^2.$$

It is known that the coordinates of a point $\Pi(\xi|\eta)$ at a distance $R$ from another point $P(x|y)$ and lying at a direction from P that forms the angle $\vartheta$ with the positive x-axis are

$$\xi = x + oR,\qquad \eta = y + iR,$$

where $o$ is the cosine and $i$ the sine of $\vartheta$. If $\Pi$ lies on $\mathfrak{K}$, then from

$$q\xi^2 + \eta^2 - 2p\xi = 0$$

we obtain the quadratic equation for $R$

$$DR^2 - ER + F = 0$$

with the coefficients

$$D = i^2 + qo^2,\qquad E = 2(ou - iy),\qquad F = qx^2 + y^2 - 2px,$$

where $u = p - qx$.

In respect to the conic section, we will call the three expressions D, E, F the directional number for the "direction" $\vartheta$, the emanant at point $x|y$ for the direction $\vartheta$, and the power at point $x|y$.

If $P\Pi$ is a secant, the roots $R_1$ and $R_2$ of the quadratic equation are the segments generated on the secant by the conic section. The relations between the roots and the coefficients of a quadratic equation give us the following theorems:

I. The emanant is the D-fold sum of the secant segments.

II. The power is the D-fold product of the secant segments.

We now draw through an arbitrary point $P(x|y)$ of the conic section the tangent $\mathfrak{T}$ and the normal and designate the segment of
the normal from P to the x-axis as $n$ and the segment reaching from P to the conic section as $N$. If $\vartheta$ is the angle of $\mathfrak{T}$ with the x-axis, $o$ the cosine, $i$ the sine of $\vartheta$, then the directional number for the tangent direction is

$$D = i^2 + qo^2 = \frac{u^2}{n^2} + q\,\frac{y^2}{n^2} = \frac{p^2}{n^2}$$

(since $u = p - qx$ represents the subnormal), while for the directional number of the inward-pointing normal we obtain the value

$$\Delta = o^2 + qi^2.$$

The emanant at P for the direction of the normals becomes $E = 2(oy + iu) = 2n$. Therefore, according to I.,

$$(1)\qquad 2n = \Delta N.$$

On tangent $\mathfrak{T}$ we select a point O whose distance OP from P we set equal to $t$; and we draw through O, perpendicular to $\mathfrak{T}$, the secant $\mathfrak{S}$ through the conic section. Let the two segments of the secant created by $\mathfrak{K}$ and measured from O be $s$ and $S$, and let $S > s$. According to II., we can write for the power of $\mathfrak{K}$ at O both $Dt^2$ and $\Delta Ss$, so that

$$(2)\qquad Dt^2 = \Delta Ss.$$

We now draw a circle $\mathfrak{k}$ to which for the time being we will attribute the arbitrary radius $\rho$; the center of this circle lies on the internal normal and the circle is tangent to the conic section at P. If $s_0$ and $S_0 > s_0$ are the segments measured from O that the circle creates on the secant $\mathfrak{S}$, then, according to the tangent theorem,

$$(3)\qquad t^2 = S_0 s_0.$$

By division of (2) and (3) we obtain

$$DS_0s_0 = \Delta Ss$$

and, using (1), we obtain

$$DNS_0s_0 = 2nSs.$$

Now the closer the fraction $s/s_0$ is to unity, the closer the approximation of the circle to the conic section in the vicinity of point P. But this fraction, according to the last equation, has the value

$$\frac{s}{s_0} = \frac{N}{S} \cdot \frac{S_0}{2\rho} \cdot \frac{D\rho}{n}.$$
In the immediate vicinity of the point P, $S$ becomes equal to $N$ and $S_0 = 2\rho$, so that both the first and second factors on the right-hand side are equal to 1. Consequently, the fraction $s/s_0$ comes closest to unity when the third right-hand factor $D\rho/n$ is also equal to 1. Thus:

Of all circles $\mathfrak{k}$ the one that most closely approximates the conic section is the one possessing the radius $\rho = n/D$.

Since D was previously determined as equal to $p^2/n^2$, we obtain the fundamental theorem:

The radius of curvature of a conic section has the value

$$\rho = n^3/p^2.$$

To draw the circle of curvature we must consider that $p/n$ is the cosine of the angle $\psi$ formed by the normal $n$ with the focal radius $r$ of the point P,* and accordingly we write the obtained formula as

$$\rho = n/\cos^2\psi.$$

* From the triangle with sides $n$, $r$ and the line $w$ joining the end points of $n$ and $r$ lying on the x-axis, we obtain $\cos\psi = (n^2 + r^2 - w^2)/2nr$. If we express the numerator of this fraction entirely in terms of $x$, thus expressing $n^2$ by $y^2 + u^2 = 2px - qx^2 + (p - qx)^2$, $r$ by $\varepsilon x + k$, and $w$ by $(x - k) + u = \varepsilon^2 x + k\varepsilon$, and combine, the numerator then becomes equal to $2p(\varepsilon x + k) = 2pr$ and $\cos\psi$ becomes $2pr/2nr = p/n$.

[Fig. 53.]

From inspection of this equation we obtain the following

Construction of the radius of curvature: At the point of intersection H of the normal with the x-axis we erect a perpendicular
to the normal. At its point of intersection K with the (extended) focal radius we then erect the perpendicular to the focal radius. The point of intersection Z of this second perpendicular with the normal is the center of curvature, its distance from P the desired radius of curvature.

Archimedes' Squaring of a Parabola

To determine the area enclosed in a parabola section.

The squaring of a parabola is one of Archimedes' most remarkable achievements. It was accomplished about 240 B.C. and is based upon the properties of Archimedes triangles.

An Archimedes triangle is a triangle whose sides consist of two tangents to a parabola and the chord connecting the points of tangency. The last-mentioned side is taken as the base line or the base
of the triangle. In order to construct such a triangle we draw the parallels to the parabola axis through the two points H and K of the directrix and erect the perpendicular bisectors upon the lines connecting H and K with the focus F. If we designate the point of intersection of the two perpendicular bisectors as S, the point of intersection of the first perpendicular bisector with the first parallel to the axis as A, and the point of intersection of the second perpendicular bisector with the second parallel to the axis as B, then A and B are points of the parabola and SA and SB are tangents of the parabola (classical construction of the parabola), and ASB is an Archimedes triangle (cf. Figure 39).

Since SA and SB are two perpendicular bisectors of the triangle FHK, the parallel to the axis through S is the third perpendicular bisector; it consequently passes through the center of HK, and, as the midline of the trapezoid AHKB, it also passes through the center M of AB. This gives us the theorem:

The median to the base of an Archimedes triangle is parallel to the axis.

Let the parabola tangents through the point of intersection O of the median SM to the base with the parabola cut SA at A', SB at B'. Then AA'O and BB'O are also Archimedes triangles. Consequently, according to the above theorem, the medians to their bases are also parallel to the axis and are therefore also parallel to SO. These medians are therefore midlines in the triangles SAO and SBO, so that A' and B' are the centers of SA and SB. A'B' is consequently the midline of the triangle SAB and is therefore parallel to AB; also the point O on A'B' must be the center of SM. The result of our investigations is the

Theorem of Archimedes: The median to the base of an Archimedes triangle is parallel to the axis, the midline parallel to the base is a tangent, and its point of intersection with the median to the base is a point of the parabola.
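The three statements of this theorem can be checked in coordinates. The following is a minimal sketch, assuming the parabola is given as $y^2 = 2px$ and the points of tangency are specified by their ordinates `y1` and `y2` (with `y1 + y2` nonzero); the function name and the sample values are illustrative only. Exact rational arithmetic is used so that the checks hold without rounding.

```python
from fractions import Fraction

def archimedes_triangle(p, y1, y2):
    """Vertices A, B (points of tangency), S (intersection of the tangents),
    M (midpoint of AB) and O (midpoint of SM) for the parabola y^2 = 2 p x."""
    p, y1, y2 = Fraction(p), Fraction(y1), Fraction(y2)
    A = (y1 * y1 / (2 * p), y1)
    B = (y2 * y2 / (2 * p), y2)
    # The tangents y*y1 = p(x + x1) and y*y2 = p(x + x2) meet at S.
    S = (y1 * y2 / (2 * p), (y1 + y2) / 2)
    M = ((A[0] + B[0]) / 2, (A[1] + B[1]) / 2)
    O = ((S[0] + M[0]) / 2, (S[1] + M[1]) / 2)
    return A, B, S, M, O

p = 2
A, B, S, M, O = archimedes_triangle(p, -1, 5)
median_parallel_to_axis = (S[1] == M[1])            # SM is horizontal
o_on_parabola = (O[1] ** 2 == 2 * p * O[0])         # O satisfies y^2 = 2 p x
tangent_slope_at_o = Fraction(p) / O[1]             # dy/dx = p / y on the parabola
chord_slope = (B[1] - A[1]) / (B[0] - A[0])
midline_is_tangent = (tangent_slope_at_o == chord_slope)
```

All three flags come out true for the sample values: the median is parallel to the axis, its midpoint O lies on the parabola, and the tangent at O is parallel to the base AB.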
Now we can determine the area J of the parabola section enclosed in our Archimedes triangle ASB with the base line AB. The tangents A'B' and the chords OA and OB divide the triangle ASB into four sections: 1. the "internal triangle" AOB enclosed within the parabola; 2. the "external triangle" A'SB' lying outside the parabola; 3. and 4. two "residual triangles" AOA' and BOB', which are also Archimedes triangles and are penetrated by the parabola. Since O lies at the center of SM, the internal triangle is twice the size of the external triangle.
In the same fashion, each of the two residual triangles in turn gives rise to an internal triangle, an external triangle and two new residual Archimedes triangles that are penetrated by the parabola, and once again each internal triangle is twice the size of the corresponding external triangle. Thus, we can continue without end and cover the entire surface of the initial Archimedes triangle ASB with internal and external triangles. The sum of all the internal triangles must also be twice as great as the sum of all the external triangles. In other words:

Theorem of Archimedes: The parabola divides the Archimedes triangle into sections whose ratio is 2:1.

Or also:

The area enclosed by a parabola section is two thirds the area of the corresponding Archimedes triangle.

Archimedes arrived at this conclusion by a somewhat different method. He found the area of the section by adding together the areas of all the successive internal triangles.

If $\Delta$ represents the area of the initial Archimedes triangle ASB, then the area of the corresponding internal triangle is one half of $\Delta$, the area of the corresponding external triangle is one quarter of $\Delta$, and the area of each of the two residual triangles is one eighth of $\Delta$.

The successive Archimedes triangles therefore have the areas

$$\Delta,\quad \frac{\Delta}{8},\quad \frac{\Delta}{8^2},\quad \ldots;$$

the corresponding internal triangles possess half this area; and since each internal triangle gives rise to two new internal triangles, we thus obtain for the sum of all the successive internal triangle areas the value

$$\frac{1}{2}\left[\Delta + 2 \cdot \frac{\Delta}{8} + 4 \cdot \frac{\Delta}{8^2} + 8 \cdot \frac{\Delta}{8^3} + \cdots\right].$$

The bracket encloses a geometrical series with the quotient $\tfrac{1}{4}$, the sum of which is equal to $\Delta/(1 - \tfrac{1}{4}) = \tfrac{4}{3}\Delta$. Thus, we again obtain for the area of the section the value $J = \tfrac{2}{3}\Delta$.

Since A'B' is tangent to the parabola at O, the perpendicular $h$ dropped from O to the base line AB of the section is the altitude of the section.
Since $h$ is also half the altitude of the triangle ASB, the area of this triangle is $\Delta = AB \cdot h$ and $J = \tfrac{2}{3} \cdot AB \cdot h$, i.e.: The area enclosed by a parabola section is equal to two thirds the product of the base and the altitude of the section.
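Both routes to the two-thirds rule can be sketched in a few lines. This is only an illustration: the parabola is assumed to be $y^2 = 2px$, the chord end points are given by ordinates `y1 < y2`, and the section area is obtained here by integrating exactly in $y$ (a modern shortcut, not Archimedes' method); all names are hypothetical.

```python
from fractions import Fraction

def internal_triangle_sum(delta, levels):
    """Partial sum of the internal triangle areas: at level j there are 2**j
    Archimedes triangles of area delta / 8**j, each contributing half its area."""
    delta = Fraction(delta)
    return sum(Fraction(2 ** j, 2) * delta / 8 ** j for j in range(levels))

def segment_and_triangle(p, y1, y2):
    """Exact area of the parabola section cut off by chord AB and exact area
    of the Archimedes triangle ASB, for y^2 = 2 p x and y1 < y2."""
    p, y1, y2 = Fraction(p), Fraction(y1), Fraction(y2)
    xa, xb = y1 ** 2 / (2 * p), y2 ** 2 / (2 * p)
    sx, sy = y1 * y2 / (2 * p), (y1 + y2) / 2            # intersection of the tangents
    triangle = abs((xb - xa) * (sy - y1) - (y2 - y1) * (sx - xa)) / 2
    slope = (xb - xa) / (y2 - y1)                        # dx/dy along the chord

    def antiderivative(y):
        # integral of (chord abscissa - parabola abscissa) with respect to y
        return xa * y + slope * (y - y1) ** 2 / 2 - y ** 3 / (6 * p)

    segment = antiderivative(y2) - antiderivative(y1)
    return segment, triangle

partial = internal_triangle_sum(1, 10)
segment, triangle = segment_and_triangle(2, -1, 5)
```

The partial sums approach two thirds of $\Delta$ (the remainder after $n$ levels is $\tfrac{2}{3}\Delta/4^n$), and the integrated section area equals two thirds of the triangle area for the sample chord.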
Finally, we will express the area of the section in terms of the transverse $q$ of the section, i.e., by the projection normal to the axis of the chord bounding the section.

[Fig. 55.]

We use the vertex equation of the parabola, calling the coordinates of the corners of the section $x|y$ and $X|Y$, and we have

$$y^2 = 2px \quad\text{and}\quad Y^2 = 2pX,$$

with $2p$ representing the parameter. From Figure 55 it follows directly that

$$J = \tfrac{2}{3}XY - \tfrac{2}{3}xy - (X - x) \cdot \frac{Y + y}{2}.$$

If we replace $X$ and $x$ here with $Y^2/2p$ and $y^2/2p$, we obtain

$$12pJ = Y^3 - y^3 - 3Y^2y + 3Yy^2 = (Y - y)^3.$$

Since $Y - y$ is the section transverse $q$, we finally obtain

$$12pJ = q^3.$$

This important formula can be expressed verbally as follows:

Six times the product of the parameter and the area of the section is equal to the cube of the section transverse.

Squaring a Hyperbola

To determine the surface area enclosed by a section of a hyperbola.

We select the major axis of the hyperbola as the x-axis, the minor axis as the y-axis; the hyperbola equation then reads

$$(1)\qquad \frac{x^2}{a^2} - \frac{y^2}{b^2} = 1,$$

where $a$ and $b$ are half the major and minor axes, respectively.
We must find the area A of the hyperbola section cut off at the abscissa $x$ (measured from the center of the hyperbola) by the hyperbola chord $2y$ that is normal to the x-axis (Figure 56). The coordinates for the corners of the section H and K are thus $x|y$ and $x|{-y}$.

First we determine the area T of a so-called hyperbola trapezoid, i.e., the trapezoidal surface that is bounded by a hyperbola arc, the parallels to one of the asymptotes through the end points of the arc, and the segment cut off on the other asymptote by these parallels.

Let the asymptote angle be $2\alpha$, its sine $J$, the sine and cosine of its halves $i$ and $o$, so that $i = b/e$ and $o = a/e$ (with $e = \sqrt{a^2 + b^2}$) and $J = 2io = 2ab/e^2$ (Figure 56).

We choose the asymptotes as the u- and v-axes of a second (oblique-angle) coordinate system. Between the coordinates $x|y$ and $u|v$ of a hyperbola point in the two systems there then exist the transformation equations

$$(2)\qquad x = ou + ov,\qquad y = iv - iu,$$

as may be seen from Figure 57, so that for the left side of (1) we obtain the value $4uv/e^2$ and we have the equation of the
hyperbola in the second system, the so-called asymptote equation of the hyperbola

$$(3)\qquad uv = P \quad\text{with } P = \tfrac{1}{4}e^2,$$

in which $P$ is the so-called power of the hyperbola.

Let the trapezoid T to be calculated be bounded by the hyperbola arc with end point coordinates $u|v$ and $U|V$ (where we let $U > u$, $V < v$), by the two ordinates $v$ and $V$ and by the base line $U - u$ of the trapezoid (Figure 58). We divide the trapezoid into $n$ equal sections $t$ by means of parallels to the v-axis, so that $T = nt$, and we designate the coordinates of the points marking off the segments on the trapezoid arc as $u_1|v_1$, $u_2|v_2$, …,

[Fig. 58.]
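The asymptote parallels through the end points $\mathfrak{u}|\mathfrak{v}$ and $\mathfrak{U}|\mathfrak{V}$ of the hyperbola arc corresponding to an arbitrary trapezoidal section $t$ determine two parallelograms with a common base line $g = \mathfrak{U} - \mathfrak{u}$ lying on the u-axis, one of which is larger and the other smaller than $t$. Since these parallelograms possess the areas $Jg\mathfrak{v}$ and $Jg\mathfrak{V}$, we obtain the inequality

$$Jg\mathfrak{v} > t > Jg\mathfrak{V}.$$

We introduce the so-called quotient of the trapezoid $t$, $q = \mathfrak{U}/\mathfrak{u}$, replace $g$ on the left by $(q - 1)\mathfrak{u}$ and on the right by $[1 - (1/q)]\mathfrak{U}$, and obtain

$$J(q - 1)\mathfrak{u}\mathfrak{v} > t > J\left(1 - \frac{1}{q}\right)\mathfrak{U}\mathfrak{V}$$

or, as a result of (3),

$$PJ(q - 1) > t > PJ\left(1 - \frac{1}{q}\right).$$

If we replace $t$ here with $T/n$, divide by $PJ$ and abbreviate $T/PJ$ as $c$, we obtain

$$q - 1 > \frac{c}{n} > 1 - \frac{1}{q}$$

or, solving for $q$,

$$1 + \frac{c}{n} < q < \frac{1}{1 - \dfrac{c}{n}}.$$

Using this inequality for all $n$ trapezoidal sections, we obtain the $n$ inequalities

$$1 + \frac{c}{n} < \frac{u_1}{u} < \frac{1}{1 - \dfrac{c}{n}},$$
$$1 + \frac{c}{n} < \frac{u_2}{u_1} < \frac{1}{1 - \dfrac{c}{n}},$$
$$\cdots$$
$$1 + \frac{c}{n} < \frac{U}{u_{n-1}} < \frac{1}{1 - \dfrac{c}{n}}.$$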
Multiplication of these gives

$$\left(1 + \frac{c}{n}\right)^n < Q < \frac{1}{\left(1 - \dfrac{c}{n}\right)^n}.$$

The mean of this inequality is the so-called quotient $Q = U/u$ of the hyperbola trapezoid T. The left and right side tend (according to No. 12) toward the value $\mathrm{e}^c$ for infinitely increasing $n$, $\mathrm{e}$ representing the Euler number (2.71828...), not to be confused with $e = \sqrt{a^2 + b^2}$. This gives us the equality

$$Q = \mathrm{e}^c.$$

With logarithms we obtain

$$(\mathrm{I})\qquad T = PJ\,l\,Q,$$

or verbally:

The area of the hyperbola trapezoid is proportional to the natural logarithm of the trapezoid quotient. The proportionality constant is the product of the hyperbola power and the sine of the asymptote angle.

Since $4P = e^2$, $J = 2ab/e^2$, we also have

$$(\mathrm{I}^*)\qquad T = \tfrac{1}{2}ab\,l\,Q.$$

If we join the end points $u|v$ and $U|V$ of our hyperbola arc with the hyperbola center O, we obtain a hyperbola sector to which we can similarly assign the "quotient" Q. Since the two triangles that are formed by the connecting lines mentioned and the coordinates of the end points of the arc have the areas $\tfrac{1}{2}uvJ$ and $\tfrac{1}{2}UVJ$, which areas are equal in view of (3), the sector has the same area S as the trapezoid:

$$(\mathrm{II})\qquad S = PJ\,l\,Q = \tfrac{1}{2}ab\,l\,Q.$$

Now the determination of the area of the section A is simple. First, in accordance with (2), the abscissas $u$ and $U$ of the section corners H and K are found to be
$$u = \frac{e}{2}\left(\frac{x}{a} - \frac{y}{b}\right),\qquad U = \frac{e}{2}\left(\frac{x}{a} + \frac{y}{b}\right).$$

From this it follows that the quotient of the sector OHK is

$$Q = \frac{U}{u} = \frac{\dfrac{x}{a} + \dfrac{y}{b}}{\dfrac{x}{a} - \dfrac{y}{b}} = \left(\frac{x}{a} + \frac{y}{b}\right)^2 \qquad [\text{cf. (1)}]$$

and, consequently, the area of the sector, according to (II), is

$$S = ab\,l\left(\frac{x}{a} + \frac{y}{b}\right).$$

Finally, A is found to be the amount by which the triangle OHK is greater than the sector OHK, or

$$(\mathrm{III})\qquad A = xy - ab\,l\left(\frac{x}{a} + \frac{y}{b}\right).$$

Rectification of a Parabola

To determine the length of a parabola arc.

Solution. The following ingenious solution to this problem stems from the famous book Lectiones Geometricae of the English mathematician Isaac Barrow (1630-1677), which was published in 1670 in London.

We refer the parabola to a coordinate system in which the x-axis is the axis of the parabola and the y-axis is tangent to the apex. The parabola equation then reads

$$y^2 = 2px.$$

We need only determine the length of an "apex arc," i.e., an arc of the parabola that takes its origin from the apex S, since any arc can be represented as the sum or difference of apex arcs. Let the end point P of the apex arc SP possess the coordinates X and Y, and let the sought-for length of the arc be L.

Since the subnormal of a parabola is equal to the half parameter $p$, there exists between the ordinate $y$ of a point of the parabola and the normal $n$ corresponding to this point the relation

$$n^2 - y^2 = p^2.$$

If we then assign to each parabola point $x|y$ of our coordinate system a point $n|y$ in a new $ny$-coordinate system, we obtain in the new system an equilateral hyperbola with the half axis $p$.
We show that $p$ times the length ($pL$) of the parabola arc SP is numerically equal to the surface area F of the hyperbola trapezoid that is bounded by the hyperbola, its axes, and the perpendicular N that is dropped from the hyperbola point P' corresponding to the point P onto the minor axis of the hyperbola. (N is at the same time the abscissa of the hyperbola point P' and the parabola normal at the parabola point P.)

[Fig. 59.]

Let us consider a portion $\sigma = AB$ of the parabola arc SP that is short enough to be considered a rectilinear distance (a so-called arc element) and let us draw through its end points the parallel AC to the parabola axis and $BC = \eta$ to the apex tangent. At the same time we draw the ordinate $y$ and the normal $n$ of the midpoint of AB, which gives us a right triangle with the sides $y$, $n$, and $p$ that is similar to the triangle ABC. As a result of this similarity we obtain the proportion

$$\eta : \sigma = p : n,$$

and this gives us the equation

$$(1)\qquad p\sigma = n\eta.$$

We then draw from the hyperbola points A' and B' corresponding to the points A and B the perpendiculars to the minor axis of the hyperbola, and we obtain a narrow hyperbola trapezoid that corresponds to the arc A'B'. The area $\varphi$ of this trapezoid is the product of its altitude $\eta$ and its midline $n$ (the latter is $n$ because it passes through the center of the altitude and thus through the end point of the hyperbola ordinate $y$):

$$(2)\qquad \varphi = n\eta.$$

From (1) and (2) we get

$$p\sigma = \varphi.$$
If we form this equation for each element of the parabola arc SP and its corresponding minute hyperbola trapezoid, and if we add the resulting equations, we obtain on the left $p$ times the arc length L and on the right the area F of the hyperbola trapezoid above described, i.e., the equation

$$pL = F.$$

Now from the concluding formula of No. 57 it follows that

$$F = \frac{NY}{2} + \frac{p^2}{2}\,l\,\frac{N + Y}{p}.$$

The sought-for arc length is thus

$$L = \frac{NY}{2p} + \frac{p}{2}\,l\,\frac{N + Y}{p},$$

where Y represents the ordinate, N the normal of the end point of the arc.

We now slightly transform the equation we have found. Let T be the portion of the parabola tangent passing through P, bounded by P and the y-axis, let $\tau$ be the slope angle of the parabola at point P, i.e., the angle formed by the tangent with the x-axis (and, at the same time, by the normal N with the y-axis). Then

$$\frac{NY}{2p} = \frac{Y \cdot Y}{2p\cos\tau} = \frac{X}{\cos\tau} = T$$

and

$$\frac{N + Y}{p} = \frac{N + N\cos\tau}{N\sin\tau} = \frac{1 + \cos\tau}{\sin\tau} = \frac{2\cos^2\dfrac{\tau}{2}}{2\sin\dfrac{\tau}{2}\cos\dfrac{\tau}{2}} = \cot\frac{\tau}{2};$$

consequently

$$L = T + k\,l\cot\frac{\tau}{2},$$

where we have replaced $\tfrac{1}{2}p$ by the shortest focal radius $k$.

Conclusion: An apex arc of a parabola exceeds the length of the parabola tangent reaching from the end of the arc to the apex tangent by a quantity that is proportional to the natural logarithm of the cotangent of half the slope angle. The proportionality constant is the shortest focal radius.
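The closing formula can be compared with a direct numerical integration of the arc element. The sketch below is only an illustration under stated assumptions: the parabola is $y^2 = 2px$ with $p > 0$, the arc runs from the apex to the ordinate $Y > 0$, and the arc length is approximated with a simple midpoint rule; the function names and the step count are arbitrary choices.

```python
import math

def arc_length_formula(p, Y):
    """L = T + k * ln(cot(tau/2)) with T = X / cos(tau), k = p/2, tan(tau) = p/Y."""
    X = Y * Y / (2 * p)
    tau = math.atan2(p, Y)              # slope angle of the tangent at P
    T = X / math.cos(tau)               # tangent segment from P to the apex tangent
    k = p / 2                           # shortest focal radius
    return T + k * math.log(1 / math.tan(tau / 2))

def arc_length_normal_form(p, Y):
    """L = N*Y/(2p) + (p/2) * ln((N + Y)/p) with the normal N = sqrt(p^2 + Y^2)."""
    N = math.hypot(p, Y)
    return N * Y / (2 * p) + (p / 2) * math.log((N + Y) / p)

def arc_length_numeric(p, Y, steps=20000):
    """Midpoint-rule approximation of the integral of sqrt(1 + (y/p)^2) dy from 0 to Y."""
    h = Y / steps
    return h * sum(math.sqrt(1 + ((j + 0.5) * h / p) ** 2) for j in range(steps))

p, Y = 1.0, 2.0
values = (arc_length_formula(p, Y), arc_length_normal_form(p, Y), arc_length_numeric(p, Y))
```

For the sample parabola the three values agree to within the accuracy of the numerical integration.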
Desargues' Homology Theorem (Theorem of Homologous Triangles)

If the lines connecting the homologous vertexes of two triangles pass through a point, the points of intersection of the homologous sides lie on a straight line. And conversely: If the points of intersection of the homologous sides of two triangles lie on a straight line, the lines connecting the homologous vertexes pass through a point.

One frequently has occasion to correlate to each other the vertexes and sides of two triangles (e.g., similar triangles), and in these cases for the sake of convenience the mutually correlated, so-called "homologous" vertexes and sides are usually designated by the same letter. Thus, one may have, for example, the homologous vertexes A and A', B and B', and finally C and C', as well as the homologous sides BC = a and B'C' = a', CA = b and C'A' = b', and finally AB = c and A'B' = c'.

Two such triangles, for which we will assume that no pair of homologous vertexes or sides coincides, are called copolar [perspective from a point] when the lines AA', BB', CC' connecting the homologous vertexes pass through one point, the so-called homology pole. They are called coaxial [perspective from a line] when the points of intersection aa', bb', cc' of the homologous sides lie on a straight line, the so-called homology axis.

Using these terms, the above theorem can be expressed in the abbreviated form of:

Desargues' homology theorem: Copolar triangles are coaxial, coaxial triangles are copolar.

Triangles that are both copolar and coaxial are called homologous triangles.

The theorem of homologous triangles was discovered by the French mathematician and engineer Gerard Desargues (1593-1662) in about 1636 and is therefore known as Desargues' theorem. However, according to the Greek mathematician Pappus, this theorem was already contained in the lost treatise on porisms of Euclid.
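The first half of the theorem (copolar triangles are coaxial) can be tried out on a concrete example. The following is a minimal sketch, not a proof: it assumes integer coordinates, represents points and lines by homogeneous triples, and builds the second triangle by moving each vertex along its line through an arbitrarily chosen pole; all names and sample values are illustrative.

```python
def cross(u, v):
    """Cross product of homogeneous triples: the line through two points,
    or the point of intersection of two lines."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Determinant of three triples; it vanishes exactly when the three points are collinear."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

def homologous_side_intersections(pole, triangle, factors):
    """Build A'B'C' with X' = pole + t*(X - pole) and return the points
    aa', bb', cc' where homologous sides meet (homogeneous coordinates)."""
    moved = [(pole[0] + t * (v[0] - pole[0]), pole[1] + t * (v[1] - pole[1]))
             for v, t in zip(triangle, factors)]
    A, B, C = [(x, y, 1) for x, y in triangle]
    A2, B2, C2 = [(x, y, 1) for x, y in moved]
    aa = cross(cross(B, C), cross(B2, C2))
    bb = cross(cross(C, A), cross(C2, A2))
    cc = cross(cross(A, B), cross(A2, B2))
    return aa, bb, cc

points = homologous_side_intersections((1, 1), [(2, 3), (5, 2), (4, 6)], (2, 3, 4))
collinear = (det3(*points) == 0)
```

With integer input the determinant is computed exactly, and for the sample triangles it vanishes, so the three intersection points lie on one line (the homology axis).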
Desargues' theorem plays a very important role in projective geometry. Consequently, we will prove it in a projective manner though other, shorter proofs are possible. For the reader unfamiliar with projective geometry it may be appropriate to provide a short exposition of its most important
concepts and its simplest theorems, especially as they will be encountered in the next few sections as well.

The totality of the points (considered as rigidly connected to each other) in a line is called a range of points; the line is called the base of the range. The totality of the lines (considered as rigidly connected to each other) that pass through one point is called a ray pencil; the point is called the center of the pencil. Similarly, the totality of the points of a circle or, more generally, of a conic section is called a circular or conic range of points or field of points; the totality of the tangents of a conic section is called a field of tangents of a conic section.

Ranges of points, pencils, and tangent families are the basic structures of plane projective geometry, and the points, rays, and tangents are the elements of the corresponding structures.

Two basic figures are called projective (symbol: $\barwedge$) when their elements are unequivocally related to each other in such manner that every four elements of the one figure and the four corresponding or "homologous" elements of the other have the same double ratio. The relation existing between the figures is called projectivity.

[The cross ratio (ABCD) of four points A, B, C, D of a straight line is the ratio

$$\frac{AC}{BC} : \frac{AD}{BD};$$

the cross ratio (abcd) of four rays a, b, c, d of a pencil is the ratio

$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd}.$$

The cross ratio of four points of a circle is the cross ratio of the four rays that connect the four points with a fifth point of the circle, where (according to the boundary angle theorem) this fifth point can be chosen at pleasure. The cross ratio of four points of a conic section is similarly the cross ratio of the four rays that join the four points with an arbitrarily chosen fifth point of the conic section (cf. No. 61). Finally, the ratio of four conic section tangents is the cross ratio of their points of tangency.]
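The two definitions of the cross ratio can be written out directly. This is a sketch under explicit assumptions: points on the line are given by signed positions, rays by direction angles, segments and angles are taken with sign, and the sample center and points are arbitrary; the function names are hypothetical. That the two values agree when the rays pass through the four points is the theorem of Pappus proved a little further on in the text.

```python
import math

def cross_ratio_points(a, b, c, d):
    """(ABCD) = AC/BC : AD/BD for points given by signed positions on a line."""
    return ((c - a) / (c - b)) / ((d - a) / (d - b))

def cross_ratio_rays(ta, tb, tc, td):
    """(abcd) = sin(ac)/sin(bc) : sin(ad)/sin(bd) for rays given by direction angles."""
    s = math.sin
    return (s(tc - ta) / s(tc - tb)) / (s(td - ta) / s(td - tb))

def ray_angles(center, positions):
    """Direction angles of the rays from `center` to the points (x, 0) on the x-axis."""
    cx, cy = center
    return [math.atan2(0.0 - cy, x - cx) for x in positions]

positions = [0.0, 1.0, 3.0, 4.0]
on_line = cross_ratio_points(*positions)
in_pencil = cross_ratio_rays(*ray_angles((0.5, 2.0), positions))
```

For the sample positions the point cross ratio is (3/2) : (4/3) = 9/8, and the cross ratio of the four rays from the chosen center agrees with it up to rounding.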
A projectivity is completely determined if three elements of one structure and the corresponding elements of the other are given. Two projective structures are called conjective when their bases (or centers) coincide. A particularly important case of projectivity is perspectivity. A range of points and a ray pencil are called perspective (symbol: ⩞) when each
element of the range lies on the corresponding element of the pencil. Each ray is called the reflection of the homologous point, the whole pencil is called the reflection of the range. Two nonconjective ranges are called perspective (symbol: ⩞) when the lines connecting the homologous points pass through one point, the center of perspectivity. Two ray pencils are called perspective if every pair of corresponding rays intersect on one straight line, the axis of perspectivity.

The projectivity of two perspective figures follows from Pappus' theorem: The cross ratio of four rays of a pencil is equal to the cross ratio of the four points at which an arbitrary line cuts the rays. (Pappus of Alexandria, fourth century A.D., Collectiones mathematicae.)

Proof. Let A, B, C, D be the four points of intersection of a line with the pencil of four rays OA = a, OB = b, OC = c, OD = d. We designate the sine of the angle formed by two rays, for example, a and c, with each other as sin ac. Since the perpendiculars from A and B to c have the lengths a sin ac and b sin bc and are in the same ratio as AC to BC, we obtain the proportion a sin ac : b sin bc = AC : BC. Similarly, a sin ad : b sin bd = AD : BD. By division of these two equations we obtain

$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd} = \frac{AC}{BC} : \frac{AD}{BD}.$$

Two projective ranges or pencils can always be brought into a perspective position. Two projective ranges (pencils) become perspective when they are placed in such a way that an element of one range (pencil) falls on the homologous element of the other range (pencil), though the bases (centers) do not coincide. We have the following two important theorems:

I. If in the projectivity between two ranges the point of intersection of the two bases corresponds to itself, the ranges are perspective.

II. If in the projectivity between two pencils the line connecting the two centers corresponds to itself, the pencils are perspective.

Proof of I. 
Let the bases of the two ranges be 𝔤 and 𝔤', their point of intersection that corresponds to itself O = O'. On 𝔤 we choose two fixed elements A, B and an arbitrary point P and we designate the homologous elements on 𝔤' as A', B', and P'. We find the point of intersection S of the lines AA' and BB' and assign to the lines connecting the designated elements with S the same letters, but in lower case. Then, according to Pappus, (oabp) = (OABP) and (o'a'b'p') = (O'A'B'P'). But since the right sides of these equations are equal, according to our assumption, it follows that (o'a'b'p') = (oabp). But if two equal cross ratios agree in the first three elements (o' = o, a' = a, b' = b), then they also agree in the fourth. Consequently, p' falls on p, and thus PP' passes through S, and the ranges are perspective.

Fig. 60.

Proof of II. Let the centers of the two projective pencils ℨ and ℨ' be Z and Z', their self-corresponding connecting line o = o'. We select on ℨ two fixed elements a and b and an arbitrary element p and designate the homologous elements of ℨ' as a', b', and p'. We find the connecting line g of the points aa' and bb' and assign to the points of intersection of the designated elements with g the same letters, but capitals. Then, according to Pappus, (oabp) = (OABP) and (o'a'b'p') = (O'A'B'P'). But since the left sides of these equations are equal, in accordance with our initial assumption, (O'A'B'P') = (OABP). But if two equal cross ratios agree in the first three elements (O' = O, A' = A, B' = B), they also agree in the fourth. P' therefore falls on P, p and p' thus intersect on g, and the pencils ℨ and ℨ' are perspective.

The proof of Desargues' theorem is now easily obtained (Figure 62). We call the vertexes of one triangle A, B, C, the sides opposite them a, b, c, the homologous vertexes of the other triangle A', B', C', the sides opposite them a', b', c'. Let the points of intersection of the homologous sides a and a', b and b', c and c' be X, Y, and Z, respectively, and let the points of intersection of the line CC' with the two lines AB and A'B' be H and H'. The proof divides into two parts.

I. We assume that the connecting lines AA', BB', CC' pass through one point O. We project the range of points AB from O onto A'B' and obtain two perspective ranges in which the elements A, B, H, Z of the first are homologous to the elements A', B', H', Z' = Z of the second. We then connect the points of these ranges with C and C', thereby obtaining two projective ray pencils in which the elements CA, CB, CH = CC', CZ correspond to the elements C'A', C'B', C'H' = C'C, C'Z'. Since the line CC' connecting the pencil centers corresponds to itself in this projectivity, the projectivity of the pencils is perspective and the points of intersection of the homologous rays lie on a straight line. Thus, for example, the points of intersection Y (of CA and C'A'), X (of CB and C'B'), and Z (of CZ and C'Z') lie on a straight line.

II. We assume that the points aa' (X), bb' (Y), cc' (Z) lie on a straight line g. We connect the points of the line g with C and C', thereby obtaining two perspective ray pencils in which the elements
a, b, CC', CZ of the first pencil correspond to the elements a', b', C'C, C'Z' = C'Z of the second. We cut these pencils with the lines c and c' and obtain two projective ranges in which the elements B, A, H, Z of the first range correspond to the elements B', A', H', Z' = Z of the second. Since the point of intersection Z = Z' of the range bases corresponds to itself in this projectivity, the ranges are perspective and the connecting lines BB', AA', and HH' = CC' of the homologous elements thus pass through one point, which was to be proved.

60 Steiner's Double Element Construction

To draw the double elements of a conjective projectivity that is given by three pairs of homologous elements.

A double element of a conjective projectivity is an element that coincides with its homolog. The following simple solution to this fundamental problem of projective geometry was discovered by the German mathematician Jakob Steiner (Die geometrischen Konstruktionen, etc. [cf. No. 34], Berlin, 1833).
Steiner's double element construction enriched the geometry of antiquity by providing it with a new and fruitful method for solving problems of geometric construction. This so-called method of false position (regula falsi) is based on the theorem: If in the projectivity between two ray pencils the line connecting the pencil centers corresponds to itself, the pencils are perspective (No. 59). We can distinguish three cases:

I. Double elements of a projectivity on a circle. Let the projectivity between the two ranges of points ℜ and ℜ' of the circle 𝔎 be given by the two corresponding point triplets (A, B, C) and (A', B', C'). We consider the ray pencils 𝔖 and 𝔖', whose rays run from the points of ranges ℜ and ℜ', respectively, through the centers A' and A, respectively. Since ℜ ⩞ 𝔖 and ℜ' ⩞ 𝔖', and, according to our assumption, ℜ ⊼ ℜ', it is also true that 𝔖 ⊼ 𝔖'. But since in the line AA' connecting the centers of the two pencils 𝔖 and 𝔖' corresponding pencil elements coincide, the latter projectivity is a perspectivity. The axis of perspectivity is the line g connecting the point of intersection of the rays A'B and AB' with the point of intersection of the rays A'C and AC'. Two corresponding rays of 𝔖 and 𝔖' thus always intersect on g. Thus, in order to obtain a point P' of ℜ' corresponding to the arbitrary point P of ℜ, we need only connect the point of intersection of A'P and g with A. The connecting line meets 𝔎 at P'. If we carry out this construction for the points of intersection H and K of the perspectivity axis with the circle, H' falls on H, K' on K. The double points of the projectivity on a circle are therefore the points of intersection of the circle with the above perspectivity axis.

Fig. 63.
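Case I can be followed in coordinates. The sketch below is only an illustration under stated assumptions: the circle is taken to be the unit circle about the origin, points and lines are written as homogeneous triples, and the helper names (`cross`, `on_circle`, `double_points`) are invented for the example. It forms the axis g as the line through the intersection of A'B with AB' and the intersection of A'C with AC', and then cuts g with the circle.

```python
import math

def cross(u, v):
    """Cross product of homogeneous triples: the line through two points,
    or the point common to two lines."""
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def on_circle(t):
    """Point of the unit circle at angle t, as a homogeneous triple."""
    return (math.cos(t), math.sin(t), 1.0)

def double_points(A, B, C, A1, B1, C1):
    """Double points of the projectivity A->A1, B->B1, C->C1 on the unit circle."""
    # Axis g through (A'B . AB') and (A'C . AC')
    g = cross(cross(cross(A1, B), cross(A, B1)),
              cross(cross(A1, C), cross(A, C1)))
    a, b, c = g                        # g is the line a*x + b*y + c = 0
    n2 = a*a + b*b
    disc = n2 - c*c                    # positive exactly when g cuts the circle
    if disc < 0:
        return []                      # g misses the circle: no real double points
    fx, fy = -a*c/n2, -b*c/n2          # foot of the perpendicular from the center to g
    h = math.sqrt(disc) / n2
    return [(fx - b*h, fy + a*h), (fx + b*h, fy - a*h)]

# Hypothetical example: the six points sit at the angles 2*atan(x), with
# x = 1, 3, -1 for A, B, C and the doubled values x = 2, 6, -2 for A', B', C'.
pt = lambda x: on_circle(2.0 * math.atan(x))
print(double_points(pt(1.0), pt(3.0), pt(-1.0), pt(2.0), pt(6.0), pt(-2.0)))
```

In this example the correspondence doubles the parameter x, so the double points should come out close to (1, 0) and (-1, 0), which correspond to x = 0 and to x growing beyond all bounds.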
II. Double elements of two ray pencils. We draw a circle 𝔎 through the common center of the two projective pencils and, in accordance with I., we draw the double points of the two ranges at which the rays of the two pencils cut 𝔎. The pencil rays passing to these double points are the double rays we are looking for.

III. Double elements of two ranges of points. We draw, in accordance with II., the double rays of the two pencils that are obtained from the lines connecting the points of the two conjective projective ranges with an arbitrary center Z outside the base of the range. The points of intersection of the two double rays with the base of the range are the double points we are looking for.

61 Pascal's Hexagon Theorem

To demonstrate that the three points of intersection of the opposite sides of a hexagon inscribed in a conic section lie on a straight line.

A hexagon inscribed in a conic section essentially consists of six points anywhere on the conic section 1, 2, 3, 4, 5, 6, the "vertexes" of the hexagon, and the six connecting lines 12, 23, 34, 45, 56, 61, the "sides" of the hexagon. The sides 12 and 45, the sides 23 and 56, and finally 34 and 61 are called the "opposite sides." The straight line on which the three points of intersection of the opposite sides lie is called the Pascal line, and the hexagon is called the Pascal hexagon.

In a somewhat more abbreviated form the theorem to be proved can be stated as: The three points of intersection of a Pascal hexagon lie on a straight line.

This fundamental theorem in conic section theory was published in 1640 by Blaise Pascal (1623-1662) at the age of 16 in his six-page Essai sur les Coniques.

There are a number of proofs of the Pascal theorem. The following projective proof is based upon the two theorems of Steiner:

I. The points of a conic section are projected from pairs of themselves by projective pencils.

II. 
If in the projectivity between two ranges of points the point of intersection of their bases corresponds to itself, the ranges are perspective. Proof of I. The theorem applies most directly to the circle. (In circles the designated pencils are even congruent.) Now, since a conic section is the central projection of a circle, and since in this
projection the pencils we are concerned with appear as projections of projective ray pencils in a circle, we need only show that the central projection of a pencil on a plane is projective with respect to the pencil. Now this is the case according to Pappus' theorem. Specifically, if a, b, c, d are four rays lying in plane E, a', b', c', d' their central projections on plane E', and A, B, C, D the points of intersection of the ray pairs (a, a'), (b, b'), (c, c'), and (d, d') lying on the line of intersection of the two planes, then, according to Pappus, (a'b'c'd') = (ABCD) and (abcd) = (ABCD), thus, also (a'b'c'd') = (abcd), i.e., the pencil and the pencil projection are projective.

The proof of II. is in No. 59.

Now to prove the Pascal theorem! Let the vertexes of the hexagon be 1, 2, 3, 4, 5, 6. According to I., the rays from the centers 1 and 3 to the conic section points 2, 4, 5, 6 form projective pencils; thus the points of intersection 2', 4', 5', 6' and 2", 4", 5", 6" of these rays with the straight lines 54 and 56 form projective ranges. Since at the point of intersection 5 of their bases the corresponding range elements are coincident (5' = 5"), the ranges are perspective according to II., and consequently the lines 2'2", 4'4", and 6'6" pass through one point, the point of intersection Z of the lines 4'4" and 6'6", i.e., the lines 34 and 61. In other words: The points of intersection of the opposite sides 2' (intersection of 12 and 45), 2" (intersection of 23 and 56), and Z (intersection of 34 and 61) lie on one straight line, the Pascal line p = 2'Z2". Q.E.D.
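The statement just proved lends itself to a quick numerical check. The following sketch is not part of the proof; it assumes homogeneous coordinates, and the function names and the sample ellipse are chosen freely for the illustration. It forms the three points of intersection of the opposite sides and returns a determinant that vanishes exactly when they lie on one line.

```python
import math

def cross(u, v):
    """Line through two points, or point common to two lines (homogeneous triples)."""
    return (u[1]*v[2] - u[2]*v[1], u[2]*v[0] - u[0]*v[2], u[0]*v[1] - u[1]*v[0])

def unit(w):
    """Rescale a homogeneous triple to length 1 so that residuals are comparable."""
    n = math.sqrt(w[0]**2 + w[1]**2 + w[2]**2)
    return (w[0]/n, w[1]/n, w[2]/n)

def pascal_residual(points):
    """points: six (x, y) pairs, the vertexes 1, ..., 6 in this order.
    Returns the determinant of the three intersection points of opposite sides."""
    p = [(x, y, 1.0) for x, y in points]
    side = lambda i, j: unit(cross(p[i], p[j]))   # side joining vertexes i+1 and j+1
    x1 = unit(cross(side(0, 1), side(3, 4)))      # 12 and 45
    x2 = unit(cross(side(1, 2), side(4, 5)))      # 23 and 56
    x3 = unit(cross(side(2, 3), side(5, 0)))      # 34 and 61
    n = cross(x2, x3)                             # line through the last two points
    return x1[0]*n[0] + x1[1]*n[1] + x1[2]*n[2]   # zero when the first lies on it

# Six points of the ellipse x = 2 cos t, y = sin t (an arbitrary sample)
ellipse = [(2*math.cos(t), math.sin(t)) for t in (0.3, 1.1, 2.0, 2.9, 4.0, 5.2)]
print(abs(pascal_residual(ellipse)) < 1e-9)
```

Working with homogeneous triples has the advantage that a pair of parallel opposite sides needs no special treatment; their point of intersection is simply a point at infinity.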
The converse of Pascal's theorem: If the opposite sides of a hexagon (of which no three vertexes lie on a straight line) intersect on a straight line, the six vertexes lie on a conic section.

Indirect proof. Let the conic section that is unequivocally determined by the five vertexes 1, 2, 3, 4, 5 meet the fifth side 56 of the hexagon at 6*. According to Pascal's theorem, we obtain 6* by drawing the Pascal line (as the line connecting the points of intersection of the opposite sides 12 and 45, as well as 23 and 56 = 56*), causing it to intersect with 34 at Z and determining the point of intersection (6*) of 1Z with 56* = 56. But according to our assumption, this is 6, so that 6* = 6.

If two vertexes of a Pascal hexagon coincide once or twice or three times, there follow the corollaries of the Pascal theorem, the most important of which we will now give.

I. The vertexes 5 and 6 coincide: this is to be considered as meaning that point 6 approaches point 5 ever more closely until it finally coincides with it. This transforms the chord 56 into the tangent at point 5 and the hexagon is transformed into the pentagon 1 2 3 4 5. Pascal's theorem then assumes the form:

Corollary 1 (Figure 65): In every pentagon inscribed in a conic section the points of intersection of two pairs of nonadjacent sides and the point of intersection of the fifth side with the tangent passing through the opposite vertex lie on a straight line.

Fig. 65.

II. The vertexes 5 and 6 coincide and the vertexes 2 and 3 coincide; the hexagon thus becomes a tetragon 1 2 4 5. Now the opposite sides of the tetragon 12 and 45, and likewise 24 and 51, and the tangents at the opposite vertexes 2 and 5 intersect each other on a straight line.
Since we could just as easily choose the two other opposite vertexes, the point of intersection of the tangents at these vertexes also lies on the Pascal line. We therefore obtain the following

Corollary 2 (Figure 66): In every tetragon inscribed in a conic section all the pairs of opposite sides and tangents at the pairs of opposite vertexes intersect on a straight line.

Fig. 67.
III. The vertexes 1 and 2 coincide, so do vertexes 3 and 4, and so do vertexes 5 and 6; the hexagon becomes a triangle, and we obtain

Corollary 3 (Figure 67): In every triangle inscribed in a conic section the sides intersect with the tangents at the opposite vertexes on a straight line.

62 Brianchon's Hexagram Theorem

To demonstrate that the three opposite vertex lines of a hexagram circumscribed about a conic section pass through a point.

A hexagram circumscribed about a conic section consists essentially of six tangents I, II, III, IV, V, VI to the conic section, which are the sides of the hexagram, and the six points of intersection I II, II III, III IV, IV V, V VI, VI I forming the vertexes of the hexagram. The vertexes I II and IV V, the vertexes II III and V VI, and the vertexes III IV and VI I are called opposite vertexes, and the lines connecting them are called opposite vertex lines. The point through which the three opposite vertex lines pass is called the Brianchon point and the hexagram the Brianchon hexagram.

The theorem to be proved can be stated in a somewhat shorter form as follows. The three opposite vertex lines of a Brianchon hexagram pass through a point.

This theorem, which is as important in the theory of conic sections as the Pascal theorem, was published in 1810 by the French mathematician Brianchon (1785-1864) in the Journal de l'École Polytechnique.

The following projective proof of Brianchon's theorem is based on the two theorems of Steiner:

I. The tangents of a conic section cut two of the tangents into projective ranges of points.

II. If in the projectivity between two ray pencils the line joining the pencil centers corresponds to itself, the pencils are perspective.

Proof of I. We first prove I. for a circle. For this purpose let us consider the following structures:

1. the range of points ℜ through which a moving point P on the circle passes;
2. the pencil 𝔅 of the rays FP that run from the fixed circle point F to the moving point P;
3. the field 𝔖 of tangents t drawn to the different positions of P;
4. the range 𝔯 of the points of intersection S of these tangents with the fixed circle tangent f through F;
5. finally, the pencil 𝔟 of the rays MS that run from the center point M of the circle to S.

Then ℜ, 𝔅, and 𝔖 are projective by definition, 𝔅 and 𝔟 are projective because they are congruent (every ray from 𝔅 is perpendicular to the corresponding ray from 𝔟), and finally 𝔯 and 𝔟 are projective because they are perspective. Consequently, 𝔖 and 𝔯 are projective. I.e.: A field of tangents to a circle is projective with respect to the range of points that the tangents of the field generate on an arbitrary fixed tangent. From this it follows directly that: The tangents of a circle cut two of them into projective ranges of points.

We will now prove theorem I. for a conic section. The conic section is the central projection of a circle in which its tangents are perspectives of circle tangents. In this projection the ranges of points mentioned appear as perspectives of the two ranges that the circle tangents generate on the two fixed circle tangents, which correspond to the chosen conic section tangents in the central projection. Now, since the latter ranges are projective, the former must also be.

Proof of II. is given in No. 59.

Now for the proof of Brianchon's theorem! Let the sides of the hexagram be I, II, III, IV, V, VI. According to auxiliary theorem I., the points of intersection generated on tangents I and III by II, IV, V, VI form projective ranges of points, and consequently the junction lines II', IV', V', VI' and II", IV", V", VI" of these points with the points (centers) V IV and V VI form
projective pencils. Since in the line V connecting the centers, corresponding rays (V' = V") coincide, the pencils are perspective according to auxiliary theorem II., and the rays II' and II", IV' and IV", and VI' and VI" intersect on one straight line, the axis of perspectivity, the junction line a of the points IV' IV" and VI' VI", i.e., of the points III IV and VI I. In other words: The opposite vertex lines II' (from I II to IV V), II" (from II III to V VI), and a (from III IV to VI I) pass through one point, the Brianchon point. Q.E.D.

The converse of Brianchon's theorem: If the opposite vertex lines of a hexagram (of which no three sides pass through one point) pass through a point, the sides of the hexagram form tangents of a conic section.

Indirect proof, similar to the proof of the converse of Pascal's theorem (No. 61).

If two sides of the Brianchon hexagram coincide once or twice or three times, we obtain the corollaries of the Brianchon theorem, the most important of which we will here mention.

I. The sides V and VI coincide; this is to be considered as a situation in which side VI comes closer and closer to side V and finally coincides with it. The point of intersection V VI then becomes the point of tangency of the tangent V, and the hexagram becomes the pentagram I II III IV V. Brianchon's theorem then assumes the following form:

Corollary 1 (Figure 69): In every pentagram circumscribed about a conic section the lines joining two pairs of nonadjacent vertexes and the junction line of the fifth vertex with the point of tangency of its opposite side pass through one point.

Fig. 69.
II. The sides V and VI coincide, and the sides II and III coincide; here the hexagram becomes the tetragram I II IV V. Now the junction lines of the opposite vertexes I II and IV V, as well as those of II IV and V I, and also the junction line of the tangency points of II and V pass through one point. Since we could as easily select the tangency points of the opposite sides I and IV, their junction line also passes through the Brianchon point. Consequently, we obtain

Corollary 2 (Figure 70): In every tetragram circumscribed about a conic section the two diagonals and the two tangency chords of the opposite sides pass through one point.

Fig. 71.
III. The sides I and II coincide, the sides III and IV coincide, and the sides V and VI also coincide; the hexagram becomes a trigram, and we obtain

Corollary 3 (Figure 71): In every triangle circumscribed about a conic section the lines connecting the vertexes with the tangency points of the opposite sides pass through one point.

63 Desargues' Involution Theorem

The points of intersection of a line with the three pairs of opposite sides of a complete tetragon* and a conic section circumscribed about this tetragon form four point pairs of an involution.

The lines joining a point with the three pairs of opposite vertexes of a complete tetragram* and the tangents drawn from the point to a conic section inscribed in the tetragram form four ray pairs of an involution.

It is here assumed that the line does not pass through a corner of the tetragon and that the point does not lie on a side of the tetragram.

This double theorem was formulated and proved in 1639 by Desargues (No. 59) in his major work on conic sections. The work bears the strange title Brouillon-Projet d'une atteinte aux événements des rencontres d'un cône avec un plan, or approximately in English "First Draft of a Projected Essay on the Phenomena Arising from the Intersection of a Cone with a Plane."

Desargues was the source of the concept of involution and of an amazing series of involution theorems as well, so that it seems appropriate at this point to take up briefly for readers unfamiliar with it the most significant properties of involution.

In a conjective projectivity (No. 59) between two homologous structures I and II each element of a common base can be assigned to I as well as II. 
Now, if there are two elements A and B of the base such that to the element A of I there corresponds the element B of II and simultaneously to the element B of I there corresponds the element A of II, we say that the elements A and B are conjugate (to each other) or correspond to each other in double fashion. * A complete tetragon (tetragram) consists essentially of four points (lines) 1, 2, 3, 4 and their six connecting lines (points of intersection) 23, 14, 31, 24, 12, 34, of which 23 and 14, 31 and 24, 12 and 34 are known as opposite sides (opposite vertexes).
Let us consider in addition to the conjugate point pair (A, B) another arbitrary pair of homologous elements: P from I and Q from II. From the equation (ABPQ) = (BAQP) it then follows that to the element Q from I there also corresponds the element P from II, i.e., P and Q are also conjugate. Thus, if one pair of homologous elements in a conjective projectivity is composed of conjugate elements, then every pair is composed of conjugate elements.

A conjective projectivity in which every two homologous elements are conjugate is called an involution or an involutional projectivity. Every pair of conjugate elements is called for short an element pair of the involution.

Fig. 72.

Since a projectivity is fixed by three elements of one structure and the homologous elements of the other, an involution is determined by two pairs A, A' and B, B' of conjugate elements insofar as the elements A, A', B of the one structure correspond to the elements A', A, B' of the other.

Construction of an involution, i.e., construction of an element P' corresponding to an arbitrary element P, is most effectively accomplished by means of Desargues' involution theorem (where conic sections do not enter into the picture). Let us say, for example, that we are concerned with the involution of two ranges of points. Let (A, A') and (B, B') be the given point pairs of the involution, C an additional given point of the base 𝔤, and C' the homolog of C we are looking for. We draw through A, B, C three lines that form a triangle 1 2 3 (A on 23, B on 31, C on 12), connect A' with 1, B' with 2, and the point of intersection 4 of these connecting lines with 3. Then 34 meets the base at C'. (The opposite side pairs 23 and 14, 31 and 24, 12 and 34 of the tetragon 1 2 3 4 cut 𝔤 at the point pairs (A, A'), (B, B'), and (C, C') of the Desargues involution.) The construction of the involution between two ray pencils is carried out in a very similar fashion.

We will now consider the important case of the involution on a circle. Let (A, A') and (B, B') be two point pairs of an involution between two ranges of points of a circle (Figure 73). We connect the points of both sets with the circle points A and A'. We thereby obtain two projective ray pencils in which the rays AA', AB, AB' of the first pencil correspond to the rays A'A, A'B', A'B of the second pencil. Since the junction line AA' of the pencil centers corresponds to itself, the pencils are perspective (No. 59). The axis of perspectivity is the junction line of the points of intersection Z of AB and A'B' and O of AB' and BA'. In order to find the homolog C' in the involution of an arbitrary point C, we cause AC and OZ to intersect at Y and connect Y with A'; the connecting line meets the circle at C'. Since we can just as well undertake the whole consideration with the pencil centers B and B' (instead of A and A'), we also obtain C' when we cause BC and OZ to intersect and connect the point of intersection X with B'.

Since the homologous sides (bearing the same letter designation) of triangles ABC and A'B'C' intersect on a straight line (XYZ), then,
according to Desargues' homology theorem (No. 59), the junction lines AA', BB', and CC' of the homologous vertexes pass through one point S. If we then draw through S any secant, this secant cuts the circle at two conjugate points of the involution.

The result of our consideration is the theorem: The lines joining the conjugate points of an involution on a circle pass through a fixed point. And conversely: A secant rotated about a fixed point cuts a circle at the point pairs of an involution.

In quite similar fashion the following theorem is proved: The points of intersection of conjugate tangents of an involution on a circle lie on a straight line. And conversely: If a point moves on a line, the tangents drawn from this point to a circle generate an involution on the circle (Figure 74).

Moreover, since every conic section is the central projection of a circle, and projectivity, and thus also involution, between two structures is not annulled by projection of these structures (Pappus' theorem, No. 59), the two just stated theorems are valid for conic sections as well:

Involution on a conic section: The lines connecting conjugate points of an involution on a conic section pass through a fixed point. The points of intersection of conjugate tangents of an involution on a conic section lie on a fixed straight line.
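The converse statement for the circle (a secant rotated about a fixed point cuts the circle at the point pairs of an involution) can be tried out numerically. The sketch below rests on assumptions made only for the illustration: the circle is the unit circle, the fixed point is called S, the function names are invented, and the cross ratio of four circle points is computed from the slopes of the rays that join them with the fifth circle point F = (-1, 0).

```python
import math

def second_intersection(P, S):
    """Second point in which the line through S and the unit-circle point P
    meets the circle (S is any point off the circle)."""
    dx, dy = S[0] - P[0], S[1] - P[1]
    # P + u*(dx, dy) lies on the circle for u = 0 and for the value below
    u = -2.0 * (P[0]*dx + P[1]*dy) / (dx*dx + dy*dy)
    return (P[0] + u*dx, P[1] + u*dy)

def circle_cross_ratio(P1, P2, P3, P4, F=(-1.0, 0.0)):
    """Cross ratio of four points of the unit circle, taken as the cross ratio
    of the rays joining them with the circle point F (none of them may be F)."""
    m = [(P[1] - F[1]) / (P[0] - F[0]) for P in (P1, P2, P3, P4)]  # ray slopes
    return ((m[2] - m[0]) / (m[2] - m[1])) / ((m[3] - m[0]) / (m[3] - m[1]))

S = (0.3, 0.2)                                     # hypothetical fixed point
pts = [(math.cos(t), math.sin(t)) for t in (0.4, 1.3, 2.5, 4.0)]
imgs = [second_intersection(P, S) for P in pts]    # the conjugate points
print(circle_cross_ratio(*pts), circle_cross_ratio(*imgs))
```

The two printed cross ratios should agree up to rounding, which is what the projectivity of the correspondence requires; applying `second_intersection` a second time returns each point to its starting position, in agreement with the double correspondence of conjugate points.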
And conversely: A secant rotated about a fixed point cuts a conic section at the point pairs of an involution. The tangents from a point moving along a fixed straight line to a conic section are tangent pairs of an involution.

The proof of Desargues' involution theorem is based on the theorems: The points of a conic section are projected from pairs of themselves by projective pencils (No. 61). The tangents of a conic section cut two of the tangents into projective ranges of points (No. 62).

Let 1 2 3 4 be an inscribed tetragon. Let the line g cut the sides 23, 31, 12 at A, B, C, the opposite sides 14, 24, 34 at A', B', C', the conic section at S and S'. We connect the conic section points 2, 3, S, S' with 1 and 4 and obtain two projective pencils with the centers 1 and 4, so that the projections 12, 13, 1S, 1S' and 42, 43, 4S, 4S' are projective.

Fig. 75.

Let I II III IV be a circumscribed tetragram. Let the lines connecting the point P with the vertexes II III, III I, I II be a, b, c, with the opposite vertexes I IV, II IV, III IV a', b', c'. Let the tangents from P to the conic section be t and t'. We cut the conic section tangents II, III, t, t' with I and IV and obtain two projective ranges of points on the bases I and IV, so that the projections I II, I III, I t, I t' and IV II, IV III, IV t, IV t' are projective.

Fig. 76.

We cause these pencils to intersect with g and obtain two conjective projective ranges of points with the base g in which CBSS' ⊼ B'C'SS', i.e., (CBSS') = (B'C'SS').

We project these ranges from P and obtain two conjective projective ray pencils with the center P in which cbtt' ⊼ b'c'tt', i.e., (cbtt') = (b'c'tt').

We now switch the first two terms with each other and the second two terms with each other on the right-hand side and obtain

(CBSS') = (C'B'S'S), so that CBSS' ⊼ C'B'S'S. In this projection there are two conjugate points S and S'. Consequently, the projectivity is an involution, and the points B and B', as well as the points C and C', are conjugate. If we connect the conic section points 3, 1, S, S' with 2 and 4, and undertake the same considerations, we find that (ACSS') = (A'C'S'S), so that in the involution defined by the point pairs (S, S') and (C, C') the points A and A' are also conjugate. Accordingly, (A, A'), (B, B'), (C, C'), and (S, S') are point pairs of an involution.

(cbtt') = (c'b't't), so that cbtt' ⊼ c'b't't. In this projection there are two conjugate rays t and t'. Consequently, the projectivity is an involution, and the rays b and b', as well as the rays c and c', are conjugate. If we cut the conic section tangents III, I, t, t' with II and IV, and undertake the same considerations, we find that (actt') = (a'c't't), so that in the involution defined by the ray pairs (t, t') and (c, c') the rays a and a' are also conjugate. Accordingly, (a, a'), (b, b'), (c, c'), and (t, t') are ray pairs of an involution.

Thus Desargues' theorem is proved.

Special Cases

We maintain fixed the conic section, the three vertexes 1, 2, 3, and the straight line g; we allow the vertex 4, on the other hand, to travel on the conic section toward the point 3. The secant 34 then comes closer and closer to the tangent at 3, while at the same time point A' comes closer and closer to point B and point B' closer and closer to point A. When 4 reaches 3, 43 becomes a tangent through 3, and A' coincides with B and B' with A.

We maintain fixed the conic section, the three sides I, II, III, and the point P; we allow the side IV to roll along the conic section into position III. The vertex III IV then comes closer and closer to the point of tangency of the tangent III, while at the same time the ray a' comes closer and closer to the ray b and the ray b' comes closer and closer to the ray a. When IV coincides with III, IV III becomes the tangency point of III, and a' coincides with b and b' with a.

Consequently, we obtain

Corollary 1

The points of intersection of a straight line: 1. with a conic section, 2. with two sides of a triangle inscribed in a conic section, 3. with the third side of the triangle and the conic section tangent passing through its opposite vertex are three point pairs of an involution.

Fig. 77.

1. The tangents from a point to a conic section, 2. the lines joining the point with two vertexes of a trigram circumscribed about a conic section, 3. the lines joining the point with the third vertex of the trigram and the point of tangency on its opposite side are three ray pairs of an involution.

Fig. 78.

If we maintain fixed the conic section in the figure obtained, the line g, and the vertexes 1 and 3, and let 2 travel toward 1, then 12 approaches more and more closely the tangent through 1 and A the point A'. When 2 reaches 1, 12 becomes the tangent through 1, A coincides with A', and C falls on the tangent through 1.

Fig. 79.

If we maintain fixed the conic section in the figure obtained, the point P, and the sides I and III, and let II roll toward I, the point I II approaches more and more closely the tangency point of I and a the ray a'. When II reaches I, I II becomes the tangency point of I, a coincides with a', and c passes through the tangency point of I.

Fig. 80.

Thus, we have

Corollary 2

Given a conic section with two tangents and their corresponding tangency chord (Figures 79 and 80):

If the points of intersection of an arbitrary line with the conic section are chosen as the first pair, the points of intersection with the given tangents as the second pair of an involution, the point of intersection of the tangency chord with the line is a double point of the involution.

If the tangents drawn to a conic section from an arbitrary point are chosen as the first pair, and the rays from the point to the ends of the tangency chord as the second pair of an involution, the line joining the point with the point of intersection of the given tangents is a double ray of the involution.

Note. Through the four corners of a tetragon there pass an infinite number of conic sections, which form a so-called conic section pencil. The (complete) tetragon is called a fundamental tetragon in this context.
Similarly, there are an infinite number of conic sections that are tangent to the four sides of a tetragram; they form a so-called field of conic sections. The (complete) tetragram in this context is called a fundamental tetragram.

Since Desargues' theorem applies to every one of these conic sections, we can state the theorem in the following manner, which is its most general and shortest form.

Desargues' involution theorem:

The intersection point pairs of a line with the conic sections of a pencil are point pairs of an involution.

The tangent pairs from a point to the conic sections of a field are ray pairs of an involution.

Here the opposite side pairs of the fundamental tetragon are to be considered as (degenerate) conic sections of the pencil, and the opposite vertex pairs of the fundamental tetragram as (degenerate) conic sections of the field.

A Conic Section from Five Elements

To draw a conic section of which five elements (points and tangents) are known.

In the solution of this fundamental problem we distinguish three cases:
I. the five elements are of the same type;
II. four elements are of the same type, but the fifth is of the other;
III. three elements are of one type, two are of the other.

In the following we will designate the conic section as K.

I. To draw a conic section from five points.

This problem is commonly solved by means of Pascal's theorem. We number the points in an arbitrary sequence from 1 to 5 and designate as 6 the unknown point of intersection of an arbitrary line g = 56, passing through 5, with K. We then draw the Pascal line p of the hexagon 1 2 3 4 5 6 as the line connecting the point of intersection of the opposite sides 12 and 45 with the point of intersection of the opposite sides 23 and 56 = g. The line joining the point of intersection of the two lines 34 and p with the vertex 1 cuts g (= 56) at the sought-for point 6. By repeating the construction with another line g we can obtain as many points of K as we desire.

In order to draw the tangent to K at one of the five known points 1, 2, 3, 4, 5 of a conic section, let us say at 5, we make use of the first corollary to Pascal's theorem. We draw the point of intersection of the two sides 51 and 43, also the point of intersection of the sides 54 and 12, and allow the line p connecting these two points to intersect the side 23. The line connecting the resulting point of intersection with the vertex 5 is the sought-for tangent at 5.

I. To draw a conic section from five tangents.

This problem is commonly solved by means of Brianchon's theorem. We number the tangents in an arbitrary sequence from I to V and designate as VI the unknown tangent drawn to K from an arbitrary point P = V VI of tangent V. We then draw the Brianchon point B of the hexagram I II III IV V VI as the point of intersection of the line connecting the opposite vertexes I II and IV V with the line connecting the opposite vertexes II III and V VI = P. The point of intersection of the line connecting the two points III IV and B with the side I is a second point of the sought-for tangent VI. By repeating the construction with other points P we can obtain as many tangents of K as we desire.

To draw on one of five known tangents I, II, III, IV, V to a conic section, let us say on V, the point of tangency with K, we make use of the first corollary to Brianchon's theorem. We draw the line connecting the two vertexes V I and IV III and the line connecting the two vertexes V IV and I II, and connect the point of intersection B of the two lines with the vertex II III. This new junction line meets the tangent V at the sought-for point of tangency.

II. To draw a conic section of which four points 1, 2, 3, 4 and one tangent t are given.

First case: The tangent t passes through one of the given points, for example, through 4.

Let us consider the tangent t as the line connecting two infinitely close conic section points 4 and 5, so that t = 45, and let us designate as 6 the point of intersection of K with an arbitrary line x starting from 1, so that x = 16. We then draw the Pascal line p of the hexagon 1 2 3 4 5 6 as the line connecting the point of intersection of the opposite sides 12 and 45 = t with the point of intersection of the opposite sides 34 and 61 = x. The line connecting the point of intersection of the lines p and 23 with the vertex 4 meets x at the sought-for point 6. We now have five known points of K, and the problem is reduced to I.

Second case: The tangent t does not pass through any of the given points.

To solve this problem we use Desargues' involution theorem (No. 63), taking t as the involution base. We determine the points of intersection, let us say A, A', B, B', of the sides 12, 34, 23, 41 of the tetragon 1 2 3 4 with t and draw a double point of the involution determined on t by the two point pairs (A, A') and (B, B'); this is the point of tangency of the tangent t. Now five points of K are known and the problem is reduced to I.

II. To draw a conic section of which four tangents I, II, III, IV and one point P are given.

First case: The point P lies on one of the given tangents, for example, on IV.

Let us consider the point P as the point of intersection of two infinitely close conic section tangents IV and V, so that P = IV V, and let us designate as VI a second tangent from an arbitrary point X of I to K, so that X = I VI. We then draw the Brianchon point B of the hexagram I II III IV V VI as the point of intersection of the line connecting the opposite vertexes I II and IV V = P and the line connecting the opposite vertexes III IV and VI I = X. The point of intersection of the line connecting the points B and II III with the side IV is a second point of the sought-for tangent VI. We now have five known tangents of K, and the problem is thereby reduced to I.

Second case: The point P does not lie on any of the given tangents.

To solve this problem we make use of Desargues' involution theorem (No. 63), taking P as the involution base. We determine the junction lines a, a', b, b' connecting the vertexes I II, III IV, II III, IV I of the tetragram I II III IV with P and construct the double ray of the involution determined on P by the two ray pairs (a, a') and (b, b'); this is the conic section tangent passing through P. We now have five known tangents of K and the problem thus reduces to I.
The second case of II. has two solutions if the involution has two double elements and no solution if the involution has no double elements.

III. To draw a conic section of which three points A, B, C and two tangents d and e are given.

First case: d passes through A, and e through B.

We draw the point of intersection S of an arbitrary line g originating at A with K. For our purpose we construct the Pascal line p of the hexagon 1 2 3 4 5 6 of which the vertexes 1 and 2 coincide with A, the vertexes 3 and 4 with B, the vertex 5 with C, and the vertex 6 with S, the sides 12 and 34 being represented by the tangents d and e, respectively. p is the line connecting the point of intersection of the sides 12 = d and 45 = BC with the point of intersection of the sides 34 = e and 61 = g. The line connecting the point of intersection of the lines p and 23 = AB with the vertex 5 = C meets g at the sought-for conic section point S. In the same way we draw a fifth point of K and thus reduce the problem to I.

III. To draw a conic section of which three tangents a, b, c and two points D and E are given.

First case: D lies on a, and E on b.

We draw the (second) tangent t from an arbitrary point P of tangent a to K. For our purpose we construct the Brianchon point B of the hexagram I II III IV V VI of which the sides I and II coincide with a, the sides III and IV with b, the side V with c, and the side VI with t, the vertexes I II and III IV being represented by the points D and E, respectively. B is the point of intersection of the line connecting the vertexes I II = D and IV V = bc and the line connecting the vertexes III IV = E and VI I = P. The point of intersection of the line connecting points B and II III = ab with the side V = c is a second point of the sought-for tangent t. In the same way we draw a fifth tangent of K and thereby reduce the problem to I.

Second case (points and two tangents): d passes through A, and e does not pass through any of the given points.

Second case (tangents and two points): D lies on a, and E does not lie on any of the given tangents.

We solve this case with the second corollary to Desargues' involution theorem.
We determine the points of intersection D and E of the line BC with d and e and construct a double point of the involution defined by the point pairs (B, C) and (D, E). Its junction line with A passes through the point of tangency of e.

We determine the connecting lines d and e joining the point bc with D and E and draw a double ray of the involution determined by the ray pairs (b, c) and (d, e). Its point of intersection with a lies on the tangent passing through E; this tangent is thus determined.

The problem is now reduced to the preceding case.

Third case (points and two tangents): Neither of the two tangents passes through any of the given points.

Third case (tangents and two points): Neither of the two points lies on any of the given tangents.

In this case also the solution is based on the second corollary to Desargues' involution theorem.

We designate the points of intersection of BC with d and e as D and E and determine a double point P of the involution defined by the point pairs (B, C) and (D, E). It lies on the tangency chord of the tangents d and e. We designate the points of intersection of CA with d and e as D' and E' and draw a double point P' of the involution determined by the point pairs (C, A) and (D', E'). This double point also lies on the tangency chord of the tangents d and e. The line joining the two double points P and P' is thus the tangency chord we have mentioned and meets the tangents d and e at their tangency points. We now know five points of K and thus return to I.

We designate the lines joining bc with D and E as d and e and determine a double ray s of the involution determined by the ray pairs (b, c) and (d, e). It passes through the point of intersection of the tangents drawn through D and E. We designate the lines joining ca with D and E as d' and e' and draw a double ray s' of the involution determined by the ray pairs (c, a) and (d', e'). This double ray also passes through the point of intersection of the tangents through D and E. The point of intersection of the two double rays s and s' is thus the tangent intersection point that was mentioned before, and the lines joining it to D and E are the tangents passing through D and E. We now have five tangents of K and thus return to I.
This last problem admits of a solution only when each of the two designated involutions has double elements. And since we can connect each of the two double elements of one of the involutions with each of the double elements of the other, we obtain four possible tangency chords and tangent intersection points, respectively, and thus four different conic sections.

A Conic Section and a Straight Line

To draw the points of intersection of a given straight line with a conic section of which five elements (points and tangents) are known.

In the solution of this problem we may assume, in view of No. 64, that five points of the conic section are known. The solution is then based on the theorem: The points of a conic section are projected from pairs of themselves by projective pencils (No. 61) and on Steiner's double element construction (No. 60).

Let the given line be called g, the given points of the conic section A, B, C, D, E. We can think of the points of the conic section as projected from D and E by the two projective pencils I and II. These pencils cut g in the two projective ranges of points 1 and 2. The points of intersection S and T of g with the conic section are the double elements of the projectivity 1 ⊼ 2. This projectivity is, however, determined by the points of intersection A1, B1, C1 of the rays DA, DB, DC with g and the homologous points of intersection A2, B2, C2 of the rays EA, EB, EC with g. We therefore draw according to Steiner the double elements of the projectivity defined on g by the homologous point triplets (A1, B1, C1) and (A2, B2, C2); they are the points of intersection we are looking for.

A Conic Section and a Point

To draw the tangents from a given point to a conic section of which five elements (points and tangents) are known.

In view of the considerations of No. 64, we may assume the given conic section elements to be tangents.
The solution to this problem is based upon the theorem: The tangents of a conic section mark off projective ranges of points on two of the tangents (No. 62) and on Steiner's double element construction (No. 60).
Let the given point be P, the given tangents a, b, c, d, e. Let us consider the tangents of the conic section as intersecting with d and e, so that we obtain on d and e the projective ranges 1 and 2 in which the points of intersection A1, B1, C1 of the tangents a, b, c with d and the points of intersection A2, B2, C2 of the tangents a, b, c with e are homologous elements. The projections of these ranges of points from P thus form two projective ray pencils I and II. The (conjective) projectivity is determined by the lines aI, bI, cI connecting the points of intersection A1, B1, C1 to P and the homologous connecting lines aII, bII, cII joining the points of intersection A2, B2, C2 to P. Since each of the two tangents s and t from P to the conic section cuts 1 and 2 in homologous elements, s and t are the double elements of the projectivity I ⊼ II. We thus draw according to Steiner the double elements of the conjective projectivity determined by the homologous ray triplets (aI, bI, cI) and (aII, bII, cII); they are the sought-for tangents.
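The two preceding problems both end by finding the double elements of a projectivity fixed by three homologous pairs. The text does this by Steiner's geometric construction; the sketch below is only an algebraic stand-in, not that construction. It assumes each element is given by a real coordinate on the common base (an assumption not made in the text), writes the projectivity as y = (ax + b)/(cx + d), and solves the resulting quadratic. The function names are hypothetical.

```python
from fractions import Fraction
from math import sqrt


def det3(m):
    """Determinant of a 3x3 matrix given as three rows."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def projectivity_from_pairs(pairs):
    """Coefficients (a, b, c, d) of y = (a x + b)/(c x + d) through three pairs (x, y).

    Each pair gives one linear condition a x + b - c x y - d y = 0; the
    solution of the 3x4 system is read off from its signed 3x3 minors.
    """
    rows = [
        (Fraction(x), Fraction(1), -Fraction(x) * Fraction(y), -Fraction(y))
        for x, y in pairs
    ]
    coeffs = []
    for j in range(4):
        minor = [[row[k] for k in range(4) if k != j] for row in rows]
        coeffs.append((-1) ** j * det3(minor))
    return tuple(coeffs)


def double_elements(pairs):
    """Real double (self-corresponding) elements of the projectivity, sorted.

    Returns an empty tuple when there are none. Assumes no double element
    lies at infinity (c != 0).
    """
    a, b, c, d = projectivity_from_pairs(pairs)
    if c == 0:
        raise ValueError("a double element lies at infinity; not handled here")
    # x is double when x (c x + d) = a x + b, i.e. c x^2 + (d - a) x - b = 0
    disc = (d - a) ** 2 + 4 * b * c
    if disc < 0:
        return ()
    root = sqrt(disc)
    x1 = ((a - d) - root) / (2 * c)
    x2 = ((a - d) + root) / (2 * c)
    return tuple(sorted((float(x1), float(x2))))


# Illustrative data: three pairs taken from y = (2x + 3)/(x + 4)
pairs = [(0, Fraction(3, 4)), (1, 1), (2, Fraction(7, 6))]
print(double_elements(pairs))
```

An empty result corresponds to a line that misses the conic, or a point from which no real tangents can be drawn.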
Stereometric Problems
Steiner's Division of Space by Planes

What is the maximum number of parts into which a space can be divided by n planes?

This very interesting problem appears in Steiner's paper "Several laws governing the division of planes and space" (Crelle's Journal, vol. I and Steiner's Complete Works, vol. I).

We first solve the preliminary problem: What is the maximum number of parts into which a plane can be divided by n straight lines?

The number of parts will evidently be maximal when no two lines are parallel and no more than two lines pass through one point. In the following we will assume these two conditions to be satisfied and we will designate the corresponding number of surface sections generated by the n lines as $\bar{n}$.

Thus, let the plane be divided by n lines into $\bar{n}$ surface sections. We now draw one additional line. This line is cut by the first n lines at n points, and thus traverses n + 1 of the available $\bar{n}$ surface sections, dividing each of them into two parts, so that the (n + 1)th line increases the number of surface sections by n + 1. Consequently, we obtain the equation

$$\overline{n+1} = \bar{n} + (n + 1).$$

We then apply this equation successively, starting from the undivided plane, and form the n equations

$$\bar{1} = 1 + 1, \quad \bar{2} = \bar{1} + 2, \quad \bar{3} = \bar{2} + 3, \quad \ldots, \quad \bar{n} = \overline{n-1} + n.$$

Addition of these equations results in

$$\bar{n} = 1 + (1 + 2 + 3 + \cdots + n)$$

or, since the sum of the first n natural numbers is n(n + 1)/2,

$$\bar{n} = 1 + \frac{n(n+1)}{2}. \tag{1}$$
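As a quick numerical check of the argument above, the following sketch builds the plane count from the recurrence and compares it with formula (1). The function names are illustrative only and do not come from the text.

```python
def plane_parts_by_recurrence(n: int) -> int:
    """Maximum number of plane regions for n lines, built step by step.

    Uses the rule derived above: the (k + 1)th line adds k + 1 regions.
    """
    parts = 1  # with no lines the plane is a single region
    for k in range(n):
        parts += k + 1
    return parts


def plane_parts_closed_form(n: int) -> int:
    """Formula (1): 1 + n(n + 1)/2."""
    return 1 + n * (n + 1) // 2


for n in range(6):
    print(n, plane_parts_by_recurrence(n), plane_parts_closed_form(n))
```

The same step-by-step pattern carries over to the space problem, where the (n + 1)th plane adds as many parts as n lines cut from a plane.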
Thus, the maximum number of parts into which a plane can be divided by n lines is $(n^2 + n + 2)/2$.

The obtained result is easily confirmed for the cases n = 1, 2, 3, ....

Now for the space problem! It is apparent that the number of partial spaces attains a maximum when no more than three planes ever intersect at one point and when no two of the lines of intersection of the planes are ever parallel. We will therefore assume that these conditions are satisfied in the following, and we designate the number of partial spaces formed by n planes as $\hat{n}$ (the number of plane sections formed by n lines being $\bar{n}$, as before).

Then, let the space be divided by n planes into $\hat{n}$ partial spaces. To these planes we now add one additional plane. This plane is cut by the original n planes in n lines of which no more than two pass through a single point and no two or more are parallel. The new (n + 1)th plane is therefore divided by the n lines into $\bar{n}$ surface sections. Each of these $\bar{n}$ surface sections cuts the partial space that it traverses into two smaller spaces, so that the addition of the (n + 1)th plane increases the number of the partial spaces originally present by $\bar{n}$. This gives us the equation

$$\widehat{n+1} = \hat{n} + \bar{n}.$$

We form this equation successively, starting from the undivided space, and obtain the n equations

$$\hat{1} = 1 + 1, \quad \hat{2} = \hat{1} + \bar{1}, \quad \hat{3} = \hat{2} + \bar{2}, \quad \ldots, \quad \hat{n} = \widehat{n-1} + \overline{n-1}.$$

Addition of these equations results in

$$\hat{n} = 2 + \bar{1} + \bar{2} + \bar{3} + \cdots + \overline{n-1}$$

or, according to (1),

$$\hat{n} = n + 1 + \tfrac{1}{2}\bigl(1\cdot 2 + 2\cdot 3 + \cdots + (n-1)n\bigr).$$

If we then divide each product $\nu(\nu + 1)$ into $\nu^2 + \nu$, we obtain

$$\hat{n} = n + 1 + \tfrac{1}{2}\bigl\{[1^2 + 2^2 + \cdots + (n-1)^2] + [1 + 2 + \cdots + (n-1)]\bigr\}.$$

Now, according to No. 11, the sums in the first and second square brackets are $\tfrac{1}{6}(n-1)n(2n-1)$ and $\tfrac{1}{2}(n-1)n$, respectively;
the brace thus equals $\tfrac{1}{3}(n-1)n(n+1)$, and

$$\hat{n} = n + 1 + \tfrac{1}{6}(n-1)n(n+1)$$

or

$$\hat{n} = \frac{n^3 + 5n + 6}{6}.$$

Conclusion: The maximum number of parts into which a space can be divided by n planes is $(n^3 + 5n + 6)/6$.

Euler's Tetrahedron Problem

To express the volume of a tetrahedron in terms of its six edges.

This fundamental problem was posed and solved by Leonhard Euler (Novi Commentarii Academiae Petropolitanae ad annos 1752 et 1753).

The following convenient and simple solution is based upon vector calculus.

We will designate the vertexes of the tetrahedron as A, B, C, O, the six edges BC, CA, AB, OA, OB, OC as a, b, c, p, q, r, the three vectors $\overrightarrow{OA}$, $\overrightarrow{OB}$, $\overrightarrow{OC}$ as $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$, and the volume we are looking for as T. We will consider the edges p, q, r originating from the vertex O as being so arranged that they form a right-handed system, i.e., that p can be imagined as the thumb, q as the index finger, and r as the middle finger of the right hand.

If we take the triangle OAB as the base surface and the vertex C as the apex of the tetrahedron, then twice the area of the base surface is given by the magnitude S of the vector product $\mathbf{S} = \mathbf{p} \times \mathbf{q}$, and the altitude CF is the projection of the edge r on CF, i.e., $r\,o$, if we designate as $o$ the cosine of the angle between CO and CF or also of the angle between the two vectors $\mathbf{S}$ and $\mathbf{r}$. Consequently, six times the tetrahedron volume is equal to $S \cdot r\,o$, or equal to the scalar product* $\mathbf{S} \cdot \mathbf{r}$ of the vectors $\mathbf{S}$ and $\mathbf{r}$. Thus, we obtain the simple formula

$$6T = \mathbf{p} \times \mathbf{q} \cdot \mathbf{r},$$

which can be stated verbally as follows:

Six times the volume of a tetrahedron is equal to the mixed product of the three vectorial edges originating from one vertex of the tetrahedron.

* The scalar product of two vectors $\mathbf{A}$ and $\mathbf{B}$ is most conveniently written $\mathbf{A} \cdot \mathbf{B}$ or in the still simpler form $\mathbf{A}\mathbf{B}$.
Here the three factors of the mixed product must be written in such sequence as to form a right-handed system (for otherwise the mixed product would represent six times the negative tetrahedron volume).

Fig. 81.

We now introduce a rectangular coordinate system with origin at O and designate the coordinates of the three vertexes A, B, C as $x|y|z$, $x'|y'|z'$, and $x''|y''|z''$. The three components of the vector $\mathbf{S} = \mathbf{p} \times \mathbf{q}$ are then

$$yz' - zy', \quad zx' - xz', \quad xy' - yx',$$

and the scalar product $\mathbf{S} \cdot \mathbf{r}$ is equal to

$$(yz' - zy')x'' + (zx' - xz')y'' + (xy' - yx')z'',$$

i.e., equal to the determinant whose rows are the components of the vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$. Thus we obtain the elegant formula

$$6T = \begin{vmatrix} x & y & z \\ x' & y' & z' \\ x'' & y'' & z'' \end{vmatrix}.$$

On squaring this formula, multiplying the two (identical) determinants row by row, we obtain

$$36T^2 = \Delta = \begin{vmatrix} xx + yy + zz & xx' + yy' + zz' & xx'' + yy'' + zz'' \\ x'x + y'y + z'z & x'x' + y'y' + z'z' & x'x'' + y'y'' + z'z'' \\ x''x + y''y + z''z & x''x' + y''y' + z''z' & x''x'' + y''y'' + z''z'' \end{vmatrix}$$

or, since the elements of this determinant are the scalar products of the vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ in pairs, or the squares of these vectors,

$$36T^2 = \begin{vmatrix} \mathbf{p}\mathbf{p} & \mathbf{p}\mathbf{q} & \mathbf{p}\mathbf{r} \\ \mathbf{q}\mathbf{p} & \mathbf{q}\mathbf{q} & \mathbf{q}\mathbf{r} \\ \mathbf{r}\mathbf{p} & \mathbf{r}\mathbf{q} & \mathbf{r}\mathbf{r} \end{vmatrix}. \tag{I}$$
This is Euler's tetrahedron formula. (Euler, however, expressed the right-hand side as an algebraic sum rather than as a determinant.) It contains the solution to the problem posed, since the elements of the determinant are simple expressions of the edges; specifically:

$$\mathbf{p}\mathbf{p} = p^2, \quad \mathbf{q}\mathbf{q} = q^2, \quad \mathbf{r}\mathbf{r} = r^2,$$

$$\mathbf{q}\mathbf{r} = \frac{q^2 + r^2 - a^2}{2}, \quad \mathbf{r}\mathbf{p} = \frac{r^2 + p^2 - b^2}{2}, \quad \mathbf{p}\mathbf{q} = \frac{p^2 + q^2 - c^2}{2}.$$

In the tetrahedron with the edges a = 11, b = 10, c = 9, p = 8, q = 7, r = 6, for example, we have $\mathbf{p}\mathbf{p} = 64$, $\mathbf{q}\mathbf{q} = 49$, $\mathbf{r}\mathbf{r} = 36$, $\mathbf{q}\mathbf{r} = -18$, $\mathbf{r}\mathbf{p} = 0$, $\mathbf{p}\mathbf{q} = 16$, and

$$36T^2 = \begin{vmatrix} 64 & 16 & 0 \\ 16 & 49 & -18 \\ 0 & -18 & 36 \end{vmatrix} = 16 \cdot 36 \begin{vmatrix} 4 & 16 & 0 \\ 1 & 49 & -9 \\ 0 & -1 & 1 \end{vmatrix} = 16 \cdot 36 \cdot 144$$

and T = 48.

We can put the obtained result into still another form. If we multiply each element of $\Delta$ by 2 and express the doubled scalar products by the squares P, Q, R, A, B, C of the edge magnitudes p, q, r, a, b, c, we obtain

$$288T^2 = \begin{vmatrix} 2P & P + Q - C & P + R - B \\ Q + P - C & 2Q & Q + R - A \\ R + P - B & R + Q - A & 2R \end{vmatrix}.$$

Now we distribute zeros at the left and minus ones at the bottom and obtain

$$288T^2 = \begin{vmatrix} 0 & 2P & P + Q - C & P + R - B \\ 0 & Q + P - C & 2Q & Q + R - A \\ 0 & R + P - B & R + Q - A & 2R \\ -1 & -1 & -1 & -1 \end{vmatrix}.$$
If we add the P-, Q-, and R-multiples of the last row to the first, second, and third rows, respectively, we obtain the somewhat simpler

$$288T^2 = \begin{vmatrix} -P & P & Q - C & R - B \\ -Q & P - C & Q & R - A \\ -R & P - B & Q - A & R \\ -1 & -1 & -1 & -1 \end{vmatrix}.$$

We now distribute zeros and ones at the top and right:

$$288T^2 = \begin{vmatrix} 0 & 0 & 0 & 0 & 1 \\ -P & P & Q - C & R - B & 1 \\ -Q & P - C & Q & R - A & 1 \\ -R & P - B & Q - A & R & 1 \\ -1 & -1 & -1 & -1 & 0 \end{vmatrix}.$$

If we now subtract the P-, Q-, and R-multiples of the last column from the second, third, and fourth columns, respectively, we finally obtain

$$288T^2 = \begin{vmatrix} 0 & -P & -Q & -R & 1 \\ -P & 0 & -C & -B & 1 \\ -Q & -C & 0 & -A & 1 \\ -R & -B & -A & 0 & 1 \\ -1 & -1 & -1 & -1 & 0 \end{vmatrix}$$

or, if we reverse all the minus signs,

$$288T^2 = \begin{vmatrix} 0 & P & Q & R & 1 \\ P & 0 & C & B & 1 \\ Q & C & 0 & A & 1 \\ R & B & A & 0 & 1 \\ 1 & 1 & 1 & 1 & 0 \end{vmatrix}. \tag{II}$$

In this remarkable formula P, Q, R, A, B, C are the squares of the edges p, q, r, a, b, c.
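Formulas (I) and (II) can be checked against each other numerically. The sketch below is a minimal illustration in exact rational arithmetic; the function names are hypothetical, and the small recursive determinant routine is a convenience of this sketch, not part of Euler's method. The edge labels a, b, c, p, q, r follow the text (a = BC, b = CA, c = AB, p = OA, q = OB, r = OC).

```python
from fractions import Fraction


def det(m):
    """Determinant by expansion along the first row (adequate for 5x5 at most)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def squares(a, b, c, p, q, r):
    """Squares P, Q, R, A, B, C of the edges p, q, r, a, b, c."""
    return tuple(Fraction(x) ** 2 for x in (p, q, r, a, b, c))


def volume_squared_I(a, b, c, p, q, r):
    """T^2 from formula (I): 36 T^2 is the determinant of scalar products."""
    P, Q, R, A, B, C = squares(a, b, c, p, q, r)
    pq = (P + Q - C) / 2
    qr = (Q + R - A) / 2
    rp = (R + P - B) / 2
    gram = [[P, pq, rp], [pq, Q, qr], [rp, qr, R]]
    return Fraction(det(gram)) / 36


def volume_squared_II(a, b, c, p, q, r):
    """T^2 from formula (II): 288 T^2 is the bordered determinant of squared edges."""
    P, Q, R, A, B, C = squares(a, b, c, p, q, r)
    bordered = [
        [0, P, Q, R, 1],
        [P, 0, C, B, 1],
        [Q, C, 0, A, 1],
        [R, B, A, 0, 1],
        [1, 1, 1, 1, 0],
    ]
    return Fraction(det(bordered)) / 288


# The worked example from the text: a = 11, b = 10, c = 9, p = 8, q = 7, r = 6
print(volume_squared_I(11, 10, 9, 8, 7, 6), volume_squared_II(11, 10, 9, 8, 7, 6))
```

A zero or negative value of T squared indicates that the six lengths do not belong to a genuine (non-flat) tetrahedron.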
Note: the four-point relation. If A, B, C, O are four points of a plane, the volume of the tetrahedron ABCO is zero and (I) is transformed into the so-called four-point relation:

$$\begin{vmatrix} \mathbf{p}\mathbf{p} & \mathbf{p}\mathbf{q} & \mathbf{p}\mathbf{r} \\ \mathbf{q}\mathbf{p} & \mathbf{q}\mathbf{q} & \mathbf{q}\mathbf{r} \\ \mathbf{r}\mathbf{p} & \mathbf{r}\mathbf{q} & \mathbf{r}\mathbf{r} \end{vmatrix} = 0$$

for the six junction lines BC = a, CA = b, AB = c, OA = p, OB = q, OC = r that are possible between the four points.

The Shortest Distance Between Skew Lines

To calculate the angle and distance between two given skew lines.

This important problem is usually encountered in one of the following two forms:

I. To calculate the angle and distance between two skew lines when a point on each line and the direction of each line are given, the former by coordinates and the latter by the direction cosines of the lines.

II. To calculate the angle and distance between two opposite edges of a tetrahedron whose six edges are known.

The distance between two skew lines is naturally the shortest distance between the lines, i.e., the length of the line perpendicular to both lines and joining a point on each.

Solution of I. We designate the rectangular coordinates of the two given points P and p as $A|B|C$ and $a|b|c$, the vector $\overrightarrow{pP}$ (with the components A − a, B − b, C − c) as $\mathbf{d}$, the direction cosines of the two lines, which are at the same time the components of two unit vectors $\mathbf{E}$ and $\mathbf{e}$ lying on the lines, as L, M, N and l, m, n, the sought-for angle of the two lines as $\omega$, and the sought-for minimum distance as k.

The solution to this problem, which is in itself not very simple, becomes astonishingly simple with the introduction of the scalar product $\mathbf{E} \cdot \mathbf{e}$ and the vector product $\mathbf{E} \times \mathbf{e}$ of the two vectors $\mathbf{E}$ and $\mathbf{e}$.

The former can be expressed on the one hand (since the vectors $\mathbf{E}$ and $\mathbf{e}$ have a magnitude of 1) as $\cos\omega$, and, on the other, by the components of the factors as Ll + Mm + Nn. We therefore obtain

$$\cos\omega = Ll + Mm + Nn. \tag{1}$$
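The latter is perpendicular to both lines, so that the projection of $\mathbf{d}$ on the vector $\mathbf{E} \times \mathbf{e}$ represents the desired distance k (the shortest distance k between the two lines is specifically the projection of $\mathbf{d}$ on k and at the same time the projection of $\mathbf{d}$ on every parallel to k, for example, on $\mathbf{E} \times \mathbf{e}$). However, since the projection of a vector $\mathbf{V}$ on a second vector $\mathbf{v}$ of the magnitude v is $\mathbf{V} \cdot \mathbf{v}/v$, we obtain for k the value $\mathbf{d} \cdot \mathbf{E} \times \mathbf{e}/\sin\omega$ ($\sin\omega$ is the magnitude of the vector $\mathbf{E} \times \mathbf{e}$).

Now the scalar product of the two vectors $\mathbf{d}$ and $\mathbf{E} \times \mathbf{e}$ is nothing other than the so-called mixed product of the three vectors $\mathbf{d}$, $\mathbf{E}$, and $\mathbf{e}$. And since the latter is equal to the determinant whose rows are the components of the three vectors (No. 68), we obtain the formula

$$k = \begin{vmatrix} A - a & B - b & C - c \\ L & M & N \\ l & m & n \end{vmatrix} \Big/ \sin\omega. \tag{2}$$

Note. If we desire to calculate the coordinates $X|Y|Z$ and $x|y|z$ of the end points U and u of the shortest junction line k, we designate the segments PU and pu as R and r, the vector $\overrightarrow{uU}$ as $\mathbf{k}$, and we then have

$$\overrightarrow{uU} = \overrightarrow{up} + \overrightarrow{pP} + \overrightarrow{PU}, \quad \text{or} \quad \mathbf{k} = -r\mathbf{e} + \mathbf{d} + R\mathbf{E}.$$

If we multiply this equation in scalar fashion with $\mathbf{E}$ and $\mathbf{e}$, we obtain, as a result of $\mathbf{E} \cdot \mathbf{k} = 0$ and $\mathbf{e} \cdot \mathbf{k} = 0$, the two linear equations

$$(\mathbf{E} \cdot \mathbf{E})R - (\mathbf{E} \cdot \mathbf{e})r + \mathbf{E} \cdot \mathbf{d} = 0, \qquad (\mathbf{E} \cdot \mathbf{e})R - (\mathbf{e} \cdot \mathbf{e})r + \mathbf{e} \cdot \mathbf{d} = 0,$$

from which the unknowns R and r are obtained.

Solution of II. Let the six edges of the tetrahedron be BC = a, CA = b, AB = c, OA = p, OB = q, OC = r, and let the vectors $\overrightarrow{BC}$, $\overrightarrow{CA}$, $\overrightarrow{AB}$, $\overrightarrow{OA}$, $\overrightarrow{OB}$, $\overrightarrow{OC}$ be $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$, $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$. Let the angle and distance between the two opposite edges c and r be called $\omega$ and k, respectively.

Determination of $\omega$. We have

$$\mathbf{c} + \mathbf{r} = \overrightarrow{AB} + \overrightarrow{OC} = \overrightarrow{AO} + \overrightarrow{OB} + \overrightarrow{OA} + \overrightarrow{AC} = \overrightarrow{OB} + \overrightarrow{AC} = \mathbf{q} - \mathbf{b},$$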
and thus

$$(\mathbf{c} + \mathbf{r})^2 = (\mathbf{c} + \mathbf{r}) \cdot (\mathbf{q} - \mathbf{b}) = \mathbf{c}\mathbf{q} + \mathbf{q}\mathbf{r} - \mathbf{b}\mathbf{c} - \mathbf{b}\mathbf{r}.$$

However, since

$$(\mathbf{c} + \mathbf{r})^2 = c^2 + r^2 + 2\mathbf{c}\mathbf{r} = c^2 + r^2 + 2cr\cos\omega,$$

$$2\mathbf{c}\mathbf{q} = c^2 + q^2 - p^2, \quad 2\mathbf{q}\mathbf{r} = q^2 + r^2 - a^2, \quad 2\mathbf{b}\mathbf{c} = a^2 - b^2 - c^2, \quad 2\mathbf{b}\mathbf{r} = p^2 - b^2 - r^2,$$

the equation obtained is transformed into

$$2cr\cos\omega = b^2 + q^2 - a^2 - p^2, \tag{3}$$

so that $\omega$ is determined.

Calculation of k. Let the volume of the tetrahedron ABCO, which we can consider as known in accordance with Euler's formula (No. 68), be called T. We displace the vector $\mathbf{r}$ parallel to itself until it has the starting point A in common with $\mathbf{c}$; its new end point we will call Q, so that AQ is equal and parallel to OC. Since the triangles CQA and COA are halves of the parallelogram COAQ, they are congruent, and thus the tetrahedrons CQAB and COAB have the same volume (T).

Fig. 82.

If we now take QAB as the base surface of the tetrahedron CQAB and C as the apex, the base surface has the area $\tfrac{1}{2}AQ \cdot AB \cdot \sin QAB = \tfrac{1}{2}rc\sin\omega$, and the altitude (as the distance of the point C from the plane QAB that contains the edge c and the line AQ that is parallel to the opposite edge OC) has a length of k. The volume of the tetrahedron is therefore $\tfrac{1}{3} \cdot \tfrac{1}{2}cr\sin\omega \cdot k$, and we obtain the formula

$$6T = kcr\sin\omega. \tag{4}$$
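The two forms of the problem can be compared on a concrete tetrahedron. The sketch below is illustrative only: the vertex coordinates are arbitrary sample values, the helper names are hypothetical, and the lines are assumed not to be parallel (otherwise sin ω vanishes). It evaluates formulas (1) and (2) from points and unit direction vectors, and formulas (3) and (4) from the six edge lengths, for the opposite edges AB and OC.

```python
from math import sqrt


def sub(u, v):
    return tuple(x - y for x, y in zip(u, v))


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def norm(u):
    return sqrt(dot(u, u))


def unit(u):
    n = norm(u)
    return tuple(x / n for x in u)


def skew_form_I(P, E, p, e):
    """Formulas (1) and (2): point P with unit direction E, point p with unit direction e."""
    d = sub(P, p)                              # the vector from p to P
    cos_w = dot(E, e)                          # formula (1)
    sin_w = sqrt(1 - cos_w ** 2)
    k = abs(dot(d, cross(E, e))) / sin_w       # formula (2), taken as a length
    return cos_w, k


def skew_form_II(a, b, c, p, q, r, T):
    """Formulas (3) and (4): opposite edges c and r of a tetrahedron of volume T."""
    cos_w = (b * b + q * q - a * a - p * p) / (2 * c * r)   # formula (3)
    sin_w = sqrt(1 - cos_w ** 2)
    k = 6 * T / (c * r * sin_w)                              # formula (4)
    return cos_w, k


# Sample tetrahedron (coordinates chosen only for illustration)
O, A, B, C = (0, 0, 0), (3, 0, 0), (1, 2, 0), (0, 1, 4)
a, b, c = norm(sub(C, B)), norm(sub(A, C)), norm(sub(B, A))
p, q, r = norm(A), norm(B), norm(C)
T = abs(dot(cross(A, B), C)) / 6               # 6T is the mixed product

print(skew_form_I(A, unit(sub(B, A)), O, unit(C)))
print(skew_form_II(a, b, c, p, q, r, T))
```

Both calls should report the same cosine and the same distance, since the line AB passes through A with direction AB and the line OC passes through O with direction OC.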
Since all the magnitudes in this formula are known with the exception of k, it gives us the distance k between the opposite edges which we have been looking for.

Note. If we keep in mind that $cr\sin\omega$ is the magnitude of the vector $\mathbf{c} \times \mathbf{r}$ and that the shortest distance $\mathbf{k}$ (conceived of as a vector) between the edges c and r is parallel to $\mathbf{c} \times \mathbf{r}$, we can write

$$6T = \mathbf{k} \cdot \mathbf{c} \times \mathbf{r}$$

and we have the following

Theorem: The mixed product of two opposite edges of a tetrahedron and the distance between them is equal to six times the volume of the tetrahedron.

A direct consequence of this theorem is the famous

Theorem of Steiner: All tetrahedrons having two opposite edges of prescribed length lying on two fixed lines have the same volume.

The Sphere Circumscribing a Tetrahedron

To determine the radius of the sphere circumscribing a tetrahedron of which all six edges are given.

One should compare the developments of Legendre in his Éléments de Géométrie, Note V.

We will first solve the

Preliminary problem: To find the relation between the six major arcs that connect the four points of a spherical surface.

We will call the four points 0, 1, 2, 3, the arcs joining them 01, 02, 03, 23, 31, 12, the radii (considered as vectors) running to them $\mathbf{r}_0$, $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_3$, and their common magnitude h.

Since there is always a homogeneous linear relation between four vectors of a space, we have the equation

$$\alpha\mathbf{r}_0 + \beta\mathbf{r}_1 + \gamma\mathbf{r}_2 + \delta\mathbf{r}_3 = 0,$$

in which not all of the coefficients $\alpha$, $\beta$, $\gamma$, $\delta$ vanish simultaneously. We multiply the relation sequentially in scalar fashion by $\mathbf{r}_0$, $\mathbf{r}_1$, $\mathbf{r}_2$, $\mathbf{r}_3$ and obtain the four equations

$$\begin{aligned}
\mathbf{r}_0\mathbf{r}_0\,\alpha + \mathbf{r}_0\mathbf{r}_1\,\beta + \mathbf{r}_0\mathbf{r}_2\,\gamma + \mathbf{r}_0\mathbf{r}_3\,\delta &= 0,\\
\mathbf{r}_1\mathbf{r}_0\,\alpha + \mathbf{r}_1\mathbf{r}_1\,\beta + \mathbf{r}_1\mathbf{r}_2\,\gamma + \mathbf{r}_1\mathbf{r}_3\,\delta &= 0,\\
\mathbf{r}_2\mathbf{r}_0\,\alpha + \mathbf{r}_2\mathbf{r}_1\,\beta + \mathbf{r}_2\mathbf{r}_2\,\gamma + \mathbf{r}_2\mathbf{r}_3\,\delta &= 0,\\
\mathbf{r}_3\mathbf{r}_0\,\alpha + \mathbf{r}_3\mathbf{r}_1\,\beta + \mathbf{r}_3\mathbf{r}_2\,\gamma + \mathbf{r}_3\mathbf{r}_3\,\delta &= 0.
\end{aligned}$$
However, when four homogeneous linear equations with four unknowns ($\alpha$, $\beta$, $\gamma$, $\delta$) possess a nontrivial solution, the determinant of the coefficients of the equations must be equal to zero. Consequently

$$\begin{vmatrix} \mathbf{r}_0\mathbf{r}_0 & \mathbf{r}_0\mathbf{r}_1 & \mathbf{r}_0\mathbf{r}_2 & \mathbf{r}_0\mathbf{r}_3 \\ \mathbf{r}_1\mathbf{r}_0 & \mathbf{r}_1\mathbf{r}_1 & \mathbf{r}_1\mathbf{r}_2 & \mathbf{r}_1\mathbf{r}_3 \\ \mathbf{r}_2\mathbf{r}_0 & \mathbf{r}_2\mathbf{r}_1 & \mathbf{r}_2\mathbf{r}_2 & \mathbf{r}_2\mathbf{r}_3 \\ \mathbf{r}_3\mathbf{r}_0 & \mathbf{r}_3\mathbf{r}_1 & \mathbf{r}_3\mathbf{r}_2 & \mathbf{r}_3\mathbf{r}_3 \end{vmatrix} = 0.$$

Here we replace each product $\mathbf{r}_\mu\mathbf{r}_\nu$ by $h^2\cos\mu\nu$, eliminate everywhere the factor $h^2$, and obtain the relation we are looking for:

$$\begin{vmatrix} \cos 00 & \cos 01 & \cos 02 & \cos 03 \\ \cos 10 & \cos 11 & \cos 12 & \cos 13 \\ \cos 20 & \cos 21 & \cos 22 & \cos 23 \\ \cos 30 & \cos 31 & \cos 32 & \cos 33 \end{vmatrix} = 0. \tag{1}$$

(cos 00, cos 11, cos 22, cos 33 are naturally merely symmetrical ways of writing unity.)

The solution of the tetrahedron problem is now simple.

In order to maintain agreement with the designations of the preliminary problem we will call the vertexes of the tetrahedron 0, 1, 2, 3, and the radius of the sphere of circumscription h. The edges 01, 02, 03, 23, 31, 12 we will call p, q, r, a, b, c, their squares P, Q, R, A, B, C, and the volume of the tetrahedron T.

We now introduce the four-point relation (1), assign to each cosine the factor $H = 2h^2$, and replace the new determinant elements in accordance with the cosine theorem, e.g., H cos 01 by H − P, H cos 02 by H − Q, H cos 23 by H − A, etc. (naturally H cos 00 and the other elements of the diagonal will be replaced by H). This gives us, after we reverse the sign of all the elements,

$$\begin{vmatrix} -H & P - H & Q - H & R - H \\ P - H & -H & C - H & B - H \\ Q - H & C - H & -H & A - H \\ R - H & B - H & A - H & -H \end{vmatrix} = 0.$$
We now line the bottom of this determinant with ones and the right-hand side with zeros and obtain

$$\begin{vmatrix} -H & P - H & Q - H & R - H & 0 \\ P - H & -H & C - H & B - H & 0 \\ Q - H & C - H & -H & A - H & 0 \\ R - H & B - H & A - H & -H & 0 \\ 1 & 1 & 1 & 1 & 1 \end{vmatrix} = 0.$$

We now add to the first, second, third, and fourth rows H times the last row; this gives us

$$\begin{vmatrix} 0 & P & Q & R & H \\ P & 0 & C & B & H \\ Q & C & 0 & A & H \\ R & B & A & 0 & H \\ 1 & 1 & 1 & 1 & 1 \end{vmatrix} = 0.$$

If we call the cofactors of the elements of the last column $M_1$, $M_2$, $M_3$, $M_4$, $M_5$ and expand the determinant according to the elements of the last column, we obtain

$$H(M_1 + M_2 + M_3 + M_4) + M_5 = 0.$$

If we also expand the determinant of equation (II) of No. 68 according to the elements of the last column, that equation assumes the form

$$M_1 + M_2 + M_3 + M_4 = 288T^2.$$

From the last two equations we obtain

$$288HT^2 = -M_5,$$

where

$$M_5 = \begin{vmatrix} 0 & P & Q & R \\ P & 0 & C & B \\ Q & C & 0 & A \\ R & B & A & 0 \end{vmatrix}.$$

Computation gives

$$-M_5 = 2FG + 2GE + 2EF - E^2 - F^2 - G^2,$$
where $E$, $F$, $G$ are the three products $AP$, $BQ$, $CR$. If we replace $A$, $B$, $C$, $P$, $Q$, $R$ once again by $a^2$, $b^2$, $c^2$, $p^2$, $q^2$, $r^2$ and designate the products $ap$, $bq$, $cr$ of the opposite edges as $e$, $f$, $g$, the last formula can be written as

$$-M_5 = 2f^2g^2 + 2g^2e^2 + 2e^2f^2 - e^4 - f^4 - g^4.$$

If we consider $e$, $f$, $g$ as sides of a triangle, the right side of this formula (according to Hero) represents 16 times the square of the area $j$ of this triangle. Thus the equation found for $H = 2h^2$ is transformed into

$$576h^2T^2 = 16j^2,$$

and from this we can obtain the simple formula

$$6hT = j$$

for the radius of the sphere of circumscription. Verbally, this can be stated as follows:

Six times the product of a tetrahedron volume and the radius of its sphere of circumscription is equal to the area of a triangle whose sides are the products of the opposite edges of the tetrahedron.

Note. The question of the radius $\rho$ of the sphere inscribed in a tetrahedron is much simpler. The lines joining the center $Z$ of the inscribed sphere and the boundary points of the four triangles bounding the tetrahedron divide the tetrahedron into four pyramids with the common apex $Z$ and the volumes $\tfrac13\rho\,\mathrm{I}$, $\tfrac13\rho\,\mathrm{II}$, $\tfrac13\rho\,\mathrm{III}$, $\tfrac13\rho\,\mathrm{IV}$, where I, II, III, IV are the areas of the bounding triangles. We thus obtain the formula

$$T = \tfrac13\rho\,(\mathrm{I} + \mathrm{II} + \mathrm{III} + \mathrm{IV}).$$

This equation represents $\rho$ as a function of the tetrahedron edges, since I, II, III, IV, and $T$ are known functions of the edges.

The Five Regular Solids

To divide the surface of a sphere into congruent regular spherical polygons.

Solution. We will call the required division "regular" and we will first answer the question concerning the maximum possible number of regular divisions.
We will assume that the sphere is covered completely and without any gaps by $z$ regular $n$-gons and that at every corner of such an $n$-gon $v$ sides come together. We divide each $n$-gon by means of the spherical radii running from the center to the vertexes into $n$ isosceles triangles. Each of these triangles possesses the central angle $2\pi/n$ and the base angle $\pi/v$ (since at each vertex $2v$ such base angles come together), and thus the spherical excess of each is

$$\varepsilon = \frac{2\pi}{n} + \frac{2\pi}{v} - \pi = 2\pi\left(\frac1n + \frac1v - \frac12\right).$$

Now, the area of such a triangle, when $r$ is the radius of the sphere, is $r^2\varepsilon$; the area of an $n$-gon is thus $nr^2\varepsilon$ and the area of the spherical surface consisting of $z$ such $n$-gons is $znr^2\varepsilon$. Accordingly, we obtain the equation

$$znr^2\varepsilon = 4\pi r^2$$

or

$$\frac1n + \frac1v - \frac12 = \frac{2}{zn}$$

or

$$\frac1n + \frac1v = \frac12 + \frac{2}{zn}.$$

Since the left side of this equation is $> \tfrac12$ and at the same time $n$ as well as $v$ must be $> 2$, we obtain the following five possibilities for $n$, $v$, and $z$:

n    3    3    3    4    5
v    3    4    5    3    3
z    4    8   20    6   12

Thus, there are only five possible regular divisions of a spherical surface: by dividing the surface with
1. four regular triangles,
2. six regular tetragons,
3. eight regular triangles,
4. twenty regular triangles,
5. twelve regular pentagons.
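The enumeration above can be reproduced by a brute-force search. The following is only a sketch: the function name `regular_divisions` and the search bound `limit` are assumptions made for illustration, and exact fractions are used so that the test "z is a whole number" is not affected by rounding.

```python
from fractions import Fraction

def regular_divisions(limit=60):
    """List every (n, v, z) with n, v > 2 satisfying 1/n + 1/v = 1/2 + 2/(z*n)."""
    solutions = []
    for n in range(3, limit):
        for v in range(3, limit):
            # 1/n + 1/v - 1/2 equals 2/(z*n), so it has to be positive
            excess = Fraction(1, n) + Fraction(1, v) - Fraction(1, 2)
            if excess <= 0:
                continue
            z = Fraction(2, n) / excess
            if z.denominator == 1:      # z counts polygons, so it must be an integer
                solutions.append((n, v, int(z)))
    return solutions

for n, v, z in regular_divisions():
    print(f"n = {n}, v = {v}, z = {z}")
```

Raising `limit` does not change the result, because the excess is no longer positive once n or v reaches 6.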
If we connect every two adjacent corners of such a spherical $n$-gon by means of a line segment, we obtain a regular plane $n$-gon bounded by the $n$ line segments that connect the corners. If we construct this plane $n$-gon for each of the $z$ spherical $n$-gons, we obtain a regular polyhedron bounded by $z$ regular $n$-gons, or a so-called regular solid. There are accordingly only five regular solids, namely, the regular tetrahedron, hexahedron (the cube), octahedron, icosahedron, and dodecahedron.

In the following we will actually carry out the five regular divisions of the spherical surface, which we had initially only shown to be possible. For convenience in viewing the sphere we will imagine it as a globe with a north pole N and a south pole S and with meridians and latitudinal circles.

I. The tetrahedron ($n = 3$, $v = 3$, $z = 4$). On the three meridians 0°, 120°, 240° we lay off from N the three equal arcs NA, NB, NC such that the triangles NBC, NCA, NAB are equilateral. The three arcs BC, CA, AB enclosing the south pole then also form an equilateral triangle that is congruent to the designated triangles, and the spherical surface has been divided into the four regular triangles NBC, NCA, NAB, ABC.

II. The hexahedron ($n = 4$, $v = 3$, $z = 6$). On the four meridians 0°, 90°, 180°, 270° we lay off from N and S the eight equal arcs NA, NB, NC, ND and SC', SD', SA', SB' (each one equal to $h$) such that each of the arcs AC', BD', CA', DB' is equal to AB ($= 2k$). $k$ is obtained from the spherical triangle NAB by means of the equation

$$\cos 2k = \cos h \cos h.$$

Since on the one hand $2h + 2k = NA + SC' + AC' = NS = 180°$ or $h + k = 90°$, and thus $\cos h = \sin k$, and on the other hand $\cos 2k = 1 - 2\sin^2 k$, we obtain

$$1 - 2\sin^2 k = \sin^2 k$$

and consequently

$$\sin k = \sqrt{\tfrac13}, \quad \cos 2k = \tfrac13, \quad \cos h = \sqrt{\tfrac13}.$$

The corners A, B, C, D, A', B', C', D' defined by these conditions are the eight corners of the cube.

III. The octahedron ($n = 3$, $v = 4$, $z = 8$).
The corners of the octahedron are the points N, S and four equator points separated from each other by 90°.
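The values found for the cube in II. can be checked numerically by placing the eight corners on the globe and counting the pairs that lie one edge arc 2k apart. This is a sketch only: the helper names `arc` and `edge_count` and the representation of a point as (colatitude, longitude) are assumptions introduced for the illustration.

```python
import math

def arc(p, q):
    """Great-circle arc (radians) between two points given as (colatitude, longitude)."""
    (t1, l1), (t2, l2) = p, q
    c = math.cos(t1) * math.cos(t2) + math.sin(t1) * math.sin(t2) * math.cos(l1 - l2)
    return math.acos(max(-1.0, min(1.0, c)))

k = math.asin(math.sqrt(1 / 3))   # from sin k = sqrt(1/3)
h = math.pi / 2 - k               # from h + k = 90 degrees

meridians = [math.radians(d) for d in (0, 90, 180, 270)]
north = [(h, m) for m in meridians]             # A, B, C, D laid off from N
south = [(math.pi - h, m) for m in meridians]   # C', D', A', B' laid off from S
corners = north + south

def edge_count(points, edge, tol=1e-9):
    """Count the pairs of points whose arc distance equals `edge`."""
    return sum(1 for i in range(len(points)) for j in range(i + 1, len(points))
               if abs(arc(points[i], points[j]) - edge) < tol)

print("h  =", round(math.degrees(h), 4), "degrees")
print("2k =", round(math.degrees(2 * k), 4), "degrees")
print("pairs at distance 2k:", edge_count(corners, 2 * k))
```

The count corresponds to the twelve edges of the cube: four around each pole and four along the meridians.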
IV. The icosahedron ($n = 3$, $v = 5$, $z = 20$). We choose ten meridians 36° apart and call them 1, 2, 3, ..., 10. On the meridians 1, 3, 5, 7, 9 we lay off from N the equal arcs NA, NB, NC, ND, NE, and on the meridians 6, 8, 10, 2, 4 we lay off from S the equal arcs SA', SB', SC', SD', SE' such that the ten triangles NAB, NBC, NCD, NDE, NEA, SA'B', SB'C', SC'D', SD'E', SE'A' are equilateral. The common length $2k$ of the marked-off arcs can be obtained, for example, from one of the right triangles NBO, NCO, into which the meridian 4 divides the equilateral triangle NBC. Since $\angle BNO = 36°$, $\angle OBN = 72°$, it follows from triangle NBO that

$$\cos BO = \cos k = \frac{\cos 36°}{\sin 72°} = \frac{1}{2\sin 36°}$$

and from this that $2k = 63°26'$.

If we extend NO by its own length to H, we obtain the isosceles triangle NBH with the base $NH = 2h$ and the legs $BN = BH = 2k$, the base angle 36°, and the apex angle $HBN = 144°$. Since these angles have the same sine, the sines of their opposite sides NH and NB are equal according to the sine theorem. But since these opposite sides ($2h$ and $2k$) are not equal, $2h$ must be the supplement of $2k$. And since NE' is also the supplement of $2k$ ($= SE'$), then necessarily $NE' = 2h = NH$. Accordingly, point H coincides with E' and E'B is equal to $2k$, i.e., equal to NB.

In similar fashion each of the arcs AD', D'B, E'C, CA', A'D, DB', B'E, EC', C'A is equal to $2k$, and the ten "encircling" triangles ABD', D'E'B, BCE', E'A'C, CDA', A'B'D, DEB', B'C'E, EAC', C'D'A are likewise equilateral triangles and also congruent to the ten equilateral triangles above.
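The construction in IV. lends itself to a numerical check: place N, S and the ten points on their meridians, then count the pairs at arc distance 2k and the triples that are mutually adjacent. The sketch below assumes that meridian 1 lies at longitude 0° (the text does not fix this), and the helper names `to_xyz`, `arc`, `meridian` and `adjacent` are introduced only for the illustration.

```python
import math
from itertools import combinations

def to_xyz(colat, lon):
    """Unit vector of the globe point with the given colatitude and longitude (radians)."""
    return (math.sin(colat) * math.cos(lon), math.sin(colat) * math.sin(lon), math.cos(colat))

def arc(p, q):
    """Great-circle arc (radians) between two unit vectors."""
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

def meridian(i):
    """Longitude of meridian number i (1..10), assuming meridian 1 is at longitude 0."""
    return math.radians(36 * (i - 1))

k = math.acos(math.cos(math.radians(36)) / math.sin(math.radians(72)))
side = 2 * k    # common length of the marked-off arcs

vertices = [to_xyz(0.0, 0.0), to_xyz(math.pi, 0.0)]                            # N, S
vertices += [to_xyz(side, meridian(i)) for i in (1, 3, 5, 7, 9)]               # A, B, C, D, E
vertices += [to_xyz(math.pi - side, meridian(i)) for i in (6, 8, 10, 2, 4)]    # A', B', C', D', E'

def adjacent(i, j, tol=1e-9):
    return abs(arc(vertices[i], vertices[j]) - side) < tol

edges = [(i, j) for i, j in combinations(range(12), 2) if adjacent(i, j)]
faces = [t for t in combinations(range(12), 3)
         if all(adjacent(a, b) for a, b in combinations(t, 2))]

deg = math.degrees(side)
print(f"2k = {int(deg)} degrees {round((deg - int(deg)) * 60)} minutes")
print("edges:", len(edges), "triangles:", len(faces))
```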
The 12 points N, S, A, B, C, D, E, A', B', C', D', E' are thus the vertexes of 20 equilateral triangles that completely cover the sphere; they are the 12 corners of the regular icosahedron.

V. The dodecahedron ($n = 5$, $v = 3$, $z = 12$). As in the icosahedron, we begin the construction of the dodecahedron by laying off a system of ten meridians 1, 2, 3, ..., 10 that are 36° apart. About N as a common apex we group five congruent isosceles triangles NAB, NBC, NCD, NDE, NEA with the apex angle 72° and the base angle 60° ($= 180°/v$) whose base vertexes A, B, C, D, E lie on the meridians 1, 3, 5, 7, 9. Thus we obtain the regular pentagon ABCDE. In the same way we draw about S as a common center point the regular pentagon A'B'C'D'E' whose vertexes A', B', C', D', E' lie on the meridians 6, 8, 10, 2, 4.

Fig. 84.

If O and O' represent the base midpoints of the isosceles triangles ABN and D'E'S, then NAO and SD'O' are right triangles with the angles 60° and 36°. Our construction is now based on the theorem (proved below): "The perimeter of a spherical right triangle with angles of 60° and 36° is 90°." If we designate the hypotenuse, the long leg, and the short leg of such a triangle as $l$, $h$, and $k$, then

$$l + h + k = 90°. \tag{1}$$

If we remember that $NA = SD' = l$, $NO = SO' = h$, $AO = D'O' = k$, we see that $2k$ is the side, $l$ the radius of the circumscribed circle (on the sphere), $h$ the radius of the inscribed circle, and $s = l + h$ the altitude of the pentagon ABCDE or A'B'C'D'E'.
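Relation (1) can be checked directly by computing the three sides from the two given angles with the standard right-triangle formulas of spherical trigonometry (Napier's rules). This is a numerical sketch, not the proof given in the supplement; the function name `right_triangle_sides` is an assumption made for the illustration.

```python
import math

def right_triangle_sides(alpha_deg, beta_deg):
    """Sides of a spherical right triangle from its two oblique angles.

    Returns (hypotenuse, leg opposite alpha, leg opposite beta) in degrees.
    """
    alpha, beta = math.radians(alpha_deg), math.radians(beta_deg)
    hyp = math.acos(1.0 / (math.tan(alpha) * math.tan(beta)))   # cos c = cot(alpha) cot(beta)
    leg_a = math.acos(math.cos(alpha) / math.sin(beta))         # cos a = cos(alpha) / sin(beta)
    leg_b = math.acos(math.cos(beta) / math.sin(alpha))         # cos b = cos(beta) / sin(alpha)
    return tuple(math.degrees(x) for x in (hyp, leg_a, leg_b))

l, h, k = right_triangle_sides(60, 36)   # hypotenuse, long leg, short leg
perimeter = l + h + k
# One meridian from pole to pole: circumradius l, side 2k, altitude s = l + h, inradius h
meridian_length = l + 2 * k + (l + h) + h

print(round(l, 4), round(h, 4), round(k, 4))
print(round(perimeter, 9), round(meridian_length, 9))
```

The second sum shows why the four segments l, 2k, s and h exactly fill a meridian: it is twice the perimeter.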
We now mark off on the meridians 1, 3, 5, 7, 9 from A, B, C, D, E southwards and on the meridians 6, 8, 10, 2, 4 from A', B', C', D', E' northwards the pentagon side $2k$, which gives us the points F, G, H, K, L, F', G', H', K', L'. Now since, according to (1), each meridian consists of the four segments $l$, $2k$, $s$, and $h$, it follows that OG and O'H, for example, represent the pentagon altitude $s$; i.e., the pentagons ABHGF and D'E'KHG are congruent to the regular pentagon ABCDE. The same is naturally true of the pentagons BCLKH, CDG'F'L, DEK'H'G', EAFL'K', E'A'F'LK, A'B'H'G'F', B'C'L'K'H', C'D'GFL'.

With the 12 regular pentagons already designated the sphere is completely covered. The points A, B, C, D, E, F, G, H, K, L, A', B', C', D', E', F', G', H', K', L' are accordingly the 20 corners of the regular dodecahedron.

Supplement: Proof of the theorem: "The perimeter of a spherical right triangle with the angles 60° and 36° is 90°."

Let the sides of the triangle be $a$, $b$, $c$, their opposite angles $\alpha = 60°$, $\beta = 36°$, $\gamma = 90°$. We express the tangents of the sides by the regular decagon side $z = 2\sin 18°$ corresponding to the unit circle, for which it is known that $z^2 + z = 1$.

1. Firstly,

$$\cos\beta = 1 - 2\sin^2 18° = 1 - \tfrac12 z^2 = \frac{2 - z^2}{2} = \frac{1 + z}{2} = \frac{1}{2z}$$

or $\sec\beta = 2z$.

2. From $\sec c = \tan\alpha\tan\beta$ it follows that $\sec^2 c = 3\tan^2\beta$ or $(\tan^2 c + 1) = 3(\sec^2\beta - 1)$ or $\tan^2 c = 4(3z^2 - 1)$. However,

$$3z^2 - 1 = z^2 + (2z^2 - 1) = z^2 + (1 - 2z) = [1 - z]^2 = z^4,$$

and thus $\tan c = 2z^2$.

3. $\tan a = \tan c\cos\beta = 2z^2/2z = z$.

4. $\tan b = \tan c\cos\alpha = 2z^2\cdot\tfrac12 = z^2$.

Now we have

$$\tan c\cdot\tan(a + b) = 2z^2\cdot\frac{z + z^2}{1 - z^3} = \frac{2z^2}{1 - z^3} = \frac{2z^2}{(1 - z)[1 + z + z^2]} = \frac{2z^2}{(z^2)[1 + 1]} = 1.$$

Consequently, $a + b$ is the complement of $c$. Q.E.D.
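The closed forms of the supplement (tan c = 2z², tan a = z, tan b = z²) can be combined with the construction in V. to place all 20 corners on the globe and to confirm that they are joined by 30 arcs of length 2k, three at each corner. The sketch assumes that meridian 1 lies at longitude 0°; the helper names and the lists `edges` and `degree` are introduced only for this illustration.

```python
import math
from itertools import combinations

z = 2 * math.sin(math.radians(18))   # regular decagon side in the unit circle
l = math.atan(2 * z * z)             # hypotenuse: tan c = 2 z^2
h = math.atan(z)                     # long leg:   tan a = z
k = math.atan(z * z)                 # short leg:  tan b = z^2
side = 2 * k                         # pentagon side

def to_xyz(colat, lon):
    return (math.sin(colat) * math.cos(lon), math.sin(colat) * math.sin(lon), math.cos(colat))

def arc(p, q):
    dot = sum(a * b for a, b in zip(p, q))
    return math.acos(max(-1.0, min(1.0, dot)))

def meridian(i):
    """Longitude of meridian number i (1..10), assuming meridian 1 is at longitude 0."""
    return math.radians(36 * (i - 1))

vertices = []
for i in (1, 3, 5, 7, 9):
    vertices.append(to_xyz(l, meridian(i)))                    # A, B, C, D, E
    vertices.append(to_xyz(l + side, meridian(i)))             # one side further south
for i in (6, 8, 10, 2, 4):
    vertices.append(to_xyz(math.pi - l, meridian(i)))          # A', B', C', D', E'
    vertices.append(to_xyz(math.pi - l - side, meridian(i)))   # one side further north

edges = [(i, j) for i, j in combinations(range(20), 2)
         if abs(arc(vertices[i], vertices[j]) - side) < 1e-9]
degree = [sum(1 for e in edges if i in e) for i in range(20)]

print("l + h + k =", round(math.degrees(l + h + k), 9), "degrees")
print("edges:", len(edges), "edges per corner:", sorted(set(degree)))
```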
The regular solids were already known to the Pythagoreans and thus go back to the sixth century B.C. The proof that there are only five regular solids probably stems from Euclid (ca. 330-275 B.C.).

The Square as an Image of a Quadrilateral

To show that every quadrilateral can be considered as a perspective image of a square.

The perspective projection, perspectivity or central projection, the simplest and most important of all projections, can be explained as follows. Given are a fixed point $Z$, the center of projection, and a fixed plane $E$, the plane of the image. The perspective image or, more briefly, the perspective of an arbitrary point $P_0$ is understood to mean the point of intersection $P$ of the "projection ray" $ZP_0$ with the plane of the image. $P_0$ is the "object," $P$ the "image." The image of a figure is the totality of the images of the points of which the figure (the object) consists. Thus, the perspective of a straight line $g_0$ is a straight line $g$, namely the intersection of the plane $Zg_0$ with the plane of the image.

Of particular importance is the perspective projection in which only points of a plane $E_0$, the object plane, are projected onto the image plane. The line of intersection $a$ of the object plane and the image plane is called the axis of perspectivity. The axis of perspectivity is the locus of the object points that coincide with their images. An arbitrary object line and its image accordingly intersect at the axis.

A noteworthy role in this perspectivity is played by the infinitely distant points of the object plane. Since the projection rays to the infinitely distant points of $E_0$ run parallel to $E_0$, they lie in a plane A passing through $Z$ and parallel to $E_0$ and consequently meet the image plane at the line of intersection $f$ of the image plane with A. This line of intersection is called the vanishing line of the object plane $E_0$. The vanishing line is parallel to the axis of perspectivity.
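The definitions above can be made concrete with coordinates. The sketch below uses an assumed configuration that does not come from the text: the image plane E is z = 0, the object plane E0 is z = y (so the axis of perspectivity is the x-axis), and the center is Z = (0, -2, 3). With these choices the vanishing line is y = -5, which is parallel to the axis. The function names are likewise illustrative.

```python
def project(point, center):
    """Central projection of `point` from `center` onto the image plane z = 0."""
    px, py, pz = point
    zx, zy, zz = center
    t = zz / (zz - pz)            # the ray Z + t (P - Z) reaches z = 0 at this parameter
    return (zx + t * (px - zx), zy + t * (py - zy))

def vanishing_point(direction, center):
    """Image of the infinitely distant point of `direction`: the ray through Z parallel to it."""
    dx, dy, dz = direction
    zx, zy, zz = center
    t = -zz / dz
    return (zx + t * dx, zy + t * dy)

def collinear(p, q, r, tol=1e-12):
    """True if the three plane points lie on one line."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])) < tol

Z = (0.0, -2.0, 3.0)                           # assumed center of projection
d = (1.0, 1.0, 1.0)                            # a direction lying in the object plane z = y
line1 = [(0.0, 0.0, 0.0), (1.0, 1.0, 1.0)]     # two parallel object lines with direction d,
line2 = [(2.0, 0.0, 0.0), (3.0, 1.0, 1.0)]     # each starting on the axis of perspectivity

V = vanishing_point(d, Z)
images1 = [project(p, Z) for p in line1]
images2 = [project(p, Z) for p in line2]

print("vanishing point of d:", V)
print("images meet at V:", collinear(*images1, V), collinear(*images2, V))
print("axis point is its own image:", project((2.0, 0.0, 0.0), Z) == (2.0, 0.0))
```

The images of the two parallel object lines both pass through the same vanishing point, and every direction of the object plane that is not parallel to the axis yields a vanishing point on the line y = -5.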
In order to avoid limiting the general validity of the above theorem, "The perspective of a line is also a line," by a special case, we call the totality of infinitely distant points of E0 the "infinitely distant line" of this plane and can then state briefly that: The perspective of the infinitely distant line of a plane is the vanishing line of this plane.
The place at which the image $g$ of an arbitrary line $g_0$ of $E_0$ intersects the vanishing line $f$, and which is the image of the infinitely distant point of $g_0$, is called the vanishing point of $g_0$.

Now for the solution of our problem!

Fig. 85.

Let the quadrilateral ABCD in the drawing plane $E$ be the given quadrilateral, let O be the point of intersection of the diagonals AC and BD, P the point of intersection of the opposite sides AB and CD, Q the point of intersection of the opposite sides BC and DA. Let the square we are looking for be called accordingly $A_0B_0C_0D_0$, the point of intersection of its diagonals $O_0$, its plane $E_0$.

Since the points of intersection $P_0$ and $Q_0$ of the two pairs of opposite sides lie on the infinitely distant line of $E_0$, their images P and Q must lie on the vanishing line $f$ of the perspectivity passing from $E_0$ to $E$. We accordingly choose the line PQ as the vanishing line $f$. It makes no difference which parallel to $f$ we choose as the axis of perspectivity $a$. We choose the parallel through A. The points of intersection of the axis with the lines CD, BC, OP, OQ, and BD we designate as H, K, M, N, and S. Since each object line meets the corresponding image line at the axis, these points may also be called $H_0$, $K_0$, $M_0$, $N_0$, $S_0$.
In the quadrilateral ABCD the opposite sides PBA and PCD and the diagonals PO and PQ form a harmonic ray pencil. Since the ray PQ runs parallel to the line $a$, the segments MA and MH are of equal length.

In the quadrilateral ABCD the opposite sides QCB and QDA and the diagonals QO and QP also form a harmonic ray pencil. Since $QP \parallel a$, the segments NA and NK are also equally long.

Since the diagonals of the sought-for square must meet the diagonals of the given quadrilateral at the axis, the diagonals of the square must pass through A and S. The point of intersection $O_0$ of the diagonals accordingly lies on the semicircle with the diameter AS belonging to the plane $E_0$.

Since the midlines $M_0O_0$ and $N_0O_0$ of the square pass through $O_0$, $O_0$ also lies on the semicircle with the diameter MN in the plane $E_0$.

The point of intersection of the two semicircles is the center point $O_0$ of the square. The sides $A_0B_0$ and $C_0D_0$ of the square are the parallels through A and H to $MO_0$, the sides $B_0C_0$ and $D_0A_0$ of the square are the parallels through K and A to $NO_0$.

For convenience we execute the drawing (cf. Figure 85) in the drawing plane itself. Then, in order to obtain the spatial perspectivity we are looking for, we rotate the square about the axis $a$ as an axis of rotation into a new plane $E_0$, draw through $f$ the plane A parallel to $E_0$, join the point of intersection of the diagonals, $O_0$, now lying in $E_0$, with O, and designate the point of intersection of this connecting line with the plane A as $Z$.

If we now project the square $A_0B_0C_0D_0$ lying in $E_0$ from the center $Z$ onto $E$, we thereby obtain as a perspective image the quadrilateral ABCD.

The Pohlke-Schwarz Theorem

Four arbitrary points of a plane that do not all lie on the same line can be considered as an oblique image of the corners of a tetrahedron that is similar to a given tetrahedron.

This fundamental theorem of oblique parallel projection, proved by H. A. Schwarz (1843-1921) in 1864 (Crelle's Journal, vol.
63; also, Schwarz, Gesammelte Abhandlungen), includes as a special case the theorem formulated in 1853 by K. Pohlke (1810-1876):
The fundamental theorem of oblique axonometry: Three arbitrary segments originating from a single point in a plane that do not all belong to the same line can be considered as the oblique image of a tripod.

Before taking up the proof of this theorem we shall make several prefatory remarks about oblique projection, affinity, and axonometry.

An oblique projection is a projection of a plane or three-dimensional figure, an object figure, onto the drawing plane or image plane in which each object point is projected onto the image plane by a "projection" ray drawn in a fixed direction. If the projection rays are perpendicular to the image plane, the oblique projection is called a normal or orthogonal projection.

The oblique projection of points of a plane (the object plane) onto the image plane is a so-called affinity.

An affinity or affine projection is understood to mean a projection of an object plane onto the picture plane (which may also lie in the object plane) in which the points of the object plane are transformed into points of the image plane in such manner that they exhibit the following fundamental properties:

I. The affine image of a line is also a line.
II. Parallelism is not annulled by affine projection. (The image of a parallelogram is a parallelogram.)
III. The ratio of parallel segments is not altered by affine projection. In other words: Parallel segments are projected in the same proportion. (This third property is a consequence of I. and II.)

It is therefore immediately evident that the oblique projection of a plane onto a second plane possesses these three fundamental properties.

The most general affinity between two arbitrary planes $E$ and $E'$ is determined by the mutual correspondence between two arbitrary triangles ABC and A'B'C' of these planes, where A', B', C' are determined as the affine images of A, B, C, respectively.
The affine image P' of an arbitrary object point P (of $E$) is drawn by letting AP intersect with the side BC at H, then (according to III.) determining the affine image H' of H on the line B'C' by means of the condition $B'H' : C'H' = BH : CH$, and finally determining P' on A'H' by means of the condition $A'P' : H'P' = AP : HP$.

A frequently employed method of drawing the oblique projection of a three-dimensional figure is the axonometric method. In this method the points P of the three-dimensional figure are determined by their coordinates $x\,|\,y\,|\,z$ most commonly in a perpendicular
coordinate system. Three equal segments OA, OB, and OC are laid off from the origin O on the axes; these segments form a so-called tripod. The oblique outline O'A'B'C' of the tripod is drawn, and this also gives us the oblique images of the coordinate axes. We then construct, in accordance with III., the oblique image of the point P, which in this context is called the axonometric image.

It is now of fundamental importance to know whether three arbitrary segments O'A', O'B', O'C' originating from a point O' of the drawing plane can be considered as the oblique projection of a tripod OABC. This question was answered by Pohlke and, in a somewhat more general fashion, by Schwarz, as mentioned above.

Of the numerous proofs of the Pohlke-Schwarz fundamental theorem the following (stemming from Schwarz) is quite elementary. It is based upon the theorem of Lhuilier, which is in itself very interesting:

The sections of an arbitrary three-edged prism include all the possible forms of triangles.

In other words: Every triangle can be considered as the normal projection of a triangle of given form.

This theorem was stated in 1811 by the French-Swiss mathematician Simon Lhuilier (1750-1840).

Proof. Since parallel sections of a prism are congruent, we can assume that the prescribed triangle $A_0B_0C_0$, which is also the cross section of the prism, and the sought-for prism section ABC, which possesses a prescribed form, have a common vertex, $C = C_0$. If we now drop the perpendiculars $A_0X$ and $B_0Y$ from $A_0$ and $B_0$ to the intersection line (axis) $g$ of the two planes $E_0$ of $A_0B_0C_0$ and $E$ of ABC and
rotate the plane $E$ about $g$ as the rotation axis to the plane $E_0$, then A and B, as the figure shows, fall on the perpendiculars $A_0X$ and $B_0Y$, respectively, and the point of intersection $S = S_0$ of the lines $A_0B_0$ and AB falls on the axis. We now draw the perpendicular to the axis through C and let it meet $A_0B_0$ at $T_0$ and AB at T.

If we designate the cosine of the angle formed by the plane $E$ in its original position with $E_0$ as $\mu$, then

$$A_0X = \mu\cdot AX, \quad B_0Y = \mu\cdot BY, \quad T_0C = \mu\cdot TC.$$

Now according to the ray theorem,

$$SA : AT : TB = S_0A_0 : A_0T_0 : T_0B_0.$$

We can therefore draw a parallel $S_1A_1T_1B_1$ to SATB that cuts the lines $g$, CA, CT, CB at $S_1$, $A_1$, $T_1$, $B_1$ and is congruent to $S_0A_0T_0B_0$ (so that $S_1A_1 = S_0A_0$, $A_1T_1 = A_0T_0$, $T_1B_1 = T_0B_0$).

We displace the triangle $S_1B_1C$ in such a way that $S_1$ falls on $S_0$, $A_1$ on $A_0$, $T_1$ on $T_0$, $B_1$ on $B_0$. The vertex C then falls on a point V of the semicircle described about the diameter $S_0T_0$ (since $\triangle S_1CT_1$ is a right triangle), on which C lies, also.

From this fact we obtain the following simple method for constructing the described figure when the triangle $A_0B_0C_0$ and the form of the triangle ABC are given.

We draw over $A_0B_0$ the triangle $A_0B_0V$ that is similar to the triangle ABC (with $A_0$, $B_0$, V being homologous to A, B, C, respectively). We let the median perpendicular of CV intersect with $A_0B_0$ at M and draw the semicircle with the center M and the radius $MC = MV$. The end points $S_0$ and $T_0$ of the semicircle, which lie on the line $A_0B_0$, we designate in such manner that $S_0V$ and $T_0C$ become sides (not diagonals) of the chord quadrilateral $S_0T_0CV$. We then choose $CS_0$ as the axis and $CT_0$ as the perpendicular to the axis. On the axis we make $CS_1 = VS_0$, on the perpendicular to the axis $CT_1 = VT_0$, and we draw the line $S_1A_1T_1B_1 \cong S_0A_0T_0B_0$. Finally, we draw parallel to $S_1A_1T_1B_1$ the line SATB of which S, A, T, B lie on the perpendiculars through $S_0$, $A_0$, $T_0$, $B_0$, respectively, while at the same time A lies on $CA_1$ and B lies on $CB_1$.
If we rotate the triangle ABC about CS as the axis of rotation by the angle whose cosine is $\mu = C_0T_0/CT$, $A_0B_0C_0$ then appears as the normal projection of the rotated triangle ABC, which possesses the prescribed form.

That the ratio $\mu = C_0T_0/CT$ can be considered as a cosine, i.e., is a proper fraction, is shown as follows. According to the ray
theorem, $CT = CT_1\cdot(CS/CS_1)$, i.e., according to the construction, $CT = VT_0\cdot CS/VS_0$. If we introduce this value into the equation for $\mu$, we obtain

$$\mu = \frac{CT_0}{CT} = \frac{CT_0\cdot VS_0}{CS\cdot VT_0}.$$

However, since, according to the theorem of Ptolemy, in the chord quadrilateral $S_0T_0CV$ the product $CT_0\cdot VS_0$ of the opposite sides is smaller than the product $CS_0\cdot VT_0$ of the diagonals, $\mu$ represents a proper fraction.

This proves the auxiliary theorem concerning the prism.

The proof of the Pohlke-Schwarz theorem is now easy. We can state the theorem in the following manner:

The oblique image of a given tetrahedron can always be determined in such manner that it is similar to a given quadrilateral.

Let the tetrahedron be ABCS, the quadrilateral A'B'C'D'. In the affinity between the planes ABC and A'B'C', in which A', B', C' are correlated to the points A, B, and C, respectively, let the point D correspond to the point D'. We select SD as the direction of the affinity (projection ray). We construct the triangular prism whose edges are parallel to SD through A, B, and C, and determine the section A"B"C" that is similar to A'B'C'. In the affinity in which the points A", B", C" are correlated to the points A', B', C', let the point D" correspond to the point D'. Then A"B"C"D" is similar to A'B'C'D'. Now, since A"B"C"D" and also ABCD are affine with respect to A'B'C'D', then A"B"C"D" is also affine to ABCD. The latter affinity, however, arises from the projection rays parallel to SD. In this affinity the quadrilateral A"B"C"D" that is similar to A'B'C'D' is thus the oblique image of the given tetrahedron ABCS.

Gauss' Fundamental Theorem of Axonometry

Though three segments OA, OB, OC originating from a point O in the drawing plane (image plane), all three of which do not belong to the same straight line, can always, according to Pohlke's fundamental theorem (No.
73), be considered as an oblique projection of a tripod, this is no longer the case for the normal projection of a tripod.
Rather, there exists between the lengths and directions of the normal projections OA, OB, OC of the three legs a definite relationship. Thus we come to Gauss' problem:

What is the relation between the normal projections OA, OB, OC of the legs of a tripod?

Solution. We select the image plane $E$ as the $xy$-plane, the perpendicular to this plane from the apex of the tripod as the $z$-axis of a triaxial orthogonal coordinate system; we take the common length of the three legs as the unit length and call the direction cosines of the legs $\lambda\,|\,\lambda'\,|\,\lambda''$, $\mu\,|\,\mu'\,|\,\mu''$, and $\nu\,|\,\nu'\,|\,\nu''$. At the same time we take the $xy$-plane as the Gauss plane (the plane of complex numbers) and designate the complex number represented by any point (P) of $E$ by the corresponding small gothic letter ($\mathfrak{p}$).

Since the three points A, B, C in $E$ have the coordinates $\lambda\,|\,\lambda'$, $\mu\,|\,\mu'$, $\nu\,|\,\nu'$,

$$\mathfrak{a} = \lambda + i\lambda', \quad \mathfrak{b} = \mu + i\mu', \quad \mathfrak{c} = \nu + i\nu'.$$

Squaring and adding, we obtain

$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = (\lambda^2 + \mu^2 + \nu^2) - [\lambda'^2 + \mu'^2 + \nu'^2] + 2i\{\lambda\lambda' + \mu\mu' + \nu\nu'\}.$$

According to the well-known relations between the direction cosines of three mutually perpendicular lines, the expression within parentheses and the expression within brackets both equal one, while the expression within the braces is equal to zero. This gives us the Gauss equation

$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = 0.$$

This formula forms Gauss' fundamental theorem of normal axonometry:

If in the normal projection of a tripod the image plane is considered as the plane of complex numbers, the projection of the apex of the tripod as the null point, and the projections of the leg ends as complex numbers of the plane, the quadratic sum of these numbers is equal to zero.

The Gauss theorem immediately provides the solution of the

Fundamental problem of normal axonometry: To complete the normal projection OABC of a tripod of which the normal projections OA and OB of two of the legs are already drawn.

Solution.
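Before the geometric construction is carried out, the Gauss equation and its use for completing a projection can be checked numerically. The sketch below builds an orthonormal tripod from three rotation angles (an assumed parametrization chosen only for the illustration), projects it onto the xy-plane, and recovers the third projection from the first two up to sign, in line with the two positions that the construction allows.

```python
import cmath
import math

def matmul(p, q):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(p[i][m] * q[m][j] for m in range(3)) for j in range(3)] for i in range(3)]

def rot_z(t):
    return [[math.cos(t), -math.sin(t), 0.0], [math.sin(t), math.cos(t), 0.0], [0.0, 0.0, 1.0]]

def rot_x(t):
    return [[1.0, 0.0, 0.0], [0.0, math.cos(t), -math.sin(t)], [0.0, math.sin(t), math.cos(t)]]

def tripod_projections(t1, t2, t3):
    """Normal projections of three unit, mutually perpendicular legs as complex numbers.

    The legs are the columns of the rotation Rz(t1) Rx(t2) Rz(t3); the angles are
    arbitrary parameters of this sketch, not quantities from the text.
    """
    r = matmul(rot_z(t1), matmul(rot_x(t2), rot_z(t3)))
    return [complex(r[0][j], r[1][j]) for j in range(3)]

a, b, c = tripod_projections(0.7, 1.1, 0.3)
gauss_sum = a * a + b * b + c * c          # should vanish up to rounding

# Completing the projection: c is one of the two square roots of -(a^2 + b^2)
c_candidate = cmath.sqrt(-(a * a + b * b))
mismatch = min(abs(c - c_candidate), abs(c + c_candidate))

print("|a^2 + b^2 + c^2| =", abs(gauss_sum))
print("distance of c from the nearer root:", mismatch)
```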
We select (as above) the point O as the null point of the complex number plane and the direction of OA as the direction of the positive real number axis. The magnitudes of the three
numbers $\mathfrak{a}$, $\mathfrak{b}$, $\mathfrak{c}$ we will designate as $a$, $b$, $c$, and the three angles BOC, COA, AOB as $\alpha$, $\beta$, $\gamma$. We write the Gauss equation

$$\mathfrak{a} + \frac{\mathfrak{b}^2}{\mathfrak{a}} = -\frac{\mathfrak{c}^2}{\mathfrak{a}}.$$

In order to construct $\mathfrak{p} = \mathfrak{b}^2/\mathfrak{a}$ we lay off at O on OB the angle $\gamma$, at B on BO the angle OAB; the point of intersection P of the free legs of the angles drawn gives us $\mathfrak{p}$. We then draw through A the parallel to OP, through P the parallel to OA and obtain at the point of intersection Q of the two parallels the complex number $\mathfrak{q} = \mathfrak{a} + (\mathfrak{b}^2/\mathfrak{a})$. Consequently, the end point R of the extension of QO by itself is the number $\mathfrak{r} = \mathfrak{c}^2/\mathfrak{a}$. From

$$\mathfrak{c} = \sqrt{\mathfrak{a}\mathfrak{r}}$$

it follows that:
1. The magnitude of $\mathfrak{c}$ is the mean proportional of the magnitudes of $\mathfrak{a}$ and $\mathfrak{r}$;
2. the direction of $\mathfrak{c}$ is the direction of the bisector of the angle ($2\beta$) enclosed between OA and OR.

Accordingly, we bisect the angle AOR and mark off on the bisector from O the mean proportional of OA and OR; the end point of the marked-off segment is the sought-for point C. Since we can choose the bisector of the concave angle AOR as well as that of the convex
angle (in accordance with the two values of $\sqrt{\mathfrak{a}\mathfrak{r}}$), there are two possible positions for C.

Note. Weisbach's theorem. Since the square of a complex number has an angle twice as great as the number itself, the vectors of the squares of two complex numbers form with each other an angle that is twice as great as the vectors of the numbers. Thus the vectors of the squares $\mathfrak{a}^2$, $\mathfrak{b}^2$, $\mathfrak{c}^2$ form the angles $2\alpha$, $2\beta$, $2\gamma$ with each other. Thus, if we join these vectors one after another (by magnitude and direction), we obtain (in accordance with the Gauss formula) a triangle with the external angles $2\alpha$, $2\beta$, $2\gamma$. Since the sides of this triangle are $a^2$, $b^2$, $c^2$, the sine theorem gives us the equation

$$a^2 : b^2 : c^2 = \sin 2\alpha : \sin 2\beta : \sin 2\gamma.$$

This formula is Weisbach's theorem:

The squares of the normal projections of the legs of a tripod are to each other as the sines of twice the angles enclosed by the projections.

Thus, Weisbach's theorem appears as the direct consequence of the Gauss theorem.

The Gauss theorem can be found unproved in the second volume of Gauss' Werke, the Weisbach theorem in Weisbach's paper on axonometry, which was published in 1844 at Tübingen in the Polytechnische Mitteilungen of Volz and Karmarsch.

Hipparchus' Stereographic Projection

To present a conformal map projection that transforms the circles of the globe into circles of the map.

The projection we are looking for, which is called a stereographic or polar projection, is very important in cartography. In all probability the source of this problem is the astronomer Hipparchus (of Nicaea in Bithynia), one of the most amazing men of antiquity, who was making astronomical observations in the period from 160-125 B.C. in Rhodes, Alexandria, Syracuse, and Babylon.
The problem is solved by the following projection directive: One selects as the projection plane or image plane (map plane) the plane $E$ tangent to the globe at an appropriate point O (the so-called map center) of the area to be projected, and as the center of a central projection the end point $Z$ of the globe diameter OZ originating at O.
The stereographic image P' of an arbitrary point P of the globe is the point of intersection of the projection ray ZP with the image plane $E$.

Fig. 88.

The distance $r = OP'$ from the map center is given by the equation

$$r = 2\tan\zeta,$$

where $\zeta$ represents the angle formed by the projection ray ZP with the center ray ZO, and the radius of the globe is chosen as the unit length.

The stereographic projection thus defined has the following two properties:

I. Every image circle of a globe circle is a circle.
II. The stereographic map is conformal. (I.e., the map image of an angle located on the globe is an equally great angle.)

The proofs of these properties are both based on the following auxiliary theorem:

The image of a globe tangent bounded by globe and map is just as long as the tangent.

Fig. 89.
Proof of the auxiliary theorem. Let P be a point on the globe, P' its image, M the place at which the globe tangent passing through P and lying in the drawing plane ZOP meets the image plane, and at the same time (since the two tangents MO and MP are equal) the midpoint of the hypotenuse of the right triangle OPP'. The intersection point D of any other globe tangent passing through P with the image plane will then lie perpendicularly above (below) M. The image D' of D is D itself, and the image of the tangent DP is thus DP'. Now the two right triangles at M, DMP and DMP', are congruent (MD = MD and MP = MP'). Consequently, D'P' = DP, which was to be proved.

Proof of I. We will now prove the somewhat more general Chasles theorem:* The stereographic image of a globe circle is a circle whose midpoint is the stereographic projection S' of the apex S of the cone that is tangent to the globe along that circle.

Proof. In Figure 90 let P be an arbitrary point of the circle, let P' be its image, D the point of intersection of the tangent to the sphere and cone-generator SP with the image plane $E$. According to the auxiliary theorem, DP then equals DP'. Thus, if H is the point of intersection of the parallel through S' to DP with the projection ray ZP, it follows from the similarity of the triangle S'P'H to the isosceles triangle DP'P that the two segments S'P' and S'H are equal. Consequently, in the relation

$$S'H : SP = ZS' : ZS$$

derived from the ray theorem, we can replace S'H with S'P', obtaining

$$S'P' : SP = ZS' : ZS.$$

* Michel Chasles (1793-1880), French mathematician, especially well-known for his brilliantly written Aperçu historique sur l'origine et le développement des méthodes en géométrie.
Now, if P describes the circle, SP (as the distance of the apex S of the cone from a point of the circle of tangency) remains constant, and consequently, in view of the last equation, S'P' also remains constant and P' describes a circle in $E$.

If the object circle is a great circle of the globe, the apex S of the cone lies at infinity. In this case let F be the place at which the perpendicular from $Z$ on the plane of the circle meets the map plane $E$, and let V be the place at which the globe tangent through P parallel to this perpendicular meets the map plane $E$. Since, according to the auxiliary theorem, VP' = VP, the triangle VPP' is isosceles; and since VP is parallel to FZ, the triangle FZP' is also isosceles; therefore, FP' = ZF. The locus of the image point P' is thus a circle with the midpoint F and the radius ZF.

In those great circles of the globe that pass through the projection center and the map center, the midpoint F of the image circle recedes to infinity. In fact, these circles, as direct inspection will show, are transformed into straight lines by projection.

Proof of II. Let $\omega$ be an arbitrary angle on the globe, its apex P, therefore, a point on the globe, and each of its legs a globe tangent. If X and Y are accordingly the points at which the two tangents intersect the image plane $E$, then $\omega = \angle XPY$. The image $\omega'$ of this angle is the angle XP'Y. Now, since the triangles XPY and XP'Y are congruent (XY = XY; also, according to the auxiliary theorem, XP = XP' and YP = YP'), we immediately obtain $\omega' = \omega$, which was to be proved.

Fig. 91.
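Property I in the form given by Chasles, together with the distance law r = 2 tan ζ, can be checked numerically. The sketch below assumes a unit globe centered at the origin with Z = (0, 0, 1) and the map plane z = -1 tangent at O = (0, 0, -1); the axis `u` and angular radius `rho` of the test circle are arbitrary choices, and all function names are introduced only for the illustration.

```python
import math

def stereographic(p):
    """Project p from Z = (0, 0, 1) onto the map plane z = -1; returns map coordinates."""
    t = 2.0 / (1.0 - p[2])        # the ray Z + t (p - Z) reaches z = -1 at this parameter
    return (t * p[0], t * p[1])

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def unit(a):
    n = math.sqrt(sum(x * x for x in a))
    return tuple(x / n for x in a)

u = unit((1.0, 2.0, 2.0))         # axis of an assumed globe circle
rho = 0.5                         # its angular radius in radians
e1 = unit(cross(u, (0.0, 0.0, 1.0)))
e2 = cross(u, e1)

def circle_point(theta):
    """Point of the globe circle with axis u and angular radius rho."""
    return tuple(math.cos(rho) * u[i]
                 + math.sin(rho) * (math.cos(theta) * e1[i] + math.sin(theta) * e2[i])
                 for i in range(3))

# Apex S of the cone tangent to the globe along the circle, and its image S'
apex = tuple(x / math.cos(rho) for x in u)
center = stereographic(apex)

radii = []
for step in range(36):
    image = stereographic(circle_point(2 * math.pi * step / 36))
    radii.append(math.hypot(image[0] - center[0], image[1] - center[1]))
spread = max(radii) - min(radii)

# Distance law: zeta is the angle at Z between the ray ZP and the center ray ZO
p = circle_point(1.0)
zp = (p[0], p[1], p[2] - 1.0)
zeta = math.acos(-zp[2] / math.sqrt(sum(x * x for x in zp)))
r = math.hypot(*stereographic(p))

print("spread of the distances S'P':", spread)
print("r - 2 tan(zeta):", r - 2 * math.tan(zeta))
```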
Note. If instead of the tangential plane E we choose a plane parallel to it as our map plane, we obtain a similar stereographic projection, which, naturally, also possesses the fundamental properties I and II. Of particular importance is a picture plane passing through the center of the globe, especially when the north pole is chosen as the projection center and the equatorial plane is accordingly chosen as the image plane. In this case we obtain for the distance r of the image point P' from the map center O lying at the center of the globe the formula

$$r = \tan\left(45^\circ + \frac{\varphi}{2}\right),$$

where $\varphi$ is the geographic latitude of the point P. (The above cited angle $\zeta = \angle OZP$ is the base angle of the isosceles triangle OPZ in which the apex angle situated at O is the complement of the latitude $\varphi$.)

The Mercator Projection

To draw a conformal geographic map whose grid is composed of right-angle compartments.

The Mercator map, which is equally important for both geography and nautical science, was conceived by Gerhard Kremer, called Mercator (1512-1594).

On the Mercator map the equator is a segment AB, the length of which agrees with the length ($2\pi$) of the globe equator. If we divide AB into 360 equal parts and erect at the dividing points perpendiculars to AB, we thereby obtain the map meridians. The latitude parallel on the map that corresponds to the globe parallel of latitude $\varphi$ is a line parallel to AB whose distance $\Phi$ from the map equator is called the exaggerated latitude. The core of the problem consists of representing the exaggerated latitude $\Phi$ as a function of the geographic latitude $\varphi$.

In order to solve this problem we will compare the Mercator map with the (also conformal) Hipparchus map (No. 75), in which the north pole of the globe is the projection center and the plane E of the globe equator is the map plane, and in which, therefore, the globe equator is projected isometrically. Here also the globe radius will serve as the unit length.
On the Mercator map we divide the distance $\Phi$ of the latitude parallel from the equator into n equal parts, where n is a very large
number; we draw through the dividing points the latitude parallels 1, 2, 3, ..., n − 1 and call their corresponding geographic latitudes $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$, so that instead of $\varphi$ we also write $\varphi_n$. We then draw the two parallel map meridians $\Lambda'$ and $\lambda'$ corresponding to the globe meridians $\Lambda$ and $\lambda$, whose difference in longitude, measured in radians, $\varepsilon = \Lambda - \lambda$ we will make very small. We thereby obtain on the map a series of successive, very small, congruent rectangles with the base line $\varepsilon$ and the altitude $\Phi/n$.

We now do the same on the Hipparchus map. Thus, we draw the concentric map latitudes corresponding to the latitudes $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$ and call their radii $r_1, r_2, \ldots, r_n = r$. According to No. 75,

$$(1)\qquad r = \tan\left(45^\circ + \frac{\varphi}{2}\right).$$

Similarly, we draw the map meridians $\Lambda''$ and $\lambda''$ corresponding to the two longitudes $\Lambda$ and $\lambda$; these meridians are at the same time the radii of the circle of latitude of radius r. Thus, we obtain on the Hipparchus map a series of n successive, very small compartments, which we can consider as rectangles if n is sufficiently great.

Fig. 92.

We single out the compartment situated between the latitude circles of radii $r_\nu$ and $r_{\nu+1}$. Since its base line parallel to the map equator is $r_\nu$ times as great as the base line $\varepsilon$ of the first compartment, and thus also $r_\nu$ times as great as the base line $\varepsilon$ of the compartment of the Mercator map, then as a result of the conformal nature of the two maps, the altitude $r_{\nu+1} - r_\nu$ of the Hipparchus map compartment must also be $r_\nu$ times as great as the altitude $\Phi/n$ of the corresponding compartment of the Mercator map:

$$r_{\nu+1} - r_\nu = r_\nu \cdot \frac{\Phi}{n}.$$
From this it follows that

$$r_{\nu+1} = r_\nu\left(1 + \frac{\Phi}{n}\right).$$

If we construct this equation for all n compartments, $r_0$ being equal to 1, and multiply the resulting n equations together, we obtain

$$(2)\qquad r = \left(1 + \frac{\Phi}{n}\right)^n.$$

However, since for sufficiently great n the right side of this equation does not deviate noticeably from $e^\Phi$ (No. 12), we obtain the equation

$$(2a)\qquad r = e^\Phi.$$

From this we get $\Phi = \ln r$ or, because of (1),

$$(3)\qquad \Phi = \ln\tan\left(45^\circ + \frac{\varphi}{2}\right),$$

and thus the exaggerated latitude $\Phi$ is represented as a function of the geographic latitude $\varphi$.

As a result of our investigation we obtain the following

Directive for drawing a Mercator map: The map image of a point on the earth of longitude $\lambda$ and latitude $\varphi$ has a distance $\lambda$ from the zero meridian on the map and a distance of $\ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$ from the map equator. Here the angles $\lambda$ and $\varphi$ are taken as being in radian measure and the radius of the globe on which the map is based is taken as the unit length.
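As a numerical illustration of the directive and of the limiting step from (2) to (2a), the following minimal Python sketch may help. The function names are chosen here purely for illustration and do not come from the text; the globe radius is taken as the unit length, as above.

```python
import math

def exaggerated_latitude(lat_deg: float) -> float:
    """Formula (3): Phi = ln tan(45 deg + phi/2), globe radius = 1."""
    phi = math.radians(lat_deg)
    return math.log(math.tan(math.pi / 4 + phi / 2))

def product_radius(lat_deg: float, n: int) -> float:
    """Equation (2): r = (1 + Phi/n)^n for a finite number n of compartments."""
    big_phi = exaggerated_latitude(lat_deg)
    return (1 + big_phi / n) ** n

def mercator_point(lon_deg: float, lat_deg: float) -> tuple[float, float]:
    """Directive: map coordinates (distance from zero meridian, distance from equator)."""
    return math.radians(lon_deg), exaggerated_latitude(lat_deg)

if __name__ == "__main__":
    lat = 60.0
    r_exact = math.tan(math.radians(45 + lat / 2))  # Hipparchus radius, equation (1)
    for n in (10, 1000, 1_000_000):
        # the product (2) approaches the Hipparchus radius as n grows
        print(n, product_radius(lat, n), r_exact)
    print(mercator_point(10.0, lat))
```

The loop only shows how quickly the finite product in (2) settles on the value given by (1); it is a check of the argument, not a replacement for it.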
Nautical and Astronomical Problems
The Problem of the Loxodrome

To determine the length of the loxodromic line joining two points on the surface of the earth.

A loxodrome is understood to mean a line on the earth's surface that makes the same angle with all the meridians that it cuts. As long as a ship does not alter its course it is sailing on a loxodrome. The angle $\kappa$ formed by the loxodrome with the meridians it cuts is therefore called the azimuth of course.

On a Mercator map (No. 76), which is conformal and possesses rectilinear parallel meridians, the loxodrome appears as a straight line that cuts the map meridians at the angle $\kappa$.

In our study of the Mercator map we chose the radius of the globe as the unit length. Sailors use as the unit length the nautical mile (nm), which is the length of one minute of latitude on a meridian of the earth's surface or, also, the length of one minute of longitude on the equator (each being 1852 meters). Since a meridian is $\pi$ earth radii long and 180 degrees of latitude is equal to 10800 latitude minutes, the earth radius is $r = 10800/\pi$ nm long.

If we think of a Mercator map with 1:1 scale (i.e., a map whose equator is as long as the real equator), the distance between the map circle corresponding to the latitude $\varphi$ and the map equator, the so-called exaggerated latitude (according to No. 76), is

$$\Phi = r \ln\tan\left(45^\circ + \frac{\varphi}{2}\right)\ \text{nm}.$$

The two earth points O and O' whose loxodromic distance d is to be determined are given by their longitudes $\lambda$, $\lambda'$ and latitudes $\varphi$, $\varphi'$ ($> \varphi$). The exaggerated latitudes on the map are

$$\Phi = r \ln\tan\left(45^\circ + \frac{\varphi}{2}\right) \quad\text{and}\quad \Phi' = r \ln\tan\left(45^\circ + \frac{\varphi'}{2}\right)\ \text{nm},$$

the distances of the map meridians from the zero meridian $\Lambda$ and $\Lambda'$ nm, where $\Lambda$ represents the number of longitude minutes comprising $\lambda$ and $\Lambda'$ the number of longitude minutes comprising $\lambda'$.
Let us say that the map meridian through O and the map parallel through O' intersect at S. Then $OS = B$ is the exaggerated latitude difference $\Phi' - \Phi$, $O'S = L = \Lambda' - \Lambda$ (nm), OO' is the map loxodrome and $\angle O'OS = \kappa$ is the azimuth of course.

From the right map triangle OO'S we find the azimuth of course $\kappa$ by means of the equation

$$(1)\qquad \tan\kappa = \frac{L}{B}.$$

In order to determine the loxodromic distance d of the two positions on the surface of the earth we divide d into N very small equal segments $\varepsilon$ considered as being rectilinear. If we draw the meridian through one of two adjacent division points and the circle of latitude through the other, we obtain thereby a very small right triangle with the hypotenuse $\varepsilon$, whose meridional leg is the latitude difference $\beta$ (measured in nm) of the two division points and forms the angle $\kappa$ with the loxodrome, so that $\beta = \varepsilon\cos\kappa$. Every two adjacent points thus possess the same latitude difference $\beta$. The total latitude difference b (measured in nm) of the two positions O and O' on the earth's surface is therefore

$$b = N\beta = N\varepsilon\cos\kappa = d\cos\kappa.$$

Consequently, the sought-for loxodromic distance is

$$(2)\qquad d = b\sec\kappa.$$

Formulas (1) and (2) contain the solution to the problem.

Example. How great is the loxodromic distance from Valdivia ($\lambda$ = 286° 34.9' E, $\varphi$ = −39° 53.1') to Yokohama ($\lambda'$ = 139° 39.2' E, $\varphi'$ = +35° 26.6')?

Here the longitudinal difference L = 8815.7 minutes; the latitudinal difference b = 4519.7 minutes or nautical miles; the exaggerated latitude difference $B = \Phi' - \Phi$ = 4890 nm; $\kappa$, according to (1), is 60° 58' 50"; and the loxodromic distance d, according to (2), is 9317 nm.

Note. The shortest distance k between the two positions can be found by applying the cosine theorem to the spherical triangle NVY (North Pole-Valdivia-Yokohama). In this triangle NV = 90° − $\varphi$ = 129° 53.1', NY = 90° − $\varphi'$, $\angle VNY = \lambda - \lambda'$, and VY = k.
According to the cosine theorem

$$\cos k = \cos NV \cos NY + \sin NV \sin NY \cos(\lambda - \lambda')$$

or

$$\cos k = \sin\varphi\sin\varphi' + \cos\varphi\cos\varphi'\cos(\lambda - \lambda').$$
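Both the loxodromic formulas (1) and (2) and the cosine theorem for the shortest distance lend themselves to a quick numerical check. The Python sketch below is an illustration under stated assumptions, not part of the original text: the function names are invented here, longitudes and latitudes are passed as decimal degrees (south negative), the second point is assumed to lie at the higher latitude as in the text, and the longitude difference is assumed to be less than 180° so that no wrap-around handling is needed.

```python
import math

EARTH_RADIUS_NM = 10800 / math.pi  # earth radius in nautical miles, as in the text

def exaggerated_latitude_nm(lat_deg: float) -> float:
    """Exaggerated latitude in nm on a 1:1 Mercator map."""
    return EARTH_RADIUS_NM * math.log(math.tan(math.radians(45 + lat_deg / 2)))

def loxodrome(lon1: float, lat1: float, lon2: float, lat2: float) -> tuple[float, float]:
    """Return (azimuth of course kappa in degrees, loxodromic distance d in nm).

    Assumes lat2 > lat1 and a course that is not due east or west.
    """
    big_l = abs(lon2 - lon1) * 60                      # longitude difference L in minutes
    big_b = exaggerated_latitude_nm(lat2) - exaggerated_latitude_nm(lat1)
    b = (lat2 - lat1) * 60                             # latitude difference b in nm
    kappa = math.atan2(big_l, big_b)                   # formula (1): tan kappa = L / B
    d = b / math.cos(kappa)                            # formula (2): d = b sec kappa
    return math.degrees(kappa), d

def shortest_distance_nm(lon1: float, lat1: float, lon2: float, lat2: float) -> float:
    """Great-circle distance from the spherical cosine theorem, in nm."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_k = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    return math.degrees(math.acos(cos_k)) * 60         # one minute of arc = one nm

if __name__ == "__main__":
    valdivia = (286 + 34.9 / 60, -(39 + 53.1 / 60))
    yokohama = (139 + 39.2 / 60, 35 + 26.6 / 60)
    print(loxodrome(*valdivia, *yokohama))
    print(shortest_distance_nm(*valdivia, *yokohama))
```

Run with the Valdivia and Yokohama coordinates of the example, the sketch should agree with the figures quoted in the text to within rounding.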
This yields k = 153° 36.1' = 9216.1' = 9216.1 nm. The shortest distance is consequently 101 nm shorter than the loxodromic distance.

The name loxodrome stems from the Dutchman Willebrord Snell (Snellius, 1581-1626). The Portuguese mathematician Pedro Nunes (1502-1578) was the first to recognize that the loxodromic line connecting two points of the earth's surface is not the shortest connecting line and that a loxodrome continuously approaches the pole without ever reaching it.

Determining the Position of a Ship at Sea

One of the most important problems in nautical science is that of determining the position of a ship at sea. The solution is usually obtained by the method of the so-called astronomical meridian reckoning, which will be analyzed in the following example.

Problem: On board a ship in the Pacific Ocean in north latitude, on October 20, 1923, at 6:50 p.m. mean Greenwich time by the chronometer, the sun's altitude was taken in the morning (local time) as h = 21° 40.5'; the Nautical Almanac gave the declination of the sun for the time of observation as $\delta$ = 10° 10.2' S and the equation of time as e = −15 min 3 sec. The ship then sailed till noon 15.2 nm WNW, and the altitude of the sun at culmination was then measured as H = 35° 2.7' and the sun's declination determined as $\Delta$ = 10° 13'. Where was the ship?

The solution to this problem consists of four steps.

I. Determination of the meridional latitude $\Phi$.

At culmination the successive arcs (the altitude of the sun, the pole distance, the pole altitude) cover the meridional half circle above the horizon in such manner that

$$H + (90^\circ + \Delta) + \Phi = 180^\circ.$$

This gives us

$$\Phi = 90^\circ - H - \Delta = 44^\circ\,44.3'.$$

II. Determination of the latitude difference $\beta$ and the longitude difference l of the two observation points, as well as the a.m. latitude $\varphi$.

If one imagines two sufficiently close points A and B on the earth's surface, the distance between which is d nm and the line connecting
which forms the angle $\kappa$ with the longitudinal circle passing through the center M of AB, then the latitudinal difference of the two points is $d\cos\kappa$ nm, the longitudinal difference $d\sin\kappa$ nm. Since one nautical mile of latitudinal difference is equivalent to one minute of latitude difference, and one nautical mile of longitudinal difference at the latitude $\varphi$ corresponds to $\sec\varphi$ minutes of longitudinal difference, the latitudinal and longitudinal differences of A and B in minutes are

$$\beta = d\cos\kappa, \qquad l = d\sin\kappa\sec\mu,$$

where $\mu$ is the latitude of M, the so-called mean latitude of A and B.

In our example (d = 15.2, $\kappa$ = 67.5°) we find first that $\beta$ = 5.8'. From this it follows that the a.m. latitude is

$$\varphi = \Phi - \beta = 44^\circ\,38.5',$$

and the mean latitude is

$$\mu = \frac{\Phi + \varphi}{2} = 44^\circ\,41.4'.$$

Accordingly we find the longitude difference to be l = 19.75'.

Fig. 93.

III. Determination of the a.m. longitude $\lambda$.

In the formula (see Figure 93) corresponding to the nautical triangle $PZ\odot$ (pole-zenith-sun) of the a.m. observation

$$\cos z = \cos p\cos b + \sin p\sin b\cos ZP\odot,$$
we replace z, p, b, and $\angle ZP\odot$ with 90° − h, 90° + $\delta$, 90° − $\varphi$, and 180° − T (T being understood to represent the time angle of the sun), and we obtain

$$-\cos T = \tan\delta\tan\varphi + \frac{\sin h}{\cos\delta\cos\varphi}.$$

This yields the true local time T of the a.m. observation

T.L.T. = T = 134° 47.5' = 8 hr 59 min 10 sec.

From this and the time equation e we obtain the mean local time of the observation

M.L.T. = T.L.T. + e = 8 hr 44 min 7 sec.

If we reduce the mean Greenwich time of the observation by the mean local time, we obtain the western longitude $\lambda$ of the observation point in time:

$\lambda$ = M.G.T. − M.L.T. = 10 hr 5 min 53 sec.

In angular measure (1 hr time longitude = 15 degrees longitude), this comes to

$\lambda$ = 151° 28.25' W.

IV. Determination of the meridian longitude $\Lambda$.

$\Lambda = \lambda + l$ = 151° 48'.

Result:
a.m. Position: 44° 38.5' N, 151° 28.25' W,
Noon Position: 44° 44.3' N, 151° 48' W.

Gauss' Two-Altitude Problem

From the altitudes of two known stars determine the time and position.

This problem, which is very important for astronomers, geographers, and mariners, was solved by Gauss in 1812 in Bode's Astronomisches Jahrbuch.

Two stars are said to be known when their equatorial coordinates (the right ascension and declination) are known. Let these coordinates of the two stars S and S' be $\alpha\,|\,\delta$ and $\alpha'\,|\,\delta'$. In the present problem all we need in addition is the right ascension difference $\alpha' - \alpha$.

In the figure let P be the world pole; thus $PS = p = 90^\circ - \delta$
will be the pole distance from S; $PS' = p' = 90^\circ - \delta'$ will be the pole distance from S'; and $\angle SPS' = \tau$ will be the angle between the hour circles of the two stars, as well as the magnitude of the right ascension difference; let Z be the zenith of the observation point, so that $PZ = b = 90^\circ - \varphi$ is the complement of the latitude $\varphi$, $ZS = z$ the zenith distance from S, and $ZS' = z'$ the zenith distance from S', the last two being as well the complements of the altitudes h and h', respectively. We still need the auxiliary magnitudes $\angle PSS' = \sigma$, $\angle PS'S = \sigma'$, $\angle PSZ = \psi$, $\angle ZSS' = \zeta$, $\angle ZPS = t$, and the side $SS' = s$.

Fig. 94.

The computation, which is very simple, consists of three steps corresponding to the three triangles PSS', ZSS', PZS, which are taken up in that order.

I. Triangle PSS'. The angles $\sigma$ and $\sigma'$ are determined according to Napier's formulas

$$\tan\frac{\sigma + \sigma'}{2} = \frac{\cos\frac{p' - p}{2}}{\cos\frac{p' + p}{2}}\cot\frac{\tau}{2}, \qquad \tan\frac{\sigma - \sigma'}{2} = \frac{\sin\frac{p' - p}{2}}{\sin\frac{p' + p}{2}}\cot\frac{\tau}{2},$$

and the side s is determined according to the sine formula

$$\sin s : \sin p = \sin\tau : \sin\sigma'.$$
II. Triangle ZSS'. The angle $\zeta$ is calculated according to the tangent theorem for the half angle:

$$\tan\frac{\zeta}{2} = \sqrt{\frac{\sin(S - z)\sin(S - s)}{\sin S\,\sin(S - z')}},$$

where S is half the sum of the triangle sides z, z', s. In connection with this we determine $\psi = \sigma - \zeta$.

III. Triangle PZS, determination of the locale and the time. The sought-for latitude can be obtained from

$$\cos b = \cos p\cos z + \sin p\sin z\cos\psi$$

or

$$\sin\varphi = \sin\delta\sin h + \cos\delta\cos h\cos\psi.$$

The sought-for time angle T, i.e., the angle at the pole that has been described by the hour circle of the star S since its lower culmination, follows from

$$\cos t = \frac{\cos z - \cos p\cos b}{\sin p\sin b} = \frac{\sin h - \sin\delta\sin\varphi}{\cos\delta\cos\varphi}$$

and

$$T = 12\ \text{hr} \pm t,$$

where the upper sign applies when the star S at the moment of observation is in the western celestial hemisphere and the lower when it is in the eastern celestial hemisphere.

From this we obtain directly the sought-for time of the observation, the sidereal time $\Theta$ (the time angle of the Aries point), when we add the right ascension $\alpha$ to the time angle T:

$$\Theta = T + \alpha.$$

In order to obtain the mean local time (M.L.T.) of the observation we first determine, with an approximate value $\alpha_0$ of the right ascension of the mean sun for the moment of the observation, the approximate mean local time $\Theta - \alpha_0$ of the observation; then, using this already fairly exact mean local time, we determine the exact right ascension $\alpha_0$ of the mean sun for the moment of observation and finally the exact mean local time

M.L.T. = $\Theta - \alpha_0$.

We can apply this solution of the Gauss two-altitude problem directly to the solution of the very important navigational problem,
Douwes'* problem: From two altitudes of a star (the sun) with known declination and the interval between the two observations determine the latitude of the place of observation.

* Douwes was a Dutch admiralty mathematician.

We need only consider S and S', respectively, as the place, and $\delta$ and $\delta'$, respectively, as the declination of the star at the first and second observations. For fixed stars $\delta = \delta'$, while for the sun and the planets $\delta'$ differs somewhat from $\delta$. ($\tau$ is the angle, determined by the known time interval, between the hour circles of the star corresponding to the two moments of observation.)

Since the two measured altitudes are usually observed at different places A and B, while the above calculation is related to only one place, let us say B, the altitude measured at A must be "reduced to place B." For this purpose we solve the problem:

At a place A the altitude of a star is observed at a given time; at the same moment in time what is the altitude of the star at place B?

To begin with, it is clear that all places on the earth's surface at which the star has the same altitude or the same zenith distance at the given moment lie on a circle of the geosphere the spherical midpoint of which is the end point $S_0$ of the earth radius from the geocenter to the star. This circle is called the equal altitude circle of the star, its midpoint $S_0$ the star image.

Fig. 95.

In Figure 95 let $\mathfrak{A}$ and $\mathfrak{B}$ be the two equal altitude circles of the star at the given moment on which the observation points A and B lie; let $S_0$ be the star image, O the point of intersection of the great arc $S_0A$ with $\mathfrak{B}$. We will assume that the distance AB is so small that the triangle AOB can be considered plane. This gives for the difference between
the zenith distances and, consequently, also for the difference in the altitudes of the star at A and B

$$AO = AB\cos\omega,$$

where $\omega$ is the angle between the ship's course AB and the bearing AO of the star at A.

We accordingly obtain the sought-for star altitude h at B at the time of the observation made at A if we increase or reduce the star altitude measured at A by the product of the traversed distance AB and the cosine of the angle between the course and the bearing of the star at A, accordingly as the ship draws nearer to or recedes from the star.

The "reduced" altitude thus obtained must then be substituted for h in the above Gauss equation, while the altitude measured at B must be used for h'. The value for $\varphi$ obtained by this calculation is naturally the latitude of the second observation point B.

Gauss' Three-Altitude Problem

From the time intervals between the moments at which three known stars attain the same altitude, determine the moments of the observations, the latitude of the observation point and the altitude of the stars.

The significance of this Gauss method for determining time and location resides in the fact that it eliminates all observational error resulting from atmospheric refraction.

Solution. We designate the equatorial coordinates (right ascension and declination) of the three stars as $\alpha\,|\,\delta$, $\alpha'\,|\,\delta'$, $\alpha''\,|\,\delta''$, the latitude of the observation point as $\varphi$, the moments of the observations as t, t', t", and the time angles of the three stars at these moments as T, T', T", so that the differences $T' - T = t' - t$ and $T'' - T = t'' - t$ are known. This gives us the three equations

$$(1)\qquad \sin h = \sin\delta\sin\varphi - \cos\delta\cos\varphi\cos T,$$
$$(2)\qquad \sin h = \sin\delta'\sin\varphi - \cos\delta'\cos\varphi\cos T',$$
$$(3)\qquad \sin h = \sin\delta''\sin\varphi - \cos\delta''\cos\varphi\cos T''.$$

By subtracting the first two equations we obtain

$$(4)\qquad \sin\varphi\,(\sin\delta - \sin\delta') = \cos\varphi\,(\cos\delta\cos T - \cos\delta'\cos T').$$
We now introduce the half sum and half difference

$$s = \frac{\delta' + \delta}{2} \quad\text{and}\quad u = \frac{\delta' - \delta}{2}$$

and

$$S = \frac{T' + T}{2} \quad\text{and}\quad U = \frac{T' - T}{2}$$

of the declinations $\delta'$ and $\delta$ and the time angles T' and T, respectively, and accordingly replace $\delta'$ and $\delta$ in (4) by $s + u$ and $s - u$, and replace T' and T by $S + U$ and $S - U$. In the transformed equation (4) we then apply the addition theorem throughout and obtain

$$-\sin\varphi\cos s\sin u = \cos\varphi\,(\sin S\sin U\cos s\cos u + \cos S\cos U\sin s\sin u).$$

Here we divide by $\cos\varphi\cos s\sin u$ and obtain

$$-\tan\varphi = \sin S\cdot\sin U\cot u + \cos S\cdot\cos U\tan s.$$

Since U, u, and s are known, we determine the auxiliary magnitudes r and w such that

$$r\cos w = \sin U\cot u \quad\text{and}\quad r\sin w = \cos U\tan s.$$

(First w is determined from $\tan w = \tan s\tan u\cot U$ and then r from one of the two auxiliary equations.) The equation obtained then assumes the simple form

$$(\mathrm{I})\qquad -\tan\varphi = r\sin(S + w).$$

In precisely the same way, by subtracting the two equations (1) and (3), introducing the half sums and half differences

$$\mathfrak{s} = \frac{\delta'' + \delta}{2}, \quad \mathfrak{u} = \frac{\delta'' - \delta}{2}, \quad \mathfrak{S} = \frac{T'' + T}{2}, \quad \mathfrak{U} = \frac{T'' - T}{2},$$

and introducing the auxiliary magnitudes $\mathfrak{r}$ and $\mathfrak{w}$ determined by the conditions

$$\mathfrak{r}\cos\mathfrak{w} = \sin\mathfrak{U}\cot\mathfrak{u}, \qquad \mathfrak{r}\sin\mathfrak{w} = \cos\mathfrak{U}\tan\mathfrak{s},$$

we find the equation

$$(\mathrm{II})\qquad -\tan\varphi = \mathfrak{r}\sin(\mathfrak{S} + \mathfrak{w}).$$
By division of II and I we obtain the sine ratio of the two unknown angles $(\mathfrak{S} + \mathfrak{w})$ and $(S + w)$,

$$(\mathrm{III})\qquad \frac{\sin(\mathfrak{S} + \mathfrak{w})}{\sin(S + w)} = \frac{r}{\mathfrak{r}}.$$

However, since the difference

$$(\mathfrak{S} + \mathfrak{w}) - (S + w) = \frac{T'' - T'}{2} + \mathfrak{w} - w$$

of these angles is known, it is easy to calculate the sum of the angles by applying the sine tangent theorem (No. 40) to (III). From the sum and the difference we obtain directly the angles $\mathfrak{S} + \mathfrak{w}$ and $S + w$ themselves and consequently also the unknown angles

$$\mathfrak{S} = \frac{T'' + T}{2} \quad\text{and}\quad S = \frac{T' + T}{2}.$$

From S and the known difference $T' - T$ we then obtain the sought-for time angles T and T'; from $\mathfrak{S}$ and the known difference $T'' - T$ we obtain in similar fashion the time angles T and T". By adding the right ascension to the time angle we finally obtain the moments of the observations in sidereal time.

The sought-for latitude then follows from (I) or (II), the sought-for altitude h from (1), (2), or (3).

Note. If the latitude is to be determined from two observations of the same star altitude and the time interval between them, we have at our disposal only equations (1) and (2) and must assume that the time angle T for one of the observations is known. Equation (I), all the magnitudes on the right side of which are known, then gives $\varphi$.

A remarkable special case of this situation is the

Problem of Riccioli: From the time between the culminations of two known stars that rise or set at the same time, find the latitude of the observation point.

This problem, posed by Riccioli in 1651, is especially noteworthy in that the method employed makes possible determinations of latitude without an angle-measuring instrument.

If T and T' are the time angles of the star risings, their difference $2U = T' - T$ is also the time between their culminations. Our initial equations (1) and (2) are simplified here (because h = 0) to

$$\cos T = \tan\delta\tan\varphi \quad\text{and}\quad \cos T' = \tan\delta'\tan\varphi.$$
We introduce the complements t and t' of the time angles and obtain

$$\sin t = \tan\delta\tan\varphi, \qquad \sin t' = \tan\delta'\tan\varphi,$$

and from this by division we get the sine ratio of the angles t and t':

$$\sin t : \sin t' = \tan\delta : \tan\delta'.$$

Since $t - t' = T' - T$ is known, we obtain $t + t'$ from this equation, in accordance with the sine-tangent theorem. We then get $2t = (t + t') + (t - t')$ and finally $\varphi$ from $\sin t = \tan\delta\tan\varphi$.

The Kepler Equation

From the mean anomaly of a planet calculate the eccentric and true anomaly.

Johannes Kepler (1571-1630) was one of the greatest astronomers of all time. The famous problem named after him is to be found in the 60th chapter of Kepler's major work Astronomia nova, published in Prague in 1609, a book that, according to Lalande, every astronomer must read at least once.

Before taking up the solution we will present a short explanation of the three anomalies.

Fig. 96.

Let S and P be the midpoints of the sun and a planet, respectively, let N be the point of the planet's orbit at which the planet is nearest to the sun, the so-called perihelion, let O be the midpoint of the elliptical orbit and of its circle of circumscription, $P_0$ the point of intersection of the circle of circumscription with the parallel drawn through P to the minor orbit axis, a and b the major and minor semiaxes of the ellipse, respectively, $OS = e$ the linear eccentricity, $\varepsilon = e/a$ the
astronomic eccentricity or form number, T the period of revolution of the planet, and t the time elapsed at the planet's position P since its passage through the perihelion.

The true anomaly W is the angle NSP, i.e., the angle described by the focal radius of the planet in the time t; the mean anomaly M is the angle that the focal radius would describe in the time t if it were to revolve uniformly (with the same period of revolution T), so that in angular measure

$$M = \frac{2\pi}{T}\,t.$$

Finally, the eccentric anomaly E is the angle $NOP_0$ formed by the radius of the circle of circumscription to $P_0$ with the radius of the circle of circumscription ON.

With E as a variable parameter we have

$x = a\cos E,\quad y = b\sin E$ as the equation of the orbit,
$x = a\cos E,\quad y_0 = a\sin E$ as the equation of its circle of circumscription.

There exists between the eccentric and true anomaly the relation (obtainable from the right triangle with the legs $x - e$ and y)

$$\tan W = \frac{b\sin E}{a\cos E - e};$$

after squaring and use of the formulas $b^2 = a^2 - e^2$, $e = a\varepsilon$, and $\cos^2 E + \sin^2 E = 1$, $\sec^2 W - \tan^2 W = 1$, this relation is transformed into

$$\cos W = \frac{\cos E - \varepsilon}{1 - \varepsilon\cos E}.$$

In order to obtain, in addition, a formula that is convenient for logarithmic treatment, Gauss introduced the half angles $\tfrac{1}{2}W$ and $\tfrac{1}{2}E$ and made use of the formulas

$$1 + \cos\varphi = 2\cos^2\frac{\varphi}{2} \quad\text{and}\quad 1 - \cos\varphi = 2\sin^2\frac{\varphi}{2}.$$

We write the above equation

$$\frac{1 - \cos W}{1 + \cos W} = \frac{1 + \varepsilon}{1 - \varepsilon}\cdot\frac{1 - \cos E}{1 + \cos E}$$

and obtain the Gauss formula:

$$\tan\frac{W}{2} = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}}\,\tan\frac{E}{2}.$$
There exists between the eccentric and mean anomaly (in radian measure) the famous Kepler equation:

$$E - \varepsilon\sin E = M.$$

This equation is a consequence of the formula

$$J = \frac{ab}{2}\,(E - \varepsilon\sin E)\,^*$$

for the area J of the elliptical sector SNP and of the Kepler surface theorem: "The focal radius of a planet sweeps equal surfaces in equal times."

[According to the area formula, the area of the half ellipse ($E = \pi$) is $\tfrac{1}{2}\pi ab$; the area of the whole ellipse is thus $\pi ab$. According to Kepler's surface theorem, there exists the proportion $J : \pi ab = t : T$. Consequently, $E - \varepsilon\sin E = 2\pi t : T = M$.]

The crux of the Kepler problem now consists of the solution of the Kepler equation

$$E - \varepsilon\sin E = M$$

for the unknown E (when M and $\varepsilon$ are assumed to be known).

The following determination of E rests upon the assumption that the form number $\varepsilon$ is a proper fraction and consists in the calculation of a series $E_1, E_2, E_3, \ldots$ of approximate values for the eccentric anomaly that deviate progressively less and less from the true value E as the index number increases and approximate the true value sufficiently closely at a relatively low index number.

For the first approximation value we choose

$$E_1 = M + \varepsilon\sin M.$$

Its deviation from the true value E is

$$E - E_1 = \varepsilon\,(\sin E - \sin M).$$

However, since

$$|\sin E - \sin M| < |E - M| = |\varepsilon\sin E| < \varepsilon,$$

it follows that

$$|E - E_1| < \varepsilon^2.$$

* This formula is obtained as follows: Since the circle sector $ONP_0$ has the area $J_0 = \tfrac{1}{2}a^2E$ and each ordinate of the elliptical sector ONP is equal to b/a times the circle ordinate at that point, the area of the sector ONP is also equal to b/a times $J_0$, i.e., $\tfrac{1}{2}abE$. Consequently, the area J of the elliptical sector SNP, which is smaller than ONP by the area $\tfrac{1}{2}ey = \tfrac{1}{2}ab\,\varepsilon\sin E$ of the triangle OSP, is $J = \tfrac{1}{2}abE - \tfrac{1}{2}ab\cdot\varepsilon\cdot\sin E$.
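The successive approximation introduced here, in which each new value is obtained by substituting the previous one into $M + \varepsilon\sin E$, can be sketched in a few lines of Python. This is only an illustration: the function name and the sample values of M and the form number are assumptions made for the example, angles are in radians, and the form number is assumed to be a proper fraction as required above.

```python
import math

def eccentric_anomaly(mean_anomaly: float, ecc: float, n: int) -> float:
    """Return the n-th approximation E_n of the root of E - ecc*sin(E) = M.

    E_1 = M + ecc*sin(M); every further value is M + ecc*sin(previous value).
    Angles are in radians; 0 <= ecc < 1 and n >= 1 are assumed.
    """
    e_approx = mean_anomaly + ecc * math.sin(mean_anomaly)  # first approximation E_1
    for _ in range(n - 1):
        e_approx = mean_anomaly + ecc * math.sin(e_approx)  # next approximation
    return e_approx

if __name__ == "__main__":
    m, ecc = 1.0, 0.1                                # sample values, chosen for illustration
    reference = eccentric_anomaly(m, ecc, 50)        # effectively converged value
    for n in range(1, 5):
        e_n = eccentric_anomaly(m, ecc, n)
        # deviation from the converged value next to the bound ecc**(n + 1)
        print(n, abs(reference - e_n), ecc ** (n + 1))
```

Printing the deviation beside the corresponding power of the form number makes it easy to see how quickly the approximations close in on the true value for small eccentricities.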
As the second approximation value we choose

$$E_2 = M + \varepsilon\sin E_1.$$

Its deviation from E is

$$E - E_2 = \varepsilon\,(\sin E - \sin E_1).$$

However, since $|\sin E - \sin E_1| < |E - E_1|$ and the latter magnitude, as was just shown, is $< \varepsilon^2$, it follows that

$$|E - E_2| < \varepsilon^3.$$

The third approximation value is $E_3 = M + \varepsilon\sin E_2$. Its deviation from E, absolutely considered, is $< \varepsilon^4$, etc.

The nth approximation value deviates from the true value by less than the (n + 1)th power of the form number $\varepsilon$.

The approximation values accordingly approach the true value progressively more rapidly as $\varepsilon$ diminishes. In the earth's orbit, for example, $\varepsilon$ = 0.01674, $\varepsilon^3$ = 0.00000469, arc 1" = 0.00000485. Consequently: For the earth's orbit the second approximation value is already exact to seconds! In the orbit of Mars, which has the fairly high form number of 0.0933, $\varepsilon^5$ = 0.0000071, so that the fourth approximation value $E_4$ results in an error of less than 2".

After E is determined the true anomaly is calculated by the Gauss formula.

Note. Kepler's problem is of the greatest importance for astronomy. It forms the basis, for example, for the determination of the equation of time for a given moment of time. [The equation of time is conventionally understood to be the difference between mean and true local time or also the difference between the right ascensions $\alpha$ and $\alpha_0$ of the true and mean sun: e = M.L.T. − T.L.T. = $\alpha - \alpha_0$.]

The calculation is based on the following seven steps:

1. Determination of the right ascension $\alpha_0$ of the mean sun for the given moment of time from its daily increase of 3 min 56.55536 sec and its value for a fixed moment of time (on January 1, 1925, at midnight M.G.T. it was $\alpha_0$ = 18 hr 40 min 30 sec).
2. Calculation of the mean anomaly M according to the (definition) equation $\alpha_0 = M + \Pi$, where $\Pi$ is the longitude of the true sun at perigee. ($\Pi$ on January 1, 1925, was 281° 39' 2" and it increases annually by 1' 1.9".)

3. Determination of the eccentric anomaly E from Kepler's equation $E - \varepsilon\sin E = M$ with $\varepsilon$ = 0.01674.

4. Calculation of the true anomaly W from the Gauss formula

$$\tan\frac{W}{2} = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}}\,\tan\frac{E}{2}.$$

5. Determination of the longitude L of the true sun according to the equation $L = W + \Pi$.

6. Determination of the right ascension $\alpha$ of the true sun in accordance with the equation $\tan\alpha = \cos i\tan L$ obtained from the astronomical triangle having the hypotenuse L and the legs $\alpha$ and $\delta$; in the equation, i represents the inclination of the ecliptic.

7. Calculation of the equation of time e from $e = \alpha - \alpha_0$.

Example. The equation of time for the 2nd of December, 1925 at 4:00 p.m. Central European Time.

$\alpha_0$ = 16 hr 43 min 44 sec = 250° 56',
M = 329° 16' 1",
$E_1$ = 328° 46' 38",
$E_2$ = E = 328° 46' 12",
W = 328° 16' 10",
L = 249° 56' 9",
$\alpha$ = 248° 17' 28" = 16 hr 33 min 10 sec,
e = −10 min 34 sec.

Star Setting

Calculate the time and azimuth of setting of a known star for a given place and day.

Solution. The method of calculation can best be illustrated by a numerical example. Thus, let us consider a more definite form of the problem:

On the 31st of December, 1932, when did Saturn set in Nördlingen, Bavaria ($\varphi$ = 48° 51.1', $\lambda$ = 10° 29.4')?

The nautical almanac gives the following data for December 31, 1932 at midnight, mean Greenwich time:

right ascension of Saturn $\alpha$ = 20 hr 25 min 30 sec (hourly increase = 1.2 sec),
declination of Saturn $\delta$ = 19° 47.4' S (hourly decrease = 0.06'),
right ascension of the mean sun $\alpha_0$ = 18 hr 36 min 50 sec (hourly increase = 9.86 sec).
At the moment of setting the star is already in reality a certain distance h below the horizon (SN) as a result of atmospheric refraction. The horizontal refraction h can be set at an average of 35', but in precise measurements special refraction tables must be consulted.

It follows from the nautical triangle $PZ{\ast}$ (in which $PZ = b = 90^\circ - \varphi$ represents the complement of the latitude $\varphi$, $P{\ast} = p = 90^\circ + \delta$ the pole distance, $Z{\ast} = z = 90^\circ + h$ the zenith distance, $\angle ZP{\ast} = t$ the hour angle, and $\angle PZ{\ast} = a$ the azimuth of the star), according to the cosine theorem, that

$$\cos z = \cos b\cos p + \sin b\sin p\cos t.$$

If we introduce the magnitudes h, $\varphi$, $\delta$ here instead of z, b, p, we obtain

$$\cos t = \tan\varphi\tan\delta - \frac{\sin h}{\cos\varphi\cos\delta}.$$

First we calculate the approximate time t of setting, taking for the moment of setting $\delta$ = 19° 47.4'. We then obtain from the formula we have found (assuming h = 35'),

t = 66° 42.8' = 4 hr 26 min 51 sec

and for the time angle T of the moment of setting

T = 16 hr 26 min 51 sec.

From this we get for the sidereal time $\Theta$ (i.e., the time angle of the vernal equinox) the approximate value

$\Theta = T + \alpha$ = 36 hr 52 min 21 sec,

and thus for the mean local time of setting

M.L.T. = $\Theta - \alpha_0$ = 18 hr 15 min 31 sec
and for the mean Greenwich time
M.G.T. = M.L.T. − ($\lambda$ = 41 min 58 sec) = 17 hr 33 min 33 sec.
At the moment of setting, then, approximately 17.55 hr have gone by since midnight mean Greenwich time. In these 17.55 hr the three magnitudes $\alpha$, $\delta$, and $\alpha_0$ increase by 21 sec, −1.1', 2 min 53 sec, so that at the moment of setting they have the values $\alpha$ = 20 hr 25 min 51 sec, $\delta$ = 19° 46.3', $\alpha_0$ = 18 hr 39 min 43 sec. The calculation must now be repeated with these exact values. This gives

$T$ = 16 hr 26 min 57 sec
$\alpha$ = 20 hr 25 min 51 sec
$\Theta$ = 36 hr 52 min 48 sec
$\alpha_0$ = 18 hr 39 min 43 sec
M.L.T. = 18 hr 13 min 5 sec
M.G.T. = 17 hr 31 min 7 sec.

The sought-for azimuth $a$ is computed from the sine formula $\sin a : \sin t = \sin p : \sin z$ and comes out to be $a$ = 120° 10'.

Result. Saturn set at 18 hr 31.1 min C.E.T. at an azimuth of S 59° 50' W.

Note. The method described is naturally just as well suited to the determination of the rising time or the time at which a star attains a prescribed altitude. If it is specifically desired to determine the moment of culmination, the logarithmic calculation can be dispensed with, since the time angle of culmination, $T$ = 12 hr, is known.

The Problem of the Sundial

To construct a sundial.

First we will consider the two simplest forms of sundial: the horizontal dial and the vertical meridional dial. In the first the plane of the dial E is horizontal, in the second vertical, specifically through the eastern and western points of the horizon. The earth's axis is represented by a pin, the gnomon or style that casts a shadow on E. At noon the shadow is situated at its center position, the meridian line of the
dial plane, and at $t$ hr before or after noon forms the "shadow angle" $s$ or $\sigma$, respectively, with the meridian line. The problem is to determine the relation between the time $t$ and the shadow angle.

We will call the plane formed by the sun and the earth's axis (the gnomon) the shadow plane, since the shadow must lie in this plane. At noon the shadow plane at its central position passes through the north and south points of the horizon and at time $t$ forms the angle $t$ ($t$ hr = $15t°$) with its central position.

In the figure let US, UO, and UZ be segments running from U toward the southern point, the eastern point, and the zenith of the horizon, specifically in such manner that SZ represents the gnomon; thus $\angle USZ$ represents the latitude $\varphi$ of the place and SOZ the shadow plane, so that SO is the shadow and $\angle USO$ the shadow angle $s$ of the horizontal dial, while ZO is the shadow and $\angle UZO$ the shadow angle $\sigma$ of the vertical meridional dial. The angle $t$ between the shadow plane SOZ and its meridional position SUZ is the angle UFO that is formed with UF by the perpendicular OF dropped from O to SZ.

If we select SZ as the unit length and, for the sake of brevity, set $\cos\varphi = o$, $\sin\varphi = i$, it follows from the right triangle SUZ that $US = o$, $UZ = i$, $UF = oi$, from the right triangle UFO that $UO = oi \tan t$, and from the right triangles USO and UZO that $UO = o \tan s$ and $UO = i \tan\sigma$. If we set the three values for UO equal to each other, we get the equations
(1) $\tan s = i \tan t$,
(2) $\tan\sigma = o \tan t$,
which contain the sought-for relations between the time $t$ and the shadow angles $s$ and $\sigma$, respectively.
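Relations (1) and (2) lend themselves to a small table of hour lines. The following is a minimal sketch, not part of the original text: the function name `shadow_angles` and the sample latitude 48.85° are illustrative assumptions.

```python
import math

def shadow_angles(latitude_deg, hours_from_noon):
    """Shadow angles (s, sigma) in degrees from relations (1) and (2).

    s     -- horizontal dial:          tan s     = sin(phi) * tan(t)
    sigma -- vertical meridional dial: tan sigma = cos(phi) * tan(t)
    (Hypothetical helper; the name is not from the text.)
    """
    phi = math.radians(latitude_deg)
    t = math.radians(15.0 * hours_from_noon)  # t hr correspond to 15 t degrees
    # atan2(c*sin t, cos t) agrees with atan(c*tan t) for |t| < 90 degrees
    # and stays finite at t = 90 degrees.
    s = math.degrees(math.atan2(math.sin(phi) * math.sin(t), math.cos(t)))
    sigma = math.degrees(math.atan2(math.cos(phi) * math.sin(t), math.cos(t)))
    return s, sigma

if __name__ == "__main__":
    # Assumed sample latitude; any latitude between 0 and 90 degrees works.
    for hours in range(0, 7):
        s, sigma = shadow_angles(48.85, hours)
        print(f"{hours} hr: s = {s:6.2f} deg, sigma = {sigma:6.2f} deg")
```

At latitude 45° the two dials carry identical hour lines, since $i = o$ there; at the pole the horizontal dial has $s = t$.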
In order to construct the dial we compute, in accordance with (1) or (2), the shadow angle corresponding to different times $t$, draw them in, but write on their free leg not $s$ or $\sigma$, but the corresponding times $t$.

It is also possible to use a purely graphic method. On an arbitrary segment AB we begin at B and mark off $i$ or $o$ times its length to C, draw the semicircle with the center C and the arc center B, and draw the tangent through B which is at the same time perpendicular to AC.

Fig. 99.

If we now make the arc BT equal to the time angle $t$ (thus, for example, 45° for 3 hr), extend CT to the intersection J with the tangent, and connect J with A, then $\angle BAJ = \omega$ is the shadow angle $s$ or $\sigma$ for time $t$. [From $\triangle BJA$ it follows that $BJ = BA \tan\omega$, from $\triangle BJC$ that $BJ = BC \tan t$, so that $BA \tan\omega = BC \tan t$ or, since BC is $i$ or $o$ times BA, $\tan\omega = i \tan t$ or $\tan\omega = o \tan t$. According to (1), $\omega$ is equal to $s$ and according to (2), $\omega = \sigma$.]

We carry out the described construction for as many time angles $t$ as possible and obtain the dial as the totality of lines AJ each of which bears written on it its corresponding time. In order to install it, we place the drawing plane horizontally, so that BA points from the northern point of the horizon to the southern point, or vertically, so that BA points perpendicularly upward and the tangent runs from west to east, and fix the style parallel to the earth's axis at A.

A Vertical Sundial at an Arbitrary Azimuth

Let us now consider the case in which a sundial is to be fastened to a vertical house wall that does not run east and west.
In Figure 100, let UZ be a vertical line on the wall and UH a horizontal line on the wall, US a horizontal pointing south, ZS the gnomon, so that $\angle USZ = \varphi$ and $\angle UZS = b = 90° - \varphi$; UZS is the meridian plane and $\angle SUH = a$ the azimuth (calculated from the south point) of the wall; ZH is the shadow at time $t$, so that ZSH is the shadow plane, and the angle that it forms with the meridian plane ZSU is the time angle $t$; finally, the angle that ZH forms with ZU is the shadow angle $\sigma$.

Fig. 100.

The three-dimensional vertex Z with the edges ZU, ZH, ZS cuts out of the sphere with the center Z a spherical triangle (shown in the figure) in which the side $\sigma$, the angle $a$, the side $b$, and the angle $t$ are four successive elements. According to the cotangent theorem, therefore,
$$\cos b \cos a = \sin b \cot\sigma - \sin a \cot t$$
or
$$\cos\varphi \cot\sigma - \sin a \cot t = \sin\varphi \cos a.$$
This is the relation between the time $t$ and the shadow angle $\sigma$. It makes it possible to calculate a corresponding $\sigma$ for every $t$.

The invention of the sundial is lost in antiquity. A statement by Vitruvius (which was also found engraved on an ancient sundial unearthed on the Via Flaminia), according to which the inventor is
the Chaldaean Berosus, is not reliable in view of the fact that sundials were known in ancient Babylonia many centuries before Berosus.

The Shadow Curve

To determine the curve described by the shadow of a point of a rod in the course of a day, when the rod is erected at a place of latitude $\varphi$ and the declination of the sun for the day has a value of $\delta$.

Solution. We select the perpendicular from the point of the rod to the horizon of the place as the unit length and the base point O of the perpendicular as the origin of a rectangular coordinate system whose x-axis runs toward the north point and whose y-axis runs toward the west point of the horizon. At the moment in which the sun (☉) has the azimuth S $a$° E and the zenith distance $z$, the distance of the shadow from O is $\tan z$, and the abscissa and ordinate, respectively, of the shadow are
$$x = \tan z \cos a, \qquad y = \tan z \sin a.$$
In the nautical triangle PZ☉ the latitude complement $PZ = b$ and the pole distance $P☉ = p = 90° - \delta$ are constant. The zenith distance $Z☉ = z$, the azimuth supplement $\angle PZ☉ = 180° - a$ and the hour angle $\angle ZP☉ = t$ are variable.

We find the equation of the shadow curve by expressing $\sin t$ and $\cos t$ in terms of $x$ and $y$ and introducing the resulting expressions into the equation $\cos^2 t + \sin^2 t = 1$. We abbreviate $\sin\varphi$, $\cos\varphi$, and $\tan\varphi$ as $i$, $o$, and $q$, respectively, and $\sin p$, $\cos p$, and $\tan p$ as $I$, $O$, and $Q$, respectively. If we then apply to the nautical triangle the sine theorem, cosine theorem, and cotangent theorem in that order, we obtain the three equations
$$\sin a \sin z = \sin p \sin t,$$
$$\cos z = \cos p \cos b + \sin p \sin b \cos t,$$
$$-\cos b \cos a = \sin b \cot z - \sin a \cot t.$$
We divide the first by the second and obtain
$$\sin a \tan z = \frac{\tan p \sin t}{\sin\varphi + \cos\varphi \tan p \cos t}$$
or
(1) $$y = \frac{Q \sin t}{i + oQ \cos t}.$$
We multiply the third by $-\tan z$ and obtain
$$\sin\varphi \cos a \tan z = \sin a \tan z \cot t - \cos\varphi$$
or
(2) $$ix = y\,\frac{\cos t}{\sin t} - o.$$
From (1) and (2) we find
$$Q \cos t = \frac{o + ix}{i - ox}, \qquad Q \sin t = \frac{y}{i - ox},$$
and from this, in accordance with what was stated above, we obtain
$$(o + ix)^2 + y^2 = Q^2 (i - ox)^2$$
as the equation of the shadow curve. We solve for $y^2$ and obtain
$$y^2 = (Q^2 i^2 - o^2) - 2io(Q^2 + 1)x + (Q^2 o^2 - i^2)x^2$$
or, if we go on to divide by $o^2$,
$$\frac{y^2}{o^2} = (Q^2 q^2 - 1) - 2q(Q^2 + 1)x + (Q^2 - q^2)x^2.$$
To put this equation into a simpler form, we introduce a new coordinate system X, Y whose origin U is situated at the apex of the curve, i.e., at the point where the shadow lies at noon; the X-axis runs toward the south and the Y-axis toward the west. When the sun is at meridian, its zenith distance is $p - b$, and thus
$$UO = \alpha = \tan(p - b) = \frac{\tan p - \tan b}{1 + \tan p \tan b} = \frac{Qq - 1}{Q + q}.$$
We accordingly introduce
$$x = \alpha - X, \qquad y = Y$$
into the above curve equation and obtain
$$\frac{Y^2}{o^2} = 2Q(1 + q^2)X + (Q^2 - q^2)X^2$$
or, if we write the first parenthesis as $1/o^2$ and the second as $\dfrac{1}{O^2} - \dfrac{1}{o^2}$ and multiply the equation by $o^2$,
$$Y^2 = 2QX - \left(1 - \frac{o^2}{O^2}\right)X^2.$$
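The last equation can be checked numerically against shadow positions computed directly from the sun's direction. The sketch below is an illustration under stated assumptions: the helper names `shadow_point` and `conic_residual`, and the sample values of latitude and declination, are not from the text, and the hour angle is counted before noon so that the shadow falls toward the west.

```python
import math

def shadow_point(phi_deg, delta_deg, t_deg):
    """Shadow (x, y) of a point at unit height; x toward north, y toward west.

    t_deg is the hour angle counted before noon (sun in the east).
    Hypothetical helper based on the sun's horizon components.
    """
    phi, delta, t = (math.radians(v) for v in (phi_deg, delta_deg, t_deg))
    up = math.sin(phi) * math.sin(delta) + math.cos(phi) * math.cos(delta) * math.cos(t)
    north = math.cos(phi) * math.sin(delta) - math.sin(phi) * math.cos(delta) * math.cos(t)
    east = math.cos(delta) * math.sin(t)
    # The shadow lies opposite the sun's horizontal direction, scaled by tan z.
    return -north / up, east / up

def conic_residual(phi_deg, delta_deg, t_deg):
    """Y^2 - [2 Q X - (1 - o^2/O^2) X^2]; zero if the shadow lies on the conic."""
    x, y = shadow_point(phi_deg, delta_deg, t_deg)
    p = math.radians(90.0 - delta_deg)                     # pole distance of the sun
    Q = math.tan(p)
    ratio = math.cos(math.radians(phi_deg)) / math.cos(p)  # o / O
    alpha = math.tan(math.radians(phi_deg - delta_deg))    # noon shadow, tan(p - b)
    X, Y = alpha - x, y
    return Y * Y - (2.0 * Q * X - (1.0 - ratio * ratio) * X * X)

if __name__ == "__main__":
    # Assumed sample values: latitude 50 degrees, declination 20 degrees.
    for t in (0, 15, 30, 45, 60):
        print(t, conic_residual(50.0, 20.0, t))
```

The residuals should vanish up to rounding error for every hour angle at which the sun is above the horizon.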
The amplitude equation of the shadow curve thus reads
$$Y^2 = 2 \tan p \cdot X - \left(1 - \frac{\cos^2\varphi}{\cos^2 p}\right)X^2.$$
The curve is consequently a conic section with the half parameter $\tan p$ and the form number (eccentricity) $\cos\varphi/\cos p$. If the latitude is equal to the polar distance of the sun, then the shadow describes a parabola; at higher latitudes it describes an ellipse, and at lower a hyperbola.

Solar and Lunar Eclipses

To determine the beginning and end of a solar eclipse, together with the maximum fraction of the solar disc that is obscured, if the right ascensions, declinations, and radii of the sun and moon are known for two moments in time sufficiently close to the time of the eclipse.

Example. At the famous solar eclipse that occurred at Athens during the Peloponnesian War on August 3, 431 B.C., the magnitudes mentioned had, at 4:30 p.m. and 5:30 p.m. mean Athenian time, the values

$A_0$ = 126° 51' 52", $\Delta_0$ = 19° 23' 46", $R_0$ = 15' 52", $\alpha_0$ = 126° 40' 55", $\delta_0$ = 19° 38' 58", $r_0$ = 15' 38.5"

and

$A_1$ = 126° 54' 21", $\Delta_1$ = 19° 23' 11", $R_1$ = 15' 52", $\alpha_1$ = 127° 8' 49", $\delta_1$ = 19° 24' 30", $r_1$ = 15' 36.5".

A solar eclipse can only occur at a time when the moon is sufficiently close to the sun on the celestial sphere, i.e., at a time when the differences $a = \alpha - A$ and $d = \delta - \Delta$ between the right ascensions and declinations of the two bodies are sufficiently small.

The spherical cosine theorem gives for the spherical distance $z$ of the midpoints of the two bodies (their central axis) the formula
$$\cos z = \sin\Delta \sin\delta + \cos\Delta \cos\delta \cos a.$$
We replace $\cos z$ and $\cos a$ here by $1 - 2\sin^2\dfrac{z}{2}$ and $1 - 2\sin^2\dfrac{a}{2}$
and obtain
$$1 - 2\sin^2\frac{z}{2} = \cos d - 2\cos\Delta \cos\delta \sin^2\frac{a}{2}.$$
If we now write $1 - 2\sin^2(d/2)$ for $\cos d$, we obtain
$$\sin^2\frac{z}{2} = \cos\Delta \cos\delta \sin^2\frac{a}{2} + \sin^2\frac{d}{2}.$$
If we now consider that, according to our assumption, $a$ and $d$ and, therefore, also $z$ are small angles that in no case exceed 1°, we can substitute the angles themselves for their sine (No. 15) and write
$$z^2 = a^2 \cos\Delta \cos\delta + d^2.$$
If in addition to this we introduce the abbreviations $\sqrt{\cos\Delta \cos\delta} = g$ and $ag = x$ and substitute $y$ for $d$, we obtain the simple equation
$$z^2 = x^2 + y^2.$$
The magnitudes $a$, $x$, $y$, and $z$ are most conveniently measured in angular seconds.

If the right ascensions and declinations of the moon and the sun for two moments of time sufficiently close to the time of the eclipse (the first moment being taken as the zero point of time) are known and are, for example, $\alpha_0$, $A_0$, $\delta_0$, and $\Delta_0$ for the first moment and $\alpha_1$, $A_1$, $\delta_1$, and $\Delta_1$ for the second, then we also know the values $a$, $d$, and $g$, and therefore also $x = ga$ and $y = d$ for these moments in time, and we can calculate from these the hourly increases $h$ and $k$ of $x$ and $y$. Since the eclipse lasts only a short time, we can assume that the magnitudes $x$ and $y$ change uniformly in the period of time here under consideration and that, consequently, at time $t$, i.e., at $t$ hours after moment 0,
$$x = x_0 + ht \quad \text{and} \quad y = y_0 + kt.$$
If we introduce these values into the above equation, it assumes the form
$$z^2 = (x_0 + ht)^2 + (y_0 + kt)^2,$$
which permits us to calculate the central axis of the two bodies for any moment $t$.

The eclipse begins and ends at the moments when the central axis $z$ is equal to the sum $s$ of the two radii $R$ and $r$. In the period of time
under consideration the solar radius does not change ($R = R_0 = R_1$), while the lunar radius exhibits the slight hourly increase $p$ = −2", so that $r = r_0 + pt$ and $s = R + r = R + r_0 + pt = s_0 + pt$. We therefore obtain for the desired moment $t$ of the beginning (and also the end) of the eclipse the so-called

Eclipse equation:
$$(x_0 + ht)^2 + (y_0 + kt)^2 = (s_0 + pt)^2.$$
This quadratic equation has two roots for the unknown $t$; the smaller value, $t'$, indicates the beginning of the eclipse, and the larger, $t''$, the end.

The maximum eclipse occurs at the moment $\tau$ in which the central axis $z$ reaches its minimum value $\zeta$. Thus, we have
$$z^2 = z_0^2 + 2mt + n^2 t^2,$$
where
$$z_0^2 = x_0^2 + y_0^2, \qquad m = x_0 h + y_0 k, \qquad n^2 = h^2 + k^2.$$
If we write
$$z^2 = z_0^2 - \frac{m^2}{n^2} + \left[nt + \frac{m}{n}\right]^2,$$
we see that $z$ attains its minimum value when the bracket disappears. We then have
$$\tau = -\frac{m}{n^2} \quad \text{and} \quad \zeta = \sqrt{z_0^2 - \frac{m^2}{n^2}}.$$
At the moment of the maximum eclipse the moon has advanced over the solar disc by $(R + r - \zeta)/2R$ of the sun's diameter. The fraction of the solar disc that is covered by the moon at that moment can also be calculated easily from $\zeta$.

Carrying out the computations for the Athenian solar eclipse, we obtain:

$a_0$ = −657 (−10' 57"), $\log g_0$ = 9.97428, $x_0$ = −619.2, $y_0$ = +912 (+15' 12"),
$a_1$ = +868 (+14' 28"), $\log g_1$ = 9.97462, $x_1$ = +818.7, $y_1$ = +79 (+1' 19"),
$h = x_1 - x_0$ = 1438, $k = y_1 - y_0$ = −833,
$s_0$ = 1890.5, $s_1$ = 1888.5, $p = s_1 - s_0$ = −2
and the eclipse equation is
$$(-619 + 1438t)^2 + (912 - 833t)^2 = (1890.5 - 2t)^2$$
or
$$2761729t^2 - 3292074t - 2359085 = 0$$
or
$$t^2 - 1.192034t = 0.8542059159.$$
Its roots are
$$t' = -0.50373, \qquad t'' = 1.69576.$$
Converting the decimals into minutes and seconds, we obtain −30 min 13 sec and 1 hr 41 min 45 sec, respectively. Consequently:

Beginning of eclipse: 3 hr 59 min 47 sec,
End of eclipse: 6 hr 11 min 45 sec.

The length of the eclipse was therefore 2 hr 12 min, the moment of maximum eclipse 5 hr 5 min 46 sec [$2\tau = t' + t''$ gives $\tau$ = 0.596]. The central axis of the sun and moon at this moment is obtained from
$$\zeta^2 = (619 - 1438 \cdot 0.596)^2 + (912 - 833 \cdot 0.596)^2;$$
it is $\zeta = \sqrt{238^2 + 415.5^2}$ = 479, i.e., 8'. The moon then covers $(R + r - \zeta)/2R$, i.e., 74% of the central solar diameter and 67% of the solar disc.

Lunar eclipses are treated in a similar way. But here, instead of being concerned with the sun, we are concerned with the so-called shadow circle, i.e., the cross section of the conical shadow (the umbra) cast by the sun-illuminated earth at the distance of the moon. The angle radius $\mathfrak{R}$ of the shadow circle is equal to $p - \kappa$, where $p$ represents the lunar parallax* and $\kappa$ represents the half aperture angle of the conical shadow. $\kappa$ is the excess of the angle radius $R$ over the parallax* $P$ of the sun.

[In Figure 101, let S be the center of the sun, E the center of the earth, K the apex of the conical shadow, AB the diameter of the shadow circle, se a tangent to the periphery of the sun and the earth,

* The lunar or solar parallax is the angle radius of the earth on the moon or sun, respectively.
EF the perpendicular to Ss from E, and thus $\angle EAe = p$, $\angle AEK = \mathfrak{R}$, and $\angle FES = \angle eKE = \kappa$. Since $p$ is an external angle of the triangle EKA, we have $p = \mathfrak{R} + \kappa$. It also follows from $\triangle SEF$ that
$$\sin\kappa = \frac{SF}{SE} = \frac{Ss}{SE} - \frac{Ee}{SE}.$$
Since the minuend of the right side is the sine of the angle radius of the sun and the subtrahend is the sine of the solar parallax, it follows that
$$\sin\kappa = \sin R - \sin P$$
or, because the angles involved are so small ($\kappa$ is smaller than 16.2', $R$ < 16.3', and $P$ < 8.9"),
$$\kappa = R - P,$$
as was asserted above.]

The right ascension of the center of the shadow circle is the right ascension of the sun increased or diminished by 180° and the declination is the negative of the solar declination.

In order to take account of the atmospheric refraction, in computing a lunar eclipse the theoretical value for the radius of the shadow circle given above, $\mathfrak{R} = p + P - R$, must be replaced by a value 2% greater.

Sidereal and Synodic Revolution Periods

To determine the synodic revolution period of two coplanar rotation rays for which the sidereal revolution periods are known.

A rotation ray is a line segment AB of invariable length the end point B of which rotates about the starting point A in a plane E at a
constant rate of revolution, while the starting point either remains at rest or describes a curve of plane E. Using a well-known astronomical expression we call the time $T$ in the course of which the rotation ray AB describes one complete revolution of 360° its sidereal revolution period. Let a second rotation ray of the plane E with the starting point a and the end point b have the sidereal revolution period $t$ ($< T$). We will consider the angle that the two rays form with each other at a given moment of time. The time $s$ at the end of which they once again form the same angle we will call the synodic revolution period of the two rays or the synodic revolution period of the one ray with respect to the other. In order to find this we will imagine an auxiliary rotation ray a'b' whose starting point a' always coincides with A and whose direction always agrees with that of ab, and we will now consider the relative rotation of this auxiliary ray with respect to AB. Since the rotation of a'b' (or ab) in the unit time is equal to $360°/t$ and that of AB is $360°/T$, the relative rotation of a'b' with respect to AB in each time unit is
(1) $$\delta = \left(\frac{1}{t} - \frac{1}{T}\right) 360°.$$
If a'b' resumes the same position with respect to AB at the end of $s$ units of time, then $s\delta$ must equal 360° or
(2) $$\delta = \frac{1}{s}\,360°.$$
From (1) and (2) it follows that
$$\frac{1}{s} = \frac{1}{t} - \frac{1}{T} \quad \text{or} \quad s = \frac{Tt}{T - t},$$
and thus the synodic revolution period $s$ is represented as a function of the two sidereal revolution periods $T$ and $t$.

This unpretentious problem, the solution to which is also a model of brevity and simplicity, nevertheless possesses noteworthy applications, four of which we will discuss.

Problem 1. The hands of a clock are superimposed one on the other at exactly 12:00; when is the next time they are exactly superimposed one on the other?

Here let AB be the small hand, ab = Ab the big hand, $T$ = 12 hr, $t$ = 1 hr, thus
$$s = \frac{12 \cdot 1}{12 - 1} = 1\tfrac{1}{11} \text{ hr} = 1 \text{ hr } 5 \text{ min } 27\tfrac{3}{11} \text{ sec}.$$
The event takes place at 5 min $27\tfrac{3}{11}$ sec after 1:00.

Problem 2. From the synodic revolution period (583.9 days) of Venus, determine its sidereal revolution period.

The sidereal revolution period of a planet is understood to mean the time in which the rotation ray sun-planet makes one complete revolution. The synodic revolution period of the planet is understood to mean the time $s$ at the end of which the three celestial bodies sun, earth, planet are once again in the same position with respect to one another.

Here AB is the rotation ray sun-earth, ab the rotation ray sun-Venus, and $T$ = 365¼ days. The synodic revolution period $s$ of Venus has been determined by observations. Its sidereal revolution period $t$ is obtained from the relation
$$\frac{1}{t} - \frac{1}{T} = \frac{1}{s}$$
as 224.7 days.

Problem 3. To determine the relation between the solar day and the sidereal day.

A solar day is the time interval between two successive culminations of the sun, a sidereal day the time interval between two successive culminations of a fixed star or the time interval within which the earth rotates once about its own axis.

Let the midpoint of the sun be S, that of the earth E, a marked point of the earth's equator O. Here AB is the rotation ray SE, ab the rotation ray EO, $T$ is here 365¼ days (1 year, the period of time in which AB = SE completes one full revolution of 360°), $t$ the length of a sidereal day, and $s$ the length of a solar day (the period of time at the end of which the ray EO is once again in the same position relative to the sun). From
$$\frac{1}{s} = \frac{1}{t} - \frac{1}{T}$$
we obtain
$$\frac{T}{t} - \frac{T}{s} = 1.$$
$T/t$ represents the number of sidereal days, $T/s$ the number of solar days, that occur in a year. The sought-for relation can accordingly be stated in the following form:

A year contains one more sidereal day than the number of solar days (365¼ solar days, 366¼ sidereal days).
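The relation $1/s = 1/t - 1/T$ and Problems 1 to 3 can be verified with a few lines of arithmetic. This is a minimal sketch; the function names are illustrative, and the value 583.9 days for the synodic period of Venus is an assumed input.

```python
from fractions import Fraction

def synodic_period(T, t):
    """Synodic period s = T t / (T - t) of two rays with sidereal periods T > t."""
    return T * t / (T - t)

def faster_sidereal_period(T, s):
    """Sidereal period t of the faster ray from 1/t = 1/T + 1/s."""
    return T * s / (T + s)

# Problem 1: clock hands, T = 12 hr, t = 1 hr (exact arithmetic with fractions).
clock_hours = synodic_period(Fraction(12), Fraction(1))

# Problem 2: Venus, with T = 365.25 days and an assumed synodic period of 583.9 days.
venus_days = faster_sidereal_period(365.25, 583.9)

# Problem 3: sidereal day in solar days, with T = 365.25 and s = 1 solar day.
sidereal_day = faster_sidereal_period(365.25, 1.0)

if __name__ == "__main__":
    print("clock hands coincide every", clock_hours, "hr")
    print("Venus sidereal period:", round(venus_days, 1), "days")
    print("sidereal day:", round(sidereal_day * 24 * 60, 2), "minutes")
```

Using `Fraction` for the clock keeps the result as the exact value 12/11 hr instead of a rounded decimal.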
Problem 4. What is the relation between the sidereal and synodic month?

A sidereal month is the time it takes the rotation ray EM (earth-moon) to complete one full revolution. A synodic month is the time interval between two successive new moons (full moons). Here AB is the rotation ray SE, ab the rotation ray EM, $T$ = 365¼ days, $t$ the length of the sidereal month, $s$ the length of the synodic month. The sought-for relation accordingly reads
$$\frac{1}{t} - \frac{1}{s} = \frac{1}{T}.$$
Verbally it can be stated as follows:

The reciprocal of the synodic month subtracted from the reciprocal of the sidereal month is equal to the reciprocal of the sidereal year.

This can be confirmed for the numerical values: $t$ = 27.3217 days, $s$ = 29.5306 days, $T$ = 365.2564 days.

Progressive and Retrograde Motion of the Planets

When does a planet pass from progressive to retrograde motion (or conversely, from retrograde to progressive motion)?

The planetary orbits, considered as circles on the ecliptic plane, their orbital radii and revolution periods, as well as their positions at a given moment of time serving as the starting point of the time record are assumed to be known.

Solution. The motion of a planet is conventionally called progressive when it travels among the fixed stars of the celestial sphere like the sun, i.e., from west to east, and retrograde when it travels in the opposite direction, i.e., from east to west. The transition from one motion to the other occurs when the planet appears to be stationary for a brief period among the fixed stars, in other words, when the sight-line "earth-planet" retains the same direction for a short period of time.

The earth and the planet have the orbital radii $r$ and $R$, respectively, and the revolution periods $u$ and $U$, and the orbital radii, which are rotating about the sun, accordingly have the rates of revolution $k = 2\pi/u$ and $K = 2\pi/U$.
The solution to the problem is most conveniently obtained by the vector method. Let O, p, P be the midpoints of the sun, the earth, and the planet, $\mathfrak{r} = Op$ and $\mathfrak{R} = OP$ the vectorial distances of the
earth and the planet from the sun. The vectors $\mathfrak{r}$ and $\mathfrak{R}$ are "rotational vectors," i.e., vectors with the constant lengths $r$ and $R$, that rotate in the ecliptic plane E with constant velocities $k$ and $K$, respectively, about their fixed point of origin O. For the vectors $\dot{\mathfrak{r}}$ and $\dot{\mathfrak{R}}$ of the orbital velocities we again select O as the starting point. The magnitudes of the velocities $\dot{\mathfrak{r}}$ and $\dot{\mathfrak{R}}$ are $kr$ and $KR$, the directions always perpendicular to the directions of $\mathfrak{r}$ and $\mathfrak{R}$. If we then imagine two vectors $\mathfrak{r}_0$ and $\mathfrak{R}_0$ situated in E, originating at O, and possessing the magnitudes $r$ and $R$ that are always 90° in advance of the rotational vectors $\mathfrak{r}$ and $\mathfrak{R}$, then
$$\dot{\mathfrak{r}} = k\mathfrak{r}_0 \quad \text{and} \quad \dot{\mathfrak{R}} = K\mathfrak{R}_0.$$
The vectorial distance of the planet from the earth is
$$\mathfrak{z} = pP = OP - Op = \mathfrak{R} - \mathfrak{r},$$
the relative velocity of the planet with respect to the earth (i.e., the velocity of the planet for an observer on the earth, for whom the earth is at rest) is thus
$$\dot{\mathfrak{z}} = \dot{\mathfrak{R}} - \dot{\mathfrak{r}} = K\mathfrak{R}_0 - k\mathfrak{r}_0.$$
Let the angle by which the vector $\mathfrak{R}$ is in advance of the vector $\mathfrak{r}$ at time 0 be $\alpha$ and at time $t$ let it be $\xi$. Then
(1) $$\xi = \alpha + \kappa t,$$
where $\kappa = K - k$ represents the angle by which the vector $\mathfrak{R}$ rotates in advance of the vector $\mathfrak{r}$ in the unit time.

The motion of the planet is then progressive when the vector $\mathfrak{z}$ rotates in a counterclockwise direction for an observer at the North Pole and retrograde when it rotates in a clockwise direction for this observer, i.e., in accordance with whether the apex S of the vector $OS = \mathfrak{z} \times \dot{\mathfrak{z}}$ that is perpendicular to E lies above or below the ecliptic plane.

Now,
$$\mathfrak{z} \times \dot{\mathfrak{z}} = (\mathfrak{R} - \mathfrak{r}) \times (\dot{\mathfrak{R}} - \dot{\mathfrak{r}}) = (\mathfrak{R} - \mathfrak{r}) \times (K\mathfrak{R}_0 - k\mathfrak{r}_0) = \mathfrak{p} - \mathfrak{q}$$
with
$$\mathfrak{p} = K\,\mathfrak{R} \times \mathfrak{R}_0 + k\,\mathfrak{r} \times \mathfrak{r}_0, \qquad \mathfrak{q} = K\,\mathfrak{r} \times \mathfrak{R}_0 + k\,\mathfrak{R} \times \mathfrak{r}_0,$$
it being assumed that the vectors $\mathfrak{p}$ and $\mathfrak{q}$ also have their starting point at O.

The vector $\mathfrak{p}$ has the magnitude $KR^2 + kr^2$ and lies above E. The vector $\mathfrak{q}$, as may be seen from Figure 102, lies above or below E
accordingly as $\cos\xi$ is positive or negative, and has the magnitude $(K + k)Rr\,|\cos\xi|$. The vector $\mathfrak{z} \times \dot{\mathfrak{z}}$ thus lies above or below E accordingly as $KR^2 + kr^2 - (K + k)Rr\cos\xi$ is positive or negative, i.e., accordingly as
$$\cos\xi \lessgtr \frac{KR^2 + kr^2}{(K + k)Rr}.$$
Now, according to Kepler's third law, $U^2 : u^2 = R^3 : r^3$ or $k^2 : K^2 = R^3 : r^3$, so that the ratio $k : K$ on the right side of the obtained inequality can be replaced by $W^3 : w^3$, where $W = \sqrt{R}$, $w = \sqrt{r}$. We thus obtain for this right side the value
$$\frac{w^3 W^4 + W^3 w^4}{(W^3 + w^3)W^2 w^2} = \frac{(W + w)Ww}{W^3 + w^3} = \frac{Ww}{W^2 + w^2 - Ww} = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}$$
and our conclusion reads:

The motion of a planet is progressive or retrograde accordingly as
$$\cos\xi \lessgtr \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}.$$
At the moments when
(2) $$\cos\xi = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}$$
the one type of motion changes into the other.
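Criterion (2) is easy to evaluate numerically. The sketch below is an illustration only: the function names are hypothetical, and the orbital radii used for Mars and the earth (227.9 and 149.6 million km) and the earth's daily motion of 0.9856° are assumed values. The planet's rate $K$ is derived from Kepler's third law, as in the derivation above.

```python
import math

def stationary_angle_deg(r, R):
    """Angle xi (0..180 degrees) at which the motion changes, from equation (2)."""
    g = math.sqrt(R * r)
    return math.degrees(math.acos(g / (R + r - g)))

def is_progressive(r, R, xi_deg):
    """True if cos(xi) lies below the bound of the criterion (progressive motion)."""
    g = math.sqrt(R * r)
    return math.cos(math.radians(xi_deg)) < g / (R + r - g)

def days_from_alignment(r, R, k_deg_per_day):
    """Days between xi = 0 and the stationary point, with K from Kepler's third law."""
    K = k_deg_per_day * (r / R) ** 1.5   # k^2 : K^2 = R^3 : r^3
    kappa = K - k_deg_per_day            # relative daily motion of R against r
    return stationary_angle_deg(r, R) / abs(kappa)

if __name__ == "__main__":
    r_earth, R_mars = 149.6, 227.9       # assumed orbital radii, million km
    print("stationary angle:", stationary_angle_deg(r_earth, R_mars))
    print("days from opposition:", days_from_alignment(r_earth, R_mars, 0.9856))
```

Because $R + r - \sqrt{Rr} \ge \sqrt{Rr}$, the bound never exceeds 1, so the stationary angle always exists; the planet is retrograde only in a narrow range of $\xi$ around 0°.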
Example. How many days after upper conjunction does Venus become retrograde?

Here $r$ = 149, $R$ = 107.5 million kilometers, $k$ and $K$, respectively, in degrees are 0.9856° and 1.602°, $\kappa$ thus equals 0.6164° per day, with $\alpha$ = 180° and $\sqrt{Rr}/(R + r - \sqrt{Rr})$ = 0.974. From (1) and (2) we therefore obtain $\cos 0.6164t = -0.974$ and from this $t$ = 271 days.

Lambert's Comet Problem

To express the time required for a comet to describe an arc of its parabolic orbit by means of the focal radii and the chord connecting the end points of the arc.

Johann Heinrich Lambert (1728-1777) in 1761 published a paper on comet orbits in which may be found the celebrated formula bearing his name; the formula represents the area of a parabolic focal sector as a function of the bounding focal radii and the sector chord.

For the derivation of the Lambert formula we require a formula of the English astronomer Barker, which we will derive first.

We begin with the amplitude equation of a parabola, $y^2 = 4kx$, in which $k$ represents the shortest focal radius, which is commonly known to be one fourth of the parabola parameter. Let us consider the sector FOP, which is enclosed by the minimum focal radius FO, the focal radius $FP = r$ of an arbitrary point $P(x|y)$, and the parabola arc OP, and in which the angle $OFP = W$ represents the so-called true anomaly of the point P. Barker's problem is stated thus:

Represent the area of the parabola sector as a function of the anomaly.

In order to solve the problem we first express the sector area $S$ in terms of $x$ and $y$. If we drop the perpendicular PQ from P to the axis, $S$ is the difference between the area of the half segment OPQ (cf. No. 56) and the area of the triangle FPQ, so that
$$S = \tfrac{2}{3}xy - \tfrac{1}{2}(x - k)y \quad \text{or} \quad 6S = y(x + 3k).$$
We then express $x$ and $y$ in terms of $W$. According to the polar coordinate theorem of the parabola, the focal radius is
$$r = \frac{p}{1 + \cos W} = \frac{k}{\cos^2\dfrac{W}{2}},$$
and consequently,
$$y = r \sin W = 2r \sin\frac{W}{2} \cos\frac{W}{2} = 2k \tan\frac{W}{2}$$
and
$$x = \frac{y^2}{4k} = k \tan^2\frac{W}{2}.$$
If we introduce Barker's auxiliary magnitude
$$T = \tan\frac{W}{2},$$
we obtain
$$x = kT^2, \qquad y = 2kT$$
(the equation of the parabola in a parametric form), and after substitution of these values into the above area formula, we obtain
$$S = k^2\left(T + \tfrac{1}{3}T^3\right).$$
This is Barker's formula.

$W$ is positive or negative accordingly as P lies above or below the axis. In the first case, $T$ and $S$ are positive; in the second, negative.

Now for the solution of Lambert's problem!

Let P and P' be two points of the parabola, $W$ and $W'$ their anomalies, $T$ and $T'$ the corresponding Barker auxiliary magnitudes, $S$ and $S'$ the areas of the sectors FOP and FOP', with $FP = r$ and $FP' = r'$ as the focal radii of the two points, $\angle PFP' = 2\zeta$ the angle
between them, $PP' = s$ the connecting chord, and $\sigma$ the area of the sector PFP' enclosed by the two focal radii. Let $r$ lie above the axis and $r'$ above or below it; in the first case, let $r' < r$, and thus in both cases $W' < W$. The area $\sigma$ is then in both cases the difference $S - S'$.

Now, according to Barker,
$$3S = k^2(3T + T^3), \qquad 3S' = k^2(3T' + T'^3),$$
and consequently,
$$3\sigma = k^2 (T - T')\,[3 + T^2 + T'^2 + TT'].$$
Using the abbreviations $J$, $O$, $J'$, $O'$ for
$$\sin\frac{W}{2}, \quad \cos\frac{W}{2}, \quad \sin\frac{W'}{2}, \quad \cos\frac{W'}{2}$$
and $i$, $o$ for $\sin\zeta$, $\cos\zeta$, we can write the factor in parentheses as
$$(\;) = \frac{J}{O} - \frac{J'}{O'} = \frac{JO' - OJ'}{OO'} = \frac{i}{OO'},$$
and the factor in square brackets as
$$[\;] = 1 + T^2 + 1 + T'^2 + 1 + TT' = 1 + \frac{J^2}{O^2} + 1 + \frac{J'^2}{O'^2} + 1 + \frac{JJ'}{OO'}$$
$$= \frac{O^2 + J^2}{O^2} + \frac{O'^2 + J'^2}{O'^2} + \frac{OO' + JJ'}{OO'} = \frac{1}{O^2} + \frac{1}{O'^2} + \frac{o}{OO'}.$$
If we introduce these values and, in accordance with the polar equation, express $k/O^2$ and $k/O'^2$ as $r$ and $r'$, respectively, we obtain
$$3\sigma = i\,(r + r' + o\sqrt{rr'})\,\sqrt{rr'}.$$
Now,
$$i^2 = (JO' - OJ')^2 = J^2 O'^2 + O^2 J'^2 - 2JOJ'O' = (1 - O^2)O'^2 + (1 - O'^2)O^2 - 2JOJ'O'$$
$$= O^2 + O'^2 - 2OO'(OO' + JJ') = O^2 + O'^2 - 2oOO',$$
and, since $k = rO^2 = r'O'^2$,
$$i = \frac{\sqrt{k\,(r + r' - 2o\sqrt{rr'})}}{\sqrt{rr'}}.$$
If we introduce this value into the equation found for $3\sigma$, we obtain
$$3\sigma = (r + r' + o\sqrt{rr'})\,\sqrt{k\,(r + r' - 2o\sqrt{rr'})}.$$
We transform this equation further by introducing the chord $s$. Its square, according to the cosine theorem, is
$$s^2 = r^2 + r'^2 - 2rr' \cos 2\zeta = r^2 + r'^2 - 2rr'(2o^2 - 1),$$
i.e.,
$$s^2 = (r + r')^2 - 4rr'o^2.$$
From this we obtain
$$4rr'o^2 = (r + r' + s)(r + r' - s).$$
We abbreviate and write
$$v = \sqrt{r + r' + s}, \qquad u = \sqrt{r + r' - s},$$
obtaining
$$r + r' = \frac{v^2 + u^2}{2}, \qquad o\sqrt{rr'} = \pm\frac{uv}{2},$$
where the upper sign applies when the enclosed angle $2\zeta$ is concave (less than 180°) and the lower when it is convex (greater than 180°). If we substitute these two values into our last formula for $3\sigma$, it finally yields
$$3\sigma = \sqrt{k}\;\frac{v^2 + u^2 \pm vu}{2} \cdot \frac{v \mp u}{\sqrt{2}} = \sqrt{\frac{k}{8}}\,(v^3 \mp u^3)$$
or, in complete form,
$$\sigma = \frac{\sqrt{k}}{6\sqrt{2}}\left[(r + r' + s)^{3/2} \mp (r + r' - s)^{3/2}\right].$$
This formula represents the parabola sector $\sigma$ as a function of the two bounding focal radii $r$ and $r'$ and the chord $s$ connecting their end points.

In order to use this formula to determine the time required for a comet to complete its orbital arc, we need only introduce the value found for $\sigma$ into the Gauss formula of the Theoria motus,
$$2\sigma = Gt\sqrt{p}\,\sqrt{1 + \mu}$$
(cf. No. 96).
Since here $p = 2k$ and the comet mass $\mu$ is to be set equal to zero, we have initially
$$Gt\sqrt{k} = \sigma\sqrt{2}$$
and, as a result of substitution,
$$6Gt = (r + r' + s)^{1.5} \mp (r + r' - s)^{1.5}.$$
This remarkable formula contains the solution to the problem posed. It is usually called the Lambert formula, although it had already been formulated by Euler. It states that the time required by a comet to describe an orbital arc depends only on the arc chord and the sum of the focal radii of the ends of the arc.

According to Lagrange, Lambert's formula represents the most beautiful and significant discovery in the theory of comet motion. It is, in fact, of fundamental importance for the determination of comet orbits. This determination is carried out essentially in the following way:

The longitude and latitude of the comet are determined for three different moments of time, together with the corresponding longitude and distance of the sun (from the earth). Let $r$ and $r'$ be the respective focal radii of the first and third time of measurement, $s$ the distance between the ends of the focal radii; $r'$ and $s$ are expressed in terms of the known magnitudes and $r$, and these values are substituted into the Lambert equation, which results in an equation with only one unknown, $r$. From this equation $r$ is obtained, and then $r'$ and $s$ are found from the previously mentioned expressions. This then gives us the focus and two points of the orbit, so that it is completely determined. When the Gauss formula is applied to one of the points, we obtain the time at which the comet passes the perihelion. After this has been determined, the position of the comet for any moment of time can be obtained from the Gauss formula.
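The agreement between the chord formula and Barker's formula can be checked numerically. The following is a minimal sketch under stated assumptions: the function names are illustrative, the constant $G$ is passed in explicitly by the caller, and the sign is chosen by a `reflex` flag that is true when the angle between the focal radii exceeds 180°.

```python
import math

def barker_sector(k, W_deg):
    """Area of the sector FOP from Barker's formula S = k^2 (T + T^3 / 3)."""
    T = math.tan(math.radians(W_deg) / 2.0)
    return k * k * (T + T ** 3 / 3.0)

def lambert_sector(k, r1, r2, s, reflex=False):
    """Sector area from the focal radii r1, r2 and the chord s."""
    v3 = (r1 + r2 + s) ** 1.5
    u3 = max(r1 + r2 - s, 0.0) ** 1.5    # guard against rounding below zero
    return math.sqrt(k) / (6.0 * math.sqrt(2.0)) * (v3 + u3 if reflex else v3 - u3)

def lambert_time(r1, r2, s, G, reflex=False):
    """Time t from 6 G t = (r + r' + s)^1.5 -/+ (r + r' - s)^1.5."""
    v3 = (r1 + r2 + s) ** 1.5
    u3 = max(r1 + r2 - s, 0.0) ** 1.5
    return (v3 + u3 if reflex else v3 - u3) / (6.0 * G)

def compare(k, W_deg, W2_deg):
    """Return (area by Barker, area by the chord formula) for anomalies W > W'."""
    r1 = k / math.cos(math.radians(W_deg) / 2.0) ** 2
    r2 = k / math.cos(math.radians(W2_deg) / 2.0) ** 2
    angle = W_deg - W2_deg
    s = math.sqrt(r1 * r1 + r2 * r2 - 2.0 * r1 * r2 * math.cos(math.radians(angle)))
    by_barker = barker_sector(k, W_deg) - barker_sector(k, W2_deg)
    return by_barker, lambert_sector(k, r1, r2, s, reflex=angle > 180.0)

if __name__ == "__main__":
    print(compare(1.0, 90.0, 0.0))      # angle below 180 degrees: minus sign
    print(compare(1.0, 120.0, -120.0))  # angle above 180 degrees: plus sign
```

Both printed pairs should agree to rounding error, which illustrates that the sector area depends only on $r + r'$ and $s$.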
Extremes
89. Steiner's Problem Concerning the Euler Number

At what value of $x$, if $x$ is a positive variable, will the expression $\sqrt[x]{x}$ be at a maximum?

Jacob Steiner posed this problem in Crelle's Journal, vol. XL; it may also be found in his Works, vol. 2, p. 423.

Solution. According to the inequality of exponential functions (No. 12),
$$e^{(x - e)/e} \ge 1 + \frac{x - e}{e},$$
where the equal sign applies only when $x = e$. The inequality is simplified to
$$\frac{e^{x/e}}{e} \ge \frac{x}{e} \quad\text{or to}\quad e^{x/e} \ge x.$$
Here we extract the $x$th root and obtain $\sqrt[e]{e} \ge \sqrt[x]{x}$. Verbally expressed: The Euler number $e$ is the number yielding the maximum possible value for the expression $\sqrt[x]{x}$ for which $x$ is a positive variable.

90. Fagnano's Altitude Base Point Problem

To inscribe in a given acute-angled triangle the triangle of minimum perimeter.

This celebrated problem stems from I. F. Fagnano, son of the Italian count C. Fagnano (1682-1766), who became famous as a result of his remarkable studies of lemniscate partition. The following solution of the problem is distinguished by its extreme simplicity. It comes from Fr. Gabriel-Marie, author of the excellent book Exercices de Géométrie.
Let the given triangle be $ABC$ and let $XYZ$ be a triangle inscribed in it, with $X$, $Y$, and $Z$ on $BC$, $CA$, and $AB$, respectively. We will initially consider that $Z$ is arbitrarily situated on $AB$; we draw its mirror images $H$ and $K$ in $BC$ and $CA$, respectively, and determine the points of intersection $X$ and $Y$ of the connecting line $HK$ with $BC$ and $CA$. For a fixed point $Z$ the triangle $XYZ$ thus formed has the smallest perimeter of all the inscribed triangles.

In fact: let $X'$ and $Y'$ be two other points on $BC$ and $CA$. Since $ZX'$ and $HX'$ are mirror images, and also $ZY'$ and $KY'$, and naturally also $ZX$ and $HX$, as well as $ZY$ and $KY$, the perimeters of the two inscribed triangles to be compared can be written as
$$ZXYZ = HX + XY + YK = HK, \qquad ZX'Y'Z = HX' + X'Y' + Y'K = HX'Y'K.$$
However, since the direct path $HK$ from $H$ to $K$ is shorter than the roundabout path $HX'Y'K$, the first triangle possesses a smaller perimeter than the second.

It now merely remains to choose the point $Z$ in such manner as to obtain the smallest possible segment $HK$ (which represents the perimeter of $XYZ$). Now $CZ$ is the mirror image of $CH$ and also of $CK$; likewise, $\angle ZCB = \angle HCB$ and $\angle ZCA = \angle KCA$, and thus $\angle HCK = 2\gamma$. Segment $HK$ is therefore the base of an isosceles triangle ($HKC$) with a constant apex angle $2\gamma$ and the variable leg $s = CZ$, so that $HK = 2s\sin\gamma$; as such it attains a minimum when $CZ$ is at a minimum, i.e., when $CZ$ is perpendicular to $AB$. Since we could just as easily have carried out the investigation with $X$ or $Y$ as with $Z$, $AX$ is perpendicular to $BC$ and $BY$ to $CA$. The points $X$, $Y$, $Z$ are thus the base points of the altitudes of the triangle $ABC$.
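The reflection argument lends itself to a quick numerical check. The following sketch is illustrative only: the triangle coordinates, the helper names, and the grid resolution are assumptions. It slides $Z$ along $AB$, measures the segment $HK$ between the two mirror images, and compares the best position with the foot of the altitude from $C$ and with the value $2\,CZ\sin\gamma$.

```python
import math

def reflect(p, a, b):
    """Mirror image of point p in the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / (dx * dx + dy * dy)
    fx, fy = a[0] + t * dx, a[1] + t * dy      # foot of the perpendicular from p
    return (2 * fx - p[0], 2 * fy - p[1])

def segment_hk(z, A, B, C):
    """Length HK, i.e. the least perimeter of an inscribed triangle with vertex z on AB."""
    h = reflect(z, B, C)
    k = reflect(z, C, A)
    return math.dist(h, k)

# Hypothetical acute triangle; the altitude from C meets AB at (1, 0), i.e. t = 0.25.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)

N = 1000
candidates = []
for i in range(N + 1):
    t = i / N
    z = (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))
    candidates.append((segment_hk(z, A, B, C), t))
best_len, best_t = min(candidates)

# Angle gamma at C and the predicted minimum 2 * CZ * sin(gamma) with CZ the altitude.
ca = (A[0] - C[0], A[1] - C[1])
cb = (B[0] - C[0], B[1] - C[1])
gamma = math.acos((ca[0] * cb[0] + ca[1] * cb[1]) / (math.hypot(*ca) * math.hypot(*cb)))
altitude_c = 3.0
predicted = 2 * altitude_c * math.sin(gamma)
print(best_t, best_len, predicted)
```

On this sample triangle the scan should settle on the altitude foot, in line with the argument above.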
Result: Of all the triangles that can be inscribed in a given acute-angled triangle, the one with the smallest perimeter is the triangle formed by the base points of the altitudes.

91. Fermat's Problem for Torricelli

To find the point the sum of whose distances from the vertexes of a given triangle is the smallest possible.

This celebrated problem was put by the French mathematician Fermat (1601-1665) to the Italian physicist Torricelli (1608-1647), the famous student of Galileo, and was solved by the latter in several ways. The simplest solution is the one obtained by the use of Viviani's theorem:

In an equilateral triangle the sum of the three distances of a point from the sides of the triangle has a value that is independent of the position of the point. This value is equal to the altitude of the triangle.

Viviani (1622-1703), an Italian mathematician and physicist, was a student of Galileo and Torricelli. In Viviani's theorem the distance of a point from a triangle side is reckoned as positive when the point lies on the inner side of that triangle side and negative when it lies on the outer side.

Proof. Let the equilateral triangle have the vertexes $P$, $Q$, and $R$, the side $g$, the altitude $h$, and the area $J$. If $x$, $y$, $z$ are the distances of an arbitrary point $O$ from the sides $QR$, $RP$, $PQ$, then $s = x + y + z$ is the designated sum.

Fig. 105.
Now, the area of the triangle $PQR$ is composed (additively or subtractively) of the three component triangles $OQR$, $ORP$, $OPQ$, so that we obtain the equation
$$\tfrac{1}{2}gx + \tfrac{1}{2}gy + \tfrac{1}{2}gz = J$$
no matter what position the point $O$ may have. From this we obtain directly
$$s = x + y + z = \frac{2J}{g} = h,$$
and thus the auxiliary theorem is proved.

Now let $ABC$ be the given triangle. We choose the point $O$ so that the three perpendiculars at $A$, $B$, $C$ to $AO$, $BO$, $CO$ form an equilateral triangle $PQR$. Let $O'$ be any other point. Then if $O'A'$, $O'B'$, $O'C'$ are the perpendiculars dropped from $O'$ to $QR$, $RP$, $PQ$, we have
$$A'O' \le AO', \quad B'O' \le BO', \quad C'O' \le CO',$$
where, however, the equal sign cannot apply to all three. By addition it follows from this that

(1) $A'O' + B'O' + C'O' < AO' + BO' + CO'$.

However, according to the auxiliary theorem as applied to the equilateral triangle $PQR$,

(2) $AO + BO + CO \le A'O' + B'O' + C'O'$,

where the equals sign applies when $O'$ is inside the triangle $PQR$ and the "smaller than" sign when $O'$ is outside. From (2) and (1) we get
$$AO + BO + CO < AO' + BO' + CO',$$
so that $AO + BO + CO$ is the smallest possible sum of the distances.

Since the quadrilaterals $OBPC$, $OCQA$, $OARB$ are circle quadrilaterals (each has right angles at two opposite vertexes), each of the three angles $BOC$, $COA$, and $AOB$ is equal to 120°. The point we are looking for is accordingly the common point of intersection of the three circle arcs with the chords $BC$, $CA$, $AB$ and the common peripheral angle of 120°.

The construction of this point is impossible when one triangle angle, for example $\angle ACB = \gamma$, reaches or exceeds 120°. In that event $C$ itself is the point $O$ that we are looking for. Specifically, in this case, $AC + BC < AU + BU + CU$, no matter where the point $U$ may be.
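Both statements can be checked numerically before turning to the proof of the second. The sketch below does not follow the construction in the text: it substitutes Weiszfeld's fixed-point iteration (a standard numerical method for the point of least distance sum) and then verifies the 120° property; for a triangle with an angle above 120° it compares the vertex $C$ against a grid of trial points. All coordinates, names, and step counts are illustrative assumptions.

```python
import math

def distance_sum(o, pts):
    """Sum of the distances from o to the given points."""
    return sum(math.dist(o, p) for p in pts)

def weiszfeld(pts, steps=2000):
    """Weiszfeld iteration for the point of least distance sum, started at the centroid."""
    x = sum(p[0] for p in pts) / len(pts)
    y = sum(p[1] for p in pts) / len(pts)
    for _ in range(steps):
        w = [1.0 / max(math.dist((x, y), p), 1e-15) for p in pts]
        total = sum(w)
        x = sum(wi * p[0] for wi, p in zip(w, pts)) / total
        y = sum(wi * p[1] for wi, p in zip(w, pts)) / total
    return (x, y)

def angle_deg(o, p, q):
    """Angle POQ in degrees."""
    v1 = (p[0] - o[0], p[1] - o[1])
    v2 = (q[0] - o[0], q[1] - o[1])
    c = (v1[0] * v2[0] + v1[1] * v2[1]) / (math.hypot(*v1) * math.hypot(*v2))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))

# Case 1: hypothetical triangle with all angles below 120 degrees.
A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
O = weiszfeld([A, B, C])
angles = (angle_deg(O, B, C), angle_deg(O, C, A), angle_deg(O, A, B))
print(O, angles)

# Case 2: hypothetical triangle whose angle at C2 exceeds 120 degrees.
A2, B2, C2 = (0.0, 0.0), (4.0, 0.0), (2.0, 0.5)
at_c = distance_sum(C2, [A2, B2, C2])          # equals AC + BC
grid_min = min(distance_sum((i / 10, j / 10), [A2, B2, C2])
               for i in range(-20, 61) for j in range(-20, 31))
print(at_c, grid_min)
```

In the first case the three angles at the computed point should each come out near 120°; in the second no grid point should beat the vertex.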
Proof. We introduce the angles $ACU = \psi$ and $BCU = \varphi$. If $U$ lies in the space enclosed by the angle $ACB = \gamma$, the sum of $\psi$ and $\varphi$ is equal to $\gamma$; if $U$ lies in the space enclosed by the adjacent angle of $\gamma$, the difference between these two angles is equal to $\gamma$; and, finally, if $U$ lies in the space of the opposite angle from $\gamma$, then $\psi + \varphi = 360° - \gamma$.

Let the base points of the perpendiculars dropped from $U$ to $AC$ and $BC$ be $F$ and $G$. Their distances from $C$ are then $x = CU\cos\psi$ and $y = CU\cos\varphi$, with such a distance, e.g., $x$, being counted as positive when $\cos\psi$ is positive or negative when $\cos\psi$ is negative. In each case then we have $AC = AF + x$ and $BC = BG + y$, and accordingly
$$AC + BC = AF + BG + x + y.$$
Now
$$x + y = CU\cos\psi + CU\cos\varphi = CU(\cos\psi + \cos\varphi) = 2\cdot CU\cdot\cos\frac{\psi + \varphi}{2}\cos\frac{\psi - \varphi}{2}.$$
Since, according to the above, one of the two cosines of the right side of this equation has the magnitude $\cos(\gamma/2)$, and this (because $\gamma/2 \ge 60°$) is not greater than $\tfrac{1}{2}$, the right side has a maximum magnitude of $CU$. This yields
$$AC + BC \le AF + BG + CU.$$
Since the legs $AF$ and $BG$ of the right triangles $AUF$ and $BUG$ are smaller than the hypotenuses $AU$ and $BU$, it is certainly true that
$$AC + BC < AU + BU + CU. \quad\text{Q.E.D.}$$

92. Tacking Under a Headwind

How must a sailboat tack with a north wind in order to get north as quickly as possible?

Solution. Let the course of the boat be E $\gamma°$ N ($\gamma$ degrees north of east), and let the sail form the acute angle $\alpha$ with the bearing north and the angle $\beta$ with the course bearing.
First let us solve the preliminary problem:

Let the maximum speed that a sailboat can make through the wind with the most favorable sail position be $C$ knots; how great a speed can it make when the angle of the sail with the bearing of the wind is $\alpha$ and with the axis of the boat is $\beta$?

Let the pressure exerted upon the sail by the wind when the sail is perpendicular to the wind be $P$. If the sail forms an angle $\alpha$ differing from 90° with the bearing of the wind, then the wind pressure $P'$ (which works perpendicular to the sail) is smaller. It is reasonable to assume that the wind pressure is now equal to only $\sin\alpha$ times $P$, so that $P' = P\sin\alpha$. This formula, conceived by Lössl, is, however, only approximate.

Fig. 106.

We divide $P'$ into two components: one, $p = P'\sin\beta$, in the direction of the boat axis; the other, $q = P'\cos\beta$, perpendicular to it. Of these components $p$ is the only relevant one for the forward motion of the boat. Thus, the pressure exercised by the wind on the boat in the course direction has the value
$$p = P\sin\alpha\sin\beta.$$
The velocity $c$ of the boat is proportional to this pressure:
$$c = kp = kP\sin\alpha\sin\beta,$$
where $k$ represents the proportionality constant. For $\alpha = \beta = 90°$ this formula becomes $c_{\max} = C = kP$, so that we can replace $kP$ in the formula by $C$. The solution to our preliminary problem thus reads
$$c = C\sin\alpha\sin\beta.$$
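As a rough numerical companion to the preliminary formula, the sketch below tabulates the northward part of the boat's velocity on a one-degree grid. It assumes the setup stated at the start of the solution (course $\gamma$ degrees north of east, so that $\alpha + \beta + \gamma = 90°$ and the northward part is $c\sin\gamma$); the function names and the grid are illustrative choices.

```python
import math

def boat_speed(C, alpha_deg, beta_deg):
    """Speed along the boat axis, c = C sin(alpha) sin(beta)."""
    return C * math.sin(math.radians(alpha_deg)) * math.sin(math.radians(beta_deg))

def northward_speed(C, alpha_deg, beta_deg):
    """Northward component c sin(gamma), with gamma = 90 - alpha - beta (assumed setup)."""
    gamma_deg = 90 - alpha_deg - beta_deg
    return boat_speed(C, alpha_deg, beta_deg) * math.sin(math.radians(gamma_deg))

# Scan whole-degree sail settings with C = 1 (speeds as fractions of C).
best_value, best_alpha, best_beta = max(
    (northward_speed(1.0, a, b), a, b)
    for a in range(1, 89)
    for b in range(1, 89 - a)
)
best_gamma = 90 - best_alpha - best_beta
print(best_alpha, best_beta, best_gamma, best_value)
```

A grid search of this kind only suggests where the maximum lies; the argument in the text establishes it.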
This formula forms the basis of the solution of the main problem. $C$ is here the velocity that the north wind gives to the boat when it travels due south and the sail is perpendicular to the wind direction.

If the boat is to get as far north as possible in a given time, the northerly component $c'$ of the boat's velocity $c$ must be at a maximum. This component is, however,
$$c' = c\sin\gamma = C\cdot\sin\alpha\sin\beta\sin\gamma.$$
Consequently, what is necessary is to choose the three angles $\alpha$, $\beta$, $\gamma$, the sum of which is 90°, in such manner as to obtain the maximum product for $\sin\alpha\sin\beta\sin\gamma$. This reduces our task to the following problem:

When is the product of the sines of three angles of a constant concave sum at a maximum?

The solution of this problem is very similar to that of No. 10. It is based on the theorem: Of two angle pairs with equal concave sums the pair possessing the higher sine product is the pair with the smaller difference between its angles.

[It follows from the formulas
$$2\sin X\sin Y = \cos(X - Y) - \cos(X + Y) \quad\text{and}\quad 2\sin x\sin y = \cos(x - y) - \cos(x + y),$$
where $X$, $Y$ and $x$, $y$ represent the two pairs with the common sum $X + Y = x + y$ ($< 180°$). Since the subtrahends of the right sides are equally great, the larger right side is the one that possesses the greater minuend, i.e., in this case, the one in which the minuend shows the smaller angle difference.]

Let the constant sum of the three variable angles $\alpha$, $\beta$, $\gamma$ be $3k$ ($\le 180°$). Now if $\alpha$, $\beta$, $\gamma$ is such an angle triplet in which not all of the angles are equal to $k$, then at least one, let us say $\alpha$, must necessarily be greater than $k$, and another, let us say $\beta$, must be smaller than $k$. We form a new triplet $\alpha'$, $\beta'$, $\gamma'$ such that (1) $\alpha' = k$, (2) the pairs $\alpha'$, $\beta'$ and $\alpha$, $\beta$ possess equal sums, and (3) $\gamma' = \gamma$. According to the above theorem, $\sin\alpha'\sin\beta'$ will then be $> \sin\alpha\sin\beta$, and consequently, $\sin\alpha'\sin\beta'\sin\gamma$ will also be $> \sin\alpha\sin\beta\sin\gamma$, or

(1) $\sin k\sin\beta'\sin\gamma > \sin\alpha\sin\beta\sin\gamma$.

Since $\beta' + \gamma = 2k$, the same theorem yields

(2) $\sin k\sin k \ge \sin\beta'\sin\gamma$.
Combining (1) and (2), we obtain
$$\sin k\sin k\sin k > \sin\alpha\sin\beta\sin\gamma.$$
Consequently: The product of the sines of three angles of constant concave sum assumes its maximum value when the angles are equal.

The solution to our sailboat problem thus reads $\alpha = \beta = \gamma = 30°$. This means that: The axis of the boat must form a 60° angle with the bearing north, and the sail must bisect the angle formed by the wind bearing and the boat's axis. In these optimal positions the northerly motion is equal to exactly $\tfrac{1}{8}$ the maximum southerly motion.

93. The Honeybee Cell (Problem by Réaumur)

The cell of the honeybee (cf. Figure 107) has the form of a regular hexagonal prism that is sealed at one end by a regular hexagon $arbpcq$, while at the other end it is sealed by a roof consisting of three congruent rhombuses $PBSC$, $QCSA$, and $RASB$ that are inclined toward each other and toward the axis of the prism at equal angles, in such manner that the lateral surfaces of the prism are congruent trapezoids ($AarR$, $RrbB$, etc.).

Fig. 107.

The longest side of one such trapezoid is somewhat more than twice as long as the diameter of the inscribed circle of the base surface $arbpcq$. As a result of the regular arrangement of the rhombuses, each of the three rhombus diagonals ($SP$, $SQ$, $SR$) originating at the roof apex $S$ forms the same angle with the axis of the prism as the rhombus plane, and the two planes $ABC$ and $PQR$ are perpendicular to the edges of the prism. Since the obtuse-angled rhombus vertexes abut on each other at $S$, the diagonals mentioned are the short rhombus diagonals.

This singular construction of the honeybee cell suggested to naturalists like Maraldi, Réaumur, and others (at the beginning of the eighteenth century) that the bees had chosen this design in order to save as much as possible in the building material, i.e., in wax. The problem posed by Réaumur in this connection to the Swiss mathematician Koenig can be stated as:

To close a regular hexagonal prism with a roof consisting of three congruent rhombuses in such manner as to obtain a solid of prescribed volume and minimal surface.

Solution. Let the regular hexagonal cross section of the prism have the side $2e$, so that its shorter diagonals $ab = bc = ca = 2d = 2e\sqrt{3}$ and thus also $AB = BC = CA = 2d = 2e\sqrt{3}$. Let the distance of the plane $PQR$ and the apex $S$ of the roof from the plane $ABC$ be $x$, and let the short rhombus diagonals ($SP = SQ = SR$) be $2y$. Since the projection of $SR = 2y$ on the axis of the prism is $2x$, and on the plane $PQR$ is $2e$, we obtain the equation

(1) $y^2 = e^2 + x^2$.

If $\mathfrak{P}$, $\mathfrak{Q}$, $\mathfrak{R}$ are the points at which the prism edges passing through $P$, $Q$, $R$ intersect the plane $ABC$, then $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$ is a regular hexagon with the side $2e$.

First it becomes apparent that the volume of the prism undergoes no change when the rooflike closure that has been described is chosen instead of the plane closure $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$, since as much room is added on the one side of the plane $ABC$ (pyramid $S$-$ABC$) as is taken away from the other side (the three pyramids $P$-$BC\mathfrak{P}$, $Q$-$CA\mathfrak{Q}$, $R$-$AB\mathfrak{R}$).
Only the surface changes with the change in design; the surface decreases by the area $6e^2\sqrt{3}$ of the hexagon $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$, as well as by the area of the six right triangles $P\mathfrak{P}B$, $P\mathfrak{P}C$, $Q\mathfrak{Q}C$, $Q\mathfrak{Q}A$, $R\mathfrak{R}A$, $R\mathfrak{R}B$ (together $6ex$), while it increases by the total area of the three rhombuses $PBSC$, $QCSA$, $RASB$, namely $6dy = 6e\sqrt{3}\,y$. The saving in surface area thus obtained is accordingly
$$6e^2\sqrt{3} + 6ex - 6e\sqrt{3}\,y \quad\text{or}\quad 6e^2\sqrt{3} - 6e\,[y\sqrt{3} - x],$$
so that it now remains to obtain a minimum value for the expression in the bracket
$$u = y\sqrt{3} - x$$
by an appropriate choice of $x$.

Now, if $v$ is understood to be the similarly constructed expression $x\sqrt{3} - y$, then, as a result of (1),
$$u^2 - v^2 = 2(y^2 - x^2) = 2e^2 \quad\text{or}\quad u^2 = 2e^2 + v^2.$$
From this it follows that $u$ attains a minimum (specifically $e\sqrt{2}$) when $v$ is equal to zero, i.e., when

(2) $y = x\sqrt{3}$.

From (1) and (2) we obtain $x = e\sqrt{1/2}$ and $y = e\sqrt{3/2}$. The diagonal $SR = 2y = e\sqrt{6}$ is consequently shorter than the diagonal $AB = 2d = 2e\sqrt{3} = e\sqrt{12}$, so that the three rhombus angles abutting on one another at $S$ are obtuse.

If we designate the acute rhombus angle $SAR$ as $2\varphi$, it follows from $\tan\varphi = y/d = 1/\sqrt{2}$ and $\tan 2\varphi = 2\tan\varphi/(1 - \tan^2\varphi)$ that
$$\tan 2\varphi = \sqrt{8}, \quad \cos 2\varphi = \tfrac{1}{3}, \quad\text{and}\quad 2\varphi = 70°\,32'.$$
The obtuse rhombus angle $2\psi$ is therefore $109°\,28'$.

For the angle $\mu$ of the rhombus diagonals $SP$, $SQ$, $SR$ with respect to the axis of the prism we obtain the relation $\tan\mu = 2e/2x = \sqrt{2}$, and thus $\mu = 90° - \varphi = 54°\,44'$. The angle $\nu$ of the rhombuses with respect to the prism cross section is, finally, $\nu = 90° - \mu = \varphi = 35°\,16'$.

Since the tangent of the acute trapezoid angle ($\angle aAR$) has the value $2e/x = \sqrt{8}$ ($= \tan 2\varphi$), the acute and obtuse angles of the trapezoid correspond to the acute and obtuse angles, respectively, of the rhombus.

Particular interest attaches to the angles enclosed between every two bounding surfaces of the prism. These angles are easily determined.
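Before those angles are taken up, the minimum just found can be spot-checked numerically. This is only a sketch: it takes $e = 1$, scans the bracket $u = y\sqrt{3} - x$ with $y = \sqrt{e^2 + x^2}$ on a grid, and then recomputes the rhombus angles from $\tan\varphi = y/d$. The grid step and the variable names are arbitrary choices.

```python
import math

E = 1.0                      # half the hexagon side (any positive value would do)
D = E * math.sqrt(3)         # half the long rhombus diagonal, d = e * sqrt(3)

def bracket_u(x, e=E):
    """The expression u = y*sqrt(3) - x, with y taken from y^2 = e^2 + x^2."""
    y = math.sqrt(e * e + x * x)
    return y * math.sqrt(3) - x

# Grid scan over the roof height x in [0, 2e].
STEP = 1e-4
u_min, x_best = min((bracket_u(i * STEP), i * STEP) for i in range(20001))

# Angles at the optimum predicted in the text, x = e / sqrt(2).
x_opt = E / math.sqrt(2)
y_opt = math.sqrt(E * E + x_opt * x_opt)
acute_deg = math.degrees(2 * math.atan(y_opt / D))      # acute rhombus angle 2*phi
obtuse_deg = 180.0 - acute_deg                          # obtuse rhombus angle 2*psi
mu_deg = math.degrees(math.atan(2 * E / (2 * x_opt)))   # diagonal against the prism axis

print(x_best, u_min)
print(acute_deg, obtuse_deg, mu_deg)
```

The scan should land close to $x = e/\sqrt{2}$ with $u$ close to $e\sqrt{2}$, and the angles in decimal degrees should match the degree-and-minute values quoted above.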
To begin with, since the three-sided corners $S$, $P$, $Q$, $R$ are congruent and regular (each side is $2\psi$), the surface angles belonging to these corners are all equal to each other. Since the four-sided corners $A$, $B$, $C$ are also regular and congruent (each side is $2\varphi$), these corners also all have the same surface angle. Now, a surface angle of the corner $P$ (measured at $p$ as $\angle bpc$) equals 120°, and a surface angle of the corner $A$ (measured at $a$ as $\angle qar$) also equals 120°. Consequently, all the surface angles of the prism are 120° (naturally, with the exception of the right angles forming the base surface).

The angles we have just calculated have in fact been confirmed by actual measurement for the honeybee cell, within the limits of observational error. Of particular interest is the remarkable fact that every two abutting wax surfaces enclose an angle of 120°.

94. Regiomontanus' Maximum Problem

At what point of the earth's surface does a perpendicularly suspended rod appear longest? (I.e., at what point is the visual angle at a maximum?)

This problem was posed in 1471 by the mathematician Johannes Müller, called Regiomontanus after his birthplace Königsberg in Franconia, to the Erfurt professor Christian Roder. This problem, which in itself is not difficult, nevertheless deserves special attention as the first extreme problem encountered in the history of mathematics since the days of antiquity.

The author of the following simple solution is Ad. Lorsch, who published it in vol. XXIII of the Zeitschrift für Mathematik und Physik.

Let $A$ be the upper and $B$ the lower end point of the rod, $F$ the base point of the perpendicular to the earth's surface from $A$ (or $B$), so that the segments $FA = a$ and $FB = b$ are known.
Since the rod appears to be equally long at all the points of a circle on the earth's surface described about $F$ as the center, it is sufficient to erect an arbitrary perpendicular $g$ to $FA$ at $F$ and to seek on this line that runs horizontally on the earth's surface the point $O$ at which the visual angle $\omega = \angle AOB$ is a maximum.

First Lorsch shows that the circle of circumscription $\mathfrak{K}$ of the triangle $ABO$ is tangent to the line $g$ at $O$. Indeed, if $g$ were not tangent to $\mathfrak{K}$, then $\mathfrak{K}$ would have another point $Q$ in common with $g$ besides point $O$, and for each intermediate point $Z$ of $g$ between $O$ and $Q$, $\angle AZB$ would be greater than the boundary angle of the circle $\mathfrak{K}$ on $AB$, and it would consequently be greater than $\omega$, whereas $\omega$ is supposed to be the maximum.

Let us therefore draw the circle $\mathfrak{K}$ that passes through points $A$ and $B$ and is tangent to the line $g$; the point of tangency $O$ is the place at which the viewing angle of the rod attains its maximum value $\omega$. Indeed, if $P$ is any point other than $O$ on the line $g$, then the angle $APB$ is smaller than the boundary angle of $\mathfrak{K}$ on $AB$, and consequently smaller than $\omega$.

Lorsch also shows the most convenient and quickest method of constructing the circle $\mathfrak{K}$, that is, its midpoint $M$ and radius $r$. To begin with, the midpoint $M$ lies on the perpendicular bisector of $AB$, which runs parallel to the line $g$ and passes through the midpoint $N$ of $AB$. Now, in the rectangle $MOFN$ the side $FN$ is equal to the opposite side $MO$, and is thus equal to $r$, so that all that is necessary is to mark off from $B$ (or $A$) the distance $FN$ on the perpendicular bisector in order to obtain, at the resulting point of intersection, the desired midpoint $M$.

If one wishes to determine the position of $O$ by calculation, using its distance $t$ from $F$, one need only bear in mind that, according to the tangent theorem, $FO^2 = FA\cdot FB$. This equation immediately gives us
$$t = \sqrt{ab}.$$

An interesting variant of the problem of Regiomontanus is the Saturn problem, probably first posed by Hermann Martus, the author of the well-known problem collection:

At what latitude circle of Saturn does the ring appear widest?

Saturn is assumed to be a sphere with a radius of 56,900 km, and the ring is assumed to be a circular ring in the plane of Saturn's equator, having an inner radius of 88,500 km and an outer radius of 138,800 km.

Solution. In Figure 108, let the arc $\mathfrak{M}$ represent a meridian, $M$ the midpoint of Saturn, $AB$ the width of the ring, $MA = a$ being the outer radius, and $MB = b$ the inner radius of the ring, and let $MC = r$ be the equatorial radius of Saturn on $MA$.
Let $O$ be the point situated at the latitude $\varphi = \angle CMO$ at which the ring width appears greatest, so that $\angle AOB = \psi$ is a maximum. We now apply Lorsch's considerations to our figure and directly obtain the following solution. We draw the circle $\mathfrak{K}$ that passes through the points $A$ and $B$ and is tangent to the meridian $\mathfrak{M}$; the point of tangency $O$ is the place at which the ring width appears to be greatest.
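Both maximum conditions can be spot-checked by brute force before the latitude is worked out in closed form. In the sketch below the rod heights are hypothetical sample values, the Saturn dimensions are those given in the problem, and the visual angle is simply evaluated on a fine grid; nothing here reproduces Lorsch's construction.

```python
import math

def argmax_on_grid(f, lo, hi, n):
    """Return (x, f(x)) for the grid point in [lo, hi] where f is largest."""
    best_x, best_v = lo, f(lo)
    for i in range(1, n + 1):
        x = lo + (hi - lo) * i / n
        v = f(x)
        if v > best_v:
            best_x, best_v = x, v
    return best_x, best_v

def rod_angle(t, a, b):
    """Visual angle AOB of the rod from a ground point at distance t from F."""
    return math.atan2(a, t) - math.atan2(b, t)

def ring_angle(lat, a, b, r):
    """Angle AOB for an observer at latitude lat (radians) on a sphere of radius r."""
    ox, oy = r * math.cos(lat), r * math.sin(lat)
    return math.atan2(-oy, a - ox) - math.atan2(-oy, b - ox)

# Rod: hypothetical heights FA = 10 and FB = 6 (any units).
a_rod, b_rod = 10.0, 6.0
t_best, _ = argmax_on_grid(lambda t: rod_angle(t, a_rod, b_rod), 0.0, 50.0, 50000)
print(t_best, math.sqrt(a_rod * b_rod))

# Saturn: dimensions in km as stated in the problem.
a_ring, b_ring, r_sat = 138800.0, 88500.0, 56900.0
lat_best, psi_best = argmax_on_grid(
    lambda lat: ring_angle(lat, a_ring, b_ring, r_sat), 0.0, math.pi / 2, 9000)
print(math.degrees(lat_best), math.degrees(psi_best))
```

The rod scan should reproduce $t = \sqrt{ab}$, and the Saturn scan gives a latitude and a visual angle that can be compared with the closed-form values.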
In order to calculate the latitude $\varphi$ of $O$ and the maximum $\psi$, we examine the right triangles $MZF$ and $AZF$, in which $Z$ is the center of the circle $\mathfrak{K}$, $F$ the center of $AB$. From these triangles, with the understanding that $\rho$ is the radius of $\mathfrak{K}$, we obtain
$$\cos\varphi = \frac{MF}{MZ} = \frac{a + b}{2(r + \rho)} \quad\text{and}\quad \sin\psi = \frac{AF}{AZ} = \frac{a - b}{2\rho}.$$

Fig. 108.

The unknown $\rho$, however, follows from the secant theorem, according to which $MA\cdot MB = MZ^2 - \rho^2$ or $ab = (r + \rho)^2 - \rho^2 = r^2 + 2r\rho$, and consequently $\rho = (ab - r^2)/2r$. If we introduce this into the above, we at length obtain
$$\cos\varphi = \frac{(a + b)r}{ab + r^2} \quad\text{and}\quad \sin\psi = \frac{(a - b)r}{ab - r^2},$$
and from this, $\varphi = 33\tfrac{1}{2}°$, $\psi = 18\tfrac{1}{2}°$.

95. The Maximum Brightness of Venus

In what position does the planet Venus appear to have the greatest brilliance?

Solution. Let the midpoints of the sun, earth, and Venus be $S$, $E$, $V$, the radii of the orbits (assumed as circular) of the earth and Venus $SE = a$ and $SV = b$, the variable distance of Venus from the earth $EV = r$, the radius of Venus $h$. The tangents to Venus from $S$ and $E$ touch Venus along circles I and II, respectively, whose diameters in the plane $SEV$ we will call $AB$ and $CD$, respectively. Since $AB \perp SV$ and $CD \perp EV$, the angle between the planes of the two circles is equal to the angle $\varphi = SVE$ between their normals $VS$ and $VE$.

The projection of the portion of Venus that is illuminated by the sun and visible from the earth on the plane of circle II consists of the semicircle with the central radius $VC$ and the area $(\pi/2)h^2$ and the projection of the semicircle with the central radius $VB$, having the area $(\pi/2)h^2\cos\varphi$. (The area of the projection of a plane surface on a plane is equal to the product of the area of the surface and the cosine of the angle between the two planes.)

Fig. 109.

The radiation from Venus to the earth is thus exactly the same as that of a surface at $V$ perpendicular to the rays, with the area
$$J = \tfrac{1}{2}\pi h^2(1 + \cos\varphi).$$
If 1 cm² of this surface at distance 1 develops the illumination intensity $c$, the entire surface generates the illumination intensity $cJ$, and at the distance $VE = r$ the illumination intensity is
$$\frac{cJ}{r^2} = \frac{c\pi h^2}{2}\cdot\frac{1 + \cos\varphi}{r^2}.$$
Accordingly, the illumination intensity attains a maximum when the factor
$$f = \frac{1 + \cos\varphi}{r^2}$$
reaches its peak value. Now, according to the cosine theorem as applied to triangle $SEV$,
$$\cos\varphi = \frac{r^2 + b^2 - a^2}{2br},$$
and consequently,
$$f = \frac{1}{2b}\cdot\frac{1}{r} + \frac{1}{r^2} - \frac{a^2 - b^2}{2b}\cdot\frac{1}{r^3}.$$
This expression has the form
$$f = Ax + Bx^2 - Cx^3,$$
where
$$A = \frac{1}{2b}, \quad B = 1, \quad C = \frac{a^2 - b^2}{2b}$$
are constants and $x = 1/r$ is a variable. We must now make the function $f$ of $x$ as great as possible by a suitable choice of $x$.

As the curve of the function shows, $f$ initially grows as $x$ ($> 0$) increases; at a certain point $x = \alpha$ it attains its maximal value, and then declines. For every (positive) $x \ne \alpha$, therefore,
$$Ax + Bx^2 - Cx^3 < A\alpha + B\alpha^2 - C\alpha^3.$$
According as $x < \alpha$ or $x > \alpha$, we write this inequality as
$$C(\alpha^3 - x^3) < A(\alpha - x) + B(\alpha^2 - x^2) \quad\text{or}\quad C(x^3 - \alpha^3) > A(x - \alpha) + B(x^2 - \alpha^2),$$
and divide both sides by $\alpha - x$ and $x - \alpha$, respectively. From this we find that:

The function $C(\alpha^2 + \alpha x + x^2)$ lies below the function $A + B(\alpha + x)$ when $x < \alpha$, and above it when $x > \alpha$.

Since these two continuous functions increase steadily, they must attain equal values at the point $x = \alpha$, so that
$$C(\alpha^2 + \alpha^2 + \alpha^2) = A + B(\alpha + \alpha).$$
This equation yields
$$\alpha = \frac{B + \sqrt{B^2 + 3CA}}{3C}.$$
If we introduce here the values of $A$, $B$, $C$, we find for the desired distance $r$ ($= 1/\alpha$) the value
$$r = \sqrt{3a^2 + b^2} - 2b.$$
Now all three sides of the triangle $SEV$ for the optimal position are known ($a : b : r = 1 : 0.7233 : 0.4304$), and the sought-for angular distance ($\angle SEV$) of Venus from the sun is found to be $39°\,43.5'$.
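The closed form for $r$ can be compared with a direct scan of the brightness factor $f = (1 + \cos\varphi)/r^2$. The sketch below uses the ratio $a : b = 1 : 0.7233$ from the text; the grid resolution and the names are illustrative.

```python
import math

A_EARTH = 1.0      # earth's orbital radius a (unit of length)
B_VENUS = 0.7233   # Venus's orbital radius b, from the ratio quoted in the text

def brightness_factor(r, a=A_EARTH, b=B_VENUS):
    """f = (1 + cos(phi)) / r^2 with cos(phi) from the cosine theorem in triangle SEV."""
    cos_phi = (r * r + b * b - a * a) / (2 * b * r)
    return (1 + cos_phi) / (r * r)

# Scan the admissible distances a - b <= r <= a + b.
lo, hi, n = A_EARTH - B_VENUS, A_EARTH + B_VENUS, 100000
f_best, r_best = max(
    (brightness_factor(lo + (hi - lo) * i / n), lo + (hi - lo) * i / n)
    for i in range(n + 1)
)

# Closed form from the text and the elongation angle SEV at that distance.
r_formula = math.sqrt(3 * A_EARTH ** 2 + B_VENUS ** 2) - 2 * B_VENUS
cos_sev = (A_EARTH ** 2 + r_formula ** 2 - B_VENUS ** 2) / (2 * A_EARTH * r_formula)
elongation_deg = math.degrees(math.acos(cos_sev))

print(r_best, r_formula, elongation_deg)
```

The elongation is printed in decimal degrees, so it should be compared with $39°\,43.5' \approx 39.725°$.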
96. A Comet Inside the Earth's Orbit

What is the maximum number of days that a comet can remain within the earth's orbit?

We will assume that the earth's orbit is circular and the comet's parabolic, and that the orbital planes coincide.

Solution. We will select the semimajor axis of the earth's orbit as the unit length, the mean solar day as the unit time, and we will designate the parabola parameter as $4k$, the base line of the parabola section lying within the earth's orbit as $2y$, the altitude of the section as $x$, the sector described by the focal radius of the comet within the earth's orbit as $S$, and finally, the time required to traverse the sector as $t$. Then

(1) $y^2 = 4kx$

according to the vertex equation of the parabola,

(2) $(x - k)^2 + y^2 = 1$

according to the circle equation, and

(3) $3S = y(x + 3k)$

according to the formula for the area of a parabola section [No. 56. $S$ = the section $-$ triangle $= \tfrac{4}{3}xy - (x - k)y$].

If $2p$ represents the orbit parameter of a celestial body of mass $\mu$ revolving about the sun (the mass of the sun is considered as the unit mass), if $t$ is any time, $S$ the sector described by the body in this time, we can use the Gauss formula*
$$\frac{2S}{t\sqrt{p}\,\sqrt{1 + \mu}} = G,$$
where $G$ (the root of the gravitation constant) is the so-called Gauss constant, which has the numerical value of 0.0172021 for the units assumed.

* Gauss, Theoria motus corporum coelestium in sectionibus conicis solem ambientium (Hamburg, 1809). (English translation by C. H. Davis reprinted by Dover Publications, 1963.)

Since the mass of the comet relative to that of the sun is negligible, the Gauss formula is transformed into

(4) $S = Ct\sqrt{k}$,

with $C = G/\sqrt{2}$ in our problem.
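Equations (1) to (4) already fix $t$ as a function of $k$, so the maximum can be estimated numerically before it is derived. The sketch below is a plain grid scan under the same assumptions as the text (circular earth orbit of radius 1, parabolic comet orbit in the same plane); eliminating $y$ between (1) and (2) gives $(x + k)^2 = 1$, which the code uses directly. Function and variable names are illustrative.

```python
import math

G = 0.0172021  # Gauss constant for the units chosen in the text

def days_inside(k):
    """Time t from equations (1)-(4) for a parabola with parameter 4k, 0 < k < 1."""
    x = 1.0 - k                       # from (1) and (2): (x - k)^2 + 4kx = (x + k)^2 = 1
    y = 2.0 * math.sqrt(k * x)        # from (1)
    sector = y * (x + 3.0 * k) / 3.0  # from (3)
    return sector * math.sqrt(2.0) / (G * math.sqrt(k))   # from (4) with C = G / sqrt(2)

# Scan k on a grid of step 0.001 inside (0, 1).
t_best, k_best = max((days_inside(i / 1000), i / 1000) for i in range(1, 1000))
print(k_best, t_best)
```

The scan is only a numerical cross-check on the maximum that the text obtains from the equal-factor argument of No. 10.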
From (1) and (2) we find
$$x + k = 1, \qquad y = 2\sqrt{k(1 - k)}$$
and, making use of these values, we obtain from (3)
$$3S = 2\sqrt{k(1 - k)}\,(1 + 2k).$$
If we introduce here the value for $S$ from (4), it follows that

(5) $t = c\,(1 + 2k)\sqrt{1 - k}$,

with $c = \sqrt{8}/3G$.

Since $t$ is to be a maximum, the expression $(1 + 2k)\sqrt{1 - k}$ must be made as great as possible. It therefore remains to select $k$ in such manner that the expression, or its square, or four times its square, namely,
$$P = (1 + 2k)\cdot(1 + 2k)\cdot(4 - 4k),$$
becomes a maximum. However, since $P$ is a product of factors of constant sum, it attains a maximum (No. 10) when the factors are equally great, thus when $1 + 2k = 4 - 4k$. This gives us $k = \tfrac{1}{2}$ and, as a result of (5), $t = 78$.

The sought-for maximum possible length of stay is thus 78 days.

97. The Problem of the Shortest Twilight

On what day of the year is the twilight shortest at a place of given latitude?

This problem was posed, but not solved, by the Portuguese Nunes in 1542 in his book De crepusculis. Jacob Bernoulli and d'Alembert solved the problem by means of differential calculus, but obtained no simple results. The first elementary solution stems from Stoll (Zeitschrift für Mathematik und Physik, vol. XXVIII). The following very simple solution is from Brünnow's Lehrbuch der sphärischen Astronomie (Textbook of Spherical Astronomy).

A distinction is made between civil and astronomical twilight. Civil twilight ends when the midpoint of the sun stands $6\tfrac{1}{2}°$ below the horizon. Approximately at this moment one must turn on one's lights in order to continue working. Astronomical twilight ends when the midpoint of the sun stands 18° below the horizon; it is approximately at this time that the astronomer can begin making observations.
It is convenient to choose as the beginning of twilight the moment at which the midpoint of the sun is intersected by the horizon.

Let the latitude of the observation point be $\varphi$, the pole distance of the sun $p$. The duration of the twilight is measured by the angle $d$ that is formed by the two hour-circle arcs of the nautical triangles determined by the sun for the beginning and end of the twilight. If we superimpose one of these triangles on the other in such manner that the two pole distances coincide, the angle between the two latitude complements $b$ (now having in common only the world pole $P$) represents the duration $d$ of the twilight.

Fig. 110.

In this position let the triangles be $PCX$ and $PCY$, with $PC = p$, $PX = PY = b = 90° - \varphi$, $CX = 90°$, $CY = 90° + h$ ($h$ is to be understood as representing the depth of the sun below the horizon at the end of the twilight), and $\angle XPY = d$. Moreover, let $XY = u$ and $\angle XCY = \psi$.

From the isosceles triangle $PXY$ it follows, according to the cosine theorem, that

(1) $\cos d = \dfrac{\cos u - \sin^2\varphi}{\cos^2\varphi}$.

Consequently, $d$ becomes a minimum or $\cos d$ a maximum when $\cos u$ is at a maximum. From the triangle $CXY$ it follows, however, that
$$\cos u = \cos CX\cos CY + \sin CX\sin CY\cos\psi$$
or, since $\cos CX = 0$, $\sin CX = 1$, $\sin CY = \cos h$, that
$$\cos u = \cos h\cos\psi.$$
Thus, $\cos u$ attains its greatest possible value when $\cos\psi$ is a maximum, i.e., when $\psi = 0$. On the day of the shortest twilight, point $X$ accordingly falls on the side $CY$, and the base $XY = u$ of the isosceles triangle $PXY$ is $h$. At the same time we find from (1) for the minimum duration $\mathfrak{d}$ of the twilight
$$\cos\mathfrak{d} = \frac{\cos h - \sin^2\varphi}{\cos^2\varphi}$$
or, in accordance with the two formulas $\cos\mathfrak{d} = 1 - 2\sin^2\frac{\mathfrak{d}}{2}$, $\cos h = 1 - 2\sin^2\frac{h}{2}$,

(I) $\sin\dfrac{\mathfrak{d}}{2} = \dfrac{\sin\dfrac{h}{2}}{\cos\varphi}$.

To find the corresponding declination of the sun $\delta$, we express the cosine of the angle $\omega = \angle PCX = \angle PCY$ twice in accordance with the cosine theorem and set the resulting values equal to each other. It follows from $\triangle PCX$ (since $\cos CX = 0$, $\sin CX = 1$) that
$$\cos\omega = \frac{\sin\varphi}{\sin p},$$
from $\triangle PCY$ (since $\cos CY = -\sin h$, $\sin CY = \cos h$) that
$$\cos\omega = \frac{\sin\varphi + \cos p\sin h}{\sin p\cos h}.$$
Equalizing, we obtain
$$\sin\varphi\cos h = \sin\varphi + \cos p\sin h$$
or
$$-\cos p\sin h = \sin\varphi\,(1 - \cos h)$$
or
$$-\cos p\cdot 2\sin\frac{h}{2}\cos\frac{h}{2} = \sin\varphi\cdot 2\sin^2\frac{h}{2}$$
or, finally,
$$\cos p = -\sin\varphi\tan\frac{h}{2}.$$
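Formula (I) and the last equation can be evaluated directly and compared with a brute-force scan over the sun's declination. The sketch below is an illustration under stated assumptions: it uses the standard hour-angle relation $\cos\tau = (\sin(\text{altitude}) - \sin\varphi\sin\delta)/(\cos\varphi\cos\delta)$, which is not written out in the text, takes $h = 18°$ (astronomical twilight), and uses latitude $51°\,20.1'$ as a sample value. All names are hypothetical.

```python
import math

def hour_angle_deg(lat_deg, dec_deg, alt_deg):
    """Hour angle (degrees) at which the sun reaches altitude alt_deg, or None if it never does."""
    phi, dec, alt = map(math.radians, (lat_deg, dec_deg, alt_deg))
    c = (math.sin(alt) - math.sin(phi) * math.sin(dec)) / (math.cos(phi) * math.cos(dec))
    return math.degrees(math.acos(c)) if -1.0 <= c <= 1.0 else None

def twilight_deg(lat_deg, dec_deg, depth_deg):
    """Twilight duration as an hour angle; infinite if the sun never sinks to depth_deg."""
    start = hour_angle_deg(lat_deg, dec_deg, 0.0)
    end = hour_angle_deg(lat_deg, dec_deg, -depth_deg)
    return math.inf if start is None or end is None else end - start

def shortest_twilight(lat_deg, depth_deg):
    """Formula (I) and cos p = -sin(phi) tan(h/2): returns (duration in degrees, pole distance p)."""
    phi, h = math.radians(lat_deg), math.radians(depth_deg)
    duration = 2 * math.degrees(math.asin(math.sin(h / 2) / math.cos(phi)))
    pole_distance = math.degrees(math.acos(-math.sin(phi) * math.tan(h / 2)))
    return duration, pole_distance

LAT = 51 + 20.1 / 60     # sample latitude in degrees
DEPTH = 18.0             # depth of the sun at the end of astronomical twilight

d_min, p_min = shortest_twilight(LAT, DEPTH)

# Brute force: scan declinations from -23.4 to +23.4 degrees in steps of 0.01.
d_scan, dec_scan = min((twilight_deg(LAT, i / 100, DEPTH), i / 100) for i in range(-2340, 2341))

print(d_min, d_min * 4, "minutes")   # 1 degree of hour angle corresponds to 4 minutes
print(90 - p_min, dec_scan)          # declination from the formula and from the scan
```

A pole distance above 90° corresponds to a negative (southerly) declination, so the scan and the formula should agree in sign as well as size.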
Because of the minus sign, the pole distance $p$ is an obtuse angle for northern latitudes, the sun's declination $\delta$ is thus southerly and

(II) $\sin\delta = \sin\varphi\tan\dfrac{h}{2}$.

The shortest twilight duration is determined by (I) and the southerly declination of the sun for the day on which that twilight occurs is given by (II). From the declination the sought-for day can be found by means of the nautical almanac. This date is also found with sufficient accuracy if the familiar formula

(2) $\sin\delta = \sin\varepsilon\sin l$

is used; here $\delta$ represents the sun's declination, $l$ the angular distance of the sun from the autumnal or vernal equinox, and $\varepsilon$ the inclination of the ecliptic ($23°\,27'$). Since the above-mentioned angular distance changes at an average daily rate of $m = 59.1'$, the sought-for date differs by $n = l/m$ days from the 23rd of September or from the 21st of March.

For Leipzig, for example ($\varphi = 51°\,20.1'$), we find, from (II), $\delta = 7°\,6.2'$, then from (2), $l = 18°\,6.3'$, and then $n = 18.4$. The shortest twilight in Leipzig thus falls on October 11 and March 3.

98. Steiner's Ellipse Problem

Of all the ellipses that can be circumscribed about (inscribed in) a given triangle, which one has the smallest (largest) area?

"Dans le plan, la question des polygones d'aire maximum ou minimum inscrits ou circonscrits à une ellipse ne présente aucune difficulté. Il suffit de projeter l'ellipse de telle manière qu'elle devienne un cercle, et l'on est ramené à une question bien connue de géométrie élémentaire"* (Darboux, Principes de Géométrie analytique, p. 287).

* Translation: "In a plane the question of polygons of maximum or minimum area inscribed in or circumscribed about an ellipse offers no difficulty. All that is necessary is to project the ellipse in such manner that it is transformed into a circle, and the problem is reduced to a well-known question of elementary geometry".
The solution of the problem is based on the two auxiliary theorems:

I. Of all the triangles inscribed in a circle the one possessing the maximum area is the equilateral.

II. Of all the triangles that can be circumscribed about a circle the one possessing the minimum area is the equilateral.

Proof of I. We call the circle diameter $d$, the sides and angles of an inscribed triangle $p$, $q$, $r$ and $\alpha$, $\beta$, $\gamma$, respectively, the area of the triangle $J$. Then
$$J = \tfrac{1}{2}pq\sin\gamma \quad\text{and}\quad p = d\sin\alpha, \quad q = d\sin\beta,$$
and consequently,
$$J = \tfrac{1}{2}d^2\cdot\sin\alpha\sin\beta\sin\gamma.$$
According to No. 92, the product of the sines $\sin\alpha\sin\beta\sin\gamma$ of the three angles $\alpha$, $\beta$, $\gamma$ of constant sum (180°) is at a maximum when $\alpha = \beta = \gamma$ ($= 60°$), i.e., when the triangle is equilateral. The area of this maximal triangle is $\tfrac{3}{16}\sqrt{3}\,d^2$, thus $\sqrt{27}/4\pi$ of the area of the circle.

Proof of II. If we designate the sides of an arbitrary circumscribed triangle $PQR$ as $p$, $q$, $r$, then the tangents to the circle from the vertexes $P$, $Q$, $R$ are
$$x = s - p, \quad y = s - q, \quad z = s - r,$$
where $s$ represents half the perimeter of the triangle ($s = x + y + z$). The area $J$ of the triangle and the radius $\rho$ of the inscribed circle are given by the well-known formulas
$$J = \rho s \quad\text{and}\quad J = \sqrt{xyzs} \quad\text{(Hero of Alexandria)}.$$
These give us
$$s\rho^2 = xyz.$$
Making use of the formula $J = \rho s$, we write this equation in the following two ways:

(1) $\dfrac{1}{yz} + \dfrac{1}{zx} + \dfrac{1}{xy} = \dfrac{1}{\rho^2}$,

(2) $\dfrac{1}{yz}\cdot\dfrac{1}{zx}\cdot\dfrac{1}{xy} = \dfrac{1}{J^2\rho^2}$.
380 Extremes

We now introduce the new unknowns

$$u = \frac{1}{yz},\quad v = \frac{1}{zx},\quad w = \frac{1}{xy}$$

and obtain

$$u + v + w = \frac{1}{\rho^2},\qquad uvw = \frac{1}{J^2\rho^2}.$$

Since $J$ is supposed to be a minimum and $\rho$ is constant, $uvw$ must attain a maximum. A product $uvw$ of numbers $u, v, w$ of constant sum ($u + v + w = \text{const.}$) reaches a maximum, however (No. 10), when the numbers are equal to each other: $u = v = w$.

The circumscribed triangle therefore becomes smallest when $yz = zx = xy$, i.e., when $x = y = z$, i.e., when $p = q = r$, which proves II.

We find that the area of the smallest circumscribed triangle is four times that of the maximum inscribed triangle, i.e., $\sqrt{27}\,\rho^2$, and for the ratio of this area to the area of the circle we obtain the improper fraction $\sqrt{27}/\pi$.

Now for the solution of the ellipse problem. Let $\mathfrak{E}$ be any ellipse circumscribed about (inscribed in) the given triangle abc, $f$ its surface area, and $\delta$ the area of the triangle abc. We consider $\mathfrak{E}$ as the normal projection of a circle $\mathfrak{K}$, whose surface area we will call $F$. In the projection the inscribed (circumscribed) triangle ABC of the circle, possessing an area we will call $\Delta$, corresponds to the inscribed (circumscribed) triangle abc of the ellipse. If $\mu$ represents the cosine of the angle between the plane of the circle and the plane of the ellipse, then the normal projection of every surface lying in the plane of the circle is the $\mu$-multiple of the surface. This gives us the formulas

$$f = \mu F,\qquad \delta = \mu\Delta.$$

Since $\delta$ is constant, $f$ attains a minimum (maximum) when the quotient $f/\delta$ or the equal quotient $F/\Delta$ reaches a minimum (maximum). The latter quotient, however, according to auxiliary theorem I (II), reaches its minimal (maximal) value $4\pi/\sqrt{27}$ ($\pi/\sqrt{27}$) when the triangle ABC is equilateral.

To establish more exactly the ellipse determined by this condition, we make use of the properties of a normal projection:

1. Parallelism is not annulled by projection.

2. The ratio between parallel segments is maintained in projection: in particular, the ratio of two segments of the same line is not altered.
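The two extreme values of the quotient $F/\Delta$ quoted above can be illustrated by a brute-force scan over triangle shapes. This is a sketch under stated assumptions, not part of the original argument. The inscribed case uses $\Delta = \tfrac12 d^2 \sin\alpha\sin\beta\sin\gamma$ from the proof of I. The circumscribed case uses the standard relation $\Delta = \rho^2\left(\cot\tfrac{\alpha}{2} + \cot\tfrac{\beta}{2} + \cot\tfrac{\gamma}{2}\right)$, which is not stated in the text but follows from $J = \rho s$ with tangent lengths $\rho\cot(\text{half angle})$. The function names are illustrative.

```python
import math

def ratio_inscribed(alpha, beta):
    """F/Delta for a triangle inscribed in a circle; the diameter d cancels."""
    gamma = math.pi - alpha - beta
    # F = pi d^2 / 4 and Delta = (1/2) d^2 sin(alpha) sin(beta) sin(gamma)
    return math.pi / (2 * math.sin(alpha) * math.sin(beta) * math.sin(gamma))

def ratio_circumscribed(alpha, beta):
    """F/Delta for a triangle circumscribed about a circle; the radius rho cancels."""
    gamma = math.pi - alpha - beta
    # F = pi rho^2 and Delta = rho^2 * (sum of cotangents of the half angles)
    cot_sum = sum(1 / math.tan(t / 2) for t in (alpha, beta, gamma))
    return math.pi / cot_sum

def scan(step=1):
    """Scan whole-degree angle pairs (alpha, beta).

    Returns ((min inscribed ratio, angles), (max circumscribed ratio, angles)).
    """
    best_min = (math.inf, None)
    best_max = (-math.inf, None)
    for a in range(step, 180, step):
        for b in range(step, 180 - a, step):
            al, be = math.radians(a), math.radians(b)
            r_in = ratio_inscribed(al, be)
            r_out = ratio_circumscribed(al, be)
            if r_in < best_min[0]:
                best_min = (r_in, (a, b))
            if r_out > best_max[0]:
                best_max = (r_out, (a, b))
    return best_min, best_max

if __name__ == "__main__":
    print(scan())
    print(4 * math.pi / math.sqrt(27), math.pi / math.sqrt(27))
```

On the whole-degree grid both extremes occur at $\alpha = \beta = 60°$, in agreement with the values $4\pi/\sqrt{27}$ and $\pi/\sqrt{27}$.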
Steiner's Circle Problem 381

Now, the center M of the circle is the point of intersection of the medians of the equilateral triangle ABC, and the diameter through C bisects the chords of the circle parallel to AB. Consequently, the point of intersection of the medians of the triangle abc is the center point m of the sought-for ellipse, and the ellipse diameter through c bisects the ellipse chords parallel to the side ab, so that ab and mc are conjugate directions of the ellipse.

Now, since the circle radius MK parallel to the circle chord (tangent) AB is equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of AB, the ellipse half diameter mk parallel to the ellipse chord (tangent) ab is also equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of ab.

Result. Of all the ellipses that can be circumscribed about (inscribed in) a given triangle abc, the one with the smallest (greatest) area is the ellipse whose midpoint m is the point of intersection of the medians of the triangle abc and in which the ellipse half diameter to c (to the center of ab) and the ellipse half diameter parallel to ab, $mk = ab/\sqrt{3}$ ($ab/2\sqrt{3}$), are conjugate half diameters.

The area of the ellipse thus characterized, the so-called Steiner ellipse, is $\dfrac{4\pi}{\sqrt{27}}$ $\left(\dfrac{\pi}{\sqrt{27}}\right)$ of the area of the triangle.

This ellipse can be constructed easily in accordance with No. 42.

99. Steiner's Circle Problem

Of all isoperimetric plane surfaces (i.e., those having equal perimeters) the circle has the greatest area. And conversely: Of all plane surfaces with equal area the circle has the smallest perimeter.

This fundamental double theorem was first proved by J. Steiner (Crelle's Journal, vol. XVIII; also in Steiner's Gesammelte Werke, vol. II). Steiner even provided several proofs. Here we will consider only the one that is based upon the Steiner symmetrization principle.

First we will prove the second half of the theorem. It is obviously sufficient to limit our considerations to convex surfaces, i.e., those surfaces in which the line segment connecting two arbitrary points of the surface belongs completely to the surface.
382 Extremes

We will first prove the auxiliary theorem: Of all trapezoids with common base lines and altitudes, the isosceles trapezoid is the one the sum of whose legs is smallest.

Let ABCD be an arbitrary trapezoid with the base lines BC and AD and the legs AB and CD. Let the mirror image of B on the perpendicular bisector of AD be B', and let the center of CB' be C₀. On the extension of CB we set BB₀ = CC₀ and obtain the isosceles trapezoid AB₀C₀D, which has base lines and altitude in common with the given trapezoid, and consequently also the same area.

[Fig. 111]

If we extend DC₀ by its own length to H, we obtain the parallelogram DCHB', in which the diagonal DH is shorter than the sum of the sides DC and CH:

$$DH < DC + CH.$$

However, since $DH = 2\cdot DC_0 = DC_0 + AB_0$ and $CH = DB' = AB$, we obtain

$$AB_0 + DC_0 < AB + DC.$$

Thus, the isosceles trapezoid has the smallest leg sum.

Now let $\mathfrak{F}$ be the surface having the smallest perimeter for the given area $J$; let the perimeter be $u$. We draw an arbitrary line $\mathfrak{G}$ and divide $\mathfrak{F}$ by perpendiculars to $\mathfrak{G}$ into trapezoids ABCD that we select so narrow that the arc-shaped legs AB and CD can be considered as rectilinear. From the points of intersection of the dividing lines ..., AD, BC, ... with $\mathfrak{G}$ we mark off on the dividing lines on both sides of $\mathfrak{G}$ the half chords ..., AD, BC, ..., as a result of which we obtain the points ..., A', D', B', C', ... and the trapezoids ..., A'B'C'D', .... The new trapezoid A'B'C'D' is isosceles and possesses equal base lines and altitude with ABCD, so that the area is also the same. This gives us

$$(1)\qquad A'B' + C'D' \le AB + CD,$$

in which the equals sign applies only when ABCD is also isosceles.
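The trapezoid theorem behind inequality (1) can be illustrated in coordinates. The sketch below makes assumptions that are not in the text: the lower base AD is placed on the x-axis with A at the origin, the upper base BC lies at height `h`, and `t` is the horizontal offset of B. The names `leg_sum` and `isosceles_offset` are illustrative.

```python
import math

def leg_sum(d, c, h, t):
    """AB + CD for the trapezoid A=(0,0), D=(d,0), B=(t,h), C=(t+c,h)."""
    return math.hypot(t, h) + math.hypot(d - t - c, h)

def isosceles_offset(d, c):
    """Offset t of B for which the trapezoid with bases d and c is isosceles."""
    return (d - c) / 2

if __name__ == "__main__":
    d, c, h = 5.0, 2.0, 1.0            # sample base lengths and altitude
    t0 = isosceles_offset(d, c)
    for t in (-1.0, 0.0, t0, 2.0, 3.5):
        print(t, leg_sum(d, c, h, t))  # smallest value occurs at t = t0
```

Sliding the upper base along its line changes neither the base lengths, the altitude, nor the area, so only the leg sum varies, and it is smallest at the symmetric position.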
Steiner's Circle Problem 383

Our method enables us to obtain from $\mathfrak{F}$ a new surface $\mathfrak{F}'$ with the symmetry axis $\mathfrak{G}$, having the same area as $\mathfrak{F}$ and a perimeter, therefore, that cannot be smaller than $u$. Thus, the equals sign in (1) must always apply. All trapezoids ABCD are therefore isosceles, and the perpendicular bisector of BC is an axis of symmetry of $\mathfrak{F}$.

The surface $\mathfrak{F}$ of minimal perimeter therefore possesses an axis of symmetry in every direction. But such a surface must be a circle!

Proof. Let I and II be two mutually perpendicular symmetry axes of $\mathfrak{F}$, and M their point of intersection. Let the mirror image of an arbitrary point P of $\mathfrak{F}$ on I be P₁, and let the mirror image of P₁ on II be P' = P₁₂. Then PMP' is a straight line and MP' = MP, i.e., the point M is a midpoint of the surface.

Now $\mathfrak{F}$ can only have one midpoint. Indeed, if N were a second midpoint, then extending PM by its own length, we would first arrive at P'; next, extending P'N by its own length, we would arrive at a new point P'' of $\mathfrak{F}$; then extending P''M by its own length, we would arrive at a point P''' of $\mathfrak{F}$; extending P'''N by itself, we would then come to still another point of $\mathfrak{F}$, etc. If these operations are represented graphically it will be observed that in this manner we would end up at some arbitrary distance beyond the drawing paper (on which $\mathfrak{F}$ lies), which is naturally absurd. Thus, $\mathfrak{F}$ has only the one midpoint M.

It follows from this, further, that this M must belong to each axis of symmetry of $\mathfrak{F}$. Indeed, if M does not lie on the axis of symmetry a of $\mathfrak{F}$, then we can draw the mirror images m and p of M and of an arbitrary surface point P on a, extend pM by its own length to the surface point p', and draw the mirror image p'' of p' on a. Now, since p'' is a point of $\mathfrak{F}$, Pmp'' is a straight line, and mp'' = mP, this would mean that $\mathfrak{F}$ had a second midpoint, m, and this is impossible. Thus, all the axes of symmetry intersect at M.
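The uniqueness argument for the midpoint (two midpoints M and N would carry a point of the surface arbitrarily far away) can be made concrete with a few lines of code. This is an illustrative sketch only: the helper names `reflect` and `orbit` and the sample coordinates are assumptions, not part of the text.

```python
def reflect(point, center):
    """Extend the segment from point to center by its own length (point reflection)."""
    return (2 * center[0] - point[0], 2 * center[1] - point[1])

def orbit(P, M, N, steps):
    """Reflect P alternately through M and N, as in the argument: P, P', P'', P''', ..."""
    points = [P]
    for i in range(steps):
        center = M if i % 2 == 0 else N
        points.append(reflect(points[-1], center))
    return points

if __name__ == "__main__":
    # Sample data: P = (1, 0), M = (0, 0), N = (1, 0).
    for pt in orbit((1.0, 0.0), (0.0, 0.0), (1.0, 0.0), 6):
        print(pt)
```

Two successive reflections, first through M and then through N, shift a point by twice the vector from M to N, so the points P, P'', P'''', ... march off along a line without bound.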
Now let F be a fixed boundary point of $\mathfrak{F}$ and P an arbitrary boundary point of $\mathfrak{F}$. Since the perpendicular bisector of FP is an axis of symmetry of $\mathfrak{F}$, it passes through M. Therefore, MP = MF; i.e., all the boundary points of $\mathfrak{F}$ are equidistant from M, and the surface $\mathfrak{F}$ is a circle.
384 Extremes

Consequently, of all surfaces of equal area the circle has the smallest perimeter.

We now state conversely: Of all isoperimetric surfaces the circle has the greatest area.

Proof. Let the perimeter $f$ of an arbitrary surface $\mathfrak{F}$, which is not a circle, be equal to the perimeter $k$ of the circular surface $\mathfrak{K}$. Let the area of $\mathfrak{F}$ be $F$ and that of $\mathfrak{K}$ be $K$. Now, if $F \ge K$, we will consider the circular surface $\mathfrak{K}'$, concentric to $\mathfrak{K}$, of area $K' = F$, and we will let its perimeter be $k'$. Since $\mathfrak{K}'$ covers $\mathfrak{K}$,

$$(2)\qquad k' \ge k.$$

However, since the surfaces $\mathfrak{K}'$ and $\mathfrak{F}$ have the same area, according to the theorem proved above $k' < f$, or

$$(3)\qquad k' < k.$$

The inequalities (2) and (3) contradict each other, however, and thus the assumption that $F \ge K$ must be false. Consequently, $F < K$. Q.E.D.

The foregoing Steiner proof of the major isoperimetric theorem for the circle has certain weaknesses. The same is true of the proof of the major isoperimetric theorem for the sphere, presented in the following section. The reader may learn how these weaknesses can be eliminated and the Steiner proof formulated in a completely rigorous fashion by consulting the excellent book Kreis und Kugel (Circle and Sphere) by W. Blaschke. Unfortunately, we cannot go into these interesting investigations because of lack of space.

100. Steiner's Sphere Problem

Of all solids of equal surface the sphere possesses the maximum volume. Of all solids of equal volume the sphere possesses the smallest surface.

(Steiner, Crelle's Journal, vol. XVIII; Steiner, Gesammelte Werke, vol. II.)

As in No. 99, we will prove the second part of the theorem first. Naturally, we will consider only convex solids, i.e., those solids in which the line segment connecting two arbitrary points on the solid belongs completely to the solid.
Steiner's Sphere Problem 385

Steiner's proof is based on the principle of symmetrization and the theorem: Of all triangular prisms whose parallel edges AA', BB', CC' have the prescribed lengths $h, k, l$ and lie on three given lines, the prism with the plane of symmetry normal to the edges possesses the smallest base surface sum ABC + A'B'C'.

Proof. We will designate the distances of the edges from one another as $a, b, c$, so that

$$\mathfrak{A} = \tfrac{1}{2}a(k + l),\qquad \mathfrak{B} = \tfrac{1}{2}b(l + h),\qquad \mathfrak{C} = \tfrac{1}{2}c(h + k)$$

are the areas of the three trapezoidal prism faces. These areas are given magnitudes.

We extend CB and C'B' to the point of intersection P, and CA and C'A' to the point of intersection Q, and obtain the tetrahedron CC'PQ, in which for brevity we will call the surfaces CC'P and CC'Q "lateral surfaces" and the surfaces CPQ and C'PQ "top surfaces."

[Fig. 112]

We determine the relations between the areas $J, J', \mathfrak{P}, \mathfrak{Q}$ of the tetrahedron bounding surfaces CPQ, C'PQ, CC'P, CC'Q, on the one hand, and the areas $\Delta, \Delta', \mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ of the prism bounding surfaces ABC, A'B'C', BB'C'C, CC'A'A, AA'B'B, on the other.

From the ray theorem it follows that

$$(1)\qquad \frac{CP}{CB} = \frac{C'P}{C'B'} = \frac{l}{\lambda} \quad\text{and}\quad \frac{CQ}{CA} = \frac{C'Q}{C'A'} = \frac{l}{\mu},$$

where $\lambda$ is the difference between $l$ and $k$, and $\mu$ is the difference between $l$ and $h$.

Now, since the areas of similar triangles are in the same proportion to each other as the squares of homologous sides, we obtain the relations

$$\frac{\mathfrak{P}}{\mathfrak{P} - \mathfrak{A}} = \frac{l^2}{k^2} \quad\text{and}\quad \frac{\mathfrak{Q}}{\mathfrak{Q} - \mathfrak{B}} = \frac{l^2}{h^2}.$$
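Relation (1) and the similar-triangle relation for the lateral surface CC'P can be checked on a sample prism. The sketch below is an illustration under assumptions not found in the text: the edges are taken parallel to the z-axis, the coordinates are arbitrary sample values, and the helper names (`intersect`, `tri_area`, `prism_check`) are invented for this example. The point P is found by a general line-intersection formula rather than by (1) itself, so the comparison is not circular.

```python
import math

def sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def norm(u):
    return math.sqrt(dot(u, u))

def tri_area(p, q, r):
    """Area of the triangle pqr in space."""
    return 0.5 * norm(cross(sub(q, p), sub(r, p)))

def intersect(p1, q1, p2, q2):
    """Intersection of the lines p1q1 and p2q2 (assumed coplanar and not parallel)."""
    d1, d2, w = sub(q1, p1), sub(q2, p2), sub(p1, p2)
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    t = (b * e - c * d) / (a * c - b * b)   # parameter of the closest point on line 1
    return tuple(p + t * x for p, x in zip(p1, d1))

def prism_check(h=1.0, k=2.0, l=3.0):
    """Return (CP/CB, l/lambda, area CC'P / area PBB', l^2/k^2) for a sample prism."""
    # Hypothetical sample positions of A, B, C; edges AA', BB', CC' run along z.
    A, B, C = (0.0, 0.0, 0.3), (4.0, 0.0, -0.2), (1.0, 3.0, 0.5)
    B1 = (B[0], B[1], B[2] + k)
    C1 = (C[0], C[1], C[2] + l)
    P = intersect(C, B, C1, B1)
    ratio = norm(sub(P, C)) / norm(sub(B, C))
    big = tri_area(C, C1, P)       # lateral surface CC'P
    small = tri_area(P, B, B1)     # CC'P with the trapezoid BB'C'C removed
    return ratio, l / (l - k), big / small, l**2 / k**2

if __name__ == "__main__":
    print(prism_check())
```

With the sample lengths $k = 2$ and $l = 3$ the first two entries should agree at 3 and the last two at 9/4, up to rounding.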
386 Extremes

From these we obtain

$$(2)\qquad \mathfrak{P} = \alpha\,\mathfrak{A},\qquad \mathfrak{Q} = \beta\,\mathfrak{B},$$

with

$$\alpha = \frac{l^2}{l^2 - k^2} \quad\text{and}\quad \beta = \frac{l^2}{l^2 - h^2}.$$

Moreover, since the areas of two triangles with a common angle are to each other as the products of the adjacent sides of this angle, we obtain

$$\frac{J}{\Delta} = \frac{CP\cdot CQ}{CA\cdot CB},\qquad \frac{J'}{\Delta'} = \frac{C'P\cdot C'Q}{C'A'\cdot C'B'},$$

and consequently, as a result of (1),

$$(3)\qquad J = \kappa\Delta \quad\text{and}\quad J' = \kappa\Delta',$$

where $\kappa$ is the constant $l^2/\lambda\mu$.

From (2) it follows that the areas $\mathfrak{P}$ and $\mathfrak{Q}$ of the lateral surfaces of the tetrahedron are constant no matter where the prism edges AA', BB', CC' happen to lie, and from (3), that the sum $S$ of the areas $J$ and $J'$ of the top surfaces of the tetrahedron is $\kappa$ times the sum $\Sigma$ of the areas $\Delta$ and $\Delta'$ of the base surfaces of the prism:

$$(4)\qquad S = \kappa\Sigma.$$

We will now prove the auxiliary theorem: Of all tetrahedrons with two fixed corners C, C' and two movable corners P and Q that lie on the fixed lines I and II parallel to CC', the tetrahedron in which P and Q lie on the perpendicular bisector plane of CC' is the one possessing the smallest area sum $S$ of its top surfaces CPQ and C'PQ.

To begin with, it is clear that the tetrahedrons concerned all have the same volume $V$. (The base surface CC'P has the constant area $\mathfrak{P}$ and the corresponding apex Q lies on a fixed parallel to the plane CC'P.)

We draw through the center M of CC' the plane E normal to CC' and designate its points of intersection with the lines I and II as p and q. Let P and Q be two (other) points anywhere on I and II.

We now express the tetrahedron volume $V$, first using the tetrahedron CC'pq and then the tetrahedron CC'PQ.
Steiner's Sphere Problem 387

For this purpose we construct at C and C' on the top surfaces Cpq and C'pq perpendiculars running toward the inside* of these surfaces and designate their point of intersection on E as O. We will select the common length of the two perpendiculars as our unit length. The perpendiculars from O to the top surfaces CPQ and C'PQ and to the planes I·CC' and II·CC' we will designate as $x, x', m, n$, the common area of the lateral surfaces CC'p and CC'P as $\mathfrak{P}$, that of the lateral surfaces CC'q and CC'Q as $\mathfrak{Q}$, and, finally, the areas of the top surfaces Cpq, C'pq, CPQ, C'PQ as $i, i', J, J'$. We then obtain for the volume $V$ of the tetrahedrons CC'pq and CC'PQ the formulas

$$3V = i + i' + m\mathfrak{P} + n\mathfrak{Q} \quad\text{and}\quad 3V = xJ + x'J' + m\mathfrak{P} + n\mathfrak{Q},$$

respectively [where $x, x', m$, and $n$ are positive or negative according as O lies on the inside or outside of the bounding surfaces CPQ, C'PQ, I·CC', and II·CC', respectively]. It follows from this that

$$xJ + x'J' = i + i'.$$

If we consider that the perpendicular $x$ ($x'$) from O to the plane CPQ (C'PQ) is shorter than the oblique line OC (OC'), we see that $x$ and $x'$ are proper fractions. The left side of the last equation is therefore smaller than $J + J'$, and consequently also

$$i + i' < J + J',$$

which proves the auxiliary theorem.

We now go back to (4). Since, according to the auxiliary theorem, $S$ becomes a minimum when P and Q lie on E, and, as a result of (4), $S$ and $\Sigma$ attain a minimum at the same time, $\Sigma$ attains a minimum when the prism bounding surfaces ABC and A'B'C' are symmetrical with respect to E. Q.E.D.

Note. The preceding proof assumes that one prism edge ($l$) differs from the other two. This limitation is of no importance, since it is immediately apparent that the theorem is true in the case $h = k = l$.

The continuation of the proof for the major isoperimetric theorem is similar to that in No. 99.

Let $\mathfrak{K}$ be the solid that for a given volume $V$ has the smallest surface; let the latter be $O$.
* The inside of a bounding surface of a tetrahedron is the side on which the tetrahedron is situated.
388 Extremes

We choose an arbitrary plane E and divide $\mathfrak{K}$ by perpendiculars to E into triangular prisms ABCA'B'C', which we assume to be so narrow that the bounding triangles ABC and A'B'C' belonging to the surface of $\mathfrak{K}$ can be considered as plane triangles. From the points of intersection of the perpendiculars ..., AA', BB', CC', ... with E we mark off on the perpendiculars on both sides of E the halves of the segments ..., AA', BB', CC', ..., as a result of which we obtain the points ..., a, a', b, b', c, c', .... The new prism abca'b'c' possesses the symmetry plane E normal to the edges and, according to the above prism theorem, possesses a base surface sum no greater than that of ABCA'B'C':

$$(5)\qquad abc + a'b'c' \le ABC + A'B'C',$$

in which the equals sign applies only if the prism ABCA'B'C' also possesses a symmetry plane normal to the edges.

By means of our procedure we obtain from $\mathfrak{K}$ a new solid $\mathfrak{K}'$ with the symmetry plane E, possessing the same volume $V$ as $\mathfrak{K}$ and a surface that consequently cannot be smaller than $O$. Therefore, the equals sign in (5) must always apply. All prisms ABCA'B'C' therefore possess one plane of symmetry normal to the edges, the perpendicular bisector plane of AA'.

The solid $\mathfrak{K}$ having the smallest surface thus possesses, for every plane, a symmetry plane parallel to it.

Such a solid must, however, be a sphere!

Proof. Let I, II, III be three symmetry planes of $\mathfrak{K}$ that are normal to each other, and M their point of intersection. Let the mirror image of an arbitrary point P of $\mathfrak{K}$ on I be P₁, let the mirror image of P₁ on II be P₁₂, and let that of P₁₂ on III be P₁₂₃ = P'. Then PMP' is a straight line and MP' = MP, i.e., the point M is a midpoint of $\mathfrak{K}$.

Now, $\mathfrak{K}$ can have only one midpoint. (Proof as in No. 99.)

It then follows from this that M must lie on every symmetry plane of $\mathfrak{K}$.
Indeed, if M does not belong to the symmetry plane A of $\mathfrak{K}$, then we can draw the mirror images m and p of M and of an arbitrary point P of the solid on A, extend pM by its own length to the point p' of the solid, and draw the mirror image p'' of p' on A. Now, since p'' is a point of $\mathfrak{K}$, Pmp'' is a straight line, and mp'' = mP, this would result in a second midpoint, m, for $\mathfrak{K}$, which is impossible.
Steiner's Sphere Problem 389

All the symmetry planes, therefore, intersect at M.

Now let F be a fixed point and P an arbitrary point of the surface of $\mathfrak{K}$. Since the perpendicular bisector plane of FP is a symmetry plane of $\mathfrak{K}$, it passes through M. Therefore, MP = MF; i.e., all the surface points of $\mathfrak{K}$ are equidistant from M, and the solid $\mathfrak{K}$ is a sphere.

Of all solids of equal volume the sphere thus has the smallest surface.

We now state conversely: Of all solids of equal surface the sphere has the greatest volume.

Proof. Let the surface $O$ of an arbitrary solid $\mathfrak{K}$, which is not a sphere, be equal to the surface $o$ of the sphere $\mathfrak{k}$. Let the volume of $\mathfrak{K}$ be $V$ and that of $\mathfrak{k}$ be $v$. Let us assume $V \ge v$; then let us consider the sphere $\mathfrak{k}'$ concentric to $\mathfrak{k}$, having the volume $v' = V$ and the surface $o'$. Since $\mathfrak{k}$ lies within $\mathfrak{k}'$,

$$(6)\qquad o' \ge o.$$

However, since the solids $\mathfrak{k}'$ and $\mathfrak{K}$ have the same volume, according to the previously proved theorem $o' < O$, or

$$(7)\qquad o' < o.$$

The inequalities (6) and (7) contradict each other. The assumption $V \ge v$ must therefore be false, and $v > V$, as we asserted.
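The theorem can be illustrated numerically with the quantity $36\pi V^2/S^3$, often called the isoperimetric quotient. It does not appear in the text and is introduced here only as an assumption for illustration. It equals the squared ratio of a solid's volume to the volume of the sphere having the same surface $S$, so the theorem says it is 1 for the sphere and smaller for every other solid. The sample solids and function names below are illustrative choices.

```python
import math

def isoperimetric_quotient(volume, surface):
    """36*pi*V^2 / S^3: equals 1 for a sphere, less than 1 for other solids."""
    return 36 * math.pi * volume**2 / surface**3

def sphere(r):
    return (4 / 3) * math.pi * r**3, 4 * math.pi * r**2

def cube(a):
    return a**3, 6 * a**2

def cylinder(r, h):
    return math.pi * r**2 * h, 2 * math.pi * r**2 + 2 * math.pi * r * h

def regular_tetrahedron(a):
    return a**3 / (6 * math.sqrt(2)), math.sqrt(3) * a**2

if __name__ == "__main__":
    solids = {
        "sphere": sphere(1.0),
        "cylinder (h = 2r)": cylinder(1.0, 2.0),
        "cube": cube(1.0),
        "regular tetrahedron": regular_tetrahedron(1.0),
    }
    for name, (vol, surf) in solids.items():
        print(f"{name}: {isoperimetric_quotient(vol, surf):.4f}")
```

The quotient does not depend on the scale of the solid, so the unit sizes chosen here do not affect the comparison.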
Index of Names

Abel, Niels Henrik 121-132
Alembert, Jean Le Rond d' 109, 155, 375
Alhazen (Abu Ali al Hassan ibn al Hassen ibn Alhaitham) 197-200
Amthor 6
Andre 64-69
Apollonius of Perga 154-160, 165, 220
Archimedes 1-7, 154-160, 172, 184-188, 239-242
Argand, J. R. 109
Bachet de Meziriac, Claude Gaspard 7-9
Bachmann, P. 105
Ball, W. W. Rouse 6, 27
Barker 352, 353, 354
Barrow, Isaac 197, 247
Bernoulli, Jacob (1654-1705) 40-44, 375
Bernoulli, Nicolaus (1687-1759) 19-21
Berosus 340
Berwick, E. H. 11-14
Blaschke, Wilhelm 384
Brianchon, Charles Julien 165, 219-220, 261-265
Brouncker, William 86
Brünnow, Franz Friedrich Ernst 375
Buffon, Georges Louis Leclerc, Comte de 73-77
Cardan, Jerome (Girolamo Cardano) 216-217
Castillon (I. F. Salvemini) 144-147
Catalan 22, 23
Cauchy, Augustin Louis 37-40, 105, 109
Cayley, Arthur 105
Chasles, Michel 312
Cramer, Gabriel 144
Darboux, Jean Gaston 378
Demoivre, Abraham 179
Desargues, Gerard 250-255, 265-273
Descartes, Rene 171
Dickson, Leonard Eugene 105
Dirichlet, Peter Gustav Lejeune 96
Douwes 326
Eratosthenes 5
Euclid 154, 250, 301
Euler, Leonhard 19-27, 44-48, 55, 78-85, 96, 97, 104, 136, 141-142, 184, 192, 285-289, 356, 359
Eutocius 170
Fagnano, I. F. 359-361
Fermat, Pierre de 78-85, 86-96, 96-104, 135, 361-363
Feuerbach, Karl Wilhelm 142-144
Fox 77
Frenicle de Bessy, B. 86
Frobenius, Leo 105
Frost, Andrew 14
Fuss, Nicolaus 188-193
Gabriel-Marie, F. 359
Gauss, Karl Friedrich 86, 96-104, 104-108, 108-112, 119, 154, 177-181, 307-310, 323-330, 331, 374
Gergonne, Joseph Diez 154, 159, 160
Giordano 144
Goldbach, Christian 21, 22
Gordan, P. 128
Gregory, James 69-73
Hansen, Peter 193, 195, 196
Heiberg, Johan Ludvig 6
Hermite, Charles 128-137
Hipparchus 310-314
Huygens, Christian 187, 197
Jacobi, Karl Gustav Jakob 105
Kepler, Johannes 330-334
Khayyam, Omar 34-37
Kirkman, T. P. 14-18
Koenig, Gabriel 367
Kronecker, Leopold 105, 109, 117, 127
Krummbiegel 6
Kummer, Ernest Eduard 96
Lagrange, Joseph Louis 86, 94-96, 356
Laisant, M. 27, 33
Lalande, Joseph Jerome Le Francais de 330
Lambert, Johann Heinrich 165, 206, 352-356
Legendre, Adrien Marie 82, 96, 104, 292
Leibniz, Gottfried Wilhelm von 73, 222
Lessing, Gotthold Ephraim 5, 6
L'Hôpital, Guillaume Francois 197
Lhuilier, Simon 305
Lindemann, Ferdinand 128-137
Liouville, Joseph 105, 112
Littrow, Joseph Johann von 224
Lossel, von 364
Lorsch, A. 369
Lucas, Édouard 27-33
Ludolph van Ceulen 136
Machin, John 73
MacMahon, Percy Alexander 9, 27
Malfatti, Giovanni Francesco 147-151
Maraldi, Giacomo Filippo 367
Martus, Hermann 370
Mascheroni, Lorenzo 160-164, 165
Menaechmus 171
Mercator, Gerhard 314-316
Mercator, Nicolaus 56-59
Moivre: see Demoivre
Monge, Gaspard 151-154
Moreau, M. C. 27
Müller: see Regiomontanus
Nesselmann, G. H. F. 6
Newton, Isaac 9-10, 48-55, 59-64, 208, 217-219
Nicomedes 172
Nunes, Pedro 321, 375
Pappus 173, 250, 252
Pascal, Blaise 257-261
Peirce, Benjamin 14, 16
Petersen 154
Pohlke, K. 303-307
Poncelet, Jean Victor 165, 192, 193, 219-220
Pothenot 193, 194, 196
Proclus 214
Quetelet, Lambert Adolphe Jacques 197
Reaumur, Rene Antoine Ferchault de 366-369
Regiomontanus (Johannes Müller) 369-371
Riccati, Jacopo Francesco 197
Riccioli, Giovanni Battista 329
Roder, Christian 369
Rodrigues 22
Ruffini, Paolo 116
Schellbach 147
Schering 105
Schoenemann 118
Schooten, Franciscus van 214-217
Schwarz, Hermann Amandus 303-307
Segner 22
Simon, M. 224
Smith 77
Snellius, Willebrord 193, 321
Steiner, Jakob 165-170, 226-231, 255-257, 278, 283-285, 292, 359, 378-389
Stoll 375
Sturm, Jacques Charles Francois 112-116
Sylvester, James Joseph 16, 142
Tannery, P. 6
Taylor, H. M. 27
Torricelli, Evangelista 361-363
Ullherr 109
Urban, H. 26
Vieta (Viete), Francois 154
Vincent, A. J. H. 6
Vitruvius Pollio, Marcus 339
Viviani, Vincenzo 361
Wallis, John 86
Weber, H. 128
Weierstrass, Karl Theodor 109, 128
Weisbach 310
Wilson, J. 82
Wolf 77