100
Great Problems of
Elementary Mathematics
THEIR HISTORY AND SOLUTION
Heinrich Dörrie
Translated by David Antin
NEW YORK
DOVER PUBLICATIONS, INC.
Copyright © 1965 by Dover Publications, Inc.;
originally published in German under the title of
Triumph der Mathematik, © 1958 by Physica-Verlag,
Würzburg.
All rights reserved under Pan American and
International Copyright Conventions.
Published in Canada by General Publishing
Company, Ltd., 30 Lesmill Road, Don Mills, Toronto,
Ontario.
Published in the United Kingdom by Constable
and Company, Ltd., 10 Orange Street, London WC 2.
This Dover edition, first published in 1965, is a
new translation of the unabridged text of the fifth
edition of the work published by the Physica-Verlag,
Würzburg, Germany, in 1958 under the title
Triumph der Mathematik: Hundert berühmte
Probleme aus zwei Jahrtausenden mathematischer
Kultur.
This authorized translation is published by
special arrangement with the German-language
publishers, Physica-Verlag, Würzburg.
Standard Book Number: 486-61348-8
Library of Congress Catalog Card Number: 65-14030
Manufactured in the United States of America
Dover Publications, Inc.
180 Varick Street
New York, N.Y. 10014
Preface
A book collecting the celebrated problems of elementary
mathematics that would commemorate their origin and, above all,
present their solutions briefly, clearly, and comprehensibly has long
seemed a necessary and attractive task to the author.
The restriction to problems of elementary mathematics was
considered advisable in view of those readers who have neither the time
nor the opportunity to acquaint themselves in any detail with higher
mathematics. Nevertheless, in spite of this limitation a colorful and
compelling picture has emerged, one that gives an idea of the amazing
variety of mathematical methods and one that will—I hope—enchant
many who are interested in mathematics and who take pleasure in
characteristic mathematical thought processes. In the present work
there are to be found many pearls of mathematical art, problems the
solutions of which represent, in the achievements of a Gauss, an Euler,
Steiner, and others, incredible triumphs of the mathematical mind.
Because the difficult economic situation at the present time barred
the publication of a larger work, a limit had to be set to the scope and
number of the problems treated. Thus, I decided on a round number
of one hundred problems. Moreover, since many of the problems
and solutions require considerable space despite the greatest
concision, this had to be compensated for by the inclusion of a number of
mathematical miniatures. Possibly, however, it may be just these
little problems, which are, in their way, true jewels of mathematical
miniature work, that will find the readiest readers and win new
admirers for the queen of the sciences.
As we have indicated already, a knowledge of higher analysis is not
assumed. Consequently, the Taylor expansion could not be used for
the treatment of the important infinite series. I hope nonetheless
that the derivations we have given, particularly the striking derivation
of the sine and cosine series, will please and will not be found
unattractive even by mathematically sophisticated readers.
On the other hand, in some of the problems, e.g., the Euler
tetrahedron problem and the problem of skew lines, the author believed it
necessary not to dispense with the simplest concepts of vector analysis.
The characteristic advantages of brevity and elegance of the vector
method are so obvious, and the time and effort required for mastering
it so slight, that the vectorial methods presented here will undoubtedly
spur many readers on to look into this attractive area.
For the rest, only the theorems of elementary mathematics are
assumed to be known, so that the reading of the book will not entail
significant difficulties. In this connection the inclusion of the little
problems may in fact increase the acceptability of the book, in that
it will perhaps lead the mathematically weaker readers, after
completion of the simpler problems, to risk the more difficult ones as
well.
So then, let the book go out and do its part to awaken and spread
the interest and pleasure in mathematical thought.
Wiesbaden,
Fall, 1932
Heinrich Dörrie
Preface to the Second Edition
The second edition of the book contains few changes. An
insufficiency in the proof of the Fermat-Gauss Impossibility Theorem
has been eliminated, Problem 94 has been placed in historical
perspective and the Problem of the Length of the Polar Night, which
in relation to the other problems was of less significance, has been
replaced by a problem of a higher level: "André's Derivation of the
Secant and Tangent Series."
Wiesbaden,
Spring, 1940
Heinrich Dörrie
Contents
Arithmetical Problems
1. Archimedes' Problema Bovinum 3
2. The Weight Problem of Bachet de Méziriac 7
3. Newton's Problem of the Fields and Cows 9
4. Berwick's Problem of the Seven Sevens 11
5. Kirkman's Schoolgirl Problem 14
6. The Bernoulli-Euler Problem of the Misaddressed
Letters 19
7. Euler's Problem of Polygon Division 21
8. Lucas' Problem of the Married Couples 27
9. Omar Khayyam's Binomial Expansion 34
10. Cauchy's Mean Theorem 37
11. Bernoulli's Power Sum Problem 40
12. The Euler Number 44
13. Newton's Exponential Series 48
14. Nicolaus Mercator's Logarithmic Series 56
15. Newton's Sine and Cosine Series 59
16. André's Derivation of the Secant and Tangent Series 64
17. Gregory's Arc Tangent Series 69
18. Buffon's Needle Problem 73
19. The Fermat-Euler Prime Number Theorem 78
20. The Fermat Equation 86
21. The Fermat-Gauss Impossibility Theorem 96
22. The Quadratic Reciprocity Law 104
23. Gauss' Fundamental Theorem of Algebra 108
24. Sturm's Problem of the Number of Roots 112
25. Abel's Impossibility Theorem 116
26. The Hermite-Lindemann Transcendence Theorem 128
Planimetric Problems
27. Euler's Straight Line 141
28. The Feuerbach Circle 142
29. Castillon's Problem 144
30. Malfatti's Problem 147
31. Monge's Problem 151
32. The Tangency Problem of Apollonius 154
33. Mascheroni's Compass Problem 160
34. Steiner's Straight-edge Problem 165
35. The Delian Cube-doubling Problem 170
36. Trisection of an Angle 172
37. The Regular Heptadecagon 177
38. Archimedes' Determination of the Number π 184
39. Fuss' Problem of the Chord-Tangent Quadrilateral 188
40. Annex to a Survey 193
41. Alhazen's Billiard Problem 197
Problems Concerning Conic Sections and Cycloids
42. An Ellipse from Conjugate Radii 203
43. An Ellipse in a Parallelogram 204
44. A Parabola from Four Tangents 206
45. A Parabola from Four Points 208
46. A Hyperbola from Four Points 212
47. Van Schooten's Locus Problem 214
48. Cardan's Spur Wheel Problem 216
49. Newton's Ellipse Problem 217
50. The Poncelet-Brianchon Hyperbola Problem 219
51. A Parabola as Envelope 220
52. The Astroid 222
53. Steiner's Three-pointed Hypocycloid 226
54. The Most Nearly Circular Ellipse Circumscribing a
Quadrilateral 231
55. The Curvature of Conic Sections 236
56. Archimedes' Squaring of a Parabola 239
57. Squaring a Hyperbola 242
58. Rectification of a Parabola 247
59. Desargues' Homology Theorem (Theorem of
Homologous Triangles) 250
60. Steiner's Double Element Construction 255
61. Pascal's Hexagon Theorem 257
62. Brianchon's Hexagram Theorem 261
63. Desargues' Involution Theorem 265
64. A Conic Section from Five Elements 273
65. A Conic Section and a Straight Line 278
66. A Conic Section and a Point 278
Stereometric Problems
67. Steiner's Division of Space by Planes 283
68. Euler's Tetrahedron Problem 285
69. The Shortest Distance Between Skew Lines 289
70. The Sphere Circumscribing a Tetrahedron 292
71. The Five Regular Solids 295
72. The Square as an Image of a Quadrilateral 301
73. The Pohlke-Schwarz Theorem 303
74. Gauss' Fundamental Theorem of Axonometry 307
75. Hipparchus' Stereographic Projection 310
76. The Mercator Projection 314
Nautical and Astronomical Problems
77. The Problem of the Loxodrome 319
78. Determining the Position of a Ship at Sea 321
79. Gauss' Two-Altitude Problem 323
80. Gauss' Three-Altitude Problem 327
81. The Kepler Equation 330
82. Star Setting 334
83. The Problem of the Sundial 336
84. The Shadow Curve 340
85. Solar and Lunar Eclipses 342
86. Sidereal and Synodic Revolution Periods 346
87. Progressive and Retrograde Motion of the Planets 349
88. Lambert's Comet Problem 352
Extremes
89. Steiner's Problem Concerning the Euler Number 359
90. Fagnano's Altitude Base Point Problem 359
91. Fermat's Problem for Torricelli 361
92. Tacking Under a Headwind 363
93. The Honeybee Cell (Problem by Réaumur) 366
94. Regiomontanus' Maximum Problem 369
95. The Maximum Brightness of Venus 371
96. A Comet Inside the Earth's Orbit 374
97. The Problem of the Shortest Twilight 375
98. Steiner's Ellipse Problem 378
99. Steiner's Circle Problem 381
100. Steiner's Sphere Problem 384
Index of Names 391
Arithmetical Problems
1. Archimedes' Problema Bovinum
The sun god had a herd of cattle consisting of bulls and cows, one part of
which was white, a second black, a third spotted, and a fourth brown.
Among the bulls, the number of white ones was one half plus one third the
number of the black greater than the brown; the number of the black, one quarter
plus one fifth the number of the spotted greater than the brown; the number of
the spotted, one sixth and one seventh the number of the white greater than the
brown.
Among the cows, the number of white ones was one third plus one quarter of
the total black cattle; the number of the black, one quarter plus one fifth the
total of the spotted cattle; the number of the spotted, one fifth plus one sixth the
total of the brown cattle; the number of the brown, one sixth plus one seventh
the total of the white cattle.
What was the composition of the herd?
Solution. If we use the letters X, Y, Z, T to designate the
respective number of the white, black, spotted, and brown bulls and
x, y, z, t to designate the white, black, spotted, and brown cows, we
obtain the following seven equations for these eight unknowns:
(1) $X - T = \frac{5}{6}Y$,      (4) $x = \frac{7}{12}(Y + y)$,
(2) $Y - T = \frac{9}{20}Z$,     (5) $y = \frac{9}{20}(Z + z)$,
(3) $Z - T = \frac{13}{42}X$,    (6) $z = \frac{11}{30}(T + t)$,
                                 (7) $t = \frac{13}{42}(X + x)$.
From equations (1), (2), (3) we obtain 6X - 5Y = 6T, 20Y - 9Z = 20T,
42Z - 13X = 42T, and taking these three equations
as equations for the three unknowns X, Y, and Z, we find
$X = \frac{742}{297}T$,  $Y = \frac{178}{99}T$,  $Z = \frac{1580}{891}T$.
Since 891 and 1580 possess no common factors, T must be some
whole multiple, let us say G, of 891. Consequently,
(I) X = 2226G, Y = 1602G, Z = 1580G, T = 891G.
If these values are substituted into equations (4), (5), (6), (7), the
following equations are obtained:
12x - 7y = 11214G, 20y - 9z = 14220G,
30z - 11t = 9801G, 42t - 13x = 28938G.
These equations are solved for the four unknowns x, y, z, t and we
obtain
(II) cx = 7206360G, cy = 4893246G,
     cz = 3515820G, ct = 5439213G,
in which c is the prime number 4657. Since none of the coefficients
of G on the right can be divided by c, then G must be an integral
multiple of c:
G = cg.
If this value of G is introduced into (I) and (II), we finally obtain the
following relationships:
(III) X = 10366482g, Y = 7460514g,
      Z = 7358060g, T = 4149387g,
      x = 7206360g, y = 4893246g,
      z = 3515820g, t = 5439213g,
where g may be any positive integer.
The problem therefore has an infinite number of solutions. If g is
assigned the value 1, we obtain the following:
Solution in the Smallest Numbers
white bulls 10,366,482 white cows 7,206,360
black bulls 7,460,514 black cows 4,893,246
spotted bulls 7,358,060 spotted cows 3,515,820
brown bulls 4,149,387 brown cows 5,439,213
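The elimination above can be checked with exact rational arithmetic. The following is a minimal sketch, not part of the original solution: the function name `smallest_herd` and the coefficient names a, b, c, p, q, r, s are introduced here for illustration only. It solves equations (1) to (7) per unit of T and then clears denominators.

```python
from fractions import Fraction as F
from math import lcm  # available from Python 3.9


def smallest_herd():
    """Solve equations (1)-(7) exactly and scale to the least whole-number herd.

    Returns [X, Y, Z, T, x, y, z, t]: white, black, spotted, brown bulls,
    then white, black, spotted, brown cows.
    """
    # Bulls, from (1)-(3): X = T + a*Y, Y = T + b*Z, Z = T + c*X.
    a, b, c = F(5, 6), F(9, 20), F(13, 42)
    T = F(1)  # work per unit of T; denominators are cleared at the end
    k = 1 - a * b * c
    X = (1 + a + a * b) * T / k
    Y = (1 + b + b * c) * T / k
    Z = (1 + c + c * a) * T / k

    # Cows, from (4)-(7): x = p(Y + y), y = q(Z + z), z = r(T + t), t = s(X + x).
    p, q, r, s = F(7, 12), F(9, 20), F(11, 30), F(13, 42)
    # Substituting (5), (6), (7) into (4) leaves one linear equation for x.
    x = (p * Y + p * q * Z + p * q * r * T + p * q * r * s * X) / (1 - p * q * r * s)
    t = s * (X + x)
    z = r * (T + t)
    y = q * (Z + z)

    values = [X, Y, Z, T, x, y, z, t]
    # The least common denominator is the smallest admissible value of T.
    scale = lcm(*(v.denominator for v in values))
    return [int(v * scale) for v in values]


if __name__ == "__main__":
    print(smallest_herd())
```

Note that 1 - pqrs reduces to 4657/4800, which is where the prime c = 4657 of the text enters.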
Historical. As the above solution shows, the problem of the
cattle cannot properly be considered a very difficult problem, at least
in terms of present concepts. Since, however, in ancient times a
difficult problem was frequently referred to specifically as a problema
bovinum or else as a problema Archimedis, one may assume that the form
of the problem dealt with above does not represent the complete and
original form of Archimedes' problem, especially when one considers
the rest of Archimedes' brilliant achievements, as well as the fact that
Archimedes dedicated the cattle problem to the Alexandrian
astronomer Eratosthenes.
A "more complete" formulation of the problem is contained in a
manuscript (in Greek) discovered by Gotthold Ephraim Lessing in
the Wolfenbüttel library in 1773. Here the problem is posed in the
following poetic form, made up of twenty-two distichs, or pairs of
verses:
Number the sun god's cattle, my friend, with perfect precision.
Reckon them up with great care, if any wisdom you'd claim:
How many cattle were there that once did graze in the meadows
On the Sicilian isle, sorted by herds into four,
Each of these four herds differently colored: the first herd was milk-white,
Whereas the second gleamed in a deep ebony black.
Brown was the third group, the fourth was spotted; in every division
Bulls of respective hues greatly outnumbered the cows.
Now, these were the proportions among the cattle: the white ones
Equaled the number of brown, adding to that the third part
Plus one half of the ebony cattle all taken together.
Further, the group of the black equaled one fourth of the flecked
Plus one fifth of them, taken along with the total of brown ones.
Finally, you must assume, friend, that the total with spots
Equaled a sixth plus a seventh part of the herd of white cattle,
Adding to that the entire herd of the brown-colored kine.
Yet quite different proportions held for the female contingent:
Cows with white-colored hair equaled in number one third
Plus one fourth of the black-hued cattle, the males and the females.
Further, the cows colored black totaled in number one fourth
Plus one fifth of the whole spotted herd, in this computation
Counting in each spotted cow, each spotted bull in the group.
Likewise, the spotted cows comprised the fifth and the sixth part
Out of the total of brown cattle that went out to graze.
Lastly, the cows colored brown made up a sixth and a seventh
Out of the white-coated herd, female and male ones alike.
If, my friend, you can tell me exactly what was the number
Gathered together there then, also the accurate count
Color by color of every well-nourished male and each female,
Then with right you'll be called skillful in keeping accounts.
But you will not be reckoned a wise man yet; if you would be,
Come and answer me this, using new data I give:
When the entire aggregation of white bulls and that of the black bulls
Joined together, they all made a formation that was
Equally broad and deep; the far-flung Sicilian meadows
Now were thoroughly filled, covered by great crowds of bulls.
But when the brown and the spotted bulls were assembled together,
Then was a triangle formed; one bull stood at the tip;
None of the brown-colored bulls was missing, none of the spotted,
Nor was there one to be found different in color from these.
If this, too, you discover and grasp it well in your thinking,
If, my friend, you supply every herd's make-up and count,
Then with justice proclaim yourself victor and march about proudly,
For your fame will glow bright all through the world of the wise.
Lessing, however, disputed the authorship of Archimedes. So also
did Nesselmann (Algebra der Griechen, 1842), the French writer Vincent
(Nouvelles Annales de Mathématiques, vol. XV, 1856), the Englishman
Rouse Ball (A Short Account of the History of Mathematics), and others.
The distinguished Danish authority on Archimedes J. L. Heiberg
(Quaestiones Archimedeae), the French mathematician P. Tannery
(Sciences exactes dans l'antiquité), as well as Krummbiegel and Amthor
(Schlömilchs Zeitschrift für Mathematik und Physik, vol. XXV, 1880), on
the other hand, are of the opinion that this complete form of the
problem is to be attributed to Archimedes.
The two conditions set forth in the last seven distichs require that
X + Y be a square number $U^2$ and Z + T a triangular number*
$\frac{1}{2}V(V + 1)$, as a result of which we obtain the following relations:
(8) $X + Y = U^2$ and (9) $2Z + 2T = V^2 + V$.
If we substitute in (8) and (9) the values X, Y, Z, T in accordance
with (I), these equations are transformed into
$3828G = U^2$ and $4942G = V^2 + V$.
If we replace 3828, 4942, and G, respectively, with 4a (a being
equal to 3·11·29 = 957), b, and cg, we obtain
(8') $U^2 = 4acg$, (9') $V^2 + V = bcg$.
* A triangular number is a number n such that it is possible to construct
with n points a lattice of congruent equilateral triangles whose vertexes are the
points. The first triangular numbers are $1 = \frac{1}{2} \cdot 1 \cdot 2$, $3 = 1 + 2 = \frac{1}{2} \cdot 2 \cdot 3$,
$6 = 1 + 2 + 3 = \frac{1}{2} \cdot 3 \cdot 4$, $10 = 1 + 2 + 3 + 4 = \frac{1}{2} \cdot 4 \cdot 5$, etc.
U is consequently an integral multiple of 2, a, and c:
$U = 2acu$,
so that
$U^2 = 4a^2c^2u^2 = 4acg$
and
(8'') $g = acu^2$.
If this value for g is introduced into (9') we obtain
$V^2 + V = abc^2u^2$
or
$(2V + 1)^2 = 4abc^2u^2 + 1$.
If the unknown 2V + 1 is designated as v and the product
$4abc^2 = 4 \cdot 3 \cdot 11 \cdot 29 \cdot 2 \cdot 7 \cdot 353 \cdot 4657^2$ is abbreviated as d, the last
equation is transformed into
$v^2 - du^2 = 1$.
This is a so-called Fermat equation, which can be solved in the
manner described in Problem 20. The solution is, however, extremely
difficult because d has the inconveniently large value
d = 410286423278424
and even the smallest solution for u and v of this Fermat equation leads
to astronomical figures.
Even if u is assigned the smallest conceivable value 1, in solving for
g the value of ac is 4456749 and the combined number of white and
black bulls is over 79 billion (a billion here being a million million,
10^12). However, since the island of Sicily has
an area of only 25500 km2 = 0.0255 billion m2, i.e., less than -^
billion m2, it would be quite impossible to place that many bulls on
the island, which contradicts the assertion of the seventeenth and
eighteenth distichs.
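The size of the numbers involved is easy to confirm by machine. The sketch below is an illustration added here, with variable and function names chosen for the purpose; it recomputes d from its factors and evaluates the number of white and black bulls, 3828G with G = cg and g = acu^2, for a given u.

```python
from math import isqrt

a = 3 * 11 * 29   # 957, so that 4a = 3828
b = 2 * 7 * 353   # 4942
c = 4657          # the prime that appears in the cow equations

# Coefficient of the Fermat equation v^2 - d*u^2 = 1.
d = 4 * a * b * c**2


def white_and_black_bulls(u):
    """X + Y = 3828*G with G = c*g and g = a*c*u^2, as in (8'')."""
    g = a * c * u * u
    return 3828 * c * g


def is_square(n):
    root = isqrt(n)
    return root * root == n


if __name__ == "__main__":
    print(d)
    print(white_and_black_bulls(1), is_square(white_and_black_bulls(1)))
```

For u = 1 the square condition (8) is met by construction, since 3828ac^2 = (2ac)^2; the triangular condition (9) is what forces u to be a solution of the Fermat equation.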
2. The Weight Problem of Bachet de Méziriac
A merchant had a forty-pound measuring weight that broke into four pieces
as the result of a fall. When the pieces were subsequently weighed, it was
found that the weight of each piece was a whole number of pounds and that
the four pieces could be used to weigh every integral weight between 1 and
40 pounds.
What were the weights of the pieces?
This problem stems from the French mathematician Claude
Gaspard Bachet de Méziriac (1581-1638), who solved it in his famous
book Problèmes plaisants et délectables qui se font par les nombres, published
in 1624.
We can distinguish the two scales of the balance as the weight scale
and the load scale. On the former we will place only pieces of the
measuring weight; whereas on the load scale we will place the load
and any additional measuring weights. If we are to make do with as
few measuring weights as possible it will be necessary to place
measuring weights on the load scale as well. For example, in order to weigh
one pound with a two-pound and a three-pound piece, we place the
two-pound piece on the load scale and the three-pound piece on the
weight scale.
If we single out several from among any number of weights lying
on the scales, e.g., two pieces weighing 5 and 10 lbs each on one scale,
and three pieces weighing 1, 3, and 4 lbs each on the other, we say that
these pieces give the first scale a preponderance of 7 lbs.
We will consider only integral loads and measuring weights, i.e.,
loads and weights weighing a whole number of pounds.
If we have a series of measuring weights A, B, C, ..., which when
properly distributed upon the scales enable us to weigh all the integral
loads from 1 through n lbs, and if P is a new measuring weight of such
nature that its weight p exceeds the total weight n of the old measuring
weights by 1 more than that total weight:
$p - n = n + 1$ or $p = 2n + 1$,
it is then possible to weigh all integral loads from 1 through p + n =
3n + 1 by addition of the weight P to the measuring weights
A, B, C, ....
In fact, the old pieces are sufficient to weigh all loads from 1 to n lbs.
In order to weigh a load of (p + x) and/or (p — x) lbs, where x is one
of the numbers from 1 to n, we place the measuring weight P on the
weight scale and place weights A,B,C,... on the scales in such a
manner that these pieces give either the weight scale or the load scale
a preponderance of x lbs.
This being established, the solution of the problem is easy.
In order to carry out the maximum possible number of weighings
with two measuring weights, A and B, A must weigh 1 lb and B 3 lbs.
These two pieces enable us to weigh loads of 1, 2, 3, 4 lbs.
If we then choose a third piece C such that its weight
$c = 2 \cdot 4 + 1 = 9$ lbs,
it then becomes possible to use the three pieces A, B, C to weigh all
integral loads from 1 to c + 4 = 9 + 4 = 13.
Finally, if we choose a fourth piece D such that its weight
$d = 2 \cdot 13 + 1 = 27$ lbs,
the four weights A, B, C, D then enable us to weigh all loads from
1 to 27 + 13 = 40 lbs.
Conclusion. The four pieces weigh 1, 3, 9, 27 lbs.
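The rule p = 2n + 1 and the conclusion can be verified by brute force. The following is a minimal sketch with function names chosen here for illustration: each piece is placed on the weight scale (+1), on the load scale (-1), or left off the balance (0), and the resulting preponderances are collected.

```python
from itertools import product


def build_pieces(count):
    """Apply p = 2n + 1 repeatedly, n being the total weight of the pieces so far."""
    pieces, n = [], 0
    for _ in range(count):
        p = 2 * n + 1
        pieces.append(p)
        n += p
    return pieces


def weighable_loads(pieces):
    """All positive loads that the pieces can balance.

    A sign of +1 puts the piece on the weight scale, -1 on the load scale,
    0 leaves it off; the load equals the preponderance of the weight scale.
    """
    loads = set()
    for signs in product((-1, 0, 1), repeat=len(pieces)):
        total = sum(s * p for s, p in zip(signs, pieces))
        if total > 0:
            loads.add(total)
    return loads


if __name__ == "__main__":
    pieces = build_pieces(4)
    print(pieces, weighable_loads(pieces) == set(range(1, 41)))
```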
Note. Bachet's weight problem was generalized by the English
mathematician MacMahon. In Volume 21 of the Quarterly Journal
of Mathematics (1886) MacMahon determined all the conceivable sets
of integral weights with which all loads of 1 to n lbs can be weighed.
3. Newton's Problem of the Fields and Cows
In Newton's Arithmetica universalis (1707) the following interesting
problem is posed:
a cows graze b fields bare in c days,
a' cows graze b' fields bare in c' days,
a" cows graze b" fields bare in c" days;
what relation exists between the nine magnitudes a to c"?
It is assumed that all the fields provide the same amount of grass,
that the daily growth of the fields remains constant, and that all the
cows eat the same amount each day.
Solution. Let the initial amount of grass contained by each field
be M, the daily growth of each field m, and the daily grass consumption
of each cow Q.
On the evening of the first day the amount of grass remaining in
the b fields is
bM + bm - aQ,
on the evening of the second day
bM + 2bm - 2aQ,
on the evening of the third day
bM + 3bm - 3aQ,
etc., so that on the evening of the cth day
bM + cbm - caQ.
And this value must be equal to zero, since the fields are grazed bare
in c days. This gives rise to the equation
(1) bM + cbm = caQ.
In like manner the following equations are obtained:
(2) b'M + c'b'm = c'a'Q
and
(3) b"M + c"b"m = c"a"Q.
If (1) and (2) are taken as linear equations for the unknowns M
and m, we obtain
$M = \frac{cc'(ab' - ba')}{bb'(c' - c)}\,Q$,
$m = \frac{bc'a' - b'ca}{bb'(c' - c)}\,Q$.
If these values are introduced into equation (3) and the resulting
equation is multiplied by [bb'(c' - c)]/Q, we obtain the desired
relation:
b"cc'(ab' - ba') + c"b"(bc'a' - b'ca) = c"a"bb'(c' - c).
The solution is more easily seen when expressed in the form of
determinants. If q represents the negative of Q, equations (1), (2),
(3) assume the form
bM + cbm + caq = 0,
b'M + c'b'm + c'a'q = 0,
b"M + c"b"m + c"a"q = 0.
According to one of the basic theorems of determinant theory, the
determinant of a system of n (3 in this case) linear homogeneous
equations possessing n unknowns that do not all vanish (M, m, q in
this case) must be equal to zero. Consequently, the desired relation
has the form
$\begin{vmatrix} b & bc & ca \\ b' & b'c' & c'a' \\ b'' & b''c'' & c''a'' \end{vmatrix} = 0.$
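Both forms of the relation can be tested on consistent data. The sketch below is an illustration only: the values of M, m, Q and of the fields and days are assumptions chosen here so that the herds come out as whole numbers, and the function names are likewise introduced for the purpose. The herds are generated from equation (1), after which the expanded relation and the determinant are evaluated exactly.

```python
from fractions import Fraction as F


def cows_to_clear(b, c, M, m, Q):
    """Number of cows a that graze b fields bare in c days: bM + cbm = caQ."""
    return b * (M + c * m) / (c * Q)


def newton_relation(a, b, c, a1, b1, c1, a2, b2, c2):
    """Left side minus right side of the expanded relation (zero if consistent)."""
    return (b2 * c * c1 * (a * b1 - b * a1)
            + c2 * b2 * (b * c1 * a1 - b1 * c * a)
            - c2 * a2 * b * b1 * (c1 - c))


def determinant(a, b, c, a1, b1, c1, a2, b2, c2):
    """3x3 determinant with rows (b, bc, ca), expanded along the first row."""
    (p, q, r), (s, t, u), (v, w, x) = (
        (b, b * c, c * a),
        (b1, b1 * c1, c1 * a1),
        (b2, b2 * c2, c2 * a2),
    )
    return p * (t * x - u * w) - q * (s * x - u * v) + r * (s * w - t * v)


# Illustrative assumptions: grass per field, daily growth, daily ration.
M, m, Q = F(54, 5), F(9, 10), F(1)
fields_and_days = [(F(10, 3), 4), (F(10), 9), (F(24), 18)]
cows = [cows_to_clear(b, c, M, m, Q) for b, c in fields_and_days]

# Flatten to (a, b, c, a', b', c', a", b", c").
data = [value for a, (b, c) in zip(cows, fields_and_days) for value in (a, b, c)]

if __name__ == "__main__":
    print(cows, newton_relation(*data), determinant(*data))
```

The determinant is the negative of the expanded expression, so the two vanish together.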
4. Berwick's Problem of the Seven Sevens
In the following division example, in which the divisor goes into the dividend
without a remainder:
* * 7 * * * * * * * : * * * * 7 * = * * 7 * *
* * * * * *
* * * * * 7 *
* * * * * * *
    * 7 * * * *
    * 7 * * * *
    * * * * * * *
    * * * * 7 * *
        * * * * * *
        * * * * * *
the numbers that occupied the places marked with the asterisks (*) were
accidentally erased. What are the missing numbers?
This remarkable problem comes from the English mathematician
E. H. Berwick, who published it in 1906 in the periodical The School
World.
Solution. We will assign a separate letter to each of the missing
numerals. The example then has the following appearance:
A B 7 C D E L Q W z : α β γ δ 7 ε = κ λ 7 μ ν      First line
a b * c d e                                        Second line
F G H I K 7 L                                      Third line
f g h i k * l                                      Fourth line
    M 7 N O P Q                                    Fifth line
    m 7 n o p q                                    Sixth line (= 7·b)
    R S T U * V W                                  Seventh line
    r s t u 7 v w                                  Eighth line
        X Y Z x y z                                Ninth line
        X Y Z x y z                                Tenth line
I. The first numeral (α) of the divisor b must be 1, since 7b, as the
sixth line of the example shows, possesses six numerals, whereas if α
equaled 2, 7b would possess seven numerals.
Since the remainders in the third and seventh lines possess six
numerals, F must equal 1 and R must equal 1, as a result of which f
and r must also equal 1 (according to the outline).
Since b cannot exceed 199979 and μ cannot exceed 9, the product in
the eighth line cannot exceed 1799811, so that s < 8. And since S can
only be 9 or 0, and since there is no remainder
in the ninth line under s, only the second case is possible.
Consequently, S = 0 and (since R = 1) s is also equal to 0. It also follows
from R = 1 and S = 0 that M = m + 1, thus m ≤ 8, and the product
7b of the sixth line cannot be higher than 87nopq.
II. Consequently, the only possible values for the second divisor
numeral β are 0, 1, and 2. (7·130000 is already higher than 900000.)
β = 0 is eliminated because even when multiplied by nine 109979 does
not give a seven-figure number, which, for example, is required by
the eighth line.
Let us then consider the case of β = 1. This requires γ to be equal
to only 0 or 1. (If γ ≥ 2, on determination of the second figure of
line 6 one would have to add to 7β = 7·1 = 7 the amount ≥ 1 coming
from the product 7·γ, whereas the second figure must be 7.)
γ = 0, however, is impossible as a result of the seven figures of line
8, since not even 9·110979 yields a seven-figure product.
In the event that γ = 1 the following conditions must be observed,
as a glance at line 8 will show: δ, ε, and μ must be so chosen that
μ·111δ7ε results in a seven-place number, the third last figure of which
is 7. The only hope of this is offered by the multiplier μ = 9 (since
even 8·111979 has only six places). Now the third last figure of
9·111δ7ε, as is easily seen by experiment, can be a seven only if
δ = 0 or δ = 9. In the first case line 8 will not possess seven places
even when 111079 is multiplied by 9, and in the second case line 6 is
7·11197ε = 783***, which is impossible. Thus, the case of γ = 1 is
also excluded. The possibility of β equaling 1 must, therefore, be
discarded.
The only appropriate value for the second figure of the divisor is
therefore β = 2. From this it follows that m = 8 and M = 9.
III. The third figure γ of the divisor can only be 4 or 5, since
7·126000 is greater and 7·124000 is smaller than the sixth line.
Moreover, since 9·124000 is greater and 7·126000 is smaller than the
eighth line (10tu7vw), μ must be equal to 8.
Since 8·124979 = 999832 < 1000000 the assumption that γ = 4
fails to satisfy the requirements of line 8, and γ therefore has to be
equal to 5.
IV. Since the third last figure of 8·125δ7ε must be 7, we find by
testing that δ is equal to either 4 or 9. δ = 9 is eliminated because
even 7·125970 = 881790 comes out greater than the sixth line, so
that only δ = 4 is suitable. Thus, ε can be considered one of
numbers 0 to 4. However, whichever one of these is chosen, we find
for the third figure of the sixth line n = 8 from 7·12547ε = 878***.
Similarly, for the eighth line we obtain 8·12547ε = 10037**, and
consequently t = 0 and u = 3.
Since λb = λ·12547ε results in a seven-place fourth line and only
8b and 9b have seven places, λ is either 8 or 9.
V. From t = 0 and X ≥ 1 (together with R = r = 1, S = s = 0)
it follows that T ≥ 1, and from n = 8, N ≤ 9, it follows that T ≤ 1,
so that T = 1. N is therefore equal to 9 and X = 1. Since X = 1
and 2·b > 200000 (line 9), it follows that ν = 1 and also that
Y = 2, Z = 5, x = 4, y = 7, and z = ε. With the results obtained
at this point the problem has the following appearance:
A B 7 C D E L Q W ε : 1 2 5 4 7 ε = κ λ 7 8 1
a b * c d e
1 G H I K 7 L
1 g h i k * l
    9 7 9 O P Q
    8 7 8 o p q
    1 0 1 U * V W
    1 0 0 3 7 v w
        1 2 5 4 7 ε
        1 2 5 4 7 ε
VI. In this case ε is one of the five numbers
0, 1, 2, 3, 4.
These five cases correspond to the number series
vw = 60, 68, 76, 84, 92,
opq = 290, 297, 304, 311, 318
and, depending upon whether λ is equal to 8 or 9, the last two
figures of the fourth line are
60, 68, 76, 84, 92
or
30, 39, 48, 57, 66.
This presents ten different possibilities. If we test each of them by
going upward in three successive additions beginning from lines 9 and
8 to line 7, then from lines 7 and 6 to line 5, and finally from lines 5
and 4 to line 3, we find that only when ε = 3 and λ = 8 do we obtain
the requisite 7 for the next to last figure of line 3. In this case vw =
84, the seventh line ends in 6331, opq = 311, OPQ = 944, the fourth
line is 1003784, and GHIK7L = 101778. This gives the problem the
following appearance:
A B 7 C D E 8 4 1 3 : 1 2 5 4 7 3 = κ 8 7 8 1
a b * c d e
1 1 0 1 7 7 8
1 0 0 3 7 8 4
    9 7 9 9 4 4
    8 7 8 3 1 1
    1 0 1 6 3 3 1
    1 0 0 3 7 8 4
        1 2 5 4 7 3
        1 2 5 4 7 3
VII. Finally, since of all the multiples of b only 5b = 627365 added
to the division remainder 110177 of the third line gives a number
containing a 7 in the third place, we get κ = 5 and at the same time
the second line 627365 and AB7CDE = 737542, which gives us all of the
figures missing from the problem: 7375428413 : 125473 = 58781.
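The result can be checked against the skeleton by carrying out the long division mechanically. The following is a minimal sketch with names chosen here for illustration; it assumes that no quotient figure is zero (true in this example), so that every step contributes one product line and one remainder line. An asterisk in a pattern stands for any figure, a 7 for a 7.

```python
def long_division_rows(dividend, divisor):
    """Lines 2, 3, ... of the written long division, as strings.

    Assumes the first len(divisor) figures of the dividend already exceed
    the divisor and that no quotient figure is zero.
    """
    digits = str(dividend)
    pos = len(str(divisor))
    current = int(digits[:pos])
    rows = []
    while True:
        q = current // divisor
        rows.append(str(q * divisor))      # product line
        current -= q * divisor
        if pos == len(digits):
            break
        current = current * 10 + int(digits[pos])  # bring down the next figure
        pos += 1
        rows.append(str(current))          # remainder line
    return rows


def fits(pattern, number):
    """True if the number has the pattern's length and a 7 wherever it shows one."""
    return len(pattern) == len(number) and all(
        p == "*" or p == d for p, d in zip(pattern, number)
    )


SKELETON = ["******", "*****7*", "*******", "*7****", "*7****",
            "*******", "****7**", "******", "******"]


def fits_skeleton(dividend, divisor):
    quotient, remainder = divmod(dividend, divisor)
    rows = long_division_rows(dividend, divisor)
    return (remainder == 0
            and fits("**7*******", str(dividend))
            and fits("****7*", str(divisor))
            and fits("**7**", str(quotient))
            and len(rows) == len(SKELETON)
            and all(fits(p, r) for p, r in zip(SKELETON, rows)))


if __name__ == "__main__":
    print(long_division_rows(7375428413, 125473))
    print(fits_skeleton(7375428413, 125473))
```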
5. Kirkman's Schoolgirl Problem
In a boarding school there are fifteen schoolgirls who always take their daily
walks in rows of threes. How can it be arranged so that each schoolgirl walks
in the same row with every other schoolgirl exactly once a week?
This extraordinary problem was posed in the Lady's and Gentleman's
Diary for 1850, by the English mathematician T. P. Kirkman.
Of the great number of solutions that have been found we will
reproduce two. One is by the English minister Andrew Frost
("General Solution and Extension of the Problem of the 15
Schoolgirls," Quarterly Journal of Pure and Applied Mathematics, vol. XI, 1871);
the other is that of B. Pierce ("Cyclic Solutions of the School-girl
Puzzle," The Astronomical Journal, vol. VI, 1859-1861).
Frost's solution. Mathematically expressed the problem
consists of arranging the fifteen elements x, a1, a2, b1, b2, c1, c2, d1, d2,
e1, e2, f1, f2, g1, g2 in seven columns of five triplets each in such a way
that any two selected elements always occur in one and only one of
the 35 triplets. As the initial triplets of the seven columns we shall
select:
x a1 a2 | x b1 b2 | x c1 c2 | x d1 d2 | x e1 e2 | x f1 f2 | x g1 g2.
Then we have only to distribute the 14 elements a1, a2, b1, b2, ...,
g1, g2 correctly over the other four lines of our system.
Using the seven letters a, b, c, d, e, f, g, we form a group of triplets
in which each pair of elements occurs exactly once, specifically the
group:
abc, ade, afg, bdf, beg, cdg, cef. (The triplets are in alphabetical order.)
From this group it is possible to take for each column exactly four
triplets that contain all the letters except those contained in the first
line of the column. If we then place the appropriate triplets in
alphabetical order in each column, we obtain the following preliminary
arrangement:
Sun.     Mon.     Tues.    Wed.     Thurs.   Fri.     Sat.
x a1 a2  x b1 b2  x c1 c2  x d1 d2  x e1 e2  x f1 f2  x g1 g2
bdf      ade      ade      abc      abc      abc      abc
beg      afg      afg      afg      afg      ade      ade
cdg      cdg      bdf      beg      bdf      beg      bdf
cef      cef      beg      cef      cdg      cdg      cef
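The preliminary arrangement follows mechanically from the group of seven triplets. The sketch below, with names chosen here for illustration, first confirms that every pair of letters occurs in exactly one triplet and then selects, for each column, the four triplets that avoid the letter of its first line.

```python
from itertools import combinations

TRIPLETS = ["abc", "ade", "afg", "bdf", "beg", "cdg", "cef"]
LETTERS = "abcdefg"


def pair_counts(triplets):
    """How often each unordered pair of letters occurs within the triplets."""
    counts = {}
    for triplet in triplets:
        for pair in combinations(sorted(triplet), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return counts


def preliminary_columns(triplets, letters=LETTERS):
    """For each column letter, the triplets (alphabetically) that omit it."""
    return {letter: sorted(t for t in triplets if letter not in t)
            for letter in letters}


if __name__ == "__main__":
    for letter, column in preliminary_columns(TRIPLETS).items():
        print(letter, column)
```

Each column then contains every remaining letter exactly twice, which is what makes the indexing with 1 and 2 under rule I possible.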
Now we have to index the triplets bdf, beg, cdg, cef, ade, afg, abc, i.e.,
to provide them with the index numbers 1 and 2. We index them in
the order just mentioned, i.e., first all the triplets bdf, then all the
triplets beg, etc., observing the following rules:
I. When a letter in one column has received its index number, the
next time that letter occurs in the same column it receives the other
index number.
II. If two letters of a triplet have already been assigned index
numbers, these two index numbers must not be used in the same
sequence for the same letters in other triplets.
III. If the index number of a letter is not determined by rules I.
and II., the letter is assigned the index number 1.
The letters are indexed in three steps.
First step. The triplets bdf, beg, cdg, and all the letters aside from a
that can be indexed in accordance with this numbering system and
rules I., II., and III. are successively indexed.
Second step. The missing index numbers (in boldface in the
diagram) of the triplets ade and afg, as well as the index numbers
obtained in accordance with rule I. for the last two a's in line 2, are
assigned.
Third step. The still missing index numbers of the a's in columns
4 and 5 (in the empty spaces of the printed diagram) are inserted;
these are 2 in line 2 and 1 in line 3.
This method results in the following completed diagram, which
represents the solution of the problem.
Sun.       Mon.       Tues.      Wed.       Thurs.     Fri.       Sat.
x a1 a2    x b1 b2    x c1 c2    x d1 d2    x e1 e2    x f1 f2    x g1 g2
b1 d1 f1   a1 d2 e2   a1 d1 e1   a2 b2 c2   a2 b1 c1   a1 b2 c1   a1 b1 c2
b2 e1 g1   a2 f2 g2   a2 f1 g1   a1 f2 g1   a1 f1 g2   a2 d2 e1   a2 d1 e2
c1 d2 g2   c1 d1 g1   b1 d2 f2   b1 e1 g2   b2 d1 f2   b1 e2 g1   b2 d2 f1
c2 e2 f2   c2 e1 f1   b2 e2 g2   c1 e2 f1   c2 d2 g1   c2 d1 g2   c1 e1 f2
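Frost's completed diagram can be checked mechanically. The following is a minimal sketch, not part of Frost's method: it assumes the diagram as read from the indexing rules above (the dictionary `FROST` and the helper `is_kirkman_solution` are illustrative names) and tests that each day uses all fifteen girls and that no pair of girls walks together twice.

```python
from itertools import combinations

# Frost's completed diagram as read from the text: one list of five
# triplets per day. "a1" is the letter a with index 1; "x" is the
# fifteenth girl. (Transcription assumed, not taken from Frost's paper.)
FROST = {
    "Sun":   ["x a1 a2", "b1 d1 f1", "b2 e1 g1", "c1 d2 g2", "c2 e2 f2"],
    "Mon":   ["x b1 b2", "a1 d2 e2", "a2 f2 g2", "c1 d1 g1", "c2 e1 f1"],
    "Tues":  ["x c1 c2", "a1 d1 e1", "a2 f1 g1", "b1 d2 f2", "b2 e2 g2"],
    "Wed":   ["x d1 d2", "a2 b2 c2", "a1 f2 g1", "b1 e1 g2", "c1 e2 f1"],
    "Thurs": ["x e1 e2", "a2 b1 c1", "a1 f1 g2", "b2 d1 f2", "c2 d2 g1"],
    "Fri":   ["x f1 f2", "a1 b2 c1", "a2 d2 e1", "b1 e2 g1", "c2 d1 g2"],
    "Sat":   ["x g1 g2", "a1 b1 c2", "a2 d1 e2", "b2 d2 f1", "c1 e1 f2"],
}


def is_kirkman_solution(schedule):
    """True if every day seats all 15 girls in 5 rows of 3 and no pair repeats."""
    seen_pairs = set()
    girls = set()
    for rows in schedule.values():
        day = [tuple(row.split()) for row in rows]
        todays = [g for row in day for g in row]
        # Five rows of three, fifteen different girls on each day.
        if len(day) != 5 or any(len(row) != 3 for row in day):
            return False
        if len(set(todays)) != 15:
            return False
        girls.update(todays)
        for row in day:
            for pair in combinations(sorted(row), 2):
                if pair in seen_pairs:      # this pair already walked together
                    return False
                seen_pairs.add(pair)
    # 15 girls give 15*14/2 = 105 pairs; 35 triplets supply exactly 105.
    return len(girls) == 15 and len(seen_pairs) == 105


print(is_kirkman_solution(FROST))
```

Since 35 triplets contain exactly 105 pairs and fifteen girls form exactly 105 pairs, "no pair repeats" is equivalent to "every pair occurs exactly once".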
Pierce's solution (judged the best by Sylvester). Let one girl,
whom we will indicate as *, walk in the middle of the same row on all
days; we will divide the other girls into two groups of 7 and designate
the first group by the Arabic numbers 1 to 7 or else by lower-case
letters and the second group by the Roman numbers I to VII or else
by capital letters. We will let an equation such as R = s indicate
that the Roman number indicated by the letter R possesses the same
numerical value as the Arabic numeral corresponding to the letter s.
Also, we will designate the days of the week Sunday, Monday,...,
Saturday by the numerals 0, 1, 2,..., 6.
Let the Sunday arrangement have the following order:
$$\begin{array}{ccc}
a & \alpha & A \\
b & \beta & B \\
c & \gamma & C \\
d & * & D \\
E & F & G
\end{array}$$
From this, by adding r = R to each numeral, we obtain the
arrangement
$$\begin{array}{ccc}
a + r & \alpha + r & A + R \\
b + r & \beta + r & B + R \\
c + r & \gamma + r & C + R \\
d + r & * & D + R \\
E + R & F + R & G + R
\end{array}$$
for the rth weekday. Here every figure thus obtained that exceeds 7,
such as perhaps c + r or D + R, represents the girl whose number
(c + r − 7 or D + R − 7) is 7 below that figure, and it is
subsequently converted into that number.
The arrangements thus obtained yield the solution of the problem
if the following three conditions are satisfied:
I. The three differences $\alpha - a$, $\beta - b$, $\gamma - c$ are 1, 2, and 3.
II. The seven differences $A - a$, $A - \alpha$, $B - b$, $B - \beta$, $C - c$,
$C - \gamma$, $D - d$ form a complete residue system of incongruent numbers
to the modulus 7 (cf. No. 19).
III. The three differences $F - E$, $G - F$, $G - E$ are 1, 2, 3.
Proof. We take as a premise that the following congruences
(cf. No. 19) are all related to the modulus 7.
1. Each girl x of the first group will come together exactly once
with every other girl y of this group. The difference x − y is then
(according to I.) congruent to only one of the 6 differences $a - \alpha$,
$b - \beta$, $c - \gamma$, $\alpha - a$, $\beta - b$, $\gamma - c$. Let us assume $x - y \equiv \beta - b$
or $x - \beta \equiv y - b$. Thus, if r represents the number of the day of
the week that is congruent to $x - \beta$ (or $y - b$), then
$x \equiv \beta + r$ and $y \equiv b + r$,
so that the girls x and y walk in the same row on weekday r.
2. Each girl x of the first group comes together exactly once with
each girl X of the second group.
The difference X − x (according to II.) can be congruent to only
one of the seven differences $A - a$, $A - \alpha$, $B - b$, $B - \beta$, $C - c$,
$C - \gamma$, $D - d$. Let us assume $X - x \equiv C - \gamma$ or $X - C \equiv x - \gamma$.
If s = S is the weekday number that is congruent to $X - C$ (or
$x - \gamma$), then we have
$X \equiv C + S$ and $x \equiv \gamma + s$,
so that the girls X and x walk in the same row on weekday s.
3. Each girl X of the second group comes together exactly once
with every other girl Y of this group.
The difference X − Y is (according to III.) congruent to only one
of the differences $F - E$, $G - F$, $G - E$, $E - F$, $F - G$, $E - G$.
Let us assume that $X - Y \equiv G - F$ or $X - G \equiv Y - F$. Then if
R represents the weekday number that is congruent to $X - G$ (or
$Y - F$), we obtain
$X \equiv G + R$ and $Y \equiv F + R$,
so that the girls X and Y walk in the same row on weekday R.
Thus, we need only satisfy conditions I., II., and III. to obtain the
Sunday arrangement.
We choose $a = 1$, $\alpha = 2$, $b = 3$, consequently $\beta = 5$, and then
$c = 4$, so that $\gamma = 7$ and $d = 6$. We then select A = I, and thus
B = VI, C = II, and D = III, so that the differences mentioned in
condition II. are the numbers 0, −1, 3, 1, −2, −5, −3, which are
incongruent to the modulus 7. The numbers IV, V, and VII then
remain for the letters E, F, G.
The Sunday arrangement is therefore
1 2 I
3 5 VI
4 7 II
6 * III
IV V VII
The weekday rows, in order, are arranged in the following manner:
Monday (r = 1)     Tuesday (r = 2)    Wednesday (r = 3)
2   3   II         3   4   III        4   5   IV
4   6   VII        5   7   I          6   1   II
5   1   III        6   2   IV         7   3   V
7   *   IV         1   *   V          2   *   VI
V   VI  I          VI  VII II         VII I   III

Thursday (r = 4)   Friday (r = 5)     Saturday (r = 6)
5   6   V          6   7   VI         7   1   VII
7   2   III        1   3   IV         2   4   V
1   4   VI         2   5   VII        3   6   I
3   *   VII        4   *   I          5   *   II
I   II  IV         II  III V          III IV  VI
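The cyclic construction lends itself to a short computational check. The sketch below is an illustration only: it assumes the Sunday arrangement given above, encodes a Roman-numbered girl as the tuple `("R", k)` (an encoding chosen here, not Pierce's), builds the remaining days by adding r modulo 7, and confirms that all 105 pairs of girls occur exactly once.

```python
from itertools import combinations

# Sunday arrangement from the text. Arabic girls are the ints 1..7,
# Roman girls are ("R", k) with k in 1..7, and "*" is the girl who
# stays in the same place every day. (Encoding is an assumption.)
SUNDAY = [
    (1, 2, ("R", 1)),
    (3, 5, ("R", 6)),
    (4, 7, ("R", 2)),
    (6, "*", ("R", 3)),
    (("R", 4), ("R", 5), ("R", 7)),
]


def shift(girl, r):
    """Add r to a girl's number; results above 7 are reduced by 7."""
    if girl == "*":
        return girl
    if isinstance(girl, tuple):
        return ("R", (girl[1] - 1 + r) % 7 + 1)
    return (girl - 1 + r) % 7 + 1


def week(sunday):
    """Arrangements for the weekdays r = 0 (Sunday) through r = 6 (Saturday)."""
    return [[tuple(shift(g, r) for g in row) for row in sunday] for r in range(7)]


def all_pairs_once(days):
    """True if the 7 * 5 * 3 = 105 pairs walking together are all different."""
    pairs = [frozenset(p) for day in days for row in day
             for p in combinations(row, 2)]
    return len(pairs) == 105 and len(set(pairs)) == 105


days = week(SUNDAY)
print(days[1])               # Monday's five rows
print(all_pairs_once(days))
```

Changing any entry of `SUNDAY` so that one of the conditions I., II., III. fails makes `all_pairs_once` return `False`, which is a quick way to see that the three conditions are doing real work.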
The Bernoulli-Euler Problem of the Misaddressed Letters
To determine the number of permutations of n elements in which no element
occupies its natural place.
This problem was first considered by Niclaus Bernoulli (1687-1759),
the nephew of the two great mathematicians Jacob and Johann
Bernoulli. Later Euler became interested in the problem, which he
called a quaestio curiosa ex doctrina combinationis (a curious problem of
combination theory), and he solved it independently of Bernoulli.
The problem can be stated in a somewhat more concrete form as
the problem of the misaddressed letters:
Someone writes n letters and writes the corresponding addresses on n
envelopes. How many different ways are there of placing all the letters in the
wrong envelopes?
This problem is particularly interesting because of its ingenious
solution.
Let the letters be known as a, b, c, ..., the corresponding envelopes
as A, B, C, .... Let the number of misplacements, which we are
seeking, be designated as $\bar{n}$.
Let us first consider all the cases in which a finds its way into B and
b into A as one group, and all the cases in which a gets into B and b
does not get into A as a second group.
The first group obviously includes $\overline{n-2}$ cases.
The number of cases falling into the second group can be
determined if instead of b, c, d, e, ... and A, C, D, E, ... we write, say,
b', c', d', e', ... and B', C', D', E', .... Accordingly, the number is
$\overline{n-1}$.
The number of all the cases in which a ends up in B is then
$\overline{n-1} + \overline{n-2}$. Since each operation of placing "a in C," "a in D,"
... yields an equal number of cases, the total number $\bar{n}$ of all the
possible cases is
$$\bar{n} = (n - 1)\left[\overline{n-1} + \overline{n-2}\right].$$
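We write this recurrence formula
$$\bar{n} - n\cdot\overline{n-1} = \iota\left[\overline{n-1} - (n-1)\cdot\overline{n-2}\right],$$
in which $\iota$ represents −1, and apply it to the letter numbers 3, 4, 5,
... up to n. Thus, we obtain
$$\bar{3} - 3\cdot\bar{2} = \iota\left[\bar{2} - 2\cdot\bar{1}\right],$$
$$\bar{4} - 4\cdot\bar{3} = \iota\left[\bar{3} - 3\cdot\bar{2}\right],$$
$$\cdots$$
$$\bar{n} - n\cdot\overline{n-1} = \iota\left[\overline{n-1} - (n-1)\cdot\overline{n-2}\right].$$
By multiplying these (n − 2) equations we obtain
$$\bar{n} - n\cdot\overline{n-1} = \iota^{n-2}\left[\bar{2} - 2\cdot\bar{1}\right],$$
or, since $\bar{1} = 0$, $\bar{2} = 1$, and $\iota^{n-2} = \iota^n$,
$$\bar{n} - n\cdot\overline{n-1} = \iota^n.$$
We then divide this equation by n!, which gives
$$\frac{\bar{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{\iota^n}{n!}.$$
If we replace n in this formula by the series 2, 3, 4, ..., n, we obtain
$$\frac{\bar{2}}{2!} - \frac{\bar{1}}{1!} = \frac{\iota^2}{2!},$$
$$\frac{\bar{3}}{3!} - \frac{\bar{2}}{2!} = \frac{\iota^3}{3!},$$
$$\cdots$$
$$\frac{\bar{n}}{n!} - \frac{\overline{n-1}}{(n-1)!} = \frac{\iota^n}{n!}.$$
Addition of these (n − 1) equations results (since $\bar{1} = 0$) in
$$\frac{\bar{n}}{n!} = \frac{\iota^2}{2!} + \frac{\iota^3}{3!} + \cdots + \frac{\iota^n}{n!}.$$
From this we are finally able to obtain the desired number $\bar{n}$:
$$\bar{n} = n!\left(\frac{1}{2!} - \frac{1}{3!} + \frac{1}{4!} - + \cdots + \frac{\iota^n}{n!}\right).$$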
If § represents a symbol such that the application of the binomial
theorem (cf. No. 9) to $(\S - 1)^n$ allows $\nu!$ to be written for each power
$\S^\nu$ of the binomial expansion, the number can be expressed in the
simpler form
$$\bar{n} = (\S - 1)^n.$$
For a value such as n = 4, for example, we obtain
$$\bar{4} = (\S - 1)^4 = \S^4 - 4\S^3 + 6\S^2 - 4\S + 1 = 4! - 4\cdot 3! + 6\cdot 2! - 4\cdot 1! + 1 = 9,$$
which is easily checked by testing.
Similarly, the number of permutations that can be formed from n
elements in which no element is in its natural place is $(\S - 1)^n$.
For the four elements 1, 2, 3, 4, for example, there are the nine
permutations 2143, 2341, 2413, 3142, 3412, 3421, 4123, 4312, 4321.
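The recurrence, the alternating series, and a direct count can be compared numerically. The following sketch is only an illustration of the formulas derived above; the function names are chosen here and do not come from the text.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial


def misplacements_recurrence(n):
    """Number of misplacements from D(n) = (n - 1) * [D(n-1) + D(n-2)]."""
    if n == 1:
        return 0
    older, newer = 0, 1            # D(1) = 0, D(2) = 1
    for k in range(3, n + 1):
        older, newer = newer, (k - 1) * (newer + older)
    return newer


def misplacements_series(n):
    """Number of misplacements from n! * (1/2! - 1/3! + ... + (-1)^n / n!)."""
    total = sum(Fraction((-1) ** k, factorial(k)) for k in range(2, n + 1))
    return int(factorial(n) * total)


def misplacements_brute_force(n):
    """Count permutations of 0..n-1 in which no element keeps its place."""
    return sum(all(p[i] != i for i in range(n))
               for p in permutations(range(n)))


for n in range(1, 8):
    print(n, misplacements_recurrence(n),
          misplacements_series(n), misplacements_brute_force(n))
```

Exact fractions are used in the series so that no rounding can creep in; for n = 4 all three functions give the value 9 obtained in the text.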
Note. The result obtained also contains the solution of the
determinant problem:
In how many constituents of an n-degree determinant do no principal diagonal
elements occur?
This is immediately seen if the rth element of the sth column is
called $c_r^s$. The elements of the principal diagonal are then
$$c_1^1,\ c_2^2,\ c_3^3,\ \ldots,\ c_n^n.$$
The determinant therefore contains $(\S - 1)^n$ constituents in which no
principal diagonal element occurs.
Euler's Problem of Polygon Division
In how many ways can a (plane convex) polygon of n sides be divided into
triangles by diagonals?
Leonhard Euler posed this problem in 1751 to the mathematician
Christian Goldbach. For the number to be found, $E_n$, the number of
possible divisions, Euler developed the formula:
(1)   $E_n = \dfrac{2\cdot 6\cdot 10\cdots(4n-10)}{(n-1)!}$.
This problem is of the greatest interest because it involves many
difficulties in spite of its innocuous appearance, as many a surprised
reader will discover if he attempts to derive the Euler formula without
outside assistance. Euler himself said, "The process of induction I
employed was quite laborious."
In the simplest cases n = 3, 4, 5, 6 the various divisions
$E_3 = 1$, $E_4 = 2$, $E_5 = 5$, $E_6 = 14$
are easily obtained from the graphic representations. But this
method soon becomes impossible as the number of angles is increased.
In 1758 Segner, to whom Euler had communicated the first seven
division numbers 1, 2, 5, 14, 42, 132, 429, established a recurrence
formula for $E_n$ (Novi Commentarii Academiae Petropolitanae pro annis
1758 et 1759, vol. VII) which we will begin by deriving.
Let the angles of any convex polygon of n angles be 1, 2, 3, ..., n.
For each of the $E_n$ possible divisions of the polygon of n angles we may take
the side n1 as the base line of a triangle the apex of which is situated
at one of the angles 2, 3, 4, ..., n − 1 in accordance with the division
selected. If the apex is, for example, situated at angle r, on one
side of the triangle n1r there is a polygon of r angles and on the
other a polygon of s angles, r + s being equal to n + 1 (since the
apex r belongs to both the polygon of r angles and the polygon of
s angles).
Since the polygon of r angles (or r-gon) permits $E_r$ divisions and the
s-gon permits $E_s$ divisions, and since each division of the r-gon can be
combined with every division of the s-gon to form a division of the
given n-gon, the mere choice of the apex r results in $E_r \cdot E_s$ different
divisions of the given n-gon.
Since, then, r can possess successively every value of the series
2, 3, ..., n − 1 and s can accordingly possess successively every value
of the series n − 1, n − 2, ..., 3, 2, it follows that
(2)   $E_n = E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2$,
where the factor E2, which is merely added for better appearance, has
the value 1.
Formula (2) is Segner's recurrence formula. It confirms the
previously given values for E3 to E6 as well as giving
$E_7 = E_2E_6 + E_3E_5 + E_4E_4 + E_5E_3 + E_6E_2 = 42$,
$E_8 = E_2E_7 + E_3E_6 + E_4E_5 + E_5E_4 + E_6E_3 + E_7E_2 = 132$,
etc.
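Segner's recurrence (2) and Euler's formula (1) are easy to compare by machine. The sketch below is a minimal illustration with function names chosen here; it uses $E_2 = 1$ as in the text and evaluates Euler's product only for n ≥ 3.

```python
from math import factorial


def segner(n_max):
    """E_2 .. E_n_max from Segner's recurrence (2), starting from E_2 = 1."""
    E = {2: 1}
    for n in range(3, n_max + 1):
        # The apex r runs over 2 .. n-1 and s = n + 1 - r.
        E[n] = sum(E[r] * E[n + 1 - r] for r in range(2, n))
    return E


def euler(n):
    """Euler's formula (1): 2 * 6 * 10 * ... * (4n - 10) / (n - 1)!  for n >= 3."""
    product = 1
    for k in range(3, n + 1):
        product *= 4 * k - 10
    return product // factorial(n - 1)


E = segner(12)
for n in range(3, 13):
    print(n, E[n], euler(n))
```

The recurrence needs all earlier values to produce the next one, whereas the product formula gives each $E_n$ directly, which is the contrast the text draws between the two.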
As the index number is increased Segner's formula, in contrast with
Euler's, grows more and more unwieldy, as Goldbach has already
indicated.
We can obtain the Euler formula (1) most simply if we consider
Euler's division problem or Segner's recurrence formula in the light
of an idea of Rodrigues (Journal de Mathématiques, 3 [1838]) and
connect it with a problem treated by the French mathematician
Catalan in the year 1838 in the Journal de Mathématiques.
Catalan's problem has the form:
How many different ways can a product of n different factors be calculated
by pairs?
We say that a product is calculated by pairs when it is always only
two factors that are multiplied together and when the product
arising from such a "paired" multiplication is used as one factor in
the continuation of the calculation. Calculation by pairs of the
product 3·4·5·7, for example, is carried out in the following manner:
3·5 = 15, 4·15 = 60, 7·60 = 420. For the four-membered product
abcd an alphabetical arrangement of the factors gives the following
five paired multiplications:
[(a·b)·c]·d, [a·(b·c)]·d, (a·b)·(c·d), a·[(b·c)·d], a·[b·(c·d)].
A product in which the paired multiplications that are to be carried
out are marked by brackets or the like will be referred to in abbreviated
form as "paired."
{[(a·b)·c]·[(d·e)·(f·g)]}·{(h·i)·k} is therefore a paired product of
the ten factors a to k. It is immediately seen that a paired product
of n factors contains (n − 1) multiplication signs and correspondingly
involves (n − 1) paired multiplications (for every two factors).
Catalan's problem requires the answers to two questions:
1. How many paired products of n different prescribed factors are there?
2. How many paired products can be formed from n factors if the sequence
of the factors (e.g., an alphabetical sequence) is prescribed?
The first number we will designate as Rn and the second as Cn.
The simplest method of obtaining $R_n$ (according to Rodrigues) is
by means of a recurrence formula. We will imagine the $R_n$
n-membered paired products to be formed of the n given factors
$f_1, f_2, \ldots, f_n$; we will add to this an (n + 1)th factor $f_{n+1} = f$ and
form from the available $R_n$ n-membered products all the $R_{n+1}$
(n + 1)-membered products of the factors $f_1, f_2, \ldots, f_{n+1}$.
Now each of the $R_n$ n-membered products P includes (n − 1)
paired multiplications of the form A·B. If we use f once as the
multiplier in front of A, once as the multiplicand after A, once as
the multiplier before B and once as the multiplicand after B, we
thereby obtain from A·B four new paired products (f·A)·B,
(A·f)·B, A·(f·B), and A·(B·f).
Since these four arrangements of the factor f can be effected for
each of the n − 1 paired subproducts of P, we obtain from P
4(n − 1) (n + 1)-membered paired products. Moreover, we also
obtain from P the two (n + 1)-membered paired products f·P and
P·f. The described arrangement of the factor f thus yields from
only one (P) of the $R_n$ n-membered products (4n − 2)
(n + 1)-membered products. From all $R_n$ n-membered paired products we
therefore obtain $R_n \cdot (4n - 2)$ (n + 1)-membered paired products.
The sought-for recurrence formula accordingly reads
(3)   $R_{n+1} = (4n - 2)R_n$.
To obtain an independent representation of $R_n$ we begin with
$R_2 = 2$ (two factors a and b yield only two products: a·b and b·a)
and we infer from (3) $R_3 = 6R_2 = 2\cdot 6$, $R_4 = 10R_3 = 2\cdot 6\cdot 10$,
$R_5 = 14R_4 = 2\cdot 6\cdot 10\cdot 14$, etc., and finally
(4)   $R_n = 2\cdot 6\cdot 10\cdot 14\cdots(4n - 6)$.
The second question can also be answered by returning to a
recurrence formula.
Let the n factors $f_\nu$ in the prescribed order be $\varphi_1, \varphi_2, \ldots, \varphi_n$. We
will take from the $C_n$ paired n-membered products belonging to this
series those having the form
( )·( ),
where the parenthesis on the left includes the r members $\varphi_1, \varphi_2, \ldots, \varphi_r$
and the one on the right the s = n − r members $\varphi_{r+1}, \varphi_{r+2}, \ldots,
\varphi_{r+s} = \varphi_n$. Since the left parenthesis, in accordance with its r
members, can possess $C_r$ different forms and the right correspondingly
can possess $C_s$ different forms, while each form belonging to the left
parenthesis can combine with each form included in the right
parenthesis, the above main form yields $C_r \cdot C_s$ different n-membered paired
products.
Since, moreover, r can have every value from 1 to n − 1, it follows
that
(5)   $C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1$.
By using this recurrence formula and beginning from $C_1 = 1$ and
$C_2 = 1$, we obtain the following sequence
$C_3 = C_1C_2 + C_2C_1 = 2$,
$C_4 = C_1C_3 + C_2C_2 + C_3C_1 = 5$,
$C_5 = C_1C_4 + C_2C_3 + C_3C_2 + C_4C_1 = 14$,
etc.
To obtain an independent representation of $C_n$ we can imagine
that there are n! different sequences (permutations) of the factors
$f_1, f_2, \ldots, f_n$, that each of these sequences possesses $C_n$ paired
n-membered products and that all the sequences together possess $R_n$
such products. Then $R_n = C_n \cdot n!$ or
(6)   $C_n = \dfrac{R_n}{n!} = \dfrac{2\cdot 6\cdot 10\cdots(4n-6)}{n!}$.
Formulas (4) and (6) solve Catalan's problem.
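Both of Catalan's counts can be confirmed for small n by listing the paired products outright. The sketch below is an illustration under assumed conventions: a paired product is represented as a nested tuple `(left, right)`, and the function names are chosen here rather than taken from the text.

```python
from itertools import permutations
from math import factorial


def bracketings(factors):
    """All paired products of the factors in the given order, as nested tuples."""
    if len(factors) == 1:
        return [factors[0]]
    found = []
    # Split into a left parenthesis of r members and a right one of n - r.
    for r in range(1, len(factors)):
        for left in bracketings(factors[:r]):
            for right in bracketings(factors[r:]):
                found.append((left, right))
    return found


def R_formula(n):
    """Formula (4): 2 * 6 * 10 * ... * (4n - 6)."""
    product = 1
    for k in range(2, n + 1):
        product *= 4 * k - 6
    return product


def C_formula(n):
    """Formula (6): R_n / n!."""
    return R_formula(n) // factorial(n)


def all_paired_products(n):
    """Paired products of n different factors taken in every possible order."""
    letters = tuple("abcdefgh"[:n])
    products = set()
    for order in permutations(letters):
        products.update(bracketings(order))
    return products


for n in range(2, 6):
    ordered = len(bracketings(tuple("abcdefgh"[:n])))
    print(n, ordered, C_formula(n), len(all_paired_products(n)), R_formula(n))
```

Because `(a, b)` and `(b, a)` are different tuples, the enumeration counts a·b and b·a separately, which matches the convention $R_2 = 2$ used in the text.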
Now for Euler's formula!
From the indicated values
$E_2 = 1$, $E_3 = 1$, $E_4 = 2$, $E_5 = 5$,
$C_1 = 1$, $C_2 = 1$, $C_3 = 2$, $C_4 = 5$
and formulas (2) and (5) it immediately follows that in general
(7)   $E_n = C_{n-1}$.
[The proof is by induction. We assume that (7) is true for all
indices through n, so that $E_2 = C_1$, $E_3 = C_2$, ..., $E_n = C_{n-1}$.
According to (2) and (5)
$E_{n+1} = E_2E_n + E_3E_{n-1} + \cdots + E_nE_2$,
$C_n = C_1C_{n-1} + C_2C_{n-2} + \cdots + C_{n-1}C_1$.
Since the right sides of the two last equations correspond member
for member, it also follows that
$E_{n+1} = C_n$;
i.e., formula (7) is valid for every index.]
(6) and (7) give us Euler's formula immediately:
(8)   $E_n = \dfrac{2\cdot 6\cdot 10\cdots(4n-10)}{(n-1)!}$.
In conclusion we would like to give a slight simplification of Euler's
formula. It is
$$E_n = \frac{2^{n-2}\cdot 1\cdot 3\cdot 5\cdots(2n-5)}{(n-1)!} = \frac{2^{n-2}\,(2n-3)!}{(n-1)!\;2^{n-2}\,(n-2)!\,(2n-3)}$$
and consequently
$$E_n = \binom{k}{f}\Big/ k,$$
where f = n − 2 is the number of triangles into which the n-gon can
always be divided and k = 2n − 3 is the number of sides bounding
these triangles.
Recently (Zeitschrift für math. und naturw. Unterricht, 1941, vol. 4)
H. Urban derived Euler's formula in the following manner.
He first calculated $E_5$, $E_6$, $E_7$ by means of the Segner recurrence
formula and "inferred" the following:
$E_2 = 1$, $E_3 = 1$, $E_4 = 2$, $E_5 = 5$, $E_6 = 14$, $E_7 = 42$,
$$\frac{E_3}{E_2} = \frac{2}{2},\quad \frac{E_4}{E_3} = \frac{6}{3},\quad \frac{E_5}{E_4} = \frac{10}{4},\quad \frac{E_6}{E_5} = \frac{14}{5},\quad \frac{E_7}{E_6} = \frac{18}{6},$$
on the strength of which he surmised that $E_n$ would have to be
(I)   $E_n = \dfrac{4n-10}{n-1}\,E_{n-1}$.
(Unfortunately, he does not say whether it was Euler's recurrence
formula or some other idea that led him to his "inference.")
This recurrence formula is certainly correct for the first values of
the index n. To prove its general validity the conclusion for n is
applied to n + 1: it is assumed that the recurrence formula (I) is
true for all index numbers from 1 to n — 1 and it is demonstrated that
it is therefore also true for n.
The proof is carried out by means of the expression
(II)   $S = 1\cdot E_2\cdot E_{n-1} + 2\cdot E_3\cdot E_{n-2} + 3\cdot E_4\cdot E_{n-3} + \cdots + (n-2)\cdot E_{n-1}\cdot E_2$
or, written in the reverse order,
(III)   $S = (n-2)\cdot E_{n-1}\cdot E_2 + (n-3)\cdot E_{n-2}\cdot E_3 + (n-4)\cdot E_{n-3}\cdot E_4 + \cdots + 1\cdot E_2\cdot E_{n-1}$.
Columnar addition of these two equations gives
$$2S = (n-1)\left[E_2E_{n-1} + E_3E_{n-2} + \cdots + E_{n-1}E_2\right]$$
or, since in accordance with Segner's recurrence formula the value of
the expression within the brackets is equal to $E_n$,
(IV)   $2S = (n-1)E_n$.
Now the left-hand factor $E_r$ in each product $E_r \cdot E_s$ of (II) and (III)
(except the case in which r = 2) is replaced in accordance with the
recurrence formula (I) by $\lambda_{r-1}E_{r-1}/(r - 1)$ with $\lambda_\nu = 4\nu - 6$.
This gives us
(II')   $S = E_2E_{n-1} + \lambda_2E_2E_{n-2} + \lambda_3E_3E_{n-3} + \cdots + \lambda_{n-2}E_{n-2}E_2$,
(III')   $S = \lambda_{n-2}E_{n-2}E_2 + \lambda_{n-3}E_{n-3}E_3 + \cdots + \lambda_2E_2E_{n-2} + E_2E_{n-1}$
and by columnar addition of these two lines, since $\lambda_\nu + \lambda_{n-\nu} = 4n - 12$
and $E_2 = 1$, we obtain
$$2S = E_{n-1} + (4n-12)\left[E_2E_{n-2} + E_3E_{n-3} + \cdots + E_{n-2}E_2\right] + E_{n-1}$$
or, since the expression within brackets is equal to $E_{n-1}$,
(V)   $2S = (4n - 10)E_{n-1}$.
Equations (IV) and (V) give us
$$E_n = \frac{4n-10}{n-1}\,E_{n-1},$$
so that Euler's recurrence formula (I) is thereby shown to be valid for
the index number n, also, and thus generally valid.
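Urban's two evaluations of the auxiliary sum S can be checked numerically. The sketch below is only an illustration: it computes the $E_n$ from Segner's recurrence, forms S as in (II), and compares 2S with the right sides of (IV) and (V); the function names are chosen here.

```python
def segner_numbers(n_max):
    """E_2 .. E_n_max from Segner's recurrence, starting from E_2 = 1."""
    E = {2: 1}
    for n in range(3, n_max + 1):
        E[n] = sum(E[r] * E[n + 1 - r] for r in range(2, n))
    return E


def urban_sum(E, n):
    """S from (II): the terms (r - 1) * E_r * E_{n+1-r} for r = 2 .. n-1."""
    return sum((r - 1) * E[r] * E[n + 1 - r] for r in range(2, n))


E = segner_numbers(15)
for n in range(3, 16):
    S = urban_sum(E, n)
    # Columns: n, 2S, right side of (IV), right side of (V).
    print(n, 2 * S, (n - 1) * E[n], (4 * n - 10) * E[n - 1])
```

A numerical agreement of the last three columns is of course no substitute for the induction; it merely makes the identity $(n-1)E_n = (4n-10)E_{n-1}$ visible for the first few indices.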
Lucas' Problem of the Married Couples
How many ways can n married couples be seated about a round table in such
a manner that there is always one man between two women and none of the men
is ever next to his own wife?
This problem appeared (probably for the first time) in 1891 in the
Théorie des Nombres of the French mathematician Édouard Lucas
(1842-1891), author of the famous work Récréations mathématiques.
The English mathematician Rouse Ball has said of this problem,
"The solution is far from easy."
The problem has been solved by the Frenchmen M. Laisant and
M. C. Moreau and by the Englishman H. M. Taylor. A solution
based upon modern viewpoints is to be found in MacMahon's
Combinatory Analysis. The approach adopted here is essentially that of
Taylor (The Messenger of Mathematics, 32, 1903).
We will number the series of circularly arranged chairs from 1
through 2n. The wives will then all have to be seated on the even-
or odd-numbered chairs. In each of these two cases there are n!
different possible seating arrangements, so that there are 2·n! different
possible seating arrangements for the women alone.
We will assume that the women have been seated in one of these
arrangements and we will maintain this seating arrangement
throughout the following. The nucleus of the problem then consists of
determining the number of possible ways of seating the men between the
women.
Let us designate the women in the assumed seating sequence as
$F_1, F_2, \ldots, F_n$, their respective husbands as $M_1, M_2, \ldots, M_n$, the
couples $(F_1, M_1), (F_2, M_2), \ldots$ as 1, 2, ..., and arrangements in
which there are n married couples as n-pair arrangements. Let us
designate the husbands about whom we have no further information
as $X_1, X_2, \ldots$.
Let
$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}X_{n+1}$
be an (n + 1)-pair arrangement in which none of the husbands sits
beside his own wife. (It must be remembered that the arrangement
is circular, so that $X_{n+1}$ is seated between $F_{n+1}$ and $F_1$.) If we take
$F_{n+1}$ and $M_{n+1} = X_\nu$ out of the arrangement and replace $X_\nu$ with
$X_{n+1} = M_\mu$, we obtain the n-pair arrangement
$F_1X_1F_2X_2 \ldots F_\nu M_\mu F_{\nu+1} \ldots F_nX_n$.
This arrangement can occur in three ways:
1. No man sits next to his wife
(thus $M_\mu \neq M_\nu$, $M_\mu \neq M_{\nu+1}$, $X_n \neq M_1$).
2. One man sits next to his own wife (namely when
$M_\mu = M_\nu$ or $M_\mu = M_{\nu+1}$ or else $X_n = M_1$).
3. Two men sit next to their own wives (when $M_\mu = M_\nu$ or
$M_\mu = M_{\nu+1}$ and at the same time $X_n = M_1$, that is, when in our
arrangement the order $M_1F_1$ occurs).
Thus, we must consider other seating arrangements in addition to
the one prescribed in the problem.
In the following we will distinguish between three types of
arrangements: arrangements A, B, and C. An A-arrangement will be
one in which no man sits next to his wife. A B-arrangement
will be one in which a certain man sits on a certain side of his wife.
Finally, a C-arrangement will be one in which a certain man sits on
a certain side of his wife and another man (which one is not
prescribed) sits alongside his wife, the side likewise not being prescribed.
We will designate the number of n-pair A-, B-, C-arrangements as
$A_n$, $B_n$, $C_n$, respectively.
First we will try to determine the relationships among the six
magnitudes $A_n$, $B_n$, $C_n$, $A_{n+1}$, $B_{n+1}$, $C_{n+1}$; we will begin with the
simplest of these relationships.
Let us consider the $B_{n+1}$ B-arrangements
$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}M_{n+1}$
of the pairs 1, 2, ..., (n + 1), in which $M_{n+1}$ sits next to $F_{n+1}$ on her
right. We will divide the arrangements into two groups in accordance
with whether $X_n = M_1$ or $X_n \neq M_1$. We then remove the pair
$F_{n+1}M_{n+1}$ from all of them. The first group then gives us all $B_n$
n-pair B-arrangements, and the second all $A_n$ n-pair A-arrangements,
so that
(1)   $B_{n+1} = B_n + A_n$.
We can obtain a second relationship by considering the $C_{n+1}$
(n + 1)-pair C-arrangements
$M_1F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}$,
in which one of the men $X_1, X_2, \ldots, X_n$ sits next to his own wife.
We also divide these arrangements into two groups in accordance with
whether or not $X_1$ is equal to $M_{n+1}$.
The second group then contains (2n − 1) subgroups. In the first,
$M_2$ is seated on the left of $F_2$, in the second on her right; in the third,
$M_3$ sits on the left of $F_3$, in the fourth on her right, etc.; in the
(2n − 1)th, $M_{n+1}$ is seated on the left of $F_{n+1}$.
If we leave the pair $M_1F_1$ out of all of the $C_{n+1}$ C-arrangements,
we obtain from the first group all $C_n$ C-arrangements of the pairs
2, 3, 4, ..., (n + 1) in which $M_{n+1}$ is seated on the right of $F_{n+1}$, and
from each subgroup of the second group we obtain $B_n$ B-arrangements
of the pairs 2, 3, ..., (n + 1), so that
(2)   $C_{n+1} = C_n + (2n - 1)B_n$.
As we found above, if we remove the pair $F_{n+1}$, $M_{n+1}$ from an
(n + 1)-pair A-arrangement $F_1X_1F_2X_2 \ldots F_{n+1}X_{n+1}$ and replace the
$M_{n+1}$ that has been removed with $X_{n+1}$, the arrangement is
transformed into an n-pair A-, B-, or C-arrangement.
Conversely, we obtain an A-arrangement of the (n + 1) pairs
1, 2, ..., (n + 1) when we insert $F_{n+1}M_{n+1}$ before $F_1$ of an A-, B-, or
C-arrangement of the n pairs 1, 2, ..., n and then exchange the places
of $M_{n+1}$ and some other man (in such a manner that none of the men
is seated next to his own wife after the exchange of places). It is also
clear that this method gives us all the A-arrangements of the (n + 1)
pairs 1, 2, ..., (n + 1).
In order to find $A_{n+1}$ it is therefore only necessary to determine the
number of ways in which this insertion and the subsequent exchange
can be accomplished for all possible A-, B-, and C-arrangements of the
n pairs 1 through n.
We accomplish the described formation of the (n + 1)-pair
A-arrangements in three steps.
I. Formation from A-arrangements.
After the insertion:
$F_1X_1F_2X_2 \ldots F_nX_nF_{n+1}M_{n+1}$
we can exchange the places of $M_{n+1}$ and any other man except $X_n$
and $M_1$, so that from each of the $A_n$ n-pair A-arrangements we obtain
(n − 2) (n + 1)-pair A-arrangements. Consequently, we obtain a
total of
$(n - 2)A_n$ (n + 1)-pair A-arrangements.
II. Formation from B-arrangements.
The n-pair B-arrangements exhibit the following 2n forms:
1. $\ldots F_1M_1F_2 \ldots$,
2. $\ldots F_1M_2F_2 \ldots$,
3. $\ldots F_2M_2F_3 \ldots$,
...
(2n − 2). $\ldots F_{n-1}M_nF_n \ldots$,
(2n − 1). $\ldots F_nM_nF_1 \ldots$,
2n. $\ldots F_nM_1F_1 \ldots$.
And there are $B_n$ of each of these forms.
Our process of formation is not applicable to the first and the
(2n − 1)th of these forms (since the inserted $M_{n+1}$ would have to be
exchanged with $M_1$ or $M_n$, as a result of which, however, $M_1$ would
end up on the left side of $F_1$, or $M_{n+1}$ would be on the left side
of $F_{n+1}$).
In the second, third, ..., (2n − 2)th form, the exchange of the
inserted $M_{n+1}$ with $M_2, M_2, M_3, M_3, \ldots, M_{n-1}, M_{n-1}, M_n$ transforms
the n-pair B-arrangements into (n + 1)-pair A-arrangements, as a
result of which a total of
$(2n - 3)B_n$ (n + 1)-pair A-arrangements
are obtained.
In the (2n)th form, the inserted $M_{n+1}$ can be exchanged with any
of the men $M_2, M_3, \ldots, M_n$, as a result of which a total of
$(n - 1)B_n$ (n + 1)-pair A-arrangements
are obtained.
III. Formation from C-arrangements.
Our method transforms any one of the $C_n$ n-pair C-arrangements:
$M_1F_1X_2F_2X_3F_3 \ldots X_nF_n$
into an (n + 1)-pair A-arrangement if we switch the places of
$M_{n+1}$ and the man $M_\nu$ seated next to his wife ($\nu$ being one of the values
2, 3, 4, ..., n). In this manner we obtain from every n-pair
C-arrangement an (n + 1)-pair A-arrangement, which corresponds to a
total of
$C_n$ (n + 1)-pair A-arrangements.
Thus, the methods of formation described under I., II., and III. give
us all of the (n + 1)-pair A-arrangements, or a total of
$(n - 2)A_n + (3n - 4)B_n + C_n$
arrangements, so that
(3)   $A_{n+1} = (n - 2)A_n + (3n - 4)B_n + C_n$.
In order to obtain formulas in which only the same capital letters
occur, we infer from (1)
$A_n = B_{n+1} - B_n$ and $A_{n+1} = B_{n+2} - B_{n+1}$
and introduce these values into (3). This gives
$B_{n+2} = (n - 1)B_{n+1} + (2n - 2)B_n + C_n$.
If we then replace n by n + 1, it follows that
$B_{n+3} = nB_{n+2} + 2nB_{n+1} + C_{n+1}$.
If we subtract the next to the last equation from the last one and take
(2) into consideration, we get
$B_{n+3} = (n + 1)\left[B_{n+2} + B_{n+1}\right] + B_n$
or, if we replace n + 1 here by n,
(4)   $B_{n+2} = n(B_{n+1} + B_n) + B_{n-1}$.
This simple recurrence formula for the B's enables us to calculate
from three successive B's the B that follows immediately.
It is also possible to derive a recurrence formula in which only three
successive B's are connected, i.e., a formula having the form
(5)   $e_nB_{n+1} + f_nB_n + g_nB_{n-1} = c$,
in which the coefficients $e_n$, $f_n$, $g_n$ represent known functions of n and c
is a constant.
In order to find it we replace n in (5) with (n + 1) and obtain
$e_{n+1}B_{n+2} + f_{n+1}B_{n+1} + g_{n+1}B_n = c$.
Subtraction of this equation from (5) gives
$-e_{n+1}B_{n+2} + (e_n - f_{n+1})B_{n+1} + (f_n - g_{n+1})B_n + g_nB_{n-1} = 0$.
In order to find the equations of condition for the coefficients e, f, g,
which are still unknown, we compare the formula obtained with
equation (4) after equation (4) has been multiplied by $g_n$:
$-g_nB_{n+2} + ng_nB_{n+1} + ng_nB_n + g_nB_{n-1} = 0$.
Thus, if we can find e, f, g satisfying the three conditions
(I) $e_{n+1} = g_n$, (II) $e_n - f_{n+1} = ng_n$, (III) $f_n - g_{n+1} = ng_n$,
we have the sought-for recurrence formula.
From (III) it follows that
$f_n = g_{n+1} + ng_n$ or $f_{n+1} = g_{n+2} + (n + 1)g_{n+1}$,
and from (II) and (I)
$f_{n+1} = e_n - ng_n = g_{n-1} - ng_n$.
By equating the two values obtained for $f_{n+1}$ we get
$(n + 1)g_{n+1} + ng_n = g_{n-1} - g_{n+2}$.
It is easily seen that
$g_n = n\iota^n$ $(\iota = -1)$
is a solution of this equation. This, according to (I), yields
$e_n = g_{n-1} = -(n - 1)\iota^n$
and, according to (III),
$f_n = g_{n+1} + ng_n = \iota^n(n^2 - n - 1)$.
Equation (5) is thereby transformed into
$(n - 1)B_{n+1} - (n^2 - n - 1)B_n - nB_{n-1} = -c\iota^n$.
In order to determine the constant c, we set n equal to 4, we observe
that $B_3 = 0$, $B_4 = 1$, and $B_5 = 3$, and we obtain c = 2.
The sought-for recurrence formula consequently reads
(6)   $(n - 1)B_{n+1} = (n^2 - n - 1)B_n + nB_{n-1} - 2\iota^n$.
In order to obtain a recurrence formula for the A's as well, we
express $A_{n-1}$, $A_n$, and $A_{n+1}$, in accordance with (1) and (6), by $B_n$ and
$B_{n+1}$. Thus we obtain
$$A_{n-1} = \frac{n^2-1}{n}B_n - \frac{n-1}{n}B_{n+1} - \frac{2\iota^n}{n},$$
$$A_n = B_{n+1} - B_n,$$
$$A_{n+1} = \frac{n^2-1}{n}B_{n+1} + \frac{n+1}{n}B_n + \frac{2\iota^n}{n},$$
and from this by elimination of $B_n$ and $B_{n+1}$ we obtain
(7)   $(n - 1)A_{n+1} = (n^2 - 1)A_n + (n + 1)A_{n-1} + 4\iota^n$.
This is Laisant's recurrence formula. It makes possible the calculation
of each A from the two immediately preceding A's.
Thus, from $A_3 = 1$, $A_4 = 2$, and (7), it follows that $A_5 = 13$, which
is still easy to check directly. Moreover, the whole series $A_6 = 80$,
$A_7 = 579$, $A_8 = 4738$, $A_9 = 43387$, $A_{10} = 439792$, $A_{11} = 4890741$,
$A_{12} = 59216642$, etc. can then be derived from (7). The difficult
point in the calculation of A can therefore be considered as eliminated.
The problem is solved.
The number of possible seating arrangements of n married couples is
$2A_n \cdot n!$, in which $A_n$ can be calculated from Laisant's recurrence formula.
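Laisant's recurrence (7) can be run directly and compared with a brute-force count for small n. The sketch below is an illustration only; it assumes the women are fixed, numbers the men's seats so that seat i lies between wives i and i + 1 (cyclically), and uses function names chosen here.

```python
from itertools import permutations
from math import factorial


def laisant(n_max):
    """A_3 .. A_n_max from recurrence (7), starting from A_3 = 1 and A_4 = 2."""
    A = {3: 1, 4: 2}
    for n in range(4, n_max):
        numerator = (n * n - 1) * A[n] + (n + 1) * A[n - 1] + 4 * (-1) ** n
        A[n + 1] = numerator // (n - 1)
    return A


def seatings_of_men(n):
    """Brute force: seat i lies between wives i and i + 1 (indices mod n)."""
    return sum(
        all(man != i and man != (i + 1) % n for i, man in enumerate(seating))
        for seating in permutations(range(n))
    )


def total_seatings(n, A):
    """2 * n! * A_n, the number of seatings of n couples stated in the text."""
    return 2 * factorial(n) * A[n]


A = laisant(12)
for n in range(3, 8):
    print(n, A[n], seatings_of_men(n))
print(A[12], total_seatings(5, A))
```

The brute-force count grows like n!, so it is practical only for small n, while the recurrence reaches $A_{12}$ in eleven steps.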
Omar Khayyam's Binomial Expansion
To obtain the nth power of the binomial a + b in powers of a and b when
n is any positive whole number.
Solution. In order to determine the binomial expansion we write
(a + b)n = (a + b)(a + b)...(a + b),
where the right side consists of a product of n identical parentheses
(a + b). As is known, the multiplication of parentheses consists of
choosing one term from each parenthesis and obtaining the product
of the terms chosen, and continuing this process until all the possible
choices are exhausted. Finally, the resulting products are added
together.
A product of this sort has the following appearance:
$P = a^{\alpha_1}b^{\beta_1}a^{\alpha_2}b^{\beta_2}a^{\alpha_3}b^{\beta_3}\ldots$,
in which the factor a is taken from the first $\alpha_1$ parentheses, the factor
b from the next $\beta_1$ parentheses, the factor a from the next $\alpha_2$
parentheses, etc. In this case $\alpha_1 + \beta_1 + \alpha_2 + \beta_2 + \cdots$ equals the number
of parentheses present, i.e., n.
If we set $\alpha_1 + \alpha_2 + \alpha_3 + \cdots$ equal to $\alpha$ and $\beta_1 + \beta_2 + \cdots$ equal
to $\beta$, the expression can be written in the simpler form
$P = a^\alpha b^\beta$ with $\alpha + \beta = n$.
Now the product P can generally be obtained in many other ways
than the one described, for example, by taking a from the first $\alpha$
parentheses and b from the last $\beta$ parentheses, or by taking b from the
first $\beta$ parentheses, and a from the last $\alpha$ parentheses, etc. If we
assume that the product P occurs exactly C times in the method
described, C being understood to represent an initially unknown whole
number, then
$G = Ca^\alpha b^\beta$
represents one term of the binomial expansion. The other terms have
the same form, except that the exponents $\alpha$ and $\beta$ and the coefficients
C are different. However, $\alpha + \beta$ always equals n.
The core of the problem is to determine the so-called binomial
coefficient C, i.e., to answer the question: How many times does the
product $P = a^\alpha b^\beta$ appear in the binomial expansion?
To answer this question we first write the factors a and b of the
product one after another in the order in which we initially selected
them from the parentheses:
$$\underbrace{aa \ldots a}_{\alpha_1 \text{ factors}} \cdot \underbrace{bb \ldots b}_{\beta_1 \text{ factors}} \cdot \underbrace{aa \ldots a}_{\alpha_2 \text{ factors}} \cdots.$$
This is a permutation of n elements in which $\alpha$ identical elements a
and $\beta$ identical elements b occur. There are as many possible
permutations of these elements as there are terms P resulting from the
multiplication of the n parentheses (a + b).
But the number of permutations of n elements among which there
appear $\alpha$ identical elements of one kind and $\beta$ identical elements of
the other is $n!/(\alpha!\,\beta!)$. This is how often the product $a^\alpha b^\beta$ appears in
the binomial expansion. Consequently,
$$C = \frac{n!}{\alpha!\,\beta!}.$$
An apparent exception to this formula is presented by the terms $a^n$
and $b^n$ of the expansion, each of which occurs only once. To
eliminate this exception let us agree to let the symbol 0! represent unity;
we are then able to write the coefficients of $a^n$ and $b^n$ as $n!/(n!\,0!)$ and
$n!/(0!\,n!)$, respectively, in agreement with the formula.
The individual possibilities of forming the product P can also be
represented geometrically. We can, for example, represent the first
possibility considered above in the following way: We mark off a
horizontal distance of $\alpha_1$ successive segments a, and from the end of
this distance extend a vertical distance of $\beta_1$ successive segments b,
from the end of this vertical line a third horizontal distance of $\alpha_2$
successive segments a, etc. In a similar manner we represent the
other possibilities of forming the product P; however, we begin all C
zigzag traces, which represent the C possibilities, from the same point.
Thus, for example, if we are concerned with finding the number $\nu$
of all the products of the form $a^{11} b^7$ in the binomial expansion of
$(a + b)^{18}$, we draw a rectangular network of 11·7 rectangular
compartments possessing a horizontal side a and a vertical side b and
lying in seven 11-compartment rows one below the other. The
possibility $a^4 b^3 a^7 b^4$ (a from the first four parentheses, b from the
following three, a from the next seven parentheses and b from the last
four) is then represented by the unbroken heavy line, and the
possibility $b^2 a^6 b^3 a^2 b^2 a^3$ by the line of dashes. The sought-for number
$\nu$ is therefore equal to the number of all the possible direct paths
leading from the corner E of the network to the opposite corner F.
Fig. 1. Rectangular network with opposite corners E and F.
The formula previously found for C thus also provides us with the
solution to the interesting problem:
A city has m streets that run from east to west and n that run from north to
south; how many ways (without detours) are there of getting from the
northwest corner of the city to the southeast corner?
Since there are (n - 1) west-east partial paths a and (m - 1)
north-south partial paths b, the number of all the possible paths is
$$\frac{(m + n - 2)!}{(m - 1)!\,(n - 1)!}.$$
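The street formula can be checked against a direct count. The sketch below is illustrative (the function names are not from the text): it counts the direct paths intersection by intersection and compares the result with the factorial expression.

```python
from math import factorial

def paths_by_formula(m, n):
    """(m + n - 2)! / ((m - 1)! (n - 1)!) for m east-west and n north-south streets."""
    return factorial(m + n - 2) // (factorial(m - 1) * factorial(n - 1))

def paths_by_counting(m, n):
    """Count direct paths by adding the paths arriving from the north and from the west."""
    # ways[i][j] = number of direct paths from the northwest corner to the
    # crossing of the i-th east-west street and the j-th north-south street
    ways = [[1] * n for _ in range(m)]
    for i in range(1, m):
        for j in range(1, n):
            ways[i][j] = ways[i - 1][j] + ways[i][j - 1]
    return ways[m - 1][n - 1]
```

For the network of Fig. 1 (11·7 compartments, hence m = 8 and n = 12) both functions return the coefficient of $a^{11} b^7$ in $(a + b)^{18}$.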
Now back to the binomial theorem!
Determination of the binomial coefficient C gives us immediately
the sought-for binomial expansion:
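$$(a + b)^n = \sum C\,a^\alpha b^\beta \quad\text{with}\quad C = \frac{n!}{\alpha!\,\beta!}.$$
Here $\alpha$ and $\beta$ pass through all the possible integral non-negative
values that satisfy the condition $\alpha + \beta = n$.
The expansion of $(a + b)^5$, for example, gives
$$(a + b)^5 = a^5 + \frac{5!}{4!\,1!} a^4 b + \frac{5!}{3!\,2!} a^3 b^2 + \frac{5!}{2!\,3!} a^2 b^3 + \frac{5!}{1!\,4!} a b^4 + b^5$$
or
$$(a + b)^5 = a^5 + 5a^4 b + 10a^3 b^2 + 10a^2 b^3 + 5ab^4 + b^5.$$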
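The counting argument of the solution can be imitated literally for small n: one term is chosen from each of the n parentheses, and the choices that yield the same product are tallied. The sketch below (function names are illustrative) compares this tally with $n!/(\alpha!\,\beta!)$.

```python
from collections import Counter
from itertools import product
from math import factorial

def coefficients_by_enumeration(n):
    """Tally how often each product a^alpha b^beta arises from the n parentheses."""
    counts = Counter()
    for choice in product("ab", repeat=n):  # one term from each parenthesis
        counts[choice.count("a")] += 1      # key: the exponent alpha of a
    return [counts[alpha] for alpha in range(n, -1, -1)]

def coefficients_by_formula(n):
    """The same coefficients from C = n! / (alpha! beta!) with beta = n - alpha."""
    return [factorial(n) // (factorial(alpha) * factorial(n - alpha))
            for alpha in range(n, -1, -1)]
```

Both lists are ordered from $a^n$ down to $b^n$; for n = 5 they give the coefficients 1, 5, 10, 10, 5, 1 of the expansion above. The enumeration visits $2^n$ choices, so it is only a check for small n.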
Instead of $n!/(\alpha!\,\beta!)$ one usually writes
$$\frac{n(n - 1)(n - 2) \cdots (n - \alpha + 1)}{1 \cdot 2 \cdot 3 \cdots \alpha}$$
and also abbreviates this coefficient $n_\alpha$ (read as n sub $\alpha$). The
expansion then takes on a somewhat simpler appearance:
$$(a + b)^n = a^n + n_1 a^{n-1} b + n_2 a^{n-2} b^2 + \cdots + b^n.$$
The coefficient $n_\nu$ is known as the binomial coefficient to the base n
with index $\nu$.
The binomial theorem was probably discovered by the Persian
astronomer Omar Khayyam, who lived during the eleventh century.
At least he prided himself on having discovered the expansion "for
all (integral positive) exponents n, which no one had been able to
accomplish before him."
Note. The derivation given above is easily extended to give the
nth-power expansion of a polynomial $a + b + c + \cdots$. The
polynomial theorem for a polynomial consisting of three terms, for
example, is
$$(a + b + c)^n = \sum \frac{n!}{\alpha!\,\beta!\,\gamma!}\, a^\alpha b^\beta c^\gamma,$$
where the sum $\sum$ includes all possible terms for which the integral
non-negative exponents $\alpha$, $\beta$, $\gamma$ satisfy the condition $\alpha + \beta + \gamma = n$.
10. Cauchy's Mean Theorem
The geometric mean of several positive numbers is smaller than the arithmetic
mean of these numbers.
Augustin Louis Cauchy (1789-1857) was one of the greatest
French mathematicians. The theorem concerning the arithmetic and
geometric means occurs in his Cours d'Analyse (pp. 458-9), which
appeared in 1821.
The proof of the theorem that will be presented here is based upon
the solution of the fundamental problem: When does the product of n
positive numbers of constant sum attain its maximum value?
We will call the n numbers a, b, c, ..., their constant sum K, and
their product P. Experimentation with various numbers suggests
that the product P reaches its maximal value when the numbers
a, b, c, ... all possess the same value M = K/n.
To determine the accuracy of this hypothesis, we use the
Auxiliary theorem: Of two pairs of numbers of equal sum the pair
possessing the greater product is the one whose numbers exhibit the smaller
difference.
[If X and Y represent one pair and x and y the other, and X + Y =
x + y, the auxiliary theorem follows from the equations
$$4XY = (X + Y)^2 - (X - Y)^2, \qquad 4xy = (x + y)^2 - (x - y)^2,$$
in which the minuends of the right sides are equal and the greater
right side is the one in which the subtrahend is smaller.]
If the n numbers a, b, c, ... are not all equal, then at least one, a, for
example, must be greater than M, and at least one, let us say b, must
be smaller than M. Let us form a new system of n numbers a', b',
c', ... in such a manner that (1) a' = M, (2) the pairs a, b and a', b'
have the same sum, (3) the other numbers c', d', e', ... correspond to
c, d, e, .... The new numbers then have the same sum K as the old
ones, but a greater product P' (= a'b'c'...), since in accordance with
the auxiliary theorem a'b' > ab.
If the numbers a', b', c', ... are not all equal to M, then at least one,
let us say b', is greater (smaller) and at least one, say c', is smaller
(greater) than M. Let us form a new system of n numbers a", b", c",
d", ... in such a manner that (1) a" = a' = M, (2) b" = M, (3) the
pairs b', c' and b", c" possess the same sum, (4) d", e", ... correspond to
d', e', .... The numbers a", b", c", ... then have the same sum K as
the numbers a', b', c', ..., but possess a greater product P" =
a"b"c"..., since in accordance with the auxiliary theorem b"c" > b'c'.
We continue in this fashion and obtain a series of increasing
products P, P', P", ... each successive member of which is greater
than the immediately preceding one and contains at least one more
factor M. The last product obtained in this manner is the
greatest of all and consists of n equal factors M. Consequently,
$$P < M^n,$$
which gives us the theorem:
The product of n positive numbers whose sum is constant attains its maximal
value when the numbers are equally great.
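The smoothing process used in the proof can be followed numerically. The sketch below is an illustration, not part of the original argument: at each step one number above M is set equal to M, and one number below M is altered so that the sum of the pair is preserved. Exact fractions are used so that the comparison with M involves no rounding; the function name is an assumption.

```python
from fractions import Fraction
from math import prod

def smoothing_products(numbers):
    """Return the successive products P, P', P'', ... of the smoothing process."""
    nums = [Fraction(x) for x in numbers]
    mean = sum(nums) / len(nums)          # M = K / n
    products = [prod(nums)]
    while any(x != mean for x in nums):
        i = next(k for k, x in enumerate(nums) if x > mean)  # a number above M
        j = next(k for k, x in enumerate(nums) if x < mean)  # a number below M
        nums[j] = nums[i] + nums[j] - mean  # keeps the sum of the pair unchanged
        nums[i] = mean                      # one more factor equal to M
        products.append(prod(nums))
    return products
```

For the numbers 1, 2, 3, 10 (sum 16, M = 4) the successive products are 60, 168, 240, 256, the last being $M^4$.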
If we extract the nth root of the last inequality and express P and M
in terms of the magnitudes a, b, c, ..., we obtain Cauchy's formula:
$$\sqrt[n]{abc \cdots} < \frac{a + b + c + \cdots}{n}.$$
This is expressed verbally as follows:
The theorem of the arithmetic and geometric mean: The
geometric mean of several numbers is always smaller than the arithmetic mean
of the numbers, except when the numbers are equal, in which case the
two means are also equal.
Note 1. Cauchy's theorem leads directly to the converse of the
above extreme theorem:
The sum of n positive numbers whose product is constant attains its
minimal value when the numbers are equal.
Proof. Let us call the n numbers x, y, z, ..., their given product k,
their variable sum s, and let us designate by m the nth root of k.
According to Cauchy,
$$\frac{x + y + z + \cdots}{n} \ge \sqrt[n]{xyz \cdots};$$
consequently
$$s \ge nm,$$
where the equality sign applies only in the event that x = y = z = ....
Q.E.D.
The two preceding extreme theorems form the basis for a simple
solution of many problems concerning maximum and minimum
(cf. Nos. 54, 92, 96, 98).
Note 2. Cauchy's theorem also furnishes us directly with the
important exponential inequality for the exponential function $x^\varepsilon$.
If a is any positive number not equal to 1, n a whole number > 0,
m a whole number > n, then the geometric mean of the m numbers of
which n possess the value a and the (m - n) others possess the value 1
is smaller than the arithmetic mean (na + m - n)/m of these m
numbers, or
$$\sqrt[m]{a^n} < 1 + \frac{n}{m}(a - 1),$$
or, if we write $\varepsilon$ in place of n/m,
$$(1)\qquad a^\varepsilon < 1 + \varepsilon(a - 1).$$
In this inequality $\varepsilon$ is any rational, positive proper fraction. We will
now show that this inequality is also true for any irrational proper
fraction i.
First, it is clear that $a^J > 1 + J(a - 1)$ cannot be true for any
irrational proper fraction J. If that were the case it would be possible
to find a rational proper fraction R < J so close to J that $a^R$ would
differ from $a^J$, and $1 + R(a - 1)$ from $1 + J(a - 1)$, by less than
a fixed fraction (at most one half) of the difference $a^J - [1 + J(a - 1)]$.
In that event $a^R$ would still be $> 1 + R(a - 1)$, which is, however, impossible
according to (1).
Now let z be so small that i + z and i - z are both positive
proper fractions. Then we have
$$a^i < a^i \cdot \frac{a^z + a^{-z}}{2}$$
(since the arithmetic mean of the numbers $a^z$ and $a^{-z}$ is greater than 1
according to Cauchy) or
$$a^i < \frac{a^{i+z} + a^{i-z}}{2}.$$
According to the above relation, however,
$$a^{i+z} \le 1 + (i + z)(a - 1), \qquad a^{i-z} \le 1 + (i - z)(a - 1),$$
therefore
$$\frac{a^{i+z} + a^{i-z}}{2} \le 1 + i(a - 1);$$
thus, it is certain that
$$a^i < 1 + i(a - 1).$$
Inequality (1) is therefore true for any proper fraction $\varepsilon$.
If we replace $\varepsilon$ in (1) by $1/\mu$, $1 + \varepsilon(a - 1)$ by b, i.e., a by
$1 + \mu(b - 1)$, (1) is transformed into
$$(2)\qquad b^\mu > 1 + \mu(b - 1),$$
where $\mu$ is any positive improper fraction, b any positive number.
Conclusion. The exponential inequality. If x is any positive
magnitude and $\varepsilon$ any positive exponent, the exponential inequality is:
$$x^\varepsilon \lessgtr 1 + \varepsilon(x - 1),$$
in which proper fractional exponents require the use of the upper
sign and improper fractional exponents require the use of the lower
sign.
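A few numerical trials make the two cases of the exponential inequality tangible. The sketch below is only a spot check in floating-point arithmetic (the function name is illustrative); it returns the difference between the two sides, which should be negative for proper fractional exponents and positive for exponents greater than 1, provided x differs from 1.

```python
def exponential_gap(x, eps):
    """Return x**eps - (1 + eps*(x - 1)) for positive x and positive exponent eps."""
    return x ** eps - (1 + eps * (x - 1))

# Proper fractional exponents: upper sign, the gap is negative.
proper = [exponential_gap(x, eps) for x in (0.2, 0.5, 2.0, 7.5) for eps in (0.1, 0.5, 0.9)]

# Exponents greater than 1: lower sign, the gap is positive.
improper = [exponential_gap(x, eps) for x in (0.2, 0.5, 2.0, 7.5) for eps in (1.5, 2.0, 3.7)]
```

For x = 1 both sides equal 1 and the gap vanishes, which is why that value is excluded from the trials.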
11. Bernoulli's Power Sum Problem
Determine the sum
$$S = 1^p + 2^p + 3^p + \cdots + n^p$$
of the pth powers of the first n natural numbers for integral positive exponents p.
The problem, posed in this general form, was first solved in the
Ars Conjectandi (Probability Computation), which appeared in 1713. It
was the work of the Swiss mathematician Jacob Bernoulli (1654-1705).
The following elegant solution is based upon the binomial theorem.
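By resorting to the device of considering the magnitudes $\mathfrak{B}^1$, $\mathfrak{B}^2$,
$\mathfrak{B}^3$, ..., $\mathfrak{B}^\nu$ resulting from the binomial expansion of $(x + \mathfrak{B})^\nu$ as
unknowns subject to $\nu$ certain conditions rather than as powers of $\mathfrak{B}$,
we obtain an amazingly short derivation of S.
According to the binomial theorem, if P is understood to represent
the number p + 1,
$$(v + \mathfrak{B})^P = v^P + P_1 v^p \mathfrak{B}^1 + P_2 v^{p-1} \mathfrak{B}^2 + \cdots$$
and
$$(v + \mathfrak{B} - 1)^P = v^P + P_1 v^p (\mathfrak{B} - 1)^1 + P_2 v^{p-1} (\mathfrak{B} - 1)^2 + \cdots.$$
Subtraction of these two equations gives us
$$(\mathrm{I})\qquad (v + \mathfrak{B})^P - (v - 1 + \mathfrak{B})^P = P v^p + P_2 v^{p-1}\left[\mathfrak{B}^2 - (\mathfrak{B} - 1)^2\right] + P_3 v^{p-2}\left[\mathfrak{B}^3 - (\mathfrak{B} - 1)^3\right] + \cdots.$$
We now define the unknowns $\mathfrak{B}^1$, $\mathfrak{B}^2$, $\mathfrak{B}^3$, ... by the equations
$$(1)\ (\mathfrak{B} - 1)^2 = \mathfrak{B}^2, \qquad (2)\ (\mathfrak{B} - 1)^3 = \mathfrak{B}^3, \qquad (3)\ (\mathfrak{B} - 1)^4 = \mathfrak{B}^4,$$
etc. This results in the simplification of (I) to
$$(\mathrm{Ia})\qquad P v^p = (v + \mathfrak{B})^P - (v - 1 + \mathfrak{B})^P.$$
This equation is formed for v = 1, 2, 3, ..., n, and we thereby obtain
$$P \cdot 1^p = (1 + \mathfrak{B})^P - \mathfrak{B}^P,$$
$$P \cdot 2^p = (2 + \mathfrak{B})^P - (1 + \mathfrak{B})^P,$$
$$\cdots$$
$$P \cdot n^p = (n + \mathfrak{B})^P - (n - 1 + \mathfrak{B})^P.$$
Addition of these n equations gives us
$$PS = (n + \mathfrak{B})^P - \mathfrak{B}^P$$
or
$$(\mathrm{II})\qquad 1^p + 2^p + \cdots + n^p = \frac{(n + \mathfrak{B})^P - \mathfrak{B}^P}{P} \quad\text{with}\quad P = p + 1.$$
This formula, in which the magnitudes $\mathfrak{B}^1$, $\mathfrak{B}^2$, $\mathfrak{B}^3$, ... on the
right side of the equation, obtained from expansion of the binomial
$(n + \mathfrak{B})^P$, are defined by equations (1), (2), (3), ..., gives us the
sought-for power sum.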
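In order to apply it to the cases p = 1, 2, 3, 4, we first determine
the unknowns $\mathfrak{B}^1$, $\mathfrak{B}^2$, $\mathfrak{B}^3$, and $\mathfrak{B}^4$ in accordance with equations
(1), (2), ....
From (1) it follows that $-2\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^1 = \frac{1}{2}$. Then,
from (2), $-3\mathfrak{B}^2 + 3\mathfrak{B}^1 - 1 = 0$, i.e., $\mathfrak{B}^2 = \frac{1}{6}$. And from (3),
$-4\mathfrak{B}^3 + 6\mathfrak{B}^2 - 4\mathfrak{B}^1 + 1 = 0$, i.e., $\mathfrak{B}^3 = 0$. Finally, from $(\mathfrak{B} - 1)^5 = \mathfrak{B}^5$
we obtain $\mathfrak{B}^4 = -\frac{1}{30}$. The numbers $\mathfrak{B}^1 = \frac{1}{2}$, $\mathfrak{B}^2 = \frac{1}{6}$,
$\mathfrak{B}^3 = 0$, $\mathfrak{B}^4 = -\frac{1}{30}$, etc., are known as Bernoulli numbers.
Then from (II) we obtain
$$1 + 2 + 3 + \cdots + n = \frac{(n + \mathfrak{B})^2 - \mathfrak{B}^2}{2} = \frac{n^2 + 2n\mathfrak{B}^1}{2} = n\,\frac{n + 1}{2},$$
$$1^2 + 2^2 + 3^2 + \cdots + n^2 = \frac{(n + \mathfrak{B})^3 - \mathfrak{B}^3}{3} = \frac{n^3 + 3n^2\mathfrak{B}^1 + 3n\mathfrak{B}^2}{3} = \tfrac{1}{6}\,n(n + 1)(2n + 1),$$
$$1^3 + 2^3 + 3^3 + \cdots + n^3 = \frac{(n + \mathfrak{B})^4 - \mathfrak{B}^4}{4} = \frac{n^4 + 4n^3\mathfrak{B}^1 + 6n^2\mathfrak{B}^2 + 4n\mathfrak{B}^3}{4} = \left(n\,\frac{n + 1}{2}\right)^2,$$
$$1^4 + 2^4 + 3^4 + \cdots + n^4 = \frac{(n + \mathfrak{B})^5 - \mathfrak{B}^5}{5} = \frac{n^5 + 5n^4\mathfrak{B}^1 + 10n^3\mathfrak{B}^2 + 10n^2\mathfrak{B}^3 + 5n\mathfrak{B}^4}{5} = \frac{pst}{30},$$
with p = n(n + 1), s = 2n + 1, t = 3p - 1.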
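The symbolic procedure lends itself to direct computation. In the sketch below (names are illustrative) the list entry `B[k]` stands for the unknown $\mathfrak{B}^k$, with `B[0] = 1`; each defining equation $(\mathfrak{B} - 1)^k = \mathfrak{B}^k$ is expanded by the binomial theorem and solved for $\mathfrak{B}^{k-1}$, and formula (II) is then evaluated in exact fractions.

```python
from fractions import Fraction
from math import comb

def bernoulli_numbers(count):
    """Return [B^0, B^1, ..., B^count] from the equations (B - 1)^k = B^k."""
    B = [Fraction(1)]
    for k in range(2, count + 2):
        # Expanding (B - 1)^k = B^k cancels B^k; the term with B^(k-1) is -k*B^(k-1),
        # so k * B^(k-1) equals the sum of the remaining terms.
        rest = sum(comb(k, j) * (-1) ** (k - j) * B[j] for j in range(k - 1))
        B.append(Fraction(rest) / k)
    return B

def power_sum(p, n):
    """1^p + 2^p + ... + n^p from formula (II): ((n + B)^P - B^P) / P with P = p + 1."""
    P = p + 1
    B = bernoulli_numbers(p)
    total = sum(comb(P, j) * B[j] * n ** (P - j) for j in range(P))
    return total / P
```

`bernoulli_numbers(4)` returns the values 1, 1/2, 1/6, 0, -1/30 found above, and `power_sum(p, n)` can be compared with the sum formed term by term.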
If n in (II) increases without limit, S also increases without limit,
but the quotient $S/n^P$ possesses a finite value. In fact, in accordance
with the binomial theorem, (II) is written
$$PS = n^P + P_1 \mathfrak{B}^1 n^{P-1} + P_2 \mathfrak{B}^2 n^{P-2} + \cdots,$$
so that
$$\frac{S}{n^P} = \frac{1}{P} + \frac{P_1 \mathfrak{B}^1}{P} \cdot \frac{1}{n} + \frac{P_2 \mathfrak{B}^2}{P} \cdot \frac{1}{n^2} + \cdots.$$
Now, if n increases infinitely all the fractions on the right-hand side
with the exception of the first become infinitely small, and we obtain
the limit equation of the power sum:
$$(\mathrm{III})\qquad \lim_{n \to \infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1}.$$
This important limit equation can also be derived from the
exponential inequality (No. 10)
$$x^P > 1 + P(x - 1).$$
This derivation has the advantage over the one just given that it is
true for any positive exponent p, not only for integral positive exponents!
If we first replace x in the exponential inequality with the improper
fraction V/v, then by the proper fraction v/V, after elimination of the
denominators we obtain
$$V^P > v^P + P v^p (V - v) \quad\text{and}\quad v^P > V^P - P V^p (V - v)$$
or
$$P v^p < \frac{V^P - v^P}{V - v} < P V^p.$$
Into this new inequality we introduce the series 1|0, 2|1, 3|2, ...,
n|n - 1 for the pair of values V|v and we obtain
$$P \cdot 0^p < 1^P - 0^P < P \cdot 1^p,$$
$$P \cdot 1^p < 2^P - 1^P < P \cdot 2^p,$$
$$\cdots$$
$$P \cdot (n - 1)^p < n^P - (n - 1)^P < P \cdot n^p.$$
Addition of these n inequalities results in
$$P(S - n^p) < n^P < PS$$
or
$$\frac{1}{P} < \frac{S}{n^P} < \frac{1}{P} + \frac{1}{n}.$$
Since both boundaries between which the quotient $S/n^P$ is situated
assume the value 1/P when n = ∞,
$$(\mathrm{III})\qquad \lim_{n \to \infty} \frac{1^p + 2^p + \cdots + n^p}{n^{p+1}} = \frac{1}{p + 1},$$
where p represents any positive magnitude.
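The two boundaries can be observed numerically, including for exponents that are not whole numbers. The sketch below is a floating-point illustration only; the function name is an assumption.

```python
def power_sum_quotient(p, n):
    """Return S / n**(p + 1), where S = 1**p + 2**p + ... + n**p."""
    S = sum(k ** p for k in range(1, n + 1))
    return S / n ** (p + 1)

def within_bounds(p, n):
    """Check 1/P < S/n^P < 1/P + 1/n with P = p + 1."""
    P = p + 1
    q = power_sum_quotient(p, n)
    return 1 / P < q < 1 / P + 1 / n
```

As n grows, the upper boundary closes in on 1/(p + 1), so the quotient is forced toward the limit (III).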
If the mean value of the function $x^p$ is introduced, the limit equation
of the power sum can be obtained in still another form.
The mean value of a function over an interval is commonly understood to
mean the limiting value toward which the mean value of n values of
the function uniformly distributed over the interval tends if n increases
without limit. The mean value M of the function f(x) over the
interval 0 to x, if $\delta$ represents the nth part of x, is thus the limiting value of
the quotient
$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$
for n = ∞. We write this mean value as $\mathfrak{M}_0^x f(x)$.
Thus, the mean value of the function $x^p$ over the interval 0 through
x is the limiting value of
$$\mu = \frac{\delta^p + (2\delta)^p + \cdots + (n\delta)^p}{n} = \delta^p\,\frac{1^p + 2^p + \cdots + n^p}{n},$$
i.e., since $\delta = x/n$, the limiting value of
$$\mu = x^p\,\frac{1^p + 2^p + \cdots + n^p}{n^{p+1}}.$$
Since the fractional factor of the right side according to (III) has the
limiting value 1/(p + 1), it follows that the sought-for mean value of the
function $x^p$ is
$$(\mathrm{IIIa})\qquad \mathfrak{M}_0^x\, x^p = \frac{x^p}{p + 1};$$
this formula, however, is basically no different from (III).
Formula (III) or (IIIa) has found many applications in geometry
and physics.
12. The Euler Number
Find the limiting values of the functions
$$\varphi(x) = \left(1 + \frac{1}{x}\right)^x \quad\text{and}\quad \Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$$
for an infinitely increasing x.
The simplest solution of this very interesting problem is based
upon the exponential inequality
$$x^\varepsilon < 1 + \varepsilon(x - 1)$$
(cf. No. 10), in which x is any positive magnitude and $\varepsilon$ is any proper
fraction between 0 and 1.
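Let us introduce two arbitrary positive numbers a and b, the first of
which is larger than the second, and introduce into
the exponential inequality first
$$x = 1 + \frac{1}{b}, \qquad \varepsilon = \frac{b}{a},$$
and then
$$x = 1 - \frac{1}{b + 1}, \qquad \varepsilon = \frac{b + 1}{a + 1}.$$
In the first case we obtain
$$\left(1 + \frac{1}{b}\right)^{b/a} < 1 + \frac{1}{a}$$
or
$$(1)\qquad \left(1 + \frac{1}{b}\right)^b < \left(1 + \frac{1}{a}\right)^a,$$
in the second
$$\left(1 - \frac{1}{b + 1}\right)^{(b+1)/(a+1)} < 1 - \frac{1}{a + 1}$$
or
$$\left(\frac{b}{b + 1}\right)^{b+1} < \left(\frac{a}{a + 1}\right)^{a+1}$$
or, finally,
$$(2)\qquad \left(1 + \frac{1}{b}\right)^{b+1} > \left(1 + \frac{1}{a}\right)^{a+1}.$$
The two inequalities obtained, (1) and (2), contain the remarkable
theorem:
With an increasingly positive argument x the function $\varphi(x) = \left(1 + \frac{1}{x}\right)^x$
increases while the function $\Phi(x) = \left(1 + \frac{1}{x}\right)^{x+1}$ decreases.
Thus, for X > x
$$\varphi(X) > \varphi(x), \quad\text{whereas}\quad \Phi(X) < \Phi(x).$$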
Since, on the other hand, for the same values of the argument the
function $\Phi$ exceeds the function $\varphi$
$$\left[\Phi(x) = \left(1 + \frac{1}{x}\right) \cdot \varphi(x)\right],$$
we obtain the inequalities
$$\varphi(x) < \varphi(X) < \Phi(X) \quad\text{and}\quad \varphi(X) < \Phi(X) < \Phi(x),$$
i.e., every value of the function $\Phi$ is greater than every value of the
function $\varphi$. (Only positive values of the argument will be considered.)
Let us imagine two movable points p and P on the positive number
axis which are situated at distances $\varphi(t)$ and $\Phi(t)$ from the zero point
at time t and begin their movements in the instant t = 1. Point p,
beginning from $\varphi(1) = 2$, then moves continuously toward the right,
while point P, which begins at $\Phi(1) = 4$, moves continuously toward
the left. However, since $\Phi(t)$ is always greater than $\varphi(t)$, i.e., P is
always to the right of p, the points can never meet. Nevertheless, the
distance between them,
$$d = \Phi(t) - \varphi(t) = \varphi(t)/t,$$
diminishes without limit with increasing time, since $\varphi(t) < 4$ and thus d < 4/t,
so that they finally are separated by an infinitely small distance.
The only way to explain this situation is to assume that on the
number axis (between the numbers 2 and 4) there exists a fixed
point that the moving points p and P approach infinitely closely from
the left and from the right, respectively, without ever touching. The
distance of this fixed point from the zero point is the so-called Euler
number e. The proposal to designate this number, which also forms
the base of the natural logarithmic system (No. 14), by the letter e
stems from Euler (Commentarii Academiae Petropolitanae ad annum 1739,
vol. IX).
The important inequality
$$(\mathrm{I})\qquad \left(1 + \frac{1}{x}\right)^x < e < \left(1 + \frac{1}{x}\right)^{x+1}$$
is true for Euler's number (x represents any positive number).
If we choose x = 1,000,000, this inequality gives us the number e
exactly to five decimal places. However, the use of the series for e
(No. 13) is a better method of computation.
Then we obtain
e = 2.718281828459045....
The sought-for limiting values, however, are
$$\lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^x = e \quad\text{and}\quad \lim_{x \to \infty} \left(1 + \frac{1}{x}\right)^{x+1} = e,$$
the first of which is an upper limit, while the second is a lower limit.
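Inequality (I) can be tried out for x = 1,000,000 as suggested in the text. The sketch below is a floating-point illustration (the function name is an assumption); the rounding error of the computation is far smaller than the distance between the two boundaries at this value of x.

```python
import math

def euler_bounds(x):
    """Return the pair (phi(x), Phi(x)) = ((1 + 1/x)**x, (1 + 1/x)**(x + 1))."""
    lower = (1 + 1 / x) ** x
    upper = (1 + 1 / x) ** (x + 1)
    return lower, upper

lower, upper = euler_bounds(1_000_000)
gap = upper - lower  # equals phi(x)/x, hence less than 4/x
```

Both boundaries begin with 2.71828, in agreement with the remark that e is obtained to five decimal places.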
Note. From the inequality (I) for the number e the inequality for
the exponential function $e^x$ follows directly.
1. In the inequality
$$\left(1 + \frac{1}{x}\right)^x < e$$
we replace x by 1/P, where P is any positive number; we assign to
e the power P and obtain
$$(1)\qquad e^P > 1 + P.$$
2. In the inequality
$$e < \left(1 + \frac{1}{x}\right)^{x+1}$$
we replace x + 1 by -1/n, thus $1 + \frac{1}{x}$ by $\frac{1}{1 + n}$, n being a negative
proper fraction ≠ 0; we assign to e the power n and obtain
$$(2)\qquad e^n > 1 + n.$$
3. We consider that for every negative improper fraction N
(1 + N) is negative or zero, and consequently we have
$$(3)\qquad e^N > 1 + N.$$
Combining the inequalities (1), (2), (3), we obtain the inequality
of the exponential function:
$$e^x > 1 + x,$$
which is true for every finite real value of x and only becomes an
equation when x = 0.
The inequality obtained leads directly to the so-called limit
equation of the exponential function.
Let x be any finite real magnitude and n a positive number of such
magnitude that $1 + \frac{x}{n}$ is positive. In accordance with the inequality
of the exponential function,
$$e^{x/n} > 1 + \frac{x}{n} \quad\text{and}\quad e^{-x/n} > 1 - \frac{x}{n}.$$
We assign these inequalities the power n, in the case of the second,
however, only after we have multiplied it by $1 + \frac{x}{n}$. This results in
$$e^x > \left(1 + \frac{x}{n}\right)^n \quad\text{and}\quad \left(1 + \frac{x}{n}\right)^n e^{-x} > \left(1 - \frac{x^2}{n^2}\right)^n.$$
Since the right-hand side of the second inequality, in accordance with
the exponential inequality (No. 10), is greater than $1 - \frac{x^2}{n}$, then
actually
$$\left(1 + \frac{x}{n}\right)^n e^{-x} > 1 - \frac{x^2}{n} \quad\text{or}\quad \left(1 + \frac{x}{n}\right)^n > \left(1 - \frac{x^2}{n}\right) e^x.$$
Combining the inequalities obtained, we get
$$\left(1 - \frac{x^2}{n}\right) e^x < \left(1 + \frac{x}{n}\right)^n < e^x.$$
If n is then allowed to increase infinitely, the left side of this
inequality is transformed into $e^x$ and we obtain the limit equation of the
exponential function:
$$\lim_{n \to \infty} \left(1 + \frac{x}{n}\right)^n = e^x,$$
in which x represents any finite real number and n is an infinitely
increasing magnitude.
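The combined inequality $\left(1 - \frac{x^2}{n}\right) e^x < \left(1 + \frac{x}{n}\right)^n < e^x$ can be watched closing in on $e^x$. The sketch below is a floating-point illustration with an assumed function name; the values of n are taken large enough that n exceeds the absolute value of x.

```python
import math

def limit_sandwich(x, n):
    """Return ((1 - x^2/n) e^x, (1 + x/n)^n, e^x) for a nonzero real x and n > |x|."""
    lower = (1 - x * x / n) * math.exp(x)
    middle = (1 + x / n) ** n
    upper = math.exp(x)
    return lower, middle, upper
```

For x = 3 and n = 100, for instance, the three values are in increasing order, and the spread between the outer two shrinks in proportion to 1/n.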
13. Newton's Exponential Series
Transform the exponential function $e^x$ into a progression in terms of powers
of x.
This power progression, the so-called exponential series, which may
in fact be the most important series in mathematics, was discovered
by the great English mathematician and physicist Isaac Newton
(1642-1727). The famous treatise that contains the sine series, the
cosine series, the arc sine series, the logarithmic series, and the binomial
series as well as the exponential series was written in 1665 and bears
the title De analysi per aequationes numero terminorum infinitas. Newton's
derivation of the exponential series is, however, not rigorous and
rather complicated.
The following derivation is based upon the mean values of the
functions $x^p$ (No. 11) and $e^x$.
We find the mean value of the function $e^x$ with the help of the
inequality of the exponential function
$$(1)\qquad e^u > 1 + u \qquad \text{(No. 12)}.$$
We will consider two arbitrary values v and $V = v + \varphi > v$ of the
argument of the exponential function and first set $u = \varphi$ and then
$u = -\varphi$ in (1). This gives us
$$e^\varphi > 1 + \varphi \quad\text{and}\quad e^{-\varphi} > 1 - \varphi, \quad\text{respectively.}$$
Multiplication with $e^v$ and $e^V$, respectively, results in
$$e^V > e^v + \varphi e^v \quad\text{and}\quad e^v > e^V - \varphi e^V, \quad\text{respectively;}$$
combining, we obtain:
$$(2)\qquad e^v < \frac{e^V - e^v}{V - v} < e^V.$$
The mean value M of $e^x$ over the interval 0 to x is the limiting value
of the quotient
$$\mu = \frac{e^\delta + e^{2\delta} + \cdots + e^{n\delta}}{n} \qquad \left(\delta = \frac{x}{n}\right)$$
for an unlimitedly increasing n. In order to find $\mu$, for a positive x
we set down in (2) for the pair of values v|V in succession
$$0|\delta,\ \ \delta|2\delta,\ \ 2\delta|3\delta,\ \ \ldots,\ \ (n - 1)\delta|n\delta$$
and add the resulting n inequalities. This gives
$$n\mu + 1 - e^x < \frac{e^x - 1}{\delta} < n\mu$$
or, solved for $\mu$,
$$\frac{e^x - 1}{x} < \mu < \frac{e^x - 1}{x} + \frac{e^x - 1}{n} \qquad (x > 0).$$
For a negative x we put down successively for v|V in (2)
$$\delta|0,\ \ 2\delta|\delta,\ \ 3\delta|2\delta,\ \ \ldots,\ \ n\delta|(n - 1)\delta.$$
Summation of the resulting n inequalities then leads to the same final
inequality; only in this case the extremes are reversed, so that this
time it reads
$$\frac{e^x - 1}{x} + \frac{e^x - 1}{n} < \mu < \frac{e^x - 1}{x} \qquad (x < 0).$$
If we then allow n to become infinite in the two inequalities
obtained, we get for the lim $\mu$ the value
$$(3)\qquad \mathfrak{M}_0^x\, e^x = \frac{e^x - 1}{x},$$
whether x is positive or negative.
Now for the series expansion of $e^x$!
We begin with the inequality
$$e^x > 1 + x.$$
We assume initially that x is positive and obtain the mean values of
both sides. This gives us
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!}.$$
Repeated mean formation gives rise to
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}.$$
We continue in this manner and obtain
$$(4)\qquad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}.$$
In order to obtain an upper limit for $e^x$ also we begin with the
inequality
$$e^{-x} > 1 - x,$$
multiply by $e^x$ and obtain $1 > e^x - xe^x$ or
$$e^x < 1 + xe^x.$$
In the subsequent mean formations we employ the self-evident
theorem: "The mean of the product of two (positive) functions u and
v is smaller than the product of the mean value of u and the maximum
value of v over the interval considered."
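In the first step (u = x, $v = e^x$) we obtain
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2}\,e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!}\,e^x,$$
in the second $\left(u = \frac{x^2}{2!},\ v = e^x\right)$
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!}\,e^x \quad\text{or}\quad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!}\,e^x,$$
etc., and finally
$$(5)\qquad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}\,e^x.$$
If we then consider the case in which x is negative, the situation is
somewhat simpler.
From $e^x > 1 + x$ it follows as above that
$$\frac{e^x - 1}{x} > 1 + \frac{x}{2};$$
however, since x is now negative,
$$e^x < 1 + x + \frac{x^2}{2!}.$$
The next mean formation yields
$$\frac{e^x - 1}{x} < 1 + \frac{x}{2} + \frac{x^2}{3!} \quad\text{or}\quad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!},$$
the next
$$e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!},$$
etc., and finally
$$(6)\qquad e^x > 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{2\nu - 1}}{(2\nu - 1)!}$$
and
$$(7)\qquad e^x < 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^{2\nu}}{(2\nu)!}.$$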
From inequalities (4), (5), (6), and (7) it follows that:
When x is positive $e^x$ lies between
$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!}\,e^x,$$
and when x is negative between
$$1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} \quad\text{and}\quad 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!} + \frac{x^{n+1}}{(n + 1)!}.$$
Then if we write
$$(8)\qquad e^x = 1 + x + \frac{x^2}{2!} + \cdots + \frac{x^n}{n!},$$
the error encountered for a positive value of x is less than
$$\frac{x^n}{n!}\,(e^x - 1)$$
and for a negative value of x less than
$$\frac{|x|^{n+1}}{(n + 1)!}.$$
But for a finite value of x and for an infinitely increasing n the
fraction $x^n/n!$ approaches zero. [In accordance with No. 10 each of
the products 2(n - 1), 3(n - 2), ..., (n - 1)·2 is greater than 1·n.
The product of these products is therefore greater than $n^{n-2}$, i.e.,
$(n - 1)!^2 > n^{n-2}$ or $n!^2 > n^n$ or $n! > \sqrt{n^n}$. Thus, it follows that
$$\left|\frac{x^n}{n!}\right| < \left|\frac{x}{\sqrt{n}}\right|^n.$$
If n is assigned a value such that $\sqrt{n}$ is greater than |2x|, then
$$\left|\frac{x^n}{n!}\right| < \left(\frac{1}{2}\right)^n \quad\text{and}\quad \lim_{n \to \infty} \frac{x^n}{n!} = 0.]$$
The error encountered with formula (8) thus disappears as n
increases infinitely. Consequently:
The progression
$$(9)\qquad e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$$
is true for every finite x.
Note. The series obtained is particularly well suited for
computation of the Euler number e. If, for example, we set x equal to 1,
$$e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{10!} = 2.7182818012$$
and the encountered error is
$$F = \frac{1}{11!} + \frac{1}{12!} + \frac{1}{13!} + \cdots = \frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12 \cdot 13} + \cdots\right),$$
which is smaller than
$$\frac{1}{11!}\left(1 + \frac{1}{12} + \frac{1}{12^2} + \frac{1}{12^3} + \cdots\right)$$
or smaller than
$$\frac{1}{11!} \cdot \frac{12}{11} < 0.00000008.$$
The exact value is e = 2.71828182845904523536 ....
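The computation of the note can be retraced in exact fractions. The sketch below (function names are illustrative) forms the partial sum through 1/10! and the bound obtained by summing the geometric series with ratio 1/12; the comparison with `math.e` relies on the floating-point value of e, which is accurate far beyond the eighth decimal place considered here.

```python
import math
from fractions import Fraction
from math import factorial

def partial_sum_e(n):
    """1 + 1/1! + 1/2! + ... + 1/n! as an exact fraction."""
    return sum(Fraction(1, factorial(k)) for k in range(n + 1))

def tail_bound(n):
    """Geometric-series bound (1/(n+1)!) * (n+2)/(n+1) for the omitted terms."""
    return Fraction(n + 2, (n + 1) * factorial(n + 1))

def actual_error(n):
    """Difference between e and the partial sum, in floating point."""
    return math.e - float(partial_sum_e(n))
```

For n = 10 the actual error lies just below the bound, and both are smaller than the figure 0.00000008 quoted in the text.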
Formula (9), which applies to every finite real value of x, suggests
the further extension of the concept of the exponential function to
include the complex argument values z.
The exponential function $e^z$ for the complex argument z is defined by the
formula
$$(10)\qquad e^z = 1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots \ \text{to infinity}.$$
It is easily seen that the infinite power series on the right-hand side
of (10) has a definite finite value for every finite z, or, in other words,
that the series converges for every finite z:
We set
$$1 + z + \frac{z^2}{2!} + \cdots + \frac{z^n}{n!} = E_n(z),$$
$$\frac{z^{n+1}}{(n + 1)!} + \frac{z^{n+2}}{(n + 2)!} + \cdots + \frac{z^{n+\nu}}{(n + \nu)!} = R_\nu(z),$$
so that
$$E_{n+\nu}(z) - E_n(z) = R_\nu(z).$$
If $\zeta$ represents the absolute magnitude of z, then the absolute
magnitude of $R_\nu(z)$ certainly cannot exceed
$$\frac{\zeta^{n+1}}{(n + 1)!} + \frac{\zeta^{n+2}}{(n + 2)!} + \cdots + \frac{\zeta^{n+\nu}}{(n + \nu)!}$$
and is consequently smaller than
$$\frac{\zeta^{n+1}}{(n + 1)!} + \frac{\zeta^{n+2}}{(n + 2)!} + \cdots \ \text{to infinity} = e^\zeta - E_n(\zeta).$$
Since, in accordance with (8) or (9), $e^\zeta - E_n(\zeta)$ can be made as small
as desired with the selection of a sufficiently high value for n, $R_\nu(z)$ can
certainly be made as small as desired for such an n, no matter how
great the value of $\nu$. However, this means that the series
$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$
converges. (It is in fact absolutely convergent, i.e., it still converges
when z is converted into its absolute magnitude $\zeta$.)
Moreover, let a and b be two arbitrary real or complex values, $\alpha$ and
$\beta$ their absolute magnitudes, and $\alpha + \beta = \gamma$. By multiplication of
$$E_n(a) = 1 + \frac{a}{1!} + \frac{a^2}{2!} + \cdots + \frac{a^n}{n!}$$
and
$$E_n(b) = 1 + \frac{b}{1!} + \frac{b^2}{2!} + \cdots + \frac{b^n}{n!}$$
we obtain $E_n(a)E_n(b) = 1 + C_1 + C_2 + \cdots + C_{2n}$, $C_\nu$ representing
the sum of all the members of the form $\frac{a^r b^s}{r!\,s!}$ in which the exponents r
and s have the sum $\nu$. As long as $\nu$ does not exceed the value of n,
all $\nu + 1$ non-negative index pairs (r, s) with the sum $\nu$ occur in $C_\nu$, whereas
when $\nu > n$ only some of them do. Consequently, according to the
binomial theorem (No. 9)
$$\text{for } \nu \le n: \quad C_\nu = \frac{1}{\nu!}\,(a + b)^\nu,$$
$$\text{for } \nu > n: \quad |C_\nu| < \frac{1}{\nu!}\,\gamma^\nu.$$
The sum of the first (n + 1) terms of $E_n(a)E_n(b)$ is therefore equal to
$E_n(a + b)$, and the sum of the absolute magnitudes of the following
n terms is smaller than $R_n(\gamma)$, i.e., is certainly smaller than
$$\frac{\gamma^{n+1}}{(n + 1)!} + \frac{\gamma^{n+2}}{(n + 2)!} + \cdots \ \text{to infinity} = e^\gamma - E_n(\gamma) = \delta,$$
so that we can set the sum of these n terms equal to $\varepsilon\delta$, where $|\varepsilon| < 1$.
Accordingly, we obtain the equation
$$E_n(a) \cdot E_n(b) = E_n(a + b) + \varepsilon\delta.$$
If we then allow n to become infinite in this equation, $\delta$ becomes equal
to zero, and the equation is converted into
$$(11)\qquad e^a \cdot e^b = e^{a+b}.$$
This fundamental formula justifies our previous suggestion of
designating the series
$$1 + z + \frac{z^2}{2!} + \frac{z^3}{3!} + \cdots$$
as $e^z$.
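Now let z = x + iy, where x and y are real. According to (11),
$e^z = e^x \cdot e^{iy}$ or
$$\frac{e^z}{e^x} = e^{iy} = 1 + iy - \frac{y^2}{2!} - i\,\frac{y^3}{3!} + \frac{y^4}{4!} + i\,\frac{y^5}{5!} - \frac{y^6}{6!} - \cdots$$
$$= \left(1 - \frac{y^2}{2!} + \frac{y^4}{4!} - \frac{y^6}{6!} + \cdots\right) + i\left(y - \frac{y^3}{3!} + \frac{y^5}{5!} - \cdots\right).$$
The brackets appearing here are, in accordance with No. 15,
cos y and sin y, and we obtain the Euler formula:
$$(12)\qquad e^{x+iy} = e^x(\cos y + i \sin y),$$
which when x = 0 takes the form
$$(12a)\qquad e^{iy} = \cos y + i \sin y.$$
If in (12a) $y = \pi$, we obtain the remarkable Euler relation
$$e^{i\pi} = -1$$
between the two significant numbers e and $\pi$.
If we then replace y by -y in (12a), we obtain
$$(12b)\qquad e^{-iy} = \cos y - i \sin y$$
and subsequent addition and subtraction of (12a) and (12b) yields
the equally remarkable pair of formulas
$$\cos y = \frac{e^{iy} + e^{-iy}}{2}, \qquad \sin y = \frac{e^{iy} - e^{-iy}}{2i}.$$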
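Definition (10) and the formulas (11), (12), and (12a) can be tried out with complex numbers directly. The sketch below sums the series (10) up to a fixed number of terms; the function name and the choice of 40 terms are assumptions made for illustration, adequate only for arguments of moderate absolute magnitude.

```python
import math

def exp_series(z, terms=40):
    """Partial sum of 1 + z + z^2/2! + z^3/3! + ... with the given number of terms."""
    total, term = 0, 1
    for k in range(terms):
        total += term
        term = term * z / (k + 1)  # next term z^(k+1)/(k+1)! from z^k/k!
    return total

# Euler relation: the series for z = i*pi should give -1 up to rounding.
euler_relation = exp_series(1j * math.pi)

# Formula (11) for two sample complex values.
a, b = 1 + 2j, -0.5 + 1.5j
product_gap = abs(exp_series(a) * exp_series(b) - exp_series(a + b))
```

The same function can be compared with $e^x(\cos y + i \sin y)$ for any z = x + iy to illustrate (12).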
14. Nicolaus Mercator's Logarithmic Series
To calculate the logarithm of a given number without the use of the logarithmic
table.
This fundamental problem, which forms the basis for the
construction of the logarithmic tables, is solved simply and conveniently by
logarithmic series. The simplest logarithmic series:
$$x - \tfrac{1}{2}x^2 + \tfrac{1}{3}x^3 - \tfrac{1}{4}x^4 + \cdots,$$
which represents the natural log of 1 + x, is found for the first time
in the Logarithmotechnia (London, 1668) of the Holstein mathematician
Nicolaus Mercator (1620-1687) (whose real name was Kaufmann).
For the derivation of the logarithmic series we will make use of the
mean value of the function $f(x) = \frac{1}{1 + x}$, which we will therefore
determine first.
We begin with the inequality (2) of the preceding problem (No. 13), which we
first convert into an inequality for the
logarithmic function nat log x (nat log x, abbreviated as lx, is the logarithm
of x when Euler's number e is taken as the base of the logarithmic
system, i.e., the logarithm is the power of e required to obtain x).
Consequently, we replace v and V with lu and lU, where U > u > 0,
and, correspondingly, $e^v$ and $e^V$ with u and U. This gives us
$$u < \frac{U - u}{lU - lu} < U$$
or
$$(1)\qquad \frac{1}{U} < \frac{lU - lu}{U - u} < \frac{1}{u} \qquad (U > u > 0).$$
The mean value of the function f(x) = 1/(1 + x) is the limiting
value of the fraction
$$\mu = \frac{f(\delta) + f(2\delta) + \cdots + f(n\delta)}{n}$$
for an infinitely increasing n and $\delta = x/n$.
To determine lim $\mu$ for positive and negative values of x, respectively,
we write $1 + \nu\delta \,|\, 1 + (\nu - 1)\delta$ in (1) for the pairs U|u and u|U,
respectively, and then form (1) for $\nu$ = 1, 2, 3, ..., n. Addition of
the resulting n inequalities gives in both cases:
$$\frac{l(1 + x)}{\delta} \ \text{lies between}\ n\mu \ \text{and}\ n\mu + \frac{x}{1 + x},$$
in other words,
$$\mu \ \text{lies between}\ \frac{l(1 + x)}{x} \ \text{and}\ \frac{l(1 + x)}{x} - \frac{x}{n(1 + x)}.$$
Thus, if n becomes infinite, it follows that
$$(2)\qquad \mathfrak{M}_0^x\, \frac{1}{1 + x} = \frac{l(1 + x)}{x},$$
where (1 + x) is naturally to be considered positive.
Now for the derivation of the series for l(1 + x)!
From f = 1/(1 + x) it follows that f = 1 - xf.
If we replace f on the right-hand side of this equation with 1 - xf, we
obtain
$$f = 1 - x + x^2 f.$$
If we again replace f on the right-hand side by 1 - xf, we obtain
$$f = 1 - x + x^2 - x^3 f.$$
Similarly, from this we obtain
$$f = 1 - x + x^2 - x^3 + x^4 f,$$
etc., and in general:
$$f = 1 - x + x^2 - x^3 + x^4 - + \cdots - \varepsilon x^{n-1} + \varepsilon x^n f,$$
where $\varepsilon$ is equal to +1 for even values of n and -1 for uneven values
of n.
Obtaining the mean value from this formula, we have
$$(3)\qquad \frac{l(1 + x)}{x} = 1 - \frac{x}{2} + \frac{x^2}{3} - \frac{x^3}{4} + \cdots - \varepsilon\,\frac{x^{n-1}}{n} + \varepsilon\,\mathfrak{M}_0^x\, x^n f.$$
If F represents the maximum value assumed by f over the interval 0
to x (thus F = 1 for positive values of x, F = 1/(1 + x) for negative
values of x), then in terms of the absolute value the mean value of
$x^n f$ must be smaller than F times the mean value $[x^n/(n + 1)]$ of
$x^n$. Accordingly, we are able to write
$$\mathfrak{M}_0^x\, x^n f = \Theta F \cdot \frac{x^n}{n + 1},$$
where $\Theta$ is a definite positive proper fraction.
58 Arithmetical Problems
This converts (3) into
,., , x2 x3 x* xn n
/(1 +X) =x __ + ___ + £~ + R
xn + 1
with R = eBF-
n+ 1
As n approaches infinity, if x is a proper fraction (also when x = +1)
the "residue" R tends toward zero.
Consequently, the following progression is valid when x is a proper
fraction and when x = 1:
$$l(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + - \cdots. \tag{4}$$
The series on the right-hand side of the equation is Mercator's series.
Since it is only valid for proper fractional values of x, it is not suited
for computing the logarithms of any number whatever. In order to
obtain the series required for this, we substitute in (4) −x for x and
obtain
$$l(1 - x) = -x - \frac{x^2}{2} - \frac{x^3}{3} - \frac{x^4}{4} - \cdots. \tag{5}$$
Subtracting (5) from (4) gives us
$$l\,\frac{1 + x}{1 - x} = 2\left(x + \frac{x^3}{3} + \frac{x^5}{5} + \cdots\right).$$
For every positive or negative proper fractional value of x,
$X = \dfrac{1 + x}{1 - x}$ is positive, while at the same time
$x = \dfrac{X - 1}{X + 1}$, and the formula obtained is written
$$lX = 2\left[x + \tfrac{1}{3}x^3 + \tfrac{1}{5}x^5 + \cdots\right] \quad\text{with}\quad x = \frac{X - 1}{X + 1}. \tag{6}$$
This new series converges for every positive X.
In this series we substitute for X the quotient Z/z of two arbitrary
positive numbers (> 0). This gives us
$$lZ - lz = 2\left[Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \tfrac{1}{7}Q^7 + \cdots\right] \quad\text{with}\quad Q = \frac{Z - z}{Z + z}. \tag{7}$$
This series, in which Z and z may be any two positive numbers, is
the logarithmic series from which the logarithmic tables can be
computed.
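In order, for example, to compute l2 we set z equal to 1 and Z to 2,
which gives us
$$l2 = 2\left(\frac{1}{3} + \frac{1}{3 \cdot 3^3} + \frac{1}{5 \cdot 3^5} + \frac{1}{7 \cdot 3^7} + \cdots\right).$$
In order to compute l5 we set $z = 125 = 5^3$, and $Z = 128 = 2^7$, and
this gives us
$$7\,l2 - 3\,l5 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{3}{253}.$$
To compute l3 we assume that $z = 80 = 5 \cdot 2^4$, $Z = 81 = 3^4$, so that
lz = l5 + 4 l2, lZ = 4 l3. This gives us
$$4\,l3 - l5 - 4\,l2 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{161}.$$
To compute l7 we set z equal to $2400 = 2^5 \cdot 5^2 \cdot 3$, $Z = 2401 = 7^4$,
and obtain
$$4\,l7 - 5\,l2 - 2\,l5 - l3 = 2\left(Q + \tfrac{1}{3}Q^3 + \tfrac{1}{5}Q^5 + \cdots\right) \quad\text{with}\quad Q = \frac{1}{4801}.$$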
The series in the parentheses converge very rapidly, i.e., we
require relatively few terms to obtain their sum fairly exactly.
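The four computations above can be retraced numerically. The following
Python sketch is only an illustration: the function name `series_7`, the
variable names, and the number of terms kept are our own choices, and
floating-point arithmetic stands in for hand calculation.

```python
def series_7(Z, z, terms=12):
    """Return 2(Q + Q^3/3 + Q^5/5 + ...) with Q = (Z - z)/(Z + z), i.e. lZ - lz."""
    q = (Z - z) / (Z + z)
    total, power = 0.0, q
    for k in range(terms):
        total += power / (2 * k + 1)   # adds Q^(2k+1) / (2k+1)
        power *= q * q
    return 2 * total

# l2 directly from Z = 2, z = 1 (Q = 1/3 converges more slowly, so keep more terms)
l2 = series_7(2, 1, terms=25)
# 7 l2 - 3 l5 = series with Q = 3/253
l5 = (7 * l2 - series_7(128, 125)) / 3
# 4 l3 - l5 - 4 l2 = series with Q = 1/161
l3 = (series_7(81, 80) + l5 + 4 * l2) / 4
# 4 l7 - 5 l2 - 2 l5 - l3 = series with Q = 1/4801
l7 = (series_7(2401, 2400) + 5 * l2 + 2 * l5 + l3) / 4
```

For the three small values of Q a handful of terms already exhausts
double precision, which is the rapid convergence the text refers to.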
Note. The common logarithms to the base 10 are computed from
the natural logarithms. From
$$10^{\log x} = e^{lx} \ (= x)$$
it follows in terms of the natural logarithms that
$$\log x \cdot l10 = lx$$
or
$$\log x = M\,lx,$$
where
$$M = \frac{1}{l10} = 0.4342944819$$
is the so-called modulus by which the natural logarithm must be
multiplied to give the common logarithm.
Newton's Sine and Cosine Series
Compute the circular functions sine and cosine of a given angle without the
use of tables.
The simplest way of carrying out the required computation is with
the use of the sine and cosine series.
The series for sin x and cos x first appeared in Newton's treatise
De analysi per aequationes numero terminorum infinitas (1665-1666).
(No. 13.) The sine series appears there as the converse of the
arc sine series, which today is a very uncommon approach.
The derivation of the sine and cosine series presented here is based
upon the mean values of the functions sin x and cos x over the interval
0 through x. (All of the angles mentioned in what follows are
considered in circular measure.)
The mean value M of the function sin x over the interval 0 through
x is the limiting value of the quotient
$$\mu = \frac{\sin\delta + \sin 2\delta + \cdots + \sin n\delta}{n}$$
for an infinitely increasing integral positive n, where $\delta$ represents the
nth part of x.
But the numerator of the quotient possesses the value (see Note 1 at
the end of this number)
$$\sin m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$
where m is the arithmetic mean of the n argument values
$\delta, 2\delta, \ldots, n\delta$, i.e.,
$$m = \frac{n + 1}{2}\,\delta = \frac{x}{2} + \frac{\delta}{2}.$$
Consequently,
$$\mu = \frac{\sin m \,\sin\dfrac{x}{2}}{n \sin\dfrac{x}{2n}}.$$
Since the denominator of the fraction on the right-hand side tends
toward the limit x/2 as n becomes infinitely great,* and the lim m is
also equal to x/2, we obtain
$$M = \lim_{n \to \infty} \mu = \frac{\sin\dfrac{x}{2}\,\sin\dfrac{x}{2}}{\dfrac{x}{2}}$$
or
$$\mathfrak{M}_0^x \sin x = \frac{1 - \cos x}{x}. \tag{1}$$
* The reader who is unfamiliar with this fact will find the proof in note 2 at
the end of this number.
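By the same route, with the use of the formula
$$\cos\delta + \cos 2\delta + \cdots + \cos n\delta = \cos m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$
we obtain
$$\mathfrak{M}_0^x \cos x = \frac{\sin x}{x}. \tag{2}$$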
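The series for sin x and cos x are now very easily found. Starting
with the inequality
$$\cos x < 1,$$
we obtain the mean value for both sides and we have
$$\frac{\sin x}{x} < 1 \quad\text{or}\quad \sin x < x.$$
If we once again obtain the mean values (Formula [1] and No. 11)
we obtain
$$\frac{1 - \cos x}{x} < \frac{x}{2} \quad\text{or}\quad \cos x > 1 - \frac{x^2}{2}.$$
By again obtaining the mean value we get
$$\frac{\sin x}{x} > 1 - \frac{x^2}{3!} \quad\text{or}\quad \sin x > x - \frac{x^3}{3!},$$
etc. This results in:
$$\begin{aligned}
\cos x &< 1, & \sin x &< x,\\
\cos x &> 1 - \frac{x^2}{2!}, & \sin x &> x - \frac{x^3}{3!},\\
\cos x &< 1 - \frac{x^2}{2!} + \frac{x^4}{4!}, & \sin x &< x - \frac{x^3}{3!} + \frac{x^5}{5!},\\
\cos x &> 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!}, & \sin x &> x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!},
\end{aligned}$$
etc.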
The integral rational functions on the right-hand side of these
inequalities are the 1st, 2nd, 3rd, ..., $\nu$th approximations of the functions
sin x and cos x. They are called approximations because the degree
of their deviation from the correct circular function grows
progressively smaller as the index $\nu$ becomes higher and can be made as small
as desired if $\nu$ is sufficiently great. Specifically, each of the two
circular functions lies between two successive approximations of the
true value. Thus, if we set them equal to one of these two
approximations, the error incurred is smaller than the difference between the
approximations, which has the form $x^\nu/\nu!$. The fraction $x^\nu/\nu!$,
however, tends toward zero as $\nu$ becomes infinitely great (No. 13).
Accordingly, the following progressions
$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + - \cdots,$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + - \cdots$$
are valid for finite values of x.
If one of these series is interrupted at any point the error thereby incurred is
smaller than the first disregarded term.
With these series it is possible to compute the sine and cosine of any
given angle. They were used to draw up the sine and cosine tables
found in logarithmic handbooks.
In order to illustrate the degree of approximation let us compute,
for example, the sin 1° = sin x (where $x = \pi/180$). We set
$$\sin 1° = \sin x = x - \frac{x^3}{6}.$$
The error thereby incurred is smaller than $x^5/120$, and this fraction is
smaller than 0.000 000 000 02, so that, calculated exactly to 10 places,
sin 1° = 0.0174524064.
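The computation of sin 1° can be checked with a short Python sketch. The
function names `sin_partial` and `cos_partial` are our own, and the value
of π is taken from the standard library rather than derived; the sketch
only illustrates how the partial sums bracket the true value.

```python
import math

def sin_partial(x, terms):
    """Sum the first `terms` terms of x - x^3/3! + x^5/5! - ..."""
    total, term = 0.0, x
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 2) * (2 * k + 3))   # next term of the series
    return total

def cos_partial(x, terms):
    """Sum the first `terms` terms of 1 - x^2/2! + x^4/4! - ..."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -x * x / ((2 * k + 1) * (2 * k + 2))
    return total

x = math.pi / 180                 # 1 degree in circular measure
sin_1_degree = sin_partial(x, 2)  # x - x^3/6
error_bound = x**5 / 120          # first disregarded term
```

Two terms suffice here because the angle is small; for larger angles the
same functions can be called with more terms, and two successive partial
sums always enclose the true value for positive x.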
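Note 1. Summation of the series
$$S = \sin\alpha + \sin(\alpha + \delta) + \sin(\alpha + 2\delta) + \cdots + \sin(\alpha + (n - 1)\delta).$$
We multiply both sides by $2\sin(\delta/2)$ and transform each of the
products on the right in accordance with the formula
$$2\sin\frac{\delta}{2}\,\sin(\alpha + \nu\delta) = \cos\left(\alpha + \frac{2\nu - 1}{2}\,\delta\right) - \cos\left(\alpha + \frac{2\nu + 1}{2}\,\delta\right).$$
We are then left with
$$2S\sin\frac{\delta}{2} = \cos\left(\alpha - \frac{\delta}{2}\right) - \cos\left(\alpha + \frac{2n - 1}{2}\,\delta\right).$$
Since the right side of this equation is
$$2\sin\left(\alpha + \frac{n - 1}{2}\,\delta\right)\sin\frac{n\delta}{2},$$
we obtain
$$S = \sin m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}},$$
where $m = \alpha + \dfrac{n - 1}{2}\,\delta$ represents the mean value of all n angles
$\alpha, \alpha + \delta, \ldots, \alpha + (n - 1)\delta$.
In order to obtain the sum of the series
$$\Sigma = \cos\alpha + \cos(\alpha + \delta) + \cdots + \cos(\alpha + (n - 1)\delta)$$
we again multiply both sides by $2\sin(\delta/2)$, but on the right-hand side
we write
$$2\sin\frac{\delta}{2}\,\cos(\alpha + \nu\delta) = \sin\left(\alpha + \frac{2\nu + 1}{2}\,\delta\right) - \sin\left(\alpha + \frac{2\nu - 1}{2}\,\delta\right).$$
We are then left with
$$2\Sigma\sin\frac{\delta}{2} = \sin\left(\alpha + \frac{2n - 1}{2}\,\delta\right) - \sin\left(\alpha - \frac{\delta}{2}\right) = 2\cos\left(\alpha + \frac{n - 1}{2}\,\delta\right)\sin\frac{n\delta}{2},$$
and we obtain
$$\Sigma = \cos m \cdot \frac{\sin\dfrac{n\delta}{2}}{\sin\dfrac{\delta}{2}}.$$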
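The two closed forms of Note 1 are easy to test against direct
summation. In the sketch below the function names and the sample values
of α, δ, and n are arbitrary choices of ours, not part of the text.

```python
import math

def sine_sum_direct(alpha, delta, n):
    """sin(alpha) + sin(alpha + delta) + ... + sin(alpha + (n-1) delta), term by term."""
    return sum(math.sin(alpha + v * delta) for v in range(n))

def cosine_sum_direct(alpha, delta, n):
    """cos(alpha) + cos(alpha + delta) + ... + cos(alpha + (n-1) delta), term by term."""
    return sum(math.cos(alpha + v * delta) for v in range(n))

def closed_forms(alpha, delta, n):
    """Return (S, Sigma) from the formulas sin m * q and cos m * q of Note 1."""
    m = alpha + (n - 1) / 2 * delta                       # mean of the n angles
    q = math.sin(n * delta / 2) / math.sin(delta / 2)     # common quotient
    return math.sin(m) * q, math.cos(m) * q

S_closed, Sigma_closed = closed_forms(0.3, 0.07, 25)
```

The closed forms require sin(δ/2) to be different from zero, which is
the case for the small positive δ used in this number.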
Note 2. Proof that $\lim_{n \to \infty} n \sin\dfrac{w}{n} = w$.
$$\sin w = 2\sin\frac{w}{2}\cos\frac{w}{2} = 2\tan\frac{w}{2}\cos^2\frac{w}{2} = 2\tan\frac{w}{2}\left(1 - \sin^2\frac{w}{2}\right).$$
However, since sin w < w and tan w > w, it follows that
$$\sin w > 2 \cdot \frac{w}{2}\left(1 - \frac{w^2}{4}\right) \quad\text{or}\quad \sin w > w - \frac{w^3}{4}.$$
Then $\sin\dfrac{w}{n}$ lies between $\dfrac{w}{n}$ and $\dfrac{w}{n} - \dfrac{w^3}{4n^3}$, i.e.,
$n\sin\dfrac{w}{n}$ lies between w and $w - \dfrac{w^3}{4n^2}$. Thus,
$$\lim_{n \to \infty} n\sin\frac{w}{n} = w.$$
André's Derivation of the Secant and Tangent Series
Perhaps the most convenient and certainly the most attractive way
of deriving the power series of the functions sec x and tan x is the
method of zigzag permutations devised by the French mathematician
André (Comptes Rendus, 1879, and Journal de Mathématiques, 1881).
A zigzag permutation (called by André an "alternating permutation")
of the n numbers 1, 2, 3, ..., n is an arrangement $c_1, c_2, \ldots, c_n$
of these numbers in which no element $c_\nu$ possesses a magnitude such
that it lies between its two neighbors $c_{\nu-1}$ and $c_{\nu+1}$. If the points
$P_1, P_2, \ldots, P_n$ are marked off on a system of coordinates such that
their respective abscissas are 1, 2, ..., n and their respective ordinates
$c_1, c_2, \ldots, c_n$, and each two successive points $P_\nu$ and $P_{\nu+1}$ are
connected by a line segment, the zigzag line by which the permutation
gets its name is obtained.
Fig. 2.
A zigzag line or zigzag permutation can begin either by rising or
falling. We assert:
There are as many zigzag permutations {among n elements) that begin by
rising as by falling.
Proof. Let $P_1P_2 \cdots P_n$ be the zigzag line corresponding to one
zigzag permutation. Let us draw, through their highest and lowest
point, parallels to the abscissa axis and a parallel midway between
them. If we construct a mirror image of the zigzag line upon the
middle parallel, the mirror image gives us a new zigzag line $Q_1Q_2 \cdots Q_n$
or zigzag permutation, which begins either by falling or rising,
depending upon whether the first zigzag line begins by rising or
falling. Thus, for every zigzag permutation which begins by rising
(or falling) we can obtain a corresponding zigzag permutation which
begins by falling (or rising). Consequently, there is an equal number
of each type.
Naturally there are just as many zigzag permutations that end by
rising as by falling.
Let us, therefore, designate the number of zigzag permutations of n
elements as $2A_n$, so that $A_n$ represents the number of zigzag
permutations of n elements that begin (or end) by rising (or falling).
The number $A_n$ can be determined by a periodic formula. Let us
consider all the $2A_n$ zigzag permutations of the n elements 1, 2, ..., n
as written down and let us single out one of them, in which the highest
element n occupies the (r + 1)th place (counting from the left). To
the left of n there are then the r elements $\alpha_1, \alpha_2, \ldots, \alpha_r$, while to the
right of n there are the s numbers $\beta_1, \beta_2, \ldots, \beta_s$, with r + s = m =
n − 1. The permutation $\alpha_1\alpha_2 \ldots \alpha_r$ ends by falling, since $\alpha_r$ is
followed by n, which is higher; the permutation $\beta_1\beta_2 \ldots \beta_s$ begins by
rising, since $\beta_1$ follows n, which is higher.
Now let there be formed from the r elements $\alpha_1, \alpha_2, \ldots, \alpha_r$ a total
of $A_r$ zigzag permutations with falling ends and, similarly, from the s
elements $\beta_1, \beta_2, \ldots, \beta_s$ a total of $A_s$ zigzag permutations with rising
beginnings. Consequently, there are $A_r \cdot A_s$ zigzag permutations of n
elements in which n occupies the (r + 1)th position and in which to
the left of n there are the r elements $\alpha_1, \alpha_2, \ldots, \alpha_r$. However, since there
are many other combinations of m elements to the rth class aside from
the considered combination $\alpha_1, \alpha_2, \ldots, \alpha_r$ (as is commonly known,
there are a total of $C_m^r = m_r = m!/(r!\,s!)$), there are consequently a
total of
$$p_r = m_r A_r A_s \qquad (r + s = m)$$
zigzag permutations of n elements in which the highest element (n)
occupies the (r + 1)th place. It is also easily seen that this formula
is also valid for the indices r = 0, 1, 2 if one sets $A_0 = A_1 = A_2 = 1$.
In order to obtain all the possible zigzag permutations we must
obtain the expression $p_r$ for all the values from r = 0 through r = m =
n − 1 and add the resulting products. This gives us
$$2A_n = \sum_{r=0}^{m} p_r = \sum_{r=0}^{m} m_r A_r A_s.$$
In order to simplify this formula somewhat further, we write $m!/(r!\,s!)$
instead of $m_r$ and set
$$\frac{A_\nu}{\nu!} = a_\nu. \tag{1}$$
It is then transformed into
$$2n\,a_n = a_0 a_{n-1} + a_1 a_{n-2} + \cdots + a_{n-2} a_1 + a_{n-1} a_0,$$
or, utilizing the symbol for the sum, into
$$2n\,a_n = \sum a_r a_s, \tag{2}$$
where r and s pass through all the possible integral numbers ≥ 0, for
which r + s = n − 1.
Using the periodic formula (2) it is possible to compute, beginning
with $a_2$, each number of the series $a_0, a_1, a_2, a_3, a_4, \ldots$ from the
numbers preceding it.
From $a_n$, when it is multiplied by n!, it is possible to obtain half the
number of zigzag permutations of n elements.
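We can draw up a table for the simplest cases:
$$\begin{array}{c|ccccccccc}
n & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8\\ \hline
a_n & 1 & 1 & \tfrac{1}{2} & \tfrac{1}{3} & \tfrac{5}{24} & \tfrac{2}{15} & \tfrac{61}{720} & \tfrac{17}{315} & \tfrac{277}{8064}\\
A_n & 1 & 1 & 1 & 2 & 5 & 16 & 61 & 272 & 1385
\end{array}$$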
We are able to confirm, for example, that the four elements 1, 2, 3, 4
yield $2 \cdot A_4 = 10$ zigzag permutations:
1324, 2143, 3142, 4132,
1423, 2314, 3241, 4231,
2413, 3412.
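Both the periodic formula (2) and the count of ten permutations can be
reproduced mechanically. The Python sketch below is our own
illustration: the function names are hypothetical, exact fractions are
used for the $a_n$, and the brute-force count simply tests every
permutation against the definition.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def zigzag_coefficients(n_max):
    """Return [a_0, ..., a_n_max] from 2n a_n = sum of a_r a_s with r + s = n - 1."""
    a = [Fraction(1), Fraction(1)]                 # a_0 = a_1 = 1
    for n in range(2, n_max + 1):
        s = sum(a[r] * a[n - 1 - r] for r in range(n))
        a.append(s / (2 * n))
    return a[: n_max + 1]

def is_zigzag(c):
    """True if no element lies between its two neighbours."""
    return all((c[i] - c[i - 1]) * (c[i + 1] - c[i]) < 0
               for i in range(1, len(c) - 1))

def count_zigzag(n):
    """Count all zigzag permutations of 1..n by exhaustive search."""
    return sum(is_zigzag(p) for p in permutations(range(1, n + 1)))

a = zigzag_coefficients(8)
A = [int(a_n * factorial(n)) for n, a_n in enumerate(a)]   # A_n = n! a_n
```

The exhaustive count grows like n! and is only practical for small n,
whereas the recursion needs on the order of n² multiplications.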
It is but a short step from the zigzag permutations to the series for
sec x and tan x.
First we establish that starting with the index 3 all $a_\nu$ are proper
fractions < ½. Since the number of zigzag permutations of n elements
for n > 2 is smaller than the number of all the permutations of n
elements, then $2A_n$ must be < n!, and consequently,
$$a_n < \tfrac{1}{2}.$$
Therefore, the infinite series
$$y = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots$$
converges absolutely and uniformly over every interval −h through
+h where h < 1. It therefore represents over this interval a
continuous function that can be differentiated term by term. The
derivative of y is
$$y' = a_1 + 2a_2 x + 3a_3 x^2 + \cdots.$$
Since, moreover, the series for y converges absolutely, we can square
it and thereby obtain
$$y^2 = \sum_{n \ge 1} b_n x^{n-1},$$
where $b_1 = 1$ and for all n ≥ 2
$$b_n = a_0 a_{n-1} + a_1 a_{n-2} + a_2 a_{n-3} + \cdots + a_{n-1} a_0.$$
In accordance with (2), therefore, whenever n ≥ 2,
$$b_n = 2n\,a_n,$$
and then
$$y^2 = 1 + 2 \cdot 2a_2 x + 2 \cdot 3a_3 x^2 + 2 \cdot 4a_4 x^3 + \cdots.$$
If we then add one to both sides we obtain
$$1 + y^2 = 2\left[a_1 + 2a_2 x + 3a_3 x^2 + 4a_4 x^3 + \cdots\right]$$
or
$$1 + y^2 = 2y'.$$
We write this equation
$$\frac{y'}{1 + y^2} - \frac{1}{2} = 0$$
and reflect that the left side is the derivative of the function
$$Y = \arctan y - \tfrac{1}{2}x,$$
but that the derivative of a function (Y) can be zero only if this
function is a constant. Thus we have
$$Y = \arctan y - \tfrac{1}{2}x = \text{const.}$$
In order to determine the constant, we set x equal to zero and obtain
for this value of the argument x
$$y = 1, \quad \arctan y = \frac{\pi}{4}, \quad\text{and}\quad Y = \frac{\pi}{4}.$$
The constant therefore has the value $\pi/4$, and our equation is
transformed into
$$\arctan y = \frac{\pi}{4} + \frac{x}{2}.$$
From this it follows that
$$y = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right),$$
and we have the progression
$$\tan\left(\frac{\pi}{4} + \frac{x}{2}\right) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 + \cdots \tag{3}$$
which is true in any case for every proper fractional positive or negative
value of x.
We replace x in (3) by −x and obtain
$$\tan\left(\frac{\pi}{4} - \frac{x}{2}\right) = a_0 - a_1 x + a_2 x^2 - a_3 x^3 + - \cdots. \tag{4}$$
As is easily seen, however, the two trigonometric formulas
$$2\sec x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) + \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$
and
$$2\tan x = \tan\left(\frac{\pi}{4} + \frac{x}{2}\right) - \tan\left(\frac{\pi}{4} - \frac{x}{2}\right)$$
are true.
If we introduce on the right-hand side here the series indicated in
(3) and (4) we obtain the progressions for sec x and tan x which we
were seeking:
$$\sec x = a_0 + a_2 x^2 + a_4 x^4 + a_6 x^6 + \cdots,$$
$$\tan x = a_1 x + a_3 x^3 + a_5 x^5 + a_7 x^7 + \cdots$$
or, if we return to half the number of zigzag permutations, $A_n$,
$$\sec x = A_0 + A_2\,\frac{x^2}{2!} + A_4\,\frac{x^4}{4!} + A_6\,\frac{x^6}{6!} + \cdots,$$
$$\tan x = A_1 x + A_3\,\frac{x^3}{3!} + A_5\,\frac{x^5}{5!} + A_7\,\frac{x^7}{7!} + \cdots.$$
These two progressions are true in all cases for every proper fractional
value of x.
However, since sec x and tan x as functions of the complex argument
x are analytic functions of x and the singular point closest to zero
is $x = \pi/2$, the convergence circle has the radius $\pi/2$.
The two power series for sec x and tan x consequently converge for every
x the absolute value of which lies below $\pi/2$.
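As an illustration of the two progressions, the following Python sketch
builds the numbers $A_n$ from the recursion $2A_n = \sum m_r A_r A_s$ and
sums the series for a given x. The function names and the cut-off
`n_max = 40` are assumptions of ours; the cut-off is adequate only for x
well inside the convergence circle.

```python
import math

def zigzag_numbers(n_max):
    """Return [A_0, ..., A_n_max] from 2 A_n = sum of C(m, r) A_r A_s, m = n - 1 = r + s."""
    A = [1, 1]                                     # A_0 = A_1 = 1
    for n in range(2, n_max + 1):
        m = n - 1
        total = sum(math.comb(m, r) * A[r] * A[m - r] for r in range(m + 1))
        A.append(total // 2)                       # the sum counts every permutation twice
    return A[: n_max + 1]

def sec_tan(x, n_max=40):
    """Approximate (sec x, tan x) by the even and odd parts of sum A_n x^n / n!."""
    A = zigzag_numbers(n_max)
    sec = sum(A[n] * x**n / math.factorial(n) for n in range(0, n_max + 1, 2))
    tan = sum(A[n] * x**n / math.factorial(n) for n in range(1, n_max + 1, 2))
    return sec, tan

sec_half, tan_half = sec_tan(0.5)
```

As x approaches π/2 the terms decrease ever more slowly, so many more
terms would be needed there than the forty used in this sketch.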
Gregory's Arc Tangent Series
Determine the angles of a triangle from the sides without the use of tables.
If a, b, c are the given sides of the triangle, $\alpha, \beta, \gamma$ the angles (given
in arc measure), the following relations, as is well known, are obtained:
$$\tan\frac{\alpha}{2} = \frac{\rho}{u}, \quad \tan\frac{\beta}{2} = \frac{\rho}{v}, \quad \tan\frac{\gamma}{2} = \frac{\rho}{w},$$
where $\rho^2 = uvw/s$, u = s − a, v = s − b, w = s − c, 2s = a + b + c.
Thus, $\alpha/2$, $\beta/2$, $\gamma/2$ are the arcs whose tangents are $\rho/u$, $\rho/v$, $\rho/w$. We
write
$$\frac{\alpha}{2} = \arctan\frac{\rho}{u}, \quad \frac{\beta}{2} = \arctan\frac{\rho}{v}, \quad \frac{\gamma}{2} = \arctan\frac{\rho}{w}.$$
Arc tan x is understood to represent the arc whose tangent is x. The
function arc tan x is called a cyclometric function.
We can consider our problem solved if we can succeed in calculating
the cyclometric function arc tan x for any given x. This can be calculated
by means of the power series for the arc tangent function
obtained in 1671 by the English mathematician James Gregory
(1638-1675).
To derive the arc tangent series we make use of the mean value of
the function $f(x) = \dfrac{1}{1 + x^2}$, which we must consequently compute
beforehand.
On a tangent of a unit circle $\mathfrak{K}$ we mark off from the point of
tangency A the two segments Ap = v and AP = V in such a manner
that $Pp = \varphi = V - v$; we connect p and P with the center of the
circle O and designate the distances Op and OP as r and R, their
intersections with $\mathfrak{K}$ as q and Q, and the arcs Aq, AQ, qQ in that order
as w, W, $\omega$. This gives us the equations w = arc tan v, W =
arc tan V, $\omega$ = arc tan V − arc tan v.
We would like to enclose the area ($\varphi/2$) of the triangle OPp between
two bounds, and for this purpose we draw the two arcs ph and PH
concentric to qQ so that they meet OP and the extension of Op at h and H.
The area of the triangle is then greater than the area ($\tfrac{1}{2}r^2\omega$) of the
sector Oph but smaller than the area ($\tfrac{1}{2}R^2\omega$) of the sector OPH,
so that
$$r^2\omega < \varphi < R^2\omega.$$
It follows from this that
$$\frac{1}{R^2} < \frac{\omega}{\varphi} < \frac{1}{r^2}$$
or, if instead of $\varphi$, $\omega$, $r^2$, and $R^2$ we write in the same order V − v,
arc tan V − arc tan v, $1 + v^2$, $1 + V^2$ (Pythagoras),
$$\frac{1}{1 + V^2} < \frac{\arctan V - \arctan v}{V - v} < \frac{1}{1 + v^2}. \tag{1}$$
In order to determine the mean value of the function
$F(x) = \dfrac{1}{1 + x^2}$ over the interval 0 through x, i.e., the limiting value of
$$\mu = \frac{F(\delta) + F(2\delta) + \cdots + F(n\delta)}{n}$$
(where $\delta = x/n$), in (1) we substitute successively
$0 \mid \delta$, $\delta \mid 2\delta$, $2\delta \mid 3\delta$, ..., $(n - 1)\delta \mid n\delta$
for the value pair $v \mid V$, add the resulting inequalities, and
obtain
$$n\mu < \frac{\arctan x}{\delta} < n\mu + 1 - \frac{1}{1 + x^2}$$
or
$$\frac{\arctan x}{x} - \frac{x^2}{n(1 + x^2)} < \mu < \frac{\arctan x}{x}.$$
As the limit n = ∞ is approached this inequality is transformed into
$$\mathfrak{M}_0^x\, \frac{1}{1 + x^2} = \frac{\arctan x}{x}. \tag{2}$$
Now for the derivation of the arc tangent series!
It is
$$\frac{1}{1 + x^2} = 1 - \frac{x^2}{1 + x^2}$$
or
$$F = 1 - x^2 F,$$
if for the sake of brevity we write F for F(x). If we replace the F on
the right-hand side of this equation with $1 - x^2 F$, we obtain
$$F = 1 - x^2 + x^4 F.$$
If here we once again write $1 - x^2 F$ for F on the right-hand side, we
obtain
$$F = 1 - x^2 + x^4 - x^6 F.$$
In a similar manner, from this we obtain
$$F = 1 - x^2 + x^4 - x^6 + x^8 F,$$
$$F = 1 - x^2 + x^4 - x^6 + x^8 - x^{10} F,$$
etc. Consequently, we obtain the inequality
$$1 - x^2 + x^4 - x^6 + - \cdots - x^{4n-2} < F < 1 - x^2 + x^4 - x^6 + - \cdots + x^{4n}.$$
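Obtaining the mean value here gives us
$$1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots - \frac{x^{4n-2}}{4n - 1} < \frac{\arctan x}{x} < 1 - \frac{x^2}{3} + \frac{x^4}{5} - \frac{x^6}{7} + - \cdots + \frac{x^{4n}}{4n + 1}$$
or
$$x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n - 1} < \arctan x < x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots + \frac{x^{4n+1}}{4n + 1}. \tag{3}$$
If we then set
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n - 1}$$
or rather
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots - \frac{x^{4n-1}}{4n - 1} + \frac{x^{4n+1}}{4n + 1},$$
the error thereby incurred is smaller than the difference
$x^{4n+1}/(4n + 1)$ of the boundaries of (3). Since, however, this difference
tends toward zero when n becomes infinitely great and x is a proper
fraction (also when x = 1), we obtain the progression
$$\arctan x = x - \frac{x^3}{3} + \frac{x^5}{5} - \frac{x^7}{7} + - \cdots \qquad (\text{for } x \le 1). \tag{4}$$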
This is Gregory's formula. If the progression is interrupted at any point the
error incurred is smaller than the first disregarded term.
The series cannot be used when x is an improper fraction, because it
no longer converges. In order to calculate arc tan x in this case we
introduce y = 1/x, the reciprocal value of x, and make use of the
formula
$$\arctan x + \arctan y = \frac{\pi}{2}. \tag{5}$$
[If arc tan x = $\alpha$, i.e., x = tan $\alpha$, then from
$$\tan\left(\frac{\pi}{2} - \alpha\right) = \frac{1}{\tan\alpha} = \frac{1}{x} = y$$
we obtain by inversion
$$\frac{\pi}{2} - \alpha = \arctan y \quad\text{or}\quad \frac{\pi}{2} = \arctan x + \arctan y.]$$
We then obtain arc tan y in accordance with Gregory's formula and
arc tan x in accordance with (5).
But even if x is a proper fraction the arc tangent series is not
advisable when x is very close to 1. In this case we introduce
$z = \dfrac{1 - x}{1 + x}$, the half reciprocal value of x, and make use of the formula
$$\arctan x + \arctan z = \frac{\pi}{4}. \tag{6}$$
[If arc tan x = $\alpha$, i.e., x = tan $\alpha$, then from
$$\tan\left(\frac{\pi}{4} - \alpha\right) = \frac{1 - \tan\alpha}{1 + \tan\alpha}$$
we obtain by inversion
$$\frac{\pi}{4} - \alpha = \arctan\frac{1 - x}{1 + x} \quad\text{or}\quad \frac{\pi}{4} = \arctan x + \arctan z.]$$
Thus we obtain arc tan z with Gregory's formula and then arc tan x
with (6).
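The procedure just described (Gregory's series for small arguments,
formula (5) for improper fractions, formula (6) near 1) can be sketched
in Python and applied to the triangle problem stated at the head of this
number. The function names, the stopping tolerance, and the threshold
0.5 at which formula (6) takes over are our own assumptions, and π is
taken from the standard library.

```python
import math

def gregory(x, tol=1e-15):
    """Sum x - x^3/3 + x^5/5 - ... for 0 <= x < 1 until the next term is below tol."""
    total, power, k = 0.0, x, 0
    while abs(power) / (2 * k + 1) >= tol:
        total += (-1) ** k * power / (2 * k + 1)
        power *= x * x
        k += 1
    return total

def arc_tan(x):
    """Arc tangent of x >= 0 from Gregory's series and formulas (5) and (6)."""
    if x > 1:                        # improper fraction: formula (5) with y = 1/x
        return math.pi / 2 - arc_tan(1 / x)
    if x > 0.5:                      # slow convergence: formula (6) with z = (1 - x)/(1 + x)
        return math.pi / 4 - gregory((1 - x) / (1 + x))
    return gregory(x)

def triangle_angles(a, b, c):
    """Angles (in arc measure) opposite the sides a, b, c, via tan(angle/2) = rho/(s - side)."""
    s = (a + b + c) / 2
    u, v, w = s - a, s - b, s - c
    rho = math.sqrt(u * v * w / s)
    return tuple(2 * arc_tan(rho / t) for t in (u, v, w))

alpha, beta, gamma = triangle_angles(3, 4, 5)
```

For the triangle with sides 3, 4, 5 the third angle comes out as a right
angle, as it must, since there ρ = w = 1.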
Note. If in (4) we set x = 1, we obtain the so-called Leibniz
series:
$$\frac{\pi}{4} = 1 - \frac{1}{3} + \frac{1}{5} - \frac{1}{7} + - \cdots,$$
which was discovered by Leibniz independently of Gregory in 1674.
It is not advisable, however, to use this series to calculate $\pi$. The
series discovered by the English mathematician John Machin
(d. 1751), which was published by him in 1706, is much better suited
for this purpose. Machin made use of the auxiliary angle A whose
tangent is 1/5. From tan A = 1/5 it follows that
$\tan 2A = 2\tan A/(1 - \tan^2 A) = \tfrac{5}{12}$, and from this, similarly, that
$$\tan 4A = \frac{2\tan 2A}{1 - \tan^2 2A} = \frac{120}{119}.$$
Inversion gives us $4A = \arctan\tfrac{120}{119}$ or
$$\arctan\frac{120}{119} = 4\arctan\frac{1}{5}.$$
The left side of this equation, according to (5), has the value
$\tfrac{\pi}{2} - \arctan\tfrac{119}{120}$; $\arctan\tfrac{119}{120}$, however, according to (6), has the value
$\tfrac{\pi}{4} - \arctan\tfrac{1}{239}$, so that the left side is $\tfrac{\pi}{4} + \arctan\tfrac{1}{239}$. Consequently,
$$\frac{\pi}{4} = 4\arctan\frac{1}{5} - \arctan\frac{1}{239},$$
or written out completely:
$$\frac{\pi}{4} = 4\left(\frac{1}{5} - \frac{1}{3 \cdot 5^3} + \frac{1}{5 \cdot 5^5} - + \cdots\right) - \left(\frac{1}{239} - \frac{1}{3 \cdot 239^3} + \frac{1}{5 \cdot 239^5} - + \cdots\right).$$
Using this series, Machin calculated $\pi$ to 100 decimal places.
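Machin's formula lends itself to exact integer arithmetic. The sketch
below is a modern illustration, not a reconstruction of Machin's own
working: the function names and the choice of ten guard digits are
assumptions of ours.

```python
def arctan_reciprocal_scaled(n, scale):
    """Integer approximation of scale * arc tan(1/n) by Gregory's series."""
    term = scale // n            # scale / n^(2k+1), starting with k = 0
    total, k, sign = 0, 1, 1
    n_squared = n * n
    while term:
        total += sign * (term // k)
        term //= n_squared
        k += 2
        sign = -sign
    return total

def machin_pi(digits):
    """Return pi truncated to `digits` decimals, as an integer without the decimal point."""
    guard = 10                   # extra digits that absorb the rounding of the divisions
    scale = 10 ** (digits + guard)
    quarter_pi = 4 * arctan_reciprocal_scaled(5, scale) - arctan_reciprocal_scaled(239, scale)
    return 4 * quarter_pi // 10 ** guard

pi_20 = machin_pi(20)
```

Each further term of the first series gains about 1.4 decimal places
and each term of the second about 4.8, so 100 places need roughly
seventy terms of the one and twenty of the other.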
Buffon's Needle Problem
Parallels are drawn on a table at intervals of d. A needle of length l smaller
than d is thrown at random on the table. What is the probability that the
needle will touch one of the parallels?
This remarkable problem stems from Georges Louis Leclerc,
Comte de Buffon (1707-1788), who was the first man to clothe
probability problems in geometric form.
The probability of an event is commonly understood to mean the
ratio of the number of cases favoring an event to the total number of
possible cases.
Let the probability we are seeking be W.
Let the needle have the terminal points A and B. Let us imagine
the parallels extended horizontally. Let us single out two such
adjacent parallels I and II (below I) and from any point P on line I
let us drop a perpendicular PQ (= d) to line II.
Let us begin by considering the special positions $\Sigma$ of the needle
which are characterized by the following three conditions: (1) the
terminal point A lies on the segment PQ; (2) the needle lies to
the right of QP; (3) AP forms an acute angle: the inclination of
the needle toward QP.
Let the probability that the needle touches parallel I in any of the
special positions be w.
First we will show that
W = w.
If we consider all of the positions $\Sigma'$ in which the needle touches with
its terminal point A either end of the segment PQ but is otherwise
arbitrarily situated (i.e., touching either I or II or neither) this
quadruples (as compared to the number of positions $\Sigma$) both the number
of all the possible cases and the number of all the favorable cases.
The probability of touching one of the two parallels I and II in all
of the positions $\Sigma'$ is, therefore, likewise w.
If to the cases $\Sigma'$ we add those positions in which the terminal point
B instead of terminal point A comes to rest on the segment PQ, we
obtain a total of $\Sigma''$ positions, which doubles the number of possible
cases as well as the number of favorable cases.
Consequently, the probability of touching one of the parallels I
and II in the positions $\Sigma''$ is also w.
Now if instead of taking one perpendicular PQ we take a very
great number $\nu$ of very closely situated equidistant successive
perpendiculars between I and II and consider all the positions of the
needle in which one end of the needle comes to rest upon one of these
$\nu$ perpendiculars, we thereby multiply by $\nu$ (with respect to $\Sigma''$) the
number of all the possible as well as that of all the favorable cases.
Consequently, the probability of touching one of the parallels I
and II by a needle position in which one needle end lies between
I and II is again w.
The addition of still a third parallel III representing a mirror
image of I on II (or of II on I), as well as the addition of the needle
positions in which one end of the needle lies between III and II (or
between III and I), again give us a probability of w.
In short, we have shown that
W=w.
Consequently, our problem has been limited to the task of
determining the probability w of the needle touching line I in a special
position.
Fig. 3.
To obtain a better view of the infinitely great number of special
positions, let us divide the above segment PQ into a very great number
$N > 1000^{1000}$ of equal parts and let us consider all of the cases in
which the needle end A cuts one of the dividing points. For each
dividing point there are an infinitely great number of possibilities
corresponding to the infinitely great number of possible needle angles.
For convenience in considering these possibilities also, let us consider
only the M angles
$$\theta_0 = 0, \quad \theta_1 = \varepsilon, \quad \theta_2 = 2\varepsilon, \quad \theta_3 = 3\varepsilon, \quad \ldots, \quad \theta_{M-1} = (M - 1)\varepsilon,$$
where M likewise represents a very great number (e.g., M > 22'3)
and $\varepsilon$ is the Mth part of $\pi/2$.
In this manner our consideration involves N points and M angles,
thus, a total of NM needle positions.
However, only a certain fraction (just w) of these positions are
favorable. In order to determine this fraction we begin by obtaining
the total number of only those favorable positions in which the angle
of inclination of the needle has the selected value $\theta_\nu$, as illustrated in
Figure 3. These positions form a parallelogram EFGP with the sides
EF = l and $EP = l\cos\theta_\nu$. Since there are
$$N \cdot \frac{EP}{PQ} = N\,\frac{l}{d}\cos\theta_\nu$$
dividing points on the segment EP, our overall total comprises
$$N\,\frac{l}{d}\cos\theta_\nu$$
favorable positions (with the common needle angle $\theta_\nu$). The number
n of all the favorable positions altogether is consequently
$$n = N\,\frac{l}{d}\left(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}\right).$$
The probability that we are seeking is, therefore,
$$w = \frac{n}{NM} = \frac{l}{d} \cdot \frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$
There remains then only the task of determining the value of the
fraction
$$m = \frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M}.$$
The fraction m is no different from the mean value of the cosine
function over the interval 0 through $\pi/2$.
Those who are familiar with the elements of integral calculus will
immediately be able to write this mean value; it is
$$m = \int_0^{\pi/2} \cos x\,dx \Big/ \frac{\pi}{2} = \frac{2}{\pi}.$$
Those readers who are not familiar with this type of calculation can
obtain m just as easily in the following adroit manner.
Draw a quadrant of a circle with a radius of 1, designating the
horizontal arm as OH and the vertical as OK. If this is rotated about
the radius OK it forms a hemisphere the area of whose surface is
commonly known to be $2\pi$.
The area of this surface can be expressed in a different form.
For this purpose let us move the above angles of inclination
$\theta_0, \theta_1, \theta_2, \ldots, \theta_{M-1}$ so that the angles are formed at O with OH.
The resulting free arms divide the quadrant into M very small arcs
with the common length $\varepsilon$. Let us select from among them the one
lying between the free arms of the angles $\theta_\nu$ and $\theta_{\nu+1}$. On being
rotated it forms a very small spherical zone, which when flattened out
to a strip possesses the length $2\pi\cos\theta_\nu$ and the height $\varepsilon$, so that the
area is then $2\pi\varepsilon\cos\theta_\nu$.
Since the sum of all the spherical zones obtained in this manner
gives the hemisphere, we obtain the equation
$$2\pi\varepsilon\left(\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}\right) = 2\pi$$
or, since $M\varepsilon = \pi/2$,
$$\frac{\cos\theta_0 + \cos\theta_1 + \cos\theta_2 + \cdots + \cos\theta_{M-1}}{M} = \frac{2}{\pi}.$$
Thus, we have obtained the mean value that we were seeking.
The mean value of the cosine function (naturally that of the sine function
also) over the interval 0 through $\pi/2$ is $2/\pi$.
[This also follows from formulas (1) and (2) of No. 15.]
At the same time we obtain
$$w = \frac{l}{d}\,m = \frac{l}{d} \cdot \frac{2}{\pi}$$
or
$$W = \frac{2}{\pi} \cdot \frac{l}{d}.$$
This formula gives us the probability we were seeking.
Note. Wolf in Zurich (1850) arrived at the original idea of using
the obtained formula to calculate the number $\pi$. Experimentally, by
a great number (5000) of throws with a needle 36 mm long and a
distance of 45 mm between the parallels, he found the probability W
to be (approximately) 0.5064, and obtained
$$\pi = \frac{2l}{dW} = 3.1596.$$
The Englishmen Smith (1855) and Fox (1864) repeated the
experiment and found with 3200 and 1100 throws, respectively, values of
3.1553 and 3.1419 for $\pi$.
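Wolf's experiment can be imitated with pseudo-random throws. The Python
sketch below follows the special positions used in the derivation: the
end A falls at a random distance t from parallel I along PQ, and the
needle makes a random acute angle θ with the perpendicular. The function
name, the seed, and the number of throws are our own choices, and the
library value of π is used only to draw the angle.

```python
import math
import random

def buffon_w(l, d, throws, seed=0):
    """Estimate the probability w that a needle of length l < d touches parallel I."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(throws):
        t = rng.uniform(0.0, d)                    # distance of end A from parallel I
        theta = rng.uniform(0.0, math.pi / 2)      # inclination of the needle toward QP
        if l * math.cos(theta) >= t:               # the needle reaches parallel I
            hits += 1
    return hits / throws

# Wolf's dimensions: needle 36 mm, parallels 45 mm apart
W_estimate = buffon_w(36, 45, throws=200_000)
pi_estimate = 2 * 36 / (45 * W_estimate)           # pi = 2l / (dW)
```

The statistical error shrinks only like the reciprocal square root of
the number of throws, so even several thousand throws fix π to little
more than two decimal places.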
The Fermat-Euler Prime Number Theorem
Every prime number of the form 4n + 1 can be represented in only one
manner as the sum of two squares.
This famous theorem was discovered about 1660 by Pierre de
Fermat (1601-1665), the greatest French mathematician of the
seventeenth century. It was not published, however, until 1670, when
it appeared, unfortunately without proof, in the notes to the works of
Diophantus, edited by Fermat's son. It is not certain whether or not
Fermat had obtained the proof.
The first proof of the theorem was presented almost 100 years later
by Leonhard Euler in his treatise " Demonstratio theorematis
Fermatiani, omnem numerum primum formae 4n + 1 esse summam
duorum quadratorum" (Novi Commentarii Academiae Petropolitanae ad
annos 1754-1755, vol. V), after years of fruitless attempts at its solution.
Today there are several proofs of the Fermat-Euler theorem. The
following proof is distinguished by its great simplicity.
For the reader who is unfamiliar with problems of number theory
we will provide several explanations that will be necessary for
understanding this proof and will also be found useful for the problem
dealt with in No. 22. At the same time, it is to be understood that
the letters used here and in No. 22 represent whole numbers.
Two numbers a and b (according to Gauss), are called congruent to
the modulus m,
written: a ≡ b mod m, read a congruent to b modulo m,
when their difference is divisible by m. Every number, for example,
in regard to the modulus (to the modulus, modulo) m, is congruent to
the residue it leaves over when divided by m, for example 65 ≡ 2 mod 7.
And this is also true when the word residue is taken in its most general
sense, in which it means the residue left after division when the quotient
is arbitrarily chosen. If, for instance, we write 65/7 = 12, we remain
with a residue of −19.
Among the many possible residues two are of special importance:
the conventional or common residue, which is positive and smaller than
the divisor, and the minimal residue, the magnitude of which never
exceeds half the divisor. A minimal residue of the division 89/13 is,
for example, −2, because 89/13 = 7 − 2/13, which can also be written
89 ≡ −2 mod 13.
The following self-evident rules apply to congruences to the same
modulus:
1. If two numbers are congruent to a third, they are also congruent to each
other.
2. Two congruences can be added, subtracted, and multiplied.
From
A ≡ B mod m, a ≡ b mod m
it follows that
A ± a ≡ B ± b mod m
and
Aa ≡ Bb mod m.
[From A = B + Gm and a = b + gm it follows, for example,
that $Aa = Bb + \mathfrak{g}m$ (with the integer $\mathfrak{g} = Bg + bG + Ggm$),
i.e., Aa ≡ Bb mod m.]
3. The congruence
a ≡ b mod m
may be multiplied by any whole number g:
ag ≡ bg mod m.
It can be divided by g only when g is a common divisor of a and b
that has no common divisor with the modulus. If, for example, we
divide 49 ≡ 14 mod 5 by 7, we obtain a correct congruence
7 ≡ 2 mod 5.
A system of m integral numbers no two of which are congruent to
the modulus m is called a complete residue system to the modulus m.
The simplest complete residue system is the system of the m common
residues 0, 1, 2,..., m — 1, and the next simplest is the system of m
minimal residues.
Every number z is congruent to the modulus m to one and only
one number of a complete residue system mod m.
Of particular importance is the following theorem:
Theorem : If the numbers of a complete residue system are multiplied by a
number possessing no common divisor with the modulus, there is obtained once
again a complete residue system with respect to the modulus.
Proof. Let $m$ be the modulus, $a$ the multiplier possessing no
common divisor with $m$. If then for two different numbers $x$ and $x'$
of the given residue system $ax \equiv ax' \bmod m$ were true, it would follow
from congruence rule 3 that $x \equiv x' \bmod m$, which, however, is not
the case.
From this theorem it follows directly that:
The congruence
$ax \equiv b \bmod m$,
in which $a$ and $m$ possess no common divisor, possesses in each complete
residue system mod $m$ one and only one "root" $x$.
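Both the theorem and its corollary are easy to check numerically. The following Python sketch is an illustration under our own naming; it relies on the built-in `pow(a, -1, m)` (Python 3.8 or later) to obtain the multiplicative inverse, which is one possible way of finding the root and not the method of the text.

```python
from math import gcd


def multiplied_system(a: int, m: int) -> list:
    """Multiply the common residues 0..m-1 by a and reduce them mod m."""
    return sorted((a * x) % m for x in range(m))


def solve_linear_congruence(a: int, b: int, m: int) -> int:
    """Return the single root x in 0..m-1 of a*x = b (mod m).

    Assumes that a and m possess no common divisor.
    """
    if gcd(a, m) != 1:
        raise ValueError("a and m must possess no common divisor")
    return (pow(a, -1, m) * b) % m


# Multiplying by 7 only permutes the complete residue system mod 12.
print(multiplied_system(7, 12) == list(range(12)))  # True
print(solve_linear_congruence(3, 4, 7))             # 6, since 3*6 = 18 = 4 mod 7
```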
Quadratic Residues
Of two numbers possessing no common divisor one is called the
quadratic residue of the other when it is congruent to a square number
with respect to the other as modulus; if there is no such square
number it is called a quadratic nonresidue. For example, 12 is a
quadratic residue of 13, since $12 \equiv 8^2 \bmod 13$; $-1$ is a quadratic
nonresidue of 3, since there exists no square number $x^2$ such that
$x^2 \equiv -1 \bmod 3$.
The following theorems concerning quadratic residues and nonresidues
apply to an odd prime number modulus $p$:
I. There are a total of $\mathfrak{p} = (p - 1)/2$ mutually incongruent quadratic
residues and just as many mutually incongruent nonresidues of $p$. The
former are $1^2, 2^2, 3^2, \ldots, \mathfrak{p}^2$, or whichever numbers are congruent to them
mod $p$.
II. The product of two residues is a residue, the product of a residue and a
nonresidue is a nonresidue, and finally, the product of two nonresidues is a residue.
Proof of I. 1. If two of the designated squares were congruent to
each other, for example $x^2 \equiv y^2 \bmod p$, the product $(x + y)(x - y)$
[which is equal to $x^2 - y^2$] would be divisible by $p$, which is
impossible, because both of its factors are different from zero and
smaller than $p$ in magnitude.
2. If we continue the series of squares beyond $\mathfrak{p}^2$, no new residues
are obtained. The square $(\mathfrak{p} + h)^2$, for example, is congruent to
$k^2 \bmod p$ if $k \le \mathfrak{p}$ is so determined that $\mathfrak{p} + h + k$ is divisible by $p$,
since then $\mathfrak{p} + h \equiv -k$ and moreover $(\mathfrak{p} + h)^2 \equiv k^2 \bmod p$. Since
there are (aside from the number divisible by $p$, disregarded here) $2\mathfrak{p}$
numbers mutually incongruent mod $p$, there must be a total of $\mathfrak{p}$
mutually incongruent quadratic nonresidues of $p$.
Proof of II. Let $R$ and $r$ be quadratic residues, $N$ and $n$ quadratic
nonresidues of $p$.
1. From $A^2 \equiv R$, $a^2 \equiv r \bmod p$ we obtain by multiplication
$(Aa)^2 \equiv Rr \bmod p$. Consequently, $Rr$ is a residue.
2. The $2\mathfrak{p}$ numbers $1^2, 2^2, \ldots, \mathfrak{p}^2, N \cdot 1^2, N \cdot 2^2, \ldots, N \cdot \mathfrak{p}^2$ are mutually
incongruent mod $p$. Since the first $\mathfrak{p}$ of these numbers are quadratic
residues of $p$, and since only $\mathfrak{p}$ residues exist, the $\mathfrak{p}$ numbers $N \cdot 1^2,
N \cdot 2^2, \ldots, N \cdot \mathfrak{p}^2$ must be nonresidues, i.e., $NR$ is a nonresidue.
3. The $2\mathfrak{p}$ numbers $n \cdot 1^2, n \cdot 2^2, \ldots, n \cdot \mathfrak{p}^2, n \cdot N \cdot 1^2, n \cdot N \cdot 2^2, \ldots,
n \cdot N \cdot \mathfrak{p}^2$ are mutually incongruent mod $p$. The first $\mathfrak{p}$ of these numbers
are nonresidues in accordance with 2.; consequently, the others must
be residues in accordance with I.; however, among them is the product
of the two nonresidues $N$ and $n$. Q.E.D.
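Theorems I and II can be confirmed by brute force for small primes. The sketch below is an illustration only; the helper names are hypothetical.

```python
def quadratic_residues(p: int) -> set:
    """Residues of the squares 1^2, 2^2, ..., ((p-1)/2)^2 mod the odd prime p."""
    half = (p - 1) // 2
    return {(x * x) % p for x in range(1, half + 1)}


def product_rule_holds(p: int) -> bool:
    """Check theorem II: a product is a residue exactly when both factors
    are residues or both factors are nonresidues."""
    residues = quadratic_residues(p)
    for u in range(1, p):
        for v in range(1, p):
            same_kind = (u in residues) == (v in residues)
            if ((u * v) % p in residues) != same_kind:
                return False
    return True


print(sorted(quadratic_residues(13)))  # [1, 3, 4, 9, 10, 12]
print(product_rule_holds(13))          # True
```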
Let us now consider the bilinear congruence
(0) $xy \equiv D \bmod p$,
in which the modulus $p$ is once again an odd prime number, $D$ a given
number possessing no common divisor with $p$, and the "mutually
conjugate" or "linked" magnitudes $x$ and $y$ are chosen in such a
manner from the system $S$ of the numbers $1, 2, 3, \ldots, p - 1$ that (0)
is satisfied. For each $x$ from $S$ there is then only one conjugate $y$.
[From $xy \equiv D \bmod p$ and $xy' \equiv D \bmod p$ it follows that $xy \equiv xy'
\bmod p$ and from this $y \equiv y' \bmod p$ or $y - y' \equiv 0 \bmod p$. However,
since both $y$ and $y' \le p - 1$, their difference is divisible by $p$ only
when $y' = y$.]
We select $x_1$ arbitrarily from $S$ and determine $y_1$ such that
$x_1 y_1 \equiv D \bmod p$.
Then we select from $S$ a number $x_2$ that differs from $x_1$ and $y_1$ and
determine $y_2$ such that
$x_2 y_2 \equiv D \bmod p$.
$y_2$ then is different from $x_1$ as well as from $y_1$.
We continue in this manner until all the numbers of $S$ have been
arranged in the resulting congruences.
Here there are two cases to be distinguished:
1. $y_\nu$ never equals $x_\nu$. In other words: the congruence $x^2 \equiv
D \bmod p$ is impossible; $D$ is a quadratic nonresidue of $p$. We then obtain
exactly $\mathfrak{p} = (p - 1)/2$ pairs $x_\nu, y_\nu$ of conjugate numbers, and
multiplication of the $\mathfrak{p}$ congruences formed gives
(1) $(p - 1)! \equiv D^{\mathfrak{p}} \bmod p$.
2. For a certain index $\nu$, $y_\nu = x_\nu$, thus $x_\nu^2 \equiv D \bmod p$; $D$ is a
quadratic residue of $p$. If aside from $\nu$ there is also an index $\mu$ for which
the same occurs, then $x_\mu^2 \equiv D \bmod p$, and so $x_\mu^2 \equiv x_\nu^2 \bmod p$, i.e.,
$x_\mu^2 - x_\nu^2$ or $(x_\mu + x_\nu)(x_\mu - x_\nu)$ is divisible by $p$. Since $x_\mu - x_\nu$ is not
divisible by $p$, $x_\mu + x_\nu$ must be divisible by $p$, and consequently
$x_\mu = p - x_\nu$. Actually, then $x_\mu^2 = p^2 - 2px_\nu + x_\nu^2 \equiv x_\nu^2 \equiv D \bmod p$.
Equal linked magnitudes thus occur exactly twice if they occur at all.
In our case ($y_\nu = x_\nu$, $y_\mu = x_\mu$) we now have only $\mathfrak{p} - 1$ congruences
$x_s y_s \equiv D \bmod p$, where $y_s$ differs from $x_s$. To these $\mathfrak{p} - 1$
congruences we add the congruence
$x_\nu x_\mu \equiv -D \bmod p$,
multiply all $\mathfrak{p}$ congruences and obtain
(2) $(p - 1)! \equiv -D^{\mathfrak{p}} \bmod p$.
This is the case when, for example, $D = 1$, since then $1^2 \equiv D \bmod p$.
Then we have the congruence
(2a) $(p - 1)! \equiv -1 \bmod p$,
which represents the so-called Wilson theorem.
Using Wilson's formula we write instead of (1) and (2)
(1a) $D^{\mathfrak{p}} \equiv -1 \pmod p$, (2a) $D^{\mathfrak{p}} \equiv 1 \pmod p$
and obtain
Euler's theorem: The number $D$ that possesses no common divisor with
the odd prime number $p$ is either a quadratic residue or nonresidue of $p$, depending
on whether $D^{\mathfrak{p}}$, with $\mathfrak{p} = (p - 1)/2$, is congruent mod $p$ to the positive or negative unit.
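A short numerical check of Wilson's theorem and of Euler's criterion may be helpful. This is a sketch under our own naming; it uses Python's three-argument `pow` for modular exponentiation.

```python
from math import factorial


def wilson_holds(p: int) -> bool:
    """True when (p-1)! is congruent to -1 mod p."""
    return factorial(p - 1) % p == p - 1


def euler_criterion(D: int, p: int) -> int:
    """Return +1 or -1 according to D^((p-1)/2) mod p.

    Assumes p is an odd prime that does not divide D.
    """
    t = pow(D, (p - 1) // 2, p)
    return -1 if t == p - 1 else t


print(wilson_holds(13))         # True
print(euler_criterion(12, 13))  # 1, since 12 = 8^2 mod 13
print(euler_criterion(-1, 3))   # -1, since -1 is a nonresidue of 3
```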
The introduction of the Legendre symbol makes it possible to express
this criterion of the residue character of a number by a formula. The
Legendre symbol $\left(\frac{D}{p}\right)$ represents the positive or negative unit,
depending on whether $D$ is a quadratic residue or nonresidue of
$p$. Thus, for example, $\left(\frac{2}{7}\right) = 1$, since $3^2 - 2$ is divisible by 7, whereas
$\left(\frac{2}{3}\right) = -1$, since there is no square number whose difference from 2 is
divisible by 3.
When this symbol is used Euler's criterion assumes the simple form
(3) $\left(\frac{D}{p}\right) \equiv D^{\mathfrak{p}} \bmod p$, with $\mathfrak{p} = \frac{p - 1}{2}$.
In the simple case $D = -1$, congruence (3) is transformed into the
equation
(4) $\left(\frac{-1}{p}\right) = (-1)^{(p-1)/2}$,
since in this case both sides of (3) are units, and the difference between
two units is divisible by the odd prime number $p$ only when these
units are equal.
Now $\frac{p - 1}{2}$ is even or odd, depending on whether the prime number
$p$ is of the form $4n + 1$ or $4n + 3$. In the first case, then, $\left(\frac{-1}{p}\right) = +1$,
i.e., $-1$ is a quadratic residue of $p$, and in the second case
$\left(\frac{-1}{p}\right) = -1$, i.e., $-1$ is a quadratic nonresidue of $p$. Consequently,
the following is true:
Theorem of Euler: The negative unit is a quadratic residue of the prime
number $p$ when $p$ has the form $4n + 1$ and a quadratic nonresidue when $p$
has the form $4n + 3$.
In other words: The pure quadratic congruence
$x^2 + 1 \equiv 0 \bmod p$
has integral solutions $x$ when $p$ has the form $4n + 1$ and has none when $p$ has
the form $4n + 3$.
Now for the proof of the Fermat-Euler theorem!
The following proof is based upon the above theorems and the
Norm theorem : If a prime number goes into a norm but not into the
bases of the norm, it is itself a norm.
A norm is understood to mean the sum of the squares of two whole
numbers, which are the "bases" of the norm.
Proof of the norm theorem. Let the prime number $p$ go into the
norm $a^2 + b^2$, but not into its bases $a$ and $b$, so that
(5) $a^2 + b^2 = pf$,
it being assumed that the factor $f$ is greater than 1 but smaller than
$p/2$. This assumption does not represent a limitation of the theorem,
since from $A^2 + B^2 = pF$, with $F > p/2$, we can immediately
form the equation $a^2 + b^2 = pf$, with $f < p/2$, if the minimal
residues $A - hp$ and $B - kp$ of the divisions $A/p$ and $B/p$, respectively,
are taken for $a$ and $b$, respectively. On the one hand,
$a^2 + b^2 = [A^2 + B^2] - 2(Ah + Bk)p + (h^2 + k^2)p^2$
is divisible by $p$, and thus
$a^2 + b^2 = pf$,
while on the other hand, since $|a| < \frac{1}{2}p$ and $|b| < \frac{1}{2}p$, $a^2 + b^2$ is
smaller than $\frac{1}{2}p^2$ or $pf < \frac{1}{2}p^2$ or $f < \frac{1}{2}p$. Moreover, $p$ does not go
into either $a$ or $b$, because then (contrary to our assumption) it would
go into $A = a + hp$ or into $B = b + kp$.
We determine the minimal residues $\alpha = a - mf$ and $\beta = b - nf$ of
the divisions $a/f$ and $b/f$ and obtain similarly
(6) $\alpha^2 + \beta^2 = ff'$, with $f' \le \frac{1}{2}f$.
Multiplication of (5) and (6) gives us
$(a^2 + b^2)(\alpha^2 + \beta^2) = pf^2f'$
or
$(a\alpha + b\beta)^2 + (a\beta - b\alpha)^2 = pf^2f'$.
Since
$a\alpha + b\beta = [a^2 + b^2] - (am + bn)f = a'f$,
$a\beta - b\alpha = (bm - an)f = b'f$,
the equation obtained is written
(7) $a'^2 + b'^2 = pf'$, where $f' \le \frac{1}{2}f$.
Here $f'$ cannot vanish. If $f' = 0$, then in accordance with (6)
$\alpha = 0$ and $\beta = 0$, and from this it follows that $a = mf$ and $b = nf$;
then according to (5) $p = (m^2 + n^2)f$. In this event $p$ would have to
be divisible by $f$, and then $f$ would have to equal 1, which contradicts
our premise.
If, then, $f' = 1$, (7) already gives us the norm expression of $p$.
If $f' > 1$, we obtain from (7)
(8) $a''^2 + b''^2 = pf''$ with $0 < f'' \le \frac{1}{2}f'$,
just as (7) was obtained from (5). This method of constructing new
equations with continuously diminishing factors $f, f', f'', \ldots$ is
continued until the factor 1 appears. The corresponding equation
gives the prime number $p$ represented as a norm.
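The descent used in this proof translates directly into a small procedure. The following Python sketch is an illustration under stated assumptions: the function names are ours, the input is a pair $a, b$ with $a^2 + b^2$ divisible by the prime $p$ while $p$ divides neither base, and the loop simply repeats the step from (5) to (7) until the factor 1 appears.

```python
def minimal_residue(n: int, m: int) -> int:
    """Residue of n mod m of smallest magnitude, |r| <= m/2."""
    r = n % m
    return r - m if 2 * r > m else r


def descend_to_norm(a: int, b: int, p: int) -> tuple:
    """Return bases (a, b) with a^2 + b^2 == p.

    Assumes p is prime, p divides a^2 + b^2, and p divides neither a nor b.
    """
    # First reduce the bases mod p so that the factor f is below p/2.
    a, b = minimal_residue(a, p), minimal_residue(b, p)
    f = (a * a + b * b) // p
    while f > 1:
        # Minimal residues alpha, beta of a and b with respect to f, as in (6).
        alpha, beta = minimal_residue(a, f), minimal_residue(b, f)
        # New bases a', b' of equation (7); both divisions are exact.
        a, b = (a * alpha + b * beta) // f, (a * beta - b * alpha) // f
        f = (a * a + b * b) // p
    return abs(a), abs(b)


# 12^2 + 1^2 = 145 = 29 * 5, and the descent yields 29 = 5^2 + 2^2.
print(descend_to_norm(12, 1, 29))  # (5, 2)
```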
Now we will prove
I. A prime number q of the form 4n + 3 cannot be represented as a norm.
II. Every prime number p of the form 4n + 1 can be represented as a norm
in only one way.
Proof of I. If it were true that
$a^2 + b^2 = q$,
then it would follow that
$b^2 \equiv -a^2 \bmod q$
and the product $(-1)(a^2)$ of a quadratic nonresidue $(-1)$ and a
residue $(a^2)$ of $q$ would be a quadratic residue $(b^2)$ of $q$, which
according to the above is impossible.
Proof of II. According to Euler's theorem there is a whole
number $x$ such that the norm $x^2 + 1$ is divisible by $p$. According to
the norm theorem, $p$ is then itself a norm:
$p = a^2 + b^2$.
Here also there is only one possible norm representation.
If we assume a second such representation:
$p = A^2 + B^2$
(where $a, b, A, B$ represent four different positive numbers), it follows
that
$p^2 = (a^2 + b^2)(A^2 + B^2) = (Aa \pm Bb)^2 + (Ab \mp Ba)^2$,
where either the two upper signs or the two lower signs are possible.
Then, since the product of the two factors $Aa + Bb$ and $Aa - Bb$:
$A^2a^2 - B^2b^2 = A^2(a^2 + b^2) - b^2(A^2 + B^2)$
is divisible by $p$, one of the factors must be divisible by $p$.
Consequently, we select the upper or lower signs depending upon whether
the first or second factor is divisible by $p$. Then either
$Aa + Bb = p$ and at the same time $Ab - Ba = 0$
or
$Ab + Ba = p$ and at the same time $Aa - Bb = 0$,
thus, either $A^2b^2 = B^2a^2$ or $A^2a^2 = B^2b^2$.
From the first of these equations it follows that
$\frac{A^2}{a^2} = \frac{B^2}{b^2} = \frac{A^2 + B^2}{a^2 + b^2} = 1$,
and from the second
$\frac{A^2}{b^2} = \frac{B^2}{a^2} = \frac{A^2 + B^2}{b^2 + a^2} = 1$;
thus, from the first $A = a$, while from the second $A = b$, both of which
contradict the initial assumption, which requires that $A \ne a$ and
$A \ne b$. There is therefore only one way of representing $p$ as a norm,
and the Fermat-Euler theorem is proved.
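Statements I and II can be verified by direct search for small primes. The sketch below is illustrative only (the helper names are hypothetical); it counts the representations $n = a^2 + b^2$ with $1 \le a \le b$.

```python
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers tested here."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))


def norm_representations(n: int) -> list:
    """All pairs (a, b) with 1 <= a <= b and a^2 + b^2 == n."""
    reps = []
    for a in range(1, isqrt(n) + 1):
        rest = n - a * a
        b = isqrt(rest)
        if b >= a and b * b == rest:
            reps.append((a, b))
    return reps


print(norm_representations(13))  # [(2, 3)]
print(norm_representations(7))   # []
```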
The Fermat Equation
Find the integral solutions of the equation
$x^2 - dy^2 = 1$,
in which $d$ is a nonquadratic positive whole number.
This extremely important problem of number theory was posed by
Pierre Fermat in 1657, first to his friend Frenicle and then to all
contemporary mathematicians.
The first solution, a very complicated one, was obtained by the
Englishmen Lord Brouncker and John Wallis.
The simplest and best solutions to this problem were discovered by
Euler, Lagrange, and Gauss. [Euler: "De usu novi algorithmi...,"
Novi Commentarii Academiae Petropolitanae ad annum 1765. Lagrange:
"Solution d'un probleme d'arithmetique," Miscellanea Taurinensia,
vol. IV, 1768. Gauss: Disquisitiones arithmeticae, 1801.] They are
all based upon the properties of periodic continued fractions.
We will examine a somewhat modified form of this method with the
more general equation
$X^2 - DY^2 = 4$,
which includes the original Fermat equation (with $X = 2x$, $Y = y$,
$D = 4d$) as a special case, but includes as well the case in which $D$
leaves a residue of 1 on being divided by 4.
For the sake of convenience we shall write the continued fraction
$a + \cfrac{1}{b + \cfrac{1}{c + \cfrac{1}{d + \cdots}}}$
in the abbreviated form $(a, b, c, d, \ldots)$.
A purely periodic continued fraction with an $n$-term period has the
form
$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots)$,
so that we may write
$u = (g_1, g_2, \ldots, g_N, u)$,
where $N$ is an integral multiple of $n$, which we will assume to be even
for reasons presently to be described. The terms (partial
denominators) $g_1, g_2, \ldots$ are assumed to be positive whole numbers. If we
designate the numerator and denominator of the $N$th approximation
$(g_1, g_2, \ldots, g_N)$ and of the $(N - 1)$th approximation $(g_1, g_2, \ldots,
g_{N-1})$ as $P$ and $Q$ and $p$ and $q$, respectively, then according to
continued fraction theory we obtain the two equations
(1) $Pq - Qp = 1$ and (2) $u = \frac{Pu + p}{Qu + q}$,
the second of which may also be expressed in the form
(2a) $Qu^2 - Hu - p = 0$ with $H = P - q$.
The discriminant $D = H^2 + 4Qp$ of the quadratic equation (2a)
has, according to (1), the value $H^2 + 4Pq - 4 = (P + q)^2 - 4$; it is
consequently smaller by 4 than a square number and therefore
cannot itself be a square number. Its (positive) root $r = \sqrt{D}$ is
therefore irrational. Moreover, since $r > H$ (because $r^2 = H^2 +
4Qp$), the second root $\bar{u} = (H - r)/2Q$ of the quadratic equation is
negative, so that the first root $(H + r)/2Q$ represents our (improperly
fractionated) continued fraction $u$. To obtain information about the
magnitude of $\bar{u}$ we form the product of the roots $u\bar{u} = -p/Q$ and
obtain
$-\bar{u} = \frac{p/Q}{u}$.
Since $P > p$ and $Q > q$, then
$-\bar{u} < \frac{P/Q}{u}$ and $-\bar{u} < \frac{p/q}{u}$.
One of the right-hand fractions, however, is a proper fraction, since
the value $u$ of the continued fraction lies between the two successive
approximations $p/q$ and $P/Q$; therefore, $-\bar{u}$ must be a proper fraction.
A quadratic equation with integral coefficients and a nonquadratic
discriminant whose first root is a positive and improper fraction while
the second root is a negative proper fraction is called a reduced equation,
and its first root is called a reduced number. Our conclusion therefore
reads:
Every purely periodic, improperly fractionated, continued fraction is a
reduced number.
We will now show conversely that the continued fraction of a
reduced number is purely periodic.
First, we will solve the problem:
Obtain the first root $u = (r - b)/2a$ of the quadratic equation
(3) $au^2 + bu + c = 0$
with integral indivisible coefficients and the positive nonquadratic discriminant
$D = r^2 = b^2 - 4ac$ in the form of a continued fraction.
We write
$u = g + \frac{1}{u'}$,
where $g$ is the largest whole number below $u$ (in the following to be
designated as $[u]$) and $u'$ a positive improper fraction. We introduce
three new magnitudes $a', b', c'$ that are of the opposite sign and equal
to the magnitudes $ag^2 + bg + c$, $2ag + b$, and $a$, and we obtain
$u' = \frac{1}{u - g} = \frac{2a}{r + b'} = \frac{2a(r - b')}{r^2 - b'^2} = \frac{r - b'}{2a'}$
with
$b'^2 - 4a'c' = b^2 - 4ac = D$.
Consequently, $u'$ is the first root of the quadratic equation
(3') $a'u'^2 + b'u' + c' = 0$,
which likewise belongs to the discriminant $D$ and possesses coefficients
having no common divisor. (If $a', b', c'$ possessed a common divisor,
the latter, because of the equations $-c' = a$, $-b' = 2ag + b$, $-a' =
ag^2 + bg + c$, would go into $a, b, c$, which contradicts our assumption.)
We call the new equation (3') the derivative of the initial equation
(3) and its first root $u'$ the derivative of $u$.
The new coefficients $a', b', c'$ are calculated in practice in accordance
with the following system:
    a      b            c
           ga
           ga + b   ->  g(ga + b)
    a'     b'           c'
We add the two terms of the third column and change the sign of the
sum, thus obtaining $a'$. We add the two lower terms of the second
column, change the sign of the sum and get $b'$. We change the sign
of $a$ and get $c'$.
The derived quadratic equation (3') is treated in exactly this manner
and the process continued as far as desired. The following example
is presented to make the process completely clear.
Expand the positive root of the quadratic equation
$3u^2 - 10u - 1 = 0$
into a continued fraction. The discriminant is 112, thus $r = 10.58\ldots$
In the scheme we will write in only the coefficients of the successive
quadratic equations each of which is the derivative of the preceding
one. In the last column we will write the first root of the appropriate
equation and the highest integer contained in it, which is at the same
time the correct partial denominator of the continued fraction.
    a     b     c     first root
    3   -10    -1     (10.58... + 10)/6 = 3 + ...
          9
         -1    -3
    4    -8    -3     (10.58... + 8)/8 = 2 + ...
          8
          0     0
    3    -8    -4     (10.58... + 8)/6 = 3 + ...
          9
          1     3
    1   -10    -3     (10.58... + 10)/2 = 10 + ...
         10
          0     0
    3   -10    -1
Since we come back to the initial equation, the expansion is purely
periodic, and we obtain
$\frac{\sqrt{112} + 10}{6} = (3, 2, 3, 10, 3, 2, 3, 10, \ldots)$.
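The scheme lends itself to mechanical computation. The Python sketch below is an illustration with names of our own choosing; it assumes a reduced equation with $a > 0$ and nonquadratic discriminant, and it uses the fact that for irrational $r$ and positive $a$ the greatest integer in $(r - b)/2a$ can be found from the integer part of $r$ alone.

```python
from math import isqrt


def derive(a: int, b: int, c: int) -> tuple:
    """One step of the scheme for the first root of a*u^2 + b*u + c = 0.

    Returns the partial denominator g and the coefficients (a', b', c')
    of the derived equation. Assumes a > 0 and a nonquadratic discriminant.
    """
    D = b * b - 4 * a * c
    g = (isqrt(D) - b) // (2 * a)  # greatest integer in (r - b) / 2a
    return g, (-(a * g * g + b * g + c), -(2 * a * g + b), -a)


def period(a: int, b: int, c: int, max_steps: int = 10_000) -> list:
    """Partial denominators of one period of the first root of a reduced equation."""
    start = (a, b, c)
    equation = start
    terms = []
    for _ in range(max_steps):
        g, equation = derive(*equation)
        terms.append(g)
        if equation == start:  # back at the initial equation: period complete
            return terms
    raise ValueError("no period found; was the equation reduced?")


print(period(3, -10, -1))  # [3, 2, 3, 10]
```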
Now for the proof of the theorem that the expansion of a reduced
number yields a purely periodic continued fraction!
Since the first root $u$ of the reduced equation
$au^2 + bu + c = 0$
is a positive improper fraction, and the second one, $\bar{u}$, is a negative
proper fraction, then according to the relations
$u\bar{u} = \frac{c}{a}$, $u + \bar{u} = -\frac{b}{a}$
between roots and coefficients, both the free term $c$ and the coefficient $b$ of
the linear term of a reduced equation are always negative (the coefficient $a$ is
assumed to be always positive).
In accordance with the expansion examined above we write
(4) $u = g + \frac{1}{u'}$
with $g = [u]$ and $u' > 1$. From $u' = 1/(u - g)$ it follows initially
that the first root $u'$ of the derived equation is a positive improper
fraction. If we then transform $r$ into $-r$ in the equation $u' =
1/(u - g)$, the equation assumes the form $\bar{u}' = 1/(\bar{u} - g)$ and shows
that the second root $\bar{u}'$ is a negative proper fraction. The derivative
of a reduced equation or number is consequently also reduced, so that only
reduced numbers occur in the continued fraction expansion of a
reduced number.
If we write the last equation in the form
$-\frac{1}{\bar{u}'} = g - \bar{u}$,
we see that $g$ can also be taken as the greatest integer that is contained in the
reciprocal value of opposite sign of the second root of the derived equation.
Now, the number of all the reduced numbers corresponding to a
given discriminant $D$ is finite. (From $D = b^2 - 4ac$ and $-ac > 0$
it follows first that the $b$'s must be sought only among the numbers of
the series $-1, -2, \ldots, -[r]$. Of these the only ones that need be
considered are those for which $D - b^2$ is divisible by 4. We select
these, and for each such $b$ we determine the pairs of numbers $a, c$
[with $a > 0$, $c < 0$] for which $-ac = (D - b^2)/4$, which in turn gives
us a finite quantity of numbers $a$ and $c$. Each number triplet $a, b, c$
obtained in this way, however, leads to a reduced equation $au^2 + bu
+ c = 0$ and thus to a reduced number $u$ only when $2a$ lies between
$r + b$ and $r - b$.)
Consequently, in the continued fraction expansion of a reduced
number $U$ there must reappear after a finite number of steps a
reduced number previously obtained, e.g., in such manner:
$U = (K, L, u)$, $u = (h, k, l, u)$.
But since, in accordance with the above, both $l$ and $L$ represent the
greatest integer that is contained in the reciprocal value of $\bar{u}$ of
opposite sign, $L = l$. Similarly, we find that $K = k$.
Consequently,
$U = (k, l, h, k, l, h, \ldots)$,
i.e.: The expansion of a reduced number yields a purely periodic continued
fraction.
After these preliminaries the solution of the Fermat equation
becomes quite simple. We will show: I. that the continued fraction
expansion of any reduced number belonging to the discriminant $D$
yields an infinite number of solutions of the Fermat equation;
II. that every solution of the equation is obtained by this expansion.
I. Let
$u = (g_1, g_2, \ldots, g_n, g_1, g_2, \ldots, g_n, \ldots)$
be the positive root of the reduced equation
(5) $au^2 + bu + c = 0$
with the discriminant $D$ and coefficients possessing no common
divisor. Also, let
$\frac{P}{Q} = (g_1, g_2, \ldots, g_N)$
be an approximation of $u$ and the index number $N$ an even number that
is a multiple of $n$, and let
$\frac{p}{q} = (g_1, g_2, \ldots, g_{N-1})$
be the preceding approximate fraction; then, according to (2a),
(5') $Qu^2 - Hu - p = 0$ $(H = P - q)$.
Since the roots of (5) and (5') agree and the coefficients of (5)
possess no common divisor, it must be possible to obtain (5') from (5)
by multiplication with a certain whole number $y$, such that
(6) $y = \frac{Q}{a} = -\frac{H}{b} = -\frac{p}{c}$.
If we then introduce the whole number
(7) $x = P + q$,
we obtain from (6) and (7)
$x^2 - b^2y^2 = (P + q)^2 - (P - q)^2$
and
$4acy^2 = -4Qp$,
from which by addition we obtain
$x^2 - Dy^2 = 4(Pq - Qp)$,
and, using (1),
$x^2 - Dy^2 = 4$.
II. Conversely, now let $x \mid y$ represent a solution of the Fermat
equation
(8) $x^2 - Dy^2 = 4$
in nonevanescent positive integers $x$ and $y$ and let $u$ represent the first
root of a reduced equation
$au^2 + bu + c = 0$.
Making use of (6) and (7), we obtain the four nonevanescent
positive integers
$P = \frac{x - by}{2}$, $Q = ay$, $p = -cy$, $q = \frac{x + by}{2}$.
(It is immediately obvious that $Q$ and $p$ are such numbers, whereas
for $P$ and $q$ it follows from equation (8), if we make use also of the
equation $D = b^2 - 4ac$ to write:
$(x + by)(x - by) = x^2 - b^2y^2 = 4(1 - acy^2) = 4(1 + Qp)$.
We are then able to conclude from the appearance of the
nonevanescent integer on the right, which is divisible by 4, that the two
integral factors $2q$ and $2P$ of the product on the left-hand side have
to be even and not equal to zero.) According to (8) they satisfy the
equation
(9) $Pq - Qp = 1$.
If we then replace the coefficients $a, b, c$ in the reduced equation with
$Q/y$, $-(P - q)/y$, $-p/y$, we get
(10) $u = \frac{Pu + p}{Qu + q}$.
Before we get from here to the continued fraction expansion, we still
have to prove that $Q \ge q$.
It is true that $2(Q - q) = [2a - b]y - x$. Since the second root $\bar{u}$
of the reduced equation is a negative proper fraction, it follows that
$r + b < 2a$ or $2a - b > r$. Consequently,
$2(Q - q) > ry - x = (r^2y^2 - x^2)/(ry + x) = -4/(ry + x)$
or $(Q - q) > -2/(ry + x)$. However, since $D = r^2 = b^2 - 4ac$ is
at least equal to 5, $y$ is at least 1, and $x$ at least 3, it follows that
$ry + x > 5$ and from this $Q - q > -0.4$, i.e., $Q \ge q$. Q.E.D.
We now expand $P/Q$ into a continued fraction $(\gamma_1, \gamma_2, \ldots, \gamma_\nu)$
with the even number of terms $\nu$ in such a manner that between it
and the last approximate fraction $p'/q'$ there exists the relation
(9') $Pq' - Qp' = 1$.
From (9) and (9') it then follows by subtraction that
$P(q' - q) = Q(p' - p)$.
However, since $q \le Q$, $q' < Q$, and $(q' - q)$ is divisible by $Q$, $q'$ must
equal $q$ and therefore $p'$ must also equal $p$. We then obtain
$(\gamma_1, \gamma_2, \ldots, \gamma_\nu, u) = \frac{Pu + p}{Qu + q}$,
i.e., because of (10),
$u = (\gamma_1, \gamma_2, \ldots, \gamma_\nu, u)$.
Every solution $x \mid y$ of the Fermat equation can therefore be obtained
by the expansion of any reduced number $u$ as a continued fraction.
Final result: The Fermat equation
$x^2 - Dy^2 = 4$
has an infinite number of solutions; these can all be obtained in accordance with
rules (6) and (7) from the approximation values whose number of terms is even
and a whole multiple of the period length, obtained from the expansion as a
continued fraction of any arbitrarily selected reduced number belonging to the
discriminant $D$.
Example. Find the smallest solution $x \mid y$ of the Fermat equation
$x^2 - 112y^2 = 4$.
A reduced equation applying to the discriminant 112 is the equation
treated above
$3u^2 - 10u - 1 = 0$;
the expansion of the reduced number $u$ reads
$u = (3, 2, 3, 10, 3, 2, 3, 10, \ldots)$
and has a four-termed period. The first four approximate fractions
are
$\frac{3}{1}$, $\frac{7}{2}$, $\frac{p}{q} = \frac{24}{7}$, $\frac{P}{Q} = \frac{247}{72}$.
Since here $a = 3$, $b = -10$, $c = -1$, we find, in accordance with
rule (6), $y = Q/a$, and rule (7), $x = P + q$, that
$x = 254$, $y = 24$.
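The example can be reproduced with a few lines of Python. This is a sketch under our own naming; it takes the partial denominators of an even number of terms covering whole periods, forms the last two approximate fractions $p/q$ and $P/Q$, and applies $x = P + q$ and $y = Q/a$.

```python
def convergents(terms: list) -> list:
    """Successive approximate fractions (numerator, denominator) of (g1, g2, ...)."""
    p_prev, q_prev = 1, 0
    p, q = terms[0], 1
    result = [(p, q)]
    for g in terms[1:]:
        p, p_prev = g * p + p_prev, p
        q, q_prev = g * q + q_prev, q
        result.append((p, q))
    return result


def fermat_solution(a: int, terms: list) -> tuple:
    """Solution x | y of x^2 - D*y^2 = 4 from rules x = P + q and y = Q / a.

    `a` is the leading coefficient of the reduced equation; `terms` must hold
    an even number (at least two) of partial denominators making up whole periods.
    """
    approximations = convergents(terms)
    (p, q), (P, Q) = approximations[-2], approximations[-1]
    return P + q, Q // a


print(convergents([3, 2, 3, 10]))         # [(3, 1), (7, 2), (24, 7), (247, 72)]
print(fermat_solution(3, [3, 2, 3, 10]))  # (254, 24)
```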
It now remains to be shown that there is at least one reduced
number corresponding to each discriminant D.
1. If $D = 4n$ and $g$ is the maximum integer that is contained in
$\sqrt{n}$, then
$a = 1$, $b = -2g$, $c = g^2 - n$
are the coefficients of a reduced equation.
Proof. The discriminant of the equation is $b^2 - 4ac = 4n = D$.
Moreover,
$r + b < 2a < r - b$,
since
$2\sqrt{n} - 2g < 2 < 2\sqrt{n} + 2g$.
2. If $D = 4n + 1$ and $g$ is the largest integer for which $g^2 + g$ will
be smaller than $n$ (so that $(g + 1)^2 + (g + 1) > n$ or $g^2 + 3g + 2 >
n$), then
$a = 1$, $b = -(2g + 1)$, $c = g^2 + g - n$
are the coefficients of a reduced equation.
Proof. The discriminant of the equation is $b^2 - 4ac =
4n + 1 = D$. Also,
$r + b < 2a < r - b$,
since
$\sqrt{D} - (2g + 1) < 2 < \sqrt{D} + 2g + 1$.
(That $\sqrt{D} - 2g - 1 < 2$ follows from the above condition
$g^2 + 3g + 2 > n$. On multiplication by 4 and addition of 1 this gives
$4g^2 + 12g + 9 > 4n + 1$, i.e., it becomes $(2g + 3)^2 > D$.
From this it follows that
$2g + 3 > \sqrt{D}$ or $\sqrt{D} - 2g - 1 < 2$.)
Note. If we have found the minimal solution of the Fermat
equation (e.g., by the method just presented), we can find the other
solutions (we will consider only positive solutions) in a simpler manner
after Lagrange.
We assign to each solution $x \mid y$ the "Lagrange number"
$z = \frac{1}{2}(x + yr)$
and call $x$ and $y$ the components of the Lagrange number.
We will first prove the auxiliary theorem. The product and the
improperly fractionated quotient $\zeta = \frac{1}{2}(\xi + \eta r)$ of two Lagrange numbers
$Z = \frac{1}{2}(X + Yr)$ and $z = \frac{1}{2}(x + yr)$ is also a Lagrange number.
Proof. We immediately find that
$\zeta\bar{\zeta} = 1$ or $\xi^2 - D\eta^2 = 4$
with
$\xi = \frac{Xx \pm DYy}{2}$, $\eta = \frac{Yx \pm Xy}{2}$,
where the upper sign is used when we are concerned with the product
and the lower when we are concerned with the quotient.
From $X > rY$ and $x > ry$ it follows that $Xx > DYy$, so that $\xi$ is
positive in every case. From
$\frac{X^2}{Y^2} = D + \frac{4}{Y^2}$ and $\frac{x^2}{y^2} = D + \frac{4}{y^2}$
it follows in the case of $\zeta = Z/z$, since then $Y > y$, that $X/Y < x/y$
or $Yx > Xy$, so that $\eta$ is also positive in every case. Consequently, $\zeta$
is positive and improper because $\zeta\bar{\zeta} = 1$.
Now it merely remains to show that $\xi$ and $\eta$ are integers. Either $D$
is divisible by 4 or $D$ leaves a residue of 1 on division by 4. In the
first case $X$ and $x$ are even. In the second case every solution of the
Fermat equation consists either of two even or two odd numbers.
In all cases $\xi$ and $\eta$ are consequently integers.
The method mentioned above is based upon the theorem:
Every Lagrange number is a power of the smallest Lagrange number with an
integral exponent.
Proof. Let $x \mid y$ be the minimal solution of the Fermat equation
and thus $z = \frac{1}{2}(x + yr)$ the smallest Lagrange number. First it
follows from the auxiliary theorem that every power of $z$ is a Lagrange
number.
Now let $Z = \frac{1}{2}(X + Yr)$ be a Lagrange number that is not a power
of $z$. Then there must certainly exist two successive powers $\mathfrak{z} = z^n$
and $\mathfrak{z}' = z^{n+1}$ between which $Z$ is situated. From
$z^n < Z < z^{n+1}$
it follows on division by $z^n$ that
$1 < Z/\mathfrak{z} < z$.
Thus, the Lagrange number $\zeta = Z/\mathfrak{z}$ would be smaller than the
smallest Lagrange number $z$, which is naturally absurd.
Consequently, the only Lagrange numbers are the powers
$z, z^2, z^3, z^4, \ldots$
And the simplest way of finding the 2nd, 3rd, ... solution of the
Fermat equation is to find them as components of the Lagrange
numbers $z^2, z^3, \ldots$.
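The multiplication rule for Lagrange numbers gives a simple way of generating further solutions from the minimal one. The following sketch is illustrative; the function names are ours, and the product formulas are those of the auxiliary theorem with the upper sign.

```python
def lagrange_product(D: int, first: tuple, second: tuple) -> tuple:
    """Components of the product of two Lagrange numbers for discriminant D."""
    (X, Y), (x, y) = first, second
    return (X * x + D * Y * y) // 2, (Y * x + X * y) // 2


def solutions(D: int, minimal: tuple, count: int) -> list:
    """First `count` positive solutions of x^2 - D*y^2 = 4, i.e. the
    components of z, z^2, z^3, ... for the smallest Lagrange number z."""
    found = [minimal]
    while len(found) < count:
        found.append(lagrange_product(D, found[-1], minimal))
    return found


print(solutions(5, (3, 1), 3))  # [(3, 1), (7, 3), (18, 8)]
```

For the discriminant 112 the call `solutions(112, (254, 24), 3)` starts from the minimal solution found in the example above.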
The Fermat-Gauss Impossibility Theorem
Prove that the sum of two cubic numbers cannot be a cubic number.
Thus, what must be proved is that the equation
$x^3 + y^3 = z^3$
cannot be satisfied by nonevanescent integers $x, y, z$.
The theorem that we have to prove is a special case of the famous
Fermat impossibility theorem, which was expressed by Fermat in the
following way in the arithmetic of Diophantus, edited by Fermat's son,
and published in 1670:
" It is impossible to divide a cube into two cubes, a fourth power into two
fourth powers, and in general any power except the square into two powers
with the same exponents."
Fermat added: "I have discovered a truly wonderful proof of this,
but the margin (of the notebook) is too narrow to hold it."
Unfortunately, Fermat neglected to disclose this "wonderful proof."
Fermat's impossibility theorem became very famous as a result of
the fact that many of the greatest mathematicians since Fermat,
including Euler, Legendre, Gauss, Dirichlet, Kummer, and others
tried unsuccessfully to obtain the general proof of this theorem. To
the present day a proof of the impossibility of the equation
$x^n + y^n = z^n$
is known only for special values of the exponent $n$, e.g., for the values
from 3 to 100, and even this proof involves extraordinary complications
and difficulties.
In the following we will limit ourselves to the simplest case, the
case $n = 3$. The impossibility of the equation
$x^3 + y^3 = z^3$
was demonstrated by Euler in his algebra, which appeared in 1770,
and later by Gauss (Complete Works, vol. II). This problem shows,
as it often happens in mathematics, that the proof of a more general
theorem is easier to obtain than that of a special case. To prove the
impossibility of
(1) $a^3 + b^3 = c^3$
for the common integers $a, b, c$ Euler had to resort to a relatively
complicated method; Gauss, on the other hand, proved simply and
clearly the impossibility of the more general equation
(2) $\alpha^3 + \beta^3 = \gamma^3$
for any numbers $\alpha, \beta, \gamma$ of the form $xJ + yO$, where $x$ and $y$ are any
integers and
$J = \frac{1 + i\sqrt{3}}{2}$ and $O = \frac{1 - i\sqrt{3}}{2}$
are cube roots of the (negative) unit.
For convenience in notation we will call numbers of the form
$xJ + yO$ (in which $x$ and $y$ are integers) G-numbers.
That the case treated by Euler is simply a special case of (2) is
apparent from the fact that every integer $g$ is also a G-number:
$g = gJ + gO$.
The G-numbers (which are the integers of the so-called group of the
cubic unit roots) have many properties in common with common
integers. Readers unfamiliar with these properties will find all the
information necessary for an understanding of the Gauss proof in the
supplement provided on p. 100.
Gauss' Proof of the Impossibility of the Equation
(2) $\alpha^3 + \beta^3 = \gamma^3$.
First, let Greek letters designate G-numbers and small Roman
letters common integers.
We then replace $\alpha, \beta, \gamma$ with $\xi, \eta, -\zeta$, transforming (2) first into the
symmetrical equation
(3) $\xi^3 + \eta^3 + \zeta^3 = 0$,
of which we assume that no two of the three "bases" $\xi, \eta, \zeta$
have a common divisor; we will then refer to this equation as a
Gauss equation. [The assumption we have just made in no way
limits the proof. If, for example, $\xi$ and $\eta$ possessed a common prime
factor $\delta$, then, in accordance with (3), $\delta$ would also go into $\zeta^3$ and
consequently into $\zeta$, so that division by $\delta^3$ would eliminate the
divisor $\delta$ from (3).]
The impossibility of (3) is obtained from the two following theorems,
which we will derive from the assumption of the existence of (3).
I. In every Gauss equation one and only one of the three bases (we will call
it the special base) has the prime divisor $\pi = J - O$.
II. For every Gauss equation there is a second Gauss equation in which the
special base contains the divisor $\pi$ fewer times than the special base of the first
equation.
These two theorems, however, contradict each other. By
continued application of II. it is possible to obtain a Gauss equation that
no longer contains a special base, which contradicts theorem I.
Proof of I. If none of the three bases $\xi, \eta, \zeta$ were divisible by $\pi$,
then
$\xi^3 \equiv e$, $\eta^3 \equiv f$, $\zeta^3 \equiv g \bmod 9$ with $e^2 = f^2 = g^2 = 1$
and consequently, because of (3), $e + f + g \equiv 0 \bmod 9$, which is,
however, impossible. Therefore a situation such as the following
must exist:
$\zeta \equiv 0 \bmod \pi$, $\xi \not\equiv 0 \bmod \pi$, $\eta \not\equiv 0 \bmod \pi$.
Proof of II. It follows from $\zeta^3 \equiv 0 \bmod \pi^3$, according to (3), that
$\xi^3 + \eta^3 \equiv 0 \bmod \pi^3$, and since $\xi^3 \equiv e \bmod 9$, $\eta^3 \equiv f \bmod 9$, $e +
f \equiv 0 \bmod \pi^3$, then $e + f \equiv 0 \bmod 3$ must be true; from this it follows
that $f = -e$. Now $\xi^3 + \eta^3 \equiv e + f = 0 \bmod 9$, and consequently
$\zeta^3 \equiv 0 \bmod \pi^4$ and
$\zeta \equiv 0 \bmod \pi^2$.
From $\xi^3 + \eta^3 \equiv 0 \bmod \pi^3$ and the identity
$\xi^3 + \eta^3 = \varphi\psi\chi$,
where
$\varphi = \xi J + \eta O$, $\psi = \xi O + \eta J$, $\chi = \xi + \eta$,
it follows that at least one of the factors $\varphi, \psi, \chi$ is divisible by $\pi$.
From this and from $\varphi - \psi = (\xi - \eta)\pi$, $\varphi + \psi = \chi$ it follows that
each one of the factors $\varphi, \psi, \chi$ is divisible by $\pi$, so that
$\varphi = \pi\varphi'$, $\psi = \pi\psi'$, $\chi = \pi\chi'$.
Thus no pair of the numbers $\varphi', \psi', \chi'$ possesses a common divisor.
[If, for example, $\varphi'$ and $\psi'$ possessed a common divisor $\delta$, then
also $\varphi' - \psi'$, which equals $\xi - \eta$, and $\pi(\varphi' + \psi') = \xi + \eta$, and then also
$2\xi$ and $2\eta$ would be divisible by $\delta$, so that $\delta$ would be equal to 2.
Then we would either have $\xi = 2\lambda + \epsilon$, $\eta = 2\mu + \epsilon$, or $\xi = 2\lambda + \epsilon$,
$\eta = 2\mu - \epsilon$, with $\epsilon^3 = \pm 1$ and then $\varphi = 2\nu + \epsilon$ or $\varphi = 2\nu + \epsilon\pi$,
which, however, is not divisible by $\delta = 2$.]
If we now set $\zeta/\pi = \omega$, then
$\omega^3 = -\varphi'\psi'\chi'$ with $\varphi' + \psi' = \chi'$.
Since then no pair of $\varphi', \psi', -\chi'$ possesses a common divisor, these
three magnitudes down to the possible unit factors $\alpha, \beta, \gamma$ must be
cubes of the numbers $\rho, \sigma, \tau$, no pair of which possesses a common
divisor:
$\varphi' = \alpha\rho^3$, $\psi' = \beta\sigma^3$, $-\chi' = \gamma\tau^3$ with $\alpha^6 = \beta^6 = \gamma^6 = 1$,
so that
(4) $\omega^3 = \alpha\beta\gamma\rho^3\sigma^3\tau^3$, $\alpha\rho^3 + \beta\sigma^3 + \gamma\tau^3 = 0$.
However, if the cube of $\kappa = \omega/\rho\sigma\tau$ is the G-unit $\alpha\beta\gamma$, then, since
$\kappa^3 \equiv E \bmod 9$, $\alpha\beta\gamma \equiv E \bmod 9$ also, and consequently
$\alpha\beta\gamma = E$ with $E^2 = 1$.
From $\omega \equiv 0 \bmod \pi$ it follows, for example, that
$\tau \equiv 0 \bmod \pi$ and $\rho \not\equiv 0$, $\sigma \not\equiv 0 \bmod \pi$.
Then, however, $\rho^3 \equiv e$ and $\sigma^3 \equiv f \bmod 9$ ($e^2 = f^2 = 1$), and
consequently, according to (4), $e\alpha + f\beta \equiv 0 \bmod 3$, and from this
$e\alpha + f\beta = 0$. Thus, we obtain
$\beta = F\alpha$, $F\alpha^2\gamma = E$ (with $F^2 = 1$)
and from (4)
$F\alpha^3\rho^3 + \alpha^3\sigma^3 + E\tau^3 = 0$.
If we write here $\xi', \eta', \zeta'$ in place of $F\alpha\rho, \alpha\sigma, E\tau$, respectively, we finally
obtain the Gauss equation
(3') $\xi'^3 + \eta'^3 + \zeta'^3 = 0$,
into the special base $\zeta'$ of which the factor $\pi$ goes fewer times than into
the special base $\zeta$ of (3).
Supplement. Properties of G-numbers
I. The magnitudes J and O satisfy the following equations:
$J + O = 1$, $JO = 1$, $J^2 + O = 0$,
$O^2 + J = 0$, $J^3 = -1$, $O^3 = -1$.
II. The sum, difference, and products of G-numbers are also G-numbers.
The product of the two numbers $aJ + bO$ and $a'J + b'O$ is, for
example (according to I.), $pJ + qO$ with
$p = ab' + ba' - bb'$ and $q = ab' + ba' - aa'$.
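The product rule of II. can be checked numerically. The following Python sketch stores a G-number $aJ + bO$ as the integer pair (a, b). The names `g_mul` and `to_complex` are illustrative assumptions, not part of the text, and J and O are taken as $(1 \pm i\sqrt{3})/2$, which agrees with $J + O = 1$ and $JO = 1$.

```python
# A G-number aJ + bO is stored as the integer pair (a, b).
# Assumption: J = (1 + i*sqrt(3))/2 and O = (1 - i*sqrt(3))/2.
J = complex(0.5, 3 ** 0.5 / 2)
O = J.conjugate()

def g_mul(x, y):
    """Product pJ + qO with p = ab' + ba' - bb' and q = ab' + ba' - aa'."""
    a, b = x
    a2, b2 = y
    cross = a * b2 + b * a2
    return (cross - b * b2, cross - a * a2)

def to_complex(x):
    """Numerical value of aJ + bO, used only to cross-check g_mul."""
    a, b = x
    return a * J + b * O

x, y = (2, -3), (5, 7)
print(g_mul(x, y))  # (20, -11)
# The pair formula agrees with ordinary complex multiplication:
print(abs(to_complex(g_mul(x, y)) - to_complex(x) * to_complex(y)) < 1e-9)
```

Working with integer pairs keeps every product exact; the complex values serve only as an independent check.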
III. Norm. The norm of a complex number $z = \xi + i\eta$ is
commonly understood to be the product
$z_0 = N(z) = z\bar{z} = (\xi + i\eta)(\xi - i\eta) = \xi^2 + \eta^2$
of the two mutually conjugate numbers $z$ and $\bar{z} = \xi - i\eta$.
The norm of the G-number $aJ + bO$ accordingly has the value $a^2 +
b^2 - ab$. It is an integer that is never negative and vanishes only when a and b
are both zero. The smallest conceivable norms of G-numbers
are 1, 2, 3.
From
$a^2 + b^2 - ab = 1$
we obtain one of the six following cases:
a =   1    0   -1    0    1   -1
b =   0    1    0   -1    1   -1
There are thus six G-numbers:
J, -J, O, -O, 1, -1
with the norm 1.
The equation
$a^2 + b^2 - ab = 2$
has no solution in integers. There is consequently no G-number
whose norm is 2.
The equation
$a^2 + b^2 - ab = 3$
finally has six integral solutions
a = 1, b = -1; a = -1, b = 1; a = 1, b = 2;
a = -1, b = -2; a = 2, b = 1; a = -2, b = -1.
Accordingly, there are six G-numbers with the norm 3, the numbers
$\pi = J - O = i\sqrt{3}$, $\pi J$, $\pi O$, and their conjugates $\bar{\pi} = -\pi$, $-\pi O$,
$-\pi J$.
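The three case lists of III. can be reproduced by brute force. This is a minimal sketch with the hypothetical helper names `g_norm` and `pairs_with_norm`. The search bound is an assumption that is safe here because $2(a^2 + b^2 - ab) = a^2 + b^2 + (a - b)^2 \ge a^2$, so a norm of at most 3 forces $|a|, |b| < 3$.

```python
def g_norm(a, b):
    """Norm a^2 + b^2 - ab of the G-number aJ + bO."""
    return a * a + b * b - a * b

def pairs_with_norm(n, bound=4):
    """All integer pairs (a, b) with |a|, |b| <= bound whose norm equals n."""
    rng = range(-bound, bound + 1)
    return [(a, b) for a in rng for b in rng if g_norm(a, b) == n]

for n in (1, 2, 3):
    # Six pairs for n = 1, none for n = 2, six pairs for n = 3.
    print(n, pairs_with_norm(n))
```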
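The norm of the product of two numbers is equal to the product of the norms
of these numbers.
Proof. $N(\alpha\beta) = \alpha\beta\cdot\overline{\alpha\beta} = \alpha\beta\cdot\bar{\alpha}\bar{\beta} = \alpha\bar{\alpha}\cdot\beta\bar{\beta} = N(\alpha)\cdot N(\beta)$.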
IV. Units. A G-number $\varepsilon$ is called a unit, or more accurately a
G-unit, when its reciprocal value $\eta$ is also a G-number. From
$\varepsilon\eta = 1$ it follows from norm formation that $\varepsilon_0\eta_0 = 1$, i.e., $\varepsilon_0 = 1$.
According to III., there are consequently six G-units:
J, -J, O, -O, 1, -1.
These six units are the integral powers of J or O, e.g., $J$, $J^2$, $J^3$, $J^4$, $J^5$,
and $J^6$.
V. Associated numbers. The six numbers that are obtained when a
G-number $\zeta$ is multiplied by the six G-units are called the associated
numbers of $\zeta$.
The six associated numbers of $\pi = J - O$ are, for example,
$\pi J = -1 - O$, $\pi J^2 = -1 - J$, $\pi J^3 = -\pi$,
$\pi J^4 = 1 + O$, $\pi J^5 = 1 + J$, $\pi J^6 = \pi$.
VI. Division. The quotient $q = \alpha/\beta$ of two G-numbers $\alpha$ and $\beta$ is
not necessarily a G-number. If it is a G-number, however, $\beta$ is
called a divisor (G-divisor) of $\alpha$ or one says that $\beta$ goes into $\alpha$.
In order to divide any G-number $\alpha$ by any other $\beta$, we write
$\frac{\alpha}{\beta} = \frac{\alpha\bar{\beta}}{\beta\bar{\beta}} = \frac{\alpha\bar{\beta}}{\beta_0} = \frac{hJ + kO}{\beta_0} = \frac{h}{\beta_0}J + \frac{k}{\beta_0}O$.
Here we divide each rational fraction $h/\beta_0$ and $k/\beta_0$ into the integral
components m and n, respectively, and the rational components $r$ and
$s$, respectively, the absolute value of which never exceeds $\frac{1}{2}$ [Example:
3.8 = 4 - 0.2], we set $mJ + nO = \kappa$, $rJ + sO = \mathfrak{R}$, and obtain
$\frac{\alpha}{\beta} = \kappa + \mathfrak{R}$ or $\alpha = \kappa\beta + \mathfrak{R}\beta$.
From $\mathfrak{R}\beta = \alpha - \kappa\beta$ it follows that $\mathfrak{R}\beta$ is a G-number $\gamma$, and we have
$\alpha = \kappa\beta + \gamma$.
Here $\gamma_0 = \mathfrak{R}_0\beta_0 = (r^2 + s^2 - rs)\beta_0$. Since, however, $|r| \le \frac{1}{2}$ and
$|s| \le \frac{1}{2}$, then $\mathfrak{R}_0$ must certainly be $\le \frac{3}{4}$, i.e., $\gamma_0 \le \frac{3}{4}\beta_0$.
Conclusion. The division of a G-number $\alpha$ by another G-number $\beta$
results in a "quotient" $\kappa$ and a "residue" $\gamma$ such that
$\alpha = \kappa\beta + \gamma$,
with the residue norm being at most equal to $\frac{3}{4}$ of the divisor norm.
VII. The algorithm of the greatest common divisor. We start with the
division $\alpha/\beta$ and the related equation
(1) $\alpha = \kappa\beta + \gamma$ with $\gamma_0 \le \frac{3}{4}\beta_0$,
and determine, as in VI., the quotient $\lambda$ and the residue $\delta$ of the
division $\beta/\gamma$; in this way we obtain the corresponding equation
(2) $\beta = \lambda\gamma + \delta$ with $\delta_0 \le \frac{3}{4}\gamma_0$.
Then in a similar manner we obtain
(3) $\gamma = \mu\delta + \varepsilon$ with $\varepsilon_0 \le \frac{3}{4}\delta_0$,
etc. Since the residue norms become progressively smaller, we must
finally obtain a residue of zero. To avoid unnecessary writing we will
assume that the division $\delta/\varepsilon$ after (3) leaves no residue, so that
(4) $\delta = \nu\varepsilon$.
Now it follows from (4) that every divisor $\tau$ of $\varepsilon$ also goes into $\delta$
without residue, and, therefore, it follows from (3) that $\tau$ also goes
into $\gamma$ without residue; consequently, it follows from (2) that $\tau$
goes into $\beta$ without residue, and, finally, from (1) it follows that
$\tau$ goes into $\alpha$ without residue.
In reverse order: it follows from (1) that every common divisor $\tau$ of
$\alpha$ and $\beta$ is also a divisor of $\gamma$, then, from (2), that $\tau$ also goes into $\delta$
without residue, and, finally, from (3), that $\tau$ is also a divisor of $\varepsilon$.
Every common divisor of $\alpha$ and $\beta$ consequently goes into $\varepsilon$ without residue,
and every divisor of $\varepsilon$ goes into $\alpha$ and $\beta$ without residue.
$\varepsilon$ is accordingly (in terms of its absolute value) the highest common
divisor of $\alpha$ and $\beta$.
If, in particular, $\varepsilon$ is a G-unit, the numbers $\alpha$ and $\beta$ are said to have
no common divisor or to be prime with respect to each other.
The chain of equations (1), (2), (3), ... is nothing other than the
extension to G-numbers of the well-known algorithm for determination of the
highest common divisor of common integers.
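The division of VI. and the chain of equations of VII. can be sketched together in Python. G-numbers $aJ + bO$ are stored as integer pairs (a, b). The function names `g_mul`, `g_norm`, `g_divmod` and `g_gcd` are hypothetical, and the rounding rule (nearest integer, so that the fractional parts do not exceed one half in absolute value) is one possible reading of the splitting described in VI.

```python
def g_mul(x, y):
    """Product of two G-numbers, using the formula of II."""
    a, b = x
    c, d = y
    cross = a * d + b * c
    return (cross - b * d, cross - a * c)

def g_norm(x):
    """Norm a^2 + b^2 - ab."""
    a, b = x
    return a * a + b * b - a * b

def g_divmod(alpha, beta):
    """Quotient kappa and residue gamma with alpha = kappa*beta + gamma."""
    n = g_norm(beta)
    # alpha * conj(beta) = hJ + kO; the conjugate of aJ + bO is bJ + aO.
    h, k = g_mul(alpha, (beta[1], beta[0]))
    # Round h/n and k/n to a nearest integer (fractional part at most 1/2).
    kappa = ((2 * h + n) // (2 * n), (2 * k + n) // (2 * n))
    prod = g_mul(kappa, beta)
    gamma = (alpha[0] - prod[0], alpha[1] - prod[1])
    return kappa, gamma

def g_gcd(alpha, beta):
    """Last nonzero residue of the chain (1), (2), (3), ..."""
    while beta != (0, 0):
        alpha, beta = beta, g_divmod(alpha, beta)[1]
    return alpha

# The integer 3 is the pair (3, 3) because J + O = 1; pi = J - O is (1, -1).
print(g_norm(g_gcd((3, 3), (1, -1))))  # 3: pi divides 3
print(g_norm(g_gcd((2, 2), (1, -1))))  # 1: 2 and pi have no common divisor
```

The result is determined only up to a G-unit, which is why the example prints the norm of the divisor rather than the divisor itself.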
VIII. Unequivocal division of G-numbers into prime factors. Just as
with integers, the common theorems governing divisibility,
indivisibility and unequivocal division into prime factors are derived from
the divisional algorithm:
1. If $\alpha$ and $\beta$ possess no common divisor and $\alpha\mu$ is divisible by $\beta$, then $\mu$ is
divisible by $\beta$.
2. If two G-numbers possess no common divisor with one and the same third
G-number, their product also possesses no common divisor with this third
G-number.
3. Every G-number can be divided into a product of prime factors
(i.e., G-primes) in only one way. [Divisions such as $\alpha\beta\gamma$ and
$\alpha J\cdot\beta\cdot\gamma O$, in which one contains the associated numbers of the other
rather than certain factors of it, are not considered different from
each other.]
A G-prime is a G-number that possesses no divisor aside from its six
associated numbers and the six units.
The numbers $\pi = J - O$ and 2 are, for example, primes.
If, for example, we assume that $\pi$ is divisible: $\pi = \lambda\mu$, then
$\pi_0 = \lambda_0\mu_0$ or $3 = \lambda_0\mu_0$. From this it follows that $\lambda_0 = 3$, $\mu_0 = 1$.
$\mu$ is therefore a unit and the equation $\pi = \lambda\mu$ does not represent a
division.
From $2 = \lambda\mu$ it follows that $2_0 = \lambda_0\mu_0$ or $4 = \lambda_0\mu_0$. The case of
$\lambda_0 = 2$, $\mu_0 = 2$ is eliminated because, according to III., there is no
G-number having a norm equal to 2.
Thus, we are left with $\lambda_0 = 4$, $\mu_0 = 1$. Once again $\mu$ is a unit and
the equation $2 = \lambda\mu$ does not represent a division.
IX. Congruence. As in the theory of natural numbers, we say here
also that two G-numbers $\alpha$ and $\beta$ are congruent modulo $\mu$ (written
$\alpha \equiv \beta \bmod \mu$) when their difference $\alpha - \beta$ is divisible by the
G-number $\mu$.
X. G-numbers modulo $\pi$. We will consider one more G-number
$\kappa = aJ + bO$ in relation to the modulus $\pi = J - O$.
If $\kappa$ is divisible by $\pi$:
$aJ + bO = (mJ + nO)(J - O) = (2n - m)J + (n - 2m)O$,
then $a = 2n - m$, $b = n - 2m$, thus
$a + b = 3g$ with $g = n - m$.
Conversely, if $a + b = 3g$, m and n are determined from $n - m = g$
and $2n - m = a$, giving $\kappa = (mJ + nO)(J - O)$.
The G-number $\kappa = aJ + bO$ is thus divisible by $\pi$ only when a + b is
divisible by 3.
If $\kappa$ is not divisible by $\pi$, then one of the three following formula
pairs is valid:
$a = 3h$, $b = 3k + e$; $a = 3h + e$, $b = 3k$;
$a = 3h + e$, $b = 3k + e$,
with $e^2 = 1$, and thus, if $hJ + kO$ is set equal to $\lambda$,
$\kappa = 3\lambda + eO$ or $\kappa = 3\lambda + eJ$ or $\kappa = 3\lambda + e$,
so that in every case $\kappa$ has the form
$\kappa = 3\lambda + \varepsilon$,
where $\varepsilon$ is a G-unit.
Let us now consider the cube of $\kappa$. It becomes
$\kappa^3 = 9(3\lambda^3 + 3\lambda^2\varepsilon + \lambda\varepsilon^2) + \varepsilon^3$,
and, because $\varepsilon^3 = \pm 1$, it has the form
$\kappa^3 = 9\Lambda \pm 1$, where $\Lambda$ is a G-number.
If $\kappa$ is not divisible by $\pi$ we then have the congruences $\kappa \equiv \varepsilon \bmod 3$,
$\kappa^3 \equiv \pm 1 \bmod 9$.
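The two statements of X. lend themselves to a quick numerical check. In the sketch below, G-numbers are integer pairs (a, b), divisibility by $\pi$ is tested through the criterion that a + b is divisible by 3, and the integer $\pm 1$ is the pair $(\pm 1, \pm 1)$ because $J + O = 1$. The helper names `g_mul` and `cube_sign_mod_9` are assumptions made for illustration.

```python
def g_mul(x, y):
    """Product of two G-numbers, using the formula of II."""
    a, b = x
    c, d = y
    cross = a * d + b * c
    return (cross - b * d, cross - a * c)

def cube_sign_mod_9(kappa):
    """Return e = +1 or -1 if kappa^3 = 9*Lambda + e, otherwise None."""
    p, q = g_mul(g_mul(kappa, kappa), kappa)
    for e in (1, -1):
        # kappa^3 - e is 9 times a G-number iff both coordinates are multiples of 9.
        if (p - e) % 9 == 0 and (q - e) % 9 == 0:
            return e
    return None

# Every kappa not divisible by pi (a + b not a multiple of 3) gives +1 or -1.
checked = [cube_sign_mod_9((a, b))
           for a in range(-6, 7) for b in range(-6, 7) if (a + b) % 3 != 0]
print(all(e in (1, -1) for e in checked))
print(cube_sign_mod_9((1, -1)))  # pi itself: neither congruence holds
```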
22 The Quadratic Reciprocity Law
(The Euler-Legendre-Gauss theorem.) The reciprocal Legendre symbols
of the odd prime numbers p and q are governed by the formula
$\left(\frac{p}{q}\right)\cdot\left(\frac{q}{p}\right) = (-1)^{[(p-1)/2]\cdot[(q-1)/2]}$.
This law, the so-called quadratic reciprocity law, was formulated
but not proved by Euler (Opuscula analytica, Petersburg, 1783). In
1785 Legendre discovered the same law (Histoire de l'Académie des
Sciences) independently of Euler and proved it partially.
The first complete proof was presented by Karl Friedrich Gauss
(1777-1855) in his famous Disquisitiones arithmeticae (published in 1801),
a book that laid the foundations of contemporary number theory;
this work, its five hundred quarto pages swarming with profound
ideas, was written when Gauss was 20 years old. "It is really
astonishing," says Kronecker, "to think that a single man of such
young years was able to bring to light such a wealth of results, and
above all to present such a profound and well organized treatment of
an entirely new discipline."
Later Gauss discovered seven other proofs of the reciprocity
theorem. (The Gauss proofs may be found in vol. 14 of Ostwald's
Klassiker der exakten Wissenschaften.)
The quadratic reciprocity law is one of the most important theorems
of number theory. Gauss called it the "Theorema fundamentale."
The American mathematician Dickson says in his Theory of Numbers:
"The quadratic reciprocity law is doubtless the most important tool
in the theory of numbers and occupies the central position in its
history."
The importance of this law led other mathematicians like Jacobi,
Cauchy, Liouville, Kronecker, Schering, and Frobenius to investigate
it after Gauss and offer proofs of it. In his Niedere Zahlentheorie,
P. Bachmann cites no fewer than 52 proofs and reports on the most
important.
Probably the simplest of all the proofs is the following arithmetic-
geometric proof, which arises from the combination of the so-called
lemma of Gauss (Gauss' Werke, vol. II, p. 51) and a geometric idea of
Cayley (Arthur Cayley [1821-1895], Collected Mathematical Papers,
vol. II).
Before taking up the proof itself we will give the derivation of
Gauss' lemma.
Let p be an odd prime number and D an integer that is not divisible
by p. If x represents one of the numbers 1, 2, 3, ..., $\mathfrak{p} = (p - 1)/2$,
$R_x$ the common residue of the division $Dx/p$, $g_x$ the corresponding
integral quotient, then
(1) $Dx = R_x + g_x p$.
Accordingly as $R_x$ is smaller or greater than $\frac{1}{2}p$, we set $R_x = \rho_x$ or
$R_x = \rho_x + p$, where in the second case $\rho_x$ represents the negative
minimum residue of the division $Dx/p$, and we obtain
(1a) $Dx = \rho_x + g_x p$ or (1b) $Dx = \rho_x + p + g_x p$.
If n is then the number of negative minimum residues occurring in
the $\mathfrak{p}$ divisions $Dx/p$ (for x = 1, 2, 3, ..., $\mathfrak{p}$), we have n equations of
the form (1b) and $m = \mathfrak{p} - n$ equations of the form (1a).
We convert these equations into congruences mod p and obtain the
$\mathfrak{p}$ congruences
(2) $Dx \equiv \rho_x \bmod p$.
Now the $\mathfrak{p}$ residues $\rho_x$ agree, except with respect to sign and
sequence, with the $\mathfrak{p}$ numbers 1 to $\mathfrak{p}$.
[If, for example, $\rho_r$ were equal to $\rho_s$, or $\rho_r = -\rho_s$, for two different
values r and s of x, then $Dr \equiv \rho_r$ and $Ds \equiv \rho_s$ would yield by
subtraction or addition, respectively, $D(r \mp s) \equiv 0 \bmod p$. This
congruence is, however, impossible, because neither D nor $r \mp s$ is
divisible by p.]
Multiplication of the $\mathfrak{p}$ congruences (2) results in
$D^{\mathfrak{p}}\,\mathfrak{p}! \equiv (-1)^n\,\mathfrak{p}! \bmod p$,
and from this we obtain
$D^{\mathfrak{p}} \equiv (-1)^n \bmod p$.
However, since, according to Euler's theorem (No. 19),
$D^{\mathfrak{p}} \equiv \left(\frac{D}{p}\right) \bmod p$,
we obtain
$\left(\frac{D}{p}\right) \equiv (-1)^n \bmod p$,
whence, since both sides of this congruence have the absolute value 1,
(3) $\left(\frac{D}{p}\right) = (-1)^n$.
This formula, in which n represents the number of negative minimum
residues resulting from the $\mathfrak{p}$ divisions $Dx/p$ (x = 1, 2, 3, ..., $\mathfrak{p}$),
is Gauss' lemma.
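Gauss' lemma can be tried out directly. The Python sketch below counts the residues of Dx that exceed p/2 (these are the ones with a negative minimum residue) and compares $(-1)^n$ with the value of $D^{(p-1)/2}$ mod p that the text takes from Euler's theorem. The function names are illustrative assumptions, and D is assumed not to be divisible by the odd prime p.

```python
def gauss_lemma_sign(D, p):
    """(-1)^n, where n counts x = 1..(p-1)/2 whose residue of D*x exceeds p/2."""
    half = (p - 1) // 2
    n = sum(1 for x in range(1, half + 1) if (D * x) % p > half)
    return (-1) ** n

def euler_sign(D, p):
    """Legendre symbol (D/p) read off from D^((p-1)/2) mod p."""
    return 1 if pow(D, (p - 1) // 2, p) == 1 else -1

for p in (3, 5, 7, 11, 13):
    print(p, all(gauss_lemma_sign(D, p) == euler_sign(D, p) for D in range(1, p)))
```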
Now let D be some odd prime number q that differs from p. We
convert the $\mathfrak{p}$ equations (1a) and (1b) into congruences to the
modulus 2, leave out all the excess multiples of 2, e.g., (q - 1)x, and
obtain
$x \equiv \rho_x + g_x \bmod 2$ and $x \equiv 1 + \rho_x + g_x \bmod 2$.
Addition of these $\mathfrak{p}$ congruences yields
$\sum x \equiv n + \sum \rho_x + \sum g_x \bmod 2$.
However, since the absolute values of $\rho_x$ are in agreement with the
numbers 1 through $\mathfrak{p}$ and each summand can be replaced by its
opposite value in a congruence mod 2, we will write $\sum x$ in the obtained
congruence instead of $\sum \rho_x$ and -n instead of n, thereby obtaining
$\sum x + n \equiv \sum x + \sum g_x \bmod 2$
or
(4) $n \equiv \sum g_x \bmod 2$.
In accordance with (4) we can now write (3) as
$\left(\frac{q}{p}\right) = (-1)^{\sum g_x}$.
Now $g_x$ is the greatest integer contained in the quotient $qx/p$. If we
designate this as $[qx/p]$, we obtain at last
(I) $\left(\frac{q}{p}\right) = (-1)^{\sum [qx/p]}$,
where x passes through all the integers from 1 to $\mathfrak{p} = (p - 1)/2$.
Accordingly,
(II) $\left(\frac{p}{q}\right) = (-1)^{\sum [py/q]}$,
where y passes through all the integers from 1 to $\mathfrak{q} = (q - 1)/2$.
Multiplication of (I) and (II) gives us
(III) $\left(\frac{q}{p}\right)\cdot\left(\frac{p}{q}\right) = (-1)^{\sum [qx/p] + \sum [py/q]}$.
The exponent of the right-hand side is easily found.
[Fig. 4]
On a system of rectangular coordinates xy we draw the rectangle
with the four angles
$0|0$, $\frac{p}{2}|0$, $0|\frac{q}{2}$, $\frac{p}{2}|\frac{q}{2}$
and bisect it with a diagonal d from the origin, possessing the equation
$y = (q/p)x$; we then mark off all the lattice points* within the
rectangle. (Cf. the figure, in which p = 19, q = 11.)
To begin with, it is clear that no marked lattice point $x|y$ lies on d,
since here x would necessarily be $< \frac{1}{2}p$ and $y < \frac{1}{2}q$, which contradicts
the condition $y/x = q/p$.
For an integral abscissa x the corresponding ordinate of d is
$y = qx/p$ and the number of marked lattice points lying on this
ordinate is $[qx/p]$. Consequently, the number of the marked lattice
points lying in the lower half of the rectangle is $\sum [qx/p]$, where x
passes through all the integers from 1 to $\mathfrak{p}$.
Similarly, the number of all the marked lattice points lying in the
upper half of our rectangle is $\sum [py/q]$, where y passes through all the
integers from 1 to $\mathfrak{q}$.
The exponent appearing in (III) is then the number of all the
marked lattice points in our rectangle. This is a total of $\mathfrak{p}\cdot\mathfrak{q}$ elements.
Consequently,
$\left(\frac{q}{p}\right)\cdot\left(\frac{p}{q}\right) = (-1)^{\mathfrak{p}\mathfrak{q}}$
or
$\left(\frac{p}{q}\right)\cdot\left(\frac{q}{p}\right) = (-1)^{[(p-1)/2]\cdot[(q-1)/2]}$. Q.E.D.
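The lattice-point count can be reproduced for the pair p = 19, q = 11 of the figure. The following Python sketch sums the greatest-integer terms of (I) and (II) and compares the total with $\frac{p-1}{2}\cdot\frac{q-1}{2}$. The helper names `lattice_counts` and `legendre` are assumptions, and the Legendre symbol is evaluated through the power $a^{(p-1)/2}$ mod p.

```python
def lattice_counts(p, q):
    """Sums of [qx/p] for x = 1..(p-1)/2 and of [py/q] for y = 1..(q-1)/2."""
    lower = sum(q * x // p for x in range(1, (p - 1) // 2 + 1))
    upper = sum(p * y // q for y in range(1, (q - 1) // 2 + 1))
    return lower, upper

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p not dividing a."""
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

lower, upper = lattice_counts(19, 11)
print(lower, upper, lower + upper)  # 22 23 45, and 45 = 9 * 5
print(legendre(19, 11) * legendre(11, 19) == (-1) ** (lower + upper))
```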
23 Gauss' Fundamental Theorem of Algebra
Every equation of the nth degree
$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n = 0$
has n roots.
Expressed more precisely, this theorem reads:
The polynomial
$f(z) = z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n$
can always be divided into n linear factors of the form $z - \alpha_\nu$.
* A lattice point is a point whose coordinates are integers.
This famous theorem, the fundamental theorem of algebra, was
first stated by d'Alembert in 1746, but only partially proved. The
first rigorous proof was given in 1799 by Gauss, then twenty-one
years old, in his doctoral dissertation Demonstratio nova theorematis
omnem functionem algebraicam rationalem integram unius variabilis in
factores reales primi vel secundi gradus resolvi posse (Helmstaedt, 1799).
Subsequently, Gauss gave three other proofs of this theorem. All
four are to be found in the third volume of his Works, as well as in
vol. 14 of Ostwald's Klassiker der exakten Wissenschaften. Other
authors after Gauss, including Argand, Cauchy, Ullherr, Weierstrass,
and Kronecker also gave proofs of the fundamental theorem. The
proof followed here (as modified by Cauchy) is Argand's (Annales de
Gergonne, 1815), which is distinguished by its brevity and simplicity.
This proof (like most of the other proofs) falls into two steps. The
first—and more difficult—step merely demonstrates that an equation
of the nth degree will always contain at least one root; the second step
shows that it has n roots and no more.
First Step
We set
$z^n + C_1 z^{n-1} + C_2 z^{n-2} + \cdots + C_n = f(z) = w$
and consider the different values that are assumed by the absolute
magnitude $|w|$ when z is moved in the Gauss plane (the plane of
complex numbers). Let the smallest of these values be $\mu$ and let it
be attained, for example, at the site $z_0$, so that $|f(z_0)| = |w_0| = \mu$.
There are two possible cases:
1. The minimum $\mu$ is greater than zero.
2. The minimum $\mu$ is equal to zero.
We will begin by considering the first case. In the immediate
vicinity of the point $z_0$, say, in the area defined by a small circle K of
radius R with a center at $z_0$, $|w|$ is everywhere $\ge \mu$, since $\mu$ represents
the smallest value of $|w|$; at $z_0$ itself $|w| = |w_0| = \mu$.
For any z in K, $z = z_0 + \zeta$, where $\zeta = \rho(\cos\vartheta + i\sin\vartheta)$ and $\rho$ is
the absolute magnitude of $\zeta$, i.e., the line segment $z_0 z$, and $\vartheta$ the
inclination of this segment toward the axis of the positive real numbers.
We calculate
$w = f(z) = f(z_0 + \zeta) = (z_0 + \zeta)^n + C_1(z_0 + \zeta)^{n-1} + \cdots + C_n$
eliminating the parentheses and arranging according to increasing
powers of $\zeta$. In this way we obtain
$w = f(z) = z_0^n + C_1 z_0^{n-1} + C_2 z_0^{n-2} + \cdots + C_n + c_1\zeta + c_2\zeta^2 + \cdots + c_n\zeta^n$,
i.e.,
$w = w_0 + c_1\zeta + c_2\zeta^2 + \cdots + c_n\zeta^n$.
Since several coefficients $c_\nu$ may be equal to zero, we call the first of
the nonevanescent coefficients $c$, the second $c'$, and so forth, so that
$w = w_0 + c\zeta^{\nu} + c'\zeta^{\nu'} + c''\zeta^{\nu''} + \cdots$,
with $\nu < \nu' < \nu'' \ldots$.
Division with $w_0$ and isolation of $\zeta^\nu$ yields
$\frac{w}{w_0} = 1 + q\zeta^\nu\cdot(1 + \zeta Z)$,
where $q = c/w_0$ and $Z$ represents a sum of different powers of $\zeta$ with
non-negative exponents and known coefficients.
We consider the product $q\zeta^\nu\cdot(1 + \zeta Z)$. We write the first factor
trigonometrically, abbreviating $\cos\varphi + i\sin\varphi$ to $1_\varphi$, and, from
$q = h(\cos\lambda + i\sin\lambda) = h\cdot 1_\lambda$ and $\zeta = \rho\cdot 1_\vartheta$, we obtain $q\zeta^\nu =
h\cdot 1_\lambda\cdot\rho^\nu\cdot 1_{\nu\vartheta} = h\rho^\nu\cdot 1_{\lambda+\nu\vartheta}$. From now on we confine ourselves to
z-values of K for which $\lambda + \nu\vartheta = \pi$, which consequently lie on the
radius $z_0 H$ which forms the angle $\vartheta = (\pi - \lambda)/\nu$ with the real axis.
For all these z's the number $1_{\lambda+\nu\vartheta} = 1_\pi$ has the value -1, and our
product assumes the form $-h\rho^\nu\cdot(1 + \zeta Z)$.
If we choose a sufficiently small radius R, the second factor $1 + \zeta Z$
can be brought as close to unity as we desire, since $\rho = |\zeta| < R$.
But this means that the product lies as close as desired to the value
$-h\rho^\nu$, i.e., the fraction
$\frac{w}{w_0} = 1 - h\rho^\nu(1 + \zeta Z)$
lies as close as we desire to the point $1 - h\rho^\nu$ of the Gauss plane, which
shows that for all z's between $z_0$ and H the absolute magnitude
$|w/w_0| < 1$. In other words, for this z, $|w| < \mu$, while for all z's in
the vicinity of $z_0$, $|w|$ should be $\ge \mu$. This is a contradiction, and
consequently the first of the two possible cases given above ($\mu > 0$) is
eliminated. This leaves only the second case: $w_0$ is equal to zero or
$f(z_0) = 0$.
Therefore: Every equation, regardless of its degree, has at least one root.
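The decisive step of the argument, moving from $z_0$ along the radius with $\vartheta = (\pi - \lambda)/\nu$ to lower $|w|$, can be illustrated numerically. The Python sketch below is only an illustration under stated assumptions: the function names, the step length `rho` and the tolerance `tol` are hypothetical choices, and the starting point must not already be a root ($w_0 \ne 0$).

```python
import cmath

def taylor_coefficients(coeffs, z0):
    """Coefficients w0, c1, ..., cn of f(z0 + zeta) in rising powers of zeta.

    coeffs lists f from the highest power down; each pass is a synthetic
    division by (z - z0) whose remainder is the next coefficient.
    """
    work = [complex(c) for c in coeffs]
    out = []
    while work:
        for i in range(1, len(work)):
            work[i] += work[i - 1] * z0
        out.append(work.pop())
    return out

def descent_step(coeffs, z0, rho=1e-2, tol=1e-12):
    """Point z0 + zeta at distance rho on the ray theta = (pi - lambda)/nu."""
    c = taylor_coefficients(coeffs, z0)
    w0 = c[0]                                   # assumed nonzero
    nu = next(k for k in range(1, len(c)) if abs(c[k]) > tol)
    lam = cmath.phase(c[nu] / w0)               # q = c/w0 = h * 1_lambda
    theta = (cmath.pi - lam) / nu
    return z0 + cmath.rect(rho, theta)

def f(coeffs, z):
    """Value of the polynomial at z (Horner scheme)."""
    value = 0
    for c in coeffs:
        value = value * z + c
    return value

poly = [1, 0, 1]                                # z^2 + 1
z0 = 0.5
z1 = descent_step(poly, z0)
print(abs(f(poly, z0)), abs(f(poly, z1)))       # the second value is smaller
```

A single step with a fixed small `rho` only demonstrates the decrease; it is not a root finder.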
Second Step
We begin with the demonstration of the auxiliary theorem: If an
algebraic equation f(z) = 0 has the root $\alpha$, then the left side of the equation
can be divided by $z - \alpha$ without a remainder.
If we divide the polynomial f(z) by $z - \alpha$ until the remainder R no
longer contains z, we obtain
$\frac{f(z)}{z - \alpha} = f_1(z) + \frac{R}{z - \alpha}$,
where R is a constant and $f_1(z)$ has the form
$z^{n-1} + \mathfrak{C}_1 z^{n-2} + \mathfrak{C}_2 z^{n-3} + \cdots + \mathfrak{C}_{n-1}$.
Multiplication with $z - \alpha$ gives
$f(z) = (z - \alpha) f_1(z) + R$.
If in this equation, which is valid for every z, we set $z = \alpha$, we obtain
$R = f(\alpha) = 0$
and thus for every z
$f(z) = (z - \alpha)\cdot f_1(z)$. Q.E.D.
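The division by $z - \alpha$ used in the auxiliary theorem can be carried out with the usual synthetic division scheme. The Python sketch below is an illustration; the function name `divide_by_linear` and the sample cubic are assumptions chosen for the example.

```python
def divide_by_linear(coeffs, a):
    """Divide f(z) by (z - a); coeffs lists f from the highest power down.

    Returns (f1, R) with f(z) = (z - a) * f1(z) + R.
    """
    row = [coeffs[0]]
    for c in coeffs[1:]:
        row.append(c + row[-1] * a)
    return row[:-1], row[-1]

f = [1, -6, 11, -6]              # z^3 - 6z^2 + 11z - 6
print(divide_by_linear(f, 1))    # ([1, -5, 6], 0): 1 is a root, so R = 0
print(divide_by_linear(f, 4))    # ([1, -2, 3], 6): R equals f(4) = 6
```

The second call shows the remainder agreeing with the value of the polynomial at the point, which is the identity $R = f(\alpha)$ used in the proof.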
If we combine this auxiliary theorem with the theorem proved in
the first step, which demonstrated the existence of one root, we obtain
the new theorem: Every polynomial of z can be represented as the product of
a linear factor $z - \alpha$ with a polynomial one degree lower.
We now write $\alpha_1$ rather than $\alpha$ and obtain
$f(z) = (z - \alpha_1) f_1(z)$.
We then apply the obtained theorem to the polynomial $f_1(z)$ and
get
$f_1(z) = (z - \alpha_2) f_2(z)$,
where $f_2(z)$ is of the (n - 2)th degree and $\alpha_2$ is a root of the equation
$f_1(z) = 0$. Also in similar fashion:
$f_2(z) = (z - \alpha_3) f_3(z)$,
$f_3(z) = (z - \alpha_4) f_4(z)$, etc.
In this chain of equations, beginning with the next to last, if we
replace every f on the right-hand side with its following value in the
equation below, we finally obtain the theorem for the transformation
of a polynomial of the nth degree into a product of n linear factors:
$f(z) = (z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n)$.
Expressed verbally: Every integral rational function of the nth degree can be
represented as the product of n linear factors.
Thus, the previous equation f(z) = 0 allows us to write
$(z - \alpha_1)(z - \alpha_2) \cdots (z - \alpha_n) = 0$.
However, the product on the left becomes zero only when one factor is
equal to zero. And since $z - \alpha_\nu = 0$ implies $z = \alpha_\nu$, we finally
obtain:
The equation f(z) = 0 possesses the n roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ and no others.
Thus we have proved the fundamental theorem.
Note. It is possible for several of the n roots $\alpha_1, \alpha_2, \ldots, \alpha_n$ to be
equal, for example, for $\alpha_2$ and $\alpha_3$ both to be equal to $\alpha_1$, while
$\alpha_4, \alpha_5, \ldots, \alpha_n$ may be different from $\alpha_1$. In this case $\alpha_1$ is called a
multiple root, and specifically in the case we have assumed of three
equal roots, a triple root.
24 Sturm's Problem of the Number of Roots
Find the number of real roots of an algebraic equation with real coefficients
over a given interval.
This very important algebraic problem was solved in a surprisingly
simple way in 1829 by the French mathematician Charles Sturm
(1803-1855). The paper containing the famous Sturm theorem
appeared in the eleventh volume of the Bulletin des sciences de Férussac
and bears the title, "Mémoire sur la résolution des équations
numériques."
"With this major discovery," says Liouville, "Sturm at once
simplified and perfected the elements of algebra, enriching them with
new results."
Solution. We distinguish two cases:
I. The real roots of the equation in question are all simple over the
given interval.
II. The equation also possesses multiple real roots over the interval.
We will first show that the second case leads us back to the first.
Let the prescribed equation F(x) = 0 have the distinct roots
$\alpha, \beta, \gamma, \ldots$, and let the root $\alpha$ be a-fold, $\beta$ b-fold, $\gamma$ c-fold, ..., so that
$F(x) = (x - \alpha)^a (x - \beta)^b (x - \gamma)^c \cdots$.
For the derivative F'(x) of F(x) we obtain
$\frac{F'(x)}{F(x)} = \frac{a}{x - \alpha} + \frac{b}{x - \beta} + \frac{c}{x - \gamma} + \cdots$
$= \frac{a(x - \beta)(x - \gamma)(x - \delta)\cdots + b(x - \alpha)(x - \gamma)(x - \delta)\cdots + \cdots}{(x - \alpha)(x - \beta)(x - \gamma)\cdots}$.
If we then call the numerator of this fraction p(x) and the
denominator q(x) and set the whole rational function F(x)/q(x) equal to
G(x), then
$F(x) = G(x)\cdot q(x)$ and $F'(x) = G(x)\cdot p(x)$.
Now the functions p(x) and q(x) have no common divisor. (The
factor $x - \beta$ of q(x) may, for example, go into all the terms of p(x)
except the second with no remainder.) It follows from this that G(x)
is the greatest common divisor of F(x) and F'(x). This can be
determined easily from the divisional algorithm and can therefore be
considered known, as a result of which q(x) is known also.
The equation F(x) = 0 then falls into the two equations
q(x) = 0 and G(x) = 0,
the first of which possesses only simple roots, while the second can be
further reduced in the same way that F(x) = 0 was.
An equation with multiple roots can therefore always be
transformed into equations (with known coefficients) possessing only
simple roots.
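The reduction to simple roots can be followed on a small example. The Python sketch below computes G(x) as the greatest common divisor of F and F' by the divisional algorithm, using exact fractions, and then q(x) = F(x)/G(x). The function names and the sample polynomial $F(x) = (x - 1)^2(x + 2)$ are assumptions made for illustration; polynomials are coefficient lists from the highest power down.

```python
from fractions import Fraction

def poly_divmod(a, b):
    """Quotient and remainder of a divided by b (leading coefficient of b nonzero)."""
    a = [Fraction(c) for c in a]
    quotient = []
    while len(a) >= len(b):
        factor = a[0] / b[0]
        quotient.append(factor)
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)                       # the leading term has been cancelled
    while a and a[0] == 0:             # strip leading zeros of the remainder
        a.pop(0)
    return quotient, a

def derivative(f):
    n = len(f) - 1
    return [c * (n - i) for i, c in enumerate(f[:-1])]

def poly_gcd(a, b):
    """Greatest common divisor, normalised to leading coefficient 1."""
    a, b = list(a), list(b)
    while b:
        a, b = b, poly_divmod(a, b)[1]
    return [Fraction(c) / a[0] for c in a]

def simple_root_part(F):
    """Return (G, q) with F = G * q, where q has only simple roots."""
    G = poly_gcd(F, derivative(F))
    q, _ = poly_divmod(F, G)
    return G, q

G, q = simple_root_part([1, 0, -3, 2])     # F = x^3 - 3x + 2 = (x - 1)^2 (x + 2)
print(G == [1, -1], q == [1, 1, -2])       # G = x - 1, q = x^2 + x - 2
```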
Consequently, it is sufficient to solve the problem for the first case.
Let f(x) = 0 be an algebraic equation all of whose roots are simple.
The derivative f'(x) of f(x) then vanishes for none of these roots and
the highest common divisor of the functions f(x) and f'(x) is a constant
K that differs from zero. We use the divisional algorithm to
determine the highest common divisor of f(x) and f'(x), writing, for the
sake of convenience in representation, $f_0(x)$ and $f_1(x)$ instead of
f(x) and f'(x), and calling the quotients resulting from the successive
divisions $q_0(x), q_1(x), q_2(x), \ldots$ and the remainders $-f_2(x), -f_3(x), \ldots$.
If we also drop the argument sign for the sake of brevity, we obtain the
following scheme:
(0) $f_0 = q_0 f_1 - f_2$,
(1) $f_1 = q_1 f_2 - f_3$,
(2) $f_2 = q_2 f_3 - f_4$, etc.
In this scheme there must at last appear (at the very latest with
the remainder K) a remainder $-f_s(x)$ that does not vanish at any
point of the interval and consequently possesses the same sign over
the whole interval. Here we break off the algorithm. The functions
involved
$f_0, f_1, f_2, \ldots, f_s$
form a "Sturm chain" and in this connection are called Sturm functions.
The Sturm functions possess the following three properties:
1. Two neighboring functions do not vanish simultaneously at any
point of the interval.
2. At a null point of a Sturm function its two
neighboring functions are of different sign.
3. Within a sufficiently
small area surrounding a zero point of $f_0(x)$, $f_1(x)$ is everywhere
greater than zero or everywhere smaller than zero.
Proof of 1. If, for example, $f_2$ and $f_3$ vanish at any point of an
interval, $f_4$ [according to (2)] also vanishes at this point, and
consequently $f_5$ also [according to (3)], and so forth, so that finally
[according to the last line of the algorithm] $f_s$ also vanishes, which, however,
contradicts our assumption.
Proof of 2. If the function $f_3$ vanishes at the point $\sigma$, for example,
of the interval, then it follows from (2) that
$f_2(\sigma) = -f_4(\sigma)$.
Proof of 3. This proof follows from the known theorem: A
function [$f_0(x)$] rises or falls at a point depending on whether its
derivative [$f_1(x)$] at that point is greater or smaller than zero.
We now select any point x of the interval, note the sign of the values
$f_0(x), f_1(x), \ldots, f_s(x)$, and obtain a Sturm sign chain (to obtain an
unequivocal sign, however, it must be assumed that none of the
designated s + 1 function values is zero). The sign chain will
contain sign sequences (+ + and - -) and sign changes (+ - and
- +).
We will consider the number Z(x) of sign changes in the sign chain
and the changes undergone by Z(x) when x passes through the
interval. A change can occur only if one or more of the Sturm
functions changes sign, i.e., passes over from negative (positive)
values through zero to positive (negative) values. We will accordingly
study the effect produced on Z(x) by the passage of a function $f_\nu(x)$
through zero.
Let k be a point at which $f_\nu$ disappears, h a point situated to the
left, and l a point to the right of k and so close to k that over the
interval h to l the following holds true: (1) $f_\nu(x)$ does not vanish
except when x = k; (2) every neighbor ($f_{\nu+1}$, $f_{\nu-1}$) of $f_\nu$ does not
change sign. We must distinguish between the cases $\nu > 0$ and
$\nu = 0$; in the first case we are concerned with the triplet $f_{\nu-1}$, $f_\nu$,
$f_{\nu+1}$, in the second, with the pair $f_0$, $f_1$.
In the triplet, $f_{\nu-1}$ and $f_{\nu+1}$ possess either the + and - sign or the
- and + sign at all three points h, k, l. Thus, whatever the sign of
$f_\nu$ may be at these points, the triplet possesses one change of sign for
each of the three arguments h, k, l. The passage through zero of the
function $f_\nu$ does not change the number of sign changes in the chain!
In the pair, $f_1$ has either the + or - sign at all three points h, k, l.
In the first case, $f_0$ is increasing and is thus negative at h and positive
at l. In the second case, $f_0$ is decreasing and is positive at point h,
and negative at l. In both cases a sign change is lost.
From our investigation we learn that: The Sturm sign chain
undergoes a change in the number of sign changes Z(x) only when x
passes through a null point of f(x); and specifically, the chain then
loses (with an increasing x) exactly one sign change. Thus, if x
passes through the interval (the ends of which do not represent roots
of f(x) = 0) from left to right, the sign chain loses exactly as many
sign changes as there are null points of f(x) within the interval.
Result:
Sturm's theorem: The number of real roots of an algebraic equation
with real coefficients whose real roots are simple over an interval the end points
of which are not roots is equal to the difference between the numbers of sign
changes of the Sturm sign chains formed for the interval ends.
Note. The same considerations can also be applied unchanged to
the series formed when we multiply $f_0, f_1, f_2, \ldots, f_s$ by any positive
constants; this series is then likewise designated as a Sturm chain.
In the formation of the Sturm function chain all fractional coefficients
can accordingly be avoided.
Example 1. Determine the number and situation of the real roots
of the equation $x^5 - 3x - 1 = 0$.
The Sturm chain is
$f_0 = x^5 - 3x - 1$, $f_1 = 5x^4 - 3$, $f_2 = 12x + 5$, $f_3 = 1$.
The signs of f for x = -2, -1, 0, +1, +2 are
  x     -2   -1    0   +1   +2
 f_0     -    +    -    -    +
 f_1     +    +    -    +    +
 f_2     -    -    +    +    +
 f_3     +    +    +    +    +
The equation thus has three real roots: one between -2 and -1,
one between -1 and 0, one between +1 and +2. The other two
roots are complex.
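Example 1 can be recomputed mechanically. The following Python sketch builds a Sturm chain with exact fractions, continuing the divisions until the remainder vanishes, and counts sign changes at the chosen points. The function names are hypothetical, zero values are simply skipped when counting (an assumption, since the text requires nonzero values), and the chain differs from the one printed above only by positive constant factors, which the Note permits.

```python
from fractions import Fraction

def poly_rem(a, b):
    """Remainder of a divided by b (coefficients from the highest power down)."""
    a = [Fraction(c) for c in a]
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)
    while a and a[0] == 0:
        a.pop(0)
    return a

def sturm_chain(f):
    """f0 = f, f1 = f', and f_{k+1} = -(remainder of f_{k-1} divided by f_k)."""
    n = len(f) - 1
    chain = [list(f), [c * (n - i) for i, c in enumerate(f[:-1])]]
    while True:
        r = poly_rem(chain[-2], chain[-1])
        if not r:
            return chain
        chain.append([-c for c in r])

def sign_changes(chain, x):
    """Number Z(x) of sign changes in the sign chain at x."""
    values = []
    for g in chain:
        v = 0
        for c in g:
            v = v * x + c          # Horner evaluation
        if v != 0:
            values.append(v)
    return sum(1 for u, v in zip(values, values[1:]) if u * v < 0)

def count_real_roots(f, lo, hi):
    """Roots in (lo, hi); lo and hi are assumed not to be roots."""
    chain = sturm_chain(f)
    return sign_changes(chain, lo) - sign_changes(chain, hi)

f = [1, 0, 0, 0, -3, -1]                                   # x^5 - 3x - 1
chain = sturm_chain(f)
print([sign_changes(chain, x) for x in (-2, -1, 0, 1, 2)])  # [3, 2, 1, 1, 0]
print(count_real_roots(f, -2, 2))                          # 3
```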
Example 2. Determine the number of real roots of the equation
$x^5 - ax - b = 0$ when a and b are positive magnitudes and
$4^4 a^5 > 5^5 b^4$.
The Sturm chain reads
$x^5 - ax - b$, $5x^4 - a$, $4ax + 5b$, $4^4 a^5 - 5^5 b^4$.
For the values $x = -\infty$ and $+\infty$ it has the signs
- + - +
and
+ + + +, respectively.
The equation has three real and two complex roots.
25 Abel's Impossibility Theorem
Equations of higher than the fourth degree are in general incapable of algebraic
solution.
This famous theorem was first stated by the Italian physician Paolo
Ruffini (1765-1822) in his book Teoria generale delle equazioni, published
in Bologna in 1798. Ruffini's proof, however, is incomplete. The
first rigorous proof was given in 1826 in the first volume of Crelle's
Journal für Mathematik by the young Norwegian mathematician Niels
Henrik Abel (1802-1829). His celebrated paper bore the title
"Démonstration de l'impossibilité de la résolution algébrique des
équations générales qui dépassent le quatrième degré."
The following proof of Abel's impossibility theorem is based on a
theorem of Kronecker, published in 1856 in the Monatsberichte der
Berliner Akademie.
We will begin by presenting in a short introduction the auxiliary
algebraic theorems necessary for an understanding of the Kronecker
proof.
A system $\mathfrak{G}$ of numbers is called a number group or rational domain
when the addition, subtraction, multiplication, and division of two
numbers of the system will also yield a number of the system. For
brevity we will call the numbers of the system $\mathfrak{G}$-numbers. Two
groups are called equal when every number of the one belongs also to
the other. The simplest group is that composed of all rational
numbers, the group $\mathfrak{R}$ of rational numbers or the natural rationality
domain.
A group $\mathfrak{G}' = \mathfrak{G}(\alpha, \beta, \gamma, \ldots)$ created by the "substitution of the
magnitudes $\alpha, \beta, \gamma, \ldots$ in a group $\mathfrak{G}$" is understood to mean the
totality of all the numbers obtained from the $\mathfrak{G}$-numbers and
the substituted magnitudes $\alpha, \beta, \gamma, \ldots$ by one or more applications
of the four species (the four basic arithmetic operations), in other words,
the totality of all the rational
functions of $\alpha, \beta, \gamma, \ldots$ whose coefficients are $\mathfrak{G}$-numbers.
A function f(x) or an equation f(x) = 0 in a group is a function or
equation whose coefficients are numbers of the group. A polynomial
in $\mathfrak{G}$ is understood to mean an integral rational function of the
variable x whose coefficients are $\mathfrak{G}$-numbers.
A polynomial
$F(x) = Ax^n + Bx^{n-1} + \cdots$
or an equation
$F(x) = 0$
in a group $\mathfrak{G}$ is said to be reducible or irreducible in this group
accordingly as F(x) is divisible into a product of polynomials of lower
degree in $\mathfrak{G}$ or not.
The function $x^2 - 10x + 7$, for example, is irreducible in the group
$\mathfrak{R}$ whereas it is reducible in the group $\mathfrak{R}(\sqrt{2})$:
$x^2 - 10x + 7 = (x - 5 - 3\sqrt{2})(x - 5 + 3\sqrt{2})$.
Abel's lemma:* The pure equation
$x^p = C$
of the prime number degree p is irreducible in a group S when C is a number of
the group but not the pth power of a group number.
Indirect proof. Let $x^p - C = 0$ be reducible, so that
$x^p - C = \psi(x)\varphi(x),$
where $\psi$ and $\varphi$ are polynomials in S, whose free terms A and B are
S-numbers. Since the roots of the equation $x^p = C$ are $r, r\varepsilon, r\varepsilon^2, \ldots,
r\varepsilon^{p-1}$, where r is one of the roots and $\varepsilon$ a complex pth unit root, and the
free term of the equation $\psi(x) = 0$ or $\varphi(x) = 0$, independent of sign,
represents the product of the equation's roots, then, for example,
$A = r^{\mu}\varepsilon^{M}, \quad B = r^{\nu}\varepsilon^{N}.$
Since $\mu$ and $\nu$ possess no common divisor (because $\mu + \nu = p$), there
are integers h, k such that
$\mu h + \nu k = 1.$
Thus, we obtain for the product K of the powers $A^h$ and $B^k$ the value
$r\varepsilon^{hM + kN}$ and, consequently, the value $K^p = r^p = C$ for the pth power
of the S-number K. It was assumed, however, that C must not be the
pth power of an S-number. Consequently, $x^p = C$ cannot be reducible.
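The integers h and k used in the proof can be produced by the extended Euclidean algorithm. The sketch below is an illustration under that assumption (the book does not say how h and k are found); the name `extended_gcd` and the sample prime p = 7 are hypothetical choices.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, h, k) with a*h + b*k == g, where g is the gcd of a and b."""
    old_r, r = a, b
    old_h, h = 1, 0
    old_k, k = 0, 1
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_h, h = h, old_h - quotient * h
        old_k, k = k, old_k - quotient * k
    return old_r, old_h, old_k

# For a prime p, every split p = mu + nu with 0 < mu < p gives coprime mu, nu.
p = 7
bezout = {}
for mu in range(1, p):
    nu = p - mu
    bezout[(mu, nu)] = extended_gcd(mu, nu)
```

Each entry of `bezout` holds a triple (1, h, k) with mu*h + nu*k = 1, which is the relation the proof relies on.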
Schoenemann's theorem (Crelle's Journal, vol. XXXII, 1846): If
the integral coefficients $C_0, C_1, C_2, \ldots, C_{N-1}$ of the polynomial
$f(x) = C_0 + C_1 x + C_2 x^2 + \cdots + C_{N-1}x^{N-1} + x^N$
are divisible by a prime number p, while the free term $C_0$ is not divisible by $p^2$,
then f(x) is irreducible in the natural rationality domain.
Indirect proof. Let f be reducible so that $f = \psi \cdot \varphi$, with
$\psi = a_0 + a_1 x + a_2 x^2 + \cdots + a_{m-1}x^{m-1} + x^m,$
$\varphi = b_0 + b_1 x + b_2 x^2 + \cdots + b_{n-1}x^{n-1} + x^n.$
* Abel, Œuvres complètes, vol. II, p. 196.
According to a theorem of Gauss* the coefficients a and b are here
integers. We multiply the expressions for $\psi$ and $\varphi$, obtaining, by
comparison with f,
$C_0 = a_0 b_0,$
$C_1 = a_0 b_1 + a_1 b_0,$
$C_2 = a_0 b_2 + a_1 b_1 + a_2 b_0,$ etc.
Since $C_0$ is divisible by p but not by $p^2$, exactly one of the factors
$a_0$, $b_0$ is divisible by p; let us say that $a_0$ is divisible by p, in
which case $b_0$ is not. Since $C_1$ and $a_0$ are divisible by p, while $b_0$ is not,
it follows from the second line of our scheme that $a_1$ is divisible by p.
Then it follows according to the third line of our scheme, in which
$C_2$, $a_0$, $a_1$ are divisible by p, that $a_2$ is also divisible by p, and so forth.
Finally, we would be able to conclude that $a_m = 1$ is also divisible by
p, which is naturally absurd. Consequently, f cannot be reducible.
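The hypotheses of Schoenemann's theorem are easy to test for a given polynomial and prime. The following is a minimal sketch; the function name `schoenemann_applies` and the coefficient layout are assumptions made for illustration. Note that the criterion is only sufficient: a result of False says nothing about reducibility.

```python
def schoenemann_applies(lower_coeffs: list[int], p: int) -> bool:
    """Check the hypotheses for the monic polynomial C0 + C1*x + ... + x^N.

    lower_coeffs holds [C0, C1, ..., C_{N-1}]; the leading coefficient 1 is implied.
    p is assumed to be prime (this is not verified here).
    """
    all_divisible = all(c % p == 0 for c in lower_coeffs)
    free_term_ok = lower_coeffs[0] % (p * p) != 0
    return all_divisible and free_term_ok

# x^5 - 4x - 2 with p = 2: every lower coefficient is even, and 4 does not divide -2.
quintic_ok = schoenemann_applies([-2, -4, 0, 0, 0], 2)

# x^2 + 4x + 4 = (x + 2)^2 with p = 2: the free term is divisible by 4, so the test fails.
square_ok = schoenemann_applies([4, 4], 2)
```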
Reducible and irreducible polynomials play the same role among
polynomials that composite and prime numbers play among the
integers. Thus, for example, every reducible polynomial can be
divided in only one way into a product of irreducible polynomials.
All of the theorems concerned here are based on the fundamental
theorem of irreducible functions.
* Gauss' theorem: If a polynomial $f = x^N + C_1 x^{N-1} + C_2 x^{N-2} + \cdots + C_N$
with integral coefficients is divisible into a product of two polynomials
$\psi = x^m + \alpha_1 x^{m-1} + \cdots + \alpha_m$ and $\varphi = x^n + \beta_1 x^{n-1} + \cdots + \beta_n$
with rational coefficients ($f = \psi\varphi$),
then the coefficients of these polynomials are integers.
Proof. We bring the $\alpha_\mu$ and the $\beta_\nu$ to their least common denominators $a_0$ and
$b_0$, respectively, so that $\alpha_\mu = a_\mu/a_0$ and $\beta_\nu = b_\nu/b_0$, and the numbers
$a_0, a_1, a_2, \ldots, a_m$ as well as the numbers $b_0, b_1, \ldots, b_n$ possess no common divisor,
and we obtain
$F = \Psi\Phi$ with $F = a_0 b_0 f,$
$\Psi = a_0 x^m + a_1 x^{m-1} + \cdots + a_m, \quad \Phi = b_0 x^n + b_1 x^{n-1} + \cdots + b_n.$
Let p be a prime divisor of $a_0 b_0$.
Then all the coefficients of F are divisible by p, but not all those of $\Psi$ and $\Phi$. We
combine those terms of $\Psi$ and $\Phi$, respectively, whose coefficients are divisible
by p, to form the respective polynomials U and V, and similarly combine those
terms whose coefficients are not divisible by p to form the polynomials u
and v, so that $F = (U + u)(V + v)$, and consequently
$uv = F - UV - Uv - Vu.$
The right-hand side of this equation contains a polynomial in which, according
to our assumptions for F, U, and V, every coefficient is divisible by p; the left
side, however, does not, since the coefficient of the highest power of the left side,
being the product of two factors $a_s$ and $b_t$ that are not divisible by p, is also not
divisible by p.
This contradiction disappears only when $a_0 b_0$ has no prime divisor, i.e., when
$a_0 = 1$ and $b_0 = 1$, in which case the $\alpha_\mu$ and the $\beta_\nu$ are integers.
Abel's irreducibility theorem:* If one root of the equation f(x) = 0,
which is irreducible in S, is also a root of the equation F(x) = 0 in S, then
all the roots of the irreducible equation are roots of F(x) = 0. At the
same time F(x) can be divided by f(x) without a remainder:
$F(x) = f(x) \cdot F_1(x),$
where $F_1(x)$ is also a polynomial in S.
The simple proof of this theorem is based on the familiar algorithm
for finding the highest common divisor g(x) of two arbitrary
polynomials F(x) and f(x) in S. This algorithm leads through a chain of
divisions, in which all the coefficients are S-numbers, to the pair of
equations
$F(x) = F_1(x) \cdot g(x), \quad f(x) = f_1(x) \cdot g(x)$
and to the equation
$V(x)F(x) + v(x)f(x) = g(x),$
where all the indicated functions are polynomials in S.
If the prescribed functions F and f have no common divisor, then
g(x) is a constant which is for convenience set equal to 1.
If f is irreducible and a root $\alpha$ of f = 0 is also a root of F = 0, then
there exists a common divisor of at least the first degree ($x - \alpha$).
Since f is irreducible, $f_1(x)$ must equal 1 and f(x) = g(x), and then
F(x) is thus divisible by f(x) and vanishes for every zero point of
f(x). Q.E.D.
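The "familiar algorithm" is the Euclidean algorithm applied to polynomials. The following is a minimal sketch over the rational numbers using exact fractions; the helper names `trim`, `poly_divmod`, and `poly_gcd` and the coefficient layout (highest degree first) are illustrative assumptions, not the book's notation.

```python
from fractions import Fraction

def trim(poly):
    """Drop leading zero coefficients (coefficients are highest degree first)."""
    i = 0
    while i < len(poly) - 1 and poly[i] == 0:
        i += 1
    return poly[i:]

def poly_divmod(num, den):
    """Return (quotient, remainder) of num divided by den over the rationals."""
    num = trim([Fraction(c) for c in num])
    den = trim([Fraction(c) for c in den])
    if den == [0]:
        raise ZeroDivisionError("division by the zero polynomial")
    remainder = num[:]
    quotient = []
    for _ in range(len(num) - len(den) + 1):
        factor = remainder[0] / den[0]
        quotient.append(factor)
        for i, c in enumerate(den):
            remainder[i] -= factor * c
        remainder.pop(0)  # the leading coefficient has just been cancelled
    return trim(quotient or [Fraction(0)]), trim(remainder or [Fraction(0)])

def poly_gcd(a, b):
    """Highest common divisor, normalized to leading coefficient 1."""
    a = trim([Fraction(c) for c in a])
    b = trim([Fraction(c) for c in b])
    while b != [0]:
        _, r = poly_divmod(a, b)
        a, b = b, r
    return [c / a[0] for c in a]

f = [1, -10, 7]            # x^2 - 10x + 7, irreducible over the rationals
F = [1, -13, 37, -21]      # (x^2 - 10x + 7)(x - 3), shares every root of f
shared = poly_gcd(F, f)
coprime = poly_gcd(f, [1, 0, -1])   # x^2 - 1 has no root in common with f
```

Here `shared` equals f itself, matching the statement that F is divisible by f, while `coprime` is the constant 1.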
The fundamental theorem directly implies two important
corollaries:
I. If a root of an equation f(x) = 0, which is irreducible in S, is also a
root of an equation F(x) = 0 in S of lower degree than f, then all the
coefficients of F are equal to zero.
II. If f(x) = 0 is an irreducible equation in a group S, then there is no
other irreducible equation in S that has a common root with f(x) = 0.
* N. H. Abel, "Mémoire sur une classe particulière d'équations résolubles
algébriquement," Crelle's Journal, vol. IV, 1829.
The commonest case of substitution in a group S consists of the
substitution of a root $\alpha$ of an irreducible equation of the nth degree
$f(x) = x^n + a_1 x^{n-1} + \cdots + a_n = 0$
into S. A number $\zeta$ of the group $S' = S(\alpha)$ defined by this
substitution is a rational function of $\alpha$ with coefficients from S and can be
written $\zeta = \Psi(\alpha)/\Phi(\alpha)$, where $\Psi$ and $\Phi$ are polynomials in S.
Since $\alpha^n = -a_1\alpha^{n-1} - a_2\alpha^{n-2} - \cdots - a_n$, every power of $\alpha$ with
the exponent n or with a higher exponent can be expressed by the
powers $\alpha^{n-1}, \alpha^{n-2}, \ldots, \alpha$, so that we may write $\zeta = \psi(\alpha)/\varphi(\alpha)$, where
$\psi$ and $\varphi$ are polynomials in S of no higher than the (n − 1)th degree.
Since f(x) and $\varphi(x)$ possess no common divisor, two polynomials
u(x) and v(x) can be found (see above) in S, such that
$u(x)\varphi(x) + v(x)f(x) = 1$. If in this equation we set $x = \alpha$, then [since
$f(\alpha) = 0$] $u(\alpha) \cdot \varphi(\alpha) = 1$, i.e., $\zeta = \psi(\alpha) \cdot u(\alpha)$. We multiply this
out and once again eliminate every power of $\alpha$ whose exponent is $\ge n$.
This finally gives us
$\zeta = c_0 + c_1\alpha + c_2\alpha^2 + \cdots + c_{n-1}\alpha^{n-1},$
where the $c_\nu$ are S-numbers; i.e.,
III. Every number of the group $S(\alpha)$, where $\alpha$ is a root of an irreducible
equation of the nth degree in S, can be represented as a polynomial of the
(n − 1)th degree of $\alpha$ with coefficients that are S-numbers. There is only
one such possible way of representing it.
[From
$C_0 + C_1\alpha + \cdots + C_{n-1}\alpha^{n-1} = c_0 + c_1\alpha + \cdots + c_{n-1}\alpha^{n-1}$
it follows that
$d_0 + d_1\alpha + \cdots + d_{n-1}\alpha^{n-1} = 0$, with $d_\nu = C_\nu - c_\nu$.
Then the function of the (n − 1)th degree
$d_0 + d_1 x + d_2 x^2 + \cdots + d_{n-1}x^{n-1}$
vanishes for a root of f(x) = 0 and, according to corollary I., must
have nothing but evanescent coefficients. From $d_\nu = 0$, however, it
follows that $C_\nu = c_\nu$.]
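The reduction used for theorem III, replacing every power of the root with exponent n or higher by lower powers, can be carried out mechanically. The sketch below does this for a root of the sample equation x^2 − 10x + 7 = 0 from earlier in the section; the names `reduce_mod` and `multiply` and the coefficient layout (lowest degree first) are hypothetical choices.

```python
from fractions import Fraction

def reduce_mod(poly, f_lower):
    """Rewrite a polynomial in alpha using the defining equation of alpha.

    poly: coefficients c_0, c_1, ... of 1, alpha, alpha^2, ...
    f_lower: non-leading coefficients of the monic equation, lowest degree first,
             e.g. [7, -10] for alpha^2 - 10*alpha + 7 = 0.
    """
    n = len(f_lower)
    coeffs = [Fraction(c) for c in poly]
    while len(coeffs) > n:
        top = coeffs.pop()   # coefficient of alpha^k, where k = len(coeffs) after the pop
        k = len(coeffs)
        # alpha^k = alpha^(k-n) * alpha^n, and alpha^n = -(lower part of f).
        for i, a in enumerate(f_lower):
            coeffs[k - n + i] -= top * a
    coeffs += [Fraction(0)] * (n - len(coeffs))
    return coeffs

def multiply(u, v, f_lower):
    """Product of two numbers of S(alpha), returned in reduced form."""
    product = [Fraction(0)] * (len(u) + len(v) - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            product[i + j] += Fraction(a) * Fraction(b)
    return reduce_mod(product, f_lower)

f_lower = [7, -10]
alpha_cubed = reduce_mod([0, 0, 0, 1], f_lower)    # alpha^3 = -70 + 93*alpha
inverse_check = multiply([0, 1], [Fraction(10, 7), Fraction(-1, 7)], f_lower)
```

The second computation confirms that (10 − α)/7 is the reciprocal of α, so the quotient 1/α is again a polynomial of the first degree in α.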
We have just seen a simple example of an irreducible function that
became reducible by substitution of a root.
Let us consider the more general case in which an irreducible
function f(x) in S of prime number degree p becomes reducible by
substitution of a root a of an irreducible equation of the qth degree
g(x) = 0 in S, in which, therefore, f(x) can be divided into the
product of the two polynomials $\psi(x, \alpha)$ and $\varphi(x, \alpha)$, which may be of
the mth and nth degree of x, respectively.
Now the function in S
$u(x) = f(r) - \psi(r, x)\varphi(r, x),$
where r is some rational number, vanishes for $x = \alpha$. According to
the fundamental theorem of irreducible functions, u(x) is then
evanescent for all roots $\alpha, \alpha', \alpha'', \ldots$ of the irreducible equation
g(x) = 0.
Since, for example, the equation
$f(x) - \psi(x, \alpha')\varphi(x, \alpha') = 0$
is therefore valid for every rational x, it is valid for all the values of x,
so that by identity
$f(x) = \psi(x, \alpha')\varphi(x, \alpha')$
and similarly for all other roots of g(x) = 0.
From the q equations
$f(x) = \psi(x, \alpha)\varphi(x, \alpha),$
$f(x) = \psi(x, \alpha')\varphi(x, \alpha'),$ etc.,
thus obtained, it follows by multiplication that
$f(x)^q = \Psi(x) \cdot \Phi(x),$
where $\Psi(x)$ and $\Phi(x)$ are the products of the q polynomials $\psi(x, \alpha)$,
$\psi(x, \alpha'), \ldots$ and $\varphi(x, \alpha)$, $\varphi(x, \alpha'), \ldots$, respectively. Since each of
these products is a symmetrical function of the roots of g(x) = 0, each
product can be expressed rationally according to the Waring theorem
by the coefficients of g(x) = 0 [and naturally by x], so that $\Psi(x)$ and
$\Phi(x)$ are polynomials in S.
Now $\Psi(x)$ certainly vanishes for at least one root of the irreducible
equation f(x) = 0, as does $\Phi(x)$. Consequently both $\Psi(x)$ and $\Phi(x)$
can be divided without a remainder by f(x), and since f is irreducible
no other divisor than f is possible, as a result of which
$\Psi(x) = f(x)^{\mu}, \quad \Phi(x) = f(x)^{\nu},$
with $\mu + \nu = q$. Comparing the degree of the left and right sides,
we obtain
$mq = \mu p, \quad nq = \nu p$
and from these, since m and n are smaller than p, it follows that p is a
divisor of q. We therefore obtain the theorem:
IV. An irreducible equation of the prime number degree p in a group can
become reducible through substitution of a root of another irreducible equation
in this group only when p is a divisor of the degree of the latter equation.
After this introduction we can turn to the proof of Abel's theorem.
First, however, we will consider what is meant by an algebraically
soluble equation.
An equation of the nth degree f(x) = 0 in a group $\mathfrak{R}$ is called
algebraically soluble when it is soluble by a series of radicals, i.e., when a
root $\omega$ can be determined in the following manner:
1. Determination of the ath root $\alpha = \sqrt[a]{R}$ of an $\mathfrak{R}$-number R, which
is not, however, an ath power of an $\mathfrak{R}$-number, and substitution of $\alpha$
into $\mathfrak{R}$, so that the group $\mathfrak{A} = \mathfrak{R}(\alpha)$ is formed;
2. Determination of the bth root $\beta = \sqrt[b]{A}$ of an $\mathfrak{A}$-number A,
which, however, is not a bth power of an $\mathfrak{A}$-number, and substitution
of $\beta$ into $\mathfrak{A}$, so that the group $\mathfrak{B} = \mathfrak{A}(\beta) = \mathfrak{R}(\alpha, \beta)$ is formed;
3. Determination of the cth root $\gamma = \sqrt[c]{B}$ of a $\mathfrak{B}$-number B, which,
however, is not a cth power of a $\mathfrak{B}$-number, and the substitution of $\gamma$
into $\mathfrak{B}$, so that the group $\mathfrak{C} = \mathfrak{B}(\gamma) = \mathfrak{R}(\alpha, \beta, \gamma)$ is formed, etc., until
these successive substitutions of radicals $\alpha, \beta, \gamma, \ldots$ at length result in
a group to which $\omega$, the sought-for root, belongs and in which f(x)
[since it possesses the divisor $x - \omega$] becomes reducible. It is here
assumed that all the radical exponents a, b, c, ... are prime numbers.
This does not represent a restriction since any extraction of roots with
composite exponents can be reduced to successive extractions of roots
with prime exponents (e.g., $\sqrt[6]{u} = \sqrt{v}$ with $v = \sqrt[3]{u}$).
In order to shorten our task somewhat, we will limit ourselves to
equations f(x) = 0 which possess rational coefficients, so that $\mathfrak{R}$ is
the natural rationality domain, which are, moreover, irreducible in $\mathfrak{R}$,
and which are of the degree n, which is an odd prime number.
Let the first substitution be that of the nth root of unity
$\alpha = \eta = \sqrt[n]{1} = \cos\frac{2\pi}{n} + i\sin\frac{2\pi}{n}.$
According to IV., this substitution still does not make f reducible,
since $\eta$ is a root of the equation $x^{n-1} + x^{n-2} + \cdots + x + 1 = 0$,
the degree of which is < n.
Also, with each substituted radical of our series, which still does not
allow division of f(x), we will also substitute at the same time the
complex conjugate radical. Though this may be superfluous, it can
certainly do no harm.
Let $\lambda = \sqrt[l]{K}$ be the radical the addition of which to the preceding
radicals makes f(x) reducible, so that f(x) is still indivisible in the
group S (to which the number K belongs), but becomes divisible in
$S' = S(\lambda)$:
$f(x) = \psi(x, \lambda) \cdot \varphi(x, \lambda) \cdot \chi(x, \lambda) \cdots.$
Here the factors $\psi, \varphi, \chi, \ldots$ are irreducible polynomials in S' (but
naturally not polynomials in S) whose coefficients are polynomials of
$\lambda$ in S.
Since, according to IV., the prime number n must be a divisor of
the prime number l, l must be equal to n.
The l roots of the equation $x^l = K$, which is irreducible in S
according to Abel's lemma, are
$\lambda_0 = \lambda, \; \lambda_1 = \lambda\eta, \; \lambda_2 = \lambda\eta^2, \ldots, \lambda_\nu = \lambda\eta^\nu, \ldots, \lambda_{n-1} = \lambda\eta^{n-1}.$
Since $\psi(x, \lambda)$ is a divisor of f(x), then $\psi(x, \lambda_\nu)$ also goes into f(x)
without a remainder (cf. the proof of IV.).
Every one of the n functions $\psi(x, \lambda_\nu)$ is irreducible in S'.
[As in the proof of IV., it follows from $\psi(x, \lambda_\nu) = u(x, \lambda_\nu) \cdot v(x, \lambda_\nu)$
that $\psi(x, \lambda) = u(x, \lambda) \cdot v(x, \lambda)$, but this equation is impossible because
$\psi(x, \lambda)$ is irreducible in S'.]
No two of the n functions $\psi(x, \lambda_\nu)$ are equal. [In $\psi(x, \lambda\eta^\mu) =
\psi(x, \lambda\eta^\nu)$, $\lambda$ could, as before, be replaced by the root $\lambda\eta^{n-\mu}$, from which
it would follow that
$\psi(x, \lambda) = \psi(x, \lambda H),$
where H represents the root of unity $\eta^{\nu-\mu}$. Here $\lambda$ could in turn be
replaced by $\lambda H$, which would give
$\psi(x, \lambda H) = \psi(x, \lambda H^2).$
Similarly, it would follow that
$\psi(x, \lambda H^2) = \psi(x, \lambda H^3),$
etc. Thus, we would then have
$\psi(x, \lambda) = \psi(x, \lambda H) = \psi(x, \lambda H^2) = \cdots,$
i.e., also
$\psi(x, \lambda) = \dfrac{\psi(x, \lambda) + \psi(x, \lambda H) + \cdots + \psi(x, \lambda H^{n-1})}{n}.$
The right side of this equation, however, as a symmetrical function of
the n roots $\lambda, \lambda H, \lambda H^2, \ldots$ of $x^n = K$, is a polynomial of x in S, so that
$\psi(x, \lambda)$ would also be a polynomial of x in S. This, however,
contradicts what was stipulated above concerning f(x).]
For these two reasons it follows that f(x) is divisible by the product
$\Psi(x)$ of the n different factors $\psi(x, \lambda), \psi(x, \lambda\eta), \ldots, \psi(x, \lambda\eta^{n-1})$ that
are irreducible in S':
$f(x) = \Psi(x) \cdot U(x),$
where $\Psi$ (as a symmetrical function of the roots of $x^n = K$), and
consequently U as well, are polynomials of x in S. Now, since f(x)
is not reducible in S, U(x) must equal 1 and necessarily
$f(x) = \Psi(x) = \psi(x, \lambda)\psi(x, \lambda_1) \cdots \psi(x, \lambda_{n-1}).$
The postulated divisibility of f(x) for the group S' consequently
reveals itself as a divisibility into linear factors. Thus, if $\omega, \omega_1, \omega_2, \ldots,
\omega_{n-1}$ are the roots and $x - \omega, x - \omega_1, \ldots, x - \omega_{n-1}$ are the linear
factors of f(x), then
$x - \omega = \psi(x, \lambda), \; x - \omega_1 = \psi(x, \lambda\eta), \ldots, x - \omega_{n-1} = \psi(x, \lambda\eta^{n-1}),$
and consequently
$\omega = K_0 + K_1\lambda + K_2\lambda^2 + \cdots + K_{n-1}\lambda^{n-1},$
$\omega_1 = K_0 + K_1\lambda_1 + K_2\lambda_1^2 + \cdots + K_{n-1}\lambda_1^{n-1},$
$\ldots$
$\omega_{n-1} = K_0 + K_1\lambda_{n-1} + K_2\lambda_{n-1}^2 + \cdots + K_{n-1}\lambda_{n-1}^{n-1},$
where all the $K_\nu$ are S-numbers.
Now the equation f(x) = 0 has at least one real root, since it is of
an odd degree. Let this real root be
$\omega = K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1}.$
We distinguish two cases:
I. The base K of the reducible radical $\lambda$ is real;
II. the base K is complex.
Case I. Here we can assume that $\lambda$ is real, since the nth roots of
unity belong to the group S. In that event the complex conjugate
of $\omega$ is
$\bar\omega = \bar K_0 + \bar K_1\lambda + \cdots + \bar K_{n-1}\lambda^{n-1},$
where the complex conjugates $\bar K_\nu$ of $K_\nu$ are also S-numbers. From
$\bar\omega = \omega$ it follows then that
$(K_0 - \bar K_0) + (K_1 - \bar K_1)\lambda + \cdots + (K_{n-1} - \bar K_{n-1})\lambda^{n-1} = 0,$
and from this, taking theorem I into consideration, it follows that
$\bar K_\nu = K_\nu$ for every $\nu$. The magnitudes $K_0, K_1, \ldots, K_{n-1}$ are therefore
also real.
Furthermore,
$\omega_\nu = K_0 + K_1\lambda_\nu + \cdots + K_{n-1}\lambda_\nu^{n-1}$
and
$\omega_{n-\nu} = K_0 + K_1\lambda_{n-\nu} + \cdots + K_{n-1}\lambda_{n-\nu}^{n-1}.$
However, since $\lambda_\nu = \lambda\eta^\nu$ and $\lambda_{n-\nu} = \lambda\eta^{n-\nu} = \lambda\eta^{-\nu}$ are complex
conjugates, it follows that $\omega_\nu$ and $\omega_{n-\nu}$ are also complex conjugates,
i.e.:
The equation f(x) = 0 possesses one real root and n − 1 paired conjugate
complex roots ($\omega_1$ and $\omega_{n-1}$, $\omega_2$ and $\omega_{n-2}$, etc.).
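Case II. In this case we substitute, in addition to the reducible
radical $\lambda = \sqrt[n]{K}$, the complex conjugate $\bar\lambda = \sqrt[n]{\bar K}$ with the result
that the real magnitude $\Lambda = \lambda\bar\lambda$ is also substituted.
If the substitution of $\Lambda = \sqrt[n]{K\bar K}$ alone (i.e., without $\lambda$) were
sufficient to make f(x) reducible, this would give us the situation of
Case I. We may therefore assume that f(x) is still irreducible in
$S(\Lambda)$ and does not become reducible until the additional substitution
of $\lambda$.
From
$\omega = K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1}$
it follows that
$\bar\omega = \bar K_0 + \bar K_1\bar\lambda + \cdots + \bar K_{n-1}\bar\lambda^{n-1}$
and from this, since $\bar\omega = \omega$ and $\bar\lambda = \Lambda/\lambda$, that
$K_0 + K_1\lambda + \cdots + K_{n-1}\lambda^{n-1} = \bar K_0 + \bar K_1\dfrac{\Lambda}{\lambda} + \cdots + \bar K_{n-1}\left(\dfrac{\Lambda}{\lambda}\right)^{n-1}.$
In this equation all of the magnitudes with the exception of $\lambda$ belong
to the group $S(\Lambda)$, and since the equation $x^n = K$ (according to
Abel's lemma) is irreducible in this group, we are able to replace $\lambda$ in
the above equation by any root $\lambda_\nu$ of $x^n = K$.
If we do this and keep in mind that
$\dfrac{\Lambda}{\lambda_\nu} = \dfrac{\lambda\bar\lambda}{\lambda\eta^\nu} = \bar\lambda\eta^{-\nu} = \bar\lambda_\nu,$
we obtain
$K_0 + K_1\lambda_\nu + \cdots + K_{n-1}\lambda_\nu^{n-1} = \bar K_0 + \bar K_1\bar\lambda_\nu + \cdots + \bar K_{n-1}\bar\lambda_\nu^{n-1}$
or
$\omega_\nu = \bar\omega_\nu.$
Thus, all the roots of f(x) = 0 are real.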
The combination of the results of I. and II. yields the
Kronecker* theorem: An algebraically soluble equation of an odd
degree that is a prime and which is irreducible in the natural rationality
domain possesses either only one real root or only real roots.
Kronecker's theorem proves at the same time that an equation of
higher than the fourth degree cannot be solved generally by algebraic
means.
The simple fifth-degree equation
$x^5 - ax - b = 0,$
for example, cannot be solved algebraically when a and b are positive
integers that are divisible by a prime number p, b is indivisible by $p^2$,
and when $4^4 a^5 > 5^5 b^4$.
According to Schoenemann's theorem the equation is irreducible.
Sturm's theorem (No. 24) proves that it possesses three real roots and
two complex roots. Consequently, the equation is algebraically
insoluble according to Kronecker's theorem.
In exactly the same way it can be shown that
$x^7 - ax - b = 0$
is algebraically insoluble when $6^6 a^7 > 7^7 b^6$, etc.
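A concrete instance makes the conditions tangible. The sketch below checks them for the sample values a = 4, b = 2, p = 2, which are chosen here for illustration and do not come from the book. Counting sign changes at integer points is only a crude stand-in for Sturm's theorem: it shows at least three real roots, and since the derivative 5x^4 − a has only two real zeros there can be no more than three.

```python
def satisfies_quintic_condition(a: int, b: int, p: int) -> bool:
    """Check the stated hypotheses for x^5 - a*x - b (p is assumed prime)."""
    divisibility = (
        a > 0 and b > 0
        and a % p == 0 and b % p == 0
        and b % (p * p) != 0
    )
    return divisibility and 4**4 * a**5 > 5**5 * b**4

def quintic(a: int, b: int, x: int) -> int:
    return x**5 - a * x - b

# Hypothetical sample: x^5 - 4x - 2 with p = 2.
condition_holds = satisfies_quintic_condition(4, 2, 2)

# Evaluate at the integers -3..3 and count changes of sign between neighbours.
values = [quintic(4, 2, x) for x in range(-3, 4)]
sign_changes = sum(1 for u, v in zip(values, values[1:]) if u * v < 0)
```

With these values the inequality reads 262144 > 50000, and the three sign changes locate the three real roots between −2 and −1, between −1 and 0, and between 1 and 2.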
* Leopold Kronecker (1823-1891), a German mathematician.
The Hermite-Lindemann Transcendence Theorem
The expression
$A_1 e^{\alpha_1} + A_2 e^{\alpha_2} + A_3 e^{\alpha_3} + \cdots,$
in which the coefficients A are algebraic numbers differing from zero and in
which the exponents $\alpha$ are algebraic numbers differing from each other,
cannot equal zero.
This extremely important theorem (see below) was proved in 1882
by the German mathematician Lindemann (in the Berliner
Sitzungsberichte) after the French mathematician Hermite (1822-1901), in
vol. 77 of the Comptes rendus in 1873, had proved the special case in
which the coefficients and exponents were rational integers.
Lindemann's proof, which required a great many higher mathematical
tools, was simplified to such an extent, first (1885, Berliner
Sitzungsberichte) by K. Weierstrass (1815-1897), then (1893, Mathematische
Annalen, vol. 43) by P. Gordan (1837-1912), that the proof is now
generally accessible. The proof is presented here essentially in the
form given to it in his textbook of algebra by H. Weber (1842-1913).
The proof is indirect. We assume that there are l algebraic
numbers $A_1, A_2, \ldots, A_l$ and l algebraic numbers $\alpha_1, \alpha_2, \ldots, \alpha_l$
differing from one another that satisfy the equation
(1) $A_1 e^{\alpha_1} + A_2 e^{\alpha_2} + \cdots + A_l e^{\alpha_l} = 0,$
and we show that this assumption leads to a contradiction. The
demonstration is divided into four steps.
I. We consider the coefficients A as roots of a real equation
$\mathfrak{A}(x) = 0$ with rational coefficients the degree of which, L, will
generally be greater than l. Let the roots of this equation be
$A_1, A_2, \ldots, A_l, \ldots, A_L$. We form all the possible l-termed
expressions $A_r e^{\alpha_1} + A_s e^{\alpha_2} + \cdots$ [totaling $L(L - 1)(L - 2) \cdots (L - l + 1)$
elements], where $A_r, A_s, \ldots$ are any l components of the series
$A_1, A_2, \ldots, A_L$, and we multiply these expressions together, always
combining each of the members with the same exponential factor $e^{\beta}$.
The resulting product has the form
$\Pi' = A'_1 e^{\beta_1} + A'_2 e^{\beta_2} + \cdots + A'_m e^{\beta_m},$
where the A' are nonevanescent magnitudes.
[That the coefficients A' obtained by multiplying out and combining
cannot all vanish is proved in the following manner. We call the
first of the two complex numbers x + iy and X + iY the "smaller"
when either x < X or x = X if y is at the same time < Y. Now the
product $\Pi'$ consists only of factors of the form $F_\nu = P_\nu e^{p_\nu} + Q_\nu e^{q_\nu} +
R_\nu e^{r_\nu} + \cdots$, where none of the coefficients P, Q, R vanishes,
and we can consider the terms as being arranged in such a manner
that $p_\nu < q_\nu < r_\nu < \cdots$. On multiplying the factors $F_\nu$ the
exponent $p_1 + p_2 + p_3 + \cdots$ of the first term obtained is then the
smallest of all the exponents obtained and occurs only once.
Consequently, at the very least the first term of the multiplied-out product
differs from zero, which was what we set out to prove.]
The coefficients A' are not changed by transpositions of the
magnitudes $A_1, A_2, \ldots, A_L$; in other words, they are symmetrical
functions of the roots of $\mathfrak{A}(x) = 0$, and, therefore, according to the
principal theorem concerning symmetrical functions, are rational
numbers.
Since the left side of (1) is also among the factors of $\Pi'$,
$\Pi' = 0.$
We multiply this equation by the common denominator of the A'
and obtain the new equation
(2) $B_1 e^{\beta_1} + B_2 e^{\beta_2} + \cdots + B_m e^{\beta_m} = 0,$
where the $\beta$ are different algebraic numbers and the coefficients B are
nonevanescent rational integers.
II. Let us consider the exponents $\beta$ as roots of an algebraic equation
$\mathfrak{B}(x) = 0$ with rational coefficients of degree M, with M generally
greater than m, and let us in the usual way think of the equation as
being free of identical roots. We form the $M(M - 1)(M - 2) \cdots
(M - m + 1)$ m-termed sums
$B_1 e^{v\beta_r} + B_2 e^{v\beta_s} + \cdots,$
where v is a variable and $\beta_r, \beta_s, \ldots$ are any m roots of $\mathfrak{B}(x) = 0$, and
multiply these sums by each other, once again combining terms with
the same exponential factor $e^{v\gamma}$. The resulting product has the form
$\Pi = C_1 e^{v\gamma_1} + C_2 e^{v\gamma_2} + \cdots + C_n e^{v\gamma_n},$
where the coefficients C are nonevanescent rational integers and $\gamma$
represents different algebraic numbers.
The product $\Pi$ is a symmetrical function of the roots of $\mathfrak{B}(x) = 0$.
Consequently, the coefficients of the expansion of $\Pi$ according to the
powers of v are also symmetrical functions of those roots; thus, for
example, the coefficient $k_\nu$ of $v^\nu$:
$k_\nu = (C_1\gamma_1^\nu + C_2\gamma_2^\nu + \cdots + C_n\gamma_n^\nu)/\nu!.$
Every coefficient $k_\nu$ is therefore a rational number. Accordingly, if
g(x) is an integral rational function of x with coefficients that are rational
integers, the sum $\sum_{s=1}^{n} C_s\, g(\gamma_s)$ is rationally composed of the coefficients
$k_\nu$ and is consequently a rational number.
Now since the product $\Pi$ for v = 1 contains the factor
$B_1 e^{\beta_1} + B_2 e^{\beta_2} + \cdots + B_m e^{\beta_m}$, which is equal to zero according to (2), the
product for v = 1 is also equal to zero, and we obtain the equation
(3) $C_1 e^{\gamma_1} + C_2 e^{\gamma_2} + \cdots + C_n e^{\gamma_n} = 0,$
in addition to which for every integral rational function g(x) with
integral rational coefficients
(3a) $C_1 g(\gamma_1) + C_2 g(\gamma_2) + \cdots + C_n g(\gamma_n)$
is a rational number.
III. We consider the exponents $\gamma_1, \gamma_2, \ldots, \gamma_n$ as roots of an
algebraic equation
$x^N + r_1 x^{N-1} + r_2 x^{N-2} + \cdots + r_N = 0$
with rational coefficients of degree $N \ge n$, possessing no identical
roots.
We multiply this equation by the Nth power of the common
denominator H of the coefficients $r_1, r_2, \ldots$ and obtain
$(Hx)^N + Hr_1(Hx)^{N-1} + H^2 r_2(Hx)^{N-2} + \cdots = 0$
or, if we write X instead of Hx and denote the integers $Hr_1, H^2 r_2, H^3 r_3, \ldots$
by $g_1, g_2, g_3, \ldots$,
$f(X) = X^N + g_1 X^{N-1} + g_2 X^{N-2} + \cdots + g_N = 0.$
If $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ are the roots of this equation, then
$f(X) = (X - \Gamma_1)(X - \Gamma_2) \cdots (X - \Gamma_N).$
The roots $\Gamma$ possess the n values $\Gamma_1 = H\gamma_1, \Gamma_2 = H\gamma_2, \ldots,
\Gamma_n = H\gamma_n$.
Since $\Gamma$ represents integral algebraic numbers, then, as a result of
(3a),
(3b) $C_1 g(\Gamma_1) + C_2 g(\Gamma_2) + \cdots + C_n g(\Gamma_n)$
is a rational integer.
Besides f(X) we will consider the function
$\varphi(X) = \dfrac{f(X)}{X - \Gamma_1} + \dfrac{f(X)}{X - \Gamma_2} + \cdots + \dfrac{f(X)}{X - \Gamma_N}$
$= (X - \Gamma_2)(X - \Gamma_3) \cdots (X - \Gamma_N)
+ (X - \Gamma_1)(X - \Gamma_3)(X - \Gamma_4) \cdots (X - \Gamma_N) + \cdots$
$= NX^{N-1} + N_1 X^{N-2} + \cdots,$
which is not evanescent for any of the values $\Gamma_1, \Gamma_2, \ldots, \Gamma_N$, and the
coefficients of which $N, N_1, \ldots$ (as symmetrical functions of the roots
$\Gamma_1, \Gamma_2, \ldots, \Gamma_N$ of f(X) = 0) are rational integers.
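If the sum
$C_1\varphi(\Gamma_1) + C_2\varphi(\Gamma_2) + \cdots + C_n\varphi(\Gamma_n)$
should by chance equal zero, we select the positive integral exponent
h (< n) in such a manner that the (integral) sum
$G = C_1\Gamma_1^h\varphi(\Gamma_1) + C_2\Gamma_2^h\varphi(\Gamma_2) + \cdots + C_n\Gamma_n^h\varphi(\Gamma_n) \ne 0.$
[Such an exponent must exist, because otherwise the n linear
homogeneous equations
$1 \cdot x_1 + 1 \cdot x_2 + \cdots + 1 \cdot x_n = 0,$
$\Gamma_1 \cdot x_1 + \Gamma_2 \cdot x_2 + \cdots + \Gamma_n \cdot x_n = 0,$
$\Gamma_1^2 \cdot x_1 + \Gamma_2^2 \cdot x_2 + \cdots + \Gamma_n^2 \cdot x_n = 0,$
$\ldots$
$\Gamma_1^{n-1} \cdot x_1 + \Gamma_2^{n-1} \cdot x_2 + \cdots + \Gamma_n^{n-1} \cdot x_n = 0$
would exist for the n nonevanescent "unknowns" $x_1 = C_1\varphi(\Gamma_1), \ldots,
x_n = C_n\varphi(\Gamma_n)$. This, however, is impossible, since then the
determinant
$\begin{vmatrix} 1 & 1 & \cdots & 1 \\ \Gamma_1 & \Gamma_2 & \cdots & \Gamma_n \\ \Gamma_1^2 & \Gamma_2^2 & \cdots & \Gamma_n^2 \\ \vdots & \vdots & & \vdots \\ \Gamma_1^{n-1} & \Gamma_2^{n-1} & \cdots & \Gamma_n^{n-1} \end{vmatrix}$
of the equation system would have to disappear; however, this
determinant represents the product of all the differences $\Gamma_r - \Gamma_s$, in
which r > s, and, in accordance with the above, none of which
disappear.]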
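The determinant identity invoked in the bracket can be checked on small integer examples. The following is a minimal sketch using exact fractions; the function names and the sample values 2, 3, 5, 7 are hypothetical, standing in for the roots Γ.

```python
from fractions import Fraction

def determinant(matrix):
    """Determinant by Gaussian elimination with exact rational arithmetic."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det          # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det

def power_matrix(values):
    """Row k holds the k-th powers of the values, k = 0, ..., n-1."""
    return [[v**k for v in values] for k in range(len(values))]

def product_of_differences(values):
    """Product of values[r] - values[s] over all pairs with r > s."""
    result = 1
    for r in range(len(values)):
        for s in range(r):
            result *= values[r] - values[s]
    return result

sample = [2, 3, 5, 7]
det_value = determinant(power_matrix(sample))
diff_product = product_of_differences(sample)
```

For distinct values both quantities agree and are nonzero; repeating a value makes both vanish, which is why the argument needs the Γ to be different from one another.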
IV. Now we put the fundamental property of the exponential
function, the series expansion for $e^x$, into the form most suited for
our proof.
This is
$e^x = 1 + x + \dfrac{x^2}{2!} + \cdots + \dfrac{x^\nu}{\nu!} + \cdots$
We multiply this equation by $H^\nu \nu!$ and obtain (Hx = X)
$e^x \nu! H^\nu = H^\nu \nu! + \nu H^{\nu-1}(\nu - 1)!\,X + \binom{\nu}{2} H^{\nu-2}(\nu - 2)!\,X^2 + \cdots
+ X^\nu + X^\nu\left[\dfrac{x}{\nu + 1} + \dfrac{x^2}{(\nu + 1)(\nu + 2)} + \cdots\right].$
In order to write this formula more conveniently, we introduce the
symbol $\mathfrak{S}$, which will be defined by the following direction:
A function $F(\mathfrak{S})$ shall be considered the expression obtained when
$F(\mathfrak{S})$, on the assumption that $\mathfrak{S}$ is a number, is transformed in the
usual way into a power series of $\mathfrak{S}$ and $\mathfrak{S}^\nu$ is replaced by $\nu! H^\nu$ at the
end of expansion.
Our formula can then be written in the simple form:
$e^x \mathfrak{S}^\nu = (\mathfrak{S} + X)^\nu + X^\nu \cdot [\;].$
If we then designate the absolute magnitude of x as $\xi$, the absolute
magnitude of [ ] is at most
$\dfrac{\xi}{\nu + 1} + \dfrac{\xi^2}{(\nu + 1)(\nu + 2)} + \cdots$
and therefore certainly smaller than
$1 + \xi + \dfrac{\xi^2}{2!} + \cdots = e^{\xi}.$
If $\varepsilon$ is understood to be a magnitude the absolute value of which is a
proper fraction, we therefore obtain
(4) $e^x \mathfrak{S}^\nu = (X + \mathfrak{S})^\nu + \varepsilon e^{\xi} X^\nu.$
We will immediately extend this somewhat further. Let
$V(X) = X^k + K_1 X^{k-1} + K_2 X^{k-2} + \cdots + K_k$
represent an integral rational function of X with integral rational
coefficients. We form (4) for $\nu = k, k - 1, k - 2, \ldots$, multiply the
resulting equations by $1, K_1, K_2, \ldots$, and add. This gives us
(5) $e^x V(\mathfrak{S}) = V(X + \mathfrak{S}) + e^{\xi}\,\overline{V}(X),$
with
(5a) $\overline{V}(X) = \varepsilon_0 X^k + \varepsilon_1 K_1 X^{k-1} + \varepsilon_2 K_2 X^{k-2} + \cdots,$
where the absolute values of the magnitudes $\varepsilon_\kappa$ are proper fractions.
If $\Lambda_1, \Lambda_2, \ldots$ represent the roots of V(X) = 0 and d represents the
greatest of the k values $|X| + |\Lambda_\kappa|$, it follows from
$V(X) = (X - \Lambda_1)(X - \Lambda_2) \cdots$
that the absolute magnitude of $\overline{V}(X)$ [like that of V(X)] is smaller
than $d^k$:
(5b) $|\overline{V}(X)| < d^k.$
We apply the results (5), (5a), (5b) to the function
$V(X) = F(X)^q\,\Phi(X),$
in which
$F(X) = X^h f(X), \quad \Phi(X) = X^h \varphi(X),$
q = p − 1, and p is a preliminarily selected, still undetermined prime
number. Since the degree of F(X) is h + N, and the degree of $\Phi(X)$
is h + N − 1, V(X) is of the degree k = (h + N)q + h + N − 1.
Equation (5) is now transformed into
$e^x V(\mathfrak{S}) = V(X + \mathfrak{S}) + \varepsilon e^{\xi} d^k,$
where d is the greatest of the k values $|X| + |\Lambda_\kappa|$ and $\varepsilon$ is a number
whose absolute magnitude is a proper fraction.
We now choose for x and X the values $\gamma_\nu$ and $\Gamma_\nu$, respectively ($\nu$ is
any one of the numbers 1 to n). Then $\xi$ is the absolute magnitude
$\xi_\nu$ of $\gamma_\nu$ and $d = d_\nu$ is the greatest of the k sums $|\Gamma_\nu| + |\Lambda_\kappa|$.
If D then represents the greatest of the 2n numbers $d_\nu^{N+h}$ and
$e^{\xi_\nu} d_\nu^{N+h-1}$, then $D \ge d_\nu^{N+h}$ and $D \ge e^{\xi_\nu} d_\nu^{N+h-1}$, and
consequently
$D^q \cdot D \ge \left(d_\nu^{N+h}\right)^q \cdot e^{\xi_\nu} d_\nu^{N+h-1}$
or
$D^p \ge e^{\xi_\nu} d_\nu^k$
must be true, and we obtain the somewhat simpler formula
(6) $e^{\gamma_\nu} V(\mathfrak{S}) = V(\Gamma_\nu + \mathfrak{S}) + \eta_\nu D^p,$
where $|\eta_\nu| < 1$.
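The expansion of $V(\Gamma_\nu + \mathfrak{S})$ according to the powers of $\mathfrak{S}$ gives us
$V(\Gamma_\nu + \mathfrak{S}) = \psi_0\mathfrak{S}^q + \psi_1\mathfrak{S}^{q+1} + \psi_2\mathfrak{S}^{q+2} + \cdots,$
where the coefficients $\psi$ are integral rational functions of $\Gamma_\nu$ with
integral rational coefficients. In particular,
$\psi_0 = \psi_0(\Gamma_\nu) = \Phi(\Gamma_\nu)^p.$
[For $\nu = 1$, for example,
$F(\Gamma_1 + \mathfrak{S}) = (\Gamma_1 + \mathfrak{S})^h\,[\mathfrak{S} \cdot (\mathfrak{S} + \Gamma_1 - \Gamma_2) \cdot (\mathfrak{S} + \Gamma_1 - \Gamma_3) \cdots]$
$= \Gamma_1^h(\Gamma_1 - \Gamma_2)(\Gamma_1 - \Gamma_3) \cdots (\Gamma_1 - \Gamma_N) \cdot \mathfrak{S} + \cdots$
$= \Gamma_1^h\varphi(\Gamma_1) \cdot \mathfrak{S} + \cdots$
and
$\Phi(\Gamma_1 + \mathfrak{S}) = (\Gamma_1 + \mathfrak{S})^h\varphi(\Gamma_1 + \mathfrak{S}) = \Gamma_1^h\varphi(\Gamma_1) + \cdots,$
consequently
$V(\Gamma_1 + \mathfrak{S}) = \Gamma_1^{hp}\varphi(\Gamma_1)^p \cdot \mathfrak{S}^q + \cdots = \Phi(\Gamma_1)^p \cdot \mathfrak{S}^q + \cdots.$]
If we introduce this expansion into (6), we finally obtain
$e^{\gamma_\nu} V(\mathfrak{S}) = \psi_0(\Gamma_\nu)\mathfrak{S}^q + \psi_1(\Gamma_\nu)\mathfrak{S}^p + \psi_2(\Gamma_\nu)\mathfrak{S}^{p+1} + \cdots + \eta_\nu D^p.$
This formula, multiplied by $C_\nu$, we then form for all $\nu$ from 1 through
n, and we add the resulting n equations.
According to (3), we then obtain
(7) $0 = G_0\mathfrak{S}^q + G_1\mathfrak{S}^p + G_2\mathfrak{S}^{p+1} + \cdots + \Delta D^p,$
where
$G_r = C_1\psi_r(\Gamma_1) + C_2\psi_r(\Gamma_2) + \cdots + C_n\psi_r(\Gamma_n)$
is, according to (3b), a rational integer and $\Delta$ is a number the absolute
magnitude of which does not exceed the n-fold value of the maximum
|C|-value.
We now replace $\mathfrak{S}^r$ with $H^r r!$, divide (7) by the then universally
common factor $H^q$, abbreviate D/H as E, and combine all the terms
containing the factor p!, and we obtain
(8) $G_0\,q! + G'\,p! = \Lambda E^p,$
where G' is an integer and $\Lambda = -H\Delta$ (since $D^p/H^q = H E^p$), so that the
absolute magnitude of $\Lambda$ has a bound that does not depend on p.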
Now we compare
$G_0 = C_1\Phi(\Gamma_1)^p + C_2\Phi(\Gamma_2)^p + \cdots + C_n\Phi(\Gamma_n)^p$
with
$G = C_1\Phi(\Gamma_1) + C_2\Phi(\Gamma_2) + \cdots + C_n\Phi(\Gamma_n),$
the latter of which, according to our assumption concerning h, differs
from zero.
If we expand $G^p$ according to the polynomial theorem, every term
of the expansion, with the exception of the n terms $C_\nu^p\,\Phi(\Gamma_\nu)^p$, is the
p-multiple of an integral algebraic number, and, therefore,
(9) $G^p = [C_1^p\,\Phi(\Gamma_1)^p + \cdots + C_n^p\,\Phi(\Gamma_n)^p] + \mu p,$
where $\mu$ is an integral algebraic number (which is, in fact, integral and
rational).
Now according to Fermat's theorem* every difference $C_\nu^p - C_\nu$, as
well as $G^p - G$, is an integral multiple $c_\nu p$ and $gp$, respectively, of p.
Accordingly, (9) is transformed into
$G + gp = (C_1 + c_1 p)\Phi(\Gamma_1)^p + \cdots + (C_n + c_n p)\Phi(\Gamma_n)^p + \mu p$
$= C_1\Phi(\Gamma_1)^p + \cdots + C_n\Phi(\Gamma_n)^p + \mu' p = G_0 + \mu' p,$
where $\mu'$ is also integral and algebraic.
This equation simplifies into
$G_0 = G + g'p,$
where $g' = g - \mu'$ is an integral algebraic number, and is also an
integral rational number, as a result of $g' = (G_0 - G)/p$.
If we introduce this value into (8), we obtain
$G\,q! + g'\,p! + G'\,p! = \Lambda E^p$
or, if the integer G' + g' is designated as $\mathfrak{G}$,
(10) $G + \mathfrak{G}p = \Lambda\dfrac{E^p}{q!}.$
We now choose a prime number p so large that (1) p > |G| and
(2) the absolute magnitude of the right side of (10) is smaller than 1.
* Fermat's theorem: For every integer g and every prime number p the difference
$g^p - g$ is divisible by p.
Proof. The theorem is self-evident if g is divisible by p. For every g that
is indivisible by p the theorem follows directly from the congruences (1a) and
(2a) of No. 19, if g is substituted for D there and the congruences are squared.
In both cases $g^{p-1} \equiv 1 \bmod p$ is obtained, and from this $g^p \equiv g \bmod p$.
Equation (10) then contains a contradiction. On the left side of
the equation there is an integer that is indivisible by p (because
$G \ne 0$ and |G| < p) and is thus not equal to zero, while on the right there
is a number whose absolute magnitude is less than 1. This is impossible.
Consequently, the initial equation (1) is also impossible and Lindemann's
theorem is proved.
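Two ingredients of step IV lend themselves to a quick numerical check: Fermat's theorem from the footnote, and the fact that a fixed power base is eventually overtaken by the factorial, so that a suitable prime p exists. The sketch below uses hypothetical sample values for G, Λ, and E; in the proof these are determined by the assumed relation (1) and are not known explicitly.

```python
from fractions import Fraction
from math import factorial

def is_prime(n: int) -> bool:
    """Trial division; adequate for the small numbers used here."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n**0.5) + 1))

def fermat_holds(g: int, p: int) -> bool:
    """True when g^p - g is divisible by p."""
    return pow(g, p, p) == g % p

def smallest_suitable_prime(G: int, lam: int, E) -> int:
    """Smallest prime p with p > |G| and |lam| * E^p / (p-1)! < 1."""
    p = 2
    while True:
        if (is_prime(p) and p > abs(G)
                and abs(lam) * Fraction(E) ** p < factorial(p - 1)):
            return p
        p += 1

# Fermat's theorem for a range of integers g and several primes p.
fermat_ok = all(fermat_holds(g, p)
                for p in (2, 3, 5, 7, 11, 13)
                for g in range(-20, 21))

# Hypothetical magnitudes: G = 3, Lambda = 2, E = 5.
chosen_p = smallest_suitable_prime(3, 2, 5)
```

The function `smallest_suitable_prime` only illustrates that such a prime can be found by searching upward; the proof itself needs nothing more than its existence.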
The inferences that can be drawn from Lindemann's theorem are
amazing. Here we present only a few:
1. The transcendence of e: The Euler number e is transcendent, i.e.,
it is not an algebraic number. (In other words, it cannot be a root
of an algebraic equation with rational coefficients.)
2. The transcendence of $\pi$: The Archimedes (Ludolph) number $\pi$ is
transcendent.
According to Euler (No. 13), there exists the equation
$e^{i\pi} + 1 = 0.$
According to Lindemann's theorem the exponent $i\pi$ cannot, therefore,
be an algebraic number. Consequently, it is also impossible for
$\pi$ to be an algebraic number. (If $\pi$ were algebraic, then the
product of the two algebraic numbers i and $\pi$ would have to be algebraic.)
Thus, the ancient question of squaring the circle is answered, though
the answer is negative:
It is impossible to draw with a compass and straight-edge a square that is
equal in area to a given circle.
If, for example, we choose the radius of the given circle in such a
manner that it is equal to the unit length, the area of the circle is $\pi$
and the desired side of the square $\sqrt{\pi}$. If, however, $\sqrt{\pi}$ could be
drawn with compass and straight-edge, then the square $\pi$ of this
segment could also be constructed, and, according to No. 36, it would
have to be the root of an algebraic equation with rational coefficients
(whose degree would be a power of 2). However, $\pi$ is transcendent.
3. The exponential curve y = ex passes through no algebraic point of the
plane except the point 0| 1.
(An algebraic point is a point whose coordinates x and y are both
algebraic numbers.) Since algebraic points are omnipresent in
densely concentrated quantities within the plane, the exponential
curve accomplishes the remarkably difficult feat of winding between
all these points without touching any of them.
The same is, naturally, also true of the logarithmic curve $y = \ln x$.
4. The sine curve y = sin x also passes through no algebraic points of the
plane except the lattice point 0|0.
If, for example, $\alpha|\beta$ were an algebraic point situated on the sine
curve, $\beta$ would be equal to $\sin\alpha$ or, since $2i\sin\alpha = e^{i\alpha} - e^{-i\alpha}$,
$e^{i\alpha} - e^{-i\alpha} - 2i\beta = 0$. However, according to Lindemann's theorem,
this equation cannot exist for algebraic numbers $\alpha$, $\beta$.
Planimetric Problems
27 Euler's Straight Line
In all triangles the center of the circumscribed circle, the point of intersection
of the medians, and the point of intersection of the altitudes are situated in this
order in a straight line—the Euler line—and are spaced in such a manner that
the altitude intersection is twice as far from the median intersection as the
center of the circumscribed circle is.
Leonhard Euler (1707-1783) was one of the greatest and most
fertile mathematicians of all time. His writings comprise 45 volumes
and over 700 papers, most of them long ones, published in periodicals.
The above theorem is among the results of the paper "Solutio
facilis problematum quorundam geometricorum difficillimorum,"
which appeared in the journal Novi commentarii Academiae Petropolitanae
(ad annum 1765).
The following proof of the Euler theorem is distinguished by its
great simplicity.
In the triangle ABC let M be the midpoint of side AB, S the median
intersection, which lies on CM, so that
(1) $SC = 2 \cdot SM$,
and U the center of the circle of circumscription, lying on the
perpendicular bisector of AB.
We extend US by SO so that
(2) $SO = 2 \cdot SU$,
and join O to C.
According to (1) and (2) the triangles MUS and COS are similar.
Consequently, $CO \parallel MU$, i.e., $CO \perp AB$, or expressed verbally, the line
connecting the point O with a vertex of the triangle is perpendicular
to the side of the triangle opposite the vertex; consequently, the
connecting line is an altitude of the triangle.
The three altitudes consequently pass through point O. This is, therefore,
the altitude intersection, and Euler's theorem is proved.
Note. Our proof contains at the same time the solution to the
interesting
Problem of Sylvester: To find the resultant of the three vectors
UA, UB, UC acting on the center of the circle of circumscription U of the
triangle ABC.
Fig. 5.
Since UM is half the resultant of the two vectors UA and UB,
CO represents in magnitude and direction the whole resultant of
these vectors. Now, since UO is the resultant of UC and CO, UO is
the resultant we are seeking.
The resultant of the vectors represented by the three radii from the center of
the circle of circumscription to the vertexes of the triangle is the segment extending
from the center of the circle of circumscription to the altitude intersection.
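The two statements above (the spacing SO = 2·SU on the Euler line and Sylvester's resultant UA + UB + UC = UO) can be checked numerically for any particular triangle. The following is only a minimal sketch: points are represented as complex numbers, the function names `circumcenter`, `dot`, and `orthocenter` are illustrative choices, and the orthocenter is computed independently as the intersection of two altitudes so that the check is not circular.

```python
# Numerical check of the Euler line and of Sylvester's resultant.
# Points are complex numbers; all names here are illustrative.

def circumcenter(a, b, c):
    """Center U of the circle through a, b, c (standard coordinate formula)."""
    ax, ay, bx, by, cx, cy = a.real, a.imag, b.real, b.imag, c.real, c.imag
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return complex(ux, uy)

def dot(p, q):
    """Scalar product of two plane vectors given as complex numbers."""
    return (p * q.conjugate()).real

def orthocenter(a, b, c):
    """Intersection O of the altitudes from a and from b (Cramer's rule)."""
    n1, n2 = c - b, c - a            # (X - a).n1 = 0 and (X - b).n2 = 0
    r1, r2 = dot(a, n1), dot(b, n2)
    det = n1.real * n2.imag - n1.imag * n2.real
    x = (r1 * n2.imag - n1.imag * r2) / det
    y = (n1.real * r2 - r1 * n2.real) / det
    return complex(x, y)

A, B, C = 0 + 0j, 4 + 0j, 1 + 3j
U = circumcenter(A, B, C)
S = (A + B + C) / 3                  # median intersection
O = orthocenter(A, B, C)

euler_gap = abs((O - S) - 2 * (S - U))                      # SO = 2*SU
sylvester_gap = abs((A - U) + (B - U) + (C - U) - (O - U))  # UA+UB+UC = UO
print(euler_gap, sylvester_gap)
```

Both printed gaps should vanish up to rounding error; a single numerical instance of course illustrates the theorem rather than proving it.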
James Joseph Sylvester (1814-1897) was an English jurist and
mathematician.
28 The Feuerbach Circle
In every triangle the three midpoints of the sides, the three base points of the
altitudes, and the midpoints of the three altitude sections touching the vertexes
lie on a circle.
This circle was already known to Euler (1765), but is most commonly
called the Feuerbach circle after Karl Feuerbach (1800-1834) [the
uncle of the painter Anselm Feuerbach], who rediscovered it in 1822.
It is also known as the nine-point circle, although it passes through many
other significant points as well as those indicated above.
The proof consists of two steps. In the first we demonstrate that
the circle circumscribing the triangle of the three midpoints of the
sides passes through the base points of the altitudes; and in the second
we show that the circle circumscribing the triangle of the altitude
base points passes through the midpoints of altitude sections.
I. Let ABC represent the prescribed triangle, A', B', C' the
midpoints, respectively, of sides BC, CA, AB. Let H be the base point of
the altitude AH. Then the trapezoid HA'B'C' is isosceles. (A'B', as a
midline of the triangle ABC, is equal to $\tfrac{1}{2}AB$; HC', as the radius of
the Thales circle having the diameter AB, is also equal to $\tfrac{1}{2}AB$.) The
trapezoid is therefore a quadrilateral inscribed in a circle. All of the
altitude base points consequently lie on the circle $\mathfrak{F}$ circumscribing
the triangle A'B'C'.
Fig. 7. Fig. 8.
II. Let the altitudes of the triangle ABC be AH, BK, CL, and O
their point of intersection. We will now show that the center of each
altitude section touching a vertex, let us say section OC, also lies on
$\mathfrak{F}$. For this purpose we consider the triangle OBC, which also has
the altitude bases H, K, L. According to I., the circle $\mathfrak{F}$
circumscribing the altitude base triangle (HKL) of this triangle passes
through the midpoints of the sides of this triangle, i.e., through the
midpoints of OB and OC, which completes the proof.
Corollary. The midpoint F of the Feuerbach circle lies at the center of the
Euler line OU, and the radius f of the Feuerbach circle is equal to one half the
radius of the circle of circumscription of the triangle ABC.
The first of these propositions follows from the fact that the
perpendicular bisectors of the Feuerbach circle chords HA' and KB', as
midlines of the trapezoids UOHA' and UOKB', pass through the
center of OU, and the second, from the fact that the sides of the
triangle A'B'C inscribed in the Feuerbach circle are one half the size
of the sides of the triangle ABC.
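The theorem and its corollary lend themselves to a quick numerical check: all nine points should lie at distance R/2 from the midpoint of OU, where R is the radius of the circle of circumscription. The sketch below assumes points given as complex numbers and takes the altitude intersection from the relation UO = UA + UB + UC of No. 27; the function names are illustrative only.

```python
# Sketch: verify the nine-point (Feuerbach) circle for one triangle.

def circumcenter(a, b, c):
    """Center U of the circle through a, b, c (standard coordinate formula)."""
    ax, ay, bx, by, cx, cy = a.real, a.imag, b.real, b.imag, c.real, c.imag
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    ux = ((ax * ax + ay * ay) * (by - cy) + (bx * bx + by * by) * (cy - ay)
          + (cx * cx + cy * cy) * (ay - by)) / d
    uy = ((ax * ax + ay * ay) * (cx - bx) + (bx * bx + by * by) * (ax - cx)
          + (cx * cx + cy * cy) * (bx - ax)) / d
    return complex(ux, uy)

def foot(p, q, r):
    """Base point of the perpendicular dropped from p to the line qr."""
    t = ((p - q) * (r - q).conjugate()).real / abs(r - q) ** 2
    return q + t * (r - q)

def feuerbach(a, b, c):
    """Return (center, radius, nine points) of the Feuerbach circle."""
    u = circumcenter(a, b, c)
    o = a + b + c - 2 * u                     # altitude intersection (No. 27)
    midpoints = [(b + c) / 2, (c + a) / 2, (a + b) / 2]
    bases = [foot(a, b, c), foot(b, c, a), foot(c, a, b)]
    upper = [(o + a) / 2, (o + b) / 2, (o + c) / 2]
    center = (o + u) / 2                      # center of the segment OU
    radius = abs(a - u) / 2                   # half the circumradius
    return center, radius, midpoints + bases + upper

center, radius, points = feuerbach(0 + 0j, 4 + 0j, 1 + 3j)
deviations = [abs(abs(p - center) - radius) for p in points]
print(max(deviations))
```

The largest deviation should be zero up to rounding error.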
29 Castillon's Problem
To inscribe in a given circle a triangle the sides of which pass through three
given points.
This problem, posed by the Swiss mathematician Cramer, takes its
name from the Italian mathematician Castillon, who solved it in 1776.
(Gabriel Cramer, 1704-1752, in 1750 published his major work
Introduction à l'analyse des lignes courbes algébriques, in which for the first
time, a system of linear equations was solved by means of determinants.
I. F. Salvemini, 1709-1791, took the name Castillon after his place of
birth Castiglione in Tuscany.)
The following simple, though not easily seen, solution of the
Castillon problem stems from the Italian Giordano.
We call the given circle S, the given points A, B, C, the desired
triangle XYZ, and let YZ, ZX, XY pass, respectively, through A, B, C.
Ottaiano in his solution makes use of four auxiliary points. These
are:
I. the end point of the chord parallel to AB and beginning
from X;
II. the point of intersection of the lines Y I and AB;
III. the end point of the chord beginning at X that is parallel to
II C;
IV. the point of intersection of the lines C II and I III.
The construction consists of the following five steps.
1. Construction of auxiliary point II. The angles A II I and
X I Y, as alternate interior angles between parallels, are equal, and the
angles XZY and X I Y are equal because they are inscribed in the same
arc XY. Consequently,
$\angle XZY = \angle A\,\mathrm{II}\,\mathrm{I}$
and therefore BZY II is a quadrilateral inscribed in a circle. It also
follows from this that
$A\,\mathrm{II} \cdot AB = AY \cdot AZ.$
Since, however, the right side of this equation is known to be the
power P of the circle S at A (see p. 152), it follows that
$A\,\mathrm{II} = P/AB$
can be constructed, as a result of which II is known.
2. Construction of auxiliary point IV. The angles Y C IV and
Y X III are corresponding angles between parallels and are
consequently equal, while angles Y I III and Y X III are supplementary
since they are opposite angles in the quadrilateral inscribed in the
circle. Thus, Y I III and Y C IV are also supplementary, and
Y C IV I is a quadrilateral inscribed in a circle. It follows from this
that
$\mathrm{II}\,C \cdot \mathrm{II}\,\mathrm{IV} = \mathrm{II}\,Y \cdot \mathrm{II}\,\mathrm{I}.$
However, since the right side of this equation represents the power $\Pi$
of circle S at II, which, according to 1., is to be regarded as known,
we find
$\mathrm{II}\,\mathrm{IV} = \Pi / \mathrm{II}\,C$
and thus the auxiliary point IV.
3. Determination of the angle I X III = $\omega$. Since angle
A II IV = $\kappa$ is known and since $\omega$ and $\kappa$, having pairwise parallel
sides, are identical, it follows that
$\omega = \kappa.$
4. Construction of the chord I III. We draw through IV a
chord subtending the angle $\omega = \kappa$. The points of intersection of this
chord with S are the remaining points I and III.
5. Construction of the triangle XYZ. We determine X as the
point of intersection of S with the line through III parallel to II IV;
Y as the point of intersection of the line I II with S; and Z as the
point of intersection of the line AY with S.
In comparison to this fairly intricate solution the following
projective solution of the Castillon problem is very simple.
This solution is based upon Steiner's double element construction
(No. 60) and the involution theorem: If a ray is rotated about a fixed
point, its two points of intersection with a circle describe on this circle
(involutional) projective ranges of points (No. 63).
We take any arbitrary point $X_1$ on the given circle S, determine the
(second) point of intersection $Z_1$ of the circle with the secant $BX_1$,
then the (second) point of intersection $Y_1$ of the circle with the secant
$AZ_1$ and, finally, the (second) point of intersection $X'_1$ of the circle
with the secant $CY_1$. Only when $X'_1$ happens to coincide with $X_1$ is
$X_1Y_1Z_1$ the sought-for triangle. This favorable situation will,
however, occur only rarely. We will consider the described
construction as repeated with other starting points $X_2, X_3, \ldots$, giving us
the points $Y_2, Y_3, \ldots$; $Z_2, Z_3, \ldots$; $X'_2, X'_3, \ldots$. According to the
auxiliary theorem each of the fields of points $X_1, X_2, \ldots$; $Y_1, Y_2, \ldots$;
$Z_1, Z_2, \ldots$, and $X'_1, X'_2, \ldots$ is projective with respect to the following
one; consequently,
$(X_1, X_2, \ldots) \barwedge (X'_1, X'_2, \ldots).$
The desired triangle is obtained from the described construction when
the starting point $X_\nu$ coincides with the end point $X'_\nu$ and is accordingly
determined by a double element of this projection. This gives us the
following simple
Construction: We choose any three points $X_1, X_2, X_3$ on S,
draw in the manner described the three corresponding points $X'_1, X'_2, X'_3$,
and determine according to Steiner the double elements $X_r$ and
$X_s$ of the projection on S in which the points $X'_1, X'_2, X'_3$ correspond to
$X_1, X_2, X_3$. Thus, each of the two triangles $X_rY_rZ_r$ and $X_sY_sZ_s$
satisfies the conditions of the Castillon problem.
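The projective idea can be imitated numerically. The sketch below is not Steiner's double element construction; it substitutes an algebraic device, under the assumption that S is the unit circle of the complex plane. There the step from a point x of S to the second intersection of S with the secant through a fixed point p is the fractional linear map x → (x − p)/(p̄x − 1), so the chain X → Z → Y → X' is again fractional linear, and its double elements are the roots of a quadratic equation. All function names are illustrative.

```python
import cmath

def second_point(x, p):
    """Second intersection of the unit circle with the line through x (|x| = 1) and p."""
    return (x - p) / (x * p.conjugate() - 1)

def secant_matrix(p):
    """Matrix of the map x -> (x - p) / (conj(p) * x - 1)."""
    return ((1, -p), (p.conjugate(), -1))

def matmul(m, n):
    """Product of two 2x2 matrices given as nested tuples."""
    return ((m[0][0] * n[0][0] + m[0][1] * n[1][0], m[0][0] * n[0][1] + m[0][1] * n[1][1]),
            (m[1][0] * n[0][0] + m[1][1] * n[1][0], m[1][0] * n[0][1] + m[1][1] * n[1][1]))

def double_elements(a, b, c):
    """Double elements of X -> Z (through B) -> Y (through A) -> X' (through C)."""
    (m11, m12), (m21, m22) = matmul(secant_matrix(c),
                                    matmul(secant_matrix(a), secant_matrix(b)))
    # x = (m11 x + m12) / (m21 x + m22)  <=>  m21 x^2 + (m22 - m11) x - m12 = 0
    root = cmath.sqrt((m22 - m11) ** 2 + 4 * m21 * m12)
    return [(m11 - m22 + root) / (2 * m21), (m11 - m22 - root) / (2 * m21)]

def triangle_from(x, a, b):
    """Complete the triangle XYZ from its vertex X."""
    z = second_point(x, b)
    y = second_point(z, a)
    return x, y, z
```

Only double elements of absolute value 1 lie on S and give actual triangles; if the roots fall off the circle, the sketch reports no real solution for the given points. The degenerate case in which the quadratic coefficient vanishes is not handled.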
Note. In a quite similar manner we are able to prove the
converse of the Castillon problem:
To draw about a circle a triangle the angles of which lie on three given lines.
The construction is based upon the auxiliary theorem:
If a point describes a straight line, the two tangents from the point to a circle
determine upon this circle two (involutional) projective fields of tangents
(No. 63).
We call the given circle S, the given lines a, b, c, the sides of the
desired triangle x, y, z.
We draw any three tangents $x_1, x_2, x_3$ to S; through their points of
intersection with b we draw three more tangents $z_1, z_2, z_3$; through
the points of intersection of the latter with a we draw three new
tangents $y_1, y_2, y_3$, and through their intersections with c three more
tangents $x'_1, x'_2, x'_3$. We draw the double elements $x_r$ and $x_s$ of the
projection defined on S by the homologous triplets $(x_1, x_2, x_3)$ and
$(x'_1, x'_2, x'_3)$. The triangles $x_ry_rz_r$ and $x_sy_sz_s$ obtained from these
double elements are the ones we are seeking.
30 Malfatti's Problem
To draw within a given triangle three circles each of which is tangent to the
other two and to two sides of the triangle.
This famous problem was posed by the Italian mathematician
Malfatti (1731-1807) in 1803 and solved in the tenth volume of the
Memorie di Matematica e di Fisica della Società italiana delle Scienze. This
algebraic-geometric solution can be found, for example, in vol. 123
of Ostwald's Klassiker der exakten Wissenschaften (Supplement). The
purely geometric solution of Malfatti's problem submitted by Jakob
Steiner in 1826 without proof is also described and proved there.
Here we will restrict ourselves to the exposition of the thoroughly
simple solution published by Schellbach in volume 45 of Crelle's Journal.
Let ABC be the given triangle with sides a, b, c, the perimeter 2s and
the angles $\alpha, \beta, \gamma$. Let the Malfatti circles we are seeking (which are
tangent to the arms of the angles $\alpha, \beta, \gamma$) be $\mathfrak{P}, \mathfrak{Q}, \mathfrak{R}$, their midpoints
P, Q, R, and their radii p, q, r. Let the tangents from the angles
A, B, C to $\mathfrak{P}, \mathfrak{Q}, \mathfrak{R}$ be u, v, w.
Fig. 10.
We introduce $\mathfrak{J}$, a circle inscribed in the triangle. Let its center
be J and its radius $\rho$, and let the tangents to it from angles A, B, C be
$a_1, b_1, c_1$, respectively. From the three equations
$b_1 + c_1 = a, \quad c_1 + a_1 = b, \quad a_1 + b_1 = c$
we obtain the values
$a_1 = s - a, \quad b_1 = s - b, \quad c_1 = s - c.$
Since the points P and J lie on the bisector of the angle $\alpha$, it follows
from the ray theorem that
$p : \rho = u : a_1$ or $p = \frac{\rho}{a_1}\,u.$
Similarly we find $q = \frac{\rho}{b_1}\,v.$
We call the points of tangency of $\mathfrak{P}$ and $\mathfrak{Q}$ with AB, U and V and
calculate UV = t. Since PF, the perpendicular dropped from P to
QV, is equal to t, it follows from the right triangle PQF that
$PQ^2 = PF^2 + FQ^2$ or $(p + q)^2 = t^2 + (p - q)^2$
and from this
$UV = t = 2\sqrt{pq}.$
If we then introduce here the values found above for p and q, we obtain
$t = 2\sqrt{uv}\,\frac{\rho}{\sqrt{a_1 b_1}}.$
But it is known that
$\rho^2 = a_1 b_1 c_1 / s.$
This simplifies the value for t to
$UV = t = 2\sqrt{\frac{c_1}{s}}\,\sqrt{uv}.$
Since the side AB of the triangle is composed of the three segments
AU, BV, and UV, we obtain the equation
$u + v + 2\sqrt{\frac{c_1}{s}}\,\sqrt{uv} = c.$
In the same way we obtain for the two other sides of the triangle BC
and CA
$v + w + 2\sqrt{\frac{a_1}{s}}\,\sqrt{vw} = a$
and
$w + u + 2\sqrt{\frac{b_1}{s}}\,\sqrt{wu} = b.$
Taking half the perimeter as the unit length, we obtain somewhat
more simply:
(1)
$v + w + 2\sqrt{a_1}\,\sqrt{vw} = a,$
$w + u + 2\sqrt{b_1}\,\sqrt{wu} = b,$
$u + v + 2\sqrt{c_1}\,\sqrt{uv} = c.$
Now we take the proper fractions a, b, c, u, v, w as squares of the
sines of six acute angles $\lambda, \mu, \nu, \psi, \varphi, \chi$:
$\sin^2\lambda = a, \quad \sin^2\mu = b, \quad \sin^2\nu = c,$
$\sin^2\psi = u, \quad \sin^2\varphi = v, \quad \sin^2\chi = w.$
Then also (since $a + a_1 = s = 1$, $b + b_1 = 1$, $c + c_1 = 1$) $\cos^2\lambda = a_1$,
$\cos^2\mu = b_1$, $\cos^2\nu = c_1$, and the obtained equation triplet (1)
assumes the form:
(2)
$\sin^2\varphi + \sin^2\chi + 2\sin\varphi\sin\chi\cos\lambda = \sin^2\lambda,$
$\sin^2\chi + \sin^2\psi + 2\sin\chi\sin\psi\cos\mu = \sin^2\mu,$
$\sin^2\psi + \sin^2\varphi + 2\sin\psi\sin\varphi\cos\nu = \sin^2\nu.$
Now, for example, let us consider the first of these equations! It is
nothing other than a trigonometric expression of the known relation
($\varphi + \chi = \lambda$) between the angles $\varphi$ and $\chi$ of the two vertexes of a
triangle and the exterior angle $\lambda$ of the third vertex. If, for example,
we take such a triangle with a circle of circumscription of the diameter
1, then the three sides are $\sin\varphi$, $\sin\chi$, $\sin\lambda$, and the cosine theorem
gives the equation
$\sin^2\lambda = \sin^2\varphi + \sin^2\chi + 2\sin\varphi\sin\chi\cos\lambda.$
It then follows from (2) that
$\varphi + \chi = \lambda, \quad \chi + \psi = \mu, \quad \psi + \varphi = \nu$
and from this
$\psi = \sigma - \lambda, \quad \varphi = \sigma - \mu, \quad \chi = \sigma - \nu,$ with $\sigma = \frac{\lambda + \mu + \nu}{2}.$
Thus, we obtain the following simple
Construction:
1. We draw three angles $\lambda, \mu, \nu$ whose sine squares are equal to the
sides of the given triangle (where half the perimeter of the triangle is
the unit length).
2. We draw the half sum
$\sigma = \frac{\lambda + \mu + \nu}{2}$
of the three angles $\lambda, \mu, \nu$ and the three new angles
$\psi = \sigma - \lambda, \quad \varphi = \sigma - \mu, \quad \chi = \sigma - \nu.$
3. We draw the sine squares of the three angles $\psi, \varphi, \chi$. These are
the tangents from the triangle vertexes to the three Malfatti circles.
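The three steps of the construction translate directly into a short computation. The following sketch assumes the sides a, b, c are given as numbers, restores the unit of length by multiplying by s at the end, and obtains the radii from p = (ρ/a₁)u and its analogues (the formula for r is written by analogy with those for p and q). The function names are illustrative.

```python
import math

def malfatti_tangents(a, b, c):
    """Tangent lengths u, v, w from the vertexes A, B, C to the Malfatti circles."""
    s = (a + b + c) / 2
    # Step 1: angles whose sine squares are the sides, with s as unit length.
    lam, mu, nu = (math.asin(math.sqrt(side / s)) for side in (a, b, c))
    # Step 2: half sum and the three new angles.
    sigma = (lam + mu + nu) / 2
    psi, phi, chi = sigma - lam, sigma - mu, sigma - nu
    # Step 3: sine squares, converted back to the original unit.
    return tuple(s * math.sin(t) ** 2 for t in (psi, phi, chi))

def malfatti_radii(a, b, c):
    """Radii p, q, r of the Malfatti circles at A, B, C."""
    s = (a + b + c) / 2
    a1, b1, c1 = s - a, s - b, s - c
    rho = math.sqrt(a1 * b1 * c1 / s)        # inradius
    u, v, w = malfatti_tangents(a, b, c)
    return rho * u / a1, rho * v / b1, rho * w / c1

u, v, w = malfatti_tangents(3.0, 4.0, 5.0)
p, q, r = malfatti_radii(3.0, 4.0, 5.0)
print(u, v, w)
print(p, q, r)
```

A useful consistency check is that the segment UV = c − u − v equals 2√(pq), as required by the tangency of the two circles resting on AB.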
Note. If we are to draw the sine square $m = \sin^2 w$ for a given
angle w, or to draw the angle w (whose sine square equals m) for a
given segment m, we proceed in the following manner:
We draw a semicircle $\mathfrak{H}$ with the diameter HK = 1. We draw the
given angle w at K on KH and from the intersection L of its free side
with $\mathfrak{H}$ we drop the perpendicular LM to HK. Then $HM = m = \sin^2 w$.
Conversely, if m is given and we have to find w, we draw HM = m
on HK, erect at M a perpendicular on HK extending to the intersection
L with $\mathfrak{H}$, and extend LK. Then $\angle HKL = w$.
Proof. From the right triangle HML it follows that
$m = HM = HL \sin HLM = HL \sin w,$
and from the right triangle HKL
$HL = HK \sin w = \sin w.$
Consequently,
$m = \sin^2 w.$
31 Monge's Problem
To draw a circle that cuts three given circles perpendicularly.
The French mathematician Monge (1746-1818) was the founder of
descriptive geometry.
In order to solve the problem, we seek the locus of the centers of all
the circles that are perpendicular to two given circles.
[Two circles are said to intersect perpendicularly when the radii r
and r' drawn to a single point of intersection are perpendicular to
each other; in other words, when they form the base and altitude of a
right triangle the hypotenuse z of which joins the centers of the
circles, so that $r^2 + r'^2 = z^2$ or $z^2 - r^2 = r'^2$. Two circles are
therefore perpendicular to each other when the power* of the one at
the midpoint of the other is equal to the square of the radius of the
other.]
Fig. 12.
Let the given circles be $\mathfrak{K}$ and $\mathfrak{K}'$, their centers K and K', their
radii k and k' (> k), the line joining their centers KK' = l. Let the
circle $\mathfrak{X}$ with the midpoint X and the radius x be perpendicular to
them. Let the center lines KX and K'X be equal to z and z',
respectively. Then $z^2 - k^2$ and $z'^2 - k'^2$ are each equal to $x^2$, so
that
(1) $z^2 - k^2 = z'^2 - k'^2.$
Consequently, both circles $\mathfrak{K}$ and $\mathfrak{K}'$ have the same power at X. We
therefore first attempt to find the locus of the point X at which the
two given circles possess the same power. If X is a point of
this locus and the perpendicular from X intercepts the center line KK'
at the point F, and, moreover, if KF = f and K'F = f', then,
according to the Pythagorean theorem, the square of the perpendicular is
equal to $z^2 - f^2$ as well as to $z'^2 - f'^2$, so that
(2) $z^2 - f^2 = z'^2 - f'^2.$
* By the power of a circle at a point is meant the amount by which the
square of the distance from the center of the circle to the point exceeds the square of the radius of the circle.
In accordance with the secant or chord theorem it can also be represented as
the product of the two segments originating from the point that are generated
by the circle through the point on any secant.
If we subtract (2) from (1) we obtain
(3) $f^2 - k^2 = f'^2 - k'^2,$
i.e., $\mathfrak{K}$ and $\mathfrak{K}'$ possess equal powers at F also. If we figure the distances
f and f' as positive in the directions KK' and K'K, respectively, then
it is always true that
(4) $f + f' = l.$
Equations (3) and (4) give us fixed values for the unknowns f and f'.
Consequently every locus point X lies on the perpendicular erected on
the center line KK' at the fixed point F, and we obtain the
Theorem of the chordal: The locus of the point at which two given
circles possess the same powers is a straight line perpendicular to the line
joining the midpoints of the circles and is known as the chordal or power line
of the two circles.
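The fixed values of f and f' can be written out explicitly. From (3), f² − f'² = k² − k'², and dividing by (4) gives f − f' = (k² − k'²)/l, so that f = (l² + k² − k'²)/(2l). A minimal sketch of this computation follows; the function name `chordal_foot` is an illustrative choice.

```python
def chordal_foot(k, k_prime, l):
    """Signed distances f = KF and f' = K'F of the chordal's base point F."""
    f = (l * l + k * k - k_prime * k_prime) / (2 * l)
    return f, l - f

f, f_prime = chordal_foot(1.0, 2.0, 5.0)
print(f, f_prime)                               # base point F on KK'
print(f * f - 1.0, f_prime * f_prime - 4.0)     # the two powers at F
```

The two printed powers should agree, which is the content of equation (3).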
In the construction of the chordal we distinguish two different cases:
1. The circles intersect. Since both circles have equal powers at each of
their points of intersection, i.e., 0, the points of intersection lie on the
chordal. The chordal of two circles that intersect is the secant of intersection.
2. The circles do not intersect. Here the construction of the chordal
is based upon the
Theorem of Monge : The three chordals of three circles pass through a
point known as the power center of the three circles.
[Proof. Let the circles be I, II, III. We determine the point of
intersection O of the chordals of the two pairs (II, III) and (III, I).
At this point (1) II and III, (2) III and I possess equal powers;
consequently II and I also have the same power at O, i.e., O lies on
the chordal of I and II.]
Thus, to construct the chordal of two nonintersecting circles I and
II, we draw an auxiliary circle III that intersects I and II and the
chordals of the pairs (II, III) and (III, I). The perpendicular from
the intersection of these chordals to the line joining the centers of I and
II is the chordal we are looking for.
From the theorem of the chordal it then follows:
The locus of the centers of all circles that are perpendicular to two given
circles is the chordal of the given circles or, in the event that these circles
intersect, the section of the chordal that lies outside the given circles. (The
powers of the given circles at a single point must be positive!)
The solution of Monge's problem now becomes very simple. We
draw the power center O of the given circles. If it lies outside the
three circles, the circle with the midpoint O and the radius formed by
the tangent from O to one of the given circles intersects perpendicularly
with the given circles. If O is located inside even one of the given
circles, the problem is insoluble.
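In coordinates the solution reduces to two linear equations, since the condition of equal powers with respect to two circles is linear in the coordinates of the point. The sketch below assumes each circle is given as a triple (x, y, r) and that the three centers are not collinear; the function names are illustrative.

```python
import math

def power_center(c1, c2, c3):
    """Point at which the three circles (x, y, r) possess equal powers."""
    (x1, y1, r1), (x2, y2, r2), (x3, y3, r3) = c1, c2, c3
    # Equal power for circles 1, 2 and for circles 1, 3: two linear equations.
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    d1 = (x2 * x2 + y2 * y2 - r2 * r2) - (x1 * x1 + y1 * y1 - r1 * r1)
    a2, b2 = 2 * (x3 - x1), 2 * (y3 - y1)
    d2 = (x3 * x3 + y3 * y3 - r3 * r3) - (x1 * x1 + y1 * y1 - r1 * r1)
    det = a1 * b2 - a2 * b1
    return (d1 * b2 - d2 * b1) / det, (a1 * d2 - a2 * d1) / det

def orthogonal_circle(c1, c2, c3):
    """Circle (x, y, r) cutting all three circles perpendicularly, or None."""
    ox, oy = power_center(c1, c2, c3)
    x1, y1, r1 = c1
    power = (ox - x1) ** 2 + (oy - y1) ** 2 - r1 ** 2
    if power <= 0:               # O inside (or on) the circles: insoluble
        return None
    return ox, oy, math.sqrt(power)

circles = [(0.0, 0.0, 1.0), (4.0, 0.0, 1.0), (0.0, 4.0, 1.0)]
print(orthogonal_circle(*circles))
```

The radius is the tangent length from O, i.e., the square root of the common power, and perpendicularity to each given circle can be confirmed from z² = r² + x².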
32 The Tangency Problem of Apollonius
To draw a circle that is tangent to three given circles.
The circles may also comprise degenerate circles: points or straight
lines.
This celebrated problem was put forth by the greatest mathematician
of the ancient world after Euclid and Archimedes, Apollonius of
Perga (ca. 260-170 B.C.), whose major work Κωνικά extended with an
astonishing comprehensiveness the period's naturally slight knowledge
of conic sections. His treatise De Tactionibus, which contained the
solution of the tangency problem given above, has unfortunately been
lost. François Viète, called Vieta, the greatest French mathematician
of the sixteenth century (1540-1603), attempted about 1600 to restore
the lost treatise of Apollonius and solved the tangency problem by
treating each of its ten special cases individually, deriving each
successive one from the preceding one. In contrast to this the
solutions of Gauss (Complete Works, vol. IV, p. 399), Gergonne
(Annales de Mathématiques, vol. IV), and Petersen (Methoden und
Theorien) solve the general problem.
Here we will restrict ourselves to the exposition of the elegant
solution of Gergonne. Since this proof presupposes, in addition to the
chordal theorems proved in No. 31, a knowledge of the properties of
similarity points and polars, we will begin with a brief discussion of
these.
Similarity Points
When we refer to the external or positive and internal or negative
similarity points, respectively, of two circles $\mathfrak{K}$ and $\mathfrak{K}'$ with the centers
M and M' and the radii r and r', we mean the points A and J,
respectively, on the line MM' joining the centers for which
$\frac{MA}{M'A} = +\frac{r}{r'}$ and $\frac{MJ}{M'J} = -\frac{r}{r'}$, respectively.*
* The segment ratio AX:BX is considered positive if X is situated outside
AB and negative if X is inside AB.
It follows directly from the ray theorem that:
The line connecting the end points of two parallel {oppositely directed) radii
of two circles passes through the external (internal) similarity point.
In particular, the external (internal) common tangents of the two
circles pass through the external (internal) similarity point. We will
further designate the external similarity point of the circles $\mathfrak{K}$ and $\mathfrak{K}'$
as $+\mathfrak{K}\mathfrak{K}'$, the internal one as $-\mathfrak{K}\mathfrak{K}'$, and, if the sign is not determined,
we will indicate the similarity point as $\varepsilon\mathfrak{K}\mathfrak{K}'$. The symbol $\varepsilon\varepsilon'\varepsilon''\ldots$ is
to be understood as meaning plus when the number of minus signs
occurring among the symbols $\varepsilon, \varepsilon', \varepsilon'', \ldots$ is even and minus when it
is odd.
The similarity points of three circles are described by the
Theorem of d'Alembert:* If three circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ are taken in pairs
($\mathfrak{B}, \mathfrak{C}$), ($\mathfrak{C}, \mathfrak{A}$), and ($\mathfrak{A}, \mathfrak{B}$), the external similarity points of the three pairs
lie on a straight line; and, similarly, the external similarity point of one pair
and the two internal similarity points of the other two pairs lie upon a straight
line, a so-called similarity axis of the three circles. More briefly:
If $\alpha\beta\gamma$ is plus, the three similarity points $\alpha\mathfrak{B}\mathfrak{C}$, $\beta\mathfrak{C}\mathfrak{A}$, and $\gamma\mathfrak{A}\mathfrak{B}$ lie on a
straight line.
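The theorem is easy to test numerically. In the sketch below a circle is given by its center (a complex number) and its radius, and the similarity point with sign ε of two circles is the point dividing the segment of centers in the ratio ±r : r' according to the definition above. The function names are illustrative, and the radii are assumed unequal so that the external similarity points exist.

```python
from itertools import product

def similarity_point(c1, r1, c2, r2, sign):
    """Similarity point of two circles: sign = +1 external, -1 internal."""
    return (r1 * c2 - sign * r2 * c1) / (r1 - sign * r2)

def collinear(p, q, r, tol=1e-9):
    """True if the three points (complex numbers) lie on one straight line."""
    return abs(((q - p) * (r - p).conjugate()).imag) < tol

def similarity_axes(circle_a, circle_b, circle_c):
    """Check every sign triple with product plus; return the point triples."""
    (a, ra), (b, rb), (c, rc) = circle_a, circle_b, circle_c
    axes = []
    for alpha, beta, gamma in product((1, -1), repeat=3):
        if alpha * beta * gamma != 1:
            continue
        pts = (similarity_point(b, rb, c, rc, alpha),
               similarity_point(c, rc, a, ra, beta),
               similarity_point(a, ra, b, rb, gamma))
        axes.append((pts, collinear(*pts)))
    return axes

axes = similarity_axes((0 + 0j, 1.0), (6 + 1j, 2.0), (2 + 5j, 3.5))
print([ok for _, ok in axes])
```

There are four sign triples with product plus, corresponding to the four similarity axes of three circles.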
Monge's proof. Let the centers of the circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ be A, B, C,
and let the external similarity points of the pairs ($\mathfrak{B}, \mathfrak{C}$), ($\mathfrak{C}, \mathfrak{A}$), ($\mathfrak{A}, \mathfrak{B}$)
be P, Q, R. If the circle pair ($\mathfrak{B}, \mathfrak{C}$) with its external tangents that
pass through P is rotated about the axis PBC, we obtain the spheres
$\mathfrak{B}_0$ and $\mathfrak{C}_0$ and their tangent cone with apex P. The case is similar
for the other two circle pairs.
The planes $E_1$ and $E_2$ are tangent to the spheres $\mathfrak{A}_0, \mathfrak{B}_0, \mathfrak{C}_0$ in such
a manner that the spheres always lie on one side of the plane, and
both planes contain the point P, since this point lies on the external
* D'Alembert (1717-1783), a French mathematician.
tangent of ($\mathfrak{B}_0, \mathfrak{C}_0$) within $E_1$ and $E_2$. They likewise contain the points
Q and R.
The three points P, Q, R thus lie on the line of intersection of the
planes $E_1$ and $E_2$.
If we are concerned with the internal similarity points of the pairs
($\mathfrak{B}, \mathfrak{C}$) and ($\mathfrak{A}, \mathfrak{C}$) and the external similarity point of ($\mathfrak{A}, \mathfrak{B}$), we
must take the tangential planes so that $\mathfrak{A}_0$ and $\mathfrak{B}_0$ lie on one side of
such a plane while $\mathfrak{C}_0$ lies on the other.
Let an arbitrary circle $\mathfrak{X}$ with the center X be homogeneously
(nonhomogeneously) tangent to two fixed circles $\mathfrak{K}$ and $\mathfrak{K}'$, with
centers K and K' and radii k and k', at P and Q'. Let the points of
intersection of the straight line PQ' with the circles $\mathfrak{K}$ and $\mathfrak{K}'$ and the
line KK' joining their centers be P, Q; P', Q', and S.
Since the base angles of the isosceles triangles KPQ, K'P'Q', and
XPQ' are also the opposite and coincident angles at P and Q', all six
base angles are equal. Since the two base angles at P and P' are
equal, the radii KP and K'P' are parallel. Consequently, S is the
external (internal) similarity point of $\mathfrak{K}$ and $\mathfrak{K}'$. From this it follows
that
$\frac{SP}{SP'} = \frac{k}{\pm k'}, \quad \frac{SQ}{SQ'} = \frac{k}{\pm k'},$
so that the two products $SP \cdot SQ'$ and $SQ \cdot SP'$ are equal. If we call
their common value w, then
$w^2 = SP \cdot SQ' \cdot SQ \cdot SP' = SP \cdot SQ \cdot SP' \cdot SQ',$
i.e., $w^2$ is equal to the product of the powers $\Pi$ and $\Pi'$ of the two
circles $\mathfrak{K}$ and $\mathfrak{K}'$ at S. Consequently,
$SP \cdot SQ' = w = \sqrt{\Pi\Pi'}.$
I.e.: The power ($SP \cdot SQ'$) of the circle $\mathfrak{X}$ at S is a constant ($\sqrt{\Pi\Pi'}$).
The result of our considerations is the following
Tangency theorem: The external (internal) similarity point of two fixed
circles is the point at which all the circles homogeneously (nonhomogeneously)
tangent to the fixed circles have the same power and at which all the tangency
secants (which are determined by the points of tangency to the fixed
circles) intersect.
Pole and Polar
Two points P and P' that lie on a ray originating at the center O
of a circle S with radius r in such manner that
$OP \cdot OP' = r^2$
are called conjugate with respect to each other in relation to the
circle. Of two conjugate points one lies inside the circle and the
other outside.
The conjugate of an external point A is the point of intersection J of
the circle bisector from A with the tangency chord determined by the
tangents $AT_1$ and $AT_2$ from A to the circle.
The conjugate of an internal point J is the point of intersection A of
the tangents that pass through the end points $T_1$ and $T_2$ of the chord
passing through J and perpendicular to the circle bisector from J.
Fig. 16.
(From the right triangle $OAT_1$ it follows directly that $r^2 = OA \cdot OJ$.)
By the polar of the point P we mean the line p that is perpendicular to
the circle bisector from P and passes through the conjugate of P.
Conversely, by the pole of the line p we mean the point P that is
conjugate to the base point of the perpendicular dropped from the
center of the circle to the line.
The relation between the pole and the polar is therefore reciprocal:
If p is the polar of P, then P is the pole of p, and conversely.
Now let Q be any point on the polar p of P (that passes through the
conjugate P' of P) and let Q' be the conjugate of Q. Then
$OP \cdot OP' = OQ \cdot OQ' \ (= r^2),$
and consequently PP'QQ' is a quadrilateral inscribed in a circle.
Since here the angle at P' is 90° the angle at Q' must also be 90°, i.e.,
Fig. 17.
PQ' must be perpendicular to OQ. PQ' is therefore the polar q of Q,
and we have the
Theorem of the pole and polar: If Q lies on the polar of P, P also
lies on the polar of Q. Or also: If p passes through the pole of q, q also
passes through the pole of p.
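In coordinates the theorem becomes almost self-evident. Since the polar of P is the perpendicular to OP through the conjugate P', a point Q lies on it exactly when the scalar product of OQ and OP equals r², a condition symmetric in P and Q. The sketch below illustrates this with complex numbers as points; the function names are illustrative.

```python
def conjugate_point(p, o, r):
    """Conjugate P' of P with respect to the circle (o, r): OP * OP' = r^2."""
    return o + r * r * (p - o) / abs(p - o) ** 2

def on_polar(q, p, o, r, tol=1e-9):
    """True if q lies on the polar of p with respect to the circle (o, r)."""
    return abs(((q - o) * (p - o).conjugate()).real - r * r) < tol

o, r = 1 + 1j, 2.0
p = 4 + 3j
p_conj = conjugate_point(p, o, r)
# A point of the polar of P: start at P' and move perpendicular to OP.
q = p_conj + 1.5 * 1j * (p - o)
print(on_polar(q, p, o, r), on_polar(p, q, o, r))
```

Both checks should succeed: Q was placed on the polar of P, and P is then found on the polar of Q.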
Now for Gergonne's solution of the tangency problem.
In general, there are a number of circles that are tangent to three
given circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$. Gergonne's solution is based upon the device
of seeking the unknown circles in pairs rather than individually; in
particular, one always seeks that pair ($\mathfrak{X}, \mathfrak{x}$) that is homogeneously or
nonhomogeneously tangent to each of the given circles.
For the sake of convenience, we will call homogeneous tangencies
positive (+) and nonhomogeneous tangencies negative (−) and
combinations such as $\varepsilon\varepsilon'$ of the tangency signs $\varepsilon$ and $\varepsilon'$ will be treated
in accordance with the rule that "like signs give plus and unlike
minus."
Let the circles $\mathfrak{X}$ and $\mathfrak{x}$, respectively, be tangent to the circles
$\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ at the points P, Q, R and p, q, r, respectively, and let the
tangencies possess the signs A, B, C and a, b, c, respectively. Then
$Aa = Bb = Cc = \varepsilon,$
and
$BC = bc = \alpha, \quad CA = ca = \beta, \quad AB = ab = \gamma$
and
$\alpha\beta\gamma = +.$
Let us first consider ($\mathfrak{X}, \mathfrak{x}$) as the pair tangent to the circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$.
According to the tangency theorem, the similarity point $\varepsilon\mathfrak{X}\mathfrak{x}$ of $\mathfrak{X}$ and $\mathfrak{x}$ is
the power center O of the three circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$ and the point of intersection of
the three tangency chords Pp, Qq, Rr.
We then take in succession ($\mathfrak{B}, \mathfrak{C}$), ($\mathfrak{C}, \mathfrak{A}$), ($\mathfrak{A}, \mathfrak{B}$) as the pair
tangent to the circles $\mathfrak{X}$ and $\mathfrak{x}$. In accordance with the tangency
theorem, the circles $\mathfrak{X}$ and $\mathfrak{x}$ then have the same powers at the
similarity point $\alpha\mathfrak{B}\mathfrak{C}$ = I, as well as at the similarity point $\beta\mathfrak{C}\mathfrak{A}$ = II,
and the similarity point $\gamma\mathfrak{A}\mathfrak{B}$ = III. And since $\alpha\beta\gamma$ is +, the three
points I, II, III, in accordance with d'Alembert's theorem, lie upon a
similarity axis of $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$. The similarity axis I II III is thus the
chordal x of the circles $\mathfrak{X}$ and $\mathfrak{x}$.
Further, if S represents the point of intersection of the tangents to $\mathfrak{A}$
at P and p, then SP = Sp. Since these tangents also touch $\mathfrak{X}$ and $\mathfrak{x}$,
S lies on the chordal x of $\mathfrak{X}$ and $\mathfrak{x}$. Now S is also the pole of the
tangency chord Pp with respect to circle $\mathfrak{A}$. Since x therefore passes
through the pole of Pp, it follows from the theorem of the pole and
polar that Pp passes through the pole of x. Since the same
conclusions can be drawn with respect to the tangency chords Qq and Rr,
we obtain the theorem: The tangency chords Pp, Qq, and Rr pass
respectively through the poles of the line x = I II III with respect to the
circles $\mathfrak{A}, \mathfrak{B}, \mathfrak{C}$.
Fig. 18.
From the three theorems italicized in the last three paragraphs we
obtain directly
Gergonne's construction: Draw the power center O of the given circles
and the similarity axis I II III = x. Determine the poles 1, 2, 3 of x in relation
to the given circles and connect them with O. The connecting lines meet the
given circles at the points at which they are tangent to the sought-for circles.
33 Mascheroni's Compass Problem
To prove that any construction that can be carried out with a compass and
straight-edge can be carried out with the compass alone.
The Italian L. Mascheroni (1750-1800) posed himself the problem
of executing the geometric constructions with a compass alone
(without the use of the straight-edge) and solved it in a masterly
fashion in his book La geometria del compasso, which was published in
Pavia in 1797.
If we examine the separate steps by which the compass and
straight-edge constructions are carried out, we see that every step consists of
one of the following three basic constructions:
I. Finding the point of intersection of two straight lines;
II. finding the point of intersection of a straight line and a circle;
III. finding the point of intersection of two circles.
Construction III is itself carried out with the compass alone.
Consequently, we need only show that the two basic
constructions I and II can be accomplished with a compass alone.
(In Mascheroni's geometry of the compass a straight line is,
naturally, regarded as given or determined if two of its points are
known.)
First we must solve two preliminary problems.
Preliminary problem 1. To draw the sum or difference of two given
segments a and b.
In other words: to lengthen or shorten a given segment PQ = a by a
segment QX = b.
Solution. 1. We draw the arc Q|b,* take upon this arc any point
H, draw the mirror image H' of H (the mirror image O' of a point O
in a straight line AB is the point of intersection of the arcs A|AO and
B|BO) in the straight line g determined by the points P and Q, and
designate the segment HH' as h. 2. We draw the isosceles trapezoid
KHH'K' whose legs KH and K'H' are equal to b and whose base
KK' = 2h. (K is the point of intersection of the arcs Q|h and H|b,
K' is the mirror image of K in g.) Let the diagonal KH' = HK' of
the trapezoid be called d. Since the trapezoid is a quadrilateral
that can be inscribed in a circle, according to Ptolemy the following
equation is applicable:
d^2 = b^2 + 2h^2.
On the other hand, it follows from the right triangle QK'X, where
K'X will be designated as x, that
x^2 = b^2 + h^2.
* Let arc Q|b mean the circular arc whose center is Q and whose radius is b.
From these two equations it follows that
d^2 = x^2 + h^2,
so that x is one of the legs of a right triangle with the hypotenuse d
and the other leg h. If we then find the point of intersection S of the
arcs K|d and K'|d on the straight line g, QS = x. 3. We draw the
points of intersection of the arcs K|x and K'|x; these are the points X that
we have been trying to find (one for the sum, one for the difference).
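The three steps can be checked numerically. The following Python sketch is an illustration added for this purpose and is not taken from Mascheroni; the helper names `circle_intersections`, `mirror`, and `sum_and_difference` and the sample values a = 5, b = 2 are assumptions. Apart from the free choice of H, every point is obtained only as an intersection of two circles.

```python
import math

def circle_intersections(c1, r1, c2, r2):
    """Both intersection points of the circles c1|r1 and c2|r2."""
    (x1, y1), (x2, y2) = c1, c2
    dist = math.hypot(x2 - x1, y2 - y1)
    along = (r1**2 - r2**2 + dist**2) / (2 * dist)   # distance from c1 to the common chord
    half = math.sqrt(max(r1**2 - along**2, 0.0))     # half the common chord
    ux, uy = (x2 - x1) / dist, (y2 - y1) / dist
    fx, fy = x1 + along * ux, y1 + along * uy
    return (fx - half * uy, fy + half * ux), (fx + half * uy, fy - half * ux)

def mirror(o, a, b):
    """Mirror image of O in line AB: second intersection of the arcs A|AO and B|BO."""
    pts = circle_intersections(a, math.dist(a, o), b, math.dist(b, o))
    return max(pts, key=lambda p: math.dist(p, o))

def sum_and_difference(p, q, b, t=1.0):
    """Points X of line PQ with QX = b, following steps 1 to 3 (t picks H on Q|b)."""
    h_pt = (q[0] + b * math.cos(t), q[1] + b * math.sin(t))   # any point H of the arc Q|b
    h_img = mirror(h_pt, p, q)                                 # H'
    h = math.dist(h_pt, h_img)
    # K: the intersection of Q|h and H|b whose mirror image lies at distance 2h
    k = min(circle_intersections(q, h, h_pt, b),
            key=lambda c: abs(math.dist(c, mirror(c, p, q)) - 2 * h))
    k_img = mirror(k, p, q)                                    # K'
    d = math.dist(k, h_img)                                    # diagonal KH'
    s = circle_intersections(k, d, k_img, d)[0]                # S on g
    x = math.dist(q, s)
    return circle_intersections(k, x, k_img, x), (d, h, x)

(x1, x2), (d, h, x) = sum_and_difference((0.0, 0.0), (5.0, 0.0), 2.0)
```

With P = (0, 0), Q = (5, 0) and b = 2 the two returned points lie on the line PQ at the distances a + b and a − b from P, and the intermediate lengths satisfy the two equations of the text.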
Preliminary problem 2. To find the fourth segment x that is in
proportion to the three given segments m, n, s.
In other words, draw the segment
x = (n/m) s.
The following solution that Mascheroni found for this fundamental
problem is remarkable for its shortness and simplicity.
We draw two concentric circles 𝔐 = Z|m and 𝔑 = Z|n, draw the
chord AB = s in 𝔐, lay off with the compass any length w from A
and from B on 𝔑 (both in the same sense of rotation), obtaining from the
distance between the resulting
points of intersection H and K the sought-for segment x. The proof
follows directly from the similar triangles ZAB and ZHK.
In this construction it is assumed that s fits as a chord within circle 𝔐
(s ≤ 2m). If this
is not the case, we first transform the fraction n/m into N/M, where N
and M, respectively, are the same sufficiently great integral multiple of n and m,
which can be drawn according to the first preliminary problem. (A
comparatively simple method is the doubling that results, for example,
when PQ = m, and the radius m of the circle P|PQ is laid off three
times in succession from Q. The end point after this laying off is
separated from Q by the distance 2m.)
After the solution of the preliminary problems, we go on to the
solution of the two major problems.
I'. To find the point of intersection S of two straight lines AB
and CD (each of which is given by two points) with the compass
alone.
II'. To determine the points of intersection of a given circle and
a given straight line AB with the compass alone.
Fig. 21.
Solution of I'. We draw the mirror images C' and D' of C and D
with respect to AB. The sought-for point of intersection S then also
lies on C'D'. According to the ray theorem, it follows that CS/SD =
CC'/DD', i.e., if we designate the segments CS, CD, CC', DD' as
x, e, c, d, respectively, x/(e − x) = c/d or
x = (c/(c + d)) e.
Now we begin by drawing CH = c + d (H as the point of
intersection of the arcs C'|d and D|e); then we draw the segment x in
accordance with preliminary problem 2; and finally we draw the
sought-for point of intersection S as the intersection of the arcs C|x
and C'|x.
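The formula for x can be tried on a concrete pair of lines. In the Python sketch below, which is an added illustration, the names `reflect` and `intersection_distance` and the sample coordinates are assumptions; the mirror images are computed directly rather than with compass arcs, and C and D are taken on opposite sides of AB.

```python
import math

def reflect(pt, a, b):
    """Mirror image of pt in the line AB (computed by projection onto AB)."""
    ux, uy = b[0] - a[0], b[1] - a[1]
    t = ((pt[0] - a[0]) * ux + (pt[1] - a[1]) * uy) / (ux * ux + uy * uy)
    fx, fy = a[0] + t * ux, a[1] + t * uy          # foot of the perpendicular
    return (2 * fx - pt[0], 2 * fy - pt[1])

def intersection_distance(a, b, c, d):
    """x = CS for the intersection S of the lines AB and CD."""
    c_len = math.dist(c, reflect(c, a, b))         # c = CC'
    d_len = math.dist(d, reflect(d, a, b))         # d = DD'
    e_len = math.dist(c, d)                        # e = CD
    return c_len / (c_len + d_len) * e_len         # x = c/(c + d) * e

A, B, C, D = (0.0, 0.0), (4.0, 0.0), (1.0, 2.0), (4.0, -4.0)
x = intersection_distance(A, B, C, D)
ratio = x / math.dist(C, D)
S = (C[0] + ratio * (D[0] - C[0]), C[1] + ratio * (D[1] - C[1]))   # point of CD at distance x from C
```

For these sample points S comes out as (2, 0), which indeed lies on AB.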
Fig. 22.
Solution of II'. Let the center of the given circle be known as M,
the radius as r. We draw the mirror image M' of M with respect to
the straight line AB and with the compass open to the radius r we
strike the arc M'|r across the given circle. The resulting points of
intersection are the sought-for points of intersection of the given straight
line AB with the given circle.
The construction cannot be carried out if the straight line AB
happens to pass through M. In this exceptional case we extend and
shorten the segment AM by r in accordance with preliminary problem
1. The end points of the extended and shortened segment are the
sought-for points of intersection of the given circle and AB.
This completes the solution of Mascheroni's problem.
34 Steiner's Straight-edge Problem
To prove that every construction that can be executed with compass and
straight-edge can be executed with a straight-edge alone in the event that within
the picture plane there is also given a fixed circle.
As far back as 1759 Lambert had solved a whole series of geometric
constructions with straight-edge alone in his book Freie Perspektive,
which was published in Zurich that year. He is also the source of
the term "straight-edge geometry." After Lambert the French
mathematicians, primarily Poncelet and Brianchon, took up
straight-edge geometry, particularly after the publication of Mascheroni's
Geometria del compasso provided a new stimulus to these studies, and
they attempted to execute as many constructions as possible with the
straight-edge alone.
Now, with the use of a straight-edge alone it is possible to represent
only those algebraic expressions whose algebraic form is rational
(thus, for example, it is impossible to represent expressions such as
√(ab)). This circumstance suggested to Poncelet that an additional
fixed circle (together with its center!) must be given inside the picture
plane for it to be possible to draw with straight-edge alone all the
algebraic expressions that can be constructed with a compass and
straight-edge.
This suggestion was confirmed as a certainty by Jakob Steiner
(1796-1863), the greatest geometer since the days of Apollonius, in
his celebrated book Die geometrischen Konstruktionen ausgeführt mittels
der geraden Linie und Eines festen Kreises (Geometrical Constructions
Executed with a Straight Line and One Fixed Circle), published in
Berlin, 1833.
The solution presented here is based upon that in Steiner's book,
except that we have here eliminated everything that is not strictly
essential for the purpose at hand, and we have also made it somewhat
more elementary by dispensing with the theorems of homothety and
chordals employed by Steiner.
Since in straight-edge geometry the intersection of two straight
lines is known directly, we need only demonstrate that the two
fundamental problems II. and III. of the previous section can be
solved by means of a straight-edge and a fixed circle alone.
As in the solution of Mascheroni's problem, we must first solve
several preliminary problems; in this case there are five rather than two.
Fig. 23.
Preliminary problem 1: To draw through a given point the parallel to a
given line.
Steiner distinguishes two cases: 1a. construction of the parallel to a
directed straight line; 1b. construction of the parallel to an arbitrary
straight line.
1a. A directed straight line is understood to mean a straight line in
which two points A and B and the midpoint M of the segment joining
them are known. In order to draw the parallel to such a line through
a given point P, we draw AP, choose a point S on the extension of AP,
connect this point with B and M, draw BP, and draw the straight line AO
through the point of intersection O of BP and MS in such a manner that
AO cuts BS at Q. PQ is then the desired parallel. The proof is simple:
by Ceva's theorem in triangle ABS, (AM/MB)(BQ/QS)(SP/PA) = 1, and
since AM = MB it follows that SP/PA = SQ/QB.
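Construction 1a uses nothing but joining points and intersecting lines, so it is easy to imitate. In the following Python sketch, added here as an illustration, the names `line`, `meet`, and `parallel_through`, the parameter `stretch` (which places S on the extension of AP), and the sample coordinates are assumptions.

```python
def line(p, q):
    """Line through p and q as coefficients (a, b, c) of ax + by = c."""
    a, b = q[1] - p[1], p[0] - q[0]
    return a, b, a * p[0] + b * p[1]

def meet(l1, l2):
    """Point of intersection of two non-parallel lines (Cramer's rule)."""
    (a1, b1, c1), (a2, b2, c2) = l1, l2
    det = a1 * b2 - a2 * b1
    return ((c1 * b2 - c2 * b1) / det, (a1 * c2 - a2 * c1) / det)

def parallel_through(p, a, m, b, stretch=2.0):
    """Second point Q of the parallel to AB through P, M being the midpoint of AB."""
    s = (a[0] + stretch * (p[0] - a[0]), a[1] + stretch * (p[1] - a[1]))  # S on AP beyond P
    o = meet(line(b, p), line(m, s))       # O = BP ∩ MS
    return meet(line(a, o), line(b, s))    # Q = AO ∩ BS

A, M, B, P = (0.0, 0.0), (2.0, 0.0), (4.0, 0.0), (1.0, 3.0)
Q = parallel_through(P, A, M, B)
```

For the sample points Q = (3, 3), so that PQ is parallel to AB as required.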
Fig. 24.
1b. We connect a given point M of the given straight line g with the
midpoint F of the given fixed circle and designate the points of
intersection of the connecting line and the fixed circle as U and V. The points
U, F, V make the line FM a directed line. In accordance with 1a., we
draw a parallel to FM in such a manner that it cuts the fixed circle at X and Y and
g at A. If we then draw the diameters XFX' and YFY' and connect
the end points X' and Y', the connecting line intersects the given line
at a point B in such a manner that MA = MB and g, defined by the
three points A, M, B, is then a directed line. This makes it possible
to determine the parallel to g in accordance with 1a.
Preliminary problem 1 gives us the solution to the problem: shift a
given segment AB parallel to itself in such a manner that one of its end points
lies on a given point P.
If P falls outside the straight line AB we find the point of intersection
Q of the parallel through B to AP and the parallel through P to AB;
PQ is then equal and parallel to AB.
Preliminary problem 2: Draw a perpendicular through a given point P
to a given straight line g.
We draw g' parallel to g in such a manner that it cuts the fixed circle at U and V.
We then draw the diameter UFU' and the chord VU', which,
according to Thales' theorem, is perpendicular to g' and consequently also
perpendicular to g. Finally, we draw the parallel to VU' through
P in accordance with 1; this parallel is the desired perpendicular.
Fig. 25.
Preliminary problem 3: To lay off a given distance PQ from a given
point O in a given direction.
Let us consider the prescribed direction as given by the segment OH
from O. First, in accordance with 1., we displace PQ parallel to
itself to OK. Then from F we draw two radii FU and FV in the
directions OH and OK. Finally, if we draw through K the parallel to
UV, the point of intersection S of the parallel with the line OH gives
the end point of the desired segment.
Preliminary problem 4: If three distances m, n, s are given, draw the
fourth proportional.
From any point O we draw two rays I and II, mark off the two
distances OM = m and ON = n on I and the distance OS = s on II;
we draw the parallel to MS through N and designate its point of
intersection with II as X. Then
OX = (n/m) s
is the desired fourth proportional.
Preliminary problem 5: If two segments a and b are given, draw the
mean proportional.
We designate the sought-for mean proportional √(ab) as x, the
diameter of the fixed circle as d, the sum a + b that can be constructed
according to 3. as c, and we write
x = (c/d) s, with s = √(hk), h = (d/c) a, k = (d/c) b
(so that h + k = d).
First, in accordance with 4., we draw the segments h and k, and in
accordance with 3., we make HO = h on a diameter HK of the fixed
circle, so that KO will necessarily equal k. Then, according to 2., we
construct through O the perpendicular to HK and call the intersection
of the perpendicular with the fixed circle S. Then OS = √(hk) = s.
Finally, we draw the desired segment x (= (c/d) s) according to 4.
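The reduction to the fixed circle is a short piece of arithmetic. The Python sketch below is an added illustration; the function name `mean_proportional` and the sample values a = 3, b = 12, d = 2.5 are assumptions.

```python
import math

def mean_proportional(a, b, d=1.0):
    """sqrt(ab) obtained by way of a fixed circle of diameter d."""
    c = a + b
    h, k = d / c * a, d / c * b     # fourth proportionals, so that h + k = d
    s = math.sqrt(h * k)            # half-chord OS perpendicular to the diameter HK at O
    return c / d * s                # x = (c/d) s

x = mean_proportional(3.0, 12.0, d=2.5)
```

Here h = 0.5, k = 2 and s = 1, so that x = 6 = √(3 · 12), whatever diameter d is chosen.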
Now that we have solved these five preliminary problems, the
solution of the two basic problems II and III is simple.
Basic problem II: To draw the points of intersection of a given line and a
given circle.
In straight-edge geometry a circle is considered determined if its
center and radius are known. Let us designate the center of the given circle
as C, its radius as r, the given straight line as g, the points
of intersection of g with the given circle as X and Y, the chord of intersection
as 2s, the midpoint of the chord as M, its distance from the center C
as l. From the right triangle CMX we obtain the equation
s^2 = r^2 − l^2 = (r + l)(r − l).
Then, in accordance with 2., we drop the perpendicular CM = l
to g; we draw the segments a = r + l and b = r − l in accordance
with 3.; then, according to 5., we draw the segment s = √(ab); and
finally, according to 3., we lay off s from M on g in both directions.
The end points of the laid-off segments are the desired points of
intersection X and Y.
Fig. 26.
Basic problem III: Find the points of intersection of two given circles.
Let us designate the circles as 𝔄 and 𝔅, their midpoints as A and B,
their radii as a and b, the line AB joining their centers as c, the
sought-for points of intersection as X and Y, the point of intersection of the
chord XY with the center line AB as O, and, finally, the unknown
segments AO and OX as q and x.
Finding q. From the triangle ABX it may be inferred, in
accordance with the expanded Pythagorean theorem, b^2 = c^2 + a^2 − 2cq;
thus, if we set c^2 + a^2 equal to d^2,
q = (d + b)(d − b)/(2c).
Consequently, we draw, in accordance with 2. and 3., a right
triangle with the short legs a and c and obtain as the hypotenuse d.
Then, according to 3., we draw the segments
n = d + b, m = 2c, s = d − b
and finally, according to 4.,
q = (n/m) s.
Finding x. From triangle OAX it follows, according to the Pythagorean
theorem, that x^2 = a^2 − q^2; thus
x = √((a + q)(a − q)).
According to 3., we draw h = a + q, k = a − q and, according to 5.,
x = √(hk).
Construction of X and Y. According to 3., we lay off q from A
on AB. At O, the end of the segment laid off, we erect the
perpendicular to AB in accordance with 2. and (according to 3.) we lay
off x on it in both directions. The end points of the laid-off segments
are the points of intersection that we are looking for.
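The two segments q and x determine the points of intersection completely. The Python sketch below, added as an illustration, places A at the origin and B at (c, 0); the function name `circle_circle` and the sample radii are assumptions.

```python
import math

def circle_circle(a, b, c):
    """Intersections of the circle about (0, 0) with radius a and the circle about (c, 0) with radius b."""
    d = math.hypot(a, c)                  # hypotenuse of the right triangle with legs a and c
    q = (d + b) * (d - b) / (2 * c)       # q = AO
    x = math.sqrt((a + q) * (a - q))      # x = OX, mean proportional of a + q and a - q
    return (q, x), (q, -x)

X, Y = circle_circle(5.0, 5.0, 6.0)
```

For a = b = 5 and c = 6 the sketch yields q = 3 and x = 4, and the point X = (3, 4) has the distance 5 from both centers.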
35 The Delian Cube-doubling Problem
To construct the edge of a cube that is double the size of a given cube.
The name "Delian problem," according to an account given by the
mathematician and historian Eutocius (sixth century A.D.), goes back
to an old legend according to which the Delphic oracle in one of its
utterances demanded that the Delian altar block be doubled.
If k is the edge of the given cube and x the edge of the cube we are
seeking, the respective volumes of the two cubes are k^3 and x^3.
Consequently we are confronted with the problem of finding, when
the segment k is given, a second segment x such that
x^3 = 2k^3.
This problem is not capable of solution with compass and straight-edge.
(See the Supplement to No. 36.)
The numerous solutions to this problem, some of which were found
in antiquity, consequently make use of more advanced means.
Thus, the solution of the Greek mathematician Menaechmus
(ca. 375-325 B.C.) is based upon finding the point of intersection of the
two parabolas
(1) x^2 = ky and (2) y^2 = 2kx
with the parameters k and 2k. The abscissa x of the point of
intersection satisfies the condition x^3 = 2k^3 as a result of the fact that
x^4 = k^2 y^2 = 2k^3 x, and the sought-for edge x is thereby obtained.
Descartes (1596-1650) showed that one of the two parabolas (1)
and (2) was sufficient. For their point of intersection x|y the
following equation is also true:
x^2 + y^2 = ky + 2kx;
and this is the equation of a circle with the midpoint coordinates k
and k/2 which passes through the common apex of the two parabolas.
Thus, it is only necessary to find the intersection of this circle with one
of the two parabolas to find the sought-for point of intersection.
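Both statements are easily confirmed with numbers. In the Python sketch below, an added illustration with the assumed sample value k = 1.5, the point with the abscissa x = k∛2 is taken on parabola (1) and then tested against parabola (2) and against the circle of Descartes, written in the form (x − k)^2 + (y − k/2)^2 = k^2 + k^2/4.

```python
k = 1.5
x = k * 2 ** (1 / 3)          # abscissa claimed in the text
y = x * x / k                 # ordinate from parabola (1): x^2 = ky

on_parabola_2 = abs(y * y - 2 * k * x)                                    # (2): y^2 = 2kx
on_circle = abs((x - k) ** 2 + (y - k / 2) ** 2 - (k * k + k * k / 4))    # circle of Descartes
cube_doubled = abs(x ** 3 - 2 * k ** 3)
```

All three residuals vanish up to rounding error.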
The simplest and most accurate method of constructing
x = k∛2
is by paper strip construction. 1. We draw an equilateral triangle ABC
with the side k, extend CA by AD = k, and draw the line DB. 2. We
mark off on the sharp edge of a paper strip the distance k. 3. We place
the paper strip in such a way that the edge passes through C and the
end points of the marked-off distance fall upon two points P and Q of
the extensions of AB and DB.
Then
CQ = x = k∛2.
Proof. Let CQ = x, BP = y. According to the leg transversal
theorem used in figure CABP, (x + k)^2 − k^2 = y(k + y) or
(I) x^2 + 2kx = y^2 + ky.
According to the theorem of Menelaus applied to the triangle ACP
with the transversal DBQ, AD·CQ·BP = PQ·AB·CD or
(II) xy = 2k^2.
A glance at equations (I) and (II) shows that they are satisfied by
the roots x and y of equations (1) and (2). The unknowns x and y,
which are determined by (I) and (II), are therefore at the same time
the coordinates of the point of intersection of Menaechmus' parabolas.
In particular, x = k∛2.
Naturally, this result can also be obtained without reference to
these parabolas.
Note. The doubled cube can also be constructed by means of the
so-called conchoid of Nicomedes, a Greek mathematician who lived at
the beginning of the second century B.C.; we cannot, however, present
this construction here.
36 Trisection of an Angle
To divide an angle into three equal angles.
This famous problem cannot be solved with compass and
straight-edge (see the supplement).
The simplest solution is by means of the following paper strip
construction of Archimedes.
Fig. 28.
Taking as the center the apex S of the angle Φ to be trisected, we
draw a circle of radius r that intersects the legs of the angle at A and
B. We mark off a segment of length r on the edge of a paper strip.
We place the edge on the figure in such a way that it passes through B
and that one end point of the marked-off segment coincides with a
point P on the circle, while the other end point coincides with a point
Q (outside the circle) of the extension of AS. Then ∠PQS = φ is
one third of the given angle Φ.
Proof. Since PS = PQ (= r), triangle PQS is isosceles and ∠PSQ is
therefore also equal to φ, while the external angle ∠SPB is equal to
2φ. Since triangle SPB is also isosceles, ∠SBP = ∠SPB = 2φ. Finally,
since the external angle Φ at S of the triangle SBQ is equal to the sum
of the two nonadjacent internal angles SQB and SBQ, we find that
Φ = φ + 2φ or
φ = Φ/3. Q.E.D.
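The sliding of the paper strip can be imitated by a search for the position of Q. The Python sketch below is an added illustration, not Archimedes' procedure: it assumes S at the origin, A on the positive x-axis, an acute sample angle of 75°, and a bisection over the distance t = SQ; the name `neusis_angle` is hypothetical.

```python
import math

def neusis_angle(big_phi, r=1.0):
    """Slide Q along the extension of AS until PQ = r; return the angle PQS."""
    bx, by = r * math.cos(big_phi), r * math.sin(big_phi)   # B on the circle about S = (0, 0)

    def gap(t):
        # Q = (-t, 0); P is the point where the strip edge QB first meets the circle
        length = math.hypot(bx + t, by)
        ux, uy = (bx + t) / length, by / length              # unit vector from Q toward B
        foot = t * ux                                        # distance from Q to the foot of the perpendicular from S
        half = math.sqrt(max(r * r - (t * uy) ** 2, 0.0))    # half of the chord cut from QB
        return (foot - half) - r                             # PQ - r

    lo, hi = r, 2 * r            # gap(lo) < 0 < gap(hi) for the acute sample angle
    for _ in range(60):          # bisection on t
        mid = (lo + hi) / 2
        if gap(mid) < 0:
            lo = mid
        else:
            hi = mid
    return math.atan2(by, bx + lo)   # angle between QB and QS

phi = neusis_angle(math.radians(75.0))
```

For Φ = 75° the search stops at the position of Q for which ∠PQS = 25°.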
The problem of the trisection of an angle can also be solved by
means of a fixed hyperbola, as the Greek mathematician Pappus
(ca. 300 A.D.) demonstrated in his ingenious masterwork Συναγωγαὶ
μαθηματικαί (Collectiones mathematicae).
In order to understand the construction we must first solve the
problem: Find the locus of the vertex P of a triangle ABP with fixed base AB
when the base angles α and β are to each other in the proportion of 2 to 1.
Let AB = 3k, AP = u. We lay off the angle β at P on PB and
designate the point of intersection of the free leg with segment AB
as Q. The triangles BPQ and APQ are then isosceles (∠AQP as the
external angle of BPQ is equal to 2β = α); consequently, AP = QP =
BQ = u. We then extend AB by BC = k and set CP equal to v.
From figure AQCP it then follows, according to the apex transversal
theorem, that
v^2 − u^2 = CA·CQ = 4k(k + u)
or
v^2 = (u + 2k)^2,
more simply
v = u + 2k
or also
v − u = 2k.
This is the equation for the locus in bipolar coordinates u, v.
The locus of the point P is thus a hyperbola with the foci A and C and the
major axis BD = 2k. (D lies between A and B in such a way that,
according to the locus equation v − u = 2k, CD = 3k, and AD is
equal to k.)
Let us now consider this hyperbola as having been drawn once and
for all for any k. (The half of the branch belonging to the focus A,
lying above the major axis, is sufficient.)
In order to trisect the prescribed angle ω we draw about AB as chord the
arc subtending the angle 180° − ω and call its intersection with the
hyperbola P. Then
∠ABP = β = ω/3.
Proof. From ∠APB = 180° − ω it follows that α + β = ω, i.e.,
(because α = 2β), 3β = ω.
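The locus equation v − u = 2k on which this construction rests can be tested for any particular base angle. In the Python sketch below, an added illustration, A is placed at the origin, B at (3k, 0) and C at (4k, 0); the name `locus_check` and the sample angle β = 20° are assumptions.

```python
import math

def locus_check(beta, k=1.0):
    """CP - AP for the triangle ABP with AB = 3k and base angles 2*beta at A, beta at B."""
    alpha = 2 * beta
    ap = 3 * k * math.sin(beta) / math.sin(alpha + beta)    # law of sines: AP = u
    px, py = ap * math.cos(alpha), ap * math.sin(alpha)     # vertex P
    return math.hypot(px - 4 * k, py) - ap                  # v - u, with C = (4k, 0)

diff = locus_check(math.radians(20.0))
```

The difference comes out as 2k, independently of β, as the hyperbola requires.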
Note. It is also possible to trisect an angle by means of Nicomedes'
conchoid; this method, however, now possesses only historical
interest.
Supplement to Nos. 35, 36, and 37
On the degree of irreducible equations that can be solved by quadratic roots:
Let a rational function of one or more magnitudes be known as an
ℜ-function and an algebraic equation with rational coefficients as
an ℜ-equation; in particular, let us designate an integral rational
function of several magnitudes with rational coefficients as an
ℜ-polynomial. We will also call a quadratic root of a rational number
or an ℜ-function of such quadratic roots an expression of the first
order, and a quadratic root of an expression of the first order or an
ℜ-function of such quadratic roots an expression of the second order, etc.
In every expression of the mth order we assume that none of its
roots of the mth order can be expressed rationally by the remaining
ones or even by expressions of lower than the mth order; we assume
as well that the expression (by elimination of irrational denominators
and powers higher than the first of the relevant quadratic roots) has
been put into its simplest form, the normal form. An expression of
the mth order that contains the root of the mth order √α will thus
appear in the form 𝔞 + a√α, where 𝔞 and a are expressions of the
mth order (or lower) in which √α does not recur.
Now let x_1 be an expression of the mth order which contains the
mth-order roots √α, √β, √γ, ... and in which a total of n different
roots [of mth and lower order] occur. If we change the signs of these
n roots in every possible way, we obtain a total of 2^n = N similarly
constructed root expressions x_1, x_2, x_3, ..., x_N.
We form the function
F(x) = (x − x_1)(x − x_2) ... (x − x_N).
If everywhere in this expression we change the sign of any of the
above n roots contained in it, the value of the expression is not changed.
Thus, if we multiply out the parentheses, the resulting polynomial of
x, as we know from computations with root expressions, will merely
contain the squares of the roots and is consequently an ℜ-function of x.
The equation
(1) F(x) = 0
is thus an ℜ-equation with the roots x_1, x_2, ..., x_N, which moreover
need not all be different.
We now assert:
If an ℜ-polynomial f(x) vanishes for one root, such as x_1, of F(x) = 0, then
f(x) will vanish for all the roots of F(x) = 0.
Proof. We write x_1 = 𝔞 + a√α (see above) and introduce this
value into f(x), and on computation we obtain
0 = f(x_1) = 𝔄 + A√α,
where 𝔄 and A contain expressions of the mth order and lower with
the exception of √α. Now, since it is assumed that √α is
independent of these expressions, A cannot differ from zero (for otherwise
it would follow that √α = −𝔄/A and thus √α would be a function
of √β, √γ, ...) and, therefore, necessarily
A = 0 and 𝔄 = 0.
We will write the expressions A and 𝔄 as 𝔟 + b√β and 𝔅 + B√β,
where 𝔟, b, 𝔅, B are no longer dependent upon √α and √β. From
𝔟 + b√β = 0 and 𝔅 + B√β = 0
it follows as above that
𝔟 = 0, b = 0, 𝔅 = 0, B = 0,
etc. From these values we finally obtain equations that possess no
roots but only rational numbers and which are, in other words,
independent of the signs of the n roots occurring in x_1 and
consequently are unchanged when the signs are changed in any way.
Now, since this change of sign transforms x_1 into one of the values
x_2, x_3, ..., x_N, f(x) must therefore also vanish for x_2, x_3, ..., x_N, which
is what we set out to prove.
Among all the ℜ-polynomials f(x) that vanish for x = x_1 there is
one possessing the lowest possible degree ν; let this be called φ(x).
The polynomial φ(x) is irreducible in the natural rationality
domain (cf. No. 24).
[If φ were divisible: φ(x) = u(x)·v(x), then when φ(x_1) = 0 it
would necessarily follow that one of the factors such as v(x_1) must
equal zero: this would contradict our assumption in that there would
be a polynomial v of lower degree than φ with the root x_1.]
Since the ℜ-polynomial F(x) vanishes for a root x_1 of the
irreducible polynomial φ(x), F(x), according to Abel's irreducibility
theorem (No. 25), is divisible by φ(x):
F(x) = F_1(x) φ(x).
Since, moreover, the ℜ-polynomial F_1(x), unless it is a constant, vanishes for a root of
F, thus (by the assertion proved above) also for x_1, a root of φ, F_1 is also
divisible by φ and F_1(x) = F_2(x) φ(x);
consequently
F(x) = F_2(x) φ(x)^2,
etc. Finally we obtain
F(x) = φ(x)^μ
(assuming that the first coefficient of F and φ has the value 1).
If we compare the degree of the polynomial on the right-hand side
of this equation with that of the polynomial on the left, we find that
N = μν.
Since, however, N = 2^n, ν must also be a power of 2.
Conclusion: The degree of an irreducible equation with rational
coefficients that is satisfied by an expression formed from quadratic roots
must be a power of 2. From this the two following theorems are
easily obtained:
I. It is impossible to double a cube with compass and straight-edge.
II. It is in general impossible to trisect an angle with compass and
straight-edge.
In both problems the specific magnitude x to be constructed is a
root of an irreducible equation of the third degree, and according to
our conclusion it is impossible for a root of such an equation to be built up
from quadratic roots, and therefore to be constructed with compass and straight-edge.
[As is well known, all expressions that can be represented by compass
and straight-edge constructions are either rational or built up from
quadratic roots.]
Thus it merely remains to show that the equations for doubling a
cube and trisecting an angle are cubic and irreducible.
The edge x of the cube that is twice the size of a cube with an edge
equal to 1 satisfies the equation
x^3 − 2 = 0.
If this equation were reducible, then it would necessarily follow that
x^3 − 2 = (x^2 + hx + k)(x − l),
where h, k, l are rational numbers. Accordingly, the equation
x^3 = 2 would have to possess the rational root l = p/q, where we may
assume that p and q are integers with no common divisor, and consequently
(p/q)^3 would have to be equal to 2 or p^3 equal to 2q^3. Consequently,
p^3 would have to be divisible by q^3 and therefore p would also have to
be divisible by q, which is possible only for q = 1; but then p^3 = 2,
which no integer p satisfies.
In the trisection of an angle we can consider the given angle α and
the angle we are looking for φ as peripheral angles of a unit circle, so
that the subtended chords are a = 2 sin α and x = 2 sin φ, respectively.
From α = 3φ and sin 3φ = 3 sin φ − 4 sin^3 φ it follows that
sin α = 3 sin φ − 4 sin^3 φ
or
x^3 − 3x + a = 0.
If we assume a chord a of length 3m/n, where m and n possess no
common divisors and are integers that cannot be divided by 3, and if
we multiply the equation by n^3 and set nx = X, the equation assumes
the form
X^3 − 3n^2 X + 3mn^2 = 0.
But according to Schoenemann's theorem (No. 25) this equation is
irreducible, since the coefficient of X is divisible by the prime number
3 and the free term is divisible by 3, but not by 3^2.
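Both irreducibility arguments reduce to finite checks on integer coefficients. The Python sketch below is an added illustration; the names `rational_roots` and `schoenemann_applies` and the sample chord a = 3/4 (m = 1, n = 4) are assumptions. A cubic without a rational root has no rational linear factor and is therefore irreducible.

```python
from fractions import Fraction

def rational_roots(coeffs):
    """Rational roots of an integer polynomial (coefficients from the highest power down, last one nonzero)."""
    lead, const = abs(coeffs[0]), abs(coeffs[-1])
    divisors = lambda v: [k for k in range(1, v + 1) if v % k == 0]
    candidates = {Fraction(sign * p, q) for p in divisors(const)
                  for q in divisors(lead) for sign in (1, -1)}
    value = lambda x: sum(c * x ** i for i, c in enumerate(reversed(coeffs)))
    return sorted(x for x in candidates if value(x) == 0)

def schoenemann_applies(coeffs, p):
    """True if p divides every coefficient but the first and p^2 does not divide the free term."""
    return (coeffs[0] % p != 0 and all(c % p == 0 for c in coeffs[1:])
            and coeffs[-1] % (p * p) != 0)

no_roots = rational_roots([1, 0, 0, -2])                 # x^3 - 2
m, n = 1, 4                                              # sample chord a = 3m/n = 3/4
applies = schoenemann_applies([1, 0, -3 * n * n, 3 * m * n * n], 3)   # X^3 - 3n^2 X + 3mn^2
```

The list `no_roots` is empty, and the criterion applies to X^3 − 48X + 48 with the prime number 3.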
37 The Regular Heptadecagon
To construct a regular heptadecagon.
In other words: To divide the perimeter of a circle into 17 equal parts.
This celebrated problem was solved by Gauss in his major work
Disquisitiones arithmeticae, published in 1801. In the section of this
work dealing with the solution of the binomial equations x^n = 1
Gauss proved the important theorem:
A regular polygon can be constructed with compass and straight-edge when
and only when the number of its sides has the form 2^m p_1 p_2 ... p_ν, where
p_1, p_2, ..., p_ν are all different prime numbers of the form 2^n + 1.
For m = 0, ν = 1, and p_1 = 3 and p_1 = 5, we obtain the cases of
the regular triangle and pentagon, respectively, which had already
been solved in antiquity.
In the conclusion to his investigations Gauss said, "The division of a
circle into three and into five equal parts was already known in
Euclid's time; it is amazing that nothing new was added to these
discoveries in the next two thousand years, that the geometers
considered it as confirmed that, except for these cases and those that could
be derived from them, regular polygons could not be constructed with
compass and straight-edge."
The great advances made in the division of the circle by Gauss
were possible only because Gauss transformed the originally purely
geometrical problem into an algebraic one. He arrived at this
transformation in the course of his representation of complex numbers
in the Gauss plane, which was named after him.
An arbitrary complex number c = a + bi is conventionally
represented in this plane by a point with the coordinates a|b; this
point itself is designated as "the complex number c." Another
common method is the trigonometric representation
c = r(cos ϑ + i sin ϑ)
of the complex number c, where r represents the so-called magnitude
(modulus) of the number, the distance of the number c from the null
point 0 of the number plane and ϑ, the so-called angle of the number,
which is the angle formed by the distance r and the axis of the positive
real numbers.
The points of the unit circle drawn about the center 0 represent
the so-called Gauss numbers, i.e., numbers of the form
y = cos φ + i sin φ,
where φ is the angle of the number y.
We will write for short
cos φ + i sin φ = 1_φ.
The fundamental property of the Gauss numbers is described by
the relation
1_α · 1_β = 1_(α+β),
i.e., the product of two Gauss numbers is also a Gauss number; the angle of
the product is the sum of the angles of the factors.
It is easily confirmed that the theorem also holds for products of
more than two Gauss numbers.
For example,
(1_φ)^n = 1_φ · 1_φ · 1_φ ... = 1_(nφ)
or, written out fully,
(cos φ + i sin φ)^n = cos nφ + i sin nφ.
This is Demoivre's formula (Abraham Demoivre, 1667-1754).
To obtain a regular polygon of n angles we mark off the angle
φ = 2π/n n times in succession from point 1 on the unit circle. The resulting
points representing the divisions are
ε_1 = ε = cos φ + i sin φ, ε_2 = cos 2φ + i sin 2φ, ...,
ε_n = cos nφ + i sin nφ = 1.
Then
ε_ν = ε_1^ν = ε^ν and ε_ν^n = ε^(νn) = (ε^n)^ν = 1.
The n vertices ε_1, ε_2, ..., ε_n of a regular polygon of n angles are therefore
the roots of the equation
z^n = 1.
Thus the geometric problem of "constructing a regular polygon of
n angles," following Gauss, turns out to be the problem "of finding the
roots of the equation z^n = 1."
Since one of the n roots of this equation has the value 1, we need
only find the other (n − 1) roots. These satisfy the equation
(z^n − 1)/(z − 1) = z^(n−1) + z^(n−2) + ... + z^2 + z + 1 = 0,
the so-called circle partition equation. In the case of n = 3, for example,
the equation reads
z^2 + z + 1 = 0
and has the roots
ε_1 = (−1 + i√3)/2, ε_2 = (−1 − i√3)/2.
Since the complex numbers ε_1 and ε_2 both possess the real component
−1/2, the vertices ε_1 and ε_2 of the regular triangle are the points of
intersection of the unit circle with the parallel to the imaginary number axis that
passes through the point −1/2.
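The case n = 3 can be reproduced with complex arithmetic. The Python sketch below is an added illustration; the function name `vertices` is an assumption, and `cmath.rect(1, ψ)` is used for the Gauss number with the angle ψ.

```python
import cmath

def vertices(n):
    """Vertices of the regular n-gon: cos(v*phi) + i sin(v*phi) for v = 1..n, with phi = 2*pi/n."""
    phi = 2 * cmath.pi / n
    return [cmath.rect(1.0, v * phi) for v in range(1, n + 1)]

eps = vertices(3)
residual = max(abs(z ** 3 - 1) for z in eps)             # every vertex satisfies z^n = 1
partition = max(abs(z * z + z + 1) for z in eps[:-1])    # the vertices other than 1
```

The first two vertices have the real component −1/2 and satisfy z^2 + z + 1 = 0 up to rounding error.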
A proof of the general theorem of Gauss would take us too far, so
that we will restrict ourselves here to a brief exposition of the basic
idea and the elements that are necessary for an understanding of the
construction of the regular heptadecagon.
Let us first take note of the fact that the construction of the regular
2^m N-gon, where N is the product of the odd prime numbers p, q, r, ...,
is equivalent to drawing the regular p-gon, q-gon, r-gon, etc. If we
have these polygons, we determine the integral numbers x, y, z, ... in
such manner that
(N/p) x + (N/q) y + (N/r) z + ... = 1.
This can be done because the numbers
N/p, N/q, N/r, ...
have no common divisor. Then
1/N = x/p + y/q + z/r + ...,
so that the Nth part of the circle is obtained by joining the x pths, y qths,
z rths, ... of the circle perimeter.
Consequently, we need only be concerned with the solution of the
circle partition equation
(1) z^(p−1) + z^(p−2) + ... + z^2 + z + 1 = 0,
in which p is a prime number of the form 2^n + 1.
The brilliant idea underlying Gauss' method of solution consists in
grouping the roots ε_1, ε_2, ..., ε_(p−1) of (1) (where ε_ν = ε_1^ν = ε^ν,
ε = cos φ + i sin φ, φ = 2π/p) into so-called periods. The Gauss
periods are root sums in which each successive term is the gth power
of the preceding term, and the gth power of the last sum term results
once again in the first term (hence the name period). The exponent g
is here a so-called primitive root of the prime number p, i.e., an integer
such that g^(p−1) is the smallest of its integral powers that leaves a
residue of 1 on division hyp. In other words, g is an integer such that
the roots of (1) can be expressed in the form
z0 = e, Zi = e", z2 = e"2, . . ., zp.2 = e"""2.
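The most obvious period is the sum of all the roots,
$z_0 + z_1 + z_2 + \cdots + z_{p-2}$.
In fact,
$z_{\nu+1} = z_\nu^{g}$ and $z_{p-2}^{g} = \varepsilon^{g^{p-1}} = \varepsilon^{sp+1}$ (where s is an integer) $= \varepsilon$.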
The following period contains only a = (p - 1)/2 terms and reads
$z_0 + z_2 + z_4 + \cdots + z_r$ (r = 2a - 2).
In this period each term is the Gth power of the preceding term and
$z_r^{G} = z_0$, where $G = g^2$ (the powers of G repeat after a steps, since
$G^a = g^{p-1}$ leaves the residue 1 on division by p).
Let
$b = \tfrac{1}{2}a$, $c = \tfrac{1}{2}b$, $d = \tfrac{1}{2}c$, etc.
Gauss' method for solving the circle partition equation consists of
reducing (1) to a chain of groups of quadratic equations. The first
group contains one, the second group two, the third group four, the
fourth group eight, etc., and the last group a quadratic equations.
The roots of the first group form periods of a terms, those of the
second group periods of b terms, those of the third periods of c terms,
those of the last periods of a single term, i.e., the roots of (1) itself.
The coefficients of the equations of one group can be determined
from the coefficients of the preceding group, so that the equations of
the last group give us the roots of (1) directly.
In the successive determination of coefficients the formula
(2) $\varepsilon^{E} = \varepsilon^{r}$,
in which r represents the residue remaining when the integral
exponent E is divided by p, plays a predominant role.
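We will now use the Gauss method to solve the equation for the
heptadecagon (p = 17):
$z^{16} + z^{15} + \cdots + z^2 + z + 1 = 0$.
Let $\varphi = 2\pi/17$, $\varepsilon = \varepsilon_1 = \cos\varphi + i\sin\varphi$, $\varepsilon_\nu = \varepsilon^\nu$, and accordingly,
let
$\varepsilon_1, \varepsilon_2, \varepsilon_3, \ldots, \varepsilon_{17}$ be the corners of the heptadecagon, for which
$z_\nu = \varepsilon^{g^\nu}$, where g represents the (smallest) primitive root 3 of 17.
The powers $3^1, 3^2, 3^3, \ldots, 3^{16}$ on division by 17 leave the residues
3, 9, 10, 13, 5, 15, 11, 16, 14, 8, 7, 4, 12, 2, 6, 1.
Consequently, according to (2),
$z_0 = \varepsilon$, $z_2 = \varepsilon^{9}$, $z_4 = \varepsilon^{13}$, $z_6 = \varepsilon^{15}$, $z_8 = \varepsilon^{16}$, $z_{10} = \varepsilon^{8}$,
$z_{12} = \varepsilon^{4}$, $z_{14} = \varepsilon^{2}$, $z_1 = \varepsilon^{3}$, $z_3 = \varepsilon^{10}$, $z_5 = \varepsilon^{5}$, $z_7 = \varepsilon^{11}$,
$z_9 = \varepsilon^{14}$, $z_{11} = \varepsilon^{7}$, $z_{13} = \varepsilon^{12}$, $z_{15} = \varepsilon^{6}$.
Each root in the series $z_0, z_1, z_2, \ldots$ is the cube of the preceding
one.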
The first group in the chain contains a quadratic equation the roots
of which are the periods
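$X = z_0 + z_2 + z_4 + z_6 + z_8 + z_{10} + z_{12} + z_{14}$
$= \varepsilon + \varepsilon^{9} + \varepsilon^{13} + \varepsilon^{15} + \varepsilon^{16} + \varepsilon^{8} + \varepsilon^{4} + \varepsilon^{2}$
and
$x = z_1 + z_3 + z_5 + z_7 + z_9 + z_{11} + z_{13} + z_{15}$
$= \varepsilon^{3} + \varepsilon^{10} + \varepsilon^{5} + \varepsilon^{11} + \varepsilon^{14} + \varepsilon^{7} + \varepsilon^{12} + \varepsilon^{6}$.
Since the sum of the roots of (1) possesses the value -1, we obtain
the relation
X + x = -1.
Making use of (2), we find on computation that Xx is equal to four
times the sum of all the roots of (1), and consequently
Xx = -4.
The quadratic equation for the periods X and x consequently reads
(I) $t^2 + t - 4 = 0$.
Its roots are
$X = \dfrac{-1 + \sqrt{17}}{2}$ and $x = \dfrac{-1 - \sqrt{17}}{2}$.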
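The two eight-term periods can be checked numerically. The following is only a floating-point sketch (the variable names are chosen for this example); it builds the roots in the order given by the primitive root 3 and forms the two periods from the even- and odd-numbered roots.

```python
import cmath

P = 17
G = 3  # smallest primitive root of 17

eps = cmath.exp(2j * cmath.pi / P)
# Residues of 3^0, 3^1, ..., 3^15 on division by 17: the exponents of z_0, ..., z_15.
exponents = [pow(G, v, P) for v in range(P - 1)]
z = [eps ** e for e in exponents]

X = sum(z[0::2])  # z_0 + z_2 + ... + z_14
x = sum(z[1::2])  # z_1 + z_3 + ... + z_15

period_sum = X + x       # expected to be close to -1
period_product = X * x   # expected to be close to -4
```

Both periods come out real up to rounding error, in agreement with rule (3).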
That X > x is shown in the following manner. If we designate the
real component of the complex number c as $\Re c$, then (cf. Fig. 29)
(3) $\Re\varepsilon_\mu = \Re\varepsilon_\nu$ if $\mu + \nu = 17$,
since the corners $\varepsilon_\mu$ and $\varepsilon_\nu$ of the heptadecagon are symmetrical to the
real axis. Applying this rule, we obtain
$\Re X = 2[\Re\varepsilon_1 + \Re\varepsilon_2 + \Re\varepsilon_4 + \Re\varepsilon_8]$,
$\Re x = 2(\Re\varepsilon_3 + \Re\varepsilon_5 + \Re\varepsilon_6 + \Re\varepsilon_7)$.
A glance at the figure shows that the bracket is positive and the
parenthesis negative.
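The four four-term periods are
$U = z_0 + z_4 + z_8 + z_{12} = \varepsilon + \varepsilon^{13} + \varepsilon^{16} + \varepsilon^{4}$,
$u = z_2 + z_6 + z_{10} + z_{14} = \varepsilon^{9} + \varepsilon^{15} + \varepsilon^{8} + \varepsilon^{2}$,
$V = z_1 + z_5 + z_9 + z_{13} = \varepsilon^{3} + \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{12}$,
$v = z_3 + z_7 + z_{11} + z_{15} = \varepsilon^{10} + \varepsilon^{11} + \varepsilon^{7} + \varepsilon^{6}$.
Fig. 29.
Here we obtain
$U + u = X$ | $V + v = x$
and, applying rule (2),
$Uu = \varepsilon^{1} + \varepsilon^{2} + \cdots + \varepsilon^{16} = -1$ | $Vv = \varepsilon^{1} + \varepsilon^{2} + \cdots + \varepsilon^{16} = -1$.
The respective quadratic equations are
(II) $t^2 - Xt - 1 = 0$ | $t^2 - xt - 1 = 0$.
Their roots are
$U = \dfrac{X + \sqrt{X^2 + 4}}{2}$ | $V = \dfrac{x + \sqrt{x^2 + 4}}{2}$,
$u = \dfrac{X - \sqrt{X^2 + 4}}{2}$ | $v = \dfrac{x - \sqrt{x^2 + 4}}{2}$.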
It follows from rule (3) that U > u and V > v. Indeed,
$\Re U = 2[\Re\varepsilon_1 + \Re\varepsilon_4]$,
$\Re u = 2(\Re\varepsilon_2 + \Re\varepsilon_8)$,
$\Re V = 2[\Re\varepsilon_3 + \Re\varepsilon_5]$,
$\Re v = 2(\Re\varepsilon_6 + \Re\varepsilon_7)$,
and a look at the heptadecagon shows that the brackets are larger than
the parentheses immediately below them.
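Of the two-membered periods obtained we need only the two
$W = z_0 + z_8 = \varepsilon + \varepsilon^{16}$ and $w = z_4 + z_{12} = \varepsilon^{13} + \varepsilon^{4}$.
Here we find
$W + w = U$
and, according to (2),
$Ww = \varepsilon^{5} + \varepsilon^{14} + \varepsilon^{3} + \varepsilon^{12} = V$.
Here also W > w, since $\Re W = 2\Re\varepsilon_1$ and $\Re w = 2\Re\varepsilon_4$, but $\Re\varepsilon_1 > \Re\varepsilon_4$.
The quadratic equation with the roots W and w reads
(III) $t^2 - Ut + V = 0$.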
The construction of the heptadecagon accordingly consists of the
following four steps:
I. Construction of X and x;
II. construction of U and V;
III. construction of W and w according to (III);
IV. finding the points W and w on the real number axis. The
perpendicular bisectors of the lines joining them to the null
point cut the circle S at the corners $\varepsilon_1$, $\varepsilon_{16}$ and $\varepsilon_4$, $\varepsilon_{13}$ of the
regular heptadecagon (thus all the other corners are also
determined).
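The four steps amount to solving a chain of quadratic equations, each of which needs only one square root. A minimal numerical sketch follows; the helper names `larger_root` and `smaller_root` are assumptions of this example. Since W is the sum of the first root and its conjugate, W/2 is the cosine of the central angle of the heptadecagon.

```python
import math


def larger_root(s, p):
    """Larger root of t^2 - s*t + p = 0 (root sum s, root product p)."""
    return (s + math.sqrt(s * s - 4 * p)) / 2


def smaller_root(s, p):
    """Smaller root of t^2 - s*t + p = 0."""
    return (s - math.sqrt(s * s - 4 * p)) / 2


# (I)   t^2 + t - 4 = 0     has the roots X > x.
X = larger_root(-1, -4)
x = smaller_root(-1, -4)
# (II)  t^2 - X t - 1 = 0   has the larger root U; t^2 - x t - 1 = 0 the larger root V.
U = larger_root(X, -1)
V = larger_root(x, -1)
# (III) t^2 - U t + V = 0   has the larger root W.
W = larger_root(U, V)

cos_central_angle = W / 2
```

Every quantity in the chain is obtained from earlier ones by rational operations and a single square root, which is what makes the compass-and-ruler construction possible.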
Archimedes' Determination of the Number π
Archimedes of Syracuse (287?-212 B.C.) was the greatest
mathematician of the ancient world.
The most famous of his achievements is the measurement of the
circle. The crux of this problem is the calculation of the number π,
i.e., the number by which the diameter and the square of the radius
must be multiplied to determine the circumference and area,
respectively, of a circle.*
* The proposal that this number be designated as π came from Leonhard
Euler (Commentarii Academiae Petropolitanae ad annum 1739, vol. IX).
The idea upon which Archimedes' method was based is the
following. The circumference of a circle lies between the perimeters
of a circumscribed and inscribed n-gon, and in particular, the greater
n is, the smaller is the deviation of the circumference of the circle
from the perimeters of the two n-gons. Then the object is to calculate
the perimeters of a circumscribed and inscribed regular polygon
with so great a number of sides that their difference is equal to a
negligible magnitude ε. Then if the circumference of the circle is set
equal to the perimeter of one of these polygons, the resulting deviation
from the true circumference of the circle is smaller than ε, with the
result that when ε is sufficiently small the circumference of the circle is
determined with sufficient accuracy.
Fig. 30.
The particular achievement of Archimedes was to indicate a method
by which the perimeters of such many-sided polygons could be
calculated.
This method, the so-called Archimedes algorithm, is based upon the
two Archimedes recurrence formulas which we will now derive.
In Figure 30, let Z be the center of the circle, let AB = 2t be the
side of the circumscribed and CD = 2s the side of the inscribed
regular n-gon. Let M be the midpoint of AB and N the midpoint of
CD, and let O be the point of intersection with MA of the tangent to the
circle passing through C. Accordingly, OM = OC = t' is half the
side of the circumscribed 2n-gon and MC = MD = 2s' is the side of
the inscribed regular 2n-gon.
Since ACO and AMZ are similar right triangles,
$t'/(t - t') = OC/OA = MZ/AZ$,
and from the ray theorem,
$s/t = NC/MA = CZ/AZ$.
Since the right sides of these proportions are equal (MZ and CZ are both
radii of the circle), we obtain
$t'/(t - t') = s/t$ or $t' = \dfrac{st}{s + t}$.
Since the isosceles triangles CMD and COM are similar, $2s'/2s = t'/2s'$, i.e.,
$2s'^2 = st'$.
If a is the perimeter of the circumscribed n-gon and b the perimeter
of the inscribed n-gon, and a' and b' are the perimeters,
respectively, of the circumscribed and inscribed 2n-gons, we then have
a = 2nt, b = 2ns, a' = 4nt', b' = 4ns'.
If we then introduce the values obtained for t, s, t', s' from these
equations into the two formulas we have found, they are transformed
into the Archimedes recurrence formulas:
(I) $a' = \dfrac{2ab}{a + b}$, (II) $b' = \sqrt{ba'}$.
Thus, a' is the harmonic mean of a and b, b' the geometric mean of b and a'.
Now let us consider in succession the regular n-gon, 2n-gon,
4n-gon, 8n-gon, etc., and let us designate the perimeters of the
circumscribed and inscribed $2^\nu n$-gons as $a_\nu$ and $b_\nu$, respectively. We
then obtain the Archimedes series
$a_0, b_0, a_1, b_1, a_2, b_2, \ldots$
of the successive perimeters. Here the recurrence formulas (I) and
(II) read
(1) $a_{\nu+1} = \dfrac{2a_\nu b_\nu}{a_\nu + b_\nu}$, (2) $b_{\nu+1} = \sqrt{b_\nu a_{\nu+1}}$.
That is: Each term of the Archimedes series is alternately the harmonic
and geometric mean of the two preceding terms.
Using this rule, we are able to calculate all the terms of the series
if the first two terms are known. The Archimedes algorithm consists of
this calculation of the successive perimeters of the polygons.
Archimedes chose as his initial polygon the regular hexagon, the
perimeters of which are $a_0 = 4\sqrt{3}\,r$ and $b_0 = 6r$, respectively (r being the
radius of the circle), and
worked out the series $a_1, b_1, a_2, b_2, a_3, b_3, a_4, b_4$ up to the perimeters
$a_4$ and $b_4$ of the circumscribed and inscribed regular 96-cornered
polygon. He found that
$a_4 < 3\tfrac{1}{7}\,d$, $b_4 > 3\tfrac{10}{71}\,d$,
where d is the diameter of the circle. The Archimedes approximation
for the value of π is consequently
$\pi \approx 3\tfrac{1}{7} \approx 3.14$.
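The Archimedes algorithm is easily followed numerically. The sketch below uses floating-point arithmetic and a circle of diameter 1 (so that the hexagon perimeters are 2√3 and 3), which is of course not how Archimedes worked; he used rational bounds for the square roots. The function name `archimedes_series` is an assumption of this example.

```python
import math


def archimedes_series(a0, b0, doublings):
    """Perimeters (a_v, b_v) of circumscribed and inscribed polygons, v = 0..doublings."""
    a, b = a0, b0
    series = [(a, b)]
    for _ in range(doublings):
        a = 2 * a * b / (a + b)   # (1): harmonic mean of a_v and b_v
        b = math.sqrt(b * a)      # (2): geometric mean of b_v and the new a_(v+1)
        series.append((a, b))
    return series


# Hexagon, dodecagon, 24-gon, 48-gon, 96-gon for a circle of diameter d = 1.
series = archimedes_series(2 * math.sqrt(3), 3.0, 4)
a4, b4 = series[-1]
```

With d = 1 the last pair encloses π itself, and both values lie inside the classical bounds 3 10/71 and 3 1/7.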
Note. The calculations involved in the Archimedes method are
very laborious. For this reason Christian Huygens, in his treatise
De circuli magnitudine inventa, published in Leyden in 1654, replaced the
limits $a_\nu$ and $b_\nu$ of the circumference u of the Archimedes method by
the limits $\alpha_\nu$ and $\beta_\nu$, which give a closer approximation of u and
make it possible to obtain π correctly to two decimal places for ν = 1.
Huygens' method, however, involves rather complicated
considerations. The following method supplied by the author is faster and
more convenient; it is based on the known theorem: The harmonic
mean of two unequal positive numbers is smaller than their geometric mean.
This can be expressed as
$\dfrac{2xy}{x + y} < \sqrt{xy}$.
[Since $(\sqrt{x} - \sqrt{y})^2 > 0$, it follows that $2\sqrt{xy} < x + y$, and from
this, multiplication by $\sqrt{xy}/(x + y)$ gives the designated inequality.]
According to this theorem, we obtain from (1) $a_{\nu+1} < \sqrt{a_\nu b_\nu}$. If
we multiply the square of this inequality by the square of (2), we
obtain (after division by $a_{\nu+1}$)
$a_{\nu+1} b_{\nu+1}^2 < a_\nu b_\nu^2$
or, if we set
$\sqrt[3]{a_\nu b_\nu^2} = A_\nu$,
then
(3) $A_{\nu+1} < A_\nu$.
According to the same theorem, it follows from (2) that
$b_{\nu+1} > \dfrac{2b_\nu a_{\nu+1}}{b_\nu + a_{\nu+1}}$ or $\dfrac{2}{b_{\nu+1}} < \dfrac{1}{b_\nu} + \dfrac{1}{a_{\nu+1}}$.
If we then add to this inequality the equation
$\dfrac{2}{a_{\nu+1}} = \dfrac{1}{a_\nu} + \dfrac{1}{b_\nu}$,
which is only a different manner of writing (1), we obtain
$\dfrac{1}{a_{\nu+1}} + \dfrac{2}{b_{\nu+1}} < \dfrac{1}{a_\nu} + \dfrac{2}{b_\nu}$
or
$\dfrac{3a_{\nu+1}b_{\nu+1}}{2a_{\nu+1} + b_{\nu+1}} > \dfrac{3a_\nu b_\nu}{2a_\nu + b_\nu}$,
or, in abbreviated form, if we set
$\dfrac{3a_\nu b_\nu}{2a_\nu + b_\nu} = B_\nu$,
then
(4) $B_{\nu+1} > B_\nu$.
The inequalities (3) and (4) imply that as ν increases, $A_\nu$ grows
continuously smaller, $B_\nu$ continuously larger.
Since for infinitely great ν, both $A_\nu$ and $B_\nu$ become the
circumference u of the circle, for every finite ν it must be true that
$B_\nu < u < A_\nu$.
The limits $A_\nu$ and $B_\nu$ of this inequality are much narrower than the
Archimedes limits $a_\nu$ and $b_\nu$. If we take the hexagon, for example, as
our initial polygon and d = 1, then $a_0 = 2\sqrt{3}$, $b_0 = 3$, u = π, and
we obtain $A_1 = 3.1419$ and $B_0 = 3.1402$; thus we are able to obtain
the value of π correct to two decimal places by using only
the inscribed hexagon and the circumscribed dodecagon, whereas the
same precision is achieved by the Archimedes method only with the
use of the polygon of 96 sides.
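The sharpened limits can be evaluated directly from their definitions, $A_\nu$ as the cube root of $a_\nu b_\nu^2$ and $B_\nu$ as $3a_\nu b_\nu/(2a_\nu + b_\nu)$. The sketch below does this for the hexagon start with d = 1; the function name `sharpened_limits` is an assumption of this example, and the arithmetic is ordinary floating point.

```python
import math


def sharpened_limits(a, b):
    """Upper limit A and lower limit B built from one Archimedes pair (a_v, b_v)."""
    upper = (a * b * b) ** (1.0 / 3.0)   # A_v: cube root of a_v * b_v^2
    lower = 3 * a * b / (2 * a + b)      # B_v
    return upper, lower


a0, b0 = 2 * math.sqrt(3), 3.0           # hexagon perimeters for d = 1
a1 = 2 * a0 * b0 / (a0 + b0)             # circumscribed dodecagon, formula (1)
b1 = math.sqrt(b0 * a1)                  # inscribed dodecagon, formula (2)

_, B0 = sharpened_limits(a0, b0)
A1, _ = sharpened_limits(a1, b1)
gap = A1 - B0                            # width of the interval that encloses pi
```

Because $b_1^2 = b_0 a_1$, the value $A_1$ depends only on the inscribed hexagon and the circumscribed dodecagon.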
Fuss' Problem of the Chord-Tangent Quadrilateral
To find the relation between the radii and the line joining the centers of the
circles of circumscription and inscription of a bicentric quadrilateral.
A bicentric or chord-tangent quadrilateral is defined as a quadrilateral
that is simultaneously inscribed in one circle and circumscribed
about another. Let PQRS be such a quadrilateral, $\mathfrak{C}$ the
circumscribed circle, $\Gamma$ the inscribed circle. Let the points of tangency of the
opposite sides PQ and RS with circle $\Gamma$ be X and X', let the points of
tangency of the opposite sides SP and QR be Y and Y', and let the
point of intersection of the tangency chords XX' and YY' be O. If
we then apply the theorem of the sum of the angles of a quadrilateral
to the two quadrilaterals OXPY and OX'RY', designating the
quadrilateral angles by means of a line over the letter representing
the corner, we obtain the two equations
$\bar{O} + \bar{X} + \bar{P} + \bar{Y} = 360^\circ$, $\bar{O} + \bar{X}' + \bar{R} + \bar{Y}' = 360^\circ$.
Since the angles $\bar{X}$ and $\bar{X}'$ ($\bar{Y}$ and $\bar{Y}'$) situated at opposite sides of the
chord XX' (YY') add up to 180°, addition of the two equations gives
the following relation
(1) $2\bar{O} + \bar{P} + \bar{R} = 360^\circ$.
Now the sum of the two opposite angles $\bar{P}$ and $\bar{R}$ of the chord
quadrilateral PQRS is 180°; consequently, $\bar{O} = 90^\circ$.
The tangency chords of the two pairs of opposite sides of a bicentric
quadrilateral are therefore perpendicular to each other.
This condition is also sufficient: A bicentric quadrilateral PQRS is
obtained if the tangents PQ, RS, SP, QR are drawn through the end points
X, X', Y, Y' of two perpendicular chords XX' and YY' of an arbitrary
circle $\Gamma$. In fact, it now follows from (1), since $\bar{O} = 90^\circ$, that the sum
of the opposite angles $\bar{P}$ and $\bar{R}$ is 180°, i.e., that PQRS is also a chord
quadrilateral.
The simplest way of obtaining the desired relation between the
radii and the axis of the centers of the circumscribed and inscribed
circles is by means of the following locus problem. A right angle is
rotated about its fixed vertex, which is located inside a circle; find the locus of
the point of intersection of the two circle tangents that pass through the points of
intersection of the legs of the angle with the circle.
Solution of the locus problem. Let the given circle be known as
$\Gamma$, its midpoint as M, its radius as $\rho$, the fixed vertex of the right angle
as O, the distance of the vertex from M as e. Let the legs of the right
angle intersect the circle at the (moving) points X and Y; and let the
point of intersection of the two circle tangents passing through X and
Y be known as P and its distance from the center of the circle as p.
Fig. 32.
We will first determine the relation between p and the angle $\varphi$
(= ∠OMP) that MP forms with the fixed line MO.
Since OXY is a right triangle,
$OF^2 = FX \cdot FY$,
where F represents the base point of the altitude to the hypotenuse.
If we introduce the projections $p' = MN$ and $e' = e\cos\varphi$ and
$p'' = NX$ and $e'' = e\sin\varphi$ (= NF) on the lines MP and XY,
respectively (N being the point of intersection of MP and XY), the equation can be written
$(p' - e')^2 = (p'' - e'')(p'' + e'')$
or
$2p'^2 - 2p'e' + e'^2 + e''^2 = p'^2 + p''^2$
or
(2) $2p'^2 - 2p'e\cos\varphi + e^2 = \rho^2$.
Since MXP is a right triangle,
$MX^2 = MP \cdot MN$
or
(3) $\rho^2 = pp'$.
If we introduce the value of p' from (3) into (2), we obtain the
relation we are looking for:
(4) $p^2 + \dfrac{2\rho^2 e}{\rho^2 - e^2}\,p\cos\varphi = \dfrac{2\rho^4}{\rho^2 - e^2}$.
The distance r = ZP from P of a point Z situated on the extension of OM
at a distance MZ = z from M is obtained by the cosine theorem
(5) $r^2 = z^2 + p^2 + 2zp\cos\varphi$.
If for z, which up to this point has been arbitrary, we now choose the
value
(I) $MZ = z = \dfrac{\rho^2 e}{\rho^2 - e^2}$,
we obtain, in accordance with (4),
(II) $r^2 = z^2 + \dfrac{2\rho^4}{\rho^2 - e^2}$,
and consequently r has a constant value!
The desired locus of the point of intersection P is thus a circle $\mathfrak{C}$ whose
center Z, which is situated on the extension of OM, is determined by
(I) and whose radius r is determined by (II).
Naturally, also belonging to this locus are the points of intersection
Q, R, S of the tangents, which are obtained when we draw the
tangents through the points of intersection of the circle $\Gamma$ with the
extensions of XO and YO.
The quadrilateral PQRS is simultaneously a tangent and chord
quadrilateral, in that it circumscribes circle $\Gamma$ and is inscribed in
circle $\mathfrak{C}$. If the right angle XOY is rotated about O so that the points
X, Y describe the circle $\Gamma$, the quadrilateral PQRS continuously
assumes different positions but always circumscribes circle $\Gamma$ and is
always inscribed in circle $\mathfrak{C}$. Similarly, we see that in this way all
the bicentric quadrilaterals belonging to the two circles $\Gamma$ and $\mathfrak{C}$ are
obtained. The obtained formulas (I) and (II) contain the solution
to the problem posed.
We substitute the value obtained from (II) for $\rho^2 - e^2$ in (I) and
obtain $e = 2z\rho^2/(r^2 - z^2)$. From this there follows $\rho^2 - e^2 =
\rho^2[(r^2 - z^2)^2 - 4\rho^2 z^2]/(r^2 - z^2)^2$. When this value is introduced
into (II) we finally obtain the sought-for relation between the radii r and
$\rho$ and the axis z connecting the centers of the circumscribed and inscribed circles
of the bicentric quadrilateral:
$2\rho^2(r^2 + z^2) = (r^2 - z^2)^2$.
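The locus argument can be tested numerically. In the sketch below the coordinates are an assumption of this example: the inscribed circle has center M = (0, 0) and radius ρ, the vertex of the right angle is O = (e, 0), and the center of the circumscribed circle is therefore Z = (-z, 0). For many positions of the right angle, the point P where the two tangents meet should lie at the constant distance r from Z.

```python
import math


def fuss_circle(rho, e):
    """Center distance z = MZ and radius r of the locus circle, formulas (I) and (II)."""
    z = rho**2 * e / (rho**2 - e**2)
    r = math.sqrt(z**2 + 2 * rho**4 / (rho**2 - e**2))
    return z, r


def tangent_intersection(rho, e, theta):
    """Point P where the tangents at X and Y meet, for legs in directions theta, theta + 90 deg."""
    def hit(angle):
        dx, dy = math.cos(angle), math.sin(angle)
        od = e * dx                                   # dot product of O with the direction
        t = -od + math.sqrt(od * od + rho**2 - e**2)  # positive root, since O lies inside
        return e + t * dx, t * dy

    (x1, y1), (x2, y2) = hit(theta), hit(theta + math.pi / 2)
    # P satisfies P.X = rho^2 and P.Y = rho^2 (the two tangent equations); Cramer's rule.
    det = x1 * y2 - y1 * x2
    return rho**2 * (y2 - y1) / det, rho**2 * (x1 - x2) / det


rho, e = 1.0, 0.5
z, r = fuss_circle(rho, e)
distances = [
    math.hypot(px + z, py)   # distance from Z = (-z, 0) to P
    for px, py in (tangent_intersection(rho, e, 0.1 * k) for k in range(60))
]
fuss_defect = 2 * rho**2 * (r**2 + z**2) - (r**2 - z**2) ** 2
```

All entries of `distances` should agree with r, and `fuss_defect` should vanish up to rounding.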
The developed formula comes from Nicolaus Fuss (1755-1826), a
student and friend of Leonhard Euler. Fuss also found the
corresponding formulas for the bicentric pentagon, hexagon, heptagon, and
octagon (Nova Acta Petropol., XIII, 1798).
The corresponding formula for the triangle had already been given
by Euler. It is
$r^2 - z^2 = 2r\rho$
and is easily obtained in the following manner. Let ABC be any
triangle, let Z and M be the respective centers, r and $\rho$ the radii of
the circles of circumscription and inscription, respectively; thus,
ZM = z is the axis connecting the centers; further, let D be the point
at which the extension of CM meets the circumscribed circle, so that
DM = DA = DB. The power of the circumscribed circle at M is
$MC \cdot MD = r^2 - z^2$.
However, since we can replace $\sin(\gamma/2)$ by the ratio $\rho/MC$ as well as
by AD/2r or MD/2r, we have $\rho/MC = MD/2r$, i.e.,
$MC \cdot MD = 2r\rho$.
When the two values found for the product MC·MD are set equal to
each other we obtain Euler's formula.
Note. Much more remarkable than the Fuss formula is a theorem
concerning bicentric quadrilaterals that follows directly from the
preceding locus consideration. For convenience in expression we will
make a prefatory observation.
Let a circle $\Gamma$ lie completely inside another circle $\mathfrak{C}$. If from any
point on $\mathfrak{C}$ we draw a tangent to $\Gamma$, extend the tangent line so that it
intersects $\mathfrak{C}$, and draw from the point of intersection a new tangent
to $\Gamma$, extend this tangent similarly to intersect $\mathfrak{C}$, and continue in this
manner, we obtain a so-called Poncelet traverse which, when it consists
of n chords of the larger circle, is called n-sided.
The theorem concerning bicentric quadrilaterals now reads:
If on the circle of circumscription there is one point of origin for which a
four-sided Poncelet traverse is closed, then the four-sided traverse will also close
for any other point of origin on the circle.
The French mathematician Poncelet (1788-1867) demonstrated
that this theorem is not limited to four-sided traverses only, but is
generally true for n-sided traverses, and not only for circles, but for
any type of conic section. The general theorem reads:
Poncelet's closure theorem: If an n-sided Poncelet traverse constructed
for two given conic sections is closed for one position of the point of origin, it is
closed for any position of the point of origin.
Annex to a Survey
To determine the position of unknown but accessible points of the earth's
surface by taking the bearings of known points.
(A point on the earth's surface is considered as known when its
geographic coordinates [longitude and latitude] are known.)
This problem is of great importance in the incorporation of new
points of the earth's surface into a survey and consequently in the
preparation of accurate maps.
Land surveyors and sailors are specifically confronted with the
following two cases:
I. The Snellius-Pothenot problem; the problem of three
inaccessible points: Determine the position of an unknown accessible point
P by its bearings from three inaccessible known points A, B, C.
This most famous of all land surveying problems was posed and
solved by the Dutchman Willebrord Snellius (1581-1626) in his 1617
work, Eratosthenes Batavus, but attracted no attention among his
contemporaries. It was not commonly known until it was solved
once again by the Frenchman Pothenot (died 1732) in a paper
submitted in 1692 to the French Academy. Since then it has been
known as the Pothenot problem.
II. Hansen's problem; the problem of the inaccessible distance:
From the position of two known but inaccessible points A and B, determine
the position of two unknown accessible points P and P' by bearings from
A, B, P' to P and from A, B, P to P'.
This problem was solved by the German astronomer Hansen
(1795-1874), but was solved as well by other authors before him.
Trigonometric Solution
This type of solution is required when accuracy is important, as in
land surveying. For both problems this type of solution is based
upon the sine tangent theorem:
If
$\sin\alpha/\sin\beta = m/n$,
then also
$\tan\dfrac{\alpha - \beta}{2} \Big/ \tan\dfrac{\alpha + \beta}{2} = (m - n)/(m + n)$.
[From $\sin\alpha/\sin\beta = m/n$ it first follows that
$(\sin\alpha - \sin\beta)/(\sin\alpha + \sin\beta) = (m - n)/(m + n)$.
If the numerator and denominator of the fraction on the left of the
equation are converted into products, we obtain
$\cos\dfrac{\alpha + \beta}{2}\sin\dfrac{\alpha - \beta}{2} \Big/ \sin\dfrac{\alpha + \beta}{2}\cos\dfrac{\alpha - \beta}{2} = (m - n)/(m + n)$
or
$\tan\dfrac{\alpha - \beta}{2} \Big/ \tan\dfrac{\alpha + \beta}{2} = (m - n)/(m + n)$.]
Solution of the Pothenot Problem
Known are the five elements AC = a, BC = b, ∠ACB = γ,
∠APC = α, ∠BPC = β; to be found are the five elements AP = x,
BP = y, CP = z, ∠CAP = ψ, ∠CBP = φ. If the sine theorem is
applied to the triangles ACP and BCP,
$\dfrac{\sin\psi}{\sin\alpha} = \dfrac{z}{a}$ and $\dfrac{\sin\varphi}{\sin\beta} = \dfrac{z}{b}$.
On division it follows from this that
$\sin\psi/\sin\varphi = b\sin\alpha/a\sin\beta$.
We determine the auxiliary angle μ whose tangent is $b\sin\alpha/a\sin\beta$,
and obtain
$\sin\psi/\sin\varphi = \tan\mu$.
From this it follows according to the sine tangent theorem that
$\tan\dfrac{\psi - \varphi}{2} \Big/ \tan\dfrac{\psi + \varphi}{2} = \dfrac{\tan\mu - 1}{\tan\mu + 1}$,
i.e.,
$\tan\dfrac{\psi - \varphi}{2} = \tan\dfrac{\psi + \varphi}{2}\,\tan(\mu - 45^\circ)$.
Since ψ + φ (= 360° - α - β - γ) is known, this equation gives us
$\dfrac{\psi - \varphi}{2}$.
From
$\dfrac{\psi + \varphi}{2}$ and $\dfrac{\psi - \varphi}{2}$
addition and subtraction give us ψ and φ.
The unknowns x, y, z are obtained from the following formulas
derived from the sine theorem:
$\dfrac{x}{a} = \dfrac{\sin(\alpha + \psi)}{\sin\alpha}$, $\dfrac{y}{b} = \dfrac{\sin(\beta + \varphi)}{\sin\beta}$, $\dfrac{z}{a} = \dfrac{\sin\psi}{\sin\alpha}$.
The position of the point P is determined from the magnitudes
ψ, φ, x, y, z.
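The trigonometric solution translates almost line by line into a short computation. The following sketch works in radians; the function names and the test configuration (coordinates of A, B, C and of a "true" point P used only to simulate the measured angles) are assumptions of this example. It presupposes that P, A, C, B form a convex quadrilateral with diagonal CP and that P does not lie on the circle through A, B, C, where the sum of the two auxiliary angles is 180° and the method breaks down.

```python
import math


def pothenot(a, b, gamma, alpha, beta):
    """Return (psi, phi, x, y, z) from AC = a, BC = b and the angles gamma, alpha, beta."""
    # Auxiliary angle mu with tan(mu) = b sin(alpha) / (a sin(beta)).
    mu = math.atan2(b * math.sin(alpha), a * math.sin(beta))
    half_sum = (2 * math.pi - alpha - beta - gamma) / 2
    half_diff = math.atan(math.tan(half_sum) * math.tan(mu - math.pi / 4))
    psi, phi = half_sum + half_diff, half_sum - half_diff
    x = a * math.sin(alpha + psi) / math.sin(alpha)
    y = b * math.sin(beta + phi) / math.sin(beta)
    z = a * math.sin(psi) / math.sin(alpha)
    return psi, phi, x, y, z


def angle_at(vertex, p1, p2):
    """Unsigned angle p1-vertex-p2 in radians."""
    ax, ay = p1[0] - vertex[0], p1[1] - vertex[1]
    bx, by = p2[0] - vertex[0], p2[1] - vertex[1]
    return abs(math.atan2(ax * by - ay * bx, ax * bx + ay * by))


# Made-up configuration; P is used only to produce the "observed" angles.
A, B, C, P = (-4.0, 3.0), (5.0, 2.0), (0.0, 5.0), (0.0, 0.0)
psi, phi, x, y, z = pothenot(
    math.dist(A, C), math.dist(B, C),
    angle_at(C, A, B), angle_at(P, A, C), angle_at(P, B, C),
)
```

The computed x, y, z can then be compared with the distances of P from A, B, C in the simulated configuration.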
Solution of Hansen's Problem
Known are the five elements AB = c, ∠APB = γ, ∠AP'B = γ',
∠BPP' = δ, ∠AP'P = δ', and consequently also the angles PAP' = α
and PBP' = β; we do not know the seven elements AP = x, AP' = x',
BP = y, BP' = y', ∠BAP' = ψ, ∠ABP = φ, and PP' = s.
We now represent the four ratios of the adjacent sides of the
quadrilateral as sine ratios in accordance with the sine theorem:
$\dfrac{c}{x} = \dfrac{\sin\gamma}{\sin\varphi}$, $\dfrac{x}{s} = \dfrac{\sin\delta'}{\sin\alpha}$, $\dfrac{s}{y'} = \dfrac{\sin\beta}{\sin\delta}$, $\dfrac{y'}{c} = \dfrac{\sin\psi}{\sin\gamma'}$.
Multiplication of these equations gives us
$\dfrac{\sin\psi}{\sin\varphi}\cdot\dfrac{\sin\beta}{\sin\alpha}\cdot\dfrac{\sin\gamma}{\sin\gamma'}\cdot\dfrac{\sin\delta'}{\sin\delta} = 1$ or $\dfrac{\sin\psi}{\sin\varphi} = \dfrac{\sin\alpha\,\sin\gamma'\,\sin\delta}{\sin\beta\,\sin\gamma\,\sin\delta'}$.
We then determine an auxiliary angle μ whose tangent is equal to the
right side of this equation, and we obtain
$\dfrac{\sin\psi}{\sin\varphi} = \tan\mu$,
i.e., according to the sine tangent theorem as above,
$\tan\dfrac{\psi - \varphi}{2} = \tan\dfrac{\psi + \varphi}{2}\,\tan(\mu - 45^\circ)$.
As above, we find from this
$\dfrac{\psi - \varphi}{2}$ (since ψ + φ = δ + δ' is known)
and then ψ and φ. Now the remaining unknowns are easily obtained
by the sine theorem.
The positions of P and P' are determined by the values found for the
six unknowns.
The Drawing Solution
This is adequate when great accuracy is not requisite, for example,
in sailing along a coast where A, B, C are known landmarks, P and P'
unknown positions of a ship with a bearing on these landmarks.
The solution of Pothenot's problem is extremely simple. The
ship's position P is the point of intersection of the two circles to be
drawn on the ship's chart with the chords AC and BC and the
corresponding peripheral angles α and β.
Hansen's problem is solved in the following way. We draw a
quadrilateral abp'p having the same form as ABP'P (beginning with an
arbitrary distance pp') and lay this off on the chart so that b falls on B
and a on AB. The ship's position P is the point of intersection of Bp
with the parallel to ap passing through A, the ship's position P' is the
point of intersection of Bp' with the parallel to pp' passing through P.
Alhazen's Billiard Problem
To describe in a given circle an isosceles triangle whose legs pass through
two given points inside the circle.
This problem stems from the Arabic mathematician Abu Ali al
Hassan ibn al Hassan ibn Alhaitham (ca. 965-ca. 1039), whose
name was transformed into Alhazen by the translators of his Optics.
In his Optics the above problem has the following form: "Find the
point on a spherical concave mirror at which a ray of light coming from a given
point must strike in order to be reflected to another given point."
This problem can be posed in various other forms, e.g.: "On a
circular billiard table there are two balls; in what manner must one be struck
in order for it to strike the other after rebounding from the cushion?" or " On
the circumference of a circle find a point the sum of whose distances from two
given points within the circle is equal to a minimum (or maximum)"
A whole series of famous mathematicians took up this problem after
Alhazen, among them Huygens, Barrow, de L'Hôpital, Riccati, and
Quetelet.
Solution. Let us call the given circle $\mathfrak{K}$, its center M, its radius r,
the given points P and p, and let us make M the origin of a
rectangular coordinate system xy in which P and p have the
coordinates A | B and a | b.
If OS and Os, which pass through P and p, are the legs of the
isosceles triangle OSs that we are looking for, the angles Φ and φ,
which these legs form with the radius OM, must be equal.
If we designate the angles that the lines PO, MO, pO form with the
x-axis as Λ, μ, λ, then, on the one hand, Φ = Λ - μ and φ = μ - λ
or
$\tan\Phi = \dfrac{\tan\Lambda - \tan\mu}{1 + \tan\mu\tan\Lambda}$ and $\tan\varphi = \dfrac{\tan\mu - \tan\lambda}{1 + \tan\mu\tan\lambda}$,
while, on the other hand, if x | y are the coordinates of O,
$\tan\Lambda = \dfrac{y - B}{x - A}$, $\tan\mu = \dfrac{y}{x}$, $\tan\lambda = \dfrac{y - b}{x - a}$,
and consequently, since tan Φ = tan φ,
$\dfrac{\dfrac{y - B}{x - A} - \dfrac{y}{x}}{1 + \dfrac{y}{x}\cdot\dfrac{y - B}{x - A}} = \dfrac{\dfrac{y}{x} - \dfrac{y - b}{x - a}}{1 + \dfrac{y}{x}\cdot\dfrac{y - b}{x - a}}$
or
$\dfrac{Ay - Bx}{x^2 + y^2 - Ax - By} = \dfrac{bx - ay}{x^2 + y^2 - ax - by}$
or finally, if we set
Ab + Ba = H, Aa - Bb = K, A + a = h, B + b = k,
then
$H(x^2 - y^2) - 2Kxy + (x^2 + y^2)[hy - kx] = 0$.
Since the point O(x | y) has to lie upon the circle $\mathfrak{K}$, the circle
equation
(1) $x^2 + y^2 = r^2$
consequently applies here, and our condition assumes the form
(2) $H(x^2 - y^2) - 2Kxy + r^2[hy - kx] = 0$.
Since equation (2) represents a hyperbola, our conclusion reads as
follows:
The point O that we are looking for is the point of intersection of the circle
(1) with hyperbola (2).
Since there are in general four points of intersection for a circle and
a hyperbola, there are in general four solutions to our problem.
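The intersection of circle (1) with hyperbola (2) can be found numerically by running around the circle and locating the sign changes of the left side of (2). The sketch below is one simple way of doing this (a scan followed by bisection); the function names, the sample points P and p, and the numerical method are assumptions of this example and are not part of the classical solution. Each point found is then checked against the reflection law directly: the radius through O must bisect the angle between OP and Op.

```python
import math


def alhazen_points(r, P, p, samples=3600):
    """Points O on the circle of radius r (center at the origin) that satisfy equation (2)."""
    A, B = P
    a, b = p
    H, K, h, k = A * b + B * a, A * a - B * b, A + a, B + b

    def f(theta):
        x, y = r * math.cos(theta), r * math.sin(theta)
        return H * (x * x - y * y) - 2 * K * x * y + r * r * (h * y - k * x)

    points = []
    for i in range(samples):
        lo, hi = 2 * math.pi * i / samples, 2 * math.pi * (i + 1) / samples
        if f(lo) * f(hi) < 0:              # sign change: a root lies in between
            for _ in range(60):            # bisection
                mid = (lo + hi) / 2
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            theta = (lo + hi) / 2
            points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


def reflection_defect(O, P, p):
    """Zero when the radius through O bisects the angle P-O-p."""
    u = (P[0] - O[0], P[1] - O[1])
    v = (p[0] - O[0], p[1] - O[1])
    lu, lv = math.hypot(*u), math.hypot(*v)
    sx, sy = u[0] / lu + v[0] / lv, u[1] / lu + v[1] / lv   # direction of the bisector
    return sx * O[1] - sy * O[0]                            # cross product with OM


P, p = (0.5, 0.2), (-0.3, 0.4)
points = alhazen_points(1.0, P, p)
defects = [reflection_defect(O, P, p) for O in points]
```

A scan of this kind can miss roots where the curve merely touches the circle, so it should be read as an illustration rather than a robust solver.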
Possessing particular interest is the special case in which the distances
C and c of the given points P and p from the center M are equally
great. In this case we naturally take the perpendicular bisector of
Pp as the x-axis, and then we have
$A = a$, $B = -b$, $H = 0$, $K = c^2$, $h = 2a$, $k = 0$
and, according to (2),
$-2c^2xy + 2ar^2y = 0$.
This equation is satisfied by each of the conditions
(3) $y = 0$ and (4) $x = \dfrac{ar^2}{c^2}$.
From (3) follows the corresponding x = ±r. Consequently, the
points of intersection of $\mathfrak{K}$ with the x-axis satisfy the condition for the
point O we are looking for.
From (4) it follows that
$\dfrac{c^2}{a} = \dfrac{r^2}{x}$.
If we then draw through M a circle $\mathfrak{k}$ whose diameter MN = d =
$c^2/a$ lies on the x-axis, and if Q(X | Y) is a point of intersection of this
circle with $\mathfrak{K}$, it follows, since MNQ is a right triangle, that
$MQ^2 = MN \cdot X$ or $r^2 = dX$.
However, since $r^2/x = d$, we obtain
X = x.
Consequently, the points of intersection of the circles $\mathfrak{K}$ and $\mathfrak{k}$ also
satisfy the condition for the point O we are looking for.
For these points of intersection to exist, d must be > r or $c^2 > ar$.
We will assume that this condition is satisfied.
Now the quadrilateral MPpQ in circle $\mathfrak{k}$ is a chord quadrilateral,
and therefore, according to Ptolemy's theorem, the sum of the
products of the opposite sides must be equal to the product of the
diagonals:
$PQ \cdot Mp + pQ \cdot MP = MQ \cdot Pp$
or
(5) $(PQ + pQ)c = 2br$.
For any other point Q' of $\mathfrak{K}$, MPpQ' is not a chord quadrilateral,
and therefore the sum of the products of the opposite sides must be
greater than the product of the diagonals:
(6) $(PQ' + pQ')c > 2br$.
From (5) and (6) we obtain
$PQ + pQ < PQ' + pQ'$.
The problem: "On a given circle find a point the sum of whose distances
from two given points located in the circle at an equal distance from the
midpoint of the circle is a minimum" has the following striking solution:
The point we are looking for is the point of intersection of the given circle
with the circle that passes through the given points and the center of the given circle.
Note. In connection with the above problem Alhazen also solved
the problem: "How to strike a ball lying on a circular billiard table in such a
way that after twice striking the cushion the ball will return to its original position."
Solution. Let the billiard table possess the radius r and the center
M. Let the initial position of the ball be P, so that MP = c is
known. Let the ball first strike the circle at U, cross the extension of
PM at a right angle at F, then strike the circle at V and return from
here to P. UM and VM are then angle bisectors of the triangle
PUV.
Fig. 36.
We set
MF = x, FU = y, UP = z.
Applying the angle bisector theorem to the triangle FUP,
$y/z = x/c$,
and according to the Pythagorean theorem
$r^2 = x^2 + y^2$ and $z^2 = y^2 + (x + c)^2$.
If we eliminate y and z from these three equations, we obtain the
quadratic equation
$2cx^2 + r^2x = cr^2$
for the unknown x. From this, x is easily constructed.
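The quadratic equation can be solved and the resulting path checked against the law of reflection. In the sketch below the coordinates are an assumption of this example: M = (0, 0), P = (c, 0), F = (-x, 0), U = (-x, y) and V = (-x, -y). The function names are likewise hypothetical.

```python
import math


def two_cushion_shot(r, c):
    """Return (x, y) with MF = x and FU = y for table radius r and MP = c, 0 < c < r."""
    # Positive root of 2c x^2 + r^2 x = c r^2.
    x = (-r * r + math.sqrt(r**4 + 8 * c * c * r * r)) / (4 * c)
    y = math.sqrt(r * r - x * x)
    return x, y


def angle_between(u, v):
    """Unsigned angle between two plane vectors, in radians."""
    return math.atan2(abs(u[0] * v[1] - u[1] * v[0]), u[0] * v[0] + u[1] * v[1])


r, c = 1.0, 0.5
x, y = two_cushion_shot(r, c)

to_center = (x, -y)             # from U to M (the normal of the cushion at U)
to_ball = (c + x, -y)           # from U to P
to_second_hit = (0.0, -2 * y)   # from U to V
incidence = angle_between(to_ball, to_center)
reflection = angle_between(to_second_hit, to_center)
```

By the symmetry of the figure about the line PM, the same equality of angles then holds at V.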
Problems Concerning Conic
Sections and Cycloids
An Ellipse from Conjugate Radii
To draw an ellipse for which the magnitude and position of two conjugate
radii are given.
Solution. Let the ellipse have the center equation
(1) $\left(\dfrac{x}{a}\right)^2 + \left(\dfrac{y}{b}\right)^2 = 1$.
Let the prescribed conjugate radii be OP and OQ such that the
coordinates x | y and x' | y' of their end points satisfy the conditions
(2) $\dfrac{x'}{a} = -\dfrac{y}{b}$, $\dfrac{y'}{b} = \dfrac{x}{a}$.
(The conditions (2) give us directly the known value $-b^2/a^2$ for the
product of the slopes y/x and y'/x' of the two conjugate radii.)
Let the base point of the ordinate from Q be V. We rotate the
right triangle OQV clockwise about O by 90° to the position Oqv
and extend the straight line Pq to intersect with the axes of the
ellipse at H and K. According to (2), the distances of the points q
and P from the x-axis and the distances of the points P and q from the
y-axis are in the ratio of a/b. Consequently (according to the ray
theorem),
$\dfrac{Hq}{HP} = \dfrac{a}{b}$ and $\dfrac{KP}{Kq} = \dfrac{a}{b}$.
It then follows from this that
$\dfrac{HP + Pq}{HP} = \dfrac{Kq + qP}{Kq}$,
i.e., HP = Kq,
so that the center M of Pq is also the center of HK.
If we substitute HP for Kq, one of our proportions becomes
(3) KP/HP = a/b.
In order to obtain a second equation for the unknowns KP and HP,
we obtain the cosine and sine of the angle ν from HK to the x-axis:
$\cos\nu = x/KP$, $\sin\nu = y/HP$;
squaring and adding, we obtain
(4) $\dfrac{x^2}{KP^2} + \dfrac{y^2}{HP^2} = 1$.
From (1), (3), and (4) it immediately follows that
KP = a, HP = b.
This gives us the following simple
Construction. 1. We rotate OQ about O by 90° through the
interior of the obtuse angle POQ to the position Oq. 2. We
determine the center M of Pq and the points of intersection H and K of the
line Pq with the circle of center M and radius MO.
KP and HP are then equal to half the length of the axes of the ellipse, while
OH and OK represent the positions of the axes of the ellipse.
The rest is simple.
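The construction can be imitated with coordinates. In the sketch below the center O is placed at the origin, and the rule "rotate OQ by 90° toward P" is implemented by choosing the sense of rotation for which Oq makes an acute angle with OP; this rule, like the function name, is an assumption of the example and is meant to reproduce the rotation described in the construction. The test configuration is an ellipse with half-axes 3 and 2 turned through 0.4 radians.

```python
import math


def axes_from_conjugate_radii(P, Q):
    """Return ((KP, H), (HP, K)): each half-axis length with a point on the axis carrying it.

    Assumes the ellipse is not a circle, so that P and q are distinct points.
    """
    q = (Q[1], -Q[0])                        # OQ turned clockwise by 90 degrees
    if q[0] * P[0] + q[1] * P[1] < 0:        # turned away from P: use the other sense
        q = (-q[0], -q[1])
    m = ((P[0] + q[0]) / 2, (P[1] + q[1]) / 2)   # center M of Pq
    radius = math.hypot(*m)                      # MO
    d = (q[0] - P[0], q[1] - P[1])
    n = math.hypot(*d)
    u = (d[0] / n, d[1] / n)                     # unit vector along the line Pq
    H = (m[0] - radius * u[0], m[1] - radius * u[1])
    K = (m[0] + radius * u[0], m[1] + radius * u[1])
    return (math.dist(K, P), H), (math.dist(H, P), K)


def rotate(point, angle):
    c, s = math.cos(angle), math.sin(angle)
    return (c * point[0] - s * point[1], s * point[0] + c * point[1])


a, b, t, tilt = 3.0, 2.0, 0.7, 0.4
P = rotate((a * math.cos(t), b * math.sin(t)), tilt)
Q = rotate((-a * math.sin(t), b * math.cos(t)), tilt)
(first_length, H), (second_length, K) = axes_from_conjugate_radii(P, Q)
```

The two returned lengths are the half-axes, and the points H and K lie on the two axes of the tilted ellipse.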
An Ellipse in a Parallelogram
To inscribe in a prescribed parallelogram an ellipse that is tangent to the
parallelogram at a boundary point.
The solution of this problem is based upon the theorem: Every
ellipse can be considered as a normal projection of a circle.
Let ABCD be the given parallelogram, N the given boundary point
lying on AB. Let the other points at which the ellipse touches the
boundary of the parallelogram be K on BC, M on CD, and H on DA.
In the normal projection, in which the ellipse has the image of a
circle, the parallelogram ABCD and the tangency points N, K, M, H
appear as projections of a parallelogram circumscribing a circle, and
specifically of a rhombus abcd with the tangency points n, k, m, h.
Since nk || hm || ac and nh || km || bd and since parallelism is preserved in
a normal projection, NK || HM || AC and NH || KM || BD. Thus, we
find the tangency points H and K, respectively, by causing the
parallels through N to BD and AC to intersect with DA and BC,
respectively. The fourth tangency point M is the point of intersection
of CD with the parallel through H to AC.
Let the centers of the circle and ellipse be o and O, respectively.
We will now assume an arbitrary point z on the arc nh of the circle,
connect this point with m and n, and designate the points of
intersection of these connecting lines with hk and da as x and y. The two
triangles omx and any are then similar, since the angles at o and a, as
well as the angles at m and n, are equal because they are enclosed
between pairs of orthogonal legs. From this similarity we obtain the
proportion
ox/om = ay/an.
If we substitute oh for om and ah for an in this proportion, we obtain
ox/oh = ay/ah.
Let the normal projections of the points x, y, z be X, Y, Z. Since
the ratio of parallel segments is not altered in normal projection, we
have
OX/OH = AY/AH.
The points X and Y accordingly divide the radius of the ellipse OH and
the ellipse tangent AH in the same proportions.
Quite similar proportions are naturally found to obtain for the
other ellipse arcs MH, MK, NK.
We assign the tangents AH, BK, DH, CK to the arcs NH, NK, MH,
MK, respectively.
In summary we can then say:
If we connect a point of one of the four arcs with M and N, the
points of intersection of these connecting lines with the radius (OH or
OK) and the corresponding tangents divide the radius and tangents
in the same proportions.
This gives rise to the following elegant construction.
We divide the radii OH and OK and the tangents AH, BK, DH, CK
each into ν equal segments (eight segments are shown in Figure 38)
and number the division points from 1 to ν, beginning from the center of
the ellipse with the radii and at the corners of the parallelogram with
the tangents. We then connect M (N) with an arbitrary division
point of a radius and N (M) with the division point with the same
number of the tangent corresponding to the arc bounded by N (M)
and the end point of the radius. The point of intersection of the two
connecting lines is in each case a point on the ellipse.
Fig. 38.
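The division construction can be checked numerically. The following is a minimal sketch under stated assumptions, not the author's procedure: the parallelogram is taken as the affine image of the square with corners (±1, ±1), so that N is the midpoint of AB (a general boundary point N corresponds to a rhombus instead of a square). The names `intersect`, `arc_nh_points`, `circle_preimage_norm`, and the vectors `U`, `V` are hypothetical helpers introduced for illustration.

```python
def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (lines assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def arc_nh_points(O, U, V, nu=8):
    """Points of the ellipse arc NH obtained by the division construction.

    Assumed setup: the map (x, y) -> O + x*U + y*V sends the unit circle to
    the ellipse, so N = O + V, H = O - U, M = O - V, and the corner A = H + V.
    """
    N = (O[0] + V[0], O[1] + V[1])
    M = (O[0] - V[0], O[1] - V[1])
    H = (O[0] - U[0], O[1] - U[1])
    A = (H[0] + V[0], H[1] + V[1])
    pts = []
    for i in range(1, nu):
        t = i / nu
        X = (O[0] + t * (H[0] - O[0]), O[1] + t * (H[1] - O[1]))  # on radius OH
        Y = (A[0] + t * (H[0] - A[0]), A[1] + t * (H[1] - A[1]))  # on tangent AH
        pts.append(intersect(M, X, N, Y))  # line MX meets line NY
    return pts

def circle_preimage_norm(P, O, U, V):
    """Squared distance from the origin of the preimage of P under the affine
    map; a value of 1 means that P lies on the ellipse."""
    dx, dy = P[0] - O[0], P[1] - O[1]
    det = U[0] * V[1] - U[1] * V[0]
    x = (dx * V[1] - dy * V[0]) / det
    y = (U[0] * dy - U[1] * dx) / det
    return x * x + y * y
```

Every point returned by `arc_nh_points` should give a value of 1 (up to rounding) in `circle_preimage_norm`, which is the statement that the intersection points lie on the ellipse.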
A Parabola from Four Tangents
To draw a parabola four tangents to which are given.
The simplest solution of this beautiful problem is based upon
Lambert's theorem: The circumcircle of a triangle formed by three tangents
to a parabola passes through the focus.
(J. H. Lambert (1728-1777) was a German mathematician.)
In order to prove Lambert's theorem we need the
Theorem of similar triangles : Two tangents SA and SB to a parabola,
together with the lines from the focus to the contact points A and B and the
point of intersection S of the tangents, form two similar triangles FSA and
FSB such that the angle of the one triangle, situated at the point of tangency, is
always equal to the angle of the other triangle that is situated at the point of
intersection.
Proof. In accordance with the classical construction of the
parabola, the mirror images H and K of the focus F on the tangents SA
and SB, respectively, fall on the base points of the altitudes dropped
from A and B, respectively, on the directrix L.
Fig. 39.
Since the angles FAS and HAS are symmetrical, and the angles
HAS and FHK, as angles between pairs of orthogonal legs, are equal, it
follows that
∠FAS = ∠FHK
and likewise that
∠FBS = ∠FKH.
The angles FHK and FKH, as the inscribed angles opposite the chords
FK and FH, respectively, on the circumcircle of the
triangle FHK (whose center is the intersection S of the perpendicular
bisectors SA and SB of the sides FH and FK) are half as great as the
corresponding central angles and consequently equal to angles FSB and
FSA, respectively. Consequently,
∠FAS = ∠FSB and ∠FBS = ∠FSA. Q.E.D.
Lambert's theorem follows directly from the theorem we have just
proved.
In fact: If P and Q are the points of intersection of the tangents SA and SB
with a third tangent that touches the parabola at O, then,
according to the theorem of similar triangles,
∠FAS = ∠FSB and ∠FAP = ∠FPO
and consequently
∠FSQ = ∠FPQ.
According to this equation, however, the quadrilateral FPSQ is a
cyclic quadrilateral.
Lambert's theorem gives us directly the requisite construction: From
the four tangent triangles that can be formed from the four given
tangents, we choose two and draw the circumcircle of each. The
point of intersection of the two circumcircles other than the vertex
common to the two triangles is the focus. We then
find the mirror image of the focus on two tangents and in this way
obtain two points of the directrix, which gives us the directrix. The
rest is extremely simple.
Note. The theorem of the circumcircle of the tangent triangle
leads directly to the solution of the interesting problem:
Determine the locus of the foci of all parabolas that are tangent to three
straight lines.
The sought-for locus is the circumcircle of the triangle formed
from the lines.
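Lambert's theorem is easy to test numerically. The sketch below assumes the parabola y² = 4x with focus (1, 0) and the parametrization (t², 2t), for which the tangents at parameters t1 and t2 meet at (t1·t2, t1 + t2); these choices and the names `tangent_vertex`, `circumcircle`, and `focus_defect` are illustrative assumptions, not part of the source.

```python
import math

FOCUS = (1.0, 0.0)  # focus of the assumed parabola y^2 = 4x

def tangent_vertex(t1, t2):
    """Intersection of the tangents to y^2 = 4x at (t1^2, 2*t1) and (t2^2, 2*t2)."""
    return (t1 * t2, t1 + t2)

def circumcircle(p, q, r):
    """Center and radius of the circle through three non-collinear points."""
    ax, ay = p; bx, by = q; cx, cy = r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def focus_defect(t1, t2, t3):
    """Distance of the focus from the circumcircle of the tangent triangle
    (zero when the circle passes through the focus)."""
    center, radius = circumcircle(tangent_vertex(t1, t2),
                                  tangent_vertex(t2, t3),
                                  tangent_vertex(t3, t1))
    return math.hypot(FOCUS[0] - center[0], FOCUS[1] - center[1]) - radius
```

For four tangents, applying `focus_defect` to two different triples of parameters should give zero both times, which mirrors the construction of the focus as a common point of two circumcircles.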
A Parabola from Four Points
To draw a parabola that passes through four given points.
This lovely problem was first solved by Newton in his celebrated
Philosophiae naturalis principia mathematica, 1687, and then once again in
1707 in his Arithmetica universalis.
It is commonly based upon the auxiliary problem:
To draw a parabola for which three points and direction of the axis are
known.
The following solution of the auxiliary problem is based on the two
theorems:
I. The midpoints of parallel chords of a parabola lie on a parallel to the axis.
II. The perpendicular bisector of a parabola chord and the perpendicular
to the axis through the midpoint of the chord mark off the half parameter on the
axis.
Proof. The vertex equation of a parabola is commonly
expressed in the form $y^2 = 2px$. If x|y and X|Y are the end points of
a parabola chord, the slope of the chord with respect to the x-axis is
$S = (Y - y)/(X - x)$. From
$y^2 = 2px$ and $Y^2 = 2pX$
it follows, however, by subtraction that
$Y^2 - y^2 = 2p(X - x)$, i.e., $S = \frac{Y - y}{X - x} = \frac{2p}{Y + y}$.
If we call the ordinate of the midpoint of the chord η, the last equation
can be written (because 2η = Y + y) in the form
$S = p/\eta$.
According to this equation, the midpoints of all chords with the
same slope S have the same ordinate, with the result that these
midpoints lie on a line parallel to the axis of the parabola, and thus I.
is proved.
To prove II., we take note of the fact that the segment marked off
on the axis by the perpendicular bisector of our chord and the
perpendicular to the axis through the chord midpoint is equal to
ηS', where S' is the slope of the perpendicular bisector of the chord
with respect to the perpendicular to the axis. However, since
S' = S, the length of the segment is ηS = p, which was to be proved.
From II. it also follows that: If the midpoints of two parabola chords lie
on a perpendicular to the axis, the perpendicular bisectors of the chords intersect
on the axis.
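Theorems I and II can be illustrated with a short computation. This is a sketch assuming the parabola y² = 2px with a chord given by the ordinates of its end points; the function name `chord_check` is hypothetical, and the chord must be neither parallel nor perpendicular to the axis (y1 ≠ ±y2).

```python
def chord_check(p, y1, y2):
    """For the chord of y^2 = 2*p*x with end-point ordinates y1 and y2, return
    (slope S, midpoint ordinate eta, segment cut off on the axis)."""
    x1, x2 = y1 * y1 / (2 * p), y2 * y2 / (2 * p)
    slope = (y2 - y1) / (x2 - x1)          # S = (Y - y)/(X - x)
    eta = (y1 + y2) / 2                    # ordinate of the chord midpoint
    x_mid = (x1 + x2) / 2
    # The perpendicular bisector has slope -1/S and meets the axis y = 0 at
    # x = x_mid + eta*S; the perpendicular to the axis through the midpoint
    # meets it at x_mid.
    x_axis = x_mid + eta * slope
    return slope, eta, x_axis - x_mid
```

With p = 1.5, the chords with ordinates (1, 4) and (0, 5) have the same slope 0.6 and the same midpoint ordinate 2.5 (theorem I), and in both cases the segment cut off on the axis equals p (theorem II).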
Let A, B, C be the given parabola points, ℜ the direction of the axis.
Let us draw through the midpoint M of AB a parallel to the axis, through
the midpoint N of CA the perpendicular to the axis, and call their point
of intersection M₀. Then according to I., M₀ is the midpoint of the
parabola chord A₀B₀ that passes through M₀ and is parallel to AB.
We draw the perpendicular bisectors of CA and A₀B₀ (the latter as a
perpendicular dropped from M₀ to AB). According to II., their
point of intersection is a point on the axis, its distance from the base
point of the perpendicular dropped from M₀ or N is the half parameter
p. The rest is simple. For example, making use of the subnormal
(p) from A, we draw the normal AU and the tangent AV (both being
drawn to the axis). The midpoint of UV is then the focus and the
mirror image of the focus on the tangent is a point on the directrix.
Fig. 41.
The solution of Newton's parabola problem is based upon the
following auxiliary theorem: In all parabola quadrilaterals the products of
the diagonal segments are proportional to the squares of the segments on the
diagonals that are bounded by their point of intersection and the axis of the
parabola.
Proof. Let AB be an arbitrary parabola chord, let M be its
midpoint, U the point of intersection of the parabola with the parallel
to the parabola axis through M. If we select UM as the x-axis and the parabola
tangent through U as the y-axis, we obtain the usual parabola equation
in the form
$y^2 = 4kx$,
where k is the focal radius of the coordinate origin U. The coefficient
4k possesses the value $2p/\sin^2\kappa$, where 2p is the parameter and κ the
angle enclosed between the coordinate axes or the angle formed by the
chord AB with the axis of the parabola.
Fig. 42.
We select an arbitrary point O on AB and designate the point of
intersection of the parallel to the x-axis through O with the parabola
as Q, the coordinates of Q as x and y, and the coordinates of A as X
and Y, so that
QO = q = X − x, OA = Y − y, OB = Y + y.
From
$Y^2 = 4kX$ and $y^2 = 4kx$
it follows by subtraction that
$Y^2 - y^2 = 4k(X - x)$
or
$(Y + y)(Y - y) = 4k(X - x)$,
so that
(1) OA·OB = 4kq.
If A'B' is a second parabola chord through O, then accordingly
(2) OA'·OB' = 4k'q,
with $4k' = 2p/\sin^2\kappa'$, where κ' is the angle of the chord A'B' with the
parabola axis.
Division of (1) and (2) gives
$(OA \cdot OB)/(OA' \cdot OB') = k/k' = \sin^2\kappa'/\sin^2\kappa$.
If H and H' are the points of intersection of the chords AB and A'B'
with the parabola axis, it follows from the sine theorem that
$OH/OH' = \sin\kappa'/\sin\kappa$.
From the last two equations we finally obtain
$(OA \cdot OB)/(OA' \cdot OB') = OH^2/OH'^2$. Q.E.D.
With this theorem we can now obtain the following solution to
Newton's problem: Let A, B, C, D be the given points. We draw the
diagonals AC and BD of the quadrilateral ABCD and call their point
of intersection O. On the diagonals we mark off from O the mean
proportionals $OP = \sqrt{OA \cdot OC}$ and $OQ = \sqrt{OB \cdot OD}$. The
connecting line QP, according to the theorem we have just proved, is then
parallel to the parabola axis, and the problem now reduces to the
auxiliary problem treated above.
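The reduction to the auxiliary problem can be sketched in a few lines. The code below is an illustration under assumptions: ABCD is taken to be convex, so that the diagonals cross at an interior point O, and Q is marked off on either side of O, which yields two candidate directions (in agreement with the fact that in general two parabolas pass through four points). The names `intersect` and `axis_directions` are hypothetical.

```python
import math

def intersect(p1, p2, p3, p4):
    """Intersection of line p1p2 with line p3p4 (lines assumed not parallel)."""
    x1, y1 = p1; x2, y2 = p2; x3, y3 = p3; x4, y4 = p4
    d = (x1 - x2) * (y3 - y4) - (y1 - y2) * (x3 - x4)
    a = x1 * y2 - y1 * x2
    b = x3 * y4 - y3 * x4
    return ((a * (x3 - x4) - (x1 - x2) * b) / d,
            (a * (y3 - y4) - (y1 - y2) * b) / d)

def axis_directions(A, B, C, D):
    """Candidate axis directions (as vectors Q - P) for a parabola through
    the vertices of the convex quadrilateral ABCD."""
    dist = lambda P, Q: math.hypot(P[0] - Q[0], P[1] - Q[1])
    O = intersect(A, C, B, D)
    OP = math.sqrt(dist(O, A) * dist(O, C))   # mean proportional on AC
    OQ = math.sqrt(dist(O, B) * dist(O, D))   # mean proportional on BD
    ux, uy = (C[0] - A[0]) / dist(A, C), (C[1] - A[1]) / dist(A, C)
    vx, vy = (D[0] - B[0]) / dist(B, D), (D[1] - B[1]) / dist(B, D)
    P = (O[0] + OP * ux, O[1] + OP * uy)
    directions = []
    for sign in (1, -1):                      # Q on either side of O
        Q = (O[0] + sign * OQ * vx, O[1] + sign * OQ * vy)
        directions.append((Q[0] - P[0], Q[1] - P[1]))
    return directions
```

For the four points (−2, 4), (−1, 1), (1, 1), (3, 9), which lie on the parabola y = x² with a vertical axis, one of the two returned directions should be vertical.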
The following projective solution of Newton's problem also consists of the
reduction of the problem to the preceding auxiliary problem. This
transformation of the problem is accomplished by means of Desargues'
involution theorem (No. 63). According to this theorem, every
tangent to a parabola cuts the opposite sides of an inscribed
quadrilateral in point pairs of an involution in which the point of tangency of
the tangent is a double point.
As tangent T let us choose a very distant one. Let it be tangent to
the parabola at O and let it be cut at P, Q, P', and Q' by the lines
AB, BC, CD, DA connecting the four given parabola points. O is
then the double point of the involution determined by the pairs
(P, P') and (Q, Q'). Similarly, the rays drawn from an arbitrary
point Z of the picture plane to P, Q, P', Q', O form an involution with
the ray pairs (ZP, ZP') and (ZQ, ZQ') and the double ray ZO.
Because of the very great distances of the points P, Q, P', Q', O
the rays ZP, ZQ, ZP', ZQ' on the drawing paper run parallel to the
quadrilateral sides AB, BC, CD, DA, and the ray ZO here runs
parallel to the axis of the parabola. (The slope
$(y - b)/(x - a) = (\sqrt{2px} - b)/(x - a)$ of the line connecting points Z(a|b) and O(x|y),
because of the great value of x, is essentially equal to zero, so that the
ray ZO appears parallel to the axis on the drawing paper.)
Accordingly we obtain the following construction. We draw through
an arbitrary point Z of the paper the parallels p, q, p', q' to the lines
AB, BC, CD, and DA and construct a double ray of the involution
determined by the ray pairs (p, p') and (q, q'); this ray has the
direction of the parabola axis. Thus, the problem is reduced to the
auxiliary problem solved above.
Since in ray involution there are in general two double rays, there
are in general two parabolas that can be drawn through four given
points.
A Hyperbola from Four Points
To draw a right-angle (equilateral) hyperbola for which four points are
given.
The construction is based upon the auxiliary theorem: The Feuerbach
circle of a triangle inscribed in an equilateral hyperbola passes through the
center of the hyperbola.
Proof. Let ABC be a triangle inscribed in an equilateral
hyperbola with the center at Z and the asymptotes I and II; let A', B', C'
be the midpoints of the sides BC, CA, AB, and let A₁ and A₂ be the
points of intersection of BC with I and II, and B₁ and B₂ the points of
intersection of CA with I and II.
Since the asymptotes mark off equal segments on the extensions of a
hyperbola chord, BA₂ = CA₁ and CB₂ = AB₁, and A' is the midpoint
of A₁A₂ and B' the midpoint of B₁B₂. These midpoints are also the
centers of the circumcircles of the right triangles
A₁ZA₂ and B₁ZB₂, so that
∠A'ZA₁ = ∠A'A₁Z and ∠B'ZB₁ = ∠B'B₁Z.
Since the difference of the left sides of these equations represents
angle A'ZB' and the difference of the right sides angle A₁CB₁
(according to the theorem of external angles), these two angles are equal,
so that angles A'ZB' and A'CB' are equal or supplementary. However, since the
angles of the parallelogram CA'C'B' at C and C' are equal, angles
A'ZB' and A'C'B' are also equal or supplementary. The quadrilateral
ZA'C'B' is therefore a cyclic quadrilateral. In other words: the
circumcircle of the triangle A'B'C', i.e., the Feuerbach
circle of the triangle ABC (see No. 28), passes through the center of
the hyperbola. Q.E.D.
Construction. Let the four given points be A, B, C, D. We draw
the Feuerbach circles of the triangles ABC and ABD; their point of
intersection Z other than the midpoint of AB is the center of the hyperbola. We connect Z to the
midpoint A' of BC, draw the circle with center A' and radius A'Z, and at its points of
intersection A₁ and A₂ with the line BC we have two points of the
asymptotes I and II, which gives us the asymptotes. The rest is easy. (To
draw the hyperbola from points, for example, we pass an arbitrary
line through one of the given points, for example A, and mark off on
this line, from its intersection with II toward A, the segment between A
and its intersection with I; the point at the
end of the marked-off segment is a new point of the hyperbola.
Repetition of the construction with new lines through A gives us as
many points of the hyperbola as desired.)
Note. The proved auxiliary theorem immediately gives, as well,
the solution to the interesting
Locus problem: Find the locus of the centers of all equilateral hyperbolas
that can be circumscribed about a given triangle.
The locus is the Feuerbach circle of the given triangle.
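The auxiliary theorem can be verified numerically for a concrete hyperbola. The sketch assumes the equilateral hyperbola xy = 1, whose center is the origin, with points written as (t, 1/t); the Feuerbach circle is computed as the circle through the three side midpoints. The names `circumcircle` and `feuerbach_defect` are hypothetical.

```python
import math

def circumcircle(p, q, r):
    """Center and radius of the circle through three non-collinear points."""
    ax, ay = p; bx, by = q; cx, cy = r
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy), math.hypot(ax - ux, ay - uy)

def feuerbach_defect(t1, t2, t3):
    """Distance of the hyperbola center (0, 0) from the Feuerbach circle of
    the triangle with vertices (t, 1/t) on xy = 1 (zero if it lies on it)."""
    A, B, C = [(t, 1.0 / t) for t in (t1, t2, t3)]
    mid = lambda P, Q: ((P[0] + Q[0]) / 2, (P[1] + Q[1]) / 2)
    center, radius = circumcircle(mid(B, C), mid(C, A), mid(A, B))
    return math.hypot(center[0], center[1]) - radius
```

The defect should vanish whether the three vertices lie on one branch or on both branches of the hyperbola.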
Van Schooten's Locus Problem
Two vertexes of a rigid triangle in a plane slide along the arms of an angle
of the plane; what locus does the third vertex describe?
Franciscus van Schooten (the younger) (1615-1660), a Dutch
mathematician, treated this beautiful problem in his Exercitationes
mathematicae, which appeared in 1657.
Solution. We will first consider a special case of van Schooten's
problem, the solution to which had already been taught by the
Byzantine Proclus (410-485).
On a rigid line three points are marked; two of these slide along the arms of a
right angle; what locus does the third describe?
We select the arms I and II of the right angle as the x- and y-axes of
a coordinate system. Let the three marked points of the rigid line be
A, B, C, their mutual distances BC = a, CA = b, and AB = c. Then
c = a ± b, accordingly as C does or does not lie between A and B.
Let the point A slide on I and B on II. Let the marked point C
possess the coordinates x and y. Let the angle of the line with
respect to the x-axis be v; thus x, as the projection of a on I, is
equal to a cos v; y, as the projection of b on II, is equal to b sin v; and
consequently, $x^2 = a^2\cos^2 v$, $y^2 = b^2\sin^2 v$, and
$\frac{x^2}{a^2} + \frac{y^2}{b^2} = 1$.
The locus of the marked point C is thus an ellipse with the half axes a and b.
This locus property is the basis of the so-called paper strip
construction of the ellipse and of the trammel.
Paper Strip Construction of the Ellipse
On the sharp edge of a paper strip we mark off the three points in
the sequence B, A, C in such manner that BC = a and AC = b (<a)
are equal to the given half axes of an ellipse. We move the strip in
such manner that A always remains on the x-axis and B on the y-axis
and we constantly mark the place at which C is situated. The locus
described by the point C is an ellipse with the prescribed half axes
a and b.
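The paper strip construction can be simulated directly. This is a minimal sketch assuming a > b, with the strip position described by the angle v that the strip makes with the x-axis; the function name `strip_point` is hypothetical.

```python
import math

def strip_point(a, b, v):
    """Position of the mark C when A slides on the x-axis and B on the y-axis.

    The marks lie in the order B, A, C with BC = a and AC = b, so AB = a - b.
    """
    c = a - b
    A = (c * math.cos(v), 0.0)           # A on the x-axis
    B = (0.0, -c * math.sin(v))          # B on the y-axis, at distance c from A
    ux, uy = (A[0] - B[0]) / c, (A[1] - B[1]) / c   # unit vector from B toward A
    return (B[0] + a * ux, B[1] + a * uy)           # C lies at distance a from B
```

For every v the returned point (x, y) should satisfy x²/a² + y²/b² = 1, i.e., C runs along the ellipse with half axes a and b.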
The Trammel
A trammel consists of a cross with two grooves at right angles to
each other in which two sliding pins A and B move. The pins are
fixed to a beam to which at some point a movable pencil M can be
attached. When the pins slide in the grooves the pencil describes an
ellipse with the half axes AM and BM.
Now for the general van Schooten problem!
Let S be the apex of the fixed angle α along the arms of which the
vertexes A and B of the rigid triangle ABC slide. We draw the circle
𝔎 with AB as chord and α as inscribed angle, join its center M
with C and determine the points of intersection P and Q of this
connecting line with 𝔎. Let us consider this circle along with points P
and Q as being firmly connected to the rigid triangle, so that it also
participates in the motion of the triangle. Consequently, since α is
the inscribed angle opposite AB, it passes continuously through S.
The arcs AP and AQ continuously change their position but not their
magnitude! This entails the invariance of the inscribed angles ASP
and ASQ, which implies the invariance of the directions I and II that
are determined by SP and SQ. Since PQ is a diameter of 𝔎, I and II
are perpendicular to each other. We can therefore consider the
motion of the vertex C as the motion of the marked point C of a rigid
line PQC the other marked points of which P and Q slide along the
arms I and II of a right angle. According to the above special case,
C describes an ellipse.
Result: van Schooten's theorem: The locus of one corner of a three-cornered
plate the other two corners of which slide along the arms of a fixed
angle is an ellipse.
The above derivation also gives the magnitudes and position of the
ellipse. The axes of the ellipse have the positions I and II and the
magnitudes 2·CP and 2·CQ.
Cardan's Spur Wheel Problem
What is the locus described by a marked point on a circular disc that rolls
along the inner edge of a disc of double its radius?
Jerome Cardan, an Italian mathematician (1501-1576), is known
for the Cardan formula for solution of cubic equations.
Solution. Let the boundary of the large disc be 𝔎 and that of
the smaller disc 𝔨, and let their radii be equal to R = 2r and r,
respectively. First we will observe the motion of the diameter
AB of the small disc that bears the mark M. At the beginning of the motion let
A lie at the center O and B at the boundary point H of 𝔎. When
the circle 𝔨 has rolled forward within 𝔎 by the arc HT, let it cut the
radius OH at X, and let Y be the point at which it cuts the radius OK of
𝔎, which is perpendicular to OH. Since the angle XOY is 90°, XY
is a diameter of 𝔨, and the intersection S of XY with OT is the center
of 𝔨. If w is the inscribed angle XOT of 𝔨 in radian measure, then the
corresponding central angle XST is 2w and the arc XT is 2rw.
However, since w also represents the central angle HOT of 𝔎, the
arc HT = Rw = 2rw. The arc XT of the smaller circle is exactly as
long as the arc HT of the larger circle upon which the small circle has
rolled forward. X must therefore be the end B of the marked diameter
AB, consequently Y is the other end A of this diameter. The rolling
of a disc along the inner margin of a disc of double its width consequently means
that the end points of a marked diameter of the smaller circle slide along two
fixed orthogonal diameters of the larger circle. The locus of our marked
point M is therefore also the locus of the mark M of the diameter AB
whose end points A and B slide along the arms OK and OH of the
right angle HOK. In view of the paper strip construction of the
ellipse (No. 47), the locus we are seeking is thus an ellipse.
The half axes of this ellipse are MA and MB.
Fig. 45.
Note. Since a marked point on the boundary of the smaller disc
describes a diameter of the larger disc, a gear consisting of two spur
wheels the ratio of whose diameters is as 2:1 effects the conversion of a
circular motion into a reciprocal rectilinear motion.
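The result can be checked with a small simulation. The sketch assumes the center O of the large disc as origin, OH as x-axis, and a mark M at signed distance d from the center of the small disc, measured along the diameter toward B; rolling without slipping is modeled by turning the small disc through the angle −w when its center has advanced through the angle w (the absolute rotation w − 2w implied by the arc comparison above). The function name `rolling_point` is hypothetical.

```python
import math

def rolling_point(r, d, w):
    """Position of the mark M after the small disc (radius r) has rolled inside
    the large disc (radius 2r) so that its center has turned through w."""
    sx, sy = r * math.cos(w), r * math.sin(w)   # center S of the small disc
    phi = -w                                    # absolute rotation of the small disc
    return (sx + d * math.cos(phi), sy + d * math.sin(phi))
```

With d = r the mark is the end point B and stays on the diameter OH; for |d| < r the point satisfies x²/(r + d)² + y²/(r − d)² = 1, an ellipse with half axes MA = r + d and MB = r − d.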
Newton's Ellipse Problem
To determine the locus of the centers of all ellipses that can be inscribed in a
given (convex) quadrilateral.
Newton's very elegant solution to this problem is based upon the
theorem, also stemming from Newton:
The line connecting the midpoints of the diagonals of a quadrilateral
circumscribed about a circle passes through the center of the circle.
The proof of this property of a tangent quadrilateral is based upon
the following auxiliary theorem: The locus of the common vertex of two
triangles with prescribed base lines and a prescribed area sum is a straight line.
[Proof: Let f and g be the two prescribed base lines, x and y the
distances of the common vertex S of the two triangles from the
prescribed base lines and, at the same time, the "coordinates" of the
point S. The prescribed sum of the areas of the two triangles we will
call K. Since the triangles have the areas ½fx and ½gy, we obtain the
equation fx + gy = 2K, and this is the equation of a straight line.]
Let there be circumscribed about a circle of center O and radius r
the tangent quadrilateral ABCD with the sides AB = a, BC = b,
CD = c, DA = d, so that a + c = b + d. Let M be the midpoint
of the diagonal AC and N the midpoint of BD, 2J the area of the
quadrilateral. Since △MAB and △MCD have areas equal to one
half of △CAB and △ACD, respectively, the sum of the areas of the two
triangles MAB and MCD is equal to J, or half the area of the
quadrilateral. The same is true of the triangles NAB and NCD.
Consequently, the line MN is the locus of the common
vertex S of all the pairs of triangles (SAB, SCD) having the area sum J.
However, since the two triangles OAB and OCD also have the area
sum J (specifically,
I = OAB + OCD = r(a + c)/2 and II = OBC + ODA = r(b + d)/2,
so that, because a + c = b + d, I = II; from I + II = 2J it then follows that I = II = J),
O belongs to the locus. Q.E.D.
Fig. 46.
Now for the solution to Newton's problem!
Let us consider any ellipse inscribed in the given quadrilateral as
the normal projection of a circle. In this projection the quadrilateral
appears as the image (the normal projection) of an object quadrilateral
circumscribed about the circle. Now, since: 1. in the object the center
of the circle lies upon the line connecting the midpoints of the
diagonals; 2. halving is preserved in the normal projection; 3. the center of
the ellipse is the image of the center of the circle, then in the image
also the ellipse center lies on the line joining the midpoints of the
diagonals of the prescribed quadrilateral.
Conclusion: The locus of the centers of all the ellipses that can be
inscribed in a given quadrilateral is a straight line, specifically, the line
connecting the midpoints of the diagonals of the quadrilateral.
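Both the circle theorem and its projected form can be tested numerically. The sketch assumes a unit circle centered at the origin, tangent lines given by the angles of their contact points (increasing, with gaps smaller than π), and an affine map (x, y) → O + x·U + y·V standing in for the normal projection; all names (`tangential_quadrilateral`, `collinearity_defect`, `O`, `U`, `V`) are hypothetical.

```python
import math

def tangential_quadrilateral(thetas):
    """Vertices of the quadrilateral formed by the tangents to the unit circle
    at four contact angles (increasing, consecutive gaps below pi)."""
    verts = []
    for i in range(4):
        t1 = thetas[i]
        t2 = thetas[(i + 1) % 4] + (2 * math.pi if i == 3 else 0.0)
        m, h = (t1 + t2) / 2, (t2 - t1) / 2
        # The tangents at t1 and t2 meet on the bisecting ray at distance 1/cos(h).
        verts.append((math.cos(m) / math.cos(h), math.sin(m) / math.cos(h)))
    return verts

def collinearity_defect(thetas, O=(0.0, 0.0), U=(1.0, 0.0), V=(0.0, 1.0)):
    """Cross product of (M - O) and (N - O) for the affine image of the figure;
    zero means that the ellipse center O lies on the line MN."""
    img = lambda P: (O[0] + P[0] * U[0] + P[1] * V[0],
                     O[1] + P[0] * U[1] + P[1] * V[1])
    A, B, C, D = [img(P) for P in tangential_quadrilateral(thetas)]
    M = ((A[0] + C[0]) / 2, (A[1] + C[1]) / 2)   # midpoint of diagonal AC
    N = ((B[0] + D[0]) / 2, (B[1] + D[1]) / 2)   # midpoint of diagonal BD
    return (M[0] - O[0]) * (N[1] - O[1]) - (M[1] - O[1]) * (N[0] - O[0])
```

With the default identity map the function tests the circle theorem; with a general `O`, `U`, `V` it tests the statement for the inscribed ellipse that is the image of the circle.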
The Poncelet-Brianchon Hyperbola Problem
To determine the locus of the intersection of the altitudes of all the triangles
that can be inscribed in a right-angle (equilateral) hyperbola.
Brianchon (1785-1864) and Poncelet (1788-1867) were French
mathematicians. The solution is in vol. XI of the Annales de Gergonne
(1820-1821).
We relate the hyperbola to its asymptotes, which will serve as
coordinate axes (the x-axis and ξ-axis), and take the abscissa (ordinate)
of the apex of the hyperbola as the unit length. The equation for the
hyperbola then reads
xξ = 1.
Let PQR be an arbitrary triangle inscribed in the hyperbola, i.e., a
triangle whose vertexes P, Q, R lie on the hyperbola. Let the abscissas
of the points P, Q, R be a, b, c, the ordinates thus being α = 1/a,
β = 1/b, γ = 1/c.
The slope of the side QR is (β − γ)/(b − c) or, if we substitute 1/b
and 1/c for β and γ, −1/bc. The slope of the altitude to QR is thus bc.
The equation of this altitude is thus ξ − α = bc(x − a) or
(1) ξ + abc = bc(x + αβγ).
For the altitude passing through Q we obtain similarly
(2) ξ + abc = ca(x + αβγ).
Now, if the coordinates of the altitude intersection are understood
to be x|ξ, (1) and (2) both apply, and by equating the right sides we
find the abscissa x of the point of intersection of the altitudes:
(I) x = −αβγ.
If we introduce this value into (1) or (2), we obtain as the ordinate of
the altitude intersection
(II) ξ = −abc.
Multiplying (I) and (II) finally gives us
xξ = 1.
The altitude intersection thus lies on the hyperbola. Consequently:
The locus of the point of intersection of the altitudes of all the triangles that
can be inscribed in an equilateral hyperbola is the hyperbola itself.
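Formulas (I) and (II) can be confirmed independently of the derivation. The sketch below takes three abscissas on xξ = 1, forms the claimed altitude intersection, and checks the perpendicularity conditions directly; the function name `orthocenter_check` is hypothetical.

```python
def orthocenter_check(a, b, c):
    """Return the claimed altitude intersection H = (-1/(abc), -abc) of the
    triangle with vertices (a, 1/a), (b, 1/b), (c, 1/c), together with the
    dot products PH.QR and QH.RP (both zero if H is the orthocenter)."""
    P, Q, R = (a, 1.0 / a), (b, 1.0 / b), (c, 1.0 / c)
    H = (-1.0 / (a * b * c), -a * b * c)   # (I) x = -alpha*beta*gamma, (II) xi = -abc
    dot = lambda u, v: u[0] * v[0] + u[1] * v[1]
    sub = lambda u, v: (u[0] - v[0], u[1] - v[1])
    return H, dot(sub(H, P), sub(R, Q)), dot(sub(H, Q), sub(P, R))
```

Both dot products should vanish, and the product of the coordinates of H should equal 1, so that H lies on the hyperbola.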
A Parabola as Envelope
On one arm of an angle the arbitrary segment e and, on the other, the
segment f are marked off n times in succession from the vertex of
the angle, and the segment end points are numbered, beginning from
the vertex, 0, 1, 2, ..., n and n, n − 1, ..., 2, 1, 0, respectively.
Prove that the lines joining the points with the same number envelop a
parabola.
The proof is based upon the
Theorem of Apollonius: Two tangents to a parabola are divided into
segments of like proportion by a third and this third is divided in the same
proportion by its point of tangency.
More precisely: If the two parabola tangents SA and SB, with the
points of tangency A and B, are intersected by a third parabola
tangent at P and Q, and if O is the point of tangency of this third
tangent (Figure 40), we obtain the equation
SP/PA = OQ/OP = BQ/SQ.
The proof of the Apollonian theorem is based upon the known
parabola property: The point of intersection of two parabola tangents lies on
a parallel to the parabola axis, passing through the midpoint of the chord
connecting the points of tangency. (It follows directly from the fact
that the three perpendicular bisectors of the sides of the triangle FA'B' whose
vertexes are the focus F and the projections A' and B' of the points of
tangency A and B on the directrix pass through a single point. Two of these
perpendicular bisectors are the tangents and the third is the parallel to
the axis.)
Because of this property
(1) p' = a', (2) q' = b', (3) b' + β' = a' + α',
if we call the projections of the segments AP = a, PS = α, BQ = b,
QS = β, OP = p, OQ = q on the directrix a', α', b', β', p', q'. Moreover,
as a result of the equality of the projections of the segment PQ and
the traverse PSQ,
(4) p' + q' = α' + β'.
If, in accordance with (1) and (2), we substitute a' and b' for p' and q'
in (4), we obtain
α' + β' = a' + b',
and this equation when combined with (3) shows that
α' = b' and β' = a'.
This now gives us
α/a = α'/a' = b'/a',
q/p = q'/p' = b'/a',
b/β = b'/β' = b'/a',
which proves the theorem of Apollonius.
The execution of the envelope construction described above is now
very simple. Let us call the apex of the angle S; we then select on the arms
of the angle the points A and B in such manner that SA = ne and
SB = nf (A and B are the same points that received the numbers n
and 0 in the numbering process previously described), and consider
the parabola that is tangent to the arms of the angle at A and B. According
to Apollonius' theorem, the line connecting the point P on SA to which the
number ν has been assigned with the point Q on SB bearing the same number is tangent to the parabola.
[The ratios PS:PA and QB:QS are both equal to ν:(n − ν).]
Consequently, the parabola is enveloped by the lines joining the points
with the same numbers.
At the same time, Apollonius' theorem makes it possible to draw the
tangency point for each connecting line.
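The numbering scheme and the tangency point can be sketched in code. The example relies on a standard fact that is not stated in the source: the quadratic Bézier curve with control points A, S, B is the arc of the parabola tangent to SA at A and to SB at B, and its point for the parameter t is obtained by repeated linear interpolation. The names `lerp`, `envelope_line`, and `apollonius_ratios` are hypothetical.

```python
import math

def lerp(P, Q, t):
    """Point dividing the segment PQ so that it equals (1 - t)*P + t*Q."""
    return ((1 - t) * P[0] + t * Q[0], (1 - t) * P[1] + t * Q[1])

def envelope_line(S, A, B, n, nu):
    """End points P, Q of the joining line numbered nu (0 < nu < n) and its
    tangency point O with the parabola touching SA at A and SB at B."""
    t = 1 - nu / n
    P = lerp(A, S, t)    # SP = (nu/n) * SA, i.e., nu segments e from S
    Q = lerp(S, B, t)    # SQ = ((n - nu)/n) * SB, the point numbered nu on SB
    O = lerp(P, Q, t)    # Bezier point (1-t)^2 A + 2t(1-t) S + t^2 B
    return P, Q, O

def apollonius_ratios(S, A, B, n, nu):
    """The three ratios SP/PA, OQ/OP, BQ/SQ of the theorem of Apollonius."""
    P, Q, O = envelope_line(S, A, B, n, nu)
    d = math.dist
    return d(S, P) / d(P, A), d(O, Q) / d(O, P), d(B, Q) / d(S, Q)
```

All three ratios should equal ν/(n − ν), in agreement with the bracketed remark above; the point `O` gives the tangency point of each joining line.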
The Astroid
To find the envelope of a straight line, two marked points on which slide
along two fixed, mutually perpendicular axes.
Gottfried Wilhelm Leibniz (1646-1716), the inventor of infinitesimal
calculus, founded the theory of envelopes in 1692 in his paper De linea
ex lineis numero infinitis ordinatim ductis inter se concurrentibus easque omnes
tangente.
Solution. We seek the equation of the envelope in the coordinate
system in which the two given axes are the x-axis and y-axis and their
intersection 0 is the origin.
Let the constant distance between the designated points be
represented by l. Let AB and A'B' represent two positions of the marked-off
distance l, M and N the midpoints of AA' and BB', OM = a,
ON = b, AA' = 2α, BB' = 2β, thus OA = a + α, OA' = a − α,
OB = b − β, OB' = b + β. The conditions AB = l and A'B' = l
can then be written
(1) $(a + \alpha)^2 + (b - \beta)^2 = l^2$ and $(a - \alpha)^2 + (b + \beta)^2 = l^2$,
from which we obtain by subtraction
(2) $a\alpha = b\beta$.
The point of intersection S(x, y) of the two straight lines AB and
A'B' is expressed by the two equations
$\frac{x}{a + \alpha} + \frac{y}{b - \beta} = 1$ and $\frac{x}{a - \alpha} + \frac{y}{b + \beta} = 1$,
and the following two equations:
(3) $\frac{ax}{a^2 - \alpha^2} + \frac{by}{b^2 - \beta^2} = 1$
and
(4) $\frac{\alpha x}{a^2 - \alpha^2} = \frac{\beta y}{b^2 - \beta^2}$,
which are obtained from the first two by addition and subtraction.
If we then divide (4) by (2), we obtain
$\frac{x}{a(a^2 - \alpha^2)} = \frac{y}{b(b^2 - \beta^2)}$
and, with the use of (3),
(5) $x = a\,\frac{a^2 - \alpha^2}{a^2 + b^2}$, $y = b\,\frac{b^2 - \beta^2}{a^2 + b^2}$.
If we then allow A and A' and B and B' to approach each other
(naturally maintaining the conditions AB = l and A'B' = l), then α
and β become continuously smaller and the point of intersection S of
the lines AB and A'B' comes closer and closer to the envelope, finally
reaching it when α and β are equal to zero. The point x|y at which
the envelope is reached is then represented, according to (5), by the
equations
(5') $x = \frac{a^3}{a^2 + b^2}$, $y = \frac{b^3}{a^2 + b^2}$,
in which, in view of (1),
(1') $a^2 + b^2 = l^2$
is true.
From (5') it then follows that
$a^3 = l^2 x$, $b^3 = l^2 y$ or $a^2 = l^{4/3}x^{2/3}$, $b^2 = l^{4/3}y^{2/3}$,
from which
$l^2 = l^{4/3}x^{2/3} + l^{4/3}y^{2/3}$
is obtained by addition.
The equation of the envelope thus reads
$x^{2/3} + y^{2/3} = l^{2/3}$
or, in rational form,
$(l^2 - x^2 - y^2)^3 = 27\,l^2 x^2 y^2$.
(The second form is obtained from the first by cubing twice. The
first cubing results in
$x^2 + y^2 + 3x^{2/3}y^{2/3}(x^{2/3} + y^{2/3}) = l^2$
or
$3x^{2/3}y^{2/3}l^{2/3} = l^2 - x^2 - y^2$,
and on the second cubing we obtain the indicated form.)
Because of its shape the curve $x^{2/3} + y^{2/3} = l^{2/3}$ is called an astrois or
astroid in accordance with a proposal made by J. J. Littrow in 1838 or
a star line after M. Simon's proposal.
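The envelope point and both forms of the equation can be checked numerically. The sketch assumes a position of the sliding segment in the first quadrant, with intercepts a and b = √(l² − a²), and uses the limiting formulas x = a³/l², y = b³/l²; the function name `envelope_point` is hypothetical.

```python
import math

def envelope_point(a, l):
    """Point at which the segment with intercepts a and b = sqrt(l^2 - a^2)
    touches the envelope, according to x = a^3/l^2, y = b^3/l^2."""
    b = math.sqrt(l * l - a * a)
    return a ** 3 / l ** 2, b ** 3 / l ** 2, b
```

The returned point should lie on the segment itself (x/a + y/b = 1), satisfy x^(2/3) + y^(2/3) = l^(2/3), and satisfy the rational form (l² − x² − y²)³ = 27·l²x²y².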
The astroid is a hypocycloid* in which the radius of the fixed circle is four
times that of the rolling circle.
Proof. In Figure 49, let C be the center, / the radius, the arc JT
a section of the fixed circle %, 9¾ the rolling circle at the moment in
which it touches 2f at the point T, so that the center Z of the rolling
circle cuts the radius CT into the two segments ZT = r = \l and
CZ = 3r. Also, let M be the point on the circumference of 9¾ whose
path we are to follow, x its abscissa and y its ordinate. We then
select C as the origin of the coordinates and draw the (horizontal)
x-axis through point J, at which the marked point was at the beginning
of its motion. The arcs JT of 3 and TM of 9¾ are then of equal
length; the sector angle W = 2i TZM is therefore four times the
sector angle w = ^JCT. The slope of the radius ZM from the
horizontal is 4w — w = 3w, and the horizontal and vertical
projections of ZM are r cos 3w and r sin 3w, respectively. The
* If a circular disc rolls along the circumference of a fixed circle (without
sliding), a marked point on the circumference of the rolling disc (the "rolling
circle") describes an epicycloid when the disc rolls along the outside of the
fixed circle and a hypocycloid when the disc rolls along the inside.
Fig. 48.
Fig. 49.
corresponding projections of CZ are 3r cos w and 3r sin w. Thus we
obtain the equations (which can be read off the figure)
$$x = 3r\cos w + r\cos 3w,$$
$$y = 3r\sin w - r\sin 3w,$$
which, as a result of the relationships
$$\cos 3w = 4\cos^3 w - 3\cos w,$$
$$\sin 3w = 3\sin w - 4\sin^3 w,$$
can be transformed into
$$x = l\cos^3 w, \quad y = l\sin^3 w.$$
In the pair of equations obtained the coordinates of the hypocycloid
point x|y are represented as functions of the so-called rolling angle w.
To obtain the curve equation in Cartesian coordinates, we solve for
cos w and sin w, square, and add. Thus, we obtain
$$x^{2/3} + y^{2/3} = l^{2/3},$$
i.e., the equation of an astroid, which was to be demonstrated.
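The chain of identities in this proof can be spot-checked numerically. The following is only a sketch: the function names and the sample values (l = 2 and the grid of rolling angles) are choices made here for illustration and do not come from the text.

```python
import math

def hypocycloid_point(l, w):
    """Point traced when a circle of radius r = l/4 rolls inside a circle of radius l."""
    r = l / 4.0
    x = 3 * r * math.cos(w) + r * math.cos(3 * w)
    y = 3 * r * math.sin(w) - r * math.sin(3 * w)
    return x, y

def astroid_residuals(l, w):
    """Residuals of the three forms derived in the text; all should vanish."""
    x, y = hypocycloid_point(l, w)
    # parametric form x = l cos^3 w, y = l sin^3 w
    parametric = max(abs(x - l * math.cos(w) ** 3), abs(y - l * math.sin(w) ** 3))
    # x^(2/3) + y^(2/3) = l^(2/3); absolute values keep the real cube roots
    cartesian = abs(x) ** (2 / 3) + abs(y) ** (2 / 3) - l ** (2 / 3)
    # rational form (l^2 - x^2 - y^2)^3 = 27 l^2 x^2 y^2
    rational = (l * l - x * x - y * y) ** 3 - 27 * l * l * x * x * y * y
    return parametric, cartesian, rational

# sample rolling angles 0.1, 0.2, ..., 5.9 for a fixed circle of radius 2
max_error = max(abs(e) for j in range(1, 60) for e in astroid_residuals(2.0, 0.1 * j))
print(max_error)
```

The printed value should be at the level of floating-point rounding.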
53 Steiner's Three-pointed Hypocycloid
To determine the envelope of the Wallace line of a triangle.
Solution. Let ABC be the given triangle, M the center, and r
the radius of the circle U circumscribed about it.
A Wallace line of a triangle is the line connecting the three base
points of the perpendiculars dropped from any point P on the
circumference of the circle of circumscription to the sides of the triangle.
We will make M the origin of an X-Y coordinate system and
preliminarily select the X-axis arbitrarily. If we designate the
angles formed by the radii MA, MB, MC, MP with the positive side
of the X-axis as 2α, 2β, 2γ, 2φ, the coordinates of the three corners
A, B, C are
$$(r\cos 2\alpha \mid r\sin 2\alpha), \quad (r\cos 2\beta \mid r\sin 2\beta), \quad (r\cos 2\gamma \mid r\sin 2\gamma),$$
and the coordinates of the point P are $(r\cos 2\varphi \mid r\sin 2\varphi)$.
In order to find the coordinates $X_1|Y_1$ of the base point $F_1$ of the
perpendicular dropped from P to BC, we form the equations of the
line BC (in the two-point form) and the line $PF_1$ (in the slope form)
and find from these equations that
$$X_1 = f\,[\cos 2\beta + \cos 2\gamma + \cos 2\varphi - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$Y_1 = f\,[\sin 2\beta + \sin 2\gamma + \sin 2\varphi - \sin(2\beta + 2\gamma - 2\varphi)],$$
where f represents half of r.
Accordingly, the coordinates $X_2|Y_2$ of the base point $F_2$ of the
perpendicular dropped from P to CA will naturally be
$$X_2 = f\,[\cos 2\gamma + \cos 2\alpha + \cos 2\varphi - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$Y_2 = f\,[\sin 2\gamma + \sin 2\alpha + \sin 2\varphi - \sin(2\gamma + 2\alpha - 2\varphi)].$$
An appropriate parallel displacement of the coordinate system
allows us to put the coordinates into a simpler form. This
displacement of the coordinate system is based upon Sylvester's theorem (No. 27).
In accordance with this, the altitude intersection H of the triangle
ABC has the coordinates
$$r(\cos 2\alpha + \cos 2\beta + \cos 2\gamma) \quad\text{and}\quad r(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$
Since the center F of the Feuerbach circle lies halfway between M and
H (No. 28), the coordinates of F are
$$X_0 = f(\cos 2\alpha + \cos 2\beta + \cos 2\gamma),$$
$$Y_0 = f(\sin 2\alpha + \sin 2\beta + \sin 2\gamma).$$
It is therefore convenient to select the center of the Feuerbach circle
as the origin of the new coordinate system x, y. Between the
coordinates X|Y of a point in the old system and x|y in the new system
there exist the relations
$$X = X_0 + x, \quad Y = Y_0 + y.$$
From these relations we obtain for the coordinates $(x_1|y_1)$ and
$(x_2|y_2)$ of the points $F_1$ and $F_2$ in the new system the simpler values
$$x_1 = f\,[\cos 2\varphi - \cos 2\alpha - \cos(2\beta + 2\gamma - 2\varphi)],$$
$$y_1 = f\,[\sin 2\varphi - \sin 2\alpha - \sin(2\beta + 2\gamma - 2\varphi)]$$
and
$$x_2 = f\,[\cos 2\varphi - \cos 2\beta - \cos(2\gamma + 2\alpha - 2\varphi)],$$
$$y_2 = f\,[\sin 2\varphi - \sin 2\beta - \sin(2\gamma + 2\alpha - 2\varphi)].$$
Now the equation for the Wallace line $F_1F_2$ reads
$$(y - y_1)/(x - x_1) = (y_2 - y_1)/(x_2 - x_1).$$
For the differences $x_2 - x_1$ and $y_2 - y_1$ appearing here, we obtain, in
accordance with the coordinate values just given, the expressions
$$\begin{aligned}
x_2 - x_1 &= f(\cos 2\alpha - \cos 2\beta) + f\,[\cos(2\beta + 2\gamma - 2\varphi) - \cos(2\gamma + 2\alpha - 2\varphi)] \\
&= -2f\sin(\alpha + \beta)\sin(\alpha - \beta) + 2f\sin(\alpha + \beta + 2\gamma - 2\varphi)\sin(\alpha - \beta) \\
&= 4f\sin(\alpha - \beta)\sin(\gamma - \varphi)\cos(\alpha + \beta + \gamma - \varphi)
\end{aligned}$$
and similarly
$$y_2 - y_1 = 4f\sin(\alpha - \beta)\sin(\gamma - \varphi)\sin(\alpha + \beta + \gamma - \varphi).$$
The quotient $(y_2 - y_1)/(x_2 - x_1)$ thus has the value $\sin\Phi/\cos\Phi$ with
$\Phi = \alpha + \beta + \gamma - \varphi$, and the equation of the Wallace line assumes
the form
$$x\sin\Phi - y\cos\Phi = x_1\sin\Phi - y_1\cos\Phi.$$
Using the above values for the coordinates $x_1$ and $y_1$, we are able to
write the right side of this equation as
$$f(\sin\Phi\cos 2\varphi - \cos\Phi\sin 2\varphi) - f(\sin\Phi\cos 2\alpha - \cos\Phi\sin 2\alpha)$$
$$-\, f\,[\sin\Phi\cos(2\beta + 2\gamma - 2\varphi) - \cos\Phi\sin(2\beta + 2\gamma - 2\varphi)],$$
which expression becomes, according to the addition theorem of
circular functions,
$$f\sin(\alpha + \beta + \gamma - 3\varphi) - f\sin(\beta + \gamma - \alpha - \varphi) - f\sin(\alpha - \beta - \gamma + \varphi) = f\sin(\alpha + \beta + \gamma - 3\varphi).$$
Now the equation of the Wallace line reads
$$x\sin(\alpha + \beta + \gamma - \varphi) - y\cos(\alpha + \beta + \gamma - \varphi) = f\sin(\alpha + \beta + \gamma - 3\varphi).$$
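The line equation just obtained can be checked numerically by projecting P onto the three sides directly, without the closed-form base-point coordinates. This is a sketch under assumptions: the helper names and the sample angles are illustrative choices made here, not part of the text.

```python
import math

def foot_of_perpendicular(P, B, C):
    """Orthogonal projection of the point P onto the line through B and C."""
    (px, py), (bx, by), (cx, cy) = P, B, C
    dx, dy = cx - bx, cy - by
    t = ((px - bx) * dx + (py - by) * dy) / (dx * dx + dy * dy)
    return bx + t * dx, by + t * dy

def wallace_line_residuals(r, alpha, beta, gamma, phi):
    """Residuals of x sin(Phi) - y cos(Phi) = f sin(alpha+beta+gamma-3 phi) at the three base points."""
    f = r / 2.0

    def on_circle(half_angle):
        return r * math.cos(2 * half_angle), r * math.sin(2 * half_angle)

    A, B, C, P = (on_circle(t) for t in (alpha, beta, gamma, phi))
    # center F of the Feuerbach circle, the new origin
    X0 = f * sum(math.cos(2 * t) for t in (alpha, beta, gamma))
    Y0 = f * sum(math.sin(2 * t) for t in (alpha, beta, gamma))
    Phi = alpha + beta + gamma - phi
    rhs = f * math.sin(alpha + beta + gamma - 3 * phi)
    residuals = []
    for side in ((B, C), (C, A), (A, B)):
        X, Y = foot_of_perpendicular(P, *side)
        x, y = X - X0, Y - Y0          # shift to the new coordinate system
        residuals.append(x * math.sin(Phi) - y * math.cos(Phi) - rhs)
    return residuals

residuals = wallace_line_residuals(2.0, 0.3, 1.1, 2.0, 0.7)
print(max(abs(e) for e in residuals))
```

All three base points should satisfy the equation up to rounding, which also confirms that they are collinear.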
For the sake of a final simplification we now choose the position of
the hitherto arbitrary x-axis in such manner that the sum of the three
angles α, β, γ is equal to an integral multiple of 2π. It is easily seen
that with F as the point of origin there are only three rays, separated
from each other by angles of 2π/3, that satisfy this condition. We
choose one of these three rays as the x-axis. In the coordinate
system thus determined, the Wallace line has the simple equation
(1) $x\sin\varphi + y\cos\varphi = f\sin 3\varphi.$
To interpret this equation geometrically we draw a triangle FQR
with the side FQ = f, with the angles 2φ at F and φ at R, thus with
the external angle 3φ at Q, whose side FR lies on the positive x-axis.
The side QR of this triangle is then the Wallace line represented by
(1). In fact: If x = FU is the abscissa, y = UV the ordinate of any
point V of this line, then the perpendicular FW dropped from F to
the line is f sin 3φ as the projection of FQ; on the other hand, as the
projection of the traverse FU + UV, it is x sin φ + y cos φ, so that equation
(1) applies to the coordinates of V.
In particular, if V is the base point of the perpendicular TV dropped
to the Wallace line from the end point T of the extension QT = 2f of FQ,
V lies on the circle $\mathfrak{k}$ whose center Z is the midpoint of the hypotenuse QT of
the right triangle QTV, which has the radius f and which is tangent
to the Feuerbach circle at Q and to the circle $\mathfrak{K}$ of center F and radius
3f at T. Since ∠VZT, as an external angle of the isosceles triangle
VZQ, is equal to 6φ, the arc VT of the circle $\mathfrak{k}$ is equal to f·6φ. And
since the arc JT stretching from the point of intersection J of circle $\mathfrak{K}$
with the x-axis to T is equal to 3f·2φ, and is therefore also equal to
6fφ, it follows that
arc VT of $\mathfrak{k}$ = arc JT of $\mathfrak{K}$.
If we then think of circle $\mathfrak{k}$ as rolling along circle $\mathfrak{K}$ (along the inside)
so that a point marked off on $\mathfrak{k}$ initially lies at J, the marked point
arrives precisely at point V at the moment when the rolling circle $\mathfrak{k}$
assumes the drawn position.
Fig. 50.
The locus of point V is consequently, as the path of the marked
point, a hypocycloid (cf. No. 52), in which the radius of the fixed
circle is three times as large as the radius of the rolling circle. And
since at the moment depicted in the drawing the rolling circle is rotating
precisely about the instantaneous point of rotation T, at this moment
the marked point at V is moving in a direction QV that is precisely
perpendicular to TV, i.e., the Wallace line is the tangent drawn to
the hypocycloid at V! Thus the totality of Wallace lines represents
the totality of all the hypocycloid tangents.
Conclusion: Steiner's theorem: The envelope of the Wallace lines of
a triangle is a hypocycloid whose fixed circle possesses a radius that is three
times as great as the radius of the rolling circle. The center of the fixed circle
is the center of the Feuerbach circle of the triangle, and the radius of the rolling
circle is equal to the radius of the Feuerbach circle.
The three points of the hypocycloid—the three places at which the
marked point on the rolling circle touches the fixed circle—are the
end points of the three radii of the fixed circle, separated from each
other by 120°, of which one lies on the positive x-axis.
The three apexes of the hypocycloid—the three places at which the
marked point on the rolling circle touches the Feuerbach circle—divide
the arcs of the Feuerbach circle lying outside the triangle, from the
midpoints of the sides, into segments whose ratio to each other is as 1:2.
[This ratio follows easily from the position of the x-axis and from
the fact that the peripheral angle opposite the arc of a Feuerbach
circle cut off by a triangle side is equal to the difference between the
two triangle angles at the end points of the side.]
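Steiner's theorem can be spot-checked numerically. The sketch below assumes the standard parametrization of a hypocycloid with fixed radius 3f and rolling radius f, taking the rolling angle as t = 2φ; that parametrization and the function names are supplied here for illustration and are not written out in the text. It tests that the curve point lies on the Wallace line (1) and that the curve's direction of motion there is parallel to that line.

```python
import math

def three_pointed_hypocycloid(f, t):
    """Hypocycloid with fixed radius 3f and rolling radius f (assumed standard form)."""
    return (2 * f * math.cos(t) + f * math.cos(2 * t),
            2 * f * math.sin(t) - f * math.sin(2 * t))

def tangency_residuals(f, phi):
    """(point-on-line residual, velocity component along the line normal); both should vanish."""
    t = 2 * phi
    x, y = three_pointed_hypocycloid(f, t)
    on_line = x * math.sin(phi) + y * math.cos(phi) - f * math.sin(3 * phi)
    # derivative of the curve point with respect to t
    dx = -2 * f * math.sin(t) - 2 * f * math.sin(2 * t)
    dy = 2 * f * math.cos(t) - 2 * f * math.cos(2 * t)
    # (sin phi, cos phi) is normal to the line x sin(phi) + y cos(phi) = const
    along_normal = dx * math.sin(phi) + dy * math.cos(phi)
    return on_line, along_normal

worst = max(abs(e) for j in range(1, 40) for e in tangency_residuals(1.5, 0.08 * j))
print(worst)
```

At t = 0 the point is (3f, 0), in agreement with the statement that one of the three points of the hypocycloid lies on the positive x-axis.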
The Most Nearly Circular Ellipse Circumscribing a
Quadrilateral
Of all the ellipses circumscribing a given quadrilateral, which deviates least
from a circle?
This problem, which was posed in the seventeenth volume of
Gergonne's Annales de Mathematiques, was solved by J. Steiner (Crelle's
Journal, vol. II; also: Steiner, Gesammelte Werke, vol. I).
Solution (according to Steiner). To begin with, it is clear that
the quadrilateral must be convex inasmuch as no ellipse can be
circumscribed about a concave quadrilateral.
Let OPRQ be the given quadrilateral, let QR cut the extension of
OP at H and PR cut the extension of OQ at K, and let OP = p,
OQ = q, OH = h, OK = k. We will take OP as the x-axis, OQ as
the y-axis of an oblique-angle coordinate system. The equations
for the sides OP and OQ of the quadrilateral are then y = 0 and x = 0,
while the equations for the sides PR and QR are
$$\frac{x}{p} + \frac{y}{k} = 1 \quad\text{and}\quad \frac{x}{h} + \frac{y}{q} = 1$$
or, if we designate the expressions
$$kx + py - kp \quad\text{and}\quad qx + hy - hq$$
as u and v, u = 0 and v = 0.
The equation for every ellipse that can be circumscribed about the
quadrilateral has the form
(1) $\lambda xu + \mu yv = 0,$
where λ and μ are two arbitrary constants or so-called parameters.
[Since at O x = 0 and y = 0, at P y = 0 and u = 0, at Q x = 0 and
v = 0, and, finally, at R u = 0 and v = 0, the second degree curve $\mathfrak{C}$
represented by (1) passes through all four corners. Thus, if $\mathfrak{E}$ is an
ellipse of circumscription which, moreover, also passes through the
fifth point $x_0|y_0$, and if we choose λ and μ in such manner that
$$\lambda x_0(kx_0 + py_0 - kp) + \mu y_0(qx_0 + hy_0 - hq) = 0,$$
then $x_0|y_0$ also lies on $\mathfrak{C}$. Since, however, only one second degree
curve can pass through five points, $\mathfrak{C}$ is the ellipse $\mathfrak{E}$. Thus, every
ellipse of circumscription can be represented by (1).]
We introduce the values of u and v into (1) and obtain the equation
of an arbitrary ellipse of circumscription:
(1') $Ax^2 + 2Bxy + Cy^2 + 2Dx + 2Ey = 0,$
where
$$A = k\lambda, \quad 2B = p\lambda + q\mu, \quad C = h\mu, \quad D = -\tfrac{1}{2}kp\lambda, \quad E = -\tfrac{1}{2}hq\mu.$$
We begin by looking for the locus of the centers of all the parallel
chords of the ellipse (1')
(2) $y = Mx + n,$
in which M is the common directional constant of the chords, n the
segment cut off on the y-axis by one of these chords, chosen arbitrarily.
If we introduce y from (2) into (1'), we obtain the quadratic
equation
$$(A + 2BM + CM^2)x^2 + 2[(Cn + E)M + Bn + D]x + Cn^2 + 2En = 0$$
for the abscissas $x_1$ and $x_2$ of the points of intersection of the chord (2)
with the ellipse (1'). According to a well-known theorem from
quadratic equation theory, the sum of the two roots $x_1$ and $x_2$ of this
equation is
$$x_1 + x_2 = -2\,\frac{(Cn + E)M + Bn + D}{A + 2BM + CM^2},$$
i.e., the abscissa of the chord midpoint is
$$X = -\frac{(CM + B)n + EM + D}{CM^2 + 2BM + A}.$$
Since the chord midpoint X|Y satisfies the equation (2) of the chord,
Y = MX + n, so that we can substitute Y - MX for n in the
equation found for X. If we do this, we obtain for the coordinates X
and Y of the chord midpoint the equation
(3) $Y = M'X + n',$
with
(3a) $-M' = \dfrac{A + BM}{B + CM}, \quad -n' = \dfrac{D + EM}{B + CM}.$
Since (3) is the equation of a straight line, the following theorem
applies:
The midpoints of all the parallel chords of an ellipse possessing the directional
constant M lie on a straight line (a diameter of the ellipse) with the
directional constant M'. The two directional constants M and M',
as well as their corresponding directions and the diameters of the
ellipse possessing these directions, are said to be conjugate to each other.
We will now prove two auxiliary theorems.
Auxiliary theorem I: There is only one pair of conjugate directions
(diameters) that belong to all the ellipses circumscribing a quadrilateral.
Proof. We replace A, B, C in (3a) with their values and obtain
$$-M' = \frac{(2k + pM)\lambda + qM\mu}{p\lambda + (2hM + q)\mu}.$$
If M' (for a prescribed M) is to maintain the same value no matter
which ellipse of circumscription we are concerned with and,
consequently, no matter how great λ and μ are, then this value must be
obtained when λ = 1 and μ = 0 as well as when λ = 0 and μ = 1.
Consequently, it must be true that
$$\frac{2k + pM}{p} = \frac{qM}{2hM + q}.$$
And if we are able to find a suitable M for this equation, then for
every λ and every μ
$$-M' = \frac{(2k + pM)\lambda + qM\mu}{p\lambda + (2hM + q)\mu} = \frac{2k + pM}{p}$$
or
(4) $M' = -M - \dfrac{2k}{p},$
i.e., M' is independent of λ and μ. The equation giving the condition
for M is written
$$hpM^2 + 2hkM + kq = 0$$
and gives the two M-values
(5) $M_1 = -\dfrac{k}{p} + \dfrac{r}{hp}, \quad M_2 = -\dfrac{k}{p} - \dfrac{r}{hp}$
with $r^2 = h^2k^2 - hp\cdot kq = hk(hk - pq)$.
Since, according to the drawing, hk > pq, $r^2$ is positive, r is real,
and both M-values are real. Moreover,
(5a) $M_1 + M_2 = -2\,\dfrac{k}{p}.$
Now, according to (4), the directional constant $M_1'$ that is
conjugate to $M_1$ has the value $-M_1 - 2(k/p)$, i.e., the value $M_2$. In
like manner,
$$M_2' = M_1.$$
Thus, there is only one pair of specific directions, determined by
the directional constants $M_1$ and $M_2$, that will form a pair of
conjugate directions for each ellipse of circumscription.
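Auxiliary theorem I lends itself to a quick numerical check: for any choice of the parameters λ and μ, the direction conjugate to M_1 according to (3a) should be M_2, and vice versa. The sketch below uses illustrative values p = 2, q = 3, h = 5, k = 4 (chosen here so that hk > pq); the function names are likewise hypothetical.

```python
import math

def conjugate_direction(lam, mu, p, q, h, k, M):
    """Directional constant M' conjugate to M for the conic (1'), following (3a)."""
    A = k * lam
    B = (p * lam + q * mu) / 2.0
    C = h * mu
    return -(A + B * M) / (B + C * M)

def common_directions(p, q, h, k):
    """The two roots of h p M^2 + 2 h k M + k q = 0, as in (5)."""
    r = math.sqrt(h * k * (h * k - p * q))
    return -k / p + r / (h * p), -k / p - r / (h * p)

p, q, h, k = 2.0, 3.0, 5.0, 4.0
M1, M2 = common_directions(p, q, h, k)
for lam, mu in ((1.0, 1.0), (2.0, 0.5), (0.3, 4.0)):
    # each pair (lam, mu) selects a different circumscribed conic
    print(lam, mu,
          conjugate_direction(lam, mu, p, q, h, k, M1) - M2,
          conjugate_direction(lam, mu, p, q, h, k, M2) - M1)
```

The printed differences should all be zero up to rounding, whatever pair (λ, μ) is used.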
Auxiliary theorem II: The acute angle formed by two conjugate
diameters of an ellipse attains a minimum when the two conjugate diameters
are equal, and the tangent of the half angle-minimum is equal to the ratio b : a
of the two half axes.
Proof. If θ and φ are the two acute angles that the two conjugate
diameters of an ellipse with the half axes a and b form with the large
axis, then obviously
(6) $\tan\theta\cdot\tan\varphi = \dfrac{b^2}{a^2}.$
For the angle Ω = θ + φ of the two conjugate diameters we therefore
obtain
$$\tan\Omega = \tan(\theta + \varphi) = \frac{\tan\theta + \tan\varphi}{1 - \tan\theta\tan\varphi} = \frac{\tan\theta + \tan\varphi}{1 - \dfrac{b^2}{a^2}}.$$
But the left side of this equation, and therefore the angle Ω, attains a
minimum when the numerator of the right side assumes its smallest
value. This numerator is the sum of two numbers (tan θ and tan φ)
of constant product and, according to No. 10, attains a minimum when
the numbers are equal. From tan θ = tan φ it follows that θ = φ
and from this that the two diameters are equal. At the same time
from (6) we obtain the value b/a for the tangent of the half
angle-minimum.
These preliminaries concluded, the solution of the problem is
simple.
The circumscribed ellipse becomes more and more circular, the
closer the ratio b : a of the small to the large half axis comes to unity.
Now, according to auxiliary theorem II, this ratio has the value
tan (ω/2), where ω is the smallest angle formed by conjugate
diameters. The most nearly circular circumscribed ellipse is
therefore the ellipse in which ω attains its maximum possible value.
And this is the ellipse in which the directional constants of its equal
conjugate diameters are determined by (5). Thus, if $\omega_0$ is the angle
between the equal conjugate diameters of this ellipse, then for every
other ellipse of circumscription, $\omega_0$, as the angle between two unequal
conjugate diameters (with the directional constants $M_1$ and $M_2$), is
greater than the angle ω of this ellipse enclosed between equal
conjugate diameters, so that $\omega_{\max} = \omega_0$.
Consequently:
Of all the ellipses circumscribed about a quadrilateral the ellipse that
deviates least from a circle is the one whose equal conjugate diameters possess the
conjugate directions common to all the ellipses of circumscription.
The directional constants of these specific directions are determined
by the quadratic equation
$$hpM^2 + 2hkM + kq = 0.$$
55 The Curvature of Conic Sections
To determine the curvature of a conic section.
By the curvature of a curve at a point is meant the reciprocal value
of the radius of the circle of curvature, i.e., the radius of the circle
that fits the curve most closely at the relevant point.
Solution. Let the conic section be called $\mathfrak{K}$, its parameter 2p, its
form number e, its shortest focal radius k, so that p = k(1 + e), and
finally, let its apex equation be
$$qx^2 + y^2 - 2px = 0, \quad\text{with } q = 1 - e^2.$$
It is known that the coordinates of a point Π(ξ|η) at a distance R
from another point P(x|y) and lying at a direction from P that forms
the angle ϑ with the positive x-axis are
$$\xi = x + oR, \quad \eta = y + iR,$$
where o is the cosine and i the sine of ϑ.
If Π lies on $\mathfrak{K}$, then from
$$q\xi^2 + \eta^2 - 2p\xi = 0$$
we obtain the quadratic equation for R
$$DR^2 - ER + F = 0$$
with the coefficients
$$D = i^2 + qo^2, \quad E = 2(ou - iy), \quad F = qx^2 + y^2 - 2px,$$
where u = p - qx.
In respect to the conic section, we will call the three expressions
D, E, F the directional number for the "direction" ϑ, the emanant at point
x|y for the direction ϑ, and the power at point x|y.
If PΠ is a secant, the roots $R_1$ and $R_2$ of the quadratic equation are
the segments generated on the secant by the conic section. The
relations between the roots and the coefficients of a quadratic equation
give us the following theorems:
I. The emanant is the D-fold sum of the secant segments.
II. The power is the D-fold product of the secant segments.
We now draw through an arbitrary point P(x|y) of the conic
section the tangent $\mathfrak{T}$ and the normal and designate the segment of
the normal from P to the x-axis as n and the segment reaching from P
to the conic section as N. If ϑ is the angle of $\mathfrak{T}$ with the x-axis, o the
cosine, i the sine of ϑ, then the directional number for the tangent
direction is
$$D = i^2 + qo^2 = \frac{u^2}{n^2} + \frac{qy^2}{n^2} = \frac{p^2}{n^2}$$
(since u = p - qx represents the subnormal), while for the directional
number of the inward-pointing normal we obtain the value
$$\Delta = o^2 + qi^2.$$
The emanant at P for the direction of the normal becomes
$$E = 2(oy + iu) = 2n.$$
Therefore, according to I,
(1) $2n = \Delta N.$
On tangent $\mathfrak{T}$ we select a point O whose distance OP from P we
set equal to t; and we draw through O, perpendicular to $\mathfrak{T}$, the
secant $\mathfrak{S}$ through the conic section. Let the two segments of the secant
created by $\mathfrak{K}$ and measured from O be s and S, and let S > s. According
to II, we can write for the power of $\mathfrak{K}$ at O both $Dt^2$ and $\Delta Ss$, so that
(2) $Dt^2 = \Delta Ss.$
We now draw a circle $\mathfrak{k}$ to which for the time being we will attribute
the arbitrary radius ρ; the center of this circle lies on the internal
normal and the circle is tangent to the conic section at P. If $s_0$ and
$S_0 > s_0$ are the segments measured from O that the circle creates on
the secant $\mathfrak{S}$, then, according to the tangent theorem,
(3) $t^2 = S_0 s_0.$
By division of (2) and (3) we obtain
$$DS_0 s_0 = \Delta Ss$$
and, using (1), we obtain
$$DNS_0 s_0 = 2nSs.$$
Now the closer the fraction $s/s_0$ is to unity, the closer the
approximation of the circle to the conic section in the vicinity of point P. But
this fraction, according to the last equation, has the value
$$\frac{s}{s_0} = \frac{N}{S}\cdot\frac{S_0}{2\rho}\cdot\frac{D\rho}{n}.$$
In the immediate vicinity of the point P, S becomes equal to N and
$S_0 = 2\rho$, so that both the first and second factors on the right-hand
side are equal to 1. Consequently, the fraction $s/s_0$ comes closest to
unity when the third right-hand factor $D\rho/n$ is also equal to 1. Thus:
Of all circles $\mathfrak{k}$ the one that most closely approximates the conic section is the
one possessing the radius ρ = n/D.
Since D was previously determined as equal to $p^2/n^2$, we obtain the
fundamental theorem:
The radius of curvature of a conic section has the value
$$\rho = n^3/p^2.$$
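As a cross-check, ρ = n³/p² can be compared with the usual calculus expression (1 + y'²)^{3/2}/|y''| for the branch y = √(2px - qx²). The calculus formula is not the method of the text; it is brought in here only for comparison, with derivatives approximated by central differences. Function names, the step size and the sample values are illustrative assumptions.

```python
import math

def radius_by_formula(p, e, x):
    """Radius of curvature rho = n^3 / p^2 at abscissa x of q x^2 + y^2 - 2 p x = 0."""
    q = 1 - e * e
    y = math.sqrt(2 * p * x - q * x * x)
    u = p - q * x                      # subnormal
    n = math.hypot(y, u)               # normal segment from P to the x-axis
    return n ** 3 / p ** 2

def radius_by_differences(p, e, x, step=1e-4):
    """(1 + y'^2)^(3/2) / |y''| with y' and y'' from central differences."""
    q = 1 - e * e

    def y(t):
        return math.sqrt(2 * p * t - q * t * t)

    d1 = (y(x + step) - y(x - step)) / (2 * step)
    d2 = (y(x + step) - 2 * y(x) + y(x - step)) / step ** 2
    return (1 + d1 * d1) ** 1.5 / abs(d2)

for e in (0.5, 1.0, 1.5):              # ellipse, parabola, hyperbola
    print(e, radius_by_formula(1.0, e, 0.8), radius_by_differences(1.0, e, 0.8))
```

The two columns should agree to several digits; the remaining difference comes from the finite-difference approximation.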
To draw the circle of curvature we must consider that p/n is the
cosine of the angle ψ formed by the normal n with the focal radius r of
the point P,* and accordingly we write the obtained formula as
$$\rho = n/\cos^2\psi.$$
From inspection of this equation we obtain the following
Construction of the radius of curvature: At the point of
intersection H of the normal with the x-axis we erect a perpendicular
to the normal. At its point of intersection K with the (extended)
focal radius we then erect the perpendicular to the focal radius.
The point of intersection Z of this second perpendicular with the
normal is the center of curvature, its distance from P the desired
radius of curvature.
Fig. 53.
* From the triangle with sides n, r and the line w joining the end points of n
and r lying on the x-axis, we obtain $\cos\psi = (n^2 + r^2 - w^2)/2nr$. If we express
the numerator of this fraction entirely in terms of x, thus expressing $n^2$ by
$y^2 + u^2 = 2px - qx^2 + (p - qx)^2$, r by ex + k, and w by (x - k) + u =
$e^2x + ke$, and combine, the numerator then becomes equal to 2p(ex + k) = 2pr
and cos ψ becomes 2pr/2nr = p/n.
56 Archimedes' Squaring of a Parabola
To determine the area enclosed in a parabola section.
The squaring of a parabola is one of Archimedes' most remarkable
achievements. It was accomplished about 240 B.C. and is based
upon the properties of Archimedes triangles.
An Archimedes triangle is a triangle whose sides consist of two
tangents to a parabola and the chord connecting the points of tangency.
The last-mentioned side is taken as the base line or the base
of the triangle. In order to construct such a triangle we draw
the parallels to the parabola axis through the two points H and K of
the directrix and erect the perpendicular bisectors upon the lines
connecting H and K with the focus F. If we designate the point of
intersection of the two perpendicular bisectors as S, the point of
intersection of the first perpendicular bisector with the first parallel
to the axis as A, and the point of intersection of the second
perpendicular bisector with the second parallel to the axis as B, then
A and B are points of the parabola and SA and SB are tangents of the
parabola (classical construction of the parabola), and ASB is an
Archimedes triangle (cf. Figure 39).
Since SA and SB are two perpendicular bisectors of the triangle
FHK, the parallel to the axis through S is the third perpendicular
bisector; it consequently passes through the center of HK, and, as the
midline of the trapezoid AHKB, it also passes through the center M
of AB. This gives us the theorem: The median to the base of an
Archimedes triangle is parallel to the axis.
Let the parabola tangent through the point of intersection O of
the median SM to the base with the parabola cut SA at A', SB at B'.
Then AA'O and BB'O are also Archimedes triangles. Consequently,
according to the above theorem, the medians to their bases are also
parallel to the axis and are therefore also parallel to SO. These
medians are therefore midlines in the triangles SAO and SBO, so that
A' and B' are the centers of SA and SB. A'B' is consequently the
midline of the triangle SAB and is therefore parallel to AB; also the
point O on A'B' must be the center of SM.
The result of our investigations is the
Theorem of Archimedes: The median to the base of an Archimedes
triangle is parallel to the axis, the midline parallel to the base is a tangent, and
its point of intersection with the median to the base is a point of the parabola.
Now we can determine the area J of the parabola section enclosed
in our Archimedes triangle ASB with the base line AB.
The tangents A'B' and the chords OA and OB divide the
triangle ASB into four sections: 1. the "internal triangle" AOB enclosed
within the parabola; 2. the "external triangle" A'SB' lying
outside the parabola; 3. and 4. two "residual triangles" AOA' and
BOB', which are also Archimedes triangles and are penetrated by the
parabola.
Since O lies at the center of SM, the internal triangle is twice the size
of the external triangle.
In the same fashion, each of the two residual triangles in turn gives
rise to an internal triangle, an external triangle and two new residual
Archimedes triangles that are penetrated by the parabola, and once
again each internal triangle is twice the size of the corresponding
external triangle.
Thus, we can continue without end and cover the entire surface of
the initial Archimedes triangle ASB with internal and external
triangles. The sum of all the internal triangles must also be twice as
great as the sum of all the external triangles. In other words:
Theorem of Archimedes: The parabola divides the Archimedes triangle
into sections whose ratio is 2:1.
Or also:
The area enclosed by a parabola section is two thirds the area of the
corresponding Archimedes triangle.
Archimedes arrived at this conclusion by a somewhat different
method. He found the area of the section by adding together the
areas of all the successive internal triangles.
If Δ represents the area of the initial Archimedes triangle ASB,
then the area of the corresponding internal triangle is one half Δ, the
area of the corresponding external triangle is one quarter of Δ, and
the area of each of the two residual triangles is one eighth of Δ.
The successive Archimedes triangles therefore have the areas
$$\Delta, \quad \frac{\Delta}{8}, \quad \frac{\Delta}{8^2}, \quad \ldots;$$
the corresponding internal triangles possess half this area; and since
each internal triangle gives rise to two new internal triangles, we thus
obtain for the sum of all the successive internal triangle areas the
value
$$\frac{1}{2}\left[\Delta + 2\cdot\frac{\Delta}{8} + 4\cdot\frac{\Delta}{8^2} + 8\cdot\frac{\Delta}{8^3} + \cdots\right].$$
The bracket encloses a geometrical series with the quotient 1/4, the
sum of which is equal to Δ/(1 - 1/4) = (4/3)Δ. Thus, we again obtain
for the area of the section the value J = (2/3)Δ.
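The summation can be traced with exact rational arithmetic. The sketch below takes Δ = 1 and adds up the internal triangles generation by generation; the function name and the number of generations are illustrative choices.

```python
from fractions import Fraction

def internal_triangle_sum(generations):
    """Total area of the internal triangles of the first `generations` generations, for Delta = 1."""
    total = Fraction(0)
    for j in range(generations):
        # generation j holds 2**j Archimedes triangles of area 1/8**j each;
        # every one of them contributes an internal triangle of half its area
        total += Fraction(2 ** j, 2 * 8 ** j)
    return total

for m in (1, 2, 3, 10):
    print(m, internal_triangle_sum(m), float(internal_triangle_sum(m)))
```

The partial sums increase toward 2/3, the share of the Archimedes triangle that lies inside the parabola.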
Since A'B' is tangent to the parabola at O, the perpendicular h
dropped from O to the base line AB of the section is the altitude of the
section. Since h is also half the altitude of the triangle ASB, Δ = AB·h
and J = (2/3)·AB·h, i.e.:
The area enclosed by a parabola section is equal to two thirds the product of
the base and the altitude of the section.
Finally, we will express the area of the section in terms of the
transverse q of the section, i.e., by the projection normal to the axis
of the chord bounding the section.
Fig. 55.
We use the apex equation of the parabola, calling the
coordinates of the corners of the section x|y and X|Y, and we have
$$y^2 = 2px \quad\text{and}\quad Y^2 = 2pX$$
with 2p representing the parameter. From Figure 55 it follows
directly that
$$J = \tfrac{2}{3}XY - \tfrac{2}{3}xy - (X - x)\cdot\frac{Y + y}{2}.$$
If we replace X and x here with $Y^2/2p$ and $y^2/2p$, we obtain
$12pJ = Y^3 - y^3 - 3Y^2y + 3Yy^2 = (Y - y)^3$. Since Y - y is the
section transverse q, we finally obtain
$$12pJ = q^3.$$
This important formula can be expressed verbally as follows:
Six times the product of the parameter and the area of the section is equal to
the cube of the section transverse.
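The relation 12pJ = q³ can be checked by computing the area of the section directly. The sketch below integrates the horizontal width between chord and parabola over the ordinate with Simpson's rule, which is exact for this quadratic integrand up to rounding. The integration route, the function name and the sample values are assumptions made here; the text obtains J from Figure 55 instead.

```python
def section_area(p, y_low, y_high, intervals=200):
    """Area between the parabola y^2 = 2 p x and its chord, by Simpson's rule over the ordinate."""
    x_low, x_high = y_low ** 2 / (2 * p), y_high ** 2 / (2 * p)

    def width(t):
        # abscissa of the chord at height t minus abscissa of the parabola
        chord = x_low + (x_high - x_low) * (t - y_low) / (y_high - y_low)
        return chord - t * t / (2 * p)

    h = (y_high - y_low) / intervals   # intervals must be even
    total = width(y_low) + width(y_high)
    for j in range(1, intervals):
        total += (4 if j % 2 else 2) * width(y_low + j * h)
    return total * h / 3

p, y_low, y_high = 1.5, -1.0, 2.0
J = section_area(p, y_low, y_high)
print(12 * p * J, (y_high - y_low) ** 3)
```

Both printed numbers should coincide with the cube of the transverse q = Y - y.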
Squaring a Hyperbola
To determine the surface area enclosed by a section of a hyperbola.
We select the major axis of the hyperbola as the x-axis, the minor
axis as the y-axis; the hyperbola equation then reads
(1) $\dfrac{x^2}{a^2} - \dfrac{y^2}{b^2} = 1,$
where a and b are half the major and minor axes, respectively.
We must find the area A of the hyperbola section cut off at a
distance of x from the apex of the hyperbola by the hyperbola chord
2y that is normal to the x-axis (Figure 56). The coordinates for the
corners of the section H and K are thus x|y and x|-y.
First we determine the area T of a so-called hyperbola trapezoid,
i.e., the trapezoidal surface that is bounded by a hyperbola arc,
the parallels to one of the asymptotes through the end points of
the arc, and the segment cut off on the other asymptote by these
parallels.
Let the asymptote angle be 2α, its sine J, the sine and cosine of its
halves i and o, so that i = b/e and o = a/e (with $e = \sqrt{a^2 + b^2}$) and
$J = 2io = 2ab/e^2$ (Figure 56).
We choose the asymptotes as the u- and v-axis of a second
(oblique-angle) coordinate system. Between the coordinates x|y and u|v of a
hyperbola point in the two systems there then exist the transformation
equations
(2) $x = ou + ov, \quad y = iv - iu,$
as may be seen from Figure 57, so that for the left side of (1)
we obtain the value $4uv/e^2$ and we have the equation of the
hyperbola in the second system, the so-called asymptote equation of the
hyperbola
(3) $uv = P$ with $P = \tfrac{1}{4}e^2$,
in which P is the so-called power of the hyperbola.
Let the trapezoid T to be calculated be bounded by the hyperbola
arc with end point coordinates u|v and U|V (where we let U > u,
V < v), by the two ordinates v and V and by the base line U - u of
the trapezoid (Figure 58).
We divide the trapezoid into n equal sections t by means of parallels
to the v-axis, so that T = nt, and we designate the coordinates of the
points marking off the segments on the trapezoid arc as $u_1|v_1, u_2|v_2, \ldots$
Fig. 58.
The asymptote parallels through the end points $\mathfrak{u}|\mathfrak{v}$ and $\mathfrak{U}|\mathfrak{V}$ of
the hyperbola arc corresponding to an arbitrary trapezoidal section t
determine two parallelograms with a common base line $g = \mathfrak{U} - \mathfrak{u}$
lying on the u-axis, one of which is larger and the other smaller than t.
Since these parallelograms possess the areas $Jg\mathfrak{v}$ and $Jg\mathfrak{V}$, we obtain
the inequality
$$Jg\mathfrak{v} > t > Jg\mathfrak{V}.$$
We introduce the so-called quotient of the trapezoid t, $q = \mathfrak{U}/\mathfrak{u}$,
replace g on the left by $(q - 1)\mathfrak{u}$ and on the right by $[1 - (1/q)]\mathfrak{U}$,
and obtain
$$J(q - 1)\mathfrak{u}\mathfrak{v} > t > J\left(1 - \frac{1}{q}\right)\mathfrak{U}\mathfrak{V}$$
or, as a result of (3),
$$PJ(q - 1) > t > PJ\left(1 - \frac{1}{q}\right).$$
If we replace t here with T/n, divide by PJ and abbreviate T/PJ as c,
we obtain
$$q - 1 > \frac{c}{n} > 1 - \frac{1}{q}$$
or, solving for q,
$$1 + \frac{c}{n} < q < \frac{1}{1 - \dfrac{c}{n}}.$$
Using this inequality for all n trapezoidal sections, we obtain the
n inequalities
$$1 + \frac{c}{n} < \frac{u_1}{u} < \frac{1}{1 - \dfrac{c}{n}},$$
$$1 + \frac{c}{n} < \frac{u_2}{u_1} < \frac{1}{1 - \dfrac{c}{n}},$$
$$\cdots$$
$$1 + \frac{c}{n} < \frac{U}{u_{n-1}} < \frac{1}{1 - \dfrac{c}{n}}.$$
Multiplication of these gives
$$\left(1 + \frac{c}{n}\right)^n < Q < \frac{1}{\left(1 - \dfrac{c}{n}\right)^n}.$$
The middle term of this inequality is the so-called quotient Q = U/u of the
hyperbola trapezoid T. The left and right side tend (according to
No. 12) toward the value $e^c$ for infinitely increasing n, e representing the
Euler number (2.71828...). This gives us the equality
$$Q = e^c.$$
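The squeeze between the two bounds can be watched numerically. In this sketch the value c = 1.3 and the function name are arbitrary illustrative choices; the only requirement is n > c, so that 1 - c/n stays positive.

```python
import math

def quotient_bounds(c, n):
    """Lower and upper bounds (1 + c/n)^n and (1 - c/n)^(-n) for the quotient Q."""
    return (1 + c / n) ** n, (1 - c / n) ** (-n)

c = 1.3
for n in (10, 1000, 10 ** 6):
    lower, upper = quotient_bounds(c, n)
    print(n, lower, math.exp(c), upper)
```

As n grows the two bounds close in on e^c from below and above, which is the limit used in the text.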
With logarithms we obtain
(I) $T = PJ\,\ln Q,$
or verbally:
The area of the hyperbola trapezoid is proportional to the natural logarithm
of the trapezoid quotient.
The proportionality constant is the product of the hyperbola power
and the sine of the asymptote angle.
Since $4P = e^2$, $J = 2ab/e^2$, we also have
(I*) $T = \tfrac{1}{2}ab\,\ln Q.$
If we join the end points u|v and U|V of our hyperbola arc with the
hyperbola center O, we obtain a hyperbola sector to which we can
similarly assign the "quotient" Q. Since the two triangles that are
formed by the connecting lines mentioned and the coordinates of the
end points of the arc have the areas $\tfrac{1}{2}uvJ$ and $\tfrac{1}{2}UVJ$, which areas are
equal in view of (3), the sector has the same area S as the trapezoid:
(II) $S = PJ\,\ln Q = \tfrac{1}{2}ab\,\ln Q.$
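Now the determination of the area of the section A is simple. First,
in accordance with (2), the abscissas u and U of the section corners H
and K are found to be
$$u = \frac{e}{2}\left(\frac{x}{a} - \frac{y}{b}\right), \quad U = \frac{e}{2}\left(\frac{x}{a} + \frac{y}{b}\right).$$
From this it follows that the quotient of the sector OHK is
$$Q = \frac{U}{u} = \frac{\dfrac{x}{a} + \dfrac{y}{b}}{\dfrac{x}{a} - \dfrac{y}{b}} = \left(\frac{x}{a} + \frac{y}{b}\right)^2 \quad [\text{cf. (1)}]$$
and, consequently, the area of the sector, according to (II), is
$$S = ab\,\ln\left(\frac{x}{a} + \frac{y}{b}\right).$$
Finally, A is found to be the amount by which the triangle OHK
is greater than the sector OHK, or
(III) $A = xy - ab\,\ln\left(\dfrac{x}{a} + \dfrac{y}{b}\right).$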
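Formula (III) can be compared with a direct numerical quadrature of the section. The sketch below integrates the horizontal width between the chord at abscissa x and the right-hand hyperbola branch over the ordinate, using Simpson's rule; the integration route, the function names and the sample values are assumptions made here for illustration.

```python
import math

def section_area_formula(a, b, x):
    """Area of the hyperbola section according to (III)."""
    y = b * math.sqrt(x * x / (a * a) - 1)
    return x * y - a * b * math.log(x / a + y / b)

def section_area_numeric(a, b, x, intervals=2000):
    """Same area by Simpson's rule over the ordinate s from -y to y."""
    y = b * math.sqrt(x * x / (a * a) - 1)

    def width(s):
        # chord abscissa x minus abscissa of the hyperbola branch at height s
        return x - a * math.sqrt(1 + s * s / (b * b))

    h = 2 * y / intervals              # intervals must be even
    total = width(-y) + width(y)
    for j in range(1, intervals):
        total += (4 if j % 2 else 2) * width(-y + j * h)
    return total * h / 3

print(section_area_formula(2.0, 1.0, 3.0), section_area_numeric(2.0, 1.0, 3.0))
```

The two values should agree to many digits, since the integrand is smooth on the whole interval.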
58 Rectification of a Parabola
To determine the length of a parabola arc.
Solution. The following ingenious solution to this problem stems
from the famous book Lectiones Geometricae of the English mathematician
Isaac Barrow (1630-1677), which was published in 1670 in London.
We refer the parabola to a coordinate system in which the x-axis is
the axis of the parabola and the y-axis is tangent to the apex. The
parabola equation then reads $y^2 = 2px$. We need only determine the
length of an "apex arc," i.e., an arc of the parabola that takes its
origin from the apex S, since any arc can be represented as the sum or
difference of apex arcs. Let the end point P of the apex arc SP
possess the coordinates X and Y, and let the sought-for length of the
arc be L.
Since the subnormal of a parabola is equal to the half parameter p,
there exists between the ordinate y of a point of the parabola and the
normal n corresponding to this point the relation
$$n^2 - y^2 = p^2.$$
If we then assign to each parabola point x|y of our coordinate system a
point n|y in a new ny-coordinate system, we obtain in the new
system an equilateral hyperbola with the half axis p.
[cf. (1)]
We show that p times the length (pL) of the parabola arc SP is
numerically equal to the surface area F of the hyperbola trapezoid
that is bounded by the hyperbola, its axes, and the perpendicular N
that is dropped from the hyperbola point P' corresponding to the
point P onto the minor axis of the hyperbola. (N is at the same time
the abscissa of the hyperbola point P' and the parabola normal at the
parabola point P.)
Fig. 59.
Let us consider a portion $\sigma = AB$ of the parabola arc SP that is
short enough to be considered a rectilinear distance (a so-called arc
element) and let us draw through its end points the parallel AC to
the parabola axis and $BC = \eta$ to the apex tangent. At the same time
we draw the ordinate y and the normal n of the midpoint of AB, which
gives us a right triangle with the sides y, n, and p that is similar to the
triangle ABC. As a result of this similarity we obtain the proportion
$\eta : \sigma = p : n$, and this gives us the equation
(1) $p\sigma = n\eta$.
We then draw from the hyperbola points A' and B' corresponding to
the points A and B the perpendiculars to the minor axis of the
hyperbola, and we obtain a narrow hyperbola trapezoid that
corresponds to the arc A'B'. The area $\varphi$ of this trapezoid is the product of
its altitude $\eta$ and its midline n (the latter is n because it passes through
the center of the altitude and thus through the end point of the
hyperbola ordinate y):
(2) $\varphi = n\eta$.
From (1) and (2) we get
$$p\sigma = \varphi.$$
If we form this equation for each element of the parabola arc SP and
its corresponding minute hyperbola trapezoid, and if we add the
resulting equations, we obtain on the left p times the arc length L and
on the right the area F of the hyperbola trapezoid above described,
i.e., the equation
pL = F.
Now from the concluding formula of No. 57 it follows that
$$F = \frac{NY}{2} + \frac{p^2}{2} \ln \frac{N + Y}{p}.$$
The sought-for arc length is thus
$$L = \frac{NY}{2p} + \frac{p}{2} \ln \frac{N + Y}{p},$$
where Y represents the ordinate, N the normal of the end point of
the arc.
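The arc length formula L = NY/(2p) + (p/2) ln((N + Y)/p), with N^2 = Y^2 + p^2, can be checked against a direct quadrature of the arc element of the parabola y^2 = 2px. The Python sketch below is illustrative only; the function names and the midpoint rule are assumptions, not part of Barrow's argument.

```python
import math

def parabola_arc_closed_form(p, Y):
    """L = N*Y/(2p) + (p/2)*ln((N + Y)/p), where N is the normal at the end point."""
    N = math.hypot(p, Y)               # N^2 = Y^2 + p^2 (subnormal of the parabola is p)
    return N * Y / (2 * p) + (p / 2) * math.log((N + Y) / p)

def parabola_arc_numeric(p, Y, steps=100_000):
    """Midpoint-rule estimate of the apex arc from y = 0 to y = Y.

    With x = y^2/(2p) we have dx/dy = y/p, so ds = sqrt(1 + (y/p)^2) dy."""
    h = Y / steps
    return h * sum(math.hypot(1.0, (i + 0.5) * h / p) for i in range(steps))

if __name__ == "__main__":
    print(parabola_arc_closed_form(2.0, 3.0))
    print(parabola_arc_numeric(2.0, 3.0))
```

The two values should agree closely for any positive p and Y; the sample call uses p = 2, Y = 3.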
We now slightly transform the equation we have found.
Let T be the portion of the parabola tangent passing through P,
bounded by P and the y-axis, and let $\tau$ be the slope angle of the parabola
at point P, i.e., the angle formed by the tangent with the x-axis (and,
at the same time, by the normal N with the y-axis). Then
$$\frac{NY}{2p} = \frac{Y^2}{2p \cos\tau} = \frac{X}{\cos\tau} = T$$
and
$$\frac{N + Y}{p} = \frac{N + N\cos\tau}{N\sin\tau} = \frac{1 + \cos\tau}{\sin\tau} = \frac{2\cos^2\frac{\tau}{2}}{2\sin\frac{\tau}{2}\cos\frac{\tau}{2}} = \cot\frac{\tau}{2};$$
consequently
$$L = T + k \ln\cot\frac{\tau}{2},$$
where we have replaced $\frac{1}{2}p$ by the shortest focal radius k.
Conclusion: An apex arc of a parabola exceeds the length of the parabola
tangent reaching from the end of the arc to the apex tangent by a quantity that is
proportional to the natural logarithm of the cotangent of half the slope angle.
The proportionality constant is the shortest focal radius.
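The conclusion can be illustrated numerically: the tangent form L = T + k ln cot(τ/2), with k = p/2, should give the same value as the form L = NY/(2p) + (p/2) ln((N + Y)/p). The Python sketch below assumes the parabola y^2 = 2px and an end point with ordinate Y > 0; the function names are illustrative.

```python
import math

def arc_closed_form(p, Y):
    """L = N*Y/(2p) + (p/2)*ln((N + Y)/p) with N^2 = Y^2 + p^2."""
    N = math.hypot(p, Y)
    return N * Y / (2 * p) + (p / 2) * math.log((N + Y) / p)

def arc_tangent_form(p, Y):
    """L = T + k*ln(cot(tau/2)) with k = p/2 (requires Y > 0)."""
    X = Y * Y / (2 * p)                # abscissa of the end point P
    tau = math.atan2(p, Y)             # slope angle: tan(tau) = dy/dx = p/Y
    T = X / math.cos(tau)              # tangent segment between P and the y-axis
    k = p / 2                          # shortest focal radius
    return T + k * math.log(1.0 / math.tan(tau / 2))

if __name__ == "__main__":
    for p, Y in [(2.0, 3.0), (1.0, 0.5), (0.5, 4.0)]:
        print(p, Y, arc_closed_form(p, Y), arc_tangent_form(p, Y))
```

Each printed row should show two matching values, which reflects the identity cot(τ/2) = (N + Y)/p.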
59 Desargues' Homology Theorem (Theorem of
Homologous Triangles)
If the lines connecting the homologous vertexes of two triangles pass through a
point, the points of intersection of the homologous sides lie on a straight line.
And conversely:
If the points of intersection of the homologous sides of two triangles lie on a
straight line, the lines connecting the homologous vertexes pass through a point.
One frequently has occasion to correlate to each other the vertexes
and sides of two triangles (e.g., similar triangles), and in these cases
for the sake of convenience the mutually correlated, so-called
"homologous" vertexes and sides are usually designated by the same letter.
Thus, one may have, for example, the homologous vertexes A and A',
B and B', and finally C and C', as well as the homologous sides
BC = a and B'C' = a', CA = b and C'A' = b', and finally AB = c
and A'B' = c'.
Two such triangles, for which we will assume that no pair of
homologous vertexes or sides coincides, are called copolar [perspective
from a point] when the lines AA', BB', CC' connecting the homologous
vertexes pass through one point, the so-called homology pole. They are
called coaxial [perspective from a line] when the points of intersection
aa', bb', cc' of the homologous sides lie on a straight line, the so-called
homology axis.
Using these terms, the above theorem can be expressed in the
abbreviated form of:
Desargues' homology theorem: Copolar triangles are coaxial,
coaxial triangles are copolar.
Triangles that are both copolar and coaxial are called homologous
triangles.
The theorem of homologous triangles was discovered by the
French mathematician and engineer Girard Desargues (1591-1661)
in about 1636 and is therefore known as Desargues' theorem.
However, according to the Greek mathematician Pappus, this theorem was
already contained in the lost treatise on porisms of Euclid.
Desargues' theorem plays a very important role in projective
geometry. Consequently, we will prove it in a projective manner
though other, shorter proofs are possible.
For the reader unfamiliar with projective geometry it may be
appropriate to provide a short exposition of its most important
concepts and its simplest theorems, especially as they will be
encountered in the next few sections as well.
The totality of the points (considered as rigidly connected to each
other) in a line is called a range of points; the line is called the base of
the range. The totality of the lines (considered as rigidly connected
to each other) that pass through one point is called a ray pencil; the
point is called the center of the pencil. Similarly, the totality of the
points of a circle or, more generally, of a conic section is called a
circular or conic range of points or field of points; the totality of the
tangents of a conic section is called a field of tangents of a conic section.
Ranges of points, pencils, and tangent families are the basic structures
of plane projective geometry, and the points, rays, and tangents are
the elements of the corresponding structures.
Two basic figures are called projective (symbol: ⊼) when their
elements are unequivocally related to each other in such manner
that every four elements of the one figure and the four corresponding
or "homologous" elements of the other have the same cross ratio.
The relation existing between the figures is called projectivity.
[The cross ratio (ABCD) of four points A, B, C, D of a straight line
is the ratio
$$\frac{AC}{BC} : \frac{AD}{BD};$$
the cross ratio (abcd) of four rays a, b, c, d of a pencil is the ratio
$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd}.$$
The cross ratio of four points of a circle is the cross ratio of the four
rays that connect the four points with a fifth point of the circle, where
(according to the inscribed angle theorem) this fifth point can be
chosen at pleasure. The cross ratio of four points of a conic section
is similarly the cross ratio of the four rays that join the four points
with an arbitrarily chosen fifth point of the conic section (cf. No. 61).
Finally, the cross ratio of four conic section tangents is the cross ratio of
their points of tangency.]
A projectivity is completely determined if three elements of one structure and
the corresponding elements of the other are given.
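The statement that three pairs of homologous elements fix a projectivity can be made concrete for two ranges of points. In the Python sketch below, points are represented by coordinates on their bases (an assumed model, not the book's notation), and the image of a fourth point is found by requiring that the cross ratio (ABCD) = AC/BC : AD/BD be reproduced on the image side. The names `cross_ratio` and `make_projectivity` are illustrative.

```python
from fractions import Fraction

def cross_ratio(a, b, c, d):
    """(ABCD) = (AC/BC) : (AD/BD) for four coordinates on one line."""
    return ((c - a) / (c - b)) / ((d - a) / (d - b))

def make_projectivity(src, dst):
    """Return the map x -> x' fixed by the three pairs src[i] -> dst[i].

    The returned function is meant for generic points only: it is undefined
    at src[0], src[1] and at the point whose image lies at infinity."""
    x1, x2, x3 = src
    y1, y2, y3 = dst
    r_dst = (y3 - y1) / (y3 - y2)

    def image(x):
        lam = cross_ratio(x1, x2, x3, x)   # cross ratio that the image must reproduce
        m = r_dst / lam                    # required value of (y - y1)/(y - y2)
        return (y1 - m * y2) / (1 - m)

    return image

if __name__ == "__main__":
    F = Fraction
    f = make_projectivity((F(0), F(1), F(3)), (F(2), F(5), F(-1)))
    pts = [F(4), F(6), F(7), F(9)]
    print(cross_ratio(*pts), cross_ratio(*[f(x) for x in pts]))
```

Exact rational arithmetic is used so that the two printed cross ratios can be compared without rounding error.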
Two projective structures are called conjective when their bases
(or centers) coincide.
A particularly important case of projectivity is perspectivity. A
range of points and a ray pencil are called perspective (⩞) when each
element of the range lies on the corresponding element of the pencil.
Each ray is called the reflection of the homologous point, the whole
pencil is called the reflection of the range. Two nonconjective ranges
are called perspective (symbol: ⩞) when the lines connecting the
homologous points pass through one point, the center of perspectivity.
Two ray pencils are called perspective if every pair of corresponding
rays intersect on one straight line, the axis of perspectivity.
The projectivity of two perspective figures follows from
Pappus' theorem: The cross ratio of four rays of a pencil is equal to the
cross ratio of the four points at which an arbitrary line cuts the rays.
(Pappus of Alexandria, fourth century A.D., Collectiones mathematicae.)
Proof. Let A, B, C, D be the four points of intersection of a line
with the pencil of four rays OA = a, OB = b, OC = c, OD = d. We
designate the sine of the angle formed by two rays, for example, a and
c, with each other as sin ac. Since the perpendiculars from A and B
to c have the lengths a sin ac and b sin bc and are in the same ratio as
AC to BC, we obtain the proportion
$$a \sin ac : b \sin bc = AC : BC.$$
Similarly,
$$a \sin ad : b \sin bd = AD : BD.$$
By division of these two equations we obtain
$$\frac{\sin ac}{\sin bc} : \frac{\sin ad}{\sin bd} = \frac{AC}{BC} : \frac{AD}{BD},$$
i.e., (abcd) = (ABCD).
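Pappus' theorem lends itself to a quick numerical illustration: four rays through the origin are cut by an arbitrary line, and the cross ratio formed from the sines of the angles between the rays is compared with the cross ratio of the four points of intersection. The Python sketch below is an assumed coordinate setup (rays given by their angles, the line by a normal vector and an offset); none of these names appear in the text.

```python
import math

def cross_ratio(a, b, c, d):
    """(ABCD) = (AC/BC) : (AD/BD) for signed positions on one line."""
    return ((c - a) / (c - b)) / ((d - a) / (d - b))

def pencil_cross_ratio(ta, tb, tc, td):
    """(abcd) = (sin ac / sin bc) : (sin ad / sin bd), ray angles in radians."""
    return (math.sin(tc - ta) / math.sin(tc - tb)) / (math.sin(td - ta) / math.sin(td - tb))

def section_positions(angles, normal, offset):
    """Cut rays through the origin O by the line normal . P = offset.

    Returns one signed coordinate along the line for each point of intersection."""
    nx, ny = normal
    positions = []
    for t in angles:
        dx, dy = math.cos(t), math.sin(t)
        scale = offset / (nx * dx + ny * dy)     # signed distance from O along the ray
        px, py = scale * dx, scale * dy
        positions.append(-ny * px + nx * py)     # component along the direction (-ny, nx)
    return positions

if __name__ == "__main__":
    angles = (0.2, 0.7, 1.1, 1.9)
    points = section_positions(angles, normal=(1.0, 2.0), offset=3.0)
    print(pencil_cross_ratio(*angles), cross_ratio(*points))
```

Changing the cutting line (the `normal` and `offset` arguments) should leave the second printed value unchanged, which is the content of the theorem.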
Two projective ranges or pencils can always be brought into a perspective
position.
Two projective ranges (pencils) become perspective when they are
placed in such a way that an element of one range (pencil) falls on
the homologous element of the other range (pencil), though the bases
(centers) do not coincide. We have the following two important
theorems:
I. If in the projectivity between two ranges the point of intersection of the
two bases corresponds to itself, the ranges are perspective.
II. If in the projectivity between two pencils the line connecting the two
centers corresponds to itself, the pencils are perspective.
Proof of I. Let the two bases intersect at the self-corresponding
point O = O'. On the first base we choose two fixed elements A, B and
an arbitrary point P, and we designate the homologous elements on the
second base as A', B', and P'. We find the point of intersection S of
the lines AA' and BB' and assign to the lines connecting the designated
elements with S the same letters, but in lower case. Then, according
to Pappus,
(oabp) = (OABP) and (o'a'b'p') = (O'A'B'P').
But since the right sides of these equations are equal, according
to our assumption, it follows that
(o'a'b'p') = (oabp).
But if two equal cross ratios agree in the first three elements
(o' = o, a' = a, b' = b), then they also agree in the fourth.
Consequently, p' falls on p, and thus PP' passes through S, and the ranges
are perspective.
Fig. 60.
Proof of II. Let the centers of the two projective pencils ℨ and
ℨ' be Z and Z', their self-corresponding connecting line o = o'.
We select on ℨ two fixed elements a and b and an arbitrary element p
and designate the homologous elements of ℨ' as a', b', and p'. We
find the connecting line g of the points aa' and bb' and assign to the
points of intersection of the designated elements with g the same
letters, but capitals. Then, according to Pappus,
(oabp) = (OABP) and (o'a'b'p') = (O'A'B'P').
But since the left sides of these equations are equal, in accordance with
our initial assumption,
(O'A'B'P') = (OABP).
But if two equal cross ratios agree in the first three elements
(O' = O, A' = A, B' = B), they also agree in the fourth. P'
therefore falls on P, p and p' thus intersect on g, and the pencils ℨ
and ℨ' are perspective.
The proof of Desargues' theorem is now easily obtained (Figure 62).
We call the vertexes of one triangle A, B, C, the sides opposite them
a, b, c, the homologous vertexes of the other triangle A', B', C', the
sides opposite them a', b', c'.
Let the points of intersection of the homologous sides a and a', b and
b', c and c' be X, Y, and Z, respectively, and let the points of intersection
of the line CC' with the two lines AB and A'B' be H and H'.
The proof divides into two parts.
I. We assume that the connecting lines AA', BB', CC' pass through
one point O. We project the range of points AB from O onto A'B'
and obtain two perspective ranges in which the elements A, B, H, Z
of the first are homologous to the elements A', B', H', Z' = Z of the
second. We then connect the points of these ranges with C and C',
thereby obtaining two projective ray pencils in which the elements
CA, CB, CH = CC', CZ correspond to the elements C'A', C'B', C'H' =
C'C, C'Z'. Since the line CC' connecting the pencil centers
corresponds to itself in this projectivity, the pencils are
perspective and the points of intersection of the homologous rays lie on
a straight line. Thus, for example, the points of intersection Y
(of CA and C'A'), X (of CB and C'B'), and Z (of CZ and C'Z') lie on a
straight line.
II. We assume that the points aa' (X), bb' (Y), cc' (Z) lie on a
straight line g. We connect the points of the line g with C and C',
thereby obtaining two perspective ray pencils in which the elements
a, b, CC', CZ of the first pencil correspond to the elements a', b', C'C,
C'Z' = C'Z of the second. We cut these pencils with the lines c and c'
and obtain two projective ranges in which the elements B, A, H, Z of
the first range correspond to the elements B', A', H', Z' = Z of the
second. Since the point of intersection Z = Z' of the range bases
corresponds to itself in this projectivity, the ranges are perspective
and the connecting lines BB', AA', and HH' = CC' of the homologous
elements thus pass through one point, which was to be proved.
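Desargues' theorem can be illustrated with exact integer arithmetic in homogeneous coordinates, where the cross product of two points gives their connecting line and the cross product of two lines gives their point of intersection. The Python sketch below builds two copolar triangles from an assumed pole O and assumed sample vertexes, then checks that X, Y, Z are collinear; all names and numbers are illustrative.

```python
def cross(u, v):
    """Line through two points, or point on two lines (homogeneous coordinates)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Zero exactly when three points are collinear (or three lines concurrent)."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

def along(o, p, k):
    """The point o + k*(p - o), which lies on the line through o and p."""
    return (o[0] + k * (p[0] - o[0]), o[1] + k * (p[1] - o[1]), 1)

# Assumed sample data: pole O and triangle ABC; A', B', C' lie on OA, OB, OC.
O = (1, 2, 1)
A, B, C = (4, 3, 1), (2, 6, 1), (5, 7, 1)
A1, B1, C1 = along(O, A, 2), along(O, B, 3), along(O, C, -1)

# Points of intersection of the homologous sides a, a'; b, b'; c, c'.
X = cross(cross(B, C), cross(B1, C1))
Y = cross(cross(C, A), cross(C1, A1))
Z = cross(cross(A, B), cross(A1, B1))

if __name__ == "__main__":
    print("copolar:", det3(cross(A, A1), cross(B, B1), cross(C, C1)) == 0)
    print("coaxial:", det3(X, Y, Z) == 0)
```

Because only integers are involved, both determinants vanish exactly rather than approximately.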
60 Steiner's Double Element Construction
To draw the double elements of a conjective projectivity that is given by
three pairs of homologous elements.
A double element of a conjective projectivity is an element that
coincides with its homolog.
The following simple solution to this fundamental problem of
projective geometry was discovered by the German mathematician
Jakob Steiner (Die geometrischen Konstruktionen, etc. [cf. No. 34],
Berlin, 1833).
Steiner's double element construction enriched the geometry of
antiquity by providing it with a new and fruitful method for solving
problems of geometric construction. This so-called method of false
position (regula falsi) is based on the theorem:
If in the projectivity between two ray pencils the line connecting the pencil
centers corresponds to itself, the pencils are perspective (No. 59).
We can distinguish three cases:
I. Double elements of a projectivity on a circle. Let the projectivity
between the two ranges of points ℜ and ℜ' of the circle 𝔎 be given
by the two corresponding point triplets (A, B, C) and (A', B', C'). We
consider the ray pencils 𝔖 and 𝔖', whose rays run from the points of
ranges ℜ and ℜ', respectively, through the centers A' and A,
respectively. Since ℜ ⩞ 𝔖 and ℜ' ⩞ 𝔖', and, according to our assumption,
ℜ ⊼ ℜ', it is also true that 𝔖 ⊼ 𝔖'. But since in the line AA'
connecting the centers of the two pencils 𝔖 and 𝔖' corresponding
pencil elements coincide, the latter projectivity is a perspectivity.
The axis of perspectivity is the line g connecting the point of
intersection of the rays A'B and AB' with the point of intersection of the
rays A'C and AC'. Two corresponding rays of 𝔖 and 𝔖' thus always
intersect on g. Thus, in order to obtain a point P' of ℜ' corresponding
to the arbitrary point P of ℜ, we need only connect the point of
intersection of A'P and g with A. The connecting line meets
𝔎 again at P'. If we carry out this construction for the points of
intersection H and K of the perspectivity axis with the circle, H' falls on
H, K' on K. The double points of the projectivity on a circle are
therefore the points of intersection of the circle with the above
perspectivity axis.
Fig. 63.
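The construction on the circle can be traced in coordinates. The Python sketch below assumes the standard rational parametrization of the unit circle, under which a projectivity of the circle acts on the parameter t as a fractional linear map; the particular map t -> (2t + 3)/(t + 4) and the parameters chosen for A, B, C are sample data, not taken from the text. The perspectivity axis g is built from the given triplets, and the sketch checks that the two double points lie on g and that the construction of P' from P agrees with the map.

```python
from fractions import Fraction

def cross(u, v):
    """Line through two points, or point on two lines (homogeneous coordinates)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def dot(u, v):
    """Zero exactly when the point u lies on the line v."""
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def circle_point(t):
    """Rational point of the unit circle x^2 + y^2 = 1 in homogeneous form."""
    return (1 - t * t, 2 * t, 1 + t * t)

def projectivity(t):
    """Assumed sample projectivity of the circle, acting on the parameter t."""
    return (2 * t + 3) / (t + 4)

tA, tB, tC = Fraction(0), Fraction(2), Fraction(-1)
A, B, C = (circle_point(t) for t in (tA, tB, tC))
A1, B1, C1 = (circle_point(projectivity(t)) for t in (tA, tB, tC))

# Axis of perspectivity g: through (A'B . AB') and (A'C . AC').
g = cross(cross(cross(A1, B), cross(A, B1)),
          cross(cross(A1, C), cross(A, C1)))

# Double points: t = (2t + 3)/(t + 4) gives t^2 + 2t - 3 = 0, i.e. t = 1 and t = -3.
H, K = circle_point(Fraction(1)), circle_point(Fraction(-3))

def homolog_line(P):
    """Line joining A with the point of intersection of A'P and g."""
    return cross(cross(cross(A1, P), g), A)

if __name__ == "__main__":
    P = circle_point(Fraction(5))
    P1 = circle_point(projectivity(Fraction(5)))
    print(dot(H, g) == 0, dot(K, g) == 0, dot(P1, homolog_line(P)) == 0)
```

The double points are found here algebraically, as the fixed parameters of the sample map, only in order to confirm that they fall on the axis g; the construction itself needs nothing but the straightedge once the circle is given.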
II. Double elements of two ray pencils. We draw a circle 𝔎 through
the common center of the two projective pencils and, in accordance
with I., we draw the double points of the two ranges at which the rays
of the two pencils cut 𝔎. The pencil rays passing to these double
points are the double rays we are looking for.
III. Double elements of two ranges of points. We draw, in accordance
with II., the double rays of the two pencils that are obtained from the
lines connecting the points of the two conjective projective ranges
with an arbitrary center Z outside the base of the range. The points
of intersection of the two double rays with the base of the range are
the double points we are looking for.
61 Pascal's Hexagon Theorem
To demonstrate that the three points of intersection of the opposite sides of a
hexagon inscribed in a conic section lie on a straight line.
A hexagon inscribed in a conic section essentially consists of six
points anywhere on the conic section 1, 2, 3, 4, 5, 6, the "vertexes"
of the hexagon, and the six connecting lines 12, 23, 34, 45, 56, 61, the
"sides" of the hexagon. The sides 12 and 45, the sides 23 and 56,
and finally 34 and 61 are called the "opposite sides." The straight
line on which the three points of intersection of the opposite sides lie
is called the Pascal line, and the hexagon is called the Pascal hexagon.
In a somewhat more abbreviated form the theorem to be proved can
be stated as:
The three points of intersection of the opposite sides of a Pascal hexagon lie on a straight line.
This fundamental theorem in conic section theory was published in
1640 by Blaise Pascal (1623-1662) at the age of 16 in his six-page
Essai sur les Coniques.
There are a number of proofs of the Pascal theorem. The following
projective proof is based upon the two theorems of Steiner:
I. The points of a conic section are projected from pairs of themselves by
projective pencils.
II. If in the projectivity between two ranges of points the point of
intersection of their bases corresponds to itself, the ranges are perspective.
Proof of I. The theorem applies most directly to the circle.
(In circles the designated pencils are even congruent.) Now, since a
conic section is the central projection of a circle, and since in this
projection the pencils we are concerned with appear as projections
of projective ray pencils in a circle, we need only show that the
central projection of a pencil on a plane is projective with respect
to the pencil. Now this is the case according to Pappus' theorem.
Specifically, if a, b, c, d are four rays lying in plane E, a', b', c', d' their
central projections on plane E', and A, B, C, D the points of
intersection of the ray pairs (a, a'), (b, b'), (c, c'), and (d, d') lying on the
line of intersection of the two planes, then, according to Pappus,
(a'b'c'd') = (ABCD) and (abcd) = (ABCD),
thus, also
(a'b'c'd') = (abcd),
i.e., the pencil and the pencil projection are projective.
The proof of II. is in No. 59.
Now to prove the Pascal theorem!
Let the vertexes of the hexagon be 1, 2, 3, 4, 5, 6. According to I.,
the rays from the centers 1 and 3 to the conic section points 2, 4, 5, 6
form projective pencils; thus the points of intersection 2', 4', 5', 6'
and 2", 4", 5", 6" of these rays with the straight lines 54 and 56 form
projective ranges. Since at the point of intersection 5 of their bases the
corresponding range elements are coincident (5' = 5"), the ranges are
perspective according to II., and consequently the lines 2'2", 4'4", and
6'6" pass through one point, the point of intersection Z of the lines 4'4"
and 6'6", i.e., the lines 34 and 61. In other words: The points of
intersection of the opposite sides 2' (intersection of 12 and 45), 2"
(intersection of 23 and 56), and Z (intersection of 34 and 61) lie on
one straight line, the Pascal line p = 2'Z2". Q.E.D.
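Pascal's theorem can be illustrated with exact integer arithmetic. In the Python sketch below the conic is assumed to be the parabola y = x^2, whose points (t, t^2) are easy to write down, and the six parameter values are sample data; the three points of intersection of the opposite sides 12 and 45, 23 and 56, 34 and 61 are then tested for collinearity. The helper names are illustrative.

```python
def cross(u, v):
    """Line through two points, or point on two lines (homogeneous coordinates)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Zero exactly when three points are collinear."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

def conic_point(t):
    """Point (t, t^2) of the parabola y = x^2, used here as the sample conic."""
    return (t, t * t, 1)

# Vertexes 1..6 of the inscribed hexagon (assumed sample parameters).
P1, P2, P3, P4, P5, P6 = (conic_point(t) for t in (0, 1, 3, -2, 5, -3))

X = cross(cross(P1, P2), cross(P4, P5))   # sides 12 and 45
Y = cross(cross(P2, P3), cross(P5, P6))   # sides 23 and 56
Z = cross(cross(P3, P4), cross(P6, P1))   # sides 34 and 61

if __name__ == "__main__":
    print("on the Pascal line:", det3(X, Y, Z) == 0)
```

Replacing one of the six points by a point off the conic should in general make the determinant nonzero, in line with the converse of the theorem.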
The converse of Pascal's theorem: If the opposite sides of a hexagon
(of which no three vertexes lie on a straight line) intersect on a straight
line, the six vertexes lie on a conic section.
Indirect proof. Let the conic section that is unequivocally
determined by the five vertexes 1, 2, 3, 4, 5 meet the fifth side of the
hexagon, 56, again at 6*. According to Pascal's theorem, we obtain 6* by
drawing the Pascal line (as the line connecting the points of
intersection of the opposite sides 12 and 45, as well as 23 and 56 = 56*),
causing it to intersect with 34 at Z and determining the point of
intersection (6*) of 1Z with 56* = 56. But according to our
assumption, this is 6, so that 6* = 6.
If two vertexes of a Pascal hexagon coincide once or twice or three
times, there follow the corollaries of the Pascal theorem, the most
important of which we will now give.
I. The vertexes 5 and 6 coincide: this is to be considered as meaning
that point 6 approaches point 5 ever more closely until it finally
coincides with it. This transforms the chord 56 into the tangent at
point 5 and the hexagon is transformed into the pentagon 1 2 3 4 5.
Pascal's theorem then assumes the form:
Corollary 1 (Figure 65): In every pentagon inscribed in a conic section
the points of intersection of two pairs of nonadjacent sides and the point of
intersection of the fifth side with the tangent passing through the opposite vertex
lie on a straight line.
Fig. 65.
II. The vertexes 5 and 6 coincide and the vertexes 2 and 3 coincide; the
hexagon thus becomes a tetragon 1 2 4 5. Now the opposite sides of
the tetragon 12 and 45, and likewise 24 and 51, and the tangents at
the opposite vertexes 2 and 5 intersect each other on a straight line.
Since we could just as easily choose the two other opposite vertexes,
the point of intersection of the tangents at these vertexes also lies on
the Pascal line. We therefore obtain the following
Corollary 2 (Figure 66): In every tetragon inscribed in a conic section
all the pairs of opposite sides and tangents to the pairs of opposite vertexes
intersect on a straight line.
Fig. 67.
III. The vertexes 1 and 2 coincide, so do vertexes 3 and 4,
and so do vertexes 5 and 6; the hexagon becomes a triangle, and we
obtain
Corollary 3 (Figure 67): In every triangle inscribed in a conic section
the sides intersect with the tangents to the opposite vertexes on a straight line.
62 Brianchon's Hexagram Theorem
To demonstrate that the three opposite vertex lines of a hexagram circumscribed
about a conic section pass through a point.
A hexagram circumscribed about a conic section consists essentially
of six tangents I, II, III, IV, V, VI to the conic section, which are the
sides of the hexagram, and the six points of intersection I II, II III,
III IV, IV V, V VI, VI I forming the vertexes of the hexagram.
The vertexes I II and IV V, the vertexes II III and V VI, and the
vertexes III IV and VI I are called opposite vertexes, and the lines
connecting them are called opposite vertex lines.
The point through which the three opposite vertex lines pass is
called the Brianchon point and the hexagram the Brianchon hexagram.
The theorem to be proved can be stated in a somewhat shorter form
as follows.
The three opposite vertex lines of a Brianchon hexagram pass through a
point.
This theorem, which is as important in the theory of conic sections
as the Pascal theorem, was published in 1810 by the French
mathematician Brianchon (1785-1864) in the Journal de l'École Polytechnique.
The following projective proof of Brianchon's theorem is based on
the two theorems of Steiner:
I. The tangents of a conic section cut two of the tangents into projective
ranges of points.
II. If in the projectivity between two ray pencils the line joining the pencil
centers corresponds to itself, the pencils are perspective.
Proof of I. We first prove I. for a circle. For this purpose let us
consider the following structures: 1. the range of points ℜ through
which a moving point P on the circle passes; 2. the pencil 𝔅 of the
rays FP that run from the fixed circle point F to the moving point P;
3. the field 𝔖 of tangents t drawn to the different positions of P;
4. the range r of the points of intersection S of these tangents with the
fixed circle tangent f through F; 5. finally, the pencil b of the rays
MS that run from the center point M of the circle to S. Then ℜ, 𝔅,
and 𝔖 are projective by definition, 𝔅 and b are projective because
they are congruent (every ray from 𝔅 is perpendicular to the
corresponding ray from b), and finally r and b are projective because they
are perspective. Consequently, 𝔖 and r are projective. I.e.:
A field of tangents to a circle is projective with respect to the range of points
that the tangents of the field generate on an arbitrary fixed tangent. From this
it follows directly that:
The tangents of a circle cut two of them into projective ranges of points.
We will now prove theorem I. for a conic section. The conic
section is the central projection of a circle in which its tangents are
perspectives of circle tangents. In this projection the ranges of points
mentioned appear as perspectives of the two ranges that the circle
tangents generate on the two fixed circle tangents, which correspond
to the chosen conic section tangents in the central projection. Now,
since the latter ranges are projective, the former must also be.
Proof of II. is given in No. 59.
Now for the proof of Brianchon's theorem!
Let the sides of the hexagram be I, II, III, IV, V, VI. According
to auxiliary theorem I., the points of intersection generated on
tangents I and III by II, IV, V, VI form projective ranges of points,
and consequently the junction lines II', IV', V', VI' and II", IV", V",
VI" of these points with the points (centers) IV V and V VI form
projective pencils. Since in the line V connecting the centers,
corresponding rays (V' = V") coincide, the pencils are perspective
according to auxiliary theorem II., and the rays II' and II", IV' and
IV", and VI' and VI" intersect on one straight line, the axis of
perspectivity, the junction line a of the points IV' IV" and VI' VI",
i.e., of the points III IV and VI I. In other words: The opposite
vertex lines II' (from I II to IV V), II" (from II III to V VI), and
a (from III IV to VI I) pass through one point, the Brianchon point.
Q.E.D.
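Brianchon's theorem can be illustrated in the same exact manner as its dual. In the Python sketch below the conic is assumed to be the parabola y = x^2, whose tangent at the point (t, t^2) is y = 2tx - t^2; the six parameter values are sample data. The vertexes of the hexagram are the points of intersection of consecutive tangents, and the three opposite vertex lines are tested for concurrence. The helper names are illustrative.

```python
def cross(u, v):
    """Line through two points, or point on two lines (homogeneous coordinates)."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def det3(u, v, w):
    """Zero exactly when three lines pass through one point."""
    c = cross(v, w)
    return u[0] * c[0] + u[1] * c[1] + u[2] * c[2]

def tangent(t):
    """Tangent y = 2*t*x - t^2 of the parabola y = x^2, as line coordinates."""
    return (2 * t, -1, -t * t)

# Sides I..VI of the circumscribed hexagram (assumed sample parameters).
I, II, III, IV, V, VI = (tangent(t) for t in (0, 1, 3, -2, 5, -3))

# Opposite vertex lines: (I II)-(IV V), (II III)-(V VI), (III IV)-(VI I).
line1 = cross(cross(I, II), cross(IV, V))
line2 = cross(cross(II, III), cross(V, VI))
line3 = cross(cross(III, IV), cross(VI, I))

if __name__ == "__main__":
    print("through the Brianchon point:", det3(line1, line2, line3) == 0)
    print("Brianchon point (homogeneous):", cross(line1, line2))
```

The code is the Pascal check with the roles of points and lines exchanged, which mirrors the duality between the two theorems.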
The converse of Brianchon's theorem : If the opposite vertex lines
of a hexagram (of which three sides do not pass through one point)
pass through a point, the sides of the hexagram form tangents of a conic
section.
Indirect proof, similar to the proof of the converse of Pascal's
theorem (No. 61).
If two sides of the Brianchon hexagram coincide once or twice or
three times, we obtain the corollaries of the Brianchon theorem, the
most important of which we will here mention.
I. The sides V and VI coincide; this is to be considered as a situation
in which side VI comes closer and closer to side V and finally coincides
with it. The point of intersection V VI then becomes the point of
tangency of the tangent V, and the hexagram becomes the pentagram
I II III IV V. Brianchon's theorem then assumes the following
form:
Corollary 1 (Figure 69): In every pentagram circumscribed about a
conic section the lines joining two pairs of nonadjacent vertexes and the
junction line of the fifth vertex with the point of tangency of its opposite side pass
through one point.
Fig. 69.
II. The sides V and VI coincide, and the sides II and III coincide;
here the hexagram becomes the tetragram I II IV V. Now the
junction lines of the opposite vertexes I II and IV V, as well as those
of II IV and V I, and also the junction lines of the tangency points
of II and V pass through one point. Since we could as easily select
the tangency points of the opposite sides I and IV, their junction
line also passes through the Brianchon point. Consequently, we
obtain
Corollary 2 (Figure 70): In every tetragram circumscribed about a
conic section the two diagonals and the two tangency chords of the opposite sides
pass through one point.
Fig. 71.
III. The sides I and II coincide, the sides III and IV coincide,
and the sides V and VI also coincide; the hexagram becomes a
trigram, and we obtain
Corollary 3 (Figure 71): In every triangle circumscribed about a conic
section the lines connecting the vertexes with the tangency points of the opposite
sides pass through one point.
63 Desargues' Involution Theorem
The points of intersection of a line with the three pairs of opposite sides of a
complete tetragon* and a conic section circumscribed about this tetragon form
four point pairs of an involution. The lines joining a point with the three
pairs of opposite vertexes of a complete tetragram* and the tangents drawn
from the point to a conic section inscribed in the tetragram form four ray pairs
of an involution.
It is here assumed that the line does not pass through a corner of
the tetragon and that the point does not lie on a side of the
tetragram.
This double theorem was formulated and proved in 1639 by
Desargues (No. 59) in his major work on conic sections. The work
bears the strange title Brouillon-Projet d'une atteinte aux événements des
rencontres d'un cone avec un plan, or approximately in English "First
Draft of a Projected Essay on the Phenomena Arising from the
Intersection of a Cone with a Plane."
Desargues was the source of the concept of involution and of an
amazing series of involution theorems as well, so that it seems
appropriate at this point to take up briefly for readers unfamiliar with it the
most significant properties of involution.
In a conjective projectivity (No. 59) between two homologous
structures I and II each element of a common base can be assigned to
I as well as II. Now, if there are two elements A and B of the base
such that to the element A of I there corresponds the element B of II
and simultaneously to the element B of I there corresponds the element
A of II, we say that the elements A and B are conjugate (to each other)
or correspond to each other in double fashion.
* A complete tetragon (tetragram) consists essentially of four points (lines)
1, 2, 3, 4 and their six connecting lines (points of intersection) 23, 14, 31, 24, 12,
34, of which 23 and 14, 31 and 24, 12 and 34 are known as opposite sides
(opposite vertexes).
Let us consider in addition to the conjugate point pair (A, B)
another arbitrary pair of homologous elements: P from I and Q from
II. From the equation
(ABPQ) = (BAQP)
it then follows that to the element Q from I there also corresponds the
element P from II, i.e., P and Q are also conjugate. Thus, if one
pair of homologous elements in a conjective projectivity is composed of conjugate
elements, then every pair is composed of conjugate elements.
A conjective projectivity in which every two homologous elements
are conjugate is called an involution or an involutional projectivity.
Every pair of conjugate elements is called for short an element pair of the
involution.
Fig. 72.
Since a projectivity is fixed by three elements of one structure and
the homologous elements of the other, an involution is determined by
two pairs A, A' and B, B' of conjugate elements insofar as the elements
A, A', B of the one structure correspond to the elements A', A, B' of
the other.
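That two pairs of conjugate elements fix an involution can be illustrated for a range of points given by coordinates on its base. The Python sketch below assumes the symmetric coordinate form αxx' + β(x + x') + γ = 0 for an involution (a model chosen for this example, not the book's notation); the two given pairs yield two linear conditions on (α, β, γ), which are solved by a cross product. The name `involution_from_pairs` and the sample pairs are illustrative.

```python
from fractions import Fraction

def involution_from_pairs(pair1, pair2):
    """Return x -> x' for the involution containing the two given pairs.

    The involution is taken in the symmetric form
    alpha*x*x' + beta*(x + x') + gamma = 0, so that exchanging x and x'
    leaves the relation unchanged."""
    (a, a1), (b, b1) = pair1, pair2
    u = (a * a1, a + a1, 1)
    v = (b * b1, b + b1, 1)
    # (alpha, beta, gamma) is perpendicular to both u and v.
    alpha = u[1] * v[2] - u[2] * v[1]
    beta = u[2] * v[0] - u[0] * v[2]
    gamma = u[0] * v[1] - u[1] * v[0]

    def image(x):
        # Undefined at the single point whose conjugate lies at infinity.
        return -(beta * x + gamma) / (alpha * x + beta)

    return image

if __name__ == "__main__":
    F = Fraction
    f = involution_from_pairs((F(0), F(1)), (F(2), F(5)))
    for x in (F(0), F(1), F(2), F(5), F(3)):
        print(x, "->", f(x), "->", f(f(x)))
```

Every printed row should return to its starting value after two applications, which is the defining property of conjugate elements.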
Construction of an involution, i.e., construction of an element P'
corresponding to an arbitrary element P, is most effectively
accomplished by means of Desargues' involution theorem (where conic
sections do not enter into the picture). Let us say, for example, that
we are concerned with the involution of two ranges of points. Let
(A, A') and (B, B') be the given point pairs of the involution, C an
additional given point of the base g, and C' the homolog of C we are
looking for. We draw through A, B, C three lines that form a
triangle 1 2 3 (A on 23, B on 31, C on 12), connect A' with 1, B' with
2, and the point of intersection 4 of these connecting lines with 3.
Then 34 meets the base at C'. (The opposite side pairs 23 and 14,
31 and 24, 12 and 34 of the tetragon 1 2 3 4 cut g at the point pairs
(A, A'), (B, B'), and (C, C') of the Desargues involution.) The
construction of the involution between two ray pencils is carried out
in a very similar fashion.
We will now consider the important case of the involution on a circle.
Let (A, A') and (B, B') be two point pairs of an involution between
two ranges of points of a circle (Figure 73).
We connect the points of both sets with the circle points A and A'.
We thereby obtain two projective ray pencils in which the rays AA',
AB, AB' of the first pencil correspond to the rays A'A, A'B', A'B of
the second pencil. Since the junction line AA' of the pencil centers
corresponds to itself, the pencils are perspective (No. 59). The axis
of perspectivity is the junction line of the points of intersection Z of
AB and A'B' and O of AB' and BA'.
In order to find the homolog C' in the involution of an arbitrary
point C, we cause AC and OZ to intersect at Y and connect Y with A';
the connecting line meets the circle at C'.
Since we can just as well undertake the whole consideration with
the pencil centers B and B' (instead of A and A'), we also obtain C'
when we cause BC and OZ to intersect and connect the point of
intersection X with B'.
Since the homologous sides (bearing the same letter designation) of
triangles ABC and A'B'C' intersect on a straight line (XYZ), then,
according to Desargues' homology theorem (No. 59), the junction
lines AA', BB', and CC' of the homologous vertexes pass through one
point S. If we then draw through S any secant, this secant cuts the
circle at two conjugate points of the involution.
The result of our consideration is the theorem:
The lines joining the conjugate points of an involution on a circle pass
through a fixed point.
And conversely:
A secant rotated about a fixed point cuts a circle at the point pairs of an
involution.
In quite similar fashion the following theorem is proved:
The points of intersection of conjugate tangents of an involution on a circle
lie on a straight line.
And conversely:
If a point moves on a line, the tangents drawn from this point to a circle
generate an involution on the circle (Figure 74).
Moreover, since every conic section is the central projection of a
circle, and projectivity, and thus also involution, between two
structures is not annulled by projection of these structures (Pappus'
theorem, No. 59), the two just stated theorems are valid for conic
sections as well:
Involution on a conic section: The lines connecting conjugate points
of an involution on a conic section pass through a fixed point.
The points of intersection of conjugate tangents of an involution on a conic
section lie on a fixed straight line.
And conversely:
A secant rotated about a fixed point cuts a conic section at the point pairs of an
involution. The tangents from a point moving along a fixed straight line to a
conic section are tangent pairs of an involution.
The proof of Desargues' involution theorem is based on the theorems:
The points of a conic section are projected from pairs of themselves by
projective pencils (No. 61).
The tangents of a conic section cut two of the tangents into projective ranges
of points (No. 62).
Let 1 2 3 4 be an inscribed tetragon. Let the line g cut the sides
23, 31, 12 at A, B, C, the opposite sides 14, 24, 34 at A', B', C', and
the conic section at S and S'. We connect the conic section points
2, 3, S, S' with 1 and 4 and obtain two projective pencils with the
centers 1 and 4, so that the projections 12, 13, 1S, 1S' and
42, 43, 4S, 4S' are projective.
Fig. 75.
We cause these pencils to intersect with g and obtain two conjective
projective ranges of points with the base g in which
CBSS' ⊼ B'C'SS',
i.e.,
(CBSS') = (B'C'SS').
We now switch the first two terms with each other and the second
two terms with each other on the right-hand side and obtain
(CBSS') = (C'B'S'S),
so that
CBSS' ⊼ C'B'S'S.
In this projection there are two conjugate points S and S'.
Consequently, the projectivity is an involution, and the points B and B',
as well as the points C and C', are conjugate.
If we connect the conic section points 3, 1, S, S' with 2 and 4, and
undertake the same considerations, we find that
(ACSS') = (A'C'S'S),
so that in the involution defined by the point pairs (S, S') and (C, C')
the points A and A' are also conjugate.
Accordingly, (A, A'), (B, B'), (C, C'), and (S, S') are point pairs of
an involution.

Let I II III IV be a circumscribed tetragram. Let the lines connecting
the point P with the vertexes II III, III I, I II be a, b, c, with the
opposite vertexes I IV, II IV, III IV a', b', c'. Let the tangents from
P to the conic section be t and t'. We cut the conic section tangents
II, III, t, t' with I and IV and obtain two projective ranges of points
on the bases I and IV, so that the projections I II, I III, I t, I t' and
IV II, IV III, IV t, IV t' are projective.
Fig. 76.
We project these ranges from P and obtain two conjective projective
ray pencils with the center P in which
cbtt' ⊼ b'c'tt',
i.e.,
(cbtt') = (b'c'tt').
We now switch the first two terms with each other and the second
two terms with each other on the right-hand side and obtain
(cbtt') = (c'b't't),
so that
cbtt' ⊼ c'b't't.
In this projection there are two conjugate rays t and t'.
Consequently, the projectivity is an involution, and the rays b and b', as
well as the rays c and c', are conjugate.
If we cut the conic section tangents III, I, t, t' with II and IV, and
undertake the same considerations, we find that
(actt') = (a'c't't),
so that in the involution defined by the ray pairs (t, t') and (c, c')
the rays a and a' are also conjugate.
Accordingly, (a, a'), (b, b'), (c, c'), and (t, t') are ray pairs of an
involution.

Thus Desargues' theorem is proved.
Special Cases
We maintain fixed the conic section, the three vertexes 1, 2, 3, and
the straight line g; we allow the vertex 4, on the other hand, to travel
on the conic section toward the point 3. The secant 34 then comes
closer and closer to the tangent at 3, while at the same time point A'
comes closer and closer to point B and point B' closer and closer to
point A. When 4 reaches 3, 43 becomes a tangent through 3, and A'
coincides with B and B' with A.

We maintain fixed the conic section, the three sides I, II, III, and
the point P; we allow the side IV to roll along the conic section into
position III. The vertex III IV then comes closer and closer to the
point of tangency of the tangent III, while at the same time the ray a'
comes closer and closer to the ray b and the ray b' comes closer and
closer to the ray a. When IV coincides with III, IV III becomes the
tangency point of III, and a' coincides with b and b' with a.

Consequently, we obtain
Corollary 1
The points of intersection of a straight line: 1. with a conic section,
2. with two sides of a triangle inscribed in a conic section, 3. with the
third side of the triangle and the conic section tangent passing through
its opposite vertex are three point pairs of an involution.
Fig. 77.
1. The tangents from a point to a conic section, 2. the lines joining
the point with two vertexes of a trigram circumscribed about a conic
section, 3. the lines joining the point with the third vertex of the
trigram and the point of tangency on its opposite side are three ray pairs
of an involution.
Fig. 78.

If we maintain fixed the conic section in the figure obtained, the
line g, and the vertexes 1 and 3, and let 2 travel toward 1, then 12
approaches more and more closely the tangent through 1 and A the
point A'. When 2 reaches 1, 12 becomes the tangent through 1, A
coincides with A', and C falls on the tangent through 1.
Fig. 79.
If we maintain fixed the conic section in the figure obtained, the
point P, and the sides I and III, and let II roll toward I, the point I II
approaches more and more closely the tangency point of I and a the
ray a'. When II reaches I, I II becomes the tangency point of I, a
coincides with a', and c passes through the tangency point of I.
Fig. 80.
Thus, we have
Corollary 2
Given a conic section with two tangents and their corresponding tangency
chord (Figures 79 and 80):
If the points of intersection of an
arbitrary line with the conic section
are chosen as the first pair, the points
of intersection with the given tangents
as the second pair of an involution,
the point of intersection of the
tangency chord with the line is a
double point of the involution.
If the tangents drawn to a conic
section from an arbitrary point are
chosen as the first pair, and the rays
from the point to the ends of the
tangency chord as the second pair of
an involution, the line joining the
point with the point of intersection of
the given tangents is a double ray of
the involution.
Note. Through the four corners of a tetragon there pass an
infinite number of conic sections, which form a so-called conic section
pencil. The (complete) tetragon is called a fundamental tetragon in
this context.
Similarly, there are an infinite number of conic sections that are
tangent to the four sides of a tetragram; they form a so-called field of
conic sections. The (complete) tetragram in this context is called a
fundamental tetragram.
Since Desargues' theorem applies to every one of these conic
sections, we can state the theorem in the following manner, which is
its most general and shortest form.
Desargues' involution theorem: The intersection point pairs of a line
with the conic sections of a pencil are point pairs of an involution.
The tangent pairs from a point to the conic sections of a field are ray pairs of
an involution.
Here the opposite side pairs of the fundamental tetragon are to be
considered as (degenerate) conic sections of the pencil, and the opposite
vertex pairs of the fundamental tetragram as (degenerate) conic
sections of the field.
A Conic Section from Five Elements
To draw a conic section of which five elements—points and tangents—are
known.
In the solution of this fundamental problem we distinguish three
cases:
I. the five elements are of the same type;
II. four elements are of the same type, but the fifth is of the other;
III. three elements are of one type, two are of the other.
In the following we will designate the conic section as K.

I. To draw a conic section from five points.
This problem is commonly solved by means of Pascal's theorem.
We number the points in an arbitrary sequence from 1 to 5 and
designate as 6 the unknown point of intersection of an arbitrary line
g = 56, passing through 5, with K. We then draw the Pascal line p of
the hexagon 1 2 3 4 5 6 as the line connecting the point of intersection
of the opposite sides 12 and 45 with the point of intersection of the
opposite sides 23 and 56 = g.
The line joining the point of intersection of the two lines 34 and p
with the vertex 1 cuts g (= 56) at the sought-for point 6.
By repeating the construction with another line g we can obtain as
many points of K as we desire.
In order to draw the tangent to K at one of the five known points
1, 2, 3, 4, 5 of a conic section, let us say at 5, we make use of the first
corollary to Pascal's theorem.
We draw the point of intersection of the two sides 51 and 43, also
the point of intersection of the sides 54 and 12, and allow the line p
connecting these two points to intersect with the side 23. The line
connecting the resulting point of intersection with the vertex 5 is the
sought-for tangent at 5.

I. To draw a conic section from five tangents.
This problem is commonly solved by means of Brianchon's theorem.
We number the tangents in an arbitrary sequence from I to V and
designate as VI the unknown tangent drawn to K from an arbitrary
point P = V VI of tangent V. We then draw the Brianchon point B
of the hexagram I II III IV V VI as the point of intersection of the
line connecting the opposite vertexes I II and IV V with the line
connecting the opposite vertexes II III and V VI = P.
The point of intersection of the line connecting the two points
III IV and B with the side I is a second point of the sought-for
tangent VI.
By repeating the construction with other points P we can obtain as
many tangents of K as we desire.
To draw on one of five known tangents I, II, III, IV, V to a conic
section, let us say on V, the point of tangency with K, we make use of
the first corollary to Brianchon's theorem.
We draw the line connecting the two vertexes V I and IV III and
the line connecting the two vertexes V IV and I II, and connect the
point of intersection B of the two lines with the vertex II III. This
new junction line meets the tangent V at the sought-for point of
tangency.

II. To draw a conic section of which four points 1, 2, 3, 4 and one
tangent t are given.
First case: The tangent t passes through one of the given points, for
example, through 4.
Let us consider the tangent t as the line connecting two infinitely
close conic section points 4 and 5, so that t = 45, and let us designate
as 6 the point of intersection of K with an arbitrary line x starting
from 1, so that x = 16. We then draw the Pascal line p of the hexagon
1 2 3 4 5 6 as the line connecting the point of intersection of opposite
sides 12 and 45 = t with the point of intersection of the opposite sides
34 and 61 = x. The line connecting the point of intersection of the
lines p and 23 with the vertex 4 meets x at the sought-for point 6.
We now have five known points of K, and the problem is reduced
to I.
Second case: The tangent t does not pass through any of the given
points.
To solve this problem we use Desargues' involution theorem
(No. 63), taking t as the involution base. We determine the points of
intersection, let us say A, A', B, B', of the sides 12, 34, 23, 41 of the
tetragon 1 2 3 4 with t and draw a double point of the involution
determined on t by the two point pairs (A, A') and (B, B'); this is the
point of tangency of the tangent t.
Now five points of K are known and the problem is reduced to I.

II. To draw a conic section of which four tangents I, II, III, IV and
one point P are given.
First case: The point P lies on one of the given tangents, for example,
on IV.
Let us consider the point P as the point of intersection of two
infinitely close conic section tangents IV and V, so that P = IV V, and
let us designate as VI a second tangent from an arbitrary point X of I
to K, so that X = I VI. We then draw the Brianchon point B of the
hexagram I II III IV V VI as the point of intersection of the line
connecting the opposite vertexes I II and IV V = P and the line
connecting the opposite vertexes III IV and VI I = X. The point of
intersection of the line connecting the points B and II III with the
side IV is a second point of the sought-for tangent VI.
We now have five known tangents of K and the problem is thereby
reduced to I.
Second case: The point P does not lie on any of the given tangents.
To solve this problem we make use of Desargues' involution
theorem (No. 63), taking P as the involution base. We determine the
junction lines a, a', b, b' connecting the vertexes I II, III IV, II III,
IV I of the tetragram I II III IV with P and construct the double ray
of the involution determined on P by the two ray pairs (a, a') and
(b, b'); this is the conic section tangent passing through P.
We now have five known tangents of K and the problem thus
reduces to I.

The second case of II. has two solutions if the involution has two
double elements and no solution if the involution has no double
elements.
III. To draw a conic section of which three points A, B, C and two
tangents d and e are given.
First case: d passes through A, and e through B.
We draw the point of intersection S of an arbitrary line g
originating at A with K.
For our purpose we construct the Pascal line p of the hexagon
1 2 3 4 5 6 of which the vertexes 1 and 2 coincide with A, the vertexes
3 and 4 with B, the vertex 5 with C, and the vertex 6 with S, the sides
12 and 34 being represented by the tangents d and e, respectively.
p is the line connecting the point of intersection of the sides 12 = d
and 45 = BC with the point of intersection of the sides 34 = e and
61 = g. The line connecting the point of intersection of the lines p
and 23 = AB with the vertex 5 = C meets g at the sought-for conic
section point S.
In the same way we draw a fifth point of K and thus reduce the
problem to I.
Second case: d passes through A, and e does not pass through any of
the given points.
We solve this case with the second corollary to Desargues'
involution theorem.
We determine the points of intersection D and E of the line BC
with d and e and construct a double point of the involution defined
by the point pairs (B, C) and (D, E). Its junction line with A passes
through the point of tangency of e.
The problem is now reduced to the preceding case.
Third case: Neither of the two tangents passes through any of the
given points.
In this case also the solution is based on the second corollary to
Desargues' involution theorem.
We designate the points of intersection of BC with d and e as D and
E and determine a double point P of the involution defined by the
point pairs (B, C) and (D, E). It lies on the tangency chord of the
tangents d and e.
We designate the points of intersection of CA with d and e as D'
and E' and draw a double point P' of the involution determined by
the point pairs (C, A) and (D', E'). This double point also lies on the
tangency chord of the tangents d and e.
The line joining the two double points P and P' is thus the
tangency chord we have mentioned and meets the tangents d and e at
their tangency points.
We now know five points of K and thus return to I.

III. To draw a conic section of which three tangents a, b, c and two
points D and E are given.
First case: D lies on a, and E on b.
We draw the (second) tangent t from an arbitrary point P of
tangent a to K.
For our purpose we construct the Brianchon point B of the
hexagram I II III IV V VI of which the sides I and II coincide with a,
the sides III and IV with b, the side V with c, and the side VI with t,
the vertexes I II and III IV being represented by the points D and E,
respectively. B is the point of intersection of the line connecting the
vertexes I II = D and IV V = bc and the line connecting the vertexes
III IV = E and VI I = P. The point of intersection of the line
connecting points B and II III = ab with the side V = c is a second
point of the sought-for tangent t.
In the same way we draw a fifth tangent of K and thereby reduce
the problem to I.
Second case: D lies on a, and E does not lie on any of the given
tangents.
We solve this case with the second corollary to Desargues'
involution theorem.
We determine the connecting lines d and e joining the point bc
with D and E and draw a double ray of the involution determined by
the ray pairs (b, c) and (d, e). Its point of intersection with a lies on
the tangent passing through E; this tangent is thus determined.
The problem is now reduced to the preceding case.
Third case: Neither of the two points lies on any of the given
tangents.
In this case also the solution is based on the second corollary to
Desargues' involution theorem.
We designate the lines joining bc with D and E as d and e and
determine a double ray s of the involution determined by the ray
pairs (b, c) and (d, e). It passes through the point of intersection of
the tangents drawn through D and E.
We designate the lines joining ca with D and E as d' and e' and
draw a double ray s' of the involution determined by the ray pairs
(c, a) and (d', e'). This double ray also passes through the point of
intersection of the tangents through D and E.
The point of intersection of the two double rays s and s' is thus the
tangent intersection point that was mentioned before, and the lines
joining it to D and E are the tangents passing through D and E.
We now have five tangents of K and thus return to I.

This last problem admits of a solution only when each of the two
designated involutions has double elements. And since we can
connect each of the two double elements of one of the involutions
with each of the double elements of the other, we obtain four possible
tangency chords and tangent intersection points, respectively, and
thus four different conic sections.
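The Pascal construction of case I (five given points, sixth point on a line g through point 5) can be traced numerically. The sketch below is an illustration only and rests on assumptions that are not part of the text: points and lines are represented by homogeneous coordinate triples, so that both the line joining two points and the point of intersection of two lines are cross products. The names `sixth_point` and `affine` are hypothetical, and degenerate positions (coinciding points, a result at infinity) are not handled.

```python
from fractions import Fraction


def cross(u, v):
    """Cross product of two homogeneous triples."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


join = cross   # line through two points
meet = cross   # point of intersection of two lines


def sixth_point(p1, p2, p3, p4, p5, g):
    """Second intersection of the line g (through p5) with the conic through p1..p5."""
    x = meet(join(p1, p2), join(p4, p5))   # sides 12 and 45
    y = meet(join(p2, p3), g)              # sides 23 and 56 = g
    pascal = join(x, y)                    # Pascal line p
    z = meet(join(p3, p4), pascal)         # side 34 and p
    return meet(join(z, p1), g)            # line through z and 1 cuts g at 6


def affine(pt):
    """Convert a homogeneous triple to ordinary coordinates."""
    return (Fraction(pt[0], pt[2]), Fraction(pt[1], pt[2]))


# Five points of the parabola y = x^2, and the line through point 5 with slope 1.
p1, p2, p3, p4, p5 = (0, 0, 1), (1, 1, 1), (2, 4, 1), (-1, 1, 1), (-2, 4, 1)
g = join(p5, (-1, 5, 1))
print(affine(sixth_point(p1, p2, p3, p4, p5, g)))
```

In this example the line y = x + 6 meets the parabola again at (3, 9), which is the point the construction returns.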
A Conic Section and a Straight Line
To draw the points of intersection of a given straight line with a conic
section of which five elements—points and tangents—are known.
In the solution of this problem we may assume, in view of No. 64,
that five points of the conic section are known. The solution is then
based on the theorem: The points of a conic section are projected from pairs
of themselves by projective pencils (No. 61) and on Steiner's double element
construction (No. 60).
Let the given line be called g, the given points of the conic section
A, B, C, D, E. We can think of the points of the conic section as
projected from D and E by the two projective pencils I and II.
These pencils cut g into the two projective ranges of points 1 and 2.
The points of intersection S and T of g with the conic section are the
double elements of the projectivity 1 ⊼ 2. This projectivity is,
however, determined by the points of intersection A1, B1, C1 of the
rays DA, DB, DC with g and the homologous points of intersection
A2, B2, C2 of the rays EA, EB, EC with g.
We therefore draw according to Steiner the double elements of the
projectivity defined on g by the homologous point triplets (A1, B1, C1)
and (A2, B2, C2); they are the points of intersection we are looking for.
A Conic Section and a Point
To draw the tangents from a given point to a conic section of which five
elements—points and tangents—are known.
In view of the considerations of No. 64, we may assume the given
conic section elements to be tangents.
The solution to this problem is based upon the theorem: The
tangents of a conic section mark off projective ranges of points on two of the
tangents (No. 62) and on Steiner's double element construction (No. 60).
Let the given point be P, the given tangents a, b, c, d, e. Let us
consider the tangents of the conic section as intersecting with d and e,
so that we obtain on d and e the projective ranges 1 and 2 in which the
points of intersection A1, B1, C1 of the tangents a, b, c with d and
the points of intersection A2, B2, C2 of the tangents a, b, c with e are
homologous elements. The projections of these ranges of points from P
thus form two projective ray pencils I and II. The (conjective)
projectivity is determined by the lines a_I, b_I, c_I connecting the points
of intersection A1, B1, C1 to P and the homologous connecting lines
a_II, b_II, c_II joining the points of intersection A2, B2, C2 to P. Since
each of the two tangents s and t from P to the conic section cuts 1 and 2
into homologous elements, s and t are therefore the double elements
of the projectivity I ⊼ II.
We thus draw according to Steiner the double elements of the
conjective projectivity determined by the homologous ray triplets
(a_I, b_I, c_I) and (a_II, b_II, c_II); they are the sought-for tangents.
Stereometric Problems
Steiner's Division of Space by Planes
What is the maximum number of parts into which a space can be divided by
n planes?
This very interesting problem appears in Steiner's paper "Several
laws governing the division of planes and space" (Crelle's Journal,
vol. I and Steiner's Complete Works, vol. I).
We first solve the preliminary problem: What is the maximum
number of parts into which a plane can be divided by n straight lines?
The number of parts will evidently be maximal when no two lines
are parallel and no more than two lines pass through one point. In
the following we will assume these two conditions to be satisfied and
we will designate the corresponding number of surface sections
generated by the n lines as $\bar{n}$.
Thus, let the plane be divided by n lines into $\bar{n}$ surface sections. We
now draw one additional line. This line is cut by the first n lines
at n points, and thus traverses n + 1 of the available $\bar{n}$ surface
sections, dividing each of them into two parts, so that the (n + 1)th
line increases the number of surface sections by n + 1. Consequently,
we obtain the equation
$$\overline{n+1} = \bar{n} + (n + 1).$$
We then apply this equation to the cases in which n = 0, 1, 2, ...
and we form the n equations
$$\bar{1} = 1 + 1,$$
$$\bar{2} = \bar{1} + 2,$$
$$\bar{3} = \bar{2} + 3,$$
$$\dots$$
$$\bar{n} = \overline{n-1} + n.$$
Addition of these equations results in
$$\bar{n} = 1 + (1 + 2 + 3 + \dots + n)$$
or, since the sum of the first n natural numbers is n(n + 1)/2,
$$\bar{n} = 1 + \frac{n(n+1)}{2}. \tag{1}$$
Thus, the maximum number of parts into which a plane can be divided by
n lines is $(n^2 + n + 2)/2$.
The obtained result is easily confirmed for the cases n = 1, 2, 3, ....
Now for the space problem! It is apparent that the number of
partial spaces attains a maximum when no more than three planes
ever intersect at one point and when no two lines of intersection of
the planes are ever parallel. We will therefore assume
that these conditions are satisfied in the following and we designate
the number of partial spaces formed by n planes as $\bar{\bar{n}}$.
Then, let the space be divided by n planes into $\bar{\bar{n}}$ partial spaces. To
these planes we now add one additional plane. This plane is cut by the
original n planes into n lines of which no more than two pass through
a single point and no two or more are parallel. The new (n + 1)th
plane is therefore divided by the n lines into $\bar{n}$ surface sections.
Each of these $\bar{n}$ surface sections cuts the partial space that it
traverses into two smaller spaces, so that the addition of the (n + 1)th
plane increases the number of the partial spaces originally present
by $\bar{n}$. This gives us the equation
$$\overline{\overline{n+1}} = \bar{\bar{n}} + \bar{n}.$$
We form this equation for the cases n = 0, 1, 2, etc., and obtain the n
equations
$$\bar{\bar{1}} = 1 + 1,$$
$$\bar{\bar{2}} = \bar{\bar{1}} + \bar{1},$$
$$\bar{\bar{3}} = \bar{\bar{2}} + \bar{2},$$
$$\dots$$
$$\bar{\bar{n}} = \overline{\overline{n-1}} + \overline{n-1}.$$
Addition of these equations results in
$$\bar{\bar{n}} = 2 + \bar{1} + \bar{2} + \bar{3} + \dots + \overline{n-1}$$
or, according to (1),
$$\bar{\bar{n}} = n + 1 + \tfrac{1}{2}\bigl(1 \cdot 2 + 2 \cdot 3 + \dots + (n-1)n\bigr).$$
If we then divide each product $\nu(\nu + 1)$ into $\nu^2 + \nu$, we obtain
$$\bar{\bar{n}} = n + 1 + \tfrac{1}{2}\bigl\{[1^2 + 2^2 + \dots + (n-1)^2] + [1 + 2 + \dots + (n-1)]\bigr\}.$$
Now, according to No. 11, the sums in the first and second square
brackets, respectively, are
$$\tfrac{1}{6}(n-1)n(2n-1) \quad\text{and}\quad \tfrac{1}{2}(n-1)n, \text{ respectively;}$$
the brace thus equals $\tfrac{1}{3}(n-1)n(n+1)$, and
$$\bar{\bar{n}} = n + 1 + \tfrac{1}{6}(n-1)n(n+1)$$
or
$$\bar{\bar{n}} = \frac{n^3 + 5n + 6}{6}.$$
Conclusion: The maximum number of parts into which a space can be
divided by n planes is $(n^3 + 5n + 6)/6$.
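The two recurrences and their closed forms are easy to check against each other. The following sketch is a plain numerical check, not part of Steiner's argument; the function names `plane_parts` and `space_parts` are chosen here for illustration.

```python
def plane_parts(n):
    """Maximum number of parts of a plane cut by n lines, built from the recurrence:
    the k-th line adds k parts."""
    parts = 1                      # the undivided plane
    for k in range(1, n + 1):
        parts += k
    return parts


def space_parts(n):
    """Maximum number of parts of space cut by n planes, built from the recurrence:
    the (k + 1)-th plane adds as many parts as k lines cut a plane into."""
    parts = 1                      # the undivided space
    for k in range(n):
        parts += plane_parts(k)
    return parts


for n in range(6):
    print(n, plane_parts(n), space_parts(n))
```

For every n the values agree with (n^2 + n + 2)/2 and (n^3 + 5n + 6)/6; for instance, four planes give at most 15 parts.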
Euler's Tetrahedron Problem
To express the volume of a tetrahedron in terms of its six edges.
This fundamental problem was posed and solved by Leonhard Euler
(Novi Commentarii Academiae Petropolitanae ad annos 1752 et 1753).
The following convenient and simple solution is based upon vector
calculus.
We will designate the vertexes of the tetrahedron as A, B, C, O, the
six edges BC, CA, AB, OA, OB, OC as a, b, c, p, q, r, the three vectors
OA, OB, OC as $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$, and the volume we are looking for as T. We will
consider the edges p, q, r originating from the vertex O as being so
arranged that they form a right-handed system, i.e., that p can be
imagined as the thumb, q as the index finger, and r as the middle
finger of the right hand.
If we take the triangle OAB as the base surface and the vertex C as
the apex of the tetrahedron, then the double value of the base surface
area S is given by the magnitude of the vector product $\mathbf{S} = \mathbf{p} \times \mathbf{q}$,
and the altitude CF is the projection of the edge r on CF, i.e., $r\,o$, if we
designate as $o$ the cosine of the angle between CO and CF or also of the
angle of the two vectors $\mathbf{S}$ and $\mathbf{r}$.
Consequently, six times the tetrahedron volume is equal to $S \cdot r\,o$ or
equal to the scalar product* $\mathbf{S} \cdot \mathbf{r}$ of the vectors $\mathbf{S}$ and $\mathbf{r}$. Thus, we
obtain the simple formula
$$6T = (\mathbf{p} \times \mathbf{q}) \cdot \mathbf{r},$$
which can be stated verbally as follows:
Six times the volume of a tetrahedron is equal to the mixed product of the three
vectorial edges originating from one vertex of the tetrahedron.
* The scalar product of two vectors $\mathbf{A}$ and $\mathbf{B}$ is most conveniently written
$\mathbf{A} \cdot \mathbf{B}$ or in the still simpler form $\mathbf{A}\mathbf{B}$.
Here the three factors of the mixed product must be written in such
sequence as to form a right-handed system (for otherwise the mixed
product would represent six times the negative tetrahedron volume).
Fig. 81.
We now introduce a right-angle coordinate system with origin at O
and designate the coordinates of the three vertexes A, B, C as x|y|z,
x'|y'|z', and x''|y''|z''. The three components of the vector
$\mathbf{S} = \mathbf{p} \times \mathbf{q}$ are then yz' − zy', zx' − xz', xy' − yx', and the scalar product
$\mathbf{S} \cdot \mathbf{r}$ is equal to (yz' − zy')x'' + (zx' − xz')y'' + (xy' − yx')z'', i.e.,
equal to the determinant whose rows are the components of the
vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$. Thus we obtain the elegant formula
$$6T = \begin{vmatrix} x & y & z \\ x' & y' & z' \\ x'' & y'' & z'' \end{vmatrix}.$$
On squaring this formula, multiplying the two (same) determinants
row by row, we obtain
$$36T^2 = \Delta = \begin{vmatrix}
xx + yy + zz & xx' + yy' + zz' & xx'' + yy'' + zz'' \\
x'x + y'y + z'z & x'x' + y'y' + z'z' & x'x'' + y'y'' + z'z'' \\
x''x + y''y + z''z & x''x' + y''y' + z''z' & x''x'' + y''y'' + z''z''
\end{vmatrix}$$
or, since the elements of this determinant are the scalar products of
the vectors $\mathbf{p}$, $\mathbf{q}$, $\mathbf{r}$ in pairs, or the squares of these vectors,
$$36T^2 = \begin{vmatrix}
\mathbf{pp} & \mathbf{pq} & \mathbf{pr} \\
\mathbf{qp} & \mathbf{qq} & \mathbf{qr} \\
\mathbf{rp} & \mathbf{rq} & \mathbf{rr}
\end{vmatrix}. \tag{I}$$
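This is Euler's tetrahedron formula. (Euler, however, expressed the
right-hand side as an algebraic sum rather than as a determinant.)
It contains the solution to the problem posed, since the elements of
the determinant are simple expressions of the edges; specifically:
$$\mathbf{pp} = p^2, \qquad \mathbf{qq} = q^2, \qquad \mathbf{rr} = r^2,$$
$$\mathbf{qr} = \frac{q^2 + r^2 - a^2}{2}, \qquad \mathbf{rp} = \frac{r^2 + p^2 - b^2}{2}, \qquad \mathbf{pq} = \frac{p^2 + q^2 - c^2}{2}.$$
In the tetrahedron with the edges a = 11, b = 10, c = 9, p = 8,
q = 7, r = 6, for example, we have
$$\mathbf{pp} = 64, \quad \mathbf{qq} = 49, \quad \mathbf{rr} = 36, \quad \mathbf{qr} = -18, \quad \mathbf{rp} = 0, \quad \mathbf{pq} = 16,$$
and
$$36T^2 = \begin{vmatrix} 64 & 16 & 0 \\ 16 & 49 & -18 \\ 0 & -18 & 36 \end{vmatrix}
= 16 \cdot 36 \begin{vmatrix} 4 & 16 & 0 \\ 1 & 49 & -9 \\ 0 & -1 & 1 \end{vmatrix}
= 16 \cdot 36 \cdot 9 \cdot 16$$
and T = 48.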
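Formula (I), with the scalar products expressed through the six edges, can be evaluated directly. The sketch below is an illustration with a hypothetical function name `t_squared`; it uses exact fractions and expands the 3 × 3 determinant along its first row, and it returns the square of T rather than T itself.

```python
from fractions import Fraction


def t_squared(a, b, c, p, q, r):
    """Square of T from formula (I); edges named as in the text
    (a = BC, b = CA, c = AB, p = OA, q = OB, r = OC)."""
    pp, qq, rr = p * p, q * q, r * r
    qr = Fraction(qq + rr - a * a, 2)
    rp = Fraction(rr + pp - b * b, 2)
    pq = Fraction(pp + qq - c * c, 2)
    # Determinant of [[pp, pq, rp], [pq, qq, qr], [rp, qr, rr]], first-row expansion.
    det = (pp * (qq * rr - qr * qr)
           - pq * (pq * rr - qr * rp)
           + rp * (pq * qr - qq * rp))
    return det / 36


print(t_squared(11, 10, 9, 8, 7, 6))   # square of T for the example in the text
```

For the edges 11, 10, 9, 8, 7, 6 the function returns 2304, the square of 48; for a regular tetrahedron of edge 1 it returns 1/72.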
We can put the obtained result into still another form.
If we multiply each element of $\Delta$ by 2 and express the doubled
scalar products by the squares P, Q, R, A, B, C of the edge magnitudes
p, q, r, a, b, c, we obtain
$$288T^2 = \begin{vmatrix}
2P & P + Q - C & P + R - B \\
Q + P - C & 2Q & Q + R - A \\
R + P - B & R + Q - A & 2R
\end{vmatrix}.$$
Now we distribute zeros at the left and minus ones at the bottom and
obtain
$$288T^2 = \begin{vmatrix}
0 & 2P & P + Q - C & P + R - B \\
0 & Q + P - C & 2Q & Q + R - A \\
0 & R + P - B & R + Q - A & 2R \\
-1 & -1 & -1 & -1
\end{vmatrix}.$$
If we add the P-, Q-, and R-multiples of the last row to the first,
second, and third rows, respectively, we obtain the somewhat simpler
$$288T^2 = \begin{vmatrix}
-P & P & Q - C & R - B \\
-Q & P - C & Q & R - A \\
-R & P - B & Q - A & R \\
-1 & -1 & -1 & -1
\end{vmatrix}.$$
We now distribute zeros and ones at the top and right:
$$288T^2 = \begin{vmatrix}
0 & 0 & 0 & 0 & 1 \\
-P & P & Q - C & R - B & 1 \\
-Q & P - C & Q & R - A & 1 \\
-R & P - B & Q - A & R & 1 \\
-1 & -1 & -1 & -1 & 0
\end{vmatrix}.$$
If we now subtract the P-, Q-, and R-multiples of the last column
from the second, third, and fourth columns, respectively, we finally
obtain
$$288T^2 = \begin{vmatrix}
0 & -P & -Q & -R & 1 \\
-P & 0 & -C & -B & 1 \\
-Q & -C & 0 & -A & 1 \\
-R & -B & -A & 0 & 1 \\
-1 & -1 & -1 & -1 & 0
\end{vmatrix}$$
or, if we reverse all the minus signs,
$$288T^2 = \begin{vmatrix}
0 & P & Q & R & 1 \\
P & 0 & C & B & 1 \\
Q & C & 0 & A & 1 \\
R & B & A & 0 & 1 \\
1 & 1 & 1 & 1 & 0
\end{vmatrix}. \tag{II}$$
In this remarkable formula P, Q, R, A, B, C are the squares of the
edges p, q, r, a, b, c.
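The bordered determinant of formula (II) can be evaluated numerically as a check on the chain of row and column operations. The sketch below is an illustration only; `det` is a small recursive Laplace expansion written for this purpose and `formula_ii` is a hypothetical name. Both work in exact integer arithmetic.

```python
def det(m):
    """Determinant by Laplace expansion along the first row (fine for a 5 x 5 matrix)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        if entry == 0:
            continue
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def formula_ii(a, b, c, p, q, r):
    """Right-hand side of formula (II), i.e. 288 T^2, from the six edges."""
    A, B, C, P, Q, R = (e * e for e in (a, b, c, p, q, r))
    return det([
        [0, P, Q, R, 1],
        [P, 0, C, B, 1],
        [Q, C, 0, A, 1],
        [R, B, A, 0, 1],
        [1, 1, 1, 1, 0],
    ])


print(formula_ii(11, 10, 9, 8, 7, 6) == 288 * 48 ** 2)
```

With the edges of the worked example the determinant equals 288 · 48², in agreement with T = 48 obtained from formula (I).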
Note: the four-point relation: If A, B, C, O are four points of a
plane, the volume of the tetrahedron ABCO is zero and (I) is transformed
into the so-called four-point relation:
$$\begin{vmatrix}
\mathbf{pp} & \mathbf{pq} & \mathbf{pr} \\
\mathbf{qp} & \mathbf{qq} & \mathbf{qr} \\
\mathbf{rp} & \mathbf{rq} & \mathbf{rr}
\end{vmatrix} = 0$$
for the six junction lines BC = a, CA = b, AB = c, OA = p, OB = q,
OC = r that are possible between the four points.
69 The Shortest Distance Between Skew Lines
To calculate the angle and distance between two given skew lines.
This important problem is usually encountered in one of the
following two forms:
I. To calculate the angle and distance between two skew lines when a point
on each line and the direction of each line are given—the former by coordinates
and the latter by the direction cosines of the lines.
II. To calculate the angle and distance between two opposite edges of a
tetrahedron whose six edges are known.
The distance between two skew lines is naturally the shortest
distance between the lines, i.e., the length of the line perpendicular
to both lines and joining a point on each.
Solution of I. We designate the rectangular coordinates of the
two given points P and p as A|B|C and a|b|c, the vector pP (with the
components A - a, B - b, C - c) as $\mathbf{d}$, the direction cosines of the
two lines, which are at the same time the components of two unit
vectors $\mathbf{E}$ and $\mathbf{e}$ lying on the lines, as L, M, N and l, m, n, the
sought-for angle of the two lines as $\omega$, and the sought-for minimum
distance as k.
The solution to this problem, which is in itself not very simple,
becomes astonishingly simple with the introduction of the scalar
product $\mathbf{E} \cdot \mathbf{e}$ and the vector product $\mathbf{E} \times \mathbf{e}$ of the two vectors $\mathbf{E}$ and $\mathbf{e}$.
The former can be expressed on the one hand (since the vectors $\mathbf{E}$
and $\mathbf{e}$ have a magnitude of 1) as $\cos\omega$, and, on the other, by the
components of the factors as Ll + Mm + Nn. We therefore obtain
$$(1) \qquad \cos\omega = Ll + Mm + Nn.$$
The latter is perpendicular to both lines, so that the projection of $\mathbf{d}$
on the vector $\mathbf{E} \times \mathbf{e}$ represents the desired distance k (the shortest
distance k between the two lines is specifically the projection of $\mathbf{d}$ on
the common perpendicular and at the same time the projection of $\mathbf{d}$ on
every parallel to it, for example, on $\mathbf{E} \times \mathbf{e}$). However, since the
projection of a vector $\mathbf{V}$ on a second vector $\mathbf{v}$ of the magnitude v is
$\mathbf{V} \cdot \mathbf{v}/v$, we obtain for k the value $\mathbf{d} \cdot (\mathbf{E} \times \mathbf{e})/\sin\omega$
($\sin\omega$ is the magnitude of the vector $\mathbf{E} \times \mathbf{e}$).
Now the scalar product of the two vectors $\mathbf{d}$ and $\mathbf{E} \times \mathbf{e}$ is nothing
other than the so-called mixed product of the three vectors $\mathbf{d}$, $\mathbf{E}$, and $\mathbf{e}$.
And since the latter is equal to the determinant whose rows are the
components of the three vectors (No. 68), we obtain the formula
$$(2) \qquad k = \frac{1}{\sin\omega} \begin{vmatrix} A-a & B-b & C-c \\ L & M & N \\ l & m & n \end{vmatrix}.$$
Note. If we desire to calculate the coordinates X|Y|Z and
x|y|z of the end points U and u of the shortest junction line k, we
designate the segments PU and pu as R and r, the vector uU as $\mathbf{k}$, and
we then have
$$\overrightarrow{uU} = \overrightarrow{up} + \overrightarrow{pP} + \overrightarrow{PU},$$
or
$$\mathbf{k} = -r\mathbf{e} + \mathbf{d} + R\mathbf{E}.$$
If we multiply this equation in scalar fashion with $\mathbf{E}$ and $\mathbf{e}$, we obtain,
as a result of $\mathbf{E} \cdot \mathbf{k} = 0$ and $\mathbf{e} \cdot \mathbf{k} = 0$, the two linear equations
$$(\mathbf{E} \cdot \mathbf{E})R - (\mathbf{E} \cdot \mathbf{e})r + \mathbf{E} \cdot \mathbf{d} = 0,$$
$$(\mathbf{E} \cdot \mathbf{e})R - (\mathbf{e} \cdot \mathbf{e})r + \mathbf{e} \cdot \mathbf{d} = 0,$$
from which the unknowns R and r are obtained.
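Formulas (1) and (2) and the linear system of the Note translate almost directly into code. The sketch below is only an illustration under stated assumptions: the function name `skew_lines`, the tuple representation of points, and the sample lines are not from the text, and the direction vectors are assumed to be unit vectors of non-parallel lines.

```python
import math


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def skew_lines(P, E, p, e):
    """Angle, distance, and end points of the shortest junction line.

    P, p are points on the two lines; E, e are unit direction vectors.
    """
    d = tuple(X - x for X, x in zip(P, p))      # vector from p to P
    cos_w = dot(E, e)                           # formula (1)
    n = cross(E, e)
    sin_w = math.sqrt(dot(n, n))                # magnitude of E x e
    k = abs(dot(d, n)) / sin_w                  # formula (2), taken as a length
    # Linear system of the Note:
    #   (E.E) R - (E.e) r + E.d = 0
    #   (E.e) R - (e.e) r + e.d = 0
    EE, Ee, ee = dot(E, E), dot(E, e), dot(e, e)
    Ed, ed = dot(E, d), dot(e, d)
    D = Ee * Ee - EE * ee                       # determinant of the system
    R = (Ed * ee - Ee * ed) / D
    r = (Ee * Ed - EE * ed) / D
    U = tuple(X + R * c for X, c in zip(P, E))  # end point on the first line
    u = tuple(x + r * c for x, c in zip(p, e))  # end point on the second line
    return math.acos(cos_w), k, U, u


# Hypothetical example: a line parallel to the x-axis at height 1
# and a line parallel to the y-axis in the plane z = 0.
omega, k, U, u = skew_lines((2, 3, 1), (1, 0, 0), (5, -4, 0), (0, 1, 0))
print(math.degrees(omega), k, U, u)
```

The absolute value is taken because the determinant in (2) changes sign when the two lines are interchanged, whereas a distance is wanted.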
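Solution of II. Let the six edges of the tetrahedron be BC = a,
CA = b, AB = c, OA = p, OB = q, OC = r, and let the vectors
$\overrightarrow{BC}, \overrightarrow{CA}, \overrightarrow{AB}, \overrightarrow{OA}, \overrightarrow{OB}, \overrightarrow{OC}$
be $\mathbf{a}, \mathbf{b}, \mathbf{c}, \mathbf{p}, \mathbf{q}, \mathbf{r}$. Let the angle and
distance between the two opposite edges c and r be called $\omega$ and k,
respectively.
Determination of $\omega$. We have
$$\mathbf{c} + \mathbf{r} = \overrightarrow{AB} + \overrightarrow{OC} = \overrightarrow{AO} + \overrightarrow{OB} + \overrightarrow{OA} + \overrightarrow{AC} = \overrightarrow{OB} + \overrightarrow{AC} = \mathbf{q} - \mathbf{b},$$
and thus
$$(\mathbf{c} + \mathbf{r})^2 = (\mathbf{c} + \mathbf{r}) \cdot (\mathbf{q} - \mathbf{b}) = \mathbf{c} \cdot \mathbf{q} + \mathbf{q} \cdot \mathbf{r} - \mathbf{b} \cdot \mathbf{c} - \mathbf{b} \cdot \mathbf{r}.$$
However, since
$$(\mathbf{c} + \mathbf{r})^2 = c^2 + r^2 + 2\,\mathbf{c} \cdot \mathbf{r} = c^2 + r^2 + 2cr\cos\omega,$$
$$2\,\mathbf{c} \cdot \mathbf{q} = c^2 + q^2 - p^2, \qquad 2\,\mathbf{q} \cdot \mathbf{r} = q^2 + r^2 - a^2,$$
$$2\,\mathbf{b} \cdot \mathbf{c} = a^2 - b^2 - c^2, \qquad 2\,\mathbf{b} \cdot \mathbf{r} = p^2 - b^2 - r^2,$$
the equation obtained is transformed into
$$(3) \qquad 2cr\cos\omega = b^2 + q^2 - a^2 - p^2,$$
so that $\omega$ is determined.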
Calculation of k. Let the volume of the tetrahedron ABCO, which
we can consider as known in accordance with Euler's formula
(No. 68), be called T. We displace the vector $\mathbf{r}$ parallel to itself until
it has a starting point A in common with $\mathbf{c}$; its new end point we will
call Q, so that AQ is equal and parallel to OC. Since the triangles CQA
and COA are halves of the parallelogram COAQ, they are congruent, and
thus the tetrahedrons CQAB and COAB have the same volume (T). If we
now take QAB as the base surface of the tetrahedron CQAB and C as the
apex, the base surface has the area $\tfrac{1}{2} AQ \cdot AB \cdot \sin QAB = \tfrac{1}{2} rc \sin\omega$, and
the altitude (as the distance of the point C from the plane QAB that
contains the edge c and the line AQ that is parallel to the opposite
edge OC) has a length of k. The volume of the tetrahedron is therefore
$\tfrac{1}{3} \cdot \tfrac{1}{2} cr \sin\omega \cdot k$, and we obtain the formula
$$(4) \qquad 6T = kcr\sin\omega.$$
Fig. 82.
Since all the magnitudes in this formula are known with the exception
of k, it gives us the distance between the opposite edges k which we
have been looking for.
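Formulas (3) and (4) can be tried on the tetrahedron of No. 68 with edges a = 11, b = 10, c = 9, p = 8, q = 7, r = 6 and T = 48. The following is a small sketch; the function name `opposite_edges` is an assumption made for illustration.

```python
import math


def opposite_edges(a, b, c, p, q, r, T):
    """Cosine of the angle and the distance between the opposite edges c and r."""
    cos_w = (b * b + q * q - a * a - p * p) / (2 * c * r)  # formula (3)
    sin_w = math.sqrt(1 - cos_w ** 2)
    k = 6 * T / (c * r * sin_w)                             # formula (4)
    return cos_w, k


cos_w, k = opposite_edges(11, 10, 9, 8, 7, 6, 48)
print(cos_w, k)
```

For this tetrahedron the cosine comes out as -1/3 and the distance as 4 times the square root of 2; for a regular tetrahedron with unit edges the same function gives a right angle and the distance 1 over the square root of 2.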
Note. If we keep in mind that $cr\sin\omega$ is the magnitude of the
vector $\mathbf{c} \times \mathbf{r}$ and that the shortest distance $\mathbf{k}$ (conceived of as a vector)
between the edges c and r is parallel to $\mathbf{c} \times \mathbf{r}$, we can write
$$6T = \mathbf{k} \cdot (\mathbf{c} \times \mathbf{r}),$$
and we have the following
Theorem: The mixed product of two opposite edges of a tetrahedron and
the distance between them is equal to six times the volume of the tetrahedron.
A direct consequence of this theorem is the famous
Theorem of Steiner: All tetrahedrons having two opposite edges of
prescribed length lying on two fixed lines have the same volume.
70 The Sphere Circumscribing a Tetrahedron
To determine the radius of the sphere circumscribing a tetrahedron of which
all six edges are given.
One should compare the developments of Legendre in his Éléments
de Géométrie, Note V.
We will first solve the
Preliminary problem: To find the relation between the six major arcs
that connect the four points of a spherical surface.
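We will call the four points 0, 1, 2, 3, the arcs joining them 01, 02,
03, 23, 31, 12, the radii (considered as vectors) running to them
$\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$, and their common magnitude h. Since there is always a
homogeneous linear relation between four vectors of a space, we have
the equation
$$\alpha\mathbf{r}_0 + \beta\mathbf{r}_1 + \gamma\mathbf{r}_2 + \delta\mathbf{r}_3 = 0,$$
in which not all of the coefficients $\alpha, \beta, \gamma, \delta$ vanish simultaneously.
We multiply the relation sequentially in scalar fashion by $\mathbf{r}_0, \mathbf{r}_1, \mathbf{r}_2, \mathbf{r}_3$
and obtain the four equations
$$\mathbf{r}_0\mathbf{r}_0\,\alpha + \mathbf{r}_0\mathbf{r}_1\,\beta + \mathbf{r}_0\mathbf{r}_2\,\gamma + \mathbf{r}_0\mathbf{r}_3\,\delta = 0,$$
$$\mathbf{r}_1\mathbf{r}_0\,\alpha + \mathbf{r}_1\mathbf{r}_1\,\beta + \mathbf{r}_1\mathbf{r}_2\,\gamma + \mathbf{r}_1\mathbf{r}_3\,\delta = 0,$$
$$\mathbf{r}_2\mathbf{r}_0\,\alpha + \mathbf{r}_2\mathbf{r}_1\,\beta + \mathbf{r}_2\mathbf{r}_2\,\gamma + \mathbf{r}_2\mathbf{r}_3\,\delta = 0,$$
$$\mathbf{r}_3\mathbf{r}_0\,\alpha + \mathbf{r}_3\mathbf{r}_1\,\beta + \mathbf{r}_3\mathbf{r}_2\,\gamma + \mathbf{r}_3\mathbf{r}_3\,\delta = 0.$$
However, when four homogeneous linear equations with four
unknowns ($\alpha, \beta, \gamma, \delta$) possess a nontrivial solution, the determinant of
the coefficients of the equations must be equal to zero. Consequently
$$\begin{vmatrix} \mathbf{r}_0\mathbf{r}_0 & \mathbf{r}_0\mathbf{r}_1 & \mathbf{r}_0\mathbf{r}_2 & \mathbf{r}_0\mathbf{r}_3 \\ \mathbf{r}_1\mathbf{r}_0 & \mathbf{r}_1\mathbf{r}_1 & \mathbf{r}_1\mathbf{r}_2 & \mathbf{r}_1\mathbf{r}_3 \\ \mathbf{r}_2\mathbf{r}_0 & \mathbf{r}_2\mathbf{r}_1 & \mathbf{r}_2\mathbf{r}_2 & \mathbf{r}_2\mathbf{r}_3 \\ \mathbf{r}_3\mathbf{r}_0 & \mathbf{r}_3\mathbf{r}_1 & \mathbf{r}_3\mathbf{r}_2 & \mathbf{r}_3\mathbf{r}_3 \end{vmatrix} = 0.$$
Here we replace each product $\mathbf{r}_\mu\mathbf{r}_\nu$ by $h^2 \cos \mu\nu$, eliminate everywhere
the factor $h^2$, and obtain the relation we are looking for
$$(1) \qquad \begin{vmatrix} \cos 00 & \cos 01 & \cos 02 & \cos 03 \\ \cos 10 & \cos 11 & \cos 12 & \cos 13 \\ \cos 20 & \cos 21 & \cos 22 & \cos 23 \\ \cos 30 & \cos 31 & \cos 32 & \cos 33 \end{vmatrix} = 0.$$
(cos 00, cos 11, cos 22, cos 33 are naturally merely symmetrical ways
of writing unity.)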
The solution of the tetrahedron problem is now simple.
In order to maintain agreement with the designations of the
preliminary problem we will call the vertexes of the tetrahedron
0, 1, 2, 3, the radius of the sphere of circumscription h. The edges
01, 02, 03, 23, 31, 12 we will call p, q, r, a, b, c, their squares P, Q, R,
A, B, C, the volume of the tetrahedron T.
We now introduce the four-point relation (1), assign to each
cosine the factor $H = 2h^2$ and replace the new determinant elements
in accordance with the cosine theorem, e.g., H cos 01 by H - P,
H cos 02 by H - Q, H cos 23 by H - A, etc. (naturally H cos 00 and
the other elements of the diagonal will be replaced by H). This
gives us, after we reverse the sign of all the elements,
$$\begin{vmatrix} -H & P-H & Q-H & R-H \\ P-H & -H & C-H & B-H \\ Q-H & C-H & -H & A-H \\ R-H & B-H & A-H & -H \end{vmatrix} = 0.$$
We now line the bottom of this determinant with ones and the right-hand
side with zeros and obtain
$$\begin{vmatrix} -H & P-H & Q-H & R-H & 0 \\ P-H & -H & C-H & B-H & 0 \\ Q-H & C-H & -H & A-H & 0 \\ R-H & B-H & A-H & -H & 0 \\ 1 & 1 & 1 & 1 & 1 \end{vmatrix} = 0.$$
We now add to the first, second, third, and fourth rows H times the
last row; this gives us
$$\begin{vmatrix} 0 & P & Q & R & H \\ P & 0 & C & B & H \\ Q & C & 0 & A & H \\ R & B & A & 0 & H \\ 1 & 1 & 1 & 1 & 1 \end{vmatrix} = 0.$$
If we call the minors of the last column (taken with their proper signs)
$M_1, M_2, M_3, M_4, M_5$ and expand the determinant according to the
elements of the last column, we obtain
$$H(M_1 + M_2 + M_3 + M_4) + M_5 = 0.$$
If we also expand the determinant of equation (II) of No. 68
according to the elements of the last column, that equation assumes the
form
$$M_1 + M_2 + M_3 + M_4 = 288T^2.$$
From the last two equations we obtain
$$288HT^2 = -M_5,$$
where
$$M_5 = \begin{vmatrix} 0 & P & Q & R \\ P & 0 & C & B \\ Q & C & 0 & A \\ R & B & A & 0 \end{vmatrix}.$$
Computation gives
$$-M_5 = 2FG + 2GE + 2EF - E^2 - F^2 - G^2,$$
where E, F, G are the three products AP, BQ, CR. If we replace
A, B, C, P, Q, R once again by $a^2, b^2, c^2, p^2, q^2, r^2$ and designate the
products ap, bq, cr of the opposite edges as e, f, g, the last formula can
be written as
$$-M_5 = 2f^2g^2 + 2g^2e^2 + 2e^2f^2 - e^4 - f^4 - g^4.$$
If we consider e, f, g as sides of a triangle, the right side of this formula
(according to Hero) represents 16 times the square of the area j of
this triangle. Thus the equation found for $H = 2h^2$ is transformed
into
$$576h^2T^2 = 16j^2,$$
and from this we can obtain the simple formula
$$6hT = j$$
for the radius of the sphere of circumscription. Verbally, this can be
stated as follows:
Six times the product of a tetrahedron volume and the radius of its sphere of
circumscription is equal to the area of a triangle whose sides are the products of
the opposite edges of the tetrahedron.
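The rule 6hT = j lends itself to a quick numerical check. The sketch below assumes that the volume T is supplied by the caller (for instance from Euler's formula of No. 68); the function name `circumradius` is illustrative only.

```python
import math


def circumradius(a, b, c, p, q, r, T):
    """Radius h of the circumscribed sphere from 6 h T = j."""
    e, f, g = a * p, b * q, c * r          # products of opposite edges
    sixteen_j2 = (2 * f * f * g * g + 2 * g * g * e * e + 2 * e * e * f * f
                  - e ** 4 - f ** 4 - g ** 4)  # Hero: 16 j^2
    j = math.sqrt(sixteen_j2) / 4          # area of the triangle e, f, g
    return j / (6 * T)


# Regular tetrahedron with unit edges: T = sqrt(2)/12, h = sqrt(6)/4.
print(circumradius(1, 1, 1, 1, 1, 1, math.sqrt(2) / 12), math.sqrt(6) / 4)
```

For the tetrahedron with edges 11, 10, 9, 8, 7, 6 and T = 48 the same function gives a radius of about 6.56.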
Note. The question of the radius $\rho$ of the sphere inscribed in a
tetrahedron is much simpler. The lines joining the center Z of the
inscribed sphere and the vertexes of the four triangles
bounding the tetrahedron divide the tetrahedron into four pyramids with
the common apex Z and the volumes $\tfrac{1}{3}\rho\,\mathrm{I}$, $\tfrac{1}{3}\rho\,\mathrm{II}$, $\tfrac{1}{3}\rho\,\mathrm{III}$, $\tfrac{1}{3}\rho\,\mathrm{IV}$, where
I, II, III, IV are the areas of the bounding triangles. We thus
obtain the formula
$$T = \tfrac{1}{3}\rho(\mathrm{I} + \mathrm{II} + \mathrm{III} + \mathrm{IV}).$$
This equation represents $\rho$ as a function of the tetrahedron edges,
since I, II, III, IV, and T are known functions of the edges.
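The formula for the inscribed sphere can be sketched in the same spirit. The face areas are computed here with Hero's formula, which is an assumption of this sketch (the text only says they are known functions of the edges); the edge naming BC = a, CA = b, AB = c, OA = p, OB = q, OC = r follows No. 69, and the function names are illustrative.

```python
import math


def hero(x, y, z):
    """Area of a plane triangle with sides x, y, z."""
    s = (x + y + z) / 2
    return math.sqrt(s * (s - x) * (s - y) * (s - z))


def inradius(a, b, c, p, q, r, T):
    """Radius of the inscribed sphere from T = (1/3) * rho * (I + II + III + IV)."""
    faces = hero(a, b, c) + hero(q, r, a) + hero(r, p, b) + hero(p, q, c)
    return 3 * T / faces


# Regular tetrahedron with unit edges: the radius is sqrt(6)/12.
print(inradius(1, 1, 1, 1, 1, 1, math.sqrt(2) / 12), math.sqrt(6) / 12)
```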
71 The Five Regular Solids
To divide the surface of a sphere into congruent regular spherical polygons.
Solution. We will call the required division "regular" and we
will first answer the question concerning the maximum possible
number of regular divisions.
We will assume that the sphere is covered completely and without
any gaps by z regular n-gons and that at every corner of such an n-gon
v sides come together. We divide each n-gon by means of the
spherical radii running from the center to the vertexes into n isosceles
triangles. Each of these triangles possesses the central angle $2\pi/n$
and the base angle $\pi/v$ (since at each vertex 2v such base angles come
together), and thus the spherical excess of each is
$$\varepsilon = \frac{2\pi}{n} + \frac{2\pi}{v} - \pi = \pi\left(\frac{2}{n} + \frac{2}{v} - 1\right).$$
Now, the area of such a triangle, when r is the spherical radius, is
$r^2\varepsilon$; the area of an n-gon is thus $nr^2\varepsilon$ and the area of the spherical
surface consisting of z such n-gons is $znr^2\varepsilon$. Accordingly, we obtain
the equation
$$znr^2\varepsilon = 4\pi r^2$$
or
$$\varepsilon = \frac{4\pi}{zn}$$
or
$$\frac{2}{n} + \frac{2}{v} = 1 + \frac{4}{zn}.$$
Since the left side of this equation is > 1 and at the same time n as well
as v must be > 2, we obtain the following five possibilities for n, v,
and z:
n    3    3    3    4    5
v    3    4    5    3    3
z    4    8   20    6   12
Thus, there are only five possible regular divisions of a spherical
surface: by dividing the surface with
1. four regular triangles,
2. six regular tetragons,
3. eight regular triangles,
4. twenty regular triangles,
5. twelve regular pentagons.
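The enumeration above can be reproduced by a brute-force search. This is a sketch only; the search bound `limit` and the function name are assumptions, and the bound is harmless because the excess is positive only for small n and v.

```python
from fractions import Fraction


def regular_divisions(limit=50):
    """All (n, v, z) with n, v > 2 and positive spherical excess."""
    found = []
    for n in range(3, limit):
        for v in range(3, limit):
            excess = Fraction(2, n) + Fraction(2, v) - 1  # epsilon / pi
            if excess > 0:
                z = Fraction(4) / (n * excess)            # from z n epsilon = 4 pi
                if z.denominator == 1:
                    found.append((n, v, int(z)))
    return found


print(regular_divisions())
# [(3, 3, 4), (3, 4, 8), (3, 5, 20), (4, 3, 6), (5, 3, 12)]
```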
If we connect every two adjacent corners of such a spherical n-gon
by means of a line segment, we obtain a regular plane n-gon bounded
by the n line segments that connect the corners. If we construct this
plane n-gon for each of the z spherical n-gons, we obtain a regular
polyhedron bounded by z regular n-gons, or a so-called regular solid.
There are accordingly only five regular solids, namely, the regular
tetrahedron, hexahedron (the cube), octahedron, icosahedron, and dodecahedron.
In the following we will actually carry out the five regular divisions
of the spherical surface, which we had initially only shown to be
possible. For convenience in viewing the sphere we will imagine it
as a globe with a north pole N and a south pole S and with meridians
and latitudinal circles.
I. The tetrahedron (n = 3, v = 3, z = 4). On the three meridians
0°, 120°, 240° we lay off from N the three equal arcs NA, NB, NC
such that the triangles NBC, NCA, NAB are equilateral. The three
arcs BC, CA, AB enclosing the south pole then also form an equilateral
triangle that is congruent to the designated triangles, and the spherical
surface has been divided into the four regular triangles NBC, NCA,
NAB, ABC.
II. The hexahedron (n = 4, v = 3, z = 6). On the four meridians
0°, 90°, 180°, 270° we lay off from N and S the eight equal arcs
NA, NB, NC, ND and SC', SD', SA', SB' (each one equal to h) such
that each of the arcs AC', BD', CA', DB' is equal to AB (= 2k). k is
obtained from the spherical triangle NAB by means of the equation
$$\cos 2k = \cos h \cos h.$$
Since on the one hand 2h + 2k = NA + SC' + AC' = NS = 180° or
h + k = 90°, and thus cos h = sin k, and on the other hand
$\cos 2k = 1 - 2\sin^2 k$, we obtain
$$1 - 2\sin^2 k = \sin^2 k$$
and consequently
$$\sin k = \sqrt{\tfrac{1}{3}}, \qquad \cos 2k = \tfrac{1}{3}, \qquad \cos h = \sqrt{\tfrac{1}{3}}.$$
The corners A, B, C, D, A', B', C', D' defined by these conditions are
the eight corners of the cube.
III. The octahedron (n = 3, v = 4, z = 8). The corners of the
octahedron are the points N, S and four equator points separated
from each other by 90°.
IV. The icosahedron (n = 3, v = 5, z = 20). We choose ten
meridians 36° apart and call them 1, 2, 3, ..., 10. On the meridians
1, 3, 5, 7, 9 we lay off from N the equal arcs NA, NB, NC, ND, NE,
and on the meridians 6, 8, 10, 2, 4 we lay off from S the equal arcs
SA', SB', SC', SD', SE' such that the ten triangles NAB, NBC, NCD,
NDE, NEA, SA'B', SB'C', SC'D', SD'E', SE'A' are equilateral. The
common length 2k of the marked-off arcs can be obtained, for
example, from one of the right triangles NBO, NCO, into which the
meridian 4 divides the equilateral triangle NBC. Since ∠BNO =
36°, ∠OBN = 72°, it follows from triangle NBO that
$$\cos BO = \cos k = \frac{\cos 36°}{\sin 72°} = \frac{1}{2\sin 36°},$$
and from this that 2k = 63°26'.
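The value of 2k is easy to confirm numerically. The following sketch evaluates the cosine relation above and converts the result to degrees and minutes; the function name is illustrative.

```python
import math


def icosahedron_arc_deg():
    """Arc 2k (in degrees) from cos k = cos 36 deg / sin 72 deg."""
    cos_k = math.cos(math.radians(36)) / math.sin(math.radians(72))
    return 2 * math.degrees(math.acos(cos_k))


arc = icosahedron_arc_deg()
degrees = int(arc)
minutes = round((arc - degrees) * 60)
print(degrees, minutes)  # 63 26
```

The same arc is the angle whose cosine is 1 over the square root of 5, the angle subtended at the center by an edge of the regular icosahedron.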
If we extend NO by its own length to H, we obtain the isosceles
triangle NBH with the base NH = 2h and the legs BN = BH = 2k,
the base angle 36°, and the apex angle HBN = 144°. Since these
angles have the same sine, the sines of their opposite sides NH and NB
are equal according to the sine theorem. But since these opposite
sides (2h and 2k) are not equal, 2h must be the supplement of 2k.
And since NE' is also the supplement of 2k (= SE'), then necessarily
NE' = 2h = NH.
Accordingly, point H coincides with E' and E'B is equal to 2k, i.e.,
equal to NB. In similar fashion each of the arcs AD', D'B, E'C, CA',
A'D, DB', B'E, EC', C'A is equal to 2k, and the ten "encircling"
triangles ABD', D'E'B, BCE', E'A'C, CDA', A'B'D, DEB', B'C'E,
EAC', C'D'A are likewise equilateral triangles and also congruent to
the ten equilateral triangles above.
The 12 points N, S, A, B, C, D, E, A', B', C', D', E' are thus the
vertexes of 20 equilateral triangles that completely cover the sphere;
they are the 12 corners of the regular icosahedron.
V. The dodecahedron (n = 5, v = 3, z = 12). As in the icosahedron,
we begin the construction of the dodecahedron by laying off a system
of ten meridians 1, 2, 3,..., 10 that are 36° apart. About N as a
common apex we group five congruent isosceles triangles NAB, NBC,
NCD, NDE, NEA with the apex angle 72° and the base angle 60°
(= 180°/v) whose base vertexes A, B, C, D, E lie on the meridians
1, 3, 5, 7, 9. Thus we obtain the regular pentagon ABCDE. In the
same way we draw about S as a common center point the regular
pentagon A'B'C'D'E' whose vertexes A', B', C', D', E' lie on the
meridians 6, 8, 10, 2, 4.
Fig. 84.
If O and O' represent the base midpoints of the isosceles triangles
ABN and D'E'S, then NAO and SD'O' are right triangles with the
angles 60° and 36°.
Our construction is now based on the theorem (proved below):
"The perimeter of a spherical right triangle with angles of 60° and 36° is 90°."
If we designate the hypotenuse, the long leg, and the short leg of such
a triangle as l, h, and k, then
(1) l + h + k = 90°.
If we remember that
NA = SD' = l, NO = SO' = h, AO = D'O' = k,
we see that 2k is the side, l the radius of the circumscribed circle
(on the sphere), h the radius of the inscribed circle, and s = l + h the
altitude of the pentagon ABCDE or A'B'C'D'E'.
We now mark off on the meridians 1, 3, 5, 7, 9 from A, B, C, D, E
southwards and on the meridians 6, 8, 10, 2, 4 from A', B', C', D', E'
northwards the pentagon side 2k, which gives us the points F, G, H,
K, L, F', G', H', K', L' (named in the order of the meridians
1, 2, ..., 10 on which they lie).
Now since, according to (1), each meridian consists of the four
segments l, 2k, s, and h, it follows that OG and O'H, for example,
represent the pentagon altitude s; i.e., the pentagons ABHGF and
D'E'KHG are congruent to the regular pentagon ABCDE. The same
is naturally true of the pentagons BCLKH, CDG'F'L, DEK'H'G',
EAFL'K', E'A'F'LK, A'B'H'G'F', B'C'L'K'H', C'D'GFL'.
With the 12 regular pentagons already designated the sphere is
completely covered.
The points A, B, C, D, E, F, G, H, K, L, A', B', C', D', E', F', G',
H', K', L' are accordingly the 20 corners of the regular dodecahedron.
Supplement: Proof of the theorem: "The perimeter of a spherical
right triangle with the angles 60° and 36° is 90°."
Let the sides of the triangle be a, b, c, their opposite angles $\alpha = 60°$,
$\beta = 36°$, $\gamma = 90°$. We express the tangents of the sides by the regular
decagon side z = 2 sin 18° corresponding to the unit circle, for which
it is known that $z^2 + z = 1$.
1. Firstly,
$$\cos\beta = 1 - 2\sin^2 18° = 1 - \tfrac{1}{2}z^2 = \frac{2 - z^2}{2} = \frac{1 + z}{2} = \frac{1}{2z}$$
or
$$\sec\beta = 2z.$$
2. From $\sec c = \tan\alpha \tan\beta$ it follows that $\sec^2 c = 3\tan^2\beta$ or
$\tan^2 c + 1 = 3(\sec^2\beta - 1)$ or $\tan^2 c = 4(3z^2 - 1)$. However,
$3z^2 - 1 = z^2 + (2z^2 - 1) = z^2 + (1 - 2z) = [1 - z]^2 = z^4$, and
thus
$$\tan c = 2z^2.$$
3. $\tan a = \tan c \cos\beta = 2z^2 \cdot \frac{1}{2z} = z$.
4. $\tan b = \tan c \cos\alpha = 2z^2 \cdot \tfrac{1}{2} = z^2$.
Now we have
$$\tan c \cdot \tan(a + b) = 2z^2 \cdot \frac{z + z^2}{1 - z^3} = \frac{2z^2}{1 - z^3} = \frac{2z^2}{(1 - z)[1 + z + z^2]} = \frac{2z^2}{(z^2)[1 + 1]} = 1.$$
Consequently, a + b is the complement of c. Q.E.D.
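The result of the supplement can also be confirmed numerically. The sketch below takes the three tangents z, z^2, and 2z^2 derived in steps 2 to 4 and adds up the sides; the function name `triangle_sides` is an assumption made for illustration.

```python
import math


def triangle_sides():
    """Sides a, b, c (in radians) of the right triangle with angles 60 and 36 degrees."""
    z = 2 * math.sin(math.radians(18))  # decagon side, z^2 + z = 1
    a = math.atan(z)                    # leg opposite the 60 degree angle
    b = math.atan(z * z)                # leg opposite the 36 degree angle
    c = math.atan(2 * z * z)            # hypotenuse
    return a, b, c


a, b, c = triangle_sides()
print(math.degrees(a), math.degrees(b), math.degrees(c))
print(math.degrees(a + b + c))  # 90 up to rounding
```

The three sides come out at roughly 31.7°, 20.9°, and 37.4°, which are the values h, k, and l used in the construction of the dodecahedron.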
The regular solids were already known to the Pythagoreans and
thus go back to the sixth century B.C. The proof that there are only
five regular solids probably stems from Euclid (ca. 330-275 B.C.).
72 The Square as an Image of a Quadrilateral
To show that every quadrilateral can be considered as a perspective image of a
square.
The perspective projection, perspectivity or central projection, the
simplest and most important of all projections, can be explained as
follows. Given are a fixed point Z, the center of projection, and a fixed
plane E, the plane of the image. The perspective image or, more briefly,
the perspective of an arbitrary point P0 is understood to mean the
point of intersection P of the "projection ray" ZP0 with the plane of
the image. P0 is the "object," P the "image." The image of a
figure is the totality of the images of the points of which the figure
(the object) consists. Thus, the perspective of a straight line g0 is a
straight line g, namely the intersection of the plane Zg0 with the plane
of the image.
Of particular importance is the perspective projection in which only
points of a plane E0, the object plane, are projected onto the image
plane. The line of intersection a of the object plane and the image
plane is called the axis of perspectivity. The axis of perspectivity is the
locus of the object points that coincide with their images.
An arbitrary object line and its image accordingly intersect at the axis.
A noteworthy role in this perspectivity is played by the infinitely
distant points of the object plane. Since the projection rays to the
infinitely distant points of E0 run parallel to E0, they lie in a plane A
passing through Z and parallel to E0 and consequently meet the image
plane at the line of intersection f of this plane with A. This line of
intersection is called the vanishing line of the object plane E0. The
vanishing line is parallel to the axis of perspectivity.
In order to avoid limiting the general validity of the above theorem,
"The perspective of a line is also a line," by a special case, we call the
totality of infinitely distant points of E0 the "infinitely distant line"
of this plane and can then state briefly that:
The perspective of the infinitely distant line of a plane is the vanishing line of
this plane.
The place at which the image g of an arbitrary line g0 of E0
intersects the vanishing line f and which is the image of the infinitely
distant point of g0 is called the vanishing point of g0.
Now for the solution of our problem!
Fig. 85.
Let the quadrilateral ABCD in the drawing plane E be the given
quadrilateral, let O be the point of intersection of the diagonals AC
and BD, P the point of intersection of the opposite sides AB and CD,
Q the point of intersection of the opposite sides BC and DA. Let
the square we are looking for be called accordingly A0B0C0D0, the
point of intersection of its diagonals O0, its plane E0. Since the
points of intersection P0 and Q0 of the two pairs of opposite sides lie
on the infinitely distant line of E0, their images P and Q must lie on the
vanishing line f of the perspectivity passing from E0 to E. We
accordingly choose the line PQ as the vanishing line f. It makes no
difference which parallel to f we choose as the axis of perspectivity a.
We choose the parallel through A. The points of intersection of the
axis with the lines CD, BC, OP, OQ, and BD we designate as H, K, M,
N, and S. Since each object line meets the corresponding image line
at the axis, these points may also be called H0, K0, M0, N0, S0.
In the quadrilateral ABCD the opposite sides PBA and PCD and the
diagonals PO and PQ form a harmonic ray pencil. Since the ray PQ
runs parallel to the line a, the segments MA and MH are of equal
length.
In the quadrilateral ABCD the opposite sides QCB and QDA and
the diagonals QO and QP also form a harmonic ray pencil. Since
QP || a, the segments NA and NK are also equally long.
Since the diagonals of the sought-for square must meet the diagonals
of the given quadrilateral at the axis, the diagonals of the square must
pass through A and S. Since the diagonals of a square are perpendicular
to each other, their point of intersection O0 accordingly lies on the
semicircle with the diameter AS belonging to the plane E0.
Since the midlines M0O0 and N0O0 of the square, which are likewise
perpendicular to each other, pass through O0, O0 also lies on the
semicircle with the diameter MN in the plane E0.
The point of intersection of the two semicircles is the center point
O0 of the square.
The sides A0B0 and C0D0 of the square are the parallels through A
and H to MO0, the sides B0C0 and D0A0 of the square are the parallels
through K and A to NO0.
For convenience we execute the drawing (cf. Figure 85) in the
drawing plane itself. Then, in order to obtain the spatial perspectivity
we are looking for, we rotate the square about the axis a as
an axis of rotation into a new plane E0, draw through f the plane A
parallel to E0, join the point of intersection of the diagonals, O0, now
lying in E0, with O, and designate the point of intersection of this
connecting line with A as Z.
If we now project the square A0B0C0D0 lying in E0 from the center Z
onto E, we thereby obtain as a perspective image the given
quadrilateral ABCD.
73 The Pohlke-Schwarz Theorem
Four arbitrary points of a plane that do not all lie on the same line can be
considered as an oblique image of the corners of a tetrahedron that is similar to a
given tetrahedron.
This fundamental theorem of oblique parallel projection, proved by
H. A. Schwarz (1843-1921) in 1864 (Crelle's Journal, vol. 63; also,
Schwarz, Gesammelte Abhandlungen), includes as a special case the
theorem formulated in 1853 by K. Pohlke (1810-1876):
The fundamental theorem of oblique axonometry: Three
arbitrary segments originating from a single point in a plane that do not all
belong to the same line can be considered as the oblique image of a tripod.
Before taking up the proof of this theorem we shall make several
prefatory remarks about oblique projection, affinity, and axonometry.
An oblique projection is a projection of a plane or three-dimensional
figure, an object figure, onto the drawing plane or image plane in
which each object point is projected onto the image plane by a
"projection" ray drawn in a fixed direction. If the projection rays
are perpendicular to the image plane, the oblique projection is called
a normal or orthogonal projection.
The oblique projection of points of a plane (the object plane) onto
the image plane is a so-called affinity.
An affinity or affine projection is understood to mean a projection of an
object plane onto the picture plane (which may also lie in the object
plane) in which the points of the object plane are transformed into
points of the image plane in such manner that they exhibit the
following fundamental properties:
I. The affine image of a line is also a line.
II. Parallelism is not annulled by affine projection. (The image of a
parallelogram is a parallelogram.)
III. The ratio of parallel segments is not altered by affine projection. In
other words: Parallel segments are projected in the same proportion. (This
third property is a consequence of I. and II.)
It is therefore immediately evident that the oblique projection of a
plane onto a second plane possesses these three fundamental properties.
The most general affinity between two arbitrary planes E and E' is
determined by the mutual correspondence between two arbitrary
triangles ABC and A'B'C' of these planes, where A', B', C' are
determined as the affine images of A, B, C, respectively. The affine image
P' of an arbitrary object point P (of E) is drawn by letting AP
intersect with the side BC at H, then (according to III.) determining the
affine image H' of H on the line B'C' by means of the condition
B'H' : C'H' = BH : CH, and finally determining P' on A'H' by means
of the condition A'P' : H'P' = AP : HP.
A frequently employed method of drawing the oblique projection
of a three-dimensional figure is the axonometric method. In this
method the points P of the three-dimensional figure are
determined by their coordinates x|y|z most commonly in a rectangular
coordinate system. Three equal segments OA, OB, and OC are laid
off from the origin O on the axes; these segments form a so-called
tripod. The oblique outline O'A'B'C' of the tripod is drawn, and this
also gives us the oblique images of the coordinate axes. We then
construct, in accordance with III., the oblique image of the point P,
which in this context is called the axonometric image.
It is now of fundamental importance to know whether three
arbitrary segments O'A', O'B', O'C' originating from a point O' of
the drawing plane can be considered as the oblique projection
of a tripod OABC. This question was answered by Pohlke and,
in a somewhat more general fashion, by Schwarz, as mentioned
above.
Of the numerous proofs of the Pohlke-Schwarz fundamental
theorem the following (stemming from Schwarz) is quite elementary.
It is based upon the theorem of Lhuilier, which is in itself very
interesting: The sections of an arbitrary three-edged prism include all the
possible forms of triangles. In other words: Every triangle can be considered
as the normal projection of a triangle of given form. This theorem was
stated in 1811 by the French-Swiss mathematician Simon Lhuilier
(1750-1840).
Proof. Since parallel sections of a prism are congruent, we can
assume that the prescribed triangle A0B0C0, which is also the cross
section of the prism, and the sought-for prism section ABC, which
possesses a prescribed form, have a common vertex, C = C0. If we now
drop the perpendiculars A0X and B0Y from A0 and B0 to the
intersection line (axis) g of the two planes E0 of A0B0C0 and E of ABC and
rotate the plane E about g as the rotation axis into the plane E0, then
A and B, as the figure shows, fall on the perpendiculars A0X and B0Y,
respectively, and the point of intersection S = S0 of the lines A0B0 and
AB falls on the axis.
We now draw the perpendicular to the axis through C and let it
meet A0B0 at T0 and AB at T. If we designate the cosine of the
angle formed by the plane E in its original position with E0 as μ,
then A0X = μ·AX, B0Y = μ·BY, T0C = μ·TC.
Now according to the ray theorem,
SA : AT : TB = S0A0 : A0T0 : T0B0.
We can therefore draw a parallel S1A1T1B1 to SATB that cuts the
lines g, CA, CT, CB at S1, A1, T1, B1, and is congruent to S0A0T0B0
(so that S1A1 = S0A0, A1T1 = A0T0, T1B1 = T0B0). We displace
the triangle S1B1C in such a way that S1 falls on S0, A1 on A0, T1 on
T0, B1 on B0. The vertex C then falls on a point V of the semicircle
described about the diameter S0T0 (since S1CT1 is a right
triangle), on which C lies, also.
From this fact we obtain the following simple method for constructing
the described figure when the triangle A0B0C0 and the form of the
triangle ABC are given.
We draw over A0B0 the triangle A0B0V that is similar to the triangle
ABC (with A0, B0, V being homologous to A, B, C, respectively).
We let the perpendicular bisector of CV intersect with A0B0 at M and
draw the semicircle with the center M and the radius MC = MV.
The end points S0 and T0 of the semicircle, which lie on the line
A0B0, we designate in such manner that S0V and T0C become sides
(not diagonals) of the chord quadrilateral S0T0CV. We then choose
CS0 as the axis and CT0 as the perpendicular to the axis. On the axis we
make CS1 = VS0, on the perpendicular to the axis CT1 = VT0, and
we draw the line S1A1T1B1 congruent to S0A0T0B0. Finally, we draw
parallel to S1A1T1B1 the line SATB of which S, A, T, B lie on the
perpendiculars through S0, A0, T0, B0, respectively, while at the same time
A lies on CA1 and B lies on CB1.
If we rotate the triangle ABC about CS as the axis of rotation by the
angle whose cosine is μ = C0T0/CT, A0B0C0
then appears as the normal projection of the rotated triangle ABC,
which possesses the prescribed form.
That the ratio μ = C0T0/CT can be considered as a cosine, i.e.,
is a proper fraction, is shown as follows. According to the ray
theorem, CT = CT1·(CS/CS1), i.e., according to the construction,
CT = VT0·CS/VS. If we introduce this value into the equation for μ, we
obtain
μ = CT0/CT = (CT0·VS)/(CS·VT0).
However, since, according to the theorem of Ptolemy, in the chord
quadrilateral ST0CV the product CT0·VS of the opposite sides is
smaller than the product CS·VT0 of the diagonals, μ represents a
proper fraction.
This proves the auxiliary theorem concerning the prism.
The proof of the Pohlke-Schwarz theorem is now easy. We can
state the theorem in the following manner:
The oblique image of a given tetrahedron can always be determined in such
manner that it is similar to a given quadrilateral.
Let the tetrahedron be ABCS, the quadrilateral A'B'C'D'.
In the affinity between the planes ABC and A'B'C', in which
A', B', C' are correlated to the points A, B, and C, respectively, let
the point D correspond to the point D'. We select SD as the direction
of the affinity (projection ray).
We construct the triangular prism whose edges are parallel to SD
through A, B, and C, and determine the section A"B"C" that is
similar to A'B'C' (which is possible according to Lhuilier's theorem).
In the affinity in which the points A", B", C" are correlated to the
points A', B', C', let the point D" correspond to the point D'. Then
A"B"C"D" is similar to A'B'C'D'. Now, since A"B"C"D" and also
ABCD are affine with respect to A'B'C'D', then A"B"C"D" is also
affine to ABCD.
The latter affinity, however, arises from the projection rays parallel
to SD. In this affinity the quadrilateral A"B"C"D" that is similar to
A'B'C'D' is thus the oblique image of the given tetrahedron ABCS.
74 Gauss' Fundamental Theorem of Axonometry
Although three segments OA, OB, OC originating from a point O in
the drawing plane (image plane) that do not all lie on the same
straight line can always, according to Pohlke's fundamental theorem
(No. 73), be considered as an oblique projection of a tripod, this is
no longer the case for the normal projection of a tripod.
Moreover, there exists between the lengths and directions of the
normal projections OA, OB, OC of the three legs a definite
relationship. Thus we come to
Gauss' problem: What is the relation between the normal projections OA,
OB, OC of the legs of a tripod?
Solution. We select the image plane E as the xy-plane, the
perpendicular to this plane from the apex of the tripod as the z-axis
of a triaxial orthogonal coordinate system; we take the common
length of the three legs as the unit length and call the direction
cosines of the legs $\lambda|\lambda'|\lambda''$, $\mu|\mu'|\mu''$, and $\nu|\nu'|\nu''$. At the same time we
take the xy-plane as the Gauss plane (the plane of complex numbers)
and designate the complex number represented by any point (P)
of E by the corresponding small gothic letter ($\mathfrak{p}$).
Since the three points A, B, C in E have the coordinates $\lambda|\lambda'$, $\mu|\mu'$,
$\nu|\nu'$,
$$\mathfrak{a} = \lambda + i\lambda', \qquad \mathfrak{b} = \mu + i\mu', \qquad \mathfrak{c} = \nu + i\nu'.$$
Squaring and adding, we obtain
$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = (\lambda^2 + \mu^2 + \nu^2) - [\lambda'^2 + \mu'^2 + \nu'^2] + 2i\{\lambda\lambda' + \mu\mu' + \nu\nu'\}.$$
According to the well-known relations between the direction cosines
of three mutually perpendicular lines, the expression within
parentheses and the expression within brackets both equal one, while the
expression within the braces is equal to zero. This gives us the Gauss
equation
$$\mathfrak{a}^2 + \mathfrak{b}^2 + \mathfrak{c}^2 = 0.$$
This formula forms
Gauss' fundamental theorem of normal axonometry: If in the
normal projection of a tripod the image plane is considered as the plane of
complex numbers, the projection of the apex of the tripod as the null point, and
the projections of the leg ends as complex numbers of the plane, the quadratic
sum of these numbers is equal to zero.
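The theorem can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `normalize`, `cross`, `tripod` and `gauss_sum` are illustrative assumptions, and the tripod is built from two arbitrary seed vectors by orthogonalization. The apex is taken as the origin and the image plane as the xy-plane, as in the solution above.

```python
import math

def normalize(v):
    """Scale a 3-vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def tripod(p, q):
    """Three mutually perpendicular unit legs built from two seed vectors."""
    u = normalize(p)
    dot = sum(a * b for a, b in zip(q, u))
    # Remove from q its component along u, then normalize (Gram-Schmidt step).
    v = normalize(tuple(a - dot * b for a, b in zip(q, u)))
    return u, v, cross(u, v)

def gauss_sum(legs):
    """Sum of the squares of the complex numbers x + iy of the projected leg ends."""
    return sum(complex(x, y) ** 2 for x, y, _z in legs)

legs = tripod((1.0, 2.0, 3.0), (-2.0, 0.5, 1.0))
print(abs(gauss_sum(legs)))  # should vanish up to rounding error
```

Dropping the z-coordinate is the normal projection onto the image plane, so the printed modulus is the size of the quadratic sum of the three projected leg ends.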
The Gauss theorem immediately provides the solution of the
Fundamental problem of normal axonometry: To complete the
normal projection OABC of a tripod of which the normal projections OA and
OB of two of the legs are already drawn.
Solution. We select (as above) the point O as the null point of
the complex number plane and the direction of OA as the direction
of the positive real number axis. The magnitudes of the three
numbers $\mathfrak{a}, \mathfrak{b}, \mathfrak{c}$ we will designate as $a, b, c$, and the three angles
BOC, COA, AOB as $\alpha, \beta, \gamma$.
We write the Gauss equation
$$\mathfrak{a} + \frac{\mathfrak{b}^2}{\mathfrak{a}} = -\frac{\mathfrak{c}^2}{\mathfrak{a}}.$$
In order to construct $\mathfrak{p} = \mathfrak{b}^2/\mathfrak{a}$ we lay off at O on OB the angle $\gamma$,
at B on BO the angle OAB; the point of intersection P of the free legs
of the angles drawn gives us $\mathfrak{p}$. We then draw through A the parallel
to OP, through P the parallel to OA and obtain at the point of
intersection Q of the two parallels the complex number $\mathfrak{q} = \mathfrak{a} + (\mathfrak{b}^2/\mathfrak{a})$.
Consequently, the end point R of the extension of QO by itself is the
number $\mathfrak{r} = \mathfrak{c}^2/\mathfrak{a}$. From $\mathfrak{c} = \sqrt{\mathfrak{a}\mathfrak{r}}$ it follows that:
1. The magnitude of $\mathfrak{c}$ is the mean proportional of the magnitudes
of $\mathfrak{a}$ and $\mathfrak{r}$;
2. the direction of $\mathfrak{c}$ is the direction of the bisector of the angle ($2\beta$)
enclosed between OA and OR.
Accordingly, we bisect the angle AOR and mark off on the bisector
from O the mean proportional of OA and OR; the end point of the
marked-off segment is the sought-for point C. Since we can choose
the bisector of the concave angle AOR as well as that of the convex
angle (in accordance with the two values of $\sqrt{\mathfrak{a}\mathfrak{r}}$), there are two
possible positions for C.
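The construction translates directly into complex arithmetic. The sketch below is illustrative only: the function name `complete_tripod` and the sample values of the two given projections are assumptions, and the steps mirror the points P, Q, R of the construction rather than any code from the source.

```python
import cmath

def complete_tripod(a, b):
    """Return both candidates for the third projected leg end c.

    a, b are the complex numbers of the two drawn leg ends (a must be non-zero).
    """
    p = b * b / a          # point P of the construction
    q = a + p              # point Q, fourth vertex of the parallelogram
    r = -q                 # point R, the reflection of Q in O
    c = cmath.sqrt(a * r)  # modulus: mean proportional; direction: bisector of AOR
    return c, -c

a, b = 1.0 + 0.0j, 0.3 + 0.8j   # hypothetical sample projections
c1, c2 = complete_tripod(a, b)
print(c1, c2)
print(abs(a * a + b * b + c1 * c1))  # Gauss equation, zero up to rounding
```

The two returned values correspond to the two bisectors of the angle AOR mentioned in the text.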
Note. Weisbach's theorem. Since the square of a complex
number has an angle twice as great as the number itself, the vectors
of the squares of two complex numbers form with each other an angle
that is twice as great as the vectors of the numbers. Thus the vectors
of the squares $\mathfrak{a}^2, \mathfrak{b}^2, \mathfrak{c}^2$ form the angles $2\alpha, 2\beta, 2\gamma$ with each other.
Thus, if we join these vectors end to end (by magnitude and direction), we
obtain (in accordance with the Gauss formula) a triangle with the
external angles $2\alpha, 2\beta, 2\gamma$. Since the sides of this triangle are $a^2, b^2, c^2$,
the sine theorem gives us the equation
$$a^2 : b^2 : c^2 = \sin 2\alpha : \sin 2\beta : \sin 2\gamma.$$
This formula is
Weisbach's theorem: The squares of the normal projections of the legs of
a tripod relate to each other as the sines of twice the angles enclosed by the
projections.
Thus, Weisbach's theorem appears as the direct consequence of the
Gauss theorem.
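A short numerical check of Weisbach's proportion, as a sketch under stated assumptions: the name `weisbach_ratios` and the sample projections are hypothetical, the third leg is obtained from the Gauss equation, and the angles are taken as oriented angles between the projected legs (so that they add up to a full turn).

```python
import cmath
import math

def weisbach_ratios(a, b, c):
    """Return a^2/sin(2*alpha), b^2/sin(2*beta), c^2/sin(2*gamma) for projected legs."""
    alpha = cmath.phase(c / b)  # oriented angle BOC
    beta = cmath.phase(a / c)   # oriented angle COA
    gamma = cmath.phase(b / a)  # oriented angle AOB
    return (abs(a) ** 2 / math.sin(2 * alpha),
            abs(b) ** 2 / math.sin(2 * beta),
            abs(c) ** 2 / math.sin(2 * gamma))

a, b = 1.0 + 0.0j, 0.3 + 0.8j       # hypothetical sample projections
c = cmath.sqrt(-(a * a + b * b))    # third leg from the Gauss equation
ratios = weisbach_ratios(a, b, c)
print(ratios)  # the three quotients should agree
```

The check fails by division by zero only in degenerate positions where one of the doubled angles is a multiple of 180 degrees.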
The Gauss theorem can be found unproved in the second volume of
Gauss' Werke, the Weisbach theorem in Weisbach's paper on axonometry,
which was published in 1844 at Tübingen in the Polytechnische
Mitteilungen of Volz and Karmarsch.
75 Hipparchus' Stereographic Projection
To present a conformal map projection that transforms the circles of the globe
into circles of the map.
The projection we are looking for, which is called a stereographic or
polar projection, is very important in cartography. In all probability
the source of this problem is the astronomer Hipparchus (of Nicaea
in Bithynia), one of the most amazing men of antiquity, who was
making astronomical observations in the period from 160-125 B.C. in
Rhodes, Alexandria, Syracuse, and Babylon.
The problem is solved by the following projection directive:
One selects as the projection plane or image plane (map plane) the
plane E tangent to the globe at an appropriate point O (the so-called
map center) of the area to be projected, and as the center of a central
projection the end point Z of the globe diameter OZ originating at O.
The stereographic image P' of an arbitrary point P of the globe is
the point of intersection of the projection ray ZP with the image
plane E.
Fig. 88.
The distance r = OP' from the map center is given by the equation
$$r = 2\tan\zeta,$$
where $\zeta$ represents the angle formed by the projection ray ZP with
the center ray ZO, and the radius of the globe is chosen as the unit
length.
The stereographic projection thus defined has the following two
properties:
I. Every image circle of a globe circle is a circle.
II. The stereographic map is conformal. (I.e., the map image of an
angle located on the globe is an equally great angle.)
The proofs of these properties are both based on the following
auxiliary theorem:
The image of a globe tangent bounded by globe and map is just as long as the
tangent.
Fig. 89.
Proof of the auxiliary theorem. Let P be a point on the globe,
P' its image, M the place at which the globe tangent passing through
P and lying in the drawing plane ZOP meets the image plane, and at
the same time (since the two tangents MO and MP are equal) the
midpoint of the hypotenuse of the right triangle OPP'. The
intersection point D of any other globe tangent passing through P with the
image plane will then lie perpendicularly above (below) M. The
image D' of D is D itself, and the image of the tangent DP is thus DP'.
Now the two right triangles at M, DMP and DMP', are congruent
(MD = MD and MP = MP'). Consequently, D'P' = DP, which
was to be proved.
Proof of I. We will now prove the somewhat more general
Chasles theorem:* The stereographic image of a globe circle S is a circle whose
midpoint is the stereographic projection S' of the apex S of the cone that is
tangent to the globe along the circle S.
Proof. In Figure 90 let P be an arbitrary point of S, let P' be its
image, D the point of intersection of the tangent to the sphere and
cone-generator SP with the image plane E. According to the
auxiliary theorem, DP then equals DP'. Thus, if H is the point of
intersection of the parallel through S' to DP with the projection ray
ZP, it follows from the similarity of the triangle S'P'H to the isosceles
triangle DP'P that the two segments S'P' and S'H are equal.
Consequently, in the relation
$$S'H : SP = ZS' : ZS$$
derived from the ray theorem, we can replace S'H with S'P', obtaining
$$S'P' : SP = ZS' : ZS.$$
* Michel Chasles (1793-1880), French mathematician, especially well-known
for his brilliantly written Aperçu historique sur l'origine et le développement des méthodes
en géométrie.
Now, if P describes the circle S, SP (as the distance of the apex S of
the cone from S) remains constant, and consequently, in view of the
last equation, S'P' also remains constant and P' describes a circle
in E.
If the object circle S is a great circle of the globe, the apex S of the
cone lies at infinity.
In this case let F be the place at which the perpendicular from Z on
the plane of S touches the map plane E, and let V be the place at
which the globe tangent through P parallel to this perpendicular
touches the map plane E. Since, according to the auxiliary theorem,
VP' = VP, the triangle VPP' is isosceles; and since VP is parallel to
FZ, the triangle FZP' is also isosceles; therefore,
FP' = ZF.
The locus of the image point P' is thus a circle with the midpoint F
and the radius ZF.
In those great circles of the globe that pass through the projection
center and the map center, the midpoint F of the image circle recedes
to infinity. In fact, these circles, as direct inspection will show, are
transformed into straight lines by projection.
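Chasles' theorem lends itself to a numerical check. The sketch below is an illustration under assumed coordinates that are not in the original: the unit globe is centered at (0, 0, 1), the map plane is z = 0, the projection center is Z = (0, 0, 2), and a small circle is described by its axis direction and angular radius `rho`. All function names are hypothetical.

```python
import math

CENTRE = (0.0, 0.0, 1.0)  # centre of the unit globe; the map plane z = 0 touches it at O

def project(point):
    """Central projection from Z = (0, 0, 2) onto the map plane z = 0."""
    x, y, z = point
    t = 2.0 / (2.0 - z)
    return (t * x, t * y)

def normalize(v):
    n = math.sqrt(sum(x * x for x in v))
    return tuple(x / n for x in v)

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def image_radii(axis, rho, samples=12):
    """Distances from the image of the cone apex to images of points of the circle."""
    n = normalize(axis)
    helper = (1.0, 0.0, 0.0) if abs(n[0]) < 0.9 else (0.0, 1.0, 0.0)
    u = normalize(cross(n, helper))   # u and v span the plane perpendicular to the axis
    v = cross(n, u)
    # Apex of the cone tangent along the circle: distance sec(rho) from the centre.
    apex = tuple(c + a / math.cos(rho) for c, a in zip(CENTRE, n))
    sx, sy = project(apex)
    radii = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        p = tuple(c + math.cos(rho) * a
                  + math.sin(rho) * (math.cos(theta) * b + math.sin(theta) * d)
                  for c, a, b, d in zip(CENTRE, n, u, v))
        px, py = project(p)
        radii.append(math.hypot(px - sx, py - sy))
    return radii

radii = image_radii((1.0, 2.0, -1.0), 0.4)
print(max(radii) - min(radii))  # spread of the image radii, zero up to rounding
```

A vanishing spread means the image points lie on a circle around the image of the cone apex, as the theorem asserts.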
Proof of II. Let $\omega$ be an arbitrary angle on the globe, its apex P,
therefore, a point on the globe, and each of its legs a globe tangent.
If X and Y are accordingly the points at which the two tangents
intersect the image plane E, then $\omega = \angle XPY$.
The image $\omega'$ of this angle is the angle XP'Y.
Now, since the triangles XPY and XP'Y are congruent (XY = XY;
also, according to the auxiliary theorem, XP = XP' and YP = YP'),
we immediately obtain
$$\omega' = \omega,$$
which was to be proved.
Fig. 91.
Note. If instead of the tangential plane E we choose a plane
parallel to it as our map plane, we obtain a similar stereographic
projection, which, naturally, also possesses the fundamental properties
I. and II. Of particular importance is a picture plane passing through
the center of the globe, especially when the north pole is chosen
as the projection center and the equatorial plane is accordingly
chosen as the image plane. In this case we obtain for the distance r
of the image point P' from the map center O lying at the center of the
globe the formula
$$r = \tan\left(45^\circ + \frac{\varphi}{2}\right),$$
where $\varphi$ is the geographic latitude of the point P. (The above cited
angle $\zeta = \angle OZP$ is the base angle of the isosceles triangle OPZ in
which the apex angle situated at O is the complement of the latitude $\varphi$.)
76 The Mercator Projection
To draw a conformal geographic map whose grid is composed of right-angle
compartments.
The Mercator map, which is equally important for both geography
and nautical science, was conceived by Gerhard Kremer, called
Mercator (1512-1594).
On the Mercator map the equator is a segment AB, the length of
which agrees with the length ($2\pi$) of the globe equator. If we divide
AB into 360 equal parts and erect at the dividing points perpendiculars
to AB, we thereby obtain the map meridians. The latitude parallel
on the map that corresponds to the globe parallel of latitude $\varphi$ is a line
parallel to AB whose distance $\Phi$ from the map equator is called the
exaggerated latitude. The core of the problem consists of representing
the exaggerated latitude $\Phi$ as a function of the geographic latitude $\varphi$.
In order to solve this problem we will compare the Mercator map
with the (also conformal) Hipparchus map (No. 75), in which the
north pole of the globe is the projection center and the plane E of the
globe equator is the map plane, and in which, therefore, the globe
equator is projected isometrically. Here also the globe radius will
serve as the unit length.
On the Mercator map we divide the distance $\Phi$ of the latitude
parallel from the equator into n equal parts, where n is a very large
number; we draw through the dividing points the latitude parallels
1, 2, 3, ..., n - 1 and call their corresponding geographic latitudes
$\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$, so that instead of $\varphi$ we write $\varphi_n$ also. We then draw
the two parallel map meridians $\lambda'$ and $\Lambda'$ corresponding to the globe
meridians $\lambda$ and $\Lambda$, whose difference in longitude measured in radian
measure $\varepsilon = \Lambda - \lambda$ we will make very small. We thereby obtain on
the map a series of successive, very small, congruent rectangles with
the base line $\varepsilon$ and the altitude $\Phi/n$.
We now do the same on the Hipparchus map. Thus, we draw the
concentric map latitudes corresponding to the latitudes $\varphi_1, \varphi_2, \ldots, \varphi_{n-1}$
and call their radii $r_1, r_2, \ldots, r_n = r$. According to No. 75,
(1) $r = \tan\left(45^\circ + \frac{\varphi}{2}\right).$
Similarly, we draw the map meridians $\lambda''$ and $\Lambda''$ corresponding to the
two longitudes $\lambda$ and $\Lambda$; these meridians are at the same time the radii
of the circle of latitude of radius r. Thus, we obtain on the
Hipparchus map a series of n successive, very small compartments, which
we can consider as rectangles if n is sufficiently great (Fig. 92). We single out
the compartment situated between the latitude circles of radii $r_\nu$ and
$r_{\nu+1}$. Since its base line parallel to the map equator is $r_\nu$ times as
great as the base line $\varepsilon$ of the first compartment, and thus also $r_\nu$
times as great as the base line $\varepsilon$ of the compartment of the Mercator
map, then as a result of the conformal nature of the two maps, the
altitude $r_{\nu+1} - r_\nu$ of the Hipparchus map compartment must also be
$r_\nu$ times as great as the altitude $\Phi/n$ of the corresponding compartment
of the Mercator map:
$$r_{\nu+1} - r_\nu = r_\nu \cdot \frac{\Phi}{n}.$$
From this it follows that
$$r_{\nu+1} = r_\nu\left(1 + \frac{\Phi}{n}\right).$$
If we construct this equation for all n compartments, $r_0$ being equal
to 1, and multiply the resulting n equations together, we obtain
(2) $r = \left(1 + \frac{\Phi}{n}\right)^n.$
However, since for sufficiently great n the right side of this equation
does not deviate noticeably from $e^\Phi$ (No. 12), we obtain the equation
(2a) $r = e^\Phi.$
From this we get $\Phi = \ln r$ or, because of (1),
(3) $\Phi = \ln\tan\left(45^\circ + \frac{\varphi}{2}\right),$
and thus the exaggerated latitude $\Phi$ is represented as a function of
the geographic latitude $\varphi$.
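The limiting argument can be replayed numerically. This is a minimal sketch with assumed function names (`hipparchus_radius`, `exaggerated_latitude`, `compounded_radius`); angles are in radians and the globe radius is the unit length, as in the text.

```python
import math

def hipparchus_radius(phi):
    """Radius of the latitude circle on the equatorial stereographic map, equation (1)."""
    return math.tan(math.pi / 4.0 + phi / 2.0)

def exaggerated_latitude(phi):
    """Exaggerated latitude of the Mercator map, equation (3)."""
    return math.log(hipparchus_radius(phi))

def compounded_radius(Phi, n):
    """Multiply out the n compartment equations r_(v+1) = r_v (1 + Phi/n), r_0 = 1."""
    r = 1.0
    for _ in range(n):
        r *= 1.0 + Phi / n
    return r

phi = math.radians(60.0)
Phi = exaggerated_latitude(phi)
for n in (10, 1000, 1000000):
    print(n, compounded_radius(Phi, n), hipparchus_radius(phi))
```

As n grows, the compounded product approaches the stereographic radius, which is the content of step (2a).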
As a result of our investigation we obtain the following
Directive for drawing a Mercator map: The map image of a point
on the earth of longitude $\lambda$ and latitude $\varphi$ has a distance $\lambda$ from the zero meridian
on the map and a distance of
$$\ln\tan\left(\frac{\pi}{4} + \frac{\varphi}{2}\right)$$
from the map equator.
Here the angles $\lambda$ and $\varphi$ are taken as being in radian measure and
the radius of the globe on which the map is based is taken as the unit
length.
Nautical and Astronomical
Problems
77 The Problem of the Loxodrome
To determine the length of the loxodromic line joining two points on the
surface of the earth.
A loxodrome is understood to mean a line on the earth's surface that
makes the same angle with all the meridians that it cuts. As long as a
ship does not alter its course it is sailing on a loxodrome. The angle $\kappa$
formed by the loxodrome with the meridians it cuts is therefore called
the azimuth of course. On a Mercator map (No. 76), which is conformal
and possesses rectilinear parallel meridians, the loxodrome appears as a straight
line that cuts the map meridians at the angle $\kappa$.
In our study of the Mercator map we chose the radius of the
globe as the unit length. Sailors use as the unit length the nautical
mile (nm), which is the length of one minute latitude on a meridian
of the earth's surface or, also, the length of a minute longitude on the
equator (each being 1852 meters). Since a meridian is $\pi$ earth
radii long and 180 degrees of latitude is equal to 10800 latitude
minutes, the earth radius is $\mathfrak{r} = 10800/\pi$ nm long. If we think of a
Mercator map with 1:1 scale (i.e., a map whose equator is as long as
the real equator), the distance between the map circle corresponding
to the latitude $\varphi$ and the map equator, the so-called exaggerated
latitude (according to No. 76), is
$$\Phi = \mathfrak{r}\,\ln\tan\left(45^\circ + \frac{\varphi}{2}\right)\ \text{nm}.$$
The two earth points O and O' whose loxodromic distance d is to be
determined are given by their longitudes $\lambda$, $\lambda'$ and latitudes $\varphi$, $\varphi'$
($\varphi' > \varphi$).
The exaggerated latitudes on the map are
$$\Phi = \mathfrak{r}\,\ln\tan\left(45^\circ + \frac{\varphi}{2}\right) \quad\text{and}\quad \Phi' = \mathfrak{r}\,\ln\tan\left(45^\circ + \frac{\varphi'}{2}\right)\ \text{nm},$$
the distances of the map meridians from the zero meridian $\Lambda$ and $\Lambda'$
nm, where $\Lambda$ represents the number of longitude minutes comprising
$\lambda$ and $\Lambda'$ the number of longitude minutes comprising $\lambda'$.
Let us say that the map meridian through O and the map parallel
through O' intersect at S. Then OS = B is the exaggerated latitude
difference $\Phi' - \Phi$, $O'S = L = \Lambda' - \Lambda$ (nm), OO' is the map
loxodrome and $\angle O'OS = \kappa$ is the azimuth of course.
From the right map triangle OO'S we find the azimuth of course $\kappa$
by means of the equation
(1) $\tan\kappa = \frac{L}{B}.$
In order to determine the loxodromic distance d of the two positions
on the surface of the earth we divide d into N very small equal
segments $\varepsilon$ considered as being rectilinear. If we draw the meridian
through one of two adjacent division points and the circle of latitude
through the other, we obtain thereby a very small right triangle with
the hypotenuse $\varepsilon$, whose meridional leg is the latitude difference $\beta$
(measured in nm) of the two division points and forms the angle $\kappa$
with the loxodrome, so that $\beta = \varepsilon\cos\kappa$. Every two adjacent points
thus possess the same latitude difference $\beta$. The total (measured in
nm) latitude difference b of the two positions O and O' on the earth's
surface is therefore $b = N\beta = N\varepsilon\cos\kappa = d\cos\kappa$. Consequently,
the sought-for loxodromic distance is
(2) $d = b\sec\kappa.$
Formulas (1) and (2) contain the solution to the problem.
Example. How great is the loxodromic distance from Valdivia
($\lambda = 286^\circ\,34.9'$ E, $\varphi = -39^\circ\,53.1'$) to Yokohama ($\lambda' = 139^\circ\,39.2'$ E,
$\varphi' = +35^\circ\,26.6'$)? Here the longitudinal difference L = 8815.7
minutes; the latitudinal difference b = 4519.7 minutes or nautical
miles; the exaggerated latitude difference $B = \Phi' - \Phi = 4890$ nm;
$\kappa$, according to (1), is $60^\circ\,58'\,50''$; and the loxodromic distance d,
according to (2), is 9317 nm.
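The example can be recomputed from formulas (1) and (2). The following sketch is illustrative: the names `dm`, `exaggerated_latitude_nm` and `loxodrome` are assumptions, the longitude difference is taken as the plain difference of the two given east longitudes (as in the example), and southern latitudes are entered as negative numbers.

```python
import math

EARTH_RADIUS_NM = 10800.0 / math.pi  # earth radius in nautical miles

def dm(degrees, minutes=0.0):
    """Degrees and minutes to decimal degrees (apply the sign to the result)."""
    return degrees + minutes / 60.0

def exaggerated_latitude_nm(phi_deg):
    """Exaggerated latitude in nautical miles for a latitude in degrees."""
    return EARTH_RADIUS_NM * math.log(math.tan(math.radians(45.0 + phi_deg / 2.0)))

def loxodrome(lam_deg, phi_deg, lam2_deg, phi2_deg):
    """Return (azimuth of course in degrees, loxodromic distance in nm)."""
    L = abs(lam2_deg - lam_deg) * 60.0   # longitude difference in minutes
    b = abs(phi2_deg - phi_deg) * 60.0   # latitude difference in minutes (= nm)
    B = abs(exaggerated_latitude_nm(phi2_deg) - exaggerated_latitude_nm(phi_deg))
    kappa = math.atan2(L, B)             # formula (1): tan(kappa) = L / B
    return math.degrees(kappa), b / math.cos(kappa)  # formula (2): d = b sec(kappa)

kappa, d = loxodrome(dm(286, 34.9), -dm(39, 53.1), dm(139, 39.2), dm(35, 26.6))
print(round(kappa, 3), round(d))
```

Formula (2) breaks down for a course due east or west (b = 0 and the cosine of the azimuth is zero); that case is not handled here.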
Note. The shortest distance k between the two positions can be
found by applying the cosine theorem to the spherical triangle NVY
(North Pole-Valdivia-Yokohama). In this triangle $NV = 90^\circ - \varphi = 129^\circ\,53.1'$,
$NY = 90^\circ - \varphi'$, $\angle VNY = \lambda - \lambda'$, and VY = k.
According to the cosine theorem
$$\cos k = \cos NV \cos NY + \sin NV \sin NY \cos(\lambda - \lambda')$$
or
$$\cos k = \sin\varphi\sin\varphi' + \cos\varphi\cos\varphi'\cos(\lambda - \lambda').$$
This yields
$$k = 153^\circ\,36.1' = 9216.1' = 9216.1\ \text{nm}.$$
The shortest distance is consequently 101 nm shorter than the
loxodromic distance.
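The great-circle value in this note follows from the cosine theorem in a few lines. The function name `great_circle_nm` is an assumption made for this sketch; the input values are those of the example, with the southern latitude entered as negative.

```python
import math

def great_circle_nm(lam_deg, phi_deg, lam2_deg, phi2_deg):
    """Shortest distance in nautical miles (one minute of arc = 1 nm)."""
    phi, phi2 = math.radians(phi_deg), math.radians(phi2_deg)
    dlam = math.radians(lam_deg - lam2_deg)
    cos_k = (math.sin(phi) * math.sin(phi2)
             + math.cos(phi) * math.cos(phi2) * math.cos(dlam))
    return math.degrees(math.acos(cos_k)) * 60.0

k = great_circle_nm(286 + 34.9 / 60, -(39 + 53.1 / 60), 139 + 39.2 / 60, 35 + 26.6 / 60)
print(round(k, 1))
```

Comparing this with the loxodromic distance of the example shows the saving of roughly a hundred nautical miles stated above.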
The name loxodrome stems from the Dutchman Willebrord Snell
(Snellius, 1581-1626). The Portuguese mathematician Pedro Nunes
(1492-1577) was the first to recognize that the loxodromic line
connecting two points of the earth's surface is not the shortest
connecting line and that a loxodrome continuously approaches the pole
without ever reaching it.
78 Determining the Position of a Ship at Sea
One of the most important problems in nautical science is that of
determining the position of a ship at sea. The solution is usually obtained
by the method of the so-called astronomical meridian reckoning, which will
be analyzed in the following example.
Problem: On board a ship in the Pacific Ocean in the north latitude on
October 20, 1923 at 6:50 p.m. mean Greenwich time by the chronometer the
sun's altitude was taken in the morning as $h = 21^\circ\,40.5'$; the Nautical
Almanac gave the declination of the sun for the time of observation as $\delta = 10^\circ\,10.2'$ S,
the equation of time as e = -15 min 3 sec. The ship then
sailed till noon 15.2 nm WNW, and the altitude of the sun at culmination was
then measured as $H = 35^\circ\,2.7'$ and the sun's declination determined as
$\Delta = 10^\circ\,13'$.
Where was the ship?
The solution to this problem consists of four steps.
I. Determination of the meridional latitude $\Phi$. At
culmination the successive arcs (the altitude of the sun, the pole distance,
the pole altitude) cover the meridional half circle above the horizon
in such manner that $H + (90^\circ + \Delta) + \Phi = 180^\circ$. This gives us
$$\Phi = 90^\circ - H - \Delta = 44^\circ\,44.3'.$$
II. Determination of the latitude difference $\beta$ and the
longitude difference $l$ of the two observation points, as well as
the a.m. latitude $\varphi$.
If one imagines two sufficiently close points A and B on the earth's
surface, the distance between which is d nm and the line connecting
which forms the angle $\kappa$ with the longitudinal circle passing through
the center M of AB, then the latitudinal difference of the two points
is $d\cos\kappa$ nm, the longitudinal difference $d\sin\kappa$ nm. Since one
nautical mile of latitudinal difference is equivalent to one minute latitude
difference and one nautical mile longitudinal difference at the
latitude $\varphi$ corresponds to $\sec\varphi$ minutes longitudinal difference, then
the latitudinal and longitudinal differences of A and B in minutes
are:
$$\beta = d\cos\kappa, \qquad l = d\sin\kappa\sec\mu,$$
where $\mu$ is the latitude of M, the so-called mean latitude of A and B.
In our example ($d = 15.2$, $\kappa = 67.5^\circ$) we find first that
$$\beta = 5.8'.$$
From this it follows that the a.m. latitude is
$$\varphi = \Phi - \beta = 44^\circ\,38.5',$$
and the mean latitude is
$$\mu = \frac{\Phi + \varphi}{2} = 44^\circ\,41.4'.$$
Fig. 93.
Accordingly we find the longitude difference to be
$$l = 19.75'.$$
III. Determination of the a.m. longitude $\lambda$.
In the formula (see Figure 93) corresponding to the nautical
triangle PZO (pole-zenith-sun) of the a.m. observation
$$\cos z = \cos p\cos b + \sin p\sin b\cos ZPO,$$
we replace z, p, b, and $\angle ZPO$ with $90^\circ - h$, $90^\circ + \delta$, $90^\circ - \varphi$, and
$180^\circ - T$ (T being understood to represent the time angle of the
sun), and we obtain
$$-\cos T = \tan\delta\tan\varphi + \frac{\sin h}{\cos\delta\cos\varphi}.$$
This yields the true local time T of the a.m. observation
T.L.T. = T = 134° 47.5' = 8 hr 59 min 10 sec.
From this and the time equation e we obtain the mean local time of
the observation
M.L.T. = T.L.T. + e = 8 hr 44 min 7 sec.
If we reduce the mean Greenwich time of the observation by the
mean local time, we obtain the western longitude $\lambda$ of the observation
point in time:
$\lambda$ = M.G.T. - M.L.T. = 10 hr 5 min 53 sec.
In angular measure (1 hr time longitude = 15 degrees longitude), this
comes to
$\lambda$ = 151° 28.25' W.
IV. Determination of the meridian longitude $\Lambda$.
$\Lambda = \lambda + l$ = 151° 48'.
Result: a.m. Position: 44° 38.5' N, 151° 28.25' W,
Noon Position: 44° 44.3' N, 151° 48' W.
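The four steps can be strung together as follows. This is a sketch tied to the sign conventions of this particular example, all of which are assumptions of the code rather than general rules: northern latitude, southern declination entered as a positive number, a morning sight, a run toward the north-west, and west longitude counted positive. The names `dm` and `ship_position` are hypothetical.

```python
import math

def dm(degrees, minutes=0.0):
    """Degrees and minutes to decimal degrees."""
    return degrees + minutes / 60.0

def ship_position(h, delta, eq_time_h, gmt_h, run_nm, course_deg, H, Delta):
    """Return (noon latitude, a.m. latitude, a.m. west longitude, noon west longitude) in degrees."""
    Phi = 90.0 - H - Delta                         # I. meridional latitude
    kappa = math.radians(course_deg)               # angle between course and meridian
    beta = run_nm * math.cos(kappa)                # II. latitude difference in minutes
    phi = Phi - beta / 60.0                        # a.m. latitude (ship sailed northward)
    mu = math.radians((Phi + phi) / 2.0)           # mean latitude
    l = run_nm * math.sin(kappa) / math.cos(mu)    # longitude difference in minutes
    d, p, a = math.radians(delta), math.radians(phi), math.radians(h)
    cos_T = -(math.tan(d) * math.tan(p) + math.sin(a) / (math.cos(d) * math.cos(p)))
    tlt_h = math.degrees(math.acos(cos_T)) / 15.0  # III. true local time in hours
    mlt_h = tlt_h + eq_time_h                      # mean local time
    lam = (gmt_h - mlt_h) * 15.0                   # a.m. west longitude in degrees
    Lam = lam + l / 60.0                           # IV. noon longitude (ship sailed westward)
    return Phi, phi, lam, Lam

Phi, phi, lam, Lam = ship_position(
    h=dm(21, 40.5), delta=dm(10, 10.2), eq_time_h=-(15 + 3 / 60) / 60,
    gmt_h=18 + 50 / 60, run_nm=15.2, course_deg=67.5,
    H=dm(35, 2.7), Delta=dm(10, 13))
print(Phi, phi, lam, Lam)
```

The code carries the unrounded latitude difference through the computation, so the last digits can differ slightly from the rounded intermediate values printed in the text.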
79 Gauss' Two-Altitude Problem
From the altitudes of two known stars determine the time and position.
This problem, which is very important for astronomers,
geographers, and mariners, was solved by Gauss in 1812 in Bode's
Astronomisches Jahrbuch.
Two stars are said to be known when their equatorial coordinates
(the right ascension and declination) are known. Let these
coordinates of the two stars S and S' be $\alpha|\delta$ and $\alpha'|\delta'$. In the present
problem all we need in addition is the right ascension difference
$\alpha' - \alpha$. In the figure let P be the world pole; thus $PS = p = 90^\circ - \delta$
will be the pole distance from S; $PS' = p' = 90^\circ - \delta'$ will be the
pole distance from S'; and $\angle SPS' = \tau$ will be the angle between the
hour circles of the two stars, as well as the magnitude of the right
ascension difference; let Z be the zenith of the observation point, so
that $PZ = b = 90^\circ - \varphi$ is the complement of the latitude $\varphi$, ZS = z
the zenith distance from S, and ZS' = z' the zenith distance from S',
the last two being as well the complements of the altitudes h and h',
respectively.
We still need the auxiliary magnitudes $\angle PSS' = \sigma$, $\angle PS'S = \sigma'$,
$\angle PSZ = \psi$, $\angle ZSS' = \zeta$, $\angle ZPS = t$, and the side SS' = s.
Fig. 94.
The computation, which is very simple, consists of three steps
corresponding to the three triangles PSS', ZSS', PZS, which are taken
up in that order.
I. Triangle PSS'. The angles $\sigma$ and $\sigma'$ are determined according
to Napier's formulas
$$\tan\frac{\sigma + \sigma'}{2} = \frac{\cos\frac{p' - p}{2}}{\cos\frac{p' + p}{2}}\cot\frac{\tau}{2}, \qquad \tan\frac{\sigma - \sigma'}{2} = \frac{\sin\frac{p' - p}{2}}{\sin\frac{p' + p}{2}}\cot\frac{\tau}{2},$$
and the side s is determined according to the sine formula
$$\sin s : \sin p = \sin\tau : \sin\sigma'.$$
II. Triangle ZSS'. The angle $\zeta$ is calculated according to the
tangent theorem for the half angle:
$$\tan\frac{\zeta}{2} = \sqrt{\frac{\sin(S - z)\sin(S - s)}{\sin S\,\sin(S - z')}},$$
where S is half the sum of the triangle sides z, z', s. In connection
with this we determine $\psi = \sigma - \zeta$.
III. Triangle PZS, determination of the locale and the time.
The sought-for latitude can be obtained from
$$\cos b = \cos p\cos z + \sin p\sin z\cos\psi$$
or
$$\sin\varphi = \sin\delta\sin h + \cos\delta\cos h\cos\psi.$$
The sought-for time angle T, i.e., the angle at the pole that has been
described by the hour circle of the star S since its lower culmination,
follows from
$$\cos t = \frac{\cos z - \cos p\cos b}{\sin p\sin b} = \frac{\sin h - \sin\delta\sin\varphi}{\cos\delta\cos\varphi}$$
and
$$T = 12\ \text{hr} \pm t,$$
where the upper sign applies when the star S at the moment of
observation is in the western celestial hemisphere and the lower when
it is in the eastern celestial hemisphere. From this we obtain directly
the sought-for time of the observation, the sidereal time $\Theta$ (the time
angle of the Aries point), when we add the right ascension $\alpha$ to the
time angle T: $\Theta = T + \alpha$.
In order to obtain the mean local time (M.L.T.) of the
observation we first determine with an approximate value $\alpha_0$ of the right
ascension of the mean sun for the moment of the observation the
approximate mean local time $\Theta - \alpha_0$ of the observation; then, using
this already fairly exact mean local time we determine the exact right
ascension $\alpha_0$ of the mean sun for the moment of observation and
finally the exact mean local time
M.L.T. = $\Theta - \alpha_0$.
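The three steps can be sketched in code as follows. This is an illustration under assumptions, not the author's computation: the function name `two_altitude` is hypothetical, angles are in radians, the side s is obtained from the cosine theorem instead of the sine formula (to avoid the ambiguity of the arcsine), and the configuration of the figure is assumed, in which the zenith lies inside the angle PSS' so that the angle PSZ equals the angle PSS' minus the angle ZSS'.

```python
import math

def two_altitude(delta, delta2, tau, h, h2):
    """Return (latitude, pole angle t of star S) from two declinations,
    the hour-circle angle tau and the two altitudes."""
    p, p2 = math.pi / 2 - delta, math.pi / 2 - delta2
    z, z2 = math.pi / 2 - h, math.pi / 2 - h2
    # I. Triangle PSS': Napier's formulas for the half sum and half difference.
    half_sum = math.atan2(math.cos((p2 - p) / 2) * math.cos(tau / 2),
                          math.cos((p2 + p) / 2) * math.sin(tau / 2))
    half_diff = math.atan2(math.sin((p2 - p) / 2) * math.cos(tau / 2),
                           math.sin((p2 + p) / 2) * math.sin(tau / 2))
    sigma = half_sum + half_diff
    # Side s from the cosine theorem (swapped in for the sine formula).
    s = math.acos(math.cos(p) * math.cos(p2)
                  + math.sin(p) * math.sin(p2) * math.cos(tau))
    # II. Triangle ZSS': half-angle tangent theorem.
    S = (z + z2 + s) / 2
    zeta = 2 * math.atan(math.sqrt(math.sin(S - z) * math.sin(S - s)
                                   / (math.sin(S) * math.sin(S - z2))))
    psi = sigma - zeta
    # III. Triangle PZS: latitude and pole angle.
    phi = math.asin(math.sin(delta) * math.sin(h)
                    + math.cos(delta) * math.cos(h) * math.cos(psi))
    t = math.acos((math.sin(h) - math.sin(delta) * math.sin(phi))
                  / (math.cos(delta) * math.cos(phi)))
    return phi, t
```

A synthetic usage example: for latitude 50 degrees, declinations 10 and 20 degrees, and pole angles of 30 and 40 degrees on opposite sides of the meridian, computing the two altitudes from the cosine theorem and passing them in with tau equal to 70 degrees should return the latitude and the 30-degree pole angle.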
We can apply this solution of the Gauss two-altitude problem
directly to the solution of the very important navigational problem,
Douwes'* problem: From two altitudes of a star (the sun) with known
declination and the interval between the two observations determine the latitude
of the place of observation.
We need only consider S and S', respectively, as the place, $\delta$ and $\delta'$,
respectively, as the declination of the star at the first and second
observations. For fixed stars $\delta = \delta'$, while for the sun and the
planets $\delta'$ differs somewhat from $\delta$. ($\tau$ is the angle determined by the
known time interval between the hour circles of the star corresponding
to the two moments of observation.)
Since the two measured altitudes are usually observed at different
places A and B, while the above calculation is related to only one place,
let us say B, the altitude measured at A must be "reduced to place B."
For this purpose we solve the problem:
At a place A the altitude of a star is observed at a given time $\mathfrak{Z}$; at the same
moment in time what is the altitude of the star at place B?
To begin with, it is clear that all places on the earth's surface at
which the star has the same altitude or the same zenith distance at
moment $\mathfrak{Z}$ lie on a circle of the geosphere the spherical midpoint of
which is the end point $S_0$ of the earth radius from the geocenter to the
star. This circle is called the equal altitude circle of the star, its midpoint
$S_0$ the star image.
Fig. 95.
In Figure 95 let $\mathfrak{A}$ and $\mathfrak{B}$ be the two equal altitude circles of the
star at moment $\mathfrak{Z}$ on which the observation points A and B lie; let $S_0$
be the star image, O the point of intersection of the great arc $S_0A$ with
$\mathfrak{B}$. We will assume that the distance AB is so small that the triangle
AOB can be considered plane. This gives for the difference between
the zenith distances and, consequently, also for the difference in the
altitudes of the star at A and B
$$AO = AB\cos\omega,$$
where $\omega$ is the angle between the ship's course AB and the bearing
AO of the star at A.
* Douwes was a Dutch admiralty mathematician.
We accordingly obtain the sought-for star altitude h at B at the
time $\mathfrak{Z}$ of the observation made at A if we increase or reduce the star
altitude measured at A by the product of the traversed distance AB
and the cosine of the angle between the course and the bearing of the
star at A, accordingly as the ship draws nearer to or recedes from
the star.
The "reduced" altitude thus obtained must then be substituted for
h in the above Gauss equation, while the altitude measured at B must
be used for h'.
The value for $\varphi$ obtained by this calculation is naturally the latitude
of the second observation point B.
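The reduction rule amounts to one line of arithmetic. In this sketch the function name `reduce_altitude` and the sample numbers are assumptions; the distance is in nautical miles, so the correction is in minutes of arc, and the sign of the cosine takes care of "nearer to" versus "recedes from".

```python
import math

def reduce_altitude(h_a_deg, distance_nm, omega_deg):
    """Altitude at B (degrees) from the altitude at A, the run AB in nm,
    and the angle omega between the course and the bearing of the star."""
    return h_a_deg + distance_nm * math.cos(math.radians(omega_deg)) / 60.0

print(reduce_altitude(30.0, 12.0, 0.0))    # sailing straight toward the star image
print(reduce_altitude(30.0, 12.0, 180.0))  # sailing straight away from it
```

As in the text, this treats the triangle AOB as plane and is therefore only meant for short runs.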
80 Gauss' Three-Altitude Problem
From the time intervals between the moments at which three known stars
attain the same altitude, determine the moments of the observations, the latitude
of the observation point and the altitude of the stars.
The significance of this Gauss method for determining time and
location resides in the fact that it eliminates all observational error
resulting from atmospheric refraction.
Solution. We designate the equatorial coordinates (right
ascension and declination) of the three stars as $\alpha|\delta$, $\alpha'|\delta'$, $\alpha''|\delta''$, the
latitude of the observation point as $\varphi$, the moments of the observations
as t, t', t", the time angles of the three stars at these moments as
T, T', T", so that the differences T' - T = t' - t and T" - T =
t" - t are known. This gives us the three equations
(1) $\sin h = \sin\delta\sin\varphi - \cos\delta\cos\varphi\cos T,$
(2) $\sin h = \sin\delta'\sin\varphi - \cos\delta'\cos\varphi\cos T',$
(3) $\sin h = \sin\delta''\sin\varphi - \cos\delta''\cos\varphi\cos T''.$
By subtracting the two first equations we obtain
(4) $\sin\varphi(\sin\delta - \sin\delta') = \cos\varphi(\cos\delta\cos T - \cos\delta'\cos T').$
We now introduce the half sum and half difference
$$s = \frac{\delta' + \delta}{2} \quad\text{and}\quad u = \frac{\delta' - \delta}{2}$$
and
$$S = \frac{T' + T}{2} \quad\text{and}\quad U = \frac{T' - T}{2}$$
of the declinations $\delta'$ and $\delta$ and the time angles T' and T, respectively,
and accordingly replace $\delta'$ and $\delta$ in (4) by s + u and s - u, and replace
T' and T by S + U and S - U. In the transformed equation (4)
we then apply the addition theorem throughout and obtain
$$-\sin\varphi\cos s\sin u = \cos\varphi(\sin S\sin U\cos s\cos u + \cos S\cos U\sin s\sin u).$$
Here we divide by $\cos\varphi\cos s\sin u$ and obtain
$$-\tan\varphi = \sin S\cdot\sin U\cot u + \cos S\cdot\cos U\tan s.$$
Since U, u, and s are known, we determine the auxiliary magnitudes r
and w such that
$$r\cos w = \sin U\cot u \quad\text{and}\quad r\sin w = \cos U\tan s.$$
(First w is determined from $\tan w = \tan s\tan u\cot U$ and then r from
one of the two auxiliary equations.) The equation obtained then
assumes the simple form
(I) $-\tan\varphi = r\sin[S + w].$
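Equation (I) can be verified on synthetic data. In the sketch below the names `time_angle` and `auxiliary` and the sample latitude, altitude and declinations are assumptions; the time angles are generated from equation (1), and r and w are obtained together with `hypot` and `atan2` rather than in the two separate steps described in the text.

```python
import math

def time_angle(delta, phi, h):
    """Time angle T solving sin h = sin(delta) sin(phi) - cos(delta) cos(phi) cos T."""
    return math.acos((math.sin(delta) * math.sin(phi) - math.sin(h))
                     / (math.cos(delta) * math.cos(phi)))

def auxiliary(delta, delta2, T, T2):
    """Return the auxiliary magnitudes r, w and the half sum S of the time angles."""
    s, u = (delta2 + delta) / 2, (delta2 - delta) / 2
    S, U = (T2 + T) / 2, (T2 - T) / 2
    x = math.sin(U) / math.tan(u)   # r cos w
    y = math.cos(U) * math.tan(s)   # r sin w
    return math.hypot(x, y), math.atan2(y, x), S

phi, h = math.radians(50.0), math.radians(30.0)          # hypothetical sample values
delta, delta2 = math.radians(10.0), math.radians(25.0)
T, T2 = time_angle(delta, phi, h), time_angle(delta2, phi, h)
r, w, S = auxiliary(delta, delta2, T, T2)
print(-math.tan(phi), r * math.sin(S + w))  # the two sides of equation (I)
```

The two declinations must differ, since the derivation divides by the sine of their half difference.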
In precisely the same way, by subtracting the two equations (1)
and (3), introducing the half sums
$$\mathfrak{s} = \frac{\delta'' + \delta}{2}, \qquad \mathfrak{S} = \frac{T'' + T}{2}$$
and half differences
$$\mathfrak{u} = \frac{\delta'' - \delta}{2}, \qquad \mathfrak{U} = \frac{T'' - T}{2},$$
and introducing the auxiliary magnitudes $\mathfrak{r}$ and $\mathfrak{w}$ determined by the
conditions
$$\mathfrak{r}\cos\mathfrak{w} = \sin\mathfrak{U}\cot\mathfrak{u}, \qquad \mathfrak{r}\sin\mathfrak{w} = \cos\mathfrak{U}\tan\mathfrak{s},$$
we find the equation
(II) $-\tan\varphi = \mathfrak{r}\sin(\mathfrak{S} + \mathfrak{w}).$
By division of II and I we obtain the sine ratio of the two unknown
angles $(\mathfrak{S} + \mathfrak{w})$ and $[S + w]$,
(III) $\frac{\sin(\mathfrak{S} + \mathfrak{w})}{\sin[S + w]} = \frac{r}{\mathfrak{r}}.$
However, since the difference
$$(\mathfrak{S} + \mathfrak{w}) - [S + w] = \frac{T'' - T'}{2} + \mathfrak{w} - w$$
of these angles is known, it is easy to calculate the sum of the angles by
applying the sine tangent theorem (No. 40) to (III). From the sum
and the difference we obtain directly the angles $\mathfrak{S} + \mathfrak{w}$ and $S + w$
themselves and consequently also the unknown angles
$$\mathfrak{S} = \frac{T'' + T}{2} \quad\text{and}\quad S = \frac{T' + T}{2}.$$
From S and the known difference T' - T we then obtain the sought-for
time angles T and T'; from $\mathfrak{S}$ and the known difference T" - T
we obtain in similar fashion the time angles T and T". By adding the
right ascension to the time angle we finally obtain the moments of the
observations in sidereal time.
The sought-for latitude then follows from (I) or (II), the sought-for
altitude h from (1), (2), or (3).
Note. If the latitude is to be determined from two observations of
the same star altitude and the time interval between them, we have
at our disposal only equations (1) and (2) and must assume that the
time angle T for one of the observations is known. Equation (I),
all the magnitudes on the right side of which are known, then gives $\varphi$.
A remarkable special case of this situation is the
Problem of Riccioli: From the time between the culminations of two
known stars that rise or set at the same time, find the latitude of the observation
point.
This problem posed by Riccioli in 1651 is especially noteworthy in
that the method employed makes possible determinations of latitude
without an angle-measuring instrument.
If T and T' are the time angles of star risings, their difference
2U = T' − T is also the time between their culminations. Our
initial equations (1) and (2) are simplified here (because h = 0) to
cos T = tan δ tan φ and cos T' = tan δ' tan φ.
We introduce the complements t and t' of the time angles and obtain
sin t = tan δ tan φ, sin t' = tan δ' tan φ,
and from this by division we get the sine ratio of the angles t and t':
sin t : sin t' = tan δ : tan δ'.
Since t − t' = T' − T is known, we obtain t + t' from this
equation, in accordance with the sine-tangent theorem. We then
get 2t = (t + t') + (t − t') and finally φ from sin t = tan δ tan φ.
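The chain of steps in Riccioli's problem can be checked numerically. The
following is only a minimal Python sketch under stated assumptions: the
function name and its arguments are hypothetical, all angles are in
degrees, and the sine-tangent theorem is used in the form
tan ½(t + t') : tan ½(t − t') = (m + 1) : (m − 1), where m = sin t : sin t'.

```python
import math

def riccioli_latitude(delta_deg, delta_prime_deg, culmination_gap_deg):
    """Latitude (degrees) from the declinations of two stars that rise together.

    culmination_gap_deg is T' - T, i.e. the time between the two
    culminations converted to degrees (1 hr = 15 degrees).
    """
    tan_d = math.tan(math.radians(delta_deg))
    tan_dp = math.tan(math.radians(delta_prime_deg))
    diff = math.radians(culmination_gap_deg)   # t - t' = T' - T
    m = tan_d / tan_dp                         # sine ratio sin t : sin t'
    # Sine-tangent theorem gives the half sum (t + t')/2 from the half difference.
    half_sum = math.atan((m + 1) / (m - 1) * math.tan(diff / 2))
    t = half_sum + diff / 2                    # 2t = (t + t') + (t - t')
    # Finally sin t = tan(delta) * tan(phi).
    return math.degrees(math.atan(math.sin(t) / tan_d))
```

The sketch fails for equal declinations (m = 1), in which case the two
stars culminate together and the method gives no information.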
The Kepler Equation
From the mean anomaly of a planet calculate the eccentric and true anomaly.
Johannes Kepler (1571-1630) was one of the greatest astronomers
of all time. The famous problem named after him is to be found in
the 60th chapter of Kepler's major work Astronomia nova, published in
Prague in 1609, a book that, according to Lalande, every astronomer
must read at least once.
Before taking up the solution we will present a short explanation of
the three anomalies.
Let S and P be the midpoints of the sun and a planet, respectively,
let N be the point of the planet's orbit at which the planet is nearest
to the sun, the so-called perihelion, let O be the midpoint of the
elliptical orbit and of its circle of circumscription, P₀ the point of
intersection of the circle of circumscription with the parallel drawn
through P to the minor orbit axis, a and b the major and minor half
axes of the ellipse, respectively, OS = e the linear eccentricity,
ε = e/a the astronomic eccentricity or form number, T the period of
revolution of the planet, and t the time elapsed at the planet's
position P since its passage through the perihelion.
Fig. 96.
The true anomaly W is the angle NSP, i.e., the angle described by the
focal radius of the planet in the time t, the mean anomaly M the angle
that the focal radius would describe in the time t if it were to revolve
uniformly (with the same period of revolution T), so that in angular
measure
M = (2π/T) t.
Finally, the eccentric anomaly E is the angle NOP₀ formed by the radius
of the circle of circumscription to P₀ with the radius of the circle of
circumscription ON.
With E as a variable parameter we have
x = a cos E, y = b sin E the equation of the orbit
x = a cos E, y₀ = a sin E the equation of its circle of
circumscription.
There exists between the eccentric and true anomaly the relation
(obtainable from the right triangle with the legs x − e and y)
$$\tan W = \frac{b\sin E}{a\cos E - e};$$
after squaring and use of the formulas b² = a² − e², e = aε, and
cos² E + sin² E = 1, sec² W − tan² W = 1, this relation is
transformed into
$$\cos W = \frac{\cos E - \varepsilon}{1 - \varepsilon\cos E}.$$
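In order to obtain, in addition, a formula that is convenient for
logarithmic treatment, Gauss introduced the half angles ½W and ½E
and made use of the formulas
$$1 + \cos\varphi = 2\cos^2\frac{\varphi}{2} \quad\text{and}\quad 1 - \cos\varphi = 2\sin^2\frac{\varphi}{2}.$$
We write the above equation
$$\frac{1 - \cos W}{1 + \cos W} = \frac{1 + \varepsilon}{1 - \varepsilon}\cdot\frac{1 - \cos E}{1 + \cos E}$$
and obtain the
Gauss formula:
$$\tan\frac{W}{2} = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}}\,\tan\frac{E}{2}.$$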
There exists between the eccentric and mean anomaly (in radian
measure) the famous Kepler equation:
E − ε sin E = M.
This equation is a consequence of the formula
J = ½ab (E − ε sin E)*
for the area J of the elliptical sector SNP and of the Kepler surface
theorem: "The focal radius of a planet sweeps equal surfaces in equal
times." [According to the area formula, the area of the half ellipse
(E = π) is ½πab; the area of the whole ellipse is thus πab. According
to Kepler's surface theorem, there exists the proportion J : πab = t : T.
Consequently, E − ε sin E = 2πt/T = M.]
The crux of the Kepler problem now consists of the solution of the Kepler
equation
E − ε sin E = M
for the unknown E (when M and ε are assumed to be known).
The following determination of E rests upon the assumption that
the form number ε is a proper fraction and consists in the calculation
of a series E₁, E₂, E₃, ... of approximate values for the eccentric
anomaly that deviate progressively less and less from the true value E
as the index number increases and approximate the true value
sufficiently closely at a relatively low index number.
For the first approximation value we choose
E₁ = M + ε sin M.
Its deviation from the true value E is
E − E₁ = ε(sin E − sin M).
However, since
|sin E − sin M| < |E − M| = |ε sin E| < ε,
it follows that
|E − E₁| < ε².
* This formula is obtained as follows: Since the circle sector ONP₀ has the
area J₀ = ½a²E and each ordinate of the elliptical sector ONP is equal to b/a times
the circle ordinate at that point, the area of the sector ONP is also equal to b/a
times J₀, i.e., ½abE. Consequently, the area J of the elliptical sector SNP that
is smaller than ONP by the area ½ey = ½abε sin E of the triangle OSP, is
J = ½abE − ½ab · ε · sin E.
As the second approximation value we choose
E₂ = M + ε sin E₁.
Its deviation from E is E − E₂ = ε(sin E − sin E₁). However,
since
|sin E − sin E₁| < |E − E₁|
and the latter magnitude, as was just shown, is < ε², it follows that
|E − E₂| < ε³.
The third approximation value is
E₃ = M + ε sin E₂.
Its deviation from E, absolutely considered, is < ε⁴, etc.
The nth approximation value deviates from the true value by less than the
(n + 1)th power of the form number ε. The approximation values
accordingly approach the true value progressively more rapidly as ε
diminishes.
In the earth's orbit, for example, ε = 0.01674, ε³ = 0.00000469,
arc 1" = 0.00000485. Consequently:
For the earth's orbit the second approximation value is already exact to
seconds!
In the orbit of Mars, which has the fairly high form number of
0.0933, ε⁵ = 0.0000071, so that the fourth approximation value E₄
results in an error of less than 2".
After E is determined the true anomaly is calculated by the Gauss
formula.
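The successive approximations and the Gauss formula can be tried out
numerically. What follows is a minimal Python sketch, not a definitive
implementation: the function names are hypothetical, all angles are in
radians, and the half-angle formula is evaluated with atan2 (an
implementation choice made here so that W falls in the same half of the
orbit as E).

```python
import math

def kepler_iterates(M, eps, n):
    """Return [E_1, ..., E_n] with E_1 = M + eps*sin(M), E_(k+1) = M + eps*sin(E_k)."""
    E = M + eps * math.sin(M)
    iterates = [E]
    for _ in range(n - 1):
        E = M + eps * math.sin(E)
        iterates.append(E)
    return iterates

def true_anomaly(E, eps):
    """Gauss formula tan(W/2) = sqrt((1+eps)/(1-eps)) * tan(E/2); W in [0, 2*pi)."""
    half_W = math.atan2(math.sqrt(1 + eps) * math.sin(E / 2),
                        math.sqrt(1 - eps) * math.cos(E / 2))
    return (2 * half_W) % (2 * math.pi)
```

For the earth's form number 0.01674 a handful of iterations already
reproduces E to well below a second of arc, in line with the error bound
derived above.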
Note. Kepler's problem is of the greatest importance for
astronomy. It forms the basis, for example, for the determination of
the equation of time for a given moment of time.
[The equation of time is conventionally understood to be the
difference between mean and true local time or also the difference
between the right ascensions α and α₀ of the true and mean sun:
e = M.L.T. − T.L.T. = α − α₀.]
The calculation is based on the following seven steps:
1. Determination of the right ascension α₀ of the mean sun for the
given moment of time from its daily increase of 3 min 56.55536 sec and
its value for a fixed moment of time (on January 1, 1925, at midnight
M.G.T., α₀ was 18 hr 40 min 30 sec).
2. Calculation of the mean anomaly M according to the (definition)
equation α₀ = M + Π, where Π is the longitude of the true sun at
perigee. (Π on January 1, 1925, was 281° 39' 2" and it increases
annually by 1' 1.9".)
3. Determination of the eccentric anomaly E from Kepler's
equation E − ε sin E = M with ε = 0.01674.
4. Calculation of the true anomaly W from the Gauss formula
$$\tan\frac{W}{2} = \sqrt{\frac{1 + \varepsilon}{1 - \varepsilon}}\,\tan\frac{E}{2}.$$
5. Determination of the longitude L of the true sun according to
the equation L = W + Π.
6. Determination of the right ascension α of the true sun in
accordance with the equation tan α = cos i tan L obtained from the
astronomical triangle having the hypotenuse L and the legs α and δ; in the
equation, i represents the inclination of the ecliptic.
7. Calculation of the equation of time e from e = α − α₀.
Example. The equation of time for the 2nd of December, 1925
at 4:00 p.m. Central European Time.
α₀ = 16 hr 43 min 44 sec = 250° 56', M = 329° 16' 1",
E₁ = 328° 46' 38", E₂ = E = 328° 46' 12", W = 328° 16' 10",
L = 249° 56' 9", α = 248° 17' 28" = 16 hr 33 min 10 sec,
e = −10 min 34 sec.
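Steps 3 to 7 can be strung together in a few lines. The following Python
sketch is illustrative only: the function names are hypothetical, the
inclination of the ecliptic is assumed to be 23° 27' (the text does not
state the value used), and Π ≈ 281° 39' 59" for December 1925 is an
estimate obtained here from the annual increase quoted in step 2.

```python
import math

def dms(d, m=0.0, s=0.0):
    """Degrees, minutes, seconds to decimal degrees."""
    return d + m / 60 + s / 3600

def equation_of_time_minutes(M_deg, Pi_deg, alpha0_deg,
                             eps=0.01674, obliquity_deg=23.45, iterations=4):
    """Equation of time e = alpha - alpha0 in minutes of time (steps 3 to 7)."""
    M = math.radians(M_deg)
    E = M + eps * math.sin(M)                      # step 3: Kepler iteration
    for _ in range(iterations - 1):
        E = M + eps * math.sin(E)
    W = 2 * math.atan2(math.sqrt(1 + eps) * math.sin(E / 2),
                       math.sqrt(1 - eps) * math.cos(E / 2))   # step 4
    L = (W + math.radians(Pi_deg)) % (2 * math.pi)              # step 5
    i = math.radians(obliquity_deg)
    # step 6: tan(alpha) = cos(i) * tan(L), alpha in the same quadrant as L
    alpha = math.atan2(math.cos(i) * math.sin(L), math.cos(L)) % (2 * math.pi)
    diff_deg = (math.degrees(alpha) - alpha0_deg + 180) % 360 - 180  # step 7
    return diff_deg * 4                            # 1 degree = 4 min of time
```

With the data of the example, `equation_of_time_minutes(dms(329, 16, 1),
dms(281, 39, 59), dms(250, 56))` comes out close to the −10 min 34 sec
quoted above; small differences in the last seconds are to be expected
from rounding in the printed intermediate values.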
Star Setting
Calculate the time and azimuth of setting of a known star for a given place
and day.
Solution. The method of calculation can best be illustrated by a
numerical example. Thus, let us consider a more definite form of the
problem:
On the 31st of December, 1932, when did Saturn set in Nördlingen,
Bavaria (φ = 48° 51.1', λ = 10° 29.4')? The nautical almanac gives
the following data for December 31, 1932 at midnight, mean
Greenwich time: right ascension of Saturn α = 20 hr 25 min 30 sec (hourly
increase = 1.2 sec), declination of Saturn δ = 19° 47.4' S (hourly
decrease 0.06'), right ascension of the mean sun α₀ = 18 hr 36 min 50 sec
(hourly increase = 9.86 sec).
At the moment of setting the star is already in reality a certain
distance h below the horizon (SN) as a result of atmospheric refraction.
The horizontal refraction h can be set at an average of 35', but in
precise measurements special refraction tables must be consulted.
It follows from the nautical triangle PZ* (in which PZ = b =
90° − φ represents the complement of the latitude φ, P* = p = 90°
+ δ the pole distance, Z* = z = 90° + h the zenith distance,
∠ZP* = t the hour angle, and ∠PZ* = a the azimuth of the star),
according to the cosine theorem, that
cos z = cos b cos p + sin b sin p cos t.
If we introduce the magnitudes h, φ, δ here instead of z, b, p, we obtain
$$\cos t = \tan\varphi\,\tan\delta - \frac{\sin h}{\cos\varphi\,\cos\delta}.$$
First we calculate the approximate time t of setting, taking for the
moment of setting δ = 19° 47.4'. We then obtain from the formula
we have found (assuming h = 35'), t = 66° 42.8' = 4 hr 26 min 51 sec
and for the time angle T of the moment of setting
T = 16 hr 26 min 51 sec.
From this we get for the sidereal time Θ (i.e., the time angle of the
vernal equinox) the approximate value
Θ = T + α = 36 hr 52 min 21 sec,
and thus for the mean local time of setting
M.L.T. = Θ − α₀ = 18 hr 15 min 31 sec
and for the mean Greenwich time
M.G.T. = M.L.T. − (λ = 41 min 58 sec) = 17 hr 33 min 33 sec.
At the moment of setting, then, approximately 17.55 hr have gone by
since midnight mean Greenwich time. In these 17.55 hr the three
magnitudes α, δ, and α₀ increase by 21 sec, −1.1', 2 min 53 sec, so
that at the moment of setting they have the values
α = 20 hr 25 min 51 sec, δ = 19° 46.3',
α₀ = 18 hr 39 min 43 sec.
The calculation must now be repeated with these exact values. This
gives
T = 16 hr 26 min 57 sec
α = 20 hr 25 min 51 sec
Θ = 36 hr 52 min 48 sec
α₀ = 18 hr 39 min 43 sec
M.L.T. = 18 hr 13 min 5 sec
M.G.T. = 17 hr 31 min 7 sec.
The sought-for azimuth a is computed from the sine formula
sin a : sin t = sin p : sin z
and comes out to be
a = 120° 10'.
Result. Saturn set at 18 hr 31.1 min C.E.T. at an azimuth of
S 59° 50' W.
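The two trigonometric steps of the example (hour angle from the cosine
theorem, azimuth from the sine formula) can be reproduced as follows.
This is a minimal Python sketch under stated assumptions: the function
names are hypothetical, the declination is entered as a southern
declination in degrees so that p = 90° + δ as in the text, and the
obtuse solution of the sine formula is chosen, as in the example.

```python
import math

def setting_hour_angle(phi_deg, delta_south_deg, h_deg=35 / 60):
    """Hour angle t (degrees) at setting: cos t = tan(phi)tan(delta) - sin h/(cos(phi)cos(delta))."""
    phi = math.radians(phi_deg)
    delta = math.radians(delta_south_deg)
    h = math.radians(h_deg)
    cos_t = (math.tan(phi) * math.tan(delta)
             - math.sin(h) / (math.cos(phi) * math.cos(delta)))
    return math.degrees(math.acos(cos_t))   # raises ValueError if the star never sets

def setting_azimuth(t_deg, delta_south_deg, h_deg=35 / 60):
    """Azimuth a (degrees) from sin a : sin t = sin p : sin z, obtuse solution."""
    t = math.radians(t_deg)
    delta = math.radians(delta_south_deg)
    h = math.radians(h_deg)
    # sin p = sin(90 deg + delta) = cos(delta), sin z = sin(90 deg + h) = cos(h)
    sin_a = math.sin(t) * math.cos(delta) / math.cos(h)
    return 180.0 - math.degrees(math.asin(sin_a))
```

Dividing the hour angle by 15 converts it to hours; the remaining steps
of the example are additions and subtractions of times.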
Note. The method described is naturally just as well suited to the
determination of the rising time or the time at which a star attains a
prescribed altitude. If it is specifically desired to determine the
moment of culmination, the logarithmic calculation can be dispensed
with, since the time angle of culmination, T = 12 hr, is known.
The Problem of the Sundial
To construct a sundial.
First we will consider the two simplest forms of sundial: the
horizontal dial and the vertical meridional dial. In the first the plane of
the dial E is horizontal, in the second vertical, specifically through the
eastern and western points of the horizon. The earth's axis is
represented by a pin, the gnomon or style that casts a shadow on E. At noon
the shadow is situated at its center position, the meridian line of the
dial plane, and at t hr before or after noon forms the "shadow angle"
s or σ, respectively, with the meridian line.
The problem is to determine the relation between the time t and
the shadow angle.
We will call the plane formed by the sun and the earth's axis (the
gnomon) the shadow plane, since the shadow must lie in this plane.
At noon the shadow plane at its central position passes through the
north and south points of the horizon and at time t forms the angle t
(t hr = 15t°) with its central position.
In the figure let US, UO, and UZ be segments running from U
toward the southern point, the eastern point, and the zenith of the
horizon, specifically in such manner that SZ represents the gnomon;
thus ∠USZ represents the latitude φ of the place and SOZ the shadow
plane, so that SO is the shadow and ∠USO the shadow angle s of the
horizontal dial, ZO the shadow and ∠UZO the shadow angle σ of the
vertical meridional dial. The angle t between the shadow plane
SOZ and its meridional position SUZ is the angle UFO that is
formed with UF by the perpendicular OF dropped from O to SZ. If
we select SZ as the unit length and, for the sake of brevity, set
cos φ = o, sin φ = i, it follows from the right triangle SUZ that
US = o, UZ = i, UF = oi, from the right triangle FUO that UO = oi
tan t, and from the right triangles USO and UZO that UO = o tan s
and UO = i tan σ. If we set the three values for UO equal to each
other, we get the equations
(1) tan s = i tan t, (2) tan σ = o tan t,
which contain the sought-for relations between the time t and the
shadow angles s and σ, respectively.
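Equations (1) and (2) lend themselves to a direct tabulation of the hour
lines. The following Python sketch is only an illustration: the function
names are hypothetical, time is counted in hours from noon (1 hr = 15°),
and atan2 is used here so that the 6 o'clock line (t = 90°) causes no
division by zero.

```python
import math

def horizontal_shadow_angle(phi_deg, hours_from_noon):
    """Shadow angle s (degrees) of the horizontal dial: tan s = sin(phi) * tan t."""
    phi = math.radians(phi_deg)
    t = math.radians(15.0 * hours_from_noon)
    return math.degrees(math.atan2(math.sin(phi) * math.sin(t), math.cos(t)))

def vertical_shadow_angle(phi_deg, hours_from_noon):
    """Shadow angle sigma (degrees) of the vertical meridional dial: tan sigma = cos(phi) * tan t."""
    phi = math.radians(phi_deg)
    t = math.radians(15.0 * hours_from_noon)
    return math.degrees(math.atan2(math.cos(phi) * math.sin(t), math.cos(t)))
```

Calling either function for hours_from_noon = 1, 2, ..., 6 gives the
angles at which the hour lines are to be drawn on one side of the
meridian line; the other side is symmetric.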
In order to construct the dial we compute, in accordance with (1) or
(2), the shadow angle corresponding to different times t, draw them
in, but write on their free leg not s or σ, but the corresponding times t.
It is also possible to use a purely graphic method. On an arbitrary
segment AB we begin at B and mark off i or o times its length to C,
draw the semicircle with the center C and the arc center B, and draw
the tangent through B which is at the same time perpendicular to AC.
Fig. 99.
If we now make the arc BT equal to the time angle t (thus, for
example, 45° for 3 hr), extend CT to the intersection J with the
tangent, and connect J with A, then ∠BAJ = ω is the shadow angle
s or σ for time t. [From △BJA it follows that BJ = BA tan ω, from
△BJC that BJ = BC tan t, so that BA tan ω = BC tan t or, since BC
is i or o times BA,
tan ω = i tan t or tan ω = o tan t.
According to (1), ω is equal to s and according to (2), ω = σ.]
We carry out the described construction for as many time angles t
as possible and obtain the dial as the totality of lines A J each of which
bears written on it its corresponding time. In order to install it, we
place the drawing plane horizontally, so that BA points from the
northern point of the horizon to the southern point, or vertically, so
that BA points perpendicularly upward and the tangent runs from
west to east, and fix the style parallel to the earth's axis at A.
A Vertical Sundial at an Arbitrary Azimuth
Let us now consider the case in which a sundial is to be fastened to
a vertical house wall that does not run east and west.
In Figure 100, let UZ be a vertical line on the wall and UH a
horizontal line on the wall, US a horizontal pointing south, ZS the
gnomon, so that ∠USZ = φ and ∠UZS = b = 90° − φ; UZS is
the meridian plane and ∠SUH = a the azimuth (calculated from the
south point) of the wall; ZH is the shadow at time t, so that ZSH is
the shadow plane, and the angle that it forms with the meridian plane
ZSU is the time angle t; finally, the angle that ZH forms with ZU is
the shadow angle σ.
Fig. 100.
The three-dimensional vertex Z with the edges
ZU, ZH, ZS cuts out of the sphere with the center Z a spherical
triangle (shown in the figure) in which the side σ, the angle a, the side
b, and the angle t are four successive elements. According to the
cotangent theorem, therefore,
cos b cos a = sin b cot σ − sin a cot t
or
cos φ cot σ − sin a cot t = sin φ cos a.
This is the relation between the time t and the shadow angle σ. This
relation makes it possible to calculate a corresponding σ for every t.
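Solved for σ, the relation reads tan σ = cos φ sin t / (sin φ cos a sin t
+ sin a cos t). A minimal Python sketch of this step follows; the function
name is hypothetical, all angles are in degrees, and the sign conventions
for a and t are assumed to be those of Figure 100, which is not
reproduced here.

```python
import math

def declining_shadow_angle(phi_deg, wall_azimuth_deg, t_deg):
    """Shadow angle sigma (degrees) on a vertical wall of azimuth a.

    Rearranged from cos(phi) cot(sigma) - sin(a) cot(t) = sin(phi) cos(a).
    """
    phi = math.radians(phi_deg)
    a = math.radians(wall_azimuth_deg)
    t = math.radians(t_deg)
    numerator = math.cos(phi) * math.sin(t)
    denominator = math.sin(phi) * math.cos(a) * math.sin(t) + math.sin(a) * math.cos(t)
    return math.degrees(math.atan2(numerator, denominator))
```

For a = 90°, i.e., a wall running east and west, the expression reduces to
tan σ = cos φ tan t, which is equation (2) of the vertical meridional dial.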
The invention of the sundial is lost in antiquity. A statement by
Vitruvius (which was also found engraved on an ancient sundial
unearthed on the Via Flaminia), according to which the inventor is
the Chaldaean Berosus, is not reliable in view of the fact that sundials
were known in ancient Babylonia many centuries before Berosus.
The Shadow Curve
To determine the curve described by the shadow of a point of a rod in the
course of a day, when the rod is erected at a place of latitude φ and the
declination of the sun for the day has a value of δ.
Solution. We select the perpendicular from the point of the rod
to the horizon of the place as the unit length and the base point O of
the perpendicular as the origin of a right-angle coordinate system
whose x-axis runs toward the north point and whose y-axis runs
toward the west point of the horizon. At the moment in which the
sun (☉) has the azimuth S a° E and the zenith distance z, the distance
of the shadow from O is tan z, and the abscissa and ordinate,
respectively, of the shadow are
x = tan z cos a, y = tan z sin a.
In the nautical triangle PZ☉ the latitude complement PZ = b and
the pole distance P☉ = p = 90° − δ are constant. The zenith
distance Z☉ = z, the azimuth supplement PZ☉ = 180° − a and
the hour angle ZP☉ = t are variable. We find the equation of the
shadow curve by expressing sin t and cos t in terms of x and y and
introducing the resulting expressions into the equation
cos² t + sin² t = 1.
We abbreviate sin φ, cos φ, and tan φ, as i, o, and q, respectively,
and sin p, cos p, and tan p, as I, O, and Q, respectively. If we then
apply to the nautical triangle the sine theorem, cosine theorem, and
cotangent theorem in that order, we obtain the three equations
sin a sin z = sin p sin t,
cos z = cos p cos b + sin p sin b cos t,
−cos b cos a = sin b cot z − sin a cot t.
We divide the first by the second and obtain
$$\sin a\,\tan z = \frac{\tan p\,\sin t}{\sin\varphi + \cos\varphi\,\tan p\,\cos t}$$
or
$$\text{(1)}\qquad y = \frac{Q\sin t}{i + oQ\cos t}.$$
We multiply the third by −tan z and obtain
sin φ · cos a tan z = sin a tan z · cot t − cos φ
or
$$\text{(2)}\qquad ix = y\,\frac{\cos t}{\sin t} - o.$$
From (1) and (2) we find
$$Q\cos t = \frac{o + ix}{i - ox}, \qquad Q\sin t = \frac{y}{i - ox},$$
and from this, in accordance with what was stated above, we obtain
(o + ix)² + y² = Q²(i − ox)²
as the equation of the shadow curve. We solve for y² and obtain
y² = (Q²i² − o²) − 2io(Q² + 1)x + (Q²o² − i²)x²
or, if we go on to divide by o²,
$$\frac{y^2}{o^2} = (Q^2q^2 - 1) - 2q(Q^2 + 1)x + (Q^2 - q^2)x^2.$$
To put this equation into a simpler form, we introduce a new
coordinate system X, Y whose origin U is situated at the apex of the
curve, i.e., at the point where the shadow lies at noon; the X-axis runs
toward the south and the Y-axis toward the west. When the sun is
at meridian, its zenith distance is p − b, and thus
$$OU = \alpha = \tan(p - b) = \frac{\tan p - \tan b}{1 + \tan p\,\tan b} = \frac{Qq - 1}{Q + q}.$$
We accordingly introduce
x = α − X, y = Y
into the above curve equation and obtain
$$\frac{Y^2}{o^2} = 2Q(1 + q^2)X + (Q^2 - q^2)X^2$$
or, if we write the first parenthesis as 1/o² and the second as
$$\frac{1}{O^2} - \frac{1}{o^2}$$
and multiply the equation by o²,
$$Y^2 = 2QX - \left(1 - \frac{o^2}{O^2}\right)X^2.$$
The vertex equation of the shadow curve thus reads
$$Y^2 = 2\tan p\cdot X - \left(1 - \frac{\cos^2\varphi}{\cos^2 p}\right)X^2.$$
The curve is consequently a conic section with the half parameter tan p and
the form number (eccentricity) cos φ/cos p.
If the latitude is equal to the polar distance of the sun, then the shadow
describes a parabola; at higher latitudes it describes an ellipse, and at lower a
hyperbola.
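The result can be checked numerically by computing shadow points from
the formulas for Q cos t and Q sin t and substituting them into the
vertex equation. The following Python sketch is illustrative only: the
function names are hypothetical, angles are in degrees, the declination
must differ from zero (otherwise tan p is infinite), and the absolute
value of cos p is used for the form number so that southern declinations
are also covered, which is an assumption made here.

```python
import math

def shadow_point(phi_deg, delta_deg, t_deg):
    """Shadow coordinates (x north, y west) for hour angle t, rod height 1."""
    phi = math.radians(phi_deg)
    p = math.radians(90.0 - delta_deg)          # pole distance of the sun
    t = math.radians(t_deg)
    i, o, Q = math.sin(phi), math.cos(phi), math.tan(p)
    # Solved from Q cos t = (o + i x)/(i - o x) and Q sin t = y/(i - o x).
    x = (Q * i * math.cos(t) - o) / (i + o * Q * math.cos(t))
    y = Q * math.sin(t) * (i - o * x)
    return x, y

def shadow_conic(phi_deg, delta_deg):
    """Half parameter, form number and type of the shadow curve."""
    phi = math.radians(phi_deg)
    p = math.radians(90.0 - delta_deg)
    half_parameter = math.tan(p)
    ecc = math.cos(phi) / abs(math.cos(p))
    if math.isclose(ecc, 1.0):
        kind = "parabola"
    elif ecc < 1.0:
        kind = "ellipse"
    else:
        kind = "hyperbola"
    return half_parameter, ecc, kind
```

With X = α − x and Y = y, where α is the x-value at t = 0, every computed
point should satisfy Y² = 2 tan p · X − (1 − ε²)X² up to rounding, ε being
the form number returned by the second function.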
Solar and Lunar Eclipses
To determine the beginning and end of a solar eclipse, together with the
maximum fraction of the solar disc that is obscured, if the right ascensions,
declinations, and radii of the sun and moon are known for two moments in time
sufficiently close to the time of the eclipse.
Example. At the famous solar eclipse that occurred at Athens
during the Peloponnesian War on August 3, 431 B.C., the magnitudes
mentioned had, at 4:30 p.m. and 5:30 p.m. mean Athenian time, the
values
A₀ = 126° 51' 52", Δ₀ = 19° 23' 46", R₀ = 15' 52",
α₀ = 126° 40' 55", δ₀ = 19° 38' 58", r₀ = 15' 38.5"
and
A₁ = 126° 54' 21", Δ₁ = 19° 23' 11", R₁ = 15' 52",
α₁ = 127° 8' 49", δ₁ = 19° 24' 30", r₁ = 15' 36.5".
A solar eclipse can only occur at a time when the moon is sufficiently
close to the sun on the celestial sphere, i.e., at a time when the
differences a = α − A and d = δ − Δ between the right ascensions
and declinations of the two bodies are sufficiently small.
The spherical cosine theorem gives for the spherical distance z of
the midpoints of the two bodies (their central axis) the formula
cos z = sin Δ sin δ + cos Δ cos δ cos a.
We replace cos z and cos a here by
$$1 - 2\sin^2\frac{z}{2} \quad\text{and}\quad 1 - 2\sin^2\frac{a}{2}$$
and obtain
$$1 - 2\sin^2\frac{z}{2} = \cos d - 2\cos\Delta\cos\delta\,\sin^2\frac{a}{2}.$$
If we now write 1 − 2 sin² (d/2) for cos d, we obtain
$$\sin^2\frac{z}{2} = \cos\Delta\cos\delta\,\sin^2\frac{a}{2} + \sin^2\frac{d}{2}.$$
If we now consider that, according to our assumption, a and d and,
therefore, also z are small angles that in no case exceed 1°, we can
substitute the angles themselves for their sine (No. 15) and write
z² = a² cos Δ cos δ + d².
If in addition to this we introduce the abbreviations
$$\sqrt{\cos\Delta\cos\delta} = g \quad\text{and}\quad ag = x$$
and substitute y for d, we obtain the simple equation
z² = x² + y².
The magnitudes a, x, y, and z are most conveniently measured in
angular seconds.
If the right ascensions and declinations of the moon and the sun for
two moments of time sufficiently close to the time of the eclipse (the
first moment being taken as the zero point of time) are known and are,
for example, α₀, A₀, δ₀, and Δ₀ for the first moment and α₁, A₁, δ₁,
and Δ₁ for the second, then we also know the values a, d, and g, and
therefore also x = ga and y = d for these moments in time, and we can
calculate from these the hourly increases h and k of x and y. Since the
eclipse lasts only a short time, we can assume that the magnitudes x and
y change uniformly in the period of time here under consideration
and that, consequently, at time t, i.e., at t hours after moment 0,
x = x₀ + ht and y = y₀ + kt.
If we introduce these values into the above equation, it assumes the
form
z² = (x₀ + ht)² + (y₀ + kt)²,
which permits us to calculate the central axis of the two bodies for
any moment t.
The eclipse begins and ends at the moments when the central axis z
is equal to the sum s of the two radii R and r. In the period of time
under consideration the solar radius does not change (R = R₀ = R₁),
while the lunar radius exhibits the slight hourly increase p = −2",
so that
r = r₀ + pt and s = R + r = R + r₀ + pt = s₀ + pt.
We therefore obtain for the desired moment t of the beginning (and
also the end) of the eclipse the so-called
Eclipse equation:
(x₀ + ht)² + (y₀ + kt)² = (s₀ + pt)².
This quadratic equation has two roots for the unknown t; the smaller
value, t', indicates the beginning of the eclipse, and the larger, t", the end.
The maximum eclipse occurs at the moment τ in which the central axis z
reaches its minimum value ζ. Thus, we have
z² = z₀² + 2mt + n²t²,
where
z₀² = x₀² + y₀², m = x₀h + y₀k, n² = h² + k².
If we write
$$z^2 = z_0^2 - \frac{m^2}{n^2} + \left[nt + \frac{m}{n}\right]^2,$$
we see that z attains its minimum value when the bracket disappears.
We then have
$$\tau = -\frac{m}{n^2} \quad\text{and}\quad \zeta = \sqrt{z_0^2 - \frac{m^2}{n^2}}.$$
At the moment of the maximum eclipse the moon has advanced
over the solar disc by (R + r − ζ)/2R of the sun's diameter.
The fraction of the solar disc that is covered by the moon at that
moment can also be calculated easily from ζ.
Carrying out the computations for the Athenian solar eclipse, we obtain:
a₀ = −657 (−10' 57"),      a₁ = +868 (+14' 28"),
log g₀ = 9.97428,          log g₁ = 9.97462,
x₀ = −619.2,               x₁ = 818.7,
y₀ = +912 (+15' 12"),      y₁ = +79 (1' 19"),
h = x₁ − x₀ = 1438,        k = y₁ − y₀ = −833,
s₀ = 1890.5, s₁ = 1888.5,  p = s₁ − s₀ = −2
and the eclipse equation is
(−619 + 1438t)² + (912 − 833t)² = (1890.5 − 2t)²
or
2761729t² − 3292074t − 2359085 = 0
or
t² − 1.192034t = 0.8542059159.
Its roots are
t' = −0.50373, t" = 1.69576.
Converting the decimals into minutes and seconds, we obtain
−30 min 13 sec and 1 hr 41 min 45 sec, respectively.
Consequently:
Beginning of eclipse: 3 hr 59 min 47 sec,
End of eclipse: 6 hr 11 min 45 sec.
The length of the eclipse was therefore 2 hr 12 min, the moment of
maximum eclipse 5 hr 5 min 46 sec [2τ = t' + t" gives τ = 0.596].
The central axis of the sun and moon at this moment is obtained
from
ζ² = (619 − 1438 · 0.596)² + (912 − 833 · 0.596)²;
it is
ζ = √(238² + 415.5²) = 479, i.e., 8'.
The moon then covers (R + r − ζ)/2R = 1410/1904, i.e., 74% of the
central solar diameter and 67% of the solar disc.
Lunar eclipses are treated in a similar way. But here, instead of
being concerned with the sun, we are concerned with the so-called
shadow circle, i.e., the cross section of the conical shadow (the umbra)
cast by the sun-illuminated earth at the distance of the moon. The
angle radius $\mathfrak{R}$ is equal to p − κ, where p represents the lunar
parallax* and κ represents the half aperture angle of the conical shadow.
κ is the excess of the angle radius R over the parallax* P of the sun.
[In the Figure 101, let S be the center of the sun, E the center of the
earth, K the apex of the conical shadow, AB the diameter of the
shadow circle, se a tangent to the periphery of the sun and the earth,
* The lunar or solar parallax is the angle radius of the earth on the moon or
sun, respectively.
EF the perpendicular to Ss from E, and thus ∠EAe = p, ∠AEK = $\mathfrak{R}$,
and ∠FES = ∠eKE = κ. Since p is an external angle of the
triangle EKA, we have p = $\mathfrak{R}$ + κ. It also follows from △SEF that
$$\sin\kappa = \frac{SF}{SE} = \frac{Ss}{SE} - \frac{Ee}{SE}.$$
Since the minuend of the right side is the sine of the angle radius of
the sun and the subtrahend is the sine of the solar parallax, it follows
that
sin κ = sin R − sin P
or, because the angle involved is so small (κ is smaller than 16.2',
R < 16.3', and P < 8.9"),
κ = R − P,
as was asserted above.]
The right ascension of the center of the shadow circle is the right
ascension of the sun increased or diminished by 180° and the
declination is the negative of the solar declination.
In order to take account of the atmospheric refraction, in computing
a lunar eclipse the theoretical value for the radius of the shadow
circle given above, $\mathfrak{R}$ = p + P − R, must be replaced by a value
2% greater.
Sidereal and Synodic Revolution Periods
To determine the synodic revolution period of two coplanar rotation rays for
which the sidereal revolution periods are known.
A rotation ray is a line segment AB of invariable length the end
point B of which rotates about the starting point A in a plane E at a
constant rate of revolution, while the starting point either remains at
rest or describes a curve of plane E. Using a well-known astronomical
expression we call the time T in the course of which the rotation ray
AB describes one complete revolution of 360° its sidereal revolution
period.
Let a second rotation ray of the plane E with the starting point a
and the end point b have the sidereal revolution period t (<T).
We will consider the angle that the two rays form with each other
at a given moment of time. The time s at the end of which they once
again form the same angle we will call the synodic revolution period of the
two rays or the synodic revolution period of the one ray with respect
to the other.
In order to find this we will imagine an auxiliary rotation ray a'b'
whose starting point a' always coincides with A and whose direction
always agrees with that of ab, and we will now consider the relative
rotation of this auxiliary ray with respect to AB. Since the rotation
of a'b' (or ab) in the unit time is equal to 360°/t and that of AB is
360°/T, the relative rotation of a'b' with respect to AB in each time
unit is
$$\text{(1)}\qquad \delta = \left(\frac{1}{t} - \frac{1}{T}\right)360^\circ.$$
If a'b' resumes the same position with respect to AB at the end of s units
of time, then sδ must equal 360° or
$$\text{(2)}\qquad \delta = \frac{1}{s}\,360^\circ.$$
From (1) and (2) it follows that
$$\frac{1}{s} = \frac{1}{t} - \frac{1}{T} \quad\text{or}\quad s = \frac{Tt}{T - t},$$
and thus the synodic revolution period s is represented as a function of
the two sidereal revolution periods T and t.
This unpretentious problem, the solution to which is also a model of
brevity and simplicity, nevertheless possesses noteworthy applications,
four of which we will discuss.
Problem 1. The hands of a clock are superimposed one on the
other at exactly 12:00; when is the next time they are exactly
superimposed one on the other ?
Here let AB be the small hand, ab = Ab the big hand, T = 12 hr,
t = 1 hr, thus s = 12/11 hr = 1 1/11 hr = 1 hr 5 min 27 3/11 sec.
The event takes place at 5 min 27 3/11 sec after 1:00.
Problem 2. From the synodic revolution period (583.9 days) of Venus,
determine its sidereal revolution period.
The sidereal revolution period of a planet is understood to mean
the time in which the rotation ray sun-planet makes one complete
revolution. The synodic revolution period of the planet is
understood to mean the time s at the end of which the three celestial bodies
sun, earth, planet are once again in the same position with respect to
one another.
Here AB is the rotation ray sun-earth, ab the rotation ray sun-Venus,
and T = 365¼ days. The synodic revolution period s of Venus has
been determined by observations. Its sidereal revolution period t is
obtained from the relation
$$\frac{1}{t} - \frac{1}{T} = \frac{1}{s}$$
as 224.7 days.
Problem 3. To determine the relation between the solar day and the
sidereal day.
A solar day is the time interval between two successive culminations
of the sun, a sidereal day the time interval between two successive
culminations of a fixed star or the time interval within which the
earth rotates once about its own axis.
Let the midpoint of the sun be S, that of the earth E, a marked point
of the earth's equator O. Here AB is the rotation ray SE, ab the
rotation ray EO, T is here 365¼ days (1 year, the period of time in
which AB = SE completes one full revolution of 360°), t the length of
a sidereal day, and s the length of a solar day (the period of time at the
end of which the ray EO is once again in the same position relative
to the sun). From
$$\frac{1}{s} = \frac{1}{t} - \frac{1}{T}$$
we obtain
$$\frac{T}{t} - \frac{T}{s} = 1.$$
T/t represents the number of sidereal days, T/s the number of solar
days, that occur in a year. The sought-for relation can accordingly
be stated in the following form:
A year contains one more sidereal day than the number of solar days (365¼
solar days, 366¼ sidereal days).
Problem 4. What is the relation between the sidereal and synodic month?
A sidereal month is the time it takes the rotation ray EM (earth-
moon) to complete one full revolution. A synodic month is the time
interval between two successive new moons (full moons). Here AB is
the rotation ray SE, ab the rotation ray EM, T = 365¼ days, t the
length of the sidereal month, s the length of the synodic month. The
sought-for relation accordingly reads
$$\frac{1}{t} - \frac{1}{s} = \frac{1}{T}.$$
Verbally it can be stated as follows: The reciprocal of the synodic month
subtracted from the reciprocal of the sidereal month is equal to the reciprocal of
the sidereal year.
This can be confirmed for the numerical values:
t = 27.3217 days, s = 29.5306 days, T = 365.2564 days.
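The relation between the periods can be checked against the four
problems with a few lines of Python. This is only a sketch; the function
names are hypothetical, and any consistent time unit may be used.

```python
from fractions import Fraction

def synodic_period(T, t):
    """Synodic period s = T*t/(T - t) of two rays with sidereal periods T > t."""
    return T * t / (T - t)

def faster_sidereal_period(T, s):
    """Sidereal period t of the faster ray from 1/t = 1/T + 1/s."""
    return T * s / (T + s)

# Problem 1: hour hand (T = 12 hr) and minute hand (t = 1 hr), kept exact.
clock_hours = synodic_period(Fraction(12), Fraction(1))   # 12/11 hr
# Problem 2: Venus from the earth's year and the observed synodic period.
venus_days = faster_sidereal_period(365.25, 583.9)
# Problem 4: synodic month from the sidereal year and the sidereal month.
synodic_month = synodic_period(365.2564, 27.3217)
```

The exact fraction for the clock corresponds to 1 hr 5 min 27 3/11 sec;
the other two values can be compared with the 224.7 days and the
29.5306 days quoted in the text.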
Progressive and Retrograde Motion of the Planets
When does a planet pass from progressive to retrograde motion (or conversely,
from retrograde to progressive motion) ?
The planetary orbits, considered as circles on the ecliptic plane,
their orbital radii and revolution periods, as well as their positions at a
given moment of time serving as the starting point of the time record
are assumed to be known.
Solution. The motion of a planet is conventionally called
progressive when it travels among the fixed stars of the celestial sphere
like the sun, i.e., from west to east, and retrograde when it travels in
the opposite direction, i.e., from east to west. The transition from
one motion to the other occurs when the planet appears to be
stationary for a brief period among the fixed stars, in other words,
when the sight-line "earth-planet" retains the same direction for a
short period of time.
The earth and the planet have the orbital radii r and R, respectively,
and the revolution periods u and U, and the orbital radii, which are
rotating about the sun, accordingly have the rates of revolution
k = 2π/u and K = 2π/U.
The solution to the problem is most conveniently obtained by the
vector method. Let O, p, P be the midpoints of the sun, the earth,
and the planet, and $\mathfrak{r} = \overrightarrow{Op}$ and $\mathfrak{R} = \overrightarrow{OP}$ the vectorial distances of the
earth and the planet from the sun. The vectors $\mathfrak{r}$ and $\mathfrak{R}$ are
"rotational vectors," i.e., vectors with the constant lengths r and R, that
rotate in the ecliptic plane E with constant velocities k and K,
respectively, about their fixed point of origin O. For the vectors $\dot{\mathfrak{r}}$
and $\dot{\mathfrak{R}}$ of the orbital velocities we again select O as the starting point.
The magnitudes of the velocities $\dot{\mathfrak{r}}$ and $\dot{\mathfrak{R}}$ are kr and KR, the directions
always perpendicular to the directions of $\mathfrak{r}$ and $\mathfrak{R}$. If we then imagine
two vectors $\mathfrak{r}_0$ and $\mathfrak{R}_0$ situated in E, originating at O, and possessing
the magnitudes r and R, that are always 90° in advance of the rotational
vectors $\mathfrak{r}$ and $\mathfrak{R}$, then
$$\dot{\mathfrak{r}} = k\,\mathfrak{r}_0 \quad\text{and}\quad \dot{\mathfrak{R}} = K\,\mathfrak{R}_0.$$
The vectorial distance of the planet from the earth is
$\mathfrak{z} = \overrightarrow{pP} = \overrightarrow{OP} - \overrightarrow{Op} = \mathfrak{R} - \mathfrak{r}$; the relative velocity of the planet with respect to
the earth (i.e., the velocity of the planet for an observer on the earth,
for whom the earth is at rest) is thus
$$\dot{\mathfrak{z}} = \dot{\mathfrak{R}} - \dot{\mathfrak{r}} = K\,\mathfrak{R}_0 - k\,\mathfrak{r}_0.$$
Let the angle by which the vector $\mathfrak{R}$ is in advance of the vector $\mathfrak{r}$
at time 0 be α and at time t let it be ξ. Then
$$\xi = \alpha + \kappa t, \tag{1}$$
where κ = K − k represents the angle by which the vector $\mathfrak{R}$ rotates
in advance of the vector $\mathfrak{r}$ in the unit time.
The motion of the planets is then progressive when the vector $\mathfrak{z}$
rotates in a counterclockwise direction for an observer at the North
Pole and retrograde when it rotates in a clockwise direction for this
observer, i.e., in accordance with whether the apex S of the vector
$\overrightarrow{OS} = \mathfrak{z} \times \dot{\mathfrak{z}}$ that is perpendicular to E lies above or below the
ecliptic plane. Now,
$$\mathfrak{z} \times \dot{\mathfrak{z}} = (\mathfrak{R} - \mathfrak{r}) \times (\dot{\mathfrak{R}} - \dot{\mathfrak{r}}) = (\mathfrak{R} - \mathfrak{r}) \times (K\mathfrak{R}_0 - k\mathfrak{r}_0) = \mathfrak{p} - \mathfrak{q}$$
with
$$\mathfrak{p} = K\,\mathfrak{R} \times \mathfrak{R}_0 + k\,\mathfrak{r} \times \mathfrak{r}_0, \qquad \mathfrak{q} = K\,\mathfrak{r} \times \mathfrak{R}_0 + k\,\mathfrak{R} \times \mathfrak{r}_0,$$
it being assumed that the vectors $\mathfrak{p}$ and $\mathfrak{q}$ also have their starting point
at O. The vector $\mathfrak{p}$ has the magnitude $KR^2 + kr^2$ and lies above E.
The vector $\mathfrak{q}$, as may be seen from Figure 102, lies above or below E
accordingly as cos ξ is positive or negative, and has the magnitude
$(K + k)Rr\,|\cos\xi|$. The vector $\mathfrak{z} \times \dot{\mathfrak{z}}$ thus lies above or below E
accordingly as $KR^2 + kr^2 - (K + k)Rr\cos\xi$ is positive or negative,
i.e., accordingly as
$$\cos\xi \lessgtr \frac{KR^2 + kr^2}{(K + k)Rr}.$$
Now, according to Kepler's third law,
$$U^2 : u^2 = R^3 : r^3 \quad\text{or}\quad k^2 : K^2 = R^3 : r^3,$$
so that the ratio k : K on the right side of the obtained inequality can be
replaced by $W^3 : w^3$, where $W = \sqrt{R}$, $w = \sqrt{r}$. We thus obtain for
this right side the value
$$\frac{w^3W^4 + W^3w^4}{(W^3 + w^3)W^2w^2} = \frac{(W + w)Ww}{W^3 + w^3} = \frac{Ww}{W^2 + w^2 - Ww} = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}},$$
and our conclusion reads:
The motion of a planet is progressive or retrograde accordingly as
$$\cos\xi \lessgtr \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}}.$$
At the moments when
$$\cos\xi = \frac{\sqrt{Rr}}{R + r - \sqrt{Rr}} \tag{2}$$
the one type of motion changes into the other.
Example. How many days after upper conjunction does Venus become
retrograde ?
Here r = 149, R = 107.5 million kilometers; k and K, respectively,
in degrees per day are 0.9856° and 1.602°, κ thus equals 0.6164° per day, with
α = 180° and $\sqrt{Rr}/(R + r - \sqrt{Rr}) = 0.974$. From (1) and (2),
since cos (180° + κt) = −cos κt,
we therefore obtain cos 0.6164t = −0.974 and from this t = 271 days.
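The arithmetic of this example can be retraced with a short Python sketch. The function names are illustrative only; the inputs are the rounded values given in the example, and the starting angle of 180° (upper conjunction) is built into the second function.

```python
import math


def stationary_cosine(R: float, r: float) -> float:
    """Right side of (2): sqrt(Rr) / (R + r - sqrt(Rr))."""
    g = math.sqrt(R * r)
    return g / (R + r - g)


def days_from_upper_conjunction(R, r, K_deg, k_deg):
    """Days until the stationary point is first reached, starting at 180 degrees."""
    kappa = K_deg - k_deg                 # relative rate in degrees per day
    c = stationary_cosine(R, r)
    # The angle is 180 + kappa*t, so cos(angle) = c means cos(kappa*t) = -c.
    return math.degrees(math.acos(-c)) / kappa


c = stationary_cosine(107.5, 149.0)
t = days_from_upper_conjunction(107.5, 149.0, 1.602, 0.9856)
print(f"stationary cosine: {c:.3f}, t: {t:.0f} days")
```

With these inputs the sketch should land within a day of the 271 days stated in the text.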
Lambert's Comet Problem
To express the time required for a comet to describe an arc of its parabolic
orbit by means of the focal radii and the chord connecting the end points of
the arc.
Johann Heinrich Lambert (1728-1777) in 1761 published a paper
on comet orbits in which may be found the celebrated formula bearing
his name; the formula represents the area of a parabolic focal sector
as a function of the bounding focal radii and the sector chord.
For the derivation of the Lambert formula we require a formula of
the English astronomer Barker, which we will derive first.
We begin with the amplitude equation of a parabola, $y^2 = 4kx$,
in which k represents the shortest focal radius, which is commonly
known to be one fourth of the parabola parameter.
Let us consider the sector FOP, which is enclosed by the minimum
focal radius FO, the focal radius FP = r of an arbitrary point P(x|y),
and the parabola arc OP, and in which the angle OFP = W represents
the so-called true anomaly of the point P.
Barker's problem is stated thus: Represent the area of the parabola
sector as a function of the anomaly.
In order to solve the problem we first express the sector area S in
terms of x and y. If we drop the perpendicular PQ from P to the axis,
S is the difference between the area of the half sector OPQ (cf. No. 56)
and the area of the triangle FPQ, so that
$$S = \tfrac{2}{3}xy - \tfrac{1}{2}(x - k)y \quad\text{or}\quad 6S = y(x + 3k).$$
We then express x and y in terms of W. According to the polar
coordinate theorem of the parabola, the focal radius is
$$r = \frac{p}{1 + \cos W} = \frac{k}{\cos^2\frac{W}{2}},$$
and consequently,
$$y = r\sin W = 2r\sin\frac{W}{2}\cos\frac{W}{2} = 2k\tan\frac{W}{2}$$
and
$$x = \frac{y^2}{4k} = k\tan^2\frac{W}{2}.$$
If we introduce Barker's auxiliary magnitude
$$T = \tan\frac{W}{2},$$
we obtain
$$x = kT^2, \qquad y = 2kT$$
(the equation of the parabola in a parametric form), and after
substitution of these values into the above area formula, we obtain
$$S = k^2\left(T + \tfrac{1}{3}T^3\right).$$
This is Barker's formula.
W is positive or negative accordingly as P lies above or below the
axis. In the first case, T and S are positive; in the second, negative.
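Barker's formula can be checked numerically. The sketch below compares it with the standard polar-area integral, S = ½∫r² dW with r = k/cos²(W/2), evaluated by Simpson's rule. The integral form and the sample values of k and W are assumptions introduced here for the check, not part of the original derivation.

```python
import math


def barker_area(k: float, W: float) -> float:
    """Sector area from Barker's formula, S = k^2 (T + T^3/3), T = tan(W/2)."""
    T = math.tan(W / 2)
    return k * k * (T + T ** 3 / 3)


def sector_area_numeric(k: float, W: float, n: int = 2000) -> float:
    """Simpson's rule for (1/2) * integral of r^2 dW; n must be even."""
    def integrand(w: float) -> float:
        r = k / math.cos(w / 2) ** 2      # focal radius at anomaly w
        return 0.5 * r * r

    h = W / n
    total = integrand(0.0) + integrand(W)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3


k, W = 1.0, 2.0   # hypothetical values; W in radians
print(barker_area(k, W), sector_area_numeric(k, W))
```

A negative W gives a negative area in both functions, in line with the sign convention just stated.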
Now for the solution of Lambert's problem!
Let P and P' be two points of the parabola, W and W' their
anomalies, T and T' the corresponding Barker auxiliary magnitudes,
S and S' the areas of the sectors FOP and FOP', with FP = r and
FP' = r' as the focal radii of the two points, ∠PFP' = 2ξ the angle
between them, PP' = s the connecting chord, and σ the area of the
sector PFP' enclosed by the two focal radii. Let r lie above the axis
and r' above or below it; in the first case, let r' < r, and thus in both
cases W' < W.
The area σ is then in both cases the difference S − S'.
Now, according to Barker,
$$3S = k^2(3T + T^3), \qquad 3S' = k^2(3T' + T'^3),$$
and consequently,
$$3\sigma = k^2(T - T')\,[3 + T^2 + T'^2 + TT'].$$
Using the abbreviations J, O, J', O' for
$$\sin\frac{W}{2}, \quad \cos\frac{W}{2}, \quad \sin\frac{W'}{2}, \quad \cos\frac{W'}{2}$$
and i, o for sin ξ, cos ξ, we can write the factor in parentheses as
$$(\;) = \frac{J}{O} - \frac{J'}{O'} = \frac{JO' - OJ'}{OO'} = \frac{i}{OO'},$$
and the factor in square brackets as
$$\begin{aligned}
[\;] &= 1 + T^2 + 1 + T'^2 + 1 + TT' \\
&= 1 + \frac{J^2}{O^2} + 1 + \frac{J'^2}{O'^2} + 1 + \frac{JJ'}{OO'} \\
&= \frac{O^2 + J^2}{O^2} + \frac{O'^2 + J'^2}{O'^2} + \frac{OO' + JJ'}{OO'} \\
&= \frac{1}{O^2} + \frac{1}{O'^2} + \frac{o}{OO'}.
\end{aligned}$$
If we introduce these values and, in accordance with the polar
equation, express $k/O^2$ and $k/O'^2$ as r and r', respectively, we obtain
$$3\sigma = i\,(r + r' + o\sqrt{rr'})\sqrt{rr'}.$$
Now,
$$\begin{aligned}
i^2 = (JO' - OJ')^2 &= J^2O'^2 + O^2J'^2 - 2JOJ'O' \\
&= (1 - O^2)O'^2 + (1 - O'^2)O^2 - 2JOJ'O' \\
&= O^2 + O'^2 - 2OO'(OO' + JJ') = O^2 + O'^2 - 2oOO',
\end{aligned}$$
and, since $k = rO^2 = r'O'^2$,
$$i = \frac{\sqrt{k\,(r + r' - 2o\sqrt{rr'})}}{\sqrt{rr'}}.$$
If we introduce this value into the equation found for 3σ, we obtain
$$3\sigma = (r + r' + o\sqrt{rr'})\sqrt{k\,(r + r' - 2o\sqrt{rr'})}.$$
We transform this equation further by introducing the chord s. Its
square, according to the cosine theorem, is
$$s^2 = r^2 + r'^2 - 2rr'\cos 2\xi = r^2 + r'^2 - 2rr'(2o^2 - 1),$$
i.e.,
$$s^2 = (r + r')^2 - 4rr'o^2.$$
From this we obtain
$$4rr'o^2 = (r + r' + s)(r + r' - s).$$
We abbreviate and write
$$v = \sqrt{r + r' + s}, \qquad u = \sqrt{r + r' - s},$$
obtaining
$$r + r' = \frac{v^2 + u^2}{2}, \qquad o\sqrt{rr'} = \pm\frac{uv}{2},$$
where the upper sign applies when the enclosed angle 2ξ is concave
(less than 180°) and the lower when it is convex (greater than 180°).
If we substitute these two values into our last formula for 3σ, it
finally yields
$$3\sigma = \sqrt{k}\cdot\frac{v^2 + u^2 \pm vu}{2}\cdot\frac{v \mp u}{\sqrt{2}} = \frac{\sqrt{k}}{2\sqrt{2}}\,(v^3 \mp u^3)$$
or, in complete form,
$$\sigma = \frac{\sqrt{k}}{6\sqrt{2}}\left[(r + r' + s)^{3/2} \mp (r + r' - s)^{3/2}\right].$$
This formula represents the parabola sector σ as a function of the two
bounding focal radii r and r' and the chord s connecting their end points.
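The sector formula can be tested against Barker's formula for concrete points. The following Python sketch is an illustration under stated assumptions: it places the focus at (k, 0), takes the points from the parametric form x = kT², y = 2kT, and uses the minus sign between the two powers when the enclosed angle is below 180° and the plus sign when it is above, as the derivation gives. All function names and sample anomalies are hypothetical.

```python
import math


def parabola_point(k, W):
    """Point with true anomaly W on y^2 = 4kx, from x = k T^2, y = 2 k T."""
    T = math.tan(W / 2)
    return k * T * T, 2 * k * T


def barker_area(k, W):
    """Sector area FOP from Barker's formula."""
    T = math.tan(W / 2)
    return k * k * (T + T ** 3 / 3)


def lambert_area(k, r1, r2, s, angle_below_180):
    """Sector area from the two focal radii and the chord."""
    v3 = (r1 + r2 + s) ** 1.5
    u3 = (r1 + r2 - s) ** 1.5
    sign = -1.0 if angle_below_180 else 1.0
    return math.sqrt(k) / (6 * math.sqrt(2)) * (v3 + sign * u3)


def compare(k, W, W_prime):
    """Return (S - S', chord-based value) for anomalies W' < W, in radians."""
    focus = (k, 0.0)
    P, Q = parabola_point(k, W), parabola_point(k, W_prime)
    r1, r2, s = math.dist(focus, P), math.dist(focus, Q), math.dist(P, Q)
    direct = barker_area(k, W) - barker_area(k, W_prime)
    return direct, lambert_area(k, r1, r2, s, (W - W_prime) < math.pi)


print(compare(1.0, 1.2, -0.5))   # enclosed angle below 180 degrees
print(compare(1.0, 2.5, -1.5))   # enclosed angle above 180 degrees
```

In both cases the two returned numbers should coincide up to rounding error.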
In order to use this formula to determine the time required for a
comet to complete its orbital arc, we need only introduce the value
found for σ into the Gauss formula of the Theoria motus,
$$G = \frac{2\sigma}{t\sqrt{p}\,\sqrt{1 + \mu}}$$
(cf. No. 96).
Since here p = 2k and the comet mass μ is to be set equal to zero,
we have initially
$$Gt\sqrt{k} = \sigma\sqrt{2}$$
and, as a result of substitution,
$$6Gt = (r + r' + s)^{1.5} \mp (r + r' - s)^{1.5}.$$
This remarkable formula contains the solution to the problem posed.
It is usually called the Lambert formula, although it had already been
formulated by Euler.
It states that the time required by a comet to describe an orbital arc depends
only on the arc chord and the sum of the focal radii of the ends of the arc.
According to Lagrange, Lambert's formula represents the most
beautiful and significant discovery in the theory of comet motion.
It is, in fact, of fundamental importance for the determination of
comet orbits.
This determination is carried out essentially in the following way:
The longitude and latitude of the comet are determined for three
different moments of time, together with the corresponding longitude
and distance of the sun (from the earth). Let r and r' be the
respective focal radii of the first and third time of measurement, and s the
distance between the ends of the focal radii. Then r' and s are expressed in
terms of the known magnitudes and r, and these values are
substituted into the Lambert equation, which results in an equation with
only one unknown, r. From this equation r is obtained, and then r'
and s are found from the previously mentioned expressions. This
then gives us the focus and two points of the orbit, so that it is
completely determined. When the Gauss formula is applied to one
of the points, we obtain the time at which the comet passes the
perihelion. After this has been determined, the position of the
comet for any moment of time can be obtained from the Gauss
formula.
Extremes
Steiner's Problem Concerning the Euler Number
At what value of x, if x is a positive variable, will the expression $\sqrt[x]{x}$ be at a
maximum?
Jacob Steiner posed this problem in Crelle's Journal, vol. XL; it may
also be found in his Works, vol. 2, p. 423.
Solution. According to the inequality of exponential functions
(No. 12),
$$e^{(x - e)/e} \ge 1 + \frac{x - e}{e},$$
where the equal sign applies only when x = e. The inequality is
simplified to
$$\frac{e^{x/e}}{e} \ge \frac{x}{e} \quad\text{or to}\quad e^{x/e} \ge x.$$
Here we extract the xth root and obtain
$$\sqrt[e]{e} \ge \sqrt[x]{x}.$$
Verbally expressed: The Euler number e is the number yielding the
maximum possible value for the expression $\sqrt[x]{x}$ for which x is a positive
variable.
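The result is easy to confirm by a coarse numerical scan. The sketch below samples x^(1/x) on a grid; the range and spacing of the grid are arbitrary choices made for illustration.

```python
import math


def xth_root(x: float) -> float:
    """The x-th root of x, i.e. x ** (1/x), for positive x."""
    return x ** (1.0 / x)


# Sample points from 0.5 to 10.49 in steps of 0.01.
xs = [0.5 + 0.01 * i for i in range(1000)]
best = max(xs, key=xth_root)
print(best, xth_root(best), xth_root(math.e))
```

The best grid point should lie next to e, and no sampled value should exceed the value at e itself.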
Fagnano's Altitude Base Point Problem
To inscribe in a given acute-angled triangle the triangle of minimum
perimeter.
This celebrated problem stems from I. F. Fagnano, son of the
Italian count C. Fagnano (1682-1766), who became famous as a
result of his remarkable studies of lemniscate partition.
The following solution of the problem is distinguished by its
extreme simplicity. It comes from Fr. Gabriel-Marie, author of the
excellent book Exercices de Géométrie.
Let the given triangle be ABC and let XYZ be a triangle inscribed
in it, with X, Y, and Z on BC, CA, and AB, respectively. We will
initially consider that Z is arbitrarily situated on AB; we draw its
mirror images H and K on BC and CA, respectively, and determine
the points of intersection X and Y of the connecting line HK with BC
and CA. For a fixed point Z the triangle XYZ thus formed has the
smallest perimeter of all the inscribed triangles. In fact: let X' and
Y' be two other points on BC and CA. Since ZX' and HX' are mirror
images, and also ZY' and KY', and naturally also ZX and HX, as
well as ZY and KY, the perimeters of the two inscribed triangles to be
compared can be written as
ZXYZ = HX + XY + YK = HK,
ZX'Y'Z = HX' + X'Y' + Y'K = HX'Y'K.
However, since the direct path HK from H to K is shorter than the
roundabout path HX'Y'K, the first triangle possesses a smaller
perimeter than the second.
It now merely remains to choose the point Z in such manner as to
obtain the smallest possible segment HK (which represents the
perimeter of XYZ). Now CZ is the mirror image of CH and also of
CK; likewise, ∠ZCB = ∠HCB and ∠ZCA = ∠KCA and thus
∠HCK = 2γ (where γ = ∠ACB). Segment HK is therefore the base of an isosceles
triangle (HKC) with a constant apex angle 2γ and the variable leg
s = CZ; as such it attains a minimum when CZ is at a minimum, i.e.,
when CZ is perpendicular to AB.
Since we could just as easily have carried out the investigation with X
or Y as with Z, AX is perpendicular to BC and BY to CA. The points
X, Y, Z are thus the base points of the altitudes of the triangle ABC.
Result: Of all the triangles that can be inscribed in a given acute-angled
triangle, the one with the smallest perimeter is the triangle formed by the base
points of the altitudes.
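The result can be illustrated numerically. The Python sketch below builds the triangle of the altitude base points for one acute triangle, compares its perimeter with randomly chosen inscribed triangles, and evaluates the expression 2·CZ·sin γ that the argument gives for the segment HK. The coordinates, the random sampling, and all names are assumptions chosen for the example.

```python
import math
import random


def foot(P, A, B):
    """Base point of the perpendicular dropped from P to the line AB."""
    dx, dy = B[0] - A[0], B[1] - A[1]
    t = ((P[0] - A[0]) * dx + (P[1] - A[1]) * dy) / (dx * dx + dy * dy)
    return A[0] + t * dx, A[1] + t * dy


def point_on(A, B, t):
    """Point dividing the segment AB in the ratio t : (1 - t)."""
    return A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1])


def perimeter(X, Y, Z):
    return math.dist(X, Y) + math.dist(Y, Z) + math.dist(Z, X)


A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)        # an acute-angled triangle
X, Y, Z = foot(A, B, C), foot(B, C, A), foot(C, A, B)
orthic = perimeter(X, Y, Z)

# sin(gamma) at C from the cross product of CA and CB.
ca = (A[0] - C[0], A[1] - C[1])
cb = (B[0] - C[0], B[1] - C[1])
sin_gamma = abs(ca[0] * cb[1] - ca[1] * cb[0]) / (math.hypot(*ca) * math.hypot(*cb))

rng = random.Random(0)
samples = [
    perimeter(point_on(B, C, rng.random()),
              point_on(C, A, rng.random()),
              point_on(A, B, rng.random()))
    for _ in range(2000)
]
print(orthic, 2 * math.dist(C, Z) * sin_gamma, min(samples))
```

None of the sampled inscribed triangles should have a smaller perimeter than the altitude triangle.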
Fermat's Problem for Torricelli
To find the point the sum of whose distances from the vertexes of a given
triangle is the smallest possible.
This celebrated problem was put by the French mathematician
Fermat (1601-1665) to the Italian physicist Torricelli (1608-1647),
the famous student of Galileo, and was solved by the latter in several
ways.
The simplest solution is the one obtained by the use of
Viviani's theorem: In an equilateral triangle the sum of the three
distances of a point from the sides of a triangle has a value that is independent of
the position of the point.
This value is equal to the altitude of the triangle.
Viviani (1622-1703), an Italian mathematician and physicist, was
a student of Galileo and Torricelli.
In Viviani's theorem the distance of a point from a triangle side is
reckoned as positive when it is inside the triangle and negative when
it is outside.
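Viviani's theorem, with the sign convention just described, can be checked for arbitrary points, including points outside the triangle. The sketch below is illustrative only: the side length, the vertex coordinates, and the random test points are assumptions, and the signed distance is computed from a cross product.

```python
import math
import random


def cross(U, V, W):
    """Twice the signed area of triangle UVW (positive if counterclockwise)."""
    return (V[0] - U[0]) * (W[1] - U[1]) - (V[1] - U[1]) * (W[0] - U[0])


def signed_distance(O, U, V):
    """Distance from O to the line UV, positive when O lies to the left of U->V."""
    return cross(U, V, O) / math.dist(U, V)


g = 2.0                                                   # side length (hypothetical)
P, Q, R = (0.0, 0.0), (g, 0.0), (g / 2, g * math.sqrt(3) / 2)   # counterclockwise
h = g * math.sqrt(3) / 2                                  # altitude


def distance_sum(O):
    """Sum of the signed distances of O from the sides QR, RP, PQ."""
    return (signed_distance(O, Q, R)
            + signed_distance(O, R, P)
            + signed_distance(O, P, Q))


rng = random.Random(1)
points = [(rng.uniform(-5, 5), rng.uniform(-5, 5)) for _ in range(5)]
for O in points:
    print(O, distance_sum(O), h)
```

Because the vertices are listed counterclockwise, "left of the side" coincides with "inside the triangle", which is what makes the signed sum constant.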
Proof. Let the equilateral triangle have the vertexes P, Q, and R,
the side g, the altitude h, and the area J. If x, y, z are the distances of
an arbitrary point O from the sides QR, RP, PQ, then
$$s = x + y + z$$
is the designated sum.
Fig. 105.
Now, the area of the triangle PQR is composed (additively or
subtractively) of the three component triangles OQR, ORP, OPQ, so
that we obtain the equation
$$\tfrac{1}{2}gx + \tfrac{1}{2}gy + \tfrac{1}{2}gz = J$$
no matter what position the point O may have. From this we obtain
directly
$$s = x + y + z = \frac{2J}{g} = h,$$
and thus the auxiliary theorem is proved. Now let ABC be the given
triangle. We choose the point O so that the three perpendiculars
at A, B, C to AO, BO, CO form an equilateral triangle PQR. Let O'
be any other point. Then if O'A', O'B', O'C' are the perpendiculars
dropped from O' to QR, RP, PQ, we have
$$A'O' \le AO', \qquad B'O' \le BO', \qquad C'O' \le CO',$$
where, however, the equal sign cannot apply to all three. By
addition it follows from this that
$$A'O' + B'O' + C'O' < AO' + BO' + CO'. \tag{1}$$
However, according to the auxiliary theorem as applied to the
equilateral triangle PQR,
$$AO + BO + CO \le A'O' + B'O' + C'O', \tag{2}$$
where the equals sign applies when O' is inside the triangle PQR and
the "smaller than" sign when O' is outside. From (2) and (1) we get
$$AO + BO + CO < AO' + BO' + CO',$$
so that AO + BO + CO is the smallest possible sum of the distances.
Since the quadrilaterals OBPC, OCQA, OARB are cyclic
quadrilaterals, each of the three angles BOC, COA, and AOB is equal to 120°.
The point we are looking for is accordingly the common point of intersection
of the three circle arcs with the chords BC, CA, AB and the common peripheral
angle of 120°.
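The 120° property can be observed numerically. The sketch below does not follow the geometric construction; it approximates the minimizing point with Weiszfeld's fixed-point iteration, a standard numerical method swapped in here for illustration, and then measures the three angles. The triangle coordinates and the iteration count are assumptions.

```python
import math


def distance_sum(O, vertices):
    return sum(math.dist(O, V) for V in vertices)


def weiszfeld(vertices, iterations=500):
    """Approximate the point minimizing the distance sum (Weiszfeld iteration)."""
    x = sum(v[0] for v in vertices) / len(vertices)   # start at the centroid
    y = sum(v[1] for v in vertices) / len(vertices)
    for _ in range(iterations):
        num_x = num_y = den = 0.0
        for vx, vy in vertices:
            d = math.hypot(x - vx, y - vy)
            if d == 0.0:                  # iterate landed on a vertex; stop there
                return x, y
            num_x += vx / d
            num_y += vy / d
            den += 1.0 / d
        x, y = num_x / den, num_y / den
    return x, y


def angle_at(O, U, V):
    """Angle UOV in degrees."""
    ux, uy = U[0] - O[0], U[1] - O[1]
    vx, vy = V[0] - O[0], V[1] - O[1]
    c = (ux * vx + uy * vy) / (math.hypot(ux, uy) * math.hypot(vx, vy))
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


A, B, C = (0.0, 0.0), (6.0, 0.0), (2.0, 5.0)   # all angles below 120 degrees
O = weiszfeld([A, B, C])
angles = (angle_at(O, B, C), angle_at(O, C, A), angle_at(O, A, B))
print(O, angles, distance_sum(O, [A, B, C]))
```

For a triangle with an angle of 120° or more, the iteration would instead drift toward that vertex, in line with the exceptional case treated next.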
The construction of this point is impossible when one triangle angle,
for example, ∠ACB = γ, reaches or exceeds 120°.
In that event C itself is the point O that we are looking for.
Specifically, in this case,
AC + BC < AU + BU + CU,
no matter where the point U may be.
Proof. We introduce the angles ACU = ψ and BCU = φ. If U
lies in the space enclosed by the angle ACB = γ, the sum of ψ and φ
is equal to γ; if U lies in the space enclosed by the adjacent angle of γ,
the difference between these two angles is equal to γ; and, finally, if
U lies in the space of the opposite angle from γ, then
$$\psi + \varphi = 360^\circ - \gamma.$$
Let the base points of the perpendiculars dropped from U to AC
and BC be F and G. Their distances from C are then
$$x = CU\cos\psi \quad\text{and}\quad y = CU\cos\varphi,$$
with such a distance, e.g., x, being counted as positive when cos ψ is
positive or negative when cos ψ is negative. In each case then we
have
$$AC = AF + x \quad\text{and}\quad BC = BG + y,$$
and accordingly
$$AC + BC = AF + BG + x + y.$$
Now
$$x + y = CU\cos\psi + CU\cos\varphi = CU(\cos\psi + \cos\varphi) = 2\cdot CU\cdot\cos\frac{\psi + \varphi}{2}\cos\frac{\psi - \varphi}{2}.$$
Since, according to the above, one of the two cosines of the right side
of this equation has the magnitude cos (γ/2), and this (because
γ/2 ≥ 60°) is not greater than ½, the right side has a maximum magnitude
of CU. This yields
$$AC + BC \le AF + BG + CU.$$
Since the legs AF and BG of the right triangles AUF and BUG are
smaller than the hypotenuses AU and BU, it is certainly true that
AC + BC < AU + BU + CU. Q.E.D.
Tacking Under a Headwind
How must a sailboat tack with a north wind in order to get north as quickly
as possible ?
Solution. Let the course of the boat be E γ° N (γ degrees north of
east), and let the sail
form the acute angle α with the bearing north and the angle β with
the course bearing.
First let us solve the preliminary problem: Let the maximum speed
that a sailboat can make through the wind with the most favorable sail position
be C knots; how great a speed can it make when the angle of the sail with the
bearing of the wind is α and with the axis of the boat is β?
Let the pressure exerted upon the sail by the wind when the sail is
perpendicular to the wind be P. If the sail forms an angle α differing
from 90° with the bearing of the wind, then the wind pressure P'
(which works perpendicular to the sail) is smaller. It is reasonable to
assume that the wind pressure is now equal to only sin α times P, so
that P' = P sin α. This formula, conceived by Lössl, is, however,
only approximate.
Fig. 106.
We divide P' into two components: one, p = P' sin β, in the
direction of the boat axis; the other, q = P' cos β, perpendicular to it.
Of these components p is the only relevant one for the forward motion
of the boat. Thus, pressure exercised by the wind on the boat in the
course direction has the value
$$p = P\sin\alpha\sin\beta.$$
The velocity c of the boat is proportional to this pressure:
$$c = kp = kP\sin\alpha\sin\beta,$$
where k represents the proportionality constant. For α = β = 90°
this formula becomes
$$c_{\max} = C = kP,$$
so that we can replace kP in the formula by C. The solution to our
preliminary problem thus reads
$$c = C\sin\alpha\sin\beta.$$
This formula forms the basis of the solution of the main problem. C
is here the velocity that the north wind gives to the boat when it
travels due south and the sail is perpendicular to the wind direction.
If the boat is to get as far north as possible in a given time, the
northerly component c' of the boat's velocity c must be at a maximum.
This component is, however,
$$c' = c\sin\gamma = C\sin\alpha\sin\beta\sin\gamma.$$
Consequently, what is necessary is to choose the three angles α, β, γ,
the sum of which is 90°, in such manner as to obtain the maximum
product for sin α sin β sin γ.
This reduces our task to the following problem:
When is the product of the sines of three angles of a constant concave sum at a
maximum?
The solution of this problem is very similar to that of No. 10.
It is based on the theorem: Of two angle pairs with equal concave sums
the pair possessing the higher sine product is the pair with the smaller difference
between its angles.
[It follows from the formulas that 2 sin X sin Y = cos (X — Y) —
cos (X + Y), and 2 sin x sin y = cos (x — y) — cos (x + y), where
X, Y and x, y represent the two pairs with the common sum
X+Y = x+y (< 180°).
Since the subtrahends of the right sides are equally great, the larger
right side is the one that possesses the greater minuend, i.e., in this
case, the one in which the minuend shows the smaller angle
difference.]
Let the constant sum of the three variable angles α, β, γ be 3k
(≤ 180°). Now if α, β, γ is such an angle triplet in which none of the
angles chances to equal k, then at least one, let us say α, must
necessarily be greater than k, and another, let us say β, must be smaller
than k. We form a new triplet α', β', γ' such that (1) α' = k, (2) the
pairs α', β' and α, β possess equal sums, and (3) γ' = γ. According to
the above theorem, sin α' sin β' will then be > sin α sin β, and
consequently, sin α' sin β' sin γ' will also be > sin α sin β sin γ, or
$$\sin k\,\sin\beta'\,\sin\gamma' > \sin\alpha\,\sin\beta\,\sin\gamma. \tag{1}$$
Since β' + γ' = 2k, the same theorem yields
$$\sin k\,\sin k \ge \sin\beta'\,\sin\gamma'. \tag{2}$$
Combining (1) and (2), we obtain
$$\sin k\,\sin k\,\sin k > \sin\alpha\,\sin\beta\,\sin\gamma.$$
Consequently:
The product of the sines of three angles of constant concave sum assumes its
maximum value when the angles are equal.
The solution to our sailboat problem thus reads α = β = γ = 30°.
This means that:
The axis of the boat must form a 60° angle with the bearing north, and the
sail must bisect the angle formed by the wind bearing and the boat's axis.
In these optimal positions the northerly motion is equal to exactly ⅛ the
maximum southerly motion.
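A brute-force scan illustrates the result. The sketch below evaluates the ratio c'/C = sin α sin β sin γ for whole-degree settings with α + β + γ = 90°; the restriction to whole degrees and the function name are choices made here for illustration.

```python
import math


def northerly_fraction(alpha_deg: float, beta_deg: float) -> float:
    """c'/C = sin(alpha) sin(beta) sin(gamma) with alpha + beta + gamma = 90 degrees."""
    gamma_deg = 90.0 - alpha_deg - beta_deg
    return (math.sin(math.radians(alpha_deg))
            * math.sin(math.radians(beta_deg))
            * math.sin(math.radians(gamma_deg)))


# Scan whole-degree settings with all three angles positive.
candidates = [(a, b) for a in range(1, 89) for b in range(1, 90 - a)]
best = max(candidates, key=lambda ab: northerly_fraction(*ab))
print(best, northerly_fraction(*best))
```

The scan should single out α = β = 30° (and hence γ = 30°), with a ratio of one eighth up to rounding.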
The Honeybee Cell (Problem by Réaumur)
The cell of the honeybee (cf. Figure 107) has the form of a regular
hexagonal prism that is sealed at only one end by a regular hexagon
arbpcq, while at the other end it is sealed by a roof consisting of three
congruent rhombuses PBSC, QCSA, and RASB that are inclined toward
each other and toward the axis of the prism at equal angles, in such
Fig. 107.
manner that the lateral surfaces of the prism are congruent trapezoids
(AarR, RrbB, etc.). The longest side of one such trapezoid is somewhat
more than twice as long as the diameter of the inscribed circle of the
base surface arbpcq. As a result of the regular arrangement of the
rhombuses, each of the three rhombus diagonals (SP, SQ, SR) originating
at the roof apex S forms the same angle with the axis of the prism
as the rhombus plane, and the two planes ABC and PQR are
perpendicular to the edges of the prism. Since the obtuse-angled
rhombus vertexes abut on each other at S, the diagonals mentioned
are the short rhombus diagonals.
This singular construction of the honeybee cell suggested to
naturalists like Maraldi, Réaumur, and others (at the beginning of
the eighteenth century) that the bees had chosen this design in order
to save as much as possible in the building material, i.e., in wax.
The problem posed by Réaumur in this connection to the Swiss
mathematician Koenig can be stated as:
To close a regular hexagonal prism with a roof consisting of three congruent
rhombuses in such manner as to obtain a solid of prescribed volume and minimal
surface.
Solution. Let the regular hexagonal cross section of the prism
have the side 2e, so that its shorter diagonals ab = bc = ca = 2d =
$2e\sqrt{3}$ and thus also AB = BC = CA = 2d = $2e\sqrt{3}$. Let the distance
of the plane PQR and the apex S of the roof from the plane ABC be x,
and let the short rhombus diagonals (SP = SQ = SR) be 2y.
Since the projection from SR = 2y on the axis of the prism is 2x,
and on the plane PQR is 2e, we obtain the equation
$$y^2 = e^2 + x^2. \tag{1}$$
If $\mathfrak{P}$, $\mathfrak{Q}$, $\mathfrak{R}$ are the points at which the prism edges passing through
P, Q, R intersect the plane ABC, then $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$ is a regular hexagon
with the side 2e.
First it becomes apparent that the volume of the prism undergoes
no change when the rooflike closure that has been described is chosen
instead of the plane closure $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$, since as much room is added
on the one side of the plane ABC (pyramid S-ABC) as is taken away
from the other side (the three pyramids $P$-$BC\mathfrak{P}$, $Q$-$CA\mathfrak{Q}$, $R$-$AB\mathfrak{R}$).
Only the surface changes with the change in design; the surface
decreases by the area $6e^2\sqrt{3}$ of the hexagon $A\mathfrak{R}B\mathfrak{P}C\mathfrak{Q}$, as well as by
the area of the six right triangles $P\mathfrak{P}B$, $P\mathfrak{P}C$, $Q\mathfrak{Q}C$, $Q\mathfrak{Q}A$, $R\mathfrak{R}A$,
$R\mathfrak{R}B$ (together 6ex), while it increases by the total area of the three
rhombuses PBSC, QCSA, RASB, namely $6dy = 6e\sqrt{3}\,y$. The saving
in surface area thus obtained is accordingly
$$6e^2\sqrt{3} + 6ex - 6e\sqrt{3}\,y$$
or
$$6e^2\sqrt{3} - 6e\,[y\sqrt{3} - x],$$
so that it now remains to obtain a minimum value for the expression
in the bracket
$$u = y\sqrt{3} - x$$
by an appropriate choice of x.
Now, if v is understood to be the similarly constructed expression
$x\sqrt{3} - y$, then, as a result of (1),
$$u^2 - v^2 = 2(y^2 - x^2) = 2e^2$$
or
$$u^2 = 2e^2 + v^2.$$
From this it follows that u attains a minimum (specifically $e\sqrt{2}$)
when v is equal to zero, i.e., when
$$y = x\sqrt{3}. \tag{2}$$
From (1) and (2) we obtain
$$x = e\sqrt{\tfrac{1}{2}} \quad\text{and}\quad y = e\sqrt{\tfrac{3}{2}}.$$
The diagonal $SR = 2y = e\sqrt{6}$ is consequently shorter than the
diagonal $AB = 2d = 2e\sqrt{3} = e\sqrt{12}$, so that the three rhombus
angles abutting on one another at S are obtuse. If we designate the
acute rhombus angle SAR as 2φ, it follows from $\tan\varphi = y/d = 1/\sqrt{2}$
and $\tan 2\varphi = 2\tan\varphi/(1 - \tan^2\varphi)$ that $\tan 2\varphi = \sqrt{8}$, $\cos 2\varphi = \tfrac{1}{3}$,
and 2φ = 70° 32'. The obtuse rhombus angle 2ψ is therefore
109° 28'.
For the angle μ of the rhombus diagonals SP, SQ, SR with respect
to the axis of the prism we obtain the relation $\tan\mu = 2e/2x = \sqrt{2}$,
and thus μ = 90° − φ = 54° 44'.
The angle ν of the rhombuses with respect to the prism cross section
is, finally, ν = 90° − μ = φ = 35° 16'.
Since the tangent of the acute trapezoid angle (∠aAR) has the
value $2e/x = \sqrt{8}$ (= tan 2φ), the acute and obtuse angles of the
trapezoid correspond to the acute and obtuse angles, respectively, of
the rhombus.
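The minimization and the resulting angles can be retraced numerically. The sketch below scans u = y√3 − x with y taken from equation (1), then evaluates the angles from the closed-form solution. Setting e = 1 and scanning x from 0 to 2e on a fixed grid are assumptions made only for the illustration.

```python
import math

e = 1.0   # the hexagon side is 2e; unit chosen for illustration


def u(x: float) -> float:
    """u = y*sqrt(3) - x with y = sqrt(e^2 + x^2), from equation (1)."""
    return math.sqrt(3.0 * (e * e + x * x)) - x


xs = [i / 10000 for i in range(20001)]        # x from 0 to 2e
x_best = min(xs, key=u)

x = e / math.sqrt(2)                          # closed-form minimizer
y = math.sqrt(e * e + x * x)
d = e * math.sqrt(3)
two_phi = 2 * math.degrees(math.atan(y / d))  # acute rhombus angle
mu = math.degrees(math.atan(e / x))           # short diagonal against the prism axis
print(x_best, x, u(x), two_phi, 180 - two_phi, mu)
```

Converted to degrees and minutes, the printed angles should match the values 70° 32', 109° 28', and 54° 44' given above.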
Particular interest attaches to the angles enclosed between every
two bounding surfaces of the prism. These angles are easily
determined.
To begin with, since the three-sided corners S, P, Q, R are
congruent and regular (each side is 2ψ), the surface angles belonging to
these corners are all equal to each other. Since the four-sided
corners A, B, C are also regular and congruent (each side is 2φ),
these corners also all have the same surface angle. Now, a surface
angle of the corner P at p as ∠bpc equals 120°, and a surface angle of
the corner A at a as ∠qar also equals 120°.
Consequently, all the surface angles of the prism are 120° (naturally,
with the exception of the right angles forming the base surface).
The angles we have just calculated have in fact been confirmed by
actual measurement for the honeybee cell—within the limits of
observational error. Of particular interest is the remarkable fact
that every two abutting wax surfaces enclose an angle of 120°.
Regiomontanus' Maximum Problem
At what point of the earth's surface does a perpendicularly suspended rod
appear longest ? (I.e., at what point is the visual angle at a maximum ?)
This problem was posed in 1471 by the mathematician Johannes
Müller, called Regiomontanus after his birthplace Königsberg in
Franconia, to the Erfurt professor Christian Roder. This problem,
which in itself is not difficult, nevertheless deserves special attention as
the first extreme problem encountered in the history of mathematics
since the days of antiquity.
The author of the following simple solution is Ad. Lorsch, who
published it in vol. XXIII of the Zeitschrift für Mathematik und Physik.
Let A be the upper and B the lower end point of the rod, F the
base point of the perpendicular to the earth's surface from A (or B),
so that the segments FA = a and FB = b are known. Since the rod
appears to be equally long at all the points of a circle on the earth's
surface described about F as the center, it is sufficient to erect an
arbitrary perpendicular g to FA at F and to seek on this line that runs
horizontally on the earth's surface the point O at which the visual
angle ω = ∠AOB is a maximum.
First Lorsch shows that the circle of circumscription $\mathfrak{K}$ of the triangle
ABO is tangent to the line g at O. Indeed, if g were not tangent to $\mathfrak{K}$,
then $\mathfrak{K}$ would have another point Q in common with g besides point
O, and for each intermediate point Z of g between O and Q, ∠AZB
would be greater than the inscribed angle of the circle $\mathfrak{K}$ on AB, and
it would consequently be greater than ω, whereas ω is supposed to be
the maximum.
Let us therefore draw the circle $\mathfrak{K}$ that passes through points A and
B and is tangent to the line g; the point of tangency O is the place at
which the viewing angle of the rod attains its maximum value ω.
Indeed, if P is any point other than O on the line g, then the angle
APB is smaller than the inscribed angle of $\mathfrak{K}$ on AB, and
consequently smaller than ω. Lorsch also shows the most convenient and
quickest method of constructing the circle $\mathfrak{K}$ and/or its midpoint M
and radius r. To begin with, the midpoint M lies on the perpendicular
bisector of AB, which runs parallel to the line g and passes through
the midpoint N of AB. Now, in the rectangle MOFN the side FN
is equal to the opposite side MO, and is thus equal to r, so that all that
is necessary is to mark off from B (or A) the distance FN on the
perpendicular bisector in order to obtain, at the resulting point of
intersection, the desired midpoint M.
If one wishes to determine the position of O by calculation, using
its distance t from F, one need only bear in mind that, according to
the tangent theorem, FO² = FA · FB. This equation immediately
gives us $t = \sqrt{ab}$.
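The distance t = √(ab) can be checked by scanning the visual angle directly. In the sketch below the visual angle at horizontal distance t is written as arctan(a/t) − arctan(b/t), which follows from the right triangles OFA and OFB; the heights a and b and the scan grid are hypothetical values chosen for illustration.

```python
import math


def visual_angle(t: float, a: float, b: float) -> float:
    """Angle AOB seen from a point O at horizontal distance t from F."""
    return math.atan(a / t) - math.atan(b / t)


a, b = 10.0, 4.0                         # FA and FB (hypothetical values)
t_star = math.sqrt(a * b)                # distance from the tangent theorem

ts = [0.01 * i for i in range(1, 5001)]  # distances from 0.01 to 50
t_best = max(ts, key=lambda t: visual_angle(t, a, b))
print(t_star, t_best, math.degrees(visual_angle(t_star, a, b)))
```

The best grid point should sit next to √(ab), and the angle there should not be exceeded anywhere on the grid.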
An interesting variant of the problem of Regiomontanus is the
Saturn problem, probably first posed by Hermann Martus, the author of
the well-known problem collection:
At what latitude circle of Saturn does the ring appear widest ?
Saturn is assumed to be a sphere with a radius of 56,900 km, and
the ring is assumed to be a circular ring in the plane of Saturn's
equator, having an inner radius of 88,500 km and an outer radius of
138,800 km.
Solution. In Figure 108, let the arc $\mathfrak{M}$ represent a meridian, M
the midpoint of Saturn, AB the width of the ring, MA = a being the
outer radius, and MB = b the inner radius of the ring, and let
MC = r be the equatorial radius of Saturn on MA. Let O be the
point situated at the latitude φ = ∠CMO at which the ring width
appears greatest, so that ∠AOB = ψ is a maximum.
We now apply Lorsch's considerations to our figure and directly
obtain the following solution. We draw the circle $\mathfrak{K}$ that passes
through the points A and B and is tangent to the meridian $\mathfrak{M}$; the
point of tangency O is the place at which the ring width appears to be
greatest.
In order to calculate the latitude φ of O and the maximum ψ, we
examine the right triangles MZF and AZF, in which Z is the center
of the circle $\mathfrak{K}$, F the center of AB. From these triangles, with the
understanding that ρ is the radius of $\mathfrak{K}$, we obtain
$$\cos\varphi = \frac{MF}{MZ} = \frac{a + b}{2(r + \rho)}$$
and
$$\sin\psi = \frac{AF}{AZ} = \frac{a - b}{2\rho}.$$
Fig. 108.
The unknown ρ, however, follows from the secant theorem, according
to which MA · MB = MZ² − ρ² or ab = (r + ρ)² − ρ² = r² + 2rρ,
and consequently ρ = (ab − r²)/2r. If we introduce this into the
above, we at length obtain
$$\cos\varphi = \frac{(a + b)r}{ab + r^2}$$
and
$$\sin\psi = \frac{(a - b)r}{ab - r^2},$$
and from this, φ = 33½°, ψ = 18½°.
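The numerical values can be recomputed from the closed formulas cos φ = (a + b)r/(ab + r²) and sin ψ = (a − b)r/(ab − r²), with ρ = (ab − r²)/2r. The sketch below uses the radii stated in the problem; the function name is illustrative.

```python
import math


def saturn_ring(a: float, b: float, r: float):
    """Return (rho, phi, psi): tangent-circle radius, latitude, maximal angle (degrees)."""
    rho = (a * b - r * r) / (2 * r)
    phi = math.degrees(math.acos((a + b) * r / (a * b + r * r)))
    psi = math.degrees(math.asin((a - b) * r / (a * b - r * r)))
    return rho, phi, psi


# Outer ring radius, inner ring radius, and radius of Saturn in km (from the text).
rho, phi, psi = saturn_ring(138_800.0, 88_500.0, 56_900.0)
print(f"rho = {rho:.0f} km, phi = {phi:.1f} deg, psi = {psi:.1f} deg")
```

The printed angles should round to the half-degree values quoted in the solution.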
The Maximum Brightness of Venus
In what position does the planet Venus appear to have the greatest brilliance?
Solution. Let the midpoints of the sun, earth, and Venus be
S, E, V, the radii of the orbits (assumed as circular) of the earth and
Venus SE = a and SV = b, the variable distance of Venus from the
earth EV = r, the radius of Venus h. The tangents to Venus from S
and E touch Venus along circles I and II, respectively, whose
diameters in the plane SEV we will call AB and CD, respectively. Since
$AB \perp SV$ and $CD \perp EV$, the angle between the planes of the two
circles is equal to the angle $\varphi = \angle SVE$ between their normals VS and
VE. The projection of the portion of Venus that is illuminated by
the sun and visible from the earth on the plane of circle II consists of
the semicircle with the central radius VC and the area $(\pi/2)h^2$ and the
projection of the semicircle with the central radius VB, having the
area $(\pi/2)h^2 \cos\varphi$. (The area of the projection of a plane surface on
a plane is equal to the product of the area of the surface and the
cosine of the angle between the two planes.) The radiation from
Fig. 109.
Venus to the earth is thus exactly the same as that of a surface at V
perpendicular to the rays, with the area
$$J = \tfrac{1}{2}\pi h^2 (1 + \cos\varphi).$$
If 1 cm² of this surface at distance 1 develops the illumination
intensity c, the entire surface generates the illumination intensity cJ
and at the distance VE = r the illumination intensity is
$$\frac{cJ}{r^2} = \frac{c\pi h^2}{2} \cdot \frac{1 + \cos\varphi}{r^2}.$$
Accordingly, the illumination intensity attains a maximum when
the factor
$$f = \frac{1 + \cos\varphi}{r^2}$$
reaches its peak value.
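Now, according to the cosine theorem as applied to triangle SEV,
$$\cos\varphi = \frac{r^2 + b^2 - a^2}{2br},$$
and consequently,
$$f = \frac{1}{2b} \cdot \frac{1}{r} + \frac{1}{r^2} - \frac{a^2 - b^2}{2b} \cdot \frac{1}{r^3}.$$
This expression has the form
$$f = Ax + Bx^2 - Cx^3,$$
where
$$A = \frac{1}{2b}, \qquad B = 1, \qquad C = \frac{a^2 - b^2}{2b}$$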
are constants and x = 1/r is a variable. We must now make the
function f of x as great as possible by a suitable choice of x. As the
curve of the function shows, f initially grows as x (> 0) increases; at a
certain point $x = \alpha$ it attains its maximal value, and then declines.
For every (positive) $x \ne \alpha$, therefore,
$$Ax + Bx^2 - Cx^3 < A\alpha + B\alpha^2 - C\alpha^3.$$
Accordingly as $x \lessgtr \alpha$, we write this inequality as
$$C(\alpha^3 - x^3) < A(\alpha - x) + B(\alpha^2 - x^2)$$
or
$$C(x^3 - \alpha^3) > A(x - \alpha) + B(x^2 - \alpha^2),$$
and divide both sides by $\alpha - x$ and $x - \alpha$, respectively. From this
we find that: The function $C(\alpha^2 + \alpha x + x^2)$ lies below the function
$A + B(\alpha + x)$ when $x < \alpha$, and above it when $x > \alpha$. Since these
two continuous functions increase steadily, they must attain equal
values at the point $x = \alpha$, so that
$$C(\alpha^2 + \alpha^2 + \alpha^2) = A + B(\alpha + \alpha).$$
This equation yields
$$\alpha = \frac{B + \sqrt{B^2 + 3CA}}{3C}.$$
If we introduce here the values of A, B, C, we find for the desired
distance $r\ (= 1/\alpha)$ the value
$$r = \sqrt{3a^2 + b^2} - 2b.$$
Now all three sides of the triangle SEV for the optimal position are
known (a : b : r = 1 : 0.7233 : 0.4304), and the sought-for angular
distance ($\angle SEV$) of Venus from the sun is found to be 39° 43.5'.
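The result lends itself to a quick numerical check. The sketch below (function names are illustrative, not from the text) evaluates $r = \sqrt{3a^2 + b^2} - 2b$ for a = 1, b = 0.7233, compares it with a brute-force maximization of the factor $(1 + \cos\varphi)/r^2$ over the admissible distances, and recovers the angle SEV from the cosine theorem.

```python
import math

def brightness_factor(r, a=1.0, b=0.7233):
    """The factor (1 + cos(phi)) / r^2, with cos(phi) from the cosine theorem in SEV."""
    cos_phi = (r * r + b * b - a * a) / (2 * b * r)
    return (1 + cos_phi) / (r * r)

def optimal_distance(a=1.0, b=0.7233):
    """Closed-form earth-Venus distance of greatest brilliance."""
    return math.sqrt(3 * a * a + b * b) - 2 * b

def elongation_deg(r, a=1.0, b=0.7233):
    """Angle SEV in degrees for a given earth-Venus distance r."""
    cos_e = (a * a + r * r - b * b) / (2 * a * r)
    return math.degrees(math.acos(cos_e))

def grid_optimum(a=1.0, b=0.7233, steps=200_000):
    """Brute-force maximizer of the brightness factor for a - b < r < a + b."""
    lo, hi = a - b, a + b
    grid = (lo + (hi - lo) * i / steps for i in range(1, steps))
    return max(grid, key=lambda r: brightness_factor(r, a, b))

r_opt = optimal_distance()
print(round(r_opt, 4), round(elongation_deg(r_opt), 3))
```

The printed elongation is in decimal degrees; 39° 43.5' corresponds to about 39.725°.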
A Comet Inside the Earth's Orbit
What is the maximum number of days that a comet can remain within the
earth's orbit?
We will assume that the earth's orbit is circular and the comet's
parabolic, and that the orbital planes coincide.
Solution. We will select the semimajor axis of the earth's orbit as
the unit length, the mean solar day as the unit time, and we will
designate the parabola parameter as 4k, the base line of the parabola
section lying within the earth's orbit as 2y, the altitude of the section
as x, the sector described by the focal radius of the comet within the
earth's orbit as S, and finally, the time required to traverse the
sector as t. Then
(1) $y^2 = 4kx$
according to the vertex equation of the parabola,
(2) $(x - k)^2 + y^2 = 1$
according to the circle equation, and
(3) $3S = y(x + 3k)$
according to the formula for the area of a parabola section [No. 56.
S = the section − triangle = $\tfrac{4}{3}xy - (x - k)y$].
If 2p represents the orbit parameter of a celestial body of mass $\mu$
revolving about the sun (the mass of the sun is considered as the unit
mass), if t is any time, S the sector described by the body in this time,
we can use the Gauss formula*
$$\frac{2S}{t\sqrt{p}\,\sqrt{1 + \mu}} = G,$$
where G (the root of the gravitation constant) is the so-called Gauss
constant, which has the numerical value of 0.0172021 for the units
assumed.
Since the mass of the comet relative to that of the sun is negligible,
and since p = 2k here, the Gauss formula is transformed into
(4) $S = Ct\sqrt{k}$, with $C = G/\sqrt{2}$
in our problem.
* Gauss, Theoria motus corporum coelestium in sectionibus conicis solem ambientium
(Hamburg, 1809). (English translation by C. H. Davis reprinted by Dover
Publications, 1963.)
From (1) and (2) we find
$$x + k = 1, \qquad y = 2\sqrt{k(1 - k)}$$
and, making use of these values, we obtain from (3)
$$3S = 2\sqrt{k(1 - k)}\,(1 + 2k).$$
If we introduce here the value for S from (4), it follows that
(5) $t = c(1 + 2k)\sqrt{1 - k}$, with $c = \sqrt{8}/(3G)$.
Since t is to be a maximum, the expression $(1 + 2k)\sqrt{1 - k}$ must
be made as great as possible. It therefore remains to select k in such
manner that the expression, or its square, or four times its square, namely,
$$P = (1 + 2k)(1 + 2k)(4 - 4k),$$
becomes a maximum. However, since P is a product of factors of
constant sum, it attains a maximum (No. 10) when the factors are
equally great, thus when
$$1 + 2k = 4 - 4k.$$
This gives us $k = \tfrac{1}{2}$ and, as a result of (5), $t = c\sqrt{2} = 4/(3G) \approx 77.5$,
or 78 in round numbers.
The sought-for maximum possible length of stay is thus 78 days.
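As a plausibility check, formula (5) can be evaluated directly. The sketch below uses only the Gauss constant quoted above; the function name and the grid search are illustrative additions, not part of the original argument.

```python
import math

GAUSS_CONSTANT = 0.0172021  # value quoted in the text for the chosen units

def stay_days(k):
    """Formula (5): days spent inside the earth's orbit for perihelion distance k (0 < k < 1)."""
    c = math.sqrt(8) / (3 * GAUSS_CONSTANT)
    return c * (1 + 2 * k) * math.sqrt(1 - k)

# Brute-force search over k on a fine grid, to compare with k = 1/2.
best_k = max((i / 10_000 for i in range(1, 10_000)), key=stay_days)
print(best_k, round(stay_days(best_k), 2))
```

The search lands on k = 1/2, where the stay amounts to 4/(3G) days, just under 78.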
The Problem of the Shortest Twilight
On what day of the year is the twilight shortest at a place of given latitude?
This problem was posed, but not solved, by the Portuguese Nunes
in 1542 in his book De crepusculis. Jacob Bernoulli and d'Alembert
solved the problem by means of differential calculus, but obtained no
simple results. The first elementary solution stems from Stoll
(Zeitschrift für Mathematik und Physik, vol. XXVIII). The following
very simple solution is from Brünnow's Lehrbuch der sphärischen
Astronomie (Textbook of Spherical Astronomy).
A distinction is made between civil and astronomical twilight.
Civil twilight ends when the midpoint of the sun stands 6½° below the
horizon. Approximately at this moment one must turn on one's
lights in order to continue working. Astronomical twilight ends
when the midpoint of the sun stands 18° below the horizon; it is
approximately at this time that the astronomer can begin making
observations.
It is convenient to choose as the beginning of twilight the moment
at which the midpoint of the sun is intersected by the horizon.
Let the latitude of the observation point be $\varphi$, the pole distance of
the sun p.
The duration of the twilight is measured by the angle d that is
formed by the two hour-circle arcs of the nautical triangles determined
by the sun for the beginning and end of the twilight. If we
superimpose one of these triangles on the other in such manner that the two
pole distances coincide, the angle between the two latitude
complements b (now having in common only the world pole P) represents
the duration d of the twilight.
Fig. 110.
In this position let the triangles be
PCX and PCY, with PC = p, PX = PY = b = 90° − $\varphi$, CX = 90°,
CY = 90° + h (h is to be understood as representing the depth of the
sun below the horizon at the end of the twilight), and $\angle XPY = d$.
Moreover, let XY = u and $\angle XCY = \psi$.
From the isosceles triangle PXY it follows, according to the cosine
theorem, that
(1) $\cos u = \sin^2\varphi + \cos^2\varphi \cos d.$
Consequently, d becomes a minimum or cos d a maximum when
cos u is at a maximum.
From the triangle CXY it follows, however, that
$$\cos u = \cos CX \cos CY + \sin CX \sin CY \cos\psi$$
or, since cos CX = 0, sin CX = 1, sin CY = cos h, that
$$\cos u = \cos h \cos\psi.$$
Thus, cos u attains its greatest possible value when $\cos\psi$ is a
maximum, i.e., when
$$\psi = 0.$$
On the day of the shortest twilight, point X accordingly falls on the
side CY, and the base XY = u of the isosceles triangle PXY is h. At
the same time we find from (1) for the minimum duration d of the
twilight
$$\cos d = \frac{\cos h - \sin^2\varphi}{\cos^2\varphi}$$
or, in accordance with the two formulas
$$\cos d = 1 - 2\sin^2\frac{d}{2}, \qquad \cos h = 1 - 2\sin^2\frac{h}{2},$$
(I) $\sin\dfrac{d}{2} = \dfrac{\sin\frac{h}{2}}{\cos\varphi}.$
To find the corresponding declination of the sun $\delta$, we express the
cosine of the angle $\omega = \angle PCX = \angle PCY$ twice in accordance with
the cosine theorem and set the resulting values equal to each other.
It follows from $\triangle PCX$ (since cos CX = 0, sin CX = 1) that
$$\cos\omega = \frac{\sin\varphi}{\sin p},$$
from $\triangle PCY$ (since cos CY = −sin h, sin CY = cos h) that
$$\cos\omega = \frac{\sin\varphi + \cos p \sin h}{\sin p \cos h}.$$
Equalizing, we obtain
$$\sin\varphi \cos h = \sin\varphi + \cos p \sin h$$
or
$$-\cos p \sin h = \sin\varphi\,(1 - \cos h)$$
or
$$-\cos p \cdot 2\sin\frac{h}{2}\cos\frac{h}{2} = \sin\varphi \cdot 2\sin^2\frac{h}{2}$$
or, finally,
$$\cos p = -\sin\varphi \tan\frac{h}{2}.$$
Because of the minus sign, the pole distance p is an obtuse angle
for northern latitudes, the sun's declination $\delta$ is thus southerly and
(II) $\sin\delta = \sin\varphi \tan\dfrac{h}{2}.$
The shortest twilight duration is determined by (I) and the southerly
declination of the sun for the day on which that twilight occurs is given
by (II).
From the declination the sought-for day can be found by means of
the nautical almanac.
This datum is also found with sufficient accuracy if the familiar
formula
(2) $\sin\delta = \sin\varepsilon \sin l$
is used; here $\delta$ represents the sun's declination, l the angular distance
of the sun from the autumnal or vernal equinox, and $\varepsilon$ the inclination
of the ecliptic (23° 27'). Since the above-mentioned angular distance
changes at an average daily rate of m = 59.1', the sought-for
date differs by n = l/m days from the 23rd of September or from the
21st of March.
For Leipzig, for example ($\varphi$ = 51° 20.1'), we find, from (II),
$\delta$ = 7° 6.2', then from (2), l = 18° 6.3', and then n = 18.4. The
shortest twilight in Leipzig thus falls on October 11 and March 3.
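The Leipzig figures can be reproduced in a few lines. The sketch below assumes astronomical twilight (h = 18°), which is what the quoted numbers correspond to; the function name and the returned tuple are illustrative choices.

```python
import math

def shortest_twilight(lat_deg, h_deg=18.0, obliquity_deg=23.45, daily_rate_arcmin=59.1):
    """Return (d, delta, l, n): minimum duration angle d from (I), southerly
    declination delta from (II), angular distance l from (2), and day offset n.

    Angles are in degrees; h_deg = 18 (astronomical twilight) is an assumption.
    """
    phi = math.radians(lat_deg)
    half_h = math.radians(h_deg) / 2
    d = 2 * math.degrees(math.asin(math.sin(half_h) / math.cos(phi)))      # (I)
    sin_delta = math.sin(phi) * math.tan(half_h)                           # (II)
    delta = math.degrees(math.asin(sin_delta))
    l = math.degrees(math.asin(sin_delta / math.sin(math.radians(obliquity_deg))))  # (2)
    n = l * 60 / daily_rate_arcmin
    return d, delta, l, n

# Leipzig: latitude 51 deg 20.1 min north.
d, delta, l, n = shortest_twilight(51 + 20.1 / 60)
print(round(d, 2), round(delta, 3), round(l, 3), round(n, 1))
```

Dividing d by 15 converts the hour angle into hours; for Leipzig this gives a shortest astronomical twilight of a little under two hours.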
Steiner's Ellipse Problem
Of all the ellipses that can be circumscribed about (inscribed in) a given
triangle, which one has the smallest (largest) area ?
"Dans le plan, la question des polygones d'aire maximum ou
minimum inscrits ou circonscrits à une ellipse ne présente aucune
difficulté. Il suffit de projeter l'ellipse de telle manière qu'elle
devienne un cercle, et l'on est ramené à une question bien connue de
géométrie élémentaire"* (Darboux, Principes de Géométrie analytique,
p. 287).
* Translation: "In a plane the question of polygons of maximum or minimum
area inscribed in or circumscribed about an ellipse offers no difficulty. All that
is necessary is to project the ellipse in such manner that it is transformed into
a circle, and the problem is reduced to a well-known question of elementary
geometry".
The solution of the problem is based on the two auxiliary theorems:
I. Of all the triangles inscribed in a circle the one possessing the maximum
area is the equilateral.
II. Of all the triangles that can be circumscribed about a circle the one
possessing the minimum area is the equilateral.
Proof of I. We call the circle diameter d, the sides and angles of
an inscribed triangle p, q, r and $\alpha$, $\beta$, $\gamma$, respectively, the area of the
triangle J. Then
$$J = \tfrac{1}{2} pq \sin\gamma$$
and
$$p = d\sin\alpha, \qquad q = d\sin\beta,$$
and consequently,
$$J = \tfrac{1}{2} d^2 \sin\alpha \sin\beta \sin\gamma.$$
According to No. 92, the product of the sines $\sin\alpha \sin\beta \sin\gamma$ of the
three angles $\alpha$, $\beta$, $\gamma$ of constant sum (180°) is at a maximum when
$$\alpha = \beta = \gamma\ (= 60^\circ),$$
i.e., when the triangle is equilateral. The area of this maximal
triangle is $\tfrac{3}{16}\sqrt{3}\,d^2$, thus $\sqrt{27}/4\pi$ of the area of the circle.
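Auxiliary theorem I can be illustrated numerically. The sketch below is a brute-force scan over whole-degree angles, not a proof; it evaluates $J = \tfrac{1}{2}d^2 \sin\alpha \sin\beta \sin\gamma$ for a unit diameter and locates the largest value. The function name is an illustrative choice.

```python
import math

def inscribed_area(alpha_deg, beta_deg, d=1.0):
    """Area of the triangle with angles alpha, beta, 180 - alpha - beta inscribed in a circle of diameter d."""
    a = math.radians(alpha_deg)
    b = math.radians(beta_deg)
    g = math.pi - a - b
    return 0.5 * d * d * math.sin(a) * math.sin(b) * math.sin(g)

# Scan all whole-degree angle pairs that leave a positive third angle.
candidates = [(al, be) for al in range(1, 179) for be in range(1, 179 - al)]
best_angles = max(candidates, key=lambda t: inscribed_area(*t))

max_area = inscribed_area(60, 60)
ratio_to_circle = max_area / (math.pi / 4)   # circle of diameter 1 has area pi/4
print(best_angles, round(ratio_to_circle, 4))
```

The scan selects the equilateral triangle, and the ratio of its area to that of the circle agrees with $\sqrt{27}/4\pi \approx 0.4135$.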
Proof of II. If we designate the sides of an arbitrary
circumscribed triangle PQR as p, q, r, then the tangents to the circle from the
vertexes P, Q, R are x = s − p, y = s − q, z = s − r, where s
represents half the perimeter of the triangle (s = x + y + z).
The area J of the triangle and the radius $\rho$ of the inscribed circle are
given by the well-known formulas
$$J = \rho s \quad\text{and}\quad J = \sqrt{xyzs} \quad \text{(Hero of Alexandria)}.$$
These give us
$$s\rho^2 = xyz.$$
Making use of the formula $J = \rho s$, we write this equation in the
following two ways:
(1) $\dfrac{1}{yz} + \dfrac{1}{zx} + \dfrac{1}{xy} = \dfrac{1}{\rho^2},$
(2) $\dfrac{1}{yz} \cdot \dfrac{1}{zx} \cdot \dfrac{1}{xy} = \dfrac{1}{J^2\rho^2}.$
We now introduce the new unknowns
$$u = \frac{1}{yz}, \qquad v = \frac{1}{zx}, \qquad w = \frac{1}{xy}$$
and obtain
$$u + v + w = \frac{1}{\rho^2}, \qquad uvw = \frac{1}{J^2\rho^2}.$$
Since J is supposed to be a minimum and $\rho$ is constant, uvw must
attain a maximum.
A product uvw of numbers u, v, w of constant sum (u + v + w =
const.) reaches a maximum, however (No. 10), when the numbers are
equal to each other: u = v = w. The circumscribed triangle
therefore becomes smallest when yz = zx = xy, i.e., when x = y = z, i.e.,
when p = q = r, which proves II.
We find that the area of the smallest circumscribed triangle is four
times that of the maximum inscribed triangle, i.e., $\sqrt{27}\,\rho^2$, and for
the ratio of this area to the area of the circle we obtain the improper
fraction $\sqrt{27}/\pi$.
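The relations used in the proof of II can also be checked numerically. The sketch below relies on the standard fact (not stated in the text) that the tangent length from a vertex with angle A to the incircle is $\rho\cot(A/2)$; with it, the area $J = \rho s$ becomes a function of the angles alone. The function names are illustrative.

```python
import math

def tangent_lengths(angles_deg, rho=1.0):
    """Tangent lengths x, y, z from the three vertices to an incircle of radius rho."""
    return [rho / math.tan(math.radians(a) / 2) for a in angles_deg]

def circumscribed_area(alpha_deg, beta_deg, rho=1.0):
    """Area J = rho * s of the triangle circumscribed about a circle of radius rho."""
    angles = (alpha_deg, beta_deg, 180 - alpha_deg - beta_deg)
    return rho * sum(tangent_lengths(angles, rho))

# Scan whole-degree angle pairs for the smallest circumscribed triangle.
pairs = [(al, be) for al in range(1, 179) for be in range(1, 179 - al)]
smallest = min(pairs, key=lambda t: circumscribed_area(*t))

# Check J = sqrt(xyzs) and relations (1), (2) for one non-equilateral sample (rho = 1).
x, y, z = tangent_lengths((50, 70, 60))
s = x + y + z
J = s  # J = rho * s with rho = 1
print(smallest, round(circumscribed_area(60, 60), 4))
```

The minimum is found at the equilateral triangle, whose area for $\rho = 1$ is $\sqrt{27} \approx 5.1962$, four times the area $3\sqrt{3}/4$ of the largest triangle inscribed in the same circle.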
Now for the solution of the ellipse problem. Let $\mathfrak{E}$ be any ellipse
circumscribed about (inscribed in) the given triangle abc, f its surface
area, $\delta$ the area of the triangle abc. We consider $\mathfrak{E}$ as the normal
projection of a circle $\mathfrak{K}$, whose surface area we will call F. In the
projection the inscribed (circumscribed) triangle ABC of the circle,
possessing an area we will call $\Delta$, corresponds to the inscribed
(circumscribed) triangle abc of the ellipse. If $\mu$ represents the cosine of
the angle between the plane of the circle and the plane of the ellipse,
then the normal projection of every surface lying in the plane of the
circle is the $\mu$-multiple of the surface. This gives us the formulas
$$f = \mu F, \qquad \delta = \mu\Delta.$$
Since $\delta$ is constant, f attains a minimum (maximum) when the
quotient $f/\delta$ or the equal quotient $F/\Delta$ reaches a minimum
(maximum). The latter quotient, however, according to auxiliary
theorem I (II), reaches its minimal (maximal) value $4\pi/\sqrt{27}$
($\pi/\sqrt{27}$) when the triangle ABC is equilateral.
To establish more exactly the ellipse determined by this condition,
we make use of the properties of a normal projection: 1. Parallelism is
not annulled by projection. 2. The ratio between parallel segments is
maintained in projection: in particular, the ratio of two segments of the same
line is not altered.
Now, the center M of the circle is the point of intersection of the
medians of the equilateral triangle ABC and the diameter through C
bisects the chords of the circle parallel to AB. Consequently, the
point of intersection of the medians of the triangle abc is the center
point m of the sought-for ellipse, and the ellipse diameter through c
bisects the ellipse chords parallel to the side ab, so that ab and mc are
conjugate directions of the ellipse. Now, since the circle radius MK
parallel to the circle chord (tangent) AB is equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of
AB, the ellipse half diameter mk parallel to the ellipse chord (tangent)
ab is also equal to $1/\sqrt{3}$ ($\sqrt{3}/6$) of ab.
Result. Of all the ellipses that can be circumscribed about
(inscribed in) a given triangle abc, the one with the smallest (greatest)
area is the ellipse whose midpoint m is the point of intersection of
the medians of the triangle abc and from which the ellipse half
diameter to c (to the center of ab) and the ellipse half diameter
parallel to ab, $mk = ab/\sqrt{3}$ ($ab/2\sqrt{3}$), are conjugate half diameters.
The area of the ellipse thus characterized, the so-called Steiner
ellipse, is
$$\frac{4\pi}{\sqrt{27}} \quad \left(\frac{\pi}{\sqrt{27}}\right)$$
of the area of the triangle.
This ellipse can be constructed easily in accordance with No. 42.
Steiner's Circle Problem
Of all isoperimetric plane surfaces (i.e., those having equal perimeters) the
circle has the greatest area.
And conversely:
Of all plane surfaces with equal area the circle has the smallest perimeter.
This fundamental double theorem was first proved by J. Steiner
(Crelle's Journal, vol. XVIII; also in Steiner's Gesammelte Werke, vol.
II). Steiner even provided several proofs. Here we will consider
only the one that is based upon the Steiner symmetrization principle.
First we will prove the second half of the theorem.
It is obviously sufficient to limit our considerations to convex
surfaces, i.e., those surfaces in which the line segment connecting two
arbitrary points of the surface belongs completely to the surface.
We will first prove the auxiliary theorem:
Of all trapezoids with common base lines and altitudes the isosceles trapezoid
is the one the sum of whose legs is smallest.
Let ABCD be an arbitrary trapezoid with the base lines BC and AD,
the legs AB and CD. Let the mirror image of B on the perpendicular
bisector of AD be B', let the center of CB' be $C_0$. On the extension of
CB we set $BB_0 = CC_0$ and obtain the isosceles trapezoid $AB_0C_0D$,
which has base lines and altitude in common with the given trapezoid,
and consequently also the same area.
Fig. 111.
If we extend $DC_0$ by its own length to H, we obtain the parallelogram
DCHB', in which the diagonal DH is shorter than the sum of the
sides DC and CH:
$$DH < DC + CH.$$
However, since $DH = 2 \cdot DC_0 = DC_0 + AB_0$ and CH = DB' = AB,
we obtain
$$AB_0 + DC_0 < AB + DC.$$
Thus, the isosceles trapezoid has the smallest leg sum.
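The auxiliary theorem is easy to test numerically. In the sketch below (coordinates and names are illustrative assumptions) the base AD is fixed on the x-axis, the base BC of given length slides along a parallel at the given altitude, and the leg sum is minimized over the amount of sliding. The area does not depend on the shift, since bases and altitude are unchanged.

```python
import math

def leg_sum(shift, top, bottom, height):
    """Leg sum AB + CD for A = (0, 0), D = (bottom, 0), B = (shift, height), C = (shift + top, height)."""
    ab = math.hypot(shift, height)
    cd = math.hypot(bottom - (shift + top), height)
    return ab + cd

def best_shift(top, bottom, height, lo=-5.0, hi=5.0, steps=100_001):
    """Grid search for the shift of the upper base that minimizes the leg sum."""
    grid = (lo + (hi - lo) * i / (steps - 1) for i in range(steps))
    return min(grid, key=lambda s: leg_sum(s, top, bottom, height))

# Sample trapezoid: upper base 2, lower base 5, altitude 1.5.
print(round(best_shift(2.0, 5.0, 1.5), 3))
```

For the sample trapezoid the minimizing shift is 1.5 = (5 − 2)/2, which is exactly the isosceles position.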
Now let $\mathfrak{F}$ be the surface having the smallest perimeter for the given
area J; let the perimeter be u.
We draw an arbitrary line Q and divide $\mathfrak{F}$ by perpendiculars to Q
into trapezoids ABCD that we select so narrow that the arc-shaped
legs AB and CD can be considered as rectilinear. From the points of
intersection of the dividing lines ..., AD, BC, ... with Q we mark off
on the dividing lines on both sides of Q the half chords ..., AD, BC, ...,
as a result of which we obtain the points ..., A', D', B', C', ... and
the trapezoids ..., A'B'C'D', .... The new trapezoid A'B'C'D' is
isosceles and possesses equal base lines and altitude with ABCD, so
that the area is also the same. This gives us
(1) $A'B' + C'D' \le AB + CD,$
in which the equals sign applies only when ABCD is also isosceles.
Our method enables us to obtain from $\mathfrak{F}$ a new surface $\mathfrak{F}'$ with the
symmetry axis Q, having the same area as $\mathfrak{F}$ and a perimeter, therefore,
that cannot be smaller than u. Thus, the equals sign in (1) must
always apply. All trapezoids ABCD are therefore isosceles, and the
perpendicular bisector of BC is an axis of symmetry of $\mathfrak{F}$.
The surface $\mathfrak{F}$ of minimal perimeter therefore possesses an axis of symmetry
in every direction.
But such a surface must be a circle!
Proof. Let I and II be two mutually perpendicular symmetry
axes of $\mathfrak{F}$, M their point of intersection. Let the mirror image of an
arbitrary point P of $\mathfrak{F}$ on I be $P_1$, and let the mirror image of $P_1$ on II
be $P' = P_{12}$. Then PMP' is a straight line and
MP' = MP,
i.e., the point M is a midpoint of the surface.
Now $\mathfrak{F}$ can only have one midpoint. Indeed, if N were a second
midpoint, then extending PM by its own length, we would first arrive
at P'; next, extending P'N by its own length, we would arrive at a
new point P'' of $\mathfrak{F}$; then extending P''M by its own length, we would
arrive at a point P''' of $\mathfrak{F}$; extending P'''N by itself, we would then come
to still another point of $\mathfrak{F}$, etc. If these operations are represented
graphically it will be observed that in this manner we would end up at
some arbitrary distance beyond the drawing paper (on which $\mathfrak{F}$ lies),
which is naturally absurd. Thus, $\mathfrak{F}$ has only the one midpoint M.
It follows from this, further, that: This M must belong to each axis
of symmetry of $\mathfrak{F}$.
Indeed, if M does not lie on the axis of symmetry a of $\mathfrak{F}$, then we
can draw the mirror images m and p of M and of an arbitrary surface
point P on a, extend pM by its own length to the surface point p', and
draw the mirror image p'' of p' on a. Now, since p'' is a point of $\mathfrak{F}$,
Pmp'' is a straight line, and mp'' = mP, this would mean that $\mathfrak{F}$ had a
second midpoint, m, and this is impossible.
Thus, all the axes of symmetry intersect at M.
Now let F be a fixed boundary point of $\mathfrak{F}$ and P an arbitrary boundary
point of $\mathfrak{F}$. Since the perpendicular bisector of FP is an axis of
symmetry of $\mathfrak{F}$, it passes through M. Therefore,
MP = MF;
i.e., all the boundary points of $\mathfrak{F}$ are equidistant from M, and the
surface $\mathfrak{F}$ is a circle.
Consequently, of all surfaces of equal area the circle has the smallest perimeter.
We now state conversely:
Of all isoperimetric surfaces the circle has the greatest area.
Proof. Let the perimeter f of an arbitrary surface $\mathfrak{F}$, which is not
a circle, be equal to the perimeter k of the circular surface $\mathfrak{K}$. Let
the area of $\mathfrak{F}$ be F and that of $\mathfrak{K}$ be K.
Now, if $F \ge K$, we will consider the circular surface $\mathfrak{K}'$, concentric
to $\mathfrak{K}$, of area K' = F, and we will let its perimeter be k'. Since $\mathfrak{K}'$
covers $\mathfrak{K}$,
(2) $k' \ge k.$
However, since the surfaces $\mathfrak{K}'$ and $\mathfrak{F}$ have the same area, according
to the theorem proved above, k' < f, or
(3) $k' < k.$
The inequalities (2) and (3) contradict each other, however, and thus
the assumption that $F \ge K$ must be false. Consequently, F < K.
Q.E.D.
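The theorem can be illustrated, though of course not proved, by comparing regular polygons of a common perimeter with the circle of the same perimeter. The sketch below uses the standard area formula for a regular n-gon; the function names are illustrative.

```python
import math

def regular_polygon_area(n, perimeter=1.0):
    """Area of the regular n-gon with the given perimeter."""
    side = perimeter / n
    return n * side * side / (4 * math.tan(math.pi / n))

def circle_area(perimeter=1.0):
    """Area of the circle with the given perimeter."""
    return perimeter ** 2 / (4 * math.pi)

# Areas for a few polygons of perimeter 1, followed by the circle.
areas = [regular_polygon_area(n) for n in (3, 4, 6, 12, 100)]
print([round(v, 5) for v in areas], round(circle_area(), 5))
```

The polygon areas increase with the number of sides and stay below the area $1/4\pi \approx 0.07958$ of the circle.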
The foregoing Steiner proof of the major isoperimetric theorem for
the circle has certain weaknesses. The same is true of the proof of the
major isoperimetric theorem for the sphere, presented in the following
section.
The reader may learn how these weaknesses can be eliminated and
the Steiner proof formulated in a completely rigorous fashion by
consulting the excellent book Kreis und Kugel (Circle and Sphere) by
W. Blaschke. Unfortunately, we cannot go into these interesting
investigations because of lack of space.
Steiner's Sphere Problem
Of all solids of equal surface the sphere possesses the maximum volume.
Of all solids of equal volume the sphere possesses the smallest surface.
(Steiner, Crelle's Journal, vol. XVIII; Steiner, Gesammelte Werke, vol.
II.)
As in No. 99, we will prove the second part of the theorem first.
Naturally, we will consider only convex solids, i.e., those solids in
which the line segment connecting two arbitrary points on the solid
belongs completely to the solid.
Steiner's proof is based on the principle of symmetrization and the
theorem:
Of all triangular prisms whose parallel edges AA', BB', CC' have the
prescribed lengths h, k, l and lie on three given lines, the prism with the plane
of symmetry normal to the edges possesses the smallest base surface sum
ABC + A'B'C'.
Proof. We will designate the distances of the edges from one
another as a, b, c, so that
$$\mathfrak{A} = \tfrac{1}{2}a(k + l), \qquad \mathfrak{B} = \tfrac{1}{2}b(l + h), \qquad \mathfrak{C} = \tfrac{1}{2}c(h + k)$$
are the areas of the three trapezoidal prism faces. These areas are
given magnitudes. We extend CB and C'B' to the point of
intersection P, and CA and C'A' to the point of intersection Q, and obtain
the tetrahedron CC'PQ in which for brevity we will call the surfaces
CC'P and CC'Q "lateral surfaces" and the surfaces CPQ and C'PQ
"top surfaces."
Fig. 112.
We determine the relations between the areas J, J', $\mathfrak{P}$, $\mathfrak{Q}$ of the
tetrahedron bounding surfaces CPQ, C'PQ, CC'P, CC'Q, on the one
hand, and the areas $\Delta$, $\Delta'$, $\mathfrak{A}$, $\mathfrak{B}$, $\mathfrak{C}$ of the prism bounding surfaces
ABC, A'B'C', BB'C'C, CC'A'A, AA'B'B, on the other.
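From the ray theorem it follows that
(1) $\dfrac{CP}{CB} = \dfrac{C'P}{C'B'} = \dfrac{l}{\lambda}$ and $\dfrac{CQ}{CA} = \dfrac{C'Q}{C'A'} = \dfrac{l}{\mu}$,
where $\lambda$ is the difference between l and k, and $\mu$ is the difference
between l and h. Now, since the areas of similar triangles are in the
same proportion to each other as the squares of homologous sides, we
obtain the relations
$$\frac{\mathfrak{P}}{\mathfrak{P} - \mathfrak{A}} = \frac{l^2}{k^2} \quad\text{and}\quad \frac{\mathfrak{Q}}{\mathfrak{Q} - \mathfrak{B}} = \frac{l^2}{h^2}.$$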
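From these we obtain
(2) $\mathfrak{P} = \alpha\mathfrak{A}, \qquad \mathfrak{Q} = \beta\mathfrak{B},$
with
$$\alpha = \frac{l^2}{l^2 - k^2} \quad\text{and}\quad \beta = \frac{l^2}{l^2 - h^2}.$$
Moreover, since the areas of two triangles with a common angle
are to each other as the products of the adjacent sides of this angle,
we obtain
$$\frac{J}{\Delta} = \frac{CP \cdot CQ}{CA \cdot CB}, \qquad \frac{J'}{\Delta'} = \frac{C'P \cdot C'Q}{C'A' \cdot C'B'},$$
and consequently as a result of (1),
(3) $J = \kappa\Delta$ and $J' = \kappa\Delta'$,
where $\kappa$ is the constant $l^2/\lambda\mu$.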
From (2) it follows that the areas $\mathfrak{P}$ and $\mathfrak{Q}$ of the lateral surfaces of
the tetrahedron are constant no matter where the prism edges AA',
BB', CC' happen to lie, and from (3), that the sum S of the areas J and
J' of the top surfaces of the tetrahedron is $\kappa$ times the sum $\Sigma$ of the
areas $\Delta$ and $\Delta'$ of the base surfaces of the prism:
(4) $S = \kappa\Sigma.$
We will now prove the auxiliary theorem: Of all tetrahedrons with
two fixed corners C, C' and two movable corners P and Q that lie on the fixed
lines I and II parallel to CC', the tetrahedron in which P and Q lie on the
perpendicular bisector plane of CC' is the one possessing the smallest area sum
S of its top surfaces CPQ and C'PQ.
To begin with, it is clear that the tetrahedrons concerned all have
the same volume V. (The base surface CC'P has the constant area $\mathfrak{P}$
and the corresponding apex Q lies on a fixed parallel to the plane
CC'P.)
We draw through the center M of CC' the plane E normal to CC'
and designate its points of intersection with the lines I and II as p and
q. Let P and Q be two (other) points anywhere on I and II.
We now express the tetrahedron volume V, first using the
tetrahedron CC'pq and then the tetrahedron CC'PQ.
For this purpose we construct at C and C' on the top surfaces Cpq
and C'pq perpendiculars running toward the inside* of these surfaces
and designate their point of intersection on E as O.
We will select the common length of the two perpendiculars as our
unit length. The perpendiculars from O to the top surfaces CPQ and
C'PQ and to the planes I CC' and II CC' we will designate as x, x', m,
n, the common area of the lateral surfaces CC'p and CC'P as $\mathfrak{P}$, that
of the lateral surfaces CC'q and CC'Q as $\mathfrak{Q}$, and, finally, the areas of the
top surfaces Cpq, C'pq, CPQ, C'PQ as i, i', J, J'. We then obtain for
the volume V of the tetrahedrons CC'pq and CC'PQ the formulas
$$3V = i + i' + m\mathfrak{P} + n\mathfrak{Q} \quad\text{and}\quad 3V = xJ + x'J' + m\mathfrak{P} + n\mathfrak{Q},$$
respectively [where x, x', m, and n, respectively, are positive or
negative accordingly as O lies on the inside or outside of the bounding
surfaces CPQ, C'PQ, I CC', and II CC', respectively]. It follows
from this that
$$xJ + x'J' = i + i'.$$
If we consider that the perpendicular x (x') from O to the plane CPQ
(C'PQ) is shorter than the oblique line OC (OC'), we see that x and x'
are proper fractions. The left side of the last equation is therefore
smaller than J + J' and consequently also
$$i + i' < J + J',$$
which proves the auxiliary theorem.
We now go back to (4). Since, according to the auxiliary theorem,
S becomes a minimum when P and Q lie on E, and, as a result of (4),
S and $\Sigma$ attain a minimum at the same time, then $\Sigma$ attains a minimum
when the prism bounding surfaces ABC and A'B'C' are symmetrical
with respect to E. Q.E.D.
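The prism theorem can be spot-checked by experiment. In the sketch below the three parallel lines are taken as vertical lines through assumed foot points, each edge has a fixed length, and only the height of its midpoint varies; the coordinates, lengths and function names are illustrative choices, not values from the text. The symmetric position (all midpoints at the same height) is compared with randomly shifted positions.

```python
import math
import random

def triangle_area(p, q, r):
    """Area of the triangle pqr in space, via the cross product."""
    u = [q[i] - p[i] for i in range(3)]
    v = [r[i] - p[i] for i in range(3)]
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def base_sum(feet, lengths, mids):
    """Sum of the two base areas ABC + A'B'C' for edges centered at heights mids."""
    lower = [(x, y, m - length / 2) for (x, y), length, m in zip(feet, lengths, mids)]
    upper = [(x, y, m + length / 2) for (x, y), length, m in zip(feet, lengths, mids)]
    return triangle_area(*lower) + triangle_area(*upper)

FEET = [(0.0, 0.0), (3.0, 0.0), (1.0, 2.0)]   # assumed positions of the three lines
LENGTHS = (1.0, 2.0, 3.5)                     # assumed edge lengths h, k, l

symmetric = base_sum(FEET, LENGTHS, (0.0, 0.0, 0.0))
rng = random.Random(0)
shifted = [base_sum(FEET, LENGTHS, [rng.uniform(-2, 2) for _ in range(3)])
           for _ in range(1000)]
print(round(symmetric, 4), round(min(shifted), 4))
```

In every trial the shifted prism has a base sum at least as large as the symmetric one, in agreement with the theorem.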
Note. The preceding proof assumes that one prism edge (l)
differs from the other two. This limitation is of no importance, since
it is immediately apparent that the theorem is true in the case
h = k = l.
The continuation of the proof for the major isoperimetric theorem is
similar to that in No. 99.
Let $\mathfrak{K}$ be the solid that for a given volume V has the smallest
surface; let the latter be O.
* The inside of a bounding surface of a tetrahedron is the side on which the
tetrahedron is situated.
We choose an arbitrary plane E and divide $\mathfrak{K}$ by perpendiculars to
E into triangular prisms ABCA'B'C', which we assume to be so narrow
that the bounding triangles ABC and A'B'C' belonging to the surface
of $\mathfrak{K}$ can be considered as plane triangles. From the points of
intersection of the perpendiculars ..., AA', BB', CC', ... with E we mark
off on the perpendiculars on both sides of E the halves of the segments
..., AA', BB', CC', ..., as a result of which we obtain the points
..., a, a', b, b', c, c', .... The new prism abca'b'c' possesses the
symmetry plane E normal to the edges and, according to the above
prism theorem, possesses a base surface sum no greater than that of ABCA'B'C':
(5) $abc + a'b'c' \le ABC + A'B'C',$
in which the equals sign applies only if the prism ABCA'B'C' also
possesses a symmetry plane normal to the edges.
By means of our procedure we obtain from $\mathfrak{K}$ a new solid $\mathfrak{K}'$ with
the symmetry plane E, possessing the same volume V as $\mathfrak{K}$ and a surface
that consequently cannot be smaller than O. Therefore, the equals
sign in (5) must always apply. All prisms ABCA'B'C' therefore
possess one plane of symmetry normal to the edges, the perpendicular
bisector plane of AA'.
The solid $\mathfrak{K}$ having the smallest surface thus possesses a parallel symmetry
plane for every plane.
Such a solid must, however, be a sphere!
Proof. Let I, II, III be three symmetry planes of $\mathfrak{K}$ that are
normal to each other, M their point of intersection. Let the mirror
image of an arbitrary point P of $\mathfrak{K}$ on I be $P_1$, let the mirror image of
$P_1$ on II be $P_{12}$, let that of $P_{12}$ on III be $P_{123} = P'$. Then PMP' is a
straight line and
MP' = MP,
i.e., the point M is a midpoint of $\mathfrak{K}$.
Now, $\mathfrak{K}$ can have only one midpoint. (Proof as in No. 99.)
It then follows from this that M must lie on every symmetry
plane of $\mathfrak{K}$.
Indeed, if M does not belong to the symmetry plane A of $\mathfrak{K}$, then we
can draw the mirror images m and p of M and of an arbitrary point P
of the solid on A, extend pM by its own length to the point p' of the
solid, and draw the mirror image p'' of p' on A. Now, since p'' is a
point of $\mathfrak{K}$, Pmp'' is a straight line, and mp'' = mP, this would result in a
second midpoint, m, for $\mathfrak{K}$, which is impossible.
All the symmetry planes, therefore, intersect at M.
Now let F be a fixed point and P an arbitrary point of the surface
of $\mathfrak{K}$. Since the perpendicular bisector plane of FP is a symmetry
plane of $\mathfrak{K}$, it passes through M. Therefore,
MP = MF;
i.e., all the surface points of $\mathfrak{K}$ are equidistant from M, and the solid $\mathfrak{K}$
is a sphere.
Of all solids of equal volume the sphere thus has the smallest surface.
We now state conversely:
Of all solids of equal surface the sphere has the greatest volume.
Proof. Let the surface O of an arbitrary solid $\mathfrak{K}$, which is not a
sphere, be equal to the surface o of the sphere $\mathfrak{k}$. Let the volume of $\mathfrak{K}$
be V and that of $\mathfrak{k}$ be v.
Let us assume $V \ge v$; then let us consider the sphere $\mathfrak{k}'$ concentric
to $\mathfrak{k}$, having the volume v' = V and the surface o'. Since $\mathfrak{k}$ lies within $\mathfrak{k}'$,
(6) $o' \ge o.$
However, since the solids $\mathfrak{k}'$ and $\mathfrak{K}$ have the same volume, according
to the previously proved theorem, o' < O, or
(7) $o' < o.$
The inequalities (6) and (7) contradict each other. The assumption
$V \ge v$ must therefore be false, and v > V, as we asserted.
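By way of illustration only, the theorem can be compared against a few familiar solids. The sketch below uses the dimensionless quotient $36\pi V^2/O^3$, a standard normalization that is not used in the text: it equals 1 for the sphere, and for solids of equal surface a smaller quotient means a smaller volume. The chosen solids and names are assumptions for the example.

```python
import math

def isoperimetric_quotient(volume, surface):
    """36*pi*V^2 / O^3: equal to 1 for a sphere, and smaller for the other solids below."""
    return 36 * math.pi * volume ** 2 / surface ** 3

def volume_at_surface(quotient, surface=1.0):
    """Volume of a solid with the given quotient, scaled to the given surface area."""
    return math.sqrt(quotient * surface ** 3 / (36 * math.pi))

# Volume and surface of some solids of unit size (radius 1 or edge 1).
solids = {
    "sphere": (4 * math.pi / 3, 4 * math.pi),
    "cylinder (height = diameter)": (2 * math.pi, 6 * math.pi),
    "cube": (1.0, 6.0),
    "regular tetrahedron": (1 / (6 * math.sqrt(2)), math.sqrt(3)),
}
quotients = {name: isoperimetric_quotient(v, s) for name, (v, s) in solids.items()}
for name, q in quotients.items():
    print(name, round(q, 4), round(volume_at_surface(q), 5))
```

At equal surface the sphere encloses the largest volume of the four solids listed, followed by the cylinder, the cube and the tetrahedron.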
Index of Names
Abel, Niels Henrik 121-132
Alembert, Jean Le Rond d' 109,
155, 375
Alhazen (Abu Ali al Hassan ibn al
Hassen ibn Alhaitham) 197-200
Amthor 6
Andre 64-69
Apollonius of Perga 154-160, 165,
220
Archimedes 1-7, 154-160, 172, 184-
188, 239-242
Argand, J. R. 109
Bachet de Meziriac, Claude Gaspard
7-9
Bachmann, P. 105
Ball, W. W. Rouse 6, 27
Barker 352, 353, 354
Barrow, Isaac 197, 247
Bernoulli, Jacob (1654-1705) 40-44,
375
Bernoulli, Nicolaus (1687-1759) 19-
21
Berosus 340
Berwick, E. H. 11-14
Blaschke, Wilhelm 384
Brianchon, Charles Julien 165, 219-
220, 261-265
Brouncker, William 86
Brünnow, Franz Friedrich Ernst 375
Buffon, Georges Louis Leclerc, Comte
de 73-77
Cardan, Jerome (Girolamo Cardano)
216-217
Castillon (I. F. Salvemini) 144-147
Catalan 22, 23
Cauchy, Augustin Louis 37-40, 105,
109
Cayley, Arthur 105
Chasles, Michel 312
Cramer, Gabriel 144
Darboux, Jean Gaston 378
Demoivre, Abraham 179
Desargues, Gerard 250-255, 265-273
Descartes, Rene 171
Dickson, Leonard Eugene 105
Dirichlet, Peter Gustav Lejeune 96
Douwes 326
Eratosthenes 5
Euclid 154, 250, 301
Euler, Leonhard 19-27, 44-48, 55,
78-85, 96, 97, 104, 136, 141-142,
184, 192, 285-289, 356, 359
Eutocius 170
Fagnano, I. F. 359-361
Fermat, Pierre de 78-85, 86-96, 96-
104, 135, 361-363
Feuerbach, Karl Wilhelm 142-144
Fox 77
Frenicle de Bessy, B. 86
Frobenius, Leo 105
Frost, Andrew 14
Fuss, Nicolaus 188-193
Gabriel-Marie, F. 359
Gauss, Karl Friedrich 86, 96-104,
104-108, 108-112, 119, 154, 177-
181, 307-310, 323-330, 331, 374
Gergonne, Joseph Diez 154, 159, 160
Giordano 144
Goldbach, Christian 21, 22
Gordan, P. 128
Gregory, James 69-73
Hansen, Peter 193, 195, 196
Heiberg, Johan Ludvig 6
Hermite, Charles 128-137
Hipparchus 310-314
Huygens, Christian 187, 197
Jacobi, Karl Gustav Jakob 105
Kepler, Johannes 330-334
Khayyam, Omar 34-37
Kirkman, T. P. 14-18
Koenig, Gabriel 367
Kronecker, Leopold 105, 109, 117,
127
Krummbiegel 6
Kummer, Ernest Eduard 96
Lagrange, Joseph Louis 86, 94-96,
356
Laisant, M. 27, 33
Lalande, Joseph Jerome Le Francais
de 330
Lambert, Johann Heinrich 165, 206,
352-356
Legendre, Adrien Marie 82, 96, 104,
292
Leibniz, Gottfried Wilhelm von 73,
222
Lessing, Gotthold Ephraim 5, 6
L'Hôpital, Guillaume François 197
Lhuilier, Simon 305
Lindemann, Ferdinand 128-137
Liouville, Joseph 105, 112
Littrow, Joseph Johann von 224
Lossel, von 364
Lorsch, A. 369
Lucas, Édouard 27-33
Ludolph van Ceulen 136
Machin, John 73
MacMahon, Percy Alexander 9, 27
Malfatti, Giovanni Francesco 147-
151
Maraldi, Giacomo Filippo 367
Martus, Hermann 370
Mascheroni, Lorenzo 160-164, 165
Menaechmus 171
Mercator, Gerhard 314-316
Mercator, Nicolaus 56-59
Moivre: see Demoivre
Monge, Gaspard 151-154
Moreau, M. C. 27
Müller: see Regiomontanus
Nesselmann, G. H. F. 6
Newton, Isaac 9-10, 48-55, 59-64,
208, 217-219
Nicomedes 172
Nunes, Pedro 321, 375
Pappus 173, 250, 252
Pascal, Blaise 257-261
Peirce, Benjamin 14, 16
Petersen 154
Pohlke, K. 303-307
Poncelet, Jean Victor 165, 192, 193,
219-220
Pothenot 193, 194, 196
Proclus 214
Quetelet, Lambert Adolphe Jacques
197
Reaumur, Rene Antoine Ferchault de
366-369
Regiomontanus (Johannes Müller)
369-371
Riccati, Jacopo Francesco 197
Riccioli, Giovanni Battista 329
Roder, Christian 369
Rodrigues 22
Ruffini, Paolo 116
Schellbach 147
Schering 105
Schoenemann 118
Schooten, Franciscus van 214-217
Schwarz, Hermann Amandus 303-307
Segner 22
Simon, M. 224
Smith 77
Snellius, Willebrord 193, 321
Steiner, Jakob 165-170, 226-231, 255-
257, 278, 283-285, 292, 359, 378-389
Stoll 375
Sturm, Jacques Charles Francois
112-116
Sylvester, James Joseph 16, 142
Tannery, P. 6
Taylor, H. M. 27
Torricelli, Evangelista 361-363
Ullherr 109
Urban, H. 26
Vieta (Viete), Francois 154
Vincent, A. J. H. 6
Vitruvius Pollio, Marcus 339
Viviani, Vincenzo 361
Wallis, John 86
Weber, H. 128
Weierstrass, Karl Theodor 109, 128
Weisbach 310
Wilson, J. 82
Wolf 77